exercise:91ff79e6d0

Jun 11'23

Exercise

Consider a dataset [math]\dataset[/math] of [math]\samplesize[/math] data points with feature vectors [math]\featurevec^{(\sampleidx)} \in \mathbb{R}^{\featuredim}[/math] and discrete-valued labels [math]\truelabel^{(\sampleidx)} \in \{1,2,\ldots,10\}[/math]. The data is highly imbalanced: more than [math]90[/math] percent of the data points have the label [math]\truelabel =1[/math]. We learn a hypothesis from the hypothesis space [math]\hypospace'[/math], which consists of the ten constant maps [math]h^{(\clusteridx)}(\featurevec)= \clusteridx[/math] for [math]\clusteridx=1,2,\ldots,10[/math].

Is there a hypothesis [math]h\in \hypospace'[/math] whose average [math]0/1[/math] loss on [math]\dataset[/math] does not exceed [math]0.3[/math]?
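
One way to explore the question numerically is to compute the average [math]0/1[/math] loss of each constant map on a concrete label vector. The sketch below is only an illustration under stated assumptions: the dataset is synthetic (1000 labels, 920 of them equal to 1, so it satisfies the "more than 90 percent" condition), and the names `average_zero_one_loss`, `labels`, and `losses` are hypothetical and do not come from the exercise. Because every hypothesis in [math]\hypospace'[/math] ignores the features, only the labels are needed.

```python
import numpy as np

def average_zero_one_loss(labels, constant_prediction):
    """Fraction of data points whose label differs from the constant prediction."""
    labels = np.asarray(labels)
    return float(np.mean(labels != constant_prediction))

# Assumption: a synthetic, highly imbalanced label vector (not the exercise's data).
# 920 of the 1000 labels equal 1; the remaining 80 are drawn from {2, ..., 10}.
rng = np.random.default_rng(0)
labels = np.concatenate([
    np.ones(920, dtype=int),
    rng.integers(2, 11, size=80),  # upper bound 11 is exclusive
])

# Average 0/1 loss of each constant map h^(c)(x) = c, for c = 1, ..., 10.
losses = {c: average_zero_one_loss(labels, c) for c in range(1, 11)}

# Index of the constant map with the smallest average loss on this dataset.
best = min(losses, key=losses.get)

for c, loss in losses.items():
    print(f"h^({c}): average 0/1 loss = {loss:.3f}")
print("smallest average loss attained by c =", best)
```

On this synthetic dataset, the average loss of a constant map [math]h^{(\clusteridx)}[/math] equals one minus the fraction of data points whose label is [math]\clusteridx[/math], which is the quantity to reason about in the exercise.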
