Exercise
[math] \require{textmacros} \def \bbeta {\bf \beta} \def\fat#1{\mbox{\boldmath$#1$}} \def\reminder#1{\marginpar{\rule[0pt]{1mm}{11pt}}\textbf{#1}} \def\SSigma{\bf \Sigma} \def\ttheta{\bf \theta} \def\aalpha{\bf \alpha} \def\ddelta{\bf \delta} \def\eeta{\bf \eta} \def\llambda{\bf \lambda} \def\ggamma{\bf \gamma} \def\nnu{\bf \nu} \def\vvarepsilon{\bf \varepsilon} \def\mmu{\bf \mu} \def\nnu{\bf \nu} \def\ttau{\bf \tau} \def\SSigma{\bf \Sigma} \def\TTheta{\bf \Theta} \def\XXi{\bf \Xi} \def\PPi{\bf \Pi} \def\GGamma{\bf \Gamma} \def\DDelta{\bf \Delta} \def\ssigma{\bf \sigma} \def\UUpsilon{\bf \Upsilon} \def\PPsi{\bf \Psi} \def\PPhi{\bf \Phi} \def\LLambda{\bf \Lambda} \def\OOmega{\bf \Omega} [/math]
Consider the linear regression model [math]Y_i = \mathbf{X}_{i,\ast} \bbeta + \varepsilon_i[/math] for [math]i=1, \ldots, n[/math] and with the [math]\varepsilon_i[/math] i.i.d. normally distributed with zero mean and a common variance. Relevant information on the response and design matrix is summarized as:

[[math]] \begin{eqnarray*} \mathbf{X}^{\top} \mathbf{X} & = & \left( \begin{array}{rr} 3 & -2 \\ -2 & 2 \end{array} \right), \qquad \mathbf{X}^{\top} \mathbf{Y} \, \, = \, \, \left( \begin{array}{r} 3 \\ -1 \end{array} \right). \end{eqnarray*} [[/math]]
The lasso regression estimator is used to learn the parameter [math]\bbeta[/math].
- Show that the lasso regression estimator is given by:
[[math]] \begin{eqnarray*} \hat{\bbeta}(\lambda_1) & = & \arg \min_{\bbeta \in \mathbb{R}^2} 3 \beta_1^2 + 2 \beta_2^2 - 4 \beta_1 \beta_2 - 6 \beta_1 + 2 \beta_2 + \lambda_1 | \beta_1 | + \lambda_1 | \beta_2|. \end{eqnarray*} [[/math]]
- For [math]\lambda_{1} = 0.2[/math] the lasso estimate of the second element of [math]\bbeta[/math] is [math]\hat{\beta}_2(\lambda_1) = 1.25[/math]. Determine the corresponding value of [math]\hat{\beta}_1(\lambda_1)[/math].
- Determine the smallest [math]\lambda_1[/math] for which it is guaranteed that [math]\hat{\bbeta}(\lambda_1) = \mathbf{0}_2[/math].
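The parts above can be checked numerically. A minimal sketch, assuming the quadratic part of the objective corresponds to [math]\mathbf{X}^{\top} \mathbf{X}[/math] with diagonal [math](3,2)[/math] and off-diagonal [math]-2[/math], and [math]\mathbf{X}^{\top} \mathbf{Y} = (3,-1)^{\top}[/math], read off from the displayed objective; the helper names `soft_threshold` and `lasso_cd` are illustrative, not part of the exercise:

```python
# Illustrative coordinate descent for the two-dimensional lasso
# objective of this exercise:
#   3 b1^2 + 2 b2^2 - 4 b1 b2 - 6 b1 + 2 b2 + lam * (|b1| + |b2|).

def soft_threshold(z, lam):
    # Soft-thresholding operator: S(z, lam) = sign(z) * max(|z| - lam, 0).
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(lam, iters=200):
    # Cyclic coordinate descent started at the origin.
    b1, b2 = 0.0, 0.0
    for _ in range(iters):
        # argmin over b1 of 3 b1^2 - (4 b2 + 6) b1 + lam * |b1|
        b1 = soft_threshold(4.0 * b2 + 6.0, lam) / 6.0
        # argmin over b2 of 2 b2^2 - (4 b1 - 2) b2 + lam * |b2|
        b2 = soft_threshold(4.0 * b1 - 2.0, lam) / 4.0
    return b1, b2

print(lasso_cd(0.2))  # second coordinate matches the 1.25 given in the exercise
print(lasso_cd(6.0))  # both coordinates are exactly zero
```

Since the quadratic part is strictly convex (its Hessian has determinant [math]2 \gt 0[/math]), coordinate descent converges to the unique lasso minimizer, so the printed values can be compared directly against the analytic answers obtained from the subgradient (KKT) conditions.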