Exercise
[math] \require{textmacros} \def \bbeta {\bf \beta} \def\fat#1{\mbox{\boldmath$#1$}} \def\reminder#1{\marginpar{\rule[0pt]{1mm}{11pt}}\textbf{#1}} \def\SSigma{\bf \Sigma} \def\ttheta{\bf \theta} \def\aalpha{\bf \alpha} \def\ddelta{\bf \delta} \def\eeta{\bf \eta} \def\llambda{\bf \lambda} \def\ggamma{\bf \gamma} \def\nnu{\bf \nu} \def\vvarepsilon{\bf \varepsilon} \def\mmu{\bf \mu} \def\nnu{\bf \nu} \def\ttau{\bf \tau} \def\SSigma{\bf \Sigma} \def\TTheta{\bf \Theta} \def\XXi{\bf \Xi} \def\PPi{\bf \Pi} \def\GGamma{\bf \Gamma} \def\DDelta{\bf \Delta} \def\ssigma{\bf \sigma} \def\UUpsilon{\bf \Upsilon} \def\PPsi{\bf \Psi} \def\PPhi{\bf \Phi} \def\LLambda{\bf \Lambda} \def\OOmega{\bf \Omega} [/math]
Consider fitting the linear regression model by means of the elastic net regression estimator.
- Recall the data augmentation trick from the ridge regression exercises. Use the same trick to show that the elastic net least squares loss function can be reformulated as a traditional lasso loss function. Hint: absorb the ridge part of the elastic net penalty into the sum of squares (a sketch of the augmentation is given below the exercise parts).
- The elastic net regression estimator can be evaluated by the coordinate descent procedure outlined in Section Coordinate descent. Show that in such a procedure, at each step, the [math]j[/math]-th element of the elastic net regression estimate is updated according to:
[[math]] \begin{eqnarray*} \hat{\beta}_j (\lambda_1, \lambda_2) & = & (\| \mathbf{X}_{\ast, j} \|_2^2 + \lambda_2)^{-1} \mbox{sign}(\mathbf{X}_{\ast, j}^{\top} \tilde{\mathbf{Y}}) \big[ | \mathbf{X}_{\ast, j}^{\top} \tilde{\mathbf{Y}} | - \tfrac{1}{2} \lambda_1 \big]_+, \end{eqnarray*} [[/math]]where [math]\tilde{\mathbf{Y}} = \mathbf{Y} - \mathbf{X}_{\ast, \setminus j} \bbeta_{\setminus j}[/math] and [math]\bbeta_{\setminus j}[/math] contains the current values of the remaining regression coefficients (a sketch of the derivation is given below).
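The following is a minimal sketch of the augmentation asked for in the first part, assuming the elastic net loss is written as [math]\| \mathbf{Y} - \mathbf{X} \bbeta \|_2^2 + \lambda_1 \| \bbeta \|_1 + \lambda_2 \| \bbeta \|_2^2[/math] (other conventions may scale the penalties differently); the augmented quantities [math]\mathbf{X}_{\mathrm{aug}}[/math] and [math]\mathbf{Y}_{\mathrm{aug}}[/math] are introduced here only for illustration. Define
[[math]] \begin{eqnarray*} \mathbf{X}_{\mathrm{aug}} \, = \, \left( \begin{array}{c} \mathbf{X} \\ \sqrt{\lambda_2} \, \mathbf{I}_{pp} \end{array} \right) & \mbox{ and } & \mathbf{Y}_{\mathrm{aug}} \, = \, \left( \begin{array}{c} \mathbf{Y} \\ \mathbf{0}_{p} \end{array} \right), \end{eqnarray*} [[/math]]
with [math]\mathbf{I}_{pp}[/math] the [math]p \times p[/math] identity matrix and [math]\mathbf{0}_{p}[/math] the [math]p[/math]-dimensional zero vector. Then
[[math]] \begin{eqnarray*} \| \mathbf{Y}_{\mathrm{aug}} - \mathbf{X}_{\mathrm{aug}} \bbeta \|_2^2 + \lambda_1 \| \bbeta \|_1 & = & \| \mathbf{Y} - \mathbf{X} \bbeta \|_2^2 + \lambda_2 \| \bbeta \|_2^2 + \lambda_1 \| \bbeta \|_1, \end{eqnarray*} [[/math]]
so the elastic net loss coincides with a traditional lasso loss evaluated on the augmented data.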
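For the second part, a sketch of the coordinate-wise minimization under the same penalty convention: with all coefficients but the [math]j[/math]-th held fixed at their current values, the loss as a function of [math]\beta_j[/math] alone is, up to terms that do not involve [math]\beta_j[/math],
[[math]] \begin{eqnarray*} \| \tilde{\mathbf{Y}} - \mathbf{X}_{\ast, j} \beta_j \|_2^2 + \lambda_2 \beta_j^2 + \lambda_1 | \beta_j |. \end{eqnarray*} [[/math]]
Setting the subgradient with respect to [math]\beta_j[/math] equal to zero gives, for [math]\beta_j \neq 0[/math], [math]2 (\| \mathbf{X}_{\ast, j} \|_2^2 + \lambda_2) \beta_j = 2 \mathbf{X}_{\ast, j}^{\top} \tilde{\mathbf{Y}} - \lambda_1 \mbox{sign}(\beta_j)[/math]; as any nonzero solution shares its sign with [math]\mathbf{X}_{\ast, j}^{\top} \tilde{\mathbf{Y}}[/math], solving for [math]\beta_j[/math] and handling the case [math]\beta_j = 0[/math] yields the soft-thresholded update displayed above.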