Exercise
[math] \require{textmacros} \def \bbeta {\bf \beta} \def\fat#1{\mbox{\boldmath$#1$}} \def\reminder#1{\marginpar{\rule[0pt]{1mm}{11pt}}\textbf{#1}} \def\SSigma{\bf \Sigma} \def\ttheta{\bf \theta} \def\aalpha{\bf \alpha} \def\ddelta{\bf \delta} \def\eeta{\bf \eta} \def\llambda{\bf \lambda} \def\ggamma{\bf \gamma} \def\nnu{\bf \nu} \def\vvarepsilon{\bf \varepsilon} \def\mmu{\bf \mu} \def\nnu{\bf \nu} \def\ttau{\bf \tau} \def\SSigma{\bf \Sigma} \def\TTheta{\bf \Theta} \def\XXi{\bf \Xi} \def\PPi{\bf \Pi} \def\GGamma{\bf \Gamma} \def\DDelta{\bf \Delta} \def\ssigma{\bf \sigma} \def\UUpsilon{\bf \Upsilon} \def\PPsi{\bf \Psi} \def\PPhi{\bf \Phi} \def\LLambda{\bf \Lambda} \def\OOmega{\bf \Omega} [/math]
Investigate the effect of the variance of the covariates on variable selection by the lasso. To this end, consider the toy model: [math]Y_i = X_{1i} + X_{2i} + \varepsilon_i[/math], where [math]\varepsilon_i \sim \mathcal{N}(0, 1)[/math], [math]X_{1i} \sim \mathcal{N}(0, 1)[/math], and [math]X_{2i} = a \, X_{1i}[/math] with [math]a \in [0, 2][/math]. Draw a hundred samples of both [math]X_{1i}[/math] and [math]\varepsilon_i[/math], and construct both [math]X_{2i}[/math] and [math]Y_i[/math] for a grid of [math]a[/math]'s. For each choice of [math]a[/math], fit the model by means of the lasso regression estimator with [math]\lambda_1=1[/math]. Plot, e.g. in one figure, a) the variance of [math]X_{1i}[/math], b) the variance of [math]X_{2i}[/math], and c) the indicator of the selection of [math]X_{2i}[/math]. Which covariate is selected for which values of the scale parameter [math]a[/math]?
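One possible way to set up the simulation is sketched below in plain Python. It is a minimal illustration under stated assumptions, not a reference solution. The exercise does not fix the scaling of the loss, so the sketch assumes the lasso loss [math]\| Y - X \beta \|_2^2 + \lambda_1 \| \beta \|_1[/math] without an intercept; with another scaling (e.g. a factor [math]1/2[/math] or [math]1/(2n)[/math] in front of the sum of squares) the same [math]\lambda_1[/math] corresponds to a different amount of penalization. The function names `soft_threshold`, `lasso_cd` and `simulate`, the grid step of 0.1, the seed, and the tolerance `1e-8` for calling a coefficient nonzero are all choices made for this illustration. The lasso is fitted by a simple coordinate descent, and the results are printed as a table rather than plotted.

```python
import random
import statistics


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(columns, y, lam, max_iter=200_000, tol=1e-12):
    """Coordinate descent for ||y - X b||^2 + lam * ||b||_1 (assumed loss, no intercept)."""
    p = len(columns)
    # Precompute inner products so that each coordinate update is cheap.
    gram = [[sum(u * v for u, v in zip(cj, ck)) for ck in columns] for cj in columns]
    xty = [sum(u * v for u, v in zip(cj, y)) for cj in columns]
    beta = [0.0] * p
    for _ in range(max_iter):
        biggest_change = 0.0
        for j in range(p):
            if gram[j][j] == 0.0:
                # All-zero column (happens for a = 0): coefficient stays zero.
                continue
            # Inner product of column j with the partial residual.
            rho = xty[j] - sum(gram[j][k] * beta[k] for k in range(p) if k != j)
            new = soft_threshold(rho, lam / 2.0) / gram[j][j]
            biggest_change = max(biggest_change, abs(new - beta[j]))
            beta[j] = new
        if biggest_change < tol:
            break
    return beta


def simulate(n=100, lam=1.0, seed=1):
    """Return (a, var(X1), var(X2), X1 selected, X2 selected) for a = 0, 0.1, ..., 2."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    rows = []
    for k in range(21):
        a = k / 10.0
        x2 = [a * v for v in x1]
        y = [u + v + e for u, v, e in zip(x1, x2, eps)]
        b1, b2 = lasso_cd([x1, x2], y, lam)
        rows.append((a, statistics.variance(x1), statistics.variance(x2),
                     abs(b1) > 1e-8, abs(b2) > 1e-8))
    return rows


if __name__ == "__main__":
    print("    a  var(X1)  var(X2)  X1 sel  X2 sel")
    for a, v1, v2, s1, s2 in simulate():
        print(f"{a:5.1f}  {v1:7.3f}  {v2:7.3f}  {int(s1):6d}  {int(s2):6d}")
```

The last two columns of the printed table are the selection indicators asked for in c), next to the two sample variances. Because [math]X_{2i} = a \, X_{1i}[/math], the two covariates are perfectly collinear and the fit depends on the coefficients only through [math]\beta_1 + a \beta_2[/math]. At [math]a = 1[/math] the lasso solution is therefore not unique, and the indicator reported there reflects the solver (here: the update order and starting value of the coordinate descent) rather than the estimator itself.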