By Admin
Jun 24, 2023

Exercise

[math] \require{textmacros} \def \bbeta {\bf \beta} \def\fat#1{\mbox{\boldmath$#1$}} \def\reminder#1{\marginpar{\rule[0pt]{1mm}{11pt}}\textbf{#1}} \def\SSigma{\bf \Sigma} \def\ttheta{\bf \theta} \def\aalpha{\bf \alpha} \def\ddelta{\bf \delta} \def\eeta{\bf \eta} \def\llambda{\bf \lambda} \def\ggamma{\bf \gamma} \def\nnu{\bf \nu} \def\vvarepsilon{\bf \varepsilon} \def\mmu{\bf \mu} \def\nnu{\bf \nu} \def\ttau{\bf \tau} \def\SSigma{\bf \Sigma} \def\TTheta{\bf \Theta} \def\XXi{\bf \Xi} \def\PPi{\bf \Pi} \def\GGamma{\bf \Gamma} \def\DDelta{\bf \Delta} \def\ssigma{\bf \sigma} \def\UUpsilon{\bf \Upsilon} \def\PPsi{\bf \Psi} \def\PPhi{\bf \Phi} \def\LLambda{\bf \Lambda} \def\OOmega{\bf \Omega} [/math]

Consider the standard linear regression model [math]Y_i = \mathbf{X}_{i,\ast} \bbeta + \varepsilon_i[/math] for [math]i=1, \ldots, n[/math], with the [math]\varepsilon_i[/math] i.i.d. normally distributed with zero mean and a common variance. Each row of the design matrix [math]\mathbf{X}[/math] comprises two elements; neither column represents the intercept, and the two columns are identical: [math]\mathbf{X}_{\ast, 1} = \mathbf{X}_{\ast, 2}[/math].

  (a) Suppose an estimator of the regression parameter [math]\bbeta[/math] of this model is obtained through the minimization of the sum-of-squares augmented with a ridge penalty, [math]\| \mathbf{Y} - \mathbf{X} \bbeta \|_2^2 + \lambda \| \bbeta \|_2^2[/math], in which [math]\lambda \gt 0[/math] is the penalty parameter. The minimizer is called the ridge estimator and is denoted by [math]\hat{\bbeta}(\lambda)[/math]. Show that [math][\hat{\bbeta}(\lambda)]_1 = [\hat{\bbeta}(\lambda)]_2[/math] for all [math]\lambda \gt 0[/math].
  (b) The covariates are now related as [math]\mathbf{X}_{\ast, 1} = - 2 \mathbf{X}_{\ast, 2}[/math]. Data on the response and the covariates are:
    [[math]] \begin{eqnarray*} \{(y_i, x_{i,1}, x_{i,2})\}_{i=1}^6 & = & \{ (1.5, 1.0, -0.5), (1.9, -2.0, 1.0), (-1.6, 1.0, -0.5), \\ & & \, \, \, (0.8, 4.0, -2.0), (0.9, 2.0, -1.0), (\phantom{-} 0.5, 4.0, -2.0) \}. \end{eqnarray*} [[/math]]
    Evaluate the ridge regression estimator for these data with [math]\lambda = 1[/math] (numerical sketches for parts b) through d) follow this list).
  (c) The data are as in part b). Show [math]\hat{\bbeta}(\lambda+\delta) = (52.5 + \lambda) (52.5 + \lambda + \delta)^{-1} \hat{\bbeta}(\lambda)[/math] for a fixed [math]\lambda[/math] and any [math]\delta \gt 0[/math]. That is, given the ridge regression estimator evaluated for a particular value of the penalty parameter [math]\lambda[/math], the remaining regularization path [math]\{ \hat{\bbeta}(\lambda + \delta) \}_{\delta \geq 0}[/math] is known analytically. Hint: Use the singular value decomposition of the design matrix [math]\mathbf{X}[/math] and the fact that its largest singular value equals [math]\sqrt{52.5}[/math].
  (d) The data are as in part b). Consider the model [math]Y_i = X_{i,1} \gamma + \varepsilon_i[/math]. The parameter [math]\gamma[/math] is estimated through minimization of [math]\sum_{i=1}^6 (Y_i - X_{i,1} \gamma)^2 + \lambda_{\gamma} \gamma^2[/math]. The perfectly linear relation of the covariates suggests that the regularization paths of the linear predictors [math]X_{i,1} \hat{\gamma}(\lambda_{\gamma})[/math] and [math]\mathbf{X}_{i,\ast} \hat{\bbeta}(\lambda)[/math] overlap. Find the functional relationship [math]\lambda_{\gamma} = f(\lambda)[/math] such that the resulting linear predictor [math]X_{i,1} \hat{\gamma}(\lambda_{\gamma})[/math] indeed coincides with that obtained from the estimate evaluated in part b) of this exercise, i.e. [math]\mathbf{X} \hat{\bbeta}(\lambda)[/math].
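
The following is a minimal numerical sketch for part b), assuming Python with NumPy (neither the language nor the variable names are prescribed by the exercise); it evaluates the ridge estimator [math]\hat{\bbeta}(\lambda) = (\mathbf{X}^{\top} \mathbf{X} + \lambda \mathbf{I})^{-1} \mathbf{X}^{\top} \mathbf{Y}[/math] at [math]\lambda = 1[/math] from the data listed above.

  import numpy as np

  # Data of part b): response and the two collinear covariate columns.
  y  = np.array([ 1.5,  1.9, -1.6,  0.8,  0.9,  0.5])
  x1 = np.array([ 1.0, -2.0,  1.0,  4.0,  2.0,  4.0])
  x2 = np.array([-0.5,  1.0, -0.5, -2.0, -1.0, -2.0])   # equals -x1 / 2
  X  = np.column_stack([x1, x2])

  # Ridge estimator (X^T X + lambda I)^{-1} X^T y at lambda = 1.
  lam = 1.0
  beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)
  print(beta_hat)   # approximately [ 0.0579 -0.0290]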
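
For part c), a sketch in the same assumed setup checks both the hint and the claimed path identity numerically: the largest singular value of [math]\mathbf{X}[/math] should equal [math]\sqrt{52.5}[/math], and rescaling [math]\hat{\bbeta}(\lambda)[/math] by [math](52.5 + \lambda)(52.5 + \lambda + \delta)^{-1}[/math] should reproduce [math]\hat{\bbeta}(\lambda + \delta)[/math].

  import numpy as np

  y  = np.array([1.5, 1.9, -1.6, 0.8, 0.9, 0.5])
  x1 = np.array([1.0, -2.0, 1.0, 4.0, 2.0, 4.0])
  X  = np.column_stack([x1, -x1 / 2.0])   # second column is -x1 / 2

  def ridge(lam):
      # Ridge estimator (X^T X + lambda I)^{-1} X^T y.
      return np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

  # Largest singular value of X versus sqrt(52.5).
  print(np.linalg.svd(X, compute_uv=False)[0], np.sqrt(52.5))

  # Path identity for lambda = 1 and a few values of delta.
  lam = 1.0
  for delta in [0.5, 2.0, 10.0]:
      scaled = (52.5 + lam) / (52.5 + lam + delta) * ridge(lam)
      print(delta, np.allclose(ridge(lam + delta), scaled))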
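
For part d), the two linear predictors can be compared numerically once a candidate relationship [math]\lambda_{\gamma} = f(\lambda)[/math] has been conjectured. The sketch below illustrates the comparison at [math]\lambda = 1[/math]; the candidate [math]f(\lambda) = 4\lambda/5[/math] plugged in is only an example of how such a conjecture would be tested, and its derivation is the subject of this part.

  import numpy as np

  y  = np.array([1.5, 1.9, -1.6, 0.8, 0.9, 0.5])
  x1 = np.array([1.0, -2.0, 1.0, 4.0, 2.0, 4.0])
  X  = np.column_stack([x1, -x1 / 2.0])

  lam = 1.0
  beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

  # Candidate relationship lambda_gamma = f(lambda); here f(lambda) = 4 * lambda / 5 as an illustration.
  lam_gamma = 4.0 * lam / 5.0
  # Ridge estimator of the single-covariate model: (x1^T x1 + lambda_gamma)^{-1} x1^T y.
  gamma_hat = (x1 @ y) / (x1 @ x1 + lam_gamma)

  # The two linear predictors coincide when the candidate f(lambda) is correct.
  print(np.allclose(X @ beta_hat, x1 * gamma_hat))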