exercise:969a018475

Jun 25'23

Exercise

The ridge penalty may be interpreted as a multivariate normal prior on the regression coefficients: [math]\bbeta \sim \mathcal{N}(\mathbf{0}, \lambda^{-1} \mathbf{I}_{pp})[/math]. Different priors may be considered. In case the covariates are spatially related in some sense (e.g. genomically), it may be of interest to assume a first-order autoregressive prior: [math]\bbeta \sim \mathcal{N}(\mathbf{0}, \lambda^{-1} \mathbf{\Sigma}_a)[/math], in which [math]\mathbf{\Sigma}_a[/math] is a [math](p \times p)[/math]-dimensional correlation matrix with [math](\mathbf{\Sigma}_a)_{j_1, j_2} = \rho^{ | j_1 - j_2 | } [/math] for some correlation coefficient [math]\rho \in [0, 1)[/math]. Hence,

[[math]] \begin{eqnarray*} \mathbf{\Sigma}_a \, \, \, = \, \, \, \left( \begin{array}{cccc} 1 & \rho & \ldots & \rho^{p-1} \\ \rho & 1 & \ldots & \rho^{p-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{p-1} & \rho^{p-2} & \ldots & 1 \end{array} \right). \end{eqnarray*} [[/math]]

The penalized loss function associated with this AR(1) prior is:
[[math]] \begin{eqnarray*} \mathcal{L}(\bbeta; \lambda, \mathbf{\Sigma}_a) & = & \| \mathbf{Y} - \mathbf{X} \bbeta \|_2^2 + \lambda \bbeta^{\top} \mathbf{\Sigma}_a^{-1} \bbeta. \end{eqnarray*} [[/math]]
Find the minimizer of this loss function.
What is the effect of [math]\rho[/math] on the ridge estimates? Contrast this to the effect of [math]\lambda[/math]. Illustrate this on (simulated) data.
Instead of an AR(1) prior, assume a prior with a uniform correlation between the elements of [math]\bbeta[/math]. That is, replace [math]\mathbf{\Sigma}_a[/math] by [math]\mathbf{\Sigma}_u[/math], given by [math]\mathbf{\Sigma}_u = (1-\rho) \mathbf{I}_{pp} + \rho \mathbf{1}_{pp}[/math], in which [math]\mathbf{1}_{pp}[/math] is the [math](p \times p)[/math]-dimensional matrix of all ones. Investigate (again on simulated data) the effect of changing from the AR(1) to the uniform prior on the ridge regression estimates.
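
As a starting point for the numerical parts, the following is a minimal sketch in Python with NumPy, not a worked answer. It assumes that equating the gradient of the loss to zero gives the normal equations [math](\mathbf{X}^{\top} \mathbf{X} + \lambda \mathbf{\Sigma}^{-1}) \bbeta = \mathbf{X}^{\top} \mathbf{Y}[/math], which should be verified as part of the first task. The function names `ar1_corr`, `uniform_corr` and `generalized_ridge`, the simulated design, and the choices of [math]n[/math], [math]p[/math], [math]\lambda[/math] and [math]\rho[/math] are all illustrative assumptions and do not come from the exercise.

```python
import numpy as np


def ar1_corr(p, rho):
    """AR(1) correlation matrix with entries rho^|j1 - j2|."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def uniform_corr(p, rho):
    """Uniform correlation matrix (1 - rho) * I + rho * (all-ones matrix)."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))


def generalized_ridge(X, Y, lam, Sigma):
    """Solve (X'X + lam * Sigma^{-1}) beta = X'Y.

    Assumed stationarity condition of
    ||Y - X beta||_2^2 + lam * beta' Sigma^{-1} beta.
    """
    penalty = lam * np.linalg.inv(Sigma)
    return np.linalg.solve(X.T @ X + penalty, X.T @ Y)


# Simulated data (illustrative choice): a smoothly varying coefficient vector,
# so that neighbouring covariates have similar effects.
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))
beta_true = np.linspace(1.0, 2.0, p)
Y = X @ beta_true + rng.standard_normal(n)

# Vary rho at fixed lambda, for both priors.
for rho in (0.0, 0.5, 0.9):
    b_ar1 = generalized_ridge(X, Y, lam=10.0, Sigma=ar1_corr(p, rho))
    b_uni = generalized_ridge(X, Y, lam=10.0, Sigma=uniform_corr(p, rho))
    print(f"rho={rho}: AR(1)   ", np.round(b_ar1, 2))
    print(f"rho={rho}: uniform ", np.round(b_uni, 2))

# Vary lambda at fixed rho, to contrast with the effect of rho.
for lam in (0.1, 10.0, 1000.0):
    b = generalized_ridge(X, Y, lam=lam, Sigma=ar1_corr(p, 0.5))
    print(f"lambda={lam}:", np.round(b, 2))
```

For [math]\rho = 0[/math] both correlation matrices reduce to [math]\mathbf{I}_{pp}[/math], so the sketch returns the ordinary ridge estimate, which serves as a sanity check. Comparing the printed coefficient vectors across [math]\rho[/math] and across [math]\lambda[/math] is one way to start the comparison the exercise asks for.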
