exercise:Ecec41bd05

Jun 25'23

Exercise

Consider a pathway comprising three genes called [math]A[/math], [math]B[/math], and [math]C[/math]. Let the random variables [math]Y_{i,a}[/math], [math]Y_{i,b}[/math], and [math]Y_{i,c}[/math] represent the expression levels of genes [math]A[/math], [math]B[/math], and [math]C[/math] in sample [math]i[/math]. One hundred realizations, i.e. [math]i=1, \ldots, n[/math] with [math]n=100[/math], of [math]Y_{i,a}[/math], [math]Y_{i,b}[/math], and [math]Y_{i,c}[/math] are available from an observational study. In order to assess how the expression levels of gene [math]A[/math] are affected by those of genes [math]B[/math] and [math]C[/math], a medical researcher fits the linear regression model

[[math]] \begin{eqnarray*} Y_{i,a} &= & \beta_b Y_{i,b} + \beta_c Y_{i,c} + \varepsilon_{i}, \end{eqnarray*} [[/math]]

with [math]\varepsilon_i \sim \mathcal{N}(0, \sigma^2)[/math]. This model is fitted by means of ridge regression, but with separate penalty parameters, [math]\lambda_{b}[/math] and [math]\lambda_{c}[/math], for the two regression coefficients, [math]\beta_b[/math] and [math]\beta_c[/math], respectively.

Write down the ridge penalized loss function employed by the researcher.
Does a different choice of penalty parameter for the second regression coefficient affect the estimation of the first regression coefficient? Motivate your answer.
The researcher decides that the second covariate [math]Y_{i,c}[/math] is irrelevant. Instead of removing the covariate from the model, the researcher decides to set [math]\lambda_{c} = \infty[/math]. Show that this results in the same ridge estimate for [math]\beta_b[/math] as when fitting (again by means of ridge regression) the model without the second covariate.
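
The setup can be explored numerically before attempting the derivations. What follows is a minimal sketch in plain Python, not a proof. It assumes simulated data: the sample generator, the coefficients 1.0 and -0.5, the correlation between the two covariates, and the penalty values are all illustrative choices that do not come from the study. The function names are hypothetical. The estimator is obtained by solving the two normal equations of the loss [math]\sum_i (Y_{i,a} - \beta_b Y_{i,b} - \beta_c Y_{i,c})^2 + \lambda_b \beta_b^2 + \lambda_c \beta_c^2[/math] with Cramer's rule, and a very large [math]\lambda_c[/math] stands in for [math]\lambda_c = \infty[/math].

```python
import random


def ridge_two_penalties(y_a, y_b, y_c, lam_b, lam_c):
    """Ridge estimate of (beta_b, beta_c) with one penalty per coefficient."""
    # Entries of X'X and X'y for the two-covariate model without intercept.
    s_bb = sum(b * b for b in y_b)
    s_cc = sum(c * c for c in y_c)
    s_bc = sum(b * c for b, c in zip(y_b, y_c))
    s_ba = sum(b * a for b, a in zip(y_b, y_a))
    s_ca = sum(c * a for c, a in zip(y_c, y_a))
    # Solve (X'X + diag(lam_b, lam_c)) beta = X'y by Cramer's rule.
    det = (s_bb + lam_b) * (s_cc + lam_c) - s_bc ** 2
    beta_b = ((s_cc + lam_c) * s_ba - s_bc * s_ca) / det
    beta_c = ((s_bb + lam_b) * s_ca - s_bc * s_ba) / det
    return beta_b, beta_c


def ridge_one_covariate(y_a, y_b, lam_b):
    """Ridge estimate of beta_b in the model without the second covariate."""
    s_bb = sum(b * b for b in y_b)
    s_ba = sum(b * a for b, a in zip(y_b, y_a))
    return s_ba / (s_bb + lam_b)


# Simulated data (assumption): n = 100 samples, correlated covariates.
rng = random.Random(2023)
n = 100
y_b = [rng.gauss(0, 1) for _ in range(n)]
y_c = [0.6 * b + rng.gauss(0, 1) for b in y_b]
y_a = [1.0 * b - 0.5 * c + rng.gauss(0, 0.5) for b, c in zip(y_b, y_c)]

lam_b = 2.0
est_small = ridge_two_penalties(y_a, y_b, y_c, lam_b, lam_c=1.0)
est_large = ridge_two_penalties(y_a, y_b, y_c, lam_b, lam_c=1e12)
est_single = ridge_one_covariate(y_a, y_b, lam_b)

print("beta_b with lam_c = 1:    ", est_small[0])
print("beta_b with lam_c = 1e12: ", est_large[0])
print("beta_b, single covariate: ", est_single)
```

Comparing the three printed values gives a way to check answers to the second and third questions on one data set. The sketch uses only the standard library because the two-covariate case has a closed-form solution. With more covariates, a linear solver would replace Cramer's rule.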
