exercise:9b2dcb9f05: Difference between revisions

Latest revision as of 23:44, 24 June 2023

Consider the linear regression model [math]Y_i = X_i \beta + \varepsilon_i[/math] with the [math]\varepsilon_i[/math] i.i.d. following a standard normal law [math]\mathcal{N}(0, 1)[/math]. Data on the response and covariate are available: [math]\{(y_i, x_i)\}_{i=1}^8 = \{ (-5, -2), (0, -1), (-4, -1), (-2, -1), (0, 0), (3, 1), (5, 2), (3, 2) \}[/math].

(a) Assume a zero-centered normal prior on [math]\beta[/math]. What variance, i.e. which [math]\sigma_{\beta}^2 \in \mathbb{R}_{>0}[/math], of this prior yields a posterior mean [math]\mathbb{E}(\beta \, | \, \{(y_i, x_i)\}_{i=1}^8, \sigma_{\beta}^2)[/math] equal to [math]1.4[/math]?
(b) Assume a normal prior that is not centered at zero. What (mean, variance)-combinations for the prior will yield a posterior mean estimate [math]\hat{\beta} = 2[/math]?
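
One way to check an answer numerically is sketched below. It is not part of the original exercise. It assumes the standard conjugate result for a normal prior [math]\beta \sim \mathcal{N}(\mu_0, \sigma_{\beta}^2)[/math] combined with unit error variance, under which the posterior mean is [math](\sum_i x_i y_i + \mu_0 / \sigma_{\beta}^2) / (\sum_i x_i^2 + 1 / \sigma_{\beta}^2)[/math]. The function names `posterior_mean` and `prior_var_for_target` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

# Data from the exercise, stored as (y_i, x_i) pairs.
data = [(-5, -2), (0, -1), (-4, -1), (-2, -1), (0, 0), (3, 1), (5, 2), (3, 2)]

# Sufficient statistics for a no-intercept model with unit error variance.
sxx = sum(x * x for _, x in data)  # sum of x_i^2
sxy = sum(x * y for y, x in data)  # sum of x_i * y_i


def posterior_mean(prior_mean, prior_var):
    """Posterior mean of beta under a N(prior_mean, prior_var) prior.

    Assumes the conjugate normal-normal result with error variance 1.
    """
    precision = 1 / Fraction(prior_var)
    return (sxy + precision * Fraction(prior_mean)) / (sxx + precision)


def prior_var_for_target(target, prior_mean=0):
    """Prior variance for which the posterior mean equals `target`.

    Solves (sxy + m / v) / (sxx + 1 / v) = target for v.
    Raises ValueError when no positive variance achieves the target.
    """
    target = Fraction(target)
    numerator = Fraction(prior_mean) - target
    denominator = target * sxx - sxy
    if denominator == 0:
        raise ValueError("target equals the unpenalized estimate; variance is not identified")
    variance = numerator / denominator
    if variance <= 0:
        raise ValueError("no positive prior variance yields this posterior mean")
    return variance


if __name__ == "__main__":
    # First question: zero-centered prior, target posterior mean 1.4.
    print(prior_var_for_target(Fraction("1.4")))
    # Second question: a few illustrative prior means, target posterior mean 2.
    for m in (-1, 1, Fraction(3, 2)):
        print(m, prior_var_for_target(2, prior_mean=m))
```

Under these assumptions, the second question has a one-parameter family of solutions: any prior mean [math]\mu_0 \lt 2[/math] paired with [math]\sigma_{\beta}^2 = (2 - \mu_0)/3[/math]. Prior means of 2 or more cannot work, because the unpenalized estimate [math]35/16[/math] already exceeds 2 and the posterior mean always lies between the prior mean and that estimate.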

@@ Line 32: / Line 32: @@
 </div>
-This exercise is inspired by one from <ref name="Drap1998">Draper, N. R. and Smith, H. (1998).''Applied Regression Analysis (3rd edition)''.John Wiley & Sons</ref>. Consider the simple linear regression model <math>Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i</math> with <math>\varepsilon_i \sim \mathcal{N}(0, \sigma^2)</math>. The data on the covariate and response are: <math>\mathbf{X}^{\top} = (X_1, X_2, \ldots, X_{8})^{\top} = (-2, -1, -1, -1, 0, 1, 2, 2)^{\top}</math> and <math>\mathbf{Y}^{\top} = (Y_1, Y_2, \ldots, Y_{8})^{\top} = (35, 40, 36, 38, 40, 43, 45, 43)^{\top}</math>, with corresponding elements in the same order.
+Consider the linear regression model <math>Y_i = X_i \beta + \varepsilon_i</math> with the <math>\varepsilon_i</math> i.i.d. following a standard normal law <math>\mathcal{N}(0, 1)</math>. Data on the response and covariate are available: <math>\{(y_i, x_i)\}_{i=1}^8 = \{ (-5, -2),  (0,  -1), \\ (-4, -1), (-2, -1), (0, 0), (3,1), (5,2), (3,2) \}</math>.
-<ul style="list-style-type:lower-alpha"><li> Find the ridge regression estimator for the data above for a general value of <math>\lambda</math>.
+<ul style="list-style-type:lower-alpha"><li> Assume a zero-centered normal prior on <math>\beta</math>. What variance, i.e. which <math>\sigma_{\beta}^2 \in \mathbb{R}_{>0}</math>, of this prior yields a mean posterior <math>\mathbb{E}(\beta \, | \, \{(y_i, x_i)\}_{i=1}^8, \sigma_{\beta}^2)</math> equal to <math>1.4</math>?
 </li>
-<li> Evaluate the fit, i.e. <math>\widehat{Y}_i(\lambda)</math> for <math>\lambda=10</math>. Would you judge the fit as good? If not, what is the most striking feature that you find unsatisfactory?
+<li>   Assume a non-zero centered normal prior. What (mean, variance)-combinations for the prior will yield a mean posterior estimate <math>\hat{\beta} = 2</math>?
-</li>
-<li> Now zero center the covariate and response data, denote it by <math>\tilde{X}_i</math> and <math>\tilde{Y}_i</math>, and evaluate the ridge estimator of <math>\tilde{Y}_i = \beta_1 \tilde{X}_i + \varepsilon_i</math> at <math>\lambda=4</math>. Verify that in terms of original data the resulting predictor now is: <math>\widehat{Y}_i(\lambda) = 40 + 1.75 X</math>.
 </li>
 </ul>
-Note that the employed estimate in the predictor found in part ''c)'' is effectively a combination of a maximum likelihood and ridge regression one for intercept and slope, respectively. Put differently, only the slope has been regularized/penalized.