exercise:49d2139750: Difference between revisions

Latest revision as of 22:45, 24 June 2023

Consider the Bayesian linear regression model [math]\mathbf{Y} = \mathbf{X} \bbeta + \vvarepsilon[/math] with [math]\vvarepsilon \sim \mathcal{N} ( \mathbf{0}_n, \sigma^2 \mathbf{I}_{nn})[/math] and priors [math]\bbeta \, | \, \sigma^2 \sim \mathcal{N} ( \mathbf{0}_p, \sigma_{\beta}^{2} \mathbf{I}_{pp})[/math] and [math]\sigma^2 \sim \mathcal{IG}(a_0, b_0)[/math], where [math] \sigma_{\beta}^{2} = c \sigma^{2}[/math] for some [math]c \gt 0[/math], and [math]a_0[/math] and [math]b_0[/math] are the shape and scale parameters, respectively, of the inverse Gamma distribution. The model is fitted to data from a study in which the response is explained by a single covariate, so henceforth [math]\bbeta[/math] is replaced by the scalar [math]\beta[/math]. The relevant summary statistics are [math]\mathbf{X}^{\top} \mathbf{X} = 2[/math] and [math]\mathbf{X}^{\top} \mathbf{Y} = 5[/math].

a) Suppose [math]\mathbb{E}( \beta \, | \, \sigma^2=1, c, \mathbf{X}, \mathbf{Y}) = 2[/math]. What amount of regularization should be used so that the ridge regression estimate [math]\hat{\beta}(\lambda_2)[/math] coincides with this posterior (conditional) mean?
b) Give the (posterior) distribution of [math]\beta \, | \, \{ \sigma^2=2, c=2, \mathbf{X}, \mathbf{Y} \}[/math].
c) Discuss how a different prior on [math]\sigma^2[/math] affects the correspondence between [math]\mathbb{E} (\beta \, | \, \sigma^2, c, \mathbf{X}, \mathbf{Y})[/math] and the ridge regression estimator.
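
The first two parts reduce to arithmetic on the summary statistics, so an answer can be checked numerically. The Python sketch below is one way to do that, not a prescribed solution. It assumes the standard conjugate result that, given [math]\sigma^2[/math] and [math]c[/math], the posterior of [math]\beta[/math] is normal with mean [math](\mathbf{X}^{\top} \mathbf{X} + c^{-1})^{-1} \mathbf{X}^{\top} \mathbf{Y}[/math] and variance [math]\sigma^2 (\mathbf{X}^{\top} \mathbf{X} + c^{-1})^{-1}[/math], and that the ridge estimate is [math]\hat{\beta}(\lambda_2) = (\mathbf{X}^{\top} \mathbf{X} + \lambda_2)^{-1} \mathbf{X}^{\top} \mathbf{Y}[/math]. The function and variable names are hypothetical, chosen for illustration only.

```python
# Sketch under the assumptions stated above (scalar covariate, conjugate normal prior).

def ridge_estimate(xtx, xty, lam):
    """Ridge estimate for a single covariate: X'Y / (X'X + lambda)."""
    return xty / (xtx + lam)


def conditional_posterior(xtx, xty, sigma2, c):
    """Mean and variance of beta | sigma^2, c, X, Y under the prior N(0, c * sigma^2)."""
    scale = xtx + 1.0 / c          # X'X + 1/c
    mean = xty / scale             # does not involve sigma2
    variance = sigma2 / scale      # scales linearly with sigma2
    return mean, variance


# Summary statistics given in the exercise.
XTX, XTY = 2.0, 5.0

# Penalty solving X'Y / (X'X + lambda) = 2 for lambda.
target_mean = 2.0
lam = XTY / target_mean - XTX

# Conditional posterior moments for sigma^2 = 2 and c = 2.
post_mean, post_var = conditional_posterior(XTX, XTY, sigma2=2.0, c=2.0)

print("lambda:", lam)
print("ridge estimate at lambda:", ridge_estimate(XTX, XTY, lam))
print("posterior mean and variance:", post_mean, post_var)
```

In this sketch the conditional mean is computed without using `sigma2` at all, which is a useful observation to keep in mind when discussing the role of the prior on [math]\sigma^2[/math].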

@@ Line 32: / Line 32: @@
 </div>
-Consider the simple linear regression model <math>Y_i =  \beta_0 + X_{i} \bbeta + \varepsilon_i</math> for <math>i=1, \ldots, n</math> and with <math>\varepsilon_i \sim_{i.i.d.}  \mathcal{N}(0, \sigma^2)</math>. The model comprises a single covariate and an intercept. Response and covariate data are: <math>\{(y_i, x_{i})\}_{i=1}^4 = \{ (1.4,  0.0),   (1.4, -2.0), (0.8,  0.0), (0.4,  2.0) \}</math>. Find the value of <math>\lambda</math> that yields the ridge regression estimate (with an unregularized/unpenalized intercept  as is done in part ''c)'' of [[exercise:9b2dcb9f05 |Question]]) equal to <math>(1, -\tfrac{1}{8})^{\top}</math>.
+Consider the Bayesian linear regression model <math>\mathbf{Y} = \mathbf{X} \bbeta + \vvarepsilon</math> with <math>\vvarepsilon \sim \mathcal{N} ( \mathbf{0}_n, \sigma^2 \mathbf{I}_{nn})</math> and priors
+<math>\bbeta \, | \, \sigma^2 \sim \mathcal{N} ( \mathbf{0}_p, \sigma_{\beta}^{2} \mathbf{I}_{pp})</math> and <math>\sigma^2 \sim \mathcal{IG}(a_0, b_0)</math> where <math> \sigma_{\beta}^{2} =  c \sigma^{2}</math> for some <math>c > 0</math> and <math>a_0</math> and <math>b_0</math> are the shape and scale parameters, respectively, of the inverse Gamma distribution. This model is fitted to data from a study where the response is explained by a single covariate, and henceforth <math>\bbeta</math> is replaced by <math>\beta</math>, with the following relevant summary statistics: <math>\mathbf{X}^{\top} \mathbf{X} = 2</math> and <math>\mathbf{X}^{\top} \mathbf{Y} = 5</math>.
+<ul style="list-style-type:lower-alpha"><li> Suppose <math>\mathbb{E}( \beta \, | \, \sigma^2=1, c, \mathbf{X}, \mathbf{Y}) = 2</math>. What amount of regularization should be used such that the ridge regression estimate <math>\hat{\beta}(\lambda_2)</math> coincides with the aforementioned posterior (conditional) mean?
+</li>
+<li> Give the (posterior) distribution of <math>\beta \, | \, \{ \sigma^2=2, c=2, \mathbf{X}, \mathbf{Y} \}</math>.
+</li>
+<li> Discuss how a different prior of <math>\sigma^2</math> affects the correspondence between <math>\mathbb{E} (\beta \, | \, \sigma^2, c, \mathbf{X}, \mathbf{Y})</math> and the ridge regression estimator.
+</li>
+</ul>