By Admin
Jun 24'23

Exercise

[math] \require{textmacros} \def \bbeta {\bf \beta} \def\fat#1{\mbox{\boldmath$#1$}} \def\reminder#1{\marginpar{\rule[0pt]{1mm}{11pt}}\textbf{#1}} \def\SSigma{\bf \Sigma} \def\ttheta{\bf \theta} \def\aalpha{\bf \alpha} \def\ddelta{\bf \delta} \def\eeta{\bf \eta} \def\llambda{\bf \lambda} \def\ggamma{\bf \gamma} \def\nnu{\bf \nu} \def\vvarepsilon{\bf \varepsilon} \def\mmu{\bf \mu} \def\nnu{\bf \nu} \def\ttau{\bf \tau} \def\SSigma{\bf \Sigma} \def\TTheta{\bf \Theta} \def\XXi{\bf \Xi} \def\PPi{\bf \Pi} \def\GGamma{\bf \Gamma} \def\DDelta{\bf \Delta} \def\ssigma{\bf \sigma} \def\UUpsilon{\bf \Upsilon} \def\PPsi{\bf \Psi} \def\PPhi{\bf \Phi} \def\LLambda{\bf \Lambda} \def\OOmega{\bf \Omega} [/math]

Consider the linear regression model [math]\mathbf{Y} = \mathbf{X} \bbeta + \vvarepsilon[/math], without intercept and with [math]\vvarepsilon \sim \mathcal{N} ( \mathbf{0}_n, \sigma^2 \mathbf{I}_{nn})[/math], to explain the variation in the response [math]\mathbf{Y}[/math] by a linear combination of the columns of the design matrix [math]\mathbf{X}[/math]. The linear regression model is fitted by means of ridge estimation. The estimator is evaluated both directly from its standard expression and via a computationally efficient alternative:

[[math]] \begin{eqnarray*} \hat{\bbeta}_{\mbox{{\tiny reg}}} (\lambda) & = & (\lambda \mathbf{I}_{pp} + \mathbf{X}^{\top} \mathbf{X})^{-1} \mathbf{X}^{\top} \mathbf{Y}, \\ \hat{\bbeta}_{\mbox{{\tiny eff}}} (\lambda) & = & \lambda^{-1} \mathbf{X}^{\top} \mathbf{Y} - \lambda^{-2} \mathbf{X}^{\top} (\mathbf{I}_{nn} + \lambda^{-1} \mathbf{X} \mathbf{X}^{\top})^{-1} \mathbf{X} \mathbf{X}^{\top} \mathbf{Y}, \end{eqnarray*} [[/math]]

respectively. In the remainder, study the computational gain of the latter. To this end, carry out the following instructions:
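As a starting point, the two expressions can be transcribed directly into R (a minimal sketch, assuming [math]\lambda \gt 0[/math]; the function names are illustrative, not prescribed by the exercise):

```r
# Direct transcription of the two ridge expressions (assumes lambda > 0)
beta_reg <- function(X, Y, lambda) {
  # (lambda * I_pp + X'X)^{-1} X'Y
  solve(lambda * diag(ncol(X)) + t(X) %*% X) %*% t(X) %*% Y
}

beta_eff <- function(X, Y, lambda) {
  # lambda^{-1} X'Y - lambda^{-2} X' (I_nn + lambda^{-1} XX')^{-1} XX'Y
  XXt <- X %*% t(X)
  t(X) %*% Y / lambda -
    t(X) %*% solve(diag(nrow(X)) + XXt / lambda) %*% XXt %*% Y / lambda^2
}
```

The efficient form inverts an [math]n \times n[/math] rather than a [math]p \times p[/math] matrix, which pays off when [math]p \gg n[/math].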

  • Load the R-package microbenchmark [1].
  • Generate data. Fix the sample size at [math]n=10[/math], and let the dimension range over [math]p=10, 20, 30, \ldots, 100[/math]. Sample the elements of the response and the ten design matrices from the standard normal distribution.
  • Verify the superior computation time of [math]\hat{\bbeta}_{\mbox{{\tiny eff}}} (\lambda)[/math] by means of the microbenchmark-function with default settings. Throughout, use the first design matrix and [math]\lambda=1[/math]. Write the output of the microbenchmark-function to an R-object. It will be a data.frame with two columns, expr and time, that contain the function calls and the corresponding computation times, respectively. By default, each call is evaluated a hundred times in random order. Summary statistics of these individual computation times are given when the object is printed on the screen.
  • Use the crossprod- and tcrossprod-functions to improve the computation times of the evaluation of both [math]\hat{\bbeta}_{\mbox{{\tiny reg}}} (\lambda)[/math] and [math]\hat{\bbeta}_{\mbox{{\tiny eff}}} (\lambda)[/math] as much as possible.
  • Use the microbenchmark-function to evaluate the (average) computation time of both [math]\hat{\bbeta}_{\mbox{{\tiny reg}}} (\lambda)[/math] and [math]\hat{\bbeta}_{\mbox{{\tiny eff}}} (\lambda)[/math] on all ten data sets, i.e., those defined by the ten design matrices of different dimensions.
  • Plot, for both [math]\hat{\bbeta}_{\mbox{{\tiny reg}}} (\lambda)[/math] and [math]\hat{\bbeta}_{\mbox{{\tiny eff}}} (\lambda)[/math], the (average) computation time (on the [math]y[/math]-axis) against the dimension of the ten data sets. From the plot, draw a conclusion on the computational gain.
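The steps above can be sketched as follows (an illustrative outline, not a model answer; the seed, object names, and plotting choices are assumptions):

```r
library(microbenchmark)

set.seed(1)                                   # assumed, for reproducibility
n  <- 10
ps <- seq(10, 100, by = 10)
Y  <- matrix(rnorm(n), ncol = 1)
Xs <- lapply(ps, function(p) matrix(rnorm(n * p), nrow = n))

# crossprod()/tcrossprod() variants of the two estimators:
# crossprod(X) = t(X) %*% X, tcrossprod(X) = X %*% t(X)
beta_reg <- function(X, Y, lambda)
  solve(lambda * diag(ncol(X)) + crossprod(X), crossprod(X, Y))

beta_eff <- function(X, Y, lambda) {
  XXt <- tcrossprod(X)
  crossprod(X, Y) / lambda -
    crossprod(X, solve(diag(nrow(X)) + XXt / lambda, XXt %*% Y)) / lambda^2
}

# benchmark on all ten design matrices; keep the mean time per expression
times <- sapply(Xs, function(X) {
  mb <- microbenchmark(beta_reg(X, Y, 1), beta_eff(X, Y, 1))
  tapply(mb$time, mb$expr, mean)
})

# mean computation time (nanoseconds) against the dimension p
matplot(ps, t(times), type = "b", pch = 1:2, lty = 1:2, log = "y",
        xlab = "dimension p", ylab = "mean computation time (ns)")
legend("topleft", legend = rownames(times), pch = 1:2, lty = 1:2)
```

Note the use of `solve(A, b)` rather than `solve(A) %*% b`, which solves the linear system directly and is itself a common source of speed-up.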
  1. Mersmann, O. (2014). microbenchmark: Accurate Timing Functions. R package version 1.4-2.