exercise:F4045af163: Difference between revisions

From Stochiki
(Created page with "<div class="d-none"> <math> \newcommand{\DS}{\displaystyle} \newcommand{\intbeta}{\lfloor \beta \rfloor} \newcommand{\cA}{\mathcal{A}} \newcommand{\cB}{\mathcal{B}} \newcommand{\cC}{\mathcal{C}} \newcommand{\cD}{\mathcal{D}} \newcommand{\cE}{\mathcal{E}} \newcommand{\cF}{\mathcal{F}} \newcommand{\cG}{\mathcal{G}} \newcommand{\cH}{\mathcal{H}} \newcommand{\cI}{\mathcal{I}} \newcommand{\cJ}{\mathcal{J}} \newcommand{\cK}{\mathcal{K}} \newcommand{\cL}{\mathcal...")
 
No edit summary
Line 186: Line 186:
</math>
</math>
</div>
</div>
\label{EXO:jl}
 
The goal of this problem is to prove the following theorem:
The goal of this problem is to prove the following theorem:
{{proofcard|Theorem (Johnson-Lindenstrauss Lemma)|thm:jl|Given <math> n </math> points denoted by <math> X = \{x_1, \dots, x_n\} </math> in <math> \R^d </math>, let <math> Q \in \R^{k \times d} </math> be a random projection operator and set <math> P := \sqrt{\frac{d}{k}} Q </math>.
{{proofcard|Theorem (Johnson-Lindenstrauss Lemma)|thm:jl|Given <math> n </math> points denoted by <math> X = \{x_1, \dots, x_n\} </math> in <math> \R^d </math>, let <math> Q \in \R^{k \times d} </math> be a random projection operator and set <math> P := \sqrt{\frac{d}{k}} Q </math>.
Line 258: Line 258:
(Hint: Think about how to apply the previous results in this case and use the inequalities <math> \log (1-\varepsilon) \leq -\varepsilon - \varepsilon^2/2 </math> and <math> \log (1+\varepsilon) \leq \varepsilon - \varepsilon^2/2 + \varepsilon^3/3 </math>.)
(Hint: Think about how to apply the previous results in this case and use the inequalities <math> \log (1-\varepsilon) \leq -\varepsilon - \varepsilon^2/2 </math> and <math> \log (1+\varepsilon) \leq \varepsilon - \varepsilon^2/2 + \varepsilon^3/3 </math>.)
</li>
</li>
<li>Prove  [[#thm:jl|Theorem]].</li>
</ul>
</ul>
\item Prove Theorem [[#thm:jl |Sub-Gaussian Random Variables]].
\end{enumerate}

Revision as of 00:44, 22 May 2024

[math] \newcommand{\DS}{\displaystyle} \newcommand{\intbeta}{\lfloor \beta \rfloor} \newcommand{\cA}{\mathcal{A}} \newcommand{\cB}{\mathcal{B}} \newcommand{\cC}{\mathcal{C}} \newcommand{\cD}{\mathcal{D}} \newcommand{\cE}{\mathcal{E}} \newcommand{\cF}{\mathcal{F}} \newcommand{\cG}{\mathcal{G}} \newcommand{\cH}{\mathcal{H}} \newcommand{\cI}{\mathcal{I}} \newcommand{\cJ}{\mathcal{J}} \newcommand{\cK}{\mathcal{K}} \newcommand{\cL}{\mathcal{L}} \newcommand{\cM}{\mathcal{M}} \newcommand{\cN}{\mathcal{N}} \newcommand{\cO}{\mathcal{O}} \newcommand{\cP}{\mathcal{P}} \newcommand{\cQ}{\mathcal{Q}} \newcommand{\cS}{\mathcal{S}} \newcommand{\cT}{\mathcal{T}} \newcommand{\cU}{\mathcal{U}} \newcommand{\cV}{\mathcal{V}} \newcommand{\cX}{\mathcal{X}} \newcommand{\cY}{\mathcal{Y}} \newcommand{\cZ}{\mathcal{Z}} \newcommand{\sg}{\mathsf{subG}} \newcommand{\subE}{\mathsf{subE}} \newcommand{\bA}{\mathbf{A}} \newcommand{\bB}{\mathbf{B}} \newcommand{\bC}{\mathbf{C}} \newcommand{\bD}{\mathbf{D}} \newcommand{\bE}{\mathbf{E}} \newcommand{\bF}{\mathbf{F}} \newcommand{\bG}{\mathbf{G}} \newcommand{\bH}{\mathbf{H}} \newcommand{\bI}{\mathbf{I}} \newcommand{\bJ}{\mathbf{J}} \newcommand{\bK}{\mathbf{K}} \newcommand{\bM}{\mathbf{M}} \newcommand{\bN}{\mathbf{N}} \newcommand{\bO}{\mathbf{O}} \newcommand{\bP}{\mathbf{P}} \newcommand{\bp}{\mathbf{p}} \newcommand{\bQ}{\mathbf{Q}} \newcommand{\bS}{\mathbf{S}} \newcommand{\bT}{\mathbf{T}} \newcommand{\bU}{\mathbf{U}} \newcommand{\bV}{\mathbf{V}} \newcommand{\bX}{\mathbf{X}} \newcommand{\bY}{\mathbf{Y}} \newcommand{\bZ}{\mathbf{Z}} \newcommand{\bflambda}{\boldsymbol{\lambda}} \newcommand{\bftheta}{\boldsymbol{\theta}} \newcommand{\bfg}{\boldsymbol{g}} \newcommand{\bfy}{\boldsymbol{y}} \def\thetaphat{\hat{\bftheta}_\bp} \def\bflam{\boldsymbol{\lambda}} \def\Lam{\Lambda} \def\lam{\lambda} \def\bfpi{\boldsymbol{\pi}} \def\bfz{\boldsymbol{z}} \def\bfw{\boldsymbol{w}} \def\bfeta{\boldsymbol{\eta}} \newcommand{\R}{\mathrm{ I}\kern-0.18em\mathrm{ R}} \newcommand{\h}{\mathrm{ I}\kern-0.18em\mathrm{ H}} \newcommand{\K}{\mathrm{ I}\kern-0.18em\mathrm{ K}} \newcommand{\p}{\mathrm{ I}\kern-0.18em\mathrm{ P}} \newcommand{\E}{\mathrm{ I}\kern-0.18em\mathrm{ E}} %\newcommand{\Z}{\mathrm{ Z}\kern-0.26em\mathrm{ Z}} \newcommand{\1}{\mathrm{ 1}\kern-0.24em\mathrm{ I}} \newcommand{\N}{\mathrm{ I}\kern-0.18em\mathrm{ N}} \newcommand{\field}[1]{\mathbb{#1}} \newcommand{\q}{\field{Q}} \newcommand{\Z}{\field{Z}} \newcommand{\X}{\field{X}} \newcommand{\Y}{\field{Y}} \newcommand{\bbS}{\field{S}} \newcommand{\n}{\mathcal{N}} \newcommand{\x}{\mathcal{X}} \newcommand{\pp}{{\sf p}} \newcommand{\hh}{{\sf h}} \newcommand{\ff}{{\sf f}} \newcommand{\Bern}{\mathsf{Ber}} \newcommand{\Bin}{\mathsf{Bin}} \newcommand{\Lap}{\mathsf{Lap}} \newcommand{\tr}{\mathsf{Tr}} \newcommand{\phin}{\varphi_n} \newcommand{\phinb}{\overline \varphi_n(t)} \newcommand{\pn}{\p_{\kern-0.25em n}} \newcommand{\pnm}{\p_{\kern-0.25em n,m}} \newcommand{\psubm}{\p_{\kern-0.25em m}} \newcommand{\psubp}{\p_{\kern-0.25em p}} \newcommand{\cfi}{\cF_{\kern-0.25em \infty}} \newcommand{\e}{\mathrm{e}} \newcommand{\ic}{\mathrm{i}} \newcommand{\Leb}{\mathrm{Leb}_d} \newcommand{\Var}{\mathrm{Var}} \newcommand{\ddelta}{d_{\symdiffsmall}} \newcommand{\dsubh}{d_H} \newcommand{\indep}{\perp\kern-0.95em{\perp}} \newcommand{\supp}{\mathop{\mathrm{supp}}} \newcommand{\rank}{\mathop{\mathrm{rank}}} \newcommand{\vol}{\mathop{\mathrm{vol}}} \newcommand{\conv}{\mathop{\mathrm{conv}}} \newcommand{\card}{\mathop{\mathrm{card}}} \newcommand{\argmin}{\mathop{\mathrm{argmin}}} \newcommand{\argmax}{\mathop{\mathrm{argmax}}} \newcommand{\ud}{\mathrm{d}} \newcommand{\var}{\mathrm{var}} \newcommand{\re}{\mathrm{Re}} \newcommand{\MSE}{\mathsf{MSE}} \newcommand{\im}{\mathrm{Im}} \newcommand{\epr}{\hfill\hbox{\hskip 4pt\vrule width 5pt height 6pt depth 1.5pt}\vspace{0.5cm}\par} \newcommand{\bi}[1]{^{(#1)}} \newcommand{\eps}{\varepsilon} \newcommand{\Deq}{\stackrel{\mathcal{D}}{=}} \newcommand{\ubar}{\underbar} \newcommand{\Kbeta}{K_{\hspace{-0.3mm} \beta}} \newcommand{\crzero}[1]{\overset{r_0}{\underset{#1}{\longleftrightarrow}}} \newcommand{\hint}[1]{\texttt{[Hint:#1]}} \newcommand{\vp}{\vspace{.25cm}} \newcommand{\vm}{\vspace{.5cm}} \newcommand{\vg}{\vspace{1cm}} \newcommand{\vgneg}{\vspace{-1cm}} \newcommand{\vneg}{\vspace{-.5cm}} \newcommand{\vpneg}{\vspace{-.25cm}} \newcommand{\tp}{\ptsize{10}} \newcommand{\douzp}{\ptsize{12}} \newcommand{\np}{\ptsize{9}} \newcommand{\hp}{\ptsize{8}} \newcommand{\red}{\color{red}} \newcommand{\ndpr}[1]{{\textsf{\red[#1]}}} \newcommand\iid{i.i.d\@ifnextchar.{}{.\@\xspace} } \newcommand\MoveEqLeft[1][2]{\kern #1em & \kern -#1em} \newcommand{\leadeq}[2][4]{\MoveEqLeft[#1] #2 \nonumber} \newcommand{\leadeqnum}[2][4]{\MoveEqLeft[#1] #2} \newcommand\independent{\protect\mathpalette{\protect\independenT}{\perp}} \def\independenT#1#2{\mathrel{\rlap{$#1#2$}\mkern2mu{#1#2}}} \newcommand{\MIT}[1]{{\color{MITred} #1}} \newcommand{\dHyp}{\{-1,1\}^d} \newcommand{\thetahard}{\hat \theta^{hrd}} \newcommand{\thetasoft}{\hat \theta^{sft}} \newcommand{\thetabic}{\hat \theta^{bic}} \newcommand{\thetalasso}{\hat \theta^{\cL}} \newcommand{\thetaslope}{\hat \theta^{\cS}} \newcommand{\thetahard}{\hat \theta^{hrd}} \newcommand{\thetasoft}{\hat \theta^{sft}} \newcommand{\thetabic}{\hat \theta^{bic}} \newcommand{\thetalasso}{\hat \theta^{\cL}} \newcommand{\thetaslope}{\hat \theta^{\cS}} \newcommand{\thetals}{\hat \theta^{ls}} \newcommand{\thetalsm}{\tilde \theta^{ls_X}} \newcommand{\thetaridge}{\hat\theta^{\mathrm{ridge}}_\tau} \newcommand{\thetalsK}{\hat \theta^{ls}_K} \newcommand{\muls}{\hat \mu^{ls}} [/math]

The goal of this problem is to prove the following theorem:

Theorem (Johnson-Lindenstrauss Lemma)

Given [math] n [/math] points denoted by [math] X = \{x_1, \dots, x_n\} [/math] in [math] \R^d [/math], let [math] Q \in \R^{k \times d} [/math] be a random projection operator and set [math] P := \sqrt{\frac{d}{k}} Q [/math]. There is a constant [math] C \gt 0 [/math] such that if

[[math]] \begin{equation*} k \geq \frac{C}{\varepsilon^2} \log n, \end{equation*} [[/math]]
[math] P [/math] is an [math] \varepsilon [/math]-isometry for [math] X [/math], \ie

[[math]] \begin{equation*} (1 - \varepsilon) \| x_i - x_j \|_2^2 \leq \| P x_i - P x_j \|_2^2 \leq (1 + \varepsilon) \|x_i - x_j\|_2^2, \quad \text{for all } i, j \end{equation*} [[/math]]
with propability at least [math] 1 - 2 \exp(-c \varepsilon^2 k) [/math].

  • Convince yourself that if [math] d \gt n [/math], there is a projection [math] P \in \R^{n \times d} [/math] to an [math] n [/math] dimensional subspace such that [math] \| P x_i - P x_j \|_2 = \| x_i - x_j \|_2 [/math], \ie pairwise distances are exactly preserved.

Let [math] k \leq d [/math] be two integers, [math] Y = (y_1, \dots, y_d) \sim \cN(0, I_{d \times d} ) [/math] independent and identically distributed Gaussians and [math] Q \in \R^{d \times k} [/math] the projection onto the first [math] k [/math] coordinates, \ie [math] Qy = (y_1, \dots, y_k) [/math]. Define [math] Z = \frac{1}{\|Y\|}QY = \frac{1}{\|Y\|} (y_1, \dots, y_k) [/math] and [math] L = \|Z\|^2 [/math].

  • Show that [math] \E[L] = \frac{k}{d} [/math].
  • Show that for all [math] t \gt 0 [/math] such that [math] 1 - 2t(k \beta - d) \gt 0 [/math] and [math] 1 - 2t \beta k \gt 0 [/math],
    [[math]] \begin{equation*} \label{eq:a} \p \left( \sum_{i = 1}^{k} y_i^2 \leq \beta \frac{k}{d} \sum_{i = 1}^{d} y_i^2 \right) \leq (1 - 2t(k \beta - d))^{-k/2} (1 - 2t \beta k)^{-(d-k)/2} \end{equation*} [[/math]]
    (Hint: Show that [math] \E[\e^{\lambda X^2}] = \frac{1}{\sqrt{1 - 2 \lambda}} [/math] for [math] \lambda \lt \frac{1}{2} [/math] if [math] X \sim \cN(0,1) [/math].)
  • Conclude that for [math] \beta \lt 1 [/math],
    [[math]] \begin{equation*} \label{eq:b} \p\left(L \leq \beta \frac{k}{d}\right) \leq \exp \left( \frac{k}{2} (1 - \beta + \log \beta) \right). \end{equation*} [[/math]]
  • Similarly, show that for [math] \beta \gt 1 [/math],
    [[math]] \begin{equation*} \label{eq:c} \p\left(L \geq \beta \frac{k}{d}\right) \leq \exp \left( \frac{k}{2} (1 - \beta + \log \beta) \right). \end{equation*} [[/math]]
  • Show that for a random projection operator [math] Q \in \R^{k \times d} [/math] and a fixed vector [math] x \in \R^d [/math], \begin{enumerate}
  • [math] \E[\|Qx\|^2] = \frac{k}{d} \|x\|^2 [/math].
  • For [math] \varepsilon \in (0, 1) [/math], there is a constant [math] c \gt 0 [/math] such that with probability at least [math] 1 - 2 \exp(-c k \varepsilon^2) [/math],
    [[math]] \begin{equation*} \label{eq:d} (1 - \varepsilon) \frac{k}{d} \| x \|^2 \leq \| Q x \|_2^2 \leq (1 + \varepsilon) \frac{k}{d} \|x\|_2^2. \end{equation*} [[/math]]
    (Hint: Think about how to apply the previous results in this case and use the inequalities [math] \log (1-\varepsilon) \leq -\varepsilon - \varepsilon^2/2 [/math] and [math] \log (1+\varepsilon) \leq \varepsilon - \varepsilon^2/2 + \varepsilon^3/3 [/math].)
  • Prove Theorem.