[math] \newcommand{\smallfrac}[2]{\frac{#1}{#2}} \newcommand{\medfrac}[2]{\frac{#1}{#2}} \newcommand{\textfrac}[2]{\frac{#1}{#2}} \newcommand{\tr}{\operatorname{tr}} \newcommand{\e}{\operatorname{e}} \newcommand{\B}{\operatorname{B}} \newcommand{\Bbar}{\overline{\operatorname{B}}} \newcommand{\pr}{\operatorname{pr}} \newcommand{\dd}{\operatorname{d}\hspace{-1pt}} \newcommand{\E}{\operatorname{E}} \newcommand{\V}{\operatorname{V}} \newcommand{\Cov}{\operatorname{Cov}} \newcommand{\Bigsum}[2]{\mathop{\textstyle\sum}_{#1}^{#2}} \newcommand{\ran}{\operatorname{ran}} \newcommand{\card}{\#} \newcommand{\mathds}{\mathbb} \renewcommand{\P}{\operatorname{P}} \renewcommand{\L}{\operatorname{L}} [/math]

Let [math]X_i\sim\mathcal{N}(0,1)[/math] be independent and put [math]X=X_1^2+\dots+X_d^2[/math].

  • Use the formula [math]\E(f(X_i))=(2\pi)^{-1/2}\int_{\mathbb{R}}f(x)\exp(-x^2/2)\dd x[/math] to show that [math]\E(\exp(tX_i^2))=(1-2t)^{-1/2}[/math] holds for [math]t\in(0,1/2)[/math], and deduce that [math]\E(\exp(tX))=(1-2t)^{-d/2}[/math].
  • Derive the estimate [math]\P\bigl[X\geqslant a\bigr] \leqslant\inf_{t\in(0,1/2)}\frac{\exp(-ta)}{(1-2t)^{d/2}}[/math] for [math]a \gt 0[/math] (a sketch of the key steps follows below).
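One way to begin the first item (a sketch of the key steps, not necessarily the route intended in the text): combine the two exponentials and substitute [math]u=\sqrt{1-2t}\,x[/math], which is where the restriction [math]t \lt 1/2[/math] enters:

[[math]] \E(\exp(tX_i^2)) = \medfrac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\e^{tx^2}\e^{-x^2/2}\dd x = \medfrac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\e^{-(1-2t)x^2/2}\dd x = \medfrac{1}{\sqrt{1-2t}}. [[/math]]

Independence then gives [math]\E(\exp(tX))=\prod_{i=1}^d\E(\exp(tX_i^2))=(1-2t)^{-d/2}[/math], and the estimate in the second item follows by applying Markov's inequality to the nonnegative random variable [math]\exp(tX)[/math] and taking the infimum over [math]t[/math].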

Use Problem to give an alternative proof of the Johnson-Lindenstrauss Lemma that does not rely on the Gaussian Annulus Theorem.
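A possible route (this is an assumption about the intended argument, not spelled out in the exercise): if the entries of the projection [math]T\colon\mathbb{R}^d\rightarrow\mathbb{R}^k[/math] are i.i.d. standard Gaussians, then for every fixed [math]v\in\mathbb{R}^d[/math] the coordinates of [math]Tv[/math] are i.i.d. [math]\mathcal{N}(0,\|v\|^2)[/math], so [math]\|Tv\|^2/\|v\|^2[/math] has the law of the variable [math]X[/math] from the problem above with [math]d[/math] replaced by [math]k[/math]. The Chernoff-type bound then yields, for [math]0 \lt \epsilon \lt 1[/math],

[[math]] \P\bigl[\|Tv\|^2\geqslant(1+\epsilon)\,k\,\|v\|^2\bigr] \leqslant \inf_{t\in(0,1/2)}\medfrac{\exp(-t(1+\epsilon)k)}{(1-2t)^{k/2}} = \bigl((1+\epsilon)\e^{-\epsilon}\bigr)^{k/2}, [[/math]]

where the infimum is attained at [math]t=\medfrac{\epsilon}{2(1+\epsilon)}[/math]; a union bound over all pairs of data points can then replace the Gaussian Annulus Theorem.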


Restate the Random Projection Theorem and the Johnson-Lindenstrauss Theorem using Landau symbols to express the dependencies between [math]\epsilon[/math], [math]k[/math], and [math]n[/math].
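For orientation, one common Landau-style summary (the constants and exact probability statements may differ from the versions stated in the text) is that

[[math]] k = O\bigl(\epsilon^{-2}\log n\bigr) [[/math]]

dimensions suffice for a random projection to preserve, with high probability, all pairwise distances among [math]n[/math] points up to a factor [math]1\pm\epsilon[/math] (after rescaling by [math]\sqrt{k}[/math]).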


Let [math]d \gt k[/math] be integers.

  (i) Implement a random projection [math]T\colon\mathbb{R}^d\rightarrow\mathbb{R}^k[/math] and test it for small [math]d[/math] and [math]k[/math].
  (ii) Put [math]d=1000[/math], generate 10,000 points in [math]\mathbb{R}^d[/math] at random, project them via [math]T[/math] to [math]\mathbb{R}^k[/math] and compute, for different values of [math]k[/math], the worst distortion of a pairwise distance
    [[math]] \epsilon:=\max\Bigl(1-\min_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}, \max_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}-1\Bigr) [[/math]]
    that occurs (one possible implementation of (i) and (ii) is sketched after this list).
  (iii) Compare the outcome of your experiment with the relation between [math]\epsilon[/math] and [math]k[/math] that is given in the Johnson-Lindenstrauss Lemma, by replicating Figure based on the data from (ii).
  (iv) Explain how you would pick [math]\epsilon[/math] and [math]k[/math] if you are given a dataset in [math]\mathbb{R}^d[/math] and want to perform a dimensionality reduction that does not corrupt a classifier based on nearest neighbors.
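A minimal sketch of items (i) and (ii), assuming a projection matrix with i.i.d. standard Gaussian entries and Gaussian test points (both choices are assumptions on my part, and the point count is reduced so that the pairwise computation stays cheap):

<syntaxhighlight lang="python">
import numpy as np
from scipy.spatial.distance import pdist

def random_projection(d, k, rng):
    # One common construction: a k x d matrix with i.i.d. N(0,1) entries.
    return rng.standard_normal((k, d))

def worst_distortion(points, T):
    # epsilon as defined above: largest deviation of ||Tx-Ty|| / (sqrt(k) ||x-y||) from 1.
    k = T.shape[0]
    orig = pdist(points)            # all pairwise distances ||x - y||, x != y
    proj = pdist(points @ T.T)      # pairwise distances after projecting
    ratios = proj / (np.sqrt(k) * orig)
    return max(1.0 - ratios.min(), ratios.max() - 1.0)

rng = np.random.default_rng(0)
d, n = 1000, 200                    # n is smaller than the 10,000 of item (ii), for speed
points = rng.standard_normal((n, d))
for k in (10, 50, 100, 300, 500):
    T = random_projection(d, k, rng)
    print(f"k = {k:4d}   epsilon ~ {worst_distortion(points, T):.3f}")
</syntaxhighlight>

Recording [math]\epsilon[/math] for a range of values of [math]k[/math] (and for the full point count) produces the data needed for item (iii), where it can be compared with the [math]k\sim\epsilon^{-2}\log n[/math] dependence of the Johnson-Lindenstrauss Lemma.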