exercise:D4f9286df0: Difference between revisions

Revision as of 03:46, 2 June 2024

Let [math]n \gt k[/math] be integers.

Implement a random projection [math]T\colon\mathbb{R}^d\rightarrow\mathbb{R}^k[/math] and test it for small [math]d[/math] and [math]k[/math].
Put [math]d=1000[/math], generate 10000 points in [math]\mathbb{R}^d[/math] at random, project them via [math]T[/math] to [math]\mathbb{R}^k[/math] and compute, for different values of [math]k[/math], the worst distorsion of a pairwise distance
[[math]] \epsilon:=\max\Bigl(1-\min_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}, \max_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}-1\Bigr) [[/math]]
that occurs.
Compare the outcome of your experiment with the relation of [math]\epsilon[/math] and [math]k[/math] that is given in the Johnson-Lindenstrauss Lemma, by replicating Figure based on the data from (ii).
Explain how you would pick [math]\epsilon[/math] and [math]k[/math] if you are given a dataset in [math]\mathbb{R}^d[/math] and want to do a dimensionality reduction that does not corrupt a classifier based on nearest neighbors.

@@ Line 23: / Line 23: @@
 <ul style="list-style-type:lower-roman"><li> Implement a random projection <math>T\colon\mathbb{R}^d\rightarrow\mathbb{R}^k</math>  and test it for small <math>d</math> and <math>k</math>.
 </li>
-<li> Put <math>d=1000</math>, generate 10\,000 points in <math>\mathbb{R}^d</math> at random, project them via <math>T</math> to <math>\mathbb{R}^k</math> and compute, for different values of <math>k</math>, the worst distorsion of a pairwise distance
+<li> Put <math>d=1000</math>, generate 10000 points in <math>\mathbb{R}^d</math> at random, project them via <math>T</math> to <math>\mathbb{R}^k</math> and compute, for different values of <math>k</math>, the worst distorsion of a pairwise distance
 <math display="block">