exercise:D4f9286df0
From Stochiki
Latest revision as of 02:47, 2 June 2024
[math]
\newcommand{\smallfrac}[2]{\frac{#1}{#2}}
\newcommand{\medfrac}[2]{\frac{#1}{#2}}
\newcommand{\textfrac}[2]{\frac{#1}{#2}}
\newcommand{\tr}{\operatorname{tr}}
\newcommand{\e}{\operatorname{e}}
\newcommand{\B}{\operatorname{B}}
\newcommand{\Bbar}{\overline{\operatorname{B}}}
\newcommand{\pr}{\operatorname{pr}}
\newcommand{\dd}{\operatorname{d}\hspace{-1pt}}
\newcommand{\E}{\operatorname{E}}
\newcommand{\V}{\operatorname{V}}
\newcommand{\Cov}{\operatorname{Cov}}
\newcommand{\Bigsum}[2]{\mathop{\textstyle\sum}_{#1}^{#2}}
\newcommand{\ran}{\operatorname{ran}}
\newcommand{\card}{\#}
\newcommand{\mathds}{\mathbb}
\renewcommand{\P}{\operatorname{P}}
\renewcommand{\L}{\operatorname{L}}
[/math]
Let [math]d \gt k[/math] be integers.
- Implement a random projection [math]T\colon\mathbb{R}^d\rightarrow\mathbb{R}^k[/math] and test it for small [math]d[/math] and [math]k[/math].
- Put [math]d=1000[/math], generate 10,000 points in [math]\mathbb{R}^d[/math] at random, project them via [math]T[/math] to [math]\mathbb{R}^k[/math] and compute, for different values of [math]k[/math], the worst distortion of a pairwise distance
[[math]] \epsilon:=\max\Bigl(1-\min_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}, \max_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}-1\Bigr) [[/math]]that occurs.
- Compare the outcome of your experiment with the relation between [math]\epsilon[/math] and [math]k[/math] given in the Johnson-Lindenstrauss Lemma, by replicating the Figure based on the data from (ii).
- Explain how you would pick [math]\epsilon[/math] and [math]k[/math] if you are given a dataset in [math]\mathbb{R}^d[/math] and want to do a dimensionality reduction that does not corrupt a classifier based on nearest neighbors.
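Parts (i) and (ii) can be sketched as follows. This is a minimal illustration, not a prescribed solution: the exercise does not fix how [math]T[/math] is generated, so the sketch assumes the common choice of a matrix with i.i.d. standard normal entries, and the function names and the small problem sizes in the demo are illustrative.

```python
import numpy as np

def random_projection(d, k, rng):
    """A random linear map T: R^d -> R^k, here with i.i.d. standard
    normal entries (one common construction; an assumption, the
    exercise leaves the choice open)."""
    return rng.standard_normal((k, d))

def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(X), k=1)
    return D[i, j]

def worst_distortion(X, T):
    """epsilon = max(1 - min ratio, max ratio - 1), where the ratios
    are ||Tx - Ty|| / (sqrt(k) ||x - y||) over all pairs x != y,
    matching the displayed formula in (ii)."""
    k = T.shape[0]
    ratios = pairwise_distances(X @ T.T) / (np.sqrt(k) * pairwise_distances(X))
    return max(1.0 - ratios.min(), ratios.max() - 1.0)

# A scaled-down run of the experiment in (ii); d = 1000 and 10,000
# points work the same way, but pairwise_distances as written is
# quadratic in the number of points, so we use fewer points here.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 1000))
for k in (10, 50, 200):
    T = random_projection(1000, k, rng)
    print(k, worst_distortion(X, T))
```

Running this, the distortion [math]\epsilon[/math] should tend to shrink as [math]k[/math] grows, which is the qualitative relation that (iii) asks you to compare against the Johnson-Lindenstrauss Lemma.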