exercise:D4f9286df0
From Stochiki
Latest revision as of 02:47, 2 June 2024
[math]
\newcommand{\smallfrac}[2]{\frac{#1}{#2}}
\newcommand{\medfrac}[2]{\frac{#1}{#2}}
\newcommand{\textfrac}[2]{\frac{#1}{#2}}
\newcommand{\tr}{\operatorname{tr}}
\newcommand{\e}{\operatorname{e}}
\newcommand{\B}{\operatorname{B}}
\newcommand{\Bbar}{\overline{\operatorname{B}}}
\newcommand{\pr}{\operatorname{pr}}
\newcommand{\dd}{\operatorname{d}\hspace{-1pt}}
\newcommand{\E}{\operatorname{E}}
\newcommand{\V}{\operatorname{V}}
\newcommand{\Cov}{\operatorname{Cov}}
\newcommand{\Bigsum}[2]{\mathop{\textstyle\sum}_{#1}^{#2}}
\newcommand{\ran}{\operatorname{ran}}
\newcommand{\card}{\#}
\newcommand{\mathds}{\mathbb}
\renewcommand{\P}{\operatorname{P}}
\renewcommand{\L}{\operatorname{L}}
[/math]
Let [math]d \gt k[/math] be integers.
- Implement a random projection [math]T\colon\mathbb{R}^d\rightarrow\mathbb{R}^k[/math] and test it for small [math]d[/math] and [math]k[/math].
- Put [math]d=1000[/math], generate 10,000 points in [math]\mathbb{R}^d[/math] at random, project them via [math]T[/math] to [math]\mathbb{R}^k[/math] and compute, for different values of [math]k[/math], the worst distortion of a pairwise distance
[[math]] \epsilon:=\max\Bigl(1-\min_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}, \max_{x\not=y}\medfrac{\|Tx-Ty\|}{\sqrt{k}\|x-y\|}-1\Bigr) [[/math]]that occurs.
- Compare the outcome of your experiment with the relation between [math]\epsilon[/math] and [math]k[/math] given in the Johnson-Lindenstrauss Lemma, by replicating the Figure based on the data from (ii).
- Explain how you would pick [math]\epsilon[/math] and [math]k[/math] if you are given a dataset in [math]\mathbb{R}^d[/math] and want to do a dimensionality reduction that does not corrupt a classifier based on nearest neighbors.
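Parts (i) and (ii) can be sketched as follows. This is a minimal illustration, not a prescribed solution: the exercise does not fix how [math]T[/math] is generated, so the sketch assumes the common choice of a matrix with i.i.d. standard normal entries, and the function names and the small problem sizes in the demo are illustrative.

```python
import numpy as np

def random_projection(d, k, rng):
    """A random linear map T: R^d -> R^k, here with i.i.d. standard
    normal entries (one common construction; an assumption, the
    exercise leaves the choice open)."""
    return rng.standard_normal((k, d))

def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(X), k=1)
    return D[i, j]

def worst_distortion(X, T):
    """epsilon = max(1 - min ratio, max ratio - 1), where the ratios
    are ||Tx - Ty|| / (sqrt(k) ||x - y||) over all pairs x != y,
    matching the displayed formula in (ii)."""
    k = T.shape[0]
    ratios = pairwise_distances(X @ T.T) / (np.sqrt(k) * pairwise_distances(X))
    return max(1.0 - ratios.min(), ratios.max() - 1.0)

# A scaled-down run of the experiment in (ii); d = 1000 and 10,000
# points work the same way, but pairwise_distances as written is
# quadratic in the number of points, so we use fewer points here.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 1000))
for k in (10, 50, 200):
    T = random_projection(1000, k, rng)
    print(k, worst_distortion(X, T))
```

Running this, the distortion [math]\epsilon[/math] should tend to shrink as [math]k[/math] grows, which is the qualitative relation that (iii) asks you to compare against the Johnson-Lindenstrauss Lemma.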