<div class="d-none"><math>
\newcommand{\R}{\mathbb{R}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rbar}{\overline{\mathbb{R}}}
\newcommand{\Bbar}{\overline{\mathcal{B}}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\p}{\mathbb{P}}
\newcommand{\one}{\mathds{1}}
\newcommand{\0}{\mathcal{O}}
\newcommand{\mat}{\textnormal{Mat}}
\newcommand{\sign}{\textnormal{sign}}
\newcommand{\CP}{\mathcal{P}}
\newcommand{\CT}{\mathcal{T}}
\newcommand{\CY}{\mathcal{Y}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\mathds}{\mathbb}</math></div>
===Gaussian Vectors===
{{definitioncard|Gaussian Random Vector|An <math>\R^n</math>-valued r.v. <math>X=(X_1,...,X_n)</math> is called a gaussian random vector if every linear combination <math>\sum_{j=1}^n\lambda_jX_j</math>, with <math>\lambda_j\in\R</math>, is a gaussian r.v. (possibly degenerate: <math>\mathcal{N}(\mu,0)</math> means the constant <math>\mu</math> a.s.).}}
{{proofcard|Theorem|thm-1|<math>X</math> is an <math>\R^n</math>-valued gaussian vector if and only if its characteristic function has the form

<math display="block">
\varphi_X(u)=\exp\left(i\langle u,\mu\rangle-\frac{1}{2}\langle u,Qu\rangle\right),\qquad(*)
</math>
where <math>\mu\in\R^n</math> and <math>Q</math> is a symmetric positive semidefinite matrix of size <math>n\times n</math>. <math>Q</math> is then the covariance matrix of <math>X</math> and <math>\mu</math> is the mean vector, i.e. <math>\E[X_j]=\mu_j</math>.
|Suppose that <math>(*)</math> holds. Let <math>Y=\sum_{j=1}^na_jX_j=\langle a,X\rangle</math> for some <math>a\in\R^n</math>. For <math>v\in\R</math>,
<math display="block">
\varphi_Y(v)=\varphi_X(va)=\exp\left(iv\langle a,\mu\rangle-\frac{v^2}{2}\langle a,Qa\rangle\right),
</math>
so <math>Y\sim \mathcal{N}\left(\langle a,\mu\rangle,\langle a,Qa\rangle\right)</math> and hence <math>X</math> is a gaussian vector. Conversely, assume that <math>X</math> is a gaussian vector and let <math>Y=\sum_{j=1}^na_jX_j=\langle a,X\rangle</math>. Let <math>Q=Cov(X)</math> and note that <math>\E[Y]=\langle a,\mu\rangle</math> and <math>Var(Y)=\langle a,Qa\rangle</math>. Since <math>Y</math> is a gaussian r.v.,
<math display="block">
\varphi_Y(v)=\exp\left(iv\langle a,\mu\rangle-\frac{v^2}{2}\langle a,Qa\rangle\right).
</math>
Now <math>\varphi_X(a)=\varphi_Y(1)</math>, which is exactly the form <math>(*)</math>.}}
''Notation:'' We write <math>X\sim \mathcal{N}(\mu,Q)</math>.
'''Example'''
Let <math>X_1,...,X_n</math> be independent gaussian r.v.'s with <math>X_k\sim\mathcal{N}(\mu_k,\sigma_k^2)</math>. Then <math>X=(X_1,...,X_n)</math> is a gaussian vector. Indeed, we have
<math display="block">
\begin{align*}
\varphi_X(v_1,...,v_n)&=\E\left[e^{i(v_1X_1+...+v_nX_n)}\right]\\
&=\prod_{j=1}^n\E\left[e^{iv_jX_j}\right]\\
&=e^{i\langle v,\mu\rangle-\frac{1}{2}\langle v,Qv\rangle},
\end{align*}
</math>
where <math>\mu=(\mu_1,...,\mu_n)</math>, <math>Q=\begin{pmatrix}\sigma_1^2&\dotsm &0\\ \vdots&\ddots&\vdots\\0&\dotsm&\sigma_n^2\end{pmatrix}</math>.
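The closed form <math>(*)</math> is easy to check by simulation. The following sketch (illustrative only; the values of <math>\mu</math>, <math>\sigma</math> and the test point <math>v</math> are arbitrary choices) compares a Monte Carlo estimate of <math>\varphi_X(v)=\E\left[e^{i\langle v,X\rangle}\right]</math> against the formula above.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters: independent components X_k ~ N(mu_k, sigma_k^2)
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([1.0, 0.5, 2.0])
Q = np.diag(sigma**2)

# Draw samples of X = (X_1, ..., X_n)
n_samples = 200_000
X = mu + sigma * rng.standard_normal((n_samples, 3))

v = np.array([0.3, -0.7, 0.2])  # arbitrary test point

# Monte Carlo estimate of phi_X(v) = E[exp(i <v, X>)]
phi_mc = np.mean(np.exp(1j * (X @ v)))

# Closed form (*): exp(i <v, mu> - (1/2) <v, Qv>)
phi_closed = np.exp(1j * (v @ mu) - 0.5 * (v @ Q @ v))

print(phi_mc)      # agrees with phi_closed up to Monte Carlo error
print(phi_closed)
</syntaxhighlight>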
{{proofcard|Corollary|cor-1|Let <math>X</math> be an <math>\R^n</math>-valued gaussian vector. The components <math>X_j</math> of <math>X</math> are independent if and only if its covariance matrix <math>Q</math> is diagonal.
|Suppose <math>Q=\begin{pmatrix}\sigma_1^2&\dotsm &0\\ \vdots&\ddots&\vdots\\0&\dotsm&\sigma_n^2\end{pmatrix}</math>, then <math>(*)</math> shows that
<math display="block">
\varphi_X(v_1,...,v_n)=\prod_{j=1}^n\varphi_{X_j}(v_j),
</math>
where <math>X_j\sim \mathcal{N}(\mu_j,\sigma^2_j)</math>. The result follows from the uniqueness theorem for characteristic functions. Conversely, if the components are independent, all off-diagonal covariances vanish, so <math>Q</math> is diagonal.}}
{{proofcard|Theorem|thm-2|Let <math>X</math> be an <math>\R^n</math>-valued gaussian vector with mean <math>\mu</math>. Then there exist independent gaussian r.v.'s <math>Y_1,...,Y_n</math> with
<math display="block">
Y_j\sim\mathcal{N}(0,\lambda_j),\qquad\lambda_j\geq 0,\quad 1\leq j\leq n,
</math>
</math>
and an orthogonal matrix <math>A</math> such that
<math display="block">
X=\mu+AY.
</math>
{{alert-info |
It is possible that <math>\lambda_j=0</math>; in that case <math>Y_j=0</math> a.s.
}}
|By the spectral theorem there is an orthogonal matrix <math>A\in O(n)</math> such that <math>Q=A\Lambda A^*</math>, with <math>\Lambda=\begin{pmatrix}\lambda_1&\dotsm &0\\ \vdots&\ddots&\vdots\\0&\dotsm&\lambda_n\end{pmatrix}</math>, <math>\lambda_j\geq 0</math>. Set <math>Y=A^*(X-\mu)</math>. Then <math>Y</math> is a gaussian vector, since each of its components is a linear combination of the components of <math>X</math>. Moreover <math>Cov(Y)=A^*QA=\Lambda</math>, which implies that <math>Y_1,...,Y_n</math> are independent because <math>Cov(Y)</math> is diagonal, and <math>X=\mu+AY</math>.}}
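The decomposition <math>X=\mu+AY</math> is also the standard recipe for sampling from <math>\mathcal{N}(\mu,Q)</math>: diagonalize <math>Q</math>, draw independent <math>Y_j\sim\mathcal{N}(0,\lambda_j)</math>, and transform. A minimal numpy sketch (the mean and covariance below are assumed purely for illustration):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (assumed) mean and symmetric positive semidefinite covariance
mu = np.array([0.0, 1.0])
Q = np.array([[2.0, 0.8],
              [0.8, 1.0]])

# Spectral theorem: Q = A Lambda A^* with A orthogonal, Lambda = diag(lambda_j)
lam, A = np.linalg.eigh(Q)

# Independent Y_j ~ N(0, lambda_j); lambda_j = 0 gives Y_j = 0 a.s.
n_samples = 100_000
Y = rng.standard_normal((n_samples, 2)) * np.sqrt(np.clip(lam, 0.0, None))

# X = mu + A Y, applied row-wise to the sample matrix
X = mu + Y @ A.T

print(X.mean(axis=0))            # close to mu
print(np.cov(X, rowvar=False))   # close to Q
</syntaxhighlight>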
{{proofcard|Corollary|cor-2|An <math>\R^n</math>-valued gaussian vector <math>X</math> has density on <math>\R^n</math> if and only if <math>\det(Q)\not=0</math>.|}}
{{alert-info |
If <math>\det(Q)\not=0</math>, then <math>f_X(x)=\frac{1}{(2\pi)^{n/2}\sqrt{\det(Q)}}e^{-\frac{1}{2}\langle x-\mu,Q^{-1}(x-\mu)\rangle}</math>.
}}
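As a sanity check, the density formula can be evaluated directly and compared with scipy's multivariate normal (the parameters below are assumed for illustration, with <math>\det(Q)\not=0</math>):
<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative (assumed) parameters with det(Q) != 0
mu = np.array([1.0, -1.0])
Q = np.array([[1.5, 0.4],
              [0.4, 0.8]])
x = np.array([0.5, 0.0])  # arbitrary evaluation point

# f_X(x) = (2 pi)^(-n/2) det(Q)^(-1/2) exp(-(1/2) <x - mu, Q^{-1}(x - mu)>)
n = len(mu)
d = x - mu
fx = np.exp(-0.5 * d @ np.linalg.solve(Q, d)) / np.sqrt((2 * np.pi) ** n * np.linalg.det(Q))

print(fx)
print(multivariate_normal(mean=mu, cov=Q).pdf(x))  # should agree
</syntaxhighlight>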
{{proofcard|Theorem|thm-3|Let <math>X</math> be an <math>\R^n</math>-valued gaussian r.v. and let <math>Y</math> be an <math>\R^m</math>-valued gaussian r.v. If <math>X</math> and <math>Y</math> are independent, then <math>Z=(X,Y)</math> is an <math>\R^{n+m}</math>-valued gaussian vector.
|Let <math>u=(w,v)</math>, with <math>w\in\R^n</math> and <math>v\in\R^m</math>, and set <math>Q=\begin{pmatrix}Q^X&0\\ 0&Q^Y\end{pmatrix}</math>, where <math>Q^X=Cov(X)</math> and <math>Q^Y=Cov(Y)</math>. By independence we get
<math display="block">
\begin{align*}
\varphi_Z(u)&=\varphi_X(w)\varphi_Y(v)\\
&=\exp\left(i\langle w,\mu^X\rangle-\frac{1}{2}\langle w,Q^Xw\rangle\right)\exp\left(i\langle v,\mu^Y\rangle-\frac{1}{2}\langle v,Q^Yv\rangle \right)\\
&=\exp\left(i\langle (w,v),(\mu^X,\mu^Y)\rangle-\frac{1}{2}\langle u,Qu\rangle\right),
\end{align*}
</math>
implying that <math>Z</math> is a gaussian vector.}}
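Concretely, <math>Z</math> has the block-diagonal covariance matrix built from <math>Q^X</math> and <math>Q^Y</math>, which a simulation confirms (the covariances below are assumed for illustration):
<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(2)

# Illustrative (assumed) covariances of the independent pieces X and Y
QX = np.array([[1.0, 0.3],
               [0.3, 2.0]])
QY = np.array([[0.5]])

n_samples = 200_000
X = rng.multivariate_normal(np.zeros(2), QX, size=n_samples)
Y = rng.multivariate_normal(np.zeros(1), QY, size=n_samples)

# Z = (X, Y): stack the independent samples componentwise
Z = np.hstack([X, Y])

print(block_diag(QX, QY))        # the covariance Q of Z from the proof
print(np.cov(Z, rowvar=False))   # empirical covariance, close to it
</syntaxhighlight>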
{{proofcard|Theorem|thm-4|Let <math>X</math> be an <math>\R^n</math>-valued gaussian vector. Two components <math>X_j</math> and <math>X_k</math> of <math>X</math> are independent if and only if <math>Cov(X_j,X_k)=0</math>.
|Consider <math>Y=(Y_1,Y_2)</math>, with <math>Y_1=X_j</math> and <math>Y_2=X_k</math>. Then <math>Y</math> is a gaussian vector, since every linear combination of <math>Y_1</math> and <math>Y_2</math> is a linear combination of the components of <math>X</math>. If <math>Cov(X_j,X_k)=0</math>, the covariance matrix of <math>Y</math> is diagonal, so <math>Y_1</math> and <math>Y_2</math> are independent by the corollary above. The converse is immediate, since independence always implies zero covariance.}}
''Warning!:''
Let <math>Y\sim\mathcal{N}(0,1)</math> and for <math>a > 0</math> set <math>Z=Y\one_{\vert Y\vert\leq a}-Y\one_{\vert Y\vert > a}</math>. Then <math>Z\sim\mathcal{N}(0,1)</math>, by the symmetry of the standard gaussian distribution. But <math>Y+Z=2Y\one_{\vert Y\vert \leq a}</math> is not gaussian, because it is a bounded, nonconstant r.v. Therefore <math>(Y,Z)</math> is not a gaussian vector.
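The warning is easy to reproduce numerically: <math>Z</math> passes a normality test, while <math>Y+Z</math> is confined to <math>[-2a,2a]</math> and fails it. A sketch (the threshold <math>a=1</math> is an arbitrary choice; any <math>a > 0</math> exhibits the same behaviour):
<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)

a = 1.0  # arbitrary threshold; any a > 0 works
Y = rng.standard_normal(200_000)

# Z = Y on {|Y| <= a} and -Y on {|Y| > a}; by symmetry Z ~ N(0,1)
Z = np.where(np.abs(Y) <= a, Y, -Y)
print(kstest(Z, "norm").pvalue)   # large p-value: Z looks standard gaussian

# But Y + Z = 2Y 1_{|Y| <= a} is bounded by 2a, hence not gaussian
S = Y + Z
print(S.min(), S.max())           # confined to [-2a, 2a]
print(kstest(S, "norm").pvalue)   # essentially zero: S is not gaussian
</syntaxhighlight>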
==General references==
{{cite arXiv|last=Moshayedi|first=Nima|year=2020|title=Lectures on Probability Theory|eprint=2010.16280|class=math.PR}}
