<div class="d-none"><math>
\newcommand{\R}{\mathbb{R}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rbar}{\overline{\mathbb{R}}}
\newcommand{\Bbar}{\overline{\mathcal{B}}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\p}{\mathbb{P}}
\newcommand{\one}{\mathds{1}}
\newcommand{\0}{\mathcal{O}}
\newcommand{\mat}{\textnormal{Mat}}
\newcommand{\sign}{\textnormal{sign}}
\newcommand{\CP}{\mathcal{P}}
\newcommand{\CT}{\mathcal{T}}
\newcommand{\CY}{\mathcal{Y}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\mathds}{\mathbb}</math></div>
===Gaussian Vectors===
{{definitioncard|Gaussian Random Vector|An <math>\R^n</math>-valued r.v. <math>X=(X_1,...,X_n)</math> is called a gaussian random vector if every linear combination <math>\sum_{j=1}^n\lambda_jX_j</math>, with <math>\lambda_j\in\R</math>, is a gaussian r.v. (possibly degenerate: <math>\mathcal{N}(\mu,0)</math> means the constant <math>\mu</math> a.s.).}}
{{proofcard|Theorem|thm-1|<math>X</math> is an <math>\R^n</math>-valued gaussian vector if and only if its characteristic function has the form

<math display="block">
\varphi_X(u)=\exp\left(i\langle u,\mu\rangle-\frac{1}{2}\langle u,Qu\rangle\right),\qquad(*)
</math>
where <math>\mu\in\R^n</math> and <math>Q</math> is a symmetric positive semidefinite matrix of size <math>n\times n</math>. <math>Q</math> is then the covariance matrix of <math>X</math> and <math>\mu</math> is the mean vector, i.e. <math>\E[X_j]=\mu_j</math>.
|Suppose that <math>(*)</math> holds. Let <math>Y=\sum_{j=1}^na_jX_j=\langle a,X\rangle</math> for some <math>a\in\R^n</math>. For <math>v\in\R</math>,
<math display="block">
\varphi_Y(v)=\varphi_X(va)=\exp\left(iv\langle a,\mu\rangle-\frac{v^2}{2}\langle a,Qa\rangle\right),
</math>
so <math>Y\sim \mathcal{N}\left(\langle a,\mu\rangle,\langle a,Qa\rangle\right)</math> and hence <math>X</math> is a gaussian vector. Conversely, assume that <math>X</math> is a gaussian vector and let <math>Y=\sum_{j=1}^na_jX_j=\langle a,X\rangle</math>. Let <math>Q=Cov(X)</math> and note that <math>\E[Y]=\langle a,\mu\rangle</math> and <math>Var(Y)=\langle a,Qa\rangle</math>. Since <math>Y</math> is a gaussian r.v.,
<math display="block">
\varphi_Y(v)=\exp\left(iv\langle a,\mu\rangle-\frac{v^2}{2}\langle a,Qa\rangle\right).
</math>
Now <math>\varphi_X(a)=\varphi_Y(1)</math>, which is exactly the form <math>(*)</math>.}}
''Notation:'' We write <math>X\sim \mathcal{N}(\mu,Q)</math>.
'''Example'''
Let <math>X_1,...,X_n</math> be independent gaussian r.v.'s with <math>X_k\sim\mathcal{N}(\mu_k,\sigma_k^2)</math>. Then <math>X=(X_1,...,X_n)</math> is a gaussian vector. Indeed, we have
<math display="block">
\begin{align*}
\varphi_X(v_1,...,v_n)&=\E\left[e^{i(v_1X_1+...+v_nX_n)}\right]\\
&=\prod_{j=1}^n\E\left[e^{iv_jX_j}\right]\\
&=e^{i\langle v,\mu\rangle-\frac{1}{2}\langle v,Qv\rangle},
\end{align*}
</math>
where <math>\mu=(\mu_1,...,\mu_n)</math>, <math>Q=\begin{pmatrix}\sigma_1^2&\dotsm &0\\ \vdots&\ddots&\vdots\\0&\dotsm&\sigma_n^2\end{pmatrix}</math>.
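The closed form <math>(*)</math> is easy to check by simulation. The following sketch (illustrative only; the values of <math>\mu</math>, <math>\sigma</math> and the test point <math>v</math> are arbitrary choices) compares a Monte Carlo estimate of <math>\varphi_X(v)=\E\left[e^{i\langle v,X\rangle}\right]</math> against the formula above.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters: independent components X_k ~ N(mu_k, sigma_k^2)
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([1.0, 0.5, 2.0])
Q = np.diag(sigma**2)

# Draw samples of X = (X_1, ..., X_n)
n_samples = 200_000
X = mu + sigma * rng.standard_normal((n_samples, 3))

v = np.array([0.3, -0.7, 0.2])  # arbitrary test point

# Monte Carlo estimate of phi_X(v) = E[exp(i <v, X>)]
phi_mc = np.mean(np.exp(1j * (X @ v)))

# Closed form (*): exp(i <v, mu> - (1/2) <v, Qv>)
phi_closed = np.exp(1j * (v @ mu) - 0.5 * (v @ Q @ v))

print(phi_mc)      # agrees with phi_closed up to Monte Carlo error
print(phi_closed)
</syntaxhighlight>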
{{proofcard|Corollary|cor-1|Let <math>X</math> be an <math>\R^n</math>-valued gaussian vector. The components <math>X_j</math> of <math>X</math> are independent if and only if its covariance matrix <math>Q</math> is diagonal.
|Suppose <math>Q=\begin{pmatrix}\sigma_1^2&\dotsm &0\\ \vdots&\ddots&\vdots\\0&\dotsm&\sigma_n^2\end{pmatrix}</math>, then <math>(*)</math> shows that
<math display="block">
\varphi_X(v_1,...,v_n)=\prod_{j=1}^n\varphi_{X_j}(v_j),
</math>
where <math>X_j\sim \mathcal{N}(\mu_j,\sigma^2_j)</math>. The result follows from the uniqueness theorem for characteristic functions. Conversely, if the components are independent, all off-diagonal covariances vanish, so <math>Q</math> is diagonal.}}
{{proofcard|Theorem|thm-2|Let <math>X</math> be an <math>\R^n</math>-valued gaussian vector with mean <math>\mu</math>. Then there exist independent gaussian r.v.'s <math>Y_1,...,Y_n</math> with
<math display="block">
Y_j\sim\mathcal{N}(0,\lambda_j),\qquad\lambda_j\geq 0,\quad 1\leq j\leq n,
</math>
</math>
and an orthogonal matrix <math>A</math> such that
<math display="block">
X=\mu+AY.
</math>
{{alert-info |
It is possible that <math>\lambda_j=0</math>; in that case <math>Y_j=0</math> a.s.
}}
|By the spectral theorem there is an orthogonal matrix <math>A\in O(n)</math> such that <math>Q=A\Lambda A^*</math>, with <math>\Lambda=\begin{pmatrix}\lambda_1&\dotsm &0\\ \vdots&\ddots&\vdots\\0&\dotsm&\lambda_n\end{pmatrix}</math>, <math>\lambda_j\geq 0</math>. Set <math>Y=A^*(X-\mu)</math>. Then <math>Y</math> is a gaussian vector, since each of its components is a linear combination of the components of <math>X</math>. Moreover <math>Cov(Y)=A^*QA=\Lambda</math>, which implies that <math>Y_1,...,Y_n</math> are independent because <math>Cov(Y)</math> is diagonal, and <math>X=\mu+AY</math>.}}
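The decomposition <math>X=\mu+AY</math> is also the standard recipe for sampling from <math>\mathcal{N}(\mu,Q)</math>: diagonalize <math>Q</math>, draw independent <math>Y_j\sim\mathcal{N}(0,\lambda_j)</math>, and transform. A minimal numpy sketch (the mean and covariance below are assumed purely for illustration):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (assumed) mean and symmetric positive semidefinite covariance
mu = np.array([0.0, 1.0])
Q = np.array([[2.0, 0.8],
              [0.8, 1.0]])

# Spectral theorem: Q = A Lambda A^* with A orthogonal, Lambda = diag(lambda_j)
lam, A = np.linalg.eigh(Q)

# Independent Y_j ~ N(0, lambda_j); lambda_j = 0 gives Y_j = 0 a.s.
n_samples = 100_000
Y = rng.standard_normal((n_samples, 2)) * np.sqrt(np.clip(lam, 0.0, None))

# X = mu + A Y, applied row-wise to the sample matrix
X = mu + Y @ A.T

print(X.mean(axis=0))            # close to mu
print(np.cov(X, rowvar=False))   # close to Q
</syntaxhighlight>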
{{proofcard|Corollary|cor-2|An <math>\R^n</math>-valued gaussian vector <math>X</math> has density on <math>\R^n</math> if and only if <math>\det(Q)\not=0</math>.|}}
{{alert-info |
If <math>\det(Q)\not=0</math>, then <math>f_X(x)=\frac{1}{(2\pi)^{n/2}\sqrt{\det(Q)}}e^{-\frac{1}{2}\langle x-\mu,Q^{-1}(x-\mu)\rangle}</math>.
}}
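As a sanity check, the density formula can be evaluated directly and compared with scipy's multivariate normal (the parameters below are assumed for illustration, with <math>\det(Q)\not=0</math>):
<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative (assumed) parameters with det(Q) != 0
mu = np.array([1.0, -1.0])
Q = np.array([[1.5, 0.4],
              [0.4, 0.8]])
x = np.array([0.5, 0.0])  # arbitrary evaluation point

# f_X(x) = (2 pi)^(-n/2) det(Q)^(-1/2) exp(-(1/2) <x - mu, Q^{-1}(x - mu)>)
n = len(mu)
d = x - mu
fx = np.exp(-0.5 * d @ np.linalg.solve(Q, d)) / np.sqrt((2 * np.pi) ** n * np.linalg.det(Q))

print(fx)
print(multivariate_normal(mean=mu, cov=Q).pdf(x))  # should agree
</syntaxhighlight>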
{{proofcard|Theorem|thm-3|Let <math>X</math> be an <math>\R^n</math>-valued gaussian r.v. and let <math>Y</math> be an <math>\R^m</math>-valued gaussian r.v. If <math>X</math> and <math>Y</math> are independent, then <math>Z=(X,Y)</math> is an <math>\R^{n+m}</math>-valued gaussian vector.
|Let <math>u=(w,v)</math>, with <math>w\in\R^n</math> and <math>v\in\R^m</math>, and set <math>Q=\begin{pmatrix}Q^X&0\\ 0&Q^Y\end{pmatrix}</math>, where <math>Q^X=Cov(X)</math> and <math>Q^Y=Cov(Y)</math>. By independence we get
<math display="block">
\begin{align*}
\varphi_Z(u)&=\varphi_X(w)\varphi_Y(v)\\
&=\exp\left(i\langle w,\mu^X\rangle-\frac{1}{2}\langle w,Q^Xw\rangle\right)\exp\left(i\langle v,\mu^Y\rangle-\frac{1}{2}\langle v,Q^Yv\rangle \right)\\
&=\exp\left(i\langle (w,v),(\mu^X,\mu^Y)\rangle-\frac{1}{2}\langle u,Qu\rangle\right),
\end{align*}
</math>
implying that <math>Z</math> is a gaussian vector.}}
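Concretely, <math>Z</math> has the block-diagonal covariance matrix built from <math>Q^X</math> and <math>Q^Y</math>, which a simulation confirms (the covariances below are assumed for illustration):
<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(2)

# Illustrative (assumed) covariances of the independent pieces X and Y
QX = np.array([[1.0, 0.3],
               [0.3, 2.0]])
QY = np.array([[0.5]])

n_samples = 200_000
X = rng.multivariate_normal(np.zeros(2), QX, size=n_samples)
Y = rng.multivariate_normal(np.zeros(1), QY, size=n_samples)

# Z = (X, Y): stack the independent samples componentwise
Z = np.hstack([X, Y])

print(block_diag(QX, QY))        # the covariance Q of Z from the proof
print(np.cov(Z, rowvar=False))   # empirical covariance, close to it
</syntaxhighlight>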
{{proofcard|Theorem|thm-4|Let <math>X</math> be an <math>\R^n</math>-valued gaussian vector. Two components <math>X_j</math> and <math>X_k</math> of <math>X</math> are independent if and only if <math>Cov(X_j,X_k)=0</math>.
|Consider <math>Y=(Y_1,Y_2)</math>, with <math>Y_1=X_j</math> and <math>Y_2=X_k</math>. Then <math>Y</math> is a gaussian vector, since every linear combination of <math>Y_1</math> and <math>Y_2</math> is a linear combination of the components of <math>X</math>. If <math>Cov(X_j,X_k)=0</math>, the covariance matrix of <math>Y</math> is diagonal, so <math>Y_1</math> and <math>Y_2</math> are independent by the corollary above. The converse is immediate, since independence always implies zero covariance.}}
''Warning!:''
Let <math>Y\sim\mathcal{N}(0,1)</math> and for <math>a > 0</math> set <math>Z=Y\one_{\vert Y\vert\leq a}-Y\one_{\vert Y\vert > a}</math>. Then <math>Z\sim\mathcal{N}(0,1)</math>, by the symmetry of the standard gaussian distribution. But <math>Y+Z=2Y\one_{\vert Y\vert \leq a}</math> is not gaussian, because it is a bounded, nonconstant r.v. Therefore <math>(Y,Z)</math> is not a gaussian vector.
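The warning is easy to reproduce numerically: <math>Z</math> passes a normality test, while <math>Y+Z</math> is confined to <math>[-2a,2a]</math> and fails it. A sketch (the threshold <math>a=1</math> is an arbitrary choice; any <math>a > 0</math> exhibits the same behaviour):
<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)

a = 1.0  # arbitrary threshold; any a > 0 works
Y = rng.standard_normal(200_000)

# Z = Y on {|Y| <= a} and -Y on {|Y| > a}; by symmetry Z ~ N(0,1)
Z = np.where(np.abs(Y) <= a, Y, -Y)
print(kstest(Z, "norm").pvalue)   # large p-value: Z looks standard gaussian

# But Y + Z = 2Y 1_{|Y| <= a} is bounded by 2a, hence not gaussian
S = Y + Z
print(S.min(), S.max())           # confined to [-2a, 2a]
print(kstest(S, "norm").pvalue)   # essentially zero: S is not gaussian
</syntaxhighlight>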
==General references==
{{cite arXiv|last=Moshayedi|first=Nima|year=2020|title=Lectures on Probability Theory|eprint=2010.16280|class=math.PR}}
