Basic facts on Gaussian vectors
[math]
\newcommand{\R}{\mathbb{R}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rbar}{\overline{\mathbb{R}}}
\newcommand{\Bbar}{\overline{\mathcal{B}}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\p}{\mathbb{P}}
\newcommand{\one}{\mathds{1}}
\newcommand{\0}{\mathcal{O}}
\newcommand{\mat}{\textnormal{Mat}}
\newcommand{\sign}{\textnormal{sign}}
\newcommand{\CP}{\mathcal{P}}
\newcommand{\CT}{\mathcal{T}}
\newcommand{\CY}{\mathcal{Y}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\mathds}{\mathbb}[/math]
A random vector [math]Z=(Z_1,...,Z_n)[/math] is said to be Gaussian if for all [math]\lambda_1,...,\lambda_n\in\R[/math] the linear combination
[[math]]
\lambda_1 Z_1+\dotsm +\lambda_n Z_n
[[/math]]
is a Gaussian random variable. Moreover, [math]Z[/math] is called centered if [math]\E[Z_j]=0[/math] for all [math]1\leq j\leq n[/math]. Let [math]Z[/math] be a centered Gaussian vector. Then for all [math]\xi\in\R^n[/math] we get
[[math]]
\E\left[e^{i\langle\xi, Z\rangle}\right]=\exp\left(-\frac{1}{2}\xi^t C_Z\xi\right),
[[/math]]
where [math]C_Z:=(C_{ij})[/math] denotes the covariance matrix with entries [math]C_{ij}=\E[Z_iZ_j][/math]. If [math]Cov(Z_i,Z_j)=0[/math], then [math]Z_i[/math] and [math]Z_j[/math] are independent. More generally, consider a Gaussian vector partitioned into blocks
[[math]]
(\underbrace{X_1,...,X_{i_1}}_{Y_1},\underbrace{X_{i_1+1},...,X_{i_2}}_{Y_2},...,\underbrace{X_{i_{n-1}+1},...,X_{i_n}}_{Y_n}).
[[/math]]
Then [math]Y_1[/math] and [math]Y_2[/math] are independent if and only if [math]Cov(X_j,X_k)=0[/math] for all [math]1\leq j\leq i_1[/math] and [math]i_1+1\leq k\leq i_2[/math]. If [math]Z_1,...,Z_n[/math] are independent Gaussian r.v.'s, we have that
[[math]]
Z=(Z_1,...,Z_n)
[[/math]]
is a Gaussian vector. If [math]Z[/math] is a Gaussian vector and [math]A\in \mathcal{M}(m\times n,\R)[/math], we get that [math]AZ[/math] is again a Gaussian vector.
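As a quick numerical sanity check of the characteristic function identity above (a minimal sketch; the matrix [math]A[/math] and the vector [math]\xi[/math] below are arbitrary choices for illustration), one can sample a centered Gaussian vector [math]Z=AN[/math] with [math]N[/math] standard normal, so that [math]C_Z=AA^t[/math], and compare the empirical value of [math]\E[e^{i\langle\xi,Z\rangle}][/math] with [math]\exp(-\frac{1}{2}\xi^t C_Z\xi)[/math].
```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary mixing matrix; C = A A^t is then a valid covariance matrix.
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, -1.0, 1.5]])
C = A @ A.T

# Sample Z = A N with N standard normal, so Z is centered Gaussian with Cov(Z) = C.
n_samples = 200_000
Z = rng.standard_normal((n_samples, 3)) @ A.T

xi = np.array([0.3, -0.7, 0.2])   # arbitrary test vector

# Empirical characteristic function E[exp(i <xi, Z>)] vs. the closed form.
empirical = np.mean(np.exp(1j * Z @ xi))
theoretical = np.exp(-0.5 * xi @ C @ xi)

print(empirical)     # close to `theoretical`, up to Monte Carlo error
print(theoretical)
```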
Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math]X\in L^1(\Omega,\F,\p)[/math] and [math]Y_1,...,Y_p\in L^1(\Omega,\F,\p)[/math] and let [math](X,Y_1,...,Y_p)[/math] be a centered Gaussian vector. Then
[[math]]
\E[X\mid Y_1,...,Y_p]
[[/math]]
is the orthogonal projection (in [math]L^2(\Omega,\F,\p)[/math]) of
[math]X[/math] onto the vector space
[[math]]
span\{Y_1,...,Y_p\}.
[[/math]]
Consequently, there exist real numbers [math]\lambda_1,...,\lambda_p[/math] such that
[[math]]
\E[X\mid Y_1,...,Y_p]=\lambda_1 Y_1+\dotsm +\lambda_p Y_p.
[[/math]]
Moreover, for a measurable map
[math]h:\R\to\R_+[/math] we get
[[math]]
\E[h(X)\mid Y_1,...,Y_p]=\int_\R h(x)Q_{\sum_{j=1}^p\lambda_j Y_j,\sigma^2}(x)dx,
[[/math]]
where
[math]\sigma^2=\E\left[\left( X-\sum_{j=1}^p\lambda_j Y_j\right)^2\right][/math] and
[[math]]
Q_{m,\sigma^2}(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-m)^2}{2\sigma^2}\right).
[[/math]]
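For example, in the simplest case [math]p=1[/math] (with [math]\E[Y_1^2] \gt 0[/math]), the orthogonality condition [math]\E[(X-\lambda_1 Y_1)Y_1]=0[/math] used in the proof below gives
[[math]]
\lambda_1=\frac{\E[XY_1]}{\E[Y_1^2]},\qquad \E[X\mid Y_1]=\frac{\E[XY_1]}{\E[Y_1^2]}\,Y_1,\qquad \sigma^2=\E[X^2]-\frac{\E[XY_1]^2}{\E[Y_1^2]}.
[[/math]]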
Proof: Exercise.[a]
Let [math]\tilde X=\lambda_1 Y_1+\dotsm +\lambda_pY_p[/math] be the orthogonal projection of [math]X[/math] onto [math]span\{Y_1,...,Y_p\}[/math], meaning that for all [math]1\leq j\leq p[/math]
[[math]]
\E[(X-\tilde X)Y_j]=0.
[[/math]]
Note that these orthogonality conditions determine the
[math]\lambda_j[/math]'s explicitly. Since
[math](Y_1,...,Y_p,X-\tilde X)[/math] is a linear transformation of the Gaussian vector
[math](X,Y_1,...,Y_p)[/math], it is again a Gaussian vector. Moreover, since all entries are centered, we get
[math]\E[(X-\tilde X)Y_j]=Cov(X-\tilde X,Y_j)=0[/math] and thus
[math]X-\tilde X[/math] is independent of
[math]Y_1,...,Y_p[/math]. Hence
[[math]]
\begin{aligned}
\E[X\mid Y_1,...,Y_p]&=\E[X-\tilde X+\tilde X\mid Y_1,...,Y_p]\\
&=\E[X-\tilde X\mid Y_1,...,Y_p]+\E[\tilde X\mid Y_1,...,Y_p]\\
&=\E[X-\tilde X]+\tilde X=\tilde X.
\end{aligned}
[[/math]]
■
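The projection description of the conditional expectation can also be illustrated numerically. The following minimal Python sketch (the mixing matrix [math]B[/math] is an arbitrary choice for illustration) computes [math]\lambda_1,\lambda_2[/math] from the orthogonality conditions [math]\E[(X-\tilde X)Y_j]=0[/math], written in matrix form as [math]C_{YY}\lambda=C_{YX}[/math], and checks empirically that the residual [math]X-\tilde X[/math] is uncorrelated with [math]Y_1,Y_2[/math] and that a least-squares fit of [math]X[/math] on [math](Y_1,Y_2)[/math] recovers essentially the same coefficients.
```python
import numpy as np

rng = np.random.default_rng(1)

# Build a centered Gaussian vector (X, Y1, Y2) as a linear image of i.i.d. normals.
B = np.array([[1.0, 0.5, -0.3],
              [0.4, 1.2,  0.0],
              [-0.7, 0.2, 0.9]])
S = rng.standard_normal((500_000, 3)) @ B.T
X, Y = S[:, 0], S[:, 1:]          # X scalar, Y = (Y1, Y2)

# Covariance of the full vector; since everything is centered, Cov = E[. * .].
C = B @ B.T
C_YY = C[1:, 1:]                  # E[Y_i Y_j]
C_YX = C[1:, 0]                   # E[Y_i X]

# Orthogonality conditions in matrix form: C_YY @ lam = C_YX.
lam = np.linalg.solve(C_YY, C_YX)
X_tilde = Y @ lam                 # candidate for E[X | Y1, Y2]

# Sanity checks: residual uncorrelated with Y, and least squares recovers lam.
print(np.cov(X - X_tilde, Y[:, 0])[0, 1])    # ~ 0
print(lam)
print(np.linalg.lstsq(Y, X, rcond=None)[0])  # ~ lam, up to sampling error
```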
General references
Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].
Notes
- This is done similarly to the proof of theorem