More properties of the conditional expectation
Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math]\mathcal{G}_1\subset\F[/math] and [math]\mathcal{G}_2\subset\F[/math] be two sub-[math]\sigma[/math]-algebras of [math]\F[/math]. Then [math]\mathcal{G}_1[/math] and [math]\mathcal{G}_2[/math] are independent if and only if for every positive [math]\mathcal{G}_2[/math]-measurable r.v. [math]X[/math] (or for every [math]X\in L^1(\Omega,\mathcal{G}_2,\p)[/math], or for every [math]X=\one_A[/math] with [math]A\in \mathcal{G}_2[/math]) we have
[math]\E[X\mid \mathcal{G}_1]=\E[X].[/math]
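We only need to prove that the statement in parentheses, in its weakest form (for indicators), implies that [math]\mathcal{G}_1[/math] and [math]\mathcal{G}_2[/math] are independent. Assume that for all [math]A\in \mathcal{G}_2[/math] we have
[math]\E[\one_A\mid \mathcal{G}_1]=\p[A].[/math]
Then for all [math]B\in\mathcal{G}_1[/math] we get
[math]\p[A\cap B]=\E[\one_A\one_B]=\E[\E[\one_A\mid \mathcal{G}_1]\one_B]=\p[A]\p[B],[/math]
which shows that [math]\mathcal{G}_1[/math] and [math]\mathcal{G}_2[/math] are independent.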
Let [math]Z[/math] and [math]Y[/math] be two real-valued r.v.'s. Then [math]Z[/math] and [math]Y[/math] are independent if and only if for every Borel measurable [math]h[/math] such that [math]\E[\vert h(Z)\vert] \lt \infty[/math], we get [math]\E[h(Z)\mid Y]=\E[h(Z)][/math]. To see this, we apply the theorem with [math]\mathcal{G}_1=\sigma(Y)[/math] and [math]\mathcal{G}_2=\sigma(Z)[/math], and note that all r.v.'s in [math]L^1(\Omega,\mathcal{G}_2,\p)[/math] are of the form [math]h(Z)[/math] with [math]\E[\vert h(Z)\vert] \lt \infty[/math]. In particular, if [math]Z\in L^1(\Omega,\F,\p)[/math] and [math]Z[/math] and [math]Y[/math] are independent, we get [math]\E[Z\mid Y]=\E[Z][/math]. Be aware that the latter equation alone does not imply that [math]Y[/math] and [math]Z[/math] are independent. For example, take [math]Z\sim\mathcal{N}(0,1)[/math] and [math]Y=\vert Z\vert[/math]. Since [math]Z[/math] and [math]-Z[/math] have the same law, for every bounded Borel measurable [math]h[/math] we get [math]\E[Zh(\vert Z\vert )]=-\E[Zh(\vert Z\vert )][/math], hence [math]\E[Zh(\vert Z\vert )]=0[/math]. Thus [math]\E[Z\mid \vert Z\vert]=0=\E[Z][/math], but [math]Z[/math] and [math]\vert Z\vert[/math] are not independent.
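The counterexample can be checked numerically. The following is a minimal Monte Carlo sketch in Python (standard library only). The function name `check_counterexample`, the bounded test function h(y) = 1/(1+y), the threshold 1 and the sample size are illustrative assumptions, not part of the notes. It estimates E[Z h(|Z|)], which should be close to 0, and compares P[Z>1, |Z|>1] with P[Z>1]P[|Z|>1], which would agree if Z and |Z| were independent.

```python
import random


def check_counterexample(n=100_000, seed=0):
    """Monte Carlo check that E[Z h(|Z|)] ~ 0 while Z and |Z| are dependent."""
    rng = random.Random(seed)
    zs = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Illustrative bounded test function h(y) = 1 / (1 + y) (an assumption).
    mean_zh = sum(z / (1.0 + abs(z)) for z in zs) / n

    # Events {Z > 1} and {|Z| > 1}: their intersection is {Z > 1} itself.
    p_z = sum(1 for z in zs if z > 1.0) / n
    p_abs = sum(1 for z in zs if abs(z) > 1.0) / n
    p_joint = sum(1 for z in zs if z > 1.0 and abs(z) > 1.0) / n

    return mean_zh, p_joint, p_z * p_abs


if __name__ == "__main__":
    mean_zh, p_joint, p_product = check_counterexample()
    print(f"E[Z h(|Z|)] estimate:      {mean_zh:+.4f}")
    print(f"P[Z>1, |Z|>1] estimate:    {p_joint:.4f}")
    print(f"P[Z>1] P[|Z|>1] estimate:  {p_product:.4f}")
```

For independent variables the last two numbers would agree up to sampling error. Here the joint probability is P[Z>1], roughly 0.16, while the product is roughly 0.05.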
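Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math]X[/math] and [math]Y[/math] be two r.v.'s on that space with values in measurable spaces [math]E[/math] and [math]F[/math] respectively. Assume that [math]X[/math] is independent of the sub-[math]\sigma[/math]-algebra [math]\mathcal{G}\subset \F[/math] and that [math]Y[/math] is [math]\mathcal{G}[/math]-measurable. Then for every measurable map [math]g:E\times F\to \R_+[/math] we have
[math]\E[g(X,Y)\mid \mathcal{G}]=\int_E g(x,Y)\,d\p_X(x),[/math]
where [math]\p_X[/math] denotes the law of [math]X[/math]. The right-hand side is the composition of [math]Y[/math] with the map [math]\Phi:y\mapsto \int_E g(x,y)\,d\p_X(x)[/math].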
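We need to show that for every positive [math]\mathcal{G}[/math]-measurable r.v. [math]Z[/math] we get that
[math]\E[g(X,Y)Z]=\E[\Phi(Y)Z],\qquad \Phi(y)=\int_E g(x,y)\,d\p_X(x),[/math]
where [math]\p_X[/math] denotes the law of [math]X[/math]. Since [math]X[/math] is independent of [math]\mathcal{G}[/math], it is independent of [math](Y,Z)[/math], so the law of [math](X,Y,Z)[/math] is the product of [math]\p_X[/math] and the law of [math](Y,Z)[/math], and the identity follows from Fubini's theorem.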
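As an illustration of this result, the following Python sketch (standard library only) takes [math]\mathcal{G}=\sigma(Y)[/math] and compares the empirical conditional mean of g(X,Y) given Y=y with the integral of g(x,y) against the law of X. The concrete choices are assumptions made for the example: X uniform on [0,1], Y uniform on {1,2,3} and independent of X, and g(x,y) = x^y, so that the integral equals 1/(y+1). The names `g`, `phi` and `empirical_conditional_means` are hypothetical.

```python
import random
from collections import defaultdict


def g(x, y):
    """Illustrative map g(x, y) = x**y (an assumption for this example)."""
    return x ** y


def phi(y):
    """Phi(y) = integral over [0, 1] of x**y dx = 1 / (y + 1)."""
    return 1.0 / (y + 1)


def empirical_conditional_means(n=120_000, seed=1):
    """Estimate E[g(X, Y) | Y = y] for each y by averaging within groups."""
    rng = random.Random(seed)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(n):
        x = rng.random()            # X ~ U(0, 1), drawn independently of Y
        y = rng.choice((1, 2, 3))   # Y uniform on {1, 2, 3}
        sums[y] += g(x, y)
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in sorted(counts)}


if __name__ == "__main__":
    for y, estimate in empirical_conditional_means().items():
        print(f"y={y}: empirical {estimate:.4f}, Phi(y) {phi(y):.4f}")
```

Because X is independent of Y, averaging g(X,Y) over the samples with Y=y amounts to freezing y and integrating over the law of X, which is what the formula states.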
Important examples
We need to take a look at two important examples.
Variables with densities
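Let [math](X,Y)[/math] be a r.v. with values in [math]\R^m\times \R^n[/math]. Assume that [math](X,Y)[/math] has density [math]P(x,y)[/math], i.e. for all Borel measurable maps [math]h:\R^m\times\R^n\to \R_+[/math] we have
[math]\E[h(X,Y)]=\int_{\R^m\times\R^n}h(x,y)P(x,y)\,dx\,dy.[/math]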
The density of [math]Y[/math] is given by
[math]Q(y)=\int_{\R^m}P(x,y)\,dx.[/math]
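We want to compute [math]\E[h(X)\mid Y][/math] for some measurable map [math]h:\R^m\to\R_+[/math]. For every measurable map [math]g:\R^n\to\R_+[/math] we have
[math]\E[h(X)g(Y)]=\int_{\R^m\times\R^n}h(x)g(y)P(x,y)\,dx\,dy=\int_{\R^n}\left(\int_{\R^m}h(x)P(x,y)\,dx\right)g(y)\,dy=\int_{\R^n}\varphi(y)Q(y)g(y)\,dy=\E[\varphi(Y)g(Y)],[/math]
where [math]Q(y)=\int_{\R^m}P(x,y)\,dx[/math] is the density of [math]Y[/math] and
[math]\varphi(y)=\begin{cases}\frac{1}{Q(y)}\int_{\R^m}h(x)P(x,y)\,dx,&Q(y) \gt 0\\ h(0),&Q(y)=0\end{cases}[/math]
Hence [math]\E[h(X)\mid Y]=\varphi(Y)[/math]. The value chosen on [math]\{Q=0\}[/math] is irrelevant, since [math]\p[Q(Y)=0]=0[/math].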
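For [math]y\in\R^n[/math], let [math]\nu(y,dx)[/math] be the probability measure on [math]\R^m[/math] defined by
[math]\nu(y,dx)=\begin{cases}\frac{1}{Q(y)}P(x,y)\,dx,&Q(y) \gt 0\\ \delta_0(dx),&Q(y)=0\end{cases}[/math]
where [math]Q[/math] denotes the density of [math]Y[/math]. Then [math]\E[h(X)\mid Y]=\int_{\R^m}h(x)\nu(Y,dx)[/math].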
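In the literature, one often writes, by abuse of notation,
[math]\E[h(X)\mid Y=y]=\int_{\R^m}h(x)\nu(y,dx),[/math]
and calls [math]\nu(y,dx)[/math] the conditional distribution of [math]X[/math] given [math]Y=y[/math], even though [math]\p[Y=y]=0[/math].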
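The density formulas can be tried out on a concrete case. The sketch below (Python, standard library only) assumes m=n=1 and the illustrative joint density P(x,y) = x+y on [0,1]^2, writes Q(y) for the marginal density of Y, and evaluates the ratio of the integral of h(x)P(x,y) over x to Q(y) with a midpoint rule. The density, the helper names and the number of integration steps are assumptions made for the example.

```python
def joint_density(x, y):
    """Illustrative density P(x, y) = x + y on [0, 1]^2, zero elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def midpoint_integral(f, a, b, steps=2_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


def marginal_density(y):
    """Q(y): integrate the joint density over x."""
    return midpoint_integral(lambda x: joint_density(x, y), 0.0, 1.0)


def conditional_expectation(h, y):
    """Integral of h(x) P(x, y) dx divided by Q(y), for Q(y) > 0."""
    q = marginal_density(y)
    if q <= 0.0:
        # The value on {Q = 0} is arbitrary (a null set for Y); h(0) is used.
        return h(0.0)
    numerator = midpoint_integral(lambda x: h(x) * joint_density(x, y), 0.0, 1.0)
    return numerator / q


if __name__ == "__main__":
    print(conditional_expectation(lambda x: x, 0.5))
```

For this density and h(x)=x, the exact value at y=1/2 is (1/3 + 1/4)/(1/2 + 1/2) = 7/12, which the numerical result should reproduce up to the integration error.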
The Gaussian case
Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math]X,Y_1,...,Y_p\in L^2(\Omega,\F,\p)[/math]. We saw that [math]\E[X\mid Y_1,...,Y_p][/math] is the orthogonal projection of [math]X[/math] on [math]L^2(\Omega,\sigma(Y_1,...,Y_p),\p)[/math]. Since this conditional expectation is [math]\sigma(Y_1,...,Y_p)[/math]-measurable, it is of the form [math]\varphi(Y_1,...,Y_p)[/math]. In general, [math]L^2(\Omega,\sigma(Y_1,...,Y_p),\p)[/math] is infinite-dimensional, so it is hard to obtain [math]\varphi[/math] explicitly. We also saw that [math]\varphi(Y_1,...,Y_p)[/math] is the best approximation of [math]X[/math] in the [math]L^2[/math] sense by an element of [math]L^2(\Omega,\sigma(Y_1,...,Y_p),\p)[/math]. Moreover, it is well known that the best [math]L^2[/math]-approximation of [math]X[/math] by an affine function of [math]Y_1,...,Y_p[/math] is the orthogonal projection of [math]X[/math] on the vector space spanned by [math]\{\one,Y_1,...,Y_p\}[/math], i.e. it is of the form [math]\alpha_0+\alpha_1Y_1+\dotsm+\alpha_pY_p[/math] with
[math]\E[(X-(\alpha_0+\alpha_1Y_1+\dotsm+\alpha_pY_p))^2]=\inf_{\beta_0,...,\beta_p\in\R}\E[(X-(\beta_0+\beta_1Y_1+\dotsm+\beta_pY_p))^2].[/math]
In general, this is different from the orthogonal projection on [math]L^2(\Omega,\sigma(Y_1,...,Y_p),\p)[/math], except in the Gaussian case.
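The difference between the affine projection and the conditional expectation can be seen in a small simulation. The sketch below (Python, standard library only) treats the case p=1, where the minimizing coefficients are Cov(X,Y)/Var(Y) for the slope and E[X] minus slope times E[Y] for the intercept. The two test models, the helper names `affine_projection` and `simulate`, and the sample size are assumptions chosen for illustration: a Gaussian pair with X = 1 + 2Y + noise, and the non-Gaussian pair X = Y^2.

```python
import random


def affine_projection(xs, ys):
    """Coefficients (a0, a1) minimizing the empirical mean of (x - a0 - a1*y)^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    a1 = cov_xy / var_y
    a0 = mean_x - a1 * mean_y
    return a0, a1


def simulate(n=100_000, seed=2):
    """Return the affine projection coefficients for both illustrative models."""
    rng = random.Random(seed)
    ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Gaussian pair: X = 1 + 2Y + independent N(0,1) noise, so E[X | Y] = 1 + 2Y.
    xs_gauss = [1.0 + 2.0 * y + rng.gauss(0.0, 1.0) for y in ys]
    # Non-Gaussian pair: X = Y^2, so E[X | Y] = Y^2 is not affine in Y.
    xs_square = [y * y for y in ys]
    return affine_projection(xs_gauss, ys), affine_projection(xs_square, ys)


if __name__ == "__main__":
    gauss_coeffs, square_coeffs = simulate()
    print("Gaussian pair (a0, a1):", gauss_coeffs)
    print("X = Y^2       (a0, a1):", square_coeffs)
```

In the Gaussian model the coefficients should be close to (1, 2), matching E[X | Y] = 1 + 2Y. For X = Y^2 they should be close to (1, 0), so the best affine approximation is the constant 1 while the conditional expectation is Y^2.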
General references
Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].