guide:4a4c329fd1: Difference between revisions

From Stochiki
<div class="d-none"><math>
\newcommand{\R}{\mathbb{R}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rbar}{\overline{\mathbb{R}}}
\newcommand{\Bbar}{\overline{\mathcal{B}}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\p}{\mathbb{P}}
\newcommand{\one}{\mathds{1}}
\newcommand{\0}{\mathcal{O}}
\newcommand{\mat}{\textnormal{Mat}}
\newcommand{\sign}{\textnormal{sign}}
\newcommand{\CP}{\mathcal{P}}
\newcommand{\CT}{\mathcal{T}}
\newcommand{\CY}{\mathcal{Y}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\mathds}{\mathbb}</math></div>
{{alert-info |
Before stating the Radon-Nikodym theorem, we recall some definitions from measure theory. Let <math>(\Omega,\B)</math> be a measurable space. A measure <math>\nu</math> is ''absolutely continuous'' with respect to another measure <math>\mu</math>, written <math>\nu\ll\mu</math>, if there exists a finite measurable function <math>f\geq 0</math> with <math>d\nu=f\,d\mu</math>, that is, with


<math display="block">
\nu(B)=\int_B fd\mu
</math>
for all <math>B\in\B</math>. Two measures <math>\mu</math> and <math>\nu</math> are ''singular'' with respect to each other if there exist disjoint measurable sets <math>A_1,A_2\subset \Omega</math> with <math>\Omega=A_1\sqcup A_2</math> and with <math>\nu(A_1)=0=\mu(A_2)</math>. Finally, recall that a measure <math>\mu</math> is <math>\sigma</math>-finite if there is a decomposition of <math>\Omega</math> into measurable sets,
<math display="block">
\Omega=\bigsqcup_{i=1}^\infty A_i
</math>
with <math>\mu(A_i) < \infty</math> for every <math>i</math>.
}}
{{proofcard|Theorem (Radon-Nikodym)|RN|Let <math>\mu</math> and <math>\nu</math> be two <math>\sigma</math>-finite measures on a measurable space <math>(\Omega,\B)</math>. Then <math>\nu</math> can be decomposed as
<math display="block">
\nu=\nu_{abs}+\nu_{sing}
</math>
into the sum of two <math>\sigma</math>-finite measures, with <math>\nu_{abs}\ll\mu</math> absolutely continuous with respect to <math>\mu</math>, and with <math>\nu_{sing}</math> and <math>\mu</math> singular to each other (which will be written <math>\nu_{sing}\perp\mu</math>).|}}
{{alert-info |
The theorem yields a more practical criterion for checking whether a given <math>\sigma</math>-finite measure <math>\nu</math> is absolutely continuous with respect to another <math>\sigma</math>-finite measure <math>\mu</math>. If <math>\mu(N)=0</math> implies <math>\nu(N)=0</math> for every measurable <math>N\subset \Omega</math>, then <math>\nu_{sing}=0</math> (because <math>\nu_{sing}</math> is concentrated on a <math>\mu</math>-null set), and hence <math>\nu=\nu_{abs}</math> is absolutely continuous. The density function <math>f</math> with <math>f\,d\mu=d\nu</math> is called the ''Radon-Nikodym derivative'' and is often written <math>f=\frac{d\nu}{d\mu}</math>.
}}
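As an illustration (a worked example that is not part of the source notes), take <math>\Omega=[0,1]</math> with its Borel <math>\sigma</math>-algebra, let <math>\mu=\lambda</math> be Lebesgue measure and let <math>\delta_0</math> denote the Dirac measure at <math>0</math>. Consider
<math display="block">
\nu(B)=\int_B 2x\,d\lambda(x)+\delta_0(B).
</math>
The decomposition then reads
<math display="block">
\nu_{abs}(B)=\int_B 2x\,d\lambda(x),\qquad \nu_{sing}=\delta_0,\qquad \frac{d\nu_{abs}}{d\mu}(x)=2x.
</math>
Here <math>\nu_{sing}\perp\mu</math> via <math>A_1=(0,1]</math> and <math>A_2=\{0\}</math>, since <math>\nu_{sing}(A_1)=0=\mu(A_2)</math>. The null-set criterion fails for <math>\nu</math> itself, because <math>\mu(\{0\})=0</math> but <math>\nu(\{0\})=1</math>, which is consistent with <math>\nu_{sing}\neq 0</math>.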
{{alert-info |
To prove this theorem, we need the following lemma, which identifies a Hilbert space <math>\mathcal{H}</math> with its dual space <math>\mathcal{H}^*</math>.
}}
{{proofcard|Lemma (Riesz-representation for Hilbert spaces)|lem-1|For a Hilbert space <math>\mathcal{H}</math>, the map sending <math>h\in \mathcal{H}</math> to <math>\phi(h)\in\mathcal{H}^*</math> defined by
<math display="block">
\phi(h)(x)=\langle x,h\rangle
</math>
is a linear (resp. conjugate-linear in the complex case) isometric isomorphism between <math>\mathcal{H}</math> and its dual space <math>\mathcal{H}^*</math>.
|[Proof of [[#RN |Theorem]]]
Suppose that <math>\mu</math> and <math>\nu</math> are both finite measures (the general case can be reduced to this case by using the assumption that <math>\mu</math> and <math>\nu</math> are both <math>\sigma</math>-finite). We define a new measure <math>m=\mu+\nu</math> and will work with the real Hilbert space <math>\mathcal{H}=L^2(\Omega,m)</math>. On this Hilbert space we define a linear functional <math>\phi</math> by
<math display="block">
\phi(g)=\int gd\nu
</math>
for <math>g\in\mathcal{H}</math>. For <math>g</math> a simple function on <math>\Omega</math>, this is clearly well-defined and satisfies
<math display="block">
\vert\phi(g)\vert=\left\vert\int gd\nu\right\vert\leq \int \vert g\vert d\nu\leq \int \vert g\vert dm\leq \| g\|_{\mathcal{H}}\|\one\|_{\mathcal{H}}
</math>
where we have used the fact that <math>m=\mu+\nu</math>, that <math>\mu</math> is a positive measure (so <math>\nu\leq m</math>) and the Cauchy-Schwarz inequality on <math>\mathcal{H}</math>; note that <math>\|\one\|_{\mathcal{H}}=m(\Omega)^{1/2} < \infty</math> because <math>m</math> is finite. Since the simple functions are dense in <math>\mathcal{H}</math>, the functional extends to a bounded linear functional on all of <math>\mathcal{H}</math>. By the Riesz representation for Hilbert spaces there is some <math>k\in\mathcal{H}</math> such that
<math display="block">
\begin{equation}
\tag{7}
\int g\,d\nu=\phi(g)=\int gk\,dm.
\end{equation}
</math>
We claim that <math>k</math> takes values in <math>[0,1]</math> <math>m</math>-almost everywhere. Indeed, for any <math>B\in\B</math> we have
<math display="block">
0\leq  \nu(B)\leq  m(B),
</math>
so (using <math>g=\one_B</math>),
<math display="block">
0\leq  \int_B kdm\leq  m(B).
</math>
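Taking
<math display="block">
B=\{\omega\in\Omega\mid k(\omega) < 0\}
</math>
gives <math>\int_B k\,dm < 0</math> unless <math>m(B)=0</math>, so the lower bound forces <math>m(B)=0</math>. Similarly, taking
<math display="block">
B=\{\omega\in\Omega\mid k(\omega) > 1\}
</math>
gives <math>\int_B k\,dm > m(B)</math> unless <math>m(B)=0</math>, so the upper bound forces <math>m(B)=0</math>. This proves the claim that <math>k</math> takes values in <math>[0,1]</math> <math>m</math>-almost everywhere. Since <math>m=\mu+\nu</math>, we can reformulate (7) as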
<math display="block">
\begin{equation}
\tag{8}
\int g(1-k)\,d\nu=\int gk\,d\mu.
\end{equation}
</math>
This holds by construction for all simple functions <math>g</math>, and hence for all nonnegative measurable functions by monotone convergence. Now define <math>\nu_{sing}</math> to be the restriction <math>\nu|_{A}</math>, that is <math>\nu_{sing}(B)=\nu(A\cap B)</math>, where
<math display="block">
A=\{\omega\in\Omega\mid k(\omega)=1\}.
</math>
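By definition, <math>\nu_{sing}(\Omega\setminus A)=0</math>. Applying (8) with <math>g=\one_{A}</math>, the left-hand side vanishes because <math>1-k=0</math> on <math>A</math>, while the right-hand side equals <math>\mu(A)</math>, so we also have <math>\mu(A)=0</math>. Therefore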
<math display="block">
\nu_{sing}\perp\mu.
</math>
We also define
<math display="block">
\nu_{abs}=\nu|_{\Omega\setminus A}=\nu|_{\{\omega\in\Omega\mid k(\omega) < 1\}}
</math>
so that <math>\nu=\nu_{sing}+\nu_{abs}</math>. Define the function <math>f=\frac{k}{1-k}\geq 0</math> on <math>\Omega\setminus A</math> and let <math>g\geq 0</math> be measurable. Then by (8) we have
<math display="block">
\int_{\Omega\setminus A}gf d\mu=\int_{\Omega\setminus A}\frac{g}{1-k}kd\mu=\int_{\Omega\setminus A}\frac{g}{1-k}(1-k)d\nu=\int_{\Omega\setminus A}gd\nu_{abs},
</math>
which shows that <math>d\nu_{abs}=fd\mu</math> and so <math>\nu_{abs}\ll \mu</math>.}}
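The construction in the proof can be traced by hand on a finite sample space, where every measure is a list of point masses and the Riesz representer is simply <math>k=\nu/m</math> pointwise. The following is a minimal sketch under that assumption, not part of the source notes; the function name <code>lebesgue_decomposition</code> and the sample weights are hypothetical, and <math>k</math> is set to <math>0</math> at points of <math>m</math>-measure zero by convention.

```python
from fractions import Fraction


def lebesgue_decomposition(mu, nu):
    """Decompose nu = nu_abs + nu_sing relative to mu on a finite set.

    mu and nu are lists of nonnegative point masses indexed by the points
    of a hypothetical finite sample space {0, ..., n-1}.
    """
    mu = [Fraction(x) for x in mu]
    nu = [Fraction(x) for x in nu]
    # m = mu + nu, the measure defining the Hilbert space L^2(m).
    m = [a + b for a, b in zip(mu, nu)]
    # k represents g -> sum(g * nu) in L^2(m); k = 0 at m-null points (convention).
    k = [b / c if c > 0 else Fraction(0) for b, c in zip(nu, m)]
    # A = {k = 1}: the set carrying the singular part.
    A = {i for i, ki in enumerate(k) if ki == 1}
    nu_sing = [b if i in A else Fraction(0) for i, b in enumerate(nu)]
    nu_abs = [Fraction(0) if i in A else b for i, b in enumerate(nu)]
    # f = k / (1 - k) off A, extended by 0 on A (which is mu-null).
    f = [Fraction(0) if i in A else ki / (1 - ki) for i, ki in enumerate(k)]
    return k, A, f, nu_abs, nu_sing


# Hypothetical example weights on a four-point space.
mu = [1, 2, 0, 3]
nu = [2, 0, 5, 3]
k, A, f, nu_abs, nu_sing = lebesgue_decomposition(mu, nu)

# nu_abs has density f with respect to mu, and mu(A) = 0.
assert [fi * mi for fi, mi in zip(f, mu)] == nu_abs
assert sum(mu[i] for i in A) == 0
```

Exact rational arithmetic (<code>Fraction</code>) is used so that the test <math>k=1</math> is not affected by floating-point rounding.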
{{proofcard|Theorem|thm-1|Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>\mathcal{G}\subset \F</math> be a sub-<math>\sigma</math>-algebra of <math>\F</math> and let <math>X\in L^1(\Omega,\F,\p)</math> be a random variable (r.v.). Then there exists a unique r.v. in <math>L^1(\Omega,\mathcal{G},\p)</math>, denoted by <math>\E[X\mid\mathcal{G}]</math>, such that for all <math>B\in\mathcal{G}</math>
<math display="block">
\E[X\one_B]=\E[\E[X\mid\mathcal{G}]\one_B].
</math>
More generally, for every bounded and <math>\mathcal{G}</math>-measurable r.v. <math>Z</math> we get
<math display="block">
\E[XZ]=\E[\E[X\mid\mathcal{G}]Z]
</math>
and if <math>X\geq 0</math>, then <math>\E[X\mid\mathcal{G}]\geq 0</math>.
|The uniqueness part was already done. To show existence, assume first that <math>X\geq 0</math>. Define a new measure <math>\Q</math> on <math>(\Omega,\mathcal{G})</math> by
<math display="block">
\Q[A]=\E[X\one_A]=\int_A X(\omega)d\p(\omega)
</math>
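for all <math>A\in\mathcal{G}</math>; it is a finite measure because <math>X\in L^1</math>. Now consider the measure <math>\p</math> restricted to <math>\mathcal{G}</math>. If <math>A\in\mathcal{G}</math> satisfies <math>\p[A]=0</math>, then <math>\Q[A]=\int_A X\,d\p=0</math>, so we get that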
<math display="block">
\Q\ll\p
</math>
on <math>\mathcal{G}</math>. The Radon-Nikodym theorem implies that there exists a positive and <math>\mathcal{G}</math>-measurable r.v. <math>\tilde X</math> such that
<math display="block">
\Q[A]=\E[\tilde X\one_A]
</math>
for all <math>A\in\mathcal{G}</math>. For <math>A\in\mathcal{G}</math> we get that
<math display="block">
\E[X\one_A]=\E[\tilde X\one_A].
</math>
Now taking <math>A=\Omega</math>, we get that <math>\E[X]=\E[\tilde X]</math>. Therefore <math>\tilde X\in L^1(\Omega,\mathcal{G},\p)</math>, and hence <math>\tilde X=\E[X\mid\mathcal{G}]</math>. For the general case, we write <math>X=X^+-X^-</math> and apply the same argument to <math>X^+</math> and <math>X^-</math> separately.}}
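When <math>\Omega</math> is finite and <math>\mathcal{G}</math> is generated by a partition, the Radon-Nikodym derivative in the proof reduces to the block average <math>\Q[B]/\p[B]</math> on each block <math>B</math>. The following is a minimal sketch under that assumption, not part of the source notes; the function name <code>conditional_expectation</code> and the example values are hypothetical, and the density is set to <math>0</math> on blocks of probability zero by convention.

```python
from fractions import Fraction


def conditional_expectation(p, x, partition):
    """Return E[X | G] as a list of values, one per sample point.

    p: point probabilities, x: values of X, partition: list of disjoint
    blocks (lists of indices) generating the sub-sigma-algebra G.
    """
    p = [Fraction(v) for v in p]
    x = [Fraction(v) for v in x]
    cond = [Fraction(0)] * len(p)
    for block in partition:
        prob = sum(p[i] for i in block)          # P[B]
        q = sum(x[i] * p[i] for i in block)      # Q[B] = E[X 1_B]
        # dQ/dP is constant on B; 0 on P-null blocks (convention).
        density = q / prob if prob > 0 else Fraction(0)
        for i in block:
            cond[i] = density
    return cond


# Hypothetical example: uniform probability on four points, two blocks.
p = [Fraction(1, 4)] * 4
x = [1, 3, 2, 6]
partition = [[0, 1], [2, 3]]
cond = conditional_expectation(p, x, partition)

# Defining property: E[X 1_B] = E[E[X | G] 1_B] for each generating block B.
for block in partition:
    lhs = sum(x[i] * p[i] for i in block)
    rhs = sum(cond[i] * p[i] for i in block)
    assert lhs == rhs
```

Since every set in <math>\mathcal{G}</math> is a disjoint union of blocks, checking the defining identity on the blocks is enough in this finite setting.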
==General references==
{{cite arXiv|last=Moshayedi|first=Nima|year=2020|title=Lectures on Probability Theory|eprint=2010.16280|class=math.PR}}

Latest revision as of 00:53, 8 May 2024
