guide:4a4c329fd1: Difference between revisions

Latest revision as of 00:53, 8 May 2024

Before stating the Radon-Nikodym theorem, we recall some definitions from measure theory. Let [math](\Omega,\B)[/math] be a measurable space. A measure [math]\nu[/math] is [math]absolutely[/math] [math]continuous[/math] with respect to another measure [math]\mu[/math], written [math]\nu\ll\mu[/math], if [math]d\nu=fd\mu[/math] for some finite measurable [math]f\geq 0[/math], that is, if there is a finite measurable [math]f\geq 0[/math] with

[[math]] \nu(B)=\int_B fd\mu [[/math]]

for all [math]B\in\B[/math]. Two measures [math]\mu[/math] and [math]\nu[/math] are [math]singular[/math] with respect to each other if there exist disjoint measurable sets [math]A_1,A_2\subset \Omega[/math] with [math]\Omega=A_1\sqcup A_2[/math] and with [math]\nu(A_1)=0=\mu(A_2)[/math]. Finally, recall that a measure [math]\mu[/math] is [math]\sigma[/math]-finite if there is a decomposition of [math]\Omega[/math] into measurable sets,

[[math]] \Omega=\bigsqcup_{i=1}^\infty A_i [[/math]]

with [math]\mu(A_i) \lt \infty[/math].

Theorem (Radon-Nikodym)

Let [math]\mu[/math] and [math]\nu[/math] be two [math]\sigma[/math]-finite measures on a measurable space [math](\Omega,\B)[/math]. Then [math]\nu[/math] can be decomposed as

[[math]] \nu=\nu_{abs}+\nu_{sing} [[/math]]

into the sum of two [math]\sigma[/math]-finite measures with [math]\nu_{abs}\ll\mu[/math] being absolutely continuous with respect to [math]\mu[/math], and with [math]\nu_{sing}[/math] and [math]\mu[/math] being singular to each other (which will be written [math]\nu_{sing}\perp\mu[/math]).

The theorem implies that there exists another, more practical way of checking whether a given [math]\sigma[/math]-finite measure [math]\nu[/math] is absolutely continuous with respect to another [math]\sigma[/math]-finite measure [math]\mu[/math]. If [math]\mu(N)=0[/math] implies that [math]\nu(N)=0[/math] for every measurable [math]N\subset \Omega[/math], then [math]\nu=\nu_{abs}[/math] is absolutely continuous. We also note that the density function [math]f[/math] with [math]fd\mu=d\nu[/math] is called the [math]Radon[/math]-[math]Nikodym[/math] [math]derivative[/math] and is often written [math]f=\frac{d\nu}{d\mu}[/math].
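The null-set criterion and the Radon-Nikodym derivative can be made concrete on a finite set, where a measure is just a table of point masses. The following is a minimal sketch under that assumption; the sample points, the weights and the helper names `is_absolutely_continuous` and `radon_nikodym_derivative` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical point masses on a three-point space (illustrative values only).
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(3, 4), "c": Fraction(0)}


def is_absolutely_continuous(nu, mu):
    """Check nu << mu on a finite space.

    Every mu-null set is a union of mu-null points, so it is enough to
    check that each mu-null point is also nu-null.
    """
    return all(nu[w] == 0 for w in mu if mu[w] == 0)


def radon_nikodym_derivative(nu, mu):
    """Return a density f with nu(B) = sum over B of f * mu.

    On mu-null points the density is not determined; 0 is chosen here.
    """
    return {w: (nu[w] / mu[w] if mu[w] != 0 else Fraction(0)) for w in mu}


f = radon_nikodym_derivative(nu, mu)
```

In this toy setting the density is simply the ratio of point masses wherever [math]\mu[/math] charges the point, which matches the notation [math]f=\frac{d\nu}{d\mu}[/math].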

To prove this theorem, we need a theorem which gives us a nice relationship between a Hilbert space and its dual space. Actually we can identify a Hilbert space [math]\mathcal{H}[/math] with its dual space [math]\mathcal{H}^*[/math].

Lemma (Riesz-representation for Hilbert spaces)

For a Hilbert space [math]\mathcal{H}[/math], the map sending [math]h\in \mathcal{H}[/math] to [math]\phi(h)\in\mathcal{H}^*[/math] defined by

[[math]] \phi(h)(x)=\langle x,h\rangle [[/math]]

is a linear (resp. conjugate-linear in the complex case) isometric isomorphism between [math]\mathcal{H}[/math] and its dual space [math]\mathcal{H}^*[/math].


[Proof of Theorem] Suppose that [math]\mu[/math] and [math]\nu[/math] are both finite measures (the general case can be reduced to this case by using the assumption that [math]\mu[/math] and [math]\nu[/math] are both [math]\sigma[/math]-finite). We define a new measure [math]m=\mu+\nu[/math] and will work with the real Hilbert space [math]\mathcal{H}=L^2(\Omega,m)[/math]. On this Hilbert space we define a linear functional [math]\phi[/math] by

[[math]] \phi(g)=\int gd\nu [[/math]]

for [math]g\in\mathcal{H}[/math]. For [math]g[/math] a simple function on [math]\Omega[/math], this is clearly well-defined and satisfies

[[math]] \vert\phi(g)\vert=\left\vert\int gd\nu\right\vert\leq \int \vert g\vert d\nu\leq \int \vert g\vert dm\leq \| g\|_{\mathcal{H}}\|\one\|_{\mathcal{H}} [[/math]]

where we have used the fact that [math]m=\mu+\nu[/math], that [math]\mu[/math] is a positive measure, and the Cauchy-Schwarz inequality on [math]\mathcal{H}[/math]. Since [math]\phi[/math] is bounded on the simple functions, which are dense in [math]\mathcal{H}[/math], it extends to a bounded linear functional on all of [math]\mathcal{H}[/math]. By the Riesz-representation for Hilbert spaces there is some [math]k\in\mathcal{H}[/math] such that

[[math]] \begin{equation} \int gd\nu=\phi(g)=\int gkdm. \end{equation} [[/math]]

We claim that [math]k[/math] takes values in [math][0,1][/math] almost everywhere with respect to [math]m[/math]. Indeed, for any [math]B\in\B[/math] we have

[[math]] 0\leq \nu(B)\leq m(B), [[/math]]

so (using [math]g=\one_B[/math]),

[[math]] 0\leq \int_B kdm\leq m(B). [[/math]]

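For the choice

[[math]] B=\{\omega\in\Omega\mid k(\omega) \lt 0\} [[/math]]

the lower bound forces [math]m(B)=0[/math], since a strictly negative integrand cannot have a nonnegative integral over a set of positive measure. For the choice

[[math]] B=\{\omega\in\Omega\mid k(\omega) \gt 1\} [[/math]]

the upper bound gives [math]\int_B(k-1)dm\leq 0[/math] with a strictly positive integrand, so again [math]m(B)=0[/math]. Hence [math]k[/math] takes values in [math][0,1][/math] [math]m[/math]-almost everywhere. Since [math]m=\mu+\nu[/math], we can reformulate (7) as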

[[math]] \begin{equation} \int g(1-k)d\nu=\int gkd\mu. \end{equation} [[/math]]

This holds by construction for all simple functions [math]g[/math], and hence for all nonnegative measurable functions by monotone convergence. Now define [math]\nu_{sing}[/math] to be [math]\nu\mid_{A}[/math], where

[[math]] A=\{\omega\in\Omega\mid k(\omega)=1\}. [[/math]]

By definition, [math]\nu_{sing}(\Omega\setminus A)=0[/math] and by (8) applied with [math]g=\one_{A}[/math] we also have [math]\mu(A)=0[/math]. Therefore

[[math]] \nu_{sing}\perp\mu. [[/math]]

We also define

[[math]] \nu_{abs}=\nu\mid_{\Omega\setminus A}=\nu\mid_{\{\omega\in\Omega\mid k(\omega) \lt 1\}} [[/math]]

so that [math]\nu=\nu_{sing}+\nu_{abs}[/math]. Define the function [math]f=\frac{k}{1-k}\geq 0[/math] on [math]\Omega\setminus A[/math] and let [math]g\geq 0[/math] be measurable. Then by (8) we have

[[math]] \int_{\Omega\setminus A}gf d\mu=\int_{\Omega\setminus A}\frac{g}{1-k}kd\mu=\int_{\Omega\setminus A}\frac{g}{1-k}(1-k)d\nu=\int_{\Omega\setminus A}gd\nu_{abs}, [[/math]]

which shows that [math]d\nu_{abs}=fd\mu[/math] and so [math]\nu_{abs}\ll \mu[/math].

■
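The construction in the proof can be traced step by step on a finite set. The sketch below assumes measures given as tables of point masses and follows the proof: form [math]m=\mu+\nu[/math], take [math]k[/math] as the density of [math]\nu[/math] with respect to [math]m[/math], split along [math]A=\{k=1\}[/math], and set [math]f=\frac{k}{1-k}[/math] off [math]A[/math]. The weights and the function name `lebesgue_decomposition` are hypothetical.

```python
from fractions import Fraction

# Hypothetical point masses on a four-point space (illustrative values only).
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(0), "d": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(0), "c": Fraction(5), "d": Fraction(0)}


def lebesgue_decomposition(mu, nu):
    """Follow the proof's construction on a finite space."""
    # m = mu + nu dominates both measures.
    m = {w: mu[w] + nu[w] for w in mu}
    # k is the density of nu with respect to m; on m-null points it is
    # not determined, and 0 is chosen here.
    k = {w: (nu[w] / m[w] if m[w] != 0 else Fraction(0)) for w in mu}
    # A = {k = 1} carries the singular part.
    A = {w for w in mu if k[w] == 1}
    nu_sing = {w: (nu[w] if w in A else Fraction(0)) for w in mu}
    nu_abs = {w: (nu[w] if w not in A else Fraction(0)) for w in mu}
    # f = k / (1 - k) off A; the value on A is irrelevant since mu(A) = 0.
    f = {w: (k[w] / (1 - k[w]) if w not in A else Fraction(0)) for w in mu}
    return k, A, nu_abs, nu_sing, f


k, A, nu_abs, nu_sing, f = lebesgue_decomposition(mu, nu)
```

With these weights the only point where [math]k=1[/math] is the one charged by [math]\nu[/math] but not by [math]\mu[/math], so the singular part sits exactly there.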

Theorem

Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math]\mathcal{G}\subset \F[/math] be a sub-[math]\sigma[/math]-algebra of [math]\F[/math] and let [math]X\in L^1(\Omega,\F,\p)[/math] be a r.v. Then there exists a unique r.v. in [math]L^1(\Omega,\mathcal{G},\p)[/math], denoted by [math]\E[X\mid\mathcal{G}][/math], such that for all [math]B\in\mathcal{G}[/math]

[[math]] \E[X\one_B]=\E[\E[X\mid\mathcal{G}]\one_B]. [[/math]]

More generally, for every bounded and [math]\mathcal{G}[/math]-measurable r.v. [math]Z[/math] we get

[[math]] \E[XZ]=\E[\E[X\mid\mathcal{G}]Z] [[/math]]

and if [math]X\geq 0[/math], then [math]\E[X\mid\mathcal{G}]\geq 0[/math].

Show Proof

The uniqueness part was already done. To show existence, assume first that [math]X[/math] is positive. Define a new measure [math]\Q[/math] on [math](\Omega,\mathcal{G})[/math] by

[[math]] \Q[A]=\E[X\one_A]=\int_A X(\omega)d\p(\omega) [[/math]]

for all [math]A\in\mathcal{G}[/math]. Now consider the measure [math]\p[/math] restricted to [math]\mathcal{G}[/math]. Then we get that

[[math]] \Q\ll\p [[/math]]

on [math]\mathcal{G}[/math]. The Radon-Nikodym theorem implies that there exists a positive and [math]\mathcal{G}[/math]-measurable r.v. [math]\tilde X[/math] such that

[[math]] \Q[A]=\E[\tilde X\one_A] [[/math]]

for all [math]A\in\mathcal{G}[/math]. For [math]A\in\mathcal{G}[/math] we get that

[[math]] \E[X\one_A]=\E[\tilde X\one_A]. [[/math]]

Now taking [math]A=\Omega[/math], we get that [math]\E[X]=\E[\tilde X][/math]. Since [math]\E[X] \lt \infty[/math], we have that [math]\tilde X\in L^1(\Omega,\mathcal{G},\p)[/math] and hence we see that [math]\tilde X=\E[X\mid\mathcal{G}][/math]. For the general case, we write [math]X=X^+-X^-[/math] and apply the argument above to [math]X^+[/math] and [math]X^-[/math] separately.

■
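On a finite probability space where [math]\mathcal{G}[/math] is generated by a partition, the Radon-Nikodym derivative of [math]\Q[/math] with respect to [math]\p[/math] on [math]\mathcal{G}[/math] is constant on each block and equals [math]\E[X\one_B]/\p[B][/math] there. The sketch below assumes this setting; the probabilities, the values of [math]X[/math], the partition and the helper names are hypothetical. Since [math]X[/math] takes a negative value here, the formula is used in its linear form rather than via [math]X^+[/math] and [math]X^-[/math] separately.

```python
from fractions import Fraction

# Hypothetical finite probability space and random variable (illustrative values).
P = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 3), 4: Fraction(1, 6)}
X = {1: Fraction(2), 2: Fraction(6), 3: Fraction(-3), 4: Fraction(9)}
# G is assumed to be generated by this partition of the sample space.
partition = [{1, 2}, {3, 4}]


def expectation(Y, P, B=None):
    """Return E[Y 1_B]; B defaults to the whole sample space."""
    B = P.keys() if B is None else B
    return sum(Y[w] * P[w] for w in B)


def conditional_expectation(X, P, partition):
    """Return E[X | G] as a function constant on each block of the partition."""
    cond = {}
    for block in partition:
        p_block = sum(P[w] for w in block)
        # On a P-null block the value is not determined; 0 is chosen here.
        value = expectation(X, P, block) / p_block if p_block != 0 else Fraction(0)
        for w in block:
            cond[w] = value
    return cond


cond = conditional_expectation(X, P, partition)
```

The defining property [math]\E[X\one_B]=\E[\E[X\mid\mathcal{G}]\one_B][/math] can then be checked directly for each block and for their unions.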

General references

Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].

@@ Line 1: / Line 1: @@
+<div class="d-none"><math>
+\newcommand{\R}{\mathbb{R}}
+\newcommand{\A}{\mathcal{A}}
+\newcommand{\B}{\mathcal{B}}
+\newcommand{\N}{\mathbb{N}}
+\newcommand{\C}{\mathbb{C}}
+\newcommand{\Rbar}{\overline{\mathbb{R}}}
+\newcommand{\Bbar}{\overline{\mathcal{B}}}
+\newcommand{\Q}{\mathbb{Q}}
+\newcommand{\E}{\mathbb{E}}
+\newcommand{\p}{\mathbb{P}}
+\newcommand{\one}{\mathds{1}}
+\newcommand{\0}{\mathcal{O}}
+\newcommand{\mat}{\textnormal{Mat}}
+\newcommand{\sign}{\textnormal{sign}}
+\newcommand{\CP}{\mathcal{P}}
+\newcommand{\CT}{\mathcal{T}}
+\newcommand{\CY}{\mathcal{Y}}
+\newcommand{\F}{\mathcal{F}}
+\newcommand{\mathds}{\mathbb}</math></div>
+{{alert-info |
+Before stating the Radon-Nikodym theorem, we recall some definitions from measure theory. Let <math>(\Omega,\B)</math> be a measurable space. A measure <math>\nu</math> is <math>absolutely</math> <math>continuous</math> with respect to another measure <math>\mu</math>, written <math>\nu\ll\mu</math>, if <math>d\nu=fd\mu</math> for some finite measurable <math>f\geq 0</math>, that is, if there is a finite measurable <math>f\geq 0</math> with
+<math display="block">
+\nu(B)=\int_B fd\mu
+</math>
+for all <math>B\in\B</math>. Two measures <math>\mu</math> and <math>\nu</math> are <math>singular</math> with respect to each other if there exist disjoint measurable sets <math>A_1,A_2\subset \Omega</math> with <math>\Omega=A_1\sqcup A_2</math> and with <math>\nu(A_1)=0=\mu(A_2)</math>. Finally, recall that a measure <math>\mu</math> is <math>\sigma</math>-finite if there is a decomposition of <math>\Omega</math> into measurable sets,
+<math display="block">
+\Omega=\bigsqcup_{i=1}^\infty A_i
+</math>
+with <math>\mu(A_i) < \infty</math>.
+}}
+{{proofcard|Theorem (Radon-Nikodym)|RN|Let <math>\mu</math> and <math>\nu</math> be two <math>\sigma</math>-finite measures on a measurable space <math>(\Omega,\B)</math>. Then <math>\nu</math> can be decomposed as
+<math display="block">
+\nu=\nu_{abs}+\nu_{sing}
+</math>
+into the sum of two <math>\sigma</math>-finite measures with <math>\nu_{abs}\ll\mu</math> being absolutely continuous with respect to <math>\mu</math>, and with <math>\nu_{sing}</math> and <math>\mu</math> being singular to each other (which will be written <math>\nu_{sing}\perp\mu</math>).|}}
+{{alert-info |
+The theorem implies that there exists another, more practical way of checking whether a given <math>\sigma</math>-finite measure <math>\nu</math> is absolutely continuous with respect to another <math>\sigma</math>-finite measure <math>\mu</math>. If <math>\mu(N)=0</math> implies that <math>\nu(N)=0</math> for every measurable <math>N\subset \Omega</math>, then <math>\nu=\nu_{abs}</math> is absolutely continuous. We also note that the density function <math>f</math> with <math>fd\mu=d\nu</math> is called the <math>Radon</math>-<math>Nikodym</math> <math>derivative</math> and is often written <math>f=\frac{d\nu}{d\mu}</math>.
+}}
+{{alert-info |
+To prove this theorem, we need a theorem which gives us a nice relationship between a Hilbert space and its dual space. Actually we can identify a Hilbert space <math>\mathcal{H}</math> with its dual space <math>\mathcal{H}^*</math>.
+}}
+{{proofcard|Lemma (Riesz-representation for Hilbert spaces)|lem-1|For a Hilbert space <math>\mathcal{H}</math>, the map sending <math>h\in \mathcal{H}</math> to <math>\phi(h)\in\mathcal{H}^*</math> defined by
+<math display="block">
+\phi(h)(x)=\langle x,h\rangle
+</math>
+is a linear (resp. conjugate-linear in the complex case) isometric isomorphism between <math>\mathcal{H}</math> and its dual space <math>\mathcal{H}^*</math>.
+|[Proof of [[#RN |Theorem]]]
+Suppose that <math>\mu</math> and <math>\nu</math> are both finite measures (the general case can be reduced to this case by using the assumption that <math>\mu</math> and <math>\nu</math> are both <math>\sigma</math>-finite). We define a new measure <math>m=\mu+\nu</math> and will work with the real Hilbert space <math>\mathcal{H}=L^2(\Omega,m)</math>. On this Hilbert space we define a linear functional <math>\phi</math> by
+<math display="block">
+\phi(g)=\int gd\nu
+</math>
+for <math>g\in\mathcal{H}</math>. For <math>g</math> a simple function on <math>\Omega</math>, this is clearly well-defined and satisfies
+<math display="block">
+\vert\phi(g)\vert=\left\vert\int gd\nu\right\vert\leq \int \vert g\vert d\nu\leq \int \vert g\vert dm\leq \| g\|_{\mathcal{H}}\|\one\|_{\mathcal{H}}
+</math>
+where we have used the fact that <math>m=\mu+\nu</math>, that <math>\mu</math> is a positive measure, and the Cauchy-Schwarz inequality on <math>\mathcal{H}</math>. Since <math>\phi</math> is bounded on the simple functions, which are dense in <math>\mathcal{H}</math>, it extends to a bounded linear functional on all of <math>\mathcal{H}</math>. By the Riesz-representation for Hilbert spaces there is some <math>k\in\mathcal{H}</math> such that
+<math display="block">
+\begin{equation}
+\int gd\nu=\phi(g)=\int gkdm.
+\end{equation}
+</math>
+We claim that <math>k</math> takes values in <math>[0,1]</math> almost everywhere with respect to <math>m</math>. Indeed, for any <math>B\in\B</math> we have
+<math display="block">
+0\leq \nu(B)\leq m(B),
+</math>
+so (using <math>g=\one_B</math>),
+<math display="block">
+0\leq \int_B kdm\leq m(B).
+</math>
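+For the choice
+<math display="block">
+B=\{\omega\in\Omega\mid k(\omega) < 0\}
+</math>
+the lower bound forces <math>m(B)=0</math>, since a strictly negative integrand cannot have a nonnegative integral over a set of positive measure. For the choice
+<math display="block">
+B=\{\omega\in\Omega\mid k(\omega) > 1\}
+</math>
+the upper bound gives <math>\int_B(k-1)dm\leq 0</math> with a strictly positive integrand, so again <math>m(B)=0</math>. Hence <math>k</math> takes values in <math>[0,1]</math> <math>m</math>-almost everywhere. Since <math>m=\mu+\nu</math>, we can reformulate (7) as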
+<math display="block">
+\begin{equation}
+\int g(1-k)d\nu=\int gkd\mu.
+\end{equation}
+</math>
+This holds by construction for all simple functions <math>g</math>, and hence for all nonnegative measurable functions by monotone convergence. Now define <math>\nu_{sing}</math> to be <math>\nu\mid_{A}</math>, where
+<math display="block">
+A=\{\omega\in\Omega\mid k(\omega)=1\}.
+</math>
+By definition, <math>\nu_{sing}(\Omega\setminus A)=0</math> and by (8) applied with <math>g=\one_{A}</math> we also have <math>\mu(A)=0</math>. Therefore
+<math display="block">
+\nu_{sing}\perp\mu.
+</math>
+We also define
+<math display="block">
+\nu_{abs}=\nu\mid_{\Omega\setminus A}=\nu\mid_{\{\omega\in\Omega\mid k(\omega) < 1\}}
+</math>
+so that <math>\nu=\nu_{sing}+\nu_{abs}</math>. Define the function <math>f=\frac{k}{1-k}\geq 0</math> on <math>\Omega\setminus A</math> and let <math>g\geq 0</math> be measurable. Then by (8) we have
+<math display="block">
+\int_{\Omega\setminus A}gf d\mu=\int_{\Omega\setminus A}\frac{g}{1-k}kd\mu=\int_{\Omega\setminus A}\frac{g}{1-k}(1-k)d\nu=\int_{\Omega\setminus A}gd\nu_{abs},
+</math>
+which shows that <math>d\nu_{abs}=fd\mu</math> and so <math>\nu_{abs}\ll \mu</math>.}}
+{{proofcard|Theorem|thm-1|Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>\mathcal{G}\subset \F</math> be a sub-<math>\sigma</math>-algebra of <math>\F</math> and let <math>X\in L^1(\Omega,\F,\p)</math> be a r.v. Then there exists a unique r.v. in <math>L^1(\Omega,\mathcal{G},\p)</math>, denoted by <math>\E[X\mid\mathcal{G}]</math>, such that for all <math>B\in\mathcal{G}</math>
+<math display="block">
+\E[X\one_B]=\E[\E[X\mid\mathcal{G}]\one_B].
+</math>
+More generally, for every bounded and <math>\mathcal{G}</math>-measurable r.v. <math>Z</math> we get
+<math display="block">
+\E[XZ]=\E[\E[X\mid\mathcal{G}]Z]
+</math>
+and if <math>X\geq 0</math>, then <math>\E[X\mid\mathcal{G}]\geq 0</math>.
+|The uniqueness part was already done. To show existence, assume first that <math>X</math> is positive. Define a new measure <math>\Q</math> on <math>(\Omega,\mathcal{G})</math> by
+<math display="block">
+\Q[A]=\E[X\one_A]=\int_A X(\omega)d\p(\omega)
+</math>
+for all <math>A\in\mathcal{G}</math>. Now consider the measure <math>\p</math> restricted to <math>\mathcal{G}</math>. Then we get that
+<math display="block">
+\Q\ll\p
+</math>
+on <math>\mathcal{G}</math>. The Radon-Nikodym theorem implies that there exists a positive and <math>\mathcal{G}</math>-measurable r.v. <math>\tilde X</math> such that
+<math display="block">
+\Q[A]=\E[\tilde X\one_A]
+</math>
+for all <math>A\in\mathcal{G}</math>. For <math>A\in\mathcal{G}</math> we get that
+<math display="block">
+\E[X\one_A]=\E[\tilde X\one_A].
+</math>
+Now taking <math>A=\Omega</math>, we get that <math>\E[X]=\E[\tilde X]</math>. Since <math>\E[X] < \infty</math>, we have that <math>\tilde X\in L^1(\Omega,\mathcal{G},\p)</math> and hence we see that <math>\tilde X=\E[X\mid\mathcal{G}]</math>. For the general case, we write <math>X=X^+-X^-</math> and apply the argument above to <math>X^+</math> and <math>X^-</math> separately.}}
+==General references==
+{{cite arXiv|last=Moshayedi|first=Nima|year=2020|title=Lectures on Probability Theory|eprint=2010.16280|class=math.PR}}