<div class="d-none"><math>
\newcommand{\R}{\mathbb{R}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rbar}{\overline{\mathbb{R}}}
\newcommand{\Bbar}{\overline{\mathcal{B}}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\p}{\mathbb{P}}
\newcommand{\one}{\mathds{1}}
\newcommand{\0}{\mathcal{O}}
\newcommand{\mat}{\textnormal{Mat}}
\newcommand{\sign}{\textnormal{sign}}
\newcommand{\CP}{\mathcal{P}}
\newcommand{\CT}{\mathcal{T}}
\newcommand{\CY}{\mathcal{Y}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\mathds}{\mathbb}</math></div>
We start with the existence of a Markov chain associated with a given transition matrix.
{{proofcard|Proposition|prop2|Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>Q</math> be a stochastic matrix on <math>E</math>. There exists a probability space <math>(\Omega',\F',\p')</math> on which there exists, for all <math>x\in E</math>, a stochastic process <math>(X_n^x)_{n\geq 0}</math> which is a Markov chain with transition matrix <math>Q</math>, starting from <math>X_0^x=x</math>.
|We can take <math>\Omega'=[0,1)</math>, with the Borel <math>\sigma</math>-Algebra and the Lebesgue measure. For <math>\omega\in[0,1)</math>, we write the binary expansion
<math display="block">
\omega=\sum_{n=0}^\infty \mathcal{E}_n(\omega)2^{-n-1},
</math>
with <math>\mathcal{E}_n(\omega)\in\{0,1\}</math>. Under the Lebesgue measure, <math>(\mathcal{E}_n)_{n\geq 0}</math> is a sequence of iid r.v.'s with <math>\p[\mathcal{E}_n=1]=\p[\mathcal{E}_n=0]=\frac{1}{2}</math>. If <math>\varphi:\N\times\N\to \N</math> is a bijection, then the r.v.'s <math>\eta_{i,j}=\mathcal{E}_{\varphi(i,j)}</math> for <math>i,j\in\N</math> are also iid. Now define
<math display="block">
U_i=\sum_{j=0}^\infty\eta_{i,j}2^{-j-1}.
</math>
Then <math>U_0,U_1,U_2,...</math> are iid, uniformly distributed on <math>[0,1)</math>. Denote <math>E=\{y_1,y_2,...,y_k,...\}</math> and fix <math>x\in E</math>. We set <math>X_0^x=x</math> and <math>X_1^x=y_k</math> if
<math display="block">
\sum_{1\leq j < k}Q(x,y_j) < U_1\leq \sum_{1\leq j\leq k}Q(x,y_j),
</math>
so that <math>\p[X_1^x=y]=Q(x,y)</math> for all <math>y\in E</math>. We then proceed by induction, setting <math>X_{n+1}^x=y_k</math> if
<math display="block">
\sum_{1\leq j < k} Q(X_n^x,y_j) < U_{n+1}\leq \sum_{1\leq j\leq k}Q(X_n^x,y_j).
</math>
Using the independence of the <math>U_i</math>'s, we check that for all <math>k\geq 1</math>, we have
<math display="block">
\begin{multline*}
\p[X_{n+1}^x=y_k\mid X_0^x=x,X_1^x=x_1,...,X_n^x=x_n]\\=\p\left[\sum_{1\leq j < k}Q(x_n,y_j) < U_{n+1}\leq \sum_{1\leq j\leq k}Q(x_n,y_j)\mid X_0^x=x,...,X_n^x=x_n\right]=Q(x_n,y_k).
\end{multline*}
</math>}}
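The partial-sum condition in the proof above is exactly inverse-CDF sampling: given a uniform variable <math>U_{n+1}</math>, the next state is the first <math>y_k</math> whose cumulative row sum reaches it. A minimal Python sketch of this construction (the function names and the example matrix below are ours, not from the text):

```python
import random

def sample_step(q_row, u):
    """Return the first index k with sum_{j<=k} q_row[j] >= u,
    mirroring the partial-sum condition in the proof."""
    acc = 0.0
    for k, q in enumerate(q_row):
        acc += q
        if u <= acc:
            return k
    return len(q_row) - 1  # guard against floating-point round-off

def markov_chain(Q, x, n_steps, rng=None):
    """Simulate X_0, ..., X_{n_steps} with transition matrix Q,
    started from state x, using iid uniforms U_1, U_2, ..."""
    rng = rng or random.Random(0)
    path = [x]
    for _ in range(n_steps):
        path.append(sample_step(Q[path[-1]], rng.random()))
    return path
```

Here states are labeled <math>0,1,...,k-1</math> and <math>Q</math> is given as a list of rows, each summing to one.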
In the sequel, we shall take <math>\Omega=E^\N</math>. An element <math>\omega\in \Omega</math> is a sequence <math>\omega=(\omega_0,\omega_1,...)</math> of elements in <math>E</math>. Define the coordinate maps
<math display="block">
X_n(\omega)=\omega_n.
</math>
We take <math>\F</math> to be the smallest <math>\sigma</math>-Algebra on <math>\Omega</math> which makes all <math>X_n</math>'s measurable. This <math>\sigma</math>-Algebra is generated by the cylinder sets
<math display="block">
C=\{\omega\in\Omega\mid \omega_0=x_0,\omega_1=x_1,...,\omega_n=x_n\}
</math>
for <math>n\in\N</math> and <math>x_0,...,x_n\in E</math>.
{{proofcard|Lemma|lem2|Let <math>(G,\mathcal{G})</math> be a measurable space and let <math>\psi:G\to\Omega</math> be a map. Then <math>\psi</math> is measurable if and only if for all <math>n\geq 0</math>, <math>X_n\circ\psi</math> is measurable.
|We only need to show that if <math>X_n\circ \psi</math> is measurable for all <math>n\geq 0</math>, then <math>\psi</math> is measurable as well. Consider the <math>\sigma</math>-Algebra <math>\A</math> on <math>\Omega</math> given by
<math display="block">
\A:=\{A\in\F\mid \psi^{-1}(A)\in\mathcal{G}\},
</math>
which contains all the sets of the form <math>X_n^{-1}(\{y\})</math> for <math>y\in E</math>, since each <math>X_n\circ\psi</math> is measurable. Hence all the <math>X_n</math>'s are measurable with respect to <math>\A</math>, so <math>\A=\F</math> and the claim follows.}}
{{proofcard|Theorem|thm-1|Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>Q</math> be a stochastic matrix on <math>E</math>. For all <math>x\in E</math>, there exists a unique probability measure <math>\p_x</math> on <math>\Omega=E^\N</math> such that the sequence of coordinate functions <math>(X_n)_{n\geq 0}</math> is a Markov chain with transition matrix <math>Q</math> and <math>\p_x[X_0=x]=1</math>.
|From the [[#prop2 |proposition]], there exists <math>(\Omega',\F',\p')</math> and <math>(X_n^x)_{n\geq 0}</math> which is a Markov chain with transition matrix <math>Q</math> and such that <math>X_0^x=x</math>. We then define <math>\p_x</math> as the image of <math>\p'</math> under the map
<math display="block">
\begin{align*}
\Omega'&\longrightarrow\Omega\\
\omega'&\longmapsto (X_n^x(\omega'))_{n\geq 0}
\end{align*}
</math>
This map is measurable because of the [[#lem2 |lemma]]. We have <math>\p_x[X_0=x]=\p'[X_0^x=x]=1</math> and for <math>x_0,x_1,...,x_n\in E</math> we have
<math display="block">
\begin{multline*}
\p_x[X_0=x_0,X_1=x_1,...,X_n=x_n]=\p'[X_0^x=x_0,X_1^x=x_1,...,X_n^x=x_n]\\=\p'[X_0^x=x_0]\prod_{j=1}^nQ(x_{j-1},x_j)=\p_x[X_0=x_0]\prod_{j=1}^nQ(x_{j-1},x_j).
\end{multline*}
</math>
Therefore the sequence of coordinate functions is a Markov chain. For uniqueness, we note that if <math>\p'_x</math> is another such probability measure, then <math>\p_x</math> and <math>\p_x'</math> coincide on cylinders. It then follows from an application of the monotone class theorem that <math>\p_x=\p_x'</math>.}}
{{alert-info |
It follows that for all <math>n\geq 0</math> and all <math>x,y\in E</math>, we have
<math display="block">
\p_x[X_n=y]=Q_n(x,y).
</math>
}}
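The identity <math>\p_x[X_n=y]=Q_n(x,y)</math> says that the distribution of <math>X_n</math> under <math>\p_x</math> is row <math>x</math> of the <math>n</math>-th matrix power of <math>Q</math>. A small pure-Python sketch (the example matrix in the note below is ours):

```python
def n_step_distribution(Q, x, n):
    """Row x of Q^n, i.e. the distribution of X_n under P_x,
    computed by repeated vector-matrix multiplication."""
    k = len(Q)
    dist = [0.0] * k
    dist[x] = 1.0  # delta mass at the starting state
    for _ in range(n):
        dist = [sum(dist[i] * Q[i][j] for i in range(k)) for j in range(k)]
    return dist
```

For instance, with <math>Q</math> given by rows <math>(0.5,0.5)</math> and <math>(0.3,0.7)</math>, row 0 of <math>Q^2</math> is <math>(0.4,\,0.6)</math>, which the function reproduces.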
{{alert-info |
If <math>\mu</math> is a probability measure on <math>E</math>, we write
<math display="block">
\p_\mu:=\sum_{x\in E}\mu(x)\p_x,
</math>
which defines a new probability measure on <math>\Omega</math>. By writing an explicit formula for <math>\p_\mu[X_0=x_0,...,X_n=x_n]</math>, one immediately checks that under <math>\p_\mu</math>, <math>(X_n)_{n\geq 0}</math> is a Markov chain with transition matrix <math>Q</math>, where <math>X_0</math> has the distribution <math>\mu</math>.
}}
{{alert-info |
Let <math>(X'_n)_{n\geq 0}</math> be a Markov chain with transition matrix <math>Q</math> and initial distribution <math>\mu</math>. Then for all measurable subsets <math>B\subset\Omega=E^\N</math>, we have
<math display="block">
\p[(X'_n)_{n\geq 0}\in B]=\p_\mu[B].
</math>
It is enough to check this when <math>B</math> is a cylinder, as in the proof above. This shows that all results proven for the canonical Markov chain hold for a general Markov chain with the same transition matrix. One of the advantages of using the canonical Markov chain is that one can use translation operators. For all <math>k\in\N</math>, we define
<math display="block">
\Theta_k:\Omega\to\Omega,\quad(\omega_n)_{n\geq 0}\mapsto \Theta_k((\omega_n)_{n\geq 0})=(\omega_{k+n})_{n\geq 0}.
</math>
The [[#lem2 |lemma]] shows that these maps are measurable. We write <math>\F_n=\sigma(X_0,...,X_n)</math> and we use the notation <math>\E_x</math> when we integrate with respect to <math>\p_x</math>.
}}
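On a trajectory stored as a finite tuple, the translation operator <math>\Theta_k</math> just drops the first <math>k</math> coordinates, and the identity <math>X_n\circ\Theta_k=X_{n+k}</math> becomes plain indexing. A toy sketch (the sample trajectory below is ours):

```python
def theta(k, omega):
    """Translation operator: (omega_n)_{n>=0} -> (omega_{k+n})_{n>=0}.
    On a finite window of a trajectory this drops the first k terms."""
    return omega[k:]

def X(n, omega):
    """Coordinate map X_n(omega) = omega_n."""
    return omega[n]
```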
{{proofcard|Theorem (Simple Markov property)|thm5|Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>F</math> and <math>G</math> be two measurable, positive maps on <math>\Omega</math> and let <math>n\geq 0</math>. Assume that <math>F</math> is <math>\F_n</math>-measurable. Then for all <math>x\in E</math>,
<math display="block">
\E_x[FG\circ \Theta_n]=\E_x[F\E_{X_n}[G]].
</math>
Equivalently, we have
<math display="block">
\E_x[G\circ \Theta_n\mid \F_n]=\E_{X_n}[G].
</math>
We say that the conditional distribution of <math>\Theta_n(\omega)</math> given <math>(X_0,X_1,...,X_n)</math> is <math>\p_{X_n}</math>.
{{alert-info |
[[#thm5 |Theorem]] is also true if one replaces <math>\E_x</math> with <math>\E_\mu</math>.
}}
|[Proof of [[#thm5 |Theorem]]]
It is enough to prove the first statement. For this we can restrict ourselves to the case where <math>F=\one_{\{ X_0=x_0,X_1=x_1,...,X_n=x_n\}}</math> for <math>x_0,x_1,...,x_n\in E</math>. Let us first consider the case where <math>G=\one_{\{X_0=y_0,...,X_p=y_p\}}</math>. In this case, if <math>y\in E</math>, then
<math display="block">
\E_y[G]=\one_{\{ y_0=y\}}\prod_{j=1}^pQ(y_{j-1},y_j)
</math>
and
<math display="block">
\begin{multline*}
\E_x[FG\circ \Theta_n]=\p_x[X_0=x_0,...,X_{n}=x_{n},X_n=y_0,X_{n+1}=y_{1},...,X_{n+p}=y_p]\\
=\one_{\{x_0=x\}}\prod_{j=1}^nQ(x_{j-1},x_j)\one_{\{y_0=x_n\}}\prod_{j=1}^pQ(y_{j-1},y_j)=\E_x[F\E_{X_n}[G]].
\end{multline*}
</math>
Now it follows from the monotone class theorem that the above holds for <math>G=\one_A</math>, for <math>A\in\F</math>, and thus we can conclude.}}
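For a finite state space, the simple Markov property <math>\E_x[FG\circ\Theta_n]=\E_x[F\E_{X_n}[G]]</math> can be verified exactly by enumerating trajectories and weighting each by <math>\one_{\{\omega_0=x\}}\prod_j Q(\omega_{j-1},\omega_j)</math>. A sketch under our own choice of matrix and functionals (none of these names appear in the text):

```python
from itertools import product

def path_prob(Q, x, w):
    """P_x-probability of the cylinder {X_0=w_0, ..., X_m=w_m}."""
    if w[0] != x:
        return 0.0
    p = 1.0
    for a, b in zip(w, w[1:]):
        p *= Q[a][b]
    return p

def both_sides(Q, x, n, p, F, G):
    """Return (E_x[F * G∘Theta_n], E_x[F * E_{X_n}[G]]) exactly,
    where F reads X_0..X_n and G reads X_0..X_p."""
    states = range(len(Q))
    lhs = sum(path_prob(Q, x, w) * F(w[:n + 1]) * G(w[n:])
              for w in product(states, repeat=n + p + 1))
    def EG(y):  # E_y[G], enumerating trajectories of length p from y
        return sum(path_prob(Q, y, v) * G(v)
                   for v in product(states, repeat=p + 1))
    rhs = sum(path_prob(Q, x, w) * F(w) * EG(w[-1])
              for w in product(states, repeat=n + 1))
    return lhs, rhs
```

Running this with any <math>\F_n</math>-measurable indicator <math>F</math> and any indicator <math>G</math> returns two equal numbers, up to floating-point error.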
{{alert-info |
We would like to make sense of [[#thm5 |the theorem above]] with <math>n</math> replaced by a stopping time <math>T</math>. Indeed, consider the problem of knowing whether a Markov chain starting from <math>x</math> returns to <math>x</math>. In other words, let us write
<math display="block">
N_x=\sum_{n=0}^\infty \one_{\{X_n=x\}}
</math>
and ask whether we have <math>\p_x[N_x=\infty]=1</math>. It is in fact enough to check that the chain comes back to <math>x</math> at least once. If <math>H_x=\inf\{n\geq 1\mid X_n=x\}</math>, with the convention <math>\inf\varnothing=\infty</math>, we claim that <math>\p_x[N_x=\infty]=1</math> if and only if <math>\p_x[H_x < \infty]=1</math>; one implication is trivial. Conversely, if <math>\p_x[H_x < \infty]=1</math>, and if [[#thm5 |the theorem]] is true for stopping times, then <math>\Theta_{H_x}(\omega)=\left(\omega_{H_x(\omega)+n}\right)_{n\geq 0}</math> has the law <math>\p_x</math>. But then, since <math>N_x(\omega)=1+N_x(\Theta_{H_x}(\omega))</math>, we see that <math>N_x</math> has the same distribution as <math>1+N_x</math> under <math>\p_x</math>. This is only possible if <math>N_x=\infty</math> a.s.
}}
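The key identity <math>N_x(\omega)=1+N_x(\Theta_{H_x}(\omega))</math> can be checked deterministically on any finite window of a trajectory starting at <math>x</math>: the visits after time <math>H_x</math> are exactly the visits of the shifted trajectory. A sketch (the sample path below is ours):

```python
def first_return(path, x):
    """H_x on a finite window: first n >= 1 with path[n] == x,
    or None if no return is observed in the window."""
    for n in range(1, len(path)):
        if path[n] == x:
            return n
    return None

def visits(path, x):
    """N_x restricted to the observed window: #{n : path[n] == x}."""
    return sum(1 for s in path if s == x)
```

On the window <math>(0,1,0,0,2,0)</math> with <math>x=0</math> we get <math>H_0=2</math>, four visits in total, and three visits on the shifted window, matching the identity.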
{{proofcard|Theorem (Strong Markov property)|thm-2|Let <math>(\Omega,\F,(\F_n)_{n\geq 0},\p)</math> be a filtered probability space. Let <math>T</math> be a stopping time for the filtration <math>(\F_n)_{n\geq 0}</math>. Let <math>F</math> and <math>G</math> be two measurable and positive functions on <math>\Omega</math>. Assume further that <math>F</math> is <math>\F_T</math>-measurable. Then for all <math>x\in E</math>, we get
<math display="block">
\E_x[\one_{\{ T < \infty\}}FG\circ \Theta_T]=\E_x[\one_{\{ T < \infty\}}F\E_{X_T}[G]]
</math>
or equivalently
<math display="block">
\E_x[\one_{\{T < \infty\}}G\circ \Theta_T\mid \F_T]=\one_{\{ T < \infty\}}\E_{X_T}[G].
</math>
|For <math>n\geq 0</math>, we have
<math display="block">
\E_x[\one_{\{ T=n\}}FG\circ\Theta_T]=\E_x[\one_{\{ T=n\}}FG\circ \Theta_n]=\E_x[\one_{\{ T=n\}}F\E_{X_n}[G]].
</math>
The last equality follows from the simple Markov property, after noting that <math>\one_{\{T=n\}}F</math> is <math>\F_n</math>-measurable. We then sum over <math>n</math>.}}
{{proofcard|Corollary|cor-1|Let <math>(\Omega,\F,(\F_n)_{n\geq 0},\p)</math> be a filtered probability space. Let <math>T</math> be a stopping time such that <math>\p_x[T < \infty]=1</math>. Let us assume that there exists <math>y\in E</math> such that <math>\p_x[X_T=y]=1</math>. Then under <math>\p_x</math>, <math>\Theta_T(\omega)</math> is independent of <math>\F_T</math> and has the law <math>\p_y</math>.
|We only have to note that
<math display="block">
\E_x[FG\circ \Theta_T(\omega)]=\E_x[F\E_{X_T}[G]]=\E_x[F\E_y[G]]=\E_x[F]\E_y[G].
</math>}}
==General references==
{{cite arXiv|last=Moshayedi|first=Nima|year=2020|title=Lectures on Probability Theory|eprint=2010.16280|class=math.PR}}