<div class="d-none"><math>
\newcommand{\R}{\mathbb{R}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rbar}{\overline{\mathbb{R}}}
\newcommand{\Bbar}{\overline{\mathcal{B}}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\p}{\mathbb{P}}
\newcommand{\one}{\mathds{1}}
\newcommand{\0}{\mathcal{O}}
\newcommand{\mat}{\textnormal{Mat}}
\newcommand{\sign}{\textnormal{sign}}
\newcommand{\CP}{\mathcal{P}}
\newcommand{\CT}{\mathcal{T}}
\newcommand{\CY}{\mathcal{Y}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\mathds}{\mathbb}</math></div>
In this chapter <math>E</math> will be a finite or countable set, endowed with the <math>\sigma</math>-algebra <math>\mathcal{P}(E)</math>. A stochastic matrix on <math>E</math> is a family <math>(Q(x,y))_{x,y\in E}</math> of real numbers satisfying
<ul style{{=}}"list-style-type:lower-roman"><li><math>0\leq Q(x,y)\leq 1</math> for all <math>x,y\in E</math>.
</li>
<li><math>\sum_{y\in E}Q(x,y)=1</math> for all <math>x\in E</math>.
</li>
</ul>
This is a transition probability from <math>E</math> to <math>E</math> in the following sense: if for <math>x\in E</math> and <math>A\subset E</math> we write
<math display="block">
\nu(x,A)=\sum_{y\in A}Q(x,y),
</math>
we see that <math>\nu</math> is a probability transition kernel from <math>E</math> to <math>E</math>. Conversely, if we start with such a transition kernel <math>\nu</math>, the formula
<math display="block">
Q(x,y)=\nu(x,\{y\})
</math>
defines a stochastic matrix on <math>E</math>. For <math>n\geq 1</math>, we can define <math>Q_n=Q^n</math>. Indeed, <math>Q_1=Q</math> and by induction
<math display="block">
Q_{n+1}(x,y)=\sum_{z\in E}Q_n(x,z)Q(z,y).
</math>
One can check that <math>Q_n</math> is also a stochastic matrix on <math>E</math>. For <math>n=0</math> we take <math>Q_0(x,y)=\one_{\{x=y\}}</math>. For a measurable map <math>f:E\to\R_+</math> we write <math>Qf</math> for the function defined by
<math display="block">
Qf(x)=\sum_{y\in E}Q(x,y)f(y).
</math>
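For a finite state space these objects are concrete linear algebra: <math>Q_n</math> is the <math>n</math>-th matrix power of <math>Q</math>, and <math>Qf</math> is a matrix-vector product. A minimal numerical sketch (the <math>3\times 3</math> matrix below is an arbitrary illustration, not taken from the text):

```python
import numpy as np

# An arbitrary stochastic matrix on the three-point set E = {0, 1, 2}:
# every entry lies in [0, 1] and every row sums to 1.
Q = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])

assert np.all((0 <= Q) & (Q <= 1))
assert np.allclose(Q.sum(axis=1), 1.0)

# Q_n is the n-th matrix power of Q; it is again stochastic.
n = 5
Qn = np.linalg.matrix_power(Q, n)
assert np.allclose(Qn.sum(axis=1), 1.0)

# Qf(x) = sum_y Q(x, y) f(y) is the matrix-vector product Q @ f.
f = np.array([1.0, 2.0, 3.0])
Qf = Q @ f
```

The assertions mirror properties (i) and (ii) of the definition, together with the claim that <math>Q_n</math> is again stochastic.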
{{definitioncard|Markov chain|
Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>Q</math> be a stochastic matrix on <math>E</math> and let <math>(X_n)_{n\geq 0}</math> be a stochastic process with values in <math>E</math>. We say that <math>(X_n)_{n\geq 0}</math> is a Markov chain with transition matrix <math>Q</math> if for all <math>n\geq 0</math>, the conditional distribution of <math>X_{n+1}</math> given <math>(X_0,X_1,...,X_n)</math> is <math>Q(X_n,\cdot)</math>, or equivalently if
<math display="block">
\p[X_{n+1}=y\mid X_0=x_0,X_1=x_1,...,X_{n-1}=x_{n-1},X_n=x_n]=Q(x_n,y)
</math>
for all <math>x_0,x_1,...,x_n,y\in E</math> such that <math>\p[X_0=x_0,...,X_n=x_n] > 0</math>.
}}
{{alert-info |
In general, the conditional distribution of <math>X_{n+1}</math> given <math>X_0,X_1,...,X_n</math> may depend on all the variables <math>X_0,X_1,...,X_n</math>. The fact that this conditional distribution only depends on <math>X_n</math> is called the Markov property.
}}
{{alert-info |
The distribution <math>Q(x,\cdot)</math> of <math>X_{n+1}</math> given <math>X_n=x</math> does not depend on <math>n</math>: this is the homogeneity of the Markov chain.
}}
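Operationally, the definition tells us how to simulate a Markov chain: given <math>X_n=x</math>, draw <math>X_{n+1}</math> from the row <math>Q(x,\cdot)</math>. A sketch (the matrix and seed are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary transition matrix on E = {0, 1, 2}.
Q = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])

def simulate_chain(Q, x0, n, rng):
    """Simulate X_0, ..., X_n: at each step, draw X_{k+1} from the row Q(X_k, .)."""
    path = [x0]
    for _ in range(n):
        path.append(rng.choice(len(Q), p=Q[path[-1]]))
    return path

path = simulate_chain(Q, 0, 10, rng)
```

Every transition in the simulated path necessarily has positive probability under <math>Q</math>, which is a cheap sanity check on the simulator.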
{{proofcard|Proposition|prop-1|Let <math>(\Omega,\F,\p)</math> be a probability space. A stochastic process <math>(X_n)_{n\geq 0}</math> with values in <math>E</math> is a Markov chain with transition matrix <math>Q</math> if and only if for all <math>n\geq 0</math> and for all <math>x_0,x_1,...,x_n\in E</math>
<math display="block">
\begin{equation}
\label{markov}
\p[X_0=x_0,X_1=x_1,...,X_n=x_n]=\p[X_0=x_0]\prod_{j=1}^nQ(x_{j-1},x_j).
\end{equation}
</math>
In particular, if <math>\p[X_0=x_0] > 0</math>, then
<math display="block">
\p[X_n=x_n\mid X_0=x_0]=Q_n(x_0,x_n).
</math>
|If <math>(X_n)_{n\geq 0}</math> is a Markov chain with transition matrix <math>Q</math>, we have
<math display="block">
\begin{align*}
\p[X_0=x_0,...,X_n=x_n,X_{n+1}=x_{n+1}]&=\p[X_0=x_0,...,X_n=x_n]\p[X_{n+1}=x_{n+1}\mid X_0=x_0,...,X_n=x_n]\\
&=\p[X_0=x_0,...,X_n=x_n]Q(x_n,x_{n+1}).
\end{align*}
</math>
Thus we can conclude by induction. Conversely, if ([[#markov |markov]]) is satisfied, then
<math display="block">
\p[X_{n+1}=y\mid X_0=x_0,...,X_n=x_n]=\frac{\p[X_0=x_0]\prod_{j=0}^{n-1}Q(x_j,x_{j+1})\,Q(x_n,y)}{\p[X_0=x_0]\prod_{j=0}^{n-1}Q(x_j,x_{j+1})}=Q(x_n,y).
</math>
To conclude, we note that, with <math>x_n=x</math>,
<math display="block">
Q_n(x_0,x)=\sum_{x_1,...,x_{n-1}\in E}\prod_{j=1}^nQ(x_{j-1},x_j).
</math>}}
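The last display of the proof can be verified by brute force on a small example: summing the product <math>\prod_{j=1}^n Q(x_{j-1},x_j)</math> over all intermediate states recovers the matrix power <math>Q_n(x_0,x)</math>. A sketch with an arbitrary matrix:

```python
import itertools
import numpy as np

# An arbitrary stochastic matrix on E = {0, 1, 2}.
Q = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])
E = range(len(Q))
n = 4
x0, x = 0, 2

# Sum over all intermediate states x_1, ..., x_{n-1} of the product
# Q(x_0, x_1) * ... * Q(x_{n-1}, x); this should equal Q_n(x_0, x).
total = 0.0
for mid in itertools.product(E, repeat=n - 1):
    states = (x0, *mid, x)
    prob = 1.0
    for a, b in zip(states, states[1:]):
        prob *= Q[a, b]
    total += prob

Qn = np.linalg.matrix_power(Q, n)
assert np.isclose(total, Qn[x0, x])
```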
{{alert-info |
Proposition 15.1. shows that for a Markov chain, the distribution of <math>(X_0,X_1,...,X_n)</math> is completely determined by the initial distribution (that of <math>X_0</math>) and the transition matrix <math>Q</math>. From now on we write <math>\p[A\mid Z]</math> for <math>\E[\one_A\mid Z]</math>.
}}
{{proofcard|Proposition|prop-2|Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>(X_n)_{n\geq 0}</math> be a Markov chain with transition matrix <math>Q</math>.
<ul style{{=}}"list-style-type:lower-roman"><li>For all <math>n\geq 0</math> and for all measurable maps <math>f:E\to \R_+</math> we have
<math display="block">
\E[f(X_{n+1})\mid X_0,X_1,...,X_n]=\E[f(X_{n+1})\mid X_n]=Qf(X_n).
</math>
More generally, for all <math>\{ i_1,...,i_n\}\subset \{0,1,...,n-1\}</math>, we have
<math display="block">
\E[f(X_{n+1})\mid X_{i_1},X_{i_2},...,X_{i_n},X_n]=Qf(X_n).
</math>
</li>
<li>For all <math>n\geq 0, p\geq 1</math> and for all <math>y_1,...,y_p\in E</math> we have
<math display="block">
\p[X_{n+1}=y_1,...,X_{n+p}=y_p\mid X_0,...,X_n]=Q(X_n,y_1)\prod_{j=1}^{p-1}Q(y_j,y_{j+1})
</math>
and hence
<math display="block">
\p[X_{n+p}=y_p\mid X_n]=Q_p(X_n,y_p).
</math>
If we set <math>Y_p=X_{n+p}</math>, then <math>(Y_p)_{p\geq 0}</math> is also a Markov chain with transition matrix <math>Q</math>.
</li>
</ul>
|For <math>(i)</math>, we have
<math display="block">
\E[f(X_{n+1})\mid X_0,X_1,...,X_n]=\sum_{y\in E}Q(X_n,y)f(y)=Qf(X_n).
</math>
Now if <math>\{i_1,...,i_n\}\subset \{ 0,1,...,n-1\}</math>, then{{efn|Recall that if <math>X\in(E,\mathcal{E})</math> and <math>Y\in(F,\F)</math>, we call a conditional distribution of <math>Y</math> given <math>X</math> any transition kernel <math>\nu:E\to F</math> such that for all measurable maps <math>h:(F,\F)\to \R_+</math>, <math>\E[h(Y)\mid X]=\int_F h(y)\nu(X,dy)</math>.}}
<math display="block">
\begin{multline*}
\E[f(X_{n+1})\mid X_{i_1},...,X_{i_n},X_n]=\E[\E[f(X_{n+1})\mid X_0,X_1,...,X_n]\mid X_{i_1},...,X_{i_n},X_n]\\=\E[Qf(X_n)\mid X_{i_1},...,X_{i_n},X_n]=Qf(X_n).
\end{multline*}
</math>
For <math>(ii)</math>, note that it follows immediately from ([[#markov |markov]]) that
<math display="block">
\p[X_{n+1}=y_1,...,X_{n+p}=y_p\mid X_0=x_0,...,X_n=x_n]=Q(x_n,y_1)\prod_{j=1}^{p-1}Q(y_j,y_{j+1}).
</math>
The formula for <math>\p[X_{n+p}=y_p\mid X_n]</math> follows by summing over <math>y_1,...,y_{p-1}\in E</math>. Finally we note that
<math display="block">
\p[Y_0=y_0,Y_1=y_1,...,Y_p=y_p]=\p[X_n=y_0]\prod_{j=1}^pQ(y_{j-1},y_j)
</math>
and we can apply Proposition 15.1.}}
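The identity <math>\p[X_{n+p}=y_p\mid X_n]=Q_p(X_n,y_p)</math> can be checked exactly on a small example by building the joint law of the first few steps from the product formula; conditioning on <math>X_1</math> and looking two steps ahead should then reproduce <math>Q_2</math>. (The matrix and initial distribution below are arbitrary choices for illustration.)

```python
import itertools
import numpy as np

# Arbitrary transition matrix and initial distribution on E = {0, 1, 2}.
Q = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])
mu0 = np.array([0.3, 0.3, 0.4])
E = range(len(Q))

# Exact joint law of (X_0, X_1, X_2, X_3) from the product formula.
joint = {}
for path in itertools.product(E, repeat=4):
    p = mu0[path[0]]
    for a, b in zip(path, path[1:]):
        p *= Q[a, b]
    joint[path] = p

# The conditional P[X_3 = y | X_1 = x] should equal Q_2(x, y) = (Q @ Q)[x, y].
Q2 = Q @ Q
x, y = 1, 2
num = sum(p for path, p in joint.items() if path[1] == x and path[3] == y)
den = sum(p for path, p in joint.items() if path[1] == x)
assert np.isclose(num / den, Q2[x, y])
```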
'''Example'''
Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>(X_n)_{n\geq 0}</math> be a sequence of independent r.v.'s with values in <math>E</math>, with the same distribution <math>\mu</math>. Then <math>(X_n)_{n\geq 0}</math> is a Markov chain with transition matrix
<math display="block">
Q(x,y)=\mu(y)
</math>
for all <math>x,y\in E</math>.
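This can be seen concretely: for an i.i.d. sequence the next state never depends on the current one, so every row of the transition matrix equals <math>\mu</math> (the distribution below is an arbitrary choice):

```python
import numpy as np

mu = np.array([0.2, 0.5, 0.3])   # an arbitrary distribution on E = {0, 1, 2}
k = len(mu)

# Q(x, y) = mu(y): all rows of Q are identical copies of mu.
Q = np.tile(mu, (k, 1))

assert np.allclose(Q.sum(axis=1), 1.0)
assert all(np.allclose(Q[x], mu) for x in range(k))
```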
'''Example'''
Let <math>(\Omega,\F,\p)</math> be a probability space. Let <math>\eta,\xi_1,...,\xi_n,...</math> be independent r.v.'s with values in <math>\mathbb{Z}^d</math>. We assume that <math>\xi_1,...,\xi_n,...</math> have the same distribution <math>\mu</math>. For <math>n\geq 0</math> we write
<math display="block">
X_n=\eta+\xi_1+\dotsm+\xi_n.
</math>
Then <math>(X_n)_{n\geq 0}</math> is a Markov chain with transition matrix
<math display="block">
Q(x,y)=\mu(y-x)
</math>
for all <math>x,y\in \mathbb{Z}^d</math>. Indeed,
<math display="block">
\begin{multline*}
\p[X_{n+1}=y\mid X_0=x_0,...,X_n=x_n]=\p[\xi_{n+1}=y-x_n\mid X_0=x_0,...,X_n=x_n]\\=\p[\xi_{n+1}=y-x_n]=\mu(y-x_n).
\end{multline*}
</math>
If <math>(e_1,...,e_d)</math> is the canonical basis of <math>\R^d</math>, and if <math>\mu(e_i)=\mu(-e_i)=\frac{1}{2d}</math> for all <math>i\in\{1,...,d\}</math>, then the Markov chain is called a simple random walk on <math>\mathbb{Z}^d</math>.
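The simple random walk is easy to simulate: each increment is uniform over the <math>2d</math> unit steps <math>\pm e_i</math>. A sketch for <math>d=2</math> (seed and walk length are arbitrary):

```python
import numpy as np

d = 2
rng = np.random.default_rng(1)

# The 2d unit steps +e_i and -e_i, each chosen with probability 1/(2d).
steps = np.concatenate([np.eye(d, dtype=int), -np.eye(d, dtype=int)])

n = 1000
increments = steps[rng.integers(0, 2 * d, size=n)]

# X_0 = 0 (here eta = 0), X_k = xi_1 + ... + xi_k.
X = np.concatenate([np.zeros((1, d), dtype=int), np.cumsum(increments, axis=0)])

# Every increment is a unit step along a coordinate axis.
assert np.all(np.abs(increments).sum(axis=1) == 1)
```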
'''Example'''
Let <math>\mathcal{P}_2(E)</math> be the set of subsets of <math>E</math> with two elements and let <math>A\subset\mathcal{P}_2(E)</math>. For <math>x\in E</math>, we write
<math display="block">
A_x=\{y\in E\mid \{x,y\}\in A\}.
</math>
We assume that <math>A_x\not=\varnothing</math> for all <math>x\in E</math>. We then define a transition matrix on <math>E</math> by setting, for <math>x,y\in E</math>,
<math display="block">
Q(x,y)=\begin{cases}\frac{1}{\vert A_x\vert},&\text{if $\{x,y\}\in A$,}\\ 0,&\text{otherwise.}\end{cases}
</math>
A Markov chain with transition matrix <math>Q</math> is called a simple random walk on the graph <math>(E,A)</math>.
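The construction can be sketched directly: build <math>Q</math> from an edge set <math>A</math> and check that it is stochastic. (The small graph below is an arbitrary illustration.)

```python
import numpy as np

# A small graph on E = {0, 1, 2, 3}: A is a set of two-element subsets (edges).
A = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]}
E = range(4)

def neighbours(x):
    """A_x = { y in E : {x, y} in A }."""
    return sorted(y for y in E if frozenset((x, y)) in A)

# Q(x, y) = 1/|A_x| if {x, y} is an edge, 0 otherwise.
Q = np.zeros((4, 4))
for x in E:
    Ax = neighbours(x)
    for y in Ax:
        Q[x, y] = 1.0 / len(Ax)

# Each row is the uniform distribution on the neighbours of x.
assert np.allclose(Q.sum(axis=1), 1.0)
```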
==General references==
{{cite arXiv|last=Moshayedi|first=Nima|year=2020|title=Lectures on Probability Theory|eprint=2010.16280|class=math.PR}}
==Notes==
{{notelist}}