Definition and first properties
In this chapter [math]E[/math] will be a finite or countable set, endowed with the [math]\sigma[/math]-algebra [math]\mathcal{P}(E)[/math]. A stochastic matrix on [math]E[/math] is a family [math](Q(x,y))_{x,y\in E}[/math] of real numbers satisfying
- [math]0\leq Q(x,y)\leq 1[/math] for all [math]x,y\in E[/math].
- [math]\sum_{y\in E}Q(x,y)=1[/math] for all [math]x\in E[/math].
This is a transition probability from [math]E[/math] to [math]E[/math] in the following sense: if for [math]x\in E[/math] and [math]A\subset E[/math] we write[[math]] \nu(x,A)=\sum_{y\in A}Q(x,y), [[/math]]
we see that [math]\nu[/math] is a probability transition kernel from [math]E[/math] to [math]E[/math]. Conversely, if we start with such a transition kernel, the formula[[math]] Q(x,y)=\nu(x,\{y\}) [[/math]]
defines a stochastic matrix on [math]E[/math]. For [math]n\geq 1[/math], we can define [math]Q_n=Q^n[/math]. Indeed, [math]Q_1=Q[/math] and by induction[[math]] Q_{n+1}(x,y)=\sum_{z\in E}Q_n(x,z)Q(z,y). [[/math]]
One can check that [math]Q_n[/math] is also a stochastic matrix on [math]E[/math]. For [math]n=0[/math] we take [math]Q_0(x,y)=\one_{\{x=y\}}[/math]. For a measurable map [math]f:E\to\R_+[/math] we write [math]Qf[/math] for the function defined by[[math]] Qf(x)=\sum_{y\in E}Q(x,y)f(y). [[/math]]
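The objects introduced so far (a stochastic matrix, its powers [math]Q_n=Q^n[/math], and the action [math]Qf[/math]) can be sketched on a small finite state space. This is a minimal illustration in pure Python; the matrix entries are illustrative values, not taken from the text.

```python
# A stochastic matrix Q on E = {0, 1, 2}, its n-step powers Q_n = Q^n,
# and the action Qf(x) = sum_y Q(x, y) f(y). Entries are illustrative.

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[x][z] * B[z][y] for z in range(n)) for y in range(n)]
            for x in range(n)]

def matrix_power(Q, n):
    """Compute Q_n = Q^n; Q_0 is the identity, Q_0(x, y) = 1_{x = y}."""
    size = len(Q)
    P = [[1.0 if x == y else 0.0 for y in range(size)] for x in range(size)]
    for _ in range(n):
        P = matmul(P, Q)
    return P

def apply_Q(Q, f):
    """Return the function Qf, i.e. Qf(x) = sum_y Q(x, y) f(y)."""
    return [sum(Q[x][y] * f[y] for y in range(len(Q))) for x in range(len(Q))]

Q = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.0, 0.4, 0.6]]

# Each row of Q sums to 1, and the same holds for every power Q_n.
Q3 = matrix_power(Q, 3)
assert all(abs(sum(row) - 1.0) < 1e-12 for row in Q3)
```

In particular, applying [math]Q[/math] to the constant function [math]f\equiv 1[/math] returns the constant function [math]1[/math], which is exactly the row-sum condition.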
Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math]Q[/math] be a stochastic matrix on [math]E[/math] and let [math](X_n)_{n\geq 0}[/math] be a stochastic process with values in [math]E[/math]. We say that [math](X_n)_{n\geq 0}[/math] is a Markov chain with transition matrix [math]Q[/math] if for all [math]n\geq 0[/math], the conditional distribution of [math]X_{n+1}[/math] given [math](X_0,X_1,...,X_n)[/math] is [math]Q(X_n,\cdot)[/math], or equivalently if for all [math]x_0,x_1,...,x_n,y\in E[/math][[math]] \p[X_{n+1}=y\mid X_0=x_0,X_1=x_1,...,X_n=x_n]=Q(x_n,y). [[/math]]
In general, the conditional distribution of [math]X_{n+1}[/math] given [math]X_0,X_1,...,X_n[/math] depends on all the variables [math]X_0,X_1,...,X_n[/math]. The fact that this conditional distribution only depends on [math]X_n[/math] is called the Markov property.
Moreover, the distribution [math]Q(x,\cdot)[/math] of [math]X_{n+1}[/math] given [math]X_n=x[/math] does not depend on [math]n[/math]: this is the homogeneity of the Markov chain.
Let [math](\Omega,\F,\p)[/math] be a probability space. A stochastic process [math](X_n)_{n\geq 0}[/math] with values in [math]E[/math] is a Markov chain with transition matrix [math]Q[/math] if and only if for all [math]n\geq 0[/math] and for all [math]x_0,x_1,...,x_n\in E[/math][[math]] \p[X_0=x_0,X_1=x_1,...,X_n=x_n]=\p[X_0=x_0]\prod_{i=1}^{n}Q(x_{i-1},x_i). [[/math]]
In particular, if [math]\p[X_0=x_0] \gt 0[/math], then[[math]] \p[X_1=x_1,...,X_n=x_n\mid X_0=x_0]=\prod_{i=1}^{n}Q(x_{i-1},x_i). [[/math]]
If [math](X_n)_{n\geq 0}[/math] is a Markov chain with transition matrix [math]Q[/math], we have[[math]] \p[X_0=x_0,...,X_n=x_n,X_{n+1}=x_{n+1}]=Q(x_n,x_{n+1})\,\p[X_0=x_0,...,X_n=x_n]. [[/math]]
Thus we can conclude by induction. Conversely, if (markov) is satisfied and [math]\p[X_0=x_0,...,X_n=x_n] \gt 0[/math], then[[math]] \p[X_{n+1}=y\mid X_0=x_0,...,X_n=x_n]=\frac{\p[X_0=x_0,...,X_n=x_n,X_{n+1}=y]}{\p[X_0=x_0,...,X_n=x_n]}=Q(x_n,y). [[/math]]
To conclude, we note that
Proposition 15.1. shows that for a Markov chain, the distribution of [math](X_0,X_1,...,X_n)[/math] is completely determined by the initial distribution (that of [math]X_0[/math]) and the transition matrix [math]Q[/math]. From now on, we write [math]\p[A\mid Z][/math] for [math]\E[\one_A\mid Z][/math].
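The product formula of Proposition 15.1. can be checked concretely: the probability of a path is the initial mass at [math]x_0[/math] times the product of the transition probabilities along the path. A minimal sketch, with an illustrative two-state chain (values not from the text):

```python
# The law of a Markov chain path is determined by the initial distribution
# mu and the transition matrix Q:
#   P[X_0 = x_0, ..., X_n = x_n] = mu(x_0) * prod_i Q(x_{i-1}, x_i).
from itertools import product

def path_probability(mu, Q, path):
    """Probability of observing the given sequence of states."""
    p = mu[path[0]]
    for a, b in zip(path, path[1:]):
        p *= Q[a][b]
    return p

mu = [0.5, 0.5]        # illustrative initial distribution on E = {0, 1}
Q = [[0.9, 0.1],
     [0.4, 0.6]]       # illustrative transition matrix

# Sanity check: summing over all paths of a fixed length gives total mass 1.
total = sum(path_probability(mu, Q, path)
            for path in product([0, 1], repeat=4))
assert abs(total - 1.0) < 1e-12
```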
Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 0}[/math] be a Markov chain with transition matrix [math]Q[/math].
- For all [math]n\geq 0[/math] and for all measurable maps [math]f:E\to \R_+[/math] we have
[[math]] \E[f(X_{n+1})\mid X_0,X_1,...,X_n]=\E[f(X_{n+1})\mid X_n]=Qf(X_n). [[/math]]More generally, for every subset [math]\{ i_1,...,i_k\}\subset \{0,1,...,n-1\}[/math], we have[[math]] \E[f(X_{n+1})\mid X_{i_1},X_{i_2},...,X_{i_k},X_n]=Qf(X_n). [[/math]]
- For all [math]n\geq 0, p\geq 1[/math] and for all [math]y_1,...,y_p\in E[/math] we have
[[math]] \p[X_{n+1}=y_1,...,X_{n+p}=y_p\mid X_0,...,X_n]=Q(X_n,y_1)\prod_{j=1}^{p-1}Q(y_j,y_{j+1}) [[/math]]and hence[[math]] \p[X_{n+p}=y_p\mid X_n]=Q_p(X_n,y_p). [[/math]]If we take [math]Y_p=X_{n+p}[/math], then [math](Y_p)_{p\geq 0}[/math] is also a Markov chain with transition matrix [math]Q[/math].
For [math](i)[/math], we have[[math]] \E[f(X_{n+1})\mid X_0,...,X_n]=\sum_{y\in E}f(y)\,\p[X_{n+1}=y\mid X_0,...,X_n]=\sum_{y\in E}f(y)Q(X_n,y)=Qf(X_n). [[/math]]
For [math](ii)[/math], note that it follows immediately from (markov) that
The formula for [math]\p[X_{n+p}=y_p\mid X_n][/math] follows by summing over [math]y_1,...,y_{p-1}\in E[/math]. Finally, we note that
and we can apply Proposition 15.1.
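Statement [math](ii)[/math] above can be verified numerically: summing the product [math]Q(x,y_1)\prod_j Q(y_j,y_{j+1})[/math] over the intermediate states recovers the [math]p[/math]-step probability [math]Q_p(x,y_p)[/math], i.e. the corresponding entry of the matrix power [math]Q^p[/math]. A sketch with an illustrative two-state matrix:

```python
# Check that summing path products over intermediate states y_1, ..., y_{p-1}
# equals the (x, y_p) entry of the matrix power Q^p.
from itertools import product

Q = [[0.7, 0.3],
     [0.2, 0.8]]   # illustrative 2-state transition matrix

def p_step_by_sum(Q, x, y_p, p):
    """Sum Q(x, y_1) * Q(y_1, y_2) * ... * Q(y_{p-1}, y_p) over intermediates."""
    states = range(len(Q))
    total = 0.0
    for mid in product(states, repeat=p - 1):
        ys = (x,) + mid + (y_p,)
        prob = 1.0
        for a, b in zip(ys, ys[1:]):
            prob *= Q[a][b]
        total += prob
    return total

def p_step_by_power(Q, x, y_p, p):
    """The (x, y_p) entry of Q^p, computed by repeated multiplication."""
    P = Q
    for _ in range(p - 1):
        P = [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
              for j in range(len(Q))] for i in range(len(Q))]
    return P[x][y_p]

assert abs(p_step_by_sum(Q, 0, 1, 3) - p_step_by_power(Q, 0, 1, 3)) < 1e-12
```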
Example
Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 0}[/math] be a sequence of independent r.v.'s with values in [math]E[/math], with the same distribution [math]\mu[/math]. Then [math](X_n)_{n\geq 0}[/math] is a Markov chain with transition matrix[[math]] Q(x,y)=\mu(y) [[/math]]
for all [math]x,y\in E[/math].
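In other words, the transition matrix of an i.i.d. sequence has constant rows, all equal to [math]\mu[/math]. A quick sketch (the distribution below is illustrative):

```python
# For an i.i.d. sequence with common distribution mu on E = {0, 1, 2},
# the transition matrix has constant rows: Q(x, y) = mu(y) for every x.
mu = [0.2, 0.5, 0.3]          # illustrative distribution
Q = [list(mu) for _ in mu]    # every row of Q equals mu

# Whatever the current state x, the next state has distribution mu.
assert all(Q[x][y] == mu[y] for x in range(3) for y in range(3))
```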
Example
Let [math](\Omega,\F,\p)[/math] be a probability space. Let [math]\eta,\xi_1,...,\xi_n,...[/math] be independent r.v.'s with values in [math]\mathbb{Z}^d[/math]. We assume that [math]\xi_1,...,\xi_n,...[/math] have the same distribution [math]\mu[/math]. For [math]n\geq 0[/math] we write[[math]] X_n=\eta+\xi_1+\dots+\xi_n. [[/math]]
Then [math](X_n)_{n\geq 0}[/math] is a Markov chain with transition matrix[[math]] Q(x,y)=\mu(y-x) [[/math]]
for all [math]x,y\in\mathbb{Z}^d[/math]. Indeed, since [math]\xi_{n+1}[/math] is independent of [math](X_0,...,X_n)[/math],[[math]] \p[X_{n+1}=y\mid X_0=x_0,...,X_n=x]=\p[\xi_{n+1}=y-x\mid X_0=x_0,...,X_n=x]=\mu(y-x). [[/math]]
If [math](e_1,...,e_d)[/math] is the canonical basis of [math]\R^d[/math], and if [math]\mu(e_i)=\mu(-e_i)=\frac{1}{2d}[/math] for all [math]i\in\{1,...,d\}[/math], then the Markov chain is called a simple random walk on [math]\mathbb{Z}^d[/math].
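The simple random walk on [math]\mathbb{Z}^d[/math] can be simulated directly: each increment is uniform over the [math]2d[/math] unit vectors [math]\pm e_i[/math]. A minimal sketch (the function name and seed are illustrative choices):

```python
# Simple random walk on Z^d: at each step, add an increment chosen
# uniformly among the 2d unit vectors +e_i, -e_i.
import random

def simple_random_walk(d, n_steps, start=None, rng=None):
    """Return the path (X_0, ..., X_{n_steps}) as a list of tuples."""
    rng = rng or random.Random(0)
    x = list(start) if start is not None else [0] * d
    path = [tuple(x)]
    for _ in range(n_steps):
        i = rng.randrange(d)             # pick a coordinate uniformly
        x[i] += rng.choice((-1, 1))      # move +-1 in that coordinate
        path.append(tuple(x))
    return path

path = simple_random_walk(2, 100)
# Each step changes exactly one coordinate by exactly 1.
for a, b in zip(path, path[1:]):
    assert sum(abs(u - v) for u, v in zip(a, b)) == 1
```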
Example
Let [math]\mathcal{P}_2(E)[/math] be the set of subsets of [math]E[/math] with two elements and let [math]A\subset\mathcal{P}_2(E)[/math]. For [math]x\in E[/math], we write[[math]] A_x=\{y\in E : \{x,y\}\in A\}. [[/math]]
We assume that [math]A_x\not=\varnothing[/math] for all [math]x\in E[/math]. We then define a transition matrix on [math]E[/math] by setting, for [math]x,y\in E[/math],[[math]] Q(x,y)=\frac{\one_{\{\{x,y\}\in A\}}}{\#A_x}. [[/math]]
A Markov chain with transition matrix [math]Q[/math] is called a simple random walk on the graph [math](E,A)[/math].
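Concretely, the walk jumps from [math]x[/math] to a uniformly chosen neighbour in [math]A_x[/math]. A sketch on an illustrative graph (a 4-cycle; the graph itself is not from the text):

```python
# Simple random walk on a graph (E, A): from x, the chain jumps to a
# uniformly chosen neighbour y with {x, y} in A.
A = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}  # a 4-cycle
E = range(4)

def neighbours(x):
    """A_x = { y in E : {x, y} in A }."""
    return sorted(y for y in E if frozenset((x, y)) in A)

def Q(x, y):
    """Q(x, y) = 1/|A_x| if {x, y} in A, and 0 otherwise."""
    Ax = neighbours(x)
    return 1.0 / len(Ax) if y in Ax else 0.0

# Q is stochastic because A_x is nonempty for every x.
assert all(abs(sum(Q(x, y) for y in E) - 1.0) < 1e-12 for x in E)
```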
General references
Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].
Notes
- Recall that if [math]X[/math] is a random variable with values in [math](E,\mathcal{E})[/math] and [math]Y[/math] a random variable with values in [math](F,\F)[/math], we call a conditional distribution of [math]Y[/math] given [math]X[/math] any transition kernel [math]\nu[/math] from [math]E[/math] to [math]F[/math] such that for all measurable maps [math]h:(F,\F)\to \R_+[/math], [math]\E[h(Y)\mid X]=\int_F h(y)\nu(X,dy)[/math].