<div class="d-none"><math>
\newcommand{\NA}{{\rm NA}}
\newcommand{\mat}[1]{{\bf#1}}
\newcommand{\exref}[1]{\ref{##1}}
\newcommand{\secstoprocess}{\all}
\newcommand{\mathds}{\mathbb}</math></div>
The fundamental limit theorem for regular Markov chains states that if
<math>\mat{P}</math> is
a regular transition matrix then


<math display="block">
\lim_{n \to \infty} \mat {P}^n = \mat {W}\ ,
</math>
where <math>\mat{W}</math> is a matrix with each row equal to the unique fixed probability
row vector
<math>\mat{w}</math> for <math>\mat{P}</math>.  In this section we shall give two very different
proofs of this
theorem.
Our first proof is carried out by showing that, for any column vector
<math>\mat{y}</math>,
<math>\mat{P}^n \mat {y}</math> tends to a constant vector.  As indicated in the previous
section, this
will show that <math>\mat{P}^n</math> converges to a matrix with constant columns or,
equivalently, to a
matrix with all rows the same.
The following lemma says that if an <math>r</math>-by-<math>r</math> transition matrix has no zero
entries,
and <math>\mat {y}</math> is any column vector with <math>r</math> entries, then the vector <math>\mat
{P}\mat{y}</math> has
entries which are “closer together” than the entries are in <math>\mat {y}</math>.
{{proofcard|Lemma|lemma-1|Let <math>\mat{P}</math> be an <math>r</math>-by-<math>r</math> transition matrix with no zero entries.  Let <math>d</math>
be
the smallest entry of the matrix.  Let <math>\mat{y}</math> be a column vector with <math>r</math>
components, the largest of which is <math>M_0</math> and the smallest <math>m_0</math>.  Let
<math>M_1</math> and <math>m_1</math> be the largest and smallest component, respectively, of the
vector <math>\mat {P} \mat {y}</math>.  Then
<math display="block">
M_1 - m_1 \leq (1 - 2d)(M_0 - m_0)\ .
</math>|In the discussion following [[guide:E5c38a1e8a#thm 11.3.6 |Theorem]], it was noted that each
entry in the vector
<math>\mat {P}\mat{y}</math> is a weighted average of the entries in <math>\mat {y}</math>.  The
largest weighted
average that could be obtained in the present case would occur if all but one
of the entries of
<math>\mat {y}</math> have  value <math>M_0</math> and one entry has value <math>m_0</math>, and this one small
entry is weighted
by the smallest possible weight,  namely <math>d</math>.  In this case, the weighted
average
would equal
<math display="block">
dm_0 + (1-d)M_0\ .
</math>
Similarly, the smallest possible weighted average equals
<math display="block">
dM_0 + (1-d)m_0\ .
</math>
Thus,
<math display="block">
\begin{eqnarray*}
M_1 - m_1 &\le& \Bigl(dm_0 + (1-d)M_0\Bigr) - \Bigl(dM_0 + (1-d)m_0\Bigr) \\
        &=& (1 - 2d)(M_0 - m_0)\ .
\end{eqnarray*}
</math>
This completes the proof of the lemma.}}
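As an illustration (not part of the original text), the lemma can be checked numerically.  The following sketch uses an arbitrarily chosen 3-by-3 transition matrix with no zero entries and an arbitrary vector <math>\mat{y}</math>; both are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

# An illustrative 3-by-3 transition matrix with no zero entries (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
d = P.min()                      # smallest entry of P

y = np.array([5.0, -1.0, 2.0])   # an arbitrary column vector
M0, m0 = y.max(), y.min()

Py = P @ y                       # the vector P y
M1, m1 = Py.max(), Py.min()

# The lemma asserts M1 - m1 <= (1 - 2d)(M0 - m0).
print(M1 - m1, "<=", (1 - 2 * d) * (M0 - m0))
assert M1 - m1 <= (1 - 2 * d) * (M0 - m0) + 1e-12
</syntaxhighlight>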
We turn now to the proof of the fundamental limit theorem for regular Markov
chains. 
{{proofcard|Theorem|theorem-1|''' (Fundamental Limit Theorem for Regular Chains)'''
If <math>\mat{P}</math> is the transition matrix for a regular Markov chain, then
<math display="block">
\lim_{n \to \infty} \mat {P}^n = \mat {W}\ ,
</math>
where <math>\mat {W}</math> is a matrix with all rows equal.  Furthermore, all entries in
<math>\mat{W}</math> are
strictly positive.|We prove this theorem for the special case that <math>\mat{P}</math> has no zero entries.
The extension
to the general case is indicated in [[exercise:Ed0d60ec05 |Exercise]]. Let <math>\mat {y}</math> be
any
<math>r</math>-component column vector, where <math>r</math> is the number of states of the chain.
We assume that
<math>r  >  1</math>, since otherwise the theorem is trivial.  Let
<math>M_n</math> and
<math>m_n</math> be, respectively, the maximum and minimum components of the
vector <math>\mat {P}^n \mat { y}</math>.  The vector <math>\mat {P}^n \mat {y}</math> is obtained
from the
vector <math>\mat {P}^{n - 1}
\mat {y}</math> by multiplying on the left by the matrix <math>\mat{P}</math>.  Hence each
component
of <math>\mat {P}^n \mat {y}</math> is an average of the components of <math>\mat {P}^{n - 1}
\mat {y}</math>.  Thus
<math display="block">
M_0 \geq M_1 \geq M_2 \geq\cdots
</math>
and
<math display="block">
m_0 \leq m_1 \leq m_2 \leq\cdots\ .
</math>
Each sequence is monotone and bounded:
<math display="block">
m_0 \leq m_n \leq M_n \leq M_0\ .
</math>
Hence, each of these sequences will have a limit as <math>n</math> tends to infinity.
Let <math>M</math> be the limit of <math>M_n</math> and <math>m</math> the limit of <math>m_n</math>.  We know that <math>m \leq
M</math>.  We shall prove that <math>M - m = 0</math>.  This will be the case if <math>M_n - m_n</math>
tends to 0.  Let <math>d</math> be the smallest element of <math>\mat{P}</math>.  Since all entries
of <math>\mat{P}</math>
are strictly positive, we have <math>d  >  0</math>.  By our lemma
<math display="block">
M_n - m_n \leq (1 - 2d)(M_{n - 1} - m_{n - 1})\ .
</math>
From this we see that
<math display="block">
M_n - m_n \leq (1 - 2d)^n(M_0 - m_0)\ .
</math>
Since <math>r \ge 2</math> and each row of <math>\mat{P}</math> has <math>r</math> entries, each at least <math>d</math>, summing to 1, we must have <math>d \leq 1/2</math>, so <math>0 \leq 1 - 2d  <  1</math>.  Hence
the difference <math>M_n - m_n</math> tends to 0 as <math>n</math> tends to infinity.  Since every
component of <math>\mat {P}^n \mat {y}</math> lies between <math>m_n</math> and <math>M_n</math>, each component
must
approach the same number <math>u = M = m</math>. 
This shows that
<span id{{=}}"eq 11.4.4"/>
<math display="block">
\lim_{n \to \infty} \mat {P}^n \mat {y} = \mat{u}\ ,  \label{eq 11.4.4}
</math>
where <math>\mat{u}</math> is a column vector all of whose components equal <math>u</math>. 
Now let <math>\mat{y}</math> be the vector with <math>j</math>th component equal to 1 and all other
components equal to
0. Then <math>\mat {P}^n \mat {y}</math> is the
<math>j</math>th column of <math>\mat {P}^n</math>.  Doing this for each <math>j</math> proves that the columns
of <math>\mat {P}^n</math> approach constant column vectors.  That is, the rows of <math>\mat
{P}^n</math>
approach a common row vector <math>\mat{w}</math>, or,
<span id{{=}}"eq 11.4.5"/>
<math display="block">
\lim_{n \to \infty} \mat {P}^n = \mat {W}\ .  \label{eq 11.4.5}
</math>
It remains to show that all entries in <math>\mat{W}</math> are strictly positive.  As
before, let
<math>\mat{y}</math> be the vector with <math>j</math>th component equal to 1 and all other
components equal to 0. 
Then <math>\mat{P}\mat{y}</math> is the <math>j</math>th column of <math>\mat{P}</math>, and this column has all
entries strictly
positive.  The minimum component of the vector <math>\mat{P}\mat{y}</math> was defined to
be <math>m_1</math>, hence
<math>m_1  >  0</math>.  Since <math>m_1 \le m</math>, we have <math>m  >  0</math>.  Note finally that this value
of <math>m</math> is just the
<math>j</math>th component of <math>\mat{w}</math>, so all components of <math>\mat{w}</math> are strictly
positive.}}
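The conclusion of the theorem is easy to observe numerically.  The sketch below (an illustration, using the same assumed matrix as in the previous sketch) raises <math>\mat{P}</math> to a high power and checks that all rows agree with a probability vector <math>\mat{w}</math> satisfying <math>\mat{w}\mat{P} = \mat{w}</math>.
<syntaxhighlight lang="python">
import numpy as np

# The same illustrative regular transition matrix as in the previous sketch.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

Pn = np.linalg.matrix_power(P, 50)
print(Pn)                                  # every row is (numerically) the same

w = Pn[0]                                  # the common row: the fixed probability vector
print(np.allclose(w @ P, w))               # w P = w
print(np.isclose(w.sum(), 1.0))            # components of w sum to 1
</syntaxhighlight>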
===Doeblin's Proof===
We give now a very different proof of the main part of the fundamental limit
theorem for
regular Markov chains.  This proof was first given by Doeblin,<ref group="Notes" >W. Doeblin, “Exposé de la Théorie des Chaines Simple Constantes
de Markov à un
Nombre Fini d'Etats,” ''Rev. Math. de l'Union Interbalkanique,'' vol. 2
(1937), pp. 77–105.</ref> a brilliant young mathematician who was killed in
his twenties in the Second World War.
{{proofcard|Theorem|thm_11.4.1|Let <math>\mat {P}</math> be the transition matrix for a regular Markov chain with fixed
vector
<math>\mat {w}</math>.  Then for any initial probability vector <math>\mat {u}</math>,
<math>\mat {uP}^n \rightarrow \mat {w}</math> as <math>n \rightarrow \infty.</math>|Let <math> X_0,\ X_1,\ \ldots</math> be a Markov
chain with transition matrix <math>\mat {P}</math> started in state <math>s_i</math>.  Let <math>Y_0,\
Y_1,\ \ldots</math>  be a Markov
chain with transition matrix <math>\mat {P}</math> started with initial probabilities
given by
<math>\mat {w}</math>.  The <math>X</math> and <math>Y</math> processes are run independently of each other.
We consider also a third Markov chain  <math>\mat{P}^*</math>  which consists of watching
both the
<math>X</math> and <math>Y</math> processes.  The states for <math>\mat{P}^*</math>
are pairs <math>(s_i, s_j)</math>.  The transition probabilities are given by
<math display="block">
\mat{P}^{*}[(i,j),(k,l)] = \mat{P}(i,k) \cdot  \mat{P}(j,l)\ .
</math>
 
Since <math>\mat{P}</math> is regular there is an <math>N</math> such that <math>\mat{P}^{N}(i,j)  >  0</math> for
all <math>i</math>
and
<math>j</math>.  Thus for the
<math>\mat{P}^*</math> chain it is also possible to go from any state <math>(s_i, s_j)</math> to any
other state
<math>(s_k,s_l)</math> in at most <math>N</math> steps.  That is, <math>\mat{P}^*</math> is also a regular Markov
chain.
We know that a regular Markov chain will reach any state in a finite time.  Let
<math>T</math> be
the first time that the chain <math>\mat{P}^*</math> is in a state of the form <math>(s_k,s_k)</math>.
In other words,
<math>T</math> is the first time that the <math>X</math> and the <math>Y</math> processes are in the same state.
Then we have shown that
<math display="block">
P(T  >  n) \rightarrow 0 \;\;\mbox{as}\;\; n \rightarrow \infty\ .
</math>
If we watch the <math>X</math> and <math>Y</math> processes after the first time they are in the same
state, we would not predict any difference in their long-range behavior.  Since
this will
happen no matter how we started these two processes, it seems clear that the
long-range
behavior should not depend upon the starting state.  We now show that this is
true.
We first note that if <math>n \ge T</math>, then since <math>X</math> and <math>Y</math> are both in the
same state at time <math>T</math>,
<math display="block">
P(X_n = j\ |\ n \ge T) = P(Y_n = j\ |\ n \ge T)\ .
</math>
If we multiply both sides of this equation by <math>P(n \ge T)</math>, we obtain
<span id{{=}}"eq 11.4.1"/>
<math display="block">
\begin{equation}
P(X_n = j,\ n \ge T) = P(Y_n = j,\ n \ge T)\ .
\label{eq 11.4.1}
\end{equation}
</math>
We know that for all <math>n</math>,
<math display="block">
P(Y_n = j) = w_j\ .
</math>
But
<math display="block">
P(Y_n = j) = P(Y_n = j,\ n \ge T) + P(Y_n = j,\ n  <  T)\ ,
</math>
and the second summand on the right-hand side of this equation goes to 0 as <math>n</math>
goes to
<math>\infty</math>, since <math>P(n  <  T)</math> goes to 0 as <math>n</math> goes to <math>\infty</math>.  So,
<math display="block">
P(Y_n = j,\ n \ge T) \rightarrow w_j\ ,
</math>
as <math>n</math> goes to <math>\infty</math>.  From [[#eq 11.4.1 |Equation]], we see that
<math display="block">
P(X_n = j,\ n \ge T) \rightarrow w_j\ ,
</math>
as <math>n</math> goes to <math>\infty</math>.  But by similar reasoning to that used above, the
difference
between this last expression and <math>P(X_n = j)</math> goes to 0 as <math>n</math> goes to
<math>\infty</math>.  Therefore,
<math display="block">
P(X_n = j) \rightarrow w_j\ ,
</math>
as <math>n</math> goes to <math>\infty</math>.  This completes the proof.}}
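The coupling argument can be made concrete with a small simulation.  The sketch below (an illustration under the same assumed transition matrix as in the earlier sketches, not part of the original text) runs independent copies of the <math>X</math> and <math>Y</math> processes and estimates <math>P(T  >  n)</math>, which tends to 0 as <math>n</math> grows.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Illustrative regular transition matrix (an assumption, not from the text).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
w = np.linalg.matrix_power(P, 100)[0]      # fixed probability vector, to high accuracy
w = w / w.sum()                            # normalize to guard against round-off
r = P.shape[0]

def coupling_time(start_state, max_steps=1000):
    """Run X (started at start_state) and Y (started from w) independently;
    return the first time T at which the two chains are in the same state."""
    x = start_state
    y = rng.choice(r, p=w)
    for t in range(max_steps):
        if x == y:
            return t
        x = rng.choice(r, p=P[x])
        y = rng.choice(r, p=P[y])
    return max_steps

samples = np.array([coupling_time(0) for _ in range(10_000)])
for n in (0, 1, 2, 5, 10):
    print(f"P(T > {n}) is approximately {(samples > n).mean():.4f}")  # tends to 0
</syntaxhighlight>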
In the above proof, we have said nothing about the rate at which the
distributions of the
<math>X_n</math>'s approach the fixed distribution <math>\mat {w}</math>.  In fact, it can be shown
that<ref group="Notes" >T. Lindvall, ''Lectures on the Coupling Method''  (New York:
Wiley 1992).</ref>
<math display="block">
\sum_{j = 1}^{r} \left| P(X_{n} = j) - w_j \right| \leq 2
P(T  >  n)\ .
</math>
The left-hand side of this inequality can be viewed as the distance between the
distribution
of the Markov chain after <math>n</math> steps, starting in state <math>s_i</math>, and the limiting
distribution
<math>\mat {w}</math>.
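For a small example the inequality can be verified exactly.  The sketch below (an illustration under the same assumed matrix as in the earlier sketches, not part of the original text) computes the left-hand side from <math>\mat{P}^n</math> and computes <math>P(T  >  n)</math> by making the diagonal states of the product chain absorbing.
<syntaxhighlight lang="python">
import numpy as np

# Same illustrative P and w as in the previous sketches.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
w = np.linalg.matrix_power(P, 100)[0]
r = P.shape[0]
i = 0                                      # the X chain starts in state s_i

# Product chain on pairs (x, y), with the diagonal made absorbing, so that the
# probability mass still off the diagonal after n steps equals P(T > n).
Q = np.zeros((r * r, r * r))
for x in range(r):
    for y in range(r):
        s = x * r + y
        if x == y:
            Q[s, s] = 1.0                  # absorbing: the chains have met
        else:
            Q[s] = np.outer(P[x], P[y]).ravel()

# Initial distribution: X_0 = s_i and Y_0 distributed according to w.
mu0 = np.zeros(r * r)
for y in range(r):
    mu0[i * r + y] = w[y]

off_diag = [x * r + y for x in range(r) for y in range(r) if x != y]
for n in range(6):
    lhs = np.abs(np.linalg.matrix_power(P, n)[i] - w).sum()
    p_T_gt_n = (mu0 @ np.linalg.matrix_power(Q, n))[off_diag].sum()
    print(f"n={n}: sum_j |P(X_n=j) - w_j| = {lhs:.4f},  2 P(T > n) = {2 * p_T_gt_n:.4f}")
</syntaxhighlight>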
==General references==
{{cite web |url=https://math.dartmouth.edu/~prob/prob/prob.pdf |title=Grinstead and Snell’s Introduction to Probability |last=Doyle |first=Peter G.|date=2006 |access-date=June 6, 2024}}
==Notes==
{{Reflist|group=Notes}}
