<div class="d-none"><math>
\newcommand{\NA}{{\rm NA}}
\newcommand{\mat}[1]{{\bf#1}}
\newcommand{\exref}[1]{\ref{##1}}
\newcommand{\secstoprocess}{\all}
\newcommand{\mathds}{\mathbb}</math></div>
The fundamental limit theorem for regular Markov chains states that if <math>\mat{P}</math> is a regular transition matrix then

<math display="block">
\lim_{n \to \infty} \mat{P}^n = \mat{W}\ ,
</math>

where <math>\mat{W}</math> is a matrix with each row equal to the unique fixed probability row vector <math>\mat{w}</math> for <math>\mat{P}</math>. In this section we shall give two very different proofs of this theorem.
Our first proof is carried out by showing that, for any column vector <math>\mat{y}</math>, <math>\mat{P}^n \mat{y}</math> tends to a constant vector. As indicated in Section 11.3, this will show that <math>\mat{P}^n</math> converges to a matrix with constant columns or, equivalently, to a matrix with all rows the same.
The following lemma says that if an <math>r</math>-by-<math>r</math> transition matrix has no zero entries, and <math>\mat{y}</math> is any column vector with <math>r</math> entries, then the vector <math>\mat{P}\mat{y}</math> has entries which are “closer together” than the entries of <math>\mat{y}</math>.
{{proofcard|Lemma|lemma-1|Let <math>\mat{P}</math> be an <math>r</math>-by-<math>r</math> transition matrix with no zero entries. Let <math>d</math> be the smallest entry of the matrix. Let <math>\mat{y}</math> be a column vector with <math>r</math> components, the largest of which is <math>M_0</math> and the smallest <math>m_0</math>. Let <math>M_1</math> and <math>m_1</math> be the largest and smallest component, respectively, of the vector <math>\mat{P}\mat{y}</math>. Then

<math display="block">
M_1 - m_1 \leq (1 - 2d)(M_0 - m_0)\ .
</math>
|In the discussion following [[guide:E5c38a1e8a#thm 11.3.6 |Theorem]], it was noted that each entry in the vector <math>\mat{P}\mat{y}</math> is a weighted average of the entries in <math>\mat{y}</math>. The largest weighted average that could be obtained in the present case would occur if all but one of the entries of <math>\mat{y}</math> have value <math>M_0</math> and one entry has value <math>m_0</math>, and this one small entry is weighted by the smallest possible weight, namely <math>d</math>. In this case, the weighted average would equal

<math display="block">
dm_0 + (1-d)M_0\ .
</math>

Similarly, the smallest possible weighted average equals

<math display="block">
dM_0 + (1-d)m_0\ .
</math>

Thus,

<math display="block">
\begin{eqnarray*}
M_1 - m_1 &\le& \Bigl(dm_0 + (1-d)M_0\Bigr) - \Bigl(dM_0 + (1-d)m_0\Bigr) \\
&=& (1 - 2d)(M_0 - m_0)\ .
\end{eqnarray*}
</math>

This completes the proof of the lemma.}}
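The contraction in the lemma is easy to check numerically. The following is a minimal sketch (assuming NumPy; the transition matrix and the vector <math>\mat{y}</math> are made-up examples, not taken from the text) that verifies the bound for one particular choice:

<syntaxhighlight lang="python">
import numpy as np

# A made-up 3-by-3 transition matrix with no zero entries; smallest entry d = 0.1.
P = np.array([[0.1, 0.3, 0.6],
              [0.4, 0.4, 0.2],
              [0.5, 0.1, 0.4]])
d = P.min()

y = np.array([9.0, 1.0, 4.0])   # an arbitrary column vector
M0, m0 = y.max(), y.min()

Py = P @ y                      # each entry is a weighted average of y's entries
M1, m1 = Py.max(), Py.min()

print(M1 - m1, "<=", (1 - 2*d) * (M0 - m0))   # 2.6 <= 6.4, as the lemma guarantees
</syntaxhighlight>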
We turn now to the proof of the fundamental limit theorem for regular Markov chains.
{{proofcard|Theorem|theorem-1|'''(Fundamental Limit Theorem for Regular Chains)'''
If <math>\mat{P}</math> is the transition matrix for a regular Markov chain, then

<math display="block">
\lim_{n \to \infty} \mat{P}^n = \mat{W}\ ,
</math>

where <math>\mat{W}</math> is a matrix with all rows equal. Furthermore, all entries in <math>\mat{W}</math> are strictly positive.
|We prove this theorem for the special case that <math>\mat{P}</math> has no 0 entries. The extension to the general case is indicated in [[exercise:Ed0d60ec05 |Exercise]]. Let <math>\mat{y}</math> be any <math>r</math>-component column vector, where <math>r</math> is the number of states of the chain. We assume that <math>r > 1</math>, since otherwise the theorem is trivial. Let <math>M_n</math> and <math>m_n</math> be, respectively, the maximum and minimum components of the vector <math>\mat{P}^n \mat{y}</math>. The vector <math>\mat{P}^n \mat{y}</math> is obtained from the vector <math>\mat{P}^{n-1} \mat{y}</math> by multiplying on the left by the matrix <math>\mat{P}</math>. Hence each component of <math>\mat{P}^n \mat{y}</math> is a weighted average of the components of <math>\mat{P}^{n-1} \mat{y}</math>. Thus

<math display="block">
M_0 \geq M_1 \geq M_2 \geq \cdots
</math>

and

<math display="block">
m_0 \leq m_1 \leq m_2 \leq \cdots\ .
</math>

Each sequence is monotone and bounded:

<math display="block">
m_0 \leq m_n \leq M_n \leq M_0\ .
</math>
Hence, each of these sequences will have a limit as <math>n</math> tends to infinity. Let <math>M</math> be the limit of <math>M_n</math> and <math>m</math> the limit of <math>m_n</math>. We know that <math>m \leq M</math>. We shall prove that <math>M - m = 0</math>. This will be the case if <math>M_n - m_n</math> tends to 0. Let <math>d</math> be the smallest entry of <math>\mat{P}</math>. Since all entries of <math>\mat{P}</math> are strictly positive, we have <math>d > 0</math>. By our lemma,

<math display="block">
M_n - m_n \leq (1 - 2d)(M_{n-1} - m_{n-1})\ .
</math>

From this we see that

<math display="block">
M_n - m_n \leq (1 - 2d)^n(M_0 - m_0)\ .
</math>

Since <math>r \ge 2</math>, we must have <math>d \leq 1/2</math>, so <math>0 \leq 1 - 2d < 1</math>, and therefore the difference <math>M_n - m_n</math> tends to 0 as <math>n</math> tends to infinity. Since every component of <math>\mat{P}^n \mat{y}</math> lies between <math>m_n</math> and <math>M_n</math>, each component must approach the same number <math>u = M = m</math>.
This shows that

<span id{{=}}"eq 11.4.4"/>
<math display="block">
\lim_{n \to \infty} \mat{P}^n \mat{y} = \mat{u}\ , \label{eq 11.4.4}
</math>

where <math>\mat{u}</math> is a column vector all of whose components equal <math>u</math>.

Now let <math>\mat{y}</math> be the vector with <math>j</math>th component equal to 1 and all other components equal to 0. Then <math>\mat{P}^n \mat{y}</math> is the <math>j</math>th column of <math>\mat{P}^n</math>. Doing this for each <math>j</math> proves that the columns of <math>\mat{P}^n</math> approach constant column vectors. That is, the rows of <math>\mat{P}^n</math> approach a common row vector <math>\mat{w}</math>, or,

<span id{{=}}"eq 11.4.5"/>
<math display="block">
\lim_{n \to \infty} \mat{P}^n = \mat{W}\ . \label{eq 11.4.5}
</math>

It remains to show that all entries in <math>\mat{W}</math> are strictly positive. As before, let <math>\mat{y}</math> be the vector with <math>j</math>th component equal to 1 and all other components equal to 0. Then <math>\mat{P}\mat{y}</math> is the <math>j</math>th column of <math>\mat{P}</math>, and this column has all entries strictly positive. The minimum component of the vector <math>\mat{P}\mat{y}</math> was defined to be <math>m_1</math>, hence <math>m_1 > 0</math>. Since <math>m_1 \le m</math>, we have <math>m > 0</math>. Note finally that this value of <math>m</math> is just the <math>j</math>th component of <math>\mat{w}</math>, so all components of <math>\mat{w}</math> are strictly positive.}}
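To see the theorem in action, one can raise a regular transition matrix to a high power and watch all of its rows agree. A minimal sketch, again assuming NumPy and the same made-up matrix as above:

<syntaxhighlight lang="python">
import numpy as np

# A made-up 3-state regular transition matrix (every entry positive).
P = np.array([[0.1, 0.3, 0.6],
              [0.4, 0.4, 0.2],
              [0.5, 0.1, 0.4]])

Pn = np.linalg.matrix_power(P, 50)
print(Pn)  # every row is (numerically) the same: the fixed vector w

w = Pn[0]
print(np.allclose(w @ P, w))  # True: w is fixed under P
</syntaxhighlight>

Here <code>w</code> is read off as any row of the limiting matrix, and the final check confirms, up to rounding, that it is the fixed probability vector of the theorem.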
===Doeblin's Proof=== | |||
We give now a very different proof of the main part of the fundamental limit theorem for regular Markov chains. This proof was first given by Doeblin,<ref group="Notes" >W. Doeblin, “Exposé de la Théorie des Chaines Simples Constantes de Markov à un Nombre Fini d'Etats,” ''Rev. Math. de l'Union Interbalkanique,'' vol. 2 (1937), pp. 77–105.</ref> a brilliant young mathematician who was killed in his twenties in the Second World War.
{{proofcard|Theorem|thm_11.4.1|Let <math>\mat{P}</math> be the transition matrix for a regular Markov chain with fixed vector <math>\mat{w}</math>. Then for any initial probability vector <math>\mat{u}</math>, <math>\mat{u}\mat{P}^n \rightarrow \mat{w}</math> as <math>n \rightarrow \infty</math>.
|Let <math>X_0,\ X_1,\ \ldots</math> be a Markov chain with transition matrix <math>\mat{P}</math> started in state <math>s_i</math>. Let <math>Y_0,\ Y_1,\ \ldots</math> be a Markov chain with transition matrix <math>\mat{P}</math> started with initial probabilities given by <math>\mat{w}</math>. The <math>X</math> and <math>Y</math> processes are run independently of each other.

We consider also a third Markov chain <math>\mat{P}^*</math> which consists of watching both the <math>X</math> and <math>Y</math> processes. The states for <math>\mat{P}^*</math> are pairs <math>(s_i, s_j)</math>. The transition probabilities are given by

<math display="block">
\mat{P}^{*}[(i,j),(k,l)] = \mat{P}(i,k) \cdot \mat{P}(j,l)\ .
</math>
Since <math>\mat{P}</math> is regular, there is an <math>N</math> such that <math>\mat{P}^{N}(i,j) > 0</math> for all <math>i</math> and <math>j</math>. Thus for the <math>\mat{P}^*</math> chain it is also possible to go from any state <math>(s_i, s_j)</math> to any other state <math>(s_k,s_l)</math> in at most <math>N</math> steps. That is, <math>\mat{P}^*</math> is also a regular Markov chain.

We know that a regular Markov chain will reach any state in a finite time. Let <math>T</math> be the first time the chain <math>\mat{P}^*</math> is in a state of the form <math>(s_k,s_k)</math>. In other words, <math>T</math> is the first time that the <math>X</math> and the <math>Y</math> processes are in the same state. It follows that

<math display="block">
P(T > n) \rightarrow 0 \;\;\mbox{as}\;\; n \rightarrow \infty\ .
</math>
If we watch the <math>X</math> and <math>Y</math> processes after the first time they are in the same state, we would not predict any difference in their long-range behavior. Since this will happen no matter how we started these two processes, it seems clear that the long-range behavior should not depend upon the starting state. We now show that this is true.

We first note that if <math>n \ge T</math>, then since <math>X</math> and <math>Y</math> are both in the same state at time <math>T</math>,

<math display="block">
P(X_n = j\ |\ n \ge T) = P(Y_n = j\ |\ n \ge T)\ .
</math>

If we multiply both sides of this equation by <math>P(n \ge T)</math>, we obtain

<span id{{=}}"eq 11.4.1"/>
<math display="block">
\begin{equation}
P(X_n = j,\ n \ge T) = P(Y_n = j,\ n \ge T)\ .
\label{eq 11.4.1}
\end{equation}
</math>
We know that for all <math>n</math>,

<math display="block">
P(Y_n = j) = w_j\ .
</math>

But

<math display="block">
P(Y_n = j) = P(Y_n = j,\ n \ge T) + P(Y_n = j,\ n < T)\ ,
</math>

and the second summand on the right-hand side of this equation goes to 0 as <math>n</math> goes to <math>\infty</math>, since <math>P(n < T)</math> goes to 0 as <math>n</math> goes to <math>\infty</math>. So,

<math display="block">
P(Y_n = j,\ n \ge T) \rightarrow w_j\ ,
</math>

as <math>n</math> goes to <math>\infty</math>. From [[#eq 11.4.1 |Equation]], we see that

<math display="block">
P(X_n = j,\ n \ge T) \rightarrow w_j\ ,
</math>

as <math>n</math> goes to <math>\infty</math>. But by similar reasoning to that used above, the difference between this last expression and <math>P(X_n = j)</math> goes to 0 as <math>n</math> goes to <math>\infty</math>. Therefore,

<math display="block">
P(X_n = j) \rightarrow w_j\ ,
</math>

as <math>n</math> goes to <math>\infty</math>. This completes the proof.}}
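Doeblin's coupling argument can also be simulated directly: run the <math>X</math> and <math>Y</math> chains independently and record the first time <math>T</math> at which they meet. The sketch below is a hypothetical illustration (assuming NumPy; the matrix is again made up, and <math>\mat{w}</math> is approximated from a high power of <math>\mat{P}</math>):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# A made-up regular transition matrix and its (approximate) fixed vector w.
P = np.array([[0.1, 0.3, 0.6],
              [0.4, 0.4, 0.2],
              [0.5, 0.1, 0.4]])
w = np.linalg.matrix_power(P, 50)[0]
w = w / w.sum()  # renormalize against floating-point drift

def coupling_time(start):
    """First time T at which X (started at `start`) and Y (started from w) meet."""
    x, y = start, rng.choice(3, p=w)
    t = 0
    while x != y:
        x = rng.choice(3, p=P[x])   # X and Y step independently with the same P
        y = rng.choice(3, p=P[y])
        t += 1
    return t

times = np.array([coupling_time(0) for _ in range(10_000)])
print(times.mean())          # average coupling time
print((times > 10).mean())   # estimate of P(T > 10)
</syntaxhighlight>

The helper <code>coupling_time</code> simply tracks the pair process <math>\mat{P}^*</math> of the proof; making the diagonal states absorbing would yield the same <math>T</math>.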
In the above proof, we have said nothing about the rate at which the distributions of the <math>X_n</math>'s approach the fixed distribution <math>\mat{w}</math>. In fact, it can be shown that<ref group="Notes" >T. Lindvall, ''Lectures on the Coupling Method'' (New York: Wiley, 1992).</ref>

<math display="block">
\sum_{j=1}^{r} |P(X_n = j) - w_j| \leq 2\, P(T > n)\ .
</math>
The left-hand side of this inequality can be viewed as the distance between the distribution of the Markov chain after <math>n</math> steps, starting in state <math>s_i</math>, and the limiting distribution <math>\mat{w}</math>.
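Continuing the sketch above (same hypothetical <math>\mat{P}</math>, <math>\mat{w}</math>, and simulated coupling times <code>times</code>), the two sides of this inequality can be compared for a given <math>n</math>:

<syntaxhighlight lang="python">
# Continuing the coupling sketch above: compare both sides of the inequality.
n = 10
lhs = np.abs(np.linalg.matrix_power(P, n)[0] - w).sum()  # sum_j |P(X_n = j) - w_j|, started in state 0
rhs = 2 * (times > n).mean()                             # Monte Carlo estimate of 2 P(T > n)
print(lhs, "<=", rhs)
</syntaxhighlight>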
==General references==
{{cite web |url=https://math.dartmouth.edu/~prob/prob/prob.pdf |title=Grinstead and Snell's Introduction to Probability |last=Doyle |first=Peter G. |date=2006 |access-date=June 6, 2024}}

==Notes==
{{Reflist|group=Notes}}