Arc Sine Laws
\label{sec 12.3}
In Exercise~\ref{sec 12.1}., the distribution of the time
of the last equalization in the symmetric random walk was determined. If we let [math]\alpha_{2k, 2m}[/math]
denote the probability that a random walk of length [math]2m[/math] has its last equalization at time
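[math]2k[/math], then we have
[math]\alpha_{2k, 2m} = u_{2k} u_{2m - 2k}\ ,[/math]
where [math]u_{2k}[/math] is the probability, introduced in Section \ref{sec 12.1}, that the walk is at the origin at time [math]2k[/math].
We shall now show how one can approximate the distribution of the [math]\alpha[/math]'s with a simple function. We recall that
[math]u_{2k} \sim \frac{1}{\sqrt{\pi k}}\ .[/math]
Therefore, as both [math]k[/math] and [math]m - k[/math] go to [math]\infty[/math], we have
[math]\alpha_{2k, 2m} \sim \frac{1}{\pi \sqrt{k(m - k)}}\ .[/math]
This last expression can be written as
[math]\frac{1}{\pi m \sqrt{(k/m)(1 - k/m)}}\ .[/math]
Thus, if we define
[math]f(x) = \frac{1}{\pi \sqrt{x(1 - x)}}[/math]
for [math]0 \lt x \lt 1[/math], then we have
[math]\alpha_{2k, 2m} \approx \frac{1}{m} f\left(\frac{k}{m}\right)\ .[/math]
The reason for the [math]\approx[/math] sign is that we no longer require that [math]k[/math] get large. This means that we can replace the discrete [math]\alpha_{2k, 2m}[/math] distribution by the continuous density [math]f(x)[/math] on the interval [math][0, 1][/math] and obtain a good approximation. In particular, if [math]x[/math] is a fixed real number between 0 and 1, then we have
[math]\sum_{k \lt xm} \alpha_{2k, 2m} \approx \int_0^x f(t)\,dt\ .[/math]
It turns out that [math]f(x)[/math] has a nice antiderivative, so we can write
[math]\sum_{k \lt xm} \alpha_{2k, 2m} \approx \frac{2}{\pi} \arcsin \sqrt{x}\ .[/math]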
One can see from the graph of the density [math]f(x)[/math] that it has a minimum at [math]x = 1/2[/math] and is symmetric about that point. As noted in the exercise, this symmetry implies that half of the walks of length [math]2m[/math] have no equalizations after time [math]m[/math], a fact that would probably not be guessed.
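The quality of the approximation can be checked numerically. The following is a minimal sketch, assuming the standard closed forms [math]u_{2k} = \binom{2k}{k} 2^{-2k}[/math] and [math]\alpha_{2k, 2m} = u_{2k} u_{2m - 2k}[/math]; the function names `u`, `alpha`, `exact_cdf`, and `arcsine_cdf` are illustrative and not part of the text.

```python
from math import asin, comb, pi, sqrt


def u(k):
    """P(S_{2k} = 0) for the symmetric walk, assumed to be C(2k, k) / 2^(2k)."""
    return comb(2 * k, k) / 4 ** k


def alpha(k, m):
    """Assumed closed form alpha_{2k,2m} = u_{2k} * u_{2m-2k}."""
    return u(k) * u(m - k)


def exact_cdf(x, m):
    """Sum of alpha_{2k,2m} over all k with k < x * m."""
    return sum(alpha(k, m) for k in range(m + 1) if k < x * m)


def arcsine_cdf(x):
    """(2 / pi) * arcsin(sqrt(x)), the integral of the arc sine density on [0, x]."""
    return (2 / pi) * asin(sqrt(x))


if __name__ == "__main__":
    m = 200  # half the walk length; chosen only for illustration
    for x in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"x={x:4.2f}  exact={exact_cdf(x, m):.4f}  arcsine={arcsine_cdf(x):.4f}")
```

For moderately large [math]m[/math] the two columns should agree closely, with the largest discrepancies expected near the endpoints, where the density is unbounded.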
It turns out that the arc sine density comes up in the answers to many other questions
concerning random walks on the line. Recall that in Section \ref{sec 12.1}, a random walk could be
viewed as a polygonal line connecting [math](0,0)[/math] with [math](m, S_m)[/math]. Under this interpretation, we
define [math]b_{2k, 2m}[/math] to be the probability that a random walk of length [math]2m[/math] has exactly [math]2k[/math] of
its [math]2m[/math] polygonal line segments above the [math]t[/math]-axis.
The probability [math]b_{2k, 2m}[/math] is frequently interpreted in terms of a two-player game. (The
reader will recall the game Heads or Tails, in Example.) Player A is said to be in
the lead at time
[math]n[/math] if the random walk is above the
[math]t[/math]-axis at that time, or if the random walk is on the [math]t[/math]-axis at time [math]n[/math] but above the
[math]t[/math]-axis at time
[math]n-1[/math]. (At time 0, neither player is in the lead.) One can ask what is the most probable number
of times that player A is in the lead, in a game of length [math]2m[/math]. Most people will say that the
answer to this question is [math]m[/math]. However, the following theorem says that [math]m[/math] is the least likely
number of times that player A is in the lead, and the most likely number of times in the
lead is 0 or [math]2m[/math].
If Peter and Paul play a game of Heads or Tails of length [math]2m[/math], the probability that Peter will be in the lead exactly [math]2k[/math] times is equal to [math]\alpha_{2k, 2m}[/math].
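To prove the theorem, we need to show that
[math]b_{2k, 2m} = \alpha_{2k, 2m}\ .[/math]
The cases [math]k = 0[/math] and [math]k = m[/math], in which [math]b_{0, 2m} = b_{2m, 2m} = u_{2m}[/math], can be checked separately, so we assume that [math]1 \le k \le m - 1[/math]. A path of length [math]2m[/math] with exactly [math]2k[/math] of its segments above the [math]t[/math]-axis must then return to the origin; suppose that its first return occurs at time [math]2j[/math]. Either the first [math]2j[/math] segments all lie above the [math]t[/math]-axis, in which case exactly [math](2k - 2j)[/math] of the remaining [math](2m - 2j)[/math] segments lie above the [math]t[/math]-axis, or they all lie below it, in which case exactly [math]2k[/math] of the remaining segments lie above the [math]t[/math]-axis.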
We now count the number of paths of the various types
described above. The number of paths of length [math]2j[/math] all of whose line segments lie above the
[math]t[/math]-axis and which return to the origin for the first time at time [math]2j[/math] equals
[math](1/2)2^{2j}f_{2j}[/math]. This also equals the number of paths of length [math]2j[/math] all of whose line
segments lie below the [math]t[/math]-axis and which return to the origin for the first time at time [math]2j[/math].
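The number of paths of length [math](2m - 2j)[/math] which have exactly [math](2k - 2j)[/math] line segments above the
[math]t[/math]-axis is [math]2^{2m-2j} b_{2k-2j, 2m-2j}[/math]. Finally, the number of paths of length [math](2m-2j)[/math] which have
exactly [math]2k[/math] line segments above the [math]t[/math]-axis is [math]2^{2m-2j} b_{2k,2m-2j}[/math]. Therefore, summing over [math]j[/math] and dividing by the total number [math]2^{2m}[/math] of paths of length [math]2m[/math], we have
[math]b_{2k, 2m} = \frac{1}{2} \sum_{j = 1}^{k} f_{2j} b_{2k - 2j, 2m - 2j} + \frac{1}{2} \sum_{j = 1}^{m - k} f_{2j} b_{2k, 2m - 2j}\ .[/math]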
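We now assume that the equation [math]b_{2k, 2m} = \alpha_{2k, 2m}[/math] is true for [math]m \lt n[/math]. Then we have
[math]\begin{aligned} b_{2k, 2n} &= \frac{1}{2} \sum_{j = 1}^{k} f_{2j} \alpha_{2k - 2j, 2n - 2j} + \frac{1}{2} \sum_{j = 1}^{n - k} f_{2j} \alpha_{2k, 2n - 2j} \\ &= \frac{1}{2} \sum_{j = 1}^{k} f_{2j} u_{2k - 2j} u_{2n - 2k} + \frac{1}{2} \sum_{j = 1}^{n - k} f_{2j} u_{2k} u_{2n - 2j - 2k} \\ &= \frac{1}{2} u_{2n - 2k} \sum_{j = 1}^{k} f_{2j} u_{2k - 2j} + \frac{1}{2} u_{2k} \sum_{j = 1}^{n - k} f_{2j} u_{2n - 2j - 2k} \\ &= \frac{1}{2} u_{2n - 2k} u_{2k} + \frac{1}{2} u_{2k} u_{2n - 2k}\ , \end{aligned}[/math]
where the last equality follows from the theorem of Section \ref{sec 12.1} relating the [math]u[/math]'s and the [math]f[/math]'s, namely [math]u_{2n} = \sum_{j = 1}^{n} f_{2j} u_{2n - 2j}[/math]. Thus, we have
[math]b_{2k, 2n} = \alpha_{2k, 2n}\ ,[/math]
which completes the proof.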
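For small [math]m[/math] the theorem can be verified exhaustively. The sketch below enumerates all [math]2^{2m}[/math] paths, counts the times at which player A is in the lead using the definition given above, and compares the resulting distribution with [math]\alpha_{2k, 2m}[/math], which is assumed here to equal [math]\binom{2k}{k}\binom{2m - 2k}{m - k} 2^{-2m}[/math]. The helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product
from math import comb


def lead_count(steps):
    """Number of times n at which the walk is above the axis, or is on the
    axis having been above it at time n - 1."""
    position = 0
    count = 0
    for step in steps:
        previous = position
        position += step
        if position > 0 or (position == 0 and previous > 0):
            count += 1
    return count


def lead_distribution(m):
    """Exact distribution of the lead count over all 2^(2m) equally likely paths."""
    counts = [0] * (2 * m + 1)
    for steps in product((1, -1), repeat=2 * m):
        counts[lead_count(steps)] += 1
    return [Fraction(c, 4 ** m) for c in counts]


def alpha(k, m):
    """Assumed closed form u_{2k} * u_{2m-2k} as an exact fraction."""
    return Fraction(comb(2 * k, k) * comb(2 * m - 2 * k, m - k), 4 ** m)


if __name__ == "__main__":
    m = 4
    dist = lead_distribution(m)
    for k in range(m + 1):
        print(2 * k, dist[2 * k], alpha(k, m))
```

Exact rational arithmetic is used so that the comparison is an equality check rather than a floating-point approximation. The enumeration grows as [math]4^m[/math], so it is practical only for small [math]m[/math].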
We illustrate the above theorem by simulating 10,000 games of Heads or Tails, with each game consisting of 40 tosses. The distribution of the number of times that Peter is in the lead is given in Figure \ref{fig 12.2}, together with the arc sine density.
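A simulation of this kind might be set up as follows. This is only a sketch: the seed, the function names, and the use of Python's `random` module are assumptions made for illustration, and it does not reproduce the program behind the figure.

```python
import random
from collections import Counter
from math import pi, sqrt


def times_in_lead(num_tosses, rng):
    """Play one game and return the number of times Peter is in the lead."""
    position = 0
    lead = 0
    for _ in range(num_tosses):
        previous = position
        position += rng.choice((1, -1))
        if position > 0 or (position == 0 and previous > 0):
            lead += 1
    return lead


def simulate(num_games=10_000, num_tosses=40, seed=0):
    """Histogram of the lead count over num_games independent games."""
    rng = random.Random(seed)
    return Counter(times_in_lead(num_tosses, rng) for _ in range(num_games))


def arcsine_density(x):
    """f(x) = 1 / (pi * sqrt(x * (1 - x))) for 0 < x < 1."""
    return 1 / (pi * sqrt(x * (1 - x)))


if __name__ == "__main__":
    games, tosses = 10_000, 40
    m = tosses // 2
    hist = simulate(games, tosses)
    for k in range(m + 1):
        observed = hist[2 * k] / games
        # The density is unbounded at the endpoints, so compare only for 0 < k < m.
        approx = arcsine_density(k / m) / m if 0 < k < m else float("nan")
        print(f"{2 * k:2d}  observed={observed:.4f}  (1/m) f(k/m)={approx:.4f}")
```

The observed frequencies should be largest at 0 and 40 and smallest near 20, in line with the theorem.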
We end this section by stating two other results in which the arc sine density appears. Proofs of these results may be found in Feller.[Notes 1]
Let [math]J[/math] be the random variable which, for a given random walk of length [math]2m[/math], gives the smallest subscript [math]j[/math] such that [math]S_{j} = S_{2m}[/math]. (Such a subscript [math]j[/math] must be even, by parity considerations.) Let [math]\gamma_{2k, 2m}[/math] be the probability that [math]J = 2k[/math]. Then we have
[math]\gamma_{2k, 2m} = \alpha_{2k, 2m}\ .[/math]
The next theorem says that the arc sine density is applicable to a wide range of situations. A continuous distribution function [math]F(x)[/math] is said to be symmetric if [math]F(x) = 1 - F(-x)[/math]. (If [math]X[/math] is a continuous random variable with a symmetric distribution function, then for any real [math]x[/math], we have [math]P(X \le x) = P(X \ge -x)[/math].) We imagine that we have a random walk of length [math]n[/math] in which each summand has the distribution [math]F(x)[/math], where [math]F[/math] is continuous and symmetric. The subscript of the first maximum of such a walk is the unique subscript [math]k[/math] such that
[math]S_k \gt S_0,\ \ldots,\ S_k \gt S_{k-1},\ S_k \ge S_{k+1},\ \ldots,\ S_k \ge S_n\ .[/math]
We define the random variable [math]K_n[/math] to be the subscript of the first maximum. We can now state the following theorem concerning the random variable [math]K_n[/math].
Let [math]F[/math] be a symmetric continuous distribution function, and let [math]\alpha[/math] be a fixed real number strictly between 0 and 1. Then as [math]n \rightarrow \infty[/math], we have
[math]P(K_n \lt n\alpha) \rightarrow \frac{2}{\pi} \arcsin \sqrt{\alpha}\ .[/math]
A version of this theorem that holds for a symmetric random walk can also be found in Feller.
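The first-maximum theorem can be explored by simulation. The following sketch assumes standard normal summands as one convenient example of a symmetric continuous distribution; the walk length, number of trials, seed, and function names are illustrative choices, not taken from the text.

```python
import random
from math import asin, pi, sqrt


def first_max_index(n, rng):
    """Subscript of the first maximum of S_0 = 0, S_1, ..., S_n."""
    total = 0.0
    best = 0.0  # S_0 = 0 is the initial candidate, with subscript 0
    best_index = 0
    for i in range(1, n + 1):
        total += rng.gauss(0.0, 1.0)
        if total > best:  # strict inequality keeps the first subscript attaining the maximum
            best = total
            best_index = i
    return best_index


def empirical_cdf(alpha, n=200, trials=2000, seed=0):
    """Fraction of simulated walks with K_n < n * alpha."""
    rng = random.Random(seed)
    hits = sum(first_max_index(n, rng) < n * alpha for _ in range(trials))
    return hits / trials


def arcsine_cdf(alpha):
    """Limiting value (2 / pi) * arcsin(sqrt(alpha))."""
    return (2 / pi) * asin(sqrt(alpha))


if __name__ == "__main__":
    for a in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"alpha={a:4.2f}  simulated={empirical_cdf(a):.3f}  limit={arcsine_cdf(a):.3f}")
```

Replacing `rng.gauss` with any other symmetric continuous sampler, for example a uniform draw on [math][-1, 1][/math], should give similar agreement, which is the point of the theorem.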
==General references==
Doyle, Peter G. (2006). "Grinstead and Snell's Introduction to Probability" (PDF). Retrieved June 6, 2024.