Convergence of Random Variables
Types of Convergence
We have already seen the notion of a.s. convergence. In probability theory there are several different types of convergence for r.v.'s. Let us first recall the notion of a.s. convergence.
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 1}[/math] be a sequence of r.v.'s and let [math]X[/math] be a r.v. with values in [math]\R^d[/math]. Then we say that [math](X_n)_{n\geq 1}[/math] converges a.s. to [math]X[/math], and write [math]\lim_{n\to\infty\atop a.s.}X_n=X[/math], if

[math]
\p\left[\lim_{n\to\infty}X_n=X\right]=1.
[/math]
Another very important convergence type is [math]L^p[/math]-convergence, as described in measure theory. Recall that convergence in [math]L^p[/math] for [math]p\in[1,\infty)[/math] in probabilistic language means

[math]
\lim_{n\to\infty}\E[\vert X_n-X\vert^p]=0.
[/math]
Let [math](\Omega,\A,\p)[/math] be a probability space. We say that the sequence [math](X_n)_{n\geq 1}[/math] converges in probability to [math]X[/math] if for all [math]\epsilon \gt 0[/math]

[math]
\lim_{n\to\infty}\p[\vert X_n-X\vert \gt \epsilon]=0.
[/math]

In this case we write [math]\lim_{n\to\infty\atop \p}X_n=X[/math].
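As a quick numerical illustration (not part of the notes; the distribution, tolerance, and sample sizes are all choices made for this example), one can watch [math]\p[\vert X_n-X\vert \gt \epsilon][/math] shrink when [math]X_n[/math] is the mean of [math]n[/math] i.i.d. uniform r.v.'s and [math]X=1/2[/math]:

```python
import numpy as np

def prob_deviation(n, eps=0.05, trials=5_000, seed=0):
    """Monte Carlo estimate of P[|X_n - 1/2| > eps], where X_n is the
    mean of n iid Uniform(0,1) r.v.'s (so X_n -> 1/2 in probability)."""
    rng = np.random.default_rng(seed)
    sample_means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
    return float(np.mean(np.abs(sample_means - 0.5) > eps))

# The estimated probabilities decrease towards 0 as n grows,
# which is exactly what convergence in probability asserts.
estimates = [prob_deviation(n) for n in (10, 100, 1000)]
```

Here the decay of the estimates mirrors the definition: for the fixed [math]\epsilon[/math], the probability of a deviation larger than [math]\epsilon[/math] tends to [math]0[/math].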
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]\mathcal{L}^0_{\R^d}(\Omega,\A,\p)[/math] be the space of r.v.'s with values in [math]\R^d[/math] and let [math]L^0_{\R^d}(\Omega,\A,\p)[/math] be the quotient of [math]\mathcal{L}^0_{\R^d}(\Omega,\A,\p)[/math] by the equivalence relation [math]X\sim Y:\Longleftrightarrow X=Y[/math] a.s. Then the map

[math]
d(X,Y):=\E[\vert X-Y\vert\land 1]
[/math]

defines a metric on [math]L^0_{\R^d}(\Omega,\A,\p)[/math], and convergence with respect to [math]d[/math] is exactly convergence in probability.
It is easy to see that [math]d[/math] defines a distance. If [math]\lim_{n\to\infty\atop \p}X_n=X[/math], then for all [math]\epsilon \gt 0[/math] we get

[math]
d(X_n,X)=\E[\vert X_n-X\vert\land 1]\leq \epsilon+\p[\vert X_n-X\vert \gt \epsilon],
[/math]

and hence [math]\limsup_{n\to\infty}d(X_n,X)\leq\epsilon[/math] for every [math]\epsilon \gt 0[/math], i.e. [math]\lim_{n\to\infty}d(X_n,X)=0[/math].
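As a sanity check (our own illustration, using the distance [math]d(X,Y)=\E[\vert X-Y\vert\land 1][/math] that also appears in the proof of the next proposition), we can estimate [math]d(X_n,X)[/math] by Monte Carlo for the sample-mean example and verify the elementary bound [math]d(X_n,X)\leq \epsilon+\p[\vert X_n-X\vert \gt \epsilon][/math]:

```python
import numpy as np

def mc_distance(n, trials=5_000, seed=1):
    """Monte Carlo estimate of d(X_n, X) = E[min(|X_n - X|, 1)] together
    with the raw deviations, for X_n = mean of n iid Uniform(0,1) r.v.'s
    and X = 1/2 (its limit in probability)."""
    rng = np.random.default_rng(seed)
    means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
    deviations = np.abs(means - 0.5)
    return float(np.minimum(deviations, 1.0).mean()), deviations

eps = 0.05
d10, dev10 = mc_distance(10)
d1000, _ = mc_distance(1000)

# The bound d(X_n, X) <= eps + P[|X_n - X| > eps] holds pointwise
# (min(t,1) <= eps when t <= eps, and min(t,1) <= 1 otherwise),
# and d(X_n, X) shrinks as n grows.
bound10 = eps + float(np.mean(dev10 > eps))
```

The shrinking estimates illustrate why [math]d[/math] metrizes convergence in probability.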
Let [math](\Omega,\A,\p)[/math] be a probability space. If [math](X_n)_{n\geq 1}[/math] converges a.s. or in [math]L^p[/math] to [math]X[/math], it also converges in probability to [math]X[/math]. Conversely, if [math](X_n)_{n\geq 1}[/math] converges to [math]X[/math] in probability, then there exists a subsequence [math](X_{n_k})_{k\geq 1}[/math] of [math](X_n)_{n\geq 1}[/math] such that

[math]
\lim_{k\to\infty\atop a.s.}X_{n_k}=X.
[/math]
Consider [math]d(X_n,X)=\E[\vert X_n-X\vert\land 1][/math]. We need to prove that [math]\lim_{n\to\infty\atop a.s.}X_n=X[/math] or [math]\lim_{n\to\infty\atop L^p}X_n=X[/math] implies [math]\lim_{n\to\infty}d(X_n,X)=0[/math]. If [math]\lim_{n\to\infty\atop a.s.}X_n=X[/math], then we can apply Lebesgue's dominated convergence theorem (the domination holds because [math]\vert X_n-X\vert\land 1\leq 1[/math] and [math]\E[1] \lt \infty[/math]) to obtain [math]\lim_{n\to\infty}\E[\vert X_n-X\vert\land 1]=\E[\lim_{n\to\infty}(\vert X_n-X\vert\land 1)]=0[/math]. If [math]\lim_{n\to\infty\atop L^p}X_n=X[/math], we can use the fact that for all [math]p\geq 1[/math]

[math]
\E[\vert X_n-X\vert\land 1]\leq \E[\vert X_n-X\vert]\leq \E[\vert X_n-X\vert^p]^{1/p},
[/math]

where the second inequality is Jensen's inequality, so again [math]\lim_{n\to\infty}d(X_n,X)=0[/math].
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 1}[/math] be a sequence of r.v.'s with [math]\lim_{n\to\infty\atop\p}X_n=X[/math]. Assume there is some [math]r \gt 1[/math] such that [math](X_n)_{n\geq 1}[/math] is bounded in [math]L^r[/math], i.e.

[math]
\sup_{n\geq 1}\E[\vert X_n\vert^r] \lt \infty.
[/math]

Then [math](X_n)_{n\geq 1}[/math] converges to [math]X[/math] in [math]L^1[/math].
The fact that [math](X_n)_{n\geq 1}[/math] is bounded in [math]L^r[/math] means exactly that there is some [math]C \gt 0[/math] such that for all [math]n\geq 1[/math]

[math]
\E[\vert X_n\vert^r]\leq C.
[/math]
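To make the role of the constant [math]C[/math] concrete (an illustration of ours, not a step taken from the text): combined with Markov's inequality, [math]L^r[/math]-boundedness yields the uniform tail bound [math]\p[\vert X_n\vert \gt t]\leq C/t^r[/math]. A sketch with [math]r=2[/math] and [math]X_n[/math] the mean of [math]n[/math] i.i.d. Exp(1) r.v.'s, for which [math]\E[X_n^2]=1+1/n\leq 2=:C[/math]:

```python
import numpy as np

rng = np.random.default_rng(2)

# X_n = mean of n iid Exp(1) r.v.'s: E[X_n^2] = 1 + 1/n <= 2 =: C,
# so the sequence is bounded in L^2 (r = 2).
C, r, n = 2.0, 2.0, 20
samples = rng.exponential(1.0, size=(50_000, n)).mean(axis=1)  # draws of X_20

# Markov: P[|X_n| > t] <= E[|X_n|^r] / t^r <= C / t^r, uniformly in n.
t = 2.0
empirical_tail = float(np.mean(np.abs(samples) > t))
markov_bound = C / t**r  # = 0.5
```

The empirical tail probability is far below the Markov bound here; the point is only that the bound is uniform in [math]n[/math], which is what makes [math]L^r[/math]-boundedness useful.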
The strong law of large numbers
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 1}[/math] be a sequence of independent r.v.'s with values in arbitrary measure spaces. For [math]n\geq 1[/math], define the [math]\sigma[/math]-algebra

[math]
\B_n:=\sigma(X_k\mid k\geq n),\qquad \B_\infty:=\bigcap_{n\geq 1}\B_n.
[/math]

[math]\B_\infty[/math] is called the tail [math]\sigma[/math]-algebra of the sequence [math](X_n)_{n\geq 1}[/math].
We can easily see that a r.v. which is [math]\B_\infty[/math]-measurable is a.s. constant; indeed, its distribution function can only take the values 0 and 1.
Define [math]\mathcal{D}_n:=\sigma(X_k\mid k\leq n)[/math]. We have already observed that [math]\mathcal{D}_n[/math] and [math]\B_{n+1}[/math] are independent and hence, since [math]\B_\infty\subset \B_{n+1}[/math], we get that for all [math]n\geq 1[/math], [math]\mathcal{D}_{n}[/math] and [math]\B_{\infty}[/math] are independent as well. This implies that for all [math]A\in\bigcup_{n=1}^\infty\mathcal{D}_n[/math] and all [math]B\in\B_\infty[/math] we get

[math]
\p[A\cap B]=\p[A]\p[B].
[/math]

Since [math]\bigcup_{n=1}^\infty\mathcal{D}_n[/math] is a [math]\pi[/math]-system generating [math]\sigma(X_k\mid k\geq 1)\supset\B_\infty[/math], it follows that [math]\B_\infty[/math] is independent of itself, and hence [math]\p[B]=\p[B]^2\in\{0,1\}[/math] for all [math]B\in\B_\infty[/math].
If [math](X_n)_{n\geq 1}[/math] is a sequence of independent real-valued r.v.'s, then [math]\limsup_{n}\frac{X_1+...+X_n}{n}[/math] [math](\in[-\infty,\infty])[/math] is [math]\B_\infty[/math]-measurable and therefore a.s. constant. In particular, if [math]\frac{1}{n}(X_1+...+X_n)[/math] converges a.s., its limit is a.s. constant.
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 1}[/math] be a sequence of independent real-valued r.v.'s with the same distribution, which is not the Dirac mass at [math]0[/math], and set [math]S_0=0[/math] and [math]S_n=X_1+...+X_n[/math] for [math]n\geq 1[/math]. Then a.s. [math]\sup_n S_n=+\infty[/math] or [math]\inf_n S_n=-\infty[/math].
We first need to show that for [math]p\geq 1[/math] we get [math]\p[-p\leq \inf_n S_n\leq \sup_n S_n\leq p]=0[/math]. This is a good exercise [a]. Now take [math]p\to\infty[/math] to obtain

[math]
\p\left[\sup_n S_n \lt \infty\ \text{and}\ \inf_n S_n \gt -\infty\right]=0,
[/math]

i.e. a.s. [math]\sup_n S_n=+\infty[/math] or [math]\inf_n S_n=-\infty[/math].
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 1}[/math] be a sequence of iid r.v.'s such that [math]X_1\in L^1(\Omega,\A,\p)[/math]. Then

[math]
\lim_{n\to\infty\atop a.s.}\frac{X_1+...+X_n}{n}=\E[X_1].
[/math]
The assumption [math]\E[\vert X_1\vert] \lt \infty[/math] is important. However, if [math]X_1\geq 0[/math] and [math]\E[X_1]=\infty[/math], we can apply the theorem to [math]X_1\land k[/math] for [math]k \gt 0[/math] and let [math]k\to\infty[/math] to obtain that the conclusion also holds with [math]\E[X_1]=\infty[/math], i.e. [math]\frac{S_n}{n}\to\infty[/math] a.s.
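Both regimes, the theorem and the remark, can be simulated (our own sketch; the distributions are chosen purely for illustration): with Exp(1) steps the running mean settles at [math]\E[X_1]=1[/math], while for a Pareto distribution with infinite mean the running mean grows without bound.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Finite mean: iid Exp(1), E[X_1] = 1, so S_n/n -> 1 a.s. by the SLLN.
avg_exp = float(rng.exponential(1.0, size=n).mean())

# Infinite mean: Pareto with tail index 1 (density x^{-2} on [1, oo)),
# so E[X_1] = oo; by the remark, S_n/n -> oo a.s. (here it grows
# roughly like log n).
pareto_steps = 1.0 + rng.pareto(1.0, size=n)  # classical Pareto(alpha=1)
avg_pareto = float(pareto_steps.mean())
```

For this sample size the Pareto running mean is already an order of magnitude above the exponential one, and it keeps growing as [math]n[/math] increases.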
Let [math]S_n=X_1+...+X_n[/math] with [math]S_0=0[/math] and take [math]a \gt \E[X_1][/math]. Define [math]M=\sup_{n\geq 0}(S_n-na)[/math]. We shall show that [math]M \lt \infty[/math] a.s. Since we obviously have [math]S_n\leq na+M[/math], we get [math]\frac{S_n}{n}\leq a+\frac{M}{n}[/math], and since [math]M \lt \infty[/math] a.s. it follows that [math]\limsup_n\frac{S_n}{n}\leq a[/math] a.s. Letting [math]a\searrow \E[X_1][/math] along a sequence, we obtain [math]\limsup_n\frac{S_n}{n}\leq \E[X_1][/math] a.s. Replacing [math](X_n)_{n\geq 1}[/math] with [math](-X_n)_{n\geq 1}[/math], we also get [math]\liminf_n\frac{S_n}{n}\geq \E[X_1][/math] a.s. So it follows that

[math]
\lim_{n\to\infty\atop a.s.}\frac{S_n}{n}=\E[X_1].
[/math]
Hence we only need to show that [math]M \lt \infty[/math] a.s. We first note that [math]\{M \lt \infty\}\in\B_\infty[/math]. Indeed, for all [math]k\geq 0[/math] we get [math]\{M \lt \infty\}=\{\sup_{n\in\N}(S_n-an) \lt \infty\}=\{\sup_{n\geq k}(S_n-S_k-(n-k)a) \lt \infty\}[/math], and the last event only depends on [math](X_n)_{n \gt k}[/math]. So it follows from Kolmogorov's 0-1 law that [math]\p[M \lt \infty]\in\{0,1\}[/math]. Now we need to show that [math]\p[M \lt \infty]=1[/math], or equivalently [math]\p[M=\infty] \lt 1[/math]. We do it by contradiction. For [math]k\in\N[/math], set [math]M_k=\sup_{0\leq n\leq k}(S_n-na)[/math] and [math]M'_k=\sup_{0\leq n\leq k}(S_{n+1}-S_1-na)[/math]. Then [math]M_k[/math] and [math]M'_k[/math] have the same distribution. Indeed, [math](X_1,...,X_k)[/math] and [math](X_2,...,X_{k+1})[/math] have the same distribution, and [math]M_k=F_k(X_1,...,X_k)[/math] and [math]M'_k=F_k(X_2,...,X_{k+1})[/math] for some measurable map [math]F_k:\R^k\to\R[/math]. Moreover, [math]M=\lim_{k\to\infty}\uparrow M_k[/math], and we set [math]M':=\lim_{k\to\infty}\uparrow M'_k[/math]. Since [math]M_k[/math] and [math]M_k'[/math] have the same distribution, [math]M[/math] and [math]M'[/math] also have the same distribution. Indeed, [math]\p[M'\leq x]=\lim_{k\to\infty}\downarrow \p[M_k'\leq x]=\lim_{k\to\infty}\downarrow \p[M_k\leq x]=\p[M\leq x][/math], so [math]M[/math] and [math]M'[/math] have the same distribution function. Moreover, [math]M_{k+1}=\sup\{0,\sup_{1\leq n\leq k+1}(S_n-na)\}=\sup\{0,M_k'+X_1-a\}[/math], which implies that [math]M_{k+1}=M_k'-\inf\{a-X_1,M_k'\}[/math]. Now we can use the fact that [math]M_k[/math] and [math]M_k'[/math] are integrable (as suprema of finitely many integrable r.v.'s) to obtain

[math]
\E[\inf\{a-X_1,M_k'\}]=\E[M_k']-\E[M_{k+1}]=\E[M_k]-\E[M_{k+1}]\leq 0,
[/math]

since [math](M_k)_{k\geq 0}[/math] is increasing. Since [math]M_k'\geq 0[/math], we have [math]\vert \inf\{a-X_1,M_k'\}\vert\leq \vert a-X_1\vert\in L^1[/math], so by dominated convergence [math]\E[\inf\{a-X_1,M'\}]\leq 0[/math]. If now [math]\p[M=\infty]=1[/math], then also [math]\p[M'=\infty]=1[/math], hence [math]\inf\{a-X_1,M'\}=a-X_1[/math] a.s. and therefore [math]a-\E[X_1]\leq 0[/math], contradicting [math]a \gt \E[X_1][/math]. Thus [math]\p[M \lt \infty]=1[/math].
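The key object [math]M=\sup_{n\geq 0}(S_n-na)[/math] can also be probed numerically (our illustration; we pick Exp(1) steps so that [math]\E[X_1]=1[/math], and [math]a=1.5 \gt \E[X_1][/math]): the negative drift makes the supremum finite and attained very early on any long trajectory.

```python
import numpy as np

rng = np.random.default_rng(4)

# One long trajectory of S_n - n*a with iid Exp(1) steps and a = 1.5.
a, steps = 1.5, 1_000_000
x = rng.exponential(1.0, size=steps)
drifted = np.cumsum(x) - a * np.arange(1, steps + 1)  # S_n - n*a, n >= 1

# M = sup_{n >= 0} (S_n - n*a); the n = 0 term contributes the value 0.
M = max(0.0, float(drifted.max()))
argmax = int(drifted.argmax()) + 1  # index n where S_n - n*a peaks

# With drift E[X_1] - a = -0.5 per step, S_n - n*a tends to -infinity,
# so the supremum M is finite and the peak occurs near the start.
```

On this path [math]M[/math] is a small finite number and the maximizing index sits near the beginning, matching the proof's conclusion that [math]M \lt \infty[/math] a.s. whenever [math]a \gt \E[X_1][/math].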
General references
Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].