[math] \newcommand{\R}{\mathbb{R}} \newcommand{\A}{\mathcal{A}} \newcommand{\B}{\mathcal{B}} \newcommand{\N}{\mathbb{N}} \newcommand{\C}{\mathbb{C}} \newcommand{\Rbar}{\overline{\mathbb{R}}} \newcommand{\Bbar}{\overline{\mathcal{B}}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\E}{\mathbb{E}} \newcommand{\p}{\mathbb{P}} \newcommand{\one}{\mathds{1}} \newcommand{\0}{\mathcal{O}} \newcommand{\mat}{\textnormal{Mat}} \newcommand{\sign}{\textnormal{sign}} \newcommand{\CP}{\mathcal{P}} \newcommand{\CT}{\mathcal{T}} \newcommand{\CY}{\mathcal{Y}} \newcommand{\F}{\mathcal{F}} \newcommand{\mathds}{\mathbb}[/math]
Theorem (Central limit theorem (CLT))

Let [math](\Omega,\A,\p)[/math] be a probability space and let [math](X_n)_{n\geq 1}[/math] be a sequence of i.i.d. r.v.'s with values in [math]\R[/math]. We assume that [math]\E[X_k^2] \lt \infty[/math] (i.e. [math]X_k\in L^2(\Omega,\A,\p)[/math]) and write [math]\sigma^2=Var(X_k)[/math]; since the [math]X_k[/math] are identically distributed, [math]\E[X_k][/math] and [math]\sigma^2[/math] do not depend on [math]k[/math]. Then

[[math]] \sqrt{n}\left(\frac{1}{n}\sum_{i=1}^nX_i-\E[X_k]\right)\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,\sigma^2). [[/math]]
Equivalently, for all [math]a,b\in\bar\R[/math] with [math]a \lt b[/math] we get
[[math]] \lim_{n\to\infty}\p\left[\sum_{i=1}^nX_i\in \left[n\E[X_k]+a\sqrt{n},n\E[X_k]+b\sqrt{n}\right]\right]=\frac{1}{\sigma\sqrt{2\pi}}\int_a^be^{-\frac{x^2}{2\sigma^2}}dx. [[/math]]


Proof

Without loss of generality we can assume that [math]\E[X_k]=0[/math] for all [math]k[/math] (otherwise replace [math]X_k[/math] by [math]X_k-\E[X_k][/math]). Define the sequence [math]Z_n=\frac{\sum_{i=1}^nX_i}{\sqrt{n}}[/math]. Using independence and then the fact that the [math]X_j[/math] are identically distributed, we obtain

[[math]] \Phi_{Z_n}(\xi)=\E\left[e^{i\xi Z_n}\right]=\E\left[e^{i\xi \left(\frac{\sum_{i=1}^nX_i}{\sqrt{n}}\right)}\right]=\prod_{j=1}^n\E\left[e^{i\xi \frac{X_j}{\sqrt{n}}}\right]=\E\left[e^{i\xi\frac{X_k}{\sqrt{n}}}\right]^n=\Phi_{X_k}\left(\frac{\xi}{\sqrt{n}}\right)^n. [[/math]]
We have already seen that, as [math]\xi\to 0[/math],
[[math]] \Phi_{X_k}(\xi)=1+i\xi\E[X_k]-\frac{1}{2}\xi^2\E[X_k^2]+o(\xi^2)=1-\frac{\sigma^2}{2}\xi^2+o(\xi^2). [[/math]]
Hence, for fixed [math]\xi[/math], we have
[[math]] \Phi_{X_k}\left(\frac{\xi}{\sqrt{n}}\right)=1-\frac{\sigma^2\xi^2}{2n}+o\left(\frac{1}{n}\right), [[/math]]
and therefore
[[math]] \lim_{n\to\infty}\Phi_{Z_n}(\xi)=\lim_{n\to\infty}\left(1-\frac{\sigma^2\xi^2}{2n}+o\left(\frac{1}{n}\right)\right)^n=e^{-\frac{\sigma^2\xi^2}{2}}=\E\left[e^{i\xi \mathcal{N}(0,\sigma^2)}\right]. [[/math]]
The right-hand side is the characteristic function of [math]\mathcal{N}(0,\sigma^2)[/math], so Lévy's continuity theorem gives [math]Z_n\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,\sigma^2)[/math].
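As a quick numerical check of this limit (a sketch added here, not part of the original notes): for fair [math]\pm 1[/math] coin flips we have [math]\E[X_k]=0[/math], [math]\sigma^2=1[/math] and [math]\Phi_{X_k}(t)=\cos t[/math], so [math]\Phi_{Z_n}(\xi)=\cos(\xi/\sqrt{n})^n[/math] should approach [math]e^{-\xi^2/2}[/math]:

<syntaxhighlight lang="python">
# Characteristic-function limit from the proof, for X = +-1 fair coin flips
# (sigma^2 = 1, Phi_X(t) = cos t). A sketch; requires numpy.
import numpy as np

xi = 1.7                                   # any fixed frequency
limit = np.exp(-xi**2 / 2)                 # Gaussian characteristic function
for n in [10, 100, 1000, 10000]:
    phi_zn = np.cos(xi / np.sqrt(n)) ** n  # Phi_{Z_n}(xi) = Phi_X(xi/sqrt(n))^n
    print(n, phi_zn, abs(phi_zn - limit))  # error shrinks as n grows
</syntaxhighlight>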

Example

Equivalently, after recentring by the mean we get, for all [math]a,b\in\bar\R[/math] with [math]a \lt b[/math],

[[math]] \lim_{n\to\infty}\p\left[\frac{\sum_{i=1}^nX_i-n\E[X_k]}{\sqrt{n}}\in[a,b]\right]=\frac{1}{\sigma\sqrt{2\pi}}\int_a^be^{-\frac{x^2}{2\sigma^2}}dx. [[/math]]
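This interval form is easy to test by simulation. A minimal sketch (assuming i.i.d. uniform summands on [math][-1,1][/math], so [math]\E[X_k]=0[/math] and [math]\sigma^2=1/3[/math]; requires numpy):

<syntaxhighlight lang="python">
# Monte Carlo check of the interval form of the CLT for centred uniforms.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n, trials, a, b = 500, 20000, -1.0, 1.0
X = rng.uniform(-1, 1, size=(trials, n))      # E[X] = 0, sigma^2 = 1/3
Z = X.sum(axis=1) / sqrt(n)                   # (sum - n*E[X]) / sqrt(n)
emp = np.mean((Z >= a) & (Z <= b))            # empirical probability
sigma = sqrt(1 / 3)                           # N(0, sigma^2) mass of [a, b]:
gauss = 0.5 * (erf(b / (sigma * sqrt(2))) - erf(a / (sigma * sqrt(2))))
print(emp, gauss)                             # the two should nearly agree
</syntaxhighlight>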

[Figures: the density of a Gaussian r.v. [math]X\sim\mathcal{N}(\mu,\sigma^2)[/math] with [math]\mu=100[/math] and [math]\sigma=10[/math], followed by histograms of the sample mean [math]\bar X_n=\frac{1}{n}\sum_{k=1}^nX_k[/math] of [math]n=5, 20, 100, 1000[/math] i.i.d. copies; the distribution of [math]\bar X_n[/math] concentrates around [math]\mu[/math] as [math]n\to\infty[/math].]
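Histograms of this kind can be regenerated with a short script (a sketch using the parameters [math]\mu=100[/math], [math]\sigma=10[/math] from the captions; requires numpy and matplotlib):

<syntaxhighlight lang="python">
# Histograms of the sample mean of n i.i.d. N(100, 10^2) r.v.'s,
# for n = 5, 20, 100, 1000, as in the figures above.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
mu, sigma, trials = 100, 10, 20000
fig, axes = plt.subplots(1, 4, figsize=(16, 3), sharex=True)
for ax, n in zip(axes, [5, 20, 100, 1000]):
    means = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)
    ax.hist(means, bins=60, density=True)  # spread shrinks like sigma/sqrt(n)
    ax.set_title(f"n = {n}")
plt.show()
</syntaxhighlight>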
Theorem

Let [math](X_n)_{n\geq 1}[/math] be independent, but not necessarily identically distributed, r.v.'s. We assume that [math]\E[X_j]=0[/math] and that [math]\E[X_j^2]=\sigma_j^2 \lt \infty[/math] for all [math]j\geq 1[/math]. Assume further that [math]\sup_n\E[\vert X_n\vert^{2+\delta}] \lt \infty[/math] for some [math]\delta \gt 0[/math], and that [math]\sum_{j=1}^\infty \sigma_j^2 = \infty.[/math] Then

[[math]] \frac{\sum_{j=1}^nX_j}{\sqrt{\sum_{j=1}^n \sigma_j^2}}\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,1). [[/math]]
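A simulation sketch for this non-identically-distributed case (the summands are my choice, not from the notes: [math]X_j=\pm s_j[/math] with equal probability, where [math]s_j[/math] cycles through [math]1,2,3[/math], so that [math]\sigma_j=s_j[/math], the [math]2+\delta[/math] moments are bounded and [math]\sum_j\sigma_j^2=\infty[/math]):

<syntaxhighlight lang="python">
# Normalised sums of independent but non-identically distributed X_j.
import numpy as np

rng = np.random.default_rng(2)
n, trials = 1000, 20000
s = 1 + (np.arange(n) % 3)                   # scales 1, 2, 3, 1, 2, ...
X = s * rng.choice([-1, 1], size=(trials, n))
T = X.sum(axis=1) / np.sqrt(np.sum(s**2))    # sum / sqrt(sum of sigma_j^2)
print(T.mean(), T.std())                     # approx 0 and 1
print(np.mean(T <= 1.0))                     # compare with Phi(1) ~ 0.8413
</syntaxhighlight>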

Example

We give the following examples:

  • Let [math](X_n)_{n\geq 1}[/math] be i.i.d. r.v.'s with [math]\p[X_n=1]=p[/math] and [math]\p[X_n=0]=1-p[/math]. Then [math]S_n=\sum_{i=1}^nX_i[/math] is a binomial r.v. [math]\B(p,n)[/math], with [math]\E[S_n]=np[/math] and [math]Var(S_n)=np(1-p)[/math]. The strong law of large numbers gives [math]\frac{S_n}{n}\xrightarrow{n\to\infty\atop a.s.}p[/math], and the central limit theorem gives (see the first sketch after this list)
    [[math]] \frac{S_n-np}{\sqrt{np(1-p)}}\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,1). [[/math]]
  • Let [math]\mathcal{P}[/math] be the set of prime numbers. For [math]p\in\mathcal{P}[/math], define a r.v. [math]\B_p[/math] with [math]\p[\B_p=1]=\frac{1}{p}[/math] and [math]\p[\B_p=0]=1-\frac{1}{p}[/math]. We take the [math](\B_p)_{p\in\mathcal{P}}[/math] to be independent and set
    [[math]] W_n=\sum_{p\leq n\atop p\in\mathcal{P}}\B_p, [[/math]]
    a probabilistic model for [math]W(n)[/math], the total number of distinct prime divisors of [math]n[/math] (defined in the theorem below). It is a simple exercise to check that [math](W_n)_{n\geq 1}[/math] satisfies the assumptions of the previous theorem, and using the fact that [math]\sum_{p\leq n\atop p\in\mathcal{P}}\frac{1}{p}\sim \log\log n[/math] we obtain (see the second sketch after this list)
    [[math]] \frac{W_n-\log\log n}{\sqrt{\log\log n}}\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,1). [[/math]]
    [Figures: an illustration of the CLT for i.i.d. exponentially distributed r.v.'s [math](X_n)[/math], for [math]n=2, 4, 10, 20[/math] summands. Left panels: the density of the summands/sum in black against the density of a Gaussian r.v. [math]Y\sim\mathcal{N}(0,\sigma^2)[/math] in red; right panels: the corresponding cumulative distribution functions. Both the density and the cumulative distribution function converge to the Gaussian curves as [math]n\to\infty[/math].]
    Theorem (Erdős–Kac)

    Let [math]N_n[/math] be a r.v. uniformly distributed in [math]\{1,...,n\}[/math]. Then

    [[math]] \frac{W(N_n)-\log\log n}{\sqrt{\log\log n}}\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,1), [[/math]]
    where [math]W(n)=\sum_{p\leq n\atop p\in\mathcal{P}}\one_{p|n}[/math].

  • Suppose that [math](X_n)_{n\geq 1}[/math] are i.i.d. r.v.'s with distribution function [math]F(x)=\p[X_1\leq x][/math]. Let [math]Y_n(x)=\one_{X_n\leq x}[/math]; for fixed [math]x[/math] the [math](Y_n(x))_{n\geq 1}[/math] are i.i.d. Define [math]F_n(x)=\frac{1}{n}\sum_{k=1}^nY_k(x)=\frac{1}{n}\sum_{k=1}^n\one_{X_k\leq x}[/math]; [math]F_n[/math] is called the empirical distribution function (see the third sketch after this list). With the strong law of large numbers we get [math]F_n(x)\xrightarrow{n\to\infty\atop a.s.}\E[Y_1(x)][/math] and
    [[math]] \E[Y_1(x)]=\E[\one_{X_1\leq x}]=\p[X_1\leq x]=F(x). [[/math]]
    In fact, a stronger statement, the Glivenko–Cantelli theorem, says that
    [[math]] \sup_{x\in\R}\vert F_n(x)-F(x)\vert\xrightarrow{n\to\infty\atop a.s.}0. [[/math]]
    Next we note that
    [[math]] \sqrt{n}(F_n(x)-F(x))=\sqrt{n}\left(\frac{1}{n}\sum_{k=1}^nY_k(x)-\E[Y_1(x)]\right)=\frac{1}{\sqrt{n}}\left(\sum_{k=1}^nY_k(x)-n\E[Y_1(x)]\right). [[/math]]
    Now with the central limit theorem we get
    [[math]] \sqrt{n}(F_n(x)-F(x))\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,\sigma^2(x)), [[/math]]
    where [math]\sigma^2(x)=Var(Y_1(x))=\E[Y_1^2(x)]-\E[Y_1(x)]^2=F(x)-F(x)^2=F(x)(1-F(x))[/math], using [math]Y_1^2(x)=Y_1(x)[/math]. Hence
    [[math]] \sqrt{n}(F_n(x)-F(x))\xrightarrow{n\to\infty\atop law}\mathcal{N}(0,F(x)(1-F(x))). [[/math]]
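The binomial example above can be checked numerically (first sketch, not part of the original notes; requires numpy):

<syntaxhighlight lang="python">
# De Moivre-Laplace: standardised Binomial(n, p) against N(0, 1).
import numpy as np

rng = np.random.default_rng(3)
p, n, trials = 0.3, 5000, 100000
S = rng.binomial(n, p, size=trials)       # S_n = sum of n Bernoulli(p)
T = (S - n * p) / np.sqrt(n * p * (1 - p))
print(T.mean(), T.std())                  # approx 0 and 1
print(np.mean(T <= 1.96))                 # compare with Phi(1.96) ~ 0.975
</syntaxhighlight>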
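The second example and the Erdős–Kac theorem can be explored empirically (second sketch; uses sympy's primefactors, and note that [math]\log\log n[/math] grows so slowly that the agreement is rough even for [math]n=10^6[/math]):

<syntaxhighlight lang="python">
# Distinct prime divisors W(N) of a uniform N in {1, ..., n}, standardised.
import numpy as np
from sympy import primefactors

rng = np.random.default_rng(4)
n, trials = 10**6, 5000
N = rng.integers(1, n + 1, size=trials)
W = np.array([len(primefactors(int(m))) for m in N])  # distinct prime divisors
T = (W - np.log(np.log(n))) / np.sqrt(np.log(np.log(n)))
print(T.mean(), T.std())   # slowly approaches 0 and 1 as n grows
</syntaxhighlight>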
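Finally, the CLT for the empirical distribution function (third sketch; i.i.d. Exp(1) samples are an assumed test case, any distribution with known [math]F[/math] would do):

<syntaxhighlight lang="python">
# sqrt(n) * (F_n(x) - F(x)) should have standard deviation sqrt(F(x)(1-F(x))).
import numpy as np

rng = np.random.default_rng(5)
n, trials, x = 1000, 10000, 1.0
F = 1 - np.exp(-x)                    # true cdf of Exp(1) at x
X = rng.exponential(size=(trials, n))
Fn = (X <= x).mean(axis=1)            # empirical distribution function at x
Z = np.sqrt(n) * (Fn - F)
print(Z.std(), np.sqrt(F * (1 - F)))  # the two should nearly agree
</syntaxhighlight>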
Theorem (Berry-Esseen)

Let [math](X_n)_{n\geq 1}[/math] be i.i.d. r.v.'s and suppose that [math]\E[\vert X_1\vert^3] \lt \infty[/math]. Let

[[math]] G_n(x)=\p\left[\frac{\sum_{i=1}^nX_i-n\E[X_1]}{\sigma\sqrt{n}}\leq x\right], [[/math]]

where [math]\sigma^2=Var(X_1)[/math] and [math]\Phi(x)=\p\left[\mathcal{N}(0,1)\leq x\right]=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^xe^{-\frac{u^2}{2}}du[/math]. Then

[[math]] \sup_{x\in\R}\vert G_n(x)-\Phi(x)\vert\leq C\frac{\E[\vert X_1-\E[X_1]\vert^3]}{\sigma^3\sqrt{n}}, [[/math]]
where [math]C[/math] is a universal constant.
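The bound can be compared with simulation (a sketch for centred Exp(1) summands, so [math]\sigma=1[/math]; the constant [math]0.4748[/math] used below is one admissible value from the literature, while the notes only assert that [math]C[/math] is universal):

<syntaxhighlight lang="python">
# Empirical sup-distance between G_n and Phi versus the Berry-Esseen bound.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(6)
n, trials = 50, 200000
X = rng.exponential(size=(trials, n)) - 1.0     # E[X] = 0, sigma = 1
T = X.sum(axis=1) / sqrt(n)
grid = np.linspace(-3, 3, 121)
Gn = np.array([(T <= t).mean() for t in grid])  # empirical G_n on a grid
Phi = 0.5 * (1 + np.array([erf(t / sqrt(2)) for t in grid]))
rho = np.mean(np.abs(X) ** 3)                   # Monte Carlo E|X_1|^3
print(np.abs(Gn - Phi).max(), 0.4748 * rho / sqrt(n))
</syntaxhighlight>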

General references

Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].