Independence
Independent events
Let [math](\Omega,\A,\p)[/math] be a probability space. If [math]A,B\in\A[/math], we say that [math]A[/math] and [math]B[/math] are independent if
[[math]] \p[A\cap B]=\p[A]\p[B]. [[/math]]
Example
[Throw of a die] We have the state space [math]\Omega=\{1,2,3,4,5,6\}[/math] with [math]\p[\{\omega\}]=\frac{1}{6}[/math] for every [math]\omega\in\Omega[/math]. Now let [math]A=\{1,2\}[/math] and [math]B=\{1,3,5\}[/math]. Then
[[math]] \p[A]=\frac{1}{3},\qquad \p[B]=\frac{1}{2},\qquad A\cap B=\{1\}. [[/math]]
Therefore we get
[[math]] \p[A\cap B]=\frac{1}{6}=\frac{1}{3}\cdot\frac{1}{2}=\p[A]\p[B]. [[/math]]
Hence we get that [math]A[/math] and [math]B[/math] are independent.
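As a quick numerical sanity check, the three probabilities can be estimated by simulation. Below is a minimal sketch using numpy; the seed and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
throws = rng.integers(1, 7, size=10**6)   # fair die: values 1..6

A = np.isin(throws, [1, 2])               # event A = {1, 2}
B = np.isin(throws, [1, 3, 5])            # event B = {1, 3, 5}

print((A & B).mean())                     # ~ 1/6, estimates P[A n B]
print(A.mean() * B.mean())                # ~ (1/3)(1/2) = 1/6, estimates P[A]P[B]
```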
We say that the [math]n[/math] events [math]A_1,...,A_n\in\A[/math] are independent if [math]\forall \{j_1,...,j_l\}\subset\{1,...,n\}[/math] we have
[[math]] \p[A_{j_1}\cap\dotsm\cap A_{j_l}]=\p[A_{j_1}]\dotsm\p[A_{j_l}]. [[/math]]
It is not enough to have [math]\p[A_1\cap\dotsm\cap A_n]=\p[A_1]\dotsm\p[A_n][/math]. It is also not enough to check that [math]\forall \{i,j\}\subset\{1,...,n\}[/math], [math]\p[A_i\cap A_j]=\p[A_i]\p[A_j][/math]. For instance, let us consider two tosses of a fair coin and consider the events [math]A=\{\text{first toss is heads}\}[/math], [math]B=\{\text{second toss is heads}\}[/math] and [math]C=\{\text{both tosses give the same result}\}[/math]. Each pair is independent, since [math]\p[A\cap B]=\p[A\cap C]=\p[B\cap C]=\frac{1}{4}[/math], but [math]\p[A\cap B\cap C]=\frac{1}{4}\not=\frac{1}{8}=\p[A]\p[B]\p[C][/math], so the three events are not independent.
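Since the state space of two coin tosses is finite, these claims can be verified by direct enumeration. The following sketch does so in plain Python; the encoding of heads and tails as "H" and "T" is our own choice.

```python
from itertools import product

omega = list(product("HT", repeat=2))     # two tosses of a fair coin
P = 1 / len(omega)                        # each outcome has probability 1/4

A = {w for w in omega if w[0] == "H"}     # first toss is heads
B = {w for w in omega if w[1] == "H"}     # second toss is heads
C = {w for w in omega if w[0] == w[1]}    # both tosses give the same result

def p(E):                                 # probability of an event E
    return len(E) * P

for E, F in [(A, B), (A, C), (B, C)]:
    print(p(E & F), p(E) * p(F))          # pairwise: 0.25 = 0.25
print(p(A & B & C), p(A) * p(B) * p(C))   # 0.25 vs 0.125: not independent
```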
The [math]n[/math] events [math]A_1,...,A_n\in\A[/math] are independent if and only if
[[math]] \p[B_1\cap\dotsm\cap B_n]=\p[B_1]\dotsm\p[B_n] [[/math]]
for every choice of [math]B_i\in\{A_i,\Omega\}[/math], [math]i\in\{1,...,n\}[/math].
If the above is satisfied and if [math]\{j_1,...,j_l\}\subset\{1,...,n\}[/math], then for [math]i\in\{j_1,...,j_l\}[/math] take [math]B_i=A_i[/math] and for [math]i\not\in\{j_1,...,j_l\}[/math] take [math]B_i=\Omega[/math]. So it follows that
[[math]] \p[A_{j_1}\cap\dotsm\cap A_{j_l}]=\p[B_1\cap\dotsm\cap B_n]=\p[B_1]\dotsm\p[B_n]=\p[A_{j_1}]\dotsm\p[A_{j_l}], [[/math]]
since [math]\p[\Omega]=1[/math]. Conversely, if the [math]A_i[/math]'s are independent, the displayed identity follows by applying the definition to the subset of indices [math]i[/math] with [math]B_i=A_i[/math].
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]A,B\in\A[/math] such that [math]\p[B] \gt 0[/math]. The conditional probability of [math]A[/math] given [math]B[/math] is then defined as
[[math]] \p[A\mid B]=\frac{\p[A\cap B]}{\p[B]}. [[/math]]
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]A,B\in\A[/math] and suppose that [math]\p[B] \gt 0[/math].
- [math]A[/math] and [math]B[/math] are independent if and only if
[[math]] \p[A\mid B]=\p[A]. [[/math]]
- The map
[[math]] \A\to [0,1],A\mapsto \p[A\mid B] [[/math]]defines a new probability measure on [math]\A[/math] called the conditional probability given [math]B[/math].
We need to show both points.
- If [math]A[/math] and [math]B[/math] are independent, then
[[math]] \p[A\mid B]=\frac{\p[A\cap B]}{\p[B]}=\frac{\p[A]\p[B]}{\p[B]}=\p[A] [[/math]]and conversely if [math]\p[A\mid B]=\p[A][/math], we get that[[math]] \p[A\cap B]=\p[A]\p[B], [[/math]]and hence [math]A[/math] and [math]B[/math] are independent.
- Let [math]\Q[A]=\p[A\mid B][/math]. We have
[[math]] \Q[\Omega]=\p[\Omega\mid B]=\frac{\p[\Omega\cap B]}{\p[B]}=\frac{\p[B]}{\p[B]}=1. [[/math]]Take [math](A_n)_{n\geq 1}\subset \A[/math] as a disjoint family of events. Then[[math]] \begin{align*} \Q\left[\bigcup_{n\geq 1}A_n\right]&=\p\left[\bigcup_{n\geq 1}A_n\mid B\right]=\frac{\p\left[\left(\bigcup_{n\geq 1}A_n\right)\cap B\right]}{\p[B]}=\frac{\p\left[\bigcup_{n\geq 1}(A_n\cap B)\right]}{\p[B]}\\ &=\sum_{n\geq 1}\frac{\p[A_n\cap B]}{\p[B]}=\sum_{n\geq 1}\Q[A_n]. \end{align*} [[/math]]
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]A_1,...,A_n\in\A[/math] with [math]\p[A_1\cap\dotsm\cap A_n] \gt 0[/math]. Then
[[math]] \p[A_1\cap\dotsm\cap A_n]=\p[A_1]\p[A_2\mid A_1]\p[A_3\mid A_1\cap A_2]\dotsm\p[A_n\mid A_1\cap\dotsm\cap A_{n-1}]. [[/math]]
Note that all the conditional probabilities are well defined, since [math]\p[A_1]\geq \p[A_1\cap A_2]\geq\dotsm\geq\p[A_1\cap\dotsm\cap A_n] \gt 0[/math].
We prove this by induction. For [math]n=2[/math] it is just the definition of the conditional probability. Now we want to go from [math]n-1[/math] to [math]n[/math]. Therefore set [math]B=A_1\cap \dotsm\cap A_{n-1}[/math]. Then
[[math]] \p[A_1\cap\dotsm\cap A_n]=\p[B\cap A_n]=\p[B]\p[A_n\mid B], [[/math]]
and applying the induction hypothesis to [math]\p[B]=\p[A_1\cap\dotsm\cap A_{n-1}][/math] gives the claim.
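As a concrete illustration of the formula, the probability of drawing three aces in a row from a 52-card deck without replacement is [math]\frac{4}{52}\cdot\frac{3}{51}\cdot\frac{2}{50}[/math] by the chain rule. The following Monte Carlo sketch compares this value with a simulation; the trial count is an arbitrary choice, so the estimate is rough.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 10**5
hits = 0
for _ in range(trials):
    cards = rng.choice(52, size=3, replace=False)  # three cards, no replacement
    hits += (cards < 4).all()                      # cards 0..3 play the role of aces

print(hits / trials)          # rough Monte Carlo estimate
print(4/52 * 3/51 * 2/50)     # chain-rule value, about 1.81e-4
```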
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]\left(E_{n}\right)_{n\geq 1}[/math] be a finite or countable measurable partition of [math]\Omega[/math], such that [math]\p[E_n] \gt 0[/math] for all [math]n[/math]. If [math]A\in\A[/math], then
[[math]] \p[A]=\sum_{n\geq 1}\p[A\mid E_n]\p[E_n]. [[/math]]
Note that
[[math]] \p[A]=\p\left[A\cap\bigcup_{n\geq 1}E_n\right]=\p\left[\bigcup_{n\geq 1}(A\cap E_n)\right]=\sum_{n\geq 1}\p[A\cap E_n]=\sum_{n\geq 1}\p[A\mid E_n]\p[E_n], [[/math]]
where we have used that the [math]E_n[/math]'s are disjoint.
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](E_n)_{n\geq 1}[/math] be a finite or countable measurable partition of [math]\Omega[/math] with [math]\p[E_n] \gt 0[/math] for all [math]n[/math], and assume that [math]\p[A] \gt 0.[/math] Then
[[math]] \p[E_n\mid A]=\frac{\p[A\mid E_n]\p[E_n]}{\sum_{m\geq 1}\p[A\mid E_m]\p[E_m]}. [[/math]]
By the previous theorem we know that
[[math]] \p[A]=\sum_{m\geq 1}\p[A\mid E_m]\p[E_m], [[/math]]
and hence
[[math]] \p[E_n\mid A]=\frac{\p[E_n\cap A]}{\p[A]}=\frac{\p[A\mid E_n]\p[E_n]}{\sum_{m\geq 1}\p[A\mid E_m]\p[E_m]}. [[/math]]
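As a numerical illustration of the last two theorems, here is a small sketch with a two-set partition E_1 = "ill", E_2 = "healthy" and A = "test positive"; all the numbers are invented for the example.

```python
# Invented numbers for illustration only.
p_E = [0.01, 0.99]            # P[E_1], P[E_2]: assumed prevalence
p_A_given_E = [0.99, 0.05]    # P[A | E_1], P[A | E_2]: assumed test behaviour

# law of total probability: P[A] = sum_n P[A | E_n] P[E_n]
p_A = sum(pa * pe for pa, pe in zip(p_A_given_E, p_E))

# Bayes: P[E_1 | A] = P[A | E_1] P[E_1] / P[A]
print(p_A_given_E[0] * p_E[0] / p_A)   # ~ 0.167: most positive tests come from E_2
```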
Independent Random Variables and independent [math]\sigma[/math]-Algebras
Let [math](\Omega,\A,\p)[/math] be a probability space. We say that the sub [math]\sigma[/math]-Algebras [math]\B_1,...,\B_n[/math] of [math]\A[/math] are independent if for all [math] A_1\in\B_1,..., A_n\in\B_n[/math] we get
[[math]] \p[A_1\cap\dotsm\cap A_n]=\p[A_1]\dotsm\p[A_n]. [[/math]]
If [math]\B_1,...,\B_n[/math] are [math]n[/math] independent sub [math]\sigma[/math]-Algebras and if [math]X_1,...,X_n[/math] are r.v.'s such that [math]X_i[/math] is [math]\B_i[/math]-measurable for all [math]i\in\{1,...,n\}[/math], then [math]X_1,...,X_n[/math] are independent r.v.'s (this comes from the fact that for all [math]i\in\{1,...,n\}[/math] we have [math]\sigma(X_i)\subset \B_i[/math]).
The [math]n[/math] events [math]A_1,...,A_n\in\A[/math] are independent if and only if [math]\sigma(A_1),...,\sigma(A_n)[/math] are independent.
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]X_1,...,X_n[/math] be [math]n[/math] r.v.'s. Then [math]X_1,...,X_n[/math] are independent if and only if the law of the vector [math](X_1,...,X_n)[/math] is the product of the laws of [math]X_1,...,X_n[/math], i.e.
[[math]] \p_{(X_1,...,X_n)}=\p_{X_1}\otimes\dotsm\otimes\p_{X_n}. [[/math]]
Let [math]F_i\in\mathcal{E}_i[/math] for all [math]i\in\{1,...,n\}[/math], where [math](E_i,\mathcal{E}_i)[/math] denotes the measurable space in which [math]X_i[/math] takes its values. Thus we have
[[math]] \p_{(X_1,...,X_n)}(F_1\times\dotsm\times F_n)=\p[X_1\in F_1,...,X_n\in F_n]=\prod_{i=1}^n\p[X_i\in F_i]=\left(\p_{X_1}\otimes\dotsm\otimes\p_{X_n}\right)(F_1\times\dotsm\times F_n), [[/math]]
where the middle equality is exactly the independence of the [math]X_i[/math]'s. Since the product sets [math]F_1\times\dotsm\times F_n[/math] form a class stable under finite intersection which generates the product [math]\sigma[/math]-Algebra, the two measures agree, and reading the same computation backwards gives the converse.
We see from the proof above that as soon as for all [math]i\in\{1,...,n\}[/math] we have [math]\E[\vert f_i(X_i)\vert] \lt \infty[/math], it follows that
[[math]] \E\left[\prod_{i=1}^nf_i(X_i)\right]=\prod_{i=1}^n\E[f_i(X_i)]. [[/math]]
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]X_1[/math] and [math]X_2[/math] be two independent r.v.'s in [math]L^2(\Omega,\A,\p)[/math]. Then we get
[[math]] \E[X_1X_2]=\E[X_1]\E[X_2],\qquad\text{i.e.}\qquad Cov(X_1,X_2)=0. [[/math]]
Recall that if [math]X\in L^2(\Omega,\A,\p)[/math], we also have that [math]X\in L^1(\Omega,\A,\p)[/math]. Thus, by the Cauchy-Schwarz inequality, [math]X_1X_2\in L^1(\Omega,\A,\p)[/math], and by independence and Fubini we get
[[math]] \E[X_1X_2]=\iint_{\R\times\R}x_1x_2\,\p_{X_1}(dx_1)\p_{X_2}(dx_2)=\int_\R x_1\p_{X_1}(dx_1)\int_\R x_2\p_{X_2}(dx_2)=\E[X_1]\E[X_2]. [[/math]]
Note that the converse is not true! Let [math]X_1\sim\mathcal{N}(0,1)[/math] and [math]X_2=X_1^2[/math]. Then [math]\E[X_1X_2]=\E[X_1^3]=0=\E[X_1]\E[X_2][/math], so [math]X_1[/math] and [math]X_2[/math] are uncorrelated, although they are clearly not independent. Instead of a Gaussian, we can also take for [math]X_1[/math] any symmetric r.v. in [math]L^2(\Omega,\A,\p)[/math] with density [math]P(x)[/math], such that [math]P(-x)=P(x)[/math] (provided [math]\E[\vert X_1\vert^3] \lt \infty[/math], so that [math]\E[X_1^3]=0[/math] makes sense). Recall that being in [math]L^2(\Omega,\A,\p)[/math] simply means
[[math]] \E[X^2]=\int_\R x^2P(x)dx \lt \infty. [[/math]]
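This counterexample is easy to probe by simulation; the sketch below (numpy, arbitrary seed and sample size) shows that [math]X_1[/math] and [math]X_2=X_1^2[/math] are uncorrelated while failing the product rule on a concrete pair of events.

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.standard_normal(10**6)    # symmetric r.v. in L^2
X2 = X1**2                         # a function of X1, hence not independent of it

# uncorrelated: E[X1 X2] - E[X1] E[X2] ~ 0
print(np.mean(X1 * X2) - np.mean(X1) * np.mean(X2))

# not independent: P[|X1| <= 1, X2 <= 1] differs from the product
A = np.abs(X1) <= 1
B = X2 <= 1
print(np.mean(A & B), np.mean(A) * np.mean(B))   # ~ 0.68 vs ~ 0.47
```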
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]X_1,...,X_n[/math] be [math]n[/math] r.v.'s with values in [math]\R[/math].
- Assume that for [math]i\in \{1,...,n\}[/math], [math]\p_{X_i}[/math] has density [math]P_i[/math] and that the r.v.'s [math]X_1,...,X_n[/math] are independent. Then the law of [math](X_1,...,X_n)[/math] also has density given by [math]P(x_1,...,x_n)=\prod_{i=1}^nP_i(x_i)[/math].
- Conversely assume that the law of [math](X_1,...,X_n)[/math] has density [math]P(x_1,...,x_n)=\prod_{i=1}^nq_i(x_i)[/math], where [math]q_i[/math] is Borel measurable and positive. Then the r.v.'s [math]X_1,...,X_n[/math] are independent and the law of [math]X_i[/math] has density [math]P_i=c_iq_i[/math], with [math]c_i \gt 0[/math] for [math]i\in\{1,...,n\}[/math].
We only need to show [math](ii)[/math]. From Fubini we get
[[math]] 1=\int_{\R^n}P(x_1,...,x_n)dx_1\dotsm dx_n=\prod_{i=1}^n\int_\R q_i(x)dx, [[/math]]
so each [math]a_i:=\int_\R q_i(x)dx[/math] is finite and positive, with [math]\prod_{i=1}^na_i=1[/math]. Integrating out the other coordinates shows that [math]X_i[/math] has density [math]P_i=c_iq_i[/math] with [math]c_i=\prod_{j\not=i}a_j=a_i^{-1} \gt 0[/math]. Since [math]\prod_{i=1}^nc_i=1[/math], we get [math]P(x_1,...,x_n)=\prod_{i=1}^nP_i(x_i)[/math], so the law of [math](X_1,...,X_n)[/math] is the product of the laws of the [math]X_i[/math]'s, which proves independence.
Example
Let [math]U[/math] be a r.v. with exponential distribution with parameter [math]1[/math]. Let [math]V[/math] be a uniform r.v. on [math][0,1][/math]. We assume that [math]U[/math] and [math]V[/math] are independent. Define the r.v.'s [math]X=\sqrt{U}\cos(2\pi V)[/math] and [math]Y=\sqrt{U}\sin(2\pi V)[/math]. Then [math]X[/math] and [math]Y[/math] are independent. Indeed, for a measurable function [math]\varphi:\R^2\to \R_+[/math] we get
[[math]] \E[\varphi(X,Y)]=\int_0^\infty\int_0^1\varphi\left(\sqrt{u}\cos(2\pi v),\sqrt{u}\sin(2\pi v)\right)e^{-u}\,dv\,du=\iint_{\R\times\R}\varphi(x,y)\frac{e^{-x^2-y^2}}{\pi}\,dx\,dy, [[/math]]
where we used the change of variables [math]x=\sqrt{u}\cos(2\pi v)[/math], [math]y=\sqrt{u}\sin(2\pi v)[/math] (polar coordinates with radius [math]\sqrt{u}[/math] and angle [math]2\pi v[/math], for which [math]dx\,dy=\pi\,du\,dv[/math]),
which implies that [math](X,Y)[/math] has density [math]\frac{e^{-x^2}e^{-y^2}}{\pi}[/math] on [math]\R\times\R[/math]. Since this density is the product of the two functions [math]x\mapsto\frac{1}{\sqrt{\pi}}e^{-x^2}[/math] and [math]y\mapsto\frac{1}{\sqrt{\pi}}e^{-y^2}[/math], the previous corollary gives that [math]X[/math] and [math]Y[/math] are independent and that they both have density [math]P(x)=\frac{1}{\sqrt{\pi}}e^{-x^2}[/math].
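The computation can be checked by simulation; below is a minimal numpy sketch (arbitrary seed and sample size). The marginals should have variance [math]\frac{1}{2}[/math], and by independence a product event such as [math]\{X \gt 0\}\cap\{Y \gt 0\}[/math] should have probability [math]\frac{1}{4}[/math].

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
U = rng.exponential(scale=1.0, size=n)    # Exp(1)
V = rng.random(n)                         # uniform on [0, 1]

X = np.sqrt(U) * np.cos(2 * np.pi * V)
Y = np.sqrt(U) * np.sin(2 * np.pi * V)

print(X.var(), Y.var())                   # each ~ 1/2, matching the density above
print(np.mean((X > 0) & (Y > 0)),         # ~ 1/4 ...
      np.mean(X > 0) * np.mean(Y > 0))    # ... = P[X > 0] P[Y > 0]
```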
We write [math]X\stackrel{law}{=}Y[/math] to say that [math]\p_X=\p_Y[/math]. Thus in the example above we would have
[[math]] X\stackrel{law}{=}Y. [[/math]]
Important facts
Let [math]X_1,...,X_n[/math] be [math]n[/math] real valued r.v.'s. Then the following are equivalent
- [math]X_1,...,X_n[/math] are independent.
- For [math]X=(X_1,...,X_n)\in\R^n[/math] we have
[[math]] \Phi_X(\xi_1,...,\xi_n)=\prod_{i=1}^n\Phi_{X_i}(\xi_i). [[/math]]
- For all [math]a_1,...,a_n\in\R[/math], we have
[[math]] \p[X_1\leq a_1,...,X_n\leq a_n]=\prod_{i=1}^n\p[X_i\leq a_i]. [[/math]]
- If [math]f_1,...,f_n:\R\to\R_+[/math] are continuous maps with compact support (in particular bounded and measurable), then
[[math]] \E\left[\prod_{i=1}^nf_i(X_i)\right]=\prod_{i=1}^n\E[f_i(X_i)]. [[/math]]
First we show [math](i)\Longrightarrow (ii)[/math]. By definition and the independence of [math]X_1,...,X_n[/math], we get
[[math]] \Phi_X(\xi_1,...,\xi_n)=\E\left[e^{i\sum_{j=1}^n\xi_jX_j}\right]=\E\left[\prod_{j=1}^ne^{i\xi_jX_j}\right]=\prod_{j=1}^n\E\left[e^{i\xi_jX_j}\right]=\prod_{j=1}^n\Phi_{X_j}(\xi_j),
[[/math]]
where we used that each map [math]t\mapsto e^{i\xi_jt}[/math] is measurable and bounded. Next we show [math](ii)\Longrightarrow (i)[/math]. Note that, by the injectivity of the characteristic function, we have [math]\p_X=\p_Y[/math] if [math]\Phi_X=\Phi_Y[/math].
Now if [math]\Phi_X(\xi_1,...,\xi_n)=\prod_{i=1}^n\Phi_{X_i}(\xi_i)[/math], we note that [math]\prod_{i=1}^n\Phi_{X_i}(\xi_i)[/math] is the characteristic function of the product measure [math]\p_{X_1}\otimes\dotsm \otimes\p_{X_n}[/math]. Now from injectivity it follows that [math]\p_{(X_1,...,X_n)}=\p_{X_1}\otimes\dotsm\otimes\p_{X_n}[/math], which implies that [math]X_1,...,X_n[/math] are independent.
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]\B_1,...,\B_n\subset\A[/math] be sub [math]\sigma[/math]-Algebras of [math]\A[/math]. For every [math]i\in\{1,...,n\}[/math], let [math]\mathcal{C}_i\subset\B_i[/math] be a family of subsets of [math]\Omega[/math] such that [math]\mathcal{C}_i[/math] is stable under finite intersection and [math]\sigma(\mathcal{C}_i)=\B_i[/math]. Assume that for all [math]C_i\in\mathcal{C}_i[/math] with [math]i\in\{1,...,n\}[/math] we have
[[math]] \p[C_1\cap\dotsm\cap C_n]=\p[C_1]\dotsm\p[C_n]. [[/math]]
Then [math]\B_1,...,\B_n[/math] are independent.
Let us fix [math]C_2\in \mathcal{C}_2,...,C_n\in\mathcal{C}_n[/math] and define the two finite measures
[[math]] \mu(A)=\p[A\cap C_2\cap\dotsm\cap C_n],\qquad \nu(A)=\p[A]\p[C_2]\dotsm\p[C_n],\qquad A\in\B_1. [[/math]]
By assumption [math]\mu[/math] and [math]\nu[/math] agree on [math]\mathcal{C}_1[/math] (we may also allow the choice [math]C_i=\Omega[/math] in the assumption, so that [math]\mu[/math] and [math]\nu[/math] have the same total mass). Since [math]\mathcal{C}_1[/math] is stable under finite intersection and [math]\sigma(\mathcal{C}_1)=\B_1[/math], the monotone class theorem gives [math]\mu=\nu[/math] on [math]\B_1[/math], i.e. [math]\p[B_1\cap C_2\cap\dotsm\cap C_n]=\p[B_1]\p[C_2]\dotsm\p[C_n][/math] for all [math]B_1\in\B_1[/math]. Repeating the same argument in the coordinates [math]2,...,n[/math] proves the claim.
Consequence: Let [math]\B_1,...,\B_n[/math] be [math]n[/math] independent [math]\sigma[/math]-Algebras and let [math]m_0=0 \lt m_1 \lt ... \lt m_p=n[/math]. Then the [math]\sigma[/math]-Algebras
[[math]] \mathcal{D}_1=\sigma(\B_1,...,\B_{m_1}),\quad \mathcal{D}_2=\sigma(\B_{m_1+1},...,\B_{m_2}),\quad ...,\quad \mathcal{D}_p=\sigma(\B_{m_{p-1}+1},...,\B_{m_p}) [[/math]]
are also independent. Indeed, we can apply the previous proposition to the classes of sets
[[math]] \mathcal{C}_j=\left\{A_{m_{j-1}+1}\cap\dotsm\cap A_{m_j}\mid A_i\in\B_i\ \text{for}\ m_{j-1} \lt i\leq m_j\right\},\qquad j\in\{1,...,p\}, [[/math]]
which are stable under finite intersection and satisfy [math]\sigma(\mathcal{C}_j)=\mathcal{D}_j[/math].
In particular if [math]X_1,...,X_n[/math] are independent r.v.'s, then the [math]\sigma[/math]-Algebras
[[math]] \sigma(X_1,...,X_{m_1}),\quad \sigma(X_{m_1+1},...,X_{m_2}),\quad ...,\quad \sigma(X_{m_{p-1}+1},...,X_{m_p}) [[/math]]
are also independent.
Example
Let [math]X_1,...,X_4[/math] be real valued independent r.v.'s. Then [math]Z_1=X_1X_3[/math] and [math]Z_2=X_2^3+X_4[/math] are independent. Indeed, [math]Z_1[/math] is [math]\sigma(X_1,X_3)[/math]-measurable, [math]Z_2[/math] is [math]\sigma(X_2,X_4)[/math]-measurable, and from the above [math]\sigma(X_1,X_3)[/math] and [math]\sigma(X_2,X_4)[/math] are independent. Recall also that for [math]X:\Omega\to\R^n[/math], a r.v. [math]Y[/math] is [math]\sigma(X)[/math]-measurable if and only if [math]Y=f(X)[/math] with [math]f[/math] a measurable map; in particular, if [math]Y[/math] is [math]\sigma(X_1,...,X_n)[/math]-measurable, then [math]Y=f(X_1,...,X_n)[/math].
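A quick numerical spot-check of this example (numpy; the standard normal inputs and the two bounded test functions are our own choices): independence of [math]Z_1[/math] and [math]Z_2[/math] implies [math]\E[f(Z_1)g(Z_2)]=\E[f(Z_1)]\E[g(Z_2)][/math] for bounded measurable [math]f,g[/math].

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
X1, X2, X3, X4 = rng.standard_normal((4, n))   # four independent r.v.'s

Z1 = X1 * X3
Z2 = X2**3 + X4

f = np.cos(Z1)              # bounded test function of Z1
g = np.exp(-Z2**2)          # bounded test function of Z2
print(np.mean(f * g), np.mean(f) * np.mean(g))  # agree up to Monte Carlo error
```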
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](\B_i)_{i\in I}[/math] be an infinite family of sub [math]\sigma[/math]-Algebras of [math]\A[/math]. We say that the family [math](\B_i)_{i\in I}[/math] is independent if for every finite subset [math]\{i_1,...,i_p\}\subset I[/math], the [math]\sigma[/math]-Algebras [math]\B_{i_1},...,\B_{i_p}[/math] are independent. If [math](X_i)_{i\in I}[/math] is a family of r.v.'s, we say that they are independent if [math](\sigma(X_i))_{i\in I}[/math] is independent.
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 1}[/math] be a sequence of independent r.v.'s. Then for all [math]p\in\N[/math] we get that [math]\B_1=\sigma(X_1,...,X_p)[/math] and [math]\B_2=\sigma(X_{p+1},X_{p+2},...)[/math] are independent.
Apply Proposition 5.9. to [math]\mathcal{C}_1=\sigma(X_1,...,X_p)[/math] and [math]\mathcal{C}_2=\bigcup_{n=p+1}^\infty\sigma(X_{p+1},...,X_n)[/math]. Indeed, [math]\mathcal{C}_2\subset\B_2[/math] is stable under finite intersection and [math]\sigma(\mathcal{C}_2)=\B_2[/math].
The Borel-Cantelli Lemma
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](A_n)_{n\in\N}[/math] be a sequence of events in [math]\A[/math]. Recall that we can write
[[math]] \limsup_{n\to\infty}A_n=\bigcap_{n=0}^\infty\bigcup_{k=n}^\infty A_k,\qquad \liminf_{n\to\infty}A_n=\bigcup_{n=0}^\infty\bigcap_{k=n}^\infty A_k. [[/math]]
Moreover, both are again measurable sets. For [math]\omega\in\limsup_n A_n[/math] we get that [math]\omega\in\bigcup_{k=n}^\infty A_k[/math] for all [math]n\geq 0[/math], i.e. for all [math]n\geq 0[/math] there exists a [math]k\geq n[/math] such that [math]\omega\in A_k[/math]; in other words, [math]\omega[/math] is in infinitely many [math]A_k[/math]'s. For [math]\omega\in\liminf_n A_n[/math], there exists an [math]n\geq 0[/math] such that [math]\omega\in\bigcap_{k=n}^\infty A_k[/math], i.e. [math]\omega\in A_k[/math] for all [math]k\geq n[/math]; in other words, [math]\omega[/math] is in all but finitely many [math]A_k[/math]'s. In particular [math]\liminf_nA_n\subset \limsup_nA_n[/math].
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](A_n)_{n\in\N}\in\A[/math] be a family of measurable sets.
- If [math]\sum_{n\geq 1}\p[A_n] \lt \infty[/math], then
[[math]] \p\left[\limsup_{n\to\infty} A_n\right]=0, [[/math]]which means that the set [math]\{n\in\N\mid \omega\in A_n\}[/math] is a.s. finite.
- If [math]\sum_{n\geq 1}\p[A_n]=\infty[/math], and if the events [math](A_n)_{n\in\N}[/math] are independent, then
[[math]] \p\left[\limsup_{n\to\infty} A_n\right]=1, [[/math]]which means that the set [math]\{n\in\N\mid \omega\in A_n\}[/math] is a.s. infinite.
We need to show both points.
- If [math]\sum_{n\geq 1}\p[A_n] \lt \infty,[/math] then, by Fubini, we get
[[math]] \E\left[\sum_{n\geq 1}\one_{A_n}\right]=\sum_{n\geq 1}\p[A_n] \lt \infty, [[/math]]which implies that [math]\sum_{n\geq 1}\one_{A_n} \lt \infty[/math] a.s., i.e. a.s. [math]\one_{A_n}\not=0[/math] for only finitely many [math]n[/math], which is exactly [math]\p\left[\limsup_{n\to\infty}A_n\right]=0[/math].
- Fix [math]n_0\in\N[/math] and note that for all [math]n\geq n_0[/math] we have
[[math]] \p\left[\bigcap_{k=n_0}^nA_k^C\right]=\prod_{k=n_0}^n\p[A_k^C]=\prod_{k=n_0}^n(1-\p[A_k])\leq \exp\left(-\sum_{k=n_0}^n\p[A_k]\right), [[/math]]where we used [math]1-x\leq e^{-x}[/math]. Now since[[math]] \sum_{n\geq 1}\p[A_n]=\infty, [[/math]]letting [math]n\to\infty[/math] gives[[math]] \p\left[\bigcap_{k=n_0}^\infty A_k^C\right]=0. [[/math]]Since this is true for every [math]n_0[/math], we have[[math]] \p\left[\bigcup_{n_0=0}^\infty\bigcap_{k=n_0}^\infty A_k^C\right]\leq \sum_{n_0\geq 0}\p\left[\bigcap_{k=n_0}^\infty A_k^C\right]=0. [[/math]]Hence, by taking complements, we get[[math]] \p\left[\limsup_{n\to\infty} A_n\right]=\p\left[\bigcap_{n=0}^\infty\bigcup_{k=n}^\infty A_k\right]=1-\p\left[\bigcup_{n=0}^\infty\bigcap_{k=n}^\infty A_k^C\right]=1. [[/math]]
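The dichotomy in the lemma is easy to visualize by simulation. In the sketch below (numpy; seed and horizon arbitrary), independent events with [math]\p[A_n]=\frac{1}{n^2}[/math] produce only a few hits, all at small indices, while [math]\p[A_n]=\frac{1}{n}[/math] keeps producing hits at arbitrarily large indices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10**6
n = np.arange(1, N + 1)

# sum P[A_n] < infinity: only finitely many A_n occur (first part)
hits_conv = rng.random(N) < 1.0 / n**2
print(hits_conv.sum(), n[hits_conv][-5:])   # few hits, all at small n

# sum P[A_n] = infinity with independence: hits never stop (second part)
hits_div = rng.random(N) < 1.0 / n
print(hits_div.sum(), n[hits_div][-5:])     # hits persist up to large n
```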
Application 1
There does not exist a probability measure [math]\p[/math] on [math]\N[/math] such that the probability of the set of multiples of an integer [math]n[/math] is [math]\frac{1}{n}[/math] for all [math]n\geq 1[/math]. Let us assume that such a probability measure exists. Let [math]\tilde{p}[/math] denote the set of prime numbers. For [math]p\in\tilde{p}[/math] we write [math]A_p=p\N[/math], i.e. the set of all multiples of [math]p[/math]. We first show that the sets [math](A_p)_{p\in\tilde{p}}[/math] are independent. Indeed, let [math]p_1,...,p_n\in\tilde{p}[/math] be distinct. Then we have
[[math]] \p[A_{p_1}\cap\dotsm\cap A_{p_n}]=\p[(p_1\dotsm p_n)\N]=\frac{1}{p_1\dotsm p_n}=\p[A_{p_1}]\dotsm\p[A_{p_n}], [[/math]]
since an integer is divisible by the distinct primes [math]p_1,...,p_n[/math] if and only if it is divisible by their product.
Moreover, it is known that
[[math]] \sum_{p\in\tilde{p}}\p[A_p]=\sum_{p\in\tilde{p}}\frac{1}{p}=\infty. [[/math]]
The second part of the Borel-Cantelli lemma then implies that almost every integer [math]n[/math] belongs to infinitely many [math]A_p[/math]'s, i.e. is divisible by infinitely many distinct prime numbers. This is absurd, since every integer [math]n\geq 1[/math] has only finitely many prime factors, and hence no such probability measure exists.
Application 2
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]X[/math] be an exponential r.v. with parameter [math]\lambda=1[/math]. Thus we know that [math]X[/math] has density [math]e^{-x}\one_{\R_+}(x)[/math]. Now consider a sequence [math](X_n)_{n\geq 1}[/math] of independent r.v.'s with the same distribution as [math]X[/math], i.e. for all [math]n\geq 1[/math] we have [math]X_n\sim X[/math]. Then [math]\limsup_n \frac{X_n}{\log(n)}=1[/math] a.s., i.e. there exists an [math]N\in\A[/math] such that [math]\p[N]=0[/math] and for [math]\omega\not\in N[/math] we get
[[math]] \limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}=1. [[/math]]
Therefore we can compute the probability
[[math]] \p[X \gt t]=\int_t^\infty e^{-x}dx=e^{-t},\qquad t\geq 0. [[/math]]
Now let [math]\epsilon \gt 0[/math] and consider the sets [math]A_n=\{X_n \gt (1+\epsilon)\log(n)\}[/math] and [math]B_n=\{X_n \gt \log(n)\}[/math]. Then
[[math]] \p[A_n]=e^{-(1+\epsilon)\log(n)}=\frac{1}{n^{1+\epsilon}},\qquad \p[B_n]=e^{-\log(n)}=\frac{1}{n}. [[/math]]
This implies that
[[math]] \sum_{n\geq 1}\p[A_n]=\sum_{n\geq 1}\frac{1}{n^{1+\epsilon}} \lt \infty. [[/math]]
With the Borel-Cantelli lemma we get that [math]\p\left[\limsup_{n\to\infty} A_n\right]=0[/math]. Let us define
[[math]] N_\epsilon=\limsup_{n\to\infty}A_n. [[/math]]
Then we have [math]\p[N_\epsilon]=0[/math], and for [math]\omega\not\in N_{\epsilon}[/math] there exists an [math]n_0(\omega)[/math] such that for all [math]n\geq n_0[/math] we have
[[math]] X_n(\omega)\leq (1+\epsilon)\log(n), [[/math]]
and thus for [math]\omega\not\in N_{\epsilon}[/math], we get [math]\limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}\leq 1+\epsilon[/math]. Moreover, let
[[math]] N'=\bigcup_{\epsilon\in\Q_+}N_\epsilon. [[/math]]
Therefore we get [math]\p[N']\leq \sum_{\epsilon\in\Q_+}\p[N_{\epsilon}]=0[/math]. Hence for [math]\omega\not\in N'[/math] we get
[[math]] \limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}\leq 1. [[/math]]
Now we note that the [math]B_n[/math]'s are independent, since [math]B_n\in\sigma(X_n)[/math] and the [math]X_n[/math]'s are independent. Moreover,
[[math]] \p[B_n]=\frac{1}{n}, [[/math]]
which gives that
[[math]] \sum_{n\geq 1}\p[B_n]=\sum_{n\geq 1}\frac{1}{n}=\infty. [[/math]]
Now we can use Borel-Cantelli to get
[[math]] \p\left[\limsup_{n\to\infty}B_n\right]=1. [[/math]]
If we denote [math]N''=\left(\limsup_{n\to\infty} B_n\right)^C[/math], then for [math]\omega\not\in N''[/math] we get that [math]X_n(\omega) \gt \log(n)[/math] for infinitely many [math]n[/math]. So it follows that for [math]\omega\not\in N''[/math] we have
[[math]] \limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}\geq 1. [[/math]]
Finally, take [math]N=N'\cup N''[/math] to obtain [math]\p[N]=0[/math]. Thus for [math]\omega\not\in N[/math] we get
[[math]] \limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}=1. [[/math]]
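The following sketch (numpy; the cut-off indices are arbitrary) illustrates the result: the tail suprema [math]\sup_{j\geq k}\frac{X_j}{\log(j)}[/math] computed from a finite sample slowly settle near [math]1[/math]. A finite simulation can of course only suggest the a.s. limit.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10**6
k = np.arange(2, N + 2)                     # indices 2..N+1, so log(k) > 0
X = rng.exponential(scale=1.0, size=N)      # i.i.d. Exp(1), playing the role of X_k
ratio = X / np.log(k)

# tail suprema sup over j >= k of X_j / log(j); nonincreasing, tending to 1
tail_sup = np.maximum.accumulate(ratio[::-1])[::-1]
for idx in [10**2, 10**4, 10**5]:
    print(k[idx], tail_sup[idx])
```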
Sums of independent Random Variables
Let us first define the convolution of two probability measures. If [math]\mu[/math] and [math]\nu[/math] are two probability measures on [math]\R^d[/math], we denote by [math]\mu*\nu[/math] the image of the measure [math]\mu\otimes\nu[/math] under the map
[[math]] \R^d\times\R^d\to\R^d,\qquad (x,y)\mapsto x+y. [[/math]]
Moreover, for all measurable maps [math]\varphi:\R^d\to \R_+[/math], we have
[[math]] \int_{\R^d}\varphi\, d(\mu*\nu)=\iint_{\R^d\times\R^d}\varphi(x+y)\mu(dx)\nu(dy). [[/math]]
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math]X[/math] and [math]Y[/math] be two independent r.v.'s with values in [math]\R^d[/math]. Then the following hold.
- The law of [math]X+Y[/math] is given by [math]\p_X*\p_Y[/math]. In particular if [math]X[/math] has density [math]f[/math] and [math]Y[/math] has density [math]g[/math], then [math]X+Y[/math] has density [math]f*g[/math], where [math]*[/math] denotes the convolution product, which is given by
[[math]] f*g(\xi)=\int_{\R^d} f(x)g(\xi-x)dx. [[/math]]
- [math]\Phi_{X+Y}(\xi)=\Phi_X(\xi)\Phi_Y(\xi).[/math]
- If [math]X[/math] and [math]Y[/math] are in [math]L^2(\Omega,\A,\p)[/math], we get
[[math]] K_{X+Y}=K_X+K_Y. [[/math]]In particular when [math]d=1[/math], we obtain[[math]] Var(X+Y)=Var(X)+Var(Y). [[/math]]
We need to show all three points.
- If [math]X[/math] and [math]Y[/math] are independent r.v.'s, then [math]\p_{(X,Y)}=\p_X\otimes\p_Y[/math]. Consequently, for all measurable maps [math]\varphi:\R^d\to\R_+[/math], we have
[[math]] \begin{multline*} \E[\varphi(X+Y)]=\iint_{\R^d\times\R^d}\varphi(x+y)\p_{(X,Y)}(dx\,dy)=\iint_{\R^d\times\R^d}\varphi(x+y)\p_X(dx)\p_{Y}(dy)\\=\int_{\R^d}\varphi(\xi)(\p_X*\p_Y)(d\xi). \end{multline*} [[/math]]Now if [math]X[/math] and [math]Y[/math] have densities [math]f[/math] and [math]g[/math] respectively, we get[[math]] \E[\varphi(X+Y)]=\iint_{\R^d\times\R^d}\varphi(x+y)f(x)g(y)\,dx\,dy=\int_{\R^d}\varphi(\xi)\left(\int_{\R^d} f(x)g(\xi-x)dx\right)d\xi, [[/math]]where we substituted [math]\xi=x+y[/math] for fixed [math]x[/math]. Since this identity is true for all measurable maps [math]\varphi:\R^d\to\R_+[/math], the r.v. [math]Z:=X+Y[/math] has density[[math]] h(\xi)=(f*g)(\xi)=\int_{\R^d}f(x)g(\xi-x)dx. [[/math]]
- By definition of the characteristic function and the independence property, we get
[[math]] \Phi_{X+Y}(\xi)=\E\left[e^{i\langle\xi,X+Y\rangle}\right]=\E\left[e^{i\langle\xi,X\rangle}e^{i\langle\xi,Y\rangle}\right]=\E\left[e^{i\langle\xi,X\rangle}\right]\E\left[e^{i\langle\xi,Y\rangle}\right]=\Phi_X(\xi)\Phi_Y(\xi). [[/math]]
- If [math]X=(X_1,...,X_d)[/math] and [math]Y=(Y_1,...,Y_d)[/math] are independent r.v.'s on [math]\R^d[/math], we get that [math]Cov(X_i,Y_j)=0[/math] for all [math]1\leq i,j\leq d[/math]. By using the bilinearity of the covariance we get that
[[math]] Cov(X_i+Y_i,X_j+Y_j)=Cov(X_i,X_j)+Cov(X_i,Y_j)+Cov(Y_i,X_j)+Cov(Y_i,Y_j)=Cov(X_i,X_j)+Cov(Y_i,Y_j), [[/math]]and hence [math]K_{X+Y}=K_X+K_Y[/math]. For [math]d=1[/math] we get[[math]] \begin{align*} Var(X+Y)&=\E[((X+Y)-\E[X+Y])^2]=\E[((X-\E[X])+(Y-\E[Y]))^2]\\ &=\underbrace{\E[(X-\E[X])^2]}_{Var(X)}+\underbrace{\E[(Y-\E[Y])^2]}_{Var(Y)}+\underbrace{2\E[(X-\E[X])(Y-\E[Y])]}_{2Cov(X,Y)}. \end{align*} [[/math]]Now since [math]Cov(X,Y)=0[/math], we get the result.
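Points (ii) and (iii) are easy to test numerically. In the sketch below (numpy; the two input laws and the test frequency [math]\xi=1.3[/math] are arbitrary choices), the empirical characteristic functions factorize and the variances add, up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
X = rng.exponential(size=n)           # Exp(1)
Y = rng.uniform(-1.0, 1.0, size=n)    # uniform on [-1, 1], independent of X

# (iii): variances add for independent summands
print(np.var(X + Y), np.var(X) + np.var(Y))

# (ii): empirical characteristic functions factorize at a test frequency
xi = 1.3
def phi(Z):
    return np.mean(np.exp(1j * xi * Z))
print(phi(X + Y), phi(X) * phi(Y))
```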
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](X_n)_{n\geq 1}[/math] be a sequence of independent r.v.'s. Moreover, write [math]\mu=\E[X_n][/math] for all [math]n\geq1[/math] and assume [math]\E[(X_n-\mu)^2]\leq C[/math] for all [math]n\geq1[/math] and for some constant [math]C \lt \infty[/math]. We also write [math]S_n=\sum_{j=1}^nX_j[/math] and [math]\tilde X_n=\frac{S_n}{n}[/math] for all [math]n\geq 1[/math]. Then for all [math]\epsilon \gt 0[/math]
[[math]] \lim_{n\to\infty}\p\left[\left\vert \tilde X_n-\mu\right\vert \gt \epsilon\right]=0. [[/math]]
We note that [math]\E[\tilde X_n]=\mu[/math] and, by independence, [math]Var(\tilde X_n)=\frac{1}{n^2}\sum_{j=1}^nVar(X_j)\leq \frac{C}{n}[/math]. Hence Chebyshev's inequality gives
[[math]] \p\left[\left\vert\tilde X_n-\mu\right\vert \gt \epsilon\right]\leq \frac{Var(\tilde X_n)}{\epsilon^2}\leq\frac{C}{n\epsilon^2}\xrightarrow[n\to\infty]{}0. [[/math]]
Let [math](\Omega,\A,\p)[/math] be a probability space. Let [math](A_n)_{n\geq 1}\subset \A[/math] be a sequence of independent events with the same probabilities, i.e. [math]\p[A_n]=\p[A_m][/math] for all [math]n,m\geq 1[/math]. Then for all [math]\epsilon \gt 0[/math]
[[math]] \lim_{n\to\infty}\p\left[\left\vert\frac{1}{n}\sum_{j=1}^n\one_{A_j}-\p[A_1]\right\vert \gt \epsilon\right]=0. [[/math]]
Note that this is the weak law of large numbers applied to [math]X_n=\one_{A_n}[/math]: these are independent r.v.'s with the same expectation [math]\mu=\E[\one_{A_n}]=\p[A_1][/math] for all [math]n\geq 1[/math], and [math]\E[(\one_{A_n}-\mu)^2]=\p[A_1](1-\p[A_1])\leq \frac{1}{4}[/math], so the assumptions of the theorem hold with [math]C=\frac{1}{4}[/math].
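A short simulation illustrating the corollary (numpy; the common probability [math]p=0.3[/math] and the sample sizes are our own choices): the empirical frequency of independent events with the same probability concentrates around that probability.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.3                                    # common value of P[A_n]
for n in [10**2, 10**4, 10**6]:
    freq = (rng.random(n) < p).mean()      # empirical frequency of the A_j's
    print(n, freq)                         # approaches p = P[A_1] as n grows
```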
General references
Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].