<div class="d-none"><math>
\newcommand{\R}{\mathbb{R}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rbar}{\overline{\mathbb{R}}}
\newcommand{\Bbar}{\overline{\mathcal{B}}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\p}{\mathbb{P}}
\newcommand{\one}{\mathds{1}}
\newcommand{\0}{\mathcal{O}}
\newcommand{\mat}{\textnormal{Mat}}
\newcommand{\sign}{\textnormal{sign}}
\newcommand{\CP}{\mathcal{P}}
\newcommand{\CT}{\mathcal{T}}
\newcommand{\CY}{\mathcal{Y}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\mathds}{\mathbb}</math></div>
===Independent events===
Let <math>(\Omega,\A,\p)</math> be a probability space. If <math>A,B\in\A</math>, we say that <math>A</math> and <math>B</math> are independent if
<math display="block">
\p[A\cap B]=\p[A]\p[B].
</math>
'''Example'''
[Throw of a fair die] The state space is <math>\Omega=\{1,2,3,4,5,6\}</math> and each outcome <math>\omega\in\Omega</math> has probability <math>\p[\{\omega\}]=\frac{1}{6}</math>. Now let <math>A=\{1,2\}</math> and <math>B=\{1,3,5\}</math>. Then
<math display="block">
\p[A\cap B]=\p[\{1\}]=\frac{1}{6}\quad\text{and}\quad\p[A]=\frac{1}{3},\quad\p[B]=\frac{1}{2}.
</math>
Therefore we get
<math display="block">
\p[A\cap B]=\p[A]\p[B].
</math>
Hence <math>A</math> and <math>B</math> are independent.
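This computation can be verified by exact enumeration. The helper `prob` below is our own illustration, not part of the notes:

```python
from fractions import Fraction

# Exact check of the die example: Omega = {1,...,6} with the uniform measure.
omega = set(range(1, 7))
A = {1, 2}
B = {1, 3, 5}

def prob(event):
    # P[E] = |E ∩ Omega| / |Omega| under the uniform measure
    return Fraction(len(event & omega), len(omega))

print(prob(A & B), prob(A) * prob(B))  # 1/6 1/6
```

Exact rational arithmetic with `Fraction` avoids any floating-point ambiguity in the comparison.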
{{definitioncard|Independence of events|We say that the <math>n</math> events <math>A_1,...,A_n\in\A</math> are independent if for every subset <math>\{j_1,...,j_l\}\subset\{1,...,n\}</math> we have
<math display="block">
\p[A_{j_1}\cap A_{j_2}\cap\dotsm \cap A_{j_l}]=\p[A_{j_1}]\dotsm \p[A_{j_l}].
</math>
}}
{{alert-info |
It is not enough to have <math>\p[A_1\cap\dotsm\cap A_n]=\p[A_1]\dotsm\p[A_n]</math>. It is also not enough to check that for all <math>\{i,j\}\subset\{1,...,n\}</math> we have <math>\p[A_i\cap A_j]=\p[A_i]\p[A_j]</math>. For instance, consider two tosses of a fair coin and the events <math>A,B</math> and <math>C</math> given by
<math display="block">
A=\{\text{$H$ at the first toss}\},\quad B=\{\text{$H$ at the second toss}\},\quad C=\{\text{same outcome for both tosses}\}.
</math>
The events <math>A,B</math> and <math>C</math> are pairwise independent, but <math>A,B</math> and <math>C</math> are not independent events.
}}
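A pairwise-independent but not independent triple can be checked exactly by enumerating two fair coin tosses; the helper `prob` and the event definitions in the comments are our own illustration:

```python
from fractions import Fraction
from itertools import product

# Two fair coin tosses: the four outcomes are equally likely.
omega = list(product("HT", repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == "H"}    # H at the first toss
B = {w for w in omega if w[1] == "H"}    # H at the second toss
C = {w for w in omega if w[0] == w[1]}   # same outcome for both tosses

# Pairwise independent ...
pairwise = all(prob(X & Y) == prob(X) * prob(Y)
               for X, Y in [(A, B), (A, C), (B, C)])
# ... but P[A ∩ B ∩ C] = 1/4 while P[A] P[B] P[C] = 1/8.
triple = prob(A & B & C) == prob(A) * prob(B) * prob(C)
print(pairwise, triple)  # True False
```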
{{proofcard|Proposition|prop-1|The <math>n</math> events <math>A_1,...,A_n\in\A</math> are independent if and only if
<math display="block">
(*)\quad\p[B_1\cap\dotsm \cap B_n]=\p[B_1]\dotsm\p[B_n]
</math>
for all <math>B_i\in\sigma(A_i)=\{\emptyset,A_i,A_i^C,\Omega\}</math>, <math>\forall i\in\{1,...,n\}</math>.
|If <math>(*)</math> is satisfied and if <math>\{j_1,...,j_l\}\subset\{1,...,n\}</math>, then for <math>i\in\{j_1,...,j_l\}</math> take <math>B_i=A_i</math> and for <math>i\not\in\{j_1,...,j_l\}</math> take <math>B_i=\Omega</math>. It follows that
<math display="block">
\p[A_{j_1}\cap\dotsm \cap A_{j_l}]=\p[A_{j_1}]\dotsm\p[A_{j_l}].
</math>
Conversely, assume that <math>A_1,...,A_n\in\A</math> are independent and let us deduce <math>(*)</math>. We can assume that <math>B_i\not=\emptyset</math> for all <math>i\in\{1,...,n\}</math> (otherwise the identity is trivially satisfied). If <math>\{j_1,...,j_l\}=\{i\mid B_i\not=\Omega\}</math>, we have to check that
<math display="block">
\p[B_{j_1}\cap\dotsm\cap B_{j_l}]=\p[B_{j_1}]\dotsm\p[B_{j_l}],
</math>
where <math>B_{j_k}=A_{j_k}</math> or <math>B_{j_k}=A_{j_k}^C</math> for each <math>k</math>. Hence it is enough to show that if <math>C_1,...,C_p</math> are independent events, then
<math display="block">
C_1^C,C_2,...,C_p
</math>
are also independent. Indeed, if <math>\{i_1,...,i_q\}\subset\{1,...,p\}</math> with <math>1\not\in\{i_1,...,i_q\}</math>, then from the definition of independence we have
<math display="block">
\p[C_{i_1}\cap\dotsm\cap C_{i_q}]=\p[C_{i_1}]\dotsm\p[C_{i_q}].
</math>
If <math>1\in\{i_1,...,i_q\}</math>, say <math>1=i_1</math>, then
<math display="block">
\begin{align*}
\p[C_{1}^C\cap C_{i_2}\cap\dotsm\cap C_{i_q}]&=\p[C_{i_2}\cap\dotsm\cap C_{i_q}]-\p[C_1\cap C_{i_2}\cap\dotsm\cap C_{i_q}]\\
&=\p[C_{i_2}]\dotsm\p[C_{i_q}]-\p[C_1]\p[C_{i_2}]\dotsm\p[C_{i_q}]\\
&=(1-\p[C_1])\p[C_{i_2}]\dotsm\p[C_{i_q}]=\p[C_1^C]\p[C_{i_2}]\dotsm\p[C_{i_q}].
\end{align*}
</math>}}
{{definitioncard|Conditional probability|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>A,B\in\A</math> such that <math>\p[B] > 0</math>. The conditional probability of <math>A</math> given <math>B</math> is then defined as
<math display="block">
\p[A\mid B]=\frac{\p[A\cap B]}{\p[B]}.
</math>
}}
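As a quick numerical illustration of the definition (the events and helper `prob` are our own example, not from the notes): for one throw of a fair die, conditioning on <math>B=\{1,2,3\}</math> changes the probability of <math>A=\{2,4,6\}</math> from <math>\frac{1}{2}</math> to <math>\frac{1}{3}</math>:

```python
from fractions import Fraction

# One throw of a fair die; A = "even", B = "at most 3".
omega = set(range(1, 7))
A = {2, 4, 6}
B = {1, 2, 3}

def prob(event):
    return Fraction(len(event & omega), len(omega))

p_A_given_B = prob(A & B) / prob(B)   # P[A | B] = P[A ∩ B] / P[B]
print(prob(A), p_A_given_B)  # 1/2 1/3
```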
{{proofcard|Theorem|thm-1|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>A,B\in\A</math> and suppose that <math>\p[B] > 0</math>.
<ul style{{=}}"list-style-type:lower-roman"><li><math>A</math> and <math>B</math> are independent if and only if
<math display="block">
\p[A\mid B]=\p[A].
</math>
</li>
<li>The map
<math display="block">
\A\to [0,1],\quad A\mapsto \p[A\mid B]
</math>
defines a new probability measure on <math>\A</math>, called the conditional probability given <math>B</math>.
</li>
</ul>
|We need to show both points.
<ul style{{=}}"list-style-type:lower-roman"><li>If <math>A</math> and <math>B</math> are independent, then
<math display="block">
\p[A\mid B]=\frac{\p[A\cap B]}{\p[B]}=\frac{\p[A]\p[B]}{\p[B]}=\p[A],
</math>
and conversely if <math>\p[A\mid B]=\p[A]</math>, we get that
<math display="block">
\p[A\cap B]=\p[A]\p[B],
</math>
and hence <math>A</math> and <math>B</math> are independent.
</li>
<li>Let <math>\Q[A]=\p[A\mid B]</math>. We have
<math display="block">
\Q[\Omega]=\p[\Omega\mid B]=\frac{\p[\Omega\cap B]}{\p[B]}=\frac{\p[B]}{\p[B]}=1.
</math>
Take <math>(A_n)_{n\geq 1}\subset \A</math> to be a family of disjoint events. Then
<math display="block">
\begin{align*}
\Q\left[\bigcup_{n\geq 1}A_n\right]&=\p\left[\bigcup_{n\geq 1}A_n\mid B\right]=\frac{\p\left[\left(\bigcup_{n\geq 1}A_n\right)\cap B\right]}{\p[B]}=\frac{\p\left[\bigcup_{n\geq 1}(A_n\cap B)\right]}{\p[B]}\\
&=\sum_{n\geq 1}\frac{\p[A_n\cap B]}{\p[B]}=\sum_{n\geq 1}\Q[A_n].
\end{align*}
</math>
</li>
</ul>}}
{{proofcard|Theorem|thm-2|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>A_1,...,A_n\in\A</math> with <math>\p[A_1\cap\dotsm\cap A_n] > 0</math>. Then
<math display="block">
\p[A_1\cap\dotsm\cap A_n]=\p[A_1]\p[A_2\mid A_1]\p[A_3\mid A_1\cap A_2]\dotsm\p[A_n\mid A_1\cap\dotsm\cap A_{n-1}].
</math>
|We prove this by induction on <math>n</math>. For <math>n=2</math> it is just the definition of conditional probability. For the induction step from <math>n-1</math> to <math>n</math>, set <math>B=A_1\cap \dotsm\cap A_{n-1}</math>. Then
<math display="block">
\p[B\cap A_n]=\p[A_n\mid B]\p[B]=\p[A_n\mid B]\p[A_1]\p[A_2\mid A_1]\dotsm\p[A_{n-1}\mid A_1\cap\dotsm\cap A_{n-2}],
</math>
where we applied the induction hypothesis to <math>\p[B]</math>.}}
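The chain rule can be illustrated with a standard example (our own, not from the notes): the probability that three cards drawn without replacement from a 52-card deck are all aces.

```python
from fractions import Fraction
from math import comb

# Chain rule: P[A_1 ∩ A_2 ∩ A_3] = P[A_1] P[A_2 | A_1] P[A_3 | A_1 ∩ A_2],
# with A_k = "the k-th card drawn is an ace" (drawing without replacement).
p_chain = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

# Cross-check by direct counting: 3-ace hands among all 3-card hands.
p_direct = Fraction(comb(4, 3), comb(52, 3))
print(p_chain, p_chain == p_direct)  # 1/5525 True
```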
{{proofcard|Theorem|thm-3|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>\left(E_{n}\right)_{n\geq 1}</math> be a finite or countable measurable partition of <math>\Omega</math>, such that <math>\p[E_n] > 0</math> for all <math>n</math>. If <math>A\in\A</math>, then
<math display="block">
\p[A]=\sum_{n\geq 1}\p[A\mid E_n]\p[E_n].
</math>
|Note that
<math display="block">
A=A\cap\Omega=A\cap\left(\bigcup_{n\geq 1}E_n\right)=\bigcup_{n\geq 1}(A\cap E_n).
</math>
Now since the sets <math>(A\cap E_n)_{n\geq 1}</math> are disjoint, we can write
<math display="block">
\p[A]=\sum_{n\geq 1}\p[A\cap E_n]=\sum_{n\geq 1}\p[A\mid E_n]\p[E_n].
</math>}}
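A minimal worked instance of the formula (the urn numbers are our own illustration): pick one of two urns uniformly at random, then draw a ball; urn 1 contains 2 red and 3 blue balls, urn 2 contains 4 red and 1 blue.

```python
from fractions import Fraction

# Law of total probability over the partition E_1 = "urn 1", E_2 = "urn 2".
p_E = [Fraction(1, 2), Fraction(1, 2)]            # P[E_n]
p_red_given_E = [Fraction(2, 5), Fraction(4, 5)]  # P[red | E_n]

p_red = sum(pe * pr for pe, pr in zip(p_E, p_red_given_E))
print(p_red)  # 3/5
```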
{{proofcard|Theorem (Bayes)|thm-4|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>(E_n)_{n\geq 1}</math> be a finite or countable measurable partition of <math>\Omega</math> with <math>\p[E_n] > 0</math> for all <math>n</math>, and let <math>A\in\A</math> with <math>\p[A] > 0.</math> Then
<math display="block">
\p[E_n\mid A]=\frac{\p[A\mid E_n]\p[E_n]}{\sum_{k\geq 1}\p[A\mid E_k]\p[E_k]}.
</math>
|By the previous theorem we know that
<math display="block">
\p[A]=\sum_{k\geq 1}\p[A\mid E_k]\p[E_k],\quad\p[E_n\mid A]=\frac{\p[E_n\cap A]}{\p[A]},\quad\p[A\mid E_n]=\frac{\p[A\cap E_n]}{\p[E_n]}.
</math>
Therefore, combining these identities, we get
<math display="block">
\p[E_n\mid A]=\frac{\p[E_n\cap A]}{\p[A]}=\frac{\p[A\mid E_n]\p[E_n]}{\sum_{k\geq 1}\p[A\mid E_k]\p[E_k]}.
</math>}}
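Bayes' formula can be checked exactly on a two-set partition; the urn numbers below are our own illustration (two urns chosen with equal probability, with different shares of red balls). Having drawn a red ball, we compute the posterior probability of each urn:

```python
from fractions import Fraction

# Bayes over the partition E_1 = "urn 1", E_2 = "urn 2"; A = "ball is red".
p_E = [Fraction(1, 2), Fraction(1, 2)]          # P[E_n]
p_A_given_E = [Fraction(2, 5), Fraction(4, 5)]  # P[A | E_n]

p_A = sum(pe * pa for pe, pa in zip(p_E, p_A_given_E))   # total probability
posterior = [pe * pa / p_A for pe, pa in zip(p_E, p_A_given_E)]
print(posterior, sum(posterior))  # [Fraction(1, 3), Fraction(2, 3)] 1
```

Note that the posterior probabilities automatically sum to 1, since the denominator is exactly the total probability of <math>A</math>.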
===Independent Random Variables and independent <math>\sigma</math>-Algebras===
{{definitioncard|Independence of <math>\sigma</math>-Algebras|Let <math>(\Omega,\A,\p)</math> be a probability space. We say that the sub <math>\sigma</math>-Algebras <math>\B_1,...,\B_n</math> of <math>\A</math> are independent if for all <math> A_1\in\B_1,..., A_n\in\B_n</math> we get
<math display="block">
\p[A_1\cap\dotsm \cap A_n]=\p[A_1]\dotsm\p[A_n].
</math>
Let now <math>X_1,...,X_n</math> be <math>n</math> r.v.'s with values in measurable spaces <math>(E_1,\mathcal{E}_1),...,(E_n,\mathcal{E}_n)</math> respectively. We say that the r.v.'s <math>X_1,...,X_n</math> are independent if the <math>\sigma</math>-Algebras <math>\sigma(X_1),...,\sigma(X_n)</math> are independent. This is equivalent to the fact that for all <math>F_1\in\mathcal{E}_1,...,F_n\in\mathcal{E}_n</math> we have
<math display="block">
\p[\{X_1\in F_1\}\cap\dotsm\cap\{X_n\in F_n\}]=\p[X_1\in F_1]\dotsm \p[X_n\in F_n].
</math>
(This comes from the fact that for all <math>i\in\{1,...,n\}</math> we have <math>\sigma(X_i)=\{X_i^{-1}(F)\mid F\in\mathcal{E}_i\}</math>.)}}
{{alert-info |
If <math>\B_1,...,\B_n</math> are <math>n</math> independent sub <math>\sigma</math>-Algebras and if <math>X_1,...,X_n</math> are r.v.'s such that <math>X_i</math> is <math>\B_i</math>-measurable for all <math>i\in\{1,...,n\}</math>, then <math>X_1,...,X_n</math> are independent r.v.'s. (This comes from the fact that for all <math>i\in\{1,...,n\}</math> we have <math>\sigma(X_i)\subset \B_i</math>.)
}}
{{alert-info |
The <math>n</math> events <math>A_1,...,A_n\in\A</math> are independent if and only if <math>\sigma(A_1),...,\sigma(A_n)</math> are independent.
}}
{{proofcard|Theorem (Independence of Random Variables)|thm7|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>X_1,...,X_n</math> be <math>n</math> r.v.'s with values in measurable spaces <math>(E_1,\mathcal{E}_1),...,(E_n,\mathcal{E}_n)</math> respectively. Then <math>X_1,...,X_n</math> are independent if and only if the law of the vector <math>(X_1,...,X_n)</math> is the product of the laws of <math>X_1,...,X_n</math>, i.e.
<math display="block">
\p_{(X_1,...,X_n)}=\p_{X_1}\otimes\dotsm\otimes \p_{X_n}.
</math>
Moreover, for all measurable maps <math>f_i:(E_i,\mathcal{E}_i)\to\R_+</math>, <math>i\in\{1,...,n\}</math>, we have
<math display="block">
\E\left[\prod_{i=1}^nf_i(X_i)\right]=\prod_{i=1}^n\E[f_i(X_i)].
</math>
|Let <math>F_i\in\mathcal{E}_i</math> for all <math>i\in\{1,...,n\}</math>. Then we have
<math display="block">
\p_{(X_1,...,X_n)}(F_1\times\dotsm \times F_n)=\p[\{X_1\in F_1\}\cap\dotsm\cap\{X_n\in F_n\}]
</math>
and on the other hand
<math display="block">
\left(\p_{X_1}\otimes\dotsm\otimes\p_{X_n}\right)(F_1\times\dotsm\times F_n)=\p_{X_1}[F_1]\dotsm\p_{X_n}[F_n]=\prod_{i=1}^n\p_{X_i}[F_i]=\prod_{i=1}^n\p[X_i\in F_i].
</math>
If <math>X_1,...,X_n</math> are independent, then
<math display="block">
\p_{(X_1,...,X_n)}(F_1\times\dotsm \times F_n)=\prod_{i=1}^n\p[X_i\in F_i]=\left(\p_{X_1}\otimes\dotsm\otimes \p_{X_n}\right)(F_1\times\dotsm\times F_n),
</math>
which implies that <math> \p_{(X_1,...,X_n)}</math> and <math>\p_{X_1}\otimes\dotsm\otimes \p_{X_n}</math> are equal on rectangles. Hence the monotone class theorem implies that
<math display="block">
\p_{(X_1,...,X_n)}=\p_{X_1}\otimes\dotsm\otimes\p_{X_n}.
</math>
Conversely, if <math>\p_{(X_1,...,X_n)}=\p_{X_1}\otimes\dotsm\otimes\p_{X_n}</math>, then for all <math>F_i\in\mathcal{E}_i</math>, with <math>i\in\{1,...,n\}</math>, we get that
<math display="block">
\p_{(X_1,...,X_n)}(F_1\times\dotsm\times F_n)=\left(\p_{X_1}\otimes\dotsm\otimes\p_{X_n}\right)(F_1\times\dotsm\times F_n)
</math>
and therefore
<math display="block">
\p[\{X_1\in F_1\}\cap\dotsm\cap\{X_n\in F_n\}]=\p[X_1\in F_1]\dotsm\p[X_n\in F_n].
</math>
This implies that <math>X_1,...,X_n</math> are independent. For the second assertion we get
<math display="block">
\E\left[\prod_{i=1}^nf_i(X_i)\right]=\int_{E_1\times\dotsm\times E_n}\prod_{i=1}^nf_i(x_i)\underbrace{\p_{X_1}(dx_1)\dotsm \p_{X_n}(dx_n)}_{\p_{(X_1,...,X_n)}(dx_1\dotsm dx_n)}=\prod_{i=1}^n\int_{E_i}f_i(x_i)\p_{X_i}(dx_i)=\prod_{i=1}^n\E[f_i(X_i)],
</math>
where we have used the first part and Fubini's theorem.}}
{{alert-info |
We see from the proof above that as soon as <math>\E[\vert f_i(X_i)\vert] < \infty</math> for all <math>i\in\{1,...,n\}</math>, it follows that
<math display="block">
\E\left [\prod_{i=1}^n f_i(X_i)\right]=\prod_{i=1}^n\E[ f_i(X_i) ].
</math>
Indeed, the previous result shows that
<math display="block">
\E\left[\prod_{i=1}^n\vert f_i(X_i)\vert\right]=\prod_{i=1}^n\E[\vert f_i(X_i)\vert] < \infty
</math>
and thus we can apply Fubini's theorem. In particular, if <math>X_1,...,X_n\in L^1(\Omega,\A,\p)</math> are independent, we get that
<math display="block">
\E\left[\prod_{i=1}^nX_i\right]=\prod_{i=1}^n\E[X_i].
</math>
}}
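The factorization <math>\E[\prod f_i(X_i)]=\prod\E[f_i(X_i)]</math> can be checked exactly for discrete r.v.'s by summing over the product law; the toy laws and test functions below are our own:

```python
from fractions import Fraction
from itertools import product

# X uniform on {1,2,3} and Y uniform on {0,1}, independent; check
# E[f(X) g(Y)] = E[f(X)] E[g(Y)] by summing over the product law.
law_X = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
law_Y = {0: Fraction(1, 2), 1: Fraction(1, 2)}

f = lambda x: x * x        # f(X) = X^2
g = lambda y: 3 * y + 1    # g(Y) = 3Y + 1

E_joint = sum(px * py * f(x) * g(y)
              for (x, px), (y, py) in product(law_X.items(), law_Y.items()))
E_f = sum(px * f(x) for x, px in law_X.items())
E_g = sum(py * g(y) for y, py in law_Y.items())
print(E_joint == E_f * E_g, E_f, E_g)  # True 14/3 5/2
```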
{{proofcard|Corollary|cor-1|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>X_1</math> and <math>X_2</math> be two independent r.v.'s in <math>L^2(\Omega,\A,\p)</math>. Then we get
<math display="block">
Cov(X_1,X_2)=0.
</math>
|Recall that if <math>X\in L^2(\Omega,\A,\p)</math>, then also <math>X\in L^1(\Omega,\A,\p)</math>, and by the Cauchy-Schwarz inequality <math>X_1X_2\in L^1(\Omega,\A,\p)</math>. Thus
<math display="block">
Cov(X_1,X_2)=\E[X_1X_2]-\E[X_1]\E[X_2]=\E[X_1]\E[X_2]-\E[X_1]\E[X_2]=0.
</math>}}
{{alert-info |
Note that the converse is not true! Let <math>X_1\sim\mathcal{N}(0,1)</math>; in fact we can take for <math>X_1</math> any r.v. in <math>L^2(\Omega,\A,\p)</math> with a symmetric density <math>P</math>, i.e. <math>P(-x)=P(x)</math>. Recall that being in <math>L^2(\Omega,\A,\p)</math> simply means
<math display="block">
\E[X_1^2]=\int_\R x^2 P(x)dx < \infty,
</math>
and the symmetry of <math>P</math> implies that <math>\E[X_1]=\int_\R xP(x)dx=0</math>.
Now let <math>Y</math> be a r.v. independent of <math>X_1</math>, with values in <math>\{-1,+1\}</math> and <math>\p[Y=1]=\p[Y=-1]=\frac{1}{2}</math>, so that <math>\E[Y]=0</math>. Define <math>X_2:=YX_1</math> and observe that
<math display="block">
Cov(X_1,X_2)=\E[X_1X_2]-\E[X_1]\E[X_2]=\E[YX_1^2]-\E[YX_1]\E[X_1]
</math>
and hence, by the independence of <math>Y</math> and <math>X_1</math>,
<math display="block">
Cov(X_1,X_2)=\E[Y]\E[X_1^2]-\E[Y]\E[X_1]^2=0-0=0.
</math>
However, <math>X_1</math> and <math>X_2</math> are not independent. If they were, then <math>\vert X_1\vert</math> and <math>\vert X_2\vert</math> would also be independent. But <math>\vert X_2\vert =\vert Y\vert \vert X_1\vert=\vert X_1\vert</math>, so <math>\vert X_1\vert</math> would be independent of itself and hence a.s. equal to a constant. Indeed, with <math>c=\E[\vert X_1\vert]</math>, the r.v. <math>\vert X_1\vert-c</math> is independent of itself, so
<math display="block">
\E[(\vert X_1\vert-c)^2]=\E[\vert X_1\vert-c]\E[\vert X_1\vert-c]=0\Longrightarrow \vert X_1\vert=c\text{ a.s.}
</math>
This cannot happen since <math>\vert X_1\vert</math> is the absolute value of a r.v. with a density, for instance the standard Gaussian density
<math display="block">
P(x)=\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}.
</math>
}}
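A discrete analogue of this remark (our own toy example, not from the notes) makes the same point with exact arithmetic: take <math>X</math> uniform on <math>\{-1,0,1\}</math> and <math>X_2=X^2</math>; then <math>Cov(X,X_2)=0</math> although <math>X_2</math> is a function of <math>X</math>.

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}; X2 = X^2 is a function of X, yet uncorrelated with it.
law = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

def E(h):
    return sum(p * h(x) for x, p in law.items())

# Cov(X, X^2) = E[X^3] - E[X] E[X^2]
cov = E(lambda x: x * x**2) - E(lambda x: x) * E(lambda x: x**2)
print(cov)  # 0

# Not independent: P[X = 0, X^2 = 0] = 1/3, but P[X = 0] P[X^2 = 0] = 1/9.
print(law[0] == law[0] * law[0])  # False
```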
{{proofcard|Corollary|cor-2|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>X_1,...,X_n</math> be <math>n</math> r.v.'s with values in <math>\R</math>.
<ul style{{=}}"list-style-type:lower-roman"><li>Assume that for <math>i\in \{1,...,n\}</math>, <math>\p_{X_i}</math> has density <math>P_i</math> and that the r.v.'s <math>X_1,...,X_n</math> are independent. Then the law of <math>(X_1,...,X_n)</math> also has a density, given by <math>P(x_1,...,x_n)=\prod_{i=1}^nP_i(x_i)</math>.
</li>
<li>Conversely, assume that the law of <math>(X_1,...,X_n)</math> has density <math>P(x_1,...,x_n)=\prod_{i=1}^nq_i(x_i)</math>, where each <math>q_i</math> is Borel measurable and positive. Then the r.v.'s <math>X_1,...,X_n</math> are independent and the law of <math>X_i</math> has density <math>P_i=c_iq_i</math>, with <math>c_i > 0</math> for <math>i\in\{1,...,n\}</math>.
</li>
</ul>
|We only need to show <math>(ii)</math>. From Fubini we get
<math display="block">
\prod_{i=1}^n\int_\R q_i(x_i)dx_i=\int_{\R^{n}}P(x_1,...,x_n)dx_1\dotsm dx_n=1,
</math>
which implies that <math>K_i:=\int_\R q_i(x_i)dx_i\in(0,\infty)</math> for all <math>i\in\{1,...,n\}</math>, with <math>\prod_{i=1}^nK_i=1</math>. Now the law of <math>X_i</math> has density <math>P_i</math> given by
<math display="block">
P_i(x_i)=\int_{\R^{n-1}}P(x_1,...,x_{i-1},x_i,x_{i+1},...,x_n)dx_1\dotsm dx_{i-1}dx_{i+1}\dotsm dx_n=\left(\prod_{j\not=i}K_j\right)q_i(x_i)=\frac{1}{K_i}q_i(x_i).
</math>
We can thus rewrite
<math display="block">
P(x_1,...,x_n)=\prod_{i=1}^nq_i(x_i)=\prod_{i=1}^nP_i(x_i).
</math>
Hence <math>\p_{(X_1,...,X_n)}=\p_{X_1}\otimes\dotsm \otimes \p_{X_n}</math> and therefore <math>X_1,...,X_n</math> are independent.}}
'''Example'''
Let <math>U</math> be an exponential r.v. with parameter <math>1</math> and let <math>V</math> be a uniform r.v. on <math>[0,1]</math>. We assume that <math>U</math> and <math>V</math> are independent. Define the r.v.'s <math>X=\sqrt{U}\cos(2\pi V)</math> and <math>Y=\sqrt{U}\sin(2\pi V)</math>. Then <math>X</math> and <math>Y</math> are independent. Indeed, for a measurable function <math>\varphi:\R^2\to \R_+</math> we get
<math display="block">
\E[\varphi(X,Y)]=\int_0^\infty\int_{0}^1\varphi(\sqrt{u}\cos(2\pi v),\sqrt{u}\sin(2\pi v))e^{-u}dudv
</math>
<math display="block">
=\frac{1}{\pi}\int_{0}^\infty\int_0^{2\pi}\varphi(r\cos(\theta),r\sin(\theta))re^{-r^2}drd\theta,
</math>
using the substitution <math>r=\sqrt{u}</math>, <math>\theta=2\pi v</math>. This implies that <math>(X,Y)</math> has density <math>\frac{e^{-x^2}e^{-y^2}}{\pi}</math> on <math>\R\times\R</math>. With the previous corollary we get that <math>X</math> and <math>Y</math> are independent, each with density <math>P(x)=\frac{1}{\sqrt{\pi}}e^{-x^2}</math>.
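This change of variables is essentially the Box-Muller construction. A seeded Monte Carlo sketch (sample sizes and tolerances are our own choices, and this is an illustration rather than a proof) is consistent with <math>X</math> and <math>Y</math> each having variance <math>\frac{1}{2}</math> and being uncorrelated:

```python
import math
import random

# Simulate X = sqrt(U) cos(2 pi V), Y = sqrt(U) sin(2 pi V) with
# U ~ Exp(1), V ~ Unif[0,1] independent; fixed seed for reproducibility.
random.seed(0)
n = 200_000
xs, ys = [], []
for _ in range(n):
    u = random.expovariate(1.0)
    v = random.random()
    xs.append(math.sqrt(u) * math.cos(2 * math.pi * v))
    ys.append(math.sqrt(u) * math.sin(2 * math.pi * v))

# E[X] = E[Y] = 0, so raw second moments estimate the variances.
var_x = sum(x * x for x in xs) / n
var_y = sum(y * y for y in ys) / n
cov_xy = sum(x * y for x, y in zip(xs, ys)) / n
print(var_x, var_y, cov_xy)  # should be close to 0.5, 0.5 and 0.0
```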
{{alert-info |
We write <math>X\stackrel{law}{=}Y</math> to say that <math>\p_X=\p_Y</math>. Thus in the example above we would have
<math display="block">
X\stackrel{law}{=}Y\sim\mathcal{N}\left(0,\frac{1}{2}\right).
</math>
}}
====Important facts====
Let <math>X_1,...,X_n</math> be <math>n</math> real valued r.v.'s. Then the following are equivalent.
<ul style{{=}}"list-style-type:lower-roman"><li><math>X_1,...,X_n</math> are independent.
</li>
<li>For <math>X=(X_1,...,X_n)\in\R^n</math> we have
<math display="block">
\Phi_X(\xi_1,...,\xi_n)=\prod_{i=1}^n\Phi_{X_i}(\xi_i).
</math>
</li>
<li>For all <math>a_1,...,a_n\in\R</math>, we have
<math display="block">
\p[X_1\leq a_1,...,X_n\leq a_n]=\prod_{i=1}^n\p[X_i\leq a_i].
</math>
</li>
<li>If <math>f_1,...,f_n:\R\to\R_+</math> are continuous maps with compact support, then
<math display="block">
\E\left[\prod_{i=1}^nf_i(X_i)\right]=\prod_{i=1}^n\E[f_i(X_i)].
</math>
</li>
</ul>
\begin{proof}
First we show <math>(i)\Longrightarrow (ii)</math>. By definition and the independence of <math>X_1,...,X_n</math>, we get
<math display="block">
\Phi_X(\xi_1,...,\xi_n)=\E\left[e^{i(\xi_1X_1+...+\xi_nX_n)}\right]=\E\left[e^{i\xi_1X_1}\dotsm e^{i\xi_nX_n}\right]=\prod_{i=1}^n\E[e^{i\xi_i X_i}]=\prod_{i=1}^n\Phi_{X_i}(\xi_i),
</math>
where we used that the map <math>t\mapsto e^{it}</math> is measurable and bounded.
Next we show <math>(ii)\Longrightarrow (i)</math>. Recall that the characteristic function determines the law, i.e. <math>\p_X=\p_Y</math> if and only if <math>\Phi_X=\Phi_Y</math>.
Now if <math>\Phi_X(\xi_1,...,\xi_n)=\prod_{i=1}^n\Phi_{X_i}(\xi_i)</math>, we note that <math>\prod_{i=1}^n\Phi_{X_i}(\xi_i)</math> is the characteristic function of the product measure <math>\p_{X_1}\otimes\dotsm \otimes\p_{X_n}</math>. From injectivity it follows that <math>\p_{(X_1,...,X_n)}=\p_{X_1}\otimes\dotsm\otimes\p_{X_n}</math>, which by the [[#thm7 |theorem]] above implies that <math>X_1,...,X_n</math> are independent.
\end{proof}
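The equivalence <math>(i)\Longleftrightarrow(ii)</math> can be checked exactly for finite discrete laws, since the characteristic function of a finite law is a finite sum; the toy laws and test points below are our own:

```python
import cmath
from itertools import product

# Phi_{(X,Y)}(s,t) = Phi_X(s) Phi_Y(t) for independent X, Y, checked by
# enumerating the product law of two small discrete distributions.
law_X = {0: 0.25, 1: 0.75}
law_Y = {-1: 0.5, 2: 0.5}

def phi(law, t):
    # characteristic function of a finite discrete law
    return sum(p * cmath.exp(1j * t * x) for x, p in law.items())

def phi_joint(s, t):
    return sum(px * py * cmath.exp(1j * (s * x + t * y))
               for (x, px), (y, py) in product(law_X.items(), law_Y.items()))

s, t = 0.7, -1.3
lhs = phi_joint(s, t)
rhs = phi(law_X, s) * phi(law_Y, t)
print(abs(lhs - rhs) < 1e-12)  # True
```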
{{proofcard|Proposition|prop-2|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>\B_1,...,\B_n\subset\A</math> be sub <math>\sigma</math>-Algebras of <math>\A</math>. For every <math>i\in\{1,...,n\}</math>, let <math>\mathcal{C}_i\subset\B_i</math> be a family of subsets of <math>\Omega</math> such that <math>\mathcal{C}_i</math> is stable under finite intersections and <math>\sigma(\mathcal{C}_i)=\B_i</math>. Assume that for all <math>C_i\in\mathcal{C}_i</math> with <math>i\in\{1,...,n\}</math> we have
<math display="block">
\p\left[\bigcap_{i=1}^nC_i\right]=\prod_{i=1}^n\p[C_i].
</math>
Then <math>\B_1,...,\B_n</math> are independent <math>\sigma</math>-Algebras.
|Let us fix <math>C_2\in \mathcal{C}_2,...,C_n\in\mathcal{C}_n</math> and define
<math display="block">
M_1:=\left\{B_1\in\B_1\mid \p[B_1\cap C_2\cap\dotsm\cap C_n]=\p[B_1]\p[C_2]\dotsm \p[C_n]\right\}.
</math>
Now since <math>\mathcal{C}_1\subset M_1</math> and <math>M_1</math> is a monotone class, we get <math>\sigma(\mathcal{C}_1)=\B_1\subset M_1</math> and thus <math>\B_1=M_1</math>. Let now <math>B_1\in\B_1,</math> <math>C_3\in\mathcal{C}_3,...,C_n\in\mathcal{C}_n</math> and define
<math display="block">
M_2:=\{B_2\in\B_2\mid \p[B_2\cap B_1\cap C_3\cap\dotsm\cap C_n]=\p[B_2]\p[B_1]\p[C_3]\dotsm\p[C_n]\}.
</math>
Again, since <math>\mathcal{C}_2\subset M_2</math>, we get <math>\sigma(\mathcal{C}_2)=\B_2\subset M_2</math> and thus <math>\B_2=M_2</math>. By induction we complete the proof.}}
'''Consequence:''' Let <math>\B_1,...,\B_n</math> be <math>n</math> independent <math>\sigma</math>-Algebras and let <math>m_0=0 < m_1 < ... < m_p=n</math>. Then the <math>\sigma</math>-Algebras
<math display="block">
\begin{align*}
\mathcal{D}_1&=\B_1\lor\dotsm\lor\B_{m_1}=\sigma(\B_1,...,\B_{m_1})=\sigma\left(\bigcup_{k=1}^{m_1}\B_k\right)\\
\mathcal{D}_2&=\B_{m_1+1}\lor\dotsm\lor\B_{m_2}\\
\vdots\\
\mathcal{D}_p&=\B_{m_{p-1}+1}\lor\dotsm\lor\B_{m_p}
\end{align*}
</math>
are also independent. Indeed, we can apply the previous proposition to the classes of sets
<math display="block">
\mathcal{C}_j=\{B_{m_{j-1}+1}\cap\dotsm\cap B_{m_j}\mid B_i\in\B_i, i\in\{m_{j-1}+1,...,m_j\}\},
</math>
which are stable under finite intersections and satisfy <math>\sigma(\mathcal{C}_j)=\mathcal{D}_j</math>. In particular, if <math>X_1,...,X_n</math> are independent r.v.'s, then
<math display="block">
\begin{align*}
Y_1&=(X_1,...,X_{m_1})\\
\vdots\\
Y_p&=(X_{m_{p-1}+1},...,X_{m_p})
\end{align*}
</math>
are also independent.
'''Example'''
Let <math>X_1,...,X_4</math> be real valued independent r.v.'s. Then <math>Z_1=X_1X_3</math> and <math>Z_2=X_2^3+X_4</math> are independent, since <math>Z_1</math> is <math>\sigma(X_1,X_3)</math>-measurable, <math>Z_2</math> is <math>\sigma(X_2,X_4)</math>-measurable, and by the above <math>\sigma(X_1,X_3)</math> and <math>\sigma(X_2,X_4)</math> are independent. Recall here that for <math>X:\Omega\to\R</math>, a r.v. <math>Y</math> is <math>\sigma(X)</math>-measurable if and only if <math>Y=f(X)</math> with <math>f</math> a measurable map, and similarly <math>Y</math> is <math>\sigma(X_1,...,X_n)</math>-measurable if and only if <math>Y=f(X_1,...,X_n)</math> for some measurable <math>f</math>.
{{proofcard|Proposition (Independence for an infinite family)|prop-3|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>(\B_i)_{i\in I}</math> be an infinite family of sub <math>\sigma</math>-Algebras of <math>\A</math>. We say that the family <math>(\B_i)_{i\in I}</math> is independent if for every finite subset <math>\{i_1,...,i_p\}\subset I</math>, the <math>\sigma</math>-Algebras <math>\B_{i_1},...,\B_{i_p}</math> are independent. If <math>(X_i)_{i\in I}</math> is a family of r.v.'s, we say that they are independent if <math>(\sigma(X_i))_{i\in I}</math> is independent.|}}
{{proofcard|Proposition|prop-4|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>(X_n)_{n\geq 1}</math> be a sequence of independent r.v.'s. Then for all <math>p\in\N</math> we get that <math>\B_1=\sigma(X_1,...,X_p)</math> and <math>\B_2=\sigma(X_{p+1},X_{p+2},...)</math> are independent.
|Apply the previous proposition to <math>\mathcal{C}_1=\sigma(X_1,...,X_p)</math> and
<math>\mathcal{C}_2=\bigcup_{k\geq p+1}\sigma(X_{p+1},...,X_k)</math>, which is stable under finite intersections and generates <math>\B_2</math>.}}
===The Borel-Cantelli Lemma===
Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>(A_n)_{n\in\N}</math> be a sequence of events in <math>\A</math>. Recall that we can write
<math display="block">
\limsup_{n\to \infty} A_n=\bigcap_{n=0}^\infty\left(\bigcup_{k=n}^\infty A_k\right)\quad\text{and}\quad\liminf_{n\to \infty} A_n=\bigcup_{n=0}^\infty\left(\bigcap_{k=n}^\infty A_k\right).
</math>
Moreover, both are again measurable sets. Now <math>\omega\in\limsup_n A_n</math> means that for all <math>n\geq 0</math> there exists a <math>k\geq n</math> with <math>\omega\in A_k</math>, i.e. <math>\omega</math> lies in infinitely many <math>A_k</math>'s. Similarly, <math>\omega\in\liminf_n A_n</math> means that there exists an <math>n\geq 0</math> such that for all <math>k\geq n</math> we have <math>\omega\in A_k</math>, i.e. <math>\omega</math> lies in all but finitely many <math>A_k</math>'s. In particular, <math>\liminf_nA_n\subset \limsup_nA_n</math>.
{{proofcard|Lemma (Borel-Cantelli)|lem-1|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>(A_n)_{n\in\N}\subset\A</math> be a family of measurable sets.
<ul style{{=}}"list-style-type:lower-roman"><li>If <math>\sum_{n\geq 1}\p[A_n] < \infty</math>, then
<math display="block">
\p\left[\limsup_{n\to\infty} A_n\right]=0,
</math>
which means that the set <math>\{n\in\N\mid \omega\in A_n\}</math> is a.s. finite.
</li>
<li>If <math>\sum_{n\geq 1}\p[A_n]=\infty</math>, and if the events <math>(A_n)_{n\in\N}</math> are independent, then
<math display="block">
\p\left[\limsup_{n\to\infty} A_n\right]=1,
</math>
which means that the set <math>\{n\in\N\mid \omega\in A_n\}</math> is a.s. infinite.
</li>
</ul>
|We need to show both points.
<ul style{{=}}"list-style-type:lower-roman"><li>If <math>\sum_{n\geq 1}\p[A_n] < \infty,</math> then, by Fubini, we get
<math display="block">
\E\left[\sum_{n\geq 1}\one_{A_n}\right]=\sum_{n\geq 1}\p[A_n] < \infty,
</math>
which implies that a.s. <math>\sum_{n\geq 1}\one_{A_n} < \infty</math>, i.e. a.s. <math>\one_{A_n}\not=0</math> for only finitely many <math>n</math>.
</li>
<li>Fix <math>n_0\in\N</math> and note that for all <math>n\geq n_0</math>, by independence,
<math display="block">
\p\left[\bigcap_{k=n_0}^nA_k^C\right]=\prod_{k=n_0}^n\p[A_k^C]=\prod_{k=n_0}^n(1-\p[A_k])\leq \exp\left(-\sum_{k=n_0}^n\p[A_k]\right),
</math>
where we used <math>1-x\leq e^{-x}</math>. Since
<math display="block">
\sum_{n\geq 1}\p[A_n]=\infty,
</math>
the right hand side tends to <math>0</math> as <math>n\to\infty</math>, and thus
<math display="block">
\p\left[\bigcap_{k=n_0}^\infty A_k^C\right]=0.
</math>
Since this is true for every <math>n_0</math>, we have that
<math display="block">
\p\left[\bigcup_{n_0=0}^\infty\bigcap_{k=n_0}^\infty A_k^C\right]\leq \sum_{n_0\geq 0}\p\left[\bigcap_{k=n_0}^\infty A_k^C\right]=0.
</math>
Taking complements, we get
<math display="block">
\p\left[\bigcap_{n=0}^\infty\bigcup_{k=n}^\infty A_k\right]=\p\left[\limsup_{n\to\infty} A_n\right]=1.
</math>
</li>
</ul>}}
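Both halves of the lemma can be illustrated with a seeded simulation of independent events <math>A_n</math> with <math>\p[A_n]=p_n</math>; the sample size and the loose bounds in the comments are our own heuristics, not part of the notes:

```python
import random

# Independent events A_n with P[A_n] = p_n: for p_n = 1/n^2 the series
# converges and only finitely many A_n should occur; for p_n = 1/n it
# diverges and occurrences should keep accumulating.
random.seed(1)
N = 100_000
hits_summable = sum(random.random() < 1 / n**2 for n in range(1, N + 1))
hits_divergent = sum(random.random() < 1 / n for n in range(1, N + 1))
print(hits_summable, hits_divergent)
# hits_summable stays small (its expectation is sum 1/n^2 ~ pi^2/6),
# while hits_divergent grows like log N.
```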
====Application 1====
There does not exist a probability measure on <math>\N</math> such that the probability of the set of multiples of an integer <math>n</math> is <math>\frac{1}{n}</math> for all <math>n\geq 1</math>. Indeed, assume that such a probability measure <math>\p</math> exists. Let <math>\tilde{p}</math> denote the set of prime numbers, and for <math>p\in\tilde{p}</math> let <math>A_p=p\N</math> be the set of all multiples of <math>p</math>. We first show that the sets <math>(A_p)_{p\in\tilde{p}}</math> are independent. Indeed, let <math>p_1,...,p_n\in\tilde{p}</math> be distinct. Since distinct primes are pairwise coprime, we have
<math display="block">
\p[p_1\N\cap\dotsm\cap p_n\N]=\p[p_1\dotsm p_n\N]=\frac{1}{p_1\dotsm p_n}=\p[p_1\N]\dotsm\p[p_n\N].
</math>
Moreover, it is known that
<math display="block">
\sum_{p\in\tilde{p}}\p[p\N]=\sum_{p\in\tilde{p}}\frac{1}{p}=\infty.
</math>
The second part of the Borel-Cantelli lemma would then imply that a.s. an integer belongs to infinitely many <math>A_p</math>'s, i.e. is divisible by infinitely many distinct prime numbers, which is absurd.
====Application 2====
Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>X</math> be an exponential r.v. with parameter <math>\lambda=1</math>, so that <math>X</math> has density <math>e^{-x}\one_{\R_+}(x)</math>. Consider a sequence <math>(X_n)_{n\geq 1}</math> of independent r.v.'s with the same distribution as <math>X</math>, i.e. <math>X_n\sim X</math> for all <math>n\geq 1</math>. We claim that <math>\limsup_n \frac{X_n}{\log(n)}=1</math> a.s., i.e. there exists an <math>N\in\A</math> such that <math>\p[N]=0</math> and for <math>\omega\not\in N</math> we get
<math display="block">
\limsup_{n\to\infty} \frac{X_n(\omega)}{\log(n)}=1.
</math>
First we compute, for <math>t\geq 0</math>, the probability
<math display="block">
\p[X > t]=\int_t^\infty e^{-x}dx=e^{-t}.
</math>
Now let <math>\epsilon > 0</math> and consider the sets <math>A_n=\{X_n > (1+\epsilon)\log(n)\}</math> and <math>B_n=\{X_n > \log(n)\}</math>. Then
<math display="block">
\p[A_n]=\p[X_n > (1+\epsilon)\log(n)]=\p[X > (1+\epsilon)\log(n)]=e^{-(1+\epsilon)\log(n)}=\frac{1}{n^{1+\epsilon}}.
</math>
This implies that
<math display="block">
\sum_{n\geq 1}\p[A_n] < \infty.
</math>
With the Borel-Cantelli lemma we get that <math>\p\left[\limsup_{n\to\infty} A_n\right]=0</math>. Let us define
<math display="block">
N_{\epsilon}=\limsup_{n\to\infty} A_n.
</math>
Then <math>\p[N_\epsilon]=0</math>, and for <math>\omega\not\in N_{\epsilon}</math> there exists an <math>n_0(\omega)</math> such that for all <math>n\geq n_0</math> we have
<math display="block">
X_n(\omega)\leq (1+\epsilon)\log(n),
</math>
and thus for <math>\omega\not\in N_{\epsilon}</math> we get <math>\limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}\leq 1+\epsilon</math>. Moreover, let
<math display="block">
N'=\bigcup_{\epsilon\in \Q_+}N_{\epsilon}.
</math>
Then <math>\p[N']\leq \sum_{\epsilon\in\Q_+}\p[N_{\epsilon}]=0</math>, and for <math>\omega\not\in N'</math> we get
<math display="block">
\limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}\leq 1.
</math>
Now we note that the <math>B_n</math>'s are independent, since <math>B_n\in\sigma(X_n)</math> and the <math>X_n</math>'s are independent. Moreover, for <math>n\geq 1</math>,
<math display="block">
\p[B_n]=\p[X_n > \log(n)]=\p[X > \log(n)]=\frac{1}{n},
</math>
which gives that
<math display="block">
\sum_{n\geq 1}\p[B_n]=\infty.
</math>
Now we can use the second part of the Borel-Cantelli lemma to get
<math display="block">
\p\left[\limsup_{n\to\infty} B_n\right]=1.
</math>
If we denote <math>N''=\left(\limsup_{n\to\infty} B_n\right)^C</math>, then for <math>\omega\not\in N''</math> we get that <math>X_n(\omega) > \log(n)</math> for infinitely many <math>n</math>. So it follows that for <math>\omega\not\in N''</math> we have
<math display="block">
\limsup_{n\to\infty}\frac{X_n(\omega)}{\log(n)}\geq 1.
</math>
Finally, take <math>N=N'\cup N''</math> to obtain <math>\p[N]=0</math>. Thus for <math>\omega\not\in N</math> we get
<math display="block">
\limsup_{n\to\infty} \frac{X_n(\omega)}{\log(n)}=1.
</math>
===Sums of independent Random Variables===
Let us first define the convolution of two probability measures. If <math>\mu</math> and <math>\nu</math> are two probability measures on <math>\R^d</math>, we denote by <math>\mu*\nu</math> the image measure of <math>\mu\otimes\nu</math> under the map
<math display="block">
\R^d\times\R^d\to\R^d,\quad(x,y)\mapsto x+y.
</math>
Consequently, for all measurable maps <math>\varphi:\R^d\to \R_+</math>, we have
<math display="block">
\int_{\R^d}\varphi(z)(\mu*\nu)(dz)=\iint_{\R^d\times\R^d}\varphi(x+y)\mu(dx)\nu(dy).
</math>
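For discrete laws the convolution is just a sum over the ways to split each value. As a concrete sketch (the helper name <code>convolve</code> is ours), the law of the sum of two independent fair dice is the convolution of the uniform law on <math>\{1,...,6\}</math> with itself:

```python
from itertools import product

def convolve(mu, nu):
    """Convolution of two discrete probability measures on the integers,
    each given as a dict {value: probability}."""
    out = {}
    for (x, px), (y, py) in product(mu.items(), nu.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

die = {k: 1.0 / 6.0 for k in range(1, 7)}
two_dice = convolve(die, die)  # law of the sum of two independent fair dice

print(two_dice[7])  # 6/36 = 1/6, the most likely total
```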
{{proofcard|Proposition|prop-5|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>X</math> and <math>Y</math> be two independent r.v.'s with values in <math>\R^d</math>. Then the following hold.
<ul style{{=}}"list-style-type:lower-roman"><li>The law of <math>X+Y</math> is given by <math>\p_X*\p_Y</math>. In particular if <math>X</math> has density <math>f</math> and <math>Y</math> has density <math>g</math>, then <math>X+Y</math> has density <math>f*g</math>, where <math>*</math> denotes the convolution product, which is given by
<math display="block">
(f*g)(\xi)=\int_{\R^d} f(x)g(\xi-x)dx.
</math>
</li>
<li>
<math>\Phi_{X+Y}(\xi)=\Phi_X(\xi)\Phi_Y(\xi).</math>
</li>
<li>If <math>X</math> and <math>Y</math> are in <math>L^2(\Omega,\A,\p)</math>, we get
<math display="block">
K_{X+Y}=K_X+K_Y.
</math>
In particular when <math>d=1</math>, we obtain
<math display="block">
Var(X+Y)=Var(X)+Var(Y).
</math>
</li>
</ul>
|We need to show all three points.
<ul style{{=}}"list-style-type:lower-roman"><li>If <math>X</math> and <math>Y</math> are independent r.v.'s, then <math>\p_{(X,Y)}=\p_X\otimes\p_Y</math>. Consequently, for all measurable maps <math>\varphi:\R^d\to\R_+</math>, we have
<math display="block">
\begin{multline*}
\E[\varphi(X+Y)]=\iint_{\R^d\times\R^d}\varphi(x+y)\p_{(X,Y)}(dxdy)=\iint_{\R^d\times\R^d}\varphi(x+y)\p_X(dx)\p_{Y}(dy)\\=\int_{\R^d}\varphi(\xi)(\p_X*\p_Y)(d\xi).
\end{multline*}
</math>
Now since <math>X</math> and <math>Y</math> have densities <math>f</math> and <math>g</math> respectively, the substitution <math>\xi=x+y</math> gives
<math display="block">
\E[\varphi(X+Y)]=\iint_{\R^d\times\R^d}\varphi(x+y)f(x)g(y)dxdy=\int_{\R^d}\varphi(\xi)\left(\int_{\R^d} f(x)g(\xi-x)dx\right)d\xi.
</math>
Since this identity is true for all measurable maps <math>\varphi:\R^d\to\R_+</math>, the r.v. <math>Z:=X+Y</math> has density
<math display="block">
h(\xi)=(f*g)(\xi)=\int_{\R^d}f(x)g(\xi-x)dx.
</math>
</li>
<li>By definition of the characteristic function and the independence property, we get
<math display="block">
\Phi_{X+Y}(\xi)=\E\left[e^{i\xi(X+Y)}\right]=\E\left[e^{i\xi X}e^{i\xi Y}\right]=\E\left[e^{i\xi X}\right]\E\left[e^{i\xi Y}\right]=\Phi_X(\xi)\Phi_Y(\xi).
</math>
</li>
<li>If <math>X=(X_1,...,X_d)</math> and <math>Y=(Y_1,...,Y_d)</math> are independent r.v.'s on <math>\R^d</math>, we get that <math>Cov(X_i,Y_j)=0</math> for all <math>1\leq i,j\leq d</math>. By using the bilinearity of the covariance we get that
<math display="block">
Cov(X_i+Y_i,X_j+Y_j)=Cov(X_i,X_j)+Cov(Y_i,Y_j),
</math>
and hence <math>K_{X+Y}=K_X+K_Y</math>. For <math>d=1</math> we get
<math display="block">
\begin{align*}
Var(X+Y)&=\E[((X+Y)-\E[X+Y])^2]=\E[((X-\E[X])+(Y-\E[Y]))^2]\\
&=\underbrace{\E[(X-\E[X])^2]}_{Var(X)}+\underbrace{\E[(Y-\E[Y])^2]}_{Var(Y)}+\underbrace{2\E[(X-\E[X])(Y-\E[Y])]}_{2Cov(X,Y)}.
\end{align*}
</math>
Now since <math>Cov(X,Y)=0</math>, we get the result.
</li>
</ul>}}
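Point (iii) is easy to check empirically. The following Python sketch (our own illustration, using the standard library only) draws independent samples with known variances — Uniform(0,1) has variance <math>\frac{1}{12}</math>, Exp(1) has variance <math>1</math> — and compares the sample variance of the sum with the sum of the sample variances:

```python
import random
import statistics

random.seed(42)

n = 200_000
# Independent samples: X ~ Uniform(0, 1) (Var = 1/12), Y ~ Exp(1) (Var = 1).
xs = [random.random() for _ in range(n)]
ys = [random.expovariate(1.0) for _ in range(n)]
zs = [x + y for x, y in zip(xs, ys)]

var_x = statistics.variance(xs)
var_y = statistics.variance(ys)
var_z = statistics.variance(zs)

# By the proposition, Var(X + Y) = Var(X) + Var(Y) for independent X, Y.
print(var_z, var_x + var_y)  # the two numbers should be close
```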
{{proofcard|Theorem (Weak law of large numbers)|thm-5|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>(X_n)_{n\geq 1}</math> be a sequence of independent r.v.'s with common expectation <math>\mu=\E[X_n]</math> for all <math>n\geq1</math>, and assume <math>\E[(X_n-\mu)^2]\leq C</math> for all <math>n\geq1</math> and for some constant <math>C < \infty</math>. We also write <math>S_n=\sum_{j=1}^nX_j</math> and <math>\tilde X_n=\frac{S_n}{n}</math> for all <math>n\geq 1</math>. Then for all <math>\epsilon > 0</math>
<math display="block">
\p[\vert \tilde X_n-\mu\vert > \epsilon]\xrightarrow{n\to\infty}0.
</math>
Note also that, by linearity,
<math display="block">
\E[\tilde X_n]=\frac{1}{n}\E\left[\sum_{j=1}^nX_j\right]=\frac{1}{n}\,n\mu=\mu.
</math>
|Since the <math>X_j</math> are independent, the cross terms vanish and we note that
<math display="block">
\E[(S_n-n\mu)^2]=\sum_{j=1}^n\E[(X_j-\mu)^2]\leq nC.
</math>
Hence for <math>\epsilon > 0</math> we get by Markov's inequality
<math display="block">
\p[\vert \tilde X_n-\mu\vert > \epsilon]=\p[(S_n-n\mu)^2 > (n\epsilon)^2]\leq \frac{\E[(S_n-n\mu)^2]}{n^2\epsilon^2}\leq \frac{C}{n\epsilon^2}\xrightarrow{n\to\infty}0.
</math>}}
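The Chebyshev-type bound <math>\frac{C}{n\epsilon^2}</math> in the proof can be watched at work numerically. In the Python sketch below (our own illustration; the function name <code>deviation_freq</code> is ours), fair coin flips give <math>\mu=\frac{1}{2}</math> and <math>C=Var(X_j)=\frac{1}{4}</math>, and the empirical deviation frequency falls under the bound as <math>n</math> grows:

```python
import random

random.seed(1)

def deviation_freq(n, eps, trials=2000):
    """Empirical frequency of |sample mean - 1/2| > eps over independent runs
    of n fair coin flips."""
    bad = 0
    for _ in range(trials):
        mean = sum(random.random() < 0.5 for _ in range(n)) / n
        if abs(mean - 0.5) > eps:
            bad += 1
    return bad / trials

eps = 0.05
for n in (100, 400, 1600):
    bound = 0.25 / (n * eps * eps)  # C / (n * eps^2) with C = 1/4
    print(n, deviation_freq(n, eps), min(bound, 1.0))
```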
{{proofcard|Corollary|cor-3|Let <math>(\Omega,\A,\p)</math> be a probability space. Let <math>(A_n)_{n\geq 1}\subset \A</math> be a sequence of independent events with the same probabilities, i.e. <math>\p[A_n]=\p[A_m]</math> for all <math>n,m\geq 1</math>. Then
<math display="block">
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n \one_{A_i}=\p[A_1]\quad a.s.
</math>
|Apply the law of large numbers to <math>X_j=\one_{A_j}</math>. The r.v.'s <math>(X_j)_{j\geq 1}</math> are independent, since the <math>A_j</math> are, and they have the common expectation <math>\E[\one_{A_j}]=\p[A_j]=\p[A_1]</math> with variance bounded by <math>C=\frac{1}{4}</math>. Hence the empirical frequency <math>\frac{1}{n}\sum_{j=1}^n\one_{A_j}</math> converges to <math>\p[A_1]</math>.}}
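The corollary is the familiar "relative frequency converges to probability" statement. A minimal Python sketch (our own illustration, with an arbitrarily chosen <math>\p[A]=0.3</math>):

```python
import random

random.seed(7)

# Empirical frequency of an event A with P[A] = 0.3 over n independent trials,
# illustrating (1/n) * (sum of indicators) -> P[A].
p = 0.3
n = 100_000
hits = sum(random.random() < p for _ in range(n))
freq = hits / n
print(freq)  # close to 0.3
```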
==General references==
{{cite arXiv|last=Moshayedi|first=Nima|year=2020|title=Lectures on Probability Theory|eprint=2010.16280|class=math.PR}}
Latest revision as of 01:53, 8 May 2024