Introduction

[math] \newcommand{\R}{\mathbb{R}} \newcommand{\A}{\mathcal{A}} \newcommand{\B}{\mathcal{B}} \newcommand{\N}{\mathbb{N}} \newcommand{\C}{\mathbb{C}} \newcommand{\Rbar}{\overline{\mathbb{R}}} \newcommand{\Bbar}{\overline{\mathcal{B}}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\E}{\mathbb{E}} \newcommand{\p}{\mathbb{P}} \newcommand{\one}{\mathds{1}} \newcommand{\0}{\mathcal{O}} \newcommand{\mat}{\textnormal{Mat}} \newcommand{\sign}{\textnormal{sign}} \newcommand{\CP}{\mathcal{P}} \newcommand{\CT}{\mathcal{T}} \newcommand{\CY}{\mathcal{Y}} \newcommand{\F}{\mathcal{F}} \newcommand{\mathds}{\mathbb}[/math]

Consider a probability space [math](\Omega,\F,\p)[/math] carrying a sequence of iid r.v.'s [math](X_n)_{n\geq1}[/math] with finite expectation, i.e. [math]\E[\vert X_1\vert]=\dotsm =\E[\vert X_n\vert] \lt \infty[/math]. Limit theorems play a central role for such sequences, as we have seen in Stochastics I. For example, we have seen the strong law of large numbers, which states that

[[math]] \frac{X_1+\dotsm+X_n}{n}\xrightarrow{n\to \infty}\E[X_1] \quad\text{a.s.} [[/math]]
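As a quick numerical illustration, here is a minimal Python sketch (assuming NumPy is available; the choice of Exp(1) variables, with [math]\E[X_1]=1[/math], is just for concreteness) that tracks the running sample means:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# draw iid Exp(1) random variables, so E[X_1] = 1
n = 100_000
samples = rng.exponential(scale=1.0, size=n)

# running sample means (X_1 + ... + X_k) / k
running_means = np.cumsum(samples) / np.arange(1, n + 1)

for k in (10, 100, 1_000, 10_000, 100_000):
    print(f"k = {k:>6}: sample mean = {running_means[k - 1]:.4f}")
# the printed values approach E[X_1] = 1, as the strong law predicts
```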

Notice that [math]\E[X_1]=\dotsm =\E[X_n][/math] since the [math](X_n)_{n\geq1}[/math] are iid. Another very important limit theorem is the central limit theorem (CLT): if, in addition, [math]\E[X_1]=0[/math] and [math]\operatorname{Var}(X_1)=1[/math], then

[[math]] \sqrt{n}\left(\frac{X_1+\dotsm+X_n}{n}\right)=\frac{X_1+\dotsm+X_n}{\sqrt{n}}\xrightarrow[\textnormal{law}]{n\to\infty}\mathcal{N}(0,1). [[/math]]

This means that the distribution of the normalized sum [math]\frac{X_1+\dotsm+X_n}{\sqrt{n}}[/math] converges to a standard Gaussian distribution. Notice that, apart from these normalizing moment assumptions, it does not matter what the common distribution of the [math]X_i[/math] is. In Stochastics I we proved this via characteristic functions: using first the independence and then the identical distribution of the [math]X_i[/math], we obtain

[[math]] \E\left[e^{it\left(\frac{X_1+\dotsm+X_n}{\sqrt{n}}\right)}\right]=\prod_{i=1}^n\E\left[e^{it\frac{X_i}{\sqrt{n}}}\right]=\left(\E\left[e^{it\frac{X_1}{\sqrt{n}}}\right]\right)^n\xrightarrow{n\to\infty}e^{-\frac{t^2}{2}}. [[/math]]

Since the characteristic function of a standard Gaussian is [math]e^{-\frac{t^2}{2}}[/math], and pointwise convergence of characteristic functions implies convergence in law (Lévy's continuity theorem), we get the claim.
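This convergence can also be checked numerically. In the following minimal Python sketch (assuming NumPy; Rademacher variables, i.e. fair [math]\pm1[/math] coin flips with [math]\E[X_1]=0[/math] and [math]\operatorname{Var}(X_1)=1[/math], are chosen for concreteness), the characteristic function is known in closed form, so the [math]n[/math]-th powers can be compared directly with [math]e^{-\frac{t^2}{2}}[/math]:

```python
import numpy as np

# Rademacher variables: P(X = 1) = P(X = -1) = 1/2, so E[X] = 0, Var(X) = 1.
# Their characteristic function is E[exp(itX)] = cos(t), hence
# E[exp(it X_1 / sqrt(n))]^n = cos(t / sqrt(n))^n.
t = 1.5
for n in (10, 100, 1_000, 10_000):
    phi_n = np.cos(t / np.sqrt(n)) ** n
    print(f"n = {n:>5}: phi_n = {phi_n:.6f}")
# the powers converge to the Gaussian characteristic function
print(f"limit exp(-t^2/2) = {np.exp(-t ** 2 / 2):.6f}")
```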

Now the more interesting question is: what kind of dependence structure can one put on a family of r.v.'s [math](Z_n)_{n\geq1}[/math]?

This question will be discussed in detail in these notes and will lead to two very important notions in probability theory:

  • The notion of martingales
  • The notion of Markov chains (ergodicity)

An additional point of view is that the index [math]n[/math] represents time, i.e. one can imagine a stochastic process changing with time (time evolution).

We therefore consider tuples of the form [math](Z_n,\F_n)_{n\geq1}[/math], where [math]\F_n[/math] is a [math]\sigma[/math]-algebra for all [math]n\geq 1[/math]; it is often important to write the tuple down and to emphasize the [math]\F_n[/math]'s. Assume that the space of events observable over all time steps is known and denote it by [math]\F[/math]. Then [math]\F_n\subset \F[/math] for all [math]n[/math], and moreover [math]\F_n\subset \F_{n+1}[/math], since everything that can be observed up to time [math]n[/math] can also be observed up to time [math]n+1[/math]. Such an increasing family of [math]\sigma[/math]-algebras is called a filtration; this notion will be very important throughout.
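As a standard example, model an infinite sequence of coin tosses by [math]\Omega=\{0,1\}^{\N}[/math], let [math]X_n(\omega)=\omega_n[/math] be the outcome of the [math]n[/math]-th toss, and set

[[math]] \F_n=\sigma(X_1,\dotsc,X_n), [[/math]]

the collection of all events determined by the first [math]n[/math] tosses alone. Then clearly [math]\F_n\subset\F_{n+1}\subset\F[/math], so [math](\F_n)_{n\geq1}[/math] is a filtration.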

Another very important concept is that of conditional expectation and conditional distribution, which we will cover in the first part of these notes.

General references

Moshayedi, Nima (2020). "Lectures on Probability Theory". arXiv:2010.16280 [math.PR].