<div class="d-none"><math> | |||
\newcommand{\NA}{{\rm NA}} | |||
\newcommand{\mat}[1]{{\bf#1}} | |||
\newcommand{\exref}[1]{\ref{##1}} | |||
\newcommand{\secstoprocess}{\all} | |||
\newcommand{\NA}{{\rm NA}} | |||
\newcommand{\mathds}{\mathbb}</math></div> | |||
In the previous section, we introduced
the concepts of moments and moment generating functions for discrete random variables. These | |||
concepts have natural analogues for continuous random variables, provided some care is taken | |||
in arguments involving convergence. | |||
===Moments=== | |||
If <math>X</math> is a continuous random variable defined on the probability | |||
space <math>\Omega</math>, with density function <math>f_X</math>, then we define the <math>n</math>th moment | |||
of <math>X</math> by the formula | |||
<math display="block"> | |||
\mu_n = E(X^n) = \int_{-\infty}^{+\infty} x^n f_X(x)\, dx\ , | |||
</math> | |||
provided the integral | |||
<math display="block"> | |||
E(|X|^n) = \int_{-\infty}^{+\infty} |x|^n f_X(x)\, dx
</math> | |||
is finite. Then, just as in the discrete case, we see | |||
that <math>\mu_0 = 1</math>, <math>\mu_1 = \mu</math>, and <math>\mu_2 - \mu_1^2 = \sigma^2</math>. | |||
===Moment Generating Functions=== | |||
Now we define the ''moment generating function'' <math>g(t)</math> for <math>X</math> by the | |||
formula | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
g(t) &=& \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} = \sum_{k = 0}^\infty | |||
\frac{E(X^k) t^k}{k!} \\ | |||
&=& E(e^{tX}) = \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx\ , | |||
\end{eqnarray*} | |||
</math> | |||
provided this series converges. Then, as before, we have | |||
<math display="block"> | |||
\mu_n = g^{(n)}(0)\ . | |||
</math> | |||
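To see why, note that a power series may be differentiated term by term inside its interval of convergence (a standard fact that we assume here). Differentiating <math>n</math> times gives
<math display="block">
g^{(n)}(t) = \sum_{k = n}^\infty \frac{\mu_k t^{k - n}}{(k - n)!}\ ,
</math>
and setting <math>t = 0</math> leaves only the <math>k = n</math> term, which equals <math>\mu_n</math>.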
===Examples=== | |||
<span id="exam 10.3.1"/> | |||
'''Example''' | |||
Let <math>X</math> be a continuous random variable with range <math>[0,1]</math> and density | |||
function <math>f_X(x) = 1</math> for <math>0 \leq x \leq 1</math> (uniform density). Then | |||
<math display="block"> | |||
\mu_n = \int_0^1 x^n\, dx = \frac1{n + 1}\ , | |||
</math> | |||
and | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
g(t) &=& \sum_{k = 0}^\infty \frac{t^k}{(k+1)!}\\ | |||
&=& \frac{e^t - 1}t\ . | |||
\end{eqnarray*} | |||
</math> | |||
Here the series converges for all <math>t</math>. Alternatively, we have | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
g(t) &=& \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx \\ | |||
&=& \int_0^1 e^{tx}\, dx = \frac{e^t - 1}t\ . | |||
\end{eqnarray*} | |||
</math> | |||
Then (by L'Hôpital's rule)
<math display="block"> | |||
\begin{eqnarray*} | |||
\mu_0 &=& g(0) = \lim_{t \to 0} \frac{e^t - 1}t = 1\ , \\ | |||
\mu_1 &=& g'(0) = \lim_{t \to 0} \frac{te^t - e^t + 1}{t^2} = \frac12\ , \\ | |||
\mu_2 &=& g''(0) = \lim_{t \to 0} \frac{t^3e^t - 2t^2e^t + 2te^t - 2t}{t^4} = | |||
\frac13\ . | |||
\end{eqnarray*} | |||
</math> | |||
In particular, we verify that <math>\mu = g'(0) = 1/2</math> and | |||
<math display="block"> | |||
\sigma^2 = g''(0) - (g'(0))^2 = \frac13 - \frac14 = \frac1{12} | |||
</math> | |||
as before (see [[guide:E5be6e0c81#exam 6.18.5 |Example]]). | |||
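The calculations in this example can be checked numerically. The following Python sketch sums the moment series with <math>\mu_k = 1/(k+1)</math> and compares it with the closed form <math>(e^t - 1)/t</math>. The function names and the truncation at 60 terms are illustrative choices of ours, not part of the text.

```python
import math

def uniform_moment(n):
    """n-th moment of the uniform density on [0, 1]: mu_n = 1/(n + 1)."""
    return 1.0 / (n + 1)

def mgf_series(t, terms=60):
    """Partial sum of sum_k mu_k t^k / k! (the truncation length is an assumption)."""
    return sum(uniform_moment(k) * t**k / math.factorial(k) for k in range(terms))

def mgf_closed(t):
    """Closed form (e^t - 1)/t, with the limiting value 1 at t = 0."""
    return 1.0 if t == 0 else math.expm1(t) / t

# The series and the closed form should agree for every t.
for t in (-2.0, 0.5, 3.0):
    print(t, mgf_series(t), mgf_closed(t))

# Variance from the first two moments: mu_2 - mu_1^2 = 1/3 - 1/4 = 1/12.
variance = uniform_moment(2) - uniform_moment(1) ** 2
print(variance)
```

Because the series converges for all <math>t</math>, the two columns should agree to rounding error for any value tried.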
<span id="exam 10.3.2"/> | |||
'''Example''' | |||
Let <math>X</math> have range <math>[\,0,\infty)</math> and density function <math>f_X(x) = \lambda | |||
e^{-\lambda x}</math> (exponential density with parameter <math>\lambda</math>). In this case | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
\mu_n &=& \int_0^\infty x^n \lambda e^{-\lambda x}\, dx = \lambda(-1)^n | |||
\frac{d^n}{d\lambda^n} \int_0^\infty e^{-\lambda x}\, dx \\ | |||
&=& \lambda(-1)^n \frac{d^n}{d\lambda^n} [\frac1\lambda] = \frac{n!} | |||
{\lambda^n}\ , | |||
\end{eqnarray*} | |||
</math> | |||
and | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
g(t) &=& \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} \\ | |||
&=& \sum_{k = 0}^\infty [\frac t\lambda]^k = \frac\lambda{\lambda - t}\ . | |||
\end{eqnarray*} | |||
</math> | |||
Here the series converges only for <math>|t| < \lambda</math>. Alternatively, we have | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
g(t) &=& \int_0^\infty e^{tx} \lambda e^{-\lambda x}\, dx \\ | |||
&=& \left.\frac{\lambda e^{(t - \lambda)x}}{t - \lambda}\right|_0^\infty = | |||
\frac\lambda{\lambda - t}\ . | |||
\end{eqnarray*} | |||
</math> | |||
Now we can verify directly that | |||
<math display="block"> | |||
\mu_n = g^{(n)}(0) = \left.\frac{\lambda n!}{(\lambda - t)^{n + 1}}\right|_{t = | |||
0} = \frac{n!}{\lambda^n}\ . | |||
</math> | |||
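The restriction <math>|t| < \lambda</math> on the series can be seen numerically. The sketch below forms partial sums of the moment series for one value of <math>t</math> inside the interval of convergence and one outside it. The value <math>\lambda = 2</math> and the function names are assumptions made for illustration.

```python
import math

def exp_moment(n, lam):
    """n-th moment of the exponential density: mu_n = n!/lam^n."""
    return math.factorial(n) / lam**n

def mgf_partial_sum(t, lam, terms):
    """Partial sum of sum_k mu_k t^k / k!, which equals sum_k (t/lam)^k."""
    return sum(exp_moment(k, lam) * t**k / math.factorial(k) for k in range(terms))

def mgf_closed(t, lam):
    """Closed form lam/(lam - t), obtained from the integral for t < lam."""
    return lam / (lam - t)

lam = 2.0  # illustrative parameter value
for terms in (5, 20, 80):
    # Inside the interval |t| < lam the partial sums settle down ...
    inside = mgf_partial_sum(1.0, lam, terms)
    # ... while outside it they grow without bound.
    outside = mgf_partial_sum(3.0, lam, terms)
    print(terms, inside, mgf_closed(1.0, lam), outside)
```

For <math>t = 1</math> the partial sums approach <math>\lambda/(\lambda - t) = 2</math>, while for <math>t = 3</math> they increase geometrically.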
<span id="exam 10.3.3"/> | |||
'''Example''' | |||
Let <math>X</math> have range <math>(-\infty,+\infty)</math> and density function | |||
<math display="block"> | |||
f_X(x) = \frac1{\sqrt{2\pi}} e^{-x^2/2} | |||
</math> | |||
(normal density). In this case we have | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
\mu_n &=& \frac1{\sqrt{2\pi}} \int_{-\infty}^{+\infty} x^n e^{-x^2/2}\, dx \\ | |||
&=& \left \{ \begin{array}{ll} | |||
\frac{(2m)!}{2^{m} m!}, & \mbox{if $ n = 2m$,}\cr | |||
0, & \mbox{if $ n = 2m+1$.}\end{array}\right. | |||
\end{eqnarray*} | |||
</math> | |||
(These moments are calculated by integrating once by parts to show that <math>\mu_n | |||
= (n - 1)\mu_{n - 2}</math>, and observing that <math>\mu_0 = 1</math> and <math>\mu_1 = 0</math>.) Hence, | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
g(t) &=& \sum_{n = 0}^\infty \frac{\mu_n t^n}{n!} \\ | |||
&=& \sum_{m = 0}^\infty \frac{t^{2m}}{2^{m} m!} = e^{t^2/2}\ . | |||
\end{eqnarray*} | |||
</math> | |||
This series converges for all values of <math>t</math>. Again we can verify that | |||
<math>g^{(n)}(0) = \mu_n</math>. | |||
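The recursion <math>\mu_n = (n - 1)\mu_{n - 2}</math> and the closed form for the moments can be compared directly. The following sketch, with function names of our own choosing, does this for the first few values of <math>n</math>.

```python
import math

def normal_moment_recursive(n):
    """mu_n from mu_n = (n - 1) mu_{n-2}, with mu_0 = 1 and mu_1 = 0."""
    if n == 0:
        return 1
    if n == 1:
        return 0
    return (n - 1) * normal_moment_recursive(n - 2)

def normal_moment_closed(n):
    """Closed form: (2m)!/(2^m m!) for n = 2m, and 0 for odd n."""
    if n % 2 == 1:
        return 0
    m = n // 2
    # The quotient is an integer, so integer division is exact here.
    return math.factorial(2 * m) // (2**m * math.factorial(m))

for n in range(9):
    print(n, normal_moment_recursive(n), normal_moment_closed(n))
```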
Let <math>X</math> be a normal random variable with parameters <math>\mu</math> and <math>\sigma</math>. It is easy | |||
to show that the moment generating function of <math>X</math> is given by | |||
<math display="block"> | |||
e^{t\mu + (\sigma^2/2)t^2}\ . | |||
</math> | |||
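One way to see this is to write <math>X = \sigma Z + \mu</math>, where <math>Z</math> is a standard normal random variable, and to use the moment generating function <math>e^{t^2/2}</math> of <math>Z</math> found in [[#exam 10.3.3 |Example]]. This short derivation is added here for convenience:
<math display="block">
E(e^{tX}) = E(e^{t(\sigma Z + \mu)}) = e^{t\mu} E(e^{(\sigma t)Z}) = e^{t\mu} e^{(\sigma t)^2/2} = e^{t\mu + (\sigma^2/2)t^2}\ .
</math>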
Now suppose that <math>X</math> and <math>Y</math> are two independent normal random variables with | |||
parameters <math>\mu_1</math>, <math>\sigma_1</math>, and <math>\mu_2</math>, <math>\sigma_2</math>, respectively. Then, | |||
the product of the moment generating functions of <math>X</math> and <math>Y</math> is | |||
<math display="block"> | |||
e^{t(\mu_1 + \mu_2) + ((\sigma_1^2 + \sigma_2^2)/2)t^2}\ . | |||
</math> | |||
This is the moment generating function for a normal random variable with mean | |||
<math>\mu_1 + \mu_2</math> and variance <math>\sigma_1^2 + \sigma_2^2</math>. Thus, the sum | |||
of two independent normal random variables is again normal. (This was proved | |||
for the special case that both summands are standard normal in Example 7.8.)
In general, the series defining <math>g(t)</math> will not converge for all <math>t</math>. But in | |||
the important special case where <math>X</math> is bounded (i.e., where the range of <math>X</math> | |||
is contained in a finite interval), we can show that the series does converge | |||
for all <math>t</math>. | |||
{{proofcard|Theorem|thm_10.4|Suppose <math>X</math> is a continuous random variable with range contained in the | |||
interval <math>[-M,M]</math>. Then the series | |||
<math display="block"> | |||
g(t) = \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} | |||
</math> | |||
converges for all <math>t</math> to an infinitely differentiable function <math>g(t)</math>, and | |||
<math>g^{(n)}(0) = \mu_n</math>.|We have
<math display="block"> | |||
\mu_k = \int_{-M}^{+M} x^k f_X(x)\, dx\ , | |||
</math> | |||
so | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
|\mu_k| &\leq& \int_{-M}^{+M} |x|^k f_X(x)\, dx \\ | |||
&\leq& M^k \int_{-M}^{+M} f_X(x)\, dx = M^k\ . | |||
\end{eqnarray*} | |||
</math> | |||
Hence, for all <math>N</math> we have | |||
<math display="block"> | |||
\sum_{k = 0}^N \left|\frac{\mu_k t^k}{k!}\right| \leq \sum_{k = 0}^N | |||
\frac{(M|t|)^k}{k!} \leq e^{M|t|}\ , | |||
</math> | |||
which shows that the power series converges for all <math>t</math>. The sum
of a convergent power series is infinitely differentiable inside its interval of convergence and may be differentiated term by term, which gives <math>g^{(n)}(0) = \mu_n</math>.}}
===Moment Problem=== | |||
{{proofcard|Theorem|thm_10.5|If <math>X</math> is a bounded random variable, then the moment generating function | |||
<math>g_X(t)</math> of <math>X</math> determines the density function <math>f_X(x)</math> uniquely.
''Sketch of the Proof.'' | |||
We know that | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
g_X(t) &=& \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} \\ | |||
&=& \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx\ .
\end{eqnarray*} | |||
</math> | |||
If we replace <math>t</math> by <math>i\tau</math>, where <math>\tau</math> is real and <math>i = \sqrt{-1}</math>, then | |||
the series converges for all <math>\tau</math>, and we can define the function | |||
<math display="block"> | |||
k_X(\tau) = g_X(i\tau) = \int_{-\infty}^{+\infty} e^{i\tau x} f_X(x)\, dx\ . | |||
</math> | |||
The function <math>k_X(\tau)</math> is called the ''characteristic function'' of <math>X</math>, and is defined by the above equation even when the series for <math>g_X</math> does not | |||
converge. This equation says that <math>k_X</math> is the ''Fourier transform'' of <math>f_X</math>. It is known that the Fourier transform has an inverse, given by the | |||
formula | |||
<math display="block"> | |||
f_X(x) = \frac1{2\pi} \int_{-\infty}^{+\infty} e^{-i\tau x} k_X(\tau)\, d\tau\ , | |||
</math> | |||
suitably interpreted.<ref group="Notes" >H. Dym and H. P. McKean, ''Fourier Series and | |||
Integrals'' (New York: Academic Press, 1972).</ref> Here we see that the | |||
characteristic function <math>k_X</math>, and hence the moment generating function <math>g_X</math>, | |||
determines the density function <math>f_X</math> uniquely under our hypotheses.|}} | |||
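The inversion formula can be tried numerically. As an illustration we use the standard normal density, whose characteristic function is <math>g(i\tau) = e^{-\tau^2/2}</math> by [[#exam 10.3.3 |Example]]. This variable is not bounded, so it lies outside the hypotheses of the theorem, and the sketch below is only a plausibility check. The integration limits, step count, and function names are assumptions of ours.

```python
import math

def char_fn_std_normal(tau):
    """Characteristic function of the standard normal: exp(-tau^2 / 2)."""
    return math.exp(-tau * tau / 2.0)

def invert_char_fn(k, x, limit=12.0, steps=4800):
    """Approximate (1/2pi) * integral of e^{-i tau x} k(tau) d tau by the
    trapezoid rule on [-limit, limit]. For a real, even k the imaginary
    part cancels, so only cos(tau x) k(tau) is integrated."""
    h = 2.0 * limit / steps
    total = 0.0
    for j in range(steps + 1):
        tau = -limit + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.cos(tau * x) * k(tau)
    return total * h / (2.0 * math.pi)

def std_normal_density(x):
    """The density (1/sqrt(2 pi)) e^{-x^2/2} that the inversion should recover."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

for x in (0.0, 0.5, 2.0):
    print(x, invert_char_fn(char_fn_std_normal, x), std_normal_density(x))
```

The two printed columns should agree closely, which is what the inversion formula predicts for this density.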
===Sketch of the Proof of the Central Limit Theorem=== | |||
With the above result in mind, we can now sketch a proof of the Central Limit Theorem | |||
for bounded continuous random variables (see [[guide:452fd94468#thm 9.4.7 |Theorem]]). To this end, | |||
let <math>X</math> be a continuous random variable with density function <math>f_X</math>, mean <math>\mu | |||
= 0</math> and variance <math>\sigma^2 = 1</math>, and moment generating function <math>g(t)</math> defined | |||
by its series for all <math>t</math>. Let <math>X_1</math>, <math>X_2</math>, <math>\ldots</math>, <math>X_n</math> be an independent
trials process with each <math>X_i</math> having density <math>f_X</math>, and let <math>S_n = X_1 + X_2 | |||
+\cdots+ X_n</math>, and <math>S_n^* = (S_n - n\mu)/\sqrt{n\sigma^2} = S_n/\sqrt n</math>. Then | |||
each <math>X_i</math> has moment generating function <math>g(t)</math>, and since the <math>X_i</math> are | |||
independent, the sum <math>S_n</math>, just as in the discrete case (see Section 10.1),
has moment generating function | |||
<math display="block"> | |||
g_n(t) = (g(t))^n\ , | |||
</math> | |||
and the standardized sum <math>S_n^*</math> has moment generating function | |||
<math display="block"> | |||
g_n^*(t) = \left(g\left(\frac t{\sqrt n}\right)\right)^n\ . | |||
</math> | |||
We now show that, as <math>n \to \infty</math>, <math>g_n^*(t) \to e^{t^2/2}</math>, where | |||
<math>e^{t^2/2}</math> is the moment generating function of the normal density <math>n(x) = | |||
(1/\sqrt{2\pi}) e^{-x^2/2}</math> (see [[#exam 10.3.3 |Example]]). | |||
To show this, we set <math>u(t) = \log g(t)</math>, and | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
u_n^*(t) &=& \log g_n^*(t) \\ | |||
&=& n\log g\left(\frac t{\sqrt n}\right) = nu\left(\frac t{\sqrt n}\right)\ , | |||
\end{eqnarray*} | |||
</math> | |||
and show that <math>u_n^*(t) \to t^2/2</math> as <math>n \to \infty</math>. First we note that | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
u(0) &=& \log g(0) = 0\ , \\
u'(0) &=& \frac{g'(0)}{g(0)} = \frac{\mu_1}1 = 0\ , \\ | |||
u''(0) &=& \frac{g''(0)g(0) - (g'(0))^2}{(g(0))^2} \\ | |||
&=& \frac{\mu_2 - \mu_1^2}1 = \sigma^2 = 1\ . | |||
\end{eqnarray*} | |||
</math> | |||
Now by using L'Hôpital's rule twice, we get
<math display="block"> | |||
\begin{eqnarray*} | |||
\lim_{n \to \infty} u_n^*(t) &=& \lim_{s \to \infty} \frac{u(t/\sqrt s)}{s^{-1}}\\ | |||
&=& \lim_{s \to \infty} \frac{u'(t/\sqrt s) t}{2s^{-1/2}} \\ | |||
&=& \lim_{s \to \infty} u''\left(\frac t{\sqrt s}\right) \frac{t^2}2 = \sigma^2 | |||
\frac{t^2}2 = \frac{t^2}2\ . | |||
\end{eqnarray*} | |||
</math> | |||
Hence, <math>g_n^*(t) \to e^{t^2/2}</math> as <math>n \to \infty</math>. Now to complete the proof | |||
of the Central Limit Theorem, we must show that if <math>g_n^*(t) \to e^{t^2/2}</math>, | |||
then under our hypotheses the distribution functions <math>F_n^*(x)</math> of the <math>S_n^*</math> | |||
must converge to the distribution function <math>F_N^*(x)</math> of the normal | |||
variable <math>N</math>; that is, that | |||
<math display="block"> | |||
F_n^*(a) = P(S_n^* \leq a) \to \frac1{\sqrt{2\pi}} \int_{-\infty}^a | |||
e^{-x^2/2}\, dx\ , | |||
</math> | |||
and furthermore, that the density functions <math>f_n^*(x)</math> of the <math>S_n^*</math> must | |||
converge to the density function for <math>N</math>; that is, that | |||
<math display="block"> | |||
f_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ , | |||
</math> | |||
as <math>n \rightarrow \infty</math>. | |||
Since the densities, and hence the distributions, of the <math>S_n^*</math> are uniquely | |||
determined by their moment generating functions under our hypotheses, these | |||
conclusions are certainly plausible, but their proofs involve a detailed | |||
examination of characteristic functions and Fourier transforms, and we shall | |||
not attempt them here. | |||
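The limit <math>u_n^*(t) \to t^2/2</math> obtained above can be observed numerically for a particular bounded density. As an assumed example we take the uniform density on <math>[-\sqrt 3, \sqrt 3]</math>, which has mean 0 and variance 1 and moment generating function <math>\sinh(\sqrt 3\, t)/(\sqrt 3\, t)</math>. The names in the sketch are ours.

```python
import math

A = math.sqrt(3.0)  # uniform density on [-A, A] has mean 0 and variance A^2/3 = 1

def g(t):
    """MGF of the uniform density on [-A, A]: sinh(A t)/(A t), with g(0) = 1."""
    return 1.0 if t == 0 else math.sinh(A * t) / (A * t)

def u_star(t, n):
    """u_n^*(t) = n * log g(t / sqrt(n))."""
    return n * math.log(g(t / math.sqrt(n)))

t = 1.5  # illustrative value of t
for n in (1, 10, 100, 10_000):
    # The second column should approach the third, t^2/2, as n grows.
    print(n, u_star(t, n), t * t / 2.0)
```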
In the same way, we can prove the Central Limit Theorem for bounded discrete | |||
random variables with integer values (see [[guide:4add108640#thm 9.3.6 |Theorem]]). Let <math>X</math> be a | |||
discrete random variable with density function <math>p(j)</math>, mean <math>\mu = 0</math>, variance | |||
<math>\sigma^2 = 1</math>, and moment generating function <math>g(t)</math>, and let <math>X_1</math>, <math>X_2</math>, | |||
<math>\ldots</math>, <math>X_n</math> form an independent trials process with common density <math>p</math>. Let
<math>S_n = X_1 + X_2 +\cdots+ X_n</math> and <math>S_n^* = S_n/\sqrt n</math>, with densities | |||
<math>p_n</math> and <math>p_n^*</math>, and moment generating functions <math>g_n(t)</math> and | |||
<math>g_n^*(t) = \left(g(\frac t{\sqrt n})\right)^n.</math> | |||
Then we have | |||
<math display="block"> | |||
g_n^*(t) \to e^{t^2/2}\ , | |||
</math> | |||
just as in the continuous case, and this implies in the same way that the | |||
distribution functions <math>F_n^*(x)</math> converge to the normal distribution; that is, | |||
that | |||
<math display="block"> | |||
F_n^*(a) = P(S_n^* \leq a) \to \frac1{\sqrt{2\pi}} \int_{-\infty}^a | |||
e^{-x^2/2}\, dx\ , | |||
</math> | |||
as <math>n \rightarrow \infty</math>. | |||
The corresponding statement about the densities <math>p_n^*</math>, however,
requires a little extra care (see [[guide:4add108640#thm 9.3.5 |Theorem]]). The trouble arises | |||
because the distribution <math>p(x)</math> is not defined for all <math>x</math>, but only for | |||
integer <math>x</math>. It follows that the distribution <math>p_n^*(x)</math> is defined only for <math>x</math> of | |||
the form <math>j/\sqrt n</math>, and these values change as <math>n</math> changes. | |||
We can fix this, however, by introducing the function <math>\bar p(x)</math>, defined | |||
by the formula | |||
<math display="block"> | |||
\bar p(x) = \left \{ \begin{array}{ll} | |||
p(j), & \mbox{if $j - 1/2 \leq x < j + 1/2$,} \cr | |||
0\ , & \mbox{otherwise}.\end{array}\right. | |||
</math> | |||
Then <math>\bar p(x)</math> is defined for all <math>x</math>, <math>\bar p(j) = p(j)</math>, and the | |||
graph of <math>\bar p(x)</math> is the step function for the distribution <math>p(j)</math> (see Figure 3
of Section 9.1).
In the same way we introduce the step functions <math>\bar p_n(x)</math> and
<math>\bar p_n^*(x)</math> associated with the distributions <math>p_n</math> and <math>p_n^*</math>, and their | |||
moment generating functions <math>\bar g_n(t)</math> and <math>\bar g_n^*(t)</math>. If we | |||
can show that <math>\bar g_n^*(t) \to e^{t^2/2}</math>, then we can conclude that | |||
<math display="block"> | |||
\bar p_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ ,
</math> | |||
as <math>n \rightarrow \infty</math>, for all <math>x</math>, a conclusion strongly suggested by | |||
Figure 9.2.
Now <math>\bar g(t)</math> is given by | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
\bar g(t) &=& \int_{-\infty}^{+\infty} e^{tx} \bar p(x)\, dx \\ | |||
&=& \sum_{j = -N}^{+N} \int_{j - 1/2}^{j + 1/2} e^{tx} p(j)\, dx\\ | |||
&=& \sum_{j = -N}^{+N} p(j) e^{tj} \frac{e^{t/2} - e^{-t/2}} | |||
{2t/2} \\ | |||
&=& g(t) \frac{\sinh(t/2)}{t/2}\ , | |||
\end{eqnarray*} | |||
</math> | |||
where we have put | |||
<math display="block"> | |||
\sinh(t/2) = \frac{e^{t/2} - e^{-t/2}}2\ . | |||
</math> | |||
In the same way, we find that | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
\bar g_n(t) &=& g_n(t) \frac{\sinh(t/2)}{t/2}\ , \\ | |||
\bar g_n^*(t) &=& g_n^*(t) \frac{\sinh(t/2\sqrt n)}{t/2\sqrt n}\ . | |||
\end{eqnarray*} | |||
</math> | |||
Now, as <math>n \to \infty</math>, we know that <math>g_n^*(t) \to e^{t^2/2}</math>, and, by | |||
L'Hôpital's rule,
<math display="block"> | |||
\lim_{n \to \infty} \frac{\sinh(t/2\sqrt n)}{t/2\sqrt n} = 1\ . | |||
</math> | |||
It follows that | |||
<math display="block"> | |||
\bar g_n^*(t) \to e^{t^2/2}\ , | |||
</math> | |||
and hence that | |||
<math display="block"> | |||
\bar p_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ , | |||
</math> | |||
as <math>n \rightarrow \infty</math>. | |||
The astute reader will note that in this sketch of the proof of [[guide:4add108640#thm 9.3.5 |Theorem]], | |||
we never made use of the hypothesis that the greatest common divisor of the | |||
differences of all the values that the <math>X_i</math> can take on is 1. This is a technical | |||
point that we choose to ignore. A complete proof may be found in Gnedenko and | |||
Kolmogorov.<ref group="Notes" >B. V. Gnedenko and A. N. Kolmogorov, ''Limit Distributions
for Sums of Independent Random Variables'' (Reading: Addison-Wesley, 1968), p. 233.</ref>
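The identity <math>\bar g(t) = g(t)\sinh(t/2)/(t/2)</math> and the limit <math>\bar g_n^*(t) \to e^{t^2/2}</math> can both be checked numerically. The sketch below uses a hypothetical integer-valued density, <math>p(-1) = 1/3</math>, <math>p(0) = 1/2</math>, <math>p(2) = 1/6</math>, chosen by us because it has mean 0 and variance 1 and because the differences of its values have greatest common divisor 1. The step function is integrated by the midpoint rule.

```python
import math

# Hypothetical integer-valued density with mean 0 and variance 1.
p = {-1: 1 / 3, 0: 1 / 2, 2: 1 / 6}

def g(t):
    """MGF of the discrete density p: sum_j p(j) e^{tj}."""
    return sum(pj * math.exp(t * j) for j, pj in p.items())

def g_bar_numeric(t, cells=2000):
    """Integrate e^{tx} * p_bar(x) by the midpoint rule on each
    interval [j - 1/2, j + 1/2), where p_bar(x) = p(j)."""
    h = 1.0 / cells
    total = 0.0
    for j, pj in p.items():
        for i in range(cells):
            x = j - 0.5 + (i + 0.5) * h
            total += pj * math.exp(t * x) * h
    return total

def correction(s):
    """The factor sinh(s/2)/(s/2), equal to 1 in the limit s -> 0."""
    return 1.0 if s == 0 else math.sinh(s / 2.0) / (s / 2.0)

def g_bar_star(t, n):
    """bar g_n^*(t) = g(t/sqrt n)^n * sinh(t/(2 sqrt n)) / (t/(2 sqrt n))."""
    s = t / math.sqrt(n)
    return g(s) ** n * correction(s)

t = 0.7  # illustrative value of t
print(g_bar_numeric(t), g(t) * correction(t))
for n in (10, 1000, 1_000_000):
    print(n, g_bar_star(t, n), math.exp(t * t / 2.0))
```

The first printed pair compares the numerical integral with the closed form. The remaining lines show <math>\bar g_n^*(t)</math> approaching <math>e^{t^2/2}</math>, slowly in this case because the chosen density is not symmetric.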
===Cauchy Density=== | |||
The characteristic function of a continuous density is a useful tool even in | |||
cases when the moment series does not converge, or even in cases when the | |||
moments themselves are not finite. As an example, consider the Cauchy density | |||
with parameter <math>a = 1</math> (see [[guide:D26a5cb8f7#exam 5.20 |Example]]) | |||
<math display="block"> | |||
f(x) = \frac1{\pi(1 + x^2)}\ . | |||
</math> | |||
If <math>X</math> and <math>Y</math> are independent random variables with Cauchy density <math>f(x)</math>, | |||
then the average <math>Z = (X + Y)/2</math> also has Cauchy density <math>f(x)</math>, that is, | |||
<math display="block"> | |||
f_Z(x) = f(x)\ . | |||
</math> | |||
This is hard to check directly, but easy to check by using characteristic | |||
functions. Note first that | |||
<math display="block"> | |||
\mu_2 = E(X^2) = \int_{-\infty}^{+\infty} \frac{x^2}{\pi(1 + x^2)}\, dx = \infty | |||
</math> | |||
so that <math>\mu_2</math> is infinite. Nevertheless, we can define the characteristic | |||
function <math>k_X(\tau)</math> of <math>X</math> by the formula
<math display="block"> | |||
k_X(\tau) = \int_{-\infty}^{+\infty} e^{i\tau x}\frac1{\pi(1 + x^2)}\, dx\ . | |||
</math> | |||
This integral is easy to do by contour methods, and gives us | |||
<math display="block"> | |||
k_X(\tau) = k_Y(\tau) = e^{-|\tau|}\ . | |||
</math> | |||
Hence, | |||
<math display="block"> | |||
k_{X + Y}(\tau) = (e^{-|\tau|})^2 = e^{-2|\tau|}\ , | |||
</math> | |||
and since | |||
<math display="block"> | |||
k_Z(\tau) = k_{X + Y}(\tau/2)\ , | |||
</math> | |||
we have | |||
<math display="block"> | |||
k_Z(\tau) = e^{-2|\tau/2|} = e^{-|\tau|}\ . | |||
</math> | |||
This shows that <math>k_Z = k_X = k_Y</math>, and leads to the conclusion that <math>f_Z = f_X
= f_Y</math>.
It follows from this that if <math>X_1</math>, <math>X_2</math>, <math>\ldots</math>, <math>X_n</math> is an independent
trials process with common Cauchy density, and if | |||
<math display="block"> | |||
A_n = \frac{X_1 + X_2 + \cdots+ X_n}n | |||
</math> | |||
is the average of the <math>X_i</math>, then <math>A_n</math> has the same density as do the <math>X_i</math>. | |||
This means that the Law of Large Numbers fails for this process; the | |||
distribution of the average <math>A_n</math> is exactly the same as for the individual | |||
terms. Our proof of the Law of Large Numbers fails in this case because the | |||
variance of <math>X_i</math> is not finite. | |||
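A simulation makes this failure visible. For the Cauchy density with <math>a = 1</math> we have <math>P(|X| > 1) = 1/2</math>, and if <math>A_n</math> has the same density then <math>P(|A_n| > 1) = 1/2</math> for every <math>n</math>. The sketch below estimates this probability for several <math>n</math>. The sampling method (inverting the distribution function), the seed, and the number of trials are our own choices.

```python
import math
import random

def cauchy_sample(rng):
    """Draw from the Cauchy density 1/(pi(1 + x^2)) by inverting its
    distribution function: X = tan(pi (U - 1/2)) with U uniform on [0, 1)."""
    return math.tan(math.pi * (rng.random() - 0.5))

def fraction_outside(n, trials, rng):
    """Estimate P(|A_n| > 1), where A_n is the average of n Cauchy samples."""
    count = 0
    for _ in range(trials):
        average = sum(cauchy_sample(rng) for _ in range(n)) / n
        if abs(average) > 1.0:
            count += 1
    return count / trials

rng = random.Random(12345)  # fixed seed so the run is repeatable
for n in (1, 10, 100):
    # If the Law of Large Numbers held, this fraction would shrink toward 0.
    print(n, fraction_outside(n, 5000, rng))
```

The estimated fractions should stay near 1/2 as <math>n</math> increases, rather than tending to 0 as they would for a density with finite variance.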
==General references==
{{cite web |url=https://math.dartmouth.edu/~prob/prob/prob.pdf |title=Grinstead and Snell’s Introduction to Probability |last=Doyle |first=Peter G.|date=2006 |access-date=June 6, 2024}} | |||
==Notes== | |||
{{Reflist|group=Notes}} |