In the previous section, we introduced the concepts of moments and moment generating functions for discrete random variables. These concepts have natural analogues for continuous random variables, provided some care is taken in arguments involving convergence.
Moments
If [math]X[/math] is a continuous random variable defined on the probability space [math]\Omega[/math], with density function [math]f_X[/math], then we define the [math]n[/math]th moment of [math]X[/math] by the formula

[math display="block"]\mu_n = \int_{-\infty}^{+\infty} x^n f_X(x)\,dx\ ,[/math]

provided the integral

[math display="block"]\int_{-\infty}^{+\infty} |x|^n f_X(x)\,dx[/math]
is finite. Then, just as in the discrete case, we see that [math]\mu_0 = 1[/math], [math]\mu_1 = \mu[/math], and [math]\mu_2 - \mu_1^2 = \sigma^2[/math].
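As an illustrative numerical check (not part of the original text, and assuming numpy and scipy are installed), the moment integrals can be evaluated by quadrature; here the standard normal density is used, for which [math]\mu_0 = 1[/math], [math]\mu_1 = 0[/math], and [math]\mu_2 - \mu_1^2 = 1[/math]. The helper name `moment` is ours.

```python
# Illustrative check (assumes numpy and scipy are installed): compute the first
# few moments of the standard normal density by quadrature and verify that
# mu_0 = 1, mu_1 = mu = 0, and mu_2 - mu_1^2 = sigma^2 = 1.
import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # density f_X

def moment(n):
    value, _ = quad(lambda x: x**n * f(x), -np.inf, np.inf)
    return value

mu0, mu1, mu2 = moment(0), moment(1), moment(2)
print(mu0, mu1, mu2 - mu1**2)   # approximately 1.0, 0.0, 1.0
```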
Moment Generating Functions
Now we define the moment generating function [math]g(t)[/math] for [math]X[/math] by the formula

[math display="block"]g(t) = \sum_{k = 0}^{\infty} \frac{\mu_k t^k}{k!}\ ,[/math]

provided this series converges. Then, as before, we have

[math display="block"]\mu_n = g^{(n)}(0)\ .[/math]
Examples
Example Let [math]X[/math] be a continuous random variable with range [math][0,1][/math] and density function [math]f_X(x) = 1[/math] for [math]0 \leq x \leq 1[/math] (uniform density). Then

[math display="block"]\mu_n = \int_0^1 x^n\,dx = \frac1{n + 1}\ ,[/math]

and

[math display="block"]g(t) = \sum_{k = 0}^{\infty} \frac{t^k}{(k + 1)!} = \frac{e^t - 1}{t}\ .[/math]

Here the series converges for all [math]t[/math]. Alternatively, we have

[math display="block"]g(t) = \int_0^1 e^{tx}\,dx = \frac{e^t - 1}{t}\ .[/math]

Then (by L'Hôpital's rule)

[math display="block"]\mu_0 = g(0) = 1\ , \qquad \mu_1 = g'(0) = \frac12\ , \qquad \mu_2 = g''(0) = \frac13\ .[/math]

In particular, we verify that [math]\mu = g'(0) = 1/2[/math] and

[math display="block"]\sigma^2 = g''(0) - (g'(0))^2 = \frac13 - \frac14 = \frac1{12}\ ,[/math]
as before (see Example).
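As a quick symbolic cross-check of this example (an illustrative sketch, assuming sympy is available), the same mean and variance fall out of the closed form [math]g(t) = (e^t - 1)/t[/math]:

```python
# Symbolic cross-check of the uniform example (assumes sympy is installed):
# g(t) = (e^t - 1)/t, so mu = g'(0) = 1/2 and sigma^2 = g''(0) - g'(0)^2 = 1/12.
import sympy as sp

t = sp.symbols('t')
g = (sp.exp(t) - 1) / t

mu = sp.limit(sp.diff(g, t), t, 0)        # g'(0), taken as a limit at t = 0
mu2 = sp.limit(sp.diff(g, t, 2), t, 0)    # g''(0)
print(mu, mu2 - mu**2)                    # 1/2, 1/12
```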
Example Let [math]X[/math] have range [math][\,0,\infty)[/math] and density function [math]f_X(x) = \lambda e^{-\lambda x}[/math] (exponential density with parameter [math]\lambda[/math]). In this case

[math display="block"]\mu_n = \int_0^\infty x^n \lambda e^{-\lambda x}\,dx = \frac{n!}{\lambda^n}\ ,[/math]

and

[math display="block"]g(t) = \sum_{k = 0}^{\infty} \frac{\mu_k t^k}{k!} = \sum_{k = 0}^{\infty} \left(\frac t\lambda\right)^k = \frac\lambda{\lambda - t}\ .[/math]

Here the series converges only for [math]|t| \lt \lambda[/math]. Alternatively, we have

[math display="block"]g(t) = \int_0^\infty e^{tx} \lambda e^{-\lambda x}\,dx = \frac\lambda{\lambda - t}\ .[/math]

Now we can verify directly that

[math display="block"]\mu_n = g^{(n)}(0) = \left.\frac{n!\,\lambda}{(\lambda - t)^{n + 1}}\right|_{t = 0} = \frac{n!}{\lambda^n}\ .[/math]
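The identity [math]\mu_n = g^{(n)}(0) = n!/\lambda^n[/math] can also be confirmed symbolically; the following sketch assumes sympy is available and treats [math]\lambda[/math] as a positive symbol.

```python
# Symbolic check (assumes sympy is installed): with g(t) = lambda/(lambda - t),
# the nth derivative at 0 equals n!/lambda^n.
import sympy as sp

t = sp.symbols('t')
lam = sp.symbols('lambda', positive=True)
g = lam / (lam - t)

for n in range(1, 5):
    mu_n = sp.diff(g, t, n).subs(t, 0)
    print(n, sp.simplify(mu_n - sp.factorial(n) / lam**n))   # prints 0 for each n
```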
Example Let [math]X[/math] have range [math](-\infty,+\infty)[/math] and density function

[math display="block"]f_X(x) = \frac1{\sqrt{2\pi}} e^{-x^2/2}[/math]

(normal density). In this case we have

[math display="block"]\begin{eqnarray*} \mu_n &=& \frac1{\sqrt{2\pi}} \int_{-\infty}^{+\infty} x^n e^{-x^2/2}\,dx \\ &=& \left\{ \begin{array}{ll} \displaystyle\frac{(2m)!}{2^m m!}, & \mbox{if } n = 2m, \\ 0, & \mbox{if } n = 2m + 1. \end{array} \right. \end{eqnarray*}[/math]

(These moments are calculated by integrating once by parts to show that [math]\mu_n = (n - 1)\mu_{n - 2}[/math], and observing that [math]\mu_0 = 1[/math] and [math]\mu_1 = 0[/math].) Hence,

[math display="block"]\begin{eqnarray*} g(t) &=& \sum_{k = 0}^{\infty} \frac{\mu_k t^k}{k!} \\ &=& \sum_{m = 0}^{\infty} \frac{t^{2m}}{2^m m!} = e^{t^2/2}\ . \end{eqnarray*}[/math]
This series converges for all values of [math]t[/math]. Again we can verify that [math]g^{(n)}(0) = \mu_n[/math].
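The closed form [math]g(t) = e^{t^2/2}[/math] can also be checked numerically; the sketch below assumes numpy and scipy are available and evaluates the defining integral [math]\int e^{tx} f_X(x)\,dx[/math] at a few values of [math]t[/math].

```python
# Numerical check (assumes numpy and scipy are installed) that the standard
# normal density has moment generating function g(t) = e^{t^2/2}.
import numpy as np
from scipy.integrate import quad

phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

for t in (0.5, 1.0, 2.0):
    g_t, _ = quad(lambda x: np.exp(t * x) * phi(x), -np.inf, np.inf)
    print(t, g_t, np.exp(t**2 / 2))   # the last two columns agree
```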
Let [math]X[/math] be a normal random variable with parameters [math]\mu[/math] and [math]\sigma[/math]. It is easy to show that the moment generating function of [math]X[/math] is given by

[math display="block"]g(t) = e^{t\mu + (\sigma^2/2)t^2}\ .[/math]

Now suppose that [math]X[/math] and [math]Y[/math] are two independent normal random variables with parameters [math]\mu_1[/math], [math]\sigma_1[/math], and [math]\mu_2[/math], [math]\sigma_2[/math], respectively. Then the product of the moment generating functions of [math]X[/math] and [math]Y[/math] is

[math display="block"]e^{t(\mu_1 + \mu_2) + ((\sigma_1^2 + \sigma_2^2)/2)t^2}\ .[/math]
This is the moment generating function for a normal random variable with mean [math]\mu_1 + \mu_2[/math] and variance [math]\sigma_1^2 + \sigma_2^2[/math]. Thus, the sum of two independent normal random variables is again normal. (This was proved for the special case that both summands are standard normal in Example.)
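A small simulation illustrates this conclusion (illustrative only; the parameter values below are arbitrary and numpy is assumed to be available). It checks that the sample mean and variance of [math]X + Y[/math] match [math]\mu_1 + \mu_2[/math] and [math]\sigma_1^2 + \sigma_2^2[/math], the parameters identified by the moment generating function argument above.

```python
# Illustrative simulation (assumes numpy is installed; parameter values are
# arbitrary).  The sample mean and variance of X + Y match mu1 + mu2 and
# sigma1^2 + sigma2^2.
import numpy as np

rng = np.random.default_rng(0)
mu1, sigma1, mu2, sigma2 = 1.0, 2.0, -0.5, 1.5

x = rng.normal(mu1, sigma1, size=1_000_000)
y = rng.normal(mu2, sigma2, size=1_000_000)
s = x + y

print(s.mean(), mu1 + mu2)              # both near 0.5
print(s.var(), sigma1**2 + sigma2**2)   # both near 6.25
```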
In general, the series defining [math]g(t)[/math] will not converge for all [math]t[/math]. But in the important special case where [math]X[/math] is bounded (i.e., where the range of [math]X[/math] is contained in a finite interval), we can show that the series does converge for all [math]t[/math].
Suppose [math]X[/math] is a continuous random variable with range contained in the interval [math][-M,M][/math]. Then the series

[math display="block"]g(t) = \sum_{k = 0}^{\infty} \frac{\mu_k t^k}{k!}[/math]

converges for all [math]t[/math] to an infinitely differentiable function [math]g(t)[/math], and [math]g^{(n)}(0) = \mu_n[/math].

We have

[math display="block"]\mu_k = \int_{-M}^{+M} x^k f_X(x)\,dx\ ,[/math]

so

[math display="block"]|\mu_k| \leq \int_{-M}^{+M} |x|^k f_X(x)\,dx \leq M^k\ .[/math]

Hence, for all [math]N[/math] we have

[math display="block"]\sum_{k = 0}^{N} \left|\frac{\mu_k t^k}{k!}\right| \leq \sum_{k = 0}^{N} \frac{(M|t|)^k}{k!} \leq e^{M|t|}\ ,[/math]

which shows that the series for [math]g(t)[/math] converges absolutely for all [math]t[/math]. Since a convergent power series can be differentiated term by term, [math]g(t)[/math] is infinitely differentiable and [math]g^{(n)}(0) = \mu_n[/math].
Moment Problem
If [math]X[/math] is a bounded random variable, then the moment generating function [math]g_X(t)[/math] of [math]X[/math] determines the density function [math]f_X(x)[/math] uniquely.
Sketch of the Proof. We know that

[math display="block"]g_X(t) = \sum_{k = 0}^{\infty} \frac{\mu_k t^k}{k!} = \int_{-\infty}^{+\infty} e^{tx} f_X(x)\,dx\ .[/math]

If we replace [math]t[/math] by [math]i\tau[/math], where [math]\tau[/math] is real and [math]i = \sqrt{-1}[/math], then the series converges for all [math]\tau[/math], and we can define the function

[math display="block"]k_X(\tau) = g_X(i\tau) = \int_{-\infty}^{+\infty} e^{i\tau x} f_X(x)\,dx\ .[/math]

The function [math]k_X(\tau)[/math] is called the characteristic function of [math]X[/math], and is defined by the above equation even when the series for [math]g_X[/math] does not converge. This equation says that [math]k_X[/math] is the Fourier transform of [math]f_X[/math]. It is known that the Fourier transform has an inverse, given by the formula

[math display="block"]f_X(x) = \frac1{2\pi} \int_{-\infty}^{+\infty} e^{-i\tau x} k_X(\tau)\,d\tau\ ,[/math]

so that, under our hypotheses, [math]f_X[/math] is uniquely determined by the moments of [math]X[/math].
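The inversion formula can be illustrated numerically for the standard normal, whose characteristic function is [math]k_X(\tau) = e^{-\tau^2/2}[/math]. This is an illustrative sketch assuming numpy and scipy are available; because the density is symmetric, the imaginary part of the inversion integral vanishes and only the cosine term is kept.

```python
# Illustrative sketch (assumes numpy and scipy are installed): recover the
# standard normal density from its characteristic function k_X(tau) = e^{-tau^2/2}
# via the inversion formula.  For a symmetric density the imaginary part of the
# integrand integrates to zero, so only the cosine term is kept.
import numpy as np
from scipy.integrate import quad

k = lambda tau: np.exp(-tau**2 / 2)

def f_recovered(x):
    value, _ = quad(lambda tau: np.cos(tau * x) * k(tau), -np.inf, np.inf)
    return value / (2 * np.pi)

for x in (0.0, 1.0, 2.0):
    print(x, f_recovered(x), np.exp(-x**2 / 2) / np.sqrt(2 * np.pi))   # columns agree
```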
Sketch of the Proof of the Central Limit Theorem
With the above result in mind, we can now sketch a proof of the Central Limit Theorem for bounded continuous random variables (see Theorem). To this end, let [math]X[/math] be a continuous random variable with density function [math]f_X[/math], mean [math]\mu = 0[/math] and variance [math]\sigma^2 = 1[/math], and moment generating function [math]g(t)[/math] defined by its series for all [math]t[/math]. Let [math]X_1, X_2, \ldots, X_n[/math] be an independent trials process with each [math]X_i[/math] having density [math]f_X[/math], and let [math]S_n = X_1 + X_2 +\cdots+ X_n[/math], and [math]S_n^* = (S_n - n\mu)/\sqrt{n\sigma^2} = S_n/\sqrt n[/math]. Then each [math]X_i[/math] has moment generating function [math]g(t)[/math], and since the [math]X_i[/math] are independent, the sum [math]S_n[/math], just as in the discrete case (see Generating Functions for Discrete Distributions), has moment generating function

[math display="block"]g_n(t) = \left(g(t)\right)^n\ ,[/math]

and the standardized sum [math]S_n^*[/math] has moment generating function

[math display="block"]g_n^*(t) = \left(g\!\left(\frac t{\sqrt n}\right)\right)^n\ .[/math]
We now show that, as [math]n \to \infty[/math], [math]g_n^*(t) \to e^{t^2/2}[/math], where [math]e^{t^2/2}[/math] is the moment generating function of the normal density [math]n(x) = (1/\sqrt{2\pi}) e^{-x^2/2}[/math] (see Example). To show this, we set [math]u(t) = \log g(t)[/math], and

[math display="block"]u_n^*(t) = \log g_n^*(t) = n \log g\!\left(\frac t{\sqrt n}\right) = n\, u\!\left(\frac t{\sqrt n}\right)\ ,[/math]

and show that [math]u_n^*(t) \to t^2/2[/math] as [math]n \to \infty[/math]. First we note that

[math display="block"]\begin{eqnarray*} u(0) &=& \log g(0) = 0\ , \\ u'(0) &=& \frac{g'(0)}{g(0)} = \frac{\mu_1}{1} = 0\ , \\ u''(0) &=& \frac{g''(0)g(0) - (g'(0))^2}{(g(0))^2} = \frac{\mu_2 - \mu_1^2}{1} = \sigma^2 = 1\ . \end{eqnarray*}[/math]

Now by using L'Hôpital's rule twice, we get

[math display="block"]\lim_{n \to \infty} u_n^*(t) = \lim_{s \to 0} \frac{u(st)}{s^2} = \lim_{s \to 0} \frac{u'(st)\,t}{2s} = \lim_{s \to 0} \frac{u''(st)\,t^2}{2} = \sigma^2 \frac{t^2}{2} = \frac{t^2}{2}\ ,[/math]

where we have set [math]s = 1/\sqrt n[/math].
Hence, [math]g_n^*(t) \to e^{t^2/2}[/math] as [math]n \to \infty[/math]. Now to complete the proof of the Central Limit Theorem, we must show that if [math]g_n^*(t) \to e^{t^2/2}[/math], then under our hypotheses the distribution functions [math]F_n^*(x)[/math] of the [math]S_n^*[/math] must converge to the distribution function [math]F_N^*(x)[/math] of the normal variable [math]N[/math]; that is, that

[math display="block"]F_n^*(x) \to F_N^*(x) = \frac1{\sqrt{2\pi}} \int_{-\infty}^x e^{-y^2/2}\,dy\ ,[/math]

and furthermore, that the density functions [math]f_n^*(x)[/math] of the [math]S_n^*[/math] must converge to the density function for [math]N[/math]; that is, that

[math display="block"]f_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ ,[/math]
as [math]n \rightarrow \infty[/math].
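The convergence [math]g_n^*(t) = \left(g(t/\sqrt n)\right)^n \to e^{t^2/2}[/math] can be seen numerically for a specific bounded density with mean 0 and variance 1, for example the uniform density on [math][-\sqrt3, \sqrt3][/math], whose moment generating function is [math]\sinh(\sqrt3\,t)/(\sqrt3\,t)[/math]. This is an illustrative sketch assuming numpy is available.

```python
# Illustrative check (assumes numpy is installed): the uniform density on
# [-sqrt(3), sqrt(3)] has mean 0, variance 1, and moment generating function
# g(t) = sinh(sqrt(3) t)/(sqrt(3) t).  Then g_n^*(t) = g(t/sqrt(n))^n -> e^{t^2/2}.
import numpy as np

a = np.sqrt(3.0)
g = lambda t: np.sinh(a * t) / (a * t)

t = 1.0
for n in (10, 100, 10_000):
    print(n, g(t / np.sqrt(n)) ** n, np.exp(t**2 / 2))   # -> e^{1/2} ~ 1.6487
```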
Since the densities, and hence the distributions, of the [math]S_n^*[/math] are uniquely
determined by their moment generating functions under our hypotheses, these
conclusions are certainly plausible, but their proofs involve a detailed
examination of characteristic functions and Fourier transforms, and we shall
not attempt them here.
In the same way, we can prove the Central Limit Theorem for bounded discrete
random variables with integer values (see Theorem). Let [math]X[/math] be a
discrete random variable with density function [math]p(j)[/math], mean [math]\mu = 0[/math], variance
[math]\sigma^2 = 1[/math], and moment generating function [math]g(t)[/math], and let [math]X_1, X_2,\ldots, X_n[/math] form an independent trials process with common density [math]p[/math]. Let
[math]S_n = X_1 + X_2 +\cdots+ X_n[/math] and [math]S_n^* = S_n/\sqrt n[/math], with densities
[math]p_n[/math] and [math]p_n^*[/math], and moment generating functions [math]g_n(t)[/math] and
[math]g_n^*(t) = \left(g(\frac t{\sqrt n})\right)^n.[/math]
Then we have

[math display="block"]\lim_{n \to \infty} g_n^*(t) = e^{t^2/2}\ ,[/math]

just as in the continuous case, and this implies in the same way that the distribution functions [math]F_n^*(x)[/math] converge to the normal distribution; that is, that

[math display="block"]F_n^*(x) \to \frac1{\sqrt{2\pi}} \int_{-\infty}^x e^{-y^2/2}\,dy\ ,[/math]
as [math]n \rightarrow \infty[/math].
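For a concrete discrete illustration (an illustrative sketch, assuming numpy and scipy are available), take [math]X = \pm1[/math] with probability 1/2 each, so that [math]\mu = 0[/math], [math]\sigma^2 = 1[/math], and [math]g(t) = \cosh t[/math]; then [math]g_n^*(t) = \left(\cosh(t/\sqrt n)\right)^n[/math] approaches [math]e^{t^2/2}[/math], and the distribution function of [math]S_n^*[/math] approaches the normal distribution function.

```python
# Illustrative check for the discrete case (assumes numpy and scipy are installed).
# X = +1 or -1 with probability 1/2 each, so mu = 0, sigma^2 = 1, g(t) = cosh(t),
# and g_n^*(t) = cosh(t/sqrt(n))^n -> e^{t^2/2}.
import numpy as np
from scipy.stats import binom, norm

t = 1.0
for n in (10, 100, 10_000):
    print(n, np.cosh(t / np.sqrt(n)) ** n, np.exp(t**2 / 2))   # -> e^{1/2} ~ 1.6487

# The distribution function of S_n^* also approaches the normal one:
# S_n = 2B - n with B ~ Binomial(n, 1/2), so P(S_n^* <= x) = P(B <= (x sqrt(n) + n)/2).
n, x = 10_000, 0.5
k = np.floor((x * np.sqrt(n) + n) / 2)
print(binom.cdf(k, n, 0.5), norm.cdf(x))   # both close to 0.69
```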
The corresponding statement about the density functions [math]p_n^*[/math], however, requires a little extra care (see Theorem). The trouble arises because the density [math]p(x)[/math] is not defined for all [math]x[/math], but only for integer [math]x[/math]. It follows that the density [math]p_n^*(x)[/math] is defined only for [math]x[/math] of the form [math]j/\sqrt n[/math], and these values change as [math]n[/math] changes.
We can fix this, however, by introducing the function [math]\bar p(x)[/math], defined by the formula

[math display="block"]\bar p(x) = p(j)\ , \qquad \mbox{if } j - \frac12 \leq x \lt j + \frac12\ .[/math]

Then [math]\bar p(x)[/math] is defined for all [math]x[/math], [math]\bar p(j) = p(j)[/math], and the graph of [math]\bar p(x)[/math] is the step function for the distribution [math]p(j)[/math] (see Figure 3 of Central Limit Theorem for Continuous Independent Trials). In the same way we introduce the step functions [math]\bar p_n(x)[/math] and [math]\bar p_n^*(x)[/math] associated with the distributions [math]p_n[/math] and [math]p_n^*[/math], and their moment generating functions [math]\bar g_n(t)[/math] and [math]\bar g_n^*(t)[/math]. If we can show that [math]\bar g_n^*(t) \to e^{t^2/2}[/math], then we can conclude that

[math display="block"]\bar p_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}[/math]
as [math]n \rightarrow \infty[/math], for all [math]x[/math], a conclusion strongly suggested by Figure.
Now [math]\bar g(t)[/math] is given by

[math display="block"]\begin{eqnarray*} \bar g(t) &=& \int_{-\infty}^{+\infty} e^{tx} \bar p(x)\,dx \\ &=& \sum_j \int_{j - 1/2}^{j + 1/2} e^{tx} p(j)\,dx \\ &=& \sum_j p(j) e^{tj}\, \frac{e^{t/2} - e^{-t/2}}{t} \\ &=& g(t) \frac{\sinh(t/2)}{t/2}\ , \end{eqnarray*}[/math]

where we have put

[math display="block"]\sinh(t/2) = \frac{e^{t/2} - e^{-t/2}}{2}\ .[/math]

In the same way, we find that

[math display="block"]\begin{eqnarray*} \bar g_n(t) &=& g_n(t) \frac{\sinh(t/2)}{t/2}\ , \\ \bar g_n^*(t) &=& g_n^*(t) \frac{\sinh(t/2\sqrt n)}{t/2\sqrt n}\ . \end{eqnarray*}[/math]

Now, as [math]n \to \infty[/math], we know that [math]g_n^*(t) \to e^{t^2/2}[/math], and, by L'Hôpital's rule,

[math display="block"]\lim_{n \to \infty} \frac{\sinh(t/2\sqrt n)}{t/2\sqrt n} = 1\ .[/math]

It follows that

[math display="block"]\bar g_n^*(t) \to e^{t^2/2}\ ,[/math]

and hence that

[math display="block"]\bar p_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ ,[/math]
as [math]n \rightarrow \infty[/math]. The astute reader will note that in this sketch of the proof of Theorem, we never made use of the hypothesis that the greatest common divisor of the differences of all the values that the [math]X_i[/math] can take on is 1. This is a technical point that we choose to ignore. A complete proof may be found in Gnedenko and Kolmogorov.[Notes 2]
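The local statement [math]\bar p_n^*(x) \to (1/\sqrt{2\pi})e^{-x^2/2}[/math] can be illustrated numerically. The sketch below (assuming numpy and scipy are available) uses Bernoulli(1/2) summands, which are integer valued with span 1 but have [math]\mu = 1/2[/math] and [math]\sigma^2 = 1/4[/math], so the general standardization [math](j - n\mu)/\sqrt{n\sigma^2}[/math] from the cited Theorem is used: the scaled probabilities [math]\sqrt{n\sigma^2}\,P(S_n = j)[/math] track the standard normal density.

```python
# Illustrative local check (assumes numpy and scipy are installed).  The summands
# are Bernoulli(1/2): integer valued with span 1, mu = 1/2, sigma^2 = 1/4.  The
# scaled bar heights sqrt(n*sigma^2) * P(S_n = j) track the standard normal
# density at x = (j - n*mu)/sqrt(n*sigma^2).
import numpy as np
from scipy.stats import binom, norm

n = 10_000
mu, sigma2 = 0.5, 0.25
sd = np.sqrt(n * sigma2)
for j in (5000, 5050, 5100):
    x = (j - n * mu) / sd
    print(x, sd * binom.pmf(j, n, 0.5), norm.pdf(x))   # middle and right columns agree
```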
Cauchy Density
The characteristic function of a continuous density is a useful tool even in cases when the moment series does not converge, or even in cases when the moments themselves are not finite. As an example, consider the Cauchy density with parameter [math]a = 1[/math] (see Example)

[math display="block"]f(x) = \frac1{\pi(1 + x^2)}\ .[/math]

If [math]X[/math] and [math]Y[/math] are independent random variables with Cauchy density [math]f(x)[/math], then the average [math]Z = (X + Y)/2[/math] also has Cauchy density [math]f(x)[/math], that is,

[math display="block"]f_Z(x) = \frac1{\pi(1 + x^2)}\ .[/math]

This is hard to check directly, but easy to check by using characteristic functions. Note first that

[math display="block"]\mu_2 = E(X^2) = \int_{-\infty}^{+\infty} \frac{x^2}{\pi(1 + x^2)}\,dx = \infty\ ,[/math]

so that [math]\mu_2[/math] is infinite. Nevertheless, we can define the characteristic function [math]k_X(\tau)[/math] of [math]X[/math] by the formula

[math display="block"]k_X(\tau) = \int_{-\infty}^{+\infty} e^{i\tau x} \frac1{\pi(1 + x^2)}\,dx\ .[/math]

This integral is easy to do by contour methods, and gives us

[math display="block"]k_X(\tau) = k_Y(\tau) = e^{-|\tau|}\ .[/math]
Hence,

[math display="block"]k_{X + Y}(\tau) = k_X(\tau) k_Y(\tau) = e^{-|\tau|} e^{-|\tau|} = e^{-2|\tau|}\ ,[/math]

and since

[math display="block"]k_Z(\tau) = k_{(X + Y)/2}(\tau) = k_{X + Y}(\tau/2)\ ,[/math]

we have

[math display="block"]k_Z(\tau) = e^{-2|\tau/2|} = e^{-|\tau|}\ .[/math]
This shows that [math]k_Z = k_X = k_Y[/math], and leads to the conclusion that [math]f_Z = f_X = f_Y[/math]. It follows from this that if [math]X_1, X_2, \ldots, X_n[/math] is an independent trials process with common Cauchy density, and if

[math display="block"]A_n = \frac{X_1 + X_2 +\cdots+ X_n}{n}[/math]
is the average of the [math]X_i[/math], then [math]A_n[/math] has the same density as do the [math]X_i[/math]. This means that the Law of Large Numbers fails for this process; the distribution of the average [math]A_n[/math] is exactly the same as for the individual terms. Our proof of the Law of Large Numbers fails in this case because the variance of [math]X_i[/math] is not finite.
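A short simulation (illustrative only, assuming numpy is available) shows this failure of the Law of Large Numbers in action: for a standard Cauchy observation [math]P(|X| \gt 1) = 1/2[/math], and because [math]A_n[/math] has the same Cauchy density, [math]P(|A_n| \gt 1)[/math] stays near 1/2 no matter how many terms are averaged.

```python
# Illustrative simulation (assumes numpy is installed): averages of standard
# Cauchy samples do not settle down as n grows, since A_n has the same Cauchy
# distribution as a single observation.
import numpy as np

rng = np.random.default_rng(0)
for n in (1, 10, 1000):
    # 10000 independent copies of A_n = (X_1 + ... + X_n)/n
    a_n = rng.standard_cauchy(size=(10_000, n)).mean(axis=1)
    print(n, np.mean(np.abs(a_n) > 1.0))   # stays near 0.5 for every n
```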
General references
Doyle, Peter G. (2006). "Grinstead and Snell's Introduction to Probability" (PDF). Retrieved June 6, 2024.