Generating Functions for Continuous Densities

In the previous section, we introduced the concepts of moments and moment generating functions for discrete random variables. These concepts have natural analogues for continuous random variables, provided some care is taken in arguments involving convergence.

Moments

If [math]X[/math] is a continuous random variable defined on the probability space [math]\Omega[/math], with density function [math]f_X[/math], then we define the [math]n[/math]th moment of [math]X[/math] by the formula

[[math]] \mu_n = E(X^n) = \int_{-\infty}^{+\infty} x^n f_X(x)\, dx\ , [[/math]]

provided the integral

[[math]] E(|X|^n) = \int_{-\infty}^{+\infty} |x|^n f_X(x)\, dx [[/math]]

is finite. Then, just as in the discrete case, we see that [math]\mu_0 = 1[/math], [math]\mu_1 = \mu[/math], and [math]\mu_2 - \mu_1^2 = \sigma^2[/math].

Moment Generating Functions

Now we define the moment generating function [math]g(t)[/math] for [math]X[/math] by the formula

[[math]] \begin{eqnarray*} g(t) &=& \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} = \sum_{k = 0}^\infty \frac{E(X^k) t^k}{k!} \\ &=& E(e^{tX}) = \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx\ , \end{eqnarray*} [[/math]]

provided this series converges. Then, as before, we have

[[math]] \mu_n = g^{(n)}(0)\ . [[/math]]

Examples

Example Let [math]X[/math] be a continuous random variable with range [math][0,1][/math] and density function [math]f_X(x) = 1[/math] for [math]0 \leq x \leq 1[/math] (uniform density). Then

[[math]] \mu_n = \int_0^1 x^n\, dx = \frac1{n + 1}\ , [[/math]]

and

[[math]] \begin{eqnarray*} g(t) &=& \sum_{k = 0}^\infty \frac{t^k}{(k+1)!}\\ &=& \frac{e^t - 1}t\ . \end{eqnarray*} [[/math]]

Here the series converges for all [math]t[/math]. Alternatively, we have

[[math]] \begin{eqnarray*} g(t) &=& \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx \\ &=& \int_0^1 e^{tx}\, dx = \frac{e^t - 1}t\ . \end{eqnarray*} [[/math]]

Then (by L'Hôpital's rule)

[[math]] \begin{eqnarray*} \mu_0 &=& g(0) = \lim_{t \to 0} \frac{e^t - 1}t = 1\ , \\ \mu_1 &=& g'(0) = \lim_{t \to 0} \frac{te^t - e^t + 1}{t^2} = \frac12\ , \\ \mu_2 &=& g''(0) = \lim_{t \to 0} \frac{t^3e^t - 2t^2e^t + 2te^t - 2t}{t^4} = \frac13\ . \end{eqnarray*} [[/math]]

In particular, we verify that [math]\mu = g'(0) = 1/2[/math] and

[[math]] \sigma^2 = g''(0) - (g'(0))^2 = \frac13 - \frac14 = \frac1{12} [[/math]]

as before (see Example).
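As a numerical sanity check of this example, the following Python sketch approximates the defining integrals by a midpoint rule and compares them with the closed forms above. The function names and the number of subintervals are illustrative choices of ours, not part of the text.

```python
import math

def midpoint_integral(func, a, b, steps=10_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def uniform_moment(n):
    """n-th moment of the uniform density on [0, 1], by numerical integration."""
    return midpoint_integral(lambda x: x ** n, 0.0, 1.0)

def uniform_mgf(t):
    """E(e^{tX}) for the uniform density on [0, 1], by numerical integration."""
    return midpoint_integral(lambda x: math.exp(t * x), 0.0, 1.0)

def uniform_mgf_closed_form(t):
    """(e^t - 1)/t, with the limiting value 1 at t = 0."""
    return 1.0 if t == 0 else math.expm1(t) / t

# Compare numerical moments with 1/(n + 1).
for n in range(4):
    print(n, uniform_moment(n), 1 / (n + 1))

# Compare the numerical integral with (e^t - 1)/t at a few sample points.
for t in (-2.0, 0.5, 3.0):
    print(t, uniform_mgf(t), uniform_mgf_closed_form(t))

# Variance from the first two moments; should be close to 1/12.
print(uniform_moment(2) - uniform_moment(1) ** 2, 1 / 12)
```

The two columns printed in each loop should agree to several decimal places, which is all a quadrature check of this kind can show.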

Example Let [math]X[/math] have range [math][\,0,\infty)[/math] and density function [math]f_X(x) = \lambda e^{-\lambda x}[/math] (exponential density with parameter [math]\lambda[/math]). In this case

[[math]] \begin{eqnarray*} \mu_n &=& \int_0^\infty x^n \lambda e^{-\lambda x}\, dx = \lambda(-1)^n \frac{d^n}{d\lambda^n} \int_0^\infty e^{-\lambda x}\, dx \\ &=& \lambda(-1)^n \frac{d^n}{d\lambda^n} [\frac1\lambda] = \frac{n!} {\lambda^n}\ , \end{eqnarray*} [[/math]]

and

[[math]] \begin{eqnarray*} g(t) &=& \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} \\ &=& \sum_{k = 0}^\infty [\frac t\lambda]^k = \frac\lambda{\lambda - t}\ . \end{eqnarray*} [[/math]]

Here the series converges only for [math]|t| \lt \lambda[/math]. Alternatively, we have

[[math]] \begin{eqnarray*} g(t) &=& \int_0^\infty e^{tx} \lambda e^{-\lambda x}\, dx \\ &=& \left.\frac{\lambda e^{(t - \lambda)x}}{t - \lambda}\right|_0^\infty = \frac\lambda{\lambda - t}\ . \end{eqnarray*} [[/math]]

Now we can verify directly that

[[math]] \mu_n = g^{(n)}(0) = \left.\frac{\lambda n!}{(\lambda - t)^{n + 1}}\right|_{t = 0} = \frac{n!}{\lambda^n}\ . [[/math]]
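The role of the condition [math]|t| \lt \lambda[/math] can be seen numerically. The sketch below, with hypothetical function names and an arbitrary choice of [math]\lambda = 2[/math], sums the moment series term by term and compares it with [math]\lambda/(\lambda - t)[/math] inside the interval of convergence; outside that interval the partial sums simply grow.

```python
import math

def exponential_moment(n, lam):
    """n-th moment n!/lam^n of the exponential density with parameter lam."""
    return math.factorial(n) / lam ** n

def mgf_partial_sum(t, lam, terms=80):
    """Partial sum of sum_k mu_k t^k / k!, which equals sum_k (t/lam)^k."""
    return sum(exponential_moment(k, lam) * t ** k / math.factorial(k)
               for k in range(terms))

def mgf_closed_form(t, lam):
    """lam/(lam - t); the series represents this only for |t| < lam."""
    return lam / (lam - t)

lam = 2.0  # illustrative parameter value, not from the text

# Inside |t| < lam the partial sums settle on the closed form.
for t in (-1.5, 0.5, 1.0):
    print(t, mgf_partial_sum(t, lam), mgf_closed_form(t, lam))

# For |t| > lam the terms (t/lam)^k grow, so the partial sums blow up.
for terms in (20, 40, 80):
    print(terms, mgf_partial_sum(3.0, lam, terms))
```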

Example Let [math]X[/math] have range [math](-\infty,+\infty)[/math] and density function

[[math]] f_X(x) = \frac1{\sqrt{2\pi}} e^{-x^2/2} [[/math]]

(normal density). In this case we have

[[math]] \begin{eqnarray*} \mu_n &=& \frac1{\sqrt{2\pi}} \int_{-\infty}^{+\infty} x^n e^{-x^2/2}\, dx \\ &=& \left \{ \begin{array}{ll} \frac{(2m)!}{2^{m} m!}, & \mbox{if $ n = 2m$,}\cr 0, & \mbox{if $ n = 2m+1$.}\end{array}\right. \end{eqnarray*} [[/math]]

(These moments are calculated by integrating once by parts to show that [math]\mu_n = (n - 1)\mu_{n - 2}[/math], and observing that [math]\mu_0 = 1[/math] and [math]\mu_1 = 0[/math].) Hence,

[[math]] \begin{eqnarray*} g(t) &=& \sum_{n = 0}^\infty \frac{\mu_n t^n}{n!} \\ &=& \sum_{m = 0}^\infty \frac{t^{2m}}{2^{m} m!} = e^{t^2/2}\ . \end{eqnarray*} [[/math]]

This series converges for all values of [math]t[/math]. Again we can verify that [math]g^{(n)}(0) = \mu_n[/math].
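The recursion [math]\mu_n = (n - 1)\mu_{n - 2}[/math] and the resulting series are easy to check by machine. The following sketch (the function names are ours) builds the moments from the recursion, compares them with the closed form [math](2m)!/(2^m m!)[/math], and sums the series numerically.

```python
import math

def normal_moment(n):
    """n-th moment of the standard normal from mu_n = (n - 1) * mu_{n-2}."""
    if n == 0:
        return 1
    if n == 1:
        return 0
    return (n - 1) * normal_moment(n - 2)

def normal_moment_closed_form(n):
    """(2m)!/(2^m m!) when n = 2m, and 0 when n is odd."""
    if n % 2 == 1:
        return 0
    m = n // 2
    return math.factorial(2 * m) // (2 ** m * math.factorial(m))

def mgf_partial_sum(t, terms=60):
    """Partial sum of sum_n mu_n t^n / n! for the standard normal."""
    return sum(normal_moment(n) * t ** n / math.factorial(n)
               for n in range(terms))

# The recursion and the closed form give the same moments.
for n in range(9):
    print(n, normal_moment(n), normal_moment_closed_form(n))

# The partial sums approach e^{t^2/2}.
for t in (-3.0, 0.5, 2.0):
    print(t, mgf_partial_sum(t), math.exp(t * t / 2))
```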

Let [math]X[/math] be a normal random variable with parameters [math]\mu[/math] and [math]\sigma[/math]. It is easy to show that the moment generating function of [math]X[/math] is given by

[[math]] e^{t\mu + (\sigma^2/2)t^2}\ . [[/math]]

Now suppose that [math]X[/math] and [math]Y[/math] are two independent normal random variables with parameters [math]\mu_1[/math], [math]\sigma_1[/math], and [math]\mu_2[/math], [math]\sigma_2[/math], respectively. Then, the product of the moment generating functions of [math]X[/math] and [math]Y[/math] is

[[math]] e^{t(\mu_1 + \mu_2) + ((\sigma_1^2 + \sigma_2^2)/2)t^2}\ . [[/math]]

This is the moment generating function for a normal random variable with mean [math]\mu_1 + \mu_2[/math] and variance [math]\sigma_1^2 + \sigma_2^2[/math]. Thus, the sum of two independent normal random variables is again normal. (This was proved for the special case that both summands are standard normal in Example.)
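A simulation can make this product rule concrete. The sketch below uses arbitrary illustrative parameters and a fixed random seed (both our assumptions) to estimate [math]E(e^{t(X + Y)})[/math] by Monte Carlo and compare it with the product of the two moment generating functions. Agreement is only approximate, as with any simulation.

```python
import math
import random

def normal_mgf(t, mu, sigma):
    """Moment generating function exp(t*mu + sigma^2 t^2 / 2) of a normal variable."""
    return math.exp(t * mu + 0.5 * sigma ** 2 * t ** 2)

def empirical_mgf_of_sum(t, mu1, sigma1, mu2, sigma2, samples=100_000, seed=1):
    """Monte Carlo estimate of E(e^{t(X + Y)}) for independent normal X and Y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = rng.gauss(mu1, sigma1)
        y = rng.gauss(mu2, sigma2)
        total += math.exp(t * (x + y))
    return total / samples

# Illustrative parameters, not taken from the text.
mu1, sigma1 = 1.0, 1.0
mu2, sigma2 = -0.5, 0.5

for t in (-0.5, 0.25, 0.5):
    product = normal_mgf(t, mu1, sigma1) * normal_mgf(t, mu2, sigma2)
    combined = normal_mgf(t, mu1 + mu2, math.sqrt(sigma1 ** 2 + sigma2 ** 2))
    estimate = empirical_mgf_of_sum(t, mu1, sigma1, mu2, sigma2)
    print(t, product, combined, estimate)
```

The first two printed columns agree up to rounding, since they are the same algebraic expression; the third is a sample estimate and fluctuates around them.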

In general, the series defining [math]g(t)[/math] will not converge for all [math]t[/math]. But in the important special case where [math]X[/math] is bounded (i.e., where the range of [math]X[/math] is contained in a finite interval), we can show that the series does converge for all [math]t[/math].

Theorem

Suppose [math]X[/math] is a continuous random variable with range contained in the interval [math][-M,M][/math]. Then the series

[[math]] g(t) = \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} [[/math]]

converges for all [math]t[/math] to an infinitely differentiable function [math]g(t)[/math], and [math]g^{(n)}(0) = \mu_n[/math].

Proof.

We have

[[math]] \mu_k = \int_{-M}^{+M} x^k f_X(x)\, dx\ , [[/math]]

so

[[math]] \begin{eqnarray*} |\mu_k| &\leq& \int_{-M}^{+M} |x|^k f_X(x)\, dx \\ &\leq& M^k \int_{-M}^{+M} f_X(x)\, dx = M^k\ . \end{eqnarray*} [[/math]]

Hence, for all [math]N[/math] we have

[[math]] \sum_{k = 0}^N \left|\frac{\mu_k t^k}{k!}\right| \leq \sum_{k = 0}^N \frac{(M|t|)^k}{k!} \leq e^{M|t|}\ , [[/math]]

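which shows that the power series converges for all [math]t[/math]. A power series that converges for all [math]t[/math] can be differentiated term by term any number of times, so its sum [math]g(t)[/math] is infinitely differentiable; differentiating [math]n[/math] times and setting [math]t = 0[/math] leaves only the term [math]\mu_n[/math], so [math]g^{(n)}(0) = \mu_n[/math].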

■

Moment Problem

Theorem

If [math]X[/math] is a bounded random variable, then the moment generating function [math]g_X(t)[/math] of [math]X[/math] determines the density function [math]f_X(x)[/math] uniquely.

Sketch of the Proof. We know that

[[math]] \begin{eqnarray*} g_X(t) &=& \sum_{k = 0}^\infty \frac{\mu_k t^k}{k!} \\ &=& \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx\ . \end{eqnarray*} [[/math]]

If we replace [math]t[/math] by [math]i\tau[/math], where [math]\tau[/math] is real and [math]i = \sqrt{-1}[/math], then the series converges for all [math]\tau[/math], and we can define the function

[[math]] k_X(\tau) = g_X(i\tau) = \int_{-\infty}^{+\infty} e^{i\tau x} f_X(x)\, dx\ . [[/math]]

The function [math]k_X(\tau)[/math] is called the characteristic function of [math]X[/math], and is defined by the above equation even when the series for [math]g_X[/math] does not converge. This equation says that [math]k_X[/math] is the Fourier transform of [math]f_X[/math]. It is known that the Fourier transform has an inverse, given by the formula

[[math]] f_X(x) = \frac1{2\pi} \int_{-\infty}^{+\infty} e^{-i\tau x} k_X(\tau)\, d\tau\ , [[/math]]

suitably interpreted.^{[Notes 1]} Here we see that the characteristic function [math]k_X[/math], and hence the moment generating function [math]g_X[/math], determines the density function [math]f_X[/math] uniquely under our hypotheses.
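The inversion formula can be tried numerically. The sketch below uses the standard normal density, which is not bounded and so lies outside the hypothesis of the theorem; we choose it only because its characteristic function [math]e^{-\tau^2/2}[/math] (obtained from [math]g(t) = e^{t^2/2}[/math] with [math]t = i\tau[/math]) decays fast enough for a simple quadrature. The cutoff, the step count, and the function names are our own assumptions.

```python
import math

def characteristic_function_normal(tau):
    """k_X(tau) = g_X(i tau) = exp(-tau^2 / 2) for the standard normal."""
    return math.exp(-tau * tau / 2)

def invert_characteristic_function(k, x, cutoff=12.0, steps=20_000):
    """Midpoint-rule approximation of (1/2pi) * integral of e^{-i tau x} k(tau) d tau.

    Assumes k is real and even, so the imaginary part of the integrand
    integrates to zero and only cos(tau x) * k(tau) contributes.
    The infinite range is truncated to [-cutoff, cutoff].
    """
    h = 2 * cutoff / steps
    total = 0.0
    for i in range(steps):
        tau = -cutoff + (i + 0.5) * h
        total += math.cos(tau * x) * k(tau)
    return total * h / (2 * math.pi)

def normal_density(x):
    """Standard normal density, for comparison."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

for x in (0.0, 0.5, 1.0, 2.5):
    recovered = invert_characteristic_function(characteristic_function_normal, x)
    print(x, recovered, normal_density(x))
```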

Sketch of the Proof of the Central Limit Theorem

With the above result in mind, we can now sketch a proof of the Central Limit Theorem for bounded continuous random variables (see Theorem). To this end, let [math]X[/math] be a continuous random variable with density function [math]f_X[/math], mean [math]\mu = 0[/math] and variance [math]\sigma^2 = 1[/math], and moment generating function [math]g(t)[/math] defined by its series for all [math]t[/math]. Let [math]X_1[/math], [math]X_2[/math], [math]\ldots[/math], [math]X_n[/math] be an independent trials process with each [math]X_i[/math] having density [math]f_X[/math], and let [math]S_n = X_1 + X_2 +\cdots+ X_n[/math], and [math]S_n^* = (S_n - n\mu)/\sqrt{n\sigma^2} = S_n/\sqrt n[/math]. Then each [math]X_i[/math] has moment generating function [math]g(t)[/math], and since the [math]X_i[/math] are independent, the sum [math]S_n[/math], just as in the discrete case (see Generating Functions for Discrete Distributions), has moment generating function

[[math]] g_n(t) = (g(t))^n\ , [[/math]]

and the standardized sum [math]S_n^*[/math] has moment generating function

[[math]] g_n^*(t) = \left(g\left(\frac t{\sqrt n}\right)\right)^n\ . [[/math]]

We now show that, as [math]n \to \infty[/math], [math]g_n^*(t) \to e^{t^2/2}[/math], where [math]e^{t^2/2}[/math] is the moment generating function of the normal density [math]n(x) = (1/\sqrt{2\pi}) e^{-x^2/2}[/math] (see Example). To show this, we set [math]u(t) = \log g(t)[/math], and

[[math]] \begin{eqnarray*} u_n^*(t) &=& \log g_n^*(t) \\ &=& n\log g\left(\frac t{\sqrt n}\right) = nu\left(\frac t{\sqrt n}\right)\ , \end{eqnarray*} [[/math]]

and show that [math]u_n^*(t) \to t^2/2[/math] as [math]n \to \infty[/math]. First we note that

[[math]] \begin{eqnarray*} u(0) &=& \log g(0) = 0\ , \\ u'(0) &=& \frac{g'(0)}{g(0)} = \frac{\mu_1}1 = 0\ , \\ u''(0) &=& \frac{g''(0)g(0) - (g'(0))^2}{(g(0))^2} \\ &=& \frac{\mu_2 - \mu_1^2}1 = \sigma^2 = 1\ . \end{eqnarray*} [[/math]]

Now by using L'Hôpital's rule twice, we get

[[math]] \begin{eqnarray*} \lim_{n \to \infty} u_n^*(t) &=& \lim_{s \to \infty} \frac{u(t/\sqrt s)}{s^{-1}}\\ &=& \lim_{s \to \infty} \frac{u'(t/\sqrt s) t}{2s^{-1/2}} \\ &=& \lim_{s \to \infty} u''\left(\frac t{\sqrt s}\right) \frac{t^2}2 = \sigma^2 \frac{t^2}2 = \frac{t^2}2\ . \end{eqnarray*} [[/math]]

Hence, [math]g_n^*(t) \to e^{t^2/2}[/math] as [math]n \to \infty[/math]. Now to complete the proof of the Central Limit Theorem, we must show that if [math]g_n^*(t) \to e^{t^2/2}[/math], then under our hypotheses the distribution functions [math]F_n^*(x)[/math] of the [math]S_n^*[/math] must converge to the distribution function [math]F_N^*(x)[/math] of the normal variable [math]N[/math]; that is, that

[[math]] F_n^*(a) = P(S_n^* \leq a) \to \frac1{\sqrt{2\pi}} \int_{-\infty}^a e^{-x^2/2}\, dx\ , [[/math]]

and furthermore, that the density functions [math]f_n^*(x)[/math] of the [math]S_n^*[/math] must converge to the density function for [math]N[/math]; that is, that

[[math]] f_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ , [[/math]]

as [math]n \rightarrow \infty[/math].

Since the densities, and hence the distributions, of the [math]S_n^*[/math] are uniquely determined by their moment generating functions under our hypotheses, these conclusions are certainly plausible, but their proofs involve a detailed examination of characteristic functions and Fourier transforms, and we shall not attempt them here.
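The convergence [math]g_n^*(t) \to e^{t^2/2}[/math] can be watched numerically for a concrete bounded density. As an illustration of our own choosing, take [math]X[/math] uniform on [math][-\sqrt3, \sqrt3][/math], which has mean 0, variance 1, and moment generating function [math]\sinh(\sqrt3\, t)/(\sqrt3\, t)[/math]. The sketch below evaluates [math](g(t/\sqrt n))^n[/math] for increasing [math]n[/math].

```python
import math

SQRT3 = math.sqrt(3.0)

def g(t):
    """MGF of the uniform density on [-sqrt(3), sqrt(3)] (mean 0, variance 1)."""
    if t == 0:
        return 1.0
    return math.sinh(SQRT3 * t) / (SQRT3 * t)

def standardized_mgf(t, n):
    """MGF (g(t / sqrt(n)))^n of the standardized sum S_n^*."""
    return g(t / math.sqrt(n)) ** n

def limit_mgf(t):
    """MGF e^{t^2/2} of the standard normal density."""
    return math.exp(t * t / 2)

t = 1.0
for n in (1, 10, 100, 1000, 10_000):
    value = standardized_mgf(t, n)
    print(n, value, abs(value - limit_mgf(t)))
```

The printed gap shrinks as [math]n[/math] grows. This illustrates only the convergence of the moment generating functions, not the remaining step from generating functions to distributions.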

In the same way, we can prove the Central Limit Theorem for bounded discrete random variables with integer values (see Theorem). Let [math]X[/math] be a discrete random variable with density function [math]p(j)[/math], mean [math]\mu = 0[/math], variance [math]\sigma^2 = 1[/math], and moment generating function [math]g(t)[/math], and let [math]X_1, X_2,\ldots, X_n[/math] form an independent trials process with common density [math]p[/math]. Let [math]S_n = X_1 + X_2 +\cdots+ X_n[/math] and [math]S_n^* = S_n/\sqrt n[/math], with densities [math]p_n[/math] and [math]p_n^*[/math], and moment generating functions [math]g_n(t)[/math] and [math]g_n^*(t) = \left(g(\frac t{\sqrt n})\right)^n.[/math] Then we have

[[math]] g_n^*(t) \to e^{t^2/2}\ , [[/math]]

just as in the continuous case, and this implies in the same way that the distribution functions [math]F_n^*(x)[/math] converge to the normal distribution; that is, that

[[math]] F_n^*(a) = P(S_n^* \leq a) \to \frac1{\sqrt{2\pi}} \int_{-\infty}^a e^{-x^2/2}\, dx\ , [[/math]]

as [math]n \rightarrow \infty[/math].

The corresponding statement about the distribution functions [math]p_n^*[/math], however, requires a little extra care (see Theorem). The trouble arises because the distribution [math]p(x)[/math] is not defined for all [math]x[/math], but only for integer [math]x[/math]. It follows that the distribution [math]p_n^*(x)[/math] is defined only for [math]x[/math] of the form [math]j/\sqrt n[/math], and these values change as [math]n[/math] changes. We can fix this, however, by introducing the function [math]\bar p(x)[/math], defined by the formula

[[math]] \bar p(x) = \left \{ \begin{array}{ll} p(j), & \mbox{if $j - 1/2 \leq x \lt j + 1/2$,} \cr 0\ , & \mbox{otherwise}.\end{array}\right. [[/math]]

Then [math]\bar p(x)[/math] is defined for all [math]x[/math], [math]\bar p(j) = p(j)[/math], and the graph of [math]\bar p(x)[/math] is the step function for the distribution [math]p(j)[/math] (see Figure 3 of Central Limit Theorem for Continuous Independent Trials). In the same way we introduce the step functions [math]\bar p_n(x)[/math] and [math]\bar p_n^*(x)[/math] associated with the distributions [math]p_n[/math] and [math]p_n^*[/math], and their moment generating functions [math]\bar g_n(t)[/math] and [math]\bar g_n^*(t)[/math]. If we can show that [math]\bar g_n^*(t) \to e^{t^2/2}[/math], then we can conclude that

[[math]] \bar p_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ , [[/math]]

as [math]n \rightarrow \infty[/math], for all [math]x[/math], a conclusion strongly suggested by Figure.

Now [math]\bar g(t)[/math] is given by

[[math]] \begin{eqnarray*} \bar g(t) &=& \int_{-\infty}^{+\infty} e^{tx} \bar p(x)\, dx \\ &=& \sum_{j = -N}^{+N} \int_{j - 1/2}^{j + 1/2} e^{tx} p(j)\, dx\\ &=& \sum_{j = -N}^{+N} p(j) e^{tj} \frac{e^{t/2} - e^{-t/2}} {2t/2} \\ &=& g(t) \frac{\sinh(t/2)}{t/2}\ , \end{eqnarray*} [[/math]]

where we have put

[[math]] \sinh(t/2) = \frac{e^{t/2} - e^{-t/2}}2\ . [[/math]]

In the same way, we find that

[[math]] \begin{eqnarray*} \bar g_n(t) &=& g_n(t) \frac{\sinh(t/2)}{t/2}\ , \\ \bar g_n^*(t) &=& g_n^*(t) \frac{\sinh(t/2\sqrt n)}{t/2\sqrt n}\ . \end{eqnarray*} [[/math]]

Now, as [math]n \to \infty[/math], we know that [math]g_n^*(t) \to e^{t^2/2}[/math], and, by L'Hôpital's rule,

[[math]] \lim_{n \to \infty} \frac{\sinh(t/2\sqrt n)}{t/2\sqrt n} = 1\ . [[/math]]

It follows that

[[math]] \bar g_n^*(t) \to e^{t^2/2}\ , [[/math]]

and hence that

[[math]] \bar p_n^*(x) \to \frac1{\sqrt{2\pi}} e^{-x^2/2}\ , [[/math]]

as [math]n \rightarrow \infty[/math]. The astute reader will note that in this sketch of the proof of Theorem, we never made use of the hypothesis that the greatest common divisor of the differences of all the values that the [math]X_i[/math] can take on is 1. This is a technical point that we choose to ignore. A complete proof may be found in Gnedenko and Kolmogorov.^{[Notes 2]}
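The two ingredients of this argument, the identity [math]\bar g(t) = g(t)\sinh(t/2)/(t/2)[/math] and the limit [math]\bar g_n^*(t) \to e^{t^2/2}[/math], can both be checked numerically. The sketch below uses a hypothetical integer-valued density of our own choosing, with [math]p(-2) = 1/6[/math], [math]p(0) = 1/2[/math], [math]p(1) = 1/3[/math], which has mean 0 and variance 1 and whose value differences have greatest common divisor 1.

```python
import math

# Hypothetical density on the integers with mean 0 and variance 1.
P = {-2: 1 / 6, 0: 1 / 2, 1: 1 / 3}

def g(t):
    """Moment generating function sum_j p(j) e^{tj}."""
    return sum(p * math.exp(t * j) for j, p in P.items())

def sinh_ratio(h):
    """sinh(h)/h, with the limiting value 1 at h = 0."""
    return 1.0 if h == 0 else math.sinh(h) / h

def step_mgf_direct(t):
    """Integrate e^{tx} against the step function p-bar, one unit interval at a time (t != 0)."""
    return sum(p * (math.exp(t * (j + 0.5)) - math.exp(t * (j - 0.5))) / t
               for j, p in P.items())

def step_mgf_formula(t):
    """g(t) * sinh(t/2) / (t/2)."""
    return g(t) * sinh_ratio(t / 2)

def standardized_step_mgf(t, n):
    """g_n^*(t) * sinh(t / (2 sqrt n)) / (t / (2 sqrt n))."""
    root_n = math.sqrt(n)
    return g(t / root_n) ** n * sinh_ratio(t / (2 * root_n))

# The direct integral and the sinh formula agree.
for t in (-1.0, 0.3, 2.0):
    print(t, step_mgf_direct(t), step_mgf_formula(t))

# The standardized step-function MGF approaches e^{t^2/2}.
t = 1.0
for n in (100, 10_000, 1_000_000):
    print(n, standardized_step_mgf(t, n), math.exp(t * t / 2))
```

Because this density is not symmetric, the convergence in the second loop is slower than in a symmetric example, but the gap still closes as [math]n[/math] grows.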

Cauchy Density

The characteristic function of a continuous density is a useful tool even in cases when the moment series does not converge, or even in cases when the moments themselves are not finite. As an example, consider the Cauchy density with parameter [math]a = 1[/math] (see Example)

[[math]] f(x) = \frac1{\pi(1 + x^2)}\ . [[/math]]

If [math]X[/math] and [math]Y[/math] are independent random variables with Cauchy density [math]f(x)[/math], then the average [math]Z = (X + Y)/2[/math] also has Cauchy density [math]f(x)[/math], that is,

[[math]] f_Z(x) = f(x)\ . [[/math]]

This is hard to check directly, but easy to check by using characteristic functions. Note first that

[[math]] \mu_2 = E(X^2) = \int_{-\infty}^{+\infty} \frac{x^2}{\pi(1 + x^2)}\, dx = \infty [[/math]]

so that [math]\mu_2[/math] is infinite. Nevertheless, we can define the characteristic function [math]k_X(\tau)[/math] of [math]X[/math] by the formula

[[math]] k_X(\tau) = \int_{-\infty}^{+\infty} e^{i\tau x}\frac1{\pi(1 + x^2)}\, dx\ . [[/math]]

This integral is easy to do by contour methods, and gives us

[[math]] k_X(\tau) = k_Y(\tau) = e^{-|\tau|}\ . [[/math]]

Hence,

[[math]] k_{X + Y}(\tau) = (e^{-|\tau|})^2 = e^{-2|\tau|}\ , [[/math]]

and since

[[math]] k_Z(\tau) = k_{X + Y}(\tau/2)\ , [[/math]]

we have

[[math]] k_Z(\tau) = e^{-2|\tau/2|} = e^{-|\tau|}\ . [[/math]]

This shows that [math]k_Z = k_X = k_Y[/math], and leads to the conclusion that [math]f_Z = f_X = f_Y[/math]. It follows from this that if [math]X_1[/math], [math]X_2[/math], [math]\ldots[/math], [math]X_n[/math] is an independent trials process with common Cauchy density, and if

[[math]] A_n = \frac{X_1 + X_2 + \cdots+ X_n}n [[/math]]

is the average of the [math]X_i[/math], then [math]A_n[/math] has the same density as do the [math]X_i[/math]. This means that the Law of Large Numbers fails for this process; the distribution of the average [math]A_n[/math] is exactly the same as for the individual terms. Our proof of the Law of Large Numbers fails in this case because the variance of [math]X_i[/math] is not finite.
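A short simulation makes this failure visible. The sketch below draws Cauchy samples by inverting the distribution function [math]F(x) = 1/2 + \arctan(x)/\pi[/math], forms averages [math]A_n[/math], and measures their spread by the interquartile range, which equals 2 for the Cauchy density [math]f(x)[/math] since its quartiles are [math]\pm 1[/math]. The seed, the number of trials, and the function names are assumptions of ours.

```python
import math
import random

def cauchy_sample(rng):
    """Draw from the density 1/(pi(1 + x^2)) by inverting its distribution function."""
    return math.tan(math.pi * (rng.random() - 0.5))

def average_of_cauchy(n, rng):
    """The average A_n of n independent Cauchy samples."""
    return sum(cauchy_sample(rng) for _ in range(n)) / n

def interquartile_range(values):
    """Difference between the sample upper and lower quartiles."""
    ordered = sorted(values)
    k = len(ordered)
    return ordered[(3 * k) // 4] - ordered[k // 4]

def iqr_of_averages(n, trials=4000, seed=0):
    """Interquartile range of `trials` independent copies of A_n."""
    rng = random.Random(seed)
    return interquartile_range([average_of_cauchy(n, rng) for _ in range(trials)])

for n in (1, 10, 100):
    print(n, iqr_of_averages(n))
```

For a density with finite variance the spread of [math]A_n[/math] would shrink roughly like [math]1/\sqrt n[/math]; here the printed values should all stay near 2, up to sampling error.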

General references

Doyle, Peter G. (2006). "Grinstead and Snell's Introduction to Probability" (PDF). Retrieved June 6, 2024.

Notes

[1] H. Dym and H. P. McKean, Fourier Series and Integrals (New York: Academic Press, 1972).
[2] B. V. Gnedenko and A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables (Reading: Addison-Wesley, 1968), p. 233.
