guide:A33f712f32: Difference between revisions

From Stochiki
No edit summary
mNo edit summary
 
Line 1: Line 1:
The '''moment-generating function''' of a [[wikipedia:random_variable|random variable]] is an alternative specification of its [[wikipedia:probability_distribution|probability distribution]]. Thus, it provides the basis of an alternative route to analytical results compared with working directly with [[wikipedia:probability_density_function|probability density functions]] or [[wikipedia:cumulative_distribution_function|cumulative distribution functions]]. There are particularly simple results for the moment-generating functions of distributions defined by the weighted sums of random variables. Note, however, that not all random variables have moment-generating functions.
The '''moment-generating function''' of a random variable is an alternative specification of its [[guide:82d603b116|probability distribution]]. Thus, it provides the basis of an alternative route to analytical results compared with working directly with [[guide:82d603b116#Continuous probability distribution|probability density functions]] or [[guide:82d603b116#Cumulative_distribution_function|cumulative distribution functions]]. There are particularly simple results for the moment-generating functions of distributions defined by the weighted sums of random variables. Note, however, that not all random variables have moment-generating functions.


In addition to univariate distributions, moment-generating functions can be defined for vector- or matrix-valued random variables, and can even be extended to more general cases.
In addition to univariate distributions, moment-generating functions can be defined for vector- or matrix-valued random variables, and can even be extended to more general cases.


The moment-generating function does not always exist even for real-valued arguments, unlike the [[wikipedia:Characteristic function (probability theory)|characteristic function]]. There are relations between the behavior of the moment-generating function of a distribution and properties of the distribution, such as the existence of moments.
The moment-generating function does not always exist even for real-valued arguments, unlike the [[Characteristic function (probability theory)|characteristic function]]. There are relations between the behavior of the moment-generating function of a distribution and properties of the distribution, such as the existence of moments.


The '''probability generating function''' of a [[wikipedia:discrete random variable|discrete random variable]] is a [[wikipedia:power series|power series]] representation (the generating function) of the [[wikipedia:probability_mass_function|probability mass function]]of the random variable.  Probability generating functions are often employed for their succinct description of the sequence of probabilities <math>\operatorname{P}(X=i)</math> in the probability mass function for a random variable <math>X</math>, and to make available the well-developed theory of power series with non-negative coefficients.
The '''probability generating function''' of a [[discrete random variable|discrete random variable]] is a [[power series|power series]] representation (the generating function) of the [[guide:82d603b116#Probability_Mass_Function|probability mass function]] of the random variable.  Probability generating functions are often employed for their succinct description of the sequence of probabilities <math>\operatorname{P}(X=i)</math> in the probability mass function for a random variable <math>X</math>, and to make available the well-developed theory of power series with non-negative coefficients.


==The Moment Generating Function ==
==The Moment Generating Function ==


===Definition===
===Definition===
In [[wikipedia:probability theory|probability theory]] and [[wikipedia:statistics|statistics]], the '''moment-generating function''' of a [[wikipedia:random_variable|random variable]] <math>X</math> is<math display="block"> M_X(t) := \operatorname{E}\!\left[e^{tX}\right], \quad t \in \mathbb{R}, </math>
In probability theory and statistics, the '''moment-generating function''' of a [[guide:1b8642f694|random variable]] <math>X</math> is<math display="block"> M_X(t) := \operatorname{E}\!\left[e^{tX}\right], \quad t \in \mathbb{R}, </math>


wherever this [[wikipedia:expectation|expectation]] exists. In other terms, the '''moment-generating function''' can be interpreted as the [[wikipedia:expectation|expectation]] of the random variable <math> e^{tX}</math>.
wherever this [[guide:82d603b116#Expected_Value|expectation]] exists. In other terms, the '''moment-generating function''' can be interpreted as the [[guide:82d603b116#Expected_Value|expectation]] of the random variable <math> e^{tX}</math>.


A key problem with moment-generating functions is that moments and the moment-generating function may not exist, as the integrals need not converge absolutely. By contrast, the [[wikipedia:Characteristic function (probability theory)|characteristic function]] always exists and thus may be used instead.
A key problem with moment-generating functions is that moments and the moment-generating function may not exist, as the integrals need not converge absolutely. By contrast, the [[Characteristic function (probability theory)|characteristic function]] always exists and thus may be used instead.


The reason for defining this function is that it can be used to find all the moments of the distribution.<ref>Bulmer, M.G., Principles of Statistics, Dover, 1979, pp. 75&ndash;79</ref>  The series expansion of <math>e^{tX}</math> is:
The reason for defining this function is that it can be used to find all the moments of the distribution.<ref>Bulmer, M.G., Principles of Statistics, Dover, 1979, pp. 75&ndash;79</ref>  The series expansion of <math>e^{tX}</math> is:
Line 34: Line 34:
</math>
</math>


where <math>m_n</math> is the <math>n</math><sup>th</sup> [[wikipedia:moment (mathematics)|moment]]. Differentiating <math>M_X(t)</math> <math>i</math> times with respect to <math>t</math> and setting <math>t=0</math> we obtain the <math>i</math><sup>th</sup> moment about the origin, <math>m_i</math>,
where <math>m_n</math> is the <math>n</math><sup>th</sup> [[guide:E4d753a3b5#Moments|moment]]. Differentiating <math>M_X(t)</math> <math>i</math> times with respect to <math>t</math> and setting <math>t=0</math> we obtain the <math>i</math><sup>th</sup> moment about the origin, <math>m_i</math>,
see [[wikipedia:Moment-generating function#Calculations of moments|Calculations of moments]] below.
see [[#Calculations of moments|Calculations of moments]] below.


===Calculation===
===Calculation===
Line 44: Line 44:
! Case !! Calculation
! Case !! Calculation
|-
|-
| General || <math>M_X(t) = \int_{-\infty}^\infty e^{tx}\,dF(x)</math>, using the [[wikipedia:Riemann&ndash;Stieltjes integral|Riemann&ndash;Stieltjes integral]], and  where <math>F</math> is the [[wikipedia:Cumulative_distribution_function|cumulative distribution function]]
| General || <math>M_X(t) = \int_{-\infty}^\infty e^{tx}\,dF(x)</math>, using the [[Riemann&ndash;Stieltjes integral|Riemann&ndash;Stieltjes integral]], and  where <math>F</math> is the [[guide:82d603b116#Cumulative_distribution_function|cumulative distribution function]]
|-
|-
| Discrete [[wikipedia:Probability_Mass_Function|probability mass function]] || <math>M_X(t)=\sum_{i=1}^\infty e^{tx_i}\, p_i</math>
| Discrete [[guide:82d603b116#Probability_Mass_Function|probability mass function]] || <math>M_X(t)=\sum_{i=1}^\infty e^{tx_i}\, p_i</math>
|-
|-
| Continuous [[wikipedia:probability_density_function|probability density function]] || <math> M_X(t)  = \int_{-\infty}^\infty e^{tx} f(x)\,dx </math>
| Continuous [[guide:82d603b116#Continuous probability distribution|probability density function]] || <math> M_X(t)  = \int_{-\infty}^\infty e^{tx} f(x)\,dx </math>
|}
|}


====Sum of independent random variables====
====Sum of independent random variables====


If <math>S_n = \sum_{i=1}^{n} a_i X_i</math>, where the <math>X_i</math> are independent random variables and the  <math>a_i</math> are constants, then the probability density function for <math>S_n</math> is the [[wikipedia:convolution|convolution]] of the probability density functions of the scaled variables <math>a_i X_i</math>, and the moment-generating function for <math>S_n</math> is given by
If <math>S_n = \sum_{i=1}^{n} a_i X_i</math>, where the <math>X_i</math> are independent random variables and the  <math>a_i</math> are constants, then the probability density function for <math>S_n</math> is the [[convolution|convolution]] of the probability density functions of the scaled variables <math>a_i X_i</math>, and the moment-generating function for <math>S_n</math> is given by


<math display="block">
<math display="block">
Line 61: Line 61:
====Calculations of moments====
====Calculations of moments====


The moment-generating function is so called because if it exists on an open interval around <math>t=0</math> then it is the [[wikipedia:exponential generating function|exponential generating function]] of the [[wikipedia:moment (mathematics)|moments]] of the [[wikipedia:probability_distribution|probability distribution]]:
The moment-generating function is so called because if it exists on an open interval around <math>t=0</math> then it is the [[exponential generating function|exponential generating function]] of the [[guide:E4d753a3b5#Moments|moments]] of the [[guide:82d603b116|probability distribution]]:


<math display="block">m_n = E \left( X^n \right) = M_X^{(n)}(0) = \frac{d^n M_X}{dt^n}(0).</math>
<math display="block">m_n = E \left( X^n \right) = M_X^{(n)}(0) = \frac{d^n M_X}{dt^n}(0).</math>
Line 68: Line 68:


===Relation to other functions===
===Relation to other functions===
Related to the moment-generating function are a number of other [[wikipedia:integral transform|transforms]] that are common in probability theory:
Related to the moment-generating function are a number of other [[integral transform|transforms]] that are common in probability theory:


{| class="table"
{| class="table"
Line 75: Line 75:
! Function !! Description
! Function !! Description
|-
|-
| [[wikipedia:characteristic function (probability theory)|Characteristic function]] || The characteristic function <math>\varphi_X(t)</math> is related to the moment-generating function via <math>\varphi_X(t) = M_{iX}(t) = M_X(it):</math> the characteristic function is the moment-generating function of ''iX'' or the moment generating function of <math>X</math> evaluated on the imaginary axis.  This function can also be viewed as the [[wikipedia:Fourier transform|Fourier transform]] of the [[wikipedia:probability_density_function|probability density function]], which can therefore be deduced from it by inverse Fourier transform.
| [[characteristic function (probability theory)|Characteristic function]] || The characteristic function <math>\varphi_X(t)</math> is related to the moment-generating function via <math>\varphi_X(t) = M_{iX}(t) = M_X(it):</math> the characteristic function is the moment-generating function of ''iX'' or the moment generating function of <math>X</math> evaluated on the imaginary axis.  This function can also be viewed as the [[Fourier transform|Fourier transform]] of the [[guide:82d603b116#Continuous probability distribution|probability density function]], which can therefore be deduced from it by inverse Fourier transform.
|-
|-
| [[wikipedia:cumulant-generating function|Cumulant-generating function]] || The cumulant-generating function is defined as the logarithm of the moment-generating function; some instead define the cumulant-generating function as the logarithm of the [[wikipedia:Characteristic function (probability theory)|characteristic function]], while others call this latter the ''second'' cumulant-generating function.
| [[cumulant-generating function|Cumulant-generating function]] || The cumulant-generating function is defined as the logarithm of the moment-generating function; some instead define the cumulant-generating function as the logarithm of the [[Characteristic function (probability theory)|characteristic function]], while others call this latter the ''second'' cumulant-generating function.
|-
|-
| [[#pgf|Probability generating functions]] || The probability-generating function is defined as <math>G(z) = E[z^X].\,</math> This immediately implies that <math>G(e^t)  = E[e^{tX}] = M_X(t).\,</math>
| [[#pgf|Probability generating functions]] || The probability-generating function is defined as <math>G(z) = E[z^X].\,</math> This immediately implies that <math>G(e^t)  = E[e^{tX}] = M_X(t).\,</math>
Line 83: Line 83:


===Examples===
===Examples===
Here are some examples of the moment generating function and the characteristic function for comparison. It can be seen that the characteristic function is a [[wikipedia:Wick rotation|Wick rotation]] of the moment generating function Mx(t) when the latter exists.
Here are some examples of the moment generating function and the characteristic function for comparison.


<table class="table">
<table class="table">
Line 94: Line 94:


<tr>
<tr>
<td>[[wikipedia:Bernoulli distribution|Bernoulli]]  <math>\, \operatorname{P}(X=1)=p</math> </td>
<td>[[guide:B5ab48c211#Bernoulli_distribution|Bernoulli]]  <math>\, \operatorname{P}(X=1)=p</math> </td>
<td>&nbsp; <math>\, 1-p+pe^t</math></td>
<td>&nbsp; <math>\, 1-p+pe^t</math></td>
<td> &nbsp; <math>\, 1-p+pe^{it}</math></td>
<td> &nbsp; <math>\, 1-p+pe^{it}</math></td>
<tr>
<tr>
<td> [[wikipedia:geometric_distribution|Geometric]]  <math>(1 - p)^{k-1}\,p\!</math></td>
<td> [[guide:B5ab48c211#geometric_distribution|Geometric]]  <math>(1 - p)^{k-1}\,p\!</math></td>
<td> &nbsp; <math>\frac{p e^t}{1-(1-p) e^t}\!</math>    <br>  &nbsp;<math>\forall t<-\ln(1-p)\!</math></td>
<td> &nbsp; <math>\frac{p e^t}{1-(1-p) e^t}\!</math>    <br>  &nbsp;<math>\forall t<-\ln(1-p)\!</math></td>
<td> &nbsp; <math>\frac{p e^{it}}{1-(1-p)\,e^{it}}\!</math></td>
<td> &nbsp; <math>\frac{p e^{it}}{1-(1-p)\,e^{it}}\!</math></td>
</tr>
</tr>
<tr><td>[[wikipedia:binomial_distribution|Binomial]] <math>B(n, p)</math></td>
<tr><td>[[guide:B5ab48c211#Binomial|Binomial]] <math>B(n, p)</math></td>
<td> &nbsp; <math>\, (1-p+pe^t)^n</math></td>
<td> &nbsp; <math>\, (1-p+pe^t)^n</math></td>
<td> &nbsp; <math>\, (1-p+pe^{it})^n</math></td>
<td> &nbsp; <math>\, (1-p+pe^{it})^n</math></td>
<tr>
<tr>
<td>[[wikipedia:poisson_distribution|Poisson]] Pois(<math>\lambda</math>)</td>
<td>[[guide:B5ab48c211#Poisson_Distribution|Poisson]] Pois(<math>\lambda</math>)</td>
<td> &nbsp; <math>\, e^{\lambda(e^t-1)}</math> </td>
<td> &nbsp; <math>\, e^{\lambda(e^t-1)}</math> </td>
<td>&nbsp; <math>\, e^{\lambda(e^{it}-1)}</math></td>  
<td>&nbsp; <math>\, e^{\lambda(e^{it}-1)}</math></td>  
<tr>
<tr>
<td>[[wikipedia:Uniform distribution (continuous)|Uniform (continuous)]] <math>U(a, b)</math></td>
<td>[[guide:269af6cf67#Uniform_distribution (continuous)|Uniform (continuous)]] <math>U(a, b)</math></td>
<td>&nbsp; <math>\, \frac{e^{tb} - e^{ta}}{t(b-a)}</math></td>
<td>&nbsp; <math>\, \frac{e^{tb} - e^{ta}}{t(b-a)}</math></td>
<td> &nbsp; <math>\, \frac{e^{itb} - e^{ita}}{it(b-a)}</math></td>
<td> &nbsp; <math>\, \frac{e^{itb} - e^{ita}}{it(b-a)}</math></td>
</tr>
</tr>


<tr><td>[[wikipedia:Discrete uniform distribution|Uniform (discrete)]] <math>U(a, b)</math></td>
<tr><td>[[Discrete uniform distribution|Uniform (discrete)]] <math>U(a, b)</math></td>
<td> &nbsp; <math>\, \frac{e^{at} - e^{(b+1)t}}{(b-a+1)(1-e^{t})}</math></td>
<td> &nbsp; <math>\, \frac{e^{at} - e^{(b+1)t}}{(b-a+1)(1-e^{t})}</math></td>
<td>&nbsp; <math>\, \frac{e^{ait} - e^{(b+1)it}}{(b-a+1)(1-e^{it})}</math></td>
<td>&nbsp; <math>\, \frac{e^{ait} - e^{(b+1)it}}{(b-a+1)(1-e^{it})}</math></td>
</tr>
</tr>
<tr> <td>[[wikipedia:normal_distribution|Normal]] <math>N(\mu, \sigma^2)</math></td>
<tr> <td>[[guide:269af6cf67#Normal_Distribution|Normal]] <math>N(\mu, \sigma^2)</math></td>
<td>&nbsp; <math>\, e^{t\mu + \frac{1}{2}\sigma^2t^2}</math></td>
<td>&nbsp; <math>\, e^{t\mu + \frac{1}{2}\sigma^2t^2}</math></td>
<td>&nbsp; <math>\, e^{it\mu - \frac{1}{2}\sigma^2t^2}</math></td>
<td>&nbsp; <math>\, e^{it\mu - \frac{1}{2}\sigma^2t^2}</math></td>
</tr>
</tr>
<tr> <td>[[wikipedia:Chi-squared distribution|Chi-squared]] <math>\chi^2_k</math></td>
<tr> <td>[[Chi-squared distribution|Chi-squared]] <math>\chi^2_k</math></td>
<td> &nbsp; <math>\, (1 - 2t)^{-k/2}</math></td>
<td> &nbsp; <math>\, (1 - 2t)^{-k/2}</math></td>
<td>&nbsp; <math>\, (1 - 2it)^{-k/2}</math></td>
<td>&nbsp; <math>\, (1 - 2it)^{-k/2}</math></td>
</tr>
</tr>
<tr><td>[[wikipedia:Gamma_distribution|Gamma]] <math>\Gamma(k, \theta)</math></td>
<tr><td>[[guide:269af6cf67#Gamma_Distribution|Gamma]] <math>\Gamma(k, \theta)</math></td>
<td> &nbsp; <math>\, (1 - t\theta)^{-k}</math></td>
<td> &nbsp; <math>\, (1 - t\theta)^{-k}</math></td>
<td> &nbsp; <math>\, (1 - it\theta)^{-k}</math></td>
<td> &nbsp; <math>\, (1 - it\theta)^{-k}</math></td>
</tr>
</tr>
<tr><td>[[wikipedia:Exponential_distribution|Exponential]] Exp(<math>\lambda</math>)</td>
<tr><td>[[guide:269af6cf67#Exponential_Distribution|Exponential]] Exp(<math>\lambda</math>)</td>
<td> &nbsp; <math>\, (1-t\lambda^{-1})^{-1}, \, (t<\lambda) </math></td>
<td> &nbsp; <math>\, (1-t\lambda^{-1})^{-1}, \, (t<\lambda) </math></td>
<td> &nbsp; <math>\, (1 - it\lambda^{-1})^{-1}</math></td>
<td> &nbsp; <math>\, (1 - it\lambda^{-1})^{-1}</math></td>
</tr>
</tr>
<tr> <td>[[wikipedia:Multivariate normal distribution|Multivariate normal]] <math>N(\mu, \Sigma)</math></td>
<tr> <td>[[Multivariate normal distribution|Multivariate normal]] <math>N(\mu, \Sigma)</math></td>
<td>&nbsp; <math>\, e^{t^\mathrm{T} \mu + \frac{1}{2} t^\mathrm{T} \Sigma t}</math></td>
<td>&nbsp; <math>\, e^{t^\mathrm{T} \mu + \frac{1}{2} t^\mathrm{T} \Sigma t}</math></td>
<td>&nbsp; <math>\, e^{i t^\mathrm{T} \mu - \frac{1}{2} t^\mathrm{T} \Sigma t}</math></td>
<td>&nbsp; <math>\, e^{i t^\mathrm{T} \mu - \frac{1}{2} t^\mathrm{T} \Sigma t}</math></td>
</tr>
</tr>
<tr><td>[[wikipedia:Degenerate distribution|Degenerate]] ''δ<sub>a</sub>''</td>
<tr><td>[[Degenerate distribution|Degenerate]] ''δ<sub>a</sub>''</td>
<td> &nbsp; <math>\, e^{ta}</math></td>
<td> &nbsp; <math>\, e^{ta}</math></td>
<td>&nbsp; <math>\, e^{ita}</math></td>
<td>&nbsp; <math>\, e^{ita}</math></td>
</tr>
</tr>
<tr> <td>[[wikipedia:Laplace distribution|Laplace]] <math>L(\mu, b)</math></td>
<tr> <td>[[Laplace distribution|Laplace]] <math>L(\mu, b)</math></td>
<td>&nbsp; <math>\, \frac{e^{t\mu}}{1 - b^2t^2}</math></td>
<td>&nbsp; <math>\, \frac{e^{t\mu}}{1 - b^2t^2}</math></td>
<td> &nbsp; <math>\, \frac{e^{it\mu}}{1 + b^2t^2}</math></td>
<td> &nbsp; <math>\, \frac{e^{it\mu}}{1 + b^2t^2}</math></td>
</tr>
</tr>
<tr><td>[[wikipedia:Negative binomial distribution|Negative Binomial]] <math>NB(r, p)</math></td>
<tr><td>[[guide:B5ab48c211#Negative_Binomial|Negative Binomial]] <math>NB(r, p)</math></td>
  <td>&nbsp; <math>\, \frac{(1-p)^r}{(1-pe^t)^r}</math></td>
  <td>&nbsp; <math>\, \frac{(1-p)^r}{(1-pe^t)^r}</math></td>
<td> &nbsp; <math>\, \frac{(1-p)^r}{(1-pe^{it})^r}</math></td>
<td> &nbsp; <math>\, \frac{(1-p)^r}{(1-pe^{it})^r}</math></td>
</tr>
</tr>
<tr>
<tr>
<td> [[wikipedia:Cauchy distribution|Cauchy]] Cauchy(<math>\mu, \theta</math>)</td>
<td> [[Cauchy distribution|Cauchy]] Cauchy(<math>\mu, \theta</math>)</td>
<td> does not exist</td>
<td> does not exist</td>
<td> &nbsp; <math>\, e^{it\mu -\theta|t|}</math></td>
<td> &nbsp; <math>\, e^{it\mu -\theta|t|}</math></td>
Line 163: Line 163:
===Definition===
===Definition===


If <math>X</math> is a [[wikipedia:discrete random variable|discrete random variable]] taking values in the non-negative [[wikipedia:integer|integer]]s {0,1, ...}, then the ''probability generating function'' of <math>X</math> is defined as
If <math>X</math> is a [[discrete random variable|discrete random variable]] taking values in the non-negative [[integer|integer]]s {0,1, ...}, then the ''probability generating function'' of <math>X</math> is defined as
<ref>http://www.am.qub.ac.uk/users/g.gribakin/sor/Chap3.pdf</ref><math display="block">G(z) = \operatorname{E} (z^X) = \sum_{x=0}^{\infty}p(x)z^x,</math>
<ref>http://www.am.qub.ac.uk/users/g.gribakin/sor/Chap3.pdf</ref><math display="block">G(z) = \operatorname{E} (z^X) = \sum_{x=0}^{\infty}p(x)z^x,</math>
where <math>p</math> is the [[wikipedia:probability_mass_function|probability mass function]] of <math>X</math>.  Note that the subscripted notations <math>G_X</math> and <math>p_X</math> are often used to emphasize that these pertain to a particular random variable <math>X</math>, and to its distribution. The power series [[wikipedia:absolute convergence|converges absolutely]] at least for all [[wikipedia:complex number|complex number]]s <math>z</math> with <math>|z| \leq 1</math>; in many examples the radius of convergence is larger.
where <math>p</math> is the [[guide:82d603b116#Probability_Mass_Function|probability mass function]] of <math>X</math>.  Note that the subscripted notations <math>G_X</math> and <math>p_X</math> are often used to emphasize that these pertain to a particular random variable <math>X</math>, and to its distribution. The power series [[absolute convergence|converges absolutely]] at least for all [[complex number|complex number]]s <math>z</math> with <math>|z| \leq 1</math>; in many examples the radius of convergence is larger.


===Properties===
===Properties===
Line 176: Line 176:
\lim_{z \uparrow 1} G(z) = G(z^{-})
\lim_{z \uparrow 1} G(z) = G(z^{-})
</math>
</math>
, since the probabilities must sum to one. So the [[wikipedia:radius of convergence|radius of convergence]] of any probability generating function must be at least 1, by [[wikipedia:Abel's theorem|Abel's theorem]] for power series with non-negative coefficients.
, since the probabilities must sum to one. So the [[radius of convergence|radius of convergence]] of any probability generating function must be at least 1, by [[Abel's theorem|Abel's theorem]] for power series with non-negative coefficients.


====Probabilities and expectations====
====Probabilities and expectations====
Line 182: Line 182:
The following properties allow the derivation of various basic quantities related to <math>X</math>:
The following properties allow the derivation of various basic quantities related to <math>X</math>:


1. The probability mass function of <math>X</math> is recovered by taking [[wikipedia:derivative|derivative]]s of  <math>G</math>:  
1. The probability mass function of <math>X</math> is recovered by taking [[derivative|derivative]]s of  <math>G</math>:  


<math display="block">  p(k) = \operatorname{P}(X = k) = \frac{G^{(k)}(0)}{k!}.</math>
<math display="block">  p(k) = \operatorname{P}(X = k) = \frac{G^{(k)}(0)}{k!}.</math>
Line 190: Line 190:
3. The normalization of the probability mass function can be expressed in terms of the generating function by<math display="block">\operatorname{E}(1)=G(1^-)=\sum_{i=0}^\infty p(i)=1.</math>
3. The normalization of the probability mass function can be expressed in terms of the generating function by<math display="block">\operatorname{E}(1)=G(1^-)=\sum_{i=0}^\infty p(i)=1.</math>


The [[wikipedia:expected_value|expectation]] of <math>X</math> is given by<math display="block"> \operatorname{E}\left(X\right) = G'(1^-).</math>
The [[guide:82d603b116#Expected_Value|expectation]] of <math>X</math> is given by<math display="block"> \operatorname{E}\left(X\right) = G'(1^-).</math>


More generally, the <math>k</math><sup>th</sup> [[wikipedia:factorial moment|factorial moment]], <math>\textrm{E}(X(X -  1) \cdots (X - k + 1))</math> of <math>X</math> is given by<math display="block">\textrm{E}\left(\frac{X!}{(X-k)!}\right) = G^{(k)}(1^-), \quad k \geq 0.</math>
More generally, the <math>k</math><sup>th</sup> [[factorial moment|factorial moment]], <math>\textrm{E}(X(X -  1) \cdots (X - k + 1))</math> of <math>X</math> is given by<math display="block">\textrm{E}\left(\frac{X!}{(X-k)!}\right) = G^{(k)}(1^-), \quad k \geq 0.</math>


So the [[wikipedia:variance|variance]] of <math>X</math> is given by<math display="block">\operatorname{Var}(X)=G''(1^-) + G'(1^-) - \left [G'(1^-)\right ]^2.</math>
So the [[guide:E4d753a3b5|variance]] of <math>X</math> is given by<math display="block">\operatorname{Var}(X)=G''(1^-) + G'(1^-) - \left [G'(1^-)\right ]^2.</math>


4. <math>G_X(e^{t}) = M_X(t)</math>, where <math>G_X</math> is the probability generating function of the random variable <math>X</math> and <math>M_X</math> is its moment generating function.
4. <math>G_X(e^{t}) = M_X(t)</math>, where <math>G_X</math> is the probability generating function of the random variable <math>X</math> and <math>M_X</math> is its moment generating function.
Line 200: Line 200:
====Functions of independent random variables====
====Functions of independent random variables====


Probability generating functions are particularly useful for dealing with functions of [[wikipedia:statistical independence|independent]] random variables. For example:
Probability generating functions are particularly useful for dealing with functions of [[guide:Af39987afc|independent]] random variables. For example:


* If <math>X_1, X_2, \ldots, X_n</math> is a sequence of independent (and not necessarily identically distributed) random variables, and  
* If <math>X_1, X_2, \ldots, X_n</math> is a sequence of independent (and not necessarily identically distributed) random variables, and  
Line 222: Line 222:
<math display="block">G_S(z) = G_{X_1}(z)G_{X_2}(1/z).</math>
<math display="block">G_S(z) = G_{X_1}(z)G_{X_2}(1/z).</math>


*Suppose that <math>N</math> is also an independent, discrete random variable taking values on the non-negative integers, with probability generating function <math>G_N</math>.  If the <math>X_1,X_2, \ldots, X_N</math> are independent ''and'' identically distributed with common probability generating function <math>G_X</math>, then <math>G_{S_N}(z) = G_N(G_X(z)).</math> This can be seen, using the [[wikipedia:law of total expectation|law of total expectation]], as follows:
*Suppose that <math>N</math> is also an independent, discrete random variable taking values on the non-negative integers, with probability generating function <math>G_N</math>.  If the <math>X_1,X_2, \ldots, X_N</math> are independent ''and'' identically distributed with common probability generating function <math>G_X</math>, then <math>G_{S_N}(z) = G_N(G_X(z)).</math> This can be seen, using the [[law of total expectation|law of total expectation]], as follows:


<math display="block">
<math display="block">
Line 238: Line 238:
</math>
</math>


This last fact is useful in the study of [[wikipedia:Galton&ndash;Watson process|Galton&ndash;Watson process]]es.
This last fact is useful in the study of [[Galton&ndash;Watson process|Galton&ndash;Watson process]]es.


*Suppose again that <math>N</math> is also an independent, discrete random variable taking values on the non-negative integers, with probability generating function <math>G_N</math> and probability mass function <math>f_i = \operatorname{P}(N = i)</math>.  If the <math>X_1, \ldots, X_N </math> are independent, but ''not'' identically distributed random variables, where <math>G_{X_i}</math> denotes the probability generating function of <math>X_i</math>, then
*Suppose again that <math>N</math> is also an independent, discrete random variable taking values on the non-negative integers, with probability generating function <math>G_N</math> and probability mass function <math>f_i = \operatorname{P}(N = i)</math>.  If the <math>X_1, \ldots, X_N </math> are independent, but ''not'' identically distributed random variables, where <math>G_{X_i}</math> denotes the probability generating function of <math>X_i</math>, then
Line 254: Line 254:
! Distribution !! PGF
! Distribution !! PGF
|-
|-
| [[wikipedia:Degenerate distribution|Degenerate]] <math>\delta_a</math> || <math>G(z) = \left(z^a\right)</math>  
| [[Degenerate distribution|Degenerate]] <math>\delta_a</math> || <math>G(z) = \left(z^a\right)</math>  
|-
|-
| [[wikipedia:Bernoulli distribution|Bernoulli]] <math>\, \operatorname{P}(X=1)=p</math> || <math>G(z) = (1-p) + pz </math>
| [[guide:B5ab48c211#Bernoulli_distribution|Bernoulli]]   <math>\, \operatorname{P}(X=1)=p</math> || <math>G(z) = (1-p) + pz </math>
|-
|-
| [[wikipedia:binomial_distribution|Binomial]] <math>\operatorname{B}(n, p)</math> || <math>G(z) = \left[(1-p) + pz\right]^n  </math>
| [[guide:B5ab48c211#Binomial|Binomial]] <math>\operatorname{B}(n, p)</math> || <math>G(z) = \left[(1-p) + pz\right]^n  </math>
|-
|-
| [[wikipedia:Negative binomial distribution|Negative Binomial]] <math>\operatorname{NB}(r,p)</math>  || <math>G(z) = \left(\frac{pz}{1 - (1-p)z}\right)^r</math>
| [[guide:B5ab48c211#Negative_Binomial|Negative Binomial]] <math>\operatorname{NB}(r,p)</math>  || <math>G(z) = \left(\frac{pz}{1 - (1-p)z}\right)^r</math>
|-
|-
| [[wikipedia:poisson_distribution|Poisson]] <math>\textrm{Pois}(\lambda) </math> || <math>G(z) = \textrm{e}^{\lambda(z - 1)}</math>
| [[guide:B5ab48c211#Poisson_Distribution|Poisson]] <math>\textrm{Pois}(\lambda) </math> || <math>G(z) = \textrm{e}^{\lambda(z - 1)}</math>
|}
|}
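
The PGF properties above can be checked exactly on a small case. The following is a minimal sketch, not part of the original article: it assumes the Bernoulli PGF is <math>(1-p) + pz</math>, stores a PGF as a list of polynomial coefficients, and uses illustrative helper names (<code>poly_mul</code>, <code>poly_derivative</code>, <code>poly_eval</code>) that are hypothetical. Multiplying <math>n</math> Bernoulli PGFs gives the binomial PGF, its coefficients give the probability mass function, and its derivatives at <math>z = 1</math> give the mean and variance.

```python
from fractions import Fraction
from math import comb


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of z)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_derivative(a):
    """Coefficient list of the derivative of the polynomial a."""
    return [k * c for k, c in enumerate(a)][1:]


def poly_eval(a, z):
    """Evaluate the polynomial a at the point z."""
    return sum(c * z**k for k, c in enumerate(a))


n, p = 5, Fraction(1, 3)          # illustrative parameter choices
bernoulli_pgf = [1 - p, p]        # assumed form: G(z) = (1 - p) + p z

# PGF of a sum of independent variables = product of their PGFs.
binomial_pgf = [Fraction(1)]
for _ in range(n):
    binomial_pgf = poly_mul(binomial_pgf, bernoulli_pgf)

# Coefficient of z^k equals G^(k)(0) / k!, i.e. P(X = k).
pmf = binomial_pgf

g1 = poly_eval(poly_derivative(binomial_pgf), 1)                   # G'(1)
g2 = poly_eval(poly_derivative(poly_derivative(binomial_pgf)), 1)  # G''(1)
mean = g1
variance = g2 + g1 - g1**2
```

Exact rational arithmetic is used so that the recovered probabilities, mean and variance can be compared with the closed-form binomial values without rounding error.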



Latest revision as of 00:52, 5 April 2024

The moment-generating function of a random variable is an alternative specification of its probability distribution. Thus, it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the moment-generating functions of distributions defined by the weighted sums of random variables. Note, however, that not all random variables have moment-generating functions.

In addition to univariate distributions, moment-generating functions can be defined for vector- or matrix-valued random variables, and can even be extended to more general cases.

The moment-generating function does not always exist even for real-valued arguments, unlike the characteristic function. There are relations between the behavior of the moment-generating function of a distribution and properties of the distribution, such as the existence of moments.

The probability generating function of a discrete random variable is a power series representation (the generating function) of the probability mass function of the random variable. Probability generating functions are often employed for their succinct description of the sequence of probabilities [math]\operatorname{P}(X=i)[/math] in the probability mass function for a random variable [math]X[/math], and to make available the well-developed theory of power series with non-negative coefficients.

The Moment Generating Function

Definition

In probability theory and statistics, the moment-generating function of a random variable [math]X[/math] is

[[math]] M_X(t) := \operatorname{E}\!\left[e^{tX}\right], \quad t \in \mathbb{R}, [[/math]]

wherever this expectation exists. In other terms, the moment-generating function can be interpreted as the expectation of the random variable [math] e^{tX}[/math].

A key problem with moment-generating functions is that moments and the moment-generating function may not exist, as the integrals need not converge absolutely. By contrast, the characteristic function always exists and thus may be used instead.

The reason for defining this function is that it can be used to find all the moments of the distribution.[1] The series expansion of [math]e^{tX}[/math] is:

[[math]] e^{t\,X} = 1 + t\,X + \frac{t^2\,X^2}{2!} + \cdots +\frac{t^n\,X^n}{n!} + \cdots. [[/math]]

Hence:

[[math]] \begin{align*} M_X(t) = \operatorname{E}\left[e^{t\,X}\right] = \sum_{n=0}^\infty \frac{t^n\operatorname{E}[X^n]}{n!} = \sum_{n=0}^\infty \frac{t^nm_n}{n!} \end{align*} [[/math]]

where [math]m_n[/math] is the [math]n[/math]th moment. Differentiating [math]M_X(t)[/math] [math]i[/math] times with respect to [math]t[/math] and setting [math]t=0[/math], we obtain the [math]i[/math]th moment about the origin, [math]m_i[/math]; see Calculations of moments below.
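
As a quick numerical illustration of the expansion [math]M_X(t) = \sum_n t^n m_n / n![/math], the sketch below assumes an exponential distribution with rate <code>lam</code> (an illustrative choice, not taken from the text above), for which [math]m_n = n!/\lambda^n[/math] and [math]M_X(t) = (1 - t/\lambda)^{-1}[/math] for [math]t \lt \lambda[/math]. The helper name <code>mgf_from_moments</code> is hypothetical.

```python
from math import factorial

def mgf_from_moments(moments, t):
    """Truncated series sum_n t^n * m_n / n! built from moments m_0, m_1, ..."""
    return sum(t**n * m / factorial(n) for n, m in enumerate(moments))

# Illustrative assumption: X ~ Exp(lam), so m_n = n! / lam^n.
lam = 2.0
moments = [factorial(n) / lam**n for n in range(60)]

t = 0.5
series_value = mgf_from_moments(moments, t)
closed_form = 1.0 / (1.0 - t / lam)   # valid only for t < lam

print(series_value, closed_form)
```

The two printed values should agree closely because [math]|t/\lambda| \lt 1[/math]; for [math]t \geq \lambda[/math] the series diverges, in line with the moment-generating function not existing there.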

Calculation

Because the moment-generating function is the expectation of a function of the random variable, it can be written as:

Case Calculation
General [math]M_X(t) = \int_{-\infty}^\infty e^{tx}\,dF(x)[/math], using the Riemann–Stieltjes integral, and where [math]F[/math] is the cumulative distribution function
Discrete probability mass function [math]M_X(t)=\sum_{i=1}^\infty e^{tx_i}\, p_i[/math]
Continuous probability density function [math] M_X(t) = \int_{-\infty}^\infty e^{tx} f(x)\,dx [/math]
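
The discrete case in the table can be evaluated directly. The following is a minimal sketch, assuming a binomial distribution with [math]n = 5[/math] and [math]p = 0.3[/math] (illustrative values) and comparing the sum [math]\sum_i e^{tx_i} p_i[/math] with the known closed form [math](1-p+pe^t)^n[/math]. The function name <code>mgf_discrete</code> is hypothetical.

```python
from math import comb, exp

def mgf_discrete(values, probs, t):
    """M_X(t) = sum_i exp(t * x_i) * p_i for a finitely supported distribution."""
    return sum(exp(t * x) * p for x, p in zip(values, probs))

# Illustrative assumption: X ~ Binomial(n, p) with n = 5, p = 0.3.
n, p = 5, 0.3
values = list(range(n + 1))
probs = [comb(n, k) * p**k * (1 - p)**(n - k) for k in values]

t = 0.7
direct = mgf_discrete(values, probs, t)
closed_form = (1 - p + p * exp(t))**n

print(direct, closed_form)
```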

Sum of independent random variables

If [math]S_n = \sum_{i=1}^{n} a_i X_i[/math], where the [math]X_i[/math] are independent random variables and the [math]a_i[/math] are constants, then the probability density function for [math]S_n[/math] is the convolution of the probability density functions of the scaled variables [math]a_i X_i[/math], and the moment-generating function for [math]S_n[/math] is given by

[[math]] M_{S_n}(t)=M_{X_1}(a_1t)M_{X_2}(a_2t)\cdots M_{X_n}(a_nt) \, . [[/math]]
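
The product formula can be checked by brute force for small discrete variables. The sketch below is only an illustration: the two distributions, the weights and the helper names (<code>mgf</code>, <code>weighted_sum_pmf</code>) are hypothetical. It enumerates every joint outcome to build the distribution of [math]S_n[/math] and compares its moment-generating function with the product of the individual ones.

```python
from itertools import product
from math import exp

def mgf(pmf, t):
    """MGF of a finitely supported variable given as {value: probability}."""
    return sum(p * exp(t * x) for x, p in pmf.items())

def weighted_sum_pmf(pmfs, weights):
    """Distribution of sum_i a_i * X_i for independent X_i, by enumerating outcomes."""
    result = {}
    for outcome in product(*(pmf.items() for pmf in pmfs)):
        s = sum(a * x for a, (x, _) in zip(weights, outcome))
        prob = 1.0
        for _, p in outcome:
            prob *= p   # independence: joint probability is the product
        result[s] = result.get(s, 0.0) + prob
    return result

# Hypothetical example: a biased coin and a three-point variable.
X1 = {0: 0.4, 1: 0.6}
X2 = {-1: 0.2, 0: 0.5, 2: 0.3}
a = [2, -3]

t = 0.25
lhs = mgf(weighted_sum_pmf([X1, X2], a), t)   # MGF of S = 2*X1 - 3*X2
rhs = mgf(X1, a[0] * t) * mgf(X2, a[1] * t)   # product of rescaled MGFs
print(lhs, rhs)
```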

Calculations of moments

The moment-generating function is so called because if it exists on an open interval around [math]t=0[/math] then it is the exponential generating function of the moments of the probability distribution:

[[math]]m_n = E \left( X^n \right) = M_X^{(n)}(0) = \frac{d^n M_X}{dt^n}(0).[[/math]]

Here [math]n[/math] must be a nonnegative integer.
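
As a rough numerical sketch of this relation, the derivatives at [math]t=0[/math] can be approximated by central finite differences. The example assumes a Poisson distribution with [math]\lambda = 3[/math] (an illustrative choice), whose first two moments are [math]\lambda[/math] and [math]\lambda + \lambda^2[/math]. The step size <code>h</code> and the function names are assumptions of this sketch.

```python
from math import exp

def poisson_mgf(t, lam):
    """Closed-form MGF of a Poisson(lam) variable."""
    return exp(lam * (exp(t) - 1.0))

def first_two_moments(mgf, h=1e-4):
    """Approximate M'(0) and M''(0) with central finite differences."""
    m1 = (mgf(h) - mgf(-h)) / (2 * h)
    m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    return m1, m2

lam = 3.0  # illustrative parameter
m1, m2 = first_two_moments(lambda t: poisson_mgf(t, lam))
variance = m2 - m1**2

print(m1, m2, variance)  # close to lam, lam + lam**2, lam
```

Finite differences are used here only to keep the example free of dependencies; a symbolic differentiation tool would give the moments exactly.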

Relation to other functions

Related to the moment-generating function are a number of other transforms that are common in probability theory:

Function Description
Characteristic function The characteristic function [math]\varphi_X(t)[/math] is related to the moment-generating function via [math]\varphi_X(t) = M_{iX}(t) = M_X(it)[/math]: the characteristic function is the moment-generating function of [math]iX[/math], or the moment-generating function of [math]X[/math] evaluated on the imaginary axis. This function can also be viewed as the Fourier transform of the probability density function, which can therefore be deduced from it by inverse Fourier transform.
Cumulant-generating function The cumulant-generating function is defined as the logarithm of the moment-generating function; some instead define the cumulant-generating function as the logarithm of the characteristic function, while others call this latter the second cumulant-generating function.
Probability generating functions The probability-generating function is defined as [math]G(z) = E[z^X].\,[/math] This immediately implies that [math]G(e^t) = E[e^{tX}] = M_X(t).\,[/math]
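
The relations in the table can be illustrated numerically. The sketch below assumes a normal distribution for the characteristic-function identity and a Poisson distribution for the probability-generating-function identity, using their standard closed forms. The parameter values and function names are illustrative.

```python
import cmath
from math import exp, log

mu, sigma = 1.5, 0.8   # illustrative normal parameters
lam = 2.0              # illustrative Poisson rate

def normal_mgf(t):
    """MGF of N(mu, sigma^2); also accepts a complex argument."""
    return cmath.exp(t * mu + 0.5 * sigma**2 * t**2)

def normal_cf(t):
    """Characteristic function of N(mu, sigma^2)."""
    return cmath.exp(1j * t * mu - 0.5 * sigma**2 * t**2)

def poisson_pgf(z):
    """PGF of Poisson(lam)."""
    return exp(lam * (z - 1.0))

def poisson_mgf(t):
    """MGF of Poisson(lam)."""
    return exp(lam * (exp(t) - 1.0))

t = 0.6
cf_gap = abs(normal_mgf(1j * t) - normal_cf(t))       # phi_X(t) = M_X(it)
pgf_gap = abs(poisson_pgf(exp(t)) - poisson_mgf(t))   # G(e^t) = M_X(t)
cgf = log(poisson_mgf(t))                             # K(t) = log M_X(t)

print(cf_gap, pgf_gap, cgf)
```

Both gaps should be zero up to rounding, and for the Poisson case the cumulant-generating function reduces to [math]\lambda(e^t - 1)[/math].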

Examples

Here are some examples of the moment generating function and the characteristic function for comparison.

Distribution Moment-generating function [math]M_X(t)[/math] Characteristic function [math]\varphi(t)[/math]
Bernoulli [math]\, \operatorname{P}(X=1)=p[/math]   [math]\, 1-p+pe^t[/math]   [math]\, 1-p+pe^{it}[/math]
Geometric [math](1 - p)^{k-1}\,p\![/math]   [math]\frac{p e^t}{1-(1-p) e^t}\![/math], [math]\forall t\lt-\ln(1-p)\![/math]   [math]\frac{p e^{it}}{1-(1-p)\,e^{it}}\![/math]
Binomial [math]B(n, p)[/math]   [math]\, (1-p+pe^t)^n[/math]   [math]\, (1-p+pe^{it})^n[/math]
Poisson Pois([math]λ[/math])   [math]\, e^{\lambda(e^t-1)}[/math]   [math]\, e^{\lambda(e^{it}-1)}[/math]
Uniform (continuous) [math]U(a, b)[/math]   [math]\, \frac{e^{tb} - e^{ta}}{t(b-a)}[/math]   [math]\, \frac{e^{itb} - e^{ita}}{it(b-a)}[/math]
Uniform (discrete) [math]U(a, b)[/math]   [math]\, \frac{e^{at} - e^{(b+1)t}}{(b-a+1)(1-e^{t})}[/math]   [math]\, \frac{e^{ait} - e^{(b+1)it}}{(b-a+1)(1-e^{it})}[/math]
Normal [math]N(\mu, \sigma^2)[/math]   [math]\, e^{t\mu + \frac{1}{2}\sigma^2t^2}[/math]   [math]\, e^{it\mu - \frac{1}{2}\sigma^2t^2}[/math]
Chi-squared [math]\chi^2_k[/math]   [math]\, (1 - 2t)^{-k/2}[/math]   [math]\, (1 - 2it)^{-k/2}[/math]
Gamma [math]\Gamma(k, \theta)[/math]   [math]\, (1 - t\theta)^{-k}[/math]   [math]\, (1 - it\theta)^{-k}[/math]
Exponential Exp([math]λ[/math])   [math]\, (1-t\lambda^{-1})^{-1}, \, (t\lt\lambda) [/math]   [math]\, (1 - it\lambda^{-1})^{-1}[/math]
Multivariate normal [math]N(\mu, \Sigma)[/math]   [math]\, e^{t^\mathrm{T} \mu + \frac{1}{2} t^\mathrm{T} \Sigma t}[/math]   [math]\, e^{i t^\mathrm{T} \mu - \frac{1}{2} t^\mathrm{T} \Sigma t}[/math]
Degenerate [math]\delta_a[/math]   [math]\, e^{ta}[/math]   [math]\, e^{ita}[/math]
Laplace [math]L(μ, b)[/math]   [math]\, \frac{e^{t\mu}}{1 - b^2t^2}[/math]   [math]\, \frac{e^{it\mu}}{1 + b^2t^2}[/math]
Negative Binomial [math]NB(r, p)[/math]   [math]\, \frac{(1-p)^r}{(1-pe^t)^r}[/math]   [math]\, \frac{(1-p)^r}{(1-pe^{it})^r}[/math]
Cauchy Cauchy([math]μ, θ[/math]) does not exist   [math]\, e^{it\mu -\theta|t|}[/math]
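
Entries in the table can be spot-checked by numerical integration. The sketch below does this for the exponential distribution with a plain midpoint rule. The rate, the evaluation point, the truncation of the integral at 40 and the helper name <code>mgf_by_quadrature</code> are all assumptions made for the illustration.

```python
from math import exp

def mgf_by_quadrature(pdf, t, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(t*x) * pdf(x) on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += exp(t * x) * pdf(x)
    return total * h

lam = 2.0   # illustrative rate
t = 0.5     # must satisfy t < lam for the MGF to exist

numeric = mgf_by_quadrature(lambda x: lam * exp(-lam * x), t, 0.0, 40.0)
table_entry = 1.0 / (1.0 - t / lam)

print(numeric, table_entry)
```

The same check fails to settle for the Cauchy distribution at any [math]t \neq 0[/math], since the integrand grows without bound as the upper limit increases.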

The Probability Generating Function

Definition

If [math]X[/math] is a discrete random variable taking values in the non-negative integers {0,1, ...}, then the probability generating function of [math]X[/math] is defined as [2]

[[math]]G(z) = \operatorname{E} (z^X) = \sum_{x=0}^{\infty}p(x)z^x,[[/math]]

where [math]p[/math] is the probability mass function of [math]X[/math]. Note that the subscripted notations [math]G_X[/math] and [math]p_X[/math] are often used to emphasize that these pertain to a particular random variable [math]X[/math], and to its distribution. The power series converges absolutely at least for all complex numbers [math]z[/math] with [math]|z| \leq 1[/math]; in many examples the radius of convergence is larger.

Properties

Power series

Probability generating functions obey all the rules of power series with non-negative coefficients. In particular, [math]G(1^{-})=1[/math], where

[[math]] G(1^{-}) = \lim_{z \uparrow 1} G(z), [[/math]]

since the probabilities must sum to one. So the radius of convergence of any probability generating function must be at least 1, by Abel's theorem for power series with non-negative coefficients.

Probabilities and expectations

The following properties allow the derivation of various basic quantities related to [math]X[/math]:

1. The probability mass function of [math]X[/math] is recovered by taking derivatives of [math]G[/math]:

[[math]] p(k) = \operatorname{P}(X = k) = \frac{G^{(k)}(0)}{k!}.[[/math]]

2. It follows from Property 1 that if [math]X[/math] and [math]Y[/math] have identical probability generating functions, then they have identical distributions.

3. The normalization of the probability mass function can be expressed in terms of the generating function by

[[math]]\operatorname{E}(1)=G(1^-)=\sum_{i=0}^\infty p(i)=1.[[/math]]

The expectation of [math]X[/math] is given by

[[math]] \operatorname{E}\left(X\right) = G'(1^-).[[/math]]

More generally, the [math]k[/math]th factorial moment, [math]\textrm{E}(X(X - 1) \cdots (X - k + 1))[/math] of [math]X[/math] is given by

[[math]]\textrm{E}\left(\frac{X!}{(X-k)!}\right) = G^{(k)}(1^-), \quad k \geq 0.[[/math]]

So the variance of [math]X[/math] is given by

[[math]]\operatorname{Var}(X)=G''(1^-) + G'(1^-) - \left [G'(1^-)\right ]^2.[[/math]]

4. [math]G_X(e^{t}) = M_X(t)[/math], where [math]G_X[/math] is the probability generating function and [math]M_X[/math] is the moment-generating function of the random variable [math]X[/math].
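
For a variable with finite support, the probability generating function is a polynomial, so the properties above can be checked with exact arithmetic. The sketch below assumes [math]X[/math] is the score of a fair six-sided die (an illustrative choice) and stores [math]G[/math] as a list of coefficients. The helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Coefficients of G'(z), where coeffs[k] is the coefficient of z^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, z):
    """Evaluate the polynomial sum_k coeffs[k] * z^k."""
    return sum(c * z**k for k, c in enumerate(coeffs))

def pmf_from_pgf(coeffs, k):
    """Property 1: p(k) = G^(k)(0) / k!."""
    d = coeffs
    for _ in range(k):
        d = derivative(d)
    return evaluate(d, 0) / factorial(k)

# Illustrative assumption: fair die, G(z) = (z + z^2 + ... + z^6) / 6.
G = [Fraction(0)] + [Fraction(1, 6)] * 6

G1 = derivative(G)
G2 = derivative(G1)

total = evaluate(G, 1)                        # normalization: G(1) = 1
mean = evaluate(G1, 1)                        # E(X) = G'(1)
variance = evaluate(G2, 1) + mean - mean**2   # Var(X) = G''(1) + G'(1) - G'(1)^2

print(pmf_from_pgf(G, 3), total, mean, variance)
```

Because the series is a finite polynomial here, the left limits [math]G(1^-)[/math], [math]G'(1^-)[/math] and [math]G''(1^-)[/math] are simply the values at [math]z = 1[/math].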

Functions of independent random variables

Probability generating functions are particularly useful for dealing with functions of independent random variables. For example:

  • If [math]X_1, X_2, \ldots, X_n[/math] is a sequence of independent (and not necessarily identically distributed) random variables, and

[[math]]S_n = \sum_{i=1}^n a_i X_i,[[/math]]

where the [math]a_i[/math] are constants, then the probability generating function of [math]S_n[/math] is given by

[[math]] G_{S_n}(z) = \operatorname{E}(z^{S_n}) = \operatorname{E}(z^{\sum_{i=1}^n a_i X_i}) = G_{X_1}(z^{a_1})G_{X_2}(z^{a_2})\cdots G_{X_n}(z^{a_n}). [[/math]]

For example, if [math]S_n = \sum_{i=1}^n X_i,[/math] then the probability generating function, [math]G_{S_n}(z)[/math], is given by

[[math]]G_{S_n}(z) = G_{X_1}(z)G_{X_2}(z)\cdots G_{X_n}(z).[[/math]]

It also follows that the probability generating function of the difference of two independent random variables [math]S = X_1 - X_2[/math] is

[[math]]G_S(z) = G_{X_1}(z)G_{X_2}(1/z).[[/math]]

  • Suppose that [math]N[/math] is also an independent, discrete random variable taking values on the non-negative integers, with probability generating function [math]G_N[/math], and let [math]S_N = \sum_{i=1}^N X_i[/math]. If the [math]X_1,X_2, \ldots, X_N[/math] are independent and identically distributed with common probability generating function [math]G_X[/math], then [math]G_{S_N}(z) = G_N(G_X(z)).[/math] This can be seen, using the law of total expectation, as follows:

[[math]] \begin{align*} G_{S_N}(z) = \operatorname{E}(z^{S_N})&= \operatorname{E}(z^{\sum_{i=1}^N X_i}) \\ &= \operatorname{E}\big(\operatorname{E}(z^{\sum_{i=1}^N X_i}| N) \big) \\ &= \operatorname{E}\big( (G_X(z))^N\big) =G_N(G_X(z)). \end{align*} [[/math]]

This last fact is useful in the study of Galton–Watson processes.

  • Suppose again that [math]N[/math] is an independent, discrete random variable taking values on the non-negative integers, with probability generating function [math]G_N[/math] and probability mass function [math]f_i = \operatorname{P}(N = i)[/math]. If the [math]X_1, \ldots, X_N [/math] are independent, but not identically distributed random variables, where [math]G_{X_i}[/math] denotes the probability generating function of [math]X_i[/math], then

[[math]]G_{S_N}(z) = \sum_{i \ge 0} f_i \prod_{k=1}^i G_{X_k}(z),[[/math]]

where the empty product for [math]i = 0[/math] equals 1.

For identically distributed [math]X_i[/math] this simplifies to the identity stated before. The general case is sometimes useful to obtain a decomposition of [math]S_N[/math] by means of generating functions.
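
For finitely supported variables, the product rule for sums and the composition identity [math]G_{S_N}(z) = G_N(G_X(z))[/math] reduce to polynomial arithmetic on coefficient lists. The following is a minimal sketch under that assumption. The distributions chosen (a Bernoulli variable with [math]p = 1/3[/math] and [math]N[/math] uniform on {0, 1, 2}) and the helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two PGFs given as coefficient lists (index k holds P(X = k))."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_add(a, b):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def compose(G_N, G_X):
    """Coefficients of G_N(G_X(z)) = sum_i P(N = i) * G_X(z)^i."""
    result = [Fraction(0)]
    power = [Fraction(1)]            # G_X(z)^0
    for f_i in G_N:
        result = poly_add(result, [f_i * c for c in power])
        power = poly_mul(power, G_X)
    return result

# Hypothetical inputs: X is Bernoulli(1/3); N is uniform on {0, 1, 2}.
G_X = [Fraction(2, 3), Fraction(1, 3)]
G_N = [Fraction(1, 3)] * 3

G_sum = poly_mul(G_X, G_X)        # PGF of X_1 + X_2 (fixed number of terms)
G_random_sum = compose(G_N, G_X)  # PGF of S_N = X_1 + ... + X_N

print(G_sum)
print(G_random_sum)
```

Multiplying coefficient lists is the same operation as convolving the probability mass functions, which is why the product rule holds for sums of independent variables.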

Examples

The table below gives the probability generating function for some well known discrete distributions.

Distribution PGF
Degenerate [math]\delta_a[/math] [math]G(z) = \left(z^a\right)[/math]
Bernoulli [math]\, \operatorname{P}(X=1)=p[/math] [math]G(z) = (1-p) + pz [/math]
Binomial [math]\operatorname{B}(n, p)[/math] [math]G(z) = \left[(1-p) + pz\right]^n [/math]
Negative Binomial [math]\operatorname{NB}(r,p)[/math] [math]G(z) = \left(\frac{pz}{1 - (1-p)z}\right)^r[/math]
Poisson [math]\textrm{Pois}(\lambda) [/math] [math]G(z) = \textrm{e}^{\lambda(z - 1)}[/math]
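
A table entry can be compared against the defining series [math]\sum_x p(x) z^x[/math]. The sketch below does so for the Poisson distribution, truncating the series at 100 terms. The rate, the evaluation point, the truncation length and the function names are illustrative assumptions.

```python
from math import exp, factorial

def pgf_from_pmf(pmf, z, terms=100):
    """Truncated series G(z) = sum_x p(x) * z^x."""
    return sum(pmf(x) * z**x for x in range(terms))

lam = 2.5   # illustrative rate
z = 0.4     # illustrative evaluation point

def poisson_pmf(x):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

series = pgf_from_pmf(poisson_pmf, z)
table_entry = exp(lam * (z - 1.0))

print(series, table_entry)
```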

Notes

  1. Bulmer, M.G., Principles of Statistics, Dover, 1979, pp. 75–79
  2. http://www.am.qub.ac.uk/users/g.gribakin/sor/Chap3.pdf

References

  • Casella, George; Berger, Roger. Statistical Inference (2nd ed.). pp. 59–68. ISBN 978-0-534-24312-8.
  • Grimmett, Geoffrey; Welsh, Dominic. Probability - An Introduction (1st ed.). pp. 101 ff. ISBN 978-0-19-853264-4.
  • Wikipedia contributors. "Moment-generating function". Wikipedia. Wikipedia. Retrieved 28 January 2022.
  • Wikipedia contributors. "Probability-generating function". Wikipedia. Wikipedia. Retrieved 28 January 2022.