If <math>X</math> is a random variable with cumulative distribution function <math>F_X</math>, we may produce other random variables by applying a ''transformation'' of <math>X</math> of the form <math>g(X)</math> for suitable functions <math>g</math>. Such transformations arise frequently in probability and statistics because they often yield new random variables with desirable properties, and those properties in turn yield results about the original random variables. In this page, we are mainly concerned with computing the probability distribution of the transformed random variable in terms of the probability distribution of the original one.
==Linear Transformations ==
We first consider the simplest possible transformation: the linear transformation. If <math>a</math> and <math>b</math> are real numbers, then we may consider the random variable
<math display="block">
\begin{equation}
Y = T(X) = aX + b.
\end{equation}
</math>
If <math>a</math> is zero then there is nothing to discuss, since the transformation is just the constant <math>b</math>; we may therefore assume that <math>a</math> is non-zero.
=== a > 0  ===
If <math>a</math> is positive then <math>T</math> is a strictly increasing function and we have:
<math display="block">
\begin{align}
F_{Y}(y) = \operatorname{P}(aX + b \leq y ) = \operatorname{P}(X \leq a^{-1}(y - b)) &= F_{X}[a^{-1}(y - b)]
\end{align}
</math>
===  a < 0  and <math>X</math> continuous ===
If <math>a</math> is negative and <math>X</math> is a continuous random variable, then <math>T</math> is a strictly decreasing function and we have:
<math display="block">
\begin{align}
F_{Y}(y) = \operatorname{P}(aX + b \leq y ) = \operatorname{P}(X \geq a^{-1}(y - b)) &= 1 - F_{X}[a^{-1}(y - b)] .
\end{align}
</math>
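Both formulas are easy to check numerically. The following is a minimal Monte Carlo sketch, assuming <math>X</math> is standard normal; the coefficients, sample size and test point are arbitrary illustrative choices.
<syntaxhighlight lang="python">
# Monte Carlo check of the linear-transformation CDF formulas,
# for both a > 0 and a < 0, assuming X is standard normal.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)

for a, b in [(2.0, 1.0), (-2.0, 1.0)]:
    y = a * x + b
    t = 0.5                                    # arbitrary test point
    empirical = np.mean(y <= t)                # P[aX + b <= t] from samples
    z = (t - b) / a                            # a^{-1}(t - b)
    theoretical = norm.cdf(z) if a > 0 else 1 - norm.cdf(z)
    print(a, b, empirical, theoretical)        # columns agree up to Monte Carlo error
</syntaxhighlight>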
==Monotone Transformations==
===Strictly Increasing ===
Suppose that the transformation, denoted by <math>T</math>, is strictly increasing:
<math display="block">
x_1 \lt x_2 \implies T(x_1) \lt T(x_2).
</math>
We denote by <math>T^{-1}</math> the unique transformation with the property
<math display="block">
T^{-1}(T(x)) = x \Longleftrightarrow I = T^{-1}\circ T
</math>
with <math>I</math> the [[wikipedia:identity function|identity function]]. Following the approach for linear transformations, we have
<math display="block">
\begin{align}
\operatorname{P}[T(X) \leq y ] = \operatorname{P}[X \leq T^{-1}(y)] = F_{X}[T^{-1}(y)].
\end{align}
</math>
Thus we have the following simple relation:
<math display="block">
\begin{equation}
\label{transform-rel-up}
X \mapsto T(X)=Y  \implies F_Y = F_{X} \circ T^{-1}.
\end{equation}
</math>
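As a quick sanity check of \ref{transform-rel-up}, here is a sketch for the strictly increasing transformation <math>T(x) = x^3</math>, with <math>X</math> standard normal; both choices are illustrative, not taken from the text above.
<syntaxhighlight lang="python">
# Check F_Y = F_X ∘ T^{-1} for T(x) = x^3, so T^{-1}(y) = cbrt(y).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
y = rng.standard_normal(1_000_000) ** 3        # samples of Y = T(X)

for t in [-2.0, 0.5, 3.0]:
    # empirical CDF vs F_X(T^{-1}(t))
    print(t, np.mean(y <= t), norm.cdf(np.cbrt(t)))
</syntaxhighlight>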
===Strictly Decreasing and <math>X</math> Continuous ===
If the transformation <math>T</math> is strictly decreasing and <math>F_X</math> is ''continuous'', then
<math display="block">
\operatorname{P}[T(X)\leq y ] =  \operatorname{P}[X \geq T^{-1}(y)] = 1 - F_{X}[T^{-1}(y)]
</math>
and thus
<math display="block">
\begin{equation}
\label{transform-rel-down}
X \mapsto T(X)=Y  \implies F_Y = 1 - F_{X} \circ T^{-1}.
\end{equation}
</math>
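The decreasing case \ref{transform-rel-down} can be checked the same way; the sketch below uses the illustrative choice <math>T(x) = e^{-x}</math>, again with <math>X</math> standard normal.
<syntaxhighlight lang="python">
# Check F_Y = 1 - F_X ∘ T^{-1} for T(x) = exp(-x), so T^{-1}(y) = -ln(y).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
y = np.exp(-rng.standard_normal(1_000_000))    # Y = T(X), supported on (0, inf)

for t in [0.2, 1.0, 5.0]:
    print(t, np.mean(y <= t), 1 - norm.cdf(-np.log(t)))
</syntaxhighlight>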
===Probability Density Functions ===
If the cumulative distribution function <math>F_X</math> has a density, say <math>f_X</math>, and <math>T</math> is differentiable with non-vanishing derivative, then differentiating \ref{transform-rel-up} and \ref{transform-rel-down} gives the following relation:
<math display="block">
\begin{equation}
\label{monotone-density-relation}
X \mapsto T(X)=Y  \implies f_Y = \frac{ f_{X} \circ T^{-1}}{\left |T^{\prime} \circ T^{-1} \right |}.
\end{equation}
</math>
{{alert-warning|To be precise, relation \ref{monotone-density-relation} is true when the following ''integrability'' condition holds over the range of <math>T</math>: <math> \int_{\operatorname{range}(T)} f_{Y}(y) \, dy < \infty.</math>}}
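The density relation can likewise be verified numerically. The sketch below again uses the illustrative choice <math>T(x) = x^3</math> with <math>X</math> standard normal, comparing \ref{monotone-density-relation} against a histogram of samples; the bin range avoids the singularity of <math>f_Y</math> at zero.
<syntaxhighlight lang="python">
# Compare f_Y = (f_X ∘ T^{-1}) / |T' ∘ T^{-1}| with a histogram, for
# T(x) = x^3 (so T'(x) = 3x^2) and X standard normal.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
samples = rng.standard_normal(1_000_000) ** 3

def f_Y(y):
    t_inv = np.cbrt(y)                          # T^{-1}(y)
    return norm.pdf(t_inv) / np.abs(3 * t_inv ** 2)

counts, edges = np.histogram(samples, bins=200, range=(0.1, 3.0))
width = edges[1] - edges[0]
density_est = counts / (len(samples) * width)   # unconditional density estimate
mids = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(density_est - f_Y(mids))))  # small, up to sampling noise
</syntaxhighlight>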
===Example: Exponentiation ===
Consider the transformation <math>T(x) = \exp(x)</math>. By \ref{transform-rel-up}, we have <math>F_Y(y) = F_{X}(\ln(y))</math>. If <math>F_X </math> has a density <math>f_X</math> then, by \ref{monotone-density-relation},
<math display = "block"> f_{Y}(y) = \frac{ f_{X}(\ln(y))}{y}</math>
provided that
<math display = "block"> \int_{0}^{\infty} \frac{ f_{X}(\ln(y))}{y} \, dy < \infty.</math>
For instance, let <math>X</math> be a random variable with a standard normal distribution:
<math display="block">
f_{X}(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.
</math>
The random variable <math>\exp(X)</math> is said to have a lognormal distribution. It is fairly easy to show that
<math display="block">
\begin{align*}
\int_{0}^{\infty} \frac{\exp{[-\ln(y)^2/2]}}{y} \, dy \lt \infty
\end{align*}
</math>
(the substitution <math>u = \ln(y)</math> reduces it to the Gaussian integral) and thus the density for the lognormal distribution is given by
<math display="block">
f_{Y}(y) = \frac{1}{\sqrt{2\pi}} \frac{\exp{[-\ln(y)^2/2]}}{y}.
</math>
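This result can be checked against a library implementation; below is a sketch, assuming SciPy's <code>lognorm</code> with shape parameter <math>s=1</math>, which corresponds to the exponential of a standard normal.
<syntaxhighlight lang="python">
# Compare the derived lognormal density with scipy.stats.lognorm (s = 1).
import numpy as np
from scipy.stats import lognorm

def f_Y(y):
    return np.exp(-np.log(y) ** 2 / 2) / (y * np.sqrt(2 * np.pi))

y = np.linspace(0.05, 5.0, 100)
print(np.max(np.abs(f_Y(y) - lognorm.pdf(y, s=1))))   # ~0 up to rounding
</syntaxhighlight>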
==General Case ==
For a general transformation <math>T</math> where <math>Y = T(X)</math>, there is no simple and explicit relation between <math>F_X</math> and <math>F_Y</math>. That being said, there are situations when we can use conditioning as well as the relations we've already derived to compute the distribution of <math>Y</math>. More precisely, given a partition (splitting up) of the real line
<math display="block">
-\infty \leq a_0 \lt a_1 \lt \cdots \lt a_n \leq \infty \quad ( 0 = F_X(a_0) \lt F_X(a_1) \lt \cdots \lt F_X(a_n) = 1)
</math>
we let <math>X_i</math> denote a random variable whose distribution equals the conditional distribution of <math>X</math> given that <math>X</math> lies in the interval <math>(a_{i-1},a_i]</math>, and let <math>Y_i = T(X_i)</math>. Then we have
<math display="block">
\begin{align}
F_{Y}(y) &= \sum_{i=1}^n F_{Y_i}(y) \operatorname{P}[a_{i-1} \lt X \leq a_{i}]  \\
\label{gen-case-cdf-transform} &= \sum_{i=1}^n F_{Y_i}(y) [F_X(a_{i}) - F_X(a_{i-1})].
\end{align}
</math>
If the partition is chosen in such a way that <math>T</math>, when applied to any of the <math>X_i</math>, satisfies a property that we have encountered in previous sections (linear or monotone), then we can use \ref{gen-case-cdf-transform} to derive a relatively simple expression for the distribution function of <math>Y</math>.
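Before turning to a worked example, here is a Monte Carlo sketch of the identity \ref{gen-case-cdf-transform} itself, using the squaring transformation of the next subsection with <math>X</math> standard normal and the two-interval partition at zero (all illustrative choices).
<syntaxhighlight lang="python">
# Check F_Y(t) = sum_i F_{Y_i}(t) P[a_{i-1} < X <= a_i] for T(x) = x^2,
# partitioning the real line at zero, with X standard normal.
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(1_000_000)
x1, x2 = x[x <= 0], x[x > 0]                   # samples from the conditional laws

t = 1.5                                        # arbitrary test point
p = np.mean(x <= 0)                            # P[X <= 0], roughly 1/2 here
lhs = np.mean(x ** 2 <= t)                     # F_Y(t) computed directly
rhs = np.mean(x1 ** 2 <= t) * p + np.mean(x2 ** 2 <= t) * (1 - p)
print(lhs, rhs)                                # agree up to Monte Carlo error
</syntaxhighlight>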
=== Example: Squaring ===
Let <math>T(x) = x^2 </math> and suppose that <math>0 \lt F_X(0) \lt 1</math>, so that both pieces of the partition below carry positive probability. Then we set
<math display="block">
a_0 = -\infty, a_1 = 0, a_2 = \infty.
</math>
Setting <math>p = F_X(0)</math> and recalling \ref{gen-case-cdf-transform}, we obtain
<math display="block">
\begin{equation}
\label{square-1}
F_Y(y) = F_{Y_1}(y)\cdot p + F_{Y_2}(y) \cdot (1-p).
\end{equation}
</math>
By \ref{transform-rel-up} and \ref{transform-rel-down}, we know that
<math display="block">
\begin{equation}
\label{square-2}
F_{Y_1}(y) =1 -  F_{X_1}\left(-\sqrt{y}\right) = \operatorname{P}[X \geq -\sqrt{y} | X \leq 0]
\end{equation}
</math>
and
<math display="block">
\begin{equation}
\label{square-3}
   
   
F_{Y_2}(y) = F_{X_2}\left(\sqrt{y}\right) = \operatorname{P}[X \leq \sqrt{y} | X \gt 0].
\end{equation}
</math>
Combining \ref{square-1}, \ref{square-2} and \ref{square-3}, we finally obtain the relation
<math display="block">
\begin{equation}
\label{square-final}
X \mapsto Y=X^2  \implies F_{Y}(y) = F_{X}\left(\sqrt{y}\right) - F_{X}\left(-\sqrt{y}\right).
\end{equation}
</math>
{{alert-info| The purpose of this simple example was to demonstrate the use of the conditioning method. A simpler and more direct approach would have also worked:
<math display="block">
F_Y(y) = \operatorname{P}[X^2 \leq y] =\operatorname{P}[\left|X\right| \leq \sqrt{y}] = F_{X}\left(\sqrt{y}\right) - F_{X}\left(-\sqrt{y}\right).
</math>
}}
For <math>y \gt 0</math>, the derivative of <math>F_Y</math> equals
<math display="block">
\begin{equation}
\frac{1}{2\sqrt{y}}[f_{X}(\sqrt{y}) + f_{X}(-\sqrt{y})].
\end{equation}
</math>
Therefore we obtain the following relation:
<math display="block">
\begin{equation}
\label{square-final-density}
X \mapsto Y=X^2  \implies f_{Y}(y) = \frac{1}{2\sqrt{y}}[f_{X}(\sqrt{y}) + f_{X}(-\sqrt{y})]
\end{equation}
</math>
provided that
<math display="block">
\begin{equation}
\label{square-density-condition}
\int_0^{\infty}\frac{1}{2\sqrt{y}}[f_{X}(\sqrt{y}) + f_{X}(-\sqrt{y})] \, dy < \infty.
\end{equation}
</math>
To demonstrate this technique, consider squaring a random variable <math>X</math> with a standard normal distribution. The integrability condition \ref{square-density-condition} is equivalent to
<math display="block">
\int_0^{\infty}\frac{1}{\sqrt{y}}e^{-y/2} \, dy < \infty
</math>
which indeed holds: the integrand is integrable near zero, where it grows like <math>y^{-1/2}</math>, and decays exponentially at infinity. The distribution of <math>X^2</math> is a [[wikipedia:chi-square distribution|chi-square distribution]] with 1 degree of freedom, and its density equals
<math display="block">
\frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{y}} e^{-y/2}.
</math>
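As a final check, this density can be compared with SciPy's chi-square implementation; a sketch, assuming <code>scipy.stats.chi2</code> with one degree of freedom:
<syntaxhighlight lang="python">
# Compare the derived density with scipy.stats.chi2(df=1).
import numpy as np
from scipy.stats import chi2

def f_Y(y):
    return np.exp(-y / 2) / np.sqrt(2 * np.pi * y)

y = np.linspace(0.05, 10.0, 100)
print(np.max(np.abs(f_Y(y) - chi2.pdf(y, df=1))))     # ~0 up to rounding
</syntaxhighlight>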
