Stationary process

A stationary process (or a strict/strictly stationary process or strong/strongly stationary process) is a stochastic process whose unconditional joint probability distribution does not change when shifted in time.[1] Consequently, parameters such as mean and variance also do not change over time. To get an intuition of stationarity, one can imagine a frictionless pendulum. It swings back and forth in an oscillatory motion, yet the amplitude and frequency remain constant. Although the pendulum is moving, the process is stationary as its "statistics" are constant (frequency and amplitude). However, if a force were to be applied to the pendulum (for example, friction with the air), either the frequency or amplitude would change, thus making the process non-stationary.[2]

Since stationarity is an assumption underlying many statistical procedures used in time series analysis, non-stationary data are often transformed to become stationary. The most common cause of violation of stationarity is a trend in the mean, which can be due either to the presence of a unit root or of a deterministic trend. In the former case of a unit root, stochastic shocks have permanent effects, and the process is not mean-reverting. In the latter case of a deterministic trend, the process is called a trend-stationary process, and stochastic shocks have only transitory effects after which the variable tends toward a deterministically evolving (non-constant) mean.

A trend stationary process is not strictly stationary, but can easily be transformed into a stationary process by removing the underlying trend, which is solely a function of time. Similarly, processes with one or more unit roots can be made stationary through differencing. An important type of non-stationary process that does not include a trend-like behavior is a cyclostationary process, which is a stochastic process that varies cyclically with time.

Strict-sense stationarity

Definition

Formally, let [math]\left\{X_t\right\}[/math] be a stochastic process and let [math]F_{X}(x_{t_1 + \tau}, \ldots, x_{t_n + \tau})[/math] represent the cumulative distribution function of the unconditional (i.e., with no reference to any particular starting value) joint distribution of [math]\left\{X_t\right\}[/math] at times [math]t_1 + \tau, \ldots, t_n + \tau[/math]. Then, [math]\left\{X_t\right\}[/math] is said to be strictly stationary, strongly stationary or strict-sense stationary if[3]:p. 155

[[math]]\begin{equation}\label{sss} F_{X}(x_{t_1+\tau} ,\ldots, x_{t_n+\tau}) = F_{X}(x_{t_1},\ldots, x_{t_n}) \quad \text{for all } \tau,t_1, \ldots, t_n \in \mathbb{R} \text{ and for all } n \in \mathbb{N}\end{equation}[[/math]]

Since [math]\tau[/math] does not affect [math]F_X(\cdot)[/math], [math] F_{X}[/math] is not a function of time.

Examples

Figure: Two simulated time series processes, one stationary and the other non-stationary. The augmented Dickey–Fuller (ADF) test statistic is reported for each process; non-stationarity cannot be rejected for the second process at a 5% significance level.
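The ADF comparison described in the figure can be reproduced in spirit with a short simulation. The sketch below is only an illustration and assumes NumPy and the statsmodels library are available; the AR(1) coefficient, series length, and random seed are arbitrary choices, not values taken from the figure.

import numpy as np
from statsmodels.tsa.stattools import adfuller  # assumed available; returns (statistic, p-value, ...)

rng = np.random.default_rng(0)
n = 500
eps = rng.standard_normal(n)

# Stationary AR(1) series: X_t = 0.5 * X_{t-1} + eps_t
stationary = np.empty(n)
stationary[0] = eps[0]
for t in range(1, n):
    stationary[t] = 0.5 * stationary[t - 1] + eps[t]

# Random walk (unit-root, non-stationary) series: X_t = X_{t-1} + eps_t
random_walk = np.cumsum(eps)

for name, series in [("stationary AR(1)", stationary), ("random walk", random_walk)]:
    statistic, pvalue = adfuller(series)[:2]
    print(f"{name}: ADF statistic = {statistic:.2f}, p-value = {pvalue:.3f}")

For the random walk the p-value is typically large, so the unit-root (non-stationarity) null hypothesis cannot be rejected, matching the behaviour described in the caption.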

Example 1 (Constant Process)

Let [math]Y[/math] be any scalar random variable, and define a time series [math]\left\{X_t\right\}[/math] by [math]X_t=Y[/math] for all [math]t[/math]. Then [math]\left\{X_t\right\}[/math] is a stationary time series, for which realisations consist of a series of constant values, with a different constant value for each realisation. A law of large numbers does not apply in this case, as the limiting value of an average from a single realisation takes the random value determined by [math]Y[/math], rather than taking the expected value of [math]Y[/math].
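A minimal simulation sketch of this example, assuming NumPy is available (the standard normal choice for [math]Y[/math] is an arbitrary assumption used only for illustration):

import numpy as np

rng = np.random.default_rng(1)

def constant_realisation(n):
    """One realisation of X_t = Y for all t; Y is drawn once per realisation."""
    y = rng.standard_normal()          # here E[Y] = 0
    return np.full(n, y)

x = constant_realisation(1000)
print("time average of one realisation:", x.mean())          # equals the drawn value of Y, not E[Y]
print("average over many realisations:",
      np.mean([constant_realisation(1000).mean() for _ in range(2000)]))  # close to E[Y] = 0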

Example 2

As a further example of a stationary process for which any single realisation has an apparently noise-free structure, let [math]Y[/math] have a uniform distribution on [math](0,2\pi][/math] and define the time series [math]\left\{X_t\right\}[/math] by

[[math]]X_t=\cos (t+Y) \quad \text{ for } t \in \mathbb{R}. [[/math]]

Then [math]\left\{X_t\right\}[/math] is strictly stationary since ([math] (t+ Y) [/math] modulo [math] 2 \pi [/math]) follows the same uniform distribution as [math] Y [/math] for any [math] t [/math].
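As an additional check (a short derivation not in the original text), the first two moments can be computed directly from the uniform distribution of [math]Y[/math]:

[[math]] \begin{align*} \operatorname{E}[X_t] &= \frac{1}{2\pi}\int_0^{2\pi} \cos(t+y)\,dy = 0, \\ \operatorname{Cov}(X_t, X_{t+\tau}) &= \frac{1}{2\pi}\int_0^{2\pi} \cos(t+y)\cos(t+\tau+y)\,dy = \tfrac{1}{2}\cos(\tau), \end{align*} [[/math]]

so the mean is constant and the autocovariance depends only on the lag [math]\tau[/math], consistent with the stationarity of [math]\left\{X_t\right\}[/math].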

Example 3 (White Noise Process)

A discrete-time stochastic process [math]W(n)[/math] is called white noise if its mean is equal to zero for all [math]n[/math], i.e. [math]\operatorname{E}[W(n)] = 0[/math], and if the autocorrelation function [math]R_{W}(n) = \operatorname{E}[W(k+n)W(k)][/math] has a nonzero value only for [math]n = 0[/math]. Keep in mind that white noise is not necessarily strictly stationary. Let [math]\omega[/math] be a random variable uniformly distributed in the interval [math](0, 2\pi)[/math] and define the time series [math]\left\{z_t\right\}[/math] by

[[math]]z_t=\cos(t\omega) \quad (t=1,2,...) [[/math]]

Then

[[math]] \begin{align*} \mathbb{E}(z_t) &= \frac{1}{2\pi} \int_0^{2\pi} \cos(t\omega) \,d\omega = 0,\\ \operatorname{Var}(z_t) &= \frac{1}{2\pi} \int_0^{2\pi} \cos^2(t\omega) \,d\omega = 1/2,\\ \operatorname{Cov}(z_t , z_j) &= \frac{1}{2\pi} \int_0^{2\pi} \cos(t\omega)\cos(j\omega) \,d\omega = 0 \quad \forall t\neq j. \end{align*} [[/math]]

So [math]\{z_t\}[/math] is white noise; however, it is not strictly stationary. For example, [math]z_2 = \cos(2\omega) = 2z_1^2 - 1[/math] is a deterministic function of [math]z_1[/math], but [math]z_3[/math] is not the same function of [math]z_2[/math] (for [math]\omega = \pi/2[/math] one has [math]z_2 = -1[/math] and [math]z_3 = 0 \neq 2z_2^2 - 1[/math]), so the joint distribution of [math](z_1, z_2)[/math] differs from that of [math](z_2, z_3)[/math].
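The moment calculations above, and the failure of strict stationarity, can be checked by Monte Carlo simulation; the sketch below assumes NumPy, and the sample size and seed are arbitrary.

import numpy as np

rng = np.random.default_rng(2)
omega = rng.uniform(0.0, 2 * np.pi, size=100_000)   # one draw of omega per realisation

z1, z2, z3 = np.cos(omega), np.cos(2 * omega), np.cos(3 * omega)

print("mean of z_1:                ", z1.mean())              # approximately 0
print("variance of z_1:            ", z1.var())               # approximately 1/2
print("covariance of z_1 and z_2:  ", np.cov(z1, z2)[0, 1])   # approximately 0
print("max |z_2 - (2 z_1^2 - 1)|:  ", np.abs(z2 - (2 * z1**2 - 1)).max())  # 0: z_2 is a function of z_1
print("max |z_3 - (2 z_2^2 - 1)|:  ", np.abs(z3 - (2 * z2**2 - 1)).max())  # large: (z_2, z_3) has a different joint law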

Weak or wide-sense stationarity

A weaker form of stationarity commonly employed in signal processing is known as weak-sense stationarity, wide-sense stationarity (WSS), or covariance stationarity. WSS random processes only require that the first moment (i.e. the mean) and the autocovariance do not vary with respect to time and that the second moment is finite for all times. Any strictly stationary process which has a finite mean and a covariance is also WSS.[4]:p. 299

So, a continuous time random process [math]\left\{X_t\right\}[/math] which is WSS has the following restrictions on its mean function [math]m_X(t) \triangleq \operatorname E[X_t][/math] and autocovariance function [math]K_{XX}(t_1, t_2) \triangleq \operatorname E[(X_{t_1}-m_X(t_1))(X_{t_2}-m_X(t_2))][/math]:

[[math]] \begin{align*} & m_X(t) = m_X(t + \tau) & & \text{for all } \tau \in \mathbb{R} \\ & K_{XX}(t_1, t_2) = K_{XX}(t_1 - t_2, 0) & & \text{for all } t_1,t_2 \in \mathbb{R} \\ & \operatorname E[|X(t)|^2] \lt \infty & & \text{for all } t \in \mathbb{R} \end{align*} [[/math]]

The first property implies that the mean function [math]m_X(t)[/math] must be constant. The second property implies that the covariance function depends only on the difference between [math]t_1[/math] and [math]t_2[/math] and only needs to be indexed by one variable rather than two variables.[3]:p. 159 Thus, instead of writing,

[[math]]\,\!K_{XX}(t_1 - t_2, 0)\,[[/math]]

the notation is often abbreviated by the substitution [math]\tau = t_1 - t_2[/math]:

[[math]]K_{XX}(\tau) \triangleq K_{XX}(t_1 - t_2, 0)[[/math]]

This also implies that the autocorrelation depends only on [math]\tau = t_1 - t_2[/math], that is

[[math]]\,\! R_X(t_1,t_2) = R_X(t_1-t_2,0) \triangleq R_X(\tau).[[/math]]

The third property says that the second moments must be finite for any time [math]t[/math].
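The three WSS conditions can be probed empirically from an ensemble of simulated realisations. The sketch below is a rough illustration assuming NumPy; the AR(1) model, its coefficient, the burn-in length, and the chosen times are arbitrary.

import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, burn_in, phi = 5000, 200, 100, 0.7

# Ensemble of AR(1) paths, X_t = phi * X_{t-1} + eps_t, started at 0 and run past a burn-in period
x = np.zeros((n_paths, burn_in + n_steps))
eps = rng.standard_normal((n_paths, burn_in + n_steps))
for t in range(1, burn_in + n_steps):
    x[:, t] = phi * x[:, t - 1] + eps[:, t]
x = x[:, burn_in:]                       # discard the burn-in so the ensemble is close to stationarity

mean_fn = x.mean(axis=0)                 # estimate of m_X(t); roughly constant (here about 0)
print("range of the estimated mean function:", mean_fn.min(), mean_fn.max())

lag = 5                                  # K_XX(t_1, t_1 + lag) should depend on the lag only
for t1 in (10, 80, 150):
    t2 = t1 + lag
    k = np.mean((x[:, t1] - x[:, t1].mean()) * (x[:, t2] - x[:, t2].mean()))
    print(f"K_XX({t1}, {t2}) is approximately {k:.3f}")   # all close to phi**lag / (1 - phi**2)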

Differencing

One way to make some time series stationary is to compute the differences between consecutive observations. This is known as differencing. Differencing can help stabilize the mean of a time series by removing changes in the level of a time series, and so eliminating trends. This can also remove seasonality, if differences are taken appropriately (e.g. differencing observations 1 year apart to remove year-long seasonality).
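A small sketch of both kinds of differencing, assuming NumPy; the simulated linear trend, the period-12 seasonal pattern, and the noise level are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(4)
t = np.arange(240)

# Monthly-style series with a linear trend and a yearly (period-12) seasonal component plus noise
series = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.standard_normal(t.size)

first_diff = np.diff(series)                   # y_t - y_{t-1}: removes the linear trend
seasonal_diff = series[12:] - series[:-12]     # y_t - y_{t-12}: removes the period-12 seasonality

print("variance of the level series:        ", series.var())
print("variance after first differencing:   ", first_diff.var())
print("variance after seasonal differencing:", seasonal_diff.var())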

Transformations such as logarithms can help to stabilize the variance of a time series.

One of the ways of identifying non-stationary time series is the ACF plot. Sometimes, seasonal patterns will be more visible in the ACF plot than in the original time series; however, this is not always the case.[5] Nonstationary time series can nevertheless look stationary at first glance, so the ACF plot is best used together with formal tests such as the augmented Dickey–Fuller test.
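A sample ACF can be computed directly with NumPy, as in the sketch below (series length, lags, and seed are arbitrary choices); the slowly decaying ACF of the random walk is the kind of pattern that suggests non-stationarity, while the white-noise ACF is near zero at all nonzero lags.

import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation function at lags 0..max_lag of a one-dimensional series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:x.size - k]) / denom for k in range(max_lag + 1)])

rng = np.random.default_rng(5)
eps = rng.standard_normal(400)
white_noise = eps                       # stationary
random_walk = np.cumsum(eps)            # non-stationary (unit root)

print("white noise ACF, lags 1-5:", np.round(sample_acf(white_noise, 5)[1:], 2))
print("random walk ACF, lags 1-5:", np.round(sample_acf(random_walk, 5)[1:], 2))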

Another approach to identifying non-stationarity is to look at the Laplace transform of a series, which will identify both exponential trends and sinusoidal seasonality (complex exponential trends). Related techniques from signal analysis such as the wavelet transform and Fourier transform may also be helpful.

Autocorrelation, sometimes known as serial correlation in the discrete time case, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations as a function of the time lag between them. The analysis of autocorrelation is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time domain signals.

Different fields of study define autocorrelation differently, and not all of these definitions are equivalent. In some fields, the term is used interchangeably with autocovariance.

Unit root processes, trend-stationary processes, autoregressive processes, and moving average processes are specific forms of processes with autocorrelation.

Autocorrelation of stochastic processes

In statistics, the autocorrelation of a real or complex random process is the Pearson correlation between values of the process at different times, as a function of the two times or of the time lag. Let [math]\left\{ X_t \right\}[/math] be a random process, and [math]t[/math] be any point in time ([math]t[/math] may be an integer for a discrete-time process or a real number for a continuous-time process). Then [math]X_t[/math] is the value (or realization) produced by a given run of the process at time [math]t[/math]. Suppose that the process has mean [math]\mu_t[/math] and variance [math]\sigma_t^2[/math] at time [math]t[/math], for each [math]t[/math]. Then the definition of the auto-correlation function between times [math]t_1[/math] and [math]t_2[/math] is[6]:p.388[3]:p.165

[[math]]\operatorname{R}_{XX}(t_1,t_2) = \operatorname{E} \left[ X_{t_1} \overline{X}_{t_2}\right][[/math]]

where the bar represents complex conjugation.

Subtracting the mean before multiplication yields the auto-covariance function between times [math]t_1[/math] and [math]t_2[/math]:[6]:p.392[3]:p.168

[[math]]\operatorname{K}_{XX}(t_1,t_2) = \operatorname{E} \left[ (X_{t_1} - \mu_{t_1})\overline{(X_{t_2} - \mu_{t_2})} \right] = \operatorname{E}\left[X_{t_1} \overline{X}_{t_2} \right] - \mu_{t_1}\overline{\mu}_{t_2}[[/math]]

Note that this expression is not well defined for all time series or processes, because the mean may not exist, or the variance may be zero (for a constant process) or infinite (for processes with distribution lacking well-behaved moments, such as certain types of power law).

Definition for wide-sense stationary stochastic process

If [math]\left\{ X_t \right\}[/math] is a wide-sense stationary process then the mean [math]\mu[/math] and the variance [math]\sigma^2[/math] are time-independent, and further the autocovariance function depends only on the lag between [math]t_1[/math] and [math]t_2[/math]: the autocovariance depends only on the time-distance between the pair of values but not on their position in time. This further implies that the autocovariance and auto-correlation can be expressed as a function of the time-lag, and that this would be an even function of the lag [math]\tau=t_2-t_1[/math]. This gives the more familiar forms for the auto-correlation function[6]:p.395

[[math]]\operatorname{R}_{XX}(\tau) = \operatorname{E}\left[X_{t+\tau} \overline{X}_{t} \right][[/math]]

and the auto-covariance function:

[[math]]\operatorname{K}_{XX}(\tau) = \operatorname{E}\left[ (X_{t+\tau} - \mu)\overline{(X_{t} - \mu)} \right] = \operatorname{E} \left[ X_{t+\tau} \overline{X}_{t} \right] - \mu\overline{\mu}[[/math]]
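For a real-valued WSS series, these two functions can be estimated from a single realisation by replacing the ensemble expectation with a time average (a heuristic sketch assuming NumPy; the white-noise-plus-constant test series, the lags, and the seed are arbitrary, and the complex conjugation is dropped because the data are real):

import numpy as np

def autocorr_and_autocov(x, lag):
    """Time-average estimates of R_XX(lag) and K_XX(lag) for a real-valued WSS series."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    r = np.mean(x[lag:] * x[:x.size - lag])                    # R_XX(lag) = E[X_{t+lag} X_t]
    k = np.mean((x[lag:] - mu) * (x[:x.size - lag] - mu))      # K_XX(lag) = R_XX(lag) - mu^2
    return r, k

rng = np.random.default_rng(6)
x = 1.0 + rng.standard_normal(10_000)       # white noise around the constant mean mu = 1
for lag in (0, 1, 5):
    r, k = autocorr_and_autocov(x, lag)
    print(f"lag {lag}: R estimate = {r:.3f}, K estimate = {k:.3f}")
# Roughly expected: R(0) about 2 and K(0) about 1; for lag > 0, R about 1 (= mu^2) and K about 0.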

Normalization

It is common practice in some disciplines (e.g. statistics and time series analysis) to normalize the autocovariance function to get a time-dependent Pearson correlation coefficient. However, in other disciplines (e.g. engineering) the normalization is usually dropped and the terms "autocorrelation" and "autocovariance" are used interchangeably.

The definition of the auto-correlation coefficient of a stochastic process is[3]:p.169

[[math]]\rho_{XX}(t_1,t_2) = \frac{\operatorname{K}_{XX}(t_1,t_2)}{\sigma_{t_1}\sigma_{t_2}} = \frac{\operatorname{E}\left[(X_{t_1} - \mu_{t_1}) \overline{(X_{t_2} - \mu_{t_2})} \right]}{\sigma_{t_1}\sigma_{t_2}} .[[/math]]

If the function [math]\rho_{XX}[/math] is well defined, its value must lie in the range [math][-1,1][/math], with 1 indicating perfect correlation and −1 indicating perfect anti-correlation.

For a weak-sense stationary (wide-sense stationary, WSS) process, the definition is

[[math]]\rho_{XX}(\tau) = \frac{\operatorname{K}_{XX}(\tau)}{\sigma^2} = \frac{\operatorname{E} \left[(X_{t+\tau} - \mu)\overline{(X_{t} - \mu)}\right]}{\sigma^2}[[/math]]

where

[[math]]\operatorname{K}_{XX}(0) = \sigma^2 .[[/math]]

The normalization is important both because the interpretation of the autocorrelation as a correlation provides a scale-free measure of the strength of statistical dependence, and because the normalization has an effect on the statistical properties of the estimated autocorrelations.
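A sketch of this normalization for a WSS series, assuming NumPy (the AR(1) test process, its coefficient, and the lags shown are arbitrary choices); the estimated coefficients stay in [math][-1,1][/math] and, for this particular AR(1) example, are close to phi**lag.

import numpy as np

def acov(x, lag):
    """Time-average estimate of the autocovariance K_XX(lag) of a real-valued WSS series."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    return np.mean((x[lag:] - mu) * (x[:x.size - lag] - mu))

rng = np.random.default_rng(7)
phi, n = 0.8, 50_000
eps = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):                      # stationary AR(1); theoretical rho(lag) = phi**lag
    x[t] = phi * x[t - 1] + eps[t]

sigma2 = acov(x, 0)                        # K_XX(0) = sigma^2
for lag in (1, 2, 5, 10):
    rho = acov(x, lag) / sigma2            # rho_XX(lag) = K_XX(lag) / K_XX(0), always in [-1, 1]
    print(f"lag {lag}: estimated rho = {rho:.3f}, theoretical = {phi**lag:.3f}")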

Properties

Symmetry property

The fact that the auto-correlation function [math]\operatorname{R}_{XX}[/math] is an even function can be stated as[3]:p.171

[[math]]\operatorname{R}_{XX}(t_1,t_2) = \overline{\operatorname{R}_{XX}(t_2,t_1)}[[/math]]

respectively for a WSS process:[3]:p.173

[[math]]\operatorname{R}_{XX}(\tau) = \overline{\operatorname{R}_{XX}(-\tau)} .[[/math]]

Maximum at zero

For a WSS process:[3]:p.174

[[math]]\left|\operatorname{R}_{XX}(\tau)\right| \leq \operatorname{R}_{XX}(0)[[/math]]

Notice that [math]\operatorname{R}_{XX}(0)[/math] is always real.

Cauchy–Schwarz inequality

The Cauchy–Schwarz inequality for stochastic processes:[6]:p.392

[[math]]\left|\operatorname{R}_{XX}(t_1,t_2)\right|^2 \leq \operatorname{E}\left[ |X_{t_1}|^2\right] \operatorname{E}\left[|X_{t_2}|^2\right][[/math]]
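For a WSS process this inequality yields the maximum-at-zero property stated above (a short derivation added here for completeness): taking [math]t_1 = t+\tau[/math] and [math]t_2 = t[/math] and using [math]\operatorname{E}\left[|X_t|^2\right] = \operatorname{R}_{XX}(0)[/math],

[[math]]\left|\operatorname{R}_{XX}(\tau)\right|^2 \leq \operatorname{E}\left[|X_{t+\tau}|^2\right]\operatorname{E}\left[|X_{t}|^2\right] = \operatorname{R}_{XX}(0)^2, \qquad \text{so } \left|\operatorname{R}_{XX}(\tau)\right| \leq \operatorname{R}_{XX}(0).[[/math]]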

References

  1. Gagniuc, Paul A. (2017). Markov Chains: From Theory to Implementation and Experimentation. NJ, USA: John Wiley & Sons. pp. 1–256. ISBN 978-1-119-38755-8.
  2. Laumann, Timothy O.; Snyder, Abraham Z.; Mitra, Anish; et al. (2016-09-02). "On the Stability of BOLD fMRI Correlations". Cerebral Cortex. doi:10.1093/cercor/bhw265. ISSN 1047-3211. PMID 27591147. PMC 6248456.
  3. Park, Kun Il (2018). Fundamentals of Probability and Stochastic Processes with Applications to Communications. Springer. ISBN 978-3-319-68074-3.
  4. Florescu, Ionut (7 November 2014). Probability and Stochastic Processes. John Wiley & Sons. ISBN 978-1-118-59320-2.
  5. "8.1 Stationarity and differencing | OTexts". www.otexts.org. Retrieved 2016-05-18.
  6. Gubner, John A. (2006). Probability and Random Processes for Electrical and Computer Engineers. Cambridge University Press. ISBN 978-0-521-86470-1.
