guide:6677063157: Difference between revisions

From Stochiki
'''Extreme value theory''' or '''extreme value analysis''' ('''EVA''') is a branch of [[wikipedia:statistics|statistics]] dealing with the extreme [[wikipedia:Deviation (statistics)|deviation]]s from the [[wikipedia:median|median]] of [[wikipedia:probability distribution|probability distribution]]s. It seeks to assess, from a given ordered [[wikipedia:Sample (statistics)|sample]] of a given random variable, the probability of events that are more extreme than any previously observed.
 
==Univariate theory==
Let <math>X_1, \dots, X_n</math> be a sequence of [[wikipedia:independent and identically distributed|independent and identically distributed]] random variables with [[wikipedia:cumulative distribution function|cumulative distribution function]] ''F'' and let <math>M_n =\max(X_1,\dots,X_n)</math> denote the maximum.
 
In theory, the exact distribution of the maximum can be derived:
<math display="block">
\begin{align*}
\Pr(M_n \leq z) & = \Pr(X_1 \leq z, \dots, X_n \leq z) \\
& = \Pr(X_1 \leq z) \cdots \Pr(X_n \leq z) = (F(z))^n.
\end{align*}
</math>
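
As an illustrative check of this identity, the following minimal sketch compares the closed form <math>(F(z))^n</math> with a Monte Carlo estimate. The choice of a standard uniform distribution for <math>F</math>, the number of trials, and all function names are assumptions made for the example only.

```python
import random

def exact_max_cdf(F, z, n):
    """P(M_n <= z) = F(z)**n for n i.i.d. draws with cdf F."""
    return F(z) ** n

def simulated_max_cdf(sampler, z, n, trials=20000, seed=0):
    """Monte Carlo estimate of P(M_n <= z) from `trials` independent maxima."""
    rng = random.Random(seed)
    hits = sum(max(sampler(rng) for _ in range(n)) <= z for _ in range(trials))
    return hits / trials

# Assumed example: X_i ~ Uniform(0, 1), so F(z) = z on [0, 1].
uniform_cdf = lambda z: min(max(z, 0.0), 1.0)
exact = exact_max_cdf(uniform_cdf, 0.9, 10)                     # 0.9**10
approx = simulated_max_cdf(lambda rng: rng.random(), 0.9, 10)   # close to exact
```

The two numbers should agree up to sampling noise, which shrinks as the number of trials grows.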
 
The associated [[wikipedia:indicator function|indicator function]] <math>I_n = I(M_n > z)</math> is a [[wikipedia:Bernoulli process|Bernoulli process]] with a success probability <math>p(z)=1-(F(z))^n</math> that depends on the magnitude <math>z</math> of the extreme event. The number of extreme events within <math>n</math> trials thus follows a [[wikipedia:binomial distribution|binomial distribution]] and the number of trials until an event occurs follows a [[wikipedia:geometric distribution|geometric distribution]] with expected value and standard deviation of the same order <math>O(1/p(z))</math>.
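
The waiting-time statement can be made concrete with a short sketch. It assumes the geometric distribution is counted on <math>1, 2, \dots</math>, for which the mean is <math>1/p</math> and the standard deviation is <math>\sqrt{1-p}/p</math>; the numerical values of <math>F(z)</math> and <math>n</math> are hypothetical.

```python
import math

def exceedance_probability(F_z, n):
    """p(z) = 1 - F(z)**n: probability that the maximum of n draws exceeds z."""
    return 1.0 - F_z ** n

def waiting_time_stats(p):
    """Mean and standard deviation of a geometric waiting time on 1, 2, ..."""
    return 1.0 / p, math.sqrt(1.0 - p) / p

# Assumed example: F(z) = 0.999 and blocks of n = 10 observations.
p = exceedance_probability(0.999, 10)
mean_wait, sd_wait = waiting_time_stats(p)
```

For small <math>p(z)</math> the standard deviation is almost as large as the mean, which is why both are of order <math>O(1/p(z))</math>.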
 
In practice, we might not have the distribution function <math>F</math> but the [[wikipedia:Fisher–Tippett–Gnedenko theorem|Fisher–Tippett–Gnedenko theorem]] provides an asymptotic result.  If there exist sequences of constants <math>a_n>0 </math> and <math>b_n\in \mathbb R </math> such that
 
<math display="block"> \Pr\{(M_n-b_n)/a_n \leq z\} \rightarrow G(z) </math>
 
as <math>n \rightarrow \infty</math> then
 
<math display="block"> G(z) \propto \exp \left[-(1+\zeta z)^{-1/\zeta} \right] </math>
 
where <math>\zeta</math> depends on the tail shape of the distribution.
When normalized, <math>G</math> belongs to one of the following non-[[wikipedia:degenerate distribution|degenerate distribution]] families:
 
{| class="table"
|-
! Family !! Distribution Function !! Condition
|-
| [[wikipedia:Weibull distribution|Weibull law]] || <math> G(z) = \begin{cases} \exp\left\{-\left( -\left( \frac{z-b}{a} \right) \right)^\alpha\right\} & z \lt b \\ 1 & z\geq b \end{cases} \text{ for }z\in\mathbb R</math> || When the distribution of <math>M_n</math> has a light tail with finite upper bound
|-
| [[wikipedia:Gumbel distribution|Gumbel law]] || <math> G(z) = \exp\left\{-\exp\left(-\left(\frac{z-b}{a}\right)\right)\right\}</math> || When the distribution of <math>M_n</math> has an exponential tail
|-
| [[wikipedia:Fréchet distribution|Fréchet law]] || <math> G(z) = \begin{cases} 0 & z\leq b \\ \exp\left\{-\left(\frac{z-b}{a}\right)^{-\alpha}\right\} & z \gt b \end{cases}</math> || When the distribution of <math>M_n</math> has a [[wikipedia:Heavy-tailed distribution|heavy tail]] (including polynomial decay)
|}
 
For the Weibull and Fréchet laws, <math>\alpha>0</math>. Together, the three families presented above form the class of ''generalized extreme value distributions''.
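
A standard worked case, not taken from the text above, is the maximum of <math>n</math> independent Exponential(1) variables: with the assumed normalizing constants <math>a_n = 1</math> and <math>b_n = \log n</math>, the exact distribution <math>(1 - e^{-z}/n)^n</math> converges to the Gumbel law with <math>a = 1</math> and <math>b = 0</math>. The sketch below measures the gap on a grid; the grid and the function names are illustrative choices.

```python
import math

def normalized_max_cdf(z, n):
    """P(M_n - log(n) <= z) for n i.i.d. Exponential(1) variables."""
    x = z + math.log(n)
    return (1.0 - math.exp(-x)) ** n if x > 0 else 0.0

def gumbel_cdf(z):
    """Standard Gumbel law: the Gumbel row of the table with a = 1, b = 0."""
    return math.exp(-math.exp(-z))

# Largest gap between the finite-n cdf and the limit on a grid of z values.
grid = [i / 10 for i in range(-30, 61)]
errors = {n: max(abs(normalized_max_cdf(z, n) - gumbel_cdf(z)) for z in grid)
          for n in (10, 100, 1000, 10000)}
```

The entries of <code>errors</code> shrink as <math>n</math> grows, in line with the asymptotic statement of the theorem.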
 
==Generalized extreme value distributions==
 
In [[wikipedia:probability theory|probability theory]] and [[wikipedia:statistics|statistics]], the '''generalized extreme value''' ('''GEV''') '''distribution'''<ref>{{Cite web|last=Weisstein|first=Eric W.|title=Extreme Value Distribution|url=https://mathworld.wolfram.com/ExtremeValueDistribution.html|access-date=2021-08-06|website=mathworld.wolfram.com|language=en}}</ref> is a family of continuous [[wikipedia:probability distribution|probability distribution]]s developed within [[wikipedia:extreme value theory|extreme value theory]] to combine the [[wikipedia:Gumbel distribution|Gumbel]], [[wikipedia:Fréchet distribution|Fréchet]] and [[wikipedia:Weibull distribution|Weibull]] families, also known as the type I, II and III extreme value distributions. By the [[wikipedia:Fisher–Tippett–Gnedenko theorem|extreme value theorem]], the GEV distribution is the only possible limit distribution of properly normalized maxima of a sequence of independent and identically distributed random variables.<ref>{{cite book |last1=Haan |first1=Laurens |last2=Ferreira |first2=Ana |title=Extreme value theory: an introduction |date=2007 |publisher=Springer}}</ref> For this result to apply, a limit distribution must exist, which requires regularity conditions on the tail of the underlying distribution. In practice, the GEV distribution is nonetheless often used as an approximation to model the maxima of long (finite) sequences of random variables.
 
==Specification==
Let <math>s = (x - \mu)/\sigma</math> be the standardized variable, where the location parameter <math>\mu</math> can be any real number and <math>\sigma > 0</math> is the scale parameter. The cumulative distribution function of the GEV distribution is then
 
<math display="block">F(s; \xi) = \begin{cases} \exp\Bigl(-\exp(-s)\Bigr) & ~~ \text{ for } ~~ \xi = 0 \\ {} \\
\exp\Bigl(-(1+\xi s)^{-1/\xi}\Bigr) & ~~ \text{ for } ~~ \xi \neq 0 ~~ \text{ and } ~~ \xi \, s > -1 \\ {} \\
0 & ~~ \text{ for } ~~ \xi > 0 ~~ \text{ and } ~~ \xi\, s \le -1 \\ {} \\
1 & ~~ \text{ for } ~~ \xi < 0 ~~ \text{ and } ~~ \xi\, s \le -1 ~, \end{cases}</math>
 
where <math>\xi\,,</math> the shape parameter, can be any real number. Thus, for <math>\xi > 0</math>, the expression is valid for <math>s > -1/\xi\,,</math> while for <math>\xi < 0</math> it is valid for <math>s < -1/\xi\,.</math> In the first case, <math>-1/\xi</math> is the negative, lower end-point, where <math>F</math> is 0; in the second case, <math>-1/\xi</math> is the positive, upper end-point, where <math>F</math> is 1. For <math>\xi = 0</math> the second expression is formally undefined and is replaced with the first expression, which is the result of taking the limit of the second, as <math>\xi \to 0</math> in which case <math>s</math> can be any real number.
 
In the special case <math>x = \mu</math>, we have <math>s = 0</math> and <math>F(0; \xi) = \exp(-1) \approx 0.368</math>, whatever the values of <math>\xi</math> and <math>\sigma</math>.
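
The piecewise definition translates directly into code. The following is a minimal sketch with a hypothetical function name <code>gev_cdf</code>; it makes no attempt at numerical stability for <math>\xi</math> very close to zero or for extreme arguments.

```python
import math

def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV cdf at x, computed through s = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    s = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-s))        # Gumbel case
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: at or below the lower end-point (xi > 0),
        # or at or above the upper end-point (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))
```

Evaluating <code>gev_cdf(mu, mu, sigma, xi)</code> returns <math>\exp(-1)</math> for any admissible <math>\sigma</math> and <math>\xi</math>, matching the special case <math>s = 0</math>.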
 
The probability density function of the standardized distribution is
 
<math display="block">f(s;\xi) = \begin{cases} \exp(-s) \exp\Bigl(-\exp(-s)\Bigr) & ~~ \text{ for } ~~ \xi = 0 \\ {} \\
\Bigl(1+\xi s\Bigr)^{-(1+1/\xi)} \exp\Bigl(-(1+\xi s)^{-1/\xi}\Bigr) & ~~ \text{ for } ~~ \xi \neq 0 ~~ \text{ and } ~~ \xi \, s > -1 \\ {} \\
0 & ~~ \text{ otherwise, } \end{cases}</math>
 
again valid for <math>s > -1/\xi</math> in the case <math>\xi > 0\,,</math> and for <math>s < -1/\xi</math> in the case <math>\xi < 0\,.</math> The density is zero outside of the relevant range. In the case <math>\xi = 0</math> the density is positive on the whole real line.
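
As a sanity check, the density should equal the derivative of the cumulative distribution function inside the support. The sketch below compares the closed-form density with a central finite difference; the function names and the step size <code>h</code> are illustrative assumptions.

```python
import math

def gev_std_cdf(s, xi):
    """Standardized GEV cdf F(s; xi)."""
    if xi == 0.0:
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_std_pdf(s, xi):
    """Standardized GEV density f(s; xi); zero outside the support."""
    if xi == 0.0:
        return math.exp(-s) * math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        return 0.0
    return t ** (-(1.0 + 1.0 / xi)) * math.exp(-t ** (-1.0 / xi))

def central_difference(s, xi, h=1e-6):
    """Numerical derivative of the cdf, used only as a check."""
    return (gev_std_cdf(s + h, xi) - gev_std_cdf(s - h, xi)) / (2.0 * h)
```

At interior points such as <math>s = 0</math>, <code>gev_std_pdf</code> and <code>central_difference</code> agree to several decimal places for negative, zero and positive <math>\xi</math>.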
 
Since the cumulative distribution function is invertible, the quantile function for the GEV distribution has an explicit expression, namely
 
<math display="block">Q(p;\mu,\sigma,\xi) = \begin{cases}
\mu - \sigma\log\Bigl(-\log\left(p\right)\,\Bigr) & ~ \text{ for } ~ \xi = 0 ~ \text{ and } ~ p \in \left(0,1\right) \\ {} \\
\mu + \displaystyle{{\,\sigma\,}\over{\,\xi\,}}\left( \Bigl(-\log(p)\,\Bigr)^{-\xi} - 1\right) & ~ \text{ for } ~ \xi > 0 ~ \text{ and } ~ p \in \left[0,1\right) \\
{} & ~~ \text{ or } ~ \, \xi < 0 ~ \text{ and } ~ p \in (0,1]\;,\end{cases}</math>
 
and therefore the quantile density function <math>\left(q \equiv \frac{\;\operatorname{d}Q\;}{\operatorname{d}p}\right)</math> is
 
<math display="block">q(p;\sigma,\xi) = \frac{\sigma}{\;\Bigl(-\log\left(p\right)\, \Bigr)^{\xi+1}\, p\,} \quad \text{ for } ~~ p \in \left(0,1\right)\;,</math>
 
valid for <math>~\sigma > 0~</math> and for any real <math>~\xi\;.</math>
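
Because the quantile function is explicit, GEV variates can be generated by inverse-transform sampling, that is, by applying <math>Q</math> to uniform random numbers. The sketch below is one way to do this; the function names are hypothetical, and the end-point cases <math>p = 0</math> and <math>p = 1</math> are deliberately excluded.

```python
import math
import random

def gev_quantile(p, mu=0.0, sigma=1.0, xi=0.0):
    """Quantile function Q(p; mu, sigma, xi) for p strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    if xi == 0.0:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1.0)

def gev_sample(n, mu=0.0, sigma=1.0, xi=0.0, seed=0):
    """Inverse-transform sampling: apply Q to uniform draws on (0, 1)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:                      # random() may return exactly 0.0
            draws.append(gev_quantile(u, mu, sigma, xi))
    return draws

# The quantile at p = exp(-1) is the location parameter mu.
at_location = gev_quantile(math.exp(-1.0), mu=3.0, sigma=2.0, xi=0.2)
```

For <math>\xi > 0</math> every draw lies above the lower end-point <math>\mu - \sigma/\xi</math>, and roughly a fraction <math>\exp(-1)</math> of the draws falls at or below <math>\mu</math>.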
 
==Summary statistics==
 
Some simple statistics of the distribution, written in terms of <math>g_k = \Gamma(1 - k\xi)</math> and stated for <math>\xi \neq 0</math>, are:

{| class="table"
|-
! Statistic !! Value
|-
| Mean || <math>\operatorname{E}(X) = \mu + \left(g_1-1\right)\frac{\sigma}{\xi}</math> for <math>\xi < 1</math>
|-
| Variance || <math>\operatorname{Var}(X) = \left(g_2-g_1^2\right)\frac{\sigma^2}{\xi^2}</math> for <math>\xi < 1/2</math>
|-
| Mode || <math>\operatorname{Mode}(X) = \mu+\frac{\sigma}{\xi}[(1+\xi)^{-\xi}-1]</math>
|}
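
The formulas in the table can be evaluated with the gamma function from the Python standard library. The sketch below assumes the usual convention <math>g_k = \Gamma(1 - k\xi)</math>, covers only <math>\xi \neq 0</math>, and additionally restricts to <math>\xi > -1</math> so that the mode expression is real-valued; the function name and the use of <code>None</code> for non-existent moments are illustrative choices.

```python
import math

def gev_summary(mu, sigma, xi):
    """Mean, variance and mode of GEV(mu, sigma, xi) for xi != 0, xi > -1.

    Uses g_k = Gamma(1 - k * xi). The mean requires xi < 1 and the
    variance requires xi < 1/2; None is returned when a moment fails
    to exist.
    """
    if xi == 0.0:
        raise ValueError("xi = 0 (Gumbel) needs the limiting formulas")
    if xi <= -1.0:
        raise ValueError("this sketch only covers xi > -1")
    g = lambda k: math.gamma(1.0 - k * xi)
    mean = mu + (g(1) - 1.0) * sigma / xi if xi < 1.0 else None
    var = (g(2) - g(1) ** 2) * sigma ** 2 / xi ** 2 if xi < 0.5 else None
    mode = mu + sigma / xi * ((1.0 + xi) ** (-xi) - 1.0)
    return mean, var, mode
```

For <math>\xi</math> close to zero the results approach the Gumbel values: mean <math>\mu + \gamma\sigma</math> with <math>\gamma \approx 0.5772</math> the Euler–Mascheroni constant, variance <math>\sigma^2\pi^2/6</math>, and mode <math>\mu</math>.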
 
 
==Link to Fréchet, Weibull and Gumbel families==
 
The shape parameter <math>\xi</math> governs the tail behavior of the distribution. The sub-families defined by <math>\xi= 0</math>, <math>\xi>0</math> and <math>\xi<0</math> correspond, respectively, to the Gumbel, Fréchet and Weibull families, whose cumulative distribution functions are displayed below.
 
<div class="card">
  <div class="card-header">Gumbel or type I extreme value distribution (<math>\xi=0</math>)</div>
  <div class="card-body">
  <p class="card-text">
<math display="block"> F(x;\mu,\sigma,0)=e^{-e^{-(x-\mu)/\sigma}}\;\;\; \text{for} \;\; x\in\mathbb R.</math></p>
</div>
</div>
 
 
<div class="card">
  <div class="card-header"> Fréchet or type II extreme value distribution</div>
  <div class="card-body">
  <p class="card-text">
If <math>\xi=\alpha^{-1}>0</math> and <math> y = 1 + \xi (x-\mu)/\sigma </math>
<math display="block"> F(x;\mu,\sigma,\xi)=\begin{cases} e^{-y^{-\alpha}} & y > 0 \\ 0 & y \leq 0. \end{cases}</math></p>
</div>
</div>
 
 
<div class="card">
  <div class="card-header">Reversed Weibull or type III extreme value distribution</div>
  <div class="card-body">
  <p class="card-text">
If <math>\xi=-\alpha^{-1}<0</math> and <math> y = - \left( 1 + \xi (x-\mu)/\sigma \right) </math>
<math display="block"> F(x;\mu,\sigma,\xi)=\begin{cases} e^{-(-y)^{\alpha}} & y<0 \\ 1 & y\geq 0 \end{cases}</math></p>
</div>
</div>
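
The correspondence with the <math>(a, b, \alpha)</math> parameterization used in the table of limit laws under "Univariate theory" can be made explicit. The mapping below is derived here rather than quoted from a source: for <math>\xi \neq 0</math>, take <math>a = \sigma/|\xi|</math>, <math>b = \mu - \sigma/\xi</math> and <math>\alpha = 1/|\xi|</math>. The sketch checks numerically that both forms give the same cumulative distribution function; all function names are hypothetical.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV cdf for xi != 0 (the Gumbel case needs no reparameterization)."""
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def classical_parameters(mu, sigma, xi):
    """Map (mu, sigma, xi), xi != 0, to (a, b, alpha) of the classical laws."""
    return abs(sigma / xi), mu - sigma / xi, 1.0 / abs(xi)

def frechet_cdf(z, a, b, alpha):
    """Frechet law: zero at or below the lower end-point b."""
    return 0.0 if z <= b else math.exp(-((z - b) / a) ** (-alpha))

def reversed_weibull_cdf(z, a, b, alpha):
    """Weibull law of the table: one at or above the upper end-point b."""
    return 1.0 if z >= b else math.exp(-(-(z - b) / a) ** alpha)
```

For example, <code>classical_parameters(1.0, 2.0, 0.5)</code> gives <math>a = 4</math>, <math>b = -3</math>, <math>\alpha = 2</math>, and <code>frechet_cdf</code> with these values agrees with <code>gev_cdf(x, 1.0, 2.0, 0.5)</code>.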
 
 
==Related distributions==
 
{| class="table"
|-
! Related Distribution !! Relation
|-
| GEV || If <math>X \sim \textrm{GEV}(\mu,\,\sigma,\,\xi)</math> and <math>m > 0</math>, then <math>mX+b \sim \textrm{GEV}(m\mu+b,\,m\sigma,\,\xi)</math>
|-
| Gumbel || If <math>X \sim \textrm{Gumbel}(\mu,\,\sigma)</math> ([[wikipedia:Gumbel distribution|Gumbel distribution]]) then <math>X \sim \textrm{GEV}(\mu,\,\sigma,\,0)</math>
|-
| Weibull || If <math>X \sim \textrm{GEV}(\mu,\,\sigma,\,0)</math> then <math>\sigma \exp (-\tfrac{X-\mu}{\mu \sigma} ) \sim \textrm{Weibull}(\sigma,\,\mu)</math> ([[wikipedia:Weibull distribution|Weibull distribution]])
|-
| Exponential ||  If <math>X \sim \textrm{Exponential}(1)\,</math> ([[wikipedia:Exponential distribution|Exponential distribution]]) then <math>\mu - \sigma \log{X} \sim \textrm{GEV}(\mu,\,\sigma,\,0)</math>
|-
| Logistic ||  If <math>X \sim \mathrm{Gumbel}(\alpha_X, \beta) </math> and <math> Y \sim \mathrm{Gumbel}(\alpha_Y, \beta) </math> are independent, then <math> X-Y \sim \mathrm{Logistic}(\alpha_X-\alpha_Y,\beta) \,</math> (see [[wikipedia:Logistic distribution|logistic distribution]])
|}
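
Two of the relations in the table lend themselves to a quick simulation check: the exponential-to-Gumbel transformation and the logistic law of the difference of two independent Gumbel variables with a common scale. The sketch below is illustrative only; the parameter values, sample size and helper names are assumptions.

```python
import math
import random

def gumbel_from_exponential(rng, mu, sigma):
    """Draw from GEV(mu, sigma, 0) as mu - sigma * log(X), X ~ Exponential(1)."""
    return mu - sigma * math.log(rng.expovariate(1.0))

def empirical_cdf(values, x):
    """Fraction of the sample that is at most x."""
    return sum(v <= x for v in values) / len(values)

rng = random.Random(0)
mu, sigma, n = 1.0, 2.0, 20000
gumbel_draws = [gumbel_from_exponential(rng, mu, sigma) for _ in range(n)]
# Difference of two independent Gumbel draws with locations 0.5 and -0.5.
diffs = [gumbel_from_exponential(rng, 0.5, sigma)
         - gumbel_from_exponential(rng, -0.5, sigma) for _ in range(n)]

gumbel_cdf = lambda x: math.exp(-math.exp(-(x - mu) / sigma))
logistic_cdf = lambda x: 1.0 / (1.0 + math.exp(-(x - 1.0) / sigma))
```

Comparing <code>empirical_cdf(gumbel_draws, x)</code> with <code>gumbel_cdf(x)</code>, and <code>empirical_cdf(diffs, x)</code> with <code>logistic_cdf(x)</code>, gives agreement up to sampling noise.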
 
==Generalized Pareto Distributions==
 
In [[wikipedia:statistics|statistics]], the '''generalized Pareto distribution''' (GPD) is a family of continuous [[wikipedia:probability distribution|probability distribution]]s. It is often used to model the tails of another distribution. It is specified by three parameters: location <math>\mu</math>, scale <math>\sigma</math>, and shape <math>\xi</math>.<ref>{{Cite book |title=An Introduction to Statistical Modeling of Extreme Values |last=Coles |first=Stuart |publisher=Springer |page=75 |url=https://books.google.com/books?id=2nugUEaKqFEC |isbn=9781852334598 |date=2001-12-12}}</ref><ref>{{Cite journal | last1 = Dargahi-Noubary | first1 = G. R. | title = On tail estimation: An improved method | doi = 10.1007/BF00894450 | journal = Mathematical Geology | volume = 21 | issue = 8 | pages = 829–842 | year = 1989 | s2cid = 122710961 }}</ref> Sometimes it is specified by only scale and shape<ref>{{Cite journal | last1 = Hosking | first1 = J. R. M. | last2 = Wallis | first2 = J. R. | title = Parameter and Quantile Estimation for the Generalized Pareto Distribution | journal = Technometrics | volume = 29 | issue = 3 | pages = 339–349 | doi = 10.2307/1269343 | year = 1987 | jstor = 1269343 }}</ref> and sometimes only by its shape parameter. Some references give the shape parameter as <math> \kappa =  - \xi \,</math>.<ref>{{Cite book |title=Statistical Extremes and Applications |editor-last=de Oliveira |editor-first=J. Tiago |publisher=Kluwer |last=Davison |first=A. C. |chapter=Modelling Excesses over High Thresholds, with an Application |page=462 |chapter-url=https://books.google.com/books?id=6M03_6rm8-oC&pg=PA462 |isbn=9789027718044 |date=1984-09-30}}</ref>
 
==Definition==
The standard cumulative distribution function (cdf) of the GPD is defined by<ref>{{Cite book |last1=Embrechts |first1=Paul |last2=Klüppelberg |first2=Claudia|author2-link= wikipedia:Claudia Klüppelberg |last3=Mikosch |first3=Thomas |title=Modelling extremal events for insurance and finance |page=162 |url=https://books.google.com/books?id=BXOI2pICfJUC |isbn=9783540609315 |date=1997-01-01}}</ref>
 
<math display="block">F_{\xi}(z) = \begin{cases}
1 - \left(1 + \xi z\right)^{-1/\xi} & \text{for }\xi \neq 0, \\
1 - e^{-z} & \text{for }\xi = 0.
\end{cases}
</math>
 
where the support is <math> z \geq 0 </math> for <math> \xi \geq 0</math> and <math> 0 \leq z \leq - 1 /\xi </math> for <math> \xi < 0</math>. The corresponding probability density function (pdf) is
 
<math display="block">f_{\xi}(z) = \begin{cases}
(1 + \xi z)^{-\frac{\xi +1}{\xi }} & \text{for }\xi \neq 0, \\
e^{-z} & \text{for }\xi = 0.
\end{cases}
</math>
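
The standard GPD is straightforward to code. The following minimal sketch uses hypothetical function names and extends both functions outside the support in the natural way: the cdf is clamped to 0 and 1, and the density is set to zero.

```python
import math

def gpd_std_cdf(z, xi):
    """Standard GPD cdf F_xi(z), clamped to 0 below and 1 above the support."""
    if z <= 0.0:
        return 0.0
    if xi == 0.0:
        return 1.0 - math.exp(-z)
    if xi < 0.0 and z >= -1.0 / xi:
        return 1.0
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)

def gpd_std_pdf(z, xi):
    """Standard GPD density f_xi(z), set to zero outside the support.

    The upper end-point itself (xi < 0) is also assigned zero, which
    avoids a division by zero when xi < -1 and changes no probability.
    """
    if z < 0.0 or (xi < 0.0 and z >= -1.0 / xi):
        return 0.0
    if xi == 0.0:
        return math.exp(-z)
    return (1.0 + xi * z) ** (-(xi + 1.0) / xi)
```

Inside the support, the density agrees with a finite-difference derivative of the cdf, which is a convenient way to check an implementation.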
 
==Characterization==
The related location-scale family of distributions is obtained by replacing the argument ''z'' by <math>\frac{x-\mu}{\sigma}</math> and adjusting the support accordingly.
 
The [[wikipedia:cumulative distribution function|cumulative distribution function]] of <math>X \sim GPD(\mu, \sigma, \xi)</math> (<math>\mu\in\mathbb R</math>, <math>\sigma>0</math>, and <math>\xi\in\mathbb R</math>) is
 
<math display="block">F_{(\mu,\sigma,\xi)}(x) = \begin{cases}
1 - \left(1+ \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi} & \text{for }\xi \neq 0, \\
1 - \exp \left(-\frac{x-\mu}{\sigma}\right) & \text{for }\xi = 0,
\end{cases}
</math>
where the support of <math>X</math> is <math> x \geq \mu </math> when <math> \xi \geq 0 \,</math>, and <math> \mu \leq x \leq \mu - \sigma /\xi </math> when <math> \xi < 0</math>.
 
The [[wikipedia:probability density function|probability density function]] (pdf) of <math>X \sim GPD(\mu, \sigma, \xi)</math> with <math>\xi \neq 0</math> is

<math display="block">f_{(\mu,\sigma,\xi)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi (x-\mu)}{\sigma}\right)^{\left(-\frac{1}{\xi} - 1\right)},</math>
again for <math> x \geq \mu </math> when <math> \xi > 0</math>, and <math> \mu \leq x \leq \mu - \sigma /\xi </math> when <math> \xi < 0</math>. For <math>\xi = 0</math> the density is the limiting form <math>\frac{1}{\sigma}\exp\left(-\frac{x-\mu}{\sigma}\right)</math> for <math>x \geq \mu</math>.
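
Solving <math>F_{(\mu,\sigma,\xi)}(x) = p</math> for <math>x</math> gives a quantile function, <math>x = \mu + \sigma\left((1-p)^{-\xi} - 1\right)/\xi</math> for <math>\xi \neq 0</math> and <math>x = \mu - \sigma\log(1-p)</math> for <math>\xi = 0</math>. This inversion is derived here rather than stated in the text above, and the sketch below uses it for inverse-transform sampling; the function names and parameter values are hypothetical.

```python
import math
import random

def gpd_quantile(p, mu, sigma, xi):
    """Solve F(x) = p for GPD(mu, sigma, xi), with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if xi == 0.0:
        return mu - sigma * math.log(1.0 - p)
    return mu + sigma * ((1.0 - p) ** (-xi) - 1.0) / xi

def gpd_sample(n, mu, sigma, xi, seed=0):
    """Inverse-transform sampling; random() returns values in [0, 1)."""
    rng = random.Random(seed)
    return [gpd_quantile(rng.random(), mu, sigma, xi) for _ in range(n)]

draws = gpd_sample(1000, mu=1.0, sigma=2.0, xi=-0.4)
upper_end_point = 1.0 - 2.0 / (-0.4)      # mu - sigma / xi
```

With <math>\xi < 0</math>, as in this example, every draw falls between <math>\mu</math> and the upper end-point <math>\mu - \sigma/\xi</math>.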
 
==Special cases==
 
Several well-known distributions are special cases of the generalized Pareto distribution:
 
{| class="table"
|-
! Distribution !! Case
|-
| Exponential || If the shape <math>\xi</math> and location <math>\mu</math> are both zero, the GPD is equivalent to the [[wikipedia:exponential distribution|exponential distribution]] with rate <math>1/\sigma</math>
|-
| Uniform || With shape <math>\xi = -1</math> and location <math>\mu = 0</math>, the GPD is equivalent to the [[wikipedia:continuous uniform distribution|continuous uniform distribution]] <math>U(0, \sigma)</math>
|-
| Pareto || With shape <math>\xi > 0</math> and location <math>\mu = \sigma/\xi</math>, the GPD is equivalent to the [[wikipedia:Pareto distribution|Pareto distribution]] with scale <math>x_m=\sigma/\xi</math> and shape <math>\alpha=1/\xi</math>.
|-
| Burr || GPD is similar to the [[wikipedia:Burr distribution|Burr distribution]].
|}
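
The first three rows of the table can be verified numerically by comparing cumulative distribution functions. The sketch below does so on a grid of points; it assumes <math>\mu = 0</math> for the exponential and uniform cases, and the helper names and parameter values are illustrative.

```python
import math

def gpd_cdf(x, mu, sigma, xi):
    """Cdf of GPD(mu, sigma, xi), clamped to 0 and 1 outside the support."""
    z = (x - mu) / sigma
    if z <= 0.0:
        return 0.0
    if xi == 0.0:
        return 1.0 - math.exp(-z)
    if xi < 0.0 and z >= -1.0 / xi:
        return 1.0
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)

def exponential_cdf(x, rate):
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0

def uniform_cdf(x, low, high):
    return min(max((x - low) / (high - low), 0.0), 1.0)

def pareto_cdf(x, x_m, alpha):
    return 1.0 - (x_m / x) ** alpha if x > x_m else 0.0

def max_gap(f, g, points):
    """Largest absolute difference between two cdfs on a set of points."""
    return max(abs(f(x) - g(x)) for x in points)

sigma, xi = 2.0, 0.5
points = [0.25 * i for i in range(1, 81)]        # 0.25, 0.5, ..., 20.0
gap_exponential = max_gap(lambda x: gpd_cdf(x, 0.0, sigma, 0.0),
                          lambda x: exponential_cdf(x, 1.0 / sigma), points)
gap_uniform = max_gap(lambda x: gpd_cdf(x, 0.0, sigma, -1.0),
                      lambda x: uniform_cdf(x, 0.0, sigma), points)
gap_pareto = max_gap(lambda x: gpd_cdf(x, sigma / xi, sigma, xi),
                     lambda x: pareto_cdf(x, sigma / xi, 1.0 / xi), points)
```

All three gaps are zero up to floating-point rounding.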
 
==References==
{{Reflist}}
==Wikipedia References==
*{{cite web |url = https://en.wikipedia.org/w/index.php?title=Extreme_value_theory&oldid=1097515920 | title= Extreme value theory | author = Wikipedia contributors |website= Wikipedia |publisher= Wikipedia |access-date = 22 August 2022 }}
*{{cite web |url = https://en.wikipedia.org/w/index.php?title=Generalized_extreme_value_distribution&oldid=1104881873| title= Generalized extreme value distribution | author = Wikipedia contributors |website= Wikipedia |publisher= Wikipedia |access-date = 22 August 2022 }}
*{{cite web |url =  https://en.wikipedia.org/w/index.php?title=Generalized_Pareto_distribution&oldid=1101835468 | title= Generalized Pareto distribution | author = Wikipedia contributors |website= Wikipedia |publisher= Wikipedia |access-date = 22 August 2022 }}

Revision as of 22:36, 22 August 2022
