Parameters: location μ ∈ (−∞, ∞) (real); scale σ ∈ (0, ∞) (real). Support: x ≥ μ (for ξ ≥ 0). Density: (1/σ)(1 + ξz)^(−(1/ξ + 1)).
In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location μ, scale σ, and shape ξ.[2][3] Sometimes it is specified by only scale and shape[4] and sometimes only by its shape parameter. Some references give the shape parameter as κ = −ξ.[5]
The standard cumulative distribution function (cdf) of the GPD is defined by[6]

F_ξ(z) = 1 − (1 + ξz)^(−1/ξ) for ξ ≠ 0, and F_ξ(z) = 1 − e^(−z) for ξ = 0,
where the support is z ≥ 0 for ξ ≥ 0 and 0 ≤ z ≤ −1/ξ for ξ < 0. The corresponding probability density function (pdf) is

f_ξ(z) = (1 + ξz)^(−(1/ξ + 1)) for ξ ≠ 0, and f_ξ(z) = e^(−z) for ξ = 0.
The related location-scale family of distributions is obtained by replacing the argument z by (x − μ)/σ and adjusting the support accordingly.
The cumulative distribution function of X ∼ GPD(μ, σ, ξ) (with μ ∈ ℝ, σ > 0, and ξ ∈ ℝ) is

F_(μ,σ,ξ)(x) = 1 − (1 + ξ(x − μ)/σ)^(−1/ξ) for ξ ≠ 0, and F_(μ,σ,ξ)(x) = 1 − exp(−(x − μ)/σ) for ξ = 0,
where the support of X is x ≥ μ when ξ ≥ 0, and μ ≤ x ≤ μ − σ/ξ when ξ < 0.
The probability density function (pdf) of X ∼ GPD(μ, σ, ξ) is

f_(μ,σ,ξ)(x) = (1/σ)(1 + ξ(x − μ)/σ)^(−(1/ξ + 1)) for ξ ≠ 0, and f_(μ,σ,ξ)(x) = (1/σ) exp(−(x − μ)/σ) for ξ = 0;
again, for x ≥ μ when ξ ≥ 0, and μ ≤ x ≤ μ − σ/ξ when ξ < 0.
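As a concrete check of these formulas, here is a minimal pure-Python sketch of the cdf and pdf, handling the ξ = 0 limit and the bounded support for ξ < 0 (the function names `gpd_cdf` and `gpd_pdf` are ours, not from any particular library):

```python
import math

def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """cdf of GPD(mu, sigma, xi); handles the xi = 0 (exponential) limit."""
    z = (x - mu) / sigma
    if z < 0:
        return 0.0
    if xi < 0 and z >= -1.0 / xi:   # above the upper endpoint mu - sigma/xi
        return 1.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)

def gpd_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """pdf of GPD(mu, sigma, xi); zero outside the support."""
    z = (x - mu) / sigma
    if z < 0 or (xi < 0 and z > -1.0 / xi):
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    return (1.0 + xi * z) ** (-(1.0 / xi + 1.0)) / sigma
```

For example, with μ = 0, σ = 1, ξ = 1 the cdf reduces to x/(1 + x), so `gpd_cdf(1.0, 0.0, 1.0, 1.0)` gives 0.5.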
The pdf is a solution of the following differential equation:[citation needed]

f′(x)(σ + ξ(x − μ)) + (ξ + 1) f(x) = 0, with f(μ) = 1/σ.
If U is uniformly distributed on (0, 1], then

X = μ + σ(U^(−ξ) − 1)/ξ ∼ GPD(μ, σ, ξ ≠ 0)

and

X = μ − σ log(U) ∼ GPD(μ, σ, ξ = 0).
Both formulas are obtained by inversion of the cdf.
In the MATLAB Statistics Toolbox, the gprnd command can be used to generate generalized Pareto random numbers.
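The inversion of the cdf translates directly into code. A minimal pure-Python sketch (the function name `gpd_rvs` is ours), together with a quick empirical check against the theoretical cdf value F(1) = 1 − (1 + 0.5)^(−2) ≈ 0.556 for ξ = 0.5:

```python
import math
import random

def gpd_rvs(mu=0.0, sigma=1.0, xi=0.0, rng=random):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf."""
    u = 1.0 - rng.random()      # uniform on (0, 1], so logs/powers stay finite
    if xi == 0:
        return mu - sigma * math.log(u)          # exponential case
    return mu + sigma * (u ** (-xi) - 1.0) / xi  # general case

random.seed(0)
sample = [gpd_rvs(0.0, 1.0, 0.5) for _ in range(10000)]
frac_below_1 = sum(x <= 1.0 for x in sample) / len(sample)
```

With the seed fixed, `frac_below_1` lands close to the theoretical 1 − 1.5^(−2) ≈ 0.556.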
A GPD random variable can also be expressed as an exponential random variable with a Gamma-distributed rate parameter. If

X | Λ ∼ Exponential(Λ) and Λ ∼ Gamma(α, β) (shape–rate parametrization),

then

X ∼ GPD(μ = 0, σ = β/α, ξ = 1/α).

Notice, however, that since the parameters of the Gamma distribution must be greater than zero, we obtain the additional restriction that ξ must be positive.
In addition to this mixture (or compound) expression, the generalized Pareto distribution can also be expressed as a simple ratio. Concretely, for Y ∼ Exponential(1) and Z ∼ Gamma(1/ξ, 1), we have

μ + σY/(ξZ) ∼ GPD(μ, σ, ξ).

This is a consequence of the mixture after setting β = α and taking into account that the rate parameters of the exponential and gamma distributions are simply inverse multiplicative constants.
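A quick Monte Carlo sanity check of this ratio representation, in pure Python (note that `random.gammavariate` takes shape and scale arguments; the comparison value is the GPD median obtained by inverting the cdf, an assumption-free known quantity):

```python
import random
import statistics

random.seed(1)
mu, sigma, xi = 0.0, 1.0, 0.5

# Y ~ Exponential(1), Z ~ Gamma(shape = 1/xi, scale = 1):
# mu + sigma*Y/(xi*Z) should then follow GPD(mu, sigma, xi).
sample = [
    mu + sigma * random.expovariate(1.0) / (xi * random.gammavariate(1.0 / xi, 1.0))
    for _ in range(20000)
]

median_theory = mu + sigma * (2.0 ** xi - 1.0) / xi   # F^{-1}(1/2) for xi != 0
median_mc = statistics.median(sample)
```

With 20,000 draws the sample median agrees with the theoretical median (≈ 0.828 here) to within a few hundredths.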
If X ∼ GPD(μ = 0, σ, ξ), then Y = log(X) is distributed according to the exponentiated generalized Pareto distribution, denoted by Y ∼ exGPD(σ, ξ).
The probability density function (pdf) of Y ∼ exGPD(σ, ξ) (σ > 0) is

g_(σ,ξ)(y) = (e^y/σ)(1 + ξe^y/σ)^(−(1/ξ + 1)) for ξ ≠ 0, and g_(σ,ξ)(y) = (e^y/σ) e^(−e^y/σ) for ξ = 0,
where the support is −∞ < y < ∞ for ξ ≥ 0, and −∞ < y ≤ log(−σ/ξ) for ξ < 0.
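The bounded support for ξ < 0 can be illustrated by simulation: the log of a GPD(0, σ, ξ) draw never exceeds log(−σ/ξ). A small sketch (the inversion sampler `gpd_rvs` is our own helper, not part of the exGPD definition):

```python
import math
import random

random.seed(2)
sigma, xi = 1.0, -0.5          # negative shape: X is bounded above by -sigma/xi = 2

def gpd_rvs(sigma, xi):
    """GPD(0, sigma, xi) variate via cdf inversion."""
    u = 1.0 - random.random()   # uniform on (0, 1]
    return sigma * (u ** (-xi) - 1.0) / xi

# Y = log(X) follows exGPD(sigma, xi); for xi < 0 its support is
# bounded above by log(-sigma/xi) = log(2) here.
ys = [math.log(gpd_rvs(sigma, xi)) for _ in range(10000)]
upper = math.log(-sigma / xi)
```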
For all ξ, log σ acts as the location parameter. See the right panel for the pdf when the shape ξ is positive.
The exGPD has finite moments of all orders for all σ > 0 {\displaystyle \sigma >0} and − ∞ < ξ < ∞ {\displaystyle -\infty <\xi <\infty } .
The moment-generating function of Y ∼ exGPD(σ, ξ) is

M_Y(s) = E[e^(sY)] = (σ/ξ)^s (1/ξ) B(1/ξ − s, s + 1) for ξ > 0 (and s < 1/ξ), (−σ/ξ)^s (−1/ξ) B(−1/ξ, s + 1) for ξ < 0, and σ^s Γ(s + 1) for ξ = 0,
where B ( a , b ) {\displaystyle B(a,b)} and Γ ( a ) {\displaystyle \Gamma (a)} denote the beta function and gamma function, respectively.
The expected value of Y ∼ exGPD(σ, ξ) depends on both the scale σ and shape ξ parameters, with ξ entering through the digamma function ψ:

E[Y] = log(σ/ξ) + ψ(1) − ψ(1/ξ) for ξ > 0, E[Y] = log σ + ψ(1) for ξ = 0, and E[Y] = log(−σ/ξ) + ψ(1) − ψ(1 − 1/ξ) for ξ < 0.
Note that for a fixed value of ξ ∈ (−∞, ∞), log σ plays the role of a location parameter under the exponentiated generalized Pareto distribution.
The variance of Y ∼ exGPD(σ, ξ) depends on the shape parameter ξ only, through the polygamma function of order 1 (also called the trigamma function) ψ′:

Var(Y) = ψ′(1) + ψ′(1/ξ) for ξ > 0, Var(Y) = ψ′(1) for ξ = 0, and Var(Y) = ψ′(1) − ψ′(1 − 1/ξ) for ξ < 0.
See the right panel for the variance as a function of ξ. Note that ψ′(1) = π²/6 ≈ 1.644934.
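The quoted value ψ′(1) = π²/6 can be checked numerically from the series ψ′(x) = Σ_(k≥0) 1/(x + k)². A small sketch (the integral-based tail correction is our own trick to speed convergence):

```python
import math

def trigamma(x, n=10000):
    """psi'(x) = sum_{k>=0} 1/(x+k)^2, truncated after n terms, with an
    integral-based correction 1/(x+n) for the discarded tail."""
    partial = sum(1.0 / (x + k) ** 2 for k in range(n))
    return partial + 1.0 / (x + n)

value = trigamma(1.0)   # close to pi**2 / 6 ≈ 1.644934
```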
Note that the roles of the scale parameter σ and the shape parameter ξ under Y ∼ exGPD(σ, ξ) are separately interpretable, which may lead to more robust and efficient estimation of ξ than working with X ∼ GPD(σ, ξ) directly [2]. Under X ∼ GPD(μ = 0, σ, ξ), by contrast, the roles of the two parameters are associated with each other (at least up to the second central moment); see the formula for the variance Var(X), in which both parameters participate.
Assume that X_(1:n) = (X_1, …, X_n) are n observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution F whose tail distribution is regularly varying with tail-index 1/ξ (hence the corresponding shape parameter is ξ). To be specific, the tail distribution is described as

F̄(x) = 1 − F(x) = L(x) · x^(−1/ξ), for some slowly varying function L.
It is of particular interest in extreme value theory to estimate the shape parameter ξ, especially when ξ is positive (the so-called heavy-tailed case).
Let F_u be their conditional excess distribution function. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions F, and large u, F_u is well approximated by the generalized Pareto distribution, which motivated peaks-over-threshold (POT) methods for estimating ξ: the GPD plays the key role in the POT approach.
A renowned estimator using the POT methodology is the Hill estimator. Its technical formulation is as follows. For 1 ≤ i ≤ n, write X_(i) for the i-th largest value of X_1, …, X_n. Then, with this notation, the Hill estimator (see page 190 of Reference 5 by Embrechts et al. [3]) based on the k upper order statistics is defined as

ξ̂_k^Hill = (1/(k − 1)) Σ_(j=1)^(k−1) log(X_(j)/X_(k)), for 2 ≤ k ≤ n.
In practice, the Hill estimator is used as follows. First, calculate the estimator ξ̂_k^Hill at each integer k ∈ {2, …, n}, and plot the ordered pairs {(k, ξ̂_k^Hill)}_(k=2)^n. Then, from the set of Hill estimators {ξ̂_k^Hill}_(k=2)^n, select those that are roughly constant with respect to k: these stable values are regarded as reasonable estimates for the shape parameter ξ. If X_1, …, X_n are i.i.d., then the Hill estimator is a consistent estimator for the shape parameter ξ [4].
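The procedure above can be sketched in pure Python. Conventions for the Hill estimator differ slightly across references; here we average the k − 1 log-excesses over the k-th largest observation, and the test data are exact Pareto draws, for which the true shape is ξ = 1/α:

```python
import math
import random

def hill_estimator(data, k):
    """Hill estimate of the shape xi from the k upper order statistics:
    (1/(k-1)) * sum_{j=1}^{k-1} [log X_(j) - log X_(k)]."""
    xs = sorted(data, reverse=True)        # xs[0] = X_(1), the largest value
    return sum(math.log(xs[j] / xs[k - 1]) for j in range(k - 1)) / (k - 1)

random.seed(3)
alpha = 2.0                                # Pareto tail index, so xi = 1/alpha = 0.5
data = [random.paretovariate(alpha) for _ in range(20000)]

# Hill plot values: roughly constant near xi = 0.5 for exact Pareto data.
estimates = {k: hill_estimator(data, k) for k in (100, 500, 1000)}
```

Plotting `estimates` over a fine grid of k gives the Hill plot described above; for genuinely Pareto data the plot is flat near ξ = 0.5.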
Note that the Hill estimator ξ̂_k^Hill makes use of the log-transformation of the observations X_(1:n) = (X_1, …, X_n). (The Pickands estimator ξ̂_k^Pickands also employs the log-transformation, but in a slightly different way [5].)