
Hydrological Sciences Journal - Journal des Sciences Hydrologiques, 40(2), April 1995, p. 165

Parameter estimation for 3-parameter generalized Pareto distribution by the principle of maximum entropy (POME)

V. P. SINGH & H. GUO Department of Civil Engineering, Louisiana State University, Baton Rouge, Louisiana 70803-6405, USA

Abstract The principle of maximum entropy (POME) is employed to derive a new method of parameter estimation for the 3-parameter generalized Pareto (GP) distribution. Monte Carlo simulated data are used to evaluate this method and compare it with the methods of moments (MOM), probability weighted moments (PWM), and maximum likelihood estimation (MLE). The parameter estimates yielded by the POME are either superior or comparable for high skewness.

Estimation des paramètres d'une loi de Pareto généralisée à trois paramètres par la méthode du maximum d'entropie Résumé Nous avons utilisé le principe du maximum d'entropie en vue d'établir une nouvelle méthode d'estimation des paramètres de la distribution de Pareto généralisée à trois paramètres. Des données synthétiques générées selon une procédure de Monte Carlo ont été utilisées pour évaluer cette méthode et pour la comparer aux méthodes des moments, des moments pondérés et du maximum de vraisemblance. L'estimation des paramètres s'appuyant sur le principe du maximum d'entropie est préférable ou comparable à celle des autres méthodes en particulier lorsque l'asymétrie est forte.

GENERALIZED PARETO DISTRIBUTION

Consider a random variable Y with the standard exponential distribution. Let a random variable X be defined as X = b[1 - exp(-aY)]/a, where a and b are parameters. Then the distribution of X is the 2-parameter generalized Pareto distribution. If c is the threshold or lower bound of X, then the distribution of X is the 3-parameter generalized Pareto (GP) distribution, which can be expressed as:

$$F(x) = 1 - \left[1 - \frac{a(x - c)}{b}\right]^{1/a}, \quad a \neq 0 \tag{1a}$$

$$F(x) = 1 - \exp\left[-\frac{x - c}{b}\right], \quad a = 0 \tag{1b}$$

Open for discussion until 1 October 1995

where c is a location parameter, b is a scale parameter, a is a shape parameter, and F(x) is the distribution function. The probability density function (PDF) of the GP distribution is given by:

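$$f(x) = \frac{1}{b}\left[1 - \frac{a(x - c)}{b}\right]^{1/a - 1}, \quad a \neq 0 \tag{2a}$$

$$f(x) = \frac{1}{b}\exp\left[-\frac{x - c}{b}\right], \quad a = 0 \tag{2b}$$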

The Pareto distributions are obtained for a < 0. Figure 1 shows the PDF for c = 0, b = 1.0, and various values of a. Pickands (1975) has shown that the GP distribution given by equation (1) occurs as a limiting distribution for excesses over thresholds if and only if the parent distribution is in the domain of attraction of one of the extreme value distributions. The GP distribution reduces to the 2-parameter GP distribution for c = 0, the exponential distribution for a = 0 and c = 0, and the uniform distribution on [0, b] for c = 0 and a = 1.
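Equations (1) and (2) translate directly into code. The following is a minimal sketch, not taken from the paper; the function names `gp_cdf` and `gp_pdf` and the handling of points outside the support are assumptions made here for illustration.

```python
import math

def gp_cdf(x, a, b, c=0.0):
    """CDF of the 3-parameter GP distribution, equations (1a) and (1b)."""
    z = (x - c) / b
    if z <= 0.0:
        return 0.0                      # below the lower bound c
    if a == 0.0:
        return 1.0 - math.exp(-z)       # exponential case, equation (1b)
    u = 1.0 - a * z
    if u <= 0.0:
        return 1.0                      # above the upper bound (only when a > 0)
    return 1.0 - u ** (1.0 / a)

def gp_pdf(x, a, b, c=0.0):
    """PDF of the 3-parameter GP distribution, equations (2a) and (2b)."""
    z = (x - c) / b
    if z < 0.0:
        return 0.0
    if a == 0.0:
        return math.exp(-z) / b
    u = 1.0 - a * z
    if u <= 0.0:
        return 0.0                      # outside the support
    return u ** (1.0 / a - 1.0) / b
```

With a = 1 and c = 0 these reduce to the uniform distribution on [0, b], and with a = 0 to the exponential distribution, matching the special cases listed above.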

Fig. 1 Probability density function of generalized Pareto distribution with (a) c = 0, b = 1.0, a = 0.5, 0.75, 1.0 and 1.25 (line: a = 0.5; plus: a = 0.75; star: a = 1.0; dash: a = 1.25); and (b) with c = 0, b = 1.0, a = -0.1, -0.5 and -1.0 (line: a = -0.1; dash: a = -0.5; plus: a = -1.0).

Some important properties of the GP distribution are worth mentioning:

(1) By comparison with the exponential distribution, the GP distribution has a heavier tail for a < 0 (long-tailed distribution) and a lighter tail for a > 0 (short-tailed distribution). When a ≤ 0, X has no upper limit, i.e. c < x < ∞; for a > 0 there is an upper bound, i.e. c < x < c + b/a. This property makes the GP distribution suitable for the analysis of independent cluster peaks.

(2) In the context of the partial duration series, a truncated GP distribution remains a GP distribution with the original shape parameter a remaining unchanged. This property is popularly referred to as the "threshold stability" property. Consequently, if X has a GP distribution for a fixed threshold level Q0, then the conditional distribution of X - c, given X > c, corresponding to a higher threshold Q0 + c also has a GP distribution. This is one of the properties that justifies the use of the GP distribution to model excesses.

(3) Let Z = max(0, X_1, X_2, ..., X_N), where N ≥ 0 is a number. If X_i, i = 1, 2, ..., N, are independent and identically distributed as a GP distribution, and N has a Poisson distribution, then Z has a generalized extreme value (GEV) distribution (Smith, 1984; Jin & Stedinger, 1989; Wang, 1990), as defined by Jenkinson (1955). Thus, a Poisson process of exceedance times with generalized Pareto excesses implies the classical extreme value distributions. As a special case, the maximum of a Poisson number of exponential variates has a Gumbel distribution. So exponential peaks lead to Gumbel maxima, and GP distribution peaks lead to GEV maxima. The GEV can be expressed as:

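$$F(z) = \exp\left\{-\left[1 - \frac{\delta(z - \gamma)}{\beta}\right]^{1/\delta}\right\}, \quad \delta \neq 0,\ z > 0 \tag{3a}$$

$$F(z) = \exp\left\{-\exp\left[-\frac{z - \gamma}{\beta}\right]\right\}, \quad \delta = 0 \tag{3b}$$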

where the parameters δ, β and γ are independent of z. Furthermore, δ = a; that is, the shape parameters of the GEV and GP distributions are the same. Note that Z is not allowed to take on negative values, so that P(Z < 0) = 0 and P(Z = 0) = exp(-λ), where λ is the mean of the Poisson distribution of N, and only for z > 0 is the CDF modelled by the GEV distribution. This property makes the GP distribution suitable for modelling flood magnitudes exceeding a fixed threshold.

(4) The properties given in (2) and (3) characterize the GP distribution such that no other family has either property, making it a practical family for statistical estimation, provided that the threshold is assumed sufficiently high.

(5) The failure rate r(x) = f(x)/[1 - F(x)] is expressed as:

$$r(x) = \frac{1}{b - a(x - c)}$$

and is monotonic in x, decreasing if a < 0, constant if a = 0 and increasing if a > 0.
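The threshold stability property in (2) can be checked numerically. The sketch below is an illustration added here, not part of the paper: it takes c = 0 and a ≠ 0, and compares the conditional survival probability of the excess over a higher threshold t with the survival function of a GP distribution that has the same shape a and scale b - at, which is what equation (1a) implies. The helper names and the numerical values are hypothetical.

```python
def gp_survival(x, a, b, c=0.0):
    """1 - F(x) for the GP distribution with a != 0, from equation (1a)."""
    u = 1.0 - a * (x - c) / b
    return max(u, 0.0) ** (1.0 / a)

def conditional_excess_survival(y, t, a, b):
    """P(X - t > y | X > t) for a GP variable X with lower bound c = 0."""
    return gp_survival(t + y, a, b) / gp_survival(t, a, b)

a, b, t = -0.2, 1.0, 0.7        # illustrative shape, scale and higher threshold
b_t = b - a * t                 # scale of the excess distribution over t
for y in (0.1, 0.5, 2.0):
    lhs = conditional_excess_survival(y, t, a, b)
    rhs = gp_survival(y, a, b_t)
    print(f"y={y}: conditional={lhs:.6f}  GP(a, b - a*t)={rhs:.6f}")
```

The two printed columns should agree to rounding error, with the shape parameter unchanged and only the scale adjusted.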

LITERATURE REVIEW

The generalized Pareto (GP) distribution was introduced by Pickands (1975) and has since been applied to a number of areas including socio-economic phenomena, physical and biological processes (Saksena & Johnson, 1984), reliability studies and the analysis of environmental extremes. Davison & Smith (1990) pointed out that the GP distribution might form the basis of a broad modelling approach to high-level exceedances. DuMouchel (1983) applied it to estimate the stable index α to measure tail thickness, whereas Davison (1984a, 1984b) modelled contamination due to long-range atmospheric transport of radionuclides. Van Montfort & Witter (1985, 1986) and van Montfort & Otten (1991) applied the GP distribution to model peaks over a threshold (POT) streamflow and rainfall series, and Smith (1984, 1987, 1991) applied it to analyse flood frequencies and wave heights. Similarly, Joe (1987) employed it to estimate quantiles of the maximum of N observations. Wang (1991) applied it to develop a POT model for flood peaks with Poisson arrival time, whereas Rosbjerg et al. (1992) compared the use of the 2-parameter GP and exponential distributions as distribution models for exceedances with the parent distribution being a generalized Pareto distribution. In an extreme value analysis of the flow of Burbage Brook, Barrett (1992) used the GP distribution to model the POT flood series with Poisson inter-arrival times. Davison & Smith (1990) presented a comprehensive analysis of the extremes of data by use of the GP distribution for modelling the sizes and occurrences of exceedances over high thresholds.

Methods for estimating the parameters of the 2-parameter GP distribution were reviewed by Hosking & Wallis (1987). Quandt (1966) used the method of moments (MOM), while Baxter (1980) and Cook & Mumme (1981) used the method of maximum likelihood estimation (MLE) for the Pareto distribution. The MOM, MLE and probability weighted moments (PWM) were included in the review. Van Montfort & Witter (1986) used the MLE to fit the GP distribution to the Dutch POT rainfall series and used an empirical correction formula to reduce the bias of the scale and shape parameter estimates. Davison & Smith (1990) used the MLE, PWM, a graphical method and least squares to estimate the GP distribution parameters. Wang (1991) derived the PWM for both known and unknown thresholds.

OBJECTIVE OF STUDY

The objective of this paper is to develop a new competitive method of parameter estimation based on the principle of maximum entropy (POME), and to compare it with the MOM, MLE and PWM using Monte Carlo simulated data. The review of the literature shows that the POME does not appear to have been employed for estimating parameters of the GP distribution.

DERIVATION OF PARAMETER ESTIMATION METHOD BY POME

Shannon (1948) defined entropy as a numerical measure of uncertainty, or conversely the information content, associated with a probability distribution f(x; θ), with a parameter vector θ, used to describe a random variable X. The Shannon entropy function H(f) for continuous X can be expressed as:

$$H(f) = -\int f(x;\theta)\,\ln f(x;\theta)\,dx \quad \text{with} \quad \int f(x;\theta)\,dx = 1 \tag{4}$$

where H(f) is the entropy of f(x; θ), and can be thought of as the mean value of -ln f(x; θ).

According to Jaynes (1961), the minimally biased distribution of X is the one which maximizes entropy subject to given information, or which satisfies the principle of maximum entropy (POME). Therefore, the parameters of the distribution can be obtained by achieving the maximum of H(f). The use of this principle for generating the least-biased probability distributions on the basis of limited and incomplete data has been discussed by several authors and has been applied to many diverse problems (e.g. a recent review by Singh & Fiorentino (1992)). Jaynes (1968) has reasoned that the POME is the logical and rational criterion for choosing some specific f(x; θ) that maximizes H and satisfies the given information expressed as constraints. In other words, for given information (e.g. mean, variance, skewness, lower limit, upper limit, etc.), the distribution derived by the POME would best represent X; implicitly, this distribution would best represent the sample from which the information was derived. Inversely, if it is desired to fit a particular probability distribution to a sample of data, then the POME can uniquely specify the constraints (or the information) needed to derive that distribution. The distribution parameters are then related to these constraints. An excellent discussion of the underlying mathematical rationale is given in Levine & Tribus (1979).

Given m linearly independent constraints $C_i$, i = 1, 2, ..., m, in the form:

$$C_i = \int w_i(x)\, f(x;\theta)\,dx, \quad i = 1, 2, \ldots, m \tag{5}$$

where $w_i(x)$ are some functions whose averages over f(x; θ) are specified, then the maximum of H subject to equation (5) is given by the distribution:

$$f(x;\theta) = \exp\left[-a_0 - \sum_{i=1}^{m} a_i w_i(x)\right] \tag{6a}$$

where $a_i$, i = 0, 1, 2, ..., m, are the Lagrange multipliers, and can be determined from equations (5) and (6a). Inserting equation (6a) in equation (4) yields the entropy of f(x; θ) in terms of the constraints and Lagrange multipliers:

$$H(f) = a_0 + \sum_{i=1}^{m} a_i C_i \tag{6b}$$

Maximization of H then establishes the relationships between constraints and Lagrange multipliers. Thus, to derive a method using the POME for the estimation of the parameters a, b and c of equation (2), three steps are involved: (i) specification of the appropriate constraints; (ii) derivation of the entropy of the distribution; and (iii) derivation of the relationships between the Lagrange multipliers and constraints. A complete mathematical discussion of this method can be found in Tribus (1969), Jaynes (1968), Levine & Tribus (1979) and Singh & Rajagopal (1986).

Specification of constraints

The entropy of the GP distribution can be derived by inserting the PDF of equation (2) in equation (4):

$$H(f) = \ln b \int f(x;\theta)\,dx - \left(\frac{1}{a} - 1\right)\int \ln\left[1 - \frac{a(x - c)}{b}\right] f(x;\theta)\,dx \tag{6c}$$

Comparing equation (6c) with equation (6b), the constraints appropriate for equation (2) can be written (Singh & Rajagopal, 1986) as:

$$\int f(x;\theta)\,dx = 1 \tag{7}$$

$$\int \ln\left[1 - \frac{a(x - c)}{b}\right] f(x;\theta)\,dx = E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \tag{8}$$

in which E[·] denotes expectation of the bracketed quantity. These constraints are unique and specify the information that is sufficient for the GP distribution. The first constraint specifies the total probability. The second constraint specifies the mean of the logarithm of the inverse ratio of the scale parameter to the failure rate. Conceptually, this defines the expected value of the negative logarithm of the scaled failure rate. The distribution parameters are related to these constraints.

Construction of the entropy function

The PDF of the GP distribution corresponding to the POME and consistent with equations (7) and (8) takes the form:

$$f(x;\theta) = \exp\left\{-a_0 - a_1 \ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \tag{9}$$

where $a_0$ and $a_1$ are Lagrange multipliers. The mathematical rationale for equation (9) has been presented by Tribus (1969).

By applying equation (9) to the total probability condition in equation (7), one obtains:

$$\exp(a_0) = \int \exp\left\{-a_1 \ln\left[1 - \frac{a(x - c)}{b}\right]\right\} dx \tag{10}$$

which yields the partition function:

$$\exp(a_0) = \frac{b}{a}\,\frac{1}{1 - a_1} \tag{11}$$

The zeroth Lagrange multiplier is given by:

$$a_0 = \ln\left[\frac{b}{a}\,\frac{1}{1 - a_1}\right] \tag{12}$$

Inserting equation (11) in equation (9) yields:

$$f(x;\theta) = \frac{a(1 - a_1)}{b}\left[1 - \frac{a(x - c)}{b}\right]^{-a_1} \tag{13}$$

A comparison of equation (13) with equation (2) yields:

$$1 - a_1 = \frac{1}{a} \tag{14}$$

Taking logarithms of equation (13) gives:

$$\ln f(x;\theta) = \ln a + \ln(1 - a_1) - \ln b - a_1 \ln\left[1 - \frac{a(x - c)}{b}\right] \tag{15}$$

Therefore, the entropy H(f) of the GP distribution follows:

$$H(f) = -\ln a - \ln(1 - a_1) + \ln b + a_1 E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \tag{16}$$
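As a consistency check on equation (16) (a worked step added here, not part of the original derivation), recall that X = c + b[1 - exp(-aY)]/a with Y standard exponential, so that ln[1 - a(x - c)/b] = -aY and its expectation is -a. Substituting this, together with 1 - a_1 = 1/a from equation (14), gives a closed form for the entropy:

```latex
\begin{aligned}
H(f) &= -\ln a - \ln(1 - a_1) + \ln b
        + a_1\,E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \\
     &= -\ln a + \ln a + \ln b + \left(1 - \frac{1}{a}\right)(-a) \\
     &= \ln b + 1 - a .
\end{aligned}
```

For a tending to 0 this reduces to 1 + ln b, the entropy of the exponential distribution with scale b, and for a = 1 to ln b, the entropy of the uniform distribution on an interval of length b.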

Relationships between distribution parameters and constraints

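According to Singh & Rajagopal (1986), the relationships between the distribution parameters and constraints are obtained by taking partial derivatives of the entropy H(f) with respect to the Lagrange multipliers as well as the distribution parameters, and then equating these derivatives to zero, and making use of the constraints. To that end, taking partial derivatives of equation (16) with respect to $a_1$, a, b and c separately and equating each derivative to zero yields:

$$\frac{\partial H}{\partial a_1} = \frac{1}{1 - a_1} + E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = 0 \tag{17}$$

$$\frac{\partial H}{\partial a} = -\frac{1}{a} - a_1 E\left\{\frac{(x - c)/b}{1 - a(x - c)/b}\right\} = 0 \tag{18}$$

$$\frac{\partial H}{\partial b} = \frac{1}{b} + \frac{a\,a_1}{b}\, E\left\{\frac{(x - c)/b}{1 - a(x - c)/b}\right\} = 0 \tag{19}$$

$$\frac{\partial H}{\partial c} = \frac{a\,a_1}{b}\, E\left\{\frac{1}{1 - a(x - c)/b}\right\} = 0 \tag{20}$$

Simplification of equations (17) to (20) yields, respectively:

$$E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = -\frac{1}{1 - a_1} \tag{21}$$

$$E\left\{\frac{(x - c)/b}{1 - a(x - c)/b}\right\} = -\frac{1}{a\,a_1} \tag{22}$$

$$E\left\{\frac{(x - c)/b}{1 - a(x - c)/b}\right\} = -\frac{1}{a\,a_1} \tag{23}$$

$$E\left\{\frac{1}{1 - a(x - c)/b}\right\} = 0 \tag{24}$$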

Clearly, equation (24) does not hold. Equation (22) is the same as equation (23). In order to get a unique solution, additional equations are needed which can be obtained by differentiating the zeroth Lagrange multiplier with respect to the Lagrange multipliers and equating the derivatives to zero. To that end, equation (10) is written as:

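$$a_0 = \ln \int \exp\left\{-a_1 \ln\left[1 - \frac{a(x - c)}{b}\right]\right\} dx \tag{25}$$

Differentiating equation (25) with respect to $a_1$:

$$\frac{\partial a_0}{\partial a_1} = -\frac{\int \exp\{-a_1 \ln[1 - a(x - c)/b]\}\,\ln[1 - a(x - c)/b]\,dx}{\int \exp\{-a_1 \ln[1 - a(x - c)/b]\}\,dx} = -\int \exp\{-a_0 - a_1 \ln[1 - a(x - c)/b]\}\,\ln[1 - a(x - c)/b]\,dx = -E\{\ln[1 - a(x - c)/b]\} \tag{26}$$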

Following Tribus (1969):

$$\frac{\partial^2 a_0}{\partial a_1^2} = \operatorname{var}\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \tag{27}$$

where var[·] is the variance of the bracketed quantity. From equation (11):

$$a_0 = \ln(b/a) - \ln(1 - a_1) \tag{28}$$

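Differentiating equation (28) with respect to $a_1$:

$$\frac{\partial a_0}{\partial a_1} = \frac{1}{1 - a_1} \tag{29}$$

$$\frac{\partial^2 a_0}{\partial a_1^2} = \frac{1}{(1 - a_1)^2} \tag{30}$$

Equating equation (29) to equation (26) leads to:

$$E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = -\frac{1}{1 - a_1} \tag{31}$$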

which is the same as equation (21). When equation (30) is equated to equation (27), the following is obtained:

$$\operatorname{var}\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = \frac{1}{(1 - a_1)^2} \tag{32}$$

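Therefore, the parameter estimation equations for the POME consist of equations (21), (22) and (32). Inserting $a_1 = 1 - 1/a$ from equation (14) into these three equations, one gets:

$$E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = -a \tag{33}$$

$$E\left\{\frac{(x - c)/b}{1 - a(x - c)/b}\right\} = \frac{1}{1 - a} \tag{34}$$

$$\operatorname{var}\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = a^2 \tag{35}$$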
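In practice the expectations in equations (33) to (35) are replaced by sample averages, and the three equations are solved numerically for a, b and c. The sketch below does not attempt the solution step; it only evaluates the three residuals for a given parameter set, and checks on simulated data that they are close to zero at the true parameters. The function name `pome_residuals`, the parameter values and the seed are assumptions made here for illustration.

```python
import math
import random
import statistics

def pome_residuals(data, a, b, c):
    """Sample analogues of equations (33), (34) and (35), written as residuals."""
    u = [1.0 - a * (x - c) / b for x in data]
    log_u = [math.log(v) for v in u]
    ratio = [((x - c) / b) / v for x, v in zip(data, u)]
    r33 = statistics.fmean(log_u) + a                 # E{ln[.]} = -a
    r34 = statistics.fmean(ratio) - 1.0 / (1.0 - a)   # E{ratio} = 1/(1 - a)
    r35 = statistics.pvariance(log_u) - a * a         # var{ln[.]} = a^2
    return r33, r34, r35

# Simulate GP data through X = c + b[1 - exp(-aY)]/a with Y standard exponential.
rng = random.Random(1)                                # arbitrary seed
a, b, c = 0.3, 1.0, 0.5                               # illustrative values
sample = [c + b * (1.0 - math.exp(-a * rng.expovariate(1.0))) / a
          for _ in range(50000)]
print(pome_residuals(sample, a, b, c))
```

A root finder applied to these residuals would give the POME estimates; the choice of solver and starting values is left open here.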

THREE OTHER METHODS OF PARAMETER ESTIMATION

Three of the most popular methods of parameter estimation are the method of moments (MOM), the method of probability-weighted moments (PWM), and the method of maximum likelihood estimation (MLE). The POME does not appear to have been used for estimating parameters of the GP distribution. Therefore, virtually no literature exists on the comparison of parameter estimates by the POME with those by the MLE, PWM and MOM. For the sake of completeness, these methods are briefly summarized.

Method of moments (MOM)

Moment estimators of the GP distribution were derived by Hosking & Wallis (1987). Note that E{[1 - a(x - c)/b]^r} = 1/(1 + ar) if 1 + ar > 0. The rth moment of X exists if a > -1/r. Provided that they exist, the moment estimators are:

$$\bar{x} = c + \frac{b}{1 + a} \tag{36}$$

$$S^2 = \frac{b^2}{(1 + a)^2 (1 + 2a)} \tag{37}$$

$$G = \frac{2(1 - a)(1 + 2a)^{0.5}}{1 + 3a} \tag{38}$$

where $\bar{x}$, $S^2$ and G are the mean, variance and skewness, respectively. First, the moment estimate of a is obtained by solving equation (38). The relation between G and a is illustrated in Fig. 2. With a calculated, b and c follow from equations (36) and (37) as:

$$b = S(1 + a)(1 + 2a)^{0.5} \tag{39}$$

$$c = \bar{x} - \frac{b}{1 + a} \tag{40}$$
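The MOM procedure can be sketched in a few lines. Equation (38) has no closed-form inverse, so the sketch below solves it by bisection, which is an assumption made here (the paper does not state how the equation was solved); it relies on G decreasing monotonically with a for a > -1/3. The function names and bracket limits are hypothetical.

```python
import math

def gp_skewness(a):
    """Population skewness G(a) of equation (38); requires a > -1/3."""
    return 2.0 * (1.0 - a) * math.sqrt(1.0 + 2.0 * a) / (1.0 + 3.0 * a)

def mom_estimate(mean, std, skew, lo=-1.0 / 3.0 + 1e-9, hi=50.0, tol=1e-12):
    """Solve equation (38) for a by bisection, then apply equations (39) and (40)."""
    # G(a) decreases as a increases on this bracket, so bisection is sufficient.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gp_skewness(mid) > skew:
            lo = mid          # skewness still too large: a must increase
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    b = std * (1.0 + a) * math.sqrt(1.0 + 2.0 * a)    # equation (39)
    c = mean - b / (1.0 + a)                          # equation (40)
    return a, b, c

# Mean 1.0, standard deviation 0.5 and skewness 2.5 (cf. Case 2 in the sampling experiment).
print(mom_estimate(1.0, 0.5, 2.5))
```

In an application the three arguments would be the sample mean, standard deviation and skewness coefficient.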

Probability-weighted moments (PWM)

The PWM estimators for the GP distribution (Hosking & Wallis, 1987) are given as:

$$a = \frac{W_0 - 8W_1 + 9W_2}{-W_0 + 4W_1 - 3W_2} \tag{41}$$

$$b = -\frac{(W_0 - 2W_1)(W_0 - 3W_2)(-4W_1 + 6W_2)}{(-W_0 + 4W_1 - 3W_2)^2} \tag{42}$$

$$c = \frac{2W_0 W_1 - 6W_0 W_2 + 6W_1 W_2}{-W_0 + 4W_1 - 3W_2} \tag{43}$$

where the rth probability-weighted moment $W_r$ is:

$$W_r = E\{x(F)[1 - F(x)]^r\} = \int_0^1 \left\{c + \frac{b}{a}\left[1 - (1 - F)^a\right]\right\}(1 - F)^r\,dF = \frac{1}{r + 1}\left(c + \frac{b}{a}\right) - \frac{b}{a}\,\frac{1}{a + r + 1}, \quad r = 0, 1, 2, \ldots \tag{44}$$

Fig. 2 Parameter a vs skewness G for GPD3.
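The PWM estimators can be sketched as follows. The shape parameter is taken from equation (41); b and c are then written in the algebraically equivalent forms b = (W0 - 2W1)(a + 1)(a + 2) and c = W0 - b/(a + 1), which follow from equation (44). The sample estimator of W_r shown here is the usual unbiased order-statistics estimator and is an assumption, since the paper does not state which sample estimator was used; all function names are hypothetical.

```python
def gp_pwm(r, a, b, c):
    """Population PWM W_r = E[x (1 - F)^r] of the GP distribution (a != 0), equation (44)."""
    return (c + b / a) / (r + 1) - (b / a) / (a + r + 1)

def pwm_estimate(w0, w1, w2):
    """Recover (a, b, c) from the first three PWMs."""
    a = (w0 - 8.0 * w1 + 9.0 * w2) / (-w0 + 4.0 * w1 - 3.0 * w2)   # equation (41)
    b = (w0 - 2.0 * w1) * (a + 1.0) * (a + 2.0)
    c = w0 - b / (a + 1.0)
    return a, b, c

def sample_pwm(data, r):
    """Unbiased sample estimate of W_r from order statistics (assumed estimator)."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = 1.0
        for k in range(r):
            weight *= (n - i - k) / (n - 1 - k)
        total += weight * x
    return total / n

# Round trip on population PWMs for illustrative parameter values.
w = [gp_pwm(r, 0.5, 1.0, 0.3) for r in range(3)]
print(pwm_estimate(*w))
```

For data, `pwm_estimate(sample_pwm(x, 0), sample_pwm(x, 1), sample_pwm(x, 2))` would give the sample estimates.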

Method of maximum likelihood estimation

The MLE estimators can be expressed as:

$$\sum_{i=1}^{n} \frac{(x_i - c)/b}{1 - a(x_i - c)/b} = \frac{n}{1 - a} \tag{45}$$

$$\sum_{i=1}^{n} \ln\left[1 - \frac{a(x_i - c)}{b}\right] = -na \tag{46}$$

where n is the sample size. A maximum likelihood estimator cannot be obtained for c, because the likelihood function is unbounded with respect to c, as shown in Fig. 3. Since c is the lower bound of the random variable X, we may use the constraint c ≤ x_1, where x_1 is the lowest sample value. Clearly, the likelihood function is maximum with respect to c when c = x_1.
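The behaviour of the likelihood with respect to c can be illustrated with a short sketch. It evaluates the GP log-likelihood for fixed a and b (with a < 1) at increasing values of c; under those assumptions the log-likelihood keeps rising until c reaches the smallest observation and drops to minus infinity beyond it. The data values and parameter choices below are made up for illustration and are not from the paper.

```python
import math

def gp_loglik(data, a, b, c):
    """Log-likelihood of the GP distribution (a != 0); -inf outside the support."""
    total = -len(data) * math.log(b)
    for x in data:
        u = 1.0 - a * (x - c) / b
        if x < c or u <= 0.0:
            return -math.inf
        total += (1.0 / a - 1.0) * math.log(u)
    return total

data = [0.61, 0.64, 0.70, 0.83, 0.95, 1.10, 1.32, 1.58, 2.04, 2.90]  # made-up sample
a, b = -0.1, 0.4                                                     # held fixed
for c in (0.2, 0.4, 0.6, min(data)):
    print(f"c={c:.2f}  loglik={gp_loglik(data, a, b, c):.4f}")
```

This is why the constraint c ≤ x_1 is imposed and the estimate of c is taken at the lowest sample value.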

Fig. 3 Likelihood function of GPD3 vs parameter c for sample size 10. Line: a = -0.116, b = 0.387, c = 0.562; dash: a = 0.544, b = 1.116, c = 0.277.

APPLICATION TO MONTE CARLO-SIMULATED DATA

Monte Carlo samples

To assess the performance of the POME estimation method by comparison with the MOM, PWM and MLE, Monte Carlo sampling experiments were conducted. Two distribution population cases, listed in Table 1, were considered. For each population case, 1000 random samples of size 20, 50 and 100 were generated, and then parameters and quantiles were estimated.
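One way to generate such samples is by inverting the distribution function of equation (1a), x = c + b[1 - (1 - F)^a]/a with F uniform on [0, 1). The paper does not say which generator was used, so the sketch below is only one plausible setup; the function names and seeds are assumptions, and the parameter values are those listed for Case 2 in Table 1.

```python
import random

def gp_sample(n, a, b, c, rng):
    """Draw n GP variates (a != 0) by inverting equation (1a)."""
    return [c + b * (1.0 - (1.0 - rng.random()) ** a) / a for _ in range(n)]

def monte_carlo_samples(n_samples, size, a, b, c, seed=0):
    """Generate n_samples independent samples, each of the given size."""
    rng = random.Random(seed)
    return [gp_sample(size, a, b, c, rng) for _ in range(n_samples)]

# Case 2 of Table 1; the seeds are arbitrary choices made here for reproducibility.
a, b, c = -0.069, 0.433, 0.536
for size in (20, 50, 100):
    samples = monte_carlo_samples(1000, size, a, b, c, seed=size)
    grand_mean = sum(sum(s) for s in samples) / (1000 * size)
    print(f"size={size}: {len(samples)} samples, grand mean = {grand_mean:.3f}")
```

Each sample would then be passed to the four estimation methods, and the resulting estimates summarized over the 1000 replicates.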

Table 1 GP distribution population cases considered in the sampling experiment

GP distribution                      Parameters
population        C_v     G          a         b         c
Case 1            0.5     0.5        0.554     1.116     0.277
Case 2            0.5     2.5       -0.069     0.433     0.536

C_v = coefficient of variation.


Performance indices

The performance of the POME was evaluated using the following performance indices:

Standard bias:

$$\text{BIAS} = \frac{E(\hat{x}) - x}{x} \tag{47}$$

Root mean square error:

$$\text{RMSE} = \frac{\{E[(\hat{x} - x)^2]\}^{0.5}}{x} \tag{48}$$

where $\hat{x}$ is an estimate of x (parameter or quantile) and:

$$E(\hat{x}) = \frac{1}{N}\sum_{i=1}^{N} \hat{x}_i \tag{49}$$

where N is the number of Monte Carlo samples (N = 1000 in this study). 1000 may arguably not be a large enough number of samples to produce the true values of BIAS and RMSE, but will suffice to compare the performances of the estimation methods.
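The two indices are straightforward to compute from the N estimates produced by a Monte Carlo run. A minimal sketch follows, with function names chosen here for illustration:

```python
import math

def standardized_bias(estimates, true_value):
    """Equation (47): (mean of the estimates - true value) / true value."""
    mean_est = sum(estimates) / len(estimates)
    return (mean_est - true_value) / true_value

def standardized_rmse(estimates, true_value):
    """Equation (48): root mean square error divided by the true value."""
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse) / true_value

estimates = [1.0, 3.0]          # toy values, not results from the paper
print(standardized_bias(estimates, 2.0), standardized_rmse(estimates, 2.0))
```

Because both indices are divided by the true value, a negative true value (such as a shape parameter a < 0) gives a negative standardized RMSE under this definition.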

BIAS in parameter estimation

The bias of parameters estimated by the four methods is summarized in Table 2. For G = 0.5, in absolute terms the MOM produced the least bias of the four methods for all sample sizes. The MLE had the second least bias in the parameter estimates. With increasing sample size, there was significant reduction in bias for all four methods. The POME produced less bias than the PWM in estimates of b and c for all sample sizes, but that was not uniformly true in the case of the estimate of parameter a. When G = 2.5, these methods performed quite differently. For all sample sizes, the MLE and the POME were comparable, producing the least bias. For the a and c parameter estimates, the POME had the least bias, but the MLE had the least bias for the b parameter estimate. The PWM had the highest bias in all three parameter estimates for all sample sizes. Thus, if the value of G is high, the POME or MLE may be the preferred method. For lower values of G, the MOM or MLE may be preferable, especially when the sample size is small.

Table 2 BIAS of parameter estimates

Sample                 G = 0.5                           G = 2.5
size      Method       a         b         c             a         b         c
20        MOM          0.156     0.094    -0.053        -4.144     0.509    -0.143
          PWM          0.488     0.632    -0.948        -9.141     1.799    -0.584
          MLE          0.217     0.037     0.215         0.474    -0.077     0.034
          POME        -0.397    -0.122    -0.094         0.013     0.147    -0.094
50        MOM          0.063     0.042    -0.025        -1.981     0.260    -0.085
          PWM          0.230     0.258    -0.396        -3.821     0.626    -0.231
          MLE          0.132     0.060     0.067         0.244    -0.024     0.009
          POME        -0.407    -0.096    -0.156         0.009     0.115    -0.079
100       MOM          0.040     0.028    -0.019        -1.196     0.165    -0.057
          PWM          0.132     0.138    -0.208        -1.964     0.304    -0.116
          MLE          0.086     0.048     0.039         0.185    -0.017     0.008
          POME        -0.288    -0.060    -0.126         0.012     0.099    -0.068

RMSE in parameter estimation

The values of RMSE of parameters estimated by the four methods are given in Table 3. For G = 0.5, of the four methods the MOM produced the least RMSE in the a parameter estimate. However, as the sample size increased, the MOM, PWM and MLE became comparable. In the cases of the b and c parameter estimates, the MLE had the least RMSE, but all four methods were comparable. For G = 2.5, the comparative behaviour of the four methods was markedly different. In absolute terms, the MOM and the PWM produced the highest RMSE in parameter estimates for all sample sizes, with the POME having the least RMSE in the a parameter estimate but the MLE in the b and c parameter estimates. Thus, it may be concluded that for lower values of G, the MOM or PWM may be the preferred method, but for higher values of G, the MLE or POME is the preferred method.

Table 3 RMSE of parameter estimates

Sample                 G = 0.5                           G = 2.5
size      Method       a         b         c             a         b         c
20        MOM          0.448     0.310     0.336        -5.178     0.688     0.205
          PWM          0.780     0.820     0.984       -10.990     2.005     0.593
          MLE          0.502     0.284     0.357        -1.926     2.580     0.053
          POME         0.785     0.371     0.348        -0.067     0.394     0.182
50        MOM          0.301     0.213     0.201        -2.785     0.376     0.120
          PWM          0.419     0.365     0.427        -4.830     0.710     0.236
          MLE          0.329     0.234     0.146        -1.475     0.177     0.019
          POME         0.696     0.271     0.262        -0.061     0.250     0.125
100       MOM          0.203     0.144     0.139        -1.925     0.249     0.083
          PWM          0.268     0.211     0.237        -2.710     0.360     0.121
          MLE          0.224     0.176     0.056        -1.205     0.127     0.011
          POME         0.590     0.233     0.185        -0.061     0.181     0.097

BIAS in quantile estimation

The results of bias in quantile estimates by the GP distribution are summarized in Table 4. The performance of the four estimation methods varied with the value of G and the probability of non-exceedance P. For G = 0.5, all four methods had comparable bias for P < 0.9 for all sample sizes. When P > 0.99, the MOM and the PWM produced the smallest bias and the POME the highest, with the MLE in the intermediate range. However, for G = 2.5, the POME produced the least bias, especially when P was greater than 0.99. For all sample sizes, all four methods were somewhat comparable. In conclusion, for lower values of G, any one of the four methods may be used for P < 0.99, but the PWM, MOM or MLE may be preferable for P exceeding 0.99. For higher values of G, all four methods were comparable, but for P exceeding 0.99 the POME is the preferred method.

Table 4 BIAS and RMSE of quantile estimates

          Sample                 G = 0.5               G = 2.5
P         size      Method       BIAS      RMSE        BIAS      RMSE
0.8       20        MOM          0.000     0.112       0.058     0.172
                    PWM          0.091     0.152       0.169     0.224
                    MLE         -0.011     0.118      -0.018     0.134
                    POME        -0.030     0.128       0.046     0.176
          50        MOM          0.001     0.078       0.037     0.107
                    PWM          0.041     0.090       0.083     0.125
                    MLE          0.010     0.093      -0.004     0.090
                    POME         0.000     0.076       0.033     0.109
          100       MOM         -0.012     0.098       0.024     0.197
                    PWM          0.076     0.131       0.115     0.225
                    MLE         -0.037     0.153      -0.068     0.221
                    POME         0.031     0.151       0.068     0.221
0.9       20        MOM         -0.021     0.149       0.015     0.273
                    PWM          0.153     0.231       0.186     0.348
                    MLE         -0.082     0.157      -0.047     0.224
                    POME        -0.062     0.158       0.064     0.271
          50        MOM         -0.004     0.065       0.024     0.123
                    PWM          0.032     0.074       0.063     0.131
                    MLE         -0.005     0.072      -0.001     0.106
                    POME         0.066     0.126       0.051     0.137
          100       MOM          0.000     0.043       0.019     0.084
                    PWM          0.018     0.048       0.038     0.085
                    MLE          0.004     0.055       0.001     0.073
                    POME         0.048     0.092       0.044     0.098
0.99      20        MOM         -0.026     0.113      -0.131     0.309
                    PWM          0.036     0.153      -0.129     0.372
                    MLE         -0.067     0.128       0.031     0.286
                    POME         0.287     0.484       0.104     0.297
          50        MOM         -0.009     0.070      -0.059     0.205
                    PWM          0.007     0.087      -0.074     0.235
                    MLE         -0.029     0.063       0.031     0.203
                    POME         0.323     0.491       0.080     0.186
          100       MOM         -0.005     0.048      -0.031     0.154
                    PWM          0.002     0.060      -0.065     0.165
                    MLE          0.014     0.039       0.023     0.150
                    POME         0.230     0.399       0.070     0.135
0.999     20        MOM         -0.022     0.141      -0.266     0.427
                    PWM          0.028     0.192      -0.296     0.600
                    MLE         -0.063     0.174       0.152     0.572
                    POME         0.582     0.888       0.121     0.332
          50        MOM         -0.005     0.090      -0.141     0.310
                    PWM          0.000     0.113      -0.198     0.393
                    MLE         -0.034     0.079       0.100     0.406
                    POME         0.612     0.906       0.093     0.207
          100       MOM         -0.004     0.063      -0.083     0.252
                    PWM         -0.004     0.078      -0.120     0.289
                    MLE         -0.019     0.047       0.069     0.295
                    POME         0.439     0.474       0.081     0.151

RMSE in quantile estimation

The values of RMSE in quantile estimates for the four methods are given in Table 4. For G = 0.5 and P < 0.9, all four methods produced comparable values of RMSE for all sample sizes; for P > 0.99, the performance of the POME deteriorated. When G = 2.5, all methods produced comparable values of RMSE for all sample sizes for P < 0.9; for P > 0.99 the POME had the least RMSE. Thus, it is inferred that the MOM, PWM or MLE may be used for smaller values of G, but for higher values of G, the POME may be the preferred method.
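The quantile corresponding to a probability of non-exceedance P follows from inverting equation (1): x_P = c + b[1 - (1 - P)^a]/a for a ≠ 0, and x_P = c - b ln(1 - P) for a = 0. The sketch below is added for illustration; the function name is an assumption, and the parameter values are those listed for Case 2 in Table 1.

```python
import math

def gp_quantile(p, a, b, c):
    """Quantile x_P with F(x_P) = P, obtained by inverting equations (1a) and (1b)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if a == 0.0:
        return c - b * math.log(1.0 - p)
    return c + b * (1.0 - (1.0 - p) ** a) / a

a, b, c = -0.069, 0.433, 0.536          # Case 2 of Table 1
for p in (0.8, 0.9, 0.99, 0.999):
    print(f"P={p}: x_P = {gp_quantile(p, a, b, c):.3f}")
```

Quantile estimates for each method are obtained by evaluating the same expression at the estimated parameters, and BIAS and RMSE are then computed against the population quantile.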

CONCLUSIONS

The following conclusions can be drawn from this study: (1) the POME offers an alternative method for estimating the parameters of the 3-parameter generalized Pareto distribution; (2) when the skewness was high (G = 2.5), the POME yielded superior parameter estimates; (3) for low skewness (G = 0.5), the POME was better in parameter estimates than the MLE and PWM but worse than the MOM; however, for large sample size, its performance improved significantly; (4) the POME produced either better or comparable quantile estimates as compared with the MOM, MLE and PWM for high skewness (G = 2.5); (5) for low skewness (G = 0.5), the POME was comparable to the MOM, the MLE and the PWM for lower probabilities of non-exceedance, whereas for higher values the MOM or PWM was better than the POME.

REFERENCES

Barrett, J. H. (1992) An extreme value analysis of the flow of Burbage Brook. Stochastic Hydrol. Hydraul. 6, 151-165.

Baxter, M. A. (1980) Minimum variance unbiased estimation of the parameter of the Pareto distribution. Biometrika 27, 133-138.

Cook, W. L. & Mumme, D. C. (1981) Estimation of Pareto parameters by numerical methods. In: Statistical Distributions in Scientific Work, ed. C. Taillie et al. 5, 127-132.

Davison, A. C. (1984a) Modelling excesses over high thresholds, with an application. In: Statistical Extremes and Applications, ed. J. Tiago de Oliveira, 461-482. Reidel, Dordrecht, The Netherlands.

Davison, A. C. (1984b) A statistical model for contamination due to long-range atmospheric transport of radionuclides. PhD thesis, Department of Mathematics, Imperial College of Science and Technology, London, UK.

Davison, A. C. & Smith, R. L. (1990) Models for exceedances over high thresholds. J. Roy. Statist. Soc. B 52(3), 393-442.


DuMouchel, W. (1983) Estimating the stable index α in order to measure tail thickness. Ann. Statist. 11, 1019-1036.

Hosking, J. R. M. & Wallis, J. R. (1987) Parameter and quantile estimation for the generalized Pareto distribution. Technometrics 29(3), 339-349.

Jaynes, E. T. (1961) Probability Theory in Science and Engineering. McGraw-Hill, New York, USA.

Jaynes, E. T. (1968) Prior probabilities. IEEE Trans. Syst. Man. Cybern. 3(SSC-4), 227-241.

Jenkinson, A. F. (1955) The frequency distribution of the annual maximum (or minimum) of meteorological elements. Quart. J. Roy. Meteorol. Soc. 81, 158-171.

Jin, M. & Stedinger, J. R. (1989) Partial duration series analysis for a GEV annual flood distribution with systematic and historical flood information (unpublished paper). Department of Civil Engineering, Pennsylvania State University, State College, PA, USA.

Joe, H. (1987) Estimation of quantiles of the maximum of N observations. Biometrika 74, 347-354.

Levine, R. D. & Tribus, M. (1979) The Maximum Entropy Formalism. MIT Press, Cambridge, Massachusetts, USA.

Pickands, J. (1975) Statistical inference using extreme order statistics. Ann. Statist. 3, 119-131.

Quandt, R. E. (1966) Old and new methods of estimation of the Pareto distribution. Biometrika 10, 55-82.

Rosbjerg, D., Madsen, H. & Rasmussen, P. F. (1992) Prediction in partial duration series with generalized Pareto-distributed exceedances. Wat. Resour. Res. 28(11), 3001-3010.

Saksena, S. K. & Johnson, A. M. (1984) Best unbiased estimators for the parameters of a two-parameter Pareto distribution. Biometrika 31, 77-83.

Shannon, C. E. (1948) The mathematical theory of communication, I-IV. Bell System Tech. J. 27, 279-428, 612-656.

Singh, V. P. & Fiorentino, M. (1992) A historical perspective of entropy applications in water resources. In: Entropy and Energy Dissipation in Water Resources, ed. V. P. Singh & M. Fiorentino, 21-61. Kluwer, Dordrecht, The Netherlands.

Singh, V. P. & Rajagopal, A. K. (1986) A new method of parameter estimation for hydrologic frequency analysis. Hydrol. Sci. Technol. 2(3), 33-40.

Smith, J. A. (1991) Estimating the upper tail of flood frequency distributions. Wat. Resour. Res. 23(8), 1657-1666.

Smith, R. L. (1984) Threshold methods for sample extremes. In: Statistical Extremes and Applications, ed. J. Tiago de Oliveira, 621-638. Reidel, Dordrecht, The Netherlands.

Smith, R. L. (1987) Estimating tails of probability distributions. Ann. Statist. 15, 1174-1207.

Tribus, M. (1969) Rational Descriptions, Decisions and Designs. Pergamon, New York, USA.

van Montfort, M. A. J. & Witter, J. V. (1985) Testing exponentiality against generalized Pareto distribution. J. Hydrol. 78, 305-315.

van Montfort, M. A. J. & Witter, J. V. (1986) The generalized Pareto distribution applied to rainfall depths. Hydrol. Sci. J. 31(2), 151-162.

van Montfort, M. A. J. & Otten, A. (1991) The first and the second e of the extreme value distribution, EV1. Stochastic Hydrol. Hydraul. 5, 69-76.

Wang, Q. J. (1990) Studies on statistical methods of flood frequency analysis. PhD dissertation, National University of Ireland, Galway, Ireland.

Wang, Q. J. (1991) The POT model described by the generalized Pareto distribution with Poisson arrival rate. J. Hydrol. 129, 263-280.

Received 8 February 1993; 22 September 1994
