
Comput Econ (2012) 39:173–193. DOI 10.1007/s10614-011-9256-0

Statistical Inferences for Generalized Pareto Distribution Based on Interior Penalty Function Algorithm and Bootstrap Methods and Applications in Analyzing Stock Data

Chao Huang · Jin-Guan Lin · Yan-Yan Ren

Accepted: 7 January 2011 / Published online: 26 January 2011
© Springer Science+Business Media, LLC. 2011

Abstract This paper studies the application of extreme value statistics (EVS) theory to the analysis of stock data, based on the interior penalty function algorithm and Bootstrap methods. Generalized Pareto distribution (GPD) models are considered in analyzing the closing price data of the Shanghai stock market. The maximum likelihood estimates (MLEs) are obtained by using the interior penalty function algorithm. Correspondingly, the bias and standard errors of the MLEs, and the hypothesis test on the shape parameter, are addressed through Bootstrap methods. Simulations are performed to demonstrate the efficacy of the parameter estimation and the power of the test. The estimates of the tail index in this paper are compared with those obtained via classical methods. Finally, the model is diagnosed by numerical and graphical methods and the Value-at-Risk (VaR) is estimated.

Keywords Daily closing price · Generalized Pareto distribution · Threshold · Interior penalty function algorithm · Bootstrap method · Value at Risk

1 Introduction

In the last few years, statistics of extremes has seen a resurgence of interest due to the influence of extreme events on people's lives, in areas such as the environment, insurance and, especially, finance; the subprime mortgage crisis is a recent economic problem characterized by contracted liquidity in the global credit markets and banking system.

C. Huang · J.-G. Lin (B)
Department of Mathematics, Southeast University, Nanjing 210096, China
e-mail: [email protected]

Y.-Y. Ren
School of Economics, Shandong University, Jinan 250100, China


An undervaluation of real risk in the subprime market ultimately resulted in cascades and ripple effects affecting the world economy generally. Extreme value theory can be traced to Fisher and Tippet (1928), who put forward the three well-known extremal distribution families. Gnedenko (1943) improved on the work of Fisher and Tippet (1928) by giving a rigorous proof of the extremal distribution families theory. Pickands (1975) proposed the generalized Pareto distribution (GPD) model, which had a profound effect on the application of extreme value statistics (EVS). de Haan and Ferreira (2006) presented complete results on the domain of attraction theory.

Much of the recent research in extreme value theory has been stimulated by the possibility of large losses in the financial markets, which has resulted in a large literature on "Value at Risk". Jansen and Vries (1991) investigated the frequency of large stock returns and the tail behavior of the distribution of stock returns. Embrechts et al. (1997) developed the modelling of extreme events for insurance and finance. Kearns and Pagan (1997) proposed a nonparametric estimate of the density tail index for financial time series. McNeil et al. (2005) put forward the concept of quantitative risk management. Thus, EVS theory has become a powerful tool for measuring financial risk.

The daily closing price data can be seen as financial time series data. In recent years, much attention has been paid to the forecasting and control of financial time series, and many time series models and methods have been proposed (Anderson 1971; Box and Jenkins 1976; Kedem and Fokianos 2002; Fan and Yao 2003, etc.). This paper does not focus on the prediction issue; instead, the extreme events in the time series are of concern here. As a result, EVS theory and the GPD model are applied in this paper.

The motivation of this paper is to improve the algorithm for obtaining the estimates of the parameters in the GPD model. The problem of parameter estimation for GPD models has been approached by several authors, including Pickands (1975), Smith (1985), Hosking et al. (1985), Hosking and Wallis (1987), Castillo and Hadi (1997), Rasmussen (2001) and Zhang (2007), among others. Unfortunately, these estimates may sometimes fail to exist or may give nonsensical values. In this paper, the maximum likelihood estimates (MLEs) are obtained by using a constrained optimization algorithm, the interior penalty function algorithm, which is computationally easy and has high asymptotic efficiency. Estimates obtained by this algorithm somewhat outperform those obtained by previously proposed methods.

Another innovation of this paper is testing whether the shape parameter is zero in the GPD. Smith (1985) pointed out that, for ξ > −1/2, under some suitable regularity conditions, the test statistic is asymptotically χ²-distributed under H0. But sometimes the regularity conditions are violated; in particular, in practical applications the underlying distribution of the dataset is unknown, so the regularity conditions cannot be verified. In this paper, the empirical distribution of the test statistic under H0 is simulated by the parametric bootstrap method. The size and the power of the test are simulated based on this empirical distribution, and the test performs well.

The format of the paper is as follows. Section 2 introduces the generalized extreme value (GEV) distribution and the GPD models. The parameter estimates are obtained by using the interior penalty function algorithm in Sect. 3. Correspondingly, the bias


and standard errors of the MLEs, and the empirical distribution of the test statistic on the shape parameter, are addressed through Bootstrap methods in this section. In Sect. 4, some simulations relating to the parameter estimation and the test are performed. Finally, the closing price data of the Shanghai stock market is analyzed based on the GPD model in Sect. 5.

2 Extreme Value Models

Consider a sequence of i.i.d. random variables {X1, X2, …, Xn} with marginal distribution function F(x). Let X(n) = max(X1, X2, …, Xn). It is well known (Fisher and Tippet 1928) that, if there exist sequences of constants {an : an > 0}, {bn : bn ∈ R} and a non-degenerate distribution function G such that

Pr{ (X(n) − bn)/an ≤ z } = F^n(an z + bn) →d G(z).  (1)

Then G belongs to the GEV distribution family for sample maxima:

G(x) = exp{ −[1 + ξ(x − μ)/σ]^{−1/ξ} },  1 + ξ(x − μ)/σ > 0,  (2)

where μ is a location parameter, σ > 0 is a scale parameter and ξ ∈ R is a shape parameter. When ξ → 0, the right side of (2) can be written as exp{−e^{−(x − μ)/σ}}. It is easily seen that G(x) describes the Gumbel, Fréchet and Weibull families in the cases ξ → 0, ξ > 0 and ξ < 0, respectively.

The prevailing parametric approach, based on the GEV distribution, for modelling extreme events in a given period of time is the block maxima (BM) method, which groups the data into blocks of equal length and fits the model to the maxima of each block; for example, annual maxima of daily precipitation amounts.

Unfortunately, the BM method is a somewhat wasteful approach to extreme value analysis if other data on extremes are available. An alternative approach, the peaks over threshold (POT) method (see Davison and Smith 1990), has been studied since the GPD was introduced by Balkema and de Haan (1974) and Pickands (1975).

For some high threshold u, we consider the distribution of X conditionally on exceeding u. Let Y = X − u > 0; then

Fu(y) = Pr{Y ≤ y | Y > 0} = [F(u + y) − F(u)] / [1 − F(u)].

Pickands (1975) showed a good approximation of the conditional distribution of Y, defining rF = sup{x : F(x) < 1}, in the sense that

lim_{u → rF} sup_{0 < y < rF − u} | Fu(y) − H(y; σu, ξ) | = 0,  (3)


where H is the distribution function of the GPD, which can be written as

H(y; σu, ξ) = 1 − (1 + ξy/σu)^{−1/ξ},  ξ ≠ 0;
H(y; σu, ξ) = 1 − exp(−y/σu),  ξ = 0,  (4)

where 1 + ξy/σu > 0, and the shape parameter ξ in (4) is the same as that in (2).

The GPD can represent different distributions depending on the value taken by ξ. In particular, when ξ > 0 we obtain the ordinary Pareto distribution, which is suitable for modelling heavy-tailed distributions such as financial returns. When ξ = 0 and ξ < 0 we have, respectively, the exponential and Pareto type II distributions. Some elementary results about the GPD are

E(Y) = σ/(1 − ξ),  (ξ < 1);   E(Y − y | Y > y > 0) = (σ + ξy)/(1 − ξ),  (ξ < 1).  (5)

3 Statistical Inference Based on the Interior Penalty Function Algorithm

3.1 MLEs and Their Bias and Standard Errors

Suppose that y1, y2, …, yn come from the GPD. For ξ ≠ 0, the log-likelihood function can be derived from (4):

L(ξ, σu) = −n log σu − (1 + 1/ξ) Σ_{i=1}^n log(1 + ξ yi/σu),  (6)

where 1 + ξ yi/σu > 0, i = 1, 2, …, n. Similarly, the log-likelihood function in the case ξ = 0 can also be obtained from (4):

L(σu) = −n log σu − σu^{−1} Σ_{i=1}^n yi.  (7)

Because the support of the GPD depends on the parameters, the Newton–Raphson iterative algorithm cannot be used directly to compute the MLEs of ξ and σu. The moment estimator and the probability-weighted moment (PWM) estimator given by Hosking and Wallis (1987) have low asymptotic efficiency. For ξ > 1/2, as the variance is infinite, neither of them exists. Even when the MOM and PWM estimates exist, they may not be consistent with the observed sample values. The maximum likelihood estimator proposed by Davison (1984) is asymptotically efficient, but for ξ < −1 the likelihood function can be made infinite, so the MLEs do not exist. What is worse, its computation is complex. In this paper, the interior penalty function algorithm for constrained optimization problems is considered to estimate the MLEs of the parameters ξ and σu. The penalty function algorithm was proposed


by Zangwill (1963), motivated by the desire to use unconstrained optimization techniques to solve constrained problems. Two approaches exist in the choice of transformation algorithms: sequential penalty transformations and exact penalty transformations. The interior penalty function algorithm considered in this paper belongs to the sequential penalty transformations, which are characterized by the property of preserving feasibility at all times. In recent years, the penalty method has attracted increasing attention in many fields. Eremin (1971) considered the penalty method in convex programming. Di Pillo and Grippo (1989) studied exact penalty functions in constrained optimization. Auslender (1999) proposed a unified framework for penalty and barrier methods.

In the interior penalty function method, the initial point is selected within the feasible region and the penalty term is defined to keep the solution from leaving the feasible region. Thus, the proposed method based on the interior penalty function algorithm has the advantage that the generated estimator sequence always lies in the interior of the feasible region. Even if the optimal solution cannot be obtained, an alternative can be generated that always stays in the feasible region and is close to the optimal one. Another advantage of this method concerns conducting inference via the bootstrap. Although the nonparametric bootstrap fails in extreme value statistics, because the empirical distribution function is not a good estimate of the true distribution in the extreme tail, the parametric bootstrap method can be adopted here; its accuracy tends to be much higher when the true distribution belongs to a known parametric family.

In fact, the problem of computing the MLE of (ξ, σu) is equivalent to the optimization problem below:

max_{ξ, σu} L(ξ, σu)  subject to  1 + ξ yi/σu > 0, i = 1, 2, …, n,  σu > 0.  (8)

Denote

θ = (ξ, σu)^T,  L̄(ξ, σu) = −L(ξ, σu),
gi(ξ, σu) = 1 + ξ yi/σu,  i = 1, 2, …, n,  g_{n+1}(ξ, σu) = σu.

Then the problem considered later, equivalent to (8), is written as

min_θ L̄(θ)  subject to  gi(θ) > 0, i = 1, 2, …, n + 1.  (9)

In order to keep the iteration points within the feasible region, we define the barrier function

F(θ, K) = L̄(θ) + K · B(θ),  (10)


where B(θ) is continuous and diverges as θ approaches the boundary of the feasible region. Here B(θ) is defined as

B(θ) = Σ_{i=1}^{n+1} 1/gi(θ).  (11)

As K is a small positive number, F(θ, K) → ∞ as θ approaches the boundary of the feasible region. On the other hand, F(θ, K) and L̄(θ) are almost the same when K · B(θ) is sufficiently small. As a result, an approximate solution of (9) can be obtained by solving the optimization problem below:

min_θ F(θ, K)  subject to  gi(θ) > 0, i = 1, 2, …, n + 1.  (12)

Because of the barrier function B(θ), the solution of the optimization problem above is sure to remain within the feasible region. Although (12) is a constrained optimization problem, since the penalty function acts automatically, (12) can be solved as an unconstrained optimization problem from the computational point of view.

The procedure of the interior penalty function algorithm for (9) is as follows:

• Step 1. Fix an initial point θ(0) satisfying the constraints in (9), the penalty factor K > 0, the reduction factor ν < 1, and the precision ε > 0; let k = 1;
• Step 2. Construct the augmented objective function F(θ, K) defined in (10) and (11);
• Step 3. Taking θ(k−1) as the initial point, solve the problem min F(θ, K) with the Newton–Raphson iterative algorithm, and denote the optimum solution θ(k). If K · B(θ(k)) < ε, stop the iteration and output θ(k) as θ̂; otherwise, let K = νK, k = k + 1, and go to Step 2.
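The three steps above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the inner Newton–Raphson solve is replaced by a crude gradient descent with backtracking (purely to keep the sketch short), and all names and tuning constants (`gpd_mle_ipf`, the step sizes and iteration counts) are our own choices.

```python
import math
import random

def neg_loglik(xi, sigma, ys):
    # L-bar(theta) = -L(xi, sigma_u), the negative of Eq. (6), for xi != 0
    return len(ys) * math.log(sigma) + (1 + 1/xi) * sum(
        math.log(1 + xi * y / sigma) for y in ys)

def barrier(xi, sigma, ys):
    # B(theta) = sum_i 1/g_i(theta), Eq. (11); diverges at the boundary
    return sum(1 / (1 + xi * y / sigma) for y in ys) + 1 / sigma

def feasible(xi, sigma, ys):
    # the constraints g_i(theta) > 0, i = 1, ..., n+1, of Eq. (9)
    return sigma > 0 and all(1 + xi * y / sigma > 0 for y in ys)

def descend(xi, sigma, K, ys, iters=120, h=1e-5):
    # inner solve of min F(theta, K): gradient descent with backtracking,
    # a stand-in for the paper's Newton-Raphson step
    F = lambda a, b: neg_loglik(a, b, ys) + K * barrier(a, b, ys)
    for _ in range(iters):
        f0 = F(xi, sigma)
        gx = (F(xi + h, sigma) - F(xi - h, sigma)) / (2 * h)
        gs = (F(xi, sigma + h) - F(xi, sigma - h)) / (2 * h)
        step = 1e-3
        while step > 1e-12:
            nxi, nsg = xi - step * gx, sigma - step * gs
            if feasible(nxi, nsg, ys) and F(nxi, nsg) < f0:
                xi, sigma = nxi, nsg
                break
            step /= 2
        else:
            break  # no descent step found: inner solve has converged
    return xi, sigma

def gpd_mle_ipf(ys, xi0=1.0, sigma0=2.0, K=1.0, nu=0.5, eps=1e-3):
    # Steps 1-3: shrink K by nu until the penalty term K*B is negligible
    xi, sigma = xi0, sigma0
    while True:
        xi, sigma = descend(xi, sigma, K, ys)
        if K * barrier(xi, sigma, ys) < eps:
            return xi, sigma
        K *= nu
```

Note that for heavy-tailed data (ξ > 0, y > 0) the constraints gi > 1 hold automatically, so the barrier mainly enforces σu > 0; for ξ < 0 it also keeps the iterates away from the support boundary.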

The MLE of θ can be computed with the algorithm above, and the bias and standard error of θ̂ = (ξ̂, σ̂u)^T can be estimated with the Bootstrap method. Unfortunately, Efron and Tibshirani (1993) pointed out that the nonparametric bootstrap fails because the empirical distribution function is not a good estimate of the true distribution in the extreme tail. Angus (1993) showed that the nonparametric bootstrap distribution for the extremes does not converge to an extreme value distribution; in fact, its limit is a random probability measure. However, since the GPD model is assumed, the parametric bootstrap method can be adopted here, the accuracy of which tends to be much higher if the true distribution belongs to a known parametric family. The procedure of the parametric bootstrap method is presented below.

• Step 1. Given a random sample y = (y1, y2, …, yn), calculate the estimate θ̂;
• Step 2. Sample from the GPD model with parameter θ̂ to get y*i = (y*i_1, y*i_2, …, y*i_n);
• Step 3. Calculate the same estimate from the sample in Step 2 to get the bootstrap replicate θ̂*i;
• Step 4. Repeat Steps 2 and 3 B times;
• Step 5. Estimate the bias and standard error of θ̂ using the equations below:

bias_B(θ̂) = θ̄* − θ̂,  SE_B(θ̂) = { (1/(B − 1)) Σ_{i=1}^B (θ̂*i − θ̄*)² }^{1/2},  θ̄* = (1/B) Σ_{i=1}^B θ̂*i.

Efron and Tibshirani (1993) showed that the number of bootstrap replicates B should be between 50 and 200 when estimating the standard error of a statistic, but should be no less than 400 when estimating the bias of a statistic. It is known that the MLE sometimes cannot be obtained for the GPD model, especially for ξ ≤ 0 and small sample sizes. However, the problem is not severe for the heavy-tailed GPD, and the frequency of failure of the algorithm is very low.
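Steps 1–5 above can be sketched as follows. This is an illustrative sketch only: to keep it self-contained it uses the closed-form method-of-moments estimator of (ξ, σu) as a stand-in for the MLE (the paper uses the interior penalty function algorithm), and samples from the fitted GPD by inverse transform; all names (`gpd_sample`, `mom_estimate`, `boot_bias_se`) are ours.

```python
import math
import random

def gpd_sample(xi, sigma, n, rng):
    # inverse-transform sampling: Y = (sigma/xi) * ((1-U)^(-xi) - 1)
    return [sigma / xi * ((1 - rng.random()) ** (-xi) - 1) for _ in range(n)]

def mom_estimate(ys):
    # method-of-moments stand-in for the MLE (valid for xi < 1/2):
    # mean = sigma/(1-xi), var = sigma^2 / ((1-xi)^2 (1-2 xi))
    n = len(ys)
    m = sum(ys) / n
    v = sum((y - m) ** 2 for y in ys) / (n - 1)
    xi = 0.5 * (1 - m * m / v)
    sigma = 0.5 * m * (1 + m * m / v)
    return xi, sigma

def boot_bias_se(ys, B=200, seed=0):
    # Steps 1-5: parametric bootstrap estimates of bias and standard error
    rng = random.Random(seed)
    xi_hat, sg_hat = mom_estimate(ys)                  # Step 1
    reps = [mom_estimate(gpd_sample(xi_hat, sg_hat, len(ys), rng))
            for _ in range(B)]                         # Steps 2-4
    out = {}
    for j, name in enumerate(("xi", "sigma")):         # Step 5
        bar = sum(r[j] for r in reps) / B
        se = math.sqrt(sum((r[j] - bar) ** 2 for r in reps) / (B - 1))
        out[name] = {"bias": bar - (xi_hat, sg_hat)[j], "se": se}
    return (xi_hat, sg_hat), out
```

Any other estimator of (ξ, σu) can be substituted for `mom_estimate` without changing the bootstrap logic.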

3.2 The Hypothesis Test on the Shape Parameter

It is known that the MLE of ξ is obtained from (6), which is the log-likelihood function in the case ξ ≠ 0. If the estimate ξ̂ is not far from 0, there is not sufficient evidence to conclude that ξ ≠ 0; in other words, the log-likelihood (7) should be considered instead. In order to check whether ξ equals 0 or not, we need to test the hypothesis on ξ:

H0 : ξ = 0;  H1 : ξ ≠ 0.  (13)

The test statistic widely used in hypothesis testing is the likelihood ratio statistic

LRn = 2 log [ max_{θ∈Θ} L(θ) / max_{θ∈Θ0} L(θ) ].  (14)

Smith (1985) pointed out that, for ξ > −1/2, under some suitable regularity conditions, the sequence {LRn} is asymptotically χ²-distributed under H0. But sometimes the regularity conditions are violated, especially in practical applications. Here we consider the empirical distribution of this test statistic. First, the profile likelihood function of ξ is given by

Lp(ξ) = max_{σu | ξ} L(ξ, σu).  (15)

Then the profile likelihood ratio statistic can be written as

LRn = 2 log [ Lp(ξ̂) / Lp(0) ],  (16)

where ξ̂ is the MLE of the shape parameter ξ over the whole parameter space Θ.


Denote Θ0 = {(ξ, σu) : ξ = 0, σu ∈ R} and Θ1 = {(ξ, σu) : ξ ≠ 0, σu ∈ R}. For Θ = Θ0 ∪ Θ1, Θ0 ∩ Θ1 = ∅, it is easily found that

Lp(ξ̂) = max_Θ L(ξ, σu) = max{ max_{Θ0} L(ξ, σu), max_{Θ1} L(ξ, σu) }.

Thus, the profile likelihood ratio statistic can be written as

LRn = 2 log [ max{ max_{ξ≠0} L(ξ, σu), max_{ξ=0} L(ξ, σu) } / max_{ξ=0} L(ξ, σu) ].  (17)

For the hypothesis test (13), we have to know the empirical distribution of the test statistic LRn under H0. Here parametric bootstrap calibration is adopted. First, for a given sample y, generate random samples y1, y2, …, yB from the exponential distribution E(λ), with the parameter λ equal to the mean of y, where the sample yi = {yij}_{j=1}^n. Then calculate the profile likelihood ratio statistic for each sample, put the sequence {LRn^i}_{i=1}^B in ascending order, and denote the ordered values {LRn^{(i)}}_{i=1}^B. Thus an empirical distribution is written as

FB(x) = (1/B) Σ_{i=1}^B 1{LRn^i ≤ x},

where 1{·} is the indicator function. FB(x) can be seen as the empirical cumulative distribution function (ECDF) of LRn under H0. When B is sufficiently large, the 1 − α quantile of the distribution of LRn under H0 can be replaced by the [Bα]-th largest value among {LRn^i}_{i=1}^B. As a result, the rejection region can be written as

W = { x : LRn(x) > LRn^{([B(1−α)])} },  (18)

where α is the significance level. The procedure for constructing the rejection region of the test (13) is presented below:

• Step 1. For a given sample y, let k = 1 and λ = ȳ;
• Step 2. Generate pseudorandom numbers {xki}_{i=1}^n from the exponential distribution E(λ);
• Step 3. Estimate the MLEs of ξ, σu from the sample in Step 2, and calculate the corresponding test statistic, denoted LRnk;
• Step 4. Let k = k + 1 and go to Step 2;
• Step 5. When k = B, stop the iteration. Put {LRnk}_{k=1}^B in ascending order, then find the [Bα]-th largest value among {LRnk}_{k=1}^B and denote it C1;
• Step 6. Repeat Steps 1 to 5 100 times, obtain the critical value sequence {Ci}_{i=1}^{100}, and denote the median of the sequence Cα. The rejection region is obtained as W = {x : LRn(x) > Cα}.
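A compact sketch of Steps 1–5 (one pass, producing a single critical value C1) is given below. To avoid re-implementing the two-parameter penalty solve, it exploits the well-known one-dimensional profile reduction of the GPD likelihood (parametrize by τ = ξ/σu; for fixed τ the maximizing shape is ξ(τ) = (1/n)Σ log(1 + τ yi), with σu = ξ(τ)/τ), maximized here by a simple grid search; this substitution and all names are our own.

```python
import math
import random

def gpd_profile_max(ys, grid=100):
    # maximum GPD log-likelihood via the 1-D reduction in tau = xi/sigma_u:
    # L(tau) = -n*log(xi(tau)/tau) - n*xi(tau) - n, with
    # xi(tau) = mean(log(1 + tau*y_i)); grid search over feasible tau
    n, ymax = len(ys), max(ys)
    best = -float("inf")
    for k in range(1, grid + 1):
        # tau spans (-0.99/ymax, 0) and (0, 10/ymax); tau = 0 is the H0 case
        for tau in (-0.99 / ymax * k / grid, 10.0 / ymax * k / grid):
            xi = sum(math.log(1 + tau * y) for y in ys) / n
            if xi == 0:
                continue
            best = max(best, -n * math.log(xi / tau) - n * xi - n)
    return best

def lr_stat(ys):
    # Eq. (17): twice the log ratio of the unrestricted maximum to the
    # xi = 0 (exponential) maximum, L0 = -n*log(mean) - n from Eq. (7)
    n = len(ys)
    l0 = -n * math.log(sum(ys) / n) - n
    return max(0.0, 2 * (gpd_profile_max(ys) - l0))

def critical_value(y, B=200, alpha=0.01, seed=0):
    # Steps 1-5: one bootstrap pass under H0; samples are exponential with
    # mean lambda equal to the mean of y (random.expovariate takes the rate)
    rng = random.Random(seed)
    lam = sum(y) / len(y)
    stats = sorted(lr_stat([rng.expovariate(1 / lam) for _ in range(len(y))])
                   for _ in range(B))
    return stats[len(stats) - max(1, int(B * alpha))]  # [B*alpha]-th largest
```

As in Step 6 of the text, this whole pass would be repeated (say 100 times) and the median of the resulting critical values taken as Cα.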


3.3 The Quantile Estimate of the GPD

In this subsection, we consider the estimation of the quantiles of the GPD. From

Pr{X > x | X > u} Pr{X > u} = Pr{X > x},

denoting ζu = Pr{X > u}, the 1 − 1/m quantile xm is the solution of the equation below:

ζu [1 + ξ (xm − u)/σu]^{−1/ξ} = 1/m.

In hydrologic observation, the 1 − 1/m quantile xm is called the return level. Let m = 100; then xm may express the river level in a once-in-a-century flood. The index is of great importance in large dam construction. In the analysis of financial market risk, xm is the extreme quantile of the daily negative return, which is generally referred to as the Value-at-Risk (VaR). Rearranging the equation above, a plug-in estimate of xm can be obtained directly:

x̂m = u + (σ̂u/ξ̂) [ (m ζ̂u)^{ξ̂} − 1 ],  (19)

where σ̂u, ξ̂ are the MLEs of σu, ξ and ζ̂u = nu/n. Thus, the quantiles of the GPD can be estimated by (19), and the standard error of the estimates can be obtained by the parametric bootstrap method introduced above.
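The plug-in estimate (19) is straightforward to compute once u, σ̂u, ξ̂ and ζ̂u = nu/n are available; a short sketch follows, in which the function name and the fitted values in the example are illustrative only.

```python
import math

def return_level(m, u, sigma_u, xi, zeta_u):
    # Eq. (19): x_m = u + (sigma_u/xi) * ((m*zeta_u)^xi - 1); the xi -> 0
    # limit gives the exponential case x_m = u + sigma_u * log(m*zeta_u)
    if xi == 0:
        return u + sigma_u * math.log(m * zeta_u)
    return u + sigma_u / xi * ((m * zeta_u) ** xi - 1)

# hypothetical fitted values: threshold u = 2, sigma_u = 0.7, xi = 0.2,
# with 5% of observations exceeding the threshold (zeta_u = 0.05)
var_100 = return_level(100, 2.0, 0.7, 0.2, 0.05)  # 1-in-100-day level
```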

Here we consider confidence intervals for the quantile. Efron and Tibshirani (1993) suggested a further extension of percentile intervals, called the bias-corrected and accelerated (BCa) confidence interval, which is both transformation respecting and second-order accurate. The method is an improved version of percentile intervals, although its endpoints are also given by percentiles of the bootstrap distribution. The BCa intervals depend on two numbers a and z0, called the acceleration and the bias-correction. Denote by θ̂*(α) the 100α-th percentile point of the bootstrap replicates θ̂*i. Then the BCa interval of intended coverage 1 − 2α is given by

BCa : (θ̂lo, θ̂up) = (θ̂*(α1), θ̂*(α2)),  (20)

where

α1 = Φ( z0 + (z0 + z^{(α)}) / (1 − a(z0 + z^{(α)})) ),
α2 = Φ( z0 + (z0 + z^{(1−α)}) / (1 − a(z0 + z^{(1−α)})) ).  (21)


Here Φ(·) is the standard normal cumulative distribution function and z^{(α)} is the 100α-th percentile point of the standard normal distribution. The value of the bias-correction z0 is defined as

z0 = Φ^{−1}( #{θ̂*i < θ̂} / B ),  (22)

where Φ^{−1} is the inverse of the standard normal distribution function, and the notation θ̂*i and B are defined above. There are various ways to compute the acceleration a; a simple expression given by Efron and Tibshirani (1993) is

a = Σ_{i=1}^n (θ̂(·) − θ̂(i))³ / ( 6 { Σ_{i=1}^n (θ̂(·) − θ̂(i))² }^{3/2} ),  (23)

where θ̂(i) is the estimate of θ based on the original sample with the i-th point yi deleted and θ̂(·) is the mean of the jackknife values θ̂(i).

At the present level of development, the BCa intervals are recommended for general use, especially for nonparametric problems. In this paper, confidence intervals for the quantile are obtained by the parametric BCa interval method. The procedure of the parametric BCa interval method is listed here:

• Step 1. Given a random sample y = (y1, y2, …, yn), calculate the estimates θ̂ and θ̂(i);
• Step 2. Sample from the GPD model with parameter θ̂ to get y*i = (y*i_1, y*i_2, …, y*i_n);
• Step 3. Calculate the estimate from the sample in Step 2 to get the bootstrap replicate θ̂*i;
• Step 4. Repeat Steps 2 and 3 B times;
• Step 5. Calculate the bias-correction z0 and the acceleration a defined in (22) and (23), respectively, and compute α1 and α2 in (21).

Thus the BCa intervals in (20) are obtained. In Sect. 5, we give the confidence intervals of the VaR of the closing price data of the Shanghai stock market based on the BCa interval method.
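Given the original estimate θ̂, the bootstrap replicates θ̂*i and the jackknife values θ̂(i), Eqs. (20)–(23) can be computed directly. Below is a small sketch using Python's standard-library normal distribution, with the sample mean as a stand-in statistic (any estimator, such as the VaR estimate, can be plugged in); the clamping of the proportion in (22) away from 0 and 1 is our own guard, not part of the paper.

```python
import random
from statistics import NormalDist, mean

def bca_interval(theta_hat, boot, jack, alpha=0.05):
    nd, B = NormalDist(), len(boot)
    # bias-correction z0, Eq. (22); clamp the proportion away from 0 and 1
    p = min(max(sum(t < theta_hat for t in boot) / B, 1 / B), 1 - 1 / B)
    z0 = nd.inv_cdf(p)
    # acceleration a, Eq. (23), from the jackknife values
    jm = mean(jack)
    den = 6 * sum((jm - t) ** 2 for t in jack) ** 1.5
    a = sum((jm - t) ** 3 for t in jack) / den if den else 0.0
    def endpoint(z):
        # Eq. (21): adjusted percentile level
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))
    s = sorted(boot)
    a1, a2 = endpoint(nd.inv_cdf(alpha)), endpoint(nd.inv_cdf(1 - alpha))
    pick = lambda q: s[min(B - 1, max(0, int(q * B)))]
    return pick(a1), pick(a2)  # Eq. (20)

# demo: BCa interval for a sample mean, nonparametric resampling
rng = random.Random(3)
data = [rng.expovariate(1.0) for _ in range(100)]
theta_hat = mean(data)
boot = [mean(rng.choices(data, k=len(data))) for _ in range(999)]
jack = [mean(data[:i] + data[i + 1:]) for i in range(len(data))]
lo, up = bca_interval(theta_hat, boot, jack, alpha=0.05)
```

For the parametric BCa intervals of the paper, the resampling line would draw from the fitted GPD instead of resampling the data.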

4 Simulation Studies

4.1 Parameter Estimates and Their Standard Errors

In Sect. 3, the penalty function algorithm for obtaining the MLE of θ and estimates of the bias and standard error of θ̂ were given. Here the performance of the parameter estimates and their bias and standard errors is examined via Monte Carlo simulations, to show the efficacy of the parameter estimation.


Table 1 Simulations for the MLEs of parameters in Models 1 and 2

Model     Parameter   Sample size   MLE       Bias          Standard error
Model 1   ξ = 10      500           9.8959    −2.8055e−2    0.4751
                      1000          10.1351   −2.6129e−2    0.3844
                      2000          10.0270   −1.7377e−2    0.1566
          σu = 1      500           0.8820    6.9551e−2     0.1532
                      1000          0.9338    6.7089e−2     0.1408
                      2000          0.9898    2.8303e−2     4.9625e−2
Model 2   ξ = 0.01    1000          0.0374    −6.7024e−3    3.6359e−2
                      2000          0.0178    7.0647e−3     3.0915e−2
                      5000          0.0108    −1.0805e−3    9.5000e−3
          σu = 1      1000          0.9535    7.1927e−3     4.3166e−2
                      2000          0.9673    −3.8632e−3    2.2019e−2
                      5000          1.0085    1.0678e−3     1.2902e−2

We first perform simulations of the parameter estimates using the interior penalty function algorithm. Two cases are considered: (i) ξ is far from zero; (ii) ξ is near zero. The two corresponding models are given below:

• Model 1 The GPD model with parameters ξ = 10, σu = 1;• Model 2 The GPD model with parameters ξ = 0.01, σu = 1.

We generate n1 = 500, 1000 and 2000 pseudorandom numbers from Model 1 and n2 = 1000, 2000 and 5000 pseudorandom numbers from Model 2. For the different models, the estimates of the parameters and the corresponding bias and standard errors are displayed in Table 1.

Table 1 shows that, using the interior penalty function algorithm, the parameter estimation performs very well when the sample size n is sufficiently large. On the other hand, in order to obtain satisfactory estimates, the sample size n needed for smaller ξ is larger than that for larger ξ.

Now we compare the estimates of the parameters ξ and σu and their standard errors obtained by the interior penalty function algorithm with those of existing methods: (1) the Pickands method (1975); (2) the method-of-moments (MOM) estimates; (3) the PWM method of Hosking and Wallis (1987); (4) the elemental percentile method (EPM) of Castillo and Hadi (1997); and (5) the likelihood moment estimator (LME) of Zhang (2007). First, we generate n = 1000 pseudorandom numbers from the GPD model with parameters ξ = 0.3, σu = 1. The estimates are then calculated by the methods above and the corresponding bias and standard errors are displayed in Table 2. From Table 2, the parameter estimates and their standard errors obtained by the interior penalty function algorithm slightly outperform those of the other five methods.

4.2 Simulated Powers of the Test Statistic

In this subsection, we examine the performance of the test statistic (17) via Monte Carlo simulations, to provide finite-sample properties of the proposed statistic. The


Table 2 Comparisons of the parameter estimates among different methods

Parameter   Method     Estimate   Bias          Standard error
ξ = 0.3     Pickands   0.35776    −1.441e−3     5.6162e−2
            MOM        0.22139    −8.035e−3     3.4504e−2
            PWM        0.28866    −1.3015e−3    2.9171e−2
            LME        0.29556    −1.7854e−3    2.9675e−2
            EPM        0.28122    −1.1324e−3    3.6359e−2
            IPF        0.30166    −1.3203e−3    2.8406e−2
σu = 1      Pickands   0.97068    3.6489e−3     5.6162e−2
            MOM        1.11       1.1157e−2     4.9394e−2
            PWM        1.0141     −2.9752e−3    3.711e−2
            LME        1.0166     2.0621e−3     3.6949e−2
            EPM        1.0653     4.9144e−3     3.8283e−2
            IPF        1.0087     1.2884e−3     3.3768e−2

IPF is short for the interior penalty function algorithm

power of the test can also be simulated. For different ξ ≠ 0, the procedure for simulating the power of the test is presented here:

• Step 1. Generate n pseudorandom numbers xk = {xki}_{i=1}^n from the GPD with parameters ξ ≠ 0 and σu defined above;
• Step 2. Estimate the MLEs of ξ, σu from the sample in Step 1, and calculate the corresponding test statistic, denoted LRnk;
• Step 3. If xk falls into the rejection region W, let l = l + 1; let k = k + 1 and go to Step 2. When k = 1000, stop the iteration and calculate the power of the test as Power = l/1000.

The significance level is set to α = 0.01. According to the procedure for simulating the test power above, we set σu = 2 for different sample sizes (n = 500, 1000, 2000) and values of ξ (ξ ≠ 0). The simulated critical values for each sample size are C500,0.01 = 6.4657, C1000,0.01 = 6.501 and C2000,0.01 = 6.5408. The simulated powers of the test (13) are displayed in Fig. 1. From Fig. 1, the results for testing ξ = 0 show that the actual sizes of the test were very close to 0.01. As |ξ| or the sample size increased, the powers of the tests approached 1 quickly.

5 Analysis for the Closing Price Data of Shanghai Stock Market

In the previous sections, the GEV model and the GPD model were introduced. In this section, the closing price data of the Shanghai stock market is analyzed based on the GPD model.

The closing price data of the Shanghai stock market considered in this paper ranges from November 27, 1996 to April 29, 2009, 3001 observations in total. The data were obtained from the software named Great wisdom, http://www.gw.com.cn/welcome.jsp.


[Figure: simulated power curves of the test at significance level α = 0.01, plotting the power (from 0 to 1) against ξ ∈ [−0.3, 0.3] for sample sizes N = 500, 1000 and 2000.]

Fig. 1 Monte Carlo simulations of the empirical powers for the test

Table 3 Numerical characteristics of {Xt}

Sample size   Mean      Standard error   Minimum   Median    Maximum   Skewness   Kurtosis
3000          −0.0308   1.7956           −9.4014   −0.0556   10.4370   0.2429     7.7504

5.1 Data Transformation

The stock data considered here can be seen as time series data. Because of the strong non-stationarity observed in the original data, we hope that the trend term and periodic term can be separated from the data. Much recent research on extreme value theory in financial markets has centered on the VaR; here we transform the data in the same way as in the VaR literature. Suppose Zt is the closing price of the Shanghai stock market on day t. As we are mainly interested in the possibility of large losses in the stock market, the daily negative return is defined by

Xt = 100 log(Zt−1/Zt).  (24)

Some numerical characteristics of {Xt} are given in Table 3. It can be found from the table that the sequence of daily negative returns {Xt} is fat-tailed, not normal.
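The transformation (24) maps a closing-price series {Zt} into daily negative returns; a short sketch follows (the toy prices are illustrative only).

```python
import math

def negative_returns(prices):
    # X_t = 100 * log(Z_{t-1} / Z_t), Eq. (24): positive when the price falls
    return [100 * math.log(prices[t - 1] / prices[t])
            for t in range(1, len(prices))]

# toy prices: a drop from 100 to 99 gives X of about +1.005 (roughly a 1% loss)
xs = negative_returns([100.0, 99.0, 100.0])
```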

5.2 Threshold Selection

From the definition of the GPD model, we define the threshold excesses as Y = X − u > 0. Davison and Smith (1990) introduced the idea of the Mean Residual Life Plot, which


is, in a sense, a diagnostic plot drawn before fitting any model and can therefore give guidance about what threshold to use. From Eq. 5, it is found that

E(X − u | X > u) = (σu0 + ξ(u − u0)) / (1 − ξ),  (ξ < 1).  (25)

In other words, when u > u0, E(X − u | X > u) is a linear function of u with slope ξ/(1 − ξ). An empirical estimate of E(X − u | X > u) is given by

MR(u) = (1/nu) Σ_{i=1}^{nu} (x(i) − u),  (26)

where nu = #{i : xi > u} and x(1), x(2), …, x(nu) are the observed data exceeding u. It is known that, when u > u0, the empirical estimate is expected to change linearly with u. Thus a plot with MR(u) on the ordinate and the threshold on the abscissa can be drawn to select the threshold u0. This plot is called the Mean Residual Life Plot.

One difficulty with this method is that the plot typically shows very high variability, particularly at large thresholds. This can make it difficult to decide whether an observed departure from linearity is due to a failure of the GPD or is just sample variability.

A rough confidence band can be given in the plot by the Monte Carlo method. Assume that for some threshold u0, the distribution of the excesses over u0 is exactly the GPD with parameters σu0, ξ, and let σ̂u0, ξ̂ denote the MLEs of σu0, ξ. A natural estimate of the difference between the empirical and the theoretical mean excesses for any given u is presented below:

MR(u) − [σ̂u0 + ξ̂(u − u0)] / (1 − ξ̂). (27)

The procedure of the simulation is as follows:

• Step 1. Given a random sample y = (y1, y2, . . . , yn), calculate σ̂u0, ξ̂, the MLEs of the parameters σu0, ξ;
• Step 2. Generate n pseudorandom numbers {y_i^k}_{i=1}^n from the GPD with parameters σ̂u0, ξ̂;
• Step 3. Calculate σ̂_{u0}^k, ξ̂^k, the MLEs of σu0, ξ, and the mean excess MR^k(u) using the sample in Step 2, and calculate the relevant statistic,

D^k(u) = MR^k(u) − [σ̂_{u0}^k + ξ̂^k(u − u0)] / (1 − ξ̂^k) + [σ̂u0 + ξ̂(u − u0)] / (1 − ξ̂);

• Step 4. Repeat Steps 2 and 3, 100 times;
• Step 5. Order the sequence {D^k(u)} from smallest to largest, and find the fifth largest and fifth smallest values of {D^k(u)}, which are approximately the 5% upper and lower confidence bounds on MR(u).
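Steps 1–5 above can be sketched as follows. This is only a sketch: the paper fits the GPD by its interior penalty function algorithm, whereas here scipy's generic `genpareto.fit` MLE stands in for it, and all names are ours.

```python
import numpy as np
from scipy.stats import genpareto

def mr_confidence_band(excesses, u0, u_grid, n_rep=100, alpha=0.05, seed=0):
    """Monte Carlo confidence band for MR(u), following Steps 1-5.

    genpareto.fit (generic MLE) is a stand-in for the paper's
    interior penalty function algorithm.
    """
    rng = np.random.default_rng(seed)
    # Step 1: MLE on the observed excesses y_i = x_i - u0 > 0
    xi, _, sigma = genpareto.fit(excesses, floc=0)
    theo = lambda s, c, u: (s + c * (u - u0)) / (1.0 - c)  # fitted mean excess line
    D = np.full((n_rep, len(u_grid)), np.nan)
    for k in range(n_rep):
        # Step 2: pseudo-random sample from the fitted GPD
        yk = genpareto.rvs(xi, loc=0, scale=sigma, size=len(excesses), random_state=rng)
        xk = u0 + yk  # back on the original threshold scale
        # Step 3: refit and compute D_k(u) over the threshold grid
        xik, _, sigk = genpareto.fit(yk, floc=0)
        for j, u in enumerate(u_grid):
            exc = xk[xk > u] - u
            if exc.size:
                D[k, j] = exc.mean() - theo(sigk, xik, u) + theo(sigma, xi, u)
    # Steps 4-5: order the replicates; the alpha tail values give the band
    return np.nanquantile(D, alpha, axis=0), np.nanquantile(D, 1 - alpha, axis=0)
```

With `n_rep=100` and `alpha=0.05`, the returned bounds correspond to the fifth smallest and fifth largest replicates, as in Step 5.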


Fig. 2 K–S statistic and mean residual life plots. Left: the Kolmogorov–Smirnov statistic against the threshold u. Right: the Mean Residual Life Plot, showing MR(u) with its upper and lower confidence bounds.

As a result, a criterion for choosing the threshold is given as: if the plot remains within the confidence bands for all u, it can be believed that the threshold u0 is suitable. It is worth noting that the simulated 90% confidence level is true for any given u but not simultaneously for all u. Thus, if the plot stays within the confidence bands for most of its range but a small part of its range lies outside, there is not sufficient evidence indicating lack of fit of the GPD.

This graphical approach is clearly subjective and can be sensitive to noise or fluctuations in the tail of the distribution. A more objective and principled approach is desirable. Here we present an alternative method for selecting the threshold based on the methodology proposed in Clauset et al. (2009). The fundamental idea behind the method is very simple: the threshold u0 is chosen that makes the probability distributions of the measured data and the best-fit GPD as similar as possible above u0. There are a variety of measures for quantifying the distance between two probability distributions, but for non-normal data the commonest is the Kolmogorov–Smirnov statistic, which is simply the maximum distance between the ECDF of the data and the fitted model. Thus the threshold can be obtained as follows:

u0 = arg min_u max_{x>u} |Fn(x − u) − H(x − u)|,

where Fn is the ECDF of the excesses for the observations with value at least u, and H is the GPD model that best fits the data.
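The K–S criterion above can be sketched as a grid search. Again a hedged sketch: `genpareto.fit` replaces the paper's own fitting routine, and `min_exceed` is an assumption of ours to keep fits stable.

```python
import numpy as np
from scipy.stats import genpareto, kstest

def ks_threshold(x, u_grid, min_exceed=30):
    """Pick the u minimizing the K-S distance between the excesses' ECDF
    and the best-fit GPD above u (in the spirit of Clauset et al. 2009)."""
    best_u, best_d = None, np.inf
    for u in u_grid:
        exc = x[x > u] - u
        if exc.size < min_exceed:  # too few excesses for a stable fit
            continue
        xi, _, sigma = genpareto.fit(exc, floc=0)
        d = kstest(exc, genpareto(xi, loc=0, scale=sigma).cdf).statistic
        if d < best_d:
            best_u, best_d = u, d
    return best_u, best_d
```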

Figure 2 shows the K–S statistic against the threshold and the mean residual life plot. The left plot shows the K–S statistic against the threshold over its whole range. It is found that the K–S distance achieves its smallest value around u = 1.5. In the right mean residual life plot, u0 is taken to be 1.5; the red dashed lines are the 5% upper and lower confidence bounds on MR(u), and the black line is the curve of MR(u). It can be found from the plot that, when u0 = 1.5, the curve of MR(u) stays almost entirely within its confidence bounds. According to the two approaches introduced above, we choose the threshold u0 = 1.5, with nu = 421.


Table 4 Estimates of parameters and their standard errors

Parameter   MLE      Bias         Standard error
ξ           0.1636   −6.5004e−3   5.6559e−2
σu          1.1359   9.8834e−3    8.8089e−2

5.3 Parameter Estimation and Hypothesis Testing

After the threshold is chosen, we focus on the threshold excesses Yi = X(i) − u0, where X(i) ∈ {X(i) : X(i) > u0}. From the definition of the GPD (4), {Yi} are assumed to obey the GPD in the case ξ ≠ 0:

G(y; σu, ξ) = 1 − (1 + ξy/σu)^(−1/ξ), 1 + ξy/σu > 0, σu > 0.

The MLEs of ξ, σu can be computed with the algorithm introduced in Sect. 3, and the bias and standard errors of ξ̂, σ̂u are estimated with the Bootstrap method introduced in Sect. 3. Table 4 displays the MLEs of ξ, σu and the related bias and standard errors.
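A parametric bootstrap for the bias and standard error of the GPD MLEs can be sketched as below. The sketch substitutes scipy's generic `genpareto.fit` MLE for the interior penalty function algorithm of Sect. 3, and the function name is ours.

```python
import numpy as np
from scipy.stats import genpareto

def gpd_bootstrap(excesses, n_boot=500, seed=0):
    """Parametric bootstrap bias and standard error of the GPD MLEs (xi, sigma_u).

    genpareto.fit is a generic-MLE stand-in for the interior penalty
    function algorithm of Sect. 3.
    """
    rng = np.random.default_rng(seed)
    xi, _, sigma = genpareto.fit(excesses, floc=0)
    reps = np.empty((n_boot, 2))
    for b in range(n_boot):
        # resample from the fitted GPD and refit
        yb = genpareto.rvs(xi, loc=0, scale=sigma, size=len(excesses), random_state=rng)
        xib, _, sigb = genpareto.fit(yb, floc=0)
        reps[b] = xib, sigb
    bias = reps.mean(axis=0) - np.array([xi, sigma])
    se = reps.std(axis=0, ddof=1)
    return (xi, sigma), bias, se
```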

From the MLE of ξ we can see that ξ̂ is not far from 0. In order to check whether ξ equals 0 or not, we carry out a hypothesis test on ξ:

H0 : ξ = 0; H1 : ξ ≠ 0.

Based on the MLEs of ξ, σu, we obtain the test statistic LRn = 9.0952; the p-value of the test is 0.003, which gives very significant evidence that ξ ≠ 0.
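The likelihood-ratio statistic underlying this test compares the full GPD fit with the exponential sub-model (the GPD with ξ = 0). A hedged sketch follows; note that the paper calibrates LRn by the parametric bootstrap, whereas the chi-square(1) p-value shown here is only the usual asymptotic reference.

```python
import numpy as np
from scipy.stats import genpareto, expon, chi2

def lr_test_xi_zero(excesses):
    """LR statistic for H0: xi = 0 (exponential) vs H1: xi != 0 (full GPD)."""
    y = np.asarray(excesses, dtype=float)
    xi, _, sigma = genpareto.fit(y, floc=0)
    l1 = genpareto.logpdf(y, xi, loc=0, scale=sigma).sum()
    l0 = expon.logpdf(y, scale=y.mean()).sum()  # under xi = 0 the MLE of sigma is the mean
    lr = 2.0 * (l1 - l0)
    return lr, chi2.sf(lr, df=1)  # asymptotic chi-square(1) p-value
```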

5.4 Comparison of the Estimates of the Tail Index

From the previous subsection, the result of the hypothesis test indicates that the underlying distribution of the negative return data belongs to the domain of attraction of the Fréchet family, which confirms that the data are leptokurtic and heavy-tailed. Hence, the tail index, α = 1/ξ, is of great interest. In recent years the problem of estimating the tail index of a heavy-tailed distribution has received much attention, and various estimators for the tail index have been proposed in the literature (for example, Hill 1975; Pickands 1975; Dekkers et al. 1989; de Haan and Peng 1998, among others).
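As one example of these classical estimators, the Hill (1975) estimator built from the k largest order statistics can be sketched as follows (the function name and the choice of k are ours):

```python
import numpy as np

def hill_tail_index(x, k):
    """Hill (1975) estimator of the tail index alpha = 1/xi
    from the k largest order statistics."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]  # descending order statistics
    xi_hat = np.mean(np.log(xs[:k]) - np.log(xs[k]))  # Hill estimate of xi
    return 1.0 / xi_hat
```

The choice of the tail fraction k plays the same role as the threshold choice in the GPD fit, which is why the Drees–Kaufmann selection mentioned below matters.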

Here we consider the tail index of the distribution of the data by comparing the estimate of the tail index in this paper with these existing methods. The details of these estimators are omitted here; the reader may refer to the papers listed above. Table 5 shows the estimate of the tail index in this paper and those obtained by the existing methods. Five estimates of the tail index α are obtained, and the relevant bias and standard error of 1/α are also presented via the bootstrap method. Similar to the choice of the threshold, the selection of the tail fraction for the first four estimates is also important. Here the method for selecting the optimal sample fraction introduced by Drees and Kaufmann (1998) is considered. The tail fractions for the four estimates are set as: k1 = 130, k2 = 220, k3 = 350, k4 = 270. It can be found that all five estimates of the tail index are larger than 2, which suggests that the underlying distribution


Table 5 Comparison of estimates of tail indices among different methods

No.   Method             Estimate (α̂)   Bias (1/α̂)   Standard error (1/α̂)
1     Hill               5.5294          −0.0637       0.2446
2     Pickands           5.7173          0.0465        0.3573
3     Dekkers et al.     5.6295          −0.0552       0.2213
4     de Haan and Peng   5.4219          −0.0914       0.3481
5     IPF                6.1687          0.0093        0.0422

Table 6 MLEs of ξ under different thresholds u

Threshold u   MLE of ξ
−0.3          0.0290
0.9           0.1099
1.5           0.1636
1.7           0.1473
2.0           0.1664
2.2           0.1507
2.3           0.1623

of the data does not satisfy the leptokurtic stable hypothesis. From the table, it is also shown that the estimate via the IPF method is larger than the other four estimates, while its bias and standard error are smaller than those of the other four methods.

It should be mentioned that, for the first four estimates, neither the traditional nonparametric bootstrap method nor the parametric bootstrap method of this paper is valid. Here we adopt the subsample bootstrap method, which was initially proposed by Hall (1990). The related bias and standard errors are obtained via a broader subsample bootstrap method proposed by Danielsson et al. (2001).

5.5 Model Diagnosis

In this subsection, we check whether the model fits well or not. As pointed out by Coles (2001), when u > u0 and nu is sufficiently large, the MLEs of ξ should remain more or less the same. Table 6 displays the MLEs of ξ under different thresholds u. It can be seen that, when u > u0, ξ̂ stays nearly constant, which means that the choice of threshold u0 = 1.5 is reasonable.

The probability plot, QQ plot and density plot are also widely used in checking the quality of a fitted GPD model. Coles (2001) used these plots to check whether the model fits well or not. Suppose that the threshold is u0 and y(1) ≤ y(2) ≤ · · · ≤ y(n) are the threshold excesses. Denoting the estimated model by G(y; θ̂), the probability plot consists of the pairs

{((i − 1/2)/n, G(y(i); θ̂)); i = 1, 2, . . . , n},


Fig. 3 Probability plot (model probabilities against empirical probabilities)

where

G(y; θ̂) = 1 − (1 + ξ̂y/σ̂u)^(−1/ξ̂), ξ̂ ≠ 0;
G(y; θ̂) = 1 − exp(−y/σ̂u), ξ̂ = 0.

If the GPD model fits well, the probability plot should be approximately linear. The QQ plot consists of the pairs

{(G^(−1)((i − 1/2)/n; θ̂), y(i)); i = 1, 2, . . . , n},

where

G^(−1)(y; θ̂) = (σ̂u/ξ̂)[(1 − y)^(−ξ̂) − 1], ξ̂ ≠ 0;
G^(−1)(y; θ̂) = −σ̂u log(1 − y), ξ̂ = 0.

The QQ plot has the same property, linearity, as the probability plot when the GPD model is reasonable.
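The plotting coordinates defined above can be computed directly; a minimal sketch, assuming fitted values `xi` and `sigma` are given (names are ours, and `scipy.stats.genpareto` supplies the cdf and quantile function):

```python
import numpy as np
from scipy.stats import genpareto

def gpd_diagnostic_points(excesses, xi, sigma):
    """Coordinates for the probability and QQ plots of a fitted GPD."""
    y = np.sort(np.asarray(excesses, dtype=float))
    n = y.size
    p = (np.arange(1, n + 1) - 0.5) / n  # plotting positions (i - 1/2)/n
    g = genpareto(xi, loc=0, scale=sigma)
    prob_points = (p, g.cdf(y))          # probability plot pairs
    qq_points = (g.ppf(p), y)            # QQ plot pairs
    return prob_points, qq_points
```

Both sets of pairs should fall close to the diagonal when the fitted GPD is adequate.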

Finally, in the density plot, the p.d.f. of the fitted GPD should approximately agree with the histogram of the threshold excesses if the model fits well.

Figures 3, 4, and 5 respectively present the probability plot, QQ plot and density plot of the model. From these plots, the model may be considered to agree with the data well.


Fig. 4 QQ plot (model quantiles against empirical quantiles)

Fig. 5 Density plot (fitted GPD density g(x) against the histogram of threshold excesses)

5.6 Estimates of VaR

Here we consider the VaR of the closing price data of the Shanghai stock market, which is the quantile of the GPD discussed in Sect. 3. An estimate of xm can be obtained directly:

x̂m = u + (σ̂u/ξ̂)[(mζ̂u)^ξ̂ − 1], (28)

where σ̂u, ξ̂ are the MLEs of σu, ξ, and ζ̂u = nu/n. Suppose that the number of opening days in each year is m + 1; then the number of closing price data points is m. From (24),


Table 7 Estimates of the return levels for different levels m

Level m   x̂m      Standard error   Confidence interval
5         1.1090   3.0017e−2        (1.0583, 1.1096)
10        1.8958   2.7635e−2        (1.8752, 1.901)
25        3.0826   9.0244e−2        (3.0326, 3.1701)
50        4.1064   1.4791e−1        (3.9133, 4.4038)
100       5.2531   2.3882e−1        (4.9682, 5.4705)

it can be inferred that

Zt−1 − Zt = (e^(xm/100) − 1)Zt.

In other words, if Zt = 2000 and m = 260, then x̂m = 7.063, and the corresponding maximum daily price drop in that year is approximately 146.37.

Also, for different levels m, xm can be estimated by Eq. 28; the relevant standard error can be estimated by the parametric bootstrap method, and the confidence intervals can be obtained by the BCα interval method introduced in Sect. 3. Table 7 shows the estimates of the return levels for different levels m, the relevant standard errors and the 90% confidence intervals. It is shown that, as the level m increases, the standard error of the estimate and the length of the confidence interval tend to become larger.
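Eq. 28 and the price-drop inference above can be reproduced numerically. A short sketch (the function name is ours; the parameter values are the fitted ones from this section):

```python
import numpy as np

def return_level(m, u, sigma, xi, zeta_u):
    """Return level x_m of Eq. 28: the level exceeded on average once every
    m observations, with zeta_u = n_u / n the exceedance rate."""
    return u + (sigma / xi) * ((m * zeta_u) ** xi - 1.0)

# Plugging in the fitted values from this section (u0 = 1.5, sigma_u = 1.1359,
# xi = 0.1636, zeta_u = 421/3000) reproduces Table 7, e.g. for m = 100:
x100 = return_level(100, 1.5, 1.1359, 0.1636, 421 / 3000)   # ~5.25
# and the implied one-day price drop from Z_t = 2000 at x_m = 7.063:
drop = (np.exp(7.063 / 100) - 1.0) * 2000                   # ~146.4
```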

6 Conclusion

This paper analyzes the closing price data of the Shanghai stock market with the GPD model. The MLEs are obtained by using the interior penalty function algorithm. Based on the MLEs, the problem of hypothesis testing for the shape parameter is considered. The empirical distribution of the test statistic is obtained by the parametric bootstrap method, and the size and power of the test are simulated. In analyzing the closing price data of the Shanghai stock market, the GPD model is fitted and verified to be satisfactory by model diagnostics. The estimate of the tail index in this paper is compared with those obtained via classical methods. The VaR is estimated and the relevant confidence intervals are also presented. Simulations using the Monte Carlo method demonstrate that the interior penalty function algorithm performs better than the other methods when the sample size is sufficiently large.

Acknowledgment This project was supported by NSFC 11001052.

References

Anderson, T. W. (1971). The statistical analysis of time series. New York: Wiley.
Angus, J. E. (1993). Asymptotic theory for bootstrapping the extremes. Communications in Statistics: Theory and Methods, 22, 15–30.
Auslender, A. (1999). Penalty and barrier methods: A unified framework. SIAM Journal on Optimization, 10, 211–230.
Balkema, A. A., & de Haan, L. (1974). Residual lifetime at great age. Annals of Probability, 2, 792–804.
Box, G. E. P., & Jenkins, G. M. (1976). Time series analysis: Forecasting and control. San Francisco: Holden-Day.
Castillo, E., & Hadi, A. S. (1997). Fitting the generalized Pareto distribution to data. Journal of the American Statistical Association, 92, 1609–1620.
Clauset, A., Shalizi, C. R., & Newman, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661–703.
Coles, S. (2001). An introduction to statistical modeling of extreme values. London: Springer-Verlag.
Danielsson, J., de Haan, L., Peng, L., & de Vries, C. G. (2001). Using a bootstrap method to choose the sample fraction in tail index estimation. Journal of Multivariate Analysis, 76, 226–248.
Davison, A. C. (1984). Modelling excesses over high thresholds, with an application. In J. Tiago de Oliveira (Ed.), Statistical extremes and applications (pp. 461–482). Dordrecht: Reidel.
Davison, A. C., & Smith, R. L. (1990). Models for exceedances over high thresholds (with discussion). Journal of the Royal Statistical Society B, 52, 393–442.
de Haan, L., & Ferreira, A. (2006). Extreme value theory: An introduction. New York: Springer.
de Haan, L., & Peng, L. (1998). Comparison of tail index estimators. Statistica Neerlandica, 52(1), 60–70.
Dekkers, A. L. M., Einmahl, J. H. J., & de Haan, L. (1989). A moment estimator for the index of an extreme value distribution. Annals of Statistics, 17, 1833–1855.
Di Pillo, G., & Grippo, L. (1989). Exact penalty functions in constrained optimization. SIAM Journal on Control and Optimization, 27, 1333–1360.
Drees, H., & Kaufmann, E. (1998). Selecting the optimal sample fraction in univariate extreme value estimation. Stochastic Processes and Their Applications, 75(2), 149–172.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997). Modelling extreme events for insurance and finance. New York: Springer.
Eremin, I. I. (1971). The penalty method in convex programming. Cybernetics, 3, 53–56.
Fan, J. Q., & Yao, Q. W. (2003). Nonlinear time series: Nonparametric and parametric methods. New York: Springer-Verlag.
Fisher, R. A., & Tippett, L. H. C. (1928). Limiting forms of the frequency distributions of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society, 24, 180.
Gnedenko, B. V. (1943). Sur la distribution limite du terme maximum d'une série aléatoire. Annals of Mathematics, 44, 423.
Hall, P. (1990). Using the bootstrap to estimate mean squared error and select smoothing parameter in nonparametric problems. Journal of Multivariate Analysis, 32, 177–203.
Hill, B. M. (1975). A simple general approach to inference about the tail of a distribution. Annals of Statistics, 3, 1163–1174.
Hosking, J. R. M. (1984). Testing whether the shape parameter is zero in the generalized extreme value distribution. Biometrika, 71, 367–374.
Hosking, J. R. M., & Wallis, J. R. (1987). Parameter and quantile estimation for the generalized Pareto distribution. Technometrics, 29, 339–349.
Hosking, J. R. M., Wallis, J. R., & Wood, E. F. (1985). Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics, 27, 251–261.
Jansen, D. W., & de Vries, C. G. (1991). On the frequency of large stock returns: Putting booms and busts into perspective. The Review of Economics and Statistics, 73(1), 18–24.
Kearns, P., & Pagan, A. (1997). Estimating the density tail index for financial time series. The Review of Economics and Statistics, 79(2), 171–175.
Kedem, B., & Fokianos, K. (2002). Regression models for time series analysis. New York: Wiley.
McNeil, A. J., Frey, R., & Embrechts, P. (2005). Quantitative risk management: Concepts, techniques and tools. Princeton: Princeton University Press.
Pickands, J. (1975). Statistical inference using extreme order statistics. The Annals of Statistics, 3(1), 119–131.
Rasmussen, P. F. (2001). Generalized probability weighted moments: Application to the generalized Pareto distribution. Water Resources Research, 37(6), 1745–1751.
Smith, R. L. (1985). Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.
Zangwill, W. (1963). Non-linear programming via penalty functions. Management Science, 13, 344–358.
Zhang, J. (2007). Likelihood moment estimation for the generalized Pareto distribution. Australian & New Zealand Journal of Statistics, 49(1), 69–77.
