Copyright © 2008 John Wiley & Sons, Ltd.

Forecasting Volatility with Outliers in GARCH Models

AMÉLIE CHARLES
Audencia Nantes, School of Management, France

ABSTRACT

In this paper, we detect and correct abnormal returns in 17 French stock returns and the French CAC40 index using the additive-outlier detection method for GARCH models developed by Franses and Ghijsels (1999) and extended to innovative outliers by Charles and Darné (2005). We study the effects of outlying observations on several popular econometric tests. Moreover, we show that the parameters of the equation governing the volatility dynamics are biased when additive and innovative outliers are not taken into account. Finally, we show that volatility forecasts are better when the data are cleaned of outliers, at several forecast horizons (short, medium and long term), even when compared with a GARCH-t process.

INTRODUCTION

In finance and financial economics, modelling the volatility of returns is fundamental to risk management, derivative pricing and hedging, market making, market timing, portfolio selection, monetary policy making, and many other financial activities. A good forecast of the volatility of asset prices over the investment holding period is a good starting point for assessing investment risk. Financial stock returns do not match the familiar bell-shaped normal distribution. Indeed, high-frequency time series of returns on financial assets typically exhibit excess kurtosis. Engle (1982) introduced the ARCH model, which was generalized by Bollerslev (1986) to capture the excess kurtosis. GARCH models became popular in both theoretical and empirical work. More precisely, GARCH(1,1) has been shown to represent adequately the daily returns of most financial time series (Andersen and Bollerslev, 1998; among others). However, even if GARCH processes are able to represent the dynamics of returns, they cannot capture all excess kurtosis.

This has naturally led to the use of non-normal distributions to model this excess kurtosis. Bollerslev (1987), among others, used a Student distribution, while Nelson (1991) suggested the generalized error distribution. Other propositions include mixture distributions such as the normal–Poisson (Jorion, 1988), the normal–lognormal (Hsieh, 1989) or the Bernoulli–normal (Vlaar and Palm, 1993).

Journal of Forecasting, J. Forecast. 27, 551–565 (2008). Published online 5 September 2008 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/for.1065

* Correspondence to: Amélie Charles, Audencia Nantes, School of Management, 8 route de la Jonelière, BP 31222, 44312 Nantes. E-mail: [email protected]


Excess kurtosis could also be explained by the presence of outliers (Baillie and Bollerslev, 1989). As in linear models, outliers affect the identification and estimation of GARCH models (Carnero et al., 2007). It is known that these observations may wrongly suggest conditional heteroscedasticity and may also hide true heteroscedasticity (van Dijk et al., 2002). With respect to estimation, Sakata and White (1998) showed that quasi-maximum likelihood (QML) estimators can be severely affected by a small number of outliers such as market crashes and rallies. Carnero et al. (2007) used Monte Carlo experiments to analyse the finite-sample behaviour, in the presence of outliers, of a QML estimator based on maximizing the Student likelihood when the conditional distribution is Gaussian. They concluded that this estimator is robust even when the sample is moderate and the outliers are relatively large. Verhoeven and McAleer (2000) studied empirically the effects of outliers on the AR(1)-GARCH(1,1) process by analysing 1000 trading days of five financial time series. They found that outliers tend to dominate QML estimates, resulting in larger ARCH and smaller GARCH estimates, and may give rise to spurious AR(1) and ARCH effects. Recently, Carnero et al. (2007) noted that, for all contaminated series, the constant $\alpha_0$ is overestimated, whether the series is contaminated with consecutive or isolated outliers.1 For the estimates of $\alpha_1$ and $\beta$, the biases depend on the nature of the outliers: if they are consecutive, $\alpha_1$ is overestimated and $\beta$ is underestimated, while for isolated outliers the direction of the bias is unclear, and both $\alpha_1$ and $\beta$ can be overestimated or underestimated.

There are several procedures to detect outliers in GARCH models. Hotta and Tsay (1999) proposed two test statistics to detect outliers in GARCH processes. They applied their tests to simulated and real examples and concluded that the tests work well in both applications. Franses and Ghijsels (1999) and Franses and van Dijk (2002) proposed to modify the Chen and Liu (1993) method to correct for additive outliers in stock market returns, when GARCH models are used for forecasting volatility. Charles and Darné (2005) extended the procedure of Franses and Ghijsels (1999) to take into account innovative outliers. Indeed, Balke and Fomby (1994) found that most detected outliers in time series are innovative outliers, especially in high-frequency data. Doornik and Ooms (2002) distinguished between outliers that only affect level and those that also affect conditional variance. Other authors proposed robust methods to estimate the parameters, which avoid the problem of identifying outliers (Sakata and White, 1998; Park, 2002; among others).

Although these robust methods are relatively straightforward, they have several disadvantages: (i) robust methods perform well in some cases but poorly in others; (ii) since outlying observations are not adjusted, the impact of outliers on forecasts remains; (iii) in most cases, only limited information on the outlier may be obtained (e.g. from the weights applied to the residuals).

To our knowledge, few papers analyse the effects of outliers on out-of-sample forecasting performance at different horizons. Franses and Ghijsels (1999) showed that outliers bias the estimation of ARCH and GARCH parameters and, consequently, the volatility forecasts. They found a substantial improvement in forecasting using data corrected for outliers over GARCH models with a Student distribution for the original returns. Verhoeven and McAleer (2000) found that, when data are corrected for outliers, volatility forecasts are improved substantially for periods of low volatility clustering but not for periods of high volatility clustering.2 Park (2002)

1 The authors consider the following process: $\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2$.

2 The authors consider recursive estimation and forecasting of the volatility of daily returns. They compute one-day-ahead volatility forecasts. According to them, deterioration in the forecast accuracy for high-volatility periods can be explained by the fact that outlying observations are frequently clustered with other large observations, so that removing an outlier will reduce the forecast accuracy of subsequent large observations.

proposed a new approach to take outliers into account, based on a robust GARCH model (RGARCH). He showed that the RGARCH model outperforms the GARCH model and the random walk model. The author notes that the out-of-sample volatility forecasts of the RGARCH model are superior to those of other competing models.3

In this paper we compare the method developed by Charles and Darné (2005) with the very popular GARCH-t model. The outline of this paper is as follows. The next section describes the modelling of outliers in a GARCH model as well as the outlier identification procedure proposed by Charles and Darné (2005). In the third section, we apply this procedure to 17 French daily stock returns and the CAC40 index, and we examine the effects of outliers on the diagnostics of normality from linear and nonlinear modelling. The fourth section evaluates and compares the forecasting performance of GARCH models from outlier-uncorrected and corrected series and a GARCH-t process using Diebold–Mariano tests for equal predictive ability. We conclude in the fifth section.

METHODOLOGY

Outliers are aberrant observations that lie far from the rest of the data. They can be caused by recording errors or unusual events such as changes in economic policies, wars, disasters, financial crises and so on. They are also likely to occur if errors have fat-tailed distributions, as in the case of financial time series. These observations may take several forms in time series. The first and most commonly studied is the additive outlier (AO), which only affects a single observation. In contrast, an innovative outlier (IO) affects several observations. Balke and Fomby (1994) found that many of the detected outliers in financial time series are IOs, especially for data at a high frequency.

Charles and Darné (2005) extended the additive-outlier detection method in GARCH models developed by Franses and Ghijsels (1999) to innovative outliers,4 using the Chen and Liu (1993) approach.

Consider the returns series $\varepsilon_t$, which is defined by $\varepsilon_t = \log p_t - \log p_{t-1}$, where $p_t$ is the observed price at time $t$, and consider the GARCH(1,1) model

$$\varepsilon_t = z_t \sqrt{h_t}, \qquad \varepsilon_t \sim N(0, h_t), \qquad z_t \sim \text{i.i.d. } N(0, 1) \tag{1}$$

$$h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1} \tag{2}$$

where $\alpha_0 > 0$, $\alpha_1 \geq 0$, $\beta_1 \geq 0$ and $\alpha_1 + \beta_1 < 1$, such that the model is covariance-stationary. The GARCH(1, 1) model can be rewritten, strictly speaking,5 as an ARMA(1, 1) model for $\varepsilon_t^2$ (Bollerslev, 1986):

3 Empirical analysis supports the dominance of the RGARCH model, when it is compared with GARCH, GARCH-t, EGARCH and random walk models in terms of one-step-ahead volatility forecasts.

4 Some Monte Carlo experiments, which are not reported, have been done to study the power of the test. Results indicate that the outliers introduced in a simulated GARCH process are well detected. These Monte Carlo experiments will be the object of future research. In this way, comparisons with other tests will be interesting.

5 It is important to note that $\nu_t$ is not a white noise process and thus equation (3) does not exactly represent an ARMA process.

$$\varepsilon_t^2 = \alpha_0 + (\alpha_1 + \beta_1)\varepsilon_{t-1}^2 + \nu_t - \beta_1 \nu_{t-1} \tag{3}$$

where $\nu_t = \varepsilon_t^2 - h_t$. This analogy of the GARCH model with an ARMA model allows one to directly adapt the method of Chen and Liu (1993) to detect and correct AOs and IOs in GARCH models. Specifically, suppose that instead of the true series $\varepsilon_t$ one observes the series $e_t$, which is defined as

$$e_t^2 = \varepsilon_t^2 + \omega_i \xi_i(B) I_t(\tau) \quad \text{with } i = 1, 2 \tag{4}$$

where $I_t(\tau)$ is the indicator function defined as $I_t(\tau) = 1$ if $t = \tau$ and zero otherwise, $\tau$ is the date at which the outlier occurs, $\omega_i$ is the magnitude of the outlier effect, and $\xi_i(B)$ represents its dynamic pattern, with $\xi_1(B) = 1$ for an AO and $\xi_2(B) = (1 - \beta_1 B)(1 - (\alpha_1 + \beta_1)B)^{-1}$ for an IO.

An AO is related to an exogenous change that directly affects the series, and only at the observation at time $t = \tau$. An IO is possibly generated by an endogenous change in the series, and affects all the observations after time $\tau$ through the memory of the process.
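To make the difference between the two dynamic patterns concrete, the following minimal sketch expands the IO filter $(1 - \beta_1 B)(1 - (\alpha_1 + \beta_1)B)^{-1}$ into its lag coefficients and compares the effect of an AO and an IO on the squared series. The function names and the parameter values (0.1, 0.8 and an outlier size of 10) are hypothetical and chosen only for illustration.

```python
def psi_weights(alpha1, beta1, n):
    """Return psi_0, ..., psi_{n-1} of (1 - beta1*B) / (1 - (alpha1 + beta1)*B)."""
    phi = alpha1 + beta1
    # Expanding the ratio gives psi_0 = 1 and psi_j = alpha1 * phi**(j - 1) for j >= 1.
    return [1.0 if j == 0 else alpha1 * phi ** (j - 1) for j in range(n)]


def outlier_effect(kind, omega, alpha1, beta1, n):
    """Effect of an outlier of size omega on e_t^2 at dates tau, ..., tau + n - 1."""
    if kind == "AO":
        # xi_1(B) = 1: only the observation at t = tau is affected.
        return [omega] + [0.0] * (n - 1)
    if kind == "IO":
        # xi_2(B) is the filter above: the shock decays through the process memory.
        return [omega * w for w in psi_weights(alpha1, beta1, n)]
    raise ValueError("kind must be 'AO' or 'IO'")


# Illustrative (hypothetical) values: alpha1 = 0.1, beta1 = 0.8, omega = 10.
ao = outlier_effect("AO", 10.0, 0.1, 0.8, 4)
io = outlier_effect("IO", 10.0, 0.1, 0.8, 4)
```

With these values the AO effect vanishes after the first date, whereas the IO effect decays geometrically at rate $\alpha_1 + \beta_1$.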

The residuals $\eta_t$ of the observed series $e_t^2$ are given by

$$\eta_t = \frac{-\alpha_0}{1 - \beta_1 B} + \pi(B) e_t^2 = \nu_t + \pi(B)\xi_i(B)\omega_i I_t(\tau) \tag{5}$$

where $\pi(B) = (1 - (\alpha_1 + \beta_1)B)(1 - \beta_1 B)^{-1}$. Expression (5) can be interpreted as a regression model for $\eta_t$, i.e.

$$\eta_t = \omega_i x_{it} + \nu_t \tag{6}$$

with $x_{it} = 0$ for $i = 1, 2$ and $t < \tau$, $x_{it} = 1$ for $i = 1, 2$ and $t = \tau$, $x_{1,\tau+k} = -\pi_k$ (for AO) and $x_{2,\tau+k} = 0$ (for IO) for $t > \tau$ and $k > 0$.

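Outlier detection is based on the maximum value of the standardized statistics of the outlier effects:

$$\text{AO:}\quad \hat\tau_1(\tau) = \frac{\hat\omega_1(\tau)}{\hat\sigma_\nu}\left(\sum_{t=\tau}^{n} x_{1t}^2\right)^{1/2} = \frac{\sum_{t=\tau}^{n} x_{1t}\eta_t}{\hat\sigma_\nu}\left(\sum_{t=\tau}^{n} x_{1t}^2\right)^{-1/2}$$

$$\text{IO:}\quad \hat\tau_2(\tau) = \frac{\hat\omega_2(\tau)}{\hat\sigma_\nu} = \frac{\eta_\tau}{\hat\sigma_\nu}$$

where $\hat\sigma_\nu^2$ denotes the estimated variance of the residual process.6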
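As an illustration of how these two statistics might be computed for one candidate date, the sketch below takes a residual series and GARCH(1, 1) parameters as given. Several choices are assumptions rather than details confirmed by the paper: the convention $\pi(B) = 1 - \sum_k \pi_k B^k$ (which gives $\pi_k = \alpha_1 \beta_1^{k-1}$), the use of a sample variance around the mean with the candidate observation omitted, zero-based indexing, and all names and numerical values.

```python
import math


def pi_weights(alpha1, beta1, n):
    """pi_1, ..., pi_n, assuming pi(B) = 1 - sum_k pi_k B^k (assumed convention)."""
    return [alpha1 * beta1 ** (k - 1) for k in range(1, n + 1)]


def outlier_statistics(eta, tau, alpha1, beta1):
    """Return (omega_ao, stat_ao, omega_io, stat_io) for candidate date tau (0-based)."""
    n = len(eta)
    # AO regressor: 1 at tau, then -pi_k at tau + k.
    x = [1.0] + [-p for p in pi_weights(alpha1, beta1, n - tau - 1)]
    sxx = sum(v * v for v in x)
    sxe = sum(v * e for v, e in zip(x, eta[tau:]))
    omega_ao = sxe / sxx
    # IO regressor is 1 at tau and 0 afterwards, so the estimate is eta at tau.
    omega_io = eta[tau]
    # "Omit-one" variance: drop the candidate observation (assumed implementation).
    rest = [e for t, e in enumerate(eta) if t != tau]
    mean = sum(rest) / len(rest)
    sigma = math.sqrt(sum((e - mean) ** 2 for e in rest) / (len(rest) - 1))
    return omega_ao, omega_ao * math.sqrt(sxx) / sigma, omega_io, omega_io / sigma


# Hypothetical residuals with one large value at index 3.
eta = [0.1, -0.2, 0.05, 8.0, -0.1, 0.15, -0.05, 0.0]
omega_ao, stat_ao, omega_io, stat_io = outlier_statistics(eta, 3, 0.1, 0.8)
```

In a full detection pass the function would be evaluated at every date, and the largest absolute statistic compared with the critical value.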

The outlier detection method for GARCH(1, 1) models then consists of the following steps:

1. Estimate a GARCH(1, 1) model for the observed series $e_t$ and obtain estimates of the conditional variance $\hat h_t$ and $\hat\eta_t = e_t^2 - \hat h_t$.

2. Obtain estimates $\hat\omega_i$ ($i = 1, 2$) for all possible $\tau = 1, \ldots, n$, and compute $\hat\tau_{\max} = \max_{1 \leq \tau \leq n} |\hat\tau_i(\tau)|$. If the value of the test statistic exceeds the critical value $C$, an outlier is detected at the observation for which $\hat\tau$ is maximized.

6 The estimated variance of the residual process is obtained from the ‘omit-one’ method, which computes the error variance from the sample where the observation at $t = \tau$ has been deleted (Franses and Ghijsels, 1999; Chen and Liu, 1993).



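3. Replace $e_\tau^2$ with

$$\text{AO:}\quad e_\tau^{*2} = e_\tau^2 - \hat\omega_1$$

$$\text{IO:}\quad e_{\tau+j}^{*2} = e_{\tau+j}^2 - \hat\omega_2 \psi_j \quad \text{with } j > 0$$

where $\psi(B) = \pi(B)^{-1}$. The outlier-corrected series $e_t^*$ is defined as

$$\text{AO:}\quad e_t^* = \begin{cases} e_t & \text{for } t \neq \tau \\ \operatorname{sign}(e_t)\sqrt{e_t^{*2}} & \text{for } t = \tau \end{cases}$$

$$\text{IO:}\quad e_t^* = \begin{cases} e_t & \text{for } t < \tau \\ \operatorname{sign}(e_t)\sqrt{e_t^{*2}} & \text{for } t = \tau + j,\ j > 0 \end{cases}$$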

4. Return to step 1 to estimate a GARCH(1, 1) model for the series $e_t^*$, and repeat all steps until no $\hat\tau_{\max}$ test statistic exceeds the critical value $C$.
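The correction in step 3 can be sketched as follows, taking the outlier date, its estimated magnitude and the $\psi$ weights as given. This is only an illustration under stated assumptions: indexing is zero-based, the squared value is floored at zero before taking the square root (a safeguard not mentioned in the paper), the IO correction is applied for $j > 0$ as stated above, and the function name and numbers are hypothetical.

```python
import math


def correct_series(e, tau, omega, kind, psi=(1.0,)):
    """Return the outlier-corrected series e* for one detected outlier (tau is 0-based)."""
    e_star = list(e)
    if kind == "AO":
        shifts = {tau: omega}
    elif kind == "IO":
        # Correction at dates tau + j, j > 0, scaled by the psi weights.
        shifts = {tau + j: omega * psi[j]
                  for j in range(1, len(psi)) if tau + j < len(e)}
    else:
        raise ValueError("kind must be 'AO' or 'IO'")
    for t, shift in shifts.items():
        # Assumption: floor at zero so the square root is always defined.
        corrected_sq = max(e[t] ** 2 - shift, 0.0)
        # Keep the sign of the original observation.
        e_star[t] = math.copysign(math.sqrt(corrected_sq), e[t])
    return e_star


# Hypothetical examples.
ao_fixed = correct_series([0.5, -3.0, 0.4], tau=1, omega=8.0, kind="AO")
psi = [1.0 if j == 0 else 0.1 * (0.1 + 0.8) ** (j - 1) for j in range(3)]
io_fixed = correct_series([0.5, -3.0, 1.0, -0.9], tau=1, omega=5.0, kind="IO", psi=psi)
```

In the AO example the squared value 9 is reduced by 8, so the corrected observation keeps its negative sign with magnitude 1.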

A critical value $C = 10$ is used, which is considered a low-sensitivity value for our sample size (Verhoeven and McAleer, 2000). This choice of $C$ is based on simulation experiments proposed by Franses and van Dijk (2002).7 The authors simulate some percentiles of the distribution of the $\hat\tau_{\max}$ statistic under the null hypothesis that no outliers are present, for several values of the ARCH and GARCH parameters and for two sample sizes (250 and 500). The value $C = 10$ is reasonably close to the 90th percentile of this distribution for most parameter combinations.

APPLICATION TO THE FRENCH STOCK MARKET

We investigate the daily returns of the French CAC40 index and 17 of the French stocks included in it over the period from 6 January 1997 to 4 April 2002, comprising 1435 observations. From the closing prices, returns are computed as $R_{i,t} = (P_{i,t} - P_{i,t-1})/P_{i,t-1}$, where $P_{i,t}$ denotes the closing price of asset $i$ at time $t$. The data come from Thomson Financial Datastream.
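The return definition used here is the simple (arithmetic) return, whereas the methodology section defines returns as log differences. The short sketch below computes both from a list of closing prices; for small daily moves the two are numerically close. The prices and function names are hypothetical.

```python
import math


def simple_returns(prices):
    """R_t = (P_t - P_{t-1}) / P_{t-1}, as in the application section."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def log_returns(prices):
    """log P_t - log P_{t-1}, as in the methodology section."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical closing prices.
prices = [100.0, 102.0, 99.96]
simple = simple_returns(prices)
logs = log_returns(prices)
```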

As expected, some outliers are found in the daily data when we apply the previous method. Furthermore, many of the detected large shocks seem to be associated with the September 11 terrorist attacks. This result confirms those of Chen and Siems (2004). Results announcements and the Vivendi Universal business, among others, may explain the abnormal returns.8

We analyse the descriptive statistics of residuals computed from AR models.9 These analyses are done both for the original series and the outlier-corrected data to see how the evidence of non-normality and conditional heteroscedasticity is altered by taking the outliers into account (Table I).

The normality statistics indicate that none of the unadjusted time series is normally distributed. Adjusting for outliers brings the distribution of the standardized residuals closer to normality, reducing skewness and excess kurtosis. We note that the outliers may be responsible for the asymmetries (Cac40, Accor, Alcatel, Bnp, Carrefour and Vivendi) but they may also hide the presence of asymmetry in the returns (Air Liquide, L’Oréal and Peugeot). Excess kurtosis still remains significant for all series

7 Franses and Ghijsels (1999) and Beine and Laurent (2003) used a low critical value (C = 4), which tends to correct the outliers as well as the extreme values, making the return distribution platykurtic.

8 Results for all series are not reported, to save space, but they are available from the author upon request.

9 The estimated AR models for each series are available from the author upon request.

but the values are smaller than when the series are not corrected for outlying observations. Evidence of conditional heteroscedasticity is found for all the unadjusted and outlier-adjusted time series. These results are not surprising because it is well known that financial returns are mainly characterized by high kurtosis and volatility clustering.
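The Jarque–Bera column of Table I can be related to the skewness and kurtosis columns through the textbook formula $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$. The sketch below assumes this standard form (the paper does not state which variant it uses) and plugs in the Air Liquide unadjusted moments from Table I with $n = 1435$. The result is close to, but not identical with, the reported 849.62, plausibly because the tabulated moments are rounded and the AR fit uses slightly fewer observations.

```python
def skewness_kurtosis(x):
    """Sample skewness and (non-excess) kurtosis from central moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def jarque_bera(n, skewness, kurtosis):
    """Textbook Jarque-Bera statistic; kurtosis is the raw value (normal = 3)."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Air Liquide, unadjusted series: skewness -4.75E-4 and kurtosis 6.77 (Table I).
jb_air_liquide = jarque_bera(1435, -4.75e-4, 6.77)
```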

Nevertheless, linear models with Gaussian innovations do not adequately characterize daily financial time series. Indeed, it is well known that financial data display volatility clustering, namely periods where volatility is larger than in other periods. An important feature of a GARCH model is that it can be fitted to data which have excess kurtosis. Therefore, we examine the effects of modelling the outlier-corrected series with a GARCH model. We use an AR(p)-GARCH(1, 1) model to analyse outlier-unadjusted and adjusted returns, as this is the most commonly used parametric model for examining time-varying volatility.

As underlined by Verhoeven and McAleer (2000), validation of the regularity conditions of the GARCH(1, 1) model is of interest for forecasting for several reasons. It is possible to provide forecasts with appropriate standard errors only when these conditions are satisfied. Furthermore, if both the second and fourth moment conditions are satisfied, valid inferences are possible using the t-statistic for the significance of the parameter estimates. The regularity conditions of a GARCH(1, 1) model are defined as follows:

$$E[\varepsilon_t^2]: \quad \alpha_1 + \beta_1 < 1$$

$$E[\varepsilon_t^4]: \quad 3\alpha_1^2 + 2\alpha_1\beta_1 + \beta_1^2 < 1$$

where the $\alpha_1$ parameter measures the impact of the innovation in the return in the previous period, giving rise to more abrupt changes in volatility, and the $\beta_1$ parameter measures the impact of the previous conditional variance on its current value, giving rise to more moderate changes in volatility.
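Both conditions are simple to check once estimates are available. The sketch below evaluates them for the rounded Vivendi estimates on the unadjusted series discussed later in this section ($\alpha_1 = 0.09$, $\beta_1 = 0.89$), and for a second, purely hypothetical pair that satisfies the second-moment condition but not the fourth. The function name and return format are illustrative assumptions.

```python
def garch11_moment_conditions(alpha1, beta1):
    """Evaluate the second- and fourth-moment conditions of a GARCH(1, 1) model."""
    second = alpha1 + beta1
    fourth = 3 * alpha1 ** 2 + 2 * alpha1 * beta1 + beta1 ** 2
    return {
        "second": second,
        "fourth": fourth,
        "second_ok": second < 1,
        "fourth_ok": fourth < 1,
    }


# Rounded Vivendi estimates (unadjusted series) quoted in the text.
vivendi = garch11_moment_conditions(0.09, 0.89)
# Hypothetical values: persistence below one, but fourth moment condition violated.
heavy = garch11_moment_conditions(0.30, 0.69)
```

Because the published estimates are rounded to two decimals, such a check is only indicative.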

All the series are modelled with a GARCH(1,1) model. Table II displays estimates of GARCH(1, 1) models for some series, before and after applying the outlier correction method.10 All estimates are significant at 5%.

Table I. Descriptive statistics of residuals from the AR models

Series       Type        Skewness   Kurtosis  Jarque–Bera  LM(10)    LB2(10)
Air Liquide  Unadjusted  −4.75E−4   6.77*     849.62*      72.01*    99.62*
             Adjusted    0.27*      3.95*     70.51*       125.18*   255.42*
Alcatel      Unadjusted  −0.79*     13.23*    6,387.41*    29.58*    34.04*
             Adjusted    0.10       3.89*     49.95*       113.19*   190.70*
Danone       Unadjusted  0.04       5.63*     414.11*      55.55*    78.08*
             Adjusted    0.12       3.86*     46.97*       117.41*   256.88*
L’Oréal      Unadjusted  0.10       4.56*     147.03*      164.41*   371.06*
             Adjusted    0.18*      4.26*     102.32*      186.31*   445.74*
Peugeot      Unadjusted  −0.05      6.16*     598.71*      134.16*   231.82*
             Adjusted    0.15*      4.31*     108.05*      345.10*   1,230.30*
Total        Unadjusted  −0.03      4.50*     133.81*      42.53*    59.46*
             Adjusted    0.05       3.53*     17.54*       86.86*    146.24*
Vivendi      Unadjusted  −1.50*     19.13*    16,086.87*   658.31*   697.14*
             Adjusted    0.07       3.91*     50.42*       508.45*   1,666.90*

Note: * significant at 5%. The results for all series are not presented to save space but they are available from the author upon request.

10 Note that the regularity conditions of GARCH(1, 1) models are satisfied for all the series. The results of estimates for all series are not presented to save space but they are available from the author upon request.



Table II. Estimates of GARCH(1, 1) models for daily returns and descriptive statistics of the standardized residuals

Series   Type        α1            β1             α1 + β1  Skewness  Kurtosis  Jarque–Bera  Q(20)  Q2(20)  LM(10)  t(b1)  t(b2)  t(b3)  TR2
Alcatel  Unadjusted  0.11 (9.32*)  0.87 (53.94*)  0.98     −0.15     8.79*     2008.3*      25.02  8.51    5.03    0.59   1.40   0.58   4.25
         Adjusted    0.05 (5.87*)  0.94 (96.58*)  0.99     0.20      3.75*     41.50*       17.71  23.56   17.15   0.66   2.67*  0.97   8.20*
Danone   Unadjusted  0.08 (7.63*)  0.89 (75.75*)  0.97     0.12      3.72*     34.58*       23.44  10.67   5.97    0.24   1.32   0.21   3.11
         Adjusted    0.14 (7.01*)  0.80 (30.51*)  0.94     0.12      3.72*     34.58*       23.44  10.67   5.97    0.14   1.34   0.03   3.69
Total    Unadjusted  0.05 (4.57*)  0.94 (82.88*)  0.99     0.05      3.38*     9.26*        22.53  17.35   9.37    0.01   1.15   0.83   2.06
         Adjusted    0.04 (4.46*)  0.95 (69.43*)  0.99     0.05      3.38*     9.26*        22.53  17.35   9.37    0.20   1.58   0.36   3.26
Vivendi  Unadjusted  0.09 (9.61*)  0.89 (82.95*)  0.98     0.23      3.42*     23.40*       12.60  18.49   7.19    0.21   1.23   0.23   2.88
         Adjusted    0.07 (5.55*)  0.92 (67.63*)  0.99     0.23      3.42*     23.40*       12.60  18.49   7.19    0.56   1.27   0.25   1.69

Note: t(b1), t(b2), t(b3) and TR2 correspond, respectively, to the sign bias test, the negative size bias test, the positive size bias test and the joint test developed by Engle and Ng (1993). * Significant at 5%. The t-statistic is given in parentheses.

In general, the $\alpha_1$ estimates are substantially larger and the $\beta_1$ estimates substantially smaller for the unadjusted series than for the adjusted series, implying that larger innovations have larger ARCH effects but smaller GARCH effects. This result is independent of the trading environment. For Total, only two outliers are detected, at the beginning of the sample, meaning that this market seems relatively calm. The value of $\alpha_1$ for Total drops from 0.05 to 0.04, whereas the value of $\beta_1$ increases from 0.94 to 0.95, when the data are cleaned of outlying observations. For Vivendi, several outliers (essentially IOs) are detected at the end of the sample, implying that this market seems noisy. The value of $\alpha_1$ decreases (from 0.09 to 0.07), whereas the value of $\beta_1$ increases (from 0.89 to 0.92), when the data are cleaned of outliers. We note that in some cases (Danone, for example) the value of $\alpha_1$ increases (from 0.08 to 0.14) whereas the value of $\beta_1$ decreases (from 0.89 to 0.80) when the data are cleaned of outliers. This market seems noisy in the sense that several IOs are detected. Consequently, the direction of the bias is not clear-cut, and both parameters $\alpha_1$ and $\beta_1$ can be overestimated or underestimated.

Moreover, the value of $(\alpha_1 + \beta_1)$ is usually very close to, but less than, unity. This implies that the volatility process is highly persistent and close to a unit root. The persistence measure is not very sensitive to outliers because these observations usually have opposite but equivalent effects on the estimates of the GARCH parameters.

Finally, to check the adequacy of a fitted GARCH model, we examine the properties of the standardized residuals (Table II). All the original series still have excess kurtosis, except Axa. However, excess kurtosis becomes significant for this time series when the data are corrected for outliers, implying that outliers can hide excess kurtosis. The fat tails of the returns series after correcting for outliers may be caused by well-known long-range volatility correlations or by the presence of structural changes in variance. Moreover, whatever the type of series, the skewness statistic is not significant at 5%. Normality is rejected for all unadjusted and adjusted series. We then test a linear GARCH specification against nonlinear alternatives by means of the tests developed by Engle and Ng (1993). If we consider the original returns, none of the statistics is significant at 5%, meaning that it is not necessary to move on to nonlinear volatility models. However, when the series are corrected for outliers the tests become significant for Cac40 and Alcatel. The Engle–Ng tests then imply moving on to asymmetric nonlinear volatility models. In this case, outliers can hide asymmetry of the conditional variance and give irrelevant volatility estimates. Nevertheless, rejection of the null hypothesis by one or several of the tests does not give much information concerning which nonlinear GARCH model might be the appropriate alternative. We tried to model Cac40 and Alcatel returns with the most popular asymmetric models such as EGARCH, TGARCH or GJR-GARCH, but the regularity conditions of these models were not satisfied.11

VOLATILITY FORECASTS

In the previous section we showed that the GARCH(1, 1) model could adequately represent all daily French stock returns. In this context, we compute volatility forecasts from the previously estimated

11 It would be possible to use an alternative probability density function such as the asymmetric Student distribution, the asymmetric generalized error distribution or the Gram–Charlier distribution (Verhoeven and McAleer, 2004).



AR(p)-GARCH(1, 1) models for each series. We estimate the models on the first 1395 observations; the remaining observations are used for out-of-sample forecasting. We consider multi-step-ahead forecasts of the conditional variance from the GARCH(1, 1) model. Three prediction horizons are used: short term (h = 5), medium term (h = 20) and long term (h = 40).
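For reference, multi-step-ahead forecasts of the conditional variance in a GARCH(1, 1) model follow the standard recursion $h_{T+1|T} = \alpha_0 + \alpha_1 \varepsilon_T^2 + \beta_1 h_T$ and $h_{T+k|T} = \alpha_0 + (\alpha_1 + \beta_1) h_{T+k-1|T}$ for $k \geq 2$, which converges to the unconditional variance $\alpha_0 / (1 - \alpha_1 - \beta_1)$. The sketch below implements this recursion; the parameter values and the last observed return and variance are hypothetical, and the paper does not specify whether its $h$-step forecasts are point forecasts at step $h$ or cumulated over the horizon.

```python
def garch11_forecasts(alpha0, alpha1, beta1, last_eps, last_h, horizon):
    """Return the variance forecasts h_{T+1|T}, ..., h_{T+horizon|T}."""
    # One-step-ahead forecast uses the last observed shock and variance.
    forecasts = [alpha0 + alpha1 * last_eps ** 2 + beta1 * last_h]
    # Further steps replace the unknown squared shock by its conditional expectation.
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + (alpha1 + beta1) * forecasts[-1])
    return forecasts


# Hypothetical inputs: unconditional variance is 0.02 / (1 - 0.98) = 1.
path = garch11_forecasts(0.02, 0.09, 0.89, last_eps=2.0, last_h=1.5, horizon=40)
short, medium, long_term = path[4], path[19], path[39]
```

With a persistence of 0.98, the forecast at step 40 is still visibly above the unconditional level, which illustrates why long horizons remain informative when persistence is high.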

Diebold and Mariano (1995) proposed three tests for equal accuracy between two forecasting models: an asymptotic test that corrects for serial correlation and two exact finite-sample tests based on the sign test and Wilcoxon’s signed rank test. The tests relate prediction errors to some very general loss function12 and analyse the loss differential derived from the errors produced by two competing models.

The null hypothesis is that of equal predictive accuracy of the two models. A significantly positive (negative) t-statistic indicates that the GARCH(1, 1) or the GARCH(1, 1)-t with unadjusted data dominates (is dominated by) the GARCH(1, 1) with adjusted data.
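A minimal sketch of the asymptotic Diebold–Mariano statistic is given below, under several assumptions: squared-error loss by default, a rectangular-kernel long-run variance with `horizon - 1` autocovariances, and a two-sided p-value from the standard normal distribution. In this sketch the loss differential is the loss of the first model minus the loss of the second, so a positive statistic favours the second model; the sign convention used in the paper's tables may differ. The forecast errors are hypothetical.

```python
import math


def diebold_mariano(errors_a, errors_b, horizon=1, loss=lambda e: e * e):
    """Asymptotic DM statistic and two-sided normal p-value."""
    d = [loss(a) - loss(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)
    mean = sum(d) / n

    def autocov(k):
        return sum((d[t] - mean) * (d[t - k] - mean) for t in range(k, n)) / n

    # Rectangular kernel: it can give a non-positive estimate in small samples.
    long_run = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    if long_run <= 0:
        raise ValueError("non-positive long-run variance estimate")
    stat = mean / math.sqrt(long_run / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical forecast errors of two competing models.
errors_a = [1.0, -2.0, 1.5, -1.0, 2.0, -1.5]
errors_b = [0.5, -1.0, 0.5, -0.5, 1.0, -1.0]
dm_stat, dm_p = diebold_mariano(errors_a, errors_b)
```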

One difficulty involved in computing forecasting criteria comes from the fact that the true volatility is unknown and has to be estimated. It is generally acknowledged that squared daily returns provide a poor approximation of actual daily volatility. Andersen and Bollerslev (1998) first pointed out that more accurate estimates could be obtained by summing squared intra-day returns. Research and debate on the optimal sampling frequency continue,13 but it is generally recognized that sampling intervals shorter than 5 minutes suffer from serial correlation and various microstructure distortions. In this study, we use 5-minute intervals to measure volatility.
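The volatility proxy described here, the sum of squared intra-day returns, can be sketched as follows. The example assumes log returns computed from a list of prices sampled at regular (for instance 5-minute) intervals within one trading day; the function name and the prices are hypothetical.

```python
import math


def realized_variance(intraday_prices):
    """Sum of squared intra-day log returns for one trading day."""
    returns = [math.log(p1) - math.log(p0)
               for p0, p1 in zip(intraday_prices, intraday_prices[1:])]
    return sum(r * r for r in returns)


# Hypothetical prices: a move of +1% followed by -1% in log terms.
prices = [100.0, 100.0 * math.exp(0.01), 100.0]
rv = realized_variance(prices)
```

In this example the daily open-to-close return is zero, so the squared daily return would report no volatility at all, while the realized variance picks up the intra-day movement.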

The results of the volatility forecasts are given in Tables III and IV. In most cases (approximately 70%), the test statistics are significant, so the null hypothesis of equal predictive accuracy is rejected and the two sets of volatility forecasts differ. The t-statistics are, in general, negative, implying that volatility forecasts computed from data corrected for outliers are better than those computed with the unadjusted data, whatever the forecast horizon. Moreover, we note that it seems possible to use GARCH models to provide longer-horizon volatility forecasts.14 This result may be explained by the dynamics of the IOs. Indeed, unlike AOs, which affect only a single observation, IOs produce a temporary effect that carries over to subsequent observations.

We compare volatility forecasts computed from the new methodology with those of a GARCH-t model (Tables V and VI). For short-run forecasts, there is no evidence that forecasts from a GARCH model with the corrected returns improve on GARCH and GARCH-t models with unadjusted data. For longer forecasting horizons, we find a substantial improvement in forecasting using corrected data.

Finally, we compare medium- and long-term volatility forecasts computed with adjusted data with the unconditional volatility. Indeed, it is well known that long-term volatility forecasts converge to the unconditional volatility. We use Diebold and Mariano’s tests. The results are given in Table VII. We note that the statistics are, in most cases, significant, implying that medium- and

12 In this study, we use the mean squared forecast error (MSFE) and the mean absolute forecast error (MAFE) as the loss function. The results are the same whatever the loss function. Only the results with MSFE are presented to save space. Results with MAFE are available from the author upon request.

13 A more complete discussion on the choice of frequency is given by Andersen et al. (2001).

14 Starica (2003) showed that GARCH volatility forecasts are better than historical volatility when the forecast horizon is shorter than 60 trading days. If the forecast horizon is longer than 60 trading days, the historical volatility is better than the GARCH volatility forecasts.

long-term volatility forecasts do not converge to the unconditional volatility. Moreover, the sign of statistics is in general negative, meaning that the volatility forecasts computed from a GARCH(1, 1) model with adjusted data are better than the unconditional volatility forecasts. Thus, it seems possible to provide long-term volatility forecasts.

CONCLUSION

In this paper we showed how outlying observations can lead to irrelevant volatility forecasts. Using the additive-outlier detection method for GARCH models developed by Franses and Ghijsels (1999) and extended to innovative outliers by Charles and Darné (2005), we detected and corrected abnormal returns in 17 stock returns on the French market. We then studied the effects of outlying observations on several popular econometric tests. The results underlined the fact that outliers can lead to a misspecification of the model when we ignore the presence of this type of data. We modelled the stock returns with GARCH(1, 1), which seems the most appropriate model. The parameters of

Table III. Tests for comparing predictive accuracy: GARCH–GARCHm

                  h = 5                                           h = 20                                          h = 40
Series            Asy. test       Sign test       Wil. test       Asy. test       Sign test       Wil. test       Asy. test       Sign test       Wil. test
Cac40             −4.82* (0.00)   −2.23* (0.02)   −2.02* (0.04)   −5.30* (0.00)   −4.47* (0.00)   −3.92* (0.00)   −5.75* (0.02)   −6.01* (0.00)   −5.49* (0.00)
Accor             −0.39 (0.56)    −1.34 (0.18)    −1.75 (0.08)    −10.22* (0.00)  −4.02* (0.00)   −3.88* (0.00)   −5.02* (0.00)   −6.00* (0.00)   −5.50* (0.00)
Air Liquide       −7.03* (0.00)   −2.24* (0.02)   −2.03* (0.04)   −3.46* (0.05)   −4.47* (0.65)   −3.92* (0.20)   −4.61* (0.00)   −6.01* (0.00)   −3.92* (0.00)
Alcatel           −7.01* (0.00)   −2.24* (0.02)   −2.03* (0.04)   −4.45* (0.00)   −4.02* (0.02)   −3.89* (0.00)   −5.05* (0.00)   −6.01* (0.00)   −5.48* (0.00)
Aventis           −4.38* (0.00)   −2.24* (0.02)   −2.02* (0.04)   −4.37* (0.00)   −4.03* (0.00)   −3.88* (0.00)   −5.97* (0.00)   −6.01* (0.00)   −5.48* (0.00)
Axa               2.74* (0.01)    2.27* (0.03)    −2.06* (0.04)   −4.13* (0.00)   3.31* (0.00)    3.62* (0.00)    5.57* (0.00)    5.38* (0.00)    5.38* (0.00)
Bnp               −4.04* (0.00)   −1.34* (0.18)   −1.76 (0.08)    −4.32* (0.00)   −4.02* (0.00)   −3.81* (0.00)   −4.50* (0.00)   −5.69* (0.00)   −5.39* (0.00)
Carrefour         −7.24* (0.00)   −2.78* (0.18)   −2.11* (0.08)   −9.50* (0.00)   −4.02 (0.00)    −3.88* (0.00)   −15.77* (0.00)  −6.01* (0.00)   −5.50* (0.00)
Danone            −2.59* (0.01)   −1.34 (0.18)    −1.75 (0.08)    −1.90 (0.06)    2.24* (0.03)    3.69* (0.01)    4.24* (0.00)    4.43* (0.00)    4.91* (0.00)

Note: Asy. test, Sign test and Wil. test denote the asymptotic, sign and Wilcoxon’s signed rank tests, respectively. The volatility forecasts computed from unadjusted data are called GARCH, whereas the volatility forecasts computed from adjusted data are called GARCHm.
* One can reject the null hypothesis of equal accuracy. The p-values are given in parentheses. A significant positive (negative) sign means that the volatility forecasts computed from the GARCH model and original returns (returns corrected for outliers) are better than those computed from the GARCHm model and returns corrected for outliers (original returns).



Table IV. Tests for comparing predictive accuracy: GARCH–GARCHm

| Series | h = 5: Asy. test | h = 5: Sign test | h = 5: Wil. test | h = 20: Asy. test | h = 20: Sign test | h = 20: Wil. test | h = 40: Asy. test | h = 40: Sign test | h = 40: Wil. test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Lagardère | −4.73* (0.00) | −1.34 (0.18) | −1.75 (0.08) | −1.87 (0.07) | −3.13* (0.00) | −3.66* (0.00) | −2.44* (0.00) | −5.06* (0.00) | −5.27* (0.00) |
| L’Oréal | −10.56* (0.010) | −2.24* (0.03) | −2.03* (0.04) | −6.97* (0.00) | −4.47* (0.00) | −3.92* (0.00) | −7.55* (0.00) | −6.32* (0.00) | −5.51* (0.00) |
| Peugeot | −13.16* (0.00) | −2.31* (0.02) | −2.10* (0.03) | −6.76* (0.00) | −4.02* (0.00) | −3.88* (0.00) | −7.22* (0.00) | −6.00* (0.00) | −5.51* (0.00) |
| Renault | −9.06* (0.00) | −2.43* (0.02) | −2.18* (0.03) | −5.72* (0.00) | −3.13* (0.00) | −3.69* (0.00) | −5.78* (0.00) | −5.38* (0.00) | −5.43* (0.00) |
| Schneider | −8.63* (0.00) | −2.27* (0.02) | −2.06* (0.04) | −2.79* (0.01) | −4.47* (0.00) | −3.92* (0.00) | −3.90* (0.00) | −6.32* (0.00) | −5.51* (0.00) |
| Société Générale | −2.55* (0.02) | −2.24* (0.02) | −2.02* (0.01) | −3.22* (0.00) | −4.02* (0.00) | −3.77* (0.00) | −4.19* (0.00) | −6.01* (0.00) | −5.44* (0.00) |
| Suez | −11.12* (0.00) | −2.27* (0.02) | −2.03* (0.04) | −6.50* (0.00) | −4.47* (0.00) | −3.92* (0.08) | −5.86* (0.00) | −6.32* (0.00) | −5.51* (0.00) |
| Total | −3.39* (0.00) | −2.24* (0.02) | −2.02* (0.04) | 0.82 (0.41) | 1.34 (0.17) | 1.27 (0.20) | 3.12* (0.00) | 4.11* (0.00) | 4.39* (0.00) |
| Vivendi | −14.48* (0.00) | −2.34* (0.00) | −2.09* (0.04) | −6.24* (0.00) | −4.02* (0.00) | −3.88* (0.00) | −4.42* (0.00) | −6.01* (0.00) | −5.51* (0.00) |

Note: Asy. test, Sign test and Wil. test denote the asymptotic, sign and Wilcoxon’s signed rank tests, respectively. The volatility forecasts computed from unadjusted data are called GARCH, whereas the volatility forecasts computed from adjusted data are called GARCHm. * One can reject the null hypothesis of equal accuracy. The p-values are given in parentheses. A significant positive (negative) sign means that the volatility forecasts computed from the GARCH model and original returns (the GARCHm model and returns corrected for outliers) are better than those computed from the GARCHm model and returns corrected for outliers (the GARCH model and original returns).

Table V. Tests for comparing predictive accuracy: GARCH Student–GARCHm

| Series | h = 5: Asy. test | h = 5: Sign test | h = 5: Wil. test | h = 20: Asy. test | h = 20: Sign test | h = 20: Wil. test | h = 40: Asy. test | h = 40: Sign test | h = 40: Wil. test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Cac40 | −4.67* (0.00) | −2.24* (0.02) | −2.02* (0.04) | −5.86* (0.00) | −4.47* (0.00) | −3.92* (0.00) | −6.75* (0.02) | −6.01* (0.00) | −5.49* (0.00) |
| Accor | 1.22 (0.56) | 1.34 (0.18) | 1.75 (0.08) | −4.98* (0.00) | −4.02* (0.00) | −3.88* (0.00) | −2.32* (0.00) | −6.00* (0.00) | −5.50* (0.00) |
| Air Liquide | | −2.24* (0.02) | −2.03* (0.04) | −3.69* (0.05) | −4.47* (0.65) | −3.92* (0.20) | −4.81* (0.00) | −6.01* (0.00) | −3.92* (0.00) |
| Alcatel | −7.86* (0.00) | −2.24* (0.02) | −2.03* (0.04) | −4.57* (0.00) | −4.03* (0.02) | −3.88* (0.00) | −4.04* (0.00) | −6.01* (0.00) | −5.47* (0.00) |
| Aventis | −3.83* (0.00) | −2.24* (0.02) | −2.02* (0.04) | −5.14* (0.00) | −3.99* (0.00) | −3.86* (0.00) | −6.66* (0.00) | −6.00* (0.00) | −5.49* (0.00) |
| Axa | 2.74* (0.01) | 2.23* (0.03) | 2.03* (0.04) | 4.34* (0.00) | 3.13* (0.00) | 3.62* (0.00) | 5.55* (0.00) | 5.38* (0.00) | 5.38* (0.00) |
| Bnp | 5.30* (0.00) | 1.34 (0.18) | 1.75 (0.08) | 4.43* (0.00) | 4.02* (0.00) | 3.89 (0.00) | 7.86* (0.00) | 5.69* (0.00) | 5.47* (0.00) |
| Carrefour | −7.26* (0.00) | −2.21* (0.00) | −2.03* (0.08) | −9.12* (0.00) | −4.02* (0.00) | −3.88* (0.00) | −15.83* (0.00) | −6.00* (0.00) | −5.50* (0.00) |
| Danone | 2.59* (0.01) | 1.34 (0.18) | 1.76 (0.08) | 1.85 (0.06) | 2.23* (0.03) | 2.54* (0.01) | 4.24* (0.00) | 4.23* (0.00) | 4.82* (0.00) |

Note: Asy. test, Sign test and Wil. test denote the asymptotic, sign and Wilcoxon’s signed rank tests, respectively. The volatility forecasts computed from unadjusted data with a Student distribution are called GARCH-Student, whereas the volatility forecasts computed from adjusted data are called GARCHm. * One can reject the null hypothesis of equal accuracy. The p-values are given in parentheses. A significant positive (negative) sign means that the volatility forecasts computed from the GARCH-t model and original returns (the GARCHm model and returns corrected for outliers) are better than those computed from the GARCHm model and returns corrected for outliers (the GARCH-t model and original returns).



the equation governing volatility dynamics are biased when we do not take into account outliers, whatever the trading environment (calm or noisy periods). Finally, using the Diebold and Mariano (1995) tests, we showed that volatility forecasts are better when the data are cleaned of outliers. It therefore seems possible to provide long-term volatility forecasts from the GARCH process. This is an important result because, in practice, volatility is often used as a measure of risk. Indeed, volatility is a key input to risk management, derivative pricing and hedging, market making, market timing, portfolio selection, monetary policy making and many other financial activities.
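To make the three statistics reported in the tables concrete, the following is a minimal Python sketch of the asymptotic, sign and Wilcoxon signed rank tests of Diebold and Mariano (1995), written from the standard formulas rather than from the code used in this paper. The function name `dm_tests`, the rectangular truncation at lag h − 1 for the long-run variance and the example loss values are assumptions made for illustration. The loss differential is taken as the loss of the first forecast minus the loss of the second, so a negative statistic favours the first forecast.

```python
import math
import numpy as np


def two_sided_p(z):
    """Two-sided p-value of z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def dm_tests(loss_a, loss_b, h=1):
    """Asymptotic, sign and Wilcoxon signed rank statistics for the loss
    differential d_t = loss_a[t] - loss_b[t]; negative values favour forecast a."""
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)
    T = d.size
    dc = d - d.mean()

    # Long-run variance of d: autocovariances up to lag h - 1, equal weights.
    lrv = np.dot(dc, dc) / T
    for k in range(1, min(h, T)):
        lrv += 2.0 * np.dot(dc[k:], dc[:-k]) / T
    # The truncated estimate can be non-positive in small samples.
    asy = d.mean() / math.sqrt(lrv / T) if lrv > 0 else float("nan")

    # Sign test: studentized count of positive differentials.
    n_pos = int(np.sum(d > 0))
    sign = (n_pos - 0.5 * T) / math.sqrt(0.25 * T)

    # Wilcoxon: sum of ranks of |d_t| over positive d_t (ties are not averaged).
    ranks = np.argsort(np.argsort(np.abs(d))) + 1
    w = float(np.sum(ranks[d > 0]))
    wil = (w - T * (T + 1) / 4.0) / math.sqrt(T * (T + 1) * (2 * T + 1) / 24.0)

    stats = [("asymptotic", asy), ("sign", sign), ("wilcoxon", wil)]
    return {name: (z, two_sided_p(z)) for name, z in stats}


# Hypothetical squared forecast errors: forecast a has the smaller loss at every date.
example = dm_tests([1.0, 1.0, 1.0, 1.0, 1.0], [2.0, 3.0, 4.0, 5.0, 6.0])
for name, (z, p) in example.items():
    print(f"{name}: {z:.2f} ({p:.2f})")
```

With five differentials that are all negative, the sign and Wilcoxon statistics come out at roughly −2.24 and −2.02. These are the smallest values the two tests can take with five observations, which may explain why the same figures recur across series in the h = 5 columns.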

There are, as indicated in the introduction, several procedures proposed in the literature to take into account outliers in the GARCH models. A comparison of these procedures will be the subject of further research.

ACKNOWLEDGEMENTS

I am grateful to the referees and the editor for constructive and very useful comments.

Table VI. Tests for comparing predictive accuracy: GARCH Student-GARCHm

| Series | h = 5: Asy. test | h = 5: Sign test | h = 5: Wil. test | h = 20: Asy. test | h = 20: Sign test | h = 20: Wil. test | h = 40: Asy. test | h = 40: Sign test | h = 40: Wil. test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Lagardère | −4.17* (0.00) | −1.34* (0.18) | −1.75* (0.08) | −7.37* (0.07) | −3.58* (0.00) | −3.81* (0.00) | −8.60* (0.00) | −5.69* (0.00) | −5.44* (0.00) |
| L’Oréal | −6.99* (0.010) | −2.24* (0.03) | −2.03* (0.04) | 0.39 (0.00) | 0.45 (0.00) | 0.63 (0.00) | 1.99* (0.00) | 2.84* (0.00) | 3.25* (0.00) |
| Peugeot | 9.19* (0.00) | 2.25* (0.02) | 2.03* (0.03) | −6.62* (0.00) | −4.03* (0.00) | −3.88* (0.00) | −3.41* (0.00) | −4.11* (0.00) | −4.29* (0.00) |
| Renault | 9.75* (0.00) | 2.27* (0.02) | 2.10* (0.03) | −6.89* (0.00) | −3.13* (0.00) | −3.70* (0.00) | −6.71* (0.00) | −5.38* (0.00) | −5.43* (0.00) |
| Schneider | −8.68* (0.00) | −2.12* (0.02) | −2.00* (0.04) | −3.08* (0.01) | −4.47* (0.00) | −3.92* (0.00) | −4.33* (0.00) | −6.32* (0.00) | −5.51* (0.00) |
| Société Générale | −1.51 (0.13) | −1.34 (0.18) | −1.48 (0.14) | −2.26* (0.02) | −3.58* (0.00) | −3.51* (0.00) | −3.56* (0.00) | −5.69* (0.00) | −5.35* (0.00) |
| Suez | −15.72* (0.00) | −2.24* (0.02) | −2.03* (0.04) | −3.66* (0.00) | −4.47* (0.00) | −3.92* (0.08) | −4.58* (0.00) | −6.32* (0.00) | −5.51* (0.00) |
| Total | −3.58* (0.00) | −2.24* (0.02) | −2.02* (0.04) | 0.60 (0.58) | 0.89 (0.37) | 0.67 (0.50) | 2.97* (0.00) | 3.79* (0.00) | 4.15* (0.00) |
| Vivendi | 14.52* (0.00) | 2.24* (0.00) | 2.09* (0.04) | 6.23* (0.00) | 4.02* (0.00) | 3.88* (0.00) | 4.58* (0.00) | 6.01* (0.00) | 5.51* (0.00) |

Note: Asy. test, Sign test and Wil. test denote the asymptotic, sign and Wilcoxon’s signed rank tests, respectively. The volatility forecasts computed from unadjusted data with a Student distribution are called GARCH-Student, whereas the volatility forecasts computed from adjusted data are called GARCHm. * One can reject the null hypothesis of equal accuracy. The p-values are given in parentheses. A significant positive (negative) sign means that the volatility forecasts computed from the GARCH-t model and original returns (the GARCHm model and returns corrected for outliers) are better than those computed from the GARCHm model and returns corrected for outliers (the GARCH-t model and original returns).



Table VII. Tests for comparing predictive accuracy: unconditional volatility–GARCHm

| Series | h = 20: Asy. test | h = 20: Sign test | h = 20: Wil. test | h = 40: Asy. test | h = 40: Sign test | h = 40: Wil. test |
| --- | --- | --- | --- | --- | --- | --- |
| Cac40 | −6.06* (0.00) | −4.47* (0.00) | −3.92* (0.00) | −11.26* (0.00) | −6.01* (0.00) | −5.50* (0.00) |
| Accor | −4.05* (0.00) | −2.24* (0.00) | −2.02* (0.00) | −6.06* (0.00) | −4.47* (0.00) | −3.92* (0.00) |
| Air Liquide | | −4.47* (0.00) | −3.92* (0.00) | −5.94* (0.00) | −6.01* (0.00) | −5.46* (0.00) |
| Alcatel | −4.56* (0.00) | −4.02* (0.00) | −3.88* (0.00) | −4.41* (0.00) | −6.01* (0.00) | −5.48* (0.00) |
| Aventis | −8.17* (0.00) | −4.03* (0.00) | −3.88* (0.00) | −7.93* (0.00) | −6.01* (0.00) | −5.48* (0.00) |
| Axa | 6.04* (0.00) | −3.13* (0.00) | −3.47* (0.00) | 5.20* (0.00) | 5.06* (0.00) | 5.30* (0.00) |
| Bnp | −2.53* (0.01) | −2.23* (0.03) | −3.17 (0.00) | −4.09* (0.01) | −4.42* (0.03) | −5.15* (0.00) |
| Carrefour | −8.84* (0.00) | −4.02* (0.00) | −3.88* (0.00) | −15.37* (0.00) | −6.01* (0.00) | −5.50* (0.00) |
| Danone | −5.23* (0.00) | −4.03* (0.00) | −3.84* (0.00) | −5.28* (0.00) | −5.69* (0.00) | −5.44* (0.00) |
| Lagardère | 4.81* (0.00) | 3.58* (0.00) | 3.81* (0.00) | 4.66* (0.00) | 5.69* (0.00) | 5.47* (0.00) |
| L’Oréal | −4.74* (0.04) | −3.13* (0.00) | −3.58* (0.00) | −9.66* (0.04) | −5.06* (0.00) | −5.27* (0.00) |
| Peugeot | −3.48* (0.00) | −4.02* (0.00) | −3.88* (0.00) | −4.83* (0.00) | −6.01* (0.00) | −5.48* (0.00) |
| Renault | −5.91* (0.00) | −3.13* (0.00) | −3.69* (0.00) | −9.91* (0.00) | −5.38* (0.00) | −5.43* (0.00) |
| Schneider | −6.84* (0.00) | −4.47* (0.00) | −3.92* (0.00) | −7.59* (0.00) | −6.32* (0.00) | −5.51* (0.00) |
| Société Générale | −1.74* (0.08) | −2.68* (0.01) | −2.99* (0.01) | 0.57* (0.57) | 1.26* (0.20) | 1.43* (0.15) |
| Suez | 3.78* (0.00) | 4.02* (0.00) | 3.88* (0.00) | 4.75* (0.00) | 6.01* (0.00) | 5.50* (0.00) |
| Total | −4.88* (0.00) | −3.13* (0.00) | −3.51* (0.00) | −7.57* (0.00) | −5.06* (0.00) | −5.19* (0.00) |
| Vivendi | −4.59* (0.00) | −4.47* (0.00) | −3.92* (0.00) | −6.88* (0.00) | −6.32* (0.00) | −5.51* (0.00) |

Note: Asy. test, Sign test and Wil. test define the asymptotic, sign and Wilcoxon’s signed rank tests, respectively. * One can reject the null hypothesis of equal accuracy. The p-values are given in parentheses. A significant positive (negative) sign means that unconditional volatility (the volatility forecasts computed from the GARCH model and returns corrected for outliers) is better than the volatility forecasts computed from the GARCHm model and returns corrected for outliers (unconditional volatility).
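As a reminder of why unconditional volatility is a natural benchmark at long horizons, the sketch below computes the textbook multi-step conditional variance forecast of a GARCH(1,1) model, which decays geometrically towards the unconditional variance ω/(1 − α − β). It is only an illustration under assumed inputs: the function name `garch11_variance_path` and the parameter values are hypothetical, and the exact forecast construction used for the tables is not restated in this section.

```python
import numpy as np


def garch11_variance_path(omega, alpha, beta, sigma2_next, h):
    """Conditional variance forecasts for steps 1..h of a GARCH(1,1) model,
    given the one-step-ahead conditional variance sigma2_next."""
    persistence = alpha + beta
    if not 0.0 <= persistence < 1.0:
        raise ValueError("alpha + beta must lie in [0, 1) for a finite unconditional variance")
    unconditional = omega / (1.0 - persistence)
    # Exponent 0 gives the one-step forecast, exponent h - 1 the h-step forecast.
    steps = np.arange(h)
    return unconditional + persistence ** steps * (sigma2_next - unconditional)


# Hypothetical parameters: the unconditional variance is 0.1 / (1 - 0.9) = 1.0.
path = garch11_variance_path(omega=0.1, alpha=0.1, beta=0.8, sigma2_next=2.0, h=40)
print(path[0], path[19], path[39])
```

The closer α + β is to one, the more slowly the forecast path approaches the unconditional level, so biased parameter estimates caused by outliers can matter at the 20- and 40-step horizons as well as at short ones.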



REFERENCES

Andersen TG, Bollerslev T. 1998. Answering the skeptics: yes, standard volatility models do provide accurate forecasts. International Economic Review 39: 885–905.

Andersen TG, Bollerslev T, Diebold FX, Labys P. 2001. Modeling and forecasting realized volatility. Econometrica 71: 579–625.

Baillie RT, Bollerslev T. 1989. The message in daily exchange rates: a conditional variance tale. Journal of Business and Economic Statistics 7: 297–305.

Balke NS, Fomby TB. 1994. Large shocks, small shocks, and economic fluctuations: outliers in macroeconomics time series. Journal of Applied Econometrics 31: 307–327.

Beine M, Laurent S. 2003. Central bank interventions and jumps in double long memory models of daily exchange rates. Journal of Empirical Finance 10: 641–660.

Bollerslev T. 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31: 307–327.

Bollerslev T. 1987. A conditionally heteroskedastic time series model for speculative prices and rates of return. Review of Economics and Statistics 69: 542–547.

Carnero MA, Peña D, Ruiz E. 2007. Effects of outliers on the identification and estimation of the GARCH models. Journal of Time Series Analysis 28: 471–497.

Charles A, Darné O. 2005. Outliers and GARCH models in daily financial data. Economics Letters 86: 347–352.

Chen C, Liu LM. 1993. Joint estimation of model parameters and outlier effects in time series. Journal of the American Statistical Association 88: 284–297.

Chen AH, Siems TF. 2004. The effects of terrorism on global capital markets. European Journal of Political Economy 20: 349–366.

Diebold FX, Mariano RS. 1995. Comparing predictive accuracy. Journal of Business and Economic Statistics 13: 134–144.

Doornik JA, Ooms M. 2002. Outlier detection in GARCH models. Working paper, Nuffield College, Oxford.

Engle RF. 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50: 987–1007.

Engle RF, Ng VK. 1993. Measuring and testing the impact of news on volatility. Journal of Finance 48: 1749–1778.

Franses PH, Ghijsels H. 1999. Additive outliers, GARCH and forecasting volatility. International Journal of Forecasting 15: 1–9.

Franses PH, van Dijk D. 2002. Nonlinear Time Series Models in Empirical Finance. Cambridge University Press: Cambridge, UK.

Hsieh D. 1989. Modeling heteroskedasticity in daily foreign exchange rates. Journal of Business and Economic Statistics 7: 307–317.

Hotta LK, Tsay RS. 1999. Outliers in GARCH processes. Graduate School of Business, University of Chicago.

Jorion P. 1988. On jump processes in the foreign exchange and stock markets. Review of Financial Studies 1: 427–445.

Nelson DB. 1991. Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59: 347–370.

Park BJ. 2002. An outlier robust GARCH model and forecasting volatility of exchange rate returns. Journal of Forecasting 21: 381–393.

Sakata S, White H. 1998. High breakdown point conditional dispersion estimation with application to S&P 500 daily returns volatility. Econometrica 66: 529–567.

Starica C. 2003. Is GARCH(1, 1) as good a model as the Nobel prize accolades would imply? Working paper, Chalmers University of Technology.

van Dijk D, Franses PH, Lucas A. 2002. Testing for ARCH in the presence of additive outliers. Journal of Applied Econometrics 14: 539–562.

Verhoeven P, McAleer M. 2000. Modelling outliers and extreme observations for ARMA-GARCH processes. Working paper, University of Western Australia.

Verhoeven P, McAleer M. 2004. Fat tails and asymmetry in financial volatility models. Mathematics and Computers in Simulation 64: 351–361.



Vlaar PJG, Palm FC. 1993. The message in weekly exchange rates in the European Monetary System: mean reversion, conditional heteroscedasticity and jumps. Journal of Business and Economic Statistics 11: 351–360.

Author’s biography: Amélie Charles is an assistant professor at Audencia Nantes, School of Management. Her research interests are in the field of financial econometrics, particularly in time-series modelling and forecasting.

Author’s address: Amélie Charles, Audencia Nantes, School of Management, 8 route de la Jonelière, BP 31222, 44312 Nantes.
