

Available online at www.sciencedirect.com

International Journal of Forecasting 27 (2011) 1089–1107
www.elsevier.com/locate/ijforecast

Forecasting exchange rate volatility using high-frequency data: Is the euro different?

Georgios Chortareas a,1, Ying Jiang b,∗, John C. Nankervis c,2

a Department of Economics, University of Athens, Athens, 10559, Greece

b Nottingham University Business School China, University of Nottingham Ningbo, 199 Taikang East Road, Ningbo 315100, China

c Essex Business School, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK

Abstract

We assess the performances of alternative procedures for forecasting the daily volatility of the euro’s bilateral exchange rates using 15 min data. We use realized volatility and traditional time series volatility models. Our results indicate that using high-frequency data and considering their long memory dimension enhances the performance of volatility forecasts significantly. We find that the intraday FIGARCH model and the ARFIMA model outperform other traditional models for all exchange rate series.

© 2011 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.

Keywords: Euro exchange rates; Volatility forecasting; High-frequency data; GARCH model; Long memory time series; Forecast evaluation


1. Introduction

Volatility forecasting of asset prices in general, and exchange rates in particular, has been the focus of research in areas such as investment analysis, derivative securities pricing and risk management. Moreover, since the volatility of financial markets has a direct influence on policymaking, volatility forecasts can play the role of a ‘barometer for the vulnerability of financial markets and the economy’ (Poon & Granger, 2003). Poon and Granger (2003) review 93 papers in the volatility forecasting field and show that different models for forecasting the exchange rate volatility perform differently for different currencies. In this paper we evaluate the daily volatility forecasting performances of alternative models for euro exchange rates using high-frequency data.

Until quite recently, the literature typically focused on daily returns for forecasting the daily volatility, and used the daily squared returns as a measure of the ‘true volatility’. However, daily squared returns are not an accurate measure of the true volatility, since they are calculated from closing prices and therefore cannot capture price fluctuations during the day (see Andersen & Bollerslev, 1998). In response to these limitations, Andersen and Bollerslev (1998) propose the realized volatility (constructed from intraday returns) as a measure of the true volatility, and this measure has since become very popular. High-frequency data carry more information on daily transactions, and are useful not only in measuring volatility, but also in direct model estimation and forecast evaluation. Many recent methodological advances focus on high-frequency data,3 while a number of studies build on this literature to evaluate the performance of alternative models for volatility forecasting.4

∗ Corresponding author. Tel.: +86 0 574 88180200; fax: +86 0 574 88180125.

E-mail addresses: [email protected] (G. Chortareas), [email protected] (Y. Jiang), [email protected] (J.C. Nankervis).

1 Tel.: +30 210 3689805; fax: +30 210 3689810.

2 Tel.: +44 0 1206 873973; fax: +44 0 1206 873429.

0169-2070/$ - see front matter © 2011 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.ijforecast.2010.07.003

While there exist a number of studies on foreign exchange volatility forecasting,5 as is discussed in Section 2, to the best of our knowledge, limited work has been done on forecasting the volatility of euro exchange rates. Since its introduction in 1999, the euro has become a major international currency, quickly establishing itself as the second most widely used international currency after the US dollar.6 Nevertheless, the literature on exchange rate volatility forecasting focuses on USD exchange rates alone.

Our study addresses this gap in the literature by providing a characterization of the euro’s exchange rate volatility at both the daily and intraday frequencies, and considers questions such as: Are the same models appropriate for the euro exchange rate as for the USD exchange rate? Do high-frequency euro exchange rates have properties similar to those of other high-frequency data? Can a long memory factor improve the performance of exchange rate volatility forecasting?

3 Examples of this type of analysis include the use of long memory ARFIMA (Autoregressive Fractionally Integrated Moving Average) models for forecasting the realized volatility (e.g., Pong, Shackleton, Taylor, & Xu, 2004), extending the daily model to include intraday information (e.g., Koopman, Jungbacker, & Hol, 2005) and the direct modelling of intraday returns using standard volatility models (e.g., Marlik, 2005).

4 For example, see Andersen, Bollerslev, Diebold, and Labys (2003, ABDL hereafter), Hol and Koopman (2002), Martens (2001) and Martens and Zein (2004).

5 Examples include Andersen and Bollerslev (1998), Andersen, Bollerslev, Diebold, and Labys (1999, 2000, 2003), Martens (2001), Vilasuso (2002) and West and Cho (1995).

6 European Central Bank (1999).

To answer these questions we compare the out-of-sample daily volatility forecast performances of traditional time series volatility models with that of a realized volatility model at high frequencies. The traditional time series volatility models considered include the GARCH model, the stochastic volatility (SV) model, the stochastic volatility with exogenous variables (SVX) model, and finally, the fractionally integrated GARCH (FIGARCH) model. The realized volatility model is an ARFIMA model.7,8 We compare the performances of the two types of long memory models (FIGARCH and ARFIMA) using high-frequency data. We also compare the properties of the intraday GARCH and FIGARCH models with those of ARFIMA models which use the daily realized volatility. Finally, we compare the intraday GARCH model with the intraday FIGARCH model to provide evidence on whether modelling the long memory property in a high-frequency volatility process can improve the daily forecast performance.

For the intraday GARCH and FIGARCH models we use deseasonalized 15 min data on returns for a period covering almost four years. We thus obtain a very large number of observations relative to other studies that apply standard volatility models to intraday returns (e.g., Beltratti & Morana, 1999; Marlik, 2005; Martens, 2001; Rahman & Ang, 2002). Marlik (2005), for example, uses 30 min data covering a period of four months.

We employ a battery of tests to evaluate the out-of-sample forecast performances of the models considered. In addition to the regression test and the accuracy test, we also use the superior predictive ability test (Hansen, 2005) and an equal accuracy test, namely the adjusted Diebold-Mariano (1995) test. The results of these tests show that the intraday FIGARCH model always outperforms other traditional models, and produces results that are not significantly different to those from the realized volatility (ARFIMA) model. This is not atypical of the outcomes of previous research (see Hol & Koopman, 2002; Martens & Zein, 2004; Pong et al., 2004, etc.).

7 The ARFIMA model is fitted to the daily realized volatility. The GARCH and FIGARCH models are both fitted to deseasonalized 15 min returns. The GARCH, SV and SVX models are also fitted to daily returns.

8 This paper focuses on the ARFIMA model alone. Other newly developed realized volatility models include the Heterogeneous Autoregressive (HAR) model (Corsi, 2009) and the Mixed Data Sampling (MIDAS) model (Ghysels, Sinko, & Valkanov, 2007).

Our findings suggest that the use of high-frequency data enhances the performance of daily volatility forecasting. Moreover, the forecasting accuracy is improved further when the long memory property is taken into account explicitly (i.e., comparing the intraday FIGARCH with GARCH models). We also find that the performance levels of the daily GARCH model and the SV models are different across the currencies considered.

The remainder of the paper is arranged as follows. Section 2 reviews some of the main findings and current arguments in the volatility forecasting literature. Section 3 focuses on the data and methodology used in this paper. Section 4 discusses forecast evaluation methods. Section 5 evaluates the estimation results and compares the out-of-sample forecast performances of the models. Finally, Section 6 concludes.

2. Literature review

The literature on volatility forecasting applied to high-frequency data includes, but is not limited to, studies of the realized volatility, model comparisons using high-frequency versus daily data, assessments of the standard volatility model at high frequencies, and the data properties of specific assets/series.

Since the true volatility is unobservable, daily squared returns are popularly used in the literature as a measure of volatility. Andersen and Bollerslev (1998) suggest that the realized volatility is a more accurate proxy of the true volatility. Using 5 min data to construct this new volatility measure, they demonstrate a dramatic improvement in the volatility forecasting performance of a daily GARCH model. A number of further studies have since focussed on realized volatility forecasting and its properties. Andersen, Bollerslev, Diebold, and Labys (ABDL hereafter) (1999) recommend the ARFIMA model for forecasting the realized volatility, and further show that the realized volatility is a consistent estimator of the integrated volatility (ABDL, 2001). Applying the ARFIMA model to exchange rates, ABDL (2001) show that the realized volatility can improve forecasting if it is modelled by a parametric model directly, rather than simply being used in the evaluation of other models’ forecasting behaviours. The findings of the above studies constitute the theoretical basis for using the realized volatility in exchange rate volatility forecasting directly. Furthermore, ABDL (2003) propose a long memory Gaussian vector autoregressive process (VAR) for modelling and forecasting the realized volatility, and produce strong evidence that the VAR-RV model outperforms the other candidate models.

A second strand in the literature considers the advantages of using high-frequency data and compares volatility forecasts using intraday data with those using daily data, as well as those from the option-implied volatility model. Martens (2001) compares daily exchange rate volatility forecasts, constructed from multiple volatility forecasts of intraday intervals, with forecasts from a daily model and a daily model extended by an intraday information term. He finds that the higher the intraday frequency used, the better the out-of-sample daily volatility forecasts. The daily model, which includes the realized volatility as an explanatory variable, displays a similar performance to those of models using only intraday exchange rate returns. Martens and Zein (2004) produce evidence showing that high-frequency data can improve both the measurement accuracy and the forecasting performance. In addition, they show that long memory models improve the forecasting performance. Hol and Koopman (2002) compare the predictive powers of realized volatility models and daily time-varying volatility models using the S&P100 stock index. The results of the out-of-sample evaluation indicate that an ARFIMA model fitted to the realized volatility gives the most accurate forecasts. Pong et al. (2004) compare exchange rate volatility forecasts obtained from an option implied volatility model, a short memory model (ARMA), a long memory model (ARFIMA) and a daily GARCH model. They find that the gains in forecast accuracy come from the use of high-frequency returns rather than from a long memory specification.

Naturally, one question that arises is whether the standard models are still valid when using high-frequency returns. There is plenty of evidence testifying to the performance of the realized volatility. Nevertheless, there is still scope for a further examination of whether traditional time series models can capture the properties of high-frequency data and provide a good fit for the intraday returns series. No consensus on this issue has yet been reached. For example, Rahman and Ang (2002) show that the intraday volatility can be described best by a standard GARCH(1, 1) model, while Jones (2003), based on simulation results, suggests that standard time series models cannot capture the intraday exchange rate returns generating process successfully at frequencies higher than 24 h.

Finally, another set of studies considers the properties of high-frequency data for specific markets. The literature discussed above focuses on high-frequency exchange rate returns in developed financial markets. Some of their stylized properties include first order negative autocorrelation, non-normal distributions, an increasing fat tail with an increasing frequency, and periodicity (Dacorogna, Gencay, Muller, Olsen, & Pictec, 2001). Other authors who have considered developing/emerging markets have found that high-frequency returns series display features which are consistent with the above stylized properties (e.g., Barbosa, 2002; Kayahan & Stengos, 2002).

However, limited evidence exists on forecasting the euro exchange rate volatility and on how the alternative models perform this task. Heaney and Pattenden (2005) evaluate the change in unconditional variances for the euro/GBP and euro/USD exchange rates during 2002 and 2003. Clements, Galvao, and Kim (2008) examine quantile forecasts of the daily exchange rate returns of five currencies versus the USD, including the euro. Bauwens and Sucarrat (2010) evaluate models of weekly NOK/euro exchange rate volatility forecasts that are generated by applying a general-to-specific econometric methodology. Marlik (2005) models the volatility of the USD against the euro and GBP using GARCH, FIGARCH, and SV models. This paper contributes to this body of literature by focusing on the euro’s daily volatility forecasting at high frequencies and comparing the performances of competing modelling frameworks.

3. Data and estimation methodology

3.1. Data, properties and the stylized facts

The original data sets we use are 5 min interval spot foreign exchange rates of the euro against the Swiss franc (CHF), the UK pound (GBP), the Japanese yen (JPY) and the US dollar (USD), provided by Olsen and Associates. These are the major currencies in terms of trading volume, accounting for over 85% of all foreign exchange transactions. Since the CHF, GBP and JPY are all direct trading currencies, we avoid using cross rates calculated from the USD exchange rates. The data span the period from January 4th, 2000, to October 31st, 2004, resulting in a total of 509,472 observations for each exchange rate series.

As high-frequency data carry more information on daily transactions, using data with the highest possible frequency theoretically optimizes the accuracy of the daily volatility estimation. However, many researchers have expressed concerns about the usefulness of high-frequency data in the presence of various market microstructure effects (e.g., ABDL, 1999; Bandi & Russell, 2005; Zhang, Mykland, & Aït-Sahalia, 2005). A generally accepted practice is to consider intervals between 5 and 30 min (e.g., ABDL, 2003; Hol & Koopman, 2002; Martens, 2001). Thus, we use 15 min interval data, which are constructed from an initial grid of 5 min exchange rates.9

The currency exchange market is open 24 h a day, 7 days a week, but the trading volumes on weekends and holidays are quite small. Following a typical practice in the literature (e.g., Andersen & Bollerslev, 1998), we remove weekend returns from the sample (i.e., from Friday 21:05 GMT to Sunday 21:00 GMT). We also remove the returns of January 1st, December 25th and 26th, as well as Good Friday and Easter Monday for each year.

We obtain the intraday return series $r_{t,n}$ by:

$$r_{t,n} = \ln s_{t,n} - \ln s_{t,n-1}, \tag{1}$$

where $s_{t,n}$ is the close-mid spot price at the nth time stamp (for example $n = 1$ corresponds to 00:00 GMT) on day $t$.10 Table 1 shows the final number of sample observations in the returns series after removing the missing values in the raw price set, and Fig. 1 shows the plots of each returns series.11

9 Bandi and Russell (2005) propose a rule for the calculation of the optimal sampling frequency for the realized volatility. They suggest that for the UK stock market, the optimal frequencies vary between 0.4 and 13.8 min. This is not too far from the 15 min interval that we consider, and we believe that this is an acceptable working assumption. The exact calculation of the optimal frequency for our data set is beyond the scope of this paper.

10 This refers to the last mid price in the interval. The mid price is the average of the bid and ask prices (http://www.olsendata.com/).

11 Missing values occur in the price series when there is a low level of market activity. Deleting these values gives rise to differing sample sizes across price series.
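The return definition in Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function name `log_returns` and the mid prices below are hypothetical and are not taken from the Olsen data set.

```python
import math


def log_returns(prices):
    """Return ln(s_n) - ln(s_{n-1}) for consecutive prices, as in Eq. (1)."""
    return [math.log(curr) - math.log(prev)
            for prev, curr in zip(prices, prices[1:])]


# Hypothetical mid prices at four consecutive 15 min time stamps.
mid_prices = [1.2000, 1.2006, 1.2003, 1.2003]
returns = log_returns(mid_prices)
```

Because log returns are additive, the three 15 min returns sum to the log return over the whole 45 min span, which is what makes the later aggregation to daily quantities straightforward.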


Fig. 1. Plots of the 15 min return series.

Table 1
Summary statistics of the 15 min returns of each exchange rate series.

Series     No. obs.  Mean (×10^−6)  S.D. (×10^−4)  Skewness  Kurtosis
euro/CHF   105,086   0.056          3.56           −0.120    27.5
euro/GBP   110,153   0.606          6.26            0.149    10.7
euro/JPY   112,795   2.07           8.38            0.236    16.2
euro/USD   116,518   2.22           7.31            0.966    52.9

The table shows the total number of observations of each exchange rate series in 15 min intervals, after removing weekend data, the main holiday data and missing values. The table also shows the distribution statistics of the 15 min returns of the euro exchange rates. All series have approximately a zero mean, with skewed and fat tailed non-normal distributions. The sample period is from Jan. 4th, 2000, to Oct. 31st, 2004.

Table 1 also presents the summary statistics of the distribution of raw returns for each exchange rate series. The figures suggest that the characteristics of 15 min returns of euro exchange rates are consistent with the stylized properties of high-frequency financial time series returns documented in the literature. For example, the kurtosis is far greater than that of the normal distribution, indicating fat tailed distributions. Three of the euro exchange rate returns series are slightly skewed, while the skewness for the euro/USD exchange rate is as high as 0.97. The mean values of the series considered are all approximately zero, as is the case for the returns of other financial assets.
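As a rough illustration of the kind of statistics reported in Table 1, the following sketch computes the mean, standard deviation, skewness and (non-excess) kurtosis of a return series. It assumes simple population-moment estimators, which may differ from the exact estimators behind Table 1; the function name `sample_moments` and the toy input are hypothetical.

```python
def sample_moments(x):
    """Return (mean, std, skewness, kurtosis) using population moments.

    Kurtosis is not demeaned by 3, so a normal distribution has kurtosis 3.
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return mean, m2 ** 0.5, m3 / m2 ** 1.5, m4 / m2 ** 2


# Toy, symmetric data (not exchange rate returns).
mean, std, skew, kurt = sample_moments([-2.0, -1.0, 0.0, 1.0, 2.0])
```

On this convention, the kurtosis values of 10.7 to 52.9 in Table 1 would be compared with the value of 3 implied by normality.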

Fig. 2. The autocorrelations of the 15 min returns at the first 12 lags.

Another stylized property of high-frequency returns which has been documented in many studies, including that of Dacorogna et al. (2001), is the negative first order autocorrelation in the returns. Fig. 2 shows the autocorrelation function (ACF hereafter) of each 15 min euro exchange rate returns series. One can observe the large negative autocorrelations in the first lag, followed by rather small autocorrelations in the subsequent lags. This phenomenon is caused by the bounce between the bid and ask prices, which is a market microstructure effect. Using standard errors that are valid for martingale difference series (Lobato, Nankervis, & Savin, 2001), we find that, with the exception of the first-order autocorrelations, no other individual autocorrelation is significantly different from zero. We estimate an MA(1) model and then test the residuals for autocorrelation. The standard version of the Box-Pierce test (McLeod, 1978), which assumes iid innovations, rejects the null of no autocorrelation in some cases. Using a version of the Box-Pierce test which allows for dependent uncorrelated processes under the null (Francq, Roy, & Zakoian, 2005), however, we cannot reject the null of no autocorrelation for up to 36 lags.
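The negative first-order autocorrelation described above can be illustrated with a minimal sample ACF. The sketch uses the usual biased estimator rather than the robust standard errors of Lobato et al. (2001), and the price path, which simply alternates between two levels to mimic a bid-ask bounce, is a stylized and hypothetical example.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Hypothetical quotes bouncing between two levels around a constant value.
prices = [100.01 if i % 2 == 0 else 99.99 for i in range(10)]
returns = [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]
rho1 = autocorrelation(returns, 1)  # strongly negative for this path
```

In real data the bounce is random rather than perfectly alternating, so the first-lag autocorrelation is negative but much smaller in magnitude than in this toy case.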

To consider the periodicity of the intraday volatility, in Fig. 3 we plot the ACF of the absolute returns for the four exchange rate series. The apparent U-shaped periodicity recurs every 24 h period. For the 15 min returns series, there are 96 observations per 24 h, so the pattern can be observed at every 96 lags, which strongly indicates periodicity with a period of one day. The autocorrelation is highest at the beginning and end of the 24 h intervals, and lowest in the middle. This pattern is consistent with the findings of various other studies, including those of Andersen and Bollerslev (1997), Barbosa (2002) and Dacorogna et al. (2001). According to Dacorogna et al. (2001), this phenomenon is due to the overlap in opening hours of the three major foreign exchange markets of the world, i.e., the American, European and Asian markets. When both the American and European markets are open, the autocorrelations are relatively large. Fig. 3 also reveals that the autocorrelation in the absolute returns dies out very slowly.

In summary, the returns of the 15 min euro exchange rate series used in this study have properties that are consistent with the stylized facts of high-frequency financial returns reported in the literature. They are all fat tailed, slightly skewed (with the exception of the euro/USD exchange rate) and have an approximately zero mean. Furthermore, there is a positive correlation between the kurtosis and the frequency of the returns.12 The series exhibit negative first order autocorrelation, but no significant autocorrelations when longer lags are considered. Most importantly, the series display strong periodicity patterns in intraday volatility.

3.2. Models and estimation

We consider five different models at two time frequencies for volatility forecasting: a daily realized volatility ARFIMA model, intraday GARCH and FIGARCH models, daily GARCH and stochastic volatility models, and an extended daily stochastic volatility model. Each model has different features. For example, it is widely believed that long memory exists in the volatility series, and both the ARFIMA and FIGARCH models take the long memory property into account. Nevertheless, whether there is any difference between modelling the realized volatility directly (ARFIMA) or embedding the long memory in a GARCH type model (FIGARCH) remains an open question. In addition, little research has been undertaken to compare the standard GARCH model with its long memory counterparts in an intraday framework. For example, Lux and Kaizoji (2004) and Vilasuso (2002) compare the GARCH and FIGARCH models for volatility forecasting using daily data, while Zumbach (2004) uses 1 h returns with USD exchange rates. To the best of our knowledge, no evidence on intraday euro exchange rate returns has yet been produced. We also consider whether focusing on high-frequency returns can help to improve the accuracy of daily volatility forecasting by applying the stochastic volatility model to intraday observations.

12 We consider the kurtosis for four return series at 5, 15 and 30 min frequencies. The kurtosis increases at higher frequencies. Detailed results are available from the authors upon request.

Fig. 3. The autocorrelation function of absolute 15 min returns for 300 lags (three days contain 288 lags).

The volatilities of intraday returns typically display a strong periodicity in 24 h intervals, as was demonstrated in the previous section. Andersen and Bollerslev (1997) and Martens, Chang, and Taylor (2002), among others, note that the estimates of traditional time series models (e.g., GARCH-type models) can be corrupted by intraday periodic patterns. Thus, we use the deseasonalized filtered returns to estimate traditional time series models, rather than using raw returns. We define the deseasonalized filtered return as the nth intraday return divided by an estimated seasonality term,

$$\tilde{r}_{t,n} = r_{t,n}/S_{t,n} \quad (n = 1, 2, \ldots, N), \tag{2}$$

where $r_{t,n}$ is the nth intraday return on day $t$ and $S_{t,n}$ is the corresponding seasonality term, for $N$ intraday periods. We estimate the seasonality term $S_{t,n}$ using a method proposed by Taylor and Xu (1997). This involves averaging the squared returns for each intraday period, i.e.:

$$S_{t,n}^2 = \frac{1}{T} \sum_{t=1}^{T} r_{t,n}^2 \quad (n = 1, 2, \ldots, N), \tag{3}$$

where $T$ is the number of days in the sample. This method appears to be quite effective, since it almost removes the U-shaped pattern from all series. Thus, we use the filtered returns to estimate the intraday GARCH and FIGARCH models.
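The deseasonalization in Eqs. (2) and (3) can be sketched as follows. The sketch assumes a complete rectangular sample (every day has the same N intraday returns) and that no intraday period has all-zero returns; the function names and the toy data are hypothetical.

```python
def seasonality_terms(returns_by_day):
    """S_n from Eq. (3): root of the average squared return in period n."""
    n_days = len(returns_by_day)
    n_periods = len(returns_by_day[0])
    return [
        (sum(day[n] ** 2 for day in returns_by_day) / n_days) ** 0.5
        for n in range(n_periods)
    ]


def deseasonalize(returns_by_day):
    """Filtered returns from Eq. (2): each return divided by its S_n."""
    s = seasonality_terms(returns_by_day)
    return [[r / s_n for r, s_n in zip(day, s)] for day in returns_by_day]


# Toy sample: T = 2 days, N = 3 intraday periods (real data would have N = 96).
raw = [[0.002, -0.001, 0.003],
       [-0.004, 0.001, -0.001]]
filtered = deseasonalize(raw)
```

By construction, the filtered returns have an average squared value of one in every intraday period, which is why the U-shaped pattern is largely removed.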

For the ‘traditional’ volatility models, we apply the GARCH(1, 1) model to both intraday and daily returns, and the FIGARCH model to intraday returns. We estimate the models using maximum likelihood. In order to filter the first order serial correlation of the intraday returns series caused by microstructure effects, we use an MA(1) process in the mean equation of the intraday GARCH-type models, and to allow for possible fat tails, we model the innovations in the GARCH process as iid Student’s-t errors. Thus, our MA(1)-GARCH(1, 1) model is defined as:

$$\begin{aligned} r_{t,n} &= \varepsilon_{t,n} + \theta \varepsilon_{t,n-1}, \quad \varepsilon_{t,n} \mid \Omega_{t,n-1} \sim D_\nu(0, h_{t,n}), \\ h_{t,n} &= \omega + \alpha_1 \varepsilon_{t,n-1}^2 + \beta_1 h_{t,n-1}, \end{aligned} \tag{4a}$$

where $\omega$, $\alpha_1$ and $\beta_1$ are parameters of the variance equation, and the error term $\varepsilon_{t,n}$, which is conditional on the information set $\Omega_{t,n-1}$, follows a Student’s-t distribution $D_\nu$ with zero mean, variance $h_{t,n}$ and degrees of freedom $\nu$. In the daily GARCH model, we simply regress the daily returns on a constant in the mean equation. We define the daily returns $r_t$ as the difference between the logarithms of prices at 12:00 pm GMT on the day in question and at 12:00 pm GMT on the previous day. This results in 1237 daily return observations. The daily GARCH model is defined as:

$$\begin{aligned} r_t &= \mu + \varepsilon_t, \quad \varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t), \\ h_t &= \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1}, \end{aligned} \tag{4b}$$

where $N$ is a normal distribution with zero mean and variance $h_t$.
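To make the recursion in Eq. (4a) concrete, the sketch below filters a return series through an MA(1)-GARCH(1, 1) model for given parameter values. It does not estimate anything (the paper uses maximum likelihood), and the start values (a zero pre-sample innovation and the unconditional variance as the pre-sample conditional variance) are assumptions of this sketch. The function name and the parameter values are hypothetical.

```python
def ma1_garch11_filter(returns, theta, omega, alpha, beta):
    """Return (innovations, conditional variances) implied by Eq. (4a)."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite variance")
    eps_prev = 0.0                          # assumed pre-sample innovation
    h_prev = omega / (1.0 - alpha - beta)   # assumed pre-sample variance
    innovations, variances = [], []
    for r in returns:
        h_t = omega + alpha * eps_prev ** 2 + beta * h_prev  # variance equation
        eps_t = r - theta * eps_prev                         # invert the MA(1) mean
        innovations.append(eps_t)
        variances.append(h_t)
        eps_prev, h_prev = eps_t, h_t
    return innovations, variances


# Hypothetical parameters and a two-observation toy series.
eps, h = ma1_garch11_filter([1.0, 0.0], theta=-0.2, omega=0.1, alpha=0.1, beta=0.8)
```

Setting `theta=0` and subtracting a constant from the returns beforehand gives the daily specification in Eq. (4b); the Student's-t versus normal distinction only matters for the likelihood, not for this recursion.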

The FIGARCH model extends the variance equation of the standard GARCH model by including fractional differences, to capture the long memory properties of the volatility. Following Baillie, Cencen, and Han (2000), we specify an MA(1)-FIGARCH(1, d, 0) model as:

$$\begin{aligned} r_{t,n} &= \varepsilon_{t,n} + \theta \varepsilon_{t,n-1}, \quad \varepsilon_{t,n} \mid \Omega_{t,n-1} \sim D_\nu(0, h_{t,n}), \\ h_{t,n} &= \omega + \beta_1 h_{t,n-1} + \left(1 - \beta_1 L_1 - (1 - L_1)^d\right) \varepsilon_{t,n}^2, \end{aligned} \tag{5}$$

where $d$ is the order of fractional integration, $L_1$ is the lag operator on $n$ and $D_\nu$ is the Student’s-t distribution with zero mean, variance $h_{t,n}$ and degrees of freedom $\nu$. The model is estimated using the G@RCH package (Laurent & Peters, 2005) in the Ox language (Doornik, 2001).
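The lag polynomial in Eq. (5) can be expanded into weights on past squared innovations. The sketch below uses the standard binomial expansion of $(1 - L)^d$; it is not the G@RCH implementation, the infinite expansion is truncated at `n_lags` (an assumption any practical implementation has to make in some form), and the values of $d$ and $\beta_1$ are hypothetical.

```python
def frac_diff_weights(d, n_lags):
    """Coefficients pi_0..pi_n in (1 - L)^d = sum_k pi_k L^k."""
    pi = [1.0]
    for k in range(1, n_lags + 1):
        pi.append(pi[-1] * (k - 1 - d) / k)
    return pi


def figarch_1d0_weights(d, beta, n_lags):
    """Weights on eps^2 at lags 1..n_lags from 1 - beta*L - (1 - L)^d."""
    pi = frac_diff_weights(d, n_lags)
    weights = [-pi[k] for k in range(1, n_lags + 1)]
    weights[0] -= beta  # the lag-1 weight is d - beta
    return weights


# Hypothetical parameter values, truncated at five lags for display.
w = figarch_1d0_weights(d=0.4, beta=0.2, n_lags=5)
```

The constant terms cancel, so only lagged squared innovations enter, and the weights decay hyperbolically rather than geometrically, which is the long memory feature the model is meant to capture.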

The ARFIMA(p, d, q) model (Granger & Joyeux, 1980) for a series $y_t$ is defined as

$$\phi(L_2)(1 - L_2)^d (y_t - \mu) = \theta(L_2)\varepsilon_t, \tag{6}$$

where $d$ is the order of fractional integration, and $L_2$ is the lag operator on $t$. The AR and MA polynomial components are given as $\phi(L_2) = 1 + \phi_1 L_2 + \cdots + \phi_p L_2^p$ and $\theta(L_2) = 1 + \theta_1 L_2 + \cdots + \theta_q L_2^q$ respectively, and $\mu$ is the mean of $y_t$. Following ABDL (2003), $y_t$ is defined as the logarithm of the daily realized volatility, $\log(\sigma_t)$. Given that we cannot identify all of the parameters of an ARFIMA model from the data, especially in the case of the realized volatility (Hol & Koopman, 2002), many empirical studies set a fixed value of $d$ instead of estimating it. In this paper we follow ABDL (2003), who fix $d$ at 0.401, and set the order of the model at (5, d, 0).
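With $d$ fixed, the fractional differencing step in Eq. (6) reduces to a linear filter applied to the demeaned series, after which the AR(5) part can be fitted to the filtered values. A minimal sketch of the filter follows; it truncates the expansion at the start of the sample, which is an assumption of this sketch, and the function names and input series are hypothetical.

```python
def frac_diff_weights(d, n_lags):
    """Coefficients pi_0..pi_n in (1 - L)^d = sum_k pi_k L^k."""
    pi = [1.0]
    for k in range(1, n_lags + 1):
        pi.append(pi[-1] * (k - 1 - d) / k)
    return pi


def frac_diff(series, d):
    """Apply (1 - L)^d, truncated at the sample start, to the demeaned series."""
    mean = sum(series) / len(series)
    x = [v - mean for v in series]
    pi = frac_diff_weights(d, len(x) - 1)
    return [sum(pi[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]


# Hypothetical log realized volatilities for five days, with d fixed at 0.401.
log_rv = [-5.2, -5.0, -5.1, -4.8, -4.9]
filtered = frac_diff(log_rv, 0.401)
```

As a sanity check on the weights, `d = 0` leaves the demeaned series unchanged and `d = 1` yields ordinary first differences.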

We construct the daily realized volatility to beused in the ARFIMA(5, d, 0) model from the 15 minreturns series, with the daily realized variance definedby

σ 2t =

N−n=1

r2t,n, (7)

where the realized variance on day t is equal to thesum of squared intraday returns. To be consistent withthe one day definition of the daily GARCH model, therealized variance is also calculated from the returnsbetween 12:00 pm on day t − 1 and 12:00 pm on dayt . Therefore the ARFIMA(5, d, 0) model specificationbecomes:

$$\phi(L_2)(1 - L_2)^d (\log(\sigma_t) - \mu) = \varepsilon_t. \tag{8}$$

We estimate the model with the maximum likelihood method using the ARFIMA package (Doornik & Ooms, 2001) written in the Ox language.
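The steps from Eq. (7) to the fixed-d ARFIMA(5, d, 0) specification can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the Ox ARFIMA package: the AR coefficients are fitted by ordinary least squares on the fractionally differenced series rather than by maximum likelihood, the fractional filter is truncated at the start of the sample, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def realized_variance(intraday_returns):
    """Eq. (7): sum of squared intraday returns per day; input shape (days, N)."""
    r = np.asarray(intraday_returns, dtype=float)
    return (r ** 2).sum(axis=1)

def frac_diff(y, d):
    """Apply (1 - L)^d to y, truncating the expansion at the first observation."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    # element t is sum_k w_k * y_{t-k}
    return np.array([w[:t + 1] @ y[t::-1] for t in range(n)])

def fit_ar_ols(x, p=5):
    """Least-squares fit of x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t.

    With phi(L) = 1 + phi_1 L + ... as written in the text, phi_j = -a_j.
    """
    x = np.asarray(x, dtype=float)
    X = np.column_stack([x[p - j:len(x) - j] for j in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return coef

# Synthetic placeholder data: 300 days of 96 fifteen-minute returns.
rng = np.random.default_rng(0)
returns_15min = rng.normal(scale=1e-3, size=(300, 96))
log_vol = 0.5 * np.log(realized_variance(returns_15min))  # log(sigma_t)
u = frac_diff(log_vol - log_vol.mean(), d=0.401)
ar_coefs = fit_ar_ols(u, p=5)
```

Fixing $d$ in advance reduces the estimation to a linear autoregression on the filtered series, which is why the two-step sketch is possible at all; a joint likelihood estimate, as used in the paper, would treat the start-up observations more carefully.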

Many authors stress the superiority of the SV model relative to GARCH-type models in capturing the main empirical properties of daily financial returns (e.g., Broto & Ruiz, 2004). The SV model (Taylor, 1994) treats the volatility as an unobserved latent variable. We use a standard SV model which is defined as:

$$r_t = \sigma_t\varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, 1), \qquad t = 1, \ldots, T,$$

$$\sigma_t^2 = \sigma^{*2}\exp(h_t),$$

$$h_t = \phi h_{t-1} + \sigma_\eta\eta_t, \qquad \eta_t \sim \mathrm{NID}(0, 1), \tag{9}$$

where $r_t$ denotes the returns series and we assume that $\varepsilon_t$ and $\eta_t$ are uncorrelated. The variance process $\sigma_t^2$ is the product of a squared scale parameter, $\sigma^{*2}$, and the exponential of the stochastic process $h_t$.
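As an illustration of the data-generating process in Eq. (9), the sketch below simulates returns from the standard SV model. It is not the estimation routine; the function name, the parameter values in the usage line and the choice $h_1 = 0$ as the starting value are assumptions made for the example.

```python
import numpy as np

def simulate_sv(T, phi, sigma_eta, sigma_star, seed=0):
    """Simulate r_t = sigma_t * eps_t with sigma_t^2 = sigma_star^2 * exp(h_t)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(T)  # return shocks
    eta = rng.standard_normal(T)  # volatility shocks, independent of eps
    h = np.zeros(T)               # assumed start-up value h_1 = 0
    for t in range(1, T):
        h[t] = phi * h[t - 1] + sigma_eta * eta[t]
    sigma2 = sigma_star ** 2 * np.exp(h)
    returns = np.sqrt(sigma2) * eps
    return returns, sigma2

returns, sigma2 = simulate_sv(T=500, phi=0.95, sigma_eta=0.2, sigma_star=0.01)
```

Because $h_t$ is latent, only `returns` would be observed in practice, which is why estimation requires simulation-based or filtering methods rather than a closed-form likelihood.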

Hol and Koopman (2000) specify a SV model with implied volatility as an explanatory variable in the variance equation, and refer to it as the SVX model. The implied volatility, however, pertains to the daily frequency. To capture the intraday information effects on the daily volatility, we follow Hol and Koopman (2002) and use intraday returns (instead of the implied volatility) as the explanatory variable in the SVX model. The mean and variance equations of the SVX model are identical to those of the standard SV model, while the $h_t$ process is specified as:

$$h_t = \phi h_{t-1} + \gamma(1 - \phi L_2)\sigma_{t-1} + \sigma_\eta\eta_t, \tag{10}$$

where $\sigma_{t-1}$ is the intraday information defined in Eq. (7) as the sum of the 15 min squared intraday returns. We estimate the SV model by the simulated maximum likelihood method using the 'importance sampling technique'. For this estimation we use SsfPack by Koopman, Shephard, and Doornik (1999), written in the Ox language.13

3.3. Forecasting

We set the out-of-sample period for forecast evaluation to 100 days. We produce two series of forecasts at intraday sampling frequencies and four series of forecasts at daily frequencies. The in-sample estimation period runs from January 4th, 2000, to June 11th, 2004, which is 1137 days in total.14 We use a rolling window method and produce one-step-ahead daily volatility forecasts for daily models and 96-step-ahead intraday volatility forecasts for intraday models. This procedure is repeated 100 times in order to produce 100 daily volatility forecasts for evaluation out-of-sample.
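The rolling window scheme can be sketched generically in Python. The sketch assumes a fixed-length window that moves forward one day at a time, and it takes the model as a hypothetical callable `fit_and_forecast` that re-estimates on the window and returns a one-day-ahead forecast; neither the function name nor the placeholder model in the usage line comes from the paper.

```python
import numpy as np

def rolling_forecasts(series, window, n_forecasts, fit_and_forecast):
    """Produce n_forecasts one-step-ahead forecasts from a rolling window.

    Forecast i is based on observations i .. i + window - 1 and targets
    observation i + window.
    """
    series = np.asarray(series, dtype=float)
    if len(series) < window + n_forecasts - 1:
        raise ValueError("series too short for the requested window and forecasts")
    out = np.empty(n_forecasts)
    for i in range(n_forecasts):
        sample = series[i:i + window]        # re-estimation sample for step i
        out[i] = fit_and_forecast(sample)    # model-specific fit and forecast
    return out

# Placeholder "model": forecast tomorrow's variance by the window average.
daily_rv = np.random.default_rng(1).gamma(shape=2.0, scale=1e-5, size=1237)
forecasts = rolling_forecasts(daily_rv, window=1137, n_forecasts=100,
                              fit_and_forecast=np.mean)
```

With a window of 1137 days and 100 forecasts, the last estimation sample ends one day before the final evaluation day, so no forecast uses information from its own target date.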

The intraday forecasts are based on deseasonalized filtered returns and must be transformed back to those from the original returns. To do so, we multiply by the appropriate seasonal term $S_{t,n}^2$, i.e.,

$$\tilde{h}_{t,n} = S_{t,n}^2 \times \hat{h}_{t,n}, \tag{11}$$

where $\hat{h}_{t,n}$ is the intraday variance forecast from the deseasonalized filtered returns, $S_{t,n}^2$ is estimated by the method described in the previous section, and $\tilde{h}_{t,n}$ is the transformed forecast for the original returns.

13 This program is downloadable from the webpage of 'Analysis of Stochastic Volatility using SsfPack', http://www.ssfpack.com/, created by Koopman.

14 The initial estimation period considered for the intraday GARCH(1, 1) and FIGARCH(1, d, 0) models is from 12:00 pm January 3rd, 2000, to 12:00 pm June 11th, 2004.

We calculate the daily variance forecast by summing the 96-step-ahead intraday variance forecasts:

$$h_t = \sum_{n=1}^{N} \tilde{h}_{t,n}, \qquad (n = 1, 2, \ldots, N), \tag{12}$$

where $\tilde{h}_{t,n}$ is the transformed intraday variance forecast of the $n$th interval of day $t$ from the rolling window estimation and $h_t$ is the daily variance forecast on day $t$ (see Martens, 2001).
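The re-seasonalizing and aggregation steps in Eqs. (11) and (12) amount to an element-wise product followed by a sum. A minimal Python sketch, with hypothetical argument names and made-up numbers in the usage line:

```python
import numpy as np

def daily_variance_forecast(deseason_forecasts, seasonal_sq):
    """Eqs. (11)-(12): rescale each intraday forecast by S^2 and sum over the day.

    deseason_forecasts: the N intraday variance forecasts from the
        deseasonalized filtered returns.
    seasonal_sq: the N squared seasonal terms S^2 for the same intervals.
    """
    f = np.asarray(deseason_forecasts, dtype=float)
    s2 = np.asarray(seasonal_sq, dtype=float)
    if f.shape != s2.shape:
        raise ValueError("forecasts and seasonal terms must have the same shape")
    return float((s2 * f).sum())

# Three intervals only, for illustration; the paper uses N = 96.
h_day = daily_variance_forecast([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```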

4. Evaluating alternative forecasts: methodology

4.1. True volatility measure

Since the true volatility is unobservable, we follow Andersen and Bollerslev (1998) and use a realized volatility series constructed from 5 min returns (the highest frequency that our data set allows) as a proxy for the true volatility, i.e.,

$$\sigma_{rv,t}^2 = \sum_{n=1}^{N} r_{t,n}^2, \tag{13}$$

where $r_{t,n}^2$ are 5 min interval squared returns and $\sigma_{rv,t}^2$ is the realized variance on day $t$. As there is no generally accepted forecast evaluation method, we use a set of alternative methods that have emerged as the most popular ones in the relevant literature.

4.2. Regression test

For the 'regression test',15 also known as the 'predictive power test', the true volatility is regressed on a constant and on the forecasted volatility, in order to examine whether the forecasted value has any explanatory power for the true volatility. Given that we use the realized volatility as a proxy for the true volatility, we conduct the regression test by using the following equation:

$$\sigma_{rv,t+1}^2 = \alpha + \beta h_{t+1} + \varepsilon_{t+1}, \tag{14}$$

where $\sigma_{rv,t+1}^2$ is the realized volatility at time $t + 1$, and $h_{t+1}$ is the forecasted value of the true volatility at time $t + 1$ predicted at time $t$. We compare the values of the goodness-of-fit measure $R^2$ derived from Eq. (14) to assess the predictive ability of the alternative models. The higher the $R^2$ value, the more adequately the true volatility can be explained by the forecasted one in the equation, and the greater the forecasting ability of the model producing that forecast.

15 This approach was originally introduced by Mincer and Zarnowitz (1969) and further studied by Hatanaka (1974). It has also been used widely in forecast evaluation by Andersen and Bollerslev (1998), ABDL (2003), Balaban (2004), Martens et al. (2002) and Pong et al. (2004), for example.
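The regression in Eq. (14) is an ordinary least-squares fit of the realized variance on a constant and the forecast. A minimal Python sketch using NumPy, with a hypothetical function name and toy inputs:

```python
import numpy as np

def mincer_zarnowitz(realized, forecast):
    """OLS of realized on [1, forecast]; returns (alpha, beta, R^2) as in Eq. (14)."""
    y = np.asarray(realized, dtype=float)
    x = np.asarray(forecast, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    (alpha, beta), *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - (alpha + beta * x)
    centred = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centred @ centred)
    return alpha, beta, r2

# Toy example: the realized series is an exact linear function of the forecast.
x = np.array([1.0, 2.0, 3.0, 4.0])
alpha, beta, r2 = mincer_zarnowitz(1.0 + 2.0 * x, x)
```

Note that $R^2$ measures only how well the forecast tracks movements in the realized variance; a forecast that is biased in level or scale can still obtain a high $R^2$, which is one reason the paper complements this test with accuracy measures.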

4.3. Root Mean Squared Error (RMSE)

The RMSE belongs to a broader class of accuracy tests, which compare the 'true volatility' with the forecasted value and calculate the forecast error. It is given by:

$$\mathrm{RMSE} = \sqrt{\frac{1}{P}\sum_{p=1}^{P}\bigl(h_{p+1} - \sigma_{rv,p+1}^2\bigr)^2}, \tag{15}$$

where $P$ is the number of observations included in the out-of-sample period.
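Eq. (15) translates directly into a few lines of Python. The function name and inputs below are illustrative only:

```python
import numpy as np

def rmse(forecast, realized):
    """Eq. (15): root of the mean squared difference between forecast and proxy."""
    err = np.asarray(forecast, dtype=float) - np.asarray(realized, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

example = rmse([1.0, 2.0], [1.0, 4.0])  # errors are 0 and -2
```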

4.4. Diebold-Mariano test and HLN adjusted DM test

As a general rule, the model with the smaller forecast error is not necessarily superior to the other 'competing' models, as the difference between two forecasts may not be statistically significantly different from zero. To take such considerations into account, Diebold and Mariano (1995) propose an 'equal accuracy' test (DM henceforth) between two forecasting models. The DM statistic follows an asymptotic standard normal distribution under the null hypothesis.

The normal distribution can be a poor approximation for finite samples, however, and the test statistics may be biased, depending on the degree of serial correlation in the forecast errors. We therefore use the adjusted DM test statistic suggested by Harvey, Leybourne, and Newbold (1997) (HLN–DM henceforth), which has better small-sample properties.
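The following Python sketch shows one common way of computing the DM statistic and its HLN small-sample correction. It assumes a squared-error loss and a rectangular-kernel long-run variance with $h - 1$ autocovariances for an $h$-step-ahead forecast; these choices and the function name are assumptions for the example and may differ in detail from the implementation used in the paper.

```python
import numpy as np

def hln_dm(e1, e2, h=1):
    """Return (DM, HLN-adjusted DM) for forecast errors e1 and e2.

    A negative statistic means the first model has the smaller average loss.
    """
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    d = e1 ** 2 - e2 ** 2              # loss differential under squared error
    P = len(d)
    d_bar = d.mean()
    dc = d - d_bar
    lrv = dc @ dc / P                  # gamma_0
    for k in range(1, h):              # add 2 * gamma_k for k = 1 .. h-1
        lrv += 2.0 * (dc[k:] @ dc[:-k]) / P
    dm = d_bar / np.sqrt(lrv / P)
    correction = np.sqrt((P + 1 - 2 * h + h * (h - 1) / P) / P)
    return dm, correction * dm

dm_stat, hln_stat = hln_dm([0.1, -0.2, 0.1, 0.3, -0.1],
                           [1.0, -0.5, 0.8, 1.2, -0.9])
```

Harvey, Leybourne, and Newbold (1997) also recommend comparing the adjusted statistic with Student's-t critical values with $P - 1$ degrees of freedom rather than with the standard normal.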

4.5. Superior predictive ability test (Hansen, 2005)

The DM test can be used to compare the forecasts of any two competing models. In practice, however, we typically have to choose the most accurate forecast from a set of models. Hansen (2005) introduces the superior predictive ability (SPA henceforth) test, which is designed to evaluate the performances of several alternative models. In addition, this test utilizes a bootstrap procedure to assess whether the same outcomes can be obtained from more than one sample. In the SPA test, a benchmark model is selected against which all other 'alternative forecasts' are judged.

For the SPA test, the forecasts are evaluated by a predefined loss function. The SPA program (Hansen, Kim, & Lunde, 2003) uses the mean squared error (MSE) and mean absolute error (MAE). Based on the out-of-sample forecast performance in terms of the RMSE reported in the next section, we choose the intraday FIGARCH model as a benchmark model for the euro/CHF and euro/GBP series, and use the ARFIMA model for the remaining two series. The null hypothesis of the test is that the forecasts from the benchmark model are not inferior to those from the other models.
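The quantities that enter the SPA test are the loss differentials between the benchmark and each alternative. The Python sketch below computes only their sample means under the MSE or MAE loss; it does not implement the bootstrap or the studentized test statistic of Hansen (2005), and the function name and toy inputs are hypothetical.

```python
import numpy as np

def mean_loss_differentials(truth, benchmark, alternatives, loss="mse"):
    """Mean of L(benchmark) - L(alternative) for each alternative model.

    A positive value means the alternative had the smaller average loss,
    which is evidence against the null that the benchmark is not inferior.
    """
    if loss not in ("mse", "mae"):
        raise ValueError("loss must be 'mse' or 'mae'")
    y = np.asarray(truth, dtype=float)

    def losses(f):
        err = np.asarray(f, dtype=float) - y
        return err ** 2 if loss == "mse" else np.abs(err)

    bench = losses(benchmark)
    return {name: float((bench - losses(f)).mean())
            for name, f in alternatives.items()}

diffs = mean_loss_differentials([1.0, 2.0, 3.0], [1.0, 2.0, 3.0],
                                {"alt": [2.0, 3.0, 4.0]}, loss="mse")
```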

5. Model estimation and out-of-sample forecast evaluation

5.1. Model estimation

We obtain the estimated parameters for the realized volatility model, the daily models, and the intraday models from the first in-sample period estimation. Table 2 reports the results of the intraday MA(1)-GARCH(1, 1) and MA(1)-FIGARCH(1, d, 0) models, along with those from the daily GARCH(1, 1). All MA(1) parameters in the mean equations, capturing the first order negative autocorrelation in the returns, are negative. All of the parameters in the mean and variance equations of the intraday GARCH and FIGARCH models are significant. The fact that $\alpha_1 + \beta_1 < 1$ in the intraday GARCH model reveals that the GARCH process is stationary, and since $\alpha_1 + \beta_1$ is close to 1 in most cases, the volatility is persistent. Thus, the MA(1)-GARCH(1, 1) specification can capture the properties of the 15 min intraday returns effectively. Similarly, the daily GARCH(1, 1) model fits the daily returns well. In the FIGARCH model, the parameter $d$ is significantly different from zero for all series, indicating that the volatility process exhibits a long memory property. In particular, the value of $d$ is around 0.2 in all cases, which is consistent with the values reported by Baillie et al. (2000).16 The statistical adequacy of the model can be checked by diagnostic tests on the residuals, as provided in Table 3.

16 The parameter $\beta_1$, however, is not significant in the euro/CHF series, under the FIGARCH(1, d, 0) specification. We have tried various models and found that a FIGARCH(0, d, 1) specification fits the data best, indicating that the euro/CHF return series is an ARCH process rather than a GARCH process under a long memory specification. The fractional parameter $d$ is close to 0.2 and is significant.

Table 2
In-sample parameter estimation results for the intraday GARCH, intraday FIGARCH, and daily GARCH models.

Intraday MA(1)-GARCH(1, 1):

$$r_{t,n} = \varepsilon_{t,n} + \theta\varepsilon_{t,n-1}, \qquad \varepsilon_{t,n} \mid \Omega_{t,n-1} \sim D_\nu(0, h_{t,n}),$$

$$h_{t,n} = \omega + \alpha_1\varepsilon_{t,n-1}^2 + \beta_1 h_{t,n-1}. \tag{4a}$$

Intraday MA(1)-FIGARCH(1, d, 0):

$$r_{t,n} = \varepsilon_{t,n} + \theta\varepsilon_{t,n-1}, \qquad \varepsilon_{t,n} \mid \Omega_{t,n-1} \sim D_\nu(0, h_{t,n}),$$

$$h_{t,n} = \omega + \beta_1 h_{t,n-1} + \bigl(1 - \beta_1 L_1 - (1 - L_1)^d\bigr)\varepsilon_{t,n}^2. \tag{5}$$

| Parameter | euro/CHF | euro/GBP | euro/JPY | euro/USD |
|---|---|---|---|---|
| MA(1)-Intraday GARCH(1, 1) | | | | |
| θ | −0.245 (−77.0)*** | −0.152 (−51.2)*** | −0.103 (−36.1)*** | −0.104 (−35.6)*** |
| ω | 0.160 (30.1)*** | 0.024 (21.1)*** | 0.013 (20.6)*** | 0.054 (24.3)*** |
| α1 | 0.188 (34.5)*** | 0.066 (36.7)*** | 0.058 (37.0)*** | 0.083 (34.1)*** |
| β1 | 0.672 (89.9)*** | 0.912 (389)*** | 0.931 (560)*** | 0.870 (244)*** |
| ν | 3.970 (70.5)*** | 5.030 (59.2)*** | 4.230 (70.9)*** | 4.187 (73.4)*** |
| MA(1)-Intraday FIGARCH(1, d, 0) ((0, d, 1) for euro/CHF) | | | | |
| θ | −0.248 (−66.3)*** | −0.155 (−45)*** | −0.105 (−31.3)*** | −0.106 (−32.9)*** |
| ω | 0.252 (31.8)*** | 0.213 (31.7)*** | 0.169 (30.7)*** | 0.245 (29.7)*** |
| β1 (φ1 for CHF) | 0.053 (6.73)*** | 0.050 (7.09)*** | 0.065 (8.86)*** | 0.057 (7.82)*** |
| d | 0.186 (29.6)*** | 0.193 (43.9)*** | 0.212 (48.0)*** | 0.188 (37.2)*** |
| ν | 4.192 (72.97)*** | 5.422 (63.88)*** | 4.664 (74.26)*** | 4.406 (75.9)*** |
| Daily GARCH(1, 1) | | | | |
| µ (×10^−4) | −0.197 (−0.29) | 1.19 (0.92) | 2.88 (1.47) | 2.49 (1.21) |
| ω (×10^−7) | 2.27 (5.07)*** | 1.79 (1.75)* | 4.51 (2.35)** | 9.83 (1.91)* |
| α1 | 0.072 (9.07)*** | 0.047 (3.87)*** | 0.052 (4.67)*** | 0.036 (3.08)*** |
| β1 | 0.891 (67.8)*** | 0.946 (67.4)*** | 0.939 (87.9)*** | 0.945 (54.4)*** |

The table provides estimates of the intraday GARCH and FIGARCH and daily GARCH models for the in-sample period from January 4th, 2000, to June 11th, 2004. The numbers in parentheses are t-statistics. The results in the first two panels are estimated by 15 min deseasonalized returns using Eqs. (4a) and (5). For the CHF/euro we estimate a MA(1)-FIGARCH(0, d, 1) model where $h_{t,n} = \omega + \bigl(1 - (1 - \phi_1 L_1)(1 - L_1)^d\bigr)\varepsilon_{t,n}^2$ and $L_1$ operates on $n$. The daily GARCH model is estimated by Eq. (4b), $r_t = \mu + \varepsilon_t$, $\varepsilon_t \mid \Omega_{t-1} \sim D(0, h_t)$, $h_t = \omega + \alpha_1\varepsilon_{t-1}^2 + \beta_1 h_{t-1}$, where $r_t$ is the daily return series. The intraday model was estimated using quasi-maximum likelihood with Student's-t distributed innovations with $\nu$ degrees of freedom. The daily GARCH model is estimated assuming a normal distribution.
* Significant at the 10% level.
** Significant at the 5% level.
*** Significant at the 1% level.

We then consider an ARFIMA(5, d, 0) model with d = 0.401 (see ABDL, 2003).17 Instead of using a long memory VAR-RV model, as ABDL (2003) do, we adopt a single equation approach. The results of the Q-tests on the standardized residuals reported in Table 4 indicate that the ARFIMA model captures the volatility dependency well, but that the ARCH–LM tests give rise to low p-values (except for the euro/USD series). Turning to the stochastic volatility models, Table 5 reports the log likelihood values and residual test results for the SV and SVX models. The Q-test and the normality test on the residuals indicate that both models fit the daily return series successfully in all cases.

17 We considered a number of alternative models, but the AIC indicated that the order (5, d, 0) is preferable. We also tried to estimate the parameter $d$ for each series. The decrease in the AIC for in-sample estimation was trivial, and most importantly, its out-of-sample performance was similar but slightly inferior to the fixed $d$ case. We therefore retain the specification of ABDL (2003).

Table 3
Residual test results for the intraday GARCH, intraday FIGARCH and daily GARCH models.

| Test | euro/CHF | euro/GBP | euro/JPY | euro/USD |
|---|---|---|---|---|
| Intraday GARCH(1, 1) | | | | |
| Q(50) | 0.002 | 0.010 | 0.010 | 0.000 |
| Q2(50) | 0.088 | 0.000 | 0.000 | 0.000 |
| ARCH(10) | 0.005 | 0.000 | 0.000 | 0.000 |
| Intraday FIGARCH(1, d, 0) ((0, d, 1) for euro/CHF) | | | | |
| Q(50) | 0.001 | 0.009 | 0.039 | 0.037 |
| Q2(50) | 0.99 | 0.008 | 0.146 | 0.192 |
| ARCH(10) | 0.60 | 0.001 | 0.003 | 0.032 |
| Daily GARCH(1, 1) | | | | |
| Q(50) | 0.414 | 0.890 | 0.223 | 0.973 |
| Q2(50) | 0.744 | 0.798 | 0.941 | 0.864 |
| ARCH(10) | 0.660 | 0.479 | 0.159 | 0.570 |

The table shows the p-values of residual tests of in-sample estimations of the intraday GARCH and FIGARCH and daily GARCH models. Q(50) and Q2(50) are the p-values of the Ljung and Box (1979) test of up to 50 lags for the standardized residuals and their squares. ARCH(10) are the ARCH–LM test results of up to 10 lags.

Table 4
The in-sample parameter estimation results and the residual tests of ARFIMA(5, d, 0) with d = 0.401:

$$\phi(L_2)(1 - L_2)^d (\log(\sigma_t) - \mu) = \varepsilon_t. \tag{8}$$

| | euro/CHF | euro/GBP | euro/JPY | euro/USD |
|---|---|---|---|---|
| AR(1) | −0.085 (−2.89)*** | −0.090 (−3.08)*** | −0.092 (−3.12)*** | −0.152 (−5.13)*** |
| AR(2) | −0.027 (−0.9) | −0.059 (−2.00)** | −0.025 (−0.84) | −0.038 (−1.25) |
| AR(3) | 0.041 (1.39) | −0.005 (−0.170) | 0.017 (0.585) | −0.021 (−0.706) |
| AR(4) | 0.036 (1.21) | 0.014 (0.457) | 0.015 (0.517) | −0.017 (−0.569) |
| AR(5) | 0.073 (2.46)** | 0.164 (5.58)*** | 0.089 (3.01)*** | 0.092 (3.12)*** |
| Residual test | | | | |
| ARCH(1) | 8.98 (0.003) | 5.30 (0.022) | 4.55 (0.033) | 0.126 (0.723) |
| Q(30) | 34.4 (0.188) | 39.1 (0.079) | 34.2 (0.193) | 27.3 (0.501) |

The table provides the in-sample parameter estimates and residual test results for Eq. (8), where we use an ARFIMA(5, d, 0) model, and $d$ is fixed at 0.401. The first panel shows the estimations. The numbers in parentheses are t-statistics. The second panel shows the results of the residual tests, namely the ARCH–LM and Ljung-Box tests. The numbers in parentheses are test p-values.
** Significant at the 5% level.
*** Significant at the 1% level.

5.2. Out-of-sample forecast evaluation

5.2.1. The regression test

We report the regression test results of the intraday, daily, and realized volatility models in Table 6. In two cases, the intraday FIGARCH provides surprising and interesting results. Specifically, the regressions of the intraday FIGARCH model produce the highest $R^2$ values, followed by the ARFIMA model when considering the euro/GBP and euro/JPY exchange rates. The $R^2$ from the intraday FIGARCH model is 12% higher than that from the ARFIMA model for the euro/GBP rate and 46.9% higher for the euro/JPY series. The ARFIMA model performs best in the other two cases. However, the intraday GARCH model produces the lowest $R^2$ in all cases.

Table 5
The in-sample estimation diagnostic test results for the SV and SVX models. The SV model:

$$r_t = \sigma_t\varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, 1), \qquad t = 1, \ldots, T,$$

$$\sigma_t^2 = \sigma^{*2}\exp(h_t),$$

$$h_t = \phi h_{t-1} + \sigma_\eta\eta_t, \qquad \eta_t \sim \mathrm{NID}(0, 1). \tag{9}$$

The SVX model:

$$h_t = \phi h_{t-1} + \gamma(1 - \phi L_2)\sigma_{t-1} + \sigma_\eta\eta_t. \tag{10}$$

| | euro/CHF | euro/GBP | euro/JPY | euro/USD |
|---|---|---|---|---|
| SV model | | | | |
| Log likelihood | 5778 | 4889 | 4368 | 4402 |
| Q(12) | 8.49 | 14.4 | 9.57 | 9.53 |
| Normality test | 5.69 | 7.60** | 1.80 | 1.33 |
| SVX model | | | | |
| Log likelihood | 5767 | 4883 | 4367 | 4399 |
| Q(12) | 7.73 | 14.4 | 10.1 | 9.60 |
| Normality test | 4.86 | 7.06** | 1.74 | 0.728 |

The table shows the log likelihood values and the diagnostic residual test results of the estimations from Eqs. (9) and (10), the SV and SVX models. Q(12) and the normality test provide the test-statistics of the Ljung-Box test of up to 12 lags and the Jarque-Bera normality test for the residuals, respectively.
** The null hypothesis is rejected at the 5% significance level.

Table 6
The out-of-sample forecast evaluation results by the regression test and the accuracy test (RMSE).

$$\sigma_{rv,t+1}^2 = \alpha + \beta h_{t+1} + \varepsilon_{t+1} \tag{14}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{P}\sum_{p=1}^{P}\bigl(h_{p+1} - \sigma_{rv,p+1}^2\bigr)^2} \tag{15}$$

| Return fitted | Model | euro/CHF $R^2$ | euro/CHF RMSE ($\times 10^{-6}$) | euro/GBP $R^2$ | euro/GBP RMSE ($\times 10^{-6}$) | euro/JPY $R^2$ | euro/JPY RMSE ($\times 10^{-6}$) | euro/USD $R^2$ | euro/USD RMSE ($\times 10^{-6}$) |
|---|---|---|---|---|---|---|---|---|---|
| Intraday | FIGARCH | 0.54 | 5.19 | 0.28 | 5.27 | 0.47 | 9.93 | 0.002 | 12.89 |
| Intraday | GARCH | 0.22 | 6.19 | 0.19 | 9.80 | 0.29 | 23.31 | 0.001 | 28.57 |
| Realized volatility | ARFIMA | 0.60 | 6.25 | 0.25 | 6.13 | 0.32 | 9.18 | 0.013 | 10.43 |
| Daily | GARCH | 0.52 | 9.03 | 0.19 | 9.18 | 0.33 | 10.69 | 0.0002 | 21.86 |
| Daily | SV | 0.50 | 9.27 | 0.21 | 9.11 | 0.34 | 11.0 | 0.0003 | 22.0 |
| Daily | SVX | 0.51 | 9.77 | 0.24 | 9.64 | 0.35 | 10.6 | 0.0002 | 21.9 |

The table provides the $R^2$ values of Eq. (14) and the values of the RMSE calculated by Eq. (15) for 5 models, 2 frequencies and 4 exchange rates. The superior model has a higher $R^2$ and lower RMSE.

As a number of authors demonstrate, intraday information can improve the performance of traditional daily volatility models (Hol & Koopman, 2002; Jones, 2003; Koopman et al., 2005; Martens & Zein, 2004). Our results show that the daily SVX model that incorporates intraday information can explain the true volatility better than the corresponding SV model for three of the four exchange rates, the exception being the euro/USD series. With the exception of the euro/JPY series, however, the daily models cannot improve upon the performances of the two leading models (ARFIMA and intraday FIGARCH).18 The existing empirical evidence comparing the forecasting performance of the daily GARCH model with that of the SV model is mixed. Studies exist both for (e.g., Lopez, 2001; Yu, 2002) and against (Dunis, Laws, & Chauvin, 2001) the superiority of the SV among daily volatility models. Our results for the daily GARCH and SV models are mixed as well.

5.2.2. RMSE

Table 6 provides the results relating to the RMSE criterion. Again, the intraday FIGARCH model shows superior results in two cases, outperforming the other models for the euro/CHF and euro/GBP series. This result complements the findings of the regression test, which shows that the FIGARCH model has the highest $R^2$ value for the euro/GBP series. In the other two cases the ARFIMA model performs best, followed by the intraday FIGARCH model. The differences in forecast errors between the two models are small, and such differences are compared further below using the equal accuracy test. The intraday GARCH displays the worst performance in most cases, with the exception of the euro/CHF series.

Most previous studies have shown that the SVX performs better than the SV model, which is also confirmed by the regression test in this study. Under the RMSE criterion, however, this advantage disappears in most cases, with the exception of the euro/JPY exchange rate. In addition, the RMSE of the SV model is not as good as that of the daily GARCH model for the euro/USD and euro/JPY series, which is contrary to the results of the $R^2$ test. Such a result is not unusual in the volatility forecasting literature, since different forecast evaluation methods may favour different models.

In summary, the regression and accuracy tests not only confirm some of the earlier findings, but also produce some new results. On the one hand, the realized volatility model, ARFIMA, performs better than the traditional daily volatility models, which concurs with many previous studies. On the other hand, however, we find that the intraday FIGARCH can predict as well as the ARFIMA model in some cases. We find that the worst results are produced by the intraday GARCH model. This indicates that the GARCH model cannot capture the properties of the intraday volatility fully to obtain accurate forecasts. Furthermore, the results of the three daily volatility models are rather mixed, and different models may be chosen by different evaluation methods.

18 For the euro/JPY series, the SVX model produces the second highest $R^2$ value.

5.2.3. The HLN–DM test

Table 7 shows the results of the HLN–DM test of five pairs of models for each exchange rate series. The test assesses, pair by pair, whether the difference between the RMSEs of two competing models is significant.19 The differences in forecast errors between the intraday FIGARCH and ARFIMA models are not significant at the 1% level for three series. This result indicates that the intraday FIGARCH and ARFIMA models have almost the same performances, as judged by the equal accuracy test. The performance of the intraday FIGARCH model is significantly superior to that of the intraday GARCH model, which is ranked second in the case of the euro/CHF series. The intraday GARCH performance, however, is significantly inferior to those of the other models in the other three cases.

Table 8 shows the rankings given by the HLN–DM test for the four exchange rate series. Models with the same rank have statistically equal RMSEs, while the models in lower lines have numerically larger RMSEs. The table reveals results similar to those of the regression test. The intraday FIGARCH and ARFIMA models perform best for all of the exchange rates. There is no consistent result for the daily volatility models for the series under study; however, the three daily models do not perform significantly differently for the euro/JPY and euro/USD series.

19 According to the order given by the RMSE, we perform the HLN–DM test five times for each series. The models are sorted in terms of increasing RMSE and then the test statistics are calculated from consecutive pairs of models, starting with the two models having the smallest RMSEs.


Table 7
HLN–DM test results.

| Model pair | euro/CHF | euro/GBP | euro/JPY | euro/USD |
|---|---|---|---|---|
| Intraday FIGARCH vs. ARFIMA | – | −1.62 | 0.90 | 1.47 |
| Intraday FIGARCH vs. SVX | – | – | −0.73 | – |
| Intraday FIGARCH vs. Intraday GARCH | −4.15*** | – | – | – |
| Intraday FIGARCH vs. Daily GARCH | – | – | – | −1.88* |
| ARFIMA vs. Intraday GARCH | 0.16 | – | – | – |
| ARFIMA vs. SVX | – | – | – | – |
| ARFIMA vs. Daily GARCH | −7.87*** | – | – | – |
| ARFIMA vs. SV | – | −7.81*** | – | – |
| Daily GARCH vs. SV | −5.63*** | 2.15** | −2.23** | – |
| Daily GARCH vs. SVX | – | −6.68*** | 0.28 | −0.006 |
| SV vs. SVX | −10.72*** | – | – | 0.53 |
| SV vs. Intraday GARCH | – | – | −7.22*** | −3.38*** |
| SVX vs. Intraday GARCH | – | −0.20 | – | – |

The table shows the test statistics of the HLN–DM test. Based on the results of the RMSE, the tests are performed for the competing models pair by pair. The models are sorted in terms of increasing RMSE and then the test statistics are calculated from consecutive pairs of models, starting with the two models having the smallest RMSEs.
* The difference between the two models is significant at the 10% level.
** The difference between the two models is significant at the 5% level.
*** The difference between the two models is significant at the 1% level.

Table 8
Model rankings by the HLN–DM test.

| Rank | euro/CHF | euro/GBP | euro/JPY | euro/USD |
|---|---|---|---|---|
| 1 | Intraday FIGARCH | Intraday FIGARCH | ARFIMA | ARFIMA |
| | | ARFIMA | Intraday FIGARCH | Intraday FIGARCH |
| 2 | Intraday GARCH | SV | SVX | Daily GARCH |
| | ARFIMA | | Daily GARCH | SVX |
| | | | SV | SV |
| 3 | Daily GARCH | Daily GARCH | Intraday GARCH | Intraday GARCH |
| 4 | SV | SVX | | |
| 5 | SVX | Intraday GARCH | | |

The table provides the results of the HLN–DM test in terms of model rankings. Models in the same rank have statistically equal RMSEs, but within a rank, models in lower lines have numerically larger RMSEs.

5.2.4. Superior Predictive Ability (SPA) test

The SPA test is an alternative method for measuring whether the superiority of one model against the others is significant. This test selects six models among any large number of competing models, namely the most significant model, the best model, models with performances of 75%, 50% (median) and 25% relative to the benchmark model, and the worst model. Given that we only have five competing models in this paper in addition to the benchmark model, the results of the SPA test also show the order of the five models. The benchmark models chosen (according to the tests discussed above) are the intraday FIGARCH model for the euro/CHF and euro/GBP series, and the ARFIMA model for the euro/JPY and euro/USD series.

Tables 9a and 9b show the results of the SPA test for each series using two evaluation criteria. We report the 'consistent p-value' (Hansen, 2005) for this test at the bottom of each panel. The p-values show that in no case can the null hypothesis that the benchmark model is not inferior to other models be rejected.

Table 10 summarises the ranking given by the SPA test. The result is similar to the rankings obtained using the RMSE. The intraday FIGARCH and ARFIMA models outperform the others, with the exception of the euro/JPY series in the MSE case. The intraday GARCH model produces better forecasts than the daily models for the euro/CHF and the euro/GBP exchange rates (in terms of the MAE), but is ranked last for the other two exchange rates. The three daily volatility models have mixed rankings, while the daily GARCH model sometimes ranks ahead of the other two. For the euro/USD series, the SVX model is preferred to the other daily models whether the comparisons are based on the MAE or the MSE.

Table 9a
SPA test results evaluated by the MAE and MSE for the euro/CHF and euro/GBP exchange rates.

Models (euro/CHF)

| | Model (MAE) | Model (MSE) | t-statistic (MAE) | t-statistic (MSE) |
|---|---|---|---|---|
| Benchmark | Intraday FIGARCH | Intraday FIGARCH | – | – |
| Most significant | ARFIMA | Intraday GARCH | −2.51 | −3.77 |
| Best model | ARFIMA | ARFIMA | −2.51 | −3.99 |
| Model 75% | Intraday GARCH | Intraday GARCH | −3.33 | −3.77 |
| Median model 50% | Daily GARCH | Daily GARCH | −8.90 | −7.23 |
| Model 25% | SV model | SV model | −10.25 | −8.02 |
| Worst | SVX | SVX | −12.29 | −8.85 |

SPA test p-value: MAE 0.691, MSE 0.853.

Models (euro/GBP)

| | Model (MAE) | Model (MSE) | t-statistic (MAE) | t-statistic (MSE) |
|---|---|---|---|---|
| Benchmark | Intraday FIGARCH | Intraday FIGARCH | – | – |
| Most significant | ARFIMA | ARFIMA | −1.12 | −1.62 |
| Best model | ARFIMA | ARFIMA | −1.12 | −1.62 |
| Model 75% | Intraday GARCH | SV | −12.65 | −11.34 |
| Median model 50% | SV model | Daily GARCH | −5.30 | −4.98 |
| Model 25% | Daily GARCH | SVX | −5.50 | −5.11 |
| Worst | SVX | Intraday GARCH | −6.63 | −6.17 |

SPA test p-value: MAE 0.874, MSE 0.959.

Tables 9a and 9b show the SPA test results for each exchange rate series. The benchmark model selected is the FIGARCH model for the euro/CHF and euro/GBP series, and the ARFIMA model for the other two series. The null hypothesis of the test is that the benchmark model is not inferior to the other candidate models. The test chooses the most significant model, the best model, models with performances of 75%, 50% and 25% relative to the benchmark model, and the worst model. The test p-values are reported in the last panel.

Table 9b
SPA test results evaluated by the MAE and MSE for the euro/JPY and euro/USD exchange rates.

Models (euro/JPY)

| | Model (MAE) | Model (MSE) | t-statistic (MAE) | t-statistic (MSE) |
|---|---|---|---|---|
| Benchmark | ARFIMA | ARFIMA | – | – |
| Most significant | Intraday FIGARCH | Intraday FIGARCH | −2.89 | −2.54 |
| Best model | Intraday FIGARCH | Daily GARCH | −2.89 | −2.81 |
| Model 75% | Daily GARCH | Intraday FIGARCH | −2.98 | −2.54 |
| Median model 50% | SV | SVX | −3.49 | −2.72 |
| Model 25% | SVX | SV model | −3.58 | −3.08 |
| Worst | Intraday GARCH | Intraday GARCH | −10.93 | −10.09 |

SPA test p-value: MAE 0.638, MSE 0.726.

Models (euro/USD)

| | Model (MAE) | Model (MSE) | t-statistic (MAE) | t-statistic (MSE) |
|---|---|---|---|---|
| Benchmark | ARFIMA | ARFIMA | – | – |
| Most significant | Intraday FIGARCH | Intraday FIGARCH | −1.35 | −1.32 |
| Best model | Intraday FIGARCH | Intraday FIGARCH | −1.35 | −1.32 |
| Model 75% | SVX | SVX | −2.14 | −1.88 |
| Median model 50% | Daily GARCH | Daily GARCH | −2.65 | −1.84 |
| Model 25% | SV model | SV model | −4.55 | −2.70 |
| Worst | Intraday GARCH | Intraday GARCH | −7.88 | −5.41 |

SPA test p-value: MAE 0.924, MSE 0.906.

See the notes for Table 9a.

Table 10
Model rankings by the SPA test.

| Rank | euro/CHF MAE | euro/CHF MSE | euro/GBP MAE | euro/GBP MSE | euro/JPY MAE | euro/JPY MSE | euro/USD MAE | euro/USD MSE |
|---|---|---|---|---|---|---|---|---|
| 1 | ARFIMA | ARFIMA | ARFIMA | ARFIMA | Intraday FIGARCH | Daily GARCH | Intraday FIGARCH | Intraday FIGARCH |
| 2 | Intraday GARCH | Intraday GARCH | Intraday GARCH | SV | Daily GARCH | Intraday FIGARCH | SVX | SVX |
| 3 | Daily GARCH | Daily GARCH | SV model | Daily GARCH | SV | SVX | Daily GARCH | Daily GARCH |
| 4 | SV model | SV model | Daily GARCH | SVX | SVX | SV model | SV model | SV model |
| 5 | SVX | SVX | SVX | Intraday GARCH | Intraday GARCH | Intraday GARCH | Intraday GARCH | Intraday GARCH |

To simplify the results shown in Tables 9a and 9b, this table provides the model rankings (except for the benchmark model) according to the SPA test.

6. Conclusion

This paper considers daily volatility forecasts of various euro exchange rates using high-frequency data (15 min intervals), and examines the relative performances of alternative volatility forecasting models. We provide evidence to suggest that the traditional volatility model can be useful for volatility forecasting in high-frequency applications, provided that it captures the features of the intraday volatility successfully.

We find that using a long memory specification in high-frequency data can improve the forecasting power and accuracy significantly, a result that corroborates the findings of ABDL (2001), Corsi (2009) and Martens and Zein (2004). This finding, however, is contrary to the strand of the existing literature which suggests that the improvement in the forecasting performance comes only from the high frequency of the data (see Pong et al., 2004). Our results show that the intraday FIGARCH model produces forecasts which are as good as those from the ARFIMA model, and that they are jointly superior to all of the other models considered in most of the out-of-sample evaluation tests. The intraday GARCH model, which does not take the long memory property of volatility into account, produces unsatisfactory forecasts.

The good performance of the ARFIMA model in this study is consistent with the literature on the dollar exchange rates, which suggests that the ARFIMA model is the best model for high-frequency applications (see Hol & Koopman, 2002; Pong et al., 2004). The intraday FIGARCH model's outstanding performance, however, is a somewhat surprising finding, given that the FIGARCH model has seldom been regarded as the preferred model in high-frequency applications. This finding challenges the view that the traditional volatility model cannot fit the data when focusing on higher frequencies. After deseasonalizing the raw returns of the euro exchange rates and modelling the long memory property, it emerges that the FIGARCH model can produce the same satisfactory forecast results as were obtained from the newly developed models. Thus, the traditional volatility model could also be an alternative for volatility forecasting in a high-frequency framework and should be considered along with the newer models.

Analyzing the properties of the euro bilateral exchange rates, we find that they are consistent with the stylized properties of other financial series (stock market indices and other exchange rates) at high frequencies in many respects. This suggests that these properties are not specific to certain kinds of high-frequency data, but most probably reflect some general features that all such data share.

References

Andersen, T. G., & Bollerslev, T. (1997). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance, 4, 115–158.

Andersen, T. G., & Bollerslev, T. (1998). Answering the skeptics: yes, standard volatility models do provide accurate forecasts. International Economic Review, 39, 885–905.

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (1999). (Understanding, optimizing, using and forecasting) realized volatility and correlation. Manuscript, Northwestern University, Duke University and University of Pennsylvania. Published in revised form as “Great realizations”. Risk. March 2000 (pp. 105–108).

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2000). Exchange rate returns standardized by realized volatility are (nearly) Gaussian. Multinational Finance Journal, 4, 159–179.

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2001). The distribution of exchange rate volatility. Journal of the American Statistical Association, 96, 42–55.

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2003). Modelling and forecasting realized volatility. Econometrica, 71, 579–625.

Baillie, R. T., Cecen, A. A., & Han, Y. W. (2000). High frequency Deutsche Mark–US Dollar returns: FIGARCH representations and non-linearities. Multinational Finance Journal, 4, 247–267.

Balaban, E. (2004). Comparative forecasting performance of symmetric and asymmetric conditional volatility models of an exchange rate. Economics Letters, 83(1), 99–105.

Bandi, F. M., & Russell, J. R. (2005). Separating microstructure noise from volatility. Journal of Financial Economics, 79, 655–692.

Barbosa, P. A. (2002). Investigating high frequency exchange rate from the Brazilian Sisbex market. Working paper. http://ssrn.com/abstract=428581.

Bauwens, L., & Sucarrat, G. (2010). General to specific modelling of exchange rate volatility: a forecast evaluation. International Journal of Forecasting, 26(4), 885–907.

Beltratti, A., & Morana, C. (1999). Computing value at risk with high frequency data. Journal of Empirical Finance, 6, 431–455.

Broto, C., & Ruiz, E. (2004). Estimation methods for stochastic volatility models: a survey. Journal of Economic Surveys, 18(4), 613–649.

Clements, M. P., Galvao, A. B., & Kim, J. H. (2008). Quantile forecasts of daily exchange rate returns from forecasts of realized volatility. Journal of Empirical Finance, 15(4), 729–750.

Corsi, F. (2009). A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics, 7(2), 174–196.

Dacorogna, M. M., Gencay, R., Muller, U. A., Olsen, R. B., & Pictet, O. V. (2001). An introduction to high-frequency finance. San Diego: Academic Press.

Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13, 253–265.

Doornik, J. A. (2001). Object-oriented matrix programming using Ox 3.0. London: Timberlake Consultants Press.

Doornik, J. A., & Ooms, M. (2001). A package for estimating, forecasting and simulating ARFIMA models: ARFIMA package 1.01 for Ox. Package Manual.

Dunis, C. L., Laws, J., & Chauvin, S. (2001). The use of market data and model combination to improve forecast accuracy. In C. L. Dunis, A. Timmermann, & J. E. Moody (Eds.), Developments in forecast combination and portfolio choice. Chichester: Wiley.

Francq, C., Roy, R., & Zakoian, J. M. (2005). Diagnostic checking in ARMA models with uncorrelated errors. Journal of the American Statistical Association, 100, 532–544.

Ghysels, E., Sinko, A., & Valkanov, R. (2007). MIDAS regressions: further results and new directions. Econometric Reviews, 26(1), 53–90.

Granger, C. W. J., & Joyeux, R. (1980). An introduction to long memory time series models and fractional differencing. Journal of Time Series Analysis, 1, 15–39.

Hansen, P. R. (2005). A test for superior predictive ability. Journal of Business and Economic Statistics, 23, 365–380.

Hansen, P. R., Kim, J., & Lunde, A. (2003). Testing for superior predictive ability using Ox, a manual for SPA for Ox. Package Manual.

Harvey, D. I., Leybourne, S. J., & Newbold, P. (1997). Testing the equality of prediction mean squared errors. International Journal of Forecasting, 13, 281–291.

Hatanaka, M. (1974). An efficient estimator for the dynamic adjustment model with autocorrelated errors. Journal of Econometrics, 2, 199–220.

Heaney, R., & Pattenden, K. (2005). Change in unconditional foreign exchange rate volatility: an analysis of the GBP and USD price of the euro from 2002 to 2003. Applied Economics Letters, 12, 929–932.

Hol, E., & Koopman, S. J. (2000). Forecasting the variability of stock index returns with stochastic volatility models and implied volatility. Tinbergen Institute Discussion Papers No. 00-104/4.

Hol, E., & Koopman, S. J. (2002). Stock index volatility forecasting with high frequency data. Tinbergen Institute Discussion Paper No. 2002-068/4.

Jones, B. (2003). Is ARCH useful in high frequency foreign exchange applications? Research paper No. 24. Applied Finance Centre, Macquarie University.

Kayahan, B., & Stengos, T. (2002). Intra-day features of realized volatility: evidence from an emerging market. International Journal of Business and Economics, 1(1), 17–24.

Koopman, S. J., Jungbacker, B., & Hol, E. (2005). Forecasting daily variability of the S&P 100 stock index using historical, realised and implied volatility measurements. Journal of Empirical Finance, 12(3), 445–475.

Koopman, S. J., Shephard, N., & Doornik, J. A. (1999). Statistical algorithms for models in state space using SsfPack 2.2. Econometrics Journal, 2, 113–166.

Laurent, S., & Peters, J. P. (2005). G@RCH 4.0, estimating and forecasting ARCH models. Timberlake Consultants. www.timberlake.co.uk.

Ljung, G., & Box, G. (1979). On a measure of lack of fit in time series models. Biometrika, 66, 265–270.

Lobato, I., Nankervis, J. C., & Savin, N. E. (2001). Testing for autocorrelation using a modified Box-Pierce Q test. International Economic Review, 42, 187–205.

Lopez, J. A. (2001). Evaluating the predictive accuracy of volatility models. Journal of Forecasting, 20(2), 87–109.

Lux, T., & Kaizoji, T. (2004). Forecasting volume and volatility in the Tokyo stock market: the advantage of long memory models. In Computing in Economics and Finance 2004. Society for Computational Economics, No. 158.

Marlik, A. K. (2005). European exchange rate volatility dynamics: an empirical investigation. Journal of Empirical Finance, 12, 187–215.

Martens, M. (2001). Forecasting daily exchange rate volatility using intraday returns. Journal of International Money and Finance, 20(1), 1–23.

Martens, M., Chang, Y. C., & Taylor, S. (2002). A comparison of seasonal adjustment methods when forecasting intraday volatility. Journal of Financial Research, 25(2), 283–299.

Martens, M., & Zein, J. (2004). Predicting financial volatility: high-frequency time-series forecasts vis-a-vis implied volatility. Journal of Futures Markets, 24, 1005–1028.

McLeod, A. I. (1978). On the distribution of residual autocorrelations in Box-Jenkins method. Journal of the Royal Statistical Society, Series B, 40, 296–302.

Mincer, J., & Zarnowitz, V. (1969). The evaluation of economic forecasts. In Economic forecasts and expectations. New York: National Bureau of Economic Research.

Pong, S., Shackleton, M., Taylor, S. J., & Xu, X. (2004). Forecasting currency volatility: a comparison of implied volatilities and AR(FI)MA models. Journal of Banking and Finance, 28(9), 2541–2563.

Poon, S. H., & Granger, C. (2003). Forecasting financial market volatility: a review. Journal of Economic Literature, 41(2), 478–539.

Rahman, S., & Ang, K. P. (2002). Intraday return volatility process: evidence from NASDAQ stocks. Review of Quantitative Finance and Accounting, 19, 155–180.

Taylor, S. (1994). Modelling stochastic volatility: a review and comparative study. Mathematical Finance, 4, 183–204.

Taylor, S., & Xu, X. (1997). The incremental volatility information in one million foreign exchange quotations. Journal of Empirical Finance, 4, 317–340.

Vilasuso, J. (2002). Forecasting exchange rate volatility. Economics Letters, 76, 59–64.

West, K. D., & Cho, D. (1995). The predictive ability of several models of exchange rate volatility. Journal of Econometrics, 69, 367–391.

Yu, J. (2002). Forecasting volatility in the New Zealand stock market. Applied Financial Economics, 12, 193–202.

Zhang, L., Mykland, P. A., & Aït-Sahalia, Y. (2005). A tale of two time scales: determining integrated volatility with noisy high frequency data. Journal of the American Statistical Association, 100, 1394–1411.

Zumbach, G. (2004). Volatility processes and volatility forecast with long memory. Quantitative Finance, 4(1), 70–86.