# Forecasting exchange rates using cointegration models and intra-day data

Post on 11-Jun-2016


<ul><li><p>Journal of Forecasting, J. Forecast. 21, 151–166 (2002). Published online 20 March 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/for.822</p>
<p>Forecasting Exchange Rates using Cointegration Models and Intra-day Data</p>
<p>ADRIAN TRAPLETTI,<sup>1</sup> ALOIS GEYER<sup>1</sup>* AND FRIEDRICH LEISCH<sup>2</sup></p>
<p><sup>1</sup> Department of Operations Research, Vienna University of Economics, Austria</p>
<p><sup>2</sup> Department of Statistics and Probability Theory, Vienna University of Technology, Austria</p>
<p>ABSTRACT</p>
<p>We present a cointegration analysis on the triangle (USDDEM, USDJPY, DEMJPY) of foreign exchange rates using intra-day data. A vector autoregressive model is estimated and evaluated in terms of out-of-sample forecast accuracy measures. Its economic value is measured on the basis of trading strategies that account for transaction costs. We show that the typical seasonal volatility in high-frequency data can be accounted for by transforming the underlying time scale. Results are presented for the original and the modified time scales. We find that utilizing the cointegration relation among the exchange rates and the time scale transformation improves forecasting results. Copyright 2002 John Wiley & Sons, Ltd.</p>
<p>KEY WORDS: cointegration; exchange rates; high-frequency data; trading strategies</p>
<p>INTRODUCTION</p>
<p>Forecasting foreign exchange (FX) rates has been generally found to be a difficult task. Since the important paper by Meese and Rogoff (1983) numerous studies have shown that out-of-sample forecasts of a variety of models are roughly equivalent to or only slightly better than forecasts from a simple random walk (see, for instance, Diebold et al., 1994; Satchell and Timmermann, 1995; Brooks, 1997). This supports the view that FX markets process available information in a highly efficient way. Therefore, constructing any forecasting model requires us to utilize every potential source of information. 
The present paper attempts to do so in a number of ways.</p>
<p>First, we make use of the fact that FX rates are traded almost continuously on a worldwide basis. In order to account for this we use high-frequency intra-day data. We assume that at least some (possibly very short) time period is required to integrate new information into the different markets, in particular when considering how diverse the interpretation of information can be. When using intra-day data we expect that this provides the opportunity to detect (short-term) dependencies introduced by price adjustments. Provided that such adjustments occur very quickly, this may explain why daily data are not suitable for building forecasting models.</p>
<p>* Correspondence to: Alois Geyer, Department of Operations Research, Vienna University of Economics, Augasse 2-6, A-1090 Vienna, Austria. E-mail: Alois.Geyer@wu-wien.ac.at</p></li><li>
<p>Second, we take into account that the trading activity on FX markets is not uniformly distributed over time. If activity is high, forecasts may have to be produced more frequently than during quiet periods. Looking at the FX markets on a worldwide basis, Dacorogna et al. (1996) have shown distinct patterns in the daily trading activity. We confine ourselves to modelling the (almost) deterministic seasonality of FX market volatility that can be found during a day and during a week. We use a deseasonalization procedure similar to the method proposed by Dacorogna et al. (1993) and apply a volatility-based time scale to the price generating process. 
We do not attempt to account for endogenously induced changes in the volatility of the markets.<sup>1</sup> Rather, we rely only on the seasonality introduced by the pattern of opening times and intra-day activities of different markets, which can be assumed to stay rather constant over time.</p>
<p>Third, we consider a triangle of FX rates, namely the USDDEM, USDJPY and DEMJPY rates. We attempt to use the so-called triangular identity, which states that the ratio of two FX rates, called the main rates, must be equal to the cross rate. In principle, direct trading of the cross rate should be equivalent to carrying out the trade through the main rates. For daily data this condition is almost the definition of the cross rate in terms of the main rates. However, looking at higher frequencies, the triangular identity does not continuously hold. In practical terms, it is necessary to allow for some time to elapse such that new information can be processed by the market and be incorporated into prices. This relationship forces the three FX rates to be cointegrated, provided that each rate is integrated. Based on the work of Johansen (1991) we estimate a vector autoregressive (VAR) model, analyse the resulting cointegration term, and use the VAR model to generate forecasts. For comparison we also present forecasting results obtained from univariate models.</p>
<p>We compute out-of-sample forecasts for several horizons and evaluate the performance of the high-frequency time series models relative to that of a simple random walk. The ultimate test for any forecasting model can be considered to be its economic value (cf. Satchell and Timmermann, 1995). We therefore evaluate the profitability of the forecasts in terms of two trading strategies that account for transaction costs and differ with respect to the frequency of trades. 
Since the change in the time scale is a major element of our approach, we present results for models based on the (original) physical time scale and on the modified time scale.</p>
<p>In the next section we introduce high-frequency FX rates and briefly summarize the deseasonalization procedure and its effects. The results of the VAR analysis appear in the third section. In the fourth section we outline the experimental design and present the results of the forecasting and trading experiments. The final section offers a summary and conclusions.</p>
<p>HIGH-FREQUENCY EXCHANGE RATES</p>
<p>Data</p>
<p>The basic data for this study are the exchange rate quotes for USDDEM, USDJPY, and the cross rate DEMJPY. The data set covers the period from 1 October 1992 to 30 September 1993 on a tick-by-tick basis and contains 1,472,241 (USDDEM), 570,813 (USDJPY), and 158,978 (DEMJPY) data records. Each record consists of the time the record was collected, the bid and ask price, the identification of the reporting institution, and a validation flag.</p>
<p><sup>1</sup> The corresponding time scale is called intrinsic by Dacorogna et al. (1996).</p></li><li>
<p>The price values are computed as the average of the log of bid and ask prices. The returns are the price changes over a one-hour time interval, and the volatility is the absolute value of the returns. The one-hour interval was chosen in order to keep the intra-day character of the time series, but to avoid a bid-ask spread of the same order as the returns. To construct an equally spaced time series from the original tick-by-tick series, we take the most recent valid price record as a proxy for the current price record. We have also tried a linear interpolation between the two neighbouring prices, but the forecasting and trading results were not strongly affected. 
We decided to use the most recent record because it conforms more closely to an out-of-sample forecast situation.</p>
<p>Time scale</p>
<p>One of the major characteristics of high-frequency data is the strong intra-week and intra-day seasonal behaviour of the volatility. This is shown, for instance, by the autocorrelation function of volatility in physical time for the USDDEM rate in Figure 1.</p>
<p>A data generation process with strong seasonal distribution patterns is not stationary. Therefore it is necessary to control for these seasonalities before fitting any time series model. However, our primary goal is not only to improve the efficiency of the parameter estimates of the model but also to utilize the underlying regularities in the trading activities to obtain better forecasts and results of the trading strategies.</p>
<p>A very promising approach to filter these seasonal patterns is to apply a new time scale to the price generating process similar to the one suggested by Dacorogna et al. (1993). In probability theory this is formally equivalent to subordinated process modelling. In this model the returns follow a subordinated process in physical time and are non-stationary. However, they follow the stationary parent process on another time scale, which we call operational.</p>
<p>Figure 1. Correlogram for USDDEM returns and volatilities for the period from 1 October 1992 to 30 September 1993. Autocorrelations of returns are mainly inside the asymptotic 95% confidence bounds for white noise, whereas correlations of volatility exhibit a strong seasonality. (Axes: lag against ACF.)</p></li><li>
<p>The first step towards obtaining such a time scale is to omit the weekends. Thereby the original (physical) time scale is turned into a so-called business time. 
The business time stops while physical time runs through the weekend. It therefore still contains intra-weekly and intra-daily seasonalities (mainly the hour-of-the-day effects). To control for the remaining seasonal patterns, we apply another time scale to the business time scale, which is computed by stretching highly volatile market periods and shortening less volatile periods. Based on the transformation procedure explained in Appendix A, a one-hour time interval in operational time corresponds to about 15 minutes in physical time when the volatility in the market is high (for instance, in the afternoon according to Greenwich Mean Time). On average, however, time intervals on both scales are equally long.</p>
<p>The three FX series are constructed by sampling equally spaced in operational time with a sampling frequency corresponding to one-hour time intervals. During a highly volatile period this implies, for instance, that four consecutive hourly observations on the operational time scale are taken from a period of about one physical hour in the original database. As noted above, the most recent valid price record is taken as the current price.</p>
<p>The effect of changing the time scale can be seen in Figure 2.<sup>2</sup> Although conditional heteroscedasticity is still present, most of the seasonal effects have been removed. We find a significantly negative autocorrelation at lag one. Bollerslev and Domowitz (1993) find negative autocorrelations in five-minute returns of USDDEM quotes and attribute these to the nonsynchronous construction of the price series. Zhou (1996) explains the negative autocorrelation by the noise in prices. He argues that the returns of high-frequency data are effectively the difference between two noise series, with a first-order autocorrelation approaching −0.5 as the observation frequency increases. For tick-by-tick data of the USDDEM returns Zhou reports a correlation of −0.45. For the one-hour interval we find a value of only about −0.05 (−0.06 and −0.02 for USDJPY and DEMJPY, respectively).</p>
<p>Figure 2. Correlogram for USDDEM returns and volatilities in operational time. Autocorrelations of returns are mainly inside the asymptotic 95% confidence bounds for white noise, whereas correlations of volatility are mainly outside the bounds. (Axes: lag against ACF.)</p>
<p><sup>2</sup> Since the results for the two other rates are very similar to those of the USDDEM rate, they are not reported here.</p></li><li>
<p>Based on the Akaike information criterion and using only the first half of the sample (\(n = 4387\)) we have selected univariate AR(2) models for the returns<sup>3</sup> to compute forecasts, although we obtained very similar results (not reported below) with AR(1), MA(1) and MA(2) models.</p>
<p>COINTEGRATION ANALYSIS</p>
<p>The cointegration analysis is done for the first half of the sample (\(n = 4387\)). Based on augmented Dickey-Fuller tests we find that all series are \(I(1)\), which is hardly surprising. We then estimate a \(k\)th-order VAR of the three series by the maximum likelihood procedure following Johansen (1991). The ordering of the variables is (USDDEM, USDJPY, DEMJPY). Allowing for linear trends and for a varying number of independent unit roots, the VAR system can be written as</p>
<p>\[ \Delta X_t = \sum_{i=1}^{k-1} \Gamma_i \, \Delta X_{t-i} + \Pi X_{t-k} + \mu + \varepsilon_t \qquad (1) \]</p>
<p>where the \(\varepsilon_t\) are \(\mathrm{NID}(0, \Lambda)\) with the \(3 \times 3\) covariance matrix \(\Lambda\). Further parameters are the \(3 \times 1\) vector \(\mu\), the \(3 \times 3\) matrices \(\Gamma_i\), and the \(3 \times 3\) matrix \(\Pi\). The rank of the latter is equal to the number of cointegrating vectors.</p>
<p>We select the lag length in (1) using the Akaike information and final prediction error criteria (AIC and FPE). The unrestricted model (1) with \(\operatorname{rank}(\Pi) = 3\) is estimated using the first half of the sample. 
AIC and FPE both yield an optimal lag length of \(\hat{k} = 4\).</p>
<p>On the basis of the plots of the series (see Figure 3) a model without a linear trend was assumed. The results of the trace and maximum eigenvalue test statistics are given in Table I (cf. Johansen and Juselius, 1990). The support for the existence of exactly one cointegrating vector is strong. Moreover, assuming the absence of a linear trend is consistent with the data. The likelihood ratio test statistic is \(LR = 5.05\), which is asymptotically \(\chi^2(2)\) and, thus, not significant.</p>
<p>Table II presents the normalized estimate of the cointegrating vector \(\beta^{*} = (\beta', \beta_0')'\), where \(\Pi = \alpha\beta'\). \(\alpha\) and \(\beta\) are \(3 \times r\) matrices, and \(\mu = \alpha\beta_0'\). \(\beta\) determines the error correction mechanism of the model. It is natural to assume that all rates of the FX triangle enter this mechanism with the same weight. Given the chosen normalization we consider the hypothesis</p>
<p>\[ \beta^{*} = (1, -1, 1, 0)' \, \varphi \qquad (2) \]</p>
<p>where \(\varphi\) is a real-valued parameter. The likelihood ratio test is given by \(LR = 6.89\), which should be compared with the quantiles of the \(\chi^2(3)\) distribution. It is not significant and, therefore, hypothesis (2) is accepted with \(\hat{\varphi} = 2.612 \times 10^{3}\). We therefore maintain the model (1) and hypothesis (2) using \(\hat{k} = 4\) and \(\hat{r} = 1\). The estimates of the other parameters of the model are presented in Appendix B.</p>
<p>The autocorrelation function for the error correction term \((1, -1, 1) X_{t-k}\) is plotted in Figure 4. While the process is apparently \(I(0)\), it is immediately clear that this process is not uncorrelated.</p>
<p><sup>3</sup> These are equivalent to ARIMA(2,1,0) models in the rates.</p></li><li>
<p>Figure 3 (panels: USDDEM, USDJPY, DEMJPY). 
The price series, sampled equally spaced in operational time with a sampling frequency corresponding to one-hour time intervals, for the period from 1 October 1992 to 30 September 1993</p>
<p>Table I. Trace and maximum eigenvalue test statistics for various values of the cointegration rank \(r\) with critical values for a 5% significance level</p>
<table>
<tr><th></th><th>Trace</th><th>Critical value</th><th>\(\lambda_{\max}\)</th><th>Critical value</th></tr>
<tr><td>\(r \le 2\)</td><td>3.35</td><td>9.09</td><td>3.35</td><td>9.09</td></tr>
<tr><td>\(r \le 1\)</td><td>14.01</td><td>20.17</td><td>10.66</td><td>15.75</td></tr>
<tr><td>\(r = 0\)</td><td>608.98...</td></tr>
</table></li></ul>
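
The error correction term of the cointegration analysis can be illustrated with a minimal Python sketch. It assumes log prices formed as the average of log bid and log ask (as described in the Data section) and the equal-weight vector (1, −1, 1) applied in the ordering (USDDEM, USDJPY, DEMJPY). The function names `log_mid` and `triangle_deviation` and all quotes are hypothetical and invented for illustration; this is not the authors' code or data.

```python
import math


def log_mid(bid, ask):
    """Log price as the average of log bid and log ask."""
    return 0.5 * (math.log(bid) + math.log(ask))


def triangle_deviation(usd_dem, usd_jpy, dem_jpy):
    """Apply the weights (1, -1, 1) to the log prices (USDDEM, USDJPY, DEMJPY).

    The result is zero when the triangular identity holds exactly.
    """
    return usd_dem - usd_jpy + dem_jpy


# Hypothetical (bid, ask) quotes, invented for illustration only.
quotes = {
    "USDDEM": (1.6200, 1.6210),  # DEM per USD
    "USDJPY": (108.50, 108.60),  # JPY per USD
    "DEMJPY": (66.95, 67.05),    # JPY per DEM
}

x = {name: log_mid(bid, ask) for name, (bid, ask) in quotes.items()}
z = triangle_deviation(x["USDDEM"], x["USDJPY"], x["DEMJPY"])
print(f"deviation from the triangular identity: {z:.6f}")
```

A small non-zero value of `z` corresponds to a temporary departure from the triangular identity. In a fitted error correction model, deviations of this kind feed back into the subsequent changes of the three rates.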