Exchange Rate Forecasting: Results from a Threshold Autoregressive Model



Page 1: Exchange Rate Forecasting: Results from a Threshold Autoregressive Model

P1: SMA

Open economies review KL566-03-Pippenger January 27, 1998 17:23

Open Economies Review 9: 157–170 (1998)
© 1998 Kluwer Academic Publishers. Printed in The Netherlands.

Exchange Rate Forecasting: Results from a Threshold Autoregressive Model

MICHAEL K. PIPPENGER AND GREGORY E. GOERING
Associate Professor, Department of Economics, School of Management, P. O. Box 756080, University of Alaska, Fairbanks AK 99775-6080, USA

Key words: exchange rates, threshold autoregression, forecasting

JEL Classification Number: F3, C5

Abstract

Structural models of exchange rate determination rarely forecast the exchange rate more accurately than a naive random walk model. Recent innovations in exchange rate modeling indicate that changes in the exchange rate may follow a self-exciting threshold autoregressive model (SETAR). We estimate a SETAR model for various monthly US dollar exchange rates and generate forecasts for the estimated models. We find: (1) nonlinearities in the data not uncovered by the standard nonlinearity tests and (2) that the SETAR model produces better forecasts than the naive random walk model.

Structural models of exchange rate determination typically produce exchange rate forecasts inferior to a naive random walk model.¹ This finding leads many to conclude that open economy macro-models of exchange rate determination are at best incomplete. However, this result is perhaps not surprising when one views the exchange rate as an asset price.

As with many asset prices, exchange rates appear to contain significant nonlinearities when higher frequency data are examined. In particular, daily and weekly exchange rate data seem to contain conditional heteroskedasticity. However, many authors, such as Baillie and Bollerslev (1989), find that nonlinearities are typically not found when monthly or annual data are tested.

The standard non-linear model employed in exchange rate modeling is the Autoregressive Conditional Heteroskedasticity (ARCH) class model. A number of pre-test procedures have been developed to test for ARCH processes. In particular, Engle’s (1982) Lagrange Multiplier (LM) test can be used to detect nonlinearity and is typically used to test for ARCH. One criticism of the ARCH methodology is that while there are theoretical implications of ARCH processes (e.g., a time varying risk premium in asset pricing models), the occurrence of ARCH in economic data is, in general, not implied by theory.²

Another type of nonlinearity that may occur in economic data is the threshold autoregressive process.³ Unlike ARCH and its variants, threshold processes are frequently implied by economic theory. In particular, an exchange rate target zone would lead to a threshold type process for the nominal exchange rate (see Krugman, 1991). Similarly, exchange rate management such as “leaning against the wind” may lead to a threshold type process for changes in the nominal exchange rate.⁴ Also, transaction costs may lead to a threshold type process for the real exchange rate.⁵

First, we find that while ARCH nonlinearity is not detected in log changes in the nominal exchange rate, threshold nonlinearity is found.⁶ We then build a self-exciting threshold autoregressive (SETAR) model and show that, in general, the SETAR model forecasts changes in the exchange rate better than the naive random walk model.

Our paper is organized as follows. Section one motivates and discusses the use of the specific SETAR model employed and outlines the identification and estimation procedures used in SETAR model building. Section two discusses the data used, presents the estimated models, and shows that the ARCH nonlinearity tests commonly used may fail to detect threshold nonlinearity. Section three contains the forecasting properties of the estimated SETAR models and compares those forecasts to the naive random walk model. Our conclusions are contained in section four.

1. The SETAR model: Motivation, identification and estimation

The self-exciting threshold autoregressive (SETAR) model has the following general form:

$$x_t = a^{(j)}_0 + \sum_{i=1}^{k} a^{(j)}_i x_{t-i} + h^{(j)} \varepsilon_t \quad \text{if } r_{j-1} < x_{t-d} \le r_j, \tag{1}$$

where the superscript $(j)$ indexes the regime and $r_0 < r_1 < r_2 < \cdots < r_{l-1} < r_l$. The SETAR process in (1) is a piecewise linear AR($k$) process, and process switching depends upon the threshold parameter $r_j$ and the value of $x_{t-d}$, where $d$ is the delay. In this general case, there are $l$ regimes. This model is succinctly denoted as SETAR($l$; $k$).

The motivation for applying the SETAR model to log changes in the nominal exchange rate is straightforward. When a central bank follows a foreign exchange market intervention rule based on the magnitude of previous changes in the nominal exchange rate (e.g., “leaning against the wind”), the time series process for the exchange rate will switch when intervention occurs. The simplest intervention rule would be a two sided rule where intervention occurs when the change in the exchange rate is considered by the central bank as too “large” in absolute value. Hence, the specific SETAR model we examine is the SETAR(3; 1):⁷

$$
\begin{aligned}
x_t &= a^{(1)}_0 + a^{(1)}_1 x_{t-1} + \varepsilon_t && \text{if } -\infty < x_{t-1} \le r_1 \\
x_t &= a^{(2)}_0 + a^{(2)}_1 x_{t-1} + \varepsilon_t && \text{if } r_1 < x_{t-1} \le r_2 \\
x_t &= a^{(3)}_0 + a^{(3)}_1 x_{t-1} + \varepsilon_t && \text{if } r_2 < x_{t-1} \le +\infty,
\end{aligned}
\tag{2}
$$

where $r_1 < r_2$ and $x_t$ represents changes in the log exchange rate. Thus, as (2) indicates, we have chosen a three regime model with homoskedastic errors. Also, $\varepsilon_t$ is assumed to be Gaussian. When the percentage change in the exchange rate is larger than $r_2$ or smaller than $r_1$, the central bank intervenes in the foreign exchange market in the next time period, causing the process for the exchange rate to shift. Under a simple intervention rule, as long as the percentage change in the exchange rate is between $r_1$ and $r_2$ the central bank does not intervene in the foreign exchange market.⁸
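To make the three regime process in (2) concrete, the following Python sketch simulates a SETAR(3; 1) series. It is only an illustration: the function name `simulate_setar3`, the coefficient values, the thresholds, and the error standard deviation are all assumptions chosen for the example, not estimates from the paper.

```python
import random


def simulate_setar3(n, r1, r2, coefs, sigma=0.03, seed=0, burn_in=100):
    """Simulate x_t = a0 + a1 * x_{t-1} + eps_t, with (a0, a1) chosen by x_{t-1}.

    coefs holds three (a0, a1) pairs for the lower, middle and upper regimes.
    All numeric defaults are illustrative assumptions.
    """
    if not r1 < r2:
        raise ValueError("thresholds must satisfy r1 < r2")
    rng = random.Random(seed)
    x_prev, path = 0.0, []
    for t in range(n + burn_in):
        if x_prev <= r1:        # lower regime: x_{t-1} <= r1
            a0, a1 = coefs[0]
        elif x_prev <= r2:      # middle regime: r1 < x_{t-1} <= r2
            a0, a1 = coefs[1]
        else:                   # upper regime: x_{t-1} > r2
            a0, a1 = coefs[2]
        x_prev = a0 + a1 * x_prev + rng.gauss(0.0, sigma)  # Gaussian, homoskedastic
        if t >= burn_in:        # discard start-up observations
            path.append(x_prev)
    return path


# Hypothetical parameters: no dynamics inside the band, mild persistence outside it.
series = simulate_setar3(150, -0.03, 0.03, [(0.0, 0.3), (0.0, 0.0), (0.0, 0.3)])
```

The regime is selected by the lagged value only, which matches the delay of one period in (2).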

If the threshold parameters are known prior to model building, estimation is quite simple. First, re-order the data according to $x_{t-1}$ and segment the data according to the threshold parameters $r_1$ and $r_2$. Finally, estimate the coefficients for each regime via OLS.⁹ However, if the threshold parameters are not known, a number of techniques have been developed to identify these parameters.¹⁰
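The known-threshold case can be sketched in a few lines of Python. The helper names `ols_ar1` and `fit_setar3` are hypothetical, and the closed-form simple regression below stands in for whatever OLS routine one prefers.

```python
def ols_ar1(pairs):
    """OLS of x_t on a constant and x_{t-1}; pairs are (x_t, x_{t-1}) tuples."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two observations in a regime")
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope  # (intercept, slope)


def fit_setar3(series, r1, r2):
    """Re-order by x_{t-1}, split at the known thresholds, fit each regime by OLS."""
    pairs = sorted(zip(series[1:], series[:-1]), key=lambda p: p[1])
    regimes = [
        [p for p in pairs if p[1] <= r1],        # lower regime
        [p for p in pairs if r1 < p[1] <= r2],   # middle regime
        [p for p in pairs if p[1] > r2],         # upper regime
    ]
    return [ols_ar1(regime) for regime in regimes]
```

Sorting is not strictly needed for the per-regime regressions, but it mirrors the re-ordering step described in the text and makes the segments contiguous.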

The first technique we discuss is the recursive regression technique of Tsay (1989). The data series of $n$ paired observations $(x_t, x_{t-1})$ are obviously ordered according to time. Step 1 of the procedure re-orders the $n$ pairs of observations according to the size of $x_{t-1}$. Denote these new paired observations $X^*_\tau = (x^*_\tau, x^*_{\tau-1})$. Next a recursive regression is estimated of the form:

$$x^*_\tau = b_0 + b_1 x^*_{\tau-1} + \omega_\tau, \tag{3}$$

where $\tau$ no longer represents time but rather the $\tau$th paired observation of the reordered series. Next plot the t-ratio of the recursive estimates of the autoregressive coefficient against $x^*_{\tau-1}$. Under ideal conditions, if the model is linear the recursive t-ratio should monotonically increase in absolute value when plotted with respect to $x^*_{\tau-1}$. Under a two threshold SETAR model the recursive t-ratio should initially monotonically increase in absolute value until the process switches. At the point where the process switches, the recursive t-ratio should change direction and this turning point will indicate the region where the first threshold parameter should lie. After this turning point, the t-ratio should monotonically increase in absolute value until the next process switch occurs. Once again the t-ratio should change direction and this turning point should indicate the region in which the second threshold parameter should lie. Simulation experiments indicate that this procedure is fairly effective at finding the threshold parameters in simulated data. However, one problem with this technique is that if the first threshold occurs near the beginning of the re-ordered data set, this procedure fails to provide a clear indication of the location of the first threshold.
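The recursive t-ratios for (3) can be sketched as follows. This is a simplified illustration, not Tsay’s (1989) exact algorithm: instead of recursive least squares updating, it refits OLS on an expanding window of the re-ordered pairs, which yields the same coefficient estimates at each step. The function names and the `min_obs` start-up length are assumptions.

```python
import math


def slope_t_ratio(pairs):
    """t-ratio of b1 in x*_tau = b0 + b1 * x*_{tau-1} + omega; pairs are (x, x_lag)."""
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for y, x in pairs)
    std_err = math.sqrt(rss / (n - 2) / sxx)  # usual OLS standard error of the slope
    return slope / std_err


def recursive_t_ratios(series, min_obs=10):
    """Return (x_lag, t-ratio) points for expanding samples of the re-ordered data."""
    pairs = sorted(zip(series[1:], series[:-1]), key=lambda p: p[1])
    return [
        (pairs[m - 1][1], slope_t_ratio(pairs[:m]))
        for m in range(min_obs, len(pairs) + 1)
    ]
```

Plotting the second element of each point against the first gives the kind of diagram described above; turning points in the plot suggest candidate threshold regions.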

Another technique for identifying threshold processes uses the Akaike Information Criterion (AIC) as suggested by Tong (1983). Consider the possible thresholds $r^*_1$ and $r^*_2$. Using these threshold parameters and the re-ordered data $(x^*_\tau, x^*_{\tau-1})$, estimate the regression parameters for each of the three regimes given by $r^*_1$ and $r^*_2$. Next calculate the AIC under each of the three regimes, denoted as $\mathrm{AIC}_1(r^*_1)$, $\mathrm{AIC}_2(r^*_1, r^*_2)$, and $\mathrm{AIC}_3(r^*_2)$. Let $\mathrm{AIC}(r^*_1, r^*_2) = \mathrm{AIC}_1(r^*_1) + \mathrm{AIC}_2(r^*_1, r^*_2) + \mathrm{AIC}_3(r^*_2)$. Then, over all candidate values for $r_1$ and $r_2$, choose as the threshold parameters the values which minimize $\mathrm{AIC}(r^*_1, r^*_2)$.¹¹

We use a combination of the two techniques outlined above. First we employ the Tsay (1989) recursive t-stat technique to identify the likely regions where the threshold parameters lie and then over these regions find the values for $r_1$ and $r_2$ which minimize $\mathrm{AIC}(r^*_1, r^*_2)$.
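The AIC step of this combined procedure amounts to a grid search, sketched below in Python. The text does not spell out the AIC formula, so the sketch assumes the common Gaussian form $n \ln(\mathrm{RSS}/n) + 2p$ with $p = 2$ coefficients per regime; the function names, the candidate grids, and the `min_obs` guard are likewise assumptions.

```python
import math


def regime_aic(pairs):
    """Assumed Gaussian AIC, n * ln(RSS / n) + 2 * 2, for an AR(1) fit with intercept."""
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for y, x in pairs)
    return n * math.log(rss / n) + 2 * 2


def total_aic(pairs, r1, r2, min_obs):
    """AIC(r1, r2) = AIC_1 + AIC_2 + AIC_3, or None if a regime is too thin."""
    regimes = [
        [p for p in pairs if p[1] <= r1],
        [p for p in pairs if r1 < p[1] <= r2],
        [p for p in pairs if p[1] > r2],
    ]
    if min(len(regime) for regime in regimes) < min_obs:
        return None
    return sum(regime_aic(regime) for regime in regimes)


def grid_search_thresholds(series, lower_grid, upper_grid, min_obs=10):
    """Search candidate (r1, r2) pairs and return (AIC, r1, r2) at the minimum."""
    pairs = list(zip(series[1:], series[:-1]))  # (x_t, x_{t-1})
    best = None
    for r1 in lower_grid:
        for r2 in upper_grid:
            if r1 >= r2:
                continue
            aic = total_aic(pairs, r1, r2, min_obs)
            if aic is not None and (best is None or aic < best[0]):
                best = (aic, r1, r2)
    return best
```

In practice the two grids would cover only the regions suggested by the recursive t-ratio plot, which keeps the search small.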

2. Data and estimates

We use monthly end of period US dollar exchange rates for Austria, Belgium, Canada, Denmark, France, Germany, Ireland, Italy, Japan, Netherlands, Norway, Switzerland, and the UK from the International Monetary Fund’s International Financial Statistics on CD-ROM. Exchange rates are denominated in units of foreign currency per US dollar. The period under examination is from March 1979 to December 1991. The data are transformed by taking the natural log and first differencing.
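The transformation is $x_t = \ln s_t - \ln s_{t-1}$, where $s_t$ denotes the exchange rate level (a symbol introduced here for illustration). A minimal Python sketch, with made-up input values:

```python
import math


def log_changes(levels):
    """First differences of the natural log of a series of positive levels."""
    return [math.log(curr) - math.log(prev) for prev, curr in zip(levels, levels[1:])]


# Hypothetical month-end rates in foreign currency units per US dollar.
changes = log_changes([1.80, 1.85, 1.79, 1.82])
```

One observation is lost to differencing, so a series of $n$ levels yields $n - 1$ log changes.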

Monthly data are used for two reasons. First, this data frequency simplifies the choice of a delay parameter since exchange rate intervention will occur in less than one month. If daily or even weekly data are used, choice of the delay parameter may not be so clear cut, thus adding complication to the identification procedure. The other reason monthly data are used is that higher frequency data may contain other types of nonlinearity in addition to that attributable to thresholds. In particular, ARCH may also occur in higher frequency data, thus complicating the model building process. Furthermore, monthly data allow us to examine the ability of the standard ARCH tests to detect threshold nonlinearity.

The time period was chosen to coincide with the advent of the European Monetary System (EMS). Belgium, Denmark, France, Germany, Ireland, Italy, and the Netherlands participated in the EMS throughout the sample period. Hence, at least for these countries, US dollar exchange rates should follow a consistent process over the time period under examination.

The plots of the recursive t-ratios for Canada and Japan failed to indicate the presence of thresholds. However, for the remaining countries the recursive t-ratios indicate threshold nonlinearity and the likely regions for two thresholds. For example, figure 1 below depicts the recursive t-ratios of the estimated AR(1) coefficient for Germany.¹²

Figure 1 illustrates the recursive t-ratios for Germany plotted against lagged changes in the exchange rate. The figure illustrates the instability of the t-ratio over the initial part of the data, which makes the identification of the first threshold difficult using only the recursive t-ratio technique. This instability is attributable to the small number of observations used to estimate the initial t-ratios. However, the region for the upper threshold is clearly indicated as around 0.03.

The general pattern shown by figure 1 is found for all the remaining countries. Over these regions we conducted a grid search and used as threshold parameters the values which minimized the AIC as discussed above.


Table 1. Preliminary tests of monthly US dollar exchange rates.

| Country | Box-Ljung test | McLeod-Li test | LM-test | Kurtosis test |
|---|---|---|---|---|
| Austria | 23.2 | 2.55 | 0.249 | 1.31 |
| Belgium | 24.2 | 1.66 | 0.245 | 1.46 |
| Denmark | 27.0 | 1.11 | 0.004 | 0.21 |
| France | 23.2 | 3.53 | 1.31 | 1.75 |
| Germany | 22.3 | 1.94 | 0.192 | 1.21 |
| Ireland | 21.5 | 1.81 | 0.449 | 1.03 |
| Italy | 28.3 | 1.74 | 0.0002 | 0.76 |
| Netherlands | 25.3 | 1.15 | 0.002 | 1.09 |
| Norway | 27.6 | 2.08 | 0.589 | 3.83 |
| Switzerland | 21.7 | 1.47 | 0.075 | 0.28 |
| UK | 17.3 | 7.03 | 1.37 | 1.83 |

Note: The Box-Ljung test is distributed χ²(24), with a 10% critical value of 33.2. The McLeod-Li test is distributed χ²(4), with a 10% critical value of 7.78. The LM-test is distributed χ²(1), with a 10% critical value of 2.71. The Kurtosis test statistic is distributed standard normal. Data are in log changes.

Figure 1. Recursive T-statistics—Germany.

Table 1 contains the results of the standard nonlinearity tests for log changes in the exchange rate for the various countries as well as the estimated kurtosis.¹³ These tests, in effect, test the adequacy of the standard random walk model for the log exchange rate.¹⁴ Also, the Box-Ljung Q-test for white noise was applied to log changes in nominal exchange rates (see Ljung and Box, 1978). The McLeod-Li Q-test for nonlinearity and Engle’s Lagrange Multiplier (LM) test for ARCH were conducted and the data were tested for excess kurtosis. The McLeod-Li test applies the Box-Ljung Q-test to the random walk model, testing for up to 4th order autocorrelation in the squared residuals. The LM-test tests for ARCH(1) errors in the random walk model.¹⁵
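For readers who want to reproduce this kind of pre-test, the Python sketch below computes the three statistics in their textbook forms: the Ljung-Box Q statistic $Q = n(n+2)\sum_{k=1}^{m} \hat\rho_k^2/(n-k)$, the McLeod-Li statistic as the same Q applied to squared residuals, and Engle’s ARCH(1) LM statistic as the number of observations times the $R^2$ from regressing the squared residual on its own lag. The function names are hypothetical, and treating the demeaned log changes as the random walk residuals is an assumption, since the exact construction is in the paper’s notes.

```python
import math


def ljung_box_q(z, max_lag):
    """Ljung-Box Q statistic for autocorrelation in z up to max_lag."""
    n = len(z)
    mean = sum(z) / n
    dev = [v - mean for v in z]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho * rho / (n - k)
    return n * (n + 2) * total


def mcleod_li(resid, max_lag=4):
    """McLeod-Li statistic: Ljung-Box Q applied to the squared residuals."""
    return ljung_box_q([e * e for e in resid], max_lag)


def arch1_lm(resid):
    """Engle's LM statistic for ARCH(1): n * R^2 from e_t^2 on a constant and e_{t-1}^2."""
    u = [e * e for e in resid]
    y, x = u[1:], u[:-1]
    n = len(y)
    mean_y, mean_x = sum(y) / n, sum(x) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return n * sxy * sxy / (sxx * syy)  # R^2 of a simple regression is corr^2
```

Each statistic would then be compared with the χ² critical values listed in the note to Table 1.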

The log first difference of the nominal exchange rate appears to follow a white noise process in every case. The kurtosis test indicates excess kurtosis only for Norway and possibly for France and the UK. More importantly, neither the McLeod-Li nor the LM-test indicates the existence of ARCH type nonlinearity. This confirms the findings of previous studies that low frequency data typically do not seem to contain ARCH. For example, Baillie and Bollerslev (1989) only find ARCH in high frequency exchange rate data. However, while these tests may have power in detecting ARCH type nonlinearity, as shown by Goering and Pippenger (1994) these tests may have low power under this type of threshold alternative. Thus, although monthly exchange rate data fail to indicate ARCH type nonlinearity, the recursive t-ratios strongly indicate threshold nonlinearity. For example, although the t-ratios of Germany in figure 1 indicate the presence of thresholds, the McLeod-Li and LM-tests for Germany do not detect this threshold nonlinearity. This suggests that if researchers only use standard ARCH tests, they may mistakenly conclude nonlinearity does not play a significant role in exchange rate determination when low frequency data are examined.

Table 2 contains the estimated threshold parameters as well as nonlinearity and excess kurtosis tests of the residuals from the estimated threshold model.

Using the Box-Ljung Q-test, autocorrelation was not detected for any of the models at the 10% significance level.¹⁶ Also, Table 2 indicates that ARCH type or similar nonlinearity is not detected in the residuals of the threshold models.

Table 2. Threshold estimates and nonlinearity tests.

| Country | Upper threshold | Lower threshold | Box-Ljung test | McLeod-Li test | LM-test | Kurtosis test |
|---|---|---|---|---|---|---|
| Austria | 0.01966 | −0.02969 | 25.2 | 0.827 | 0.242 | −0.3245 |
| Belgium | 0.02395 | −0.0306 | 31.0 | 0.536 | 0.193 | −0.7819 |
| Denmark | 0.03197 | −0.01411 | 21.4 | 3.38 | 0.100 | −0.7081 |
| France | 0.02602 | −0.03863 | 21.3 | 2.96 | 1.89 | −0.3393 |
| Germany | 0.02692 | −0.03242 | 29.4 | 1.22 | 0.207 | −0.0737 |
| Ireland | 0.02202 | −0.02606 | 32.9 | 0.355 | 0.136 | −0.6147 |
| Italy | 0.02141 | −0.03217 | 31.2 | 0.179 | 0.002 | −0.8262 |
| Netherlands | 0.03242 | −0.02961 | 25.5 | 0.700 | 0.002 | −0.0713 |
| Norway | 0.03402 | −0.04028 | 26.7 | 1.68 | 0.140 | 1.8934 |
| Switzerland | 0.04077 | −0.02343 | 22.3 | 4.06 | 0.001 | −0.5896 |
| UK | 0.02670 | −0.01723 | 17.8 | 3.83 | 1.64 | 1.0744 |

Note: See Table 1.


The residuals for France and the UK no longer indicate excess kurtosis. Note that while the kurtosis test statistic for Norway still indicates that the residuals for this model contain excess kurtosis, the kurtosis measure is one-half that of the random walk model. Indeed, the measure of kurtosis is lower in all cases compared to the random walk model.

3. Forecasting results

Next we examine the in-sample prediction performance of the threshold model relative to the random walk model by comparing the one step ahead mean square prediction error (MSPE).¹⁷

In every case the threshold model obtains a lower MSPE than the random walk model. The improvement ranges from 7.26% for the UK to over 15% for Italy with an average improvement of 11.28%. In-sample the threshold model explains the behavior of the nominal exchange rate markedly better than a naive random walk model. Hence, taking into account the process switching which may result from exchange rate management greatly improves the in-sample performance of a time series model of exchange rate determination. The Diebold-Mariano or D-M test found in Table 3 employs Wilcoxon’s signed-rank test to test the hypothesis that the median of the differential between the squared forecast errors equals zero. This test is outlined in the Appendix. The null hypothesis of the D-M test and the null hypothesis of equality between the mean square prediction errors of two forecasts are similar in that both state that two forecasts have the same predictive accuracy. The D-M test indicates a statistically significant difference in predictive accuracy for only Denmark, at almost the 5% level. Thus, while the improvement is substantial in many cases, statistically the random walk and threshold forecasts possess similar predictive accuracy.

Table 3. In-sample mean square prediction error (MSPE * 1000).

| Country | Random walk model | Threshold model | Improvement | D-M test statistic |
|---|---|---|---|---|
| Austria | 1.2622 | 1.1499 | 8.90% | 0.717 |
| Belgium | 1.2868 | 1.1187 | 13.06% | 0.451 |
| Denmark | 1.2002 | 1.0284 | 14.31% | 1.943* |
| France | 1.2215 | 1.1233 | 8.04% | 0.002 |
| Germany | 1.2561 | 1.1217 | 10.69% | 0.854 |
| Ireland | 1.1631 | 0.9959 | 14.37% | 0.809 |
| Italy | 1.0177 | 0.8626 | 15.24% | 1.192 |
| Netherlands | 1.2588 | 1.1437 | 9.14% | 1.004 |
| Norway | 0.8562 | 0.7451 | 12.97% | 1.113 |
| Switzerland | 1.4499 | 1.3041 | 10.05% | 1.317 |
| UK | 1.2807 | 1.1877 | 7.26% | 0.758 |

Note: The D-M test is asymptotically distributed standard normal. ‘*’ stands for significant at the 10% level.
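The improvement figures in Table 3 are the percentage reduction in MSPE relative to the random walk, $100 \times (\mathrm{MSPE}_{RW} - \mathrm{MSPE}_{T})/\mathrm{MSPE}_{RW}$. The Python sketch below shows the one step ahead threshold forecast, the MSPE, and this percentage; the function names are hypothetical and the coefficients passed to `setar_forecast` would be whatever was estimated for each regime.

```python
def setar_forecast(x_prev, r1, r2, coefs):
    """One step ahead forecast from a SETAR(3; 1): pick the regime by x_{t-1}."""
    if x_prev <= r1:
        a0, a1 = coefs[0]
    elif x_prev <= r2:
        a0, a1 = coefs[1]
    else:
        a0, a1 = coefs[2]
    return a0 + a1 * x_prev


def mspe(actual, forecast):
    """Mean square prediction error over paired actual and forecast values."""
    if len(actual) != len(forecast):
        raise ValueError("series must have equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def improvement_pct(mspe_rw, mspe_thr):
    """Percentage MSPE reduction of the threshold model over the random walk."""
    return 100.0 * (mspe_rw - mspe_thr) / mspe_rw


# Austria's in-sample row of Table 3: 1.2622 versus 1.1499 gives about 8.90%.
austria = improvement_pct(1.2622, 1.1499)
```

Because the reported MSPE values are scaled by 1000 in both columns, the scaling cancels in the percentage.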

Next we compare the post-sample performance of the two models. In order to generate post-sample predictions, the models are estimated to September 1988. One step ahead post-sample forecasts were generated starting with October 1988. Thus 40 post-sample one step ahead predictions were generated.¹⁸ For the post-sample predictions of the threshold model, the threshold parameters used were those identified using the entire data set. Unfortunately, the technique used to identify the thresholds is in part based on judging where the thresholds likely lie, and hence this technique cannot simply be transformed into an algorithm needed for step by step updating as in a completely out-of-sample forecast. Therefore, the post-sample predictions we generate are not true ex-ante forecasts.¹⁹ However, the results can be interpreted as an indication of the relative explanatory power of the two models. The results of the post-sample forecasts are in Table 4 below.

The results found in Table 4 indicate that the improved forecasting performance of the threshold model relative to the random walk model found in-sample carries over to a great extent to the post-sample forecasts. The threshold model improves upon the random walk model in all cases with the lone exception of Austria, where model performance is almost identical. The average post-sample improvement of the threshold model is 6.12%. In addition, post-sample improvement is greater than that found in-sample for the cases of France and Ireland. While Germany and Switzerland show only a small decline in improvement when moving to post-sample forecasting, for Austria, Belgium, Denmark, UK, Italy, and Norway improvement falls by three to thirteen percentage points. However, in all cases except Austria the threshold model still shows at least a moderate improvement over the random walk model. Table 4 also contains the D-M test statistics. As in the in-sample comparison, the D-M statistic indicates that the two models generate forecasts of similar accuracy.

Table 4. Post-sample mean square prediction error (MSPE * 1000).

| Country | Random walk model | Threshold model | Improvement | D-M test statistic |
|---|---|---|---|---|
| Austria | 1.3892 | 1.3899 | −0.06% | −0.0177 |
| Belgium | 1.3435 | 1.2284 | 8.56% | −0.099 |
| Denmark | 1.2592 | 1.2172 | 3.33% | 0.097 |
| France | 1.2884 | 1.1719 | 9.04% | −0.059 |
| Germany | 1.3544 | 1.2236 | 9.65% | −0.023 |
| Ireland | 1.2934 | 1.0838 | 16.20% | −0.006 |
| Italy | 1.1129 | 1.0893 | 2.11% | −0.079 |
| Netherlands | 1.4080 | 1.3205 | 6.04% | 0.026 |
| Norway | 0.9675 | 0.9547 | 1.31% | −0.023 |
| Switzerland | 1.3222 | 1.2073 | 8.69% | 0.014 |
| UK | 1.4043 | 1.3703 | 2.42% | −0.059 |

Note: See Table 3.

Table 5. Accuracy of predicted sign.

| Country | In-sample: random walk | In-sample: threshold model | In-sample: H-M test | Post-sample: random walk | Post-sample: threshold model | Post-sample: H-M test |
|---|---|---|---|---|---|---|
| Austria | 48.6% | 55.9% | 2.23* | 56.4% | 46.2% | −1.47 |
| Belgium | 49.3% | 56.6% | 2.86*** | 33.3% | 43.6% | −2.62 |
| Denmark | 51.9% | 63.1% | 4.04*** | 33.3% | 58.9% | 1.06 |
| France | 51.3% | 51.3% | 0.45 | 33.3% | 51.3% | 0.27 |
| Germany | 50.3% | 59.9% | 3.39*** | 56.4% | 51.3% | −0.89 |
| Ireland | 51.3% | 54.6% | 1.56 | 30.8% | 46.2% | −0.72 |
| Italy | 51.3% | 59.9% | 3.13*** | 30.8% | 46.2% | −1.23 |
| Netherlands | 50.6% | 58.6% | 2.80*** | 51.3% | 56.4% | −0.23 |
| Norway | 50.6% | 56.6% | 2.15* | 38.5% | 46.2% | −1.01 |
| Switzerland | 52.6% | 56.6% | 2.22* | 61.5% | 48.7% | −0.46 |
| UK | 52.6% | 56.6% | 2.75*** | 48.7% | 51.3% | 0.18 |

Note: The H-M test is asymptotically distributed standard normal. ‘***’ stands for significant at the 1% level; ‘**’ stands for significant at the 5% level; ‘*’ stands for significant at the 10% level.

Another interesting way to compare the prediction performance of the two models is to compare accuracy of predicting the sign of the change in the exchange rate. Table 5 shows the percentage of times each model correctly predicted the sign of the change in the exchange rate in the next period for both in-sample and post-sample predictions.
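The sign accuracy percentages can be computed along the following lines. This is a sketch with hypothetical names; in particular, counting a zero forecast as correct only when the realized change is also exactly zero is an assumption, since the text does not say how ties are handled.

```python
def sign(value):
    """Return 1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def sign_hit_rate(actual, predicted):
    """Percentage of periods in which the predicted and realized signs agree."""
    if len(actual) != len(predicted):
        raise ValueError("series must have equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if sign(a) == sign(p))
    return 100.0 * hits / len(actual)


# Made-up changes and forecasts: three of the four signs agree.
rate = sign_hit_rate([0.1, -0.2, 0.3, -0.1], [0.05, 0.1, 0.2, -0.3])
```

The hit rate alone does not establish forecast ability, which is why a formal test is applied to it in what follows.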

In-sample, the random walk model was correct approximately 50% of the time while the threshold model’s accuracy ranged from 51.3% to 63.1% for an average of 57.43%.²⁰ The threshold model was more accurate in predicting the sign of a future change in the exchange rate than the random walk model in all cases but France, where the two models tied. However, in a sense the random walk model with drift is a “straw man” comparison since in every case the random walk model always predicts either a positive or a negative sign. In order to evaluate the sign forecast performance of the threshold model we apply the nonparametric test of market timing presented in Henriksson and Merton (1981) and denoted as the H-M test. This test has as the null hypothesis that the unconditional probability that an individual forecast will be correct equals 0.5. The interpretation of the null hypothesis is that the forecast technique has no forecast or “market timing” ability. Another interpretation of the null hypothesis is that the forecasts are no better than flipping a coin to forecast the sign of a change in the exchange rate. For the threshold model, in 9 of the 11 cases the null hypothesis is rejected and hence the in-sample sign forecasts are statistically better than simply flipping a coin.

Post-sample, in terms of the sign of the change, the predictive accuracy of the random walk ranged from 33.3% to 56.4% while the threshold model ranged from 43.6% to 58.9%. The threshold model is substantially more accurate than the random walk model in the cases of Belgium, Denmark, France, Ireland, and Italy. But the random walk model is more accurate in the cases of Austria, Germany, and Switzerland. However, overall the threshold model is more accurate at this type of prediction than the random walk model in 8 of the 11 cases. But for the threshold model the H-M test statistic is insignificant in all cases, which indicates that for post-sample forecasts the threshold model does not possess the same forecast ability as that of the in-sample predictions.

Since the threshold model can be thought of as a specific version of a general non-linear model, the results found in Diebold and Nason (1990) are of interest. Diebold and Nason (1990) use the locally weighted regression (LWR) technique to generate in-sample and out-of-sample exchange rate forecasts from weekly US dollar exchange rates for various countries including Belgium, Denmark, France, Germany, Italy, Netherlands, UK and Switzerland. On average, the in-sample MSPE improvement they find is around 15% for these countries. For these countries our in-sample improvement is above 12% on average. Thus, on average our in-sample improvement is comparable to that found in Diebold and Nason (1990). However, the LWR technique generates one step ahead out-of-sample forecasts that in general do not improve upon a random walk. In particular, for the eight exchange rate forecasts in common with the current analysis, the LWR technique improves upon the random walk model in MSPE only for Germany and Switzerland and then with only about a 3% and 1% improvement, respectively. Our post-sample improvement over the previous non-parametric results indicates the importance of threshold nonlinearity in the behavior of nominal exchange rates.²¹

One explanation of our improved results may be the data frequency we employ. Higher frequency data may contain more than one type of nonlinearity, thus decreasing the explanatory power of the non-parametric model. Also, if the true model is a threshold model, the general form of the LWR model imposes a smoothness on the process inconsistent with the discrete process shifts that occur in the threshold model we employ.

Our results are probably most similar to Engle (1994). As in our study, using the D-M test Engle (1994) finds that the random walk model and the Markov switching model generate statistically the same post-sample forecasts. Also, in terms of mean square prediction error the random walk model out-performs the Markov switching model over half the time whereas in our study the random walk model out-performs the threshold model only once.

Our results indicate that in general the threshold model forecasts the exchange rate at least as well as the naive random walk model and leads to greater forecast improvement than previous models. In particular, the in-sample sign predictions pass the test for forecast ability found in Henriksson and Merton (1981). In addition to forecast improvements, the threshold technique may in general be easier to implement than other more elaborate estimation techniques. Once the thresholds are identified, estimation reduces to OLS whereas one would expect that the implementation of many non-parametric techniques would be computationally more complicated.

4. Summary and conclusion

Structural models of exchange rate determination typically do not generate fore-casts which improve on simple random walk forecasts. This is, at least in part,due to the significant nonlinearities in the data. In particular, we may expectexchange rate data as well as many other economic and financial data seriesto contain threshold autoregressive nonlinearity. Exchange rate managementwould be expected to lead to a process similar to the threshold autoregres-sive model. We examine thirteen different US dollar exchange rate series toexplore the nonlinearities in the data. We estimate a self-exciting threshold au-toregressive (SETAR) model and use it for forecasting vis-a-vis a random walkmodel.

We first show that Autoregressive Conditional Heteroskedasticity (ARCH) isnot found in the monthly exchange rate data. We then show that even thoughARCH is not detected, threshold nonlinearity is found in eleven of the series.This indicates that the ARCH nonlinearity tests commonly applied in exchangerate modeling may fail to detect some types of nonlinearities such as thresholdnonlinearity. Hence, these types of nonlinearity tests may cause researchersto erroneously conclude that nonlinearity is not an important factor in low fre-quency exchange rate data.

For the eleven exchange rates where threshold nonlinearity is detected, we then construct a SETAR model and generate both in-sample and post-sample forecasts. These SETAR forecasts are compared to the forecasts of a pure random walk model. We find that the SETAR model out-performs the random walk model in some cases by a relatively large margin. Indeed, mean square prediction error improves on average approximately 11% in-sample and 6% post-sample. Also, in terms of mean square prediction error the threshold model out-performs the random walk model in every case in-sample and in all but one case post-sample. However, statistical tests indicate that in general the two models generate similar forecasts in terms of forecast accuracy. The SETAR model also offers substantial post-sample improvement when compared to previous studies using more complicated forecasting models. The SETAR model also more accurately predicts the sign of the change in the exchange rate than a random walk model, and in general the in-sample sign predictions pass the Henriksson-Merton test for forecast ability. Taken together, our results suggest that threshold boundaries may be an important component in exchange rate determination and modeling.
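The two summary measures used in this comparison can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and it assumes that the improvement is measured as the percentage reduction in mean square prediction error relative to the benchmark and that a sign prediction counts as correct when the forecast and the realized change are both positive or both non-positive.

```python
from typing import Sequence


def mspe(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean square prediction error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mspe_improvement(actual: Sequence[float], model: Sequence[float],
                     benchmark: Sequence[float]) -> float:
    """Percentage reduction in MSPE of the model relative to the benchmark."""
    base = mspe(actual, benchmark)
    return 100.0 * (base - mspe(actual, model)) / base


def sign_hit_rate(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Share of periods in which the forecast has the same sign category
    (positive versus non-positive) as the realized change."""
    hits = sum(1 for a, f in zip(actual, forecast) if (a > 0) == (f > 0))
    return hits / len(actual)
```

With `actual` holding the realized changes in the (log) exchange rate, `model` the SETAR forecasts and `benchmark` the random walk forecasts, a positive value of `mspe_improvement` indicates that the threshold model has the lower mean square prediction error.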


Appendix: The Diebold-Mariano test

Suppose that we have two forecasts, A and B. Let $e_{At}^2$ be the squared forecast error for forecast A at time $t$ and $e_{Bt}^2$ be the squared forecast error for forecast B at time $t$. There are $T$ forecasts. Diebold and Mariano (1995) construct the following test of forecast accuracy based upon Wilcoxon’s signed-rank test.

Let $d_t$ represent the differential between the squared forecast errors at time $t$. Thus, $d_t$ is given by $d_t = (e_{At}^2 - e_{Bt}^2)$.

The Diebold and Mariano (1995) version of Wilcoxon’s signed-rank test has as the null hypothesis that $\mathrm{Median}(e_{At}^2 - e_{Bt}^2) = 0$. Note that, in general, $\mathrm{Median}(e_{At}^2 - e_{Bt}^2)$ does not equal $\mathrm{Median}(e_{At}^2) - \mathrm{Median}(e_{Bt}^2)$, nor does this differential equal $\mathrm{Mean}(e_{At}^2) - \mathrm{Mean}(e_{Bt}^2)$. Therefore, the null of the D-M test is not the same as the hypothesis that $\mathrm{Median}(e_{At}^2) - \mathrm{Median}(e_{Bt}^2) = 0$, nor is it the same as the hypothesis that $\mathrm{Mean}(e_{At}^2) - \mathrm{Mean}(e_{Bt}^2) = 0$. However, all three hypotheses are similar in that they all state that the two forecasts possess the same forecast accuracy.

The D-M test statistic used here has the following form:

$$\text{Test-Statistic} = \frac{S_3 - \dfrac{T(T+1)}{4}}{\sqrt{\dfrac{T(T+1)(2T+1)}{24}}}$$

where $T$ is the number of forecasts; $S_3 = \sum_{t=1}^{T} I_+(d_t) \times \mathrm{rank}(|d_t|)$; $I_+(d_t) = 1$ when $d_t > 0$ and equals 0 otherwise; and $\mathrm{rank}(|d_t|)$ equals the rank of $|d_t|$ from a rank ordering of the $|d_t|$. The D-M test statistic is asymptotically distributed standard normal. For further details, see Diebold and Mariano (1995).
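The statistic above can be computed directly from the two series of forecast errors. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and tied values of $|d_t|$ are given their average rank, a convention the appendix does not address.

```python
import math
from typing import List, Sequence


def dm_signed_rank(errors_a: Sequence[float],
                   errors_b: Sequence[float]) -> float:
    """Signed-rank test statistic for equal forecast accuracy, built from
    the differential d_t of the squared forecast errors of A and B."""
    d = [ea * ea - eb * eb for ea, eb in zip(errors_a, errors_b)]
    t_count = len(d)

    # Rank |d_t| in ascending order; ties receive the average rank
    # (an assumption made for this sketch).
    order = sorted(range(t_count), key=lambda i: abs(d[i]))
    ranks: List[float] = [0.0] * t_count
    i = 0
    while i < t_count:
        j = i
        while j + 1 < t_count and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # S3 sums the ranks of the positive differentials only.
    s3 = sum(rank for rank, dt in zip(ranks, d) if dt > 0)
    mean = t_count * (t_count + 1) / 4
    variance = t_count * (t_count + 1) * (2 * t_count + 1) / 24
    return (s3 - mean) / math.sqrt(variance)
```

Since the statistic is asymptotically standard normal, a large positive value would indicate that forecast A tends to have the larger squared errors, and a large negative value the reverse.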

Acknowledgments

We are indebted to John Geppert for comments and suggestions on earlier drafts. We have also benefited from comments by John Pippenger, Michele Fratianni and an anonymous referee. Of course, any remaining errors are our own.

Notes

1. This result was first found by Meese and Rogoff (1983). See Edison (1991) for an updated analysis. It is also worth noting that there have been other attempts to increase the accuracy of real exchange rate forecasts. For example, Koedijk and Schotman (1990) construct an error correction model that outperforms a random walk model.

2. See Davidson and MacKinnon (1993, p. 556).

3. See Tong (1983) and Priestley (1988) for a complete discussion of threshold processes.

4. See Hsieh (1992) and Krager and Kugler (1993). Using weekly data Krager and Kugler (1993) build a threshold model for five US dollar exchange rates. However, they do not explore the forecasting properties of their models.

5. See Pippenger (1993) and Pippenger and Goering (1993).


6. See Goering and Pippenger (1994) for Monte Carlo results regarding this issue.

7. Since we examine monthly data, a one period delay was chosen since intervention would certainly occur within one month. However, if intervention occurs at a much higher frequency the use of monthly data may fail to detect the threshold parameters. Since we identify threshold behavior in most of the exchange rates examined, this temporal aggregation does not appear to be a significant problem. Also a three regime model was chosen since, as discussed below, the recursive t-test indicates three regimes in the cases where threshold behavior is found.

8. A more analytical treatment of the nonlinearity examined here may be found in Hsieh (1992). However, since Hsieh (1992) must make restrictive assumptions in order to obtain a closed form solution, his derived exchange rate processes at best simply motivate the application of the SETAR model.

9. These estimates will be consistent. See Tsay (1989) for details.

10. Of course, if the delay parameter is unknown, the identification technique will be more complicated than that discussed here. See Tong (1983, 1990) and Tsay (1989) for more details.

11. Clearly, a grid search over all possible threshold values could be done. However, this “brute force” technique may simply isolate outliers rather than detect thresholds.

12. For each case the pattern of the recursive t-ratios for the intercept term was almost identical to that found for the AR(1) coefficient.

13. See Engle (1982) and McLeod and Li (1983) for more details.

14. The benchmark random walk model includes a drift term. Both in and out of sample this model outperforms the pure random walk model in terms of mean square prediction error.

15. Using an LM test for ARCH(4) leads to the same conclusions as the test for ARCH(1).

16. Due to the reordering of the data according to $x_{t-1}$, in order to estimate the parameters the residuals must be reordered according to time in order to conduct the diagnostic tests.

17. As in most recent research such as Diebold and Nason (1990) and Mizrach (1992) the forecasts generated are for changes in (log) exchange rates.

18. Since the last forecast generated is for January 1992 which is outside of the data set, only 39 usable observations are generated.

19. Diebold and Nason (1990) confront a similar problem in out-of-sample forecasting of their non-parametric model. Indeed, since post-sample forecasts using linear time series models typically never completely re-identify the entire model at each step, post-sample forecasts are in general not completely ex-ante.

20. Of course under standard regression assumptions, the random walk model should be correct 50% of the time in-sample.

21. Meese and Rose (1990) and Mizrach (1992) both employ non-parametric techniques to forecast EMS exchange rates using daily data. These studies also find little or no improvement of non-parametric forecasts over the random walk model.

References

Baillie, R.T. and T. Bollerslev (1989) “The Message in Daily Exchange Rates,” Journal of Business and Economic Statistics 7, 297–305.

Ball, C.A. and A. Roma (1993) “A Jump Diffusion Process for the European Monetary System,” Journal of International Money and Finance 12, 475–492.

Davidson, R. and J. MacKinnon (1993) Estimation and Inference in Econometrics. New York: Oxford University Press.

Diebold, F.X. and R.S. Mariano (1995) “Comparing Predictive Accuracy,” Journal of Business and Economic Statistics 13, 253–263.

Diebold, F.X. and J.M. Nason (1990) “Nonparametric Exchange Rate Prediction?,” Journal of International Economics 28, 315–332.

Edison, H.J. (1991) “Forecast Performance of Exchange Rate Models Revisited,” Applied Economics 23, 187–196.


Engle, R.F. (1982) “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of U.K. Inflation,” Econometrica 50, 987–1007.

Engel, C. (1994) “Can the Markov Switching Model Forecast Exchange Rates?,” Journal of International Economics 36, 151–165.

Flood, R. and P. Garber (1983) “A Model of Stochastic Process Switching,” Econometrica 51, 537–551.

Goering, G.E. and M.K. Pippenger (1994) “A Note Regarding ARCH and Threshold Processes: Results From a Monte Carlo Study,” Applied Economics Letters 1, 210–213.

Henriksson, R.D. and R.C. Merton (1981) “On Market Timing and Investment Performance. II. Statistical Procedures for Evaluating Forecasting Skill,” Journal of Business 54, 513–533.

Hsieh, D.A. (1992) “A Nonlinear Stochastic Rational Expectations Model of Exchange Rates,” Journal of International Money and Finance 11, 235–250.

Koedijk, K.G. and P. Schotman (1990) “How to Beat the Random Walk: An Empirical Model of Real Exchange Rates,” Journal of International Economics 29, 311–332.

Krager, H. and P. Kugler (1993) “Non-Linearities in Foreign Exchange Markets: A Different Perspective,” Journal of International Money and Finance 12, 195–208.

Krugman, P. (1991) “Target Zones and Exchange Rate Dynamics,” Quarterly Journal of Economics 106(3), 669–682.

Ljung, G.M. and G.E. Box (1978) “On a Measure of Lack of Fit in Time-Series Models,” Biometrika 65, 297–303.

McLeod, A.J. and W.K. Li (1983) “Diagnostic Checking ARMA Time Series Models Using Squared-Residual Correlation,” Journal of Time Series Analysis 4, 269–273.

Meese, R.A. and K. Rogoff (1983) “Empirical Exchange Rate Models of the Seventies: Do they Fit out of Sample?,” Journal of International Economics 14, 3–24.

Meese, R.A. and A.K. Rose (1990) “Nonlinear, Nonparametric, Nonessential Exchange Rate Estimation,” American Economic Review 80, 192–196.

Mizrach, B. (1992) “Multivariate Nearest-Neighbour Forecasts of EMS Exchange Rates,” Journal of Applied Econometrics 7, S151–S163.

Pippenger, M.K. (1993) “Cointegration Test of Purchasing Power Parity: The Case of Swiss Exchange Rates,” Journal of International Money and Finance 12, 46–61.

Pippenger, M.K. and G.E. Goering (1993) “A Note on the Empirical Power of Unit Root Tests under Threshold Process,” Oxford Bulletin of Economics and Statistics 55, 473–481.

Priestley, M.B. (1988) Non-Linear and Non-Stationary Time Series Analysis. San Diego, CA: Academic Press Inc.

Tong, H. (1983) Threshold Models in Non-Linear Time Series Analysis. New York: Springer-Verlag.

Tong, H. (1990) Non-Linear Time Series: A Dynamical System Approach. London: Oxford University Press.

Tsay, R.S. (1989) “Testing and Modeling Threshold Autoregressive Processes,” Journal of the American Statistical Association 84, 231–240.