estimating structural exchange rate models by artificial neural networks

This article was downloaded by: [North Carolina State University] On: 22 February 2013, At: 11:12 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Applied Financial Economics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rafe20 Estimating structural exchange rate models by artificial neural networks Joseph Plasmans , William Verkooijen & Hennie Daniels Version of record first published: 07 Oct 2010. To cite this article: Joseph Plasmans , William Verkooijen & Hennie Daniels (1998): Estimating structural exchange rate models by artificial neural networks, Applied Financial Economics, 8:5, 541-551 To link to this article: http://dx.doi.org/10.1080/096031098332844 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.



Applied Financial Economics, 1998, 8, 541–551

Estimating structural exchange rate models by artificial neural networks

JOSEPH PLASMANS*, WILLIAM VERKOOIJEN § and HENNIE DANIELS §

* UFSIA (University of Antwerp), Study Centre of Economic and Social Sciences (SESO), Prinsstraat 13, B-2000 Antwerpen, Belgium; § ABN-AMRO Bank, Development Corporate Functions, POB 283, NL-1000 EA Amsterdam, The Netherlands and § Tilburg University, Department Management Science, POB 90153, NL-5000 LE Tilburg, The Netherlands

No theory of structural exchange rate determination has yet been found that performs well in prediction experiments. Only very seldom has the simple random walk model been significantly outperformed. Referring to three, sometimes highly nonlinear, monetary and nonmonetary structural exchange rate models, a feedforward artificial neural network specification is investigated to determine whether it improves the prediction performance of structural and random walk exchange rate models. A new test for univariate nonlinear cointegration is also derived. Important nonlinearities are not detected for monthly data of US dollar rates in Deutsche marks, Dutch guilders, British pounds and Japanese yen.

1 One of the first successful attempts to beat the naive random walk model for the Deutsche mark–US dollar and British pound–US dollar exchange rates was MacDonald and Taylor (1994). But in Hallwood and MacDonald (1994, p. 22) one of the same authors states that

Sadly, it will be shown that even today, after a further two decades or so of research, no single exchange rate theory provides a satisfactory – empirically consistent – theory of the exchange rate.

I. INTRODUCTION

It is now recognized that empirical exchange rate models of the post-Bretton Woods era are characterized by parameter instability and dismal forecast performance . . . (Meese and Rose, 1991).

The pessimism about the prediction quality of exchange rate models became generally accepted after the publication of the influential paper by Meese and Rogoff (1983). These authors performed a large number of statistical tests, indicating that not a single economic model of exchange rates was better in predicting bilateral exchange rates during the floating-rate period than the simple random walk model, which posits that all conditional expectations of exchange rates are equal to today's rates. In this respect also Baillie and McMahon (1989) observe that almost all bilateral exchange rates are integrated of order one (or, briefly, are I(1) processes) and that changes of exchange rates are uncorrelated over time. It follows, therefore, that exchange rates are in general not linearly predictable.1
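The random walk benchmark described above can be sketched in a few lines. This is only an illustration, not code from the article: the function names `random_walk_forecast` and `rmse` and the sample numbers are hypothetical. The forecast at every horizon is simply the last observed (log) exchange rate.

```python
import math


def random_walk_forecast(log_rates, horizon=1):
    """Random walk forecast: today's rate is the prediction for every step ahead."""
    if not log_rates:
        raise ValueError("need at least one observation")
    return [log_rates[-1]] * horizon


def rmse(actual, predicted):
    """Root mean squared prediction error, a common yardstick for such comparisons."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical log exchange rates, for illustration only.
history = [0.50, 0.52, 0.49]
forecast = random_walk_forecast(history, horizon=2)
```

Any structural model has to achieve a lower out-of-sample error than this naive forecast to count as an improvement.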

Several approaches have been tried to improve the quality of existing structural exchange rate models. Mark (1985) has noted that, if one relaxes the assumption that the representative agent is risk-neutral, the standard asset pricing model no longer implies that prices will follow a random walk. Monetary and portfolio models of exchange rates have been developed leading to linear and (highly) nonlinear models of bilateral exchange rates. Nonlinear specifications might improve linear specifications and random walk models. In this respect, Hsieh (1989) observes that changes of exchange rates may be nonlinearly dependent, even though they are linearly uncorrelated. Further, Engel and Hamilton (1990) and Chinn (1991) provide some evidence in favour of nonlinear forecasts of exchange rates. But, although Diebold and Nason (1990) used a nonparametric prediction technique (locally weighted regression), they were

0960–3107 © 1998 Routledge


generally unable to improve upon a simple random walk in out-of-sample prediction of ten major bilateral dollar spot exchange rates in the post-1973 floating period. Meese and Rose (1991) also end with a negative conclusion:

. . . we do conclude that incorporating nonlinearities into existing structural models of exchange rate determination does not at present appear to be a research strategy which is likely to improve dramatically our ability to understand how exchange rates are determined.

The exchange rate literature usually restricts the application of nonparametric approaches to locally weighted regression techniques (Meese and Rose, 1990, 1991; Diebold and Nason, 1990; Mizrach, 1992), based on local (moving) averaging and being in principle direct generalizations of the standard nearest neighbour technique. It is generally recognized that nonparametric modelling based on local approximations becomes infeasible in high-dimensional spaces due to the increasing sparseness of the data (Huber, 1985). In macroeconomic models most data are typically sparsely distributed: data on economic fundamentals are available, at best, on a monthly basis, which limits the amount of data available to (say) a few hundred observations. Hence, the principle of local averaging is likely to fail in macroeconomic modelling problems.

The foregoing does not necessarily imply that model-free regression modelling is impossible in economics. When a low-dimensional representation is embedded in the data, dimensionality reduction methods may successfully be applied. One such method is artificial neural network (ANN) regression, which we will use throughout this paper. During the past few years there has been a noticeable increase of ANN applications in economics and finance (Trippi and Turban, 1993; Rehkugler and Zimmermann, 1994). However, no paper has been published on the application of ANNs to structural exchange rate modelling. Recently, some papers have attempted to forecast currency exchange rates by ANNs, but they are not based on specific structural exchange rate (determination) models and are either based on nonlinear and nonparametric trading issues (Baestaens et al., 1994; Deboeck, 1994; Dunis, 1995) or are of a purely empirical nature (Refenes et al., 1993; Kuan and Liu, 1995). Hence there is a need for an econometric implementation of structural exchange rate models by ANNs.

This paper examines whether introducing nonlinearities in the conditional mean into theoretical models of exchange rate determination improves the prediction power of these structural models. In the empirical part of this paper ANNs are employed to investigate the nonlinearity hypothesis for the exchange rates of the Japanese yen–US dollar, the British pound–US dollar, the Deutsche mark–US dollar, and the Dutch guilder–US dollar. More specifically, we test whether the assumed fundamental determinants of the structural models that we consider do in fact affect the bilateral exchange rates, without making additional assumptions about the functional forms of the exchange rate relationships.

The outline of this paper is as follows. In Section II (testable) models of exchange rate determination are formulated, based on theoretical structural exchange rate models. Section III introduces the ANN methodology, and Section IV examines the empirical dynamic characteristics of the structural models for the four exchange rates considered, utilizing a novel ANN nonlinear cointegration test. Section V outlines the methodology for assessing predictive performance, and examines the predictive power of the selected exchange rate models, specified in linear and in ANN form respectively. Finally, Section VI concludes the paper.

II. THEORETICAL MODELS OF EXCHANGE RATE DETERMINATION

Several theories on exchange rate determination exist (Baillie and McMahon, 1989; MacDonald and Taylor, 1992). In many theories two general hypotheses play a prominent role, the Purchasing Power Parity (PPP) hypothesis and the Uncovered Interest rate Parity (UIP) hypothesis. The main idea of the PPP hypothesis is that exchange rates and national price indices will adjust proportionately so as to maintain a given currency's purchasing power across boundaries, which means that under the assumption of strict PPP the real value of a given currency will be the same in all countries at any moment in time. The UIP hypothesis states that, in equilibrium, the interest rate differential among countries must be equal to the expected rate of change of the exchange rate.
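In logarithmic form the two hypotheses are commonly written as below. This is a textbook-style sketch rather than a formula taken from the article: $s_t$ is the log of the domestic-currency price of one US dollar, an asterisk marks the US variable, and the log price level $p_t$ and the conditional expectation operator $E_t$ are notational assumptions introduced here.

```latex
% Strict PPP: the exchange rate offsets the difference in (log) price levels.
s_t = p_t - p_t^{*}

% UIP: the interest rate differential equals the expected rate of change
% of the exchange rate.
E_t\left[ s_{t+1} \right] - s_t = r_t - r_t^{*}
```

Under strict PPP the log real exchange rate $s_t - p_t + p_t^{*}$ is constant, which is why deviations from the PPP benchmark are read as changes in competitiveness.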

Structural models of exchange rate determination were developed after the collapse of the (Bretton Woods) fixed exchange rate regime in March 1973. All these models assume domestic money market equilibrium and a long-term strict PPP hypothesis. Deviations in the real exchange rate from its PPP benchmark can then (very) roughly be viewed as long-run gains or losses in external competitiveness. Such structural models can broadly be subdivided into monetary and portfolio balance exchange rate models. Monetary models of exchange rate behaviour are descendants of the Mundell–Fleming type of models (Mundell, 1963; Fleming, 1962), assuming perfect substitutability between domestic and foreign (financial) assets. Portfolio balance exchange rate models define the sustainable real exchange rate as that value or path being consistent with internal and external macroeconomic balances; internal balance corresponds to an output being at its potential level and to a nonaccelerating rate of inflation, and external balance requires a balance of payments position in which any current account imbalance is financed by a sustainable rate of capital flows (Faruqee, 1995).


2 It should be noted that, theoretically, GNP is to be preferred as proxy of real income; GNP data, however, are available on a quarterly basis, whereas industrial production data are available on a monthly basis. Therefore, following Meese and Rogoff (1983), we use industrial production data in our experiments.

Several versions of the monetary exchange rate models have been put forward, giving rise to two main types of models. These are the Flexible-Price Monetary Model (FPMM) of Frenkel (1976) and Bilson (1978), and the Sticky-Price Monetary Model (SPMM) of Dornbusch (1976) and Frankel (1979).

A basic problem with the FPMM is that it assumes strict PPP, so that the (logarithm of the) real exchange rate cannot vary over time, not even in the short run. This is in contrast with reality: although PPP existed during the 1920s, it largely collapsed during the floating rate period from March 1973 (see Frenkel (1981) and, for further evidence until the end of the 1980s, Meulepas and Plasmans (1991)).

Therefore, a monetary model for nominal exchange rates with incomplete competition in the market of tradeable goods with sticky prices was needed, at least for the short run.

In Verkooijen et al. (1996) generalized versions of the FPMM and the SPMM are derived.

The key feature of the Portfolio Balance Model (PBM) is the assumed imperfect substitutability between domestic and foreign assets. The net financial wealth of the private sector can be subdivided into three components: nominal domestic money $M_i$, domestically issued bonds $B_i$, which can be government debt held by the domestic private sector, and foreign bonds $B_j$ denominated in foreign currency and held by domestic residents, which can be interpreted as net claims on foreigners held by the private sector. In a regime of floating exchange rates, a current account surplus on the balance of payments must be exactly matched by a capital account deficit, i.e. by capital outflow and, hence, by an increase in the net foreign indebtedness $B_j$ to the domestic economy. Therefore, current account imbalances will determine exchange rate changes.

Furthermore, the imperfect substitutability of domestic and foreign assets is equivalent to the assumption of a risk premium, separating expected depreciation and the domestic–foreign interest rate differential (implying a collapse of the UIP hypothesis). In the PBM this risk premium will be a function of relative domestic and foreign debt outstanding.

Summarizing, the reduced form equation for the nominal exchange rates may be written under the PBM as

$$S_{ij,t} = f(M_{i,t}, M_{j,t}, B_{i,t}, B_{j,t}, FB_{i,t}, FB_{j,t}) \tag{1}$$

where $FB_{i,t}$ and $FB_{j,t}$ denote foreign holdings of domestic and foreign bonds, respectively. Taking account of the above mentioned arguments, the four final terms may be replaced by the domestic and foreign accumulated current account surplus.

Since we examine exchange rates against the US dollar only (country j is the USA), the notation is slightly simplified; the subscripts i and j are omitted, instead of which all fundamentals corresponding to the USA carry a '*' mark.

The three models of exchange rate determination, which we will test empirically, are subsumed in the generally highly nonlinear equation:

$$s_t = f(r_t - r_t^*,\; m_t - m_t^*,\; ip_t - ip_t^*,\; \pi_t - \pi_t^*,\; TB_t,\; TB_t^*) + \varepsilon_t \tag{2}$$

where $s$ is the logarithm of the bilateral spot exchange rate (e.g. DM/$); $m - m^*$ the logarithm of the relative (ratio of foreign to domestic) nominal money supply; $ip - ip^*$ the logarithm of the relative industrial production; $r - r^*$ the nominal short-term interest rate differential; $\pi - \pi^*$ the inflation rate differential; $TB$ and $TB^*$ the cumulated trade balances, and $\varepsilon$ is a disturbance term.2

The flexible price monetary model (FPMM) includes only the first three terms, i.e. $r_t - r_t^*$, $m_t - m_t^*$, and $ip_t - ip_t^*$. The sticky price monetary model (SPMM) adds the inflation rate differential $\pi_t - \pi_t^*$ to this set of variables. Finally, the portfolio balance model (PBM) concentrates on the money holdings and on the cumulated domestic and foreign trade balances.
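The way the three models select their inputs from Equation 2 can be sketched as a simple lookup. The variable names (`r_diff`, `m_diff`, and so on) and the function `regressors` are hypothetical labels chosen for this illustration, and reading the PBM's "money holdings" as the money supply differential is an assumption, not something the text states explicitly.

```python
# Hypothetical column names for the fundamentals in Equation 2.
MODEL_REGRESSORS = {
    "FPMM": ("r_diff", "m_diff", "ip_diff"),
    "SPMM": ("r_diff", "m_diff", "ip_diff", "pi_diff"),
    # Assumption: money holdings enter the PBM in differential form.
    "PBM": ("m_diff", "tb", "tb_star"),
}


def regressors(observation, model):
    """Pick the input vector x_t for one model from a dict of fundamentals."""
    return [observation[name] for name in MODEL_REGRESSORS[model]]


# Made-up numbers, for illustration only.
obs = {"r_diff": 0.02, "m_diff": 0.10, "ip_diff": -0.03,
       "pi_diff": 0.01, "tb": 1.5, "tb_star": -2.0}
x_spmm = regressors(obs, "SPMM")
```

The same input vector can then be fed either to a linear regression or to a neural network, which is what makes the models comparable.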

Imposing the constraint that domestic and foreign variables (except for trade balances) enter the structural models in differential form implicitly assumes that the parameters of the corresponding domestic and foreign variables are equal in absolute size, in the case of linear regression. While this parsimony assumption is conventional in empirical applications, it is a potential source of misspecification.

III. ARTIFICIAL NEURAL NETWORKS

In the introduction we pointed out that one purpose of this paper is to assess the employability of neural networks and their success in predicting exchange rates. In order to provide the necessary background knowledge we discuss the methodological aspects of neural networks and indicate some difficulties that may arise when applying neural networks to data modelling problems. An intuitive introduction to the use of artificial neural networks in economics can be found in Church and Curram (1996).

Cognitive scientists have recently developed a class of nonlinear models inspired by the neural architecture of the human brain ('neural network models'). These models are capable of learning through interaction with their environment by a process that can be viewed as a recursive statistical estimation procedure. The neural networks field has


Fig. 2. Graphical representation of a neuron

Fig. 1. A generic feedforward neural network with a single hidden layer; to prevent the graph from getting overcrowded, the bias neuron has been removed from the input layer

developed rapidly since the influential work of Rumelhart and McClelland (1986). Neural network models (also known as multi-layer perceptrons) are now widely used for regression and classification purposes.

Representation

A neural network model is a particular type of input-output model. Given an input vector $x_t = (x_1, \ldots, x_l)'_t$ the network produces an output vector $\hat{y}_t = (\hat{y}_1, \ldots, \hat{y}_q)'_t$, where $l$ indicates the number of input units, $q$ the number of output units and $t$ is the sample observation ($t = 1, \ldots, n$). In statistics it is common practice to denote estimated variables, which neural network outputs in fact are, by a hat. We conform to this notation.

A widely studied network is the feed-forward neural network, which is depicted in Fig. 1. In graphical form, feed-forward neural networks consist of directed graphs without cycles. Each node represents a 'unit', also called an artificial neuron, which is the building block of the artificial neural network. The functionality of each unit is as follows. Each non-input unit $j$ sums its incoming signals and adds a constant term to form the total incoming signal and applies a function $\phi$ to this total incoming signal to construct the output of the unit. The links have weights $w_{ij}$ which multiply the signal travelling through them by that factor. Figure 2 shows the functionality of an artificial neuron. The function $\phi$, which is called the squashing function, is usually taken to be logistic (with $\phi(z) = \exp(z)/(1 + \exp(z))$). The input units only distribute the input vector, so their $\phi$ is the identity function.
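The functionality of a single non-input unit, as described above, can be sketched as follows. The names `logistic` and `neuron_output` are hypothetical, and the two-branch form of the logistic function is merely a numerically safer way of writing $\exp(z)/(1 + \exp(z))$.

```python
import math


def logistic(z):
    """Logistic squashing function phi(z) = exp(z) / (1 + exp(z))."""
    if z >= 0:
        # Equivalent form that avoids overflow for large positive z.
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neuron_output(inputs, weights, bias):
    """Sum the weighted incoming signals, add the constant term, then squash."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return logistic(total)
```

For instance, `neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)` has a total incoming signal of zero, so the unit outputs 0.5, the midpoint of the logistic function.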

In mathematical notation a single-output feedforward neural network, as depicted in Fig. 1, is expressed by

$$\hat{y}_t = \phi_O\left( \sum_j w_j \, \phi_H\left( \sum_{i \to j} w_{ij} x_{i,t} \right) \right) \qquad (t = 1, \ldots, n) \tag{3}$$

where $\hat{y}_t$ denotes the value of the output unit when input $x_t$ (the $t$th input pattern) is fed into the network, and $\sum_{i \to j}$ stands for the sum over neurons $i$ connected to $j$. Usually, identical squashing functions are used within the same layer; $\phi_O$ denotes the squashing function of the output units, and $\phi_H$ denotes the squashing function of the hidden units. The indexing presumes that neurons are numbered sequentially in the order: input units, hidden units, and output unit(s).

It may be preferable to include direct 'skip-layer' connections from the input layer to the output layer to explicitly incorporate the basic linear model, which leads to

$$\hat{y}_t = \phi_O\left( \sum_i w_i x_{i,t} + \sum_j w_j \, \phi_H\left( \sum_{i \to j} w_{ij} x_{i,t} \right) \right) \tag{4}$$

The expression $f(x, w)$ is a convenient short-hand notation for network output since this depends only on inputs and weights, given a fixed network architecture. The symbol $x$ represents the input vector, and the symbol $w$ represents a vector of all the weights.

In our applications the dimension of the output vector $\hat{y}$ is one, the squashing function of the output unit $\phi_O$ is assumed to be linear and the squashing function of the hidden units $\phi_H$ logistic. Let $\phi$ (without subscript) denote the squashing function of the hidden units, so that Equation 4 can be rewritten as follows:

$$\hat{y}_t = \alpha_0 + \alpha' x_t + \sum_{i=1}^{N_h} \beta_i \, \phi(w_i' x_t), \tag{5}$$

where $N_h$ denotes the number of hidden units. Note that we have used different parameter names for different parts of the neural network; without comment, $w$ refers to the whole set of parameters (weights) of the neural network.
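A minimal sketch of the forward pass in Equation 5 follows. The function and argument names are hypothetical. Equation 5 shows no separate bias inside the hidden units, so the sketch assumes that any hidden-unit constant term is supplied by appending a constant 1 to the input vector `x`; that convention is an assumption of this example.

```python
import math


def logistic(z):
    """Logistic squashing function of the hidden units."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def network_output(x, alpha0, alpha, beta, hidden_weights):
    """Equation 5: linear (skip-layer) part plus a weighted sum of hidden units.

    x              -- input vector (list of floats)
    alpha0, alpha  -- intercept and skip-layer weights
    beta           -- hidden-to-output weights, one per hidden unit
    hidden_weights -- one weight vector w_i per hidden unit
    """
    linear_part = alpha0 + sum(a * xi for a, xi in zip(alpha, x))
    hidden_part = sum(
        b * logistic(sum(w * xi for w, xi in zip(w_i, x)))
        for b, w_i in zip(beta, hidden_weights)
    )
    return linear_part + hidden_part
```

Setting every element of `beta` to zero reduces the network to the basic linear model, which is the point of the skip-layer connections.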

Equations 3 and 4 represent quite general classes of functions. A number of authors, e.g. Cybenko (1989), have shown that single hidden layer feed-forward neural networks with a particular type of nonlinear squashing function (e.g. sigmoid) in the hidden units and linear output units can approximate any continuous function $f$ uniformly on compact sets, by increasing the size of the hidden layer. This justifies the use of neural networks for function approximation and pattern recognition.

Learning

Given a particular neural network architecture, the role of learning is to find suitable values for the network weights $w$ to approximate an underlying regression function $g$ of $x$ by $f(x, w)$. Traditionally, the weight vector $w$ is


3 This follows directly from the updating of the weights in the direction of the negative gradient of E*.

estimated by minimizing the sum of observed squared errors (Rumelhart et al., 1986). Suppose we have $n$ observations $\{(x_t, y_t)\}_{t=1}^{n}$, where $y_t$ represents the 'target' value that the neural network should generate when the $t$th input vector $x_t$ is fed in. Then the parameter (weight) vector $w$ is chosen to minimize

$$E(w) = \sum_{t=1}^{n} \left[ y_t - f(x_t; w) \right]^2 \tag{6}$$

as in conventional nonlinear regression. This is a general minimization problem, so in principle we can use general purpose algorithms from unconstrained optimization (Scales, 1985; Nash, 1990).

The neural network community, however, has developed its own learning algorithm known as error back-propagation. It permits weights to be learned from experience in a process resembling trial and error (Rumelhart et al., 1986). Experience is based on the empirical observations on the phenomenon of interest. From now on we assume that the error function $E(w)$ is a differentiable function, which implies that the squashing functions have to be differentiable, thereby excluding threshold units. According to back-propagation, we start with a set of random weights $w_0$ and then update them by the formula

$$w_p = w_{p-1} + \eta \sum_{t=1}^{n} \nabla f(x_t, w_{p-1}) \left( y_t - f(x_t, w_{p-1}) \right), \qquad p = 1, 2, \ldots \tag{7}$$

where $\eta$ is a learning rate and $\nabla f$ is the gradient (the vector containing the partial derivatives) of $f$ with respect to the weights $w$ (Rumelhart et al., 1986). This is simply the application of the steepest descent technique to the minimization of Equation 6. The distinguishing feature of back-propagation is that the calculation of $\nabla f(x, w)\,(y - f(x, w))$ is carried out by a sequence of local computations on the network itself. The specific structure of the neural network is utilized in this calculation. Input data are fed into the network and are forwarded through the connections to the output unit(s); the error between the network output and the desired output is calculated and propagated backwards through the connections. During this back-propagation phase the weights are updated. For a good description of feedforward neural networks and their specific learning algorithms we refer to Hertz et al. (1991).
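The update rule in Equation 7 can be sketched for the network of Equation 5 as follows. This is an illustrative full-batch implementation under stated assumptions, not the authors' code: the weights are stored in a dictionary with hypothetical keys `a0`, `a`, `b` and `W`, and the gradient is written out analytically rather than through a general back-propagation routine.

```python
import math


def logistic(z):
    """Logistic squashing function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(w, x):
    """Return the network output (Equation 5) and the hidden unit outputs."""
    hidden = [logistic(sum(wi * xi for wi, xi in zip(row, x))) for row in w["W"]]
    y_hat = (w["a0"]
             + sum(a * xi for a, xi in zip(w["a"], x))
             + sum(b * h for b, h in zip(w["b"], hidden)))
    return y_hat, hidden


def sse(w, data):
    """Sum of squared errors E(w) of Equation 6."""
    return sum((y - forward(w, x)[0]) ** 2 for x, y in data)


def gradient_step(w, data, eta):
    """One update of Equation 7: w + eta * sum_t grad f(x_t, w) * (y_t - f(x_t, w))."""
    g_a0 = 0.0
    g_a = [0.0] * len(w["a"])
    g_b = [0.0] * len(w["b"])
    g_W = [[0.0] * len(row) for row in w["W"]]
    for x, y in data:
        y_hat, hidden = forward(w, x)
        err = y - y_hat
        g_a0 += err                       # df/d(alpha_0) = 1
        for i, xi in enumerate(x):
            g_a[i] += err * xi            # df/d(alpha_i) = x_i
        for j, h in enumerate(hidden):
            g_b[j] += err * h             # df/d(beta_j) = phi(w_j' x)
            slope = w["b"][j] * h * (1.0 - h)   # beta_j * phi'(w_j' x)
            for i, xi in enumerate(x):
                g_W[j][i] += err * slope * xi
    return {
        "a0": w["a0"] + eta * g_a0,
        "a": [a + eta * g for a, g in zip(w["a"], g_a)],
        "b": [b + eta * g for b, g in zip(w["b"], g_b)],
        "W": [[wij + eta * g for wij, g in zip(row, g_row)]
              for row, g_row in zip(w["W"], g_W)],
    }
```

Repeatedly calling `gradient_step` with a sufficiently small `eta` should lower `sse`; too large a learning rate can make the error increase instead.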

Back-propagation, like any other nonlinear optimization algorithm, suffers from becoming trapped in local minima. Therefore, in practice, multiple restarts are performed in the hope that a global minimum can be found. The actual number of restarts employed in practice is generally limited by the computing time required to train a neural network. In our applications we use five restarts.
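The restart strategy can be sketched independently of the training routine. In this hedged example `train` is a hypothetical callable supplied by the user: it receives a random number generator, draws its own starting weights, trains, and returns the fitted weights together with the final error.

```python
import random


def best_of_restarts(train, n_restarts=5, seed=0):
    """Run `train` from several random starting points and keep the best run.

    train(rng) must return a (weights, error) pair.
    """
    master = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        # Give each restart its own reproducible generator.
        rng = random.Random(master.randrange(2 ** 32))
        weights, error = train(rng)
        if best is None or error < best[1]:
            best = (weights, error)
    return best
```

The default of five restarts mirrors the number used in the article; the lowest-error run is still only a local minimum in general.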

Generalization

In practice, the approximation feature of neural networks very often results in estimators with low bias and high variance. Hence, each relationship, characterized by a finite set of observations, can be approximated arbitrarily close by a neural network if enough hidden units are used. Although the trade-off between bias and variance is well known in statistics, its relevance for neural network regression has been explicitly pointed out only recently (Geman et al., 1992). Two concepts from the neural network community which are directly related to the bias-variance dilemma are: 'generalization' and 'overfitting'. A network has good generalization qualities if it performs well on future ('unseen') examples. A network is said to overfit the data if too many characteristics are drawn from the data set at hand. A network that overfits the data sample at hand generally has low bias but high variance, and consequently displays bad generalization behaviour. Obviously, the interest is in neural network models that generalize well.

Several approaches have been taken to improve generalization. A well-known approach, known as weight decay, is to add a penalty term to the original error function (Equation 6)

$$E^*(w) = E(w) + \lambda \sum_{ij} w_{ij}^2 \tag{8}$$

and estimate the parameters $w$ by minimizing this compound error function. Now, each connection $w_{ij}$ has a tendency to decay to zero,3 so that connections disappear unless reinforced. The rationale behind weight decay is the same as that of all shrinkage methods in statistics such as, for example, ridge regression (Ripley, 1993; Frank and Friedman, 1993). Hence, the weight decay parameter $\lambda$ can be used for the regularization of bias and variance. A higher $\lambda$ results in smaller weights, which in turn leads to smoother approximations, and smoother approximators are assumed to display better generalization.
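The decay mechanism can be made explicit with a short derivation. It is a sketch that assumes the steepest descent update of Equation 7 is applied to $E^*$ instead of $E$, with the factor $\tfrac{1}{2}$ from differentiating the squares absorbed into the learning rate $\eta$ as in Equation 7.

```latex
% Gradient of the compound error function (Equation 8):
\nabla E^{*}(w) = \nabla E(w) + 2 \lambda w
                = -2 \sum_{t=1}^{n} \nabla f(x_t, w) \bigl( y_t - f(x_t, w) \bigr) + 2 \lambda w

% Steepest descent step w_p = w_{p-1} - (\eta / 2) \nabla E^{*}(w_{p-1}):
w_p = (1 - \eta \lambda) \, w_{p-1}
      + \eta \sum_{t=1}^{n} \nabla f(x_t, w_{p-1}) \bigl( y_t - f(x_t, w_{p-1}) \bigr)
```

With $0 < \eta\lambda < 1$, every weight is multiplied by a factor smaller than one at each step, so a weight that receives no reinforcement from the data term shrinks geometrically towards zero.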

Another factor that influences both the bias and the variance of the neural network estimator is the size of the hidden layer (Geman et al., 1992). This follows directly from the approximation theorem for neural networks. Due to the flexibility of neural networks, it is not sensible to make within-sample comparisons between the modelling errors of linear models and neural network models. It would be very easy to ensure that neural networks outperform linear alternatives, simply by inserting many units into the hidden layer. Therefore, in the empirical experiments of this paper


4 We will use $D_{-i}$ to denote data set $D$ without the observations from $D_i$.

5 The data are described in Verkooijen et al. (1995).

we focus on out-of-sample instead of within-sample prediction performance.

Cross-validation

The decay parameter $\lambda$ in Equation 8 and the number of hidden units, $N_h$, are typical 'smoothing' parameters that have to be 'optimized' in some way. Cross-validation (Stone, 1974; Hastie and Tibshirani, 1990; Frank and Friedman, 1993) is a practical approach that is often used for this purpose. It works by leaving points $(x_i, y_i)$ out one at a time, estimating the weights $w$ on the remaining $n - 1$ points, and calculating the prediction error on the point left out. Finally, the average out-of-sample prediction error is calculated. This is an attempt to mimic the use of training and test samples for prediction. Since neural network training requires a relatively large amount of computation time, in practice not a single point but a subset of points is left out at a time, where the number of subsets, $k$, is typically chosen to be 5 or 10.

For neural networks, calculation of the cross-validation error proceeds as follows. The set of observations, denoted by $D$, is randomly permuted and decomposed into $k$ mutually exclusive subsets $D_i$ of roughly equal size, where $i = 1, \ldots, k$. When working with time series data we do not permute the data, but keep the original time ordering. Initially, one repeatedly trains a neural network with different random starting weight vectors on the complete data set until a 'good' (local) minimum $w^*$ has been obtained. Then the observations from subset $D_i$ are left out from the complete data set,4 the weights are re-trained (starting from $w^*$), the observations from $D_i$ are predicted, and the average prediction squared error on $D_i$ is calculated. The foregoing is repeated until each subset $D_i$ has been left out from the complete data set. Finally, the average prediction error is calculated from the $k$ repetitions. The cross-validation error, $CV$, is calculated as follows:

$$CV(\lambda, N_h) = \frac{1}{k} \sum_{i=1}^{k} \left\{ \frac{1}{|D_i|} \sum_{t \in D_i} \left( f_{\lambda, N_h}\bigl(x_t, w(D_{-i}, w_0 = w^*)\bigr) - y_t \right)^2 \right\} \tag{9}$$

where we explicitly added the smoothing parameters $\lambda$ and $N_h$ to the network output function notation, and stressed the dependence of the (re-trained) weights $w$ on the training set $D_{-i}$ and the starting weights $w_0$.

Notice that $k$ networks have to be constructed to calculate the cross-validation error for one particular combination of the 'smoothing' parameters, which can still be computationally demanding.

This cross-validation procedure is used to compare models out-of-sample, and to select the 'optimal' set of smoothing parameters.
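The procedure behind Equation 9 can be sketched generically. In this illustration the model is passed in through two hypothetical callables: `fit(train_data, start)` returns weights (with `start=None` meaning a fresh fit on the complete data set, and otherwise a warm start from $w^*$), and `predict(weights, x)` returns a prediction. The folds are contiguous blocks, so the time ordering is preserved as described for time series data.

```python
def contiguous_folds(n, k):
    """Split indices 0..n-1 into k contiguous blocks of roughly equal size."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    bounds = [i * n // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def cross_validation_error(data, k, fit, predict):
    """Equation 9: average over folds of the mean squared error on the held-out fold.

    data is a list of (x, y) pairs in time order.
    """
    w_star = fit(data, None)                 # 'good' minimum on the complete data
    total = 0.0
    for held_out in contiguous_folds(len(data), k):
        held = set(held_out)
        train = [obs for t, obs in enumerate(data) if t not in held]
        w = fit(train, w_star)               # re-train, starting from w*
        fold_mse = sum((predict(w, data[t][0]) - data[t][1]) ** 2
                       for t in held_out) / len(held_out)
        total += fold_mse
    return total / k
```

To select smoothing parameters one would evaluate this function for each candidate combination of $\lambda$ and $N_h$ and keep the combination with the lowest value, at the cost of $k$ re-trainings per combination.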

IV. DATA AND PRELIMINARY DIAGNOSTICS

In econometric modelling it is important to distinguish between stationary and nonstationary time series data. The worst consequence of modelling with nonstationary time series data is that standard statistical tests provide evidence for a supposed relationship between economic fundamentals, whereas in fact the relationship is purely spurious. Tests for cointegration in linear models have been developed to guard against making these erroneous conclusions (see, for example, Banerjee et al., 1993).

The first step in a modelling exercise therefore incorporates the dynamic characterization of the data. Unit root tests are normally used for this purpose. When the various time series contain a unit root, the next step is to investigate whether these nonstationary time series drift together (are cointegrated) or drift apart (are not cointegrated).

Data sources

Our sample period runs from January 1974 to July 1994. Most monthly data are taken from the OECD series (using Datastream), including bilateral exchange rates, industrial production indices, consumer price indices, short- and long-term interest rates, money supplies (M1), and foreign trade balances. The items not available in the OECD series were taken from the National Accounts.5

To enable ANN training with weight decay (as explainedin the previous section), we have rescaled the data corres-ponding to each explanatory variable in such a way that atleast 95% of the data lies within the [0, 1] range and theaverage equals 0.5. This rescaling makes the signal transfer-red by each input unit comparable to the outputs of internalunits, which is required for weight decay to have e� ect.
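One way to obtain such a rescaling is sketched below in Python. The text does not state the exact transformation used, so the affine map here is an assumption: the series is centred on its mean and divided by twice the smallest radius that covers at least 95% of the absolute deviations. The function name `rescale_for_weight_decay` is hypothetical.

```python
import numpy as np

def rescale_for_weight_decay(x, coverage=0.95):
    """Affine map giving mean 0.5 and at least `coverage` of the points inside [0, 1]."""
    x = np.asarray(x, dtype=float)
    dev = np.abs(x - x.mean())
    # Smallest radius r such that at least `coverage` of the deviations are <= r.
    idx = int(np.ceil(coverage * len(x))) - 1
    r = np.sort(dev)[idx]
    return 0.5 + (x - x.mean()) / (2.0 * r)

# Simulated series; 247 is the number of months from January 1974 to July 1994.
rng = np.random.default_rng(1)
raw = rng.normal(loc=3.0, scale=2.0, size=247)
scaled = rescale_for_weight_decay(raw)
share_inside = np.mean((scaled >= 0.0) & (scaled <= 1.0))
```

Because the map is affine, it leaves the shape of each series unchanged and only puts the inputs on a scale comparable to the outputs of the hidden units.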

Unit roots

In Banerjee et al. (1993), the characterization of time series by the order of integration is discussed. We tested all series for possible nonstationarity by simple Augmented Dickey-Fuller (ADF) tests. Table 1 reports the characterizations suggested by these tests for the logarithmic variables in differential form (i.e. in deviation from US equivalents).

Most variables are integrated of order one (I(1)), involving a (linear) trend, although the industrial production index differential appears to be (trend) stationary in three out of four cases. Trend stationarity is denoted by I(0) + c + t in Table 1. It should be noted that discriminating between a trend stationary series and a random walk with drift is difficult on the basis of a limited sample.

Table 1. Conclusions based on unit-root tests

| Variable | Japan | UK | Germany | Netherlands |
|---|---|---|---|---|
| Exchange rate | I(1) | I(1) | I(1) | I(1) |
| Nominal interest rate | I(1) | I(1) | I(1) | I(1) |
| Money supply | I(1) | I(1) | I(1) | I(0) + c + t |
| Industrial production | I(1) | I(0) + c + t | I(0) | I(0) |
| Inflation | I(1) | I(0) | I(1) | I(1) |
| Cumulated trade balances | I(1) | I(1) | I(1) | I(1) |

Table 2. Linear cointegration tests

| Model | Japan | UK | Germany | Netherlands | Critical values ($\alpha = 0.1$) |
|---|---|---|---|---|---|
| FPMM | 2.09 | 2.64 | 2.52 | 3.05 | 4.20 |
| SPMM | 2.32 | 3.57 | 2.38 | 4.29 | 4.50 |
| PBM | 1.79 | 3.27 | 1.52 | 2.61 | 4.77 |

Table 3. ANN ADF tests

| Model | Japan | UK | Germany | Netherlands |
|---|---|---|---|---|
| FPMM | 4.51 | 5.16 | 3.05 | 2.70 |
| SPMM | 3.54 | 5.23 | 4.88 | 4.73 |
| PBM | 4.04 | 4.90 | 3.78 | 5.03 |

The critical values ($\alpha$ = 0.01, 0.05, 0.10) are: FPMM: 6.02, 5.46, 5.19; SPMM: 6.27, 5.73, 5.39; PBM: 6.67, 6.07, 5.72.

Cointegration

Since nearly all differential variables are found to be I(1), we have to investigate whether some linear combination of these nonstationary variables is stationary. The variables are said to be pairwise linearly cointegrated if this is the case. The residuals of this static, long-run solution are then tested for stationarity by ADF tests in Table 2, using the critical values calculated by MacKinnon (1991).

Table 2 shows that in all cases the null hypothesis of no linear cointegration cannot be rejected. Hence, the collected data do not seem to confirm the three theoretical models of structural exchange rate determination in linearized form. This conclusion is not altered when the linearized models are estimated unrestrictedly, incorporating foreign and domestic variables separately.

A further step in the cointegration analysis is the test for the presence of nonlinear cointegration. We will utilize a new consistent test for the presence of univariate nonlinear cointegration using ANNs (see the Appendix). One main drawback of this ANN test is the huge computational effort required when simulating the critical values, which depend on several ANN parameters. We have to make some concessions regarding optimality and efficiency of the test. To reduce the computational burden, we adopt the same ANN parameters for each exchange rate model and each country. In this way we have to determine three critical values. The ANN consists of three hidden units and the weight decay parameter is taken to be 0.001. The residuals of the ANN versions of the FPMM, SPMM, and PBM are tested for a unit root using the ANN ADF test. How the required critical values are actually generated is found in the Appendix. Corresponding to the ADF tests for linear cointegration, the number of lags in the ANN ADF test is determined as the highest lag that is significant at a 5% level such that no autocorrelation in the resulting residuals is left. The estimated test statistics are shown in Table 3.

The tests for univariate nonlinear cointegration do not reject the null hypothesis of no cointegration at reasonable significance levels. Functional form misspecification does not seem to be an important explanation for the weak evidence for a long-run (nonlinear) relationship between the economic fundamentals and the four dollar exchange rates investigated. Because no evidence of (nonlinear) cointegration could be detected, in the sequel we focus only on short-term behaviour of the exchange rate models. Hence the long-run equilibrium part of the error correction models to be used in the next section will be set to zero.
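The reading of Tables 2 and 3 can be checked mechanically with the short Python sketch below. It assumes that the tabulated statistics and critical values are reported as magnitudes, so that the null hypothesis of no cointegration would be rejected only when a statistic exceeds its critical value; the helper name `rejections` is hypothetical.

```python
COUNTRIES = ["Japan", "UK", "Germany", "Netherlands"]

# Statistics copied from Table 2 (linear) and Table 3 (ANN), in the order of COUNTRIES.
table2 = {"FPMM": [2.09, 2.64, 2.52, 3.05],
          "SPMM": [2.32, 3.57, 2.38, 4.29],
          "PBM":  [1.79, 3.27, 1.52, 2.61]}
crit2_10pct = {"FPMM": 4.20, "SPMM": 4.50, "PBM": 4.77}

table3 = {"FPMM": [4.51, 5.16, 3.05, 2.70],
          "SPMM": [3.54, 5.23, 4.88, 4.73],
          "PBM":  [4.04, 4.90, 3.78, 5.03]}
crit3_10pct = {"FPMM": 5.19, "SPMM": 5.39, "PBM": 5.72}

def rejections(stats, crit):
    """Return (model, country) pairs whose statistic exceeds the critical value.

    Assumption: values are magnitudes, so a larger value means stronger evidence
    against the null hypothesis of no cointegration.
    """
    return [(model, country)
            for model, row in stats.items()
            for country, value in zip(COUNTRIES, row)
            if value > crit[model]]

linear_rejections = rejections(table2, crit2_10pct)
ann_rejections = rejections(table3, crit3_10pct)
print(linear_rejections, ann_rejections)   # [] []
```

Even at the 10% level no model and country pair rejects; the closest case is the ANN version of the FPMM for the UK (5.16 against a critical value of 5.19).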

V. PREDICTIVE PERFORMANCE ASSESSMENT

In this section we investigate the predictive power of the various structural exchange rate models. Our main objective is to examine whether a nonlinear specification of the supposed relationship between the relevant economic fundamentals and a specific nominal dollar exchange rate improves upon the predictive performance of the benchmark random walk model, and upon linear specifications. Since we could not discover any significant nonlinear or linear (univariate) cointegration in the previous section, the long-run prediction performance of linear as well as nonlinear structural exchange rate models will not be better than that of the corresponding naive random walk models. These results are in line with the findings on long-run predictive performance of nominal exchange rates in other studies (Meese and Rogoff, 1983; Meese and Rose, 1991; Diebold and Nason, 1990). Hence, (principally) short-run predictive performance will be investigated in this section.

6 Linear OLS estimates of these models are:

$$\begin{aligned}
\text{DM/\$:}\quad & \Delta s_t = 0.34\,\Delta s_{t-1} + 0.29\,\Delta m_{t-2} - 0.0086\,\Delta r_t - 0.19\,\Delta ip_{t-1} \\
\text{JY/\$:}\quad & \Delta s_t = 0.36\,\Delta s_{t-1} + 0.10\,\Delta s_{t-3} + 0.17\,\Delta m_{t-2} - 0.0075\,\Delta r_t \\
\text{DFL/\$:}\quad & \Delta s_t = 0.34\,\Delta s_{t-1} + 0.13\,\Delta m_{t-1} + 0.11\,\Delta m_{t-3} - 0.0064\,\Delta r_t \\
\text{BP/\$:}\quad & \Delta s_t = 0.47\,\Delta s_{t-1} - 0.16\,\Delta s_{t-2} + 0.20\,\Delta m_{t-2} + 3.07 \times 10^{-6}\,\Delta TB^*_t - 7.91 \times 10^{-6}\,\Delta TB_t - 1.18 \times 10^{-5}\,\Delta TB_{t-2}
\end{aligned}$$

where all parameters mentioned are significant at the 1% significance level.

Methodology for out-of-sample model comparison

In line with the Meese and Rogoff (1983) study we compare the models using recursive estimation. This means that we start with an initial estimation of the model, using (say) the first $n_0$ observations. Then we make predictions for the remaining part of our sample ($n - n_0$ observations). Next, we add the following observation to the parameter estimation set, which now consists of $(n_0 + 1)$ observations, and again predict the values of the response variables in the remaining set of observations. This procedure is repeated until the training set equals the total sample. In this way we have constructed a set of $(n - n_0)$ 1-step-ahead predictions, $(n - n_0 - 1)$ 2-steps-ahead predictions, or generally $n_k = (n - n_0 - k + 1)$ $k$-steps-ahead predictions ($k < n - n_0$).

It must be noted that the structural exchange rate models require forecasts of their predictor variables in order to generate predictions of the nominal exchange rates. In line with what is usually done in this case, we use the realized values of predictor variables. Consequently, the results are optimistically biased.

We take RMSE as our principal criterion for comparison:

$$\mathrm{RMSE}(k) = \left\{ \sum_{p=n_0}^{N-k} \left[ \hat{y}_{p+k} - y_{p+k} \right]^2 / (N - k) \right\}^{1/2} \qquad (10)$$

where $k$ denotes the prediction horizon (in months), $y_{p+k}$ the observed value of the response variable at time $(p + k)$, and $\hat{y}_{p+k}$ the estimated response value from a model with parameters estimated from the data set $\{(x_t, y_t)\}_{t=1}^{p}$.
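The recursive scheme and the RMSE criterion can be illustrated for the one-step-ahead case with the following Python sketch. It is a minimal example under stated assumptions rather than the estimation code of this study: the model is a linear regression estimated by OLS, the data are simulated, the squared errors are averaged over the number of forecasts actually made, and the function name `recursive_one_step_rmse` is hypothetical.

```python
import numpy as np

def recursive_one_step_rmse(X, y, n0):
    """Expanding-window OLS: fit on the first p observations, then predict observation p + 1."""
    sq_errors = []
    for p in range(n0, len(y)):
        beta, *_ = np.linalg.lstsq(X[:p], y[:p], rcond=None)
        # Realized values of the regressors are used for the forecast, as in the text.
        sq_errors.append((X[p] @ beta - y[p]) ** 2)
    return float(np.sqrt(np.mean(sq_errors)))

# Simulated monthly exchange rate changes following an AR(1) process (illustration only).
rng = np.random.default_rng(2)
n = 242
ds = np.zeros(n)
for t in range(1, n):
    ds[t] = 0.4 * ds[t - 1] + 0.03 * rng.normal()

y = ds[1:]                       # change in month t
X = ds[:-1].reshape(-1, 1)       # single regressor: change in month t - 1

rmse_ar1 = recursive_one_step_rmse(X, y, n0=180)
rmse_rw = float(np.sqrt(np.mean(y[180:] ** 2)))   # random walk: predicted change is zero
```

With a short forecast sample the ranking of `rmse_ar1` and `rmse_rw` can go either way in a single simulated draw, which is one reason why small RMSE differences should be interpreted with care.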

Short-run predictions of dollar exchange rates

In this section we examine whether powerful short-run predictions can be made with the four structural exchange rate models using I(0) variables in first differences.

Applying standard econometric inference to dynamic specifications of the structural exchange rate models with differenced variables revealed a strong significance of the nominal exchange rate changes in the previous period for all four currencies. This is evidence against the most simple random walk forecast of 'no change' in the level of an individual exchange rate. Do changes in the economic fundamentals alter the short-run prediction performance of the nominal dollar exchange rates? Econometric analysis also provides some evidence of economic fundamentals having a significant influence on the nominal exchange rate changes.

Table 4 presents the RMSEs of one-period-ahead predictions made by parsimonious models,6 the complete exchange rate models with two lags for each variable, and a univariate AR(6) time series model, all estimated by OLS and feedforward ANN. Additionally, Table 4 shows results for the random walk model $\Delta s_t = \varepsilon_t$ with $\varepsilon_t \overset{\text{i.i.d.}}{\sim} N(0, \sigma_\varepsilon^2)$.

In the recursive estimation procedure the initial models were estimated on the first 180 observations. The ANN versions of the parsimonious (complete) models have been estimated with two hidden units, weight decay parameters 0.1 and 5, respectively, and five restarts. These ANN parameters have been determined by cross-validation and indicate that if nonlinearities are present the effects are tenuous. Hence, the small number of hidden units and the relatively large value of the weight decay parameter suggested by cross-validation are attempts to reduce overfitting, rather than to explore real nonlinearities. From Table 4 it is seen that somewhat better short-run prediction is possible: the RMSEs of the one-period-ahead predictions were in most cases smaller than the RMSEs of the random walk models.

To investigate whether the economic fundamentals have an effect on the short-run prediction of the models, a univariate time series AR(6) model is fitted to the data as well. The one-month-ahead ANN predictions, with inputs consisting of six time lags of the particular exchange rate, were not better than those from the linear univariate model. The prediction results of the univariate model compare favourably to those of the parsimonious structural model.

Table 4. Short-run predictions

| Country | Parsimonious (OLS/ANN) | Complete (OLS/ANN) | Univariate (OLS/ANN) | Random walk |
|---|---|---|---|---|
| Japan | 0.221/0.221 | 0.230/0.214 | 0.217/0.216 | 0.223 |
| UK | 0.289/0.291 | 0.305/0.303 | 0.290/0.290 | 0.323 |
| Germany | 0.247/0.224 | 0.269/0.254 | 0.243/0.273 | 0.264 |
| Netherlands | 0.250/0.250 | 0.266/0.259 | 0.248/0.248 | 0.269 |

Displayed values are: RMSE(OLS)/RMSE(ANN)

This suggests that the economic fundamentals are of no significant help in predicting the exchange rates' movements.

Are all terms necessary in the AR(6) model? To examine this, we repeated the same prediction procedure with an AR(1) model instead of the AR(6) model. The RMSEs obtained from the AR(1) models for the exchange rate movements are similar to the results obtained from the AR(6) models; in three out of four cases even slightly better. In conclusion, the only regularity that has been found in the data is that the next exchange rate movement has the same sign as the previous movement, but is damped by a factor of approximately 0.4.
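To give a sense of how much predictive gain a damping factor of about 0.4 can deliver, the Python sketch below works through the arithmetic. It assumes, purely for illustration, that exchange rate changes follow an exact AR(1) process $\Delta s_t = \phi \Delta s_{t-1} + \varepsilon_t$; then $\mathrm{Var}(\Delta s_t) = \sigma_\varepsilon^2 / (1 - \phi^2)$, the no-change forecast errs by $\Delta s_t$ itself, and the AR(1) forecast errs by $\varepsilon_t$, so the RMSE ratio is $\sqrt{1 - \phi^2}$. The name `ar1_forecast` is hypothetical.

```python
import math

phi = 0.4   # approximate damping factor reported in the text

# RMSE(AR(1) forecast) / RMSE(no-change forecast) under an exact AR(1) process.
rmse_ratio = math.sqrt(1.0 - phi ** 2)

def ar1_forecast(previous_change, phi=phi):
    """Predicted next change: same sign as the previous change, shrunk by the factor phi."""
    return phi * previous_change

# Univariate OLS and random walk RMSEs copied from Table 4.
table4 = {"Japan": (0.217, 0.223), "UK": (0.290, 0.323),
          "Germany": (0.243, 0.264), "Netherlands": (0.248, 0.269)}
observed_ratio = {country: ols / rw for country, (ols, rw) in table4.items()}

print(round(rmse_ratio, 4))   # 0.9165
```

The implied ratio of roughly 0.92 is of the same order as the univariate OLS to random walk ratios in Table 4, which range from about 0.90 (UK) to about 0.97 (Japan), so the improvement over the random walk is real but modest.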

The rolling prediction experiment has revealed that over the last 62 months of observations some structure is present in the exchange rate movements. The only factor that seemed important is the change in the previous month's exchange rate.

To assess the possible impact of the economic fundamentals that were selected in the parsimonious models in the total period 1974-1994, we performed an additional 10-fold cross-validation test on the linear models. This test has been performed as follows. Two years of monthly observations are repeatedly left out from model estimation, and are then predicted from the model constructed on the remaining observations. The so-obtained 'out-of-sample predictions' are then compared with the actual values. We may conclude from this experiment that over the whole period 1974-1994 some economic fundamentals helped a bit in explaining part of the variance in the exchange rate changes. Over the last 5 years, however, their effect on prediction performance was small at best.

VI. CONCLUSIONS

We have applied neural network specification and linear specification to three structural exchange rate models (the flexible price and sticky price monetary models and the portfolio balance model), and have compared the out-of-sample prediction qualities. From this study we conclude the following.

First, no evidence has been found that confirms the existence of a long-run relationship (linear or non-linear) between the economic fundamentals included in the flexible price, sticky-price, and portfolio balance models, and four nominal dollar exchange rates. Including foreign and domestic variables as separate regressors did not alter this. Consequently, long-run predictions were worse than predictions made from the naive random walk model.

Second, when estimated in first differenced form we have found some evidence of a weak structure underlying monthly exchange rate changes. The two main determinants are the previous month's exchange rate changes and the change in the interest rate differential between two countries. The random walk model is outperformed by linear models for all four dollar currencies considered. Exploring the possible existence of nonlinearities in the univariate short-run models by neural networks did not show much significant evidence. Including foreign and domestic variables as separate explanatory variables did not alter this finding.

Third, in general, biased estimation improves the prediction quality of the various exchange rate models, especially for the long run. The experiments with neural networks revealed the large effect of the weight decay parameter on the prediction quality of the neural network models. In those cases where the neural network showed better prediction performance than the corresponding linear model estimated by OLS, it seems that regularization by weight decay rather than the introduction of nonlinearities was responsible for this. Hence, biased estimation could also improve the prediction quality of the linear models considerably, especially when modelling with (nearly) collinear independent variables or with a high number of independent variables and a relatively small set of observations.
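The point about biased estimation of linear models can be illustrated with ridge regression, the linear counterpart of weight decay. The Python sketch below is an illustration on simulated, nearly collinear regressors and is not part of the experiments reported here; the function name `ridge` and the penalty value are assumptions.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate; lam = 0 reproduces OLS when X'X is invertible."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated data with two nearly collinear regressors (illustration only).
rng = np.random.default_rng(3)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

w_ols = ridge(X, y, lam=0.0)     # unbiased, but unstable under collinearity
w_ridge = ridge(X, y, lam=1.0)   # biased towards zero, with lower variance
```

The penalty shrinks the coefficient vector, trading a small bias for a potentially large reduction in variance; in practice the penalty would be chosen by cross-validation, as was done for the weight decay parameter.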

Exchange rate determination has always been a difficult problem (Meese and Rogoff, 1983; Meese and Rose, 1991; MacDonald and Taylor, 1992) that is characterized by very weak underlying relationships, which are consequently problematic to quantify empirically by any regression method, neural networks included. Introducing nonlinearities in univariate exchange rate models, using feedforward neural networks, does not seem to be a research direction where high payoffs can be expected.

ACKNOWLEDGEMENTS

The authors thank Arie Weeren (UFSIA) for his constructive criticism and his help in preparing this paper, and Eric Pentecost (Loughborough University) and Jürgen Wolters (Freie Universität Berlin) for useful suggestions on a previous version of this paper. Nevertheless, the authors claim sole responsibility.

REFERENCES

Baestaens, D., Van Den Bergh, W. and Wood, D. (1994) Neural Network Solutions for Trading in Financial Markets (Pitman, London).

Baillie, R. and McMahon, P. (1989) The Foreign Exchange Market: Theory and Econometric Evidence (Cambridge University Press, New York).

Banerjee, A., Dolado, J., Galbraith, J. and Hendry, D. (1993) Co-integration, Error-correction, and the Econometric Analysis of Non-stationary Data (Oxford University Press, Oxford).

Bilson, J. (1978) Rational expectations and the exchange rate. In The Economics of Exchange Rates, eds J. Frenkel and H. Johnson (Addison-Wesley, Reading, MA).

Chinn, M. (1991) Some linear and nonlinear thoughts on exchange rates, Journal of International Money and Finance, 10, 214-30.

Church, K. B. and Curram, S. P. (1996) Forecasting consumers' expenditure: A comparison between econometric and neural network models, International Journal of Forecasting, 12, 255-67.

Cybenko (1989) Approximation by superposition of a sigmoidal function, Math. Control Systems Signals, 2, 303-14.

Deboeck, G. (1994) Trading on the Edge - Neural, Genetic and Fuzzy Systems for Chaotic Financial Markets (John Wiley & Sons, New York).

Diebold, F. and Nason, J. (1990) Nonparametric exchange rate prediction? Journal of International Economics, 28, 315-32.

Dornbusch, R. (1976) Expectations and exchange rate dynamics. Journal of Political Economy, 84, 1161-76.

Dunis, C. (1995) The economic value of neural network systems for exchange rate forecasting. Working paper, Chemical Bank, London.

Engel, C. and Hamilton, J. (1990) Long swings in the dollar exchange rates: Are they in the data and do markets know it? The American Economic Review, 80, 689-713.

Engle, R. and Yoo, S. (1987) Forecasting and testing in co-integrated systems, Journal of Econometrics, 35, 143-59.

Faruqee, H. (1995) Long-run determinants of the real exchange rate: A stock-flow perspective. IMF Staff Papers, 42, no. 1, 80-107.

Fleming, J. (1962) Domestic financial policies under fixed and floating exchange rates, International Monetary Fund Staff Papers, 9, no. 4, 369-80.

Frank, I. and Friedman, J. (1993) A statistical view of some chemometrics regression tools. Technometrics, 35, 109-35.

Frankel, J. (1979) On the mark: A theory of floating exchange rates based on real interest differentials, The American Economic Review, 69, 611-22.

Frenkel, J. (1976) A monetary approach to the exchange rate: doctrinal aspects and empirical evidence, Scandinavian Journal of Economics, 76, 200-24.

Frenkel, J. (1981) The collapse of purchasing power parities during the 1970's, European Economic Review, 16, 145-65.

Geman, S., Bienenstock, E. and Doursat, R. (1992) Neural networks and the bias/variance dilemma, Neural Computation, 4, 1-58.

Hallwood, C. and MacDonald, R. (1994) International Money and Finance (Blackwell, Oxford).

Hastie, T. and Tibshirani, R. (1990) Generalized Additive Models (Chapman and Hall, London).

Hertz, J., Krogh, A. and Palmer, R. (1991) Introduction to the Theory of Neural Computation (Addison-Wesley, Redwood City, CA).

Hsieh, D. (1989) Testing for nonlinear dependence in daily foreign exchange rates, Journal of Business, 62, 329-68.

Huber, P. (1985) Projection pursuit, The Annals of Statistics, 13, 435-75.

Kuan, C. M. and Liu, T. (1995) Forecasting exchange rates using feedforward and recurrent neural networks, Journal of Applied Econometrics, 10, 347-64.

MacDonald, R. and Taylor, M. (1992) Exchange rate economics: a survey. IMF Staff Papers, 39, 1-57.

MacDonald, R. and Taylor, M. (1994) The monetary model of the exchange rate: Long-run relationships, short-run dynamics and how to beat a random walk, Journal of International Money and Finance, 13, 276-90.

MacKinnon (1991) Critical values for cointegration tests. In Long-Run Economic Relationships, eds R. Engle and C. Granger, pp. 267-76 (Oxford University Press, Oxford).

Mark, N. (1985) On time varying risk premia in foreign exchange markets: An econometric analysis, Journal of Monetary Economics, 16, 3-18.

Meese, R. and Rogoff, K. (1983) Empirical exchange rate models of the seventies: Do they fit out-of-sample? Journal of International Economics, 14, 3-24.

Meese, R. and Rose, A. (1990) Non-linear, non-parametric, non-essential exchange rate estimation, The American Economic Review, 80, 192-6.

Meese, R. and Rose, A. (1991) An empirical assessment of non-linearities in models of exchange rate determination, Review of Economic Studies, 58, 601-19.

Meulepas, M. and Plasmans, J. (1991) Estimating and forecasting exchange rates by means of PPP and UIP. Technical Report SESO 91/262, UFSIA (University of Antwerp).

Mizrach, B. (1992) Multivariate nearest-neighbour forecasts of EMS exchange rates, Journal of Applied Econometrics, 7, S151-S163.

Mundell, R. (1963) Capital mobility and stabilization policy under fixed and flexible exchange rates, Canadian Journal of Economic and Political Science, 29, 475-85.

Nash, J. (1990) Compact Numerical Methods for Computers (Adam Hilger, New York).

Refenes, A., Azema-Barac, M., Chen, L. and Karoussos, S. (1993) Currency exchange rate prediction and neural network design, Journal of Neural Computing and Applications, 1, 46-58.

Rehkugler, H. and Zimmermann, H. (1994) Neuronale Netze in der Ökonomie - Grundlagen und finanzwirtschaftliche Anwendungen (Verlag Franz Vahlen, München).

Ripley, B. (1993) Statistical aspects of neural networks. In Chaos and Networks - Statistical and Probabilistic Aspects, eds O. Barndorff-Nielsen, J. Jensen, and W. Kendall, pp. 40-123 (Chapman & Hall, London).

Rumelhart, D. and McClelland, J. (1986) Parallel Distributed Processing: Explorations in the Microstructures of Cognition (MIT Press, Cambridge, MA).

Rumelhart, D., Hinton, G. and Williams, R. (1986) Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructures of Cognition, eds D. Rumelhart and J. McClelland, pp. 318-62 (MIT Press, Cambridge, MA).

Scales, L. (1985) Introduction to Non-Linear Optimization (Macmillan, London).

Sephton, P. (1994) Cointegration tests on MARS, Computational Economics, 7, 23-35.

Stone, M. (1974) Cross-validity choice and assessment of statistical predictions, Journal of the Royal Statistical Society, B36, 111-47.

Trippi, R. and Turban, E. (1993) Neural Networks in Finance and Investing: Using Artificial Intelligence to Improve Real World Performance (Probus Publishing, Chicago, IL).

Verkooijen, W., Plasmans, J. and Daniëls, H. (1995) Long-run exchange rate determination: a neural network study. Discussion Paper 95/330, SESO, UFSIA (University of Antwerp).

Verkooijen, W. J. H., Plasmans, J. E. J. and Daniëls, H. A. M. (1996) Exchange rate modelling using artificial neural networks. In Forecasting Financial Markets, New Advances for Exchange Rates, Interest Rates and Asset Management (Chemical Bank, London).

APPENDIX

An ANN test for nonlinear cointegration

To test for the existence of a nonlinear cointegrating relationship, we generalize the Augmented Dickey-Fuller (ADF) tests which have been applied in Section IV. The main issue is to construct critical values for univariate nonlinear cointegration tests based on ANN estimators. In doing this, we adopt a similar procedure to that employed in Engle and Yoo (1987) and Sephton (1994). The cointegrating relationship is assumed to be represented by

$$x_{1t} = f(x_{2t}, x_{3t}, \ldots, x_{kt}) + \varepsilon_t \qquad (A1)$$

In Equation A1 $f$ is estimated by an ANN, but any other flexible or nonparametric regression method can be used.

Critical values of the ADF test on ANNs depend on several factors. When employing ANNs to construct a test for univariate nonlinear cointegration, it is necessary to condition critical values on the following ANN factors: the number of hidden units, the value of the weight decay parameter, the number of inputs, the number of observations, and the number of restarts employed in finding the final ANN. All these factors are related to the tendency of ANNs to overfit the training data, which would result in unduly small residuals. When the influence of the ANN factors is neglected, the ADF test would too often reject the null hypothesis of no cointegration. The present software and hardware make it computationally infeasible to construct tables of critical values for all possible combinations of ANN factors. In the applications, we select ANN factors for a particular case, and calculate the critical values that correspond to that particular combination of ANN factors. These critical values are constructed under the null hypothesis of no cointegration through 1000 replications of the following procedure. Construct $k$ independent random walks of length $n$, using the data generating mechanism:

$$x_t = x_{t-1} + u_t, \qquad u_{i,t} = 0.8\,u_{i,t-1} + \nu_{i,t}, \quad i = 1, \ldots, k \qquad (A2)$$

with $\nu_{i,t} \overset{\text{i.i.d.}}{\sim} N(0, 1)$. Train an ANN with a particular set of factors to approximate $f$ in Equation A1 and estimate the parameters in

$$\Delta \hat{\varepsilon}_t = \alpha_0 + \gamma_0 \hat{\varepsilon}_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta \hat{\varepsilon}_{t-i} + \zeta_t, \qquad \zeta_t \overset{\text{i.i.d.}}{\sim} N(0, \sigma_\zeta^2) \qquad (A3)$$

using the residuals from Equation A1, and calculate the t-statistic of $\hat{\gamma}_0$. The 1000 t-statistics so obtained give an empirical distribution, which is used in calculating the critical values required for testing the null hypothesis of no cointegration.
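The simulation of critical values can be sketched in Python as follows. This is a simplified illustration under explicit assumptions and not the procedure actually run for Table 3: a linear OLS regression stands in for the ANN fit of Equation A1, only 200 replications are used instead of 1000, the statistics are reported with their sign (Table 3 appears to list magnitudes), and the function names `adf_tstat` and `simulate_critical_values` are hypothetical.

```python
import numpy as np

def adf_tstat(e, p):
    """t-statistic of gamma_0 in Equation A3, estimated by OLS with p lagged differences."""
    de = np.diff(e)                                  # de[j] = e[j + 1] - e[j]
    rows = np.arange(p, len(de))                     # observations with all p lags available
    cols = [np.ones(len(rows)), e[rows]]             # constant and lagged level
    cols += [de[rows - i] for i in range(1, p + 1)]  # lagged differences
    Z = np.column_stack(cols)
    target = de[rows]
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    s2 = resid @ resid / (len(target) - Z.shape[1])
    cov = s2 * np.linalg.inv(Z.T @ Z)
    return coef[1] / np.sqrt(cov[1, 1])

def simulate_critical_values(k, n, p, reps, levels=(0.01, 0.05, 0.10), seed=0):
    """Empirical quantiles of the residual-based ADF statistic under no cointegration."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for r in range(reps):
        nu = rng.normal(size=(n, k))
        u = np.zeros((n, k))
        u[0] = nu[0]
        for t in range(1, n):
            u[t] = 0.8 * u[t - 1] + nu[t]            # AR(1) innovations, Equation A2
        x = np.cumsum(u, axis=0)                     # k independent random walks
        # Assumption: linear regression replaces the ANN approximation of f in Equation A1.
        R = np.column_stack([np.ones(n), x[:, 1:]])
        beta, *_ = np.linalg.lstsq(R, x[:, 0], rcond=None)
        stats[r] = adf_tstat(x[:, 0] - R @ beta, p)
    return {a: float(np.quantile(stats, a)) for a in levels}

crit = simulate_critical_values(k=3, n=100, p=1, reps=200)
```

Replacing the linear regression by a more flexible estimator makes the residuals smaller and shifts the simulated distribution further to the left, which is why the critical values have to be conditioned on the ANN factors listed above.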
