Forecasting with Vector Autoregressive Models: An Empirical Investigation for Austria

Robert Kunst, Klaus Neusser*)

Summary

Multivariate time series models have developed into a genuine alternative to conventional structural models. This paper examines the forecasting quality of two possible strategies for specifying such models. The first rests on a selection strategy that reduces the dimension of the model by means of an information criterion (AIC) and the usual F- and t-tests. The second strategy resorts to Bayesian methods in order to avoid the problem of overparametrization.

Starting from eight time series that characterize the behaviour of the Austrian economy, an attempt was made to implement both methods as well as possible. In a further step, their ex-ante forecasting qualities were then compared using several measures. It turns out that the selection strategy dominates the Bayesian one in every respect at shorter forecast horizons. At longer horizons, however, the latter performs better. Forecast accuracy also increases when more refined methods are applied. This suggests that it is worthwhile to resort to more complex methods.

1. Introduction

The widely used practice of building large, so-called structural, econometric models on an equation-by-equation basis has been severely criticized, among others by Sims (1980) and Lucas and Sargent (1981). This critique is primarily concerned with the often arbitrary and implausible exclusion restrictions used for identification. Additionally, simple univariate ARIMA models seem to be superior or at least equivalent in forecasting (McNees, 1981). An alternative proposed by Sims (1980) is to build vector autoregressive (VAR) models with no a-priori exclusion restrictions. The scarcity of observations relative to the number of parameters, even for small models, however, leads to the problem of overparametrization, which tends to diminish their forecasting performance (see Fair, 1979, or Kunst and Neusser, 1986).

*) A previous version of this paper was presented at the annual conference of the Austrian Economic Society in 1986. The authors would like to thank Manfred Deistler, David Hendry, and three anonymous referees for their helpful comments and suggestions.


This paper investigates two prominent alternatives for overcoming this difficulty. The first approach starts out with the unrestricted vector autoregressive model and tries to reduce its dimensionality by eliminating those variables which turn out to be "insignificant". In contrast to a previous investigation by Kunst and Neusser (1986), in which "insignificant" was defined with respect to the classical t- and F-statistics, this paper uses Akaike's (1974) information criterion on an equation-by-equation basis to reduce the dimensionality of the model. The other alternative, introduced by Doan, Litterman, and Sims (1984) and Litterman (1980), takes a Bayesian point of view by combining sample and prior information. The prior in this paper is, however, not based on the "Minnesota prior" as in Kunst and Neusser (1986) and in the papers cited above, but tries to be more informative by giving some of the variables extra explanatory power a priori.

The two alternative modeling techniques are evaluated by actually building models of the Austrian economy and by using them to generate ex-ante forecasts from the first quarter of 1982 to the fourth quarter of 1985. The performance of the alternative models is evaluated by computing several measures of predictive accuracy for forecast horizons ranging from one to eight quarters.

The evaluation of alternative forecasting techniques has been the subject of extensive research, which is well summarized by Fildes (1985) and Fair (1986). In particular, it may be interesting to compare the results obtained in this paper with those of Kling and Bessler (1985), who also evaluated alternative multivariate forecasting procedures, including Litterman's procedure among a dozen different techniques.

There is another interesting aspect of this comparison. The first approach uses differenced data, so that potentially useful information about the relationships between the levels of the variables is not used. The Bayesian approach, on the contrary, does not make this transformation, so that it imposes perhaps too many unit roots, thereby incurring the possibility of spuriously mean reverting or explosive models (1).

The plan of the paper is as follows. Section 2 gives a brief presentation of the data. The next two sections present the construction of the restricted and the Bayesian vector autoregressive models, which are then compared in their forecasting performance in Section 5. The last section summarizes the findings and outlines future research.

2. The data

The paper proposes to build quarterly models of the Austrian economy using the two strategies outlined in the next two sections and to investigate their forecasting performance. The models consist of 8 endogenous variables: real gross domestic product (GDP), employment (LE), monetary base (MB), the deflator of gross domestic product (PGDP), the average bond yield in the secondary market (R), the terms of trade (TT) calculated with respect to goods exports and imports, average earnings of employees (WAGE), and real goods exports (XG). The sample period is 1964(1) to 1985(4). Although the selection of variables is somewhat arbitrary, they nonetheless represent key economic indicators for any economy.

Before starting the actual model building process, the logarithm of each variable, with the exception of the interest rate, is taken. This transformation is supposed to be variance stabilizing, but it does not take out the trend. For the restricted vector autoregressive models all variables were differenced once, because the technique requires stationary data. In the Bayesian approach, on the contrary, no further transformation is carried out, to allow for the possibility of long-run relationships among levels.

The seasonality was modeled by adding four seasonal dummies to each of the eight equations in each model, thereby postulating repeating fixed patterns. The main advantage of this choice is its simplicity, since the original data can be recovered with ease. Although this treats seasonality alike in each approach, there still remains the possibility that the relative forecasting performance is impaired by this procedure.
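The data preparation described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy series, and the use of NumPy are not from the paper, and the dummy coding (one indicator per quarter, no separate intercept) is one plausible reading of "four seasonal dummies".

```python
import numpy as np

def prepare_series(levels, take_log=True, difference=True):
    """Log-transform a quarterly series and optionally take first differences.

    take_log=False would be used for the interest rate R;
    difference=False corresponds to the Bayesian (levels) approach.
    """
    x = np.asarray(levels, dtype=float)
    if take_log:
        x = np.log(x)
    if difference:
        x = np.diff(x)
    return x

def seasonal_dummies(n_obs, first_quarter=1):
    """Return an (n_obs, 4) matrix with one indicator column per quarter."""
    quarters = (np.arange(n_obs) + first_quarter - 1) % 4
    return np.eye(4)[quarters]

# Hypothetical toy data (not from the paper): five quarterly levels starting in Q1.
gdp = [100.0, 103.0, 101.0, 106.0, 104.0]
growth = prepare_series(gdp)                       # four log differences, first one dated Q2
dummies = seasonal_dummies(len(growth), first_quarter=2)
```

Because the dummies are fixed indicator columns and the differences are simple, the original levels can be rebuilt by cumulating the differences and exponentiating, which is the simplicity argument made above.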

3. A restricted autoregressive model

An "unrestricted" VAR model, as proposed by Sims (1980), is likely to contain many insignificant parameters, so the search for restrictions which reduce the dimension of the model seems to be a rewarding task. This is especially motivated by a possible disturbing influence of insignificant parameters on forecasting (2).

There is no unique way for setting up such restrictions. In many cases, they are based on information extracted from theory concerning the structure and the interaction of the time series. This path leads to some kind of approximation of a structural econometric model. An alternative is the specification of restrictions on empirical grounds. Here, the following procedure was chosen:

1. An unrestricted time series model is estimated. F- and t-statistics are noted. These describe the influence of one variable and of a specific lag of a variable, respectively.

2. Insignificant variables and insignificant lags of variables are eliminated. The significance level was set loosely to 15 percent to reduce the possibility of neglecting important regressors, which might be increased due to multicollinearity. This "pre-selection" is designed to restrict the subsequent AIC search to the interesting regressors (AIC: Akaike Information Criterion, see Akaike, 1974). The level of 15 percent is much higher than the level corresponding to AIC seen as a test.

3. Starting from the resulting model, which is slightly overparametrized according to conventional standards, the "best" overall model was searched for by minimizing the AIC through stepwise elimination of the least significant regressor until elimination began to increase the AIC.

4. Starting from the resulting, possibly underparametrized, model, groups of lags were scanned again although these had been labeled "insignificant" in former steps. In some equations, the number of regressors was re-increased in order to minimize AIC.
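Step 3 of the procedure can be sketched as a backward elimination on a single equation. The following is a minimal sketch, not the authors' implementation: the AIC is written in its common regression form T log(RSS/T) + 2k, the function names and the simulated data are hypothetical, and steps 1, 2, and 4 are not covered. Dropping the regressor whose removal yields the lowest AIC is the same as dropping the one with the smallest absolute t-statistic, because removing a single regressor raises the residual sum of squares in proportion to its squared t-statistic.

```python
import numpy as np

def aic(y, X):
    """AIC of an OLS fit in the form T * log(RSS / T) + 2 * k (assumed form)."""
    T, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return T * np.log(rss / T) + 2 * k

def backward_aic(y, X, names):
    """Drop one regressor at a time for as long as doing so lowers the AIC."""
    keep = list(range(X.shape[1]))
    best = aic(y, X[:, keep])
    while len(keep) > 1:
        # AIC of every candidate model with exactly one regressor removed
        trials = [(aic(y, X[:, [c for c in keep if c != j]]), j) for j in keep]
        trial_aic, drop = min(trials)
        if trial_aic >= best:          # elimination starts to increase the AIC
            break
        best = trial_aic
        keep.remove(drop)
    return [names[j] for j in keep], best

# Hypothetical simulated data: only x0 truly matters.
rng = np.random.default_rng(0)
T = 80
X = rng.normal(size=(T, 3))
y = 2.0 * X[:, 0] + 0.5 * rng.normal(size=T)
full_aic = aic(y, X)
kept, best_aic = backward_aic(y, X, ["x0", "x1", "x2"])
```

The search is greedy, so it need not find the globally AIC-minimal subset; this is the same caveat the text raises when it notes that the set of regressors could depend on the search strategy.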

To cope with possible effects of multicollinearity and to retain a reasonable number of degrees of freedom, the total number of regressors is restricted to 20. Thus, stepwise selection and elimination of variables was necessary in the first two steps. This procedure started by identifying univariate autoregressive models of each of the eight time series via AIC. Because lags beyond the AIC minimum are unlikely to play a role in a multivariate model, the search may then focus on the other variables.

Next, by OLS regression, the first 4 lags of one or two of the other variables were added to the p own lags identified by AIC. If some of the first 4 lags had proved significant, lags 5 to 8 of that variable were tried. Additionally, four seasonal dummies were inserted as regressors to correct for non-stationary seasonal behaviour of the data.

Applying the procedure of the second step outlined above, the overall number of parameters was reduced from over 500 to only about 110. The last two steps fix this number at 91, including the seasonal constants. The set of regressors in the restricted model (RVAR-2) could depend on the search strategy. A comparison of all possible subset models, as suggested by Haggan and Oyetunji (1984), would be rather tedious and time-consuming. In any case, RVAR-2 should not be "far away" from the optimal single-equation model (i.e., the model minimizing the AIC for each equation) (3). The performance of RVAR-2 is compared with that of a model RVAR for which only the first two steps in the search were carried out. RVAR emerged as the best non-Bayesian model in Kunst and Neusser (1986).

The significance of explanatory variables offers a rough causality test in the spirit of Granger (1969). Following Granger, a variable is said to cause another one if its lags reduce the forecasting variance in addition to the own lags of the variable to be forecast. Since significance levels of F-statistics, as often used with unrestricted vector autoregressive models, are not valid due to the iterative elimination process, the causal structure of the system is instead represented by the identified lag structure given in Table 1. Parts of the lag structure may seem implausible economically, as is common with VAR models. Since some economic decisions and contracts are revised on an annual basis only, the appearance of long and isolated lags is not surprising. It is, however, noteworthy that MB is completely endogenous (i.e., own lags do not provide any information) whereas R and LE are more or less exogenous. Note that exports do not feed back into system RVAR; however, residual correlation with GDP is high (see below). In RVAR-2 a feedback loop with TT appears.


Influences which are quicker than one quarter of a year are not captured by the lags. However, according to Pierce and Haugh (1977), they are reflected in the residual correlations. The direction of these influences cannot be determined from the data. This phenomenon is known as "instantaneous causality" (Granger, 1969). Within the framework of a forecasting model, instantaneous causality could be interpreted as an indicator of flaws in the model, that is, information which could be used for improving forecasts but is not. On the other hand, if the model is viewed as correct, it is impossible to improve point forecasts by using these correlations, which however do affect stochastic forecasting. The correlations are shown in Table 2. Most of them are low; only the terms of trade TT show remarkable cross-effects with R and XG. The correlation between XG and GDP decreases in the RVAR-2 version.

4. Bayesian vector autoregression

The Bayesian approach starts with the presumption that a given data set does not contain information in every dimension. This means that by fitting an overparametrized system some coefficients turn out to be non-zero just by pure chance. The influence of the corresponding variable is then just accidental and does not correspond to a stable relationship, so that the out-of-sample forecasting performance of such models deteriorates quickly. The role of the Bayesian prior can therefore be described as prohibiting coefficients from being non-zero "too easily". Only if the data really provide information will the barrier raised by the prior be broken through.

Table 1
Lag structure of RVAR and RVAR-2

       GDP  LE  MB  PGDP  R  TT  WAGE  XG

RVAR
GDP    1  24  134  -  12  4  14  -
LE     124  1,245  -  234  3  24  23  -
MB     124  -  12  24  -  12  2  4
PGDP   1,234  -  -  12  3  3  -  24
R      1  2  -  23  1  234  -  -
TT     -  -  14  -  2  12  -  13
WAGE   14  -  1,234  134  4  -  1,234  524
XG     -  -  -  -  -  -  -  1,234

RVAR-2
GDP    1  2  145  -  -  6  45  -
LE     467  4  -  23  3  -  27  -
MB     -  3  -  -  -  1,348  2  4
PGDP   -  -  -  124  -  56  -  678
R      -  2  -  23  1  -  -  -
TT     -  -  5  8  -  -  1  4  3  5
WAGE   47  -  134  38  -  -  1,245  -
XG     -  -  -  -  1  46  -  14

Rows correspond to source variables, columns to effects. The numbers give the lags of the row variable included in the equation for the column variable.

Table 2
Residual correlation matrix of RVAR and RVAR-2

RVAR
            GDP       LE       MB     PGDP        R       TT     WAGE       XG
GDP      1.0000   0.1631  -0.0159  -0.0678   0.0052  -0.0217   0.0870   0.4009
LE                1.0000  -0.1631   0.0056  -0.0992  -0.0164  -0.1460  -0.0123
MB                         1.0000   0.0887  -0.2317   0.1492   0.1940   0.1689
PGDP                                1.0000  -0.1023   0.1301   0.2475  -0.0001
R                                            1.0000  -0.3487  -0.1243   0.0251
TT                                                    1.0000  -0.1243  -0.3320
WAGE                                                           1.0000   0.2147
XG                                                                      1.0000

DET LN = -66.252

RVAR-2
            GDP       LE       MB     PGDP        R       TT     WAGE       XG
GDP      1.0000   0.1513   0.0593   0.0605  -0.0188  -0.0217   0.0124   0.2255
LE                1.0000  -0.2415   0.0223  -0.0920  -0.1425  -0.0415   0.0053
MB                         1.0000   0.0775  -0.2095   0.2375   0.1185   0.0248
PGDP                                1.0000  -0.1706   0.1009   0.2194   0.0960
R                                            1.0000  -0.2732  -0.0529   0.0863
TT                                                    1.0000  -0.1411  -0.3336
WAGE                                                           1.0000   0.2126
XG                                                                      1.0000

DET LN = -65.858

Before going into details it is necessary to introduce some notation. By setting the lag length universally to 6, the i-th equation of the 8-variable autoregressive model can be written as follows:

(1) $y_i(t) = \sum_{j=1}^{8} \sum_{l=1}^{6} \beta_{ijl}\, y_j(t-l) + z_i(t)'\gamma_i + u_i(t),$

where $z_i$ is a vector of deterministic variables which, in this case, includes only the four seasonal dummies, and $u_i$ is a normally distributed error term. In Bayesian econometrics the coefficients $\beta_{ijl}$ are considered to be stochastic and the prior is therefore represented by a corresponding density function. In this approach the coefficient vector is given a multivariate normal distribution, implying that the prior is completely specified by the mean and the variance-covariance matrix.


A general observation about economic time series is that they or their logarithms seem to follow a random walk with drift. Following Doan, Litterman, and Sims (1984), the prior means are set in such a way that the system is simplified to a set of 8 independent random walks with drift:

(2) $\beta_{ijl} = \begin{cases} 1 & \text{for } i = j \text{ and } l = 1, \\ 0 & \text{otherwise.} \end{cases}$

For the drift terms, in this case the coefficients of the four seasonal dummies, no prior is specified, so that they are estimated freely. This prior does not represent a genuine Bayesian prior, since it does not characterize the beliefs of an investigator, who usually postulates some relationships among the variables involved. It may, however, be considered as the intersection of the a-priori beliefs of many economists, and can therefore be regarded as an improvement over the "diffuse prior" which is often used to characterize the notion of "knowing little".

The next step consists in the specification of the variance-covariance matrix, which is usually a difficult task because its dimension can be very high. The first reduction is achieved by setting all the covariances to zero. But there still remain 384 variances (8 x 6 x 8) to be specified. Instead of setting each of them separately, the strategy is to specify a function which depends on only a few parameters. Denoting the standard deviation of the l-th lag of the j-th variable in the i-th equation by $S(i,j,l)$, a relatively general function can be set up as follows:

(3) $S(i,j,l) = \tau\, f(i,j)\, g(l)\, \frac{s_i}{s_j},$

where $s_i$ and $s_j$ are scaling factors. The parameter $\tau$ specifies the overall tightness of the prior. A smaller value will put more weight on the prior whereas a larger value will put more weight on the sample information.

The paper considers two different types of the function $f(i,j)$, which controls the interactions among variables. The first one, labeled "Minnesota prior", is relatively uninformative and treats each variable alike; $f$ is therefore symmetric, with ones on the diagonal and $w$ ($0 < w < 1$) in the off-diagonal entries:

(4) $f(i,j) = \begin{cases} 1 & \text{if } i = j, \\ w & \text{if } i \neq j. \end{cases}$

This reduces the standard deviations of all coefficients in the i-th equation by a factor w, except for own lags.


The function $g$, which controls the shape of the lag weights, is geometrically declining in the lag length $l$:

(5) $g(l) = d^{\,l-1}.$

The actual application of the "Minnesota prior" uses the following values of the metaparameters:

BVAR-MINN: $\tau = 0.1$, $d = 0.5$, $w = 0.5$.
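To make the prior concrete, the following Python sketch builds the prior means of equation (2) and the prior standard deviations of equation (3) with the BVAR-MINN metaparameters. It is an illustration under assumptions: the array layout, the function names, and the unit scaling factors in the usage example are hypothetical, and the paper does not state how the scaling factors were obtained.

```python
import numpy as np

def random_walk_prior_mean(n_vars=8, n_lags=6):
    """Prior mean of beta[i, j, l-1]: one on the first own lag, zero elsewhere."""
    mean = np.zeros((n_vars, n_vars, n_lags))
    idx = np.arange(n_vars)
    mean[idx, idx, 0] = 1.0
    return mean

def minnesota_prior_std(scales, n_lags=6, tau=0.1, d=0.5, w=0.5):
    """Prior standard deviations S[i, j, l-1] = tau * f(i, j) * g(l) * s_i / s_j."""
    s = np.asarray(scales, dtype=float)
    n = s.size
    f = np.where(np.eye(n, dtype=bool), 1.0, w)   # equation (4): 1 on the diagonal, w elsewhere
    g = d ** np.arange(n_lags)                    # equation (5): d**(l-1) for l = 1, ..., n_lags
    ratio = s[:, None] / s[None, :]               # s_i / s_j
    return tau * f[:, :, None] * g[None, None, :] * ratio[:, :, None]

# Usage with unit scaling factors (an assumption made only for illustration).
prior_mean = random_walk_prior_mean()
prior_std = minnesota_prior_std(np.ones(8))
```

With unit scales, the first own lag gets a standard deviation of 0.1, the first lag of any other variable 0.05, and each further lag halves the value, so the array holds the 384 quantities counted in the text.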

The second prior, labeled BVAR-INFO, is more informative in that it puts a mild "economic" structure on the function $f$. This is done by dividing, on a-priori grounds, the set of variables into core variables which are deemed to be important in the prediction of all variables, in this case GDP, R, and WAGE, and into "circle" variables which are considered to be of lesser importance, in this case LE, MB, PGDP, TT, and XG. The core variables are given a weight of 0.6 in the equations of the other core variables, and of 0.4 in the other equations. The circle variables enter into core variable equations with a weight of only 0.1. The weights of the circle variables in circle equations are governed by the 4-vector (0.2 0.1 0.1 0.2) and the ordering LE, PGDP, MB, TT, XG of the circle variables. This means that the values of $f$ are given as follows:

f(LE, PGDP) = 0.2,    f(LE, MB) = 0.1,      f(LE, TT) = 0.1,      f(LE, XG) = 0.2,
f(PGDP, MB) = 0.2,    f(PGDP, TT) = 0.1,    f(PGDP, XG) = 0.1,    f(PGDP, LE) = 0.2,
f(MB, TT) = 0.2,      f(MB, XG) = 0.1,      f(MB, LE) = 0.1,      f(MB, PGDP) = 0.2,
f(TT, XG) = 0.2,      f(TT, LE) = 0.1,      f(TT, PGDP) = 0.1,    f(TT, MB) = 0.2,
f(XG, LE) = 0.2,      f(XG, PGDP) = 0.1,    f(XG, MB) = 0.1,      f(XG, TT) = 0.2.
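The weights of the BVAR-INFO prior follow mechanically from the core/circle split, the ordering of the circle variables, and the 4-vector, so they can be generated rather than typed in. The sketch below is illustrative only: the function and constant names are hypothetical, the first argument is read as the equation and the second as the explanatory variable (as in the description of equation (4)), and a weight of one for own lags is an assumption carried over from the Minnesota prior.

```python
import numpy as np

VARIABLES = ["GDP", "LE", "MB", "PGDP", "R", "TT", "WAGE", "XG"]
CORE = ["GDP", "R", "WAGE"]
CIRCLE = ["LE", "PGDP", "MB", "TT", "XG"]   # ordering given in the text
CIRCLE_WEIGHTS = [0.2, 0.1, 0.1, 0.2]       # the 4-vector

def info_weight(equation, variable):
    """Weight f(equation, variable) of the BVAR-INFO prior."""
    if equation == variable:
        return 1.0                           # assumption: own lags keep weight one
    if variable in CORE:
        return 0.6 if equation in CORE else 0.4
    if equation in CORE:
        return 0.1                           # circle variable in a core equation
    # both are circle variables: step around the circle from equation to variable
    offset = (CIRCLE.index(variable) - CIRCLE.index(equation)) % len(CIRCLE)
    return CIRCLE_WEIGHTS[offset - 1]

F = np.array([[info_weight(e, v) for v in VARIABLES] for e in VARIABLES])
```

Because the 4-vector reads the same forwards and backwards, the circle block of `F` is symmetric, which matches the listed pairs such as f(LE, PGDP) = f(PGDP, LE) = 0.2.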

The function $g$ is given by:

(6) $g(l) = l^{-d}.$

In contrast to the geometrically declining function, this specification is looser and gives higher lags comparatively more weight. In the forecasting experiments $\tau$ was set equal to 0.125 and $d$ equal to 0.75. In contrast to the previous choice, the "optimal" values of the metaparameters ($\tau$, $d$, the 4-vector, and the weights of the core variables) were searched for. This was done by scanning over some values and selecting the specification which minimizes the logarithm of the determinant of the one-step ahead prediction error over the sampling period 1982(1) to 1985(4) (4).
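The difference between the two lag-decay functions is easy to see numerically. The following sketch evaluates equation (5) with d = 0.5 and equation (6) with d = 0.75 over the six lags of the model; the variable names are illustrative and not from the paper.

```python
import numpy as np

lags = np.arange(1, 7, dtype=float)     # lag length 6, as in equation (1)
g_geometric = 0.5 ** (lags - 1)         # equation (5) with d = 0.5 (BVAR-MINN)
g_harmonic = lags ** -0.75              # equation (6) with d = 0.75 (BVAR-INFO)

for l, a, b in zip(lags, g_geometric, g_harmonic):
    print(f"lag {int(l)}: geometric {a:.4f}, harmonic {b:.4f}")
```

Both functions equal one at the first lag, but the geometric weights halve at every step while the power-law weights fall off much more slowly, which is what "looser" means in the text.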


5. Forecasts

One of the main purposes of an econometric model is forecasting. With structural models it is possible to evaluate economic theories or to gain insight into economic processes. With time series models, this goal is set aside. Thus, the acid test for an econometric time series model is its forecasting performance. The best model will be the one which forecasts best. The comparative investigation in this paper relies on a forecasting scenario and several measures of predictive accuracy. An "ex-ante" forecast is performed over 1982(1) to 1985(4). For this exercise the models are estimated over the period 1964(1) to 1981(4) and used to generate forecasts for the whole prediction interval. Then the information of the next quarter, 1982(1), is incorporated into the model by updating the parameters through Kalman filtering and a new set of forecasts is generated, now going from 1982(2) to 1985(4). This procedure is repeated until 1985(4) is reached, when all available information has been used (5). During this updating the parameter vector is kept fixed. This is especially relevant for the RVAR and RVAR-2 models since the underlying search procedures might result in different model structures at different steps. In this way 16 1-period, 15 2-period, 14 3-period, ..., and 9 8-period forecasts are generated, which can be checked against the actual realizations.

Table 3
Ex-ante forecasts from the BVAR-MINN model
log |V| = -62.117

Steps              1       2       3       4       5       6       7       8
GDP   RMSE    0.0151  0.0214  0.0227  0.0222  0.0317  0.0394  0.0416  0.0410
      U         0.17    0.21    0.24    0.91    0.34    0.35    0.40    0.89
      MAE     0.0126  0.0176  0.0184  0.0186  0.0252  0.0331  0.0354  0.0379
LE    RMSE    0.0060  0.0072  0.0111  0.0134  0.0170  0.0192  0.0229  0.0256
      U         0.30    0.26    0.54    1.50    0.75    0.67    1.05    2.06
      MAE     0.0048  0.0062  0.0096  0.0106  0.0147  0.0166  0.0207  0.0236
MB    RMSE    0.0264  0.0356  0.0366  0.0335  0.0461  0.0505  0.0514  0.0370
      U         1.19    1.00    0.81    0.60    0.67    0.63    0.58    0.41
      MAE     0.0215  0.0286  0.0284  0.0278  0.0402  0.0444  0.0461  0.0330
PGDP  RMSE    0.0132  0.0194  0.0186  0.0197  0.0269  0.0326  0.0304  0.0302
      U         0.66    0.67    0.54    0.48    0.50    0.52    0.43    0.36
      MAE     0.0111  0.0151  0.0149  0.0179  0.0220  0.0271  0.0267  0.0292
R     RMSE    0.3220  0.5994  0.8091   1.029   1.205   1.264   1.329   1.455
      U         0.86    0.88    0.87    0.87    0.87    0.86    0.86    0.86
      MAE     0.2306  0.4684  0.6172  0.7795  0.9072  0.9610   1.058  1.1950
TT    RMSE    0.0272  0.0239  0.0269  0.0194  0.0308  0.0267  0.0290  0.0261
      U         0.84    1.00    0.78    1.18    0.84    1.05    0.84    1.89
      MAE     0.0212  0.0185  0.0225  0.0154   0.264  0.0232  0.0262  0.0214
WAGE  RMSE    0.0248  0.0339  0.0279  0.0162  0.0270  0.0399  0.0346  0.0162
      U         0.29    0.93    0.30    0.33    0.26    0.55    0.30    0.17
      MAE     0.0201  0.0311  0.0237  0.0137  0.0228  0.0370  0.0264  0.0101
XG    RMSE    0.0523  0.0566  0.0633  0.0428  0.0515  0.0702  0.0608  0.0523
      U         0.85    0.95    0.77    0.51    0.50    0.56    0.44    0.34
      MAE     0.0402  0.0461  0.0497  0.0336  0.0400  0.0563  0.0450  0.0452

For each variable, rows 1 to 3 show the RMSE, Theil's U-statistic, and the MAE, respectively. V is the variance-covariance matrix of the one-step ahead forecast errors.
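The number of forecasts available at each horizon in the rolling ex-ante scheme (16 one-period down to 9 eight-period forecasts) can be verified with a short enumeration. This is a sketch of the bookkeeping only; the constant names are illustrative, and no model is estimated.

```python
from collections import Counter

N_TARGETS = 16      # quarters 1982(1) to 1985(4)
MAX_HORIZON = 8     # horizons evaluated in the paper

counts = Counter()
for origin in range(N_TARGETS):            # origin 0 uses data up to 1981(4)
    for h in range(1, MAX_HORIZON + 1):
        target = origin + h - 1            # index of the quarter being forecast
        if target < N_TARGETS:             # keep only forecasts with a realization
            counts[h] += 1

for h in range(1, MAX_HORIZON + 1):
    print(h, counts[h])
```

Each additional step of the horizon loses one forecast origin at the end of the sample, so the count at horizon h is 17 - h.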

The criteria for the quality of forecasting are the root mean square error (RMSE), the U-statistic due to Theil (1966), and the mean absolute error of the forecasts (MAE). RMSE and MAE compare the forecasts with reality whereas Theil's U calculates the improvement of the model's forecasts relative to a no-change forecast (6). RMSE and U are based on quadratic loss functions, whereas MAE uses a linear one. Regrettably, the true forecaster's loss function is unknown. It may even be skewed, with different costs for optimistic and pessimistic mis-specifications (see Aiginger, 1979, or Jäger, 1985). However, as the modeling procedures rely on quadratic criteria, quadratic loss is a useful technical assumption.

Table 4
Ex-ante forecasts from the BVAR-INFO model
log |V| = -62.639

Steps              1       2       3       4       5       6       7       8
GDP   RMSE    0.0133  0.0165  0.0162  0.0151  0.0225  0.0262  0.0245  0.0237
      U         0.15    0.16    0.17    0.61    0.24    0.23    0.23    0.52
      MAE     0.0107  0.0138  0.0140  0.0136  0.0176  0.0219  0.0193  0.0213
LE    RMSE    0.0061  0.0077  0.0117  0.0143  0.0189  0.0219  0.0262  0.0295
      U         0.30    0.28    0.57    1.60    0.84    0.77    1.20    2.37
      MAE     0.0048  0.0068  0.0103  0.0123  0.0165  0.0196  0.0238  0.0274
MB    RMSE    0.0256  0.0337  0.0344  0.0331  0.0451  0.0484  0.0487  0.0378
      U         1.16    0.95    0.76    0.60    0.66    0.60    0.55    0.42
      MAE     0.0205  0.0262  0.0268  0.0272  0.0396  0.0441  0.0433  0.0339
PGDP  RMSE    0.0118  0.0165  0.0155  0.0166  0.0231  0.0276  0.0255  0.0262
      U         0.59    0.57    0.45    0.40    0.43    0.44    0.36    0.32
      MAE     0.0097  0.0125  0.0116  0.0137  0.0196  0.0230  0.0226  0.0257
R     RMSE    0.3160  0.5957  0.7976  0.9914  1.1740  1.2798  1.3794  1.4819
      U         0.85    0.87    0.86    0.84    0.85    0.87    0.89    0.87
      MAE     0.2618  0.5217  0.7077  0.8740  1.0340  1.1259  1.2405  1.3668
TT    RMSE    0.0262  0.0232  0.0258  0.0214  0.0296  0.0280  0.0294  0.0294
      U         0.80    0.97    0.75    1.30    0.81    1.10    0.85    2.13
      MAE     0.0206  0.0186  0.0207  0.0169  0.0255  0.0239  0.0251   0.255
WAGE  RMSE    0.0196  0.0249  0.0204  0.0121  0.0197  0.0295  0.0246  0.0112
      U         0.23    0.66    0.22    0.24    0.19    0.40    0.21    0.12
      MAE     0.0156  0.0219  0.0175  0.0110  0.0173  0.0277  0.0209  0.0094
XG    RMSE    0.0513  0.0543  0.0593  0.0389  0.0441  0.0595  0.0450  0.0402
      U         0.83    0.91    0.72    0.46    0.43    0.48    0.32    0.26
      MAE     0.0391  0.0432  0.0445  0.0292  0.0306  0.0511  0.0392  0.0360

For each variable, rows 1 to 3 show the RMSE, Theil's U-statistic, and the MAE, respectively. V is the variance-covariance matrix of the one-step ahead forecast errors.
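The three accuracy measures can be computed as in the following sketch. The exact definition of Theil's U used in the paper is given in a footnote that is not reproduced here, so the version below, the ratio of the model's RMSE to the RMSE of a no-change forecast, is an assumption; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def accuracy(actual, forecast, last_observed):
    """Return (RMSE, Theil's U, MAE) for a set of forecasts at one horizon.

    last_observed holds, for each forecast, the value known at the forecast
    origin; it serves as the no-change forecast in the assumed U-statistic.
    """
    actual = np.asarray(actual, dtype=float)
    errors = actual - np.asarray(forecast, dtype=float)
    naive_errors = actual - np.asarray(last_observed, dtype=float)
    rmse = np.sqrt(np.mean(errors ** 2))
    mae = np.mean(np.abs(errors))
    theil_u = rmse / np.sqrt(np.mean(naive_errors ** 2))
    return rmse, theil_u, mae

# Hypothetical toy numbers, not taken from the tables.
rmse, theil_u, mae = accuracy(
    actual=[1.0, 2.0, 3.0],
    forecast=[1.5, 2.0, 2.5],
    last_observed=[0.0, 1.0, 2.0],
)
```

Under this definition a value of U below one means the model beats the no-change forecast, and the MAE can never exceed the RMSE computed from the same errors.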

The ex-ante forecasting performance is documented in Tables 3 to 6. They show that, by refining both methods, significant improvements can be achieved. Within the Bayesian approach it was possible to reduce the logarithm of the determinant of the variance-covariance matrix of the one-step ahead prediction errors from -62.117 to -62.639, while in the other approach a reduction from -63.746 to even -65.561 has been achieved.
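The summary measure log |V| can be computed from the matrix of one-step ahead forecast errors as sketched below. The normalization of V (uncentered second moments divided by the number of forecasts) is an assumption, since the paper does not spell it out, and the function name and toy input are hypothetical.

```python
import numpy as np

def log_det_error_cov(errors):
    """log|V| for an (n_forecasts, n_variables) array of one-step ahead errors."""
    e = np.asarray(errors, dtype=float)
    V = e.T @ e / e.shape[0]               # assumption: uncentered, divided by n
    sign, logdet = np.linalg.slogdet(V)    # numerically stable log-determinant
    if sign <= 0:
        raise ValueError("V is singular; more forecasts than variables are needed")
    return logdet

# Hypothetical toy input: two forecasts of two variables.
toy = log_det_error_cov([[2.0, 0.0], [0.0, 2.0]])
```

With 16 one-step forecasts of 8 variables, V is generically of full rank, so the log-determinant is well defined; a lower value indicates jointly smaller forecast errors.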

Table 5
Ex-ante forecasts from the RVAR model
log |V| = -63.746

Steps              1       2       3       4       5       6       7       8
GDP   RMSE    0.0149  0.0180  0.0185  0.0202  0.0279  0.0348  0.0414  0.0443
      U         0.17    0.17    0.20    0.82    0.30    0.31    0.39    0.96
      MAE     0.0127  0.0151  0.0150  0.0166  0.0240  0.0295  0.0361  0.0399
LE    RMSE    0.0032  0.0054  0.0079  0.0097  0.0125  0.0154  0.0178  0.0196
      U         0.16    0.20    0.38    1.09    0.55    0.54    0.81    1.58
      MAE     0.0027  0.0043  0.0065  0.0077  0.0104  0.0128  0.0154  0.0180
MB    RMSE    0.0195  0.0287  0.0346  0.0421  0.0503  0.0562  0.0581  0.0573
      U         0.88    0.81    0.76    0.76    0.73    0.70    0.65    0.63
      MAE     0.0164  0.0239  0.0281  0.0364  0.0465  0.0519  0.0512  0.0485
PGDP  RMSE    0.0100  0.0142  0.0175  0.0188  0.0207  0.0219  0.0197  0.0193
      U         0.50    0.49    0.51    0.46    0.39    0.35    0.28    0.23
      MAE     0.0081  0.0120  0.0155  0.0169  0.0182  0.0196  0.0172  0.0162
R     RMSE    0.3059  0.5648  0.7958  0.8776  0.9406   1.004   1.103   1.243
      U         0.82    0.83    0.85    0.74    0.68    0.68    0.71    0.73
      MAE     0.2473  0.4367  0.6433  0.7394  0.7411  0.8052  0.9523   1.085
TT    RMSE    0.0207  0.0233  0.0232  0.0241  0.0293  0.0275  0.0250  0.0283
      U         0.64    0.98    0.67    1.46    0.80    1.08    0.72    2.05
      MAE     0.0163  0.0182  0.0187  0.0208  0.0254  0.0202  0.0205  0.0229
WAGE  RMSE    0.0111  0.0118  0.0140  0.0153  0.0216  0.0256  0.0309  0.0376
      U         0.13    0.32    0.15    0.31    0.21    0.35    0.26    0.40
      MAE     0.0094  0.0093  0.0115  0.0130  0.0193  0.0230  0.0276  0.0342
XG    RMSE    0.0524  0.0630  0.0708  0.0550  0.0584  0.0693  0.0731  0.0913
      U         0.85    1.06    0.86    0.66    0.56    0.56    0.53    0.59
      MAE     0.0425  0.0456  0.0569  0.0448  0.0486  0.0564  0.0673  0.0792

For each variable, rows 1 to 3 show the RMSE, Theil's U-statistic, and the MAE, respectively. V is the variance-covariance matrix of the one-step ahead forecast errors.

Comparing the best models of both approaches, RVAR-2 clearly dominates BVAR-INFO. Not only is the logarithm of the determinant of the variance-covariance matrix of the one-step ahead prediction errors significantly lower, RVAR-2 is also better when considering each variable separately. At longer forecast horizons the situation is almost reversed. Comparing 8-quarters ahead forecast errors, BVAR-INFO is now better in five out of eight cases. This can be explained by the time period chosen for the identification of the RVAR models. Since they make use of the whole sample period, their forecasts will become better and better as the starting period for the ex-ante forecasts approaches 1985(4). For short horizons the statistics used to evaluate predictive accuracy have a larger proportion of periods in their sample where the model was relatively good. Another possible explanation for this reversal may be that the Bayesian approach does not use differenced data (see the discussion in the introduction).

Table 6
Ex-ante forecasts from the RVAR-2 model
log |V| = -65.561

Steps              1       2       3       4       5       6       7       8
GDP   RMSE    0.0098  0.0138  0.0153  0.0170  0.0219  0.0273  0.0305  0.0312
      U         0.11    0.13    0.16    0.69    0.23    0.24    0.29    0.68
      MAE     0.0073  0.0102  0.0117  0.0139  0.0194  0.0257  0.0295  0.0296
LE    RMSE    0.0038  0.0067  0.0090  0.0106  0.0136  0.0161  0.0176  0.0193
      U         0.19    0.25    0.43    1.19    0.60    0.56    0.81    1.55
      MAE     0.0030  0.0059  0.0079  0.0091  0.0123  0.0148  0.0166  0.0183
MB    RMSE    0.0164  0.0218  0.0247  0.0309  0.0393  0.0477  0.0490  0.0470
      U         0.74    0.62    0.54    0.55    0.57    0.59    0.55    0.52
      MAE     0.0128  0.0179  0.0191  0.0259  0.0349  0.0434  0.0422  0.0386
PGDP  RMSE    0.0098  0.0133  0.0155  0.0166  0.0190  0.0188  0.0171  0.0142
      U         0.49    0.46    0.45    0.40    0.35    0.30    0.24    0.17
      MAE     0.0076  0.0107  0.0140  0.0143  0.0153  0.0160  0.0141  0.0112
R     RMSE    0.3116  0.5837  0.8143   1.028   1.241   1.336   1.413   1.562
      U         0.83    0.86    0.87    0.87    0.90    0.91    0.91    0.92
      MAE     0.2596  0.4404  0.6741  0.8197  0.9862   1.072   1.143   1.310
TT    RMSE    0.0182  0.0186  0.0235  0.0215  0.0237  0.0191  0.0175  0.0179
      U         0.56    0.78    0.68    1.30    0.65    0.75    0.51    1.29
      MAE     0.0152  0.0151  0.0205  0.0162  0.0168  0.0145  0.0139  0.0143
WAGE  RMSE    0.0111  0.0120  0.0118  0.0107  0.0094  0.0147  0.0179  0.0195
      U         0.13    0.33    0.13    0.22    0.09    0.20    0.15    0.21
      MAE     0.0099  0.0094  0.0093  0.0089  0.0076  0.0101  0.0147  0.0172
XG    RMSE    0.0450  0.0617  0.0686  0.0647  0.0703  0.0741  0.0796  0.0750
      U         0.73    1.04    0.84    0.77    0.68    0.59    0.57    0.48
      MAE     0.0325  0.0521  0.0589  0.0509  0.0553  0.0643  0.0730  0.0654

For each variable, rows 1 to 3 show the RMSE, Theil's U-statistic, and the MAE, respectively. V is the variance-covariance matrix of the one-step ahead forecast errors.

Going through the list of variables one by one, it becomes apparent that some time series are especially difficult to forecast. This is particularly true for LE, MB, R, and TT. For these variables Theil's U-statistic indicates that a "no-change" forecast is nearly as good. For LE the reason seems to lie in a productivity shift which impaired the Austrian economy in the early 1980s. For MB several outliers during this period seem to have contributed to the poor performance. For the other two time series, R and TT, this result is to be expected, since for financial variables such as interest rates and exchange rates, which are largely responsible for the changes in TT, economic theory would predict some martingale properties. The 4- and 8-quarters ahead U-statistics of GDP, LE, and TT are markedly higher than for other horizons, a peculiarity of this statistic not paralleled by the forecast errors. For time series with strong seasonality, no-change predictions to the next year's corresponding quarter are quite good compared to other forecast horizons, so the benchmark is harder to beat at these horizons and U rises.
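
The three accuracy measures reported in the tables, and the seasonal peculiarity of the U-statistic just described, can be illustrated with a short sketch. It assumes that U is the ratio of the model's RMSE to the RMSE of the no-change forecast at the same horizon; the definition given earlier in the paper takes precedence if it differs. The function names and the artificial series are hypothetical and serve only as illustration.

```python
import numpy as np


def accuracy_measures(actual, forecast, origin):
    """Return (RMSE, U, MAE) for a set of h-step ahead forecasts.

    actual   -- realized values y[t+h]
    forecast -- model forecasts of y[t+h] made at time t
    origin   -- last observed values y[t], i.e. the no-change forecasts
    """
    actual = np.asarray(actual, dtype=float)
    err = actual - np.asarray(forecast, dtype=float)
    naive_err = actual - np.asarray(origin, dtype=float)
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    # Assumed definition: U < 1 means the model beats the no-change forecast.
    u = rmse / np.sqrt(np.mean(naive_err ** 2))
    return rmse, u, mae


def no_change_rmse(series, h):
    """RMSE of the no-change forecast y[t+h] = y[t] over the whole series."""
    series = np.asarray(series, dtype=float)
    return np.sqrt(np.mean((series[h:] - series[:-h]) ** 2))


# Artificial quarterly series: slow trend plus a fixed seasonal pattern.
t = np.arange(40)
seasonal = np.tile([0.0, 1.0, 0.0, -1.0], 10)
y = 0.01 * t + seasonal

# The benchmark error (the denominator of U) by forecast horizon.
for h in (1, 2, 4, 8):
    print(h, round(no_change_rmse(y, h), 3))
```

For such a series the benchmark error is smallest at horizons 4 and 8, so a model error of unchanged size produces a larger U at exactly these horizons.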

6. Summary and conclusions

One immediate implication of this investigation is that the modeling strategy which uses Akaike's information criterion is clearly superior to the Bayesian approach in terms of one-step ahead forecast errors. This result is, however, reversed when longer forecast horizons are considered. As explained in Section 5, this can be the consequence of the particular sample period chosen for identifying the RVAR models, or of the differences in data transformation.

A further comment concerns the modeling of seasonality. The inclusion of seasonal dummies in each equation seems to be a questionable method. Although the forecasting performance can be improved considerably by modeling the seasonal factors more carefully, there is still room for discretion, since other simple methods are considered to be deficient as well.

This paper has demonstrated that refined vector autoregressive modeling techniques improve forecasting performance in both approaches. The results therefore seem to point towards even more sophisticated methods. In the context of the RVAR modeling technique this could be achieved by applying Akaike's information criterion on a system-wide basis. However, since this procedure gives equal weight to all system variables, it might tend to inflate the number of parameters within "bad" equations (like, e.g., R here) at the cost of "good" ones. In the context of the Bayesian technique the search for more informative priors, perhaps by incorporating more economic theory, seems to be a rewarding task.
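
A system-wide criterion of the kind mentioned above could look as follows. The sketch uses the standard multivariate form AIC(p) = log |S(p)| + 2m/T, where S(p) is the residual covariance matrix of the fitted system, m the total number of estimated coefficients, and T the number of observations. It only selects the common lag order of an unrestricted VAR fitted by OLS; the subset selection with F- and t-tests used for the RVAR models is not reproduced. The names `var_system_aic` and `select_lag` are hypothetical.

```python
import numpy as np


def var_system_aic(y, p, p_max):
    """System AIC of an unrestricted VAR(p) with intercept, fitted by OLS.

    y     -- array of shape (n_obs, k), one column per variable
    p     -- lag order to evaluate (0 <= p <= p_max)
    p_max -- largest order considered; the first p_max observations are
             used as pre-sample values so that all orders share one sample
    """
    y = np.asarray(y, dtype=float)
    n_obs, k = y.shape
    t_eff = n_obs - p_max
    target = y[p_max:]
    # Column block j holds the values lagged by j periods.
    lags = [y[p_max - j:n_obs - j] for j in range(1, p + 1)]
    x = np.hstack([np.ones((t_eff, 1))] + lags)
    coef, *_ = np.linalg.lstsq(x, target, rcond=None)
    resid = target - x @ coef
    sigma = resid.T @ resid / t_eff
    _, logdet = np.linalg.slogdet(sigma)
    n_params = k * (k * p + 1)  # all coefficients in all equations
    return logdet + 2.0 * n_params / t_eff


def select_lag(y, p_max):
    """Return the AIC-minimizing lag order and the AIC value of each order."""
    aics = {p: var_system_aic(y, p, p_max) for p in range(1, p_max + 1)}
    return min(aics, key=aics.get), aics
```

Because the determinant treats all equations symmetrically, a poorly fitting equation can dominate the reduction in log |S(p)|, which is the concern raised in the text.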

It is to be expected that applying techniques along the lines sketched above will result in increasing predictive accuracy, but with diminishing returns. Further research in this field will have to base the selection of time series to be forecast on an economic rationale and reconsider whether the actual choice is able to represent the Austrian economy.

7. References

Aiginger, K., "Mean, Variance and Skewness of Reported Expectations and Their Differences to the Respective Moments of Realizations", Empirica, 1979, 6(2), pp. 217-265.

Akaike, H., "A New Look at Statistical Model Identification", IEEE Transactions on Automatic Control, 1974, AC-19, pp. 716-723.

Chong, Y. Y., Hendry, D. F., "Econometric Evaluation of Linear Macro-Economic Models", The Review of Economic Studies, 1986, 53(175), pp. 671-690.

Doan, T., Litterman, R., Sims, C., "Forecasting and Conditional Projection Using Realistic Prior Distributions", Econometric Reviews, 1984, 3, pp. 1-100.

Engle, R. F., Granger, C. W. J., "Co-Integration and Error-Correction: Representation, Estimation and Testing", Econometrica, 1987 (forthcoming).

Fair, R., "An Analysis of the Accuracy of Four Macroeconometric Models", Journal of Political Economy, 1979, 87, pp. 701-718.

Fair, R., "Evaluating the Predictive Accuracy of Models", in Griliches, Z., Intriligator, M. (Eds.), Handbook of Econometrics, Vol. 3, North-Holland, Amsterdam, 1986, pp. 1979-1995.

Fildes, R., "Quantitative Forecasting -- The State of the Art: Econometric Models", Journal of the Operational Research Society, 1985, 36(7), pp. 549-580.

Granger, C. W. J., "Investigating Causal Relations by Econometric Models and Cross-Spectral Methods", Econometrica, 1969, 37, pp. 424-438.

Haggan, V., Oyetunji, O. B., "On the Selection of Subset Autoregressive Time Series Models", Journal of Time Series Analysis, 1984, 5(2), pp. 103-113.

Jäger, A., "A Note on the Informational Efficiency of Austrian Economic Forecasts", Empirica, 1985, 12(2), pp. 247-260.

King, R., Plosser, C., Stock, J., Watson, M., Stochastic Trends and Economic Fluctuations, 1987 (mimeo).

Kling, J. L., Bessler, D. A., "A Comparison of Multivariate Procedures for Economic Time Series", International Journal of Forecasting, 1985, 1, pp. 5-24.

Kunst, R. M., Neusser, K., "A Forecasting Comparison of some VAR Techniques", International Journal of Forecasting, 1986, 2, pp. 447-456.

Litterman, R. B., "A Bayesian Procedure for Forecasting with Vector Autoregressions", Massachusetts Institute of Technology, Working Paper, 1980.

Lucas, R., Sargent, T., "After Keynesian Macroeconomics", in Lucas, R. E., Sargent, T. J. (Eds.), Rational Expectations and Econometric Practice, Vol. 1, University of Minnesota Press, Minnesota, 1981, pp. 295-319.

McNees, S. K., "The Methodology of Macroeconometric Model Comparisons", in Kmenta, J., Ramsey, J. B. (Eds.), Large-Scale Macroeconometric Models, North-Holland, Amsterdam, 1981, pp. 397-422.

Phillips, P. C. B., Durlauf, S. N., "Multiple Time Series Regression with Integrated Processes", Review of Economic Studies, 1986, 53, pp. 473-495.

Pierce, D. A., Haugh, L. D., "Causality in Temporal Systems", Journal of Econometrics, 1977, 5, pp. 265-293.

Sims, C. A., "Macroeconomics and Reality", Econometrica, 1980, 48, pp. 1-48.

Stock, J. H., Watson, M. W., "Testing for Common Trends", Harvard Institute for Economic Research, Discussion Paper, 1986, (1222).

Theil, H., Applied Economic Forecasting, North-Holland, Amsterdam, 1966.

8. Notes

(1) The problem of common trends in a vector autoregressive model has been investigated recently by Stock -- Watson (1986) and King et al. (1987). See also the closely related literature on co-integration (Engle -- Granger, 1987, or Phillips -- Durlauf, 1986).

(2) This fact is well documented in the paper by Fair (1979). Also compare the relatively poor performance of unrestricted models in Kunst -- Neusser (1986).

(3) A version treating the seasonal dummies like the remaining regressors, possibly dropping them in order to minimize AIC, was also tried. The resulting model "RVAR-3" showed no marked difference in behaviour from RVAR-2 and is therefore not reported here.

(4) A similar search in the case of the "Minnesota prior" was not successful, since the optimal values would converge to an increasingly tight random walk with no interactions. It must also be emphasized that the specification of the informative prior could probably be improved, since time constraints did not permit a more exhaustive search (see also the discussion in Doan -- Litterman -- Sims, 1984).

(5) As shown by Chong -- Hendry (1986), nothing can be gained from multi-step analysis as long as the parameters are the same as in the one-step case. All reported differences in performance are due to the fact that for shorter horizons more observations are available.

(6) Comparisons to other simple extrapolative methods would be possible, but are still not standard in the literature.

Correspondence:

Robert Kunst

Institut für Höhere Studien

Stumpergasse 56

A-1060 Wien

Klaus Neusser

Institut für Wirtschaftswissenschaften, Universität Wien

Liechtensteinstraße 13

A-1090 Wien
