forecasting the levels of vector autoregressive log-transformed time series

International Journal of Forecasting 16 (2000) 111–116www.elsevier.com/ locate / ijforecast

Notes

The Notes section of the International Journal of Forecasting contains commentary on the theoryand practice of forecasting in the form of communications to the journal such as research notes,teaching tips, practitioners’ and consultants’ views, and other contributions, especially those thatattempt to bridge the gap between new developments in the methodology of forecasting and theirpractical application. Contributions to this section of the journal can be submitted to any of the foureditors.

Forecasting the levels of vector autoregressive log-transformed timeseries

a , b*˜Miguel A. Arino , Philip Hans FransesaIESE, Universidad de Navarra, Avda. Pearson 21, 08034 Barcelona, Spain

bEconometric Institute, Erasmus University Rotterdam, P.O. Box 1738, NL-3000 DR Rotterdam, The Netherlands

Abstract

In this paper we give explicit expressions for the forecasts of levels of a vector time series when such forecasts aregenerated from (possibly cointegrated) vector autoregressions for the corresponding log-transformed time series. We alsoshow that simply taking exponentials of forecasts for logged data leads to substantially biased forecasts. We illustrate thisusing a bivariate cointegrated vector series containing US GNP and investments. 2000 International Institute ofForecasters. Published by Elsevier Science B.V. All rights reserved.

Keywords: VAR time series; Log-transformation; Forecasting

1. Introduction form the data using natural logarithms prior tothe construction of econometric models that are

In the empirical time series analysis of econ- often used for forecasting. Some of the motiva-omic variables it is common practice to trans- tions for this strategy are that this log-trans-

formation reduces the impact of outliers, thatfirst differenced log-transformed data corre-*Corresponding author. Tel.: 134-93-253-4200; fax:spond to growth rates, and that it reduces the134-93-253-4343.

˜E-mail addresses: [email protected] (M.A. Arino), often observed increasing variance of [email protected] (P.H. Franses) time series. Once a model has been constructed,

0169-2070/00/$ – see front matter 2000 International Institute of Forecasters. Published by Elsevier Science B.V. All rights reserved.PI I : S0169-2070( 99 )00025-4

˜112 M.A. Arino, P.H. Franses / International Journal of Forecasting 16 (2000) 111 –116

and the parameters have been estimated, one M denotes the conditional expectation of Y ,t t

can make forecasts for the log-transformed data. given the information set at time t, and where ht

In some cases, however, one is interested in the is a standard white noise process. One may nowforecasts of the levels of the time series (i.e. the want to use the so-called naive forecast of X ,t1k

untransformed data) instead of (functions of the) that is, the exponential of the forecast of Y :t1k

log-transformed data. In that case, as is well ˆ ˆ*X 5 exp(M ).t1k t1kknown from the results in Granger and NewboldHowever, since the seminal work in Granger(1976) for univariate time series, simply takingand Newbold (1976), we know that this forecastexponentials of the forecasts of the logged datais not the expected value at time t of X whichyields biased forecasts. For the class of the t1k

ˆwould be the unbiased forecast X of X . Inunivariate autoregressive [AR] model, Granger t1k t1k

fact, the latter equalsand Newbold (1976) derive expressions forunbiased forecasts of the levels. In the present X̂ 5 E [exp(M 1h )]t1k t t1k t1kpaper we extend their results to the practically

where E is the expectation operator at time t.very relevant class of vector autoregressive t

The naive forecast is seen to be biased since the[VAR] time series models. VAR models areexpected value of the exponential of the whiteoften used in empirical economics to generatenoise process is unequal to zero.out-of-sample forecasts since their parameters

In this section we first develop expressionsare easy to estimate, and especially since suchfor the unbiased k-step ahead forecast of anmodels provide a simple framework for them-dimensional time series, of which the log-analysis of cointegration, see for exampletransformed series follows a VAR( p) model andJohansen (1995). In Section 2 we give explicitwe show how the naive forecasts are to beexpressions for the out-of-sample forecasts ofcorrected to obtain unbiased forecasts. Next, asthe levels of m time series when these series (inan example, we give the expressions for m 5 2log-transformed format) are modeled by a VARand p 5 1 for illustrative purposes.model of order p. In Section 3, we give an

empirical example concerning a bivariate US2.1. Forecasting an m-dimensional level timeseries containing GNP and investments, whereserieswe take into account that the log-transformed

series are cointegrated. We conclude our paperLet X(t) be an m-dimensional vector timein Section 4 with some remarks.

series X9(t) 5 (X (t), ? ? ? ,X (t)) such that Y(t)1 m

with Y9(t) 5 (Y (t), ? ? ? ,Y (t)) with Y (t) 5 log1 m j

X (t), follows the VAR( p) modelj2. Forecasting levelsp

In this section we present explicit expressions Y(t 1 1) 5 B 1OB Y(t 2 r 1 1) 1h(t 1 1)0 rr51for the forecasts of the levels of a time series,

m9when the log-transformed vector time series where B 5 (b , ? ? ? ,b ) and B 5 (b ) are0 1 m r ijr i, j51

follows a vector autoregressive model. To moti- an m-dimensional vector and matrix with con-vate our paper, consider the univariate time stant parameters, and h9(t) 5 (h (t), ? ? ? ,h (t))1 m

series X , for which one analyses Y with the is a vector of m normally identically andt t

latter being the series in logs, that is, Y 5 log independently distributed random variables witht

X , where log denotes the natural logarithmic mean zero and covariance matrix V. The as-t

transformation. Suppose that the log-transform- sumption of normality is crucial for our resultsed series can be modelled as Y 5 M 1h where below. Deviations from this assumption wouldt t t

˜M.A. Arino, P.H. Franses / International Journal of Forecasting 16 (2000) 111 –116 113

lead to different results, but this extension is not wherem

pursued here. Also, our results below depend on c (2) 5 b 1Ob b0i i ij1 jthe assumption that the VAR provides unbiased j51pforecasts of the log-transformed data. This is

c (2) 5Ob c (1)however not necessarily the case for an esti- ijl ir1 rjlr51mated VAR, see Dufour (1985). andpBefore we state our main results in the above

d (2) 5Ob d (1)matrix notation, we first address matters while ij ir1 rjr51using the scalar notation. This is done mainly

In order to simplify notation, let us call C (k)for expository purposes. For each variable 1 # 0mthe vector column C (k)9 5 (c (k)) , andi # m, we have that 0 0i i51m

m m C (k) and D(k) the matrix (c (k)) andl ijl i, j51mY (t 1 1) 5 b 1Ob Y (t) 1Ob Y (t 2 1) (d (k)) , respectively.i i ij1 j ij2 j ij i, j51

j51 j51 Calculating in a similar way Y (t 1 3), ? ? ? ,im

Y (t 1 k), we arrive at the expressioni1 ? ? ? 1Ob Y (t 2 p 1 1)ijp jpj51 m

Y (t 1 k) 5 c (k) 1OOc (k)Y (t 2 r 1 1)1h (t 1 1) (1) i 0i ijr jir51j51

m k m

5c (1) 1Oc (1)Y (t) 1OOd (k 2 r 1 1)h (t 1 r)0i ij1 j ij jj51 r51j51

m where1 ? ? ? 1Oc (1)Y (t 2 p 1 1) pijp j

j51m C (k) 5 B 1OB C (k 2 r)0 0 r 0r511Od (1)h (t 1 1).ij j pj51

C (k) 5OB C (k 2 r)l r lwhere r51p

c (1) 5 b0i i D(k) 5OB D(k 2 r)rc (1) 5 b r51ijl ijl

with the initial conditionsandC ( j) 5 0 for j 5 0, 2 1, ? ? ? , 2 p 1 11, for j 5 i; 0

d (1) 5Hij 0, otherwise.C ( j)l

Substituting Y (t 1 1) for its value accordingi I , if j 5 1 2 l; for 1 # l # p, andmto (1) we similarly obtain for Y (t 1 2): 5i H0, otherwise. j 5 0, 2 1, ? ? ? , 2 p 1 1m

Y (t 1 2) 5 c (2) 1Oc (2)Y (t) 1 ? ? ? D( j) 5 0 for j 5 0, 2 1, ? ? ? , 2 p 1 1.i 0i ij1 jj51

m

The above expressions can be used to obtain1Oc (2)Y (t 2 p 1 1)ijp j forecasts of the untransformed level time series.j51

m The naive forecast of X (t 1 k) isi1Od (2)h (t 1 1)ij j X̂ (t 1 k)* 5 exp[E (Y (t 1 k))]j51 i t i

m p mc (k)ijr1Od (1)h (t 1 2) 5PPX (t 2 r 1 1) exp(c (k)).ij j j 0i

j51 r51j51


This expression gives us the exponential of series (Y (t),Y (t))9 with Y (t) 5 log X (t) is1 2 i i

the k-step ahead forecast of the log-transformedD Y (t) 5 2 0.075 10.430D Y (t 2 1)1 1 1 1series Y (t 1 k) according to the specified (0.024) (0.089)i

VAR( p) model. However, under the normality 10.257D Y (t 2 2) 20.043Z(t 2 1)1 1(0.095) (0.013)assumption the unbiased forecast of X (t 1 k) isi

1h (t)1

X̂ (t 1 k) 5 E[exp(Y (t 1 k))]i i

D Y (t) 5 20.593 11.842D Y (t 2 1)1 2 1 1ˆ (0.117) (0.436)5 X (t 1 k)*exp(e (k) /2)i i

10.216D Y (t 2 2) 20.325Z(t 2 1)1 2(0.090) (0.064)where

1h (t)2

e (k) 5 e (k 2 1)i i where standard errors are given in parentheses.The cointegrating relationship is1 (d (k), ? ? ? ,d (k))V(d (k),i1 im i1

? ? ? ,d (k))9 Z(t) 5 Y (t) 2 Y (t)im 1 2

and the estimated covariance matrixwith the initial condition e (0) 5 0. Notice thatih 0.0001041 0.0004112in practice one needs to estimate V and all the 1cov 5 .S D S Dh 0.0004112 0.0025960other parameters. 2

This bivariate error correction model can beexpressed as a VAR(3) model with parameterrestrictions as follows:3. An application

Y (t) 2 0.0751In this section we apply the expressions 5S D S DY (t) 2 0.5932obtained in the previous section to a two-dimen-1.473 2 0.043 Y (t 2 1)sional time series (X (t), X (t))9. X is the real 11 2 1 1S D S DGNP of the US and X is the real gross 2.167 0.675 Y (t 2 1)2 2

domestic investment series. The data are given 2 0.173 0 Y (t 2 2)1in Pindyck and Rubinfeld (1991, chapter 12). 1S D S D2 1.842 0.216 Y (t 2 2)2Quarterly observations are available from the2 0.257 0 Y (t 2 3)first quarter of 1947 until the first quarter of 1

1S D S D0 2 0.216 Y (t 2 3)1988. We will use observations until the fourth 2

quarter of 1980 to estimate our VAR model for h (t)11 .the log-transformed data, and we leave the S Dh (t)2remaining 29 data points to evaluate our naive

A summary of the errors obtained for theand unbiased forecasts. We find that a VARnaive and the appropriate unbiased forecasts ofmodel of order 3 fits the data well. Since theX and X is presented in Table 1. When welogged series appear to be cointegrated accord- 1 2

compare the naive forecasts with the unbiaseding to several of the currently available tests, theforecasts for the restricted VAR(3) model, wecointegrating relationship between both series israpidly notice the better performance of theimposed in the VAR, that is, we obtain a VARunbiased forecasts with respect to the naivemodel with (nonlinear) parameter restrictions.forecasts except for the mean error for X . TheThe estimated model for the log-transformed 1

˜M.A. Arino, P.H. Franses / International Journal of Forecasting 16 (2000) 111 –116 115

Table 1 Table 2Evaluation of 29 quarters out-of-sample forecasting per- Evaluation of 20 quarters out-of-sample forecasting per-formance for naive and unbiased forecasts for levels from formance from 10 to 29 periods ahead for naive and

aa cointegrated VAR(3) model for log-transformed series unbiased forecasts for levels from a cointegrated VAR(3)model for log-transformed seriesLog-model X Log-model X1 2

Log-model X Log-model X1 2Naive Unbiased Naive Unbiased

Naive Unbiased Naive UnbiasedME 29.720 215.1593 4.906 0.557aMAE 93.474 89.587 55.652 53.659 ME 84.328 74.508 42.928 37.442

MAPE 2.72% 2.62% 10.20% 9.96% MAE 75.529 69.099 47.385 43.633MSE 12 248 11 873 4475 4395 MAPE 2.04% 1.87% 7.37% 6.82%RMSE 110.7 109.0 66.9 66.3 MSE 2511 1979 833 679

a RMSE 50.1 44.5 28.9 26.0The forecast errors are defined as the true value minusathe forecasted value. Forecast evaluation criteria are mean These statistics are defined in Table 1.

error, ME, mean absolute error, MAE, mean averagepercentage error MAPE and (root) mean squared error,(R)MSE.

4. Concluding remarks

In this paper we presented explicit expres-mean absolute error, mean percentage absolutesions for forecasts for the levels of a vector timeerror, and mean squared error are smaller for theseries when a VAR model was used for theunbiased forecast in both series, but it alsolog-transformed data. We showed that exponen-appears for example that for GNP, the unbiasedtials of the forecasts for logged data are biased,forecasts outperform the naive forecasts by 18as could also be observed from our empiricaltimes to 11. Using a binomial distribution offorecasts from a bivariate cointegrated vectorparameters n 5 29 and p 5 0.5 this is significantautoregressive time series model containing USat the 7% level. For the investments series theGNP and investment. Our results can be practi-score is 20 to 9 which is significant even at thecally relevant in case one aims to forecast the2% level.levels of a vector time series. Because multi-In addition, and as expected as the forecaststep forecasts are linked with impulse-responseerror variance for the log-transformed datafunctions, see Lutkepohl (1991), an extension ofincreases with the forecast horizon, the out-our results to these functions seems also rel-performance of the unbiased forecasts is obvi-evant. Finally, our expressions can be useful toous especially for the long-term. For the GNPproperly evaluate forecasts from VAR modelsseries all of the last 17 forecasts are better forfor logged data versus such models for untrans-the unbiased than for the naive forecasts. Forformed data. In fact, it may sometimes bethe investment series we find that 16 of the lastunclear from the outset whether taking logs17 forecasts are better using the unbiased meth-amounts to the best empirical strategy.od. This means that the farther the time horizon

the better is the performance of the unbiasedforecast with respect to the naive.

For illustration, in Table 2 we report the same Acknowledgementsstatistics as in Table 1 but using only theforecasts from 10 to 29 periods ahead (20 We thank the editor and two anonymousforecasts). Hence, for the longer horizons one referees for their helpful comments. Miguel A.

˜clearly needs the use of the unbiased forecasts. Arino has received partial financial support


Pindyck, R. S., & Rubinfeld, D. L. (1991). Econometricfrom CIIF, Centro Internacional de Inves-models and economic forecasts. In: McGraw-Hill Inter-´tigacion Financiera (International Center fornational Editions, Economic Series, McGraw-Hill.Financial Research).

˜Biographies: Miguel A. ARINO is Professor of Statisticsand Decision Sciences at IESE, Universidad de Navarra.

References He has a Ph.D. in Mathematics from Universidad deBarcelona. He has publications in several mathematical

Dufour, J. -M. (1985). Unbiasedness of predictions from and economic journals. His current research interests are inestimated vector autoregressions. Econometric Theory 1, the areas of time series analysis and econometrics.387–402.

Granger, C. W. J., & Newbold, P. (1976). Forecasting Philip Hans FRANSES is Professor of Applied Econo-transformed series. Journal of The Royal Statistical metrics at the Econometric Institute of the ErasmusSociety B 38, 189–203. University Rotterdam. One of his research interests is

Johansen, S. (1995). Likelihood-based Inference in Co- modelling and forecasting time series. On this topic he hasintegrated Vector Autoregressive Models, Oxford Uni- published in various journals and books.versity Press, Oxford.

Lutkepohl, H. (1991). Introduction To Multiple TimeSeries Analysis, Springer Verlag, Berlin.

forecasting the levels of vector autoregressive log-transformed time series

Documents