
Journal of Forecasting, J. Forecast. (2011). Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/for.1255

Space-Time Model versus VAR Model: Forecasting Electricity Demand in Japan

YOSHIHIRO OHTSUKA 1* AND KAZUHIKO KAKAMU 2

1 Graduate School of Economics, Hitotsubashi University and Tachibana Securities Co., Ltd, Tokyo, Japan
2 Faculty of Law and Economics, Chiba University, Japan

ABSTRACT

This paper examined the forecasting performance of disaggregated data with spatial dependency and applied it to forecasting electricity demand in Japan. We compared the performance of the spatial autoregressive ARMA (SAR-ARMA) model with that of the vector autoregressive (VAR) model from a Bayesian perspective. With regard to the log marginal likelihood and log predictive density, the VAR(1) model performed better than the SAR-ARMA(1, 1) model. In the case of electricity demand in Japan, we can conclude that the VAR model with contemporaneous aggregation had better forecasting performance than the SAR-ARMA model. Copyright © 2011 John Wiley & Sons, Ltd.

KEY WORDS Markov chain Monte Carlo (MCMC); spatial autoregressive ARMA model; vector autoregressive model; predictive function

INTRODUCTION

Following Zellner and Tobias (2000), we know that improved forecasting results can be obtained by disaggregating economic variables. Furthermore, following Giacomini and Granger (2004), we know that if a dependent variable is the aggregate of regional dependent variables, then the spatial-temporal approach is beneficial for forecasting. Ohtsuka et al. (2010) forecast electricity demand in Japan using the spatial-temporal approach, and the purpose of this paper is to extend their work.

The VAR model allows the variables to be dependent in both the time and cross-sectional directions and is widely used in macroeconomics (e.g. Ang et al., 2006; Panagiotelis and Smith, 2008; Smets and Wouters, 2005). However, as the cross-sectional dimension increases, forecasting with the VAR model quickly becomes infeasible, because the so-called curse of dimensionality makes it difficult to estimate the model accurately (see Giacomini and Granger, 2004). No study has attempted a comparison between the contemporaneous aggregation model and the spatially dependent aggregation model. In this paper, we compare the forecasting performance of the spatial and VAR models in the setting of Zellner and Tobias (2000) and Giacomini and Granger (2004).

* Correspondence to: Yoshihiro Ohtsuka, Graduate School of Economics, Hitotsubashi University and Tachibana Securities Co., Ltd, Tokyo, Japan. E-mail: [email protected]

Copyright © 2011 John Wiley & Sons, Ltd.


The space–time or spatial model, which considers the spatial dependency among regions, has been widely used in various research fields. Recently, considerable research has been conducted in this regard in econometrics (e.g. Anselin, 2003). Moreover, the spatial model has been applied to macroeconomics in areas such as unemployment, electricity demand, and business cycles (see Schanne et al., 2010; Ohtsuka and Kakamu, 2009a; Ohtsuka et al., 2010; Kakamu et al., 2010). Compared with the VAR model, the space–time model requires only a small number of parameters; that is, forecasting with the space–time model offers a solution to the curse of dimensionality. On the other hand, the space–time model has two disadvantages: the spatial correlation and the constant term are weakly identified (see Ohtsuka and Kakamu, 2009b), and the weight matrix, which defines the spatial relationship among the regions, must be specified in advance. In this paper, we examine the spatial autoregressive ARMA (SAR-ARMA) model proposed by Ohtsuka et al. (2010).

We estimate the models using Bayesian techniques for the following reason. Economic analyses widely use areal data such as state-level data, for which the sample size is finite. Maximum likelihood methods rely on asymptotic properties, whereas the Bayesian method does not, because it evaluates the posterior distributions of the parameters conditional on the data.

In our empirical analysis, we analyze electricity demand in Japan, because in Japan electricity demand has been adopted as a coincident index in the business conditions index. Moreover, the forecasting of electricity demand has become an important issue not only in electrical engineering but also in social research. Electricity is the main source of energy for social and economic activities, and a shortage of electric power results in severe losses. Many models and approaches have been used to forecast electricity demand (e.g. Ohtsuka and Kakamu, 2009a; Pappas et al., 2008; Ramanathan et al., 1997; Cottet and Smith, 2003). We examine electricity demand in Japan using a dataset from January 1992 to January 2003, which is also used in Ohtsuka et al. (2010). We evaluate the models using the marginal likelihood and the log score function. In out-of-sample forecasting, we examine the performance of the models on the data from February 2003 to January 2004. From the empirical results, we found that the log marginal likelihood of the VAR(1) model was higher than that of the SAR-ARMA(1, 1) model. Moreover, the log predictive density of the VAR model was higher than that of the SAR-ARMA(1, 1) model. We found that, in the case of forecasting electricity demand in Japan, the VAR model performed better than the SAR-ARMA model.

The rest of this paper is organized as follows. The next section introduces the SAR-ARMA model and the VAR model. The third section discusses the computational strategy of the MCMC method. The fourth section presents the empirical results of the application of the SAR-ARMA and VAR models to forecasting electricity demand in Japan. Finally, the fifth section summarizes the results and provides concluding remarks.

MODEL

SAR-ARMA model

Let $y_{it}$ be the observation for the $i$th unit ($i = 1, \ldots, n$) in the $t$th period ($t = 1, \ldots, T$). Moreover, consider an $n \times n$ matrix $C$ of contiguity dummies, with $c_{ij} = 1$ if areas $i$ and $j$ are adjacent and $c_{ij} = 0$ otherwise (with $c_{ii} = 0$), as in Stakhovych and Bijmolt (2009). The elements are then row-standardized, giving the scaled matrix $W$:

$$W = [w_{ij}] = \left[ c_{ij} \Big/ \sum_{j=1}^{n} c_{ij} \right]$$


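We define $W = \{w_{ij}\}$, where $w_{ij}$ denotes the spatial weight on the $j$th unit with respect to the $i$th unit.

In the model, it is assumed that an individual factor specific to each unit affects the dependent variable. Therefore, we define $\beta_i$ ($i = 1, \ldots, n$) as a constant term. Then, the SAR model conditioned on the parameters $\beta_i$ and $\rho$ is written as follows:

$$y_{it} = \beta_i + \rho \sum_{j=1}^{n} w_{ij} y_{jt} + u_{it}, \qquad |\rho| < 1 \tag{1}$$

Suppose that $u_{it}$ follows an ARMA($p$, $q$) process:

$$u_{it} = \sum_{j=1}^{p} \phi_{ij} u_{i,t-j} + \varepsilon_{it} + \sum_{j=1}^{q} \theta_{ij} \varepsilon_{i,t-j}, \qquad \varepsilon_{it} \sim N(0, \sigma_i^2) \tag{2}$$

which is expressed in terms of polynomials in the lag operator $L$ as

$$\phi_i(L) u_{it} = \theta_i(L) \varepsilon_{it} \tag{3}$$

where $\phi_i(L) = 1 - \phi_{i1} L - \cdots - \phi_{ip} L^p$ and $\theta_i(L) = 1 + \theta_{i1} L + \cdots + \theta_{iq} L^q$.

Then, the conditional likelihood function of models (1) and (2) is given as follows:

$$L(y \mid \beta, \Sigma, \Phi_p, \Theta_q, \rho) = \prod_{t=1}^{T} f(y_t \mid \beta, \Sigma, \Phi_p, \Theta_q, \rho) \tag{4}$$

where $y = (y_1', \ldots, y_T')'$, $y_t = (y_{1t}, \ldots, y_{nt})'$, $\beta = (\beta_1, \ldots, \beta_n)'$, $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, $\Phi_p = (\phi_1, \ldots, \phi_n)$, $\phi_i = (\phi_{i1}, \ldots, \phi_{ip})'$, $\Theta_q = (\theta_1, \ldots, \theta_n)$, $\theta_i = (\theta_{i1}, \ldots, \theta_{iq})'$, and

$$f(y_t \mid \beta, \rho, \Phi_p, \Theta_q, \Sigma) = (2\pi)^{-n/2} |\Sigma|^{-1/2} |I_n - \rho W| \exp\left( -\frac{e_t' \Sigma^{-1} e_t}{2} \right) \tag{5}$$

where $I_n$ is an $n \times n$ identity matrix and

$$e_t = y_t - \beta - \rho W y_t - \sum_{j=1}^{p} \phi_j u_{t-j} - \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$$

with $u_t = (u_{1t}, \ldots, u_{nt})'$, $\varepsilon_t = (\varepsilon_{1t}, \ldots, \varepsilon_{nt})'$, $\phi_j = \mathrm{diag}(\phi_{1j}, \ldots, \phi_{nj})$, and $\theta_j = \mathrm{diag}(\theta_{1j}, \ldots, \theta_{nj})$. We assume that the pre-sample values $u_{p-1}, \ldots, u_{-p+1}$ and $\varepsilon_{q-1}, \ldots, \varepsilon_{-q+1}$ are equal to zero (see Tsurumi and Radchenko, 2005).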
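As a rough illustration of equations (1) to (5), the following Python sketch evaluates the conditional log-likelihood for the special case $p = q = 1$ with pre-sample values set to zero. The function name `sar_arma11_loglik`, its argument layout, and the three-region toy weight matrix are assumptions made for this example; they are not taken from the paper, whose computations were carried out in Ox.

```python
import numpy as np

def sar_arma11_loglik(y, W, beta, rho, phi, theta, sigma2):
    """Conditional log-likelihood of a SAR-ARMA(1, 1) model (sketch).

    y      : (T, n) array of observations
    W      : (n, n) row-standardized spatial weight matrix
    beta, phi, theta, sigma2 : length-n arrays of unit-specific parameters
    rho    : scalar spatial autoregressive coefficient
    """
    T, n = y.shape
    # log of the Jacobian term |I_n - rho W| that appears in equation (5)
    _, logdet = np.linalg.slogdet(np.eye(n) - rho * W)
    u_prev = np.zeros(n)    # pre-sample u_0 = 0
    eps_prev = np.zeros(n)  # pre-sample epsilon_0 = 0
    loglik = 0.0
    for t in range(T):
        u = y[t] - beta - rho * (W @ y[t])            # u_t from equation (1)
        eps = u - phi * u_prev - theta * eps_prev     # e_t from equation (2)
        loglik += (-0.5 * n * np.log(2.0 * np.pi)
                   - 0.5 * np.sum(np.log(sigma2))
                   + logdet
                   - 0.5 * np.sum(eps ** 2 / sigma2))
        u_prev, eps_prev = u, eps
    return loglik

# Toy example with three regions arranged in a line (hypothetical data).
W_toy = np.array([[0.0, 1.0, 0.0],
                  [0.5, 0.0, 0.5],
                  [0.0, 1.0, 0.0]])
y_toy = np.zeros((4, 3))
ll_toy = sar_arma11_loglik(y_toy, W_toy, beta=np.zeros(3), rho=0.5,
                           phi=np.zeros(3), theta=np.zeros(3),
                           sigma2=np.ones(3))
```

Working with the log-determinant rather than the determinant itself is a design choice of this sketch that keeps the computation stable when $T$ is large.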

VAR model

In this subsection, we introduce the VAR($p$) model. Only the VAR(1) model is examined in the empirical analysis, where the order 1 is selected by model comparison; nevertheless, we define the model in its general form.

Let $y_t$ be an $n \times 1$ vector time series process. The VAR model has the form

$$y_t = \mu + \sum_{i=1}^{p} \Phi_i y_{t-i} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Sigma) \tag{6}$$


where $y_t = (y_{1t}, \ldots, y_{nt})'$, $\varepsilon_t = (\varepsilon_{1t}, \ldots, \varepsilon_{nt})'$, and $t = 1, \ldots, T$. $\mu$ is an $n \times 1$ unknown vector of constant terms, $\Phi_i$ for $i = 1, \ldots, p$ is an unknown $n \times n$ matrix, and $\Sigma$ is an $n \times n$ positive definite error covariance matrix. We assume that $y_t = 0$ for $t \le 0$.

The VAR model is stationary if all values of $z$ satisfying

$$|I_n - \Phi_1 z - \Phi_2 z^2 - \cdots - \Phi_p z^p| = 0$$

lie outside the unit circle, where $I_n$ is an $n \times n$ identity matrix. Define $x_t = (1, y_{t-1}', \ldots, y_{t-p}')$, and rewrite (6) as

$$Y = X\Phi + \varepsilon \tag{7}$$

where $Y = (y_1'; \ldots; y_T')$, $X = (x_1'; \ldots; x_T')$, $\Phi = (\mu, \Phi_1, \ldots, \Phi_p)'$, and $\varepsilon = (\varepsilon_1'; \ldots; \varepsilon_T')$. The likelihood function of model (7) is then

$$L(Y \mid \Phi, \Sigma) = (2\pi)^{-nT/2} |\Sigma|^{-T/2} \exp\left\{ -\frac{1}{2} \mathrm{trace}(E'E\Sigma^{-1}) \right\} \tag{8}$$

where $E = Y - X\Phi$. We assume that the pre-sample values $y_{p-1}, \ldots, y_{-p+1}$ are equal to zero.
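For the VAR(1) case used in the empirical analysis, the stationarity condition and the likelihood (8) can be sketched in Python as follows. The check relies on the standard equivalence that the roots of $|I_n - \Phi_1 z| = 0$ lie outside the unit circle exactly when all eigenvalues of $\Phi_1$ lie inside it. The function names and the toy inputs are assumptions of this example, not part of the paper.

```python
import numpy as np

def is_stationary_var1(Phi1):
    """True if all eigenvalues of Phi1 lie strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(Phi1))) < 1.0)

def var1_loglik(Y, mu, Phi1, Sigma):
    """Log of likelihood (8) for a VAR(1) model, conditioning on y_0 = 0.

    Y : (T, n) array whose t-th row is y_t'
    """
    T, n = Y.shape
    lagged = np.vstack([np.zeros((1, n)), Y[:-1]])   # rows y_{t-1}', with y_0 = 0
    X = np.hstack([np.ones((T, 1)), lagged])         # rows x_t = (1, y_{t-1}')
    Phi = np.vstack([mu.reshape(1, n), Phi1.T])      # (mu, Phi_1)'
    E = Y - X @ Phi                                  # residual matrix E = Y - X Phi
    _, logdet = np.linalg.slogdet(Sigma)
    quad = np.trace(E.T @ E @ np.linalg.inv(Sigma))
    return -0.5 * n * T * np.log(2.0 * np.pi) - 0.5 * T * logdet - 0.5 * quad

# Toy inputs (hypothetical values, for illustration only).
stationary = is_stationary_var1(0.5 * np.eye(2))
explosive = is_stationary_var1(1.1 * np.eye(2))
ll_zero = var1_loglik(np.zeros((3, 2)), np.zeros(2), np.zeros((2, 2)), np.eye(2))
```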

POSTERIOR ANALYSIS

Joint posterior distribution

Posterior distribution for the SAR-ARMA model

Since we adopt a Bayesian approach, we complete the model by specifying the prior distribution over the parameters. We apply the following prior distribution:

$$\pi(\beta, \Sigma, \Phi_p, \Theta_q, \rho) = \left\{ \prod_{i=1}^{n} \pi(\beta_i)\, \pi(\sigma_i^2)\, \pi(\phi_i)\, \pi(\theta_i) \right\} \pi(\rho)$$

Given a prior density $\pi(\beta, \Sigma, \Phi_p, \Theta_q, \rho)$ and the likelihood function stated in (4), the joint posterior distribution can be expressed as

$$\pi(\beta, \Sigma, \Phi_p, \Theta_q, \rho \mid y) \propto \pi(\beta, \Sigma, \Phi_p, \Theta_q, \rho)\, L(y \mid \beta, \Sigma, \Phi_p, \Theta_q, \rho) \tag{9}$$

Finally, we assume the following proper prior distributions:

$$\beta_i \sim N(\beta_0, \tau_0^2), \qquad \sigma_i^2 \sim IG(\nu_0/2, \lambda_0/2),$$

$$\phi_i \sim N(\mu_\phi, \Sigma_\phi^{-1}) I_{S_\phi}, \qquad \theta_i \sim N(\mu_\theta, \Sigma_\theta^{-1}) I_{S_\theta}, \qquad \text{and} \qquad \rho \sim U(1/\lambda_{\min}, 1/\lambda_{\max})$$

where $IG(a, b)$ denotes an inverse gamma distribution with shape and scale parameters $a$ and $b$, respectively, $I_A$ is the indicator function of the set $A$, $S_\phi$ is the set of $\phi_i$ that satisfy the stationarity condition, and $S_\theta$ is the set of $\theta_i$ that satisfy the invertibility condition (see Chib and Greenberg, 1994). $\lambda_{\min}$ and $\lambda_{\max}$ are the minimum and maximum eigenvalues of the matrix $W$. If $W$ is row-standardized, then the permissible maximum of $\rho$ is 1 (see Anselin, 2001), and the permissible minimum is $1/\lambda_{\min}$, the reciprocal of the smallest eigenvalue, which is no greater than $-1$. Therefore, we impose this restriction on $\rho$ through the prior.


Posterior distribution for the VAR model

Let $\phi = \mathrm{vec}(\Phi)$ and $\Omega = \Sigma^{-1}$, and apply the following prior distribution:

$$\pi(\phi, \Omega) = \pi(\phi)\, \pi(\Omega)$$

Given a prior density $\pi(\phi, \Omega)$ and the likelihood function stated in (8), the joint posterior distribution can be expressed as

$$\pi(\phi, \Omega \mid y) \propto \pi(\phi, \Omega)\, L(y \mid \phi, \Omega) \tag{10}$$

Finally, we assume the following prior distributions:

$$\phi \sim N(\phi_0, \Sigma_{\phi_0}) I_{S_\phi}, \qquad \Omega \sim W(\nu_0, \Omega_0)$$

where $\phi_0$ is an $n(n+1)p \times 1$ vector, $I_A$ is the indicator function of the set $A$, $S_\phi$ is the set of $\phi$ that satisfy the stationarity condition, and $W(A, B)$ refers to the Wishart distribution (see footnote 1).

Posterior simulation for the SAR-ARMA model

Given the joint posterior distributions in (9) and (10), we can now use the MCMC method. The Markov chain sampling schemes can be constructed from the full conditional distributions of $\beta_i$, $\sigma_i^2$, $\phi_i$, and $\theta_i$ for $i = 1, \ldots, n$, and of $\rho$. For all parameters except $\rho$, we use the sampling algorithm proposed by Ohtsuka et al. (2010).

Sampling $\beta_i$ for $i = 1, \ldots, n$

Let $y_i^* = (y_{i1}^*, \ldots, y_{iT}^*)'$ and $x_i^* = (x_{i1}^*, \ldots, x_{iT}^*)'$, and calculate $y_{it}^*$ and $x_{it}^*$ as follows:

$$y_{it}^* = y_{it} - \rho \sum_{k=1}^{n} w_{ik} y_{kt} - \sum_{j=1}^{p} \phi_{ij} \left( y_{i,t-j} - \rho \sum_{k=1}^{n} w_{ik} y_{k,t-j} \right) - \sum_{j=1}^{q} \theta_{ij} y_{i,t-j}^*$$

$$x_{it}^* = 1 - \sum_{j=1}^{p} \phi_{ij} - \sum_{j=1}^{q} \theta_{ij} x_{i,t-j}^*$$

where $y_{it} = y_{it}^* = 0$ for $t \le 0$ and $x_{it}^* = 0$ for $t \le 0$.

The SAR-ARMA model can be rewritten as

$$y_{it}^* = x_{it}^* \beta_i + \varepsilon_{it} \tag{11}$$

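Then, the proposal distribution of $\beta_i$ is

$$\beta_i^{new} \sim N(\beta_i^{old}, \hat{\Sigma}_{\beta_i}) \tag{12}$$

where $\hat{\Sigma}_{\beta_i} = s_{i1}^2 (\sigma_i^{-2} x_i^{*\prime} x_i^* + \tau_0^{-2})^{-1}$, $\beta_i^{old}$ is the value from the previous iteration, and $s_{i1}$ is the tuning parameter. Next, we evaluate the acceptance probability:

$$\alpha(\beta_i^{old}, \beta_i^{new}) = \min\left\{ \frac{p(\beta_i^{new})}{p(\beta_i^{old})},\ 1 \right\}$$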

1 The density function of the Wishart distribution is $p(\Omega \mid A, B) = |\Omega|^{(A - (K+1))/2} \exp\left\{ -\frac{1}{2} \mathrm{trace}(B^{-1}\Omega) \right\}$, where $K$ is the number of parameters.

Further, $\beta_i = \beta_i^{new}$ with probability $\alpha(\beta_i^{old}, \beta_i^{new})$, and $\beta_i = \beta_i^{old}$ with probability $1 - \alpha(\beta_i^{old}, \beta_i^{new})$. In addition, the acceptance probability depends on the value of the tuning parameter and is determined by the sampling process.
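The proposal and acceptance steps above follow the usual random-walk Metropolis–Hastings pattern, which can be sketched generically in Python as below. This is a simplified scalar version with a single step size `step` standing in for the scaled proposal variance; the function name, the step size, and the illustrative target (a standard normal density restricted to $(-1, 1)$) are all assumptions of this example. Returning a log-density of minus infinity outside the admissible region means that such proposals are never accepted, so no truncation of the proposal is needed.

```python
import numpy as np

def rw_metropolis(log_target, x0, step, n_draws, rng):
    """Random-walk Metropolis-Hastings for a scalar parameter (sketch).

    log_target : function returning the log of the (unnormalized) target
                 density, or -inf outside the admissible region
    """
    x = float(x0)
    lp = log_target(x)
    draws = np.empty(n_draws)
    accepted = 0
    for m in range(n_draws):
        proposal = x + step * rng.standard_normal()
        lp_prop = log_target(proposal)
        # Accept with probability min{p(new) / p(old), 1}.
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = proposal, lp_prop
            accepted += 1
        draws[m] = x
    return draws, accepted / n_draws

def log_truncated_normal(x):
    # Illustrative target: standard normal kernel restricted to (-1, 1).
    return -0.5 * x * x if abs(x) < 1.0 else -np.inf

rng = np.random.default_rng(12345)
mh_draws, mh_rate = rw_metropolis(log_truncated_normal, 0.0, 0.8, 20000, rng)
```

In practice the step size would be adjusted until the acceptance rate `mh_rate` falls in a desired range.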

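Sampling $\sigma_i^2$ for $i = 1, \ldots, n$

Let $\bar{e}_{it} = y_{it}^* - x_{it}^* \beta_i$. Then, the full conditional distribution of $\sigma_i^2$ is

$$\sigma_i^2 \sim IG\left( \frac{T + \nu_0}{2},\ \frac{\bar{e}_i' \bar{e}_i + \lambda_0}{2} \right)$$

where $\bar{e}_i = (\bar{e}_{i1}, \ldots, \bar{e}_{iT})'$. These parameters are sampled directly with the Gibbs sampler.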
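A draw from this inverse gamma full conditional can be obtained as the reciprocal of a gamma draw. The sketch below assumes the shape and scale parameterization $IG(a, b)$ with mean $b/(a-1)$, and uses the hyperparameter values $\nu_0 = 2$ and $\lambda_0 = 0.01$ reported in the empirical section; the residual vector and the function name are hypothetical.

```python
import numpy as np

def draw_sigma2(resid, nu0, lam0, rng):
    """One Gibbs draw of sigma_i^2 from IG((T + nu0)/2, (e'e + lam0)/2)."""
    shape = 0.5 * (resid.size + nu0)
    scale = 0.5 * (resid @ resid + lam0)
    # If X ~ Gamma(shape, rate = scale), then 1/X ~ IG(shape, scale).
    # NumPy's gamma takes a scale argument, which is 1 / rate.
    return 1.0 / rng.gamma(shape, 1.0 / scale)

rng = np.random.default_rng(2011)
resid_toy = np.full(100, 0.5)            # hypothetical residuals, e'e = 25
sigma2_draws = np.array([draw_sigma2(resid_toy, 2.0, 0.01, rng)
                         for _ in range(20000)])
theoretical_mean = (0.5 * (25.0 + 0.01)) / (0.5 * (100 + 2.0) - 1.0)
```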

Sampling $\phi_i$ for $i = 1, \ldots, n$

Let $\tilde{y}_i = (\tilde{y}_{i1}, \ldots, \tilde{y}_{iT})'$ and $\tilde{x}_i = (\tilde{x}_{i1}', \ldots, \tilde{x}_{iT}')'$, and calculate $\tilde{y}_{it}$ and $\tilde{x}_{it}$ as follows:

$$\tilde{y}_{it} = y_{it} - \rho \sum_{j=1}^{n} w_{ij} y_{jt} - \beta_i - \sum_{j=1}^{q} \theta_{ij} \tilde{y}_{i,t-j}$$

$$\tilde{x}_{it} = (\tilde{y}_{i,t-1}, \ldots, \tilde{y}_{i,t-p})$$

where $y_{it} = \tilde{y}_{it} = 0$ for $t \le 0$.

The SAR-ARMA model can be rewritten as

$$\tilde{y}_{it} = \tilde{x}_{it} \phi_i + \varepsilon_{it}$$

Then, the proposal distribution of $\phi_i$ is

$$\phi_i^{new} \sim N(\phi_i^{old}, \hat{\Sigma}_{\phi_i}) \tag{13}$$

where $\hat{\Sigma}_{\phi_i} = s_{i2}^2 (\sigma_i^{-2} \tilde{x}_i' \tilde{x}_i + \Sigma_\phi^{-1})^{-1}$, $\phi_i^{old}$ is the value from the previous iteration, and $s_{i2}$ is the tuning parameter. Next, we evaluate the acceptance probability:

$$\alpha(\phi_i^{old}, \phi_i^{new}) = \min\left\{ \frac{p(\phi_i^{new})}{p(\phi_i^{old})},\ 1 \right\}$$

Further, $\phi_i = \phi_i^{new}$ with probability $\alpha(\phi_i^{old}, \phi_i^{new})$, and $\phi_i = \phi_i^{old}$ with probability $1 - \alpha(\phi_i^{old}, \phi_i^{new})$.

It should be mentioned that the proposed value of $\phi_i$ is not truncated to the stationarity region because the constraint is part of the target density. Thus, if the proposed value of $\phi_i$ does not satisfy the stationarity condition, the conditional posterior is zero, and the proposed value is rejected with probability one.

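Sampling $\theta_i$ for $i = 1, \ldots, n$

Let $\bar{y}_i = (\bar{y}_{i1}, \ldots, \bar{y}_{iT})'$ and $\bar{x}_i = (\bar{x}_{i1}', \ldots, \bar{x}_{iT}')'$, and calculate $\bar{y}_{it}$ and $\bar{x}_{it}$ as follows:

$$\bar{y}_{it} = y_{it} - \rho \sum_{k=1}^{n} w_{ik} y_{kt} - \beta_i - \sum_{j=1}^{p} \phi_{ij} \left( y_{i,t-j} - \rho \sum_{k=1}^{n} w_{ik} y_{k,t-j} - \beta_i \right) - \sum_{j=1}^{q} \theta_{ij}^{old} \bar{y}_{i,t-j}$$

$$\bar{x}_{it} = (\bar{y}_{i,t-1}, \ldots, \bar{y}_{i,t-q})$$

where $y_{it} = \bar{y}_{it} = 0$ for $t \le 0$.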


For sampling $\theta_i$, the SAR-ARMA model can be rewritten as

$$\bar{y}_{it} = \bar{x}_{it} \theta_i + \varepsilon_{it}$$

Then, the proposal distribution of $\theta_i$ is

$$\theta_i^{new} \sim N(\theta_i^{old}, \hat{\Sigma}_{\theta_i}) \tag{14}$$

where $\hat{\Sigma}_{\theta_i} = s_{i3}^2 (\sigma_i^{-2} \bar{x}_i' \bar{x}_i + \Sigma_\theta^{-1})^{-1}$, $\theta_i^{old}$ is the value from the previous iteration, and $s_{i3}$ is the tuning parameter. Next, we evaluate the acceptance probability:

$$\alpha(\theta_i^{old}, \theta_i^{new}) = \min\left\{ \frac{p(\theta_i^{new})}{p(\theta_i^{old})},\ 1 \right\}$$

Further, $\theta_i = \theta_i^{new}$ with probability $\alpha(\theta_i^{old}, \theta_i^{new})$, and $\theta_i = \theta_i^{old}$ with probability $1 - \alpha(\theta_i^{old}, \theta_i^{new})$.

The proposed value of $\theta_i$ is not truncated to the invertibility region because the constraint is part of the target density. Thus, if the proposed value of $\theta_i$ does not satisfy the invertibility condition, the conditional posterior is zero and the proposed value is rejected with probability one.

Sampling $\rho$

From (9), the full conditional distribution of $\rho$ is written as

$$p(\rho) \propto \prod_{t=1}^{T} |I_n - \rho W| \exp\left( -\frac{\bar{e}_t' \Sigma^{-1} \bar{e}_t}{2} \right)$$

Then, the proposal distribution of $\rho$ is

$$\rho^{new} \sim N(\rho^{old}, s_{i4}^2)$$

where $s_{i4}$ is the tuning parameter. In the numerical example below, we select the tuning parameter such that the acceptance rate lies between 0.4 and 0.6 (see Holloway et al., 2002). Next, we evaluate the acceptance probability:

$$\alpha(\rho^{old}, \rho^{new}) = \min\left\{ \frac{p(\rho^{new})}{p(\rho^{old})},\ 1 \right\}$$

Further, $\rho = \rho^{new}$ with probability $\alpha(\rho^{old}, \rho^{new})$, and $\rho = \rho^{old}$ with probability $1 - \alpha(\rho^{old}, \rho^{new})$. The proposed value of $\rho$ is not truncated to the interval $(1/\lambda_{\min}, 1/\lambda_{\max})$ because the constraint is part of the target density. Thus, if the proposed value of $\rho$ is not within the interval, the conditional posterior is zero, and the proposed value is rejected with probability one (see Chib and Greenberg, 1998).

Posterior simulation for the VAR model

Sampling $\Phi$ and $\Sigma$

In order to sample $\Phi$, we rewrite (6) as follows:

$$y_t = Z_t \phi + \varepsilon_t$$

where $Z_t = I_n \otimes x_t'$. Furthermore, we rewrite (7) as follows:

$$y = Z\phi + e \tag{15}$$


where $y = (y_1', \ldots, y_T')'$, $Z = (Z_1; \ldots; Z_T)'$, and $e = (\varepsilon_1', \ldots, \varepsilon_T')'$. We rewrite the likelihood function using (15) as follows:

$$L(y \mid \phi, \Omega) \propto |\Omega|^{T/2} \exp\left\{ -\frac{1}{2} (y - Z\phi)' (I_T \otimes \Omega) (y - Z\phi) \right\} \tag{16}$$

Then, the proposal distribution of $\phi$ is

$$\phi^{new} \sim N(\phi^{old}, \hat{\Sigma}_\phi) \tag{17}$$

where $\hat{\Sigma}_\phi = (Z'(I_T \otimes \Omega)Z + \Sigma_{\phi_0}^{-1})^{-1}$, $\phi^{old}$ is the value from the previous iteration, and $s$ is the tuning parameter. Next, we evaluate the acceptance probability:

$$\alpha(\phi^{old}, \phi^{new}) = \min\left\{ \frac{p(\phi^{new})}{p(\phi^{old})},\ 1 \right\}$$

Further, $\phi = \phi^{new}$ with probability $\alpha(\phi^{old}, \phi^{new})$, and $\phi = \phi^{old}$ with probability $1 - \alpha(\phi^{old}, \phi^{new})$. It should be mentioned that the proposed value of $\phi$ is not truncated to the stationarity region because the constraint is part of the target density. Thus, if the proposed value of $\phi$ does not satisfy the stationarity condition, the conditional posterior is zero and the proposed value is rejected with probability one.

Next, the full conditional distribution of $\Omega$ is

$$p(\Omega \mid y, Z, \phi) \propto |\Omega|^{(\nu_1 - (m+1))/2} \exp\left\{ -\frac{1}{2} \mathrm{trace}(\hat{\Omega}^{-1} \Omega) \right\}$$

where $\nu_1 = T + \nu_0$ and $\hat{\Omega} = (\Omega_0^{-1} + E'E)^{-1}$. Therefore, the posterior distribution of $\Omega$ is denoted as follows:

$$\Omega \sim W(\nu_1, \hat{\Omega})$$

This parameter is sampled using the Gibbs sampler. Finally, we have $\Sigma = \Omega^{-1}$.
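The Gibbs step for $\Omega$ can be sketched in Python as follows. For integer degrees of freedom $\nu_1$, a Wishart draw with mean $\nu_1 \hat{\Omega}$ can be formed as the sum of $\nu_1$ outer products of independent $N(0, \hat{\Omega})$ vectors, which avoids any dependence beyond NumPy. The function name, the simulated residual matrix, and the choice $n = 2$, $T = 50$ are assumptions of this example; the prior values mirror the $W(T + 1, 100^{-1} \times I_n)$ prior stated in the empirical section.

```python
import numpy as np

def draw_precision(E, nu0, Omega0, rng):
    """One Gibbs draw of Omega ~ W(T + nu0, (Omega0^{-1} + E'E)^{-1}).

    Assumes T + nu0 is an integer no smaller than n.
    Returns the draw and the posterior scale matrix Omega_hat.
    """
    T, n = E.shape
    nu1 = int(T + nu0)
    Omega_hat = np.linalg.inv(np.linalg.inv(Omega0) + E.T @ E)
    chol = np.linalg.cholesky(Omega_hat)
    # Rows of Zs are independent N(0, Omega_hat) vectors.
    Zs = rng.standard_normal((nu1, n)) @ chol.T
    return Zs.T @ Zs, Omega_hat

rng = np.random.default_rng(7)
T_toy, n_toy = 50, 2
E_toy = rng.standard_normal((T_toy, n_toy))       # hypothetical residuals
Omega_draw, Omega_hat = draw_precision(E_toy, T_toy + 1, np.eye(n_toy) / 100.0, rng)
Sigma_draw = np.linalg.inv(Omega_draw)            # Sigma = Omega^{-1}
```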

Marginal likelihood

Because we adopt a Bayesian approach, we evaluate the models using the marginal likelihood. To compute the marginal data density, we use Geweke's (1999) modified harmonic mean estimator.

Harmonic mean estimators are based on the identity

$$\frac{1}{\pi(Y)} = \int \frac{f(\delta)}{L(Y \mid \delta)\, \pi(\delta)}\, \pi(\delta \mid Y)\, d\delta$$

where $\delta$ is the set of parameters of the model and $\int f(\delta)\, d\delta = 1$ (see Gelfand and Dey, 1994). To make the numerical approximation efficient, $f(\delta)$ should be chosen such that the summands are of equal magnitude. Geweke (1999) proposed the use of the density of the truncated multivariate normal distribution:

$$f(\delta) = \tau^{-1} (2\pi)^{-d/2} |V_\delta|^{-1/2} \exp\left[ -0.5 (\delta - \bar{\delta})' V_\delta^{-1} (\delta - \bar{\delta}) \right] \times I\left\{ (\delta - \bar{\delta})' V_\delta^{-1} (\delta - \bar{\delta}) \le F_{\chi_d^2}^{-1}(\tau) \right\}$$


where $\bar{\delta}$ and $V_\delta$ are the posterior mean and the covariance matrix computed from the output of the posterior simulation, respectively, $d$ is the dimension of the parameter vector, $F_{\chi_d^2}(\cdot)$ is the cumulative distribution function of a $\chi^2$ random variable with $d$ degrees of freedom, and $\tau \in (0, 1)$. Finally, we choose the model with the highest marginal likelihood.
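A minimal Python sketch of the modified harmonic mean estimator is given below, under the assumption that posterior draws and the corresponding values of log likelihood plus log prior are already available. The helper names (`chi2_cdf`, `chi2_ppf`, `log_marginal_mhm`), the truncation level $\tau = 0.9$, and the conjugate normal test model are choices made for this illustration only. The chi-square quantile is computed from a series expansion with bisection so that the example needs nothing beyond NumPy and the standard library.

```python
import math
import numpy as np

def chi2_cdf(x, d):
    """Chi-square(d) CDF via the series for the regularized lower
    incomplete gamma function P(d/2, x/2)."""
    a, z = d / 2.0, x / 2.0
    if z <= 0.0:
        return 0.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total, k = term, 1
    while term > 1e-16 * total and k < 100000:
        term *= z / (a + k)
        total += term
        k += 1
    return min(total, 1.0)

def chi2_ppf(tau, d):
    """Chi-square(d) quantile, found by bisection on chi2_cdf."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, d) < tau:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, d) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_marginal_mhm(draws, log_kernel, tau=0.9):
    """Modified harmonic mean estimate of the log marginal likelihood.

    draws      : (M, d) posterior draws
    log_kernel : length-M array of log likelihood + log prior at each draw
    """
    M, d = draws.shape
    dev = draws - draws.mean(axis=0)
    V = np.atleast_2d(np.cov(draws, rowvar=False))
    maha = np.einsum("ij,jk,ik->i", dev, np.linalg.inv(V), dev)
    _, logdet = np.linalg.slogdet(V)
    inside = maha <= chi2_ppf(tau, d)          # truncation region of f
    log_f = -math.log(tau) - 0.5 * d * math.log(2.0 * math.pi) - 0.5 * logdet - 0.5 * maha
    log_ratio = (log_f - log_kernel)[inside]
    peak = log_ratio.max()                     # log-sum-exp for stability
    log_mean = peak + math.log(np.exp(log_ratio - peak).sum() / M)
    return -log_mean

# Check on a conjugate model with a known answer (hypothetical data):
# y_i ~ N(m, 1) with prior m ~ N(0, 1).
rng = np.random.default_rng(0)
y_obs = rng.normal(0.5, 1.0, size=20)
n_obs = y_obs.size
post_var = 1.0 / (n_obs + 1.0)
post_draws = rng.normal(y_obs.sum() * post_var, math.sqrt(post_var), size=(20000, 1))
log_lik = -0.5 * n_obs * math.log(2.0 * math.pi) - 0.5 * ((y_obs[None, :] - post_draws) ** 2).sum(axis=1)
log_prior = -0.5 * math.log(2.0 * math.pi) - 0.5 * post_draws[:, 0] ** 2
mhm_estimate = log_marginal_mhm(post_draws, log_lik + log_prior)
exact_value = (-0.5 * n_obs * math.log(2.0 * math.pi) - 0.5 * math.log(n_obs + 1.0)
               - 0.5 * (np.sum(y_obs ** 2) - y_obs.sum() ** 2 / (n_obs + 1.0)))
```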

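Predictive distribution

We evaluate the forecasting performance by comparing predictive distributions. Ohtsuka et al. (2010) applied Geweke and Amisano's (2011) log predictive score approach. They showed that the log predictive score function is a measure of the out-of-sample prediction track record of the model. Therefore, we choose the model with the highest log predictive density.

For the SAR-ARMA($p$, $q$) model, the log predictive score function (LS) is

$$LS(y_{T+K}) = \sum_{k=1}^{K} \log p(y_{T+k} \mid Y_{T+k-1}) = \sum_{k=1}^{K} \sum_{i=1}^{n} \log p(y_{i,T+k} \mid Y_{T+k-1}) \tag{18}$$

The one-step-ahead predictive distribution is

$$p(y_{T+1} \mid Y_T) = \int_{\Sigma > 0} \int \cdots \int p(y_{T+1} \mid Y_T, \beta, \Sigma, \Phi_p, \Theta_q, \rho) \times \pi(\beta, \Sigma, \Phi_p, \Theta_q, \rho \mid Y_T)\, d\beta\, d\Sigma\, d\Phi_p\, d\Theta_q\, d\rho \tag{19}$$

Then, $y_{i,T+1}$ for $i = 1, \ldots, n$ can be rewritten as

$$y_{i,T+1} = \left( 1 - \rho \sum_{j=1}^{n} w_{ij} \right)^{-1} (\beta_i + u_{i,T+1})$$

$$u_{i,T+1} = u_T^* \phi_i + \varepsilon_{i,T+1} + \varepsilon_T^* \theta_i$$

where $u_T^* = (u_{iT}, \ldots, u_{i,T-p+1})$ and $\varepsilon_T^* = (\varepsilon_{i,T}, \ldots, \varepsilon_{i,T-q+1})$. The log predictive distribution of the $i$th unit is

$$\log p(y_{i,T+1} \mid Y_T) = -\frac{\log(2\pi)}{2} - \frac{\log(\Sigma_i^*)}{2} - \frac{(y_{i,T+1} - \mu_i^*)^2}{2\Sigma_i^*} \tag{20}$$

where

$$\mu_i^* = \left( 1 - \rho \sum_{j=1}^{n} w_{ij} \right)^{-1} \mu_i^{**}, \qquad \Sigma_i^* = \left( 1 - \rho \sum_{j=1}^{n} w_{ij} \right)^{-2} \Sigma_i^{**}$$

$$\mu_i^{**} = \beta_i + u_T^* \phi_i + \varepsilon_T^* \theta_i$$

$$\Sigma_i^{**} = \sigma_i^2 + \Sigma_{\beta_i} + u_T^* \Sigma_{\phi_i} u_T^{*\prime} + \varepsilon_T^* \Sigma_{\theta_i} \varepsilon_T^{*\prime}$$

Next, the log predictive score function for the VAR model is as follows:

$$\log p(y_{T+1} \mid Y_T) = -\frac{\log(2\pi)}{2} - \frac{\log|\breve{\Sigma}|}{2} - \frac{(y_{T+1} - \breve{\mu})' \breve{\Sigma}^{-1} (y_{T+1} - \breve{\mu})}{2} \tag{21}$$

where

$$\breve{\mu} = Z_{T+1} \phi, \qquad \breve{\Sigma} = \Sigma + Z_{T+1} \Sigma_\phi Z_{T+1}'$$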
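The log score in (18) is a running sum of one-step-ahead Gaussian log predictive densities, which can be sketched in Python as below. Note one assumption: the sketch uses the standard $n$-variate normal constant $-(n/2)\log(2\pi)$, which coincides with the printed form of (20) for a single unit but differs from the printed constant in (21) when $n > 1$. The function names and the toy inputs are hypothetical.

```python
import numpy as np

def gaussian_log_density(y, mean, cov):
    """Log density of an n-variate normal N(mean, cov) evaluated at y."""
    n = y.size
    dev = y - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = dev @ np.linalg.solve(cov, dev)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

def cumulative_log_score(step_log_densities):
    """Running log predictive score LS for horizons K = 1, 2, ..."""
    return np.cumsum(np.asarray(step_log_densities, dtype=float))

# Toy check: a bivariate standard normal evaluated at its mean.
lp_toy = gaussian_log_density(np.zeros(2), np.zeros(2), np.eye(2))
ls_toy = cumulative_log_score([1.0, 2.0, 3.0])
```

A higher cumulative log score at a given horizon indicates a better out-of-sample track record up to that horizon.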


EMPIRICAL ANALYSIS

Available dataset

First, we explain the dataset used in this paper. The monthly electricity demand data are obtained from the Electricity Enterprises Manual in Japan. Figure 1 plots the volume of electricity supplied in nine regions in Japan from January 1992 to January 2003.

In Japan, electric power is supplied by 10 companies: Hokkaido, Tohoku, Tokyo, Chubu, Hokuriku, Kansai, Chugoku, Shikoku, Kyushu, and Okinawa. However, the Okinawa Electric Power Company supplies only Okinawa prefecture and does not collaborate with the other companies. Therefore, we eliminated it from the analysis. In this paper, production equals demand because electric power cannot be stored economically. However, the electricity produced in one region is not necessarily used entirely in that same region, because the electricity companies help each other if they face shortages in electricity supply. Moreover, deliveries of electric power from one company to another are not announced. Therefore, we treat this delivered power as an unobserved component. The figure shows that electricity demand differs across regions. For example, demand is highest in the Tokyo region and lowest in the Hokkaido region.

In addition, it is well known that electricity demand is subject to seasonal changes. For example, there is a sharp rise in demand in the summer and winter. In this paper, we use seasonally adjusted data, which are obtained using the X-12-ARIMA procedure. Figure 2 plots the transformed data.

For the spatial weight matrix $W$, we use contiguity dummy variables (see Anselin, 1988). In this weight matrix, we consider the connections between the companies. All except one (Okinawa) of the electricity companies are located on the four main islands: Hokkaido, Honshu, Shikoku, and Kyushu. Although these four islands are separate geographical entities, they are connected by overhead transmission lines and undersea cables. For example, the Hokkaido electricity company is connected to the Tohoku electricity company. As such, we treat this dependence as first-order contiguity between the neighborhoods, as illustrated in Table I. In addition, the spatial weight matrix is used in row-standardized form.

[Figure 1. Electricity demand (MW h) in Japan (1/1992–1/2003). Monthly series for Hokkaido, Tohoku, Tokyo, Chubu, Hokuriku, Kansai, Chugoku, Shikoku, and Kyushu; horizontal axis 1992 to 2003, vertical axis 1000 to 8000.]


[Figure 2. Transformed data of electricity demand in Japan (1/1992–1/2003). Monthly series for Hokkaido, Tohoku, Tokyo, Chubu, Hokuriku, Kansai, Chugoku, Shikoku, and Kyushu; horizontal axis 1992 to 2003, vertical axis 6.0 to 9.5.]

Table I. First-order contiguity, Japanese neighborhoods

Region: Contiguous neighborhoods
Hokkaido: Tohoku
Tohoku: Hokkaido, Tokyo
Tokyo: Tohoku, Chubu
Chubu: Tokyo, Hokuriku, Kansai
Hokuriku: Chubu, Kansai
Kansai: Chubu, Hokuriku, Chugoku, Shikoku
Chugoku: Kansai, Shikoku, Kyushu
Shikoku: Kansai, Chugoku
Kyushu: Chugoku
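To make the construction concrete, the following Python sketch builds the contiguity matrix $C$ from the neighborhoods listed in Table I, row-standardizes it to obtain $W$, and computes the eigenvalue bounds $1/\lambda_{\min}$ and $1/\lambda_{\max}$ that delimit the uniform prior on $\rho$. The variable names are assumptions of this example; the paper's own computations were carried out in Ox.

```python
import numpy as np

regions = ["Hokkaido", "Tohoku", "Tokyo", "Chubu", "Hokuriku",
           "Kansai", "Chugoku", "Shikoku", "Kyushu"]

# First-order contiguity as listed in Table I.
neighbors = {
    "Hokkaido": ["Tohoku"],
    "Tohoku": ["Hokkaido", "Tokyo"],
    "Tokyo": ["Tohoku", "Chubu"],
    "Chubu": ["Tokyo", "Hokuriku", "Kansai"],
    "Hokuriku": ["Chubu", "Kansai"],
    "Kansai": ["Chubu", "Hokuriku", "Chugoku", "Shikoku"],
    "Chugoku": ["Kansai", "Shikoku", "Kyushu"],
    "Shikoku": ["Kansai", "Chugoku"],
    "Kyushu": ["Chugoku"],
}

index = {name: i for i, name in enumerate(regions)}
C = np.zeros((len(regions), len(regions)))
for name, adjacent in neighbors.items():
    for other in adjacent:
        C[index[name], index[other]] = 1.0

# Row-standardize: each row of W sums to one.
W = C / C.sum(axis=1, keepdims=True)

# Eigenvalues of W are real here because C is symmetric.
eigenvalues = np.linalg.eigvals(W).real
lam_min, lam_max = eigenvalues.min(), eigenvalues.max()
rho_lower, rho_upper = 1.0 / lam_min, 1.0 / lam_max
```

For a row-standardized matrix the largest eigenvalue equals one, so the upper bound on $\rho$ is one, while the lower bound depends on the contiguity pattern.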

For the SAR-ARMA model, we run the MCMC algorithm for 3,000,000 iterations after a burn-in phase of 1,000,000 iterations. Of the remaining draws, every 100th draw is used to obtain the posterior statistics for the parameters. The hyperparameters of the SAR-ARMA model are set as follows:

$$p(\beta_i) \sim N(0, 100), \qquad p(\phi_i) \sim N(0, 100 \times I_p) I_{S_\phi}, \qquad p(\sigma_i^2) \sim IG(2/2, 0.01/2),$$

$$p(\theta_i) \sim N(0, 100 \times I_q) I_{S_\theta}, \qquad \text{and} \qquad p(\rho) \sim U(1/\lambda_{\min}, 1/\lambda_{\max})$$

For the tuning parameters, we set $s_{i1} = s_{i2} = s_{i3} = 1.0$. For the VAR model, in contrast, we run the MCMC algorithm for 90,000 iterations after a burn-in phase of 30,000 iterations. Of the remaining draws, every 10th draw is used to obtain the posterior statistics for the parameters. The hyperparameters of the VAR model are set as follows:

$$\phi \sim N(\phi_0, 100 \times I_{n(n+1)p}) I_{S_\phi} \qquad \text{and} \qquad \Omega \sim W(T + 1, 100^{-1} \times I_n)$$

Empirical results

First, we need to choose the orders of AR and MA since these orders are not known in advance. In this paper, following Ohtsuka et al. (2010), we apply the SAR-ARMA(1, 1) model because our dataset is identical to theirs. Moreover, the VAR(1) model was selected because its marginal likelihood was higher than that of the VAR(2) model. All the results in this paper are calculated using Ox version 5.1 (see Doornik, 2006). The parameter estimates of the models are not reported, because our purpose is to examine the forecasting performance of the spatial and contemporaneous aggregated models and it is not necessary to discuss the properties of the estimated parameters. Table II reports the estimated marginal likelihood for the SAR-ARMA(1, 1) and VAR(1) models. From the table, the VAR(1) model has a higher marginal likelihood than the SAR-ARMA(1, 1) model. Therefore, the VAR(1) model is better than the SAR-ARMA(1, 1) model from the viewpoint of in-sample estimation.

Next, following Ohtsuka et al. (2010), we evaluate the forecasting performance by comparing predictive distributions. In this paper, we forecast the aggregated electricity demand for 12 months. We fit the models to the transformed data from January 1992 to January 2003, and examine their performance on the data from February 2003 to January 2004. Table III reports the estimated log predictive scores for the SAR-ARMA(1, 1) and VAR(1) models. From the table, the log scores of the VAR(1) model are better than those of the SAR-ARMA(1, 1) model.

Finally, we refer to the estimated results for the parameters in the VAR(1) model. For every element of $\Sigma$, the variance–covariance matrix of the error terms, the 95% credible interval excludes zero. This suggests that the weight matrix is misspecified. Lee (2007) showed that an over-specification of the weight matrix results in a smaller bias in the estimators than an under-specification. The spatial interaction of electricity in Japan is not published. Thus, our weight matrix may be under-specified. Moreover, our weight matrix suggests that the links between the electricity companies through overhead transmission lines and undersea cables may not be strong. Therefore, an accurate specification of the weight matrix may lead to improved forecasting performance.

Table II. Estimated marginal likelihoods for the SAR-ARMA(1, 1) and VAR(1) models

SAR-ARMA    VAR
2915.3      2991.4

Table III. Estimated log score for the SAR-ARMA(1, 1) and VAR(1) models

Horizon    SAR-ARMA    VAR
1          31.6        34.1
2          63.1        67.3
3          95.1        101.2
4          124.1       131.5
5          152.4       164.8
6          177.9       193.8
7          207.3       226.8
8          237.3       259.9
9          267.3       292.1
10         298.8       326.2
11         329.8       358.5
12         361.1       391.9
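The size of the VAR model's advantage can be read off the two tables with simple arithmetic, sketched in Python below. The sketch assumes that the Table II entries are log marginal likelihoods (as the abstract suggests) and that the Table III entries are log scores accumulated over the forecast horizon, in line with equation (18); the variable names are hypothetical.

```python
# Values transcribed from Tables II and III.
log_ml_sar, log_ml_var = 2915.3, 2991.4
ls_sar = [31.6, 63.1, 95.1, 124.1, 152.4, 177.9,
          207.3, 237.3, 267.3, 298.8, 329.8, 361.1]
ls_var = [34.1, 67.3, 101.2, 131.5, 164.8, 193.8,
          226.8, 259.9, 292.1, 326.2, 358.5, 391.9]

# Difference in log marginal likelihood (VAR minus SAR-ARMA).
log_ml_gap = round(log_ml_var - log_ml_sar, 1)

# Difference in log score at each horizon (VAR minus SAR-ARMA).
ls_gap = [round(v - s, 1) for s, v in zip(ls_sar, ls_var)]

# True if the gap widens at every step of the forecast horizon.
gap_widens = all(later > earlier for earlier, later in zip(ls_gap, ls_gap[1:]))
```

Under these assumptions, the gap in favor of the VAR(1) model is positive at every horizon and grows as the horizon lengthens.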


CONCLUDING REMARKS

This paper examines the efficiency of forecasting the aggregate of spatially correlated variables and applies it to forecasting electricity demand in Japan. Specifically, we compared the SAR-ARMA model with the VAR model from the estimation and forecasting viewpoints. From the empirical results, we found that the log marginal likelihood and log predictive density of the VAR(1) model were higher than those of the SAR-ARMA(1, 1) model. Therefore, we found that, in the case of forecasting electricity demand in Japan, the VAR model had better forecasting performance than the SAR-ARMA model.

Finally, we state some remaining issues. The empirical results suggest that our weight matrix may be under-specified. The identification of the weight matrix calls for further investigation. In addition, our empirical results implied a unit root problem in the AR parameters, which might result in poor forecasting performance. Giacomini and Granger (2004) stated that forecasting with a space–time model offers a solution to the curse of dimensionality that arises when modeling panel data with a moderately large cross-sectional dimension using a VAR model. Therefore, we must examine the application of the space–time model not only to electricity demand but also to other macro data. This topic will be discussed in our future research. Nevertheless, it is important to know whether it would be more efficient to forecast the aggregate series directly or to model the individual components separately and then aggregate the forecasts, as in the case of forecasting electricity demand in Japan. Therefore, we believe that in this respect our results are an interesting piece of empirical analysis.

ACKNOWLEDGEMENTS

We gratefully acknowledge the helpful discussions with and suggestions by Yasuhiro Omori, Kosuke Oya, Toshiaki Watanabe, anonymous referees, and Derek Bunn, the associate editor of this journal. This research was partially supported by KAKENHI.

REFERENCES

Ang A, Piazzesi M, Wei M. 2006. What does the yield curve tell us about GDP growth? Journal of Econometrics 131: 359–403.

Anselin L. 1988. Spatial Econometrics: Methods and Models. Kluwer: Dordrecht.

Anselin L. 2001. Spatial econometrics. In A Companion to Theoretical Econometrics, Baltagi B (ed.). Basil Blackwell: Oxford; 310–330.

Anselin L. 2003. Spatial externalities, spatial multipliers, and spatial econometrics. International Regional Science Review 26: 153–166.

Chib S, Greenberg E. 1994. Bayes inference in regression models with ARMA(p, q) errors. Journal of Econometrics 64: 183–206.

Chib S, Greenberg E. 1998. Analysis of multivariate probit models. Biometrika 85: 347–361.

Cottet R, Smith M. 2003. Bayesian modeling and forecasting of intraday electricity load. Journal of the American Statistical Association 98: 839–849.

Doornik JA. 2006. Ox: Object Oriented Matrix Programming Language. Timberlake Consultants Press: London.

Gelfand AE, Dey DK. 1994. Bayesian model choice: asymptotics and exact calculations. Journal of the Royal Statistical Society B 56: 501–514.

Geweke J. 1999. Using simulation methods for Bayesian econometric models: inference, development, and communication. Econometric Reviews 18: 1–73.

Geweke J, Amisano G. 2011. Optimal prediction pools. Journal of Econometrics 164: 130–141.

Giacomini R, Granger CWJ. 2004. Aggregation of space–time processes. Journal of Econometrics 118: 7–26.


Holloway G, Shankar B, Rahman S. 2002. Bayesian spatial probit estimation: A primer and an application to HYV rice adoption. Agricultural Economics 27: 383–402.

Kakamu K, Wago H, Tanizaki H. 2010. Estimation of regional business cycle in Japan using Bayesian panel spatial autoregressive probit model. In Handbook of Regional Economics, Nolin TP (ed.). Nova: New York; 555–571.

Lee SY. 2007. Bias from misspecified spatial weight matrices in SAR models: theory and simulation studies. Manuscript, Department of Economics, San Francisco State University.

Ohtsuka Y, Kakamu K. 2009a. Estimation of electric demand in Japan: a Bayesian spatial autoregressive AR(p) approach. In Inflation: Causes and Effects, Schwartz LV (ed.). Nova: New York; 555–571.

Ohtsuka Y, Kakamu K. 2009b. Comparison of the sampling efficiency in spatial autoregressive model. Working Paper, Chiba University.

Ohtsuka Y, Oga T, Kakamu K. 2010. Forecasting electricity demand in Japan: a Bayesian spatial autoregressive ARMA approach. Computational Statistics and Data Analysis 54: 2721–2735.

Panagiotelis A, Smith M. 2008. Bayesian density forecasting of intraday electricity prices using multivariate skew t distributions. International Journal of Forecasting 24: 710–727.

Pappas SS, Ekonomou L, Karamousantas DC, Chatzarakis GE, Katsikas SK, Liatsis P. 2008. Electricity demand loads modeling using autoregressive moving average (ARMA) models. Energy 33: 1353–1360.

Ramanathan R, Engle R, Granger C, Vahid-Araghi F, Brace C. 1997. Short-run forecasts of electricity loads and peaks. International Journal of Forecasting 13: 161–174.

Schanne N, Wapler R, Weyh A. 2010. Regional unemployment forecasts with spatial interdependencies. International Journal of Forecasting 26: 908–926.

Smets F, Wouters R. 2005. Comparing shocks and frictions in US and Euro area business cycles: a Bayesian DSGE approach. Journal of Applied Econometrics 20: 161–183.

Stakhovych S, Bijmolt THA. 2009. Specification of spatial models: a simulation study on weights matrices. Papers in Regional Science 88: 389–408.

Tsurumi H, Radchenko S. 2005. Relationships among the foreign exchange rates after the Asian financial crisis: applications of unit root tests, cointegration tests and VAR. In Econometric Applications of MCMC Algorithms, Wago H (ed.). Toyo Keizai Shimposha: Tokyo; 101–125 (in Japanese).

Zellner A, Tobias J. 2000. A note on aggregation, disaggregation and forecasting performance. Journal of Forecasting 19: 457–469.

Authors’ biographies:
Yoshihiro Ohtsuka is a graduate student at Hitotsubashi University, majoring in economics. He received his MA in Economics from Chiba University. His major research fields are spatio-temporal econometrics and business cycles.

Kazuhiko Kakamu is Associate Professor at Chiba University. His research fields are Bayesian inference and spatial econometrics.

Authors’ addresses:
Yoshihiro Ohtsuka, Graduate School of Economics, Hitotsubashi University and Tachibana Securities Co., Ltd, 2-1 Naka, Kunitachi, Tokyo 186-8601, Japan.

Kazuhiko Kakamu, Faculty of Law and Economics, Chiba University, 1-33, Yayoi-cho, Inage-ku, Chiba, 263-8522, Japan.
