functional time series approach for forecasting very short-term electricity demand

This article was downloaded by: [Ams/Girona*barri Lib] On: 10 October 2014, At: 03:09 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Applied Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20 Functional time series approach for forecasting very short-term electricity demand Han Lin Shang a a Department of Econometrics and Business Statistics , Monash University , VIC 3145 , Australia Published online: 05 Nov 2012. To cite this article: Han Lin Shang (2013) Functional time series approach for forecasting very short-term electricity demand, Journal of Applied Statistics, 40:1, 152-168, DOI: 10.1080/02664763.2012.740619 To link to this article: http://dx.doi.org/10.1080/02664763.2012.740619 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. 
Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


Journal of Applied Statistics, Vol. 40, No. 1, January 2013, 152–168

Functional time series approach for forecasting very short-term electricity demand

Han Lin Shang∗

Department of Econometrics and Business Statistics, Monash University, VIC 3145, Australia

(Received 17 August 2011; final version received 13 October 2012)

This empirical paper presents a number of functional modelling and forecasting methods for predicting very short-term (such as minute-by-minute) electricity demand. The proposed functional methods slice a seasonal univariate time series (TS) into a TS of curves, then reduce the dimensionality of the curves by applying functional principal component analysis before using a univariate TS forecasting method and regression techniques. As data points in the daily electricity demand are sequentially observed, a forecast updating method can greatly improve the accuracy of point forecasts. Moreover, we present a non-parametric bootstrap approach to construct and update prediction intervals, and compare the point and interval forecast accuracy with some naive benchmark methods. The proposed methods are illustrated by the half-hourly electricity demand from Monday to Sunday in South Australia.

Keywords: functional principal component analysis; multivariate time series; ordinary least-squares regression; penalised least-squares regression; roughness penalty; seasonal time series

1. Introduction

Accurate forecasts of electricity demand are an important aspect in the development of any model for electricity planning. The form of the electricity demand forecast depends on the type of planning and the accuracy that is required. Research on electricity demand forecasting usually considers three major forms: long-term forecasts for generator planning, medium-term forecasts for generator maintenance, and short-term forecasts for daily operation. Electricity demand forecasting is also important for trading purposes and, more recently, for the management of smart grids and smart homes.

Most of the literature on demand forecasting has concentrated on short-, medium-, and long-term forecasts, with little attention paid to the problem of very short-term (such as minute-by-minute) demand forecasting. Notable exceptions include the works of Charytoniuk and Chen [6], Liu et al. [27], Taylor [40], and Trudnowski et al. [43]. In the study of Liu et al. [27], a neural network (NN) outperforms a simple non-seasonal autoregressive model. Charytoniuk and Chen [6] compared forecasts from several differently structured NNs. Based on minute-by-minute British electricity demand, Taylor [40] proposed the adaptive Holt-Winters method and an intraday cycle exponential smoothing method for prediction from 10 to 30 min ahead. The approach of Trudnowski et al. [43] to very short-term prediction uses hourly data to generate hourly forecasts, which are then interpolated to produce very short-term forecasts.

ISSN 0266-4763 print/ISSN 1360-0532 online. © 2013 Taylor & Francis. http://dx.doi.org/10.1080/02664763.2012.740619 http://www.tandfonline.com

When historical data of electricity demand are the only available information, we present some functional time series (TS) modelling and forecasting methods by borrowing strength from functional data analysis. Functional data analysis has been popularised by recent developments in modern technology that collects and stores high-dimensional data [13,33]. However, few functional modelling and forecasting methods have been applied to electricity demand. For example, Hyndman and Fan [21] utilised a semiparametric regression to forecast long-term electricity peak demand, while Antoch et al. [2] and Goia et al. [14] forecasted medium-term electricity demand through the functional linear regression [33, Chapter 16].

The techniques proposed in this paper differ from those of Hyndman and Fan [21], Antoch et al. [2], and Goia et al. [14] in three aspects. First, the functional methods slice a seasonal univariate TS of electricity demand into segments and treat them as a TS of curves. The idea of forming a TS of curves from a seasonal univariate TS is not new and has been considered by several authors, including Aneiros-Pérez and Vieu [1] and Besse et al. [3]. Here, we apply this idea to model and forecast very short-term electricity demand. Second, as the daily electricity demand is sequentially observed through time, we adopt a forecast updating method of Shang and Hyndman [36], who presented an algorithm for updating the prediction of monthly sea surface temperature. In this paper, we apply it to forecast very short-term electricity demand. Third, we do not intend to incorporate explanatory variables (such as temperature and dummy variables) into our regression model, because such information may not be available in very short-term electricity demand forecasting. Because the curves are continuous and of infinite dimension, the functional methods allow us to model and forecast very short-term electricity demand, which separates functional data analysis from multivariate data analysis.

The proposed methods are illustrated by half-hourly electricity demand (in megawatts) in South Australia from 6 July 1997 (Sunday) to 31 March 2007 (Saturday). Since the intraday pattern of electricity demand varies across different days of a week, we divide the data set into 7 weekly data sets consisting of electricity demand from Monday to Sunday [36]. For example, let $\{Z_w, w \in [1, N]\}$ be a seasonal univariate TS of half-hourly electricity demand on Mondays from 7 July 1997 to 26 March 2007, which has been observed at $N = 24{,}384$ equispaced time points. Using the paradigm of functional data analysis, we divide the observed 24,384 equispaced time points into $n = 508$ curves and then consider that each curve has a common and compact support $[0, 48)$. The TS of curves is given by

$$y_t(x) = \{Z_w,\ w \in (p(t-1),\, pt]\}, \qquad t = 1, \ldots, 508 \ \text{and} \ p = 48, \tag{1}$$

where $x$ is a continuous time variable bounded within $[0, 48)$. By interpolation, the set of observations in each week can be considered as a few discretised data points of a continuous, infinite-dimensional, and square-integrable function in Equation (1). This is the essence of functional data forecasting: the forecasts can be made in a very small unit, such as minute by minute, even though the data used for interpolation and evaluation are half-hourly. The problem of interest is to forecast $y_{n+h}(x)$ from the historical curves $\{y_1(x), \ldots, y_n(x)\}$, where $h$ represents a forecast horizon.
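The slicing in Equation (1) amounts to cutting the series into consecutive blocks of length p. A minimal sketch, assuming the series is held in a plain Python list whose length is an exact multiple of p; the function name `slice_into_curves` and the toy numbers are illustrative and do not come from the paper:

```python
def slice_into_curves(series, p):
    """Split a univariate series Z_1..Z_N into N/p consecutive curves of p points.

    Curve t (1-based) holds Z_w for w in (p(t-1), pt], as in Equation (1).
    """
    if p <= 0 or len(series) % p != 0:
        raise ValueError("series length must be a multiple of a positive p")
    return [list(series[start:start + p]) for start in range(0, len(series), p)]


# Toy example (hypothetical data): 12 observations with p = 4 give 3 curves.
toy_series = list(range(1, 13))
toy_curves = slice_into_curves(toy_series, 4)
```

With the Monday data described above, N = 24,384 and p = 48 would give the 508 curves used in the paper.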

The outline of this paper is as follows. Section 2 introduces the electricity demand data that motivated this empirical paper. In Section 3, we present the functional modelling and forecasting methods utilising functional principal component analysis (FPCA). As the most recent data are sequentially observed, Section 4 adopts a forecast updating technique of Shang and Hyndman [36] to improve the point forecast accuracy. In Section 5, a non-parametric bootstrap method is presented to construct and update prediction intervals. The evaluations of the point and interval forecast accuracy are given in Section 6. Conclusions are discussed in Section 7, along with some thoughts on how the methods developed here might be further extended.

[Figure 1: two panels, (a) and (b). Vertical axes: Demand (megawatts). Horizontal axes: Year (univariate display) and Half-hour (functional display).]

Figure 1. Exploratory plots suggesting that both a regular pattern and extreme electricity demand are present in the Monday electricity demand data between 7 July 1997 and 26 March 2007. (a) A univariate TS display of electricity demand on Mondays. There are 24,384 discrete time points. Each time point represents one dimension. (b) A functional TS display of electricity demand on Mondays. There are 508 curves. Each curve is constructed from 48 data points and is considered as a square-integrable function.

2. Data set

The data set consists of half-hourly electricity demand in South Australia (Adelaide) from 6 July 1997 (Sunday) to 31 March 2007 (Saturday). These data were obtained from the Australian Energy Market Operator (http://www.aemo.com.au). As an illustration, a univariate TS display of electricity demand on Mondays from 7 July 1997 to 26 March 2007 is presented in Figure 1(a), with the same data shown in Figure 1(b) as a TS of curves.

In Figure 1(b), some weeks show extreme electricity demand and are suspected to be outliers. Since the presence of outliers can seriously affect the forecast accuracy, we applied an outlier detection method of Hyndman and Shang [24]. This outlier detection method applies FPCA to reduce the dimensionality of curves down to two, and it flags a curve as an outlier if its first two principal component scores are far from their centre. As a surrogate of the original curves, the bivariate principal component scores can be easily ranked by the half-location depth of Tukey [44] and plotted via the bivariate bagplot of Rousseeuw et al. [35], from which outliers and inliers are separated. The detected outliers in Monday electricity demand correspond to the following dates: 15 November 1998, 14 January 2001, 18 February 2001, 19 January 2003, 15 February 2004, 28 November 2004, 22 January 2006, 5 March 2006, 10 December 2006, 4 February 2007, and 18 February 2007. These outliers reflect the extremely high electricity demand during the summer season from December to February and the holiday period in South Australia. By assigning zero weight to the outlying days, our forecasting method described below eliminates the effect of outliers. In Section 6.2, we compare the point forecast accuracy of the same functional principal component method with and without removing outliers.

3. Functional forecasting method

The functional forecasting method utilises FPCA, which plays an important role in the development of functional data analysis [36]. There are a number of studies on the statistical properties of FPCA and applications of the methodology [32,33]. Papers covering the practical applications of FPCA include those of Hall et al. [18], Reiss and Ogden [34], Shen [37], and Hyndman and Shang [23]. Significant treatments of the theory of FPCA are given by Dauxois et al. [10], Cai and Hall [5], Hall et al. [17], Hall and Horowitz [15], and Hall and Hosseini-Nasab [16].

FPCA can be understood from either a covariance kernel viewpoint or a linear operator viewpoint. Here, we explain it from the covariance kernel viewpoint (see also [15,16]).

Definition 3.1 (Covariance operator) The covariance function of a stochastic process $y$ is defined to be a function $K : \mathcal{F} \times \mathcal{F} \to \mathbb{R}$, such that

$$K(u, v) = \mathrm{Cov}(y^c(u), y^c(v)) = E\{[y(u) - \mu(u)][y(v) - \mu(v)]\},$$

where $\mu$ represents the population mean function, $y^c$ represents the decentralised stochastic process, and $\mathcal{F}$ symbolises a function space, such as Hilbert space.

Lemma 3.2 (Mercer's lemma) Assume that $K$ is continuous over $\mathcal{F} \times \mathcal{F}$. Then there is an orthonormal sequence $(\phi_k)$ of continuous functions in the square-integrable function space $L^2(\mathcal{F})$ and a non-increasing sequence $(\lambda_k)$ of positive numbers, such that

$$K(u, v) = \sum_{k=1}^{\infty} \lambda_k \phi_k(u) \phi_k(v), \qquad u, v \in \mathcal{F}.$$

The orthogonal basis functions are represented by $(\phi_k(x); k = 1, 2, \ldots)$, which are also known as functional principal components. The principal component scores $(\beta_k; k = 1, 2, \ldots)$ are given by the projection of $y^c$ in the direction of the $k$th eigenfunction $\phi_k$, that is, $\beta_k = \langle y^c, \phi_k \rangle$, where $\langle \cdot, \cdot \rangle$ represents the inner product. The principal component scores constitute an uncorrelated sequence of random variables with zero mean and variance $\lambda_k$. Sometimes, they are interpreted as the weights of the contribution of the functional principal component $\phi_k$ to the stochastic process $y^c$ [26]. In the context of electricity demand, these principal component scores describe patterns that are associated with the different weeks for a given day of the week.

In practice, the mean function ($\mu$), functional principal components ($\phi$), and functional principal component scores ($\beta$) can only be estimated through realisations of a stochastic process. Concretely, for each week $t = 1, \ldots, n$ that is considered, let $y_t(x)$ denote the electricity demand on week $t$ at intraday period $x \in [0, 48)$. The mean of functional curves is estimated by

$$\mu(x) = \frac{1}{n} \sum_{t=1}^{n} y_t(x), \tag{2}$$

where $\{y_1(x), \ldots, y_n(x)\}$ is a TS of curves. Denote $y^c(x) = [y_1(x) - \mu(x), y_2(x) - \mu(x), \ldots, y_n(x) - \mu(x)]^{T}$ as a vector of decentralised functional curves.

Via FPCA, each curve can be approximated by the sum of functional principal components and their associated principal component scores,

$$y_t(x) = \mu(x) + \sum_{k=1}^{K} \phi_k(x) \beta_{k,t} + e_t(x), \tag{3}$$

where $\{\phi_1(x), \ldots, \phi_K(x)\}$ represents a set of estimated functional principal components, $\{\beta_{1,t}, \ldots, \beta_{K,t}\}$ represents a set of estimated principal component scores for week $t$, $e_t(x)$ is the zero-mean residual function that captures the excluded functional principal components, and $K < n$ is the number of retained functional principal components. The optimal value of $K$ can be determined using a holdout method (see Section 4.2 for details).
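To make the decomposition in Equation (3) concrete, the sketch below works on curves observed on a common grid. It is a discretised stand-in and not the estimator used in the paper: the eigenvectors of the sample covariance matrix on the grid play the role of the functional principal components, they are found by power iteration with deflation, and grid spacing (quadrature weights) is ignored. The names `mean_curve` and `fpca` and the toy data are hypothetical.

```python
import math


def mean_curve(curves):
    """Pointwise mean of the curves, as in Equation (2)."""
    n = len(curves)
    return [sum(c[i] for c in curves) / n for i in range(len(curves[0]))]


def fpca(curves, K, iters=500):
    """Return (mu, components, scores) for curves observed on a common grid.

    components[k] is a unit-length vector standing in for phi_k on the grid;
    scores[t][k] is the projection of centred curve t on components[k].
    """
    n, p = len(curves), len(curves[0])
    mu = mean_curve(curves)
    centred = [[c[i] - mu[i] for i in range(p)] for c in curves]
    # Sample covariance matrix of the centred curves on the grid.
    cov = [[sum(r[i] * r[j] for r in centred) / n for j in range(p)] for i in range(p)]
    components = []
    for k in range(K):
        start = [1.0 / (i + k + 1) for i in range(p)]  # arbitrary non-zero start
        scale = math.sqrt(sum(x * x for x in start))
        v = [x / scale for x in start]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
        components.append(v)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    scores = [[sum(r[i] * comp[i] for i in range(p)) for comp in components] for r in centred]
    return mu, components, scores


# Toy curves (hypothetical): mean plus two orthonormal components with zero-mean scores.
phi1 = [0.5, 0.5, 0.5, 0.5]
phi2 = [0.5, -0.5, 0.5, -0.5]
mu_true = [10.0, 20.0, 30.0, 40.0]
s1 = [-3.0, -1.0, 1.0, 3.0]
s2 = [1.0, -1.0, -1.0, 1.0]
toy = [[mu_true[i] + a * phi1[i] + b * phi2[i] for i in range(4)] for a, b in zip(s1, s2)]
mu, comps, scores = fpca(toy, K=2)
```

The components are only identified up to sign, so comparisons with a known basis should be made on absolute values. In practice a linear algebra library or a functional data package would replace the hand-written power iteration.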

As pointed out by Shang and Hyndman [36], since the principal component scores are uncorrelated with each other by construction, it is appropriate to forecast each series $\{(\beta_{k,1}, \ldots, \beta_{k,n}); k = 1, \ldots, K\}$ using a univariate TS model, such as the autoregressive integrated moving average (ARIMA) model of Box et al. [4] used in this paper. Note that the lagged cross-correlations are not necessarily zero, but they are likely to be small because the contemporaneous correlations are zero [36,38]. Alternatively, a multivariate autoregressive moving average model can be applied to the matrix of principal component scores.

Conditioning on the historical curves $\mathcal{I} = \{y_1(x), \ldots, y_n(x)\}$ and the fixed functional principal components $\Phi = \{\phi_1(x), \ldots, \phi_K(x)\}$, the forecasted curves are expressed as

$$y^{\mathrm{TS}}_{n+h|n}(x) = E[y_{n+h}(x) \mid \mathcal{I}, \Phi] = \mu(x) + \sum_{k=1}^{K} \phi_k(x) \beta^{\mathrm{TS}}_{k,n+h|n}, \tag{4}$$

where $\beta^{\mathrm{TS}}_{k,n+h|n}$ denotes an $h$-step-ahead forecast of $\beta_{k,n+h}$.
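A small sketch of Equation (4): each score series is forecast separately and the forecasts are recombined with the mean curve and the components. The paper uses ARIMA models for the scores; to keep the example free of dependencies, an AR(1) model with intercept fitted by least squares is swapped in here as a stand-in. The function names and the toy numbers are hypothetical.

```python
def ar1_forecast(series, h=1):
    """h-step-ahead forecast from an AR(1) model with intercept, fitted by least squares."""
    x, y = series[:-1], series[1:]
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    # A constant series carries no slope information, so fall back to its mean.
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx if sxx > 0 else 0.0
    intercept = y_bar - slope * x_bar
    value = series[-1]
    for _ in range(h):
        value = intercept + slope * value
    return value


def forecast_curve(mu, components, scores, h=1):
    """Equation (4): mean curve plus components weighted by forecasted scores."""
    K = len(components)
    beta_ts = [ar1_forecast([row[k] for row in scores], h) for k in range(K)]
    curve = [mu[i] + sum(components[k][i] * beta_ts[k] for k in range(K))
             for i in range(len(mu))]
    return curve, beta_ts


# Toy inputs (hypothetical): two grid points, two components, five past weeks.
toy_mu = [10.0, 20.0]
toy_components = [[1.0, 0.0], [0.0, 1.0]]
toy_scores = [[0.0, 2.0], [1.0, 2.0], [1.5, 2.0], [1.75, 2.0], [1.875, 2.0]]
toy_forecast, toy_beta = forecast_curve(toy_mu, toy_components, toy_scores, h=1)
```

The first toy score series follows b_t = 1 + 0.5 b_{t-1} exactly, so its one-step forecast is 1.9375; the second is constant at 2.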

4. Updating point forecasts

When a TS of curves consists of segments of a seasonal univariate TS, the most recent curve may not be complete. When we have observed the first $m_0$ time periods of $y_{n+1}(x)$, denoted by $y_{n+1}(x_e) = [y_{n+1}(x_1), \ldots, y_{n+1}(x_{m_0})]^{T}$, we are interested in forecasting the data in the remainder of week $n + 1$, denoted by $\{y_{n+1}(x_l); m_0 < l \le p\}$. Note that $y_{n+1}(x_e)$ is a vector of the most recent observations, whereas $y_{n+1}(x_l)$ is the forecast of a continuous function in the remaining time periods. However, the TS method described in Section 3 does not use the most recent observations. Instead, using Equation (4), the TS forecast of $y_{n+1}(x_l)$ is given by

$$y^{\mathrm{TS}}_{n+1|n}(x_l) = E[y_{n+1}(x_l) \mid \mathcal{I}_l, \Phi_l] = \mu(x_l) + \sum_{k=1}^{K} \phi_k(x_l) \beta^{\mathrm{TS}}_{k,n+1|n}, \tag{5}$$

for $m_0 < l \le p$, where $\mathcal{I}_l = \{y_1(x_l), \ldots, y_n(x_l)\}$ denotes a collection of historical curves corresponding to the remaining time periods; $\Phi_l = \{\phi_1(x_l), \ldots, \phi_K(x_l)\}$ is a set of estimated functional principal components, corresponding to the remaining time periods; and $\mu(x_l)$ is the mean function corresponding to the remaining time periods.

To improve point forecast accuracy, the method introduced in Section 4.1 updates the point forecasts by incorporating the partially observed data.

4.1 Penalised least-squares method

The forecast updating method is developed on the basis of regression analysis (see also [36]). Let $F_e$ be the $m_0 \times K$ matrix whose $(j, k)$th entry is $\phi_k(x_j)$ for $1 \le j \le m_0$ and $1 \le k \le K$. Let $\mu(x_e) = [\mu(x_1), \ldots, \mu(x_{m_0})]^{T}$ be a vector of estimated means, $\beta_{n+1} = [\beta_{1,n+1}, \ldots, \beta_{K,n+1}]^{T}$ be a vector of regression coefficients, and $\varepsilon_{n+1}(x_e) = [\varepsilon_{n+1}(x_1), \ldots, \varepsilon_{n+1}(x_{m_0})]^{T}$ be a vector of residuals. As the mean-adjusted data in the most recent curve, $y^c_{n+1}(x_e) = y_{n+1}(x_e) - \mu(x_e)$, become available, we have a regression equation expressed as

$$y^c_{n+1}(x_e) = F_e \beta_{n+1} + \varepsilon_{n+1}(x_e). \tag{6}$$

The $\beta_{n+1}$ can be estimated via the ordinary least-squares (OLS) method, giving

$$\beta^{\mathrm{OLS}}_{n+1} = (F_e^{T} F_e)^{-1} F_e^{T} y^c_{n+1}(x_e), \tag{7}$$

provided that $(F_e^{T} F_e)^{-1}$ exists.

The possible singularity of the regressors may cause inaccurate estimation of the regression coefficients obtained from the OLS method [25]. To rectify such a problem, shrinkage methods are often employed, such as the ridge regression of Hoerl and Kennard [20] and the penalised least-squares (PLS) of Shang and Hyndman [36]. While the ridge regression coefficient estimates are shrunk towards zero, the PLS regression coefficient estimates are shrunk towards $\beta^{\mathrm{TS}}_{n+1|n}$. Because the PLS regression coefficient estimates take historical TS forecasts into account, this often leads to a better forecast accuracy than the ridge regression (see [38] for an example).

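As noted by Shang and Hyndman [36], the PLS regression coefficient estimates minimise a penalised residual sum of squares

$$\arg\min_{\beta_{n+1}} \left\{ (y^c_{n+1}(x_e) - F_e \beta_{n+1})^{T} (y^c_{n+1}(x_e) - F_e \beta_{n+1}) + \lambda (\beta_{n+1} - \beta^{\mathrm{TS}}_{n+1|n})^{T} (\beta_{n+1} - \beta^{\mathrm{TS}}_{n+1|n}) \right\}. \tag{8}$$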

The first term in Equation (8) measures the so-called goodness of fit, while the second term penalises the departure of the regression coefficient estimates from the TS forecasted regression coefficient estimates [25]. The $\beta^{\mathrm{PLS}}_{n+1}$ obtained can thus be considered as a tradeoff between these two terms, conditional on a penalty parameter $\lambda$. By taking the first derivative with respect to $\beta_{n+1}$ in Equation (8) and setting it to zero, we obtain

$$\begin{aligned}
\beta^{\mathrm{PLS}}_{n+1} &= (F_e^{T} F_e + \lambda I_K)^{-1} [F_e^{T} y^c_{n+1}(x_e) + \lambda \beta^{\mathrm{TS}}_{n+1|n}] \\
&= (F_e^{T} F_e + \lambda I_K)^{-1} F_e^{T} y^c_{n+1}(x_e) + \lambda (F_e^{T} F_e + \lambda I_K)^{-1} \beta^{\mathrm{TS}}_{n+1|n}.
\end{aligned} \tag{9}$$
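The first-order condition behind Equation (9) can be written out as follows. This is the standard least-squares calculation, added here for the reader; the symbol $S(\beta_{n+1})$ is introduced only as shorthand for the objective in Equation (8).

```latex
\begin{align*}
S(\beta_{n+1}) &= (y^c_{n+1}(x_e) - F_e\beta_{n+1})^{T}(y^c_{n+1}(x_e) - F_e\beta_{n+1})
  + \lambda(\beta_{n+1} - \beta^{\mathrm{TS}}_{n+1|n})^{T}(\beta_{n+1} - \beta^{\mathrm{TS}}_{n+1|n}), \\
\frac{\partial S}{\partial \beta_{n+1}} &= -2F_e^{T}\bigl(y^c_{n+1}(x_e) - F_e\beta_{n+1}\bigr)
  + 2\lambda\bigl(\beta_{n+1} - \beta^{\mathrm{TS}}_{n+1|n}\bigr) = 0, \\
(F_e^{T}F_e + \lambda I_K)\,\beta_{n+1} &= F_e^{T}y^c_{n+1}(x_e) + \lambda\beta^{\mathrm{TS}}_{n+1|n}.
\end{align*}
```

For $\lambda > 0$ the matrix $F_e^{T}F_e + \lambda I_K$ is positive definite and hence invertible, so the last line can be solved for $\beta_{n+1}$ even when $F_e^{T}F_e$ itself is singular.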

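Plugging Equation (7) into Equation (9), the regression coefficient estimates of the PLS method are given by

$$\begin{aligned}
\beta^{\mathrm{PLS}}_{n+1} &= (F_e^{T} F_e + \lambda I_K)^{-1} F_e^{T} F_e \, \beta^{\mathrm{OLS}}_{n+1} + \lambda (F_e^{T} F_e + \lambda I_K)^{-1} \beta^{\mathrm{TS}}_{n+1|n} \\
&= \left[ I_K - \lambda (F_e^{T} F_e + \lambda I_K)^{-1} \right] \beta^{\mathrm{OLS}}_{n+1} + \lambda (F_e^{T} F_e + \lambda I_K)^{-1} \beta^{\mathrm{TS}}_{n+1|n}.
\end{aligned} \tag{10}$$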

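When the penalty parameter $\lambda \to 0$, $\beta^{\mathrm{PLS}}_{n+1}$ approaches $\beta^{\mathrm{OLS}}_{n+1}$, provided that $(F_e^{T} F_e)^{-1}$ exists; when $\lambda \to \infty$, $\beta^{\mathrm{PLS}}_{n+1}$ approaches $\beta^{\mathrm{TS}}_{n+1|n}$; when $0 < \lambda < \infty$, $\beta^{\mathrm{PLS}}_{n+1}$ is a weighted average between $\beta^{\mathrm{OLS}}_{n+1}$ and $\beta^{\mathrm{TS}}_{n+1|n}$ [36].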
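The limiting behaviour described above can be checked numerically. The sketch below evaluates the first line of Equation (9) with a small hand-written linear solver so that it needs no external library; `solve`, `pls_update`, and the toy matrices are hypothetical names and values chosen for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def pls_update(f_e, y_c, beta_ts, lam):
    """First line of Equation (9): (F'F + lam I)^{-1} (F'y + lam beta_ts)."""
    K, m0 = len(beta_ts), len(y_c)
    ftf = [[sum(f_e[j][a] * f_e[j][b] for j in range(m0)) for b in range(K)] for a in range(K)]
    fty = [sum(f_e[j][a] * y_c[j] for j in range(m0)) for a in range(K)]
    lhs = [[ftf[a][b] + (lam if a == b else 0.0) for b in range(K)] for a in range(K)]
    rhs = [fty[a] + lam * beta_ts[a] for a in range(K)]
    return solve(lhs, rhs)


# Toy case (hypothetical): m0 = 3 observed periods, K = 2 components.
toy_f = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_y = [1.0, 2.0, 3.0]           # fitted exactly by the OLS coefficients [1, 2]
toy_beta_ts = [0.5, 2.5]          # TS forecast of the scores
beta_small = pls_update(toy_f, toy_y, toy_beta_ts, 0.0)   # lam = 0: OLS solution
beta_mid = pls_update(toy_f, toy_y, toy_beta_ts, 1.0)     # compromise between the two
beta_large = pls_update(toy_f, toy_y, toy_beta_ts, 1e9)   # close to the TS forecast
```

With `lam = 0.0` the call reduces to OLS and raises `ValueError` if the cross-product matrix is singular, which is the situation the penalty is meant to avoid.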

As pointed out by Shen and Huang [38], $\beta^{\mathrm{PLS}}_{n+1}$ also has a Bayesian interpretation. Treating $\beta^{\mathrm{TS}}_{n+1|n}$ as prior information, $\beta_{n+1}$ is taken to follow a normal distribution with mean $\beta^{\mathrm{TS}}_{n+1|n}$ and variance $\sigma^2_{\beta}$. The error term $\varepsilon_{n+1}(x_e)$ is modelled as a normal distribution with mean 0 and variance $\sigma^2_{\varepsilon}$. The posterior density of $\beta_{n+1}$ is maximised by the PLS expression in Equation (10), and the penalty parameter $\lambda = \sigma^2_{\varepsilon} / \sigma^2_{\beta}$ can also be determined by a restricted maximum-likelihood estimator.

The forecast of $y_{n+1}(x_l)$ obtained from the PLS regression is given by

$$y^{\mathrm{PLS}}_{n+1}(x_l) = E[y_{n+1}(x_l) \mid \mathcal{I}_l, \Phi_l] = \mu(x_l) + \sum_{k=1}^{K} \phi_k(x_l) \beta^{\mathrm{PLS}}_{k,n+1}, \tag{11}$$

for $m_0 < l \le p$.


4.2 Selections of penalty parameter and number of principal components

We split the electricity demand data into a training sample and a testing sample; the testing sample consists of one year of electricity demand, from weeks $n - 51$ to $n$ (where $n$ represents the total number of weeks, excluding the outliers). Within the training sample, we further split the data into a training set and a validation set; the validation set includes electricity demand from weeks $n - 103$ to $n - 52$, excluding the outliers.

For each candidate number of principal components, we apply the TS method to the training set and obtain forecasts for the data in the validation set. Among $K = 1, 2, \ldots, 10$, the optimal number of principal components is determined by minimising the mean absolute percentage error (MAPE) within the validation set. Because the MAPE captures the proportionality between the forecast error and actual demand, it has become an industry standard forecast error measure in demand forecasting [12,41]. The MAPE can be expressed as

$$\mathrm{MAPE} = \frac{1}{pq} \sum_{j=1}^{q} \sum_{i=1}^{p} \left| \frac{y_{m-j+1}(x_i) - y_{m-j+1|m-j}(x_i)}{y_{m-j+1}(x_i)} \right| \times 100, \tag{12}$$

where $p = 48$ represents the number of observations in each day of a week, $q = 52$ represents the number of weeks in the validation set, and $m$ denotes the index corresponding to the maximum number of weeks in the validation set (i.e. $m = n - 52$). For each day of a week, Table 1 presents the MAPE for each candidate number of principal components, using the TS method. As a result, the optimal number of principal components is determined to be $K = 5$. In addition, $K = 5$ explains at least 95% of the total amount of variation in the training set for all 7 days.
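Equation (12) is a plain average of absolute percentage errors over all grid points of all validation curves. A minimal sketch, assuming the actual and forecast curves are stored as equally shaped lists of lists and that demand is strictly positive so the division is safe; the function name and the toy values are illustrative only:

```python
def mape(actual_curves, forecast_curves):
    """Equation (12): mean absolute percentage error over q curves of p points each."""
    total, count = 0.0, 0
    for actual, forecast in zip(actual_curves, forecast_curves):
        for y, y_hat in zip(actual, forecast):
            total += abs((y - y_hat) / y)
            count += 1
    return 100.0 * total / count


# Toy example (hypothetical): q = 2 curves, p = 2 points each.
toy_actual = [[100.0, 200.0], [400.0, 500.0]]
toy_forecast = [[110.0, 190.0], [400.0, 450.0]]
toy_mape = mape(toy_actual, toy_forecast)  # errors of 10%, 5%, 0% and 10%
```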

Having determined the optimal $K$, the optimal values of the penalty parameter $\lambda$ for different updating periods are also determined by minimising the MAPE within the validation set. We utilise a one-dimensional optimisation algorithm of Nelder and Mead [30] to minimise the MAPE and to find its corresponding value of $\lambda$. Computationally, this minimisation algorithm is given by the optimise function in R [31]. Note that it may be possible to improve forecast accuracy for the PLS method if the values of $K$ and $\lambda$ were estimated jointly. However, this would come at the cost of much slower computational speed, since the optimisation algorithm has to be implemented for each value of $K$. This is why we chose to estimate $K$ and $\lambda$ separately. As an illustration, the optimal penalty parameters for different updating periods on Mondays are given in Table 2. As more and more data points are observed in the most recent but incomplete curve, the penalty parameter $\lambda \to 0$ and thus $\beta^{\mathrm{PLS}}_{n+1}$ approaches $\beta^{\mathrm{OLS}}_{n+1}$ [25].
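Choosing the penalty parameter is a one-dimensional minimisation of the validation error as a function of $\lambda$. The sketch below uses a plain golden-section search as a simple stand-in for the one-dimensional optimiser mentioned above; it is not the routine used in the paper. The search interval and the quadratic `toy_validation_error` are assumptions made for illustration; in practice the objective would be the validation-set MAPE of the updated forecasts for a given $\lambda$.

```python
import math


def golden_section_minimise(f, lower, upper, tol=1e-6):
    """Minimise a unimodal function f on [lower, upper] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def toy_validation_error(lam):
    """Hypothetical stand-in for the validation MAPE as a function of lambda."""
    return (lam - 0.3) ** 2 + 6.9


best_lambda = golden_section_minimise(toy_validation_error, 0.0, 1.0)
```

Golden-section search assumes a single minimum on the interval; if the validation error has several local minima, a coarse grid scan before the search is a common safeguard.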

Table 1. The MAPE (%) for each candidate number of principal components.

K    Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Sunday
1    8.04    7.87     7.59       7.77      7.49    7.80      6.17
2    7.27    7.13     6.72       6.90      6.49    7.05      5.46
3    7.06    6.98     6.56       6.80      6.27    6.81      5.25
4    6.96    7.05     6.59       6.79      6.27    6.78      5.24
5    6.96    7.03     6.59       6.75      6.27    6.78      5.21
6    6.97    7.03     6.59       6.74      6.25    6.78      5.20
7    6.96    7.01     6.58       6.74      6.25    6.78      5.19
8    6.97    7.01     6.58       6.74      6.25    6.78      5.19
9    6.96    7.01     6.57       6.74      6.25    6.78      5.19
10   6.96    7.01     6.58       6.74      6.25    6.78      5.19

Note: The optimal number of principal components is K = 5.


Table 2. For different updating periods, the optimal penalty parameters in the PLS method are determined by minimising the MAPE within the validation set for electricity demand on Mondays.

Updating period   Optimal λ    Updating period   Optimal λ
1:00–23:30        0.1378       12:30–23:30       0.6998
1:30–23:30        0.2038       13:00–23:30       0.7376
2:00–23:30        0.2566       13:30–23:30       0.7716
2:30–23:30        0.2469       14:00–23:30       0.8115
3:00–23:30        0.2548       14:30–23:30       0.6912
3:30–23:30        0.3240       15:00–23:30       0.6669
4:00–23:30        0.2016       15:30–23:30       0.7600
4:30–23:30        0.2196       16:00–23:30       0.6908
5:00–23:30        0.2409       16:30–23:30       0.2510
5:30–23:30        0.2574       17:00–23:30       0.9572
6:00–23:30        0.3620       17:30–23:30       0.9947
6:30–23:30        0.7108       18:00–23:30       0.9937
7:00–23:30        0.9874       18:30–23:30       0.9951
7:30–23:30        0.3475       19:00–23:30       0.5149
8:00–23:30        0.2196       19:30–23:30       0.1510
8:30–23:30        0.4080       20:00–23:30       0.1090
9:00–23:30        0.4438       20:30–23:30       0.0192
9:30–23:30        0.4803       21:00–23:30       0.0048
10:00–23:30       0.5147       21:30–23:30       0.0048
10:30–23:30       0.5573       22:00–23:30       0.0048
11:00–23:30       0.5917       22:30–23:30       0.0048
11:30–23:30       0.6262       23:00–23:30       0.0048
12:00–23:30       0.6656       23:30–23:30       0.0048

5. Interval forecast methods

Interval forecasts are important for assessing the probabilistic uncertainty associated with point forecasts [7,8]. In electricity demand forecasting, Taylor [39] considered the construction of interval forecasts based on an assumption of parametric distribution. In this section, we present a non-parametric bootstrap method to construct and update prediction intervals.

From Equation (3), there are two sources of errors that need to be considered: errors in estimating the regression coefficients and errors in the model residuals. In Section 5.1, we describe a non-parametric bootstrap method to construct prediction intervals for the TS method. Based on Equation (10), Section 5.2 presents a non-parametric bootstrap method to update prediction intervals by incorporating the most recent observations of an incomplete curve.

5.1 Non-parametric bootstrap method to construct prediction intervals

Let the one-step-ahead forecast errors associated with principal component scores be given by

$$\pi_{k,j} = \beta_{k,n-j+1} - \beta_{k,n-j+1|n-j}, \qquad j = 1, \ldots, n - K. \tag{13}$$

To avoid singularity and non-invertibility problems, the TS method requires at least $K$ curves, where $K$ is the number of retained principal components.

The one-step-ahead forecast errors $\{\pi_k; k = 1, 2, \ldots, K\}$ are assumed to be independent and identically distributed (i.i.d.), and this assumption has been checked using the Box–Pierce test. For $K = 5$, the p-values of the Box–Pierce test averaged across different days are 0.42, 0.56, 0.36, 0.57, and 0.29, respectively. Based on the p-values, the i.i.d. assumption is not rejected at the customary 5% level of significance. Because $(\pi_1, \pi_2, \ldots, \pi_K)$ are i.i.d., they can be sampled with replacement to produce a bootstrap replication of $\beta_{k,n+h}$:

$$\beta^{b}_{k,n+h|n} = \beta_{k,n+h|n} + \pi^{b}_{k,*}, \qquad b = 1, \ldots, B, \quad k = 1, \ldots, K, \tag{14}$$

where $\pi^{b}_{k,*}$ denotes a bootstrap sample, which is drawn with replacement from $\{\pi_{k,1}, \ldots, \pi_{k,n-K}\}$, and $B = 1000$ is the number of bootstrap replications.

Because the first $K$ functional principal components approximate the data relatively well, the model residuals should contribute nothing but i.i.d. random noise [25]. Therefore, we can bootstrap the model residual function $e^{b}_{n+h|n}(x)$ in Equation (3) by sampling with replacement from the residuals $\{e_1(x), \ldots, e_n(x)\}$.

Following the earlier work by Hyndman and Shang [25], adding these two components of variability, and assuming that they are uncorrelated with each other, we obtain B forecast variants of $y_{n+h}(x)$,

$$y^{b}_{n+h|n}(x) = \mu(x) + \sum_{k=1}^{K} \phi_k(x)\,\beta^{b}_{k,n+h|n} + e^{b}_{n+h|n}(x). \tag{15}$$

Hence, the $100(1-\alpha)\%$ prediction intervals are defined as the $\alpha/2$ and $(1-\alpha/2)$ empirical quantiles of $\{y^{1}_{n+h|n}(x), \ldots, y^{B}_{n+h|n}(x)\}$, where $\alpha$ is customarily chosen to be 0.05.
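The bootstrap procedure of Equations (14) and (15) can be sketched in a few lines of Python. This is a minimal illustration rather than the ftsa implementation: the function names, the argument layout (curves stored as plain lists on a common grid), and the linear-interpolation quantile rule are all assumptions made for the example.

```python
import random


def empirical_quantile(sorted_values, prob):
    """Quantile of an already sorted list (linear interpolation, an assumed rule)."""
    position = prob * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return (1 - weight) * sorted_values[lower] + weight * sorted_values[upper]


def bootstrap_interval(mu, phi, beta_forecast, score_errors, residual_curves,
                       alpha=0.05, n_boot=1000, seed=0):
    """Pointwise bootstrap prediction interval for one forecast curve.

    mu              : mean function on the grid, length p
    phi             : K basis functions, each of length p
    beta_forecast   : K forecasted principal component scores
    score_errors    : K lists of past one-step-ahead score forecast errors
    residual_curves : n past model residual curves, each of length p
    """
    rng = random.Random(seed)
    p, n_components = len(mu), len(phi)
    replicates = [[] for _ in range(p)]  # replicates[i] collects y^b(x_i)
    for _ in range(n_boot):
        # Equation (14): perturb each score forecast by a resampled error.
        beta_b = [beta_forecast[k] + rng.choice(score_errors[k])
                  for k in range(n_components)]
        # Resample one whole residual curve with replacement.
        e_b = rng.choice(residual_curves)
        # Equation (15): rebuild the curve from mean, scores, and residual.
        for i in range(p):
            fitted = sum(phi[k][i] * beta_b[k] for k in range(n_components))
            replicates[i].append(mu[i] + fitted + e_b[i])
    lower, upper = [], []
    for values in replicates:
        values.sort()
        lower.append(empirical_quantile(values, alpha / 2))
        upper.append(empirical_quantile(values, 1 - alpha / 2))
    return lower, upper
```

Resampling whole residual curves, rather than individual grid points, keeps the within-curve dependence of the residuals intact; this matches the description above of sampling from $\{e_1(x), \ldots, e_n(x)\}$.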

5.2 Non-parametric bootstrap method to update prediction intervals

As pointed out by Hyndman and Shang [25], the prediction intervals can also be updated through a non-parametric bootstrap method. First, we obtain B replications of the TS forecasted regression coefficient estimates, $\beta^{\mathrm{TS}}_{n+1|n}$, from which we obtain B replications of the PLS forecasted regression coefficient estimates, according to Equation (10). Based on $\beta^{b,\mathrm{PLS}}_{n+1}$, we obtain B replications of

$$y^{b,\mathrm{PLS}}_{n+1}(x_l) = \mu(x_l) + \sum_{k=1}^{K} \phi_k(x_l)\,\beta^{b,\mathrm{PLS}}_{k,n+1} + e^{b}_{n+1}(x_l), \tag{16}$$

where $e^{b}_{n+1}(x_l)$ is obtained by randomly sampling with replacement from the residual functions $\{e_1(x_l), \ldots, e_n(x_l)\}$, corresponding to the remaining time periods. Hence, the $100(1-\alpha)\%$ prediction intervals for the updated forecasts are defined as the $\alpha/2$ and $(1-\alpha/2)$ empirical quantiles of $\{y^{1,\mathrm{PLS}}_{n+1}(x_l), \ldots, y^{B,\mathrm{PLS}}_{n+1}(x_l)\}$.

5.3 Evaluation of interval forecasts

Following the earlier work by Shang and Hyndman [36], we use the averaged coverage probability deviance and the averaged half-width of prediction intervals to assess interval forecast accuracy. The coverage probability deviance is calculated as the absolute difference between the empirical and nominal coverage probabilities. In general, the smaller the coverage probability deviance is, the better the interval forecast accuracy a method provides, subject to the same averaged half-width of prediction intervals [25]. As a complement to the coverage probability deviance, the averaged half-width of prediction intervals assesses which approach gives narrower prediction intervals. It can be expressed as

$$\frac{1}{pq} \sum_{j=1}^{q} \sum_{i=1}^{p} \left| y^{1-\alpha/2}_{n-j+1|n-j}(x_i) - y^{\alpha/2}_{n-j+1|n-j}(x_i) \right|, \tag{17}$$

where p = 48 is the number of half-hourly periods in a curve and q = 52 is the number of forecasted curves in the testing sample. The narrower the averaged half-width of prediction intervals is, the more informative the method is, subject to the empirical coverage probability being close to the nominal coverage probability [25].
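Both interval accuracy measures can be computed directly from the held-out curves and their interval bounds. The sketch below is illustrative only: the function name and the list-of-lists layout are assumptions, and the width term follows Equation (17) as printed, that is, the absolute difference between the upper and lower quantiles.

```python
def interval_scores(actual, lower, upper, nominal=0.95):
    """Return (coverage probability deviance, averaged interval width).

    actual, lower, upper : q curves each, every curve a list of p values.
    """
    count = 0
    covered = 0
    total_width = 0.0
    for y_curve, lo_curve, hi_curve in zip(actual, lower, upper):
        for y, lo, hi in zip(y_curve, lo_curve, hi_curve):
            count += 1
            covered += lo <= y <= hi       # True counts as 1
            total_width += abs(hi - lo)    # summand of Equation (17)
    empirical = covered / count
    return abs(empirical - nominal), total_width / count
```

Reporting the two numbers together matters because either one can be improved trivially at the expense of the other: very wide intervals cover almost everything, and very narrow intervals cover almost nothing.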

6. Results of very short-term electricity demand forecasting

In Section 6.1, we present the point forecasts using the TS and PLS methods described in Sections 3 and 4. The point forecasts are then compared with the point forecasts produced by some benchmark methods in Section 6.2. In Section 6.3, we compare the averaged coverage probability deviance and averaged half-width of prediction intervals using the non-parametric bootstrap method described in Sections 5.1 and 5.2.

6.1 Point forecasts

The forecasting method first decomposes a TS of curves into a number of functional principal components and their associated principal component scores. As an illustration, we display and attempt to interpret the first three functional principal components shown in the top panel of Figure 2. Clearly, the mean function illustrates a strong seasonal pattern with a peak at 18:00 and a trough at 5:00. The functional principal components are second-order effects, as indicated by their much smaller scales [38]. The first functional principal component models electricity demand in the afternoon and evening. While the second functional principal component models the contrast in the electricity demand between morning and evening, the third functional principal component models the contrast in the electricity demand between morning and afternoon.

We implemented the automatic algorithm of Hyndman and Khandakar [22], which is a stepwise approach to determine the optimal orders of an ARIMA model according to the minimum Akaike information criterion value. With the optimal ARIMA model, the principal component scores are forecasted, and their 80% and 95% prediction intervals are highlighted by the orange and yellow regions in the bottom panel of Figure 2.

By conditioning on the historical data and fixed functional principal components, the forecasts are obtained by multiplying the forecasted principal component scores with the fixed functional principal components, and then adding the fixed mean function [25]. As an example, Figure 3 displays the forecasted Monday electricity demand in the last week of available data (i.e. 26 March 2007), along with the 95% non-parametric prediction intervals described in Section 5.1.

6.2 Point forecast comparison with some benchmark methods

For the purpose of comparison, we investigate the point forecast accuracy of the seasonal autoregressive integrated moving average (SARIMA), seasonal version of the naive random walk (RW), and mean predictor (MP) methods. The MP method consists of predicting values at each day of week t + 1 by the empirical mean value for each day from the first week to the tth week (see also [25]). The seasonal version of the naive RW approach predicts new values at each day of week t + 1 by the observations at each day of week t. The stochastic nature of demand as a function of time has frequently been modelled with the SARIMA method [41]. The multiplicative SARIMA can be

Figure 2. The mean function, the first three functional principal components, and their associated principal component scores for the Monday electricity demand from 7 July 1997 to 26 March 2007 (excluding the outliers). The 80% and 95% prediction intervals of the principal component scores are shown by the orange and yellow regions. [Panels: "Main effects" (mean against half-hour) and "Interaction" (basis functions 1 to 3 against half-hour, top; coefficients 1 to 3 against week, bottom).]

Figure 3. The point forecasts of the Monday electricity demand on 26 March 2007 and the 95% non-parametric prediction intervals constructed via the TS method. [Axes: demand (megawatts) against half-hour; legend: forecasts, non-parametric prediction intervals.]

Figure 4. Boxplots of the averaged MAPE of the 52 iterative one-step-ahead point forecasts from Monday to Sunday using the MP, RW, SARIMA, TS(uni), NN, multivariate TS without removing outliers, multivariate TS, and PLS methods for different updating periods in the testing sample. [Horizontal axis: MP, RW, SARIMA, TS(uni), NN, TS(outlier), TS, PLS; vertical axis: MAPE.]

written as

$$\phi_p(L)\,\Phi_P(L^s)\,\nabla^d \nabla_s^D (y_t - c) = \theta_q(L)\,\Theta_Q(L^s)\,\varepsilon_t, \tag{18}$$

where $y_t$ is the demand in period t; c is a constant term; s = 48 × 7 = 336 is the number of periods in the seasonal cycle; L is the lag operator; $\nabla$ is the difference operator; $\nabla_s$ is the seasonal difference operator (e.g. $\nabla_s = 1 - L^s$); d and D are the orders of differencing; $\varepsilon_t$ is a white noise error term; and $\phi$, $\Phi$, $\theta$, and $\Theta$ are polynomial functions of orders p, P, q, and Q, respectively. This model can be expressed as ARIMA(p, d, q) × (P, D, Q)$_s$. Again, the automatic algorithm of Hyndman and Khandakar [22] is utilised to determine the orders of seasonal components and non-seasonal components. The optimal model selected is an ARIMA(2, 0, 1) × (0, 1, 0)$_{336}$.
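To make the selected model concrete, Equation (18) can be written out for the orders (2, 0, 1) × (0, 1, 0) with s = 336. The expansion below is a worked illustration; the coefficient names $\phi_1$, $\phi_2$, $\theta_1$, the auxiliary series $w_t$, and the sign convention (autoregressive terms with minus signs, moving average terms with plus signs) are assumptions, since the fitted values are not reported here.

```latex
% Assumed sign convention: AR terms enter with minus signs, MA terms with plus signs.
\begin{align*}
(1 - \phi_1 L - \phi_2 L^2)\,(1 - L^{336})\,(y_t - c) &= (1 + \theta_1 L)\,\varepsilon_t, \\
w_t &= (1 - L^{336})(y_t - c) = y_t - y_{t-336}, \\
w_t &= \phi_1 w_{t-1} + \phi_2 w_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1}.
\end{align*}
```

In words, the demand is first differenced against the same half-hour one week earlier, which removes the constant c, and the weekly differences are then modelled as an ARMA(2, 1) process.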

We exclude the multiplicative double SARIMA (seasonal cycles s1 = 48 and s2 = 336) from our comparison. Although work is underway, there is no automatic algorithm for selecting the optimal seasonal and non-seasonal components for double SARIMA.

We include a univariate TS approach, called the TS(uni). We first divide the electricity demand data by the day of a week, the season of a year, and the half-hourly time interval. For each half-hourly time interval, we utilise a univariate TS approach, such as an ARIMA model, to obtain a one-step-ahead point forecast for each day of a week and each season of a year. Within a univariate TS framework, we also consider an NN in our study. Because an NN is able to capture nonlinear and non-parametric features, it is an attractive tool for modelling loads in terms of weather variables (see [19]). In this paper, we utilise an NN for modelling loads in terms of their one-lag-behind values. Thus, this can be useful for univariate electricity demand prediction. In addition, we include a multivariate TS approach without removing possible outliers in the training set. To compare the point forecast accuracy, Figure 4 shows the averaged MAPE of the 52 iterative one-step-ahead point forecasts for 46 different updating periods in the testing sample. We use 46 different updating periods because we need at least two observations in the most recent curve to implement the PLS method. The method that produces the minimum averaged MAPE across the 46 time periods from Monday to Sunday is considered to be the best approach.
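The two simplest benchmarks in this section, the MP and the seasonal RW, together with the accuracy measure, can be sketched as follows. The function names and the data layout (one list of half-hourly values per week, for a fixed day of the week) are assumptions for illustration, and the MAPE is written in its usual form as the mean of absolute errors relative to the observed values, in percent.

```python
def mean_predictor(history):
    """MP: average the curves for the same day over weeks 1 to t."""
    n_weeks = len(history)
    n_points = len(history[0])
    return [sum(curve[i] for curve in history) / n_weeks for i in range(n_points)]


def seasonal_random_walk(history):
    """Seasonal naive RW: repeat the curve observed in week t."""
    return list(history[-1])


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Toy usage: two past Mondays with two half-hourly values each (made-up numbers).
past_mondays = [[100.0, 200.0], [110.0, 220.0]]
next_monday = [110.0, 200.0]
mp_error = mape(next_monday, mean_predictor(past_mondays))
rw_error = mape(next_monday, seasonal_random_walk(past_mondays))
```

Both benchmarks use only the history of the same day of the week, which mirrors the day-by-day treatment of the demand curves in the comparison.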

Table 3. Based on the averaged MAPE across 46 different time periods, Nemenyi’s test statistics are calculated to test the statistical significance in point forecast accuracy among methods.

| | MP | RW | SARIMA | TS(uni) | NN | TS | TS(outlier) |
|---|---|---|---|---|---|---|---|
| RW | 0.0016 | – | 0 | 0.0001 | 0.0096 | 0 | 0.9999 |
| SARIMA | 0 | 0 | – | 0.9975 | 0.6287 | 0.9989 | 0 |
| TS(uni) | 0 | 0.0001 | 0.9975 | – | 0.9556 | 1 | 0.0007 |
| NN | 0 | 0.0096 | 0.6287 | 0.9556 | – | 0.9345 | 0.0400 |
| TS | 0 | 0 | 0.9989 | 1 | 0.9345 | – | 0.0005 |
| TS(outlier) | 0.0003 | 0.9999 | 0 | 0.0007 | 0.0400 | 0.0005 | – |
| PLS | 0 | 0 | 0.1904 | 0.0303 | 0.0005 | 0.0400 | 0 |

From this viewpoint, the forecast updating method achieved better point forecast accuracy than the non-updating methods.

As suggested by a referee, we further carried out Friedman’s test to examine whether the difference in point forecast accuracy among methods is statistically significant. Friedman’s test is a non-parametric analogue of the analysis of variance for a randomised block design, which can be considered as the non-parametric version of a one-way ANOVA with repeated measures. Friedman’s test is a procedure based on within-block ranks and has approximately a $\chi^2$ distribution when the null hypothesis is true (see [11] for details). Based on the averaged MAPE across 46 different time periods, the test statistic indicates that there is a significant difference among methods ($p < 2.2 \times 10^{-16}$). We then performed Nemenyi’s test, which is a post hoc pairwise test intended to determine which method is significantly different from the others. At the 5% level of significance, Table 3 presents the following findings:

(1) the MP method differs significantly from the others;

(2) the RW method differs significantly from the others, except the TS(outlier) method;

(3) the SARIMA method performs similarly to the TS(uni), NN, TS, and PLS methods, but it differs significantly from the others;

(4) the TS(uni) method performs similarly to the SARIMA, NN, and TS methods, but it differs significantly from the others;

(5) except for the MP, RW, TS(outlier), and PLS methods, the NN method performs similarly to the others;

(6) the PLS method differs significantly from the other methods, with the exception of the SARIMA method.
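The rank-based test described in this section can be illustrated with a small, self-contained computation of the Friedman statistic. This is a sketch under stated assumptions: the function names are hypothetical, blocks are taken to be the updating periods and treatments the forecasting methods, and no tie correction is applied. The post hoc Nemenyi comparison is not implemented here.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def friedman_statistic(errors):
    """errors[b][m] is the error of method m in block b.

    Returns the mean rank of each method and the Friedman statistic
    (no tie correction), which is compared with a chi-squared
    distribution on k - 1 degrees of freedom.
    """
    n_blocks, k = len(errors), len(errors[0])
    rank_sums = [0.0] * k
    for row in errors:
        for m, rank in enumerate(average_ranks(row)):
            rank_sums[m] += rank
    statistic = (12.0 / (n_blocks * k * (k + 1))) * sum(r * r for r in rank_sums)
    statistic -= 3.0 * n_blocks * (k + 1)
    mean_ranks = [r / n_blocks for r in rank_sums]
    return mean_ranks, statistic
```

Because only the within-block ranks enter the statistic, the test is insensitive to the overall level of the MAPE in any one updating period, which is why it suits a comparison across periods of differing difficulty.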

6.3 Updating interval forecasts

Now, suppose that we observe the electricity demand from midnight to 18:30 on Monday (26 March 2007); it is then possible to update the interval forecasts for the remaining time periods using the PLS method. From the historical TS of Monday electricity demand, we obtain the forecasts of principal component scores. Based on the relationship between $\beta^{b,\mathrm{TS}}_{n+1|n}$ and $\beta^{b,\mathrm{PLS}}_{n+1}$, the PLS prediction intervals for the remaining time periods can be obtained from Equation (16). As an example, Figure 5 presents the 95% prediction intervals obtained by the TS and PLS methods for the electricity demand from 19:00 to 23:30 on 26 March 2007.

From Figure 5, the PLS prediction intervals are narrower than those of the TS method. Thus, they provide a more informative evaluation of forecast uncertainty, subject to the same coverage probability deviance. Table 4 presents the summary statistics of the averaged coverage probability deviance and averaged half-width of prediction intervals for various updating time periods in the testing sample.

Figure 5. The 95% prediction intervals of the Monday electricity demand from 19:00 to 23:30 on 26 March 2007. By incorporating the electricity demand from midnight to 18:30, the prediction intervals can be updated using the non-parametric bootstrap method in Section 5.2. [Axes: demand (megawatts) against half-hour; legend: observations, non-parametric prediction intervals of TS, non-parametric prediction intervals of PLS.]

Table 4. Summary statistics of the averaged coverage probability deviance and averaged half-width of the TS and PLS prediction intervals constructed non-parametrically for the 52 iterative one-step-ahead forecasts from Monday to Sunday.

| Summary statistics | Coverage probability deviance (TS) | Coverage probability deviance (PLS) | Half-width of prediction intervals (TS) | Half-width of prediction intervals (PLS) |
|---|---|---|---|---|
| Minimum | 0.0095 | 0.0144 | 245.4 | 78.5 |
| First quantile | 0.0129 | 0.0195 | 335.5 | 244.9 |
| Median | 0.0172 | 0.0273 | 365.8 | 287.7 |
| Mean | 0.0174 | 0.0400 | 356.3 | 269.3 |
| Third quantile | 0.0212 | 0.0352 | 390.4 | 330.6 |
| Maximum | 0.0291 | 0.2500 | 399.4 | 354.3 |

Note: At the 95% nominal coverage probability, the minimal averaged coverage probability deviance and the minimal averaged half-width of prediction intervals are marked in bold.

From Table 4, the narrower half-width of the PLS prediction intervals comes at the cost of a worse averaged coverage probability deviance. Depending on one’s objective, updating prediction intervals can provide a much narrower half-width, but at the expense of a worse coverage probability deviance. In some cases, updating prediction intervals can be useful when the loss of coverage probability accuracy is small.

7. Conclusion and future direction

Illustrated by the half-hourly electricity demand data in South Australia, this paper presents an empirical study of functional modelling and forecasting methods for very short-term electricity demand forecasting. The functional methods treat the historical data as a TS of curves. Using FPCA, the dimensionality of data is effectively reduced, and the main features in the data are represented by a set of functional principal components, which explain more than 95% of the total amount of variation in the seven electricity demand data sets from Monday to Sunday. The problem of forecasting electricity demand has thus been reduced to forecasting K = 5 one-dimensional principal component scores. By conditioning on the historical data and fixed functional principal components, the forecasts are obtained by multiplying the forecasted principal component scores with the fixed functional principal components and then adding the fixed mean function.

The main contribution of this paper is to put forward functional methods and apply them to very short-term electricity demand forecasting. We first present the TS method for forecasting a whole curve. When data in the most recent curve are sequentially observed, the forecast updating method (PLS) can improve the point forecast accuracy. Judged by the averaged MAPE over the 52 iterative one-step-ahead point forecasts, the PLS method performs better than any of the non-updating methods investigated in the testing sample averaged from Monday to Sunday.

Moreover, a non-parametric bootstrap method is presented to construct and update prediction intervals. The PLS prediction intervals are narrower than those of the non-updating TS method, but at the expense of a worse coverage probability deviance.

To conclude, functional data analysis has a lot to offer in the context of very short-term electricity demand forecasting. Even if we observe half-hourly electricity demand data, the objects of analysis in our functional methods are intrinsically infinite-dimensional curves, rather than many discrete data points. Although not demonstrated in this paper, the continuous nature of our functional methods allows us to analyse the derivatives of curves [28,29]. Such features separate functional data analysis methods from other multivariate data analysis approaches. The implementation of updating point and interval forecasts is rather easy using the ftsa package of Hyndman and Shang [25].

There are numerous ways in which the presented methodology can be extended, and we briefly mention a few:

(1) As pointed out by a referee, electricity demand can be forecasted to a higher degree of accuracy by considering seasonality (daily, weekly, and yearly), temperatures, and data on national holidays. By incorporating those covariates in regression, forecasts can be expected to perform very well also for very short-term demand. In future work, we aim to explore a functional regression model with discrete and functional types of regressors and compare its forecast accuracy with the established functional autoregressive model of order 1 with exogenous variables proposed by Damon and Guillas [9].

(2) Other shrinkage methods may be considered, such as lasso [42] and elastic net [46]. The beauty of these shrinkage methods is that some of the regression coefficients are shrunk to zero, but there is no closed-form expression for the regression coefficient estimates. With a possible solution of implementing Monte-Carlo techniques, such an investigation awaits.

(3) Equipped with sequential Monte-Carlo techniques, Bayesian dynamic modelling can be applied to update forecasts (see [45] for details).

Acknowledgements

The author thanks the editor, an associate editor, and two reviewers for their insightful comments, which led to a substantial improvement of the manuscript. The author also thanks Professors Rob Hyndman and Donald Poskitt for introducing him to the field of functional data analysis. This work was partially supported by the postgraduate publication award of Monash University.

References

[1] G. Aneiros-Pérez and P. Vieu, Nonparametric time series prediction: A semi-functional partial linear modeling, J. Multivariate Anal. 99 (2008), pp. 834–857.

[2] J. Antoch, L. Prchal, M.R. De Rosa, and P. Sarda, Functional linear regression with functional response: Application to prediction of electricity consumption, in Functional and Operatorial Statistics, S. Dabo-Niang and F. Ferraty, eds., Springer, Heidelberg, 2008, pp. 23–29.

[3] P.C. Besse, H. Cardot, and D.B. Stephenson, Autoregressive forecasting of some functional climatic variations, Scand. J. Stat. 27 (2000), pp. 673–687.

[4] G.E.P. Box, G.M. Jenkins, and G.C. Reinsel, Time Series Analysis: Forecasting and Control, 4th ed., John Wiley, Hoboken, NJ, 2008.

[5] T. Cai and P. Hall, Prediction in functional linear regression, Ann. Stat. 34 (2006), pp. 2159–2179.

[6] W. Charytoniuk and M.S. Chen, Very short-term load forecasting using artificial neural networks, IEEE Trans. Power Syst. 15 (2000), pp. 263–268.

[7] C. Chatfield, Calculating interval forecasts, J. Bus. Econom. Stat. 11 (1993), pp. 121–135.

[8] C. Chatfield, Time-Series Forecasting, Chapman & Hall/CRC, Boca Raton, FL, 2000.

[9] J. Damon and S. Guillas, The inclusion of exogenous variables in functional autoregressive ozone forecasting, Environmetrics 13 (2002), pp. 759–774.

[10] J. Dauxois, A. Pousse, and Y. Romain, Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference, J. Multivariate Anal. 12 (1982), pp. 136–154.

[11] J. Demsar, Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res. 7 (2006), pp. 1–30.

[12] S. Fan and R.J. Hyndman, Short-term load forecasting based on a semi-parametric additive model, IEEE Trans. Power Syst. 27 (2012), pp. 134–141.

[13] F. Ferraty and P. Vieu, Nonparametric Functional Data Analysis: Theory and Practice, Springer, New York, 2006.

[14] A. Goia, C. May, and G. Fusai, Functional clustering and linear regression for peak load forecasting, Int. J. Forecast. 26 (2010), pp. 700–711.

[15] P. Hall and J.L. Horowitz, Methodology and convergence rates for functional linear regression, Ann. Stat. 35 (2007), pp. 70–91.

[16] P. Hall and M. Hosseini-Nasab, Theory for high-order bounds in functional principal components analysis, Math. Proc. Cambridge Philos. Soc. 146 (2009), pp. 225–256.

[17] P. Hall, H.G. Müller, and J.L. Wang, Properties of principal component methods for functional and longitudinal data analysis, Ann. Stat. 34 (2006), pp. 1493–1517.

[18] P. Hall, D.S. Poskitt, and B. Presnell, A functional data-analytic approach to signal discrimination, Technometrics 43 (2001), pp. 1–9.

[19] H.S. Hippert, C.E. Pedreira, and R.C. Souza, Neural networks for short-term load forecasting: A review and evaluation, IEEE Trans. Power Syst. 16 (2001), pp. 44–55.

[20] A.E. Hoerl and R.W. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12 (1970), pp. 55–67.

[21] R.J. Hyndman and S. Fan, Density forecasting for long-term peak electricity demand, IEEE Trans. Power Syst. 25 (2010), pp. 1142–1153.

[22] R.J. Hyndman and Y. Khandakar, Automatic time series forecasting: The forecast package for R, J. Statist. Softw. 27 (2008).

[23] R.J. Hyndman and H.L. Shang, Forecasting functional time series (with discussion), J. Korean Statist. Soc. 38 (2009), pp. 199–221.

[24] R.J. Hyndman and H.L. Shang, Rainbow plots, bagplots, and boxplots for functional data, J. Comput. Graph. Stat. 19 (2010), pp. 29–45.

[25] R.J. Hyndman and H.L. Shang, ftsa: Functional Time Series Analysis, 2012, R package version 3.4. Available at http://cran.r-project.org/web/packages/ftsa/index.html.

[26] S.Y. Lee, W. Zhang, and X.Y. Song, Estimating the covariance function with functional data, British J. Math. Statist. Psychol. 55 (2002), pp. 247–261.

[27] K. Liu, S. Subbarayan, R.R. Shoults, M.T. Manry, C. Kwan, F.L. Lewis, and J. Naccarino, Comparison of very short term load forecasting techniques, IEEE Trans. Power Syst. 11 (1996), pp. 877–882.

[28] A. Mas and B. Pumo, The ARHD model, J. Statist. Plann. Inference 137 (2007), pp. 538–553.

[29] A. Mas and B. Pumo, Functional linear regression with derivatives, J. Nonparametr. Stat. 21 (2009), pp. 19–40.

[30] J.A. Nelder and R. Mead, A simplex method for function minimization, Comput. J. 7 (1965), pp. 308–313.

[31] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, 2012. Available at http://www.R-project.org, ISBN 3-900051-07-0.

[32] J.O. Ramsay and B.W. Silverman, Applied Functional Data Analysis: Methods and Case Studies, Springer, New York, 2002.

[33] J.O. Ramsay and B.W. Silverman, Functional Data Analysis, 2nd ed., Springer, New York, 2005.

[34] P.T. Reiss and T.R. Ogden, Functional principal component regression and functional partial least squares, J. Am. Statist. Assoc. 102 (2007), pp. 984–996.

[35] P. Rousseeuw, I. Ruts, and J. Tukey, The bagplot: A bivariate boxplot, Am. Statist. 53 (1999), pp. 382–387.

[36] H.L. Shang and R.J. Hyndman, Nonparametric time series forecasting with dynamic updating, Math. Comput. Simulation 81 (2011), pp. 1310–1324.

[37] H. Shen, On modeling and forecasting time series of smooth curves, Technometrics 51 (2009), pp. 227–238.

[38] H. Shen and J.Z. Huang, Interday forecasting and intraday updating of call center arrivals, Manuf. Serv. Oper. Manag. 10 (2008), pp. 391–410.

[39] J.W. Taylor, Density forecasting for the efficient balancing of the generation and consumption of electricity, Int. J. Forecast. 22 (2006), pp. 707–724.

[40] J.W. Taylor, An evaluation of methods for very short-term load forecasting using minute-by-minute British data, Int. J. Forecast. 24 (2008), pp. 645–658.

[41] J.W. Taylor, L.M. de Menezes, and P.E. McSharry, A comparison of univariate methods for forecasting electricity demand up to a day ahead, Int. J. Forecast. 22 (2006), pp. 1–16.

[42] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Statist. Soc. Ser. B 58 (1996), pp. 267–288.

[43] D.J. Trudnowski, M.L. McReynolds, and J.M. Johnson, Real-time very short-term load prediction for power-system automatic generation control, IEEE Trans. Control Syst. Technol. 9 (2001), pp. 254–260.

[44] J. Tukey, Mathematics and the picturing of data, Proceedings of the International Congress of Mathematicians, Vol. 2, Canadian Mathematical Congress, Montreal, 1974, pp. 523–532.

[45] M. West and J. Harrison, Bayesian Forecasting and Dynamic Models, 2nd ed., Springer-Verlag, New York, 1997.

[46] H. Zou and T. Hastie, Regularization and variable selection via the elastic net, J. R. Statist. Soc. Ser. B 67 (2005), pp. 301–320.
