Forecast Accuracy and Evaluation
2017 Beijing Workshop on Forecasting
Rob J Hyndman
robjhyndman.com/beijing2017


Page 1

Forecast Accuracy and Evaluation 1

2017 Beijing Workshop on Forecasting

Forecast Accuracy and Evaluation

Rob J Hyndman

robjhyndman.com/beijing2017

Page 2

Outline

1 The statistical forecasting perspective

2 Some simple forecasting methods

3 Forecasting residuals

4 Measuring forecast accuracy

5 Time series cross-validation

6 Probability scoring

Forecast Accuracy and Evaluation The statistical forecasting perspective 2

Page 3

The statistical forecasting perspective

[Plot: Total international visitors to Australia, 1980–2020 (millions of visitors)]

Forecast Accuracy and Evaluation The statistical forecasting perspective 3


Forecasting is estimating how the sequence of observations will continue into the future.

Page 5

Sample futures

[Plot: Total international visitors to Australia with ten simulated sample futures, 1980–2020 (millions of visitors)]

Forecast Accuracy and Evaluation The statistical forecasting perspective 4

Page 6

Forecast intervals

[Plot: Total international visitors to Australia with 80% and 95% forecast intervals (millions of visitors)]

Forecast Accuracy and Evaluation The statistical forecasting perspective 5

Page 7

Statistical forecasting

Thing to be forecast: a random variable, y_t.

Forecast distributions:
y_{t|t-1} = y_t | {y_1, y_2, ..., y_{t-1}}
y_{T+h|T} = y_{T+h} | {y_1, y_2, ..., y_T}

The “point forecast” is the mean (or median) of y_{T+h|T} = y_{T+h} | {y_1, y_2, ..., y_T}.
The “forecast variance” is Var[y_{T+h} | y_1, y_2, ..., y_T].
A prediction interval or “interval forecast” is a range of values of y_{T+h} with high probability.

Forecast Accuracy and Evaluation The statistical forecasting perspective 6
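The point forecast and interval forecast can both be approximated by simulating many sample futures and summarising them across paths. A minimal pure-Python sketch, assuming a random walk with drift as the illustrative model; the function name and data are mine, not from the slides:

```python
import random
import statistics

def simulate_futures(history, h, n_futures=1000, seed=1):
    """Simulate sample futures from a random walk with drift fitted to `history`."""
    rng = random.Random(seed)
    drift = (history[-1] - history[0]) / (len(history) - 1)
    resid_sd = statistics.stdev(
        history[t] - history[t - 1] - drift for t in range(1, len(history))
    )
    futures = []
    for _ in range(n_futures):
        path, y = [], history[-1]
        for _ in range(h):
            y += drift + rng.gauss(0, resid_sd)  # one simulated step ahead
            path.append(y)
        futures.append(path)
    return futures

# illustrative upward-trending series (not the visitor data from the slides)
history = [2.5 + 0.2 * t + 0.1 * ((-1) ** t) for t in range(20)]
futures = simulate_futures(history, h=5)

# point forecast at horizon 5 = mean across the simulated futures
point = statistics.mean(f[4] for f in futures)
# approximate 80% forecast interval = 10th and 90th percentiles across futures
vals = sorted(f[4] for f in futures)
lo, hi = vals[len(vals) // 10], vals[-len(vals) // 10]
```

With more simulated paths, these empirical summaries converge on the mean and quantiles of the forecast distribution y_{T+h|T}.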

Page 8

Outline

1 The statistical forecasting perspective

2 Some simple forecasting methods

3 Forecasting residuals

4 Measuring forecast accuracy

5 Time series cross-validation

6 Probability scoring

Forecast Accuracy and Evaluation Some simple forecasting methods 7

Page 9

Some simple forecasting methods

[Plot: Australian quarterly beer production (megalitres)]

How would you forecast these data?

Forecast Accuracy and Evaluation Some simple forecasting methods 8

Page 10

Some simple forecasting methods

[Plot: Number of pigs slaughtered in Victoria (thousands), 1990–1994]

How would you forecast these data?

Forecast Accuracy and Evaluation Some simple forecasting methods 9

Page 11

Some simple forecasting methods

[Plot: Google stock price (US$), 800 trading days from 25 February 2013]

How would you forecast these data?

Forecast Accuracy and Evaluation Some simple forecasting methods 10

Page 12

Some simple forecasting methods

Average method
Forecasts of all future values equal the mean of the historical data {y_1, ..., y_T}.
Forecasts: ŷ_{T+h|T} = ȳ = (y_1 + ... + y_T)/T

Naïve method
Forecasts equal the last observed value.
Forecasts: ŷ_{T+h|T} = y_T.
A consequence of the efficient market hypothesis.

Seasonal naïve method
Forecasts equal the last value from the same season.
Forecasts: ŷ_{T+h|T} = y_{T+h-km}, where m = seasonal period and k = ⌊(h-1)/m⌋ + 1.

Forecast Accuracy and Evaluation Some simple forecasting methods 11
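The three benchmark methods above are simple enough to write down directly. A minimal sketch; the function names and toy data are mine, not from the slides:

```python
def mean_forecast(y, h):
    """Average method: every future value is forecast as the historical mean."""
    return [sum(y) / len(y)] * h

def naive_forecast(y, h):
    """Naive method: every future value equals the last observation."""
    return [y[-1]] * h

def seasonal_naive_forecast(y, h, m):
    """Seasonal naive: forecast = last observed value from the same season,
    y_{T+h-km} with k = floor((h-1)/m) + 1."""
    T = len(y)
    out = []
    for i in range(1, h + 1):          # i plays the role of the horizon h
        k = (i - 1) // m + 1
        out.append(y[T + i - k * m - 1])  # -1 converts to 0-based indexing
    return out

y = [10, 20, 30, 40, 12, 22, 32, 42]  # two "years" of quarterly data (m = 4)
```

Note how the seasonal naïve forecasts simply repeat the final year: for h = 1, ..., 4 they reproduce the last four observations, then cycle again.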


Page 15

Some simple forecasting methods

Drift method
Forecasts equal the last value plus the average change.
Forecasts:

ŷ_{T+h|T} = y_T + (h/(T-1)) Σ_{t=2}^{T} (y_t - y_{t-1})
          = y_T + h (y_T - y_1)/(T-1).

Equivalent to extrapolating the line drawn between the first and last observations.

Forecast Accuracy and Evaluation Some simple forecasting methods 12
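The drift forecast and its equivalence to the line through the first and last observations can be checked numerically. A small sketch with illustrative numbers of my own:

```python
def drift_forecast(y, h):
    """Drift method: last value plus h times the average one-step change."""
    T = len(y)
    avg_change = sum(y[t] - y[t - 1] for t in range(1, T)) / (T - 1)
    return [y[-1] + i * avg_change for i in range(1, h + 1)]

y = [3.0, 4.5, 4.0, 6.0, 7.5]
fc = drift_forecast(y, 3)

# The sum of one-step changes telescopes to y_T - y_1, so avg_change equals
# the slope of the line through the first and last observations.
slope = (y[-1] - y[0]) / (len(y) - 1)
```

So the two formulas on the slide are the same computation: the summation form and the telescoped endpoint form.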

Page 16

Some simple forecasting methods

[Plot: Forecasts for quarterly beer production (megalitres): mean, naïve, and seasonal naïve methods]

Forecast Accuracy and Evaluation Some simple forecasting methods 13

Page 17

Some simple forecasting methods

[Plot: Number of pigs slaughtered in Victoria (thousands) with mean, naïve, and seasonal naïve forecasts]

Forecast Accuracy and Evaluation Some simple forecasting methods 14

Page 18

Some simple forecasting methods

[Plot: Google stock price (US$, 800 trading days from 25 February 2013) with drift, mean, and naïve forecasts]

Forecast Accuracy and Evaluation Some simple forecasting methods 15

Page 19

Outline

1 The statistical forecasting perspective

2 Some simple forecasting methods

3 Forecasting residuals

4 Measuring forecast accuracy

5 Time series cross-validation

6 Probability scoring

Forecast Accuracy and Evaluation Forecasting residuals 16

Page 20

Fitted values

ŷ_{t|t-1} is the forecast of y_t based on observations y_1, ..., y_{t-1}. We call these “fitted values”. Sometimes the subscript is dropped: ŷ_t ≡ ŷ_{t|t-1}. They are often not true forecasts, since the parameters are estimated on all the data.

Examples:
1. ŷ_t = ȳ for the average method.
2. ŷ_t = y_{t-1} for the naïve method.
3. ŷ_t = y_{t-1} + (y_T - y_1)/(T-1) for the drift method.
4. ŷ_t = y_{t-m} for the seasonal naïve method.

Forecast Accuracy and Evaluation Forecasting residuals 17
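The four examples of fitted values can be computed in one pass over the series. A minimal sketch; the helper name and data are mine, not from the slides:

```python
def fitted_values(y, method, m=4):
    """One-step 'fitted values' for each benchmark method.
    Returns a list aligned with y; None where no fitted value exists."""
    T = len(y)
    ybar = sum(y) / T
    slope = (y[-1] - y[0]) / (T - 1)
    fits = []
    for t in range(T):
        if method == "average":
            fits.append(ybar)  # uses all the data, so not a true forecast
        elif method == "naive":
            fits.append(y[t - 1] if t >= 1 else None)
        elif method == "drift":
            # slope is estimated from all the data, so also not a true forecast
            fits.append(y[t - 1] + slope if t >= 1 else None)
        elif method == "seasonal_naive":
            fits.append(y[t - m] if t >= m else None)
    return fits

y = [10, 20, 30, 40, 12, 22, 32, 42]
fits = fitted_values(y, "seasonal_naive")  # [None, None, None, None, 10, 20, 30, 40]
```

The `None` entries make the slide's caveat concrete: early observations have no genuine one-step fitted value, and the average and drift fits depend on the whole sample.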


Page 22

Forecasting residuals

Residuals in forecasting: the difference between an observed value and its fitted value: e_t = y_t - ŷ_{t|t-1}.

Assumptions

1. {e_t} are uncorrelated. If they aren't, there is information left in the residuals that should be used in computing forecasts.
2. {e_t} have mean zero. If they don't, the forecasts are biased.

Useful properties (for prediction intervals)

3. {e_t} have constant variance.
4. {e_t} are normally distributed.

Forecast Accuracy and Evaluation Forecasting residuals 18
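Both assumptions can be checked numerically from the residual series: a nonzero mean signals bias, and a large lag-1 autocorrelation signals information left in the residuals. A pure-Python sketch with illustrative data of my own:

```python
def residuals(y, fits):
    """e_t = y_t - fitted_t, skipping positions with no fitted value."""
    return [yt - f for yt, f in zip(y, fits) if f is not None]

def lag1_autocorr(e):
    """Sample lag-1 autocorrelation of the residual series."""
    n = len(e)
    mean = sum(e) / n
    var = sum((x - mean) ** 2 for x in e)
    cov = sum((e[t] - mean) * (e[t - 1] - mean) for t in range(1, n))
    return cov / var

m = 4  # quarterly seasonal period
y = [10, 20, 30, 40, 12, 22, 32, 42, 11, 21, 33, 41]
fits = [None] * m + [y[t - m] for t in range(m, len(y))]  # seasonal naive fits

e = residuals(y, fits)
mean_e = sum(e) / len(e)   # should be near zero for unbiased forecasts
r1 = lag1_autocorr(e)      # should be near zero if residuals are uncorrelated
```

Here the residual mean is clearly positive, so a seasonal naïve forecaster would be biased downward on this toy series; in practice one would look at the full ACF, as on the following slides.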


Page 25

Example: Australian beer production

Seasonal naïve forecast:
ŷ_{t|t-1} = y_{t-4},  e_t = y_t - y_{t-4}

[Plot: Australian quarterly beer production (megalitres)]

Forecast Accuracy and Evaluation Forecasting residuals 19

Page 26

Example: Australian beer production

Seasonal naïve forecast:
ŷ_{t|t-1} = y_{t-4},  e_t = y_t - y_{t-4}

[Plot: Residuals from the seasonal naïve method (megalitres)]

Forecast Accuracy and Evaluation Forecasting residuals 20

Page 27

Example: Australian beer production

Seasonal naïve forecast:
ŷ_{t|t-1} = y_{t-4},  e_t = y_t - y_{t-4}

[Plot: ACF of residuals from the seasonal naïve method]

Forecast Accuracy and Evaluation Forecasting residuals 21

Page 28

Example: Australian beer production

Seasonal naïve forecast:
ŷ_{t|t-1} = y_{t-4},  e_t = y_t - y_{t-4}

[Plot: Histogram of residuals from the seasonal naïve method]

Forecast Accuracy and Evaluation Forecasting residuals 22

Page 29

Forecasting residuals

Minimizing the size of the forecasting residuals is how model parameters are estimated (e.g., by minimizing the MSE or maximizing the likelihood).
In general, forecasting residuals cannot be used directly for estimating forecast accuracy.
Forecast accuracy can only be measured using genuine forecasts, i.e., on different data.
Forecasting residuals can help suggest model improvements.

Forecast Accuracy and Evaluation Forecasting residuals 23

Page 30

Outline

1 The statistical forecasting perspective

2 Some simple forecasting methods

3 Forecasting residuals

4 Measuring forecast accuracy

5 Time series cross-validation

6 Probability scoring

Forecast Accuracy and Evaluation Measuring forecast accuracy 24

Page 31

Training and test sets

[Diagram: time axis divided into training data followed by test data]

The test set must not be used for any aspect of model development or calculation of forecasts.
Forecast accuracy is based only on the test set.
A model which fits the data well does not necessarily forecast well.
A perfect fit can always be obtained by using a model with enough parameters. (Compare R².)
These problems can be overcome by measuring true out-of-sample forecast accuracy: the training set is used to estimate parameters, and forecasts are made for the test set.

Forecast Accuracy and Evaluation Measuring forecast accuracy 25

Page 32

Measures of forecast accuracy

Training set: T observations. Test set: H observations.

MAE  = (1/H) Σ_{h=1}^{H} |y_{T+h} - ŷ_{T+h|T}|
MSE  = (1/H) Σ_{h=1}^{H} (y_{T+h} - ŷ_{T+h|T})²
RMSE = sqrt( (1/H) Σ_{h=1}^{H} (y_{T+h} - ŷ_{T+h|T})² )
MAPE = (100/H) Σ_{h=1}^{H} |y_{T+h} - ŷ_{T+h|T}| / |y_{T+h}|

MAE, MSE and RMSE are all scale dependent.
MAPE is scale independent, but is only sensible if y_{T+h} ≫ 0 for all h, and y has a natural zero.

Forecast Accuracy and Evaluation Measuring forecast accuracy 26
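The four measures translate directly from the formulas above into code over test-set errors. A minimal sketch with illustrative numbers of my own:

```python
import math

def accuracy(actual, forecast):
    """MAE, MSE, RMSE and MAPE over a test set of H observations."""
    H = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / H
    mse = sum(e ** 2 for e in errors) / H
    rmse = math.sqrt(mse)
    # MAPE divides each error by the actual value, hence the y >> 0 caveat
    mape = 100 / H * sum(abs(e) / abs(a) for e, a in zip(errors, actual))
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}

acc = accuracy(actual=[100, 110, 120], forecast=[90, 115, 130])
```

Rescaling the data (say, from megalitres to litres) rescales MAE and RMSE but leaves MAPE unchanged, which is the scale-dependence point on the slide.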


Page 34

Measures of forecast accuracy

Mean Absolute Scaled Error

MASE = (1/H) Σ_{h=1}^{H} |y_{T+h} - ŷ_{T+h|T}| / Q

where Q is a stable measure of the scale of the time series {y_t}.

Proposed by Hyndman and Koehler (IJF, 2006). For non-seasonal time series,

Q = (1/(T-1)) Σ_{t=2}^{T} |y_t - y_{t-1}|

works well. Then MASE is equivalent to MAE relative to a naïve method.

Forecast Accuracy and Evaluation Measuring forecast accuracy 27
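The non-seasonal MASE above can be sketched in a few lines: Q is the in-sample MAE of the naïve method, so a MASE below 1 means the forecasts beat that benchmark on average. Illustrative data of my own; for seasonal data, Q would use lag-m differences instead:

```python
def mase(train, actual, forecast):
    """Mean Absolute Scaled Error with Q = in-sample MAE of the naive method."""
    T = len(train)
    Q = sum(abs(train[t] - train[t - 1]) for t in range(1, T)) / (T - 1)
    H = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / (H * Q)

train = [50, 52, 51, 55, 54, 57]
# the naive forecast of the whole test period is the last training value
naive_fc = [train[-1]] * 3
actual = [58, 57, 60]
m_naive = mase(train, actual, naive_fc)
```

Because Q is computed on the training data only, MASE stays well defined even when the test-set actuals include zeros, unlike MAPE.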

Page 35

Measures of forecast accuracy

Mean Absolute Scaled Error

MASE = (1/H) Σ_{h=1}^{H} |y_{T+h} - ŷ_{T+h|T}| / Q

where Q is a stable measure of the scale of the time series {y_t}.

Proposed by Hyndman and Koehler (IJF, 2006). For seasonal time series,

Q = (1/(T-m)) Σ_{t=m+1}^{T} |y_t - y_{t-m}|

works well. Then MASE is equivalent to MAE relative to a seasonal naïve method.

Forecast Accuracy and Evaluation Measuring forecast accuracy 28

Page 36

Measures of forecast accuracy

[Plot: Forecasts for quarterly beer production (megalitres): mean, naïve, and seasonal naïve methods]

Forecast Accuracy and Evaluation Measuring forecast accuracy 29

Page 37

Measures of forecast accuracy

                        RMSE   MAE   MAPE   MASE
Mean method             38.5   34.8   8.28   2.44
Naïve method            62.7   57.4  14.18   4.01
Seasonal naïve method   14.3   13.4   3.17   0.94

Forecast Accuracy and Evaluation Measuring forecast accuracy 30

Page 38

Measures of forecast accuracy

Scaling can be used with any measure, and with different scaling statistics.

Mean Squared Scaled Error

MSSE = (1/H) Σ_{h=1}^{H} (y_{T+h} - ŷ_{T+h|T})² / Q,
where Q = (1/(T-m)) Σ_{t=m+1}^{T} (y_t - y_{t-m})²

Assumes {y_t} is difference stationary.
Minimizing MSSE leads to conditional mean forecasts.
MSSE < 1: out-of-sample multi-step forecasts are more accurate than in-sample one-step forecasts.

Forecast Accuracy and Evaluation Measuring forecast accuracy 31

Page 39

Measures of forecast accuracy

Many suggested scale-free measures of forecast accuracy are degenerate due to infinite variance.
The denominator must be positive with probability one.
The distributions of most measures are highly skewed when applied to real data.

Forecast Accuracy and Evaluation Measuring forecast accuracy 32

Page 40

Poll: true or false?

1. Good forecast methods should have normally distributed residuals.
2. A model with small residuals will give good forecasts.
3. The best measure of forecast accuracy is MAPE.
4. If your model doesn’t forecast well, you should make it more complicated.
5. Always choose the model with the best forecast accuracy as measured on the test set.

Forecast Accuracy and Evaluation Measuring forecast accuracy 33

Page 41

Outline

1 The statistical forecasting perspective

2 Some simple forecasting methods

3 Forecasting residuals

4 Measuring forecast accuracy

5 Time series cross-validation

6 Probability scoring

Forecast Accuracy and Evaluation Time series cross-validation 34

Page 42

Cross-validation

Traditional evaluation

[Diagram: time axis divided into training data followed by test data]

Leave-one-out cross-validation

Forecast Accuracy and Evaluation Time series cross-validation 35

Page 43

Cross-validation

Time series cross-validation

[Diagram: rolling forecast origin: a sequence of successively longer training sets]

Forecast Accuracy and Evaluation Time series cross-validation 36

Page 44

Cross-validation

Time series cross-validation

[Diagram: rolling training sets, each followed by a test observation at horizon h = 1]

Forecast Accuracy and Evaluation Time series cross-validation 37

Page 45

Cross-validation

Time series cross-validation

[Diagram: rolling training sets, each followed by a test observation at horizon h = 2]

Forecast Accuracy and Evaluation Time series cross-validation 38

Page 46

Cross-validation

Time series cross-validation

[Diagram: rolling training sets, each followed by a test observation at horizon h = 3]

Forecast Accuracy and Evaluation Time series cross-validation 39

Page 47

Cross-validation

Time series cross-validation

[Diagram: rolling training sets, each followed by a test observation at horizon h = 4]

Forecast Accuracy and Evaluation Time series cross-validation 40

Page 48

Cross-validation

Time series cross-validation

[Diagram: rolling training sets, each followed by a test observation at horizon h = 5]

Forecast Accuracy and Evaluation Time series cross-validation 41

Page 49

Cross-validation

Time series cross-validation

[Diagram: rolling training sets, each followed by a test observation at horizon h = 6]

Forecast Accuracy and Evaluation Time series cross-validation 42

Page 50

Time series cross-validation

[Figure: rolling-origin cross-validation diagram]

Also known as “Evaluation on a rolling forecast origin”

Forecast Accuracy and Evaluation Time series cross-validation 43

Page 51

Time series cross-validation

Assume k is the minimum number of observations for a training set.

1. Select observation t + h for the test set, and use observations at times 1, 2, . . . , t to estimate the model.
2. Compute the error on the forecast for time t + h.
3. Repeat for t = k, k + 1, . . . , T − h, where T is the total number of observations.
4. Compute the accuracy measure over all errors.
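The four steps above can be sketched directly. The slides later use R's tsCV; this is only an illustrative Python sketch with a toy "average" forecaster (the names ts_cv_errors and mean_forecast are made up for this example):

```python
import statistics

def ts_cv_errors(y, forecast, h=1, k=3):
    """Rolling-origin CV: train on y[:t], score the h-step-ahead forecast."""
    errors = []
    for t in range(k, len(y) - h + 1):
        train = y[:t]
        fc = forecast(train, h)            # forecast for time t + h
        errors.append(y[t + h - 1] - fc)   # forecast error at time t + h
    return errors

def mean_forecast(train, h):
    """Toy forecaster: the average of the training data, at any horizon."""
    return statistics.mean(train)

y = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
errs = ts_cv_errors(y, mean_forecast, h=1, k=3)
mae = statistics.mean(abs(e) for e in errs)
```

Each forecast origin uses only data observed before the test point, so the errors are genuine out-of-sample errors.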

Forecast Accuracy and Evaluation Time series cross-validation 44

Page 52

Example: Pharmaceutical sales

[Figure: Antidiabetic drug sales, $ million, roughly 1991–2008]

Forecast Accuracy and Evaluation Time series cross-validation 45

Page 53

Example: Pharmaceutical sales

Which of these models is best?

1. Linear model with trend and seasonal dummies applied to log data
2. ARIMA model applied to log data
3. ETS model applied to original data

Forecast h = 12 steps ahead based on data up to time t, for t = 1, 2, . . .
Compare MAE values at each forecast horizon.

Forecast Accuracy and Evaluation Time series cross-validation 46


Page 55

Example: Pharmaceutical sales

[Figure: MAE by forecast horizon (1–12) for the LM, ARIMA and ETS forecasts]

Forecast Accuracy and Evaluation Time series cross-validation 47

Page 56

Example: R code

f1 <- function(y, h) {
  forecast(tslm(y ~ trend + season, lambda=0), h=h)
}
f2 <- function(y, h) {
  forecast(auto.arima(y, D=1, lambda=0), h=h)
}
f3 <- function(y, h) {
  forecast(ets(y), h=h)
}
e1 <- tsCV(a10, f1, h=12)
e2 <- tsCV(a10, f2, h=12)
e3 <- tsCV(a10, f3, h=12)
mae <- data.frame(
  LM    = colMeans(abs(e1), na.rm=TRUE),
  ARIMA = colMeans(abs(e2), na.rm=TRUE),
  ETS   = colMeans(abs(e3), na.rm=TRUE))

Forecast Accuracy and Evaluation Time series cross-validation 48

Page 57

Hirotugu Akaike (1927–2009)

Akaike, H. (1974), “A new look at the statistical model identification”, IEEE Transactions on Automatic Control, 19(6): 716–723.

Forecast Accuracy and Evaluation Time series cross-validation 49

Page 58

Akaike’s Information Criterion

AIC = −2 log(L) + 2k

where L is the model likelihood and k is the number of estimated parameters in the model.

If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is computed from one-step forecasts on the training set, and T is the length of the series.

Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE of one-step forecasts on a test set via time series cross-validation.

AICc is a bias-corrected version for small samples. AIC/AICc is much faster to compute than cross-validation.
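The Gaussian approximation is easy to compute from training-set residuals. An illustrative Python sketch (the constant c is dropped, since only differences in AIC between models matter; the function name gaussian_aic is made up):

```python
import math

def gaussian_aic(residuals, k):
    """Gaussian AIC up to an additive constant: T*log(MSE) + 2*k."""
    T = len(residuals)
    mse = sum(e * e for e in residuals) / T  # one-step training MSE
    return T * math.log(mse) + 2 * k
```

A model with more parameters (larger k) must reduce the MSE enough to outweigh the 2k penalty before the AIC prefers it.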

Forecast Accuracy and Evaluation Time series cross-validation 50


Page 62

Outline

1 The statistical forecasting perspective

2 Some simple forecasting methods

3 Forecasting residuals

4 Measuring forecast accuracy

5 Time series cross-validation

6 Probability scoring

Forecast Accuracy and Evaluation Probability scoring 51

Page 63

Probabilistic forecasting

[Figure: Google Stock Price in US$, 800 trading days from 25 February 2013]

Forecast Accuracy and Evaluation Probability scoring 52

Page 64

Probabilistic forecasting

How to evaluate a forecast probability distribution?

Forecast intervals: percentage of observations covered compared to the nominal percentage
Density forecasting
Quantile forecasting
Distribution forecasting
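The first option, comparing empirical with nominal interval coverage, can be coded in a few lines. A minimal illustrative Python sketch (the function name interval_coverage is made up):

```python
def interval_coverage(lower, upper, actual):
    """Empirical coverage: fraction of observations inside their intervals.
    Compare this with the nominal level, e.g. 0.95 for 95% intervals."""
    hits = sum(lo <= y <= up for lo, up, y in zip(lower, upper, actual))
    return hits / len(actual)
```

Coverage well below nominal suggests intervals are too narrow; well above, too wide.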

Forecast Accuracy and Evaluation Probability scoring 53

Page 65

Probability Integral Transform

Let F be the cdf of Y and Z = F(Y). If F is continuous, then Z is standard uniform.

[Figure: probability integral transform, Z = F(Y) plotted against Y]

Forecast Accuracy and Evaluation Probability scoring 54

Page 66

Calibration

Y_{T+h|T} has cdf F_{T+h|T}; F_{T+h|T} is our forecast cdf.

(a) F is marginally calibrated if E[F(y)] = P(Y ≤ y) for all y ∈ R.
(b) F is probabilistically calibrated if Z = F(Y) has a standard uniform distribution.

We could plot a histogram of Z = F(Y) and check that it looks uniform. This is a more sophisticated version of testing whether prediction intervals have the correct coverage.
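The PIT histogram check can be coded directly. An illustrative Python sketch, assuming Gaussian forecast distributions (norm_cdf is a hand-rolled standard normal cdf built from math.erf; the data triples are made up):

```python
import math

def norm_cdf(y, mu=0.0, sigma=1.0):
    """Gaussian forecast cdf at y, i.e. the PIT value Z = F(y)."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

# PIT values for (forecast mean, forecast sd, observed) triples;
# for a probabilistically calibrated forecaster, a histogram of
# these values should look standard uniform.
forecasts = [(0.0, 1.0, 0.0), (1.0, 2.0, 1.0), (5.0, 1.0, 6.0)]
pit = [norm_cdf(y, mu, sigma) for mu, sigma, y in forecasts]
```

A U-shaped PIT histogram indicates forecast distributions that are too narrow; a hump-shaped one, too wide.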

Forecast Accuracy and Evaluation Probability scoring 55

Page 67

Sharpness

Y_{T+h|T} has cdf F_{T+h|T}; F_{T+h|T} is our forecast cdf.

A “sharp” forecast distribution has narrow prediction intervals.

A good probabilistic forecast is both calibrated and sharp. Scoring rules combine calibration and sharpness in a single measure.

Forecast Accuracy and Evaluation Probability scoring 56

Page 68

Scoring rules

Y_{T+h|T} has cdf F_{T+h|T}; F_{T+h|T} is our forecast cdf. A scoring rule assigns a numerical score S(F_{T+h|T}, y_{T+h}).

Dawid–Sebastiani score:

DSS(F, y) = (y − μ_F)² / σ_F² + 2 log σ_F

A generalization of MSE assuming normality.

A “proper” scoring rule satisfies

E_F[S(F, Y)] ≤ E_F[S(G, Y)] for every forecast distribution G,

so that, in expectation, no forecast scores better than the true distribution F.
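The Dawid–Sebastiani score is straightforward to compute for a forecast summarized by its mean and standard deviation. An illustrative Python sketch (lower scores are better; the function name dss is made up):

```python
import math

def dss(mu_f, sigma_f, y):
    """Dawid-Sebastiani score: ((y - mu)/sigma)^2 + 2*log(sigma)."""
    return ((y - mu_f) / sigma_f) ** 2 + 2.0 * math.log(sigma_f)
```

The 2 log σ term rewards sharpness, but a small σ also inflates the squared-error term when the forecast mean is off, so sharpness only pays when the forecast is also accurate.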

Forecast Accuracy and Evaluation Probability scoring 57


Page 71

Scoring rules

Y_{T+h|T} has cdf F_{T+h|T}; F_{T+h|T} is our forecast cdf. A scoring rule assigns a numerical score S(F_{T+h|T}, y_{T+h}).

Continuous Ranked Probability Score:

CRPS(F, y) = ∫ [F(x) − 1{y ≤ x}]² dx = E_F|Y − y| − (1/2) E_F|Y − Y′|

where Y and Y′ have cdf F. A generalization of MAE.

Equivalently, in quantile form, with Q_{T+h|T} = F_{T+h|T}^{−1} the forecast quantile function:

CRPS(Q, y) = 2 ∫_0^1 [Q(p) − y] [1{y < Q(p)} − p] dp
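The expectation form of the CRPS suggests a simple estimator when the forecast distribution is represented by a sample of draws. An illustrative Python sketch (O(n²) in the sample size, fine for small samples; the function name crps_sample is made up):

```python
def crps_sample(sample, y):
    """CRPS estimated from a forecast sample:
    E|Y - y| - 0.5 * E|Y - Y'|, with Y and Y' drawn from the sample."""
    n = len(sample)
    term1 = sum(abs(x - y) for x in sample) / n
    term2 = sum(abs(a - b) for a in sample for b in sample) / (n * n)
    return term1 - 0.5 * term2
```

For a point forecast (a one-element sample) the second term vanishes and the CRPS reduces to the absolute error, which is the sense in which it generalizes MAE.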

Forecast Accuracy and Evaluation Probability scoring 58

Page 72

Scoring rules

Y_{T+h|T} has cdf F_{T+h|T}; F_{T+h|T} is our forecast cdf. A scoring rule assigns a numerical score S(F_{T+h|T}, y_{T+h}).

Energy score:

ES(F, y) = E_F|Y − y|^α − (1/2) E_F|Y − Y′|^α

where Y and Y′ have cdf F and α ∈ (0, 2].

Log score:

logS(F, y) = − log f(y)

where f = dF/dy is the density corresponding to F.
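For a Gaussian forecast density the log score has a closed form. An illustrative Python sketch (the function name gaussian_log_score is made up; lower scores are better):

```python
import math

def gaussian_log_score(mu, sigma, y):
    """Log score -log f(y) for a Gaussian forecast density N(mu, sigma^2)."""
    z = (y - mu) / sigma
    return 0.5 * z * z + math.log(sigma) + 0.5 * math.log(2.0 * math.pi)
```

Unlike the CRPS, the log score penalizes observations in the tails very heavily, since −log f(y) grows quadratically in the standardized error.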

Forecast Accuracy and Evaluation Probability scoring 59

Page 73

R packages

Forecast Accuracy and Evaluation Probability scoring 60

https://github.com/earowang/tsibble

http://pkg.earo.me/sugrrants

https://github.com/mitchelloharawild/fasster

http://pkg.robjhyndman.com/forecast

http://pkg.earo.me/hts

Page 74

Textbook

Forecast Accuracy and Evaluation Probability scoring 61

OTexts.org/fpp2