Forecasting
Lecture 2: Forecast Combination, Multi-Step Forecasts
Bruce E. Hansen
Central Bank of Chile
October 29-31, 2013
Bruce Hansen (University of Wisconsin) Forecast Combination and Multi-Step Forecasts October 29-31, 2013 1 / 82
Today’s Schedule
Combination Forecasts
Multi-Step Forecasting
Fan Charts
Iterated Forecasts
If time (optional): Threshold models or Nonlinear/Nonparametric Time Series
Review
Optimal point forecast of $y_{n+1}$ given information $I_n$ is the conditional mean $E(y_{n+1} \mid I_n)$
Linear model $E(y_{n+1} \mid I_n) \approx \beta' x_n$ is an approximation
Estimate linear projections by least-squares
Model selection should focus on performance, not "truth"
  - Best forecast has smallest MSFE
  - Unknown, but MSFE can be estimated
  - CV is a good estimator of MSFE
Good forecasts rely on selection of leading indicators
Combination Forecasts
Diversity of Forecasts
Model choice is critical
  - Classic approach: Selection
  - Modern approach: Combination
Issues:
  - How to select from a wide set of models/forecasts?
      - Model selection criteria
  - How to combine a wide set of models/forecasts?
      - Weight selection criteria
Foundation
The ideal point forecast minimizes the MSFE
The goal of a good combination forecast is to minimize the MSFE
Forecast Selection
M forecasts: $f = \{f(1), f(2), \ldots, f(M)\}$
Selection picks $m$ to determine the forecast $f = f(m)$
M weights: $w = \{w(1), w(2), \ldots, w(M)\}$
A combination forecast is the weighted average
$$f(w) = \sum_{m=1}^{M} w(m)\, f(m) = w'f$$
Combination generalizes selection
Possible restrictions on the weight vector
$\sum_{m=1}^{M} w(m) = 1$
  - Unbiasedness
  - Typically improves performance
$w(m) \ge 0$
  - Nonnegativity
  - Regularization
  - Often critical for good performance
$w(m) \in \{0, 1\}$
  - Equivalent to forecast selection
  - $f(w) = f(m)$
  - Selection is a special case of combination
  - Strong restriction
OOS Forecast Combination
Sequence of true out-of-sample forecasts $f_t$ for $y_{t+1}$
Combination forecast is $f(w) = w'f$
OOS empirical MSFE
$$\sigma^2(w) = \frac{1}{P} \sum_{t=n-P}^{n} \left( y_{t+1} - w'f_t \right)^2$$
PLS selected the model with the smallest OOS MSFE
Granger-Ramanathan combination: select $w$ to minimize the OOS MSFE
Minimization over $w$ is equivalent to the least-squares regression of $y_{t+1}$ on the forecasts
$$y_{t+1} = w'f_t + \varepsilon_{t+1}$$
Granger-Ramanathan (1984)
Unrestricted least-squares
$$w = \left( \sum_{t=n-P}^{n} f_t f_t' \right)^{-1} \sum_{t=n-P}^{n} f_t\, y_{t+1}$$
This can produce weights that lie far outside [0, 1] and do not sum to one
Granger-Ramanathan's intuition was that this flexibility is good
  - But they provided no theory to support the conjecture
Unrestricted weights are not regularized
  - This results in poor sampling performance
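A minimal sketch of the unrestricted Granger-Ramanathan regression for the special case of two forecasts, where the normal equations can be solved by hand. The function name and the data are hypothetical and chosen only so the answer is known in advance; the point is that the least-squares weights are free to fall outside [0, 1].

```python
def gr_weights_two(f1, f2, y):
    """Unrestricted least-squares weights from regressing y on two forecasts (no intercept)."""
    a11 = sum(a * a for a in f1)
    a12 = sum(a * b for a, b in zip(f1, f2))
    a22 = sum(b * b for b in f2)
    b1 = sum(a * t for a, t in zip(f1, y))
    b2 = sum(b * t for b, t in zip(f2, y))
    det = a11 * a22 - a12 * a12          # determinant of the 2x2 matrix sum(f f')
    w1 = (a22 * b1 - a12 * b2) / det     # first row of the inverse times sum(f y)
    w2 = (a11 * b2 - a12 * b1) / det     # second row of the inverse times sum(f y)
    return w1, w2

# Hypothetical forecasts and outcomes, built so that y = 1.5*f1 - 0.5*f2 exactly.
f1 = [1.0, 2.0, 3.0, 4.0]
f2 = [2.0, 1.0, 4.0, 3.0]
y = [1.5 * a - 0.5 * b for a, b in zip(f1, f2)]

w1, w2 = gr_weights_two(f1, f2, y)
print(w1, w2)  # weights sum to one here, but neither lies in [0, 1]
```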
Alternative Representation
Take $y_{t+1} = w'f_t + \varepsilon_{t+1}$, subtract $y_{t+1}$ from each side
$$0 = w'f_t - y_{t+1} + \varepsilon_{t+1}$$
Impose the restriction that the weights sum to one.
$$0 = w'(f_t - y_{t+1}) + \varepsilon_{t+1}$$
Define $e_{t+1} = f_t - y_{t+1}$, the (negative) forecast errors. Then
$$0 = w'e_{t+1} + \varepsilon_{t+1}$$
This is the regression of 0 on the forecast errors
But it is still better to also impose non-negativity $w(m) \ge 0$
Constrained Granger-Ramanathan
The constrained GR weights solve the problem
$$\min_w \; w'Aw$$
subject to
$$\sum_{m=1}^{M} w(m) = 1, \qquad 0 \le w(m) \le 1$$
where
$$A = \sum_t e_{t+1} e_{t+1}'$$
is the M × M matrix of forecast error empirical variances/covariances
Quadratic Programming (QP)
The weights lie on the unit simplex
The constrained GR weights minimize a quadratic over the unit simplex
QP algorithms easily solve this problem
  - Gauss (qprog)
  - Matlab (quadprog)
  - R (quadprog)
Corner solutions are typical
  - Many forecasts will receive zero weight
Bates-Granger (1969)
Assume $A = \sum_t e_{t+1} e_{t+1}'$ is diagonal.
Then the regression with the coefficients constrained to sum to one
$$0 = w'e_{t+1} + \varepsilon_{t+1}$$
has solution
$$w(m) = \frac{\sigma^{-2}(m)}{\sum_{j=1}^{M} \sigma^{-2}(j)}$$
These are the Bates-Granger weights.
In many cases, they are close to equality, since OOS forecast variances can be quite similar
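As an illustration of the inverse-variance formula, the following sketch computes Bates-Granger weights from a list of OOS forecast error variances. The variances are hypothetical values, picked to be similar to one another so that the weights come out close to equal.

```python
def bates_granger_weights(variances):
    """Weights proportional to the inverse of each forecast's error variance."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [value / total for value in inverse]

# Hypothetical OOS forecast error variances for three forecasts.
oos_variances = [4.0, 5.0, 4.5]
bg_weights = bates_granger_weights(oos_variances)
print(bg_weights)  # the forecast with the smallest variance gets the largest weight
```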
Bayesian Model Averaging (BMA)
Put priors on individual models, and priors on the probability that model m is the true model
Compute posterior probabilities w(m) that m is the true model
Forecast combination using w(m)
Advantages
  - Conceptually simple
  - No theoretical analysis required
  - Applies in broad contexts
Disadvantages
  - Not designed to minimize forecast risk
  - Similar to BIC: asymptotically picks "true" finite models
  - Does not distinguish between 1-step and multi-step forecast horizons
BMA Approximation
BIC weights
$$w(m) \propto \exp\left( -\frac{BIC(m)}{2} \right)$$
Simple approximation to full BMA method
Smoothed version of BIC selection
Works better than BIC selection in simulations
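The exponential weighting can be sketched as below. Subtracting the smallest criterion value before exponentiating is a numerical-stability choice made here, not something stated on the slide; it leaves the normalized weights unchanged. The BIC values are hypothetical.

```python
import math

def ic_weights(ic_values):
    """Weights proportional to exp(-IC/2), normalized to sum to one."""
    best = min(ic_values)
    # Shifting by the minimum avoids underflow and cancels in the normalization.
    raw = [math.exp(-(value - best) / 2.0) for value in ic_values]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical BIC values for three candidate models.
bic_values = [100.0, 102.0, 110.0]
bic_weights = ic_weights(bic_values)
print(bic_weights)  # most weight on the lowest-BIC model, smoothly declining
```

The same function applies to smoothed AIC weights by passing AIC values instead.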
AIC Weights
Smoothed AIC
$$w(m) \propto \exp\left( -\frac{AIC(m)}{2} \right)$$
Proposed by Buckland, Burnham and Augustin (1997)
Not theoretically motivated, but works better than AIC selection in simulations
Comments
Combination methods typically work better (lower MSFE) than comparable selection methods
BIC and BMA are not optimal for MSFE
Granger-Ramanathan is similarly sensitive to the choice of P as PLS
Bates-Granger and weighted AIC have no theoretical grounding
Forecast Combination
$$\hat y_{n+1}(w) = \sum_{m=1}^{M} w(m)\, \hat y_{n+1}(m) = \sum_{m=1}^{M} w(m)\, x_n(m)' \hat\beta(m) = x_n' \hat\beta(w)$$
where
$$\hat\beta(w) = \sum_{m=1}^{M} w(m)\, \hat\beta(m)$$
In linear models, the combination forecast is the same as the forecast based on the weighted average of the parameter estimates across the different models
Computationally, it is easiest to calculate the M individual forecasts $\hat y_{n+1}(m)$, then take the weighted average to obtain $\hat y_{n+1}(w)$
Combination Residuals
$$\hat e_{t+1}(w) = y_{t+1} - x_t' \hat\beta(w) = \sum_{m=1}^{M} w(m) \left( y_{t+1} - x_t' \hat\beta(m) \right) = \sum_{m=1}^{M} w(m)\, \hat e_{t+1}(m)$$
(the middle step uses $\sum_{m=1}^{M} w(m) = 1$)
In linear models, the residual from the combination model is the same as the weighted average of the model residuals.
Mallows Averaging Criterion
$$C_n(w) = \sigma^2(w) + \frac{2}{n}\, \sigma^2 \sum_{m=1}^{M} w(m)\, k(m)$$
with $\sigma^2$ an estimate from a "large" model
$C_n(w)$ is an estimate of the MSFE (assuming homoskedasticity)
Hansen (2007, Econometrica) Mallows Model Averaging (MMA)
Hansen (Journal of Econometrics, 2008) Forecast Model Averaging (FMA)
Combination weights found by constrained minimization
$$w = \arg\min_w \; C_n(w)$$
subject to
$$\sum_{m=1}^{M} w(m) = 1, \qquad 0 \le w(m) \le 1$$
Solution by Quadratic Programming (QP)
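To make the criterion concrete, here is a small sketch that evaluates $C_n(w)$ for a given weight vector. It assumes that $\sigma^2(w)$ is the average squared combination residual and that $k(m)$ is the number of parameters in model m; the residuals, parameter counts, and large-model variance are hypothetical. Only the evaluation is shown, not the QP minimization.

```python
def mallows_criterion(residuals, weights, k, sigma2_large):
    """Evaluate C_n(w): average squared combination residual plus the Mallows penalty.

    residuals: list of M residual series (one per model), each of length n
    weights:   list of M weights
    k:         list of M parameter counts (assumed meaning of k(m))
    """
    n = len(residuals[0])
    # In linear models the combination residual is the weighted average of model residuals.
    combined = [sum(w * e[t] for w, e in zip(weights, residuals)) for t in range(n)]
    fit = sum(v * v for v in combined) / n
    penalty = 2.0 * sigma2_large / n * sum(w * km for w, km in zip(weights, k))
    return fit + penalty

# Hypothetical in-sample residuals from two models with 1 and 2 parameters.
model_residuals = [[1.0, -1.0, 1.0, -1.0], [0.5, -0.5, 0.5, -0.5]]
c_value = mallows_criterion(model_residuals, [0.5, 0.5], [1, 2], sigma2_large=0.25)
print(c_value)
```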
Theory of Optimal Weights
Hansen (2007, Econometrica)
Mallows weight selection is asymptotically optimal under homoskedasticity
  - In large samples, equivalent to using MSFE-minimizing weights
Comparison of Granger-Ramanathan and FMA
Both are solved by Quadratic Programming (QP)
Both typically yield corner solutions: many forecasts will receive zero weight
GR uses empirical (OOS) forecast errors, FMA uses sample residuals
GR uses no penalty, FMA uses "average # of parameters" penalty
FMA is an estimate of MSFE for homoskedastic one-step forecasts, GR has no optimality
Cross-Validation
Leave-one-out estimator
$$\hat\beta_{-t}(w) = \sum_{m=1}^{M} w(m)\, \hat\beta_{-t}(m)$$
Leave-one-out prediction residual
$$\tilde e_{t+1}(w) = y_{t+1} - \sum_{m=1}^{M} w(m)\, \hat\beta_{-t}(m)' x_t(m) = \sum_{m=1}^{M} w(m)\, \tilde e_{t+1}(m)$$
$$CV_n(w) = \frac{1}{n} \sum_{t=0}^{n-1} \tilde e_{t+1}(w)^2$$
is an estimate of $MSFE_n(w)$
Cross-validation (CV) criterion for regression combination/averaging
Cross-validation Weights
Combination weights found by constrained minimization of CVn(w)
$$\min_w \; CV_n(w) = w'Sw$$
subject to
$$\sum_{m=1}^{M} w(m) = 1, \qquad 0 \le w(m) \le 1$$
Cross-validation for combination forecasts (theory)
Theorem: $E\, CV_n(w) \approx C_n(w)$
For heteroskedastic forecasts, CV is a valid estimate of the one-step MSFE from a combination forecast
Hansen and Racine (Journal of Econometrics, 2012) show that the CV weights are asymptotically optimal for cross-section data under heteroskedasticity
Summary: Forecast Combination Methods
Granger-Ramanathan (GR), forecast model averaging (FMA) and cross-validation (CV) all pick weight vectors by quadratic minimization
GR only needs actual forecasts, the method can be unknown or a black box
CV can be computed for a wide variety of estimation methods
  - Optimality theory for linear estimation
FMA limited to homoskedastic one-step-ahead models
Smoothed AIC (SAIC) and BMA have no forecast optimality, and are designed for homoskedastic one-step-ahead forecasts.
Example: AR models for GDP Growth
Fit AR(1) and AR(2) only
Leave-one-out residuals $\tilde e_{1t}$ and $\tilde e_{2t}$
Covariance matrix
$$S = \begin{bmatrix} 10.72 & 10.44 \\ 10.44 & 10.52 \end{bmatrix}$$
The best-fitting single model is AR(2)
The best combination is $w = (.22, .78)'$
CV = 10.50
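With only two models the simplex-constrained problem has a closed form, so the numbers on this slide can be checked without a QP solver. The sketch below assumes S is positive definite; the function names are illustrative. With more than two models a quadratic programming routine is needed instead.

```python
def two_model_weights(S):
    """Minimize w'Sw over w1 + w2 = 1, 0 <= w1 <= 1, for a 2x2 positive definite S."""
    s11, s12, s22 = S[0][0], S[0][1], S[1][1]
    # Unconstrained minimizer of the one-dimensional quadratic in w1.
    w1 = (s22 - s12) / (s11 + s22 - 2.0 * s12)
    # The objective is convex in w1, so clipping to [0, 1] gives the constrained minimizer.
    w1 = min(1.0, max(0.0, w1))
    return [w1, 1.0 - w1]

def quadratic_form(S, w):
    """Compute w'Sw."""
    return sum(w[i] * S[i][j] * w[j] for i in range(len(w)) for j in range(len(w)))

S = [[10.72, 10.44], [10.44, 10.52]]
cv_weights = two_model_weights(S)
cv_value = quadratic_form(S, cv_weights)
print([round(w, 2) for w in cv_weights], round(cv_value, 2))
```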
Example: AR models for GDP Growth
Fit AR(0) through AR(12)
AR(0) is constant only
Models with positive weight are AR(0), AR(1), AR(2)
$w = (.06, .16, .78)'$
$$S = \begin{bmatrix} 12.0 & 10.6 & 10.4 \\ 10.6 & 10.7 & 10.4 \\ 10.4 & 10.5 & 10.5 \end{bmatrix}$$
CV = 10.50 (essentially unchanged)
Example: Leading Indicator Forecasts
Fit AR(1), AR(2) with leading indicators
Models with positive weight
Model                                            w
AR(1), Spread, Housing                           0.13
AR(1), Spread, High-Yield, Housing               0.16
AR(1), Spread, High-Yield, Housing, Building     0.52
AR(2)                                            0.18
AR(2), Spread                                    0.01
CV = 9.81
Summary: Forecast Combination by CV
M forecasts $f_{n+1}(m)$ from n observations
For each estimate m
  - Define the leave-one-out prediction error
    $$\tilde e_{t+1}(m) = y_{t+1} - \hat\beta_{(-t)}(m)' x_t(m) = \frac{\hat e_{t+1}(m)}{1 - h_{tt}(m)}$$
    where $h_{tt}(m)$ is the leverage of observation t in model m
  - Store the n × 1 vector $\tilde e(m)$
Construct the M × M matrix
$$S = \frac{1}{n}\, \tilde e' \tilde e$$
Find the M × 1 weight vector w which minimizes $w'Sw$
  - Use quadratic programming (quadprog) to find solution
The combination forecast is $f_{n+1} = \sum_{m=1}^{M} w(m)\, f_{n+1}(m)$
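The leverage shortcut for the leave-one-out prediction error can be checked numerically. The sketch below does so for the simplest case, a single regressor with no intercept, where the leverage is $x_t^2 / \sum_j x_j^2$; the data and function names are hypothetical. It compares the shortcut with brute-force re-estimation that drops one observation at a time.

```python
def loo_errors_by_leverage(x, y):
    """Leave-one-out prediction errors via e_t / (1 - h_tt) for y = beta*x + e."""
    sxx = sum(v * v for v in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    return [(yt - beta * xt) / (1.0 - xt * xt / sxx) for xt, yt in zip(x, y)]

def loo_errors_by_refitting(x, y):
    """Same errors computed by re-estimating beta with observation t removed."""
    errors = []
    for t in range(len(x)):
        sxx = sum(v * v for i, v in enumerate(x) if i != t)
        sxy = sum(x[i] * y[i] for i in range(len(x)) if i != t)
        errors.append(y[t] - (sxy / sxx) * x[t])
    return errors

# Hypothetical regressor and dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3]

shortcut = loo_errors_by_leverage(x, y)
refit = loo_errors_by_refitting(x, y)
print(shortcut)
```

Stacking such error vectors for each model column by column gives the matrix used to form S.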
Forecast Combination Criticisms
There has been considerable skepticism about formal forecast combination methods in the forecast literature
Many researchers have found that equal weighting ($w(m) = 1/M$) works as well as formal methods
However, the formal methods which have been investigated are
  - Bates-Granger simple weights
      - Not expected by theory to work well
  - Unconstrained Granger-Ramanathan
      - Without imposing [0, 1] weights, works terribly!
Furthermore, most investigations examine pseudo out-of-sample performance
  - Identical to comparing models by PLS criterion
  - This is NOT an investigation of performance
  - Just a ranking by PLS
Another Example - 10-Year Bond Rate
Estimated AR(1) through AR(24) models
CV Selection picked AR(2)
CV weight selection: models with positive weight
  - AR(0): w = 0.04
  - AR(1): w = 0.04
  - AR(2): w = 0.47
  - AR(6): w = 0.23
  - AR(22): w = 0.22
Minimizing CV = 0.0761 (slightly lower than 0.0768 from AR(2))
Point forecast 1.96 (same as from AR(2))
Multi-Step Forecasts
Multi-Step Forecasts
Forecast horizon: h
We say the forecast is “multi-step” if h > 1
Forecasting $y_{n+h}$ given $I_n$
  - e.g., forecasting GDP growth for 2012:3, 2012:4, 2013:1, 2013:2
The forecast distribution is $y_{n+h} \mid I_n \sim F_h(y_{n+h} \mid I_n)$
Point Forecast
$f_{n+h|n}$ minimizes expected squared loss
$$f_{n+h|n} = \arg\min_f \; E\left( (y_{n+h} - f)^2 \mid I_n \right) = E(y_{n+h} \mid I_n)$$
Optimal point forecasts are h-step conditional means
Relationship Between Forecast Horizons
Take an AR(1) model
$$y_{t+1} = \alpha y_t + u_{t+1}$$
Iterate
$$y_{t+1} = \alpha(\alpha y_{t-1} + u_t) + u_{t+1} = \alpha^2 y_{t-1} + \alpha u_t + u_{t+1}$$
or
$$y_{t+2} = \alpha^2 y_t + e_{t+2}, \qquad e_{t+2} = \alpha u_{t+1} + u_{t+2}$$
Repeat h times
$$y_{t+h} = \alpha^h y_t + e_{t+h}, \qquad e_{t+h} = u_{t+h} + \alpha u_{t+h-1} + \alpha^2 u_{t+h-2} + \cdots + \alpha^{h-1} u_{t+1}$$
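The h-step decomposition can be verified numerically. The following sketch iterates the AR(1) recursion forward and compares the result with $\alpha^h y_t$ plus the moving-average error term; the coefficient, starting value, and shocks are hypothetical.

```python
def ar1_iterate(alpha, y_t, shocks):
    """Apply y_{s+1} = alpha*y_s + u_{s+1} once per shock; shocks = [u_{t+1}, ..., u_{t+h}]."""
    y = y_t
    for u in shocks:
        y = alpha * y + u
    return y

def ar1_decompose(alpha, y_t, shocks):
    """Return (alpha^h * y_t, e_{t+h}) where e_{t+h} is the MA(h-1) forecast error."""
    h = len(shocks)
    mean_part = alpha ** h * y_t
    # u_{t+1+j} enters with weight alpha^(h-1-j), so the latest shock has weight one.
    error_part = sum(alpha ** (h - 1 - j) * u for j, u in enumerate(shocks))
    return mean_part, error_part

# Hypothetical values.
alpha, y_t, shocks = 0.5, 2.0, [0.4, -0.2, 0.1]
iterated = ar1_iterate(alpha, y_t, shocks)
mean_part, error_part = ar1_decompose(alpha, y_t, shocks)
print(iterated, mean_part + error_part)
```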
AR(1)
h-step forecast
$$y_{t+h} = \alpha^h y_t + e_{t+h}, \qquad e_{t+h} = u_{t+h} + \alpha u_{t+h-1} + \alpha^2 u_{t+h-2} + \cdots + \alpha^{h-1} u_{t+1}$$
$$E(y_{n+h} \mid I_n) = \alpha^h y_n$$
h-step point forecast is linear in $y_n$
h-step forecast error $e_{n+h}$ is a MA(h − 1)
AR(2) Model
1-step AR(2) model
$$y_{t+1} = \alpha_0 + \alpha_1 y_t + \alpha_2 y_{t-1} + u_{t+1}$$
2-steps ahead
$$y_{t+2} = \alpha_0 + \alpha_1 y_{t+1} + \alpha_2 y_t + u_{t+2}$$
Taking conditional expectations
$$\begin{aligned}
E(y_{t+2} \mid I_t) &= \alpha_0 + \alpha_1 E(y_{t+1} \mid I_t) + \alpha_2 E(y_t \mid I_t) + E(u_{t+2} \mid I_t) \\
&= \alpha_0 + \alpha_1 (\alpha_0 + \alpha_1 y_t + \alpha_2 y_{t-1}) + \alpha_2 y_t \\
&= \alpha_0 + \alpha_1 \alpha_0 + (\alpha_1^2 + \alpha_2)\, y_t + \alpha_1 \alpha_2\, y_{t-1}
\end{aligned}$$
which is linear in $(y_t, y_{t-1})$
In general, a 1-step linear model implies an h-step approximate linear model in the same variables
AR(k) h-step forecasts
If
$$y_{t+1} = \alpha_0 + \alpha_1 y_t + \alpha_2 y_{t-1} + \cdots + \alpha_k y_{t-k+1} + u_{t+1}$$
then
$$y_{t+h} = \beta_0 + \beta_1 y_t + \beta_2 y_{t-1} + \cdots + \beta_k y_{t-k+1} + e_{t+h}$$
where $e_{t+h}$ is a MA(h − 1)
Leading Indicator Models
If
$$y_{t+1} = x_t'\beta + u_{t+1}$$
then
$$E(y_{t+h} \mid I_t) = E(x_{t+h-1} \mid I_t)'\beta$$
If $E(x_{t+h-1} \mid I_t)$ is itself (approximately) a linear function of $x_t$, then
$$E(y_{t+h} \mid I_t) = x_t'\gamma$$
$$y_{t+h} = x_t'\gamma + e_{t+h}$$
Common structure: the h-step conditional mean is similar to the 1-step structure, but the error is a MA.
Forecast Variable
We should think carefully about the variable we want to report in ourforecast
The choice will depend on the context
What do we want to forecast?
  - Future level: $y_{n+h}$
      - Interest rates, unemployment rates
  - Future differences: $\Delta y_{t+h}$
  - Cumulative change: $y_{t+h} - y_t$
      - Cumulative GDP growth
Forecast Transformation
$f_{n+h|n} = E(y_{n+h} \mid I_n)$ = expected future level
  - Level specification
    $$y_{t+h} = x_t'\beta + e_{t+h}, \qquad f_{n+h|n} = x_n'\beta$$
  - Difference specification
    $$\Delta y_{t+h} = x_t'\beta_h + e_{t+h}, \qquad f_{n+h|n} = y_n + x_n'\beta_1 + \cdots + x_n'\beta_h$$
  - Multi-step difference specification
    $$y_{t+h} - y_t = x_t'\beta + e_{t+h}, \qquad f_{n+h|n} = y_n + x_n'\beta$$
Direct and Iterated
There are two methods of multistep (h > 1) forecasts
Direct Forecast
  - Model and estimate $E(y_{n+h} \mid I_n)$ directly
Iterated Forecast
  - Model and estimate one-step $E(y_{n+1} \mid I_n)$
  - Iterate forward h steps
  - Requires full model for all variables
Both have advantages and disadvantages
  - For now, we will focus on the direct method.
Direct Multi-Step Forecasting
Markov approximation
  - $E(y_{n+h} \mid I_n) = E(y_{n+h} \mid x_n, x_{n-1}, \ldots) \approx E(y_{n+h} \mid x_n, \ldots, x_{n-p})$
Linear approximation
  - $E(y_{n+h} \mid x_n, \ldots, x_{n-p}) \approx \beta' x_n$
Projection Definition
  - $\beta = \left( E(x_t x_t') \right)^{-1} \left( E(x_t y_{t+h}) \right)$
Forecast error
  - $e_{t+h} = y_{t+h} - \beta' x_t$
Multi-Step Forecast Model
$$y_{t+h} = \beta' x_t + e_{t+h}$$
$$\beta = \left( E(x_t x_t') \right)^{-1} \left( E(x_t y_{t+h}) \right)$$
$$E(x_t e_{t+h}) = 0$$
$$\sigma^2 = E(e_{t+h}^2)$$
Least Squares Estimation
$$\hat\beta = \left( \sum_{t=0}^{n-1} x_t x_t' \right)^{-1} \left( \sum_{t=0}^{n-1} x_t y_{t+h} \right)$$
$$\hat y_{n+h|n} = \hat f_{n+h|n} = \hat\beta' x_n$$
Residuals
Least-squares residuals
  - $\hat e_{t+h} = y_{t+h} - \hat\beta' x_t$
  - Standard, but overfit
Leave-one-out residuals
  - $\tilde e_{t+h} = y_{t+h} - \hat\beta_{-t}' x_t$
  - Does not correct for MA errors
Leave-h-out residuals
$$\tilde e_{t+h} = y_{t+h} - \hat\beta_{-t,h}' x_t$$
$$\hat\beta_{-t,h} = \left( \sum_{|j-t| \ge h} x_j x_j' \right)^{-1} \left( \sum_{|j-t| \ge h} x_j y_{j+h} \right)$$
The summation is over all observations outside h − 1 periods of t + h.
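To show which observations enter the leave-h-out estimator, the sketch below lists the indices j that are kept when predicting observation t, reading the verbal rule above as $|j - t| \ge h$. The function name and the sample size are illustrative.

```python
def leave_h_out_indices(n, t, h):
    """Indices j in 0..n-1 kept when leaving out all observations within h-1 periods of t."""
    return [j for j in range(n) if abs(j - t) >= h]

# With h = 3, observations t-2, ..., t+2 are dropped around t = 5.
kept_h3 = leave_h_out_indices(10, 5, 3)
# With h = 1 this reduces to ordinary leave-one-out.
kept_h1 = leave_h_out_indices(10, 5, 1)
print(kept_h3, kept_h1)
```

Dropping the neighbours of t matters because the h-step error is a MA(h − 1), so nearby errors are correlated with the one being predicted.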
Example: GDP Forecast
$y_t = 400 \log(GDP_t)$
Forecast Variable: GDP growth over next h quarters, at annual rate
$$\frac{y_{t+h} - y_t}{h} = \beta_0 + \beta_1 \Delta y_t + \beta_2 \Delta y_{t-1} + \beta_3 Spread_t + \beta_4 HighYield_t + \beta_5 HS_t + e_{t+h}$$
$HS_t$ = Housing Starts$_t$

              h = 1          h = 2          h = 3          h = 4
β0           −0.33 (1.0)    −0.38 (1.3)    −0.01 (1.6)     0.47 (1.8)
∆yt           0.16 (.10)     0.18 (.09)     0.13 (.08)     0.13 (.09)
∆yt−1         0.09 (.10)     0.04 (.05)     0.05 (.07)     0.02 (.06)
Spreadt       0.61 (.23)     0.65 (.19)     0.65 (.22)     0.65 (.25)
HighYieldt   −1.10 (.75)    −0.68 (.70)    −0.48 (.90)    −0.41 (1.01)
HSt           1.86 (.65)     1.64 (.70)     1.31 (.80)     1.01 (.94)
Example: GDP Forecast
Cumulative Annualized Growth

          Forecast   Actual
2012:2      1.3       1.2
2012:3      1.6       2.0
2012:4      2.9       1.4
2013:1      2.2       1.3
2013:2      2.4       1.5
2013:3      2.7
2013:4      2.9
2014:1      3.2
Selection and Combination for h step forecasts
AIC routinely used for model selection
PLS (OOS MSFE) routinely used for model evaluation
Neither well justified
Not well studied problem
I recommend “leave h out” cross-validation.
Topic of afternoon seminar
Minimize sum of squared “leave h out” residuals, separately for eachforecast horizon.
Example: GDP Forecast Weights by Horizon
               h = 1   h = 2   h = 3   h = 4   h = 5   h = 6   h = 7
AR(1)                   .15     .19     .28     .18     .16     .11
AR(2)           .30
AR(1)+HS        .66     .70     .22
AR(1)+HS+BP             .14     .58     .72     .82     .84     .89
AR(2)+HS        .04
yn+h|n          1.7     2.0     1.9     2.0     2.1     2.3     2.6
h-step Variance Forecasting
Not well developed using direct methods
h-step Interval Forecasts
Similar to 1-step interval forecasts
  - But calculated from h-step residuals
Use constant variance specification
Let $q^e(\alpha)$ and $q^e(1-\alpha)$ be the α'th and (1 − α)'th quantiles of the residuals $e_{t+h}$
Forecast Interval:
$$\left[ \mu_n + q^e(\alpha), \; \mu_n + q^e(1-\alpha) \right]$$
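A minimal sketch of this interval under the constant-variance specification: take empirical quantiles of the h-step residuals and shift them by the point forecast. The quantile rule (linear interpolation between order statistics) is one common convention chosen here for illustration, and the residuals and point forecast are hypothetical.

```python
def empirical_quantile(values, q):
    """Empirical quantile using linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

def forecast_interval(point_forecast, residuals, alpha):
    """Interval [mu + q(alpha), mu + q(1 - alpha)] built from h-step residual quantiles."""
    return (point_forecast + empirical_quantile(residuals, alpha),
            point_forecast + empirical_quantile(residuals, 1.0 - alpha))

# Hypothetical h-step residuals and point forecast.
h_step_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
interval_80 = forecast_interval(2.0, h_step_residuals, 0.10)
print(interval_80)
```

In practice the residuals would be the leave-h-out residuals for the chosen horizon, computed separately for each h.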
Fan Charts
Plots of a set of interval forecasts for multiple horizons
  - Pick a set of horizons, h = 1, ..., H
  - Pick a set of quantiles, e.g. α = .10, .25, .75, .90
  - Recall the quantiles of the conditional distribution are
    $$q_n(\alpha, h) = \mu_n(h) + \sigma_n(h)\, q^\varepsilon(\alpha, h)$$
  - Plot $q_n(.1, h)$, $q_n(.25, h)$, $\mu_n(h)$, $q_n(.75, h)$, $q_n(.9, h)$ against h
Graphs are easier to interpret than tables
Illustration
I've been making monthly forecasts of the Wisconsin unemployment rate
Forecast horizon h = 1, ..., 12 (one year)
Quantiles: α = .1, .25, .75, .90
This corresponds to plotting 50% and 80% forecast intervals
50% intervals show “likely” region (equal odds)
Comments
Showing the recent history gives perspective
Some published fan charts use colors to indicate regions, but do not label the colors
Labels are important to infer probabilities
I like clean plots, not cluttered
Illustration: GDP Growth
Figure: GDP Average Growth Fan Chart (horizontal axis: 2011.0 to 2014.0)
It doesn't "fan" because we are plotting average growth
Figure: Fan Chart with Actuals (horizontal axis: 2011.0 to 2014.0)
Iterated Forecasts
Estimate one-step forecast
Iterate to obtain multi-step forecasts
Only works in complete systems
  - Autoregressions
  - Vector autoregressions
Vector Autoregressive Models
$y_t$ is a p vector
$x_t$ are other variables (including lags)
Ideal point forecast $E(y_{n+1} \mid I_n)$
Linear approximation
$$E(y_{n+1} \mid I_n) \approx A_1 y_n + A_2 y_{n-1} + \cdots + A_k y_{n-k+1} + B x_n$$
Vector Autoregression (VAR)
$$y_{t+1} = A_1 y_t + A_2 y_{t-1} + \cdots + A_k y_{t-k+1} + B x_t + e_{t+1}$$
Estimation: Least squares
$$y_{t+1} = \hat A_1 y_t + \hat A_2 y_{t-1} + \cdots + \hat A_k y_{t-k+1} + \hat B x_t + \hat e_{t+1}$$
One-Step-Ahead Point forecast
$$\hat y_{n+1} = \hat A_1 y_n + \hat A_2 y_{n-1} + \cdots + \hat A_k y_{n-k+1} + \hat B x_n$$
Vector Autoregressive versus Univariate Models
Let $x_t = (y_t, y_{t-1}, \ldots, x_t)$
Then a VAR is a set of p regression models
$$\begin{aligned}
y_{1,t+1} &= \beta_1' x_t + e_{1t} \\
&\;\;\vdots \\
y_{p,t+1} &= \beta_p' x_t + e_{pt}
\end{aligned}$$
All variables $x_t$ enter symmetrically in each equation
Sims (1980) argued that there is no a priori reason to include or exclude an individual variable from an individual equation.
Model Selection
Do not view selection as identification of “truth”
Rather, inclusion/exclusion is to improve finite sample performance
  - Minimize MSFE
Use selection methods, equation-by-equation
Example: VAR with 2 variables
$$\begin{aligned}
y_{1,t+1} &= \beta_{11} y_{1t} + \beta_{12} y_{1,t-1} + \beta_{13} y_{2t} + e_{1t} \\
y_{2,t+1} &= \beta_{21} y_{1t} + \beta_{22} y_{2t} + \beta_{23} y_{2,t-1} + e_{2t}
\end{aligned}$$
Selection picks $y_{1t}, y_{1,t-1}, y_{2t}$ for the equation for $y_{1,t+1}$
Selection picks $y_{1t}, y_{2t}, y_{2,t-1}$ for the equation for $y_{2,t+1}$
The two equations have different variables
Same as system
$$y_{t+1} = A_1 y_t + A_2 y_{t-1} + e_{t+1}$$
with
$$A_1 = \begin{bmatrix} \beta_{11} & \beta_{13} \\ \beta_{21} & \beta_{22} \end{bmatrix}, \qquad A_2 = \begin{bmatrix} \beta_{12} & 0 \\ 0 & \beta_{23} \end{bmatrix}$$
The VAR system notation is still quite useful for many purposes (including multi-step forecasting)
Iterative Forecast Relationships in Linear VAR
vector $y_t$
$$y_{t+1} = A_0 + A_1 y_t + A_2 y_{t-1} + \cdots + A_k y_{t-k+1} + u_{t+1}$$
1-step conditional mean
$$\begin{aligned}
E(y_{t+1} \mid I_t) &= A_0 + A_1 E(y_t \mid I_t) + \cdots + A_k E(y_{t-k+1} \mid I_t) \\
&= A_0 + A_1 y_t + A_2 y_{t-1} + \cdots + A_k y_{t-k+1}
\end{aligned}$$
2-step conditional mean
$$\begin{aligned}
E(y_{t+1} \mid I_{t-1}) &= E\left( E(y_{t+1} \mid I_t) \mid I_{t-1} \right) \\
&= A_0 + A_1 E(y_t \mid I_{t-1}) + \cdots + A_k E(y_{t-k+1} \mid I_{t-1}) \\
&= A_0 + A_1 E(y_t \mid I_{t-1}) + A_2 y_{t-1} + \cdots + A_k y_{t-k+1}
\end{aligned}$$
h-step conditional mean
$$\begin{aligned}
E(y_{t+1} \mid I_{t-h+1}) &= E\left( E(y_{t+1} \mid I_t) \mid I_{t-h+1} \right) \\
&= A_0 + A_1 E(y_t \mid I_{t-h+1}) + \cdots + A_k E(y_{t-k+1} \mid I_{t-h+1})
\end{aligned}$$
Linear in lower-order (up to h − 1 step) conditional means
Iterative Least Squares Forecasts
Estimate 1-step VAR(k) by least-squares
$$y_{t+1} = \hat A_0 + \hat A_1 y_t + \hat A_2 y_{t-1} + \cdots + \hat A_k y_{t-k+1} + \hat u_{t+1}$$
Gives 1-step point forecast
$$\hat y_{n+1|n} = \hat A_0 + \hat A_1 y_n + \hat A_2 y_{n-1} + \cdots + \hat A_k y_{n-k+1}$$
2-step iterative forecast
$$\hat y_{n+2|n} = \hat A_0 + \hat A_1 \hat y_{n+1|n} + \hat A_2 y_n + \cdots + \hat A_k y_{n-k+2}$$
h-step iterative forecast
$$\hat y_{n+h|n} = \hat A_0 + \hat A_1 \hat y_{n+h-1|n} + \hat A_2 \hat y_{n+h-2|n} + \cdots + \hat A_k \hat y_{n+h-k|n}$$
This is (numerically) different than the direct LS forecast
Illustration 1: GDP Growth
AR(2) Model
$$y_{t+1} = 1.6 + 0.30\, y_t + 0.16\, y_{t-1}$$
$y_n = 1.8$, $y_{n-1} = 2.9$
$$\begin{aligned}
y_{n+1} &= 1.6 + 0.30 \times 1.8 + 0.16 \times 2.9 = 2.6 \\
y_{n+2} &= 1.6 + 0.30 \times 2.6 + 0.16 \times 1.8 = 2.7 \\
y_{n+3} &= 1.6 + 0.30 \times 2.7 + 0.16 \times 2.6 = 2.9 \\
y_{n+4} &= 1.6 + 0.30 \times 2.9 + 0.16 \times 2.7 = 3.0
\end{aligned}$$
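The recursion can be sketched in a few lines. The coefficients and starting values below are the rounded ones printed on the slide; because they are rounded, later horizons computed this way need not match every printed digit. The function name is illustrative.

```python
def iterate_ar2(a0, a1, a2, y_n, y_n_minus_1, horizon):
    """Iterated AR(2) point forecasts: each forecast feeds into the next step."""
    forecasts = []
    previous, previous2 = y_n, y_n_minus_1
    for _ in range(horizon):
        next_value = a0 + a1 * previous + a2 * previous2
        forecasts.append(next_value)
        # Shift the lags: the new forecast becomes lag 1, the old lag 1 becomes lag 2.
        previous, previous2 = next_value, previous
    return forecasts

ar2_forecasts = iterate_ar2(1.6, 0.30, 0.16, 1.8, 2.9, horizon=4)
print([round(value, 2) for value in ar2_forecasts])
```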
Point Forecasts
2012:2   2.65
2012:3   2.72
2012:4   2.87
2013:1   2.93
2013:2   2.97
2013:3   2.99
2013:4   3.00
2014:1   3.01
Illustration 2: GDP Growth+Housing Starts
VAR(2) Model
$y_{1t}$ = GDP Growth, $y_{2t}$ = Housing Starts
$x_t$ = (GDP Growth$_t$, Housing Starts$_t$, GDP Growth$_{t-1}$, Housing Starts$_{t-1}$)
$$y_{t+1} = A_0 + A_1 y_t + A_2 y_{t-1} + u_{t+1}$$
$$\begin{aligned}
y_{1,t+1} &= 0.43 + 0.15\, y_{1t} + 11.2\, y_{2t} + 0.18\, y_{1,t-1} - 10.1\, y_{2,t-1} \\
y_{2,t+1} &= 0.07 - 0.001\, y_{1t} + 1.2\, y_{2t} - 0.001\, y_{1,t-1} - 0.26\, y_{2,t-1}
\end{aligned}$$
Illustration 2: GDP Growth+Housing Starts
$y_{1n} = 1.8$, $y_{2n} = 0.71$, $y_{1,n-1} = 2.9$, $y_{2,n-1} = 0.68$
$$\begin{aligned}
y_{1,n+1} &= 0.43 + 0.15 \times 1.8 + 11.2 \times 0.71 + 0.18 \times 2.9 - 10.1 \times 0.68 = 2.3 \\
y_{2,n+1} &= 0.07 - 0.001 \times 1.8 + 1.2 \times 0.71 - 0.001 \times 2.9 - 0.26 \times 0.68 = 0.76
\end{aligned}$$
$$\begin{aligned}
y_{1,n+2} &= 0.43 + 0.15 \times 2.3 + 11.2 \times 0.76 + 0.18 \times 1.8 - 10.1 \times 0.71 = 2.4 \\
y_{2,n+2} &= 0.07 - 0.001 \times 2.3 + 1.2 \times 0.76 - 0.001 \times 1.8 - 0.26 \times 0.71 = 0.80
\end{aligned}$$
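The same iteration in system form, as a sketch using plain lists for the matrices. The coefficient values are the rounded ones shown on the slide, so the output will not reproduce every printed digit (the housing starts equation in particular is sensitive to rounding of its small coefficients). Function names are illustrative.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def iterate_var2(A0, A1, A2, y_n, y_n_minus_1, horizon):
    """Iterated VAR(2) point forecasts: y_{n+h|n} = A0 + A1 y_{n+h-1|n} + A2 y_{n+h-2|n}."""
    forecasts = []
    previous, previous2 = list(y_n), list(y_n_minus_1)
    for _ in range(horizon):
        lag1 = mat_vec(A1, previous)
        lag2 = mat_vec(A2, previous2)
        next_value = [c + a + b for c, a, b in zip(A0, lag1, lag2)]
        forecasts.append(next_value)
        previous, previous2 = next_value, previous
    return forecasts

# Rounded coefficients from the slide; variable order is (GDP growth, housing starts).
A0 = [0.43, 0.07]
A1 = [[0.15, 11.2], [-0.001, 1.2]]
A2 = [[0.18, -10.1], [-0.001, -0.26]]

var_forecasts = iterate_var2(A0, A1, A2, [1.8, 0.71], [2.9, 0.68], horizon=2)
print([[round(v, 2) for v in step] for step in var_forecasts])
```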
Point Forecasts
          GDP    Housing
2012:2    2.36    0.76
2012:3    2.38    0.80
2012:4    2.53    0.84
2013:1    2.58    0.88
2013:2    2.64    0.92
2013:3    2.66    0.95
2013:4    2.69    0.98
2014:1    2.71    1.01
Model Selection
It is typical to select the 1-step model and use this to make all h-step forecasts
However, the theory to support this is incomplete
(It is not obvious that the best 1-step estimate produces the best h-step estimate)
For now, I recommend selecting based on the 1-step estimates
Model Combination
There is no theory about how to apply model combination to h-step iterated forecasts
Can select model weights based on 1-step, and use these for all forecast horizons
Variance, Distribution, Interval Forecast
While point forecasts can be simply iterated, the other features cannot
Multi-step forecast distributions are convolutions of the 1-step forecast distribution.
  - Explicit calculation computationally costly beyond 2 steps
Instead, simple simulation methods work well
The method is to use the estimated conditional distribution to simulate each step, and iterate forward. Then repeat the simulation many times.
Multi-Step Forecast Simulation
Let $\mu(x)$ and $\sigma(x)$ denote the models for the conditional one-step mean and standard deviation as a function of the conditioning variables x
Let $\hat\mu(x)$ and $\hat\sigma(x)$ denote the estimates of these functions, and let $\{\hat\varepsilon_1, \ldots, \hat\varepsilon_n\}$ be the normalized residuals
$x_n = (y_n, y_{n-1}, \ldots, y_{n-p})$ is known. Set $x_n^* = x_n$
To create one h-step realization:
  - Draw $\varepsilon_{n+1}^*$ iid from normalized residuals $\{\hat\varepsilon_1, \ldots, \hat\varepsilon_n\}$
  - Set $y_{n+1}^* = \hat\mu(x_n^*) + \hat\sigma(x_n^*)\, \varepsilon_{n+1}^*$
  - Set $x_{n+1}^* = (y_{n+1}^*, y_n, \ldots, y_{n-p+1})$
  - Draw $\varepsilon_{n+2}^*$ iid from normalized residuals $\{\hat\varepsilon_1, \ldots, \hat\varepsilon_n\}$
  - Set $y_{n+2}^* = \hat\mu(x_{n+1}^*) + \hat\sigma(x_{n+1}^*)\, \varepsilon_{n+2}^*$
  - Set $x_{n+2}^* = (y_{n+2}^*, y_{n+1}^*, \ldots, y_{n-p+2})$
  - Repeat until you obtain $y_{n+h}^*$
  - $y_{n+h}^*$ is a draw from the h step ahead distribution
Repeat this B times, and let $y_{n+h}^*(b)$, b = 1, ..., B denote the B repetitions
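The steps above can be sketched as a residual bootstrap. The mean function below reuses the rounded AR(2) coefficients from the earlier illustration, while the constant standard deviation, the normalized residuals, and the seed are hypothetical; the state holds the two most recent values, newest first. The final lines compute per-horizon quantiles of the kind a fan chart would plot.

```python
import random

def simulate_paths(mu_hat, sigma_hat, x_n, residuals, horizon, n_draws, seed=0):
    """Simulate n_draws paths of length horizon by resampling normalized residuals."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_draws):
        state = tuple(x_n)                       # x*_n = x_n
        path = []
        for _ in range(horizon):
            eps_star = rng.choice(residuals)     # iid draw from the residuals
            y_star = mu_hat(state) + sigma_hat(state) * eps_star
            path.append(y_star)
            state = (y_star,) + state[:-1]       # newest value in, oldest value out
        paths.append(path)
    return paths

def empirical_quantile(values, q):
    """Empirical quantile using linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])

def mu_hat(state):
    return 1.6 + 0.30 * state[0] + 0.16 * state[1]

def sigma_hat(state):
    return 2.0  # hypothetical constant (homoskedastic) standard deviation

residuals = [-1.5, -0.5, 0.0, 0.5, 1.5]  # hypothetical normalized residuals
paths = simulate_paths(mu_hat, sigma_hat, (1.8, 2.9), residuals, horizon=4, n_draws=1000)

# One row of quantiles per horizon.
fan = [[empirical_quantile([p[h] for p in paths], q) for q in (0.10, 0.25, 0.75, 0.90)]
       for h in range(4)]

# With a single zero residual the simulation collapses to the iterated point forecast.
no_noise = simulate_paths(mu_hat, sigma_hat, (1.8, 2.9), [0.0], horizon=2, n_draws=1)
```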
Multi-Step Forecast Simulation
The simulation has produced $y_{n+h}^*(b)$, b = 1, ..., B
For forecast intervals, calculate the empirical quantiles of $y_{n+h}^*(b)$
  - For an 80% interval, calculate the 10% and 90% quantiles
For a fan chart
  - Calculate a set of empirical quantiles (10%, 25%, 75%, 90%)
  - For each horizon h = 1, ..., H
As the calculations are linear they are numerically quick
  - Set B large
  - For a quick application, B = 1000
  - For a paper, B = 10,000 (minimum)
VARs and Variance Simulation
The simulation method requires a method to simulate the conditionalvariances
In a VAR setting, you can:
  - Treat the errors as iid (homoskedastic)
      - Easiest
  - Treat the errors as independent GARCH errors
      - Also easy
  - Treat the errors as multivariate GARCH
      - Allows volatility to transmit across variables
      - Probably not necessary with aggregate data