6. Estimation of Stationary ARMA Processes
Setting:
• Our objective is to fit a stationary ARMA(p, q) process to a set of observations t = 1, . . . , T
• Let x1, . . . , xT denote the observations (trajectory, time series), which are realizations of the process variables X1, . . . , XT (cf. Chapter 4)
182
6.1 Box-Jenkins Methodology
Known result:
• Every stationary data-generating process can be approximated by an ARMA(p, q) process (cf. Chapter 3, Slide 43)
Issues to be tackled:
• Choice of the orders p and q of the process
• Estimation of all process parameters
183
Box-Jenkins Methodology: (cf. Box & Jenkins, 1976)
1. Model identification
2. Parameter estimation
3. Model diagnostics
4. Prediction
184
1. Model identification: (I)
• Checking for process stationarity on the basis of x1, . . . , xT
visual inspection of the data
application of statistical tests for stationarity (cf. Chapter 7)
data transformation to achieve stationarity
· 1st differences:
xt → ∆xt = (1 − L)xt = xt − xt−1
· 1st differences in logs:
xt → ∆ log(xt) = log(xt) − log(xt−1) = log(xt/xt−1)
185
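The two transformations can be sketched in a few lines of Python (the series `x` is a hypothetical toy example, not lecture data):

```python
import numpy as np

# Hypothetical toy series (any positive values work); not taken from the lecture
x = np.array([100.0, 102.0, 101.5, 104.0, 106.0])

# 1st differences: x_t -> (1 - L) x_t = x_t - x_{t-1}
dx = np.diff(x)

# 1st differences in logs: log(x_t) - log(x_{t-1}) = log(x_t / x_{t-1})
dlog_x = np.diff(np.log(x))
```

Both transformed series lose one observation relative to the original.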
1. Model identification: (II)
• Selection of the orders p and q
computation of the estimated ACF and PACF
visual comparison of the estimated ACF/PACF with their theoretical counterparts (see the Table on Slide 176 and the Figures on Slides 178-180)
statistical selection criteria for the orders p and q (cf. Section 6.3)
186
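A minimal sketch of how the estimated ACF and PACF can be computed (own implementation of the standard sample formulas and the Durbin-Levinson recursion; packages such as statsmodels provide equivalent ready-made routines):

```python
import numpy as np

def sample_acf(x, nlags):
    # Sample autocorrelations rho(h) = gamma(h) / gamma(0), where
    # gamma(h) = (1/T) * sum_{t=h+1}^{T} (x_t - xbar)(x_{t-h} - xbar)
    x = np.asarray(x, float)
    T = len(x)
    xc = x - x.mean()
    gamma0 = xc @ xc / T
    return np.array([xc[h:] @ xc[:T - h] / T / gamma0 for h in range(nlags + 1)])

def sample_pacf(x, nlags):
    # PACF via the Durbin-Levinson recursion applied to the sample ACF;
    # the value at lag 0 is set to 1 by convention (requires nlags >= 1).
    rho = sample_acf(x, nlags)
    pacf = [1.0, rho[1]]
    phi = np.array([rho[1]])                 # phi_{1,1}
    for k in range(2, nlags + 1):
        phi_kk = (rho[k] - phi @ rho[k - 1:0:-1]) / (1.0 - phi @ rho[1:k])
        phi = np.concatenate([phi - phi_kk * phi[::-1], [phi_kk]])
        pacf.append(phi_kk)
    return np.array(pacf)
```

For an AR(p) process the sample PACF should be close to zero beyond lag p, which is exactly the pattern used for order selection below.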
2. Parameter estimation:
• OLS estimation, maximum likelihood estimation (cf. Section 6.2)
3. Model diagnostics:
• Checking for autocorrelation in the residuals (Ljung-Box tests, cf. Slides 165-166)
autocorrelation-free residuals −→ well-specified model (analysis of parameter significance)
autocorrelated residuals −→ respecification of the model (iterative procedure)
187
4. Prediction:
• We may use the parameter estimates of a well-specified model for forecasting future values of the process (not discussed in this lecture)
188
6.2 Estimation of ARMA(p, q) Processes
Now:
• Estimation of the parameters c, φ1, . . . , φp, θ1, . . . , θq and σ2 of a stationary ARMA(p, q) process
Xt = c + φ1Xt−1 + . . . + φpXt−p + εt + θ1εt−1 + . . . + θqεt−q
Remark:
• Different estimation procedures are available (OLS, ML estimation)
• For an AR(p) process we may use the Yule-Walker estimator (cf. Neusser, 2006)
189
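A sketch of the Yule-Walker estimator for an AR(p), assuming the standard moment equations R φ = r built from the sample autocorrelations (own implementation; statsmodels ships a comparable `yule_walker` routine):

```python
import numpy as np

def yule_walker_ar(x, p):
    # Yule-Walker estimator for an AR(p): solve R phi = r, where
    # R[i, j] = rho(|i - j|) and r = (rho(1), ..., rho(p))'.
    x = np.asarray(x, float)
    T = len(x)
    xc = x - x.mean()
    gamma = np.array([xc[h:] @ xc[:T - h] / T for h in range(p + 1)])
    rho = gamma / gamma[0]
    R = np.array([[rho[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(R, rho[1:p + 1])
    sigma2 = gamma[0] * (1.0 - phi @ rho[1:p + 1])   # innovation variance
    return phi, sigma2
```

Like the OLS estimator discussed next, the Yule-Walker estimator is consistent for stationary AR(p) processes.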
Preliminary remarks: (I)
• Example of the OLS estimation of an AR(p) model
Xt = c + φ1Xt−1 + . . . + φpXt−p + ut
• We may consider the process as a regression model, where
Xt is the endogenous variable
Xt−1, . . . , Xt−p are the regressors
ut is the error term
190
Preliminary remarks: (II)
• Model in matrix representation:

(Xp+1, Xp+2, . . . , XT)′ = X (c, φ1, . . . , φp)′ + (up+1, up+2, . . . , uT)′,

where the t-th row of the ([T − p] × [p + 1]) regressor matrix X is (1, Xt−1, Xt−2, . . . , Xt−p), t = p + 1, . . . , T; compactly,

y = Xβ + u

• OLS estimator of β = [c φ1 φ2 · · · φp]′ is given by

βOLS = (X′X)−1X′y
191
Preliminary remarks: (III)
• We may estimate σ2 via the OLS residuals u = y − XβOLS by

σ2 = u′u / (T − p)
(cf. lecture Econometrics I)
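The OLS steps above can be sketched as follows (simulated AR(2) data with illustrative parameters, not from the lecture):

```python
import numpy as np

# Simulated AR(2) trajectory with illustrative parameters (not lecture data)
rng = np.random.default_rng(0)
T, c, phi1, phi2 = 500, 0.5, 0.6, 0.2
x = np.zeros(T)
for t in range(2, T):
    x[t] = c + phi1 * x[t - 1] + phi2 * x[t - 2] + rng.standard_normal()

p = 2
y = x[p:]                                        # regressand (x_{p+1}, ..., x_T)'
# t-th row of the regressor matrix: (1, x_{t-1}, ..., x_{t-p})
X = np.column_stack([np.ones(T - p)] + [x[p - j:T - j] for j in range(1, p + 1)])

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)     # (X'X)^{-1} X'y
u = y - X @ beta_ols                             # OLS residuals
sigma2 = u @ u / (T - p)                         # residual variance estimate
```

With simulated data the estimates can be checked against the true parameters; for real data only the asymptotic results discussed in the following slides justify the procedure.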
Problems: (I)
• The optimality properties of the OLS estimator depend on the well-known assumptions of the classical linear regression model (cf. lectures Econometrics I + II)
192
Problems: (II)
• Some of these classical assumptions are violated here:
the regressors are correlated with the error term
the OLS estimates crucially hinge on the chosen starting values X1, . . . , Xp
However:
• For AR(p) models the OLS estimators are consistent and asymptotically efficient (cf. Neusser, 2006, pp. 81-84)
193
Now:
• Estimation of the general ARMA(p, q) model
Xt = c + φ1Xt−1 + . . . + φpXt−p + εt + θ1εt−1 + . . . + θqεt−q
• We collect all model parameters in the ([p + q + 2] × 1) vector
β = [c φ1 · · · φp θ1 · · · θq σ2]′
Note:
• The OLS method is not applicable here, since the "regressors" εt, εt−1, . . . , εt−q of the MA(q) part are not directly observable
194
Resort:
• We estimate all model parameters by the (conditional) maximum likelihood technique (cf. lecture "Advanced Statistics")
ML technique: (I)
• We need distributional assumptions on the process variables X1, . . . , XT
• Computation of the joint probability density function
fX1,...,XT (x1, . . . , xT )
195
ML technique: (II)
• We consider the joint probability density function as a function of the unknown parameter vector β
L(β) = fX1,...,XT (x1, . . . , xT ),
or alternatively,
L∗(β) = log[fX1,...,XT (x1, . . . , xT )]
(likelihood function, log-likelihood function)
• We maximize L∗(β) with respect to β
−→ maximum likelihood estimator
196
ML technique: (III)
• ML estimators have nice statistical properties:
consistency
asymptotic normality
asymptotic efficiency
robustness against deviations from the normality assumption (quasi-ML estimation)
197
Distributional assumption:
• We consider a Gaussian ARMA(p, q) process
Xt = c + φ1Xt−1 + . . . + φpXt−p + εt + θ1εt−1 + . . . + θqεt−q,
where εt ∼ GWN(0, σ2)
Log-likelihood function: (I)
• Computation of the exact log-likelihood function is impossible
• Instead, we compute the log-likelihood function by taking into consideration some given starting values
x0 ≡ [x0 x−1 · · · x−p+1]′,
ε0 ≡ [ε0 ε−1 · · · ε−q+1]′
−→ conditional log-likelihood function
198
Log-likelihood function: (II)
• The conditional log-likelihood function is given by

L∗(β|x0, ε0) = −(T/2) log(2π) − (T/2) log(σ2) − ∑t=1,...,T εt² / (2σ2)
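Under zero starting values x0 = 0 and ε0 = 0, this function can be evaluated by recursively backing out the residuals; a minimal sketch (own code; the parameter ordering in `params` is an assumption of this example):

```python
import numpy as np

def cond_loglik(params, x, p, q):
    # Conditional Gaussian log-likelihood of an ARMA(p, q) model with
    # params = (c, phi_1..phi_p, theta_1..theta_q, sigma2); the starting
    # values x_0 and eps_0 are fixed at zero (the conditioning step).
    c = params[0]
    phi = params[1:1 + p]
    theta = params[1 + p:1 + p + q]
    sigma2 = params[-1]
    T = len(x)
    xx = np.concatenate([np.zeros(p), np.asarray(x, float)])  # presample x = 0
    eps = np.zeros(q + T)                                     # presample eps = 0
    for t in range(T):
        ar = sum(phi[i] * xx[p + t - 1 - i] for i in range(p))
        ma = sum(theta[j] * eps[q + t - 1 - j] for j in range(q))
        eps[q + t] = x[t] - c - ar - ma       # back out the residual
    e = eps[q:]
    return -T / 2 * np.log(2 * np.pi) - T / 2 * np.log(sigma2) \
           - np.sum(e ** 2) / (2 * sigma2)
```

Maximizing this function numerically over β, e.g. with a general-purpose optimizer, yields the conditional ML estimates.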
Remarks: (I)
• The conditional log-likelihood function L∗(β|x0, ε0) is an intricate nonlinear function of the parameter vector β
• There are no analytical closed-form formulae for the ML estimators available
−→ numerical optimization of L∗(β|x0, ε0)
199
Remarks: (II)
• Exact and conditional ML estimators have qualitatively similar properties
• EViews and other econometric software packages possess numerical optimization tools
200
6.3 Estimation of the Lag Orders p and q
Question:
• How can we choose the orders p and q of the ARMA model to be fitted to the data?
Two potential specification errors:
• p and q are chosen too large (overfitting)
• p and/or q are chosen too small (underfitting)
201
Consequences:
• In each case, overfitting or underfitting, the ML estimators are no longer consistent
−→ accurate selection of the orders p and q is of major concern
Conceivable methods:
• Visual inspection of the estimated ACFs and PACFs (Box-Jenkins methodology, often difficult in practice)
• Statistical selection criteria
202
Idea behind the selection criteria: (I)
• Minimization of an information criterion
• General construction:
with increasing orders p and q the fit of the ARMA model necessarily improves
we measure the fit of the model by the estimated variance of the residuals σ2p,q
to control for the tendency of overfitting the model, we add a correction term to the measure σ2p,q that punishes too high choices of p and q
203
Idea behind the selection criteria: (II)
• The best-known information criteria are:

AIC(p, q) = log(σ2p,q) + (p + q) · 2/T
(Akaike information criterion)

SIC(p, q) = log(σ2p,q) + (p + q) · log(T)/T
(Schwarz information criterion)

HQIC(p, q) = log(σ2p,q) + (p + q) · 2 log[log(T)]/T
(Hannan-Quinn information criterion)
• In practice we choose the orders p and q such that one of the information criteria is minimized
204
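The three criteria translate directly into code (own helper functions mirroring the formulas above; `sigma2_pq` denotes the estimated residual variance σ2p,q):

```python
import numpy as np

def aic(sigma2_pq, p, q, T):
    # Akaike: log(sigma2) + (p + q) * 2 / T
    return np.log(sigma2_pq) + (p + q) * 2 / T

def sic(sigma2_pq, p, q, T):
    # Schwarz: log(sigma2) + (p + q) * log(T) / T
    return np.log(sigma2_pq) + (p + q) * np.log(T) / T

def hqic(sigma2_pq, p, q, T):
    # Hannan-Quinn: log(sigma2) + (p + q) * 2 * log(log(T)) / T
    return np.log(sigma2_pq) + (p + q) * 2 * np.log(np.log(T)) / T
```

For moderate to large T the per-parameter penalties order as AIC < HQIC < SIC, which is one way to see why AIC tends to pick the largest models.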
Remarks:
• Many empirical applications use the AIC criterion although this criterion has the inclination to yield overfitted models
• The SIC and HQIC criteria yield consistent estimators of the orders p and q
205
6.4 Modeling of a Stochastic Process
Now:
• Fitting an ARMA(p, q) process to an observed real-world time series following a four-step procedure
Step #1: Data transformation to achieve stationarity: (I)
• Economic time series are often non-stationary (cf. Chapter 7)
−→ data need to be transformed to become stationary
206
Step #1: Data transformation to achieve stationarity: (II)
• Conceivable transformations:
Differences:
Yt = (1 − L)^d Xt for d = 1, 2, . . .
(difference filter of order d)
freeing {Xt} from a deterministic trend (cf. Chapter 7)
using logarithmic values or differences in logs:
Yt = (1 − L) log(Xt) = log(Xt) − log(Xt−1)
(growth rates)
207
Step #2: Choice of the orders p and q:
• Inspection of ACF and PACF
• Application of selection criteria(cf. Section 6.3)
Step #3: Estimation of the model:
• ML estimation of the specified ARMA(p, q) model
208
Step #4: Plausibility checking:
• Are the parameter estimates plausible?
• Do the residuals form a white-noise process?
• Are there structural breaks?
• If necessary, re-specification of the model and new fitting
Example:
• German GDP between 1970:Q1 and 2007:Q4
209
[Figure: GDP (corrected for prices, quarterly data), 1970.1-2000.1]

[Figure: GDP growth rate, 1970.1-2000.1]
Step #1:
• Data appear to be subject to
an increasing trend
a seasonal pattern
−→ taking seasonal differences in logs
Xt = (1 − L4) log(GDPt) = log(GDPt) − log(GDPt−4)
(growth rate with respect to previous-year quarter)
211
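The seasonal log-difference transformation in a few lines (hypothetical GDP values for illustration only; the real series is not reproduced here):

```python
import numpy as np

# Hypothetical quarterly GDP values, for illustration only
gdp = np.array([50.0, 48.0, 52.0, 55.0, 53.0, 51.0, 55.5, 58.0])

s = 4  # lag to the previous-year quarter
# X_t = (1 - L^4) log(GDP_t) = log(GDP_t) - log(GDP_{t-4})
x_sdiff = np.log(gdp[s:]) - np.log(gdp[:-s])
```

The transformed series loses the first four observations, which is why the estimation samples below start after 1970.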
Step #2: (I)
• Visual inspection of ACF and PACF (cf. graphs on Slide 213)
ACF slowly (monotonically) decreasing
−→ AR specification
PACF has significant values up to lag h = 4
−→ AR(4) specification
212
[Figure: Estimated ACF and estimated PACF of Xt, lags h = 0, . . . , 30]
Step #2: (II)
• Selection criteria:
AIC values for alternative ARMA(p, q) specifications

p \ q     q = 0     q = 1     q = 2     q = 3     q = 4     q = 5
p = 0        –    −5.6068   −5.6530   −5.9066   −5.9046   −5.9043
p = 1   −5.8274   −5.8157   −5.8125   −5.9476   −5.9378   −5.9412
p = 2   −5.8322   −5.8813   −5.8806   −5.9394   −5.9290   −5.9504
p = 3   −5.8117   −5.8900   −5.8825   −5.9107   −5.9157   −5.9702
p = 4   −5.8518   −5.8922   −5.8960   −5.9988   −5.9520   −5.9869
p = 5   −5.8988   −5.8861   −5.9203   −5.9775   −5.9922   −5.9810

(minimum −5.9988 at p = 4, q = 3)
−→ ARMA(4,3) specification
214
SIC values for alternative ARMA(p, q) specifications

p \ q     q = 0     q = 1     q = 2     q = 3     q = 4     q = 5
p = 0        –    −5.5663   −5.5923   −5.8256   −5.8034   −5.7828
p = 1   −5.7867   −5.7546   −5.7311   −5.8458   −5.8157   −5.7988
p = 2   −5.7709   −5.7996   −5.7785   −5.8168   −5.7860   −5.7870
p = 3   −5.7296   −5.7874   −5.7593   −5.7670   −5.7514   −5.7855
p = 4   −5.7487   −5.7684   −5.7516   −5.8338   −5.7664   −5.7806
p = 5   −5.7745   −5.7411   −5.7546   −5.7910   −5.7850   −5.7531

(minimum −5.8458 at p = 1, q = 3)
−→ ARMA(1,3) specification
215
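The criterion-minimizing search can be sketched for the pure AR case, where each candidate model is fitted by OLS (a self-contained toy with simulated data; a full ARMA(p, q) grid as in the tables above would require ML fits for the MA part):

```python
import numpy as np

def fit_ar_ols(x, p):
    # OLS fit of an AR(p) with intercept; returns the residual variance.
    T = len(x)
    y = x[p:]
    X = np.column_stack([np.ones(T - p)] + [x[p - j:T - j] for j in range(1, p + 1)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ beta
    return u @ u / len(y)

def select_order(x, pmax):
    # Choose p by minimizing AIC(p) = log(sigma2_p) + p * 2 / T_eff,
    # with T_eff the usable observations (one common convention among several).
    best_p, best_crit = 0, np.inf
    for p in range(pmax + 1):
        crit = np.log(fit_ar_ols(x, p)) + p * 2 / (len(x) - p)
        if crit < best_crit:
            best_p, best_crit = p, crit
    return best_p

# Simulated AR(2) data to exercise the selection
rng = np.random.default_rng(1)
x_sim = np.zeros(1000)
for t in range(2, 1000):
    x_sim[t] = 0.5 * x_sim[t - 1] + 0.3 * x_sim[t - 2] + rng.standard_normal()
```

With the SIC or HQIC penalty substituted into `select_order`, the chosen order is consistent, as noted in Section 6.3.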
Step #3: (I)
• Estimation results for the AR(4) model
216
Dependent Variable: GDP_GROWTHRATE
Method: Least Squares
Date: 22/05/08  Time: 16:10
Sample (adjusted): 1972Q1 2007Q4
Included observations: 144 after adjustments
Convergence achieved after 3 iterations

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             0.021617     0.003267      6.616613     0.0000
AR(1)         0.671162     0.082343      8.150788     0.0000
AR(2)         0.091627     0.099346      0.922303     0.3580
AR(3)         0.146818     0.098621      1.488713     0.1388
AR(4)        -0.234958     0.080608     -2.914812     0.0041

R-squared            0.549943    Mean dependent var       0.021520
Adjusted R-squared   0.536992    S.D. dependent var       0.018744
S.E. of regression   0.012754    Akaike info criterion   -5.851775
Sum squared resid    0.022612    Schwarz criterion       -5.748656
Log likelihood       426.3278    F-statistic              42.46246
Durbin-Watson stat   1.877438    Prob(F-statistic)        0.000000

Inverted AR Roots:  .71-.26i   .71+.26i   -.38+.52i   -.38-.52i
Step #3: (II)
• Main results:
parameters φ2 and φ3 are insignificant
variance of the residuals:
σ2 = (0.012754)² ≈ 0.000163
217
Step #3: (III)
• Estimation results for the ARMA(1,3) model
218
Dependent Variable: GDP_GROWTHRATE
Method: Least Squares
Date: 21/05/08  Time: 23:14
Sample (adjusted): 1971Q2 2007Q4
Included observations: 147 after adjustments
Convergence achieved after 10 iterations
Backcast: 1970Q3 1971Q1

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             0.021610     0.002912      7.421498     0.0000
AR(1)        -0.101970     0.113792     -0.896108     0.3717
MA(1)         0.887698     0.084314     10.52848      0.0000
MA(2)         0.654474     0.084773      7.720300     0.0000
MA(3)         0.656299     0.062123     10.56447      0.0000

R-squared            0.582336    Mean dependent var       0.021547
Adjusted R-squared   0.570571    S.D. dependent var       0.018559
S.E. of regression   0.012162    Akaike info criterion   -5.947550
Sum squared resid    0.021004    Schwarz criterion       -5.845835
Log likelihood       442.1449    F-statistic              49.49651
Durbin-Watson stat   2.010427    Prob(F-statistic)        0.000000

Inverted AR Roots:  -.10
Inverted MA Roots:  .02+.84i   .02-.84i   -.94
Step #3: (IV)
• Main results:
parameter φ1 insignificant
variance of the residuals:
σ2 = (0.012162)² ≈ 0.000148
−→ better fit than the AR(4) specification
219
Step #4: (ARMA(1,3) model) (I)
• Plausible parameter estimates
• Features of the estimated ARMA(1,3) model: (I) (cf. figure on Slide 222)
inverse root of the AR polynomial inside the unit circle
−→ AR root is outside the unit circle
−→ estimated ARMA(1,3) model is stationary
220
Step #4: (II)
• Features of the estimated ARMA(1,3) model: (II) (cf. figure on Slide 222)
inverse root of the MA polynomial inside the unit circle
−→ MA root is outside the unit circle
−→ estimated ARMA(1,3) model is invertible
221
[Figure: Inverse roots of the AR/MA polynomials in the complex unit circle (AR roots, MA roots)]

Inverse Roots of AR/MA Polynomial(s)
Specification: GDP_GROWTHRATE C AR(1) MA(1) MA(2) MA(3)
Date: 21/05/08  Time: 23:59
Sample: 1970Q1 2007Q4
Included observations: 147

AR Root(s)                Modulus     Cycle
-0.101970                 0.101970

No root lies outside the unit circle. ARMA model is stationary.

MA Root(s)                Modulus     Cycle
-0.936859                 0.936859
 0.024581 ± 0.836616i     0.836977    4.076222

No root lies outside the unit circle. ARMA model is invertible.
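The stationarity/invertibility check can be reproduced from the quoted coefficient estimates by computing the polynomial roots directly (the coefficients below are the EViews estimates from the ARMA(1,3) output above):

```python
import numpy as np

# EViews estimates from the ARMA(1,3) output above
phi1 = -0.101970
theta = [0.887698, 0.654474, 0.656299]   # theta_1, theta_2, theta_3

# AR polynomial phi(z) = 1 - phi1*z;
# MA polynomial theta(z) = 1 + th1*z + th2*z^2 + th3*z^3.
# np.roots expects coefficients ordered from the highest power downwards.
ar_roots = np.roots([-phi1, 1.0])
ma_roots = np.roots(list(reversed(theta)) + [1.0])

stationary = bool(np.all(np.abs(ar_roots) > 1))   # all AR roots outside unit circle
invertible = bool(np.all(np.abs(ma_roots) > 1))   # all MA roots outside unit circle
```

The root moduli are the reciprocals of the inverse-root moduli reported by EViews, so values above 1 here correspond to inverse roots inside the unit circle.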
Step #4: (III)
• Residual analysis (I)
223
[Figure: Estimated ACF of the residuals, lags h = 0, . . . , 30]
Step #4: (IV)
• Residual analysis (II)
No significant autocorrelation up to lag h = 30
Ljung-Box test for autocorrelation in the residuals: (cf. Slides 165-166)

Lag   Q-statistic   p-value
10      7.6054       0.268
20     15.348        0.499
30     21.145        0.734
224
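The Ljung-Box Q-statistic used here can be computed in a few lines (own implementation of the standard formula Q = T(T + 2) ∑h ρ̂(h)²/(T − h); the p-value is then obtained from a χ² distribution with m − p − q degrees of freedom, which would require a χ² CDF, e.g. from scipy):

```python
import numpy as np

def ljung_box_q(resid, m):
    # Ljung-Box statistic Q = T(T + 2) * sum_{h=1}^{m} rho(h)^2 / (T - h)
    e = np.asarray(resid, float)
    T = len(e)
    ec = e - e.mean()
    gamma0 = ec @ ec / T
    rho = np.array([ec[h:] @ ec[:T - h] / T / gamma0 for h in range(1, m + 1)])
    return T * (T + 2) * np.sum(rho ** 2 / (T - np.arange(1, m + 1)))
```

Under the null of white-noise residuals, Q is approximately χ²-distributed, so small Q values (large p-values), as in the table above, indicate no remaining autocorrelation.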
Step #4: (V)
• Conclusion:
Residuals (approximately) follow a white-noise process
ARMA(1,3) specification reflects the correlation structure of the data well
225