Lesson 13: Box-Jenkins Modeling Strategy for building ARMA models

Umberto Triacca

Facoltà di Economia

Università dell’Aquila

[email protected]

Umberto Triacca Lesson 13: Box-Jenkins Modeling Strategy for building ARMA

Introduction

In this lesson we present a method for constructing an ARMA(p, q) model: the so-called Box-Jenkins modeling strategy.

The Box-Jenkins approach to building ARMA(p, q) models was described in a highly influential book published in 1970 by the statisticians George Box and Gwilym Jenkins.

Box, G.E.P. and G.M. Jenkins (1970), Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day.

The Box-Jenkins modeling procedure involves a preliminary analysis (data transformation) and an iterative three-stage process:

1. Model identification

2. Model estimation

3. Model checking

Each stage concerns a question.

Preliminary analysis: Is the time series stationary?

1. Model identification: What class of models probably produced the (transformed) series?

2. Model estimation: What are the model parameters?

3. Model checking: Are the residuals from the estimated model white noise?

The assumption of stationarity

The assumption that our time series is a realization of a stationary process is clearly fundamental in time series analysis.

The Box-Jenkins methodology requires that the ARMA(p, q) process used to describe the data generating process (DGP) be both stationary and invertible.

Thus, in order to construct an ARMA model, we must first determine whether our time series can be considered a realization of a stationary process.

If it cannot, we must transform the time series in order to achieve stationarity.

A time series can be considered a realization of a stationary stochastic process if:

1. there is no systematic change in mean (no trend),

2. there is no systematic change in variance,

3. there is no periodic variation.

Data Transformation

In this stage a very useful tool is the graph of the series.

From the plot of the time series values we can obtain useful indications concerning the stationarity of the process.

If the observed values of the time series seem to fluctuate with constant variation around a constant mean, then it is reasonable to suppose that the process is stationary; otherwise, it is nonstationary.
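As a rough numerical complement to the time plot, one can compare the mean and variance over consecutive blocks of the series. The following is only a sketch: the helper name `block_stats`, the number of blocks, and the two toy series are illustrative assumptions, and the comparison is a heuristic, not a formal test of stationarity.

```python
import statistics

def block_stats(series, n_blocks=4):
    """Split the series into consecutive blocks; return (mean, variance) per block."""
    size = len(series) // n_blocks
    stats = []
    for i in range(n_blocks):
        block = series[i * size:(i + 1) * size]
        stats.append((statistics.fmean(block), statistics.pvariance(block)))
    return stats

# Toy series (assumed for illustration): a linear trend plus an alternating term.
trending = [0.5 * t + (-1) ** t for t in range(120)]
# Toy series fluctuating around a constant mean.
stable = [float((-1) ** t) for t in range(120)]

for name, s in (("trending", trending), ("stable", stable)):
    means = [round(m, 2) for m, _ in block_stats(s)]
    print(name, "block means:", means)
```

Block means that drift steadily, or block variances that grow, point to the kind of nonstationarity that a plot would reveal.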

Time series

Figure: Time plot of a series generated by a stationary ARMA process.

In practice, many time series cannot be regarded as realizations of stationary processes.

Consider, as an example, the Airline series.

Figure: Monthly totals in thousands of international airline passengers from January 1949 to December 1960.

The plot shows that:

1. The number of passengers tends to increase over time (positive trend).

2. The spread or variance in the counts of passengers tends to increase over time.

3. The number of passengers tends to peak in certain months in each year.

Figure: Monthly totals in thousands of international airline passengers from January 1949 to December 1960.

Figure: Time plot of a series generated by a stationary ARMA process.

Figure: Monthly totals in thousands of international airline passengers from January 1949 to December 1960.

Conclusion: this time series cannot be regarded as a realization of a stationary process.

Making a time series stationary

Goal: make the airline data set stationary.

Variance stabilizing techniques

First, we want to stabilize the increasing variability of the series.

To stabilize the variance, we can use the Box-Cox transformation.

The Box-Cox Transformation

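The Box-Cox transformation is defined as

$$
y_t =
\begin{cases}
\dfrac{x_t^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[1ex]
\log(x_t) & \text{if } \lambda = 0
\end{cases}
$$

where the parameter $\lambda$ is chosen by the analyst. Different values of $\lambda$ yield different transformations. Popular choices of the parameter $\lambda$ are 0 and 1/2.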
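The definition above translates directly into a few lines of Python. This is a minimal sketch: the function name `box_cox` and the sample values are assumptions made for illustration, and the input is required to be strictly positive so that the logarithm and fractional powers are defined.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value x for a given lambda."""
    if x <= 0:
        raise ValueError("the Box-Cox transformation requires x > 0")
    if lam == 0:
        return math.log(x)            # limiting case lambda = 0
    return (x ** lam - 1.0) / lam     # general case lambda != 0

series = [100.0, 150.0, 225.0]        # illustrative positive values
log_series = [box_cox(x, 0) for x in series]      # lambda = 0
sqrt_series = [box_cox(x, 0.5) for x in series]   # lambda = 1/2
print(log_series)
print(sqrt_series)
```

For small nonzero values of `lam` the general branch approaches the logarithm, which is why the case λ = 0 is defined as log(x).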

Mathematical Foundation of the Box-Cox Transformation with λ equal to 0 or 1/2

Why is it often the case that either λ = 0 or λ = 1/2 is adequate?

Consider a time series $x_t$ such that

$$x_t = \mu_t + v_t$$

where $\mu_t$ is a nonstochastic mean level. Suppose that the variance of the time series $x_t$ has the form

$$\operatorname{var}(x_t) = \operatorname{var}(v_t) = \mu_t^2 \sigma^2$$

The variance of the series thus varies with the mean level.

We want to find a transformation $g$ of $x_t$ such that the variance of $g(x_t)$ is constant.

By a first-order Taylor approximation around $\mu_t$ we have

$$g(x_t) \cong g(\mu_t) + g'(\mu_t)(x_t - \mu_t)$$

Thus

$$\operatorname{var}(g(x_t)) \cong [g'(\mu_t)]^2 \operatorname{var}(x_t) = [g'(\mu_t)]^2 \mu_t^2 \sigma^2$$

We require that

$$\operatorname{var}(g(x_t)) = \text{constant}$$

Therefore $g$ is chosen such that

$$g'(\mu_t) = \frac{1}{\mu_t}$$

This implies that

$$g(\mu_t) = \log(\mu_t)$$

resulting in the usual logarithmic transformation.

If

$$\operatorname{var}(x_t) = \mu_t \sigma^2$$

then

$$g'(\mu_t) = \mu_t^{-1/2}$$

which implies that

$$g(\mu_t) = 2\mu_t^{1/2}$$

resulting in the square-root transformation.

If

$$x_t = \mu_t + v_t, \qquad \operatorname{var}(x_t) = \mu_t^2 \sigma^2$$

the appropriate transformation is the log transformation.

If

$$x_t = \mu_t + v_t, \qquad \operatorname{var}(x_t) = \mu_t \sigma^2$$

the appropriate transformation is the square-root transformation.

If the variance of the series appears to increase quadratically with the mean, the logarithmic transformation (λ = 0) is appropriate;

If the variance increases linearly with the mean, we should use λ = 0.5, that is, the square-root transformation.
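The rule for the quadratic case can be checked with a small simulation. The sketch below assumes Gaussian noise, a proportionality constant `SIGMA = 0.05`, and two arbitrary mean levels; none of these come from the lesson. It draws values with var(x) = μ²σ² and compares the variance before and after taking logs.

```python
import math
import random
import statistics

random.seed(0)
SIGMA = 0.05  # assumed proportionality constant

def simulate(mu, n=5000):
    """Draw x = mu + v with sd(v) = SIGMA * mu, so that var(x) = mu^2 * sigma^2."""
    return [mu + random.gauss(0.0, SIGMA * mu) for _ in range(n)]

low, high = simulate(10.0), simulate(1000.0)

raw_ratio = statistics.pvariance(high) / statistics.pvariance(low)
log_ratio = (statistics.pvariance([math.log(x) for x in high])
             / statistics.pvariance([math.log(x) for x in low]))

print("variance ratio, raw series:", raw_ratio)   # on the order of (1000/10)^2
print("variance ratio, log series:", log_ratio)   # close to 1
```

The raw variances differ by roughly the squared ratio of the means, while the variances of the logged values are nearly equal, in line with the Taylor argument.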

Time series

Figure: Monthly totals in thousands of international airline passengers from January 1949 to December 1960.

Consider the log transformation

$$y_t = \log(x_t), \qquad t = 1, 2, \ldots, T$$

Figure: Log of monthly totals in thousands of international airline passengers from January 1949 to December 1960.

The log transformation has removed the increasing variability.

In order to remove the trend and the seasonal component, we use differencing. By applying the filter

$$\Delta_{12} = 1 - L^{12}$$

we remove the seasonal component.

Figure: $(1 - L^{12})$ log of monthly totals in thousands of international airline passengers.

Finally, we use the filter

$$\Delta = 1 - L$$

in order to remove the non-stationarity in mean.

The transformed series is given by

$$z_t = \Delta \Delta_{12} \log(x_t), \qquad t = 14, \ldots, T$$

(the first 13 observations are lost to differencing).

We see that differencing has effectively removed the trend and the seasonal component.

Figure: $(1 - L)(1 - L^{12})$ log of monthly totals in thousands of international airline passengers.
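The sequence log, seasonal difference, first difference can be sketched as follows. The helper `difference` and the toy monthly series (an exponential trend times a fixed period-12 seasonal factor) are assumptions for illustration; this is not the airline data.

```python
import math

def difference(series, lag=1):
    """Apply the filter (1 - L^lag): series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Toy monthly series (assumed): exponential trend times a period-12 seasonal factor.
seasonal = [1.0, 0.9, 1.1, 1.0, 1.2, 1.4, 1.6, 1.6, 1.3, 1.1, 0.9, 1.0]
x = [100.0 * math.exp(0.01 * t) * seasonal[t % 12] for t in range(144)]

y = [math.log(v) for v in x]     # variance-stabilizing log transform
d12 = difference(y, lag=12)      # (1 - L^12) log x_t
z = difference(d12, lag=1)       # (1 - L)(1 - L^12) log x_t

print(len(x), len(d12), len(z))  # 144 132 131
```

Each application of a lag-k difference shortens the series by k observations, so 13 observations are lost in total. For this deterministic toy series the seasonal difference is constant and the final series is numerically zero; with real data a stationary-looking remainder is expected instead.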

Figure: $(1 - L)(1 - L^{12})$ log of monthly totals in thousands of international airline passengers.

Figure: Time plot of a series generated by a stationary ARMA process.

The DGP’s model

Figure: Diagram relating the DGP, the observed series $x_1, \ldots, x_T$, the transformed series $z_k, \ldots, z_T$, and the ARMA model.

Conclusion

After the data have been rendered stationary, we are ready to fit an appropriate model to the data. This is the subject of the next lessons.

Lesson 13 BIS: The Identification of ARMA Models

Umberto Triacca

Dipartimento di Ingegneria e Scienze dell’Informazione e Matematica

Università dell’Aquila,

[email protected]

Umberto Triacca Lesson 13 BIS: The Identification of ARMA Models

Identification

Consider an ARMA process

$$x_t \sim ARMA(p, q)$$

Before an ARMA(p, q) model can be estimated, we need to select the orders p and q of the AR and MA polynomials.

Following Box and Jenkins’s terminology, we refer to this step as the identification of the appropriate ARMA model.

The guidelines for the choice of p and q come from the shape of two sample functions:

1. The Sample AutoCorrelation Function (SACF)

2. The Sample Partial AutoCorrelation Function (SPACF)

The sample autocorrelation and partial autocorrelation functions should reflect (up to sampling variation) the properties of the theoretical autocorrelation and partial autocorrelation functions of the process.

In order to identify the order of the model, the SACF and SPACF are compared with the theoretical ACF and PACF, respectively.

In practice, this means comparing the sample autocorrelation plot and the sample partial autocorrelation plot with the theoretical behavior of these plots.
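The two sample functions can be computed with the standard library alone. The sketch below is one common way to do it, not necessarily the one used for the figures in this lesson: the SACF uses the usual estimator with divisor T, and the SPACF is obtained from the SACF by the Durbin-Levinson recursion. The function names and the simulated AR(1) series with coefficient 0.7 are assumptions for illustration.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (r_0 = 1)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n / c0
            for k in range(max_lag + 1)]

def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf

# Simulated AR(1) series with coefficient 0.7 (assumed example).
random.seed(1)
x = [0.0]
for _ in range(3500):
    x.append(0.7 * x[-1] + random.gauss(0.0, 1.0))
x = x[500:]  # discard the start-up values

acf = sample_acf(x, 5)
pacf = sample_pacf(x, 5)
print([round(v, 2) for v in acf])
print([round(v, 2) for v in pacf])
```

For such a series the SACF should decay gradually while the SPACF should be close to zero after lag 1, which is the AR pattern used in identification.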

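The theoretical behavior of the ACF ($\rho_k$) and the PACF ($\pi_k$), for $k \geq 1$:

If $x_t \sim WN(0, \sigma^2)$, then $\rho_k = 0$ and $\pi_k = 0$ for all $k$;

If $x_t \sim AR(p)$, then $\rho_k \neq 0$ for all $k$, $\rho_k \to 0$ as $k \to \infty$, and $\pi_k \neq 0$ for $k \leq p$, $\pi_k = 0$ for $k > p$;

If $x_t \sim MA(q)$, then $\rho_k \neq 0$ for $k \leq q$, $\rho_k = 0$ for $k > q$, and $\pi_k \neq 0$ for all $k$, $\pi_k \to 0$ as $k \to \infty$;

If $x_t \sim ARMA(p, q)$, then $\rho_k \neq 0$ for all $k$, $\rho_k \to 0$ as $k \to \infty$, and $\pi_k \neq 0$ for all $k$, $\pi_k \to 0$ as $k \to \infty$.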

If $x_t \sim AR(p)$, then $\rho_k$ decays exponentially (either directly or in an oscillatory fashion) and $\pi_k$ cuts off after lag p.

If $x_t \sim MA(q)$, then $\rho_k$ cuts off after lag q and $\pi_k$ decays exponentially (either directly or in an oscillatory fashion).

If $x_t \sim ARMA(p, q)$, then both $\rho_k$ and $\pi_k$ decay exponentially (either directly or in an oscillatory fashion).

The identification of a pure autoregressive or moving average process is reasonably straightforward using the sample autocorrelation and partial autocorrelation functions.

On the other hand, as we will see, for ARMA(p, q) processes with p and q both non-zero, the SACF and SPACF are much more difficult to interpret.

Identifying the orders p and q by using Information Criteria

Mixed models can be particularly difficult to identify using the correlogram and the partial correlogram.

For this reason, information-based criteria such as the AIC (Akaike Information Criterion), the BIC (Bayesian Information Criterion), and others have come to be preferred.

Model Identification

The AIC statistic is defined as

$$AIC(p, q) = \ln(\hat{\sigma}^2) + \frac{2(p + q)}{T}$$

where $\hat{\sigma}^2$ is the maximum likelihood estimate of the white noise variance.

Among a set of candidate models, we select for our fitted model the values of p and q that minimize AIC(p, q).

Intuitively, one can think of

$$\frac{2(p + q)}{T}$$

as a penalty term that discourages over-parameterization.

There is empirical evidence that AIC tends to pick models which are over-parameterized.

The BIC is a criterion which attempts to correct the overfitting tendency of the AIC.

It is defined as

$$BIC(p, q) = \ln(\hat{\sigma}^2) + \frac{\ln(T)(p + q)}{T}$$

We note that BIC penalizes larger models more heavily than AIC, since

$$\frac{\ln(T)}{T} > \frac{2}{T} \qquad \forall\, T \geq 8$$
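The two criteria, in the form given in this lesson, are simple to evaluate once an estimate of the white noise variance is available. The sketch below assumes that estimate is supplied by some fitting routine (not shown); the function names and the numerical inputs are illustrative. It also checks the sample size from which the BIC penalty exceeds the AIC penalty.

```python
import math

def aic(sigma2_hat, p, q, T):
    """AIC(p, q) = ln(sigma2_hat) + 2 (p + q) / T."""
    return math.log(sigma2_hat) + 2.0 * (p + q) / T

def bic(sigma2_hat, p, q, T):
    """BIC(p, q) = ln(sigma2_hat) + ln(T) (p + q) / T."""
    return math.log(sigma2_hat) + math.log(T) * (p + q) / T

# Smallest integer T for which ln(T) > 2, i.e. BIC penalizes more than AIC.
first_T = next(T for T in range(1, 100) if math.log(T) > 2.0)
print(first_T)  # 8

# Hypothetical variance estimate for an ARMA(1, 1) fitted on T = 100 observations.
print(aic(1.3, 1, 1, 100), bic(1.3, 1, 1, 100))
```

Since ln(7) is about 1.95 and ln(8) is about 2.08, the threshold T ≥ 8 follows directly.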

The procedure to use these criteria is the following:

1. Set upper bounds P and Q for the AR and MA orders, respectively.

2. Fit all possible ARMA(p, q) models for p ≤ P and q ≤ Q using a common sample size T.

3. The orders $(p_A, q_A)$ and $(p_B, q_B)$ of the best models under AIC and BIC satisfy, respectively,

$$AIC(p_A, q_A) = \min_{p \leq P,\, q \leq Q} AIC(p, q)$$

$$BIC(p_B, q_B) = \min_{p \leq P,\, q \leq Q} BIC(p, q)$$

The theoretical properties of these criteria have been investigated. It is known that BIC is consistent, in the sense that the probability of selecting the true model approaches 1 as the sample size grows (provided the true model is in the candidate list), whereas AIC is not.

Some examples

The blue dotted parallel lines show approximate 95% confidence bounds under the null hypotheses H0 : ρk = 0 and H0 : πk = 0, respectively.
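Bounds of this kind are commonly drawn at ±1.96/√T, the large-sample approximation for a white noise series. The lesson does not state which formula its plots use, so the sketch below takes that convention as an assumption; the function names and the correlation values are hypothetical.

```python
import math

def white_noise_bound(T, z=1.96):
    """Approximate 95% bound for a sample (partial) autocorrelation under H0."""
    return z / math.sqrt(T)

def significant_lags(correlations, T):
    """Lags (starting at 1) whose correlation lies outside the bounds."""
    bound = white_noise_bound(T)
    return [k for k, r in enumerate(correlations, start=1) if abs(r) > bound]

r = [0.62, 0.35, 0.12, -0.05]       # hypothetical sample autocorrelations, lags 1..4
print(white_noise_bound(100))       # bound for T = 100
print(significant_lags(r, T=100))   # [1, 2]
```

Values inside the bounds are treated as not significantly different from zero when reading the correlogram and the partial correlogram.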

Table: Selection of the ARMA order by AIC and BIC.

(p, q)    AIC      BIC
(2, 2)    288.7    304.4
(2, 1)    286.8    299.9
(1, 2)    286.7    299.8
(2, 0)    306.6    317.1
(0, 2)    293.7    304.2
(1, 1)    285.2    289.4
(1, 0)    325.5    333.4
(0, 1)    320.4    328.3
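The selection step applied to the table amounts to taking the minimum of each criterion over the candidate orders. In the sketch below the numbers are copied from the table; the dictionary layout and variable names are assumptions for illustration.

```python
# Criterion values from the table, keyed by the order (p, q).
aic_table = {
    (2, 2): 288.7, (2, 1): 286.8, (1, 2): 286.7, (2, 0): 306.6,
    (0, 2): 293.7, (1, 1): 285.2, (1, 0): 325.5, (0, 1): 320.4,
}
bic_table = {
    (2, 2): 304.4, (2, 1): 299.9, (1, 2): 299.8, (2, 0): 317.1,
    (0, 2): 304.2, (1, 1): 289.4, (1, 0): 333.4, (0, 1): 328.3,
}

# The selected order is the one that minimizes the criterion.
best_aic = min(aic_table, key=aic_table.get)
best_bic = min(bic_table, key=bic_table.get)
print(best_aic, best_bic)  # (1, 1) (1, 1)
```

In this example both criteria select the ARMA(1, 1) model.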
