8/3/2019 Auto Correlation 1
UO Econ 425/525 - Cameron
UNIVERSITY OF OREGON
Department of Economics
Economics 425/525
Lecture 1 - Autocorrelated Disturbances
Reading: Greene 5e, Chapter 12.1-12.5
Handout: Syllabus, summary of some of today's math
Caveat: These typed notes supersede my older hand-written notes. The content should be very similar, but the material may not have been thoroughly proofread. If you notice any remaining glitches, please send me a note and I will repair them promptly ([email protected])
1. What does the V = σ²Ω matrix look like if there is stationarity in the error process?
2. What are error autocovariances and error autocorrelations? We derived the formulas for the error autocovariances and autocorrelations for AR(1) errors. What does Ω look like in this special case?
3. If errors are AR(1) and the parameter ρ is known, what are the properties of OLS if the corrected parameter covariance matrix is used? What are the properties of the naïve OLS estimates?
4. What happens to the vector of OLS parameter estimates if the lagged value of Y appears on the right-hand side of the model?
5. What is the Newey-West autocorrelation-corrected covariance matrix? How is it related to the Huber/White/sandwich covariance matrix? Is there just one Newey-West corrected covariance matrix for each problem?
6. What does the standard Durbin-Watson test detect? Why does it have grey areas where the results of the test are ambiguous? How do you interpret a DW test statistic?
7. What type of autocorrelation test is appropriate for quarterly data?
8. How do you perform a Breusch-Godfrey LM test for autocorrelated errors of arbitrary order P? Will this test detect all possible patterns of autocorrelation?
9. How do you conduct a Box-Pierce Q-test (or modified Q-test) for autocorrelated errors?
10. How is the testing process different (for autocorrelation) if your model contains the lagged value of Y on the right-hand side?
11. In order to implement a GLS estimator, what does the appropriate P matrix look like for the transformations that must be applied to the data (to effect the analogous transformation of the error terms)?
12. It is easier to effect GLS estimates to correct for AR(1) errors if you just drop the first observation in your data set. Therefore, most researchers will do this. True, False, Uncertain? Explain.
13. There is only one right way to come up with an initial estimate of the unknown parameter ρ that is needed in order to make the transformations of the data for correction of AR(1) errors. True, False, Uncertain? Explain.
14. How can one iterate two-step GLS estimation until the parameter estimates converge? Does iteration improve the efficiency of the estimator, in principle?
15. Like all GLS estimators, the estimator that corrects for AR(1) errors can be iterated to
yield the analogous maximum likelihood parameter estimates. True, False, Uncertain?
Explain.
16. Explain the rationale behind the test for common factors in time-series models where error autocorrelations are suspected.
17. When using a GLS model that corrects for AR(1) errors, the task of creating forecasts based on the estimated model is a little more complicated than in a standard OLS model. Why?
Review intuition of time series data.
With cross-sectional data, we imagine that our observed sample is just one possible sample among an extremely large number of alternative samples that could just as well have been collected in a random sampling process.
Example: 60 in population, choose sample of 10
# different samples = C(N, n) = N! / [ n! (N − n)! ]
    = (60 · 59 · 58 · 57 · 56 · 55 · 54 · 53 · 52 · 51) / (10 · 9 · 8 · 7 · 6 · 5 · 4 · 3 · 2 · 1)
    = 2.73 × 10^17 / 3,628,800 ≈ 7.5 × 10^10
The intuition of building up a sampling distribution for regression parameters across a large
number of potential samples is very digestible with cross-sectional data.
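A quick check of the arithmetic above (my own sketch, not part of the original notes), using Python's exact integer combinatorics:

```python
# Count the distinct samples of size n = 10 from a population of N = 60.
import math

N, n = 60, 10
n_samples = math.comb(N, n)   # N! / (n! (N - n)!)
print(n_samples)              # 75,394,027,566, i.e. roughly 7.5 x 10^10
```

Because `math.comb` works in exact integer arithmetic, there is no rounding: the approximate 7.5 × 10^10 in the notes corresponds to the exact count 75,394,027,566.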
With time-series data, it is a little more difficult to imagine, say, going back to 1950 and collecting different time-series data on US GNP 1950-2005 than actually materialized, historically. There is no counterpart to repeated sampling. Cross-validation of results in split samples is impossible in time-series data. It requires a leap of faith to imagine that the time series actually observed is but one of a very large number of alternative realizations that might have occurred.
Disturbance Processes
In the usual time-series setting, we conveniently assume away the heteroskedasticity problem, and focus instead on the case where the off-diagonal elements of the error variance-covariance matrix are no longer all zeroes:
E[εε'] = V =

    [ σ²     σ_12   σ_13   ...   σ_1T ]
    [ σ_21   σ²                   ... ]
    [ σ_31          σ²                ]
    [ ...                  ...        ]
    [ σ_T1   ...                  σ²  ]

versus

E[εε'] = σ² I_T = σ² ×

    [ 1   0   ...   0 ]
    [ 0   1        ... ]
    [ ...      ...   0 ]
    [ 0   ...   0    1 ]
-
8/3/2019 Auto Correlation 1
3/9
UO Econ 425/525 - Cameron 3
Thus, V = σ²Ω has a constant σ² along the diagonal. Since this is a covariance matrix, the off-diagonal elements must be symmetric, but these are the only assumptions in the most general variant of the model. The departure from the CLRM is now that we no longer have σ_ts = 0 for all t ≠ s.
Slightly less general case: the elements of Ω, called ω_ts, depend only on |t − s| but not on the values of t or s themselves. This is called a weak stationarity assumption (a covariance-stationary process).
Ω =

    [ 1   a   b   c   .   . ]
    [ a   1   a   b   c   . ]
    [ b   a   1   a   b   c ]
    [ c   b   a   1   a   b ]
    [ .   c   b   a   1   a ]
    [ .   .   c   b   a   1 ]
We call this a banded matrix.
The matrix will be symmetric in a particular way: in "strips." Covariances between residuals are a function only of how far apart in time those residuals are! Where in time this interval occurs is irrelevant.
Basic definition and standardized notation:
Subscripts on γ and ρ in this case signify potentially unique scalar parameters for each time-displacement s.

1. Error autocovariances:

    γ_s = Cov[ε_t, ε_{t−s}] = Cov[ε_{t+s}, ε_t] = "strip" value for errors displaced s periods in time

So Cov[ε_t, ε_t] = γ_0 = σ_ε² by this definition.
2. Error autocorrelations (since we assume a homoskedastic case, constant variance = γ_0):

    ρ_s = Cor[ε_t, ε_{t−s}] = Cor[ε_{t+s}, ε_t]
        = Cov[ε_t, ε_{t−s}] / sqrt( Var[ε_t] · Var[ε_{t−s}] )
        = γ_s / γ_0
More generally, we can write the autocovariance matrix E[εε'] = γ_0 R, where R is the autocorrelation matrix, R = [ρ_ts], and we are considering periods t and s.
-
8/3/2019 Auto Correlation 1
4/9
UO Econ 425/525 - Cameron 4
Autocorrelation matrix elements (the autocorrelation coefficients) are:

    R_tr = ρ_s = γ_s / γ_0, where s = |t − r|
Note: Different types of time series processes will result in different patterns in the R matrix.
AR(1) error model:
- most common example for annual aggregate macroeconomic data.

    ε_t = ρ ε_{t−1} + u_t                                               first-order, AR(1)
    ε_t = θ_1 ε_{t−1} + θ_2 ε_{t−2} + θ_3 ε_{t−3} + ... + θ_k ε_{t−k} + u_t    higher order, AR(k)
We need to distinguish between AR and MA error processes:
a.) AR(1): ε_t = ρ ε_{t−1} + u_t

    Cov[ε_t, ε_{t−s}] = γ_s

(where this error covariance approaches zero as the two periods become widely separated in time, but never reaches zero). The influence of a particular period's error fades over time (faster or slower according to the size of ρ) but only disappears asymptotically.
Contrast with:

b.) MA(1): ε_t = u_t − λ u_{t−1}, where the u_t errors are ordinary white noise (uncorrelated).

Var[ε_t] = γ_0 is the own-period covariance:

    Cov[ε_t, ε_t] = σ_u² + λ² σ_u² = σ_u² (1 + λ²)

For any pair of periods t+1 and t (or t and t−1):

    Cov[ε_t, ε_{t+1}] = E[ (u_t − λ u_{t−1})(u_{t+1} − λ u_t) ]    expand to get
    = E[ u_t u_{t+1} − λ u_t² − λ u_{t−1} u_{t+1} + λ² u_{t−1} u_t ]

where the u_t, u_s are uncorrelated for t ≠ s, so the only surviving expectation is

    = −λ σ_u² = γ_1
But for an MA(1) model, γ_k = 0 for k > 1. The memory of this process is only one period.
Pure time series models are a subfield of econometrics by themselves (covered in much more detail in George Evans' class). For annual data, AR(1) error models are the most common application of these types of models: a parsimonious choice in the event that ρ = 0 can be rejected by the data. Higher-order error processes should be considered for data with lower levels of time-aggregation: quarterly, weekly, daily data (e.g. stock price data, daily information).
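The contrast between AR(1) and MA(1) memory can be seen in a short simulation (my own illustration, not from the notes; the parameter values ρ = λ = 0.6 and the MA sign convention ε_t = u_t − λu_{t−1} are assumptions):

```python
# Compare sample autocorrelation patterns of AR(1) vs MA(1) error processes.
import numpy as np

rng = np.random.default_rng(0)
T, rho, lam = 200_000, 0.6, 0.6
u = rng.standard_normal(T)

# AR(1): eps_t = rho * eps_{t-1} + u_t
eps_ar = np.zeros(T)
for t in range(1, T):
    eps_ar[t] = rho * eps_ar[t - 1] + u[t]

# MA(1): eps_t = u_t - lam * u_{t-1}
eps_ma = u[1:] - lam * u[:-1]

def acorr(x, s):
    """Sample autocorrelation at lag s."""
    x = x - x.mean()
    return (x[s:] * x[:-s]).mean() / (x * x).mean()

print([round(acorr(eps_ar, s), 2) for s in (1, 2, 3)])  # decays geometrically, ~rho**s
print([round(acorr(eps_ma, s), 2) for s in (1, 2, 3)])  # nonzero at lag 1 only
```

The AR(1) correlations fade but never hit zero at any finite lag, while the MA(1) correlations cut off exactly after one period, matching the "one-period memory" point above.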
AR(1) models as MA(∞):
AR(1) errors can be represented in MA form by successive substitutions:

    ε_t     = ρ ε_{t−1} + u_t
    ε_{t−1} = ρ ε_{t−2} + u_{t−1}
    ε_{t−2} = ρ ε_{t−3} + u_{t−2}
    ...
    ε_t = u_t + ρ u_{t−1} + ρ² u_{t−2} + ...     an MA(∞) process
Provided that ρ is a fraction (|ρ| < 1), it will be the case that ρ, ρ², ρ³, ρ⁴, ... will diminish.

If ρ > 0, then ρ² > 0, ρ³ > 0, ...: there will tend to be runs of positive errors followed by runs of negative errors.

If ρ < 0, signs of errors ε_t will tend to alternate.

In both cases, as long as |ρ| < 1, the errors will be damped (they will not explode): lim_{s→∞} ρ^s = 0.
Now: successive values of u_t are not correlated, so there are no covariances to consider among the u_t's themselves.
(1)  Var[ε_t] = σ_u² + ρ² σ_u² + ρ⁴ σ_u² + ...                      (infinite geometric series)
             = σ_u² (1 + ρ² + ρ⁴ + ...) = σ_u² / (1 − ρ²)           in the limit

(we'll use this below)
With stationarity, Var[ε_t] = ρ² Var[ε_{t−1}] + σ_u² for all t. Furthermore, Var[ε_t] = Var[ε_{t−1}] due to the homoskedasticity assumption, so the same result falls out.
Recall Cov(u_t, ε_s) = 0 if t > s, since ε_s is not correlated with any subsequent u_t.
Easier way (demonstrate for one-period displaced and two-period displaced):

    Cov[ε_t, ε_{t−1}] = E[ε_t ε_{t−1}]                          (can switch order...)
    = E[(ρ ε_{t−1} + u_t) ε_{t−1}] = ρ E[ε²_{t−1}] + E[u_t ε_{t−1}]
    = ρ Var[ε_{t−1}]                  because u_t is uncorrelated with anything before it
    = ρ σ_u² / (1 − ρ²)               from variance result and stationarity
    Cov[ε_t, ε_{t−2}] = E[ε_t ε_{t−2}]                          (switch order)
    = E[(ρ² ε_{t−2} + ρ u_{t−1} + u_t) ε_{t−2}]
    = ρ² E[ε²_{t−2}] + ρ E[u_{t−1} ε_{t−2}] + E[u_t ε_{t−2}]
    = ρ² Var[ε_{t−2}]                 again because the u's and earlier ε's are independent
    = ρ² σ_u² / (1 − ρ²)              again, from variance result and stationarity
In general:

    ε_t = ρ^s ε_{t−s} + Σ_{i=0}^{s−1} ρ^i u_{t−i}

    Cov[ε_t, ε_{t−s}] = ρ^s E[ε²_{t−s}] = ρ^s Var[ε_t] = ρ^s σ_u² / (1 − ρ²) = ρ^s γ_0,
        since γ_0 = σ_u² / (1 − ρ²)

    Corr[ε_t, ε_{t−s}] = Cov[ε_t, ε_{t−s}] / sqrt( Var[ε_t] Var[ε_{t−s}] ) = γ_s / γ_0 = ρ^s
        (recall correlation = "ρ_s")
We now have enough information to fill out V = σ²Ω for the common AR(1) process model:

    V = σ_ε² Ω = [ σ_u² / (1 − ρ²) ] ×

        [ 1        ρ        ρ²    ...   ρ^{T−1} ]
        [ ρ        1        ρ     ...   ρ^{T−2} ]
        [ ρ²       ρ        1     ...           ]
        [ ...                     ...   ρ       ]
        [ ρ^{T−1}  ...            ρ     1       ]

Don't forget about the (1 − ρ²) denominator under σ_u²!

Ω is a banded matrix with 1s along the principal diagonal, ρ along the first two adjacent diagonals, ρ² along the next two adjacent diagonals, etc.
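The banded structure is one line of code to build (my own sketch; the values T = 5, ρ = 0.5, σ_u = 1 are assumptions for illustration):

```python
# Build V = [sigma_u^2 / (1 - rho^2)] * Omega, where Omega_{ts} = rho^|t-s|.
import numpy as np

T, rho, sigma_u = 5, 0.5, 1.0
idx = np.arange(T)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])   # Toeplitz: 1, rho, rho^2, ...
V = (sigma_u**2 / (1 - rho**2)) * Omega
print(np.round(Omega, 3))
```

Each diagonal "strip" of Omega carries a single power of ρ, exactly as described above.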
Note that the scalar (a ratio) that we have factored out of V could be renamed as our generic σ². The matrix Ω is not just a T×T identity matrix, so we have all the problems outlined in the general case (for GLS). For the most common cases you will encounter in empirical work (in particular ρ_k = ρ^k, above), OLS on such a model will yield estimates that are:
- Unbiased
- Consistent
- Asymptotically normally distributed, but
- Inefficient (if the corrected Var[b] = σ² (X'X)^{−1} (X'ΩX) (X'X)^{−1} is used; otherwise just wrong)

If uncorrected? Packaged OLS s.e.'s, t-ratios, and F-tests are just plain wrong. For ρ > 0, s.e.'s tend to be too small, thus t-ratios too large. This implies overly optimistic results.
WARNING: Exception (lagged dependent variables as regressors)
Simplest case: both y_t and ε_t autoregressive:

    y_t = β y_{t−1} + ε_t        this is the new feature of the model
    ε_t = ρ ε_{t−1} + u_t

NB: only two scalar parameters.
In the main regression, y_{t−1} will be correlated with ε_t:

    Cov[y_{t−1}, ε_t] = Cov[y_{t−1}, ρ ε_{t−1} + u_t]      (u_t uncorrelated with anything before it)
    = ρ Cov[y_{t−1}, ε_{t−1}] = ρ Cov[y_t, ε_t]            (stationary process)
    = ρ Cov[β y_{t−1} + ε_t, ε_t]
    = ρ { β Cov[y_{t−1}, ε_t] + Var[ε_t] }

Thus Cov[y_{t−1}, ε_t] = ρβ Cov[y_{t−1}, ε_t] + ρ Var[ε_t].
Solving for Cov[y_{t−1}, ε_t], which shows up in two places:

    Cov[y_{t−1}, ε_t] = ρ Var[ε_t] / (1 − ρβ) = ρ σ_u² / [ (1 − ρβ)(1 − ρ²) ]

    = 0 if ρ = 0; increasing in ρ.
Also,

    Var[y_t] = β² Var[y_{t−1}] + Var[ε_t] + 2β Cov[y_{t−1}, ε_t]     (last term just calculated above)

With stationarity, Var[y_t] = Var[y_{t−1}] (same), so

    Var[y_t] = σ_u² (1 + ρβ) / [ (1 − β²)(1 − ρβ)(1 − ρ²) ]
Q: What about the consistency of least squares in the presence of this particular stochastic regressor?

    plim b_OLS = β + Cov[y_{t−1}, ε_t] / Var[y_{t−1}] = β + ρ(1 − β²)/(1 + ρβ)

The bias beyond the true value β doesn't go away unless ρ = 0!

Least squares for this special problem will be inconsistent unless ρ = 0 (errors are not autocorrelated).
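A simulation makes the inconsistency concrete (my own sketch, not from the notes; β = ρ = 0.5 are assumed illustration values, for which the plim formula gives 0.5 + 0.5(0.75)/1.25 = 0.8):

```python
# OLS of y_t on y_{t-1} with AR(1) errors converges to
# beta + rho*(1 - beta^2)/(1 + rho*beta), not to beta.
import numpy as np

rng = np.random.default_rng(3)
T, beta, rho = 500_000, 0.5, 0.5

u = rng.standard_normal(T)
eps = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]
    y[t] = beta * y[t - 1] + eps[t]

y_lag, y_cur = y[1000:-1], y[1001:]           # burn-in dropped
b_ols = (y_lag @ y_cur) / (y_lag @ y_lag)     # OLS slope
plim_b = beta + rho * (1 - beta**2) / (1 + rho * beta)
print(round(b_ols, 3), "vs plim", round(plim_b, 3))   # both near 0.8, not beta = 0.5
```

Even with half a million observations, the OLS slope stays near 0.8: more data does not cure the problem, which is exactly what inconsistency means.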
This finding generalizes to the main model with other regressors, X.
Aside: You cannot assume that a generic packaged regression algorithm for serially correlated models knows whether any of your regressors happens to be a lagged value of the dependent variable. However, one program (SHAZAM) allows you to specify that the first explanatory variable in the list is a lagged dependent variable:
e.g. SHAZAM: auto y yl x / dlag (Check other software for this type of option; I haven't found this in Stata yet.)
Comparing three estimation possibilities for data with AR errors:
1. Naïve OLS, ignoring the fact that there are any pathologies in your errors
2. Corrected OLS (use regular point estimates, but fix up the parameter vcov matrix)
3. GLS (transform all data using a feasible version of the appropriate error vcov matrix)

Getting a sense of the relative efficiency of GLS over corrected OLS under autoregressive errors: Hard to generalize, because you need to know both the nature of the error process and the DGP for X in your model. Greene gives the simplest example, with only three scalar parameters involved (i.e. data are deviations from means, so no intercept parameter):
    y_t = β x_t + ε_t
    ε_t = ρ ε_{t−1} + u_t,    Var[ε_t] = σ_ε²
    x_t = θ x_{t−1} + v_t,    Var[x_t] = σ_x²     this is the new part now
Results only (see Greene 4e, p. 535):

    Var[b]_corrected ≈ ( σ² / (T σ_x²) ) · (1 + ρθ) / (1 − ρθ)      (T = # time-series observations)

    Var[β̂]_GLS ≈ ( σ² / (T σ_x²) ) · (1 − ρ²) / (1 + ρ² − 2ρθ)
The relative efficiency of GLS, Var[b]_corrected / Var[β̂]_GLS, can be considered across cases where the autoregressive parameters ρ, θ = −.9, −.3, 0, .3, .6, .9.
***Finding: If x is not autoregressive (θ = 0), relative efficiency ≈ 9.5 when ρ = .9. The noise in the corrected OLS estimator is 9.5 times as great as the noise in the GLS estimator. Bad news!

But, for x adjusting very slowly (θ relatively large, e.g. some macro variables?), and ρ > 0 but not too large (ρ ≈ .6?), relative efficiency ≈ 1.46. Losses from OLS (corrected) are not too bad. Thus the cost of failing to go to a GLS estimator may not be all that great in some circumstances.
I have mentioned that we should certainly not be using the naïve OLS s²(X'X)^{−1} as our parameter variance-covariance matrix when we have autoregressive errors. We need an estimate of

    σ² (X'X)^{−1} (X'ΩX) (X'X)^{−1}

(s² being a biased but consistent estimator for σ²).
For the simple contrived model above, where x_t = θ x_{t−1} + v_t, it can be shown that the simple ratio of the naive OLS variance to the corrected OLS variance, with σ² factored out of both, is:
    (X'X/T)^{−1} / [ (X'X/T)^{−1} (X'ΩX/T) (X'X/T)^{−1} ] ≈ (1 − ρθ) / (1 + ρθ)

    = relative efficiency of the corrected estimator
In most econometric models involving time series, it will be the case that ρ (and θ) are positive. If ρ and θ have the same sign, this ratio is less than one. The OLS uncorrected parameter vcov matrix is too small, implying overly optimistic H_0 testing (this can be very dramatic if ρ and θ are large!). But note: if θ is zero (no autoregression in x), the mis-measurement of the standard errors in the naïve model may not be too bad, even if ρ is large.
******** End of Lecture 1 ***********