
Chapter 6

Autocorrelation

What is in this Chapter?

• How do we detect this problem?

• What are the consequences?

• What are the solutions?


• Regarding the problem of detection, we start with the Durbin-Watson (DW) statistic, and discuss its several limitations and extensions. We discuss Durbin's h-test for models with lagged dependent variables and tests for higher-order serial correlation.

• We discuss (in Section 6.5) the consequences of serially correlated errors for the OLS estimators.


• The solutions to the problem of serial correlation are discussed in Section 6.3 (estimation in levels versus first differences), Section 6.9 (strategies when the DW test statistic is significant), and Section 6.10 (trends and random walks).

• This chapter is very important, and its ideas have to be understood thoroughly.

6.1 Introduction

• The order of autocorrelation: the errors may follow a first-order (AR(1)) process or a higher-order process

• In the following sections we discuss how to:
– 1. Test for the presence of serial correlation.
– 2. Estimate the regression equation when the errors are serially correlated.

6.2 Durbin-Watson Test

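For reference, the statistic is d = Σ(et − et-1)^2 / Σ et^2, computed from the OLS residuals et; since d is approximately 2(1 − ρ̂), values near 2 indicate no first-order serial correlation. A minimal GAUSS sketch (our illustration with artificial data, not a program from the text):

new;
format /m1 /rd 9,3;
rndseed 123;

T = 100;
x = rndn(T,1);
u = rndn(T,1);              @ iid errors: true rho = 0 @
y = 1 + 2*x + u;

X1 = ones(T,1)~x;
b = olsqr(y,X1);            @ OLS estimates @
e = y - X1*b;               @ OLS residuals @

de = e[2:T,1] - e[1:T-1,1];
d = (de'de)/(e'e);          @ DW statistic; about 2 under rho = 0 @
print " DW statistic "; d;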

6.3 Estimation in Levels Versus First Differences

• Simple solutions to the serial correlation problem: First Difference

• If the DW test rejects the hypothesis of zero serial correlation, what is the next step?

• In such cases one estimates the regression by transforming all the variables by ρ-differencing (quasi-first differencing) or first differencing
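A GAUSS sketch of both transformations, assuming for the illustration that ρ is known (0.8 here); in practice ρ has to be estimated, which is the subject of Section 6.4. Note that under quasi-first differencing the intercept becomes α(1 − ρ), so the constant column is scaled accordingly:

new;
rndseed 7;
T = 100; rho = 0.8;

@ generate data with AR(1) errors @
eps = rndn(T,1); u = zeros(T,1);
i = 2;
do until i > T;
    u[i,1] = rho*u[i-1,1] + eps[i,1];
    i = i + 1;
endo;
x = rndn(T,1);
y = 1 + 2*x + u;

@ first differences: (yt - yt-1) on (xt - xt-1); the intercept drops out @
dy = y[2:T,1] - y[1:T-1,1];
dx = x[2:T,1] - x[1:T-1,1];
b_fd = olsqr(dy,dx);

@ quasi-first (rho) differences: (yt - rho*yt-1) on (xt - rho*xt-1) @
ys = y[2:T,1] - rho*y[1:T-1,1];
xs = x[2:T,1] - rho*x[1:T-1,1];
b_qd = olsqr(ys,ones(T-1,1)*(1-rho)~xs);

print " FD slope "; b_fd;
print " quasi-difference alpha, beta "; b_qd;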


• When comparing equations in levels and first differences, one cannot compare the R2 because the explained variables are different.

• One can compare the residual sums of squares, but only after making a rough adjustment (see p. 231)


• Since we have comparable residual sums of squares (RSS), we can get the comparable R2 as well, using the relationship RSS = Syy(1 − R2)


• Illustrative Examples


• Usually, with time-series data, one gets high R2 values if the regressions are estimated with the levels yt and Xt but one gets low R2 values if the regressions are estimated in first differences (yt — yt-1) and (xt — xt-1)

• Since a high R2 is usually considered as proof of a strong relationship between the variables under investigation, there is a strong tendency to estimate the equations in levels rather than in first differences.

• This is sometimes called the “R2 syndrome."


• However, if the DW statistic is very low, it often implies a misspecified equation, no matter what the value of the R2 is

• In such cases one should estimate the regression equation in first differences and if the R2 is low, this merely indicates that the variables y and x are not related to each other.


• Granger and Newbold present some examples with artificially generated data where y, x, and the error u are each generated independently so that there is no relationship between y and x

• But the correlations between yt and yt-1, xt and xt-1, and ut and ut-1 are very high

• Although there is no relationship between y and x the regression of y on x gives a high R2 but a low DW statistic


• When the regression is run in first differences, the R2 is close to zero and the DW statistic is close to 2, demonstrating that there is indeed no relationship between y and x and that the R2 obtained earlier is spurious.

• Thus regressions in first differences might often reveal the true nature of the relationship between y and x.

• Further discussion of this problem is in Sections 6.10 and 14.7

Homework

• Find the data
– Y is the Taiwan stock index
– X is the U.S. stock index

• Run two equations
– The equation in levels (log-based price)
– The equation in first differences

• A comparison between the two equations
– The beta estimate and its significance
– The R square
– The value of the DW statistic

• Q: Adopt the equation in levels or in first differences?

6.3 Estimation in Levels Versus First Differences

• For instance, suppose that we have quarterly data; then it is possible that the errors in any quarter this year are most highly correlated with the errors in the corresponding quarter last year rather than the errors in the preceding quarter

• That is, ut could be uncorrelated with ut-1 but it could be highly correlated with ut-4.

• If this is the case, the DW statistic will fail to detect it

• What we should be using is a modified statistic defined as

d4 = Σ(ut − ut-4)^2 / Σ ut^2

where the ut are the OLS residuals; d4 bears the same relation to fourth-order autocorrelation that the DW statistic d bears to first-order autocorrelation.
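A sketch of computing d4 alongside the ordinary d on artificial quarterly-style data whose errors are correlated at lag 4 only; the ordinary DW statistic stays near 2 and misses the problem, while d4 falls well below 2:

new;
rndseed 11;
T = 120;                          @ 30 years of quarterly data @
eps = rndn(T,1); u = zeros(T,1);
u[1:4,1] = eps[1:4,1];
i = 5;
do until i > T;
    u[i,1] = 0.7*u[i-4,1] + eps[i,1];   @ fourth-order autocorrelation @
    i = i + 1;
endo;
x = rndn(T,1);
y = 1 + 2*x + u;
X1 = ones(T,1)~x;
e = y - X1*olsqr(y,X1);           @ OLS residuals @

de1 = e[2:T,1] - e[1:T-1,1];
de4 = e[5:T,1] - e[1:T-4,1];
d1 = (de1'de1)/(e'e);             @ ordinary DW: close to 2 here @
d4 = (de4'de4)/(e'e);             @ modified statistic: well below 2 here @
print " d and d4 "; d1~d4;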

6.4 Estimation Procedures with Autocorrelated Errors


• GLS (Generalized least squares)


• In actual practice ρ is not known

• There are two types of procedures for estimating ρ:
– 1. Iterative procedures
– 2. Grid-search procedures
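A minimal sketch of the iterative procedure (Cochrane-Orcutt type): alternate between estimating ρ from the current residuals and re-estimating the regression on ρ-differenced data until ρ settles down. Data are artificial; this is an illustration, not a program from the text:

new;
rndseed 21;
T = 100;
eps = rndn(T,1); u = zeros(T,1);
i = 2;
do until i > T;
    u[i,1] = 0.8*u[i-1,1] + eps[i,1];   @ true rho = 0.8 @
    i = i + 1;
endo;
x = rndn(T,1); y = 1 + 2*x + u;

X1 = ones(T,1)~x;
b = olsqr(y,X1);                  @ step 0: ordinary OLS @
rho = 0; drho = 1; it = 0;
do until (abs(drho) < 0.000001) or (it > 50);
    e = y - X1*b;
    rhonew = (e[1:T-1,1]'e[2:T,1])/(e[1:T-1,1]'e[1:T-1,1]);
    drho = rhonew - rho; rho = rhonew;
    ys = y[2:T,1] - rho*y[1:T-1,1];
    Xs = ones(T-1,1)*(1-rho)~(x[2:T,1] - rho*x[1:T-1,1]);
    b = olsqr(ys,Xs);             @ re-estimate on transformed data @
    it = it + 1;
endo;
print " rho estimate "; rho;
print " alpha, beta "; b;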


Homework

• Redo the example (see Table 3.11 for the data) in the textbook:
– OLS
– C-O procedure
– H-L procedure with the interval of 0.01
– Compare the R2 (note: please calculate the comparable R2 from the levels equation)
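A sketch of the grid-search (Hildreth-Lu type) procedure with the 0.01 interval asked for above: for each ρ on the grid, run the ρ-differenced regression and keep the ρ that minimizes the residual sum of squares. Artificial data again:

new;
rndseed 21;
T = 100;
eps = rndn(T,1); u = zeros(T,1);
i = 2;
do until i > T;
    u[i,1] = 0.8*u[i-1,1] + eps[i,1];
    i = i + 1;
endo;
x = rndn(T,1); y = 1 + 2*x + u;

best_rss = -1; best_rho = 0;
rho = -0.99;
do until rho > 0.99;
    ys = y[2:T,1] - rho*y[1:T-1,1];
    Xs = ones(T-1,1)*(1-rho)~(x[2:T,1] - rho*x[1:T-1,1]);
    v = ys - Xs*olsqr(ys,Xs);
    rss = v'v;
    if (best_rss < 0) or (rss < best_rss);
        best_rss = rss; best_rho = rho;
    endif;
    rho = rho + 0.01;
endo;
print " H-L rho estimate "; best_rho;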

6.5 Effect of AR(1) Errors on OLS Estimates

• In Section 6.4 we described different procedures for the estimation of regression models with AR(1) errors

• We will now answer two questions that might arise with the use of these procedures:
– 1. What do we gain from using these procedures?
– 2. When should we not use these procedures?


• First, in the case we are considering (i.e., the case where the explanatory variable Xt is independent of the error ut), the OLS estimates are unbiased

• However, they will not be efficient

• Further, the tests of significance we apply will be based on the wrong covariance matrix and will therefore be misleading.


• In the case where the explanatory variables include lagged dependent variables, we will have some further problems, which we discuss in Section 6.7

• For the present, let us consider the simple regression model


An Alternative Method to Verify the Above Characteristics?

• Use the "simulation method" shown in Chapter 5

• Write your own program in GAUSS

• Take the program from Chapter 5 and make some modifications to it; a sketch is given below
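A sketch of such a simulation in GAUSS (our modification, in the spirit of the Chapter 5 program): both x and u are AR(1), which is when the understatement of the OLS variance is serious. The mean of the estimates stays near the true β (unbiasedness), but the conventional OLS standard error comes out noticeably smaller than the true sampling standard deviation:

new;
format /m1 /rd 9,3;
rndseed 99;
nrep = 2000; T = 50; rho = 0.8; beta = 2;

bhat = zeros(nrep,1); se = zeros(nrep,1);
r = 1;
do until r > nrep;
    eps = rndn(T,1); u = zeros(T,1);
    epx = rndn(T,1); x = zeros(T,1);
    i = 2;
    do until i > T;
        u[i,1] = rho*u[i-1,1] + eps[i,1];   @ AR(1) errors @
        x[i,1] = rho*x[i-1,1] + epx[i,1];   @ autocorrelated regressor @
        i = i + 1;
    endo;
    y = beta*x + u;
    b = olsqr(y,x);
    e = y - x*b;
    se[r,1] = sqrt(((e'e)/(T-1))/(x'x));    @ conventional OLS s.e. @
    bhat[r,1] = b;
    r = r + 1;
endo;
print " mean of beta estimates (unbiased)   "; meanc(bhat);
print " true s.d. of beta estimates         "; stdc(bhat);
print " mean conventional OLS s.e. (biased) "; meanc(se);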


• Thus the consequences of autocorrelated errors are:
– 1. The least squares estimators are unbiased but are not efficient. Sometimes they are considerably less efficient than procedures that take account of the autocorrelation.
– 2. The sampling variances are biased and sometimes likely to be seriously understated. Thus R2 as well as the t and F statistics tend to be exaggerated.


• 2. The discussion above assumes that the true errors are first-order autoregressive. If they have a more complicated structure (e.g., second-order autoregressive), it might be thought that it would still be better to proceed on the assumption that the errors are first-order autoregressive rather than ignore the problem completely and use the OLS method.
– Engle shows that this is not necessarily true (i.e., sometimes one can be worse off making the assumption of first-order autocorrelation than ignoring the problem completely).


6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

• In previous sections we considered explanatory variables that were uncorrelated with the error term

• This will not be the case if we have lagged dependent variables among the explanatory variables and we have serially correlated errors

• There are several situations under which we would be considering lagged dependent variables as explanatory variables

• These could arise through expectations, adjustment lags, and so on.


• The various situations and models are explained in Chapter 10. For the present we will not be concerned with how the models arise. We will merely study the problem of testing for autocorrelation in these models

• Let us consider a simple model


@ Program 1: x independent of u, small sample @
new;
format /m1 /rd 9,3;
beta = 2;
T = 30;                   @ sample size @
u = rndn(T,1);
x = rndn(T,1) + 0*u;      @ x uncorrelated with u @
y = beta*x + u;
Beta_OLS = olsqr(y,x);    @ OLS @
print " OLS beta estimate "; Beta_OLS;

@ Program 2: x independent of u, large sample @
new;
format /m1 /rd 9,3;
beta = 2;
T = 50000;                @ sample size @
u = rndn(T,1);
x = rndn(T,1) + 0*u;      @ x uncorrelated with u @
y = beta*x + u;
Beta_OLS = olsqr(y,x);    @ OLS: converges to the true beta = 2 @
print " OLS beta estimate "; Beta_OLS;

@ Program 3: x correlated with u, large sample @
new;
format /m1 /rd 9,3;
beta = 2;
T = 50000;                @ sample size @
u = rndn(T,1);
x = rndn(T,1) + 0.5*u;    @ x correlated with u @
y = beta*x + u;
Beta_OLS = olsqr(y,x);    @ OLS: converges to a value above 2 (inconsistent) @
print " OLS beta estimate "; Beta_OLS;

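For a model with a lagged dependent variable, yt = β*yt-1 + γ*xt + ut, Durbin's h-statistic is h = ρ̂ * sqrt(n / (1 − n*V(β̂))), where ρ̂ ≈ 1 − d/2 and V(β̂) is the estimated variance of the coefficient on yt-1; under the null of no serial correlation, h is asymptotically N(0, 1). A GAUSS sketch on artificial data (the test fails when n*V(β̂) ≥ 1, in which case Durbin's second test is used):

new;
rndseed 5;
T = 200;
eps = rndn(T,1); u = zeros(T,1); y = zeros(T,1); x = rndn(T,1);
i = 2;
do until i > T;
    u[i,1] = 0.5*u[i-1,1] + eps[i,1];        @ AR(1) errors @
    y[i,1] = 0.5*y[i-1,1] + x[i,1] + u[i,1];
    i = i + 1;
endo;

yy = y[2:T,1];
Z = y[1:T-1,1]~x[2:T,1];          @ regressors: lagged y, current x @
b = olsqr(yy,Z);
e = yy - Z*b;
n = T - 1;
s2 = (e'e)/(n - 2);
V = s2*invpd(Z'Z);                @ estimated OLS covariance matrix @
de = e[2:n,1] - e[1:n-1,1];
d = (de'de)/(e'e);
rho = 1 - d/2;
if 1 - n*V[1,1] > 0;
    h = rho*sqrt(n/(1 - n*V[1,1]));
    print " Durbin h (compare with N(0,1)) "; h;
else;
    print " h-test fails; use Durbin's second test ";
endif;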

6.8 A General Test for Higher-Order Serial Correlation: The LM Test

• The h-test we have discussed is, like the Durbin-Watson test, a test for first-order autoregression.

• Breusch and Godfrey discuss some general tests that are easy to apply and are valid for very general hypotheses about the serial correlation in the errors

• These tests are derived from a general principle — called the Lagrange multiplier (LM) principle

• A discussion of this principle is beyond the scope of this book. For the present we will explain what the test is

• The test is similar to Durbin's second test that we have discussed
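A sketch of the LM test for AR(2) errors: run OLS, then regress the residuals on the original regressors plus two lagged residuals; under the null of no serial correlation the statistic (T − p)*R2 from this auxiliary regression is asymptotically chi-square with p degrees of freedom (5.99 at the 5% level for p = 2):

new;
rndseed 17;
T = 200; p = 2;
eps = rndn(T,1); u = zeros(T,1);
i = 2;
do until i > T;
    u[i,1] = 0.6*u[i-1,1] + eps[i,1];
    i = i + 1;
endo;
x = rndn(T,1); y = 1 + 2*x + u;

X1 = ones(T,1)~x;
e = y - X1*olsqr(y,X1);           @ OLS residuals @

@ auxiliary regression: et on xt, et-1, et-2, for t = 3..T @
ee = e[3:T,1];
Z = ones(T-2,1)~x[3:T,1]~e[2:T-1,1]~e[1:T-2,1];
v = ee - Z*olsqr(ee,Z);
dm = ee - meanc(ee);
R2 = 1 - (v'v)/(dm'dm);
LM = (T-p)*R2;
print " LM statistic (compare with chi-square(2)) "; LM;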


6.9 Strategies When the DW Test Statistic is Significant

• The DW test is designed as a test for the hypothesis ρ = 0 if the errors follow a first-order autoregressive process

• However, the test has been found to be robust against other alternatives such as AR(2), MA(1), ARMA(1, 1), and so on.

• Further, and more disturbingly, it catches specification errors like omitted variables that are themselves autocorrelated, and misspecified dynamics (a term that we will explain). Thus the strategy to adopt, if the DW test statistic is significant, is not clear. We discuss three different strategies:


• 1. Assume that the significant DW statistic is an indication of serial correlation but may not be due to AR(1) errors

• 2. Test whether serial correlation is due to omitted variables.

• 3. Test whether serial correlation is due to misspecified dynamics.


• Serial correlation due to misspecified dynamics


6.10 Trends and Random Walks


• Both models (TSP and DSP) exhibit a linear trend, but the appropriate method of eliminating the trend differs.

• To test the hypothesis that a time series belongs to the TSP class against the alternative that it belongs to the DSP class, Nelson and Plosser use a test developed by Dickey and Fuller:


Three Types of RW

• RW without drift: Yt = 1*Yt-1 + ut
• RW with drift: Yt = alpha + 1*Yt-1 + ut
• RW with drift and time trend: Yt = alpha + beta*t + 1*Yt-1 + ut
• ut ~ iid(0, sigma^2)

RW or Unit Root Tests in EViews


• Augmented D-F tests:
– Yt = a1*Yt-1 + ut
– Yt − Yt-1 = (a1 − 1)*Yt-1 + ut
– ΔYt = (a1 − 1)*Yt-1 + ut
– ΔYt = λ*Yt-1 + ut
– H0: a1 = 1 ≡ H0: λ = 0

• With lagged differences added (the "augmented" part): ΔYt = λ*Yt-1 + Σ γi*ΔYt-i + ut
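A sketch of the ADF regression with a constant and one lagged difference on an artificial random walk; the t-ratio on λ must be compared with the Dickey-Fuller critical values (about −2.9 at the 5% level with a constant), not the usual t tables:

new;
rndseed 31;
T = 200;
y = cumsumc(rndn(T,1));           @ random walk: true lambda = 0 @

dy = y[2:T,1] - y[1:T-1,1];
yy = dy[2:T-1,1];                 @ Delta-Y for t = 3..T @
Z = ones(T-2,1)~y[2:T-1,1]~dy[1:T-2,1];   @ constant, Y(t-1), Delta-Y(t-1) @
b = olsqr(yy,Z);
e = yy - Z*b;
s2 = (e'e)/(rows(yy) - 3);
V = s2*invpd(Z'Z);
tstat = b[2,1]/sqrt(V[2,2]);      @ t-ratio on lambda @
print " ADF t-statistic "; tstat; @ here it should not reject the unit root @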


• As an illustration, consider the example given by Dickey and Fuller. For the logarithm of the quarterly Federal Reserve Board Production Index, 1950-1 through 1977-4, they assume that the time series is adequately represented by the model


• 6. Regression of one random walk on another, with time included for trend, is strongly subject to the spurious regression phenomenon. That is, the conventional t-test will tend to indicate a relationship between the variables when none is present.


• The main conclusion is that using a regression on time has serious consequences when, in fact, the time series is of the DSP type and, hence, differencing is the appropriate procedure for trend elimination

• Plosser and Schwert also argue that with most economic time series it is always best to work with differenced data rather than data in levels

• The reason is that if indeed the data series are of the DSP type, the errors in the levels equation will have variances increasing over time


• Under these circumstances many of the properties of least squares estimators as well as tests of significance are invalid

• On the other hand, suppose that the levels equation is correctly specified. Then all differencing will do is produce a moving average error and at worst ignoring it will give inefficient estimates

• For instance, suppose that we have the model


• Differencing and Long-Run Effects: The Concept of Cointegration
– One drawback of the procedure of differencing is that it results in a loss of valuable "long-run information" in the data.
– Recently, the concept of cointegrated series has been suggested as one solution to this problem. First, we need to define the term "cointegration."
– Although we do not need the assumption of normality and independence, we will define the terms under this assumption.


• Yt ~ I(1):
– Yt is a random walk
– ΔYt is white noise, or iid
– no one can predict the future price change
– the market is efficient
– the impact of a previous shock on the price will remain and not approach zero


Cointegration

The cointegrating regression for the ADR and underlying (UND) prices is

ADRt = a0 + a1*UNDt + Zt

and the vector error correction model is

ΔADRt = αA + βA*Zt-1 + Σ(i=1..p) γAA,i*ΔADRt-i + Σ(j=1..q) γAU,j*ΔUNDt-j + eA,t

ΔUNDt = αU + βU*Zt-1 + Σ(i=1..p) γUA,i*ΔADRt-i + Σ(j=1..q) γUU,j*ΔUNDt-j + eU,t

where Zt-1 = ADRt-1 − a0 − a1*UNDt-1 is the error-correction term.
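A sketch of the residual-based two-step (Engle-Granger type) test behind Table 1, on artificial cointegrated series standing in for the ADR and UND prices; the t-ratio from step 2 is compared with the Engle-Granger critical values, which are more negative than the ordinary DF ones:

new;
rndseed 41;
T = 500;
und = cumsumc(rndn(T,1));         @ I(1) underlying price (artificial) @
adr = 0.5 + und + 0.5*rndn(T,1);  @ cointegrated with UND by construction @

@ step 1: cointegrating regression ADRt = a0 + a1*UNDt + Zt @
W = ones(T,1)~und;
a = olsqr(adr,W);
z = adr - W*a;                    @ estimated error-correction term @

@ step 2: DF regression on z: (zt - zt-1) = lam*zt-1 + et @
dz = z[2:T,1] - z[1:T-1,1];
zl = z[1:T-1,1];
lam = olsqr(dz,zl);
e = dz - zl*lam;
s2 = (e'e)/(rows(dz) - 1);
tstat = lam/sqrt(s2/(zl'zl));
print " Engle-Granger t-statistic "; tstat;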


Table 1 Unit Root and Cointegration Tests

                          U.K. VOD                Brazil CVDO
                          ADR        UND          ADR        UND
Log levels                -1.4979    -1.5236      -0.6505    -0.2749
% Returns                 -16.8033*  -16.7456*    -16.5194*  -15.8273*
Error correction term     -11.9252*               -4.0032*


• Run the VECM (vector error correction model) in EViews

Cointegration

Table 2 Parameter Estimates of VECM

              U.K. VOD               Brazil CVDO
ADR equation
αA            -0.0006 (0.0008)       0.0008 (0.0007)
βA            -0.0049 (0.0885)       -0.0256 (0.0130)**
γAA,1         -0.1518 (0.0871)*      -0.0578 (0.0433)
γAU,1         0.0829 (0.0852)        0.1248 (0.0453)***
γAA,2         -0.0767 (0.0812)       -0.1534 (0.0465)***
γAU,2         -0.0158 (0.0778)       0.0560 (0.0479)
γAA,3         -0.0766 (0.0710)       -0.1452 (0.0464)***
γAU,3         -0.0204 (0.0660)       0.0952 (0.0472)**
γAA,4         -0.0226 (0.0526)       -0.0677 (0.0437)
γAU,4         0.0408 (0.0439)        0.0287 (0.0429)

UND equation
αU            -0.0006 (0.0007)       0.0012 (0.0007)*
βU            0.5636 (0.0738)***     0.0020 (0.0123)
γUA,1         0.2844 (0.0727)***     0.3833 (0.0411)***
γUU,1         -0.2748 (0.0711)***    -0.2970 (0.0430)***
γUA,2         0.1955 (0.0677)***     0.1478 (0.0442)***
γUU,2         -0.2543 (0.0649)***    -0.1914 (0.0454)***
γUA,3         0.1262 (0.0592)**      0.0634 (0.0440)
γUU,3         -0.1510 (0.0551)***    -0.0479 (0.0449)
γUA,4         0.0487 (0.0439)        -0.0309 (0.0415)
γUU,4         -0.0152 (0.0366)       -0.0123 (0.0408)

Lead-lag relation obtained with the VECM model

• If βA is significant and βU is insignificant:
– the price adjustment mainly depends on the ADR market
– ADR prices converge to UND prices
– UND prices lead ADR prices in the price discovery process
– UND prices provide an information advantage

• If βU is significant and βA is insignificant:
– the price adjustment mainly depends on the UND market
– UND prices converge to ADR prices
– ADR prices lead UND prices in the price discovery process
– ADR prices provide an information advantage

• If both βU and βA are significant:
– this suggests bidirectional error correction
– the equilibrium prices lie between the ADR and UND prices
– both ADR and UND prices converge to the equilibrium prices

• If both βU and βA are significant, but βU is greater than βA in absolute value:
– the finding denotes that it is the UND price that makes the greater adjustment in order to re-establish the equilibrium
– that is, most of the price discovery takes place in the ADR market.

Homework

• Find the spot and futures prices

• Daily and 5-year data at least

• Run the cointegration test

• Run the VECM
– Lead-lag relationship

6.11 ARCH Models and Serial Correlation

• We saw in Section 6.9 that a significant DW statistic can arise through a number of misspecifications.

• We will now discuss one other source. This is the ARCH model suggested by Engle which has, in recent years, been found useful in the analysis of speculative prices.

• ARCH stands for "autoregressive conditional heteroskedasticity."


• GARCH(p, q) model:

yt = Σ(l=1..m) cl*yt-l + et,   et | It-1 ~ N(0, ht)

ht = a0 + Σ(i=1..q) ai*et-i^2 + Σ(j=1..p) bj*ht-j


• The high level of persistence in GARCH models:
– the sum of the two GARCH parameter estimates approximates unity in most cases
– Li and Lin (2003): this finding provides some support for the notion that GARCH models are handicapped by the inability to account for structural changes during the estimation period and thus suffer from a high-persistence problem in variance settings.
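A sketch of the GARCH(1,1) variance recursion with assumed parameter values (actual estimation, as in the homework below, is by maximum likelihood in a package such as EViews); the persistence measure is a1 + b1, set to 0.95 here:

new;
rndseed 51;
T = 1000;
a0 = 0.05; a1 = 0.10; b1 = 0.85;  @ assumed values; a1 + b1 = 0.95 @

h = zeros(T,1); e = zeros(T,1);
h[1,1] = a0/(1 - a1 - b1);        @ start at the unconditional variance @
e[1,1] = sqrt(h[1,1])*rndn(1,1);
i = 2;
do until i > T;
    h[i,1] = a0 + a1*e[i-1,1]^2 + b1*h[i-1,1];   @ conditional variance @
    e[i,1] = sqrt(h[i,1])*rndn(1,1);
    i = i + 1;
endo;
print " persistence a1 + b1 "; a1 + b1;
print " mean of h           "; meanc(h);   @ near a0/(1 - a1 - b1) @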

Homework

• Find the stock returns

• Daily and 5-year data at least

• Run the GARCH(1,1) model:
– Check the sum of the two GARCH parameter estimates
– Parameter estimates
– Graph the time-varying variance estimates

Could we identify a RW? The low test power of the DF test

• The power of a test is the probability of rejecting H0 when H0 is not true. Low power means that H0 is not true, but we accept H0.
• Here: the data series is I(0), but we conclude that it is I(1).

Several Key Problems for Unit Root Tests

• Low test power
• Structural change problem
• Size distortion

• RW, non-stationary, or I(1): Yt = 1*Yt-1 + ut
• Stationary process, or I(0):
– Yt = 0.99*Yt-1 + ut, T = 1,000
– Yt = 0.98*Yt-1 + ut, T = 50 or 1,000
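A sketch of the low-power problem: generate a stationary AR(1) with coefficient 0.98 and T = 50 and run the DF regression; the t-ratio on λ is usually less negative than the critical value of about −2.9, so the (false) unit-root hypothesis is accepted:

new;
rndseed 61;
T = 50;
eps = rndn(T,1); y = zeros(T,1);
i = 2;
do until i > T;
    y[i,1] = 0.98*y[i-1,1] + eps[i,1];   @ I(0), but close to a unit root @
    i = i + 1;
endo;

dy = y[2:T,1] - y[1:T-1,1];
Z = ones(T-1,1)~y[1:T-1,1];
b = olsqr(dy,Z);
e = dy - Z*b;
s2 = (e'e)/(rows(dy) - 2);
V = s2*invpd(Z'Z);
tstat = b[2,1]/sqrt(V[2,2]);
print " DF t-statistic "; tstat;  @ usually above -2.9: unit root not rejected @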

Spurious Regression

• RW 1: Yt=0.05+1*Yt-1+ut

• RW 2: Xt=0.03+1*Xt-1+vt

new;
format /m1 /rd 9,3;

@ Data Generation Process: two independent random walks with drift @
Y = zeros(1000,1); u = 2*rndn(1000,1);
X = zeros(1000,1); v = 1*rndn(1000,1);

i = 2;
do until i > 1000;
    Y[i,1] = 0.05 + 1*Y[i-1,1] + u[i,1];
    X[i,1] = 0.03 + 1*X[i-1,1] + v[i,1];
    i = i + 1;
endo;

output file = d:\Courses\Enclass\Unit\YX_Spur.out reset;
Y~X;                              @ write the two series to the file @
output off;
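Continuing the program above, a sketch that actually runs the levels regression of Y on X and prints the conventional t-ratio on the slope; although the two random walks are generated independently, the t-ratio will typically look highly "significant":

@ spurious levels regression of Y on X @
W = ones(1000,1)~X;
b = olsqr(Y,W);
e = Y - W*b;
s2 = (e'e)/(1000 - 2);
V = s2*invpd(W'W);
tstat = b[2,1]/sqrt(V[2,2]);
print " spurious slope t-ratio "; tstat;  @ usually far beyond 2 in absolute value @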
