Problems in Regression Analysis


Upload: aideen

Post on 23-Jan-2016


Page 1: Problems in Regression Analysis

1Spring 02

Problems in Regression Analysis

Heteroscedasticity
• Violation of the constancy of the variance of the errors.
• Cross-sectional data

Serial Correlation
• Violation of uncorrelated error terms
• Time-series data


Heteroscedasticity

The OLS model assumes homoscedasticity, i.e., the variance of the errors is constant. In some regressions, especially in cross-sectional studies, this assumption may be violated.

When heteroscedasticity is present, OLS estimation puts more weight on the observations which have large error variances than on those with small error variances.

The OLS estimates are unbiased but inefficient, i.e., they have larger than minimum variance.
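A small simulation can make the "unbiased but inefficient" claim concrete. The sketch below is only an illustration under stated assumptions: the parameter values (2.0 and 0.8), the error standard deviation proportional to X, and the X values 1 to 30 are all hypothetical, and the "corrected" estimator is the divide-through-by-X transformation described later in these slides.

```python
import random

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def ols_fit(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def simulate(reps=2000, seed=0):
    """Draw samples whose error s.d. is proportional to x; collect slope estimates."""
    rng = random.Random(seed)
    x = [float(i) for i in range(1, 31)]          # hypothetical regressor values
    true_b1, true_b2 = 2.0, 0.8                   # hypothetical parameters
    inv_x = [1.0 / xi for xi in x]
    ols_slopes, corrected_slopes = [], []
    for _ in range(reps):
        y = [true_b1 + true_b2 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]
        # Plain OLS on the original variables.
        ols_slopes.append(ols_fit(x, y)[1])
        # Divide through by x: regress y/x on 1/x; the intercept estimates b2.
        ratio = [yi / xi for xi, yi in zip(x, y)]
        corrected_slopes.append(ols_fit(inv_x, ratio)[0])
    return ols_slopes, corrected_slopes

ols_slopes, corrected_slopes = simulate()
print("mean of OLS slopes:          ", round(mean(ols_slopes), 3))
print("mean of corrected slopes:    ", round(mean(corrected_slopes), 3))
print("variance of OLS slopes:      ", round(variance(ols_slopes), 4))
print("variance of corrected slopes:", round(variance(corrected_slopes), 4))
```

Both sets of estimates should center on the true slope, while the plain OLS estimates should show the larger spread across samples.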


Tests of Heteroscedasticity

Lagrange Multiplier Tests

Goldfeld-Quandt Test

White’s Test


Goldfeld-Quandt Test

Order the data by the magnitude of the independent variable, X, which is thought to be related to the error variance.

Omit the middle d observations. (d might be 1/5 of the total sample size)

Fit two separate regressions; one for the low values, another for the high values

Calculate ESS1 and ESS2

Calculate

$$F_{\frac{N-d-2k}{2},\,\frac{N-d-2k}{2}} = \frac{ESS_2}{ESS_1}$$
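The steps above can be sketched in a few lines of Python. This is a minimal illustration for the two-variable case (k = 2), not a general implementation; the function names are hypothetical, ESS is taken to be the error (residual) sum of squares, and the sample is the income and consumption data from the problem that follows, with d = 6 middle observations omitted.

```python
def ols_fit(x, y):
    """Simple OLS of y on x; returns (intercept, slope, error sum of squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ess = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, ess

def goldfeld_quandt(x, y, d, k=2):
    """Return (F, df) after omitting the d middle observations; assumes N - d is even."""
    pairs = sorted(zip(x, y))                  # order the data by X
    n = len(pairs)
    m = (n - d) // 2                           # size of each sub-sample
    low, high = pairs[:m], pairs[n - m:]       # omit the middle d observations
    _, _, ess1 = ols_fit(*zip(*low))           # regression on the low values of X
    _, _, ess2 = ols_fit(*zip(*high))          # regression on the high values of X
    return ess2 / ess1, m - k                  # df = (N - d - 2k) / 2 for each ESS

# Income (three observations per level) and consumption from the problem slides.
income = [float(level) for level in range(12, 22) for _ in range(3)]
consumption = [
    10.6, 10.8, 11.1, 11.4, 11.7, 12.1, 12.3, 12.6, 13.2, 13.0,
    13.3, 13.6, 13.8, 14.0, 14.2, 14.4, 14.9, 15.3, 15.0, 15.7,
    16.4, 15.9, 16.5, 16.9, 16.9, 17.5, 18.1, 17.2, 17.8, 18.5,
]

f_stat, df = goldfeld_quandt(income, consumption, d=6)
print(f"F = {f_stat:.2f} with ({df}, {df}) degrees of freedom")
```

The resulting F is then compared with the critical value from an F table for the chosen significance level.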


Problem

Salvatore: data on income (Y) and consumption

Y     Consumption
12    10.6   10.8   11.1
13    11.4   11.7   12.1
14    12.3   12.6   13.2
15    13.0   13.3   13.6
16    13.8   14.0   14.2
17    14.4   14.9   15.3
18    15.0   15.7   16.4
19    15.9   16.5   16.9
20    16.9   17.5   18.1
21    17.2   17.8   18.5


Problem

[Figure: plot of the consumption data against income; vertical axis from 10.0 to 19.0, horizontal axis from 10 to 22.]


Problem

Regression on the whole sample:

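$$\hat{C} = 1.48 + .788\,Y_d$$

Regressions on the first twelve and last twelve observations:

$$\hat{C} = .85 + .837\,Y_d, \quad R^2 = 0.91, \quad ESS_1 = 1.069$$

$$\hat{C} = 2.31 + .747\,Y_d, \quad R^2 = 0.71, \quad ESS_2 = 3.344$$

$$F = \frac{ESS_2}{ESS_1} = \frac{3.344}{1.069} = 3.13 > F_{crit} = 2.97 \quad (F_{10,10} \text{ at the 5\% level})$$

Since F exceeds the critical value, the hypothesis of homoscedasticity is rejected.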


To Correct for Heteroscedasticity

To correct for heteroscedasticity of the form $Var(\varepsilon_i) = C X_i^2$, where C is a nonzero constant, transform the variables by dividing through by the problematic variable.

In the two variable case,

$$\frac{Y_i}{X_i} = \beta_1 \frac{1}{X_i} + \beta_2 + \frac{\varepsilon_i}{X_i}$$

The transformed error term is now homoscedastic.
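The reason the transformed error is homoscedastic follows in one line from the assumed form of the variance. As a short worked step (treating $X_i$ as non-random, which is the usual textbook assumption rather than something stated on the slide):

```latex
Var\!\left(\frac{\varepsilon_i}{X_i}\right)
  = \frac{1}{X_i^2}\,Var(\varepsilon_i)
  = \frac{C X_i^2}{X_i^2}
  = C
```

The variance no longer depends on $X_i$, so OLS applied to the transformed equation satisfies the constant-variance assumption.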


Problem

$$\frac{C}{Y_d} = \beta_1 \frac{1}{Y_d} + \beta_2 + u_i$$

$$\frac{\hat{C}}{Y_d} = .792 + 1.421\,\frac{1}{Y_d}$$

$$\hat{C} = 1.421 + .792\,Y_d$$
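The transformed regression can be reproduced from the income and consumption data given earlier in the problem. The following is a minimal sketch with a hypothetical helper `ols_fit`; it regresses C/Yd on 1/Yd and then reads the original intercept and slope off the transformed slope and intercept, respectively.

```python
def ols_fit(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Income (three observations per level) and consumption from the problem slides.
income = [float(level) for level in range(12, 22) for _ in range(3)]
consumption = [
    10.6, 10.8, 11.1, 11.4, 11.7, 12.1, 12.3, 12.6, 13.2, 13.0,
    13.3, 13.6, 13.8, 14.0, 14.2, 14.4, 14.9, 15.3, 15.0, 15.7,
    16.4, 15.9, 16.5, 16.9, 16.9, 17.5, 18.1, 17.2, 17.8, 18.5,
]

# Untransformed regression, for comparison.
orig_intercept, orig_slope = ols_fit(income, consumption)

# Divide through by income: regress C/Yd on 1/Yd.
inv_income = [1.0 / yd for yd in income]
ratio = [c / yd for c, yd in zip(consumption, income)]
t_intercept, t_slope = ols_fit(inv_income, ratio)

print(f"OLS:         C    = {orig_intercept:.3f} + {orig_slope:.3f} Yd")
print(f"Transformed: C/Yd = {t_intercept:.3f} + {t_slope:.3f} (1/Yd)")
# Back in the original variables, the transformed slope is the intercept
# and the transformed intercept is the coefficient on Yd.
print(f"Corrected:   C    = {t_slope:.3f} + {t_intercept:.3f} Yd")
```

Note the role reversal: the constant of the transformed regression estimates the coefficient on income, and the coefficient on 1/Yd estimates the original constant.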


Serial Correlation

This is the problem which arises in OLS estimation when the errors are not independent: the error term in one period is correlated with error terms in previous periods.

If $\varepsilon_i$ is correlated with $\varepsilon_{i-1}$, then we say there is first order serial correlation.

Serial correlation may be positive or negative:
• Positive: $E(\varepsilon_i \varepsilon_{i-1}) > 0$
• Negative: $E(\varepsilon_i \varepsilon_{i-1}) < 0$


Serial Correlation

If serial correlation is present, the OLS estimates are still unbiased and consistent, but the standard errors are biased, leading to incorrect statistical tests and biased confidence intervals.
• With positive serial correlation, the standard errors of $\hat{\beta}$ are biased downward, leading to higher t stats.
• With negative serial correlation, the standard errors of $\hat{\beta}$ are biased upward, leading to lower t stats.


Durbin-Watson Statistic

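$$d = \frac{\sum_{t=2}^{n} (\varepsilon_t - \varepsilon_{t-1})^2}{\sum_{t=1}^{n} \varepsilon_t^2}, \qquad 0 \le d \le 4$$

where $\varepsilon_t$ denotes the regression residual in period t.

Decision regions (dL and dU are the lower and upper critical values):

0 to dL          positive serial correlation (+SC)
dL to dU         inconclusive
dU to 4-dU       no serial correlation
4-dU to 4-dL     inconclusive
4-dL to 4        negative serial correlation (-SC)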
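The statistic itself is simple to compute from a list of residuals. A minimal sketch follows; the function name and the two residual series are made up purely to show how d moves away from 2.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that keep the same sign for several periods
# (positive serial correlation): d falls below 2.
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

# Hypothetical residuals that flip sign every period
# (negative serial correlation): d rises above 2.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(round(durbin_watson(persistent), 3))
print(round(durbin_watson(alternating), 3))
```

Values near 2 indicate no first-order serial correlation; how far from 2 counts as significant depends on dL and dU from a Durbin-Watson table.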


Problem

Data 9-4 shows corporate profits and sales in billions of dollars for the manufacturing sector of the U.S. from 1974 to 1994.

Estimate the equation

Profits = $\beta_1$ + $\beta_2$ Sales + e

Test for first-order serial correlation.


Problem

Coefficients (Dependent Variable: PROFITS)

Model 1        Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)     34.014             24.041                           1.415   .173
SALES          2.654E-02          .011         .496                2.492   .022

OLS Estimate of Profit as a function of Sales:

$$\hat{\pi} = 34.01 + .027\,Sales_t$$


Problem

Test for serial correlation (SPSS)

Model Summary (Predictors: (Constant), SALES; Dependent Variable: PROFITS)

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .496   .246       .207                31.251                       1.080
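The decision regions from the Durbin-Watson slide can be applied to the reported statistic, d = 1.080. The sketch below is hedged in one important respect: the bounds dL = 1.22 and dU = 1.42 are an assumption, taken as the values commonly tabulated at the 5% level for 21 observations and one regressor, and should be checked against whichever table is in use. The function name is hypothetical.

```python
def dw_decision(d, d_lower, d_upper):
    """Map a Durbin-Watson statistic to one of the five decision regions."""
    if d < d_lower:
        return "positive serial correlation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no serial correlation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative serial correlation"

# d comes from the SPSS Model Summary; the bounds are assumed table values
# (5% level, n = 21, one regressor) and are not given in the slides.
print(dw_decision(1.080, d_lower=1.22, d_upper=1.42))
```

With these assumed bounds, d lies below dL, which points to positive first-order serial correlation.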


Correcting for Serial Correlation

We assume:

$$\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$$

$$Cov(\varepsilon_t, \varepsilon_{t-1}) = \rho\,\sigma^2$$

where $u_t$ is distributed normally with a zero mean and constant variance.

Follow a Durbin Procedure.


Correcting for Serial Correlation

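$$Y_t = \beta_1 + \beta_2 X_{2t} + \dots + \beta_k X_{kt} + \varepsilon_t$$

$$Y_{t-1} = \beta_1 + \beta_2 X_{2,t-1} + \dots + \beta_k X_{k,t-1} + \varepsilon_{t-1}$$

$$\rho Y_{t-1} = \rho\beta_1 + \rho\beta_2 X_{2,t-1} + \dots + \rho\beta_k X_{k,t-1} + \rho\varepsilon_{t-1}$$

$$Y_t - \rho Y_{t-1} = \beta_1(1-\rho) + \beta_2(X_{2t} - \rho X_{2,t-1}) + \dots + \beta_k(X_{kt} - \rho X_{k,t-1}) + (\varepsilon_t - \rho\varepsilon_{t-1})$$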


Correcting for Serial Correlation

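$$Y_t - \rho Y_{t-1} = \beta_1(1-\rho) + \beta_2(X_{2t} - \rho X_{2,t-1}) + \dots + \beta_k(X_{kt} - \rho X_{k,t-1}) + (\varepsilon_t - \rho\varepsilon_{t-1})$$

• Move the lagged dependent variable term to the right-hand side and estimate the equation using OLS. The estimated coefficient on the lagged dependent variable is $\rho$.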


Correcting for Serial Correlation

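$$Y_t - \rho Y_{t-1} = \beta_1(1-\rho) + \beta_2(X_{2t} - \rho X_{2,t-1}) + \dots + \beta_k(X_{kt} - \rho X_{k,t-1}) + (\varepsilon_t - \rho\varepsilon_{t-1})$$

Create new independent and dependent variables by the following process:

$$Y_t^* = Y_t - \rho\,Y_{t-1}$$

$$X_t^* = X_t - \rho\,X_{t-1}$$

Estimate the following equation:

$$Y_t^* = \beta_1(1-\rho) + \beta_2 X_{2t}^* + \dots + \beta_k X_{kt}^* + u_t$$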
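Generating the starred variables is a one-line transformation of each series. The sketch below uses a hypothetical helper `quasi_difference` and made-up numbers; it also checks, on an artificial AR(1) error series, that the same transformation recovers the well-behaved disturbance u.

```python
def quasi_difference(series, rho):
    """Return z_t - rho * z_{t-1} for t = 2..n; the first observation is lost."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]

rho = 0.5                                   # hypothetical value of rho

# Starred version of a made-up dependent variable.
y = [10.0, 12.0, 11.0, 13.0]
y_star = quasi_difference(y, rho)
print(y_star)

# Build an AR(1) error series from made-up disturbances u,
# then confirm that quasi-differencing gives u back (from period 2 on).
u = [1.0, -2.0, 0.5, 3.0]
errors = [u[0]]
for t in range(1, len(u)):
    errors.append(rho * errors[t - 1] + u[t])
print(quasi_difference(errors, rho))
```

The same function is applied to every regressor, and the regression on the starred variables then has n - 1 observations.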


Correcting for Serial Correlation

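The slope coefficients estimate the same parameters as in the original equation, but are now corrected for serial correlation.

$$Y_t^* = \beta_1(1-\rho) + \beta_2 X_{2t}^* + \dots + \beta_k X_{kt}^* + u_t$$

The constant of the regression on the transformed variables is

$$\beta_1^* = \beta_1(1-\rho) \quad \text{or} \quad \beta_1 = \frac{\beta_1^*}{1-\rho}$$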


Problem

Begin by regressing Profit ($\pi$) on Profit lagged one period, Sales, and Sales lagged one period.

$$\pi_t = \beta_1(1-\rho) + \rho\,\pi_{t-1} + \beta_2 S_t - \rho\beta_2 S_{t-1} + u_t$$

The estimated coefficient on the lagged dependent variable is $\rho$.


Problem

Coefficients (Dependent Variable: PROFITS)

Model 1        Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)     -1.419             24.387                           -.058    .954
PROFITSL       .492               .209         .419                2.358    .031
SALES          .176               .052         3.106               3.355    .004
SALESL         -.161              .053         -2.840              -3.046   .008

$\rho$ = .49


Problem

Then generate the transformed (starred) variables and run the regression on the transformed variables:

Profit* = .167 + .042 Sales*

With no serial correlation, the equation in the original variables is:

Profit = .327 + .042 Sales

Coefficients (Dependent Variable: PROFITSS)

Model 1        Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)     .167               24.855                           .007    .995
SALESS         4.234E-02          .020         .442                2.091   .051
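The constant of .327 in the equation for Profit comes from undoing the transformation of the intercept, using the relation between the original and transformed constants given earlier in the slides. As a worked step, with $\rho$ rounded to .49 and the transformed constant of .167 from the table:

```latex
\beta_1 = \frac{\beta_1^*}{1-\rho} = \frac{.167}{1 - .49} = \frac{.167}{.51} \approx .327
```

Note that this constant has a very large standard error in the transformed regression (24.855), so the recovered value is imprecise.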