
HETEROSKEDASTICITY

Chapter 8


I. Introduction

Previously we assumed homoskedasticity: the variance of the unobservable error, u, conditional on the explanatory variables is constant. $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \mathrm{Var}(u) = \sigma^2$

Heteroskedasticity occurs if the variance of u changes across different segments of the population (i.e. different values of x). $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) \neq \sigma^2$

Example: Return to education, where variance in ability differs by educational attainment

I. Introduction

Failure of homoskedasticity:

Does not bias coefficient estimates

Interpretation of $R^2$ and adjusted $R^2$ is unaffected (because they depend on the unconditional variance)

Does lead to bias in the estimated variance of the coefficient estimates (and consequently, the standard errors)

Standard t-statistics, p-values, CIs, and F-statistics will lead to wrong inference, i.e. t-statistics no longer follow the t distribution

OLS is no longer BLUE (i.e. no longer the most efficient linear unbiased estimator)

Economics 20 - Prof. Anderson

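Example of Heteroskedasticity

[Figure: conditional densities f(y|x) drawn at x1, x2, and x3 along the regression line $E(y \mid x) = \beta_0 + \beta_1 x$; horizontal axis x, vertical axis y.]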

II. Heteroskedasticity-Robust Inference

We can adjust standard errors so that they are valid in the presence of heteroskedasticity of unknown form. This is convenient, because we get correct standard errors regardless of the kind of heteroskedasticity.

Outline:

Variance with heteroskedasticity

Estimate heteroskedasticity-robust standard errors

Detect heteroskedasticity

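II. Variance with Heteroskedasticity

For the simple case, $\hat\beta_1 = \beta_1 + \dfrac{\sum (x_i - \bar{x}) u_i}{\sum (x_i - \bar{x})^2}$, so

$$\mathrm{Var}(\hat\beta_1) = \frac{\sum (x_i - \bar{x})^2 \sigma_i^2}{SST_x^2}, \quad \text{where } SST_x = \sum (x_i - \bar{x})^2$$

A valid estimator for this when $\sigma_i^2 \neq \sigma^2$ is

$$\frac{\sum (x_i - \bar{x})^2 \hat{u}_i^2}{SST_x^2}, \quad \text{where } \hat{u}_i \text{ are the OLS residuals}$$

Note: With homoskedasticity, the equation reduces to $\sigma^2 / SST_x$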

II. Variance with Heteroskedasticity

For the general multiple regression model, a valid estimator of $\mathrm{Var}(\hat\beta_j)$ with heteroskedasticity is

$$\widehat{\mathrm{Var}}(\hat\beta_j) = \frac{\sum \hat{r}_{ij}^2 \hat{u}_i^2}{SST_j^2}$$

where $\hat{r}_{ij}$ is the $i$th residual from regressing $x_j$ on all other independent variables, and $SST_j$ is the sum of squared residuals from this regression

III. Estimation with Heteroskedasticity

We now have a consistent estimate of the variance under heteroskedasticity; its square root can be used as a standard error for inference.

Call these heteroskedasticity-robust s.e., or Huber/White/Eicker s.e. They are easily computable.

Sometimes the variance is corrected for degrees of freedom by multiplying by n/(n - k - 1) before taking the square root. As n → ∞ it's all the same, though.
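To make the robust standard error concrete, here is a minimal sketch for the simple regression case (k = 1) in plain Python. The function name and the toy data are assumptions made for illustration. It computes the usual s.e., the robust s.e. from the sum of $(x_i - \bar{x})^2 \hat{u}_i^2$ divided by $SST_x^2$, and the optional n/(n - k - 1) correction.

```python
import math

def simple_ols_robust(x, y, df_correct=False):
    """Slope, usual s.e., and heteroskedasticity-robust s.e. for y = b0 + b1*x + u."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sst_x = sum(d * d for d in dx)

    # OLS estimates and residuals
    b1 = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sst_x
    b0 = ybar - b1 * xbar
    uhat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Usual variance: sigma^2_hat / SST_x, with sigma^2_hat = SSR / (n - 2)
    sigma2_hat = sum(u * u for u in uhat) / (n - 2)
    se_usual = math.sqrt(sigma2_hat / sst_x)

    # Robust variance: sum (x_i - xbar)^2 * uhat_i^2 / SST_x^2
    var_robust = sum(d * d * u * u for d, u in zip(dx, uhat)) / sst_x ** 2
    if df_correct:
        var_robust *= n / (n - 2)  # n / (n - k - 1) with k = 1
    return b1, se_usual, math.sqrt(var_robust)

# Toy data, invented for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.4, 3.6, 5.9, 5.1]
b1, se_usual, se_robust = simple_ols_robust(x, y)
_, _, se_robust_dfc = simple_ols_robust(x, y, df_correct=True)
print(b1, se_usual, se_robust, se_robust_dfc)
```

The corrected and uncorrected robust s.e. differ only by the factor sqrt(n/(n - k - 1)), which goes to 1 as n grows.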

III. Estimation with Heteroskedasticity

Why don't we always calculate robust s.e.? These robust standard errors only have asymptotic justification as n → ∞.

With small sample sizes, the t statistics formed with robust standard errors will not have a distribution close to the t, and inferences will not be correct.

In Stata, robust standard errors are easily obtained using the robust option of reg: reg y x1 x2, robust

II. Heteroskedasticity-Robust Inference: Estimation

In large samples, people often report just the robust s.e.; in small samples they will report both.

t-stat: t = (estimate - hypothesized value)/(s.e.)

F-stat: No easy analytical formula; can't use the standard formula. Use Stata to give you the F-stat for robust s.e.

III. Estimation with Heteroskedasticity

Log Wage Equation

Estimated equation for log(wage), with usual standard errors in parentheses and heteroskedasticity-robust standard errors in brackets:

Variable     Coefficient   Usual s.e.   Robust s.e.
intercept       .321        (.100)       [.109]
marrmale        .213        (.055)       [.057]
marrfem        -.198        (.058)       [.058]
singfem        -.110        (.056)       [.057]
educ            .0789       (.0067)      [.0074]
exper           .0268       (.0055)      [.0051]
exper^2        -.00054      (.00011)     [.00011]
tenure          .0291       (.0068)      [.0069]
tenure^2       -.00053      (.00023)     [.00024]

n = 526, R^2 = .461

Robust s.e. can be either larger or smaller than the usual s.e.

Here (by chance) the estimates are significant regardless of whether the wrong or the right s.e. are used; this is not always the case.

IV. Testing for Heteroskedasticity

Homoskedasticity implies $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \sigma^2$

To test for heteroskedasticity, just test the following: $H_0$: $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \sigma^2$, which is equivalent to $H_0$: $E(u^2 \mid x_1, x_2, \ldots, x_k) = E(u^2) = \sigma^2$, since we assume $E(u \mid x_1, x_2, \ldots, x_k) = 0$

If we assume the relationship between $u^2$ and $x_j$ is linear, we can test this as a linear restriction. For $u^2 = \delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k + v$ this means testing $H_0$: $\delta_1 = \delta_2 = \ldots = \delta_k = 0$ using an F-test.

IV. Testing for Heteroskedasticity: The Breusch-Pagan Test

We don't observe the error u (or $u^2$), but can estimate it with the residuals from the OLS regression: $\hat{u}^2 = \delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k + e$

Now we can use the $R^2$ from this regression, $R^2_{\hat{u}^2}$, to construct an F statistic.

We can use the normal F-statistic formula if we assume (which we do) that the errors here satisfy homoskedasticity, i.e. $\mathrm{Var}(e \mid x_1, x_2, \ldots, x_k)$ is constant.

The F statistic is just the reported F statistic for overall significance of the regression: $F = [R^2_{\hat{u}^2}/k] / [(1 - R^2_{\hat{u}^2})/(n - k - 1)]$, which is distributed $F_{k,\, n-k-1}$ under the null.
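The Breusch-Pagan steps can be sketched in plain Python for a single regressor (k = 1). The function names and the toy data, whose error spread grows with x, are assumptions for illustration; with more regressors the same steps apply using a multiple-regression routine.

```python
def ols_simple(x, y):
    """Intercept and slope from regressing y on x (with a constant)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def r_squared_simple(x, y):
    """R-squared from regressing y on x (with a constant)."""
    b0, b1 = ols_simple(x, y)
    ybar = sum(y) / len(y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ssr / sst

def breusch_pagan_f(x, y):
    """Breusch-Pagan F statistic for one regressor: regress uhat^2 on x."""
    b0, b1 = ols_simple(x, y)
    u2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]  # squared OLS residuals
    r2 = r_squared_simple(x, u2)                              # R^2 of the auxiliary regression
    n, k = len(x), 1
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return r2, f_stat

# Toy data: error magnitude proportional to x, signs alternating
x = [float(i) for i in range(1, 11)]
y = [2.0 + 0.5 * xi + ((-1) ** i) * 0.1 * xi for i, xi in enumerate(x, start=1)]
r2_u2, f_stat = breusch_pagan_f(x, y)
print(r2_u2, f_stat)
```

The resulting F statistic would be compared with a critical value from the F distribution with (k, n - k - 1) degrees of freedom.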

IV. Testing for Heteroskedasticity: The Breusch-Pagan Test

Ex: Heteroskedasticity in Housing Price Equations

$$\widehat{price} = -21.77 + .00207\, lotsize + .123\, sqrft + 13.85\, bdrms$$

Standard errors: intercept (29.48), lotsize (.00064), sqrft (.013), bdrms (9.01)

$n = 88$, $R^2 = .672$

$$\hat{u}^2 = \delta_0 + \delta_1\, lotsize + \delta_2\, sqrft + \delta_3\, bdrms + e$$

$R^2_{\hat{u}^2} = .1601$, $n = 88$, $k = 3$

$F = [.1601/(1 - .1601)][84/3] = 5.34$, p-value $= .002$

At the 5% level, the critical value c is between 2.71 and 2.76, so reject the null of no heteroskedasticity

IV. Testing for Heteroskedasticity: The White Test

The Breusch-Pagan test will detect any linear forms of heteroskedasticity.

The White test allows for nonlinearities by using squares and cross-products of all the x's: $\hat{u}^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_1^2 + \delta_4 x_2^2 + \delta_5 x_1 x_2 + e$

Use $R^2_{\hat{u}^2}$ to form the F-statistic

Again, assume the latter equation satisfies homoskedasticity

Test whether all the $x_j$, $x_j^2$, and $x_j x_h$ are jointly significant
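As a small illustration of how the White regressors are built, and how quickly their number grows, the sketch below expands a list of x's into levels, squares, and cross-products. The helper names are hypothetical, introduced only for this example.

```python
from itertools import combinations

def white_terms(names):
    """Labels of the White auxiliary regressors: levels, squares, cross-products."""
    levels = list(names)
    squares = [f"{v}^2" for v in names]
    crosses = [f"{a}*{b}" for a, b in combinations(names, 2)]
    return levels + squares + crosses

def white_columns(rows):
    """Expand each observation (x1, ..., xk) into the matching regressor values."""
    expanded = []
    for row in rows:
        squares = [v * v for v in row]
        crosses = [a * b for a, b in combinations(row, 2)]
        expanded.append(list(row) + squares + crosses)
    return expanded

print(white_terms(["x1", "x2"]))             # ['x1', 'x2', 'x1^2', 'x2^2', 'x1*x2']
print(len(white_terms(["x1", "x2", "x3"])))  # 9
print(white_columns([(2.0, 3.0)]))           # [[2.0, 3.0, 4.0, 9.0, 6.0]]
```

With k regressors the auxiliary regression has k + k + k(k - 1)/2 terms, so 2 regressors give 5 terms and 3 regressors give 9.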

IV. Testing for Heteroskedasticity: Alternative Form of the White Test

The White test can get unwieldy pretty quickly. Alternative:

Consider that the fitted values from OLS, $\hat{y}$, are a function of all the x's

Thus, $\hat{y}^2$ will be a function of the squares and cross-products, and so $\hat{y}$ and $\hat{y}^2$ can proxy for all of the $x_j$, $x_j^2$, and $x_j x_h$.

Regress $\hat{u}^2 = \delta_0 + \delta_1 \hat{y} + \delta_2 \hat{y}^2 + e$

Use the $R^2_{\hat{u}^2}$ to form an F statistic

Note we are only testing 2 restrictions now ($H_0$: $\delta_1 = 0$, $\delta_2 = 0$)
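A rough sketch of this alternative form in plain Python follows. The small OLS helper (normal equations solved by Gauss-Jordan elimination), the function names, and the toy data are all assumptions for illustration, not a production routine.

```python
def ols_fit(X, y):
    """OLS via the normal equations; a constant is added to each row of X."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    # Augmented system [Z'Z | Z'y]
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)]
         + [sum(z[i] * yi for z, yi in zip(Z, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                m = A[r][c] / A[c][c]
                A[r] = [a - m * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, z)) for z in Z]
    return beta, fitted

def r_squared(y, fitted):
    ybar = sum(y) / len(y)
    ssr = sum((a - f) ** 2 for a, f in zip(y, fitted))
    sst = sum((a - ybar) ** 2 for a in y)
    return 1.0 - ssr / sst

def special_white_f(X, y):
    """Regress uhat^2 on yhat and yhat^2; F statistic with 2 restrictions."""
    _, yhat = ols_fit(X, y)
    u2 = [(a - f) ** 2 for a, f in zip(y, yhat)]
    aux = [[f, f * f] for f in yhat]
    _, u2_fit = ols_fit(aux, u2)
    r2 = r_squared(u2, u2_fit)
    n = len(y)
    return r2, (r2 / 2) / ((1.0 - r2) / (n - 3))

# Toy data with two regressors; error magnitude grows with x1
x1 = [float(i) for i in range(1, 13)]
x2 = [float((i * 7) % 5) for i in range(1, 13)]
y = [1.0 + 0.5 * a + 0.3 * b + ((-1) ** i) * 0.05 * a * a
     for i, (a, b) in enumerate(zip(x1, x2), start=1)]
r2_special, f_special = special_white_f(list(zip(x1, x2)), y)
print(r2_special, f_special)
```

Whatever the number of original regressors, the auxiliary regression here always has two slope terms, so the F statistic has (2, n - 3) degrees of freedom.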

IV. Testing for Heteroskedasticity: Notes

Recall, transforming the dependent variable into logs can reduce heteroskedasticity.

Level model:

$$\widehat{price} = -21.77 + 0.00207\, lotsize + 0.123\, sqrft + 13.85\, bdrms$$

Standard errors: intercept (29.48), lotsize (.00064), sqrft (0.013), bdrms (9.01)

$n = 88$, $R^2 = 0.672$

$R^2_{\hat{u}^2} = 0.1601$ and $F = [0.1601/(1 - 0.1601)](84/3) = 5.34$, so reject the null of homoskedasticity.

Log model:

$$\widehat{\ln(price)} = -1.30 + 0.168 \ln(lotsize) + 0.700 \ln(sqrft) + 0.037 \ln(bdrms)$$

Standard errors: intercept (0.65), ln(lotsize) (.038), ln(sqrft) (0.093), ln(bdrms) (0.028)

$n = 88$, $R^2 = 0.643$

Regressing the square of the residual on the regressors yields $R^2_{\hat{u}^2} = 0.0480$ and $F = [0.0480/(1 - 0.0480)](84/3) = 1.41$, so fail to reject the null of homoskedasticity.
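The two F statistics for the housing price example can be checked directly from the reported values (R-squared of the squared-residual regression of .1601 in levels and .0480 in logs, n = 88, k = 3). This is only an arithmetic check; the function name is made up for the example.

```python
def f_from_r2(r2, n, k):
    """F statistic for overall significance: [R^2/k] / [(1 - R^2)/(n - k - 1)]."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

f_level = f_from_r2(0.1601, n=88, k=3)  # price in levels
f_log = f_from_r2(0.0480, n=88, k=3)    # price in logs

print(round(f_level, 2))  # 5.34
print(round(f_log, 2))    # 1.41
```

The level-model statistic is above the 5% critical value (between 2.71 and 2.76), while the log-model statistic is below it.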


V. Generalized Least Squares & Weighted Least Squares

While it's always possible to estimate robust standard errors for OLS estimates, OLS is not the most efficient estimator.

If we can correctly specify the form of the heteroskedasticity, we can obtain more efficient estimates.

Basic idea: Transform the model into one that has homoskedastic errors. This is called weighted least squares.

V. Weighted Least Squares: Heteroskedasticity Is Known up to a Multiplicative Constant

Suppose the heteroskedasticity can be modeled as: $\mathrm{Var}(u \mid x) = \sigma_i^2 = \sigma^2 h(x)$

This means the variance of the error is proportional to a known function of x, for example the level of x. E.g., as income increases, variability in savings increases.



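Example:

$$sav_i = \beta_0 + \beta_1\, inc_i + u_i$$

$$\mathrm{Var}(u_i \mid inc_i) = \sigma^2\, inc_i, \quad \text{so } h(inc_i) = inc_i$$

$$sav_i/\sqrt{inc_i} = \beta_0 \left(1/\sqrt{inc_i}\right) + \beta_1 \left(inc_i/\sqrt{inc_i}\right) + u_i^*, \quad \text{where } u_i^* = u_i/\sqrt{inc_i}$$

Transformed equation satisfies all G-M assumptions.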
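The transformation in the savings example can be sketched as follows, assuming Var(u|inc) = σ²·inc. Each variable, including the constant, is divided by sqrt(inc), and OLS without an intercept is run on the transformed data. The function name and the toy numbers are assumptions for illustration.

```python
import math

def wls_savings(inc, sav):
    """WLS for sav = b0 + b1*inc + u with Var(u|inc) = sigma^2 * inc,
    computed as no-intercept OLS on the transformed equation."""
    w = [1.0 / math.sqrt(v) for v in inc]
    y_star = [s * wi for s, wi in zip(sav, w)]  # sav / sqrt(inc)
    z0 = w                                      # 1 / sqrt(inc), multiplies b0
    z1 = [v * wi for v, wi in zip(inc, w)]      # inc / sqrt(inc), multiplies b1

    # 2x2 normal equations for the regression of y_star on (z0, z1)
    a = sum(v * v for v in z0)
    b = sum(p * q for p, q in zip(z0, z1))
    d = sum(v * v for v in z1)
    c0 = sum(p * q for p, q in zip(z0, y_star))
    c1 = sum(p * q for p, q in zip(z1, y_star))
    det = a * d - b * b
    return (d * c0 - b * c1) / det, (a * c1 - b * c0) / det

# Noise-free toy data, so the estimates should recover b0 = 100 and b1 = 0.1
inc = [1000.0, 2000.0, 4000.0, 8000.0]
sav = [100.0 + 0.1 * v for v in inc]
b0_wls, b1_wls = wls_savings(inc, sav)
print(b0_wls, b1_wls)
```

Note that the transformed equation has no constant term: the original intercept becomes the coefficient on 1/sqrt(inc).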

V. Weighted Least Squares: Generalized Least Squares Estimator

Estimating the transformed equation by OLS is an example of generalized least squares (GLS), which is a class of estimators.

GLS will be BLUE in this case. It provides more efficient estimates than OLS on the untransformed equation.

Can use its s.e. for t-statistics, p-values, and CIs, and the resulting $R^2$ is used for F-statistics.

However, this $R^2$ is not a very good goodness-of-fit measure (it tells us how much of the variation in transformed y is explained by the transformed x's).

V. Weighted Least Squares

The GLS estimator for correcting heteroskedasticity is called the WLS estimator. It minimizes the weighted sum of squared residuals, weighted by $1/h_i$, which is proportional to the inverse of the error variance.

Less weight is given to observations with a higher error variance; in contrast, OLS gives the same weight to all observations because it is best when the error variance is identical for all partitions of the population.

V. Weighted Least Squares

Minimization problem: choose the coefficients to minimize $\sum_i \hat{u}_i^2 / h_i$

Can easily perform WLS using the “weight” option in Stata. It produces s.e. that can be used for t-statistics, etc.

Some regression packages even include an option to calculate robust s.e. after weighting, in case the form of heteroskedasticity is specified incorrectly.

Ex: Explain financial wealth in terms of income, age, gender, 401(k) eligibility. Suspect heteroskedasticity, so use WLS with weight 1/inc: $\mathrm{Var}(finan \mid inc) = \sigma^2\, inc$


V. Weighted Least Squares

WLS is great if we know what $\mathrm{Var}(u_i \mid x_i)$ looks like, but in most cases we won't know the form of the heteroskedasticity.

One example where we do know the form: data are aggregated across some group or geographic region instead of being at the individual level.

E.g. the relationship between the amount a worker contributes to a 401(k) and plan generosity.

V. Weighted Least Squares

What if we only have averages for a firm? $mcontrib_i = \beta_0 + \beta_1\, mearns_i + \beta_2\, mage_i + \beta_3\, mmarate_i + mu_i$

If the individual regression satisfies all G-M assumptions, it can be shown that the aggregated regression has $\mathrm{Var}(mu_i \mid x) = \sigma^2/s_i$, where $s_i$ is the number of employees at firm i.

Variance of the error term decreases with firm size.

Weight is then: $1/h_i = s_i$

A similar issue arises with per capita data at the city, county, state or country level: the error in the aggregate equation has variance proportional to 1/(size of population in that area).

VI. Heteroskedasticity must be estimated: Feasible GLS

More typically, we don't know the form of the heteroskedasticity.

In this case, you need to estimate $h(x_i)$. Since we are estimating $h(x_i)$ and using the estimate to transform the equation, we call it feasible GLS (FGLS).

Typically, we start with the assumption of a fairly flexible model, such as $\mathrm{Var}(u \mid x) = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k)$, where $h(x_i) = \exp(\delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k)$

Since we don't know the $\delta$'s, we must estimate them.


Can transform the above as $u^2 = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k)\, v$, where v is an error term; assume $E(v \mid x) = 1$ and $E(v) = 1$

Taking natural logs of both sides: $\ln(u^2) = \alpha_0 + \delta_1 x_1 + \ldots + \delta_k x_k + e$, where $\alpha_0$ now contains the original intercept and $\log(\sigma^2)$

Assume $E(e) = 0$ and e is independent of x

We don't have u, so replace it with its estimate, $\hat{u}$. Now can estimate this by OLS to get an estimate of $h(x_i)$



Now, an estimate of h is obtained as $\hat{h} = \exp(\hat{g})$, where $\hat{g}$ is the fitted value from the regression of $\ln(\hat{u}^2)$ on the x's.

Now can use WLS with weights $1/\hat{h}$

Implementation:

Run the original OLS model, save the residuals, $\hat{u}$, square them and take the log

Regress $\ln(\hat{u}^2)$ on all of the independent variables and get the fitted values, $\hat{g}$

Do WLS using $1/\exp(\hat{g})$ as the weight
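The three implementation steps can be sketched for a single regressor in plain Python. The helper names and the toy data are assumptions for illustration; the sketch also assumes no OLS residual is exactly zero, since the log of the squared residual would then be undefined.

```python
import math

def wls_simple(x, y, w):
    """Minimize sum_i w_i * (y_i - b0 - b1*x_i)^2; equal weights give OLS."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def fgls_simple(x, y):
    """Feasible GLS with Var(u|x) = sigma^2 * exp(d0 + d1*x), one regressor."""
    ones = [1.0] * len(x)
    # Step 1: OLS, then log of squared residuals
    b0, b1 = wls_simple(x, y, ones)
    log_u2 = [math.log((yi - b0 - b1 * xi) ** 2) for xi, yi in zip(x, y)]
    # Step 2: regress ln(uhat^2) on x and form fitted values g_hat
    a0, d1 = wls_simple(x, log_u2, ones)
    h_hat = [math.exp(a0 + d1 * xi) for xi in x]  # h_hat = exp(g_hat)
    # Step 3: WLS with weights 1 / h_hat
    weights = [1.0 / h for h in h_hat]
    return wls_simple(x, y, weights), h_hat

# Toy data: y = 1 + 2x + e, with error magnitude growing in x
x = [float(i) for i in range(1, 9)]
e = [0.3, -0.5, 0.4, -0.9, 1.2, -1.0, 1.6, -1.9]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]
(b0_fgls, b1_fgls), h_hat = fgls_simple(x, y)
print(b0_fgls, b1_fgls)
```

Because the weights come from exp of a fitted value, they are always positive, which is one practical reason for the exponential form.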


FGLS is not BLUE, because we use $\hat{h}$ instead of h, but it is asymptotically more efficient than just running OLS.

As we saw with the White test, we could save valuable degrees of freedom by regressing $\ln(\hat{u}^2)$ on $\hat{y}$ and $\hat{y}^2$ instead of on all of the independent variables.

When doing F tests with WLS, must make sure to use the same weights in the restricted and unrestricted models: form the weights from the unrestricted model and use those weights to do WLS on the restricted model as well as the unrestricted model.



Example: OLS (w/o het. robust s.e.) and WLS

Dependent variable: cigs. Standard errors in parentheses.

Variable        OLS               FGLS (WLS)
intercept      -3.64   (24.08)     5.64   (17.80)
log(income)     .880   (.728)      1.30   (.44)
log(cigprice)  -.751   (5.773)    -2.94   (4.46)
educ           -.501   (.167)     -.463   (.120)
age             .771   (.160)      .482   (.097)
age^2          -.0090  (.0017)    -.0056  (.0009)
restaurn       -2.83   (1.11)     -3.46   (0.80)
n               807                807
R^2             .0526              .1134

Under the B-P test, $R^2_{\hat{u}^2} = .040$ and we get an F-stat that rejects the null, so we have evidence of heteroskedasticity.

Use the Feasible GLS procedure and get the second set of estimates. Signs and story are similar, but magnitudes are different.
