HETEROSKEDASTICITY
Chapter 8
I. Introduction
Previously we assumed homoskedasticity: the variance of the unobservable error, u, conditional on the explanatory variables is constant. Var(u|x1, x2,…, xk) = Var(u) = σ²
Heteroskedasticity occurs if the variance of u changes across different segments of the population (i.e. different values of x). Var(u|x1, x2,…, xk) ≠ σ²
Example: Return to education, where variance in ability differs by educational attainment
I. Introduction
Failure of homoskedasticity:
  Does not bias coefficient estimates
  Interpretation of R² and adjusted R² is unaffected (because they depend on the unconditional variance)
  Does lead to bias in the estimate of the variance (and consequently, the standard error)
  Standard t-statistics, p-values, CIs, and F-statistics will lead to wrong inference, i.e. the t-statistic no longer follows a t distribution
  OLS is no longer BLUE (i.e. no longer the most efficient linear unbiased estimator)
Economics 20 - Prof. Anderson
Example of Heteroskedasticity
[Figure: conditional densities f(y|x) drawn at x1, x2, x3 around the regression line E(y|x) = β0 + β1x]
II. Heteroskedasticity-Robust Inference
We can adjust standard errors so that they are valid in the presence of heteroskedasticity of unknown form. This is convenient, because we get correct standard errors regardless of the kind of heteroskedasticity.
Outline:
  Variance with heteroskedasticity
  Estimate heteroskedasticity-robust standard errors
  Detect heteroskedasticity
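II. Variance with Heteroskedasticity
For the simple case, $\hat\beta_1 = \beta_1 + \frac{\sum_i (x_i - \bar x)\,u_i}{\sum_i (x_i - \bar x)^2}$, so
$$\mathrm{Var}(\hat\beta_1) = \frac{\sum_i (x_i - \bar x)^2 \sigma_i^2}{SST_x^2}, \quad \text{where } SST_x = \sum_i (x_i - \bar x)^2 .$$
A valid estimator for this when $\sigma_i^2 \neq \sigma^2$ is
$$\frac{\sum_i (x_i - \bar x)^2 \hat u_i^2}{SST_x^2}, \quad \text{where the } \hat u_i \text{ are the OLS residuals.}$$
Note: With homoskedasticity, the equation reduces to $\sigma^2 / SST_x$.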
II. Variance with Heteroskedasticity
For the general multiple regression model, a valid estimator of $\mathrm{Var}(\hat\beta_j)$ with heteroskedasticity is
$$\widehat{\mathrm{Var}}(\hat\beta_j) = \frac{\sum_i \hat r_{ij}^2\, \hat u_i^2}{SSR_j^2},$$
where $\hat r_{ij}$ is the $i$th residual from regressing $x_j$ on all other independent variables, and $SSR_j$ is the sum of squared residuals from this regression.
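The multiple-regression formula can be sketched numerically. The function name, the simulated data, and the choice of NumPy are assumptions made here for illustration; they are not part of the lecture. The sketch regresses x_j on the other regressors, keeps the residuals, and forms the ratio of the sum of (squared x_j-residual times squared OLS residual) to the squared sum of squared x_j-residuals.

```python
import numpy as np

def robust_var_partialling_out(X, u_hat, j):
    """Heteroskedasticity-robust variance of the j-th OLS coefficient.

    X     : (n, k+1) regressor matrix, including a column of ones
    u_hat : (n,) OLS residuals from regressing y on X
    j     : column index of the regressor of interest
    """
    others = np.delete(X, j, axis=1)
    # Residuals r_ij from regressing x_j on all other regressors
    coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    r = X[:, j] - others @ coef
    ssr_j = np.sum(r ** 2)
    return np.sum(r ** 2 * u_hat ** 2) / ssr_j ** 2

# Simulated data (made up for illustration): error s.d. grows with x1
rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(1.0, 5.0, n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + x1 * rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat
var_b1 = robust_var_partialling_out(X, u_hat, j=1)
print("robust s.e. of the x1 coefficient:", np.sqrt(var_b1))
```

The same number appears as the matching diagonal element of the matrix "sandwich" form of the robust variance, which is how most packages compute it.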
III. Estimation with Heteroskedasticity
We now have a consistent estimate of the variance under heteroskedasticity; its square root can be used as a standard error for inference.
Call them heteroskedasticity-robust s.e. or Huber/White/Eicker s.e. They are easily computable.
Sometimes the variance is corrected for degrees of freedom by multiplying by n/(n – k – 1) before taking the square root. As n → ∞ it’s all the same, though.
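A minimal sketch of the robust standard errors with and without the n/(n – k – 1) correction, written in matrix form. The function name `ols_with_robust_se` and the simulated data are hypothetical and chosen for illustration only; the lecture itself uses Stata.

```python
import numpy as np

def ols_with_robust_se(X, y, df_correction=False):
    """OLS coefficients with heteroskedasticity-robust standard errors.

    X must include a column of ones, so X.shape[1] = k + 1.
    """
    n, p = X.shape                           # p = k + 1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    u_hat = y - X @ beta_hat
    # Middle term: sum over i of u_hat_i^2 * x_i x_i'
    meat = (X * (u_hat ** 2)[:, None]).T @ X
    V = XtX_inv @ meat @ XtX_inv
    if df_correction:
        V = V * n / (n - p)                  # n / (n - k - 1)
    return beta_hat, np.sqrt(np.diag(V))

# Simulated heteroskedastic data (made up for illustration)
rng = np.random.default_rng(1)
n = 400
x = rng.uniform(1.0, 5.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + x * rng.normal(size=n)

beta_hat, se_plain = ols_with_robust_se(X, y)
_, se_adj = ols_with_robust_se(X, y, df_correction=True)
print(beta_hat, se_plain, se_adj)
```

The corrected standard errors differ from the uncorrected ones only by the factor sqrt(n/(n – k – 1)), which goes to 1 as n grows.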
III. Estimation with Heteroskedasticity
Why don’t we always calculate robust s.e.?
These robust standard errors only have asymptotic justification as n → ∞.
With small sample sizes, the t statistics formed with robust standard errors will not have a distribution close to the t, and inferences will not be correct.
In Stata, robust standard errors are easily obtained using the robust option of reg: reg y x1 x2, robust
II. Heteroskedasticity-Robust Inference: Estimation
In large samples, people often report just robust s.e.; in small samples they report both.
t-stat: t = (estimate – hypothesized value)/(s.e.)
F-stat: there is no easy analytical formula, so we can’t use the standard formula. Use Stata to give you the F-stat for robust s.e.
III. Estimation with Heteroskedasticity
Log Wage Equation
Dependent variable: log(wage). Usual s.e. in parentheses, robust s.e. in brackets.

  variable     coefficient   (usual s.e.)   [robust s.e.]
  intercept       .321         (.100)          [.109]
  marrmale        .213         (.055)          [.057]
  marrfem        -.198         (.058)          [.058]
  singfem        -.110         (.056)          [.057]
  educ            .0789        (.0067)         [.0074]
  exper           .0268        (.0055)         [.0051]
  exper²         -.00054       (.00011)        [.00011]
  tenure          .0291        (.0068)         [.0069]
  tenure²        -.00053       (.00023)        [.00024]
  n = 526, R² = .461

Robust s.e. can be either larger or smaller than the usual s.e.
Here (by chance) the estimates are significant regardless of whether we use the wrong or the right s.e.; this is not always the case.
IV. Testing for Heteroskedasticity
Homoskedasticity implies Var(u|x1, x2,…, xk) = σ²
To test for heteroskedasticity, test H0: Var(u|x1, x2,…, xk) = σ², which is equivalent to H0: E(u²|x1, x2,…, xk) = E(u²) = σ², since we assume E(u|x) = 0.
If we assume the relationship between u² and xj is linear, we can test this as a linear restriction. For u² = δ0 + δ1x1 +…+ δkxk + v this means testing H0: δ1 = δ2 = … = δk = 0 using an F-test.
IV. Testing for Heteroskedasticity: The Breusch-Pagan Test
We don’t observe the error u (or u²), but we can estimate it with the residuals from the OLS regression: û² = δ0 + δ1x1 +…+ δkxk + e
Now we can use the R² from this regression, R²_û², to construct an F statistic.
We can use the normal F-statistic formula if we assume (which we do) that the errors here satisfy homoskedasticity, i.e. Var(e|x1, x2,…, xk) is constant.
The F statistic is just the reported F statistic for overall significance of the regression: F = [R²_û²/k]/[(1 – R²_û²)/(n – k – 1)], which is distributed F(k, n – k – 1) under H0.
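The F form of the Breusch-Pagan test can be sketched as follows. The function name `breusch_pagan_f` and the simulated data are assumptions for illustration; the sketch regresses the squared OLS residuals on the original regressors and plugs the resulting R² into the F formula above.

```python
import numpy as np

def breusch_pagan_f(X, y):
    """F statistic of the Breusch-Pagan test and the R^2 it is built from.

    X must include a column of ones, so X.shape[1] = k + 1.
    """
    n, p = X.shape
    k = p - 1
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ beta_hat) ** 2             # squared OLS residuals
    # Auxiliary regression of u_hat^2 on the same regressors
    delta_hat, *_ = np.linalg.lstsq(X, u2, rcond=None)
    e = u2 - X @ delta_hat
    r2 = 1.0 - np.sum(e ** 2) / np.sum((u2 - u2.mean()) ** 2)
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return f_stat, r2

# Simulated data (made up for illustration): error s.d. proportional to x
rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(1.0, 5.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + x * rng.normal(size=n)

f_het, r2_het = breusch_pagan_f(X, y)
print("F =", f_het, "R2 =", r2_het)
```

The statistic is compared with a critical value from the F(k, n – k – 1) distribution; a large value is evidence against homoskedasticity.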
IV. Testing for Heteroskedasticity: The Breusch-Pagan Test
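Ex: Heteroskedasticity in Housing Price Equations
$$\widehat{price} = \underset{(29.48)}{-21.77} + \underset{(.00064)}{.00207}\,lotsize + \underset{(.013)}{.123}\,sqrft + \underset{(9.01)}{13.85}\,bdrms, \quad n = 88,\ R^2 = .672$$
$$\hat u^2 = \delta_0 + \delta_1\,lotsize + \delta_2\,sqrft + \delta_3\,bdrms + e, \quad R^2_{\hat u^2} = .1601,\ n = 88,\ k = 3$$
F = [.1601/(1 – .1601)][84/3] = 5.34, p-value = .002
At the 5% level, the critical value c is between 2.71 and 2.76, so reject the null of no heteroskedasticity.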
IV. Testing for Heteroskedasticity: The White Test
The Breusch-Pagan test will detect any linear forms of heteroskedasticity.
The White test allows for nonlinearities by using squares and cross-products of all the x’s: û² = δ0 + δ1x1 + δ2x2 + δ3x1² + δ4x2² + δ5x1x2 + e
Use R²_û² to form an F-statistic (again, assume the latter equation satisfies homoskedasticity) and test whether all the xj, xj², and xjxh are jointly significant.
IV. Testing for Heteroskedasticity: Alternative Form of the White Test
The White test can get unwieldy pretty quickly. Alternative:
Consider that the fitted values from OLS, ŷ, are a function of all the x’s.
Thus, ŷ² will be a function of the squares and cross-products, and so ŷ and ŷ² can proxy for all of the xj, xj², and xjxh.
Regress û² = δ0 + δ1ŷ + δ2ŷ² + e
Use the R²_û² to form an F statistic.
Note that we are only testing 2 restrictions now (H0: δ1 = 0, δ2 = 0).
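The alternative form of the White test can be sketched in a few lines. The function name `white_special_f` and the simulated data are hypothetical, introduced here for illustration. With two restrictions the F statistic has 2 and n – 3 degrees of freedom.

```python
import numpy as np

def white_special_f(X, y):
    """F statistic for the regression of u_hat^2 on y_hat and y_hat^2.

    X must include a column of ones.
    """
    n = X.shape[0]
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat
    u2 = (y - y_hat) ** 2
    # Auxiliary regressors: constant, fitted values, squared fitted values
    Z = np.column_stack([np.ones(n), y_hat, y_hat ** 2])
    delta_hat, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    e = u2 - Z @ delta_hat
    r2 = 1.0 - np.sum(e ** 2) / np.sum((u2 - u2.mean()) ** 2)
    f_stat = (r2 / 2) / ((1.0 - r2) / (n - 3))   # 2 restrictions
    return f_stat, r2

# Simulated data (made up for illustration): error s.d. proportional to x1
rng = np.random.default_rng(5)
n = 2000
x1 = rng.uniform(1.0, 5.0, n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 + 0.5 * x2 + x1 * rng.normal(size=n)

f_white, r2_white = white_special_f(X, y)
print("F =", f_white, "R2 =", r2_white)
```

However many regressors the original model has, the auxiliary regression here always has exactly two slopes, which is what keeps the test manageable.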
IV. Testing for Heteroskedasticity: Notes
Recall, transforming dependent variable into log can reduce heteroskedasticity.
Levels:
$$\widehat{price} = \underset{(29.48)}{-21.77} + \underset{(.00064)}{.00207}\,lotsize + \underset{(.013)}{.123}\,sqrft + \underset{(9.01)}{13.85}\,bdrms, \quad n = 88,\ R^2 = .672$$
Regressing the square of the residual on the regressors yields R²_û² = .1601.
F = [.1601/(1 – .1601)](84/3) = 5.34, so reject the null of homoskedasticity.
Logs:
$$\widehat{\ln(price)} = \underset{(0.65)}{-1.30} + \underset{(.038)}{.168}\,\ln(lotsize) + \underset{(.093)}{.700}\,\ln(sqrft) + \underset{(.028)}{.037}\,bdrms, \quad n = 88,\ R^2 = .643$$
Regressing the square of the residual on the regressors yields R²_û² = .0480.
F = [.0480/(1 – .0480)](84/3) = 1.41, so fail to reject the null of homoskedasticity.
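The two F statistics in this housing example can be checked directly from the reported R² values (.1601 and .0480, with n = 88 and k = 3). The helper name `f_from_r2` is made up for this sketch.

```python
def f_from_r2(r2, n, k):
    """F statistic for H0: all k slopes in the u_hat^2 regression are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

f_levels = f_from_r2(0.1601, n=88, k=3)   # price in levels
f_logs = f_from_r2(0.0480, n=88, k=3)     # log specification
print(round(f_levels, 2), round(f_logs, 2))   # 5.34 1.41
```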
V. Generalized Least Squares & Weighted Least Squares
While it’s always possible to estimate robust standard errors for OLS estimates, OLS is not the most efficient estimator.
If we can correctly specify the form of the heteroskedasticity, we can obtain more efficient estimates.
Basic idea: transform the model into one that has homoskedastic errors. This is called weighted least squares.
V. Weighted Least Squares: Heteroskedasticity is Known up to a Multiplicative Constant
Suppose the heteroskedasticity can be modeled as: Var(u|x) = σi² = σ²h(x)
This means the variance of the error is proportional to h(x), a known function of x.
E.g. with h(x) = inc: as income increases, variability in savings increases.
V. Weighted Least Squares: Heteroskedasticity is Known up to a Multiplicative Constant
Suppose Var(u|x) = σ²h(x) = σ²hi
Since hi is just a function of x: E(ui/√hi|x) = 0, because E(ui|x) = 0
Moreover, Var(u|x) = E(u²|x) – E(u|x)E(u|x) = E(u²|x)
Then Var(ui/√hi|x) = E((ui/√hi)²|x) = (1/hi)E(ui²|x) = (1/hi)Var(ui|x) = (1/hi)σ²hi = σ²
So, if we divide our whole equation by √hi, we have a model where the error is homoskedastic.
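V. Weighted Least Squares: Heteroskedasticity is Known up to a Multiplicative Constant
Example:
$$sav_i = \beta_0 + \beta_1\,inc_i + u_i$$
$$\mathrm{Var}(u_i \mid inc_i) = \sigma^2\,inc_i, \qquad h_i = inc_i$$
$$sav_i/\sqrt{inc_i} = \beta_0 \cdot 1/\sqrt{inc_i} + \beta_1\,inc_i \cdot 1/\sqrt{inc_i} + k_i, \qquad k_i = u_i/\sqrt{inc_i}$$
The transformed equation satisfies all G-M assumptions.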
V. Weighted Least Squares: Generalized Least Squares Estimator
Estimating the transformed equation by OLS is an example of generalized least squares (GLS), a class of estimators.
GLS will be BLUE in this case.
It provides more efficient estimates than OLS on the untransformed equation.
We can use its s.e. for t-statistics, p-values, and CIs, and the resulting R² is used for F-statistics.
However, R² is not a very good goodness-of-fit measure (it tells us how much of the variation in the transformed y is explained by the transformed x’s).
V. Weighted Least Squares
The GLS estimator for correcting heteroskedasticity is called the WLS estimator.
It minimizes the weighted sum of squared residuals, where each squared residual is weighted by 1/hi, the inverse of the error variance (up to the constant σ²).
Less weight is given to observations with a higher error variance; in contrast, OLS gives the same weight to all observations because it is best when the error variance is identical for all partitions of the population.
V. Weighted Least Squares
Minimization problem: min ∑ ûi²/hi
We can easily perform WLS using the “weight” option in Stata. It produces s.e. that we can use for t-statistics, etc.
Some regression packages even include an option to calculate robust s.e. after weighting, in case we specify the form of heteroskedasticity incorrectly.
Ex: Explain financial wealth in terms of income, age, gender, and 401(k) eligibility. We suspect heteroskedasticity, so we use WLS with weight 1/inc: Var(finan|inc) = σ²inc
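A minimal sketch of WLS done by hand, assuming h is known. The function name `wls`, the simulated savings-style data, and the coefficient values are made up for illustration. Dividing every variable (including the constant) by √h and running OLS is the same as minimizing the sum of squared residuals weighted by 1/h.

```python
import numpy as np

def wls(X, y, h):
    """WLS for Var(u|x) = sigma^2 * h: OLS on the equation divided by sqrt(h).

    X must include a column of ones; h is an (n,) array of positive values.
    """
    w = 1.0 / np.sqrt(h)
    Xs = X * w[:, None]                      # transformed regressors
    ys = y * w                               # transformed dependent variable
    beta_hat, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    resid = ys - Xs @ beta_hat
    n, p = X.shape
    sigma2_hat = np.sum(resid ** 2) / (n - p)
    se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(Xs.T @ Xs)))
    return beta_hat, se

# Simulated data (made up for illustration) with Var(u|inc) = inc
rng = np.random.default_rng(3)
n = 1000
inc = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), inc])
y = 0.5 + 0.1 * inc + np.sqrt(inc) * rng.normal(size=n)

beta_wls, se_wls = wls(X, y, h=inc)
print(beta_wls, se_wls)
```

Passing h equal to a vector of ones reproduces plain OLS, which is one way to see that OLS is the special case with equal weights.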
V. Weighted Least Squares
WLS is great if we know what Var(ui|xi) looks like, but in most cases we won’t know the form of the heteroskedasticity.
One example where we do know the form: the data are aggregated across some group or geographic region instead of being at the individual level.
Example: the relationship between the amount a worker contributes to a 401(k) and plan generosity.
V. Weighted Least Squares
What if we only have averages for a firm? mcontribi = β0 + β1mearnsi + β2magei + β3mmaratei + mui
If the individual regression satisfies all G-M assumptions, it can be shown that the aggregated regression has Var(mui|x) = σ²/si, where si is the number of employees at firm i.
The variance of the error term decreases with firm size.
The weight is then: 1/hi = si
A similar issue arises if we use per capita data at the city, county, state or country level: the error in the aggregate equation has variance proportional to 1/(size of the population in that area).
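The σ²/si result follows in one line. The notation u_{i,e} for the error of employee e at firm i is introduced here for illustration, and the step assumes those individual errors are uncorrelated within the firm with common variance σ²:

```latex
mu_i = \frac{1}{s_i}\sum_{e=1}^{s_i} u_{i,e}
\quad\Longrightarrow\quad
\mathrm{Var}(mu_i \mid x)
  = \frac{1}{s_i^{2}}\sum_{e=1}^{s_i}\mathrm{Var}(u_{i,e}\mid x)
  = \frac{s_i\,\sigma^{2}}{s_i^{2}}
  = \frac{\sigma^{2}}{s_i}
```

So hi = 1/si, and WLS weights each firm's observation by its number of employees.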
VI. Heteroskedasticity must be estimated: Feasible GLS
More typically, we don’t know the form of the heteroskedasticity.
In this case, you need to estimate h(xi). Since we are estimating h(xi) and using the estimate to transform the equation, we call it feasible GLS (FGLS).
Typically, we start with the assumption of a fairly flexible model, such as Var(u|x) = σ²exp(δ0 + δ1x1 + …+ δkxk), where h(xi) = exp(δ0 + δ1x1 + …+ δkxk).
Since we don’t know the δ’s, we must estimate them.
VI. Heteroskedasticity must be estimated: Feasible GLS
We can write the above as u² = σ²exp(δ0 + δ1x1 + …+ δkxk)v, where v is an error term; assume E(v|x) = 1, and hence E(v) = 1.
Taking natural logs of both sides: ln(u²) = α0 + δ1x1 + …+ δkxk + e, where α0 now contains the original intercept δ0 and log(σ²).
Assume E(e) = 0 and e is independent of x.
We don’t have u, so we replace it with its estimate, û. Now we can estimate this by OLS to get an estimate of h(xi).
VI. Heteroskedasticity must be estimated: Feasible GLS
Now, an estimate of h is obtained as ĥ = exp(ĝ), where ĝ are the fitted values from the regression of ln(û²) on the x’s.
Now we can use WLS with weights 1/ĥ.
Implementation:
  Run the original OLS model, save the residuals, û, square them and take the log
  Regress ln(û²) on all of the independent variables and get the fitted values, ĝ
  Do WLS using 1/exp(ĝ) as the weight
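The three implementation steps can be sketched as one function. The name `fgls` and the simulated data are assumptions for illustration; the variance model used to generate the data is the exponential form, but the coefficients are made up.

```python
import numpy as np

def fgls(X, y):
    """Feasible GLS with an exponential variance model.

    X must include a column of ones. Returns the FGLS coefficients
    and the estimated variance function h_hat.
    """
    # Step 1: OLS residuals, squared and logged
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    u_hat = y - X @ beta_ols
    log_u2 = np.log(u_hat ** 2)
    # Step 2: regress ln(u_hat^2) on the regressors, keep fitted values g_hat
    gamma_hat, *_ = np.linalg.lstsq(X, log_u2, rcond=None)
    g_hat = X @ gamma_hat
    h_hat = np.exp(g_hat)
    # Step 3: WLS with weights 1/h_hat, i.e. OLS after dividing by sqrt(h_hat)
    w = 1.0 / np.sqrt(h_hat)
    beta_fgls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return beta_fgls, h_hat

# Simulated data (made up for illustration) with Var(u|x) = exp(0.5 * x)
rng = np.random.default_rng(4)
n = 5000
x = rng.uniform(0.0, 4.0, n)
X = np.column_stack([np.ones(n), x])
u = np.exp(0.25 * x) * rng.normal(size=n)
y = 1.0 + 2.0 * x + u

beta_fgls, h_hat = fgls(X, y)
print(beta_fgls)
```

Because ĥ is an exponential of fitted values, the estimated variances are always positive, which is one practical reason for choosing this functional form.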
VI. Heteroskedasticity must be estimated: Feasible GLS
FGLS is not BLUE, because we use ĥ instead of h, but it is asymptotically more efficient than just running OLS.
As we saw with the White test, we could save valuable degrees of freedom by regressing ln(û²) on ŷ and ŷ² instead of on all of the independent variables.
When doing F tests with WLS, we must make sure to use the same weights in the restricted and unrestricted models: form the weights from the unrestricted model and use those weights to do WLS on the restricted model as well as the unrestricted model.
VI. Heteroskedasticity must be estimated: Feasible GLS
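Example: OLS (w/o het rob s.e.) and WLS
OLS:
$$\widehat{cigs} = \underset{(24.08)}{-3.64} + \underset{(.728)}{.880}\,\log(income) - \underset{(5.773)}{.751}\,\log(cigprice) - \underset{(.167)}{.501}\,educ + \underset{(.160)}{.771}\,age - \underset{(.0017)}{.0090}\,age^2 - \underset{(1.11)}{2.83}\,restaurn$$
n = 807, R² = .0526
Under the B-P test, R²_û² = .040 and we get an F-stat that rejects the null: we have evidence of heteroskedasticity.
Use the Feasible GLS procedure and get the second set of estimates.
WLS:
$$\widehat{cigs} = \underset{(17.80)}{5.64} + \underset{(.44)}{1.30}\,\log(income) - \underset{(4.46)}{.294}\,\log(cigprice) - \underset{(.120)}{.463}\,educ + \underset{(.097)}{.482}\,age - \underset{(.0009)}{.0056}\,age^2 - \underset{(.80)}{3.46}\,restaurn$$
n = 807, R² = .1134
Signs and story are similar, but the magnitudes are different.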