introduction to robust and clustered standard...

16
Introduction to Robust and Clustered Standard Errors Miguel Sarzosa Department of Economics University of Maryland Econ626: Empirical Microeconomics, 2012

Upload: dinhcong

Post on 05-Feb-2018

242 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Introduction to Robust and Clustered Standard Errors

Miguel Sarzosa

Department of Economics

University of Maryland

Econ626: Empirical Microeconomics, 2012

Page 2: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

1 Standard Errors, why should you worry about them

2 Obtaining the Correct SE

3 Consequences

4 Now we go to Stata!

Page 3: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Regressions and what we estimateA regression does not calculate the value of a relation between twovariables. In fact, it calculates a distribution of values of suchrelation. That is why, when you calculate a regression the two mostimportant outputs you get are:

I The conditional mean of the coe�cientI The standard deviation of the distribution of that coe�cient

Page 4: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Are we in the presence of a star?

The standard errors determine how accurate is your estimation.Therefore, it a�ects the hypothesis testing.That is why the standard errors are so important: they are crucial indetermining how many stars your table gets.And like in any business, in economics, the stars matter a lot.Hence, obtaining the correct SE, is critical

Page 5: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

SE and the data

The correct SE estimation procedure is given by the underlying structureof the data.

It is very unlikely that all observations in a data set are unrelated, butdrawn from identical distributionsFor instance, the variance of income is often grater in familiesbelonging to top deciles than among poorer familiesSome phenomena do not a�ect observations individually, but theya�ect groups of observations uniformly within each group.

Page 6: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

SE and the data

The correct SE estimation procedure is given by the underlying structureof the data.

It is very unlikely that all observations in a data set are unrelated, butdrawn from identical distributionsFor instance, the variance of income is often grater in familiesbelonging to top deciles than among poorer families(heteroskedasticity)

Some phenomena do not a�ect observations individually, but theya�ect groups of observations uniformly within each group. (clustered

data)

Page 7: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

The Basic Case (i.i.d.)

y

i

= x

0i

b + ei

ei

⇠�0,s2

This means that E

hee 0 |X

i= ⌦= s2

I

We know b =hX

0X

i�1

X

0y = b +

hX

0X

i�1

X

0e and

Var (b ) = E

hX

0X

i�1

X

0ee 0X

hX

0X

i�1

�=

hX

0X

i�1

E

hX

0ee 0X

ihX

0X

i�1

= s2

hX

0X

i�1

Page 8: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Heteroskedasticity (i.n.i.d)

y

i

= x

0i

b + ei

ei

⇠�0,s2

i

This means that E

hee 0 |X

i= ⌦= diag

�s2

i

�(big di�erence)

Page 9: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Heteroskedasticity (i.n.i.d)

Now

Var (b )=E

hX

0X

i�1

X

0ee 0X

hX

0X

i�1

�=hX

0X

i�1

E

hX

0ee 0X

ihX

0X

i�1

No further simplification is possibleNeed to estimate E

hX

0ee 0X

i= ÂN

i=1

e2

i

x

i

x

0i

Be aware that ÂN

i=1

e2

i

x

i

x

0i

6= X

0^e^eX

Then the Huber-Eicker-White (HEW) VC estimator is:

Var

⇣b⌘=hX

0X

i�1

"N

Âi=1

e2

i

x

i

x

0i

#hX

0X

i�1

(1)

Two-step procedure: 1. Get ei

, 2. Estimate (1). In Stata:vce(robust)

Page 10: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Clustered Data

Observations are related with each other within certain groups

ExampleYou want to regress kids’ grades on class size.The unobservables of kids belonging to the same classroom will becorrelated (e.g., teachers’ quality, recess routines) while will not becorrelated with kids in far away classrooms.

This means, we are assuming independence across clusters butcorrelation within clusters. That is:

y

ig

= x

0ig

b + eig

where g = 1, . . . ,G

E

he

ig

ejg

0

i=

(0 if g = g

0

s(ij)g if g 6= g

0 (2)

Page 11: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Clustered Data

Let us stack observations by cluster

y

g

= x

0g

b + eg

where g = 1, . . . ,G

The OLS estimator of b is still b =hX

0X

i�1

X

0y. (We just stacked

the data)The variance is given by

Var (b ) = E

hX

0X

i�1

X

0⌦X

hX

0X

i�1

Page 12: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Clustered Data

By (2), we know that

⌦= diag (⌃g

)

=

2

666666666666666666666664

s2

(11)1 . . . s(1N

1

)1 0 . . . 0 . . . 0 . . . 0

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

s2

(N1

1)1 s(N1

N

1

)1 0 0 0 0

0 . . . 0 s2

(11)2 . . . s(1N

2

)2

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

0 . . . 0 s(N2

1)2 . . . s2

(N2

N

2

)2

.

.

.

.

.

.

.

.

.

s2

(11)g . . . s(1N

g

)g

.

.

.

.

.

.

.

.

.

s(Ng

1)g . . . s2

(Ng

N

g

)g

3

777777777777777777777775

Page 13: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Clustered Data

With this in mind, we can now write the VC matrix for clustered data

Var

⇣b⌘=

hX

0X

i�1

"G

Âi=1

x

0g

eg

e 0g

x

g

#hX

0X

i�1

In Stata: vce(cluster clustvar ). Where clustvar is a variablethat identifies the groups in which onobservables are allowed tocorrelate.

Page 14: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

The importance of knowing your data

In real world you should never go with the i.i.d. case. Life is not thatsimpleYou need to know your data in order to choose the correct errorstructure and then infer the required SE calculationIf you have aggregate variables (like class size), clustering at that levelis required.You might think your data correlates in more than one way

I If nested (e.g., classroom and school district), you should cluster at thehighest level of aggregation

I If not nested (e.g., time and space), you can:1

Include fixed-e�ects in one dimension and cluster in the other one.

2

Multi-way clustering extension (see Cameron, Gelbach and Miller, 2006)

Page 15: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

How much will my SE increase?

Clustered SE will increase your confidence intervals because you areallowing for correlation between observations. Hence, less stars in yourtables. The higher the clustering level, the larger the resulting SE.For one regressor the clustered SE inflate the default (i.i.d.) SE by

q1+r

x

re�N �1

were rx

is the within-cluster correlation of the regressor, re is thewithin-cluster error correlation and N is the average cluster size.Here it is easy to see the importance of clustering when you haveaggregate regressors (i.e., r

x

= 1).

Page 16: Introduction to Robust and Clustered Standard Errorseconweb.umd.edu/~sarzosa/teach/2/Disc2_Cluster_handout.pdf · Introduction to Robust and Clustered Standard Errors Miguel Sarzosa

Now we go to Stata!