Research Method Lecture 11-1 (Ch15): Instrumental Variables Estimation and Two Stage Least Squares


Upload: juniper-tyler

Post on 21-Dec-2015



Motivation: One explanatory variable case

Consider the following regression.

$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + \underbrace{\beta_2\,\text{abil} + e}_{u}$$

Since ability is not observed, we can only run the following regression.

$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + u$$

Since ability is correlated with educ, educ is endogenous (i.e., correlated with u). Thus, the OLS estimator $\hat{\beta}_1$ will be biased.
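The direction of this bias can be made explicit. As a sketch, assuming the error e in the full model is uncorrelated with educ (an assumption added here, not stated on the slide), the probability limit of the OLS slope from the short regression is

$$\text{plim}\,\hat{\beta}_1^{OLS} = \beta_1 + \frac{Cov(\text{educ},u)}{Var(\text{educ})} = \beta_1 + \beta_2\,\frac{Cov(\text{educ},\text{abil})}{Var(\text{educ})}$$

So if ability raises wages (β2 > 0) and is positively correlated with educ, OLS tends to overstate the return to education.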

We learned two methods to eliminate the bias.

(1) Plug in a proxy variable for ability, such as IQ.

(2) Use a panel data method (either the fixed effects or the first-differenced model).

The instrumental variable method is another way to eliminate the bias.

Instrumental variable method: One explanatory variable case

Consider the following model.

$$y = \beta_0 + \beta_1 x + u$$

Suppose that x is endogenous, that is, Cov(x,u)≠0.

Further, suppose that you have another variable, z, which satisfies the following conditions.

Cov(z,u)=0 (instrument exogeneity) ...... (1)

Cov(z,x)≠0 (instrument relevance) ...... (2)

If the above conditions are satisfied, we call z an instrumental variable.

There are two ways to intuitively understand these conditions.

1. An instrumental variable is a variable that is not correlated with the omitted variable, but is correlated with the endogenous explanatory variable.

2. An instrumental variable is a variable that affects y only through x.

The condition Cov(z,u)=0 involves unobserved u. Therefore, we cannot test this condition. (When you have extra instrumental variables, you can test this. This will be discussed later).

The condition Cov(z,x)≠0 is easy to test. Just run the following OLS regression,

x = π0 + π1z + v

then test H0: π1 = 0.
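The relevance check above can be sketched in Python. This is a minimal illustration on simulated data, not the lecture's own code: the function name `first_stage`, the sample size, and the true value π1 = 0.5 are all made up for the example, and the standard error is the usual homoskedastic OLS one.

```python
import numpy as np

def first_stage(x, z):
    """OLS of x on a constant and z. Returns (pi1_hat, se, t_stat)."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    n = x.size
    zc = z - z.mean()
    pi1 = np.sum(zc * (x - x.mean())) / np.sum(zc ** 2)
    pi0 = x.mean() - pi1 * z.mean()
    resid = x - pi0 - pi1 * z
    s2 = np.sum(resid ** 2) / (n - 2)      # usual OLS error variance estimate
    se = np.sqrt(s2 / np.sum(zc ** 2))     # homoskedastic standard error of pi1
    return pi1, se, pi1 / se

# Simulated data (hypothetical): z is a relevant instrument with pi1 = 0.5
rng = np.random.default_rng(0)
z = rng.normal(size=500)
x = 1.0 + 0.5 * z + rng.normal(size=500)

pi1_hat, se_pi1, t_pi1 = first_stage(x, z)
print(f"pi1_hat = {pi1_hat:.3f}, se = {se_pi1:.3f}, t = {t_pi1:.2f}")
```

A large t statistic in absolute value leads to rejecting H0: π1 = 0, which supports instrument relevance.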

Instrumental variable estimation: One explanatory variable, one instrument case

Now, consider

$$y = \beta_0 + \beta_1 x + u$$

Then we have Cov(z,y) = Cov(z, β0+β1x+u).

So we have

Cov(z,y) = β1Cov(z,x) + Cov(z,u)

Since Cov(z,u)=0, we have

$$\beta_1 = \frac{Cov(z,y)}{Cov(z,x)} \tag{3}$$

By replacing Cov(z,y) and Cov(z,x) with their sample covariances, we have the instrumental variable estimator of β1, which is given by

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(z_i-\bar{z})(y_i-\bar{y})}{\sum_{i=1}^{n}(z_i-\bar{z})(x_i-\bar{x})} \tag{4}$$

You can easily show that $\hat{\beta}_1$ is a consistent estimator of β1.
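To see the estimator in (4) at work, here is a small simulation sketch. The data-generating process (coefficients, sample size, the variable `abil` playing the role of the omitted variable) is entirely hypothetical and chosen only so that x is endogenous while z satisfies conditions (1) and (2).

```python
import numpy as np

def iv_slope(y, x, z):
    """IV estimator of the slope: ratio of sample covariances, as in (4)."""
    zc = z - z.mean()
    return np.sum(zc * (y - y.mean())) / np.sum(zc * (x - x.mean()))

def ols_slope(y, x):
    """OLS slope, for comparison."""
    xc = x - x.mean()
    return np.sum(xc * (y - y.mean())) / np.sum(xc ** 2)

rng = np.random.default_rng(1)
n = 20000
abil = rng.normal(size=n)                       # unobserved, ends up in u
z = rng.normal(size=n)                          # instrument, independent of abil
x = 0.8 * z + 0.6 * abil + rng.normal(size=n)   # endogenous regressor
u = 0.5 * abil + rng.normal(size=n)             # error term, correlated with x
y = 1.0 + 0.1 * x + u                           # true beta1 = 0.1

b_ols = ols_slope(y, x)
b_iv = iv_slope(y, x, z)
print(f"OLS slope: {b_ols:.3f}   IV slope: {b_iv:.3f}   true value: 0.1")
```

In this setup the OLS slope is pulled away from 0.1 because Cov(x,u) is positive, while the IV slope stays close to it.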

Statistical inference with IV: Homoskedasticity case

The homoskedasticity assumption in the case of IV regression is stated in terms of z.

E(u²|z) = σ²

It can be shown that the asymptotic variance of $\hat{\beta}_1$ is given by:

$$var(\hat{\beta}_1) = \frac{\sigma^2}{n\,\sigma_x^2\,\rho_{x,z}^2} \tag{5}$$

where $\sigma_x^2$ is the variance of x, and $\rho_{x,z}^2$ is the square of the correlation between x and z.
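It may help to compare this with OLS. As a sketch, under the analogous homoskedasticity assumption stated in terms of x, the asymptotic variance of the OLS slope is σ²/(nσx²), so the IV variance differs from it only by the factor 1/ρ² (the superscript IV is notation added here for the comparison):

$$var(\hat{\beta}_1^{IV}) = \frac{\sigma^2}{n\,\sigma_x^2}\cdot\frac{1}{\rho_{x,z}^2} \;\ge\; \frac{\sigma^2}{n\,\sigma_x^2}$$

since $\rho_{x,z}^2 \le 1$. For example, if $\rho_{x,z}^2 = 0.25$, the asymptotic variance is four times larger and the standard error twice as large. A weakly correlated instrument therefore gives imprecise IV estimates.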

Now, the estimator of var($\hat{\beta}_1$) is obtained by replacing σ², $\sigma_x^2$, and $\rho_{x,z}^2$ with their sample estimates.

The sample estimator of σ² is obtained in the following way. First, obtain the IV estimates for β0 and β1, then compute

$$\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \tag{6}$$

The estimator for σ² is then computed as

$$\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n}\hat{u}_i^2 \tag{7}$$

The sample estimator for $\sigma_x^2$ is given as:

$$\hat{\sigma}_x^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 = \frac{SST_x}{n} \tag{8}$$

Finally, the sample estimator for $\rho_{x,z}^2$ can be most easily obtained in the following way. First, regress x on z. Then the R-squared from this regression equals the square of the sample correlation. Let us call this $R_{x,z}^2$. (Of course, you can compute the sample correlation and square it. You will get the same result.)

Then, the estimator for the variance of $\hat{\beta}_1$ is given by:

$$\widehat{var}(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{SST_x\,R_{x,z}^2} \tag{9}$$

You can show that this is a consistent estimator of the asymptotic variance given by (5).
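Steps (6) to (9) can be collected into one short function. The following is a sketch on simulated data: the function name `iv_with_se` and the data-generating process are hypothetical, and the intercept is computed as the sample mean of y minus the slope times the sample mean of x, which is the standard IV intercept but is not written out on the slides.

```python
import numpy as np

def iv_with_se(y, x, z):
    """IV estimates and the homoskedastic standard error of the slope."""
    y, x, z = (np.asarray(a, dtype=float) for a in (y, x, z))
    n = y.size
    zc, xc, yc = z - z.mean(), x - x.mean(), y - y.mean()
    b1 = np.sum(zc * yc) / np.sum(zc * xc)          # equation (4)
    b0 = y.mean() - b1 * x.mean()                   # IV intercept (see lead-in)
    u_hat = y - b0 - b1 * x                         # equation (6)
    sigma2_hat = np.sum(u_hat ** 2) / (n - 2)       # equation (7)
    sst_x = np.sum(xc ** 2)                         # n times equation (8)
    r2_xz = np.corrcoef(x, z)[0, 1] ** 2            # R-squared of x on z
    se_b1 = np.sqrt(sigma2_hat / (sst_x * r2_xz))   # square root of equation (9)
    return b0, b1, se_b1

# Simulated data (hypothetical): 'common' makes x endogenous, z is the instrument
rng = np.random.default_rng(4)
n = 2000
z = rng.normal(size=n)
common = rng.normal(size=n)
x = 0.7 * z + common + rng.normal(size=n)
y = 2.0 + 0.5 * x + common + rng.normal(size=n)

b0_hat, b1_hat, se_b1 = iv_with_se(y, x, z)
print(f"b1 = {b1_hat:.3f}, se = {se_b1:.3f}")
```

Using the squared sample correlation for $R_{x,z}^2$ relies on the fact stated above that it equals the R-squared from regressing x on z.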

Note: R-squared in IV regression

The R-squared for IV regression is computed as

R² = 1 − SSR/SST

where SSR is the sum of the squared IV residuals. (The IV residual is given by (6).)

Unlike in the case of OLS, SSR can be greater than SST. Thus, R² can be negative. In IV regression, R² does not have a natural interpretation.
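A negative R² is easy to produce in a simulation. In the sketch below (all numbers hypothetical), the error u is strongly negatively correlated with x, so the variation in y is small even though the IV residuals, which are close to u, are large.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
z = rng.normal(size=n)
v = rng.normal(size=n)
x = 0.3 * z + v                        # x depends on z and on v
u = -v + 0.5 * rng.normal(size=n)      # u is negatively correlated with x
y = 1.0 + 1.0 * x + u                  # true beta1 = 1

# IV estimates with z as the instrument
zc = z - z.mean()
b1 = np.sum(zc * (y - y.mean())) / np.sum(zc * (x - x.mean()))
b0 = y.mean() - b1 * x.mean()

ssr = np.sum((y - b0 - b1 * x) ** 2)   # sum of squared IV residuals
sst = np.sum((y - y.mean()) ** 2)
r2_iv = 1.0 - ssr / sst
print(f"IV slope = {b1:.3f}, SSR = {ssr:.1f}, SST = {sst:.1f}, R2 = {r2_iv:.3f}")
```

Here SSR exceeds SST, so R² is below zero even though the IV slope is estimated reasonably well.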

Finding the instrumental variable

The most difficult part of instrumental variable estimation is finding suitable instrumental variables.

Consider the following regression

$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + \underbrace{\beta_2\,\text{abil} + e}_{u}$$

Then, you have to find a z that is correlated with educ, but not correlated with abil. What can z be?

Consider the father’s education. Perhaps a person whose father is highly educated tends to obtain more education as well. So the father’s education is likely correlated with educ.

But for father’s education to be an instrument, it should not be correlated with unobserved ability. A highly educated father may nurture his child better, so father’s education may be correlated with unobserved ability. If this is the case, father’s education is not a good instrument.

Nonetheless, many studies have used father’s and mother’s education as instruments.

Exercises

1. Run the following regression by OLS, using MROZ.dat.

$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + u$$

2. Using the father’s education as an instrument for educ, estimate the same model using IV regression. Also check if father’s education is correlated with educ.

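OLS

    . reg lwage educ

          Source |       SS       df       MS              Number of obs =     428
    -------------+------------------------------           F(  1,   426) =   56.93
           Model |  26.3264193     1  26.3264193           Prob > F      =  0.0000
        Residual |  197.001022   426  .462443713           R-squared     =  0.1179
    -------------+------------------------------           Adj R-squared =  0.1158
           Total |  223.327441   427  .523015084           Root MSE      =  .68003

    ------------------------------------------------------------------------------
           lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            educ |   .1086487   .0143998     7.55   0.000     .0803451    .1369523
           _cons |  -.1851968   .1852259    -1.00   0.318    -.5492673    .1788736
    ------------------------------------------------------------------------------

IV regression

    . ivregress 2sls lwage (educ= fatheduc)

    Instrumental variables (2SLS) regression               Number of obs =     428
                                                           Wald chi2(1)  =    2.85
                                                           Prob > chi2   =  0.0914
                                                           R-squared     =  0.0934
                                                           Root MSE      =  .68778

    ------------------------------------------------------------------------------
           lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            educ |   .0591735   .0350596     1.69   0.091     -.009542     .127889
           _cons |   .4411034   .4450583     0.99   0.322    -.4311947    1.313402
    ------------------------------------------------------------------------------
    Instrumented:  educ
    Instruments:   fatheduc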

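Check if father’s education is correlated with educ.

                                                           Number of obs =     428
                                                           F(  1,   426) =   87.12
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.1726
                                                           Adj R-squared =  0.1706
                                                           Root MSE      =  2.0813

    ------------------------------------------------------------------------------
                 |               Robust
            educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        fatheduc |   .2694416   .0288675     9.33   0.000     .2127013     .326182
           _cons |   10.23705   .2718861    37.65   0.000     9.702646    10.77146
    ------------------------------------------------------------------------------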
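As a rough consistency check, the IV standard error of educ can be recovered from the reported output using formula (9). This sketch rests on two assumptions that the output itself does not state: that the Root MSE of the regression of educ on fatheduc equals the square root of SSR/(n−2), and that the Root MSE reported by ivregress can be used as the estimate of σ (its degrees-of-freedom divisor may differ slightly from (7)).

```python
import math

n = 428
root_mse_iv = 0.68778    # Root MSE from the ivregress output, used as sigma-hat
root_mse_fs = 2.0813     # Root MSE from the regression of educ on fatheduc
r2_fs = 0.1726           # R-squared from the regression of educ on fatheduc

# Back out SST of educ from the first-stage regression (assumes df = n - 2)
ssr_fs = root_mse_fs ** 2 * (n - 2)
sst_educ = ssr_fs / (1.0 - r2_fs)

# Formula (9): var-hat = sigma-hat^2 / (SST_x * R^2_{x,z})
se_iv = root_mse_iv / math.sqrt(sst_educ * r2_fs)
print(f"implied IV standard error of educ: {se_iv:.4f}")
```

The implied value is close to the reported standard error of .0350596, with the small gap attributable to rounding in the displayed figures.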

An application

Angrist and Krueger (1991), “Does Compulsory School Attendance Affect Schooling and Earnings?”

They used a quarter-of-birth dummy as an instrument for education to estimate the effect of education on wages.

In the US, the compulsory schooling law requires students to remain in school until their 16th birthday.

At the same time, schools usually require children to be 6 years old on January 1st to be admitted. Therefore, children who were born in the first quarter were older than children who were born in the last quarter when they were first admitted to school (6.45 vs. 6.07 years).

This also means that children who were born in the first quarter of the year have had less schooling when they reach the legal dropout age. So, children who were born in the first quarter can legally drop out of school with less education than children who were born in other quarters.

If some people want to take as little education as possible but are constrained by the compulsory schooling law, the quarter of birth should affect educational attainment.

At the same time, the quarter of birth is unlikely to be correlated with the unobserved ability.

Therefore, the dummy variable indicating if a person was born in the first quarter of the year is a good instrument for education.
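With a dummy instrument, the IV estimator in (4) reduces to a ratio of differences in group means (often called the Wald estimator): the difference in mean log wage between the two groups divided by the difference in mean education. The sketch below checks this equivalence on simulated data. Every number in it is hypothetical, including the share of first-quarter births and the size of the education gap; it does not use the Angrist and Krueger data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50000
q1 = (rng.random(n) < 0.25).astype(float)   # hypothetical first-quarter dummy
abil = rng.normal(size=n)                   # unobserved ability, independent of q1
educ = 12.0 - 0.5 * q1 + abil + rng.normal(size=n)
lwage = 0.5 + 0.08 * educ + 0.3 * abil + rng.normal(scale=0.5, size=n)

# IV slope from the ratio of sample covariances, as in (4)
zc = q1 - q1.mean()
b_iv = np.sum(zc * (lwage - lwage.mean())) / np.sum(zc * (educ - educ.mean()))

# Same estimate written as a ratio of differences in group means
diff_lwage = lwage[q1 == 1].mean() - lwage[q1 == 0].mean()
diff_educ = educ[q1 == 1].mean() - educ[q1 == 0].mean()
b_wald = diff_lwage / diff_educ

print(f"covariance-ratio IV: {b_iv:.4f}   group-means form: {b_wald:.4f}")
```

The two numbers agree up to floating-point error, because with a 0/1 instrument both covariances are proportional to the corresponding difference in group means.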


Those born in the first quarter of the year tend to have lower educational attainment.

This is the IV regression