iv/2sls models
Post on 31-Dec-2015
1
IV/2SLS models
2
$z_i = 0$: $\bar{x}_0 = 0.80$
$z_i = 1$: $\bar{x}_1 = 0.57$
3
$\bar{y}_0 = 3186$, $\bar{y}_1 = 3278$
$$\hat{\beta}_1 = \frac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0} = \frac{3278 - 3186}{0.57 - 0.80} = -400$$
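The ratio on this slide is the Wald estimator for a binary instrument: the difference in mean outcomes across the two instrument groups divided by the difference in mean treatment. A minimal Python sketch follows, using the four group means from the slides; the function name `wald_estimate` is only illustrative and is not part of the original deck.

```python
def wald_estimate(y_bar_1, y_bar_0, x_bar_1, x_bar_0):
    """Wald/IV estimate for a binary instrument z.

    Numerator: difference in mean y between z = 1 and z = 0.
    Denominator: difference in mean x between z = 1 and z = 0.
    """
    return (y_bar_1 - y_bar_0) / (x_bar_1 - x_bar_0)


# Group means taken from the slides
beta_hat = wald_estimate(y_bar_1=3278, y_bar_0=3186, x_bar_1=0.57, x_bar_0=0.80)
print(round(beta_hat, 1))
```

The result is about -400, up to floating-point rounding.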
4
Vietnam era service
• Defined as 1964-1975
• Estimated 8.7 million served during era
• 3.4 million were in SE Asia
• 2.6 million served in Vietnam
• 1.6 million saw combat
• 203K wounded in action, 153K hospitalized
• 58,000 deaths
• http://www.history.navy.mil/library/online/american%20war%20casualty.htm#t7
5
Vietnam Era Draft
• In the 1st part of the war, the draft operated as it had in WWII and the Korean War
• At age 18 men report to local draft boards
• Could receive deferment for variety of reasons (kids, attending school)
• If available for service, pre-induction physical and tests
• Military needs determined those drafted
6
• Everyone drafted went to the Army
• Local draft boards filled the Army's ranks
• Priorities
– Delinquents, volunteers, non-vol. 19-25
– For non-vol., determined by age
• College enrollment a powerful way to avoid service
– Men w. college degree 1/3 less likely to serve
7
Draft Lottery
• Proposed by Nixon
• Passed in Nov 1969, 1st lottery Dec 1, 1969
• 1st lottery for men age 19-26 on 1/1/70
– Men born 1944-1950.
• Randomly assigned number 1-365, Draft Lottery number (DLN)
• Military estimates needs, sets threshold T
• If DLN<=T, drafted
8
Questions?
• What are the research questions?
• Why can we NOT obtain estimates from observational data?
9
• If volunteer, could get better assignment
• Thresholds for service

Draft year   Year of birth   Threshold
1970         1946-50         195
1971         1951            125
1972         1952            95
• Draft suspended in 1973
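The lottery rule (drafted if DLN <= T) gives the instrument used in this literature: a 0/1 draft-eligibility indicator. Below is a minimal Python sketch using the three thresholds listed on the slide; the names `THRESHOLDS` and `draft_eligible` are hypothetical helpers introduced only for illustration.

```python
# Draft year -> lottery threshold T, as listed on the slide
THRESHOLDS = {1970: 195, 1971: 125, 1972: 95}


def draft_eligible(dln, draft_year):
    """Return True if a lottery number falls at or below that year's threshold."""
    if draft_year not in THRESHOLDS:
        raise ValueError(f"no lottery threshold recorded for {draft_year}")
    if not 1 <= dln <= 366:
        raise ValueError("draft lottery numbers run from 1 to 365 (366 with Feb 29)")
    return dln <= THRESHOLDS[draft_year]


# Same lottery number, different draft years
for year in sorted(THRESHOLDS):
    print(year, draft_eligible(120, year))
```

Because the number is assigned at random, eligibility should be unrelated to anything else that affects later outcomes, which is the property an instrument needs.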
10
11
12
13
14
Angrist/Evans
15
[Figure: Female Labor Force Participation Rate. x-axis: Year (1948-2002); y-axis: Percent in labor force (0-70)]
16
17
18
19
20
21
22
23
. * get correlation coefficient between;
. * instrument and endogenous RHS variable;
. corr morekids samesex;
(obs=254654)

             | morekids  samesex
-------------+------------------
    morekids |   1.0000
     samesex |   0.0695   1.0000
Correlation coefficient
24
Ratio of variances = (0.0020246/0.0291242)^2 = 0.004832484
25
R2 = 290.247937/60030.836855 = 0.004835
βiv = -0.0092924/0.0675253 = -0.137614
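These slides make one point three ways: with a single instrument, the squared ratio of the two standard errors, the squared instrument/regressor correlation, and the first-stage R2 are all roughly the same number. The Python sketch below recomputes them from the figures printed on the slides. Reading 0.0020246 as the OLS standard error and 0.0291242 as the IV standard error is an assumption, since the slide does not label them.

```python
# Numbers copied from the slides
corr_zx = 0.0695                      # corr(morekids, samesex)
se_small, se_large = 0.0020246, 0.0291242
ss_model, ss_total = 290.247937, 60030.836855

ratio_of_variances = (se_small / se_large) ** 2
corr_squared = corr_zx ** 2
first_stage_r2 = ss_model / ss_total

# Wald estimate: reduced-form coefficient over first-stage coefficient
beta_iv = -0.0092924 / 0.0675253

print(f"ratio of variances = {ratio_of_variances:.6f}")
print(f"corr^2             = {corr_squared:.6f}")
print(f"first-stage R2     = {first_stage_r2:.6f}")
print(f"beta_iv            = {beta_iv:.6f}")
```

All three strength measures come out near 0.0048. A weak correlation between instrument and regressor therefore implies an IV variance roughly 1/0.0048, or about 200 times, the OLS variance.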
26
Reduced form, just identified model
27
First stage, just identified model
28
2SLS, just identified model
βiv = -0.0083481/0.0693854 = -0.120315
29
1st stage over identified model
ivreg2
• Download from the web
• Within Stata, type
ssc install ivreg2, replace
• and hit return
• Does all the tests seamlessly
30
31
* the syntax is ivreg2 y w (x=z), first endog(x);
* the first option asks stata to report the 1st stage, and;
* endog(x) asks stata to do the hausman-wu test of endogeneity;
ivreg2 workedm boy1st boy2nd agem1 agefstm black hispan othrace (morekids=samesex), first endog(morekids);
Endogenous variable and instruments
Ask for 1st stage
Test for endogeneity of morekids in model
Outcome of interest
W's (exogenous covariates)
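To make the `y w (x=z)` logic concrete, here is a minimal Python sketch of 2SLS done by hand on simulated data, with one endogenous regressor, one binary instrument and no covariates. It illustrates the mechanics only and does not replicate `ivreg2`. The data-generating process, the variable names and `true_beta = -0.12` are all assumptions chosen for the example.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols_slope(y, x):
    """Slope from a bivariate OLS regression of y on x."""
    return cov(x, y) / cov(x, x)


random.seed(0)
n = 50_000
true_beta = -0.12

z = [random.randint(0, 1) for _ in range(n)]        # instrument
u = [random.gauss(0, 1) for _ in range(n)]          # unobserved confounder
x = [0.5 * zi + 0.8 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_beta * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

# OLS is biased because x and the error share the confounder u
beta_ols = ols_slope(y, x)

# Stage 1: regress x on z and form fitted values
pi_hat = ols_slope(x, z)
x_hat = [mean(x) + pi_hat * (zi - mean(z)) for zi in z]

# Stage 2: regress y on the fitted values
beta_2sls = ols_slope(y, x_hat)

# Just-identified case: 2SLS equals reduced form / first stage
beta_wald = ols_slope(y, z) / pi_hat

print(f"OLS  = {beta_ols:.3f}")
print(f"2SLS = {beta_2sls:.3f}")
print(f"Wald = {beta_wald:.3f}")
```

Running stage 2 by hand gives the right coefficient but the wrong standard errors, which is one reason to use a dedicated command in practice.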
32
IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =   254654
                                                      F(  8,254645) =   865.24
                                                      Prob > F      =   0.0000
Total (centered) SS     =  63460.72056                Centered R2   =   0.0482
Total (uncentered) SS   =       134513                Uncentered R2 =   0.5510
Residual SS             =  60402.67924                Root MSE      =     .487

------------------------------------------------------------------------------
     workedm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    morekids |  -.1203151   .0278407    -4.32   0.000    -.1748818   -.0657483
      boy1st |   .0009211   .0019489     0.47   0.636    -.0028986    .0047409
      boy2nd |  -.0048314   .0019425    -2.49   0.013    -.0086386   -.0010241
       agem1 |   .0219352   .0009013    24.34   0.000     .0201687    .0237018
     agefstm |  -.0264911   .0012647   -20.95   0.000    -.0289698   -.0240124
       black |   .1899764   .0047674    39.85   0.000     .1806325    .1993203
      hispan |  -.0139081   .0053812    -2.58   0.010    -.0244551   -.0033611
     othrace |   .0443545   .0048137     9.21   0.000     .0349198    .0537891
       _cons |   .4498966   .0138562    32.47   0.000     .4227389    .4770543
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):       1405.578
                                                   Chi-sq(1) P-val =   0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):            1413.330
Stock-Yogo weak ID test critical values: 10% maximal IV size            16.38
                                         15% maximal IV size             8.96
                                         20% maximal IV size             6.66
                                         25% maximal IV size             5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):          0.000
33
OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =   254654
                                                      F(  8,254645) =  2825.70
                                                      Prob > F      =   0.0000
Total (centered) SS     =  60030.83676                Centered R2   =   0.0815
Total (uncentered) SS   =        96912                Uncentered R2 =   0.4311
Residual SS             =   55136.2215                Root MSE      =    .4653

------------------------------------------------------------------------------
    morekids |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      boy1st |  -.0015753   .0026228    -0.60   0.548    -.0067158    .0035653
       agem1 |   .0304246    .000298   102.09   0.000     .0298405    .0310087
     agefstm |  -.0435676   .0003462  -125.85   0.000    -.0442461   -.0428891
       black |   .0679715   .0041853    16.24   0.000     .0597684    .0761747
      hispan |    .125998   .0038974    32.33   0.000     .1183591    .1336369
     othrace |   .0479479   .0044209    10.85   0.000      .039283    .0566127
     twoboys |   .0598382   .0025731    23.26   0.000     .0547951    .0648813
    twogirls |   .0789326   .0026467    29.82   0.000     .0737452      .08412
       _cons |   .3138696   .0092684    33.86   0.000     .2957038    .3320353
------------------------------------------------------------------------------
Included instruments: boy1st agem1 agefstm black hispan othrace twoboys twogirls
------------------------------------------------------------------------------
F test of excluded instruments:
  F(  2,254645) =   715.13
  Prob > F      =   0.0000
Angrist-Pischke multivariate F test of excluded instruments:
  F(  2,254645) =   715.13
  Prob > F      =   0.0000
1st stage F
34
Summary results for first-stage regressions
-------------------------------------------

                                        (Underid)              (Weak id)
Variable     | F(  2,254645)  P-val | AP Chi-sq(  2)  P-val | AP F(  2,254645)
morekids     |      715.13   0.0000 |      1430.31   0.0000 |       715.13
35
IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =   254654
                                                      F(  7,254646) =   987.26
                                                      Prob > F      =   0.0000
Total (centered) SS     =  63460.72056                Centered R2   =   0.0475
Total (uncentered) SS   =       134513                Uncentered R2 =   0.5506
Residual SS             =  60445.97117                Root MSE      =    .4872

------------------------------------------------------------------------------
     workedm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    morekids |  -.1127816   .0276854    -4.07   0.000     -.167044   -.0585193
      boy1st |   .0009424   .0019496     0.48   0.629    -.0028786    .0047635
       agem1 |   .0217057   .0008969    24.20   0.000     .0199478    .0234635
     agefstm |  -.0261649   .0012583   -20.79   0.000    -.0286312   -.0236987
       black |   .1895035   .0047653    39.77   0.000     .1801637    .1988433
      hispan |   -.014818   .0053707    -2.76   0.006    -.0253444   -.0042916
     othrace |   .0439784    .004813     9.14   0.000      .034545    .0534118
       _cons |   .4448388   .0137111    32.44   0.000     .4179656    .4717121
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):       1422.320
                                                   Chi-sq(2) P-val =   0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):             715.129
Stock-Yogo weak ID test critical values: 10% maximal IV size            19.93
                                         15% maximal IV size            11.59
                                         20% maximal IV size             8.75
                                         25% maximal IV size             7.25
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):          6.182
                                                   Chi-sq(1) P-val =   0.0129
-endog- option:
Endogeneity test of endogenous regressors:                              3.809
                                                   Chi-sq(1) P-val =   0.0510
Regressors tested:    morekids
------------------------------------------------------------------------------
Instrumented:         morekids
Included instruments: boy1st agem1 agefstm black hispan othrace
Excluded instruments: twoboys twogirls
------------------------------------------------------------------------------
Test of over id.
Hausman endo test
36
. * output residuals and do the tests of overid;
. * and hausman test by brute force;
. predict res_2sls_worked, res;
. * test of overid;
. reg res_2sls_worked twoboys twogirls boy1st agem1 agefstm black hispan othrace;

      Source |       SS         df       MS            Number of obs =  254654
-------------+--------------------------------         F(  8,254645) =    0.77
       Model |  1.46731447       8  .183414308         Prob > F      =  0.6269
    Residual |  60444.5039  254645  .237367723         R-squared     =  0.0000
-------------+--------------------------------         Adj R-squared = -0.0000
       Total |  60445.9712  254653  .237366028         Root MSE      =   .4872

------------------------------------------------------------------------------
res_2sls_w~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     twoboys |  -.0052822   .0026941    -1.96   0.050    -.0105625   -1.83e-06
    twogirls |   .0042367   .0027711     1.53   0.126    -.0011946    .0096681
      boy1st |    .004822   .0027461     1.76   0.079    -.0005603    .0102043
       agem1 |   3.72e-07    .000312     0.00   0.999    -.0006112     .000612
     agefstm |   2.07e-06   .0003625     0.01   0.995    -.0007084    .0007125
       black |  -.0000392   .0043822    -0.01   0.993    -.0086282    .0085498
      hispan |  -.0000393   .0040807    -0.01   0.992    -.0080375    .0079588
     othrace |   .0000149   .0046288     0.00   0.997    -.0090575    .0090872
       _cons |  -.0021381   .0097043    -0.22   0.826    -.0211583     .016882
------------------------------------------------------------------------------
37
• SSM = 1.467
• SST = 60445.97
• R2 = SSM/SST = 2.43E-5
• N = 254654
• NR2 = 6.18
• Distributed as χ2(1)
• P-value of 6.18 is 0.0129
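The brute-force overidentification test can be checked with a few lines of arithmetic. The Python sketch below takes the model and total sums of squares from the residual regression, forms N times R2, and converts it to a p-value. It uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so its upper tail equals `erfc(sqrt(x/2))`. The helper name `chi2_1df_pvalue` is illustrative.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square(1) statistic.

    A chi-square(1) variable is a squared standard normal, so
    P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Sums of squares and sample size from the regression of 2SLS residuals
ss_model = 1.46731447
ss_total = 60445.9712
n_obs = 254654

r2 = ss_model / ss_total
nr2 = n_obs * r2
p_value = chi2_1df_pvalue(nr2)

print(f"R2 = {r2:.3e}, N*R2 = {nr2:.3f}, p = {p_value:.4f}")
```

The degrees of freedom equal the number of overidentifying restrictions: two excluded instruments minus one endogenous regressor.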
38
. * Run Hausmans test of endogeneity, two instrument case;
. * add residual from 1st stage regression to OLS of structural model;
. reg workedm morekids boy1st agem1 agefstm black hispan othrace res_1st_2zs;

      Source |       SS         df       MS            Number of obs =  254654
-------------+--------------------------------         F(  8,254645) = 1677.06
       Model |  3176.20362       8  397.025453         Prob > F      =  0.0000
    Residual |  60284.5169  254645  .236739449         R-squared     =  0.0500
-------------+--------------------------------         Adj R-squared =  0.0500
       Total |  63460.7206  254653  .249204685         Root MSE      =  .48656

------------------------------------------------------------------------------
     workedm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    morekids |  -.1127816   .0276489    -4.08   0.000    -.1669726   -.0585906
      boy1st |   .0009424    .001947     0.48   0.628    -.0028736    .0047585
       agem1 |   .0217057   .0008957    24.23   0.000     .0199501    .0234612
     agefstm |  -.0261649   .0012566   -20.82   0.000    -.0286279   -.0237019
       black |   .1895035    .004759    39.82   0.000      .180176    .1988311
      hispan |   -.014818   .0053636    -2.76   0.006    -.0253305   -.0043054
     othrace |   .0439784   .0048067     9.15   0.000     .0345574    .0533994
 res_1st_2zs |  -.0541136   .0277264    -1.95   0.051    -.1084566    .0002294
       _cons |   .4448388    .013693    32.49   0.000     .4180009    .4716768
------------------------------------------------------------------------------

. * notice that OLS of this model generates 2SLS estimates of the other;
. * variables in the model (morekids, boy1st, etc.);
. test res_1st_2zs;

 ( 1)  res_1st_2zs = 0

       F(  1,254645) =    3.81
            Prob > F =    0.0510
Do Hausman test brute force
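With a single restriction, the F statistic from `test res_1st_2zs` is the square of the t-statistic on the first-stage residual. With about 254,000 degrees of freedom, its p-value is essentially that of a chi-square(1). The Python sketch below checks both against the coefficient and standard error printed in the output. The chi-square shortcut is a large-sample approximation, not the exact F distribution Stata uses.

```python
import math

# Coefficient and standard error on the first-stage residual (res_1st_2zs)
coef = -0.0541136
std_err = 0.0277264

t_stat = coef / std_err
f_stat = t_stat ** 2            # F(1, df) for a single restriction equals t^2

# Large-sample approximation: F(1, large df) is close to chi-square(1),
# whose upper tail is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(f_stat / 2.0))

print(f"t = {t_stat:.2f}, F = {f_stat:.2f}, p = {p_value:.4f}")
```

These values match the brute-force test and the `endog()` result from `ivreg2` (3.809, p = 0.0510).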
39
. * Run Hausmans test of endogeneity, two instrument case;
. * add residual from 1st stage regression to OLS of structural model;
. reg workedm morekids boy1st agem1 agefstm black hispan othrace res_1st_2zs;

      Source |       SS         df       MS            Number of obs =  254654
-------------+--------------------------------         F(  8,254645) = 1677.06
       Model |  3176.20362       8  397.025453         Prob > F      =  0.0000
    Residual |  60284.5169  254645  .236739449         R-squared     =  0.0500
-------------+--------------------------------         Adj R-squared =  0.0500
       Total |  63460.7206  254653  .249204685         Root MSE      =  .48656

------------------------------------------------------------------------------
     workedm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    morekids |  -.1127816   .0276489    -4.08   0.000    -.1669726   -.0585906
      boy1st |   .0009424    .001947     0.48   0.628    -.0028736    .0047585
       agem1 |   .0217057   .0008957    24.23   0.000     .0199501    .0234612
     agefstm |  -.0261649   .0012566   -20.82   0.000    -.0286279   -.0237019
       black |   .1895035    .004759    39.82   0.000      .180176    .1988311
      hispan |   -.014818   .0053636    -2.76   0.006    -.0253305   -.0043054
     othrace |   .0439784   .0048067     9.15   0.000     .0345574    .0533994
 res_1st_2zs |  -.0541136   .0277264    -1.95   0.051    -.1084566    .0002294
       _cons |   .4448388    .013693    32.49   0.000     .4180009    .4716768
------------------------------------------------------------------------------
Can reject at the 5.1 percent level the null that the coefficients are the same
Angrist/Krueger
40
41
Example
• Suppose a school district requires that a child turn 6 by October 31 of the year they enter 1st grade
• Has compulsory education until age 18
• Consider two kids
• One born Oct 1, 1960
• Another born Nov 1, 1960
42
• Oct 1, 1960
– Starts school in 1966 (age 5)
– Turns 6 a few months into school
– Starts senior year in 1977 (age 16)
– Does not turn 18 until after high school is over
• Nov 1, 1960
– Starts school in 1967 (age 6)
– Turns 7 a few months into school
– Starts senior year in 1978 (age 17)
– Turns 18 midway through senior year
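The two-kid example can be traced mechanically. The Python sketch below applies the October 31 cutoff and the age-18 compulsory schooling rule to each birth date. The June 30 end of the school year, 12 grades from 1st grade to graduation, and the function names are assumptions made for illustration.

```python
from datetime import date

CUTOFF = (10, 31)  # must turn 6 by Oct 31 of the 1st-grade entry year


def birthday_in(year, birth):
    """Birthday in a given year (Feb 29 births fall back to Mar 1)."""
    try:
        return date(year, birth.month, birth.day)
    except ValueError:
        return date(year, 3, 1)


def school_timeline(birth):
    turns_6_by_cutoff = (birth.month, birth.day) <= CUTOFF
    entry_year = birth.year + (6 if turns_6_by_cutoff else 7)
    graduation = date(entry_year + 12, 6, 30)    # assumed end of senior year
    turns_18 = birthday_in(birth.year + 18, birth)
    return {
        "entry_year": entry_year,
        "senior_year_start": entry_year + 11,
        "turns_18": turns_18,
        # True if the student may legally leave before finishing high school
        "can_leave_before_graduating": turns_18 < graduation,
    }


october_kid = school_timeline(date(1960, 10, 1))
november_kid = school_timeline(date(1960, 11, 1))
print(october_kid)
print(november_kid)
```

Only the November-born child reaches 18 before finishing, so the compulsory schooling law binds differently for children born a month apart. This is the variation that quarter of birth exploits.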
43
44
45
46
47
. * get reduced-forms for wald estimate;
. * compare to table III, panel B;
. reg educ qob1;

      Source |       SS         df       MS            Number of obs =  329509
-------------+--------------------------------         F(  1,329507) =   67.57
       Model |  727.393312       1  727.393312         Prob > F      =  0.0000
    Residual |  3546940.27  329507  10.7643852         R-squared     =  0.0002
-------------+--------------------------------         Adj R-squared =  0.0002
       Total |  3547667.66  329508    10.76656         Root MSE      =  3.2809

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        qob1 |  -.1088179   .0132376    -8.22   0.000    -.1347633   -.0828725
       _cons |   12.79688   .0065904  1941.75   0.000     12.78397     12.8098
------------------------------------------------------------------------------

. reg earnwkl qob1;

      Source |       SS         df       MS            Number of obs =  329509
-------------+--------------------------------         F(  1,329507) =   16.42
       Model |  7.56705582       1  7.56705582         Prob > F      =  0.0001
    Residual |    151830.3  329507  .460780197         R-squared     =  0.0000
-------------+--------------------------------         Adj R-squared =  0.0000
       Total |  151837.867  329508  .460801763         Root MSE      =  .67881

------------------------------------------------------------------------------
     earnwkl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        qob1 |  -.0110989   .0027388    -4.05   0.000    -.0164669   -.0057309
       _cons |   5.902694   .0013635  4329.00   0.000     5.900022    5.905367
------------------------------------------------------------------------------
1st stage
Reduced-form
βiv = -0.0110989/-0.1088179 = 0.10199
48
. * get correlation coefficient for;
. * educ and qob1;
. corr educ qob1;
(obs=329509)

             |     educ     qob1
-------------+------------------
        educ |   1.0000
        qob1 |  -0.0143   1.0000
Correlation coefficient: z and x
49
50
[Figure: % of Mothers that Smoked During Pregnancy by Birth Month of their Child. x-axis: Month (JAN-DEC); y-axis: % Smoked (11.0% to 14.0%)]
51
52
53
54
[Figure: Average Birth weight by Birth Month. x-axis: Month (JAN-DEC); y-axis: Birth weight in grams (3280 to 3340)]
55
56
Overidentified model
• 10 years of birth
• 3 quarters of birth
• 30 instruments
57
. * get dummies needed for the models;
. xi i.yob*i.qob;
i.yob             _Iyob_30-39         (naturally coded; _Iyob_30 omitted)
i.qob             _Iqob_1-4           (naturally coded; _Iqob_1 omitted)
i.yob*i.qob       _IyobXqob_#_#       (coded as above)
The xi command with i.m*i.n generates dummies for m, dummies for n, and then all the unique interactions of m and n
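To see where the 30 instruments come from, the Python sketch below rebuilds the variable names that `xi i.yob*i.qob` creates, dropping the omitted base categories (yob 30 and qob 1). It only constructs names and counts them. It is an illustration of the bookkeeping, not a replacement for `xi`.

```python
from itertools import product

years = range(30, 40)      # yob coded 30-39, base category 30
quarters = range(1, 5)     # qob coded 1-4, base category 1

yob_dummies = [f"_Iyob_{y}" for y in years if y != 30]
qob_dummies = [f"_Iqob_{q}" for q in quarters if q != 1]
interactions = [
    f"_IyobXqob_{y}_{q}"
    for y, q in product(years, quarters)
    if y != 30 and q != 1
]

# YOB dummies are exogenous controls; QOB main effects and the
# QOB x YOB interactions are the excluded instruments.
instruments = qob_dummies + interactions

print(len(yob_dummies), "year-of-birth controls")
print(len(instruments), "excluded instruments")
```

With one endogenous regressor (educ), 30 excluded instruments leave 29 overidentifying restrictions.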
58
. * run 2sls, qob times yob interactions as instruments;
. * compare to column (2), table V;
. ivregress 2sls earnwkl _Iyob_* (educ=_Iqob* _IyobX*);

Instrumental variables (2SLS) regression               Number of obs =  329509
                                                       Wald chi2(10) =   41.67
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.1102
                                                       Root MSE      =  .64034

------------------------------------------------------------------------------
     earnwkl |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0891154   .0161098     5.53   0.000     .0575408    .1206901
    _Iyob_31 |  -.0088813   .0055293    -1.61   0.108    -.0197185    .0019558
             [some results deleted]
    _Iyob_39 |  -.0585271   .0104573    -5.60   0.000    -.0790231   -.0380311
       _cons |   4.792727   .2006807    23.88   0.000       4.3994    5.186054
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   _Iyob_31 _Iyob_32 _Iyob_33 _Iyob_34 _Iyob_35 _Iyob_36 _Iyob_37
               _Iyob_38 _Iyob_39 _Iqob_2 _Iqob_3 _Iqob_4 _IyobXqob_31_2
               _IyobXqob_31_3 _IyobXqob_31_4 _IyobXqob_32_2 _IyobXqob_32_3
               _IyobXqob_32_4 _IyobXqob_33_2 _IyobXqob_33_3 _IyobXqob_33_4
               _IyobXqob_34_2 _IyobXqob_34_3 _IyobXqob_34_4 _IyobXqob_35_2
               _IyobXqob_35_3 _IyobXqob_35_4 _IyobXqob_36_2 _IyobXqob_36_3
               _IyobXqob_36_4 _IyobXqob_37_2 _IyobXqob_37_3 _IyobXqob_37_4
               _IyobXqob_38_2 _IyobXqob_38_3 _IyobXqob_38_4 _IyobXqob_39_2
               _IyobXqob_39_3 _IyobXqob_39_4
YOB effects
QOB main effects and qob x yob interactions as instruments
59
. estat overid;

  Tests of overidentifying restrictions:

  Sargan (score) chi2(29) =  25.4394  (p = 0.6553)
  Basmann chi2(29)        =  25.4383  (p = 0.6553)
60
. estat firststage;

  First-stage regression summary statistics
  --------------------------------------------------------------------------
               |            Adjusted      Partial
      Variable |   R-sq.       R-sq.        R-sq.   F(30,329469)   Prob > F
  -------------+------------------------------------------------------------
          educ |  0.0033      0.0032       0.0004        4.90707     0.0000
  --------------------------------------------------------------------------
1st stage F – lots of concerns about finite sample bias
61
In columns (4) and (8), age and agesq reduce the information contained in the instrument. 1st stage F falls to 1.6. Compare 2SLS to OLS in these cases. In this instance, a low F (poor 1st stage fit) means the results collapse toward OLS
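A widely used rule of thumb from the weak-instrument literature says the finite-sample bias of 2SLS, as a share of the OLS bias, is approximately 1/(F + 1), where F is the first-stage F on the excluded instruments. The Python sketch below applies that approximation to the two F statistics quoted on these slides. It is a back-of-the-envelope guide under that assumed approximation, not an exact result, and the function name `relative_bias` is illustrative.

```python
def relative_bias(first_stage_f):
    """Approximate 2SLS bias as a fraction of OLS bias: 1 / (F + 1)."""
    if first_stage_f < 0:
        raise ValueError("an F statistic cannot be negative")
    return 1.0 / (first_stage_f + 1.0)


# First-stage F statistics quoted on the slides
for label, f_stat in [("30 instruments", 4.90707), ("with age, agesq", 1.6)]:
    share = relative_bias(f_stat)
    print(f"{label}: F = {f_stat:.2f}, 2SLS keeps about {share:.0%} of the OLS bias")
```

As F approaches 0 the ratio goes to 1, which is the sense in which 2SLS collapses to OLS when the first stage has almost no fit.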
62
Generate instruments by interacting
3 QOB x 10 YOB dummies (30)
3 QOB x 50 YOB dummies (147)
177 instruments, 176 DOF in NR2 test
Notice how close the 2SLS and OLS estimates are
63