Christopher Dougherty
EC220 - Introduction to econometrics (chapter 8)
Slideshow: asymptotic and finite-sample distributions of the IV estimator
Original citation:
Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 8). [Teaching Resource]
© 2012 The Author
This version available at: http://learningresources.lse.ac.uk/134/
Available in LSE Learning Resources Online: May 2012
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/
http://learningresources.lse.ac.uk/
ASYMPTOTIC AND FINITE-SAMPLE DISTRIBUTIONS OF THE IV ESTIMATOR
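$$Y = \beta_1 + \beta_2 X + u$$
$$b_2^{IV} = \frac{\sum (Z_i - \bar{Z})(Y_i - \bar{Y})}{\sum (Z_i - \bar{Z})(X_i - \bar{X})}$$
$$\sigma^2_{b_2^{IV}} = \frac{\sigma_u^2}{\sum (X_i - \bar{X})^2} \times \frac{1}{r_{XZ}^2} = \frac{\sigma_u^2}{n \left[ \frac{1}{n} \sum (X_i - \bar{X})^2 \right]} \times \frac{1}{r_{XZ}^2} = \frac{\sigma_u^2}{n\,\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}$$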
The asymptotic variance of the IV estimator is given by the expression shown. It is the expression for the variance of the OLS estimator, multiplied by the square of the reciprocal of the correlation between X and Z.
$$\operatorname{plim}\, b_2^{IV} = \beta_2$$
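As a minimal sketch of the two expressions above, the following Python functions compute the IV slope estimate and the variance expression from sample data. The function names `iv_slope` and `iv_variance` are illustrative and do not come from the slideshow, and the disturbance variance `sigma_u2` is assumed to be known, which it would not be in practice.

```python
import numpy as np

def iv_slope(x, y, z):
    """IV slope: sum((Z - Zbar)(Y - Ybar)) / sum((Z - Zbar)(X - Xbar))."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    zd = z - z.mean()
    return np.sum(zd * (y - y.mean())) / np.sum(zd * (x - x.mean()))

def iv_variance(x, z, sigma_u2):
    """OLS variance expression multiplied by 1 / r_XZ^2.

    sigma_u2 is assumed known here (an assumption made for illustration).
    """
    x, z = np.asarray(x, dtype=float), np.asarray(z, dtype=float)
    ols_var = sigma_u2 / np.sum((x - x.mean()) ** 2)
    r_xz = np.corrcoef(x, z)[0, 1]  # sample correlation between X and Z
    return ols_var / r_xz ** 2
```

When the instrument is X itself, the correlation is 1 and `iv_variance` reduces to the OLS expression; a lower correlation inflates the variance by the factor 1 / r_XZ^2.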
What does this mean? We have seen that the distribution of the IV estimator degenerates to a spike. So how can it have an asymptotic variance?
The contradiction has been caused by compressing several ideas together. We will have to unpick them, taking several small steps.
The application of a central limit theorem (CLT) underlies the assertion. To use a CLT, we must first show that a variable has a nondegenerate limiting distribution. The CLT will then show that, under appropriate conditions, this limiting distribution is normal.
We cannot apply a CLT to $b_2^{IV}$ directly, because it does not have a nondegenerate limiting distribution. The expression for the variance may be rewritten as $\sigma^2_{b_2^{IV}} = \frac{\sigma_u^2}{n\,\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}$, where MSD(X) is the mean square deviation of X.
By a law of large numbers, the MSD tends to the population variance of X and so has a well-defined limit.
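The convergence of the MSD can be illustrated with a short simulation. This is only a sketch: the helper name `msd` and the choice of a normal population with variance 4 are assumptions made for the illustration, not part of the slideshow.

```python
import numpy as np

def msd(x):
    """Mean square deviation: (1/n) * sum((X_i - Xbar)^2)."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (25, 100, 3200, 100_000):
        # Hypothetical population: normal with standard deviation 2 (variance 4).
        sample = rng.normal(loc=0.0, scale=2.0, size=n)
        print(n, msd(sample))
```

As n grows, the printed values should settle near the population variance of 4.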
The variance of $b_2^{IV}$ is inversely proportional to n, and so tends to zero. This is the reason that the distribution of $b_2^{IV}$ collapses to a spike.
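We can deal with the diminishing-variance problem by considering $\sqrt{n}\, b_2^{IV}$ instead of $b_2^{IV}$. This has the variance shown, which is stable. However, $\sqrt{n}\, b_2^{IV}$ still does not have a limiting distribution because its mean increases with n.
$$\sigma^2_{b_2^{IV}} = \frac{\sigma_u^2}{n\,\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}$$
$$\operatorname{var}\left(\sqrt{n}\, b_2^{IV}\right) = \frac{\sigma_u^2}{\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}$$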
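The step from the first variance expression to the second can be written out as follows. This is a sketch of the reasoning using only the rule that multiplying a random variable by a constant c multiplies its variance by c squared.

```latex
\operatorname{var}\left(\sqrt{n}\, b_2^{IV}\right)
  = n \operatorname{var}\left(b_2^{IV}\right)
  = n \times \frac{\sigma_u^2}{n\,\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}
  = \frac{\sigma_u^2}{\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}
```

The factor n cancels, so the variance no longer shrinks as the sample grows. The scaling also multiplies the centre of the distribution by $\sqrt{n}$, which is why the mean drifts upward with n.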
So instead, consider $\sqrt{n}\,(b_2^{IV} - \beta_2)$. Since $b_2^{IV}$ tends to $\beta_2$ as the sample size becomes large, this does have a limiting distribution with zero mean and stable variance.
$$\sqrt{n}\,(b_2^{IV} - \beta_2) \xrightarrow{d} N\left(0,\ \frac{\sigma_u^2}{\sigma_X^2} \times \frac{1}{r_{XZ}^2}\right)$$
Under conditions that are usually satisfied in regressions using cross-sectional data, it can then be shown that we can apply a central limit theorem and demonstrate that $\sqrt{n}\,(b_2^{IV} - \beta_2)$ has the limiting normal distribution shown.
The arrow with a d over it means ‘has limiting distribution’.
Having established this, we can now start working backwards and say that, for sufficiently large samples, as an approximation, $(b_2^{IV} - \beta_2)$ has the distribution shown. (~ means ‘is distributed as’.)
$$(b_2^{IV} - \beta_2) \sim N\left(0,\ \frac{\sigma_u^2}{n\,\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}\right)$$
We can then say that, as an approximation, for sufficiently large samples, $b_2^{IV}$ is distributed as shown, and use this assertion as justification for performing the usual tests. This is what was intended by equation (8.50).
$$b_2^{IV} \sim N\left(\beta_2,\ \frac{\sigma_u^2}{n\,\mathrm{MSD}(X)} \times \frac{1}{r_{XZ}^2}\right)$$
Of course, we need to be more precise about what we mean by a ‘sufficiently large’ sample, and ‘as an approximation’.
We cannot do this mathematically. This was why we resorted to asymptotic analysis in the first place.
Instead, the usual procedure is to set up a Monte Carlo experiment using a model appropriate to the context. The answers will depend on the nature of the model, the correlation between X and u, and the correlation between X and Z.
Suppose that we have the model shown and the observations on Z, V, and u are drawn independently from a normal distribution with mean zero and unit variance. We will think of Z and V as variables and of u as a disturbance term in the model. $\lambda_1$ and $\lambda_2$ are constants.
$$Y = \beta_1 + \beta_2 X + u \qquad X = \lambda_1 Z + \lambda_2 V + u$$
By construction, X is not independent of u and so Assumption B.7 is violated when we fit the regression of Y on X. OLS will yield inconsistent estimates and the standard errors and other diagnostics will be invalid.
Z is correlated with X, but independent of u, and so can serve as an instrument. (V is included as a component of X in order to provide some variation in X not connected with either the instrument or the disturbance term.)
We will set $\beta_1 = 10$, $\beta_2 = 5$, $\lambda_1 = 0.5$, and $\lambda_2 = 2.0$.
$$Y = 10 + 5X + u \qquad X = 0.5Z + 2.0V + u$$
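A Monte Carlo experiment of this kind might be sketched in Python as follows. The function name `simulate_estimators`, the seeds, and the replication counts are assumptions made for illustration; the replication counts are far smaller than the 10 million samples used for the slides, so the output is only indicative.

```python
import numpy as np

BETA1, BETA2 = 10.0, 5.0       # parameter values stated in the slides
LAMBDA1, LAMBDA2 = 0.5, 2.0

def simulate_estimators(n, reps, seed=0):
    """Return arrays of OLS and IV slope estimates, one per simulated sample."""
    rng = np.random.default_rng(seed)
    # Z, V and u are independent standard normal draws, shape (reps, n) each.
    z, v, u = rng.standard_normal((3, reps, n))
    x = LAMBDA1 * z + LAMBDA2 * v + u
    y = BETA1 + BETA2 * x + u
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    zd = z - z.mean(axis=1, keepdims=True)
    b_ols = (xd * yd).sum(axis=1) / (xd ** 2).sum(axis=1)
    b_iv = (zd * yd).sum(axis=1) / (zd * xd).sum(axis=1)
    return b_ols, b_iv

if __name__ == "__main__":
    for n in (25, 100, 3200):
        b_ols, b_iv = simulate_estimators(n, reps=500)
        # Medians are reported because the IV estimates have very heavy tails.
        print(n, np.median(b_ols), np.median(b_iv))
```

Histograms of `b_ols` and `b_iv` for each n would give rough counterparts of the distributions discussed in the slides.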
The diagram shows the distributions of the OLS and IV estimators of $\beta_2$ for n = 25 and n = 100, for 10 million samples in both cases. Given the information above, it is easy to verify that plim $b_2^{OLS}$ = 5.19. Of course, plim $b_2^{IV}$ = 5.00.
[Figure: distributions of the OLS and IV estimators. Curves: OLS, n = 25; OLS, n = 100; IV, n = 25; IV, n = 100. Horizontal axis from 4 to 6; vertical axis from 0 to 10.]
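The value 5.19 can be checked with a short calculation. It relies on the standard large-sample result for the OLS slope and on Z, V, and u being independent with unit variance, so that cov(X, u) = var(u) = 1 and var(X) is the sum of the three component variances.

```latex
\operatorname{plim} b_2^{OLS}
  = \beta_2 + \frac{\operatorname{cov}(X, u)}{\operatorname{var}(X)}
  = 5 + \frac{1}{0.5^2 + 2.0^2 + 1}
  = 5 + \frac{1}{5.25}
  \approx 5.19
```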
The IV estimator has a greater variance than the OLS estimator and for n = 25 one might prefer the latter. It is biased, but the smaller variance could make it superior, using some criterion such as the mean square error. For n = 100, the IV estimator looks better.
This diagram adds the distribution for n = 3,200. Both estimators are tending to the predicted limits (the IV estimator more slowly than the OLS, because it has a larger variance). Here the IV estimator is definitely superior.
[Figure: distributions of the OLS and IV estimators for n = 3,200. Curves: IV, n = 3,200; OLS, n = 3,200. Horizontal axis from 4 to 6; vertical axis from 0 to 60.]
This diagram shows the distribution of $\sqrt{n}\,(b_2^{IV} - \beta_2)$ for n = 25, 100, and 3,200. It also shows, as the dashed red line, the limiting normal distribution predicted by the central limit theorem.
[Figure: distributions of $\sqrt{n}\,(b_2^{IV} - \beta_2)$. Curves: n = 25; n = 100; n = 3,200; limiting normal distribution. Horizontal axis from -6 to 6; vertical axis from 0 to 0.2.]
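For this model, the variance of the limiting normal distribution can be worked out directly. The sketch below uses the population values implied by the design, $\sigma_u^2 = 1$, $\sigma_X^2 = 0.5^2 + 2.0^2 + 1 = 5.25$ and $\operatorname{cov}(X, Z) = 0.5$, in place of the sample quantities.

```latex
\frac{\sigma_u^2}{\sigma_X^2} \times \frac{1}{r_{XZ}^2}
  = \frac{1}{5.25} \times \frac{5.25 \times 1}{0.5^2}
  = \frac{1}{0.25}
  = 4
```

On this reckoning the limiting distribution is N(0, 4), with standard deviation 2 and a peak density of about 0.2, which agrees with the vertical scale of the diagram.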
It can be seen that the distribution for n = 3,200 is very close to the limiting normal distribution and so inference would be safe with samples of this magnitude.
However, the distributions for n = 25 and n = 100 are distinctly non-normal. The distribution for n = 25 has fat tails. This means that if you performed a t test, the probability of a Type I error would be much higher than the nominal significance level of the test.
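The consequence of the fat tails can be gauged with a rough simulation. The sketch below is not a t test with estimated standard errors: it assumes the limiting standard deviation is known (2 for this model, from the population values of the design) and simply counts how often $\sqrt{n}\,(b_2^{IV} - \beta_2)$ falls outside the nominal 5 percent critical values of the limiting normal distribution. The function name `tail_fraction` and the replication counts are illustrative.

```python
import numpy as np

def tail_fraction(n, reps, seed=0):
    """Share of samples with |sqrt(n)(b2_IV - beta2)| > 1.96 limiting s.d."""
    lam1, lam2 = 0.5, 2.0                    # values stated in the slides
    rng = np.random.default_rng(seed)
    z, v, u = rng.standard_normal((3, reps, n))
    x = lam1 * z + lam2 * v + u
    zd = z - z.mean(axis=1, keepdims=True)
    # b2_IV - beta2 = sum(zd * u) / sum(zd * x); beta1 and beta2 drop out.
    err = (zd * u).sum(axis=1) / (zd * x).sum(axis=1)
    sigma_x2 = lam1 ** 2 + lam2 ** 2 + 1.0   # population variance of X
    r2 = lam1 ** 2 / sigma_x2                # squared population correlation
    limiting_sd = np.sqrt(1.0 / (sigma_x2 * r2))
    return np.mean(np.abs(np.sqrt(n) * err) > 1.96 * limiting_sd)

if __name__ == "__main__":
    for n, reps in ((25, 20000), (100, 20000), (3200, 800)):
        print(n, tail_fraction(n, reps))
```

A value well above 0.05 for n = 25 corresponds to the excess Type I error described above, while the value for n = 3,200 should lie close to 0.05.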
The distribution for n = 100 is better, in that the right tail is close to that of the normal distribution, but the left tail is much too fat and, as for n = 25, would give rise to excess instances of Type I error.
The distortion for small sample sizes is partly attributable to the low correlation between X and Z, 0.22.
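The figure of 0.22 follows from the design of the experiment. Taking the population values, cov(X, Z) = 0.5, var(Z) = 1 and var(X) = 5.25, the correlation is:

```latex
\frac{\operatorname{cov}(X, Z)}{\sqrt{\operatorname{var}(X)\operatorname{var}(Z)}}
  = \frac{0.5}{\sqrt{5.25 \times 1}}
  \approx 0.218
```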
Unfortunately, low correlations (‘weak instruments’) are common in IV estimation. It is difficult to find an instrument that is correlated with X but not the disturbance term. Indeed, it is often difficult to find any credible instrument at all.
Copyright Christopher Dougherty 2011.
These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.
The content of this slideshow comes from Section 8.5 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse.
11.07.24