panel ecmiic2
TRANSCRIPT
2 Panel Data∗
Data sets that combine time series and cross-section data are common in economics.
An independently pooled cross section is obtained by randomly sampling a large population at different points in time (e.g., yearly).
Important feature: The data set consists of independently sampled observations.
This allows investigating the effect of time, e.g., whether relationships have changed.
Pooling typically raises only minor statistical complications.
∗Version: Jan 19, 2012
A panel data set (longitudinal data) is a sample in which the same individuals, families, firms, cities, . . . are followed across time.
E.g., OECD statistics contain numerous series observed yearly for several countries.
Similarly, time series data on several firms, industries, etc., are data of this type.
2.1 Pooling independent cross sections across time
Example 1 Women's fertility over time: Data from the General Social Survey contain samples collected in even years from 1972 to 1984.
Model for explaining the total number of children born to a woman.
The data are available on the course web site (password protected).
* read data
. insheet using "fertil1.csv", comma clear
* describe data
. des

Contains data
  obs:         1,129
 vars:            14
 size:        24,838 (99.9% of memory free)
------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------
year            byte   %8.0g
educ            byte   %8.0g
meduc           byte   %8.0g
feduc           byte   %8.0g
age             byte   %8.0g
kids            byte   %8.0g
black           byte   %8.0g
east            byte   %8.0g
northcen        byte   %8.0g
west            byte   %8.0g
farm            byte   %8.0g
othrural        byte   %8.0g
town            byte   %8.0g
smcity          byte   %8.0g
. tabstat kids, statistics( mean count ) by(year) columns(statistics)
Summary for variables: kids
     by categories of: year

    year |      mean         N
---------+--------------------
      72 |       3.0       156
      74 |       3.2       173
      76 |       2.8       152
      78 |       2.8       143
      80 |       2.8       142
      82 |       2.4       186
      84 |       2.2       177
---------+--------------------
   Total |       2.7      1129
------------------------------
[Figure: Number of children per woman. Horizontal axis: Year (70 to 86); vertical axis: N of children (2 to 4).]
The average number of children per woman has clearly declined over the years.
The analysis can be substantially elaborated by regression analysis.
After controlling for other factors (education, age, etc.), what has happened to the fertility rate?
Build a regression with year dummies: y74 for 1974,
· · · , y84 for year 1984.
Year 1972 is the base year.
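As a minimal sketch of how such year dummies can be built (the names SURVEY_YEARS, BASE_YEAR, and year_dummies are illustrative assumptions, not part of the course material), each observation gets one 0/1 variable per non-base year:

```python
# Sketch: year dummies for the fertility example, with 1972 as the base year.
# SURVEY_YEARS, BASE_YEAR and year_dummies are illustrative names (assumptions).
SURVEY_YEARS = [72, 74, 76, 78, 80, 82, 84]
BASE_YEAR = 72

def year_dummies(year):
    """Return the dummies y74, ..., y84 for one observation's survey year."""
    return {f"y{y}": int(year == y) for y in SURVEY_YEARS if y != BASE_YEAR}

row_1972 = year_dummies(72)   # base year: every dummy is 0
row_1982 = year_dummies(82)   # only y82 is 1
```

Because 1972 has no dummy of its own, each year coefficient measures the difference from 1972, holding the other regressors fixed.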
. reg kids educ age age2 black east northcen west farm ///
      y74 y76 y78 y80 y82 y84

      Source |       SS       df       MS              Number of obs =    1129
-------------+------------------------------           F( 14,  1114) =   11.51
       Model |  389.777313    14  27.8412367           Prob > F      =  0.0000
    Residual |  2695.73199  1114  2.41986713           R-squared     =  0.1263
-------------+------------------------------           Adj R-squared =  0.1153
       Total |   3085.5093  1128  2.73538059           Root MSE      =  1.5556

-----------------------------------------------------
        kids |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
        educ |  -.1242409   .0181486    -6.85   0.000
         age |   .5381453   .1384005     3.89   0.000
        age2 |  -.0058679   .0015645    -3.75   0.000
       black |   1.083783   .1734035     6.25   0.000
        east |   .2276015   .1312518     1.73   0.083
    northcen |   .3713906   .1199679     3.10   0.002
        west |   .2188689   .1663522     1.32   0.189
        farm |  -.0918808    .122027    -0.75   0.452
         y74 |   .2586277   .1727165     1.50   0.135
         y76 |  -.1012358   .1787317    -0.57   0.571
         y78 |  -.0671507   .1814491    -0.37   0.711
         y80 |  -.0751199   .1827069    -0.41   0.681
         y82 |  -.5323518   .1723385    -3.09   0.002
         y84 |  -.5383952    .174472    -3.09   0.002
       _cons |  -7.894707    3.05159    -2.59   0.010
-----------------------------------------------------
Sharp drop in fertility in the early 1980s (the other year dummies are not statistically significant).
E.g., the coefficient on y82 indicates that, holding other factors fixed (educ, age, and others), per 100 women there were about 53 fewer children in 1982 than in 1972.
In particular, since education is controlled for, this decline is separate from the decline due to the increase in education.
Women with more education have fewer children (coefficient −0.12 is highly statistically significant with t = −6.85 and p-value < 0.0005).
Other things equal, a woman with a college education (four more years of schooling) is predicted to have 4 × 0.124 = 0.496 fewer children than a woman with only a high school education, i.e., about 50 fewer children per 100 women.
In summary, problems involving pooled cross section data (independent samples) can be analyzed by utilizing dummy variables.
2.2 Two-period panel data analysis
From each individual (people, firms, schools, cities, countries, etc.) data is collected at two time points, t = 1 and t = 2.
In usual regression one major source of bias is caused by omitted (important) variables.
For example, if the true model is
(1) yi = β0 + β1xi + β2zi + ui,
but we estimate
(2) yi = β0 + β1xi + vi,
where
(3) vi = β2zi + ui,
the bias in the OLS estimator $\hat\beta_1$ from model (2) is
(4) $E[\hat\beta_1] - \beta_1 = \beta_2 \frac{\sum_{i=1}^{n}(x_i - \bar x) z_i}{\sum_{i=1}^{n}(x_i - \bar x)^2}$,
which can be substantial if x and z are correlated and β2 is large.
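The bias formula (4) can be checked numerically. The following is a small simulation sketch under assumed parameter values (β0 = 1, β1 = 2, β2 = 3, and x built to be correlated with z); the helper ols_slope and all numbers are illustrative assumptions, not part of the original example.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (intercept included)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx

random.seed(0)
n = 5000
beta0, beta1, beta2 = 1.0, 2.0, 3.0                  # assumed true parameters
z = [random.gauss(0, 1) for _ in range(n)]           # omitted variable
x = [0.5 * zi + random.gauss(0, 1) for zi in z]      # x is correlated with z
u = [random.gauss(0, 1) for _ in range(n)]
y = [beta0 + beta1 * xi + beta2 * zi + ui for xi, zi, ui in zip(x, z, u)]

b1_short = ols_slope(x, y)                           # estimate from model (2)

# Bias term of (4): beta2 * sum((x_i - xbar) z_i) / sum((x_i - xbar)^2)
xbar = sum(x) / n
bias_formula = (beta2 * sum((xi - xbar) * zi for xi, zi in zip(x, z))
                / sum((xi - xbar) ** 2 for xi in x))
```

In this setup b1_short − β1 and bias_formula differ only by the sampling noise coming from u, so the short regression overstates β1 by roughly the amount that (4) predicts.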
Use of panel data makes it possible to elimi-
nate the omitted variable bias in certain cases.
Suppose that we have the following situation
in terms of model (1)
(5) yit = β0 + β1xit + β2zi + uit,
where i refers to individual i and t to time
point t.
Thus, we have panel data where data is col-
lected from each individual i at different time
points t (in the two period case, t = 1,2).
Note that in (5) zi does not have the time
index, which implies that variable z is time in-
variant (or at least changing very slowly with
time).
Suppose we have from each of the n individ-
uals observations on yit and xit at time points
t = 1 and t = 2, thus altogether 2n observa-
tions.
However, we do not observe zi.
Suppose further that we allow the possibility
that intercept β0 may be different at different
time points, such that (5) can be written as
(6) yit = β0 + δ0Dt + β1xit + β2zi + uit,
where Dt = 0 for t = 1 and Dt = 1 for t = 2
(time dummy).
Then taking differences of the form
∆yi = yi2 − yi1,
the model in (6) becomes
(7) ∆yi = δ0 + β1∆xi + ∆ui,
i.e., the (unobserved) omitted variable disappears and the OLS estimator of the slope parameter β1 is unbiased (provided ∆ui is uncorrelated with ∆xi).
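A simulated two-period panel illustrates the point. This is only a sketch under assumed values (β1 = 2, β2 = 3, δ0 = 0.5, with x correlated with the unobserved z in both periods); ols_slope and the data-generating choices are illustrative assumptions.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (intercept included)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx

random.seed(1)
n = 5000
beta0, delta0, beta1, beta2 = 1.0, 0.5, 2.0, 3.0     # assumed true parameters
z = [random.gauss(0, 1) for _ in range(n)]           # unobserved, time invariant
x1 = [0.5 * zi + random.gauss(0, 1) for zi in z]     # period t = 1
x2 = [0.5 * zi + random.gauss(0, 1) for zi in z]     # period t = 2
y1 = [beta0 + beta1 * a + beta2 * zi + random.gauss(0, 1)
      for a, zi in zip(x1, z)]
y2 = [beta0 + delta0 + beta1 * a + beta2 * zi + random.gauss(0, 1)
      for a, zi in zip(x2, z)]

# Cross-section regression on period 2 only: suffers from omitted z
b_cross = ols_slope(x2, y2)

# First differences as in (7): z drops out, the intercept estimates delta0
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
b_fd = ols_slope(dx, dy)
```

Here b_cross lands well above the assumed β1 = 2, while b_fd is close to it, because z_i cancels in the differences.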
The above generalizes immediately such that
if we denote
(8) ai = z′iγ = γ1zi1 + γ2zi2 + · · ·+ γqziq
and enhance (6) to
(9) yit = β0 + δ0Dt + β1xit + ai + uit,
taking differences reduces again to estima-
tion model (7).
The above model is called the fixed effect (FE) model
in which ai is fixed over the time periods (ai can be a
random variable, and can correlate with the explana-
tory variable xit).
If ai is not correlated with the explanatory variables, the model is called the random effect (RE) model and is estimated with different techniques that are supposed to yield more efficient estimators of the β-parameters than the fixed effect methods (which are basically OLS methods). We will return to the RE model later.
In the FE case the estimators of the regression parameters obtained by applying OLS to the first-differenced equation are called the first-differenced estimators (FD estimators).
We will deal with other fixed effect estimators later.
In summary, differencing eliminates all unob-
served time invariant factors from the model.
A major pitfall is that differencing also wipes
out observed time invariant variables (like gen-
der) from the model!
Thus, this method cannot be used in such
cases (if we want to estimate these effects),
or in cases where the explanatory variables
change very slowly across time (the differ-
ence is nearly zero).
In many cases the FD-method is useful, how-
ever.
The following example highlights the biasing effect of unobserved factors and how panel estimation with the simple FD-method likely solves the problem.
Example 2 Data set crime2.xls (Wooldridge) contains data on crime and unemployment rates for 46 US cities for 1982 (t = 1) and 1987 (t = 2).
Running a simple cross section regression of crmrte on unem by using only 1987 yields
. regress crmrte unem if year==87
      Source |       SS       df       MS              Number of obs =      46
-------------+------------------------------           F(  1,    44) =    1.48
       Model |  1775.90928     1  1775.90928           Prob > F      =  0.2297
    Residual |  52674.6428    44  1197.15097           R-squared     =  0.0326
-------------+------------------------------           Adj R-squared =  0.0106
       Total |  54450.5521    45  1210.01227           Root MSE      =    34.6

-----------------------------------------------------
      crmrte |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
        unem |  -4.161134   3.416456    -1.22   0.230
       _cons |   128.3781   20.75663     6.18   0.000
-----------------------------------------------------
The coefficient of unem is negative, −4.16!
However, it is not statistically significant.
The regression likely suffers from an omitted variables problem (age distribution, gender distribution, education levels, . . .).
Most of these can be expected to be fairly stable across time. Thus, use of panel data techniques may be helpful.
Before proceeding to the panel data estimation, let us see what happens if we simply pool the two years and estimate
(10) crmrte = β0 + δ0D87 + β1unem + u,
where D87 is the year 1987 dummy.
. regress crmrte d87 unem
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  2,    89) =    0.55
       Model |  989.717314     2  494.858657           Prob > F      =  0.5788
    Residual |  80055.7864    89  899.503218           R-squared     =  0.0122
-------------+------------------------------           Adj R-squared = -0.0100
       Total |  81045.5037    91  890.609931           Root MSE      =  29.992

-----------------------------------------------------
      crmrte |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
         d87 |   7.940413   7.975324     1.00   0.322
        unem |   .4265461   1.188279     0.36   0.720
       _cons |   93.42026   12.73947     7.33   0.000
-----------------------------------------------------
The situation does not change much qualitatively!
For example, Stata has very sophisticated
panel data procedures.
We discuss some of them later.
The FD-method can be applied by using the regress routine after first declaring the data to be panel data with the xtset command
(Menu: Statistics > Longitudinal/panel data > Setup and utilities > Declare data set to be panel data).
In Eviews: Proc > Structure/Resize Current
Page. . ., and follow the instructions.
In SAS: proc panel data = crime2; id state year; model crmrte = unemp; run; Before applying proc panel the data must be sorted with proc sort.
Whichever software is used, identifiers for the individuals (in particular) are needed to indicate the multiple measurements on an individual.
After declaring the panel structure to the program, the model
(11) ∆crmrte = δ0 + β1∆unem + ∆u
can be estimated with the FD method, e.g., in Stata as (d.crmrte means crmrte87 − crmrte82):
. reg d.crmrte d.unem
      Source |       SS       df       MS              Number of obs =      46
-------------+------------------------------           F(  1,    44) =    6.38
       Model |  2566.43056     1  2566.43056           Prob > F      =  0.0152
    Residual |  17689.5426    44  402.035059           R-squared     =  0.1267
-------------+------------------------------           Adj R-squared =  0.1069
       Total |  20255.9732    45  450.132737           Root MSE      =  20.051

-----------------------------------------------------
    D.crmrte |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
        unem |
         D1. |   2.217996   .8778657     2.53   0.015
             |
       _cons |   15.40219   4.702116     3.28   0.002
-----------------------------------------------------
In Eviews, after the data has been reshaped to panel data, the FD estimation can be worked out using Quick > Estimate Equation. . . to open the Equation Estimation command window and entering d(crmrte) c d(unem) to get results similar to the above.
The coefficient estimate β1 ≈ 2.22 is now statistically significant at the 5% level (p = 0.015) and of the expected sign.
The model predicts that a one percentage point increase in the unemployment rate increases crime by about 2.2 crimes per 1,000 people.
The constant term indicates that even if the change in the unemployment rate were zero, the crime rate generally increased during the period from 1982 to 1987 by about 15.4 crimes per 1,000 people.
Note that the time dummy component δ0 in (11) captures all unobserved time effects that are common to all cross-sectional individuals.
That is, we can consider δ0 to represent
δ0 = z′tδ = δ1z1t + δ2z2t + · · ·+ δpzpt,
where the zt's are common trend components affecting all individual crime rates with the same intensity.
2.3 More than two time periods
Differencing can be used with more than two
time periods to work out fixed effect estima-
tion.
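As an example consider a three period model
(12) yit = δ1 + δ2D2t + δ3D3t + β1xit1 + · · · + βkxitk + ai + uit
for t = 1, 2, 3, where D2t = 1 for period t = 2 and zero otherwise, and D3t = 1 for t = 3 and zero otherwise.
Differencing eliminates ai and yields
(13) ∆yit = δ2∆D2t + δ3∆D3t + β1∆xit1 + · · · + βk∆xitk + ∆uit, t = 2, 3.
Since ∆D2t = 1, ∆D3t = 0 for t = 2 and ∆D2t = −1, ∆D3t = 1 for t = 3, the intercept of the differenced equation is δ2 for t = 2 and δ3 − δ2 for t = 3.
Again the model is simple to estimate with OLS.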
3.3 Fixed effect method
An alternative method, which works in cer-
tain cases better than the FD-method, is
called the fixed effects method.
Consider the simple case model of
(14) yit = β1xit + ai + uit,
i = 1, . . . , n, t = 1, . . . , T .
Thus there are altogether n× T observations.
Define means over the T time periods
(15) $\bar y_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$, $\bar x_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}$, $\bar u_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}$.
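Then
(16) $\bar y_i = \beta_1 \bar x_i + a_i + \bar u_i$.
Note that
$\frac{1}{T}\sum_{t=1}^{T} a_i = \frac{1}{T} T a_i = a_i$.
Thus, subtracting (16) from (14) eliminates ai and gives
(17) $y_{it} - \bar y_i = \beta_1 (x_{it} - \bar x_i) + (u_{it} - \bar u_i)$
or
(18) $\ddot y_{it} = \beta_1 \ddot x_{it} + \ddot u_{it}$,
where, e.g., $\ddot y_{it} = y_{it} - \bar y_i$ is the time-demeaned data on y.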
This transformation is also called the within
transformation and resulting (OLS) estima-
tors of the regression parameters applied to
(18) are called fixed effect estimators or within
estimators.
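The within transformation is easy to sketch in code. The functions demean_by_unit and within_slope below are hypothetical helpers for the single-regressor model (14), and the tiny data set is made up so that y_it = 2 x_it + a_i holds exactly.

```python
def demean_by_unit(values):
    """values[i][t] minus the time mean of unit i (the within transformation)."""
    return [[v - sum(row) / len(row) for v in row] for row in values]

def within_slope(x, y):
    """OLS slope of time-demeaned y on time-demeaned x, as in equation (18)."""
    xd, yd = demean_by_unit(x), demean_by_unit(y)
    num = sum(a * b for rx, ry in zip(xd, yd) for a, b in zip(rx, ry))
    den = sum(a * a for rx in xd for a in rx)
    return num / den

# Two units, T = 3, built as y_it = 2 * x_it + a_i with a = (10, -5), no noise
x = [[1.0, 2.0, 4.0], [0.0, 3.0, 5.0]]
y = [[12.0, 14.0, 18.0], [-5.0, 1.0, 5.0]]
b_within = within_slope(x, y)
```

No intercept is needed in the regression on demeaned data because every demeaned series has mean zero within each unit; the unit effects a_i = 10 and a_i = −5 drop out and the slope 2 is recovered.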
In the two period case the FD method and FE lead to identical results.
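A quick numerical check of this equivalence for model (14), which has no intercept or period dummy, so the matching FD regression runs through the origin. The data below are made up for illustration only.

```python
# Made-up two-period panel: x[i] = [x_i1, x_i2], y[i] = [y_i1, y_i2]
x = [[1.0, 3.0], [2.0, 2.5], [0.0, 4.0], [5.0, 1.0]]
y = [[2.0, 5.5], [7.0, 8.5], [-1.0, 6.0], [9.0, 2.0]]

# FE (within): regress time-demeaned y on time-demeaned x
num = den = 0.0
for rx, ry in zip(x, y):
    mx, my = sum(rx) / 2, sum(ry) / 2
    for a, b in zip(rx, ry):
        num += (a - mx) * (b - my)
        den += (a - mx) ** 2
b_fe = num / den

# FD: regress the change in y on the change in x (no intercept)
dx = [r[1] - r[0] for r in x]
dy = [r[1] - r[0] for r in y]
b_fd = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
```

With T = 2 each demeaned value is plus or minus half the first difference, so the two ratios coincide. When a period dummy is included in the levels model, the same equality holds with an intercept in the FD regression.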
Remark 1 The slope coefficient β1 estimated from (16) is called the between estimator. Here $v_i = a_i + \bar u_i$ is the error term. The estimator is biased, however, if the unobserved component ai is correlated with x.
Remark 2 When estimating the unobserved effect by the fixed effect (FE) method, it is unfortunately not clear how the goodness-of-fit R-square should be computed. Stata produces three different R-squares: within, between, and total.
2.4 Dummy variable regression
Yet another method is to introduce dummy
variables for the cross section unit (N − 1
dummy variables) and (possibly) for the pe-
riods (T − 1 dummies).
If N and T are large this is not very practical.
Gives the same estimates for the regression
coefficients as the time demeaned method
and the standard errors and major statistics
are the same.
Example 3 Papke (1994), Journal of Public Economics 54, 37–49, studied the effect of the Indiana enterprise zone program on unemployment, years 1980–1988 (Wooldridge's database, file: ezunem.xls). Six zones were designated in 1984 and four more in 1985. Twelve cities did not receive a zone (control group).
An evaluation model of the policy is
(19) log(uclmsit) = θt + β1Dit + ai + uit
where θt indicates a time-varying intercept, uclms is the number of unemployment claims, and Dit = 1 if city i had the zone in year t and zero otherwise.
First-difference (FD) estimates for β1:
. reg d.luclms d82 d83 d84 d85 d86 d87 d88 d.ez
      Source |       SS       df       MS              Number of obs =     176
-------------+------------------------------           F(  8,   167) =   34.50
       Model |  12.8826331     8  1.61032914           Prob > F      =  0.0000
    Residual |  7.79583815   167  .046681666           R-squared     =  0.6230
-------------+------------------------------           Adj R-squared =  0.6049
       Total |  20.6784713   175  .118162693           Root MSE      =  .21606

-----------------------------------------------------
    D.luclms |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
(Year dummy variable estimates results deleted)
          ez |
         D1. |  -.1818775   .0781862    -2.33   0.021
       _cons |  -.3216319    .046064    -6.98   0.000
-----------------------------------------------------
Fixed Effect estimation results

. xtreg luclms d82 d83 d84 d85 d86 d87 d88 ez, fe

R-sq:  within  = 0.8148
       between = 0.0002
       overall = 0.3415

                                        F(8,168)   =  92.36
corr(u_i, Xb) = -0.0040                 Prob > F   = 0.0000

-----------------------------------------------------
      luclms |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
          ez |  -.1044148    .059753    -1.75   0.082
       _cons |   11.53358   .0325925   353.87   0.000
-------------+---------------------------------------
     sigma_u |  .55551522
     sigma_e |  .21619434
         rho |  .86846297   (fraction of variance due to u_i)
-----------------------------------------------------
F test that all u_i=0: F(21, 168) = 59.31    Prob > F = 0.0000
Dummy variable regression:
. reg luclms d82 d83 d84 d85 d86 d87 d88 ///
      c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 ///
      c14 c15 c16 c17 c18 c19 c20 c21 c22 ez

      Source |       SS       df       MS              Number of obs =     198
-------------+------------------------------           F( 29,   168) =   68.35
       Model |  92.6439601    29  3.19461932           Prob > F      =  0.0000
    Residual |  7.85231887   168  .046739993           R-squared     =  0.9219
-------------+------------------------------           Adj R-squared =  0.9084
       Total |  100.496279   197  .510133396           Root MSE      =  .21619

-----------------------------------------------------
      luclms |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
(dummy variable results removed)
          ez |  -.1044148    .059753    -1.75   0.082
       _cons |   11.51534   .0799536   144.03   0.000
-----------------------------------------------------
The results show that the FE and the dummy variable regression estimates of the ez coefficient are exactly the same.
Using the FE results, the coefficient −0.104 implies about a 10.4 percent drop in unemployment claims due to the program. The estimate is significant at the 5% level in one-tailed testing (p ≈ 0.041) but not in two-tailed testing (p = 0.082).
2.5 Fixed effects or first differencing?
If the number of periods is 2 (T = 2) FE and
FD give identical results.
When T ≥ 3 the FE and FD are not the same.
Both are unbiased under assumptions FE.1–
FE.4∗
Both are consistent under assumptions FE.1–
FE.4 for fixed T as n→∞.
∗Assumptions:
FE.1: For each i, the model is $y_{it} = \beta_1 x_{it1} + \cdots + \beta_k x_{itk} + a_i + u_{it}$, t = 1, . . . , T.
FE.2: We have a random sample from the cross section.
FE.3: Each explanatory variable changes over time, and the explanatory variables are not perfectly collinear.
FE.4: $E[u_{it} | X_i, a_i] = 0$ for all time periods ($X_i$ stands for all explanatory variables).
FE.5: $\mathrm{Var}[u_{it} | X_i, a_i] = \sigma_u^2$ for all t = 1, . . . , T.
FE.6: $\mathrm{Cov}[u_{it}, u_{is}] = 0$ for all t ≠ s.
FE.7: $u_{it} | X_i, a_i \sim \mathrm{NID}(0, \sigma_u^2)$.
If uit is serially uncorrelated, FE is more ef-
ficient than FD (because of this FE is more
popular).
If uit is (highly) serially correlated, ∆uit may
be less serially correlated, which may favor
FD over FE. However, typically T is rather
small, such that serial correlation is difficult
to observe.
In sum, there are no clear-cut guidelines for choosing between these two. Thus, good advice is to compute both and, if there is a big difference, try to determine why they differ.
2.6 Balanced and unbalanced panels
A data set is called a balanced panel if the same number of time series observations is available for each cross section unit. That is, T is the same for all individuals. The total number of observations in a balanced panel is nT.
All the above examples are balanced panel data sets.
If some cross section units have missing observations, which implies that for individual i there are Ti time period observations available, i = 1, . . . , n, with Ti ≠ Tj for some i and j, we call the data set an unbalanced panel. The total number of observations in an unbalanced panel is T1 + · · · + Tn.
In most cases unbalanced panels do not cause major problems for fixed effect estimation.
Modern software packages make appropriate adjustments to estimation results.
2.7 Random effects models
Consider the simple unobserved effects model
(20) yit = β0 + β1xit + ai + uit,
i = 1, . . . , n, t = 1, . . . , T .
Typically time dummies are also included in (20).
Using FD or FE eliminates the unobserved
component ai.
However, if ai is uncorrelated with xit using
random effect (RE) estimation can lead to
more efficient estimation of the regression
parameters.
Generally, we call the model in equation (20)
the random effects model if ai is uncorre-
lated with all explanatory variables, i.e.,
(21) Cov[xit, ai] = 0, t = 1, . . . , T .
How to estimate β1 efficiently?
If (21) holds, β1 can be estimated consis-
tently from a single cross section.
Obviously this discards lots of useful infor-
mation.
If the data set is simply pooled and the error
term is denoted as vit = ai + uit, we have the
regression
(22) yit = β0 + β1xit + vit.
Then
(23) $\mathrm{Corr}[v_{it}, v_{is}] = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}$
for t ≠ s, where $\sigma_a^2 = \mathrm{Var}[a_i]$ and $\sigma_u^2 = \mathrm{Var}[u_{it}]$.
That is, the error terms vit are (positively)
autocorrelated, which biases the standard er-
rors of the OLS β1.
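As a small illustration of (23), the function below (a hypothetical helper, not a library routine) computes the correlation from the two standard deviations. The input values are the sigma_u and sigma_e lines that Stata reports in the Example 3 fixed effects output; in Stata's labelling sigma_u is the standard deviation of the unit effect (our σa) and sigma_e that of the idiosyncratic error (our σu).

```python
def error_correlation(sigma_a, sigma_u):
    """Corr[v_it, v_is] for t != s, from equation (23)."""
    return sigma_a ** 2 / (sigma_a ** 2 + sigma_u ** 2)

# Stata's sigma_u (unit effect) and sigma_e (idiosyncratic) from Example 3
rho = error_correlation(0.55551522, 0.21619434)
```

The result agrees with the rho line of that output (about 0.868), which Stata describes as the fraction of variance due to the unit effect.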
If $\sigma_a^2$ and $\sigma_u^2$ were known, optimal estimators (BLUE) would be obtained by generalized least squares (GLS), which in this case reduces to estimating the regression slope coefficients from the quasi-demeaned equation
(24) $y_{it} - \lambda \bar y_i = \beta_0(1 - \lambda) + \beta_1(x_{it} - \lambda \bar x_i) + (v_{it} - \lambda \bar v_i)$,
where
(25) $\lambda = 1 - \left(\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}\right)^{1/2}$.
In practice $\sigma_u^2$ and $\sigma_a^2$ are unknown, but they can be estimated.
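The quasi-demeaning in (24) and (25) can be sketched as follows. The function names re_lambda and quasi_demean are illustrative assumptions, and the variance inputs are made-up numbers chosen to show the two limiting cases.

```python
import math

def re_lambda(sigma_a2, sigma_u2, T):
    """lambda from equation (25), given the two variances and T."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

def quasi_demean(row, lam):
    """Apply y_it - lam * ybar_i to one unit's series, as in (24)."""
    mean = sum(row) / len(row)
    return [v - lam * mean for v in row]

lam_no_effect = re_lambda(0.0, 1.0, 8)     # sigma_a^2 = 0: lambda = 0
lam_big_effect = re_lambda(1e12, 1.0, 8)   # sigma_a^2 dominant: lambda near 1

unchanged = quasi_demean([1.0, 2.0, 3.0], 0.0)   # pooled OLS data
demeaned = quasi_demean([1.0, 2.0, 3.0], 1.0)    # fully time-demeaned data
```

With λ = 0 the data are left as they are (pooled OLS), and with λ = 1 they are fully time-demeaned (the within transformation), so the RE transformation lies between the two.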
One method is to estimate (22) from the pooled data set and use the OLS residuals $\hat v_{it}$ to estimate $\sigma_a^2$ and $\sigma_u^2$ and plug them into (25).
The resulting GLS estimators for the regression slope coefficients are called random effects estimators (RE estimators).
Under the random effects assumptions∗ the estimators are consistent, but not unbiased.
They are also asymptotically normal as n → ∞ for fixed T.
However, with small n and large T the properties of the RE estimator are largely unknown.
∗The ideal random effects assumptions include FE.1, FE.2, FE.4–FE.6.
FE.3 is replaced with
RE.3: There are no perfect linear relationships among the explanatory variables.
RE.4: In addition to FE.4, E[ai|Xi] = 0.
Note that λ = 0 in (24) gives the pooled regression, whereas λ = 1 gives the FE (within) estimator.
RE estimation is available in modern statis-
tical packages with different options.
Example 4 Data set wagepan.xls (Wooldridge): n = 545, T = 8.
Is there a wage premium for belonging to a labor union?
log(wageit) = β0 + β1educit + β2experit + β3exper2it + β4marriedit + β5unionit + ai + uit
Year dummies for 1981–1987 are included (1980 is the base year).
Note that including a full set of year dummies implies that one cannot estimate with the FE method the effects of variables that change by a constant amount over time. Experience (exper) is such a variable.
-------------------------------------------------
  lwage |     Pooled       Random        Fixed
        |        OLS      Effects      Effects
--------+----------------------------------------
   educ |   .0989945     .0906150           ..
        | (.0046227)   (.0105807)
  exper |   .0861696     .1027934           ..
        | (.0101415)   (.0153853)
 exper2 |  -.0027349    -.0046859    -.0051855
        | (.0007099)   (.0006896)   (.0007044)
married |   .1230113     .0678821     .0466804
        | (.0155714)   (.0167369)   (.0183104)
  union |   .1685243     .1031103     .0800019
        | (.0170652)   (.0178388)   (.0193103)
-------------------------------------------------
It is notable that OLS standard errors tend to be
smaller than in the RE or FE cases.
OLS standard errors underestimate the true standard
errors.
OLS coefficient estimates also suffer from the omitted variable problem, which panel estimation accounts for.
The Stata estimate of the correlation in (23) is .464.
Random effects or fixed effects
FE is widely considered preferable because it
allows correlation between ai and x variables.
Given that the common effects, aggregated into ai, are not correlated with the x variables, an obvious advantage of RE is that it also allows estimation of the effects of factors that do not change over time (like education in the above example).
Typically the condition that the common effects ai are not correlated with the regressors (x-variables) should be considered an exception rather than the rule, which favors FE.
Hausman specification test
Hausman (1978) devised a test for the orthogonality of the common effects (ai) and the regressors.
The test compares the fixed effect (OLS)
and random effect (GLS) estimates utilizing
the Wald testing approach.
The basic idea of the test relies on the fact
that under the null hypothesis of orthogonal-
ity both OLS and GLS are consistent, while
under the alternative hypothesis GLS is not
consistent.
Thus, under the null hypothesis OLS and
GLS estimates should not differ much from
each other.
The test compares these estimates with Wald
statistic.
In Stata, performing the Hausman test requires that both the OLS (FE) and GLS (RE) regression results are stored with estimates store, so that they are available to the postestimation hausman command.
Example 5 Applying the Hausman test to the case of Example 4 in Stata yields:

* Estimate fixed effects
xtreg lwage y81 y82 y83 y84 y85 y86 y87 exper2 married union, fe
* store the results into "hfixed"
estimates store hfixed
* Estimate the random effects model
xtreg lwage y81 y82 y83 y84 y85 y86 y87 educ exper exper2 married union, re
* store the results into "hrandom"
estimates store hrandom
* Hausman test
hausman hfixed hrandom

                 ---- Coefficients ----
        |      (b)          (B)          (b-B)     sqrt(diag(V_b-V_B))
        |     hfixed      hrandom     Difference          S.E.
--------+---------------------------------------------------------
    y81 |   .1511912     .0427498     .1084414            .
    y82 |   .2529709      .035577     .2173939            .
    y83 |   .3544437     .0270943     .3273494            .
    y84 |   .4901148      .052207     .4379078            .
    y85 |   .6174822     .0690524     .5484299            .
    y86 |   .7654965     .1053229     .6601736            .
    y87 |   .9250249     .1505464     .7744785            .
 exper2 |  -.0051855    -.0046859    -.0004996         .000144
married |   .0466804     .0678821    -.0212017        .0074261
  union |   .0800019     .1031103    -.0231085        .0073935
-------------------------------------------------------------------
    b = consistent under Ho and Ha; obtained from xtreg
    B = inconsistent under Ha, efficient under Ho; obtained from xtreg

Test: Ho: difference in coefficients not systematic

    chi2(10) = (b-B)'[(V_b-V_B)^(-1)](b-B)
             = 26.77
    Prob>chi2 = 0.0028
    (V_b-V_B is not positive definite)
The test rejects the orthogonality condition. Thus, FE should be used.
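To see how the Wald comparison works, here is a simplified one-coefficient sketch using the union row of the Stata table (hfixed, hrandom, and the reported S.E. of the difference). It is not the joint chi2(10) statistic that Stata reports; hausman_single is a hypothetical helper, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math

b_fe, b_re = 0.0800019, 0.1031103   # union coefficient: hfixed, hrandom
se_diff = 0.0073935                 # sqrt(diag(V_b - V_B)) for union

def hausman_single(b, B, se):
    """One-coefficient Hausman statistic and its chi2(1) p-value."""
    h = ((b - B) / se) ** 2
    p = math.erfc(math.sqrt(h / 2.0))   # P(chi2(1) > h)
    return h, p

h_union, p_union = hausman_single(b_fe, b_re, se_diff)
```

The statistic for union alone is a little under 10 with a p-value well below 0.01, which points in the same direction as the joint test: the FE and RE estimates differ by more than sampling error would explain.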
In Eviews the Hausman test is obtained by first estimating the model as a random effect model and then selecting
View > Fixed/Random Effect Testing > Correlated Random Effects - Hausman Test
Policy analysis with panel data
Panel data is useful for policy analysis, in par-
ticular, program evaluation.
Example 6 Continue Example 1.2, where the effect of a training program on worker productivity was evaluated.
The data include three years, 1987, 1988, and 1989.
The training program was first implemented in 1988.
We focus on the years 1987 (no program) and 1988 (program implemented) to see whether the program benefits firms.
The panel model is
(26) log(scrapit) = β0 + δ0 y88 + β1grantit + ai + uit,
where y88 is the year 1988 dummy (= 1 for year 1988 and = 0 otherwise) and ai includes the unobserved firm effects (worker skill, etc.).
Ignoring panel structure OLS results suggested no im-
provement.
Dependent Variable: LOG(SCRAP)
Method: Panel Least Squares
Sample: 1 471 IF YEAR < 1989
Periods included: 2
Cross-sections included: 54
Total panel (balanced) observations: 108
=====================================================
Variable     Coefficient  Std. Error  t-Statistic  Prob.
-----------------------------------------------------
C               0.523144    0.159783     3.274086  0.0014
GRANT          -0.058018    0.380949    -0.152299  0.8792
-----------------------------------------------------
R-squared             0.000219
Adjusted R-squared   -0.009213
S.E. of regression    1.507393
F-statistic           0.023195
Prob(F-statistic)     0.879241
=====================================================
The coefficient for grant is not statistically significant,
suggesting that the program does not help in reducing
the scrap rate.
Accounting for the possible firm effects and impos-
ing also the year dummy to account for possible time
effect, yields
=====================================================
Variable     Coefficient  Std. Error  t-Statistic  Prob.
-----------------------------------------------------
C               0.568716    0.048603     11.70126  0.0000
GRANT          -0.317058    0.163875    -1.934753  0.0585
-----------------------------------------------------
Effects Specification
Cross-section fixed (dummy variables)
Period fixed (dummy variables)
R-squared             0.964308
Adjusted R-squared    0.926556
S.E. of regression    0.406642
F-statistic           25.54364
Prob(F-statistic)     0.000000
The estimate of the coefficient for grant is negative and close to statistically significant in two-sided testing (p = 0.0585). In one-sided testing (the program improves), i.e., against the alternative
H1 : β1 < 0
it is significant at the 5% level with p-value 0.0265.
According to the estimate, participating in the program decreases the scrap rate on average by about 32% (more accurately 27%, since exp(−.317058) − 1 ≈ −0.272).
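The conversion from a coefficient in a log-dependent-variable model to an exact percentage change can be written as a one-line helper. The name pct_effect is an illustrative assumption; the inputs are the GRANT coefficient above and the ez coefficient from Example 3.

```python
import math

def pct_effect(coef):
    """Exact percentage change in y implied by a dummy coefficient in a log(y) model."""
    return 100.0 * (math.exp(coef) - 1.0)

grant_effect = pct_effect(-0.317058)   # scrap rate, Example 6
ez_effect = pct_effect(-0.1044148)     # unemployment claims, Example 3
```

For small coefficients the exact change and 100 times the coefficient are close (about −9.9% versus −10.4% for ez), while for the larger grant coefficient the gap is noticeable (about −27% versus −32%).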
Dynamic Panel Models
Many economic relationships are dynamic.
These may be characterized by the presence
of lagged dependent variables
(27) yit = δyi,t−1 + x′itβ + vit,
where
(28) vit = ai + uit
with $a_i \sim \mathrm{iid}(0, \sigma_a^2)$ and $u_{it} \sim \mathrm{iid}(0, \sigma_u^2)$ independent of each other, i = 1, . . . , n, t = 1, . . . , T.
Alternatively the one-way error component
model in (28) can be a two-way specification
such that
(29) vit = ai + bt + uit,
where all the components are assumed again
independent.
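After differencing (27) with the one-way error (28), the individual effect ai drops out and we have
(30) ∆yit = δ∆yi,t−1 + ∆x′itβ + ∆uit.
The lagged term yi,t−1, which enters the regressor ∆yi,t−1, is correlated with ui,t−1, which enters ∆uit. This causes problems in estimation.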
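The source of the problem can be spelled out in a short derivation. This is a sketch under the added assumptions that the u_it are serially uncorrelated with variance σ_u² and uncorrelated with a_i, with the x variables, and with earlier values of y:

```latex
\begin{align*}
\Delta y_{i,t-1} &= y_{i,t-1} - y_{i,t-2}, \qquad
\Delta u_{it} = u_{it} - u_{i,t-1}, \\
\mathrm{Cov}(\Delta y_{i,t-1}, \Delta u_{it})
  &= -\,\mathrm{Cov}(y_{i,t-1}, u_{i,t-1}) = -\sigma_u^2 \neq 0, \\
\mathrm{Cov}(y_{i,t-2}, \Delta u_{it}) &= 0 .
\end{align*}
```

The second line shows why OLS on the differenced equation is inconsistent. The third line shows why, under these assumptions, the level y_{i,t−2} (and earlier levels) can serve as an instrument for ∆y_{i,t−1}.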
Once regressor variables are correlated with
the error term, OLS or GLS estimators be-
come inconsistent.
A typical solution to the problem is to apply
some kind of instrumental variable estima-
tion.
These are least squares (LS) or some other
type of methods, where instrumental vari-
ables are utilized to remove the inconsistency
due to the error term correlation with the re-
gressors.
A variable is suitable as an instrumental variable if it is not correlated with the error term, but is correlated with the regressors.
Thus, those regressors that are not corre-
lated with the error term can be used also as
instruments.
Example 7 2SLS (two-stage least squares).
Consider a standard regression model
(31) $y_i = x_i'\beta + u_i$,
where xi is a k-vector of regressors (including the constant term) with Cov(xi, ui) ≠ 0, i = 1, . . . , n.
Suppose we have m ≥ k additional variables in zi (an m-vector) such that Cov(zi, ui) = 0 but Cov(zi, xi) ≠ 0.
The 2SLS solution to the problem is to first (first stage) use OLS to regress the x-variables on the z-variables.
In the second stage replace the original regressors xi by the predicted variables $\hat x_i$ from the first stage, and estimate β from the regression
(32) $y_i = \hat x_i'\beta + u_i$.
The estimator
(33) $\hat\beta_{2SLS} = (\hat X'\hat X)^{-1}\hat X'y$
is called the 2SLS estimator of β.
In particular, if m = k then (33) becomes
(34) $\hat\beta_{IV} = (Z'X)^{-1}Z'y$,
which is called the Instrumental Variable estimator of β.
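For one endogenous regressor and one instrument (plus a constant), the slope part of (34) reduces to the ratio of two sample covariances. The simulation below is a sketch under assumed values (β = 1.5, and a common shock e that makes x correlated with u); cov and all variable names are illustrative assumptions.

```python
import random

def cov(a, b):
    """Sample covariance (divisor n) of two equally long lists."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    return sum((p - abar) * (q - bbar) for p, q in zip(a, b)) / n

random.seed(2)
n = 10000
beta = 1.5                                        # assumed true slope
z = [random.gauss(0, 1) for _ in range(n)]        # instrument
e = [random.gauss(0, 1) for _ in range(n)]        # shock shared by x and u
x = [zi + ei for zi, ei in zip(z, e)]             # Cov(x, u) != 0
u = [ei + random.gauss(0, 1) for ei in e]         # Cov(z, u) = 0
y = [1.0 + beta * xi + ui for xi, ui in zip(x, u)]

b_ols = cov(x, y) / cov(x, x)    # inconsistent: picks up Cov(x, u)
b_iv = cov(z, y) / cov(z, x)     # slope element of (Z'X)^{-1} Z'y
```

In this design OLS converges to β + Cov(x, u)/Var(x) = 1.5 + 1/2 = 2.0, while the IV ratio stays close to 1.5 because z is uncorrelated with u.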
Example 8 (Data: http://eu.wiley.com/college/baltagi/ > Student companion site > datasets)
Demand for cigarettes in 46 US States [annual data, 1963–1992]. Estimated equation
(35) $c_{it} = \alpha + \beta_1 c_{i,t-1} + \beta_2 p_{it} + \beta_3 y_{it} + \beta_4 pn_{it} + v_{it}$,
where
(36) $v_{it} = a_i + b_t + u_{it}$,
ai and bt are fixed effects, $u_{it} \sim \mathrm{NID}(0, \sigma_u^2)$, and all the observable variables are in logarithms:
cit = real per capita sales of cigarettes by persons of smoking age (14 and older)
pit = real average retail price of a pack of cigarettes
yit = real per capita disposable income
pnit = the minimum real price of cigarettes in any neighboring state (proxy for casual smuggling effect across state borders)
ci,t−1 is very likely correlated with uit.
For reference purposes, estimating with panel OLS
(average of within group regressions with time dum-
mies) yields
Fixed-effects (within) regression               Number of obs      =   1334
Group variable: state                           Number of groups   =     46

R-sq:  within  = 0.9283                         Obs per group: min =     29
       between = 0.9859                                        avg =   29.0
       overall = 0.9657                                        max =     29

                                                F(32,1256)         = 508.07
corr(u_i, Xb)  = 0.4743                         Prob > F           = 0.0000

-----------------------------------------------------
          lc |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
          lc |
         L1. |   .8302514   .0126242    65.77   0.000
             |
          lp |  -.2916822   .0230847   -12.64   0.000
          ly |   .1068698   .0233417     4.58   0.000
         lpn |   .0354559     .02656     1.33   0.182
       _cons |   .8204374   .2228775     3.68   0.000
-------------+---------------------------------------
     sigma_u |  .02738301
     sigma_e |  .03504776
         rho |  .37905103   (fraction of variance due to u_i)
-----------------------------------------------------
F test that all u_i=0: F(45, 1256) = 4.52       Prob > F = 0.0000
Several methods have been proposed for estimation when there is potential correlation between the error term and (some) regressors.
GMM (Generalized Method of Moments) estimation has lately gained much popularity, in particular when there are non-linear moment restrictions.
Stata has the xtdpd procedure, which produces the Arellano and Bond or the Arellano-Bover/Blundell-Bond estimator. These are GMM estimators where the instruments are defined in a particular way (the idea will be discussed in the classroom).
xtdpd l(0/1).lc lp ly lpn y66-y92, div(lp ly lpn y66-y92) dgmmiv(lc)

Dynamic panel-data estimation                Number of obs      =      1334
Group variable: state                        Number of groups   =        46
Time variable: year
                                             Obs per group: min =        29
                                                            avg =        29
                                                            max =        29

Number of instruments = 437                  Wald chi2(31)      =  13273.45
                                             Prob > chi2        =    0.0000
One-step results
-----------------------------------------------------
          lc |      Coef.   Std. Err.      z    P>|z|
-------------+---------------------------------------
          lc |
         L1. |   .8201729   .0161446    50.80   0.000
             |
          lp |  -.3607549   .0311244   -11.59   0.000
          ly |   .1871102   .0334027     5.60   0.000
         lpn |  -.0215713   .0399233    -0.54   0.589
-----------------------------------------------------
Instruments for differenced equation
        GMM-type: L(2/.).lc
        Standard: D.lp D.ly D.lpn D.y66 D.y67 D.y68
                  D.y69 D.y70 D.y71 D.y72 D.y73 D.y74 D.y75
                  D.y76 D.y77 D.y78 D.y79 D.y80 D.y81 D.y82
                  D.y83 D.y84 D.y85 D.y86 D.y87 D.y88 D.y89
                  D.y90 D.y91 D.y92
Instruments for level equation
        Standard: _cons
Test for the orthogonality conditions of the instru-
ments
Sargan test of overidentifying restrictions
        H0: overidentifying restrictions are valid

        chi2(405)    =  561.5047
        Prob > chi2  =    0.0000
The orthogonality conditions are rejected.
The reason may be that the errors are MA(1), which implies that the GMM instruments (lct−2, . . .) are correlated with the error term.
One can try to fix this by starting the instruments from t − 3 with the command · · · dgmmiv(lc, lagrange(3 .)).
Doing this improved the situation slightly but still led to rejection of the orthogonality conditions.
However, we do not continue the analysis further here.