
2 Panel Data∗

Data sets that combine time series and cross-sectional data are common in economics.

An independently pooled cross section is obtained by randomly sampling a large population at different points in time (e.g., yearly).

Important feature: the data set consists of independently sampled observations.

Pooling allows us to investigate the effect of time, e.g., whether relationships have changed.

It typically raises only minor statistical complications.

∗Version: Jan 19, 2012


In a panel data set (longitudinal data), a sample of the same individuals, families, firms, cities, . . . is followed across time.

E.g., OECD statistics contain numerous series observed yearly for several countries.

Similarly, time series data on several firms, industries, etc., are data of this type.


2.1 Pooling independent cross sections across time

Example 1 Women's fertility over time: Data from the General Social Survey contain samples collected in even years from 1972 to 1984.

Model for explaining the total number of children born to a woman.

Data is available on the course web site (password protected).


* read data
. insheet using "fertil1.csv", comma clear
* describe data
. des

Contains data
  obs:         1,129
 vars:            14
 size:        24,838 (99.9% of memory free)
------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------
year            byte   %8.0g
educ            byte   %8.0g
meduc           byte   %8.0g
feduc           byte   %8.0g
age             byte   %8.0g
kids            byte   %8.0g
black           byte   %8.0g
east            byte   %8.0g
northcen        byte   %8.0g
west            byte   %8.0g
farm            byte   %8.0g
othrural        byte   %8.0g
town            byte   %8.0g
smcity          byte   %8.0g


. tabstat kids, statistics( mean count ) by(year) columns(statistics)

Summary for variables: kids
     by categories of: year

    year |      mean         N
---------+--------------------
      72 |       3.0       156
      74 |       3.2       173
      76 |       2.8       152
      78 |       2.8       143
      80 |       2.8       142
      82 |       2.4       186
      84 |       2.2       177
---------+--------------------
   Total |       2.7      1129
------------------------------

[Figure: Number of children per woman. Vertical axis: number of children (2 to 4); horizontal axis: year (70 to 86).]

The average number of children per woman has clearly declined over the years.


The analysis can be refined substantially with regression analysis.

After controlling for other factors (education, age, etc.), what has happened to the fertility rate?

Build a regression with year dummies: y74 for 1974,

· · · , y84 for year 1984.

Year 1972 is the base year.


. reg kids educ age age2 black east northcen west farm ///
      y74 y76 y78 y80 y82 y84

      Source |       SS       df       MS         Number of obs =    1129
-------------+------------------------------      F( 14,  1114) =   11.51
       Model |  389.777313    14  27.8412367      Prob > F      =  0.0000
    Residual |  2695.73199  1114  2.41986713      R-squared     =  0.1263
-------------+------------------------------      Adj R-squared =  0.1153
       Total |   3085.5093  1128  2.73538059      Root MSE      =  1.5556

-----------------------------------------------------
        kids |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
        educ | -.1242409   .0181486    -6.85   0.000
         age |  .5381453   .1384005     3.89   0.000
        age2 | -.0058679   .0015645    -3.75   0.000
       black |  1.083783   .1734035     6.25   0.000
        east |  .2276015   .1312518     1.73   0.083
    northcen |  .3713906   .1199679     3.10   0.002
        west |  .2188689   .1663522     1.32   0.189
        farm | -.0918808    .122027    -0.75   0.452
         y74 |  .2586277   .1727165     1.50   0.135
         y76 | -.1012358   .1787317    -0.57   0.571
         y78 | -.0671507   .1814491    -0.37   0.711
         y80 | -.0751199   .1827069    -0.41   0.681
         y82 | -.5323518   .1723385    -3.09   0.002
         y84 | -.5383952    .174472    -3.09   0.002
       _cons | -7.894707    3.05159    -2.59   0.010
-----------------------------------------------------


Fertility drops sharply in the early 1980s (the other year dummies are not statistically significant).

E.g., the coefficient on y82 indicates that, holding other factors fixed (educ, age, and others), there were about 53 fewer children per 100 women in 1982 than in 1972.

In particular, since education is controlled for, this decline is separate from the decline due to the increase in education.

Women with more education have fewer children (the coefficient −0.12 is highly statistically significant with t = −6.85 and p-value < 0.0005).

Other things equal, a woman with a college education is predicted to have 4 × 0.124 = 0.496 fewer children than a woman with only a high school education, i.e., about 50 fewer children per 100 women.


In summary, problems involving pooled cross section data (independent samples) can be analyzed using time dummy variables.


2.2 Two-period panel data analysis

From each individual (people, firms, schools, cities, countries, etc.) data is collected at two time points, t = 1 and t = 2.

In the usual regression, one major source of bias is omitted (important) variables.

For example, if the true model is

(1) yi = β0 + β1xi + β2zi + ui,

but we estimate

(2) yi = β0 + β1xi + vi,

where

(3) vi = β2zi + ui,

the bias of the OLS estimator $\hat\beta_1$ from model (2) is

(4) $E[\hat\beta_1] - \beta_1 = \beta_2 \, \dfrac{\sum_{i=1}^{n} (x_i - \bar{x}) z_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$,

which can be substantial if x and z are correlated and β2 is large.
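The bias formula (4) can be checked numerically. The following is a minimal sketch in Python on simulated data; the sample size, the parameter values and the helper `slope` are illustrative assumptions, not part of the course material. It relies on the sample identity that the slope from the short regression (2) equals β1 plus β2 times the slope from regressing z on x, plus a noise term with zero expectation.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx = sum(x) / len(x)
    sxy = sum((xi - mx) * yi for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

random.seed(1)
n = 5000
beta0, beta1, beta2 = 1.0, 2.0, 1.5                 # hypothetical true parameters
z = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + random.gauss(0, 1) for zi in z]     # x is correlated with z
u = [random.gauss(0, 1) for _ in range(n)]
y = [beta0 + beta1 * xi + beta2 * zi + ui for xi, zi, ui in zip(x, z, u)]

b1_short = slope(x, y)               # estimate from model (2), z omitted
bias_formula = beta2 * slope(x, z)   # right-hand side of (4)
noise_term = slope(x, u)             # sampling noise, zero in expectation

print(round(b1_short, 3), round(beta1 + bias_formula + noise_term, 3))
```

The two printed numbers coincide up to rounding, and the estimate lies well above the true β1 = 2 because x and z are positively correlated and β2 > 0.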


Use of panel data makes it possible to elimi-

nate the omitted variable bias in certain cases.

Suppose that we have the following situation

in terms of model (1)

(5) yit = β0 + β1xit + β2zi + uit,

where i refers to individual i and t to time

point t.

Thus, we have panel data where data is col-

lected from each individual i at different time

points t (in the two period case, t = 1,2).

Note that in (5) zi does not have the time

index, which implies that variable z is time in-

variant (or at least changing very slowly with

time).


Suppose we have from each of the n individ-

uals observations on yit and xit at time points

t = 1 and t = 2, thus altogether 2n observa-

tions.

However, we do not observe zi.

Suppose further that we allow the possibility

that intercept β0 may be different at different

time points, such that (5) can be written as

(6) yit = β0 + δ0Dt + β1xit + β2zi + uit,

where Dt = 0 for t = 1 and Dt = 1 for t = 2

(time dummy).


Then taking differences of the form

∆yi = yi2 − yi1,

the model in (6) becomes

(7) ∆yi = δ0 + β1∆xi + ∆ui,

i.e., the (unobserved) omitted variable disappears, and the OLS estimator of the slope parameter β1 is unbiased, provided that ∆ui is uncorrelated with ∆xi.
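A small simulation illustrates the effect of differencing. This is a sketch under assumed parameter values (all numbers and the helper `slope` are hypothetical): the unobserved z is time invariant and correlated with x, so a single cross section gives a biased slope, while the first-differenced regression (7) recovers β1.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx = sum(x) / len(x)
    sxy = sum((xi - mx) * yi for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

random.seed(2)
n = 4000
beta1, beta2, delta0 = 2.0, 1.5, 0.5        # hypothetical true parameters
x1, x2, y1, y2 = [], [], [], []
for _ in range(n):
    z = random.gauss(0, 1)                  # unobserved, time invariant
    xa = 0.8 * z + random.gauss(0, 1)       # x at t = 1, correlated with z
    xb = 0.8 * z + random.gauss(0, 1)       # x at t = 2, correlated with z
    x1.append(xa)
    x2.append(xb)
    y1.append(1.0 + beta1 * xa + beta2 * z + random.gauss(0, 1))
    y2.append(1.0 + delta0 + beta1 * xb + beta2 * z + random.gauss(0, 1))

b_cross = slope(x2, y2)                     # cross section at t = 2 only
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
b_fd = slope(dx, dy)                        # first-differenced estimator from (7)

print(round(b_cross, 2), round(b_fd, 2))
```

The intercept of the differenced regression estimates δ0; it is absorbed by the centering inside `slope`.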


The above generalizes immediately such that

if we denote

(8) ai = z′iγ = γ1zi1 + γ2zi2 + · · ·+ γqziq

and enhance (6) to

(9) yit = β0 + δ0Dt + β1xit + ai + uit,

taking differences again yields the estimation model (7).

The above model is called the fixed effect (FE) model

in which ai is fixed over the time periods (ai can be a

random variable, and can correlate with the explana-

tory variable xit).

If ai is not correlated with the explanatory variables, the model is called the random effects (RE) model. It is estimated with different techniques that yield more efficient estimators of the β-parameters than the fixed effects methods (which are basically OLS methods). We will return to the RE model later.


In the FE case, the estimators of the regression parameters obtained by applying OLS to the first-differenced equation are called the first-differenced estimators (FD estimators).

We will deal with other fixed effect estimators later.

In summary, differencing eliminates all unob-

served time invariant factors from the model.

A major pitfall is that differencing also wipes

out observed time invariant variables (like gen-

der) from the model!

Thus, this method cannot be used in such

cases (if we want to estimate these effects),

or in cases where the explanatory variables

change very slowly across time (the differ-

ence is nearly zero).


In many cases the FD method is useful, however.

The following example highlights the biasing effect of unobserved factors and how panel estimation with the simple FD method likely solves the problem.

Example 2 Data set crime2.xls (Wooldridge) contains data on crime and unemployment rates for 46 US cities for 1982 (t = 1) and 1987 (t = 2).

Running a simple cross section regression of crmrte on unem using only 1987 yields

. regress crmrte unem if year==87




-----------------------------------------------------
      crmrte |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
        unem | -4.161134   3.416456    -1.22   0.230
       _cons |  128.3781   20.75663     6.18   0.000
-----------------------------------------------------


The coefficient of unem is negative, −4.16!

However, it is not statistically significant.

The regression likely suffers from an omitted variables problem (age distribution, gender distribution, education levels, . . .).

Most of these can be expected to be fairly stable across time. Thus, use of panel data techniques may be helpful.

Before proceeding to the panel data estimation, let us see what happens if we simply pool the two years and estimate

(10) crmrte = β0 + δ0D87 + β1unem + u,

where D87 is the year 1987 dummy.


. regress crmrte d87 unem



-------------+------------------------------      Adj R-squared = -0.0100
       Total |  81045.5037    91  890.609931      Root MSE      =  29.992

-----------------------------------------------------
      crmrte |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
         d87 |  7.940413   7.975324     1.00   0.322
        unem |  .4265461   1.188279     0.36   0.720
       _cons |  93.42026   12.73947     7.33   0.000
-----------------------------------------------------

The situation does not change much qualitatively!


For example, Stata has very sophisticated

panel data procedures.

We discuss some of them later.

The FD method can be applied with the regress routine after first declaring the data to be panel data with the xtset command (Menu: Statistics > Longitudinal/panel data > Setup and utilities > Declare data set to be panel data).

In Eviews: Proc > Structure/Resize Current

Page. . ., and follow the instructions.

In SAS: proc panel data = crime2; id state year; model crmrte = unemp; run; Before applying proc panel the data must be sorted with proc sort.

Whichever software is used, identifiers for the individuals (in particular) are needed to indicate the multiple measurements on an individual.

After declaring the panel structure to the program, the model

(11) ∆crmrte = δ0 + β1∆unem + ∆u

can be estimated with the FD method, e.g., in Stata as (d.crmrte means crmrte87 − crmrte82):

. reg d.crmrte d.unem




-----------------------------------------------------
    D.crmrte |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
        unem |
         D1. |  2.217996   .8778657     2.53   0.015
             |
       _cons |  15.40219   4.702116     3.28   0.002
-----------------------------------------------------


In Eviews, after the data has been restructured as panel data, the FD estimation can be carried out using Quick > Estimate Equation. . . to open the Equation Estimation window and entering d(crmrte) c d(unem), which gives results similar to those above.

The coefficient estimate β1 ≈ 2.22 is now statistically significant (p-value 0.015) and of the expected sign.

The model predicts that a one percentage point increase in the unemployment rate increases crime by about 2.2 crimes per 1,000 people.

The constant term indicates that even if the change in the unemployment rate were zero, the crime rate generally increased during the period from 1982 to 1987 by about 15.4 crimes per 1,000 people.

Note that the time dummy component δ0 in (11) captures all unobserved time effects that are common to all cross-sectional individuals.

That is, we can consider δ0 to represent

δ0 = z′tδ = δ1z1t + δ2z2t + · · ·+ δpzpt,

where zt’s are common trend components affecting all

individual crime rates with same intensity.


2.3 More than two time periods

Differencing can be used with more than two

time periods to work out fixed effect estima-

tion.

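As an example consider a three period model

(12) $y_{it} = \delta_1 + \delta_2 D2_t + \delta_3 D3_t + \beta_1 x_{it1} + \cdots + \beta_k x_{itk} + a_i + u_{it}$

for t = 1, 2, 3, where D2t = 1 for period t = 2 and zero otherwise, and D3t = 1 for t = 3 and zero otherwise.

Differencing yields

(13) $\Delta y_{it} = \delta_2 \Delta D2_t + \delta_3 \Delta D3_t + \beta_1 \Delta x_{it1} + \cdots + \beta_k \Delta x_{itk} + \Delta u_{it}$,

t = 2, 3. Here ∆D2t = 1 for t = 2 and −1 for t = 3, while ∆D3t = 0 for t = 2 and 1 for t = 3, so the intercept of the differenced equation is δ2 in period 2 and δ3 − δ2 in period 3.

Again, the model is simple to estimate with OLS.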

3.3 Fixed effect method

An alternative method, which works in cer-

tain cases better than the FD-method, is

called the fixed effects method.

Consider the simple case model of

(14) yit = β1xit + ai + uit,

i = 1, . . . , n, t = 1, . . . , T .

Thus there are altogether n× T observations.

Define means over the T time periods

(15) $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}, \quad \bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}, \quad \bar{u}_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}.$


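Then

(16) $\bar{y}_i = \beta_1 \bar{x}_i + a_i + \bar{u}_i$.

Note that

$\frac{1}{T}\sum_{t=1}^{T} a_i = \frac{1}{T} T a_i = a_i$.

Thus, subtracting (16) from (14) eliminates ai and gives

(17) $y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i)$

or

(18) $\ddot{y}_{it} = \beta_1 \ddot{x}_{it} + \ddot{u}_{it}$,

where, e.g., $\ddot{y}_{it} = y_{it} - \bar{y}_i$ is the time-demeaned data on y.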

This transformation is also called the within transformation, and the OLS estimators of the regression parameters obtained from (18) are called fixed effect estimators or within estimators.


In the two period case the FD method and FE lead to identical results.
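The equivalence for T = 2 can be verified on simulated data. The sketch below uses hypothetical parameter values and model (14), which has no time effect, so the FD regression is run through the origin; with a period dummy in the FE model the FD regression would include an intercept instead.

```python
import random

random.seed(3)
n, beta1 = 2000, 1.0                       # hypothetical values
panel = []                                 # one list [(x_i1, y_i1), (x_i2, y_i2)] per unit
for _ in range(n):
    a = random.gauss(0, 1)                 # unobserved effect a_i
    obs = []
    for _t in range(2):
        x = 0.5 * a + random.gauss(0, 1)   # x_it is correlated with a_i
        y = beta1 * x + a + random.gauss(0, 1)
        obs.append((x, y))
    panel.append(obs)

# First differences, OLS through the origin
num_fd = sum((o[1][0] - o[0][0]) * (o[1][1] - o[0][1]) for o in panel)
den_fd = sum((o[1][0] - o[0][0]) ** 2 for o in panel)
b_fd = num_fd / den_fd

# Within transformation: subtract each unit's time mean, then OLS
num_fe = den_fe = 0.0
for o in panel:
    xbar = (o[0][0] + o[1][0]) / 2
    ybar = (o[0][1] + o[1][1]) / 2
    for x, y in o:
        num_fe += (x - xbar) * (y - ybar)
        den_fe += (x - xbar) ** 2
b_fe = num_fe / den_fe

print(b_fd, b_fe)
```

With T = 2 each demeaned value is plus or minus half the first difference, which is why the two slopes agree up to floating point error.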

Remark 1 The slope coefficient β1 estimated from (16) is called the between estimator. $v_i = a_i + \bar{u}_i$ is the error term. The estimator is biased, however, if the unobserved component ai is correlated with x.

Remark 2 When estimating the unobserved effect by the fixed effect (FE) method, it is unfortunately not clear how the goodness-of-fit R-square should be computed. Stata produces three different R-squares: within, between, and total.


2.4 Dummy variable regression

Yet another method is to introduce dummy

variables for the cross section unit (N − 1

dummy variables) and (possibly) for the pe-

riods (T − 1 dummies).

If N and T are large, this is not very practical.

The method gives the same estimates of the regression coefficients as the time-demeaned (within) method, and the standard errors and other major statistics are the same.


Example 3 Papke (1994), Journal of Public Economics 54, 37–49, studied the effect of the Indiana enterprise zone program on unemployment, years 1980–1988 (Wooldridge's data base, file: ezunem.xls). Six zones were designated in 1984 and four more in 1985. Twelve cities did not receive a zone (control group).

An evaluation model of the policy is

(19) log(uclmsit) = θt + β1Dit + ai + uit

where θt indicates a time-varying intercept, uclms is the number of unemployment claims, and Dit = 1 if city i had a zone in year t and zero otherwise.

First-difference (FD) estimates for β1:


. reg d.luclms d82 d83 d84 d85 d86 d87 d88 d.ez


       Model |  12.8826331     8  1.61032914      Prob > F      =  0.0000
    Residual |  7.79583815   167  .046681666      R-squared     =  0.6230
-------------+------------------------------      Adj R-squared =  0.6049
       Total |  20.6784713   175  .118162693      Root MSE      =  .21606

-----------------------------------------------------
    D.luclms |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
(Year dummy variable estimates results deleted)
          ez |
         D1. | -.1818775   .0781862    -2.33   0.021
       _cons | -.3216319    .046064    -6.98   0.000
-----------------------------------------------------

Fixed effect estimation results:

. xtreg luclms d82 d83 d84 d85 d86 d87 d88 ez, fe

R-sq:  within  = 0.8148
       between = 0.0002
       overall = 0.3415

                                        F(8,168)  =  92.36
corr(u_i, Xb)  = -0.0040                Prob > F  = 0.0000

-----------------------------------------------------
      luclms |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
          ez | -.1044148    .059753    -1.75   0.082
       _cons |  11.53358   .0325925   353.87   0.000
-------------+---------------------------------------
     sigma_u |  .55551522
     sigma_e |  .21619434
         rho |  .86846297   (fraction of variance due to u_i)
-----------------------------------------------------
F test that all u_i=0:  F(21, 168) = 59.31   Prob > F = 0.0000


Dummy variable regression:

. reg luclms d82 d83 d84 d85 d86 d87 d88 ///
      c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 ///
      c14 c15 c16 c17 c18 c19 c20 c21 c22 ez

       Model |  92.6439601    29  3.19461932      Prob > F      =  0.0000
    Residual |  7.85231887   168  .046739993      R-squared     =  0.9219
-------------+------------------------------      Adj R-squared =  0.9084
       Total |  100.496279   197  .510133396      Root MSE      =  .21619

-----------------------------------------------------
      luclms |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
(dummy variable results removed)
          ez | -.1044148    .059753    -1.75   0.082
       _cons |  11.51534   .0799536   144.03   0.000
-----------------------------------------------------

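The results show that the FE and dummy variable regression estimates of the ez coefficient are exactly the same.

Using the FE results, the coefficient −0.104 implies about a 10.4 percent drop in unemployment claims due to the program (more accurately 9.9 percent, since exp(−0.104) − 1 ≈ −0.099). At the 5% level the estimate is significant in one-tailed testing but not in two-tailed testing (two-sided p-value 0.082).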


2.5 Fixed effects or first differencing?

If the number of periods is 2 (T = 2) FE and

FD give identical results.

When T ≥ 3 the FE and FD are not the same.

Both are unbiased under assumptions FE.1–

FE.4∗

Both are consistent under assumptions FE.1–

FE.4 for fixed T as n→∞.

∗Assumptions:
FE.1: For each i, the model is yit = β1xit1 + · · · + βkxitk + ai + uit, t = 1, . . . , T.
FE.2: We have a random sample from the cross section.
FE.3: Each explanatory variable changes over time, and they are not perfectly collinear.
FE.4: E[uit|Xi, ai] = 0 for all time periods (Xi stands for all explanatory variables).
FE.5: Var[uit|Xi, ai] = $\sigma_u^2$ for all t = 1, . . . , T.
FE.6: Cov[uit, uis] = 0 for all t ≠ s.
FE.7: uit|Xi, ai ∼ NID(0, $\sigma_u^2$).


If uit is serially uncorrelated, FE is more ef-

ficient than FD (because of this FE is more

popular).

If uit is (highly) serially correlated, ∆uit may

be less serially correlated, which may favor

FD over FE. However, typically T is rather

small, such that serial correlation is difficult

to observe.

In sum, there are no clear-cut guidelines for choosing between the two. Thus, good advice is to try both and, if there is a big difference, to determine why they differ.


2.6 Balanced and unbalanced panels

A data set is called a balanced panel if the same number of time series observations is available for each cross section unit. That is, T is the same for all individuals. The total number of observations in a balanced panel is nT.

All the above examples are balanced panel data sets.

If some cross section units have missing observations, so that for individual i there are Ti time period observations, i = 1, . . . , n, with Ti ≠ Tj for some i and j, we call the data set an unbalanced panel. The total number of observations in an unbalanced panel is T1 + · · · + Tn.

In most cases unbalanced panels do not cause major problems for fixed effect estimation.

Modern software packages make appropriate adjustments to estimation results.
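Whether a panel is balanced can be checked by counting the observations Ti per unit. The following is a minimal sketch; the function name `panel_summary` and the toy data are hypothetical.

```python
from collections import Counter

def panel_summary(rows):
    """rows: iterable of (unit_id, period) pairs, one per observation."""
    counts = Counter(unit for unit, _period in rows)   # T_i for each unit i
    sizes = set(counts.values())
    return {
        "n_units": len(counts),
        "n_obs": sum(counts.values()),                 # T_1 + ... + T_n
        "balanced": len(sizes) == 1,                   # same T for every unit
    }

# Hypothetical toy data: unit "C" is missing its 1987 observation
rows = [("A", 1982), ("A", 1987), ("B", 1982), ("B", 1987), ("C", 1982)]
summary = panel_summary(rows)
print(summary)
```

The check follows the count-based definition given above; it does not verify that all units are observed in the same calendar periods.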


2.7 Random effects models

Consider the simple unobserved effects model

(20) yit = β0 + β1xit + ai + uit,

i = 1, . . . , n, t = 1, . . . , T .

Typically, time dummies are also included in (20).

Using FD or FE eliminates the unobserved

component ai.

However, if ai is uncorrelated with xit using

random effect (RE) estimation can lead to

more efficient estimation of the regression

parameters.


Generally, we call the model in equation (20)

the random effects model if ai is uncorre-

lated with all explanatory variables, i.e.,

(21) Cov[xit, ai] = 0, t = 1, . . . , T .

How to estimate β1 efficiently?

If (21) holds, β1 can be estimated consis-

tently from a single cross section.

Obviously this discards lots of useful infor-

mation.


If the data set is simply pooled and the error

term is denoted as vit = ai + uit, we have the

regression

(22) yit = β0 + β1xit + vit.

Then

(23) $\mathrm{Corr}[v_{it}, v_{is}] = \dfrac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}$

for t ≠ s, where $\sigma_a^2 = \mathrm{Var}[a_i]$ and $\sigma_u^2 = \mathrm{Var}[u_{it}]$.

That is, the error terms vit are (positively) autocorrelated, which biases the usual standard errors of the OLS estimator of β1.
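The correlation in (23) is what Stata reports as rho after xtreg. As a quick check, the sketch below recomputes rho from the sigma_u and sigma_e values printed in the Stata outputs of Examples 3 and 8; the function name is hypothetical. Note the notation: Stata's sigma_u is the standard deviation of ai, and sigma_e is the standard deviation of uit.

```python
def error_correlation(sigma_a, sigma_u):
    """Corr[v_it, v_is] from (23), given the two standard deviations."""
    return sigma_a ** 2 / (sigma_a ** 2 + sigma_u ** 2)

# Arguments: (Stata sigma_u = sd of a_i, Stata sigma_e = sd of u_it)
rho_ez = error_correlation(0.55551522, 0.21619434)     # Example 3 output
rho_cig = error_correlation(0.02738301, 0.03504776)    # Example 8 output
print(round(rho_ez, 4), round(rho_cig, 4))
```

Both values agree with the rho lines of the corresponding outputs.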


If $\sigma_a^2$ and $\sigma_u^2$ were known, optimal estimators (BLUE) would be obtained by generalized least squares (GLS), which in this case reduces to estimating the regression slope coefficients from the quasi-demeaned equation

(24) $y_{it} - \lambda \bar{y}_i = \beta_0 (1 - \lambda) + \beta_1 (x_{it} - \lambda \bar{x}_i) + (v_{it} - \lambda \bar{v}_i)$,

where

(25) $\lambda = 1 - \left( \dfrac{\sigma_u^2}{\sigma_u^2 + T \sigma_a^2} \right)^{1/2}$.

In practice $\sigma_u^2$ and $\sigma_a^2$ are unknown, but they can be estimated.
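The role of λ in (24) and (25) can be made concrete with a few lines of code. This is a sketch with hypothetical variance values; the function names are illustrative only.

```python
import math

def re_lambda(sigma_u2, sigma_a2, T):
    """Quasi-demeaning weight lambda from (25)."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

def quasi_demean(values, lam):
    """Apply y_it - lambda * ybar_i to the T observations of one unit."""
    mean = sum(values) / len(values)
    return [v - lam * mean for v in values]

lam_no_effect = re_lambda(1.0, 0.0, T=8)    # sigma_a^2 = 0 gives lambda = 0
lam_typical = re_lambda(1.0, 1.0, T=8)      # hypothetical variances
lam_dominant = re_lambda(1.0, 1e6, T=8)     # very large sigma_a^2: lambda near 1
print(lam_no_effect, round(lam_typical, 3), round(lam_dominant, 5))
```

With λ = 0 the data are left untouched (pooled OLS), and with λ = 1 the data are fully time demeaned (the within transformation), so the RE estimator lies between these two cases.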


One method is to estimate (22) from the pooled data set and use the OLS residuals $\hat{v}_{it}$ to estimate $\sigma_a^2$ and $\sigma_u^2$, and plug them into (25).

The resulting GLS estimators of the regression slope coefficients are called random effects estimators (RE estimators).

Under the random effects assumptions∗ the estimators are consistent, but not unbiased.

They are also asymptotically normal as n → ∞ for fixed T.

However, with small n and large T the properties of the RE estimator are largely unknown.

∗The ideal random effects assumptions include FE.1, FE.2, FE.4–FE.6.

FE.3 is replaced with
RE.3: There are no perfect linear relationships among the explanatory variables.
RE.4: In addition to FE.4, E[ai|Xi] = 0.


Note that λ = 0 in (24) gives the pooled regression, while λ = 1 gives the FE (within) estimator.

RE estimation is available in modern statis-

tical packages with different options.

Example 4 Data set wagepan.xls (Wooldridge): n = 545, T = 8.

Is there a wage premium for belonging to a labor union?

log(wageit) = β0 + β1educit + β2experit + β3exper2it + β4marriedit + β5unionit + ai + uit

Year dummies for 1981–1987 are included (1980 is the base year).

Note that including a full set of year dummies implies that the FE method cannot estimate the effects of variables that change by a constant amount over time. Experience (exper) is such a variable.


-------------------------------------------
lwage   |   Pooled      Random      Fixed
        |   OLS         Effects     Effects
--------+----------------------------------
educ    |  .0989945    .0906150       ..
        | (.0046227)  (.0105807)
exper   |  .0861696    .1027934       ..
        | (.0101415)  (.0153853)
exper2  | -.0027349   -.0046859   -.0051855
        | (.0007099)  (.0006896)  (.0007044)
married |  .1230113    .0678821    .0466804
        | (.0155714)  (.0167369)  (.0183104)
union   |  .1685243    .1031103    .0800019
        | (.0170652)  (.0178388)  (.0193103)
-------------------------------------------

It is notable that OLS standard errors tend to be

smaller than in the RE or FE cases.

OLS standard errors underestimate the true standard

errors.

OLS coefficient estimates also suffer from the omitted variable problem that panel estimation accounts for.

The Stata estimate of the correlation in (23) is .464.


Random effects or fixed effects

FE is widely considered preferable because it

allows correlation between ai and x variables.

Provided that the common effects, aggregated into ai, are not correlated with the x variables, an obvious advantage of RE is that it also allows estimation of the effects of factors that do not change over time (like education in the above example).

Typically, however, the condition that the common effects ai are not correlated with the regressors (x-variables) should be considered more an exception than a rule, which favors FE.


Hausman specification test

Hausman (1978) devised a test for the orthogonality of the common effects (ai) and the regressors.

The test compares the fixed effect (OLS)

and random effect (GLS) estimates utilizing

the Wald testing approach.


The basic idea of the test relies on the fact

that under the null hypothesis of orthogonal-

ity both OLS and GLS are consistent, while

under the alternative hypothesis GLS is not

consistent.

Thus, under the null hypothesis OLS and

GLS estimates should not differ much from

each other.

The test compares these estimates with a Wald statistic.

In Stata, performing the Hausman test requires that both the OLS (FE) and GLS (RE) regression results are stored so that they are available to the postestimation hausman command.


Example 5 Applying the Hausman test to the case of Example 4 in Stata yields:

* Estimate fixed effects
xtreg lwage y81 y82 y83 y84 y85 y86 y87 exper2 married union, fe
* store the results into "hfixed"
estimates store hfixed
* Estimate the random effects model
xtreg lwage y81 y82 y83 y84 y85 y86 y87 educ exper exper2 married union, re
* store the results into "hrandom"
estimates store hrandom
* Hausman test
hausman hfixed hrandom

        |      ---- Coefficients ----
        |     (b)         (B)        (b-B)     sqrt(diag(V_b-V_B))
        |   hfixed     hrandom    Difference        S.E.
--------+---------------------------------------------------------
y81     |  .1511912   .0427498    .1084414           .
y82     |  .2529709    .035577    .2173939           .
y83     |  .3544437   .0270943    .3273494           .
y84     |  .4901148    .052207    .4379078           .
y85     |  .6174822   .0690524    .5484299           .
y86     |  .7654965   .1053229    .6601736           .
y87     |  .9250249   .1505464    .7744785           .
exper2  | -.0051855  -.0046859   -.0004996       .000144
married |  .0466804   .0678821   -.0212017      .0074261
union   |  .0800019   .1031103   -.0231085      .0073935
-------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg

Test: Ho: difference in coefficients not systematic

chi2(10) = (b-B)'[(V_b-V_B)^(-1)](b-B)
         = 26.77
Prob>chi2 = 0.0028
(V_b-V_B is not positive definite)


The test rejects the orthogonality condition. Thus, FE should be used.
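The logic of the Hausman comparison can be seen for a single coefficient. The sketch below takes the union estimates and standard errors from the table of Example 4 and reproduces the difference and its standard error shown in the Hausman output. The last line computes a Wald statistic for this one coefficient alone (chi-square with 1 degree of freedom under the null); it is an illustration only and not the joint chi2(10) statistic reported by Stata.

```python
import math

# union coefficient and standard error, FE and RE columns of Example 4
b_fe, se_fe = 0.0800019, 0.0193103
b_re, se_re = 0.1031103, 0.0178388

diff = b_fe - b_re                             # (b - B)
se_diff = math.sqrt(se_fe ** 2 - se_re ** 2)   # sqrt(V_b - V_B) for one coefficient
h_single = (diff / se_diff) ** 2               # one-coefficient Wald statistic

print(round(diff, 7), round(se_diff, 7), round(h_single, 2))
```

The variance of the difference is V_b − V_B because RE is efficient under the null hypothesis, which is the key idea behind the test.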

In Eviews the Hausman test is obtained by first estimating the model as a random effect model and then selecting

View > Fixed/Random Effects Testing > Correlated Random Effects - Hausman Test


Policy analysis with panel data

Panel data is useful for policy analysis, in par-

ticular, program evaluation.

Example 6 Continue Example 1.2, where the effect of a training program on worker productivity was evaluated.

The data include three years, 1987, 1988, and 1989.

The training program was first implemented in 1988.

We focus on the years 1987 (no program) and 1988 (program implemented) to see whether the program benefits firms.

The panel model is

(26) log(scrapit) = β0 + δ0 y88 + β1grantit + ai + uit,

where y88 is the year 1988 dummy (= 1 for year 1988 and = 0 otherwise) and ai includes the unobserved firm effects (worker skill, etc.).


Ignoring the panel structure, OLS results suggest no improvement.

Dependent Variable: LOG(SCRAP)
Method: Panel Least Squares
Sample: 1 471 IF YEAR < 1989
Periods included: 2
Cross-sections included: 54
Total panel (balanced) observations: 108
=====================================================
Variable     Coefficient  Std. Error  t-Statistic  Prob.
-----------------------------------------------------
C              0.523144    0.159783     3.274086  0.0014
GRANT         -0.058018    0.380949    -0.152299  0.8792
-----------------------------------------------------
R-squared             0.000219
Adjusted R-squared   -0.009213
S.E. of regression    1.507393
F-statistic           0.023195
Prob(F-statistic)     0.879241
=====================================================

The coefficient for grant is not statistically significant,

suggesting that the program does not help in reducing

the scrap rate.


Accounting for the possible firm effects and also including the year dummy to account for a possible time effect yields

=====================================================
Variable     Coefficient  Std. Error  t-Statistic  Prob.
-----------------------------------------------------
C              0.568716    0.048603     11.70126  0.0000
GRANT         -0.317058    0.163875    -1.934753  0.0585
-----------------------------------------------------
Effects Specification
Cross-section fixed (dummy variables)
Period fixed (dummy variables)
R-squared             0.964308
Adjusted R-squared    0.926556
S.E. of regression    0.406642
F-statistic           25.54364
Prob(F-statistic)     0.000000

The estimate of the coefficient for grant is negative and close to statistically significant in two-sided testing (p-value 0.0585). In one-sided testing against the alternative

H1 : β1 < 0

(the program reduces the scrap rate), it is significant at the 5% level with p-value 0.0265 (from the normal approximation).

According to the estimate, participating in the program decreases the scrap rate on average by about 32% (more accurately 27%, since exp(−.317058) − 1 ≈ −0.272).
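The conversion from a dummy coefficient in a log model to an exact percentage change is a one-liner. The sketch below applies it to the grant coefficient above and to the ez coefficient of Example 3; the function name is hypothetical.

```python
import math

def pct_change_from_log_coef(b):
    """Exact percentage change in y when a dummy switches from 0 to 1 in a log(y) model."""
    return 100.0 * (math.exp(b) - 1.0)

grant_effect = pct_change_from_log_coef(-0.317058)   # grant coefficient in (26)
ez_effect = pct_change_from_log_coef(-0.1044148)     # ez coefficient, Example 3
print(round(grant_effect, 1), round(ez_effect, 1))
```

For small coefficients the approximation 100·b is close to the exact value; for larger ones, such as the grant coefficient, the two differ noticeably.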


Dynamic Panel Models

Many economic relationships are dynamic.

These may be characterized by the presence

of lagged dependent variables

(27) yit = δyi,t−1 + x′itβ + vit,

where

(28) vit = ai + uit

with ai ∼ iid(0, $\sigma_a^2$) and uit ∼ iid(0, $\sigma_u^2$) independent of each other, i = 1, . . . , n, t = 1, . . . , T.


Alternatively the one-way error component

model in (28) can be a two-way specification

such that

(29) vit = ai + bt + uit,

where all the components are assumed again

independent.

After differencing (which removes ai in the one-way case (28)), we have

(30) ∆yit = δ∆yi,t−1 + ∆x′itβ + ∆uit.

The lagged term yi,t−1 is correlated with ui,t−1, so the regressor ∆yi,t−1 is correlated with the error ∆uit = uit − ui,t−1, which causes problems in estimation.
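The size of the problem can be illustrated by simulation. The sketch below generates a pure autoregressive panel (no x variables, hypothetical parameter values), estimates δ by OLS on the differenced equation, and compares it with a simple instrumental variable estimator that uses the level yi,t−2 as an instrument for ∆yi,t−1. This instrument choice follows the idea usually attributed to Anderson and Hsiao; it is a simplification for illustration, not the Arellano-Bond GMM estimator.

```python
import random

random.seed(4)
n, T, burn = 5000, 6, 20
delta = 0.5                                   # hypothetical true coefficient
num_ols = den_ols = num_iv = den_iv = 0.0
for _ in range(n):
    a = random.gauss(0, 0.5)                  # unit effect a_i
    y = a / (1 - delta)                       # start at the unit's long-run mean
    path = []
    for t in range(burn + T):
        y = delta * y + a + random.gauss(0, 1)
        if t >= burn:                         # keep only the last T periods
            path.append(y)
    for t in range(2, T):
        dy = path[t] - path[t - 1]            # Delta y_it
        dy_lag = path[t - 1] - path[t - 2]    # Delta y_i,t-1
        z = path[t - 2]                       # instrument: level y_i,t-2
        num_ols += dy_lag * dy
        den_ols += dy_lag ** 2
        num_iv += z * dy
        den_iv += z * dy_lag

d_ols = num_ols / den_ols                     # OLS on the differenced equation
d_iv = num_iv / den_iv                        # IV with y_i,t-2 as instrument
print(round(d_ols, 2), round(d_iv, 2))
```

In this design the OLS estimate is far below the true value (it is even negative), while the IV estimate is close to 0.5, because yi,t−2 is uncorrelated with ∆uit but correlated with ∆yi,t−1.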


Once regressor variables are correlated with

the error term, OLS or GLS estimators be-

come inconsistent.

A typical solution to the problem is to apply

some kind of instrumental variable estima-

tion.

These are least squares (LS) or some other

type of methods, where instrumental vari-

ables are utilized to remove the inconsistency

due to the error term correlation with the re-

gressors.

A variable is suitable for an instrumental vari-

able if it is not correlated with the error term,

but is correlated with the regressors.

Thus, those regressors that are not corre-

lated with the error term can be used also as

instruments.


Example 7 2SLS (two-stage least squares).

Consider a standard regression model

(31) yi = x′iβ + ui,

where xi is a k-vector of regressors (including the constant term) with Cov(xi, ui) ≠ 0, i = 1, . . . , n.

Suppose we have m ≥ k additional variables in zi (an m-vector) such that Cov(zi, ui) = 0 but Cov(zi, xi) ≠ 0.

The 2SLS solution to the problem is to first (first stage) use OLS to regress the x-variables on the z-variables.

In the second stage, replace the original regressors xi by the predicted variables $\hat{x}_i$ from the first stage, and estimate β from the regression

(32) $y_i = \hat{x}_i' \beta + u_i$.

The estimator

(33) $\hat\beta_{2SLS} = (\hat{X}'\hat{X})^{-1} \hat{X}' y$

is called the 2SLS estimator of β.


In particular, if m = k then (33) becomes

(34) βIV = (Z′X)−1Z′y,

which is called the Instrumental Variable estimator of

β.
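The two stages and the equality of 2SLS and IV when m = k can be sketched for the simplest case of one endogenous regressor and one instrument. The data are simulated, and all parameter values and helper names are hypothetical; the constant term is handled by centering the variables.

```python
import random

def centered(v):
    """Subtract the sample mean from every element."""
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

random.seed(5)
n, beta = 5000, 1.0                           # hypothetical true slope
z = [random.gauss(0, 1) for _ in range(n)]    # instrument
e = [random.gauss(0, 1) for _ in range(n)]    # source of endogeneity
x = [zi + ei + random.gauss(0, 1) for zi, ei in zip(z, e)]
y = [2.0 + beta * xi + ei + random.gauss(0, 1) for xi, ei in zip(x, e)]

xc, yc, zc = centered(x), centered(y), centered(z)
b_ols = dot(xc, yc) / dot(xc, xc)             # inconsistent: Cov(x, u) != 0

# First stage: regress x on z; second stage: regress y on the fitted values
pi_hat = dot(zc, xc) / dot(zc, zc)
x_fit = [pi_hat * zi for zi in zc]
b_2sls = dot(x_fit, yc) / dot(x_fit, x_fit)

b_iv = dot(zc, yc) / dot(zc, xc)              # formula (34) with one instrument
print(round(b_ols, 2), round(b_2sls, 2), round(b_iv, 2))
```

Here OLS overstates the slope, while the 2SLS and IV estimates coincide and are close to the true value.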


Example 8 (Data: http://eu.wiley.com/college/baltagi/ > Student companion site > datasets)

Demand for cigarettes in 46 US states [annual data, 1963–1992]. Estimated equation

(35) cit = α + β1ci,t−1 + β2pit + β3yit + β4pnit + vit,

where

(36) vit = ai + bt + uit,

ai and bt are fixed effects, uit ∼ NID(0, $\sigma_u^2$), and all the observable variables are in logarithms:
cit = real per capita sales of cigarettes by persons of smoking age (14 and older)
pit = real average retail price of a pack of cigarettes
yit = real per capita disposable income
pnit = the minimum real price of cigarettes in any neighboring state (proxy for casual smuggling effect across state borders)

ci,t−1 is very likely correlated with the error term.


For reference purposes, estimating with panel OLS

(average of within group regressions with time dum-

mies) yields

Fixed-effects (within) regression          Number of obs      =      1334
Group variable: state                      Number of groups   =        46

R-sq:  within  = 0.9283                    Obs per group: min =        29
       between = 0.9859                                   avg =      29.0
       overall = 0.9657                                   max =        29

                                           F(32,1256)         =    508.07
corr(u_i, Xb)  = 0.4743                    Prob > F           =    0.0000

-----------------------------------------------------
          lc |     Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
          lc |
         L1. |  .8302514   .0126242    65.77   0.000
             |
          lp | -.2916822   .0230847   -12.64   0.000
          ly |  .1068698   .0233417     4.58   0.000
         lpn |  .0354559     .02656     1.33   0.182
       _cons |  .8204374   .2228775     3.68   0.000
-------------+---------------------------------------
     sigma_u |  .02738301
     sigma_e |  .03504776
         rho |  .37905103   (fraction of variance due to u_i)
-----------------------------------------------------
F test that all u_i=0:  F(45, 1256) = 4.52   Prob > F = 0.0000


Several methods have been proposed for estimation when there is potential correlation between the error term and (some) regressors.

GMM (Generalized Method of Moments) estimation has lately gained much popularity, in particular when there are non-linear moment restrictions.

Stata has the xtdpd procedure, which produces the Arellano and Bond or the Arellano-Bover/Blundell-Bond estimators. These are GMM estimators in which the instruments are defined in a particular way (the idea will be discussed in the classroom).


xtdpd l(0/1).lc lp ly lpn y66-y92, div(lp ly lpn y66-y92) dgmmiv(lc)

Dynamic panel-data estimation              Number of obs      =      1334
Group variable: state                      Number of groups   =        46
Time variable: year
                                           Obs per group: min =        29
                                                          avg =        29
                                                          max =        29

Number of instruments =    437             Wald chi2(31)      =  13273.45
                                           Prob > chi2        =    0.0000
One-step results
-----------------------------------------------------
          lc |     Coef.   Std. Err.      z    P>|z|
-------------+---------------------------------------
          lc |
         L1. |  .8201729   .0161446    50.80   0.000
             |
          lp | -.3607549   .0311244   -11.59   0.000
          ly |  .1871102   .0334027     5.60   0.000
         lpn | -.0215713   .0399233    -0.54   0.589
-----------------------------------------------------
Instruments for differenced equation
        GMM-type: L(2/.).lc
        Standard: D.lp D.ly D.lpn D.y66 D.y67 D.y68
                  D.y69 D.y70 D.y71 D.y72 D.y73 D.y74 D.y75
                  D.y76 D.y77 D.y78 D.y79 D.y80 D.y81 D.y82
                  D.y83 D.y84 D.y85 D.y86 D.y87 D.y88 D.y89
                  D.y90 D.y91 D.y92
Instruments for level equation
        Standard: _cons


Test for the orthogonality conditions of the instruments:

Sargan test of overidentifying restrictions
H0: overidentifying restrictions are valid

chi2(405) = 561.5047
Prob > chi2 = 0.0000

The orthogonality conditions are rejected.

The reason may be that the errors are MA(1), which implies that the GMM instruments (lct−2, . . .) are correlated with the error term.

One can try to fix this by starting the instrument lags from t − 3 with the command · · · dgmmiv(lc, lagrange(3 .)).

Doing this improved the situation slightly but still led to rejection of the orthogonality conditions.

We do not, however, continue the analysis further here.
