lecture series 1 linear random and fixed effect models and their (less) recent extensions

Upload: dan-gibbons

Post on 08-Jul-2018


  • 8/19/2019 Lecture Series 1 Linear Random and Fixed Effect Models and Their (Less) Recent Extensions


    Lecture Series 1: Linear Random and Fixed Effect Models and Their (Less) Recent Extensions

    Stefanie [email protected]

    RMIT University, School of Economics, Finance, and Marketing

    January 21, 2014


    Overview

    1  Recap: Linear model set-up, random effects estimation and fixed effects estimation;

    2  Relationship between random and fixed (and between) effects estimators;

    3  Is fixed effects estimation always preferable to random effects estimation?;

    4  Hausman-Taylor (1981) approach to estimating coefficients on both time-varying and time-invariant variables;

    5  Correlated random effects (CRE): a flexible extension to random effect models to relax the orthogonality condition;

    6  Pluemper and Troeger's Fixed Effects Vector Decomposition Approach and Rule of Thumb;

    7  Application: Estimating the effects of health on wages.


    References for Lecture 1

    1  Greene, W.H. (2011). Econometric Analysis. Pearson Education Limited. 399-438.

    2  Wooldridge, J. (2009). Econometric Analysis of Cross Section and Panel Data. The MIT Press. 285-382, 345-361.

    3  Hsiao, C. (2003). Analysis of Panel Data. Econometric Society Monographs. CUP: New York. 27-44.

    4  Mundlak, Y. (1978). On the Pooling of Time Series and Cross-section Data. Econometrica 46: 69-85.

    5  Hausman, J., Taylor, W.E. (1981). Panel Data and Unobservable Individual Effects. Econometrica 49: 1377-1398.

    6  Pluemper, T., Troeger, V. (2007). Efficient estimation of time-invariant and rarely changing variables in finite sample panel analyses with unit fixed effects. Political Analysis 15: 124-139.

    7  Contoyannis, P., Rice, N. (2001). The impact of health on wages. Evidence from the BHPS. Empirical Economics 26: 599-622.


    1. Recap: Linear model set-up


    Heterogeneous intercept models

    Consider the following linear regression model, which allows for individual-specific heterogeneity $\alpha_i$:

    $$Y_{it} = X_{it}'\beta + \varepsilon_{it}, \qquad (1)$$

    for all $i = 1, \dots, N$ and $t = 1, \dots, T$, where

    $$\varepsilon_{it} = \alpha_i + u_{it}. \qquad (2)$$

    •  $Y_{it}$ is some outcome of interest;

    •  $X_{it}$ is a vector of covariates $(X_{it1}, \dots, X_{itK})'$ and generally includes a constant term, i.e. $X_{it1} = 1$ for all $i$ and $t$. It may also include time-invariant variables such as $X_i$.

    •  The unobserved errors consist of two components: $\alpha_i$ (constant across time) and $u_{it}$, an idiosyncratic error term that varies across individuals and time, with $u_{it} \sim iid(0, \sigma_u^2)$ and $E(u_{it} \mid \alpha_i, X_{i1}, \dots, X_{iT}) = 0$.


    The model in matrix notation

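    The $NT$ observations are ordered first by unit $i$, and then by time period $t$, such that:

    $$Y = X\beta + \varepsilon. \qquad (3)$$

    The dimensions are:

    •  $Y$: $NT \times 1$ vector of the $Y_{it}$'s;

    •  $X$: $NT \times K$ matrix with elements $X_{itk}$ (one row per observation $(i, t)$, one column per covariate $k$);

    •  $\varepsilon$: $NT \times 1$ vector of the $\varepsilon_{it}$'s.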


    OLS estimator in matrix form

    The OLS estimator for $\beta$ is:

    $$\hat{\beta}_{OLS} = (X'X)^{-1}(X'Y) \qquad (4)$$

    Our focus here is how to estimate this model under different assumptions about the individual-specific heterogeneity $\alpha_i$. Early discussions (examples) in the literature were concerned with whether $\alpha_i$ should be treated as a random variable (which would add an error term) or as a fixed parameter to be estimated for each cross-sectional group.

    More modern approaches to panel data econometrics are more concerned with the question of whether $\alpha_i$ is correlated with the explanatory variables of interest (e.g. Wooldridge, 2009, p. 285-286).
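    As a minimal sketch of Eq. (4), the snippet below computes the pooled OLS estimator on simulated data. The panel dimensions, the coefficient values and the variable names are assumptions made purely for illustration; they do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 4                                   # hypothetical panel dimensions
# Stacked NT x K regressor matrix: a constant plus two covariates
X = np.column_stack([np.ones(N * T), rng.normal(size=(N * T, 2))])
beta = np.array([1.0, 2.0, -0.5])              # assumed "true" coefficients
Y = X @ beta + rng.normal(size=N * T)

# Eq. (4): solve (X'X) b = X'Y instead of forming the inverse explicitly
beta_ols = np.linalg.solve(X.T @ X, X.T @ Y)
```

    Solving the normal equations with `np.linalg.solve` is numerically preferable to inverting $X'X$, but both correspond to the same formula.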


    Random versus fixed effect models

    We will examine the implications for OLS estimation under the alternative assumptions that:

    1  $\alpha_i$ is uncorrelated with $X_{it}$ for all $t = 1, \dots, T$ (referred to as the "random effects model");

    2  $\alpha_i$ is allowed to correlate arbitrarily with $X_{it}$ for all $t = 1, \dots, T$ (referred to as the "fixed effects model");

    3  $\alpha_i$ is assumed to depend linearly on $X_{it}$ (referred to as the "correlated random effects model").

    We will consider the suitable estimators in each case.


    Random effect models

    In the random effect model we assume $Cov(X_{it}, \alpha_i) = 0$ for $t = 1, \dots, T$, or the stronger assumption of zero conditional expectation, i.e. $E(\alpha_i \mid X_{i1}, \dots, X_{iT}) = 0$. In this scenario, using OLS will yield unbiased parameter estimates, but wrong standard errors and thus unreliable statistical inference. Let's take a look at why.

    Consider the properties of the OLS estimator:

    $$E(\hat{\beta}_{OLS} \mid X) = E\{(X'X)^{-1}X'Y \mid X\} \qquad (5)$$

    $$= E\{(X'X)^{-1}X'(X\beta + \varepsilon) \mid X\} \qquad (6)$$

    $$= \beta + (X'X)^{-1}X'E\{\varepsilon \mid X\} \qquad (7)$$

    $$= \beta \qquad (8)$$


    Random effect models

    Now think of the sampling properties of the OLS estimator:

    $$Var(\hat{\beta}_{OLS} \mid X) = Var\{(X'X)^{-1}X'Y \mid X\} \qquad (9)$$

    $$= Var\{(X'X)^{-1}X'(X\beta + \varepsilon) \mid X\} \qquad (10)$$

    $$= Var\{\beta + (X'X)^{-1}X'\varepsilon \mid X\} \qquad (11)$$

    $$= (X'X)^{-1}X'\,Var\{\varepsilon \mid X\}\,X(X'X)^{-1} \qquad (12)$$

    Recall, the OLS assumption about $\varepsilon$ is that $\varepsilon_{it} \sim iid(0, \sigma^2)$ and so:

    $$Var(\hat{\beta}_{OLS} \mid X) = \sigma^2 (X'X)^{-1}, \qquad (13)$$

    with $\sigma^2$ replaced by an estimate, typically the sample variance of the regression errors:

    $$s^2 = \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T} e_{it}^2. \qquad (14)$$


    Random effect models

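    But what's wrong with the variance when we allow for unobserved heterogeneity? Because $\varepsilon_{it} = \alpha_i + u_{it}$, the assumption of independent errors across observations fails. In particular, if $\alpha_i \sim N(0, \sigma_\alpha^2)$ and $u_{it} \sim iid(0, \sigma_u^2)$, where $\sigma_\varepsilon^2 = \sigma_u^2 + \sigma_\alpha^2$, then the variance-covariance matrix of $\varepsilon_i = (\varepsilon_{i1}, \varepsilon_{i2}, \dots, \varepsilon_{iT})'$ is:

    $$Var(\varepsilon_i \mid X_i) =
    \begin{pmatrix}
    \sigma_u^2 + \sigma_\alpha^2 & \sigma_\alpha^2 & \cdots & \sigma_\alpha^2 & \sigma_\alpha^2 \\
    \sigma_\alpha^2 & \sigma_u^2 + \sigma_\alpha^2 & \cdots & \sigma_\alpha^2 & \sigma_\alpha^2 \\
    \vdots & \vdots & \ddots & \vdots & \vdots \\
    \sigma_\alpha^2 & \sigma_\alpha^2 & \cdots & \sigma_u^2 + \sigma_\alpha^2 & \sigma_\alpha^2 \\
    \sigma_\alpha^2 & \sigma_\alpha^2 & \cdots & \sigma_\alpha^2 & \sigma_u^2 + \sigma_\alpha^2
    \end{pmatrix}
    = \sigma_u^2 I_{T \times T} + \sigma_\alpha^2\, i_{T \times 1} i_{1 \times T}' = \Sigma$$

    ($i$ is a vector of ones).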


    Random effect models

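    Since the observations for units $i$ and $j$ are independent, the disturbance covariance matrix for the full $NT$ observations is:

    $$\Omega_{NT \times NT} =
    \begin{pmatrix}
    \Sigma & 0 & \cdots & 0 & 0 \\
    0 & \Sigma & \cdots & 0 & 0 \\
    \vdots & & \ddots & \vdots & \vdots \\
    0 & 0 & \cdots & 0 & \Sigma
    \end{pmatrix}
    = I_{N \times N} \otimes \Sigma_{T \times T}$$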
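    The block structure can be made concrete with a small numerical sketch. The helper name `error_covariance` and the parameter values are hypothetical, chosen only for illustration; the construction follows $\Sigma = \sigma_u^2 I + \sigma_\alpha^2 ii'$ and $\Omega = I_N \otimes \Sigma$.

```python
import numpy as np

def error_covariance(N, T, sigma2_u, sigma2_alpha):
    """Return (Sigma, Omega) for a balanced panel with eps_it = alpha_i + u_it."""
    ones = np.ones((T, 1))
    # T x T block: sigma_u^2 on top of sigma_alpha^2 everywhere
    Sigma = sigma2_u * np.eye(T) + sigma2_alpha * (ones @ ones.T)
    # NT x NT block-diagonal matrix via the Kronecker product
    Omega = np.kron(np.eye(N), Sigma)
    return Sigma, Omega

Sigma, Omega = error_covariance(N=2, T=3, sigma2_u=1.0, sigma2_alpha=0.5)
```

    Entries of `Omega` that link different units are zero, which reflects the assumed independence across cross-sectional units.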


    Random effect models

    There are two ways to fix the wrong standard errors implied by cross-sectional unobserved heterogeneity when using OLS:

    1  Correcting the OLS standard errors: robust covariance matrix estimation; estimate the model with OLS, then adjust the standard errors ex post.

    2  Random effects estimation: obtain a more efficient estimator of $\beta$ using generalised least squares. Transform the data first, then use OLS on the transformed data. This approach is similar to (feasible) GLS when controlling for e.g. heteroskedasticity.


    1. Correcting OLS standard errors ex post

    •  Note that in $Var(\hat{\beta}_{OLS} \mid X) = (X'X)^{-1}X'\,Var\{\varepsilon \mid X\}\,X(X'X)^{-1}$, the term $Var(\varepsilon \mid X)$ is an $NT \times NT$ matrix with a block diagonal structure;

    •  For each of the $N$ cross-sectional groups there is a $T \times T$ diagonal block corresponding to $Var(\varepsilon_i \mid X)$;

    •  Off these diagonal blocks the matrix has zeros, due to the assumed independence of the cross-sectional sample;

    •  Thus we can correct the OLS standard errors by replacing $Var(\varepsilon \mid X)$ with a suitable estimate from the sample data.


    1. Correcting OLS standard errors ex post

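    Suitable estimators are:

    •  Estimate $\sigma_\alpha^2$ by:

    $$s_\alpha^2 = \left[\frac{NT(T-1)}{2}\right]^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T-1} \sum_{s=t+1}^{T} e_{it} e_{is} \qquad (15)$$

    •  Estimate $\sigma_\varepsilon^2$ by:

    $$s^2 = (NT)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} e_{it}^2 \qquad (16)$$

    •  This approach is nothing other than robust covariance matrix estimation (see p. 390 in Greene, 2012).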
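    The two estimators in Eqs. (15) and (16) can be sketched as follows. The function name `variance_components` and the toy residual array are assumptions for illustration; in practice `e` would hold the pooled OLS residuals arranged as an $N \times T$ array.

```python
import numpy as np

def variance_components(e):
    """e: N x T array of pooled OLS residuals (balanced panel).

    Returns (s2_alpha, s2) following Eqs. (15) and (16).
    """
    N, T = e.shape
    row_sum = e.sum(axis=1)
    # For each unit: sum over t < s of e_it * e_is
    # equals ((sum_t e_it)^2 - sum_t e_it^2) / 2
    cross = (row_sum**2 - (e**2).sum(axis=1)) / 2.0
    s2_alpha = cross.sum() / (N * T * (T - 1) / 2.0)
    s2 = (e**2).sum() / (N * T)
    return s2_alpha, s2

# Toy residuals (not real data): N = 2 units, T = 3 periods
e = np.array([[1.0, 2.0, 3.0],
              [-1.0, 0.0, 1.0]])
s2_alpha, s2 = variance_components(e)
```

    An estimate of $\sigma_u^2$ then follows as the difference `s2 - s2_alpha`, since $\sigma_\varepsilon^2 = \sigma_u^2 + \sigma_\alpha^2$.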


    Random effects (or GLS) estimation

    •  We want to transform the data in such a way that the variance of the transformed errors is equal to the identity matrix, i.e.

    $$Var(\tilde{\varepsilon}) = \Gamma\, Var(\varepsilon)\, \Gamma' = I_{NT} \qquad (17)$$

    •  A good candidate for transforming the data for each individual is $\Sigma^{-1/2}$. Hence, if we find this term, we can pre-multiply $Y_i$, $X_i$ and $\varepsilon_i$ by $\Sigma^{-1/2}$ (or, in terms of matrix notation, pre-multiply $Y$, $X$ and $\varepsilon$ by $\Omega^{-1/2}$).

    •  See the derivations of $\Sigma^{-1/2}$ on the blackboard;

    •  The final result is:

    $$\Sigma^{-1/2} = \frac{1}{\sigma_u}\left[I - \frac{\theta}{T}\, i_{T \times 1} i_{1 \times T}'\right], \qquad (18)$$

    where

    $$\theta = 1 - \frac{\sigma_u}{\sqrt{\sigma_u^2 + T\sigma_\alpha^2}}. \qquad (19)$$
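    Eqs. (18) and (19) can be checked numerically: pre- and post-multiplying $\Sigma$ by the claimed $\Sigma^{-1/2}$ should return the identity matrix. The function name and the variance values below are arbitrary choices for this sketch.

```python
import numpy as np

def sigma_inv_sqrt(T, sigma2_u, sigma2_alpha):
    """Eqs. (18)-(19): Sigma^{-1/2} = (1/sigma_u) [I - (theta/T) i i']."""
    theta = 1.0 - np.sqrt(sigma2_u) / np.sqrt(sigma2_u + T * sigma2_alpha)
    ones = np.ones((T, 1))
    G = (np.eye(T) - (theta / T) * (ones @ ones.T)) / np.sqrt(sigma2_u)
    return G, theta

T, sigma2_u, sigma2_alpha = 4, 2.0, 1.5        # arbitrary illustrative values
G, theta = sigma_inv_sqrt(T, sigma2_u, sigma2_alpha)
Sigma = sigma2_u * np.eye(T) + sigma2_alpha * np.ones((T, T))
check = G @ Sigma @ G.T                        # should equal the T x T identity
```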


    Random effects estimation

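    Consider the following transformation of our benchmark linear regression model:

    $$\Omega^{-1/2} Y = \Omega^{-1/2} X \beta + \Omega^{-1/2} \varepsilon, \qquad (20)$$

    or

    $$\tilde{Y} = \tilde{X}\beta + \tilde{\varepsilon}, \qquad (21)$$

    where, for instance (up to the scale factor $1/\sigma_u$, which cancels in the estimator):

    $$\Sigma^{-1/2} Y_i =
    \begin{pmatrix}
    Y_{i1} - \theta \bar{Y}_i \\
    Y_{i2} - \theta \bar{Y}_i \\
    \vdots \\
    Y_{iT} - \theta \bar{Y}_i
    \end{pmatrix},
    \qquad
    \Sigma^{-1/2} X_i =
    \begin{pmatrix}
    X_{i1} - \theta \bar{X}_i \\
    X_{i2} - \theta \bar{X}_i \\
    \vdots \\
    X_{iT} - \theta \bar{X}_i
    \end{pmatrix}.$$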


    Random effects estimation

    We can show that the transformed errors $\tilde{\varepsilon}$ have the property that $Var(\tilde{\varepsilon}) = \Gamma\, Var(\varepsilon)\, \Gamma' = I_{NT}$ (try to do this at home). Thus, feasible GLS regression based on this transformation satisfies the necessary assumptions for efficient estimation of $\beta$, and is referred to as the random effects estimator:

    $$\hat{\beta}_{RE} = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{Y} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y. \qquad (22)$$

    The variance of this estimator is (Homework: check whether you can derive all steps by yourself; we will talk about it in class next week):

    $$Var(\hat{\beta}_{RE} \mid X) = Var\{(X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y \mid X\} \qquad (23)$$

    $$= (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}X(X'\Omega^{-1}X)^{-1} \qquad (24)$$

    $$= (X'\Omega^{-1}X)^{-1} \qquad (25)$$
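    The equality of the two expressions in Eq. (22) can be illustrated on simulated data. This is only a sketch: the variance components are treated as known (feasible GLS would plug in estimates), and all dimensions, coefficient values and names such as `quasi_demean` are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 30, 4
sigma2_u, sigma2_alpha = 1.0, 2.0              # treated as known here (assumption)
X = np.column_stack([np.ones(N * T), rng.normal(size=N * T)])
alpha = np.repeat(rng.normal(scale=np.sqrt(sigma2_alpha), size=N), T)
Y = X @ np.array([1.0, 0.5]) + alpha + rng.normal(scale=np.sqrt(sigma2_u), size=N * T)

theta = 1.0 - np.sqrt(sigma2_u / (sigma2_u + T * sigma2_alpha))

def quasi_demean(a):
    """Subtract theta times the group mean; rows are ordered by i, then t."""
    a3 = a.reshape(N, T, -1)
    return (a3 - theta * a3.mean(axis=1, keepdims=True)).reshape(N * T, -1)

# OLS on the quasi-demeaned data
Xt, Yt = quasi_demean(X), quasi_demean(Y).ravel()
beta_re = np.linalg.solve(Xt.T @ Xt, Xt.T @ Yt)

# Direct GLS with Omega = I_N (x) Sigma, for comparison
Sigma = sigma2_u * np.eye(T) + sigma2_alpha * np.ones((T, T))
Omega_inv = np.kron(np.eye(N), np.linalg.inv(Sigma))
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)
```

    The quasi-demeaned version avoids building the $NT \times NT$ matrix $\Omega$, which is why it is the form usually implemented.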


    Fixed effect models

    In the fixed effect model we allow for the possibility of $Cov(X_{it}, \alpha_i) \neq 0$. In this case, the OLS estimator $\hat{\beta}_{OLS}$ will be biased and inconsistent. This is so because:

    $$E(\hat{\beta}_{OLS} \mid X) = E\{(X'X)^{-1}X'Y \mid X\} \qquad (26)$$

    $$= E\{(X'X)^{-1}X'(X\beta + \varepsilon) \mid X\} \qquad (27)$$

    $$= \beta + (X'X)^{-1}X'E\{\varepsilon \mid X\} \qquad (28)$$

    $$\neq \beta, \qquad (29)$$

    where the last inequality stems from the fact that $E(\alpha_i \mid X_{it}) \neq 0$.


    Fixed effect models

    There are two solutions to the problem. Re-consider the original model:

    $$Y_{it} = X_{it}'\beta + \varepsilon_{it}, \qquad (30)$$

    for all $i = 1, \dots, N$ and $t = 1, \dots, T$, where

    $$\varepsilon_{it} = \alpha_i + u_{it}. \qquad (31)$$

    1  Within-group fixed effects: subtract the within-group means from the original regression equation that combines Eqs. 30 and 31.

    2  First-differences between two adjacent time periods.


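    Within-group fixed effects

    Construct the within-group average of the benchmark linear regression model:

    $$\bar{Y}_i = \bar{X}_i'\beta + \alpha_i + \bar{u}_i, \qquad (32)$$

    where $\bar{Y}_i = T^{-1}\sum_{t=1}^{T} Y_{it}$, $\bar{X}_i = T^{-1}\sum_{t=1}^{T} X_{it}$, and $\bar{u}_i = T^{-1}\sum_{t=1}^{T} u_{it}$. Then subtract Eq. 32 from the combined Eqs. 30 and 31:

    $$Y_{it} - \bar{Y}_i = (X_{it} - \bar{X}_i)'\beta + (u_{it} - \bar{u}_i) + (\alpha_i - \alpha_i), \qquad (33)$$

    where the last term is zero, so the unobserved heterogeneity drops out. And so the within-group fixed effects estimator is:

    $$\hat{\beta}_{FE} = \left[\sum_{i=1}^{N}\sum_{t=1}^{T} (X_{it} - \bar{X}_i)(X_{it} - \bar{X}_i)'\right]^{-1} \sum_{i=1}^{N}\sum_{t=1}^{T} (X_{it} - \bar{X}_i)(Y_{it} - \bar{Y}_i) \qquad (34)$$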
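    A minimal sketch of the within estimator in Eq. (34) on simulated data, compared with the dummy-variable regression that estimates one intercept per unit. The data-generating design (a single covariate correlated with $\alpha_i$, true coefficient 2) is an assumption made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 40, 5
alpha = rng.normal(size=N)
# Covariate correlated with alpha_i by construction (illustrative design)
X = (alpha[:, None] + rng.normal(size=(N, T)))[..., None]     # N x T x 1
Y = 2.0 * X[..., 0] + alpha[:, None] + rng.normal(size=(N, T))

# Within transformation: deviations from group means
Xw = (X - X.mean(axis=1, keepdims=True)).reshape(N * T, -1)
Yw = (Y - Y.mean(axis=1, keepdims=True)).reshape(N * T)
beta_fe = np.linalg.solve(Xw.T @ Xw, Xw.T @ Yw)               # Eq. (34)

# Equivalent least-squares dummy variable (LSDV) regression
D = np.kron(np.eye(N), np.ones((T, 1)))                       # NT x N unit dummies
Z = np.hstack([X.reshape(N * T, -1), D])
coef = np.linalg.lstsq(Z, Y.reshape(N * T), rcond=None)[0]
```

    The first element of `coef` coincides with `beta_fe`, while the remaining $N$ elements are the estimated unit intercepts.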


    Within-group fixed effects

    •  In contrast to the random effects or GLS procedure, which uses both within-group (across time) and between-group (across cross-sectional units) variation to estimate $\beta$, the within-group fixed effect approach uses only the within-group variation.

    •  Any time-invariant observable characteristics will also difference out, so that their coefficients cannot be identified (unless they are interacted with time-varying variables).

    •  $N$ degrees of freedom will be lost, since this approach estimates the group sample means (one for each group).

    •  Even though the transformed errors in Eq. 33, $(u_{it} - \bar{u}_i)$, are non-classical (which means what?), the OLS standard errors from the fixed effects regression are correct.


    Pros and cons

    •  The first differences approach is easy to implement manually, and keeping track of the correct number of degrees of freedom is more straightforward.

    •  If the model is correctly specified and if there is no serial correlation, then within-group fixed effect estimation is more efficient than first differences.

    •  The relative efficiency between the two estimators depends on the degree of serial correlation in the idiosyncratic errors ($Cov(u_{it}, u_{is})$, for $t \neq s$). (Why?)


    2. Relationship between random and fixed

    (and between) effect estimators


    Some transformations

    Consider the following transformations:

    •  Group-means transformation: $P = I_{N \times N} \otimes T^{-1}\, i_{T \times 1} i_{1 \times T}'$, where $I_{N \times N}$ is the identity matrix of dimension $N \times N$, $\otimes$ is the Kronecker product, and $i_{T \times 1}$ is a $T \times 1$ vector of 1's.

    •  Deviations from group means: $Q = I_{NT \times NT} - P$


    Some transformations, cont.

    $P$ and $Q$ have the effect of transforming the data to group means, and deviations from means, respectively:

    $$PY = (\bar{Y}_1, \dots, \bar{Y}_1, \bar{Y}_2, \dots, \bar{Y}_2, \dots, \bar{Y}_N, \dots, \bar{Y}_N)',$$

    $$QY = (Y_{11} - \bar{Y}_1, \dots, Y_{1T} - \bar{Y}_1, Y_{21} - \bar{Y}_2, \dots, Y_{2T} - \bar{Y}_2, \dots, Y_{N1} - \bar{Y}_N, \dots, Y_{NT} - \bar{Y}_N)',$$

    and so on.

    Note that $P$ and $Q$ are idempotent ($P^2 = P$, $Q^2 = Q$) and orthogonal ($PQ = 0$).
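    The stated properties of $P$ and $Q$ are easy to verify numerically. The small dimensions and the toy vector `Y` below are arbitrary choices for this sketch.

```python
import numpy as np

N, T = 3, 4
P = np.kron(np.eye(N), np.ones((T, T)) / T)    # group-means operator
Q = np.eye(N * T) - P                           # deviations-from-means operator

Y = np.arange(N * T, dtype=float)               # toy data ordered by i, then t
group_means = Y.reshape(N, T).mean(axis=1)

PY = P @ Y                                      # each observation replaced by its group mean
QY = Q @ Y                                      # each observation minus its group mean
```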


    Fixed and random effects similarity

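    Hence, the fixed effects estimator of $\beta$ can be expressed in more compact notation:

    $$\hat{\beta}_{FE} = (X'QX)^{-1}X'QY \qquad (36)$$

    The random effect transformation described above is a partial deviation from group means:

    $$\tilde{Y}_{it} = Y_{it} - \theta \bar{Y}_i, \qquad (37)$$

    and

    $$\tilde{X}_{it} = X_{it} - \theta \bar{X}_i, \qquad (38)$$

    where $\theta = 1 - \left(\dfrac{\sigma_u^2}{\sigma_u^2 + T\sigma_\alpha^2}\right)^{1/2}$.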


    Fixed and random effects similarity

    The partial deviations framework provides an optimal use of the within-group and the between-group variation. Note that the larger the between-group fraction of total variation (i.e. $\sigma_\alpha^2$ relative to $\sigma_u^2$) and/or the larger $T$, the greater $\theta$ will be (closer to 1), and the more weight is given to within-group, compared to between-group, variation.

    •  Suppose $T = 3$ and $\sigma_\alpha^2 = 0$; then $\theta = 0$ and the full variation in the data is used, compared to $\theta = 0.5$ if $\sigma_\alpha^2 = \sigma_u^2$;

    •  Alternatively, suppose $\sigma_\alpha^2 = \sigma_u^2$; then if $T = 3$, $\theta = 0.5$, compared to $\theta = 0.75$ if $T = 15$.
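    The numerical cases above follow directly from the formula for $\theta$. A short sketch, with the helper name `theta` and the normalisation $\sigma_u^2 = 1$ chosen only for illustration:

```python
import math

def theta(T, sigma2_u, sigma2_alpha):
    """theta = 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_alpha^2))."""
    return 1.0 - math.sqrt(sigma2_u / (sigma2_u + T * sigma2_alpha))

theta_no_heterogeneity = theta(3, 1.0, 0.0)    # sigma_alpha^2 = 0
theta_T3 = theta(3, 1.0, 1.0)                  # equal variances, T = 3
theta_T15 = theta(15, 1.0, 1.0)                # equal variances, T = 15
```

    With equal variances the ratio inside the square root is $1/(1+T)$, which gives $1 - \sqrt{1/4}$ for $T = 3$ and $1 - \sqrt{1/16}$ for $T = 15$.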


    Random effects as weighted average

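    The random effect estimator can be thought of as a weighted average of the within-group estimator $\hat{\beta}_{FE}$ and the between-group estimator $\hat{\beta}_{BE}$ based on the group-means data:

    $$\hat{\beta}_{RE} = \delta_{k \times k}\, \hat{\beta}_{FE} + (I_{k \times k} - \delta_{k \times k})\, \hat{\beta}_{BE}, \qquad (39)$$

    $$\hat{\beta}_{BE} = \left[\sum_{i=1}^{N} T(\bar{X}_i - \bar{X})(\bar{X}_i - \bar{X})'\right]^{-1} \sum_{i=1}^{N} T(\bar{X}_i - \bar{X})(\bar{Y}_i - \bar{Y}) = (X'PX)^{-1}X'PY$$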


    Random effects as weighted average

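    $$\delta = \left[\sum_{i=1}^{N}\sum_{t=1}^{T} (X_{it} - \bar{X}_i)(X_{it} - \bar{X}_i)' + \lambda \sum_{i=1}^{N} T(\bar{X}_i - \bar{X})(\bar{X}_i - \bar{X})'\right]^{-1} \times \sum_{i=1}^{N}\sum_{t=1}^{T} (X_{it} - \bar{X}_i)(X_{it} - \bar{X}_i)'$$

    $$\lambda = \frac{\sigma_u^2}{\sigma_u^2 + T\sigma_\alpha^2} = (1 - \theta)^2. \qquad (40)$$

    •  If $\lambda = 0$, then FE and RE are equivalent (a lot of weight is given to within-group variation).

    •  If $\lambda = 1$, a lot of weight is given to between-group variation.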

    •   However, 0 < λ


    Summary

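    •  If $E(\alpha_i \mid X_{it}) = 0$:

        •  Both $\hat{\beta}_{RE}$ and $\hat{\beta}_{FE}$ are consistent for $\beta$ (and so would be OLS).

        •  $\hat{\beta}_{RE}$ is efficient; $\hat{\beta}_{OLS}$ has biased standard errors.

    •  If $E(\alpha_i \mid X_{it}) \neq 0$:

        •  $\hat{\beta}_{RE}$ is inconsistent for $\beta$.

        •  $\hat{\beta}_{FE}$ is consistent for $\beta$.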


    Testing

    The efficiency/consistency trade-off between $\hat{\beta}_{RE}$ and $\hat{\beta}_{FE}$ suggests a method to test the random effects restriction. One of these tests is the Hausman test. Under the null hypothesis, $H_0: E(\alpha_i \mid X_{it}) = 0$, $\hat{\beta}_{RE}$ is efficient, but it is inconsistent under the alternative hypothesis ($H_a: E(\alpha_i \mid X_{it}) \neq 0$). In contrast, $\hat{\beta}_{FE}$ is consistent under both $H_0$ and $H_a$.

    The Hausman test statistic for this test is:

    $$H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'\{Var(\hat{\beta}_{FE} - \hat{\beta}_{RE})\}^{-1}(\hat{\beta}_{FE} - \hat{\beta}_{RE}), \qquad (41)$$

    where $Var(\hat{\beta}_{FE} - \hat{\beta}_{RE}) = Var(\hat{\beta}_{FE}) - Var(\hat{\beta}_{RE})$ is the variance-covariance matrix of the difference between the fixed effects and random effects estimators.


    Testing

    Under the null hypothesis, the Hausman test statistic has a $\chi^2$ distribution with degrees of freedom equal to the dimension of $\beta$, i.e.:

    $$H \sim \chi^2_k \qquad (42)$$

    Note: Since the fixed effects estimation method can only identify coefficients on time-variant variables, the relevant dimension of $\beta$ is the number of time-varying variable coefficients.
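    Computing the statistic in Eq. (41) is mechanical once both sets of estimates and their covariance matrices are available. The sketch below uses a hypothetical helper `hausman` and made-up numbers for two time-varying coefficients; they are not estimates from any data set.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Return (H, df) for Eq. (41), using Var(b_fe - b_re) = V_fe - V_re.

    Only coefficients on time-varying variables should be passed in.
    """
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    H = float(diff @ np.linalg.solve(V_diff, diff))
    return H, diff.size

# Illustrative inputs only
H, df = hausman([1.0, 2.0], [0.9, 2.2],
                np.diag([0.02, 0.05]), np.diag([0.01, 0.01]))
```

    The resulting `H` would then be compared with a $\chi^2$ critical value with `df` degrees of freedom (about 5.99 at the 5% level when `df` is 2). In finite samples the estimated difference of covariance matrices need not be positive definite, in which case the statistic is not well behaved.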


    3. Is fixed effects estimation always preferable

    to random effects estimation?


    Is FE always better than RE?

    Recall: The fixed effects estimator uses only the within-group (= difference from group mean) variation and ignores the between-group variation. This method is used because of a concern that this between-group variation is contaminated with unobserved heterogeneity.

    In some cases, the cross-sectional variation may be more reliable than the within-group time-variation, in which case fixed effects estimation may be worse than the OLS or RE alternatives.


    Is FE always better than RE?

    Examples are:

    •  Measurement error in $X_{it}$: If $X_{it}$ is measured with classical, i.e. purely random, error, then taking either differences-from-mean or first-differences will exacerbate the noise-to-signal ratio in the resulting data → serious attenuation bias in $\hat{\beta}_{FE}$.

    •  Endogenous changes in $X_{it}$: If $X$ is endogenous, i.e. changes in $X_{it}$ over time are not exogenous to changes in $Y_{it}$, then fixed effects estimation may be worse than random effects or OLS. In this case, $(X_{it} - \bar{X}_i)$ may be strongly correlated with $(\varepsilon_{it} - \bar{\varepsilon}_i)$.

    •  There may not be enough variation in the $X$ variables, although FE can estimate the coefficient even if $X$ rarely changes (Pluemper and Troeger, 2007).


    4. Hausman-Taylor (1981) approach to estimating coefficients on both time-varying and time-invariant variables


    Hausman-Taylor (1981) approach

    If we have both time-variant and time-invariant variables of interest, Hausman and Taylor show that consistent estimation of the coefficients of interest is possible, provided that not all of the time-varying variables are correlated with the unobserved heterogeneity.

    The basic idea is to use the group means of the time-varying variables that are uncorrelated with the unobserved heterogeneity as instruments for the time-invariant variables to obtain consistent estimates of their coefficients, while consistent estimates of the time-varying variable coefficients can be obtained using standard fixed effects estimation.

    This requires that there are at least as many uncorrelated time-varying variables as correlated time-invariant variables, and also that there is suitable correlation between these.


    Hausman-Taylor (1981) approach

    Consider the linear regression of $Y_{it}$ on $k$ time-varying covariates ($X_{it}$) and $g$ time-invariant covariates ($Z_i$):

    $$Y_{it} = X_{it}'\beta + Z_i'\gamma + \varepsilon_{it}, \qquad (43)$$

    where $i = 1, \dots, N$, $t = 1, \dots, T$, and

    $$\varepsilon_{it} = \alpha_i + u_{it}. \qquad (44)$$


    Hausman-Taylor (1981) approach

    Sub-divide $X_{it} = (X_{1it}\ X_{2it})'$ and $Z_i = (Z_{1i}\ Z_{2i})'$:

    •  $X_{1it}$ and $X_{2it}$ consist of $k_1$ and $k_2$ variables, respectively ($k_1 + k_2 = k$);

    •  $Z_{1i}$ and $Z_{2i}$ consist of $g_1$ and $g_2$ variables, respectively ($g_1 + g_2 = g$);

    •  $E(\alpha_i \mid X_{1it}) = 0$ and $E(\alpha_i \mid Z_{1i}) = 0$; and

    •  $E(\alpha_i \mid X_{2it}) \neq 0$ and $E(\alpha_i \mid Z_{2i}) \neq 0$.


    Hausman-Taylor (1981) approach

    The intuition for the Hausman-Taylor approach is as follows:

    •  STEP 1: Fixed effects provides consistent estimation of the coefficients on the time-varying variables:

    $$\hat{\beta}_{FE} = (X'QX)^{-1}X'QY \qquad (45)$$

    Remember that $Q = I - P$, where $P = I \otimes T^{-1} ii'$. The residual variance obtained in this step is a consistent estimator of $\sigma_u^2$.

    •  STEP 2: Use $\hat{\beta}_{FE}$ to construct the group means of the within-group residuals:

    $$\hat{d}_i = \bar{Y}_i - \bar{X}_i'\hat{\beta}_{FE} = Z_i'\gamma + \alpha_i + \bar{u}_i, \qquad (46)$$

    where $\bar{u}_i$ is the group mean of the residual $u_{it}$.

    •  If (46) were estimated with OLS or GLS, then $\hat{\gamma}$ would likely be biased, due to the correlation of $Z_{2i}$ with $\alpha_i$.


    Hausman Taylor (1981) approach

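    Where does the expression for $\hat{d}$ in (46) come from? The group means of the within-group residuals are derived as follows:

    $$\hat{d} = P(Y - X\hat{\beta}_{FE}) = P\{I - X(X'QX)^{-1}X'Q\}Y$$

    $$= P\{I - X(X'QX)^{-1}X'Q\}(X\beta + Z\gamma + \alpha + u)$$

    $$= P(X\beta + Z\gamma + \alpha + u - X\beta)$$

    $$= P(Z\gamma + \alpha + u)$$

    $$= Z\gamma + \alpha + Pu$$

    The third line uses $QZ = 0$ and $Q\alpha = 0$ (both are constant within groups) and drops the sampling-error term $X(X'QX)^{-1}X'Qu$. The result is a regression of the group-mean residuals from the fixed effects regression on the $Z_i$'s, with $\alpha_i + \bar{u}_i$ being the group-mean residuals.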


    Hausman Taylor (1981) approach

    •  STEP 3: Use $\bar{X}_{1i}$ as instruments for $Z_{2i}$. This will provide consistent estimation of $\gamma$ if there are sufficient $X_1$'s (i.e. order condition: $k_1 \geq g_2$), and the $X_{1i}$'s are correlated with the $Z_{2i}$'s (rank condition).

    •  Then estimate (46) with a 2SLS approach, where:

    $$\hat{\gamma} = (Z_i' P_A Z_i)^{-1} Z_i' P_A \hat{d}_i \qquad (47)$$

    where $A = [X_{1it}\ Z_{1i}]$, and $P_A$ is the projection matrix:

    $$P_A = A(A'A)^{-1}A' \qquad (48)$$

    and

    $$\hat{Z}_2 = A(A'A)^{-1}A'Z_2 \qquad (49)$$


    Hausman Taylor (1981) approach

    •  NOTE: Both $\hat{\beta}_{FE}$ and $\hat{\gamma}_{2SLS}$ are consistent. However, since $\hat{\beta}_{FE}$ is likely to be inefficient, $\hat{\gamma}_{2SLS}$, which stems from the FE approach, is likely to be inefficient too. Therefore, Hausman and Taylor suggest an extension to estimate $\beta$ and $\gamma$ in a more efficient way.

    •  STEP 4: The residual variance in the step above is a consistent estimator of $\sigma^{*2} = \sigma_u^2/T + \sigma_\alpha^2$. Using the consistent estimator of $\sigma_u^2$ from the first step, we deduce an estimator for $\sigma_\alpha^2 = \sigma^{*2} - \sigma_u^2/T$. The weight for feasible GLS is:

    $$\theta = 1 - \frac{\sigma_u}{\sqrt{\sigma_u^2 + T\sigma_\alpha^2}}. \qquad (50)$$


    Hausman Taylor (1981) approach

    •  STEP 5: Construct a weighted instrumental variable estimator. The full set of variables is:

    $$w_{it}' = (X_{1it}'\ X_{2it}'\ Z_{1i}'\ Z_{2i}') \implies W_{NT \times (k_1 + k_2 + g_1 + g_2)}, \qquad (51)$$

    so the transformed variables of GLS are:

    $$w_{it}^{*\prime} = w_{it}' - \hat{\theta}\,\bar{w}_i'$$

    $$Y_{it}^{*} = Y_{it} - \hat{\theta}\,\bar{Y}_i.$$

    The instruments used are:

    $$v_{it}' = [(X_{1it} - \bar{X}_{1i})'\ (X_{2it} - \bar{X}_{2i})'\ Z_{1i}'\ \bar{X}_{1i}'] \qquad (52)$$


    Hausman-Taylor (1981) approach


    1  Instrumental variable estimator (efficient):

    $$(\hat{\beta}\ \hat{\gamma})'_{IV} = [(W^{*\prime}V)(V'V)^{-1}(V'W^{*})]^{-1}[(W^{*\prime}V)(V'V)^{-1}(V'Y^{*})] \qquad (53)$$

    2  Instrumental variable estimator using un-weighted variables (inefficient):

    $$(\hat{\beta}\ \hat{\gamma})'_{IV} = [(W'V)(V'V)^{-1}(V'W)]^{-1}[(W'V)(V'V)^{-1}(V'Y)] \qquad (54)$$

    3  Feasible GLS estimator:

    $$(\hat{\beta}\ \hat{\gamma})'_{GLS} = [W^{*\prime}W^{*}]^{-1}[W^{*\prime}Y^{*}] \qquad (55)$$
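    The estimators in Eqs. (53) and (54) share one algebraic form, which can be sketched as a generic function. The name `iv_estimator` and the simulated regressors are assumptions for illustration; the example does not implement the full Hausman-Taylor procedure, it only shows that the formula collapses to least squares, as in Eq. (55), when the instrument matrix equals the regressor matrix.

```python
import numpy as np

def iv_estimator(W, V, Y):
    """[(W'V)(V'V)^{-1}(V'W)]^{-1} [(W'V)(V'V)^{-1}(V'Y)], as in Eqs. (53)-(54)."""
    WV = W.T @ V
    VV_inv_VW = np.linalg.solve(V.T @ V, V.T @ W)
    VV_inv_VY = np.linalg.solve(V.T @ V, V.T @ Y)
    return np.linalg.solve(WV @ VV_inv_VW, WV @ VV_inv_VY)

rng = np.random.default_rng(3)
# Hypothetical regressor matrix: constant plus two covariates
W = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
Y = W @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=200)

b_iv = iv_estimator(W, W, Y)                   # instruments equal to the regressors
b_ols = np.linalg.solve(W.T @ W, W.T @ Y)      # plain least squares for comparison
```

    Passing the transformed $W^{*}$, $Y^{*}$ and the instrument matrix $V$ built from Eq. (52) would give the weighted estimator of Eq. (53).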


    5. Correlated random effects (CRE): a flexible

    extension to random effect models


    Intuition of CRE


    Recall that the random effects estimator is biased if $\alpha_i$ is correlated with $X_{it}$. Chamberlain (1984) and Mundlak (1978) observed that if $\alpha_i$ is correlated with $X_{it}$ in period $t$, then it will also be correlated with $X_{is}$ in period $s$, where $t \neq s$. One interpretation of this observation is that $X_{it}$ should be included in the period $s$ regression. More generally, all the realisations of the $X$'s should be included in each period's regression.

    That is, if $\alpha_i$ is correlated with $X_{it}$ in the structural form, then all leads and lags of $X_{it}$ should be included in the regression.


    Formalisation of CRE


    Specify the linear projection of $\alpha_i$ on the set of $X_{it}$'s:

    $$\alpha_i = X_{i1}'\lambda_1 + X_{i2}'\lambda_2 + \dots + X_{iT}'\lambda_T + \eta_i. \qquad (56)$$

    Eq. 56 provides a way to decompose $\alpha_i$ into two components:

    1  A component ($X_{i1}'\lambda_1 + X_{i2}'\lambda_2 + \dots + X_{iT}'\lambda_T$) that is correlated with the observable covariates; and

    2  A component ($\eta_i$) that is uncorrelated with the covariates.

    The $\lambda$'s are the projection coefficients that reflect the extent of the correlation between $\alpha_i$ and $X_{it}$, and $\eta_i$ is, by construction, a true random effect, i.e. uncorrelated with $X_{it}$ for all $t$.


    Formalisation of CRE


    Note:

    •  $E(\alpha_i \mid X_{it})$ does not have to be linear in the $X_{it}$'s. It is only the linear correlation that causes bias/inconsistency in the OLS and random effects (GLS) estimators. Hence, only the linear projection is required for CRE to be unbiased/consistent.

    •  Mundlak (1978) adopted the more restricted specification that $\lambda_1 = \lambda_2 = \dots = \lambda_T = \lambda$. This restriction implies that Eq. 56 reduces to:

    $$\alpha_i = (T\bar{X}_i)'\lambda + \eta_i \qquad (57)$$

    Mundlak’s assumption and consequences

    The assumption that the individual-specific effect is equally correlated with all time-period $X_{it}$'s implies a very easy implementation of the correction. All you need to do is to replace $\alpha_i$ in Eq. 58:

    $$Y_{it} = X_{it}'\beta + \alpha_i + u_{it}, \qquad (58)$$

    with (ignoring the scaling factor $T$, which is absorbed into $\lambda$):

    $$\alpha_i = (T\bar{X}_i)'\lambda + \eta_i \qquad (59)$$

    to get:

    $$Y_{it} = X_{it}'\beta + \bar{X}_i'\lambda + \eta_i + u_{it}, \qquad (60)$$

    where $\eta_i$ is a true random effect.
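The Mundlak correction in Eq. 60 amounts to adding the individual means of the covariates as extra regressors. The following is a minimal sketch on simulated data, assuming one scalar covariate and a balanced panel. The data-generating process, the seed, and all variable names are hypothetical. Eq. 60 would normally be estimated by random effects GLS; pooled OLS is used here only to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, beta_true = 2000, 4, 1.5

# Hypothetical DGP: alpha_i is correlated with the individual mean of x.
x = rng.normal(size=(N, T)) + rng.normal(size=(N, 1))
x_bar = x.mean(axis=1, keepdims=True)
alpha = 2.0 * x_bar + rng.normal(size=(N, 1))
y = beta_true * x + alpha + rng.normal(size=(N, T))

def ols(regressors, response):
    """Least-squares coefficients of response on regressors."""
    return np.linalg.lstsq(regressors, response, rcond=None)[0]

ones = np.ones(N * T)
x_long, y_long = x.ravel(), y.ravel()        # stacked individual by individual
x_bar_long = np.repeat(x_bar.ravel(), T)     # x_bar_i repeated for each period

# Pooled OLS that ignores alpha_i: biased because x is correlated with alpha_i.
b_naive = ols(np.column_stack([ones, x_long]), y_long)[1]

# Mundlak regression (Eq. 60): add the individual mean of x as a regressor.
b_mundlak = ols(np.column_stack([ones, x_long, x_bar_long]), y_long)[1]

# Within (fixed effects) estimator for comparison.
x_w = (x - x_bar).ravel()
y_w = (y - y.mean(axis=1, keepdims=True)).ravel()
b_within = (x_w @ y_w) / (x_w @ x_w)

print("naive pooled OLS:", b_naive)
print("Mundlak:", b_mundlak)
print("within (FE):", b_within)
```

In a balanced panel the coefficient on $X_{it}$ in the augmented regression coincides with the within estimate, which is what the last two printed numbers show.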

    Chamberlain’s approach

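    If you do not want to make the strong assumptions made by Mundlak, then implementation of this correction is slightly more difficult. Using Eq. 56 to substitute for $\alpha_i$ in the combined Eq. 30, we get:

    $$\begin{aligned}
    Y_{it} &= X_{it}'\beta + X_{i1}'\lambda_1 + X_{i2}'\lambda_2 + \dots + X_{iT}'\lambda_T + \eta_i + u_{it} \qquad (61) \\
           &= X_{it}'(\beta + \lambda_t) + \sum_{s \neq t} X_{is}'\lambda_s + \eta_i + u_{it}. \qquad (62)
    \end{aligned}$$

    or, in more compact form:

    $$Y_{it} = X_{i1}'\pi_{t1} + X_{i2}'\pi_{t2} + \dots + X_{iT}'\pi_{tT} + \eta_i + u_{it}, \qquad (63)$$

    where

    $$\pi_{ts} = \begin{cases} \lambda_s & s \neq t \\ \beta + \lambda_t & s = t. \end{cases}$$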
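As a worked illustration of Eq. 63, assume $T = 2$ and a single scalar covariate (both are assumptions made here for exposition). Collecting the coefficients $\pi_{ts}$ into a matrix $\Pi$, with the row indexing the equation period $t$ and the column indexing the regressor period $s$, gives:

```latex
\Pi =
\begin{pmatrix}
\pi_{11} & \pi_{12} \\
\pi_{21} & \pi_{22}
\end{pmatrix}
=
\begin{pmatrix}
\beta + \lambda_1 & \lambda_2 \\
\lambda_1 & \beta + \lambda_2
\end{pmatrix},
\qquad
\pi_{11} - \pi_{21} = \pi_{22} - \pi_{12} = \beta .
```

In this case there are four reduced form coefficients but only three structural parameters ($\beta$, $\lambda_1$, $\lambda_2$), so the structure implies one over-identifying restriction.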

    Some explanations

    Eq. 62 is the reduced form equation for the model. The errors ($\eta_i + u_{it}$) are uncorrelated with the regressors. This expression shows that one way to view the problem of ignoring the correlation between the covariates and the unobserved heterogeneity is as an omitted variables problem that can be solved by including all the out-of-period realisations of $X_{is}$ in the period $t$ equation.

    In Eq. 63, the coefficient on $X_{it}$, i.e. $\pi_{tt}$, consists of two components:

    1  The structural effect of interest $\beta$;

    2  The component $\lambda_t$, which reflects the correlation of $X_{it}$ with the unobserved heterogeneity.

    Estimation of CRE

    The parameters of interest ($\beta$ and the $\lambda_t$'s) can be estimated by the minimum distance approach, which requires two steps:

    1  Estimate the unrestricted reduced form equations as outlined in Eq. 63 by OLS. Include all the leads and lags of the $X_{it}$'s in the period $t$ regression, and estimate this regression separately for each time period.

    2  Estimate the parameters of interest by imposing the implied restrictions (see below) on the first-stage reduced form coefficients using a minimum distance estimation method. That is, use a quadratic form criterion as the basis for estimating the parameters of interest in the second stage.

    The implied cross-equation restrictions are:

    1  $\pi_{ts} = \lambda_s \;\; \forall\, t \neq s$;

    2  $\pi_{tt} - \pi_{st} = \beta \;\; \forall\, t \neq s$.

    The details of minimum distance are explained on the white-board.
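The two steps can be sketched on simulated data as follows. This is a minimal illustration, assuming one scalar covariate (K = 1), T = 3, zero-mean variables (so no intercepts are needed), and an identity weighting matrix in the second stage. The efficient minimum distance estimator would instead weight by the inverse of the estimated covariance matrix of the first-stage coefficients. All parameter values and names (`lam_true`, `Pi_hat`, `H`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta_true = 20000, 3, 1.0
lam_true = np.array([0.6, -0.3, 0.4])    # hypothetical projection coefficients

# Hypothetical DGP with one scalar covariate per period.
X = rng.normal(size=(N, T))
alpha = X @ lam_true + rng.normal(size=N)            # Eq. 56
Y = beta_true * X + alpha[:, None] + rng.normal(size=(N, T))

# Step 1: for each period t, regress Y_it on (X_i1, ..., X_iT) by OLS.
# After transposing, row t of Pi_hat holds (pi_t1, ..., pi_tT) from Eq. 63.
Pi_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T

# Step 2: impose pi_ts = beta * 1[t == s] + lambda_s, i.e. vec(Pi) = H @ theta
# with theta = (beta, lambda_1, ..., lambda_T), and minimise the quadratic form
# (vec(Pi_hat) - H theta)' (vec(Pi_hat) - H theta)  (identity weighting).
H = np.zeros((T * T, T + 1))
for t in range(T):
    for s in range(T):
        row = t * T + s              # position of pi_ts in Pi_hat.ravel()
        H[row, 0] = float(t == s)    # beta enters only the own-period coefficient
        H[row, 1 + s] = 1.0          # lambda_s enters every period's equation
theta_hat = np.linalg.lstsq(H, Pi_hat.ravel(), rcond=None)[0]
beta_hat, lam_hat = theta_hat[0], theta_hat[1:]

print("beta_hat:", beta_hat)
print("lambda_hat:", lam_hat)
```

With identity weighting the second stage is just a least-squares fit of the stacked reduced form coefficients on the restriction matrix `H`, which keeps the mapping from the cross-equation restrictions to the estimator easy to see.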

    Evaluation of CRE

    •  This approach is called random effects because it parameterises the distribution of $\alpha_i$ (i.e. by projecting $\alpha_i$ onto the set of sample realisations of $X_{it}$);

    •  It requires estimating $1 + TK + K$ parameters (risk of proliferation of parameters);

    •  It relies on the measured $X_{it}$'s being time-varying. Time-invariant variables will be absorbed into the $\alpha_i$ in this specification;

    •  A test of the (zero-) correlation between the covariates and the unobserved heterogeneity is given by testing $H_0: \lambda_1 = \lambda_2 = \dots = \lambda_T = 0$ vs $H_a$: not all are zero;

    •  An important caveat to the CRE discussion is that $X_{is}$ enters the period $t$ equation only via its correlation with $\alpha_i$. In some situations, out-of-period regressors may have independent, structural reasons for being included (this approach may then fail).


    6. Plumper and Troeger (2007) approach to modelling (nearly) time-invariant variables

    Three-stage procedure

    1  Run a fixed effects model and predict the individual fixed effect;

    2  Decompose the individual fixed effects into the part explained by time-invariant and/or rarely changing variables and an error term ($h_i$);

    3  Re-estimate the first stage by pooled OLS including the time-invariant variables plus the error term of stage 2.

    Three-stage procedure

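    $$y_{it} - \bar{y}_i = \sum_{k=1}^{K}\beta_k (x_{kit} - \bar{x}_{ki}) + \sum_{m=1}^{M}\gamma_m (z_{mi} - z_{mi}) + (e_{it} - \bar{e}_i) + (u_i - u_i) \qquad (64)$$

    The terms in $z_{mi}$ and $u_i$ are zero because these variables do not vary over time.

    Let:

    $$\hat{u}_i = \bar{y}_i - \sum_{k=1}^{K}\hat{\beta}_k \bar{x}_{ki} - \bar{e}_i \qquad (65)$$

    $$\hat{u}_i = \sum_{m=1}^{M}\gamma_m z_{mi} + h_i \qquad (66)$$

    and

    $$h_i = \hat{u}_i - \sum_{m=1}^{M}\gamma_m z_{mi} \qquad (67)$$

    $$y_{it} = \alpha + \sum_{k=1}^{K}\beta_k x_{kit} + \sum_{m=1}^{M}\gamma_m z_{mi} + \delta h_i + \varepsilon_{it} \qquad (68)$$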
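The three stages in Eqs. 64 to 68 can be sketched on simulated data as follows. This is a minimal illustration, assuming a balanced panel with one time-varying regressor and one time-invariant regressor. The data-generating process, the seed, and the variable names are hypothetical, and an intercept is included in stage 2 (an assumption; Eq. 66 is written without one).

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 500, 5

# Hypothetical DGP: x_it is correlated with the unit effect u_i, z_i is time-invariant.
u = rng.normal(size=N)
z = rng.normal(size=N)
x = rng.normal(size=(N, T)) + u[:, None]
y = 1.0 + 2.0 * x + (1.0 * z + u)[:, None] + rng.normal(size=(N, T))

def ols(regressors, response):
    """Least-squares coefficients of response on regressors."""
    return np.linalg.lstsq(regressors, response, rcond=None)[0]

# Stage 1 (Eq. 64): within regression, then recover the unit effects (Eq. 65).
x_bar, y_bar = x.mean(axis=1), y.mean(axis=1)
x_w = (x - x_bar[:, None]).ravel()
y_w = (y - y_bar[:, None]).ravel()
beta_fe = (x_w @ y_w) / (x_w @ x_w)
u_hat = y_bar - beta_fe * x_bar      # mean within residual is zero for each unit

# Stage 2 (Eqs. 66 and 67): regress the unit effects on z, keep the residual h_i.
Z2 = np.column_stack([np.ones(N), z])
stage2 = ols(Z2, u_hat)
h = u_hat - Z2 @ stage2

# Stage 3 (Eq. 68): pooled OLS of y on a constant, x, z and h.
A3 = np.column_stack([np.ones(N * T), x.ravel(), np.repeat(z, T), np.repeat(h, T)])
alpha3, beta3, gamma3, delta3 = ols(A3, y.ravel())

print("stage 1 beta:", beta_fe, "stage 3 beta:", beta3)
print("stage 2 gamma:", stage2[1], "stage 3 gamma:", gamma3)
print("delta:", delta3)
```

In this balanced design, stage 3 returns the stage 1 estimate of $\beta$, the stage 2 estimate of $\gamma$, and $\delta = 1$; the pooled regression mainly serves to produce the final coefficient table.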

    Monte Carlo Simulations

    1  Compare finite sample properties of the FEVD estimator against those of the pooled OLS, RE, and Hausman-Taylor IV estimators (using RMSE as the criterion);

    2  If both time-invariant and time-varying variables correlate strongly with the individual FE, then FEVD outperforms all other estimators;

    3  When considering the estimates of coefficients on rarely changing variables, FEVD outperforms FE if:

    •  The ratio of between to within variation is high (threshold is 1.7), and;

    •  Overall $R^2$ is low, and;

    •  Correlation between rarely changing variables and ind. FE is low.
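The first condition of the rule of thumb can be checked with a few lines of code. The following is a minimal sketch, assuming that "variation" is measured by standard deviations (between-unit SD of the unit means divided by the pooled within-unit SD) and that the data form a balanced (N, T) array. The function name `between_within_ratio` and the simulated variables are hypothetical.

```python
import numpy as np

def between_within_ratio(v):
    """Between-unit SD divided by within-unit SD for a balanced (N, T) array."""
    unit_means = v.mean(axis=1)
    between_sd = unit_means.std(ddof=1)                  # variation across units
    within_sd = (v - unit_means[:, None]).std(ddof=1)    # variation around unit means
    return between_sd / within_sd

rng = np.random.default_rng(4)
N, T = 1000, 5

# A rarely changing variable: large differences across units, small changes over time.
rarely_changing = 3.0 * rng.normal(size=(N, 1)) + rng.normal(size=(N, T))
# A variable that varies mostly over time.
fast_changing = rng.normal(size=(N, T))

ratio_slow = between_within_ratio(rarely_changing)
ratio_fast = between_within_ratio(fast_changing)
print("rarely changing variable exceeds 1.7:", ratio_slow > 1.7)
print("fast changing variable exceeds 1.7:", ratio_fast > 1.7)
```

The ratio alone does not settle the choice of estimator; the rule of thumb also involves the overall $R^2$ and the correlation with the individual effect, which is not observable.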


    7. Application: Effect of Health on Hourly Wages (Contoyannis and Rice, 2001) using six waves of BHPS

    Assumptions

    1  Remember: Use the mean values of the exogenous time-varying variables to instrument the time-invariant endogenous variables;

    2  Time-invariant endogenous variables: Higher Degree;

    3  Time-variant endogenous variables: Health (Psychological and Physiological), workforce sector, occupation;

    4  Test for the validity of the instruments in the Hausman and Taylor approach using a Hausman test (comparing the estimated coefficients with those of a FE model): They should be sufficiently close;

    5  Approach is valid only if health is correlated with the individual, time-invariant effect of wages, but not with the period-specific effects of wages.
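The Hausman comparison in point 4 can be written as a quadratic form in the difference between the two coefficient vectors. The following is a minimal sketch, assuming that coefficient estimates and covariance matrices for the time-varying regressors are already available from a fixed effects fit and a Hausman-Taylor fit. The function name `hausman_statistic` and all numbers are hypothetical and are not taken from Contoyannis and Rice (2001). A pseudo-inverse is used in case the covariance difference is singular.

```python
import numpy as np

def hausman_statistic(b_fe, V_fe, b_ht, V_ht):
    """Quadratic form (b_fe - b_ht)' [V_fe - V_ht]^+ (b_fe - b_ht)."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_ht, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_ht, dtype=float)
    return float(diff @ np.linalg.pinv(V_diff) @ diff)

# Hypothetical estimates for two time-varying regressors (illustration only).
b_fe = [0.10, 0.25]
V_fe = np.diag([0.004, 0.009])
b_ht = [0.12, 0.22]
V_ht = np.diag([0.003, 0.005])

stat = hausman_statistic(b_fe, V_fe, b_ht, V_ht)
print("Hausman statistic:", stat)
```

A small statistic relative to the relevant chi-squared critical value indicates that the Hausman-Taylor and FE coefficients are sufficiently close; the appropriate degrees of freedom depend on the number of over-identifying restrictions in the application.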
