
Panel Data Analysis – Advantages and Challenges

Cheng Hsiao

Introduction

Year    SSCI
1986    29
2003    580
2004    687
2005    773

Three factors contributing to the phenomenal growth

(i) Data availability

(ii) Greater capacity for modeling the complexity of human behavior

(iii) Challenging methodology

Data Availability

US: National Longitudinal Surveys of Labor Market Experience (NLS); Michigan Panel Study of Income Dynamics (PSID)

Eurostat: The European Community Household Panel (ECHP)

Kenya: Primary School Deworming Project (PDSP)

China: Township & Village Enterprises Survey; Financial Institutions Survey (1984-1990)

Taiwan: Household Demographic Survey

Advantages

• Cross-sectional data may reflect inter-individual differences

• Time series data may suffer from multicollinearity and a shortage of degrees of freedom

• Panel data, by blending inter-individual differences with intra-individual dynamics, allow a researcher to specify more complicated behavioral hypotheses than a single cross-section or a single time series

(i) More degrees of freedom, more sample variability, less multicollinearity

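$y = X\beta + u$, where $y$ is $n \times 1$, $X$ is $n \times k$, and $\beta$ is $k \times 1$

$\mathrm{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$

$y_i = x_i \beta_i + u_i$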
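A minimal numerical sketch of the variance formula above, assuming a slope that is common to all individuals (so pooling is valid) and a known error variance. The helper name `ols_slope_variance`, the sample sizes, and the simulated regressor are illustrative assumptions, not taken from the slides. It evaluates $\sigma^2 (X'X)^{-1}$ for one individual's time series and for a pooled panel that contains that series.

```python
import numpy as np


def ols_slope_variance(x, sigma2=1.0):
    """Slope element of sigma^2 (X'X)^{-1} for a regression with an intercept."""
    X = np.column_stack([np.ones(len(x)), x])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return cov[1, 1]


rng = np.random.default_rng(0)
T, N = 20, 50  # hypothetical: 20 periods, 50 individuals

# One individual's time series of the regressor.
x_single = rng.normal(size=T)

# Pooled panel: the same series plus N - 1 further individuals.
x_panel = np.concatenate([x_single, rng.normal(size=(N - 1) * T)])

var_single = ols_slope_variance(x_single)
var_panel = ols_slope_variance(x_panel)

print(f"slope variance, single time series (T={T}): {var_single:.5f}")
print(f"slope variance, pooled panel (N*T={N * T}): {var_panel:.5f}")
```

Because the pooled design adds observations to the same $X'X$, the slope variance cannot increase; if slopes differ across individuals, as in $y_i = x_i \beta_i + u_i$, this gain from pooling is no longer guaranteed to be meaningful.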

(ii) Greater capacity for capturing the complexity of human behavior

(a) Constructing and testing more complicated behavioral hypotheses

- Homogeneous vs. heterogeneous population

Ben-Porath (1973)

- Program Evaluation

Difference-in-Difference method

$y_{1i} = g_1(x_i) + u_{1i}$ if $d_i = 1$ (treatment)

$y_{0i} = g_0(x_i) + u_{0i}$ if $d_i = 0$ (control)

Treatment Effect = $y_{1i} - y_{0i}$

Average Treatment Effect = $E[y_{1i} - y_{0i}]$

Data: $y_i = d_i y_{1i} + (1 - d_i) y_{0i}$

Confounding treatment effect with differences in covariates between control group and treatment group

Bias due to selection on unobservables

$E(u_{1i}) = 0 \neq E(u_{1i} \mid d_i = 1)$

$E(u_{0i}) = 0 \neq E(u_{0i} \mid d_i = 0)$
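The selection problem can be illustrated with a small simulation. This is a sketch under stated assumptions, not the author's example: the unobservable is a time-invariant additive effect `a`, individuals with larger `a` are more likely to be treated, both groups share a common time trend, and all numeric values are hypothetical. It compares a naive treated-versus-control difference with a difference-in-difference estimate that uses before and after outcomes.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
true_effect = 2.0  # hypothetical treatment effect
trend = 0.5        # hypothetical time effect common to both groups

# Unobserved, time-invariant individual effect.
a = rng.normal(size=n)

# Selection on unobservables: treatment is more likely when a is large.
d = (a + rng.normal(size=n) > 0).astype(int)

# Outcomes before and after the treatment period.
y_before = a + rng.normal(size=n)
y_after = a + trend + true_effect * d + rng.normal(size=n)

# Naive cross-sectional comparison after treatment.
naive = y_after[d == 1].mean() - y_after[d == 0].mean()

# Difference-in-difference: change for treated minus change for controls.
did = ((y_after[d == 1] - y_before[d == 1]).mean()
       - (y_after[d == 0] - y_before[d == 0]).mean())

print(f"true effect: {true_effect:.2f}")
print(f"naive:       {naive:.2f}")
print(f"DiD:         {did:.2f}")
```

The naive comparison picks up the difference in `a` between the groups, while the before/after differencing removes it. The result depends on the common-trend assumption built into the simulation.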

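Difference-in-Difference method

$E(y_{ta} - y_{tb}) - E(y_{ca} - y_{cb})$

where t and c denote the treatment and control groups, and a and b denote after and before the treatment.

(b) Controlling the impact of omitted variables

$y_{it} = \alpha_i + x_{it}\beta + u_{it}$

$\alpha_i$ - unobservable

$y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})\beta + (u_{it} - u_{i,t-1})$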
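A short sketch of how differencing removes a time-invariant unobservable, assuming a scalar regressor that is correlated with the individual effect. The data-generating values and the helper `slope` are illustrative assumptions. Pooled least squares that ignores the individual effect is compared with least squares on first differences.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta = 2000, 5, 1.5  # hypothetical panel dimensions and slope

alpha = rng.normal(size=(N, 1))             # unobserved individual effect
x = 0.8 * alpha + rng.normal(size=(N, T))   # regressor correlated with alpha
y = alpha + beta * x + rng.normal(size=(N, T))


def slope(xv, yv):
    """Least-squares slope of yv on xv with an intercept."""
    xc = xv - xv.mean()
    yc = yv - yv.mean()
    return (xc @ yc) / (xc @ xc)


# Pooled regression ignoring alpha: biased because x and alpha are correlated.
pooled = slope(x.ravel(), y.ravel())

# First differences within each individual eliminate alpha.
dx = np.diff(x, axis=1).ravel()
dy = np.diff(y, axis=1).ravel()
first_diff = (dx @ dy) / (dx @ dx)

print(f"true beta:        {beta:.3f}")
print(f"pooled OLS:       {pooled:.3f}")
print(f"first difference: {first_diff:.3f}")
```

A cross-section or a single time series cannot perform this differencing, which is why the panel structure matters for this argument.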

(c) Uncovering dynamic relationships

multicollinearity

(d) Generating more accurate predictions for individual outcomes (exchangeability)

(e) Providing micro foundation for aggregate data analysis

“representative agent” heterogeneity

)( 211 xxxx

(iii) Simplifying Statistical Inference and Computation

(a) Time-series inference

$y_t = \gamma y_{t-1} + u_t$, $u_t$ i.i.d.

,0(~)( ,12

)()( ,1

,0()( ,1^

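(b) Measurement errors

$y_{it} = x_{it}\beta + u_{it}$, $z_{it} = x_{it} + \eta_{it}$

$y_{it} = z_{it}\beta + (u_{it} - \eta_{it}\beta)$

$y_{it} - y_{i,t-j} = (z_{it} - z_{i,t-j})\beta + [(u_{it} - u_{i,t-j}) - (\eta_{it} - \eta_{i,t-j})\beta]$

(c) Dynamic sample selection models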

uufxyyf

ufxyyf

tiitittiit

itittiit

ititit

itittiit

)|(),0|(

)(),|(

0 if 0

Methodology Challenges

Panel data also raises the issue of how best to model unobserved heterogeneity

Standard statistical procedures are developed based on the assumption that y conditional on x is randomly distributed with a common mean

$f(y \mid x; \theta)$

Panel data, by their nature, focus on individual outcomes. Factors affecting individual outcomes are too numerous for the density of $y_{it}$ conditional on $x_{it}$ alone, $f(y_{it} \mid x_{it})$, to be common to all observations.

One way to restore homogeneity is to add conditioning variables, say, $z_{it}$, $w_{it}$, …, so that $y_{it} \sim f(y_{it} \mid x_{it}, z_{it}, w_{it}, \ldots)$.

However

(a) A model is a simplification of reality, not a mimic of reality. Multicollinearity, a shortage of degrees of freedom, etc. may confuse the fundamental relationship between $y_{it}$ and $x_{it}$.

(b) $z_{it}$, $w_{it}$, … may not be observable.

Another way is to let the parameters characterizing the conditional density of $y_{it}$ given $x_{it}$ vary across i and/or over t:

$f(y_{it} \mid x_{it}; \theta_{it})$

Meaningful inference on $\theta_{it}$ can be made only if we assume a certain structure on $\theta_{it}$.

$\beta$ - structural parameters

$\alpha_i$ - incidental parameters (increase with N)

$\alpha_i$ - individual-specific effects represent the effects of those variables that vary across individuals but stay constant over time, at least over a short time span, e.g. ability, socio-economic background variables, marginal utility of initial wealth, etc.

$\alpha_i$ - fixed constant, Fixed Effects Model (FE)

$\alpha_i$ - random variable, Random Effects Model (RE)
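To make the fixed-effects treatment concrete, here is a minimal sketch, assuming a linear model with a scalar regressor and simulated data (all values hypothetical). It estimates the slope twice: once by least squares with one dummy per individual, which estimates N incidental intercepts alongside the structural slope, and once by the within transformation, which subtracts individual means and never estimates the intercepts.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, beta = 30, 4, 1.0  # hypothetical panel dimensions and slope

alpha = rng.normal(size=(N, 1))             # individual-specific effects
x = 0.5 * alpha + rng.normal(size=(N, T))   # regressor correlated with alpha
y = alpha + beta * x + rng.normal(size=(N, T))

# Dummy-variable regression: N intercepts plus one slope.
# Rows are ordered individual by individual, matching x.ravel().
D = np.kron(np.eye(N), np.ones((T, 1)))     # (N*T, N) matrix of dummies
X = np.column_stack([D, x.ravel()])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
beta_lsdv = coef[-1]

# Within transformation: deviations from individual means remove alpha_i.
xw = (x - x.mean(axis=1, keepdims=True)).ravel()
yw = (y - y.mean(axis=1, keepdims=True)).ravel()
beta_within = (xw @ yw) / (xw @ xw)

print(f"parameters in dummy-variable fit: {coef.size}")
print(f"slope, dummy-variable fit:        {beta_lsdv:.6f}")
print(f"slope, within transformation:     {beta_within:.6f}")
```

The two slopes agree up to floating-point error, while the number of intercepts in the dummy-variable fit grows with N. A random-effects treatment would instead model the individual effects as draws from a distribution, which requires assumptions about their relation to the regressor that this simulation deliberately violates.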

Concluding Remarks

The power of panel data to isolate the effects of specific actions, treatments or more general policies depends on the compatibility of the assumptions of statistical tools with the data generating process

Factors to consider:

(1) Advantages

(2) Limitations

(3) Compatibility between assumptions and data generating process

(4) Efficiency
