Methodological Workshop 1: Research Design, Yu Xie, University of Michigan


Post on 28-Dec-2015


TRANSCRIPT

Page 1: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Methodological Workshop 1:

Research Design

Yu Xie

University of Michigan

Page 2: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Otis Dudley Duncan

“But sociology is not like physics. Nothing but physics is like physics, because any understanding of the world that is like the physicist’s understanding becomes part of physics…” (Otis Dudley Duncan. 1984. Notes on Social Measurement. p.169)

Page 3: Methodological Workshop 1: Research Design Yu Xie University of Michigan

First Principle of Social Science

Variability is the very essence of social science research.

“Variability Principle.”

We are interested in understanding how social outcomes vary across members in a human population and over time.

Mortality example.

Page 4: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Second Principle

Social grouping reduces such variability.

“Social Grouping Principle.”

We seek to understand patterns of “between-group” variations in social outcomes.

Mortality example.
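The claim that social grouping reduces variability can be illustrated with a variance decomposition: total variance equals between-group variance plus within-group variance, so the within-group part can never exceed the total. The sketch below is only an illustration; the ages at death and the education labels are made-up numbers, not data from the workshop.

```python
from statistics import fmean


def variance_decomposition(groups):
    """Split total variance into between-group and within-group parts.

    `groups` maps a group label to a list of outcome values.
    Population (divide-by-N) variances are used so the parts add up exactly.
    """
    all_values = [y for ys in groups.values() for y in ys]
    n = len(all_values)
    grand_mean = fmean(all_values)
    total = sum((y - grand_mean) ** 2 for y in all_values) / n
    between = sum(
        len(ys) * (fmean(ys) - grand_mean) ** 2 for ys in groups.values()
    ) / n
    within = sum(
        (y - fmean(ys)) ** 2 for ys in groups.values() for y in ys
    ) / n
    return total, between, within


# Hypothetical ages at death by education group (invented for illustration).
ages_at_death = {
    "less than high school": [66, 70, 74],
    "college": [76, 80, 84],
}

total, between, within = variance_decomposition(ages_at_death)
print(f"total={total:.2f} between={between:.2f} within={within:.2f}")
```

In this toy data most of the variability lies between the two groups; what remains within groups is the individual-level variability that grouping does not explain.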

Page 5: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Third Principle

Patterns of population variability vary with social context, which is often defined by time and space.

“Social Context Principle”

Patterns of between-group variations vary by social context.

Mortality example: is the education-mortality relationship reduced or eliminated through social policy?

Page 6: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Different “Regimes” of Variability

Social contexts are different from social groups in that the former are self-contained social systems with natural boundaries, for example by time and space.

Patterns of individual variability may be governed by “relationships” between individuals that are not reducible to individuals’ attributes.

Patterns of individual variability may be governed by macro-level conditions such as “social structure,” “political structure,” or “culture,” which may be discontinuous and fixed.

Collective action may lead to changes of macro-level conditions and human relationships, which are major sources of social change.

Page 7: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Population Thinking and Statistics

In typological thinking, deviations from the mean are nothing but “errors,” with the mean approaching the true cause. (Example: measurement of the speed of sound.)

In population thinking, deviations are the reality of substantive importance; the mean is a property of a population.

Page 8: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Two Views of Regression

Gaussian View (Typological Thinking):

Observed Data = Constant Model + Measurement Error

Example: $y_i = \mu + \varepsilon_i$, where $\mu$ is a true constant.

Galtonian View (Population Thinking):

Observed Data = Systematic (between-group) Variability + Remaining (within-group) Variability

Example: $y_i = \mu + \varepsilon_i$, where $\mu = E(Y)$, the expected value of Y in the population.

Page 9: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Potential Biases in Regression Analysis

$Y_i = \alpha + \delta_i D_i + \varepsilon_i$

There are two types of variability that may cause biases:

(1) Pre-treatment heterogeneity bias: $\varepsilon_i$. If $\mathrm{corr}(\varepsilon, D) \neq 0$, there is pre-treatment heterogeneity bias.

(2) Treatment-effect heterogeneity bias: $\delta_i$. If $\mathrm{corr}(\delta, D) \neq 0$, there is treatment-effect heterogeneity bias.
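The two biases can be made concrete with a small deterministic sketch. Each unit below carries a residual, an individual treatment effect, and a treatment status, matching the regression on this slide; the baseline of 100 and all unit values are hypothetical. The naive estimate is the difference in mean outcomes between treated and untreated units.

```python
from statistics import fmean

BASELINE = 100.0  # hypothetical intercept of the outcome equation


def naive_estimate(units):
    """Treated-minus-untreated difference in mean observed outcomes.

    Each unit is a tuple (residual, individual_effect, treated).
    """
    treated = [BASELINE + effect + resid for resid, effect, d in units if d == 1]
    untreated = [BASELINE + resid for resid, effect, d in units if d == 0]
    return fmean(treated) - fmean(untreated)


def average_effect(units):
    """Average individual treatment effect over the whole population."""
    return fmean(effect for _, effect, _ in units)


# Treatment unrelated to residuals and effects: no bias.
balanced = [(-5, 10, 1), (5, 10, 1), (-5, 10, 0), (5, 10, 0)]

# (1) Pre-treatment heterogeneity: high-residual units select into treatment.
pre_treatment = [(5, 10, 1), (5, 10, 1), (-5, 10, 0), (-5, 10, 0)]

# (2) Treatment-effect heterogeneity: units with large effects select in.
effect_selection = [(0, 20, 1), (0, 20, 1), (0, 0, 0), (0, 0, 0)]

for name, units in [
    ("balanced", balanced),
    ("pre-treatment", pre_treatment),
    ("effect selection", effect_selection),
]:
    print(name, naive_estimate(units), average_effect(units))
```

In all three populations the average individual effect is 10, but the naive comparison returns 20 in the two scenarios where treatment status is correlated with either the residual or the individual effect.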

Page 10: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Comment

When the first form of heterogeneity bias is present, we may have a “spurious” causal effect.

“Omitted variable bias”

“Correlation does not equal causation.”

Example

[Path diagram with nodes D, Y, and U]

Page 11: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Comment

The second form of heterogeneity bias may result from rational “anticipatory behavior.”

Problem of “self-selection.”

Example

[Path diagram with nodes D, Y, and U]

Page 12: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Yu Xie’s “Fundamental Paradox in Social Science”

There is always variability at the individual level.

Causal inference is impossible at the individual level and thus always requires statistical analysis at the group level on the basis of some homogeneity assumption.

Page 13: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Key Difficulties of a Research Design

(1) How do we know that results based on your “comparison” are valid?

“Internal validity”

(2) How do we know that results based on your “comparison” hold true in other settings?

“External validity”

Page 14: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Research Design Possibilities

Social Experiments (Randomization)

Structural Approach
  Multivariate Analysis (Social Grouping Principle)
  Multi-level Analysis (Social Context Principle)

“Quasi-Experimental Designs” or “Natural Experiments”
  Instrumental Variables (Randomization)
  Regression Discontinuity (Social Context Principle)
  Utilizing Spatial Variation (Social Context Principle)
  Utilizing Temporal Variation (Social Context Principle)

Clustering Design
  Fixed Effects Model (Social Grouping Principle)

Page 15: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Three Key Features of a Good Paper

The harmonious trio: Theory, Design, and Evidence. All need to be in place.

A good theoretical/conceptual framework -> research question.

A good research design -> matching empirical data to the research question.

Good data analysis -> results that address the research question.

Tight integration of the three.

Page 16: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Why Focus on Small Topics?

Socratic method of inquiry in the western tradition.

True knowledge can stand harsh criticisms.

Many important, big questions are not researchable questions, such as the value of life.

From small to big, accumulation of knowledge.

“Demographic tradition” under Duncan’s influence.

Page 17: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Experimental Approach

Experimental design eliminates both forms of heterogeneity bias.

Example: High/Scope Perry Preschool study conducted in Ypsilanti.

Manski and Garfinkel (1992): experimental designs suffer from shortcomings that are often overlooked.

Manski and Garfinkel refer to the experimental approach as “reduced-form.”

Page 18: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Shortcomings of Experimental Approach

We cannot always extrapolate results from an experimental setting to a natural setting.

Thus, Manski and Garfinkel openly criticize experimental designs:

“In fact, reduced-form experimental evaluation actually requires that a highly specific and suspect structural assumption hold: Individuals and organizations must respond in the same way to the experimental version of a program as they would to the actual version.” (p.17)

I.e., lacking “external validity.”

Page 19: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Structural Approach

Manski and Garfinkel propose the “structural” approach as an alternative.

Definition: structural approach refers to statistical methods that model causal processes based on observational data.

Head Start example: control on SES, parental involvement, etc.

Requires strong social science theories.

Page 20: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Comparison of the Two Approaches

Advantages of Structural Approach:

Since it is conducted in a natural setting, its findings are directly relevant to the whole population. In contrast, results from an experimental design need to be extrapolated.

It is less costly. In contrast, experimental research is very expensive.

It builds upon and contributes to theory. In contrast, the reduced-form approach only yields simple answers to simple questions.

Page 21: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Advantages of Reduced-form Approach

Biases due to unobservables can be eliminated through randomization.

It requires fewer assumptions.

It does not require complicated statistical models that the public and government officials have difficulty understanding.

Page 22: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Beyond the Variability Principle

Use of the social grouping principle allows us to better understand group-specific properties, i.e., between-group analyses.

Useful as a descriptive tool. No assumption is needed.

Application of Galtonian regression:

Regression = E(Y|X), where X denotes group.

Page 23: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Using Social Grouping to Control for Heterogeneity

Social grouping always reduces variability => implies within-group homogeneity.

We may assume that meaningful heterogeneity and endogeneity can be captured by social grouping (still wishful thinking).

Assumptions (comment 5) are more plausible after social grouping than before.

Page 24: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Multiple Regression

Change regression to:

$Y_i = \alpha + \delta D_i + \beta' X_i + \varepsilon_i$

Interpretation of $\delta$:

Treatment effect within levels of X, or controlling for X.

[Path diagram with nodes D, Y, and X]
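One way to see what “within levels of X” means is to compare treated and untreated units separately inside each level of X and then average those comparisons. The sketch below uses stratification as a stand-in for the regression; the data are invented so that the outcome is 100 plus 10 for treatment plus 20 for X = 1, and X = 1 units are more likely to be treated.

```python
from statistics import fmean


def mean_difference(rows):
    """Treated-minus-untreated difference in mean outcome; rows are (x, d, y)."""
    treated = [y for _, d, y in rows if d == 1]
    untreated = [y for _, d, y in rows if d == 0]
    return fmean(treated) - fmean(untreated)


def within_level_effect(rows):
    """Size-weighted average of the mean differences computed inside each level of X."""
    levels = sorted({x for x, _, _ in rows})
    effect = 0.0
    for level in levels:
        stratum = [row for row in rows if row[0] == level]
        effect += len(stratum) / len(rows) * mean_difference(stratum)
    return effect


# Hypothetical records (x, d, y): y = 100 + 10*d + 20*x, with d more common when x = 1.
rows = (
    [(0, 0, 100)] * 3 + [(0, 1, 110)] * 1
    + [(1, 0, 120)] * 1 + [(1, 1, 130)] * 3
)

print("ignoring X:", mean_difference(rows))
print("within levels of X:", within_level_effect(rows))
```

Ignoring X gives 20 because treated units are concentrated in the high-outcome level of X; comparing within levels of X recovers the 10 built into the data.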

Page 25: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Comment

For X to do this, it needs to be correlated with D (“correlation condition,” c1) and to affect Y (“relevance condition,” c2).

X should be pre-treatment, determining both D and Y structurally.

[Path diagram with nodes D, Y, and X; the X-D path is labeled c1 and the X-Y path is labeled c2]

Page 26: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Examples: Quasi-Experiment Design Utilizing Spatial Variation

Certain policies are introduced in State A but not in State B.

States A and B are otherwise comparable.

Observe how outcome Y differs between State A and State B.

Pace of economic reforms in China differs greatly by region.

Associate regional variation in returns to education with regional variation in depth of economic reforms.

Page 27: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Examples: Quasi-Experiment Design Utilizing Temporal Variation

Declining significance of race?

Examine temporal changes in SES differences by race.

Hope to see a narrowing of racial gaps, particularly after the civil rights movement.

Effect of a new instructional method:

Page 28: Methodological Workshop 1: Research Design Yu Xie University of Michigan

INSTRUMENTAL VARIABLES

WHAT ARE INSTRUMENTS?

Intuitively, instruments are variables that move around the probability of participation but do not affect outcomes other than through their effect on participation.

Put more statistically, instruments are variables that are correlated with the endogenous variable (in this context, the treatment indicator) but not correlated with the unobservable in the outcome equation.

Page 29: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Instrumental-Variable Approach

Condition: IV Z affects Y only through X, meaning:

Z is correlated with Y but does not affect Y directly (called “exclusion restriction”).

Z is also correlated with X but not perfectly.

It’s very hard to find a good Z.

[Path diagram with nodes X, Y, Z, and U]

Page 30: Methodological Workshop 1: Research Design Yu Xie University of Michigan

WHERE DO INSTRUMENTS COME FROM?

Theory combined with clever data collection

Ex: Lottery number of military enlistment (Angrist 1990)

Ex: distance as in Card (1995)

Page 31: Methodological Workshop 1: Research Design Yu Xie University of Michigan

COMMON EFFECT IV EXAMPLE I

A training center serves two towns: the near town and the far town.

The impact of training on those who take it is 10, while the outcome in the absence of training is 100.

For those in the near town, the cost is zero for everyone. In the far town, for those with a car the cost is essentially zero; for those without one the cost is 10.

Assume that a random half of the eligible persons have a car and that there are 200 eligible persons in each town.

Assume also that everyone knows their cost of training and their benefits from training, and participates only when the benefits exceed the costs.

Page 32: Methodological Workshop 1: Research Design Yu Xie University of Michigan

COMMON EFFECT IV EXAMPLE II

Let Z = 1 denote residence in the near town and Z = 0 denote residence in the far town.

Using our standard notation, with $Y_C$ the outcome in the absence of training and $\Delta$ the impact of training:

$\Pr(D=1 \mid Z=1) = 1$

$\Pr(D=1 \mid Z=0) = 0.5$

$E(Y \mid Z=1) = Y_C + \Delta \Pr(D=1 \mid Z=1) = 100 + 10 \times 1.0 = 110$

$E(Y \mid Z=0) = Y_C + \Delta \Pr(D=1 \mid Z=0) = 100 + 10 \times 0.5 = 105$

Page 33: Methodological Workshop 1: Research Design Yu Xie University of Michigan

COMMON EFFECT IV EXAMPLE III

The IV estimator in this simple case is given by:

$\hat{\Delta}_{IV} = \dfrac{E(Y \mid Z=1) - E(Y \mid Z=0)}{\Pr(D=1 \mid Z=1) - \Pr(D=1 \mid Z=0)}$

Inserting the numbers from the example into the formula gives:

$\hat{\Delta}_{IV} = \dfrac{110 - 105}{1.0 - 0.5} = \dfrac{5}{0.5} = 10$
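The two-town example can be checked by building the population person by person and applying the ratio above. This is a minimal sketch: the numbers (impact 10, baseline 100, 200 eligible persons per town, cost 10 without a car) come from the example, while the function names and the rule "every second person has a car" are assumptions standing in for "a random half."

```python
from statistics import fmean

IMPACT = 10          # effect of training on the outcome (from the example)
BASE_OUTCOME = 100   # outcome in the absence of training (from the example)
PER_TOWN = 200       # eligible persons in each town (from the example)


def build_population():
    """Return a list of (z, d, y): town indicator, participation, outcome."""
    people = []
    for z in (1, 0):  # 1 = near town, 0 = far town
        for i in range(PER_TOWN):
            has_car = i % 2 == 0  # assumption: exactly half own a car
            cost = 0 if (z == 1 or has_car) else 10
            d = 1 if IMPACT > cost else 0  # participate only if benefit exceeds cost
            y = BASE_OUTCOME + IMPACT * d
            people.append((z, d, y))
    return people


def wald_estimate(people):
    """Difference in mean outcomes divided by difference in participation rates."""
    y1 = fmean(y for z, _, y in people if z == 1)
    y0 = fmean(y for z, _, y in people if z == 0)
    p1 = fmean(d for z, d, _ in people if z == 1)
    p0 = fmean(d for z, d, _ in people if z == 0)
    return (y1 - y0) / (p1 - p0)


population = build_population()
iv_estimate = wald_estimate(population)
print(iv_estimate)
```

Because the effect of training is the same for everyone here, the ratio recovers that common effect.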

Page 34: Methodological Workshop 1: Research Design Yu Xie University of Michigan

A CONTINUOUS INSTRUMENT IN A COMMON EFFECT WORLD

The two-stage least squares estimator is commonly used in this case.

In the first stage, the endogenous variable (i.e., the treatment indicator) is regressed on all the exogenous variables, including the instrument.

The second-stage outcome equation regression then includes the predicted value of the endogenous variable rather than the endogenous variable itself.

Standard errors must be corrected to account for the first-stage estimation. Most software packages now do this for you. (ivreg command in Stata.)
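The two stages can be sketched in plain Python for the simplest case of one instrument and no other exogenous variables. Everything in the data is hypothetical: the instrument takes four values, an unobservable u raises both participation and the outcome, and the true effect of participation is set to 10. The sketch returns point estimates only and does not compute the corrected standard errors mentioned above.

```python
from statistics import fmean


def ols(x, y):
    """Intercept and slope from a least-squares fit of y on a single regressor x."""
    mean_x, mean_y = fmean(x), fmean(y)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, d, y):
    """Point estimate of the effect of d on y, using z as the instrument."""
    intercept, slope = ols(z, d)                  # first stage: d on z
    d_hat = [intercept + slope * zi for zi in z]  # predicted participation
    _, effect = ols(d_hat, y)                     # second stage: y on predicted d
    return effect


# Hypothetical data: u is unobserved, unrelated to z, and raises both d and y.
z = [0, 0, 1, 1, 2, 2, 3, 3]
u = [-1, 1, -1, 1, -1, 1, -1, 1]
d = [1 if zi + ui >= 2 else 0 for zi, ui in zip(z, u)]
y = [100 + 10 * di + 5 * ui for di, ui in zip(d, u)]

_, naive_effect = ols(d, y)
iv_effect = two_stage_least_squares(z, d, y)
print("OLS of y on d:", naive_effect)
print("2SLS:", iv_effect)
```

The naive regression of y on d overstates the effect because u drives both variables, while the two-stage estimate recovers the value built into the data. For real analyses a packaged routine is preferable, since the second-stage standard errors from this manual procedure are not valid.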

Page 35: Methodological Workshop 1: Research Design Yu Xie University of Michigan

A Complication: When Treatment Effects are Heterogeneous

The IV estimator then identifies the Local Average Treatment Effect (LATE): the average treatment effect for those persons whose treatment status is affected by random assignment.

Also called the “principal stratification approach.” (Angrist, Imbens, and Rubin 1996; Little and Yau 1998)
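A small sketch shows why the IV ratio becomes a local effect when effects differ across people. The population below is invented: compliers have an effect of 10, always-takers 30, and never-takers 20, with equal numbers of each type in both assignment arms and no defiers (an assumption made for this illustration). Only compliers change treatment status with assignment, so only their effect shows up in the ratio.

```python
from statistics import fmean

# (type, individual treatment effect, count per assignment arm): made-up values
TYPES = [("complier", 10, 2), ("always-taker", 30, 1), ("never-taker", 20, 1)]
BASE_OUTCOME = 100  # hypothetical outcome without treatment


def treatment_received(kind, assignment):
    """Treatment actually received by a unit of the given type under an assignment."""
    if kind == "complier":
        return assignment
    return 1 if kind == "always-taker" else 0


def build_units():
    """Return (assignment, treatment received, outcome) for every unit."""
    units = []
    for r in (0, 1):
        for kind, effect, count in TYPES:
            t = treatment_received(kind, r)
            units.extend([(r, t, BASE_OUTCOME + effect * t)] * count)
    return units


def wald_estimate(units):
    """Difference in mean outcomes over difference in treatment rates, by assignment."""
    y1 = fmean(y for r, _, y in units if r == 1)
    y0 = fmean(y for r, _, y in units if r == 0)
    t1 = fmean(t for r, t, _ in units if r == 1)
    t0 = fmean(t for r, t, _ in units if r == 0)
    return (y1 - y0) / (t1 - t0)


units = build_units()
late = wald_estimate(units)
population_average = sum(e * c for _, e, c in TYPES) / sum(c for _, _, c in TYPES)
print("IV estimate:", late)
print("average effect over everyone:", population_average)
```

The ratio equals the compliers' effect of 10, not the population-wide average of 17.5.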

Page 36: Methodological Workshop 1: Research Design Yu Xie University of Michigan

Classification of Compliance Status

R = assignment, T = treatment received (0 = control, 1 = treatment)

R   T   Possible compliance status
0   0   Compliers, Never-takers
0   1   Defiers, Always-takers
1   0   Defiers, Never-takers
1   1   Compliers, Always-takers
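The classification can also be written as a lookup from the pair (treatment received if assigned to control, treatment received if assigned to treatment). The sketch below is illustrative, and its function names are not from the workshop. Because only one assignment is ever observed for a person, each observed (R, T) pair is consistent with two types, which is what the second function enumerates.

```python
def compliance_type(t_if_control, t_if_treatment):
    """Classify a unit by the treatment it would receive under each assignment."""
    types = {
        (0, 1): "complier",
        (0, 0): "never-taker",
        (1, 1): "always-taker",
        (1, 0): "defier",
    }
    return types[(t_if_control, t_if_treatment)]


def possible_types(assignment, received):
    """Compliance types consistent with one observed (R, T) pair."""
    found = []
    for t0 in (0, 1):
        for t1 in (0, 1):
            observed = t1 if assignment == 1 else t0
            if observed == received:
                found.append(compliance_type(t0, t1))
    return sorted(found)


for r in (0, 1):
    for t in (0, 1):
        print(f"R={r} T={t}: {possible_types(r, t)}")
```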

Page 37: Methodological Workshop 1: Research Design Yu Xie University of Michigan

References

Angrist, Joshua. 1990. “Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records.” American Economic Review 80: 313-36.

Angrist, J. D., G.W. Imbens, and D.B. Rubin. 1996. “Identification of Causal Effects Using Instrumental Variables.” Journal of the American Statistical Association 91(434): 444-455.

Card, David. 1995. “Using Geographic Variation in College Proximity to Estimate the Return to Schooling.” Pp. 201-222 in Aspects of Labour Market Behavior: Essays in Honour of John Vanderkamp, ed. by Louis Christofides, E. Kenneth Grant, and Robert Swidinsky. Toronto: University of Toronto Press.

Little, Roderick J. and Linda H.Y. Yau. 1998. “Statistical Techniques for Analyzing Data from Prevention Trials: Treatment of No-shows Using Rubin's Causal Model.” Psychological Methods 3(2): 147-159.

Manski, C.F., and I. Garfinkel. 1992. “Introduction.” Pp. 1-21 in Evaluating Welfare and Training Programs, edited by Charles F. Manski and Irwin Garfinkel. Cambridge, MA: Harvard University Press.