what can we learn from the experimental turn?

119
Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning? What Can We Learn from the Experimental Turn? Thad Dunning Department of Political Science University of California, Berkeley Thad Dunning The Experimental Turn 1/ 46

Upload: others

Post on 18-Feb-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What Can We Learnfrom the Experimental Turn?

Thad DunningDepartment of Political ScienceUniversity of California, Berkeley

Thad Dunning The Experimental Turn 1/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

0"

2"

4"

6"

8"

10"

12"

14"

Num

ber'o

f'Ar+cles'

Randomized'Controlled'Experiments'Published'in'the'American)Poli-cal)Science)Review)(1906A2004)'

Source:"Jamie"Druckman,"Donald"P."Green,"James"H."Kuklinski,"and"Arthur"Lupia."2006."“The"Growth"and"

Development"of"Experimental"Research"PoliLcal"Science.”"American)Poli-cal)Science)Review)100:"627O635."

Thad Dunning The Experimental Turn 2/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

0"

10"

20"

30"

40"

50"

60"

70"

80"

90"

1960,1989" 1990,1999" 2000,2009"

Num

ber'of'Pub

lishe

d'Ar1cles'

Political Science Economics

Articles published in major political science and economics journals with “natural experiment” in the title or abstract (as tracked in the online archive JSTOR).

Natural Experiments in Political Science and Economics

Thad Dunning The Experimental Turn 3/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Growth of Regression-Discontinuity Designs

1960 1963 1966 1969 1972 1975 1978 1981 1984 1987 1990 1993 1996 1999 2002 2005 2008 2011

020

4060

80100

Figure : The figure shows the number of peer-reviewed journalpublications in economics and political science that refer to an RD designin the title or abstract.

Thad Dunning The Experimental Turn 4/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Promise and Pitfalls of the Experimental Turn

Three (Qualified) Cheers for Design-Based Inference

The Challenge of Cumulative LearningWhither the Experimental Turn?

Thad Dunning The Experimental Turn 5/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Promise and Pitfalls of the Experimental Turn

Three (Qualified) Cheers for Design-Based InferenceThe Challenge of Cumulative Learning

Whither the Experimental Turn?

Thad Dunning The Experimental Turn 5/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Promise and Pitfalls of the Experimental Turn

Three (Qualified) Cheers for Design-Based InferenceThe Challenge of Cumulative LearningWhither the Experimental Turn?

Thad Dunning The Experimental Turn 5/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #1: From complicated models to crediblemulti-method research

In principle, experiments and natural experiments aredesign-based rather than model-based methods

Attempts to confront inferential problems (such asconfounding) come at the research design stage, rather thanthrough ex-post statistical adjustmentInferential leverage comes more from the research design andnot from modeling (e.g., multivariate regression, matching)A simple comparison of means may suffice to establish acausal effect

I As Freedman (2009: 9) puts it, “It is the design of the studyand the size of the effect that compel conviction.”

Thad Dunning The Experimental Turn 6/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #1: From complicated models to crediblemulti-method research

In principle, experiments and natural experiments aredesign-based rather than model-based methodsAttempts to confront inferential problems (such asconfounding) come at the research design stage, rather thanthrough ex-post statistical adjustment

Inferential leverage comes more from the research design andnot from modeling (e.g., multivariate regression, matching)A simple comparison of means may suffice to establish acausal effect

I As Freedman (2009: 9) puts it, “It is the design of the studyand the size of the effect that compel conviction.”

Thad Dunning The Experimental Turn 6/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #1: From complicated models to crediblemulti-method research

In principle, experiments and natural experiments aredesign-based rather than model-based methodsAttempts to confront inferential problems (such asconfounding) come at the research design stage, rather thanthrough ex-post statistical adjustmentInferential leverage comes more from the research design andnot from modeling (e.g., multivariate regression, matching)

A simple comparison of means may suffice to establish acausal effect

I As Freedman (2009: 9) puts it, “It is the design of the studyand the size of the effect that compel conviction.”

Thad Dunning The Experimental Turn 6/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #1: From complicated models to crediblemulti-method research

In principle, experiments and natural experiments aredesign-based rather than model-based methodsAttempts to confront inferential problems (such asconfounding) come at the research design stage, rather thanthrough ex-post statistical adjustmentInferential leverage comes more from the research design andnot from modeling (e.g., multivariate regression, matching)A simple comparison of means may suffice to establish acausal effect

I As Freedman (2009: 9) puts it, “It is the design of the studyand the size of the effect that compel conviction.”

Thad Dunning The Experimental Turn 6/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #1: From complicated models to crediblemulti-method research

In principle, experiments and natural experiments aredesign-based rather than model-based methodsAttempts to confront inferential problems (such asconfounding) come at the research design stage, rather thanthrough ex-post statistical adjustmentInferential leverage comes more from the research design andnot from modeling (e.g., multivariate regression, matching)A simple comparison of means may suffice to establish acausal effect

I As Freedman (2009: 9) puts it, “It is the design of the studyand the size of the effect that compel conviction.”

Thad Dunning The Experimental Turn 6/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

John Snow’s famous study of cholera

Death&rate&from&cholera&in&London,&by&source&of&water&supply&

Rate per 10,000 houses

Southwark & Vauxhall

315

Lambeth 37

Rest of London 59

(adapted'from'Snow'1855,'Table'IX)'

Thad Dunning The Experimental Turn 7/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Digital Democratization (Danny Hidalgo 2011)Module 15, Session 1 Module 15, Session 2 Module 15, Session 3 Module 17, Session 3

Results: Invalid Votes

Electorate (1996)

1998

Blankan

dInvalidVotes

(%of

Total

Votes)

10

20

30

40

50

60

20000 40000 60000 80000

Thad Dunning The Experimental Turn 8/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Simple analysis: graphical balance tests (Hidalgo 2011)Module 15, Session 1 Module 15, Session 2 Module 15, Session 3 Module 17, Session 3

Balance

1996 Electorate

1996

Blankan

dInvalidVotes

(%)

0

5

10

15

20

25

30

0 20000 40000 60000 80000

Thad Dunning The Experimental Turn 9/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

But how credible is the model?

… …

Yi(1)

Treatment group Control group

Yi(1)

Yi(1)€

Yi(1)

Yi(1)

Yi(1)

Yi(1)

Yi(1)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)Study group

Thad Dunning The Experimental Turn 10/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

If the model applies, analysis is simple . . .

… …

Yi(1)

Treatment group Control group

Yi(1)

Yi(1)€

Yi(1)

Yi(1)

Yi(1)

Yi(1)

Yi(1)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)

Yi(0)Study group

1N

[Yi(1) −i=1

N

∑ Yi(0)]The estimand:

An unbiased estimator:

1m

[Yi |Ti =1i=1

m

∑ ] − 1N −m

[Yi |Ti = 0i=m+1

N

∑ ]

where is an indicator for treatment assignment. Under this model, a random subset of size m<N units is assigned to treatment. The units assigned to the treatment group are indexed from 1 to m.

Ti

Thad Dunning The Experimental Turn 11/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

. . . but simplicity and credibility are not guaranteed.

Case ManagementNo Yes

No (A) 10.5% (B) 9.0%Cash

Incentives Yes (C) 14.8% (D) 19.7%

The table displays high-school graduation rates of teenage mothersrandomly assigned to receive: (B) case management; (C) cashincentives; (A) neither; or (D) both. Results are shown for mothers whowere not in school at program entry. Adapted from Mauldon et al (2000,35), N=531.

Thad Dunning The Experimental Turn 12/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

A model-based alternative

Logistic regression is a common choice for analyzingexperimental data with dichotomous outcomes. According toa latent-variables formulation,

Y

i

= 1 i↵ ↵+ �1C

i

+ �2F

i

+ �3(Ci

⇤ F

i

) + u

i

> 0, (1)

where u

i

is a random variable drawn from the standard logisticdistribution; the u

i

are assumed to be independent andidentically distributed (i.i.d.) across subjects. Here, C

i

= 1 ifmother i is assigned to case management and F

i

= 1 ifassigned to financial incentives; if she graduates from highschool, Y

i

= 1.

Thad Dunning The Experimental Turn 13/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Where the modeling goes awry

Why is there a latent variable u

i

, and why are its realizationsi.i.d.? Why are the probabilities independent across i? Whythe logistic distribution function?

I Random assignment implies none of these properties.I The assumptions are quite different from the box model, where

observations are neither independent nor identicallydistributed—and there is no latent variable u

i

in the picture.Note that the difference between average potential outcomesin the cash*case condition and the control is not

⇤(↵+ �1 + �2 + �3) � ⇤(↵). (2)

For this difference, you need to ignore the �1 and �2 terms,leaving ⇤(↵+ �3) � ⇤(↵).

I But the counterfactual reasoning is bizarre: according to themodel, when C

i

= 1 and F

i

= 1, �1 and �2 should not drop out.

Thad Dunning The Experimental Turn 14/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Where the modeling goes awry

Why is there a latent variable u

i

, and why are its realizationsi.i.d.? Why are the probabilities independent across i? Whythe logistic distribution function?

I Random assignment implies none of these properties.

I The assumptions are quite different from the box model, whereobservations are neither independent nor identicallydistributed—and there is no latent variable u

i

in the picture.Note that the difference between average potential outcomesin the cash*case condition and the control is not

⇤(↵+ �1 + �2 + �3) � ⇤(↵). (2)

For this difference, you need to ignore the �1 and �2 terms,leaving ⇤(↵+ �3) � ⇤(↵).

I But the counterfactual reasoning is bizarre: according to themodel, when C

i

= 1 and F

i

= 1, �1 and �2 should not drop out.

Thad Dunning The Experimental Turn 14/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Where the modeling goes awry

Why is there a latent variable u

i

, and why are its realizationsi.i.d.? Why are the probabilities independent across i? Whythe logistic distribution function?

I Random assignment implies none of these properties.I The assumptions are quite different from the box model, where

observations are neither independent nor identicallydistributed—and there is no latent variable u

i

in the picture.

Note that the difference between average potential outcomesin the cash*case condition and the control is not

⇤(↵+ �1 + �2 + �3) � ⇤(↵). (2)

For this difference, you need to ignore the �1 and �2 terms,leaving ⇤(↵+ �3) � ⇤(↵).

I But the counterfactual reasoning is bizarre: according to themodel, when C

i

= 1 and F

i

= 1, �1 and �2 should not drop out.

Thad Dunning The Experimental Turn 14/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Where the modeling goes awry

Why is there a latent variable u

i

, and why are its realizationsi.i.d.? Why are the probabilities independent across i? Whythe logistic distribution function?

I Random assignment implies none of these properties.I The assumptions are quite different from the box model, where

observations are neither independent nor identicallydistributed—and there is no latent variable u

i

in the picture.Note that the difference between average potential outcomesin the cash*case condition and the control is not

⇤(↵+ �1 + �2 + �3) � ⇤(↵). (2)

For this difference, you need to ignore the �1 and �2 terms,leaving ⇤(↵+ �3) � ⇤(↵).

I But the counterfactual reasoning is bizarre: according to themodel, when C

i

= 1 and F

i

= 1, �1 and �2 should not drop out.

Thad Dunning The Experimental Turn 14/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Where the modeling goes awry

Why is there a latent variable u

i

, and why are its realizationsi.i.d.? Why are the probabilities independent across i? Whythe logistic distribution function?

I Random assignment implies none of these properties.I The assumptions are quite different from the box model, where

observations are neither independent nor identicallydistributed—and there is no latent variable u

i

in the picture.Note that the difference between average potential outcomesin the cash*case condition and the control is not

⇤(↵+ �1 + �2 + �3) � ⇤(↵). (2)

For this difference, you need to ignore the �1 and �2 terms,leaving ⇤(↵+ �3) � ⇤(↵).

I But the counterfactual reasoning is bizarre: according to themodel, when C

i

= 1 and F

i

= 1, �1 and �2 should not drop out.

Thad Dunning The Experimental Turn 14/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The importance of assumptions

The Neyman model is flexible and general; for experimentsand natural experiments, it often provides the right startingpoint.

Still, the model imposes restrictions which one must criticallyassess:

I E.g., Non-Interference (a.k.a. SUTVA); Exclusionrestriction; As-if Random

Key point: in some studies, the assumptions are plausible; inothers, they are more suspect.With design-based research assumptions have more preciseimplications—and thus are amenable to partial verification,through simple analysis.

Thad Dunning The Experimental Turn 15/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The importance of assumptions

The Neyman model is flexible and general; for experimentsand natural experiments, it often provides the right startingpoint.Still, the model imposes restrictions which one must criticallyassess:

I E.g., Non-Interference (a.k.a. SUTVA); Exclusionrestriction; As-if Random

Key point: in some studies, the assumptions are plausible; inothers, they are more suspect.With design-based research assumptions have more preciseimplications—and thus are amenable to partial verification,through simple analysis.

Thad Dunning The Experimental Turn 15/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The importance of assumptions

The Neyman model is flexible and general; for experimentsand natural experiments, it often provides the right startingpoint.Still, the model imposes restrictions which one must criticallyassess:

I E.g., Non-Interference (a.k.a. SUTVA); Exclusionrestriction; As-if Random

Key point: in some studies, the assumptions are plausible; inothers, they are more suspect.With design-based research assumptions have more preciseimplications—and thus are amenable to partial verification,through simple analysis.

Thad Dunning The Experimental Turn 15/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The importance of assumptions

The Neyman model is flexible and general; for experimentsand natural experiments, it often provides the right startingpoint.Still, the model imposes restrictions which one must criticallyassess:

I E.g., Non-Interference (a.k.a. SUTVA); Exclusionrestriction; As-if Random

Key point: in some studies, the assumptions are plausible; inothers, they are more suspect.

With design-based research assumptions have more preciseimplications—and thus are amenable to partial verification,through simple analysis.

Thad Dunning The Experimental Turn 15/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The importance of assumptions

The Neyman model is flexible and general; for experimentsand natural experiments, it often provides the right startingpoint.Still, the model imposes restrictions which one must criticallyassess:

I E.g., Non-Interference (a.k.a. SUTVA); Exclusionrestriction; As-if Random

Key point: in some studies, the assumptions are plausible; inothers, they are more suspect.With design-based research assumptions have more preciseimplications—and thus are amenable to partial verification,through simple analysis.

Thad Dunning The Experimental Turn 15/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #2: Close engagement with research contexts

Design-based inference often requires close engagement withthe field

This is true for experiments as for natural experiments:

I With true experiments, the design of manipulations may (andoften should) be informed by field research.

I With natural experiments, “soaking and poaking” is oftencritical for discovering and validating designs.

This is “extracting ideas at close range” (Collier 1999)Thus, natural complementaries between design-basedapproaches and traditional strengths of area studies

Thad Dunning The Experimental Turn 16/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #2: Close engagement with research contexts

Design-based inference often requires close engagement withthe fieldThis is true for experiments as for natural experiments:

I With true experiments, the design of manipulations may (andoften should) be informed by field research.

I With natural experiments, “soaking and poaking” is oftencritical for discovering and validating designs.

This is “extracting ideas at close range” (Collier 1999)Thus, natural complementaries between design-basedapproaches and traditional strengths of area studies

Thad Dunning The Experimental Turn 16/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #2: Close engagement with research contexts

Design-based inference often requires close engagement withthe fieldThis is true for experiments as for natural experiments:

I With true experiments, the design of manipulations may (andoften should) be informed by field research.

I With natural experiments, “soaking and poaking” is oftencritical for discovering and validating designs.

This is “extracting ideas at close range” (Collier 1999)Thus, natural complementaries between design-basedapproaches and traditional strengths of area studies

Thad Dunning The Experimental Turn 16/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #2: Close engagement with research contexts

Design-based inference often requires close engagement withthe fieldThis is true for experiments as for natural experiments:

I With true experiments, the design of manipulations may (andoften should) be informed by field research.

I With natural experiments, “soaking and poaking” is oftencritical for discovering and validating designs.

This is “extracting ideas at close range” (Collier 1999)Thus, natural complementaries between design-basedapproaches and traditional strengths of area studies

Thad Dunning The Experimental Turn 16/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #2: Close engagement with research contexts

Design-based inference often requires close engagement withthe fieldThis is true for experiments as for natural experiments:

I With true experiments, the design of manipulations may (andoften should) be informed by field research.

I With natural experiments, “soaking and poaking” is oftencritical for discovering and validating designs.

This is “extracting ideas at close range” (Collier 1999)

Thus, natural complementaries between design-basedapproaches and traditional strengths of area studies

Thad Dunning The Experimental Turn 16/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #2: Close engagement with research contexts

Design-based inference often requires close engagement withthe fieldThis is true for experiments as for natural experiments:

I With true experiments, the design of manipulations may (andoften should) be informed by field research.

I With natural experiments, “soaking and poaking” is oftencritical for discovering and validating designs.

This is “extracting ideas at close range” (Collier 1999)Thus, natural complementaries between design-basedapproaches and traditional strengths of area studies

Thad Dunning The Experimental Turn 16/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

How do property rights affect the poor? (Galiani andSchargrodsky 2006, 2007)

In 1981, squatters organized by the Catholic church occupiedan urban wasteland in the province of Buenos Aires, dividingthe land into similar parcels

After the return to democracy, a 1984 law expropriated theland, with the intention of transferring title to the squattersSome of the original owners challenged the expropriation incourt, leading to long delays, while other titles were cededand transferred to squattersThe legal action created a treatment group—squatters towhom titles were ceded—and a control group—squatterswhose titles were not cededThe authors administered surveys to both groups

Thad Dunning The Experimental Turn 17/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

How do property rights affect the poor? (Galiani andSchargrodsky 2006, 2007)

In 1981, squatters organized by the Catholic church occupiedan urban wasteland in the province of Buenos Aires, dividingthe land into similar parcelsAfter the return to democracy, a 1984 law expropriated theland, with the intention of transferring title to the squatters

Some of the original owners challenged the expropriation incourt, leading to long delays, while other titles were cededand transferred to squattersThe legal action created a treatment group—squatters towhom titles were ceded—and a control group—squatterswhose titles were not cededThe authors administered surveys to both groups

Thad Dunning The Experimental Turn 17/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

How do property rights affect the poor? (Galiani andSchargrodsky 2006, 2007)

In 1981, squatters organized by the Catholic church occupiedan urban wasteland in the province of Buenos Aires, dividingthe land into similar parcelsAfter the return to democracy, a 1984 law expropriated theland, with the intention of transferring title to the squattersSome of the original owners challenged the expropriation incourt, leading to long delays, while other titles were cededand transferred to squatters

The legal action created a treatment group—squatters towhom titles were ceded—and a control group—squatterswhose titles were not cededThe authors administered surveys to both groups

Thad Dunning The Experimental Turn 17/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

How do property rights affect the poor? (Galiani andSchargrodsky 2006, 2007)

In 1981, squatters organized by the Catholic church occupiedan urban wasteland in the province of Buenos Aires, dividingthe land into similar parcelsAfter the return to democracy, a 1984 law expropriated theland, with the intention of transferring title to the squattersSome of the original owners challenged the expropriation incourt, leading to long delays, while other titles were cededand transferred to squattersThe legal action created a treatment group—squatters towhom titles were ceded—and a control group—squatterswhose titles were not ceded

The authors administered surveys to both groups

Thad Dunning The Experimental Turn 17/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

How do property rights affect the poor? (Galiani andSchargrodsky 2006, 2007)

In 1981, squatters organized by the Catholic church occupiedan urban wasteland in the province of Buenos Aires, dividingthe land into similar parcelsAfter the return to democracy, a 1984 law expropriated theland, with the intention of transferring title to the squattersSome of the original owners challenged the expropriation incourt, leading to long delays, while other titles were cededand transferred to squattersThe legal action created a treatment group—squatters towhom titles were ceded—and a control group—squatterswhose titles were not cededThe authors administered surveys to both groups

Thad Dunning The Experimental Turn 17/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Effects of property rights

The$Effects$of$Land$Titles$on$Children’s$Health!

Source:!Galiani!and!Schargrodsky!(2004).!!No;ce!that!this!is!inten;on=to=treat!analysis.!In!the!first!two!rows,!data!for!children!ages!0=11!are!shown;!in!the!third!row,!data!for!teenage!girls!aged!14=17!are!shown.!!The!number!of!observa;ons!is!in!brackets.!

Thad Dunning The Experimental Turn 18/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Was the ceding of titles as-if random?

For the Argentina land-titling study, one can do quantitativetests of design (e.g. tests for balance on pre-treatmentcovariates).

Just as important is qualitative evidence for example, on theprocess by which the squatting took place:

I Squatters and Catholic Church organizers did not appear toknow the land was privately owned

I They did not anticipate the expropriation of land by the stateI They had no basis for predicting which particular plots of land

would have been expropriated, and thus for assigning plots toparticular squatters

This information does not come from data analysis (or DataSet Observations; see Collier and Brady 2010) but rather fromknowledge of context and process (a.k.a., causal processobservations)

Thad Dunning The Experimental Turn 19/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Was the ceding of titles as-if random?

For the Argentina land-titling study, one can do quantitativetests of design (e.g. tests for balance on pre-treatmentcovariates).Just as important is qualitative evidence for example, on theprocess by which the squatting took place:

I Squatters and Catholic Church organizers did not appear toknow the land was privately owned

I They did not anticipate the expropriation of land by the stateI They had no basis for predicting which particular plots of land

would have been expropriated, and thus for assigning plots toparticular squatters

This information does not come from data analysis (or DataSet Observations; see Collier and Brady 2010) but rather fromknowledge of context and process (a.k.a., causal processobservations)

Thad Dunning The Experimental Turn 19/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Was the ceding of titles as-if random?

For the Argentina land-titling study, one can do quantitativetests of design (e.g. tests for balance on pre-treatmentcovariates).Just as important is qualitative evidence for example, on theprocess by which the squatting took place:

I Squatters and Catholic Church organizers did not appear toknow the land was privately owned

I They did not anticipate the expropriation of land by the stateI They had no basis for predicting which particular plots of land

would have been expropriated, and thus for assigning plots toparticular squatters

This information does not come from data analysis (or DataSet Observations; see Collier and Brady 2010) but rather fromknowledge of context and process (a.k.a., causal processobservations)

Thad Dunning The Experimental Turn 19/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Was the ceding of titles as-if random?

For the Argentina land-titling study, one can do quantitativetests of design (e.g. tests for balance on pre-treatmentcovariates).Just as important is qualitative evidence for example, on theprocess by which the squatting took place:

I Squatters and Catholic Church organizers did not appear toknow the land was privately owned

I They did not anticipate the expropriation of land by the state

I They had no basis for predicting which particular plots of landwould have been expropriated, and thus for assigning plots toparticular squatters

This information does not come from data analysis (or DataSet Observations; see Collier and Brady 2010) but rather fromknowledge of context and process (a.k.a., causal processobservations)

Thad Dunning The Experimental Turn 19/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Was the ceding of titles as-if random?

For the Argentina land-titling study, one can do quantitativetests of design (e.g. tests for balance on pre-treatmentcovariates).Just as important is qualitative evidence for example, on theprocess by which the squatting took place:

I Squatters and Catholic Church organizers did not appear toknow the land was privately owned

I They did not anticipate the expropriation of land by the stateI They had no basis for predicting which particular plots of land

would have been expropriated, and thus for assigning plots toparticular squatters

This information does not come from data analysis (or DataSet Observations; see Collier and Brady 2010) but rather fromknowledge of context and process (a.k.a., causal processobservations)

Thad Dunning The Experimental Turn 19/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Was the ceding of titles as-if random?

For the Argentina land-titling study, one can do quantitativetests of design (e.g. tests for balance on pre-treatmentcovariates).Just as important is qualitative evidence for example, on theprocess by which the squatting took place:

I Squatters and Catholic Church organizers did not appear toknow the land was privately owned

I They did not anticipate the expropriation of land by the stateI They had no basis for predicting which particular plots of land

would have been expropriated, and thus for assigning plots toparticular squatters

This information does not come from data analysis (or DataSet Observations; see Collier and Brady 2010) but rather fromknowledge of context and process (a.k.a., causal processobservations)

Thad Dunning The Experimental Turn 19/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Process tracing

Qualitative methods and contextual knowledge are oftenrequired for designing and validating natural experiments.

I Treatment-Assignment CPOs: pieces or nuggets ofinformation about the process by which units were assigned totreatment and control conditions. Do key actors have theinformation, incentives, and capacity to undermine randomassignment?

I Model-Validation CPOs: Information that contributes tovalidating or undermining assumptions of causal models (e.g.,non-interference, exclusion restrictions).

The information contained within a CPO typically reflectsin-depth knowledge of one or more units, and/or the broadercontext in which data-set observations were generated.

Thad Dunning The Experimental Turn 20/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Process tracing

Qualitative methods and contextual knowledge are oftenrequired for designing and validating natural experiments.

I Treatment-Assignment CPOs: pieces or nuggets ofinformation about the process by which units were assigned totreatment and control conditions. Do key actors have theinformation, incentives, and capacity to undermine randomassignment?

I Model-Validation CPOs: Information that contributes tovalidating or undermining assumptions of causal models (e.g.,non-interference, exclusion restrictions).

The information contained within a CPO typically reflectsin-depth knowledge of one or more units, and/or the broadercontext in which data-set observations were generated.

Thad Dunning The Experimental Turn 20/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Process tracing

Qualitative methods and contextual knowledge are oftenrequired for designing and validating natural experiments.

I Treatment-Assignment CPOs: pieces or nuggets ofinformation about the process by which units were assigned totreatment and control conditions. Do key actors have theinformation, incentives, and capacity to undermine randomassignment?

I Model-Validation CPOs: Information that contributes tovalidating or undermining assumptions of causal models (e.g.,non-interference, exclusion restrictions).

The information contained within a CPO typically reflectsin-depth knowledge of one or more units, and/or the broadercontext in which data-set observations were generated.

Thad Dunning The Experimental Turn 20/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Process tracing

Qualitative methods and contextual knowledge are oftenrequired for designing and validating natural experiments.

I Treatment-Assignment CPOs: pieces or nuggets ofinformation about the process by which units were assigned totreatment and control conditions. Do key actors have theinformation, incentives, and capacity to undermine randomassignment?

I Model-Validation CPOs: Information that contributes tovalidating or undermining assumptions of causal models (e.g.,non-interference, exclusion restrictions).

The information contained within a CPO typically reflectsin-depth knowledge of one or more units, and/or the broadercontext in which data-set observations were generated.

Thad Dunning The Experimental Turn 20/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What can go wrong?

Freedman (2010) discusses the use of qualitative tools inepidemiological research.

I A possible concern: illustrations of successful use of CPOsmay select on the dependent variable.

Often, the deep contextual knowledge required to validate anatural experiment is esoteric.How can we validate the quality of the CPOs in any givenapplication?

Thad Dunning The Experimental Turn 21/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What can go wrong?

Freedman (2010) discusses the use of qualitative tools inepidemiological research.

I A possible concern: illustrations of successful use of CPOsmay select on the dependent variable.

Often, the deep contextual knowledge required to validate anatural experiment is esoteric.How can we validate the quality of the CPOs in any givenapplication?

Thad Dunning The Experimental Turn 21/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What can go wrong?

Freedman (2010) discusses the use of qualitative tools inepidemiological research.

I A possible concern: illustrations of successful use of CPOsmay select on the dependent variable.

Often, the deep contextual knowledge required to validate anatural experiment is esoteric.

How can we validate the quality of the CPOs in any givenapplication?

Thad Dunning The Experimental Turn 21/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What can go wrong?

Freedman (2010) discusses the use of qualitative tools inepidemiological research.

I A possible concern: illustrations of successful use of CPOsmay select on the dependent variable.

Often, the deep contextual knowledge required to validate anatural experiment is esoteric.How can we validate the quality of the CPOs in any givenapplication?

Thad Dunning The Experimental Turn 21/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Mixed-Method Experimental Research

Dunning and Harrison (2010) study the social institution ofcousinage ties (joking kinship) in Mali—which cross-cut ethnicties.

Can cousinage help explain low levels of ethnic voting in Mali?

Thad Dunning The Experimental Turn 22/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Mixed-Method Experimental Research

Dunning and Harrison (2010) study the social institution ofcousinage ties (joking kinship) in Mali—which cross-cut ethnicties.Can cousinage help explain low levels of ethnic voting in Mali?

Thad Dunning The Experimental Turn 22/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Experimental DesignCross-cutting Cleavages and Ethnic Voting February 2010

TABLE 1. Experimental Design: SubjectsAssigned to Treatment and Control Conditions

Subject andpolitician are

joking cousins

Subject andpolitician are notjoking cousins

Subject and politicianare from the sameethnic group

N = 136 N = 122

Subject and politicianare from differentethnic groups

N = 124 N = 152

Control conditionsPolitician’s last name

not givenN = 132

Subject and politicianhave the same lastname

N = 158

along various dimensions. The content of speechesviewed by all subjects was identical. The experimen-tal manipulation consisted of what subjects were toldabout the politician’s last name, which conveys infor-mation about both ethnic identity and cousinage ties inMali.

Our experimental design had six treatment andcontrol conditions. In the four treatment conditionsshown in the top two rows of Table 1, the subject andthe politician are, respectively, joking cousins from thesame ethnic group (N = 136); noncousins from thesame ethnic group (N = 122); joking cousins from dif-ferent ethnic groups (N = 124); or noncousins fromdifferent ethnic groups (N = 152). We also added twoadditional control conditions to the experimental de-sign (bottom rows of Table 1). In the fifth condition,subjects were not provided with information about thepolitician’s last name, and thus received no informationabout their ethnic and cousinage ties to the politician(N = 132). Adding this fifth condition therefore allowsus to estimate treatment effects relative to this baselinecandidate evaluation. Finally, in the sixth condition, thepolitician had the same last name as the subject (N =158). Of course, such politicians are also noncousinsfrom the subject’s own ethnic group (because cousi-nage alliances occur between patronyms, and becauselast name indicates ethnicity); thus, this final controlcondition in one sense coincides with the coethnic,noncousin treatment condition.24 However, adding thisadditional condition allows us to compare treatment ef-fects stemming from cousinage or coethnicity to a sim-ple sameness or clan effect. Experimental subjects wereassigned at random to these six treatment and controlconditions with equal probability, using a computer-generated list of pseudorandom integers between 1 and6 (inclusive).

24 Our evidence supports this assertion. Among subjects assigned toview a speech by a politician with the same last name, 98% said thepolitician was not a cousin, whereas 90% said the politician was acoethnic.

To assign subjects at random to the treatment andcontrol conditions, we needed a way to expose eachsubject to the appropriate stimulus—that is, to a politi-cian’s patronym that corresponds to the relevant cellof Table 1, for a given subject surname. To do this,we reviewed the secondary literature and conductedinterviews with experts on cousinage, as well as or-dinary Malian informants in Bamako. We then cata-logued the surnames associated with each treatmentcondition for more than 200 subject surname-ethnicitycombinations.25 This allowed us to create a large matrixin which each row corresponds to a Malian last namethat we could expect to encounter in the field and eachcolumn gives politicians’ surnames associated with theappropriate treatment or control condition.26 We usedtwo small experiments (N = 42 and N = 169, respec-tively) to test a preliminary version of our matrix. Inconjunction with further qualitative interviews in thefield, these smaller experiments allowed us to createand refine the random assignment matrix used in thelarger experiment reported in this article. Althoughseveral secondary sources describe cousinage alliancesbetween various patronyms, we do not know of anyprevious mapping that is as comprehensive as our ran-dom assignment matrix.

Table 2 shows a typical row of our matrix, this onefor a subject named Keita who is from the Malinke(also known as Maninka) ethnic group. The columns ofTable 2 give the politicians’ surnames associated witheach of the six treatment and control conditions, forsuch a subject. For example, politicians with the sur-names in the first two columns are coethnics of thesubject; however, those in the first column (Sissoko andKonate) are considered cousins of the Keita, whereasthose in the second (Diane) are not. The surnamesin the third and fourth columns, meanwhile, are asso-ciated with other (non-Malinke) ethnic groups, someof which are cousins of the Keita (third column) andsome of which are not (fourth column). In cells withmultiple entries, such as in the first, third, and fourthcolumns in Table 2, the politician’s last name was se-lected at random from the names listed in the cell. Thesurnames included in each column are not intended tobe exhaustive; for instance, the first and third columnsof the matrix do not include all possible cousins for thissubject surname. Rather, we sought to use politiciansurnames for which cousinage links are well under-stood and widely recognized, so that we could accu-rately manipulate the stimuli to which subjects wereexposed.

One important question is whether we were in factable to manipulate subjects’ perceptions of their eth-nic and cousinage ties to politicians. This is importantbecause we ultimately care about how subjects’ per-ceptions of these ties to politicians shape candidate

25 Although last name usually implies a single ethnicity in Mali (asimplied by our experimental design), one will occasionally encounterexceptions. In each row of our matrix, we thus specified the subject’sethnicity and surname.26 The random assignment matrix and other experimental materialsare posted online at http://research.thaddunning.com.

6

Thad Dunning The Experimental Turn 23/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Assignment Matrix

American Political Science Review Vol. 104, No. 1

TABLE 2. Typical Row of Our Random Assignment Matrix

Subject’s (1) (2) (3) (4) (5) (6)Surname Coethnic/ Coethnic/ Not Coethnic/ Not Coethnic/ No Same(Ethnicity) Cousin Not Cousin Cousin Not Cousin Name Name

Keita (Maninka) 1. Sissoko 1. Diane 1. Doucoure 1. Diallo Pas de nom Keita2. Konate 2. Sacko 2. Cisse

3. Sylla 3. Dambele4. Coulibaly 4. Thera5. Toure 5. Toure

6. Togola7. Watarra

evaluations. In some cases, we were concerned thatwe risked misclassifying the stimuli to which subjectsperceived themselves to be exposed. As a manipula-tion check, we therefore asked subjects to identify theethnicity of the politician in the videotape and alsowhether the politician was a joking cousin of the sub-ject. (We did this only after subjects had answered allquestions related to the treatment.)

Subjects perceived both the politician’s ethnic iden-tity and their cousinage ties to the politician with sub-stantial accuracy. First, given only the politician’s lastname and choosing from more than 14 ethnic cate-gories, subjects correctly classified the politician’s eth-nicity more than 80% of the time. In the control con-dition in which no politician surname was provided,subjects’ guesses roughly tracked the distribution ofethnic groups in Bamako. Next, when assigned to viewa speech by a politician from a different ethnic group,subjects correctly classified the politician as a cousinor a noncousin nearly 85% of the time. Subjects moreoften misclassified their cousinage ties to politiciansfrom their own ethnic group (in particular, they moreoften classified coethnic cousins as noncousins thanthey did coethnic, noncousins as cousins). As a sub-stantive matter, the direction of the misclassificationmay serve to emphasize that cousinage alliances aretypically understood to cut across ethnic groups.27

We recruited experimental subjects by approachingmen and women sitting outside homes (or knockingon doors) and asking if they would participate in astudy on political speeches. Distributions on severalmeasured variables in the experimental population,such as ethnicity and age, are similar to those given forBamako and Mali as a whole by representative surveys(Afrobarometer 2007). However, the experimentseverely underrepresents women, who comprise just27% of the experimental population.28 After ap-proaching a potential subject, we administered ascreening questionnaire in which we sought back-ground information, including first and last name and

27 As an inferential matter, however, the slight misclassification maylead us to underestimate the true effects of some treatments, as wediscuss in the next section and in the Appendix.28 In Bamako, women tend to be doing work inside houses or com-pounds, whereas men, when at home, tend to be outside sippingtea.

ethnic identity.29 The information gathered duringscreening allowed us to determine subject eligibilityand to assign subjects randomly to the treatment andcontrol conditions.30

To create the political speech to be viewed by theexperimental subjects, we drew on fieldwork conductedby one of us (Harrison) in Bamako during Mali’s parlia-mentary elections in 2007, as well as secondary sources.The speech focused on standard themes in Malian polit-ical campaigns, such as the need to improve infrastruc-ture, invest in schools, and relieve electricity blackouts.Approximately 56% of experimental subjects said thespeech “reminded them of a speech they had heard on aprevious occasion.” The speech was delivered in Bam-bara/Bamanakan, which is the lingua franca of Bamako(and of Mali).31 The fieldwork for our experiment tookplace from June to October 2008.

Subjects viewed the videotaped political speech on aportable DVD player or laptop computer using head-phones. When subjects were found in groups, only onesubject was recruited per group; only the subject couldhear the speech through the headphones, and each sub-ject answered follow-up questions on his or her own.These features of the research design limited the po-tential for subjects’ responses to treatment to dependon the treatment assignment of other subjects, whichwould violate the standard assumption in experimental

29 The screening questionnaire asked for name, sex, year of birth, lastyear of schooling completed, place of birth, years living in Bamako,where else subject has lived (if anywhere), whether the subject isregistered to vote, language of greatest daily use, the first languagethe subject learned, and the subject’s ethnic identity.30 Around 20% of potential subjects were not eligible to participatebecause their (more unusual) surnames did not appear in the rowsof our random assignment matrix. For such subjects, who are notincluded in this article’s analysis, we showed a single version of thespeech and then administered an abbreviated postspeech question-naire.31 The use of Bambara does not necessarily imply a particular ethnicidentity for the politician. Among experimental subjects who self-identified with an ethnicity other than Bambara/Bamanan, 61%speak Bambara most frequently in daily life, 14% speak bothBambara and French, and 13% speak primarily French—leavingjust 12% of non-Bambaras who use their first language mostfrequently. Also, subjects did not disproportionately attribute aBamanan/Bambara identity to either of our actors/politicians andattributed a similar distribution of ethnicities to both—although oneactor was in fact ethnically Bambara, whereas the other was ethni-cally Peulh.

7

Thad Dunning The Experimental Turn 24/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Experimental Effects

American Political Science Review Vol. 104, No. 1

TABLE 3. Descriptive Statistics on Response Variables(Across All Treatment Conditions)

Variable Range Mean (SD)Mean (SD)Range 0–1

Global evaluation of candidate 1–7 4.53 (1.73) 0.58 (0.29)Global evaluation of speech 1–7 6.29 (1.22) 0.88 (0.20)The candidate . . .

Is likeable 1–5 4.49 (0.61) 0.87 (0.15)Is intelligent 1–5 2.90 (0.95) 0.48 (0.24)Is competent 1–5 2.72 (0.96) 0.43 (0.24)Is impressive 1–7 4.26 (1.69) 0.54 (0.28)Is trustworthy 1–5 2.57 (1.08) 0.39 (0.27)Would do a good job in office 1–7 3.49 (1.78) 0.42 (0.30)Would defend others and fight for his ideals 1–7 2.99 (1.80) 0.33 (0.30)Has good motivations for running 1–7 6.13 (1.39) 0.85 (0.23)Would successfully face challenges of office 1–7 4.00 (1.35) 0.50 (0.23)Has good ideas 1–7 6.01 (1.44) 0.84 (0.24)

The table reports the theoretical (and empirical) range, mean, and standard deviation for each variable analyzedin the article. The final column reports means and standard deviations for each variable, recoded on a 0–1 scale.

FIGURE 1. Average Candidate Evaluations, by Treatment Assignment

Coethnic

Cousins

Coethnics,

Not Cousins

Cousins,

Not Coethnics

Not Coethnics,

Not Cousins

The figure reports average answers by treatment assignment category to the question, “On a scale of 1 to 7, how much does thisspeech make you want to vote for (name of politician/this candidate)?”

negative effects of ethnic differences on candidate eval-uations. In fact, the average evaluation of cousins froma different ethnic group (4.44) is statistically indistin-guishable from the average evaluation of noncousinsfrom the same ethnic group (4.57). On average, subjectsappear roughly indifferent between noncousins fromtheir own ethnic group and cousins from a differentethnic group.

We subjected these results to a variety of robustnesstests. Nonparametric, two-sample Wilcoxon rank-sumtests, which are based on the median rather than themean, tell the same story as the parametric analysis:

coethnics are significantly preferred to non-coethnics,and cousins are significantly preferred to noncousins,whereas preferences for joking cousins from a differ-ent ethnic group and noncousins from the same ethnicgroup are statistically indistinguishable. We also foundvery similar treatment effects for similar questions,such as “On a scale of 1 to 7, how would you ratethe global quality of this speech?” Because our mainanalysis effectively pools across multiple experiments,one for each subject surname, we also analyzed treat-ment effects by individual surnames; although samplesizes are small, even for the most common last names,

9

Thad Dunning The Experimental Turn 25/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Engagement with the field

Dunning and Harrison (2010) drew on local informants tolearn about which surname pairs create cousinage ties, whichis required for constructing the assignment matrix.

There can be a real complementarity between experiments inthe field and various qualitative methods—often, the latter arerequired for the former:

I Design of treatments.I Measurement of outcomes.I Interpretation of results.I Identification of mechanisms through “experimental

ethnography” (Paluck 2010).

Experiments are sometimes viewed as a “quantitative”method, but many of the features and attributes of qualitativefield research are crucial.

Thad Dunning The Experimental Turn 26/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Engagement with the field

Dunning and Harrison (2010) drew on local informants tolearn about which surname pairs create cousinage ties, whichis required for constructing the assignment matrix.There can be a real complementarity between experiments inthe field and various qualitative methods—often, the latter arerequired for the former:

I Design of treatments.I Measurement of outcomes.I Interpretation of results.I Identification of mechanisms through “experimental

ethnography” (Paluck 2010).

Experiments are sometimes viewed as a “quantitative”method, but many of the features and attributes of qualitativefield research are crucial.

Thad Dunning The Experimental Turn 26/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Engagement with the field

Dunning and Harrison (2010) drew on local informants tolearn about which surname pairs create cousinage ties, whichis required for constructing the assignment matrix.There can be a real complementarity between experiments inthe field and various qualitative methods—often, the latter arerequired for the former:

I Design of treatments.

I Measurement of outcomes.I Interpretation of results.I Identification of mechanisms through “experimental

ethnography” (Paluck 2010).

Experiments are sometimes viewed as a “quantitative”method, but many of the features and attributes of qualitativefield research are crucial.

Thad Dunning The Experimental Turn 26/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Engagement with the field

Dunning and Harrison (2010) drew on local informants tolearn about which surname pairs create cousinage ties, whichis required for constructing the assignment matrix.There can be a real complementarity between experiments inthe field and various qualitative methods—often, the latter arerequired for the former:

I Design of treatments.I Measurement of outcomes.

I Interpretation of results.I Identification of mechanisms through “experimental

ethnography” (Paluck 2010).

Experiments are sometimes viewed as a “quantitative”method, but many of the features and attributes of qualitativefield research are crucial.

Thad Dunning The Experimental Turn 26/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Engagement with the field

Dunning and Harrison (2010) drew on local informants tolearn about which surname pairs create cousinage ties, whichis required for constructing the assignment matrix.There can be a real complementarity between experiments inthe field and various qualitative methods—often, the latter arerequired for the former:

I Design of treatments.I Measurement of outcomes.I Interpretation of results.

I Identification of mechanisms through “experimentalethnography” (Paluck 2010).

Experiments are sometimes viewed as a “quantitative”method, but many of the features and attributes of qualitativefield research are crucial.

Thad Dunning The Experimental Turn 26/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Engagement with the field

Dunning and Harrison (2010) drew on local informants tolearn about which surname pairs create cousinage ties, whichis required for constructing the assignment matrix.There can be a real complementarity between experiments inthe field and various qualitative methods—often, the latter arerequired for the former:

I Design of treatments.I Measurement of outcomes.I Interpretation of results.I Identification of mechanisms through “experimental

ethnography” (Paluck 2010).

Experiments are sometimes viewed as a “quantitative”method, but many of the features and attributes of qualitativefield research are crucial.

Thad Dunning The Experimental Turn 26/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Engagement with the field

Dunning and Harrison (2010) drew on local informants tolearn about which surname pairs create cousinage ties, whichis required for constructing the assignment matrix.There can be a real complementarity between experiments inthe field and various qualitative methods—often, the latter arerequired for the former:

I Design of treatments.I Measurement of outcomes.I Interpretation of results.I Identification of mechanisms through “experimental

ethnography” (Paluck 2010).

Experiments are sometimes viewed as a “quantitative”method, but many of the features and attributes of qualitativefield research are crucial.

Thad Dunning The Experimental Turn 26/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #3: Planned and registered analysis

A growing movement towards pre-planning and registration ofdata analysis protocols.

I Helps distinguish between ex-ante and ex-post hypotheses.Ex-post analysis is fine and useful—but it is helpful to know thedifference.

I Substantially limits inferential problems from data mining andallows for meaningful/credible adjustment for multiple statisticalcomparisons.

For design-based inference, it helps a lot to think thingsthrough first!But will registration solve the problems it seems designed tosolve?

Thad Dunning The Experimental Turn 27/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #3: Planned and registered analysis

A growing movement towards pre-planning and registration ofdata analysis protocols.

I Helps distinguish between ex-ante and ex-post hypotheses.Ex-post analysis is fine and useful—but it is helpful to know thedifference.

I Substantially limits inferential problems from data mining andallows for meaningful/credible adjustment for multiple statisticalcomparisons.

For design-based inference, it helps a lot to think thingsthrough first!But will registration solve the problems it seems designed tosolve?

Thad Dunning The Experimental Turn 27/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #3: Planned and registered analysis

A growing movement towards pre-planning and registration ofdata analysis protocols.

I Helps distinguish between ex-ante and ex-post hypotheses.Ex-post analysis is fine and useful—but it is helpful to know thedifference.

I Substantially limits inferential problems from data mining andallows for meaningful/credible adjustment for multiple statisticalcomparisons.

For design-based inference, it helps a lot to think thingsthrough first!But will registration solve the problems it seems designed tosolve?

Thad Dunning The Experimental Turn 27/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #3: Planned and registered analysis

A growing movement towards pre-planning and registration ofdata analysis protocols.

I Helps distinguish between ex-ante and ex-post hypotheses.Ex-post analysis is fine and useful—but it is helpful to know thedifference.

I Substantially limits inferential problems from data mining andallows for meaningful/credible adjustment for multiple statisticalcomparisons.

For design-based inference, it helps a lot to think thingsthrough first!

But will registration solve the problems it seems designed tosolve?

Thad Dunning The Experimental Turn 27/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Cheer #3: Planned and registered analysis

A growing movement towards pre-planning and registration ofdata analysis protocols.

I Helps distinguish between ex-ante and ex-post hypotheses.Ex-post analysis is fine and useful—but it is helpful to know thedifference.

I Substantially limits inferential problems from data mining andallows for meaningful/credible adjustment for multiple statisticalcomparisons.

For design-based inference, it helps a lot to think thingsthrough first!But will registration solve the problems it seems designed tosolve?

Thad Dunning The Experimental Turn 27/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Evidence of publication bias #1

390 Alan S. Gerber, Donald P. Green, and David Nickerson

c O

?

UJ

45

40

35

30

E 25 H 20

15

10

-5

Mail - Canvass | * Phone

15 2 2.5 3 3.5 4 4.5 5

Log (Sample Size) Fig. 1 Relationship between sample size and effect size.

to produce findings that overestimate the true effects, and published treatment effects will decrease as the sample size increases. The published literature is consistent with both of these predictions. Sending voters mailers prior to the election has immense effects when the sample size is small and relatively muted effects when the sample size is large. For

example, while the classic Elders veld (1956) study of the 1953 election found mail to have a massive, 26-percentage-point effect based on a sample size of 43, this figure dropped to 4 when the sample size was expanded to 268 the following year. Using the logged sample sizes of the mail experiments as predictors of the logged effects sizes, we find a slope of ?0.46 (SE = 0.15). Similarly, the effects of phone calls and personal canvassing tend to be much larger in small studies than big ones. The effects of personal canvassing, for

example, is found to be 9 percentage-points in Gerber and Green's (2000) study of 29,380 subjects, compared to 42 percentage-points in Eldersveld's study of 41 subjects prior to the 1953 election. Finally, the four phone experiments suggest that effect size declines as one moves from the small studies by Eldersveld (1956) and Miller et al. (1981) to those by Adams and Smith (1980) and Gerber and Green (2000).

The strong relationship between sample size and effect size is clearly shown in Fig. 1. For all modes of voter contact, studies with larger samples produce smaller estimated effects. Consistent with the examples presented earlier, when samples are small, all modes of treatment appear to work very well. While we cannot observe either the "true effect" of various modes of contact or the publication rule directly, it is nonetheless interesting to

compare the pattern in the published studies to the examples shown in Table 1. Suppose that the true effects of mail, canvassing, and phone calls are similar to those reported in the largest study of voter contact, that by Gerber and Green (2000). For purposes of comparison to the cases analyzed in Table 1, let the true effect of canvassing be 10%, and let the true effect of mail and phone equal 1%.5 In the absence of publication bias, we expect no tendency for smaller studies to show larger or smaller estimated effects. However, when the sample

5Setting the true effect of phone calls to 1% is a matter of convenience. In the presence of publication bias, until

sample sizes reach into the thousands, the published literature when the true effect is 1 % is very similar to the

published literature when the effect is 0%.

This content downloaded from 169.229.32.136 on Sun, 1 Jun 2014 21:57:24 PMAll use subject to JSTOR Terms and Conditions

Source:Gerber, Green and Nickerson (2001)

Thad Dunning The Experimental Turn 28/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Evidence of publication bias #2

Do Statistical Reporting Standards Affect What Is Published? 317

including all specifications (full and partial) in the data analysis. We present an exampleof how regression coefficients were selected in the appendix.

RESULTS

Figures 1(b)(a) and 1(b)(b) show the distribution of z-scores7 for coefficients reportedin the APSR and the AJPS for one- and two-tailed tests, respectively.8 The dashed linerepresents the critical value for the canonical 5% test of statistical significance. There isa clear pattern in these figures. Turning first to the two-tailed tests, there is a dramaticspike in the number of z-scores in the APSR and AJPS just over the critical value of 1.96(see Figure 1(b)(a)). The formation in the neighborhood of the critical value resembles

z-Statistic

Fre

quen

cy

0.16 1.06 1.96 2.86 3.76 4.66 5.56 6.46 7.36 8.26 9.16 10.06 10.96 11.86 12.76 13.66

010

2030

4050

6070

8090

Figure 1(a). Histogram of z-statistics, APSR & AJPS (Two-Tailed). Width of bars(0.20) approximately represents 10% caliper. Dotted line represents critical z-statistic(1.96) associated with p = 0.05 significance level for one-tailed tests.

7 The formal derivation of the caliper test is based on z-scores. However, we replicated the analysesusing t-statistics, and unsurprisingly, the results were nearly identical. Generally, studies employedsufficiently large samples, and there were very few coefficients in the extremely narrow caliperbetween 1.96 and 1.99.

8 Very large outlier z-scores are omitted to make the x-axis labels readable. The omitted cases area very small percentage (between 2.4% and 3.3%) of the sample and do not affect the calipertests. Additionally, authors make it clear in tables whether they are testing one-sided or two-sidedhypotheses.

Source:Gerber and Malhotra (2008)

Thad Dunning The Experimental Turn 29/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registration

I Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plans

I Limits data mining and permits meaningful adjustment formultiple statistical comparisons

I But does not necessarily limit publication bias

Results-blind review

I Allows evaluation based on the quality of the researchquestion and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studies

I But also leaves substantial researcher degrees of freedomPre-analysis plans

I Limits data mining and permits meaningful adjustment formultiple statistical comparisons

I But does not necessarily limit publication bias

Results-blind review

I Allows evaluation based on the quality of the researchquestion and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plans

I Limits data mining and permits meaningful adjustment formultiple statistical comparisons

I But does not necessarily limit publication bias

Results-blind review

I Allows evaluation based on the quality of the researchquestion and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plans

I Limits data mining and permits meaningful adjustment formultiple statistical comparisons

I But does not necessarily limit publication biasResults-blind review

I Allows evaluation based on the quality of the researchquestion and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plansI Limits data mining and permits meaningful adjustment for

multiple statistical comparisons

I But does not necessarily limit publication biasResults-blind review

I Allows evaluation based on the quality of the researchquestion and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plansI Limits data mining and permits meaningful adjustment for

multiple statistical comparisonsI But does not necessarily limit publication bias

Results-blind review

I Allows evaluation based on the quality of the researchquestion and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plansI Limits data mining and permits meaningful adjustment for

multiple statistical comparisonsI But does not necessarily limit publication bias

Results-blind review

I Allows evaluation based on the quality of the researchquestion and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plansI Limits data mining and permits meaningful adjustment for

multiple statistical comparisonsI But does not necessarily limit publication bias

Results-blind reviewI Allows evaluation based on the quality of the research

question and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What practices could fix publication bias?

Study registrationI Allows description of universe of studiesI But also leaves substantial researcher degrees of freedom

Pre-analysis plansI Limits data mining and permits meaningful adjustment for

multiple statistical comparisonsI But does not necessarily limit publication bias

Results-blind reviewI Allows evaluation based on the quality of the research

question and strength of the design not the statisticalsignificance of estimated effects

I A potentially powerful tool for limiting publication bias (but notpracticed yet); some potential drawbacks but notinsurmountable

Thad Dunning The Experimental Turn 30/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What is strong design? (from Dunning, Natural

Experiments in the Social Sciences, 2012)

Typology'of'Natural'Experiments'

Plausibility'of'As#If&Random'

Least& Most&

Cred

ibility'of'the

'Mod

els'

Least&

Most&

Weak'Research'Design'

Strong'Research'Design'

Thad Dunning The Experimental Turn 31/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What is strong design? (from Dunning, Natural

Experiments in the Social Sciences, 2012)

Plausibility*of*As#If&Random*

Least& Most&

Cred

ibility*of*the

*Mod

els*

Least&

Most&

Weak&Research&Design&

Strong&Research&Design&

Galiani&and&Schargrodsky&(2004);&Snow&(1855)&

Doherty&et&al.&(2006)&

Angrist&and&Lavy&(1999)&

Miguel&et&al.&(2004)&

Posner&(2004)&

Brady&&&McNulty&(2011)&

Card&&&Krueger&(1994)&

Grofman&et&al.&(1995)&

Thad Dunning The Experimental Turn 32/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The challenge of cumulative learning: an example oncommunity monitoringPOWER TO THE PEOPLE: EVIDENCE FROM A

RANDOMIZED FIELD EXPERIMENT ONCOMMUNITY-BASED MONITORING IN UGANDA∗

MARTINA BJORKMAN AND JAKOB SVENSSON

This paper presents a randomized field experiment on community-based mon-itoring of public primary health care providers in Uganda. Through two rounds ofvillage meetings, localized nongovernmental organizations encouraged communi-ties to be more involved with the state of health service provision and strengthenedtheir capacity to hold their local health providers to account for performance. Ayear after the intervention, treatment communities are more involved in moni-toring the provider, and the health workers appear to exert higher effort to servethe community. We document large increases in utilization and improved healthoutcomes—reduced child mortality and increased child weight—that compare fa-vorably to some of the more successful community-based intervention trials re-ported in the medical literature.

I. INTRODUCTION

Approximately eleven million children under five years dieeach year and almost half of these deaths occur in sub-SaharanAfrica. More than half of these children will die of diseases (e.g.,diarrhea, pneumonia, malaria, measles, and neonatal disorders)that could easily have been prevented or treated if the childrenhad had access to a small set of proven, inexpensive services(Black, Morris, and Bryce 2003; Jones et al. 2003).

Why are these services not provided? Anecdotal, and re-cently more systematic, evidence points to one possible reason—ineffective systems of monitoring and weak accountability

∗This project is a collaborative exercise involving many people. Foremost, weare deeply indebted to Frances Nsonzi and Ritva Reinikka for their contributionsat all stages of the project. We would also like to acknowledge the importantcontributions of Gibwa Kajubi, Abel Ojoo, Anthony Wasswa, James Kanyesigye,Carolyn Winter, Ivo Njosa, Omiat Omongin, Mary Bitekerezo, and the field anddata staff with whom we have worked over the years. We thank the Uganda Min-istry of Health, Planning Division, the World Bank’s Country Office in Uganda,and the Social Development Department, the World Bank, for their cooperation.We are grateful for comments and suggestions by Paul Gertler, Esther Duflo, Ab-hijit Banerjee, and seminar and conference participants at Stanford, Berkeley,LSE, Oxford, IGIER, MIT, World Bank, NTNU, Namur, UPF, CEPR/EUDN con-ference in Paris, and BREAD & CESifo conference in Venice. We also thank threeanonymous referees and the editor, Lawrence Katz, for very constructive sugges-tions. Financial support from the Bank-Netherlands Partnership Program, theWorld Bank Research Committee, the World Bank Africa Region division, and theSwedish International Development Agency, Department for Research Coopera-tion is gratefully acknowledged. Bjorkman also thanks Jan Wallander’s and TomHedelius’ Research Foundation for funding.

C⃝ 2009 by the President and Fellows of Harvard College and the Massachusetts Institute ofTechnology.The Quarterly Journal of Economics, May 2009

735

at University of California, Berkeley on June 1, 2014

http://qje.oxfordjournals.org/D

ownloaded from

Source: Bjorkmann and Svensson 2009

Thad Dunning The Experimental Turn 34/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

But perhaps only top-down monitoring matters...

200

[ Journal of Political Economy, 2007, vol. 115, no. 2]! 2007 by The University of Chicago. All rights reserved. 0022-3808/2007/11502-0002$10.00

Monitoring Corruption: Evidence from a FieldExperiment in Indonesia

Benjamin A. OlkenHarvard University and National Bureau of Economic Research

This paper presents a randomized field experiment on reducing cor-ruption in over 600 Indonesian village road projects. I find that in-creasing government audits from 4 percent of projects to 100 percentreduced missing expenditures, as measured by discrepancies betweenofficial project costs and an independent engineers’ estimate of costs,by eight percentage points. By contrast, increasing grassroots partic-ipation in monitoring had little average impact, reducing missing ex-penditures only in situations with limited free-rider problems andlimited elite capture. Overall, the results suggest that traditional top-down monitoring can play an important role in reducing corruption,even in a highly corrupt environment.

I. Introduction

Corruption is a significant problem in much of the developing world.In many cases, corruption acts like a tax, adding to the cost of providingpublic services and conducting business. Often, though, the efficiency

I wish to thank Alberto Alesina, Abhijit Banerjee, Robert Barro, Dan Biller, StephenBurgess, Francesco Caselli, Joe Doyle, Esther Duflo, Pieter Evers, Amy Finkelstein, BrianJacob, Seema Jayachandran, Ben Jones, Larry Katz, Philip Keefer, Michael Kremer, JeffLiebman, Erzo Luttmer, Ted Miguel, Chris Pycroft, Lant Pritchett, Mike Richards, MarkRosenzweig, numerous seminar participants, Steven Levitt (the editor), and two anony-mous referees for helpful comments. Special thanks are due to Victor Bottini, RichardGnagey, Susan Wong, and especially to Scott Guggenheim for their support and assistancethroughout the project. The field work and engineering survey would have been impossiblewithout the dedication of Faray Muhammad and Suroso Yoso Oetomo, as well as the entireP4 field staff. This project was supported by a grant from the Department for InternationalDevelopment–World Bank Strategic Poverty Partnership Trust Fund. All views expressedare those of the author, and do not necessarily reflect the opinions of DfID or the WorldBank.

Source: Olken 2007

Thad Dunning The Experimental Turn 35/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Or perhaps community monitoring doesn’t matter at al

Electronic copy available at: http://ssrn.com/abstract=2228900

!!!

Abstract!!

We!study!a!randomized!educational!intervention!in!550!households!in!26!matched!villages! in! two! Kenyan! districts.! The! intervention! provided! parents! with!information!about!their!children’s!performance!on!literacy!and!numeracy!tests,!and!materials! about! how! to! become! more! involved! in! improving! their! children’s!learning.!We! find! the! provision! of! such! information! had! no! discernible! impact! on!either!private!or!collective!action.!In!discussing!these!findings,!we!articulate!a!causal!chain!linking!information!provision!to!changes!in!citizens’!behavior,!and!assess!the!present!intervention!at!each!step.!!Future!research!on!information!provision!should!pay!greater!attention!to!this!causal!chain.!

!!!

KEYWORDS:!Kenya;!Africa;!Information;!Accountability;!Education;!Field!Experiments!! !

Source: Tsai, Lieberman, and Posner 2013

Thad Dunning The Experimental Turn 36/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Challenges to cumulation

Similar questions arise in many settings—e.g. the effects ofinformation on political accountability

Answers from distinct research projects could differ for manyreasons:

I The interventions are differentI The outcomes are differentI “It depends”

Publication biases are also a real concern.

Thad Dunning The Experimental Turn 37/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Challenges to cumulation

Similar questions arise in many settings—e.g. the effects ofinformation on political accountabilityAnswers from distinct research projects could differ for manyreasons:

I The interventions are differentI The outcomes are differentI “It depends”

Publication biases are also a real concern.

Thad Dunning The Experimental Turn 37/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Challenges to cumulation

Similar questions arise in many settings—e.g. the effects ofinformation on political accountabilityAnswers from distinct research projects could differ for manyreasons:

I The interventions are different

I The outcomes are differentI “It depends”

Publication biases are also a real concern.

Thad Dunning The Experimental Turn 37/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Challenges to cumulation

Similar questions arise in many settings—e.g. the effects ofinformation on political accountabilityAnswers from distinct research projects could differ for manyreasons:

I The interventions are differentI The outcomes are different

I “It depends”

Publication biases are also a real concern.

Thad Dunning The Experimental Turn 37/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Challenges to cumulation

Similar questions arise in many settings—e.g. the effects ofinformation on political accountabilityAnswers from distinct research projects could differ for manyreasons:

I The interventions are differentI The outcomes are differentI “It depends”

Publication biases are also a real concern.

Thad Dunning The Experimental Turn 37/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Challenges to cumulation

Similar questions arise in many settings—e.g. the effects ofinformation on political accountabilityAnswers from distinct research projects could differ for manyreasons:

I The interventions are differentI The outcomes are differentI “It depends”

Publication biases are also a real concern.

Thad Dunning The Experimental Turn 37/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

EGAP Metaketa

The Experiments in Governance and Politics (EGAP) group,in conjunction with the Center on the Politics of Developmentat Berkeley, is currently leading an effort emphasizing:

I integrated research programsI teams of researchersI projects in parallel around the worldI generalizable answers to major questions of scholarly and

policy importance(hopefully!)

Thad Dunning The Experimental Turn 38/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

EGAP Metaketa

The Experiments in Governance and Politics (EGAP) group,in conjunction with the Center on the Politics of Developmentat Berkeley, is currently leading an effort emphasizing:

I integrated research programs

I teams of researchersI projects in parallel around the worldI generalizable answers to major questions of scholarly and

policy importance(hopefully!)

Thad Dunning The Experimental Turn 38/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

EGAP Metaketa

The Experiments in Governance and Politics (EGAP) group,in conjunction with the Center on the Politics of Developmentat Berkeley, is currently leading an effort emphasizing:

I integrated research programsI teams of researchers

I projects in parallel around the worldI generalizable answers to major questions of scholarly and

policy importance(hopefully!)

Thad Dunning The Experimental Turn 38/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

EGAP Metaketa

The Experiments in Governance and Politics (EGAP) group,in conjunction with the Center on the Politics of Developmentat Berkeley, is currently leading an effort emphasizing:

I integrated research programsI teams of researchersI projects in parallel around the world

I generalizable answers to major questions of scholarly andpolicy importance(hopefully!)

Thad Dunning The Experimental Turn 38/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

EGAP Metaketa

The Experiments in Governance and Politics (EGAP) group,in conjunction with the Center on the Politics of Developmentat Berkeley, is currently leading an effort emphasizing:

I integrated research programsI teams of researchersI projects in parallel around the worldI generalizable answers to major questions of scholarly and

policy importance(hopefully!)

Thad Dunning The Experimental Turn 38/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The motivation

Individual researchers working do not necessarily generatethe optimal set of studies for knowledge cumulationPublication bias

Little replication of existing studiesValue of being firstLittle coordination across countries, even with similarinterventions

Thad Dunning The Experimental Turn 39/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The motivation

Individual researchers working do not necessarily generatethe optimal set of studies for knowledge cumulationPublication bias

Little replication of existing studiesValue of being firstLittle coordination across countries, even with similarinterventions

Thad Dunning The Experimental Turn 39/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The motivation

Individual researchers working do not necessarily generatethe optimal set of studies for knowledge cumulationPublication biasLittle replication of existing studies

Value of being firstLittle coordination across countries, even with similarinterventions

Thad Dunning The Experimental Turn 39/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The motivation

Individual researchers working do not necessarily generatethe optimal set of studies for knowledge cumulationPublication biasLittle replication of existing studiesValue of being first

Little coordination across countries, even with similarinterventions

Thad Dunning The Experimental Turn 39/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

The motivation

Individual researchers working do not necessarily generatethe optimal set of studies for knowledge cumulationPublication biasLittle replication of existing studiesValue of being firstLittle coordination across countries, even with similarinterventions

Thad Dunning The Experimental Turn 39/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Background on Metaketa Round 1

Challenge to come up with a question that:

I Fit with the general predefined substantive areaI Is important to policymakers and academicsI Includes interventions and outcomes that can be unified

across multiple studiesI Leaves sufficient leeway to researchers to innovate (publish)

Thad Dunning The Experimental Turn 40/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Background on Metaketa Round 1

Challenge to come up with a question that:I Fit with the general predefined substantive area

I Is important to policymakers and academicsI Includes interventions and outcomes that can be unified

across multiple studiesI Leaves sufficient leeway to researchers to innovate (publish)

Thad Dunning The Experimental Turn 40/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Background on Metaketa Round 1

Challenge to come up with a question that:I Fit with the general predefined substantive areaI Is important to policymakers and academics

I Includes interventions and outcomes that can be unifiedacross multiple studies

I Leaves sufficient leeway to researchers to innovate (publish)

Thad Dunning The Experimental Turn 40/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Background on Metaketa Round 1

Challenge to come up with a question that:I Fit with the general predefined substantive areaI Is important to policymakers and academicsI Includes interventions and outcomes that can be unified

across multiple studies

I Leaves sufficient leeway to researchers to innovate (publish)

Thad Dunning The Experimental Turn 40/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Background on Metaketa Round 1

Challenge to come up with a question that:I Fit with the general predefined substantive areaI Is important to policymakers and academicsI Includes interventions and outcomes that can be unified

across multiple studiesI Leaves sufficient leeway to researchers to innovate (publish)

Thad Dunning The Experimental Turn 40/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Research question: Why do voters selectunderperforming politicians?

Two-arm proposal structure:

I A common informational arm, focused on performance ofpoliticians

I At least one alternate arm that may vary across projects.This structure promotes replication andcomparability—through the first treatment arm—whilepreserving room for innovation through the second arm.

I Challenges: Capacity to generate integrated projects isuntested; failure rate of individual studies may be high;heterogeneous treatment effects.

Thad Dunning The Experimental Turn 41/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Research question: Why do voters selectunderperforming politicians?

Two-arm proposal structure:I A common informational arm, focused on performance of

politicians

I At least one alternate arm that may vary across projects.This structure promotes replication andcomparability—through the first treatment arm—whilepreserving room for innovation through the second arm.

I Challenges: Capacity to generate integrated projects isuntested; failure rate of individual studies may be high;heterogeneous treatment effects.

Thad Dunning The Experimental Turn 41/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Research question: Why do voters selectunderperforming politicians?

Two-arm proposal structure:I A common informational arm, focused on performance of

politiciansI At least one alternate arm that may vary across projects.

This structure promotes replication andcomparability—through the first treatment arm—whilepreserving room for innovation through the second arm.

I Challenges: Capacity to generate integrated projects isuntested; failure rate of individual studies may be high;heterogeneous treatment effects.

Thad Dunning The Experimental Turn 41/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Research question: Why do voters selectunderperforming politicians?

Two-arm proposal structure:I A common informational arm, focused on performance of

politiciansI At least one alternate arm that may vary across projects.

This structure promotes replication andcomparability—through the first treatment arm—whilepreserving room for innovation through the second arm.

I Challenges: Capacity to generate integrated projects isuntested; failure rate of individual studies may be high;heterogeneous treatment effects.

Thad Dunning The Experimental Turn 41/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Research question: Why do voters selectunderperforming politicians?

Two-arm proposal structure:I A common informational arm, focused on performance of

politiciansI At least one alternate arm that may vary across projects.

This structure promotes replication andcomparability—through the first treatment arm—whilepreserving room for innovation through the second arm.

I Challenges: Capacity to generate integrated projects isuntested; failure rate of individual studies may be high;heterogeneous treatment effects.

Thad Dunning The Experimental Turn 41/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Projects in Metaketa Round #1

Project! Title! PIs! Informa1on2on…! Method!

Benin!Can!Common!Knowledge!

Improve!Common!Goods?!

Adida,!Go7lieb,!

McClendon,!

&Kramon!

legisla<ve!performance!of!

depu1es2in2the2Na1onal2Assembly!

Legislator!performance!info!

provided!publicly!or!privately!

and!a!civics!message!

Mexico!Common!Knowledge,!

Rela<ve!Performance,!&!

Poli<cal!Accountability!

Larreguy,!Arias,!

Querubin,!&!

Marshall!

corrup<on!and!the!misuse!

of!public!funds!by!local2government2officials!

Leaflets!distr.!doorHtoHdoor!

vs.!leaflets!w/cars!using!

loudspeakers!

India!Using!Local!Networks!to!

Increase!Accountability!

Chauchard!

&Sircar!

financial!crimes!against!

Members2of2the2state2assembly!

DoorHtoHdoor!campaigns!vs.!

public!rallies!

Brazil!Accountability!&!

Incumbent!Performance!in!

Brazilian!Northeast!

Hidalgo,!Boas,!&!

Melos!

performance!gathered!

from!audit!reports!of!the2local2government!

Report!cards!&!an!oral!

message!

Burkina2Faso!

Ci<zens!at!the!Council! Lierl!&!Holmlund!service!delivery!by!the!

municipal2government!Scorecard!vs.!a7ending!local!

council!mee<ng!

Uganda2I!Informa<on!&!

Accountability!in!Primary!

&!General!Elec<ons!

Raffler!&Platas!

Izama!

service!delivery!by!the2local2government!

Recorded!candidate!

statements!viewed!publicly!

&privately!

Uganda2II2!Repairing!Informa<on!

Underload!

Nielson,!

Buntaine,!Bush,!

Pickering!&!

Jablonski!

service!delivery!by!the!

local2government/2varia<on!in!info!effect!if!$!

from!foreign!donors?!!

Informa<on!sent!by!SMS!to!

randomly!sampled!

households.!

Thad Dunning The Experimental Turn 42/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What do these projects have in common?

Each provides performance information relevant to voterwelfareEach provides relative performance informationEach provides information that is attributable to a candidateEach provides the information privately to individualsEach uses publicly available performance information.There is no deception involved in any of the interventions.Each is implemented in collaboration with a local partner.

Thad Dunning The Experimental Turn 43/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

How$can$we$characterize$their$differences?$$

RELEVANCE(FOR(INDIVIDUAL(WELFARE!

Low(! High!

Programma9c(stances!

Raffler/Izama!

Legisla9ve(performance((a,endance!records):!Adida!et!al.,!Nielson!et!al.!

Performance(audits((accoun<ng,!procurement):!Hidalgo/Boas,!Nielson!et!al.!

Malfeasance,(corrup9on,(criminality(Arias!et!al.,!Chauchard,!Sircar!

Public(service(delivery(!

Nielson!et!al.,!Lierl/Holmlund,!Raffler/Izama!

Targeted(policies(Hidalgo/Boas!crop!guarantee!!

Thad Dunning The Experimental Turn 44/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

What do these projects have in common?

All projects in the Metaketa will abide by a common set ofprinciples above and beyond minimal requirements:

I Principles on research transparency(http://egap.org/resources/egap-statement-of-principles/)

I Protect staff: Do not put research staff in harms way.I Informed consent: Participantsknow that information they get

is provided as part of a research project. Core project data willbe publicly available in primary languages athttp://egap.org/research/metaketa/

I Partnership with local civil society actorsI Non-partisan interventionsI Approval from the relevant electoral commission when

appropriate

Thad Dunning The Experimental Turn 45/ 46

Introduction One Cheer... Two Cheers... Three Cheers for Strong Design Cumulative Learning?

Some challenges

Seven studies participating in Metaketa IEGAP Metaketa I committee staff facilitating coordinationacross teams of researchersDrafting committee and PIs wrote plan of plansNew challenges emerged

I Meta-analysis; links to theory; what to do about covariateadjustment . . .

Thad Dunning The Experimental Turn 46/ 46