


American Political Science Review Vol. 105, No. 4 November 2011

doi:10.1017/S0003055411000414

Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

KOSUKE IMAI Princeton University
LUKE KEELE Pennsylvania State University
DUSTIN TINGLEY Harvard University
TEPPEI YAMAMOTO Massachusetts Institute of Technology

Identifying causal mechanisms is a fundamental goal of social science. Researchers seek to study not only whether one variable affects another but also how such a causal relationship arises. Yet commonly used statistical methods for identifying causal mechanisms rely upon untestable assumptions and are often inappropriate even under those assumptions. Randomizing treatment and intermediate variables is also insufficient. Despite these difficulties, the study of causal mechanisms is too important to abandon. We make three contributions to improve research on causal mechanisms. First, we present a minimum set of assumptions required under standard designs of experimental and observational studies and develop a general algorithm for estimating causal mediation effects. Second, we provide a method for assessing the sensitivity of conclusions to potential violations of a key assumption. Third, we offer alternative research designs for identifying causal mechanisms under weaker assumptions. The proposed approach is illustrated using media framing experiments and incumbency advantage studies.

Kosuke Imai is Associate Professor, Department of Politics, Princeton University, Corwin Hall 036, Princeton NJ 08544 ([email protected]).

Luke Keele is Assistant Professor, Department of Political Science, Pennsylvania State University, 211 Pond Lab, University Park, PA 16802 ([email protected]).

Dustin Tingley is Assistant Professor, Department of Government, Harvard University, 1737 Cambridge Street, CGIS Knafel Building 208, Cambridge MA 02138 ([email protected]).

Teppei Yamamoto is Assistant Professor, Department of Political Science, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139 ([email protected]).

The companion articles that present technical aspects of the methods introduced here are available as Imai, Keele, and Tingley (2010), and Imai, Keele, Tingley, and Yamamoto (2010, 2011). All of our proposed methods can be implemented via an R package, mediation (Imai et al. 2010), which is freely available for download at the Comprehensive R Archive Network (http://cran.r-project.org/web/packages/mediation). A Stata package mediation by Raymond Hicks and Tingley is available at IDEAS (http://ideas.repec.org/c/boc/bocode/s457294.html). The replication archive for this article is available as Imai et al. (2011). We thank Ted Brader, Gary Jacobson, and Jonathan Katz for providing us with their data. We also thank Christina Davis, Michael Donnelly, Kevin Esterling, Marty Gilens, Don Green, Simon Jackman, Gary King, Arthur Lupia, Rose McDermott, Tali Mendelberg, Marcus Prior, Cesar Zucco, and participants at the West Coast Experiment Conference, the NSF Conference on Politics Experiments, and the Institute of Statistical Mathematics Summer Lecture Series, as well as seminar participants at Northwestern University and the University of Chicago for helpful suggestions. Comments from an APSR co-editor and three anonymous reviewers significantly improved the presentation of this article. Imai acknowledges financial support from the National Science Foundation (SES-0849715 and SES-0918968).

Over the last couple of decades, social scientists have given greater attention to methodological issues related to causation. This trend has led to a growing number of laboratory, field, and survey experiments, as well as an increasing use of natural experiments, instrumental variables, and quasirandomized studies such as regression discontinuity designs. However, many of these empirical studies focus on merely establishing whether one variable affects another and fail to explain how such a causal relationship arises. This “black box” approach to causality has been criticized across disciplines for being atheoretical and even unscientific (e.g., Brady and Collier 2004; Deaton 2010a, 2010b; Heckman and Smith 1995).¹ For many researchers, estimating causal effects is insufficient and underlying mechanisms must be examined in order to test social science theories empirically.

We define a causal mechanism as a process in which a causal variable of interest, i.e., a treatment variable, influences an outcome. The identification of a causal mechanism requires the specification of an intermediate variable or a mediator that lies on the causal pathway between the treatment and outcome variables. Although qualitative studies often employ the method of process tracing, quantitative investigation of causal mechanisms is based on the estimation of causal mediation effects. Indeed, the traditional approach to causal mediation analysis has been to use structural equation models (e.g., MacKinnon 2008; Shadish, Cook, and Campbell 2001), a practice which goes back decades (Haavelmo 1943).²

In this article, we show that these commonly used statistical methods rely upon untestable assumptions and are often inappropriate even under those assumptions. In particular, contrary to the commonly held belief, conventional exogeneity assumptions alone are insufficient for identification of causal mechanisms.³ For example, although randomization is often seen as the gold standard for estimating causal effects, even randomizing both treatment and intermediate variables cannot identify a mechanism. Facing these difficulties, some assert that process tracing in detailed case studies is the best way to evaluate causal mechanisms (e.g., Collier, Brady, and Seawright 2004). Others highlight why the search for causal mechanisms is elusive but stop short of developing methodological tools to confront the challenge (e.g., Bullock, Green, and Ha 2010; Glynn 2010).⁴

¹ Prominent experimentalists acknowledge “the impatience that social scientists often express with experimental studies that fail to explain why an effect obtains” (Green, Ha, and Bullock 2010, 202).

² The use of linear structural equation models is still widespread, and numerous applications can be found (see e.g., Brader, Valentino, and Suhay 2008; Cox and Katz 1996; Hetherington 2001; Miller and Krosnick 2000; among many others). Earlier applications include Cnudde and McCrone (1966) and Miller and Stokes (1963).

³ This fact is well known in the methodological literature on causal inference (e.g., Imai, Keele, and Yamamoto 2010; Imai, Tingley, and Yamamoto n.d.; Pearl 2001; Petersen, Sinisi, and van der Laan 2006; Robins 2003; Robins and Greenland 1992), but has not received much attention among social scientists until recently (e.g., Bullock, Green, and Ha 2010; Glynn 2010).

⁴ Concrete methodological suggestions about how to study causal mechanisms appear to be scarce in the qualitative methodology literature, too. For example, King, Keohane, and Verba (1994, 85–87) have only a limited discussion.

Although recognizing these difficulties, we believe that the study of causal mechanisms is too important to abandon, and new tools must be developed. In this article, we make three contributions toward this goal. First, we present a minimum set of assumptions required under standard designs of experimental and observational studies. Using the potential outcomes framework of causal mediation analysis, we demonstrate why conventional exogeneity assumptions are insufficient for identifying causal mechanisms. This formal framework allows us to develop a general algorithm for estimating causal mediation effects, which is applicable to any statistical model under these assumptions. The new method corrects common mistakes made by empirical researchers when quantifying causal mechanisms with nonlinear statistical models.

Second, we develop a method of assessing the sensitivity of conclusions to potential violations of key assumptions. Typically, researchers must rely upon untestable assumptions for identification of causal mechanisms. This situation is similar to the one where researchers must assume a treatment is exogenous when estimating causal effects in observational studies. Nonetheless, most research, whether experimental or observational, depends on certain untestable assumptions (Imai, King, and Stuart 2008). In such circumstances, sensitivity analysis plays an essential role by formally quantifying the degree to which empirical findings rely upon the key assumption (e.g., Imai and Yamamoto 2010; Rosenbaum 2002b). To facilitate the use of sensitivity analysis, we provide software, mediation, which also implements our general estimation algorithm for a wide range of commonly used statistical models (Imai et al. 2010).

Third, we offer alternative research designs that enable identification of causal mechanisms under less stringent assumptions. Under standard research designs, sensitivity analysis will not entirely solve the fundamental difficulty of causal mediation analysis, because no statistical method can recover information that is not present in the observed data. Therefore, alternative research design strategies must be devised with the goal of replacing strong assumptions with weaker and more credible ones. Our approach to causal mediation analysis allows us to develop several such research design templates for both experimental and observational studies. We describe the power and limitations of the proposed research design strategies in order to guide their application in empirical research.

Although our proposed approach is general, we illustrate it by applying it to two empirical examples in political science: media priming experiments and observational studies of incumbency advantage. In many ways, research on these two topics has evolved along similar paths also experienced by other literatures in the discipline. Initially, researchers focused on the estimation of causal effects in studies that quantified the effects of media cues on policy attitudes and the effects of incumbency on electoral outcomes. Once a certain level of consensus emerged about the magnitude of causal effects, scholarly attention shifted to the question of causal mechanisms: how media cues influence public opinion and why incumbents have electoral advantages. The two examples allow us to illustrate substantive aspects of the required identification assumption, and we apply our estimation method and sensitivity analysis to the data from leading publications. We also discuss how research on these issues could be further improved by adopting alternative research design strategies.

Finally, we invoke several other empirical examples to further demonstrate how our proposed approach differs from other commonly used methods such as instrumental variables and interaction terms. Although these techniques were originally developed for purposes other than analyzing causal mechanisms, we show that they can be used to identify mechanisms under certain assumptions. Here again, the formal framework of causal mediation analysis adopted in this article clarifies these assumptions and helps applied researchers choose appropriate statistical methods for their substantive questions of interest.

EXAMPLES OF THE SEARCH FOR CAUSAL MECHANISMS

Before we present the formal framework of studying causal mechanisms, we briefly describe two empirical examples where researchers endeavor to identify causal mechanisms and go beyond simply estimating causal effects. They serve as illustrative examples throughout the rest of this article.

The Role of Emotions in Media Framing Effects

Political science has long considered whether the media influence public support for government policies (e.g., opposition or support for specific policies) and political candidates (e.g., evaluations of candidate leadership potential) (e.g., Bartels 1993; Druckman 2005). A prominent focus in this literature has been on issue framing (Chong and Druckman 2007). Because the media can frame issues in particular ways, we expect that the news stories individuals read or hear will influence public opinion (Nelson, Clawson, and Oxley 1997). In particular, the framing of a political issue involving references to specific groups of people has been found to be particularly effective in some issue areas such as immigration (Nelson and Kinder 1996).

In a recent article, Brader, Valentino, and Suhay (2008) go beyond estimating the framing effects of ethnicity-based media cues on immigration preferences and ask “why the race or ethnicity of immigrants, above and beyond arguments about the consequences of immigration, drives opinion and behavior” (960, emphasis in the original). That is, instead of simply asking whether media cues influence opinion, they explore the mechanisms through which this effect operates. Consistent with earlier work suggesting the emotional power of group-based politics (Kinder and Sanders 1996), the authors find that the influence of group-based media cues arises through changing individual levels of anxiety.

Brader, Valentino, and Suhay (2008) employ a standard experimental design where subjects receive a randomly assigned media cue that featured a story about a Caucasian (in-group) or Latino (out-group) immigrant. This is followed by measurement of anxiety and immigration attitudes. Their analysis indicates that threatening cues from out-group immigrants increase anxiety, which then escalates opposition to immigration and makes political action on the topic more likely. They also examined the role of other mechanisms, such as changes in beliefs about the economic costs of immigration (Isbell and Ottati 2002). Following this important study, the emphasis in this literature has moved from simply estimating the effect of group-based appeals on public attitudes to identifying various mechanisms that transmit this effect (e.g., Gadarian 2010).

The Decomposition of Incumbency Effects

One of the most studied topics in the electoral politics literature is the advantage of incumbency status. A new approach to this topic began with the work of Gelman and King (1990), who used the potential outcomes framework of causal inference to demonstrate the bias of previous measures. These and other authors found that the incumbency advantage had been positive and growing for the past several decades.

Cox and Katz (1996) take the incumbency advantage literature in a new direction by considering possible causal mechanisms that explain why incumbents have an electoral advantage. They argue that an important mechanism is the ability of incumbents to deter high-quality challengers from entering the race. The authors attempt to decompose the incumbency advantage into a “scare-off/quality effect” and effects due to other causal mechanisms such as name recognition and resource advantage. They find that much of the growth of incumbency advantage over time can be attributed to the growth of the scare-off/quality effect; incumbents are facing increasingly low-quality challengers, which gives them a greater electoral advantage. Following Cox and Katz (1996), some have used different empirical strategies to test the existence of the scare-off/quality effect (e.g., Levitt and Wolfram 1997). Others have considered alternative causal mechanisms such as the roles of campaign spending (Erikson and Palfrey 1998), personal vote (Ansolabehere, Snyder, and Stewart 2000), and television (Ansolabehere, Snowberg, and Snyder 2006; Prior 2006).

A FORMAL FRAMEWORK FOR STUDYING CAUSAL MECHANISMS

Using the potential outcomes framework of causal inference, we formally define a causal mechanism as a process whereby one variable causally affects another through an intermediate variable. We show that identification of causal mechanisms can be formulated as a decomposition of a total causal effect into direct and indirect effects. The use of the potential outcomes framework is essential because it provides a formal language for understanding the counterfactual comparisons required to study causal mechanisms. As shown later, the conventional approaches based on structural equation models fail to recognize the key assumption behind causal mediation analysis.

Potential Outcomes Framework

We first introduce the concept of potential outcomes, which has been used in the methodological literature as the formal framework of causal inference (Holland 1986; Neyman [1923] 1990; Rubin 1974). The main advantage of this framework is that issues of unobserved causal heterogeneity are made much more explicit than in regression models, where such heterogeneity is obscured as part of error terms. In fact, using this framework, we show later that, contrary to commonly held belief, standard exogeneity assumptions are insufficient for identifying causal mechanisms.

Given a unit and a set of actions that we call treatment and control, we associate an outcome of interest with each unit and action. These two outcomes remain potential until one is ultimately realized. The other outcome cannot be observed and thus remains counterfactual. For example, usually we do not see how subjects in the control group would have responded had they been in the treatment group. Formally, let $T_i$ be a treatment indicator, which takes on the value of 1 when unit $i$ is in the treatment group and 0 otherwise. For simplicity, we focus on binary treatment, but our proposed methods can be extended easily to nonbinary treatment (see Imai, Keele, and Tingley 2010). We can use $Y_i(t)$ to denote the potential outcomes that would result when unit $i$ was under the treatment status $t$.⁵ Although there are two potential values for each subject, only the one that corresponds to his or her actual treatment status is observed. Thus, if we use $Y_i$ to denote the observed outcome, we have $Y_i = Y_i(T_i)$ for each unit. Throughout the remainder of this article, we assume the absence of missing data as well as perfect compliance with treatment assignment. The violation of these assumptions typically leads to more complications in identification and estimation of causal mechanisms (see Horiuchi, Imai, and Taniguchi 2007).

⁵ This notation implicitly assumes no interference between units; the potential outcomes for a given unit do not depend on the treatment assignment of other units.

To illustrate the idea, consider a stylized version of the Brader, Valentino, and Suhay (2008) study where subjects are exposed to either a negative immigration story ($T_i = 1$) or a control news story unrelated to immigration ($T_i = 0$). The outcome here is simply the extent to which subjects want immigration to be increased or decreased. Under the potential outcomes notation, $Y_i(1)$ is subject $i$’s potential immigration opinion if he or she receives the immigration news story, and $Y_i(0)$ is the potential immigration opinion if he or she receives the control story. Similarly, take a stylized version of the Cox and Katz (1996) study where the treatment is the incumbency status ($T_i = 1$ if candidate $i$ is an incumbent and $T_i = 0$ otherwise), and the observed outcome variable $Y_i$ represents the actual vote share candidate $i$ received. Potential outcomes can also be defined, where $Y_i(1)$ ($Y_i(0)$) is the potential vote share candidate $i$ receives if he/she is (not) an incumbent.

Given this setup, the causal effect of the treatment can be defined as the difference between two potential outcomes: one potential outcome that would be realized under the treatment, and the other potential outcome that would be realized under the control condition, i.e., $Y_i(1) - Y_i(0)$. Because only one of the potential outcomes is observable, the unit-level treatment effect is unobservable. Thus, researchers often focus on the estimation of the average treatment effect (ATE) over a population.⁶ If the treatment assignment is randomized, as in the Brader, Valentino, and Suhay (2008) study, then the treatment is jointly independent of the potential outcomes because the probability of receiving the treatment is identical regardless of the values of the potential outcomes. We can write this as $\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp T_i$ with the standard symbol of statistical independence.
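The role of randomization can be made concrete with a small simulation. The following is a minimal sketch, not part of the original analysis: the data-generating process (normally distributed potential outcomes with an average effect of 2) and the function name `simulate_randomized_experiment` are assumptions chosen purely for illustration. Because the simulation generates both potential outcomes for every unit, the true sample ATE is known and can be compared with what an analyst would compute from the observed data alone.

```python
import random
import statistics


def simulate_randomized_experiment(n=20000, seed=0):
    """Return (true sample ATE, difference-in-means estimate).

    Hypothetical data-generating process, assumed for illustration only.
    """
    rng = random.Random(seed)
    # Both potential outcomes exist for every unit, but only one is observed.
    y0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Heterogeneous unit-level effects Y_i(1) - Y_i(0), centered on 2.
    y1 = [y + 2.0 + rng.gauss(0.0, 0.5) for y in y0]
    # Coin-flip assignment: T_i is independent of {Y_i(1), Y_i(0)}.
    t = [rng.random() < 0.5 for _ in range(n)]
    # Observed outcome Y_i = Y_i(T_i).
    y_obs = [a if ti else b for a, b, ti in zip(y1, y0, t)]

    true_ate = statistics.fmean([a - b for a, b in zip(y1, y0)])
    treated = [y for y, ti in zip(y_obs, t) if ti]
    control = [y for y, ti in zip(y_obs, t) if not ti]
    diff_in_means = statistics.fmean(treated) - statistics.fmean(control)
    return true_ate, diff_in_means


if __name__ == "__main__":
    true_ate, estimate = simulate_randomized_experiment()
    print(f"true sample ATE: {true_ate:.3f}")
    print(f"difference in means: {estimate:.3f}")
```

In this sketch the unit-level effects are never available to the estimator; only the randomized assignment makes the simple comparison of group means a reasonable stand-in for the average of those unobservable differences.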

In observational studies, treatments are not randomized. Thus, we often statistically adjust for the observed differences in the pretreatment covariates $X_i$ between the treatment and control groups through regression, matching, and other techniques (e.g., Ho et al. 2007). This approach assumes that there is no omitted variable affecting both the treatment and outcome variables. To be precise, we assume that the treatment is assigned as if randomized among those units that have identical values of the observed pretreatment covariates, i.e., $\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp T_i \mid X_i = x$ for any value $x$ in the support of $X_i$. For example, Cox and Katz (1996) adjust for the lagged vote shares of parties by including them in the linear regression model, implying the assumption that the incumbency status of any two candidates from the same party is essentially randomly determined if their districts have similar vote shares in the past election.

⁶ The ATE is defined as $E(Y_i(1) - Y_i(0))$.

Under this framework, the ATE can be identified as the average difference in outcome means between the treatment and control groups. For experiments, we have the familiar result that the difference-in-means estimator is unbiased for the ATE. For observational studies, this amounts to estimating the ATE for each unique set of pretreatment covariate values and then averaging it over the distribution of the pretreatment covariates.⁷ Thus, in the Brader, Valentino, and Suhay (2008) experiment, where the two types of news stories are randomly assigned to subjects, the average causal effect of the negative immigration story on the opinion toward immigration can be estimated without bias by calculating the average difference of observed responses between the two groups. In observational studies, slightly more complex calculations may be needed, although under certain assumptions a regression coefficient can be interpreted as an unbiased estimate of the ATE.⁸
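The covariate-adjusted calculation described above can be sketched as a stratified estimator: compute the treated-minus-control difference within each covariate value, then weight by how common that value is. The example below is an illustrative assumption rather than the authors' procedure. It uses a single binary covariate, a constant unit effect of 1, and hypothetical function names (`simulate_observational`, `naive_difference`, `stratified_ate`). Treatment is more likely when the covariate equals 1, and the covariate also raises the outcome, so the unadjusted comparison is confounded.

```python
import random
import statistics
from collections import defaultdict


def simulate_observational(n=20000, seed=1):
    """Generate (x, t, y_observed) rows from an assumed confounded process."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        # Treatment probability depends on X, so T is not randomized overall,
        # but it is as-if randomized within each level of X.
        p_treat = 0.8 if x == 1 else 0.2
        t = 1 if rng.random() < p_treat else 0
        y0 = 3.0 * x + rng.gauss(0.0, 1.0)
        y1 = y0 + 1.0  # constant unit effect of 1 (illustrative assumption)
        rows.append((x, t, y1 if t == 1 else y0))
    return rows


def naive_difference(rows):
    """Unadjusted difference in observed outcome means."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def stratified_ate(rows):
    """Average the within-stratum differences over the distribution of X."""
    groups = defaultdict(lambda: {0: [], 1: []})
    for x, t, y in rows:
        groups[x][t].append(y)
    n = len(rows)
    estimate = 0.0
    for by_t in groups.values():
        weight = (len(by_t[0]) + len(by_t[1])) / n
        diff = statistics.fmean(by_t[1]) - statistics.fmean(by_t[0])
        estimate += weight * diff
    return estimate


if __name__ == "__main__":
    data = simulate_observational()
    print(f"naive difference: {naive_difference(data):.3f}")
    print(f"stratified estimate: {stratified_ate(data):.3f}")
```

The sketch requires every stratum to contain both treated and control units; with continuous or high-dimensional covariates, regression or matching would take the place of exact stratification.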

Defining Causal Mechanisms as Indirect and Direct Effects

Next, we formally define causal mechanisms using the framework introduced previously. We define a causal mechanism as a process whereby one variable $T$ causally affects another $Y$ through an intermediate variable or a mediator $M$ that operationalizes the hypothesized mechanism. In the Brader, Valentino, and Suhay (2008) study, respondents’ anxiety ($M$) transmits the causal effect of the media framing ($T$) on attitudes toward immigration ($Y$). In the Cox and Katz (1996) study, challenger quality represents a mediator ($M$) through which the incumbency status ($T$) causally affects the election outcome ($Y$). Of course, in both studies, other causal mechanisms may exist; for example, media effects may operate through changes in beliefs about the consequences of immigration, and campaign spending and personal vote may explain the incumbency advantage.

Thus, an inferential goal is to decompose the causal effect of a treatment into the indirect effect, which represents the hypothesized causal mechanism, and the direct effect, which represents all the other mechanisms. Figure 1a graphically illustrates this simple idea and the assumed causal ordering. The indirect effect combines two arrows going from the treatment $T$ to the outcome $Y$ through the mediator $M$, whereas the direct effect is represented by a single arrow from $T$ to $Y$.

Formally, let Mi(t) denote the potential value of amediator of interest (anxiety level for media framingand challenger quality for incumbency advantage) forunit i under the treatment status Ti = t. Now, we useYi(t, m) to denote the potential outcome that wouldresult if the treatment and mediating variables equalt and m, respectively. For example, in the incumbencyresearch, Yi(1, 1) represents the potential vote share

7 That is, E(Yi(1) ! Yi(0)) = E{E(Yi | Ti = 1, Xi) ! E(Yi | Ti = 0, Xi)}.8 Specifically, the assumption is called the constant additive unittreatment effect in the linear regression, which is implicitly madein the Cox and Katz (1996) study.

768

Page 5: Unpacking the Black Box of Causality: Learning about ...web.mit.edu/teppei/www/research/mediationP.pdfdesign strategies. Finally, we invoke several other empirical exam-ples to further

American Political Science Review Vol. 105, No. 4

FIGURE 1. Diagrams Representing VariousCausal Mechanisms

Note: (a) is a simple graphical representation of the decom-position where the treatment T causally affects the outcome Ydirectly or indirectly through the mediator M. The other twodiagrams show causal mechanisms involving two measuredmediators. In (b), there is no causal relationship between the twomediators and hence sequential ignorability (Assumption 1) issatisfied. The other diagram (c) does not satisfy the assumption,because N serves as a posttreatment confounder for M.

for candidate i if he/she is an incumbent facing a challenger who was previously an office holder (a typical way of measuring candidate quality in the literature). As before, we only observe one of the potential outcomes, and the observed outcome, $Y_i$, now equals $Y_i(T_i, M_i(T_i))$, which depends upon both the treatment status and the level of the mediator under the observed treatment status. Thus, the (total) unit treatment effect can be written as $\tau_i \equiv Y_i(1, M_i(1)) - Y_i(0, M_i(0))$.

We can now define indirect effects or causal mediation effects for each unit i, which correspond to a hypothesized causal mechanism, as follows (Pearl 2001; Robins and Greenland 1992):

$$\delta_i(t) \equiv Y_i(t, M_i(1)) - Y_i(t, M_i(0)), \tag{1}$$

for each treatment status $t = 0, 1$. This causal quantity represents the indirect effects of the treatment on the outcome through the mediating variable. It equals the change in the outcome corresponding to a change in the mediator from the value that would be realized under the control condition, i.e., $M_i(0)$, to the value that would be observed under the treatment condition, i.e., $M_i(1)$, holding the treatment status at t. By fixing the treatment and changing only the mediator, we eliminate all other causal mechanisms and isolate the hypothesized mechanism. If the treatment has no effect on the mediator, i.e., $M_i(1) = M_i(0)$, then the causal mediation effects are zero. What the potential outcomes framework clarifies is that whereas $Y_i(t, M_i(t))$ is observable for units with $T_i = t$, the counterfactual outcome $Y_i(t, M_i(1 - t))$ can never be observed under most common research designs. This underscores the difficulty of identifying causal mechanisms.

In the Brader, Valentino, and Suhay (2008) study, the mediator is the subjects’ levels of anxiety. Thus, $\delta_i(1)$ represents the difference between the two potential immigration opinions for subject i, who actually receives the treatment of an immigration news story. For this subject, $Y_i(1, M_i(1))$ is the observed immigration opinion if he/she views the immigration news story, whereas $Y_i(1, M_i(0))$ is his or her immigration opinion under the counterfactual scenario where subject i still viewed the immigration story but his or her anxiety level is as if the subject viewed a control news story. The difference between these two potential outcomes represents the effect of the change in the mediator that would be induced by the treatment, while the direct impact of the treatment is suppressed by holding its value constant.

Similarly, in the Cox and Katz (1996) study, suppose candidate i is an incumbent. Then an indirect effect, $\delta_i(1)$, equals the difference between the observed vote share $Y_i(1, M_i(1))$ and the counterfactual vote share $Y_i(1, M_i(0))$. The latter represents the vote share the candidate would receive if he or she faced a challenger whose quality was at the same level as the challenger he or she would have faced if not an incumbent. Thus, this causal quantity formalizes the scare-off/quality effect by isolating the portion of incumbency advantage that results from deterrence of high-quality challengers, controlling for all other mechanisms.

To represent all other causal mechanisms, we define the direct effects of the treatment as

$$\zeta_i(t) \equiv Y_i(1, M_i(t)) - Y_i(0, M_i(t)) \tag{2}$$

for each unit i and each treatment status $t = 0, 1$. The direct effect equals the causal effect of the treatment on the outcome that is not transmitted by the hypothesized mediator. In the Brader, Valentino, and Suhay (2008) study, $\zeta_i(1)$ represents the difference in immigration opinions under treatment (the immigration news story) and control (no immigration news story) holding the level of anxiety constant at the level that would be realized under treatment. In the incumbency advantage study, $\zeta_i(1)$ equals the difference in the vote share of candidate i with and without incumbency status holding the challenger quality at the level that would be realized if the candidate were an incumbent. Because the direct effects and the indirect effects sum up to the total causal effect, a causal mediation analysis represents a decomposition of the total effect into the direct and indirect (mediation) effects.9
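The decomposition can be checked directly from definitions (1) and (2). The short derivation below is a worked step added for illustration; it uses only the total unit effect, the causal mediation effect, and the direct effect as defined in the text, and adds and subtracts one counterfactual outcome.

```latex
\begin{aligned}
\tau_i &\equiv Y_i(1, M_i(1)) - Y_i(0, M_i(0)) \\
       &= \{Y_i(1, M_i(1)) - Y_i(1, M_i(0))\} + \{Y_i(1, M_i(0)) - Y_i(0, M_i(0))\} \\
       &= \delta_i(1) + \zeta_i(0), \\[4pt]
\tau_i &= \{Y_i(1, M_i(1)) - Y_i(0, M_i(1))\} + \{Y_i(0, M_i(1)) - Y_i(0, M_i(0))\} \\
       &= \zeta_i(1) + \delta_i(0).
\end{aligned}
```

In each line the added and subtracted term, $Y_i(1, M_i(0))$ or $Y_i(0, M_i(1))$, is exactly the kind of counterfactual outcome that is never observed, which is why the decomposition cannot be computed from the data without further assumptions.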

In this article, we focus on the average causal mediation effects (ACME) $\bar{\delta}(t)$ and the average direct effects (ADE) $\bar{\zeta}(t)$, which represent the population averages of the causal mediation and direct effects, respectively.10 As before, the ATE $\bar{\tau}$ equals the sum of the ACME and ADE.11 Our goal is to decompose the ATE into the ACME and ADE and then assess the relative importance of the hypothesized mechanism.

9 Formally, $\tau_i = \delta_i(t) + \zeta_i(1 - t) = \frac{1}{2} \sum_{t=0}^{1} \{\delta_i(t) + \zeta_i(t)\}$, for $t = 0, 1$. In addition, if no interaction between the treatment and the mediator is assumed, i.e., $\delta_i = \delta_i(1) = \delta_i(0)$ and $\zeta_i = \zeta_i(1) = \zeta_i(0)$ (see Interaction Terms for details), then we have a simpler expression $\tau_i = \delta_i + \zeta_i$. See Imai, Keele, and Tingley (2010) for additional discussion of the no-interaction assumption and how to relax it.
10 These quantities are formally defined as $\bar{\delta}(t) \equiv E(Y_i(t, M_i(1)) - Y_i(t, M_i(0)))$ and $\bar{\zeta}(t) \equiv E(Y_i(1, M_i(t)) - Y_i(0, M_i(t)))$.
11 That is, we have $\bar{\tau} \equiv E(Y_i(1, M_i(1)) - Y_i(0, M_i(0))) = \frac{1}{2} \sum_{t=0}^{1} (\bar{\delta}(t) + \bar{\zeta}(t))$. Again, under the no-interaction assumption, we have $\bar{\tau} = \bar{\delta} + \bar{\zeta}$.


Nonparametric Identification under the Standard Designs

With causal mechanisms formally defined, we now consider the assumption necessary to identify the ACME and ADE under the standard designs. By the standard designs, we mean that the treatment assignment is either randomized (as in experimental studies) or assumed to be random given the pretreatment covariates (as in observational studies). The key insight here is that both the direct and indirect effects contain a potential outcome that would never be realized under these designs, and therefore neither quantity can be identified even in randomized experiments, let alone observational studies. In fact, under these designs, the ATE is identified, but the ACME and ADE are not. Identifying causal mechanisms, therefore, requires an additional assumption. Researchers rarely acknowledge that this assumption is necessary to give the quantities they estimate a causal interpretation.

We formalize this additional identification assumption as follows. Let $X_i$ be a vector of the observed pretreatment confounders for unit i, such as a respondent’s gender and race in the media framing study and the past election results in the research on incumbency advantage. Then the assumption can be written as follows.

Assumption 1 [Sequential Ignorability (Imai, Keele, and Yamamoto 2010)].

$$\{Y_i(t', m), M_i(t)\} \perp\!\!\!\perp T_i \mid X_i = x, \tag{3}$$

$$Y_i(t', m) \perp\!\!\!\perp M_i(t) \mid T_i = t, X_i = x, \tag{4}$$

where $0 < \Pr(T_i = t \mid X_i = x)$ and $0 < p(M_i = m \mid T_i = t, X_i = x)$ for $t = 0, 1$, and all x and m in the support of $X_i$ and $M_i$, respectively.

How can Assumption 1 be interpreted? The assumption is called sequential ignorability because two ignorability assumptions are made sequentially. First, given the observed pretreatment confounders, the treatment assignment is assumed to be ignorable, that is, statistically independent of potential outcomes and potential mediators. This part of the assumption is often called no-omitted-variable bias, exogeneity, or unconfoundedness. In experiments, the assumption is expected to hold because treatment is randomized. In observational studies, researchers typically use regression and/or matching to make this assumption plausible (Ho et al., 2007).

The second part of Assumption 1 implies that the observed mediator is ignorable given the actual treatment status and pretreatment confounders. Here, we are assuming that once we have conditioned on a set of covariates gathered before the treatment, the mediator status is ignorable. Note the apparent similarity between this assumption and the standard assumption made in observational studies that the treatment assignment is exogenous given the observed pretreatment covariates. In fact, even in randomized experiments, identification of causal mechanisms requires an additional assumption that is similar to the one often made in observational studies. However, as further discussed in Sequential Ignorability and Conventional Exogeneity Assumptions, a key difference between this assumption and the conventional exogeneity assumption is that randomizing both the treatment and mediating variables does not suffice for this assumption to hold.

What does Assumption 1 imply about media framing and incumbency advantage studies discussed in Examples of the Search for Causal Mechanism? First, consider the Brader, Valentino, and Suhay (2008) study. Because the news stories are randomly assigned to subjects, the first part of Assumption 1 will hold even without conditioning on any pretreatment covariate $X_i$. However, the second part of the assumption implies that there are no unmeasured pretreatment or posttreatment covariates that confound the relationship between the levels of anxiety and the subjects’ immigration opinions. To satisfy this assumption, we must measure the complete set of covariates that affect both anxiety and immigration opinions, and they all must not be affected by the treatment.

This assumption is violated if, for example, both one’s anxiety and immigration opinions are affected by fear disposition (the strength with which one responds to threatening stimuli (Jost et al., 2007)) or ideology (Oxley et al., 2008). Among those in the treatment group, individuals with high fear disposition or conservative ideology might exhibit higher levels of anxiety. Furthermore, fear disposition and ideology have also been directly linked to a variety of political attitudes, including attitudes towards out-groups (Olsson et al., 2005). Hence, these pretreatment covariates could influence both the mediator and outcome in the Brader, Valentino, and Suhay (2008) study (see Figure 7a in the Appendix for a diagram depicting this situation). Thus, we must assume that ignorability holds after adjustment for all pretreatment covariates that affect anxiety and immigration attitudes.

Next, consider the incumbency advantage example. In an observational study, the first part of Assumption 1 must be made with great care because treatment assignment is not randomized. In the context of the Cox and Katz (1996) study, we must first assume that the incumbency status is random once we adjust for differences in the previous election outcome and partisanship. This means that after we adjust for these pretreatment covariates, whether or not the Democratic party will run an incumbent candidate in the current election is essentially random. Assumption 1 also requires that the quality of the challenger in the current election is random once we take into account differences in the incumbency status and the past election outcome as well as partisanship. For both of these ignorability assumptions, there may exist unobserved confounders.

As this discussion illustrates, the second stage of sequential ignorability is a strong assumption even in standard randomized experiments. Furthermore, as already recognized by many researchers, the first part of Assumption 1 must be made with great care in observational studies. Assumptions such as sequential ignorability are often referred to as irrefutable because


TABLE 1. The Fallacy of the Causal Chain Approach

| Population proportion | $M_i(1)$ | $M_i(0)$ | $Y_i(t, 1)$ | $Y_i(t, 0)$ | Treatment effect on mediator, $M_i(1) - M_i(0)$ | Mediator effect on outcome, $Y_i(t, 1) - Y_i(t, 0)$ | Causal mediation effect, $Y_i(t, M_i(1)) - Y_i(t, M_i(0))$ |
|---|---|---|---|---|---|---|---|
| 0.3 | 1 | 0 | 0 | 1 | 1 | $-1$ | $-1$ |
| 0.3 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| 0.1 | 0 | 1 | 0 | 1 | $-1$ | $-1$ | 1 |
| 0.3 | 1 | 1 | 1 | 0 | 0 | 1 | 0 |
| Average | 0.6 | 0.4 | 0.6 | 0.4 | 0.2 | 0.2 | $-0.2$ |

Notes: The left five columns of the table (the population proportion and the four columns of potential mediators and outcomes) show a hypothetical population proportion of “types” of units defined by the values of potential mediators and outcomes. Note that these values can never be jointly observed. The last row of the table shows the population average value of each column. In this example, the average causal effect of the treatment on the mediator (the sixth column) is positive and equal to 0.2. Moreover, the average causal effect of the mediator on the outcome (the seventh column) is also positive and equals 0.2. And yet the average causal mediation effect (ACME; final column) is negative and equals $-0.2$.

one cannot disprove them with observable information (Manski 2007). It is impossible to entirely preclude the possibility that there exist unobserved variables that confound the relationships even after conditioning on many observed covariates.

What does this strong assumption buy us then? Imai, Keele, and Yamamoto (2010) prove that under Assumption 1 the ACME and ADE are nonparametrically identified.12 This means that, without any additional distributional or functional-form assumptions about the mediator or outcome variables, these effects can be consistently estimated. Therefore, Assumption 1 allows us to make inferences about the counterfactual quantities we do not observe (the potential outcomes under the value of the mediator that would be realized if subjects were in the treatment status opposite to their actual treatment status) using the quantities we do observe (observed outcomes and mediators). The result also implies that we may estimate the ACME and ADE more flexibly by making no or weak assumptions about the functional form or distribution of the observed data. Imai, Keele, and Tingley (2010) exploit this fact to develop a general method for estimating these quantities for outcome and mediating variables of many types using either parametric or nonparametric regression models. As illustrated in Empirical Illustrations, this new method corrects common mistakes made by researchers in estimating the ACME and ADE with nonlinear statistical models.
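To make the identification result concrete, the following is a minimal sketch for the simplest case: a binary mediator, a binary outcome, a randomized treatment, and no covariates. It replaces the integral in the identification formula with a sum over the two mediator values. The data-generating probabilities and the helper names (`mean_y`, `prob_m`, `mean_potential_outcome`) are hypothetical and chosen only for illustration; sequential ignorability holds here by construction because the mediator and outcome are drawn independently given the treatment.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical data-generating process (illustration only).
t = rng.integers(0, 2, size=n)                 # randomized binary treatment
m = rng.binomial(1, 0.3 + 0.4 * t)             # P(M=1 | T=t) = 0.3 + 0.4 t
y = rng.binomial(1, 0.2 + 0.1 * t + 0.4 * m)   # P(Y=1 | T=t, M=m)

def mean_y(t_val, m_val):
    """Sample analogue of E(Y | T = t, M = m)."""
    return y[(t == t_val) & (m == m_val)].mean()

def prob_m(m_val, t_val):
    """Sample analogue of P(M = m | T = t)."""
    return np.mean(m[t == t_val] == m_val)

def mean_potential_outcome(t_val, t_prime):
    """Plug-in estimate of E[Y(t, M(t'))] for a binary mediator."""
    return sum(mean_y(t_val, mv) * prob_m(mv, t_prime) for mv in (0, 1))

# ACME under treatment and ADE under control, from observed data only.
acme_treated = mean_potential_outcome(1, 1) - mean_potential_outcome(1, 0)
ade_control = mean_potential_outcome(1, 0) - mean_potential_outcome(0, 0)

# True values implied by the simulation: 0.4 * (0.7 - 0.3) = 0.16 and 0.1.
print(round(acme_treated, 3), round(ade_control, 3))
```

The counterfactual mean for the outcome under treatment with the mediator at its control value is never observed for any unit, yet it is recovered here by combining the outcome distribution among the treated with the mediator distribution among the controls. That step is valid only because Assumption 1 holds in the simulated data.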

Sequential Ignorability and Conventional Exogeneity Assumptions

As we briefly mentioned earlier, sequential ignorability (Assumption 1) differs critically from the conventional exogeneity assumptions that are commonly understood to identify indirect effects in structural equation models. First, one might incorrectly conjecture that Assumption 1 is satisfied by the randomization of both

12 Formally, it can be shown that $f(Y_i(t, M_i(t')) \mid X_i = x) = \int_{\mathcal{M}} f(Y_i \mid M_i = m, T_i = t, X_i = x) \, dF_{M_i}(m \mid T_i = t', X_i = x)$ for any $x \in \mathcal{X}$ and $t, t' = 0, 1$.

treatment and mediator. For example, Spencer, Zanna, and Fong (2005) propose a “causal chain” approach where researchers implement two randomized experiments, one in which the treatment is randomized to identify its effect on the mediator, and another in which the mediator is randomized to identify its effect on the outcome.13

Unfortunately, even though the treatment and mediator are each guaranteed to be exogenous in these two experiments, simply combining the two is not sufficient to identify the ACME. A simple numerical example makes this evident. Consider the hypothetical population given in Table 1, which describes the population proportion of “types” of units by the values of potential mediators and outcomes. Although the values in Table 1 can never be jointly observed, the two randomized experiments will give sufficient information to identify the average causal effect of the treatment on the mediator as well as that of the mediator on the outcome. In this example, both of these effects are positive and equal to 0.2, and thus based on these results one might conclude that the ACME is positive. However, the ACME is actually negative. Thus, contrary to the commonly held belief, the conventional exogeneity assumptions do not necessarily identify the ACME.
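The arithmetic behind Table 1 can be reproduced in a few lines. The sketch below encodes the four hypothetical unit types from the table and recomputes the three averages; the variable and function names are illustrative choices, not part of the original analysis.

```python
import numpy as np

# Rows of Table 1: proportion, M(1), M(0), Y(t, 1), Y(t, 0).
types = np.array([
    [0.3, 1, 0, 0, 1],
    [0.3, 0, 0, 1, 0],
    [0.1, 0, 1, 0, 1],
    [0.3, 1, 1, 1, 0],
])
p, m1, m0, y_m1, y_m0 = types.T

def outcome_at(mediator):
    """Potential outcome Y(t, m) of each type at its own mediator value m."""
    return np.where(mediator == 1, y_m1, y_m0)

# Average effect of the treatment on the mediator (identified by experiment 1).
effect_t_on_m = np.sum(p * (m1 - m0))
# Average effect of the mediator on the outcome (identified by experiment 2).
effect_m_on_y = np.sum(p * (y_m1 - y_m0))
# Average causal mediation effect: mediator moves from M(0) to M(1).
acme = np.sum(p * (outcome_at(m1) - outcome_at(m0)))

print(round(effect_t_on_m, 3), round(effect_m_on_y, 3), round(acme, 3))
```

Up to floating-point rounding, the two experimentally identified averages both come out at 0.2 while the ACME comes out at -0.2, because the only types whose mediator responds to the treatment (the first and third rows) are exactly those for whom the mediator lowers the outcome.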

In this example, causal heterogeneity exists in such a way that the units with a positive effect of the treatment on the mediator (the first row of the table) exhibit a negative effect of the mediator on the outcome. This particular deviation from sequential ignorability makes the causal mediation effects negative on the average even though all other average effects are positive. The key point, beyond this specific example, is the fundamental difference between the causal mediation effect and the causal effect of the mediator itself. The latter refers to the average difference in the potential outcomes that would be realized if the mediator were manipulated to certain fixed values, i.e., the average

13 An alternative and better experimental design is what Imai et al. (2011) call the parallel design where in the second experiment both the treatment and mediating variables are randomized. See Alternative Research Design for Credible Inference for further discussion.


value of $Y_i(t, 1) - Y_i(t, 0)$, which can be consistently estimated when the conventional exogeneity assumption holds about the mediator. However, this quantity crucially differs from the causal mediation effect in that the mediator is artificially manipulated to take particular values (1 or 0) as opposed to being hypothetically set to the values that would naturally arise in response to treatment ($M_i(1)$ or $M_i(0)$). Because a causal mechanism represents how the effect of treatment on outcome is transmitted through the mediator, identifying the effect of the mediator itself is not sufficient.

The second key difference between sequential ignorability and conventional exogeneity assumptions is that the conditioning set of covariates in the second part of sequential ignorability must only include pretreatment variables.14 In other words, one cannot condition on premediator confounders if they are affected by the treatment. This subtle difference has important substantive implications. Figures 1b and 1c show two causal diagrams that involve two observed mediators, M and N, where for the sake of simplicity no pretreatment confounders are assumed to exist.15 The diagrams represent a common situation where two possible causal mechanisms are hypothesized and two corresponding mediators are measured to test them against each other. For example, Brader, Valentino, and Suhay (2008) measure two mediators. The first mechanism suggests that the influence of the media cue on immigration attitudes is through changes in a subject’s anxiety levels. In contrast, their second mechanism examines changes in beliefs about the economic effects of immigration and hence represents a hypothesis focused on cognitive evaluations of costs and benefits.

These two diagrams are different in terms of whether there is a direct causal relationship between the two mediators. In Figure 1b, the two mediators are causally unrelated and thus conditionally independent of each other once the treatment status is controlled for. This implies that both the conventional exogeneity assumptions and sequential ignorability are satisfied. The ACME for each of the mediators can therefore be identified, as discussed in Nonparametric Identification under the Standard Designs. In contrast, Figure 1c represents a situation in which the causal relationship between one mediator (M) and the outcome (Y) is confounded by the other mediator (N). Because the second mediator is affected by the treatment, N represents a posttreatment confounder. This implies that although the exogeneity assumption is met once both N and treatment (T) are adjusted for, sequential ignorability is not satisfied. Here again is an example where the exogeneity of mediator does not imply identifiability of causal mechanisms. In the Appendix, we discuss further issues related to the role of post-treatment variables

14 Robins (2003) considers a variant of sequential ignorability that allows posttreatment premediator confounders as well as pretreatment confounders. He shows, however, that in general the ACME cannot be identified without additional strong assumptions, such as no interaction between the treatment and mediator.
15 The subsequent argument holds so long as all such confounders are adjusted for.

and multiple mediators. Imai and Yamamoto (2011) further address these important issues by developing a new sensitivity analysis.

In the Brader, Valentino, and Suhay (2008) study, for example, Figure 1b corresponds to the situation where the effect of the media cue goes through both emotion and cognitive mechanisms (such as beliefs) but there is no direct causal connection between the two mechanisms. An alternative causal model, depicted in Figure 1c, allows for the possibility that beliefs about immigration’s economic costs (N) also lead to changes in anxiety levels (M). This might be due to individuals realizing that a threat is present, inducing the greater information acquisition and avoidance behavior associated with anxiety. Here, media cues might also produce direct changes in anxiety because of ingroup/outgroup triggers, and beliefs about the financial impact of immigration can influence immigration preferences (Isbell and Ottati 2002).

INFERENCE AND SENSITIVITY ANALYSIS UNDER THE STANDARD DESIGNS

In this section, we introduce our approach to estimating the ACME and ADE based on the nonparametric identification result given in Nonparametric Identification under the Standard Designs. In both the Cox and Katz (1996) and Brader, Valentino, and Suhay (2008) studies, analyses are conducted within the traditional linear structural equation modeling (LSEM) framework. This method was popularized by Baron and Kenny (1986) and is widespread in the social sciences (e.g., Shadish, Cook, and Campbell 2001, chap. 12). However, the drawbacks of the LSEM framework are twofold (see also Glynn 2010). First, it obscures the identification assumptions required to identify causal mechanisms (see Sequential Ignorability and Conventional Exogeneity Assumptions). Second, the LSEM framework does not easily extend to nonlinear or nonparametric models. Nevertheless, as we illustrate via the Brader, Valentino, and Suhay (2008) study later in this article, many scholars incorrectly apply the LSEM-based method to nonlinear models such as logistic regression.16 Recognizing this problem, some authors, such as Cox and Katz (1996) who use ordered probit regression to model the mediator, report the estimates for the ACMEs only when linear models are used.

In contrast, our approach is not tied to any specific statistical model. In fact, we can use parametric or nonparametric regressions to model the mediator and outcome variables, because the identification assumption is stated without reference to any specific model. We also propose a sensitivity analysis to probe the sequential ignorability assumption by quantifying the degree of possible violation. The proposed estimation method and sensitivity analysis can be implemented easily using our software mediation.

16 This and other similar mistakes are common in the discipline (see e.g., Hetherington 2001; Miller and Krosnick 2000; among others).


The Existing Method and Its Limitations

To highlight the flexibility and transparency of our approach, we first provide a brief review of the standard approach to estimating mediation effects (e.g., Baron and Kenny 1986; MacKinnon 2008). This commonly used method is based on the following set of linear equations:

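$$Y_i = \alpha_1 + \beta_1 T_i + \xi_1^\top X_i + \varepsilon_{i1}, \tag{5}$$

$$M_i = \alpha_2 + \beta_2 T_i + \xi_2^\top X_i + \varepsilon_{i2}, \tag{6}$$

$$Y_i = \alpha_3 + \beta_3 T_i + \gamma M_i + \xi_3^\top X_i + \varepsilon_{i3}. \tag{7}$$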

In the media framing experiment, for example, $T_i$ represents a binary treatment indicator for the news story stimuli, $M_i$ represents the observed level of anxiety, and $Y_i$ is the observed opinion about immigration levels. Similarly, in the incumbency advantage study, $T_i$ represents the incumbency status of a candidate, $M_i$ represents the quality of his or her opponent, and $Y_i$ is his or her vote share. In both cases, $X_i$ represents a set of observed pre-treatment covariates, which are included to make sequential ignorability plausible.

In this setup, the standard method is to estimate the ACME using the product of coefficients $\beta_2 \gamma$, where $\beta_2$ and $\gamma$ are obtained by separately fitting least squares regressions based on equations (6) and (7). A second method is to use the difference of coefficients method, which uses $\beta_1 - \beta_3$ as the estimate of the ACME, where $\beta_1$ comes from another separate least squares fit of equation (5). Both produce numerically identical estimates of the ACME. Finally, $\beta_1$ and $\beta_3$ are used as the estimates of the ATE and ADE, respectively. Both Brader, Valentino, and Suhay (2008) and Cox and Katz (1996) used the product of coefficients method to estimate the ACME. Often, researchers conduct a hypothesis test based on the asymptotic variance of $\beta_2 \gamma$ with the null hypothesis of the ACME being zero (Sobel 1982).
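The numerical equivalence of the two estimators can be seen in a small simulation. The sketch below fits equations (5) through (7) by ordinary least squares on simulated data; the coefficient values, the single covariate, and the helper `ols` are assumptions made for illustration and do not come from either empirical study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical linear data-generating process (illustration only).
x = rng.normal(size=n)                           # one pretreatment covariate
t = rng.integers(0, 2, size=n).astype(float)     # randomized binary treatment
m = 0.5 + 0.8 * t + 0.3 * x + rng.normal(size=n)             # equation (6)
y = 1.0 + 0.4 * t + 0.6 * m + 0.2 * x + rng.normal(size=n)   # equation (7)

def ols(columns, response):
    """Least squares coefficients, with the intercept in position 0."""
    design = np.column_stack([np.ones(len(response))] + columns)
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

beta1 = ols([t, x], y)[1]            # equation (5): total effect of T
beta2 = ols([t, x], m)[1]            # equation (6): effect of T on M
fit7 = ols([t, m, x], y)             # equation (7)
beta3, gamma = fit7[1], fit7[2]

product = beta2 * gamma              # product of coefficients
difference = beta1 - beta3           # difference of coefficients

print(round(product, 4), round(difference, 4))
```

When all three regressions use the same sample and the same covariates, the two numbers agree up to floating-point error, which is an algebraic property of least squares rather than evidence that either one estimates the ACME. Whether they do depends on sequential ignorability, no interaction, and linearity, as discussed next.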

What assumption is required for $\beta_2 \gamma$ to be a valid estimate of the ACME? Imai, Keele, and Yamamoto (2010) prove that under sequential ignorability and the additional no-interaction assumption, i.e., $\bar{\delta}(1) = \bar{\delta}(0)$, the product of coefficients $\beta_2 \gamma$ is a valid estimate (i.e., asymptotically consistent) so long as the linearity assumption holds. In fact, the sequential ignorability assumption can be easily translated into phraseology familiar to LSEM analysts. Imai, Keele, and Yamamoto (2010) show that under the LSEM, sequential ignorability implies zero correlation between $\varepsilon_{i2}$ and $\varepsilon_{i3}$. Clearly, randomization of $T_i$ will not guarantee this correlation to be zero, whereas it does enable consistent estimation of the ATEs of the treatment on the outcome and on the mediator ($T_i$ is uncorrelated with either $\varepsilon_{i1}$ or $\varepsilon_{i2}$). As shown in Sequential Ignorability and Conventional Exogeneity Assumptions, however, the converse is not always true: Zero correlation between $\varepsilon_{i2}$ and $\varepsilon_{i3}$ does not necessarily imply sequential ignorability. For example, randomizing both treatment and mediator does not justify using $\beta_2 \gamma$ as an estimate of the ACME. Indeed, the correlation between the error terms may not be zero even when the conventional exogeneity assumptions are satisfied (see the Appendix). These important facts are not readily apparent in the LSEM framework but can be seen immediately in the potential outcomes framework.

As we mentioned before, another important deficiency in the LSEM framework is that it cannot be directly applied to nonlinear models. If the mediator and/or the outcome are measured with discrete variables, one may wish to replace linear regression models with discrete choice models such as probit regression. However, nonlinearity in this and other models implies that the product of coefficients and the difference of coefficients methods no longer provide a consistent estimate of the ACME under sequential ignorability (Imai, Keele, and Tingley 2010; Pearl n.d.; VanderWeele 2009), contrary to some existing suggestions (e.g., MacKinnon et al. 2007). Our approach offers a general method for estimating the ACME and ADE by directly using the nonparametric identification result, which is not dependent on any statistical model. In the following, we provide an informal overview of this new method, referring readers to Imai, Keele, and Tingley (2010) for details.

The Proposed Estimation Method

The nonparametric identification result leads to a general algorithm for computing the ACME and the ADE, which is applicable to any statistical model so long as sequential ignorability holds. The algorithm consists of two steps.17 First, we fit regression models for the mediator and outcome. The mediator is modeled as a function of the treatment and any relevant pretreatment covariates. The outcome is modeled as a function of the mediator, the treatment, and the pretreatment covariates. The form of these models is immaterial. The models can be nonlinear, such as logistic or probit models, or even be nonparametric or semiparametric, such as generalized additive models. Based on the mediator model, we then generate two sets of predictions for the mediator, one under the treatment and the other under the control. For example, in the media framing study, this would correspond to predicted levels of anxiety after reading a news story on immigration or a neutral news story.

For the next step, the outcome model is used to make potential outcome predictions. Suppose that we are interested in estimating the ACME under the treatment, i.e., $\bar{\delta}(1)$. First, the outcome is predicted under the treatment using the value of the mediator predicted in the treatment condition. Second, the outcome is predicted under the treatment condition but now uses the mediator prediction from the control condition. The ACME is then computed as the average difference between the outcome predictions using the two different values of

17 Imai et al. (2010) show a flowchart summarizing this algorithm in terms of the easy-to-use software mediation.


the mediator. For example, in the media framing study, this would correspond to the average difference in immigration attitudes from fixing the treatment status but changing the level of anxiety between the level predicted after reading an immigration story versus reading a neutral story. Finally, either bootstrap or Monte Carlo approximation based on the asymptotic sampling distribution (King, Tomz, and Wittenberg 2000) can be used to compute statistical uncertainty.
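The two steps described above can be sketched for a case the LSEM formulas do not cover: a continuous mediator with a binary outcome modeled by logistic regression. This is a simplified illustration under stated assumptions, not the implementation in the mediation software. The simulated data, the coefficient values, the hand-written Newton-Raphson logit fit, and the choice to draw mediator values from a normal distribution with the estimated residual standard deviation are all assumptions of this sketch, and the uncertainty step (bootstrap or Monte Carlo) is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Hypothetical data (illustration only): continuous mediator, binary outcome.
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n).astype(float)
m = 0.2 + 1.0 * t + 0.5 * x + rng.normal(size=n)
true_logit = -0.5 + 0.3 * t + 0.8 * m + 0.4 * x
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

def fit_ols(design, response):
    """Least squares coefficients and residual standard deviation."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid_sd = np.std(response - design @ coef, ddof=design.shape[1])
    return coef, resid_sd

def fit_logit(design, response, iterations=25):
    """Logistic regression by Newton-Raphson, starting from zero."""
    coef = np.zeros(design.shape[1])
    for _ in range(iterations):
        prob = 1.0 / (1.0 + np.exp(-(design @ coef)))
        gradient = design.T @ (response - prob)
        hessian = (design * (prob * (1.0 - prob))[:, None]).T @ design
        coef = coef + np.linalg.solve(hessian, gradient)
    return coef

ones = np.ones(n)

# Step 1a: fit the mediator model and the outcome model.
med_coef, med_sd = fit_ols(np.column_stack([ones, t, x]), m)
out_coef = fit_logit(np.column_stack([ones, t, m, x]), y)

# Step 1b: simulate each unit's mediator under treatment and under control.
noise = rng.normal(scale=med_sd, size=n)
m_treat = med_coef[0] + med_coef[1] * 1.0 + med_coef[2] * x + noise
m_ctrl = med_coef[0] + med_coef[1] * 0.0 + med_coef[2] * x + noise

def predict_outcome(t_val, m_val):
    """Predicted P(Y = 1) at treatment t_val and mediator values m_val."""
    eta = out_coef[0] + out_coef[1] * t_val + out_coef[2] * m_val + out_coef[3] * x
    return 1.0 / (1.0 + np.exp(-eta))

# Step 2: hold the treatment at 1 and switch only the mediator.
acme_treated = np.mean(predict_outcome(1.0, m_treat) - predict_outcome(1.0, m_ctrl))

# For contrast: the product of coefficients, which is on the logit scale here.
naive_product = med_coef[1] * out_coef[2]

print(round(acme_treated, 3), round(naive_product, 3))
```

The simulated ACME is a difference in predicted probabilities, whereas the product of coefficients mixes a linear-scale slope with a logit-scale slope, so the two numbers are not comparable. Drawing mediator values rather than plugging in their conditional means matters for a nonlinear outcome model, because the identification formula averages the outcome over the whole mediator distribution.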

Thus, our new estimation method provides much needed generality and flexibility. Instead of researchers attempting to shoehorn nonlinear models of various types into the LSEM framework, they can estimate the ACME and ADE using statistical models appropriate to the data at hand.

Sensitivity Analysis

As we discussed, identifying causal mechanisms requires sequential ignorability, which cannot be tested with the observed data. Given that the identification of causal mechanisms relies upon an untestable assumption, it is important to evaluate the robustness of empirical results to potential violation of this assumption. Sensitivity analysis provides one way to do this. The goal of a sensitivity analysis is to quantify the exact degree to which the key identification assumption must be violated for a researcher’s original conclusion to be reversed. If inference is sensitive, a slight violation of the assumption may lead to substantively different conclusions. Although sensitivity analyses are not currently a routine part of statistical practice in political science (but see Blattman 2009, and Imai and Yamamoto 2010), we would argue that they should form an indispensable part of empirical research (Rosenbaum, 2002b).

Imai, Keele, and Tingley (2010) and Imai, Keele, and Yamamoto (2010) propose a sensitivity analysis based on the correlation between $\varepsilon_{i2}$, the error for the mediation model, and $\varepsilon_{i3}$, the error for the outcome model, under a standard LSEM setting and several commonly used nonlinear models. They use $\rho$ to denote the correlation across the two error terms. If sequential ignorability holds, all relevant pretreatment confounders have been conditioned on, and thus $\rho$ equals zero. However, nonzero values of $\rho$ imply departures from the sequential ignorability assumption and that some hidden confounder is biasing the ACME estimate.18 For example, in the Brader, Valentino, and Suhay (2008) study, if subjects’ unmeasured fear disposition makes them more likely to become anxious and also more opposed to immigration, this confounding will be reflected in the data-generating process as a positive correlation between $\varepsilon_{i2}$ and $\varepsilon_{i3}$. Ignoring this and estimating the two models separately will lead to a biased estimate of the ACME. Thus, $\rho$ can serve as a sensitivity parameter, because more extreme values of $\rho$ represent larger departures from the sequential ignorability assumption. In particular, although the true value of $\rho$ is unknown,

18 This omitted variable can also be thought of as any linear combi-nation of multiple unobserved confounders, though having a specificomitted variable in mind will help interpretation.

it is possible to calculate the values of ) for which theACME is zero or its confidence interval contains zero.
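The role of $\rho$ can be illustrated with a small simulation. The sketch below is not the authors' sensitivity analysis and does not use the mediation software; it assumes a simple LSEM with a randomized binary treatment, no covariates, unit error variances, and hypothetical coefficient values (`beta2`, `gamma`, `beta3`). It shows only that estimating the two models separately recovers the ACME when the errors are uncorrelated and is biased when they are not.

```python
import math
import random


def simulate_naive_product(rho, n=20000, beta2=1.0, gamma=0.5, beta3=0.3, seed=0):
    """Return (naive product-of-coefficients estimate, true ACME = beta2 * gamma).

    All coefficient values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    t, m, y = [], [], []
    for _ in range(n):
        ti = rng.randint(0, 1)        # randomized binary treatment
        e2 = rng.gauss(0.0, 1.0)      # mediator-model error
        # outcome-model error: unit variance, correlation rho with e2
        e3 = rho * e2 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        mi = beta2 * ti + e2
        yi = beta3 * ti + gamma * mi + e3
        t.append(ti)
        m.append(mi)
        y.append(yi)

    def arm_mean(values, arm):
        selected = [v for v, ti in zip(values, t) if ti == arm]
        return sum(selected) / len(selected)

    m_bar = {arm: arm_mean(m, arm) for arm in (0, 1)}
    y_bar = {arm: arm_mean(y, arm) for arm in (0, 1)}

    # Mediator model: effect of treatment on the mediator
    beta2_hat = m_bar[1] - m_bar[0]

    # Outcome model: with a binary treatment, the least squares slope on M
    # equals the slope after demeaning M and Y within each treatment arm
    m_res = [mi - m_bar[ti] for mi, ti in zip(m, t)]
    y_res = [yi - y_bar[ti] for yi, ti in zip(y, t)]
    gamma_hat = sum(a * b for a, b in zip(m_res, y_res)) / sum(a * a for a in m_res)

    return beta2_hat * gamma_hat, beta2 * gamma


naive_no_confounding, truth = simulate_naive_product(rho=0.0)
naive_confounded, _ = simulate_naive_product(rho=0.5)
print(truth, naive_no_confounding, naive_confounded)
```

Under these assumptions the slope on the mediator converges to `gamma + rho` rather than `gamma`, so a positive correlation between the errors inflates the naive estimate. A sensitivity analysis runs this logic in reverse: it asks what the ACME would be for each hypothesized value of $\rho$.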

Researchers may find it difficult to interpret the sensitivity parameter $\rho$. To ease interpretation, Imai, Keele, and Yamamoto (2010) have developed an alternative formulation of the sensitivity analysis based on how much the omitted variable would alter the coefficients of determination ($R^2$) of the mediator and outcome models. For example, if fear disposition is important in determining anxiety levels or immigration preferences, then the model excluding fear disposition will have a much smaller value of $R^2$ compared to the full model including fear disposition. On the other hand, if fear disposition is unimportant, $R^2$ will not be very different whether the variable is included or excluded. Thus, this relative change in $R^2$ can be used as a sensitivity parameter. For example, the original results would be considered weak if the sensitivity analysis suggests that fear disposition would need to explain only a small portion of the remaining variance in anxiety levels and immigration attitudes for the ACME to lose statistical significance.

Although sensitivity analysis can shed light on whether the estimates obtained under sequential ignorability are robust to possible hidden pretreatment confounders, it is important to note the limitations of the proposed sensitivity analysis. First, the proposed method is designed to probe for sensitivity to the presence of an unobserved pretreatment confounder. In particular, it does not address the possible existence of confounders that are affected by the treatment and then confound the relationship between the mediator and the outcome (see the Appendix for a more thorough discussion, and Imai and Yamamoto 2011 for a new sensitivity analysis with respect to posttreatment confounders). If such a confounder exists, we will need a different strategy for both identification and sensitivity analysis. Second, unlike statistical hypothesis testing, sensitivity analysis does not provide an objective criterion that allows researchers to determine whether sequential ignorability is valid or not. This is not surprising, given that sequential ignorability is an irrefutable assumption. Therefore, as suggested by Rosenbaum (2002a, 325), a cross-study comparison is helpful for assessing the robustness of one’s conclusion relative to those of other similar studies.[19]

EMPIRICAL ILLUSTRATIONS

In this section, we illustrate the proposed methods through a reanalysis of the experimental and observational studies of Brader, Valentino, and Suhay (2008) and Cox and Katz (1996), respectively, which were briefly discussed in Examples of the Search for Causal Mechanism. We show the general applicability of our method by accommodating different types of data, such as binary outcomes and mediators. We also show how to conduct a sensitivity analysis to probe the consequences of potential violations of the sequential ignorability assumption, i.e., Assumption 1.

[19] In addition, the proposed framework rests on the more fundamental presumption that the causal ordering imposed by the analyst is correct (e.g., emotional reactions occur before policy preference is formed). This can only be verified by some appeal to scientific evidence not present in the data.

TABLE 2. Estimated Products of Coefficients and Average Causal Mediation Effects with Discrete Outcomes

Outcome variable                                 Product of Coefficients Method    Average Causal Mediation Effect, $\bar{\delta}(1)$
Decrease Immigration (Ordinal)                   0.347 [0.146, 0.548]              0.105 [0.048, 0.170]
Support English-Only Laws (Ordinal)              0.204 [0.069, 0.339]              0.074 [0.027, 0.132]
Request Anti-Immigration Information (Binary)    0.277 [0.084, 0.469]              0.029 [0.007, 0.063]
Send Anti-Immigration Message (Binary)           0.276 [0.102, 0.450]              0.086 [0.035, 0.144]

Notes: Point estimates are followed by 95% confidence intervals in brackets. The 95% confidence intervals for the products of coefficients are based on the asymptotic variance of Sobel (1982). The ACME confidence intervals are based on nonparametric bootstrap with 1000 resamples. The mediation equation was estimated with least squares and the outcome equation is either a binary or an ordered probit model, depending on whether the outcome measure is binary or ordinal. For ordinal measures, the ACME is presented only in terms of the probability of the final category, which is the modal category. The results are computed via the mediation software.

Quantifying the Role of Anxiety in the Media Framing Effects

Brader, Valentino, and Suhay (2008) set out to study why and how media cues influence attitudes toward immigration. The authors identify two key factors that they hypothesize not only may alter opinions about immigration but also may spur people to political action. First, media messages that emphasize the costs of immigration on society should be expected to increase opposition, whereas stories that emphasize the benefits should reduce opposition. Second, given that immigration often has a racial component, whites will be more likely to oppose immigration when the immigrants being discussed in the media are nonwhite. Cues using nonwhite immigrants and messages emphasizing costs will have particularly negative effects on immigration attitudes. As earlier work suggests that the effect of group-based appeals works through emotional mechanisms (Kinder and Sanders 1996), Brader, Valentino, and Suhay (2008) hypothesize that the cues operate through changes in anxiety levels. They also consider an alternative mechanism where the cues influence immigration attitudes by changing beliefs about the costs and benefits of immigration.

To test these hypotheses, they constructed an experiment where respondents were given a news story with two manipulations. First, the content of the news story was manipulated to emphasize the benefits or the costs of immigration. Second, the researchers varied whether the particular immigrant described and pictured was a white immigrant from Russia or a Hispanic immigrant from Mexico. Brader, Valentino, and Suhay (2008) found that generally only one treatment combination (a negative immigration news story with a picture of a Hispanic immigrant) elevated anxiety and eroded support for immigration. That is, when subjects were exposed to a news story that highlighted the costs of immigration and referenced a Hispanic immigrant, they became less supportive of immigration. They also were more likely to speak out against increased immigration to their member of Congress and more likely to request anti-immigration information. The authors conclude that subjects’ level of anxiety mediated the effect of media cues.

Given the original results, we recode the four-category treatment condition indicator into a binary variable where the treatment condition is the negative news story combined with the picture of the Hispanic immigrant and the control condition is composed of subjects in the other three conditions. The anxiety mediator is measured as a roughly continuous scale constructed from three self-reported emotion indices in the survey. The outcome variables, which all measure various attitudes toward immigration, are all discrete. The first two outcome measures are ordinal scales and the other two outcome measures are binary. Finally, we use the same pretreatment covariates used in the original analysis (education, age, income, and gender).

Estimation of the Average Causal Mediation Effects. We report two types of results in Table 2. The first is based upon the product-of-coefficients method that Brader, Valentino, and Suhay (2008) use (left column). This involves estimating equation (6) with a linear regression and then estimating equation (7) with a binary or ordered probit model (depending on whether the outcome measure is binary or ordinal), both including the set of pretreatment covariates. Under this method, $\hat{\beta}_2\hat{\gamma}$ is interpreted to be the estimate of the ACME and the confidence intervals are calculated using the asymptotic variance formula (Sobel, 1982). For each type of immigration attitude or behavior, we obtain a positive, statistically significant estimate using the product-of-coefficients method. Brader, Valentino, and Suhay (2008) took this as evidence that anxiety transmits the effect of receiving the Hispanic/cost cue on immigration attitudes and behavior.

As discussed earlier, however, the use of the product-of-coefficients method is problematic unless both the outcome and mediator are modeled as linear functions. In the current case, because of the nonlinear models (probits) for the outcome variables, $\hat{\beta}_2\hat{\gamma}$ does not estimate the ACME consistently even under the sequential ignorability assumption and thus lacks a clear substantive interpretation.
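To see why the product of coefficients is not the ACME once the outcome model is a probit, consider the following sketch. It is not the estimation algorithm used in this article: it assumes known (hypothetical) parameter values, no covariates, a linear mediator model with normal errors, a binary probit outcome model, and sequential ignorability. Under those assumptions the average potential outcome has a closed form, so the ACME on the probability scale can be computed directly.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def acme_probit(t, a2, b2, sigma2, a3, b3, g):
    """ACME for treatment status t on the probability scale.

    Mediator model (hypothetical):  M = a2 + b2 * T + e2,  e2 ~ N(0, sigma2^2)
    Outcome model (hypothetical):   Pr(Y = 1 | T, M) = Phi(a3 + b3 * T + g * M)
    """
    # Averaging Phi(c + g * M) over a normal M rescales the index by this factor
    scale = math.sqrt(1.0 + (g * sigma2) ** 2)

    def mean_outcome(t_mediator):
        # E[Y(t, M(t_mediator))]: treatment fixed at t, mediator set as if T = t_mediator
        index = a3 + b3 * t + g * (a2 + b2 * t_mediator)
        return norm_cdf(index / scale)

    return mean_outcome(1) - mean_outcome(0)


# Hypothetical parameter values, chosen only for illustration
params = dict(a2=0.0, b2=0.5, sigma2=1.0, a3=-0.5, b3=0.3, g=0.6)

product = params["b2"] * params["g"]   # product of coefficients
delta1 = acme_probit(1, **params)      # ACME under treatment
delta0 = acme_probit(0, **params)      # ACME under control
print(product, delta1, delta0)
```

With these illustrative values the product of coefficients is 0.30, whereas both ACMEs are close to 0.10. The two ACMEs also differ from each other even though the outcome model contains no treatment-mediator interaction, purely because of the nonlinearity of the probit link.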

The second set of results employs the method described in The Proposed Estimation Method (right column). Using the mediation software, we estimate the same set of regression models and then calculate the ACME with confidence intervals based on the nonparametric bootstrap with 1000 resamples. We report the ACME for the treatment condition, $\bar{\delta}(1)$.[20] When the ordinal outcome is modeled with an ordered probit model, there is an ACME point estimate for each category in the dependent variable, which represents the change in the probability for each value of the outcome. Here, we report the ACME for the final category in each outcome measure, which in both cases is the modal category. The results show a striking contrast with the product-of-coefficients estimates, which are roughly 3 to 10 times as large as our estimates in Table 2. Under Assumption 1, our estimates are consistent for the ACME, which represents the average change in the outcome that is due to the change in the mediator induced by the difference in the treatment condition.

For example, we find that on average the treatment increased the probability that a subject preferred less immigration by 0.105 (with a 95% confidence interval of [0.048, 0.170]) because of heightened anxiety. Because the total causal effect of the Hispanic/cost treatment was 0.195 ([0.067, 0.324]) and the direct effect was 0.090 ([-0.021, 0.209]), we can conclude that about 54% of the total effect was mediated through the anxiety mechanism. In contrast, the product-of-coefficients method overestimates the increase in the probability of preferring less immigration due to the anxiety pathway (0.347 as opposed to 0.105).[21]
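The arithmetic behind these statements can be written out using only the point estimates reported above. The last line shows the direct effect that the product-of-coefficients estimate would imply if it were taken at face value.

```latex
\begin{align*}
\text{total effect} &= \text{ACME} + \text{direct effect} = 0.105 + 0.090 = 0.195, \\
\text{proportion mediated} &= \frac{0.105}{0.195} \approx 0.54, \\
\text{direct effect implied by the product of coefficients} &= 0.195 - 0.347 = -0.152 .
\end{align*}
```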

Sensitivity Analysis. The previous results indicate that anxiety is indeed likely to be a mediator of the effect of media cues on immigration opinions. However, these findings are obtained under the sequential ignorability assumption (Assumption 1). Thus, a natural question is how sensitive these results are to the violation of that assumption. In the current context, Assumption 1 implies that we have fully accounted for any confounders that might have effects on both the mediator and the outcome. More concretely, we must ask whether individuals who became more anxious have unobserved characteristics that differ from those of other individuals and that also influence immigration attitudes. If, for example, the unmeasured fear disposition or ideology of subjects makes them both more anxious and more opposed to immigration (see Nonparametric Identification under the Standard Designs), the proposed estimation procedure produces a biased estimate of the ACME. Our sensitivity analysis measures the robustness of conclusions to such possibilities.

[20] Estimates of $\bar{\delta}(0)$ were similar. Although we can incorporate an interaction term between the treatment and mediator, the estimates of $\bar{\delta}(0)$ and $\bar{\delta}(1)$ will generally differ even without an interaction term because of nonlinearity in the outcome model.

[21] The biased estimate based on the product of coefficients method does not make much substantive sense, either, because it implies that the direct effect is negative.

Here, we focus on the outcome where subjects stated whether immigration should be decreased or increased. The results are presented in Figure 2, which is generated using the mediation software. In the left panel, the true ACME is plotted against values of the sensitivity parameter $\rho$, which equals the correlation between the error terms in the mediator and outcome models and thus represents both the degree and direction of the unobserved confounding factor between anxiety and immigration preference. When $\rho$ is zero, sequential ignorability holds and the true ACME coincides with the estimate reported in Table 2. The shaded region in the plot marks the 95% confidence intervals for each value of $\rho$.

The first question we ask in the sensitivity analysis is how large $\rho$ must be for the mediation effect to be zero. We find that for this outcome, the estimated ACME equals zero when $\rho$ equals 0.43. After taking into account sampling uncertainty, we find that the 95% confidence intervals for the ACME include zero when $\rho$ exceeds 0.34. Thus, to conclude that the true ACME is not significantly different from zero, we must assume an unobserved confounder that affects both anxiety and immigration preference in the same direction and makes the correlation between the two error terms greater than 0.34.

Although this procedure effectively quantifies the degree of sensitivity, analysts may have difficulty in interpreting the result in substantive terms. There are two ways to address this issue. As suggested in Sensitivity Analysis, the first is a cross-study comparison. For example, Imai, Keele, and Yamamoto (2010) find in their reanalysis of another prominent media framing experiment (Nelson, Clawson, and Oxley 1997) that the ACME is zero when $\rho$ is equal to 0.48. Thus, the findings reported here are slightly less robust to the existence of unobserved confounding than in this previous study. The second possibility is to express the degree of sensitivity in terms of the importance of an unobserved confounder in explaining the observed variation in the mediator and outcome variables.

In the right panel of Figure 2, the true ACME is shown as contours with respect to the proportions of the variance in the mediator (horizontal axis) and in the outcome (vertical axis), each explained by the unobserved confounder in the true regression models. Here, we explore the case where the unobserved confounder affects the mediator and outcome in the same direction, which is what we would expect if the confounder were fear disposition. These two sensitivity parameters are each bounded above by one minus the $R^2$ of the observed models, which represents the proportion of the variance that is not yet explained by the observed predictors in each model. In this example, these upper bounds are 0.78 for the mediator model and 0.50 for the outcome model. Other things being equal, a low value of this upper bound indicates a more robust estimate of the ACME because there is less room for an unobserved confounder to bias the result.

FIGURE 2. Sensitivity Analysis with Continuous Mediator and Binary Outcome

[Figure not reproduced. Left panel: average mediation effect $\bar{\delta}(1)$ (vertical axis) plotted against the sensitivity parameter $\rho$ (horizontal axis). Right panel: contours of the ACME plotted against the proportion of total variance in M explained by the confounder (horizontal axis) and the proportion of total variance in Y explained by the confounder (vertical axis).]

Notes: The graphs show two alternative formulations of the proposed sensitivity analysis. The outcome for both analyses is whether subjects opposed increased immigration. In the left panel, the true ACME is plotted against the sensitivity parameter $\rho$, which is the correlation between the error terms in the mediator and outcome regression models. The dashed line represents the estimated ACME when the sequential ignorability assumption is made. The shaded areas represent the 95% confidence interval for the mediation effects at each value of $\rho$. In the right panel, the contours represent the true ACME plotted as a function of the proportion of the total mediator variance (horizontal axis) and the total outcome variance (vertical axis) that are each explained by the unobserved confounder included in the corresponding regression models. Here the unobserved confounder is assumed to affect the mediator and outcome in the same direction. Both graphs are generated by the mediation software.

We find that the true ACME changes sign if the product of these proportions is greater than 0.07 and the confounder affects both anxiety and immigration preference in the same direction. For example, if subjects’ fear disposition explains more than 35% of the variance in anxiety and 20% of the variance of the immigration level preference in the latent scale, then the true ACME is negative. Thus, the positive ACME reported in the original analysis is robust to confounding because of unmeasured fear disposition when the latter explains less than about 26% ($\sqrt{0.07}$) of the variance in the mediator and outcome. If the confounder were to affect the mediator and outcome in different directions, then mediation effects would be even more positive. In sum, our reanalysis of the Brader, Valentino, and Suhay (2008) study yields appropriate estimates of the ACME (given the use of nonlinear models), makes the necessary assumptions for the identification of mediation effects clear, and provides a sensitivity analysis.
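The numbers in this paragraph follow from simple arithmetic on the reported threshold. The sketch below only restates that arithmetic; the threshold of 0.07 and the upper bounds of 0.78 and 0.50 are taken from the text, whereas the function name and the trial values passed to it are hypothetical.

```python
import math

# Values reported in the text for the "decrease immigration" outcome
SIGN_CHANGE_THRESHOLD = 0.07                          # product of variance proportions
UPPER_BOUNDS = {"mediator": 0.78, "outcome": 0.50}    # one minus R^2 of each observed model


def acme_changes_sign(share_mediator, share_outcome, threshold=SIGN_CHANGE_THRESHOLD):
    """True if a same-direction confounder explaining these shares of the total
    variance in the mediator and the outcome would flip the sign of the ACME."""
    if share_mediator > UPPER_BOUNDS["mediator"] or share_outcome > UPPER_BOUNDS["outcome"]:
        raise ValueError("share exceeds the variance left unexplained by the observed model")
    return share_mediator * share_outcome > threshold


# R^2 of the observed models implied by the upper bounds
implied_r2 = {name: 1.0 - bound for name, bound in UPPER_BOUNDS.items()}

# Equal share in both models at which the product reaches the threshold
equal_share = math.sqrt(SIGN_CHANGE_THRESHOLD)

print(implied_r2, round(equal_share, 3))
print(acme_changes_sign(0.36, 0.20), acme_changes_sign(0.25, 0.25))
```

The equal share of about 0.26 is the 26% figure quoted in the text: a confounder explaining less than that proportion of the variance in both models cannot push the product above 0.07.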

Estimating the “Scare-off/Quality Effect” of Incumbency

Cox and Katz study the causal mechanisms through which incumbency generates an electoral advantage. They suggest one such mechanism where incumbents “scare off” quality challengers, yielding the electoral advantage of the incumbent in terms of relative opponent quality. Their argument is that because incumbents are likely to have greater resources available to them, higher-quality challengers will be deterred by the higher cost of defeating an incumbent and their own high opportunity costs.

In the original analysis, the treatment variable is a trichotomous incumbency indicator equal to -1 if the incumbent is Republican in district i, 0 if there is no incumbent, and 1 if district i has a Democratic incumbent. The mediator is what they call the Democratic quality advantage, which is operationalized as a trichotomous variable that equals -1 if the Republican challenger had previously held elected office but not the Democrat, 0 if neither or both candidates previously held elected office, and 1 if the Democrat had held office but not the Republican. The outcome variable is Democratic vote share in district i.[22]

[22] Despite the trichotomous nature of the mediating variable, the original analysis used linear regression models so that the product of coefficients method can be applied. Our approach permits the use of an ordered probit model.

Measurement of Challenger Quality. Our reanalysis based on the potential outcomes framework reveals an important conceptual limitation of the original study. To estimate the scare-off/quality effect of incumbency, Cox and Katz (1996) operationalize the quality advantage of Democratic candidates as the difference in the two candidates’ quality (measured by their previous experience in an elective office) for each district. This mediating variable, however, is problematic because it is defined in terms of not only challengers’ quality but also incumbents’ own quality. In fact, because incumbency itself is regarded as previous office experience, the mediator cannot take its largest (smallest) possible value whenever there is a Republican (Democratic) incumbent in a district regardless of the challenger’s quality, i.e., $M_i(-1) \in \{-1, 0\}$ and $M_i(1) \in \{0, 1\}$ for any $i$. This creates an artificial positive correlation between the observed values of the mediator and the treatment because by definition $M_i(-1)$ can never be greater than $M_i(1)$ for any $i$.

For example, consider the counterfactual scenario where a Democratic incumbent had his or her incumbency status changed and thus is no longer an incumbent. The scare-off effect is then the decrease in the quality of the Republican challenger that would result from this hypothetical change in incumbency. However, under the original coding scheme, the value of Democratic quality advantage would automatically decrease (because of the counterfactual change in incumbency status) even if the challenger’s quality stayed the same. Thus, the change in incumbency negatively affects the mediator even if the true scare-off effect is zero. Note that although our focus on counterfactuals makes these inconsistencies readily apparent, the model-based approach tends to mask them by obscuring the relevant counterfactual comparisons.
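The restriction on the mediator can be checked by enumerating the original coding scheme. The sketch below encodes only what the text states: quality is prior elected office, incumbency counts as prior office, and the mediator is the Democratic quality advantage. The function names are hypothetical.

```python
from itertools import product


def quality_advantage(dem_held_office, rep_held_office):
    """Democratic quality advantage: 1, 0, or -1 under the original coding."""
    return int(dem_held_office) - int(rep_held_office)


def possible_mediator_values(incumbency):
    """Values the mediator can take given incumbency status.

    incumbency: 1 = Democratic incumbent, 0 = open seat, -1 = Republican incumbent.
    """
    values = set()
    for dem_held_office, rep_held_office in product((False, True), repeat=2):
        # An incumbent has, by definition, previously held elected office
        if incumbency == 1 and not dem_held_office:
            continue
        if incumbency == -1 and not rep_held_office:
            continue
        values.add(quality_advantage(dem_held_office, rep_held_office))
    return values


for status in (-1, 0, 1):
    print(status, sorted(possible_mediator_values(status)))
```

The enumeration reproduces the sets given above: with a Republican incumbent the mediator is confined to {-1, 0}, and with a Democratic incumbent to {0, 1}, whatever the challenger's own quality. Only in open seats can all three values occur.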

Fortunately, our framework permits a clear way to revisit their original question. The problem with the original coding scheme was that changes in the incumbency status would automatically produce changes in the quality variable; the mediator is defined too closely to the treatment variable. To avoid this problem, we first split apart the sample into two groups based on the party of incumbents.[23] For the analysis of Democratic incumbency effects, the treatment variable is coded as 1 if there was a Democratic incumbent in the district and 0 if the seat was open. To construct the mediating variable, we used the original Jacobson (1987) data to calculate the quality of the Republican candidate in the district. We code this mediating variable as 1 if the Republican had previously held public office and 0 if he or she had not. Note, importantly, that variation in this variable is no longer tied to the treatment variable in any deterministic way, as in the original coding scheme. Finally, the outcome variable is the Democratic candidate’s percentage of the two-party vote. The variables for the Republican incumbents group are coded analogously. The new coding scheme allows us to define causal quantities of interest in a clearer and more transparent manner. For example, the average total effect of incumbency, $\bar{\tau} = E(Y_i(1, M_i(1)) - Y_i(0, M_i(0)))$, is equal to the expected change in the candidate’s percentage of the two-party vote that would result if the candidate were changed from an incumbent to a nonincumbent in an open seat, holding his or her party constant either to Democrat or Republican. The ACME for the scare-off/quality mechanism under the control condition, $\bar{\delta}(0) = E(Y_i(0, M_i(1)) - Y_i(0, M_i(0)))$, represents the expected change in the vote share caused by the change in challenger quality that would result if a candidate in an open seat (either Democratic or Republican) hypothetically ran as an incumbent of the same party. Thus, the original scare-off/quality hypothesis can be tested by estimating the size of $\bar{\delta}(0)$ and comparing it to the total incumbency effect, $\bar{\tau}$, for each party.

[23] Open seats are counted twice and included in both groups, composing the control groups.

Estimation of the Average Causal Mediation Effects. Cox and Katz found that the component of incumbency effects that is due to the scare-off/quality mechanism increased over time by estimating effects separately by election. Figure 3 presents the ACME and total effect of changing the incumbency variable from 0 (open seat) to 1 (incumbent) separately for Democratic incumbents (left) and Republican incumbents (right). As found generally in the literature, the effect of incumbency has greatly increased over time. In the original study, this growth was attributed to a similar increase over time in the scare-off/quality effect. In contrast, our analysis shows that the ACME was not significantly different from zero for either Democratic or Republican candidates in the earlier time periods. Moreover, although the ACME has slightly increased over time as in the original study, the effect beginning in the 1970s was usually between 2 and 3% and barely statistically significant at the .05 level. Thus, our reanalysis suggests that the increase in incumbency advantage may be attributable to different causal mechanisms rather than to the scare-off/quality mechanism.

Sensitivity Analysis. We now apply the proposed sensitivity analyses to the incumbency advantage example. As explained earlier, the estimates of the ACME reported in Figure 3 will be biased if the sequential ignorability assumption (Assumption 1) does not hold. In this study, there can be many unobserved confounders that affect both the mediator and the outcome variable. For example, Assumption 1 will be violated if national party organizations allocate campaign funds across districts based on priorities for getting particular candidates (say those in powerful committee positions) elected. The candidates might face lower-quality challengers and have higher election returns because of these added resources. The proposed sensitivity analysis quantifies the robustness of the ACME estimates to the existence of such unobserved confounding. Whereas previous sensitivity analyses with the Brader, Valentino, and Suhay (2008) study were conducted with a dichotomous outcome and continuous mediating variable, the flexibility of our approach permits sensitivity analyses for more general settings. Here, the mediating variable is dichotomous and the outcome variable continuous.

For illustration, we focus on the Republican incumbency effects in 1976 and 1980, where the magnitude of the estimated ACME was similar (1.25 and 1.27). The results are shown in Figure 4, which again is generated by mediation. Sensitivity estimates can be quite different even in this case, where the estimated ACME under Assumption 1 is roughly equal. For example, in 1976 the value of $\rho$ for which the point estimate of the ACME changes sign is $-0.39$, whereas for 1980 it is $-0.20$, implying that the 1976 estimate is much more robust to unobserved confounding. The analysis with respect to the explained variances similarly shows a striking contrast between these two years. For 1976, an unobserved confounder must affect the mediator and outcome in different directions and explain as much as 23.4% of the total variance in both variables for the true ACME to be negative.[24] In contrast, this percentage for the 1980 estimate is only about 11.8%, and the true ACME could be negative and quite large ($-3$ or even less) when the degree of confounding was extremely high. The two years also differ in terms of the upper bounds of the sensitivity parameters. For 1976, the observed variables in the models leave 85.1% and 42.1% of the variance in the mediator and outcome, respectively, to be potentially explained by an unobserved confounder. For 1980, these proportions are much smaller for the mediator (62.5%) but slightly larger for the outcome (56.1%). In summary, even when the point estimates under Assumption 1 are similar, the robustness of conclusions can be quite different.

FIGURE 3. Estimated Average Causal Mediation Effect (ACME) and Total Effect of Incumbency Status on Own Party Vote Share

[Figure not reproduced. Left panel: Democratic Incumbents. Right panel: Republican Incumbents. Horizontal axis: election year. Vertical axis: effect of incumbency on incumbent % of two-party vote. Series: average mediation effect $\bar{\delta}(0)$ and average total effect $\bar{\tau}$.]

Notes: For each party (left panel Democratic; right panel Republican), the black dots represent the ACME of incumbency on own party vote share mediated by the other candidate’s quality. The white dots represent the total effect of incumbency on vote share. The effects are reported for each U.S. House election between 1946 and 1990. The vertical lines represent the 95% confidence intervals. The effects are estimated using the algorithm in Imai, Keele, and Tingley (2010) with probit for the mediator model and linear regression for the outcome model. Estimates generally show smaller proportions of the total effects transmitted through the scare-off/quality mechanism than reported by Cox and Katz (1996).

We conclude this section by reminding readers of an important limitation of observational studies. As explained in Sensitivity Analysis, the analysis maintains the assumption that the treatment is ignorable after conditioning on observed covariates. Although this assumption is guaranteed to hold in randomized experiments, it can be violated in observational studies. For example, the analysis would be invalid if there were unobserved confounders that affected both incumbency status and challenger quality. As with any causal inference based on observational data, the assumption of ignorable treatment also plays a crucial role.

[24] We use a pseudo-$R^2$ for the probit model (see Imai, Keele, and Tingley 2010). If the unobserved confounder influences the mediator and outcome in the same direction the results would suggest a stronger role of the proposed mechanism.

ALTERNATIVE RESEARCH DESIGNS FOR CREDIBLE INFERENCE

So far, we have discussed how to make inferences about causal mechanisms using the standard designs for experimental and observational studies. However, as should be clear by now, the standard designs require a strong identification assumption that may be difficult to justify in practice. A natural question to ask is whether there exist alternative research designs that rely on more credible assumptions. Imai, Tingley, and Yamamoto (n.d.) propose new experimental designs and analyze their power to identify causal mechanisms. The key idea is to consider designs where the mediator can be directly or indirectly manipulated. In this section, we first discuss some of these alternative experimental designs in the context of the Brader, Valentino, and Suhay (2008) study. We then show that the basic ideas of these experimental designs can serve as a template for observational studies in the context of incumbency advantage research. Our discussion should help researchers to think systematically about how to design observational studies.


FIGURE 4. Sensitivity Analysis for the Scare-off/Quality Mechanism, 1976 and 1980 Elections with Republican Incumbents

[Figure not reproduced. Top row (panels for 1976 and 1980): average causal mediation effect $\bar{\delta}$ (vertical axis) plotted against the sensitivity parameter $\rho$ (horizontal axis). Bottom row (panels for 1976 and 1980): contours of the ACME plotted against the proportion of total variance in M explained by the confounder (horizontal axis) and the proportion of total variance in Y explained by the confounder (vertical axis).]

Notes: The top row plots the ACME as a function of the sensitivity parameter $\rho$. The bottom row of plots provides the alternative formulation based on the decomposition of variances. See Figure 2 for the graph interpretation. Despite the two years having a similar-sized ACME estimate, the sensitivity analyses suggest that the 1980 results are more sensitive to an omitted variable that affects the Democratic candidate quality (mediator) and Republican vote share (outcome) in opposite directions.

Designing Randomized Experiments

To study how media cues influence immigration attitudes, Brader, Valentino, and Suhay (2008) use the standard single-experiment design, which consists of the following three basic steps. First, a treatment is randomly assigned to subjects. Second, a mediating variable is measured after the treatment has been administered. Finally, an outcome variable is measured. The single-experiment design is typical of the vast majority of experimental work in the social sciences that attempts to identify causal mechanisms.

However, sequential ignorability must hold for the ACME and ADE to be identifiable under this single-experiment design. What happens if we relax this assumption and only assume that the treatment is randomized (as is the case under the single-experiment design)? For the special case of binary mediator and outcome, Imai, Tingley, and Yamamoto (n.d.) and Sjölander (2009) derive the nonparametric sharp bounds for the ACME and ADE, respectively. The bounds represent the exact range of possible values that these quantities of interest can take without sequential ignorability. The results imply that the single-experiment design can provide some information about these quantities compared to what is known before the experiment (i.e., the bounds are narrower than $[-1, 1]$). But the bounds unfortunately will always include zero and hence will not provide information about the sign of the ACME or ADE. Thus, relatively little can be learned under the single-experiment design without an additional untestable assumption.

The problem with the single-experiment design is that we cannot be sure that the observed mediator is ignorable conditional on the treatment and pretreatment covariates. A better alternative is to implement experimental designs where the researcher randomly assigns the values of the mediator. Imai, Tingley, and Yamamoto (n.d.) propose several such designs and derive their identification power under a minimal set of assumptions. One important difference among these new designs is whether the mediator can be perfectly manipulated by the researcher. For the purpose of studying topics like media cues, the most applicable class of designs is what they call encouragement designs: it is unlikely that a researcher will be able to perfectly assign levels of anxiety, because anxiety can at best be encouraged to take certain values. Thus, in this section, we focus on encouragement designs and discuss how they can help improve our inferences about the ACME.

In the parallel encouragement design, subjects are first split into two experiments, which are run in parallel. The first experiment uses the standard single-experiment design. In the second experiment, we first randomly assign subjects to the treatment and control conditions. Then, within each condition, a random subset of subjects are encouraged to take on a high or low value of the mediator. Finally, both the mediator and outcome variable are observed. For example, a redesign of Brader, Valentino, and Suhay’s (2008) original study would first assign individuals to receive either the treatment news story, which features a Hispanic immigrant and emphasizes the costs to immigration, or the control story. Second, within each condition, a random set of subjects are encouraged to have lower or higher levels of anxiety through a writing task (e.g., Tiedens and Linton 2001) or other mood induction procedures (e.g., Gross and Levenson 1995).

If mediator manipulation in the second experiment were perfect, then the parallel encouragement design would reduce to the parallel design, where the mediator is directly manipulated to take particular values for a randomly selected subset of the sample. It is important to note that even in the parallel design, the ACME and ADE are not identified. This stems from the fact that the causal mediation effect represents a change in the mediator due to the difference in the treatment condition rather than the effect of directly manipulating the mediator at a certain level (see Sequential Ignorability and Conventional Exogeneity Assumptions). In practice, manipulation of mood will not be perfect, so some subjects will have the same level of anxiety regardless of whether they are encouraged. In these cases, the encouragement design will provide less information about the ACME for the entire population than the parallel design.

However, the parallel encouragement design provides more information for those subjects that comply with the encouragement. Figure 5 illustrates the fact that the randomized encouragement Z can be regarded as the instrument inducing an exogenous variation in the mediator. Thus, following the identification strategy used in the instrumental variables approach for the total causal effect (Angrist, Imbens, and Rubin 1996), we can define the complier average mediation effect (CACME). In the Instrumental Variables section, we further discuss this connection with instrumental variables. For example, the CACME in the context of the immigration study is equal to the average effect of ethnic cues on immigration attitudes that is mediated by anxiety among those subjects whose anxiety levels are either lowered or raised by the mood induction task. Although these compliers represent a particular subset of the population and hence there is no guarantee that the CACME is similar to the ACME for the entire population of interest, the bounds on the former can be as tight as or even tighter than those on the latter in this encouragement design.

FIGURE 5. Diagram Illustrating the Parallel Encouragement Design

Notes: The randomized encouragement Z induces an exogenous variation in the mediator M, which allows researchers to make informative inference about the ACME and ADE even in the presence of confounders, which are represented by the dashed arc.

We refer readers to Imai, Tingley, and Yamamoto (n.d.) for the details of various alternative designs, including the parallel encouragement design, as well as the comparison between them and the single-experiment design. A key point, however, is that these new designs in many cases will generate more information about causal mechanisms. Thus, these designs are useful alternatives for experimentalists who study causal mechanisms but wish to avoid the sequential ignorability assumption.

Designing Observational Studies

How should we design observational studies so that we can make credible inferences about causal mechanisms in the absence of experimental controls? Our suggestion is to use the experimental designs discussed previously as templates. The growing use of natural experiments in social sciences over the last couple of decades arose as a result of systematic efforts by empirical researchers who use randomized experiments as research templates. These researchers search for situations where the treatment variable is determined haphazardly so that the ignorability assumption is more credible.

We argue that a similar strategy can be employed for the identification of causal mechanisms by designing observational studies to imitate various experimental designs. In fact, some have already employed similar research design strategies in the incumbency advantage literature. Here, we show how these existing studies can be seen as observational study approximations to various experimental designs. This suggests that by using these experimental designs as templates, researchers can systematically think about ways to make observational studies more credible for identifying causal mechanisms.

We first consider an extension of the crossover design proposed in Imai, Tingley, and Yamamoto (n.d.) to an observational study on incumbency advantage. The crossover design for randomized experiments consists of the following two steps. First, the treatment is randomized and then the values of the mediator and the outcome variable are observed. Second, the treatment status is changed to the one opposite to the treatment status of the first period and the mediator is manipulated so that its value is fixed at the observed mediator value from the first period. Because the mediator value is fixed throughout the two periods, the comparison of the outcomes of each unit between the first and second periods identifies the direct effect for that unit. Subtracting the estimated ADE from the estimated ATE then gives the estimate of the ACME.25
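The two steps above can be sketched with a small simulation. This is only an illustration under assumptions that are not part of the original text: a linear model with no treatment-mediator interaction, no carryover effect, an outcome that is otherwise unchanged between periods, and hypothetical coefficient values (`beta2`, `beta3`, `gamma`). The function name `simulate_crossover` is likewise made up for this sketch.

```python
import random


def simulate_crossover(n=40000, seed=0):
    """Simulate a two-period crossover design under a hypothetical linear model."""
    rng = random.Random(seed)
    beta2, beta3, gamma = 0.8, 0.5, 1.5  # hypothetical effects: T->M, direct T->Y, M->Y
    direct_sum = 0.0
    y_sum = {0: 0.0, 1: 0.0}
    count = {0: 0, 1: 0}
    for _ in range(n):
        u = rng.gauss(0, 1)                     # unobserved confounder of M and Y
        t1 = rng.randint(0, 1)                  # period 1: randomized treatment
        m1 = beta2 * t1 + u + rng.gauss(0, 1)   # observed mediator M_i(t1)
        y1 = beta3 * t1 + gamma * m1 + u        # outcome Y_i(t1, M_i(t1))
        t2 = 1 - t1                             # period 2: opposite treatment status
        y2 = beta3 * t2 + gamma * m1 + u        # mediator fixed at m1: Y_i(t2, M_i(t1))
        # Within-unit contrast Y_i(1, M_i(t1)) - Y_i(0, M_i(t1)): the unit's direct effect.
        direct_sum += (y1 - y2) if t1 == 1 else (y2 - y1)
        y_sum[t1] += y1
        count[t1] += 1
    ade = direct_sum / n
    ate = y_sum[1] / count[1] - y_sum[0] / count[0]  # difference in means, period 1 only
    return ate, ade, ate - ade


if __name__ == "__main__":
    ate, ade, acme = simulate_crossover()
    print(f"ATE = {ate:.2f}, ADE = {ade:.2f}, ACME = ATE - ADE = {acme:.2f}")
```

In this sketch the mediator and outcome share the unobserved confounder `u`, so sequential ignorability fails, yet the within-unit comparison still recovers the direct effect because the mediator is held at its first-period value. With the hypothetical values used here, the implied ADE is `beta3` and the implied ACME is `beta2 * gamma`.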

In the incumbency advantage literature, the research design used by Levitt and Wolfram (1997) can be understood as an approximation to this crossover design. In that article, the authors examine repeated contests between the same candidates. The basic idea is the following. Suppose that both candidates are nonincumbent during the first election. One candidate wins the election and then they face each other again in the next election as an incumbent and a challenger. If we assume that the candidate quality has not changed between the two elections, then this is essentially a crossover design. In the first period, we have a nonincumbent $T_i = 0$ and we observe the challenger quality without incumbency $M_i(0)$. In the second period, the mediator is held at the same value as the first period, but the treatment status changes to $T_i = 1$ now that the candidate is an incumbent. If we further assume that the first election does not affect the second election (i.e., no carryover effect), then we can identify the ADE, $E\{Y_i(1, M_i(0)) - Y_i(0, M_i(0))\}$, for a subset of districts that have repeated contests between the same two candidates.

Following Levitt and Wolfram, Ansolabehere, Snyder, and Stewart (2000) use a similar research design to examine the importance of personal vote as an alternative causal mechanism of incumbency advantage. In particular, the authors use decennial redistricting as a natural experiment and compare (right after redistricting) the incumbent’s vote share in the new part of the district with that in the old part of the district. They argue that this comparison allows the identification of personal vote (due to incumbents’ services to their districts) because in both parts of the districts the incumbent faces the same challenger; hence the challenger quality is held fixed. Although the comparison is made within the same election cycle, this design can also be considered as an approximation to the crossover design. The authors assume that the incumbency status is different between the old and new parts of the district because the candidate is not an incumbent for new voters, even though the challenger quality is the same for the entire district. If this assumption is reasonable, then their research design identifies the ADE, $E\{Y_i(1, M_i(1)) - Y_i(0, M_i(1))\}$, for a subset of districts where redistricting produced both new and old voters. Assuming there is no causal pathway between incumbency and vote share other than challenger quality and personal vote, the ADE is then equal to the incumbency effect due to personal vote.

25 Imai, Tingley, and Yamamoto (n.d.) discuss how this design can be applied to the labor market discrimination experiment of Bertrand and Mullainathan (2004) by modifying the original experimental protocol in subtle but important ways.

Provided that the assumption of no carryover effect holds, there exist two main advantages of this crossover design over the standard design such as the one used by Cox and Katz (1996). First, because the challenger is held constant, researchers can assume challenger quality is held constant without even measuring it. Second, the randomization of treatment is unnecessary because under the appropriate assumptions all necessary potential outcomes are observed for each unit. This is an important advantage, given that the ignorability of treatment assignment is difficult to assume in observational studies. These examples illustrate that the identification of causal mechanisms with observational studies can be made more credible by using randomized experiments as templates. In particular, researchers may use the key idea of the crossover design and look for natural experiments where the mediator is held constant either across time or space.

Of course, researchers should always be aware of whether a natural experiment has external validity limitations. In the incumbency advantage example described earlier, Levitt and Wolfram attributed a large fraction of incumbency advantage to the scare-off/quality effect, whereas Ansolabehere, Snyder, and Stewart (2000) attributed it to the personal vote. Although these results are apparently contradictory, the difference may have arisen simply because the two designs identify different quantities. The ADE identified by Levitt and Wolfram (1997) holds the mediator constant at $M_i(0)$, whereas the mediator is fixed to $M_i(1)$ for the Ansolabehere, Snyder, and Stewart (2000) study. In addition, the two studies identify these quantities for different subsets of districts. Thus, the differences between the two sets of findings may simply reflect the differences in the causal estimands.

The Importance of the Consistency Assumption

Finally, we note that the research designs considered so far all rest on an important assumption called consistency (Cole and Frangakis 2009). In the current context, the assumption states that regardless of how the values of the treatment and mediator come about, their potential outcomes must take the same values as long as the treatment and mediator values are the same. In other words, experimental manipulations themselves must not affect the outcome except through the changes they induce in the values of the treatment or mediator.26

This notion of consistency represents a fundamental assumption that is common to a vast majority of the existing results in the causal inference literature, though it is often left implicit.27 In the analysis of causal mechanisms, however, the consistency assumption deserves special attention. As emphasized throughout this article, the identification of causal mechanisms, by definition, requires inference about natural changes in the mediator as responses to treatment. Therefore, even in experimental designs involving the manipulation (or encouragement) of mediators, one must assume that subjects would respond in the same way if those values of mediators were spontaneously chosen by the subjects themselves.

The consistency assumption requires particularly careful examination when the mechanism of interest is a psychological one. For example, in the redesigned version of the Brader, Valentino, and Suhay (2008) study we discussed in Designing Randomized Experiments, anxiety encouragement such as a writing task may itself have an effect on subjects’ immigration attitudes other than through its direct effect on anxiety if the task changes other emotions, which in turn affect immigration attitudes. The consistency assumption would then be violated. In the incumbency advantage study by Levitt and Wolfram, the consistency assumption requires that the vote shares in the two elections be comparable, in the sense that they would on average take identical values if both incumbency status and challenger qualities stayed the same. In sum, one must carefully evaluate the plausibility of consistency in using these alternative designs in light of the specific context of one’s empirical application.

RELATED CONCEPTS AND COMMON MISUNDERSTANDINGS

Finally, we discuss how the concepts and methods introduced here differ from those frequently used by social scientists. Understanding these key differences is crucial for determining the quantities of interest that fit the goal of one’s research, leading to the appropriate choice of statistical methods and research designs.

Instrumental Variables

The instrumental variables method is widely used for the identification of causal effects across disciplines. Typically, an instrumental variable is used when one is interested in the causal effect of an endogenous treatment variable. Under this setting, the instrument is assumed to have no direct effect on the outcome (i.e., exclusion restriction) and affects all units in one direction (i.e., monotonicity) (Angrist, Imbens, and Rubin 1996). Together with the ignorability of the instrument and the stable unit treatment value assumption, researchers can identify the ATE for compliers. Although this standard use of the instrumental variables method is helpful for identifying causal effects, it does not directly help identify causal mechanisms. In fact, it has more often been associated with the “black box” approach to causal inference, where insufficient attention is paid to causal mechanisms. For example, Deaton (2010a, 2010b) criticizes the blind application of this method to economic research precisely because of this tendency.

26 The exact definition of the consistency assumption varies depending on which specific design is employed in a given study. See Imai, Tingley, and Yamamoto (n.d.) for its formal representations.

27 Consistency and the assumption of no interference between units (see footnote 5) together compose the so-called stable unit treatment value assumption (SUTVA).

Given the value of the instrumental variables method for studying causal effects, can it be incorporated into the study of causal mechanisms? The answer is yes, though unfortunately the existing methodological suggestions are of limited use for applied researchers because they a priori rule out the existence of causal mechanisms other than the hypothesized one by assuming the direct effect of the treatment to be zero (i.e., an exclusion restriction) (Holland 1988; Jo 2008; Sobel 2008). A more appropriate way of applying the instrumental variable method appears in the encouragement design discussed in Designing Randomized Experiments. Under that design, the randomized encouragement can be seen as an instrument for the mediator which in conjunction with the randomized treatment helps identify causal mechanisms. If encouragement has no direct effect on the outcome (other than through the mediator) and does not discourage anyone, then the instrumental variables assumptions are satisfied. This means that one can learn about the ACME and the ADE for those who can be affected by the encouragement without assuming sequential ignorability. Therefore, the instrumental variables method can effectively address the endogeneity of the mediator. The key point here is that combining instrumental variables and novel research designs helps to identify causal mechanisms, whereas previous applications of instrumental variables were unable to do more than simply identify causal effects.

Furthermore, the idea of this encouragement design can be extended to observational studies that seek to understand the role of a causal mechanism. To do this, researchers can use an instrument that induces exogenous variation in the mediator of interest, while also measuring and using the treatment variable of interest. For example, in the literature on how incumbency advantage influences election outcomes, Gerber (1998) explores campaign spending as an alternative causal mechanism. Recognizing the possible endogeneity problem, the author uses candidate wealth levels as an instrument. Here, the key identifying assumptions are that candidate wealth levels are essentially random (ignorability of instrument); they influence election outcomes only through campaign spending (exclusion restriction); and higher candidate wealth levels never lead to lower campaign spending (monotonicity). These assumptions are strong, but if they are met, candidate wealth levels can be used as an instrument to study causal mechanisms without sequential ignorability.

Under this setting, a standard instrumental variables estimator may be used to estimate the ACME and ADE. For example, in the LSEM framework, the two-stage least squares (2SLS) estimator can be used, where the first-stage model is given by the equation

$$M_i = \alpha_2 + \beta_2 T_i + \lambda Z_i + \xi_2^\top X_i + \epsilon_{i2}, \tag{8}$$

where $Z_i$ is the instrumental variable, whereas the second-stage regression is the same as before, i.e., equation (7). In the Appendix, we prove that under this linear structural model the ACME and ADE are identified and equal to $\beta_2\gamma$ and $\beta_3$, respectively. Thus, this well-known 2SLS estimator can also be used for the identification of causal mechanisms. If an instrument is available and a researcher has a strong reason to believe that ignorability of the mediator will not hold, this strategy is a viable alternative. However, as in any use of instrumental variables, the validity of the required assumptions must be taken seriously.
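As a rough illustration of the 2SLS strategy just described, the following sketch simulates data from a linear structural model with an unobserved mediator-outcome confounder and compares 2SLS with naive least squares. Everything specific here is an assumption made for the example: the coefficient values, the absence of covariates $X_i$, the binary randomized instrument, and the helper names `ols` and `simulate_2sls`. It computes point estimates only and is not the implementation in the authors' software.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * z for x, z in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate_2sls(n=20000, seed=1):
    rng = random.Random(seed)
    beta2, lam, beta3, gamma = 0.8, 1.0, 0.5, 1.5  # hypothetical true values
    T, Z, M, Y = [], [], [], []
    for _ in range(n):
        t, z = rng.randint(0, 1), rng.randint(0, 1)  # randomized treatment and instrument
        u = rng.gauss(0, 1)                          # unobserved mediator-outcome confounder
        m = 0.2 + beta2 * t + lam * z + u + rng.gauss(0, 1)
        y = 0.1 + beta3 * t + gamma * m + u + rng.gauss(0, 1)
        T.append(t)
        Z.append(z)
        M.append(m)
        Y.append(y)
    # First stage (equation 8 without covariates): M on 1, T, Z.
    a2, b2, lam_hat = ols([[1.0, t, z] for t, z in zip(T, Z)], M)
    m_hat = [a2 + b2 * t + lam_hat * z for t, z in zip(T, Z)]
    # Second stage (equation 7 without covariates): Y on 1, T, fitted M.
    _, b3, g = ols([[1.0, t, mh] for t, mh in zip(T, m_hat)], Y)
    # Naive least squares of Y on 1, T, observed M, for comparison.
    _, _, g_naive = ols([[1.0, t, m] for t, m in zip(T, M)], Y)
    return {"acme_2sls": b2 * g, "ade_2sls": b3, "acme_naive": b2 * g_naive}


if __name__ == "__main__":
    for name, value in simulate_2sls().items():
        print(f"{name}: {value:.2f}")
```

Under the hypothetical values above, the implied ACME is `beta2 * gamma` and the implied ADE is `beta3`. Because the confounder `u` raises both the mediator and the outcome, the naive estimate of the ACME is biased upward in this simulation, whereas the 2SLS estimate is not.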

Interaction Terms

Another common strategy researchers employ to identify causal mechanisms is to use interaction terms. Broadly speaking, there are two usages: interaction terms between the treatment and mediator measures and those between the treatment and pretreatment covariates. Researchers typically include these interaction terms in regressions and use their statistical significance as evidence of the causal mechanisms that these terms are assumed to represent. Now, we examine the conditions that justify such strategies.

First, consider an interaction between treatment and mediator. A recent such example is the work by Blattman (2009), who finds that in Uganda abduction by rebel groups leads to substantial increases in voting through elevated levels of violence witnessed. In a series of regressions, the author shows that level of violence witnessed has a positive, statistically significant association with political participation primarily among those who were abducted. This finding is then used as evidence for the claim that “violence, especially violence witnessed, is the main mechanism by which abduction impacts participation” (239).

Under what assumptions is this line of reasoning valid? Such an inference can be justified under sequential ignorability. In the current example, the abduction by rebels must occur at random and levels of violence witnessed also need to be random, conditioning on whether one was abducted and other pretreatment covariates such as income and education. Under sequential ignorability, the significant interaction term between treatment and mediator indicates that the ACME differs depending on the treatment status, i.e., $\bar{\delta}(1) \neq \bar{\delta}(0)$, and in particular $\bar{\delta}(1) > 0$ but $\bar{\delta}(0) \approx 0$. This means that the average level of political participation for abductees would have been lower if they had witnessed the same level of violence as nonabductees. However, for nonabductees, the levels of political participation would not have changed much even if they had witnessed as high levels of violence as abductees did.

Thus, so long as sequential ignorability holds, the statistically significant interaction term between treatment and mediator provides evidence for the existence of a hypothesized causal mechanism. However, simply testing the significance of the interaction term is not recommended because such a procedure can only test whether either $\bar{\delta}(1)$ or $\bar{\delta}(0)$ is different from zero. In contrast, the procedure in Inference and Sensitivity Analysis under the Standard Designs can estimate the size of these quantities along with confidence intervals, providing more substantive information on the basis of the same assumption. In the situation where the values of $\bar{\delta}(1)$ and $\bar{\delta}(0)$ are likely to differ, one can include the interaction term $T_i M_i$ in equation (7) to allow the estimates to be different (Imai, Keele, and Tingley 2010).
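To see how the interaction term lets the two ACMEs differ, consider the following sketch. It assumes the linear mediator model of the LSEM framework and adds a term $\kappa T_i M_i$ to equation (7); the symbol $\kappa$ is notation introduced here for illustration only. Under sequential ignorability, substituting the potential mediators into the outcome model gives:

```latex
\begin{align*}
M_i &= \alpha_2 + \beta_2 T_i + \xi_2^\top X_i + \epsilon_{i2}, \\
Y_i &= \alpha_3 + \beta_3 T_i + \gamma M_i + \kappa T_i M_i + \xi_3^\top X_i + \epsilon_{i3}, \\
\bar{\delta}(t) &= E\{Y_i(t, M_i(1)) - Y_i(t, M_i(0))\}
  = (\gamma + \kappa t)\, E\{M_i(1) - M_i(0)\}
  = \beta_2 (\gamma + \kappa t), \qquad t = 0, 1.
\end{align*}
```

In this sketch, $\bar{\delta}(1) - \bar{\delta}(0) = \beta_2 \kappa$, so a nonzero $\kappa$ (with $\beta_2 \neq 0$) implies that the two ACMEs differ and hence that at least one of them is nonzero. It does not by itself give the size of either quantity, which is why estimating $\bar{\delta}(1)$ and $\bar{\delta}(0)$ directly is more informative than testing the interaction coefficient alone.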

The second common strategy is to use the statistically significant interaction between treatment and pretreatment variables as evidence for the existence of a hypothesized causal mechanism. In this approach, researchers demonstrate that the ATE for a certain subgroup of the population is different from that for another subgroup. One such example appears in a recent survey experiment by Tomz and van Houweling (2009), who investigate how the ambiguity of candidates’ position-taking influences voters’ evaluation of these candidates. In one part of the study, the authors randomize the attachment of party labels to candidates as the treatment. A hypothesized mechanism is that the lack of a party label increases the uncertainty about candidates’ positions and in turn makes voters more likely to prefer ambiguous candidates over unambiguous candidates if the voter is risk-seeking rather than risk-averse. Note that in this study the risk preference is considered to be a pretreatment characteristic of a voter. The original analysis finds that the estimated ATE of party labels is larger for risk-seeking voters than for risk-averse voters. This finding is used to argue that party labels influence candidate preferences by reducing uncertainty.

Such an interaction between treatment and pretreatment covariates indicates variation in the treatment effect. It is well known that such treatment effect heterogeneity itself does not necessarily imply the existence of causal mechanisms, representing the distinction between moderation and mediation (Baron and Kenny 1986). However, treatment effect heterogeneity can also be taken as evidence of a causal mechanism under a certain assumption. Specifically, if the size of the ADE does not depend on the pretreatment covariate (risk preferences), a statistically significant interaction term implies that the ACME is larger for one group (risk-seeking voters) than for another group (risk-averse voters).28 This assumption allows researchers to interpret the variation in the ATE as the variation in the ACME.

Thus, an interaction term between the treatment and a pretreatment covariate can be used as evidence for the hypothesized causal mechanism at the cost of an additional assumption. A marked advantage of this approach is that one can analyze a causal mechanism without even measuring the mediating variable. The downside, however, is that it necessitates the strong assumption that the ADE is constant regardless of the value of the pretreatment covariate $X_i$. Moreover, this strategy shows that the ACME varies as a function of $X_i$ but does not even identify the sign of the ACME for a particular value of $X_i$. For example, in the Tomz and van Houweling study, the ACME can be negative for both risk groups. This indicates that the strategy based on interaction between treatment and pretreatment covariates provides only indirect evidence about a hypothesized causal mechanism.
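The point that a subgroup difference in the ATE does not pin down the sign of the ACME can be checked with simple arithmetic. The numbers below are hypothetical and chosen only for illustration (they are not estimates from the Tomz and van Houweling study); the helper `conditional_ate` is likewise a made-up name that applies the decomposition of the conditional ATE into a conditional ACME plus a conditional ADE.

```python
def conditional_ate(acme, ade):
    """ATE decomposition for a subgroup: tau(x) = delta(t, x) + zeta(1 - t, x)."""
    return acme + ade


# Hypothetical values chosen only for illustration (not estimates from any study).
ade = 0.5  # assumed constant across risk groups
acme = {"risk_seeking": -0.1, "risk_averse": -0.3}

ate = {group: conditional_ate(value, ade) for group, value in acme.items()}
ate_gap = ate["risk_seeking"] - ate["risk_averse"]
acme_gap = acme["risk_seeking"] - acme["risk_averse"]

if __name__ == "__main__":
    print(f"ATE gap = {ate_gap:.2f}, ACME gap = {acme_gap:.2f}")
```

Here the ATE is larger for the risk-seeking group, and the gap in the ATE equals the gap in the ACME because the ADE is held constant, yet both ACMEs are negative. The interaction therefore reveals how the ACME varies across groups but not its sign within either group.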

CONCLUDING REMARKS ABOUT EMPIRICAL TESTING OF SOCIAL SCIENCE THEORIES

Much of social science research is about theorizing and testing causal mechanisms. Yet statistical and experimental methods have been criticized because of the prevailing view that they only yield estimates of causal effects and fail to identify causal mechanisms. Recognizing the difficulty of studying causal mechanisms, some researchers even recommend that the focus of empirical research should be on the identification of causal effects and not causal mechanisms.

Although we acknowledge the challenge, we also believe that progress can be made. Empirical social science research, whether experimental or observational, often requires strong assumptions (Imai, King, and Stuart 2008). Yet much can be learned from empirical analysis within the constraints of those assumptions. In this article, we show three ways to move forward in research on causal mechanisms. First, the potential outcomes model of causal inference used in this article improves understanding of the identification assumptions. Second, the sensitivity analysis we develop allows researchers to formally evaluate the robustness of their conclusions to the potential violations of those assumptions. Finally and perhaps most importantly, the proposed new research designs for experimental and observational studies can reduce the need to rely upon untestable assumptions. Strong assumptions such as sequential ignorability simply deserve great care and call for a combination of innovative statistical methods and research designs.

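28 This result is a consequence of a general algebraic equality. Let the conditional ATE, the conditional ACME, and the conditional ADE be $\bar{\tau}(x) = E(Y_i(1, M_i(1)) - Y_i(0, M_i(0)) \mid X_i = x)$, $\bar{\delta}(t, x) = E(Y_i(t, M_i(1)) - Y_i(t, M_i(0)) \mid X_i = x)$, and $\bar{\zeta}(t, x) = E(Y_i(1, M_i(t)) - Y_i(0, M_i(t)) \mid X_i = x)$, respectively. Then, $\bar{\tau}(x) - \bar{\tau}(x') = \{\bar{\delta}(t, x) + \bar{\zeta}(1 - t, x)\} - \{\bar{\delta}(t, x') + \bar{\zeta}(1 - t, x')\} = \bar{\delta}(t, x) - \bar{\delta}(t, x')$, where the last equality holds when the ADE does not depend on the covariate, i.e., $\bar{\zeta}(1 - t, x) = \bar{\zeta}(1 - t, x')$.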

The set of new methods and research designs introduced here can be used to test social science theories that attempt to explain how and why one variable causes changes in another. Of course, such tests are not always possible, and in those situations researchers may evaluate their theories by examining their auxiliary empirical implications. For example, this can be done by identifying a set of competing theories and examining which rival theories best predict the observed data (e.g., Imai and Tingley n.d.). Another possibility is to identify particular components of a treatment that are capable of affecting an outcome, rather than focusing on causal processes (e.g., VanderWeele and Robins 2009).

Much methodological work remains to be done to improve various ways to empirically test social science theories. Scientific inquiry is an iterative process of theory construction and empirical theory testing. In this article, we have shown that direct tests of causal mechanisms are sometimes possible with new methodological tools, and if so, researchers can unpack the black box of causality, going beyond the estimation of causal effects.

APPENDIX

Multiple Mediators and Posttreatment Confounders

In this article, we focus on a simple setting where the interest is in the identification of a particular causal mechanism represented by a mediator $M_i$ (indirect effect) against all other possible mechanisms (direct effect). Frequently, analysts have more specific ideas about what these other mechanisms may be. Suppose that there is a second mediator, $N_i$, that is also assumed to lie on the causal path from the treatment $T_i$ to the outcome of interest $Y_i$. This mediator may be observed or unobserved. For example, in addition to measuring anxiety, Brader, Valentino, and Suhay (2008) also measured a second potential mediator, which was changes in beliefs about the economic consequences of immigration. They also tested whether other types of emotional responses mediated the treatment, but did not measure other possible mediators.

Under what conditions is the presence of a second mechanism problematic for the identification of the main mechanism under the standard (single-experiment) design? In this Appendix, we first describe various situations where the existence of other mechanisms is addressed by the method proposed in Inference and Sensitivity Analysis under the Standard Designs. In these cases, either the ACME is identified or the researcher can conduct sensitivity analyses to address the possibility of confounding. We then describe situations where multiple mediators present a serious problem under standard designs, thereby requiring researchers to consider alternative research designs such as those discussed in Alternative Research Designs for Credible Inference.

In general, the existence of other causal pathways does not cause a problem for the identification of a causal mechanism under standard designs so long as it does not violate the sequential ignorability assumption. And even in many cases where sequential ignorability is violated, the researcher can conduct a sensitivity analysis. Hence, multiple mediators do not in general pose an additional obstacle to inference about mediation. Nor does the presence of multiple mediators require alternative identification strategies such as instrumental variables (Albert 2008; Bullock, Green, and Ha 2010).


FIGURE 6. Another Unobserved Mediator Causing No Problem

Notes: The diagrams represent various situations where the presence of an unobserved variable Ni mediating the effect of Ti on Yi does not violate the sequential ignorability assumption for the identification of the ACME with respect to the mediator of interest, Mi. Solid lines represent causal relationships between observed variables whereas dashed lines represent causal relationships involving an unobserved variable.

For example, the diagrams of Figure 6 represent various situations in which sequential ignorability still holds despite the presence of a second unobserved mediator Ni. In each of these cases, the ACME of the mediator of interest, Mi, can still be identified under standard research designs with the sequential ignorability assumption and researchers can apply the methods described in Inference and Sensitivity Analysis under the Standard Designs.

In Figure 6a, the second mediator is independent, and therefore not even correlated with the main mediator after conditioning on the treatment status. In this case, the treatment transmits its effect both through the observed mediator of interest, Mi, and through a second unobserved mediator, Ni, along with other unspecified mechanisms that are implicitly represented by the direct arrow from Ti to Yi. But because there is no direct relationship between the two mediators, the sequential ignorability assumption will still identify the ACME for the mediator of interest Mi and the role of all other unobserved mediators will be estimated as part of the direct effect.

In contrast, the two mediators are correlated in the other diagrams in Figure 6, even after conditioning on the treatment, though the nature of the correlation is quite different in each of these cases. The second mediator represents an unobserved variable that simply transmits the entire effect of the mediator on the outcome in Figure 6b. Similarly, Figure 6c represents the situation where the second mediator transmits the entire effect of the treatment on the primary mediator. In both of these cases the role of Mi will still be identified under sequential ignorability even though Mi is part of a longer chain of causal relationships. This is important because, for example, the role played by anxiety in transmitting media cue effects might also involve other more fine-grained psychological processes that anxiety induces (Figure 6b) or that generate anxiety (Figure 6c).

In Figure 6d, the second mediator partially transmits both the direct and indirect effects of the treatment on the outcome. This situation looks problematic but is not, because the sequential ignorability assumption is still satisfied; that is, the mediator and potential outcomes are independent after conditioning on the treatment status. Thus, the ACME of the main mediator of interest Mi can be consistently estimated even when we disregard the presence of the unobserved intermediate variable Ni.

FIGURE 7. Unobserved Mediator Causing Problem Addressable by the Proposed Sensitivity Analysis

Notes: The diagrams represent situations where the additional (unobserved) mediator Ni causes the violation of sequential ignorability because of the existence of the unobserved pretreatment covariate Ui. In these cases the ACME can be probed by the proposed sensitivity analysis.

Figures 6e and 6f are the situations where sequential ignorability holds only after conditioning on the pretreatment covariate Xi, despite the presence of the unobserved second mediator. Failure to control for Xi would violate sequential ignorability because Xi affects both the mediator and the outcome variable. But if Xi is controlled for, then these situations reduce, respectively, to Figures 6a and 6d.

Because none of these cases leads to a violation of sequential ignorability, the proposed estimation strategy can be used to consistently estimate the ACME with respect to the mediator of primary interest Mi despite the presence of a secondary (unobserved) mediator Ni.

What types of multiple mediators will cause problems for the identification of causal mechanisms? The two diagrams in Figure 7 represent situations in which the sequential ignorability assumption is violated because of an unobserved pretreatment confounder, Ui. In both cases, the unobserved secondary mediator represents a posttreatment confounder between the mediator and the outcome, but conditioning on both the treatment and the unobserved confounder, should it be possible, would be sufficient for the satisfaction of the sequential ignorability assumption. Thus, the proposed sensitivity analysis can be conducted to measure the degree of robustness with respect to the presence of this unobserved mediator.

FIGURE 8. Second Mediator Causing Serious Problem

Notes: The diagrams represent situations where the second mediator Ni causes the violation of the sequential ignorability assumption, which cannot be addressed by the proposed sensitivity analysis. This is a problem whether the second mediator is unobserved (left panel) or observed (right panel).

The third class of additional mediators, displayed in Figure 8, is the most problematic. In this situation, the second mediator causally affects both the primary mediator and the outcome and thus represents a typical posttreatment confounder that is not allowed under the sequential ignorability assumption. The ACME with respect to the primary mediator is then not identifiable on the basis of Assumption 1. This is true not only when the second mediator is unobserved (Figure 8a) but also when it is observed (Figure 8b; see Robins 2003). Nor can the sensitivity analysis described in Sensitivity Analysis be applied, because the confounding between the mediator and outcome is due to a posttreatment covariate. In such cases the researcher should instead consider alternative research designs (see Alternative Research Designs for Credible Inference) or other identification strategies (e.g., Robins and Richardson 2010). In addition, Imai and Yamamoto (2011) have developed a new sensitivity analysis that is applicable in the presence of multiple mediators and posttreatment confounders.

This discussion reveals a crucial point: Whether the presence of multiple mechanisms causes a problem depends entirely on the type of mechanisms in a specific application. Thus, one should think carefully about the theoretical relationships that might link a particular treatment variable to an outcome variable. Situations like those in Figures 6 and 7 can be dealt with using methods described in Inference and Sensitivity Analysis under the Standard Designs, whereas situations like those in Figure 8 are best dealt with using alternative designs such as those described in Alternative Research Designs for Credible Inference. As a final note, we point out that the discussion applies equally to both observational and experimental studies, with the caveat that observational studies must still satisfy the conditional ignorability of the treatment.
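The contrast between a harmless second mediator (as in Figure 6a) and a posttreatment confounder (as in Figure 8a) can be illustrated with a small simulation. The sketch below is only an illustration under strong assumptions that are not part of the article: all relationships are linear with constant effects, every coefficient value is hypothetical, and the mediator effect is estimated by a plain least squares regression of the outcome on the treatment and the mediator with the second mediator left out. The helper names `ols` and `estimated_mediator_effect` are likewise made up for this example.

```python
import random


def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for c in range(k):
        # Partial pivoting, then clear column c from every other row.
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * z for x, z in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


def estimated_mediator_effect(second_affects_first, n=20000, seed=0):
    """Coefficient on M from regressing Y on (1, T, M) with N omitted."""
    rng = random.Random(seed)
    gamma, theta = 0.5, 1.0  # hypothetical effects of M and N on Y
    rows, ys = [], []
    for _ in range(n):
        t = float(rng.random() < 0.5)                 # randomized treatment
        n_i = 0.8 * t + rng.gauss(0, 1)               # unobserved second mediator N
        link = 1.0 if second_affects_first else 0.0   # N -> M arrow on or off
        m = 1.0 * t + link * n_i + rng.gauss(0, 1)    # mediator of interest M
        y = 0.3 * t + gamma * m + theta * n_i + rng.gauss(0, 1)
        rows.append([1.0, t, m])
        ys.append(y)
    return ols(rows, ys)[2]


if __name__ == "__main__":
    # N independent of M given T: the estimate should be close to gamma.
    print("Figure 6a-like case:", round(estimated_mediator_effect(False), 2))
    # N affects both M and Y: the estimate should drift away from gamma.
    print("Figure 8a-like case:", round(estimated_mediator_effect(True), 2))
```

In the first case the omitted mediator is absorbed into the direct effect, so the coefficient on the mediator of interest stays near its assumed value of 0.5. In the second case the omitted variable confounds the mediator and the outcome after treatment, and the same regression no longer recovers that value.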

Two Interpretations of Nonzero Correlation between Errors

In this Appendix, we show that nonzero correlation between errors in the structural equation models has at least two distinct interpretations. The first interpretation is the existence of unobserved pretreatment covariates, which violates the assumption of sequential ignorability as well as conventional exogeneity assumptions. This interpretation is given in Sensitivity Analysis. Another interpretation, formally justified below, is the existence of heterogeneous causal effects. In this case, conventional exogeneity assumptions may be satisfied, but it is still difficult to identify the ACME (as discussed in Sequential Ignorability and Conventional Exogeneity Assumptions). The potential outcome framework of causal mediation analysis clarifies the distinction between these two interpretations, which is obscured under the traditional structural equation modeling framework.

Consider a system of linear regressions with heterogeneous effects considered by Glynn (2010),

$$M_i(T_i) = \alpha_2 + \beta_{2i} T_i + \varepsilon_{2i}, \tag{9}$$

$$Y_i(T_i, M_i) = \alpha_3 + \beta_{3i} T_i + \gamma_i M_i + \varepsilon_{3i}, \tag{10}$$

where $\beta_{2i}$ is the causal effect of the treatment on the mediator, and $\beta_{3i}$ and $\gamma_i$ represent the causal effects of the treatment and the mediator on the outcome, respectively. All of these effects are heterogeneous in that they vary across individuals. Now, reparameterize these effects as

$$\beta_{2i} = \beta_2 + \eta_i, \tag{11}$$

$$\beta_{3i} = \beta_3 + \xi_i, \tag{12}$$

$$\gamma_i = \gamma + \kappa_i, \tag{13}$$

where $\beta_2$, $\beta_3$, and $\gamma$ represent the average causal effects and $E(\eta_i) = E(\xi_i) = E(\kappa_i) = 0$. Using this reparameterization, we can rewrite the system of linear regressions as

$$M_i(T_i) = \alpha_2 + \beta_2 T_i + \varepsilon^{*}_{2i}, \tag{14}$$

$$Y_i(T_i, M_i) = \alpha_3 + \beta_3 T_i + \gamma M_i + \varepsilon^{*}_{3i}, \tag{15}$$

where the error terms are given by $\varepsilon^{*}_{2i} = \eta_i T_i + \varepsilon_{2i}$ and $\varepsilon^{*}_{3i} = \xi_i T_i + \kappa_i M_i + \varepsilon_{3i}$.

Under the exogeneity of $T_i$ and $M_i$, we have $E(\varepsilon^{*}_{2i} \mid T_i) = E(\varepsilon^{*}_{3i} \mid T_i, M_i) = 0$, and thus $\beta_2$, $\beta_3$, and $\gamma$ are all identified.

However, the two error terms are correlated. Specifically, using the fact that $\mathrm{Cov}(T_i, M_i) = \beta_2 V(T_i)$, we have

$$\mathrm{Cov}(\varepsilon^{*}_{2i}, \varepsilon^{*}_{3i}) = V(T_i)\,\mathrm{Cov}(\eta_i, \xi_i) + \beta_2 V(T_i)\,\mathrm{Cov}(\eta_i, \kappa_i), \tag{16}$$

which does not generally equal zero. One condition under which the correlation between errors equals zero (and the ACME is therefore identified) is that the two correlations in heterogeneous causal effects, i.e., the correlation between $\eta_i$ and $\xi_i$ as well as that between $\eta_i$ and $\kappa_i$, are zero (see Imai and Yamamoto 2011 for an alternative sensitivity analysis that exploits this formulation). It is also important to recognize that if the heterogeneity can be explained by pretreatment covariates alone and one can measure these variables, then the sequential ignorability assumption will hold conditional on them. In that situation, all of our proposed methods will be applicable so long as the functional form is correctly specified when adjusting for these pretreatment variables.
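One way to see why identifying the average coefficients is not enough is to note that, under the heterogeneous linear model above, each unit's mediation effect is the product of that unit's treatment-to-mediator effect and mediator-to-outcome effect. The average of these products equals the product of the averages plus the covariance between the two unit-level effects. The following sketch checks this numerically. It is an illustration only: the average effects, the normal distribution of the deviations, and the correlation value `rho` are hypothetical choices, and the function name is made up for this example.

```python
import random


def acme_vs_product(n=50000, rho=0.5, seed=2):
    """Compare the average unit-level mediation effect with the product of averages."""
    rng = random.Random(seed)
    beta2, gamma = 1.0, 0.5  # hypothetical average effects
    b2, g = [], []
    for _ in range(n):
        # Zero-mean deviations from the average effects, correlated at rho.
        dev_b2 = rng.gauss(0, 1)
        dev_g = rho * dev_b2 + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        b2.append(beta2 + dev_b2)   # unit-level effect of T on M
        g.append(gamma + dev_g)     # unit-level effect of M on Y
    mean_b2 = sum(b2) / n
    mean_g = sum(g) / n
    # Average of unit-level mediation effects (the ACME in this model).
    acme = sum(b * c for b, c in zip(b2, g)) / n
    # Covariance between the two unit-level effects.
    cov = sum((b - mean_b2) * (c - mean_g) for b, c in zip(b2, g)) / n
    return acme, mean_b2 * mean_g, cov


if __name__ == "__main__":
    acme, product, cov = acme_vs_product()
    print("average mediation effect:", round(acme, 3))
    print("product of average effects:", round(product, 3))
    print("covariance of unit-level effects:", round(cov, 3))
```

When the covariance is zero the two quantities coincide, which matches the condition stated above for the case of uncorrelated heterogeneity in the mediator-related effects.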


Two-stage Least Squares Estimation of the Average Causal Mediation Effects

In this Appendix, we prove that under certain assumptions the two-stage least squares method can be used to estimate the ACME. Using the potential outcomes notation, where the mediator is now a function of both treatment and instrument, we can write the model as

$$Y_i(T_i, M_i(T_i, Z_i)) = \alpha_3 + \beta_3 T_i + \gamma M_i(T_i, Z_i) + \varepsilon_{i3}(T_i, M_i(T_i, Z_i)), \tag{17}$$

$$M_i(T_i, Z_i) = \alpha_2 + \beta_2 T_i + \lambda Z_i + \varepsilon_{i2}(T_i, Z_i), \tag{18}$$

where the standard normalization, $E(\varepsilon_{i3}(t, m)) = E(\varepsilon_{i2}(t, z)) = 0$ for any $t$, $m$, $z$, is assumed. This specification assumes, among other things, the exclusion restriction of the instrument. The model also implies the following expressions for the ACME and the average direct effect: $E(Y_i(t, M_i(1, z)) - Y_i(t, M_i(0, z))) = \beta_2\gamma$ and $E(Y_i(1, M_i(t, z)) - Y_i(0, M_i(t, z))) = \beta_3$. In addition, assume that both the treatment $T_i$ and the instrument $Z_i$ are randomized. Formally, we write $\{T_i, Z_i\} \perp\!\!\!\perp \{Y_i(t, m), M_i(t', z)\}$ for any $t$, $m$, $t'$, and $z$. Then we have the following exogeneity condition: $E(\varepsilon_{i3}(T_i, M_i(T_i, Z_i)) \mid Z_i = z, T_i = t) = E(\varepsilon_{i3}(t, m)) = 0$ for any $t$, $z$, where $m = \alpha_2 + \beta_2 t + \lambda z + \varepsilon_{i2}(t, z)$. Thus, the model parameters can be estimated consistently from observed data using $Z_i$ as an instrument, implying that the ACME and the average direct effect, $\beta_2\gamma$ and $\beta_3$, are also consistently estimated with the two-stage least squares method.
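The argument above can be sketched numerically. The example below simulates data from a linear model of the same form, with a randomized treatment, a randomized instrument that affects only the mediator, and an unobserved variable that confounds the mediator and the outcome. It then compares a two-stage least squares estimate of the product of the treatment-to-mediator and mediator-to-outcome coefficients with a naive least squares estimate. All parameter values, the error structure, and the function names (`ols`, `two_stage_least_squares`) are hypothetical and chosen only for illustration; this is not the implementation in the mediation package.

```python
import random


def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * z for x, z in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


def two_stage_least_squares(n=20000, seed=1):
    rng = random.Random(seed)
    beta2, lam, beta3, gamma = 1.0, 1.0, 0.3, 0.5  # hypothetical true values
    T, Z, M, Y = [], [], [], []
    for _ in range(n):
        t = float(rng.random() < 0.5)  # randomized treatment
        z = float(rng.random() < 0.5)  # randomized instrument (excluded from Y)
        u = rng.gauss(0, 1)            # unobserved confounder of M and Y
        m = 0.2 + beta2 * t + lam * z + 0.8 * u + 0.6 * rng.gauss(0, 1)
        y = 0.1 + beta3 * t + gamma * m + 0.8 * u + 0.6 * rng.gauss(0, 1)
        T.append(t); Z.append(z); M.append(m); Y.append(y)
    # Stage 1: regress the mediator on the treatment and the instrument.
    a2, b2, l_hat = ols([[1.0, t, z] for t, z in zip(T, Z)], M)
    m_hat = [a2 + b2 * t + l_hat * z for t, z in zip(T, Z)]
    # Stage 2: regress the outcome on the treatment and the fitted mediator.
    _, b3, g = ols([[1.0, t, mh] for t, mh in zip(T, m_hat)], Y)
    # Naive regression that uses the observed mediator directly.
    _, _, g_naive = ols([[1.0, t, m] for t, m in zip(T, M)], Y)
    return {"acme": b2 * g, "direct": b3, "naive_acme": b2 * g_naive}


if __name__ == "__main__":
    for name, value in two_stage_least_squares().items():
        print(name, round(value, 3))
```

With these assumed values the target quantities are 0.5 for the mediation effect and 0.3 for the direct effect. The two-stage estimates should land near them, while the naive estimate is pulled away by the shared unobserved component of the two error terms.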

REFERENCES

Albert, J. 2008. “Mediation Analysis via Potential Outcomes Models.” Statistics in Medicine 27: 1282–1304.

Angrist, J. D., G. W. Imbens, and D. B. Rubin. 1996. “Identification of Causal Effects Using Instrumental Variables (with Discussion).” Journal of the American Statistical Association 91 (434): 444–55.

Ansolabehere, S., E. C. Snowberg, and J. M. Snyder. 2006. “Television and the Incumbency Advantage in U.S. Elections.” Legislative Studies Quarterly 31 (4): 469–90.

Ansolabehere, S., J. M. Snyder, and C. Stewart. 2000. “Old Voters, New Voters, and the Personal Vote: Using Redistricting to Measure the Incumbency Advantage.” American Journal of Political Science 44 (1): 17–34.

Baron, R. M., and D. A. Kenny. 1986. “The Moderator–Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations.” Journal of Personality and Social Psychology 51 (6): 1173–82.

Bartels, L. M. 1993. “Messages Received: The Political Impact of Media Exposure.” American Political Science Review 87 (2): 267–85.

Bertrand, M., and S. Mullainathan. 2004. “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” American Economic Review 94 (4): 991–1013.

Blattman, C. 2009. “From Violence to Voting: War and Political Participation in Uganda.” American Political Science Review 103 (2): 231–47.

Brader, T., N. A. Valentino, and E. Suhay. 2008. “What Triggers Public Opposition to Immigration? Anxiety, Group Cues, and Immigration.” American Journal of Political Science 52 (4): 959–78.

Brady, H. E., and D. Collier. 2004. Rethinking Social Inquiry: Diverse Tools, Shared Standards. Lanham, MD: Rowman and Littlefield.

Bullock, J., D. Green, and S. Ha. 2010. “Yes, But What’s the Mechanism? (Don’t Expect an Easy Answer).” Journal of Personality and Social Psychology 98 (4): 550–58.

Chong, D., and J. Druckman. 2007. “Framing Theory.” Annual Review of Political Science 10: 103–26.

Cnudde, C. F., and D. J. McCrone. 1966. “The Linkage between Constituency Attitudes and Congressional Voting Behavior: A Causal Model.” American Political Science Review 60 (1): 66–72.

Cole, S. R., and C. E. Frangakis. 2009. “The Consistency Statement in Causal Inference: A Definition or Assumption?” Epidemiology 20 (1): 3–5.

Collier, D., H. E. Brady, and J. Seawright. 2004. “Sources of Leverage in Causal Inference: Toward an Alternative View of Methodology.” In Rethinking Social Inquiry: Diverse Tools, Shared Standards, eds. H. Brady and D. Collier. Lanham, MD: Rowman and Littlefield.

Cox, G. W., and J. N. Katz. 1996. “Why Did the Incumbency Advantage in U.S. House Elections Grow?” American Journal of Political Science 40 (2): 478–97.

Deaton, A. 2010a. “Instruments, Randomization, and Learning about Development.” Journal of Economic Literature 48 (2): 424–55.

Deaton, A. 2010b. “Understanding the Mechanisms of Economic Development.” Journal of Economic Perspectives 24 (3): 3–16.

Druckman, J. 2005. “Media Matter: How Newspapers and Television News Cover Campaigns and Influence Voters.” Political Communication 22: 463–81.

Erikson, R. S., and T. R. Palfrey. 1998. “Campaign Spending and Incumbency: An Alternative Simultaneous Equations Approach.” Journal of Politics 60 (2): 355–73.

Gadarian, S. K. 2010. “The Politics of Threat: How Terrorism News Shapes Foreign Policy Attitudes.” Journal of Politics 72 (2): 469–83.

Gelman, A., and G. King. 1990. “Estimating Incumbency Advantage without Bias.” American Journal of Political Science 34 (4): 1142–64.

Gerber, A. 1998. “Estimating the Effect of Campaign Spending on Senate Election Outcomes Using Instrumental Variables.” American Political Science Review 92 (2): 401–11.

Glynn, A. N. 2010. “The Product and Difference Fallacies for Indirect Effects.” Department of Government, Harvard University. Unpublished manuscript, Mimeo.

Green, D. P., S. E. Ha, and J. G. Bullock. 2010. “Enough Already about Black Box Experiments: Studying Mediation Is More Difficult Than Most Scholars Suppose.” Annals of the American Academy of Political and Social Science 628 (1): 200–08.

Gross, J. J., and R. W. Levenson. 1995. “Eliciting Emotions Using Films.” Cognition and Emotion 9 (1): 87–108.

Haavelmo, T. 1943. “The Statistical Implications of a System of Simultaneous Equations.” Econometrica 11: 1–12.

Heckman, J. J., and J. A. Smith. 1995. “Assessing the Case for Social Experiments.” Journal of Economic Perspectives 9 (2): 85–110.

Hetherington, M. J. 2001. “Resurgent Mass Partisanship: The Role of Elite Polarization.” American Political Science Review 95 (3): 619–31.

Ho, D. E., K. Imai, G. King, and E. A. Stuart. 2007. “Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference.” Political Analysis 15 (3): 199–236.

Holland, P. W. 1986. “Statistics and Causal Inference.” Journal of the American Statistical Association 81: 945–60.

Holland, P. W. 1988. “Causal Inference, Path Analysis, and Recursive Structural Equations Models.” Sociological Methodology 18: 449–84.

Horiuchi, Y., K. Imai, and N. Taniguchi. 2007. “Designing and Analyzing Randomized Experiments: Application to a Japanese Election Survey Experiment.” American Journal of Political Science 51 (3): 669–87.

Imai, K., L. Keele, and D. Tingley. 2010. “A General Approach to Causal Mediation Analysis.” Psychological Methods 15 (4): 309–34.

Imai, K., L. Keele, D. Tingley, and T. Yamamoto. 2010. “Causal Mediation Analysis Using R.” In Advances in Social Science Research Using R, ed. H. D. Vinod, Lecture Notes in Statistics. New York: Springer-Verlag, 129–54.

Imai, K., L. Keele, D. Tingley, and T. Yamamoto. 2011. “Replication Data for: Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies.” The Dataverse Network. hdl:1902.1/16467 (accessed September 1, 2011).


Imai, K., L. Keele, and T. Yamamoto. 2010. “Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects.” Statistical Science 25 (1): 51–71.

Imai, K., G. King, and E. A. Stuart. 2008. “Misunderstandings among Experimentalists and Observationalists about Causal Inference.” Journal of the Royal Statistical Society, Series A (Statistics in Society) 171 (2): 481–502.

Imai, K., and D. Tingley. N.d. “A Statistical Method for Empirical Testing of Competing Theories.” American Journal of Political Science. Forthcoming.

Imai, K., D. Tingley, and T. Yamamoto. N.d. “Experimental Designs for Identifying Causal Mechanisms.” (With discussions). Journal of the Royal Statistical Society, Series A (Statistics in Society). Forthcoming.

Imai, K., and T. Yamamoto. 2010. “Causal Inference with Differential Measurement Error: Nonparametric Identification and Sensitivity Analysis.” American Journal of Political Science 54 (2): 543–60.

Imai, K., and T. Yamamoto. 2011. “Sensitivity Analysis for Causal Mediation Effects under Alternative Exogeneity Assumptions.” http://imai.princeton.edu/research/medsens.html (accessed September 1, 2011).

Isbell, L., and V. Ottati. 2002. “The Emotional Voter.” In The Social Psychology of Politics, ed. V. Ottati, New York: Kluwer, 55–74.

Jacobson, G. C. 1987. The Politics of Congressional Elections. Boston: Little, Brown.

Jo, B. 2008. “Causal Inference in Randomized Experiments with Mediational Processes.” Psychological Methods 13 (4): 314–36.

Jost, J. T., J. L. Napier, H. Thorisdottir, S. D. Gosling, T. P. Palfai, and B. Ostafin. 2007. “Are Needs to Manage Uncertainty and Threat Associated With Political Conservatism or Ideological Extremity?” Personality and Social Psychology Bulletin 33 (7): 989–1007.

Kinder, D. R., and L. Sanders. 1996. Divided by Color: Racial Politics and Democratic Ideals. Chicago: University of Chicago Press.

King, G., R. O. Keohane, and S. Verba. 1994. Designing Social Inquiry. Princeton, NJ: Princeton University Press.

King, G., M. Tomz, and J. Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44: 341–55.

Levitt, S. D., and C. D. Wolfram. 1997. “Decomposing the Sources of Incumbency Advantage in the U.S. House.” Legislative Studies Quarterly 22 (1): 45–60.

MacKinnon, D. 2008. Introduction to Statistical Mediation Analysis. New York: Routledge.

MacKinnon, D., C. Lockwood, C. Brown, W. Wang, and J. Hoffman. 2007. “The Intermediate Endpoint Effect in Logistic and Probit Regression.” Clinical Trials 4: 499–513.

Manski, C. F. 2007. Identification for Prediction and Decision. Cambridge, MA: Harvard University Press.

Miller, J. M., and J. A. Krosnick. 2000. “News Media Impact on the Ingredients of Presidential Evaluations: Politically Knowledgeable Citizens Are Guided by a Trusted Source.” American Journal of Political Science 44 (2): 301–15.

Miller, W. E., and D. W. Stokes. 1963. “Constituency Influence in Congress.” American Political Science Review 57 (1): 45–56.

Nelson, T. E., R. A. Clawson, and Z. M. Oxley. 1997. “Media Framing of a Civil Liberties Conflict and Its Effect on Tolerance.” American Political Science Review 91 (3): 567–83.

Nelson, T. E., and D. R. Kinder. 1996. “Issue Frames and Group-centrism in American Public Opinion.” The Journal of Politics 58 (4): 1055–78.

Neyman, J. [1923] 1990. “On the Application of Probability Theory to Agricultural Experiments: Essay on Principles, Section 9.” Statistical Science 5: 465–80.

Olsson, A., J. P. Ebert, M. R. Banaji, and E. A. Phelps. 2005. “The Role of Social Groups in the Persistence of Learned Fear.” Science 309 (5735): 785–87.

Oxley, D. R., K. B. Smith, J. R. Alford, M. V. Hibbing, J. L. Miller, M. Scalora, P. K. Hatemi, and J. R. Hibbing. 2008. “Political Attitudes Vary with Physiological Traits.” Science 321 (5896): 1667–70.

Pearl, J. 2001. “Direct and Indirect Effects.” In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, eds. Jack S. Breese and Daphne Koller. San Francisco: Morgan Kaufmann, 411–20.

Pearl, J. N.d. “The Causal Mediation Formula: A Guide to the Assessment of Pathways and Mechanisms.” Prevention Science. Forthcoming.

Petersen, M. L., S. E. Sinisi, and M. J. van der Laan. 2006. “Estimation of Direct Causal Effects.” Epidemiology 17 (3): 276–84.

Prior, M. 2006. “The Incumbent in the Living Room: The Rise of Television and the Incumbency Advantage in U.S. House Elections.” Journal of Politics 68 (3): 657–73.

Robins, J. M. 2003. “Semantics of Causal DAG Models and the Identification of Direct and Indirect Effects.” In Highly Structured Stochastic Systems, eds. P. J. Green, N. L. Hjort, and S. Richardson, Oxford: Oxford University Press, 70–81.

Robins, J. M., and S. Greenland. 1992. “Identifiability and Exchangeability for Direct and Indirect Effects.” Epidemiology 3 (2): 143–55.

Robins, J. M., and T. Richardson. 2010. “Alternative Graphical Causal Models and the Identification of Direct Effects.” In Causality and Psychopathology: Finding the Determinants of Disorders and Their Cures, eds. P. Shrout, K. Keyes, and K. Ornstein, Oxford: Oxford University Press, 103–58.

Rosenbaum, P. R. 2002a. “Covariance Adjustment in Randomized Experiments and Observational Studies: Rejoinder.” Statistical Science 17 (3): 321–27.

Rosenbaum, P. R. 2002b. “Covariance Adjustment in Randomized Experiments and Observational Studies (with Discussion).” Statistical Science 17 (3): 286–327.

Rubin, D. B. 1974. “Estimating Causal Effects of Treatments in Randomized and Non-randomized Studies.” Journal of Educational Psychology 66: 688–701.

Shadish, W. R., T. D. Cook, and D. T. Campbell. 2001. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.

Sjolander, A. 2009. “Bounds on Natural Direct Effects in the Presence of Confounded Intermediate Variables.” Statistics in Medicine 28 (4): 558–71.

Sobel, M. E. 1982. “Asymptotic Confidence Intervals for Indirect Effects in Structural Equation Models.” Sociological Methodology 13: 290–321.

Sobel, M. E. 2008. “Identification of Causal Parameters in Randomized Studies with Mediating Variables.” Journal of Educational and Behavioral Statistics 33 (2): 230–51.

Spencer, S., M. Zanna, and G. Fong. 2005. “Establishing a Causal Chain: Why Experiments Are Often More Effective Than Mediational Analyses in Examining Psychological Processes.” Journal of Personality and Social Psychology 89 (6): 845–51.

Tiedens, L. Z., and S. Linton. 2001. “Judgment under Emotional Certainty and Uncertainty: The Effects of Specific Emotions on Information Processing.” Journal of Personality and Social Psychology 81 (6): 973–88.

Tomz, M., and R. P. van Houweling. 2009. “The Electoral Implications of Candidate Ambiguity.” American Political Science Review 103 (1): 83–98.

VanderWeele, T. J. 2009. “Marginal Structural Models for the Estimation of Direct and Indirect Effects.” Epidemiology 20 (1): 18–26.

VanderWeele, T. J., and J. M. Robins. 2009. “Minimal Sufficient Causation and Directed Acyclic Graphs.” Annals of Statistics 37 (3): 1437–65.
