otis dudley duncan memorial lecture the resurrection of duncanism

62
1 Otis Dudley Duncan Memorial Lecture THE RESURRECTION OF DUNCANISM Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/)

Upload: abeni

Post on 25-Feb-2016

36 views

Category:

Documents


0 download

DESCRIPTION

Otis Dudley Duncan Memorial Lecture THE RESURRECTION OF DUNCANISM. Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/). OUTLINE. Duncanism = Causally Assertive SEM History: Oppression, Distortion, and Resurrection The Old-New Logic of SEM New Tools - PowerPoint PPT Presentation

TRANSCRIPT

  • Otis Dudley DuncanMemorial Lecture

    THE RESURRECTION OF DUNCANISMJudea PearlUniversity of CaliforniaLos Angeles(www.cs.ucla.edu/~judea/)

  • OUTLINEDuncanism = Causally Assertive SEMHistory: Oppression, Distortion, and ResurrectionThe Old-New Logic of SEMNew Tools4.1Local testing4.2Non-parametric identification4.3Logic of counterfactuals4.3Non-linear mediation analysis

  • DUNCAN BOOK

  • DUNCANISM = ASSERTIVE SEM [y = bx + u] may be read as "a change in x or u produces a change in y ... or x and u are the causes of y (Duncan, 1975, p. 1)[the disturbance] u stands for all other sources of variation in y (ibid) doing the model consists largely in thinking about what kind of model one wants and can justify. (ibid, p. viii)Assuming a model for sake of argument, we can express its properties in terms of correlations and (sometimes) find one or more conditions that must hold if the model is true (ibid, p. 20). SEM A tool for deriving causal conclusion from data and assumptions:

  • HISTORY: BIRTH, OPPRESSION, DISTORTION, RESURRECTIONBirth and Development (1920 - 1980)Sewell Wright (1921), Haavelmo (1943), Simon (1950), Marschak (1950), Koopmans (1953), Wold and Strotz (1963), Goldberger (1973), Blalock (1964), and Duncan (1969, 1975) The regressional assault (1970-1990)Causality is meaningless, therefore, to be meaningful, SEM must be a regression technique, not causal modeling. Richard (1980) Cliff (1983), Dempster (1990), Wermuth (1992), and Muthen (1987) The Potential-outcome assault (1985-present) Causality is meaningful but, since SEM is a regression technique, it could not possibly have causal interpretation. Rubin (2004, 2010), Holland (1986) Imbens (2009), andSobel (1996, 2008)

  • REGRESSION VS. STRUCTURAL EQUATIONS(THE CONFUSION OF THE CENTURY)Regression (claimless, nonfalsifiable): Y = ax + Y Structural (empirical, falsifiable): Y = bx + uY( := assignment) Claim: (regardless of distributions): E(Y | do(x)) = E(Y | do(x), do(z)) = bx Q. When is b a partial regression? b = YX ZA. Shown in the diagram, Slide 36.

  • THE POTENTIAL-OUTCOME ASSAULT (1985-PRESENT)I am speaking, of course, about the equation:What does it mean? The only meaning I have ever determined for such an equation is that it is a shorthand way of describing the conditional distribution of {y} given {x}. (Holland 1995)

    The use of complicated causal-modeling software [read SEM] rarely yields any results that have any interpretation as causal effects. (Wilkinson and Task Force 1999 on Statistical Methods in Psychology Journals: Guidelines and Explanations)SEM = regressional strawman

  • THE POTENTIAL-OUTCOME ASSAULT (1985-PRESENT) (Cont)In general (even in randomized studies), the structural and causal parameters are not equal, implying that the structural parameters should not be interpreted as effect. (Sobel 2008)

    Using the observed outcome notation entangles the science...Bad! Yet this is exactly what regression approaches, path analyses, directed acyclic graphs, and so forth essentially compel one to do. (Rubin 1010)

  • SEM REACTION TO THE STRUCTURE-PHOBIC ASSAULT Galles, Pearl, Halpern (1998 - Logical equivalence)

    Heckman-Sobel Morgan Winship (1997) Gelman BlogNONE TO SPEAK OF!

  • WHY SEM INTERPRETATION ISSELF-CONTRADICTORYD. Freedman JES (1987), p. 114, Fig. 3"Now try the direct effect of Z on Y: We intervene by fixing W and X but increasing Z by one unit; this should increase Y by d units. However, this hypothetical intervention is self-contradictory,because fixing W and increasing Z causes an increase in X. The oversight: Fixing X DISABLES equation (7.1);

  • SEM REACTION TOFREEDMAN CRITICS

    It would be very healthy if more researchers abandoned thinking of and using terms such as cause and effect (Muthen, 1987). Causal modeling is an outdated misnomer (Kelloway, 1998). Causality-free, politically-safevocabulary: covariance structure regression analysis, or simultaneous equations. [Causal Modeling] may be somewhat dated, however, as it seems to appear less often in the literature nowadays(Kline, 2004, p. 9) Total Surrender:

  • THE RESURRECTIONWhy non-parametric perspective?

    What are the parameters all about and why we labor to identify them? Can we do without them? Consider:

    Only he who lost the parameters and needs to find substitutes can begin to ask: Do I really need them? What do they really mean? What role do the play?

  • CAUSAL MODEL (MA)THE LOGIC OF SEMA - CAUSAL ASSUMPTIONSQ Queries of interestQ(P) - Identified estimandsData (D)Q - Estimates of Q(P) Causal inferenceT(MA) - Testable implicationsStatistical inferenceGoodness of fitModel testingConditional claimsA* - Logicalimplications of ACAUSAL MODEL (MA)

  • TRADITIONAL STATISTICALINFERENCE PARADIGMe.g.,Infer whether customers who bought product Awould also buy product B.Q = P(B | A)

  • What happens when P changes?e.g.,Infer whether customers who bought product Awould still buy A if we were to double the price.FROM STATISTICAL TO CAUSAL ANALYSIS:1. THE DIFFERENCESProbability and statistics deal with static relationsDataInferenceQ(P)(Aspects of P)P Joint DistributionPJointDistributionchange

  • FROM STATISTICAL TO CAUSAL ANALYSIS:1. THE DIFFERENCESNote: P (v) P (v | price = 2) P does not tell us how it ought to changeCausal knowledge: what remains invariantWhat remains invariant when P changes say, to satisfy P (price=2)=1DataInferenceQ(P)(Aspects of P)P Joint DistributionPJointDistributionchange

  • FROM STATISTICAL TO CAUSAL ANALYSIS:1. THE DIFFERENCES (CONT)

  • Causal assumptions cannot be expressed in the mathematical language of standard statistics.FROM STATISTICAL TO CAUSAL ANALYSIS:2. MENTAL BARRIERS

  • Causal assumptions cannot be expressed in the mathematical language of standard statistics.FROM STATISTICAL TO CAUSAL ANALYSIS:2. MENTAL BARRIERS

  • DataInferenceQ(M)(Aspects of M)Data GeneratingModelM Invariant strategy (mechanism, recipe, law, protocol) by which Nature assigns values to variables in the analysis.

    JointDistribution THE STRUCTURAL MODELPARADIGMMThink Nature, not experiment!

  • STRUCTURALCAUSAL MODELSDefinition: A structural causal model is a 4-tupleV,U, F, P(u), where V = {V1,...,Vn} are endogenous variables U = {U1,...,Um} are background variables F = {f1,..., fn} are functions determining V,vi = fi(v, u) P(u) is a distribution over UP(u) and F induce a distribution P(v) over observable variables

  • CAUSAL MODELS AND COUNTERFACTUALSDefinition: The sentence: Y would be y (in unit u), had X been x, denoted Yx(u) = y, means:The solution for Y in a mutilated model Mx, (i.e., the equations for X replaced by X = x) with input U=u, is equal to y.

  • CAUSAL MODELS AND COUNTERFACTUALSDefinition: The sentence: Y would be y (in unit u), had X been x, denoted Yx(u) = y, means:The solution for Y in a mutilated model Mx, (i.e., the equations for X replaced by X = x) with input U=u, is equal to y.

  • CAUSAL MODELS AND COUNTERFACTUALSDefinition: The sentence: Y would be y (in situation u), had X been x, denoted Yx(u) = y, means:The solution for Y in a mutilated model Mx, (i.e., the equations for X replaced by X = x) with input U=u, is equal to y.

  • 3-LEVEL HIERARCHY OF CAUSAL MODELSProbabilistic KnowledgeP(y | x)Bayesian networks, covariance structure

    Interventional KnowledgeP(y | do(x))Causal Bayesian Networks (CBN) (Agnostic graphs, manipulation graphs)

    Counterfactual KnowledgeP(Yx = y,Yx =y)Structural equation models, physics, functional graphs

  • TWO PARADIGMS FOR CAUSAL INFERENCEObserved: P(X, Y, Z,...)Conclusions needed: P(Yx=y), P(Xy=x | Z=z)...

    How do we connect observables, X,Y,Z,to counterfactuals Yx, Xz, Zy, ?N-R modelCounterfactuals areprimitives, new variables Super-distribution Structural modelCounterfactuals are derived quantities Subscripts modify the model and distribution

  • ARE THE TWO PARADIGMS EQUIVALENT? Yes (Galles and Pearl, 1998; Halpern 1998)

    In the N-R paradigm, Yx is defined by consistency:

    In SEM, consistency is a theorem.

    Moreover, a theorem in one approach is a theorem in the other.

  • Define:

    Assume: Identify: Estimate:

    Test: THE FIVE NECESSARY STEPSOF CAUSAL ANALYSISExpress the target quantity Q as a function Q(M) that can be computed from any model M.

    Formulate causal assumptions A using some formal language. Determine if Q is identifiable given A. Estimate Q if it is identifiable; approximate it, if it is not. Test the testable implications of A (if any).

  • THE FIVE NECESSARY STEPS FOR EFFECT ESTIMATIONDefine:

    Assume: Identify: Estimate:

    Test: Express the target quantity Q as a function Q(M) that can be computed from any model M.

    Formulate causal assumptions A using some formal language. Determine if Q is identifiable given A. Estimate Q if it is identifiable; approximate it, if it is not. Test the testable implications of A (if any).

  • FORMULATING ASSUMPTIONSTHREE LANGUAGES1. English: Smoking (X), Cancer (Y), Tar (Z), Genotypes (U)Not too friendly:consistent? complete? redundant? arguable?

  • Define:

    Assume:

    Identify:

    Estimate:

    IDENTIFYING CAUSAL EFFECTSIN POTENTIAL-OUTCOME FRAMEWORKExpress the target quantity Q as a counterfactual formula, e.g., E(Y(1) Y(0)) Formulate causal assumptions using the distribution:

    Determine if Q is identifiable using P* andY=x Y (1) + (1 x) Y (0). Estimate Q if it is identifiable; approximate it, if it is not.

  • GRAPHICAL COUNTERFACTUALS SYMBIOSISEvery causal graph expresses counterfactuals assumptions, e.g., X Y Zconsistent, and readable from the graph.

    Express assumption in graphsDerive estimands by graphical or algebraic methodsMissing arrows Y Z2. Missing arcsYZ

  • A is displayed in graph G.NON-PARAMETRIC IDENTIFIABILITYDefinition:Let Q(M) be any quantity defined on a causal model M, and let A be a set of assumption.

    Q is identifiable relative to A iffIn other words, Q can be determined uniquelyfrom the probability distribution P(v) of the endogenous variables, V, and assumptions A.P(M1) = P(M2) Q(M1) = Q(M2)for all M1, M2, that satisfy A.

  • IDENTIFICATION IN SEMFind the effect of X on Y, P(y|do(x)), given the causal assumptions shown in G, where Z1,..., Zk are auxiliary variables.Z6Z3Z2Z5Z1XYZ4Can P(y|do(x)) be estimated if only a subset, Z, can be measured?

  • NON-PARAMETRIC IDENTIFICATION USINGTHE BACK-DOOR CRITERION

    P(y | do(x)) is estimable if there is a set Z ofvariables such that Z d-separates X from Y in Gx.Z6Z3Z2Z5Z1XYZ4Z6Z3Z2Z5Z1XYZ4Z

  • THE MOTHER OF ALL QUESTIONSQuestion: Under what conditions can a path coefficient be estimated as a regression coefficient, and what variables should serve as the regressors?Answer: It is all in the diagram (Pearl 2009, p. 180)The single door criterionNo member of S is a descendant of Y, andS blocks all paths from X to Y, except the direct path X Y.YX

  • SEARCHING FOR THE TESTABLE

  • FINDING THE TESTABLESBY INSPECTIONW3W1Z2W2Z1XYZ3Software for automatic detection of all tests (Kyono, 2010)Missing edges imply conditional independencies and vanishing regression coefficients.

  • FINDING INSTRUMENTAL VARIABLESFind an instrument for identifying b34 (Duncan, 1975)By inspection: X is d-separated from V Therefore, X1 is a valid instrumentFind a variable Z that, conditioned on S, is independent of the disturbance v.

  • XWYZXWYZFIND EQUIVALENT MODELSModel-equivalence implies d-separation-equivalenced-separation tests should replace Lee & Herschberger rules(a)(b)Reason: conditioned on Y,Path Z Y X W is open in (a) and blocked in (b)

  • W1W4XYV2V1W2W3CONFOUNDING EQUIVALENCEWHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE

  • Define:

    Assume:

    Identify:

    Estimate:

    FROM IDENTIFICATIONTO ESTIMATIONExpress the target quantity Q as a function Q(M) that can be computed from any model M.

    Formulate causal assumptions using ordinary scientific language and represent their structural part in graphical form. Determine if Q is identifiable. Estimate Q if it is identifiable; approximate it, if it is not.

  • PROPENSITY SCORE ESTIMATOR(Rosenbaum & Rubin, 1983)Z6Z3Z2Z5Z1XYZ4P(y | do(x)) = ?

  • WHAT PROPENSITY SCORE (PS)PRACTITIONERS NEED TO KNOWThe asymptotic bias of PS is EQUAL to that of ordinary adjustment (for same Z).Including an additional covariate in the analysis CAN SPOIL the bias-reduction potential of others.In particular, instrumental variables tend to amplify bias.Choosing sufficient set for PS, requires knowledge of the model.

  • Regret:I took a pill to fall asleep. Perhaps I should not have?

    Program evaluation:What would terminating a program do tothose enrolled?

    RETROSPECTIVE COUNTERFACTUALS ETT EFFECT OF TREATMENT ON THE TREATED

  • Define:

    Assume:

    Identify:

    Estimate:

    COUNTERFACTUAL DEFINITIONEFFECT OF TREATMENT ON THE TREATEDExpress the target quantity Q as a function Q(M) that can be computed from any model M.

    Formulate causal assumptions using ordinary scientific language and represent their structural part in graphical form. Determine if Q is identifiable. Estimate Q if it is identifiable; approximate it, if it is not.

  • ETT - IDENTIFICATIONTheorem (Shpitser-Pearl, 2009) ETT is identifiable in G iff P(y | do(x),w) is identifiable in GMoreover, Complete graphical criterionXY

  • EFFECT DECOMPOSITION(direct vs. indirect effects)Why decompose effects?

    What is the definition of direct and indirect effects?

    What are the policy implications of direct and indirect effects?

    When can direct and indirect effect be estimated consistently from experimental and nonexperimental data?

  • WHY DECOMPOSE EFFECTS?To understand how Nature works

    To comply with legal requirements

    To predict the effects of new type of interventions: Signal routing, rather than variable fixing

  • XZYLEGAL IMPLICATIONSOF DIRECT EFFECTWhat is the direct effect of X on Y ?

    (averaged over z)(Qualifications)(Hiring)(Gender)Can data prove an employer guilty of hiring discrimination?

  • z = f (x, u)y = g (x, z, u)XZYNATURAL INTERPRETATION OFAVERAGE DIRECT EFFECTSNatural Direct Effect of X on Y:The expected change in Y, when we change X from x0 to x1 and, for each u, we keep Z constant at whatever value it attained before the change.

    In linear models, DE = Controlled Direct EffectRobins and Greenland (1992) Pure

  • z = f (x, u)y = g (x, z, u)XZYDEFINITION OFINDIRECT EFFECTSIndirect Effect of X on Y:The expected change in Y when we keep X constant, say at x0, and let Z change to whatever value it would have attained had X changed to x1.

    In linear models, IE = TE DE, but not in general.

  • POLICY IMPLICATIONS OF DIRECT EFFECTS f What is the direct effect of X on Y?Residual hiring disparity if skills inequality is eliminatedXZYDeactivating a link a new type of interventionDISABLE

  • POLICY IMPLICATIONS OF INDIRECT EFFECTS f What is the indirect effect of X on Y?XZYIGNOREDeactivating a link a new type of interventionResidual hiring disparity if employers prejudices are eliminated.

  • Zm2XYm1In linear systemsIs NOT equal to:

  • The natural direct and indirect effects are identifiable in Markovian models (no confounding), And are given by:

    Applicable to linear and non-linear models, continuous and discrete variables, regardless of distributional form.MEDIATION FORMULAS

  • MEDIATION FORMULASIN UNCONFOUNDED MODELS IE = Fraction of responses explained by mediationTE DE = Fraction of responses owed to mediation

  • MODERATING MEDIATORS m1m2c12What portion of the effect can be explained-by/attributed-to mediation? The Mediation Formula yields:Effect Explained: Effect Attributed:

  • MEDIATION FORMULAFOR BINARY VARIABLESXZY

    XZYE(Y|x,z)=gxzE(Z|x)=hxn1000n2001n3010n4011n5100n6101n7110n8111

  • RAMIFICATION OF THEMEDIATION FORMULADE should be averaged over mediator levels, IE should NOT be averaged over exposure levels.

    TE-DE need not equal IETE-DE = proportion for whom mediation is necessaryIE = proportion for whom mediation is sufficient

    TE-DE informs interventions on indirect pathways IE informs intervention on direct pathways.

  • I TOLD YOU CAUSALITY IS SIMPLE Formal basis for causal and counterfactual inference (complete)Unification of the graphical, potential-outcome and structural equation approachesFriendly and formal solutions to century-old problems and confusions.No other method can do better (theorem)Rephrasing Duncan (1975):He is wise who bases causal inference on an explicit causal structure that is defensible on scientific grounds. CONCLUSIONS

  • They will be answeredQUESTIONS???

  • CONFOUNDING EQUIVALENCEWHEN TWO MEASUREMENTS ARE EQUALLY VALUABLEDefinition:T and Z are c-equivalent if

    Definition (Markov boundary):Markov boundary Sm of S (relative to X) is the minimal subset of S that d-separates X from all other members of S. Theorem (Pearl and Paz, 2009) Z and T are c-equivalent iff Zm=Tm, or Z and T are admissible (i.e., satisfy the back-door condition)

  • W1W4XYV2V1W2W3CONFOUNDING EQUIVALENCEWHEN TWO MEASUREMENTS ARE EQUALLY VALUABLEZ T

  • W1W4XYV2V1W2W3CONFOUNDING EQUIVALENCEWHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE

  • W1W4XYV2V1W2W3Z TCONFOUNDING EQUIVALENCEWHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE

  • BIAS AMPLIFICATIONBY INSTRUMENTAL VARIABLESW2W1XYUAdding W2 to Propensity Score increases bias (if such exists) (Wooldridge, 2009)In linear systems alwaysIn non-linear systems almost always (Pearl, 2010)Outcome predictors are safer than treatment predictors

  • WMEASUREMENT BIAS ANDEFFECT RESTORATIONUnobserved ZXYP(y | do(x)) is identifiable from measurement of W, if P(w | z) is given (Selen, 1986; Greenland & Lash, 2008)

  • EFFECT RESTORATION IN BINARY MODELS11undefinedundefinedTo cell (x,y,Z = 0)To cell (x,y,Z = 1)Weight distribution from cell (x,y,W = 1)WZXY

  • EFFECT RESTORATIONIN LINEAR MODELSWZXYc1c2c3c0(a)

  • nyu slidesCAUSAL INFERENCE:LOGICAL FOUNDATIONAND NEW RESULTS Judea PearlUniversity of CaliforniaLos Angeles(www.cs.ucla.edu/~judea/)

  • Statistical vs. Causal vs. Counterfactual inference: syntax and semanticsThe logical equivalence of Structural Equation Models (SEM) and potential outcomes (Neyman-Rubin)Where do they differ? Or compliment each other? 1. Define, 2. Assume, 3. identify, 4. Estimate, 5. Test Which covariates should be measured?Causation without manipulation The Mediation Formula and its ramificationsMeasurement bias and effect restoration.OUTLINE

  • BAYESIAN NETWORKSDefinitionP is Markov relative to G iff:

    Algebraic Representation

    Graphical Representation

    G

  • DefinitionLet Px(v) stand for P(v | do(X=x)), X and x arbitraryG is a CBN relative to Px* iff:Px(v) is Markov relative to G;Px(vi)=1 for all if vi agrees with X=x. for all whenever pai agrees with X=x, i.e., each P(vi | pai) remains INVARIANT to interventions not involving Vi. Algebraic representation for all v consistent with x.

    (Truncated product, G-computation, manipulation theorem) Graphical representationSurgery: remove incoming arrows to X.CAUSAL BAYESIAN NETWORK (CBN)(Agnostic graphs, manipulation graphs)

  • Front DoorEFFECT OF WARM-UP ON INJURY (After Shrier & Platt, 2008)Warm-up Exercises (X)Injury (Y)???

  • I TOLD YOU CAUSALITY IS SIMPLE Formal basis for causal and counterfactual inference (complete)Unification of the graphical, potential-outcome and structural equation approachesFriendly and formal solutions to century-old problems and confusions.He is wise who bases causal inference on an explicit causal structure that is defensible on scientific grounds. (Aristotle 384-322 B.C.)

    From Charlie PooleCONCLUSIONS

  • FINDING INSTRUMENTAL VARIABLESCan you find a n instrument for identifying b34? (Duncan, 1975)By inspection: X2 d-separate X1 from V Therefore X1 is a valid instrument