
Page 1: Barr pearl2009 preso

Sections: Introduction | From association to causation | Structural models, diagrams, causal effects, and counterfactuals | The potential outcome framework | Conclusions | My Idea

Judea Pearl: Causal Inference in Statistics, An Overview, 2009.

Stephen J. Barr

Foster School of Business

Information Systems and Operations Management

University of Washington

February 23, 2013

Page 2: Barr pearl2009 preso

Judea Pearl Introduction

Born 1936, Tel Aviv

Director of Cognitive Systems Laboratory, CS dept., UCLA.

Recipient of ACM Turing Award

Founding member of UCLA CS dept.

3 books and 300+ papers

Page 3: Barr pearl2009 preso

Judea Pearl Introduction

1995 - Elected National Academy of Engineering

2001 - Lakatos award - Best book in philosophy of science

2004 - AAAI Alan Mills award - seminal contributions in philosophy, psychology, medicine, econometrics, epidemiology, and the social sciences

2008 - Benjamin Franklin Medal - artificial intelligence algorithms

Page 4: Barr pearl2009 preso

Judea Pearl Introduction

Founding member of AAAI, UAI (Uncertainty in Artificial Intelligence) conference.

Founding member of IEEE

Respected in many research communities: Artificial intelligence, machine learning, statistics, etc.

Page 5: Barr pearl2009 preso

Judea Pearl Introduction

Questions asked by Pearl: [a]

“How can we emulate mental phenomena on a machine?”

“What is it about human intelligence that a computer cannot do?”

“What human qualities might be attributed to programs?”

“How can we imbue robots with human intuitions about cause and effect?”

[a] Reflections about Judea Pearl and his Contributions, Eric Horvitz

Page 6: Barr pearl2009 preso

Introduction

Many studies are causal in nature:

1. What fraction of past crimes could have been avoided by a given policy?

2. What was the cause of death of a given individual, in a specific incident?

However, researchers tackle these causal issues with associative statistical techniques.

Page 7: Barr pearl2009 preso

Contribution: Make causal tools accessible

Causal tools have been developed

Due to lack of education, these tools are not commonly used.

In particular, adoption of causal tools has been slow because causal analysis:

- necessitates untested assumptions based on judgement.

- requires a new mathematical syntax, an extension of probability calculus.

Page 8: Barr pearl2009 preso

Source of Innovations in Causal Analysis

1. Counterfactual analysis

2. Nonparametric structural equations

3. Graphical models

4. Symbiosis between counterfactual and graphical methods.

Goal:

Summarize and synthesize these tools and make the case for their use in the greater research community [a].

[a] epidemiologists, economists, sociologists, etc.

Page 9: Barr pearl2009 preso

Outline - Goals

Goals:

1. Contrast causal analysis with standard statistical analysis

2. Present a unifying theory, called “structural”, which subsumes and generalizes all other causal theories.

3. Demonstrate the tools which accompany “structural” theory.

4. Demonstrate how “structural” theory subsumes other causal theories.

Page 10: Barr pearl2009 preso

Outline - Talk Outline

From association to causation

Structural models, diagrams, causal effects, and counterfactuals

    Intro to SEM
    From linear to nonparametric models and graphs
    Coping with unmeasured confounders
    Counterfactual analysis in structural models
    An example: Non-compliance in structural models

The potential outcome framework

    The "Black-Box" missing-data paradigm
    Problem formulation and the demystification of "ignorability"
    Combining graphs and potential outcomes

Counterfactuals at work

    Mediation: Direct and indirect effects
    Causes of effects and probabilities of causation

Conclusions

Page 11: Barr pearl2009 preso

Standard statistical analysis

(data + hypothesis) → std. statistics → associations among variables

Standard statistics uses data to estimate the parameters of a distribution. Using these parameters, probabilities of past and future events can be inferred so long as experimental conditions do not change.

Causality seeks to learn about the dynamics of beliefs under changing conditions.

Example:

Consider a joint distribution p(x, y) where x ∈ X is symptoms and y ∈ Y is disease. Is it possible to learn from p(x, y) whether changes in x cause changes in y, i.e. whether curing the symptom cures the disease?
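A minimal sketch of why p(x, y) alone cannot settle the question above. The numbers and the two candidate models are hypothetical, not taken from the talk: both models reproduce the same joint distribution, yet they disagree about what happens to the disease when the symptom is set by intervention.

```python
from fractions import Fraction as F

# Hypothetical joint distribution p(x, y) over binary symptom x and disease y.
joint = {(0, 0): F(4, 10), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(4, 10)}

def marginal(joint, axis):
    """Marginal distribution of the variable at position `axis` (0 = x, 1 = y)."""
    out = {0: F(0), 1: F(0)}
    for pair, p in joint.items():
        out[pair[axis]] += p
    return out

p_x, p_y = marginal(joint, 0), marginal(joint, 1)

# Model A (x causes y): factorize as p(x) p(y | x). Keys are (y, x).
p_y_given_x = {(y, x): joint[(x, y)] / p_x[x] for x in (0, 1) for y in (0, 1)}
# Model B (y causes x): factorize as p(y) p(x | y). Keys are (x, y).
p_x_given_y = {(x, y): joint[(x, y)] / p_y[y] for x in (0, 1) for y in (0, 1)}

# Both factorizations reproduce the same observed joint distribution.
joint_a = {(x, y): p_x[x] * p_y_given_x[(y, x)] for x in (0, 1) for y in (0, 1)}
joint_b = {(x, y): p_y[y] * p_x_given_y[(x, y)] for x in (0, 1) for y in (0, 1)}

# Probability of disease (y = 1) after forcing x = 0, i.e. "curing the symptom".
# Model A: y still listens to x, so y follows p(y | x = 0).
p_disease_do_a = p_y_given_x[(1, 0)]
# Model B: forcing x cuts x off from y, and y keeps its own marginal p(y).
p_disease_do_b = p_y[1]
```

Under these assumed numbers the two models are observationally identical but predict different disease rates after the intervention, which is the gap that causal assumptions have to fill.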

Page 12: Barr pearl2009 preso

Limits of uses of PDFs

“There is nothing in a distribution function to tell us how that distribution would differ if external conditions were to change – say from observational to experimental setup – because the laws of probability theory do not dictate how one property of a distribution ought to change when another property is modified. This information must be provided by causal assumptions which identify relationships that remain invariant when external conditions change.”

“Correlation does not imply causation” ⇒ “Behind every causal conclusion there must lie some causal assumption that is not testable in observational studies”

Page 13: Barr pearl2009 preso

Differentiating Causal and Associational

Definition: Causal Concepts

Associational concept: Any relationship that can be defined in terms of joint distributions of observed variables. E.g. correlation, regression, dependence, conditional independence, likelihood, propensity score, etc.

Causal concept: Any relationship that cannot be defined from the distribution alone. E.g. randomization, influence, effect, confounding, “holding constant”, disturbance, spurious correlation, instrumental variables, explanation, attribution, etc.

The demarcation line:

Every claim invoking causal concepts must rely on premises which invoke such concepts. It cannot be inferred from or defined in terms of statistical associations.

Page 14: Barr pearl2009 preso

Ramifications of the basic distinction

“Confounding” is not solidly founded in standard frequentist statistics.

The attempted associational definition:

Confounding - an associational (attempted) definition:

“U is a potential confounder for examining the effect of treatment X on outcome Y when both U and X and U and Y are not independent.”

This definition is purely associational. It invokes no causal assumption, so it cannot capture confounding, which is a causal concept.

⇒ Confounding bias cannot be detected or corrected by statistical methods alone. Assumptions are necessary.

Page 15: Barr pearl2009 preso

What would be provided by a theory of causation?

1. represent causal questions in some mathematical language,

2. provide a precise language for communicating assumptions under which the questions need to be answered,

3. provide a systematic way of answering at least some of these questions and labeling others unanswerable, and

4. provide a method of determining what assumptions or new measurements would be needed to answer the unanswerable questions.

Page 16: Barr pearl2009 preso

Intro to SEM

A Simple Structural Model

X := severity of disease

Y := severity of symptom.

uY = all factors other than the disease that could possibly affect Y when X is held constant.

y = βx + uY (1)

Typical interpretation: Nature examines x and uY and assigns y according to Eq. (1).

Page 17: Barr pearl2009 preso

Intro to SEM

The SEM Paradox

y = βx + ϵ

β is read as “change in E[Y] per unit change of X”. Then algebra ⇒

x = (y − ϵ)/β (2)

Does 1/β then mean “change in E[X] per unit change of Y”? No. What is usually argued?

β has no causal implication. It is purely statistical, measuring reduction of variance of Y explained by X. [1]

But, under this interpretation, β cannot be used for policy making, and is no longer “structural”.

Only give causal meaning to coefficients which meet the isolation restriction, e.g. either β or 1/β has meaning, but not both.

But, the Cov operator quickly becomes inconsistent.

[1] See Pearl (2000), 5.4.1 for an interesting discussion.

Page 18: Barr pearl2009 preso

Intro to SEM

SEM and Causality

y = βx + ϵ1 ⇔ x = αy + ϵ2

where Cov(X, ϵ1) = Cov(Y, ϵ2) = 0 and $\alpha = r_{XY} = \beta \sigma_X^2 / \sigma_Y^2$.

Thus,

If Cov(X, ϵ1) = 0 endows β with causal meaning, then Cov(Y, ϵ2) = 0 should similarly endow α with causal meaning. But,

SEM and intuition say that ∆E[X] per unit ∆Y is 0, not rXY. There is no causal path from Y to X. [2]

[2] Pearl 2000, p. 159-160.
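The asymmetry above can be checked by simulation. This is a rough sketch under assumed settings (β = 2, unit-variance Gaussian X and error, and the helper name `slope` are all illustrative choices, not from the talk): regressing X on Y gives a nonzero slope close to βσ²X/σ²Y, while forcing Y by external control leaves X untouched.

```python
import random

random.seed(0)
beta = 2.0          # hypothetical structural coefficient in y = beta*x + eps1
n = 50_000

def slope(u, v):
    """Least-squares slope of v regressed on u: Cov(u, v) / Var(u)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var = sum((a - mu) ** 2 for a in u)
    return cov / var

# Observational regime: nature draws x, then assigns y = beta*x + eps1.
x = [random.gauss(0, 1) for _ in range(n)]
y = [beta * xi + random.gauss(0, 1) for xi in x]

beta_hat = slope(x, y)    # regression of y on x, close to beta
alpha_hat = slope(y, x)   # regression of x on y, close to beta*Var(X)/Var(Y)
alpha_theory = beta * 1.0 / (beta ** 2 * 1.0 + 1.0)   # Var(X)=1, Var(Y)=beta^2+1

# Interventional regime: y is set by external control. The equation for x is
# untouched, so x is drawn exactly as before and does not respond to forced y.
y_forced = [random.choice([-5.0, 5.0]) for _ in range(n)]
x_after = [random.gauss(0, 1) for _ in range(n)]
causal_slope_y_to_x = slope(y_forced, x_after)   # close to 0, not alpha
```

Both regressions are equally valid as statistical summaries; only the structural reading of y = βx + ϵ1 tells us which one survives an intervention.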

Page 19: Barr pearl2009 preso

Intro to SEM

What is needed

Formal definitions of causality, and a mathematically rigorous way of working with it.

Structural - Definition 5.4.1 (Pearl 2000):

An equation y = βx + ϵ is said to be structural if it is to be interpreted as follows: In an ideal experiment where we control X to x and any other set Z of variables (not containing X or Y) to z, the value y of Y is given by βx + ϵ, where ϵ is not a function of the settings x and z.

Page 20: Barr pearl2009 preso

From linear to nonparametric models and graphs

Outline for next few slides

Introduce graphs as a formal mathematical language for describing structural models

Introduce related graph-concepts and their relation to probabilistic concepts

Introduce the do-calculus

Demonstrate on linear and nonparametric models

Page 21: Barr pearl2009 preso

From linear to nonparametric models and graphs

Graphs and Structural Models

Structural equations:

xi = fi (pai , ϵi ), i = 1, ..., n (3)

where

pai := parents, variables deemed immediate causes of Xi ,

ϵi := errors due to omitted factors.

Linear structural equations:

$x_i = \sum_{k \neq i} \alpha_{ik} x_k + \epsilon_i, \quad i = 1, \ldots, n \qquad (4)$

Page 22: Barr pearl2009 preso

From linear to nonparametric models and graphs

Graphs and Structural Models

Eq. (3) is a causal model if each equation represents the process by which the value (not merely the probability) of variable Xi is selected. To draw the graph,

Each node is a variable Xi

Draw a directed arrow from each member of pai to Xi

Bidirected arc between variable pairs with dependent errors (representing a latent variable affecting both variables)

Missing arrows imply absence of causal connections

Page 23: Barr pearl2009 preso

From linear to nonparametric models and graphs

Graphs and Structural Models

z = fZ (uZ )

x = fX (z , uX )

y = fY (x , uY )

where uZ, uX, uY are assumed jointly independent but otherwise arbitrarily distributed.

Figure 1: Model M

Structural: Each function is invariant to possible changes in the form of the other functions.

Page 24: Barr pearl2009 preso

From linear to nonparametric models and graphs

d-separation

Definition: d-separation

A set S of nodes is said to block a path p if either (i) p contains at least one arrow-emitting node that is in S, or (ii) p contains at least one collision node that is outside S and has no descendant in S. If S blocks all paths from X to Y, it is said to “d-separate X and Y,” and then X and Y are independent given S, written X ⊥⊥ Y | S.

Here paths are non-directional connections between nodes.
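The definition can be turned into a small checker. The sketch below does not enumerate paths; it swaps in the equivalent moral-graph criterion (restrict to ancestors, connect co-parents, drop directions, delete S, test connectivity). The function name `d_separated` and the dictionary encoding of the graph are illustrative assumptions.

```python
from itertools import combinations

def d_separated(parents, xs, ys, s):
    """Return True if node sets xs and ys are d-separated by s in the DAG.

    `parents` maps every node to the set of its parents. The test uses the
    moral-graph criterion: restrict to ancestors of xs, ys and s, connect
    co-parents, drop arrow directions, delete s, and check whether any
    undirected path still links xs to ys.
    """
    xs, ys, s = set(xs), set(ys), set(s)
    # 1. Ancestral set of all nodes involved (each node counts as its own ancestor).
    keep, stack = set(), list(xs | ys | s)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])
    # 2. Moralize: undirected parent-child edges plus edges between co-parents.
    nbrs = {v: set() for v in keep}
    for child in keep:
        pa = parents[child]
        for p in pa:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(pa, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # 3. Remove s and search for a connection from xs to ys.
    seen, stack = set(), [v for v in xs if v not in s]
    while stack:
        node = stack.pop()
        if node in ys:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(m for m in nbrs[node] if m not in s and m not in seen)
    return True

# Chain model with explicit error terms: UZ -> Z -> X -> Y, UX -> X, UY -> Y.
model_m = {"UZ": set(), "UX": set(), "UY": set(),
           "Z": {"UZ"}, "X": {"Z", "UX"}, "Y": {"X", "UY"}}

print(d_separated(model_m, {"UZ"}, {"Y"}, {"Z"}))    # True
print(d_separated(model_m, {"UZ"}, {"UX"}, set()))   # True
print(d_separated(model_m, {"UZ"}, {"UX"}, {"Y"}))   # False
```

The last call shows the collider effect: conditioning on Y, a descendant of the collision node X, opens the path between UZ and UX.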

Page 25: Barr pearl2009 preso

From linear to nonparametric models and graphs

d-separation

p1 = UZ → Z → X → Y.

Blocked by S = {Z} and by S = {X}, so UZ ⊥⊥ Y | Z and UZ ⊥⊥ Y | X.

p2 = UZ → Z → X ← UX.

Not blocked by S = {Y}, since Y descends from collider X ⇒ UZ ⊥⊥ UX | Y may or may not hold.

Blocked by S = ∅ under condition (ii) ⇒ UZ ⊥⊥ UX.

Figure 2: Model M

Page 26: Barr pearl2009 preso

From linear to nonparametric models and graphs

d-separation

Colliders reflect Berkson’s paradox. Observations of a consequence of two indep. causes render those causes dependent.

E.g. “Two coin flips” are indep., but after stating “at least one is a tail”, they are dependent.
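The coin example can be verified by enumerating the four outcomes. This is a small sketch with illustrative helper names (`prob`, `first_tail`, and so on are not from the talk):

```python
from fractions import Fraction as F
from itertools import product

# Two fair, independent coin flips: four equally likely outcomes.
outcomes = {pair: F(1, 4) for pair in product("HT", repeat=2)}

def prob(event, given=lambda o: True):
    """P(event | given) by enumeration over the four outcomes."""
    denom = sum(p for o, p in outcomes.items() if given(o))
    numer = sum(p for o, p in outcomes.items() if given(o) and event(o))
    return numer / denom

first_tail = lambda o: o[0] == "T"
second_tail = lambda o: o[1] == "T"
at_least_one_tail = lambda o: "T" in o

# Unconditionally, learning the second flip says nothing about the first.
p_first = prob(first_tail)                              # 1/2
p_first_given_second = prob(first_tail, second_tail)    # 1/2

# After conditioning on the common consequence, the flips become dependent.
p_first_given_c = prob(first_tail, at_least_one_tail)   # 2/3
p_first_given_c_and_second_head = prob(
    first_tail, lambda o: at_least_one_tail(o) and o[1] == "H")   # 1
```

Once "at least one is a tail" is known, seeing a head on the second flip forces the first to be a tail, which is exactly the dependence a collider induces.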

Figure 3: Model M

Page 27: Barr pearl2009 preso

From linear to nonparametric models and graphs

d-separation - A Graphical Example

Consider path: P2 = a → b → d → e

Is a ⊥⊥ e | b?

d is not a collider on P2

b is in the conditioning set AND a non-collider

⇒ a and e are d-separated by b

⇒ a ⊥⊥ e|b

Figure 4: Model M

Page 28: Barr pearl2009 preso

From linear to nonparametric models and graphs

d-separation - A Graphical Example - 2

Is a ⊥⊥ e | c?

There are two paths between a and e.

P1: a → b → c ← d ← e.

P2: a → d ← e.

P1 is not blocked since c is a collider in the conditioning set.

On P1, d is not a collider.

P2 is not blocked since, even though d is a collider, c is in the conditioning set AND a descendant of d.

⇒ a ̸⊥⊥ e | c, thus a and e are graphically dependent given c.

Figure 5: Model M

Page 29: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus

Definition: do(x)

do(x) is a new mathematical operator which “simulates physical interventions by deleting certain functions from the model, replacing them by a constant X = x, while keeping the rest of the model unchanged.”

Page 30: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus

Pre-intervention:

z = fZ (uZ )

x = fX (z , uX )

y = fY (x , uY )

Figure 6: Model M

Post-intervention:

z = fZ (uZ )

x = x0

y = fY (x , uY )

Figure 7: Model M, do(x0)
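One way to picture the pre- and post-intervention models is as a table of functions in which do() overwrites a single entry. The sketch below is illustrative only: the linear forms chosen for fZ, fX, fY and the helper names `make_model`, `do`, and `solve` are assumptions, since the talk leaves the functions unspecified.

```python
import random

def make_model():
    """Model M: z = f_Z(u_Z), x = f_X(z, u_X), y = f_Y(x, u_Y).

    The particular functions are hypothetical stand-ins chosen for illustration.
    """
    return {
        "z": lambda v, u: u["uZ"],
        "x": lambda v, u: 0.5 * v["z"] + u["uX"],
        "y": lambda v, u: 2.0 * v["x"] + u["uY"],
    }

def do(model, **settings):
    """Return a copy of the model with each chosen equation replaced by a constant."""
    mutilated = dict(model)
    for name, value in settings.items():
        mutilated[name] = lambda v, u, value=value: value
    return mutilated

def solve(model, u, order=("z", "x", "y")):
    """Evaluate the equations in causal order for one draw u of the errors."""
    v = {}
    for name in order:
        v[name] = model[name](v, u)
    return v

rng = random.Random(1)
u = {"uZ": rng.gauss(0, 1), "uX": rng.gauss(0, 1), "uY": rng.gauss(0, 1)}

model = make_model()
before = solve(model, u)                 # pre-intervention values
after = solve(do(model, x=3.0), u)       # post-intervention values under do(x0 = 3)
```

Only the equation for x is replaced; z keeps its value and y is recomputed from the forced x through the unchanged fY, which is the "rest of the model unchanged" clause in the definition.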

Page 31: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Pearl 2000 Example - Ch 1.2

X1: Season of the year

X2: Whether rain falls

X3: Whether the sprinkler is on

X4: Whether the pavement would get wet

X5: Whether the pavement would be slippery

Page 32: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Pearl 2000 Example

[Figure 8: Figure 1.2, Pearl 2000. Nodes: X1 SEASON, X2 RAIN, X3 SPRINKLER, X4 WET, X5 SLIPPERY.]

$P_{X_3=\mathrm{On}}(x_1, x_2, x_4, x_5) = P(x_1)\, P(x_2 \mid x_1)\, P(x_4 \mid x_2, X_3 = \mathrm{On})\, P(x_5 \mid x_4) \qquad (5)$

Page 33: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Pearl 2000 Example

[Figure 9: Figure 1.2, Pearl 2000. Nodes: X1 SEASON, X2 RAIN, X3 SPRINKLER, X4 WET, X5 SLIPPERY.]

The probability P(x1, x2, x4, x5 | X3 = On) comes from ordinary Bayesian conditioning. It describes the observation X3 = On, not the action. After observing that the sprinkler is on, we wish to infer that the season is dry, that it probably did not rain, and so on; no such inferences should be drawn in evaluating the effects of a contemplated action “turning the sprinkler On.”

Page 34: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Pearl 2000 Example

[Figure 10: Figure 1.4, Pearl 2000. Same nodes, with X3 fixed: SPRINKLER = ON.]

P_M(y | do(x)) (6)

This is the causal action, “turning the sprinkler On”. The system must “respond to interventions in accordance with the principle of autonomy.” See the Causal Bayesian Network definition (Defn. 1.3.1, Pearl 2000).
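The difference between seeing and doing in the sprinkler network can be computed directly. In this sketch every conditional probability is a made-up number (Pearl 2000 gives none here), all variables are treated as binary, and the helper names are illustrative. Observation uses the full Markov factorization followed by conditioning; the action drops the factor P(x3 | x1).

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical conditional probability tables for the sprinkler network
# (illustrative numbers only). For x1, the value 1 means "dry season".
p1 = {1: F(1, 2), 0: F(1, 2)}                               # P(x1)
p2 = {1: F(1, 10), 0: F(6, 10)}                             # P(x2=1 | x1), rain
p3 = {1: F(8, 10), 0: F(2, 10)}                             # P(x3=1 | x1), sprinkler on
p4 = {(0, 0): F(0), (0, 1): F(9, 10),
      (1, 0): F(9, 10), (1, 1): F(99, 100)}                 # P(x4=1 | x2, x3), wet
p5 = {1: F(8, 10), 0: F(0)}                                 # P(x5=1 | x4), slippery

def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1 - p_one

def joint(x1, x2, x3, x4, x5, do_x3=None):
    """Markov factorization; with do_x3 set, the factor P(x3 | x1) is dropped."""
    if do_x3 is not None and x3 != do_x3:
        return F(0)
    prob = p1[x1] * bern(p2[x1], x2) * bern(p4[(x2, x3)], x4) * bern(p5[x4], x5)
    if do_x3 is None:
        prob *= bern(p3[x1], x3)
    return prob

def p_dry(do_x3=None, see_x3=None):
    """P(x1 = dry) after either observing or forcing the sprinkler state."""
    num = den = F(0)
    for x1, x2, x3, x4, x5 in product((0, 1), repeat=5):
        if see_x3 is not None and x3 != see_x3:
            continue
        p = joint(x1, x2, x3, x4, x5, do_x3)
        den += p
        num += p if x1 == 1 else 0
    return num / den

seeing = p_dry(see_x3=1)   # Bayesian conditioning on the observation X3 = On
doing = p_dry(do_x3=1)     # truncated factorization for do(X3 = On)
```

With these assumed tables, observing the sprinkler on raises the probability of a dry season above its prior, while turning it on by hand leaves the season at its prior.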

Page 35: Barr pearl2009 preso

From linear to nonparametric models and graphs

Causal Bayesian Network

Definition: Causal Bayesian Network

Let P(v) be a probability distribution on a set V of variables, and let Px(v) denote the distribution resulting from the intervention do(X = x) that sets a subset X of variables to constants x. Denote by P∗ the set of all interventional distributions Px(v), X ⊆ V, including P(v), which represents no intervention (i.e. X = ∅). A DAG G is said to be a CBN compatible with P∗ iff:

1. Px(v) is Markov relative to G

2. Px(vi) = 1 for all Vi ∈ X whenever vi is consistent with X = x

3. Px(vi | pai) = P(vi | pai) for all Vi ∉ X whenever pai is consistent with X = x.

Page 36: Barr pearl2009 preso

From linear to nonparametric models and graphs

Causal Bayesian Network - Usage

If a network is a CBN, then

1. For all i, P(vi | pai) = P_{pai}(vi). Every parent set is exogenous relative to its child, so we can simulate setting pai by force.

2. For all i and for every subset S of variables disjoint of {Vi, PAi}, P_{pai, s}(vi) = P_{pai}(vi). Invariance: once the direct causes pai are controlled for, no other interventions affect the probability of vi.

3. On a CBN, truncated factorization is possible.

See Pearl 2000, Ch. 1.3 for reference.

Page 37: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Usage

If:

X = treatment, Y = response

Z a covariate affecting amount of treatment received,

Then:

P(z, y | do(x0)) gives “the proportion of individuals that would attain response level Y = y and covariate level Z = z under the hypothetical situation in which treatment X = x0 is administered uniformly to the population.”

Post-intervention Distribution:

$P_M(y \mid do(x)) \equiv P_{M_x}(y) \qquad (7)$

Page 38: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Structural Parameter Definition

Given model y = βx + ϵ,

We can use do(·) to write

$\beta = \frac{\partial}{\partial x} E[Y \mid do(x)] \qquad (8)$

“Rate of change (relative to x) of E(Y) in an experiment where X is set to x by external control.”

Page 39: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Usage

Average difference of treatment:

$E(Y \mid do(x'_0)) - E(Y \mid do(x_0)) \qquad (9)$

where x′0 and x0 are two levels of treatment to be compared.

Controlled Distribution Function

$P(Y = y \mid do(x)) = \sum_z P(z, y \mid do(x)) \qquad (10)$

Page 40: Barr pearl2009 preso

From linear to nonparametric models and graphs

The do calculus - Identifiability

Definition: Identifiability

A quantity Q(M) is identifiable, given a set of assumptions A, if for any two models M1 and M2 that satisfy A, we have

P(M1) = P(M2) ⇒ Q(M1) = Q(M2). (11)

The assumptions constrain the variability in such a way that equality of P’s implies equality of Q’s, thus the parameter of interest is expressible in terms of the parameters of P.

Page 41: Barr pearl2009 preso

From linear to nonparametric models and graphs

Causal Markov Condition.

Theorem:

Causal Markov Condition. Any distribution generated by a Markovian model M can be factorized as

$P(v_1, \ldots, v_n) = \prod_i P(v_i \mid pa_i) \qquad (12)$

where V1, ..., Vn are the endogenous variables in M and pai are the endogenous parents of Vi in the causal diagram of M.

This is sometimes referred to as a “belief network”.

Page 42: Barr pearl2009 preso

From linear to nonparametric models and graphs

Truncated Factorization

Corollary:

Truncated Factorization. For any Markovian model, the distribution generated by an intervention do(X = x0) on a set X of endogenous variables is given by the truncated factorization

$P(v_1, \ldots, v_k \mid do(x_0)) = \prod_{i \mid V_i \notin X} P(v_i \mid pa_i) \big|_{x = x_0} \qquad (13)$

Meaning: We do not include the factors of the intervened variables X in the factorization.

Page 43: Barr pearl2009 preso

From linear to nonparametric models and graphs

Using Truncated Factorization

P(z, y | do(x0)) = P(z) P(y | x0), where P(z) and P(y | x0) are equal to those in the pre-intervention distribution.

The distribution of Z is not affected by the intervention:

$P(y \mid do(x_0)) = \sum_z P(z, y \mid do(x_0)) = \sum_z P(z)\, P(y \mid x_0) = P(y \mid x_0)$
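The chain result above can be checked numerically. The following sketch assumes binary variables and made-up conditional probabilities for Z → X → Y (none of these numbers appear in the talk); it compares the truncated factorization with the ordinary conditional computed from the pre-intervention joint.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical CPTs for the chain Z -> X -> Y with binary variables.
p_z = {0: F(3, 10), 1: F(7, 10)}                     # P(z)
p_x_given_z = {0: F(1, 5), 1: F(3, 5)}               # P(x=1 | z)
p_y_given_x = {0: F(1, 4), 1: F(9, 10)}              # P(y=1 | x)

def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1 - p_one

def pre(z, x, y):
    """Pre-intervention joint P(z, x, y) = P(z) P(x|z) P(y|x)."""
    return p_z[z] * bern(p_x_given_z[z], x) * bern(p_y_given_x[x], y)

def post(z, y, x0):
    """Truncated factorization P(z, y | do(x0)) = P(z) P(y|x0)."""
    return p_z[z] * bern(p_y_given_x[x0], y)

x0 = 1
# Marginalize z out of the post-intervention distribution.
p_y1_do = sum(post(z, 1, x0) for z in (0, 1))
# Ordinary conditional P(y=1 | x=x0) computed from the pre-intervention joint.
p_y1_see = (sum(pre(z, x0, 1) for z in (0, 1))
            / sum(pre(z, x0, y) for z, y in product((0, 1), repeat=2)))
```

In this model the two quantities coincide because nothing confounds X and Y; that equality is a consequence of the graph, not a general rule.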

Page 44: Barr pearl2009 preso

From linear to nonparametric models and graphs

Lesson. The causal assumptions embedded in the model M permit prediction of the post-intervention distribution from the pre-intervention distribution. This in turn permits estimation of the causal effect of X on Y from nonexperimental data, since $P(y \mid x_0)$ is estimable.

Page 45: Barr pearl2009 preso

From linear to nonparametric models and graphs

Another Example of Deriving Causal Effect

Using the factorization in Eq. (12), immediately write the joint distribution of all variables from the graph:

\[ P(x, z_1, z_2, z_3, y) = P(z_1)\, P(z_2)\, P(z_3 \mid z_1, z_2)\, P(x \mid z_1, z_3)\, P(y \mid z_2, z_3, x) \tag{14} \]

Intervene, setting $x = x_0$ using $do(x_0)$:

\[ P(z_1, z_2, z_3, y \mid do(x_0)) = P(z_1)\, P(z_2)\, P(z_3 \mid z_1, z_2)\, P(y \mid z_2, z_3, x_0) \tag{15} \]

Get the causal effect of X on Y by marginalizing:

\[ P(y \mid do(x_0)) = \sum_{z_1, z_2, z_3} P(z_1)\, P(z_2)\, P(z_3 \mid z_1, z_2)\, P(y \mid z_2, z_3, x_0) \tag{16} \]

Figure 11: Model M, do(x0)

Page 46: Barr pearl2009 preso

From linear to nonparametric models and graphs

Another Example of Deriving Causal Effect

Get the causal effect of X on Y by marginalizing:

\[ P(y \mid do(x_0)) = \sum_{z_1, z_2, z_3} P(z_1)\, P(z_2)\, P(z_3 \mid z_1, z_2)\, P(y \mid z_2, z_3, x_0) \]

This is called “adjusting” for Z1, Z2, Z3.

This formula is written immediately from Fig. 3, without considering whether Z1, Z2, Z3 are confounders, whether they are on a causal pathway, etc.

It comes from the graph topology alone.

Figure 12: Model M, do(x0)

Page 47: Barr pearl2009 preso

From linear to nonparametric models and graphs

Another Example of Deriving Causal Effect - Multivariate Intervention

Apply the joint treatment $do(X = x)$, $do(Z_2 = z_2)$:

\[ P(y \mid do(X = x), do(Z_2 = z_2)) = \sum_{z_1, z_3} P(z_1)\, P(z_3 \mid z_1, z_2)\, P(y \mid z_2, z_3, x) \]

Using the graph and do(·), we can easily write complex interventions in the form of estimable quantities.

Figure 13: Model M, do(x0)
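Both the single intervention of Eq. (16) and the joint intervention above follow from one generic routine. The sketch below assumes the graph implied by Eq. (14) (Z1 and Z2 are roots, Z3 has parents Z1 and Z2, X has parents Z1 and Z3, Y has parents Z2, Z3 and X), binary variables, and randomly generated probability tables; the names `PARENTS`, `CPT`, `factor`, and `p_do` are illustrative only.

```python
from itertools import product
import random

random.seed(0)

# Parent sets read off the factorization in Eq. (14).
PARENTS = {
    "Z1": [], "Z2": [],
    "Z3": ["Z1", "Z2"],
    "X": ["Z1", "Z3"],
    "Y": ["Z2", "Z3", "X"],
}
ORDER = list(PARENTS)

# Hypothetical tables: CPT[v][parent values] = P(v = 1 | parents).
CPT = {
    v: {pa: random.uniform(0.1, 0.9) for pa in product((0, 1), repeat=len(ps))}
    for v, ps in PARENTS.items()
}


def factor(v, world):
    """P(v | pa_v), evaluated at the values stored in `world`."""
    p1 = CPT[v][tuple(world[p] for p in PARENTS[v])]
    return p1 if world[v] == 1 else 1.0 - p1


def p_do(query, do):
    """P(query | do(do)) via the truncated factorization, Eq. (13)."""
    free = [v for v in ORDER if v not in do]
    total = 0.0
    for values in product((0, 1), repeat=len(free)):
        world = dict(do, **dict(zip(free, values)))
        if any(world[v] != val for v, val in query.items()):
            continue
        prob = 1.0
        for v in free:  # factors of intervened variables are dropped
            prob *= factor(v, world)
        total += prob
    return total
```

For instance, `p_do({"Y": 1}, {"X": 1})` evaluates Eq. (16), and `p_do({"Y": 1}, {"X": 1, "Z2": 0})` evaluates the two-variable intervention.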

Page 48: Barr pearl2009 preso

Coping with unmeasured confounders

Unmeasured Confounders - An Example

\[ P(y \mid do(x_0)) = \sum_{z_1, z_2, z_3} P(z_1)\, P(z_2)\, P(z_3 \mid z_1, z_2)\, P(y \mid z_2, z_3, x_0) \]

Can we still estimate this if one of the $Z_i$ (here $Z_2$) is unmeasured?

Quite a few lines of challenging [3] algebra imply

\[ P(y \mid do(x_0)) = \sum_{z_1, z_3} P(z_1)\, P(z_3 \mid z_1)\, P(y \mid z_1, z_3, x_0). \tag{17} \]

How do we avoid all of this tedious algebra? Use the following theorem.

[3] For Pearl, much more so for me.

Page 50: Barr pearl2009 preso

Coping with unmeasured confounders

Adjustment For Direct Causes

Theorem (Adjustment for Direct Causes; Pearl 2000, p. 76). If the graph G is Markovian, the post-intervention distribution $P(Y = y \mid do(X = x))$ is given by the expression

\[ P(Y = y \mid do(X = x)) = \sum_t P(y \mid t, x)\, P(t) \]

where T is the set of direct causes (“parents”) of X in the graph, and the sum ranges over the values t of T. Thus, Eq. (17) can be written directly from Fig. 3.

Implication of ADC Theorem. No matter how complicated the model, the parents of X are the only variables that need to be measured to estimate the causal effects of X.
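The theorem can be checked numerically: adjusting for the parents of X alone should reproduce the value given by the full truncated factorization. This is a rough sketch on the five-variable graph of Eq. (14), with binary variables and randomly generated tables; all names and numbers are assumptions made for illustration.

```python
from itertools import product
import random

random.seed(1)

PARENTS = {
    "Z1": [], "Z2": [],
    "Z3": ["Z1", "Z2"],
    "X": ["Z1", "Z3"],
    "Y": ["Z2", "Z3", "X"],
}
NAMES = list(PARENTS)

# Hypothetical tables: CPT[v][parent values] = P(v = 1 | parents).
CPT = {
    v: {pa: random.uniform(0.1, 0.9) for pa in product((0, 1), repeat=len(ps))}
    for v, ps in PARENTS.items()
}


def worlds():
    """Enumerate every joint assignment of the five binary variables."""
    for values in product((0, 1), repeat=len(NAMES)):
        yield dict(zip(NAMES, values))


def joint(world):
    """Observational joint from the Markov factorization, Eq. (12)."""
    prob = 1.0
    for v, ps in PARENTS.items():
        p1 = CPT[v][tuple(world[p] for p in ps)]
        prob *= p1 if world[v] == 1 else 1.0 - p1
    return prob


def prob(event):
    """P(event) for a partial assignment, by summing the joint."""
    return sum(joint(w) for w in worlds()
               if all(w[k] == val for k, val in event.items()))


def adjust_for_parents(y, x):
    """ADC formula: sum over t of P(y | t, x) P(t), with T the parents of X."""
    parents = PARENTS["X"]
    total = 0.0
    for t in product((0, 1), repeat=len(parents)):
        t_event = dict(zip(parents, t))
        p_y_given_tx = (prob({**t_event, "X": x, "Y": y})
                        / prob({**t_event, "X": x}))
        total += p_y_given_tx * prob(t_event)
    return total


def truncated(y, x):
    """Reference value from Eq. (16): the joint with the factor P(x | z1, z3) divided out."""
    total = 0.0
    for w in worlds():
        if w["X"] == x and w["Y"] == y:
            p1 = CPT["X"][(w["Z1"], w["Z3"])]
            total += joint(w) / (p1 if x == 1 else 1.0 - p1)
    return total
```

Note that `adjust_for_parents` touches only the observational distribution of Z1, Z3, X and Y, so it remains computable when Z2 is unmeasured.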

Page 51: Barr pearl2009 preso

Coping with unmeasured confounders

Unmeasured Confounders - An Example

Using Adjustment for Direct Causes, we know that we need consider only the parents of X, namely Z1 and Z3. State directly from the graph:

\[ P(y \mid do(x_0)) = \sum_{z_1, z_3} P(z_1)\, P(z_3 \mid z_1)\, P(y \mid z_1, z_3, x_0). \]

Lesson: Using the graph and ADC, we can determine which variables are truly needed for the post-intervention distribution.

Figure 14: Model M, do(x0)

Page 52: Barr pearl2009 preso

Coping with unmeasured confounders

Question

What sets of variables besides parents(X) suffice for estimating the effect of X?

Using the back-door criterion, we can answer this question graphically.

Page 53: Barr pearl2009 preso

Coping with unmeasured confounders

Back-door criterion

Goal: Select a subset of the available factors for measurement and adjustment such that, comparing treated vs untreated, we get the correct treatment effect. Such a subset is called a sufficient set or admissible set (in epidemiology). We want to estimate causal effects in nonexperimental studies.

Page 54: Barr pearl2009 preso

Coping with unmeasured confounders

Back-door criterion

Definition (Admissible sets - the back-door criterion). A set S is admissible (or “sufficient”) for adjustment if the following two conditions hold:

1. No element of S is a descendant of X.

2. The elements of S “block” all “back-door” paths from X to Y, namely all paths that end with an arrow pointing to X.

Intuition: Back-door paths carry spurious associations from X to Y, while paths directed along arrows from X to Y carry causative associations. Blocking the back-door paths ensures that the measured association between X and Y is purely causative.
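The two conditions can be tested mechanically. The sketch below is one possible implementation, not the author's: it checks condition 2 by removing the arrows leaving X and then testing d-separation of X and Y given S through the moralized ancestral graph, a standard equivalent formulation. The edge list is the five-variable graph of Eq. (14), used here as an assumed example.

```python
EDGES = [("Z1", "Z3"), ("Z2", "Z3"), ("Z1", "X"), ("Z3", "X"),
         ("Z2", "Y"), ("Z3", "Y"), ("X", "Y")]


def parents_of(edges, node):
    return {a for a, b in edges if b == node}


def descendants(edges, node):
    """All nodes reachable from `node` along directed edges."""
    found, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for a, b in edges:
            if a == cur and b not in found:
                found.add(b)
                stack.append(b)
    return found


def ancestors(edges, nodes):
    """The given nodes together with all of their ancestors."""
    found, stack = set(nodes), list(nodes)
    while stack:
        cur = stack.pop()
        for p in parents_of(edges, cur):
            if p not in found:
                found.add(p)
                stack.append(p)
    return found


def d_separated(edges, x, y, s):
    """Test whether S d-separates x from y, via the moralized ancestral graph."""
    keep = ancestors(edges, {x, y} | set(s))
    sub = [(a, b) for a, b in edges if a in keep and b in keep]
    undirected = {frozenset(e) for e in sub}
    for node in keep:  # connect ("marry") parents that share a child
        ps = sorted(parents_of(sub, node))
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                undirected.add(frozenset((p, q)))
    seen, stack = {x}, [x]
    while stack:  # search from x to y without entering S
        cur = stack.pop()
        if cur == y:
            return False
        for e in undirected:
            if cur in e:
                (other,) = e - {cur}
                if other not in seen and other not in s:
                    seen.add(other)
                    stack.append(other)
    return True


def satisfies_back_door(edges, x, y, s):
    s = set(s)
    if s & descendants(edges, x):  # condition 1
        return False
    no_out = [(a, b) for a, b in edges if a != x]  # keep only back-door paths
    return d_separated(no_out, x, y, s)  # condition 2
```

On this graph, `{"Z1", "Z3"}` and `{"Z2", "Z3"}` pass, while `{"Z3"}` alone fails: Z3 is a collider on the path X <- Z1 -> Z3 <- Z2 -> Y, so conditioning on it opens that path.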

Page 55: Barr pearl2009 preso

Coping with unmeasured confounders

Back Door Example

Does S = {Z3, Z4} meet the back-door criterion for $(X_i, X_j)$?

Neither Z3 nor Z4 is a descendant of $X_i$

Page 56: Barr pearl2009 preso

Coping with unmeasured confounders

Back Door Example

Does S = {Z3, Z4} meet the back-door criterion for $(X_i, X_j)$?

Any path with an arrow into $X_i$ is blocked by {Z3, Z4}

Page 57: Barr pearl2009 preso

Coping with unmeasured confounders

Back Door Example

Does S = {Z3, Z4} meet the back-door criterion for $(X_i, X_j)$?

⇒ Adjusting for {Z3, Z4} yields a consistent estimate of $P(x_j \mid do(x_i))$

Page 58: Barr pearl2009 preso

Coping with unmeasured confounders

Notes about back-door criterion

Allows researchers to search for an optimal set of covariates, a set S that minimizes measurement cost or sampling variability.

Given the back-door criterion, causality calculus can handle “confounding”

We want to transform do(·) equations into do(·)-free equations, which are estimable from non-experimental data.

S is the smallest admissible set, the set of covariates which gives the correct treatment effect.

Page 59: Barr pearl2009 preso

Coping with unmeasured confounders

Tian and Pearl, 2002 - Sufficient Conditions for Identification

A sufficient condition for identifying the causal effect $P(y \mid do(x))$ is that every path between X and any of its children traces at least one arrow emanating from a measured variable.

Page 60: Barr pearl2009 preso

Coping with unmeasured confounders

do - a final thought

Indeed, whether the conditional probabilities ... originate from frequency data or subjective assessment matters not in causal analysis. Likewise, whether the causal effect P(y | do(x)) is interpreted as one’s degree of belief in the effect of action do(x), or as the fraction of the population that will be affected by the action matters not in causal analysis. What matters is one’s readiness to accept and formulate qualitative judgments about cause-effect relationships with the same seriousness that one accepts and formulates subjective judgment about prior distributions in Bayesian analysis. Trained to accept the human mind as a reliable transducer of experience, and human experience as a faithful mirror of reality, Bayesian statisticians are beginning to accept the language chosen by the mind to communicate experience - the language of cause and effect.

Page 61: Barr pearl2009 preso

Counterfactual analysis in structural models

Limits of P(y | do(x))

Attribution questions (e.g. what fraction of healthy people would have gotten sick had they been exposed?) do not come from $P(y \mid do(x))$. They require counterfactual analysis.

This is something that SEM is very good at

“Y would be y had X been x in situation U = u”

\[ Y_x(u) = y \tag{18} \]

The idea is that we replace the equation for X with X = x, without rendering the system inconsistent.

Page 62: Barr pearl2009 preso

Counterfactual analysis in structural models

Unit-Level Counterfactuals

Definition (Unit-level Counterfactuals). Let M be a structural model and $M_x$ a modified version of M with the equation(s) of X replaced by X = x. Denote the solution for Y in the equations of $M_x$ by the symbol $Y_{M_x}(u)$. The counterfactual $Y_x(u)$ (read: “the value of Y in unit u, had X been x”) is given by:

\[ Y_x(u) \triangleq Y_{M_x}(u) \]

Page 63: Barr pearl2009 preso

Counterfactual analysis in structural models

\[ Y_x(u) \triangleq Y_{M_x}(u) \]

The potential-outcome notation was defined by Neyman (1923) and Rubin (1974) (NR framework)

In NR, $Y_x(u)$, the unit-level counterfactual, is treated as a primitive, undefined quantity.

In SEM, $Y_x(u)$ is a derived quantity.

u are the “background conditions”. U = u defines a unit. The value u varies from individual to individual.

Laws of nature are reflected in $f_X$, $f_Y$.

Page 64: Barr pearl2009 preso

Counterfactual analysis in structural models

The do calculus

Modified model $M_{x_0}$

Using the Unit-Level Counterfactuals definition, we can use the symbol $Y_{x_0}(u_X, u_Y, u_Z)$.

Interpretation: “The way an individual with characteristics $(u_X, u_Y, u_Z)$ would respond had the treatment been $x_0$ rather than $x = f_X(z, u_X)$.”

Post-intervention:

$z = f_Z(u_Z)$
$x = x_0$
$y = f_Y(x, u_Y)$
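The construction of $Y_{x_0}(u)$ can be made concrete with a toy structural model. The functions below are hypothetical stand-ins for $f_Z$, $f_X$, $f_Y$, chosen only to illustrate the mechanics: solve the original model M for a unit u, then solve the modified model in which the equation for X is replaced by the constant $x_0$.

```python
def f_Z(u_z):
    # Hypothetical: the assignment is whatever the background factor u_z dictates.
    return u_z


def f_X(z, u_x):
    # Hypothetical: the unit takes the treatment if assigned, unless u_x == 1.
    return z * (1 - u_x)


def f_Y(x, u_y):
    # Hypothetical: the unit recovers if treated, or spontaneously if u_y == 1.
    return max(x, u_y)


def solve(u, x0=None):
    """Solve M (x0 is None) or the modified model M_{x0} for unit u = (u_Z, u_X, u_Y)."""
    u_z, u_x, u_y = u
    z = f_Z(u_z)
    # In M_{x0}, the equation x = f_X(z, u_X) is replaced by x = x0.
    x = f_X(z, u_x) if x0 is None else x0
    y = f_Y(x, u_y)
    return {"z": z, "x": x, "y": y}


def Y_x(u, x0):
    """Unit-level counterfactual Y_{x0}(u), defined as Y in the solution of M_{x0}."""
    return solve(u, x0)["y"]
```

For the unit u = (1, 1, 0), the actual world has x = 0 and y = 0, while `Y_x(u, 1)` is 1: this unit would have recovered had it been treated.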

Page 65: Barr pearl2009 preso

An example: Non-compliance in structural models

Problem: estimating the treatment effect in a clinical trial with partial compliance. This means that only a subset of the treatment group actually received treatment.

Page 66: Barr pearl2009 preso

An example: Non-compliance in structural models

An example: Non-compliance in structural models - Steps

1. Define: Express the target quantity Q as a function Q(M) that can be computed from any model M.

2. Assume: Formulate causal assumptions using ordinary scientific language and represent their structural part in graphical form.

3. Identify: Determine if the target quantity is identifiable.

4. Estimate: Estimate the target quantity if it is identifiable, or approximate it if it is not.

Page 67: Barr pearl2009 preso

An example: Non-compliance in structural models

Defining the target quantity

Question: What is the causal effect?

Y is symptom

P(y | do(x)) invokes a submodel Mx

Page 68: Barr pearl2009 preso

An example: Non-compliance in structural models

Treatment Example

Figure 15: Diagram for clinical trial with imperfect compliance

$z = f_Z(u_Z)$
$x = f_X(z, u_X)$
$y = f_Y(x, u_Y)$

Z = randomized treatment assignment
X = treatment actually received
Y = observed response
$U_Y$ = factors influencing a subject’s response to treatment
$U_X$ = factors influencing compliance
$U_Z$ = the random assignment device

Page 69: Barr pearl2009 preso

An example: Non-compliance in structural models

Treatment Example - Assumptions

Figure 16: Diagram for clinical trial with imperfect compliance

$z = f_Z(u_Z)$
$x = f_X(z, u_X)$
$y = f_Y(x, u_Y)$

1. “Exclusion restriction”: Z influences Y only through X.

2. Z is independent of $U_X$ and $U_Y$.

Page 70: Barr pearl2009 preso

An example: Non-compliance in structural models

Treatment Example

Figure 17: Diagram for clinical trial with imperfect compliance - Intervention

$z = f_Z(u_Z)$
$x = x_0$
$y = f_Y(x, u_Y)$

$P(y \mid do(x_0))$ describes the response of the population to an experiment in which we administer treatment at level $X = x_0$ uniformly to the entire population, letting $x_0$ take different values on hypothetical copies of the population.

Page 71: Barr pearl2009 preso

An example: Non-compliance in structural models

Treatment Example - Identification

Figure 18: Diagram for clinical trial with imperfect compliance - Intervention

$z = f_Z(u_Z)$
$x = x_0$
$y = f_Y(x, u_Y)$

Note: We do not meet the criterion for identification by adjustment. The criterion requires observed covariates on the back-door path $X \leftarrow U_X \leftrightarrow U_Y \rightarrow Y$ that block the spurious associations created by this path. ⇒ Since $U_X$ and $U_Y$ are assumed to be unobserved, and since no other blocking covariates exist, the investigator can conclude that confounding bias cannot be removed by adjustment.
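A small exact calculation shows the bias this slide warns about. The sketch assumes binary background factors with a made-up joint distribution in which $U_X$ and $U_Y$ are dependent, and hypothetical structural functions; it relies on knowing the distribution of U, which a real investigator cannot observe, so it is a thought experiment rather than an estimation procedure.

```python
# Hypothetical background distributions (assumptions, not taken from the slides).
P_UZ = {0: 0.5, 1: 0.5}                 # fair randomization device
P_UX_UY = {(0, 0): 0.4, (0, 1): 0.1,    # U_X and U_Y are dependent:
           (1, 0): 0.1, (1, 1): 0.4}    # non-compliers tend to be non-responders


def f_Z(u_z):
    return u_z                          # assignment is set by the device


def f_X(z, u_x):
    return z * (1 - u_x)                # u_x = 1: never takes the treatment


def f_Y(x, u_y):
    return x * (1 - u_y)                # u_y = 1: never responds


def units():
    """Yield every unit u = (u_z, u_x, u_y) together with its probability."""
    for u_z, p_z in P_UZ.items():
        for (u_x, u_y), p_xy in P_UX_UY.items():
            yield (u_z, u_x, u_y), p_z * p_xy


def p_y_given_x(y, x):
    """Observational P(y | x) in the unmodified model M."""
    num = den = 0.0
    for (u_z, u_x, u_y), p in units():
        x_obs = f_X(f_Z(u_z), u_x)
        if x_obs == x:
            den += p
            if f_Y(x_obs, u_y) == y:
                num += p
    return num / den


def p_y_do_x(y, x0):
    """Interventional P(y | do(x0)): the equation for X is replaced by x = x0."""
    return sum(p for (_, _, u_y), p in units() if f_Y(x0, u_y) == y)
```

With these numbers, the observational quantity P(Y = 1 | X = 1) is 0.8 while P(Y = 1 | do(X = 1)) is 0.5: the people who actually take the treatment are disproportionately those who respond to it, so the naive comparison overstates the effect.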

Page 72: Barr pearl2009 preso

Section Overview

The potential-outcome (PO) framework (Neyman (1923) and Rubin (1974)) does not provide a comprehensive theory of causality

It is subsumed by the structural vocabulary

PO does not have sufficient vocabulary to describe causality

PO can serve a purpose when coupled with graphical models

Page 73: Barr pearl2009 preso

Potential-Outcome Review

Primitive of PO framework

$Y_x(u)$

the value that outcome Y would obtain in experimental unit u had X been x.

In SCM, this amounts to $Y_x(u) \triangleq Y_{M_x}(u)$

In SCM, $Y_x(u)$ is the solution to a system of equations, which permits counterfactual analysis because one can reason about the behavior of hypothetical units.

By contrast, in PO, $Y_x(u)$ is a primitive, ..., undefined quantity in terms of which other quantities are defined.

Page 74: Barr pearl2009 preso

The “Black-Box” missing-data paradigm

In PO, researchers think and communicate in terms of $Y_x(u)$, but

Analysis is performed with standard probability calculus

U and $Y_x$ are random variables.

The observed distribution $P(x_1, \ldots, x_n)$ is the marginal distribution of an augmented distribution $P^*$

$P(y \mid do(x))$ becomes $P^*(Y_x = y)$, and counterfactuals are derived from that.

Page 75: Barr pearl2009 preso

The “Black-Box” missing-data paradigm

“Causality assumptions” in PO come from consistency constraints, i.e.

\[ X = x \Rightarrow Y_x = Y \]

meaning that if X = x, then $Y_x$, the value Y would take had X been x, equals the observed value of Y.

Main Difference. In the structural approach, do(x) changes a distribution but keeps the variables the same. In PO, Y | do(x) is a different variable, $Y_x$, loosely connected to Y through relationships such as the one above. $Y_x$ is then treated as “missing data.”

Page 76: Barr pearl2009 preso

Problem formulation and the demystification of ignorability

The structure of the problem in PO is communicated as constraints on $P^*$

$X = x \Rightarrow Y_x = Y$

$Y_x \perp\!\!\!\perp X \mid Z$, “conditional ignorability” (Rosenbaum and Rubin (1983)) $\Rightarrow$ $P(y \mid do(x)) = P^*(Y_x = y)$

Then, we can derive $P^*(Y_x = y)$ using standard probability algebra

Page 77: Barr pearl2009 preso

Problem formulation and the demystification of ignorability

Probability Algebra
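The body of this slide did not survive extraction. As a sketch of the kind of derivation the title refers to, the standard argument under the two PO constraints stated earlier (consistency and conditional ignorability) runs as follows; the annotations on the right are added here for readability.

```latex
\begin{align*}
P^*(Y_x = y) &= \sum_z P^*(Y_x = y \mid z)\, P(z) \\
             &= \sum_z P^*(Y_x = y \mid x, z)\, P(z)
                && \text{(conditional ignorability: } Y_x \perp\!\!\!\perp X \mid Z\text{)} \\
             &= \sum_z P^*(Y = y \mid x, z)\, P(z)
                && \text{(consistency: } X = x \Rightarrow Y_x = Y\text{)} \\
             &= \sum_z P(y \mid x, z)\, P(z)
\end{align*}
```

The last line contains only observed quantities and has the same form as the adjustment formula obtained from the back-door criterion with S = Z.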

Page 78: Barr pearl2009 preso

Problem formulation and the demystification of ignorability

Pros and Cons of PO

Pros:

Uses standard probability calculus

Treats $Y_x$ as a random variable

Cons:

“Problem structure must be reformulated to re-fit theory.”

The problem must be translated into counterfactuals (ignorability conditions)

It is very difficult to tell which set of covariates satisfies conditional independence

“Ignorability” is a difficult concept

PO offers no guidance in covariate selection

Page 79: Barr pearl2009 preso

Problem formulation and the demystification of ignorability

What to learn from this section

The assumptions needed to use PO are “cast in a language so far removed from ordinary understanding of scientific theories that, for all practical purposes, they cannot be comprehended by ordinary mortals. As a result, researchers in the graph-less PO camp rarely use ‘conditional ignorability’ to guide the choice of covariates; they view this condition as a hoped-for miracle of nature rather than a target to be achieved by reasoned design.” Pearl 2009, p. 130.

Page 80: Barr pearl2009 preso

Combining graphs and potential outcomes

How to use graph and PO

1. Exclusion restrictions: For every variable Y having parents $pa_Y$, and for every set of endogenous variables S disjoint from $pa_Y$, we have

\[ Y_{pa_Y} = Y_{pa_Y, s}. \]

2. Independence restrictions: If $Z_1, \ldots, Z_k$ is any set of nodes not connected to Y via dashed arcs, and $pa_1, \ldots, pa_k$ are their respective sets of parents, we have

\[ Y_{pa_Y} \perp\!\!\!\perp \{Z_{1, pa_1}, \ldots, Z_{k, pa_k}\}. \]

Page 81: Barr pearl2009 preso

Combining graphs and potential outcomes

Use the “back-door” condition to make sure that the chosen covariates eliminate bias

Use the algebraic machinery of counterfactual notation $Y_x(u)$ to refine assumptions

When doing algebra on PO, look to the graph for further assumptions

Page 82: Barr pearl2009 preso

Conclusion - How to Use Pearl’s Causality Calculus

When modeling, use a graph G to state your assumptions about dependence and independence.

Write the desired effect, e.g. $P(Y \mid do(x))$ (the total effect of X on Y).

Use d-separation to determine the conditional independence implications of G.

Examine identifiability: Tian and Pearl 2002, back-door criterion.

If it is possible to transform $P(Y = y \mid do(x))$ into do-free expressions, the causal quantity is identifiable.

Use the do-free quantities for estimation.

Page 83: Barr pearl2009 preso

Ending Quote

First, I do not offer anyone an approach, I offer mathematical tools to do what researchers say they wish to do, only with less effort and greater clarity; researchers may choose to use or ignore these tools. By analogy, the invention of the microscope was not an approach but a new tool. - Judea Pearl’s Blog, http://www.mii.ucla.edu/causality/

Page 84: Barr pearl2009 preso

References

Book: Causality: Models, Reasoning and Inference, Judea Pearl.

Paper: Reflections about Judea Pearl and his Contributions, Eric Horvitz.

Video: Judea Pearl Tribute Symposium: Causality, find on YouTube.

Book: Bayesian Reasoning and Machine Learning, David Barber. Free PDF on Barber’s site.

Page 85: Barr pearl2009 preso

Thank you!