TRYGVE HAAVELMO AND THE EMERGENCE OF CAUSAL CALCULUS Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea)

Post on 17-Dec-2015



OUTLINE

• Haavelmo’s major breakthroughs
• The evolution of causal calculus
• What can causal calculus do for us?
• Will Haavelmo’s legacy survive?

HAAVELMO’S THREE INSIGHTS

1. An economic model is a set of hypothetical experiments, qualitatively encoded in a system of equations.

2. An economic model is capable of answering policy intervention questions, with no further assistance from the modeller.

3. There is a formal way of taking an arbitrary model, combining it with data, and deriving valid answers to policy questions.

WHAT IS AN ECONOMIC MODEL?

“We have in mind some actual experiment, or some design of an experiment, which we could at least imagine arranging.” (Haavelmo, 1944)

His example:

$y = ax + \epsilon_1$   (Eq. 1)
$x = by + \epsilon_2$   (Eq. 2)

What information did the modeller intend a to carry in Eq. 1, and what information would a provide if we were able to estimate its value?

Later notation: $a = \frac{\partial}{\partial x} E(Y \mid do(x))$, whereas in general $ax \neq E(Y \mid X = x)$.

Perhaps a = g[P(x,y,z...)] ?

There is no such g[ ].
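The gap between the structural parameter a and the regression of Y on X can be checked numerically. The following is a minimal sketch, assuming the two-equation system y = a·x + e1, x = b·y + e2 with independent standard normal errors; the values a = 0.5 and b = 0.8, the sample size, and all function names are illustrative assumptions and are not taken from the slides.

```python
import random

def simulate(a=0.5, b=0.8, n=100_000, seed=0):
    """Draw (x, y) from the simultaneous system y = a*x + e1, x = b*y + e2."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = (b * e1 + e2) / (1 - a * b)   # reduced form obtained by solving both equations
        y = a * x + e1
        xs.append(x)
        ys.append(y)
    return xs, ys

def simulate_do(a=0.5, n=100_000, seed=1):
    """Hypothetical experiment do(x): x is set from outside, y = a*x + e1 is kept."""
    rng = random.Random(seed)
    xs = [rng.uniform(-2, 2) for _ in range(n)]   # values chosen by the experimenter
    ys = [a * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

def slope(xs, ys):
    """Least-squares slope of y on x, i.e. the slope of the regression E(Y | X = x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

a, b = 0.5, 0.8
observational_slope = slope(*simulate(a, b))
interventional_slope = slope(*simulate_do(a))
# With unit error variances, the regression slope is (b + a) / (b**2 + 1), not a.
predicted_regression_slope = (b + a) / (b ** 2 + 1)
```

Under these assumed values the regression slope settles near 0.79 while the experimental slope settles near a = 0.5, which illustrates why a is not a feature of the joint distribution alone.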

TRADITIONAL STATISTICAL INFERENCE PARADIGM

[Diagram: Joint Distribution P → Data → Inference → Q(P) (Aspects of P)]

e.g., Infer whether customers who bought product A would also buy product B: Q = P(B | A)

THE STRUCTURAL MODEL PARADIGM

[Diagram: Data Generating Model M → Joint Distribution → Data → Inference → Q(M) (Aspects of M)]

M – Invariant strategy (mechanism, recipe, law, protocol) by which Nature assigns values to variables in the analysis.

• “A painful de-crowning of a beloved oracle!”

CLINGING DESPERATELY TO AN OLD ORACLE

Fifty-two years later (1995):

“I am speaking, of course, about the equation: {y = a + bx + ε}. What does it mean? The only meaning I have ever determined for such an equation is that it is a shorthand way of describing the conditional distribution of {y} given {x}.” (Holland, 1995, p. 54)

“Beneath all multiple equation models, there is only a set of conditional distributions.” (Richard Berk, 2004, p. 196)

ECONOMISTS CLINGING TO THE OLD ORACLE

Fifty-two years later (1995):

“For example, in the model: $y_t = z_t \beta + u_t$ where $u_t \sim IN[0, \sigma_u^2]$, until the relationship between $z_t$ and $u_t$ is specified, the meaning of $\beta$ is uncertain since $E[z_t u_t]$ could be either zero or non-zero on the information provided.” (Hendry, 1995)

“The joint density is the basis: SEMs are merely an interpretation of that.” (1998, personal communication)

ECONOMISTS CLINGING TO THE OLD ORACLE

Sixty-four years later (2007):

“A state implements tough new penalties on drunk drivers: What is the effect on highway fatalities?... [This effect] is an unknown characteristic of the population joint distribution of X and Y.” (Stock and Watson, 2007, Introduction to Econometrics, Chapter 4, p. 111)

Sixty-eight years later (2011):

“An econometric model specifies the statistical relationship that is believed to hold between the various economic quantities pertaining to a particular economic phenomenon under study.” (Wikipedia – “Econometric Models,” 2011)

WHAT CAN OUR NEW ORACLE DO?

Haavelmo: Provide policy advice. “to predict consumption,... under the Government policy,... we may use the ‘theoretical’ equations obtained.” (1943, p. 12)

Skeptics: “How can anyone predict outcomes of experiments that were never performed?”

Haavelmo: “this is only natural, because now the Government is, in fact, performing ‘experiments’ of the type we had in mind when constructing each of the two equations.” (1943, p. 12)

HOW CAN OUR NEW ORACLE DO ALL THAT? (THE ALGORITHMIZATION OF INTERVENTION)

“Assume that the Government decides, through public spending, taxation, etc., to keep income, $r_t$, at a given level,... the only change in the system being that, instead of $r_i = u_i + v_i$ we now have $r_i = u_i + v_i + g_i$, where $g_i$ is Government expenditure, so adjusted as to keep r constant, whatever be u and v,...” (1943, p. 12)

Haavelmo “surgery”: Modify an equation by adding an adjustable force $g_i$, while keeping all other equations intact.

THE EVOLUTION OF CAUSAL CALCULUS

• Haavelmo's surgery (1943): Add adjustable force ($g_i$)
• Strotz and Wold surgery (1960): “Wipe out” the equation $r_i = u_i + v_i$ and replace it with $r_i = \text{constant}$
• Graphical surgery (Spirtes et al., 1993; Pearl, 1993): Wipe out incoming arrows to r
  [Diagram: u → r ← v, r → y, shown next to the r-mutilated graph]
  $P(u, v, r, y) = P(u)\,P(v)\,P(r \mid u, v)\,P(y \mid r)$
• do-calculus (Pearl, 1994): new operator, $P(Y = y \mid do(r))$
• Structural counterfactuals (Balke and Pearl, 1995): $Y_r(u) = Y(u)$ in the r-mutilated model
• Unification with Neyman-Rubin $Y_x(u)$ and Lewis (1973)
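The two equation-level surgeries can be compared on a toy system. This is a minimal sketch, assuming an illustrative model with exogenous u and v, the equation r = u + v, and a made-up outcome equation y = 2r + 1; the function names and the target level 3.0 are hypothetical and only serve to show that adding an adjustable force g and replacing the equation by a constant lead to the same outcomes.

```python
import random

def make_model():
    """Toy structural model (illustrative): r = u + v and y = 2*r + 1."""
    return {
        "r": lambda s: s["u"] + s["v"],
        "y": lambda s: 2 * s["r"] + 1,
    }

def solve(model, u, v, order=("r", "y")):
    """Evaluate the equations in causal order for one exogenous situation (u, v)."""
    state = {"u": u, "v": v}
    for name in order:
        state[name] = model[name](state)
    return state

def haavelmo_surgery(model, r_target):
    """Keep r = u + v and add a force g, adjusted so that r stays at r_target."""
    mutilated = dict(model)
    def r_equation(s):
        g = r_target - (s["u"] + s["v"])   # g compensates whatever u and v are
        return s["u"] + s["v"] + g
    mutilated["r"] = r_equation
    return mutilated

def strotz_wold_surgery(model, r_target):
    """Wipe out the equation for r and replace it with r = constant."""
    mutilated = dict(model)
    mutilated["r"] = lambda s: r_target
    return mutilated

rng = random.Random(0)
situations = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
base = make_model()
y_haavelmo = [solve(haavelmo_surgery(base, 3.0), u, v)["y"] for u, v in situations]
y_strotz_wold = [solve(strotz_wold_surgery(base, 3.0), u, v)["y"] for u, v in situations]
```

In both mutilated models every other equation is left untouched, so the outcome y depends only on the level at which r is held.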

ALGORITHMIZATION OF COUNTERFACTUALS

Semantics: The counterfactual $Y_x(u) = y$, which reads “Y would be y (in situation u), had X been x,” is said to be true if Y = y in the x-mutilated model, under exogenous conditions U = u.

Joint probabilities of counterfactuals:

$P(Y_x = y, Z_w = z) = \sum_{u:\, Y_x(u) = y,\ Z_w(u) = z} P(u)$

In particular:

$P(y \mid do(x)) \triangleq P(Y_x = y) = \sum_{u:\, Y_x(u) = y} P(u)$

$PN \triangleq P(Y_{x'} = y' \mid x, y) = \sum_{u:\, Y_{x'}(u) = y'} P(u \mid x, y)$
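The counterfactual semantics can be turned into a direct enumeration over exogenous situations u. The sketch below assumes a hypothetical binary model in which X = u1 and Y = X or u2, with made-up probabilities for u1 and u2; it computes P(Y_x = y) by summing P(u) over the situations where Y equals y in the x-mutilated model, and the probability of necessity PN by summing P(u | x, y).

```python
from itertools import product

# Hypothetical model: U = (u1, u2) with independent binary components.
P_U1 = {0: 0.5, 1: 0.5}
P_U2 = {0: 0.7, 1: 0.3}
SITUATIONS = list(product((0, 1), repeat=2))

def p_u(u):
    """Prior probability of the exogenous situation u = (u1, u2)."""
    return P_U1[u[0]] * P_U2[u[1]]

def f_x(u):
    """Structural equation for X (assumed): X = u1."""
    return u[0]

def f_y(x, u):
    """Structural equation for Y (assumed): Y = X or u2."""
    return int(x or u[1])

def Y(u, do_x=None):
    """Y(u) in the original model, or Y_x(u) in the x-mutilated model if do_x is given."""
    x = f_x(u) if do_x is None else do_x
    return f_y(x, u)

def p_counterfactual(x, y):
    """P(Y_x = y): sum of P(u) over all u with Y_x(u) = y."""
    return sum(p_u(u) for u in SITUATIONS if Y(u, do_x=x) == y)

def prob_necessity(x=1, y=1, x_alt=0, y_alt=0):
    """PN = P(Y_x' = y' | X = x, Y = y): sum of P(u | x, y) over u with Y_x'(u) = y'."""
    evidence = [u for u in SITUATIONS if f_x(u) == x and Y(u) == y]
    p_evidence = sum(p_u(u) for u in evidence)
    hits = sum(p_u(u) for u in evidence if Y(u, do_x=x_alt) == y_alt)
    return hits / p_evidence
```

For this assumed model, `p_counterfactual(0, 1)` is the probability that u2 = 1, and `prob_necessity()` is the probability that Y would have been 0 without X among the cases where X = 1 and Y = 1.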

AXIOMS OF CAUSAL COUNTERFACTUALS

$Y_x(u) = y$: Y would be y, had X been x (in state U = u)

1. Definiteness: $\exists x \in X$ s.t. $X_y(u) = x$
2. Uniqueness: $(X_y(u) = x) \,\&\, (X_y(u) = x') \Rightarrow x = x'$
3. Effectiveness: $X_{xw}(u) = x$
4. Composition: $W_x(u) = w \Rightarrow Y_{xw}(u) = Y_x(u)$
5. Reversibility: $(Y_{xw}(u) = y) \,\&\, (W_{xy}(u) = w) \Rightarrow Y_x(u) = y$
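Effectiveness and composition can be checked by brute force on a small model. This is only a sketch under assumptions: the three binary equations below (X → W → Y with X also feeding Y) are invented for illustration, and reversibility is not tested because it holds trivially in recursive models of this kind.

```python
from itertools import product

# Hypothetical recursive model over binary variables, listed in causal order.
EQUATIONS = {
    "X": lambda s, u: u[0],
    "W": lambda s, u: s["X"] ^ u[1],
    "Y": lambda s, u: (s["X"] & s["W"]) | u[2],
}

def solve(u, do=None):
    """Solve the model in situation u after replacing the equations of the variables in `do`."""
    do = do or {}
    state = {}
    for name, equation in EQUATIONS.items():   # dicts keep insertion (causal) order
        state[name] = do[name] if name in do else equation(state, u)
    return state

def check_axioms():
    """Return True if effectiveness and composition hold in every situation u."""
    for u in product((0, 1), repeat=3):
        for x in (0, 1):
            under_x = solve(u, {"X": x})
            for w in (0, 1):
                under_xw = solve(u, {"X": x, "W": w})
                # Effectiveness: X_xw(u) = x
                if under_xw["X"] != x:
                    return False
                # Composition: W_x(u) = w implies Y_xw(u) = Y_x(u)
                if under_x["W"] == w and under_xw["Y"] != under_x["Y"]:
                    return False
    return True

axioms_hold = check_axioms()
```

The check enumerates all eight exogenous situations, so a failure of either axiom in this model would be found.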

THE LOGIC OF CAUSAL ANALYSIS

[Flow diagram]
Inputs: CAUSAL MODEL (M_A); A – CAUSAL ASSUMPTIONS; Q – Queries of interest; Data (D)
Causal inference: A* – Logical implications of A; Q(P) – Identified estimands; T(M_A) – Testable implications
Statistical inference: Q(D, A) – Estimates of Q(P), leading to provisional claims; g(T) – Goodness of fit, leading to model testing

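ILLUSTRATING THE WORKING OF CAUSAL CALCULUS

Model 1

$Z_1 = f_1(u_1)$
$Z_2 = f_2(u_2)$
$Z_3 = f_3(Z_1, Z_2, u_3)$
$W_1 = g_1(Z_1, u_1')$
$W_2 = g_2(Z_2, u_2')$
$X = g(W_1, Z_3, u'')$
$W_3 = g_3(X, u_3'')$
$Y = f(W_3, Z_3, W_2, u)$

Model 2 (Linear version)

$Z_1 = u_1$
$Z_2 = u_2$
$Z_3 = a_3 Z_1 + b_3 Z_2 + u_3$
$W_1 = a_1' Z_1 + u_1'$
$W_2 = c_2 Z_2 + u_2'$
$X = t_1 W_1 + t_2 Z_3 + u''$
$W_3 = c_3 X + u_3''$
$Y = a W_3 + b Z_3 + c W_2 + u$

[Diagram: Z1 → W1 → X; Z1 → Z3 ← Z2; Z2 → W2 → Y; Z3 → X; Z3 → Y; X → W3 → Y]

U’s are mutually independent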

WHAT ARE THE TESTABLE IMPLICATIONS OF THE MODEL?

[Diagram of Model 1: Z1 → W1 → X; Z1 → Z3 ← Z2; Z2 → W2 → Y; Z3 → X; Z3 → Y; X → W3 → Y]

Missing edges: Z1 – Z2, Z1 – Y, Z2 – X . . .
Separating sets: ∅, {X, Z2, Z3}, {Z1, Z3} . . .
Zero regression coefficients: $r_{Z_1 Z_2} = 0$ and $r_{Y Z_1 \cdot X Z_2 Z_3} = 0$

Testable implications (FINITE!): $Z_1 \perp Z_2$, $Z_1 \perp Y \mid \{X, Z_2, Z_3\}$, $Z_2 \perp X \mid \{Z_1, Z_3\}$.

These imply *all* misspecification tests.
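Conditional independencies of this kind can be read off a graph mechanically. The sketch below assumes the edge set Z1 → W1 → X, Z1 → Z3 ← Z2, Z2 → W2 → Y, Z3 → X, Z3 → Y, X → W3 → Y (a reconstruction of Model 1, not a quotation of the slide) and tests d-separation with the standard moralized ancestral graph construction; the helper names are hypothetical.

```python
# Parent lists assumed for Model 1 (reconstructed from its structural equations).
PARENTS = {
    "Z1": [], "Z2": [],
    "Z3": ["Z1", "Z2"],
    "W1": ["Z1"], "W2": ["Z2"],
    "X": ["W1", "Z3"],
    "W3": ["X"],
    "Y": ["W3", "Z3", "W2"],
}

def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen

def d_separated(a, b, given, parents=PARENTS):
    """True if `given` d-separates a from b (moralized ancestral graph test)."""
    keep = ancestors({a, b} | set(given), parents)
    # Moralize: connect every node with its parents and the parents with each other.
    links = {n: set() for n in keep}
    for child in keep:
        family = [child] + parents[child]
        for p in family:
            for q in family:
                if p != q:
                    links[p].add(q)
    # Delete the conditioning set, then look for any undirected path from a to b.
    seen, stack = set(given), [a]
    while stack:
        node = stack.pop()
        if node == b:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(links[node])
    return True

implications = [
    d_separated("Z1", "Z2", []),
    d_separated("Z1", "Y", ["X", "Z2", "Z3"]),
    d_separated("Z2", "X", ["Z1", "Z3"]),
]
```

Each entry of `implications` corresponds to one of the three independence statements listed on the slide, so in a linear version each one maps to a regression coefficient that should vanish.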

IMPLIED MISSPECIFICATION TESTS

[Diagram of Model 1: Z1 → W1 → X; Z1 → Z3 ← Z2; Z2 → W2 → Y; Z3 → X; Z3 → Y; X → W3 → Y]

Question 4: If we regress Z1 on all other variables in the model, which regression coefficients will be zero?
Answer: All but these three: W1, Z3, and Z2 (the children of Z1 and their other parents).

Question 5: If we regress Z1 on Z3 and W1, which regression coefficient will change if we add Y as a regressor?
Answer: The coefficient of Z3 will change and the coefficient of W1 will remain invariant.

Non-invariance may not be misspecification.

CAUSAL CALCULUS AS AN ORACLE FOR INTERVENTIONS

[Diagram of Model 1: Z1 → W1 → X; Z1 → Z3 ← Z2; Z2 → W2 → Y; Z3 → X; Z3 → Y; X → W3 → Y]

Suppose we wish to estimate the average causal effect of X on Y:

$ACE = P(Y = y \mid do(X = 1)) - P(Y = y \mid do(X = 0))$

• Which subsets of variables need to be adjusted to obtain an unbiased estimate of ACE?

• Is there a single variable that, if measured, would allow an unbiased estimate of ACE?
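The first question can be explored by enumerating candidate adjustment sets and testing the back-door criterion on each. This is a sketch under assumptions: it uses a reconstructed edge set for Model 1, restricts the search to sets of at most two pre-treatment variables, and relies on hypothetical helper names; it does not cover the front-door route that a single mediating variable may offer.

```python
from itertools import combinations

# Parent lists assumed for Model 1 (a reconstruction, not quoted from the slide).
PARENTS = {
    "Z1": [], "Z2": [], "Z3": ["Z1", "Z2"],
    "W1": ["Z1"], "W2": ["Z2"], "X": ["W1", "Z3"],
    "W3": ["X"], "Y": ["W3", "Z3", "W2"],
}

def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen

def d_separated(a, b, given, parents):
    """Moralized ancestral graph test for d-separation of a and b by `given`."""
    keep = ancestors({a, b} | set(given), parents)
    links = {n: set() for n in keep}
    for child in keep:
        family = [child] + parents[child]
        for p in family:
            for q in family:
                if p != q:
                    links[p].add(q)
    seen, stack = set(given), [a]
    while stack:
        node = stack.pop()
        if node == b:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(links[node])
    return True

def descendants(node, parents):
    """All variables that have `node` among their ancestors."""
    return {n for n in parents if n != node and node in ancestors([n], parents)}

def backdoor_admissible(x, y, z, parents=PARENTS):
    """Back-door criterion: z has no descendant of x and blocks every path into x."""
    if set(z) & descendants(x, parents):
        return False
    # Remove the arrows leaving x so that only back-door paths can connect x and y.
    cut = {child: [p for p in ps if p != x] for child, ps in parents.items()}
    return d_separated(x, y, z, cut)

candidates = ["Z1", "Z2", "Z3", "W1", "W2"]
admissible = [set(z) for k in range(3) for z in combinations(candidates, k)
              if backdoor_admissible("X", "Y", z)]
```

Under the assumed graph, Z3 alone is not sufficient because conditioning on it opens the path through Z1 and Z2, which is why every admissible pair found here combines Z3 with one more variable.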

Front Door

EFFECT OF WARM-UP ON INJURY (After Shrier & Platt, 2008)

[Figure: causal diagram linking Warm-up Exercises (X) to Injury (Y), with the annotations “No, no!”, “Watch out!” and “???”]

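THE MACHINERY OF CAUSAL CALCULUS

Rule 1: Ignoring observations
P(y | do{x}, z, w) = P(y | do{x}, w) if $(Y \perp Z \mid X, W)_{G_{\overline{X}}}$

Rule 2: Action/observation exchange
P(y | do{x}, do{z}, w) = P(y | do{x}, z, w) if $(Y \perp Z \mid X, W)_{G_{\overline{X}\,\underline{Z}}}$

Rule 3: Ignoring actions
P(y | do{x}, do{z}, w) = P(y | do{x}, w) if $(Y \perp Z \mid X, W)_{G_{\overline{X}\,\overline{Z(W)}}}$

Here $G_{\overline{X}}$ is the graph with all arrows into X removed, $G_{\underline{Z}}$ the graph with all arrows out of Z removed, and Z(W) the set of Z-nodes that are not ancestors of any W-node in $G_{\overline{X}}$.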

DERIVATION IN CAUSAL CALCULUS

[Diagram: Smoking → Tar → Cancer, with Genotype (Unobserved) affecting both Smoking and Cancer]

P(c | do{s}) = Σ_t P(c | do{s}, t) P(t | do{s})    (Probability Axioms)

= Σ_t P(c | do{s}, do{t}) P(t | do{s})    (Rule 2)

= Σ_t P(c | do{s}, do{t}) P(t | s)    (Rule 2)

= Σ_t P(c | do{t}) P(t | s)    (Rule 3)

= Σ_s' Σ_t P(c | do{t}, s') P(s' | do{t}) P(t | s)    (Probability Axioms)

= Σ_s' Σ_t P(c | t, s') P(s' | do{t}) P(t | s)    (Rule 2)

= Σ_s' Σ_t P(c | t, s') P(s') P(t | s)    (Rule 3)

Here s' is a summation index ranging over the values of Smoking, distinct from the intervened value s.
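The end result of the derivation, the front-door formula P(c | do{s}) = Σ_t P(t | s) Σ_s' P(c | t, s') P(s'), can be verified numerically. This is a minimal sketch assuming a binary Smoking → Tar → Cancer model with an unobserved Genotype affecting Smoking and Cancer; every probability below is made up for illustration, and the "truth" is computed from the full model, which an analyst would not have.

```python
from itertools import product

# Hypothetical parameters: each entry is the probability that the variable equals 1.
P_U = {0: 0.6, 1: 0.4}                         # genotype (unobserved): P(U = u)
P_S_GIVEN_U = {0: 0.2, 1: 0.7}                 # smoking: P(S = 1 | u)
P_T_GIVEN_S = {0: 0.1, 1: 0.8}                 # tar: P(T = 1 | s)
P_C_GIVEN_TU = {(0, 0): 0.1, (0, 1): 0.3,      # cancer: P(C = 1 | t, u)
                (1, 0): 0.5, (1, 1): 0.9}

def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1 - p_one

def joint(u, s, t, c):
    """Full joint probability, including the unobserved genotype."""
    return (P_U[u] * bern(P_S_GIVEN_U[u], s)
            * bern(P_T_GIVEN_S[s], t) * bern(P_C_GIVEN_TU[(t, u)], c))

def p_obs(**fixed):
    """Observational probability of values for any subset of s, t, c (u summed out)."""
    total = 0.0
    for u, s, t, c in product((0, 1), repeat=4):
        values = {"s": s, "t": t, "c": c}
        if all(values[name] == v for name, v in fixed.items()):
            total += joint(u, s, t, c)
    return total

def front_door(c, s):
    """P(c | do(s)) computed from observational quantities only."""
    total = 0.0
    for t in (0, 1):
        p_t_given_s = p_obs(s=s, t=t) / p_obs(s=s)
        inner = sum(p_obs(c=c, t=t, s=s2) / p_obs(t=t, s=s2) * p_obs(s=s2)
                    for s2 in (0, 1))
        total += p_t_given_s * inner
    return total

def truth(c, s):
    """P(c | do(s)) from the full model: the equation for S is replaced by S = s."""
    return sum(P_U[u] * bern(P_T_GIVEN_S[s], t) * bern(P_C_GIVEN_TU[(t, u)], c)
               for u, t in product((0, 1), repeat=2))

naive = p_obs(c=1, s=1) / p_obs(s=1)           # P(C = 1 | S = 1), confounded by U
```

With these assumed numbers the front-door estimate coincides with the interventional truth, while the naive conditional probability does not.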

WHAT ELSE CAN CAUSAL CALCULUS DO FOR US?

[Diagram of Model 1: Z1 → W1 → X; Z1 → Z3 ← Z2; Z2 → W2 → Y; Z3 → X; Z3 → Y; X → W3 → Y]

• Equivalent models
• Identifying counterfactual queries (ETT, PC)
• Mediation
• Causes of Effects
• External validity

Finding instruments:
Is there an instrumental variable for the Z3 → Y relationship? Answer: No.
Can we turn Z1 into an IV? Answer: Yes, condition on W1.

WILL HAAVELMO'S LEGACY SURVIVE?

Two attempted hijackings

1. The regressionism assault (1975 - 2000): Economic model is a parameterized family of joint distributions.

THE CAUSAL RENAISSANCE: VOCABULARY IN ECONOMICS

From Hoover (2004) “Lost Causes”

WILL HAAVELMO'S LEGACY SURVIVE?

Two attempted hijackings

1. The regressionism assault (1975 - 2000): Economic model is a parameterized family of joint distributions.

2. The “instrumentalism” assault (1995 - ): Economic analysis is a potential clinical trial with missing data (potential outcomes) and one treatment. Instruments are needed, no structure.

3. The structuralist defense: Finding an instrument requires structural assumptions. The “experimentalists’” choice of language prevents them from making those assumptions transparent.

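FORMULATING ASSUMPTIONS: THREE LANGUAGES

1. English: Smoking (X), Cancer (Y), Tar (Z), Genotypes (U)

[Diagram: X → Z → Y, with unobserved U → X and U → Y]

2. Counterfactuals:

$Z_x(u) = Z_{yx}(u)$
$X_y(u) = X_{zy}(u) = X_z(u) = X(u)$
$Y_z(u) = Y_{zx}(u), \quad Z_x \perp \{Y_z, X\}$

3. Structural:

$x = f_1(u, \epsilon_1)$
$z = f_2(x, \epsilon_2)$
$y = f_3(z, u, \epsilon_3)$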

THE STRUCTURAL-COUNTERFACTUAL SYMBIOSIS

1. Express theoretical assumptions in structural language.

2. Express queries in counterfactual language.

3. Translate (1) into (2) for algebraic analysis, or (2) into (1) for graphical analysis.

4. Use either graphical or algebraic machinery to answer the query in (2).

CONCLUSIONS

• Formal basis for causal and counterfactual inference (complete)
• Unification of the graphical, potential-outcome and structural equation approaches
• Friendly and formal solutions to century-old problems and confusions.

CONCLUSIONS

He is wise who bases causal inference on an explicit causal structure that is defensible on scientific grounds. (After Haavelmo, 1943)

Thank you