1 causality challenge #2: pot-luck isabelle guyon, clopinet constantin aliferis and alexander...

1

Causality challenge #2: Pot-Luck

Isabelle Guyon, Clopinet

Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ.

André Elisseeff and Jean-Philippe Pellet, IBM Zürich

Gregory F. Cooper, University of Pittsburgh

Peter Spirtes, Carnegie Mellon

2

Motivations

• Motivations
• Learning causal structure
  – Cross-sectional studies
    • … from experiments
    • … without experiments
    • Equivalent MB
  – Longitudinal studies
• Bring your own problem(s)

3

Causality Workbench

• February 2007: Project starts. Initial funding of the EU Pascal network.

• August 15, 2007: Two-year grant from the US National Science Foundation.

• December 15, 2007: Workbench goes live. First causality challenge: causation and prediction.

• June 3-4, 2008: WCCI 2008, workshop to discuss the results of the first challenge.

• September 15, 2008: Start pot-luck challenge. Target: NIPS 2008.

• Fall, 2008: Start developing an interactive workbench.

4

Why a new challenge?

• Causality challenge #1: favor “depth”
  – Single well-defined task
  – Rigor of performance assessment

• Causality challenge #2: favor “breadth”
  – Many different tasks
  – Encourage creativity

5

http://clopinet.com/causality


6


Pot-Luck challenge

• CYTO: Causal Protein-Signaling Networks in human T cells. Learn a protein signaling network from multicolor flow cytometry data. N=11 proteins, P~800 samples per experimental condition. E=9 conditions.

• LOCANET: LOcal CAusal NETwork. Find the local causal structure around a given target variable (depth 3 network) in REGED, CINA, SIDO, MARTI.

• PROMO: Simulated marketing task. Time series of 1000 promotion variables and 100 product sales. Predict a 1000x100 boolean influence matrix, indicating for each (i,j) element whether the ith promotion has a causal influence on the sales of the jth product. Data is provided as time series, with a daily value for each variable for three years.

• SIGNET: Abscisic Acid Signaling Network. Determine the set of 43 boolean rules that describe the interactions of the nodes within a plant signaling network. 300 separate Boolean pseudodynamic simulations of the true rules. Model inspired by a true biological system.

• TIED: Target Information Equivalent Dataset. Illustrates a case in which there are many equivalent Markov boundaries. Find them all.

[Slide labels attached to the tasks: “real”, “artif”, “self eval”.]

http://www.causality.inf.ethz.ch/repository.php?id=3

http://www.causality.inf.ethz.ch/data/LOCANET.html








7

Learning causal structure


8

What is causality?

• Many definitions.
• Pragmatic (engineering) view: predicting the consequences of ACTIONS.
• Distinct from making predictions in a stationary environment.
• Canonical methodology: designed experiments.
• Causal discovery from observational data.

9

The “language” of causal Bayesian networks

• Bayesian network:
  – Graph with random variables X1, X2, …, Xn as nodes.
  – Dependencies represented by edges.
  – Allows us to compute P(X1, X2, …, Xn) as $\prod_i P(X_i \mid \mathrm{Parents}(X_i))$.
  – Edge directions have no meaning.

• Causal Bayesian network: edge directions indicate causality.
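The factorization of the joint distribution into one term per node can be sketched in a few lines. This is a minimal illustration, assuming a three-node fragment (Smoking and Genetics as parents of Lung Cancer); all probability values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from itertools import product

# Hypothetical parameters (assumptions, not taken from the slides).
P_SMOKING = {1: 0.3, 0: 0.7}
P_GENETICS = {1: 0.1, 0: 0.9}
# P(LC=1 | S, G), keyed by (s, g).
P_LC1_GIVEN = {(1, 1): 0.8, (1, 0): 0.4, (0, 1): 0.3, (0, 0): 0.05}


def joint(s, g, lc):
    """P(S=s, G=g, LC=lc) as the product of each node given its parents."""
    p_lc1 = P_LC1_GIVEN[(s, g)]
    p_lc = p_lc1 if lc == 1 else 1.0 - p_lc1
    return P_SMOKING[s] * P_GENETICS[g] * p_lc


# The factorized joint sums to one over all configurations.
total = sum(joint(s, g, lc) for s, g, lc in product((0, 1), repeat=3))

# Marginal P(LC=1), obtained by summing out the parents.
p_lc = sum(joint(s, g, 1) for s, g in product((0, 1), repeat=2))
```

Only one small table per node is stored, rather than one number per joint configuration, which is the practical point of the factorization.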

10

Small example: LUCAS0 (natural)

[Figure: causal network over Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, and Fatigue, with the Markov boundary marked.]

11

Arrows indicate “mechanisms”

If Lung Cancer (LC) is determined by Smoking (S) and Genetics (G),

• In the language of BN, use the data table:
  P(LC=1| S=1, G=1)=… , P(LC=0| S=1, G=1)=…
  P(LC=1| S=1, G=0)=… , P(LC=0| S=1, G=0)=…
  P(LC=1| S=0, G=1)=… , P(LC=0| S=0, G=1)=…
  P(LC=1| S=0, G=0)=… , P(LC=0| S=0, G=0)=…

• In the language of Structural Equation Models (SEM), use:
  LC = f(S, G) + noise
  where usually f is a linear function.
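The two ways of writing the same mechanism can be put side by side in code. This is only a sketch: the table entries, the linear weights, and the noise level are hypothetical placeholders, since the slide leaves the actual values unspecified.

```python
import random

# Hypothetical conditional probability table for P(LC=1 | S, G).
CPT = {(1, 1): 0.8, (1, 0): 0.4, (0, 1): 0.3, (0, 0): 0.05}


def lc_from_cpt(s, g, rng):
    """BN view: draw a binary LC from the table row selected by (s, g)."""
    return int(rng.random() < CPT[(s, g)])


def lc_from_sem(s, g, rng, w_s=0.5, w_g=0.3, noise_sd=0.1):
    """SEM view: LC = f(S, G) + noise, with f linear (weights are assumptions)."""
    return w_s * s + w_g * g + rng.gauss(0.0, noise_sd)


rng = random.Random(0)
cpt_draws = [lc_from_cpt(1, 0, rng) for _ in range(2000)]
sem_draws = [lc_from_sem(1, 1, rng) for _ in range(2000)]
sem_mean = sum(sem_draws) / len(sem_draws)
```

In both cases the arrow into LC is a function of its parents plus randomness; the table form suits discrete variables, the equation form continuous ones.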

12

Common simplifications

– Assume a Markov process
– Assume a DAG
– Assume causal sufficiency (no hidden common cause)
– Assume stability or faithfulness (no particular parameterization implying dependencies not reflected by the structure)
– Assume linearity of relationships
– Assume Gaussianity of PDFs
– Discard relationships of low statistical significance
– Focus on a local neighborhood of a target variable
– Learn unoriented or partially oriented graphs
– Assume uniqueness of the Markov boundary

13

How about time?

Cross-sectional study

[Figure: graph over nodes 0 to 11.]

14

How about time?

Cross-sectional study

[Figure: graph over nodes 0 to 11.]

Longitudinal study

[Figure: the node sequence 0 to 11 repeated in four rows.]

15

Learning causal structure from “cross-sectional” studies:

CYTO, LOCANET, TIED


16

Causal models as particular “generative models”

• Imagine we have “prior knowledge” about a few alternative plausible “causal models” (we basically know the architecture).

• Fit the parameters of the model to data.

• Select the model based on goodness of fit (score), perhaps penalizing higher complexity models.

• Could two models have identical scores?
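One concrete case where the answer is yes: for two variables with linear-Gaussian mechanisms, the models X → Y and Y → X reach the same maximized likelihood, so a fit-based score cannot separate them. The sketch below illustrates this on simulated data; the data-generating process and the helper names are assumptions made for the example.

```python
import numpy as np


def gaussian_loglik(residuals):
    """Maximized Gaussian log-likelihood of zero-mean residuals (MLE variance)."""
    n = residuals.size
    var = residuals.var()
    return -0.5 * n * (np.log(2 * np.pi * var) + 1)


def regression_residuals(target, parent):
    """Residuals of a least-squares line fit of target on parent."""
    slope, intercept = np.polyfit(parent, target, 1)
    return target - (slope * parent + intercept)


def score(cause, effect):
    """Log-likelihood of the model cause -> effect: P(cause) * P(effect | cause)."""
    return (gaussian_loglik(cause - cause.mean())
            + gaussian_loglik(regression_residuals(effect, cause)))


rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)  # data truly generated as X -> Y

score_x_to_y = score(x, y)
score_y_to_x = score(y, x)
```

Both models have the same number of parameters, so a complexity penalty does not break the tie either; the two scores agree up to floating-point error.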

17

Key types of causal relationships 1

[Figure: the LUCAS network, highlighting Smoking and Lung Cancer.]

Direct cause

18

[Figure: the LUCAS network, highlighting Anxiety and Lung Cancer.]

Indirect cause (chain): $AN \perp LC \mid S$

19

[Figure: the LUCAS network, highlighting Yellow Fingers and Lung Cancer.]

Confounder (fork): $YF \perp LC \mid S$

20

How this might look in data

[Figure: plot of Lung cancer against Yellow Fingers.]

21

Simpson’s paradox

$YF \perp LC \mid S$

[Figure: plot of Lung cancer against Yellow Fingers, with the Non-smoking and Smoking groups shown separately.]
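The fork Yellow Fingers ← Smoking → Lung Cancer can be simulated to see the association appear in pooled data and vanish within each smoking group. This is a sketch under assumed probabilities (all numbers are hypothetical); it illustrates the conditional independence stated above rather than any particular dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Smoking is the common cause; the two effects never influence each other.
smoking = rng.random(n) < 0.5
yellow = rng.random(n) < np.where(smoking, 0.9, 0.1)
cancer = rng.random(n) < np.where(smoking, 0.6, 0.1)


def corr(a, b):
    """Pearson correlation between two boolean arrays."""
    return float(np.corrcoef(a.astype(float), b.astype(float))[0, 1])


overall = corr(yellow, cancer)                             # pooled data
within_smokers = corr(yellow[smoking], cancer[smoking])    # S = 1 only
within_nonsmokers = corr(yellow[~smoking], cancer[~smoking])  # S = 0 only
```

With these settings the pooled correlation is clearly positive, while the two within-group correlations are close to zero.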

23

[Figure: the LUCAS network, highlighting Allergy and Lung Cancer.]

Collider (V-structure): $AL \not\perp LC \mid C$

24


[Figure: plot of Lung cancer against Allergy.]

25


[Figure: plot of Lung cancer against Allergy, with the Coughing=1 and Coughing=0 groups shown separately.]

26

No Markov equivalence

Colliders (V-structures): X1 → X2 ← Y, with $X_1 \not\perp Y \mid X_2$

$P(X_1, X_2, Y) = P(X_2 \mid X_1, Y)\, P(X_1)\, P(Y)$
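A collider behaves in the opposite way to a fork: its two causes are independent in pooled data and become dependent once the common effect is fixed. The simulation below sketches this for Allergy → Coughing ← Lung Cancer; the probabilities are hypothetical values picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Two independent causes of a common effect.
allergy = rng.random(n) < 0.3
cancer = rng.random(n) < 0.2
coughing = rng.random(n) < np.where(allergy | cancer, 0.9, 0.05)


def corr(a, b):
    """Pearson correlation between two boolean arrays."""
    return float(np.corrcoef(a.astype(float), b.astype(float))[0, 1])


overall = corr(allergy, cancer)                          # marginally independent
given_cough = corr(allergy[coughing], cancer[coughing])  # conditioned on C = 1
```

Among coughing patients, learning that an allergy is present makes lung cancer less likely ("explaining away"), which shows up as a negative correlation. This asymmetry is what lets colliders be oriented from observational data.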

27

Structural methods

1. Build unoriented graph (using conditional independencies).

2. Orient colliders.

3. Add more arrows by constraint propagation without creating new colliders.

[Figure: the graph over nodes 0 to 11, shown twice.]
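The three steps of a structural method can be sketched on a four-variable toy problem. This is a simplified illustration in the spirit of constraint-based algorithms, not a faithful implementation of any published one: the independence test is a crude fixed threshold on partial correlation, the data are simulated from an assumed linear-Gaussian graph 0 → 2 ← 1, 2 → 3, and all function names are hypothetical.

```python
import itertools
import numpy as np


def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


def independent(data, i, j, cond, threshold=0.05):
    # Assumption: a fixed threshold stands in for a proper statistical test.
    return abs(partial_corr(data, i, j, cond)) < threshold


def learn_structure(data, max_cond=2):
    n_vars = data.shape[1]
    pairs = list(itertools.combinations(range(n_vars), 2))
    adjacent = {frozenset(p) for p in pairs}
    sepset = {}
    # Step 1: drop an edge when some conditioning set separates its endpoints.
    for size in range(max_cond + 1):
        for i, j in pairs:
            if frozenset((i, j)) not in adjacent:
                continue
            others = [k for k in range(n_vars) if k not in (i, j)]
            for cond in itertools.combinations(others, size):
                if independent(data, i, j, cond):
                    adjacent.discard(frozenset((i, j)))
                    sepset[frozenset((i, j))] = set(cond)
                    break
    # Step 2: orient a - c - b as a -> c <- b when a, b are non-adjacent
    # and c is not in the set that separated them.
    directed = set()
    for c in range(n_vars):
        nbrs = [k for k in range(n_vars) if frozenset((k, c)) in adjacent]
        for a, b in itertools.combinations(nbrs, 2):
            key = frozenset((a, b))
            if key not in adjacent and c not in sepset[key]:
                directed.update({(a, c), (b, c)})
    # Step 3: if a -> c and c - d is unoriented with a, d non-adjacent,
    # orient c -> d (the other direction would create a new collider).
    changed = True
    while changed:
        changed = False
        for a, c in list(directed):
            for d in range(n_vars):
                if d in (a, c) or frozenset((c, d)) not in adjacent:
                    continue
                if (c, d) in directed or (d, c) in directed:
                    continue
                if frozenset((a, d)) not in adjacent:
                    directed.add((c, d))
                    changed = True
    return adjacent, directed


rng = np.random.default_rng(0)
n = 20_000
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = x0 + x1 + 0.5 * rng.normal(size=n)  # collider: 0 -> 2 <- 1
x3 = x2 + 0.5 * rng.normal(size=n)       # 2 -> 3
skeleton, arrows = learn_structure(np.column_stack([x0, x1, x2, x3]))
```

On this toy graph the collider fixes the orientation of the two edges into node 2, and propagation then orients 2 → 3. In graphs without colliders, step 2 finds nothing and some edges stay unoriented.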

28

… towards CYTO: using experiments to learn the causal structure


29

Manipulating a single variable

[Figure: the LUCAS network with Smoking manipulated.]

1. Direct cause. Smoking manipulated (disconnected from its direct causes): remains predictive of LC.

30

[Figure: the LUCAS network with Anxiety manipulated.]

2. Indirect cause. Anxiety manipulated: remains predictive of Lung Cancer.

31

[Figure: the LUCAS network with Yellow Fingers manipulated.]

3. Consequence of common cause (correlated, but not cause). Yellow Fingers manipulated: no longer predictive of LC.

32

[Figure: the LUCAS network with Genetics manipulated.]

4. Direct cause. Genetics manipulated: remains predictive of LC and AD.

?

33

[Figure: the LUCAS network with Attention Disorder manipulated; label “Direct cause”.]

5. Attention disorder manipulated: no longer predictive of Genetics.
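The manipulation cases above can be reproduced in a small simulation: setting a variable by a coin flip cuts it off from its own causes but leaves its effects intact. The sketch below assumes a reduced network Anxiety → Smoking → {Yellow Fingers, Lung Cancer} with hypothetical probabilities; the `do_*` arguments are an illustrative device, not part of any challenge software.

```python
import numpy as np


def simulate(n, rng, do_smoking=None, do_yellow=None):
    """Sample Anxiety -> Smoking -> {Yellow Fingers, Lung Cancer}.

    Passing do_smoking or do_yellow replaces that variable's natural
    mechanism with externally imposed values (a manipulation).
    """
    anxiety = rng.random(n) < 0.5
    if do_smoking is None:
        smoking = rng.random(n) < np.where(anxiety, 0.8, 0.2)
    else:
        smoking = do_smoking
    if do_yellow is None:
        yellow = rng.random(n) < np.where(smoking, 0.9, 0.1)
    else:
        yellow = do_yellow
    cancer = rng.random(n) < np.where(smoking, 0.6, 0.1)
    return smoking, yellow, cancer


def corr(a, b):
    """Pearson correlation between two boolean arrays."""
    return float(np.corrcoef(a.astype(float), b.astype(float))[0, 1])


rng = np.random.default_rng(0)
n = 100_000
coin = rng.random(n) < 0.5  # values imposed by the experimenter

_, yellow_obs, cancer_obs = simulate(n, rng)
smoking_do, _, cancer_s = simulate(n, rng, do_smoking=coin)
_, yellow_do, cancer_y = simulate(n, rng, do_yellow=coin)

obs_yellow_cancer = corr(yellow_obs, cancer_obs)  # observational data
do_smoking_cancer = corr(smoking_do, cancer_s)    # cause manipulated
do_yellow_cancer = corr(yellow_do, cancer_y)      # non-cause manipulated
```

Manipulated Smoking still predicts Lung Cancer, whereas manipulated Yellow Fingers loses the predictive power it had in observational data.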

34

The CYTO problem

Karen Sachs et al.

[Figure: T-cell signaling pathway with nodes including CD3, CD28, LFA-1, LAT, Lck, VAV, SLP-76, Zap70, Cytohesin, JAB-1, PI3K, PIP2, PIP3, PLC, PKC, PKA, RAS, Raf, Mek1/2, Erk1/2, Akt, MAPKKK, MEK3/6, MEK4/7, p38, and JNK; numbers on the diagram mark where the reagents below act.]

Activators
1. α-CD3
2. α-CD28
3. ICAM-2
4. PMA
5. β2cAMP

Inhibitors
6. G06976
7. AKT inh
8. Psitect
9. U0126
10. LY294002

35

… towards LOCANET: learning the causal structure without experiments to predict the consequences of future actions.


36

What if we cannot experiment?

• Experiments may be infeasible, costly or unethical.

• Using only observations, we may want to predict the effect of new policies.

• Policies may consist of manipulating several variables.

37

Manipulating a few variables

[Figure: LUCAS1 (manipulated): the LUCAS network with a few variables manipulated and the Markov boundary marked.]

38

Manipulating all variables

[Figure: LUCAS2 (manipulated): the LUCAS network with all variables manipulated and the Markov boundary marked.]

39

Causality challenge #1: causation and prediction

• Task: Predict the target (e.g., Lung cancer) in “unmanipulated” or “manipulated” test data.

• Goals:
  – Introduce ML people to causal discovery problems.
  – Investigate ties between causation and prediction.

• Findings:
  – Participants used either causal or non-causal feature selection.
  – Good causal discovery (feature set containing the “manipulated” MB) correlated with good predictions.
  – However, some participants using non-causal feature selection obtained good prediction results.

40

Causality challenge #2: The LOCANET problem

• Task: Find the local causal structure around a given target variable (depth 3 network) in REGED, CINA, SIDO, MARTI.

• Goal: Analyze in finer detail the extent to which causal discovery methods recover the causal structure, and how this affects the prediction of target values.





41

TIED: Equivalent Markov boundaries


42

Equivalent Markov boundaries

Many almost identical measurements of the same (hidden) variable can lead to many statistically indistinguishable Markov boundaries.

[Figure: target Y and its Markov boundary.]

43

Target Information Equivalence (TIE)

Two disjoint subsets of variables V1 and V2 are Target Information Equivalent (TIE) with respect to target Y iff:

• $V_1 \not\perp Y$

• $V_2 \not\perp Y$

• $V_1 \perp Y \mid V_2$

• $V_2 \perp Y \mid V_1$

Alexander Statnikov & Constantin Aliferis
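The four conditions can be checked numerically with plug-in estimates of (conditional) mutual information, treating a value near zero as independence. The sketch below is an illustration on an assumed toy distribution, not the TIED generator: a hidden binary group drives Y, while X1 and X2 each encode the group plus their own private coin flip, so neither determines the other. The helper names and the tolerance `eps` are hypothetical.

```python
import random
from collections import Counter
from math import log


def mutual_information(xs, ys, zs=None):
    """Plug-in estimate of I(X;Y|Z) in nats; Z is optional."""
    n = len(xs)
    zs = zs if zs is not None else [0] * n
    c_xyz = Counter(zip(xs, ys, zs))
    c_xz = Counter(zip(xs, zs))
    c_yz = Counter(zip(ys, zs))
    c_z = Counter(zs)
    return sum(c / n * log(c * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)]))
               for (x, y, z), c in c_xyz.items())


def is_tie(v1, v2, y, eps=1e-3):
    """Check the four TIE conditions, using eps as the independence tolerance."""
    return (mutual_information(v1, y) > eps
            and mutual_information(v2, y) > eps
            and mutual_information(v1, y, v2) < eps
            and mutual_information(v2, y, v1) < eps)


rng = random.Random(0)
x1, x2, x3, y = [], [], [], []
for _ in range(20_000):
    g = rng.randrange(2)                  # hidden group that drives Y
    x1.append(2 * g + rng.randrange(2))   # encodes g plus a private coin
    x2.append(2 * g + rng.randrange(2))   # same g, independent private coin
    x3.append(g if rng.random() < 0.8 else 1 - g)  # noisy copy of g
    y.append(g if rng.random() < 0.9 else 1 - g)

tie_x1_x2 = is_tie(x1, x2, y)  # both carry exactly the information in g
tie_x1_x3 = is_tie(x1, x3, y)  # x3 is strictly less informative than x1
```

Here X1 and X2 are interchangeable for predicting Y, while the noisy X3 is not equivalent to X1 because X1 still adds information once X3 is known.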

44

TIE Data (TIED): Exact equivalence

[Figure: value-level diagram linking X1, X2, X3, X11, and Y.]

Small example of the type of relationships implemented in TIED. The following TIE relations hold in the data:

TIE_Y(X1, X2), TIE_Y(X1, X3), TIE_Y(X1, X11), TIE_Y(X2, X3), TIE_Y(X2, X11), TIE_Y(X3, X11)

TIE_X11(X1, X2), TIE_X11(X1, X3), TIE_X11(X2, X3)

Notice that variables X1, X2, X3, X11, and Y are not deterministically related.

Alexander Statnikov & Constantin Aliferis

45

Learning causal structure from “longitudinal” studies:

SIGNET, PROMO


46

SIGNET: a plant signaling network

• Plants lose water and take in carbon dioxide through microscopic pores.

• During drought, the plant hormone abscisic acid (ABA) inhibits pore opening (important for the genetic engineering of new drought-resistant plants).

• Unraveling the ABA signal transduction network took years of research. A recent dynamic model synthesizes many findings (Li, Assmann, Albert, PLOS, 2006).

• The model is used by Jenkins and Soni to generate artificial data. The problem is to reconstruct the network from the data.

47

Abscisic Acid Signaling Network

Li, Assmann, Albert, PLOS, 2006


48

SIGNET: sample data

[Figure: sample data, shown as rows of binary node states.]

- Boolean model; asynchronous updates
- 43 nodes
- 300 simulations

Example of asynchronous updates for a 4-node network:

[Figure: node states along a time axis.]
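Asynchronous updating means that, within each time step, nodes are updated one at a time in random order, each seeing the values already changed earlier in the same step. The sketch below illustrates this on a 4-node network; the four Boolean rules are hypothetical and unrelated to the actual ABA model.

```python
import random

# Hypothetical rules for a 4-node network (assumptions for illustration).
RULES = {
    "A": lambda s: s["A"],                   # input node keeps its value
    "B": lambda s: s["A"] and not s["D"],
    "C": lambda s: s["A"] or s["B"],
    "D": lambda s: s["C"],
}


def simulate(initial, n_steps, rng):
    """Return the list of network states, one tuple of 0/1 per time step."""
    state = dict(initial)
    trajectory = [tuple(int(state[k]) for k in RULES)]
    for _ in range(n_steps):
        order = list(RULES)
        rng.shuffle(order)  # a fresh random update order at every time step
        for node in order:
            # Each update reads the current state, including nodes
            # already updated earlier in this same step.
            state[node] = bool(RULES[node](state))
        trajectory.append(tuple(int(state[k]) for k in RULES))
    return trajectory


start = {"A": False, "B": True, "C": True, "D": True}
trajectory = simulate(start, n_steps=5, rng=random.Random(0))
```

Because the update order is random, repeated simulations from the same initial state can follow different paths even when they reach the same final state. This is why many separate simulations are informative for recovering the rules.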

49

PROMO: simulated marketing task

• 100 products
• 1000 promotions
• 3 years of daily data
• Goal: quantify the effect of promotions on sales

[Figure: diagram relating promotions to products.]

Jean-Philippe Pellet

50

PROMO: schematically…

The difficulties include:

- non-i.i.d. samples

- seasonal effects

- promotions are binary, sales are continuous

- the problem is more about quantifying the relationships than determining the causal skeleton

[Figure: schematic with 1000 promotions, 100 products, and other variables.]
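One naive baseline for this kind of task is to regress each product's daily sales on all promotions plus seasonal terms, then threshold the fitted coefficients to get a boolean influence matrix. The sketch below runs that idea on a much smaller simulated problem. Everything in it is an assumption made for illustration: the sizes, the same-day additive effects, the yearly sinusoidal season, and the threshold. It is not the PROMO generator or an endorsed solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_promos, n_products = 3 * 365, 5, 3  # scaled-down toy sizes

# Hypothetical ground truth: which promotion influences which product.
true_influence = np.zeros((n_promos, n_products), dtype=bool)
true_influence[0, 0] = true_influence[2, 1] = true_influence[4, 2] = True

days = np.arange(n_days)
promos = (rng.random((n_days, n_promos)) < 0.1).astype(float)  # binary
season = np.sin(2 * np.pi * days / 365)[:, None]               # yearly cycle
sales = (10 + 3 * season + promos @ (2.0 * true_influence)
         + rng.normal(0, 0.5, (n_days, n_products)))           # continuous

# Design matrix: promotions, seasonal terms, and an intercept.
design = np.column_stack([
    promos,
    np.sin(2 * np.pi * days / 365),
    np.cos(2 * np.pi * days / 365),
    np.ones(n_days),
])
coef, *_ = np.linalg.lstsq(design, sales, rcond=None)

effect_sizes = coef[:n_promos]            # one row per promotion
estimated = np.abs(effect_sizes) > 1.0    # boolean influence matrix
```

The fitted coefficients also give the size of each effect, which matters here since the task stresses quantification. Real data would additionally require handling lagged effects and dependence between consecutive days, which this sketch ignores.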

51

Pot-luck challenge: Bring your own problem

• Motivations
• Learning causal structure
  – Cross-sectional studies
    • … from experiments
    • … without experiments
    • Equivalent MB
  – Longitudinal studies
• Bring your own problem

52

From NIPS 2006 workshop…

1. Predict the consequences of a manipulation (similar to a usual predictive modeling task, but the test data is no longer distributed in the same way as the training data; the system undergoes a manipulation to produce the test data).

2. Determine what manipulations are needed to reach a desired system state with maximum probability (e.g., select variables and propose values to achieve a certain value of a response/target variable, with perhaps a cost per variable).

3. Propose system queries to acquire more training data, i.e. design experiments, with perhaps an associated cost per variable and per sample and perhaps with constraints on variables that cannot be controlled.

4. Determine all causal relationships between variables.

5. Determine a local causal region around a response/target variable (causal adjacency).

6. Determine the source cause(s) for a response/target variable.

7. Determine for all variables whether they are, with respect to a response/target variable: cause, effect, consequence of a common cause, cause of a common effect, or unrelated.

8. Predict the existence of unmeasured variables (not part of the set of variables provided in the data), which are potential confounders (are common causes of an observed variable and the target).

9. Predict which variables called “relevant” by feature selection algorithms are potentially causally irrelevant because their correlation to the target is the result of an experimental artifact (e.g., sampling bias or systematic error).

10. Determine a causal order of all variables.

11. Determine a causal direction in time series data in which one variable is causing the other.

12. Determine the direction of time in a time series (mostly of fundamental rather than practical interest).

13. Incorporate prior knowledge in causal discovery.

14. Predict counterfactuals.

53

http://clopinet.com/causality

• September 15, 2008: challenge start.
• October 15, 2008: deadline for (optional) submission of milestone challenge results.
• October 24, 2008: workshop abstracts due.
• November 12, 2008: challenge ends (last day to submit challenge results).
• November 21, 2008: JMLR proceedings paper submission deadline.
• December 12, 2008: challenge results publicly released; workshop.

54

Prizes

• Four prizes (free NIPS workshop entrance or $200).
  – Best solution to one or more problems: 3 prizes.
  – Best problem: 1 prize.
• All competitors must submit a 6-page paper.
• Criteria: performance/usefulness, novelty/originality, sanity, insight, reproducibility, clarity.
