predictive and causal modeling in biomedicine · 2015-09-08 · predictive and causal modeling in...

Predictive and Causal Modeling in the Health Sciences

Sisi Ma MS, MS, PhD. New York University,

Center for Health Informatics and Bioinformatics


Exponentially Rapid Data Accumulation

• 1975: Rapid DNA sequencing
• 1982: GenBank formed
• 1986: Protein sequencing via MS
• 1990: Human Genome Project initiated
• 2003: Completion of human genome sequencing
• 2005: First GWAS study published; NGS
• 2006: TCGA initiated
• 2010: Human Connectome Project
• 2012: Single-cell sequencing
• 2016: TCGA completed (>10,000 tumors)
• Undated on the slide: PDB initiated; 1,000 Genomes initiated


From Data to Discoveries

Predictive Model → Predictive Knowledge → Screening, Diagnostics, Prognostics

Causal Model → Causal Knowledge → Intervention, Therapeutics

Advanced Data Preparation, Analysis and Modeling methods are needed for knowledge discovery in high volume, high variety data. Two key types: Predictive Modeling and Computational Causal Discovery


Talk Outline

• Predictive Modeling
  o Brief Introduction to Predictive Modeling
  o Indicative Case Studies

• Causal Modeling
  o Causal Modeling Using Observational Data
  o Indicative Case Studies
  o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection


Predictive Models: the Goal

Example of Predictive Modeling: Support Vector Machines (SVMs)

Key Characteristics of SVMs

• Maximum gap (margin) to prevent overfitting
• QP problems can be solved with standard methods
• Soft margins to tolerate noise
• Kernel trick for linearly non-separable data

Boser et al., 1992; Statnikov et al., 2011
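For reference, the characteristics listed on this slide correspond to the standard soft-margin SVM formulation. The notation below ($w$, $b$, $\xi_i$, $C$, $\alpha_i$, $K$) follows common textbook usage and is an assumption, not something taken from the slide.

```latex
% Soft-margin primal problem (a quadratic program):
% maximize the margin (minimize ||w||^2) while penalizing
% margin violations xi_i with weight C.
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\quad i = 1,\dots,n.
\end{aligned}

% Kernelized decision function obtained from the dual problem:
f(x) = \operatorname{sign}\Big(\sum_{i=1}^{n}\alpha_i\,y_i\,K(x_i, x) + b\Big),
\qquad 0 \le \alpha_i \le C .
```

The margin term guards against overfitting, the slack variables $\xi_i$ give the soft margin that tolerates noise, and replacing inner products with a kernel $K$ handles data that are not linearly separable.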


Predictive Models: the Goal

Predictive Modeling: a Simplified General Framework


Predictive Modeling: Cross-validation for performance estimation and model selection

Ma et al., 2015 (in preparation)

Talk Outline

• Predictive Modeling
  o Brief Introduction to Predictive Modeling
  o Indicative Case Studies

• Causal Modeling and its Applications
  o Causal Modeling Using Observational Data
  o Indicative Case Studies
  o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection

Predictive Modeling for Post-traumatic Stress

Post-traumatic Stress Response:

• Almost everyone experiences at least one traumatic event in their life.

• Most people display acute stress responses.

• Acute stress responses diminish over time in most individuals, but about 10%-20% of people experience non-remitting stress responses long after the trauma.

• Persistent stress is detrimental to the physiological and psychological well-being of individuals.

Galatzer-Levy et al., 2015; Ma et al., 2015; Galatzer-Levy et al., 2015 (submitted)

Predictive Modeling for Post-traumatic Stress

Discovery Goals/Questions:

• Can we identify the people who will suffer from non-remitting stress responses? If so, can they be identified early enough?

• What types of data need to be collected to identify people who will suffer from non-remitting stress responses?


Predictive Modeling for Post-traumatic Stress

• 166 trauma survivors who were admitted to the ER were followed for up to 4 months after the trauma.

• Patient history, clinical data, stress hormones, and psychiatry-related measurements were collected in the ER and at 1 week, 1 month, and 4 months after the trauma. A total of 135 variables were collected.

Data: [figure: matrix of numeric measurements]

Predictive Modeling for Post-traumatic Stress

Remitting and Non-remitting Post-traumatic Stress Responses (Identified via Latent Growth Mixture Modeling)


Predictive Model for Post-traumatic Stress

Study Design:

• Five predictive models were built using data incorporating increasing amounts of information:
  (1) Background data
  (2) Data collected through the ER
  (3) Data collected through 1 week
  (4) Data collected through 1 month
  (5) Data collected through 4 months

• SVM with feature selection was employed, with 5-fold cross-validation repeated over 10 splits.
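The repeated cross-validation design mentioned above can be sketched as follows. This is a minimal, standard-library illustration of generating 10 repeats of stratified 5-fold splits; the function name `repeated_stratified_kfold`, the seed, and the label vector (166 subjects with an illustrative 28 non-remitting cases) are assumptions, not details from the study.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated stratified k-fold CV."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for rep in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps the class ratio
            for j, i in enumerate(shuffled):
                folds[j % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for f in range(n_splits) if f != k for i in folds[f])
            yield rep, k, train, test


# Hypothetical labels: 1 = non-remitting, 0 = remitting (counts are illustrative)
labels = [1] * 28 + [0] * 138
splits = list(repeated_stratified_kfold(labels))
```

In a design like this, feature selection and any hyperparameter tuning would normally be fit on each training fold only, so that the held-out fold gives an unbiased performance estimate.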


Predictive Modeling for Post-traumatic Stress

• Prediction accuracy increases progressively as data collected at later time points are added to the predictive models.

• Predictivity of the model built with patient background information is statistically significant.

• The model built with patient background information and data collected in the ER has strong enough predictive performance to be clinically useful.

Predictive Modeling for Post-traumatic Stress

Discovery Goals/Questions:

• Can we identify the people who will suffer from non-remitting stress responses? If so, can they be identified early enough?

• What types of data need to be collected to identify people who will suffer from non-remitting stress responses? Specifically, can neuroendocrine levels predict non-remitting post-traumatic stress?


Predictive Modeling for Post-traumatic Stress

• The neuroendocrine data studied contain limited information about non-remitting stress responses.

• Except at the ER time point, combining neuroendocrine data with other data (clinical information, psychiatric surveys) does not significantly increase the predictivity of the models.

Other Case Studies for Predictive Modeling

• Predicting Cancer Patient Outcome

• Predicting Neural Activity in the Dorsolateral Striatum

• Predicting Transposon Insertion


Other Case Studies for Predictive Modeling

Predicting Cancer Patient Outcome

• Problem: Determine the most informative data modality for predicting cancer patient outcome.

• Data: 47 datasets/predictive tasks that together span 9 data modalities: copy number, gene expression, protein expression, microRNA expression, imaging, GWAS, somatic mutation, methylation, and clinical information.

• Conclusion: Gene expression is generally the most informative data modality. Combining different data modalities does not increase predictive performance.

Ray MS, Henaff MS, Aliferis PhD, Statnikov PhD @ NYU

Ray et al., 2014

Other Case Studies for Predictive Modeling

Predicting Neural Activity in the Dorsolateral Striatum (DLS)

• Problem: Predict neural activity from movement data.

• Data: Single-neuron activity in the DLS; head movement tracking data.

• Model: Linear-Nonlinear-Poisson model to predict neural activity from the head movement profile of the animal and the spike history of the neuron.

• Result: Neural activity was reconstructed in a subpopulation of the neurons.

David Barker PhD @ NIDA

Ma and Barker, 2014

Other Case Studies for Predictive Modeling

Predicting Transposon Insertion

• Problem: Identify transposon insertion locations in the genome.
• Data: Targeted sequencing data.
• Model: Train a logistic regression model on a set of annotated transposon insertion sites and apply the model for de novo insertion identification.
• Result: More than 95% of the de novo insertions identified by the model were validated by experiments.

Zuojian Tang MS, David Fenyo PhD, Jeff Boeke PhD @ NYU Langone, Kathleen Burns @ JHU

Talk Outline

• Predictive Modeling
  o Brief Introduction to Predictive Modeling
  o Indicative Case Studies

• Causal Modeling
  o Causal Modeling Using Observational Data
  o Indicative Case Studies
  o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection

Causal Modeling: the Goal


Causal Modeling: the Goal


Causal Modeling: Causal Graphs Capture Direct and Indirect Relationships

Causal Modeling: V-structures, a Common Technique for Orienting Causal Relationships

Causal Modeling: PC Algorithm, a Prototypical Causal Discovery Algorithm

PC algorithm: Skeleton Discovery

Spirtes et al., 1993

Causal Modeling: PC Algorithm

PC algorithm: Skeleton Discovery, Trace
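Since the slide's trace figure is not preserved, here is a minimal sketch of PC-style skeleton discovery. It assumes a user-supplied conditional-independence test `indep(x, y, S)`; the function name `pc_skeleton`, the toy graph (A → C ← B, C → D), and the hard-coded independence oracle are illustrative assumptions, not the slide's example.

```python
from itertools import combinations


def pc_skeleton(variables, indep):
    """Remove edges x - y whenever some conditioning set makes x and y independent."""
    adj = {v: set(variables) - {v} for v in variables}  # start fully connected
    sepset = {}
    size = 0
    # Grow the conditioning-set size until no adjacent pair has enough neighbours
    while any(len(adj[x] - {y}) >= size for x in variables for y in adj[x]):
        for x in variables:
            for y in sorted(adj[x]):
                for S in combinations(sorted(adj[x] - {y}), size):
                    if indep(x, y, set(S)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(S)
                        break
        size += 1
    return adj, sepset


# Hypothetical oracle for the toy graph A -> C <- B, C -> D
FACTS = {
    (frozenset("AB"), frozenset()),
    (frozenset("AD"), frozenset("C")),
    (frozenset("AD"), frozenset("BC")),
    (frozenset("BD"), frozenset("C")),
    (frozenset("BD"), frozenset("AC")),
}


def oracle(x, y, S):
    return (frozenset((x, y)), frozenset(S)) in FACTS


adj, sepset = pc_skeleton(["A", "B", "C", "D"], oracle)
# adj keeps only A - C, B - C and C - D
```

With real data the oracle would be replaced by a statistical test, and the recorded separating sets are what the later orientation step relies on.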

Causal Modeling: PC Algorithm

PC algorithm: Orientation
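The first orientation step in PC-style algorithms uses v-structures (unshielded colliders). The sketch below assumes a skeleton `adj` and separating sets `sepset` are already available; both inputs, the function name `orient_v_structures`, and the toy graph are hypothetical.

```python
from itertools import combinations


def orient_v_structures(adj, sepset):
    """Return directed edges (parent, child) implied by unshielded colliders."""
    directed = set()
    for z in adj:
        for x, y in combinations(sorted(adj[z]), 2):
            # Unshielded triple x - z - y: x and y are not adjacent.
            # If z was not needed to separate x and y, z must be a collider.
            if y not in adj[x] and z not in sepset[frozenset((x, y))]:
                directed.add((x, z))
                directed.add((y, z))
    return directed


# Hypothetical skeleton and separating sets for A -> C <- B, C -> D
adj = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}
sepset = {
    frozenset("AB"): set(),
    frozenset("AD"): {"C"},
    frozenset("BD"): {"C"},
}
arrows = orient_v_structures(adj, sepset)
# arrows == {("A", "C"), ("B", "C")}
```

The C - D edge is left undirected by this step alone; further propagation rules are typically applied afterwards to orient edges like it.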


Causal Modeling: HITON-PC Algorithm

[figure: example causal graph over nodes A, B, C, D, E, and target T]

• Local causal discovery method
• Easily extended for global causal discovery with the LGL framework

Aliferis et al., 2010

Causal Modeling: HITON-PC Algorithm

Trace of HITON-PC

[figure: example causal graph over nodes A, B, C, D, E, and target T]

Causal Modeling: Semi-Interleaved HITON-PC, a More Efficient Implementation

• Efficient and robust.
• Scalable to very BIG DATA.
• Easily extended for global causal discovery with the LGL framework.
• An instantiation of the GLL framework.
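A rough sketch of the semi-interleaved scheme, as it is commonly described for HITON-PC, may help here: candidates enter in order of association strength, only the newly added variable is checked on entry, and a full elimination pass runs at the end. The `assoc` and `indep` callables, the `max_k` cap on conditioning-set size, and the toy graph (D → A → T → B → C, with E unrelated) are all assumptions for illustration, not details from the slide.

```python
from itertools import combinations


def separable(x, target, pool, indep, max_k):
    """True if some subset of pool (size <= max_k) makes x independent of target."""
    pool = sorted(pool)
    for k in range(min(max_k, len(pool)) + 1):
        for S in combinations(pool, k):
            if indep(x, target, set(S)):
                return True
    return False


def semi_interleaved_hiton_pc(target, candidates, assoc, indep, max_k=3):
    # Candidates with a univariate association, strongest first
    open_list = sorted(
        (v for v in candidates if not indep(v, target, set())),
        key=lambda v: assoc(v, target),
        reverse=True,
    )
    tpc = []  # tentative parents-and-children set
    for x in open_list:
        # Semi-interleaved: test only the newly considered variable on entry
        if not separable(x, target, tpc, indep, max_k):
            tpc.append(x)
    for x in list(tpc):
        # Final pass: re-test every member against the rest of the set
        if separable(x, target, [v for v in tpc if v != x], indep, max_k):
            tpc.remove(x)
    return set(tpc)


# Hypothetical oracle: C is screened off from T by B, D by A, E is unrelated
BLOCKED_BY = {"C": "B", "D": "A"}


def indep(x, target, cond):
    if x == "E":
        return True
    return x in BLOCKED_BY and BLOCKED_BY[x] in cond


strength = {"A": 0.9, "B": 0.8, "C": 0.5, "D": 0.4, "E": 0.0}
pc_set = semi_interleaved_hiton_pc(
    "T", ["A", "B", "C", "D", "E"], lambda v, t: strength[v], indep
)
# pc_set == {"A", "B"}
```

If an indirect variable such as C happens to rank first, it enters the tentative set and is removed only in the final pass, which is why that pass is needed.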

Talk Outline

• Predictive Modeling
  o Brief Introduction to Predictive Modeling
  o Indicative Case Studies

• Causal Modeling
  o Causal Modeling Using Observational Data
  o Indicative Case Studies
  o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection

Causal Modeling for Post-traumatic Stress Study

• 166 trauma survivors who were admitted to the ER were followed for up to 4 months after the trauma.

• Patient history, clinical data, stress hormones, and psychiatry-related measurements were collected in the ER and at 1 week, 1 month, and 4 months after the trauma. A total of 135 variables were collected.

Data: [figure: matrix of numeric measurements]

Galatzer-Levy et al., 2015; Ma et al., 2015; Galatzer-Levy et al., 2015 (submitted)

Causal Model for Post-traumatic Stress

Causal Discovery Question:

• What are the factors determining non-remitting stress responses?

Analysis Design:

• Apply a local causal discovery algorithm (HITON-PC) to find the parent-children sets for all measured variables.

• A global causal graph depicting the relationships among all measured variables was constructed using the local-to-global (LGL) framework.

• Edges were oriented according to the time at which individual variables were measured.
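The time-based orientation step can be illustrated with a short sketch: an edge is directed from the variable measured earlier to the one measured later, since a later measurement cannot cause an earlier one. The function name `orient_by_time`, the variable names, and the measurement times (in days) are hypothetical.

```python
def orient_by_time(edges, time_of):
    """Direct each undirected edge from the earlier-measured variable to the later one."""
    directed, undirected = [], []
    for a, b in edges:
        if time_of[a] < time_of[b]:
            directed.append((a, b))
        elif time_of[b] < time_of[a]:
            directed.append((b, a))
        else:
            # Same time point: temporal order cannot decide the direction
            undirected.append((a, b))
    return directed, undirected


# Hypothetical variables with measurement time in days after the trauma
time_of = {
    "er_heart_rate": 0,
    "er_cortisol": 0,
    "week1_symptoms": 7,
    "month1_symptoms": 30,
    "nonremitting": 120,
}
edges = [
    ("week1_symptoms", "er_heart_rate"),
    ("month1_symptoms", "nonremitting"),
    ("er_heart_rate", "er_cortisol"),
]
directed, undirected = orient_by_time(edges, time_of)
```

Edges between variables measured at the same time point stay undirected under this rule and would need another source of orientation.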


Causal Modeling for Post-traumatic Stress

The Global Causal Graph

A very complicated model!

Causal Modeling for Post-traumatic Stress

Example Causal Path Leading to non-remitting Stress Responses


Causal Modeling for Post-traumatic Stress

Potential intervention for non-remitting Stress Responses


Causal Modeling for Post-traumatic Stress

Potential Intervention for non-remitting Stress Responses


Talk Outline

• Predictive Modeling
  o Brief Introduction to Predictive Modeling
  o Indicative Case Studies

• Causal Modeling
  o Causal Modeling Using Observational Data
  o Indicative Case Studies
  o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection

Causal Model Guided Experimental Minimization and Adaptive Data Collection

Goals:

• Reduce the number of experiments that experimentalists need to do in order to fully resolve a biological pathway (or other complex set of causal interactions among variables of interest).

• Reduce time to discovery.

• Reduce costs.

Causal Model Guided Experimental Minimization and Adaptive Data Collection

Special Importance In Health Sciences with both omics data and clinical data:

• One variable can be univariately associated with hundreds to thousands of variables:
  – Drivers: direct and indirect
  – Passengers
  – Effects

• High degree of multiplicity.

• Classical statistical techniques exhibit both increased false positives and increased false negatives.

Causal Model-Guided Experimental Minimization and Adaptive Data Collection

Simplified view of the Framework:


Causal Model Guided Experimental Minimization and Adaptive Data Collection

The ODLP Algorithm:

Output:

• Local causal pathway (parents and children) of the variable of interest.

Two Phases:

• Identify local causal pathway consistent with the data and information equivalent clusters.

• Adaptively recommend experiments to perform, integrate experimental results to refine and orient the local causal pathway.

Statnikov et al., 2015 (Accepted)

Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP: Pseudo Code

[figure: pseudocode listing]

Causal Model Guided Experimental Minimization and Adaptive Data Collection

The ODLP Algorithm Phase I:

• Identify local causal pathway consistent with the data and information equivalent clusters (TIE*, iTIE* algorithms).


Causal Model Guided Experimental Minimization and Adaptive Data Collection

The ODLP Algorithm Phase I: iTIE*


Causal Model Guided Experimental Minimization and Adaptive Data Collection

The ODLP Algorithm Phase II:

• Adaptively recommend experiments to perform, and integrate experimental results to refine and orient the local causal pathway (i.e., identify causes, effects, and passengers).

Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP: Identifying effects

• Manipulate T and obtain experimental data D_E.

• Mark all variables in V that change in D_E due to manipulation of T as effects.

Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP: direct and indirect effects

• Select an effect variable X that has been marked neither as an indirect effect nor as a direct effect.

• Manipulate X and obtain experimental data D_E.

• Mark all effect variables that change in D_E due to manipulation of X and belong to the same equivalence cluster as indirect effects.

• The last effect variable in an equivalence cluster that is not marked as an indirect effect is a direct effect.
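The effect-identification steps can be sketched as follows. This is only one reading of the slide's rules, not the published ODLP implementation: experimental outcomes are represented by a hypothetical lookup `changed_by[v]` (the set of variables that change when `v` is manipulated), and the toy pathway T → B → C → D, T → E, A → T with its equivalence clusters is invented for illustration.

```python
def classify_effects(target, clusters, changed_by):
    """Split the effects of `target` into direct and indirect, cluster by cluster."""
    effects = set(changed_by[target])  # one experiment: manipulate the target
    experiments = [target]
    direct, indirect = set(), set()
    for cluster in clusters:
        in_cluster = sorted(v for v in cluster if v in effects)
        tried = set()
        while True:
            unmarked = [v for v in in_cluster if v not in indirect]
            if len(unmarked) == 1:
                # Last effect in the cluster not marked indirect: direct effect
                direct.add(unmarked[0])
                break
            untried = [v for v in unmarked if v not in tried]
            if not untried:
                break  # no effects in this cluster, or left unresolved
            x = untried[0]
            tried.add(x)
            experiments.append(x)  # manipulate x
            indirect |= {v for v in in_cluster if v in changed_by[x] and v != x}
    return effects, direct, indirect, experiments


# Hypothetical experiment outcomes for T -> B -> C -> D, T -> E, A -> T
changed_by = {
    "T": {"B", "C", "D", "E"},
    "B": {"C", "D"},
    "C": {"D"},
    "D": set(),
    "E": set(),
    "A": {"T", "B", "C", "D", "E"},
}
clusters = [{"B", "C", "D"}, {"E"}, {"A"}]
effects, direct, indirect, experiments = classify_effects("T", clusters, changed_by)
```

In this toy run only two manipulations (T and B) are needed, because the "last unmarked effect" rule resolves B and E without further experiments.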


Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP: Identifying Passengers

• Select an unmarked variable X from an equivalence cluster.

• Manipulate X and obtain experimental data D_E.

• If T does not change in D_E due to manipulation of X, mark X as a passenger and mark all other non-effect variables that change in D_E due to manipulation of X as passengers; otherwise mark X as a cause.

Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP: Identifying Causes

• For every cause X, mark X as a direct cause if there exists no other cause in the same equivalence cluster that changes due to manipulation of X; otherwise mark X as an indirect cause.

• If there is an equivalence cluster that contains a single unmarked variable X and all marked variables in this cluster (if any) are only passengers and/or effects, then mark X as a direct cause.
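The passenger and cause rules can be sketched in the same spirit. Again this is an interpretation under stated assumptions rather than the published algorithm: `changed_by[v]` is a hypothetical record of which variables change when `v` is manipulated, and the toy pathway (D → A → T → B, A → P → R, Q → T) and its clusters are invented.

```python
def classify_non_effects(target, clusters, effects, changed_by):
    """Label non-effect variables as passengers, direct causes or indirect causes."""
    passengers, causes, assumed_direct, experiments = set(), set(), set(), []
    for cluster in clusters:
        candidates = sorted(v for v in cluster if v not in effects)
        for x in candidates:
            if x in passengers or x in causes:
                continue
            unmarked = [v for v in candidates if v not in (passengers | causes)]
            # Lone unmarked variable, rest of the cluster only passengers/effects:
            # take it as a direct cause without running an experiment
            if unmarked == [x] and not (causes & set(candidates)):
                assumed_direct.add(x)
                continue
            experiments.append(x)  # manipulate x
            if target in changed_by[x]:
                causes.add(x)
            else:
                passengers.add(x)
                passengers |= changed_by[x] - effects - {target}
    direct, indirect = set(assumed_direct), set()
    for cluster in clusters:
        found = causes & set(cluster)
        for x in found:
            # A cause that moves another cause in its cluster acts through it
            if changed_by[x] & (found - {x}):
                indirect.add(x)
            else:
                direct.add(x)
    return direct, indirect, passengers, experiments


# Hypothetical outcomes for D -> A -> T -> B, A -> P -> R, Q -> T
changed_by = {
    "A": {"T", "P", "R", "B"},
    "D": {"A", "T", "P", "R", "B"},
    "P": {"R"},
    "R": set(),
    "Q": {"T", "B"},
}
clusters = [{"A", "D", "P", "R"}, {"Q"}, {"B"}]
direct, indirect, passengers, experiments = classify_non_effects(
    "T", clusters, {"B"}, changed_by
)
```

Here R is marked as a passenger from the experiment on P, and Q is taken as a direct cause by the single-unmarked-variable rule, so neither needs its own experiment.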


Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP vs Other Algorithms: Performance on Simulated Data

• Benchmark study

• 58 algorithms/variants from 4 algorithm families.

• 11 networks of different sizes.

Statnikov et al., 2015 (Accepted)

Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP vs Other Algorithms: Network Reconstruction Quality


Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP vs Other Algorithms: Reconstruction Quality & Efficiency


Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP vs Other Algorithms: Scalability


Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP vs Other Algorithms: Performance on Real Biological Data

Ma et al., 2015 (submitted)

Causal Model Guided Experimental Minimization and Adaptive Data Collection

ODLP vs Other Algorithms: Performance on Real Biological Data


Summary

• Predictive Modeling
  o Brief Introduction to Predictive Modeling
  o Indicative Case Studies

• Causal Modeling
  o Causal Modeling Using Observational Data
  o Indicative Case Studies
  o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection

Future directions

• Improve existing algorithms (e.g., relax some application assumptions).

• Design and implement analysis pipelines that can be used by non-experts.

• Disseminate software and analytics packages.

• Apply these techniques broadly in different domains.

• Educate researchers about the capabilities (and limitations) as well as the proper use of these and related methods.