Download - MLconf NYC Samantha Kleinberg
Causal Inference from Uncertain Time Series Data
Samantha KleinbergStevens Institute of Technology
http://discoverysedge.mayo.edu/artificial-pancreas/
insulindependence.orginfluxis.com gigaom.comNate Heintzman @ UCSD, Dexcom
TimeBlood Glucose
Heart Rate
Skin Temp.
Energy Use Weight
7:207:25 95.7 26.80 37.237:30 90.5 28.63 12.857:35 82.4 29.69 6.787:40 79.2 30.32 6.417:45 77.0 30.55 8.177:50 73.3 31.19 6.87 1557:55 72.7 31.29 6.088:00 130.0 71.4 31.96 9.748:05 203.5 89.8 31.59 24.588:10 208.2 87.9 31.67 10.638:15 209.0 84.8 31.16 15.638:20 207.8 81.0 31.11 11.748:25 205.9 83.7 31.26 16.288:30 204.8 98.3 31.05 23.598:35 214.9 82.2 30.47 11.668:40 225.7 82.9 30.95 16.938:45 231.4 0.18:50 232.8 89.6 29.67 21.968:55 239.4 81.2 30.81 15.86
TimeBlood Glucose
Heart Rate
Skin Temp.
Energy Use Weight
7:207:25 95.7 26.80 37.237:30 90.5 28.63 12.857:35 82.4 29.69 6.787:40 79.2 30.32 6.417:45 77.0 30.55 8.177:50 73.3 31.19 6.87 1557:55 72.7 31.29 6.088:00 130.0 71.4 31.96 9.748:05 203.5 89.8 31.59 24.588:10 208.2 87.9 31.67 10.638:15 209.0 84.8 31.16 15.638:20 207.8 81.0 31.11 11.748:25 205.9 83.7 31.26 16.288:30 204.8 98.3 31.05 23.598:35 214.9 82.2 30.47 11.668:40 225.7 82.9 30.95 16.938:45 231.4 0.18:50 232.8 89.6 29.67 21.968:55 239.4 81.2 30.81 15.86
Errors
TimeBlood Glucose
Heart Rate
Skin Temp.
Energy Use Weight
7:207:25 95.7 26.80 37.237:30 90.5 28.63 12.857:35 82.4 29.69 6.787:40 79.2 30.32 6.417:45 77.0 30.55 8.177:50 73.3 31.19 6.87 1557:55 72.7 31.29 6.088:00 130.0 71.4 31.96 9.748:05 203.5 89.8 31.59 24.588:10 208.2 87.9 31.67 10.638:15 209.0 84.8 31.16 15.638:20 207.8 81.0 31.11 11.748:25 205.9 83.7 31.26 16.288:30 204.8 98.3 31.05 23.598:35 214.9 82.2 30.47 11.668:40 225.7 82.9 30.95 16.938:45 231.4 0.18:50 232.8 89.6 29.67 21.968:55 239.4 81.2 30.81 15.86
Errors
Missing Data
TimeBlood Glucose
Heart Rate
Skin Temp.
Energy Use Weight
7:207:25 95.7 26.80 37.237:30 90.5 28.63 12.857:35 82.4 29.69 6.787:40 79.2 30.32 6.417:45 77.0 30.55 8.177:50 73.3 31.19 6.87 1557:55 72.7 31.29 6.088:00 130.0 71.4 31.96 9.748:05 203.5 89.8 31.59 24.588:10 208.2 87.9 31.67 10.638:15 209.0 84.8 31.16 15.638:20 207.8 81.0 31.11 11.748:25 205.9 83.7 31.26 16.288:30 204.8 98.3 31.05 23.598:35 214.9 82.2 30.47 11.668:40 225.7 82.9 30.95 16.938:45 231.4 0.18:50 232.8 89.6 29.67 21.968:55 239.4 81.2 30.81 15.86
Errors
Missing Data
Different timescales
Discretization
Logic-based causal inference
• Complex, temporal relationships
• Potential causes raise probability of and are earlier than effects
Kleinberg, S. (2012) Causality, Probability, and Time.
Which causes are significant?
• Main idea: looking for better explanations for the effect
• Assess average difference cause makes to probability of effect
• Determine which ε are statistically significant
Causes from EHR data
dx_urinary_symptoms
first_antihtn_combo
dx_overweight
dx_diabetes
dx_hypothyroidism
0510152025
8
9
17
17
18
11
10
22
22
21
Months before CHF diagnosis
S. Kleinberg and N. Elhadad (2013) Lessons Learned in Replicating Data-Driven Experiments in Multiple Medical Systems and Patient Populations. AMIA Annual Symposium.
Adding uncertainty
D. Hutchison and S. Kleinberg (2013) Causality and Experimentation in the Sciences.
Carb-heavy meal (c), vigorous exercise (e), blood glucose (g)
Calculate:
What’s the effect of exercise on glucose?
Calculate: Strict discretization:
What’s the effect of exercise on glucose?
Calculate: Strict discretization:Probabilistic discretization:
What’s the effect of exercise on glucose?
Validation
Simulated structures• Randomly generated
relationships• 6 levels of uncertainty• 20K observations
Results on simulated data
Causes of changes in glucose
Cohort: 17 subjects with T1DMSensor data (collected for >72 hours)– Glucose values– Insulin dosage– Activity– Sleep stage– Heart rate– Temperature
With N. Heintzman (UCSD,Dexcom) http://dial.ucsd.edu/what-we-do.php
Results
very vigorous exercise leads to hyperglycemia (fdr <.01) in 5-30 minutes– Found using both HR (anaerobic activity zone) and
METs
• Not found with strict discretization• Supported by literature
(Marliss and Vranic, 2002; Riddell and Perkins, 2006)
S. Kleinberg and N. Heintzman. Inference of Causal Relationships from Uncertain Data and Application to Type 1 Diabetes. (under review)
Open problems
• Latent variables• Nonstationary time series• Real-time prediction/explanation• Simulation
Thanks!
• NSF/CRA Computing Innovation fellowship
• NLM Computational Thinking contract
• NLM R01
• Columbia– Noemie Elhadad– George Hripcsak– Mitzi Morris– Rimma Pivovarov
• Geisinger– Buzz Stewart– Craig Wood
• UCSD– Nathaniel Heintzman
• Stevens– Dylan Hutchison