H2O for Medicine and Intro to H2O in Python

Post on 12-Jan-2017


Category:

Software


H2O.ai Machine Intelligence

Machine Learning for Medicine
With an Intro to H2O in Python

Erin LeDell Ph.D.

Introduction

• Statistician & Machine Learning Scientist at H2O.ai in Mountain View, California, USA
• Ph.D. in Biostatistics with Designated Emphasis in Computational Science and Engineering from UC Berkeley (focus on Machine Learning)
• Worked as a data scientist at several startups
• Written a handful of machine learning R packages

H2O.ai

H2O Company
• Team: 45. Founded in 2012, Mountain View, CA
• Stanford Math & Systems Engineers

H2O Software
• Open Source Software
• Ease of Use via Web Interface
• R, Python, Scala, Spark & Hadoop Interfaces
• Distributed Algorithms Scale to Big Data

H2O.ai Founders

SriSatish Ambati
• CEO and Co-founder at H2O.ai
• Past: Platfora, Cassandra, DataStax, Azul Systems, UC Berkeley

Dr. Cliff Click
• CTO and Co-founder at H2O.ai
• Past: Azul Systems, Sun Microsystems
• Developed the Java HotSpot Server Compiler at Sun
• PhD in CS from Rice University

Scientific Advisory Council

Dr. Trevor Hastie
• John A. Overdeck Professor of Mathematics, Stanford University
• PhD in Statistics, Stanford University
• Co-author, The Elements of Statistical Learning: Data Mining, Inference, and Prediction
• Co-author with John Chambers, Statistical Models in S
• Co-author, Generalized Additive Models
• 108,404 citations (via Google Scholar)

Dr. Rob Tibshirani
• Professor of Statistics and Health Research and Policy, Stanford University
• PhD in Statistics, Stanford University
• COPSS Presidents’ Award recipient
• Co-author, The Elements of Statistical Learning: Data Mining, Inference, and Prediction
• Author, Regression Shrinkage and Selection via the Lasso
• Co-author, An Introduction to the Bootstrap

Dr. Stephen Boyd
• Professor of Electrical Engineering and Computer Science, Stanford University
• PhD in Electrical Engineering and Computer Science, UC Berkeley
• Co-author, Convex Optimization
• Co-author, Linear Matrix Inequalities in System and Control Theory
• Co-author, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers

Agenda

• Motivation
• Medical Machine Learning
• H2O Machine Learning
• H2O Python Notebook Demo

Motivation

Part 1 of 4

Machine Learning for Medicine

Precision Medicine

• Data-Driven
• Preventative Care
• Personalized

• “The Future of Medicine is Precision Health”
• President Obama’s Precision Medicine Initiative
• Similar efforts at Stanford, UC Berkeley, UCSF

Electronic Health Records

Genomic Data

Genomics pioneer, Jun Wang, is one of China’s most famous scientists.

Dr. Wang recently stepped down as head of BGI to create an AI health-monitoring system that would identify relationships between individual human genomic data, physiological traits (phenotypes) and lifestyle choices in order to provide advice on healthier living and to predict, and prevent, disease.

Source: BGI

Wearables

ABI Research has projected that by 2016, wearable wireless medical device sales will reach more than 100 million devices annually.

Open mHealth

Open mHealth is a nonprofit start-up breaking down the barriers to integration and bringing clinical meaning to digital health data.

• FitBit
• HealthVault
• Jawbone
• RunKeeper
• Withings
• HealthKit
• HL7 EHR

Medical ML

Part 2 of 4

Machine Learning for Medicine

Machine Learning in Medicine

Diagnostic Testing
• Predict Viral Failure in AIDS patients
• Parkinson’s disease progression prediction from mobile phone accelerometer data
• Personalized diagnostics

Oncology
• Clinical research: Identify which genes are associated with breast cancer relapse
• Prognosis: Predict probability of survival in 5 years

Medical Imaging
• Clinical research: MRI and PET scans & Deep Learning
• Cellular image analysis: genotype, phenotype, classification, identification, cell tracking

Remote Patient Monitoring
• Real-time predictions using data from wearables
• Medication adherence monitoring

H2O Platform

Part 3 of 4

Machine Learning for Medicine

H2O Software

H2O is an open source, distributed, Java machine learning library.

APIs are available for: R, Python, Scala & JSON
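As a rough illustration of the JSON interface, the sketch below queries a running H2O node over HTTP using only the Python standard library. It assumes a node is listening on the default address `localhost:54321` and that the cluster-status endpoint is `/3/Cloud`; the helper names `h2o_rest_url` and `cloud_status` are hypothetical and not part of H2O.

```python
import json
from urllib.parse import urlunsplit
from urllib.request import urlopen


def h2o_rest_url(endpoint, host="localhost", port=54321):
    """Build the URL of an H2O REST endpoint such as "3/Cloud".

    host and port are assumptions: they match H2O's usual local default.
    """
    path = "/" + endpoint.lstrip("/")
    return urlunsplit(("http", f"{host}:{port}", path, "", ""))


def cloud_status(host="localhost", port=54321, timeout=5):
    """Fetch and parse the JSON cluster status from a running H2O node.

    This only works if an H2O node is actually running at host:port.
    """
    url = h2o_rest_url("3/Cloud", host, port)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running, `cloud_status()` returns a dictionary describing the cluster. The R, Python and Scala clients wrap the same kind of HTTP calls, so in practice most users would call `h2o.init()` from the Python package rather than talk to the REST layer directly.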

H2O Software Overview

Speed Matters!
• Time is valuable
• In-memory is faster
• Distributed is faster
• High speed AND accuracy

No Sampling
• Scale to big data
• Access data links
• Use all data without sampling

Interactive UI
• Web-based modeling with H2O Flow
• Model comparison

Cutting-Edge Algorithms
• Suite of cutting-edge machine learning algorithms
• Deep Learning & Ensembles
• NanoFast Scoring Engine

Current Algorithm Overview

Statistical Analysis
• Linear Models (GLM)
• Cox Proportional Hazards
• Naïve Bayes

Ensembles
• Random Forest
• Distributed Trees
• Gradient Boosting Machine
• R Package - Super Learner Ensembles

Deep Neural Networks
• Multi-layer Feed-Forward Neural Network
• Auto-encoder
• Anomaly Detection
• Deep Features

Clustering
• K-Means

Dimension Reduction
• Principal Component Analysis
• Generalized Low Rank Models

Solvers & Optimization
• Generalized ADMM Solver
• L-BFGS (Quasi Newton Method)
• Ordinary Least-Square Solver
• Stochastic Gradient Descent

Data Munging
• Integrated R-Environment
• Slice, Log Transform
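For readers working from Python, several of the algorithms listed above are exposed as estimator classes in the `h2o` package. The lookup table below is a sketch of that mapping; the `ESTIMATORS` dictionary and the `load_estimator` helper are hypothetical conveniences, and the import happens lazily so the table can be inspected even where `h2o` is not installed.

```python
import importlib

# Slide name -> (module, class name) in the h2o Python package.
# Only a subset of the slide's algorithms is listed here.
ESTIMATORS = {
    "Linear Models (GLM)": ("h2o.estimators.glm", "H2OGeneralizedLinearEstimator"),
    "Naive Bayes": ("h2o.estimators.naive_bayes", "H2ONaiveBayesEstimator"),
    "Random Forest": ("h2o.estimators.random_forest", "H2ORandomForestEstimator"),
    "Gradient Boosting Machine": ("h2o.estimators.gbm", "H2OGradientBoostingEstimator"),
    "Deep Learning": ("h2o.estimators.deeplearning", "H2ODeepLearningEstimator"),
    "K-Means": ("h2o.estimators.kmeans", "H2OKMeansEstimator"),
}


def load_estimator(name):
    """Import and return the estimator class for a slide algorithm name.

    Raises KeyError for names not in ESTIMATORS and ImportError if the
    h2o package is not installed.
    """
    module_name, class_name = ESTIMATORS[name]
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

For example, `load_estimator("Gradient Boosting Machine")(ntrees=50)` would construct a GBM estimator, provided `h2o` is installed.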

http://h2o.ai/download

https://github.com/h2oai/h2o-3

EEG Demo

Part 4 of 4

Machine Learning for Medicine

EEG for Eye Detection

Problem
• Goal is to accurately predict the eye state using minimal, surface-level EEG data.
• Binary outcome: Open vs Closed

Data
• Data from the Emotiv EEG Neuroheadset.
• Predictor variables describe signals from 14 EEG channels placed on the surface of the head.

Source: http://archive.ics.uci.edu/ml/datasets/EEG+Eye+State
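The problem setup above can be sketched in the H2O Python API roughly as follows. This is not the demo notebook's code: the file path is supplied by the caller, the response column name `eyeDetection` is an assumption based on the UCI file, and the 70/15/15 split and `ntrees=100` are arbitrary illustrative choices. Running `train_eye_state_gbm` requires the `h2o` package and Java; the helper `predictor_columns` is plain Python.

```python
def predictor_columns(columns, response):
    """Return every column name except the response column."""
    return [name for name in columns if name != response]


def train_eye_state_gbm(data_path, response="eyeDetection", seed=1):
    """Train a GBM to classify eye state (open vs closed) from EEG channels.

    data_path: path or URL of the EEG eye-state file (caller-supplied).
    response:  name of the binary outcome column (assumed, not from the slides).
    """
    # Imported lazily so this module loads even without h2o installed.
    import h2o
    from h2o.estimators.gbm import H2OGradientBoostingEstimator

    h2o.init()  # start or connect to a local H2O cluster
    data = h2o.import_file(data_path)

    # Treat the outcome as categorical so H2O fits a classifier.
    data[response] = data[response].asfactor()

    # 70% train, 15% validation, remaining 15% test.
    train, valid, test = data.split_frame(ratios=[0.7, 0.15], seed=seed)

    model = H2OGradientBoostingEstimator(ntrees=100, seed=seed)
    model.train(
        x=predictor_columns(data.columns, response),
        y=response,
        training_frame=train,
        validation_frame=valid,
    )
    # Performance is measured on rows the model has not seen.
    return model, model.model_performance(test_data=test)
```

The 14 EEG channel columns become the predictors simply by excluding the response, which keeps the sketch independent of the exact channel names in the file.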

EEG Data in H2O Flow

H2O Python Demo

https://github.com/h2oai/h2o-3/blob/master/h2o-py/demos/H2O_tutorial_eeg_eyestate.ipynb

For comparison, there is a scikit-learn version: https://github.com/h2oai/h2o-3/blob/master/h2o-py/demos/EEG_eyestate_sklearn_NOPASS.ipynb
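To compare two implementations on equal footing, their predictions on the same held-out rows can be scored with one library-independent metric. The sketch below computes accuracy and confusion counts in plain Python. It is not taken from either notebook, and the encoding (1 = eye closed, 0 = eye open) is an assumption about how the labels are stored.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary outcome."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(actual, predicted):
    """Fraction of rows where the predicted eye state matches the actual one."""
    counts = confusion_counts(actual, predicted)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no rows to score")
    return (counts["tp"] + counts["tn"]) / total


# Toy labels for illustration only (1 = closed, 0 = open, assumed encoding).
actual = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]
print(confusion_counts(actual, predicted))  # {'tp': 2, 'fp': 1, 'tn': 1, 'fn': 1}
print(accuracy(actual, predicted))          # 0.6
```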

H2O on Kaggle

https://www.kaggle.com/mlandry

• H2O starter scripts available on Kaggle
• H2O is used in many competitions on Kaggle
• Mark Landry, H2O Data Scientist and Competitive Kaggler

Where to learn more?

• H2O Online Training (free): http://learn.h2o.ai
• H2O Slidedecks: http://www.slideshare.net/0xdata
• H2O Video Presentations: https://www.youtube.com/user/0xdata
• H2O Community Events & Meetups: http://h2o.ai/events
• Machine Learning & Data Science courses: http://coursebuffet.com

H2O Booklets

https://github.com/h2oai/h2o-3/tree/master/h2o-docs/src/booklets/v2_2015/PDFs/online

Questions?

@ledell on Twitter, GitHub

erin@h2o.ai

http://www.stat.berkeley.edu/~ledell
