ai pharma the low hanging fruit - 27 july 2017

AI Pharma Innovation Summit Boston, July 2017

Shaun Comfort, MD, MBA

Associate Director of Risk Management, Genentech, A Member of the Roche Group

This presentation represents the opinions of Dr. Comfort, and not those of Genentech, A Member of the Roche Group.

Agenda


• Early Members of the Cast
• The Global ICSR Funnel
• So Is there a Problem?
• Parallel vs Serial ICSR Detection
• A Better Way to Dig Through Data
• Facilitators and Limitations to Adoption
• Success Factors and Cautions
• A Peek Into the Future?
• Conclusions

Early Contributors to the Field


Ensemble Modeling: Making Better Classifiers and Better Forecasts

Image Source: Google Images

Machine Learning

Source: Downloaded Google Images

Unlike traditional programming (aka “coding”), Supervised ML uses a set of input data and the answers (aka “output”, “response”, etc.) to build a program
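To make the contrast concrete, here is a minimal sketch in Python. It is a toy illustration, not a PV system: the numeric inputs, the answers, and the single-threshold "model" are all hypothetical. The hand-coded rule is written by a person, while the learned rule is derived from inputs plus answers.

```python
# Traditional programming: a person writes the rule directly.
def hand_coded_rule(x):
    return x >= 5.0


# Supervised ML (toy version): search for the threshold that best
# reproduces the known answers. The data below are made up.
def fit_threshold(inputs, answers):
    best_t, best_correct = None, -1
    for t in sorted(set(inputs)):
        correct = sum((x >= t) == y for x, y in zip(inputs, answers))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


inputs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
answers = [False, False, False, True, True, True]

threshold = fit_threshold(inputs, answers)


def learned_rule(x):
    # The "program" that the data produced.
    return x >= threshold


print("learned threshold:", threshold)
```

Real systems replace the threshold search with far richer model families, but the inputs-plus-answers pattern is the same.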

The Global ICSR Funnel

[Figure: the global ICSR funnel. Raw material from spontaneous reports, clinical trials, social media, and PSPs flows into ICSR identification.]

Today, it’s no longer just clinical trials and spontaneous sources!

Sources never seem to go away!

How Big Can the Global Funnel Be?


Use Fermi decomposition to estimate the range of likely answers:

Fermi Estimate of Global Annual ICSR Volume (Single Iteration)

Quantity                     Min     Max     Gm    Exp     SD    RND
# Global BioPharma Co         50     350    132    155     50    174
# Molecules or Drugs/Co        1     350     19     71     58    110
# ICSRs/Molecule/yr           10   5,000    224    984    832    963

# ICSRs/Year (000,000s): 18.5
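The arithmetic behind the table can be sketched in a few lines of Python. The Gm column matches the geometric mean of Min and Max (for example, sqrt(50 × 350) ≈ 132), so the sketch assumes that reading. It also assumes the single-iteration total is the product of the three RND draws. The variable names are illustrative only.

```python
from math import prod, sqrt

# (min, max) ranges copied from the table.
RANGES = {
    "companies": (50, 350),
    "molecules_per_company": (1, 350),
    "icsrs_per_molecule_per_year": (10, 5000),
}


def geometric_mean(lo, hi):
    # Midpoint on a log scale; assumed to be what the "Gm" column reports.
    return sqrt(lo * hi)


gm = {name: geometric_mean(lo, hi) for name, (lo, hi) in RANGES.items()}

# Point estimate: multiply the geometric means of the three factors.
point_estimate = prod(gm.values())

# Single iteration: multiply the three random draws shown in the RND column.
rnd_draws = (174, 110, 963)
single_iteration = prod(rnd_draws)

print(f"geometric-mean estimate: {point_estimate:,.0f} ICSRs/year")
print(f"single random iteration: {single_iteration:,} ICSRs/year")
```

The product of the displayed draws is about 18.4 million, close to the 18.5 on the slide; the small gap is presumably due to rounding of the displayed draws.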

How Big Can the Funnel Be?


Augment with a Monte-Carlo simulation of the Fermi ranges:

~8 M
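A Monte-Carlo pass over the same ranges might look like the sketch below. The slide does not say which distribution was sampled, so the log-uniform draw between Min and Max is an assumption, as are the iteration count and the seed. Under that assumption the mean lands in the single-digit millions, the same order of magnitude as the ~8 M shown, while the median is much lower because the distribution is heavily right-skewed.

```python
import math
import random
import statistics

# (min, max) ranges from the Fermi table: companies, molecules/company,
# ICSRs/molecule/year.
RANGES = [(50, 350), (1, 350), (10, 5000)]


def draw_log_uniform(rng, lo, hi):
    # Assumption: each factor is uniform on a log scale between min and max.
    return math.exp(rng.uniform(math.log(lo), math.log(hi)))


def simulate(n_iterations=100_000, seed=2017):
    rng = random.Random(seed)  # fixed seed so reruns give the same numbers
    totals = []
    for _ in range(n_iterations):
        total = 1.0
        for lo, hi in RANGES:
            total *= draw_log_uniform(rng, lo, hi)
        totals.append(total)
    return totals


totals = simulate()
mean_total = statistics.fmean(totals)
median_total = statistics.median(totals)

print(f"mean   ~ {mean_total / 1e6:.1f} M ICSRs/year")
print(f"median ~ {median_total / 1e6:.2f} M ICSRs/year")
```

A different distributional choice (triangular, lognormal, and so on) would shift these figures, which is why the result is best read as a range rather than a single number.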

What Are the Resource Implications?


• Assume the following:

Unit                                     Assumption
PV Case Specialist Hours / AER           1 to 4 hrs, tg ≈ 2 hrs
Hours / Year / FTE                       ≈ 1800
2016 Annual Volume (mid-large Pharma)    ≈ 250K Cases
Resulting FTE Needs                      ≈ 280 FTEs

Typically highly trained individuals with health care backgrounds (e.g., RNs, PharmDs, MDs, etc.)

• Now extrapolate this to the level of the Global ICSR Funnel!

• Costs scale with resources, so the next picture should get your attention…

The Implications Globally…


At 1 M AERs, a firm would need 1,000 processors

Please note: logarithmic scale
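The staffing figures follow from simple arithmetic, sketched here in Python. The function name and defaults are illustrative; the inputs (about 2 hours per AER and about 1,800 working hours per FTE per year) are the assumptions stated on the previous slide.

```python
def fte_needed(cases_per_year, hours_per_case=2.0, hours_per_fte_year=1800.0):
    # Total processing hours divided by the hours one person works per year.
    return cases_per_year * hours_per_case / hours_per_fte_year


mid_large_pharma = fte_needed(250_000)   # 2016 volume from the slide
one_million_aers = fte_needed(1_000_000)

print(f"250K cases/year -> about {mid_large_pharma:.0f} FTEs")
print(f"1M cases/year   -> about {one_million_aers:.0f} FTEs")
```

This gives roughly 278 FTEs for 250K cases, in line with the ≈ 280 quoted, and a little over 1,100 for 1 M AERs, the same order of magnitude as the 1,000 processors on the chart.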

So Is There a Problem?

Image Sources: Low Tech Cartoon.com & WSJ Health-Care Tech Insert June 26, 2017

So Here We Are…


• Data increasing, resources limited, and costs approaching the pain point

• We’ve seen this movie before…

• In addition, we are measuring how good humans are at many PV-related tasks (e.g., AE identification, Causality Judgments, etc.)

• The results suggest room for improvement:
  • AE classification is unreliable (Forster, et al. 2012)
  • Current methods of AE detection overestimate the risk (Forster, et al. 2006)
  • The majority of expedited IND safety reports are uninformative (Jarow, et al. Clin Cancer Res; 22(9) May 1, 2016)

How AI can help


• AI (specifically Machine Learning) can perform “deep” reading of medical literature, social media posts, and other AE data

• Multiple companies are testing AI systems for PV tasks such as case intake and AE identification because these systems:

• Can perform at similar levels to human SMEs with consistency, never get tired, and continue learning…

• Can be deployed in parallel ‘ensembles’, greatly increasing ICSR identification accuracy and positive predictive values

Another Member of the AI Universe: Machine Learning with Random Forests

Image Source: Google Images

Parallel vs. Serial ICSR Detection


Manual ICSR detection typically involves human SMEs sifting through raw material (e.g., medical literature, digital media posts, incoming AEs, etc.).

Individuals identify AEs meeting ICSR criteria, with a final secondary QC review for accuracy.

This approach can have unforeseen consequences for accurate identification.

Two-step review can have only 81% sensitivity even if both reviewers operate at 90% sensitivity!

Three-step review is even worse: 73%! How?

[Figure: serial review workflow (Data Received → SME Review → Secondary QC → ICSR Determination), with an “Ensemble” label]

Parallel vs. Serial ICSR Detection


System accuracy (i.e., Sensitivity): Serial system ≤ individual component accuracies

Parallel system ≥ Individual component accuracies

The False Negative Rates (i.e., misses) = 1 – Sensitivity!

[Figure: incoming ICSR raw material routed through SME 1, SME 2, and SME 3 (each with Pd = 0.9), once in series and once in parallel]

Serial: $P_s = \prod_{i=1}^{n} P_i = 0.9^3 \approx .73$

Parallel: $P_s = 1 - \prod_{i=1}^{n} (1 - P_i) = 1 - 0.1^3 = .999$, i.e. above .99
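The serial and parallel figures can be checked with a short Python sketch. It assumes the reviewers detect cases independently of one another, which is what the product formulas imply; the function names are illustrative.

```python
from math import prod


def serial_sensitivity(detection_probs):
    # A case survives a serial chain only if every reviewer catches it.
    return prod(detection_probs)


def parallel_sensitivity(detection_probs):
    # A case is missed in parallel only if every reviewer misses it.
    return 1 - prod(1 - p for p in detection_probs)


two_step = serial_sensitivity([0.9, 0.9])
three_step = serial_sensitivity([0.9, 0.9, 0.9])
three_parallel = parallel_sensitivity([0.9, 0.9, 0.9])

print(f"two-step serial:   {two_step:.2f}")
print(f"three-step serial: {three_step:.3f}")
print(f"three in parallel: {three_parallel:.3f}")
print(f"serial miss rate:  {1 - three_step:.3f}")
```

The false negative rate of the three-step serial chain is about 27%, against about 0.1% for the parallel arrangement. In practice reviewers are rarely fully independent, so real gains would likely be smaller than this idealized calculation suggests.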

So Why Doesn’t Everyone…?


Source: http://www.predictiveanalyticsworld.com/patimes/predictive-analytics-basics-six-introductory-terms-five-effects/8110/

So why doesn’t everyone implement parallel ICSR detection processes?

Human SME parallel reviews are costly and time consuming

However, an AI detection system can have multiple parallel classification models (i.e., ensemble model)

[Figure: an ensemble model combining six parallel models]
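As a rough illustration of how several models can be combined, the Python sketch below runs three stand-in classifiers over the same text and merges their votes in two ways. The keyword rules are hypothetical placeholders for trained models, and the two combination rules (any-vote and majority-vote) are common choices rather than anything the slide prescribes.

```python
from collections import Counter

# Hypothetical stand-ins for trained classifiers: each takes text and
# returns True if it believes the text describes an adverse event.
MODELS = [
    lambda text: "rash" in text,
    lambda text: "nausea" in text or "rash" in text,
    lambda text: len(text.split()) > 3 and "after" in text,
]


def ensemble_any(models, text):
    # Flag the text if at least one model flags it (the parallel rule).
    return any(model(text) for model in models)


def ensemble_majority(models, text):
    # Flag the text only if more models vote yes than no.
    votes = Counter(model(text) for model in models)
    return votes[True] > votes[False]


posts = [
    "developed a rash after taking the tablet",
    "felt nausea",
    "great weather today",
]

for post in posts:
    print(post, "|", ensemble_any(MODELS, post), ensemble_majority(MODELS, post))
```

The any-vote rule favors sensitivity, while the majority rule trades some sensitivity for fewer false alarms; which one suits a PV workflow is a design decision.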
















A Better Way to Dig through Data

[Figure: the AER data ‘haystack’, now vs. future]

Facilitators/Limitations to Adoption

• Facilitators:
  • Bringing quantitative skills into our PV skill kit
  • Taking calculated risks with new techniques and platforms
  • Adopting proactive internal change vs waiting for external forced change

• Limitations:
  • If it works, don’t ‘fix it’
  • NIH: “Not invented here”
  • “Regulators won’t support it”
  • Don’t rock the boat

So Can We Really Do This? The Good News – Yes!!

Current POCs using commercial AI systems show Gwet Kappa of ≥ 0.7 (substantial agreement) to 0.8 (near perfect agreement) with human SMEs for ICSR detection

MLPA techniques have been used with great success in many industries (e.g., Finance, Aerospace, etc.)

The Health Care Industry (i.e., Medicine, HC Delivery, Pharma, Med Dev, etc.) can derive similar benefits with appropriate adoption of this technology

The Bad News: Garbage In “Still” = Garbage Out (GIGO)

• Not even super-AI can develop meaningful insights from trash
• Invest in collecting/cleaning your data appropriately!!!
• Solid, clean annotated/curated data is “gold dust” for predictive modelling. Treat it as such!!
• Compare your model results to human subject matter expert performance whenever possible; this is your best ‘ground truth’
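One way to compare model output with human SME judgments is a chance-corrected agreement statistic. The Python sketch below assumes that the "Gwet Kappa" quoted earlier refers to Gwet's AC1 coefficient, and it covers only the simple case of two raters and binary labels. The label vectors are made-up illustrations, not POC data.

```python
def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 for two raters and binary (0/1) labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items where the two raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Average proportion of positive labels across both raters.
    pi_pos = (sum(rater_a) + sum(rater_b)) / (2 * n)
    # Chance agreement for two categories under Gwet's definition.
    chance = 2 * pi_pos * (1 - pi_pos)
    return (observed - chance) / (1 - chance)


# Hypothetical labels: 1 = "this is an ICSR", 0 = "not an ICSR".
model_labels = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
sme_labels = [1, 1, 0, 0, 0, 0, 0, 1, 0, 1]

ac1 = gwet_ac1(model_labels, sme_labels)
print(f"Gwet AC1 = {ac1:.3f}")
```

In this toy example the raters agree on 8 of 10 items, which yields an AC1 of about 0.62, below the 0.7 to 0.8 range reported for the POCs.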

Deep Learning with DNNs

Image Source: Google Images https://image.slidesharecdn.com/ and https://qph.ec.quoracdn.net/


A Peek Into the Future?


[Figure: Schrodinger’s Equation, annotated “= 78”]

Thank You!

Questions?

Selected References


• Web, Graph, and Picture Images: Downloaded from Google Images & other websites (outlined on slides)
• Rise of the Robots. Martin Ford, Basic Books 2015.
• Healthcare Management Engineering. Alexander Kolker, Springer 2012.
• Quantitative Evaluation of Safety in Drug Development. Qi Jiang and H. Amy Xia, CRC Press 2015.
• Machine Learning with Random Forests and Decision Trees. Scott Hartshorn, Amazon.
• Our Final Invention. James Barrat, Thomas Dunne Books 2013.
• Cobert’s Manual of Drug Safety and PV. 2nd Ed. Barton Cobert, Jones and Bartlett Learning 2012.
• A Reality Check for IBM’s AI Ambitions. David H. Freedman, MIT Technology Review, June 27, 2017.
