Pharmacoepidemiology – Big data, Big problems, Big solutions

James Brophy MD FRCP PhD
McGill University Health Center, McGill University, Montreal, Quebec

Réseau Québécois de Recherche sur les Médicaments
Session II: Big Data: a Quebec gold mine to exploit
June 1, 2015




Conflicts of Interest

I have no known conflicts associated with this presentation and, to the best of my knowledge, am equally disliked by all pharmaceutical and device companies.

http://www.nofreelunch.org/


Outline - agenda

•  In the context of pharmacoepidemiology
•  What are big data?
•  What are the big problems with big data?
•  Are there innovative solutions to these problems?


What is the definition of big data?

•  Something that
  –  doesn't fit into Excel (65,536-row limit in legacy versions)
  –  makes you say "wow"
  –  makes you uncomfortable working with it
  –  only applies to genomics
•  Wikipedia
  –  Big data is high-volume, high-velocity, and/or high-variety information to enable enhanced decision making, insight discovery and process optimization.


How big is big data?


Just because it’s big, is it right?

http://oig.ssa.gov/sites/default/files/audit/full/pdf/A-06-14-34030_0.pdf

According to Social Security records, over 6 million Americans have reached the age of 112. Just 13 are claiming benefits, and 67,000 of them are WORKING.


More big data hubris

1.  2008 stock market crash – lots of economic data, but incorrect models failed to predict the crash and even facilitated it (Black Swan – N. Taleb)

2.  Google - “…we can accurately estimate the current level of weekly influenza activity in each region of the United States, with a reporting lag of about one day.” (Nature 2009)


More big data hubris

•  Google Flu was wrong for 100 out of 108 weeks since August 2011

•  Error was a systematic over-estimate (Science Mar 14 2014)


So the big question…

•  It is not the volume, velocity or variety of the data that is the problem, but rather its VERACITY

•  Also a problem for pharmacoepidemiology?


Pharmacoepidemiology 2010


2010

•  Both studies used the UK GPRD database (1996-2006 and 1995-2005)
•  BMJ – RR 2
•  JAMA – RR 1.07


Me, too


2 RAMQ cohorts


NEJM RCT 2014


NEJM RCT 2014


Problems with Big Data

•  Most big data is observational -> biases (selection, information) and confounding

•  Big data -> small random errors, tight CIs, small p values, but systematic errors not measured in these CIs -> false sense of precision

•  Big data often leads to ignoring other pertinent evidence that should be synthesized to reach the most reasonable conclusions
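The second point above (tight CIs that say nothing about systematic error) can be illustrated numerically. What follows is a minimal sketch under a hypothetical scenario that is not taken from the presentation: the true risk difference is assumed to be zero, while uncontrolled confounding is assumed to shift the observed risks to 5% and 4% regardless of sample size. The function name `risk_difference_ci` and all numbers are illustrative assumptions.

```python
import math

def risk_difference_ci(p_exposed, p_unexposed, n_per_group, z=1.96):
    """Wald 95% CI for a risk difference with equal group sizes."""
    diff = p_exposed - p_unexposed
    se = math.sqrt(
        p_exposed * (1 - p_exposed) / n_per_group
        + p_unexposed * (1 - p_unexposed) / n_per_group
    )
    return diff - z * se, diff + z * se

# Hypothetical numbers: the true effect is 0, but confounding makes the
# observed risks 5% (exposed) vs 4% (unexposed) at every sample size.
TRUE_EFFECT = 0.0
for n in (1_000, 100_000, 1_000_000):
    low, high = risk_difference_ci(0.05, 0.04, n)
    covers = low <= TRUE_EFFECT <= high
    print(f"n={n:>9,}: 95% CI ({low:+.4f}, {high:+.4f}) covers truth: {covers}")
```

The interval narrows as n grows, but it narrows around the biased estimate, so in this sketch the largest samples exclude the true value with the greatest apparent confidence.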


Principles for working with big data

•  Government
  –  Privacy / Accessibility
  –  Integrity of the data
•  Researchers
  –  Privacy / security
  –  Processing the data (design, analysis, model selection)
  –  Interpreting the results: epistemologically important to distinguish information (data), knowledge (causal inferences) & wisdom (systematic incorporation of all knowledge)


Learning from Big Data

•  More than big data, we need better data, rich in important confounders
•  Need better research designs, especially experimental data
•  Need better appreciation of the quantitative sciences (uncertainty, causal inference)
•  Need "domain knowledge": specific clinical information
•  Must incorporate prior evidence.
  –  If good prior data exist, use informative priors
  –  If there are very little data, use agnostic/uniform prior beliefs
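The difference between informative and agnostic priors can be sketched with a conjugate normal update on the log relative-risk scale. This is only an illustration under stated assumptions: the study result (RR = 2.0, log-scale standard error 0.30) and both priors are hypothetical, the helper `posterior_log_rr` is invented for this example, and the "agnostic" prior is approximated by a very wide normal rather than a truly uniform distribution.

```python
import math

def posterior_log_rr(prior_mean, prior_sd, data_mean, data_se):
    """Conjugate normal update on the log relative-risk scale."""
    prior_prec = 1.0 / prior_sd**2
    data_prec = 1.0 / data_se**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical new study: RR = 2.0 with a standard error of 0.30 on the log scale.
data_mean, data_se = math.log(2.0), 0.30

# Informative prior: earlier evidence centred on RR = 1.0 (log RR = 0, sd 0.15).
informative = posterior_log_rr(0.0, 0.15, data_mean, data_se)
# Near-agnostic prior: same centre, but so wide that it carries almost no weight.
agnostic = posterior_log_rr(0.0, 10.0, data_mean, data_se)

for label, (mean, sd) in (("informative", informative), ("agnostic", agnostic)):
    print(f"{label:>11} prior -> posterior RR {math.exp(mean):.2f} (log-scale sd {sd:.2f})")
```

With the informative prior the posterior is pulled strongly toward the earlier evidence; with the near-agnostic prior it essentially reproduces the new study.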


What is the purpose of pharmacoepidemiology?

•  Patterns of drug utilization
•  Generating new information on drug safety
•  Supplementing premarketing effectiveness studies
  –  different populations, better precision
•  However, the overall purpose is to provide insights or causal inferences, not merely associations generated from large data sets.


Estimating causal effects

1.  Randomized Experiments
2.  Natural Experiments
3.  Instrumental Variables
4.  Regression Discontinuity
5.  Difference in Differences


An example


Results


Problems

•  Not sure of the benefit in the North American context
•  Changing everyone in Quebec to ticagrelor would cost $25 million
•  Doing a large conventional RCT could cost $10-50 million
•  What to do?


Using big data effectively

•  Most of the cost is for the follow-up
•  We have excellent administrative databases with reliable measures of death and cardiac outcomes, so we could minimize costs
•  Need to avoid selection bias, so we could randomize at the start and then simply observe
•  New design – randomized registry – can answer the question at a reasonable cost


Conclusion

•  Instead of focusing on a "big data revolution," a better goal is an "all data revolution," including replication
•  Recognize that the critical change has been innovative designs and analytics, which can be applied to both traditional and new data
•  Big data is an aid to thinking, not a substitute for thinking
•  Goal of this revolution is to provide a deeper, clearer understanding of our world.

Science March 14 2014


Thank you


Learning from big data

•  Must incorporate prior evidence.
  –  If good prior data exist, use informative priors
  –  If there are very little data, use agnostic or uniform prior beliefs
•  In all cases, must be able to specify where you are and why; if an agnostic approach is used, a validation study is needed
•  Avoid confusing prior beliefs with prior evidence -> biases


How Much Data is There?

•  2.5 quintillion bytes of data were generated every day in 2012
•  As much data is now generated in just two days as was created from the dawn of civilization until 2003.

Harvard Business Review, Dec 2014


•  Where things go wrong is where tools of this kind are used not as an aid to thinking but as a substitute for thinking. When the information provided is used (this was one of David Ogilvy’s favourite quotations) “… as a drunk uses a lamppost: for support rather than illumination.”


What can big data find in healthcare?


Big data & inferences

Washington Post, March 21


What is the correct inference?

•  Americans spend too much on gambling and too much on the important stuff of politics

•  Americans spend too much on gambling and not enough on the important stuff of politics

•  Americans don’t spend too much on gambling but spending on politics is out of control


Looking in detail

•  Consider there are 316 million Americans
•  Basketball: 13% gambled, average bet $200
•  Elections: 80% of adults, average $25
•  Elections: 1% of 1% of the population (31,600) spent 28% or $2 B, average contribution $64,000
•  A very small sample of Americans is controlling the election process
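The arithmetic behind the "1% of 1%" bullet can be checked in a few lines. This sketch uses only the figures stated on the slide; the variable names are illustrative. The computed average comes out at roughly $63,000, close to the slide's $64,000, presumably because the slide's inputs are rounded.

```python
# Figures as stated on the slide.
population = 316_000_000         # 316 million Americans
top_donor_share = 0.01 * 0.01    # "1% of 1%" of the population
top_donor_spend = 2_000_000_000  # $2 B
top_donor_fraction = 0.28        # their share of all election spending

top_donors = round(population * top_donor_share)
average_contribution = top_donor_spend / top_donors
implied_total_spend = top_donor_spend / top_donor_fraction

print(f"Top donors: {top_donors:,}")
print(f"Average contribution: ${average_contribution:,.0f}")
print(f"Implied total election spending: ${implied_total_spend / 1e9:.1f} B")
```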


How unequal?


Do statins increase or decrease the risk of cancer?


NO

YES


Maybe neither

Maybe this is an isolated case and dates from 2007. Surely we are better today.


Do statins cause diabetes?


Do statins cause diabetes?


Statins & diabetes, Who do you believe?

•  Both studies published in May 2013
•  Both studies published in high impact journals
•  Both used validated administrative datasets
•  Both published by renowned investigators


Statins & diabetes, Who do you believe?

•  Even more confusing & troublesome
•  Both used THE SAME validated administrative datasets (Ontario)
•  Both used essentially THE SAME patients (>65, no diabetes, new statin users from 1997 (2004) - 2010)
•  Both sets of authors are from THE SAME academic institution (Sunnybrook, U of T)


Adaptive randomization & ethics


•  In the end, it seems doubtful that adaptive allocation generally improves risk/benefit for patients.

•  Require larger sample sizes -> more patients, more research procedures, more visits.

•  Since costs scale with sample sizes, it means more resources are consumed in answering a single research question than with a fixed 1:1 design.


Adaptive randomization & ethics

•  Does outcome-adaptive allocation better accommodate clinical equipoise and promote informed consent?

•  Does adaptive allocation offer a "partial remedy" for the therapeutic misconception associated with fixed randomization?


Arguing against

•  Hey and Kimmelman suggest that adaptive designs do not improve risk–benefit for subjects but increase total burden for both patients and research systems by demanding larger sample sizes.

•  Suggest that they redistribute rather than dissolve tensions in informed consent

•  Suggest that they may have validity problems


A source of bias?

•  Given that the odds of receiving the better treatment will improve over the course of the trial
•  It is in the best interests of patient-subjects (and the physicians advocating on their behalf) to wait and enroll as late as possible
•  So later patients may be healthier (less urgency to participate) -> a predictable time-trend in the study population increases the risk of bias


Example # 3


Example # 3

We have reached a threshold such that time to reperfusion no longer matters, provided < 90 minutes, and we now need to look elsewhere for improvements.


Results


16 minute improvement


No improvement, really?

•  Adjusted mortality has declined from 5% to 4.7%, p=0.34, but what would the CI tell us?
•  Back-of-the-envelope calculation: a 0.3% improvement with 95% CI from -0.1% to +0.7%
•  In other words, this small improvement in time is consistent with an up to 7/1000 absolute survival benefit (about 2800 annually) or 14% relative decrease in mortality, and is entirely consistent with previous research
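One way to check a back-of-the-envelope interval like this is to recover the standard error from the reported p-value. The sketch below assumes a two-sided test with a normal approximation (an assumption; the study's adjusted model is not reproduced here) and uses only the 5% and 4.7% mortality figures and p = 0.34 from the slide. The helper `ci_from_p_value` is illustrative. Under these assumptions the interval comes out somewhat wider than the slide's rough figures, which leaves the slide's point intact: the data remain compatible with a benefit of at least 7 per 1000.

```python
from statistics import NormalDist

def ci_from_p_value(estimate, p_value, level=0.95):
    """Approximate CI from a point estimate and a two-sided p-value (normal approximation)."""
    std_normal = NormalDist()
    z_observed = std_normal.inv_cdf(1 - p_value / 2)  # equals |estimate| / SE
    se = abs(estimate) / z_observed
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    return estimate - z_crit * se, estimate + z_crit * se

# Slide figures: adjusted mortality fell from 5.0% to 4.7%, p = 0.34.
before, after, p = 5.0, 4.7, 0.34
improvement = before - after  # absolute improvement in percentage points
low, high = ci_from_p_value(improvement, p)
print(f"Improvement {improvement:.1f} points, approx. 95% CI ({low:+.2f}, {high:+.2f})")

# Relative reduction implied by the slide's upper bound of 0.7 points.
print(f"0.7 / {before} = {0.7 / before:.0%} relative decrease in mortality")
```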


Consistent with other results

J Am Coll Cardiol 2006;47:2180-6

22,900 PCI in AMI NRMI


Telling it like it isn’t


MY CONCLUSIONS

This study shows that improved treatment times, even below the 90 minute threshold, are likely associated with meaningful mortality benefits that are entirely consistent with previous work and may have a huge public health impact. Efforts should continue to reduce all treatment delays.


Fundamental identity of causal inference

Outcome for treated − Outcome for untreated
= [Outcome for treated − Outcome for treated if not treated] + [Outcome for treated if not treated − Outcome for untreated]
= Impact of treatment on treated + selection bias

If treatment is randomly assigned
•  Selection bias is zero.
•  Treated are a random selection from the population, so impact on treated = impact on population
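The decomposition can be verified on a toy population in which both potential outcomes are known for everyone. This is a hypothetical illustration, not data from the presentation: the six individuals and their outcomes are invented, with lower values meaning better outcomes and treated people assumed to be sicker at baseline.

```python
from statistics import mean

# Hypothetical individuals: (treated?, outcome if untreated y0, outcome if treated y1).
# Treated people are sicker at baseline (higher y0), so selection bias is non-zero.
people = [
    (True, 10.0, 8.0), (True, 12.0, 9.0), (True, 11.0, 9.0),
    (False, 6.0, 5.0), (False, 7.0, 5.0), (False, 5.0, 4.0),
]

treated = [p for p in people if p[0]]
untreated = [p for p in people if not p[0]]

# What an observational comparison sees: observed outcomes only.
naive_difference = mean(y1 for _, _, y1 in treated) - mean(y0 for _, y0, _ in untreated)
# Impact of treatment on the treated (needs the unobservable counterfactual y0).
impact_on_treated = mean(y1 - y0 for _, y0, y1 in treated)
# Selection bias: baseline difference between the groups in the absence of treatment.
selection_bias = mean(y0 for _, y0, _ in treated) - mean(y0 for _, y0, _ in untreated)

print(naive_difference, impact_on_treated, selection_bias)
```

In this toy example the treatment lowers the outcome for the treated, yet the naive comparison points the other way because the selection-bias term is larger than the treatment effect.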


Problems

•  Basic problems of observational research, including selection bias, information bias and confounding
  –  How were patients selected?
  –  How was exposure measured?
  –  Were there time dependencies?
  –  What were the statistical models?
  –  What confounders, interactions, mediators were considered?
