2016 dia pv ibara final distribution


TRANSCRIPT

A Real-World Look at Mining Social Media for Adverse Events: Impact of Regulatory Definitions and Methods

Michael A. Ibara, Pharm.D.

Head of Digital Healthcare

CDISC*

*(Formerly Pfizer, Inc., during which time research was conducted.)

@michaelibara

A note on the distribution version of these slides: These slides are designed specifically to support the talk I gave, not to stand alone as a document. They provide visual support during my talk, and I make many points verbally that are not spelled out in the slides. I apologize if some points seem opaque because of this; if you want to know more, I’m happy to have a quick chat or email exchange to explain anything.

Having said this, I hope you can glean some information from them about the rationale and direction in this talk. I welcome any discussions!

You can reach me at [email protected], or at [email protected]

Thank you!

The ideas and opinions in this talk are mine and not necessarily those of CDISC, my colleagues, or indeed any other human being

Tufts CSDD Social Media and Drug Development Study

General Practices
• Guidelines for company-owned sites
• Guidelines on resourcing

Program Planning
• Best practices on open innovation
• Best practices on crowdsourcing
• Best practices on patient reported outcomes

Patient Recruitment
• Effective global recruitment strategies
• Recommendations for future recruitment

Pharmacovigilance
• Recommended policy guidance
• Principles for identified AE scenarios

Social Listening
• Strategies behind listening effectively
• Operationalizing social listening

M. Ibara1, S. Stergiopoulos2, J. Van Stekelenborg3, A.C. Ianos4, R. Ruben5, P.N. Naik6, R. Boland7.

1CDISC (previously Pfizer Inc.), Pharmacovigilance, New York, USA; 2Tufts Center for the Study of Drug Development, Project Management, Boston, USA; 3Johnson & Johnson, Lead Methods and Analysis, New York, USA; 4Pfizer Inc., Safety Risk Management, London, United Kingdom; 5Independent (formerly ParagonRx International LLC), Risk Management, Philadelphia, USA; 6Independent (formerly Tufts Center for the Study of Drug Development), Research Analyst, New York, USA; 7Janssen Pharmaceutical Companies of Johnson & Johnson, Translational Informatics & External Innovation R&D IT, Philadelphia, USA

Colleagues

The premise...


Growing sophistication in mining social media for possible reports of AEs

PV INTEREST IN SOCIAL MEDIA: A personal timeline

c. 2004: “What?”

c. 2006: “I’ll wait!”

c. 2010: “Tell me more!” / “Don’t tell me! I’m scared!”

c. 2013: “Ok, let’s figure this out” / “This might be good!”

c. 2014-15: “We’re getting the hang of this”

c. Now: “Piece of cake!”

The Premise...


Growing interest in determining how such activities are interpreted in light of regulations

Growing sophistication in mining social media for possible reports of AEs

Academic studies address the utility of mining social media for AEs, but real-world scenarios match up poorly with ideal research conditions

It is not always clear to what extent actual operational designs affect the results of that work

Objectives
• Determine the real-world ability to obtain reproducible results using a single definition of “reporter” to mine social media for possible AEs
  • Using available vendors
  • Allowing standard approaches, within limits
• Test whether the operational definition of “reporter” has a direct impact on results (i.e., counts of possible AEs)
• Determine the extent to which varying the definition of “reporter” changes the counts of possible AEs

Design: 6 Vendors / 9 Drugs
• 6 vendors recruited based on their stated capabilities of mining social media for possible AEs
• A predetermined set of drugs was used for the investigation*; none were marketed by companies whose members participated in the investigation:
  • Olanzapine (Lanzek, Zypadhera, Zyprexa, Symbyax)
  • Trazodone (Depyrel, Desyrel, Molipaxin, Oleptro, Trazodil, Trazorel, Trialodine, Trittico)
  • Lamotrigine (Lamictal)
  • Natalizumab (Tysabri, Antegren)
  • Aripiprazole (Abilify, Aripiprex)
  • Esomeprazole (Nexium, Essocam)
  • Duloxetine (Cymbalta, Ariclaim, Xeristar, Yentreve, Duzela, Dulane)
  • Nicotine (Nicotrol, Habitrol, Nicoderm, Nicorette, Nicotinell, Commit, Thrive)
  • Aspirin

*Comparable with Leaman, Robert, et al. 2010; Nikfarjam, Azadeh, et al. 2011

Design: Retrospective Mining
• Retrospective mining of social media for possible AEs for each listed drug
• From 1Jan2013 to ‘present’, to reach at least 400 hits per drug
  • A greater time period could be investigated if needed to meet the hits requirement
• A possible AE is defined as containing the “4 elements”: an identifiable event, patient, drug, and reporter
• Three definitions were standard:
  • “Event” as defined broadly in regulations and defined operationally per vendor (multiple symptoms in a single post are counted as a single ‘hit’)
  • “Patient” as defined broadly in regulations; in the absence of a vendor’s definition: “Knowledge of an individual experiencing the event with at least one of the following patient qualifiers: a pronoun or noun implying a human; an age or age category; gender; initials; date of birth; name; or patient ID number”
  • “Drug” as the drug in question, using the vendor’s procedures
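The “4 elements” definition above can be sketched as a simple screening predicate. This is a minimal illustration, not any vendor’s actual pipeline: the patient-qualifier patterns and the term lists are my own illustrative assumptions, far cruder than a real lexicon.

```python
import re

# Rough stand-ins for the "patient qualifier" categories in the slide;
# a real system would use a curated lexicon, not three small regexes.
PATIENT_QUALIFIERS = [
    r"\b(i|me|my|he|she|son|daughter|husband|wife|mom|dad)\b",  # pronoun/noun implying a human
    r"\b\d{1,3}[- ]?(yo|y/o|years? old)\b",                     # age or age category
    r"\b(male|female|man|woman|boy|girl)\b",                    # gender
]

def is_possible_ae(post_text: str, drug_terms: list, event_terms: list,
                   has_reporter: bool) -> bool:
    """A post counts as a possible AE 'hit' only if all 4 elements are
    present: identifiable event, patient, drug, and reporter.
    Multiple symptoms in one post would still be a single hit."""
    text = post_text.lower()
    has_drug = any(d.lower() in text for d in drug_terms)
    has_event = any(e.lower() in text for e in event_terms)
    has_patient = any(re.search(p, text) for p in PATIENT_QUALIFIERS)
    return has_drug and has_event and has_patient and has_reporter

# Example: drug, symptom, and an implied patient are all present
print(is_possible_ae("My husband started Cymbalta and got dizzy spells",
                     ["cymbalta", "duloxetine"], ["dizzy", "nausea"], True))  # → True
```

Note that the reporter element is passed in as a flag here: whether a reporter is “identifiable” depends on the level definitions that follow, not on the post text alone.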

Identifiable Reporter – 4 Levels
• LENIENT
  • Post exists, i.e., no requirement to identify a reporter
  • Used in the initial data collection
• LOW
  • Any type of information suggesting there is an actual reporter (e.g., acronym, pseudonym, proper name, email address)
• STANDARD
  • At least one piece of identifiable information for the reporter, i.e., a local identifier for the platform that allows contacting (e.g., Facebook name), OR a validly formatted email address, OR a validly formatted phone number
• STRICT
  • Must match the standard criteria and in addition have an additional piece of identifying information, such as a valid phone number, a valid user name from the site, or mention of the geographic location of the person in question
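One way to operationalize the four levels above is as a small classifier over already-extracted reporter metadata. A minimal sketch, assuming hypothetical field names and deliberately permissive stand-ins for “validly formatted” email/phone checks:

```python
import re
from typing import Optional

# Permissive format checks -- illustrative stand-ins, not regulatory definitions
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[\d\s()-]{7,15}$")

def reporter_level(post_exists: bool,
                   pseudonym: Optional[str] = None,
                   platform_handle: Optional[str] = None,
                   email: Optional[str] = None,
                   phone: Optional[str] = None,
                   location: Optional[str] = None) -> str:
    """Return the strictest 'identifiable reporter' level the post satisfies."""
    if not post_exists:
        return "NONE"
    valid_email = bool(email and EMAIL_RE.match(email))
    valid_phone = bool(phone and PHONE_RE.match(phone))
    # STANDARD: at least one contactable identifier
    standard_ids = sum([bool(platform_handle), valid_email, valid_phone])
    # STRICT: standard criteria plus an additional identifying piece
    extras = standard_ids + bool(location)
    if standard_ids >= 1 and extras >= 2:
        return "STRICT"
    if standard_ids >= 1:
        return "STANDARD"
    if pseudonym or platform_handle or email:
        return "LOW"
    return "LENIENT"  # post exists; nothing more required
```

For example, `reporter_level(True, email="user@example.com")` returns `"STANDARD"`, and adding a location lifts it to `"STRICT"`. Even this toy version makes visible why LENIENT and LOW collapse on platforms that require an account to post.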

“Hits”
• For each drug, a minimum of 400 ‘hits’ according to the Lenient definition was the goal
• The type of reporter-identifying information was collected by vendors, but was masked to subteam investigators and supplied as counts
  • E.g., 10 cases with single valid email addresses, 5 cases with both valid email addresses and user names
• The search was limited to English

Design: Data Sources / Collection
• Four standard data sources:
  • Facebook
  • Twitter
  • Dailystrength.org
  • Drugs.com
• Vendors varied in their methods of collection and in the extent to which sources were utilized
  • As this was a real-world examination, no attempt was made to develop a single consistent dataset
• No data was collected that was not publicly available

RESULTS

Results: Real-World Variability
• Began with 6 vendors claiming the ability to independently complete the request; completed with 4 vendors
• Significant differences across vendors in methods:
  • Operational definitions of drugs, terms
  • Lexicon use, sophistication
  • Sourcing and amount of data collected
  • Algorithms used
  • Type and extent of curation

Results: Overall Findings
• It is not possible to pool results or make direct comparisons given the variability in methods
• Each set of results must be treated individually, based on the unique set of methods used

Example: ‘Hits’ on Facebook and Twitter by Vendor

Vendor   Facebook (‘hits’ / examined)   Twitter (‘hits’ / examined)
1        0 / 34                         17 / 239
2        43 / 11,431 [human]            141 / 37,409 [human]
3        373 / 9,823 [human]            32 / 33,167 [human]
4        1,200 / 85,220                 1,318 / 105,018
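Because the vendors examined such different volumes, the raw counts in the table above are easier to compare when normalized. A small sketch: the counts are taken directly from the table, while the hits-per-10,000-posts framing is my own, not part of the study.

```python
# ('hits', 'posts examined') per vendor, copied from the table above
results = {
    1: {"facebook": (0, 34),         "twitter": (17, 239)},
    2: {"facebook": (43, 11_431),    "twitter": (141, 37_409)},
    3: {"facebook": (373, 9_823),    "twitter": (32, 33_167)},
    4: {"facebook": (1_200, 85_220), "twitter": (1_318, 105_018)},
}

def hits_per_10k(hits: int, examined: int) -> float:
    """Normalize a raw count to hits per 10,000 posts examined."""
    return round(10_000 * hits / examined, 1)

for vendor, platforms in results.items():
    rates = {p: hits_per_10k(h, n) for p, (h, n) in platforms.items()}
    print(vendor, rates)
```

Even normalized, the rates are not directly comparable across vendors (different sourcing, lexicons, and curation), which is precisely the point made in the findings above.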

“Identifiable Reporter”… irrelevant?
• Very little difference between the bottom three categories. The only real change was seen with the “strict” definition, which dramatically reduced hits.
• For Facebook and Twitter, a valid email address is a requirement, thus rendering the distinction between “lenient” and “low” meaningless.
• This may point to the discord between a definition created before the World Wide Web existed and its application to social media today.

“Identifiable Event”… subject to change
• Vendors varied in their operational definition of an “event” based on their specific assumptions and working models.
• In one instance the vendor used definitions already developed and used in their daily business, with the effect that precision was high but sensitivity poor.
• What events qualified as hits also varied with lexicon use and sophistication, and with the application of curation.

Each platform has unique considerations
• While it may seem obvious upon reflection, it is important to realize that searching “social media” reduces to a collection of specific methodologies for each platform: Facebook vs. Twitter vs. drug-information sites vs. patient-engagement sites, etc.
• Platforms’ influence was found not only in how data was collected, but also in how much could be collected (directly from the API, from 3rd parties, stored by the vendor)
• Twitter provided the most initial data to review, but it also provided fewer hits per unit number examined

Curation / Machine Approach Tradeoff
• In instances where human curation was used, precision and specificity improved (although in two different vendors the curation method was not comparable)
• However, drugs with a very large initial hit rate (e.g., aspirin, nicotine) broke the human curation steps and required various workarounds in both machine and human approaches
• There is as yet no solely machine-based approach that approximates human curation, but there is no human curation approach that can handle very large numbers in a cost-effective manner
• This is an area ripe for innovation (e.g., a machine-human solution that takes advantage of crowd-sourcing?)

SUGGESTIONS

We need to focus on methods...


It is clear that results are not comparable across studies or vendors without significant (re)work

Our enthusiasm to incorporate SM into PV has now outstripped our methods

Value from social media and PV will come from generalizable, reproducible results

It’s time to adopt an “open-science” style approach to social media PV research

“Open Science”

“...is the movement to make scientific research, data and dissemination accessible to all levels of an inquiring society, amateur or professional. It encompasses practices such as publishing open research, campaigning for open access, encouraging scientists to practice open notebook science, and generally making it easier to publish and communicate scientific knowledge.”

Wikipedia


Five suggested areas of methodological focus

• Operational definitions of drugs, terms
• Lexicon use, sophistication
• Sourcing and amount of data collected
• Algorithms used
• Type and extent of curation

All PV Studies with Social Media Should Publish the Following Along with Results...

• Operational definitions of drugs, terms
• Lexicon use, sophistication
• Sourcing and amount of data collected
• Algorithms used (to the extent possible)
• Type and extent of curation
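The five disclosure items above could travel with published results as a simple structured record. A sketch with illustrative field names and example values of my own choosing, not an agreed industry standard:

```python
from dataclasses import dataclass, asdict

@dataclass
class MethodsDisclosure:
    """The five methodological items a social-media PV study should publish."""
    drug_term_definitions: str  # operational definitions of drugs, terms
    lexicon: str                # lexicon use and sophistication
    data_sourcing: str          # sourcing and amount of data collected
    algorithms: str             # algorithms used (to the extent possible)
    curation: str               # type and extent of curation

# Hypothetical example of a filled-in disclosure
disclosure = MethodsDisclosure(
    drug_term_definitions="Brand + generic names, exact string match",
    lexicon="Standard event terminology plus consumer-language synonyms",
    data_sourcing="Twitter public data, 1Jan2013 to present, English only",
    algorithms="Keyword filter followed by supervised classifier",
    curation="Human review of all machine-flagged posts",
)
print(asdict(disclosure))
```

Publishing even this minimal record alongside results would let readers judge whether two studies are comparable before attempting to pool them.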


And lastly, it would be good to agree...


• For reporting on the safety of drugs/devices to regulatory authorities involving social media, more transparency and sharing of methods is highly desirable

• An industry-agreed set of “good social media / PV research practices” that goes beyond the current general recommendations, and begins to address the areas raised here, would be timely

• A re-examination of the regulatory definitions of the “4 elements” is needed, to ground them in modern concepts that include social media and the internet; greater specificity in regulatory definitions will ease the burden on the reporter and lead to greater standardization in methods

Thank You