safety data mining: background and current issues

Safety Data Mining: Background and Current Issues. Ramin Arani, PhD, Safety Data Mining, Global Biometric Science, Bristol-Myers Squibb Company. SAMSI: July, 2006

Page 1: Safety Data Mining:  Background and Current Issues

Safety Data Mining: Background and Current Issues

Ramin Arani, PhD

Safety Data Mining

Global Biometric Science

Bristol-Myers Squibb Company

SAMSI: July, 2006

Page 2: Safety Data Mining:  Background and Current Issues

Outline

Rationale for Pharmacovigilance

AERS Database

Database issues

Methodologies

BCNN (WHO)

MGPS (FDA)

Summary

Challenges and Opportunities

Page 3: Safety Data Mining:  Background and Current Issues

Pharmacovigilance - Rationale

Information obtained prior to first marketing is inadequate to cover all aspects of drug safety:

tests in animals are insufficiently predictive of human safety,

in clinical trials patients are selected and limited in number,

conditions of use in trials differ from those in clinical practice,

duration of trials is limited,

information about rare but serious adverse reactions, chronic toxicity, use in special groups or drug interactions is often not available.

Page 4: Safety Data Mining:  Background and Current Issues

Pharmacovigilance - Rationale

Pre-approval data (subjects for approval):
- Controlled
- Limited number of patients
- Safety data not mature

Post-approval data (population):
- Real life; uncontrolled
- Off-label use
- Generic
- Solicited safety data
- Unsolicited safety data

Page 5: Safety Data Mining:  Background and Current Issues

Spontaneous AE Reports

Safety information from clinical trials is incomplete

° Few patients -- rare events likely to be missed

° Not necessarily ‘real world’

Need information from post-marketing surveillance and spontaneous reports

Pharmacovigilance is carried out by regulatory agencies and manufacturers.

Long history of research on the issue

° Finney (MIMed1974, SM1982) Royall (Bcs1971)

° Inman (BMedBull1970) Napke (CanPhJ1970)

Page 6: Safety Data Mining:  Background and Current Issues

Issues

Incomplete reports of events, not necessarily reactions

How to compute effect magnitude

Many events reported, many drugs reported

Bias and noise in the system

Difficult to estimate incidence because the number of patients at risk and the duration of exposure are seldom reliable

Appropriate use of computerized methods, e.g., supplementing standard pharmacovigilance to identify possible signals sooner (early warning signal)

Page 7: Safety Data Mining:  Background and Current Issues

Pharmacovigilance - Definition

Safety Signal: Reported information on a possible causal relationship between an adverse event and a drug.

Pharmacovigilance: Set of methods that aim to identify and quantitatively assess the risks related to the use of drugs in the entire population, or in specific population subgroups.

Adverse Drug Reaction: A response to a drug which is harmful and unintended, and which occurs at doses normally used.

Page 8: Safety Data Mining:  Background and Current Issues

AERS Database

Database Origin 1969

SRS until 11/1/97; changed to AERS

3.0 million reports in database

All SRS data migrated into AERS

Contains Drug and "Therapeutic" Biologic Reports

exception = vaccines (VAERS)

Page 9: Safety Data Mining:  Background and Current Issues
Page 10: Safety Data Mining:  Background and Current Issues

Source of AERS Reports

Health Professionals, Consumers / Patients

Voluntary : Direct to FDA and/or to Manufacturer

Manufacturers: Regulations for Postmarketing Reporting

Page 11: Safety Data Mining:  Background and Current Issues

AERS Limitations

Different populations, co-morbidities, co-prescribing, off-label use, rare events

Report volume for a drug is affected by volume of use, publicity, type and severity of the event, and other factors; therefore the reporting rate is not a true measure of the rate or the risk

An observed event may be due to the indication for therapy rather than the therapy itself; therefore observed associations should be viewed as signals, and causal conclusions drawn with caution

Page 12: Safety Data Mining:  Background and Current Issues

Examples

Claritin and arrhythmias (channeling and need for detailed data not in database): Increased number of reports due to preexisting condition. Selection of high-risk patients for the drug deemed safest for them.

Prozac and suicide (confounding by indication): Large increase in reports following publicity and stimulated reporting.

Page 13: Safety Data Mining:  Background and Current Issues

The Pharmacovigilance Process

Detect Signals: Traditional Methods, Data Mining

Generate Hypotheses

Refute/Verify: Type A (Mechanism-based), Type B (Idiosyncratic), Insight from Outliers

Estimate: Incidence, Public Health Impact, Benefit/Risk

Act: Inform, Change Label, Restrict use/withdraw

Page 14: Safety Data Mining:  Background and Current Issues

Methodologies

Page 15: Safety Data Mining:  Background and Current Issues

Finding “Interestingly Large” Cell Counts in a Massive Frequency Table

Rows and Columns May Have Thousands of Categories

Most Cells Are Empty, even though $N_{++}$ Is Very Large

Only 386K out of 1331K Cells Have $N_{ij} > 0$

174 Drug-Event Combinations Have $N_{ij} > 1000$

No. Reports   $AE_1$     ...        $AE_n$     Total
Drug 1        $N_{11}$   ...        $N_{1n}$   $N_{1+}$
:             :          $N_{ij}$   :          :
Drug m        $N_{m1}$   ...        $N_{mn}$   $N_{m+}$
Total         $N_{+1}$   ...        $N_{+n}$   $N_{++}$

Page 16: Safety Data Mining:  Background and Current Issues

Method - Basics

Endpoint: No. of AEs

Most use variations of 2-way table statistics

No. Reports    Target AE   Other AE   Total
Target Drug    a           b          a+b
Other Drug     c           d          c+d
Total          a+c         b+d        n

Some possibilities:

Reporting Ratio: $E(a) = (a+b)(a+c)/n$

Proportional Reporting Ratio: $E(a) = (a+b)\,c/(c+d)$

Odds Ratio: $E(a) = b\,c/d$

OR > PRR > RR when a > E(a)

Basic idea: Flag when $R = a/E(a)$ is “large”
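
The three ratios on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the function name `disproportionality` and the example counts are hypothetical, not taken from AERS.

```python
def disproportionality(a, b, c, d):
    """Return (RR, PRR, OR) for a 2x2 table of report counts.

    a: target drug & target AE,  b: target drug & other AE,
    c: other drug & target AE,   d: other drug & other AE.
    Each measure is a / E(a) with the E(a) given on the slide.
    """
    n = a + b + c + d
    rr = a / ((a + b) * (a + c) / n)    # E(a) = (a+b)(a+c)/n
    prr = a / ((a + b) * c / (c + d))   # E(a) = (a+b) c/(c+d)
    odds = a / (b * c / d)              # E(a) = b c / d
    return rr, prr, odds


# Hypothetical counts, chosen only for illustration.
rr, prr, odds = disproportionality(a=20, b=80, c=100, d=9800)
print(f"RR={rr:.2f}  PRR={prr:.2f}  OR={odds:.2f}")
```

With these made-up counts a exceeds E(a) under all three definitions, so the ordering OR > PRR > RR stated above can be checked directly.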

Page 17: Safety Data Mining:  Background and Current Issues

Bayesian Approaches

Two current approaches: DuMouchel & WHO

Both use the ratio $n_{ij} / E_{ij}$, where

$n_{ij}$ = no. of reports mentioning both drug i & event j

$E_{ij}$ = expected no. of reports of drug i & event j

Both report features of the posterior dist’n of the ‘information criterion’

$IC_{ij} = \log_2 (n_{ij} / E_{ij})$, the base-2 log of the reporting ratio

$E_{ij}$ usually computed assuming drug i & event j are mentioned independently

Ratio > 1 (IC > 0): combination mentioned more often than expected if independent
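
As a rough sketch of the raw (non-Bayesian) quantity behind both approaches, the code below computes the expected count under independence and the resulting IC for every cell of a small drug-by-event table. It assumes, as a simplification, that each report falls in exactly one cell, so that margins are plain row and column sums; the function name and the counts are hypothetical.

```python
import math


def information_components(counts):
    """counts[i][j] = number of reports mentioning drug i and event j.

    Returns a matrix of IC values log2(n_ij / E_ij), where
    E_ij = n_i+ * n_+j / n_++ is the expected count under independence.
    Empty cells yield None because log2(0) is undefined.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    total = sum(row_totals)
    ic = []
    for i, row in enumerate(counts):
        ic_row = []
        for j, n_ij in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            ic_row.append(math.log2(n_ij / expected) if n_ij > 0 else None)
        ic.append(ic_row)
    return ic


# Hypothetical 2x2 table: rows are drugs, columns are events.
counts = [[20, 80], [100, 9800]]
ic = information_components(counts)
print(ic[0][0])  # positive: reported more often than expected
print(ic[0][1])  # negative: reported less often than expected
```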

Page 18: Safety Data Mining:  Background and Current Issues

WHO (Bate et al, EurJClPhrm1998)

‘Bayesian Confidence Propagation Neural Network’ (BCPNN, abbreviated BCNN here) Model:

$n_{ij}$ = no. reports mentioning both drug i & event j

$n_{i+}$ = no. reports mentioning drug i

$n_{+j}$ = no. reports mentioning event j

Usual Bayesian inferential setup:

Binomial likelihoods for $n_{ij}$, $n_{i+}$, $n_{+j}$

Beta priors for the rate parameters ($r_{ij}$, $p_i$, $q_j$)

Page 19: Safety Data Mining:  Background and Current Issues

WHO, cont’d

Uses ‘delta method’ to approximate variance of

$Q_{ij} = \ln \frac{r_{ij}}{p_i q_j} = (\ln 2)\, IC_{ij}$

However, can calculate exact mean and variance of $Q_{ij}$

WHO measure of importance = $E(IC_{ij}) - 2\,SD(IC_{ij})$

Test of signal detection predictive value by analysis of signals 1993-2000: Drug Safety 2000; 23:533-542

84% Negative Pred Val, 44% Positive Pred Val

Good filtering strategy for clinical assessment
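
One way to get a feel for the WHO measure of importance is to simulate it. The sketch below is not the WHO implementation: it assumes uniform Beta(1, 1) priors on all three rates, draws the three rates independently from their Beta posteriors, and uses Monte Carlo in place of the delta method or the exact moments mentioned above. The function name and counts are hypothetical.

```python
import math
import random
import statistics


def ic_lower_bound(n_ij, n_i, n_j, n, draws=20000, seed=0):
    """Monte Carlo estimate of E(IC), SD(IC) and E(IC) - 2 SD(IC).

    Assumption (not from the slides): Beta(1, 1) priors, so each
    posterior is Beta(1 + successes, 1 + failures), sampled independently.
    """
    rng = random.Random(seed)
    ics = []
    for _ in range(draws):
        r = rng.betavariate(1 + n_ij, 1 + n - n_ij)  # joint rate r_ij
        p = rng.betavariate(1 + n_i, 1 + n - n_i)    # drug rate p_i
        q = rng.betavariate(1 + n_j, 1 + n - n_j)    # event rate q_j
        ics.append(math.log2(r / (p * q)))
    mean = statistics.fmean(ics)
    sd = statistics.stdev(ics)
    return mean, sd, mean - 2 * sd


# Hypothetical counts: 20 joint reports, 100 for the drug,
# 120 for the event, 10000 reports overall.
mean_ic, sd_ic, lower = ic_lower_bound(n_ij=20, n_i=100, n_j=120, n=10000)
print(mean_ic, sd_ic, lower)
```

A combination would be flagged when the lower bound stays above zero, which is the filtering idea described on this slide.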

Page 20: Safety Data Mining:  Background and Current Issues

WHO, cont’d

WHO (Orre et al 2000)

Page 21: Safety Data Mining:  Background and Current Issues

WHO, cont’d

Let A denote adverse events and D denote the drug.

Mutual information I(A,D) is a measure of association:

$$I(A,D) = \sum_{A,D} P(A,D)\, \log_2 \frac{P(A,D)}{P(A)\,P(D)}$$

The log term is the information component, $IC = \log_2 \frac{P(A,D)}{P(A)\,P(D)}$.

[The slide's further expansion of the sums over $a_i$ and $d_k \in \{0, 1\}$ is not recoverable from the transcript.]
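
A minimal sketch of mutual information as a probability-weighted sum of IC terms, using empirical cell proportions as plug-in estimates of the probabilities. The plug-in estimate is an assumption made for illustration, and the function name and tables are hypothetical.

```python
import math


def mutual_information(counts):
    """Mutual information (in bits) of a drug-by-event count table.

    Sums P(a, d) * log2(P(a, d) / (P(a) P(d))) over non-empty cells,
    with probabilities estimated by observed proportions.
    """
    total = sum(sum(row) for row in counts)
    p_row = [sum(row) / total for row in counts]
    p_col = [sum(col) / total for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            if n_ij == 0:
                continue  # an empty cell contributes nothing to the sum
            p_ij = n_ij / total
            mi += p_ij * math.log2(p_ij / (p_row[i] * p_col[j]))
    return mi


mi_independent = mutual_information([[10, 20], [30, 60]])  # rows proportional
mi_perfect = mutual_information([[50, 0], [0, 50]])        # fully dependent
mi_example = mutual_information([[20, 80], [100, 9800]])   # hypothetical table
print(mi_independent, mi_perfect, mi_example)
```

Each summand is the cell's IC weighted by its joint probability, so a single large IC in a rare cell moves the overall mutual information only slightly.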

Page 22: Safety Data Mining:  Background and Current Issues

DuMouchel (AmStat1999)

$E_{ij}$ known, computed using stratification of database:

$n_{i+}^{(k)}$ = no. reports of drug i in stratum k

$n_{+j}^{(k)}$ = no. reports of event j in stratum k

$N^{(k)}$ = total reports in stratum k

$E_{ij} = \sum_k n_{i+}^{(k)}\, n_{+j}^{(k)} / N^{(k)}$ ($E(n_{ij})$ under independence)

$n_{ij} \sim \text{Poisson}(\mu_{ij})$; interested in $\lambda_{ij} = \mu_{ij} / E_{ij}$

Prior dist’n for $\lambda$ = mixture of gamma dist’ns:

$f(\lambda; a_1, b_1, a_2, b_2, \pi) = \pi\, g(\lambda; a_1, b_1) + (1 - \pi)\, g(\lambda; a_2, b_2)$

where $g(\lambda; a, b) = b\,(b\lambda)^{a-1} e^{-b\lambda} / \Gamma(a)$
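
The stratified expected count and the two-component gamma mixture prior can be sketched as follows. The function names, the strata, and the hyperparameter values are hypothetical and chosen only to make the example run; the mixing weight is called `w` in the code.

```python
import math


def stratified_expected(strata):
    """Sum over strata k of n_i+(k) * n_+j(k) / N(k).

    strata: iterable of (drug_reports, event_reports, total_reports).
    """
    return sum(n_i * n_j / n_total for n_i, n_j, n_total in strata)


def gamma_density(lam, a, b):
    """Gamma density with shape a and rate b, as written on the slide."""
    return b * (b * lam) ** (a - 1) * math.exp(-b * lam) / math.gamma(a)


def mixture_prior(lam, a1, b1, a2, b2, w):
    """Two-component gamma mixture with mixing weight w."""
    return w * gamma_density(lam, a1, b1) + (1 - w) * gamma_density(lam, a2, b2)


# Two hypothetical strata (e.g. two age groups).
expected = stratified_expected([(80, 90, 5000), (20, 30, 5000)])

# Numerical check that the mixture integrates to about 1 (midpoint rule).
step = 0.001
area = sum(
    mixture_prior((k + 0.5) * step, a1=2.0, b1=4.0, a2=3.0, b2=1.0, w=0.3) * step
    for k in range(60000)
)
print(expected, area)
```

In this example the stratified value (1.56) differs from the pooled value 100 * 120 / 10000 = 1.2, which is the reason for stratifying: drug use and event reporting both vary across strata.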

Page 23: Safety Data Mining:  Background and Current Issues

DuMouchel, cont’d

Estimate $\pi, a_1, b_1, a_2, b_2$ using Empirical Bayes: the marginal dist’n of $n_{ij}$ is a mixture of negative binomials

Posterior density of $\lambda_{ij}$ is also a mixture of gammas

$\log_2 \lambda_{ij} = IC_{ij}$

Easy to get 5% lower bound (comparable to the WHO measure $E(IC_{ij}) - 2\,SD(IC_{ij})$)
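
Given fixed hyperparameters, the posterior mixture and its 5% lower bound can be sketched as below. This assumes the hyperparameters have already been estimated (the values here are invented, not fitted to any database) and uses sampling for the posterior summaries rather than closed-form expressions. The function names are hypothetical.

```python
import math
import random


def log_negbin(n, e, a, b):
    """Log marginal probability of n when n ~ Poisson(lam * e), lam ~ Gamma(a, rate b)."""
    return (math.lgamma(a + n) - math.lgamma(a) - math.lgamma(n + 1)
            + a * math.log(b / (b + e)) + n * math.log(e / (b + e)))


def posterior_summary(n, e, a1, b1, a2, b2, w, draws=20000, seed=0):
    """Return (posterior weight of component 1, mean of log2 lam, 5% quantile)."""
    l1 = math.log(w) + log_negbin(n, e, a1, b1)
    l2 = math.log(1 - w) + log_negbin(n, e, a2, b2)
    q = 1 / (1 + math.exp(l2 - l1))  # updated mixing weight
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        a, b = (a1, b1) if rng.random() < q else (a2, b2)
        # Posterior component is Gamma(a + n, rate b + e);
        # gammavariate takes a scale parameter, hence the reciprocal.
        lam = rng.gammavariate(a + n, 1 / (b + e))
        samples.append(math.log2(lam))
    samples.sort()
    return q, sum(samples) / draws, samples[int(0.05 * draws)]


# Hypothetical inputs: 20 observed reports against 1.2 expected.
q, mean_ic, lower5 = posterior_summary(
    n=20, e=1.2, a1=0.5, b1=0.1, a2=2.0, b2=2.0, w=0.1
)
print(q, mean_ic, lower5)
```

The negative binomial term is the marginal likelihood of each mixture component, which is why the same expression would drive an Empirical Bayes fit of the hyperparameters.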

Page 24: Safety Data Mining:  Background and Current Issues

The control group and the issue of ‘compared to what?’

Signal strategies compare:

a drug with itself from prior time periods

with other drugs and events

with external data sources of relative drug usage and exposure

The total frequency count for a drug is used as a relative surrogate for an external denominator of exposure: it is easy to use, quick and efficient.

Analogy to a case-control design, where cases are a specific AE term, controls are other terms, and outcomes are presence or absence of exposure to a specific drug.

Page 25: Safety Data Mining:  Background and Current Issues

Other useful metrics and methods

Chi-square statistics

P-value type metric: overly influenced by sample size

Modeling association directly through a multivariate Poisson distribution

Incorporation of a prior distribution on some drugs and/or events for which previous information is available, e.g. liver events or pre-market signals
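
The sample-size sensitivity of chi-square type metrics can be illustrated with the Pearson statistic for a 2x2 table (no continuity correction). The counts are hypothetical and the function names are introduced only for this sketch.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def reporting_ratio(a, b, c, d):
    """a / E(a) with E(a) = (a+b)(a+c)/n."""
    n = a + b + c + d
    return a * n / ((a + b) * (a + c))


small = (20, 80, 100, 9800)
large = tuple(10 * x for x in small)  # same proportions, ten times the reports

chi_small, chi_large = chi_square_2x2(*small), chi_square_2x2(*large)
rr_small, rr_large = reporting_ratio(*small), reporting_ratio(*large)
print(chi_small, chi_large, rr_small, rr_large)
```

Multiplying every cell by ten leaves the reporting ratio unchanged but multiplies the chi-square statistic by ten, so the statistic reflects the volume of reports as much as the strength of the association.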

Page 26: Safety Data Mining:  Background and Current Issues

Interpreting the Signal Through the Role of Visual Graphics

Four examples of spatial maps that reduce the scores to patterns and user friendly graphs and help to interpret many signals collectively

Page 27: Safety Data Mining:  Background and Current Issues

Example 1

A spatial map showing the “signal scores” for the most frequently reported events (rows) and drugs (columns) in the database by the intensity of the empirical Bayes signal score (blue color is a stronger signal than purple)

Page 28: Safety Data Mining:  Background and Current Issues

Example 2

Spatial map showing ‘fingerprints’ of signal scores allowing one to visually compare the complexity of patterns for different drugs and events and to identify positive or negative co-occurrences

Page 29: Safety Data Mining:  Background and Current Issues

Example 3

Cumulative scores and numbers of reports according to the year when the signal was first detected for selected drugs

Page 30: Safety Data Mining:  Background and Current Issues

Example 4

Differences in paired male-female signal scores for a specific adverse event across drugs with events reported (red means females greater, green means males greater)

Page 31: Safety Data Mining:  Background and Current Issues

Summary

1. There is NO gold standard method for signal detection.

2. The signals become more stable over time; however, there is a limited time window of opportunity for signal detection.

3. Use time-slice evolution of the signal.
- Fluctuation might reveal external risk factors.
- Robustness can be assessed.

4. Consider other endpoints, such as time to onset, duration of event, etc.

5. For spontaneous case reports, the means to improve content is to standardize and improve intake.

6. Data mining likely will generate many false positives and affirmations of what was previously known.

7. Causality assessments should largely be reserved for refining important signals.
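
The time-slice idea in point 3 can be sketched by recomputing a signal score on cumulative counts after each period. The yearly 2x2 counts below are invented for illustration, and the reporting ratio stands in for whichever score is actually used.

```python
def cumulative_reporting_ratio(slices):
    """slices: list of (a, b, c, d) report counts, one tuple per time slice.

    Returns the reporting ratio a / E(a), with E(a) = (a+b)(a+c)/n,
    computed on the counts accumulated up to and including each slice.
    """
    totals = [0, 0, 0, 0]
    ratios = []
    for cells in slices:
        totals = [t + x for t, x in zip(totals, cells)]
        a, b, c, d = totals
        n = a + b + c + d
        expected = (a + b) * (a + c) / n
        ratios.append(a / expected if expected > 0 else float("nan"))
    return ratios


# Hypothetical counts for three consecutive years.
yearly = [(2, 18, 30, 2950), (6, 30, 35, 3400), (12, 32, 35, 3450)]
trend = cumulative_reporting_ratio(yearly)
print(trend)
```

Plotting such a sequence shows whether a score is settling down or still moving, which is one way to assess the robustness mentioned above.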

Page 32: Safety Data Mining:  Background and Current Issues

Challenges in the future

More real-time data analysis

More interactivity (visual data mining, e.g. ggobi)

Linkage with other databases to control the bias inherent in the database

Quality control strategies (e.g. identifying duplicates)

Methods to reduce false positives and false negatives?