MARAS: Multi-Drug Adverse Reactions Analytics System · 2016-04-29



MARAS: Multi-Drug Adverse Reactions Analytics System

by

Tabassum Kakar

A Thesis

Submitted to the Faculty

of the

WORCESTER POLYTECHNIC INSTITUTE

In partial fulfillment of the requirements for the

Degree of Master of Science

in

Data Science

by

April 23, 2016

APPROVED:

Professor Elke A. Rundensteiner, Thesis Advisor

Professor Xiangnan Kong, Thesis Reader


Abstract

Adverse Drug Reactions (ADRs) are a major cause of morbidity and mortality worldwide. Clinical trials, which are extremely costly, labor intensive, and limited to controlled groups of human subjects, cannot uncover all ADRs related to a drug. There is thus a growing need for computational methods that automate the detection of drug-related ADRs from large collections of reports, especially ADRs that are left undiscovered during clinical trials but arise later due to drug-drug interactions or prolonged usage. For this purpose, the big data sets available through drug-surveillance programs and social media provide a wealth of long-term information and thus a huge opportunity.

In this research, we design a system that uses machine learning techniques to discover severe unknown ADRs triggered by a combination of drugs, also known as drug-drug interactions. Our proposed Multi-drug Adverse Reaction Analytics System (MARAS) adopts and adapts an association rule mining-based methodology by incorporating contextual information to detect, highlight, and visualize interesting drug combinations that are strongly associated with a set of ADRs. MARAS extracts non-spurious associations that are true representations of the combinations of drugs taken and reported by patients. We demonstrate the utility of MARAS via case studies from the medical literature, and the usability of the MARAS system via a user study using real-world medical data extracted from the FDA Adverse Event Reporting System (FAERS).


Acknowledgements

I would like to express my sincere gratitude to my advisor, Professor Elke Rundensteiner, for giving me the opportunity to work on this research and for making it a pleasant experience. Without her support and continuous motivation this research would not have been possible. I am grateful for the time she spent revising my work again and again to make it perfect. I really appreciate her patience, guidance, and encouragement, as well as the immense knowledge that inspired me and helped me grow and continuously improve my way of thinking when solving a problem.

I am very grateful to Professor Xiangnan Kong for his valuable time spent advising me and reading my thesis.

I am thankful to Xiao Qin, graduate student in the Computer Science Department (WPI), for his close collaboration on this work. Without his continuous feedback and joint effort this work would not have reached its present quality. His immense knowledge of association rule mining and research has motivated me to continue doing research.

I am also thankful to Susmitha Wunnava, graduate student in the Data Science Department (WPI), for her collaboration, support, and motivation throughout the course of this research.

I am deeply grateful to all the Data Science Program professors, whose teaching equipped me with the knowledge to work on this research. I also thank Mary Racicot, of the Data Science administrative staff, for her continued support.

Finally, I would like to extend special thanks to my family for their nonstop support and motivation and for believing in me.


Contents

1 Introduction
1.1 Background
1.2 Limitation of State-of-the-art
1.3 Research Challenges
1.4 The MARAS Methodology
1.5 Contributions

2 Preliminaries
2.1 Association Rule

3 Association Rule Model for Multi-Drug ADR Signal
3.1 Drug-ADR Association
3.2 Closed Drug-ADR Association
3.3 Type of Drug-ADR Association
3.4 Mining Supported Drug-ADR Rule Using Closed Itemset
3.5 Multi-level Contextual Association Cluster
3.6 Exclusiveness Score for Drug-Drug Interaction Signal

4 Visualizing Drug-ADR Association
4.1 MARAS Interface

5 Experimental Evaluation
5.1 Data Source
5.2 Mining Process
5.3 Result At-A-Glance
5.4 Case Study
5.4.1 User Study

6 Related Work

7 Conclusion

A User Study
A.1 Sample of interesting and non-interesting groups
A.2 User Study Questions


List of Figures

1.1 The MARAS Approach
4.1 A Contextual Glyph
4.2 Panoramagram of Glyphs
4.3 Zoom-in Glyph View
5.1 Reduction in number of rules
5.2 User study results
5.3 Bar-chart representing MCAC
A.1 Sample of two drugs
A.2 Sample of three drugs
A.3 Sample of four drugs
A.4 Question 1 with barchart
A.5 Question 1 with CG
A.6 Question 2 with barchart
A.7 Question 2 with CG
A.8 Question 3 with barchart
A.9 Question 3 with CG
A.10 Question 4 with barchart
A.11 Question 4 with CG
A.12 Question 5 with barchart
A.13 Question 5 with CG


List of Tables

3.1 Example of a MCAC
5.1 FAERS Data From 2014
5.2 Top 5 Multi-Drug Associations from 1st Quarter of 2014


Chapter 1

Introduction

1.1 Background

An Adverse Drug Reaction (ADR) is an unwanted and dangerous effect caused by the administration of a drug. According to the U.S. Food and Drug Administration (FDA), hundreds of thousands of people die every year because of ADRs, while over two million serious ADRs are reported every year [1]. For this reason, controlled clinical trials are mandated to thoroughly test the possible ADRs of any drug. In general, these clinical trials are expensive and restricted to a limited time frame, to population groups with specific diseases, and to certain combinations of drugs. Thus these trials provide limited information on ADRs caused by prolonged usage of the drug or by interaction with other drugs taken by certain patients. Once tested through clinical trials and approved by the FDA, the drug is released into the market for public consumption.

In the post-marketing phase, the effectiveness and safety of drugs are monitored by regulatory agencies, a process known as post-marketing surveillance. One such surveillance system, the FDA Adverse Event Reporting System (FAERS) [5], collects information on adverse events related to drugs reported by patients, health care professionals, and drug manufacturers in a database and makes it available to the public via web services. Many drugs have been withdrawn from the market during post-marketing surveillance because of their adverse effects, such as Posicor [24], Troglitazone [15], Cerivastatin [14], and many more. Research [21] has shown that computational methods applied to post-marketing surveillance data can help address the limitations of clinical trials, i.e., they can discover potential severe ADRs related to drugs.

Adverse reactions can be caused by the administration of a single drug or multiple drugs, upon immediate use, prolonged use, or even overdose. ADRs caused by multiple drugs, also known as Drug-Drug Interactions, can vary from minor to severe. A minor reaction might increase or decrease the effectiveness of one of the drugs. A severe reaction, on the other hand, can be fatal. For example, Aspirin taken together with Warfarin, a blood-thinning drug, may lead to excessive bleeding [9]. Therefore, these drug-drug interactions should be detected early, with minimum patient exposure, to avoid further harmful incidents. The data collected by drug-surveillance programs are an extremely useful resource for tapping into information on drug-drug interactions obtained first-hand from patients. Manually scanning all these reports is extremely costly and nearly impossible, as thousands of reports are added on a daily basis [5] and the database grows rapidly. Computational methods, especially data mining techniques, can be used to identify drug-drug interactions automatically.

To find associations between a combination of drugs and ADRs using data mining tools, it is vital to know how these drugs are individually associated with these ADRs. That is, if the chance of a single drug triggering the ADRs is high, then it is less likely that the combination of two or more drugs is what triggers these ADRs, and hence the combination of drugs is not interesting. For example, if the chance that Zometa and Prilosec taken together trigger a set of ADRs such as Osteoarthritis, Neuropathy peripheral, Osteonecrosis of jaw, and Pain is very high, while the chance that either Zometa or Prilosec taken individually triggers these ADRs is very low, then the combination of Zometa and Prilosec as a potential cause is worth further investigation. This kind of ADR detection is especially helpful in letting drug-safety evaluators focus on potentially important drug safety issues.

1.2 Limitation of State-of-the-art

Drug-drug interactions have been studied in the literature using statistical methods such as the relative reporting ratio and disproportionality analysis [26, 27, 28]. These methods tend to be restricted to either a specific set of drugs or specific ADRs. Hence they lack a general, data-driven methodology that considers all reports to identify the most critical ADRs. Yet it is crucial for drug-safety monitoring organizations such as the FDA to have plausible interaction information for all reported drugs and ADRs, so that they can direct their limited resources toward the most critical cases for further investigation.

In general, association rule mining is a popular technique for identifying relationships among items in large databases. It has been used previously in the medical domain to find ADRs possibly caused by drugs [12, 18], and there is initial evidence that it can be used to mine associations between combinations of drugs and ADRs [17]. However, current methods fail to take the context of the rules into consideration, although this context can provide crucial information about drug-drug interactions. Contextual information is important because considering a rule in isolation may lose vital information present in its sub-rules. When detecting ADRs related to a combination of drugs, this contextual information can provide insights into the relationship of the corresponding individual drugs to the ADRs. Furthermore, some association rules can be misleading, i.e., not truly representative of the real data, and need to be eliminated. Finally, an interestingness criterion is needed to identify the most crucial drug-drug interactions.

1.3 Research Challenges

To develop a Multi-Drug Adverse Reaction Analytics System (MARAS), the following research challenges must be addressed:

Amount of generated rules: Association rule mining applied to a set of thousands of drugs and ADRs generates an extremely large number of rules, which is impossible for an analyst to sift through. The number of rules may grow exponentially, especially if the minimum support threshold is very low. In the context of detecting drug-drug interactions, a low support is necessary to extract all possible combinations of drugs and ADRs. Furthermore, some of these rules might be redundant, misleading, or inappropriate for the analysis of drug-drug interactions. Thus we must devise criteria for selecting the appropriate ones from the overwhelming pool of rules.

Avoiding misleading information related to Drug-ADR: Association rules can sometimes be misleading, as they may not be a true representation of the data. The mining process finds all possible relationships among all items in a dataset, which may generate some rules based on partial information. In the context of detecting drug-drug interactions, rules depicting partial information tend to ignore drugs that could be a contributing factor to the interaction. One may argue that shorter rules with partial information are more general and hence better. However, when looking for ADRs triggered by a combination of drugs, these partial rules might not depict the actual combination of drugs reported by patients. Therefore, we need a strategy to detect and eliminate such deceiving information.

Figure 1.1: The MARAS Approach (diagram labels: Reports, Adverse Event Reporting System, MARAS Analytics, Patterns, Medical Professions, Domain Knowledge, Aggregated Result, MARAS Visualization)

Defining interestingness with respect to drug-drug interactions: We have a large number of rules showing ADRs associated with a combination of drugs; however, it is very challenging to find the most interesting ones. Even the definition of interesting is blurred. For example, the system might select a drug-drug interaction as interesting, but it might not be interesting to the decision makers because it is already known and they want to learn about unknown drug-drug interactions.

Advanced exploration of interesting drug-drug interactions: Even when a system generates potential drug combinations that trigger ADRs, a major challenge is providing the decision maker with an interactive tool for exploring the interesting drug combinations. For example, integrating domain knowledge into the system would help highlight interactions that are not yet known or that may lead to particularly severe adverse reactions.

1.4 The MARAS Methodology

We design the Multi-Drug Adverse Reaction Analytics System (MARAS), depicted in Fig. 1.1, to address the above challenges. MARAS models the correlation between a combination of drugs and some possible adverse drug reaction(s) by adopting and then adapting the association rule methodology.

First, MARAS extracts the critical information from the adverse drug reaction

reports collected by the adverse event reporting system. It then pre-processes this

abstracted information for the core mining phase.

Second, MARAS leverages relevant domain-specific insights to efficiently derive the associations that reflect the original reports while avoiding misleading and redundant information. MARAS lets users control which ADRs are returned by indicating their preferences, such as an interest in unknown ADRs versus unknown drug-drug interactions.

Third, we propose a Multi-level Contextual Association Clustering method to evaluate the significance of the discovered multi-drug adverse drug reactions. This contextual ranking strategy scores and ranks each drug-ADR association by considering the strength of the association as well as whether its strength is inversely proportional to the strength of its contextual rules. This contrast in strength can measure the significance of multi-drug ADRs better than traditional interestingness measures such as support, confidence, and lift.

Fourth, we propose a visualization method that depicts the overall distribution of the discovered drug-ADR associations over the ranking scores. For each contextual group, it presents the contrast between the strength of the drug-ADR association and the strength of its contextual rules. It provides the user with a pictorial interpretation of the discovered patterns. MARAS also links the original reports to the returned patterns so that the user can investigate further using the richer information available in the original reports.


1.5 Contributions

The key contributions of this work include:

Modeling Drug-ADR Associations: Mining non-spurious associations between combinations of drugs and ADRs is a two-step process. First, we select only those rules that have drugs as antecedents and ADRs as consequents, to find the possible drug interactions triggering some adverse reactions. Second, to dismiss misleading rules, we introduce the notion of closed drug-ADR associations (Section 3.2), which avoids spurious rules while generating those rules that are true representations of the original data reports.

Multi-level Contextual Association Clustering (MCAC) of Drugs and ADRs: We propose a Multi-level Contextual Association Clustering model, known as MCAC, that uses the contextual information related to a combination of drugs. MCAC (Section 3.5) groups a rule composed of two or more drugs with its sub-rules, each having a single drug or a subset of the drugs, to help understand the association between the drugs and the ADRs.

Ranking of Multi-level Contextual Groups: (1) We propose the exclusiveness measure, which highlights the interesting drug-drug association groups, i.e., the groups of rules carrying the contextual information. We score each group based on the difference between the strength of association among drugs and ADRs of a rule and that of its sub-rules within the group. (2) We define interestingness in the context of ADRs related to drug-drug interactions. Intuitively, the higher the difference in strength of association between a rule and its sub-rules within a group, the higher the group score and the more interesting the rule.

Visualization of MARAS Rule Recommendation: We propose the MARAS visual interactive tool, which not only gives an analyst insight into the most plausible drug-drug interactions but also provides the flexibility to select results based on customized criteria such as a specific drug or ADR (Section 4.1). The user interface provides a visual exploratory tool that allows the user to (1) find the interesting drug-drug interactions, (2) visualize the corresponding multi-level contextual group of drugs and ADRs, and (3) extract the raw reports that support the corresponding interesting drug-drug interaction.

Experimental Evaluation: We demonstrate the utility of MARAS via case

studies from the medical literature, and the usability of the MARAS system via

a user study using real world medical data extracted from the FDA Adverse Event

Reporting System (FAERS).


Chapter 2

Preliminaries

2.1 Association Rule

Symbolic data analysis techniques that aim to discover patterns or models in data can be divided into two categories: predictive and descriptive induction. Unlike predictive induction, where models are induced from class-labeled data, descriptive induction aims to find comprehensible patterns, typically induced from unlabeled data. Association rule mining is a descriptive induction technique that is widely used to detect relationships among the items in large databases.

Let $I = \{i_1, i_2, \ldots, i_n\}$ represent a set of items. $D = \{d_1, d_2, \ldots, d_m\}$ is a collection of subsets of $I$ called the transaction database. Each transaction $d_i$ in $D$ is a set of items such that $d_i \subseteq I$. Let $S \subseteq I$ be a set of items, called an itemset. If $S \subseteq d_i$, we say that $d_i$ contains $S$. $|S|$ denotes the number of transactions in $D$ that contain $S$. If the cardinality of $S$ is $k$, $S$ is called a $k$-itemset.

Definition 2.1.1 An association rule is an expression of the form $R \equiv A \Rightarrow B$, where $A$ and $B$ are itemsets and $A \subseteq I$, $B \subseteq I \setminus A$.

Many measures [23] have been proposed to evaluate the interestingness of associations. Among them, the most widely used are support, confidence, and lift, defined as follows:

$$\mathrm{Support}(R \equiv A \Rightarrow B) = P(A \cup B) = \frac{|A \cup B|}{N} \tag{2.1}$$

$$\mathrm{Confidence}(R \equiv A \Rightarrow B) = P(B|A) = \frac{|A \cup B|}{|A|} \tag{2.2}$$

$$\mathrm{Lift}(R \equiv A \Rightarrow B) = \frac{P(B|A)}{P(B)} = \frac{P(A|B)}{P(A)} = \frac{|A \cup B| \times N}{|A| \times |B|} \tag{2.3}$$

Here $N$ denotes the total number of transactions in $D$. The support defined in Formula 2.1 describes the proportion of transactions that contain all items in the association. The confidence defined in Formula 2.2 describes the probability of finding the consequent $B$ of the association in transactions that also contain the antecedent $A$. It is a maximum likelihood estimate of the conditional probability $P(B|A)$. The lift defined in Formula 2.3 measures how many times more often $A$ and $B$ occur together than would be expected if they were statistically independent.
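As a minimal sketch of Formulas 2.1 to 2.3, the following Python snippet computes the three measures on a small, made-up transaction database. The item names, the helper names (`count`, `support`, `confidence`, `lift`), and the data are illustrative assumptions and are not part of MARAS.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def count(itemset: Itemset, db: List[Itemset]) -> int:
    """|S|: number of transactions in db that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)


def support(a: Itemset, b: Itemset, db: List[Itemset]) -> float:
    """Formula 2.1: fraction of transactions containing all items of A and B."""
    return count(a | b, db) / len(db)


def confidence(a: Itemset, b: Itemset, db: List[Itemset]) -> float:
    """Formula 2.2: |A u B| / |A| (assumes A occurs in at least one transaction)."""
    return count(a | b, db) / count(a, db)


def lift(a: Itemset, b: Itemset, db: List[Itemset]) -> float:
    """Formula 2.3: |A u B| * N / (|A| * |B|)."""
    return count(a | b, db) * len(db) / (count(a, db) * count(b, db))


# Hypothetical toy database: d* are drugs, a* are ADRs.
db = [
    frozenset({"d1", "d2", "a1"}),
    frozenset({"d1", "d2", "a1", "a2"}),
    frozenset({"d1", "a2"}),
    frozenset({"d2", "a1"}),
]
rule_a = frozenset({"d1", "d2"})  # antecedent A
rule_b = frozenset({"a1"})        # consequent B

print(support(rule_a, rule_b, db))     # 0.5
print(confidence(rule_a, rule_b, db))  # 1.0
print(lift(rule_a, rule_b, db))        # 8 / 6, roughly 1.33
```

On this toy database the rule $d_1, d_2 \Rightarrow a_1$ appears in 2 of 4 transactions, always holds when both drugs are present, and has a lift above 1, meaning the drugs and the ADR co-occur more often than independence would predict.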


Chapter 3

Association Rule Model for

Multi-Drug ADR Signal

3.1 Drug-ADR Association

Let $I_{drug}$ and $I_{ade}$ be the complete sets of drugs and ADRs respectively, where $I_{drug} \cap I_{ade} = \emptyset$ and $I_{drug} \cup I_{ade} = I$. To measure the association between drugs and ADRs, only rules with drug items as antecedents and ADR items as consequents are considered. Therefore, an association rule $R \equiv A \Rightarrow B$ is considered a drug-ADR association if $A \subseteq I_{drug}$ and $B \subseteq I_{ade}$.
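The condition above can be sketched as a simple set-containment check. The function name, the toy vocabularies, and the requirement that both sides be non-empty are assumptions made for illustration only.

```python
def is_drug_adr_association(antecedent, consequent, drugs, adrs):
    """True if A is a subset of I_drug and B is a subset of I_ade.

    Assumption for this sketch: both sides of the rule must be non-empty.
    """
    return (
        bool(antecedent)
        and bool(consequent)
        and antecedent <= drugs
        and consequent <= adrs
    )


# Hypothetical vocabularies using drug and ADR names from the introduction.
I_DRUG = frozenset({"Zometa", "Prilosec"})
I_ADE = frozenset({"Osteoarthritis", "Pain"})

# Drugs on the left, ADRs on the right: kept.
keep = is_drug_adr_association(
    frozenset({"Zometa", "Prilosec"}), frozenset({"Pain"}), I_DRUG, I_ADE
)
# An ADR appears in the antecedent: dropped.
drop = is_drug_adr_association(
    frozenset({"Zometa", "Pain"}), frozenset({"Osteoarthritis"}), I_DRUG, I_ADE
)
print(keep, drop)
```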

3.2 Closed Drug-ADR Association

The traditional association rule model assumes that the correlations among items are indicated by their co-occurrence in the database. Without pre-established dependency constraints among items, existing rule mining techniques [29] consider every possible combination of items that appears in a transaction as an itemset. For example, let $t = \{i_1, i_2, \ldots, i_n\}$ be a transaction. Without any threshold, e.g. minimum support, the total number of possible itemsets that can be generated from this single transaction is:

$$\binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n - 1 \tag{3.1}$$

Because of this, the number of possible associations among items grows exponentially w.r.t. the number of unique items. As many studies [30, 6] point out, a huge amount of redundant information exists within the generated result. In our study, we find that some of the patterns are not only redundant but also misleading in the context of finding drug-ADR associations from ADR reports.
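To make the count in Formula 3.1 concrete, the short sketch below enumerates every non-empty itemset of one hypothetical four-item transaction and checks the total against $2^n - 1$. The helper name and the item labels are illustrative assumptions.

```python
from itertools import combinations
from math import comb


def nonempty_itemsets(transaction):
    """Enumerate every non-empty subset (itemset) of a single transaction."""
    items = sorted(transaction)
    return [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    ]


t = {"i1", "i2", "i3", "i4"}  # hypothetical transaction with n = 4 items
itemsets = nonempty_itemsets(t)
n = len(t)

# Left-hand side of Formula 3.1: sum of binomial coefficients C(n, k), k = 1..n.
lhs = sum(comb(n, k) for k in range(1, n + 1))
print(len(itemsets), lhs, 2 ** n - 1)  # all three are 15
```

Adding one more item to the transaction doubles the number of subsets, which is why brute-force enumeration is only practical for illustration.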

3.3 Type of Drug-ADR Association

Let us consider an abstracted ADR report with a set of drugs taken $A_1 = \{d_1, d_2\}$ and a set of observed ADRs $B_1 = \{a_1, a_2\}$. This single ADR report explicitly establishes the association between $A_1$ and $B_1$, expressed by the rule $R_1 \equiv d_1, d_2 \Rightarrow a_1, a_2$. Based on this single report, traditional association rule mining algorithms generate 9 drug-ADR associations, $(2^2 - 1) \times (2^2 - 1)$, including $R_1$. All rules except $R_1$ are partial interpretations of the report, since certain items, e.g. some drugs or ADRs mentioned in the report, are absent from the rule. These rules capture the associations among partial drug and ADR sets that are only implicitly indicated by the report. In some scenarios, these rules could be misleading unless they are indicated by other reports explicitly or implicitly.

For example, $R_2 \equiv d_1 \Rightarrow a_2$ states that taking $d_1$ causes $a_2$. This may not be true at all, since the report never explicitly indicates this pairwise relationship. However, if some other report exists, e.g. one with drugs $A_2 = \{d_1, d_5, d_6\}$ and ADRs $B_2 = \{a_2, a_3, a_7\}$, that also implicitly indicates this partial interpretation, then $R_2$ can be considered a legitimate association. It can then more safely be claimed as a discovery of an association among drugs and ADRs.
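The count of 9 associations for the single report above can be reproduced with a few lines of Python. This is a brute-force illustration under the assumption that every non-empty subset of the drugs may be paired with every non-empty subset of the ADRs; the helper names are hypothetical.

```python
from itertools import combinations


def nonempty_subsets(items):
    """All non-empty subsets of a set of items, as frozensets."""
    ordered = sorted(items)
    return [
        frozenset(c)
        for k in range(1, len(ordered) + 1)
        for c in combinations(ordered, k)
    ]


def candidate_rules(drugs, adrs):
    """Every rule A => B with A a non-empty subset of drugs, B of adrs."""
    return [(a, b) for a in nonempty_subsets(drugs) for b in nonempty_subsets(adrs)]


report_drugs = frozenset({"d1", "d2"})
report_adrs = frozenset({"a1", "a2"})

rules = candidate_rules(report_drugs, report_adrs)
# Rules that omit at least one drug or ADR of the report are partial.
partial = [(a, b) for a, b in rules if (a, b) != (report_drugs, report_adrs)]
print(len(rules), len(partial))  # 9 candidate rules, 8 of them partial
```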

So far, we have briefly described three types of drug-ADR associations, namely (1) associations that are explicitly indicated by a report, (2) associations that are implicitly indicated by multiple reports, and (3) partial associations. In the context of discovering drug-ADR associations from ADR reports, types 1 and 2 can safely be used as discoveries, whereas type 3 conveys misleading information and therefore should not be considered. Next, we formally define the three types of association and discuss how our system identifies each type.

Let $t$ be an ADR report. Each report consists of a set of drugs taken and a set of observed ADRs, denoted by $t.D$ and $t.A$ respectively. $T = \{t_1, \ldots, t_n\}$ is the set of ADR reports in the database. Let $R$ be a drug-ADR association discovered from $T$.

Definition 3.3.1 Explicitly Supported Drug-ADR Association. $R \equiv A \Rightarrow B$ is explicitly supported by $T$ if there exists at least one $t \in T$ such that $A \cup B = t.D \cup t.A$.

If a drug-ADR association is explicitly supported, then according to this definition a report exists that describes exactly the drugs and ADRs expressed in the association. Other reports that contain these drugs and ADRs can also be used as evidence to measure the significance of this association. For example, if 1 report contains only the drugs and ADRs in the association and 99 other reports contain them as well, then the support of $R$ is 100.

Definition 3.3.2 Implicitly Supported Drug-ADR Association. $R \equiv A \Rightarrow B$ is implicitly supported by $T$ if there exist at least two reports $t_1, t_2 \in T$ such that $A \cup B = (t_1.D \cup t_1.A) \cap (t_2.D \cup t_2.A)$.

The implicitly supported drug-ADR association captures the partial association

that is derived from different reports. It makes sure that the partial association is

not randomly generated due to the nature of the traditional association rule mining

model. If a drug-ADR association is neither explicitly nor implicitly supported, then

it is unsupported and therefore ignored.
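Definitions 3.3.1 and 3.3.2 can be sketched as a small classifier over a list of reports. This is an illustrative brute-force check, not the mining procedure used by MARAS. It assumes each report is represented by the union of its drugs and ADRs, reads "two reports" as two distinct reports, and tests the explicit case first. The function name and the data are hypothetical, with the two reports taken from the running example.

```python
from itertools import combinations


def classify_support(rule_items, reports):
    """Classify the complete itemset A u B of a rule against a list of reports.

    Each report is the frozenset t.D | t.A of all its drugs and ADRs.
    """
    # Definition 3.3.1: some report contains exactly the items of the rule.
    if any(rule_items == r for r in reports):
        return "explicit"
    # Definition 3.3.2: the items are exactly what two distinct reports share.
    if any(rule_items == (r1 & r2) for r1, r2 in combinations(reports, 2)):
        return "implicit"
    return "unsupported"


reports = [
    frozenset({"d1", "d2", "a1", "a2"}),              # first report
    frozenset({"d1", "d5", "d6", "a2", "a3", "a7"}),  # second report
]

r1 = classify_support(frozenset({"d1", "d2", "a1", "a2"}), reports)  # R1
r2 = classify_support(frozenset({"d1", "a2"}), reports)              # R2: d1 => a2
r3 = classify_support(frozenset({"d2", "a1"}), reports)              # d2 => a1
print(r1, r2, r3)  # explicit implicit unsupported
```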

3.4 Mining Supported Drug-ADR Rule Using Closed Itemset

For a set of frequent itemsets mined from a particular dataset, closed itemsets [6] are a subset of these regular itemsets that conveys the same amount of information. By removing the redundant itemsets, this subset compactly represents the regular itemsets without losing any information. A closed frequent itemset is defined as follows:

Definition. 3.4.1 An itemset S is a closed itemset if there exists no itemset S ′

such that (1) S ′ is a proper superset of S, and (2) every transaction containing S

also contains S ′ . A closed itemset S is frequent if its support passes the given

support threshold.
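Definition 3.4.1 can be checked directly: an itemset is closed exactly when it equals the intersection of all transactions that contain it. The sketch below is a hedged illustration of that test; the names `support` and `is_closed`, the toy transactions, and the convention that an itemset contained in no transaction is treated as not closed are assumptions of this example.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= frozenset(t))


def is_closed(itemset, transactions):
    """True if no proper superset occurs in every transaction containing itemset."""
    itemset = frozenset(itemset)
    covering = [frozenset(t) for t in transactions if itemset <= frozenset(t)]
    if not covering:
        # Assumed convention: an itemset that never occurs is not closed.
        return False
    # Items shared by all covering transactions form the closure of the itemset.
    closure = frozenset.intersection(*covering)
    return closure == itemset


# Hypothetical toy transactions, for illustration only.
transactions = [
    {"aspirin", "warfarin", "bleeding"},
    {"aspirin", "warfarin", "bleeding", "nausea"},
    {"aspirin", "ibuprofen", "nausea"},
]

print(is_closed({"aspirin", "warfarin", "bleeding"}, transactions))
print(is_closed({"warfarin"}, transactions))
```

Here {warfarin} is not closed because every transaction containing it also contains aspirin and bleeding, so the larger itemset carries the same support information.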

In our study, we find that if the complete itemset (A ∪ B) in a rule R ≡ A ⇒ B is closed, then R must be a supported drug-ADR association. By complete itemset we mean the union of all the items in the antecedent and the consequent of the rule. Furthermore, we postulate that for all type 1 & 2 drug-ADR associations w.r.t. a specific dataset, their complete itemset must be closed. In the context of finding drug-ADR associations, using a closed itemset as the complete itemset of a rule not only compresses the ruleset but also removes any semantically misleading information as explained in the previous section.

Lemma 3.4.2 R ≡ A ⇒ B is a supported drug-ADR association if the itemset

(A ∪ B) is a closed itemset.

Proof For a rule R ≡ (A ⇒ B), if A ∪ B is closed then there does not exist a proper superset that has the same support as A ∪ B. With zero minimum support, there are two possibilities causing such non-existence: (1) either no report containing A ∪ B has more items than A ∪ B, which would mean R must be explicitly supported, (2) or A ∪ B is not a randomly generated subset, meaning that it must be the common subset of at least two different reports, and therefore R is implicitly supported.

Using Lemma 3.4.2, we only generate and consider the associations with a closed complete itemset to ensure the quality of the ruleset. Furthermore, since the goal of this study is to discover ADRs that are associated with a combination of drugs, a drug-ADR association is evaluated only if it has more than one drug.

3.5 Multi-level Contextual Association Cluster

In this particular study, our goal is to capture the ADRs associated with multiple drugs. The implication of this definition of interestingness is that an association is interesting only if the ADRs are highly associated with the complete set of drugs rather than with any individual drug or subset of drugs in the rule.

While the existing measures [17] are able to find that some drug combinations

are highly associated with particular ADRs, they fail to verify whether this strong

association is in fact already dominated by a subset of the same drugs. Such domination by a subset of the drugs may weaken the drug-drug interaction signal.


Table 3.1: Example of a MCAC

R     [XOLAIR] [SINGULAIR] [PREDNISONE] ⇒ [Asthma]

R̃2    R̃21 ≡ [XOLAIR] [SINGULAIR] ⇒ [Asthma]
      R̃22 ≡ [XOLAIR] [PREDNISONE] ⇒ [Asthma]
      R̃23 ≡ [SINGULAIR] [PREDNISONE] ⇒ [Asthma]

R̃1    R̃11 ≡ [XOLAIR] ⇒ [Asthma]
      R̃12 ≡ [SINGULAIR] ⇒ [Asthma]
      R̃13 ≡ [PREDNISONE] ⇒ [Asthma]

For example, if the ADRs are highly associated with an individual drug in the combination, it means that the ADRs are likely caused by this particular drug instead of the drug-drug interaction.

To measure this notion of exclusiveness of the association between drugs and

ADRs, any rule that describes the association between a subset of drugs and the

ADRs needs to be considered as well. These related associations are henceforth

referred to as the contextual rules of the association that is being evaluated. In

particular, we now define the contextual rule as below.

Definition. 3.5.1 A drug-ADR association R̃ ≡ X ⇒ Y is a contextual rule of a drug-ADR association R ≡ A ⇒ B if and only if X ⊂ A and Y ≡ B.

Definition. 3.5.2 The context of a Drug-ADR association R is a set of contextual rules of R denoted by C ≡ {R̃1,...,R̃n} such that

$$\bigcup_{i=1}^{n} \tilde{R}_i.\mathrm{antecedent} \equiv \mathcal{P}(R.\mathrm{antecedent}) - \{R.\mathrm{antecedent}, \emptyset\}$$

where P(X) is the power set of an itemset X.

A multi-level contextual association cluster refers to a combination of an evaluated drug-ADR association and its context. The evaluated association is called the target rule. The contextual rules are grouped according to the cardinality of their antecedents. Table 3.1 displays a drug-ADR association and its entire context. In the example, R̃ki denotes a contextual rule and k is the cardinality of its antecedent.
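The construction of a context can be sketched in a few lines of Python. The helper below enumerates every non-empty proper subset of the target rule's drugs and groups the resulting contextual rules by antecedent cardinality, mirroring the layout of Table 3.1. The function name `build_context` and the dictionary-of-lists representation are assumptions of this sketch, not part of the thesis.

```python
from itertools import combinations


def build_context(antecedent, consequent):
    """Group the contextual rules of (antecedent => consequent) by cardinality k.

    Returns a dict mapping k to a list of (sub_antecedent, consequent) pairs,
    one pair per non-empty proper subset of the antecedent.
    """
    drugs = sorted(antecedent)
    adrs = frozenset(consequent)
    context = {}
    # k runs from 1 to n - 1, which excludes the empty set and the full antecedent.
    for k in range(1, len(drugs)):
        context[k] = [(frozenset(subset), adrs) for subset in combinations(drugs, k)]
    return context


context = build_context({"XOLAIR", "SINGULAIR", "PREDNISONE"}, {"Asthma"})
for k in sorted(context, reverse=True):
    for sub_antecedent, adrs in context[k]:
        print(k, sorted(sub_antecedent), "=>", sorted(adrs))
```

For a target rule with n drugs this yields 2^n − 2 contextual rules, so the three-drug example produces the six rules listed in Table 3.1.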


3.6 Exclusiveness Score for Drug-Drug Interaction Signal

As we explained, if ADRs are caused by the interaction of a set of drugs, normally any subset of these drugs is not, or is only weakly, associated with the particular ADRs. Inspired by this observation, we propose the exclusiveness measure that uses the context information to evaluate the interestingness of a drug-ADR association in terms of indicating a drug-drug interaction.

The improvement measure proposed by [19] to evaluate the interestingness of the “lengthy” rules derived from a dense dataset is defined in Formula 3.2.

$$\mathrm{Improvement}(A \Rightarrow B) = \min\{\, \mathrm{conf}(A \Rightarrow B) - \mathrm{conf}(A_s \Rightarrow B) \mid A_s \subset A \,\} \quad (3.2)$$

A rule with negative improvement is typically undesirable because the rule can

be simplified to yield a proper sub-rule that is more predictive. Also it applies to

an equal or larger population due to the antecedent containment relationship. The

notion of “sub-rule” relates to our notion of contextual association. If a drug-ADR

association is evaluated using improvement, negative improvement means that there

exists an individual drug or a subset of drugs in the rule which is more likely to cause

the ADRs. Therefore, rules with a negative or low improvement value are not interesting. The improvement measure reflects a meaning of interestingness similar to our definition. However, it only considers the one sub-rule that is the most significant of all. Overlooking the other sub-rules removes the opportunity to differentiate among several interesting cases. For example, even if two drug-ADR associations share the same improvement value, the one with a larger number of high-confidence sub-rules may be less interesting than the other one because more subsets of the drugs seem to cause the same ADRs. To utilize the entire context to evaluate the drug-ADR association, we propose the exclusiveness measure.
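To make the improvement measure concrete, the sketch below computes it on a toy set of reports. It assumes that the minimum in Formula 3.2 ranges over the non-empty proper subsets of the antecedent and that confidence is the fraction of reports containing the antecedent that also contain the consequent; the function names, the item names and the reports are hypothetical.

```python
from itertools import combinations


def confidence(antecedent, consequent, reports):
    """Fraction of reports containing the antecedent that also contain the consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    covered = [r for r in reports if antecedent <= r]
    if not covered:
        return 0.0
    return sum(1 for r in covered if consequent <= r) / len(covered)


def improvement(antecedent, consequent, reports):
    """Smallest confidence gain of the rule over any of its sub-rules."""
    drugs = sorted(antecedent)
    if len(drugs) < 2:
        raise ValueError("improvement needs at least two drugs in the antecedent")
    target = confidence(drugs, consequent, reports)
    gains = [
        target - confidence(subset, consequent, reports)
        for k in range(1, len(drugs))
        for subset in combinations(drugs, k)
    ]
    return min(gains)


# Hypothetical toy reports, for illustration only.
reports = [
    frozenset({"drugA", "drugB", "adr"}),
    frozenset({"drugA", "drugB", "adr"}),
    frozenset({"drugA"}),
    frozenset({"drugB"}),
    frozenset({"drugA", "adr"}),
]

print(improvement({"drugA", "drugB"}, {"adr"}, reports))
```

In this data the rule has confidence 1.0, while drugA alone reaches 0.75 and drugB alone about 0.67. The improvement is therefore 0.25, determined solely by the strongest sub-rule, which is exactly the limitation discussed above.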

Let V = {v1, ..., vn} be a set of confidence values w.r.t the context of a rule R

and p be the confidence value of R. To utilize the full context, our proposed score in

Formula 3.3 uses the average confidence of the context to compute the exclusiveness.

$$\mathrm{Exclusiveness}(R) = p - \frac{1}{n}\sum_{k=1}^{n} v_k \quad (3.3)$$

The deficiency of this method is that it falsely weakens the negative effect of a contextual rule with a high confidence, since the average confidence can be much lower than the maximum confidence. To overcome this, we introduce the coefficient of variation Cv (the ratio of the standard deviation to the mean) to the measure. Computed as in Formula 3.4, a context containing both extremely high and extremely low confidence values will be penalized. θ (0 ≤ θ ≤ 1) is a parameter that allows the user to control the effect of this penalty.

$$\mathrm{Exclusiveness}(R) = \left(p - \frac{1}{n}\sum_{k=1}^{n} v_k\right) \times \left(1 - \theta \cdot C_v(v_k)\right) \quad (3.4)$$

Following the intuition, contextual rules that describe the association between an individual drug and the ADRs are very important for measuring the exclusiveness of the association between the complete set of drugs and the ADRs. As the number of drugs in the contextual rule increases, the importance of the corresponding exclusiveness decreases. For this reason, we introduce a decay function fd(k) to decrease the importance of the contextual rules as the cardinality of their antecedent increases.

$$\mathrm{Exclusiveness}(R) = \frac{1}{|V|}\sum_{k=1}^{|V|} (p - \bar{v}_k) \times f_d(k) \times \left(1 - \theta \cdot C_v(v_k)\right) \quad (3.5)$$

vk is the set of confidence values of the contextual rules with k drugs, and v̄k is its mean. V denotes the complete set of vk for R. In our experiment, we use a linear decay function. If n is the number of drugs in an association, the weight for the average exclusiveness over contextual rules R̃k is (1 − (k − 1)/n). As mentioned in [19], the confidence in this computation could be replaced by other reasonable measures. For instance, we also experiment with lift in our evaluation study.
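A possible reading of Formula 3.5 with the linear decay function is sketched below. Several details are assumptions of this example rather than statements of the thesis: the coefficient of variation uses the population standard deviation, a zero mean yields a coefficient of 0, θ defaults to 0.5, and the context is passed as a dictionary from k to the list of confidence values of the contextual rules with k drugs.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(values)
    if m == 0:
        return 0.0
    return pstdev(values) / m


def linear_decay(k, n):
    """Weight 1 - (k - 1) / n for contextual rules with k of the n drugs."""
    return 1 - (k - 1) / n


def exclusiveness(p, context_conf, n, theta=0.5):
    """Average, over the context levels, of the decayed and penalized confidence gap.

    p: confidence of the target rule.
    context_conf: dict mapping k to the confidences of contextual rules with k drugs.
    n: number of drugs in the target rule.
    """
    if not context_conf:
        raise ValueError("the context must contain at least one level")
    total = 0.0
    for k, v_k in context_conf.items():
        gap = p - mean(v_k)
        penalty = 1 - theta * coefficient_of_variation(v_k)
        total += gap * linear_decay(k, n) * penalty
    return total / len(context_conf)


# Hypothetical confidence values, for illustration only.
uniform = exclusiveness(0.9, {1: [0.2, 0.2]}, n=2)
spread = exclusiveness(0.9, {1: [0.0, 0.4]}, n=2)
print(uniform, spread)
```

Both contexts have the same average confidence, but the second one hides a drug with confidence 0.4 behind a drug with confidence 0.0, so its score is reduced by the variation penalty.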


Chapter 4

Visualizing Drug-ADR Association

Glyphs [7] are a popular technique to visualize multivariate data. A glyph is a display object that consists of various attributes including shape, size, color and position. Each of these attributes can be used to describe different variables of the displayed data. Glyphs have been used before to visualize association rules [15]. However, existing approaches only consider a single rule at a time. They do not work on a cluster of rules that is composed of sub-rules as explained in Section 3.5. Therefore, to provide a flexible methodology to help analysts quickly comprehend similarities and differences among various multi-level association clusters, we propose the Contextual Glyph (CG) depicted in Fig. 4.1. The user study confirmed that CGs are a better way to pinpoint interesting drug-interactions.

The inner circle represents the target rule. The diameter of the circle encodes the confidence value of the target rule. The surrounding circular sectors represent the contextual rules. For each circular sector, the distance from the arc to the inner circle encodes the confidence value of its represented rule. Starting from 12 o’clock, contextual rules are uniformly laid out ordered by the cardinality of their antecedent. Rules with the same cardinality, identified by the same color (the darker the larger), are ordered by their confidence values. In our contextual glyph, the larger the inner circle and the smaller the outer sectors are, the higher the rank of the group is, showing a strong association between the ADRs and a drug combination. The user-friendly interactive interface provides options for drilling down into each glyph. Clicking on any glyph shows a zoomed view with further insights about the rule cluster, and hovering over any segment displays further information about the rule. These contextual glyphs provide a flexible way of comparing interesting and non-interesting drug interactions. An analyst can easily identify similarly ranked ones and get further insights by drilling down.

Figure 4.1: A Contextual Glyph. (Labels: Confidence of the Evaluated Rule; Confidence of the Contextual Rule; # of Drugs: 3, 2, 1; Ordered by Confidence value.)
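The visual encoding described above can be expressed as a small layout computation. The following sketch is only one possible interpretation: the scaling constant `max_radius`, the clockwise direction, the ascending order of cardinality and the descending order of confidence within a cardinality are assumptions, since the text fixes only the starting position and the grouping.

```python
def glyph_layout(target_conf, context_conf, max_radius=100.0):
    """Compute the geometry of one contextual glyph.

    target_conf: confidence of the target rule (sets the inner circle diameter).
    context_conf: dict mapping cardinality k to the confidences of its rules.
    Returns (inner_diameter, sectors); angles are degrees from 12 o'clock.
    """
    ordered = [
        (k, conf)
        for k in sorted(context_conf)
        for conf in sorted(context_conf[k], reverse=True)
    ]
    if not ordered:
        raise ValueError("a glyph needs at least one contextual rule")
    step = 360.0 / len(ordered)  # every contextual rule gets the same angular width
    sectors = []
    for i, (k, conf) in enumerate(ordered):
        sectors.append({
            "cardinality": k,            # selects the color shade of the sector
            "start_deg": i * step,
            "end_deg": (i + 1) * step,
            "depth": conf * max_radius,  # distance from the inner circle to the arc
        })
    return target_conf * max_radius, sectors


# Hypothetical confidence values, for illustration only.
inner_diameter, sectors = glyph_layout(
    0.9, {1: [0.2, 0.5, 0.1], 2: [0.3, 0.4, 0.6]}
)
for sector in sectors:
    print(sector)
```

A highly ranked cluster then shows up as a large inner circle surrounded by shallow sectors.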

4.1 MARAS Interface

MARAS provides a visual interactive interface (Fig. 4.2) with the following functionalities to help an analyst explore drug-related information:


Figure 4.2: Panoramagram of Glyphs

Highlighting interesting drug-drug interactions: The system lets a user select a few interesting drug-drug interactions from the pool of thousands of rules by searching for a specific drug, a combination of drugs or a specific ADR. Moreover, the user can also select drug interactions based on defined criteria of interestingness, such as drug interactions that may lead to severe ADRs and might therefore need immediate action for further investigation.

Visualization of ADRs associated with these interactions: Once a user selects the interesting drug-drug interactions, they can get further insights about an interaction by visualizing the corresponding multi-level contextual group of drugs and ADRs, i.e., the association of each individual drug with the ADRs (Fig. 4.1). Furthermore, the system can highlight drug-drug interactions that are similar to each other based on the defined interestingness criteria.


Figure 4.3: Zoom-in Glyph View


Mapping the drug-drug interactions to actual reports: After the system generates the plausible drug-drug interactions, they need to be further investigated in order to find the relevant factors causing the interaction, such as the patient’s age, health history, etc. It is essential to analyze the original reports submitted by patients that support the corresponding drug-drug interactions.


Chapter 5

Experimental Evaluation

5.1 Data Source

The FDA Adverse Event Reporting System (FAERS) is a database maintained by the FDA as a part of its post-marketing safety surveillance program for drugs and therapeutic biologic products. FAERS contains millions of records about adverse events and medication errors and is publicly available on a quarterly basis. For the purpose of this research we used the public version of the FAERS [3] dataset from 2014. We selected the mandatory reports submitted by manufacturers marked as expedited (EXP), as these reports contain at least one severe adverse event. Table 5.1 provides the basic statistics of the dataset we selected from each quarter in 2014.

Table 5.1: FAERS Data From 2014

          Q1        Q2        Q3        Q4
Reports   126,755   138,278   121,725   121,490
Drugs     37,661    37,780    33,133    32,721
ADRs      9,079     9,324     9,418     9,234


5.2 Mining Process

The first step in the mining process is data preparation and cleaning. We extracted the drugs and ADRs from FAERS reports and merged them for each single case. We performed some preliminary cleaning on drug names and ADRs to remove duplication and correct misspellings. The second step is to apply association rule mining on the pre-processed data. We use FP-Growth trees for closed itemset and rule generation. The reason for using closed itemsets is to remove misleading rules as described in Section 3.4.

The third step is to select only those rules that have drugs as antecedent and ADRs as consequent to align with our goal of discovering ADRs related to drugs. The fourth step is to generate multi-level contextual association clusters and rank them using the exclusiveness measure, as explained in Sections 3.5 and 3.6 respectively.
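The third step of the process can be illustrated with a short filter. The sketch assumes that the mined rules are available as (antecedent, consequent) pairs of sets and that the vocabularies of drug names and ADR terms are known; the function name `is_drug_adr_rule` and all sample values are hypothetical.

```python
def is_drug_adr_rule(antecedent, consequent, drugs, adrs):
    """True if the antecedent holds only drugs and the consequent only ADRs."""
    antecedent, consequent = set(antecedent), set(consequent)
    return (
        bool(antecedent)
        and bool(consequent)
        and antecedent <= drugs
        and consequent <= adrs
    )


# Hypothetical vocabularies and mined rules, for illustration only.
drugs = {"ibuprofen", "metamizole"}
adrs = {"acute renal failure", "pain"}
rules = [
    ({"ibuprofen", "metamizole"}, {"acute renal failure"}),  # drugs => ADR
    ({"pain"}, {"ibuprofen"}),                               # ADR => drug
    ({"ibuprofen", "pain"}, {"acute renal failure"}),        # mixed antecedent
]

kept = [rule for rule in rules if is_drug_adr_rule(rule[0], rule[1], drugs, adrs)]
print(kept)
```

Only the first rule survives, because the other two place an ADR in the antecedent or a drug in the consequent.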

5.3 Result At-A-Glance

Fig. 5.1 summarizes the number of associations generated by different methods. Total rules refer to the associations generated by the traditional association rule mining algorithm. Filtered rules refer to all the possible drug-ADR associations. MCACs refer to the closed drug-ADR associations that are used to signal the drug-drug interactions. As depicted in the figure, our proposed method significantly reduces the rule space by removing the redundant and misleading associations.

Figure 5.1: Reduction in number of rules. (Logarithmic y-axis from 1.E+00 to 1.E+07; x-axis: Q1, Q2, Q3, Q4; series: Total Rules, Filtered Rules, MCACs.)

Table 5.2 shows the top 5 multi-drug associations from 2014 Quarter 1 data ranked by 4 different methods. The Confidence and Lift columns are the multi-drug associations ordered by their confidence and lift values. These two methods do not filter the rules using closed itemsets. As a result, there are many similar rules. Based on our observation explained in Section 3.2, most of these similar rules are redundant and misleading. Among these top-ranked rules, some patterns are uninteresting because an individual drug listed in the pattern is able to trigger the same ADRs. Exclusiveness with Confidence and Exclusiveness with Lift select the closed multi-drug associations and rank them using our proposed exclusiveness measure with confidence and lift values, respectively. The top-ranked rules are more diverse as compared to the first two methods. Since lift considers the support of the consequent, in our case, rules with relatively less frequent ADRs are ranked higher by the methods that involve lift. To evaluate the quality of these results, we conduct a case study.
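The effect of lift on the ranking can be seen in a small worked example. The sketch uses the standard definitions, confidence as the conditional frequency of the consequent given the antecedent and lift as confidence divided by the relative support of the consequent; the function names, item names and reports are hypothetical.

```python
def relative_support(itemset, reports):
    """Fraction of reports that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for r in reports if itemset <= r) / len(reports)


def confidence(antecedent, consequent, reports):
    """Support of antecedent and consequent together, divided by antecedent support."""
    both = relative_support(frozenset(antecedent) | frozenset(consequent), reports)
    return both / relative_support(antecedent, reports)


def lift(antecedent, consequent, reports):
    """Confidence normalized by how frequent the consequent is overall."""
    return confidence(antecedent, consequent, reports) / relative_support(consequent, reports)


# Hypothetical toy reports, for illustration only.
reports = [
    frozenset({"drugA", "drugB", "rare_adr"}),
    frozenset({"drugC", "drugD", "common_adr"}),
    frozenset({"drugE", "common_adr"}),
    frozenset({"drugF", "common_adr"}),
]

print(lift({"drugA", "drugB"}, {"rare_adr"}, reports))
print(lift({"drugC", "drugD"}, {"common_adr"}, reports))
```

Both rules have confidence 1.0, yet the rule with the rare ADR obtains a lift of 4.0 against roughly 1.33 for the common ADR, which is why lift-based rankings favor less frequent ADRs.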

5.4 Case Study

MARAS identifies potential drug-drug interactions from the data and ranks the interesting drug-drug interactions based on the exclusiveness measure explained in Section 3.6. Since the exclusiveness measure is based on contextual information that is rather subjective in nature, it is difficult to conduct a straightforward experimental evaluation of the system to measure its effectiveness. Hence, to evaluate the


Table 5.2: Top 5 Multi-Drug Associations from 1st Quarter of 2014.

Pepcid Host Disease Acute Graft Versus Host Disease

Rank Confidence Lift Exclusiveness with Confidence Exclusiveness with Lift

1

Zantac

Osteoporosis

Methotrexate Chronic Graft Versus Host Disease Zometa Osteonecrosis Of

Jaw Prograf Chronic Graft Versus Host Disease Nexium Prograf

Tums Granulocyte Colony-Stimulating Factor Nos Drug Ineffective Ambien Anxiety Melphalan Drug Ineffective

Mylanta

2

Zantac

Osteoporosis

Prograf Chronic Graft Versus Host Disease Prograf

Drug Ineffective

Zometa Osteoarthritis

Tums Granulocyte Colony-

Stimulating Factor Nos Drug Ineffective Melphalan Prilosec Neuropathy Peripheral

Mylanta Osteonecrosis Of Jaw

Pain

3

Zantac

Osteoporosis

Methotrexate Chronic Graft Versus Host Disease Prograf

Drug Ineffective

Zometa Osteoarthritis

Nexium Granulocyte Colony-

Stimulating Factor Nos Drug Ineffective Granulocyte

Colony-Stimulating Factor Nos

Prilosec Anaemia

Mylanta Osteonecrosis Of Jaw

Pain

4 Zantac

Osteoporosis Prograf Chronic Graft Versus

Host Disease ZometaPain Osteonecrosis Of Jaw Prograf Drug Ineffective

Mylanta Melphalan Drug Ineffective Prilosec Pain Melphalan Acute Graft Versus Host Disease

5

Rolaids

Osteoporosis Zometa

Osteoarthritis Fludarabine

Drug Ineffective Prograf

Acute Graft Versus Host

Disease Zantac Neuropathy Peripheral

Nexium Prilosec

Osteonecrosis Of Jaw Prograf Granulocyte

Colony-Stimulating Drug Ineffective Tums Pain

MARAS system and its effectiveness, we conduct case studies using FAERS patient reports. The goal is to validate the top-ranked drug-drug interactions identified by MARAS through existing biomedical literature and domain knowledge resources.

Case I: Ibuprofen and Metamizole drug-drug interaction.

One of the top-ranked drug-drug interactions identified by MARAS is between Ibuprofen, a nonsteroidal anti-inflammatory drug, and Metamizole, an analgesic, antipyretic and anti-inflammatory agent. This interaction is identified from mining the FAERS 2014

second quarter reports and is ranked third by the MARAS system. We found that

the combination of these two drugs is highly associated with acute renal failure.

We validated this drug-drug interaction with the results of a study published in the

World Health Organization (WHO) Pharmaceuticals Newsletter, 2014 [22]. The

study was conducted on data from VigiBase [20], the WHO Global ICSR Database, and found a statistically significant and valid drug interaction when the two drugs are used in combination. Incidentally, Metamizole is an international drug

that is used as a pain killer and fever reducer in Mexico and many other countries.

Although it is prohibited in the United States, it is widely used among the Latino and other immigrant populations [8].

Case II: Methotrexate and Prograf drug-drug interaction.

Our results indicate that a combination of Methotrexate and Prograf (Tacrolimus)

is associated with the drug being ineffective. Methotrexate is used to treat certain

types of cancers, rheumatoid arthritis and psoriasis. Prograf (Tacrolimus) is used in

post organ transplantation to prevent rejection by the human body. This interaction

is ranked second by MARAS. We validated this drug-drug interaction using two

sources: Drugs.com [2], an FDA-recommended resource for obtaining valuable and reliable information on drug-drug interactions, and DrugBank.ca [11], a drug database that contains comprehensive biochemical and pharmacological information providing insights on drug-drug interactions. According to Drugs.com, Methotrexate may

cause kidney problems, and combining it with other medications that can also affect

the kidney such as Tacrolimus may increase that risk. According to DrugBank, the

risk or severity of adverse effects can be increased when Tacrolimus is combined with

Methotrexate. Therefore, when two drugs that have similar adverse reactions are

taken concomitantly, their adverse effects might add up and contribute even more

so towards the occurrence of the ADRs.

Case III: Prevacid and Nexium drug-drug interaction.

Our results indicate that a concomitant use of Prevacid and Nexium is associated with Osteoporosis. This interaction is ranked fourth by the proposed system. Both Prevacid and Nexium belong to a group of drugs called proton pump inhibitors (PPI) that are used to treat gastroesophageal reflux disease (GERD) by suppressing the secretion of gastric acid. Several studies [13, 25] have shown that patients taking PPI drugs are at an increased risk of developing Osteoporosis and related bone fractures. FDA revised the original drug label for these PPI drugs to include safety information indicating possible side effects of osteoporosis and a fracture warning [4]. We validated this drug-drug interaction using Drugs.com [2]. According to Drugs.com, the interaction is classified as a Therapeutic Duplication, meaning that drugs from the same category are used to treat the same condition, and the recommended maximum number of drugs in the ’acid suppressant agents’ category to be taken concurrently is usually one. Such duplication is either intentional (drugs combined together for therapeutic benefit) or unintentional (the patient self-prescribed, has been treated by more than one doctor, or had prescriptions filled at more than one pharmacy). Either way, combining these drugs can potentially increase the risk of osteoporosis as opposed to when these drugs are taken individually.

The above three case studies show that MARAS can detect and identify ADRs associated with a combination of drugs. In the case studies, for validation purposes, we have intentionally selected already known and published drug-drug interactions from our top-ranked results. However, we extend the notion and claim that if MARAS can detect known drug-drug interactions, it is also capable of detecting unlabeled or unknown drug-drug interactions.

5.4.1 User Study

We conduct a user study to evaluate which visual presentation of contextual groups (MCACs) is more effective: Contextual Glyphs (CG) or bar-charts, displayed in Fig. 4.1 and Fig. 5.3 respectively. We invited 50 students from WPI to identify interesting drug-drug interactions containing two to four drugs, using both bar-charts and CGs. For each question the user was given two visuals, one bar chart and one CG, both representing the same drug interactions, and the user was asked to pick the interesting drug interactions. The details about the questions are stated in Appendix A. Fig. 5.2 shows the percentage of users who were quickly able to recognize an interesting pattern correctly using both visuals. Users could accurately identify top-ranked interesting drug interactions faster using CGs than using bar-charts. Using a CG, 71% of users were able to pinpoint the interesting interactions for two-drug combinations, 57% for three-drug combinations and 86% for four-drug combinations. This indicates that CGs save users considerable effort and time in pinpointing the interesting drug-drug interactions. Therefore, we selected contextual glyphs to represent the drug-drug interactions in our MARAS system.

Figure 5.2: User study results. (y-axis: Percentage (%), 0 to 100; x-axis: Number of Drugs: Two, Three, Four; series: Contextual Glyph, Barchart.)


Figure 5.3: Bar-chart representing MCAC


Chapter 6

Related Work

Drug-drug interactions: Drug interactions leading to ADRs have been studied previously. For example, Tatonetti et al. [26, 27, 28] have used statistical methods to find interactions among drug classes. However, these methods are specific to a subset of drugs and ADRs only. Hence, they do not consider all reported drugs and ADRs, which is crucial for drug surveillance. On the other hand, unsupervised methods, and in particular association rule mining, have been used in the medical domain to explore drug-related ADRs [12, 18, 16]. However, these methods have only considered the identification of ADRs related to a single drug, rather than a combination of drugs. Detecting drug-drug interactions is nevertheless crucial, as about 30% of adverse reactions occur due to such interactions.

Drug-drug interaction with association rule mining: [17] has used association rule mining with relative reporting ratio to find drug interactions triggering a set of ADRs. However, this approach neither removes spurious and misleading rules nor highlights interesting drug-drug interactions based on contextual information.

Interestingness in association rule mining: Various attempts have been made in the literature to reduce the number of generated rules and rank the most interesting ones [23, 30, 6]. However, the majority of these measures are either for classification rules or are subjective measures that need domain-specific knowledge to define interestingness. Sub-rule-based interestingness has been studied by [10], where interestingness is defined as an unexpected confidence among a neighborhood. The interestingness based on sub-rules’ confidence, known as improvement, has been proposed by [19] to ensure that for every rule none of its simplifications offers any predictive advantage over it. None of these existing methods captures the most interesting associations among multiple drugs and ADRs.


Chapter 7

Conclusion

We proposed the MARAS technology for detecting drug-drug interactions. We defined the criteria of interestingness in the context of multi-drug adverse drug reaction association. Our visual mining technology helps an evaluator explore and analyze these interactions in further detail. MARAS can effectively identify drug-drug interactions while providing a new exploration experience.


Appendix A

User Study

A.1 Sample of interesting and non-interesting groups

First, samples of top-ranked and bottom-ranked drug interactions with both contextual glyphs and bar-charts were shown to users (Fig. A.1, A.2, A.3).

A.2 User Study Questions

Question 1: User had to select the top ranked (interesting) two drug interaction

from Fig. A.4 and A.5 using bar-charts and contextual glyph (CG).

Question 2: User had to select top three ranked (interesting) two drug interaction

from Fig. A.6 and A.7 using bar-charts and contextual glyph (CG).

Question 3: User had to select the top ranked (interesting) three drug interaction

from Fig. A.8 and A.9 using bar-charts and contextual glyph (CG).

Question 4: User had to select top two ranked (interesting) three drug interaction from Fig. A.10 and A.11 using bar-charts and contextual glyph (CG).

Question 5: User had to select the top ranked (interesting) four drug interaction from Fig. A.12 and A.13 using bar-charts and contextual glyph (CG).

Figure A.1: Sample of two drugs

Figure A.2: Sample of three drugs

Figure A.3: Sample of four drugs

Figure A.4: Question 1 with barchart

Figure A.5: Question 1 with CG

Figure A.6: Question 2 with barchart

Figure A.7: Question 2 with CG

Figure A.8: Question 3 with barchart

Figure A.9: Question 3 with CG

Figure A.10: Question 4 with barchart

Figure A.11: Question 4 with CG

Figure A.12: Question 5 with barchart

Figure A.13: Question 5 with CG

References

[1] Adverse drug reactions. http://www.fda.gov/Drugs/DevelopmentApprovalProcess/DevelopmentResources/DrugInteractionsLabeling/ucm110632.htm. [Online; accessed 21-March-2016].

[2] Drugs.com. http://www.drugs.com. [Accessed 2016-04-20].

[3] openFDA. https://open.fda.gov/drug/event. [Accessed 2016-04-20].

[4] FDA. http://www.fda.gov/Drugs/DrugSafety/PostmarketDrugSafetyInformationforPatientsandProviders/ucm213206.htm. [Accessed 2016-04-20].

[5] FDA adverse event reporting system (FAERS). http://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Surveillance/AdverseDrugEffects/ucm082193.htm. [Online; accessed 11-March-2016].

[6] Y. Bastide, N. Pasquier, R. Taouil, G. Stumme, and L. Lakhal. Mining minimal non-redundant association rules using frequent closed itemsets. In Computational Logic - CL 2000, pages 972–986. Springer, 2000.

[7] J. Beddow. Shape coding of multidimensional data on a microcomputer display. In IEEE Visualization, pages 238–246, 1990.

[8] J. L. Bonkowsky, J. K. Frazer, K. F. Buchi, and C. L. Byington. Metamizole use by Latino immigrants: a common and potentially harmful home remedy. Pediatrics, 109(6):e98–e98, 2002.

[9] T. Chan. Adverse interactions between warfarin and nonsteroidal antiinflammatory drugs: mechanisms, clinical significance, and avoidance. The Annals of Pharmacotherapy, 29(12):1274–1283, 1995.

[10] G. Dong and J. Li. Interestingness of discovered association rules in terms of neighborhood-based unexpectedness. In PAKDD, pages 72–86, 1998.

[11] V. L. et al. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Research, 42(Database-Issue):1091–1097, 2014.

[12] D. M. Fram, J. S. Almenoff, and W. DuMouchel. Empirical Bayesian data mining for discovering patterns in post-marketing drug safety. In ACM SIGKDD, pages 359–368, 2003.

[13] L. Fraser, W. Leslie, L. Targownik, A. Papaioannou, J. Adachi, C. R. Group, et al. The effect of proton pump inhibitors on fracture risk: report from the Canadian Multicenter Osteoporosis Study. Osteoporosis International, 24(4):1161–1168, 2013.

[14] C. D. Furberg and B. Pitt. Withdrawal of cerivastatin from the world market. Curr Control Trials Cardiovasc Med, 2(5):205–207, 2001.

[15] E. A. Gale. Troglitazone: the lesson that nobody learned? Diabetologia, 49(1):1–6, 2006.

[16] M. R. Hacene, Y. Toussaint, and P. Valtchev. Mining safety signals in spontaneous reports database using concept analysis. In Artificial Intelligence in Medicine, pages 285–294, 2009.

[17] R. Harpaz, H. S. Chase, and C. Friedman. Mining multi-item drug adverse effect associations in spontaneous reporting systems. BMC Bioinformatics, 11(S-9):S7, 2010.

[18] H. Jin, J. Chen, H. He, G. J. Williams, C. Kelman, and C. M. O’Keefe. Mining unexpected temporal associations: Applications in detecting adverse drug reactions. IEEE Trans. Information Technology in Biomedicine, 12(4):488–500, 2008.

[19] R. J. B. Jr., R. Agrawal, and D. Gunopulos. Constraint-based rule mining in large, dense databases. In IEEE ICDE, pages 188–197, 1999.

[20] M. Lindquist. VigiBase, the WHO global ICSR database system: basic facts. Drug Information Journal, 42(5):409–419, 2008.

[21] M. Liu, M. E. Matheny, Y. Hu, and H. Xu. Data mining methodologies for pharmacovigilance. ACM SIGKDD Explorations, 14(1):35–42, 2012.

[22] World Health Organization et al. WHO pharmaceuticals newsletter. WHO Collaborating Centre for International Drug Monitoring, 2014.

[23] S. Sahar. Interestingness measures - on determining what is interesting. In Data Mining and Knowledge Discovery Handbook, 2nd ed., pages 603–612. 2010.

[24] R. SoRelle. Withdrawal of Posicor from market. Circulation, 98(9):831–832, 1998.

[25] L. E. Targownik, L. M. Lix, C. J. Metge, H. J. Prior, S. Leung, and W. D. Leslie. Use of proton pump inhibitors and risk of osteoporosis-related fractures. Canadian Medical Association Journal, 179(4):319–326, 2008.

[26] N. P. Tatonetti, J. Denny, S. Murphy, G. Fernald, G. Krishnan, V. Castro, P. Yue, P. Tsau, I. Kohane, D. Roden, et al. Detecting drug interactions from adverse-event reports: interaction between paroxetine and pravastatin increases blood glucose levels. Clinical Pharmacology and Therapeutics, 90(1):133, 2011.

[27] N. P. Tatonetti, G. H. Fernald, and R. B. Altman. A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports. Journal of the American Medical Informatics Association, 19(1):79–85, 2012.

[28] N. P. Tatonetti, P. Y. Patrick, R. Daneshjou, and R. B. Altman. Data-driven prediction of drug effects and interactions. Science Translational Medicine, 4(125):125ra31–125ra31, 2012.

[29] I. H. Witten and E. Frank. Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann, 2005.

[30] M. J. Zaki. Generating non-redundant association rules. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 34–43. ACM, 2000.
