extraction of adverse drug effects from clinical records
Post on 30-Dec-2015
Extraction of Adverse Drug Effects from Clinical Records
E. ARAMAKI * Ph.D., Y. MIURA **, M. TONOIKE ** Ph.D., T. OHKUMA ** Ph.D., H. MASHUICHI ** Ph.D., K. WAKI * Ph.D. M.D., K. OHE * Ph.D. M.D.
* University of Tokyo, Japan
** Fuji Xerox, Japan
Our material: discharge summaries
Background
• The use of Electronic Health Records (EHR) in hospitals is increasing rapidly everywhere
• They contain much clinical information about a patient’s health
BUT Many Natural Language texts !
Extracting clinical information from the reports is difficult because they are written in natural language
NLP based Adverse Effect Detecting System
• We are developing an NLP system that extracts medical information, especially adverse effects, from the natural language parts
• INPUT
– a medical text (discharge summary)
• OUTPUT
– Date Time
– Medication Event
– Adverse Effect Event
≒ i2b2 Medication Challenge
But our target focuses only on adverse effect
Adverse Effect Relation (AER)
Why Adverse Effect Relations?
• Clinical trials usually target only a single drug.
• BUT: real patients sometimes take multiple medications, leading to a gap between clinical trials and the actual use of drugs
• To ensure patient safety, it is extremely important to capture new/unknown AEs at an early stage.
DEMO is available on
http://mednlp.jp
Adverse Effect Relation Estimation: System Demo
has no complications at the time of diagnosis 6/23-25 FOLFOX6 2nd.6/24, 25: moderate fever (38℃) again. a fever reducer….
Adverse Effect
Medication
Relation
The Point of This Study
• (1) Preliminary Investigation: How much information actually exists?
– We annotated adverse effect information in discharge summaries
• (2) NLP Challenge: Could current NLP retrieve it?
– We investigated the accuracy with which the current technique could extract adverse effect information
Outline
• Introduction
• Preliminary Investigation
– How much information actually exists in discharge summaries?
• NLP Challenge
• Conclusions
Material & Method
• Material: 3,012 Japanese Discharge Summaries
• 3 annotators marked possible adverse effects in the following 2 steps
Step 1: Event Annotation (XML tag = Event)
<D>Lasix</D> for <S>hypertension</S> is stopped due to <S>his headache</S>.
Step 2: Relation Annotation (XML attribute = Relation)
<D rel="1">Lasix</D> for <S>hypertension</S> is stopped due to <S rel="1">his headache</S>.
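A minimal sketch of how the two annotation layers could be read back, assuming the tags are well-formed (closed with `</D>` and `</S>`) and use plain ASCII quotes. The function name `parse_annotation` and the wrapping `<sent>` element are illustrative assumptions, not part of the original annotation tooling.

```python
import xml.etree.ElementTree as ET

def parse_annotation(sentence):
    """Return (events, relations) for one annotated sentence.

    events:    list of (tag, text, rel); tag is "D" (drug) or "S" (symptom).
    relations: list of (drug_text, symptom_text) pairs that share a rel id.
    """
    # Wrap the fragment in a root element so it parses as XML.
    root = ET.fromstring("<sent>" + sentence + "</sent>")
    events = [(el.tag, el.text, el.get("rel")) for el in root]
    relations = [
        (d_text, s_text)
        for d_tag, d_text, d_rel in events if d_tag == "D" and d_rel is not None
        for s_tag, s_text, s_rel in events if s_tag == "S" and s_rel == d_rel
    ]
    return events, relations

annotated = ('<D rel="1">Lasix</D> for <S>hypertension</S> '
             'is stopped due to <S rel="1">his headache</S>.')
events, relations = parse_annotation(annotated)
```

Here the events are the Step 1 output (three tagged spans) and the relations are the Step 2 output (one drug paired with one symptom).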
Annotation Policy & Process
• We regard only MedDRA/J terms as the events.
• We regarded even a suspicion of an adverse effect as positive data.
• Annotating the entire data set is time-consuming → We split the data into 2 sets
– SET-A (event-rich parts): contains keywords such as Stop, Change, Adverse effect, Side effect
– SET-B: the rest
MedDRA/J: adverse effect terminology
SET-A: fully annotated
SET-B: randomly sampled & annotated
14.5% × 53.5% + 85.5% × 11.3% = 17.4%
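The 17.4% figure is a weighted average over the two sets. The sketch below reproduces the arithmetic, reading 14.5% and 85.5% as the shares of SET-A and SET-B in the corpus and 53.5% and 11.3% as the fraction of summaries with adverse effect information in each set; that reading is an assumption drawn from the formula. The function name is illustrative.

```python
def estimate_positive_rate(strata):
    """Weighted average of per-set rates; strata is a list of (share, rate)."""
    return sum(share * rate for share, rate in strata)

# (share of corpus, fraction with adverse effect information), assumed reading
strata = [
    (0.145, 0.535),  # SET-A
    (0.855, 0.113),  # SET-B
]
overall = estimate_positive_rate(strata)
print(f"{overall:.1%}")
```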
Results of Preliminary Investigation
• About 17% of discharge summaries contain adverse effect information.
– Even considering that the result includes mere suspicions of adverse effects, the summaries are a valuable resource for AE information.
• We can say that discharge summaries are suitable resources for our purpose.
Outline
• Introduction
• Preliminary Investigation
• NLP Challenge
– Could the current NLP technique retrieve the AEs?
• Conclusions
Combination of 2 NLP Steps
• 2 NLP steps directly correspond to each annotation step
Lasix for hyperpiesia is stopped due to the pain in the head.
(Lasix: Medication; hyperpiesia: symptom; the pain in the head: symptom)
Adverse Effect Relation
Event Annotation ≒ Named Entity Recognition Task
Relation Annotation = Relation Extraction Task, which is one of the hottest NLP research topics.
Step1: Event Identification
• Machine Learning Method
– CRF (Conditional Random Field) based Named Entity Recognition
• Feature
– Lexicon (Stemming), POS, Dictionary based feature (MedDRA), window size=5
• Material
– SET-A Corpus with Event Annotations
state-of-the-art method at i2b2 de-identification task
Standard Feature Set
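A minimal sketch of per-token features of the kind listed above, in the dictionary form that CRF toolkits commonly take. Several points are assumptions: "window size=5" is read as five tokens centred on the current one, lowercasing stands in for stemming, the small `dictionary` set stands in for MedDRA, and the English tokens and POS tags are illustrative (the actual corpus is Japanese).

```python
def token_features(tokens, pos_tags, dictionary, i, window=5):
    """Features for token i: dictionary hit plus words and POS tags in a window."""
    half = window // 2  # window=5 -> offsets -2..+2 (assumed interpretation)
    feats = {"in_dict": tokens[i].lower() in dictionary}
    for offset in range(-half, half + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            feats[f"word[{offset}]"] = tokens[j].lower()
            feats[f"pos[{offset}]"] = pos_tags[j]
        else:
            # Positions outside the sentence get a padding symbol.
            feats[f"word[{offset}]"] = "<PAD>"
            feats[f"pos[{offset}]"] = "<PAD>"
    return feats

tokens = ["Lasix", "for", "hypertension", "is", "stopped"]
pos_tags = ["NN", "IN", "NN", "VBZ", "VBN"]
dictionary = {"hypertension", "headache"}  # hypothetical stand-in for MedDRA
features = [token_features(tokens, pos_tags, dictionary, i) for i in range(len(tokens))]
```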
Step1: Result of Event Identification
• Result Summary
Cat. of Event      Precision  Recall  F-measure
Medication Event   86.99      81.34   0.84
AE Event           85.56      80.24   0.82
• All accuracies (P, R) > 80%, F > 0.80, demonstrating the feasibility of our approach
• Considering that the corpus size is small (435 summaries), we can say that the event detection is an easy task
Step2: Relation Extraction Method
• Basic Approach ≒ Protein-Protein Interaction (PPI) task [BioNLP2009 Shared Task]
• Example: Lasix for hypertension is stopped due to his headache
For each m (Medications)
    For each a (Adverse Effects)
        judge_it_has_AER(m, a)
(1) judge_it_has_AER(Lasix, hypertension)
(2) judge_it_has_AER(Lasix, headache)
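The nested loop can be sketched as follows. The judge is passed in as a function so that either judgment method can be plugged in; the names `candidate_pairs`, `extract_relations`, and the toy judge are illustrative assumptions.

```python
from itertools import product

def candidate_pairs(medications, adverse_effects):
    """Every (medication, adverse effect) pair in one sentence."""
    return list(product(medications, adverse_effects))

def extract_relations(medications, adverse_effects, judge):
    """Keep the pairs that the supplied judge function accepts."""
    return [(m, a) for m, a in candidate_pairs(medications, adverse_effects)
            if judge(m, a)]

pairs = candidate_pairs(["Lasix"], ["hypertension", "headache"])

# Toy judge for illustration only: accepts the pair marked in the annotation example.
toy_judge = lambda m, a: (m, a) == ("Lasix", "headache")
relations = extract_relations(["Lasix"], ["hypertension", "headache"], toy_judge)
```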
• (1) PTN-BASED: heuristic rules using a set-of-keyword & word distance
.. is on ACTOS but stopped for relief of the edema.
<medication> ACTOS, n=1, keyword "stopped", n=4, <adverse effect> edema
judge_it_has_AER(m, a, keyword=stopped, windowsize=5)
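A minimal sketch of a keyword-and-distance rule of this kind. It assumes that n counts the words strictly between two positions (which gives n=1 and n=4 for the ACTOS example) and that a pair is accepted when some keyword lies within the window of both events. The keyword list is hypothetical and the exact rule used in the study may differ.

```python
KEYWORDS = {"stopped", "stop", "changed", "discontinued"}  # hypothetical list

def words_between(i, j):
    """Number of tokens strictly between positions i and j."""
    return abs(i - j) - 1

def judge_it_has_aer(tokens, med_idx, ae_idx, keywords=KEYWORDS, window=5):
    """True if a keyword is within `window` words of both events."""
    for k, tok in enumerate(tokens):
        if k in (med_idx, ae_idx) or tok.lower() not in keywords:
            continue
        if (words_between(med_idx, k) <= window
                and words_between(k, ae_idx) <= window):
            return True
    return False

tokens = "is on ACTOS but stopped for relief of the edema .".split()
med_idx, ae_idx = tokens.index("ACTOS"), tokens.index("edema")
kw_idx = tokens.index("stopped")
has_relation = judge_it_has_aer(tokens, med_idx, ae_idx, window=5)
```

Shrinking the window makes the rule stricter: with `window=3` the same pair is rejected because four words separate the keyword from the adverse effect.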
• (2) SVM-BASED: Machine learning approach
– Feature: distance & words between two events (medication & adverse effect)
Two judgment methods
See the proceedings for details
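A minimal sketch of the two feature types named for the SVM, the distance and the words between the two events, as a sparse feature dictionary that could be vectorised for a classifier. The function name and feature keys are illustrative assumptions; the full feature set is in the proceedings.

```python
def pair_features(tokens, med_idx, ae_idx):
    """Distance plus bag of words between a medication and an adverse effect."""
    lo, hi = sorted((med_idx, ae_idx))
    between = tokens[lo + 1:hi]
    feats = {"distance": len(between)}
    for word in between:
        feats["between=" + word.lower()] = 1
    return feats

tokens = "is on ACTOS but stopped for relief of the edema .".split()
feats = pair_features(tokens, tokens.index("ACTOS"), tokens.index("edema"))
```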
Step2: Result of Relation Extraction
             Precision  Recall  F-measure
PTN-BASED    41.1%      91.7%   0.650
SVM-BASED    57.6%      62.3%   0.598
• Both PTN & SVM accuracies are low (F ≤ 0.65) → the relation extraction task is difficult!
• SVM accuracy is significantly (p=0.05) lower than PTN: (1) the corpus size is small, (2) positive data << negative data
Machine learning suffers from such small, imbalanced data
Outline
• Introduction
• Preliminary Investigation
• NLP Challenge
• Discussions
– (1) Overall Accuracy
– (2) Controllable Performance
– (3) Event Distribution
• Conclusions
Discussion (1/3) Overall Accuracy
• The overall accuracy is estimated by the combined accuracies of step1 & step2
Overall (= step1 × step2)
Precision 0.289 (= 0.855 × 0.869 × 0.390)
Recall 0.597 (= 0.802 × 0.813 × 0.917)
• Each NLP step is imperfect, so combining such imperfect results leads to low accuracy (especially many false positives, i.e. low precision)
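The overall estimate is a plain product of the per-step figures, which treats the errors of the steps as independent (an assumption implied by the multiplication, not stated on the slide). A short sketch of the arithmetic with the factors as given:

```python
import math

# Factors as listed in the overall estimate.
precision_factors = [0.855, 0.869, 0.390]
recall_factors = [0.802, 0.813, 0.917]

overall_precision = math.prod(precision_factors)
overall_recall = math.prod(recall_factors)
print(f"precision={overall_precision:.4f} recall={overall_recall:.4f}")
```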
Discussion (2/3) Performance is Controllable
Precision & Recall curve in SVM
• The performance balance between recall & precision could be controlled
High precision setting
High recall setting
That is a strong advantage of NLP
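One way to picture the control is a decision threshold on the classifier score: raising it favours precision, lowering it favours recall. The sketch below uses made-up scores and labels purely for illustration; the function name and data are assumptions, not results from the study.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when pairs with score >= threshold are accepted."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical classifier scores and gold labels (1 = adverse effect relation).
scores = [2.1, 1.3, 0.8, 0.4, -0.2, -0.9, -1.5]
labels = [1, 1, 0, 1, 0, 1, 0]

high_precision = precision_recall(scores, labels, threshold=1.0)
high_recall = precision_recall(scores, labels, threshold=-1.0)
```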
Discussion (3/3) Event Distribution
• We investigated the entire AE frequency for each medication category.
distribution acquired from annotated real data
distribution acquired from our system results
AE freq. distribution of Drug #1
Discussion (3/3) AER Distribution
• Then, we applied a goodness-of-fit test, which measures the similarity between two distributions
         P-value
Med. 1   0.023
Med. 2   0.013
Med. 3   0.010
Med. 4   0.006
Med. 5   0.005
Total    0.011
• High p-value (p=0.011 > 0.01) indicates two distributions are similar.
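The slides do not name the test that was used. As one common choice, Pearson's chi-square statistic compares observed counts with the counts expected under a reference distribution; the sketch below computes only the statistic, on made-up numbers. Turning it into a p-value needs the chi-square distribution, which is left out here.

```python
def chi_square_statistic(observed, expected_proportions):
    """Pearson chi-square statistic of observed counts against reference proportions."""
    total = sum(observed)
    stat = 0.0
    for obs, prop in zip(observed, expected_proportions):
        expected = total * prop
        stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical data: AE category proportions from annotated data,
# and AE category counts from system output for the same medication.
annotated_proportions = [0.5, 0.3, 0.2]
system_counts = [30, 20, 10]
chi2 = chi_square_statistic(system_counts, annotated_proportions)
```

A smaller statistic means the system's distribution sits closer to the annotated one.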
Outline
• Introduction
• Preliminary Investigation
• NLP Challenge
• Discussions
• Conclusions
Conclusions (1/2)
• Preliminary Investigation:
– About 17% of discharge summaries contain adverse effect information.
– We can say that discharge summaries are suitable resources for AERs
• NLP Challenge:
– Could NLP retrieve the AE information?
– Difficult! Overall accuracy is low
Conclusions (2/2)
• BUT: 2 positive findings:
(1) We can control the performance balance
(2) Even though the accuracy is low, the aggregation of the results is similar to the real distribution
• IN THE FUTURE:
– A practical system using the above advantages
– A more accurate method for relation extraction
Thank you
Contact Info
– Eiji ARAMAKI Ph.D.
– University of Tokyo
– eiji.aramaki@gmail.com
– http://mednlp.jp