Extraction of Adverse Drug Effects from Clinical Records
E. ARAMAKI* Ph.D., Y. MIURA**, M. TONOIKE** Ph.D., T. OHKUMA** Ph.D., H. MASHUICHI** Ph.D., K. WAKI* Ph.D. M.D., K. OHE* Ph.D. M.D.
* University of Tokyo, Japan
** Fuji Xerox, Japan
Our material: discharge summaries
Background
• The use of Electronic Health Records (EHR) in hospitals is increasing rapidly everywhere
• They contain much clinical information about a patient's health
BUT: many natural language texts!
Extracting clinical information from the reports is difficult because they are written in natural language
NLP based Adverse Effect Detecting System
• We are developing an NLP system that extracts medical information, especially adverse effects, from the natural language parts
• INPUT
– a medical text (discharge summary)
• OUTPUT
– Date Time
– Medication Event
– Adverse Effect Event
≒ i2b2 Medication Challenge
But our target focuses only on adverse effects: the Adverse Effect Relation (AER)
Why Adverse Effect Relations?
• Clinical trials usually target only a single drug.
• BUT: real patients sometimes take multiple medications, which leaves a gap between clinical trials and the actual use of drugs
• To ensure patient safety, it is extremely important to capture new/unknown AEs at an early stage.
DEMO is available on
http://mednlp.jp
Adverse Effect Relation Estimation: System Demo
Example input: has no complications at the time of diagnosis 6/23-25 FOLFOX6 2nd. 6/24, 25: moderate fever (38℃) again. a fever reducer….
[Figure: demo screenshot highlighting the Medication, the Adverse Effect, and the Relation between them]
The Point of This Study
• (1) Preliminary Investigation: How much information actually exists?
– We annotated adverse effect information in discharge summaries
• (2) NLP Challenge: Could current NLP retrieve it?
– We investigated the accuracy with which current techniques could extract adverse effect information
Outline
• Introduction
• Preliminary Investigation
– How much information actually exists in discharge summaries?
• NLP Challenge
• Conclusions
Material & Method
• Material: 3,012 Japanese discharge summaries
• Three human annotators marked possible adverse effects in the following 2 steps
Step 1: Event Annotation (XML tag = event)
<D>Lasix</D> for <S>hypertension</S> is stopped due to <S>his headache</S>.
Step 2: Relation Annotation (XML attribute = relation)
<D rel="1">Lasix</D> for <S>hypertension</S> is stopped due to <S rel="1">his headache</S>.
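A minimal sketch of how annotations in this style could be read back, assuming well-formed tags, that `D` marks a medication and `S` a symptom, and that events sharing the same `rel` value form one adverse effect relation. The function name and the wrapper element are illustrative, not part of the original annotation tooling.

```python
import xml.etree.ElementTree as ET


def parse_annotated_sentence(sentence: str):
    """Return (events, relations) found in one annotated sentence."""
    # Wrap the fragment in a root element so it parses as well-formed XML.
    root = ET.fromstring(f"<s>{sentence}</s>")
    events = []
    for el in root:
        # Assumption: <D> is a medication, <S> is a symptom.
        kind = "medication" if el.tag == "D" else "symptom"
        events.append({"type": kind, "text": el.text, "rel": el.get("rel")})
    # Pair each medication with every symptom that carries the same rel id.
    relations = [
        (m["text"], s["text"])
        for m in events if m["type"] == "medication" and m["rel"]
        for s in events if s["type"] == "symptom" and s["rel"] == m["rel"]
    ]
    return events, relations


example = ('<D rel="1">Lasix</D> for <S>hypertension</S> is stopped '
           'due to <S rel="1">his headache</S>.')
events, relations = parse_annotated_sentence(example)
print(relations)
```

Keeping events (tags) and relations (attributes) separate in the output mirrors the two annotation steps, so either layer can be evaluated on its own.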
Annotation Policy & Process
• We regarded only MedDRA/J terms (an adverse effect terminology) as events.
• We regarded even a suspicion of an adverse effect as positive data.
• Annotating the entire data set is time-consuming → we split the data into 2 sets
– SET-A (event-rich part): contains keywords such as Stop, Change, Adverse effect, Side effect; fully annotated
– SET-B: the rest; randomly sampled & annotated
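The keyword-based split could be sketched as follows. This is only an illustration: the original corpus is Japanese, so the English keywords below are stand-ins for the ones named on the slide, and plain substring matching is an assumption about how "contains keywords" was decided.

```python
# Stand-in keywords taken from the slide; the real list was presumably Japanese.
KEYWORDS = ("stop", "change", "adverse effect", "side effect")


def split_corpus(summaries):
    """Split summaries into SET-A (contains a keyword) and SET-B (the rest)."""
    set_a, set_b = [], []
    for text in summaries:
        lowered = text.lower()
        if any(keyword in lowered for keyword in KEYWORDS):
            set_a.append(text)
        else:
            set_b.append(text)
    return set_a, set_b


docs = [
    "Lasix was stopped due to headache.",
    "No complications at the time of diagnosis.",
]
set_a, set_b = split_corpus(docs)
```

Substring matching also fires on inflected forms such as "stopped", which suits a recall-oriented filter whose only job is to decide which part of the corpus gets fully annotated.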
14.5% × 53.5% (SET-A) + 85.5% × 11.3% (SET-B) = 17.4%
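The estimate above is a weighted average over the two subsets. A small sketch of the arithmetic, assuming the first term belongs to SET-A (its share of the corpus times the fraction of its summaries with adverse effect information) and the second to SET-B; the function and parameter names are illustrative.

```python
def estimate_prevalence(share_a: float, rate_a: float, rate_b: float) -> float:
    """Weighted share of summaries containing adverse effect information."""
    share_b = 1.0 - share_a  # the two sets partition the corpus
    return share_a * rate_a + share_b * rate_b


overall = estimate_prevalence(share_a=0.145, rate_a=0.535, rate_b=0.113)
print(f"{overall:.1%}")  # prints 17.4%
```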
Results of Preliminary Investigation
• About 17% of discharge summaries contain adverse effect information.
– Even considering that the result includes mere suspicions of adverse effects, the summaries are a valuable resource for AE information.
• We can say that discharge summaries are a suitable resource for our purpose.
Outline
• Introduction
• Preliminary Investigation
• NLP Challenge
– Could current NLP techniques retrieve the AEs?
• Conclusions
Combination of 2 NLP Steps
• The 2 NLP steps correspond directly to the 2 annotation steps
Example: Lasix for hyperpiesia is stopped due to the pain in the head.
[Figure: "Lasix" is labelled Medication; "hyperpiesia" and "the pain in the head" are labelled symptom; an Adverse Effect Relation links the medication to a symptom]
• Event Annotation ≒ Named Entity Recognition task
• Relation Annotation = Relation Extraction task, which is one of the hottest NLP research topics
Step1: Event Identification
• Machine Learning Method
– CRF (Conditional Random Field) based Named Entity Recognition (state-of-the-art method at the i2b2 de-identification task)
• Feature (standard feature set)
– Lexicon (stemming), POS, dictionary-based feature (MedDRA), window size = 5
• Material
– SET-A corpus with event annotations
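A rough sketch of per-token features of the kind listed above, in the dict-per-token form that several CRF toolkits accept. Several points are assumptions rather than the authors' setup: "window size = 5" is read as the current token plus two tokens on each side, lowercasing stands in for stemming, and the `lexicon` set stands in for a MedDRA lookup.

```python
def token_features(tokens, pos_tags, i, lexicon, window=5):
    """Build features for token i from a window centred on it."""
    half = window // 2  # window=5 -> offsets -2..+2 (assumed interpretation)
    feats = {}
    for offset in range(-half, half + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            word = tokens[j].lower()  # lowercasing as a stand-in for stemming
            feats[f"w[{offset}]"] = word
            feats[f"pos[{offset}]"] = pos_tags[j]
            feats[f"dict[{offset}]"] = word in lexicon  # dictionary feature
        else:
            feats[f"w[{offset}]"] = "<pad>"  # position outside the sentence
    return feats


tokens = ["Lasix", "is", "stopped", "due", "to", "headache"]
pos_tags = ["NN", "VBZ", "VBN", "JJ", "TO", "NN"]
lexicon = {"headache"}  # hypothetical stand-in for MedDRA terms
feats = token_features(tokens, pos_tags, 5, lexicon)
```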
Step1: Result of Event Identification
• Result Summary

| Cat. of Event | Precision | Recall | F-measure |
|---|---|---|---|
| Medication Event | 86.99 | 81.34 | 0.84 |
| AE Event | 85.56 | 80.24 | 0.82 |

• Precision and recall are above 80% and F > 0.80 for both event types, demonstrating the feasibility of our approach
• Considering that the corpus is small (435 summaries), we can say that event detection is an easy task
Step2: Relation Extraction Method
• Basic approach ≒ Protein-Protein Interaction (PPI) task [BioNLP 2009 Shared Task]
• Example: Lasix for hypertension is stopped due to his headache
For each m (Medications)
    For each a (Adverse Effects)
        judge_it_has_AER(m, a)
(1) judge_it_has_AER(Lasix, hypertension)
(2) judge_it_has_AER(Lasix, headache)
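The nested loop can be written as a runnable sketch: every medication is paired with every adverse effect candidate, and a judge function decides each pair. The names `candidate_pairs` and `extract_relations` are illustrative, and the judge is passed in as a parameter because the slides describe two alternative judgment methods.

```python
from itertools import product


def candidate_pairs(medications, adverse_effects):
    """All (medication, adverse effect) pairs to be judged."""
    return list(product(medications, adverse_effects))


def extract_relations(medications, adverse_effects, judge):
    """Keep the pairs that the judge function accepts as relations."""
    return [
        (m, a)
        for m, a in candidate_pairs(medications, adverse_effects)
        if judge(m, a)
    ]


pairs = candidate_pairs(["Lasix"], ["hypertension", "headache"])
# Toy judge for illustration only: accepts "headache" as the adverse effect.
found = extract_relations(
    ["Lasix"], ["hypertension", "headache"], lambda m, a: a == "headache"
)
```

Because the number of pairs grows as medications × adverse effects while true relations are rare, this formulation naturally produces many more negative than positive examples.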
Two judgment methods
• (1) PTN-BASED: heuristic rules using a set of keywords & word distance
Example: ".. is on ACTOS but stopped for relief of the edema."
<medication> = ACTOS, keyword = stopped, <adverse effect> = edema; n=1 word between the medication and the keyword, n=4 words between the keyword and the adverse effect
judge_it_has_AER(m, a, keyword=stopped, windowsize=5)
• (2) SVM-BASED: machine learning approach
– Feature: distance & words between the two events (medication & adverse effect)
See the proceedings for details
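A minimal sketch of a pattern-based judge in the spirit of PTN-BASED. The exact rules are in the proceedings, so the logic here is an assumption: a relation is accepted when a keyword lies between the two events and the number of words separating it from each event does not exceed the window size. Token indices are used instead of strings so that positions are unambiguous.

```python
def judge_it_has_aer(tokens, m_idx, a_idx, keywords=("stopped",), window=5):
    """Accept the pair if a keyword sits between the events, close to both."""
    lo, hi = sorted((m_idx, a_idx))
    for k in range(lo + 1, hi):
        if tokens[k].lower() in keywords:
            gap_before = k - lo - 1  # words between first event and keyword
            gap_after = hi - k - 1   # words between keyword and second event
            if gap_before <= window and gap_after <= window:
                return True
    return False


tokens = "is on ACTOS but stopped for relief of the edema".split()
m_idx, a_idx = tokens.index("ACTOS"), tokens.index("edema")
accepted = judge_it_has_aer(tokens, m_idx, a_idx, window=5)
rejected = judge_it_has_aer(tokens, m_idx, a_idx, window=3)
```

On the slide's example the gaps are 1 and 4 words, so the pair is accepted with a window of 5 and rejected with a window of 3; a larger window or a longer keyword list trades precision for recall.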
Step2: Result of Relation Extraction

| Method | Precision | Recall | F-measure |
|---|---|---|---|
| PTN-BASED | 41.1% | 91.7% | 0.650 |
| SVM-BASED | 57.6% | 62.3% | 0.598 |

• Both PTN & SVM accuracies are low (F ≤ 0.65) → the relation extraction task is difficult!
• SVM accuracy is significantly (p=0.05) lower than PTN, because
(1) the corpus is small
(2) positive data << negative data
Machine learning suffers from such small, imbalanced data
Outline
• Introduction
• Preliminary Investigation
• NLP Challenge
• Discussions
– (1) Overall Accuracy
– (2) Controllable Performance
– (3) Event Distribution
• Conclusions
Discussion (1/3) Overall Accuracy
• The overall accuracy is estimated by combining the accuracies of step 1 & step 2
Overall (= step1 × step2)
Precision 0.289 (= 0.855 × 0.869 × 0.390)
Recall 0.597 (= 0.802 × 0.813 × 0.917)
• Neither NLP step is perfect, so combining their imperfect results leads to low accuracy (especially many false positives, i.e. low precision)
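The combination can be reproduced with a few lines of arithmetic. This sketch simply multiplies the factors shown on the slide, which treats the errors of the individual steps as independent; the reading of the three factors as AE event detection, medication event detection, and relation extraction is an assumption based on the step 1 and step 2 result tables.

```python
from math import prod

# Factors as given on the slide (assumed order: AE event, medication event,
# relation extraction).
precision_factors = [0.855, 0.869, 0.390]
recall_factors = [0.802, 0.813, 0.917]

overall_precision = prod(precision_factors)
overall_recall = prod(recall_factors)
print(f"precision={overall_precision:.4f} recall={overall_recall:.4f}")
```

The products agree with the slide's 0.289 and 0.597 to within 0.001; the last digit depends on whether the result is rounded or truncated.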
Discussion (2/3) Performance is Controllable
• The performance balance between recall & precision can be controlled
[Figure: precision & recall curve of the SVM, marking a high-precision setting and a high-recall setting]
That is a strong advantage of NLP
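One common way to obtain such a trade-off is to move the decision threshold applied to the classifier's score. The sketch below illustrates the idea only: the scores and labels are made-up toy values, not results from the study, and thresholding is an assumption about how the curve was produced.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when pairs scoring >= threshold are accepted."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy decision scores and gold labels, for illustration only.
scores = [1.4, 0.9, 0.3, -0.2, -0.6, -1.1]
labels = [True, True, False, True, False, False]

high_precision = precision_recall_at(scores, labels, threshold=0.5)
high_recall = precision_recall_at(scores, labels, threshold=-0.5)
```

A strict threshold suits uses where every reported relation is reviewed by a clinician, while a lenient one suits screening, where missing a new adverse effect is the costlier error.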
Discussion (3/3) Event Distribution
• We investigated the overall AE frequency for each medication category.
[Figure: AE frequency distribution of Drug #1, comparing the distribution acquired from annotated real data with the distribution acquired from our system results]
Discussion (3/3) AER Distribution
• Then we ran a goodness-of-fit test, which measures the similarity between two distributions

| Medication | P-value |
|---|---|
| Med. 1 | 0.023 |
| Med. 2 | 0.013 |
| Med. 3 | 0.010 |
| Med. 4 | 0.006 |
| Med. 5 | 0.005 |
| Total | 0.011 |

• A high p-value (p=0.011 > 0.01) indicates that the two distributions are similar.
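The slides do not say which goodness-of-fit test was used. As one possible illustration, the sketch below computes a Pearson chi-square statistic between AE counts from the system output and counts from the annotated data; the choice of test and the counts are assumptions made for the example, not data from the study.

```python
def chi_square_statistic(observed, reference):
    """Pearson chi-square statistic of observed counts against a reference."""
    total_obs = sum(observed)
    total_ref = sum(reference)
    stat = 0.0
    for o, r in zip(observed, reference):
        expected = r * total_obs / total_ref  # rescale reference to same total
        stat += (o - expected) ** 2 / expected
    return stat


# Hypothetical AE counts for one medication (three AE types).
annotated_counts = [30, 20, 10]  # from manual annotation
system_counts = [33, 18, 9]      # from system output
stat = chi_square_statistic(system_counts, annotated_counts)
```

A statistic near zero means the two frequency profiles nearly coincide; turning it into a p-value additionally requires the chi-square distribution with (number of categories − 1) degrees of freedom.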
Outline
• Introduction
• Preliminary Investigation
• NLP Challenge
• Discussions
• Conclusions
Conclusions (1/2)
• Preliminary Investigation:
– About 17% of discharge summaries contain adverse effect information.
– We can say that discharge summaries are a suitable resource for AERs
• NLP Challenge:
– Could NLP retrieve the AE information?
– Difficult! Overall accuracy is low
Conclusions (2/2)
• BUT: 2 positive findings:
(1) We can control the performance balance
(2) Even though the accuracy is low, the aggregation of the results is similar to the real distribution
• IN THE FUTURE:
– A practical system using the above advantages
– A more accurate method for relation extraction
Thank you
Contact Info
– Eiji ARAMAKI Ph.D.
– University of Tokyo
– [email protected]
– http://mednlp.jp