
Post on 13-Apr-2017


Page 1: Search in Medical Text

Search in Medical Text

Sarvnaz Karimi

National ICT Australia (NICTA)
The University of Melbourne

1 / 51

Page 2: Search in Medical Text

“What makes medical doctors use computers?”


Page 3: Search in Medical Text

Medicine and Computer Science

Data                      Users
biomedical literature     biomedical researchers, medical doctors/students, curators
clinical records          hospital staff, medical doctors
medical social media      drug companies, health authorities

Page 4: Search in Medical Text

Challenges for Computer Scientists

Data                      Challenges
biomedical literature     creation of systematic reviews, experts searching in the literature
clinical records          search in medical records
medical social media      discovery of drug side-effects

Page 5: Search in Medical Text

1 Systematic Reviews:

A Complex Search Episode for Evidence-Based Policy and Practice

Page 6: Search in Medical Text

A long-term smoker with chronic obstructive airways disease (COPD) who has recently quit smoking has breathing difficulties. What are the suitable non-drug therapies to improve the patient’s breathing?

(example by Prof. Paul Glasziou)

Page 7: Search in Medical Text

Is adjunctive vitamin A effective in children diagnosed with non-measles pneumonia?

(Cochrane collaboration)

Page 8: Search in Medical Text

A clinician applying research to practice needs to know:

What? interventions match the patient’s conditions
What? quality of evidence and applicability
What? duration, dosage, ...

Page 9: Search in Medical Text

[Figure: growth of the medical scientific literature archive (MEDLINE)]

Page 10: Search in Medical Text

Evidence-Based Medicine (EBM)

[Figure: evidence pyramid, ordered by quality of evidence, highest first]

Filtered information:
Systematic Reviews
Critically Appraised Topics
Critically Appraised Individual Articles

Unfiltered information:
Randomized Controlled Trials (RCTs)
Cohort Studies
Case-controlled Studies
Background Information/Expert Opinion

EBM applies the best available evidence to clinical decision-making.

Page 11: Search in Medical Text

A sample systematic review

Title: Vitamin A for non-measles pneumonia in children

Main question: Is adjunctive vitamin A effective in children diagnosed with non-measles pneumonia?

Inclusion criteria: Only parallel-arm, randomized controlled trials (RCTs) and quasi-RCTs, in which children (younger than 15 years of age) with non-measles pneumonia were treated with adjunctive vitamin A, were included...

Methods: We searched The Cochrane Library, Cochrane Central Register of Controlled Trials (CENTRAL 2010, issue 3) which contains the Acute Respiratory Infections Group’s Specialised...

Main results: Six trials involving 1740 children were included. There was no significant reduction in mortality...

Page 12: Search in Medical Text

Systematic reviewing process

1. Define a clear review question and develop criteria for including studies
2. Search
3. Selecting studies and collecting data
4. Analysing the data and undertaking meta-analysis
5. Presenting the results, interpreting the findings, and drawing conclusions

Outcome: Systematic review

Page 13: Search in Medical Text

A sample MEDLINE query

1. exp vitamin A/
2. vitamin A.mp
3. retinol.mp
4. exp dietary supplements/
5. or/1-4
6. exp pneumonia/
7. pneumonia$.mp
8. exp pneumonia, bacterial/
9. exp pneumonia, lipid/
10. exp pneumonia, mycoplasma/
...
14. exp pneumonia, viral/
15. exp respiratory tract infections/
16. acute adj respiratory.mp
17. respiratory adj infection.mp
18. respiratory adj disease.mp
19. or/6-18
20. 5 and 19
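To illustrate how the numbered lines of such a query combine, here is a minimal Python sketch. It assumes each search line has already been resolved to a set of document IDs. The IDs and the `combine` helper are hypothetical and are not part of MEDLINE or any search interface.

```python
def combine(results, spec):
    """Resolve an 'or/a-b' or 'a and b' line against earlier numbered result sets."""
    if spec.startswith("or/"):
        start, end = (int(n) for n in spec[3:].split("-"))
        merged = set()
        for line_no in range(start, end + 1):
            merged |= results.get(line_no, set())  # lines not listed contribute nothing
        return merged
    left, right = (int(n) for n in spec.split(" and "))
    return results[left] & results[right]

# Hypothetical document IDs matched by a few of the numbered search lines.
results = {
    1: {101, 102},   # exp vitamin A/
    2: {102, 103},   # vitamin A.mp
    3: {104},        # retinol.mp
    4: {105},        # exp dietary supplements/
    6: {102, 201},   # exp pneumonia/
    7: {104, 202},   # pneumonia$.mp
}
results[5] = combine(results, "or/1-4")     # any vitamin A concept
results[19] = combine(results, "or/6-18")   # any pneumonia or respiratory concept
results[20] = combine(results, "5 and 19")  # both concepts must match
print(sorted(results[20]))
```

The final line keeps only documents that match at least one term from each concept group, which is why such queries can return thousands of unranked results.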


Page 15: Search in Medical Text

Scale of evidence inclusion

Boolean query output, screened on title & abstract (4,000-10,000)
Documents to be read in full text (500-2000)
To be actually included in the review (10-100)

Page 16: Search in Medical Text

Where can we help?

Our contributions to introducing ranked retrieval are published in:
* S. Karimi, S. Pohl, F. Scholer, L. Cavedon, J. Zobel, Boolean versus Ranked Querying for Biomedical Systematic Reviews, BMC Medical Informatics and Decision Making, Vol 10, Number 58, 2010
* D. Martinez, S. Karimi, L. Cavedon, T. Baldwin, Facilitating Biomedical Systematic Reviews Using Ranked Text Retrieval and Classification, ADCS 2008, December 2008

Page 17: Search in Medical Text

To assist in query formulation for an initial search strategy

Suggesting key terms and synonyms, e.g. neoplasm for cancer

Bag-of-words to Boolean: suggesting structure for the specified query terms. Template queries already exist for limited inclusion criteria.
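One way to picture the bag-of-words to Boolean step is sketched below in Python. The synonym table, the `expand` and `to_boolean` helpers, and the output syntax are all assumptions made for illustration; the slides do not describe how the structure would actually be suggested.

```python
# Hypothetical synonym table; a real system might draw on a thesaurus such as MeSH.
SYNONYMS = {"cancer": ["neoplasm", "tumour"], "exercise": ["physical activity"]}

def expand(keywords):
    """Turn a bag of keywords into concept groups: each keyword plus its synonyms."""
    return [[kw] + SYNONYMS.get(kw, []) for kw in keywords]

def to_boolean(concepts):
    """Join synonyms with OR inside a concept, and join concepts with AND."""
    groups = []
    for terms in concepts:
        quoted = [f'"{t}"' if " " in t else t for t in terms]  # quote multi-word terms
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)

query = to_boolean(expand(["cancer", "exercise"]))
print(query)
```

The design choice here is the usual one for review queries: alternatives for the same concept widen the search, while distinct concepts narrow it.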


Page 18: Search in Medical Text

Consistency verification

Automatic verification against inclusion criteria

Automatic self-consistency verification: If a reviewer selects one document, but later chooses to ignore a similar one, the system should flag this possible inconsistency.
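A minimal sketch of such a self-consistency check follows, assuming similarity is measured by cosine similarity over raw term counts. The similarity measure, the threshold of 0.7, and the sample documents are illustrative assumptions, not details from the slides.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two texts using raw term counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def flag_inconsistencies(decisions, threshold=0.7):
    """Return pairs of similar documents that received opposite decisions."""
    flagged = []
    items = list(decisions.items())
    for i, (doc_a, (text_a, included_a)) in enumerate(items):
        for doc_b, (text_b, included_b) in items[i + 1:]:
            if included_a != included_b and cosine(text_a, text_b) >= threshold:
                flagged.append((doc_a, doc_b))
    return flagged

# Hypothetical screening decisions: document ID -> (title, included?)
decisions = {
    "d1": ("vitamin A supplementation in children with pneumonia", True),
    "d2": ("vitamin A supplementation in infants with pneumonia", False),
    "d3": ("gait analysis after ankle injury", False),
}
print(flag_inconsistencies(decisions))
```

Only the first two documents are flagged: they share most of their terms yet received opposite decisions.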


Page 19: Search in Medical Text

Dynamic relevance feedback

Document selection process is currently paper-based.

A dynamic relevance feedback approach that is active during the document selection process could rank the remaining documents based on estimated importance.

Dynamic relevance feedback might identify additional documents that exist in the collection but were missed by the initial search strategy.
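The idea can be sketched in Python as follows. This is a crude term-weighting scheme, loosely in the spirit of Rocchio feedback, chosen here only for illustration; the slides do not specify a feedback method, and the sample documents are hypothetical.

```python
from collections import Counter

def term_weights(included, excluded):
    """Weight each term by its count in included texts minus its count in excluded texts."""
    weights = Counter()
    for text in included:
        weights.update(text.lower().split())
    for text in excluded:
        weights.subtract(text.lower().split())
    return weights

def rerank(remaining, included, excluded):
    """Order unscreened documents by the summed feedback weight of their distinct terms."""
    weights = term_weights(included, excluded)

    def score(text):
        return sum(weights[t] for t in set(text.lower().split()))

    return sorted(remaining, key=score, reverse=True)

included = ["vitamin a for pneumonia in children"]
excluded = ["vitamin d and dementia"]
remaining = [
    "dementia risk and vitamin d levels",
    "retinol therapy for childhood pneumonia",
    "vitamin a in children with measles",
]
ranking = rerank(remaining, included, excluded)
print(ranking)
```

Each new decision by the reviewer would update the weights, so the ordering of the remaining documents changes as screening proceeds. A practical version would at least remove stop words and weight terms by rarity.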


Page 20: Search in Medical Text

Analysis and Meta-analysis

There are tools that assist in analysing numerical data already extracted from one or multiple studies, but the input to these tools must first be extracted manually from text. Automatic information extraction can save hours.

Page 21: Search in Medical Text

Review update

Updating the review with new evidence so that it remains relevant.

Year 2005: Treatment X works.
Year 2010: Treatment Y is preferred over X.

Page 22: Search in Medical Text

Literature survey is hard!


Page 23: Search in Medical Text

2 User-Study:

Medical Expert’s Search Behavior


Page 24: Search in Medical Text

Subject: Library needs your help
Volunteers needed
Study: Improving Tools for Searching Medical Literature (Alfred Health Ethics Committee approved)
Dear All, I am writing to you as a participant in a Library training class at the Ian Potter Library in 2010... probably realise that systems for online searching are often complex and not that easy to use... The Ian Potter Library is participating in a study together with NICTA (University of Melbourne) and RMIT, looking at ways of improving search tools for medical literature (see attached). We need volunteers..

Volunteers required - Improving tools for searching medical literature
This study aims to improve quality of search results in the biomedical domain. The research team needs participants with (bio)medical background, especially medical students/researchers, to carry out search tasks using search tools. The session will take about 40 minutes. All participants receive movie vouchers. Alfred Hospital HREC number 22/10. Further information...

Page 25: Search in Medical Text

Why user study?

What should a biomedical search engine look like?

What are the needs of specific users of biomedical search tools?
• Users’ behaviour, searching and querying style,...

Which one of our proposed systems is more effective?

Page 26: Search in Medical Text

Subjects

Experts: educational background in biomedical sciences and related domains.

Non-experts: absolutely no education or working experience in biomedical domains.

We recruited 46 experts, of whom 2 were assigned to a pilot study and 6 did not finish the tasks. We also recruited 9 non-experts.

Page 27: Search in Medical Text

User study format

Subjects were asked to imagine that they had to write a short report about each given topic. Their goal was to carry out searches to find useful articles that they would want to read in order to prepare their report.

Each subject was asked to complete the following:

1. Opening questionnaire
2. Search phase, consisting of six tasks. For each task:
   • Pre-task questionnaire to establish prior familiarity with topic
   • Search for useful documents
   • Post-task questionnaire about search experience
3. Closing questionnaire

Page 28: Search in Medical Text

Tasks assigned to the subjects

1 exercise therapy for cystic fibrosis
2 families and grief in the ICU
3 cognitive behaviour therapy for postnatal depression
4 vitamin D and dementia
5 ankle injuries and gait analysis
6 prevention of type 2 diabetes in developing countries

These topics were previously referred to health librarians at the Ian Potter Library of the Alfred Hospital in Melbourne, by either students or staff.

Page 29: Search in Medical Text

Search systems and interfaces

System A: A Boolean retrieval system similar to PubMed. Results were ordered by date. Very complicated multi-line Boolean querying was supported.

System B: A combination of a ranked and a Boolean system. Both ranked and Boolean querying were supported. If a query was Boolean, the output was ranked based on the keywords.

System C: A topic-modelling-based system. The output of each query was topic modelled (LDA) and then ranked under each topic.
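As a rough illustration of System B's handling of a Boolean query, the Python sketch below first filters documents with a simple AND query and then orders the matches by keyword occurrences. The scoring function and sample documents are assumptions; the slides say only that the output was ranked based on the keywords.

```python
def boolean_and(docs, terms):
    """Keep only documents that contain every query term (a simple AND query)."""
    return {doc_id: text for doc_id, text in docs.items()
            if all(t in text.lower().split() for t in terms)}

def rank_by_keywords(matches, terms):
    """Order the Boolean matches by total occurrences of the query terms."""
    def score(item):
        words = item[1].lower().split()
        return sum(words.count(t) for t in terms)
    return [doc_id for doc_id, _ in sorted(matches.items(), key=score, reverse=True)]

# Hypothetical document collection.
docs = {
    "a": "vitamin d and dementia",
    "b": "vitamin d deficiency vitamin d supplements and dementia risk in dementia clinics",
    "c": "vitamin d and bone health",
}
terms = ["vitamin", "dementia"]
ranked = rank_by_keywords(boolean_and(docs, terms), terms)
print(ranked)
```

Document "c" is dropped by the Boolean filter, and the remaining two are ordered by how often the query terms occur rather than by date as in System A.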


Page 30: Search in Medical Text

Preferred system and difficulty of using the systems

Only a slight difference between A and B (not ranked and ranked), but C (topic-modelled) was significantly less liked.

Between A and B, the ranked results of system B were slightly but significantly better liked.

System C (topic modelling) was rated hardest to work with.

Page 31: Search in Medical Text

Topic Familiarity and its effect on querying

Familiarity     Tasks   Queries entered   Ranked queries   Boolean queries   Complex Boolean   Total query terms
Not familiar    147     438 (3.0)         154 (1.0)        271 (1.8)         13 (0.1)          1840 (13.2)
Familiar        71      204 (3.0)         51 (0.7)         148 (2.1)         5 (0.1)           960 (15.9)
Very familiar   10      14 (1.4)          5 (0.5)          9 (0.9)           0 (0.0)           92 (9.2)
p-value                 0.0172            0.0184           0.0334            0.6511            0.001

* The table shows the sum for each category, with the mean indicated in parentheses.

The number of queries entered varied with the subjects' level of topic familiarity.

More queries were entered for topics that subjects were not familiar with.

The number of ranked or Boolean queries employed by searchers varies significantly with the level of familiarity.

For very familiar topics, users employ fewer query terms.

Page 32: Search in Medical Text

Familiarity, visited result pages, and documents selected as relevant

Familiarity     Tasks   Result pages viewed   Items saved
Not familiar    147     494 (3.4)             999 (6.8)
Familiar        71      253 (3.6)             425 (6.0)
Very familiar   10      28 (2.8)              57 (5.7)
p-value                 0.2801                0.0535

No significant relationship was found between prior familiarity and the number of result pages viewed.

The number of items saved (relevant) did not vary significantly with topic familiarity.

Page 33: Search in Medical Text

Familiarity based on the pre-task questionnaire and difficulty based on the post-task questionnaire

                Difficulty
                Easy    Medium   Hard    Total
Not familiar    78      44       25      147
Familiar        44      21       16      81
Total           122     65       41

p-value = 0.7678

No relation was confirmed between familiarity and perceived difficulty of working with the systems; in other words, being familiar did NOT make the task easier or harder.
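The slide does not name the statistical test behind this p-value. As a worked check, the Python sketch below assumes a Pearson chi-square test of independence on the familiarity by difficulty counts, which is an assumption on our part. Under that assumption the result comes out at about 0.768, in line with the value reported on the slide.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Rows: not familiar, familiar. Columns: easy, medium, hard.
table = [[78, 44, 25], [44, 21, 16]]
stat, dof = chi_square(table)
# With two degrees of freedom, the chi-square tail probability is exactly exp(-x / 2).
p_value = math.exp(-stat / 2)
print(round(stat, 3), dof, round(p_value, 4))
```

The closed-form tail probability holds only for two degrees of freedom; for other table shapes a statistics library would be needed.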


Page 34: Search in Medical Text

3 Drug Side-Effects:

What Do Patient Forums Reveal?


Page 35: Search in Medical Text

Drug side-effect

A drug side-effect is an effect (positive or negative) that is secondary to the one intended.

Some side-effects are severe, such as organ failure, high blood sugar, stroke, heart disease, and neuropathy; some are mild, such as nausea and dizziness.

Adverse side-effects that are unknown claim many lives each year.

Page 36: Search in Medical Text

Side-effect discovery

[Figure: side-effect discovery through clinical trials with volunteers]

Page 37: Search in Medical Text

Post-marketing Surveillance

Clinical trials are expensive, sometimes out-dated, time-consuming, and often small-scale.

Professionals and drug users can report side-effects, mostly severe ones, on official websites.

Patient social networks and forums, such as DailyStrength and AskPatient, collect feedback directly from drug consumers.

Data in such forums may be of questionable reliability, but it provides indications of real side-effects, both mild and severe.

Page 38: Search in Medical Text

A new era in side-effect discovery

[Figure: side-effect discovery through clinical trials with volunteers, extended with an update and feedback loop]

Page 39: Search in Medical Text

Trade-off in using data from social media

Advantages:

large amount of data

data generated by a large variety of people who share information through personal blogs and public forums.

Disadvantages: (Medical social data is difficult to access and process)

data is scattered over multiple sources.

availability of useful resources is limited (ownership).

data often contains noise (informal language, or misspelled specialised terms), so traditional pre-processing methods such as POS tagging, chunking, and sentence segmentation may not work well.
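As one illustration of coping with misspelled specialised terms, the Python sketch below maps a noisy token to the closest entry in a small vocabulary using the standard library's `difflib.get_close_matches`. The vocabulary, the `normalise` helper, and the cutoff of 0.8 are assumptions for the example; the slides do not say how such noise would be handled.

```python
import difflib

# Hypothetical vocabulary; a real system would load a drug and side-effect lexicon.
VOCABULARY = ["nausea", "dizziness", "drowsiness", "vomiting"]

def normalise(token, cutoff=0.8):
    """Map a possibly misspelled token to the closest vocabulary term, if any."""
    matches = difflib.get_close_matches(token.lower(), VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(normalise("nausia"))
print(normalise("vomitting"))
print(normalise("doctor"))
```

A high cutoff keeps unrelated words from being forced onto a vocabulary term, at the cost of missing heavier misspellings.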


Page 40: Search in Medical Text

What you may see in a medical forum

User A, Post 1: Side effects from MedicineX therapy?
. . . Since taking MedicineX for about 3 years, some time in the last year or so I began to experience significant ringing in the ears. . . .

User B, Post 2: Re: Side effects from MedicineX therapy?
I haven never heard about it. But I had nausea, vomiting and fever.

User C, Post 3: Re: Side effects from MedicineX therapy?
It is not true at all. MedicineX is one of medicines which have least side-effects. In fact, my heart related symptoms became better.

User D, Post 4: Re:Re: Side effects from MedicineX therapy?
I didn’t have nausea or vomiting but had a skin rash for a few days.

User E, Post 5: Warning!!! BLOOD CLOTHS IN MY LUNG!!!
After using MedicineX for 3.5 years, my doctor found a blood cloths in my lung . . .

User A, Post 6: Thank you
thx. My doctor told me my ear ringing was not MedicineX but . . .

Page 41: Search in Medical Text

Everybody’s different

All previous studies are focused on extracting mentions of adverse effects and mostly ignore the contributing factors that are patient-dependent.

We are interested in extracting both adverse and beneficial side-effects along with background information on the patients that could contribute to their positive or negative experience.

This is particularly interesting because clinical trials do not cover all possible patient conditions.

Page 42: Search in Medical Text

Entities to be extracted

Entity                        Example
Disease                       “After 3 years of having Ativan keep the anxiety in check, ...”
Symptom                       “My heart was racing and ..”
Drug                          “I must be addicted to Xanax”
Duration                      “Began taking 5 mg daily(broke the 10mg pill in half) for 4 weeks”
Dosage                        “Began taking 5 mg daily ...”
Frequency                     “Began taking 5 mg daily ..”
Positive side-effect          “I’m taking this for my back pain but it has been reducing my stress as well.”
Negative side-effect          “Sometimes causes drowsiness.”
Lack of negative side-effect  “I feel dizzy and low but no vomiting.”
Lack of positive side-effect  “I was feeling even more energetic initially but it doesnt work like that any more”
Positive outcome              “No apparent side effects thus far and results have been very effective for the pain.”
Negative outcome              “Problem is you build up a tolerance and eventually the drug quits working as has been my case.”
Gender of patient             “I was prescribed this for anxiety when my teenage daughter was driving my wife and I into”
Age                           “I’m in my forties”
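For a few of these entity types, a simple pattern-based extractor can be sketched in Python. The regular expressions below are illustrative assumptions covering only the dosage, frequency, and duration examples from the table; they are not the extraction method used in the study.

```python
import re

# Illustrative patterns only; real posts would need far broader coverage.
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s*mg\b", re.IGNORECASE)
FREQUENCY = re.compile(r"\b(?:daily|weekly|twice a day)\b", re.IGNORECASE)
DURATION = re.compile(r"\bfor\s+(\d+\s+(?:day|week|month|year)s?)\b", re.IGNORECASE)

def extract(sentence):
    """Return the dosage, frequency, and duration mentions found in a sentence."""
    return {
        "dosage": DOSAGE.findall(sentence),
        "frequency": FREQUENCY.findall(sentence),
        "duration": DURATION.findall(sentence),
    }

post = "Began taking 5 mg daily(broke the 10mg pill in half) for 4 weeks"
found = extract(post)
print(found)
```

Note that the patterns pick up both "5 mg" and "10mg" as dosages, although only the first is the dose actually taken; resolving that kind of ambiguity is where annotated data becomes necessary.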


Page 43: Search in Medical Text

Relations to be extracted

Relation: Description

Drug-Drug: If a patient explicitly mentions that taking two named drugs together had any effect or no effect, then the two drugs are annotated by a positive, negative, or no effect relation.
Example: "... MedicineA was fine till I started taking MedicineB as well ..."

Dosage-Frequency: The frequency at which a dosage is taken is annotated by a for relation.

Dosage-Duration: The length of time for which a specific dosage is taken is annotated with a for relation.

Drug-Dosage: The dosage at which a drug is taken is annotated with a taken relation.
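One possible in-memory representation of these relations is sketched below in Python. The `Entity` and `Relation` classes and the `is_valid` check are hypothetical names introduced for illustration; the slides do not describe the annotation format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    kind: str   # e.g. "Drug", "Dosage", "Frequency", "Duration"
    text: str

@dataclass(frozen=True)
class Relation:
    label: str  # "positive", "negative", "no effect", "for" or "taken"
    source: Entity
    target: Entity

# Which labels each entity pair may carry, following the relation descriptions.
ALLOWED = {
    ("Drug", "Drug"): {"positive", "negative", "no effect"},
    ("Dosage", "Frequency"): {"for"},
    ("Dosage", "Duration"): {"for"},
    ("Drug", "Dosage"): {"taken"},
}

def is_valid(relation):
    """Check that the label is permitted for this pair of entity types."""
    pair = (relation.source.kind, relation.target.kind)
    return relation.label in ALLOWED.get(pair, set())

dosage = Entity("Dosage", "5 mg")
ok = Relation("for", dosage, Entity("Frequency", "daily"))
bad = Relation("taken", dosage, Entity("Duration", "4 weeks"))
print(is_valid(ok), is_valid(bad))
```

A check like this could catch annotation slips early, for example a taken label attached to a dosage and duration pair.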


Page 44: Search in Medical Text

Data

We gathered data for ten different drugs from two different forums: AskPatient (http://www.askapatient.com/) and eHealth Forum (http://ehealthforum.com/).

A total of 5,996 posts (40,871 sentences) was collected.

We relied only on the free-text comments in each post.

Annotation by two annotators is ongoing.

Page 45: Search in Medical Text

A Survey: What do people think about medicine and social media?

Page 46: Search in Medical Text

Who participated?

            # Participants   Gender       Age range       Education
Group A     83               61% male     2% under 21     57% G
                             39% female   83% 21-39       35% B
                                          15% above 40    8% under
Group B     379              42% male     7% under 21     20% G
                             57% female   69% 21-39       7% B
                                          24% above 40    73% under
All         462

Group A: survey posted on Facebook, e-health forum, and Yahoo health forum
Group B: Amazon Mechanical Turkers

B: bachelor degree, G: graduate degree

Page 47: Search in Medical Text

How healthy were our participants? Do they trust their doctors?

            # Participants   Healthy    Trust doctors
Group A     83               45% V      56% V
                             43% M      34% M
                             12% not    10% little/none
Group B     379              53% V      68% V
                             41% M      26% M
                             6% not     12% little/none

Group A: survey posted on Facebook, e-health forum, and Yahoo health forum
Group B: Amazon Mechanical Turkers

M: moderately, V: very

Page 48: Search in Medical Text

Do people use medical social networks, forums, blogs, or medical information on the Internet? Do people share their experiences with drug side effects?

            Generic social   Medical social   Internet search   Trust Internet   Share
Group A     83% M to E       24% yes          48% M to E        38% well         4% M to E
            13% S            76% no           47% S             51% little       17% S
            4% N                              5% N              11% none         79% N
Group B     79% M to E       21% yes          56% M to E        50% well         31% M to E
            14% S            79% no           39% S             38% little       30% S
            7% N                              5% N              12% none         54% N

Group A: survey posted on Facebook, e-health forum, and Yahoo health forum
Group B: Amazon Mechanical Turkers

N: never, S: sometimes, M: moderately often, E: extremely often

Not so healthy people share more than very healthy people (53% vs 38%).

Page 49: Search in Medical Text

What’s next

We propose finding patterns of side-effect reporting, using both heuristics and automatically extracted rules. The outcome can be used to enrich side-effect ontologies.

One of our contributions will be providing the research community with a rich annotated collection that is large enough for experimentation and diverse in the types of drugs and annotated concepts.

The existing literature does not provide a comparison of previous approaches, mainly because no standard, publicly accessible dataset is available. We intend to conduct a comprehensive comparison of existing methods as well as our own techniques.

Page 50: Search in Medical Text

Summary

There are many areas in medicine and health which can benefit from more effective search in text:

Techniques used for extensive search in biomedical literature for answering focused clinical questions (systematic reviewing) are still well behind state-of-the-art search technology.

Domain experts search differently from laypeople, and biomedical search engines should accommodate these differences.

Analysing medical social media is one method of capturing previously undiscovered drug side-effects.

Page 51: Search in Medical Text

Expert Subjects (Opening questionnaire)

Category            Number of subjects
Gender              female 27 (71%)
                    male 11 (29%)
Position            allied health 12 (32%)
                    biomedical researcher 11 (29%)
                    medical student 9 (24%)
                    health librarian 3 (8%)
                    nurse 1 (3%)
Search tool used    PubMed 33 (87%)
                    Google Scholar 31 (82%)
                    Ovid 22 (58%)
                    EBSCO 15 (40%)
                    Other 7 (18%)
Satisfaction        very satisfied 3 (8%)
                    satisfied 30 (79%)
                    borderline 5 (13%)
                    unsatisfied 0 (0%)

Page 52: Search in Medical Text

Expert Subjects (Cont.)

Category            Number of subjects
Search tool usage   daily 7 (18%)
                    weekly 14 (37%)
                    monthly 13 (34%)
                    rarely 4 (10%)
Database used       Medline 30 (79%)
                    Journals@Ovid Full Text 17 (45%)
                    CINAHL 15 (35%)
                    Cochrane Systematic Reviews 12 (32%)
                    PsycINFO 11 (29%)
                    EMBASE 6 (16%)
                    AMED 5 (13%)
                    Other 12 (32%)