healthcare innovations at kno.e.sis

Healthcare Innovations at Kno.e.sis. Presentation to the Boonshoft School of Medicine Executive Committee, July 10, 2014. Amit Sheth, Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, USA

Upload: amit-sheth

Post on 21-Aug-2014


Category:

Health & Medicine



DESCRIPTION

This talk, given to the executive committee of the Boonshoft School of Medicine, introduces some of our projects on clinical and healthcare applications and health informatics, including consumer health behavior and social media use in healthcare. I focus on personalized digital health, handling and mining of healthcare big data, a high-level description of innovations, and especially applications involving clinical partners that empower patients, support better clinical decision making, reduce clinicians' information overload, or improve clinical outcomes. [Because some of the evaluations are still under way, some of these benefits are yet to be quantitatively and qualitatively assessed.] A 2-minute video on a Personalized Digital Health application (asthma control in children): https://www.youtube.com/watch?v=mATRAQ90wio Also see http://knoesis.org/amit/hcls for related information.

TRANSCRIPT

Page 1: Healthcare innovations at Kno.e.sis

Healthcare Innovations at Kno.e.sis


Presentation to the Boonshoft School of Medicine Executive Committee, July 10, 2014

Amit Sheth

Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) Wright State University, USA

Page 2: Healthcare innovations at Kno.e.sis


• Among top universities in the world in World Wide Web (cf: 10-yr impact, Microsoft Academic Search: among top 10 in June 2014)

• Largest academic group in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications

• Exceptional student success: internships and jobs at top salary (IBM Watson/Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups)

• 100 researchers including 15 World Class faculty (>3K citations/faculty) and ~45 PhD students, practically all funded

• Extensive research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)

Page 3: Healthcare innovations at Kno.e.sis


Amit Sheth’s PhD students

Ashutosh Jadhav

Hemant Purohit

Vinh Nguyen

Lu Chen

Pavan Kapanipathi

Pramod Anantharam

Sujan Perera

Alan Smith

Swapnil Soni

Maryam Panahiazar

Sarasi Lalithsena

Shreyansh Batt

Kalpa Gunaratna

Delroy Cameron

Sanjaya Wijeratne

Wenbo Wang

Kno.e.sis in 2014 = ~100 researchers (15 faculty, ~50 PhD students)

Special thanks: This presentation covers some of the work of these researchers.

Page 4: Healthcare innovations at Kno.e.sis


• 80% of doctors will eventually become obsolete: Vinod Khosla, VC and co-founder of Sun Microsystems

• “The Doctor is (Always) In: Reinventing the Doctor-Patient Relationship for the 21st Century” [Dr. J. Shlain]. More data is generated under patient control and outside the clinical system. Patient empowerment, reimbursement changes and AHA.

• #dHealth and #IoT are two of the hottest hashtags at CES and SXSW

Healthcare is changing way too fast

Page 5: Healthcare innovations at Kno.e.sis


The Patient of the Future, MIT Technology Review, 2012

http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/

Page 6: Healthcare innovations at Kno.e.sis


Collaborators

Page 7: Healthcare innovations at Kno.e.sis


Healthcare Innovation at Kno.e.sis

(with subset of applications)

Page 8: Healthcare innovations at Kno.e.sis


kHealth: Knowledge empowered personalized digital mHealth

With applications to: ADHF, GI, Asthma, [Geriatrics]

Contact: Prof. Amit Sheth

Page 9: Healthcare innovations at Kno.e.sis

Brief Introduction Video

Page 10: Healthcare innovations at Kno.e.sis


Providing actionable information in a timely manner is crucial to avoid information overload or fatigue

Sleep data

Community data

Personal schedule

Activity data

Personal health records

Data Overload for Patients/health aficionados

Page 11: Healthcare innovations at Kno.e.sis


Weather Application

Asthma Healthcare Application

Detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO2 level

Action in the Physical World

Close the window at home during the day to avoid CO2 inflow, to avoid asthma attacks at night

Public Health

Personal

Population Level

‘FOR human’: Improving Human Experience

Page 12: Healthcare innovations at Kno.e.sis


Making sense of sensor data with

Page 13: Healthcare innovations at Kno.e.sis


Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information

canary in a coal mine

knowledge-enabled healthcare

kHealth

Page 14: Healthcare innovations at Kno.e.sis


kHealth to Manage ADHF (Acute Decompensated Heart Failure)

Page 15: Healthcare innovations at Kno.e.sis


25 million: people in the U.S. diagnosed with asthma (7 million are children) [1]

300 million: people suffering from asthma worldwide [2]

$50 billion: spent on asthma alone in a year [2]

155,000: hospital admissions in 2006 [3]

593,000: emergency department visits in 2006 [3]

[1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/

[2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html

[3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.

Asthma

Page 16: Healthcare innovations at Kno.e.sis


Asthma is a multifactorial disease with health signals spanning personal, public health, and population levels.

Real-time health signals from the personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health level (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arrive continuously in fine-grained samples, potentially with missing information and uneven sampling frequencies.

Variety, Volume, Veracity, Velocity, Value

• Can we detect the asthma severity level?
• Can we characterize the asthma control level?
• What risk factors influence asthma control?
• What is the contribution of each risk factor?

Semantics: understanding relationships between health signals and asthma attacks for providing actionable information

WHY Big Data to Smart Data? Healthcare example

Page 17: Healthcare innovations at Kno.e.sis

ICS = inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA = inhaled short-acting beta2-agonist

* consider referral to specialist

Asthma Control and Actionable Information

Sensors and their observations for understanding asthma

Personal, Public Health, and Population Level Signals for Monitoring Asthma

Page 18: Healthcare innovations at Kno.e.sis


Table columns: Health Score at Discharge; risk factors (Non-compliance, Poor economic status, No living assistance); Vulnerability Score

Health Score at Discharge | Vulnerability Score
Well Controlled | Low
Well Controlled | Very low
Not Well Controlled | High
Not Well Controlled | Medium
Poor Controlled | Very High
Poor Controlled | High

Estimation of readmission vulnerability based on the personal health score

Personal Health Score and Vulnerability Score

Page 19: Healthcare innovations at Kno.e.sis


Observations → Physical-Cyber-Social System → Health Signal Extraction → Health Signal Understanding

Levels: Personal, Public Health, Population Level

Signals from personal, personal spaces, and community spaces:

• Sensor and personal observations (Wheeze – Yes; Do you have tightness of chest? – Yes)
• Acceleration readings from on-phone sensors
• Tweets reporting pollution level and asthma attacks
• Outdoor pollen and pollution

Extracted health signals:

<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>

Qualify, Quantify, Enrich (using Background Knowledge and Expert Knowledge):

<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
...

Risk Category assigned by doctors:

• Well Controlled – continue
• Not Well Controlled – contact nurse
• Poor Controlled – contact doctor

Health Signal Extraction to Understanding
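The quantification step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide shows Medium=2, Yes=1 and High=3, while the values for Low and No, and the names `encode` and `recommend`, are hypothetical and not taken from the kHealth system.

```python
# Ordinal encodings. Medium=2, High=3 and Yes=1 appear on the slide;
# Low=1 and No=0 are assumptions made for this sketch.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}
YES_NO = {"No": 0, "Yes": 1}

# Actions listed on the slide for each doctor-assigned control level.
ACTIONS = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poor Controlled": "contact doctor",
}

# Feature order used in the vector shown on the slide.
FEATURE_ORDER = ("PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing")


def encode(observations):
    """Turn qualitative health signals into a numeric feature vector."""
    vector = []
    for name in FEATURE_ORDER:
        value = observations[name]
        table = LEVELS if value in LEVELS else YES_NO
        vector.append(table[value])
    return tuple(vector)


def recommend(control_level):
    """Look up the action associated with a control level."""
    return ACTIONS[control_level]


observations = {
    "PollenLevel": "Medium",
    "ChestTightness": "Yes",
    "Pollution": "Yes",
    "Activity": "High",
    "Wheezing": "Yes",
}
print(encode(observations), recommend("Not Well Controlled"))
```

With the slide's example signals, the encoded vector is (2, 1, 1, 3, 1), which matches the rows shown above.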

Page 20: Healthcare innovations at Kno.e.sis


Social streams have been used to extract many near real-time events

Twitter provides access to rich signals, but tweets are noisy, informal, redundant, inconsistently capitalized, and lacking in context

We formalize event extraction from tweets as a sequence labeling problem

How do we know the event phrases, and who creates the training set? (Manual creation is ruled out)

Example: Now you know why you’re miserable! Very High Alert for B-ALLERGEN Ragweed I-ALLERGEN pollen. B-FACILITY Oklahoma I-FACILITY Allergy I-FACILITY Clinic says it’s an extreme exposure situation

Idea: background knowledge is used to create the training set, e.g., typing information becomes the label for a concept

Health Signal Extraction Challenges
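One way to read the idea above is as dictionary-based labeling: phrases whose type is known from background knowledge are tagged automatically, producing B-/I-/O training labels without manual annotation. The sketch below assumes a tiny hand-written phrase dictionary (`KNOWLEDGE`) and a hypothetical function `bio_label`; the actual knowledge source and matching strategy used in the project are not described on the slide.

```python
# Hypothetical background knowledge: lowercased phrase -> entity type.
KNOWLEDGE = {
    ("ragweed", "pollen"): "ALLERGEN",
    ("oklahoma", "allergy", "clinic"): "FACILITY",
}


def bio_label(tokens):
    """Assign B-/I-/O labels to tokens by matching known phrases."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower().strip(".,!?") for t in tokens]
    for phrase, entity_type in KNOWLEDGE.items():
        n = len(phrase)
        for start in range(len(tokens) - n + 1):
            if tuple(lowered[start:start + n]) == phrase:
                labels[start] = "B-" + entity_type
                for i in range(start + 1, start + n):
                    labels[i] = "I-" + entity_type
    return list(zip(tokens, labels))


tweet = ("Very High Alert for Ragweed pollen. Oklahoma Allergy Clinic "
         "says it's an extreme exposure situation")
tagged = bio_label(tweet.split())
for token, label in tagged:
    print(token, label)
```

The labeled tokens could then serve as training data for a sequence labeler; which model is trained on them is outside what the slide states.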

Page 21: Healthcare innovations at Kno.e.sis


intelligence at the edge

Approach 1: Send all sensor observations to the cloud for processing

Approach 2: Downscale semantic processing so that each device is capable of machine perception

Henson et al., “An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices”, ISWC 2012.

Page 22: Healthcare innovations at Kno.e.sis


Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning

0101100011010011110010101100011011011010110001101001111001010110001101011000110100111

Efficient execution of machine perception
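To make the bit-vector idea concrete, here is a deliberately simplified sketch, not the algorithm from Henson et al. It assumes a toy knowledge base (the disorder and property names are invented for illustration) in which each disorder owns one bit, so that finding the disorders consistent with all observed properties reduces to bitwise AND on integers.

```python
# Hypothetical toy knowledge base: each disorder gets one bit position.
DISORDERS = ["asthma", "bronchitis", "common_cold"]
BIT = {name: 1 << i for i, name in enumerate(DISORDERS)}

# For each observable property, a bit vector of the disorders that can explain it.
EXPLAINED_BY = {
    "wheezing": BIT["asthma"] | BIT["bronchitis"],
    "chest_tightness": BIT["asthma"],
    "cough": BIT["asthma"] | BIT["bronchitis"] | BIT["common_cold"],
}


def explain(observed):
    """Return the disorders that explain every observed property."""
    candidates = (1 << len(DISORDERS)) - 1  # start with all bits set
    for prop in observed:
        candidates &= EXPLAINED_BY[prop]    # one AND per observation
    return [name for name in DISORDERS if candidates & BIT[name]]


print(explain(["wheezing", "cough"]))
print(explain(["wheezing", "chest_tightness"]))
```

Each observation costs a single integer operation, which is the kind of saving that makes reasoning feasible on a phone or sensor node.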

Page 23: Healthcare innovations at Kno.e.sis


O(n^3) < x < O(n^4) → O(n)

Efficiency Improvement

• Problem size increased from 10s to 1000s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear

Evaluation on a mobile device

Page 24: Healthcare innovations at Kno.e.sis


1 Translate low-level data to high-level knowledge: Machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making

2 Prior knowledge is the key to perception: Using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web

3 Intelligence at the edge: By downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices

Semantic Perception for smarter analysis: 3 ideas to take away

Page 25: Healthcare innovations at Kno.e.sis


PREDOSE: Social media analysis driven epidemiology

Application: Prescription drug abuse and beyond

Contact: Delroy Cameron

Page 26: Healthcare innovations at Kno.e.sis


D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. July 2013 (in press)

Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing

CITAR - Center for Interventions Treatment and Addictions Research

http://wiki.knoesis.org/index.php/PREDOSE

Bridging the gap between researcher and policy makers

Early identification of emerging patterns and trends in abuse

PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology

Page 27: Healthcare innovations at Kno.e.sis


In 2008, there were 14,800 prescription painkiller deaths*

*http://www.cdc.gov/homeandrecreationalsafety/rxbrief/

• Drug Overdose Problem in US
• 100 people die every day from drug overdoses
• 36,000 drug overdose deaths in 2008
• Close to half were due to prescription drugs

Gil Kerlikowske, Director, ONDCP

Launched May 2011

PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology

Page 28: Healthcare innovations at Kno.e.sis


Early Identification and Detection of Trends

Access hard-to-reach Populations

Large Data Sample Sizes

Group Therapy: http://www.thefix.com/content/treatment-options-prison90683

Interviews

Online Surveys

Automatic Data Collection

Not Scalable

Manual Effort

Sample Biases

Epidemiologist

Qualitative Coding

Problems

Computer Scientist

Automate Information Extraction & Content Analysis

PREDOSE: Bringing Epidemiologists and Computer Scientists together

Page 29: Healthcare innovations at Kno.e.sis
Page 30: Healthcare innovations at Kno.e.sis

I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.

Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected 2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That was about half an hour ago. I feel great now.

Codes | Triples (subject-predicate-object)
Suboxone used by injection, negative experience | Suboxone injection-causes-Cephalalgia
Suboxone used by injection, amount | Suboxone injection-dosage amount-2mg
Suboxone used by injection, positive experience | Suboxone injection-has_side_effect-Euphoria

Sentiment Extraction

+ve: feel pretty damn good; feel great

-ve: experience sucked; didn’t do shit; bad headache

Triples

DIVERSE DATA TYPES: ENTITIES, RELATIONSHIPS, SENTIMENTS, DOSAGE, PRONOUN, INTERVAL, Route of Admin.

Entity Identification

Buprenorphine

Suboxone subClassOf Buprenorphine

Subutex subClassOf Buprenorphine

has_slang_term: bupe, bupey

Drug Abuse Ontology (DAO): 83 Classes, 37 Properties

33:1 Buprenorphine

24:1 Loperamide
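Ontology-driven entity identification of this kind can be approximated by a lookup that maps brand names and slang to one canonical concept. The fragment below is a minimal sketch: it contains only the terms visible on the slide, attaches both slang terms to Buprenorphine as an assumption (the flattened diagram does not make the attachment explicit), and uses a hypothetical function name `normalize`.

```python
# Fragment mirroring the slide. Attaching the slang terms to Buprenorphine
# is an assumption; the real Drug Abuse Ontology is far larger.
SUBCLASS_OF = {"Suboxone": "Buprenorphine", "Subutex": "Buprenorphine"}
SLANG_TERM_OF = {"bupe": "Buprenorphine", "bupey": "Buprenorphine"}


def normalize(mention):
    """Map a mention in a forum post to its canonical drug concept, if known."""
    key = mention.strip().lower()
    if key in SLANG_TERM_OF:
        return SLANG_TERM_OF[key]
    for brand, parent in SUBCLASS_OF.items():
        if key in (brand.lower(), parent.lower()):
            return parent
    return None


for mention in ["bupe", "Suboxone", "buprenorphine", "phenobarbital"]:
    print(mention, "->", normalize(mention))
```

Normalizing mentions this way lets counts for a drug aggregate across its brand names and slang, which plain keyword search would split apart.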

Page 31: Healthcare innovations at Kno.e.sis


Ontology: ENTITIES (Suboxone, Kratom, Heroin); TRIPLES (Suboxone-CAUSE-Cephalalgia)

Lexicon: EMOTION (disgusted, amazed, irritated); INTENSITY (more than, a, few of); PRONOUN (I, me, mine, my); SENTIMENT (Im glad, turn out bad, weird)

Lexico-ontology: DRUG-FORM (ointment, tablet, pill, film); ROUTE OF ADM (smoke, inject, snort, sniff); SIDE EFFECT (itching, blisters, flushing, shaking hands, difficulty breathing)

Rule-based Grammar:

DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)

FREQUENCY: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)

INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)
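The three grammar rules can be approximated with regular expressions. The patterns below are a rough sketch rather than PREDOSE's grammar: the unit list, the period list, and the indicator words are assumptions chosen to cover the examples on the slide.

```python
import re

# Building blocks; the concrete vocabularies are assumptions for illustration.
AMT = r"\d+(?:\.\d+)?(?:\s*-\s*\d+(?:\.\d+)?)?"   # 5, 2.5, 2-3
UNIT = r"(?:mg|ml|g|tabs?|pills?)"
PERIOD = r"(?:hour|day|week|month|year)s?"

# DOSAGE: <AMT><UNIT>
DOSAGE = re.compile(rf"\b({AMT})\s*({UNIT})\b", re.IGNORECASE)
# FREQUENCY: <AMT><FREQ_IND><PERIOD>
FREQUENCY = re.compile(rf"\b({AMT})\s+(times?\s+(?:a|per))\s+({PERIOD})\b", re.IGNORECASE)
# INTERVAL: <PERIOD_IND><PERIOD>
INTERVAL = re.compile(rf"\b(several|few|many|\d+)\s+({PERIOD})\b", re.IGNORECASE)

text = "I waited 24 hours after my last 2 mg dose"
print(DOSAGE.findall(text))      # dosage mentions
print(INTERVAL.findall(text))    # interval mentions
print(DOSAGE.findall("5mg or 2-3 tabs"))
print(FREQUENCY.findall("5 times a week"))
```

A rule-based grammar suits these data types because dosage and timing expressions follow regular surface patterns even when the surrounding text is slang-heavy.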

PREDOSE: Smarter Data through Shared Context and Data Integration

Page 32: Healthcare innovations at Kno.e.sis


Data Type | Semantic Web Technique | Limitations of ML/NLP | Limitations of IR
Entity | Ontology-driven Identification & Normalization | Requires labeled data | Unpredictable term frequencies
Triple | Schema-driven | Difficult to develop language model | Requires entity disambiguation
Sentiment | Ontology-assisted Target Entity Resolution | Inconsistent data for parse trees or rules | Diverse simple & complex slang terms & phrases

PREDOSE: Role of Semantic Web and Ontologies

Page 33: Healthcare innovations at Kno.e.sis


Loperamide is used to self-medicate for opioid withdrawal symptoms

Loperamide-Withdrawal Discovery

Page 34: Healthcare innovations at Kno.e.sis


EMR and clinical text analysis: Intelligence from clinical data

Contact: Sujan Perera

Page 35: Healthcare innovations at Kno.e.sis


• Active Semantic EMR: high quality, low error, faster completion of patient records

• Predicting patient outcomes and advising discharge decisions based on both structured (billing) data and clinical text (unstructured data)

• Deep understanding of clinical text for Computer Assisted Coding for ICD9 and ICD10 and Computerized Document Improvement (commercial products from ezDI)

Page 36: Healthcare innovations at Kno.e.sis

[Figure: workflow with components Patient Notes (documents, D), UMLS, Hypothesis Generation, Hypothesis Filtering, Hypothesis with High Confidence, and Explanation Module (Explained? Yes/No)]

Semantic Driven Approach for Knowledge Acquisition from EMRs

Page 37: Healthcare innovations at Kno.e.sis


Deep clinical text analysis using semantics enhanced NLP has enabled our industry partner ezDI to develop exciting commercial products: ezCDI (Computerized Document Improvement) and ezCAC (Computer Assisted ICD9/ICD10 Coding)

See: http://ezdi.us

Semantics enhanced NLP

Page 38: Healthcare innovations at Kno.e.sis


• Typical NLP algorithms misclassify linguistic nuances

• Document 1:
– Coronary artery disease listed in the current diagnosis list
– “Send for carotid duplex to rule out carotid artery stenosis given his risk factors and underlying coronary artery disease.” (NLP output says patient does not have coronary artery disease)

• Document 2:
– “Extremities: Warm and dry. No clubbing or cyanosis. No lower extremity edema.”
– “I have advised the patient on the side effect of potential lower extremity edema.” (NLP output says patient has lower extremity edema)

• Document 3:
– “He is not having any symptoms of chest pain or exertional syncope or dizziness.”
– “I advised him that if he experiences chest pain, shortness of breath with exertion or dizziness or syncopal episodes to let us know and we can do appropriate workup.” (NLP output says patient has chest pain, shortness of breath, dizziness, syncopal episodes)

Green – correctly identified entities; Red – misclassified entities

Semantics enhanced NLP

Page 39: Healthcare innovations at Kno.e.sis


Semantics enhanced NLP

• Domain knowledge can be used to resolve misclassifications

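Symptom: Syncope Is_symptom_of Atrial Fibrillation

Medication: Warfarin Is_medication_for Atrial Fibrillation

Medication: Atenolol Is_medication_for Atrial Fibrillation

Medication: Aspirin Is_medication_for Atrial Fibrillation

• There is strong evidence to suggest that the patient has Atrial Fibrillation.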
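A small sketch can show how such relations accumulate evidence for a disorder that the text never states outright. The relation tables contain only what appears on the slide; the function name `supporting_evidence` and the idea of simply listing supporting mentions (rather than computing a calibrated confidence) are assumptions for illustration.

```python
# Relations shown on the slide.
IS_SYMPTOM_OF = {"Syncope": {"Atrial Fibrillation"}}
IS_MEDICATION_FOR = {
    "Warfarin": {"Atrial Fibrillation"},
    "Atenolol": {"Atrial Fibrillation"},
    "Aspirin": {"Atrial Fibrillation"},
}


def supporting_evidence(mentions):
    """Group the mentions found in a note by the disorder they point to."""
    evidence = {}
    for mention in mentions:
        for relation in (IS_SYMPTOM_OF, IS_MEDICATION_FOR):
            for disorder in relation.get(mention, ()):
                evidence.setdefault(disorder, []).append(mention)
    return evidence


note_mentions = ["Syncope", "Warfarin", "Atenolol", "Aspirin"]
print(supporting_evidence(note_mentions))
```

A disorder backed by several independent mentions is a stronger hypothesis than one backed by a single mention, which is one way domain knowledge can override a doubtful NLP classification.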

Page 40: Healthcare innovations at Kno.e.sis


Raw Text to Knowledge

Raw Text: He is off both Diovan and Lotrel. I am unsure if it is due to underlying renal insufficiency. He has actually been on atenolol alone for his hypertension.

Concepts: diovan, lotrel, renal insufficiency, atenolol, hypertension

Knowledge (relations: is a, is used to treat):

• drug: diovan, valtuna, valsartan, antihypertensive agent, atenolol, tenomin, atenix
• disorder: kidney failure, renal insufficiency, kidney disease, blood pressure disorder, hypertension, systolic hypertension, pulmonary hypertension

Inference:

• Patient taking atenolol for hypertension
• Patient has kidney disease
• Patient is on antihypertensive drugs
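The inference step amounts to walking "is a" links upward from the concepts found in the text. The edges below are one plausible reading of the flattened diagram, not a verified extract from the knowledge base, and `ancestors` is a hypothetical helper.

```python
# Assumed "is a" edges, read off the slide's concept hierarchy.
IS_A = {
    "diovan": "valsartan",
    "valsartan": "antihypertensive agent",
    "atenolol": "antihypertensive agent",
    "renal insufficiency": "kidney disease",
    "kidney disease": "disorder",
    "hypertension": "blood pressure disorder",
    "blood pressure disorder": "disorder",
}


def ancestors(concept):
    """Follow is-a links upward and return the chain of broader concepts."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


for concept in ["diovan", "atenolol", "renal insufficiency"]:
    print(concept, "->", ancestors(concept))
```

Under these edges, a mention of atenolol supports "patient is on antihypertensive drugs" and a mention of renal insufficiency supports "patient has kidney disease", matching the inferences listed on the slide.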

Page 41: Healthcare innovations at Kno.e.sis

cTAKES, ezNLP

ezKB

<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />

ezFIND, ezMeasure, ezCDI, ezCAC

www.ezdi.us

ezHealth Platform
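Annotations in this style are easy to consume downstream. The sketch below parses the four elements shown on the slide with Python's standard library; the wrapping `<annotations>` root is an assumption added so the fragment is well-formed XML, and nothing here reflects the real ezDI API.

```python
import xml.etree.ElementTree as ET

# The four annotation elements from the slide.
FRAGMENT = """
<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />
"""

# A hypothetical root element makes the fragment a single XML document.
root = ET.fromstring(f"<annotations>{FRAGMENT}</annotations>")

# Collect (type, surface value, concept identifier) for each annotation.
concepts = [
    (el.tag, el.get("value"), el.get("cui") or el.get("code"))
    for el in root
]
for concept in concepts:
    print(concept)
```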


Page 42: Healthcare innovations at Kno.e.sis


Online Health Information Seeking

Contact: Ashutosh Jadhav

Page 43: Healthcare innovations at Kno.e.sis


Internet Users in the World

http://www.internetlivestats.com/internet-users/

Around 3 billion (40% of the world population)

Around 300 million (87% of the US population)

Page 44: Healthcare innovations at Kno.e.sis


• Online health resources
– Are easily accessible
– Help to obtain medical information quickly and conveniently
– Can help non-experts to make more informed decisions
– Play a vital role in improving health literacy

Online Health Information Seeking

Page 45: Healthcare innovations at Kno.e.sis


• With the growing availability of online health resources, consumers are increasingly using the Internet to seek health related information

According to a 2013 Pew Survey*, one in three American adults has gone online to find information about a specific medical condition.

*Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013

Online Health Information Seeking

Page 46: Healthcare innovations at Kno.e.sis


• One of the most common ways to seek online health information is via Web search engines such as Google, Yahoo! and Bing

According to the Pew Survey, approximately 8 in 10 online health inquiries initiate from a search engine.

Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013

Online Health Information Seeking

Page 47: Healthcare innovations at Kno.e.sis


• Analyzing health search logs
– Helps to understand population-level health information needs
– Shows how users formulate search queries (“expression of information need”)
– Offers potentially larger cohorts of real users and their behaviors, e.g. querying behaviors

• Such knowledge can be applied
– to improve the health search experience
– to develop next-generation knowledge and content delivery systems

Motivation

Page 48: Healthcare innovations at Kno.e.sis

Online Health Information Seeking

Smart Devices

Personal Computers

vs.

Jadhav A et al. “Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal” Journal of Medical Internet Research 2014;16(7):e160 (Impact factor 3.8)

Page 49: Healthcare innovations at Kno.e.sis

Desktop

Mobile

Mobile usage takes over

Motivation

Page 50: Healthcare innovations at Kno.e.sis

• With the recent exponential increase in usage of smart devices, the percentage of people using smart devices to search for health information is also growing rapidly

Motivation

Page 51: Healthcare innovations at Kno.e.sis

• Experience of online information searching varies depending on the device used
– Smart devices (SDs): mobile phones, tablets
– Personal computers (PCs): desktops, laptops

• PCs and SDs have distinct characteristics
– Readability, user experience, accessibility, etc.

Motivation

Page 52: Healthcare innovations at Kno.e.sis

• In order to improve the health information searching process and to be prepared for the technology shift, it is necessary
– to understand how device choice influences online health information seeking

Study Objective

Page 53: Healthcare innovations at Kno.e.sis

• Data:
– Health search queries
– launched from PCs and SDs
– submitted from Web search engines
– that directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)

• Data timeframe:
– June 2011 to May 2013

• Data collection tool:
– IBM NetInsight On Demand (Web analytics tool)

• Dataset size:
– More than 100 million health search queries for both PCs and SDs

Dataset Creation

Page 54: Healthcare innovations at Kno.e.sis

• For PCs and SDs, we analyzed and compared
– Frequently searched health categories
– Types of search queries (keyword-based, Wh-questions, Yes/No questions)
– Structural properties of the queries: length of the search queries, usage of search query operators, usage of special characters
– Misspellings in the health search queries
– Linguistic characteristics of the queries

Comparative Data Analysis

Page 55: Healthcare innovations at Kno.e.sis

• The most-searched health categories are ‘Symptoms’ (1 in 3 search queries), ‘Causes’, and ‘Treatments & Drugs’

• One of the least-searched health categories is ‘Prevention’

• The distribution of search queries across health categories differs with the device used for search

• Search queries from both PCs and SDs follow a similar pattern of distribution across health categories

Intent Mining for Health Information Seeking

Page 56: Healthcare innovations at Kno.e.sis

• Health queries are predominantly formulated using keywords (~85%), followed by Wh-questions and Yes/No questions

• Users ask more health questions from SDs than from PCs

• In the health search queries, users ask more
– “what” and “how” questions => descriptive information need
– “can”, “is” and “does” questions => factual information need

Intent Expression: Search Query Type
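A first-word heuristic is enough to illustrate the three query types. This is a sketch, not the study's classifier: only “what”, “how”, “can”, “is” and “does” come from the slides, and the remaining starter words are assumptions.

```python
# "what"/"how" and "can"/"is"/"does" appear on the slides; the other
# starter words are assumptions added for this sketch.
WH_WORDS = {"what", "how", "why", "when", "where", "which", "who"}
YES_NO_STARTERS = {"can", "is", "does", "do", "are", "should", "will"}


def query_type(query):
    """Classify a search query by its first word."""
    words = query.lower().split()
    if not words:
        return "keyword"
    if words[0] in WH_WORDS:
        return "wh-question"
    if words[0] in YES_NO_STARTERS:
        return "yes/no question"
    return "keyword"


for q in ["heart palpitations with headache",
          "how to lower blood pressure",
          "does tylenol raise blood pressure"]:
    print(q, "->", query_type(q))
```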

Page 57: Healthcare innovations at Kno.e.sis

Average length of the queries from SDs (3.29 words and 18.86 characters) is a bit longer than that of queries from PCs (2.9 words and 17.61 characters)

Health queries tend to be longer than general search queries, indicating users' interest in more specific information

Intent Expression: Search Query Length

Page 58: Healthcare innovations at Kno.e.sis

Online Health Information Seeking for Cardiovascular Diseases

Jadhav A et al."What Information about Cardiovascular Diseases do People Search Online?”, 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014.

Jadhav A et al. "Online Information Searching for Cardiovascular Diseases: An Analysis of Mayo Clinic Search Query Logs” AMIA 2014 Annual Symposium, Washington DC, Nov 15-19, 2014

Page 59: Healthcare innovations at Kno.e.sis


• According to the CDC, in the United States
– CVD is one of the most common chronic diseases
– CVD is the leading cause of death (1 in every 4 deaths)

• CVD is common across all socioeconomic groups and demographics

• Most of the CVDs require lifelong care and the patient is in charge of managing the disease through self-care

• Online health resources are “significant information supplement” for the patients with chronic conditions

Motivation

Page 60: Healthcare innovations at Kno.e.sis


• Although chronic diseases affect a large population, very few prior studies have investigated online health information searching exclusively for chronic diseases, and especially for CVD.

• In this study, we address this knowledge gap in the community by performing population-level intention mining for online health information seeking

Motivation

Page 61: Healthcare innovations at Kno.e.sis


• Data:
– CVD-related search queries
– submitted from Web search engines
– that directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)

• Data timeframe:
– September 2011 to August 2013

• Data collection tool:
– IBM NetInsight On Demand (Web analytics tool)

• Dataset size:
– 10 million CVD-related search queries, which is a significantly large dataset for a single class of diseases.

Dataset Creation

Page 62: Healthcare innovations at Kno.e.sis


• Identification of users' intent for health information seeking

• For example:

Search Query | Health Category
Heart palpitations with headache | Symptoms
Tylenol raise blood pressure | Medication, Vital sign
Pump for pulmonary hypertension | Medical device, Disease
Red wine heart disease | Food, Disease
Bypass surgery | Treatment

Research Problem

Page 63: Healthcare innovations at Kno.e.sis


• Using background knowledge to develop a rule-based classification approach

– Using UMLS MetaMap, based on UMLS concepts and semantic types

– To categorize CVD search queries into 14 “consumer oriented” health categories

– Precision: 88.42%, Recall: 86.07% and F-Score: 0.8723

Intent Mining for Online Health Information Seeking
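The sketch below illustrates two points from this slide. First, the reported F-score is the harmonic mean of the reported precision and recall. Second, rule-based categorization can be pictured as a mapping from UMLS semantic types to consumer-oriented categories. The semantic type abbreviations are standard UMLS ones, but the mapping to categories is an assumption, and the code expects semantic types that an annotator such as MetaMap has already produced rather than calling MetaMap itself.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Assumed mapping from UMLS semantic type abbreviations to health categories.
# The study's actual rules for its 14 categories are not given on the slide.
CATEGORY_RULES = {
    "sosy": "Symptoms",                  # Sign or Symptom
    "dsyn": "Diseases and Conditions",   # Disease or Syndrome
    "phsu": "Medication",                # Pharmacologic Substance
    "food": "Food and Diet",             # Food
    "medd": "Medical device",            # Medical Device
    "topp": "Treatments",                # Therapeutic or Preventive Procedure
}


def categorize(semantic_types):
    """Map a query's annotated semantic types to zero or more categories."""
    categories = []
    for semantic_type in semantic_types:
        category = CATEGORY_RULES.get(semantic_type)
        if category and category not in categories:
            categories.append(category)
    return categories


print(round(f_score(0.8842, 0.8607), 4))
print(categorize(["food", "dsyn"]))   # e.g. annotations for "red wine heart disease"
```

Because a query can carry several semantic types, it can land in zero, one, or several categories, which is consistent with the multi-label examples in the research problem slide.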

Page 64: Healthcare innovations at Kno.e.sis


Methods Overview

Page 65: Healthcare innovations at Kno.e.sis

Intent Mining for Health Information Seeking:

Association Rules for Categorization

Page 66: Healthcare innovations at Kno.e.sis

• One in every two searches is related to either ‘Diseases and Conditions’ or ‘Vital signs’.

• Other popular health categories that users search for include ‘Symptoms’, ‘Living with’, ‘Treatments’, ‘Food and Diet’ and ‘Causes’.

• Although CVD can be prevented with some lifestyle and diet changes, interestingly very few online health information seekers (OHISs) search for CVD ‘Prevention’.

Intent Mining for Health Information Seeking:

Categorization Results

Page 67: Healthcare innovations at Kno.e.sis

• A search query can be categorized into zero, one or more health categories

• Using our categorization approach, we categorized 92% of the 10 million CVD related queries into at least one health category

• Most of the queries (around 88%) are categorized into either one or two categories

• Very few CVD queries (4.28%) are categorized into 3 or more categories.

Intent Mining for Health Information Seeking:

Categorization Results

Page 68: Healthcare innovations at Kno.e.sis

• Most of the top search queries are related to major CVD diseases and conditions.

• At the same time, queries about blood pressure (high/low) and heart rate are also searched frequently

Top CVD Search Queries

Page 69: Healthcare innovations at Kno.e.sis

• Average search query length for CVD is 3.88 words and 22.22 characters

• Around 80% of the CVD search queries have 3 or more words.

• The analysis implies that CVD search queries are longer than previously reported non-medical as well as medical queries

• Longer search queries also denote users’ interest in more specific information about the disease; subsequently users use more words to narrow down to a particular health topic.

Intent Expression: Search Query Length

Page 70: Healthcare innovations at Kno.e.sis

• Users predominantly formulate search queries using keywords (80%), though queries with Wh-Questions are also significant

• Few queries (2.5%) are formulated as Yes/No type questions

• In Wh-questions, OHISs mostly use “How” and “What” in the search queries and both of them generally signify that more descriptive information is needed

• Yes/No questions are usually used to check some factual information. In Yes/No Questions, OHISs more often start the search queries with “does” “can” and “is”

Intent Expression: Search Query Types

Page 71: Healthcare innovations at Kno.e.sis

Comparative Analysis of Online Health Information Seeking for

Chronic Diseases

Cardiovascular Diseases

Arthritis

Cancer Diabetes

Page 72: Healthcare innovations at Kno.e.sis

Analyzing Temporal Patterns in Online Health Information Seeking

Page 73: Healthcare innovations at Kno.e.sis

Analyzing online information seeking for “Food and Diet” in the

context of “Health”

Page 74: Healthcare innovations at Kno.e.sis


Social Health Signals

Contact: Ashutosh Jadhav

Page 75: Healthcare innovations at Kno.e.sis


• Every day, millions of health-related tweets are shared

• Most of these tweets are highly personal and contextual

• Only around 12% posts are informative*

• Keyword-based search doesn't help

• User has to manually identify informative tweets

How to automate the identification of informative content?

Problem: Identifying Signals from Noise

Page 76: Healthcare innovations at Kno.e.sis

Present high-quality, reliable, and informative health-related information shared over social media by understanding:

• Who (social network user): Who shared the information? [People Analysis]

• Shares what (social media post): What content is shared? [Content Analysis]

• When: When was the post generated? [Temporal Analysis]

• In what context: What is the topic of the message? [Semantic Analysis]

• On which channel: Which website does the social media post point to? [Reliability Analysis]

• With what social effect: How many retweets, Facebook likes/shares, and comments does the post have? [Popularity Analysis]

Social Health Signals

Page 77: Healthcare innovations at Kno.e.sis

Search and Explore

Top health news

Faceted search (by health topics)

Social Health Signals

Page 78: Healthcare innovations at Kno.e.sis

Ongoing projects

Page 79: Healthcare innovations at Kno.e.sis

• Stress, obesity/lifestyle disease, chronic diseases

• Food and diet in the health context

• Keeping the elderly at home as long as possible

• Clinical research: developing a blood test for esophageal cancer detection

On the drawing board

Page 80: Healthcare innovations at Kno.e.sis

• Kno.e.sis is a truly multidisciplinary, pan-University Center of Excellence where world-class technology/computing expertise comes together with clinical research and applications in health, fitness & wellbeing

• Major themes: personalized digital health, patient empowerment, informed patients, epidemiology

• More is covered in my talk on Semantic Data enabling Personalized Digital Health

Take Away

Page 81: Healthcare innovations at Kno.e.sis

http://knoesis.org

http://knoesis.org/vision

http://knoesis.org/amit/hcls

Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing

Wright State University, Dayton, Ohio, USA

Thank you, and please visit us at

Page 82: Healthcare innovations at Kno.e.sis

1. Henson C, Thirunarayan K, Sheth A. An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices. 11th International Semantic Web Conference (ISWC 2012), Boston, Massachusetts, USA, November 11-15, 2012.

2. Henson C, Sheth A, Thirunarayan K. Semantic Perception: Converting Sensory Observations to Abstractions. IEEE Internet Computing, vol. 16, no. 2, pp. 26-34, Mar./Apr. 2012, doi:10.1109/MIC.2012.20

3. Henson C, Thirunarayan K, Sheth A. An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web. Applied Ontology, vol. 6(4), pp.345-376, 2011.

4. Perera S, Sheth A, Thirunarayan K, Nair S, Shah N. Challenges in Understanding Clinical Notes: Why NLP Engines Fall Short and Where Background Knowledge Can Help. International Workshop on Data management & Analytics for healthcaRE (DARE) at ACM Conference of Information and Knowledge Management (CIKM), pp. 21-26, Burlingame, USA, Nov 1, 2013.

5. Perera S, Henson C, Thirunarayan K, Sheth A, Nair S. Semantics Driven Approach for Knowledge Acquisition From EMRs. IEEE Journal of Biomedical and Health Informatics, vol.18, no.2, pp.515-524, March 2014, doi: 10.1109/JBHI.2013.2282125, PMID: 24058038

Selected References

Page 83: Healthcare innovations at Kno.e.sis

6. Cameron D, Smith GA, Daniulaityte R, Sheth A et al. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. 46(6): 985-997, 2013. PMID: 23892295

7. Cameron D, Bodenreider O, Yalamanchili H, Danh T et al. A Graph-Based Recovery and Decomposition of Swanson's Hypothesis using Semantic Predications. Journal of Biomedical Informatics 46(2): 238-251, 2013.

8. Jadhav A, Sheth A, Pathak J. Analysis of Online Information Searching for Cardiovascular Diseases on a Consumer Health Information Portal. American Medical Informatics Association (AMIA) Annual Symposium 2014, Washington DC, November 15-19, 2014

9. Jadhav A, Andrews D, Fiksdal A, Kumbamu A, McCormick JB, et al. Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal. J Med Internet Res 2014;16(7):e160, PMID: 25000537

10. Fiksdal A, Kumbamu A, Jadhav A, Nelsen L, Pathak J, McCormick JB. Evaluating the Process of Online Health Information Searching: A Qualitative Approach to Exploring Consumer Perspectives. In press, J Med Internet Res 2014

11. Jadhav A, Wu S, Sheth A, Pathak J. Online Information Seeking for Cardiovascular Diseases: A Case Study from Mayo Clinic. 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014

Selected References