
Upload: amit-sheth

Post on 16-Aug-2015


Smart Data in Health – How we will exploit personal, clinical, and social

“Big Health Data” for better outcomes

Webinar given to Brain Health Alliance, June 30, 2015

Amit Sheth

Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing

Wright State University, Dayton, Ohio

http://knoesis.org

http://knoesis.org/amit/hcls

Special Thanks: Sujan Perera

Data Sources

Personal

Social

Clinical

Diagnosis

Medications

Family History

Allergies

Patient History

Big Data

Stakeholders

Medical Professionals Scientists Policy Makers

Patients

Smart Data

Makes Sense

Actionable or help decision support/making

Contextual

Information

Personalized

Smart Data

Smart data makes sense out of Big Data.

It derives value by harnessing the challenges posed by the volume, velocity, variety, and veracity of Big Data, in turn providing actionable information and improving decision making.

Future Patient

Healthcare Data Usage - Examples

• Support Research - Genomics and Beyond: The key to enabling personalized medicine and discoveries

• Transform Data to Information: Mine data for meaning and patterns/predictive analytics

• Support Self-Care: Mobile apps to keep track of your health status

• Support Providers - Improve Patient Care: Analyzing social and clinical data streams to create behavioral health records

• Increase Awareness: Inform about epidemics, identify counterfeit drugs, inform about environmental issues

Few Success Stories

• IBM- Ontario’s Institute of Technology : predict the onset of nosocomial infections 24 hours before symptoms appeared.

• University of Michigan Health System : reducing the need for blood transfusions by 31 per cent and expenses by $200,000 a month.

• Kaiser Permanente : discovery of adverse drug effects and subsequent withdrawal of the drug Vioxx from the market.

• Harvard Medical School : computer algorithms to analyze EHR data to detect and categorize patients with diabetes for public health surveillance.

• Seton Healthcare – IBM : bulging jugular vein is a strong—and easily observed—predictor that a patient admitted for congestive heart failure is likely to wind up back in the hospital.

Kno.e.sis Harness the Value

kHealth analyzes both active and passive observations of patients to generate alarms that help improve the patient's health, fitness, and wellbeing. It uses Semantic Sensor Web technology, Semantic Perception, and Intelligence at the Edge to enable sophisticated analysis of personal health observations.

kHealth

Data Sources

kHealth Wiki

Kno.e.sis Harness the Value

The overall aim of PREDOSE is to develop techniques to facilitate prescription drug abuse epidemiology, related to the illicit use of pharmaceutical opioids. PREDOSE is designed to capture the knowledge, attitudes and behaviors of prescription drug abusers through the automatic extraction of semantic information from social media.

PREDOSE

Data Sources

PREDOSE Wiki

Kno.e.sis Harness the Value

eDrugTrends is a social media data analytics platform for monitoring the use of cannabis and synthetic cannabinoids. It uses Twitter and Web forum data to: 1) identify and compare trends in knowledge, attitudes, and behaviors related to cannabis and synthetic cannabinoids, and 2) identify key influencers in cannabis and synthetic cannabinoid-related discussions on Twitter.

eDrugTrends

Data Sources

eDrugTrends Wiki

Kno.e.sis Harness the Value

This project seeks to understand and satisfy users’ need to keep track of new information in healthcare and well-being. The project harvests collective intelligence to identify high-quality, reliable, and informative healthcare content shared over social media, based on the following analyses: text analysis, semantic analysis, reliability analysis, and popularity analysis.

Social Health Signals

Data Sources

Social Health Signals Wiki

Technology Stack

EMR

Sensor

Explicit/Implicit Entity Recognition

Understanding Language Nuances

Noise Filtering

Entity Disambiguation

Sentiment Extraction

Spatial Information Extraction

Temporal Information Extraction

Knowledge Extraction

Semantic Perception

Time Series Analysis

Predictive Analytics

Semantic Analysis

Social Network Analysis

Data Integration

Kno.e.sis Strength

Research at Kno.e.sis rests on the belief that knowledge about the world and the problem domain has a critical role to play in solving complex real-world problems. Hence, our technologies always exploit the available background knowledge to overcome the unique challenges posed by the problem at hand.

Computationally, we seek to combine bottom-brain and top-brain inspired computing.

Knowledge: World Knowledge, Medical Knowledge, Linguistic Knowledge

Few Examples of Our Technologies

Knowledge Acquisition

Sujan Perera, Cory Henson, Krishnaprasad Thirunarayan, Amit Sheth, Suhas Nair, 'Semantics Driven Approach for Knowledge Acquisition from EMRs', Special Issue on Data Mining in Bioinformatics, Biomedicine and Healthcare Informatics, Journal of Biomedical and Health Informatics (To Appear)

Intuition: Knowledge is built by abstracting real world facts, once built it should be able to explain the real world

Public Knowledge is not always Sufficient

Semantics Driven Approach for Knowledge Acquisition from EMRs

[Figure: Patient Notes and UMLS feed an Explanation Module. Explained? Yes / No. If No: Hypothesis Generation, then Hypothesis Filtering, yielding Hypotheses with High Confidence.]

Knowledge Acquisition

1. Annotate the EMR documents with the given knowledgebase

2. Find unexplained symptoms

3. Generate hypotheses for unexplained symptoms

   1. All disorders in the document become candidates

4. Filter out candidate disorders with high confidence

   1. Get the disorders that have a relationship with the unexplained symptom in the given knowledgebase

   2. Collect the “neighborhood” of those disorders

   3. Get the intersection of the “neighborhood” and the candidate disorders

Knowledge Acquisition - Algorithm
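
The steps of the algorithm can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation from the cited paper: the knowledgebase is reduced to two hypothetical dictionaries (`kb` for symptom-to-disorder links, `neighborhood` for related disorders), and the toy data below is made up for the example.

```python
from typing import Dict, Set


def generate_hypotheses(
    doc_disorders: Set[str],
    doc_symptoms: Set[str],
    kb: Dict[str, Set[str]],            # assumed shape: symptom -> disorders linked to it
    neighborhood: Dict[str, Set[str]],  # assumed shape: disorder -> related disorders
) -> Dict[str, Set[str]]:
    """Return, per unexplained symptom, the candidate disorders kept after filtering."""
    hypotheses: Dict[str, Set[str]] = {}
    for symptom in sorted(doc_symptoms):
        known = kb.get(symptom, set())
        # Step 2: a symptom counts as explained if a linked disorder is in the document.
        if known & doc_disorders:
            continue
        # Step 3: every disorder in the document becomes a candidate.
        candidates = set(doc_disorders)
        # Step 4: keep only candidates inside the neighborhood of the linked disorders.
        near: Set[str] = set()
        for disorder in known:
            near |= neighborhood.get(disorder, set())
        hypotheses[symptom] = candidates & near
    return hypotheses


# Toy data, invented for illustration only.
toy_kb = {"edema": {"heart failure"}, "fatigue": {"diabetes"}}
toy_neighborhood = {"heart failure": {"atrial fibrillation", "cardiomyopathy"}}
result = generate_hypotheses(
    doc_disorders={"atrial fibrillation", "diabetes"},
    doc_symptoms={"edema", "fatigue"},
    kb=toy_kb,
    neighborhood=toy_neighborhood,
)
```

Here "fatigue" is already explained by a disorder in the document, so only "edema" receives a hypothesis, and filtering narrows its candidates to the one disorder in the neighborhood of a known cause.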

Information Extraction

Implicit Entity Recognition

Bob Smith is a 61-year-old man referred by Dr. Davis for outpatient cardiac catheterization because of a positive exercise tolerance test. Recently, he started to have left shoulder twinges and tingling in his hands. A stress test done on 2013-06-02 revealed that the patient exercised for 6 1/2 minutes, stopped due to fatigue. However, Mr. Smith is comfortably breathing in room air. He also showed accumulation of fluid in his extremities. He does not have any chest pain.

Annotations: Person, Person, UMLS: C0018795, UMLS: C0008031, UMLS: C0015672

Named Entity Recognition (gives type)

Co-reference Resolution

Negation Detection

Entity Linking

Temporal Information Extraction

Implicit Entity Recognition

Bob Smith is a 61-year-old man referred by Dr. Davis for outpatient cardiac catheterization because of a positive exercise tolerance test. Recently, he started to have left shoulder twinges and tingling in his hands. A stress test done on 2013-06-02 revealed that the patient exercised for 6 1/2 minutes, stopped due to fatigue. However, Mr. Smith is comfortably breathing in room air. He also showed accumulation of fluid in his extremities. He does not have any chest pain.

Implicit entities: Shortness of breath (negated), Edema

Shortness of breath: uncomfortable sensation of difficulty in breathing

Edema: excessive accumulation of fluid

Implicit Entity Recognition

Implicit Entity Recognition (IER) is the task of determining whether a sentence, which does not contain the proper name of an entity, nevertheless refers to the entity.

Sujan Perera, Pablo Mendes, Amit Sheth, Krishnaprasad Thirunarayan, Adarsh Alex, Christopher Heid, Greg Mott, 'Implicit Entity Recognition in Clinical Documents', In proceedings of The Fourth Joint Conference on Lexical and Computational Semantics (*SEM), 2015

Implicit Entity Recognition

Sentence | Entity

Her breathing is still uncomfortable. | Shortness of breath

It is important to prevent shortness of breath and lower extremity swelling from fluid accumulation. | Edema

She says she did not have any warning prior to losing consciousness and remembers everything. | Syncope

His tip of the appendix was inflamed. | Appendicitis

There is a 1.3 cm gallstone within the gallbladder neck which is not obstructing. | Cholecystitis

Semantic Perception

Semantic Perception

Making sense of sensor data with the SSN Ontology

Intellego

Levels of Abstraction (from less useful to more useful)

0 Raw data [in TEXT], e.g., number: “150”

1 Annotated data [in RDF], e.g., label: Systolic blood pressure of 150 mmHg

2 Interpreted data (deductive) [in OWL], e.g., threshold: Elevated Blood Pressure

3 Interpreted data (abductive) [in OWL], e.g., diagnosis: Hyperthyroidism
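
The first three levels of abstraction can be illustrated with a small Python sketch. It is a simplification under stated assumptions: plain Python objects stand in for RDF and OWL, and the 140 mmHg cut-off is a hypothetical threshold chosen for the example, not a value given in the slides.

```python
from dataclasses import dataclass

# Hypothetical cut-off for illustration only; not taken from the slides.
ELEVATED_SYSTOLIC_MMHG = 140.0


@dataclass
class AnnotatedObservation:
    """Level 1: a raw value with a label and a unit attached."""
    property_name: str
    value: float
    unit: str


def annotate(raw: str) -> AnnotatedObservation:
    """Level 0 -> 1: turn the raw text into a labeled observation."""
    return AnnotatedObservation("systolic blood pressure", float(raw), "mmHg")


def interpret(obs: AnnotatedObservation) -> str:
    """Level 1 -> 2: deductive interpretation by comparing against a threshold."""
    if obs.value >= ELEVATED_SYSTOLIC_MMHG:
        return "elevated blood pressure"
    return "normal blood pressure"


reading = annotate("150")         # level 1
abstraction = interpret(reading)  # level 2
```

Level 3, the abductive step from "elevated blood pressure" to a diagnosis such as hyperthyroidism, needs background knowledge linking properties to features and is not covered by this sketch.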

Perception Cycle*

* based on Neisser’s cognitive model of perception

Observe Property, Perceive Feature, Prior Knowledge

1 Explanation: Translating low-level signals into high-level knowledge

2 Discrimination: Focusing attention on those aspects of the environment that provide useful information

W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph

Prior Knowledge on the Web

Inference to the best explanation

• In general, explanation is an abductive problem, and hard to compute

Finding the sweet spot between abduction and OWL

• Single-feature assumption* enables use of an OWL-DL deductive reasoner

* An explanation must be a single feature which accounts for all observed properties

Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building

Explanation

[Figure: columns “Observed Property” (elevated blood pressure, clammy skin, palpitations) and “Explanatory Feature” (Hypertension, Hyperthyroidism, Pulmonary Edema).]

Explanation
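
Under the single-feature assumption, explanation reduces to a subset test, which the following Python sketch illustrates. The links between features and properties in `KB` are assumed for the example (the edges of the slide's graph are not recoverable from the transcript), and plain sets stand in for the OWL-DL reasoner mentioned above.

```python
from typing import Dict, Set

# Assumed, illustrative feature -> property links; not taken from the slides.
KB: Dict[str, Set[str]] = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explain(observed: Set[str], kb: Dict[str, Set[str]]) -> Set[str]:
    """Return every single feature that accounts for all observed properties."""
    return {feature for feature, props in kb.items() if observed <= props}


after_one = explain({"elevated blood pressure"}, KB)
after_two = explain({"elevated blood pressure", "palpitations"}, KB)
```

With these assumed links, observing only elevated blood pressure leaves all three features as explanations, while additionally observing palpitations rules out Hypertension.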

Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features

Discrimination

[Figure: columns “Discriminating Property” (elevated blood pressure, clammy skin, palpitations) and “Explanatory Feature” (Hypertension, Hyperthyroidism, Pulmonary Edema).]

Discrimination
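
Discrimination can be sketched in the same set-based style: a property is worth observing next if some, but not all, of the remaining candidate features would account for it. As before, the links in `KB` are assumptions made for illustration, and the function name is hypothetical.

```python
from typing import Dict, Set

# Assumed, illustrative feature -> property links; not taken from the slides.
KB: Dict[str, Set[str]] = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def discriminating_properties(
    candidates: Set[str], observed: Set[str], kb: Dict[str, Set[str]]
) -> Set[str]:
    """Unobserved properties explained by some, but not all, candidate features."""
    unobserved = set().union(*(kb[f] for f in candidates)) - observed
    result: Set[str] = set()
    for prop in unobserved:
        explained_by = {f for f in candidates if prop in kb[f]}
        if 0 < len(explained_by) < len(candidates):
            result.add(prop)
    return result


to_check = discriminating_properties(set(KB), {"elevated blood pressure"}, KB)
```

Elevated blood pressure is shared by every candidate, so observing it again would not help; clammy skin and palpitations each split the candidate set and are therefore returned.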

Few Real World Problems that Kno.e.sis Solves

Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information

canary in a coal mine

Empowering individuals for their own health

kHealth

What?

• kHealth is a knowledge-based approach/application for patient-centric healthcare that exploits (a) Web-based tools and social media and (b) mobile phone technology and wireless sensors (c) to synthesize personalized actions from heterogeneous health data

(i) For disease prevention and treatment

(ii) For health, fitness, and well-being

kHealth

kHealth – Applications & Impact

Condition | Number of patients | Total cost per year

Asthma | 25 million | 50 billion

ADHF | 5 million | 34 billion

Parkinson’s disease | 1 million | 25 billion

Sensors

Sensordrone (carbon monoxide, temperature, humidity)

Node Sensor (exhaled nitric oxide)

Android Device (w/ kHealth App)

Total cost: ~ $500

* Along with two sensors in the kit, the application uses a variety of population level signals from the web: Pollen level, Air Quality, Temperature & Humidity

kHealth – Asthma Patient Kit

[Figure: Health signal processing architecture. Inputs: personal level signals, public level signals, population level signals, events from social streams, and domain knowledge. Stages: data acquisition & aggregation, analysis (risk model), personalized actionable information. Example outputs: “Take medication before going to work”, “Avoid going out in the evening due to high pollen levels”, “Contact doctor”.]

Health Signal Processing Architecture

Risk assessment model

Semantic Perception

Personal level Signals

Public level Signals

Domain Knowledge

Population level Signals

GREEN: Well controlled

YELLOW: Not well controlled

RED: Poorly controlled

How controlled is my asthma?

Patient Health Score (Diagnostic)

Risk assessment model

Semantic Perception

Personal level Signals

Public level Signals

Domain Knowledge

Population level Signals

Patient health Score

How vulnerable* is my control level today?

*considering changing environmental conditions and current control level

Patient Vulnerability Score (Prognostic)

Population Level

Personal

Wheeze – Yes

Do you have tightness of chest? – Yes

Observations Physical-Cyber-Social System Health Signal Extraction Health Signal Understanding

<Wheezing=Yes, time, location>

<ChestTightness=Yes, time, location>

<PollenLevel=Medium, time, location>

<Pollution=Yes, time, location>

<Activity=High, time, location>

Wheezing, ChestTightness, PollenLevel, Pollution, Activity, RiskCategory

<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

...

Expert Knowledge

Background Knowledge

Tweets reporting pollution level and asthma attacks

Acceleration readings from on-phone sensors

Sensor and personal observations

Signals from personal, personal spaces, and community spaces

Risk Category assigned by doctors

Qualify

Quantify

Enrich

Outdoor pollen and pollution

Public Health

Well Controlled – continue

Not Well Controlled – contact nurse

Poorly Controlled – contact doctor

Health Signal Extraction to Understanding
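
The path from extracted signals to a feature vector and then to an action can be sketched as follows. The ordinal encoding (No=0, Yes=1, Low=1, Medium=2, High=3) is an assumption inferred from the example vector on the slide, and the step that assigns the risk category is deliberately left out, since the slides attribute it to doctors and do not describe a model.

```python
from typing import Dict, List

# Assumed encoding, inferred from the example vector; not stated in the slides.
ORDINAL: Dict[str, int] = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Feature order as shown in the example tuple.
FEATURE_ORDER: List[str] = [
    "PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing",
]

# Control level -> recommended action, as listed on the slide.
ACTION: Dict[str, str] = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poorly Controlled": "contact doctor",
}


def to_vector(signals: Dict[str, str]) -> List[int]:
    """Encode qualitative signals as an ordinal feature vector."""
    return [ORDINAL[signals[name]] for name in FEATURE_ORDER]


signals = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
vector = to_vector(signals)
advice = ACTION["Not Well Controlled"]
```

A learned or rule-based risk model would sit between `to_vector` and the `ACTION` lookup; which kind of model is used is not something this sketch assumes.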

D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. July 2013 (in press)

Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing

CITAR - Center for Interventions, Treatment and Addictions Research

http://wiki.knoesis.org/index.php/PREDOSE

Bridging the gap between researcher and policy makers

Early identification of emerging patterns and trends in abuse

PREDOSE

In 2008, there were 14,800 prescription painkiller deaths*

*http://www.cdc.gov/homeandrecreationalsafety/rxbrief/

• Drug Overdose Problem in US

• 100 people die every day from drug overdoses

• 36,000 drug overdose deaths in 2008

• Close to half were due to prescription drugs

Gil Kerlikowske, Director, ONDCP

Launched May 2011

PREDOSE

PREDOSE

Early Identification and Detection of Trends

Access hard-to-reach Populations

Large Data Sample Sizes

Group Therapy: http://www.thefix.com/content/treatment-options-prison90683

Interviews

Online Surveys

Automatic Data Collection

Not Scalable

Manual Effort

Sample Biases

Epidemiologist

Qualitative Coding

Problems

Computer Scientist

Automate Information Extraction & Content Analysis

[Figure: PREDOSE architecture. Stages: 1. Data Collection; 2. Automatic Coding; 3. Data Analysis and Interpretation. Components: Web Forums, Web Crawler, Informal Text Database, Data Cleaning, Information Extraction Module (Entity Identification, Relationship Extraction, Sentiment Extraction, Triple Extraction), Triples/RDF Database (Semantic Web Database), Temporal Analysis for Trend Detection, Qualitative and Quantitative Analysis of Drug User Knowledge, Attitudes and Behaviors. Entity types: Opioid, Cannabinoid, Side Effect, Feeling.]

[Buprenorphine has_slang_term bupe]

[Suboxone subClassOf Buprenorphine]

[Suboxone_Injection CAUSES Nausea]

Drug Abuse Ontology (Schema)
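
To show how such triples can support entity identification, here is a minimal Python sketch that resolves a slang term to its ontology concept and walks up the class hierarchy. It uses only the three example triples from the slide; the tuple representation and the helper names are assumptions made for illustration, not the PREDOSE implementation.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# The three example triples from the slide, as (subject, predicate, object).
TRIPLES: List[Triple] = [
    ("Buprenorphine", "has_slang_term", "bupe"),
    ("Suboxone", "subClassOf", "Buprenorphine"),
    ("Suboxone_Injection", "CAUSES", "Nausea"),
]


def slang_index(triples: List[Triple]) -> Dict[str, str]:
    """Map each lower-cased slang term to the concept it refers to."""
    return {obj.lower(): subj for subj, pred, obj in triples if pred == "has_slang_term"}


def normalize(mention: str, triples: List[Triple]) -> str:
    """Replace a slang mention with its concept; leave other mentions unchanged."""
    return slang_index(triples).get(mention.lower(), mention)


def superclasses(entity: str, triples: List[Triple]) -> List[str]:
    """Follow subClassOf links upward, stopping if a cycle is detected."""
    parents = {subj: obj for subj, pred, obj in triples if pred == "subClassOf"}
    chain: List[str] = []
    while entity in parents and parents[entity] not in chain:
        entity = parents[entity]
        chain.append(entity)
    return chain


concept = normalize("bupe", TRIPLES)
lineage = superclasses("Suboxone", TRIPLES)
```

With a mapping like this, a forum post that mentions "bupe" and another that mentions "Suboxone" can both be counted under Buprenorphine.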

PREDOSE Web Application

I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.

Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected 2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That was about half an hour ago. I feel great now.

Codes | Triples (subject-predicate-object)

Suboxone used by injection, negative experience | Suboxone injection-causes-Cephalalgia

Suboxone used by injection, amount | Suboxone injection-dosage amount-2mg

Suboxone used by injection, positive experience | Suboxone injection-has_side_effect-Euphoria

Triples

DOSAGE PRONOUN

INTERVAL Route of Admin.

RELATIONSHIPS SENTIMENTS

DIVERSE DATA TYPES

ENTITIES

Entity Identification

Suboxone subClassOf Buprenorphine; Subutex subClassOf Buprenorphine

Buprenorphine has_slang_term bupe; Buprenorphine has_slang_term bupey

Drug Abuse Ontology (DAO): 83 Classes, 37 Properties

33:1 Buprenorphine

24:1 Loperamide

Sentiment Extraction

+ve: feel pretty damn good, feel great

-ve: experience sucked, didn’t do shit, bad headache

Techniques: Ontology, Lexicon, Lexico-ontology, Rule-based Grammar

ENTITIES: Suboxone, Kratom, Heroin

TRIPLES: Suboxone-CAUSE-Cephalalgia

EMOTION: disgusted, amazed, irritated

INTENSITY: more than, a, few of

PRONOUN: I, me, mine, my

SENTIMENT: Im glad, turn out bad, weird

DRUG-FORM: ointment, tablet, pill, film

ROUTE OF ADM: smoke, inject, snort, sniff

SIDEEFFECT: Itching, blisters, flushing, shaking hands, difficulty breathing

DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)

FREQUENCY: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)

INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)
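
The three rule-based grammar patterns can be approximated with regular expressions, as in the Python sketch below. The unit, period, and indicator vocabularies are small hypothetical samples chosen to cover the slide's examples; the actual PREDOSE grammar is not given in the slides and is likely richer.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical vocabularies; only large enough to cover the examples on the slide.
AMT = r"\d+(?:\.\d+)?(?:\s*-\s*\d+(?:\.\d+)?)?"  # e.g. 5, 2.5, 2-3
UNIT = r"(?:mg|tabs?|pills?)"
PERIOD = r"(?:hour|day|week|month|year)s?"

# DOSAGE: <AMT><UNIT>
DOSAGE = re.compile(rf"\b({AMT})\s*({UNIT})\b", re.IGNORECASE)
# FREQUENCY: <AMT><FREQ_IND><PERIOD>
FREQ = re.compile(rf"\b({AMT})\s+(times?)\s+(?:a|per)\s+({PERIOD})\b", re.IGNORECASE)
# INTERVAL: <PERIOD_IND><PERIOD>; treating plain numbers as indicators is an assumption.
INTERVAL = re.compile(rf"\b(several|few|many|\d+)\s+({PERIOD})\b", re.IGNORECASE)


def extract(text: str) -> Dict[str, List[Tuple[str, ...]]]:
    """Return all dosage, frequency, and interval matches found in the text."""
    return {
        "dosage": DOSAGE.findall(text),
        "frequency": FREQ.findall(text),
        "interval": INTERVAL.findall(text),
    }


post = "I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe."
found = extract(post)
```

On the sample sentence from the forum post, the dosage pattern picks up the two milligram amounts and the interval pattern picks up the 24-hour wait, while "24 hours" is not mistaken for a dosage because "hours" is not in the unit list.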

PREDOSE: Smarter Data through Shared Context and Data Integration

Loperamide is used to self-medicate for opioid withdrawal symptoms

PREDOSE: Loperamide-Withdrawal Discovery

Social Health Signals

http://www.internetlivestats.com/internet-users/

Around 3 billion Internet users (40% of the world population)

Around 300 million Internet users (87% of the US population)

• Online health resources

– Are easily accessible

– Help to obtain medical information quickly and conveniently

– Can help non-experts make more informed decisions

– Play a vital role in improving health literacy

Social Health Signals

• With the growing availability of online health resources, consumers are increasingly using the Internet to seek health related information

• Most queries are initiated in search engines

According to a 2013 Pew Survey*, one in three American adults has gone online to find information about a specific medical condition.

*Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013

Social Health Signals

Social Health Signals - Motivation

• Analyzing health search logs

– Helps to understand population level health information needs

– Shows how users formulate search queries (“expression of information need”)

– Offers potentially larger cohorts of real users and their behaviors, e.g., querying behaviors

• Such knowledge can be applied

– to improve the health search experience

– to develop next-generation knowledge and content delivery systems

Social Health Signals - Studies

• Online information seeking: Personal computer vs. smart devices

• What information about cardiovascular disease do people search for?

Social Health Signals - Studies

• Comparative analysis of online health information seeking for chronic diseases

• Analyzing temporal patterns of online health information seeking

Cardiovascular Diseases

Arthritis

Cancer Diabetes

Social Health Signals - Studies

• Analyzing online information seeking for “Food and Diet” in the context of health

• Identification of users’ intent in health information seeking

• Using background knowledge to develop a rule-based classification approach

– Using UMLS MetaMap, based on UMLS concepts and semantic types

– To categorize CVD search queries into 14 “consumer oriented” health categories

Research Problem

Methods Overview

eDrugTrends

• eDrugTrends is a software platform developed to semi-automate the processing and visualization of thematic, sentiment, spatio-temporal, and social network dimensions on cannabis and synthetic cannabinoid use.

• It is built on top of our existing analytics platforms, Twitris and PREDOSE.

eDrugTrends - Significance

• eDrugTrends advances the field’s technological and methodological capabilities to harness social media for drug abuse surveillance research.

• eDrugTrends informs the field about new trends in the use of cannabis and synthetic cannabinoids.

eDrugTrends - Architecture

eDrugTrends – Preliminary Study

• We studied the differences in the volume of hash oil (a form of cannabis) related tweets across states with varying cannabis legalization policies.

• We studied attitudes toward the use of hash oil products.

eDrugTrends – Data Set

• ~18,000 Tweets in early October from ~14,500 users.

• 20% contain identifiable state-level geolocations.

• Examples

If you smoke spice and you live in a Weed Legal state.... You are trash

Tried my first dab Tuesday night. Best sleep I've had in a while. Too bad dabs are too expensive for me.

I used to smoke k2 all the time when my bestfriend was on papers‚ then I almost died n never touched it again.

eDrugTrends – Early Findings

• Tweets related to hash oil are most frequent in states that have legalized medical and recreational use of cannabis.

• Users in such states have a highly positive attitude toward cannabis use.

• These findings will help to develop intervention and policy responses.

Thank You

Visit us @ www.knoesis.org, with additional background at http://knoesis.org/amit/hcls

Ohio Center of Excellence in Knowledge-enabled Computing - An Ohio Center of Excellence in BioHealth Innovation

Wright State University