11/3/2017
Public Health Intelligence Platform for
Social Health Records (SHR)*
City University of New York
Soon Ae Chun
In collaboration with
Xiang Ji (NJIT PhD graduate, Bloomberg Inc.)
James Geller (NJIT)
Introduction & Overview
• Online health-related social networks generate large amounts of health data, e.g.
– Twitter, PatientsLikeMe, Medhelp, etc.
• SHR (Social Health Records) – Social Media-based Health-related Data
– How can we leverage these for gaining health Intelligence & better healthcare?
• We present an integration and analytics framework of social health records (SHR) to address three problems:
– Health Data Integration Problem
– Population Analytics Problem
– Predictive Analytics Problem
Statistics: Social Media Use in Healthcare
Each statement below is paired with a percentage on the slide graphic; [%] marks where it was detached in extraction.
• [%] consumers say that information found via social media affects the way they deal with their health (chronic disease, diet & exercise)
• 18 to 24 year olds are more than twice as likely as 45 to 54 year olds to use social media for health-related discussions.
• [%] from 18 to 24 years of age said they would trust medical information shared by others on their social media networks.
• [%] healthcare organizations have specific social media guidelines in writing.
• [%] of adults are likely to share information about their health on social media sites with other patients, 47 percent with physicians, 43 percent with hospitals, 38 percent with a health insurance company and 32 percent with a drug company.
• [%] of all hospitals in the United States participate in social media.
• [%] of smartphone owners have at least one health app on their phone. Exercise, diet, and weight apps are the most popular types.
• [%] patients are very comfortable with their providers seeking advice from online communities to better treat their conditions.
• [%] of healthcare professionals use social media for professional networking.
• [%] of people said social media would affect their choice of a specific physician, hospital or medical facility.
Percentages on the slide graphic, in extraction order: 26%, 41%, 31%, 54%, Youths 90%, Youths, 31%, 19%, >40%, 31%
(source: Becker’s ASC Review, 2/2015)
Most popular online resources for Health Information
• The most accessed online resources for health related information are:
• 56% searched WebMD,
• 31% on Wikipedia,
• 29% on health magazine websites,
• 17% used Facebook,
• 15% used YouTube,
• 13% used a blog or multiple blogs,
• 12% used patient communities
• 6% used Twitter and
• 27% used none of the above
(source: Mashable, 2012, referralmd.com)
Social Media Health Data
• Twitter
– 230 million tweets posted per day in 2011 -> 317 million (2016) (statistica.com)
• PatientsLikeMe
– 17,835 patients share profiles with the public; 307,033 members share only with other members
– 500+ health conditions (as of 2015)
• Medhelp
– 20 million monthly visitors
– Track pain, weight, chronic diseases
• CureTogether – anonymously track and compare health data
• DailyStrength – emotional support groups
• Inspire – different communities to offer support and educate
• FacetoFaceHealth – algorithm to match people with similar diagnoses
• Meddik – empower patients to search health info and learn from experiences from others
• Doximity – network of medical doctors and students to build and extend professional relationships
Challenges of Social Media for Health Care
• People seek health information on social media to make personal health care decisions.
• However, to reach a reasonable decision about their health, they need to
– visit many different information sources,
– synthesize,
– reason, and
– compare.
• In other words, social health data are
– vast in volume, distributed (scattered), heterogeneous, and streaming
– subject to the DRIP syndrome (Data Rich, Information Poor), a core problem that data science (DS) addresses
– Data science methodologies can provide the analytics needed to generate more “useful” knowledge for health care.
– SOCIAL HEALTH Analytics Framework
Patient’s Health Reports on Social Media
SHR (Social Health Records)
• Generated by patients
– Health status reports
• Headaches, experienced symptoms
• Diagnosis reports
– Healthcare practice data
• Actual medications, treatments
• Side effects from treatments
– Health-related behaviors/habits
• Drinking, smoking, etc.
• Exercises, fitbits
• Nutrition
EHR (Electronic Health Records)
• Entered by clinical professionals
– Clinical data
• Medications, allergies, problems, procedures, chart notes, clinical alert notes, lab results, and images
– Patient history
– Orders
– Medications/Allergies
– Demographic data
– Lab data
Social Health Records (SHR)
EHR
• Structured/unstructured
• Uses Medical expert language
– Myocardial infarction
• Comparatively precise
– ICD9 code for a disease
• Not easily accessible due to
HIPAA privacy law
• Localized/silo
– Hospital, provider/group
– Application specific (lack of
interoperability)
• Factual statement
SHR
• Mostly unstructured data
• Informal everyday language
– Heart attack
• Ambiguous, vague
– Diabetes (type 1 or 2?), hepatitis (A or C?)
• Publicly available or more readily available
• Can be accessed around the world
– Web browser accessible
• Emotional
– Reviews, empathic statements, annotations
Social Health Records (SHR)
• Show potential for investigating
– Major concerns or topics in health
– Health risks, attitudes towards health
– Trends
– Personal feelings/views on treatments or conditions
– Desired outcomes
– Adverse drug events
• Can SHR also serve as a knowledge source for clinical and policy-related decision making?
– Social health data as a complementary data source for research and clinical decisions
– A knowledge source for Health Intelligence to understand the health behaviors or practices of a population
SHR Integration and Analytics Framework
• A social health integration and analytics framework uses Social Health Records as first-class data to gain useful insights
– Health Data Integration
• Scattered data sources
– Predictive Analytics Problem
• Can similar individuals be used to predict a person's future disease?
• Comorbidity trajectory model
– Public Health Analytics
• Public health issues, e.g. epidemic outbreak detection, public sentiments
• Drug abuse detection
• Users have to integrate health information from all these sources into one coherent mental model.
Query: What are other patients’ symptoms and drug reviews for
treating the top-10 conditions?
Research Issues & Approaches
• Research Issues
• How can we model and integrate the extracted data to satisfy the information
needs?
• How can we best present social health analytics and inference results to users?
• Approach
• Health data integration for Analytics
– Designed semantic model & RDF storage to perform integration of data that can satisfy the
information needs.
– Developed context-aware social analytics and inferences.
– Social InfoButton (knowledge from social data about population health behaviors, practices..)
RDF-Based Storage
• Triple: <subject, predicate, object>.
• A patient “John” has a profile page as well as a health condition “Psoriasis”.
– URI1 = http://www.patientlikeme.com/patient#1050
– URI2 = http://www.patientlikeme.com/Members/232328/about_me
– URI3 = http://www.patientlikeme.com/condition#154
– <URI1, hasName, “John”>
– <URI1, hasProfile, URI2>
– <URI1, hasCondition, URI3>
– <URI3, hasConditionName, “Psoriasis”>
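The John/Psoriasis example can be sketched as a tiny in-memory triple store. This is only an illustration, not the platform's actual RDF storage: the `Triple` type and the `match` and `condition_names` helpers are hypothetical names, and the edge directions (which URI is the subject of each predicate) are an assumption read off the slide diagram.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# URIs and literals as they appear on the slide.
URI1 = "http://www.patientlikeme.com/patient#1050"
URI2 = "http://www.patientlikeme.com/Members/232328/about_me"
URI3 = "http://www.patientlikeme.com/condition#154"

# Assumed edge directions: the patient node points to name, profile and condition.
triples = [
    Triple(URI1, "hasName", "John"),
    Triple(URI1, "hasProfile", URI2),
    Triple(URI1, "hasCondition", URI3),
    Triple(URI3, "hasConditionName", "Psoriasis"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


def condition_names(store: List[Triple], patient_name: str) -> List[str]:
    """Follow hasName -> hasCondition -> hasConditionName for one patient name."""
    names: List[str] = []
    for person in match(store, p="hasName", o=patient_name):
        for cond in match(store, s=person.subject, p="hasCondition"):
            names += [t.obj for t in match(store, s=cond.obj, p="hasConditionName")]
    return names


print(condition_names(triples, "John"))
```

Because every source is reduced to the same three-part shape, records from different sites could be merged by appending triples to one store and joining on shared URIs.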
Information Needs
User | Information Need | Examples
Patient | Pre-diagnosis | What are the symptoms for diabetes? What are the treatment options for high blood sugar?
Patient | Post-diagnosis | What are the new research findings about breast cancer? Are my symptoms indeed caused by the diagnosed condition?
Patient | Community Support | What patients or expert communities can provide support for a specific condition?
Clinician | Drug Choice | What are the drug options used by other patients to treat a specific condition?
Clinician | Drug Dosage | How many pills a day and how many times a day should the patients take a specific drug?
Clinician | Side Effect | What are the possible adverse effects of a specific drug, and how severe are they?
Organization | Disease Surveillance | Where are the current disease outbreaks? What is the trend of a specific condition?
What are the online profile, # of posts,
and # of replies for a specific condition?
SHR Analytics
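Data sources by information need (P marks a need the source covers; column positions of the marks were lost in extraction)
User groups: Patients, Clinicians, Government
Information needs: Support Community, Pre-diagnosis, Healthcare Providers, Post-diagnosis, Drug Choice, Drug Dosage, Adverse Effect, Disease Surveillance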
PatientsLikeMe P P P P
Twitter P
MedHelp P P P
WebMD P P P
Mayo Clinic P P
CDC P P
PubMed P
Open Health Data Sources
Data Source | Patient | Condition | Treatment | Symptom | Review | Community | Post | State Prevalence
PatientsLikeMe | 17,407 | 1,228 | 5,608 | 2,176 | n/a | n/a | n/a | n/a
MedHelp | n/a | n/a | n/a | n/a | n/a | 365 | 69,243 | n/a
WebMD | n/a | 647 | 180 | n/a | 86,715 | n/a | n/a | n/a
Mayo Clinic | n/a | 1,116 | 2,496 | 5,426 | n/a | n/a | n/a | n/a
CDC | n/a | n/a | n/a | n/a | n/a | n/a | n/a | 52
Social InfoButtons
• Use case: A doctor is devising the best practice for a PTSD (Post Traumatic
Stress Disorder) patient.
Statistical Analytic
Social InfoButtons (Cont.)
Twitter Tag Cloud
Individual Tweets
Geospatial Analytic
Social InfoButtons (Cont.)
Asthma Map and Gender distribution
Compare treatments for Fibromyalgia in Social InfoButtons
and Authoritative Sources
Treatment | Present in Social | Present in Authority
Duloxetine | Yes (1058) | Yes
Pregabalin | Yes (955) | Yes
Milnacipran | Yes (357) | Yes
Gabapentin | Yes (346) | Yes
Tramadol | Yes (201) | Yes
Cyclobenzaprine | Yes (188) | No
• Treatments of Major Depressive Disorder in Social Source completely overlap with
Authoritative Source (Authority)
Treatment in Social Source | # of Patients in Social Source | Appears in Authority
Individual Therapy | 185 | Yes
Bupropion | 174 | Yes
Venlafaxine | 160 | Yes
Duloxetine | 146 | Yes
Fluoxetine | 136 | Yes
Citalopram | 123 | Yes
Sertraline | 119 | Yes
Escitalopram | 79 | Yes
Desvenlafaxine | 30 | Yes
Mirtazapine | 26 | Yes
Electroconvulsive Therapy (ECT) | 24 | Yes
System Evaluation
• Symptoms of Major Depressive Disorder in Social Source partially overlap with Authoritative
Source (Authority)
Symptom in Social Source | # of Patients in Social Source | Appears in Authority
Problems concentrating | 8402 | Yes
Muscle tension | 7325 | No
Headaches | 7205 | Yes
Back pain | 6337 | Yes
Dizziness | 4900 | No
Stomach pain | 4898 | No
Lack of motivation | 4468 | No
Nausea | 4453 | No
Low self-esteem | 3847 | No
Inability to experience pleasure | 3062 | Yes
Hyperventilation | 2485 | No
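The "partial overlap" for Major Depressive Disorder symptoms can be quantified with a short sketch. The rows are copied from the symptom table above; the helper names `overlap_ratio` and `social_only` are hypothetical, and treating overlap as a simple fraction of listed items is an assumption, not necessarily the evaluation measure used in the study.

```python
# (symptom, patients in social source, appears in authoritative source)
mdd_symptoms = [
    ("Problems concentrating", 8402, True),
    ("Muscle tension", 7325, False),
    ("Headaches", 7205, True),
    ("Back pain", 6337, True),
    ("Dizziness", 4900, False),
    ("Stomach pain", 4898, False),
    ("Lack of motivation", 4468, False),
    ("Nausea", 4453, False),
    ("Low self-esteem", 3847, False),
    ("Inability to experience pleasure", 3062, True),
    ("Hyperventilation", 2485, False),
]


def overlap_ratio(rows):
    """Fraction of social-source items that also appear in the authority."""
    return sum(1 for _, _, in_authority in rows if in_authority) / len(rows)


def social_only(rows):
    """Items reported by patients but absent from the authoritative source."""
    return [name for name, _, in_authority in rows if not in_authority]


print(f"overlap: {overlap_ratio(mdd_symptoms):.0%}")
print("social-only symptoms:", social_only(mdd_symptoms))
```

Four of the eleven listed symptoms appear in the authoritative source, whereas all eleven listed treatments do; the social-only items are the candidates for the "second opinion" role described for Social InfoButtons.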
System Evaluation (Cont.)
• Complement medical knowledge: When SI’s social source information differs from the authoritative source, SI offers a second opinion to the human expert.
Added Value of Social InfoButtons (SI)
Condition | Symptom in Social Source | Symptom in Authoritative Source
Multiple Sclerosis | Stiffness/Spasticity | Numbness or weakness in limbs
Multiple Sclerosis | Brain fog | Optic neuritis
Multiple Sclerosis | Excessive daytime sleepiness | Double vision or blurring of vision
Multiple Sclerosis | Mood swings | Tingling or pain in parts of your body
Multiple Sclerosis | Bladder problems | Electric-shock sensations
Multiple Sclerosis | Emotional lability | Tremor, lack of coordination
Multiple Sclerosis | Sexual dysfunction | Slurred speech
Multiple Sclerosis | Bowel problems | Fatigue
Epilepsy | Memory problems | Temporary confusion
Epilepsy | Problems concentrating | A staring spell
Epilepsy | Excessive daytime sleepiness | Uncontrollable jerking movements of arms and legs
Epilepsy | Headaches | Loss of consciousness or awareness
Application in Clinical Environment
• InfoButtons Cimino et al. [8, 9]
– meet the clinician’s information needs in the context of patient care,
complement the EHR
• “Can drug x cause (adverse) finding y?”,
• “What are my patient’s data? ”,
• “How should I treat condition x (not limited to drug treatments)? ”,
• “What is the drug of choice for condition x? ”
– A point-of-care information retrieval application that automatically
generates and sends queries to digital libraries using patient data
extracted from the electronic medical record.
• simple links, concept-based links, simple search, concept-based search, intelligent
agents, and a calculator.
Source [7]
Social
InfoButton
Comorbidity Study with Social Health Records
• Comorbidity Prediction: Current appearance of some conditions indicates the
future occurrence of other conditions. (e.g. diabetes and foot sores)
• Comorbidity prediction benefits
– reduced mortality, shorter hospital stays, lower healthcare costs
• Examples:
– Diabetes
• Hypertension (high blood pressure)
• Dyslipidemia (Abnormal LDL, HDL, or triglycerides, increasing risk for heart attack)
• Nonalcoholic fatty liver disease (NAFLD)
• Cardiovascular disease
• Kidney disease
• Obesity
Research Issues
• How to predict medical condition incidence for an individual patient?
– e.g. John is diagnosed with condition X; what is the likelihood that he develops condition Y in the future?
• How to predict the medical condition progression trajectory for a population, which can provide insights for individual treatment planning?
– e.g. Tom is diagnosed with condition X; what is the confidence value of developing condition trajectory X->Y->Z in the future?
• What data is available for modeling comorbidities?
Patient’s Social Medical Profile
Comorbidity Trajectory Model
• In many situations it is more desirable to predict a medical condition progression
trajectory.
• A trajectory model is proposed to track the progression and infer the most probable
future trajectories.
• The model is constructed in three steps:
• Edge Discovery: Identifying directional edges of comorbidities, which co-occur for
individual patients.
• Linking: The generated edges are recursively linked to build the condition
trajectory tree T by recognizing the common node (condition) in two edges.
• Inference: The confidence value C of an edge trajectory (e1 -> e2 -> e3 -> … -> en) given an observed condition c is calculated as a conditional probability.
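The edge-discovery and linking steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: each patient is assumed to have a time-ordered list of conditions, an edge is assumed to be a pair of consecutively reported conditions, and the names `discover_edges`, `link_trajectories` and the sample `histories` are hypothetical.

```python
from collections import Counter


def discover_edges(histories):
    """Edge discovery: count directed edges (a, b) where b directly
    follows a in some patient's time-ordered condition list."""
    edges = Counter()
    for conditions in histories.values():
        for a, b in zip(conditions, conditions[1:]):
            edges[(a, b)] += 1
    return edges


def link_trajectories(edges, root, max_len=4):
    """Linking: recursively chain edges that share a common node,
    starting from the root condition; returns root-to-leaf paths."""
    paths = []

    def extend(path):
        # Candidate next conditions; skip ones already on the path to avoid cycles.
        nexts = sorted(b for (a, b) in edges if a == path[-1] and b not in path)
        if not nexts or len(path) == max_len:
            if len(path) > 1:
                paths.append(tuple(path))
            return
        for b in nexts:
            extend(path + [b])

    extend([root])
    return paths


# Hypothetical patient histories (condition labels only, ordered by report time).
histories = {
    "p1": ["C2", "C8", "C7"],
    "p2": ["C2", "C5"],
    "p3": ["C8", "C7"],
}

edges = discover_edges(histories)
print(dict(edges))
print(link_trajectories(edges, "C2"))
```

The `max_len` cap and the cycle check are design choices of this sketch that keep the recursion finite; the inference step then scores each returned path.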
• By setting the root condition to C2, the tree below was built by Algorithm 1 (numbers in parentheses are the trajectory support).
• The confidence value of trajectory T given root condition c is defined as a conditional probability:
C(T|c) = support(T)/support(c)
e.g., C(C2->C8->C7|C2) = 1/2 = 0.5
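As a rough sketch of the confidence formula, support can be counted directly from patient histories. The two `histories` below are hypothetical and chosen only so that the trajectory has support 1 and the root C2 has support 2, as in the slide's example; counting a trajectory only when it appears as a contiguous run is also an assumption of this sketch.

```python
def support(histories, path):
    """Number of patients whose ordered condition list contains
    `path` as a contiguous run."""
    path = tuple(path)
    n = len(path)
    return sum(
        any(tuple(h[i:i + n]) == path for i in range(len(h) - n + 1))
        for h in histories.values()
    )


def confidence(histories, path):
    """C(T|c) = support(T) / support(c), where c is the first node of T."""
    root_support = support(histories, path[:1])
    if root_support == 0:
        return 0.0  # root condition never observed
    return support(histories, path) / root_support


# Hypothetical histories: two patients with C2, one of whom follows C2 -> C8 -> C7.
histories = {
    "p1": ["C2", "C8", "C7"],
    "p2": ["C2", "C5"],
}

print(confidence(histories, ["C2", "C8", "C7"]))  # 1/2 = 0.5

# With the MDD supports reported on the results slide: 37/680.
print(round(37 / 680 * 100, 1))  # 5.4 (percent)
```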
Trajectory Model (Continued)
Progression Trajectory Analysis Results
• Confidence values:
• C(MDD->GAD->PD|MDD) = 37/680 = 5.4%;
• C(MDD->Dysthymia->PD|MDD) = 3.4%;
• C(MDD->PTSD->PD|MDD) = 3.2%;
• C(MDD->Social Anxiety Disorder->PD|MDD) = 2.5%.
• The likelihood of progressing through GAD is higher than through the other paths.
Progression Trajectory Analysis Results
MDD: Major Depressive Disorder
GAD: Generalized Anxiety Disorder
PD: Panic Disorder
PTSD: Post-Traumatic Stress Disorder
Evaluating Trajectory Model
• We selected three medical conditions with well-studied comorbidities*.
Condition | Comorbidity
Major Depressive Disorder (MDD) | Dysthymia, Panic Disorder, Agoraphobia, Social Anxiety, Obsessive–Compulsive Disorder, Generalized Anxiety Disorder, and Post-Traumatic Stress Disorder, Alcohol Dependence, Psychotic Disorder, Antisocial personality, Eating Disorders, Borderline Personality Disorder
Irritable Bowel Syndrome (IBS) | Major Depression, Anxiety, Somatoform Disorders, Fibromyalgia, Chronic Fatigue Syndrome, Gastroesophageal Reflux Disease, Restless Legs Syndrome
Eating Disorder (ED) | Obsessive–Compulsive Disorder, Bipolar Disorder, Substance Abuse (Drug Addiction/Alcohol Abuse), Diabetes, Bone Disease, Cardiac Complications, Gastrointestinal Distress
*http://www.huffingtonpost.com/kenneth-l-weiner-md-faed-ceds/eating-disorders_b_1761513.html
Evaluating Trajectory Model (Cont.)
• Trajectories starting from each condition, followed by (confidence in percent/support); * indicates that the comorbidity exists in the medical literature.
Major Depressive Disorder (MDD)
– MDD -> Post-Traumatic Stress Disorder (PTSD)* -> Panic Disorder* -> Social Anxiety Disorder* (1.3/9)
– MDD -> Panic Disorder* -> Social Anxiety Disorder* -> Phobic Disorder (1.1/8)
– MDD -> Generalized Anxiety Disorder (GAD)* -> Obsessive-Compulsive Disorder (OCD)* (3/23)
– MDD -> Panic Disorder* -> Obsessive-Compulsive Disorder* (2/19)
– MDD -> Bipolar II (4/21)
– MDD -> Borderline Personality Disorder* (3/21)
Irritable Bowel Syndrome (IBS)
– IBS -> Gastroesophageal Reflux Disease (GERD)* -> Restless Legs Syndrome* (3/6)
– IBS -> Fibromyalgia* -> Chronic Fatigue Syndrome (CFS)* (9/17)
– IBS -> Restless Legs Syndrome* (12/23)
– IBS -> Osteoarthritis (10/18)
Eating Disorder (ED)
– ED -> Tobacco Addiction -> Drug Addiction* -> Panic Disorder (4/5)
– ED -> Obsessive-Compulsive Disorder* -> Panic Disorder -> Social Anxiety Disorder (4/5)
– ED -> Bipolar II* -> Drug Addiction (5/6)
– ED -> Drug Addiction* -> Alcohol Addiction* (6/7)
– ED -> Postpartum Depression (13/15)
– ED -> Alcohol Addiction* (13/16)
Epidemics Monitoring and Detection
Epidemics are a major threat for humanity
• SARS (2003): killed 962
• Swine Flu (2009): killed 18,400
• Listeria (2011): killed 30
• 1918 flu pandemic (Spanish influenza, 1918-1920): killed 50-100 million
• Ebola (2014): 17,145 cases, killed 6,070
• Zika virus
Research Issues
• Epidemic monitoring and surveillance
– Watch rapid and timely data streams to discover trends and patterns in health events
• Public Concern monitoring
– Medical myths and misinformation are actively disseminated by self-interested propagandists.
– Social media “storms” can create shared public responses that may or may not be appropriate for the health event.
– Verifying shared health information, especially during fast-moving epidemics or heightened seasonal health concerns, is crucial to keeping the public accurately informed.
– Agencies must be able to respond publicly and promptly to the spread of misinformation and health-related rumors during public health events, as the 2014 Ebola crisis illustrated. Health agencies need plans in place ahead of time to counter misinformation and to support accurate information shared via social media.
Twitter Data Collection
• Migrated from PHP-based 140dev library to Java-based Twitter4J.
• Collected 11.7+ million tweets across 14 diseases/disasters in DB.
Dataset Id | Tweet Type | Total number of Tweets
1 | Listeria | 43,646
2 | Influenza | 2,231,442
3 | Swine Flu | 121,208
4 | Measles | 276,282
5 | Meningitis | 189,886
6 | Tuberculosis | 245,639
7 | Major Depression | 3,209,413
8 | Generalized Anxiety Disorder | 386,262
9 | Obsessive-compulsive Disorder | 571,867
10 | Bipolar Disorder | 181,942
11 | Air Disaster | 22,946
12 | Melanoma Experimental Drug | 145,357
13 | Natural Disaster | 1,746,899
14 | Ebola | 2,385,275
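As a quick consistency check on the "11.7+ million tweets" figure, the per-dataset counts from the table can be tallied. The dictionary name `tweet_counts` is hypothetical; the numbers are copied from the table.

```python
# Tweet counts per dataset, as listed in the table.
tweet_counts = {
    "Listeria": 43_646,
    "Influenza": 2_231_442,
    "Swine Flu": 121_208,
    "Measles": 276_282,
    "Meningitis": 189_886,
    "Tuberculosis": 245_639,
    "Major Depression": 3_209_413,
    "Generalized Anxiety Disorder": 386_262,
    "Obsessive-compulsive Disorder": 571_867,
    "Bipolar Disorder": 181_942,
    "Air Disaster": 22_946,
    "Melanoma Experimental Drug": 145_357,
    "Natural Disaster": 1_746_899,
    "Ebola": 2_385_275,
}

total = sum(tweet_counts.values())
largest = max(tweet_counts, key=tweet_counts.get)

print(f"{len(tweet_counts)} datasets, {total:,} tweets in total")
print(f"largest dataset: {largest}")
```

The 14 datasets sum to 11,758,064 tweets, which agrees with the stated total.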
Distribution maps of Listeria Tweets (Sep 26)
09-26-2011 absolute 09-26-2011 relative
09-27-2011 absolute 09-27-2011 relative
In September 2011, there was a sudden outbreak of Listeria in the US.
CDC (US Centers for Disease Control and Prevention) report, as of 11am EDT on September 29, 2011
[http://www.cdc.gov/listeria/outbreaks/cantaloupes-jensen-farms/093011/index.html, accessed on 4/1/2012]
• 84 persons were infected with Listeria, as reported by CDC.
• The states with the largest numbers of infected persons
were: Colorado (17), Texas (14), New Mexico (13),
Oklahoma (11), Nebraska (6), Kansas (5).
Comparing Social data with CDC data
In the six most affected states indicated by the CDC report (blue line), the EOSDS results correlated well with the CDC report in four states (circled in red).
Two states (circled in blue) show differences between the EOSDS results and the CDC report. What happened?