Machine Learning for Consumer Health, Clinical Decision Support, and Population Health

Carolina Health Informatics Program (CHIP), CHIP.UNC.EDU

Javed Mostafa, McColl Distinguished Term Professor (2017-2019)

Information Sci. & Biomedical Research Imaging Center, Biomedical Informatics Services,

NC Translational & Clinical Sciences Institute,

School of Medicine &

Director, CHIP.UNC.EDU

June 26th, 2018

University of North Carolina at Chapel Hill

Presentation Outline

• Why Machine Learning in Health Care?

• Three Areas in Health Care

– Public health surveillance

– Consumer health information delivery

– Diagnostics

• Longitudinal Tracking

• Clinical-Decision-Support in Imaging

A “Central” Tenet or Foundation in Health Care

• Care must be provided based on BEST EVIDENCE

• Often simply referred to as “Evidence-based Medicine”

EBM is Dependent on Data

• Best evidence demands current and accurate data

• And, availability of current and accurate data about the

patient …

Why Machine Learning I?

• Many challenges to achieve accuracy and timeliness …

– Biomedical and health evidence grows rapidly

– Health data are complex

– Leading to the 3-pronged problem:

• Volume … Velocity … Veracity …

• Applying ML in health care is a way to conduct high-precision, efficient data analytics

– Accurate

– Timely

Wide-Scale Adoption of Electronic Health

Records

• In developed countries with centralized health care, and in the USA (with much more fragmented care delivery), Electronic Health Record systems have been implemented on a wide scale

Electronic Health Record Adoption (USA)

Source: N Engl J Med 377;10, nejm.org, September 7, 2017

Growing Data: Other Key Types of Health Data

[Figure: components of a Complete Health Record, needed for “complete care”]

• Provider-focused Electronic Health Record

• Laboratory Results (Genomic and Imaging)

• Medical Devices

• Fitness Devices

• Managed Care-focused behavior data

• Medication and Pharmacy Data

• Environmental Data

From: Les Jordan (2015): “The problem with big data in translational medicine”, Applied and Translational Genomics

Why Machine Learning II? Growth in

Clinical Data…

• Kaiser Permanente, the California-based health network

which has more than 9 million members, is estimated to

have between 30-44 petabytes of patient data under

management

• Kaiser Permanente data come from electronic health records, including images and annotations. This is roughly the amount of information contained in 4,400 Libraries of Congress.

A Case: Public Health Analytics

Public Health Analytics: Surveillance

• A key challenge in public health is regular monitoring of community-wide health conditions

• Syndromic surveillance is a system used to detect and issue

early warning of disease outbreaks

Hospital ED data shared with BioSense

NC DETECT Data Elements

ED Data

• Patient and Visit IDs

• Date of Birth, Sex

• City, County, State, ZIP

• Hospital

• Arrival Date/Time

• Chief Complaint

• Initial Vital Signs

• Diagnosis, Injury and Procedure Codes (ICD-9-CM, CPT)

• Transport Mode to ED

• Insurance Coverage

• ED Disposition

• Triage Notes (not mandatory)

Poison Center Data

• Unique ID

• Patient demographics

• Clinical effects

• Scenarios, Therapies, Substances involved (if any)

EMS Data

• Unique ID

• Patient Demographics

• Dispatch complaint, chief complaint, primary symptoms

© 2008 University of North Carolina at Chapel Hill and NC Division of Public Health, NC DHHS

Disease Surveillance

• Syndromic surveillance process can be modelled as a

classification process

[Figure: ED Records, ML Classification, NC DETECT Syndromic Surveillance]

• False Positive: unnecessary work by Public Health Office

• False Negative: Public Health Office loses valuable time in handling outbreak
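Once surveillance is framed as classification, the two error types can be counted directly from labeled records. The following is a minimal sketch, assuming binary labels (1 = the ED record belongs to the syndrome, 0 = it does not); the function name and the eight example labels are hypothetical and only illustrate the bookkeeping.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = syndrome)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["tp"] += 1
        elif p == 1 and a == 0:
            counts["fp"] += 1  # false alarm: unnecessary work for the office
        elif p == 0 and a == 1:
            counts["fn"] += 1  # missed case: time lost in handling an outbreak
        else:
            counts["tn"] += 1
    return counts


# Hypothetical labels for eight ED records (illustration only).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```

In this framing, tuning a classifier is a trade-off between the two error counts: fewer false negatives usually means more false positives, and the reverse.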

What is the core data?

How to transform raw text to vectors?

• We can apply a so-called Vector-Space model

• In this modelling approach, a phrase, a few lines, or even a document with many lines can be transformed into a vector, i.e., a fixed-length linear array in which each element represents a keyword or term

Example of Vector-Space Matrix I

http://lsa.colorado.edu/

Keywords: Controlled Vocabulary

• From the previous example, they are: HUMAN, INTERFACE,

COMPUTER, USER, SYSTEM, RESPONSE, TIME, EPS,

SURVEY, TREES, GRAPH and MINORS

• Upon creation of the doc x term matrix, one can use the doc

vectors to match with “query” vectors
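As a rough illustration of how text becomes a row of the doc x term matrix, the sketch below counts occurrences of each controlled-vocabulary keyword. The vocabulary is the one listed above; the document titles, the query, and the helper name to_vector are hypothetical.

```python
import re

# Controlled vocabulary from the slide; list position fixes the vector index.
VOCABULARY = ["human", "interface", "computer", "user", "system", "response",
              "time", "eps", "survey", "trees", "graph", "minors"]


def to_vector(text, vocabulary=VOCABULARY):
    """Return a fixed-length term-count vector for the given text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tokens.count(term) for term in vocabulary]


# Hypothetical document titles, for illustration only.
docs = ["User survey of computer response time",
        "Graph minors and trees"]
doc_term = [to_vector(d) for d in docs]   # one row per document
query = to_vector("computer system")      # a query is indexed the same way

for row in doc_term:
    print(row)
print(query)
```

Words outside the vocabulary (such as "of" or "and") are simply ignored, which is what keeps every vector the same length.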

Another Example: Longer Documents

TI: The structure of negative emotions in a clinical sample of children and

adolescents

SO: Journal of Abnormal Psychology

PY: Feb98, Vol. 107 Issue 1, p74

IS: 12p

NT: 0021843X

AU: Chorpita, Bruce F.; Albano, Anne Marie; et al

AB: Presents a study which focuses on the factors associated with childhood

anxiety and depression with the use of a structural equations/confirmatory

factor-analytic approach. Reference to a sample of 216 children and

adolescents with diagnoses of an anxiety disorder or comorbid anxiety and

mood disorders; Suggestion of results; Discussion on the implications for

the assessment of childhood negative emotions.

CO: 276712

TI: Depression: A family affair

SO: Lancet

PY: 01/17/98, Vol. 351 Issue 9097, p158

IS: 1p

NT: 00995355

AU: Faraone, Stephen V.; Biederman, Joseph

AB: Considers the studies of major depression and anxiety disorders. The

findings with regard to depression being familial and having a genetic

component to its complex etiology; Discusses the continuity between child

and adult psychiatric disorders, psychiatric comorbidity and the

underidentification and treatment of juvenile depression.

CO: 116735

Transforming a document to a vector: the

process is called Indexing

• If we index using the two terms anxiety and depression, the

representations for the previous two documents would be:

T1 T2 T3 T4 T5

[0 0 0 1 1 ] = Document Vector

Assuming: 1) T4 = Anxiety and T5 = Depression

2) Terms T1, T2, & T3 are not present in the documents

3) Binary representation

Transforming a Syndrome to a Vector

• Before a similarity score (or a classification) can be generated, the syndrome query is converted to a vector

• Example: If the syndrome query term is “anxiety”, then the vector for this query would be:

T1 T2 T3 T4 T5

[0 0 0 1 0] = Query Vector

Vector Similarity: Inner Product & Cosine

• Inner product is simple:

• Cosine similarity:

Similarity (query, document) = Q x D vectors =

[0 0 0 1 0] = Query Vector X

[1 0 0 0 0] = Another document

-------------------------------------------------------------------

0+ 0+ 0+ 0+ 0 = 0
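The two measures can be sketched in a few lines. The inner product sums the element-wise products of the query and document vectors; cosine similarity divides that sum by the product of the two vector lengths. The vectors below are the ones from the slides (the anxiety query, the anxiety + depression document, and the “another document” vector); the function and variable names are illustrative.

```python
import math


def inner_product(q, d):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(qi * di for qi, di in zip(q, d))


def cosine(q, d):
    """Inner product normalized by the two vector lengths (0.0 if either is empty)."""
    norm = math.sqrt(inner_product(q, q)) * math.sqrt(inner_product(d, d))
    return inner_product(q, d) / norm if norm else 0.0


query = [0, 0, 0, 1, 0]   # T4 = anxiety
doc_a = [0, 0, 0, 1, 1]   # anxiety + depression
doc_b = [1, 0, 0, 0, 0]   # the "another document" vector

print(inner_product(query, doc_a), inner_product(query, doc_b))
print(cosine(query, doc_a), cosine(query, doc_b))
```

With binary vectors the inner product just counts shared terms, while the cosine also penalizes documents that contain many terms the query does not.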

The “Algorithm” for Syndrome Matching

How to find new terms? Unsupervised

learning or clustering

• The thesaurus (sometimes referred to as the dictionary) needs new terms added constantly, so we need a way to “discover” them

• New terms can be discovered using “unsupervised” learning, or clustering

Unsupervised Term Discovery I

• A flat clustering process known as Maxi-Min clustering is quite effective

• We start with a term-document matrix

• We apply Principal Component Analysis (PCA) to reduce “noise” and improve the prospect of identifying homogeneous clusters
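The slides do not spell out the Maxi-Min procedure, so the following is only a sketch of one common formulation (max-min distance clustering): the first point seeds a cluster, and the point farthest from its nearest center becomes a new center while that distance stays above a threshold. The threshold parameter and the 2-D points (standing in for vectors after a PCA projection) are hypothetical.

```python
import math


def maximin_cluster(points, threshold):
    """Max-min distance clustering, one common formulation.

    The first point seeds the first cluster. Repeatedly, the point farthest
    from its nearest center becomes a new center, as long as that distance
    exceeds `threshold` (a hypothetical tuning parameter, assumed >= 0).
    """
    centers = [points[0]]
    while True:
        # Distance from each point to its nearest existing center.
        nearest = [min(math.dist(p, c) for c in centers) for p in points]
        far_idx = max(range(len(points)), key=nearest.__getitem__)
        if nearest[far_idx] <= threshold:
            break
        centers.append(points[far_idx])
    # Assign every point to its closest center.
    labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
              for p in points]
    return centers, labels


# Hypothetical 2-D coordinates, e.g. items after a 2-dimensional PCA projection.
points = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (0.1, 0.3)]
centers, labels = maximin_cluster(points, threshold=2.0)
print(centers, labels)
```

Unlike k-means, this approach does not need the number of clusters in advance; the threshold decides how many centers emerge.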

Apply PCA (or LSA) II


Conduct A 2-Dimensional Reconstruction

after PCA III


Perform Matrix Transformation Doc-Term

to Doc-Doc: Toward Clustering IV

Inner-product Similarity

      Doc1  Doc2  Doc3  Doc4  Doc5  Doc6  Doc7  Doc8  Doc9
Doc1     -     1     1     1     0     0     0     0     0
Doc2           -     2     2     3     0     0     0     1
Doc3                 -     3     1     0     0     0     0
Doc4                       -     0     0     0     0     0
Doc5                             -     0     0     0     0
Doc6                                   -     1     1     0
Doc7                                         -     2     1
Doc8                                               -     2
Doc9                                                     -
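The doc-term to doc-doc transformation amounts to taking the inner product of every pair of document rows. The sketch below assumes a doc-term count matrix that is not given in the text: the counts are taken from the widely used introductory LSA example built on the same 12 keywords, and they are an assumption here. With those counts, the computed values agree with the upper-triangle entries listed above.

```python
TERMS = ["human", "interface", "computer", "user", "system", "response",
         "time", "eps", "survey", "trees", "graph", "minors"]

# Assumed doc-term counts (rows = Doc1..Doc9, columns = TERMS); not from the slides.
DOC_TERM = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # Doc1: human, interface, computer
    [0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0],  # Doc2: computer, user, system, response, time, survey
    [0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0],  # Doc3: interface, user, system, eps
    [1, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0],  # Doc4: human, system (twice), eps
    [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0],  # Doc5: user, response, time
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],  # Doc6: trees
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],  # Doc7: trees, graph
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # Doc8: trees, graph, minors
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1],  # Doc9: survey, graph, minors
]


def doc_doc_similarity(doc_term):
    """Return the symmetric matrix of inner products between document rows."""
    n = len(doc_term)
    return [[sum(a * b for a, b in zip(doc_term[i], doc_term[j]))
             for j in range(n)]
            for i in range(n)]


sim = doc_doc_similarity(DOC_TERM)
for i, row in enumerate(sim):
    # Print only the entries above the diagonal, as in the table.
    print(f"Doc{i + 1}", row[i + 1:])
```

The resulting matrix is symmetric, which is why only the upper triangle needs to be shown; it is the input to the clustering step.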

Unsupervised Clustering Result

Automatically Discovered Terms

MeSH Classes

Cell Adhesion

Cell Communication

Cell Death

Cell Movement

Cell Survival

Endocytosis

Antibody Formation

Autoimmunity

Immunocompromised Host

Cytotoxicity Immunologic

Immune Tolerance

Immunity Cellular

Regeneration

Evolution

Complement Activation

Automatically Produced Classes

Cell, Binding

Cell, Adhesion, Growth, Antigen

Communication, Death

Apoptosis

Migration

Production, Motility

Tolerance

Virus

Endocytosis, Receptor

Antibody, Serum

Autoimmune

Tumor

Immunocompromised, Infected

Cytotoxic

Immune, Cell, Response, Gene, Class

Regeneration

Evolution, DNA

Complement, Activation, Plasma, Membrane

Transplant

Muscle

Expression

Supervised Learning: Inference Network for

Genes to Diseases Association

❖ Seki, K., & Mostafa, J... Discovering implicit associations between genes and hereditary diseases. In Proceedings of the Pacific Symposium on Biocomputing …

Now turning to …

ML applications in Consumer Health Information Delivery

There is a Wide Demand to Learn about and

Search for Health Information: Pew Research

• 87% of U.S. adults use the web (current until 2016, Pew Internet

Survey).

• 72% of online users say they looked online for health information

• Health information seeking is often done as a visit-preparation and/or post-visit activity

• http://www.pewinternet.org/fact-sheets/health-fact-sheet/

Many Barriers to Building a Consumer

Oriented Health Portal or Online Service

• Trust

• Timeliness and accuracy

• Ease of use by people who are not specialists

• Fast growth in health and biomedical information …

Growth in Biomedical Information I

[Figure: panels labeled 2015 and 2018]

Growth in Biomedical Information II

Growth in Biomedical Information III

Literature Mining for Biologists: Jensen et al. (2006), Nature Reviews Genetics. doi:10.1038/nrg1768

Multilevel Information Service Model to

Cope with Data Volume and Complexity

[Figure: multilevel information service model. Components: DAM (Representation, Classification, Thesaurus Management, Classifier Management) and PAM (Profile Management, Shift Detection); function levels F1 and F2]

Modelling the IS Process

• IS is a function:

– D -> R

– But, mapping content to relevance values directly and in real time is a highly computationally intensive task

– Hence, we propose a decomposition:

– F1: D -> C & F2: C -> R
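The decomposition can be pictured as two small functions composed together. This is a minimal sketch under stated assumptions: F1 is shown as a nearest-centroid match by inner product and F2 as a lookup in a per-class user profile. The centroids, the profile values, and all function names are hypothetical and are not the classifiers or profile used in the actual system.

```python
def f1_classify(doc_vector, class_centroids):
    """F1: D -> C. Map a document vector to the best-matching class label."""
    def score(centroid):
        return sum(d * c for d, c in zip(doc_vector, centroid))
    return max(class_centroids, key=lambda label: score(class_centroids[label]))


def f2_relevance(label, profile):
    """F2: C -> R. Look up the user's estimated relevance for a class."""
    return profile.get(label, 0.0)


def information_service(doc_vector, class_centroids, profile):
    """IS: D -> R, computed as F2(F1(d))."""
    return f2_relevance(f1_classify(doc_vector, class_centroids), profile)


# Hypothetical class centroids and user profile (illustration only).
centroids = {"anxiety": [1, 0, 0], "diet": [0, 1, 1]}
profile = {"anxiety": 0.9, "diet": 0.2}
relevance = information_service([0, 2, 1], centroids, profile)
print(relevance)
```

The practical benefit is that the user model only has to track one value per class rather than one per document, so it can be updated cheaply as new documents arrive.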

Function Level 1: ML for Classification

[Figure: DAM components: Representation, Classification, Thesaurus Management, Classifier Management; function levels F1 and F2]

Automatically Discovered Terms

F1: Using Terms in Document Matching

(Retrieval)

MeSH = solid line; Auto = dotted line

Supervised Learning using Neural

Networks: Classification

• Trained a three-layer neural net to classify documents into the 15 classes

– Used 4000 training, 2000 tuning and 1500 evaluation documents

• Both classifiers used the same representation but differed in method: one did a direct similarity match to MeSH using the embedded class label, and the other applied the neural net to the document abstract + title without any class label

❖ Mostafa, J., & Lam, W…Automatic classification using supervised learning in a medical document filtering application. Information Processing & Management, 36(3), 415-444.
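To make the classifier shape concrete, the sketch below runs a forward pass through a three-layer network (input, one hidden layer, output) that assigns a document vector to one of 15 classes. Only the number of classes comes from the slide; the layer sizes, sigmoid and softmax activations, random weights, and the example document are all assumptions, and training on the 4000/2000/1500 split is not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights plus zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, hidden, output):
    """Input -> sigmoid hidden layer -> softmax output layer."""
    (w_h, b_h), (w_o, b_o) = hidden, output
    h = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
         for row, b in zip(w_h, b_h)]
    z = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w_o, b_o)]
    z_max = max(z)  # subtract the max for numerical stability
    exp_z = [math.exp(v - z_max) for v in z]
    total = sum(exp_z)
    return [v / total for v in exp_z]


rng = random.Random(0)
N_TERMS, N_HIDDEN, N_CLASSES = 50, 20, 15  # only the 15 classes come from the slide
hidden = init_layer(N_TERMS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_CLASSES, rng)

doc_vector = [rng.randint(0, 1) for _ in range(N_TERMS)]  # hypothetical document
probs = forward(doc_vector, hidden, output)
predicted_class = max(range(N_CLASSES), key=probs.__getitem__)
print(predicted_class, round(sum(probs), 6))
```

Because the weights are untrained, the predicted class here is arbitrary; the point is only the data flow from a term vector to a probability over the 15 classes.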

Results: Supervised Classifier

• Classification results improved: the average error rate per class was 3.89% (standard deviation 2.53%)

Function Level 2: User Profile (user

modeling)

[Figure: multilevel information service model. Components: DAM (Representation, Classification, Thesaurus Management, Classifier Management) and PAM (Profile Management, Shift Detection); function levels F1 and F2]

Reinforcement Learning (semi-supervised): User Modeling

[Figure: user model for reinforcement learning. Elements: Documents; Categories c1 … cn; vectors u1 … un and t1 … tn; User profile/model with “Top class” and “Relevance of categories”. Annotations: “Probability that category 2 is the top-most relevant category”; “Probability that category 1 is relevant to the user”]
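A user model of this kind can be approximated by keeping one relevance estimate per category and nudging it toward each piece of feedback. The sketch below uses a simple linear reward update as a stand-in; it is not the learning rule used in SIFTER or MedSIFTER, which the slides do not spell out. The learning rate, the starting values, and the feedback sequence are hypothetical.

```python
def update_profile(profile, category, feedback, rate=0.2):
    """Move one category's relevance estimate toward the user's feedback.

    feedback is 1.0 (relevant) or 0.0 (not relevant); rate is a hypothetical
    learning rate. Returns a new profile and leaves the input unchanged.
    """
    updated = dict(profile)
    updated[category] = profile[category] + rate * (feedback - profile[category])
    return updated


def top_category(profile):
    """Return the category with the highest current relevance estimate."""
    return max(profile, key=profile.get)


# Hypothetical starting profile and feedback sequence (illustration only).
profile = {"anxiety": 0.5, "diet": 0.5, "exercise": 0.5}
for category, feedback in [("diet", 1.0), ("anxiety", 0.0), ("diet", 1.0)]:
    profile = update_profile(profile, category, feedback)

print(profile, top_category(profile))
```

A larger rate makes the profile react faster to a change in interest but also makes it noisier, which is the trade-off the interest-change experiments later in the deck examine.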

Personalized Health Info Delivery:

MedSIFTER

MedSIFTER: User Rating/Feedback

MedSIFTER Experimental Evaluation

• Explicit = user provided the profile in the first session

• Implicit = ongoing feedback to content

• Combined = both explicit (provided in the initial session) and ongoing feedback

• ~20 subjects; 15 sessions; videotaped interaction and

interface; think-aloud protocol

MedSIFTER: Evaluation Results

Robustness of RL User Modeling:

SimSIFTER

• Type of interest may impact the rating (degree and frequency)

• Rating may impact how quickly the system can “learn” or generate an accurate profile

• Accuracy of profile determines accuracy of prediction of relevance

• SimSIFTER used about 1.4K consumer health documents and 15 categories of health information (anxiety, allergy, heart, cholesterol, depression, diet, environment, exercise, eye, headache, lung, medicine, teeth, men-health, and women-health)

Reinforcement Learning of User Interest:

Simulating User Feedback

Reinforcement Learner (RL) Dealing with

Interest Types

[Figure: “Different Interest Types”. Y-axis: Normalized Precision (0 to 1); x-axis: Sessions (1 to 41); series: Concrete, Middle, Mildlow, Nolearning]

RL dealing with Interest Change

[Figure: “Incremental Interest Change”. Y-axis: Normalized Precision (0 to 1); x-axis: Sessions (1 to 41); series: low-to-hi, hi-to-low, hybridchange]

[Figure: “Abrupt Interest Change”. Y-axis ticks: 0 to 0.9; x-axis: Sessions (1 to 41); series: suddendev, suddendevloss, suddendevlossdev]

MedSIFTER Advantages

• Can be delivered over existing infrastructure

• User or patient specific

– May use EHR to update the profile

• Consistent, Authoritative content

• Low maintenance demands

• Rating can be “pooled” to determine community-level quality of health information

Diagnostics: ML in Point-of-Care

ML Can be Applied at nearly All Critical

Points

Patient Monitoring

Patient Portal

ML for Longitudinal Tracking of Complex

Health Conditions

• A case study: Depression

• Globally, more than 300 million people of all ages suffer

from depression. Depression is the leading cause of

disability worldwide, and is a major contributor to the

overall global burden of disease. ...

• World Health Organization: http://www.who.int/news-room/fact-sheets/detail/depression

MindsEye: Psychiatric/Clinical Depression

Who is my patient? How is s/he doing? What to recommend?

ML in Image Diagnostics: Alzheimer’s

Disease

• About 5 million Americans affected, at a cost of about 250 billion dollars / year to manage

• By 2050 … the number is likely to rise to 16 million Americans

• At a cost of 1 trillion dollars / year to manage …

• www.alz.org/facts

• One of the greatest faults with current approaches is that they start too late … it is akin to giving someone a Lipitor when they have a heart attack – [Dr. Tanzi: http://www.cnn.com/2017/11/13/health/bill-gates-announcement-alzheimers/index.html]

ViewFinder: Online Diagnosis of

Alzheimer Stages & Severity

Reinforcement Learning in VfM

• VfM assumes binary relevance (each scan is marked either yes or no)

• Session: Number of steps needed to satisfy the user needs for a given query

• Feedback used at 2 levels:

– Inter-session

– Intra-session

Opportunities: RL is an important

Machine Learning Method

In information-filtering environments, uncertainties associated with changing interests of the user and the dynamic document stream must be handled efficiently. In this article, a filtering model is proposed that decomposes the overall task into subsystem functionalities and highlights the need for multiple adaptation techniques to cope with uncertainties. A filtering system, SIFTER, has been implemented based on the model, using established techniques in information retrieval and artificial intelligence. These techniques include document representation by a vector-space model, document classification by unsupervised learning, and user modeling by reinforcement learning. The system can filter information based on content and a user's specific interests. The user's interests are automatically learned with only limited user intervention in the form of optional relevance feedback for documents. We also describe experimental studies conducted with SIFTER to filter computer and information science documents collected from the Internet and commercial database services. The experimental results demonstrate that the system performs very well in filtering documents in a realistic problem setting.

Conclusion: Major Challenges

• Aggregating data across multiple organizations and entities

– Efficiently collecting, “normalizing”, and integrating for secondary use

• Secondary analysis and use

– Data need to be linked across individuals (longitudinal) and populations (cohort)

– Data need to be manipulated to derive value

• “Human and Organizational Challenges”

– Different policies and rules

– Sharing IP

– Resourcing innovation and growth

Questions?

• Javed: [email protected]

• Useful links:

• CHIP: http://chip.unc.edu

• TraCS: http://tracs.unc.edu

Patient-Generated Health Data I

The SmartPill Capsule collects pressure, pH

and temperature data from your GI tract

and wirelessly transmits that information to

a data receiver worn on a belt

This data is then downloaded to a

computer, allowing your physician to

analyze the information.

http://www.tummydoctor.org/video-capsule.php

Patient Generated Health Data II

One ink changes from green to brown as glucose concentration increases. The team has also developed a green ink, viewable under blue light, that grows more intense as sodium concentration rises, an indication of dehydration.

…has already developed an app that can analyze a picture of a sensor and provide quantitative diagnostic results. While patients are an obvious potential market…