Jonathan Weiner ([email protected]) Hadi Kharrazi ([email protected]) Johns Hopkins University Bloomberg School of Public Health Department of Health Policy and Management Population Health Research with Big Data: Updates and Opportunities for Collaboration Center for Population Health IT (CPHIT) CHSOR Feb 2019


Page 1: Population Health Research with Big Data: Updates and

Jonathan Weiner ([email protected])

Hadi Kharrazi ([email protected])

Johns Hopkins University

Bloomberg School of Public Health

Department of Health Policy and Management

Population Health Research with Big Data: Updates and Opportunities for Collaboration

Center for Population Health IT (CPHIT)

CHSOR

Feb 2019

Page 2: Population Health Research with Big Data: Updates and

© CPHIT @ JHSPH-HPM

CHSOR meeting

2

Overview of Today’s Talk

1) Population Health Informatics

• Emerging Field
• Data Sources & Types
• Scope of CPHIT’s Work

2) CPHIT Portfolio

• Claims-based JHU-ACG
• EHR-based Prediction (eACG)
o EHR vs. Claims (dem., Dx, Rx)
o EHR Vital Signs (BMI/BP)
o EHR Prescription (adherence)
o EHR Labs (common labs)
o EHR Free-text (geriatric frailty)
• Geographic Factors (elderly falls & VA)
• Opioid Overdose Predictive Models
• Social Determinants of Health

3) Discussion

• Challenges & Opportunities
• Collaboration Opportunities

Page 3: Population Health Research with Big Data: Updates and


Population Health Informatics

Page 4: Population Health Research with Big Data: Updates and


Population Health Informatics Emerging Field

Triple Aims developed by the Institute for Healthcare Improvement (IHI)

Better Health for the Population

Better Care for the Individuals

Lower Cost Through Improvements

Copyright @ kharrazi@jhu.edu

Page 5: Population Health Research with Big Data: Updates and


Population Health Informatics Emerging Field

Biomedical informatics as a basic science

Biomedical informatics methods, techniques, and theories

Basic Research; Applied Research

Molecular Research; Health Research

• Bioinformatics
• Imaging Informatics
• Clinical Informatics
• Public Health Informatics
• Consumer Health Informatics
• Population HIT


Page 6: Population Health Research with Big Data: Updates and


Population Health Informatics Data Sources

Levels: Community / Population; IDS / ACO / Virtual Net; Practice Team; Family and Caregivers; Physician; Patient

Data sources and tools: Claims; MIS; HIS; CPOE; CDSS; EHR; PHR; mHealth apps; Biomet.; Tele-H.; National Datasets; HIE; Social Network; Social HR data; GIS; Public Health Systems; Web Portals; email and others

Source: Weiner, 2012 http://www.ijhpr.org/content/1/1/33


Page 7: Population Health Research with Big Data: Updates and


Population Health Informatics Data Analytic Cycle

Overall Population Health Knowledge Management Process (Research and Operations)

• Generate & Integrate New Data from Knowledge
• Population Health Database Development (Population Health Data Warehouse)
• Data Preparation & Data Quality Checks (Quality; Missing; Transf.)
• Extracting Knowledge by Modeling and Data Mining
o Base (Year-0) predictors x1, x2, …, xn: Demographics; Diagnosis; Medications; Cost; etc.
o Outcome (Year-1) y (binary, cont.): Cost; Mort.; ER-admit; Hospitalization; Readmit
o y = β0 + β1x1 + … + βnxn
• Creating Generalizable Knowledge by Model Validation and Evaluation (Validity; Reliability; Goodness of Fit; Consistency; Parsimony; Reproducibility)
• Store, Share and Use the Knowledge
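
The base-year to outcome-year setup on this slide (predict a Year-1 outcome y from Year-0 predictors x) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a single hypothetical predictor (base-year cost), made-up numbers, and function names that do not come from the talk. A real model would use many predictors and a statistics package.

```python
from typing import List, Tuple


def fit_simple_ols(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Fit y = b0 + b1 * x by ordinary least squares (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict(b0: float, b1: float, x_new: float) -> float:
    """Apply the fitted line to a new base-year value."""
    return b0 + b1 * x_new


# Hypothetical data: Year-0 cost (predictor) and Year-1 cost (outcome).
year0_cost = [1000.0, 2000.0, 3000.0, 4000.0]
year1_cost = [1500.0, 2500.0, 3500.0, 4500.0]

b0, b1 = fit_simple_ols(year0_cost, year1_cost)
predicted = predict(b0, b1, 5000.0)
```

For a binary outcome (for example, hospitalization yes/no) the same linear predictor would normally be passed through a logistic link rather than used directly.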


Page 8: Population Health Research with Big Data: Updates and


The Johns Hopkins Center for Population Health Information Technology (CPHIT, or “see-fit”)

The mission of this innovative, multi-disciplinary R&D center is to improve the health and well-being of populations by advancing the state-of-the-art of Health IT across public and private health organizations.

CPHIT focuses on the application of electronic health records (EHRs), mobile health and other e-health and HIT tools targeted at communities and populations.

Director: Dr. Weiner

Research Director: Dr. Kharrazi

10+ Core Colleagues, Additional 15+ Collaborating Colleagues

www.jhsph.edu/cphit

Page 9: Population Health Research with Big Data: Updates and


CPHIT Research Portfolio

Page 10: Population Health Research with Big Data: Updates and


CPHIT Portfolio (cont.)

Research Portfolio (selected list)

• Claims-based JHU-ACG

• EHR-based Prediction (eACG)

o EHR vs. Claims (dem., Dx, Rx)

o EHR Vital Signs (BMI/BP)

o EHR Prescription (adherence)

o EHR Labs (common labs)

o EHR Free-text (geriatric frailty)

• Geographic Factors (elderly falls & VA)

• Opioid Overdose Predictive Models

• Social Determinants of Health

Page 11: Population Health Research with Big Data: Updates and


CPHIT Portfolio Claims-based Risk Stratification (ACG)


Page 12: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR vs. Claims

Data Source(a)

Characteristic | Claims | EHR(b)
Purpose | Reimbursement | Clinical care
Scope | All providers, including out of network providers, for a given patient | Network providers of a patient
Data consistency | High consistency across sources | Lower consistency across sources
Data structure | Most data is structured | Considerable unstructured data
Coding standard | Strict adherence to coding systems | Variable adherence to coding systems
Provider coverage | All providers accepting the insurance | Limited to providers using same EHR
Coding limit | Limited to encoded data | Provides ability to enter free text
Member limitation | Limited to insured patients | Insured and uninsured patients
Coverage limitation | Non-covered items are missing | Includes data on non-covered items
Data type | Limited (mainly enrollment, Dx, Rx) | Additional data types (see below)

Data Availability | Claims | EHR(b)
Demographics(a) | Yes | Yes
Race/ethnicity | Limited | Limited
Diagnosis(a) | Yes | Yes
Procedures | Yes | Yes
Eligibility | Yes | Limited
Medications(a) | Pharmacy data (drugs dispensed) | Prescriptions ordered & MedRec data
Socioeconomic data | Zip-code derived | Coded and zip-code derived
Family history | Not available | Yes
Problem list | Not available | Yes
Procedure results | Not available | Yes
Laboratory results | Not available | Yes
Vital signs | Not available | Yes
Behavioral risk factors | Not available | Limited
Standardized surveys | Limited | Limited


Page 13: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR vs. Claims (cont.)

Comparing Claims and EHR for Risk Stratification


Page 14: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR vs. Claims (cont.)

Comparing Various Overlaps of Claims and EHR for Risk Stratification


Page 15: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR vs. Claims (cont.)

Comparing Diagnostic Data Found in Claims (C) vs EHRs (E)

[Figure: stacked bar chart, 0% to 100%, one bar per ICD code. Title: Top Dx (ICDs) with at least 1000 instances - distribution of C, CE, and E source of data. Legend: E, CE, C. Annotations, in order: more records in Claims than other sources of Dx; more records in Claims + EHRs than other sources of Dx; records in EHRs only. The individual ICD code labels on the axis are not recoverable from the transcript.]
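
A distribution like the one charted here can be computed with simple set logic. The sketch below is an assumption-laden illustration, not the study's method: it assumes the unit of analysis is a (patient, ICD code) pair, that "C" means claims only, "E" means EHR only, and "CE" means present in both, and it uses made-up patient IDs and codes.

```python
from typing import Dict, Iterable, Set, Tuple

Record = Tuple[str, str]  # (patient_id, icd_code), hypothetical layout


def source_distribution(
    claims: Iterable[Record], ehr: Iterable[Record]
) -> Dict[str, Dict[str, float]]:
    """Return, per ICD code, the percentage of records found in C, CE, and E."""
    claims_set: Set[Record] = set(claims)
    ehr_set: Set[Record] = set(ehr)
    counts: Dict[str, Dict[str, int]] = {}
    for rec in claims_set | ehr_set:
        code = rec[1]
        bucket = counts.setdefault(code, {"C": 0, "CE": 0, "E": 0})
        if rec in claims_set and rec in ehr_set:
            bucket["CE"] += 1
        elif rec in claims_set:
            bucket["C"] += 1
        else:
            bucket["E"] += 1
    shares: Dict[str, Dict[str, float]] = {}
    for code, bucket in counts.items():
        total = sum(bucket.values())
        shares[code] = {k: 100.0 * v / total for k, v in bucket.items()}
    return shares


# Made-up example records.
claims = [("p1", "250.00"), ("p2", "250.00"), ("p3", "401.9")]
ehr = [("p2", "250.00"), ("p4", "250.00"), ("p3", "401.9")]
shares = source_distribution(claims, ehr)
```

Sorting the codes by their "C" share would reproduce the left-to-right ordering suggested by the chart annotations.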


Page 16: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR vs. Claims (cont.)

Cases found in EHR versus Claims: Diabetes, Hypertension, Depression, Cancer

Page 17: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR vs. Claims (cont.)

Model performance using EHR versus Claims

Page 18: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Vital Signs (BMI)

Value of BMI in Predicting Utilization


Page 19: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Vital Signs (BMI) (cont.)

Value of BMI in Predicting Utilization

Page 20: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Prescription

Value of EHR’s Prescription vs. Claims Filling in Predicting Utilization


Page 21: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Prescription (cont.)

Value of EHR’s Prescription vs. Claims Filling in Predicting Utilization

Page 22: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Labs

Value of EHR’s Common Lab Results in Predicting Utilization


Page 23: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Labs (cont.)

Value of EHR’s Common Lab Results in Predicting Utilization

Page 24: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Free-text (Geriatric Frailty)

Value of EHR’s Free-text in Identifying Frailty and Predicting Utilization


Page 25: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Free-text (Geriatric Frailty) (cont.)

Claims

EHR Structured

EHR Free Text (NLP)


Page 26: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Free-text (Geriatric Frailty) (cont.)

Added value of free text represented by the Venn diagram

Circle sizes represent the number of patients identified by each methodology/data-source

Green: EHR Free Text; Blue: EHR Structured; Red: Insurance Claims


Page 27: Population Health Research with Big Data: Updates and


CPHIT Portfolio EHR Free-text (Geriatric Frailty) (cont.)

Value of EHR’s Common Lab Results in Predicting Utilization

Page 28: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (Elderly Falls)

Prevalence of falls among elderly in Baltimore City (Census Block Group)


Page 29: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (Elderly Falls) (cont.)

Prevalence of falls among elderly in Maryland (Census Block Group)


Page 30: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (Elderly Falls) (cont.)

Predictors and coefficients of the elderly-fall model

Predictors Estimate Std. error z value Pr(>|z|) Significance OR 95% CI lower (2.5%) 95% CI upper (97.5%)

History of fall 1.795 0.074 24.113 <2e-16 *** 6.02 5.20 6.97

Fracture 0.604 0.104 5.821 5.85E-09 *** 1.83 1.49 2.24

Substance Abuse 0.520 0.082 6.364 1.96E-10 *** 1.68 1.43 1.97

Parkinson 0.337 0.178 1.895 0.058056 . 1.40 0.98 1.97

Kyphoscoliosis 0.322 0.153 2.102 0.035519 * 1.38 1.01 1.85

Sex (female) 0.173 0.046 3.736 0.000187 *** 1.19 1.09 1.30

Depression 0.146 0.068 2.141 0.032238 * 1.16 1.01 1.32

Mental Illness 0.128 0.065 1.980 0.047652 * 1.14 1.00 1.29

Age 0.038 0.003 14.895 <2e-16 *** 1.04 1.03 1.04

Charlson Index -0.053 0.009 -5.711 1.12E-08 *** 0.95 0.93 0.97

Vision -0.211 0.057 -3.689 0.000225 *** 0.81 0.72 0.91

Obesity -0.251 0.076 -3.311 0.000931 *** 0.78 0.67 0.90

Cardiovascular Disease -0.313 0.050 -6.301 2.95E-10 *** 0.73 0.66 0.81

Lower Urinary Tract Symptoms -0.345 0.074 -4.656 3.23E-06 *** 0.71 0.61 0.82

Hypertension -0.357 0.050 -7.080 1.44E-12 *** 0.70 0.63 0.77

Cancer -0.441 0.081 -5.418 6.02E-08 *** 0.64 0.55 0.75

Lower Back Pain -0.495 0.067 -7.368 1.73E-13 *** 0.61 0.53 0.69

Joint Trauma -0.526 0.197 -2.674 0.007487 ** 0.59 0.39 0.85

Lower Extremity Joint Surgery -1.069 0.182 -5.870 4.36E-09 *** 0.34 0.24 0.48

(Intercept) -4.372 0.197 -22.249 <2e-16 *** 0.01 0.01 0.02

Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
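
The OR column in the table above is the exponential of the Estimate column, which is the standard relationship for a logistic regression. The sketch below shows that conversion together with a Wald-type 95% interval. The function name is illustrative, and the Wald interval is an assumption: the bounds printed on the slide may come from a different method (for example, profile likelihood), so small differences in the last digit are expected.

```python
import math
from typing import Tuple


def odds_ratio_with_wald_ci(
    estimate: float, std_error: float, z: float = 1.96
) -> Tuple[float, float, float]:
    """Convert a log-odds coefficient into an odds ratio and a Wald 95% CI."""
    odds_ratio = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return odds_ratio, lower, upper


# "History of fall" row from the table: Estimate 1.795, Std. error 0.074.
or_fall, lo_fall, hi_fall = odds_ratio_with_wald_ci(1.795, 0.074)

# "Charlson Index" row: a negative coefficient gives an OR below 1.
or_charlson, _, _ = odds_ratio_with_wald_ci(-0.053, 0.009)
```

Rounded to two decimals, the first call gives an OR of 6.02 and the second 0.95, matching the table's OR column.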

Page 31: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (VHA Obesity)

Geographic distribution of obesity among the VHA population (limited to 29,322 visits that occurred on one day in 2013; generated using CDW data)


Page 32: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (VHA Obesity) (cont.)

County BMI (using MLM adjustment) for Males 2000-2015


Page 33: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (VHA Obesity) (cont.)

County BMI (using MLM adjustment) for Males 2000-2015


Page 34: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (VHA Obesity) (cont.)

County BMI (using MLM adjustment) for Males 2015 (DC and Baltimore)


Page 35: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (VHA Obesity) (cont.)

Spatial Intensity Male 2015 (map scale in kilometers)


Page 36: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (VHA Obesity) (cont.)

Interactive Web-based Real-time Geo-Temporal Exploration of Obesity Data

(Showing averages of 2014 for MD)

Name | Owner
VHA Corporate Data Warehouse | VHA
American Community Survey | Census
Census 2010 | Census
National Health and Nutrition Examination Survey | CDC
Food Access Research Atlas + Others | USDA
National Vital Statistics Report | CDC
Reference USA | RefUSA
Open Street Map | OpenMap
Moderate Resolution Imaging Spectroradiometer | NASA
Consumer Expenditure Survey | BLS
Uniform Crime Reporting Statistics (FBI) | FBI
Maryland Food Systems | MD
USDA Detailed Maps Baltimore | USDA
ArcGIS Internal Datasets | ESRI
Satellite data | Google

Page 37: Population Health Research with Big Data: Updates and


CPHIT Portfolio Geographic Factors (VHA Obesity) (cont.)

Mid-Atlantic VISN – Multivariate GEE Analysis

Higher income, being married, and a higher SES quartile remained associated with increased odds of obesity, but Gagne morbidity scores were no longer associated with obesity.

Variable Reference / Type OR p-value Increases Obesity

Age Continuous 0.97 <0.001 lower age

Race (white) Categorical (ref: non-white) 1.02 0.74 race = white

Income (< $25k) Categorical (ref: > $25k) 0.88 <0.05 income > $25k

Marriage (not-married) Categorical (ref: married) 0.77 <0.001 married

Service years Continuous 1.02 <0.001 more service year

SES Q2 Categorical (ref: Q1 lowest) 1.21 <0.05 higher SES quartile

SES Q3 Categorical (ref: Q1 lowest) 1.34 <0.001 higher SES quartile

SES Q4 Categorical (ref: Q1 lowest) 1.16 0.06 higher SES quartile

Gagne (> 0) Categorical (ref: <= 0) 0.94 0.16 Gagne < 0

Urban/Rural (rural) Categorical (ref: urban) 0.93 0.27 being urban

Road Density Continuous 0.91 0.44 lower road density

Low Food Access Continuous 1.06 0.19 higher low-food-access

Page 38: Population Health Research with Big Data: Updates and


CPHIT Portfolio Social Determinants of Health

Various sources of data for SDH extracted or derived from an EHR

Data stores: EHR data warehouse; Ancillary DBs

SDH data types: Surveys; Diagnosis; Free Text; Address; Geo-derived SDH

Codings and formats shown: ICD, LOINC, SNOMED, Custom (shown twice); Lat./Lon.; various types of notes

Axis labels: individual-level accuracy; population-level completeness


Page 39: Population Health Research with Big Data: Updates and


Predictive Risk Evaluation to Combat Overdose Grant (PRECOG)

Partnership between the Maryland Department of Health, Johns Hopkins (HPM/CPHIT), and the Maryland Health Information Exchange (CRISP)

It is one of the top data linkage and modeling efforts of its type in the nation

Three-year grant (2015-2018) funded by the US Department of Justice

Main aims:
− To develop and validate a predictive risk model for overdose in the Maryland Prescription Drug Monitoring Program (PDMP)
− To extend this model to include predictors from other clinical and criminal justice sources
− To transfer these tools to the Maryland Department of Health and others for use in addressing the opioid death epidemic

Page 40: Population Health Research with Big Data: Updates and


© 2014, Johns Hopkins University. All rights reserved.


Data Sources, Risk Factors and Outcomes

Star indicates inclusion in the PRECOG project as of 2/2018

Page 41: Population Health Research with Big Data: Updates and


Very Unique Linkage Process

No common identifier across siloed datasets.

Unique identifiers stripped from datasets delivered to Hopkins and study ID appended.

CRISP used a probabilistic matching algorithm (the master patient index) to link person-level records from disparate systems using personal identifiers (e.g., name, DOB, SSN)

Created an integrated, de-identified statewide database (2014-2016)
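
To give a feel for what probabilistic matching on personal identifiers means, here is a minimal sketch in the style of Fellegi-Sunter record linkage. It is not CRISP's master patient index algorithm, whose details are not given in the talk. The m/u probabilities, the threshold, the field names, and the example records are all hypothetical, and fields are compared by exact equality, whereas production systems usually use fuzzy comparisons.

```python
import math
from typing import Dict, Optional

# Hypothetical agreement probabilities per field:
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "name": (0.95, 0.01),
    "dob": (0.97, 0.003),
    "ssn": (0.99, 0.0001),
}

LINK_THRESHOLD = 10.0  # hypothetical cut-off on the summed weight


def match_weight(
    rec_a: Dict[str, Optional[str]], rec_b: Dict[str, Optional[str]]
) -> float:
    """Sum log2 likelihood ratios over fields; higher means more likely a match."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


# Made-up records from two hypothetical source systems.
pdmp_rec = {"name": "JANE DOE", "dob": "1950-01-31", "ssn": None}
hospital_rec = {"name": "JANE DOE", "dob": "1950-01-31", "ssn": "000-00-0000"}
other_rec = {"name": "JOHN ROE", "dob": "1982-07-04", "ssn": "000-00-0001"}

w_same = match_weight(pdmp_rec, hospital_rec)
w_diff = match_weight(hospital_rec, other_rec)
is_link = w_same >= LINK_THRESHOLD
```

Pairs scoring above the threshold would be assigned the same study ID before identifiers are stripped, which is the step that makes a de-identified linked database possible.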

Page 42: Population Health Research with Big Data: Updates and


Key model variables to predict opioid overdose death using only the PDMP database

Opioid Overdose Death

Age OR 95% CI

Male 2.913 1.734 - 2.775

Method of Payment

Medicare 2.904 1.857 – 4.542

Opioid Use

Opioid use disorder fills, 1+ 7.021 4.249 – 11.603

Opioid short-acting, schedule II fills, 4+ 4.660 2.428 – 8.944

Other Controlled Substance

Muscle relaxant fills, 1+ 2.614 1.267 – 5.395

Reference categories: female, commercial payer, no OUD fill, no short-acting schedule II fill, no other CDS fills.
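
A quick sanity check for tables like this one: when a confidence interval is computed symmetrically on the log-odds scale (a Wald-type interval, which is an assumption here), the odds ratio should sit close to the geometric mean of the two bounds. The sketch below applies that check to the rows as transcribed. It only flags rows for review and cannot say which number, if any, was mistyped.

```python
import math
from typing import List, Tuple

# (label, OR, CI lower, CI upper) exactly as they appear in the transcript.
rows: List[Tuple[str, float, float, float]] = [
    ("Male", 2.913, 1.734, 2.775),
    ("Medicare", 2.904, 1.857, 4.542),
    ("Opioid use disorder fills, 1+", 7.021, 4.249, 11.603),
    ("Opioid short-acting, schedule II fills, 4+", 4.660, 2.428, 8.944),
    ("Muscle relaxant fills, 1+", 2.614, 1.267, 5.395),
]


def is_consistent(odds_ratio: float, lower: float, upper: float,
                  rel_tol: float = 0.01) -> bool:
    """True if the OR is within rel_tol of the geometric mean of its CI bounds."""
    return math.isclose(odds_ratio, math.sqrt(lower * upper), rel_tol=rel_tol)


flagged = [label for label, o, lo, hi in rows if not is_consistent(o, lo, hi)]
```

With the values above, only the "Male" row is flagged: its OR lies outside its own interval, so that row should be checked against the original slide.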

Page 43: Population Health Research with Big Data: Updates and


Finding: Model Performance in Identifying All Opioid Deaths Using Only PDMP Data

• Sensitivity: 71.93
• Specificity: 88.09
• PPV: 0.72
• NPV: 99.96
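
The combination of reasonable sensitivity and specificity with a very low PPV is what one expects when the outcome is rare. The sketch below shows the arithmetic using Bayes' rule. It assumes the slide's figures are percentages, and the prevalence used (1 death per 1,000 people) is a hypothetical value chosen for illustration, not a number from the talk, so the output is not meant to reproduce the slide exactly.

```python
from typing import Tuple


def ppv_npv(sensitivity: float, specificity: float,
            prevalence: float) -> Tuple[float, float]:
    """Compute PPV and NPV from sensitivity, specificity, and prevalence (all 0-1)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity from the slide (read as percentages);
# prevalence is a hypothetical assumption.
ppv, npv = ppv_npv(sensitivity=0.7193, specificity=0.8809, prevalence=0.001)
```

Under these assumptions the PPV comes out well under 1% while the NPV stays above 99.9%, which is the same qualitative pattern as on the slide.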


Page 44: Population Health Research with Big Data: Updates and


Importance of Linking PDMP, HCRC, and Corrections Data: Overdose fatality rates (per 100K) and ORs for subgroups defined by key items

Subgroup | Fatality rate per 100K | OR
Inpatient Hospital Visit and Arrest (N=2,758) | 1233 | 25.2
Inpatient Hospital Visit and Parole (N=3,832) | 1174 | 24.0
Opioid Prescription and Parole (N=7,057) | 765 | 15.6
ED Visit and Parole (N=10,549) | 730 | 14.9
Inpatient Hospital Visit and Inmate (N=848) | 708 | 14.4
Opioid Prescription and Arrest (N=4,992) | 661 | 13.5
Arrest and Parole (N=3,941) | 634 | 12.9
ED Visit and Arrest (N=8,057) | 559 | 11.4
Parole (N=24,199) | 483 | 9.9
Arrest (N=16,232) | 413 | 8.4
Parole and Inmate (N=1,935) | 362 | 7.4
ED Visit and Inmate (N=2,629) | 342 | 7.0
Opioid Prescription and Inmate (N=1,413) | 283 | 5.8
Inmate (N=9,954) | 201 | 4.1
Inpatient and ED Visit (N=314,127) | 188 | 3.8
Opioid Prescription and Inpatient Hospital Visit (N=366,348) | 141 | 2.9
Opioid Prescription and ED Visit (N=610,572) | 125 | 2.6
Inpatient Hospital Visit (N=810,284) | 99 | 2.0
Arrest and Inmate (N=1,011) | 99 | 2.0
ED Visit (N=1,551,240) | 83 | 1.7
Opioid Prescription (N=1,740,332) | 63 | 1.3
Maryland Average (N~=6,001,000) | 49 | 1.0
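
The values labeled "OR" on this slide line up with a simple ratio of each subgroup's fatality rate to the Maryland average of 49 per 100K. That is an observation from the numbers, not a method stated in the talk, and the helper names below are illustrative. The sketch reproduces three of the listed values.

```python
from typing import Dict

REFERENCE_RATE = 49  # Maryland average, deaths per 100K, from the slide


def rate_per_100k(deaths: int, population: int) -> float:
    """Crude fatality rate per 100,000 people."""
    return 100000 * deaths / population


def ratio_to_reference(rate: float, reference: float = REFERENCE_RATE) -> float:
    """Ratio of a subgroup rate to the reference rate, rounded to one decimal."""
    return round(rate / reference, 1)


# Rates per 100K as listed on the slide for three subgroups.
subgroup_rates: Dict[str, int] = {
    "Inpatient Hospital Visit and Arrest": 1233,
    "Parole": 483,
    "Opioid Prescription": 63,
}

ratios = {name: ratio_to_reference(rate) for name, rate in subgroup_rates.items()}
```

Because the outcome is rare even in the highest-risk subgroup, an odds ratio and a rate ratio would be numerically close here, which may explain the labeling.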

Page 45: Population Health Research with Big Data: Updates and


Discussion

Page 46: Population Health Research with Big Data: Updates and


Discussion Challenges and Opportunities (cont.)

• Data sources/types:

o How to compare data types and their added value?
o What are the limits of each data type? What are we missing?
o What can be used from unstructured data?

• Data quality:

o Do objective measures have data quality issues (e.g., BMI)?
o How can we measure the quality of subjective data?

• Denominator/Populations:

o Are we excluding noise or signal?
o Is this too big of a cut or too narrow (sample size issues)?
o Patient attribution issues.

Page 47: Population Health Research with Big Data: Updates and


Discussion Challenges and Opportunities (cont.)

• Some Data Science / Analytic Issues

o “Feature” (aka variable) Reduction - How can we mix multiple approaches such as expert opinion + automated approaches to reduce the feature space?
o Longitudinal / Temporal Analysis - What window is appropriate? How to deal with large zero fills in temporal data?

• Privacy and Security:

o “Freeing” the data while building in robust protections
o Are HIPAA and other regulations from a past era? Are they helping or hurting future science?
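
On the feature-reduction question, one simple way to mix expert opinion with an automated approach is to always keep the expert-chosen variables and then let a data-driven ranking fill the remaining slots. The sketch below is one possible illustration, not a recommendation from the talk: the ranking uses absolute Pearson correlation with the outcome, and all variable names and values are made up.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_features(features: Dict[str, Sequence[float]],
                    outcome: Sequence[float],
                    expert_keep: List[str],
                    top_k: int) -> List[str]:
    """Keep expert-chosen features, then add the top_k others by |correlation|."""
    candidates = [name for name in features if name not in expert_keep]
    ranked = sorted(candidates,
                    key=lambda name: abs(pearson(features[name], outcome)),
                    reverse=True)
    return list(expert_keep) + ranked[:top_k]


# Made-up data for four patients.
features = {
    "age": [70, 75, 80, 85],
    "prior_fall": [0, 1, 0, 1],
    "bmi": [30, 22, 29, 21],
    "constant_flag": [1, 1, 1, 1],
}
outcome = [0, 1, 0, 1]

selected = select_features(features, outcome, expert_keep=["age"], top_k=1)
```

In practice the automated step could be swapped for a penalized regression or another selection method; the point is only that expert-kept variables bypass the automated filter.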

Page 48: Population Health Research with Big Data: Updates and


Discussion Collaboration Opportunities

Going Beyond Claims! New Data Sources

• EHRs
• SDH (various sources)
• Geo-derived Factors

Going Beyond Regression! New Methods

• Blending traditional HSR techniques and “Machine Learning”
• Temporal Data
• Geo-analysis

Addressing Informatics / Data Science Issues

• Data Quality
• Data Interoperability
• Extracting New Data (e.g. NLP)

Page 49: Population Health Research with Big Data: Updates and


Thank you!

Q & A

[email protected]@jhu.edu

www.jhsph.edu/cphit

www.hkharrazi.com