think big data, think big impact! - northeast ohio hfma big...survival in cancer patients is to...
TRANSCRIPT
Jill S. Barnholtz-Sloan, PhD
Sally S. Morley Designated Professor in Brain Tumor Research
Professor & Associate Director for Bioinformatics/Translational Informatics (CICB & Case CCC)
Director, Cleveland Center for Health Outcomes Research (CWRU SOM)
Director, Research Health Analytics and Informatics (UHHS)
THINK BIG DATA, THINK BIG IMPACT!
The Power of Data!
• BIG DATA comes in many forms, leading to multiple different types of research opportunities
– Some datasets are disease focused, others are not
• Volume: size
– Hundreds of thousands, even millions of patients
• Velocity: speed at which data is generated and processed
– 1000s patients/day; 1000s measurements/hour
• Variety: claims, EHR, social media, etc.
– Physicians’ notes, pt-reported outcomes, MRI/CT
• Veracity: truth value, trustworthiness
– Incorrect or missing? CP = chest pain or cerebral palsy? (interoperability)
• Value: how to turn data into meaningful insights
– Lead to measurable improvement
Big Data Examples
• Google searches
– Anticipates what you are going to type
• Grocery store purchases
– Generates coupons for similar products
• Facebook likes
– Predicts what others want to see
• Tweets
– Identify social (viral) trends
• Insurance companies
– Tracking car usage to determine insurance rates
• Uber
– Tracking usage to determine surge pricing
Structured vs Unstructured
Structured Data
• Lab results
• Prescribed medications
• Billing codes
• Problem lists
Unstructured Data
• Radiology images and reports
• Pathology images and reports
• Clinical notes
80% of EHR data is UNSTRUCTURED!!
Slide courtesy of D. Hanauer
Cancer Facts and Figures
Brain and other CNS 1.6% Brain and other CNS 1.2%
ACS, 2019
ACS, 2019
ACS, 2018
Harnessing the power of big data in cancer!
--- Examples in brain tumors
Cancer is a disease of the genome
• If we precisely characterize the cancer genome, can we cure cancer??
– Drivers
– Passengers
– Rapid evolution
  • Development of treatment resistance
  • Clonal evolution
– Other components of biological process -- complex signaling
Multiple data types
• Clinical diagnosis
• Treatment history
• Histologic diagnosis
• Pathologic report/images
• Tissue anatomic site
• Surgical history
• Gene expression/RNA sequence
• Chromosomal copy number
• Loss of heterozygosity
• Methylation patterns
• miRNA expression
• DNA sequence
• RPPA (protein)
• Subset for Mass Spec
TCGA: “No Platform Left Behind”
25* forms of cancer
glioblastoma multiforme (brain)
squamous carcinoma (lung)
serous cystadenocarcinoma (ovarian)
Etc. Etc. Etc.
Biospecimen Core Resource with more than 150 Tissue Source Sites
6 Cancer Genomic Characterization Centers
3 Genome Sequencing Centers
7 Genome Data Analysis Centers
Data Coordinating Center
Mutational Landscape of Cancers
TCGA, 2014
Figure labels: Smoking, UV, HPV, Diet?
Brain Tumors and TCGA
Brennan et al, Cell, 2013
• UH 6th leading accruing site to study
Brat et al, NEJM, 2015
• Led to changes in the WHO classification and treatments for low grade gliomas
UH/CWRU leader of Ohio wide recruitment network for glioma
Ohio Brain Tumor Study => recruited >1200 patients with >10,000 biospecimens banked
New WHO Classifications
Currently collecting the following from the patient medical records with varying degree of completeness
• WHO grade
• 1p deletion
• 19q deletion
• MGMT methylation
Started to collect in 2018 (available in 2021)
• IDH mutation
• SHH activation/TP53 wt for medulloblastoma
• C19MC alteration for embryonal tumors
Louis et al, 2016
Prognostic Factors for BTs
• Karnofsky Performance Score (KPS)
• Age at diagnosis
• Extent of surgical resection
• Histological Type of Tumor
• Biomarkers
– MGMT methylation
– IDH 1/2 mutation
– 1p/19q deletion
HG1
• The current approach to addressing survival in cancer patients is to group patients and discuss MEDIAN survival
• A nomogram is a tool for estimating individualized survival probabilities
Nomograms for individual survival estimation
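To make the contrast between a group median and an individualized estimate concrete, here is a minimal Python sketch. It assumes a Cox proportional hazards form, S(t | x) = S0(t) ^ exp(beta · x), which is a common basis for survival nomograms; the slides do not state the model behind the GBM nomogram, and every coefficient and baseline value below is hypothetical, chosen only for illustration.

```python
import math

# HYPOTHETICAL log hazard ratios; NOT the published GBM nomogram coefficients.
HYPOTHETICAL_BETAS = {
    "age_per_decade_over_50": 0.25,
    "kps_below_70": 0.50,
    "subtotal_resection": 0.30,
    "mgmt_unmethylated": 0.60,
}

# HYPOTHETICAL baseline survival S0(t), by month, for a reference patient
# whose covariates are all zero.
HYPOTHETICAL_BASELINE = {6: 0.90, 12: 0.75, 24: 0.45}


def linear_predictor(patient):
    """Sum of beta * covariate value; missing covariates count as 0."""
    return sum(HYPOTHETICAL_BETAS[name] * patient.get(name, 0.0)
               for name in HYPOTHETICAL_BETAS)


def survival_probability(patient, months):
    """Cox-style individualized survival: S(t | x) = S0(t) ** exp(lp)."""
    return HYPOTHETICAL_BASELINE[months] ** math.exp(linear_predictor(patient))


# Two patients who could fall in the same group (and share one median)
# but whose individual estimates differ.
lower_risk = {"age_per_decade_over_50": 0.5, "mgmt_unmethylated": 0}
higher_risk = {"age_per_decade_over_50": 1.5, "kps_below_70": 1,
               "subtotal_resection": 1, "mgmt_unmethylated": 1}

for label, patient in [("lower risk", lower_risk), ("higher risk", higher_risk)]:
    probs = {m: round(survival_probability(patient, m), 2)
             for m in HYPOTHETICAL_BASELINE}
    print(label, probs)
```

A real nomogram would take its coefficients and baseline curve from a model fitted to patient data; the point of the sketch is only that the same baseline yields different survival probabilities once individual factors are entered.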
HG2
Comment HG1 (Slide 19; Haley Gittleman, 7/2/2019): Treatment and race as well.
Comment HG2 (Slide 20; Haley Gittleman, 7/2/2019): May want to update with the old GBM nomogram from our team.
Example: GBM Nomogram
• 65-year-old male; 90 KPS; total resection; unmethylated MGMT
http://cancer4.case.edu/rCalculator/rCalculator.html
Example: newly diagnosed IDH-wildtype GBM patients
In press, Neuro Oncology Advances
• 65-year-old male; >=70 KPS; total resection; unmethylated MGMT; concurrent rad/TMZ
https://gcioffi.shinyapps.io/Nomogram_For_IDH_Wildtype_GBM_H_Gittleman/
In press, Neuro Oncology Advances
Facilitating Research in Cleveland via BIG DATA
Integration of EHR data through CLEARPATH: Enables Pragmatic Clinical Trials, Research, and Population Health
[Diagram: CLEARPATH, Cleveland-Wide Data Integration across multiple hospital systems]
• Collaborative Research: Research Questions (Cohorts, ‘Omics, Clinical, Population Data)
• Global Data Query (GDQ) sent through CLEARPATH
• Hospital 1, Hospital 2, Hospital 3, each with: EHR; CDM Limited Data Set; IDEAS Server (CICB provided); Hash; Local Data Query
• External data: Governmental, Social, Environmental
• Limited Dataset returned (GDR) and combined into an Integrated Data Result
How Does The Hashed ID Look in Real-Life?
Hashed ID Output (Data sent out of hospital)
(Simulated Data)
PHI requiring Authorization Input (File or database within Hospital Firewall)
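One way to picture the step from PHI input to hashed ID output is a keyed one-way hash over normalized identifiers. The sketch below uses Python's standard `hmac` and `hashlib` modules. The choice of fields, the normalization rules, and the shared secret are assumptions for illustration, not the method used by the IDEAS servers, and the patient record is simulated.

```python
import hashlib
import hmac

# ASSUMPTION: a secret key shared by participating sites and kept inside
# each hospital firewall. This value is a placeholder.
SHARED_SECRET = b"example-secret-not-for-production"


def normalize(value):
    """Lowercase and drop non-alphanumerics so trivial formatting
    differences do not change the hash."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def hashed_id(first_name, last_name, birth_date, ssn_last4):
    """Return a hex digest that stands in for the patient outside the firewall.

    The field list is hypothetical; HMAC-SHA256 is one common keyed hash.
    """
    message = "|".join(normalize(v) for v in
                       (first_name, last_name, birth_date, ssn_last4))
    return hmac.new(SHARED_SECRET, message.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Simulated PHI (stays within the hospital firewall).
inside_firewall = ("Jane", "Doe", "1954-03-02", "1234")

# Only the digest would be sent out of the hospital.
print(hashed_id(*inside_firewall))
```

Because every site normalizes and hashes the same way, the same patient yields the same digest at each hospital, while the digest itself does not reveal the name or birth date.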
Importance of De-Duplication in a Research Network
* Results reported from CAPRICORN PCORI CDRN
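De-duplication matters because a patient seen at more than one hospital would otherwise be counted more than once. The following is a minimal Python sketch, assuming each site reports rows keyed by a hashed ID computed the same way everywhere; the hash strings, diagnoses, and record layout are simulated and do not come from CAPRICORN or CLEARPATH.

```python
from collections import defaultdict

# Simulated limited data sets. "hash" stands for a hashed patient ID that is
# computed identically at every site.
site_rows = {
    "Hospital 1": [{"hash": "a1f3", "dx": "glioma"},
                   {"hash": "b7c9", "dx": "CHF"}],
    "Hospital 2": [{"hash": "a1f3", "dx": "glioma"},
                   {"hash": "d4e2", "dx": "type 2 diabetes"}],
    "Hospital 3": [{"hash": "b7c9", "dx": "CHF"}],
}


def integrate(rows_by_site):
    """Group rows from all sites by hashed ID so each patient appears once."""
    patients = defaultdict(lambda: {"sites": [], "dx": set()})
    for site, rows in rows_by_site.items():
        for row in rows:
            patients[row["hash"]]["sites"].append(site)
            patients[row["hash"]]["dx"].add(row["dx"])
    return dict(patients)


merged = integrate(site_rows)
naive_count = sum(len(rows) for rows in site_rows.values())
print("rows before de-duplication:", naive_count)
print("unique patients after de-duplication:", len(merged))
```

Summing rows across sites overstates the population; grouping by the shared hashed ID gives the count of distinct patients and shows which sites each one visited.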
Flow of Harmonized Data Enhancing Cleveland Wide Health Impact
[Diagram]
• Sources: UHHS EHR data, CC EHR data, MHHC EHR data
• ICB enabled Mapping of each source into EHR Data in OMOP CDM
• Structured data, then adding unstructured data and project specific data
• ICB supported tools and analytical tools shown: i2b2/SHRINE, TriNetX, OHDSI ATLAS, EMERSE
• Uses shown: CTSA Grants and Activities; PCORI Grants and Activities; NIH Grants; Increased Clinical Trials Participation/Industry Contracts; PI specific project integration; Investigator Driven Research; Population Health -- ACO --- Risk assessment and prediction; ROI
** initial data is structured data only --- dates, codes, lab results, meds, demographics
CICB = Cleveland Institute for Computational Biology
• http://project-emerse.org/
• All text notes in EHR -- Large collection of medical terminology with extensive synonyms and related keywords
EMERSE (Electronic Medical Record Search Engine)
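The idea of searching free-text notes with synonym expansion can be sketched in a few lines of Python. This is not EMERSE's implementation or API; the synonym table and the notes are invented for illustration.

```python
import re

# HYPOTHETICAL synonym table; a real terminology would be far larger.
SYNONYMS = {
    "glioblastoma": ["glioblastoma", "gbm", "glioblastoma multiforme"],
    "chest pain": ["chest pain", "angina"],
}

# Simulated clinical notes.
notes = {
    "note-1": "Pt with newly diagnosed GBM, s/p total resection.",
    "note-2": "Reports chest pain on exertion; no neuro complaints.",
    "note-3": "Glioblastoma multiforme, MGMT unmethylated.",
}


def search(term, documents):
    """Return ids of notes containing the term or any listed synonym
    as a whole word or phrase, ignoring case."""
    variants = SYNONYMS.get(term.lower(), [term])
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    return sorted(doc_id for doc_id, text in documents.items()
                  if pattern.search(text))


print(search("glioblastoma", notes))
```

Searching for "glioblastoma" also finds the note that only says "GBM", which is the kind of match a plain keyword search on unstructured notes would miss.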
What can I get from all of this?
•Cohort Identification – e.g. clinical trial feasibility, with re-identification after IRB review
•Clinical Characterization – e.g. treatment utilization, disease history, quality improvement
•Population comparisons – e.g. safety surveillance, comparative effectiveness
•Patient-level prediction – e.g. precision medicine, disease risk/ED visit risk/readmission risk
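Cohort identification against structured data can be sketched as a query over OMOP CDM-style tables. The example uses Python's built-in `sqlite3` with a tiny in-memory subset of the `person` and `condition_occurrence` tables. The rows are simulated, and the concept id 999001 is a placeholder, not a real OMOP concept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
""")

PLACEHOLDER_CONCEPT = 999001  # hypothetical id standing in for the target diagnosis

# Simulated patients and diagnoses.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, 1954), (2, 1980), (3, 1949)])
conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
                 [(10, 1, PLACEHOLDER_CONCEPT, "2018-05-01"),
                  (11, 2, PLACEHOLDER_CONCEPT, "2019-01-15"),
                  (12, 3, 999002, "2018-07-20"),
                  (13, 1, PLACEHOLDER_CONCEPT, "2018-09-10")])


def cohort(connection, concept_id, min_age_at_dx):
    """Return person_ids whose first record of the condition falls at or
    after min_age_at_dx (age approximated from year of birth)."""
    rows = connection.execute(
        """
        SELECT p.person_id
        FROM person p
        JOIN condition_occurrence c ON c.person_id = p.person_id
        WHERE c.condition_concept_id = ?
        GROUP BY p.person_id, p.year_of_birth
        HAVING CAST(substr(MIN(c.condition_start_date), 1, 4) AS INTEGER)
               - p.year_of_birth >= ?
        ORDER BY p.person_id
        """,
        (concept_id, min_age_at_dx),
    ).fetchall()
    return [r[0] for r in rows]


print("feasibility count:", len(cohort(conn, PLACEHOLDER_CONCEPT, 60)))
```

A count like this is the sort of answer a clinical trial feasibility question needs; re-identifying the matching patients would be a separate step after IRB review.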
New UH Narrative: Improving Value is Everyone’s Responsibility
Value = (Quality + Patient Experience) / Annual Total Cost of Care
Whoever demonstrates that they provide the highest value care will win in this marketplace.
Keep people healthy at home rather than in hospitals
• Organize care around patients’ needs, anchored in primary care
• Engage all staff in improving value
• Reduce mindless variation, augment mindful variation
• Hardwire the connections upstream and downstream in the patient journey
• Use data and technology to innovate
• Create management and accountability systems to eliminate defects in value
Stay well, Get well, Be well
Obtain annual wellness exam and close the gap
Proactively reduce unhealthy habits
Support healthy habits
Co-manage/co-locate behavioral health sciences
Provide recommended preventative care, wellness, immunization
STAY WELL
Is disease diagnosed?
Is the patient treated with the recommended therapy?
Is the patient activated and able to use therapy?
Is their physiology controlled?
Is their utilization of ED, hospital admissions and readmissions removed?
Is behavioral health co-managed/collocated?
GET WELL
Optimize health for people with chronic disease
Is care coordinated with PCP?
Is the therapy beneficial and appropriate?
Is care being provided in the highest value site of service?
Is care provided by a high value provider that uses evidence-based medicine and shared decision making?
Have we eliminated preventable harm?
MANAGE ACUTE CONDITIONS
For any condition anywhere in the care continuum
CHECKLIST FOR ELIMINATING DEFECTS IN VALUE
UH Population Health Strategy
Real world applications at UH
• With an initial focus on the ACO and employee populations:
– Custom built Enterprise Data Warehouse (EDW) and Data Lake to centralize disparate data across 5 EMRs, 2 scheduling and financial systems, and multiple administrative platforms to enable advanced analytic techniques
– Deployed algorithms to:
  • detect type 2 diabetes early
  • predict CHF readmissions in order to prevent them
  • optimize the closing of quality gaps to maximize shared savings arrangements
  • minimize downside financial risk to UH
– Developed dashboards that allow providers and clinical staff to conduct self-service analytics on their assigned patient populations.
On the cutting edge…..
What is Artificial Intelligence (AI)?
• Intelligence: “ability to learn, understand and think”
• AI is the study of how to make computers do things that people currently do better
• Examples:
– Speech recognition
– Smell recognition
– Facial recognition
– Object recognition
– Intuition
– Inference
– Decision making
– Abstract thinking
AI, Machine Learning & Deep Learning
AI categories with examples
AI & Clinical Decision Making
This article is a must read!
Teaching a computer to read breast tumors!
Classification of breast cancers – unchanged since 1920s
Computational Pathologist -- previously unrecognized features associated with survival
Beck et al, Sci Trans Med, 2011
Teaching a computer to read your skin!
Acknowledgements
• CWRU teams
– JBS and CCHOR Teams: Kristin Waite, Karen Devine, Haley Gittleman, LC Stetson, Vachan Vadmal, Gino Cioffi, Nirav Patil, Sindhu Malay, Tuesday Gibson
– Translational Informatics Team: Jonathan Haines, Mark Beno, Mike Warfe, Devin Tian, Sunah Song, Bob Lanese, Paola Saroufim, Harry Mengay, Brian Christian, Cal Frye, Wanda Lattimore, Mustafa Ascha, John Shanahan, Saroj Sigdel, Justin Coran
• Primary Funding: NIH/NCI, CDC, Case CCC