towards a pathology imaging data commons for discovery...
TRANSCRIPT
1
Towards a Pathology Imaging Data Commons for Discovery and Precision Oncology
AACR SY32: Integrative Data Science for the Precision Medicine EraApril 17th 2018
Michael J. Becich, MD PhD, FACMIChair, Department of Biomedical InformaticsUniversity of Pittsburgh School of Medicine
University Distinguished Professor,Associate Vice Chancellor for Informatics
Associate Director for Cancer Informatics, Hillman Cancer CenterNCI Board of Scientific Advisors
2
Becich Conflicts of Interest (& Disclaimer)• Spatial Diagnostics, Inc. or SPDx (founder and stock) –
computational pathology newco• Nexi – Newco by Rebecca Jacobson and Text Information
Extraction System (TIES) Cancer Research Network (TCRN) team (licensing revenues to my Department)
• Cancer Center Consultancies and EABs – Baylor, Cancer Institute of New Jersey, Indiana University, Northwestern Lurie Cancer Center, University of Colorado, UCLA, University of Michigan, University of New Mexico and Wake Forest
• CTSA Consultancies and EABs – numerous (not a conflict for this talk except possibly for U of Chicago Institute for Translational Medicine)
• Funding – CDC/NIOSH, NCATS, NCI, NHGRI, NHLBI & NLMDisclaimer – I am a member of NCI’s Board of Scientific Advisors and Frederick National Laboratories Advisory Committee, Technical Working Group
3
Outline• Introduction to Text Information System (TIES)
Cancer Research Network (TCRN) • Introduction to Whole Slide Imaging (WSI) and
Computational Pathology (Comp Path)• Introduction to Pathology Imaging Data Commons
Enabled by TCRNKey Impact Areas – Comp Path Predictive Analytics for Precision Oncology leads to better health and
discovery science!
http://cancerdatanetwork.org/
4
The Text Information Extraction System or TIESNCI ITCR funded effort (U24 CA180921)
• A natural language processing (NLP) and Information Retrieval system for de-identifying, annotating, storing & retrieving pathology (and radiology) reports
• A system for indexing research resources (clinical data, biospecimens & images) with document annotations
• An GUI for querying large repository of annotated documents and obtaining resources locally, using an honest broker model
• A platform to support phenotype, images and biospecimen sharing among networks of cancer centers and other institutions
http://ties.dbmi.pitt.edu/
6
TIES Cancer Research Network (TCRN)UPMC Hillman Cancer Center (lead) • Augusta University Cancer Center • Abramson Cancer Center (Penn)• Roswell Park Cancer Institute• Stonybrook University (new partner)• Sidney Kimmel Cancer Center (TJU)• 12 Cancer Centers interested in joining
Network Trust Agreements• IRBs agree that use of data for
investigators is Not Human Subjects Research, no need for an additional IRB protocol even to access record level de-id data
• Governance• Agreement to abide by SOPs• Instrument of Adherence
Soliciting new WSI “ready” partners!
http://cancerdatanetwork.org/
8
8
9
TCRN Impact Merkel Cell Cancer – Rare Cancer Use Case
http://cancerdatanetwork.org/
10
Adding Cancer Registry Data to TIESOutcomes Annotation
• Identified as a valued development target by users• We have secured additional funding from Institute
for Precision Medicine in Pittsburgh• Mike Davis and Jonathan Silverstein lead this effort• Starting with breast cancer (first publication in
review)• Immediately leveragable by existing TCRN nodes by
adding Cancer Registry data to TIES instances
http://cancerdatanetwork.org/
11
Cancer Registry Data ElementsDemographics Primary Treatment Outcome
Race Primary Site Surgery Vital Status
Gender Histology Chemotherapy Cancer Status
Age @ Diagnosis
Grade BRM Recurrence
Smoking Path TNM Hormonal Cause of Death
Alcohol Clinical TNM Immunotherapy
Prognostic Factors (including site specific)
Rad Onc
http://cancerdatanetwork.org/
12
13
• Kaiser team publication – 2017 J. Path Info article:
• Result = deeper patient annotation, including outcomes data and enabling study cohort identification
Validation of TIES/TCRN Cancer Registry Integration
http://cancerdatanetwork.org/
15
Outline• Introduction to Text Information System (TIES)
Cancer Research Network (TCRN) • Introduction to Whole Slide Imaging (WSI) and
Computational Pathology (Comp Path)• Introduction to Pathology Imaging Data Commons
Enabled by TCRNKey Impact Areas – Comp Path Predictive Analytics for Precision Oncology leads to better health and
discovery science!
16
Traditional vs. Digital PathologyIntro to Whole Slide Imaging (WSI)
From Bertram and Kopfleish, Vet Path 2017
17
WSI – Imaging Big Data Drives Computational Pathology
From Saltz, circa 2011, to NLM
18
WSI Comp Path Adds Spatial Big Data for Tumor Analysis
EpigenomicsGenomics Proteomics Metabolomics
1 mm
Multi to hyperplexed fluorescence imaging of whole slide images for higher spatial resolution
and tissue context
H&E stained whole slide imagesfrom FFPE tumor sample
1
4
4 2A
B
C
D
23
13
Thanks to Drs. Joe Ayoob and Chakra Chennubhotla for this illustration, 2017
NCI ITCR funded effort (U01 CA204826)
19
Outline• Introduction to Text Information System (TIES)
Cancer Research Network (TCRN) • Introduction to Whole Slide Imaging (WSI) and
Computational Pathology (Comp Path)• Introduction to Pathology Imaging Data Commons
Enabled by TCRNKey Impact Areas – Comp Path Predictive Analytics for Precision Oncology leads to better health and
discovery science!
20
• Clinical Phenotype – From Pathology Lab Info Sys– Structured data– Unstructured data via NLP (Text Info Extract Sys – TIES)
• Outcomes Data – From Cancer Registry Systems• Biobank Data – From Formalin Fixed Paraffin
Embedded Tissue• Imaging – WSI - Spatial Annotation – HTAN relevant• Integrate Challenge – Clinical, Outcome, Biobank,
Imaging Data for analysis with Genomic data
Towards Computational Pathology Imaging Commons for Cancer – TCRN Data Types
http://cancerdatanetwork.org/
21Marusyk A. Nat Rev Cancer. 2012 Apr 19;12(5):323-34
Computationally Annotated WSI Assist inQuantification of Tumor Heterogeneity
22
Intra-tumoral spatial heterogeneity complicates accurate diagnosis & prognosis
Nature Reviews Cancer (2009) 9:239–252
23
Tumor Heterogeneity Research Interactive Visualization Environment (THRIVE – NCI ITCR U01 @ Pitt)
• Computational Pathology– Hyperplex Immunofluorescence (9 to > 50 Ab) – Machine Learning + Spatial Statistics– Network Systems Biology– Partnership with General Electric (GE)
• Iterative experimental–computational tumor micro-environment studies
Courtesy Chakra Chennubhotla, 2017
24
THRIVE – Chennubhotla Lab
Cancer Res. 2017 Nov 1;77(21):e71-e74. doi: 10.1158/0008-5472. PMID: 29092944
25
TCRN Enabled Data Sharing and Query
26
Comp Path Impact – TILs in TCGA Tumors
Cell Reports, 2018
NCI ITCR Funded U24 CA180924 (Saltz and Sharma), U24 CA215109 (Saltz and Prior) and others
27
Conclusions• The Text Information System (TIES) Cancer Research
Network (TCRN) is an ideal platform for coupling clinical/outcomes data to biospecimens/images
• Whole Slide Imaging (WSI) and Computational Pathology will enable NCI’s Moonshot in new ways to couple spatial analysis features to –omic data sets
• A WSI Imaging Data Sharing and Query Toolkit (TCRN) with linkages to NCI ITCR and GDC programs can assist the NCI vision for a Cancer Research Data Commons
Key Impact Areas – Computational Pathology Predictive Analytics for Precision Oncology leads to better health
and discovery science!For Copy of PPT –[email protected]
28
Computational Pathology @ PittLed by Chakra Chennubhotla, PhD
• Organized as a Interest Group and Lecture Series• Currently 109 members from Pitt, CMU, UPMC and
regional companies/startups (need sponsor!!!)• Pittsburgh Computational Pathology Interest Group
- http://www.csb.pitt.edu/comppath/• Computational Pathology Lecture Archive -
https://www.youtube.com/channel/UCWfBS3PLWHTIACeccm2CIog
• Supported by Akif Burak Tosun, PhD (post-doc)
29
Computational Pathology @ Pitt
Computational Pathology Lecture Archive -https://www.youtube.com/channel/UCWfBS3PLWHTIACeccm2CIog
30
“...Data commons collocate data, storage, and computing infrastructure with core services and commonly used tools and applications for managing, analyzing, and sharing data to create an interoperable resource for the research community...”
From Computing In Science & Engineering IEEE 2016
31
FAIR Data Principles
• Findable• Accessible• Interoperable• Reusable
• Data Use Case for Data Commons– See Data Med v3.0 – http://datamed.org
32
Six Requirements of a Data Commons
• Permanent digital IDs (data and knowledge)• Permanent metadata (data describing data)• API (interface)-based access (interoperability)• Data portability (standard containers)• Data Peering (commons 1 can access commons 2)• Pay for compute (allocate computing/charging)
– Demand higher than computing resources availableFrom Grossman et al 2016 CISE IEEE
33
Big Data to Knowledge (BD2K)
Pitt’s Department of Biomedical Informatics is a Center of Excellence in Big Data to Knowledge
Department of Biomedical Informatics
Pittsburgh, your Self-Driving Uber is arriving
now!!!
34
Informatics Fuels Data to Knowledge
35
HTAN and Spatial Imaging Discovery
36
Human Cell Atlas & Human BioMolecular Atlas Program (HuBMap) Drive Relevance of Comp Path
37
Enabling Research on the Cancer Registry
TIES tranSMART