caBIG: the cancer Biomedical Informatics Grid
Ken Buetow
NCICB/NCI/NIH/DHHS
NCI biomedical informatics
• Goal: A virtual web of interconnected data, individuals, and organizations redefines how research is conducted, care is provided, and patients/participants interact with the biomedical research enterprise
[Figure: diagram. Labels: trials, animal models; states; context (pathways, ontologies); agents (therapeutics, probes); components (genes, genotypes, gene expression, proteins, protein expression); etiology, treatment, prevention]
[Figure: network diagram. Labels: Molecular Pathology, Clinical Trials, caCORE, access portals, participating group nodes, Cancer Genomics, Mouse Models]
building common architecture, common tools, and common standards
Interoperability
Semantic interoperability
Syntactic interoperability
Courtesy: Charlie Mead
• in·ter·op·er·a·bil·i·ty: ability of a system ... to use the parts or equipment of another system. Source: Merriam-Webster web site
• interoperability: ability of two or more systems or components to exchange information and to use the information that has been exchanged. Source: IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries, IEEE, 1990
• Information integration
• Cross-discipline reasoning
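The difference between syntactic and semantic interoperability can be illustrated with a minimal Python sketch. The message layouts, the diagnosis strings, and the shared identifier `CONCEPT:0001` are hypothetical and invented for illustration; they are not caBIG or NCI identifiers.

```python
import json

# Hypothetical messages from two systems. Both are valid JSON, so the systems
# are syntactically interoperable: each side can parse the other's message.
message_a = '{"patient": "P1", "diagnosis": "breast carcinoma"}'
message_b = '{"patient": "P2", "diagnosis": "carcinoma of breast"}'

# Hypothetical mapping from each system's local term to a shared concept
# identifier. Semantic interoperability depends on this shared meaning.
TERM_TO_CONCEPT = {
    "breast carcinoma": "CONCEPT:0001",
    "carcinoma of breast": "CONCEPT:0001",
}


def parse(message: str) -> dict:
    """Syntactic step: turn the wire format into a data structure."""
    return json.loads(message)


def same_diagnosis(record_x: dict, record_y: dict) -> bool:
    """Semantic step: compare shared concepts rather than raw strings."""
    concept_x = TERM_TO_CONCEPT.get(record_x["diagnosis"])
    concept_y = TERM_TO_CONCEPT.get(record_y["diagnosis"])
    return concept_x is not None and concept_x == concept_y


record_a = parse(message_a)
record_b = parse(message_b)
strings_match = record_a["diagnosis"] == record_b["diagnosis"]
concepts_match = same_diagnosis(record_a, record_b)
```

Both messages parse, yet a plain string comparison treats the two diagnoses as different; only the lookup through a shared vocabulary recognises them as the same concept.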
caCORE: common ontologic representation environment
biomedical objects
common data elements
controlled vocabulary
Enterprise Vocabulary
• NCI Meta-Thesaurus (cross-maps standard vocabularies/ontologies, e.g. SNOMED, MedDRA, ICD)
  - Semantic integration, inter-vocabulary mapping
  - UMLS Metathesaurus extended with cancer-oriented vocabularies
    • 800,000 Concepts, 2,000,000 terms and phrases
    • Mappings among over 50 vocabularies
• NCI Thesaurus
  - Description logic-based
  - 18,000 “Concepts”
    • Concept is the semantic unit
    • One or more terms describe a Concept (synonymy)
    • Semantic relationships between Concepts
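The concept-centred model described on this slide (a Concept as the semantic unit, one or more terms per Concept, and named relationships between Concepts) can be pictured with a small Python sketch. The class names, the codes `X001` and `X002`, and the `is_a` relationship label are assumptions made for illustration; this is not the NCI Thesaurus data model or the EVS API.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept is the semantic unit; terms are names that describe it."""
    code: str                       # hypothetical identifier
    preferred_term: str
    synonyms: set = field(default_factory=set)
    # relationship name -> set of target concept codes
    relationships: dict = field(default_factory=dict)

    def terms(self) -> set:
        """All terms (preferred name plus synonyms) for this concept."""
        return {self.preferred_term} | self.synonyms

    def relate(self, relationship: str, target_code: str) -> None:
        """Record a named semantic relationship to another concept."""
        self.relationships.setdefault(relationship, set()).add(target_code)


class Thesaurus:
    """Toy index that resolves any term to the concept it describes."""

    def __init__(self):
        self._by_code = {}
        self._by_term = {}

    def add(self, concept: Concept) -> None:
        self._by_code[concept.code] = concept
        for term in concept.terms():
            self._by_term[term.lower()] = concept.code

    def lookup(self, term: str):
        """Return the concept for a term, or None if the term is unknown."""
        code = self._by_term.get(term.lower())
        return self._by_code.get(code)


thesaurus = Thesaurus()
neoplasm = Concept("X001", "Neoplasm", {"Tumor", "Tumour"})
carcinoma = Concept("X002", "Carcinoma")
carcinoma.relate("is_a", neoplasm.code)
thesaurus.add(neoplasm)
thesaurus.add(carcinoma)
```

Looking up "Tumor", "tumour", or "Neoplasm" resolves to the same Concept object, which is the synonymy the slide refers to.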
Common Data Elements
• Structured data reporting elements (e.g. LOINC)
• ISO 11179 compliant
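In ISO/IEC 11179 terms, a data element pairs a data element concept (an object class plus a property) with a value domain. A minimal Python sketch of that shape follows; the class names, the `public_id` value, and the permissible values are hypothetical and are not taken from the caDSR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str   # the thing being described, e.g. "Patient"
    prop: str           # the characteristic being recorded, e.g. "Sex"


@dataclass(frozen=True)
class ValueDomain:
    datatype: str
    permissible_values: frozenset   # empty set means non-enumerated

    def accepts(self, value) -> bool:
        """Enumerated domains accept only their listed values."""
        if self.permissible_values:
            return value in self.permissible_values
        return True


@dataclass(frozen=True)
class CommonDataElement:
    public_id: str                  # hypothetical identifier
    concept: DataElementConcept
    value_domain: ValueDomain

    def validate(self, value) -> bool:
        return self.value_domain.accepts(value)


patient_sex = CommonDataElement(
    public_id="CDE-EXAMPLE-1",
    concept=DataElementConcept(object_class="Patient", prop="Sex"),
    value_domain=ValueDomain(
        datatype="string",
        permissible_values=frozenset({"Female", "Male", "Unknown"}),
    ),
)
```

With this structure, `patient_sex.validate("Female")` passes while a local abbreviation such as `"F"` is rejected, which is the point of reporting against a shared, structured element.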
Biomedical Information Objects
• Computer model of a biomedical object (“Plato’s Forms”)
  - capture properties of object
  - can be joined together to make complex systems
  - isolate data from data source
  - isolate applications from data
• Examples:
  - HL7-RIM
  - MAGE-OM
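The isolation described on this slide can be sketched in Python: the application works only with a domain object, and each data source hides its own storage format behind a common interface. The `Gene` class, the source classes, and the `describe` function are hypothetical names chosen for illustration and are not part of HL7-RIM, MAGE-OM, or caBIO.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Domain object: captures properties, independent of where data lives."""
    symbol: str
    chromosome: str


class GeneSource(ABC):
    """Interface the application sees; concrete sources hide their format."""

    @abstractmethod
    def find_by_symbol(self, symbol: str):
        """Return a Gene, or None if the symbol is unknown."""


class DictGeneSource(GeneSource):
    """Source backed by a list of dictionaries (e.g. rows from a database)."""

    def __init__(self, rows):
        self._rows = {row["symbol"]: row for row in rows}

    def find_by_symbol(self, symbol):
        row = self._rows.get(symbol)
        return Gene(row["symbol"], row["chrom"]) if row else None


class TabDelimitedGeneSource(GeneSource):
    """Source backed by tab-delimited text (e.g. a flat file)."""

    def __init__(self, text):
        self._text = text

    def find_by_symbol(self, symbol):
        for line in self._text.splitlines():
            name, chrom = line.split("\t")
            if name == symbol:
                return Gene(name, chrom)
        return None


def describe(source: GeneSource, symbol: str) -> str:
    """Application code: unchanged regardless of which source is plugged in."""
    gene = source.find_by_symbol(symbol)
    if gene is None:
        return "not found"
    return f"{gene.symbol} on chromosome {gene.chromosome}"


from_rows = DictGeneSource([{"symbol": "TP53", "chrom": "17"}])
from_file = TabDelimitedGeneSource("TP53\t17\nBRCA2\t13")
```

Calling `describe(from_rows, "TP53")` and `describe(from_file, "TP53")` yields the same answer, so swapping the data source does not touch the application.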
Standards Supporting Infrastructure
• Enterprise Vocabulary Services (EVS)
• cancer Data Standards Repository (caDSR)
• cancer Bioinformatics Infrastructure Objects (caBIO)
[Figure: architecture diagram. Tiers: Data, Object, Presentation, Client. Labels: NCI, UCSC, files, other data, others [caBIG]; data access objects, object managers, domain objects; RMI; web server (Tomcat, servlets, JSPs); SOAP; UI beans; XML; XSL/XSLT; HTML/XML clients (browsers); SOAP clients; Java applications]
Prototype GRID: Core architecture extension
[Figure: layered diagram spanning Client, Grid, and Data Source. Labels: caBIO client; caBIO server; Globus; OGSA-DAI; caGRID extensions (metadata, caBIO adapter, query, Concept Discovery, Federated Query, Integration of Discovery and Query Services)]
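The discovery and federated query extensions named on this slide can be pictured, very loosely, with the Python sketch below: each node advertises metadata about the concepts it serves, discovery selects the relevant nodes, and the query is fanned out and merged with provenance. Every name here (`Node`, `discover`, `federated_query`, the node names and records) is hypothetical; this is not the caGRID, Globus, or OGSA-DAI API.

```python
class Node:
    """A participating data source that advertises what it can answer."""

    def __init__(self, name, concepts, records):
        self.name = name
        self.concepts = set(concepts)   # advertised metadata
        self.records = list(records)

    def query(self, predicate):
        """Run the query locally and return the matching records."""
        return [record for record in self.records if predicate(record)]


def discover(nodes, concept):
    """Discovery step: keep only nodes whose metadata mentions the concept."""
    return [node for node in nodes if concept in node.concepts]


def federated_query(nodes, concept, predicate):
    """Fan the query out to discovered nodes and merge results with provenance."""
    results = []
    for node in discover(nodes, concept):
        for record in node.query(predicate):
            results.append({"node": node.name, **record})
    return results


nodes = [
    Node("center_a", {"gene_expression"}, [{"gene": "G1", "level": 8.2}]),
    Node("center_b", {"gene_expression"}, [{"gene": "G1", "level": 7.9},
                                            {"gene": "G2", "level": 1.1}]),
    Node("center_c", {"tissue"}, [{"gene": "G1", "level": 0.0}]),
]

hits = federated_query(nodes, "gene_expression", lambda r: r["gene"] == "G1")
```

Only the two nodes that advertise `gene_expression` are queried, and each merged record carries the name of the node it came from.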
NCICB applications:
• clinical trials support - C3DS
• molecular pathology - caArray
• cancer images - caImage
• pre-clinical models - caModelsDb
• laboratory support - caLIMS
• Data system for the conduct of clinical trials in CCR that is generalizable to academic environments.
• Components:
  - C3D (Cancer Central Clinical Database)
    • Primary data capture by protocol
  - C3PR (Cancer Central Clinical Participant Registry)
    • Central registration of participants across protocols
  - C3PA (Cancer Central Clinical Protocol Administration)
    • Scientific management system for clinical protocols
  - C3TR (Cancer Central Clinical Tissue Repository)
    • Tissue repository
  - C3DW (Cancer Central Clinical Data Warehouse)
    • De-identified patient information accessed via caBIO
C3DS Data Flow
[Figure: data-flow diagram. Systems: C3D (Clinical Data Management), C3PR (Participant Registry), C3PA (Protocol Admin), C3TR (Tissue Repository), C3DW (Clinical Data Warehouse), Adverse Event Reporting System, legacy DBs, source DBs, FDA, sponsors. Flow labels: AE details, AE reports, protocol reporting requirements, protocol accruals/approvals, active protocol data, non-accruing protocol data, patient specimens, patient demographics, patient details, de-identified patient details, enroll patients, batch load clinical (lab) data, periodic updates, periodic downloads]
Electronic data capture for clinical research
• WWW accessible
• Reusable eCRF library built using NCI CDEs
• Electronic regulatory reporting
Image Portal
• The NCICB has developed an image portal to allow researchers to search for mouse and human images and annotations
  - Human and mouse images and annotations were provided by the MMHCC
Pathway Database
• Enhance value of imperfect, but available, pathway knowledge
• Make biological assumptions explicit
• Combine sources of data (e.g. KEGG, BioCarta, ...)
• Merge data from separate pathways
• Build a causal framework to support (future) quantitative simulation/analysis
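Combining sources and merging separate pathways can be pictured as building one directed graph whose edges remember where they came from. The Python sketch below is an assumption-laden illustration: the function names, the source labels, and the placeholder molecules A to D are hypothetical, and no real KEGG or BioCarta content or file format is used.

```python
from collections import defaultdict


def merge_pathways(*sources):
    """Merge interactions from several sources into one edge set.

    Each source is (source_name, [(upstream, downstream, effect), ...]).
    The result maps each distinct interaction to the set of sources that
    assert it, which keeps the underlying assumptions explicit.
    """
    merged = defaultdict(set)
    for source_name, interactions in sources:
        for upstream, downstream, effect in interactions:
            merged[(upstream, downstream, effect)].add(source_name)
    return dict(merged)


def downstream_of(merged, start):
    """Return every node reachable from `start` along merged interactions."""
    graph = defaultdict(set)
    for upstream, downstream, _effect in merged:
        graph[upstream].add(downstream)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


# Placeholder molecules A-D stand in for real pathway members.
source_one = ("source_one", [("A", "B", "activates"), ("B", "C", "activates")])
source_two = ("source_two", [("B", "C", "activates"), ("C", "D", "inhibits")])

merged = merge_pathways(source_one, source_two)
reachable = downstream_of(merged, "A")
```

Neither source alone connects A to D; the merged graph does, and the shared B to C interaction is recorded once with both sources as provenance.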
Cancer Biomedical Informatics Grid (caBIG)
• Common, widely distributed infrastructure permits cancer research community to focus on innovation
• Shared vocabulary, data elements, data models facilitate information exchange
• Collection of interoperable applications developed to common standard
• Raw published cancer research data is available for mining and integration
caBIG will facilitate sharing of infrastructure, applications, and data
caBIG action plan
• Establish pilot network of Cancer Centers
  - Groups agreeing to caBIG principles
  - Mixture of capabilities
  - Mixture of contributions
• Expanding collection of participants
• Establish consortium development process
  - Collecting and sharing expertise
  - Identifying and prioritizing community needs
  - Expanding development efforts
• Moving at the speed of the internet…
Three Domain Workspaces and two Cross Cutting Workspaces have been launched during the Pilot phase
DOMAIN WORKSPACE 1: Clinical Trial Management Systems
addresses the need for consistent, open and comprehensive tools for clinical trials management.
DOMAIN WORKSPACE 2: Integrative Cancer Research
provides tools and systems to enable integration and sharing of information.
DOMAIN WORKSPACE 3: Tissue Banks & Pathology Tools
provides for the integration, development, and implementation of tissue and pathology tools.
CROSS CUTTING WORKSPACE 1: Vocabularies & Common Data Elements
responsible for evaluating, developing, and integrating systems for vocabulary and ontology content, standards, and software systems for content delivery.
CROSS CUTTING WORKSPACE 2: Architecture
developing architectural standards and architecture necessary for other workspaces.
caBIG deliverables
• Componentized, standards-based Clinical Trials Management System
  - e-IND filing/regulatory reporting with FDA
  - Electronic management of trials
  - Integration of diverse trials
• Tissue Management System
  - Systematic description and characterization of tissue resources
  - Ability to link tissue resources to clinical and molecular correlative descriptions
• “Plug and Play” analytic tool set
• Diverse library of raw, structured data
Cancer Molecular Analysis Project (CMAP): a prototypic biomedical data integration effort
[Figure: CMAP data integration diagram. Layers: biomedical objects, common data elements, controlled vocabulary. Sections: Profiles, Targets, Agents, Clinical Trials. Sources: CGAP, NCBI, UCSC (via DAS), BioCarta, KEGG, Gene Ontologies, CTEP clinical trials, CGAP gene expression, NCI drug screening]
caBIG community contributions
• Infrastructure
  - Ontologies
  - Databases
• Applications
  - Clinical trials support
  - Analytic tools
  - Data mining
• Data
  - Trials
  - Experimental outcomes
    • Genomic
    • Microarray
    • Proteomic
acknowledgements
• NCICB
  - Peter Covitz
  - Sue Dubman
  - Carl Schaefer
  - Mervi Heiskanen
  - Denise Hise
  - Kotien Wu
  - Fei Xu
  - Ulli Wagner
  - Frank Hartel
  - Sheri De Coronado
  - Gilberto Fragoso
• LPG/CCR
  - Michael Edmundson
  - Bob Clifford
  - Cu Nguyen
http://ncicb.nci.nih.gov
http://cmap.nci.nih.gov
http://caBIG.nci.nih.gov