The caBIG™ Enterprise
J. Robert Beck, M.D., Chief Academic Officer
Fox Chase Cancer Center, Philadelphia, USA
November, 2007
Cancer as a Complex Adaptive System
[Figure: a base state progresses to a malignant state through repeated rounds of mutation and selection. Other labels: genetic constitution; chemical, virus, hormone, nutrition; immune, hormone, nutrition, (treatment).]
Molecular Medicine: A Complex Continuum
[Figure labels: Clinical Research; Pathology; Molecular Biology; Imaging; Molecular Medicine]
caBIG™ and Molecular Medicine: The People
• Geneticists
• Molecular Biologists
• Lab Technicians
• Radiologists
• Trial Managers
• Clinicians
• MRI Technicians
• Pathologists
caBIG™ and Molecular Medicine: The Activities
• SNP Identification
• Clinical Data Correlation
• Expression Analysis
• Tissue Banking
• Study Creation
• Patient Enrollment
• Clinical Data Collection
• Image Sharing & Analysis
caBIG™ and Molecular Medicine: The Software Tools
[Figure labels: Translational research tools; caIntegrator; caArray; geWorkbench; caTissue; PSC; C3PR; C3D; NCIA]
caBIG™ and Clinical Trials
Sample capabilities and tools:
• Adverse event management (caAERS)
• Clinical data exchange (caXchange)
• Study participant calendar (PSC)
• Study participant registry (C3PR)
• Virtual clinical data warehouse (CTODS)
• caBIG™-compatible systems architecture [caGrid]
• Integration with caBIG™-compatible data management systems
caBIG™ In Action (caXchange) Extract Data from Hospital Clinical Chemistry Lab; Lab Viewer Marks Out-of-Range Value in Red
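The out-of-range marking described on this slide can be sketched in a few lines. This is a minimal illustration only, not caXchange or Lab Viewer code: the `LabResult` type, the `render` helper, and the reference ranges are all hypothetical, and the ranges are placeholders rather than clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabResult:
    """One lab value with its reference range (hypothetical structure)."""
    test: str
    value: float
    low: float   # lower bound of the reference range, inclusive
    high: float  # upper bound of the reference range, inclusive


def is_out_of_range(result: LabResult) -> bool:
    """True when the value falls outside the inclusive reference range."""
    return not (result.low <= result.value <= result.high)


def render(results):
    """Return display rows; out-of-range rows carry a 'RED' marker."""
    return [
        (r.test, r.value, "RED" if is_out_of_range(r) else "")
        for r in results
    ]


# Illustrative values only; the ranges are assumptions, not clinical advice.
results = [
    LabResult("sodium", 140.0, 135.0, 145.0),
    LabResult("potassium", 6.1, 3.5, 5.0),
]
rows = render(results)
for row in rows:
    print(row)
```

Keeping the range check separate from the rendering step means the same rule could drive a viewer, a report, or an alert.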
caBIG™ In Action Clinical Trials Case Studies
Clinical trials data collection for cancer clinical trials (Case Study B)
Organizations:
• Duke Comprehensive Cancer Center
• Lombardi Comprehensive Cancer Center at Georgetown University
caBIG™ resources:
• Cancer Center Clinical Database (C3D)
• Cancer Central Clinical Participant Registry (C3PR)
• Cancer Data Standards Repository (caDSR)
Results:
• Decreased protocol set-up time
• Improved speed and quality of data collection
• Reuse of standard forms and best practices
• Decreased time/effort invested in study design, procedure programming, and data extraction
• Certification, validation, and full audit trails to address regulatory requirements
caBIG™ In Action Clinical Trials Case Studies
caBIG™ tools and infrastructure that enable translational medicine research (Case Study C)
Organizations:
• Duke Comprehensive Cancer Center
• SemanticBits LLC
caBIG™ resources:
• Cancer Translational Research Informatics Platform (caTRIP)
• Cancer Text Information Extraction System (caTIES)
• caTissue Core
• Cancer Annotation Engine (CAE)
• caIntegrator
Results:
• More efficient and user-friendly way to query data from existing patients with similar characteristics to find successful treatments
• Improved ability to investigate associations between multiple predictors and their corresponding outcomes
• More efficient searches for available tumor tissues
caBIG™ and Life Sciences
Sample capabilities and tools:
• Biobanking management systems (caTissue Core)
• Virtual clinical data warehouse (CTODS)
• Genome-wide data management system (caGWAS)
• In vivo image repository (NCIA)
• Microarray data management system (caArray)
• Microarray gene expression and sequence data management (geWorkbench)
• caBIG™-compatible systems architecture (caGrid)
What is “systems medicine”?
• Systems Medicine and Systems Biology are viewed in the scientific community as novel methods of understanding biology and approaching medicine. Systems Biology seeks to integrate different levels of information…
-Institute for Systems Medicine
• Systems biology is a relatively new biological study field that focuses on the systematic study of complex interactions in biological systems, thus using a new perspective (integration instead of reduction). According to the interpretation of System Biology as the ability to obtain, integrate and analyze complex data from multiple experimental sources using interdisciplinary tools, some typical technology platforms are:…
-Wikipedia
A systems biology real-life example
• Does epidermal growth factor receptor variant III (EGFR vIII) status define clinically distinct subtypes of glioblastoma multiforme?
• What are the gene expression levels in this patient cohort (n = 268)?
• What percentage of patients show the vIII mutation?
• Can vIII predict response to standard therapies like erlotinib and gefitinib?
• What does the survival analysis look like when patients are stratified by EGFR vIII status?
• How many patients show amplification, upregulation in expression, and variant III deletion?
• Do MR images from vIII-positive patients differ from the vIII-negative series?
• How many patients in this cohort fall under the 6 classes in the recursive partitioning analysis (RTOG-RPA)?
Source: J. Clinical Oncology 2007 Jun 1; 25(16): 2288-94
Realize scientific discovery with caBIG tools
• What are the gene expression levels in this patient cohort (n = 268)?
• What percentage of patients show the vIII mutation?
• Can vIII predict response to standard therapies like erlotinib and gefitinib?
• How many patients in this cohort fall under the 6 classes in the recursive partitioning analysis (RTOG-RPA)?
• How many patients show amplification, upregulation in expression, and variant III deletion?
• What does the survival analysis look like when patients are stratified by EGFR vIII status?
• Do MR images from vIII-positive patients differ from the vIII-negative series?
Survival analysis (caIntegrator)
[Figure: survival curves for cases with mutation and cases without mutation]
Similar charts can be plotted for treatment groups
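A stratified survival analysis of the kind shown on this slide can be sketched with a plain Kaplan-Meier estimator. This is an illustration under stated assumptions, not how caIntegrator computes its plots: the cohort below is invented, and the function names (`kaplan_meier`, `stratified_curves`) are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(t, S(t))] at each distinct event time.

    times:  follow-up durations (e.g. months)
    events: 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Everyone still under observation at time t is at risk.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


def stratified_curves(cohort):
    """Split (time, event, viii_positive) records by status and fit each."""
    curves = {}
    for status in (True, False):
        group = [(t, e) for t, e, s in cohort if s == status]
        curves[status] = kaplan_meier(
            [t for t, _ in group], [e for _, e in group]
        )
    return curves


# Invented records: (months of follow-up, event observed, vIII positive).
cohort = [
    (5, 1, True), (8, 1, True), (12, 0, True), (14, 1, True),
    (6, 1, False), (15, 1, False), (20, 0, False), (24, 1, False),
]
curves = stratified_curves(cohort)
pct_positive = 100.0 * sum(1 for _, _, s in cohort if s) / len(cohort)
print(f"vIII positive: {pct_positive:.0f}%")
for status, curve in curves.items():
    print("vIII positive" if status else "vIII negative", curve)
```

The same split-then-fit pattern applies to treatment groups: replace the vIII flag with a treatment label and fit one curve per label.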
Look up the tools from the JCO example
Tool          Membership (Bundle/WS)  Version   URL
caArray       LSD/ICR                 2.0 beta  https://array.nci.nih.gov/
C3D           CCTS/CTMS               4.5.2     https://cabig.nci.nih.gov/tools/c3d/
J-Review      CTMS                    8.0       https://octrials-rpt.nci.nih.gov/jreviewwww/sample_default.htm
caIntegrator  LSD/ICR                 1.2       http://caintegrator-info.nci.nih.gov and https://cabig.nci.nih.gov/tools/caIntegrator
GenePattern   ICR                     3.0       https://cabig.nci.nih.gov/tools/GenePattern/
geWorkbench   ICR                     1.0.6     https://cabig.nci.nih.gov/tools/geWorkbench
CGWB          ICR                     2.0       http://cgwb.nci.nih.gov/
Distant Past
• Translational research in the distant past was plagued by:
  • Siloed development within and across individual studies
  • Integrative analysis performed in MS Excel, resulting in increased time and cost to validate trial outcome
  • Lack of structured data sharing, inhibiting improvements to patient care, outcome, and ongoing trials
[Figure: Breast Cancer Study and Lung Cancer Study shown with Clinical Data, Genomic Data, Analytical Results, and Publications; further labels: Epidemiology Data, SNP Data, Methylation Data]
Current Translational Research – pt. A
• Current translational research involves:
  • Interoperable caBIG solutions that enable data integration and sharing
  • Customizations of the common framework to accommodate unique study needs
• Current translational research challenges:
  • There are still siloed systems that support local studies
  • Utility tools are needed to map legacy data and develop a roadmap for caBIG compatibility
[Figure: three sites (Columbia Cancer Center; UCSC Spore-ISPY hosted at NCI; lung study at Center XYZ), each labeled with caBIG-compatible APIs. Further labels: caBIG-compatible tools; Breast Cancer Study; Lung Cancer Study; Clinical Data; Genomic Data; Epidemiology Data; SNP Data; Methylation Data]
Current Translational Research: caIntegrator Architecture
[Figure: three-tier architecture diagram. Tiers: Presentation Tier; Business Tier; Database/Analysis Tier. Labels: Client Browser; Internet; Web Server (JBoss/Tomcat); Report Generation (XML/XSL); Asynchronous Updates (AJAX); Query Builder (Struts); Presentation Cache (Ehcache); App State; Web Visualization/Analysis Tools (WebGenome, GenePattern); Service Layer (J2EE); Findings Factory; Business Cache (Ehcache); Security Manager (CSM); DTOs; Bioassay DTOs; Analysis Server Client Manager (JMS Node); JMS (Asynchronous); Multi-Threaded Query Service (OJB/Hibernate); Object Query Service; Study Query Service; Analytical Query Service; DOs; caIntegrator Data Warehouse; Analysis Server (R, R-Binary); Remote Service (EJB Container); BioAssay Service]
So, where do we want to go – point B
• Next generation translational research requires:
  • Extraction of trends/patterns from HTP data
  • Support for handling high volume data sets
  • Integration with disparate data sources
  • Support for multi-dimensional complex queries and robust analytical routines
  • Data summarization
  • Advanced visualization
• Next generation translational research expands upon the needs of current efforts and requires:
  • Interoperability
  • Modularization enabling plug and play
  • Standards adoption where appropriate
What will take us from point A to point B
• caBIG software that supports TR
• DSIC guidance/policies for TR
• TR community: cancer centers, SPOREs, CTSAs, IPBS, industry…
• Support network: knowledge centers, service providers, program offices
• FDA: regulatory, IOTF/OBQI
• Standards organizations: HL7, CDISC
Let’s put the pieces of the puzzle together
• ICR workspace calls – biweekly
• Task-oriented working groups – monthly
  https://cabig.nci.nih.gov/workspaces/ICR/General_Meeting_Schedule/
• New task forces of SMEs being established in EY2 to drive use-case development for next-gen integrative tools
• caBIG listservs: https://list.nih.gov/cgi-bin/wa?SUBED1=cabig_ICR-l&A=1
• caBIG getting connected: https://cabig.nci.nih.gov/getting_connected/working_with_cabig/
Life Sciences Distribution Bundle
• The Life Sciences Distribution Bundle brings together a range of caGrid-interfaced tools that support biomedical informatics
• Functions include:
  • Tissue Banking (caTISSUE)
  • Gene Expression Database (caArray)
  • Translational Medicine tools (caIntegrator)
  • Biomedical Image Management and Analysis (NCIA)
  • Molecular analysis (geWorkbench)
  • …and the supporting caGrid infrastructure…
• Target release in Feb, 2008
ICR Products by Category
Capture data and annotation Analyze data
Link data and analysis tools
Store findings
caArray GenePattern caB2B Rembrandt
CPAS geWorkbench caTRIP TARGET
caBIO, GeneConnect, caFE, TrAPSS
webGenome EAGLE
caNanoLab, ProtLIMS
Bioconductor CGEMS
caELMIR GOMiner caMOD
gridPIR DWD, VISDA, RProteomics
cPath, Reactome Cytoscape
https://cabig.nci.nih.gov/workspaces/ICR
Translational research website
http://ncicb.nci.nih.gov/NCICB/tools/translation_research
caBIG™ Vision
• Connect the cancer research community through a shareable, interoperable infrastructure
• Deploy and extend standard rules and a common language to more easily share information
• Build or adapt tools for collecting, analyzing, integrating and disseminating information associated with cancer research and care
The caBIG™ Pilot Phase
• An unprecedented effort to connect people, organizations, and data throughout the cancer research community
• 190 participating organizations
• 300 software components
• 40+ end-user applications in discovery, clinical trials management, biospecimen management, etc.
• caGrid providing data transmission network that “connects” everyone
• 43 Cancer Centers actively participating in caBIG™ deployment program
• 45+ peer-reviewed publications about caBIG™
caBIG™ Pilot Goals
Illustrate that Cancer Centers with varying needs and capabilities can be joined in a common grid of communications, shared data, applications, and technologies
Demonstrate that Cancer Centers, in collaboration with NCI, will develop new enabling tools and systems that could support multiple Cancer Centers
Create an extensible infrastructure that will continue to be expanded and extended to members of the cancer research community
Demonstrate that Cancer Centers will actively use the grid and realize greater value in their cancer research endeavors by using the grid
caBIG™ In Action Life Sciences Case Studies
Biospecimen management for multi-institutional collaborative research activities (Case Study A)
Organization:
• Inter-SPORE Prostate Biomarker Study (IPBS)
caBIG™ resource:
• caTissue Suite
Results:
• Are capturing biospecimen and biomarker data in a decentralized way with caTissue Suite
• Data migration plan developed to load all legacy data into caTissue Suite
• Queries conducted quickly and securely across all 11 centers participating in the IPBS study
caBIG™ In Action Life Sciences Case Studies
caBIG™ tools and infrastructure that support genome-wide association studies [GWAS] (Case Study E)
Organization:
• NCI Cancer Genetic Markers of Susceptibility (CGEMS) project
caBIG™ resources:
• caGWAS
• caIntegrator
Results:
• Improving collaboration
• Providing infrastructure for better data management, analysis, and communication
• Developing commitment to sharing information and developing data standards
Alabama
• Birmingham: UAB Comprehensive Cancer Center
Arizona
• Phoenix: Translational Genomics Research Institute
• Tucson: University of Arizona
California
• Berkeley: University of California Lawrence Berkeley National Laboratory University of California at Berkeley
• Los Angeles: AECOM California Institute of Technology University of Southern California Information Sciences Institute University of California at Irvine The Chao Family Comprehensive Cancer Center
• La Jolla: The Burnham Institute
• Sacramento: University of California Davis Cancer Center
• San Diego: SAIC
• San Francisco: University of California San Francisco Comprehensive Cancer Center
Colorado
• Aurora: University of Colorado Cancer Center
District of Columbia
• Department of Veterans Affairs Lombardi Cancer Research Center - Georgetown University Medical Center
Florida
• Tampa: H. Lee Moffitt Cancer Center at the University of South Florida
Hawaii
• Manoa: Cancer Research Center of Hawaii
Illinois
• Argonne: Argonne National Laboratory
• Chicago: Robert H. Lurie Comprehensive Cancer Center of Northwestern University University of Chicago Cancer Research Center
• Urbana-Champaign: University of Illinois at Urbana-Champaign
Indiana
• Indianapolis: Indiana University Cancer Center Regenstrief Institute, Inc.
Iowa
• Iowa City: Holden Comprehensive Cancer Center at the University of Iowa
Louisiana
• New Orleans: Tulane University School of Medicine
Maine
• Bar Harbor: The Jackson Laboratory
Maryland
• Baltimore: The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University
• Bethesda: Consumer Advocates in Research and Related Activities (CARRA) NCI Cancer Therapy Evaluation Program NCI Center for Bioinformatics NCI Center for Cancer Research NCI Center for Strategic Dissemination NCI Division of Cancer Control and Population Sciences NCI Division of Cancer Epidemiology and Genetics NCI Division of Cancer Prevention NCI Division of Cancer Treatment and Diagnosis Terrapin Systems
• Rockville: Capital Technology Information Services Emmes Corporation Information Management Services, Inc.
Massachusetts
• Cambridge: Akaza Research Massachusetts Institute of Technology
• Somerville: Panther Informatics
Michigan
• Ann Arbor: Internet2 University of Michigan Comprehensive Cancer Center
• Detroit: Meyer L. Prentis/Karmanos Comprehensive Cancer Center
Minnesota
• Minneapolis: University of Minnesota Cancer Center
• Rochester: Mayo Clinic Cancer Center
Nebraska
• Omaha: University of Nebraska Medical Center/Eppley Cancer Center
New Hampshire
• Lebanon: Dartmouth College Dartmouth-Hitchcock Medical Center
New York
• Buffalo: Roswell Park Cancer Institute
• Bronx: Albert Einstein Cancer Center
• Cold Spring Harbor: Cold Spring Harbor Laboratory
• New York: Herbert Irving Comprehensive Cancer Center Columbia University Memorial Sloan-Kettering Cancer Center New York University Medical Center
• White Plains: IBM
North Carolina
• Chapel Hill: University of North Carolina Lineberger Comprehensive Cancer Center
• Raleigh-Durham: Alpha-Gamma Technologies, Inc. Constella Health Sciences Duke Comprehensive Cancer Center
Ohio
• Cleveland: Case Comprehensive Cancer Center
• Columbus: Ohio State University Comprehensive Cancer Center
Oregon
• Portland: Oregon Health & Science University
Pennsylvania
• Philadelphia: Drexel University Fox Chase Cancer Center Kimmel Cancer Center at Thomas Jefferson University Abramson Cancer Center of the University of Pennsylvania
• Pittsburgh: University of Pittsburgh Cancer Institute
Tennessee
• Memphis: St. Jude’s Children’s Research Hospital
Texas
• Austin: 9 Star Research
• Houston: M.D. Anderson Cancer Center
Virginia
• Fairfax: SRA International
• Reston: Scenpro
Washington
• Seattle: DataWorks Development, Inc. Fred Hutchinson Cancer Research Center
International
• Paris, France: Sanofi Aventis
Collaboration Is Central
Data Sharing and Security
Sample resources:
• caBIG™ Policies
• Processes and Best Practices
• Model Documents
caBIG™ In Action Data Sharing and Security
Policies and procedures to support and enable meaningful data sharing and cooperation (Case Study D)
Organizations:
• Cancer Center Representatives
caBIG™ resources:
• Data Sharing and Intellectual Capital Workspace (DSIC)
• Data Sharing and Security Framework
Results:
• Diverse groups of Cancer Center representatives are working together with government, academic, and commercial groups.
• Are identifying processes to address legal, privacy, and regulatory issues that arise from collaboration and data sharing
caBIG™ Vision for 2010
• All comprehensive and community cancer centers are connected
• Data is being shared
• All multi-center clinical cancer trials are connected to each other electronically and to the FDA for reporting
• Institutions are collaborating and publishing studies with data they are sharing through caBIG™
caBIG™ - The Enterprise Phase
Connect all biomedical researchers
Increase speed and volume of data aggregation and dissemination
Grow the community in breadth and scope
Scalable national infrastructure for Molecular Medicine
Strategies for Increased Adoption
• Enterprise Adopter Program
• Service Providers
• Knowledge Centers
• Program Offices
Future of caBIG™
caBIG™ infrastructure and tools may link biomedical community globally.
caBIG™ capabilities may be integrated into health IT.
caBIG™ may serve as a model for other disease research and biomedical endeavors.
caBIG Deliverables: Architecture
• caBIG™ Compatibility Guidelines
• caGrid 0.5 Security White Paper
• caGrid Software Version 0.5
• caGrid 1.0
• Technology Evaluation White Paper
• caBIG™ - The Security White Paper (Technology Evaluation)
• Workflow Language Recommendations White Paper
• ID Management White Paper
• Common Query Language White Paper
• The Architecture Cross-Cutting Workspace provides for the development of the underlying standards used by the program, and ensures that common mechanisms are used throughout the caBIG™ community via mentoring, white papers and a structured review process.
caBIG Deliverables: Vocabularies and Common Data Elements
• LexGrid
• CDE Governance Model
• VCDE Guidance Mentoring Teams
• Vocabularies Deployment Document
• Data Standards Approval Guidelines
• Procedures for the Review and Approval of New VCDE Content
• Mouse/Human Anatomy Ontology Mapping
• Nutrition Ontology
• The Vocabularies and Common Data Elements Cross-Cutting Workspace provides for the development of the underlying data elements and vocabularies used by the program, and ensures that common mechanisms are used throughout the caBIG™ community via mentoring, white papers and a structured review process.
• Community driven
• Dynamic implementation
• Built to be upgraded as standards “harden”, and domains expand
Standards-based interoperability: the cancer common object resource environment (caCORE)
biomedical objects
common data elements
controlled vocabulary
Standards infrastructure and services
• Enterprise Vocabulary Services (EVS)
  • Browsers
  • APIs
• cancer Bioinformatics Infrastructure Objects (caBIO)
  • Applications
  • APIs
• cancer Data Standards Repository (caDSR)
  • CDEs
  • Case Report Forms
  • Object models
  • ISO 11179 model
• Developer Toolkits
  • caCORE SDK
  • caAdapter
caGrid
• Grid Infrastructure for caBIG
• caGrid Components
  • Language (metadata, ontologies)
  • Security
  • Advertisement and Discovery
  • Workflow
  • Grid Service Graphical Development Toolkit
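The "Advertisement and Discovery" component can be pictured as a registry that services publish to and clients search. The sketch below is a toy in-memory model, not the caGrid Index Service API: `ServiceAd`, `IndexService`, the concept tags, and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceAd:
    """Metadata a service publishes about itself (hypothetical fields)."""
    name: str
    url: str
    kind: str            # "data" or "analytical"
    concepts: frozenset  # vocabulary concepts the service exposes


class IndexService:
    """Toy registry: services advertise, clients discover by metadata."""

    def __init__(self):
        self._ads = {}

    def advertise(self, ad: ServiceAd) -> None:
        # Re-advertising the same URL replaces the earlier entry.
        self._ads[ad.url] = ad

    def discover(self, kind=None, concept=None):
        """Return matching ads sorted by name; None means 'any'."""
        hits = [
            ad for ad in self._ads.values()
            if (kind is None or ad.kind == kind)
            and (concept is None or concept in ad.concepts)
        ]
        return sorted(hits, key=lambda ad: ad.name)


index = IndexService()
index.advertise(ServiceAd("ArrayData", "https://example.org/array",
                          "data", frozenset({"Gene", "Microarray"})))
index.advertise(ServiceAd("TissueData", "https://example.org/tissue",
                          "data", frozenset({"Specimen"})))
index.advertise(ServiceAd("Clustering", "https://example.org/cluster",
                          "analytical", frozenset({"Gene"})))

gene_services = [ad.name for ad in index.discover(concept="Gene")]
data_services = [ad.name for ad in index.discover(kind="data")]
print(gene_services, data_services)
```

Searching by shared vocabulary concepts rather than by service name is what lets a client find services it has never seen before.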
[Figure: research centers and NCICB (caCORE: caBIO, caDSR, EVS) connected over the grid. Data source labels: repositories; Data Mart; Gene Expression Data; Clinical Data; Tissue Bank; Proteomics Data; Genomics Data; Analysis Tools. Grid service labels: Data Services; Analytical Services; Annotation Services; Service Advertisement; Service Discovery; Service Query; Semantic Mapping; Security Services. User labels: Researcher; Physician; Patient.]
caGrid 1.0 Security Needs
• Authentication
  • Process of determining whether someone or something is, in fact, who or what it is declared to be.
• Authorization
  • Process of determining if an authenticated user may do something on a given resource.
  • Can User X perform Operation Y on Resource Z?
• Trust Management
  • Supports applications and services in deciding whether or not signers of digital credentials/user attributes can be trusted.
• Secure Communication
  • The ability to guarantee the integrity and/or privacy of messages between two parties
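The difference between authentication and authorization can be made concrete with a small sketch. This is not caGrid or CSM code: the `SecurityManager` class, its method names, and the user, operation, and resource strings are hypothetical, and a password check stands in for the grid credentials a real deployment would use.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class SecurityManager:
    """Toy model: who are you (authenticate), what may you do (authorize)."""

    def __init__(self):
        self._users = {}      # user -> (salt, password hash)
        self._grants = set()  # (user, operation, resource) triples

    def add_user(self, user: str, password: str) -> None:
        salt = os.urandom(16)
        self._users[user] = (salt, hash_password(password, salt))

    def authenticate(self, user: str, password: str) -> bool:
        """Is this caller who they claim to be?"""
        record = self._users.get(user)
        if record is None:
            return False
        salt, expected = record
        return hmac.compare_digest(expected, hash_password(password, salt))

    def grant(self, user: str, operation: str, resource: str) -> None:
        self._grants.add((user, operation, resource))

    def is_authorized(self, user: str, operation: str, resource: str) -> bool:
        """Can User X perform Operation Y on Resource Z?"""
        return (user, operation, resource) in self._grants


manager = SecurityManager()
manager.add_user("user_x", "correct horse")
manager.grant("user_x", "query", "tissue_bank")

ok_login = manager.authenticate("user_x", "correct horse")
bad_login = manager.authenticate("user_x", "wrong")
can_query = manager.is_authorized("user_x", "query", "tissue_bank")
can_delete = manager.is_authorized("user_x", "delete", "tissue_bank")
print(ok_login, bad_login, can_query, can_delete)
```

The two checks are deliberately independent: a caller who authenticates successfully still gets nothing unless a matching grant exists.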
caGrid Trust Management
Grid Trust Service (GTS)
[Figure: authentication and trust flow. Numbered steps shown: 1. Username/Password; 2. SAML Assertion; 3. SAML Assertion; 6. Is Proxy Trusted?; 7. Yes/No. Labels: OSU User; IdP; Ohio State University Certificate Authority; Dorian; Grid; Grid Service; Grid Trust Service; Trust Agreement; Globus Trusted Certificates Directory; Auto Synchronize With GTS.]
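The "Is Proxy Trusted?" question and the "Auto Synchronize With GTS" label suggest a central trust list that each grid service copies locally. The sketch below illustrates only that idea and simplifies heavily: authority names stand in for X.509 certificate chains, and the classes `GridTrustService` and `GridService` and their methods are hypothetical, not the GTS API.

```python
class GridTrustService:
    """Toy central registry of trusted certificate authorities."""

    def __init__(self):
        self._trusted = set()

    def add_authority(self, name: str) -> None:
        self._trusted.add(name)

    def remove_authority(self, name: str) -> None:
        self._trusted.discard(name)

    def trusted_authorities(self) -> frozenset:
        return frozenset(self._trusted)


class GridService:
    """Service that decides trust from a locally synchronized copy."""

    def __init__(self, gts: GridTrustService):
        self._gts = gts
        self._local = frozenset()

    def sync(self) -> None:
        # Stand-in for the periodic synchronization with the GTS.
        self._local = self._gts.trusted_authorities()

    def is_trusted(self, credential_issuer: str) -> bool:
        """Was this credential signed by an authority we trust?"""
        return credential_issuer in self._local


gts = GridTrustService()
gts.add_authority("Ohio State University CA")

service = GridService(gts)
before_sync = service.is_trusted("Ohio State University CA")
service.sync()
after_sync = service.is_trusted("Ohio State University CA")

gts.remove_authority("Ohio State University CA")
stale = service.is_trusted("Ohio State University CA")
service.sync()
after_revocation = service.is_trusted("Ohio State University CA")
print(before_sync, after_sync, stale, after_revocation)
```

The `stale` result shows the trade-off of a local copy: trust decisions are fast and offline, but a revocation only takes effect at the next synchronization.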
A caGrid Illustration: Virtual PACS
• Present a PACS interface to analytical and data sources on the grid.
• Use your own DICOM workstation
• Virtual PACS federates services on the Grid using caGrid
In Vivo Imaging Middleware Project
• Interoperability Library
  • Translates between DICOM and caBIG data models, and between DICOM QR and the caBIG query language
• DICOM Data Service
  • Exposes existing DICOM QR-aware data resources (PACS, etc.) as caGrid-compliant services
• VirtualPACS
  • Allows DICOM-aware clients (review workstation, etc.) to access DICOM caGrid data services over the grid
• caGrid-based security for data transport, authentication, and authorization
gridIMAGE caGrid integration
• Leverages core caGrid services/tools
  • Introduce, caDSR Service, caGrid Data Service, Index Service, Authentication Service
• Leverages In Vivo Imaging Core Middleware
  • DICOM interoperability and Bulk Data Transport via GridFTP
Infrastructure – Today
[Figure: caBIG caGrid 0.5 Test Bed. Labels: Index Service; Pittsburgh; Duke; Georgetown; NCI; caArray; rProteomics; PIR; caTIES; caBIO; GUMS; CAMS; GME]
Standards for vocabularies and common data elements established and housed at NCICB
Sample Applications
Building on a foundation of established infrastructure points
+NCICB housed infrastructure for CDEs, and vocabularies
Grid reference implementations lead the way
Compatibility guidelines and initial compatibility evaluation process for caBIG™ program projects established
A rich set of harmonized standards and vocabularies is available
A group of mentors has been identified to ensure consistency across key projects
Infrastructure – Tomorrow
Instantiated formal process for evaluation and harmonization
Many applications Grid enabled (e.g., GenePattern, Reactome)
A compatibility evaluation process for caBIG™ program projects and a certification process for externally developed tools are established
NCICB housed infrastructure for CDEs, and vocabularies
Increased growth and interoperability of infrastructure; easier to add new tools; workflow support established (+ end-user portal available, security infrastructure)
A rich set of harmonized standards and vocabularies continues to grow in size
Tooling available to provide site-specific vocabularies and ontology management and support
Mentors actively working in caBIG™ Community and beyond to ensure consistency across key projects and adherence to caBIG™ goals
APIs with common interfaces facilitate scientific workflows
Infrastructure – The Future
[Figure: federated grid. Client-side labels: caGrid Client; Research Center; Research Group; Microarray; NCBI; Gene Database; SNPlex; Protein Database; Image; Tool 1; Tool 2; Tool 3. Service labels: caGrid Data Service; caGrid Analytical Service; Query Service. Infrastructure labels: Grid Services Infrastructure (Secure Communication, Service Invocation, Data Transfer); Common Data Types, Terminologies, Ontologies (Common Data Elements, Vocabularies and Ontologies, Schema Management); Advertisement and Discovery (Index Service); Security (GSI, GUMS, CAMS); Query.]
Multiple sites host portions of the federated, scalable, standards-based infrastructure
Functional applications part of standard practice/fully deployed on GRID
NCICB housed infrastructure for CDEs and vocabularies
Broad adoption and independent support, across and beyond the cancer research community; increased growth and interoperability of infrastructure; easier to add new tools (data and services); workflow support expanded; greater interconnectedness and automation
Developed standards increase in number; mechanisms exist for the community to develop and harmonize standards and compatibility guidelines
Certification process for externally developed tools
A rich set of community developed harmonized standards and vocabularies continues to grow
Tooling available to provide site specific vocabularies and ontology management and support
Mentors actively working in caBIG™ Community and beyond to ensure consistency across key projects/adherence to caBIG™ goals
Vocabulary services are federated
caBIG will need commercial developers to take tools out. Examples from CTMS:
• Velos: Comprehensive clinical trials system in widespread use in the extramural Cancer Centers throughout the country.
• PercipEnz: A comprehensive solution for managing all aspects of clinical research: study setup and activation, scientific reviews, subject registration, compliance tracking, visit tracking, data collection, data and safety monitoring, financials management, data extraction, regulatory reporting, and outreach.
• Akaza Research: Web-based, open source software platform for managing multi-site clinical research studies. It facilitates protocol configuration, design of case report forms, electronic data capture, retrieval, and management.
Clinical Research IT Infrastructure
[Figure: data flow diagram. Labels: Clinical Systems (Labs, EMR, Tissue, etc.); De-identification Services; Translation Service; HL7 transactional database; Clinical Research Information Exchange; Clinical Trials (Clinical Data Mgmt, EDC, Adverse Events, Participant Registry, etc.); Research Data Warehouse; External Reporting (FDA, Sponsor, NCI, other); Patient Health Record; Lifecycle Management. Interface labels: HL7/CAM SDK; HL7-v3; HL7-v3, Janus; HL7-v2.x, other.]
The Future
A worldwide biomedical grid community
Bringing translational and clinical research to personalized medicine
Summit Goals
• Initiate a dialogue with decision-makers and strategic thinkers about what they need to further develop caBIG™ tools and services, and/or participate in the caBIG™ enterprise
• Identify key opportunities, issues, and challenges that must be addressed
• Gather ideas about how caBIG™ should be organizationally structured and governed
Summit Agenda
Keynote Address: “The Role of caBIG™ in the Future of Cancer Research”
Dr. John Niederhuber, Director, National Cancer Institute
• Research & Development Track: Discuss drivers and new research models for cancer research, and identify what is needed in biomedical informatics in the near future to support such models.
• Market Opportunities Track: Discuss ways to strengthen and expand the market opportunity for caBIG™-compliant products and services and create a significantly self-sustaining economic system.
• Governance Track: Discuss future models of caBIG™ structure and governance and identify strategies and tactics to drive caBIG™ adoption.
Opening Panel Discussion: “Opportunities and Challenges from Where I Sit”
Summit “Deliverables”
• Identify people and organizations who want to participate in the next generation of caBIG™
• Identify projects and collaborations around caBIG™ adoption
• Advance ideas for expanding caBIG™ to a broader, multi-constituency-based biomedical ecosystem
• Develop and disseminate Executive Summary of ideas, insights, and proposed programs to catalyze future activities among broader constituencies
Measure states indirectly
base state(s) malignant state(s)
Center for Cancer Research, Laboratory of Population Genetics
Mutation status
Allele loss
Constitutional variation
RNA expression
Epigenetic variation
Vision
“When I look into the eyes of a patient losing the battle with cancer, I say to myself, It doesn’t have to be this way.” The Nation’s Investment in Cancer Research (2003)
NCI 2015 challenge goal: eliminate suffering and death due to cancer
A.C. von Eschenbach, M.D.
Former Director, National Cancer Institute
Director, Food & Drug Administration
Informatics tower of Babel
• Each cancer research community speaks its own scientific “dialect”
• Integration is critical to achieving the promise of molecular medicine
Biomedical Informatics and Middleware
[Diagram labels: Grid; Information Integration; Natural Language Processing; Ontologies; Brings in Information; Translates and Integrates Information; Disseminates Information]
The cancer Biomedical Informatics Grid
• Responding to the Vision to reduce the burden of cancer
• Dealing with the problem of massive quantities of data
• Dealing with the distributed nature of cancer research
• Involving
  • Translational research
  • Clinical research
  • Patient advocates
  • Cancer center administration
• caBIG™ is an innovative bioinformatics program at the NIH’s National Cancer Institute
• 50 Cancer Centers are working towards a common goal of integrated data, tools, and methodologies to accelerate cancer research: the cancer Biomedical Informatics Grid (caBIG™), a program of the NCI Center for Bioinformatics (NCICB)
• The goal of caBIG™ is to create a virtual web of interconnected data, individuals, and organizations that will redefine how:
  • research is conducted
  • care is provided
  • patients/participants interact with the biomedical research enterprise
• The principles driving caBIG™ are:
  • Open Source
  • Open Access
  • Open Development
  • Federated Model
caBIG promotes the Vision
“Nearly every facet of NCI’s strategic plan to eliminate suffering and death due to cancer is predicated on the revolutionizing potential of caBIG™.” Cancer Bulletin, 2005
NCI 2015 challenge goal: eliminate suffering and death due to cancer
A.C. von Eschenbach, M.D.
Former Director, National Cancer Institute
Director, Food & Drug Administration
Scenario, 2009
A researcher involved in a phase II clinical trial of a new molecularly targeted therapeutic for brain tumors observes that cancers derived from one specific tissue progenitor appear to be strongly affected. The trial has been generating proteomic and microarray data. The researcher would like to identify potential biochemical and signaling pathways that might be different between this cell type and other potential progenitors in cancer, deduce whether anything similar has been observed in other clinical trials involving agents known to affect these specific pathways, and identify any studies in model organisms involving tissues with similar pathway activity.
[Diagram nodes: Small Molecules; Cell Type; Pathways; Clinical Trials; Therapeutics; Animal Models; Homologous Proteins]
Michael Ochs, 2005
How is such research conducted?
• Today: a lot of manual work finding sources, other groups working on problems, getting data from other sites, re-analyzing, etc.
• With caBIG, much of the work is automated across a data grid, caGrid
  • Security model authenticates and authorizes the investigator
  • Data is made available for translational use
  • Standard tools and architectures exist for analytical flow
caBIG will facilitate sharing of infrastructure, applications, and data across multiple cancer research programs
Discovery utilizing caBIG™ Integrated Cancer Research Tools
[Workflow diagram. Steps shown: Clinical Trials; Tumor Samples (400 brain tumor tissue samples acquired); Pathology reports; Discrete and manual annotation on tissues; Annotation; Mutation identification; Gene expression profiling; Gene annotation; Identify up-regulated genes in specific pathways; Identify recurring promoter elements; Analysis; Potential Drug Targets and Biomarkers. Tools shown: caTissue; caTIES; Clinical Annotation Modules; caArray; Proteomics LIMS; Q5; PIR; Function Express; GenePattern; PromoterDB; Pathways Tool; TrAPSS.]
Thinking about a Solution
A virtual web of interconnected data, individuals, and organizations redefines how:
• research is conducted
• care is provided
• patients/participants interact with the biomedical research enterprise
Goals of the caBIG pilot
• Illustrate that a spectrum of Cancer Centers with varying needs and capabilities can be joined in a common grid of communications, shared data, applications, and technologies
• Demonstrate that Cancer Centers, in collaboration with NCI, will develop new enabling tools and systems that could support multiple Cancer Centers
• Create an extensible infrastructure that will continue to be expanded and extended to members of the cancer research community
• Demonstrate that Cancer Centers will actively use the grid and realize greater value in their cancer research endeavors by using the grid
caBIG™ Pilot action plan
• Establish pilot network of NCI Cancer Centers
  • Groups agreeing to caBIG principles
  • Mixture of capabilities
  • Mixture of contributions
• Expanding collection of participants
• Establish consortium development process
  • Collecting and sharing expertise
  • Identifying and prioritizing community needs
  • Expanding development efforts
• Moving at the speed of the internet…
Inauguration of the caBIG™ pilot
• 61 cancer centers were asked
  • What they could contribute to a biomedical informatics data grid and community initiative
  • What they would need from the grid
• All respondents were visited for clarification and detail
  • “Not a site visit”
  • Most regarded it as a site visit
  • Rumor: 10-12 pilot sites at $500,000/year
• 49 institutions offered contracts for small portions of caBIG pilot
  • Idea to build community with multiple small projects and roles
  • Political resistance to small number of pilot sites
Common needs helped shape priority areas for the caBIG pilot activities
[Bar chart: Number of Needs Reported (axis 0 to 35) by category. Categories: Clinical Data Management Tools; Staff Resources; Distributed Data Sharing/Analysis Tools; Translational Research Tools; Access to Data; Tissue & Pathology Tools; Center Integration & Management; Common Data Elements & Architecture; Meta-Project; Vocabulary & Ontology Tools & Databases; Statistical Data Analysis Tools; Visualization & Front-End Tools; Remote/Bandwidth; Proteomics; Microarray & Gene Expression Tools; Meeting; LIMS; Licensing Issues; Pathways; High Performance Computing; Integration; Imaging Tools & Databases; Database & Datasets. Priority areas labeled on the chart: Clinical Trial Management Systems; Tissue Banks & Pathology; Integrative Cancer Research.]
Cancer Center Roles in caBIG
• Developer (20% of centers)
  • Key is to create an environment for sharing tools with other centers
  • One of the most important issues is not to ignore the need for common data elements and vocabulary services
• Adopter (20% of centers)
  • Key is to understand the needs at local center (and be vocal)
  • Don’t abandon other development efforts; think modular
  • When adopting tools make sure they “talk” to legacy systems
• Working Group & Strategic Planning (60% of centers)
  • These are not “soft” roles
  • Critical to the success of the program
  • White paper development will guide caBIG successes
  • Make sure to communicate internally to all parts of the Cancer Center
This isn’t Rocket Science
• A lot of caBIG™ isn’t even computer science
  • Most industries did much of this years ago
  • Really this is an engineering project…
• But it is hard to achieve; it takes time
• caBIG™’s goal (oversimplified): facilitate the exchange of data useful for cancer research and care
  • Between research domains, systems, investigators, and organizations
  • For instance, the caBIG™ compatibility of a system is determined by how easily the system can exchange data (i.e., interoperability)
Four Domain Workspaces and two Cross Cutting Workspaces have been launched
DOMAIN WORKSPACE 3: Tissue Banks & Pathology Tools
Provides for the integration, development, and implementation of tissue and pathology tools.
DOMAIN WORKSPACE 2: Integrative Cancer Research
Provides tools and systems to enable integration and sharing of information.
DOMAIN WORKSPACE 1: Clinical Trial Management Systems
Addresses the need for consistent, open, and comprehensive tools for clinical trials management.
CROSS CUTTING WORKSPACE 2: Architecture
Develops architectural standards and architecture necessary for other workspaces.
CROSS CUTTING WORKSPACE 1: Vocabularies & Common Data Elements
Responsible for evaluating, developing, and integrating systems for vocabulary and ontology content, standards, and software systems for content delivery.
DOMAIN WORKSPACE 4: Imaging
Provides for the sharing and analysis of in vivo imaging data.
Strategic Level Workspaces
caBIG Strategic Planning: Assists in identifying strategic priorities for the development and evolution of the caBIG effort.
Training: Develops strategies for providing training in the use of caBIG-developed resources, including online tutorials, workshops, and training programs.
Data Sharing and Intellectual Capital: Addresses issues related to the sharing of data, applications, and infrastructure, both within the consortium and in the larger cancer research community.
Overall Goals for caBIG™ Three-year (mid-2008)
• Develop sufficient research tools and standards to have a positive impact on the cancer research community, as measured by adoption of relevant caBIG principles in project proposals.
• Ensure widespread adoption of developer standards so that funded developer projects are operating under the Gold standard of compatibility.
• Adopt and use caBIG interoperable tools and data sets within the caBIG community.
• Develop mechanisms for engaging and promoting caBIG compliant technologies and established datasets within the oncology research community.
Overall Goals for caBIG™ Five-year (mid-2010)
• Ensure widespread adoption, dissemination, and use of caBIG interoperable tools, standards, and data sets within the larger cancer community, to include the biopharmaceutical industry, non-NCI cancer centers, and the national cancer research enterprise.
• Begin to see results of caBIG-compliant interdisciplinary and inter-institutional research affecting clinical oncology care.
Architecture
• Conceptually, caBIG has adopted two primary guiding principles:
  • To bring systems on-line quickly, caBIG is committed to a “bias for action.” This implies a commitment to making decisions and moving forward, even if perfection cannot be achieved.
  • To allow long-term evolution and improvement of architectural design, caBIG is committed to “designing for change.”
• To turn these thoughts into action, caBIG has also adopted a two-pronged practical approach:
  • If requirements are well-understood and good solutions are available, caBIG initiates developmental activities within the architectural workspace.
  • If requirements are less clear or if solutions are not yet available, caBIG commissions analysis and assessment activities.
This can get UGLY
caBIG™ Compatibility Guidelines
• The caBIG™ compatibility guidelines are designed to ensure that systems designed in a Federated environment are still interoperable on the caBIG™ Grid, both syntactically and semantically
• Since achieving interoperability is a process, caBIG™ recognizes four levels of compatibility, starting from Legacy (not interoperable) through Bronze, Silver, and Gold (fully interoperable)
• caBIG™ compatibility is all about interfaces rather than the scientific content of the system
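The four compatibility levels form an ordered scale, which can be sketched in a few lines of Python. This is only an illustration: the slide names the levels and their order, but the numeric values, the `Compatibility` class, and the `meets` and `next_level` helpers are assumptions made here, and the actual criteria for reaching each level are not given in this deck.

```python
from enum import IntEnum


class Compatibility(IntEnum):
    """The four levels named on the slide, lowest first (numeric values are illustrative)."""
    LEGACY = 0   # not interoperable
    BRONZE = 1
    SILVER = 2
    GOLD = 3     # fully interoperable


def meets(level: Compatibility, required: Compatibility) -> bool:
    """Return True when a system's level is at or above the required level."""
    return level >= required


def next_level(level: Compatibility) -> Compatibility:
    """Return the next level up the scale, or GOLD if already at the top."""
    return Compatibility(min(level + 1, Compatibility.GOLD))
```

Under these assumptions, a check such as `meets(system_level, Compatibility.SILVER)` expresses a rule like "only Silver or better systems may join", which is a hypothetical policy rather than one stated in the slides.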
caBIG Deliverables: Clinical Trials Management Systems
• Componentized, interoperable, and standards-based Clinical Trials Management Systems, both purpose-built and commercial off-the-shelf, to handle, in an automated fashion, many aspects of developing, managing, conducting, and reporting Clinical Trials
• Biomedical Research Integrated Domain Group Model (BRIDG)
• Adverse Events Reporting Tool
• Cancer Clinical Comprehensive Dictionary (C3D)
• Cancer Community Clinical Patient Registry (C3PR)
• Clinical Research Information Exchange (CRIX)
• caBIG™ Compatibility evaluation for existing commercial tools
• Harmonization of UML Representations
• Ontological Representations and Data Elements for Clinical Trials
• Metadata Harmonization
Clinical Information Integration Challenges
[Diagram labels: Patient Care World; Clinical Research World; Regulatory World; Patient Data in Proprietary Format]
caBIG Deliverables: Tissue Banks and Pathology Tools
• Systematic description and characterization of tissue resources: tools to inventory, track, mine, and visualize tissue samples from geographically dispersed repositories, with an ability to link tissue resources to clinical and molecular correlative descriptions
• caTISSUE Core
• caTIES
• caTISSUE Clinical Annotation Engine
• caTISSUE Experimental Annotation Engine
• Requirements Specifications Survey and Results
• Federated Tissue Data Set White Paper
• Cancer Translational Informatics Platform (caTRIP)
caBIG Deliverables: Integrative Cancer Research
• caArray
• caWorkbench 2.0
• GenePattern
• Gene Ontology Miner (GOMiner)
• Protein Information Resource (PIR)*
• RProteomics*
• Pathways Tool Development
• Tools Distance-Weighted Discrimination
• Magellan
• Visual and Statistical Data Analyzer (VISDA)
• Cancer Molecular Pages
• The ICR Workspace seeks to provide for the development of a “Plug and Play” analytic tool set, suitable for a variety of experimental methodologies, including microarrays, proteomics, biological pathways, data analysis and statistical methods, and gene annotation, among others. It will also develop a diverse library of raw, structured data and facilitate the integration of different types of data. All of these tools would help integrate clinical and basic research.
caBIG Deliverables: Integrative Cancer Research (cont’d)
• Proteomics Laboratory Information Management System (LIMS) Prototype
• Q5
• TrAPSS
• Gene Connect
• Integrating Bioconductor and R into caBIG™
• Reverse Phase Protein Lysate Array based data for caArray
• Cancer Translational Informatics Platform (caTRIP)
• FunctionExpress
• HapMap, PromoterDB
• SEED
• NCI-60 Data Sharing
• Quantitative Pathway Analysis in Cancer (QPACA)
• Reactome (GKB) Data
Rembrandt: A brain tumor repository now utilizes available caBIG tools
[Diagram labels: Expression array data; Clinical data; SNP array data; Proteomics data; caIntegrator - DataMart; caBIG Analytic Tools; Better understanding; Better treatments]
caBIG™ - Interaction Mechanisms
• For all participants:
  • Annual meeting
  • Online “Town Hall” quarterly
    • Addresses solicited questions
  • Monthly program update newsletter (big picture)
  • “What’s big this week” weekly newsletter (e.g. workspace meeting schedule)
• For Cancer Center Directors:
  • Director’s newsletter
• For Workspaces participants:
  • Monthly teleconferences (more frequently as needed)
  • Quarterly meeting (face to face)
• For all participants and the general public:
  • caBIG™ website
caBIG™ Involves a Large Community with a Wide Range of Interests
9Star Research
Albert Einstein
Ardais
Argonne National Laboratory
Burnham Institute
California Institute of Technology-JPL
City of Hope
Clinical Trial Information Service (CTIS)
Cold Spring Harbor
Columbia University-Herbert Irving
Consumer Advocates in Research and Related Activities (CARRA)
Dartmouth-Norris Cotton
Data Works Development
Department of Veterans Affairs
Drexel University
Duke University
EMMES Corporation
First Genetic Trust
Food and Drug Administration
Fox Chase
Fred Hutchinson
GE Global Research Center
Georgetown University-Lombardi
IBM
Indiana University
Internet 2
Jackson Laboratory
Johns Hopkins-Sidney Kimmel
Lawrence Berkeley National Laboratory
Massachusetts Institute of Technology
Mayo Clinic
Memorial Sloan Kettering
Meyer L. Prentis-Karmanos
New York University
Ohio State University-Arthur G. James/Richard Solove
Oregon Health and Science University
Roswell Park Cancer Institute
St Jude Children's Research Hospital
Thomas Jefferson University-Kimmel
Translational Genomics Research Institute
Tulane University School of Medicine
University of Alabama at Birmingham
University of Arizona
University of California Irvine-Chao Family
University of California, San Francisco
University of California-Davis
University of Chicago
University of Colorado
University of Hawaii
University of Iowa-Holden
University of Michigan
University of Minnesota
University of Nebraska
University of North Carolina-Lineberger
University of Pennsylvania-Abramson
University of Pittsburgh
University of South Florida-H. Lee Moffitt
University of Southern California-Norris
University of Vermont
University of Wisconsin
Vanderbilt University-Ingram
Velos
Virginia Commonwealth University-Massey
Virginia Tech
Wake Forest University
Washington University-Siteman
Wistar
Yale University
Northwestern University-Robert H. Lurie
“If caBIG™ accomplishes its mission and creates a robust grid for translational and clinical research, within the cancer community, it will be deemed a failure.”
Bob Robbins (Fred Hutchinson Cancer Research Center), at the initial Strategic Planning Workspace meeting
“Prevention is Better than Cure”
--Desiderius Erasmus (1466-1536)
Embedding caBIG™ in the larger biomedical research community
Embrace the Larger Community
• Expand caGrid and caDSR into other biomedical domains:
  • Biomedical Informatics Research Network (BIRN): launched in 2001 by NIH (NCRR); same concept, focused on neurological disorders, at a smaller scale. A pilot brain tumor project shows substantial homology between BIRN and caBIG
  • Cardiac Arrhythmia Research Network (CARNET): just launched by NHLBI; uses caGrid infrastructure, adding cardiovascular terminology to caDSR
• National Center for Biomedical Ontologies: Roadmap project drawing medical informatics expertise
  • This is a computer science project, developing some of the next generation tools for the Grid
• Healthgrid™: International (Europe-based) project developing standards for information sharing
  • caBIG participants joining Healthgrid.US board of directors