aug2013 nist program slides
Genome-in-a-Bottle Consortium August 2013
Reference Materials for Clinical Applications of Human Genome Sequencing
Marc Salit, Ph.D. and Justin Zook, Ph.D.
National Institute of Standards and Technology
Genome in a Bottle Consortium Development
• NIST met with sequencing technology developers to assess standards needs
  – Stanford, June 2011
• Open, exploratory workshop
  – ASHG, Montreal, Canada
  – October 2011
• Small, invitational workshop at NIST to develop consortium for human genome reference materials
  – FDA, NCBI, NHGRI, NCI, CDC, Wash U, Broad, technology developers, clinical labs, CAP, PGP, Partners, ABRF, others
  – developed draft work plan
  – April 2012
• Open, public meeting at NIST to formally establish consortium, present draft work plan
  – formed working groups
  – identified candidate genomes
  – established principles of:
    • reference material selection
    • characterization
    • informatics
    • performance metrics
  – August 2012
• Open, public workshop at XGen Congress
  – March 2013
• Website
  – www.genomeinabottle.org
Well-characterized, stable RMs
• Obtain metrics for validation, QC, QA, PT
• Determine sources and types of bias/error
• Learn to resolve difficult structural variants
• Improve reference genome assembly
• Optimization
  – integration of data from multiple platforms
  – sequencing and analysis
• Enable regulated applications
Comparison of SNP Calls for NA12878 on 2 platforms, 3 analysis methods
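A comparison of this kind can be sketched as set operations over variant keys. The block below is only an illustration under stated assumptions, not the analysis behind the slide: the call-set names and the toy variants are hypothetical, and each SNP is assumed to be identified by a (chromosome, position, ref, alt) tuple.

```python
from itertools import combinations


def venn_counts(call_sets):
    """Count variants by the exact combination of call sets that report them."""
    membership = {}
    for name, calls in call_sets.items():
        for variant in calls:
            membership.setdefault(variant, set()).add(name)
    counts = {}
    for names in membership.values():
        key = frozenset(names)
        counts[key] = counts.get(key, 0) + 1
    return counts


def pairwise_concordance(call_sets):
    """Jaccard index (shared calls / union of calls) for every pair of call sets."""
    result = {}
    for a, b in combinations(sorted(call_sets), 2):
        union = call_sets[a] | call_sets[b]
        shared = call_sets[a] & call_sets[b]
        result[(a, b)] = len(shared) / len(union) if union else 1.0
    return result


# Hypothetical toy data: three call sets, keyed by (chrom, pos, ref, alt).
call_sets = {
    "platformA_caller1": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")},
    "platformA_caller2": {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")},
    "platformB_caller1": {("chr1", 100, "A", "G"), ("chr1", 300, "T", "C")},
}

counts = venn_counts(call_sets)
concordance = pairwise_concordance(call_sets)
```

Here `counts` maps each combination of call sets to the number of variants found in exactly that combination (the regions of a Venn diagram), and `concordance` gives one overlap score per pair. Real comparisons would also need to normalize variant representation before matching, which this sketch ignores.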
Measurement Process
Sample
gDNA isolation
Library Prep
Sequencing
Alignment/Mapping
Variant Calling
Confidence Estimates
Downstream Analysis
• gDNA reference materials will be developed to characterize performance of a part of the process
  – materials will be certified for their variants against a reference sequence, with confidence estimates
[Figure label: generic measurement process]
• NIST working with GiaB to select genomes
• Current plan
  – NA12878 HapMap sample as Pilot sample
    • part of 17-member pedigree
  – trios from PGP as more complete set
    • 8 trios, focus on children
    • varying biogeographic ancestry
CEPH Utah Pedigree 1463
Putting “Genomes” in Bottles
11 children, Birth Order Redacted
Genome in a Bottle Working Groups
Reference Material Selection & Design
Andrew Grupe, Celera
• Develop prioritized list of whole human genomes for Reference Materials
• Identify candidate approaches and materials for artificial RMs
• Develop prioritized list
Measurements for Reference Material Characterization
Mike Eberle, Illumina
• Develop consensus plan for experimental characterization of Reference Materials
Bioinformatics, Data Integration, and Data Representation
Steve Sherry, NCBI
• Develop plan for integrating experimental data and forming consensus variant calls and confidence estimates
• Develop consensus plan for data representation
Performance Metrics & Figures of Merit
Justin Johnson
• User interface to the Genome-in-a-Bottle Reference Material
• “Dashboard”
• what an end user will see and report to understand and describe the performance of their experiment
• variant call accuracy
• process performance measures to enable optimization
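As a rough illustration of what "variant call accuracy" could look like on such a dashboard, the sketch below scores a query call set against reference calls. The slides do not specify which metrics the working group will adopt; sensitivity and precision are assumed here as common choices, and the function name, variant keys, and toy data are hypothetical.

```python
def call_accuracy(reference_calls, query_calls):
    """Score a query call set against reference calls (both are sets of variant keys)."""
    true_pos = len(reference_calls & query_calls)   # called and in the reference
    false_pos = len(query_calls - reference_calls)  # called but not in the reference
    false_neg = len(reference_calls - query_calls)  # in the reference but missed
    called = true_pos + false_pos
    expected = true_pos + false_neg
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        # Fraction of reference variants that were recovered.
        "sensitivity": true_pos / expected if expected else float("nan"),
        # Fraction of reported variants that match the reference.
        "precision": true_pos / called if called else float("nan"),
    }


# Hypothetical toy data keyed by (chrom, pos, ref, alt).
reference = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 50, "G", "A"), ("chr3", 75, "T", "C")}
query = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
         ("chr2", 50, "G", "A"), ("chr4", 10, "A", "T")}

metrics = call_accuracy(reference, query)
```

A fuller treatment would restrict both sets to regions where the reference material has confident calls, so that variants outside those regions are not counted as false positives.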
Agenda
Thursday
Welcome and Intro
Integrating large scale sequencing into clinical practice
Heidi Rehm
Personal Genomics
Michael Snyder
Break/Poster Session
Update on GIAB Progress
Marc Salit
Comparison of NIST, Platinum Genomes, and other NA12878 call-sets to understand sequencing performance
Justin Zook
Presentations from related projects
Platinum Genomes
Michael Eberle
NA12878 Trio Analysis
Francisco De La Vega
GeT-RM Project and Genome Browser
Deanna Church
Lunch (on your own in NIST cafeteria)
Working Group Meetings
Reference Material Selection & Design (Lecture Room E)
Measurements for Reference Material Characterization (Dining Room A&B)
Bioinformatics, Data Integration, and Data Representation (Lecture Room A)
Performance Metrics and Figures of Merit (Lecture Room C)
Friday
Discussion between working groups
Working group reports (Green Auditorium)
Workplan refinement, timeline
Lunch (on your own in NIST cafeteria)
Discussion: Scope of consortium, how to make decisions
Resource needs, how to meet them, and next steps
Please Note
The plenary sessions of this workshop are being webcast (audio & slides), so please use the microphones when asking questions. Web attendees can ask questions via chat. Slides will be made available on SlideShare after the workshop (see genomeinabottle.org).
Tweets are welcome unless the speaker requests otherwise. Please use #giab as the hashtag.
Status Update, Consortium Business
Marc Salit and Justin Zook
Consenting Genomes for use as Reference Materials
• Risk of re-identification
  – this is a real risk
  – privacy
  – implications for family members
• Meaning of possibility of withdrawal
• Commercial application
  – indirect, research
  – direct, derived products
• PGP project currently state-of-art
  – broad and direct
  – test to demonstrate understanding
• “Wild West”
NIST Reference Materials
Pilot RM - NA12878
• 8300 10 µg vials of NA12878 gDNA @ NIST 4/2013
  – Available for sequencing by GIAB participants
  – target for release as NIST RM 2/2014
    • SNPs, small indels
• Will be sequenced at ~10 labs
  – ~4 technologies, multiple modes
• Received “Human Subjects Approval” for release of NA12878 as NIST RM
Personal Genome Project
• Ashkenazim trio DNA expected ~Dec 2013
• Asian son DNA expected ~Dec 2013
  – Parents’ cell lines in process at Coriell
• “Human subjects review” close to approval for release of PGP genomes as NIST RMs
• Plan is 5-6 additional trios of diverse ancestry
  – Ideally, African, Asian, Hispanic
  – What should we do if PGP doesn’t have trios from each of these groups?
Planned Measurements on NA12878 candidate RM
• NIST
  – ~300x total 2x150bp Illumina over 6 vials of NA12878
  – ~100x SOLiD 5500W 2x50bp coverage
  – ~50x SOLiD 5500W 2x50bp coverage of parents
• Illumina
  – PCR-free
  – Mate-pair
• Complete Genomics
  – Normal pipeline
  – LFR pipeline
• NCI
  – Ion Proton
  – Illumina
  – Various libraries
• Garvan
  – Illumina exome
• Celera
  – Targeted panels
• Cornell Weill
  – Illumina
• MTAs pending
  – Univ. of Nebraska Medical
  – Univ. of Michigan
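To give a sense of scale for the coverage targets above, average coverage relates to sequencing yield as coverage ≈ (fragments × reads per fragment × read length) / genome size. The sketch below applies that relation; the 3.1 Gbp genome size is an assumed round figure, and the function name is hypothetical. It ignores duplicates, unmapped reads, and uneven coverage, so real runs would need more reads than this lower bound.

```python
ASSUMED_GENOME_SIZE_BP = 3_100_000_000  # assumption: approximate human genome size


def read_pairs_needed(coverage, read_length_bp, genome_size_bp=ASSUMED_GENOME_SIZE_BP,
                      reads_per_fragment=2):
    """Minimum number of paired-end fragments for a target average coverage."""
    bases_needed = coverage * genome_size_bp
    bases_per_fragment = read_length_bp * reads_per_fragment
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-bases_needed // bases_per_fragment)


illumina_pairs = read_pairs_needed(coverage=300, read_length_bp=150)  # ~300x, 2x150bp
solid_pairs = read_pairs_needed(coverage=100, read_length_bp=50)      # ~100x, 2x50bp
parent_pairs = read_pairs_needed(coverage=50, read_length_bp=50)      # ~50x, 2x50bp
```

Under these assumptions, both the ~300x 2x150bp and the ~100x 2x50bp targets work out to roughly 3.1 billion read pairs, because the higher coverage is offset by the longer reads.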
HOW DO WE WANT TO FUNCTION AS A CONSORTIUM?
What’s our scope?
How do we make decisions?
Spectrum of Possibilities
• NIST develops and disseminates gDNA RMs with consortium input
• Consortium functions as a Standards Body, with dynamic portfolio and broad influence
Scope
Basic Scope
• Develop/disseminate pilot genome and 8 trios as RMs
  – gDNA and reference data
• Develop/disseminate “Performance Metrics Suite”
  – data repository?
• Documentary Standards to describe methods?
  – through a clinical SDO? CLSI? IFCC?
Extended Scope
• Other RMs as part of GiaB portfolio?
  – tumor/normal pair
  – artificial spike-in controls
    • pDNA from NCI
  – derived commercial materials
    • Cell lines for which we have reference gDNA
    • Such cell lines embedded in FFPE
  – engineered cell lines
    • designed as controls for specific variants
Extended Scope
• need process to include new material/product in portfolio
  – what does it mean to put the GiaB imprimatur on something?
• some possible requirements
  – guidelines for usage
  – methods for characterization
  – conduct interlab studies to establish utility
• how do we decide what to do?
  – need to be
    • open, transparent, public
    • form consensus
  – pragmatic consensus needs champion and commitment
    • e.g. proposer pilots interlab
    • consortium members participate in interlab
• how do we decide policy matters?
  – see draft data release policy
discussion on this tomorrow after lunch…