mar 2013 reference materials selection
TRANSCRIPT
Genome in a Bottle Working Group Reference Material (RM) Selection and Design
… to tell the truth and nothing but …
XGEN Congress March 21, 2013
Andrew Grupe, PhD
Scope of Reference Material Discussion
• Human Genome & Tumor Sequencing
• Variant Types
  – SNP
  – InDel / Substitution
  – CNV
  – Structural variant
Reference Material Needed For
• Clinical platform validation
  – Sequencing System
  – Bioinformatics/Analysis Pipeline
• Clinical test development and validation
  – Whole genome
  – Targeted
  – Germline vs. tumor
• Research
  – Process development and QC
• Product development
  – Sequencing Systems
  – Software development
Reference Materials Are Needed … to tell the truth and nothing but …
NY State Guidelines – Oncology NGS Minimum Data Requirement - Validation
• Accuracy: Sequence a well-characterized reference sample (e.g. HapMap DNA GM12878) to determine error rate across all amplicons.
• Analytical sensitivity: Establish the analytical sensitivity of the assay by interrogating all variants in the 3 amplicons with the consistently poorest coverage, and all variants in 3 amplicons with consistently good coverage. This can initially be established with defined mixtures of cell line DNAs (not plasmids), but needs to be verified with 3-5 patient samples.
• Analytical specificity: Establish the analytical specificity of the assay by interrogating all variants in the 3 amplicons with the consistently poorest coverage, and all variants in 3 amplicons with consistently good coverage. This can initially be established with defined mixtures of cell line DNAs (not plasmids), but needs to be verified with 3-5 patient samples.
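As a minimal sketch of how the sensitivity and specificity figures above might be computed, assume a truth set of variant sites for the reference sample and a set of variant calls from the assay, both limited to the selected amplicons. The function name, the argument names, and the toy data are hypothetical and are not part of the NY State guideline.

```python
def sensitivity_specificity(truth, calls, interrogated_sites):
    """Return (sensitivity, specificity) over the interrogated sites.

    truth: set of sites known to carry a variant in the reference sample
    calls: set of sites where the assay reported a variant
    interrogated_sites: set of all sites examined in the chosen amplicons
    """
    # Restrict both sets to the sites that were actually interrogated.
    truth = truth & interrogated_sites
    calls = calls & interrogated_sites

    tp = len(truth & calls)                       # true variants that were called
    fn = len(truth - calls)                       # true variants that were missed
    fp = len(calls - truth)                       # calls at non-variant sites
    tn = len(interrogated_sites - truth - calls)  # non-variant sites with no call

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (hypothetical): 100 interrogated sites, 4 known variants.
sites = set(range(1, 101))
truth = {5, 20, 40, 77}
calls = {5, 20, 40, 90}  # one variant missed (77), one false call (90)

sens, spec = sensitivity_specificity(truth, calls, sites)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In practice the truth set would come from the characterized reference material, which is why the depth and quality of its characterization matter.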
Accreditation - College of American Pathologists (CAP) NGS Requirements
• Validations must include information on the analytical target (e.g. exons, genes, exomes, genomes, and transcriptomes). The ability of the analytical process to sequence the target (e.g. percentage of target adequately sequenced) must be described.
• Validations must determine and document analytical sensitivity, specificity, reproducibility, repeatability and precision for the types of variants assayed (e.g. single nucleotide variants, insertions and deletions, homopolymer or repetitive sequences).
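One way to express "percentage of target adequately sequenced" is the share of target positions whose read depth meets a minimum threshold. The sketch below assumes per-base depths are already available as a list. The 30x threshold and the depth values are hypothetical illustrations, not CAP requirements.

```python
def percent_target_covered(depths, min_depth=30):
    """Percentage of target positions with read depth >= min_depth.

    depths: per-base read depths across the analytical target
    min_depth: hypothetical adequacy threshold chosen by the laboratory
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    adequate = sum(1 for d in depths if d >= min_depth)
    return 100.0 * adequate / len(depths)


# Toy per-base depths for a 10-position target (hypothetical values).
depths = [120, 85, 40, 29, 0, 31, 30, 12, 200, 55]
pct = percent_target_covered(depths, min_depth=30)
print(f"{pct:.1f}% of target positions at >= 30x")
```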
Association for Molecular Pathology Comments to FDA UHT-Sequencing Meeting, June 2011
• … Performance of and coverage needs for a given platform are likely to differ depending on the nucleic acid and DNA regions analyzed, the variants interrogated, the relative allele proportions of particular variants, … Evaluation should consider the effects of relative GC content, homopolymeric and other regions of repetitive sequence, homologous gene regions and DNA structural variants, … This necessitates flexibility and individualization in the development of validation protocols, guidelines, and controls on a (clinical) application-by-application basis. …
• Assay controls should include a range of variants, … Process controls like NA12876 [sic] … and the synthetic ERCC RNA transcripts from NIST are examples of potential standard reference materials. …
Main Meetings – Reference Materials (RMs)
• April 13, 2012 (NIST)
  – Genome in a Bottle consortium initiation
• August 16, 2012 (NIST)
  – Intended uses of RMs
  – RM selection strategies
• November 7, 2012 (ASHG)
  – Status updates
• December 6, 2012
  – Selection of initial RMs
• March 21, 2013 (XGEN Congress)
Workgroup Attendees
• Approximately 25 attendees
  – Federal, incl. FDA, CDC, NIST
  – Lab accreditation
  – Clinical reference labs
  – Platform technologies
  – Reference material / reagent providers
  – Research
Discussion Topics For Human Genome Sequencing:
• What sources of RMs to consider
  – Primary sample / cell line
• Consent
  – Available for research and for-profit use
• What extent of prior characterization
• Which ethnicities, genders
• Which mutations need to be present
  – Is medical relevance necessary
• Initially to have
  – ONE characterized genome RM - or
  – Multiple genomes, lower level of characterization
• Source of commercial development and distribution
  – Manufactured under quality system for diagnostic applications
Reference Material – Intended Uses
• Characterize Platforms & Methods
  – DNA sequencing
  – Existing & upcoming NGS technologies
  – Research applications
  – Clinical diagnostics applications
• Not intended as reference material for
  – Validation of specific mutations in a panel
Desired RM Sample Characteristics
• General Considerations
  – Sample characteristics are more important than selection of specific sample IDs
  – More reference samples preferred over fewer samples
    • E.g. prefer 8 fully characterized samples at high depth and corresponding trios at lower depth over 4 fully characterized samples plus trios
Desired RM Sample Characteristics (cont.)
• High Priority
  – Multiple ethnicities
    • Diversity in structural variation to stress systems
    • However, no requirement for representatives from every ethnic group
  – Balanced female to male ratio
  – Cell lines, low passage
    • Replenish supply
Targeted Ethnic Distribution
2 European-ancestry: northern/western & southern/eastern
2 African-American: AA & African, or two AA from different parts of the US
2 Latino: different ancestral places, US or South/Central America
1 East Asian
1 South Asian
Desired RM Sample Characteristics (cont.)
• Nice to have
  – Interracial marriage samples
    • Controlled admixture
    • Haplotypes
• Less critical
  – Phenotypic characterization
    • Reference material not for discovery
  – Access to RNA or tissues
    • No limitless supply of material with identical characteristics
Other RM Considerations
• DNA from low passage cell lines
  – Understand propagation of variants through cell line passaging
• Modify DNA purification in future to keep step with new NGS technologies
  – Current purified DNA fragment sizes are 80-100 kb
    • OK for existing technologies
  – New nanopore technologies may need Mbp fragments
    • Agarose embedding is proven extraction technology
• Consider footprint analysis of all batches prior to distribution
  – Identify genetic drift, mix-ups, …, develop benchmarks
• Reference material that mimics tumor sample characteristics
  – FFPE embedded cells?
• Blood or saliva as primary (not cell line) DNA sources
RM Sample Source Suggestions
Most support
• NA12878
  – Large HapMap family, well characterized
  – NIST contracted Coriell for DNA batch
• Personal Genome Project Samples
  – Includes trios
  – Use sequence data to derive admixture
  – http://www.personalgenomes.org
  – Consent includes research use, commercial use and re-identification
RM Sample Source Suggestions (cont.)
Some support (if consent sufficient)
• HS1011
  – Charcot-Marie-Tooth cell line
    • Lupski et al, NEJM 2010
• MCF10A
  – Normal breast
    • Used by Horizon Dx to produce isogenic cell lines with cancer-relevant mutations
Other
• African American sample with 70% Sanger sequence
  – No cell line available
  – Subject still alive => re-consent & generate cell line?
• huRef sample
HapMap NA12878 An Obvious Choice?
• Multitude of public and proprietary datasets
• Cell line and DNA available from Coriell
• Listed in guidelines as potential reference sample for clinical tests
HapMap NA12878 Consent
• Consent available for
  – Research use
HOWEVER …
• Consent does not include
  – Some commercial uses
    • Incl. alterations, re-distribution
  – Re-identification through sequence data
    • Option to withdraw data and materials
http://genomeinabottle.org/forum-topic/what-appropriate-informed-consent-reference-materials-genome-bottle-consortium
http://hapmap.ncbi.nlm.nih.gov/downloads/elsi/CEPH_Reconsent_Form.pdf
HapMap NA12878 Status as RM
• 8,000 aliquots of 10 µg each on order by NIST from Coriell
• NIST expects first batch of DNA from Coriell in mid-April
• Legal and IRB review at NIST for NA12878 release
• Start to develop bioinformatics methods based on NA12878 data
  – Have bioinformatics tools when other samples are available
Personal Genome Project (PGP) Samples
• Consent
  – Research and commercial use
  – Possibility of re-identification, including through sequence
  – Option to withdraw at any point
    • Data removal and destruction of material
  www.personalgenomes.org/consent/PGP_Consent_Approved_02212012.pdf
• Sample availability
  – Ongoing enrollment
  – Limited collection of ethnically diverse trios
  http://blog.personalgenomes.org/2012/11/29/seeking-diversity/
RM: Selected 3 PGP Trios
Available at Coriell
• Ashkenazim Jewish trio, East European ancestry
  – Parents, Son
  – huAA53E0 / hu8E87A9 / hu6E4515
Not yet available at Coriell
• East Asian trio
  – Parents, Son
  – hu91BD69 / hu38168C / huCA017E
• Caucasian quartet
  – Parents, 2 monozygotic twin daughters
  – huCDC3B8 / huFE01E1 / hu1E8957 / hu961968
PGP Info - hu8E87A9 (abbreviated)
https://my.personalgenomes.org/profile/hu8E87A9
Coriell Info - hu8E87A9 (abbreviated)
http://ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=hu8E87A9
Summary
• Defined required RM characteristics
• Initial set of RM samples selected
  – NA12878
    • Many existing public and proprietary datasets
    • Listed in clinical guidelines to establish validation parameters
    • Consent limitations
      – Commercial use, re-identification through sequence
    • Under legal and IRB review by NIST
  – Three PGP trios
    • One trio already available at Coriell
• Consent without withdrawal option may not meet ethical review standards
Contact Information
Genome in a Bottle: http://genomeinabottle.org
Justin Zook: justi[email protected]
Marc Salit: [email protected]
Andrew Grupe: [email protected]
Additional Information
HapMap Re-Consent
What will happen if I don’t agree to let my sample be used?
You will not lose any benefits if you choose not to let your sample be used. If you don’t agree to let your sample be used, it will not be used for the HapMap. However, it will continue to be used for other IRB approved research studies, just as it has been in the past, unless you specifically tell us that you don’t want it used for such studies anymore.
Can I change my mind after I agree to let my sample be used?
Deciding whether to let your sample be used for the HapMap is completely up to you. You will not lose any benefits if you choose not to let your sample be used. However, once your sample has been studied and your genetic information has been put in the database, you will not be able to take that information back.
HapMap Re-Consent
The Repository does not let anyone sell material from samples or cell lines. However, information from genetics research sometimes helps companies make products to diagnose or treat diseases. If information from your family’s cell lines leads to making a product, it would probably contribute only in a very small way. Also, because the cell lines will not have names on them, neither the researchers nor anyone at the Repository would know if your samples were even used. So you will not get any additional payment for having your sample used in this project.
HapMap Re-Consent
… The database will not include any medical information about anyone whose sample is used. It also will not include any information that could identify who the individual people or families are. … Because the database will be public, people who do identity testing, such as for paternity testing or law enforcement, may also use the samples, the database, and the HapMap, to do general research. However, it will be very hard for anyone to learn anything about you personally from any of this research because none of the samples, the database, or the HapMap will include your name or any other information that could identify you or your family.
What are the risks of having my sample used for this project?
If your family’s samples are used, lots of genetic information from your samples will be put in the database, and lots of people will be able to look at it for any purpose. However, there are only a couple of ways anybody could trace the information back to you. One is if they thought your information might be in the database, got another sample from you, did many tests on that sample, and then compared the genetic information from those tests with the information in the database. The other is if somebody compared the information in the database with genetic information known to be from you that was in another database and figured out who you were. The risk of either of these things happening is very small, but it may grow in the future. We cannot always predict the results of research, so new risks to you may come up in the future that we can’t predict now.