Post on 19-Dec-2015
![Page 1: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/1.jpg)
Lessons learnt from the 1000 Genomes Project about sequencing in populations
Gil McVean
Wellcome Trust Centre for Human Genetics and Department of Statistics, University of Oxford
![Page 2: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/2.jpg)
Some questions
• What has the 1000 Genomes Project told us about how to sequence (in) populations?
• What has the 1000 Genomes Project told us about populations?
![Page 3: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/3.jpg)
Samples for the 1000 Genomes Project
Major population groups, each comprising subpopulations of c. 100 samples
Populations: GBR, FIN, TSI, IBS, CEU, JPT, CHB, CHS, CDX, KHV, GWB, GHN, YRI, MAB, LWK, MXL, CLM, ASW, AJM, ACB, PEL, PUR
Samples from S. Asia
![Page 4: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/4.jpg)
The role of the 1000G Project in medical genetics
• A catalogue of variants
– 95% of variants at 1% frequency in populations of interest
• A representation of ‘normal’ variation
• A set of haplotypes for imputation into GWAS
• A training ground for sequencing/statistical/computational technologies
![Page 5: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/5.jpg)
Samples for the 1000 Genomes Project: Pilot
Populations: TSI*, CEU, JPT, CHB, CHS*, YRI, LWK*
*Exon pilot only
![Page 6: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/6.jpg)
Population-scale genome sequencing
Haplotypes
2x
10x
![Page 7: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/7.jpg)
![Page 8: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/8.jpg)
What has the project generated?
![Page 9: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/9.jpg)
>15 million SNPs, >50% of them novel
dbSNP entries increased by 70%
![Page 10: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/10.jpg)
A huge increase in the set of structural variants
![Page 11: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/11.jpg)
A robust and modular pipeline for analysis of population-scale sequence data
![Page 12: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/12.jpg)
An efficient format for storing aligned reads and a set of tools to manipulate and view the files
• SAM/BAM format for storing (aligned) reads
Bioinformatics (2009) http://samtools.sourceforge.net
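To make the SAM layout concrete, here is a minimal sketch that reads the 11 mandatory tab-separated fields of one alignment line using only the Python standard library. The names `SamRecord` and `parse_sam_line` and the example read are hypothetical illustrations, not part of samtools; real pipelines would normally use samtools itself or a dedicated library.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The 11 mandatory fields of a SAM alignment line, in file order."""
    qname: str   # read name
    flag: int    # bitwise flag
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    rnext: str   # reference name of the mate
    pnext: int   # position of the mate
    tlen: int    # observed template length
    seq: str     # read bases
    qual: str    # base qualities (ASCII-encoded)

    @property
    def is_unmapped(self) -> bool:
        return bool(self.flag & 0x4)    # flag bit 0x4: segment unmapped

    @property
    def is_reverse(self) -> bool:
        return bool(self.flag & 0x10)   # flag bit 0x10: reverse strand


def parse_sam_line(line: str) -> SamRecord:
    """Parse one alignment line; optional tags after field 11 are ignored."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs 11 mandatory fields")
    qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, qual = fields[:11]
    return SamRecord(qname, int(flag), rname, int(pos), int(mapq),
                     cigar, rnext, int(pnext), int(tlen), seq, qual)


# Made-up example read, for illustration only.
example = "\t".join(["read_001", "16", "chr20", "1000", "37", "36M",
                     "*", "0", "0", "A" * 36, "I" * 36])
record = parse_sam_line(example)
print(record.rname, record.pos, record.is_reverse)
```

BAM is the compressed binary counterpart of the same records, so a text parser like this applies only to SAM.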
![Page 13: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/13.jpg)
An information-rich format for storing generic haplotype/genotype data and tools for manipulating the files
http://vcftools.sourceforge.net
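As an illustration of what the format carries, the sketch below pulls the per-sample genotypes out of a single VCF data line and counts non-reference alleles. The function name `alt_allele_counts` and the example record are hypothetical; the sketch assumes a well-formed line with a `GT` entry in the FORMAT column and is not a substitute for vcftools.

```python
def alt_allele_counts(vcf_line):
    """Return (chrom, pos, ref, alt, counts) for one VCF data line.

    counts holds, per sample, the number of non-reference alleles in GT,
    or None when the genotype is missing.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    # Fixed columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT, then samples.
    chrom, pos, _variant_id, ref, alt = fields[:5]
    gt_index = fields[8].split(":").index("GT")
    counts = []
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        # '|' marks phased genotypes and '/' unphased ones; treat both alike here.
        alleles = genotype.replace("|", "/").split("/")
        if "." in alleles:
            counts.append(None)
        else:
            counts.append(sum(1 for allele in alleles if allele != "0"))
    return chrom, int(pos), ref, alt, counts


# Made-up record with four samples, for illustration only.
example = "\t".join(["20", "5000", ".", "G", "A", "50", "PASS", ".",
                     "GT:DP", "0|0:3", "1|0:4", "1/1:2", "./.:0"])
chrom, pos, ref, alt, counts = alt_allele_counts(example)
called = [c for c in counts if c is not None]
alt_fraction = sum(called) / (2 * len(called))
print(chrom, pos, ref, alt, counts, alt_fraction)
```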
![Page 14: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/14.jpg)
An understanding of the ‘rare functional variant load’ carried by individuals
c. 250 LOF / person
c. 75 HGMD DM
![Page 15: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/15.jpg)
USH2A
• Mutations cause Usher syndrome
• 66 missense variants in dbSNP
• 2/3 detected in 1000 Genomes Pilot
• One HGMD ‘disease-causing’ variant homozygous in 3 YRI
– Other reports indicate this is not a real disease-causing variant
![Page 16: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/16.jpg)
Samples for the 1000 Genomes Project: Phase 1
Populations: GBR, FIN, TSI, CEU, JPT, CHB, CHS, YRI, LWK, MXL, CLM, ASW, PUR
![Page 17: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/17.jpg)
Lessons learnt about sequencing in populations
![Page 18: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/18.jpg)
Lesson 1.
The low-coverage model works for variant discovery
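A rough back-of-envelope sketch of why pooling many low-depth samples can still discover variants. It is not the project's power calculation. It assumes Poisson read depth split evenly between the two haplotypes, no sequencing or mapping error, and that a variant counts as discovered once at least `min_alt_reads` reads across the whole sample carry it. All function names, and the depth and sample size in the usage loop, are illustrative assumptions.

```python
import math


def binom_pmf(n, k, p):
    """P(K = k) for K ~ Binomial(n, p), computed in log space for large n."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def poisson_sf(lam, k):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def detection_power(n_samples, freq, depth, min_alt_reads=2):
    """Probability of seeing >= min_alt_reads variant reads across the sample.

    Assumptions (simplifications, see lead-in): each of the 2 * n_samples
    chromosomes carries the variant independently with probability freq, and
    each carrier chromosome contributes Poisson(depth / 2) variant reads.
    """
    n_chrom = 2 * n_samples
    per_haplotype_depth = depth / 2.0
    return sum(
        binom_pmf(n_chrom, carriers, freq)
        * poisson_sf(carriers * per_haplotype_depth, min_alt_reads)
        for carriers in range(n_chrom + 1)
    )


# Illustrative values only: 60 samples at 4x mean depth.
for freq in (0.005, 0.01, 0.05):
    print(freq, round(detection_power(60, freq, 4.0), 3))
```

Under these assumptions, power at a fixed allele frequency rises with the number of samples as well as with depth, which is the trade-off the low-coverage design exploits.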
![Page 19: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/19.jpg)
A near complete record of common variants
CEU
![Page 20: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/20.jpg)
Lesson 2.
The low-coverage model works for SNP genotyping
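A single-site sketch of how a few reads can still yield a usable genotype call once population information is brought in. It combines binomial read likelihoods with a Hardy-Weinberg prior from the population allele frequency. This is a deliberate simplification: it ignores the haplotype (linkage disequilibrium) information that calling from low-coverage data would normally borrow across sites, and the function name and the 1% error rate are assumptions made for illustration.

```python
def genotype_posteriors(n_ref, n_alt, alt_freq, error_rate=0.01):
    """Posterior probabilities of (hom-ref, het, hom-alt) at a biallelic site.

    n_ref, n_alt : counts of reads supporting each allele
    alt_freq     : population frequency of the alternative allele
    error_rate   : assumed per-read probability of reporting the wrong allele
    """
    # Hardy-Weinberg prior on the three genotypes.
    priors = [(1 - alt_freq) ** 2,
              2 * alt_freq * (1 - alt_freq),
              alt_freq ** 2]
    # Probability that a single read shows the alternative allele, per genotype.
    p_alt_read = [error_rate, 0.5, 1 - error_rate]
    # Binomial likelihoods; the binomial coefficient is shared and cancels out.
    likelihoods = [p ** n_alt * (1 - p) ** n_ref for p in p_alt_read]
    joint = [prior * lik for prior, lik in zip(priors, likelihoods)]
    total = sum(joint)
    return [value / total for value in joint]


# Illustrative read counts: two reference reads and two alternative reads.
print(genotype_posteriors(n_ref=2, n_alt=2, alt_freq=0.2))
```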
![Page 21: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/21.jpg)
A set of accurate genotypes/haplotypes
CEU
![Page 22: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/22.jpg)
![Page 23: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/23.jpg)
Lesson 3.
The genome has a large grey area where variant calling is hard
![Page 24: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/24.jpg)
![Page 25: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/25.jpg)
Lesson 4.
Joint calling of different variant types substantially improves the quality of calls
![Page 26: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/26.jpg)
![Page 27: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/27.jpg)
Lesson 5.
Managing uncertainty is important
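One common way to carry genotype uncertainty forward rather than committing to a hard call is to keep genotype likelihoods and work with an expected allele dosage. The slide does not say which approach was meant, so this is a sketch under assumptions: a diploid biallelic site, Phred-scaled likelihoods in the order (hom-ref, het, hom-alt), and a flat prior over genotypes. The function names are hypothetical.

```python
def pl_to_probabilities(pl):
    """Convert Phred-scaled genotype likelihoods to normalised probabilities.

    A Phred-scaled value p corresponds to a relative likelihood of 10 ** (-p / 10);
    normalising assumes a flat prior over the genotypes.
    """
    likelihoods = [10 ** (-value / 10.0) for value in pl]
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


def expected_alt_dosage(pl):
    """Expected number of alternative alleles (0 to 2) given three PL values."""
    _p_hom_ref, p_het, p_hom_alt = pl_to_probabilities(pl)
    return p_het + 2 * p_hom_alt


# Illustrative values: a confident hom-ref call and a completely uninformative one.
print(expected_alt_dosage([0, 30, 60]))
print(expected_alt_dosage([0, 0, 0]))
```

A downstream analysis can then use the dosage, or the full probability vector, so that poorly supported genotypes contribute less than confident ones.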
![Page 28: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/28.jpg)
![Page 29: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/29.jpg)
Lesson 6.
Data visualisation is key
![Page 30: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/30.jpg)
![Page 31: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/31.jpg)
Lessons learnt about populations
![Page 32: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/32.jpg)
![Page 33: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/33.jpg)
Closely related populations can have substantially different rare variants
![Page 34: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/34.jpg)
![Page 35: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/35.jpg)
Spatial heterogeneity in non-genetic risk can differentially confound association studies for rare and common variants
Iain Mathieson
![Page 36: Lessons learnt from the 1000 Genomes Project about sequencing in populations Gil McVean Wellcome Trust Centre for Human Genetics and Department of Statistics,](https://reader035.vdocuments.mx/reader035/viewer/2022062313/56649d2e5503460f94a05245/html5/thumbnails/36.jpg)
Thanks to the many...
• Steering committee
– Co-chairs: Richard Durbin and David Altshuler
• Samples and ELSI Committee
– Co-chairs: Aravinda Chakravarti and Leena Peltonen
• Data Production Group
– Co-chairs: Elaine Mardis and Stacey Gabriel
• Analysis Group
– Co-chairs: Gil McVean and Goncalo Abecasis
– Subgroups in gene-targeted sequencing (Richard Gibbs) and population genetics (Molly Przeworski)
• Structural Variation Group
– Co-chairs: Matt Hurles, Charles Lee and Evan Eichler
• DCC
– Co-chairs: Paul Flicek and Steve Sherry