Gil McVean
![Page 2: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/2.jpg)
What makes us different?
Image: Wikimedia commons
![Page 3: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/3.jpg)
The genetic axes
Figure: diagram with two axes, strong to weak and inherited to somatic, locating genetic disorders, cancer, complex disease and aging.
Images: Wikimedia commons
![Page 4: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/4.jpg)
Characterising individual genomes
Image: Illumina Cambridge Ltd
Image: Wikimedia commons
![Page 5: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/5.jpg)
Why 1000 genomes?
• To find all common (>5%) variants in the accessible human genome
• To find at least 95% of variants at 1% in populations of medical genetics interest
  – 95% of variants at 0.1% in genes
• To provide a fully public framework for interpreting rare genetic variation in the context of disease
  – Screening
  – Imputation
![Page 6: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/6.jpg)
The 1000 Genomes Project
![Page 7: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/7.jpg)
1000 Genomes Project design
![Page 8: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/8.jpg)
Haplotypes
2x
10x
Population sequencing
![Page 9: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/9.jpg)
A map of shared variation
![Page 10: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/10.jpg)
http://browser.1000genomes.org
www.1000genomes.org
![Page 11: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/11.jpg)
Good, but not perfect
| Variant type | Validation methods | Estimated FDR |
|---|---|---|
| Low-coverage SNPs | Sequenom, 454, PacBio | 1.8% |
| Exome SNPs | 454 | 1.6% |
| LOF variants | 454 | 5.2% |
| Short indels | PCR, Sanger, array genotypes | 36% -> 5.4% |
| Large deletions | PCR, array CGH, SNP genotype | 2.1% |
| Other large SVs | PCR, array CGH, SNP genotype | 1.4% – 3.7% |

Slide annotations: Post-hoc filtering; Not genotyped
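As a rough illustration of what an "estimated FDR" column means, the sketch below computes a false discovery rate from a validation experiment: the fraction of tested calls that the validation assay failed to confirm. The function name and the counts are hypothetical and are not taken from the project; the slide does not say how each estimate was actually derived.

```python
def estimated_fdr(n_confirmed, n_not_confirmed):
    """Fraction of tested variant calls that validation did not confirm.

    Both arguments are counts from a validation experiment
    (hypothetical here, for illustration only).
    """
    total = n_confirmed + n_not_confirmed
    if total == 0:
        raise ValueError("no calls were tested")
    return n_not_confirmed / total


# Hypothetical validation run: 100 calls tested, 5 not confirmed.
example_fdr = estimated_fdr(n_confirmed=95, n_not_confirmed=5)
print(f"Estimated FDR: {example_fdr:.1%}")
```

Filtering calls after validation changes both counts, which is one way a call set's estimated FDR can drop between two reported figures.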
![Page 12: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/12.jpg)
![Page 13: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/13.jpg)
4 million sites that differ from the human reference genome
12,000 changes to proteins
100 changes that knock out gene function
5 rare variants that are known to cause disease
![Page 14: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/14.jpg)
Most variation is common – Most common variation is cosmopolitan
Number of variants in typical genome
• Found in all continents: 92%
• Found only in Europe: 0.3%
• Found only in the UK: 0.1%
• Found only in you: 0.002%
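To give a feel for the scale of these percentages, the sketch below converts them into approximate variant counts. It assumes the total for a typical genome is the "4 million sites that differ from the human reference genome" quoted on the previous slide; the slides do not state that the two figures share a denominator, so the results are order-of-magnitude illustrations only.

```python
# Assumption: total taken from the "4 million sites" slide.
TYPICAL_SITES = 4_000_000

# Percentages as shown on this slide.
SHARE_PERCENT = {
    "all continents": 92,
    "only in Europe": 0.3,
    "only in the UK": 0.1,
    "only in you": 0.002,
}


def approx_counts(total, shares):
    """Convert percentage shares into rounded variant counts."""
    return {label: round(total * pct / 100) for label, pct in shares.items()}


counts = approx_counts(TYPICAL_SITES, SHARE_PERCENT)
for label, n in counts.items():
    print(f"Found {label}: about {n:,} variants")
```

Under that assumption, the 0.002% of variants "found only in you" corresponds to roughly 80 sites.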
![Page 15: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/15.jpg)
Imputation from 1000 Genomes
• Imputation similar for all variant types across populations
• Comparable to imputation from high quality SNP haplotypes
![Page 16: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/16.jpg)
…but it can work for common variants
![Page 17: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/17.jpg)
The 1000 Genomes Sampling design
![Page 18: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/18.jpg)
The 1000 Genomes Sampling design
![Page 19: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/19.jpg)
What have we learned about low-frequency genetic variation from the 1000 Genomes Project?
• How many rare (<0.5%) and low-frequency (0.5-5%) variants are there, how do their numbers vary between populations and what does this tell us about demography?
• To what extent has natural selection shaped the distribution of rare variants within and between populations?
• What are the implications of these findings for the interpretation of genetic variation in individual genomes?
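The frequency classes named in the first question can be made concrete with a short sketch. It bins variants as rare (<0.5%), low-frequency (0.5-5%) or common (>5%) from allele counts. The function names and example counts are hypothetical, the treatment of the exact 0.5% and 5% boundaries is an assumption, and for simplicity the counted allele is assumed to be the minor allele.

```python
from collections import Counter

RARE_MAX = 0.005      # rare: frequency below 0.5%
LOW_FREQ_MAX = 0.05   # low-frequency: 0.5% up to 5% (boundary handling assumed)


def frequency_class(allele_count, n_chromosomes):
    """Classify one variant by the frequency of its counted allele."""
    if n_chromosomes <= 0:
        raise ValueError("n_chromosomes must be positive")
    freq = allele_count / n_chromosomes
    if freq < RARE_MAX:
        return "rare"
    if freq <= LOW_FREQ_MAX:
        return "low-frequency"
    return "common"


def count_by_class(allele_counts, n_chromosomes):
    """Tally how many variants fall into each frequency class."""
    return Counter(frequency_class(c, n_chromosomes) for c in allele_counts)


# Hypothetical allele counts observed in 2,000 sampled chromosomes.
example_counts = [1, 3, 9, 10, 60, 100, 101, 900]
summary = count_by_class(example_counts, 2000)
print(dict(summary))
```

Running the same tally separately for each population is one simple way to compare their loads of rare and low-frequency variants.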
![Page 20: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/20.jpg)
Populations differ in load of rare and common variants
![Page 21: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/21.jpg)
Most rare variation is private
![Page 22: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/22.jpg)
Rare variant differentiation within ancestry groupings increases as variant frequency decreases
![Page 23: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/23.jpg)
Not all populations are equal
![Page 24: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/24.jpg)
Rare variants identify recent historical links between populations
48% of IBS variants shared with American populations
ASW shows stronger sharing with YRI than LWK
![Page 25: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/25.jpg)
What about variants that affect gene function?
![Page 26: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/26.jpg)
Conserved variant load per individual
![Page 27: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/27.jpg)
The proportion of rare variants is predicted by conservation, with the exception of splice-disrupting and STOP+ variants
![Page 28: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/28.jpg)
KEGG ‘pathways’ show variation in excess rare-variant load
![Page 29: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/29.jpg)
Patterns of variation inform about selective constraint
CTCF-binding motif
![Page 30: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/30.jpg)
Variants under selection showed elevated levels of population differentiation
Proportion of pairwise comparisons where nonsynonymous variants are more differentiated than synonymous ones
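A minimal sketch of how such a proportion could be computed is given below. The slide does not say which differentiation statistic was used; this example assumes a simple ratio-of-averages FST on population allele frequencies, with no finite-sample correction. The population labels and frequencies are made up for illustration.

```python
from itertools import combinations


def fst(freqs_a, freqs_b):
    """Ratio-of-averages FST between two populations.

    freqs_a and freqs_b hold allele frequencies for the same variants.
    No correction for finite sample size is applied (a simplification).
    """
    num = sum((p - q) ** 2 for p, q in zip(freqs_a, freqs_b))
    den = sum(p * (1 - q) + q * (1 - p) for p, q in zip(freqs_a, freqs_b))
    return num / den if den > 0 else 0.0


def proportion_nonsyn_more_differentiated(nonsyn, syn):
    """Share of population pairs where nonsynonymous FST exceeds synonymous FST.

    nonsyn and syn map population label -> list of allele frequencies.
    """
    pairs = list(combinations(sorted(nonsyn), 2))
    wins = sum(fst(nonsyn[a], nonsyn[b]) > fst(syn[a], syn[b]) for a, b in pairs)
    return wins / len(pairs)


# Hypothetical allele frequencies for three populations.
nonsyn_freqs = {"POP1": [0.0, 0.1], "POP2": [0.1, 0.0], "POP3": [0.0, 0.1]}
syn_freqs = {"POP1": [0.2, 0.3], "POP2": [0.2, 0.3], "POP3": [0.3, 0.3]}

proportion = proportion_nonsyn_more_differentiated(nonsyn_freqs, syn_freqs)
print(f"Proportion of pairs: {proportion:.2f}")
```

A real comparison would normally match the two variant classes on allele frequency before comparing them, since differentiation depends strongly on frequency.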
![Page 31: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/31.jpg)
Rare variant differentiation can confound the genetic study of disease
Mathieson and McVean (2012)
![Page 32: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/32.jpg)
Implications
• Rare variants have spatial and ancestry-related distributions that reflect recent demographic events and selection.
• Purifying selection elevates local differentiation of rare variants.
• The functional and aetiological interpretation of rare variants in the context of disease needs to be aware of the local genetic background.
![Page 33: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/33.jpg)
The final resource – mid 2013
AFRICA
Gambian in Western Division, The Gambia (GWD)
Malawian in Blantyre, Malawi (MAB)
Mende in Sierra Leone (MSL)
Esan in Nigeria (ESN)
SOUTH ASIAN
Punjabi in Lahore, Pakistan (PJL)
Bengali in Bangladesh (BEB)
Sri Lankan Tamil in the UK (STU)
Indian Telugu in the UK (ITU)
AMERICAS
African American in Jackson, MS (AJM)
Sample sizes shown on the slide (pairing with populations not recoverable from this transcript): 100, 200, 100, 100, 100, 100, 80
![Page 34: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/34.jpg)
What more could we learn about human population genetics?
• There is a need for continuing the programme of developing public resources describing genetic variation across new populations, with high resolution spatial information.
  – This will not just shed light on population history and selection, but be important for interpreting (rare) genetic variation in individual genomes.
• The Phase 1 1000 Genomes data has made clear the extent of variation in conserved regulatory sequence within genomes
  – How does this relate to variation in function in different cell types?
• Many of the most interesting parts of the genome (for the study of selection) are still poorly-covered by HTS data
  – Need to collect ‘bespoke’ data types for some genomic regions
![Page 35: Gil McVean](https://reader036.vdocuments.mx/reader036/viewer/2022081421/56813b63550346895da46090/html5/thumbnails/35.jpg)
The 1000 Genomes Project Consortium
http://www.1000genomes.org/