Using 1,000 Genomes data for imputation in genome-wide association studies
1,000 Genomes Data Tutorial
ICHG 2011, Montreal
Bryan Howie, University of Chicago
Genotype imputation background

[Figure: reference haplotypes (rows of 0/1 alleles) shown above phenotyped GWAS samples (rows of 0/1/2 genotypes). The GWAS samples have genotypes only at SNPs genotyped on an array; untyped SNPs are marked "?".]

[Figure: the same two panels after imputation. The "?" entries at untyped SNPs in the GWAS samples are replaced by imputed genotypes, and an association signal is marked.]
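
To make the idea concrete, here is a toy sketch of imputation by haplotype matching. It is not the statistical model used by real imputation programs. It only finds the reference haplotype that agrees best with one study haplotype at the typed SNPs and copies that haplotype's alleles at the untyped SNPs. The function name `impute_by_best_match`, the `-1` code for an untyped SNP, and the small arrays are all assumptions made for illustration, and the study data are treated as a single already-phased haplotype to keep the example short.

```python
import numpy as np

def impute_by_best_match(reference, study, missing=-1):
    """Fill untyped sites of one study haplotype from its closest reference haplotype.

    reference: 2-D array of 0/1 alleles, one row per reference haplotype
    study:     1-D array of 0/1 alleles, with `missing` at untyped SNPs
    """
    reference = np.asarray(reference)
    study = np.asarray(study)
    typed = study != missing

    # Count mismatches at the typed SNPs against every reference haplotype
    mismatches = (reference[:, typed] != study[typed]).sum(axis=1)
    best = int(np.argmin(mismatches))  # ties resolve to the first best row

    # Copy the best-matching haplotype's alleles into the untyped positions
    imputed = study.copy()
    imputed[~typed] = reference[best, ~typed]
    return imputed, best

# Hypothetical toy data: 3 reference haplotypes, 6 SNPs, 2 of them untyped
reference = np.array([
    [0, 0, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 1],
])
study = np.array([1, -1, 1, -1, 0, 0])

imputed, best = impute_by_best_match(reference, study)
print("best-matching reference row:", best)
print("imputed haplotype:", imputed.tolist())
```

Real programs average over many possible reference haplotypes and allow the best match to change along the chromosome, so they return genotype probabilities rather than a single hard call.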
A brief history of imputation reference panels:
HapMap 2, HapMap 3, and the 1,000 Genomes Project
Reference panels
HapMap 2 (2007): CEU, CHB, JPT, YRI
HapMap 3 (2009): ASW, MKK, LWK, YRI, CHD, CHB, JPT, GIH, CEU, TSI, MEX
1,000 Genomes (2010+)

[Figure: bar chart comparing the HM2, HM3, and 1kG reference panels by log10 number of genotypes (axis ticks at 9, 10, and 11), shown alongside a block of GWAS genotypes.]
1,000 Genomes haplotypes are highly accurate

[Figure: imputation accuracy (y-axis, 0.6 to 1.0) versus minor allele frequency of imputed SNPs (x-axis). Panel "ALL SNPs" covers MAF 0.0 to 0.5; panel "LOW-FREQUENCY SNPs" covers MAF 0.00 to 0.05. Labels: European ancestry, African ancestry, Admixed (Americas).]
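
The plots relate imputation accuracy to minor allele frequency (MAF). The sketch below shows one common way to compute both quantities for a single SNP. It assumes accuracy is measured as the squared Pearson correlation between imputed allele dosages and true genotypes; the slide does not say which definition its y-axis uses, so treat that choice, the function names, and the example values as assumptions.

```python
import numpy as np

def minor_allele_frequency(genotypes):
    """MAF of one SNP from 0/1/2 genotype counts."""
    freq = np.mean(genotypes) / 2.0      # frequency of the counted allele
    return min(freq, 1.0 - freq)         # fold to the minor allele

def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    r = np.corrcoef(true_genotypes, imputed_dosages)[0, 1]
    return float(r ** 2)

# Hypothetical data for one SNP in 8 samples
true_g = np.array([0, 1, 2, 1, 0, 0, 1, 0])
perfect = true_g.astype(float)                                   # ideal imputation
noisy = np.array([0.1, 0.9, 1.8, 1.2, 0.0, 0.3, 0.7, 0.1])       # imperfect imputation

maf = minor_allele_frequency(true_g)
print("MAF:", maf)
print("r2, perfect dosages:", dosage_r2(true_g, perfect))
print("r2, noisy dosages:  ", dosage_r2(true_g, noisy))
```

Averaging this per-SNP value within MAF bins would give curves of the kind the figure describes.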
Imputation accuracy depends on your GWAS chip

[Figure: imputation accuracy (y-axis, 0.2 to 1.0) versus minor allele frequency of imputed SNPs (x-axis). Labels: Omni 2.5M, Illumina 550k, Affymetrix 500k. Three plots labeled "ALL SNPs" cover MAF 0.0 to 0.5; three plots labeled "LOW-FREQUENCY SNPs" cover MAF 0.00 to 0.05.]
Imputation from 1,000 Genomes haplotypes can strengthen association signals.
GWAS of Osteoarthritis
Day-Williams et al. (AJHG 2011)
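
One way to see why a signal can get stronger: a typed SNP may only partly track the variant that actually affects the phenotype, while imputation lets you test that variant more directly. The sketch below uses invented data for 10 samples and is not taken from the osteoarthritis study. It uses squared correlation with the phenotype as a simple stand-in for association strength, and it idealizes imputation by treating the imputed genotypes as exact.

```python
import numpy as np

def r2(x, y):
    """Squared Pearson correlation, used here as a simple measure of signal strength."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

# Synthetic genotypes and phenotype (invented for illustration only)
causal = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2])   # untyped SNP, recovered by imputation
tag = np.array([0, 0, 1, 0, 1, 1, 0, 1, 2, 1])      # typed SNP in imperfect LD with it
noise = np.array([0.2, -0.1, 0.1, -0.2, 0.3, -0.3, 0.1, -0.1, 0.2, -0.2])
phenotype = causal + noise

r2_tag = r2(tag, phenotype)
r2_causal = r2(causal, phenotype)
print(f"typed tag SNP:      r2 = {r2_tag:.2f}")
print(f"imputed causal SNP: r2 = {r2_causal:.2f}")
```

In a real analysis the imputed dosages carry uncertainty, so the gain is smaller than in this idealized case and depends on imputation accuracy at the SNP.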
[Flow diagram built up over several slides; final state shown.]
Standard Imputation
• GWAS genotypes + 1000G Pilot haplotypes → Imputed GWAS genotypes: 40 minutes per genome
• GWAS genotypes + 1000G Phase I haplotypes → Imputed GWAS genotypes: 7800 minutes per genome
Pre-phasing Imputation
• GWAS genotypes → GWAS haplotypes: 25 minutes per genome
• GWAS haplotypes + 1000G Pilot haplotypes → Imputed GWAS genotypes: 1 minute per genome
• GWAS haplotypes + 1000G Phase I haplotypes → Imputed GWAS genotypes: 24 minutes per genome
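
The running times on these slides can be compared with a little arithmetic. The sketch below assumes the pairing suggested by the order in which the slides build up: 40 and 7800 minutes per genome for standard imputation from the Pilot and Phase I panels, 25 minutes for the one-time phasing step, and 1 and 24 minutes for imputing into pre-phased haplotypes from the Pilot and Phase I panels. The dictionary and function names are illustrative.

```python
# Per-genome running times in minutes, as read from the slides
# (the pairing of times with panels is an assumption, see the text above)
STANDARD = {"1000G Pilot": 40, "1000G Phase I": 7800}
PHASING = 25                                   # GWAS genotypes -> GWAS haplotypes, done once
PREPHASED = {"1000G Pilot": 1, "1000G Phase I": 24}

def standard_cost(panels):
    """Total minutes per genome when each panel is imputed from unphased genotypes."""
    return sum(STANDARD[p] for p in panels)

def prephasing_cost(panels):
    """Total minutes per genome when genotypes are phased once and reused for each panel."""
    return PHASING + sum(PREPHASED[p] for p in panels)

both = ["1000G Pilot", "1000G Phase I"]
print("standard, both panels:   ", standard_cost(both), "min per genome")
print("pre-phasing, both panels:", prephasing_cost(both), "min per genome")
print("pre-phasing, Phase I only:", prephasing_cost(["1000G Phase I"]), "min per genome")
```

Under these assumptions the phasing cost is paid once, so each new reference panel release adds only the short haplotype-based imputation step.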
Imputation Accuracy (mean R²)
1000G panel   MAF 1-3%   MAF 3-5%   MAF >5%
60 CEU        0.66       0.78       0.88
60 CEU        0.65       0.77       0.87
283 EUR       0.73       0.78       0.92
381 EUR       0.83       0.85       0.94
Getting the latest 1,000 Genomes haplotypes
• Phase 1 haplotypes now include SNPs, INDELs, and SVs!
• 1,000 Genomes haplotypes are available in the formats required by various imputation programs. For example:
– Beagle: http://faculty.washington.edu/browning/beagle/beagle.html
– IMPUTE2: http://mathgen.stats.ox.ac.uk/impute/impute_v2.html
– MaCH/minimac: http://www.sph.umich.edu/csg/abecasis/MACH/download/
• Thanks for coming!