Using 1,000 Genomes data for imputation in genome-wide association studies. 1,000 Genomes Data Tutorial, ICHG 2011, Montreal. Bryan Howie, University of Chicago.


Page 1: Using 1,000 Genomes data for imputation in genome-wide ... · Imputation in GWAS Studies Author: Bryan Howie Subject: ICHG 2011: 1000 Genomes Project Data Tutorial Keywords: ICHG

Using 1,000 Genomes data for imputation in genome-wide association studies

1,000 Genomes Data Tutorial

ICHG 2011, Montreal

Bryan Howie, University of Chicago

Page 2: Genotype imputation background

[Figure: reference haplotypes (alleles coded 0/1) shown above phenotyped GWAS samples (genotypes coded 0/1/2, with "?" marking missing calls) at SNPs genotyped on an array.]

Page 3: Genotype imputation background

[Figure: the reference haplotypes also carry alleles at untyped SNPs, where every genotype in the phenotyped GWAS samples is unknown ("?").]

Page 4: Genotype imputation background

[Figure: the untyped SNPs in the phenotyped GWAS samples are now filled in with imputed genotypes (0/1/2) based on the reference haplotypes; an association signal is marked on the figure.]
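The idea pictured on the background slides can be sketched in a few lines. This is a toy illustration only, not the method used by any imputation program: it assumes the study data are already a single phased haplotype, picks the one reference haplotype that best matches it at the typed SNPs, and copies that haplotype's alleles into the untyped positions. The function name `impute_haplotype` and the example data are hypothetical.

```python
# Toy illustration only: real imputation programs use probabilistic
# haplotype-copying models rather than a single best match.
MISSING = None  # marks an untyped SNP in the study haplotype


def impute_haplotype(study_hap, reference_haps):
    """Fill untyped alleles by copying from the closest reference haplotype."""
    # Positions where the study haplotype has an observed allele.
    typed = [i for i, allele in enumerate(study_hap) if allele is not MISSING]

    def mismatches(ref):
        # Number of typed SNPs where the reference disagrees with the study data.
        return sum(ref[i] != study_hap[i] for i in typed)

    best = min(reference_haps, key=mismatches)
    # Keep observed alleles; borrow the rest from the best-matching reference.
    return [best[i] if allele is MISSING else allele
            for i, allele in enumerate(study_hap)]


reference = [
    [0, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1],
]
study = [1, MISSING, 0, MISSING, 1, MISSING]
imputed = impute_haplotype(study, reference)
print(imputed)  # [1, 1, 0, 0, 1, 0]
```

Here the second reference haplotype matches the study haplotype at all three typed SNPs, so its alleles fill the three gaps.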

Page 5: A brief history of imputation reference panels

HapMap 2, HapMap 3, and the 1,000 Genomes Project

Page 6: HapMap 2 (2007)

Reference panels: CEU, CHB, JPT, YRI

[Figure: GWAS genotypes shown alongside the reference panels.]

Page 7: HapMap 2 (2007)

Reference panels: CEU, CHB, JPT, YRI

[Figure: GWAS genotypes shown alongside the reference panels, with a bar chart of log10 # of genotypes (axis ticks at 9, 10, 11) showing HM2.]

Page 8: HapMap 3 (2009)

Reference panels: ASW, MKK, LWK, YRI, CHD, CHB, JPT, GIH, CEU, TSI, MEX

[Figure: GWAS genotypes shown alongside the reference panels, with a bar chart of log10 # of genotypes (axis ticks at 9, 10, 11) showing HM2 and HM3.]

Page 9: 1,000 Genomes (2010+)

[Figure: GWAS genotypes shown alongside the reference panels, with a bar chart of log10 # of genotypes (axis ticks at 9, 10, 11) showing HM2, HM3, and 1kG.]

Page 10: 1,000 Genomes haplotypes are highly accurate

[Figure: two panels, ALL SNPs (minor allele frequency 0.0 to 0.5) and LOW-FREQUENCY SNPs (minor allele frequency 0.00 to 0.05), plotting imputation accuracy (y-axis, 0.6 to 1.0) against minor allele frequency of imputed SNPs. Legend: European ancestry, African ancestry, Admixed (Americas).]
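The x-axis of these plots is the minor allele frequency (MAF) of each imputed SNP. As a minimal sketch, assuming genotypes are coded as 0, 1, or 2 copies of one allele and missing calls as `None`, MAF can be computed as below. The function name and the example genotypes are hypothetical; the 0.05 cutoff is simply the upper limit of the LOW-FREQUENCY panel's axis.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP; genotypes are 0/1/2 allele counts, None if missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    # Each called sample contributes two alleles.
    freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq, 1 - freq)


genotypes = [0, 1, 0, 0, 2, None, 0, 1, 0, 0, 0]
maf = minor_allele_frequency(genotypes)
low_frequency = maf <= 0.05  # would this SNP fall in the low-frequency panel?
print(maf, low_frequency)
```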

Page 11: Imputation accuracy depends on your GWAS chip

[Figure: six plots in two groups of three, ALL SNPs (minor allele frequency 0.0 to 0.5) and LOW-FREQUENCY SNPs (minor allele frequency 0.00 to 0.05), each plotting imputation accuracy (y-axis, 0.2 to 1.0) against minor allele frequency of imputed SNPs. Legend: Omni 2.5M, Illumina 550k, Affymetrix 500k.]

Page 12: Imputation from 1,000 Genomes haplotypes can strengthen association signals.

GWAS of Osteoarthritis, Day-Williams et al. (AJHG 2011)

Page 13: Standard Imputation

GWAS genotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 40 minutes per genome

Page 14: Standard Imputation

GWAS genotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 40 minutes per genome

GWAS genotypes + 1000G Phase I haplotypes -> Imputed GWAS genotypes: 7800 minutes per genome

Page 15: Standard Imputation

GWAS genotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 40 minutes per genome

GWAS genotypes + 1000G Phase I haplotypes -> Imputed GWAS genotypes: 7800 minutes per genome

Pre-phasing Imputation

GWAS genotypes -> GWAS haplotypes: 25 minutes per genome

Page 16: Standard Imputation

GWAS genotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 40 minutes per genome

GWAS genotypes + 1000G Phase I haplotypes -> Imputed GWAS genotypes: 7800 minutes per genome

Pre-phasing Imputation

GWAS genotypes -> GWAS haplotypes: 25 minutes per genome

GWAS haplotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 1 minute per genome

Page 17: Standard Imputation

GWAS genotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 40 minutes per genome

GWAS genotypes + 1000G Phase I haplotypes -> Imputed GWAS genotypes: 7800 minutes per genome

Pre-phasing Imputation

GWAS genotypes -> GWAS haplotypes: 25 minutes per genome

GWAS haplotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 1 minute per genome

GWAS haplotypes + 1000G Phase I haplotypes -> Imputed GWAS genotypes: 24 minutes per genome
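The running times quoted on these slides can be compared with a little arithmetic. The sketch below assumes a particular reading of the diagram: 40 and 7800 minutes per genome for standard imputation from the Pilot and Phase I panels, 25 minutes for the pre-phasing step, and 1 and 24 minutes for imputation into pre-phased haplotypes. It also assumes the pre-phased GWAS haplotypes can be reused for each panel, as the diagram suggests. The dictionary and function names are hypothetical.

```python
# Per-genome running times in minutes, as read from the slides (an assumption
# about which time belongs to which step of the diagram).
STANDARD = {"1000G Pilot": 40, "1000G Phase I": 7800}
PRE_PHASING = 25  # phasing the GWAS genotypes into haplotypes
PHASED_IMPUTATION = {"1000G Pilot": 1, "1000G Phase I": 24}


def speedup(panel, include_prephasing=True):
    """Ratio of standard imputation time to pre-phasing workflow time."""
    fast = PHASED_IMPUTATION[panel]
    if include_prephasing:
        fast += PRE_PHASING
    return STANDARD[panel] / fast


for panel in STANDARD:
    first_run = speedup(panel)          # pre-phasing cost included
    later_runs = speedup(panel, False)  # haplotypes already phased
    print(f"{panel}: {first_run:.1f}x first run, {later_runs:.1f}x afterwards")
```

Under these assumptions the Phase I panel goes from 7800 minutes to 25 + 24 = 49 minutes per genome, and to 24 minutes once the GWAS haplotypes already exist.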

Page 18: Standard Imputation

GWAS genotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 40 minutes per genome

GWAS genotypes + 1000G Phase I haplotypes -> Imputed GWAS genotypes: 7800 minutes per genome

Pre-phasing Imputation

GWAS genotypes -> GWAS haplotypes: 25 minutes per genome

GWAS haplotypes + 1000G Pilot haplotypes -> Imputed GWAS genotypes: 1 minute per genome

GWAS haplotypes + 1000G Phase I haplotypes -> Imputed GWAS genotypes: 24 minutes per genome

Imputation Accuracy (mean R2)

1000G panel   MAF 1-3%   MAF 3-5%   MAF >5%
60 CEU        0.66       0.78       0.88
60 CEU        0.65       0.77       0.87
283 EUR       0.73       0.78       0.92
381 EUR       0.83       0.85       0.94
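The table reports accuracy as mean R2 without defining it. A common choice, assumed here, is the squared Pearson correlation between the true genotypes (0/1/2) and the imputed allele dosages at each SNP, averaged over the SNPs in a MAF bin. The sketch below computes that quantity for a single SNP; the function name and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean


def r_squared(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    mx, my = mean(true_genotypes), mean(imputed_dosages)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(true_genotypes, imputed_dosages))
    sxx = sum((x - mx) ** 2 for x in true_genotypes)
    syy = sum((y - my) ** 2 for y in imputed_dosages)
    if sxx == 0 or syy == 0:
        # No variation in one of the inputs, so the correlation is undefined.
        raise ValueError("R^2 is undefined for a monomorphic SNP")
    return sxy ** 2 / (sxx * syy)


truth = [0, 1, 2, 0, 1, 0]                # genotypes from direct typing
dosage = [0.1, 0.9, 1.8, 0.0, 1.2, 0.2]   # expected allele counts from imputation
r2 = r_squared(truth, dosage)
print(round(r2, 2))
```

Averaging this value over all SNPs whose MAF falls in a bin would give one cell of a table like the one above.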

Page 19: Getting the latest 1,000 Genomes haplotypes

•  Phase 1 haplotypes now include SNPs, INDELs, and SVs!

•  1,000 Genomes haplotypes are available in the formats required by various imputation programs. For example:

–  Beagle: http://faculty.washington.edu/browning/beagle/beagle.html

–  IMPUTE2: http://mathgen.stats.ox.ac.uk/impute/impute_v2.html

–  MaCH/minimac: http://www.sph.umich.edu/csg/abecasis/MACH/download/

•  Thanks for coming!