Association Mapping David Evans


Post on 18-Dec-2015

Page 1: Association Mapping David Evans. Outline Definitions / Terminology What is (genetic) association? How do we test for association? When to use association

Association Mapping

David Evans

Outline

• Definitions / Terminology
• What is (genetic) association?
• How do we test for association?
• When to use association
• HapMap and tagging
• Genome-wide Association
• Sequencing and Rare variants

Definitions

SNP: “Single Nucleotide Polymorphism”, a mutation that produces a single base pair change in the DNA sequence

Locus: location on the genome

Alleles: alternate forms of a SNP (mutation)

Genotype: both alleles at a locus form a genotype

Haplotype: the pattern of alleles on a chromosome

[Figure: a pair of chromosomes with SNP alleles (A, C, G, T) marked, annotated to show alleles, genotypes and haplotypes]

QTL: “Quantitative trait locus”, a region of the genome that changes the mean value of a quantitative phenotype

What is (genetic) association?

Correlation between an allele/genotype/haplotype and a trait of interest

Genetic Association: Three Common Forms

• Direct Association
  – Mutant or ‘susceptible’ polymorphism
  – Allele of interest is itself involved in phenotype
  – ~70% of Cystic Fibrosis patients have a deletion of 3 base pairs resulting in the loss of a phenylalanine amino acid at position 508 of the CFTR protein

• Indirect Association
  – Allele itself is not involved, but a nearby correlated variant changes phenotype

Indirect association and Linkage disequilibrium

[Figure: indirect association and linkage disequilibrium; axis label: time]

Linkage Disequilibrium

Linkage disequilibrium means that we don’t need to genotype the exact aetiological variant, but only a variant that is correlated with it
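
How strongly a genotyped marker is correlated with an untyped variant is commonly summarised by the disequilibrium coefficient D and the squared correlation r². The sketch below computes both from haplotype and allele frequencies. The function name and the frequencies are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A at locus 1
           and allele B at locus 2
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    # D measures the departure of the haplotype frequency from independence
    d = p_ab - p_a * p_b
    # r2 is the squared correlation between the alleles at the two loci
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Hypothetical frequencies (illustrative only)
d, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"D = {d:.3f}, r2 = {r2:.3f}")
```

An r² close to 1 means the marker is a near-perfect proxy for the aetiological variant; an r² close to 0 means genotyping the marker says little about it.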

Genetic Association: Three Common Forms

• Direct Association
  – Mutant or ‘susceptible’ polymorphism
  – Allele of interest is itself involved in phenotype

• Indirect Association
  – Allele itself is not involved, but a nearby correlated variant changes phenotype

• Spurious Association
  – Apparent association not related to genetic aetiology (e.g. population stratification)

Population Stratification

Marchini, Nat Genet. 2004

How do we test for association?

Genetic Case Control Study

[Figure: genotypes (T/T, T/G, G/G) of individuals in two groups, Controls and Cases]

Allele G is ‘associated’ with disease

Allele-based tests

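• Each individual contributes two counts to the 2x2 table.

        Cases   Controls   Total
G       n1A     n1U        n1·
T       n0A     n0U        n0·
Total   n·A     n·U        n··

• Test of association

$$X^2 = \sum_{i \in \{0,1\}} \sum_{j \in \{A,U\}} \frac{\left(n_{ij} - E[n_{ij}]\right)^2}{E[n_{ij}]}$$

where

$$E[n_{ij}] = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}}$$

• X² has a χ² distribution with 1 degree of freedom under the null hypothesis.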
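
As a minimal sketch of the allele-based test, the function below computes X² and its p-value from the four allele counts using only the standard library. The function name and the counts in the example are hypothetical. The p-value uses the identity that the upper tail of a χ² distribution with 1 degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def allelic_chi2(case_g, case_t, ctrl_g, ctrl_t):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts."""
    table = [[case_g, ctrl_g],   # row i = 1: allele G
             [case_t, ctrl_t]]   # row i = 0: allele T
    row_tot = [sum(r) for r in table]
    col_tot = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_tot)

    x2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_tot[i] * col_tot[j] / n   # E[n_ij]
            x2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail probability of a chi-square variable with 1 df
    p = math.erfc(math.sqrt(x2 / 2))
    return x2, p


# Hypothetical allele counts: 50 cases and 50 controls, two alleles each
x2, p = allelic_chi2(case_g=60, case_t=40, ctrl_g=40, ctrl_t=60)
print(f"X2 = {x2:.3f}, p = {p:.4f}")
```

Counting alleles rather than individuals doubles the apparent sample size, which is only appropriate when genotypes are in Hardy-Weinberg proportions.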

Genotypic tests

• SNP marker data can be represented in a 2x3 table.

        Cases   Controls   Total
GG      n2A     n2U        n2·
GT      n1A     n1U        n1·
TT      n0A     n0U        n0·
Total   n·A     n·U        n··

• Test of association

$$X^2 = \sum_{i \in \{0,1,2\}} \sum_{j \in \{A,U\}} \frac{\left(n_{ij} - E[n_{ij}]\right)^2}{E[n_{ij}]}$$

where

$$E[n_{ij}] = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}}$$

• X² has a χ² distribution with 2 degrees of freedom under the null hypothesis.
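
The genotypic test can be sketched in the same way. The function name and genotype counts below are hypothetical, and the sketch assumes every genotype is observed at least once (otherwise an expected count is zero). For 2 degrees of freedom the upper tail of the χ² distribution has the closed form exp(-x/2).

```python
import math


def genotypic_chi2(cases, controls):
    """Pearson chi-square test (2 df) on genotype counts.

    cases, controls : counts of (GG, GT, TT) in each group.
    Assumes each genotype occurs at least once overall.
    """
    n_a, n_u = sum(cases), sum(controls)
    n = n_a + n_u

    x2 = 0.0
    for obs_a, obs_u in zip(cases, controls):
        row_tot = obs_a + obs_u                      # n_i.
        for observed, col_tot in ((obs_a, n_a), (obs_u, n_u)):
            expected = row_tot * col_tot / n         # E[n_ij]
            x2 += (observed - expected) ** 2 / expected

    # Upper tail probability of a chi-square variable with 2 df
    p = math.exp(-x2 / 2)
    return x2, p


# Hypothetical genotype counts (GG, GT, TT)
x2, p = genotypic_chi2(cases=(30, 50, 20), controls=(20, 50, 30))
print(f"X2 = {x2:.3f}, p = {p:.4f}")
```

The genotypic test makes no assumption about how risk changes with the number of G alleles, at the cost of an extra degree of freedom compared with the allele-based test.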

Simple Regression Model of Association (Unrelated individuals)

$$Y_i = a + bX_i + e_i$$

where
Y_i = trait value for individual i
X_i = number of ‘A’ alleles individual i has (0, 1 or 2)

[Figure: trait value Y plotted against allele count X = 0, 1, 2]

The association test is whether b > 0
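
A minimal sketch of this regression, assuming ordinary least squares on unrelated individuals. The function name and the data are hypothetical. The sketch returns the slope b, its standard error and the t statistic (n − 2 degrees of freedom); converting t to a p-value is left to a statistics library.

```python
import math


def regress_trait_on_allele_count(x, y):
    """Least-squares fit of Y = a + b*X; returns (a, b, se_b, t)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    b = sxy / sxx               # per-allele effect on the trait
    a = mean_y - b * mean_x     # intercept

    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    t = b / se_b                # test statistic for b = 0
    return a, b, se_b, t


# Hypothetical data: allele counts and trait values for six individuals
x = [0, 0, 1, 1, 2, 2]
y = [0.1, 0.3, 0.5, 0.7, 0.9, 1.1]
a, b, se_b, t = regress_trait_on_allele_count(x, y)
print(f"a = {a:.2f}, b = {b:.2f}, se = {se_b:.3f}, t = {t:.2f}")
```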

Transmission Disequilibrium Test

[Figure: pedigree with parental genotypes AC and AA and an offspring with genotype AC]

• Rationale: Related individuals have to be from the same population

• Compare the number of times heterozygous parents transmit the “A” vs the “C” allele to affected offspring

Transmission Disequilibrium Test

• Difficult to gather families

• Difficult to get parents for late onset / psychiatric conditions

• Inefficient for genotyping (particularly GWA)
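
The transmission comparison described above can be sketched as follows. The slides do not show the test statistic; this sketch assumes the usual McNemar-type form, (b − c)² / (b + c), referred to a χ² distribution with 1 degree of freedom. The function name and the transmission counts are hypothetical.

```python
import math


def tdt(trans_a, trans_c):
    """Transmission disequilibrium test.

    trans_a : times heterozygous (AC) parents transmitted A
              to an affected offspring
    trans_c : times they transmitted C
    """
    # Under no association, A and C are each transmitted half the time
    x2 = (trans_a - trans_c) ** 2 / (trans_a + trans_c)
    # Upper tail probability of a chi-square variable with 1 df
    p = math.erfc(math.sqrt(x2 / 2))
    return x2, p


# Hypothetical counts from 100 informative transmissions
x2, p = tdt(trans_a=60, trans_c=40)
print(f"TDT X2 = {x2:.2f}, p = {p:.4f}")
```

Only heterozygous parents are informative, which is one reason the design is inefficient for genotyping: homozygous parents are typed but contribute nothing to the statistic.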

Case-control versus TDT

[Figure, two panels: “N units for 90% power” and “N individuals for 90% power”, each plotted against allele frequency (0 to 0.25) for CC (K=0.1), CC (K=0.01) and TDT]

p = 0.1; RAA = RAa = 2

When to use association...

Methods of gene hunting

[Figure: effect size (y-axis) against frequency (x-axis), with two regions labelled:]

rare, monogenic (linkage)

common, complex (association)

Association Summary

1. Families or unrelateds

2. Matching/ethnicity crucial

3. Many markers required for genome coverage (10^5 – 10^6 SNPs)

4. Powerful design

5. Ok for initial detection; good for fine-mapping

6. Powerful for common variants; rare variants difficult

HapMap and Tagging

Historical gene mapping

Glazier et al, Science (2002).

Reasons for Failure?

[Figure labels: Complex Phenotype; Common environment; Individual environment; Polygenic background; Marker; Gene1; Gene2; Gene3; Linkage; Linkage disequilibrium; Mode of inheritance; Association]

Weiss & Terwilliger (2000) Nat Genet

Inadequate Marker Coverage (Candidate gene studies)

Enabling association studies: HapMap

Visualizing empirical LD

Pairwise tagging

SNPs: 1 (A/T), 2 (G/A), 3 (G/C), 4 (T/C), 5 (G/C), 6 (A/C)

[Figure: haplotypes across the six SNPs, with three groups of SNPs marked as being in high r²]

Tags: SNP 1, SNP 3, SNP 6 (3 in total)

Test for association: SNP 1, SNP 3, SNP 6

Carlson et al. (2004) AJHG 74:106
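
One way to choose pairwise tags is a greedy pass over the matrix of pairwise r² values: repeatedly pick the SNP that captures the most SNPs not yet tagged. The sketch below is a simplified illustration of that idea, not a reproduction of any published implementation. The function name, the 0.8 threshold and the r² values are assumptions, and the grouping of SNPs is invented for the example rather than read off the figure.

```python
def greedy_pairwise_tags(r2, threshold=0.8):
    """Pick tag SNPs so that every SNP has r2 >= threshold with some tag.

    r2 : symmetric matrix (list of lists) of pairwise r2 values.
    Returns the indices of the chosen tags, in the order they were picked.
    """
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        best, best_set = None, set()
        for i in sorted(untagged):
            # SNPs that candidate i would capture, including itself
            captured = {j for j in untagged
                        if j == i or r2[i][j] >= threshold}
            if len(captured) > len(best_set):
                best, best_set = i, captured
        tags.append(best)
        untagged -= best_set
    return tags


# Hypothetical r2 matrix for six SNPs: high r2 (0.9) within each
# invented group, low r2 (0.1) between groups
groups = [[0, 1], [2, 3, 4], [5]]
n_snps = 6
r2 = [[0.1] * n_snps for _ in range(n_snps)]
for group in groups:
    for i in group:
        for j in group:
            r2[i][j] = 1.0 if i == j else 0.9

tags = greedy_pairwise_tags(r2)
print("Tag SNPs:", sorted(t + 1 for t in tags))  # 1-based SNP numbers
```

Only the tags are then genotyped and tested; the remaining SNPs are covered indirectly through their high r² with a tag.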

Genome-wide Association

Enabling Genome-wide Association Studies

HAPlotype MAP

High throughput genotyping

Large cohorts

Genome-wide Association Studies

The Australo-Anglo-American Ankylosing Spondylitis Consortium (2010) Nature Genetics

Meta-analysis

Repapi et al. (2009) Nature Genetics

[Figure: imputation. Reference haplotypes from HapMap Phase II (coded 0/1) are shown alongside case and control genotypes (coded 0/1/2), where “?” marks genotypes that are missing and to be imputed; a recombination rate track accompanies the panel]

Imputation

Genomic control

[Figure: the χ² statistic at the test locus compared with E(χ²) at unlinked ‘null’ markers, for two scenarios: no stratification, and stratification]

Stratification: adjust test statistic
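
A common way to carry out this adjustment, sketched here under stated assumptions, is to estimate an inflation factor λ as the median χ² across the null markers divided by the median of a χ² distribution with 1 degree of freedom (about 0.455), and then divide the test statistic by λ. The function name and the χ² values are hypothetical; flooring λ at 1 is a widespread convention rather than something stated in the slides.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(test_x2, null_x2):
    """Return (lambda, adjusted chi-square) for a test locus.

    test_x2 : 1-df chi-square statistic at the test locus
    null_x2 : 1-df chi-square statistics at unlinked 'null' markers
    """
    lam = statistics.median(null_x2) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)   # convention: never inflate the test statistic
    return lam, test_x2 / lam


# Hypothetical statistics from five null markers (illustrative only;
# real analyses use many more markers)
null_x2 = [0.2, 0.5, 0.91, 1.3, 2.0]
lam, adjusted = genomic_control(test_x2=10.0, null_x2=null_x2)
print(f"lambda = {lam:.2f}, adjusted X2 = {adjusted:.2f}")
```

With no stratification λ is close to 1 and the statistic is left essentially unchanged; with stratification λ exceeds 1 and the statistic is scaled down.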

PCA

Replication

Replication studies should be of sufficient size to demonstrate the effect

Replication studies should be conducted in independent datasets

Replication should involve the same phenotype

Replication should be conducted in a similar population

The same SNP should be tested

The replicated signal should be in the same direction

Joint analysis should lead to a lower p value than the original report

Well designed negative studies are valuable

Programs for performing association analysis

• Mx (Neale)
  – Fully flexible, ordinal data
  – Not ideal for large pedigrees or GWAs

• PLINK (Purcell, Neale, Ferreira)
  – GWA

• Haploview (Barrett)
  – Graphical visualization of LD, tagging, basic tests of association

• MERLIN, QTDT (Abecasis)
  – Association and linkage in families

Sequencing and Rare Variants

Metzker et al (2010) Nature Reviews Genetics

Analysis of Rare Variants

• How to combine rare variants?
  – “Ordinary” tests of association won’t work
  – Collapse across all SNPs?

• Which SNPs to include?
  – Frequency?
  – Function?

• How to define a region?
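
One possible answer to the collapsing question, sketched under stated assumptions: within a region, mark each individual as a carrier if they hold at least one minor allele at any SNP below a frequency cut-off, then compare carrier counts between cases and controls with a 1-df χ² test. The function names, the 1% cut-off and all data are hypothetical, and the example is far too small for the χ² approximation to be reliable; it only shows the mechanics.

```python
import math


def collapse_carriers(genotypes, maf, max_maf=0.01):
    """Collapse rare variants in a region to one carrier indicator.

    genotypes : one list per individual of minor-allele counts (0/1/2),
                one entry per SNP in the region
    maf       : minor allele frequency of each SNP
    """
    rare = [k for k, freq in enumerate(maf) if freq < max_maf]
    return [int(any(g[k] > 0 for k in rare)) for g in genotypes]


def carrier_chi2(case_carriers, ctrl_carriers):
    """1-df chi-square on carrier status by case-control status.

    Assumes at least one carrier and one non-carrier overall.
    """
    a = sum(case_carriers)
    b = len(case_carriers) - a
    c = sum(ctrl_carriers)
    d = len(ctrl_carriers) - c
    n = a + b + c + d
    x2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return x2, math.erfc(math.sqrt(x2 / 2))


# Hypothetical region with three SNPs; only the first and third are rare
maf = [0.005, 0.2, 0.008]
cases = [[1, 0, 0], [0, 2, 0], [0, 0, 1], [0, 1, 0]]
controls = [[0, 1, 0], [0, 0, 0], [0, 2, 0], [0, 0, 1]]

case_carriers = collapse_carriers(cases, maf)
ctrl_carriers = collapse_carriers(controls, maf)
x2, p = carrier_chi2(case_carriers, ctrl_carriers)
print(case_carriers, ctrl_carriers, f"X2 = {x2:.3f}")
```

The choices the slide raises show up directly as parameters here: the frequency cut-off decides which SNPs are included, and the list of SNPs passed in defines the region.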

Summary

1. Genetic association studies can be used to locate common genetic variants that increase risk of disease/affect quantitative phenotypes

2. Genome-wide association has been spectacularly successful in identifying common variants underlying complex traits and disease

3. The next challenge is to explain the “missing heritability” in the genome. Genome-wide sequencing and the analysis of rare variants will play a major part in this effort