Power calculation for QTL association (discrete and quantitative traits) Shaun Purcell & Pak Sham Advanced Workshop Boulder, CO, 2003


Page 1

Power calculation for QTL association (discrete and quantitative traits)

Shaun Purcell & Pak Sham

Advanced Workshop

Boulder, CO, 2003

Page 2

Trait type     Design                2 x 2 table columns
Quantitative   Variance components   (none)
Threshold      Case-control          High / Low
Threshold      TDT                   Tr / UnTr
Discrete       Case-control          Aff / UnAff
Discrete       TDT                   Tr / UnTr

Each 2 x 2 table has alleles as rows:

A   n1   n2
a   n3   n4

Page 3

Discrete trait calculation

p Frequency of high-risk allele

K Prevalence of disease

RAA Genotypic relative risk for AA genotype

RAa Genotypic relative risk for Aa genotype

N, α, β Sample size, Type I & II error rates

Page 4

Risk is P(D|G)

$g_{AA} = R_{AA}\, g_{aa}$, $g_{Aa} = R_{Aa}\, g_{aa}$

$K = p^2 g_{AA} + 2pq\, g_{Aa} + q^2 g_{aa}$

$g_{aa} = K / (p^2 R_{AA} + 2pq\, R_{Aa} + q^2)$

Odds ratios (e.g. for AA genotype):

$\mathrm{OR}_{AA} = \dfrac{g_{AA} / (1 - g_{AA})}{g_{aa} / (1 - g_{aa})}$

Page 5

Need to calculate P(G|D)

$P(G \mid D) = \dfrac{P(D \mid G)\, P(G)}{\sum_{G} P(D \mid G)\, P(G)}$

Expected proportion d of genotypes in cases

$d_{AA} = g_{AA}\, p^2 / (g_{AA}\, p^2 + g_{Aa}\, 2pq + g_{aa}\, q^2)$

$d_{Aa} = g_{Aa}\, 2pq / (g_{AA}\, p^2 + g_{Aa}\, 2pq + g_{aa}\, q^2)$

$d_{aa} = g_{aa}\, q^2 / (g_{AA}\, p^2 + g_{Aa}\, 2pq + g_{aa}\, q^2)$

Expected number of A alleles for cases

$2 N_{Case} (d_{AA} + d_{Aa} / 2)$

Expected proportion c of genotypes in controls

$c_{AA} = (1 - g_{AA})\, p^2 / ((1 - g_{AA})\, p^2 + (1 - g_{Aa})\, 2pq + (1 - g_{aa})\, q^2)$

Page 6

Full contingency table

          “A” allele                              “a” allele
Case      $2 N_{Case} (d_{AA} + d_{Aa} / 2)$        $2 N_{Case} (d_{aa} + d_{Aa} / 2)$
Control   $2 N_{Control} (c_{AA} + c_{Aa} / 2)$     $2 N_{Control} (c_{aa} + c_{Aa} / 2)$

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
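The discrete-trait calculation (genotypic risks, genotype proportions in cases and controls, allele-count table, chi-squared) can be sketched in a few lines. This is a minimal illustration, not the Genetic Power Calculator's own code; the function and argument names are hypothetical, and the example values (K = 0.02, RAa = 2, RAA = 4, 100 cases and 100 controls) are taken from the planning exercise later in the deck, with p = 0.2 chosen arbitrarily.

```python
def case_control_ncp(p, K, r_Aa, r_AA, n_case, n_control):
    """Expected allelic chi-squared (1 df) for a case-control sample."""
    q = 1.0 - p
    freq = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}  # P(G) under HWE
    # Genotypic risks P(D|G), scaled so that they average to K.
    g_aa = K / (freq["AA"] * r_AA + freq["Aa"] * r_Aa + freq["aa"])
    risk = {"AA": r_AA * g_aa, "Aa": r_Aa * g_aa, "aa": g_aa}
    # Bayes' rule: genotype proportions in cases (d) and controls (c).
    d = {g: risk[g] * freq[g] / K for g in freq}
    c = {g: (1 - risk[g]) * freq[g] / (1 - K) for g in freq}
    # Expected allele counts; rows are case/control, columns are A/a.
    table = [
        [2 * n_case * (d["AA"] + d["Aa"] / 2),
         2 * n_case * (d["aa"] + d["Aa"] / 2)],
        [2 * n_control * (c["AA"] + c["Aa"] / 2),
         2 * n_control * (c["aa"] + c["Aa"] / 2)],
    ]
    total = sum(sum(row) for row in table)
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for row in table:
        for j, observed in enumerate(row):
            expected = sum(row) * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


ncp = case_control_ncp(p=0.2, K=0.02, r_Aa=2, r_AA=4,
                       n_case=100, n_control=100)
print(round(ncp, 2))
```

The returned value plays the role of the expected chi-squared, which is then used as the non-centrality parameter when computing power.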

Page 7

Threshold selection

Genotype         AA           Aa           aa
Frequency        $q^2$        $2pq$        $p^2$
Trait mean       $-a$         $d$          $a$
Trait variance   $\sigma^2$   $\sigma^2$   $\sigma^2$

Page 8

$P(X) = \sum_{G} P(X \mid G)\, P(G)$

[Figure: density P(X) against trait value X, showing the AA, Aa and aa genotype distributions]

Page 9

$P(G \mid X < T) = P(X < T \mid G)\, P(G) / P(X < T)$

[Figure: density P(X) against X for the AA and Aa distributions, with the threshold T marked]

N.B. The cumulative standard normal distribution gives the area under the curve, P(X < T)
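As a sketch of the threshold calculation, the genotype proportions below a threshold T follow from Bayes' rule with a normal CDF for each genotype. The function name and example values are hypothetical; the genotype frequencies and means follow the convention tabulated on the threshold selection slide (AA has frequency q^2 and mean -a), and each genotype is assumed to be normally distributed with a common standard deviation sigma.

```python
from statistics import NormalDist


def genotype_probs_below(T, p, a, d, sigma):
    """P(G | X < T) for the three genotypes, by Bayes' rule."""
    q = 1.0 - p
    # Frequencies and means as tabulated on the threshold selection slide.
    freq = {"AA": q * q, "Aa": 2 * p * q, "aa": p * p}
    mean = {"AA": -a, "Aa": d, "aa": a}
    # Numerators P(X < T | G) P(G); each genotype is normal with SD sigma.
    joint = {g: NormalDist(mean[g], sigma).cdf(T) * freq[g] for g in freq}
    p_below = sum(joint.values())  # P(X < T)
    return {g: joint[g] / p_below for g in joint}


probs = genotype_probs_below(T=-1.0, p=0.5, a=0.3, d=0.0, sigma=1.0)
print({g: round(v, 3) for g, v in probs.items()})
```

The upper-tail group, P(G | X > T), can be obtained the same way by replacing each CDF value with one minus itself.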

Page 10

Incomplete LD

Effect of incomplete LD between QTL and marker

      A                  a
M     $p m_1 + \delta$   $q m_1 - \delta$
m     $p m_2 - \delta$   $q m_2 + \delta$

$\delta = D' \times D_{MAX}$, $D_{MAX} = \min\{p m_2,\ q m_1\}$

Note that linkage disequilibrium will depend on both D’ and the QTL and marker allele frequencies

Page 11

Incomplete LD

Consider genotypic risks at marker:

$P(D \mid MM) = \left[ (p m_1 + \delta)^2 P(D \mid AA) + 2 (p m_1 + \delta)(q m_1 - \delta) P(D \mid Aa) + (q m_1 - \delta)^2 P(D \mid aa) \right] / m_1^2$

Calculation proceeds as before, but at the marker

[Figure: marker genotype MM as a mixture of the haplotype pairs AM/AM, AM/aM or aM/AM, and aM/aM, i.e. QTL genotypes AA, Aa and aa on an MM background]
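A short sketch of the incomplete-LD step: haplotype frequencies are built from D’ and the allele frequencies, and the QTL genotypic risks are then averaged over the haplotype pairs that make up each marker genotype. The MM expression is the one given on the slide; the Mm and mm expressions are written by the same argument and are my extension, and all function names are hypothetical.

```python
def haplotype_freqs(p, m1, d_prime):
    """Frequencies of haplotypes AM, aM, Am, am for a given D'."""
    q, m2 = 1.0 - p, 1.0 - m1
    delta = d_prime * min(p * m2, q * m1)  # delta = D' x Dmax
    return {"AM": p * m1 + delta, "aM": q * m1 - delta,
            "Am": p * m2 - delta, "am": q * m2 + delta}


def marker_risks(p, m1, d_prime, g_AA, g_Aa, g_aa):
    """Genotypic risks P(D | marker genotype) under random mating."""
    h = haplotype_freqs(p, m1, d_prime)
    m2 = 1.0 - m1
    risk_MM = (h["AM"] ** 2 * g_AA
               + 2 * h["AM"] * h["aM"] * g_Aa
               + h["aM"] ** 2 * g_aa) / m1 ** 2
    # Assumed extension of the slide's MM formula to the other genotypes.
    risk_Mm = (2 * h["AM"] * h["Am"] * g_AA
               + 2 * (h["AM"] * h["am"] + h["aM"] * h["Am"]) * g_Aa
               + 2 * h["aM"] * h["am"] * g_aa) / (2 * m1 * m2)
    risk_mm = (h["Am"] ** 2 * g_AA
               + 2 * h["Am"] * h["am"] * g_Aa
               + h["am"] ** 2 * g_aa) / m2 ** 2
    return {"MM": risk_MM, "Mm": risk_Mm, "mm": risk_mm}


# Scenario from the incomplete LD exercise: K = 0.1, RAA = RAa = 2, p = 0.05.
p, K = 0.05, 0.1
g_aa = K / (p * p * 2 + 2 * p * (1 - p) * 2 + (1 - p) ** 2)
print(marker_risks(p, m1=0.25, d_prime=0.6,
                   g_AA=2 * g_aa, g_Aa=2 * g_aa, g_aa=g_aa))
```

Two sanity checks follow from the formulas: with D’ = 1 and equal allele frequencies the marker risks equal the QTL risks, and with D’ = 0 every marker genotype has risk K.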

Page 12

Discrete TDT calculation

1. Calculate probability of parental mating type given affected offspring

2. Calculate probability of offspring genotype given parental mating type and affected status

3. Calculate overall probability of heterozygous parents transmitting allele A as opposed to a

4. Calculate TDT test statistic, power
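The four steps can be sketched by enumerating parental mating types and transmissions, weighting each by the offspring's genotypic risk. This is an illustration under stated assumptions (Hardy-Weinberg proportions, random mating, Mendelian transmission), not the Genetic Power Calculator's implementation; the function name is hypothetical, and the non-centrality expression (b - c)^2 / (b + c) applied to expected counts is a simple large-sample approximation that I have assumed.

```python
from itertools import product


def tdt_transmissions(p, K, r_Aa, r_AA):
    """Expected A (b) and a (c) transmissions from heterozygous parents,
    per trio, given that the offspring is affected."""
    q = 1.0 - p
    # Genotypes are keyed by their number of A alleles.
    geno_freq = {2: p * p, 1: 2 * p * q, 0: q * q}
    g_aa = K / (p * p * r_AA + 2 * p * q * r_Aa + q * q)
    risk = {2: r_AA * g_aa, 1: r_Aa * g_aa, 0: g_aa}
    p_transmit_A = {2: 1.0, 1: 0.5, 0: 0.0}
    b = c = 0.0
    for father, mother in product(geno_freq, repeat=2):
        mating = geno_freq[father] * geno_freq[mother]  # random mating
        # from_f / from_m: transmitted allele, 1 for A and 0 for a.
        for from_f, from_m in product((1, 0), repeat=2):
            pr_f = p_transmit_A[father] if from_f else 1 - p_transmit_A[father]
            pr_m = p_transmit_A[mother] if from_m else 1 - p_transmit_A[mother]
            # P(mating type, transmissions | affected offspring)
            weight = mating * pr_f * pr_m * risk[from_f + from_m] / K
            for parent, allele in ((father, from_f), (mother, from_m)):
                if parent == 1:  # only heterozygous parents are informative
                    if allele:
                        b += weight
                    else:
                        c += weight
    return b, c


b, c = tdt_transmissions(p=0.25, K=0.02, r_Aa=2, r_AA=4)
n_trios = 100
ncp = n_trios * (b - c) ** 2 / (b + c)  # assumed approximation
print(round(b / (b + c), 3), round(ncp, 2))
```

Under a multiplicative model with per-allele risk ratio r, the transmission probability b / (b + c) reduces to r / (1 + r), which gives a quick check on the enumeration.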

Page 13

Fulker association model

The genotypic score (1, 0, -1) for sibling i is decomposed into between and within components:

$A_i = A_B + A_{Wi}$

$A_B = \dfrac{1}{s} \sum_{j=1}^{s} A_j$ (sibship genotypic mean)

$A_{Wi} = A_i - \dfrac{1}{s} \sum_{j=1}^{s} A_j$ (deviation from sibship genotypic mean)
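A minimal sketch of this decomposition for one sibship, with a hypothetical function name: the between component is the sibship mean of the genotypic scores, and each sibling's within component is the deviation from that mean.

```python
def between_within(scores):
    """Split sibling genotypic scores (1, 0, -1) into the sibship mean
    (between component) and each sibling's deviation (within component)."""
    s = len(scores)
    between = sum(scores) / s
    within = [a_i - between for a_i in scores]
    return between, within


between, within = between_within([1, 0, -1, 1])
print(between, within)  # 0.25 [0.75, -0.25, -1.25, 0.75]
```

The within components sum to zero by construction, which is why the within test is uninformative for sibships of size one.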

Page 14

NCPs of B and W tests

Approximation for between test

$\lambda_B \approx \dfrac{\frac{s+1}{2} V_A + \frac{s+3}{4} V_D}{V_N + s V_S}$

Approximation for within test

$\lambda_W \approx \dfrac{(s-1) \left( \frac{1}{2} V_A + \frac{3}{4} V_D \right)}{V_N}$

Sham et al. (2000) AJHG 66
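The two approximations can be tabulated for different sibship sizes. This sketch assumes the forms lambda_B = [(s+1)/2 V_A + (s+3)/4 V_D] / (V_N + s V_S) and lambda_W = (s-1)(V_A/2 + 3V_D/4) / V_N, read as per-sibship NCPs, where V_A and V_D are the additive and dominance QTL variances and V_S and V_N the shared and non-shared residual variances. The function names and variance values are hypothetical; the sibship layouts are those of the family-based association exercise later in the deck.

```python
def ncp_between(s, v_a, v_d, v_s, v_n):
    """Approximate per-sibship NCP of the between-sibship test."""
    return ((s + 1) / 2 * v_a + (s + 3) / 4 * v_d) / (v_n + s * v_s)


def ncp_within(s, v_a, v_d, v_n):
    """Approximate per-sibship NCP of the within-sibship test."""
    return (s - 1) * (v_a / 2 + 3 * v_d / 4) / v_n


# Hypothetical variance components (illustration only).
v_a, v_d, v_s, v_n = 0.10, 0.0, 0.40, 0.50
total_individuals = 1200
for s in (1, 2, 3, 4):
    n_sibships = total_individuals // s
    print(s,
          round(n_sibships * ncp_between(s, v_a, v_d, v_s, v_n), 1),
          round(n_sibships * ncp_within(s, v_a, v_d, v_n), 1))
```

With the number of individuals held fixed, the total between NCP falls and the total within NCP rises as sibships get larger, and the within NCP is zero for s = 1.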

Page 15

Practical Exercise

Calculation of power for simple case-control study.

DATA: frequency of risk factor in 30 cases and 30 controls

TEST: 2-by-2 contingency table, chi-squared (1 degree of freedom)

Page 16

Step 1: determine expected chi-squared

Hypothetical risk factor frequencies

                   Case   Control
A allele present   20     10
A allele absent    10     20

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$

Chi-squared statistic = 6.666
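The expected chi-squared for the hypothetical table can be checked directly. This is a small stand-alone sketch with a hypothetical function name; expected counts come from the row and column totals in the usual way.

```python
def chi_squared(table):
    """Pearson chi-squared for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows: A allele present / absent; columns: case / control.
table = [[20, 10], [10, 20]]
print(round(chi_squared(table), 3))  # 6.667 (the slide truncates to 6.666)
```

Every expected count is 15 here, so the statistic is 4 x (5^2 / 15) = 100 / 15.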

Page 17

Step 2. Determine the critical value for a given type I error rate, α

- inverse central chi-squared distribution

[Figure: density P(T) against test statistic T, with the critical value marked]

Page 18

Step 3. Determine the power for a given critical value and non-centrality parameter

- non-central chi-squared distribution

[Figure: density P(T) against test statistic T, with the critical value marked]

Page 19

Calculating Power

1. Calculate critical value (inverse central $\chi^2$)

Inputs: alpha, NCP = 0 (under the null)

2. Calculate power (non-central $\chi^2$)

Inputs: critical value, expected NCP
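For 1 degree of freedom, both steps can be done with the normal distribution alone, because a 1-df chi-squared variable with non-centrality NCP is the square of a normal variable with mean sqrt(NCP) and unit variance. The sketch below relies on that identity and so does not generalise to more degrees of freedom; the function names are hypothetical.

```python
from statistics import NormalDist

_Z = NormalDist()  # standard normal


def critical_value(alpha):
    """Inverse central chi-squared (1 df): value exceeded with prob. alpha."""
    return _Z.inv_cdf(1 - alpha / 2) ** 2


def power(crit, ncp):
    """P(non-central chi-squared(1 df, ncp) > crit)."""
    root_crit, root_ncp = crit ** 0.5, ncp ** 0.5
    # (Z + sqrt(ncp))^2 > crit splits into an upper and a lower tail.
    return 1 - _Z.cdf(root_crit - root_ncp) + _Z.cdf(-root_crit - root_ncp)


for alpha in (0.05, 0.01, 0.001):
    crit = critical_value(alpha)
    print(alpha, round(crit, 5), round(power(crit, 6.666), 2))
```

Setting the NCP to zero in `power` returns alpha, which matches the statement that the critical value is computed under the null.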

Page 20

http://workshop.colorado.edu/~pshaun/gpc/pdf.html

df = 1, NCP = 0

α       X
0.05    3.84146
0.01    6.63489
0.001   10.82754

Page 21

Determining power

df = 1, NCP = 6.666

α       X         Power
0.05    3.84146   0.73
0.01    6.6349    0.50
0.001   10.827    0.24

Page 22

1. Planning a study

Candidate gene study

A disease occurs in 2% of the population

Assume multiplicative model

genotype risk ratio Aa = 2

genotype risk ratio AA = 4

100 cases, 100 controls

What if the risk allele is rare vs common?

Page 23

2. Interpreting a negative result

Negative candidate gene TDT study, 82 affected offspring trios

“affection” = scoring >2 SD above mean

candidate gene SNP allele frequency 0.25

Desired 80% power, 5% type I error rate

What is the minimum detectable QTL variance (assume additivity)?

Page 24

Planning a study

p      N cases (= N controls)
0.01   1144
0.05   247
0.2    83
0.5    66
0.8    126
0.95   465
0.99   2286
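The rare-versus-common comparison can be sketched by sweeping the risk allele frequency for the planning scenario (K = 0.02, multiplicative model with risk ratios 2 and 4). The slide does not state the power and type I error rate used for this table; 80% power and alpha = 0.05 are assumed here, as is the normal approximation for the required NCP, so the output should land close to the tabulated values rather than match them exactly. The function name is hypothetical.

```python
from statistics import NormalDist


def cases_required(p, K=0.02, r=2.0, alpha=0.05, power=0.80):
    """Approximate cases (= controls) needed for an allelic 1-df test,
    multiplicative model with per-allele risk ratio r."""
    q = 1.0 - p
    g_aa = K / (p * r + q) ** 2  # p^2 r^2 + 2pq r + q^2 = (pr + q)^2
    # P(A | case) = d_AA + d_Aa / 2, then P(A | control) by subtraction.
    p_case = (r * r * g_aa * p * p + r * g_aa * p * q) / K
    p_control = (p - K * p_case) / (1 - K)
    p_bar = (p_case + p_control) / 2
    # Expected chi-squared contributed by each case-control pair.
    ncp_per_pair = (p_case - p_control) ** 2 / (p_bar * (1 - p_bar))
    z = NormalDist()
    target_ncp = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2
    return target_ncp / ncp_per_pair


for p in (0.01, 0.05, 0.2, 0.5, 0.8, 0.95, 0.99):
    print(p, round(cases_required(p)))
```

The required sample size is smallest at intermediate allele frequencies and grows sharply when the risk allele is either very rare or very common.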

Page 25

Interpreting a negative result

QTL variance   Power
0.00           0.05
0.01           0.34
0.02           0.60
0.03           0.78
0.04           0.88
0.05           0.94

Page 26

Exploring power of association using GPC

Linkage versus association

difference in required sample sizes for specific QTL size

TDT versus case-control

difference in efficiency?

Quantitative versus binary traits

loss of power from artificial dichotomisation?

Page 27

Linkage versus association

[Figure: three panels plotting log(N for 90% power), LRT and power against QTL effect (0% to 25%), each comparing linkage with association]

QTL linkage: 500 sib pairs, r = 0.5

QTL association: 1000 individuals

Page 28

Case-control versus TDT

[Figure: two panels plotting N units for 90% power and N individuals for 90% power against allele frequency (0 to 0.25), for CC (K=0.1), CC (K=0.01) and TDT]

p = 0.1; RAA = RAa = 2

Page 29

Quantitative versus discrete

To investigate: use threshold-based association

Fixed QTL effect (additive, 5%, p = 0.5), 500 individuals

For prevalence K:

Group 1 has $N = 500K$ and $T$: $-6 < X < \Phi^{-1}(K)$

Group 2 has $N = 500(1 - K)$ and $T$: $\Phi^{-1}(K) < X < 6$

[Figure: panels for K = 0.5, K = 0.2 and K = 0.05]

Page 30

Quantitative versus discrete

K      T (SD)
0.01   2.326
0.05   1.645
0.10   1.282
0.20   0.842
0.25   0.674
0.50   0.000
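The thresholds in this table are standard normal quantiles. A minimal sketch, assuming T is the point with upper-tail area K on a standardised trait (the function name is hypothetical):

```python
from statistics import NormalDist


def threshold_for_prevalence(K):
    """Threshold T in SD units such that P(X > T) = K for standard normal X."""
    return NormalDist().inv_cdf(1 - K)


for K in (0.01, 0.05, 0.10, 0.20, 0.25, 0.50):
    print(K, round(threshold_for_prevalence(K), 3))
```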

[Figure: allele frequency (0 to 0.8) against K, with series P(A|case) and P(A|control)]

Page 31

Quantitative versus discrete

[Figure: LRT (0 to 30) against K (0 to 0.5), comparing VC and CC]

Page 32

Incomplete LD

what is the impact of D’ values less than 1?

does allele frequency affect the power of the test?

(using discrete case-control calculator)

Family-based VC association: between and within tests

what is the impact of sibship size? sibling correlation?

(using QTL VC association calculator)

Page 33

Incomplete LD

Case-control for discrete traits

Disease   K = 0.1

QTL       RAA = RAa = 2, p = 0.05

Marker1   m = 0.05, D’ = {1, 0.8, 0.6, 0.4, 0.2, 0}

Marker2   m = 0.25, D’ = {1, 0.8, 0.6, 0.4, 0.2, 0}

Sample    250 cases, 250 controls

Page 34

Incomplete LD

Genotypic risk at marker1 (left) and marker2 (right) as a function of D’

[Figure: two panels plotting genotypic risk (0.060 to 0.200) against D’ (0 to 1), with series gAA, gAa and gaa]

Page 35

Incomplete LD

Expected likelihood ratio test as a function of D’

[Figure: LRT (0 to 10) against D’ (0 to 1), with series Marker1 and Marker2]

Page 36

Family-based association

Sibship type

1200 individuals, 600 pairs, 400 trios, 300 quads

Sibling correlation

r = 0.2, 0.5, 0.8

QTL (diallelic, equal allele frequency)

2%, 10% of trait variance

Page 37

Family-based association

NCP proportional to variance explained

Between test

↓ with ↑ sibship size and ↑ sibling correlation

Within test

0 for s=1, ↑ with ↑ sibship size and ↑ sibling correlation

Page 38

Between-sibship association

Page 39

Within-sibship association

Page 40

Total association

Page 41

GPC

Usual URL for GPC

http://statgen.iop.kcl.ac.uk/gpc/

Purcell S, Cherny SS, Sham PC. (2003) Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics, 19(1):149-50