Basic QTL Analysis


Upload: rainer

Post on 22-Feb-2016


Page 1: Basic QTL Analysis

Basic QTL Analysis

Is there an association between marker genotype and quantitative trait phenotype?
- Classify progeny by marker genotype
- Compare phenotypic mean between classes (t-test or ANOVA)
- Significance = marker linked to QTL
- Difference between means = estimate of QTL effect

g = (µ1 - µ2)/2

g = genotypic effect

µ1 = trait mean for genotypic class AA

µ2 = trait mean for genotypic class aa

[Figure: trait value y plotted against marker code x for genotypic classes aa and AA; βo marks the intercept.]
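The effect estimate g = (µ1 - µ2)/2 can be illustrated with a minimal Python sketch. The function name and the trait values are hypothetical, chosen only to show the arithmetic.

```python
from statistics import mean

def genotypic_effect(trait_AA, trait_aa):
    """Return g = (mu1 - mu2) / 2 from the trait values of two marker classes."""
    mu1 = mean(trait_AA)  # trait mean for genotypic class AA
    mu2 = mean(trait_aa)  # trait mean for genotypic class aa
    return (mu1 - mu2) / 2

# Hypothetical trait values, invented for illustration only
trait_AA = [4.1, 3.9, 4.3, 3.7]
trait_aa = [2.2, 1.8, 2.1, 1.9]
print(genotypic_effect(trait_AA, trait_aa))
```

Whether the difference is large enough to declare a QTL is a separate question, answered by the t-test or ANOVA.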

Page 2: Basic QTL Analysis

Notations for single-QTL models in backcross and F2 populations

Model                  Genotype   Value
Backcross (Qq x QQ)    QQ         µ1
                       Qq         µ2
                       Genetic effect: g = 0.5(µ1 - µ2)

DH (qq x QQ)           QQ         µ1
                       qq         µ2
                       Genetic effect: g = 0.5(µ1 - µ2)

F2 (Qq x Qq)           QQ         µ1
                       Qq         µ2
                       qq         µ3
                       Additive: a = 0.5(µ1 - µ3)
                       Dominance: d = 0.5(2µ2 - µ1 - µ3)
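As a small worked example of the F2 notation, the sketch below computes the additive and dominance effects from the three genotypic means. The function name and the numeric means are hypothetical.

```python
def f2_effects(mu_QQ, mu_Qq, mu_qq):
    """Additive (a) and dominance (d) effects from the three F2 genotypic means."""
    a = 0.5 * (mu_QQ - mu_qq)              # a = 0.5(mu1 - mu3)
    d = 0.5 * (2 * mu_Qq - mu_QQ - mu_qq)  # d = 0.5(2*mu2 - mu1 - mu3)
    return a, d

# Hypothetical genotypic means, for illustration only
a, d = f2_effects(mu_QQ=10.0, mu_Qq=9.0, mu_qq=6.0)
print(a, d)
```

Here d is the deviation of the heterozygote mean from the midpoint of the two homozygote means, so d = 0 corresponds to purely additive gene action.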

Page 3: Basic QTL Analysis

Single-marker analysis

• How it works
– Finds associations between marker genotype and trait value

• When to use
– Order of markers unknown or incomplete maps
– Quick scan
– Find best possible QTLs
– Identify missing or incorrectly formatted data

• Limitations
– Underestimates QTL number and effects
– QTL position cannot be precisely determined

[Figure: marker A and putative QTL Q on a chromosome, separated by recombination fraction r.]

$$y_j = \mu + f(A) + \varepsilon_j$$

yj = trait value for the jth individual in the population

μ = population mean

f(A) = function of marker genotype

εj = residual associated with the jth individual

r = recombination fraction

Page 4: Basic QTL Analysis

Single-marker analysis in backcross progeny

• Parents: AAQQ x aaqq

• Backcross: F1 (AaQq) crossed to either parent, aaqq x AaQq or AaQq x AAQQ

• BC progeny and expected frequencies:

aaqq x AaQq    AaQq x AAQQ    Expected frequency
AaQq           AAQQ           0.5(1 - r)
Aaqq           AAQq           0.5r
aaQq           AaQQ           0.5r
aaqq           AaQq           0.5(1 - r)

r is the recombination frequency between A and Q

Page 5: Basic QTL Analysis

Expected QTL genotypic frequencies conditional on marker genotypes

Marker     Observed   Marginal     QTL genotype            Expected
genotype   count      frequency    QQ          Qq          trait value

Joint frequency
AA         n1         0.5          0.5(1-r)    0.5r
Aa         n2         0.5          0.5r        0.5(1-r)

Conditional frequency
AA         n1         0.5          1-r         r           (1-r)µ1 + rµ2
Aa         n2         0.5          r           1-r         rµ1 + (1-r)µ2
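The expected trait values in the last column imply that the difference between the two marker-class means shrinks by a factor (1 - 2r) relative to the true QTL difference. A minimal sketch, with a hypothetical function name and made-up means, checks this numerically:

```python
def marker_class_means(mu1, mu2, r):
    """Expected trait means of marker classes AA and Aa.

    mu1, mu2: means of QTL genotypes QQ and Qq; r: recombination fraction.
    """
    mean_AA = (1 - r) * mu1 + r * mu2
    mean_Aa = r * mu1 + (1 - r) * mu2
    return mean_AA, mean_Aa

# Hypothetical QTL genotype means, for illustration only
mu1, mu2 = 10.0, 6.0
for r in (0.0, 0.1, 0.3, 0.5):
    mean_AA, mean_Aa = marker_class_means(mu1, mu2, r)
    # The two printed differences agree: observed gap = (1 - 2r) * true gap
    print(r, mean_AA - mean_Aa, (1 - 2 * r) * (mu1 - mu2))
```

At r = 0.5 the marker classes have identical expected means, which is why an unlinked QTL cannot be detected, and why a single marker confounds QTL effect with QTL distance.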

Page 6: Basic QTL Analysis

Single-marker analysis

- Simple t-test
- Analysis of variance
- Linear regression
- Likelihood

[Figure: marker A and putative QTL Q separated by recombination fraction r.]

Page 7: Basic QTL Analysis

Simple t-test using backcross progeny

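Yj(i)k = μ + Mi + g(M)j(i) + ej(i)k

Yj(i)k = trait value for individual j with genotype i in replication k
μ = population mean
Mi = effect of the marker genotype
g(M)j(i) = genotypic effect which cannot be explained by the marker genotype
ej(i)k = error term

Test statistic using the pooled variance:

$$t_M = \frac{\hat{\mu}_{Aa} - \hat{\mu}_{aa}}{\sqrt{\hat{s}_M^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

Test statistic using separate class variances:

$$t_M = \frac{\hat{\mu}_{Aa} - \hat{\mu}_{aa}}{\sqrt{\frac{\hat{s}_{Aa}^2}{n_1} + \frac{\hat{s}_{aa}^2}{n_2}}}$$

µAa = trait mean for genotypic class Aa
µaa = trait mean for genotypic class aa
s²M = pooled variance within the two classes
n1, n2 = number of individuals in each marker class

H0: µAa - µaa = 0, which holds if (a + d) = 0 or r = 0.5

Under H0, tM follows a t-distribution with df = N - 2

If tM is significant, then a QTL is declared to be near the marker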
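The pooled-variance t statistic for two marker classes can be sketched in a few lines of standard-library Python. The function name and the trait values are hypothetical, and each individual is assumed to contribute a single trait value.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(class_Aa, class_aa):
    """Two-sample t statistic with pooled within-class variance; returns (t, df)."""
    n1, n2 = len(class_Aa), len(class_aa)
    df = n1 + n2 - 2
    # Pooled variance: class sample variances weighted by their degrees of freedom
    s2_pooled = ((n1 - 1) * variance(class_Aa) + (n2 - 1) * variance(class_aa)) / df
    t = (mean(class_Aa) - mean(class_aa)) / sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical trait values for marker classes Aa and aa
t_stat, df = pooled_t([4.0, 5.0, 6.0], [1.0, 2.0, 3.0])
print(t_stat, df)
```

The statistic is then compared with a t-distribution on df = N - 2 degrees of freedom, for example from a statistical table or a statistics package.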

Page 8: Basic QTL Analysis

Analysis of variance using backcross progeny

H0: µAa - µaa = 0, which holds if (a + d) = 0 or r = 0.5

Source           df          MS        Expected MS
Total Genetics   N - 1       MSG       $\sigma_e^2 + b\sigma_G^2$
Marker           1           MSM       $\sigma_e^2 + b\left(\sigma_{G-QTL}^2 + 4r(1-r)a^2\right) + bc(1-2r)^2 a^2$
G(Marker)        N - 2       MSG(M)    $\sigma_e^2 + b\left(\sigma_{G-QTL}^2 + 4r(1-r)a^2\right)$
Residual         N(b - 1)    MSE       $\sigma_e^2$

$$F = \frac{MSM}{MSG(M)}$$

F-distribution with 1 and N - 2 df

If F is significant, then a QTL is declared to be near the marker

F = t² if df for numerator is 1

N = no. of individuals in pop.
b = no. of replications
r = recombination fraction
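With a single numerator degree of freedom, the ANOVA F statistic equals the square of the pooled-variance t statistic. The sketch below checks this on hypothetical data, assuming one trait value per individual (no replication), so that the marker mean square is tested against the within-marker mean square on N - 2 df.

```python
from math import sqrt
from statistics import mean, variance

def marker_f(class_Aa, class_aa):
    """One-way ANOVA F for two marker classes: MS(marker) / MS(within marker)."""
    groups = (class_Aa, class_aa)
    all_values = list(class_Aa) + list(class_aa)
    grand_mean = mean(all_values)
    ss_marker = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    ms_marker = ss_marker / 1                      # 1 df for the marker
    ms_within = ss_within / (len(all_values) - 2)  # N - 2 df within classes
    return ms_marker / ms_within

def pooled_t(class_Aa, class_aa):
    """Pooled-variance t statistic, used here only for comparison with F."""
    n1, n2 = len(class_Aa), len(class_aa)
    s2 = ((n1 - 1) * variance(class_Aa) + (n2 - 1) * variance(class_aa)) / (n1 + n2 - 2)
    return (mean(class_Aa) - mean(class_aa)) / sqrt(s2 * (1 / n1 + 1 / n2))

# Hypothetical trait values
Aa, aa = [4.0, 5.0, 6.0], [1.0, 2.0, 3.0]
print(marker_f(Aa, aa), pooled_t(Aa, aa) ** 2)
```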

Page 9: Basic QTL Analysis

Analysis of variance using SAS

data a;
    /* Marker genotypes are character values, hence the $ after each marker name */
    input Individuals Trait1 Marker1 $ Marker2 $;
    cards;
1 1.57 A B
2 1.35 B A
3 10.7 B B
…
;
proc glm;
    class Marker1 Marker2;
    model Trait1 = Marker1 Marker2;
    lsmeans Marker1 Marker2;
run;

(A simple example)

Page 10: Basic QTL Analysis

Linear regression using backcross progeny

[Figure: regression line of trait value y on dummy variable x for genotypic classes aa and Aa, with intercept βo and slope β1.]

$$y_j = \beta_0 + \beta_1 x_j + \varepsilon_j$$

H0: µAa - µaa = 0, which holds if (a + d) = 0 or r = 0.5

Dummy variables:

aa = -1

Aa = 1

yj = trait value for the jth individual

xj = dummy variable

βo = intercept for the regression

β1 = slope for the regression

εj = random error

Expectations:

E(βo) = 0.5 (µAa + µaa) = Mean for the trait

E(β1) = 0.5 (1 - 2r) (µAa - µaa) = (1 - 2r) g = 0.5 (a + d) (1 - 2r)

R²: percent of the phenotypic variance explained by the QTL
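The regression on a -1/+1 dummy variable can be sketched with ordinary least squares in plain Python. The function name and data are hypothetical. With equal class sizes, the fitted intercept is the midpoint of the two class means and the slope is half their difference.

```python
from statistics import mean

def fit_marker_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, R^2)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, 1 - ss_resid / ss_total

# Hypothetical data: Aa coded +1, aa coded -1
x = [1, 1, 1, -1, -1, -1]
y = [4.0, 5.0, 6.0, 1.0, 2.0, 3.0]
b0, b1, r2 = fit_marker_regression(x, y)
print(b0, b1, r2)
```

Because the marker is only linked to the QTL, the fitted slope estimates (1 - 2r) g rather than g itself, so it is a lower bound in magnitude on the QTL effect.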

Page 11: Basic QTL Analysis

Linear regression using backcross progeny

Interpretation of results depends on coding of the dummy variables

[Figure: two panels plotting y (0 to 6) against x (-2 to 2) for genotypic classes aa and Aa.]

First panel: y = 3 + x + e
µ = 3
µAa = 4
µaa = 2
g = 0.5(µAa - µaa) = 1

Second panel: y = 3 - x + e
µ = 3
µAa = 2
µaa = 4
g = 0.5(µAa - µaa) = -1

Page 12: Basic QTL Analysis

A likelihood approach using backcross progeny

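Joint distribution function:

$$L = \frac{1}{\left(\sqrt{2\pi}\,\sigma\right)^{N}} \prod_{i=1}^{N} \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right]$$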

Page 13: Basic QTL Analysis

A likelihood approach using backcross progeny (cont.)

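Log-likelihood of the full model:

$$\ln L(\mu_1, \mu_2, \sigma^2, r) = \sum_{i=1}^{N} \ln \left\{ \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right] \right\} - \frac{N}{2} \ln(2\pi\sigma^2)$$

Log-likelihood when the two means are equal:

$$\ln L(\mu_1 = \mu_2, \sigma^2) = -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu)^2 - \frac{N}{2} \ln(2\pi\sigma^2)$$

Log-likelihood when r = 0.5:

$$\ln L(\mu_1, \mu_2, \sigma^2, r = 0.5) = \sum_{i=1}^{N} \ln \left\{ 0.5 \sum_{j=1}^{2} \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right] \right\} - \frac{N}{2} \ln(2\pi\sigma^2)$$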
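The log-likelihood of this model treats each individual's trait density as a weighted sum of two normal densities, one per QTL genotype. The weights are the conditional QTL genotype frequencies given the marker genotype (1 - r and r). The sketch below is one way to evaluate it, assuming the marker classes are coded as the strings "AA" and "Aa" and that µ1 and µ2 are the QQ and Qq means. The function name, the coding, and the data are hypothetical.

```python
from math import exp, log, pi

def log_likelihood(y, marker, mu1, mu2, sigma2, r):
    """Mixture log-likelihood for backcross progeny at a single marker.

    y: trait values; marker: "AA" or "Aa" per individual (assumed coding);
    mu1, mu2: means of QTL genotypes QQ and Qq; sigma2: residual variance;
    r: recombination fraction between marker and QTL.
    """
    total = 0.0
    for yi, mi in zip(y, marker):
        p_QQ = (1 - r) if mi == "AA" else r  # P(QQ | marker genotype)
        mixture = (p_QQ * exp(-(yi - mu1) ** 2 / (2 * sigma2))
                   + (1 - p_QQ) * exp(-(yi - mu2) ** 2 / (2 * sigma2)))
        total += log(mixture)
    return total - len(y) / 2 * log(2 * pi * sigma2)

# Hypothetical data, for illustration only
y = [9.8, 10.4, 6.1, 5.7]
marker = ["AA", "AA", "Aa", "Aa"]
print(log_likelihood(y, marker, mu1=10.0, mu2=6.0, sigma2=1.0, r=0.1))
print(log_likelihood(y, marker, mu1=10.0, mu2=6.0, sigma2=1.0, r=0.5))
```

In practice the parameters are estimated by maximizing this function numerically. The sketch only evaluates it at given parameter values.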

Page 14: Basic QTL Analysis

A likelihood approach using backcross progeny (cont.)

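H0: µAa - µaa = 0, which holds if (a + d) = 0 or r = 0.5

$$G = 2\left[ \ln L(\hat{\mu}_{Aa}, \hat{\mu}_{aa}, \hat{\sigma}^2, \hat{r}) - \ln L(r = 0.5) \right]$$

G is distributed asymptotically as a chi-square variable with one degree of freedom

$$G = 2\left[ \ln L(\hat{\mu}_{Aa}, \hat{\mu}_{aa}, \hat{\sigma}^2, \hat{r}) - \ln L(\mu_{Aa} = \mu_{aa}) \right]$$

The t-test is approximately equivalent to the likelihood ratio test using this formula

G-statistic: the likelihood ratio (LR) test statistic

The second term in each formula is the log of the probability of occurrence of the data under the null hypothesis

(Weller, 1986)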
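The link between the t-test and the likelihood ratio test can be checked numerically in the simplest case. The sketch below assumes the two marker classes are compared directly as two normal samples with a common variance (no mixture over QTL genotypes), with maximum-likelihood variance estimates. Under that assumption G = N ln(SS_total / SS_within), which equals N ln(1 + t²/(N - 2)). Function names and data are hypothetical.

```python
from math import log, sqrt
from statistics import mean, variance

def g_from_sums_of_squares(class_Aa, class_aa):
    """G = 2 * (lnL_alt - lnL_null) for two normal samples with common variance."""
    all_values = list(class_Aa) + list(class_aa)
    n = len(all_values)
    grand_mean = mean(all_values)
    ss_null = sum((y - grand_mean) ** 2 for y in all_values)  # one common mean
    ss_alt = sum((y - mean(g)) ** 2 for g in (class_Aa, class_aa) for y in g)
    return n * log(ss_null / ss_alt)

def g_from_t(class_Aa, class_aa):
    """The same G, computed from the pooled-variance t statistic."""
    n1, n2 = len(class_Aa), len(class_aa)
    n = n1 + n2
    s2 = ((n1 - 1) * variance(class_Aa) + (n2 - 1) * variance(class_aa)) / (n - 2)
    t = (mean(class_Aa) - mean(class_aa)) / sqrt(s2 * (1 / n1 + 1 / n2))
    return n * log(1 + t * t / (n - 2))

# Hypothetical trait values
Aa, aa = [4.0, 5.0, 6.0], [1.0, 2.0, 3.0]
print(g_from_sums_of_squares(Aa, aa), g_from_t(Aa, aa))
```

Since G is an increasing function of t², the two tests rank marker-trait associations identically in this simplified setting.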

Page 15: Basic QTL Analysis

LOD score

LOD: logarithm of the odds ratio (base 10); it is the base-10 counterpart of G

LR = 2 ln(10) × LOD = 4.605 LOD

LOD = 0.217 LR

LOD is interpreted as the base-10 logarithm of an odds ratio

(probability of observing the data under linkage / probability of observing the same data under no linkage)

No theoretical distribution is needed to interpret a LOD score

Key value: ≥ 3 (H1 is 1000 times more likely than H0, no linkage)

(approx: p = 0.001)

p = probability of type I error

Type I error: false positive (declare a QTL when there is no QTL)
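The conversion between the G (LR) statistic and the LOD score is a change of logarithm base with a factor of 2, which a short sketch makes concrete. The function names are hypothetical.

```python
from math import log

TWO_LN_10 = 2 * log(10)  # approximately 4.605

def lod_to_g(lod):
    """Convert a LOD score to the likelihood ratio statistic G."""
    return TWO_LN_10 * lod

def g_to_lod(g):
    """Convert the likelihood ratio statistic G to a LOD score."""
    return g / TWO_LN_10

print(lod_to_g(3.0))  # G corresponding to the conventional LOD threshold of 3
print(g_to_lod(1.0))  # LOD per unit of G, approximately 0.217
```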

Page 16: Basic QTL Analysis

G-Statistics and LOD score

Page 17: Basic QTL Analysis

Single-marker analysis Summary

• Identify marker-trait associations
• Identify missing or incorrectly formatted data
• Genetic map is not required
• Divide the population into subpopulations based on the allelic segregation of individual loci (one marker at a time)
• Get trait means for each subpopulation (genotypic class)
• Determine if the subpopulations' trait means are significantly different

• Limitations
– Underestimates QTL number and effects
– QTL position cannot be precisely determined