qtl mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · references c broman kw (2001)...

49
QTL mapping in mice Karl W Broman Department of Biostatistics Johns Hopkins University Baltimore, Maryland, USA www.biostat.jhsph.edu/˜kbroman

Upload: others

Post on 04-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

QTL mapping in mice

Karl W Broman

Department of BiostatisticsJohns Hopkins University

Baltimore, Maryland, USA

www.biostat.jhsph.edu/˜kbroman

Page 2: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Outline

� Experiments, data, and goals

� Models

� ANOVA at marker loci

� Interval mapping

� LOD scores, LOD thresholds

� Mapping multiple QTLs

� Simulations

Page 3: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Backcross experiment

P1

A A

P2

B B

F1

A B

BCBC BC BC

Page 4: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Intercross experiment

P1

A A

P2

B B

F1

A B

F2F2 F2 F2

Page 5: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Trait distributions

30

40

50

60

70

A B F1 BC

Strain

Phe

noty

pe

Page 6: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Data and Goals

Phenotypes: ��� = trait value for mouse

�Genotypes: � �� = 1/0 if mouse

is BB/AB at marker

(for a backcross)Genetic map: Locations of markers

Goals:

� Identify the (or at least one) genomic regions(QTLs) that contribute to variation in the trait.

� Form confidence intervals for QTL locations.

� Estimate QTL effects.

Note: QTL = “quantitative trait locus”

Page 7: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Why?

Mice: Find gene

� Drug targets, biochemical basis

Agronomy: Selection for improvement

Flies: Genetic architecture

� Evolution

Page 8: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

80

60

40

20

0

Chromosome

Loca

tion

(cM

)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 X

Genetic map

Page 9: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

20 40 60 80 100 120

20

40

60

80

100

120

Markers

Indi

vidu

als

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 X

Genotype data

Page 10: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Statistical structure

QTL

Markers Phenotype

Covariates

The missing data problem:

Markers QTL

The model selection problem:

QTL, covariates � phenotype

Page 11: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Models: Recombination

We assume no crossover interference.

� � Points of exchange (crossovers) are accordingto a Poisson process.

� � The

��� �

(marker genotypes) form a Markovchain

Page 12: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Example

A B − B A A

B A A − B A

A A A A A B

?

Page 13: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Models: Genotype Phenotype

Let � = phenotype� = whole genome genotype

Imagine a small number of QTLs with genotypes ������ � � � ��� .(

� �

distinct genotypes)

E

� � � � � � ��� "!# # # ! �%$ var� � � � � � & '� !# # # ! �$

Page 14: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Models: Genotype Phenotype

Homoscedasticity (constant variance): & '� ( & '

Normally distributed residual variation: � � �) * � � � � & ' �

.

Additivity: � � !# # # ! �$ � � + ���, � -� �� ( �� � .

or

/

)

Epistasis: Any deviations from additivity.

Page 15: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

The simplest method: ANOVA

� Split mice into groupsaccording to genotypeat a marker.

� Do a t-test / ANOVA.

� Repeat for each marker.

40

50

60

70

80

Phe

noty

pe

BB AB

Genotype at D1M30

BB AB

Genotype at D2M99

Page 16: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

ANOVA at marker loci

Advantages

� Simple.

� Easily incorporatecovariates.

� Easily extended to morecomplex models.

� Doesn’t require a geneticmap.

Disadvantages

� Must exclude individualswith missing genotype data.

� Imperfect information aboutQTL location.

� Suffers in low density scans.� Only considers one QTL at a

time.

Page 17: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Interval mapping (IM)

Lander & Botstein (1989)

� Take account of missing genotype data

� Interpolate between markers

� Maximum likelihood under a mixture model

0 20 40 60

0

2

4

6

8

Map position (cM)

LOD

sco

re

Page 18: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Interval mapping (IM)

Lander & Botstein (1989)

� Assume a single QTL model.

� Each position in the genome, one at a time, is posited as theputative QTL.

� Let 0 � 1/0 if the (unobserved) QTL genotype is BB/AB.Assume �) * � ��1 � & �

� Given genotypes at linked markers, �) mixture of normal dist’nswith mixing proportion

243 � 0 � . �marker data

:

QTL genotype576 578 BB ABBB BB

9: ; <>= ? 9: ; <>@ ?A 9: ; < ? <= <>@ A 9 : ; < ?

BB AB9: ; <= ? <>@ A < <= 9: ; <>@ ?A <

AB BB <= 9: ; <>@ ?A < 9: ; <= ? <>@ A <

AB AB <= <@ A 9: ; < ? 9 : ; <= ? 9: ; <@ ?A 9: ; < ?

Page 19: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

The normal mixtures

576 578B

7 cM 13 cM

C Two markers separated by 20 cM,with the QTL closer to the leftmarker.

C The figure at right show the dis-tributions of the phenotype condi-tional on the genotypes at the twomarkers.

C The dashed curves correspond tothe components of the mixtures.

20 40 60 80 100

Phenotype

BB/BB

BB/AB

AB/BB

AB/AB

µ0

µ0

µ0

µ0

µ1

µ1

µ1

µ1

Page 20: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Interval mapping (continued)

Let D � � 243 � 0�� � . �

marker data

�� � 0�� ) * � �1E � & ' �

243 � ��� �

marker data� ��F � � ��� & � � D � G � ��� H � �� & � + � . � D � � G � �� H �F � & �

where

G � � H �� & � � density of normal distribution

Log likelihood:

I � ��F � � ��� & � � � JLKM 243 � ��� �

marker data� ��F � � ��� & �

Maximum likelihood estimates (MLEs) of � F , � � , &:EM algorithm.

Page 21: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

LOD scores

The LOD score is a measure of the strength of evidence for thepresence of a QTL at a particular location.

LOD

� 0 � � JLKM � F likelihood ratio comparing the hypothesis of aQTL at position 0 versus that of no QTL

� JKM . /243 � � �QTL at 0� N ��F 1 � N � � 1 � N &1 �

243 � � �no QTL� N �� N & �

N �F 1 � N � � 1 � N &1 are the MLEs, assuming a single QTL at position 0.

No QTL model: The phenotypes are independent and identicallydistributed (iid)

* � �� & ' �

.

Page 22: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

An example LOD curve

0 20 40 60 80 100

0

2

4

6

8

10

12

Chromosome position (cM)

LOD

Page 23: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

0 500 1000 1500 2000 2500

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Map position (cM)

lod

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19LOD curves

Page 24: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Interval mapping

Advantages

� Takes proper account ofmissing data.

� Allows examination ofpositions between markers.

� Gives improved estimatesof QTL effects.

� Provides pretty graphs.

Disadvantages

� Increased computationtime.

� Requires specializedsoftware.

� Difficult to generalize.� Only considers one QTL at

a time.

Page 25: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

LOD thresholds

Large LOD scores indicate evidence for the presence of a QTL.

Q: How large is large?

We consider the distribution of the LOD score under the nullhypothesis of no QTL.

Key point: We must make some adjustment for our examination ofmultiple putative QTL locations.

We seek the distribution of the maximum LOD score, genome-wide. The 95th %ile of this distribution serves as a genome-wideLOD threshold.

Estimating the threshold: simulations, analytical calculations, per-mutation (randomization) tests.

Page 26: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Null distribution of the LOD score

C Null distribution derived bycomputer simulation of backcrosswith genome of typical size.

C Solid curve: distribution of LODscore at any one point.

C Dashed curve: distribution ofmaximum LOD score,genome-wide.

0 1 2 3 4

LOD score

Page 27: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Permutation tests

mice

markers

genotypedata

phenotypes

O LOD

9QP ?(a set of curves)

O 5SRT UVXW LOD

9 P ?

C Permute/shuffle the phenotypes; keep the genotype data intact.

C Calculate LOD

Y 9QP ? ; Z 5 YR T UVXW LODY 9 P ?

C We wish to compare the observed5

to the distribution of

5 Y

.

C [�\ 9 5 Y] 5 ?

is a genome-wide P-value.

C The 95th %ile of

5 Y

is a genome-wide LOD threshold.

C We can’t look at all ^ _ possible permutations, but a random set of 1000 is feasi-ble and provides reasonable estimates of P-values and thresholds.

C Value: conditions on observed phenotypes, marker density, and pattern of miss-ing data; doesn’t rely on normality assumptions or asymptotics.

Page 28: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Permutation distribution

maximum LOD score

0 1 2 3 4 5 6 7

95th percentile

Page 29: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Multiple QTL methods

Why consider multiple QTLs at once?

� Reduce residual variation.

� Separate linked QTLs.

� Investigate interactions between QTLs (epistasis).

Page 30: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Epistasis in a backcross

Additive QTLs

Interacting QTLs

0

20

40

60

80

100

Ave

. phe

noty

pe

A HQTL 1

A

H

QTL 2

0

20

40

60

80

100

Ave

. phe

noty

pe

A HQTL 1

A

H

QTL 2

Page 31: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Epistasis in an intercross

Additive QTLs

Interacting QTLs

0

20

40

60

80

100

Ave

. phe

noty

pe

A H BQTL 1

AH

BQTL 2

0

20

40

60

80

100

Ave

. phe

noty

pe

A H BQTL 1

A

H

BQTL 2

Page 32: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Abstractions / simplifications

� Complete marker data

� QTLs are at the marker loci

� QTLs act additively

Page 33: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

The problem

n backcross mice; M markers

�� � = genotype (1/0) of mouse � at marker �

�� = phenotype (trait value) of mouse �

�� � � +5

� R :-� ��� +�` � Which

-� a � /?

b Model selection in regression

Page 34: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

How is this problem different?

� Relationship among the x’s

� Find a good model vs. minimize prediction error

Page 35: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Model selection

� Select class of models– Additive models

– Add’ve plus pairwise interactions

– Regression trees

� Compare models

– BIC c 9Qd ? R e�f g RSS

9Qd ?ih j d j 9k e�f g ^^ ?

– Sequential permutation tests

– Estimate of prediction error

� Search model space– Forward selection (FS)

– Backward elimination (BE)

– FS followed by BE

– MCMC� Assess performance

– Maximize no. QTLs found;control false positive rate

Page 36: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Why BIC l?

� For a fixed no. markers, letting m n, BIC o is consistent.

� There exists a prior (on models + coefficients) for whichBIC o is the –log posterior.

� BIC o is essentially equivalent to use of a threshold on theconditional LOD score

� It performs well.

Page 37: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Choice of l

Smaller

l

: include more loci; higher false positive rate

Larger

l

: include fewer loci; lower false positive rate

Let L = 95% genome-wide LOD threshold(compare single-QTL models to the null model)

Choose

l

= 2 L / JLKM :p mWith this choice of

l, in the absence of QTLs, we’ll

include at least one extraneous locus, 5% of thetime.

Page 38: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Simulations

� Backcross with n=250

� No crossover interference

� 9 chr, each 100 cM

� Markers at 10 cM spacing;complete genotype data

� 7 QTLs

– One pair in coupling– One pair in repulsion– Three unlinked QTLs

� Heritability = 50%

� 2000 simulation replicates

1

2

3

4

5

6

7

8

9

Page 39: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Methods

� ANOVA at marker loci

� Composite interval mapping (CIM)

� Forward selection with permutation tests

� Forward selection with BICk

� Backward elimination with BICk

� FS followed by BE with BICk

� MCMC with BICk

� A selected marker is deemed correct if it is within10 cM of a QTL (i.e., correct or adjacent)

Page 40: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

0 1 2 3 4 5 6 7

Correct

Ave no. chosen

ANOVA

35

79

11

CIM

fs, perm

fs

be

fs/be

mcmc

BIC

Page 41: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

0.0

0.5

1.0

1.5

2.0

QTLs linked in couplingA

ve n

o. c

hose

n

AN

OV

A 3 5 7 9 11

CIM fs, p

erm fs be

fs/b

e

mcm

c

BIC

Page 42: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

0.0

0.5

1.0

1.5

2.0

QTLs linked in repulsionA

ve n

o. c

hose

n

AN

OV

A 3 5 7 9 11

CIM fs, p

erm fs be

fs/b

e

mcm

c

BIC

Page 43: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Other QTLsA

ve n

o. c

hose

n

AN

OV

A 3 5 7 9 11

CIM fs, p

erm fs be

fs/b

e

mcm

c

BIC

Page 44: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

0.0

0.1

0.2

0.3

0.4

Extraneous linkedA

ve n

o. c

hose

n

AN

OV

A 3 5 7 9 11

CIM fs, p

erm fs be

fs/b

e

mcm

c

BIC

Page 45: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

0.00

0.01

0.02

0.03

0.04

0.05

0.06

Extraneous unlinkedA

ve n

o. c

hose

n

AN

OV

A 3 5 7 9 11

CIM fs, p

erm fs be

fs/b

e

mcm

c

BIC

Page 46: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Summary

� QTL mapping is a model selection problem.

� Key issue: the comparison of models.

� Large-scale simulations are important.

� More refined procedures do not necessarily giveimproved results.

� BICk with forward selection followed by backwardelimination works quite well (in the case of additiveQTLs).

Page 47: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

Acknowledgements

Terry Speed, University of California, Berkeley, and WEHI

Gary Churchill, The Jackson Laboratory

Saunak Sen, University of California, San Francisco

Page 48: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

References

C Broman KW (2001) Review of statistical methods for QTL mapping in experi-mental crosses. Lab Animal 30(7):44–52

Review for non-statisticians

C Broman KW, Speed TP (1999) A review of methods for identifying QTLs inexperimental crosses. In: Seillier-Moiseiwitch F (ed) Statistics in MolecularBiology. IMS Lecture Notes—Monograph Series. Vol 33, pp. 114–142

Older, more statistical review.

C Lander ES, Botstein D (1989) Mapping Mendelian factors underlying quantita-tive traits using RFLP linkage maps. Genetics 121:185–199

The seminal paper.

C Churchill GA, Doerge RW (1994) Empirical threshold values for quantitative traitmapping. Genetics 138:963–971

LOD thresholds by permutation tests.

C Miller AJ (2002) Subset selection in regression, 2nd edition. Chapman & Hall,New York.

A reasonably good book on model selection in regression.

Page 49: QTL mapping in micekbroman/presentations/qtl... · 2003. 1. 14. · References C Broman KW (2001) Review of statistical methods for QTL mapping in experi- mental crosses. Lab Animal

C Strickberger MW (1985) Genetics, 3rd edition. Macmillan, New York, chapter11.

An old but excellent general genetics textbook with a very interesting discussionof epistasis.

C Broman KW, Speed TP (2002) A model selection approach for the identificationof quantitative trait loci in experimental crosses (with discussion). J Roy StatSoc B 64:641–656, 737–775

Contains the simulation study described above.