what is the goal of qtl study?

Yandell © 2003 JSM 2003 1 Dimension Reduction for Mapping mRNA Abundance as Quantitative Traits Brian S. Yandell University of Wisconsin-Madison www.stat.wisc.edu/~yandell/statgen [Lan et al. 2003 Genetics 164(4): August 2003]

Page 1: what is the goal of QTL study?

Yandell © 2003 JSM 2003 1

Dimension Reduction for Mapping mRNA Abundance as Quantitative Traits

Brian S. Yandell
University of Wisconsin-Madison

www.stat.wisc.edu/~yandell/statgen
[Lan et al. 2003 Genetics 164(4): August 2003]

Page 2: what is the goal of QTL study?

what is the goal of QTL study?

• uncover underlying biochemistry
  – identify how networks function, break down
  – find useful candidates for (medical) intervention
  – epistasis may play key role
  – statistical goal: maximize number of correctly identified QTL
• basic science/evolution
  – how is the genome organized?
  – identify units of natural selection
  – additive effects may be most important (Wright/Fisher debate)
  – statistical goal: maximize number of correctly identified QTL
• select “elite” individuals
  – predict phenotype (breeding value) using suite of characteristics (phenotypes) translated into a few QTL
  – statistical goal: minimize prediction error

Page 3: what is the goal of QTL study?

what is a QTL?

• QTL = quantitative trait locus (or loci)
  – trait = phenotype = characteristic of interest
  – quantitative = measured somehow
    • glucose, insulin, gene expression level
    • Mendelian genetics
      – allelic effect + environmental variation
  – locus = location in genome affecting trait
    • gene or collection of tightly linked genes
    • some physical feature of genome

Page 4: what is the goal of QTL study?

typical phenotype assumptions

• normal "bell-shaped" environmental variation
• genotypic value G_Q is composite of m QTL
• genetic uncorrelated with environment

$$E(Y \mid Q) = G_Q, \qquad \mathrm{var}(Y \mid Q) = \sigma^2$$

$$h^2 = \frac{\mathrm{var}(G_Q)}{\mathrm{var}(G_Q) + \sigma^2}$$

[Figure: data histogram of the phenotype (axis 8 to 20) with genotype groups qq, Qq, QQ]
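
The heritability on this slide is the share of phenotypic variance explained by the genotypic values. A minimal sketch follows, assuming a single QTL in an F2 with genotype frequencies 1/4, 1/2, 1/4; the function name and the numeric values are hypothetical and chosen only for illustration.

```python
def heritability(genotypic_values, frequencies, env_variance):
    """Return h^2 = var(G_Q) / (var(G_Q) + sigma^2).

    genotypic_values: G_Q for each genotype class
    frequencies: population frequency of each class (should sum to 1)
    env_variance: sigma^2, the environmental variance var(Y | Q)
    """
    mean_g = sum(f * g for g, f in zip(genotypic_values, frequencies))
    var_g = sum(f * (g - mean_g) ** 2
                for g, f in zip(genotypic_values, frequencies))
    return var_g / (var_g + env_variance)


# Hypothetical additive QTL: genotypic values for qq, Qq, QQ in an F2.
h2 = heritability([10.0, 12.0, 14.0], [0.25, 0.5, 0.25], env_variance=2.0)
print(h2)
```

Here var(G_Q) equals the environmental variance, so half of the phenotypic variance is genetic.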

Page 5: what is the goal of QTL study?

why worry about multiple QTL?

• so many possible genetic architectures!
  – number and positions of loci
  – gene action: additive, dominance, epistasis
  – how to efficiently search the model space?
• how to select “best” or “better” model(s)?
  – what criteria to use? where to draw the line?
  – shades of gray: exploratory vs. confirmatory study
  – how to balance false positives, false negatives?
• what are the key “features” of model?
  – means, variances & covariances, confidence regions
  – marginal or conditional distributions

Page 6: what is the goal of QTL study?

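Pareto diagram of QTL effects

[Figure: additive effect (0 to 3) vs. rank order of QTL (0 to 30). The five largest effects are numbered 1 to 5 and marked as major QTL on the linkage map; the remaining effects are grouped as major QTL, minor QTL, and polygenes (modifiers).]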

Page 7: what is the goal of QTL study?

advantages of multiple QTL approach

• improve statistical power, precision
  – increase number of QTL detected
  – better estimates of loci: less bias, smaller intervals
• improve inference of complex genetic architecture
  – patterns and individual elements of epistasis
  – appropriate estimates of means, variances, covariances
    • asymptotically unbiased, efficient
  – assess relative contributions of different QTL
• improve estimates of genotypic values
  – less bias (more accurate) and smaller variance (more precise)
  – mean squared error = MSE = (bias)^2 + variance
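
The MSE identity in the last bullet can be checked numerically. The sketch below uses a made-up set of repeated estimates of one genotypic value; the function name and numbers are illustrative assumptions, and the variance is taken with divisor n so that the identity holds exactly.

```python
def mse_decomposition(estimates, true_value):
    """Return (bias, variance, mse) for a list of estimates of true_value.

    Variance and MSE both use divisor n, so mse == bias**2 + variance
    up to floating-point rounding.
    """
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, variance, mse


# Hypothetical estimates of a genotypic value whose true value is 2.0.
bias, variance, mse = mse_decomposition([2.1, 1.8, 2.4, 2.3], true_value=2.0)
print(bias, variance, mse)
```

An estimator can therefore lower its MSE either by reducing bias or by reducing variance, which is the trade-off the slide points to.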

Page 8: what is the goal of QTL study?

epistasis in parallel pathways
(Gary Churchill)

• Z keeps trait value low
• Neither E1 nor E2 is rate limiting
• Loss of function alleles are segregating from parent A at E1 and from parent B at E2

[Diagram: pathway with nodes X, Y, Z and enzymes E1, E2]

Page 9: what is the goal of QTL study?

epistasis in a serial pathway
(Gary Churchill)

[Diagram: pathway with nodes Z, X, Y and enzymes E1, E2]

• Z keeps trait value high
• Neither E1 nor E2 is rate limiting
• Loss of function alleles are segregating from parent B at E1 and from parent A at E2

Page 10: what is the goal of QTL study?

epistasis examples
(Doebley Stec Gustus 1995; Zeng pers. comm.)

[Figure: one row of panels for each of traits 1, 4, 9. Each row shows genotypic value by locus A genotype (aa, Aa, AA) with lines for bb, Bb, BB, the same values by locus B genotype (bb, Bb, BB) with lines for aa, Aa, AA, and the estimated effects +/- 2 se for a1, d1, a2, d2, iaa, iad, ida, idd.]

traits 1, 4, 9
1: dom-dom interaction
4: add-add interaction
9: rec-rec interaction
(Fisher-Cockerham effects)
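
The eight effects named in the figure (a1, d1, a2, d2, iaa, iad, ida, idd) can be recovered from a 3 x 3 table of two-locus genotypic values. The sketch below assumes the usual Cockerham coding for an F2 (additive scores -1, 0, 1 and dominance scores -1/2, 1/2, -1/2, with genotype frequencies 1/4, 1/2, 1/4 at two unlinked loci). The slides do not give the coding, so treat the constants, the function name, and the reading of iad as additive(locus 1) x dominance(locus 2) as assumptions. Because these contrasts are orthogonal under the F2 frequencies, each effect is a simple weighted projection.

```python
from itertools import product

# Assumed Cockerham coding for an F2; genotypes ordered (aa, Aa, AA).
ADD = (-1.0, 0.0, 1.0)
DOM = (-0.5, 0.5, -0.5)
FREQ = (0.25, 0.5, 0.25)

CONTRASTS = {
    "a1": lambda i, j: ADD[i],
    "d1": lambda i, j: DOM[i],
    "a2": lambda i, j: ADD[j],
    "d2": lambda i, j: DOM[j],
    "iaa": lambda i, j: ADD[i] * ADD[j],
    "iad": lambda i, j: ADD[i] * DOM[j],
    "ida": lambda i, j: DOM[i] * ADD[j],
    "idd": lambda i, j: DOM[i] * DOM[j],
}


def cockerham_effects(G):
    """G[i][j] is the genotypic value for locus 1 genotype i, locus 2 genotype j.

    Each effect is the frequency-weighted projection of G onto its contrast.
    """
    cells = list(product(range(3), repeat=2))
    effects = {}
    for name, contrast in CONTRASTS.items():
        num = sum(FREQ[i] * FREQ[j] * G[i][j] * contrast(i, j) for i, j in cells)
        den = sum(FREQ[i] * FREQ[j] * contrast(i, j) ** 2 for i, j in cells)
        effects[name] = num / den
    return effects


# Hypothetical table with a pure add-add interaction of size 1 around a mean of 3.
G = [[3.0 + ADD[i] * ADD[j] for j in range(3)] for i in range(3)]
effects = cockerham_effects(G)
print(effects)
```

For this made-up table only iaa is non-zero, which is the pattern the slide labels as an add-add interaction.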

Page 11: what is the goal of QTL study?

why map gene expression as a quantitative trait?

• cis- or trans-action?
  – does gene control its own expression?
  – evidence for both modes (Brem et al. 2002 Science)
• mechanics of gene expression mapping
  – measure gene expression in intercross (F2) population
  – map expression as quantitative trait (QTL technology)
  – adjust for multiple testing via false discovery rate
• research groups working on expression QTLs
  – review by Cheung and Spielman (2002 Nat Gen Suppl)
  – Kruglyak (Brem et al. 2002 Science)
  – Doerge et al. (Purdue); Jansen et al. (Wageningen)
  – Williams et al. (U KY); Lusis et al. (UCLA)
  – Dumas et al. (2000 J Hypertension)
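
The slide does not say which false discovery rate procedure is used when many expression traits are tested. One common choice is the Benjamini-Hochberg step-up rule, sketched here under that assumption; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices of tests rejected at FDR level q.

    Step-up rule: find the largest rank k with p_(k) <= k * q / m,
    then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values, one per expression trait tested for linkage.
rejected = benjamini_hochberg([0.6, 0.001, 0.039, 0.008, 0.041], q=0.05)
print(rejected)
```

With these made-up p-values the two smallest are declared significant; the tests near 0.04 would pass an unadjusted 0.05 cutoff but not the FDR-adjusted one.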

Page 12: what is the goal of QTL study?

idea of mapping microarrays
(Jansen Nap 2001)

Page 13: what is the goal of QTL study?

goal: unravel biochemical pathways
(Jansen Nap 2001)

Page 14: what is the goal of QTL study?

central dogma via microarrays
(Bochner 2003)

Page 15: what is the goal of QTL study?

coordinated expression in mouse genome (Schadt et al. 2003 Nature)

expression pleiotropy in yeast genome (Brem et al. 2002 Science)

Page 16: what is the goal of QTL study?

glucose insulin

(courtesy AD Attie)

Page 17: what is the goal of QTL study?

Insulin Requirement

from Unger & Orci FASEB J. (2001) 15,312

decompensation

Page 18: what is the goal of QTL study?

Type 2 Diabetes Mellitus

from Unger & Orci FASEB J. (2001) 15,312

Page 19: what is the goal of QTL study?

studying diabetes in an F2

• segregating cross of inbred lines
  – B6.ob x BTBR.ob → F1 → F2
  – selected mice with ob/ob alleles at leptin gene (chr 6)
  – measured and mapped body weight, insulin, glucose at various ages
    • (Stoehr et al. 2000 Diabetes)
  – sacrificed at 14 weeks, tissues preserved
• gene expression data
  – Affymetrix microarrays on parental strains, F1
    • key tissues: adipose, liver, muscle, β-cells
    • novel discoveries of differential expression (Nadler et al. 2000 PNAS; Lan et al. 2002 in review; Ntambi et al. 2002 PNAS)
  – RT-PCR on 108 F2 mice liver tissues
    • 15 genes, selected as important in diabetes pathways
    • SCD1, PEPCK, ACO, FAS, GPAT, PPARgamma, PPARalpha, G6Pase, PDI,

Page 20: what is the goal of QTL study?

LOD map for PDI: cis-regulation
Lan et al. (2003 submitted)
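
A LOD map plots, position by position, the base-10 log likelihood ratio for a QTL against no QTL. As a rough illustration only, the sketch below computes a LOD at a single fully genotyped marker under a normal model with one mean per genotype class, where LOD = (n/2) log10(RSS_null / RSS_genotype). It is not the interval mapping used for the slide, the data are invented, and the function name is hypothetical.

```python
import math


def marker_lod(phenotypes, genotypes):
    """LOD score for a one-way normal model at a single marker.

    Assumes every individual is genotyped and the within-genotype
    residual sum of squares is non-zero.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group phenotypes by marker genotype and sum within-group squares.
    groups = {}
    for y, g in zip(phenotypes, genotypes):
        groups.setdefault(g, []).append(y)
    rss_geno = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        rss_geno += sum((y - group_mean) ** 2 for y in ys)

    return (n / 2.0) * math.log10(rss_null / rss_geno)


# Hypothetical expression values for six F2 mice at one marker.
lod = marker_lod([1.0, 2.0, 3.0, 4.0, 2.0, 3.0],
                 ["aa", "aa", "AA", "AA", "Aa", "Aa"])
print(lod)
```

Repeating this calculation along the genome, with genotype probabilities between markers, gives the kind of LOD profile shown for PDI.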

Page 21: what is the goal of QTL study?

Multiple Interval Mapping
SCD1: multiple QTL plus epistasis!

[Figure: two panels along chr2, chr5, chr9 (positions 0 to 300). Top: LOD (0 to 8). Bottom: effect (add=blue, dom=red), from -0.5 to 1.0.]

Page 22: what is the goal of QTL study?

2-D scan: assumes only 2 QTL!

epistasis LOD

joint LOD

Page 23: what is the goal of QTL study?

trans-acting QTL for SCD1
(no epistasis yet: see Yi, Xu, Allison 2003)

dominance?

Page 24: what is the goal of QTL study?

Bayesian model assessment: chromosome QTL pattern for SCD1

[Figure, left panel "pattern posterior": model posterior (0.00 to 0.15) by model index (1 to 15). Pattern labels: 4:1,2,2*3; 4:1,2*2,3; 5:3*1,2,3; 5:2*1,2,2*3; 5:2*1,2*2,3; 6:3*1,2,2*3; 6:3*1,2*2,3; 5:1,2*2,2*3; 6:4*1,2,3; 6:2*1,2*2,2*3; 2:1,3; 3:2*1,2; 2:1,2; 3:1,2,3; 4:2*1,2,3.]

[Figure, right panel "Bayes factor ratios": posterior / prior (0.2 to 0.8) by model index (1 to 15), with points labelled 2 to 6 and reference levels "weak" and "moderate".]
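
The right panel plots posterior / prior for each pattern; the ratio of two such values is the Bayes factor between the two patterns. A minimal sketch follows, assuming the posterior is estimated by the share of MCMC samples showing each pattern. The pattern names, sample counts, and prior probabilities are invented for illustration and are not taken from the SCD1 analysis.

```python
from collections import Counter


def pattern_posterior(sampled_patterns):
    """Estimate pr(pattern | data) as the fraction of samples with that pattern."""
    counts = Counter(sampled_patterns)
    total = len(sampled_patterns)
    return {pattern: count / total for pattern, count in counts.items()}


def posterior_to_prior_ratios(posterior, prior):
    """Return posterior / prior for every pattern seen in the samples."""
    return {pattern: posterior[pattern] / prior[pattern] for pattern in posterior}


# Hypothetical MCMC output: one chromosome pattern per saved iteration.
samples = ["ch2,ch5,ch9"] * 6 + ["ch2,ch5"] * 4
# Hypothetical prior probabilities for the same two patterns.
prior = {"ch2,ch5,ch9": 0.25, "ch2,ch5": 0.5}

posterior = pattern_posterior(samples)
ratios = posterior_to_prior_ratios(posterior, prior)
bayes_factor = ratios["ch2,ch5,ch9"] / ratios["ch2,ch5"]
print(ratios, bayes_factor)
```

Dividing by the prior matters because a pattern can be sampled often simply because the prior favors it; the ratio isolates the contribution of the data.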

Page 25: what is the goal of QTL study?

high throughput dilemma

• want to focus on gene expression network
  – hundreds or thousands of genes/proteins to monitor
  – ideally capture networks in a few dimensions
    • multivariate summaries of multiple traits
  – elicit biochemical pathways
    • (Henderson et al. Hoeschele 2001; Ong Page 2002)
• may have multiple controlling loci
  – allow for complicated genetic architecture
  – could affect many genes in coordinated fashion
  – could show evidence of epistasis
  – quick assessment via interval mapping may be misleading

Page 26: what is the goal of QTL study?

why study multiple traits together?

• environmental correlation
  – non-genetic, controllable by design
  – historical correlation (learned behavior)
  – physiological correlation (same body)
• genetic correlation
  – pleiotropy
    • one gene, many functions
    • common biochemical pathway, splicing variants
  – close linkage
    • two tightly linked genes
    • genotypes Q are collinear

Page 27: what is the goal of QTL study?

high throughput: which genes are the key players?

• one approach: clustering of expression, seeded by insulin, glucose
• advantage: subset relevant to trait
• disadvantage: still many genes to study

Page 28: what is the goal of QTL study?

PC simply rotates & rescales to find major axes of variation

[Figure: left, mRNA2 vs. mRNA1 (both axes -5 to 5); right, the same points after rotation and rescaling, V2 vs. V1 (both axes about -2 to 2).]
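
The "rotate and rescale" description can be made concrete for two traits. The sketch below finds the principal axes of a 2 x 2 sample covariance matrix in closed form, rotates the centered data onto them, and divides each score by its standard deviation. The data points and the function name are hypothetical, and the code assumes both eigenvalues are positive.

```python
import math


def principal_axes_2d(points):
    """Return (eigenvalues, rescaled_scores) for 2-D data.

    Rotation: the angle of the major axis is 0.5 * atan2(2*sxy, sxx - syy).
    Rescaling: each rotated score is divided by sqrt(its eigenvalue).
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)

    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    eigenvalues = (half_trace + radius, half_trace - radius)

    scores = []
    for x, y in points:
        v1 = (x - mx) * c + (y - my) * s      # coordinate along major axis
        v2 = -(x - mx) * s + (y - my) * c     # coordinate along minor axis
        scores.append((v1 / math.sqrt(eigenvalues[0]),
                       v2 / math.sqrt(eigenvalues[1])))
    return eigenvalues, scores


# Hypothetical (mRNA1, mRNA2) measurements for five individuals.
data = [(-2.0, -1.0), (-1.0, -1.0), (0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
eigenvalues, scores = principal_axes_2d(data)
print(eigenvalues[0] / sum(eigenvalues))  # share of variance on the first PC
```

After this transformation the two score columns are uncorrelated with unit variance, which is why the right-hand panel looks like a rounder cloud on a smaller scale.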

Page 29: what is the goal of QTL study?

multivariate screen for gene expression mapping

principal components
PC1 (red) and SCD (black)

[Figure: PC2 (22%) vs. PC1 (42%)]

Page 30: what is the goal of QTL study?

mapping first diabetes PC as a trait

[Figure: hong7pc.bim summaries with pattern ch2,ch5,ch9. Three panels along ch2, ch5, ch9 (positions 0 to 300): loci histogram, additive effect, dominance effect.]

Page 31: what is the goal of QTL study?

pFDR for PC1 analysis

[Figure, left panel: pr( H=0 | p>size ) vs. relative size of HPD region, with annotations "prior probability" and "fraction of posterior found in tails".]

[Figure, right panel: BH pFDR (-) and size (.), and Storey pFDR (-), vs. pr( locus in HPD | m>0 ).]

Page 32: what is the goal of QTL study?

false detection rates and thresholds

• multiple comparisons: test QTL across genome
  – size = pr( LOD(λ) > threshold | no QTL at λ )
  – threshold guards against a single false detection
    • very conservative on genome-wide basis
  – difficult to extend to multiple QTL
• positive false discovery rate (Storey 2001)
  – pFDR = pr( no QTL at λ | LOD(λ) > threshold )
  – Bayesian posterior HPD region based on threshold
    Λ = { λ | LOD(λ) > threshold } ≈ { λ | pr( λ | Y,X,m ) large }
  – extends naturally to multiple QTL

Page 33: what is the goal of QTL study?

pFDR and QTL posterior

• positive false discovery rate
  – pFDR = pr( no QTL at λ | Y,X, λ in Λ )
  – $$\mathrm{pFDR} = \frac{\mathrm{pr}(H=0) \cdot \mathrm{size}}{\mathrm{pr}(m=0) \cdot \mathrm{size} + \mathrm{pr}(m>0) \cdot \mathrm{power}}$$
  – power = posterior = pr( QTL in Λ | Y,X, m>0 )
  – size = (length of Λ) / (length of genome)
• extends to other model comparisons
  – m = 1 vs. m = 2 or more QTL
  – pattern = ch1,ch2,ch3 vs. pattern > 2*ch1,ch2,ch3
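
The pFDR ratio above is a direct calculation once size and power are known. The sketch below treats pr(H=0) and pr(m=0) as the same quantity, which is an assumption about the slide's notation, and uses invented values for the region length, genome length, prior, and power.

```python
def region_size(region_length, genome_length):
    """size = (length of the HPD region) / (length of genome)."""
    return region_length / genome_length


def pfdr(prior_no_qtl, size, power):
    """pFDR = pr(m=0)*size / (pr(m=0)*size + pr(m>0)*power).

    prior_no_qtl: pr(m=0), assumed here to equal pr(H=0)
    size: chance that a null locus falls in the region
    power: posterior probability that a QTL lies in the region, given m>0
    Assumes size and power are not both zero.
    """
    false_positive = prior_no_qtl * size
    true_positive = (1.0 - prior_no_qtl) * power
    return false_positive / (false_positive + true_positive)


# Hypothetical values: a 30 cM region on a 300 cM genome, even prior odds.
size = region_size(30.0, 300.0)
rate = pfdr(prior_no_qtl=0.5, size=size, power=0.9)
print(size, rate)
```

Shrinking the region lowers size and so lowers pFDR, but it usually lowers power as well; the curves on the pFDR slide trace that trade-off.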