
Some thoughts of the design of cDNA microarray experiments

Terry Speed & Yee Hwa Yang,

Department of Statistics

UC Berkeley

MGED IV Boston, February 14, 2002

Some aspects of design

Layout of the array
– Which cDNA sequences to print?
  • Library
  • Controls
– Spatial position

Allocation of samples to the slides
– Different design layouts
  • A vs B: treatment vs control
  • Multiple treatments
  • Factorial
  • Time series
– Other considerations
  • Replication
  • Physical limitations: the number of slides and the amount of material
  • Extensibility - linking

Some issues to consider before designing cDNA microarray experiments

Scientific

Aims of the experiment.

Specific questions and priorities between them. How will the experiments answer the questions posed?

Practical (Logistic)

Types of mRNA samples: reference, control, treatment, mutant, etc.

Amount of material. Count the amount of mRNA involved in one channel of hybridization as one unit.

The number of slides available for the experiment.

Other Information

The experimental process prior to hybridization: sample isolation, mRNA extraction, amplification, labelling, …

Controls planned: positive, negative, ratio, etc.

Verification method: Northern, RT-PCR, in situ hybridization, etc.

Natural design choice

Case 1: Meaningful biological control (C)

Samples: Liver tissue from four mice treated with cholesterol-modifying drugs.

Question 1: Which genes respond differently in treatment (T) and control (C)?

Question 2: Which genes respond similarly across two or more treatments relative to control?

Case 2: Use of universal reference.

Samples: Different tumor samples.

Question: To discover tumor subtypes.

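[Diagrams: Case 1, each of T1, T2, T3 and T4 hybridized against the control C; Case 2, each of T1, T2, ..., Tn-1, Tn hybridized against the universal reference Ref.]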

Treatment vs Control

Two samples

e.g. KO vs. WT or mutant vs. WT

            Direct                   Indirect
Slides      T vs C, on two slides    T vs Ref and C vs Ref
Estimate    average(log(T/C))        log(T/Ref) - log(C/Ref)
Variance    σ²/2                     2σ²

Here σ² is the variance of a single log ratio.
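The two variances can be checked with a few lines of arithmetic. This is a minimal sketch, assuming every slide gives one log ratio with the same variance σ² and that log ratios from different slides are uncorrelated. The function names are illustrative, not from the talk.

```python
from fractions import Fraction


def direct_variance(n_slides, sigma2=Fraction(1)):
    """Variance of the average of n independent log(T/C) values."""
    return sigma2 / n_slides


def indirect_variance(sigma2=Fraction(1)):
    """Variance of log(T/Ref) - log(C/Ref) from two independent slides."""
    return 2 * sigma2


direct = direct_variance(2)      # two slides, T vs C
indirect = indirect_variance()   # two slides, T vs Ref and C vs Ref

print(direct)             # 1/2
print(indirect)           # 2
print(indirect / direct)  # 4
```

With the same two slides, the variance ratio is 4, which corresponds to a factor of 2 in standard deviation.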

Caveat

The advantage of direct over indirect comparisons was first pointed out by Churchill & Kerr, and in general we agree with the conclusion. However, you can see in the last M vs A plot that the difference falls short of the factor of 2 that theory predicts. Why?

A likely explanation is that the assumption that log(T/Ref) and log(C/Ref) are uncorrelated is not valid, and so the gains are less than predicted. The reason for the correlation is less obvious, but there are a number of possibilities.

One is that we used mRNA from the same extraction; another is that we didn't dye-swap with the two indirect comparisons, but did when we replicated the direct comparison. The answer is not yet clear.
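To see how a correlation would shrink the gain, note that if log(T/Ref) and log(C/Ref) each have variance σ² and correlation ρ, their difference has variance 2σ²(1 - ρ). The sketch below checks this by simulation. The shared-component model and the value ρ = 0.5 are assumptions chosen for illustration, not estimates from the experiment.

```python
import random
import statistics


def indirect_variance(sigma2, rho):
    """Var(X - Y) when Var(X) = Var(Y) = sigma2 and Corr(X, Y) = rho."""
    return 2 * sigma2 * (1 - rho)


def simulate_indirect(rho, n=50_000, seed=0):
    """Simulated variance of X - Y for unit-variance X, Y with correlation rho >= 0."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        shared = rng.gauss(0, 1)  # component common to both slides
        x = rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0, 1)
        y = rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0, 1)
        diffs.append(x - y)
    return statistics.pvariance(diffs)


theory = indirect_variance(1.0, 0.5)  # 1.0 instead of 2.0 when rho = 0
simulated = simulate_indirect(0.5)
print(theory, round(simulated, 2))
```

Under this model the variance ratio of indirect to direct (two slides each) is 4(1 - ρ), so any positive correlation makes the indirect design look better than the uncorrelated theory suggests.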

Labeling

• 3 sets of self-self hybridization (cerebellum vs cerebellum)
• Data 1 and Data 2 were labeled together and hybridized on two slides separately.
• Data 3 were labeled separately.

[Figure: plots comparing Data 1, Data 2 and Data 3]

Extraction

• Olfactory bulb experiment:
• 3 sets of Anterior vs Dorsal performed on different days
• #10 and #12 were from the same RNA isolation and amplification
• #12 and #18 were from different dissections and amplifications
• All 3 data sets were labeled separately before hybridization


One-way layout: one factor, k levels

[Diagrams: I) and II) A, B and C each hybridized against ref; III) A, B and C hybridized directly against one another.]

                     I) Common reference    II) Common reference    III) Direct comparison
Number of slides     N = 3                  N = 6                   N = 3
Ave. variance        2                      -                       0.67
Units of material    A = B = C = 1          A = B = C = 2           A = B = C = 2
Ave. variance        -                      1                       0.67
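The "Units of material" row follows from counting one unit each time a sample occupies a channel on a slide. A minimal sketch of that count is below. The slide lists are assumptions inferred from the table (each sample against ref once or twice, or each pair of samples once), not taken from the original diagrams.

```python
from collections import Counter


def material_units(slides):
    """Count units of mRNA per sample: one unit per channel per slide."""
    units = Counter()
    for red, green in slides:
        units[red] += 1
        units[green] += 1
    return units


# Assumed slide lists for the three designs.
design_I = [("A", "ref"), ("B", "ref"), ("C", "ref")]
design_II = design_I * 2  # every hybridization done twice
design_III = [("A", "B"), ("B", "C"), ("C", "A")]

for name, slides in [("I", design_I), ("II", design_II), ("III", design_III)]:
    units = material_units(slides)
    print(name, len(slides), {s: units[s] for s in "ABC"})
```

Designs II and III use the same amount of A, B and C, but design III needs half the slides and no reference material.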


For k = 3, efficiency ratio (Design I / Design III) = 3. In general, efficiency ratio = 2k / (k-1). However, remember the assumption that all log ratios are uncorrelated!
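The ratio 2k / (k-1) can be reproduced under one reading of the comparison. The assumptions here are that Design III hybridizes every pair of the k samples once, using k(k-1)/2 slides, and that Design I spreads the same number of slides evenly over the k samples against the reference. Variances are in units of σ², with uncorrelated log ratios.

```python
from fractions import Fraction


def reference_variance(k, n_slides):
    """Variance of a pairwise comparison in a common reference design."""
    reps = Fraction(n_slides, k)  # slides per sample (may be fractional)
    return 2 / reps


def all_pairs_variance(k):
    """Variance of a pairwise comparison when every pair shares one slide."""
    return Fraction(2, k)


def efficiency_ratio(k):
    """Design I variance over Design III variance, same number of slides."""
    n_slides = k * (k - 1) // 2
    return reference_variance(k, n_slides) / all_pairs_variance(k)


for k in (3, 4, 5):
    print(k, reference_variance(k, k * (k - 1) // 2), all_pairs_variance(k),
          efficiency_ratio(k))
```

For k = 3 this gives variances 2 and 2/3 and a ratio of 3, matching the table. A fractional number of slides per sample is an idealisation for values of k where k(k-1)/2 is not a multiple of k.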

Illustration from one experiment

[Diagrams: Design I, A, B and C each hybridized against Ref; Design III, A, B and C hybridized directly against one another.]

Box plots of log ratios: we are still ahead!

Factorial experiments

• Treated cell lines: CTL, OSM, EGF, OSM & EGF

• Possible experiments

Here we are interested not in genes for which there is an O or an E effect, but in genes for which there is an OE interaction, i.e. genes for which log(O&E/O) - log(E/C) is large or small.
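As a worked example of the interaction contrast, the sketch below evaluates log(O&E/O) - log(E/C) on base-2 logs for a single gene. The intensity values are invented purely for illustration.

```python
import math


def interaction(c, o, e, oe):
    """OE interaction on the log2 scale: log(O&E/O) - log(E/C)."""
    return math.log2(oe / o) - math.log2(e / c)


# Hypothetical intensities: O doubles expression and E quadruples it.
additive = interaction(c=100, o=200, e=400, oe=800)      # effects simply add up
synergistic = interaction(c=100, o=200, e=400, oe=3200)  # joint effect is larger

print(additive)     # 0.0
print(synergistic)  # 2.0
```

A gene can have large O and E effects and still show no interaction. The contrast is non-zero only when the joint treatment departs from the sum of the two separate effects on the log scale.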

Other examples of factorial experiments

Suppose we have tumor T and standard cells S from the same tissue, and are interested in the impact of radiation R on gene expression. In general, genes for which log(RT/T) and log(RS/S) are large or small will be less interesting to us than those for which log(RT/T) - log(RS/S) is large or small, i.e. those with large interactions.

Next, suppose that our interest is in comparing gene expression in two mutants, say M and M’, at two developmental stages, say E and P. Then we are probably more interested in those genes for which the temporal patterns in the two mutants differ than in the patterns themselves, i.e. interest focusses on genes for which log(ME/MP) - log(M’E/M’P) is large or small, again the ones with large interactions.

2 x 2 factorial: some design options

[Diagrams: designs I) to IV), each linking the four samples C, A, B and A.B.]

                     Indirect    A balance of direct and indirect
                     I)          II)      III)     IV)
# Slides             N = 6
Main effect A        0.5         0.67     0.5      NA
Main effect B        0.5         0.43     0.5      0.3
Interaction A.B      1.5         0.67     1        0.67

Table entry: variance (assuming all log ratios uncorrelated)
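The column for design I can be reproduced by hand. The sketch assumes design I hybridizes each of A, B and A.B twice against C (six slides), which is an inference from the tabulated values rather than something recoverable from the diagram. Variances are in units of σ².

```python
from fractions import Fraction


def combination_variance(coeffs, variances):
    """Variance of a linear combination of uncorrelated estimates."""
    return sum(c * c * v for c, v in zip(coeffs, variances))


# Each of log(A/C), log(B/C), log(A.B/C) is the mean of two slides.
per_contrast = Fraction(1, 2)
variances = [per_contrast] * 3  # order: A/C, B/C, A.B/C

main_A = combination_variance([1, 0, 0], variances)
main_B = combination_variance([0, 1, 0], variances)
# Interaction = log(A.B/C) - log(A/C) - log(B/C)
interaction_AB = combination_variance([-1, -1, 1], variances)

print(main_A, main_B, interaction_AB)  # 1/2 1/2 3/2
```

The interaction is estimated only indirectly in this design, so it accumulates the variance of all three averaged log ratios.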

Design choices in time series. Entry: variance

                                          t vs t+1            t vs t+2      t vs t+3
                                          T1T2  T2T3  T3T4    T1T3  T2T4    T1T4     Ave
N=3  A) T1 as common reference            1     2     2       1     2       1        1.5
     B) Direct hybridization              1     1     1       2     2       3        1.67
N=4  C) Common reference                  2     2     2       2     2       2        2
     D) T1 as common ref + more           .67   .67   1.67    .67   1.67    1        1.06
     E) Direct hybridization choice 1     .75   .75   .75     1     1       .75      .83
     F) Direct hybridization choice 2     1     .75   1       .75   .75     .75      .83

[Diagrams: designs A) to F) on the time points T1, T2, T3 and T4; design C) also uses a common reference Ref.]
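The table entries can be recomputed from a list of slides by least squares. This is a sketch under stated assumptions: one log ratio per slide, equal variance σ², no correlation, and slide lists for A) to F) that are inferred from the tabulated variances, not read off the original diagrams. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def invert(matrix):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # A missing pivot (StopIteration) means the design is disconnected.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pairwise_variances(samples, slides):
    """Variance, in units of sigma^2, of each estimated log(a/b)."""
    base, rest = samples[0], samples[1:]
    col = {s: i for i, s in enumerate(rest)}
    p = len(rest)

    def coded(plus, minus):
        # Coefficients of log(plus/minus); the base sample is fixed at 0.
        vec = [Fraction(0)] * p
        if plus != base:
            vec[col[plus]] += 1
        if minus != base:
            vec[col[minus]] -= 1
        return vec

    rows = [coded(red, green) for red, green in slides]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    inv = invert(xtx)
    result = {}
    for a, b in combinations(samples, 2):
        c = coded(a, b)
        result[(a, b)] = sum(c[i] * inv[i][j] * c[j]
                             for i in range(p) for j in range(p))
    return result


T = ["T1", "T2", "T3", "T4"]
# Assumed slide lists, chosen to be consistent with the table.
designs = {
    "A": (T, [("T1", "T2"), ("T1", "T3"), ("T1", "T4")]),
    "B": (T, [("T1", "T2"), ("T2", "T3"), ("T3", "T4")]),
    "C": (T + ["Ref"], [(t, "Ref") for t in T]),
    "D": (T, [("T1", "T2"), ("T1", "T3"), ("T1", "T4"), ("T2", "T3")]),
    "E": (T, [("T1", "T2"), ("T2", "T3"), ("T3", "T4"), ("T4", "T1")]),
    "F": (T, [("T1", "T3"), ("T3", "T2"), ("T2", "T4"), ("T4", "T1")]),
}


def average_variance(name):
    """Average variance over the six pairwise time-point comparisons."""
    samples, slides = designs[name]
    variances = pairwise_variances(samples, slides)
    pairs = list(combinations(T, 2))
    return sum(variances[pair] for pair in pairs) / len(pairs)


for name in designs:
    print(name, float(average_variance(name)))
```

Exact fractions are used so that values such as 2/3 and 5/6 can be compared directly with the rounded entries .67 and .83. Changing a slide list shows how the precision shifts between adjacent and distant time points.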

A recently designed factorial experiment

[Diagram: for each of Mutant 1 (M1) and Mutant 2 (M2), WT and MT samples at P1, P11 and P21, labeled M1.WT.P1, M1.WT.P11, M1.WT.P21, M1.MT.P1, M1.MT.P11, M1.MT.P21, and likewise for M2.]

Question: Seek genes that are changing over time and are different in MT vs WT.

Analysis: Look at the interaction effect between time and type.

Summary

The balance of direct and indirect comparisons in a given context should be determined by optimizing the precision of the estimates among comparisons of interest, subject to the scientific and physical constraints of the experiment.

Acknowledgments

Jean Yee Hwa Yang

Sandrine Dudoit

Gary Glonek (Adelaide)

Ingrid Lönnstedt (Uppsala)

John Ngai’s Lab (Berkeley)

Jonathan Scolnick

Cynthia Duggan

Vivian Peng

Moriah Szpara

Percy Luu

Elva Diaz

Dave Lin (Cornell)

Some web sites:

Technical reports, talks, software etc.

http://www.stat.berkeley.edu/users/terry/zarray/Html/

Statistical software R (“GNU’s S”)

http://www.R-project.org/

Packages within R environment:

-- SMA (statistics for microarray analysis) http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html

-- Spot http://www.cmis.csiro.au/iap/spot.htm








