Statistics in Science: Structure in the Experimental Treatments PGRM 11

Upload: makoto

Post on 22-Jan-2016


TRANSCRIPT

Page 1: Structure in the Experimental Treatments PGRM 11

Statistics in Science

Structure in the Experimental Treatments

PGRM 11

Page 2: Structure in the Experimental Treatments PGRM 11

Factors

Complex systems are affected by a wide range of factors:

• Ploughing system: soil type, ploughing depth, number of cultivations, type of plough, etc.

• Animal production system: management regime, biological & environmental inputs

• Ecological habitat: available food, cover, light, temperature

• Biochemical reaction: concentration of reagents, temperature, light

Page 3: Structure in the Experimental Treatments PGRM 11

Factor Levels

Enterprise type is a factor affecting farm outputs

The different enterprise types considered are the levels of the factor:

e.g. beef, beef suckler, dairy, mixed

Levels may be categorical (as above) or quantitative, as in the study of the effect of a washing solution on retarding bacterial growth, where the levels were 2%, 4% or 6% of an active ingredient.

With quantitative levels it makes sense to look for a trend (increasing or decreasing) in the response as the level increases.

Page 4: Structure in the Experimental Treatments PGRM 11

Single factor experiments

• Compare the mean response for the different levels of a single factor

• Other factors affecting the response must be kept as constant as possible; any effect of these will appear as random residual variation (because units are allocated at random to the different levels of the factor)

• The result will be clear and valid, but of limited value

Ex: when comparing the growth of lambs fed on 2 levels of protein supplement, we must use the same source of protein for both levels, so we have no information on what the response to protein level would be for other sources

Page 5: Structure in the Experimental Treatments PGRM 11

(Multi)-factorial experiments

Examine the effect of 2 (or more) factors at the same time

Treatments: the various combinations of the levels of the different factors

Ex: protein supplement: factor B (levels B1, B2, B3); protein source: factor A (levels A1, A2); giving 6 treatments:

A1B1, A1B2, A1B3
A2B1, A2B2, A2B3
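As a small illustration of how the treatments of a factorial arise, the sketch below lists every combination of levels for the two factors in the example. It is written in Python rather than SAS, and the function name and dictionary keys are hypothetical, chosen only for this illustration.

```python
from itertools import product

def treatment_combinations(factors):
    """Return every treatment: one level taken from each factor, in order."""
    return ["".join(combo) for combo in product(*factors.values())]

# Factor levels from the example: protein source (A) and protein supplement (B).
factors = {"source": ["A1", "A2"], "supplement": ["B1", "B2", "B3"]}
treatments = treatment_combinations(factors)
print(len(treatments), treatments)
```

Adding a third factor to the dictionary multiplies the number of treatments by its number of levels.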

Page 6: Structure in the Experimental Treatments PGRM 11

Simple and Main effects

• Simple effects of source: Difference (in mean growth) between source A1 & A2 can be considered at each of the 3 levels of protein.

• Simple effects of protein: Differences between B1 & B2, B2 & B3 and between B3 & B1 can be measured for each source.

• Main effects are averages of simple effects, and are not always meaningful

Page 7: Structure in the Experimental Treatments PGRM 11

Example

Means       B1   B2   B3   Average
A1          10   18   11     13
A2          11   19   18     16
A effect     1    1    7      3

Note: the main effect of A, (1 + 1 + 7)/3 = 3, is also the difference between the MARGINAL means (16 - 13 = 3)

Here the effect of A depends on the level of B

This is an INTERACTION between the factors A and B
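The arithmetic in this example can be sketched as follows, using the six treatment means from the table. This is a Python illustration (not part of the original SAS material) and the variable and function names are hypothetical.

```python
# Treatment means from the example; list positions correspond to B1, B2, B3.
means = {"A1": [10, 18, 11], "A2": [11, 19, 18]}

def simple_effects(means):
    """Effect of A (A2 minus A1) at each level of B."""
    return [a2 - a1 for a1, a2 in zip(means["A1"], means["A2"])]

def marginal_mean(row):
    """Mean of one level of A averaged over the levels of B."""
    return sum(row) / len(row)

effects = simple_effects(means)
main_effect = sum(effects) / len(effects)
marginal_diff = marginal_mean(means["A2"]) - marginal_mean(means["A1"])
print(effects, main_effect, marginal_diff)
```

The simple effects are not all equal, which is what the interaction means; their average still equals the difference between the marginal means.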

Page 8: Structure in the Experimental Treatments PGRM 11

Important Rule

With an AB interaction: the effect of A changes as the level of B changes

Hence: averaging the effects of A over the levels of B makes no sense

1. If there is an interaction, the main effect of a factor cannot be uncritically interpreted as the effect of that factor

2. In this case report the means of the ab treatment combinations and some meaningful comparisons, not the separate means for the levels of A and B

Page 9: Structure in the Experimental Treatments PGRM 11

Interaction plot

Page 10: Structure in the Experimental Treatments PGRM 11

Page 11: Structure in the Experimental Treatments PGRM 11

Why do factorial?

1. Factorial experiments compare a set of treatments which have a certain structure: the treatments simply consist of combinations of levels of 2 (or more) factors
– so we already know how to do the analysis!
– the factorial treatment structure will dictate sensible comparisons to make

2. The gain:
– knowing whether the effect of one factor varies with the level of another
– saving resources when there is no interaction, since a simple effect can be estimated at each level of the other factor and the results combined

Page 12: Structure in the Experimental Treatments PGRM 11

Why the gain (in absence of interaction)

Sample size

         B1   B2   B3   Total
A1        6    6    6     18
A2        6    6    6     18
Total    12   12   12     36

A effect: since this is the same for all levels of B it is measured by the difference in the marginal means, each based on 18 observations.

B effects: each B effect (B1 v B2, B2 v B3, B3 v B1) is measured using means of 12 observations

Page 13: Structure in the Experimental Treatments PGRM 11

Separate experiments (same resources)

Experiment on factor A only:

         A1   A2   Total
Units     9    9     18

Experiment on factor B only:

         B1   B2   B3   Total
Units     6    6    6     18

A effect: now measured by the difference between means of 9 observations (was 18).

B effects: now measured by the difference between means of 6 observations (was 12).

Also: we don’t know if the A effects depend on the level of B – MORE LOSS OF INFORMATION!
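To put a number on the loss, the sketch below compares the standard error of a difference (SED) for the two layouts, assuming the usual formula SED = sqrt(2 s²/n) for two means of n observations each and the same residual variance s² in both cases. It is a Python illustration; the value of `s2` is an arbitrary assumption and cancels in the ratios.

```python
import math

def sed(s2, n):
    """Standard error of a difference between two means of n observations each."""
    return math.sqrt(2 * s2 / n)

s2 = 1.0  # residual variance; arbitrary, assumed equal for both layouts

# Factor A: means of 18 observations (factorial) versus 9 (separate experiments).
ratio_A = sed(s2, 9) / sed(s2, 18)
# Factor B: means of 12 observations (factorial) versus 6 (separate experiments).
ratio_B = sed(s2, 6) / sed(s2, 12)
print(round(ratio_A, 3), round(ratio_B, 3))
```

Under these assumptions both SEDs are larger by a factor of sqrt(2), about 41%, when the same 36 units are split into two separate experiments.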

Page 14: Structure in the Experimental Treatments PGRM 11

PGRM pg 11-6

The enormous benefits (of factorial designs) arise through no extra cost but merely by reorganising the work programme.

You can choose to get much more information for the same money or reduce the cost of achieving a given level of information.

Page 15: Structure in the Experimental Treatments PGRM 11

SAS OUTPUT

1. ANOVA table

2. Table of MEANS with SED

3. Writing a summary

Page 16: Structure in the Experimental Treatments PGRM 11

ANOVA: ab factorial, replication r

• Treat this as a 1-way structure, with ab treatments

Source        SS     df           MS
Treatments    TSS    ab - 1       TSS/(ab - 1)
Error         RSS    (r - 1)ab    RSS/((r - 1)ab)
Total                rab - 1

• Now partition the treatment SS, TSS

Source              SS      df                MS
A                   SSA     a - 1             SSA/(a - 1)
B                   SSB     b - 1             SSB/(b - 1)
AB (interaction)    SSAB    (a - 1)(b - 1)    SSAB/((a - 1)(b - 1))
Treatment           TSS     ab - 1
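As a check on the two tables, the degrees of freedom add up as shown below, and so do the sums of squares when every treatment has the same replication r. This is a restatement in LaTeX of what the tables imply, using the same symbols.

```latex
\begin{aligned}
(a-1) + (b-1) + (a-1)(b-1) &= ab - 1, \\
\mathrm{SSA} + \mathrm{SSB} + \mathrm{SSAB} &= \mathrm{TSS}, \\
(ab - 1) + (r-1)ab &= rab - 1 .
\end{aligned}
```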

Page 17: Structure in the Experimental Treatments PGRM 11

Example: time to development of Fasciola hepatica eggs under 2×2 combinations of temperature and relative humidity

Temperature (°C)     16     16     22     22
Humidity level        1      2      1      2
                     27     34     13     17
                     26     37     17     15
                     29     33     16     18
Treatment means    27.3   34.7   15.3   16.7

Source          df      SS       MS       F
Treatments       3   758.33   252.78    75.83 ***

Partition of TSS
Temp             1   675.00   675.00   202.7 ***
Humidity         1    56.33    56.33    16.92 **
Interaction      1    27.00    27.00     8.1 *

Residual         8    26.67     3.33
Total           11   785.00

*** p<0.001, ** p<0.01, * p<0.05
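The sums of squares in this table can be reproduced from the 12 observations. The following is a minimal sketch in Python (standard library only, not the SAS analysis used in the course) for a balanced two-factor design; the function and key names are hypothetical.

```python
from statistics import mean

# Development time keyed by (temperature, humidity level); values from the example.
data = {
    (16, 1): [27, 26, 29],
    (16, 2): [34, 37, 33],
    (22, 1): [13, 17, 16],
    (22, 2): [17, 15, 18],
}

def two_way_anova(data):
    """Sums of squares for a balanced two-factor design with r replicates per cell."""
    values = [y for cell in data.values() for y in cell]
    grand = mean(values)
    r = len(next(iter(data.values())))

    def marginal_ss(index):
        # SS between the marginal means of the factor at position `index` in the key.
        ss = 0.0
        for level in sorted({key[index] for key in data}):
            obs = [y for key, cell in data.items() if key[index] == level for y in cell]
            ss += len(obs) * (mean(obs) - grand) ** 2
        return ss

    ss_treat = sum(r * (mean(cell) - grand) ** 2 for cell in data.values())
    ss_a, ss_b = marginal_ss(0), marginal_ss(1)
    ss_resid = sum((y - mean(cell)) ** 2 for cell in data.values() for y in cell)
    df_resid = len(values) - len(data)
    return {
        "treatments": ss_treat, "A": ss_a, "B": ss_b,
        "AB": ss_treat - ss_a - ss_b,          # interaction SS by subtraction
        "residual": ss_resid, "ms_resid": ss_resid / df_resid,
    }

ss = two_way_anova(data)
print({name: round(value, 2) for name, value in ss.items()})
```

F ratios computed from the unrounded residual mean square may differ slightly in the last digit from those shown in the table.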

Page 18: Structure in the Experimental Treatments PGRM 11

Tables of Means

Temperature (°C)     16     16     22     22
Humidity level        1      2      1      2
Treatment means    27.3   34.7   15.3   16.7     SED = 1.49

Humidity effect: sig. when temp = 16 (7.4), non-sig. when temp = 22 (1.4)

Temp. effect: sig. (12.0 & 18.0) at both levels of humidity

Temperature     16     22    SED
Mean          31.0   16.0   1.06

Humidity        H1     H2    SED
Mean          21.3   25.7   1.06

Interpretation

Overall treatments differ: F = 75.83

Interaction is significant: F = 8.1, so we really should examine the 4 means as above, and ignore the tests for main effects, which e.g. compare levels of HUMIDITY averaged over levels of TEMP

However, in this case the TEMP effect is much larger than the interaction, so its averaged effect broadly reflects its effect at each level of HUMIDITY
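The statements about which simple effects are significant can be checked by dividing each difference by the SED for treatment means. The sketch below is a Python illustration, assuming the SED formula sqrt(2 × MSE / r) and the two-sided 5% critical value of t on 8 residual df (2.306, taken from standard tables); the names are hypothetical.

```python
import math

# Cell means computed from the raw data: key is (temperature, humidity level).
cell_means = {(16, 1): 82 / 3, (16, 2): 104 / 3, (22, 1): 46 / 3, (22, 2): 50 / 3}
mse, r = 26.67 / 8, 3            # residual mean square and replicates per cell
sed = math.sqrt(2 * mse / r)     # SED between two treatment means
T_CRIT = 2.306                   # two-sided 5% point of t on 8 df (from tables)

def simple_effect_t(high, low):
    """t statistic for the difference between two treatment means."""
    return (cell_means[high] - cell_means[low]) / sed

humidity_at_16 = simple_effect_t((16, 2), (16, 1))
humidity_at_22 = simple_effect_t((22, 2), (22, 1))
print(round(sed, 2), round(humidity_at_16, 2), round(humidity_at_22, 2))
```

The humidity difference at 16°C is well beyond the critical value while the one at 22°C is not, in line with the conclusions above.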

Page 19: Structure in the Experimental Treatments PGRM 11

Page 20: Structure in the Experimental Treatments PGRM 11

SAS/GLM for 2-way analysis

Main effects & interaction:

proc glm data = fasciola;
  class temp humidity;
  model time = temp humidity temp*humidity;
  lsmeans temp;
  lsmeans humidity;
  lsmeans temp*humidity;
  /* the standard error reported for each estimate is the SED */
  estimate 'SED for temp' temp 1 -1;
  estimate 'SED for humidity' humidity 1 -1;
quit;

One-way analysis (the 4 temp*humidity combinations treated as a single set of treatments):

proc glm data = fasciola;
  class temp humidity;
  model time = temp*humidity;
  estimate 'SED tment means' temp*humidity 1 -1;
quit;

Page 21: Structure in the Experimental Treatments PGRM 11

SAS demo!

temp humidity time

16 1 27

16 2 34

22 1 13

22 2 17

16 1 26

16 2 37

22 1 17

22 2 15

16 1 29

16 2 33

22 1 16

22 2 18

Data must contain the response values (time) in a single column, identified by the factor levels in 2 other columns.

This gives 3 variables (columns) for the SAS program.

faciola.sas

Page 22: Structure in the Experimental Treatments PGRM 11

What to present (again!)

• Since the interaction is significant, don’t report the main effects.

• Present:

– the 2-way table of mean times (with SED):

Humidity    16°C    22°C
1           27.3    15.3
2           34.7    16.7     SED = 1.49

– a summary: the temp/humidity interaction was significant (p = 0.02); humidity effects were significant at temp = 16 (p = 0.0012) but not at temp = 22 (p = 0.40); temp effects were significant at both humidities (p < 0.0001), and greater when humidity = 2

Page 23: Structure in the Experimental Treatments PGRM 11

Factorial experiment laid out in blocks

• The analysis above laid out the ab treatments as a completely randomised design using rab experimental units (r for each treatment). Think: how would this be done in practice?

• If we group the experimental units into blocks of size ab and randomly allocate the ab treatments to the units within each block, we can then remove the block sum of squares (BSS) from RSS, hopefully reducing it sufficiently to compensate for the reduction in residual DF

• See example over …

Page 24: Structure in the Experimental Treatments PGRM 11

2-way experiment laid out in blocks

• Factor A: 2 levels; Factor B: 3 levels

• 60 experimental units available (10 per treatment)

• Completely randomised design (CR): randomly allocate treatments to units

• Randomised blocks (RB): group units into blocks of size 6 (so 10 blocks) & randomise the 6 treatments in each block, which may be much easier to do

ANOVA

Source      DF: CR   DF: RB
Block                   9
A              1        1
B              2        2
AB             2        2
Residual      54       45
Total         59       59
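The two DF columns follow from the formulas in the earlier ANOVA skeleton. The sketch below (Python; `factorial_df` is a hypothetical helper written for this illustration) computes them for an a×b factorial with r replicates, with or without blocking, assuming each block holds every treatment exactly once.

```python
def factorial_df(a, b, r, blocked=False):
    """Degrees of freedom for an a x b factorial with r replicates per treatment."""
    n = a * b * r
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1)}
    if blocked:
        df["Block"] = r - 1  # r blocks, each containing all ab treatments once
    # Residual is what remains of the total after all fitted terms.
    df["Residual"] = (n - 1) - sum(df.values())
    df["Total"] = n - 1
    return df

cr = factorial_df(2, 3, 10)
rb = factorial_df(2, 3, 10, blocked=True)
print(cr)
print(rb)
```

Blocking costs 9 residual DF here, which is worthwhile only if the block sum of squares removed from the residual is large enough.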

Page 25: Structure in the Experimental Treatments PGRM 11

Practical: 4.2 Two-Factor Factorial Example 2

Bacterial count in sausages stored at 4 temperatures using 3 types of preservative method

Page 26: Structure in the Experimental Treatments PGRM 11

More than 2 factors!

3×4×5 experiment: i.e. factors A, B, C with 3, 4 and 5 levels respectively, giving 60 treatment combinations!

The 3-factor ABC interaction measures how the 2-factor AB interaction changes over the levels of C (see over)

Can get away with replication r = 1 provided the 3-factor interaction can be assumed negligible – not usually liked by journal editors!

With r > 1 we include:
main effects: A, B, C
2-factor interactions: BC, CA, AB
3-factor interaction: ABC

Page 27: Structure in the Experimental Treatments PGRM 11

3-factor interaction for a 2×2×2 expt (a)

[Figure: Response (axis 0 to 40) plotted against B1, B2, B3, with separate lines for A1C1, A1C2, A2C1 and A2C2]

With C1: A effect is least at B2

With C2: A effect is largest at B2

Direction of A effect is different for C1, C2

AB interaction differs at the two C levels

Page 28: Structure in the Experimental Treatments PGRM 11

3-factor interaction arising naturally

See PGRM Fig 11.2.2 (b)

Page 29: Structure in the Experimental Treatments PGRM 11

Examples – measuring the benefit

1. 2×2×2×2: artificial insemination involving 256 heifers (r = 16 per treatment)

2. 3×4×5: imaginary example to practise calculating sample sizes! 120 units (r = 2)

3. 2×2×2: machine tool lifetime, 24 units (r = 3)

Page 30: Structure in the Experimental Treatments PGRM 11

Example 2×2×2×2 factorial

Artificial insemination: 256 heifers (64 each week), 4 factors at 2 levels.

Choices:

A) 4 experiments (r = 32)

B) 2×2×2×2 factorial (r = 16 per combination)

Compare precision

A) 32 animals per treatment: $\mathrm{SED} = \sqrt{2s^2/32} = s/4$, where $s^2 = \mathrm{MSE}$.

B) 128 animals for each level of a factor: $\mathrm{SED} = \sqrt{2s^2/128} = s/8$.

Plus: with B all interactions can be estimated

Page 31: Structure in the Experimental Treatments PGRM 11

Compare precision

A) 32 animals per treatment: $\mathrm{SED} = \sqrt{2s^2/32} = s/4$, where $s^2 = \mathrm{MSE}$.

B) 128 animals for each level of a factor: $\mathrm{SED} = \sqrt{2s^2/128} = s/8$.

Conclusion

Summary: the factorial design

- Halves the SED and quarters the number of animals required for a given level of precision

- Allows more general interpretation of the factor effects since they are tested over a wide range of levels of the other factors

- Allows a test of whether the factors interact.
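The claim that the factorial quarters the number of animals for a given precision can be sketched numerically. The Python illustration below assumes SED = sqrt(2 s²/n) for two means of n animals each, no interactions, and an arbitrary residual SD `s`; the function name is hypothetical.

```python
import math

def animals_per_mean(s, target_sed):
    """Animals needed in each of two means so that sqrt(2*s^2/n) <= target_sed."""
    return math.ceil(2 * s ** 2 / target_sed ** 2)

s = 1.0                           # residual SD in arbitrary units (assumed)
n = animals_per_mean(s, s / 4)    # target SED of s/4, as in option A

# Option A: 4 separate experiments, each with 2 treatments of n animals.
separate_total = 4 * 2 * n
# Option B: in a 2x2x2x2 factorial each factor-level mean uses half of all animals.
factorial_total = 2 * n
print(n, separate_total, factorial_total)
```

With these assumptions the four separate experiments need 256 animals to reach an SED of s/4, while the factorial reaches it with 64.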

Page 32: Structure in the Experimental Treatments PGRM 11

3×4×5 expt with factors A, B, C & replication 2 (120 units)

Replication of main effect means (for any factor not involved in a significant interaction):

A    B    C
40   30   24

Replication of means in an interaction table, e.g. BC (for comparing BC effects if the only significant interaction is BC):

         C1   C2   C3   C4   C5   Total
B1        6    6    6    6    6     30
B2        6    6    6    6    6     30
B3        6    6    6    6    6     30
B4        6    6    6    6    6     30
Total    24   24   24   24   24    120

Replication of means for all interactions (all 2-factor interactions significant, 3-factor not):

AB   AC   BC   Treat. comb.
10    8    6    2
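The replication figures above all come from dividing the total number of units by the number of cells in the table of means concerned. A short Python sketch of that rule, with a hypothetical helper name:

```python
from math import prod

def replication(levels, r, factors_in_table):
    """Observations behind each mean in a table indexed by the given factors."""
    total_units = r * prod(levels.values())
    cells = prod(levels[f] for f in factors_in_table)
    return total_units // cells

levels = {"A": 3, "B": 4, "C": 5}   # the 3x4x5 example, replication r = 2
main = {f: replication(levels, 2, (f,)) for f in levels}
two_way = {pair: replication(levels, 2, pair) for pair in [("A", "B"), ("A", "C"), ("B", "C")]}
three_way = replication(levels, 2, ("A", "B", "C"))
print(main, two_way, three_way)
```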

Page 33: Structure in the Experimental Treatments PGRM 11

Example

An engineer is interested in the effects of cutting speed (A), tool geometry (B) and cutting angle (C) on the life (in hours) of a machine tool.

Two levels of each factor are chosen, and three replicates of a $2^3$ factorial design are run.

Design: 2×2×2

No. treatments: 8

No. units: 24

Page 34: Structure in the Experimental Treatments PGRM 11

Example: Data

A   B   C    LIFE (hr) by replicate
              1    2    3
1   1   1    22   31   25
2   1   1    32   43   29
1   2   1    35   34   50
2   2   1    55   47   46
1   1   2    44   45   38
2   1   2    40   37   36
1   2   2    60   50   54
2   2   2    39   41   47

Page 35: Structure in the Experimental Treatments PGRM 11

Example: ANOVA

Source      df       SS       MS       F    F pr.
A            1     0.67     0.67    0.02    0.884
B            1   770.67   770.67   25.55    <.001
C            1   280.17   280.17    9.29    0.008
A.B          1    16.67    16.67    0.55    0.468
A.C          1   468.17   468.17   15.52    0.001
B.C          1    48.17    48.17    1.60    0.224
A.B.C        1    28.17    28.17    0.93    0.348
Residual    16   482.67    30.17
Total       23  2095.33

Note:

1. ABC interaction non-significant

2. AC is only significant 2-factor interaction

Page 36: Structure in the Experimental Treatments PGRM 11

Tables of MEANS

Level      1      2    SED
A       40.7   41.0   2.24
B       35.2   46.5   2.24
C       37.4   44.2   2.24

A×B       B1     B2    SED
A1      34.2   47.2
A2      36.2   45.8   3.17

A×C       C1     C2    SED
A1      32.8   48.5
A2      42.0   40.0   3.17

B×C       C1     C2    SED
B1      30.3   40.0
B2      44.5   48.5   3.17

A×B×C    B1C1   B1C2   B2C1   B2C2
A1       26.0   42.3   39.7   54.7
A2       34.7   37.7   49.3   42.3

SED = 4.48

Help!

Page 37: Structure in the Experimental Treatments PGRM 11

Making sense of tables

1. From this analysis, the only significant terms are the B and C main effects and the AC interaction.

2. Thus, the only tables that need to be presented are the B main effect table and the AC table of means.
– Geometry (B) has a large effect, increasing the life by over 10 hours.
– Cutting angle (C) increases the life considerably at low but not at high speed (A).

3. Another way of looking at the AC interaction is that increased speed increases tool life for the first cutting angle but reduces it for the second cutting angle.

Page 38: Structure in the Experimental Treatments PGRM 11

How were the tables calculated?
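One way to see where the tables come from: each table of means is the average of the raw observations over every factor that does not index the table. The sketch below is a Python illustration using the tool-life data from the example (the course itself uses SAS lsmeans); the function name and the use of tuple positions for factors are assumptions made for this sketch.

```python
from statistics import mean

# Tool life (hours) keyed by the levels of (A, B, C); three replicates per cell.
life = {
    (1, 1, 1): [22, 31, 25], (2, 1, 1): [32, 43, 29],
    (1, 2, 1): [35, 34, 50], (2, 2, 1): [55, 47, 46],
    (1, 1, 2): [44, 45, 38], (2, 1, 2): [40, 37, 36],
    (1, 2, 2): [60, 50, 54], (2, 2, 2): [39, 41, 47],
}

def table_of_means(data, factor_positions):
    """Average the response over every factor NOT listed in factor_positions."""
    groups = {}
    for key, ys in data.items():
        sub = tuple(key[i] for i in factor_positions)
        groups.setdefault(sub, []).extend(ys)
    return {sub: round(mean(ys), 1) for sub, ys in sorted(groups.items())}

b_means = table_of_means(life, [1])        # B main effect table
ac_means = table_of_means(life, [0, 2])    # A x C table, keyed by (A level, C level)
print(b_means)
print(ac_means)
```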

Page 39: Structure in the Experimental Treatments PGRM 11

SAS/GLM code

proc glm data = mydataset;
  class a b c;   /* added here: lsmeans effects must be class variables */
  model response = a b c b*c c*a a*b a*b*c;
  lsmeans a b c b*c c*a a*b a*b*c;
quit;

With one (AC) significant interaction:

lsmeans b a*c / stderr;
estimate 'b SED' b 1 -1;
/* ac SED = sqrt(2) x stderr */

Is this the best we can suggest?

Page 40: Structure in the Experimental Treatments PGRM 11

Calculating SEDs

Recall (with equal replication):

SED = √2 × SEM

SED: standard error of a difference

SEM: standard error of a mean

SAS:

lsmeans B / stderr;

lsmeans A*C / stderr;

lsmeans A*B*C / stderr;

will give SEM, & a usually useless p-value testing whether the mean is 0!

f3_toolLife.sas

Page 41: Structure in the Experimental Treatments PGRM 11

Calculating SED:

For the AC interaction:

SEM = 2.2422707 (NB: usual SAS unhelpful precision!)

so SED = 1.414 × 2.2422707

= 3.17 (3 sig. figs.)
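The same figures can be obtained directly from the residual mean square, since SEM = sqrt(MSE / n) for a mean of n observations. A minimal Python sketch, taking the residual SS and df from the ANOVA table and assuming equal replication; the helper name is hypothetical:

```python
import math

mse = 482.67 / 16          # residual SS / residual df from the ANOVA table

def sem_and_sed(mse, n):
    """SEM of a mean of n observations, and the SED between two such means."""
    sem = math.sqrt(mse / n)
    return sem, math.sqrt(2) * sem

sem_ac, sed_ac = sem_and_sed(mse, 6)      # AC table: 24 units / 4 cells
_, sed_main = sem_and_sed(mse, 12)        # main effects: 24 units / 2 levels
_, sed_abc = sem_and_sed(mse, 3)          # ABC table: 3 replicates per cell
print(round(sem_ac, 2), round(sed_ac, 2), round(sed_main, 2), round(sed_abc, 2))
```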

Page 42: Structure in the Experimental Treatments PGRM 11

Transformations of data

Analysing log(response)

Page 43: Structure in the Experimental Treatments PGRM 11

Interpreting the log scale

Linear relationship:

$\log(y) = a + bx$ (here: $\log = \log_2$)

$y = 2^{a + bx} = 2^{a}\, 2^{bx}$

Compare y-values for a unit increase in x, i.e. $y_1$ at $x$ and $y_2$ at $x + 1$:

$y_2 / y_1 = [2^{a}\, 2^{bx + b}] / [2^{a}\, 2^{bx}] = 2^{b}$

Increasing x by 1 multiplies y by $2^{b}$

e.g. if b = -1 this is a 50% decrease in y

Page 44: Structure in the Experimental Treatments PGRM 11

Understanding the LOG scale

- where effects of a variate are proportional

Example:

1. uses log2 (logs to base 2)

2. slope b = -1

- giving a 50% decrease per unit increase in x

Page 45: Structure in the Experimental Treatments PGRM 11

log2(y) = 3 – x

- a linear relationship between log2(y) & x

x y

0 8

1 4

2 2

3 1

4 0.5
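The table can be regenerated by back-transforming the line, which also shows the constant ratio between successive y-values. A short Python sketch (the function name is hypothetical):

```python
def y_from_log2_line(a, b, x):
    """Back-transform log2(y) = a + b*x to the original scale."""
    return 2.0 ** (a + b * x)

xs = [0, 1, 2, 3, 4]
ys = [y_from_log2_line(3, -1, x) for x in xs]
# Ratio of each y to the previous one: constant and equal to 2**b.
ratios = [later / earlier for earlier, later in zip(ys, ys[1:])]
print(ys, ratios)
```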

Page 46: Structure in the Experimental Treatments PGRM 11

Back transforming LOG

$y = \log_{10}(x) \Rightarrow x = 10^{y}$

$y = \log_{2}(x) \Rightarrow x = 2^{y}$

$y = \log(x) \Rightarrow x = \exp(y) = e^{y}$

Page 47: Structure in the Experimental Treatments PGRM 11

Dilution of drug in milk

Excretion of sodium penicillin for five milkings for a cow.

Relationship is not linear.

[Figure: Units vs Milkings; x-axis: Milkings (0 to 6), y-axis: Units (0 to 30000)]

Milking   Units excreted   Log(Units)
1             29547           10.29
2              1111            7.01
3               235            5.46
4                26            3.26
5                 4.3          1.46

Page 48: Structure in the Experimental Treatments PGRM 11

LOG-scale

[Figure: Log(units) vs # milkings; x-axis: Milking (1 to 5), y-axis: Log(units) (0 to 15); fitted line log(U) = 11.9 - 2.14 M]

Slope b = -2.14

exp(-2.14) = 0.12

Conclusion: each milking reduces the number of units to 12% of the previous milking
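The fitted line can be reproduced by ordinary least squares on the natural log of the units excreted. The sketch below is a Python illustration using the five data points from the table; `fit_line` is a hypothetical helper, not a library function.

```python
import math

milking = [1, 2, 3, 4, 5]
units = [29547, 1111, 235, 26, 4.3]

def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

intercept, slope = fit_line(milking, [math.log(u) for u in units])
# Back-transformed slope: the multiplicative change in units per milking.
proportion_remaining = math.exp(slope)
print(round(intercept, 1), round(slope, 2), round(proportion_remaining, 2))
```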

Page 49: Structure in the Experimental Treatments PGRM 11

Revision:

t-test, p-value, significance level, hypothesis testing, and much more

ALL IN ONE OVERHEAD!

Page 50: Structure in the Experimental Treatments PGRM 11

t-test

When H0 is true:

5% of t-values fall on axis below blue shading – for 11 df:

beyond ±2.2

For given t, V, p is proportion of more extreme values

H0: θ = 0

t = ESTIMATE/SE

e.g. θ = μ3 - μ1, or μ1 - 2μ2 + μ3, or β (a regression slope)