33-1
Lecture 33
Multiple Factor ANOVA
STAT 512
Spring 2011
Background Reading
KNNL: Chapter 24
33-2
Topic Overview
• ANOVA with multiple factors
33-3
3-Way ANOVA Model
• Three factors A, B, and C having a, b, and c
levels, respectively
• Notation is similar to before.
33-4
Data for three-way ANOVA
− Y, the response variable
− Factor A with levels i = 1 to a
− Factor B with levels j = 1 to b
− Factor C with levels k = 1 to c
− $Y_{ijkl}$ is the $l$th observation in cell $(i,j,k)$, $l = 1$ to $n_{ijk}$
− A balanced design has $n_{ijk} = n$
33-5
Cell Means Model
$$Y_{ijkl} = \mu_{ijk} + \varepsilon_{ijkl}$$
− $\mu_{ijk}$ is the theoretical mean or expected value
of all observations in cell $(i,j,k)$.
− $\varepsilon_{ijkl} \overset{iid}{\sim} N(0, \sigma^2)$
− $Y_{ijkl} \sim N(\mu_{ijk}, \sigma^2)$ are independent
33-6
Treatment Means
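$$
\begin{aligned}
\mu_{ij\cdot} &= \frac{1}{c}\sum_{k} \mu_{ijk}, &
\mu_{i\cdot k} &= \frac{1}{b}\sum_{j} \mu_{ijk}, &
\mu_{\cdot jk} &= \frac{1}{a}\sum_{i} \mu_{ijk} \\
\mu_{i\cdot\cdot} &= \frac{1}{bc}\sum_{j,k} \mu_{ijk}, &
\mu_{\cdot j\cdot} &= \frac{1}{ac}\sum_{i,k} \mu_{ijk}, &
\mu_{\cdot\cdot k} &= \frac{1}{ab}\sum_{i,j} \mu_{ijk} \\
\mu_{\cdot\cdot\cdot} &= \frac{1}{abc}\sum_{i,j,k} \mu_{ijk}
\end{aligned}
$$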
33-7
Estimates
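$$
\begin{aligned}
\hat\mu_{ijk} &= \frac{1}{n}\sum_{l} Y_{ijkl} \\
\hat\mu_{ij\cdot} &= \frac{1}{cn}\sum_{k,l} Y_{ijkl}, &
\hat\mu_{i\cdot k} &= \frac{1}{bn}\sum_{j,l} Y_{ijkl}, &
\hat\mu_{\cdot jk} &= \frac{1}{an}\sum_{i,l} Y_{ijkl} \\
\hat\mu_{i\cdot\cdot} &= \frac{1}{bcn}\sum_{j,k,l} Y_{ijkl}, &
\hat\mu_{\cdot j\cdot} &= \frac{1}{acn}\sum_{i,k,l} Y_{ijkl}, &
\hat\mu_{\cdot\cdot k} &= \frac{1}{abn}\sum_{i,j,l} Y_{ijkl} \\
\hat\mu_{\cdot\cdot\cdot} &= \frac{1}{abcn}\sum_{i,j,k,l} Y_{ijkl}
\end{aligned}
$$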
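As a minimal sketch of these estimates, the snippet below averages made-up observations from a balanced 2×2×2 design with n = 2. The data values and the names `data`, `cell`, and `marginal` are illustrative assumptions, not part of the lecture's SAS code.

```python
from statistics import mean

# Hypothetical balanced 2x2x2 design with n = 2 observations per cell.
# Keys are (i, j, k); values are the observations Y_ijkl, l = 1..n.
data = {
    (1, 1, 1): [10.0, 12.0], (1, 1, 2): [14.0, 16.0],
    (1, 2, 1): [20.0, 22.0], (1, 2, 2): [24.0, 26.0],
    (2, 1, 1): [30.0, 32.0], (2, 1, 2): [34.0, 36.0],
    (2, 2, 1): [40.0, 42.0], (2, 2, 2): [44.0, 46.0],
}

# Cell means: average over l within each cell (i, j, k).
cell = {key: mean(obs) for key, obs in data.items()}

def marginal(i=None, j=None, k=None):
    """Average the cell means over every index left as None (a 'dot')."""
    picked = [m for (ci, cj, ck), m in cell.items()
              if i in (None, ci) and j in (None, cj) and k in (None, ck)]
    return mean(picked)

print(cell[(1, 1, 1)])     # estimate of mu_{111}
print(marginal(i=1, j=1))  # estimate of mu_{11.}
print(marginal(i=1))       # estimate of mu_{1..}
print(marginal())          # estimate of mu_{...}
```

Averaging cell means equals averaging the raw observations only because the design here is balanced; with unequal cell sizes the two differ.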
33-8
Factor effects model
$$Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}$$
− $\mu$ is the overall (grand) mean
− $\alpha_i, \beta_j, \gamma_k$ are the main effects of factors A, B,
and C
− $(\alpha\beta)_{ij}, (\alpha\gamma)_{ik}, (\beta\gamma)_{jk}$ are the two-way (first
order) interactions
− $(\alpha\beta\gamma)_{ijk}$ is the three-way (second order)
interaction
33-9
Factor Effects
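$$
\begin{aligned}
\alpha_i &= \mu_{i\cdot\cdot} - \mu_{\cdot\cdot\cdot}, &
(\alpha\beta)_{ij} &= \mu_{ij\cdot} - \mu_{i\cdot\cdot} - \mu_{\cdot j\cdot} + \mu_{\cdot\cdot\cdot} \\
\beta_j &= \mu_{\cdot j\cdot} - \mu_{\cdot\cdot\cdot}, &
(\alpha\gamma)_{ik} &= \mu_{i\cdot k} - \mu_{i\cdot\cdot} - \mu_{\cdot\cdot k} + \mu_{\cdot\cdot\cdot} \\
\gamma_k &= \mu_{\cdot\cdot k} - \mu_{\cdot\cdot\cdot}, &
(\beta\gamma)_{jk} &= \mu_{\cdot jk} - \mu_{\cdot j\cdot} - \mu_{\cdot\cdot k} + \mu_{\cdot\cdot\cdot} \\
(\alpha\beta\gamma)_{ijk} &= \mu_{ijk} - \mu_{ij\cdot} - \mu_{i\cdot k} - \mu_{\cdot jk} \\
&\quad + \mu_{i\cdot\cdot} + \mu_{\cdot j\cdot} + \mu_{\cdot\cdot k} - \mu_{\cdot\cdot\cdot}
\end{aligned}
$$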
Plug in cell means to estimate.
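A small sketch of the plug-in step, assuming a 2×2×2 layout with made-up cell means (built so that only the A×B interaction is nonzero). The helper `m` and all numeric values are hypothetical; in practice the estimated cell means would be used in place of `cell`.

```python
from itertools import product
from statistics import mean

levels = (1, 2)
# Hypothetical cell means mu_{ijk} for a 2x2x2 design (illustrative only).
cell = {
    (1, 1, 1): 45.5, (1, 1, 2): 47.5, (1, 2, 1): 48.5, (1, 2, 2): 50.5,
    (2, 1, 1): 46.5, (2, 1, 2): 48.5, (2, 2, 1): 55.5, (2, 2, 2): 57.5,
}

def m(i=None, j=None, k=None):
    """Marginal mean: average mu_{ijk} over every index left as None."""
    return mean([v for (ci, cj, ck), v in cell.items()
                 if i in (None, ci) and j in (None, cj) and k in (None, ck)])

grand = m()
# Main effects: marginal mean minus grand mean.
alpha = {i: m(i=i) - grand for i in levels}
beta = {j: m(j=j) - grand for j in levels}
gamma = {k: m(k=k) - grand for k in levels}
# Two-way interactions.
ab = {(i, j): m(i=i, j=j) - m(i=i) - m(j=j) + grand
      for i, j in product(levels, levels)}
ac = {(i, k): m(i=i, k=k) - m(i=i) - m(k=k) + grand
      for i, k in product(levels, levels)}
bc = {(j, k): m(j=j, k=k) - m(j=j) - m(k=k) + grand
      for j, k in product(levels, levels)}
# Three-way interaction.
abc = {(i, j, k): cell[(i, j, k)]
       - m(i=i, j=j) - m(i=i, k=k) - m(j=j, k=k)
       + m(i=i) + m(j=j) + m(k=k) - grand
       for i, j, k in product(levels, levels, levels)}

print(grand, alpha, beta, gamma)
print(ab)
```

Each family of effects sums to zero over any of its indices, which is the usual constraint set for this parameterization.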
33-10
Constraints
• Usual constraints listed on page 997: sums
of effects over ANY of the indices are zero.
Under these, $\mu_{\cdot\cdot\cdot}$ will be the grand mean.
• In SAS, constraints are all set up to compare
everything to $\mu_{abc}$. Thus a factor effect is
zero if it includes any of the “last” levels of
the factors.
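Written out, the sum-to-zero constraints take the following form. This is a sketch of the standard textbook parameterization; the page cited above remains the authoritative list.

```latex
\begin{aligned}
&\sum_i \alpha_i = \sum_j \beta_j = \sum_k \gamma_k = 0 \\
&\sum_i (\alpha\beta)_{ij} = 0 \ \text{for all } j, \qquad
 \sum_j (\alpha\beta)_{ij} = 0 \ \text{for all } i \\
&\text{(and likewise for } (\alpha\gamma)_{ik} \text{ and } (\beta\gamma)_{jk}\text{)} \\
&\sum_i (\alpha\beta\gamma)_{ijk} = \sum_j (\alpha\beta\gamma)_{ijk}
 = \sum_k (\alpha\beta\gamma)_{ijk} = 0
\end{aligned}
```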
33-11
Assumptions
• Constancy of variance applies across cells;
can do residual plots across treatment
combinations
• For violations, transformations can
sometimes be useful; WLS is a standard
remedial measure if the error distribution is
normal but the variances are different.
33-12
ANOVA Table
• SSTR/Model is partitioned into:
− Main Effects
− Two Way Interactions
− Three Way Interactions
− Etc.
• DF are multiplicative. For example, the three-way
interaction between A, B, and C takes up
$(a-1)(b-1)(c-1)$ DF.
• SS formulas given on page 1008.
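The DF bookkeeping for a balanced three-way layout can be sketched as below. The function name `three_way_df` is made up for illustration; the call uses a = b = c = 2 and n = 3, which is the layout of the alloy example in these notes.

```python
def three_way_df(a, b, c, n):
    """Degrees of freedom for each line of a balanced three-way ANOVA table."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "A*B": (a - 1) * (b - 1),
        "A*C": (a - 1) * (c - 1),
        "B*C": (b - 1) * (c - 1),
        "A*B*C": (a - 1) * (b - 1) * (c - 1),
    }
    df["Error"] = a * b * c * (n - 1)   # n - 1 per cell, abc cells
    df["Total"] = a * b * c * n - 1     # total observations minus one
    return df

alloy = three_way_df(2, 2, 2, 3)
print(alloy)
```

The model and error DF always add up to the total DF, which is a quick sanity check on any ANOVA table.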
33-13
Steps in 3-Factor Analysis
1. Fit full model and check assumptions
2. Start with the 3-way interaction and
determine if it is significant.
3. If not, may consider pooling. To avoid
likelihood of Type I errors, best to pool only
in cases where p-value is not close to
significant.
4. If the 3-way interaction (or more than one
2-way interaction) is significant, then analyze the
three factors jointly in terms of the cell means $\mu_{ijk}$.
33-14
Steps in 3-Factor Analysis (2)
5. If only a single two-way interaction is
significant, may again consider pooling, and
can analyze via regular interaction plot. Do
NOT pool any term for which higher order
terms are significant.
6. Can analyze main effects if factor not
involved in important interaction. May also
be able to look at main effects if they are
large compared to the interactions.
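The pooling in step 3 can be sketched numerically. The figures are taken from the first ANOVA table of the alloy example in these notes (error SS 53.7 on 16 DF, three-way interaction SS 0.60 on 1 DF with p = 0.6778, additive MS 789); the helper name `pool` is hypothetical.

```python
def pool(ss_error, df_error, ss_term, df_term):
    """Fold a non-significant term into the error line."""
    return ss_error + ss_term, df_error + df_term

# Error line and three-way interaction from the alloy ANOVA table.
ss_error, df_error = 53.7, 16
ss_abc, df_abc = 0.60, 1

ss_pooled, df_pooled = pool(ss_error, df_error, ss_abc, df_abc)
mse_pooled = ss_pooled / df_pooled

# F statistic for the additive main effect against the pooled error.
f_additive = 789.0 / mse_pooled
print(ss_pooled, df_pooled, round(mse_pooled, 3), round(f_additive, 1))
```

Pooling only changes the error line; the sums of squares of the remaining terms are untouched in a balanced design.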
33-15
With More than three factors...
• Hope that higher order interactions are not
significant (this is often the case). If they
are, try to analyze cell means. Assuming
they are not...
• Interactions that overlap (e.g. AB and BC)
and are significant suggest analysis of the
three-factor level means.
• Another potential strategy is to combine
factors (e.g. gender and smoking might be
considered one factor with 4 levels)
33-16
Multiple Comparisons
• Tukey, Bonferroni, and Scheffe adjustments
can be made as before (see page 1017 for
appropriate degrees of freedom to use;
generally model and/or error).
• Can utilize contrasts to study specific
questions (should use Scheffe if looking at
any unplanned contrasts; Bonferroni is
appropriate for contrasts that have been
planned in advance)
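For planned contrasts, the Bonferroni adjustment amounts to dividing the family level by the number of contrasts. The sketch below uses made-up p-values and a hypothetical helper `bonferroni`; it does not reproduce any SAS output.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment for g planned comparisons."""
    g = len(p_values)
    per_test = alpha / g                            # level for each test
    adjusted = [min(1.0, g * p) for p in p_values]  # adjusted p-values
    reject = [p <= per_test for p in p_values]
    return per_test, adjusted, reject

# Hypothetical unadjusted p-values for g = 3 planned contrasts.
per_test, adjusted, reject = bonferroni([0.004, 0.020, 0.300])
print(per_test, adjusted, reject)
```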
33-17
Unequal Cell Sizes
• Formulas change a bit as not all of the $n_{ijk}$
are the same
• Look at Type III SS as well as Type I (the
closer the sample sizes are to each other,
the less difference there will be).
• MUST use LSMeans to do comparisons
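A toy illustration of why the choice of mean matters with unequal cell sizes. It assumes a model containing all interactions and no empty cells, in which case the least-squares mean for a factor level is the unweighted average of its cell means. The data and names are made up.

```python
from statistics import mean

# Hypothetical unbalanced data for level 1 of factor A, split by factor B.
cells = {
    ("A1", "B1"): [10.0, 10.0, 10.0, 10.0],  # n = 4
    ("A1", "B2"): [20.0],                    # n = 1
}

# Raw mean: every observation weighted equally, so B1 dominates.
raw_mean = mean([y for obs in cells.values() for y in obs])

# Least-squares-style mean: every cell weighted equally.
ls_mean = mean([mean(obs) for obs in cells.values()])

print(raw_mean, ls_mean)
```

The raw mean reflects how the observations happened to be distributed over the levels of B, while the equally weighted mean does not.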
33-18
Empty Cells
• Can often be problematic for larger designs
• Create situations where some effects are
confounded; generally interactions can
only be partially studied.
• Usually forced to assume some interactions
are zero.
• See page 964 for more on empty cells
33-19
Example
• Problem 24.6 (alloy.sas)
• Studying the effects of three factors on the
hardness of an alloy
• Factor A: Use of a chemical additive (1 =
low amount; 2 = high amount)
• Factor B: Temperature (1 = low, 2 = high)
• Factor C: Time allowed for process (1 =
low, 2 = high)
• Three observations per cell, balanced design
33-20
33-21
33-22
33-23
33-24
Interactions
• Parallel lines suggest no interactions. The
ANOVA table shows the same thing.

Source        DF      SS      MS      F    Pr > F
additive       1     789     789    235    <.0001
time           1    2440    2440    727    <.0001
add*time       1    0.20    0.20   0.06    0.8095
temp           1    1539    1539    458    <.0001
add*temp       1    0.24    0.24   0.07    0.7926
time*temp      1    2.94    2.94   0.88    0.3634
ad*tim*tem     1    0.60    0.60   0.18    0.6778
Error         16    53.7    3.36
Total         23    4826
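As a quick consistency check on this table, each F statistic is the term's mean square divided by the error mean square, and the sums of squares add up to the total. The sketch below recomputes both from the printed (rounded) values, so agreement is only up to rounding.

```python
# (DF, SS) as printed in the ANOVA table; values are rounded.
table = {
    "additive": (1, 789.0),
    "time": (1, 2440.0),
    "add*time": (1, 0.20),
    "temp": (1, 1539.0),
    "add*temp": (1, 0.24),
    "time*temp": (1, 2.94),
    "ad*tim*tem": (1, 0.60),
}
df_error, ss_error = 16, 53.7

mse = ss_error / df_error
# F = (SS / DF) / MSE for every model term.
f_ratios = {term: (ss / df) / mse for term, (df, ss) in table.items()}

ss_total = sum(ss for _, ss in table.values()) + ss_error
df_total = sum(df for df, _ in table.values()) + df_error

for term, f in f_ratios.items():
    print(f"{term:<12}{f:10.2f}")
print(df_total, round(ss_total, 2))
```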
33-25
Analysis
• In this (nice) case we can simply look at the
individual means and draw conclusions
additive    LSMEAN        Pr > |t|
1_low       54.2250000    <.0001
2_high      65.6916667

time        LSMEAN        Pr > |t|
1_low       49.8750000    <.0001
2_high      70.0416667
33-26
Analysis (2)
temp        LSMEAN        Pr > |t|
1_low       51.9500000    <.0001
2_high      67.9666667
• High levels for all three variables are
preferred.
• Don’t forget to check the assumptions (in this
case not too bad, although something looks odd in cell #1)
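In a balanced design the main-effect sums of squares can be recovered from the level means alone, which ties the LSMEANS output back to the ANOVA table. The sketch below assumes 12 observations per factor level (b·c·n = 2·2·3) and uses the LSMEAN values printed in these notes; `main_effect_ss` is a made-up helper name.

```python
def main_effect_ss(level_means, n_per_level):
    """SS for a main effect in a balanced design:
    n_per_level * sum over levels of (level mean - grand mean)^2."""
    grand = sum(level_means) / len(level_means)
    return n_per_level * sum((m - grand) ** 2 for m in level_means)

n_per_level = 12  # 24 observations split evenly over 2 levels

ss_additive = main_effect_ss([54.2250000, 65.6916667], n_per_level)
ss_time = main_effect_ss([49.8750000, 70.0416667], n_per_level)
ss_temp = main_effect_ss([51.9500000, 67.9666667], n_per_level)

print(round(ss_additive, 1), round(ss_time, 1), round(ss_temp, 1))
```

The results agree with the SS column of the ANOVA table (789, 2440, 1539) up to rounding.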
33-27
Example (adjusted)
• Data changed a bit (see SAS code)
• Basically, for illustration, interchanged the
cells for A = 1, B = 2 and A = 2, B = 2
• Interaction Plot now suggests interaction
33-28
33-29
Two-way Interaction Plots
• From the 3-way interaction plot we can
guess that the interaction has to do with
time (but not temp since individually, lines
for same level of temp are parallel)
• This is confirmed by looking at the 2-way
interaction plots
33-30
• Interaction between additive and
temperature
33-31
• No interaction between additive/time (and
no apparent effect of additive if we ignore
temperature)
33-32
• No interaction between time/temp; there is
apparently a main effect of temperature in
addition to the interaction.
33-33
ANOVA Output
Source        DF      SS      MS      F    Pr > F
additive       1    0.24    0.24   0.07    0.7926
time           1    2440    2440    727    <.0001
add*time       1    0.60    0.60   0.18    0.6778
temp           1    1539    1539    458    <.0001
add*temp       1     789     789    235    <.0001
time*temp      1    2.94    2.94   0.88    0.3634
ad*tim*tem     1    0.20    0.20   0.06    0.8095
Error         16    53.7    3.36
Total         23    4826
33-34
Results
• Additive interacts with temperature; we will want
to examine that interaction.
• Temperature is significant by itself, so we can
probably look at its main effect as well.
• It would be inappropriate to look at the main effect
for additive; the factor matters through how it
interacts with temperature, and its main effect here
would be misleading.
• Can look at the main effect for time since it is not
involved in any interaction.
33-35
Results (2)
time        LSMEAN        Pr > |t|
1_low       49.8750000    <.0001
2_high      70.0416667

temp        LSMEAN        Pr > |t|
1_low       51.9500000    <.0001
2_high      67.9666667
• Longer time is better
• Apparently higher temperature is better
33-36
Results (3)
additive    temp      LSMEAN    Number
1_low       1_low     46.317    1
1_low       2_high    73.800    2
2_high      1_low     57.583    3
2_high      2_high    62.133    4

Pr > |t| for pairwise comparisons of LSMEAN i vs. LSMEAN j:

i/j         1         2         3         4
1                  <.0001    <.0001    <.0001
2        <.0001              <.0001    <.0001
3        <.0001    <.0001              0.0028
4        <.0001    <.0001    0.0028
33-37
Results (4)
• Can identify a “best” combination of
additive and temperature (low additive,
high temperature)
• As we saw in the interaction plot, the
additive counteracts the effect of raising
the temperature to some degree
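The counteracting effect can be read off the four additive × temperature means from Results (3). The sketch below computes the simple effect of temperature at each additive level and the interaction contrast; the SS line assumes the balanced layout with 24 observations in total, and the variable names are illustrative.

```python
# Additive x temperature LSMEANs from Results (3): keys are (additive, temp).
means = {
    ("low", "low"): 46.317, ("low", "high"): 73.800,
    ("high", "low"): 57.583, ("high", "high"): 62.133,
}

# Simple effect of raising temperature at each additive level.
temp_effect_low_add = means[("low", "high")] - means[("low", "low")]
temp_effect_high_add = means[("high", "high")] - means[("high", "low")]

# Interaction contrast: how much the temperature effect changes with additive.
interaction_contrast = temp_effect_high_add - temp_effect_low_add

# In a balanced 2x2 layout each interaction effect is +/- contrast / 4,
# and the interaction SS is N times its square (N = 24 observations).
ss_add_temp = 24 * (interaction_contrast / 4) ** 2

print(round(temp_effect_low_add, 3), round(temp_effect_high_add, 3))
print(round(interaction_contrast, 3), round(ss_add_temp, 1))
```

Raising the temperature gains about 27.5 units at low additive but only about 4.6 at high additive, and the implied interaction SS matches the add*temp line of the ANOVA output up to rounding.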
33-38
Upcoming …
• More multiple ANOVA / ANCOVA
examples
• Fixed vs. Random Effects