Stat 529 (Winter 2011)
The Spock analysis, model checks and robustness
Reading: Sections 5.3–5.5.
• Introduction to the Spock dataset (handout)
– The exploratory data analysis
– Performing the basic one-way ANOVA
• Robustness considerations
– Checking assumptions – revisiting the ANOVA model
– Fitted values and residuals
– Producing residual plots
– Residual plots for the Spock ANOVA analysis
– The all-in-one graph from the ANOVA command
• Comparing different models using the F test
• Diagnostic plots for the finally chosen model
• Completing the inference
1
The Spock dataset
• See the handout for a description of the data, the question of
interest, and the exploratory data analysis.
• A discussion of the summaries:
2
Performing the basic one-way ANOVA
• Stat → ANOVA → One-Way.
– Response: Percentage of women.
– Factor: Judge.
– Click Store residuals and Store fits.
– Click OK.
• RESI1 are the residuals.
– Rename this to residuals.
• FITS1 are the fitted values.
– Rename this to fitted values.
(We will need the residuals and fitted values to diagnose the
fit of the model graphically.)
3
ANOVA output
One-way ANOVA: Percentage of women versus Judge
Source  DF      SS     MS     F      P
Judge    6  1927.1  321.2  6.72  0.000
Error   39  1864.4   47.8
Total   45  3791.5
S = 6.914   R-Sq = 50.83%   R-Sq(adj) = 43.26%
Individual 95% CIs For Mean Based on Pooled StDev
Level  N    Mean   StDev  --------+---------+---------+---------+-
A      5  34.120  11.942                         (-------*------)
B      6  33.617   6.582                         (------*------)
C      9  29.100   4.593                     (----*-----)
D      2  27.000   3.818           (------------*-----------)
E      6  26.967   9.010                 (------*------)
F      9  26.800   5.969                  (-----*----)
Spock  9  14.622   5.039  (-----*-----)
                          --------+---------+---------+---------+-
                                16.0      24.0      32.0      40.0
Pooled StDev = 6.914
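As a cross-check on the output above, the ANOVA table can be rebuilt from the group summaries alone (N, mean, and standard deviation per judge). The following is a minimal sketch, not part of the MINITAB session. The variable names are my own, and because the inputs are the rounded values printed above, the results agree with the table only up to rounding.

```python
# Group summaries (N, mean, SD) copied from the MINITAB output above.
groups = {
    "A": (5, 34.120, 11.942),
    "B": (6, 33.617, 6.582),
    "C": (9, 29.100, 4.593),
    "D": (2, 27.000, 3.818),
    "E": (6, 26.967, 9.010),
    "F": (9, 26.800, 5.969),
    "Spock": (9, 14.622, 5.039),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-group ("Judge") and within-group ("Error") sums of squares.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())

df_between = len(groups) - 1        # I - 1
df_within = n_total - len(groups)   # n - I

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
pooled_sd = ms_within ** 0.5        # the "S" and "Pooled StDev" entries

print(f"Judge  DF={df_between}  SS={ss_between:.1f}  MS={ms_between:.1f}  F={f_stat:.2f}")
print(f"Error  DF={df_within}  SS={ss_within:.1f}  MS={ms_within:.1f}")
print(f"Pooled StDev = {pooled_sd:.3f}")
```

The pooled standard deviation is the square root of the error mean square, which is why the same value appears twice in the output.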
4
Comments on the ANOVA output
5
Robustness considerations
Taken from the textbook, Section 5.5.1:
• Normality is not crucial as long as the experiment is balanced
and there are no long-tailed or highly skewed distributions.
• Independence within and across groups is critical. If independence
is lacking, a different analysis should be attempted.
• The assumption of equal standard deviations is crucial
(e.g., see Display 5.13).
• The tools are not resistant to severely outlying observations.
6
Checking assumptions – revisiting the ANOVA model
• Remember the additive model for our data:
$Y_{ij} = \mu_i + \varepsilon_{ij}$; $i = 1, \dots, I$, $j = 1, \dots, n_i$.
• One way to check that the model fits well is to check the
assumptions made for the errors, $\varepsilon_{ij}$. We usually assume:
1. Errors have mean zero and constant error variance $\sigma^2$.
2. The errors are (usually) normally distributed.
3. The errors are independent across $i$ and $j$.
• We will estimate the errors using the residuals.
7
Fitted values and residuals
• For any model (reduced or full) that we consider, let $\hat{\mu}_i$ be
the estimate of the mean in the $i$th population.
• Then the fitted value for case $j$ in sample $i$ is:
$\hat{Y}_{ij} = \hat{\mu}_i$
• The residual for individual $j$ in sample $i$ is:
$e_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \hat{\mu}_i$.
• Example: In a model in which the mean is different for each
population:
$\hat{Y}_{ij} =$
$e_{ij} =$
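For the example above, the usual least-squares estimate of each population mean in the separate-means model is the sample average of that group, so each residual is the observation minus its own group average. The sketch below illustrates this on hypothetical toy data (not the Spock data). The function names are my own, and the single-mean model is included only for contrast.

```python
from statistics import mean

# Hypothetical toy data for illustration only (NOT the Spock data):
# population label -> observed responses.
data = {
    "g1": [30.0, 34.0, 38.0],
    "g2": [20.0, 26.0],
    "g3": [10.0, 14.0, 12.0, 16.0],
}

def residuals_separate_means(data):
    """Full model: the fitted value for every case in a group is that group's average."""
    out = {}
    for label, ys in data.items():
        fitted = mean(ys)
        out[label] = [y - fitted for y in ys]
    return out

def residuals_single_mean(data):
    """Reduced model: every case gets the same fitted value, the grand average."""
    grand = mean([y for ys in data.values() for y in ys])
    return {label: [y - grand for y in ys] for label, ys in data.items()}

full = residuals_separate_means(data)
reduced = residuals_single_mean(data)
print(full["g1"])      # residuals around the g1 average
print(reduced["g1"])   # residuals around the grand average
```

Under the separate-means model the residuals sum to zero within every group. Under the single-mean model they sum to zero only overall.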
8
Properties of the residuals
• If the model fits well, the residuals have the following
properties.
1. Residuals are centered around zero with constant spread.
2. The residuals are normally distributed about zero.
3. There should be no obvious patterns in residuals across i
and j. There should certainly be no relationships between
the residuals and the fitted values in a well fitting model.
9
Some example residual plots
• Plot the residuals versus the fitted values:
– Check for appropriateness of the fit.
– Do we need to transform the response?
– Check for constancy of the variance of errors.
– Look for outliers.
• Plot the residuals versus the population identifier.
– Check adequacy of fit for each population.
– Curvature may indicate the need to transform.
• Normal Q-Q plot of residuals.
– Check that normality is reasonable for the residuals.
• Residuals versus time or collection order.
– Check for systematic problems in the residuals
(e.g., serial correlation).
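To make the normal Q-Q plot concrete, here is a small sketch of how its coordinates can be computed by hand. The residuals are hypothetical, and the plotting positions (k - 0.5)/n are one common convention that I am assuming here; statistical packages, including MINITAB's probability plot, may use a different one.

```python
from statistics import NormalDist

# Hypothetical residuals for illustration only (not from the Spock analysis).
residuals = [-6.1, -3.4, -2.0, -0.8, 0.3, 1.1, 2.5, 3.9, 4.5]

def normal_qq_points(resid):
    """Pair each sorted residual with the matching standard normal quantile."""
    n = len(resid)
    ordered = sorted(resid)
    # Assumed plotting positions: (k - 0.5) / n for k = 1, ..., n.
    theoretical = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, ordered))

points = normal_qq_points(residuals)
for q, r in points:
    print(f"{q:7.3f}  {r:7.2f}")
```

If normality is reasonable, plotting the second column against the first should give points that fall close to a straight line.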
10
Producing residual plots for the Spock analysis
• The next four slides show a number of residual plots:
– The 1st plot was created using Graph → Scatterplot.
– The 2nd plot used Graph → Individual value plots.
– The 3rd plot is from Graph → Boxplot.
– The last figure is a Graph → Probability plot.
• I added reference lines at y = 0 as needed.
• We would need some time variable (e.g., the day the venire was
compiled) to check for serial dependence in the residuals.
11
Residual plots for the Spock ANOVA analysis
12
Residual plots, continued
13
Comments on the residual diagnostic plots
14
The all-in-one graph from the ANOVA command
• The ANOVA command can produce graphs of its own.
– In Stat → ANOVA → One-Way select Graphs and
then Four in One.
• Not very customizable.
– Advice for a good analysis: do not use these graphs; make
your own!
15
Comparing different models using the F test
• Here are some models we could consider for the Spock dataset:
1. One population mean explains all the judges.
2. One mean for Spock’s judge, and another mean for all the
other judges.
3. Each judge needs a separate mean.
• Let us compare these models using F tests.
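Each comparison pairs a reduced (simpler) model with a full (richer) model that contains it. As a sketch of the statistic involved, using my own notation rather than the slides', write SSE for the error sum of squares of a fitted model and df for its error degrees of freedom:

```latex
F_{\text{obs}} =
  \frac{\left(\text{SSE}_{\text{reduced}} - \text{SSE}_{\text{full}}\right) /
        \left(\text{df}_{\text{reduced}} - \text{df}_{\text{full}}\right)}
       {\text{SSE}_{\text{full}} / \text{df}_{\text{full}}}
```

If the reduced model is adequate and the model assumptions hold, $F_{\text{obs}}$ follows an F distribution with $\text{df}_{\text{reduced}} - \text{df}_{\text{full}}$ numerator and $\text{df}_{\text{full}}$ denominator degrees of freedom. For model 1 (a single mean), the error sum of squares and degrees of freedom are the Total row of any of the ANOVA tables.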
16
A mean for each judge
One-way ANOVA: Percentage of women versus Judge
Source  DF      SS     MS     F      P
Judge    6  1927.1  321.2  6.72  0.000
Error   39  1864.4   47.8
Total   45  3791.5
S = 6.914   R-Sq = 50.83%   R-Sq(adj) = 43.26%
Individual 95% CIs For Mean Based on Pooled StDev
Level  N    Mean   StDev  --------+---------+---------+---------+-
A      5  34.120  11.942                         (-------*------)
B      6  33.617   6.582                         (------*------)
C      9  29.100   4.593                     (----*-----)
D      2  27.000   3.818           (------------*-----------)
E      6  26.967   9.010                 (------*------)
F      9  26.800   5.969                  (-----*----)
Spock  9  14.622   5.039  (-----*-----)
                          --------+---------+---------+---------+-
                                16.0      24.0      32.0      40.0
Pooled StDev = 6.914
17
A mean for Spock’s judge and a mean for all
the other judges
One-way ANOVA: Percentage of women versus Is Spock
Source    DF      SS      MS      F      P
Is Spock   1  1600.6  1600.6  32.15  0.000
Error     44  2190.9    49.8
Total     45  3791.5
S = 7.056   R-Sq = 42.22%   R-Sq(adj) = 40.90%
Individual 95% CIs For Mean Based on Pooled StDev
Level   N    Mean  StDev  ----+---------+---------+---------+-----
No     37  29.492  7.431                               (---*---)
Yes     9  14.622  5.039  (-------*-------)
                          ----+---------+---------+---------+-----
                            12.0      18.0      24.0      30.0
Pooled StDev = 7.056
18
The F tests (exercise!)
Compare models 1 versus 2, 1 versus 3, and 2 versus 3.
• 1 versus 2:
$F_{\text{obs}} = 32.15$
The p-value ≤ 0.001 (reject $H_0$).
Conclusion:
• 1 versus 3:
$F_{\text{obs}} = 6.72$
The p-value ≤ 0.001 (reject $H_0$).
Conclusion:
• 2 versus 3:
$F_{\text{obs}} = \frac{(2190.9 - 1864.4)/(44 - 39)}{47.8} = \frac{326.5/5}{47.8} = \frac{65.3}{47.8} = 1.37$.
The p-value is $1 - 0.743 = 0.257$ (fail to reject $H_0$).
Conclusion:
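The three observed F statistics above can be checked with a few lines of arithmetic. This is a sketch under my own naming: each model is represented by the error sum of squares and error degrees of freedom read off the MINITAB outputs, with the Total row standing in for model 1. Because the inputs are rounded, the results match the reported values only up to rounding.

```python
# (error SS, error DF) for each model, copied from the MINITAB outputs above.
# Model 1 (one common mean) is represented by the "Total" row.
models = {
    1: (3791.5, 45),   # one mean for all judges
    2: (2190.9, 44),   # Spock's judge versus all other judges
    3: (1864.4, 39),   # a separate mean for each judge
}

def extra_ss_f(reduced, full):
    """F statistic comparing a reduced model against a full model that contains it."""
    sse_r, df_r = models[reduced]
    sse_f, df_f = models[full]
    numerator = (sse_r - sse_f) / (df_r - df_f)
    denominator = sse_f / df_f
    return numerator / denominator

for reduced, full in [(1, 2), (1, 3), (2, 3)]:
    num_df = models[reduced][1] - models[full][1]
    den_df = models[full][1]
    f_obs = extra_ss_f(reduced, full)
    print(f"model {reduced} vs model {full}: F = {f_obs:.2f} on ({num_df}, {den_df}) DF")
```

Note that the 1 versus 2 and 1 versus 3 comparisons are exactly the F tests printed in the two one-way ANOVA tables; only 2 versus 3 needs a hand calculation.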
19
The F tests, continued
• We used the following MINITAB output:
F distribution with 5 DF in numerator and 39 DF in denominator
x P(X<=x)
1.37 0.743405
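Without MINITAB, the same cumulative probability can be approximated numerically. The sketch below is my own illustration, not how MINITAB computes it. It uses the standard identity that links the F distribution to the regularized incomplete beta function and integrates with Simpson's rule. The function name `f_cdf` and the step count are assumptions, and the shortcut only works when both degrees of freedom are at least 2.

```python
import math

def f_cdf(x, d1, d2, steps=2000):
    """Approximate P(F <= x) for an F(d1, d2) random variable.

    Sketch only: integrates the beta density with Simpson's rule, which
    requires d1 >= 2 and d2 >= 2 so the integrand stays finite at 0.
    """
    if d1 < 2 or d2 < 2:
        raise ValueError("this sketch needs d1 >= 2 and d2 >= 2")
    if x <= 0:
        return 0.0
    a, b = d1 / 2, d2 / 2
    upper = d1 * x / (d1 * x + d2)   # maps x onto the beta scale (0, 1)
    h = upper / steps                # steps must be even for Simpson's rule

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    integral = total * h / 3

    # Normalizing constant B(a, b), computed on the log scale.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

cdf = f_cdf(1.37, 5, 39)
print(f"P(X <= 1.37) = {cdf:.4f}, p-value = {1 - cdf:.3f}")
```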
20
Diagnostic plots for Model 2
• Now we check the model assumptions using diagnostic plots
based on the residuals of model 2.
– Important: The model you choose determines the estimated
mean, $\hat{\mu}_i$, that will be used in calculating the
residuals.
• Comments on the fit of model 2:
21
Diagnostic plots for Model 2
22
Diagnostic plots for Model 2, continued
23
Completing the inference
• Do not use this part of the ANOVA output as a final
inference for comparing the group means:
Individual 95% CIs For Mean Based on Pooled StDev
Level   N    Mean  StDev  ----+---------+---------+---------+-----
No     37  29.492  7.431                               (---*---)
Yes     9  14.622  5.039  (-------*-------)
                          ----+---------+---------+---------+-----
                            12.0      18.0      24.0      30.0
• What is an appropriate inference to use instead?
(side issue: should we pool variances?)
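The slides leave this question open. One natural candidate, offered here only as a sketch and not as the lecture's stated answer, is an inference about the difference between the two group means, with or without pooling the variances. The code below computes the estimated difference, its standard error, and the t statistic both ways from the summaries above. Variable names are my own. Turning either standard error into a confidence interval would also need a t quantile from a table or software.

```python
import math

# Summaries copied from the MINITAB output above: (N, mean, SD).
others = (37, 29.492, 7.431)   # "No": venires of the other judges
spock = (9, 14.622, 5.039)     # "Yes": venires of Spock's judge

n1, m1, s1 = others
n2, m2, s2 = spock
diff = m1 - m2                 # estimated difference in mean percentage of women

# Pooled-variance version (assumes equal standard deviations).
df_pooled = n1 + n2 - 2
sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df_pooled)
se_pooled = sp * math.sqrt(1 / n1 + 1 / n2)
t_pooled = diff / se_pooled

# Welch version (no equal-SD assumption), with Satterthwaite degrees of freedom.
v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = diff / se_welch
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(f"difference = {diff:.3f}")
print(f"pooled: SE = {se_pooled:.3f}, t = {t_pooled:.2f} on {df_pooled} DF")
print(f"Welch:  SE = {se_welch:.3f}, t = {t_welch:.2f} on about {df_welch:.1f} DF")
```

The pooled standard deviation reproduces the 7.056 in the ANOVA output, and the square of the pooled t statistic matches the F statistic for model 1 versus model 2 up to rounding, as it must for a two-group comparison.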
24