Post on 21-Dec-2015
Review: The Logic Underlying ANOVA
• The possible pair-wise comparisons:
[Figure: three samples of n scores each (Sample 1: X11, X12, ..., X1n; Sample 2: X21, X22, ..., X2n; Sample 3: X31, X32, ..., X3n) and their means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]
Review: The Logic Underlying ANOVA
• There are k samples with which to estimate population variance
[Figure: Samples 1, 2 and 3 (X11 ... X1n, X21 ... X2n, X31 ... X3n) and their means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]
$$\hat{\sigma}_1^2 = \frac{\sum_{i=1}^{n} (X_{1i} - \bar{X}_1)^2}{n - 1}$$
$$\hat{\sigma}_2^2 = \frac{\sum_{i=1}^{n} (X_{2i} - \bar{X}_2)^2}{n - 1}$$
$$\hat{\sigma}_3^2 = \frac{\sum_{i=1}^{n} (X_{3i} - \bar{X}_3)^2}{n - 1}$$
Review: The Logic Underlying ANOVA
• The average of these variance estimates is called the “Mean Square Error” or “Mean Square Within”
$$MS_{error} = \frac{\sum_{j=1}^{k} \hat{\sigma}_j^2}{k}$$
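As a rough illustration of this averaging step, the sketch below computes one variance estimate per sample and then takes their mean. The three samples are made-up numbers, not data from the lecture, and the names `samples`, `variance_estimates` and `ms_error` are chosen here for illustration. It assumes every sample has the same size n, as the slides do.

```python
from statistics import variance  # sample variance: divides by n - 1

# Hypothetical data: k = 3 samples with n = 5 scores each (not from the lecture).
samples = [
    [4.0, 5.0, 6.0, 5.0, 5.0],
    [6.0, 7.0, 8.0, 7.0, 7.0],
    [9.0, 8.0, 10.0, 9.0, 9.0],
]

# One estimate of the population variance per sample:
# sum((X - sample mean)^2) / (n - 1).
variance_estimates = [variance(s) for s in samples]

# MS_error is the plain average of the k estimates (appropriate for equal n).
k = len(samples)
ms_error = sum(variance_estimates) / k

print(variance_estimates, ms_error)
```

With unequal sample sizes a simple average would no longer be appropriate; the estimates would need to be weighted by their degrees of freedom.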
Review: The Logic Underlying ANOVA
• There are k means with which to estimate the population variance
[Figure: Samples 1, 2 and 3 (X11 ... X1n, X21 ... X2n, X31 ... X3n) and their means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]
$$\hat{\sigma}^2 = n\,\hat{\sigma}_{\bar{X}}^2 = n\,\frac{\sum_{j=1}^{k} (\bar{X}_j - \bar{X}_{overall})^2}{k - 1}$$
Review: The Logic Underlying ANOVA
• This estimate of population variance based on sample means is called Mean Square Effect or Mean Square Between
$$\hat{\sigma}^2 = n\,\hat{\sigma}_{\bar{X}}^2 = n\,\frac{\sum_{j=1}^{k} (\bar{X}_j - \bar{X}_{overall})^2}{k - 1}$$
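The between-samples estimate can be sketched the same way: take the variance of the k sample means and multiply by n. The data below are made-up numbers for illustration only, and `sample_means`, `variance_of_means` and `ms_effect` are names chosen here. The sketch assumes equal sample sizes, in which case the mean of the sample means equals the overall mean.

```python
from statistics import mean, variance

# Hypothetical data: k = 3 samples with n = 5 scores each (not from the lecture).
samples = [
    [4.0, 5.0, 6.0, 5.0, 5.0],
    [6.0, 7.0, 8.0, 7.0, 7.0],
    [9.0, 8.0, 10.0, 9.0, 9.0],
]

n = len(samples[0])                        # scores per sample
sample_means = [mean(s) for s in samples]  # one mean per sample

# Variance of the k means around the overall mean, dividing by k - 1.
variance_of_means = variance(sample_means)

# Scaling by n turns the variance of means into an estimate
# of the population variance: MS_effect.
ms_effect = n * variance_of_means

print(sample_means, ms_effect)
```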
The F Statistic
• MSerror is based on deviation scores within each sample but…
• MSeffect is based on deviations between samples
• MSeffect would overestimate the population variance when there is some effect of the treatment pushing the means of the different samples apart
The F Statistic
• We compare MSeffect against MSerror by constructing a statistic called F
The F Statistic
• F is the ratio of MSeffect to MSerror
$$F_{k-1,\;k(n-1)} = \frac{MS_{effect}}{MS_{error}}$$
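Putting the two estimates together, a minimal sketch of the F ratio might look like the following. The function name `one_way_f` and the data are hypothetical, and the code assumes equal sample sizes n so that the degrees of freedom are k − 1 and k(n − 1) as in the formula.

```python
from statistics import mean, variance

def one_way_f(samples):
    """Return (F, df_effect, df_error) for k equal-sized samples."""
    k = len(samples)
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes equal sample sizes")

    # Between-samples estimate: n times the variance of the sample means.
    ms_effect = n * variance([mean(s) for s in samples])
    # Within-samples estimate: average of the per-sample variances.
    ms_error = mean([variance(s) for s in samples])

    return ms_effect / ms_error, k - 1, k * (n - 1)

# Hypothetical data (not from the lecture).
samples = [
    [4.0, 5.0, 6.0, 5.0, 5.0],
    [6.0, 7.0, 8.0, 7.0, 7.0],
    [9.0, 8.0, 10.0, 9.0, 9.0],
]

f_value, df_effect, df_error = one_way_f(samples)
print(f_value, df_effect, df_error)
```

Turning F into a p-value requires the F distribution with these two degrees of freedom, which is left out of this sketch.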
The F Statistic
• If the null hypothesis
$$\mu_1 = \mu_2 = \mu_3 = \mu$$
is true, then we would expect
$$\bar{X}_1 = \bar{X}_2 = \bar{X}_3 = \mu$$
except for random sampling variation
The F Statistic
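• F is the ratio of MSeffect to MSerror
• If the null hypothesis is true then MSeffect and MSerror both estimate the same population variance, so F should be close to 1.0 (apart from random sampling variation)
$$F_{k-1,\;k(n-1)} = \frac{MS_{effect}}{MS_{error}}$$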
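One way to see this is a small simulation in which the null hypothesis is true by construction: every sample is drawn from the same normal population, and F is computed many times. Everything here is an assumption made for illustration (the population mean 100 and standard deviation 15, k = 3, n = 10, 2000 repetitions, the fixed seed, and the helper name `f_ratio`).

```python
import random
from statistics import mean, variance

def f_ratio(samples):
    """MS_effect / MS_error for equal-sized samples."""
    n = len(samples[0])
    ms_effect = n * variance([mean(s) for s in samples])
    ms_error = mean([variance(s) for s in samples])
    return ms_effect / ms_error

random.seed(0)                 # fixed seed so the run is repeatable
k, n, trials = 3, 10, 2000     # assumed design and number of repetitions

f_values = []
for _ in range(trials):
    # All k samples come from the SAME population, so the null hypothesis holds.
    samples = [[random.gauss(100, 15) for _ in range(n)] for _ in range(k)]
    f_values.append(f_ratio(samples))

mean_f = mean(f_values)
print(round(mean_f, 2))
```

Individual F values scatter widely around 1 from one repetition to the next; it is their long-run average that sits near 1 (slightly above it for small samples).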
ANOVA is scalable
• You can create a single F for any number of samples
• It is also possible to examine more than one independent variable using a multi-way ANOVA
  – Factors are the independent variables
  – Levels are the values (conditions) that each factor takes
ANOVA is scalable
• A two-way ANOVA:
[Figure: a grid of 12 cells, 4 levels of Factor 1 (columns 1 to 4) by 3 levels of Factor 2 (rows 1 to 3); each cell holds one sample X1, X2, ..., Xn]
Main Effects and Interactions
• There are two types of findings with multi-way ANOVA: Main Effects and Interactions
  – For example, a main effect of Factor 1 indicates that the means under the various levels of Factor 1 were different (at least one was different)
Main Effects and Interactions
[Figure: the 4 × 3 grid of samples (4 levels of Factor 1 as columns, 3 levels of Factor 2 as rows), shown on four successive slides; each slide marks one column mean in turn, $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$, $\bar{X}_4$]
Main Effects and Interactions
A main effect of Factor 1
[Figure: plot of the means of each sample; x-axis: Factor 1 (levels 1 to 4); y-axis: dependent variable; legend: levels of Factor 2 (1, 2, 3)]
Main Effects and Interactions
• There are two types of findings with multi-way ANOVA: Main Effects and Interactions
  – For example, a main effect of Factor 1 indicates that the means under the various levels of Factor 1 were different (at least one was different)
  – A main effect of Factor 2 indicates that the means under the various levels of Factor 2 were different
Main Effects and Interactions
[Figure: the 4 × 3 grid of samples (4 levels of Factor 1 as columns, 3 levels of Factor 2 as rows), shown on three successive slides; each slide marks one row mean in turn, $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]
Main Effects and Interactions
A main effect of Factor 2
[Figure: plot of the sample means; x-axis: Factor 1 (levels 1 to 4); y-axis: dependent variable; legend: levels of Factor 2 (1, 2, 3)]
Main Effects and Interactions
• There are two types of findings with multi-way ANOVA: Main Effects and Interactions
  – For example, a main effect of Factor 1 indicates that the means under the various levels of Factor 1 were different (at least one was different)
  – A main effect of Factor 2 indicates that the means under the various levels of Factor 2 were different
  – An interaction indicates that the effect of one factor is different at different levels of the other factor
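The distinction can be made concrete with a descriptive sketch that works on a table of cell means only. It computes the column means (levels of Factor 1), the row means (levels of Factor 2), and what is left of each cell after both are accounted for. Both tables are invented for illustration, the function name `describe_cell_means` is hypothetical, and this is not a significance test: it only shows which pattern in the means each kind of finding refers to.

```python
from statistics import mean

def describe_cell_means(cell_means):
    """cell_means[row][col]: rows are Factor 2 levels, columns are Factor 1 levels."""
    row_means = [mean(row) for row in cell_means]          # Factor 2 level means
    col_means = [mean(col) for col in zip(*cell_means)]    # Factor 1 level means
    grand_mean = mean(row_means)
    # Interaction residual: what remains of a cell mean after removing
    # the row effect and the column effect.
    residuals = [
        [cell - r - c + grand_mean for cell, c in zip(row, col_means)]
        for row, r in zip(cell_means, row_means)
    ]
    return col_means, row_means, residuals

# Hypothetical table 1: two main effects, no interaction (lines would be parallel).
additive = [
    [10.0, 12.0, 14.0, 16.0],
    [12.0, 14.0, 16.0, 18.0],
    [14.0, 16.0, 18.0, 20.0],
]

# Hypothetical table 2: no main effects, but the effect of Factor 1
# reverses between rows 1 and 2 (lines would cross).
crossed = [
    [10.0, 12.0, 14.0, 16.0],
    [16.0, 14.0, 12.0, 10.0],
    [13.0, 13.0, 13.0, 13.0],
]

for name, table in (("additive", additive), ("crossed", crossed)):
    cols, rows, resid = describe_cell_means(table)
    largest = max(abs(x) for row in resid for x in row)
    print(name, cols, rows, largest)
```

In the first table the column and row means differ while the residuals vanish; in the second the column and row means are all equal while the residuals do not vanish.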
Main Effects and Interactions
An Interaction
[Figure: plot of the sample means; x-axis: Factor 1 (levels 1 to 4); y-axis: dependent variable; legend: levels of Factor 2 (1, 2, 3)]
Correlation
• We often measure two or more different parameters of a single object
Correlation
• This creates two or more sets of measurements
Correlation
• These sets of measurements can be related to each other
  – Large values in one set correspond to large values in the other set
  – Small values in one set correspond to small values in the other set
Correlation
• examples:
  – height and weight
  – smoking and lung cancer
  – SES (socioeconomic status) and longevity
Correlation
• We call the relationship between two sets of numbers the correlation
Correlation
• Measure heights and weights of 6 people
Person  Height  Weight
a       5'4"    120
b       5'10"   140
c       5'2"    100
d       5'1"    110
e       5'6"    140
f       5'8"    150
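The slides leave the measurement of correlation for next time. As a preview only, the sketch below encodes the six people from the table and computes Pearson's r, one standard measure of linear correlation; the choice of that measure is an assumption here, not something the slides specify. The helper names `to_inches` and `pearson_r` are hypothetical, and the weights are treated as unitless numbers exactly as given.

```python
from math import sqrt

# The six people from the table: person -> (height in feet'inches, weight).
people = {
    "a": ("5'4", 120),
    "b": ("5'10", 140),
    "c": ("5'2", 100),
    "d": ("5'1", 110),
    "e": ("5'6", 140),
    "f": ("5'8", 150),
}

def to_inches(height):
    """Convert a string such as 5'10 to total inches."""
    feet, inches = height.split("'")
    return 12 * int(feet) + int(inches)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights = [to_inches(h) for h, _ in people.values()]
weights = [w for _, w in people.values()]

r = pearson_r(heights, weights)
print(heights, weights, round(r, 2))
```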
Correlation
• Height vs. Weight
[Figure: two parallel number lines, Height (5' to 5'10") and Weight (100 to 150), built up over four slides; each person a to f is marked at their value on both lines, with b and e sharing the same weight]
Correlation
• Notice that small values on one scale pair up with small values on the other
[Figure: the Height (5' to 5'10") and Weight (100 to 150) number lines with persons a to f marked on each]
Correlation
• Scatter Plot shows the relationship on a single graph
• Like two number lines perpendicular to each other
[Figure: the Height (5' to 5'10") and Weight (100 to 150) number lines with persons a to f marked on each; annotations: "Think of this as the y-axis" and "Think of this as the x-axis"]
[Figure: scatter plot of Weight (vertical axis, 100 to 150) against Height (horizontal axis, 5' to 5'10"), with one point (*) for each person a to f]
Correlation
• The relationship here is like a straight line
• We call this linear correlation
[Figure: the six scatter-plot points (*), which fall roughly along a straight line]
Various Kinds of Linear Correlation
• Strong Positive
• Weak Positive
• Strong Negative
• No (or very weak) Correlation
  – y values are random with respect to x values
• No Linear Correlation
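These cases can be imitated with small deterministic data sets. The sketch below is an illustration under stated assumptions: all y values are invented, Pearson's r is used as the measure of linear correlation (the slides do not name a measure), and the "no linear correlation" case is assumed to mean a relationship that exists but is curved, here a symmetric parabola.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = list(range(10))

# Invented "noise" patterns: a small one and a large one.
small_noise = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
large_noise = [8, -9, 7, -6, 9, -8, 6, -7, 9, -9]

strong_positive = [2 * x + e for x, e in zip(xs, small_noise)]
weak_positive = [x + e for x, e in zip(xs, large_noise)]
strong_negative = [-(2 * x + e) for x, e in zip(xs, small_noise)]
# A clear relationship that is not a straight line (assumed reading of
# "no linear correlation"): y depends on x, yet r is essentially zero.
curved = [(x - 4.5) ** 2 for x in xs]

results = {
    "strong positive": pearson_r(xs, strong_positive),
    "weak positive": pearson_r(xs, weak_positive),
    "strong negative": pearson_r(xs, strong_negative),
    "curved (no linear correlation)": pearson_r(xs, curved),
}
for label, r in results.items():
    print(f"{label}: r = {r:.3f}")
```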
Correlation Enables Prediction
• Strong correlations mean that we can predict a y value given an x value…this is called regression
• Accuracy of our prediction depends on strength of the correlation
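As a sketch of what such a prediction could look like, the code below fits a least-squares straight line to six height and weight pairs (heights converted to inches: 5'4" = 64, 5'10" = 70, 5'2" = 62, 5'1" = 61, 5'6" = 66, 5'8" = 68) and uses it to predict a weight. The least-squares line is one common choice and is an assumption here, since the slides do not say how the prediction is made; the function name `fit_line` and the query height of 67 inches are likewise hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Heights in inches and weights for persons a to f.
heights = [64, 70, 62, 61, 66, 68]
weights = [120, 140, 100, 110, 140, 150]

slope, intercept = fit_line(heights, weights)

# Predict the weight of a hypothetical person who is 67 inches (5'7") tall.
predicted_weight = slope * 67 + intercept
print(round(slope, 2), round(intercept, 1), round(predicted_weight, 1))
```

The weaker the correlation, the further the actual values tend to fall from the fitted line, which is why the accuracy of the prediction depends on the strength of the correlation.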
Spurious Correlation
• Sometimes two measures (called variables) both correlate with some other unknown variable (sometimes called a lurking variable) and consequently correlate with each other
• This does not mean that they are causally related!
• e.g. use of cigarette lighters is positively correlated with incidence of lung cancer (smoking is the likely lurking variable behind both)
Next Time: measuring correlations