manova multivariate analysis of variance. one way multivariate analysis of variance (manova)...
Post on 18-Jan-2016
MANOVA
Multivariate Analysis of Variance
One way Multivariate Analysis of Variance (MANOVA)
Comparing k p-variate Normal Populations
Comparing k mean vectors
Situation
• We have k normal populations
• Let $\mu_i$ and $\Sigma_i$ denote the mean vector and covariance matrix of population i.
• i = 1, 2, 3, …, k.
• Note: we assume that the covariance matrix for each population is the same:
$$\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$$
We want to test
$$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$$
against
$$H_A: \mu_i \ne \mu_j \text{ for at least one pair } i, j$$
The data
• Assume we have collected data from each of k populations
• Let $x_{i1}, x_{i2}, \dots, x_{in}$ denote the n observations from population i.
• i = 1, 2, 3, …, k.
The summary statistics
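Sample mean vectors
$$\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij} = \begin{bmatrix} \bar{x}_{1i} \\ \bar{x}_{2i} \\ \vdots \\ \bar{x}_{pi} \end{bmatrix}$$
Sample covariance matrices
$S_1, S_2$, etc.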
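Computing Formulae:
Compute
1) Total vector for sample i
$$T_i = \sum_{j=1}^{n} x_{ij} = \begin{bmatrix} T_{1i} \\ \vdots \\ T_{pi} \end{bmatrix} = \begin{bmatrix} \sum_{j=1}^{n} x_{1ij} \\ \vdots \\ \sum_{j=1}^{n} x_{pij} \end{bmatrix}$$
2) Grand Total vector
$$G = \sum_{i=1}^{k} T_i = \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij} = \begin{bmatrix} G_1 \\ \vdots \\ G_p \end{bmatrix}$$
3) Total sample size
$$N = kn$$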
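4)
$$\sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}x_{ij}' = \begin{bmatrix} \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}^2 & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{pij}^2 \end{bmatrix}$$
5)
$$\frac{1}{n}\sum_{i=1}^{k} T_iT_i' = \begin{bmatrix} \frac{1}{n}\sum_{i=1}^{k} T_{1i}^2 & \cdots & \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} \\ \vdots & \ddots & \vdots \\ \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} & \cdots & \frac{1}{n}\sum_{i=1}^{k} T_{pi}^2 \end{bmatrix}$$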
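Let
$$H = \frac{1}{n}\sum_{i=1}^{k} T_iT_i' - \frac{1}{N}GG'$$
$$= \begin{bmatrix} \frac{1}{n}\sum_{i=1}^{k} T_{1i}^2 - \frac{G_1^2}{N} & \cdots & \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} - \frac{G_1G_p}{N} \\ \vdots & \ddots & \vdots \\ \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} - \frac{G_1G_p}{N} & \cdots & \frac{1}{n}\sum_{i=1}^{k} T_{pi}^2 - \frac{G_p^2}{N} \end{bmatrix}$$
$$= \begin{bmatrix} n\sum_{i=1}^{k} (\bar{x}_{1i} - \bar{x}_1)^2 & \cdots & n\sum_{i=1}^{k} (\bar{x}_{1i} - \bar{x}_1)(\bar{x}_{pi} - \bar{x}_p) \\ \vdots & \ddots & \vdots \\ n\sum_{i=1}^{k} (\bar{x}_{1i} - \bar{x}_1)(\bar{x}_{pi} - \bar{x}_p) & \cdots & n\sum_{i=1}^{k} (\bar{x}_{pi} - \bar{x}_p)^2 \end{bmatrix}$$
= the Between SS and SP matrix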
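Let
$$E = \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}x_{ij}' - \frac{1}{n}\sum_{i=1}^{k} T_iT_i'$$
$$= \begin{bmatrix} \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}^2 - \frac{1}{n}\sum_{i=1}^{k} T_{1i}^2 & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} - \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} - \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{pij}^2 - \frac{1}{n}\sum_{i=1}^{k} T_{pi}^2 \end{bmatrix}$$
$$= \begin{bmatrix} \sum_{i=1}^{k}\sum_{j=1}^{n} (x_{1ij} - \bar{x}_{1i})^2 & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} (x_{1ij} - \bar{x}_{1i})(x_{pij} - \bar{x}_{pi}) \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{k}\sum_{j=1}^{n} (x_{1ij} - \bar{x}_{1i})(x_{pij} - \bar{x}_{pi}) & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} (x_{pij} - \bar{x}_{pi})^2 \end{bmatrix}$$
= the Within SS and SP matrix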
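A minimal sketch of the computing formulae 1) to 5) for H and E, in plain Python. The function names and the tiny data set (k = 2 samples, n = 2 observations, p = 2 variables) are hypothetical and chosen only so the arithmetic can be checked by hand; equal sample sizes n are assumed, as on the slides.

```python
def outer(u, v):
    """Outer product u v' of two vectors, returned as a list of rows."""
    return [[a * b for b in v] for a in u]

def add_scaled(A, B, c=1.0):
    """Return A + c * B for two matrices of the same size."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def between_within(groups):
    """groups: k samples, each a list of n observation vectors of length p."""
    k, n, p = len(groups), len(groups[0]), len(groups[0][0])
    N = k * n                                                            # 3) total sample size
    zero = [[0.0] * p for _ in range(p)]
    totals = [[sum(x[l] for x in g) for l in range(p)] for g in groups]  # 1) T_i
    grand = [sum(T[l] for T in totals) for l in range(p)]                # 2) G
    raw, tt = zero, zero
    for g, T in zip(groups, totals):
        for x in g:
            raw = add_scaled(raw, outer(x, x))       # 4) sum of x_ij x_ij'
        tt = add_scaled(tt, outer(T, T), 1.0 / n)    # 5) (1/n) sum of T_i T_i'
    H = add_scaled(tt, outer(grand, grand), -1.0 / N)  # Between SS and SP matrix
    E = add_scaled(raw, tt, -1.0)                      # Within SS and SP matrix
    return H, E

# Hypothetical toy data: two samples of two bivariate observations each.
toy = [[[1, 2], [3, 4]], [[5, 6], [7, 10]]]
H, E = between_within(toy)
print("H =", H)
print("E =", E)
```

For the toy data the sample means are (2, 3) and (6, 8) with grand mean (4, 5.5), so the deviation forms of H and E can be checked directly against the printed matrices.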
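The Manova Table
Source     SS and SP matrix
Between    $H = \begin{bmatrix} h_{11} & \cdots & h_{1p} \\ \vdots & \ddots & \vdots \\ h_{1p} & \cdots & h_{pp} \end{bmatrix}$
Within     $E = \begin{bmatrix} e_{11} & \cdots & e_{1p} \\ \vdots & \ddots & \vdots \\ e_{1p} & \cdots & e_{pp} \end{bmatrix}$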
There are several test statistics for testing
$$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$$
against
$$H_A: \mu_i \ne \mu_j \text{ for at least one pair } i, j$$
1. Roy's largest root
$$\lambda_1 = \text{largest eigenvalue of } HE^{-1}$$
This test statistic is derived using Roy's union-intersection principle
2. Wilks' lambda ($\Lambda$)
$$\Lambda = \frac{|E|}{|H + E|} = \frac{1}{|HE^{-1} + I|}$$
This test statistic is derived using the generalized likelihood ratio principle
3. Lawley-Hotelling trace statistic
$$T_0^2 = \operatorname{tr}(HE^{-1}) = \text{sum of the eigenvalues of } HE^{-1}$$
4. Pillai trace statistic (V)
$$V = \operatorname{tr}\left[H(H + E)^{-1}\right]$$
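A hedged sketch of how the four statistics could be computed from H and E without any external library. The helper names are hypothetical; the largest root is found by simple power iteration, which assumes a single dominant eigenvalue and a starting vector that is not orthogonal to it. The 2 x 2 matrices are toy values, not the example from these slides.

```python
def inv_det(A):
    """Gauss-Jordan inverse and determinant with partial pivoting."""
    n = len(A)
    M = [[float(x) for x in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    det = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            raise ValueError("matrix is singular")
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            det = -det                      # a row swap flips the sign
        pivot = M[c][c]
        det *= pivot
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M], det

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def largest_eigenvalue(M, iters=500):
    """Power iteration; returns 0.0 if the iterate collapses to zero."""
    v = [1.0] * len(M)
    lam = 0.0
    for _ in range(iters):
        w = [sum(m * x for m, x in zip(row, v)) for row in M]
        lam = max(w, key=abs)
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam

def manova_statistics(H, E):
    p = len(H)
    E_inv, det_E = inv_det(E)
    total = [[H[i][j] + E[i][j] for j in range(p)] for i in range(p)]
    total_inv, det_total = inv_det(total)
    HE_inv = matmul(H, E_inv)
    return {
        "roy": largest_eigenvalue(HE_inv),          # largest eigenvalue of H E^-1
        "wilks": det_E / det_total,                 # |E| / |H + E|
        "hotelling": trace(HE_inv),                 # tr(H E^-1)
        "pillai": trace(matmul(H, total_inv)),      # tr(H (H + E)^-1)
    }

# Hypothetical toy matrices (H has rank 1, so H E^-1 has eigenvalues 5 and 0).
stats = manova_statistics([[16, 20], [20, 25]], [[4, 6], [6, 10]])
print(stats)
```

With a single nonzero eigenvalue, Roy's root and the Lawley-Hotelling trace coincide, and Wilks' lambda and Pillai's trace sum to 1, which gives a quick sanity check on the output.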
Example
In the following study, n = 15 first-year university students from each of three different school regions (A, B and C), each taking the following four courses (Math, Biology, English and Sociology), were observed. The marks in these courses are tabulated on the following slide:
The data
         | Educational Region A             | Educational Region B             | Educational Region C
Student  | Math  Biology English Sociology  | Math  Biology English Sociology  | Math  Biology English Sociology
 1       |  62     65      67      76      |  65     55      35      43      |  47     47      98      78
 2       |  54     61      75      70      |  87     81      59      64      |  57     69      68      45
 3       |  53     53      53      59      |  75     67      56      68      |  65     71      77      62
 4       |  48     56      73      81      |  74     70      55      66      |  41     64      68      58
 5       |  60     55      49      60      |  83     71      40      52      |  56     54      86      64
 6       |  55     52      34      41      |  59     48      48      57      |  63     73      88      76
 7       |  76     71      35      40      |  61     47      46      54      |  43     62      84      78
 8       |  58     52      58      46      |  81     77      51      45      |  28     47      65      58
 9       |  75     71      60      59      |  77     68      42      49      |  47     54      90      78
10       |  55     51      69      75      |  82     84      63      70      |  42     44      79      73
11       |  72     74      64      59      |  68     64      35      44      |  50     53      89      89
12       |  72     75      51      47      |  60     53      60      65      |  46     61      91      82
13       |  76     69      69      57      |  94     88      51      63      |  74     78      99      86
14       |  44     48      65      65      |  96     88      67      81      |  63     66      94      86
15       |  89     71      59      67      |  84     75      46      67      |  69     82      78      73
Summary Statistics
(variables in the order Math, Biology, English, Sociology)
$\bar{x}_A$ = (63.267, 61.600, 58.733, 60.133)'
$S_A$ =
  160.638  104.829  -32.638  -47.110
  104.829   92.543   -4.900  -22.229
  -32.638   -4.900  155.638  128.967
  -47.110  -22.229  128.967  159.552
$\bar{x}_B$ = (76.400, 69.067, 50.267, 59.200)'
$S_B$ =
  141.257  155.829   45.100   60.914
  155.829  185.924   61.767   71.057
   45.100   61.767   96.495   93.371
   60.914   71.057   93.371  123.600
$\bar{x}_C$ = (52.733, 61.667, 83.600, 72.400)'
$S_C$ =
  156.067  116.976   53.814   35.257
  116.976  136.381    3.143   -0.429
   53.814    3.143  116.543  114.886
   35.257   -0.429  114.886  156.400
$\bar{x} = \frac{15}{45}\bar{x}_A + \frac{15}{45}\bar{x}_B + \frac{15}{45}\bar{x}_C$ = (64.133, 64.111, 64.200, 63.911)'
$S_{Pooled} = \frac{14}{42}S_A + \frac{14}{42}S_B + \frac{14}{42}S_C$ =
  152.654  125.878   22.092   16.354
  125.878  138.283   20.003   16.133
   22.092   20.003  122.892  112.408
   16.354   16.133  112.408  146.517
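The weighted averages behind the overall mean vector and the pooled covariance matrix can be sketched as follows. The numbers are the rounded summary statistics from this slide (variables in the order Math, Biology, English, Sociology), so the results agree with the slide only up to rounding; the variable names are illustrative.

```python
n, k, p = 15, 3, 4   # students per region, regions, courses
N = n * k

means = {
    "A": [63.267, 61.600, 58.733, 60.133],
    "B": [76.400, 69.067, 50.267, 59.200],
    "C": [52.733, 61.667, 83.600, 72.400],
}
covs = {
    "A": [[160.638, 104.829, -32.638, -47.110],
          [104.829, 92.543, -4.900, -22.229],
          [-32.638, -4.900, 155.638, 128.967],
          [-47.110, -22.229, 128.967, 159.552]],
    "B": [[141.257, 155.829, 45.100, 60.914],
          [155.829, 185.924, 61.767, 71.057],
          [45.100, 61.767, 96.495, 93.371],
          [60.914, 71.057, 93.371, 123.600]],
    "C": [[156.067, 116.976, 53.814, 35.257],
          [116.976, 136.381, 3.143, -0.429],
          [53.814, 3.143, 116.543, 114.886],
          [35.257, -0.429, 114.886, 156.400]],
}

# Overall mean: each region mean weighted by n / N = 15 / 45.
grand_mean = [sum(n * m[l] for m in means.values()) / N for l in range(p)]

# Pooled covariance: each S_i weighted by (n - 1) / (N - k) = 14 / 42.
pooled = [[sum((n - 1) * S[r][c] for S in covs.values()) / (N - k)
           for c in range(p)] for r in range(p)]

print([round(m, 3) for m in grand_mean])
for row in pooled:
    print([round(x, 3) for x in row])
```

Multiplying the pooled matrix by N - k = 42 gives the Within SS and SP matrix E, which is a convenient cross-check between the two ways of summarising the data.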
Computations:
1) Total vector for sample i: $T_i = \sum_{j=1}^{n} x_{ij}$
2) Grand Total vector: $G = \sum_{i=1}^{k} T_i = \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij} = (G_1, \dots, G_p)'$
3) Total sample size: $N = kn = 45$
Totals
                  Math  Biology  English  Sociology
A                  949      924      881        902
B                 1146     1036      754        888
C                  791      925     1254       1086
Grand Totals G    2886     2885     2889       2876
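4)
$$\sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}x_{ij}' = \begin{bmatrix} \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}^2 & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{pij}^2 \end{bmatrix}$$
=
  195718  191674  180399  182865
  191674  191321  184516  184542
  180399  184516  199641  193125
  182865  184542  193125  191590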
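5)
$$\frac{1}{n}\sum_{i=1}^{k} T_iT_i' = \begin{bmatrix} \frac{1}{n}\sum_{i=1}^{k} T_{1i}^2 & \cdots & \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} \\ \vdots & \ddots & \vdots \\ \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} & \cdots & \frac{1}{n}\sum_{i=1}^{k} T_{pi}^2 \end{bmatrix}$$
=
  189306.53  186387.13  179471.13  182178.13
  186387.13  185513.13  183675.87  183864.40
  179471.13  183675.87  194479.53  188403.87
  182178.13  183864.40  188403.87  185436.27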
Now
$$H = \frac{1}{n}\sum_{i=1}^{k} T_iT_i' - \frac{1}{N}GG'$$
= the Between SS and SP matrix
=
   4217.733333   1362.466667  -5810.066667  -2269.333333
   1362.466667   552.5777778  -1541.133333  -519.1555556
  -5810.066667  -1541.133333   9005.733333   3764.666667
  -2269.333333  -519.1555556   3764.666667   1627.911111
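As a check on the arithmetic, H can be rebuilt from the region totals alone. A short sketch using the totals tabulated under "Totals" in these slides; the variable names are illustrative.

```python
n = 15  # students per region
totals = {                      # T_i: Math, Biology, English, Sociology
    "A": [949, 924, 881, 902],
    "B": [1146, 1036, 754, 888],
    "C": [791, 925, 1254, 1086],
}
k, p = len(totals), 4
N = k * n

# Grand total vector G = sum of the T_i.
G = [sum(T[l] for T in totals.values()) for l in range(p)]

# H = (1/n) sum_i T_i T_i' - (1/N) G G', computed entry by entry.
H = [[sum(T[r] * T[c] for T in totals.values()) / n - G[r] * G[c] / N
      for c in range(p)] for r in range(p)]

print(G)
for row in H:
    print([round(x, 3) for x in row])
```

The diagonal of H holds the four univariate between-region sums of squares, which reappear in the SPSS "Tests of Between-Subjects Effects" table.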
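Let
$$E = \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}x_{ij}' - \frac{1}{n}\sum_{i=1}^{k} T_iT_i'$$
$$= \begin{bmatrix} \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}^2 - \frac{1}{n}\sum_{i=1}^{k} T_{1i}^2 & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} - \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{k}\sum_{j=1}^{n} x_{1ij}x_{pij} - \frac{1}{n}\sum_{i=1}^{k} T_{1i}T_{pi} & \cdots & \sum_{i=1}^{k}\sum_{j=1}^{n} x_{pij}^2 - \frac{1}{n}\sum_{i=1}^{k} T_{pi}^2 \end{bmatrix}$$
= the Within SS and SP matrix
=
  6411.467  5286.867   927.867   686.867
  5286.867  5807.867   840.133   677.600
   927.867   840.133  5161.467  4721.133
   686.867   677.600  4721.133  6153.733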
Using SPSS to perform MANOVA
Selecting the variables and the Factors
The output
Multivariate Tests (c)
Effect       Statistic             Value   F             Hypothesis df   Error df   Sig.
Intercept    Pillai's Trace         .984   586.890 (a)       4.000        39.000    .000
             Wilks' Lambda          .016   586.890 (a)       4.000        39.000    .000
             Hotelling's Trace    60.194   586.890 (a)       4.000        39.000    .000
             Roy's Largest Root   60.194   586.890 (a)       4.000        39.000    .000
High_School  Pillai's Trace         .883     7.913           8.000        80.000    .000
             Wilks' Lambda          .161    14.571 (a)       8.000        78.000    .000
             Hotelling's Trace     4.947    23.501           8.000        76.000    .000
             Roy's Largest Root    4.891    48.913 (b)       4.000        40.000    .000
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+High_School
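The "exact" F reported for Wilks' lambda can be illustrated with a small sketch. The conversion used here is not given on the slides: it is the standard exact result for a hypothesis with 2 degrees of freedom (k = 3 groups), F = ((1 - sqrt(Λ)) / sqrt(Λ)) (ν_E - p + 1) / p on (2p, 2(ν_E - p + 1)) degrees of freedom, and the function name is hypothetical. Because the input Λ = .161 is already rounded, the result agrees with the SPSS value only approximately.

```python
import math

def wilks_exact_f(wilks, p, error_df):
    """Exact F transform of Wilks' lambda, valid when the hypothesis df is 2."""
    root = math.sqrt(wilks)
    f = (1.0 - root) / root * (error_df - p + 1) / p
    return f, 2 * p, 2 * (error_df - p + 1)

# High_School effect: p = 4 courses, error df = N - k = 45 - 3 = 42.
f_stat, df1, df2 = wilks_exact_f(0.161, p=4, error_df=42)
print(round(f_stat, 2), df1, df2)
```

The degrees of freedom returned, 8 and 78, are the ones shown in the Wilks' Lambda row for High_School.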
Univariate Tests
Tests of Between-Subjects Effects
Source           Dependent Variable   Type III Sum of Squares   df   Mean Square   F          Sig.
Corrected Model  Math                     4217.733 (a)           2     2108.867      13.815   .000
                 Biology                   552.578 (b)           2      276.289       1.998   .148
                 English                  9005.733 (c)           2     4502.867      36.641   .000
                 Sociology                1627.911 (d)           2      813.956       5.555   .007
Intercept        Math                   185088.800               1   185088.800    1212.473   .000
                 Biology                184960.556               1   184960.556    1337.555   .000
                 English                185473.800               1   185473.800    1509.241   .000
                 Sociology              183808.356               1   183808.356    1254.515   .000
High_School      Math                     4217.733               2     2108.867      13.815   .000
                 Biology                   552.578               2      276.289       1.998   .148
                 English                  9005.733               2     4502.867      36.641   .000
                 Sociology                1627.911               2      813.956       5.555   .007
Error            Math                     6411.467              42      152.654
                 Biology                  5807.867              42      138.283
                 English                  5161.467              42      122.892
                 Sociology                6153.733              42      146.517
Total            Math                   195718.000              45
                 Biology                191321.000              45
                 English                199641.000              45
                 Sociology              191590.000              45
Corrected Total  Math                    10629.200              44
                 Biology                  6360.444              44
                 English                 14167.200              44
                 Sociology                7781.644              44
a. R Squared = .397 (Adjusted R Squared = .368)
b. R Squared = .087 (Adjusted R Squared = .043)
c. R Squared = .636 (Adjusted R Squared = .618)
d. R Squared = .209 (Adjusted R Squared = .172)
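The univariate F ratios and R Squared values above follow directly from the diagonal entries of H and E. A brief sketch, using the rounded sums of squares from the table (names are illustrative):

```python
k, N = 3, 45                      # regions and total sample size
courses = ["Math", "Biology", "English", "Sociology"]
ss_between = [4217.733, 552.578, 9005.733, 1627.911]   # diagonal of H
ss_within = [6411.467, 5807.867, 5161.467, 6153.733]   # diagonal of E

results = {}
for course, ssb, ssw in zip(courses, ss_between, ss_within):
    ms_between = ssb / (k - 1)            # df = 2
    ms_within = ssw / (N - k)             # df = 42
    f_ratio = ms_between / ms_within
    r_squared = ssb / (ssb + ssw)         # between SS over corrected total SS
    results[course] = (f_ratio, r_squared)
    print(f"{course:10s} F = {f_ratio:7.3f}  R^2 = {r_squared:.3f}")
```

Each course is tested separately here, so these ratios ignore the correlations between courses that the multivariate statistics take into account.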
Profile Analysis
Repeated Measures Designs
In a Repeated Measures Design
We have experimental units that
• may be grouped according to one or several factors (the grouping factors)
Then on each experimental unit we have
• not a single measurement but a group of measurements (the repeated measures)
• The repeated measures may be taken at combinations of levels of one or several factors (the repeated measures factors)
Example In the following study the experimenter was interested in how the level of a certain enzyme changed in cardiac patients after open heart surgery.
The enzyme was measured
• immediately after surgery (Day 0),
• one day (Day 1),
• two days (Day 2) and
• one week (Day 7) after surgery
for n = 15 cardiac surgical patients.
The data are given in the table below.
Subject  Day 0  Day 1  Day 2  Day 7  |  Subject  Day 0  Day 1  Day 2  Day 7
   1      108     63     45     42   |     9      106     65     49     49
   2      112     75     56     52   |    10      110     70     46     47
   3      114     75     51     46   |    11      120     85     60     62
   4      129     87     69     69   |    12      118     78     51     56
   5      115     71     52     54   |    13      110     65     46     47
   6      122     80     68     68   |    14      132     92     73     63
   7      105     71     52     54   |    15      127     90     73     68
   8      117     77     54     61   |
Table: The enzyme levels immediately after surgery (Day 0), one day (Day 1), two days (Day 2) and one week (Day 7) after surgery
• The subjects are not grouped (single group).
• There is one repeated measures factor, Time, with levels
  – Day 0,
  – Day 1,
  – Day 2,
  – Day 7
• This design is the same as a randomized block design with
  – Blocks = subjects
The Anova Table for Enzyme Experiment
Source        SS     df         MS        F   p-value
Subject   4221.100   14    301.507    32.45    0.0000
Day      36282.267    3  12094.089  1301.66    0.0000
ERROR      390.233   42      9.291
The Subject Source of variability is modelling the variability between subjects
The ERROR Source of variability is modelling the variability within subjects
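The ANOVA table for the enzyme experiment can be reproduced with the usual randomized block computations, treating subjects as blocks. A sketch in plain Python; the data are the 15 subjects from the table above (rows in subject order 1 to 15, columns Day 0, 1, 2, 7) and the variable names are illustrative.

```python
enzyme = [
    [108, 63, 45, 42], [112, 75, 56, 52], [114, 75, 51, 46],
    [129, 87, 69, 69], [115, 71, 52, 54], [122, 80, 68, 68],
    [105, 71, 52, 54], [117, 77, 54, 61], [106, 65, 49, 49],
    [110, 70, 46, 47], [120, 85, 60, 62], [118, 78, 51, 56],
    [110, 65, 46, 47], [132, 92, 73, 63], [127, 90, 73, 68],
]
b, t = len(enzyme), len(enzyme[0])          # 15 subjects (blocks), 4 days

grand = sum(map(sum, enzyme))
cf = grand ** 2 / (b * t)                   # correction factor G^2 / N

ss_total = sum(x * x for row in enzyme for x in row) - cf
ss_subject = sum(sum(row) ** 2 for row in enzyme) / t - cf       # between subjects
ss_day = sum(sum(col) ** 2 for col in zip(*enzyme)) / b - cf     # between days
ss_error = ss_total - ss_subject - ss_day                        # within subjects

df_subject, df_day = b - 1, t - 1
df_error = df_subject * df_day
ms_error = ss_error / df_error
f_subject = (ss_subject / df_subject) / ms_error
f_day = (ss_day / df_day) / ms_error

print(f"Subject {ss_subject:10.3f} {df_subject:3d} F = {f_subject:8.2f}")
print(f"Day     {ss_day:10.3f} {df_day:3d} F = {f_day:8.2f}")
print(f"ERROR   {ss_error:10.3f} {df_error:3d}")
```

Removing the Subject sum of squares before forming the error term is what makes the Day comparison a within-subject comparison.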
Example :
(Repeated Measures Design - Grouping Factor)
• In the following study, similar to example 3, the experimenter was interested in how the level of a certain enzyme changed in cardiac patients after open heart surgery.
• In addition the experimenter was interested in how two drug treatments (A and B) would also affect the level of the enzyme.
• The 24 patients were randomly divided into three groups of n= 8 patients.
• The first group of patients were left untreated as a control group while
• the second and third group were given drug treatments A and B respectively.
• Again the enzyme was measured immediately after surgery (Day 0), one day (Day 1), two days (Day 2) and one week (Day 7) after surgery for each of the cardiac surgical patients in the study.
Table: The enzyme levels immediately after surgery (Day 0), one day (Day 1), two days (Day 2) and one week (Day 7) after surgery for three treatment groups (control, Drug A, Drug B)
Group |        Control        |        Drug A         |        Drug B
Day   |   0     1     2     7 |   0     1     2     7 |   0     1     2     7
      | 122    87    68    58 |  93    56    36    37 |  86    46    30    31
      | 112    75    55    48 |  78    51    33    34 | 100    67    50    50
      | 129    80    66    64 | 109    73    58    49 | 122    97    80    72
      | 115    71    54    52 | 104    75    57    60 | 101    58    45    43
      | 126    89    70    71 | 108    71    57    65 | 112    78    67    66
      | 118    81    62    60 | 116    76    58    58 | 106    74    54    54
      | 115    73    56    49 | 108    64    54    47 |  90    59    43    38
      | 112    67    53    44 | 110    80    63    62 | 110    76    64    58
• The subjects are grouped by treatment
  – control,
  – Drug A,
  – Drug B
• There is one repeated measures factor, Time, with levels
  – Day 0,
  – Day 1,
  – Day 2,
  – Day 7
The Anova Table
There are two sources of Error in a repeated measures design:
The between subject error – Error1 and
the within subject error – Error2
Source             SS     df         MS        F   p-value
Drug           1745.396    2    872.698     1.78    0.1929
Error1        10287.844   21    489.897
Time          47067.031    3  15689.010  1479.58    0.0000
Time x Drug     357.688    6     59.615     5.62    0.0001
Error2          668.031   63     10.604
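A sketch of where the two error terms come from, using the data in the table above (one row per patient, columns Day 0, 1, 2, 7). Drug is tested against Error1 (patients within drug groups); Time and Time x Drug are tested against Error2. Variable names are illustrative and equal group sizes are assumed.

```python
groups = {
    "Control": [[122, 87, 68, 58], [112, 75, 55, 48], [129, 80, 66, 64],
                [115, 71, 54, 52], [126, 89, 70, 71], [118, 81, 62, 60],
                [115, 73, 56, 49], [112, 67, 53, 44]],
    "Drug A": [[93, 56, 36, 37], [78, 51, 33, 34], [109, 73, 58, 49],
               [104, 75, 57, 60], [108, 71, 57, 65], [116, 76, 58, 58],
               [108, 64, 54, 47], [110, 80, 63, 62]],
    "Drug B": [[86, 46, 30, 31], [100, 67, 50, 50], [122, 97, 80, 72],
               [101, 58, 45, 43], [112, 78, 67, 66], [106, 74, 54, 54],
               [90, 59, 43, 38], [110, 76, 64, 58]],
}
k, n, t = len(groups), 8, 4                 # groups, patients per group, days
rows = [row for g in groups.values() for row in g]

grand = sum(map(sum, rows))
cf = grand ** 2 / (k * n * t)               # correction factor

ss_total = sum(x * x for row in rows for x in row) - cf
ss_subjects = sum(sum(row) ** 2 for row in rows) / t - cf
ss_drug = sum(sum(map(sum, g)) ** 2 for g in groups.values()) / (n * t) - cf
ss_error1 = ss_subjects - ss_drug                        # between-subject error
ss_time = sum(sum(col) ** 2 for col in zip(*rows)) / (k * n) - cf
ss_cells = sum(sum(col) ** 2 for g in groups.values() for col in zip(*g)) / n - cf
ss_inter = ss_cells - ss_drug - ss_time                  # Time x Drug
ss_error2 = ss_total - ss_subjects - ss_time - ss_inter  # within-subject error

ms_error1 = ss_error1 / (k * (n - 1))                    # 21 df
ms_error2 = ss_error2 / (k * (n - 1) * (t - 1))          # 63 df
f_drug = (ss_drug / (k - 1)) / ms_error1
f_time = (ss_time / (t - 1)) / ms_error2
f_inter = (ss_inter / ((k - 1) * (t - 1))) / ms_error2

print(f"Drug {f_drug:.2f}  Time {f_time:.2f}  Time x Drug {f_inter:.2f}")
```

Testing Drug against Error2 instead would greatly overstate its significance, since Error1 is about 46 times larger than Error2 on a mean square basis.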
Tables of means
Drug Day 0 Day 1 Day 2 Day 7 Overall
Control 118.63 77.88 60.50 55.75 78.19
A 103.25 68.25 52.00 51.50 68.75
B 103.38 69.38 54.13 51.50 69.59
Overall 108.42 71.83 55.54 52.92 72.18
Time Profiles of Enzyme Levels
[Figure: enzyme level (vertical axis, 40 to 120) plotted against day (horizontal axis, 0 to 7), with one profile each for Control, Drug A and Drug B]
Example : Repeated Measures Design - Two Grouping Factors
• In the following example, the researcher was interested in how the levels of Anxiety (high and low) and Tension (none and high) affected error rates in performing a specified task.
• In addition the researcher was interested in how the error rates also changed over time.
• Four groups of three subjects, diagnosed in the four Anxiety-Tension categories, were asked to perform the task at four different times.
The number of errors committed at each instance is tabulated below (one row per time).
Anxiety  |            Low            |           High
Tension  |    None     |    High     |    None     |    High
Subject  |  1   2   3  |  1   2   3  |  1   2   3  |  1   2   3
Time 1   | 18  19  14  | 16  12  18  | 16  18  16  | 19  16  16
Time 2   | 14  12  10  | 12   8  10  | 10   8  12  | 16  14  12
Time 3   | 12   8   6  | 10   6   5  |  8   4   6  | 10  10   8
Time 4   |  6   4   2  |  4   2   1  |  4   1   2  |  8   9   8
The Anova Table
Source        SS      df         MS        F   p-value
Anxiety    10.08333    1   10.08333     0.98    0.3517
Tension     8.33333    1    8.33333     0.81    0.3949
AT         80.08333    1   80.08333     7.77    0.0237
Error1     82.5        8   10.3125
B         991.5        3  330.5       152.05    0.0000
BA          8.41667    3    2.80556     1.29    0.3003
BT         12.16667    3    4.05556     1.87    0.1624
BAT        12.75       3    4.25        1.96    0.1477
Error2     52.16667   24    2.17361
Here A = Anxiety, T = Tension and B = the repeated measures (time) factor. The Error1 sum of squares is 82.5 (= 8 × 10.3125).
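The between-subject part of this table (Anxiety, Tension, AT and Error1) can be checked from the subject totals. A sketch under the assumption that each row of the data table is one time point; each inner list below holds one subject's error counts across the four times, and the names are illustrative. The Error1 sum of squares it produces equals the tabulated mean square 10.3125 times its 8 df.

```python
errors = {   # (Anxiety, Tension) -> three subjects, four times each
    ("Low", "None"): [[18, 14, 12, 6], [19, 12, 8, 4], [14, 10, 6, 2]],
    ("Low", "High"): [[16, 12, 10, 4], [12, 8, 6, 2], [18, 10, 5, 1]],
    ("High", "None"): [[16, 10, 8, 4], [18, 8, 4, 1], [16, 12, 6, 2]],
    ("High", "High"): [[19, 16, 10, 8], [16, 14, 10, 9], [16, 12, 8, 8]],
}
n, t = 3, 4                                  # subjects per cell, times
cell_totals = {key: sum(map(sum, subj)) for key, subj in errors.items()}
grand = sum(cell_totals.values())
cf = grand ** 2 / (len(errors) * n * t)      # correction factor

def level_ss(position):
    """SS for one two-level grouping factor (0 = Anxiety, 1 = Tension)."""
    sums = {}
    for key, total in cell_totals.items():
        sums[key[position]] = sums.get(key[position], 0) + total
    return sum(s ** 2 for s in sums.values()) / (2 * n * t) - cf

ss_subjects = sum(sum(s) ** 2 for subj in errors.values() for s in subj) / t - cf
ss_cells = sum(c ** 2 for c in cell_totals.values()) / (n * t) - cf
ss_anxiety = level_ss(0)
ss_tension = level_ss(1)
ss_at = ss_cells - ss_anxiety - ss_tension   # Anxiety x Tension interaction
ss_error1 = ss_subjects - ss_cells           # subjects within cells, 8 df

ms_error1 = ss_error1 / (len(errors) * (n - 1))
f_at = ss_at / ms_error1
print(ss_anxiety, ss_tension, ss_at, ss_error1, round(f_at, 2))
```

The within-subject lines (B, BA, BT, BAT and Error2) follow the same pattern, with Error2 as the denominator of each F ratio.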