Statistical Power Analysis for Multi-level Models
Miao (“Michelle”) Yang
Department of Psychology
Quantitative Study Group
Mar 19 2015
Miao (“Michelle”) Yang (ND) Power analysis for MLM Mar 19 2015 1 / 25
Outline of the Talk
1 Introduction to multilevel designs
2 Statistical power analysis for multilevel models
3 Software
Multilevel Designs
Multilevel Modeling: When?
In educational studies, the total sample is often a combination of students sampled from different classrooms or schools. When data exhibit such a nested structure, multilevel modeling can be used.

Student (ID)   School (Name)   Verbal Score
1              Potato          88
2              Potato          85
3              Potato          92
4              Tomato          76
5              Tomato          78
6              Tomato          80
...            ...             ...
60             Sheep           77

Table: An example of nested data
Multilevel Modeling: Why?
When data are nested, individuals within the same cluster (e.g., school) are naturally correlated, which violates the independence assumption of traditional models such as multiple regression and ANOVA. As a consequence, traditional models produce biased estimates of parameter standard errors and thus lead to significance tests with inflated type I error rates (e.g., Hox, 1998).
Advantages of using multilevel modeling:
- Handles nested data
- Lets us examine both individual and cluster differences
- More powerful
Multilevel Modeling (CRT vs MRT)
CRT (cluster randomized trial):
- The entire site (school) is randomly assigned to treatment or control.
- Avoids a possible “spill-over” effect within schools.
MRT (multisite randomized trial):
- Students within schools are randomly assigned.
- More convenient and economical because we have a larger pool.
- Easy to manage because each cluster follows the same study design.
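Power Analysis for CRT (1 treatment & 1 control)
\[ Y_{ij} = \beta_{0j} + e_{ij}, \quad e_{ij} \sim N(0, \sigma_W^2) \]
\[ \beta_{0j} = \gamma_{00} + \gamma_{01} X_j + u_{0j}, \quad u_{0j} \sim N(0, \sigma_B^2) \]
$i = 1, 2, \ldots, n$ (individual); $j = 1, 2, \ldots, J$ (cluster)
$X_j$: treatment indicator of cluster $j$,
\[ X_j = \begin{cases} 0.5 & \text{treatment} \\ -0.5 & \text{control} \end{cases} \]
$\gamma_{00}$: grand mean
$\gamma_{01}$: treatment main effect (i.e., $\mu_D = \mu_T - \mu_C$)
$\beta_{0j}$: cluster mean
$\sigma_W^2$: within-cluster variance; $\sigma_B^2$: between-cluster variance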
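Power Analysis for CRT (1 treatment & 1 control)
Test treatment main effect $H_0: \gamma_{01} = 0$:
\[ T = \frac{\hat{\gamma}_{01}}{\sqrt{\mathrm{Var}(\hat{\gamma}_{01})}} = \frac{\bar{Y}_{\cdot\cdot T} - \bar{Y}_{\cdot\cdot C}}{\sqrt{4(\sigma_B^2 + \sigma_W^2/n)/J}} \]
Under $H_0$: $T \sim t_{J-2}$. Under $H_1$: $T \sim t_{J-2,\lambda}$, a noncentral $t$ distribution with noncentrality parameter $\lambda$.
\[ \text{Power} = P(\text{reject } H_0 \mid H_1 \text{ true}) = \begin{cases} 1 - P[T_{J-2,\lambda} < t_0] + P[T_{J-2,\lambda} \le -t_0] & \text{two-sided} \\ 1 - P[T_{J-2,\lambda} < t_0] & \text{one-sided} \end{cases} \]
Here $t_0$ is the critical value of the central $t_{J-2}$ distribution at the chosen significance level.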
Power Analysis for CRT (1 treatment & 1 control)
\[ \lambda = \frac{\mu_D}{\sqrt{4(\sigma_B^2 + \sigma_W^2/n)/J}} \]
- As $\lambda$ increases, power increases.
- $\lambda$ is a function of $\mu_D$, $n$, $J$, $\sigma_B^2$ and $\sigma_W^2$.
- To make $\lambda$ easier to interpret, we can reparameterize it in terms of effect size and intra-class correlation (ICC).
ICC in CRT
The intra-class correlation (ICC) quantifies the degree to which two randomly drawn observations within a cluster are correlated. In CRT, the ICC is defined as
\[ \rho = \mathrm{corr}(Y_{ij}, Y_{i'j}) = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2} = \frac{\sigma_B^2}{\sigma_T^2}, \]
where $\sigma_T^2 = \sigma_B^2 + \sigma_W^2$ is the total variance.
- $\rho$ is the proportion of total variance that is accounted for by clustering.
- $\rho = 0$: no between-cluster variation.
- As $\rho$ increases, more variation is due to between-cluster variability.
- For school-based data sets, $\rho$ usually ranges between 0.10 and 0.30 (Bloom, Bos & Lee, 1999; Hedges & Hedberg, 2007).
Effect Size in CRT (1 treatment & 1 control)
The effect sizes used in educational and psychological research are typically standardized mean differences. Possible definitions for the effect size in CRT (Hedges, 2007):
- $f = \mu_D/\sigma_W$. This effect size might be of interest in a meta-analysis where the studies being compared are single-site studies.
- $f = \mu_D/\sigma_B$. This effect size might be of interest in a meta-analysis where the other studies are multisite studies that have been analyzed by using cluster means as the unit of analysis.
- $f = \mu_D/\sqrt{\sigma_B^2 + \sigma_W^2}$. This effect size might be of interest in a meta-analysis where the other studies are multisite studies or studies that sample from a broader population but do not include clusters. This is the definition used in the power formulas that follow.
Power Analysis for CRT (1 treatment & 1 control)
Redefine $\lambda$ in standardized notation:
\[ \lambda = \frac{\mu_D}{\sqrt{4(\sigma_B^2 + \sigma_W^2/n)/J}} = \frac{\sqrt{J}\, f}{\sqrt{4\left(\rho + \frac{1-\rho}{n}\right)}} \]
Now, $\lambda$ is a function of $n$, $J$, $f$ and $\rho$.
- As $J$ or $n$ increases, $\lambda$ increases and thus power increases.
- As $f$ increases, $\lambda$ increases and thus power increases.
- As $\rho$ increases, $\lambda$ decreases and thus power decreases.
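A quick numerical check of the reparameterization may help. The sketch below computes λ once from the raw quantities and once from f and ρ, taking f = µD/√(σ²B + σ²W). The function names and the input values are hypothetical, chosen only for illustration.

```python
import math

def lambda_raw(mu_d, sigma_b2, sigma_w2, n, J):
    """Noncentrality parameter from the unstandardized quantities."""
    return mu_d / math.sqrt(4 * (sigma_b2 + sigma_w2 / n) / J)

def lambda_std(f, icc, n, J):
    """Noncentrality parameter from effect size f and intra-class correlation."""
    return math.sqrt(J) * f / math.sqrt(4 * (icc + (1 - icc) / n))

# Hypothetical inputs, not taken from the talk.
mu_d, sigma_b2, sigma_w2, n, J = 2.0, 4.0, 12.0, 25, 30

sigma_t2 = sigma_b2 + sigma_w2        # total variance
f = mu_d / math.sqrt(sigma_t2)        # standardized mean difference
icc = sigma_b2 / sigma_t2             # intra-class correlation

lam_a = lambda_raw(mu_d, sigma_b2, sigma_w2, n, J)
lam_b = lambda_std(f, icc, n, J)
print(lam_a, lam_b)                   # the two forms agree up to rounding
```

Both forms divide the same numerator and denominator by the total standard deviation, which is why they coincide.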
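Power Analysis for CRT (2 treatments & 1 control)
\[ Y_{ij} = \beta_{0j} + e_{ij}, \quad e_{ij} \sim N(0, \sigma_W^2) \]
\[ \beta_{0j} = \gamma_{00} + \gamma_{01} X_{1j} + \gamma_{02} X_{2j} + u_{0j}, \quad u_{0j} \sim N(0, \sigma_B^2) \]
\[ X_{1j} = \begin{cases} 1/3 & \text{treatment 1} \\ 1/3 & \text{treatment 2} \\ -2/3 & \text{control} \end{cases}; \qquad X_{2j} = \begin{cases} 1/2 & \text{treatment 1} \\ -1/2 & \text{treatment 2} \\ 0 & \text{control} \end{cases} \]
$\beta_{0j}$: cluster mean; $\gamma_{00}$: grand mean
$\gamma_{01}$: mean difference between the average of the two treatments and the control
$\gamma_{02}$: mean difference between the two treatments
Combining the two levels gives $Y_{ij} = \gamma_{00} + \gamma_{01} X_{1j} + \gamma_{02} X_{2j} + u_{0j} + e_{ij}$, so the arm means are
\[ \begin{aligned} \mu_{T1} &= \gamma_{00} + \tfrac{1}{3}\gamma_{01} + \tfrac{1}{2}\gamma_{02} \\ \mu_{T2} &= \gamma_{00} + \tfrac{1}{3}\gamma_{01} - \tfrac{1}{2}\gamma_{02} \\ \mu_{C} &= \gamma_{00} - \tfrac{2}{3}\gamma_{01} \end{aligned} \quad \Longrightarrow \quad \begin{cases} 0.5(\mu_{T1} + \mu_{T2}) - \mu_C = \gamma_{01} \\ \mu_{T1} - \mu_{T2} = \gamma_{02} \end{cases} \]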
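The contrast coding can be checked with exact arithmetic. This is a minimal sketch, assuming the control arm is coded X1 = −2/3, which is the value that makes γ01 equal the difference between the average treatment mean and the control mean. The coefficient values are hypothetical.

```python
from fractions import Fraction as F

# (X1, X2) codes per arm. The control code of -2/3 on X1 is an assumption:
# it is the value for which gamma01 equals 0.5*(muT1 + muT2) - muC.
CODES = {
    "treatment1": (F(1, 3), F(1, 2)),
    "treatment2": (F(1, 3), F(-1, 2)),
    "control": (F(-2, 3), F(0)),
}

def arm_means(g00, g01, g02):
    """Population mean of each arm implied by the level-2 equation."""
    return {arm: g00 + g01 * x1 + g02 * x2 for arm, (x1, x2) in CODES.items()}

# Hypothetical coefficients: grand mean 10, main effect 3, treatment difference 2.
means = arm_means(F(10), F(3), F(2))

main_effect = (means["treatment1"] + means["treatment2"]) / 2 - means["control"]
treatment_diff = means["treatment1"] - means["treatment2"]
grand_mean = sum(means.values()) / 3

print(main_effect, treatment_diff, grand_mean)  # 3 2 10
```

Because each code column sums to zero across the three arms, γ00 is recovered as the unweighted average of the arm means.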
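Power Analysis for CRT (2 treatments & 1 control)
We might be interested in three different types of test:
1 Test treatment main effect: $H_0: \gamma_{01} = 0 \Leftrightarrow \mu_D = 0.5(\mu_{T1} + \mu_{T2}) - \mu_C = 0$
Under $H_0$: $T_1 \sim t_{J-3}$. Under $H_1$: $T_1 \sim t_{J-3,\lambda_1}$, where
\[ \lambda_1 = \frac{\sqrt{J}\, f_1}{\sqrt{4.5\left(\rho + \frac{1-\rho}{n}\right)}} \quad \text{and} \quad f_1 = \frac{0.5(\mu_{T1} + \mu_{T2}) - \mu_C}{\sqrt{\sigma_B^2 + \sigma_W^2}}. \]
2 Comparing the two treatments: $H_0: \gamma_{02} = 0 \Leftrightarrow \mu_D = \mu_{T1} - \mu_{T2} = 0$
Under $H_0$: $T_2 \sim t_{J-3}$. Under $H_1$: $T_2 \sim t_{J-3,\lambda_2}$, where
\[ \lambda_2 = \frac{\sqrt{J}\, f_2}{\sqrt{6\left(\rho + \frac{1-\rho}{n}\right)}} \quad \text{and} \quad f_2 = \frac{\mu_{T1} - \mu_{T2}}{\sqrt{\sigma_B^2 + \sigma_W^2}}. \]
3 Omnibus test: $H_0: \gamma_{01} = \gamma_{02} = 0 \Leftrightarrow \mu_{T1} = \mu_{T2} = \mu_C$
Under $H_0$: $F \sim F_{2,J-3}$. Under $H_1$: $F \sim F_{2,J-3,\lambda}$, where $\lambda = \lambda_1^2 + \lambda_2^2$.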
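The three noncentrality parameters can be computed together. The sketch below follows the formulas on this slide; the function name and the example inputs are hypothetical. The comment on where 4.5 and 6 come from is my reading, assuming the J clusters are split equally across the three arms.

```python
import math

def crt3_noncentrality(f1, f2, icc, n, J):
    """Return (lambda1, lambda2, omnibus lambda) for a three-arm CRT."""
    design = icc + (1 - icc) / n
    # With J/3 clusters per arm (assumed), each arm mean has variance
    # proportional to 3/J. Contrast weights (1/2, 1/2, -1) then give
    # 1.5 * 3 = 4.5, and weights (1, -1, 0) give 2 * 3 = 6.
    lam1 = math.sqrt(J) * f1 / math.sqrt(4.5 * design)
    lam2 = math.sqrt(J) * f2 / math.sqrt(6 * design)
    return lam1, lam2, lam1 ** 2 + lam2 ** 2

# Hypothetical inputs for illustration only.
lam1, lam2, lam_omnibus = crt3_noncentrality(f1=0.4, f2=0.2, icc=0.1, n=20, J=30)
print(lam1, lam2, lam_omnibus)
```

For the same effect size, λ1 is larger than λ2 because the main-effect contrast pools both treatment arms and so has the smaller variance.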
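Power Analysis for MRT (1 treatment & 1 control)
Let’s move on to multisite randomized trials with 1 treatment and 1 control.
\[ Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}, \quad e_{ij} \sim N(0, \sigma^2) \]
\[ \beta_{0j} = \gamma_{00} + u_{0j}, \quad \beta_{1j} = \gamma_{10} + u_{1j} \]
\[ \begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N\left( 0, \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{10} & \tau_{11} \end{bmatrix} \right) \]
$i = 1, 2, \ldots, n$ (individual); $j = 1, 2, \ldots, J$ (site)
$X_{ij}$: indicator of treatment assignment, with
\[ X_{ij} = \begin{cases} 0.5 & \text{treatment} \\ -0.5 & \text{control} \end{cases} \]
$\beta_{0j}$: mean at the $j$th site
$\beta_{1j}$: mean difference between treatment and control at the $j$th site
$\gamma_{00}$: grand mean
$\gamma_{10}$: treatment main effect
$\sigma^2$: between-person variation
$\tau_{00}$: site variability
$\tau_{11}$: variance of site-specific treatment effects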
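Power Analysis for MRT (1 treatment & 1 control)
Test treatment main effect $H_0: \gamma_{10} = 0 \Leftrightarrow \mu_D = \mu_T - \mu_C = 0$
Under $H_0$: $T \sim t_{J-1}$. Under $H_1$: $T \sim t_{J-1,\lambda}$, where
\[ \lambda = \frac{\sqrt{J}\, \mu_D}{\sqrt{4\sigma^2/n + \tau_{11}}}. \]
\[ \text{Power} = \begin{cases} 1 - P[T_{J-1,\lambda} < t_0] + P[T_{J-1,\lambda} \le -t_0] & \text{two-sided} \\ 1 - P[T_{J-1,\lambda} < t_0] & \text{one-sided} \end{cases} \]
Following Raudenbush & Liu (2000), we define the effect size as $f = \mu_D/\sqrt{\sigma^2}$. Thus,
\[ \lambda = \frac{\sqrt{J}\, f}{\sqrt{4/n + \tau_{11}/\sigma^2}}. \]
Power increases as
- the effect size ($f$) increases;
- the number of sites ($J$) or the number of individuals per site ($n$) increases;
- the variance of the treatment effect ($\tau_{11}$) decreases;
- between-person variation ($\sigma^2$) increases with $f$ and $\tau_{11}$ held fixed, because the term $\tau_{11}/\sigma^2$ shrinks.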
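The two forms of λ for the MRT can be compared numerically. This is a small sketch with hypothetical function names and inputs. It also illustrates that the effect of σ² depends on what is held fixed: with f fixed, a larger σ² raises λ, while with the raw difference µD fixed, it lowers λ.

```python
import math

def mrt_lambda_raw(mu_d, sigma2, tau11, n, J):
    """Noncentrality parameter from the raw mean difference."""
    return math.sqrt(J) * mu_d / math.sqrt(4 * sigma2 / n + tau11)

def mrt_lambda_std(f, sigma2, tau11, n, J):
    """Noncentrality parameter from the effect size f = mu_d / sqrt(sigma2)."""
    return math.sqrt(J) * f / math.sqrt(4 / n + tau11 / sigma2)

# Hypothetical inputs for illustration only.
mu_d, sigma2, tau11, n, J = 0.6, 2.25, 0.4, 30, 20
f = mu_d / math.sqrt(sigma2)

print(mrt_lambda_raw(mu_d, sigma2, tau11, n, J))
print(mrt_lambda_std(f, sigma2, tau11, n, J))      # same value as above

# Holding f fixed, a larger sigma2 increases lambda ...
print(mrt_lambda_std(f, 4.0, tau11, n, J) > mrt_lambda_std(f, sigma2, tau11, n, J))
# ... but holding mu_d fixed, a larger sigma2 decreases lambda.
print(mrt_lambda_raw(mu_d, 4.0, tau11, n, J) < mrt_lambda_raw(mu_d, sigma2, tau11, n, J))
```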
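Power Analysis for MRT (2 treatments & 1 control)
MRT (2 treatments & 1 control):
\[ Y_{ij} = \beta_{0j} + \beta_{1j} X_{1ij} + \beta_{2j} X_{2ij} + e_{ij} \]
\[ \beta_{0j} = \gamma_{00} + u_{0j}, \quad \beta_{1j} = \gamma_{10} + u_{1j}, \quad \beta_{2j} = \gamma_{20} + u_{2j} \]
\[ X_{1ij} = \begin{cases} 1/3 & \text{treatment 1} \\ 1/3 & \text{treatment 2} \\ -2/3 & \text{control} \end{cases}; \qquad X_{2ij} = \begin{cases} 1/2 & \text{treatment 1} \\ -1/2 & \text{treatment 2} \\ 0 & \text{control} \end{cases} \]
(1) Test treatment main effect: $H_0: \gamma_{10} = 0 \Leftrightarrow \tfrac{1}{2}(\mu_{T1} + \mu_{T2}) = \mu_C$
(2) Comparing the two treatments: $H_0: \gamma_{20} = 0 \Leftrightarrow \mu_{T1} = \mu_{T2}$
(3) Omnibus test: $H_0: \gamma_{10} = \gamma_{20} = 0 \Leftrightarrow \mu_{T1} = \mu_{T2} = \mu_C$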
Revisit
Software: WebPower
- CRT with 1 treatment and 1 control: http://webpower.psychstat.org/models/mlm01/
- CRT with 2 treatments and 1 control: http://webpower.psychstat.org/models/mlm02/
- MRT with 1 treatment and 1 control: http://webpower.psychstat.org/models/mlm03/
- MRT with 2 treatments and 1 control: http://webpower.psychstat.org/models/mlm04/
Application of WebPower
Example 1. A researcher plans to collect data from 20 clinics to examine the effect of certain behavioral therapies on recovery from anorexia. At each clinic, 30 anorexic girls will be randomly assigned to therapy 1, therapy 2, or the control group. Previous research suggests that therapy 1 might lead to an increase of 0.5 in BMI and therapy 2 might lead to an increase of 0.8 in BMI. Further, the between-person variation is 2.25 and the variance in treatment effects across sites is 0.4. What is the power for testing the treatment main effect?
- Sample size = 30
- Effect size = $\frac{(0.5 + 0.8)/2}{\sqrt{2.25}} = 0.43$
- Number of clusters = 20
- Variance in treatment effects across sites = 0.4
- Between-person variation = 2.25
http://webpower.psychstat.org/models/mlm04/
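The effect size entered for Example 1 is simple arithmetic, sketched below. The variable names are hypothetical, and the control group's expected change in BMI is taken to be 0, which the example implies but does not state.

```python
import math

therapy1_gain = 0.5   # expected BMI increase under therapy 1
therapy2_gain = 0.8   # expected BMI increase under therapy 2
control_gain = 0.0    # assumption: no change in the control group
sigma2 = 2.25         # between-person variation

# Treatment main effect: average of the two therapies minus control.
mu_d = (therapy1_gain + therapy2_gain) / 2 - control_gain
f = mu_d / math.sqrt(sigma2)

print(round(f, 2))    # 0.43
```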
Application of WebPower
Example 2. A group of educational researchers developed a new teaching method to help students improve their memory abilities. They decide to randomly assign 20 schools to either the new method or the standard method and test students on memory ability from these 20 schools. Suppose the new method might have a medium effect size and the intraclass correlation is 0.10. How many students in each school will be needed to obtain a power of 0.8?
- Effect size = 0.5
- Number of clusters = 20
- ICC = 0.10
- Power = 0.8
http://webpower.psychstat.org/models/mlm01/
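For readers who want to see the calculation behind Example 2, here is a rough standard-library sketch of the two-sided CRT power formula, searched over n. It is not WebPower's code. All function names are hypothetical, and the noncentral t probability is obtained by plain numerical integration; a statistics library's noncentral t CDF would normally replace `nct_cdf`.

```python
import math
from functools import lru_cache
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def chi_pdf(s, df):
    """Density of S = sqrt(V) with V ~ chi-square(df), evaluated in log space."""
    if s <= 0.0:
        return math.sqrt(2 / math.pi) if df == 1 else 0.0
    log_pdf = ((1 - df / 2) * math.log(2) + (df - 1) * math.log(s)
               - s * s / 2 - math.lgamma(df / 2))
    return math.exp(log_pdf)

def nct_cdf(t, df, ncp, steps=2000):
    """P(T <= t) for a noncentral t, using T = (Z + ncp) / (S / sqrt(df)).

    Simpson's rule over E[Phi(t * S / sqrt(df) - ncp)].
    """
    upper = math.sqrt(df) + 10.0   # the chi density is negligible beyond this
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        s = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * PHI(t * s / math.sqrt(df) - ncp) * chi_pdf(s, df)
    return total * h / 3

@lru_cache(maxsize=None)
def t_critical(alpha, df):
    """Two-sided critical value of the central t, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if nct_cdf(mid, df, 0.0) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def crt_power(f, n, J, icc, alpha=0.05):
    """Two-sided power for a CRT with 1 treatment and 1 control."""
    ncp = math.sqrt(J) * f / math.sqrt(4 * (icc + (1 - icc) / n))
    df = J - 2
    t0 = t_critical(alpha, df)
    return 1 - nct_cdf(t0, df, ncp) + nct_cdf(-t0, df, ncp)

def smallest_n(f, J, icc, target=0.8, n_max=500):
    """Smallest per-cluster sample size reaching the target power, or None."""
    for n in range(2, n_max + 1):
        if crt_power(f, n, J, icc) >= target:
            return n
    return None

print(smallest_n(f=0.5, J=20, icc=0.10))
```

The search assumes a two-sided test at the 0.05 level. If no n up to `n_max` reaches the target, the function returns `None`, which can happen when J is small, since λ stays bounded as n grows.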
Application of WebPower
Which matters more for increasing power in a CRT: the number of individuals per cluster (n) or the number of clusters (J)?
Set effect size = 0.5, ICC = 0.1, significance level = 0.05.

                   Varying n               Varying J
Total N (n x J)    n      J     Power      n     J      Power
100                10     10    0.359      10    10     0.359
200                20     10    0.447      10    20     0.680
300                30     10    0.487      10    30     0.858
400                40     10    0.510      10    40     0.942
500                50     10    0.525      10    50     0.978
600                60     10    0.535      10    60     0.992
700                70     10    0.543      10    70     0.997
800                80     10    0.549      10    80     0.999
900                90     10    0.553      10    90     1.000
1000               100    10    0.557      10    100    1.000
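The pattern in the table follows from the formula for λ. In the sketch below (hypothetical function name, same settings as the table), λ grows without bound as J increases, but with J fixed it approaches the ceiling √J·f/√(4ρ) as n increases, so extra individuals per cluster eventually stop helping.

```python
import math

def crt_lambda(f, icc, n, J):
    """Noncentrality parameter for a CRT with 1 treatment and 1 control."""
    return math.sqrt(J) * f / math.sqrt(4 * (icc + (1 - icc) / n))

f, icc = 0.5, 0.1

# Same total sample size in each row: grow n with J = 10, or grow J with n = 10.
for k in (10, 50, 100):
    vary_n = crt_lambda(f, icc, n=k, J=10)
    vary_j = crt_lambda(f, icc, n=10, J=k)
    print(k * 10, round(vary_n, 3), round(vary_j, 3))

# Limit of lambda as n grows without bound, with J = 10.
ceiling = math.sqrt(10) * f / math.sqrt(4 * icc)
print(ceiling)  # about 2.5
```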
Comparison With Other Software
Optimal Design (Raudenbush et al., 2011): graphic-based power analysis
- Pros: Available for three-level designs.
- Cons: No exact value for power. Only considers 1 treatment and 1 control.
R function MRTpower() (Usami, 2014):
- Pros: Extended to three-level designs and unbalanced designs.
- Cons: Only estimates sample size. To do power or effect size calculation, readers should understand the technical details of the paper and write their own syntax.
What can be improved in WebPower?
- Generalize to three-level and unbalanced designs.
References
Bloom, H. S., Bos, J. M., & Lee, S.-W. (1999). Using cluster random assignment to measure program impacts: Statistical implications for the evaluation of education programs. Evaluation Review, 23(4), 445-469.
Hedges, L. V. (2007). Effect sizes in cluster-randomized designs. Journal of Educational and Behavioral Statistics, 32, 341-370.
Hedges, L. V., & Hedberg, E. C. (2007). Intraclass correlations for planning group randomized experiments in rural education. Journal of Research in Rural Education, 22(10).
Hox, J. (1998). Multilevel modeling: When and why. Classification, data analysis, and data highways, 147-154.
Raudenbush, S. W., & Liu, X. (2000). Statistical power and optimal design for multisite randomized trials. Psychological Methods, 5(2), 199-213.
Raudenbush, S. W., et al. (2011). Optimal Design Software for Multi-level and Longitudinal Research (Version 3.01) [Software]. Available from www.wtgrantfoundation.org.
Usami, S. (2014). Generalized sample size determination formulas for experimental research with hierarchical data. Behavior Research Methods, 46(2), 346-356.
Thanks
Johnny & Ke-Hai
Supported by the Department of Education (R305D140037)
Gabrielle, Agung & Haiyan
All of you