A BRIEF INTRODUCTION TO MULTILEVEL MODELS
Leslie Rutkowski, PhD
Assistant Professor of Inquiry Methodology
Counseling & Educational Psychology, School of Education, Indiana University
WIM Seminar, February 2015 (2015/02/27)
Multilevel data
• Study participants are often nested within specific contexts
  • Patients treated in hospitals
  • Firms operate within countries
  • Families live in neighborhoods
  • Students learn in classes within schools
• Data stemming from such research designs have a multilevel or hierarchical structure.
Terminology
• HLM
• Multilevel modeling (MLM)
• Random effects models
• Variance components
• Mixed effects modeling
More terminology
• Macro
  • Macro-level units
  • Macro units
  • Primary units
  • Clusters
  • Level 2 units
• Micro
  • Micro-level units
  • Micro units
  • Secondary units
  • Elementary units
  • Level 1 units
HLM: Simple model
• One L1 predictor with a random intercept and a random slope:
  • HLM form:
  • Linear mixed model (LMM) form:
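The two forms named on this slide can be sketched as follows. This is a reconstruction under assumptions: U_0j and R_ij appear later in the transcript, while the symbols \beta, \gamma, U_{1j}, and x_{ij} follow a common textbook notation and are not confirmed by the source.

```latex
% HLM form (level 1, then level 2); notation assumed
\begin{align*}
Y_{ij}     &= \beta_{0j} + \beta_{1j} x_{ij} + R_{ij} \\
\beta_{0j} &= \gamma_{00} + U_{0j} \\
\beta_{1j} &= \gamma_{10} + U_{1j}
\end{align*}
% Linear mixed model form: substitute level 2 into level 1
\begin{align*}
Y_{ij} &= \gamma_{00} + \gamma_{10} x_{ij} + U_{0j} + U_{1j} x_{ij} + R_{ij}
\end{align*}
```

The LMM form separates the fixed part (the \gamma terms) from the random part (the U and R terms).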
Notation, notation, notation
Notation, notation, notation, II
Group Level Similarities
• If we use traditional linear regression, we assume:
Group Level Similarities
• But more often we have:
Group Level Similarities
• Students within a school are somewhat similar
• Students in different schools differ
• Why? (absolutely not comprehensive)
  • Teacher factors
    • Pedagogical approaches
    • Training
  • School factors
    • Public vs. private
    • Safety
  • Community factors
    • Parental involvement
    • Average SES
Implications?
• It is often (but not always) important to take group-level dependencies into account in analyses.
• Why?
  • Most (traditional) assumptions are violated.
  • We might miss some very important group effects.
  • The level of group dependency is important in its own right.
• When do we care about group-level dependencies?
Single vs. multilevel regression
How different are the relationships?
MATH452 = 448.28 + 38.57*books
MATH517 = 518.99 + 4.78*books
MATH529 = 487.81 + 10.09*books
MATH548 = 461.95 + 16.55*books
MATH577 = 393.50 - 7.82*books
MATH604 = 493.21 + 17.81*books
MATH619 = 498.76 - 1.93*books
MATH622 = 465.68 + 5.35*books
MATH677 = 506.50 + 1.24*books
MATH1479 = 480.86 + 29.54*books
MATH1673 = 605.29 + 11.16*books
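To put rough numbers on "how different", the school-specific fits can be summarized by the spread of their intercepts and slopes. The sketch below only restates the coefficients from the slide; the variable and function names are illustrative.

```python
# School-specific OLS fits copied from the slide: school ID -> (intercept, slope on books).
school_fits = {
    452: (448.28, 38.57),
    517: (518.99, 4.78),
    529: (487.81, 10.09),
    548: (461.95, 16.55),
    577: (393.50, -7.82),
    604: (493.21, 17.81),
    619: (498.76, -1.93),
    622: (465.68, 5.35),
    677: (506.50, 1.24),
    1479: (480.86, 29.54),
    1673: (605.29, 11.16),
}

intercepts = [b0 for b0, _ in school_fits.values()]
slopes = [b1 for _, b1 in school_fits.values()]


def spread(values):
    """Return (minimum, maximum, range) of a list of numbers."""
    return min(values), max(values), max(values) - min(values)


intercept_min, intercept_max, intercept_range = spread(intercepts)
slope_min, slope_max, slope_range = spread(slopes)

# Schools where more books goes with lower math scores.
negative_slope_schools = [sid for sid, (_, b1) in school_fits.items() if b1 < 0]

print(f"intercepts: {intercept_min:.2f} to {intercept_max:.2f}")
print(f"slopes:     {slope_min:.2f} to {slope_max:.2f}")
print("schools with a negative books slope:", negative_slope_schools)
```

Both the intercepts and the slopes vary considerably across schools, which is the pattern that random intercepts and random slopes are meant to capture.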
Ignoring data structure
• We can easily have such a situation:
[Figure: Y (0 to 18) against X (0 to 25), with fitted lines y = 0.5x + 6, y = 0.5x + 1, y = 0.5x + 8, and y = -1x + 21.619.]
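One plausible reading of the four lines, offered as an interpretation rather than something the transcript states, is that three groups each have a within-group slope of 0.5 while a single regression that ignores grouping has a negative slope. The sketch below reproduces that pattern with hypothetical data (the x values and intercepts are invented for illustration and are not the slide's data).

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical groups: name -> (x values, intercept).
# Every group follows y = 0.5 * x + intercept exactly.
groups = {
    "A": ([1, 2, 3], 12.0),
    "B": ([8, 9, 10], 4.0),
    "C": ([15, 16, 17], -4.0),
}

within_slopes = {}
pooled_x, pooled_y = [], []
for name, (xs, intercept) in groups.items():
    ys = [0.5 * x + intercept for x in xs]
    within_slopes[name] = ols_slope(xs, ys)  # slope inside one group
    pooled_x.extend(xs)
    pooled_y.extend(ys)

# Slope from one regression that ignores group membership.
pooled_slope = ols_slope(pooled_x, pooled_y)

print("within-group slopes:", within_slopes)
print("pooled slope:", pooled_slope)
```

Groups with larger x values have lower intercepts here, so the pooled line slopes downward even though every within-group relationship is positive.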
Within vs. Between
Null/Empty/Baseline Model
• Useful first step in the model building / hypothesis testing process.
• With the empty model we learn whether there are between-group differences.
  • Yes? A multilevel approach is warranted. (between-group variance > 0)
  • No? A standard, plain vanilla regression is sufficient. (between-group variance = 0)
Simplest Multilevel Model: Null
• Null model, empty model, fully unconditional model:
• Where
• There are no predictors
• Linear mixed model:
• Yij is random because U0j and Rij are random
Deconstructed
• The intercept is composed of two parts:
  • Overall (fixed) mean
  • Random group effect: U0j
    • This is the random difference of group j from the overall mean.
• Individual deviation / residual deviation: Rij
  • This is the random difference of student i from the mean of group j.
• Variance components:
  • The variance of Yij is decomposed into two parts: between groups and within groups.
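The null model and its variance decomposition can be written out as below. U_0j and R_ij are the transcript's own symbols; \beta_{0j}, \gamma_{00}, \tau_0^2, and \sigma^2 are assumed here, following a common textbook notation.

```latex
% Null model in two-level form (Greek symbols are assumed notation)
\begin{align*}
Y_{ij}     &= \beta_{0j} + R_{ij} \\
\beta_{0j} &= \gamma_{00} + U_{0j}
\end{align*}
% Linear mixed model form
\begin{align*}
Y_{ij} &= \gamma_{00} + U_{0j} + R_{ij}
\end{align*}
% Variance components, assuming U_{0j} and R_{ij} are independent
\begin{align*}
\operatorname{var}(U_{0j}) &= \tau_0^2, \qquad \operatorname{var}(R_{ij}) = \sigma^2 \\
\operatorname{var}(Y_{ij}) &= \tau_0^2 + \sigma^2
\end{align*}
```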
In words
• Groups (j) are regarded as a sample from a population of groups.
• We can say that the intercept coefficient depends on j.
• This is an important first step because it provides a basic partition of the variance.
• If the intercept does not depend on j, then it is best to take an OLS approach to analyzing the data.
  • Usually. It is possible that there are no intercept differences but there are slope differences. When could this happen?
What are we modeling?
• We know that we have a sample from a larger population of schools and students.
• We see that the means are quite different.
• Do we have sufficient evidence to support a multilevel approach?
[Figure: Math_Score (axis from 300 to 800) by SCHOOL ID for schools 452, 517, 529, 548, 577, 604, 619, 622, 677, 1479, and 1673.]
How similar/different?: Intraclass correlation
• Measure of similarity between two randomly chosen level-1 units within a randomly chosen level-2 unit
• Proportion of variance in the outcome that is between groups
• Proportion of variance in the outcome explained by group differences
• Provides justification for a multilevel modeling approach
• Defined from two quantities:
  • the population between-group variance
  • the population within-group variance
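In symbols, the definitions above can be sketched as follows. The names \rho_I, \tau_0^2 (between-group variance), and \sigma^2 (within-group variance) are assumed notation, since the slide's own symbols did not survive.

```latex
% Intraclass correlation as a variance ratio (notation assumed)
\begin{align*}
\rho_I &= \frac{\tau_0^2}{\tau_0^2 + \sigma^2}
\end{align*}
% The same quantity as the correlation between two different
% level-1 units i and i' in the same group j, under the null model
\begin{align*}
\operatorname{corr}(Y_{ij}, Y_{i'j})
  &= \frac{\operatorname{cov}(Y_{ij}, Y_{i'j})}{\operatorname{var}(Y_{ij})}
   = \frac{\tau_0^2}{\tau_0^2 + \sigma^2}
\end{align*}
```

The second line is why the ICC can be read both as a proportion of variance and as a measure of similarity within a group.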
Null Model ICC
• Estimated parameters:

Parameter                              Value
Between-school (intercept) variance    3170.88
Within-school (residual) variance      4680.16

• ICC from this example: 3170.88 / (3170.88 + 4680.16) = 0.4039
• About 40% of the variance in mathematics scores can be attributed to between-school differences. The remainder lies within schools.
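The ICC arithmetic on this slide can be checked with a few lines. This is a minimal sketch; the function and variable names are illustrative, and the two variance estimates are the ones reported on the slide.

```python
def intraclass_correlation(between_var, within_var):
    """Share of the total variance that lies between groups."""
    return between_var / (between_var + within_var)


between_school_var = 3170.88  # intercept variance from the null model
within_school_var = 4680.16   # residual variance from the null model

icc_null = intraclass_correlation(between_school_var, within_school_var)
print(f"ICC (null model): {icc_null:.4f}")
```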
Intercepts as outcomes models
• These are also referred to as random intercepts models.
• A couple of possibilities:
  • Some level-1 predictors, no level-2 predictors
  • Some level-1 predictors, some level-2 predictors
  • Some level-2 predictors, no level-1 predictors
• But what do these look like?
Add a level-1 predictor
• Add to our empty model a predictor for the number of books in the home:
• Number of books in the home is a historical proxy for SES (goes back to FIMS - http://www.iea.nl/fims.html)
• The additional variable may help to explain variance in achievement; this is at the heart of what we are trying to do.
RI model with L1 Books
• Adding a predictor to the model gives us:
• As a linear mixed model:
Specifically:
Abstractly:
HLM or MLM
Based on the linear mixed model:
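A sketch of the random intercept model with books as the level-1 predictor, written first in two-level form and then as a linear mixed model. The \beta and \gamma symbols are assumed notation; the slope is fixed (it has no random component) because this is a random intercept model.

```latex
% Two-level form: random intercept, fixed slope (notation assumed)
\begin{align*}
Y_{ij}     &= \beta_{0j} + \beta_{1j} x_{ij} + R_{ij} \\
\beta_{0j} &= \gamma_{00} + U_{0j} \\
\beta_{1j} &= \gamma_{10}
\end{align*}
% Linear mixed model, abstractly
\begin{align*}
Y_{ij} &= \gamma_{00} + \gamma_{10} x_{ij} + U_{0j} + R_{ij}
\end{align*}
% Linear mixed model, specifically for this example
\begin{align*}
\text{math}_{ij} &= \gamma_{00} + \gamma_{10}\,\text{books}_{ij} + U_{0j} + R_{ij}
\end{align*}
```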
Estimated parameters
• We are estimating 4 parameters:
  • Intercept
  • Books effect
  • Between-groups intercept variance
  • Within-groups variance
Results
• Edited output from SAS:

                  Null Model           Add Books
                  Value     SE         Value     SE
Fixed Effects
  Intercept       514.03    17.62      515.62    14.59
  Books           --        --         17.99     3.70
Random Effects
  Intercept       3170.88              2115.35
  Residual        4680.16              4336.00

• And the residual ICC:
• Notice this is a bit smaller than the empty model. Why?
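One way to approach the "Why?" is to compute the residual ICC and to compare how much each variance component shrank once books entered the model. The sketch below uses the table's estimates; the proportional-reduction measure is a common heuristic and is an addition here, not something the slide reports.

```python
# Variance components copied from the results table.
null_model = {"between": 3170.88, "within": 4680.16}
books_model = {"between": 2115.35, "within": 4336.00}


def icc(components):
    """Between-group share of the (residual) variance."""
    return components["between"] / (components["between"] + components["within"])


def proportional_reduction(before, after):
    """Fraction by which a variance component shrank."""
    return (before - after) / before


residual_icc = icc(books_model)
between_reduction = proportional_reduction(null_model["between"], books_model["between"])
within_reduction = proportional_reduction(null_model["within"], books_model["within"])

print(f"residual ICC with books: {residual_icc:.4f}")
print(f"between-school variance reduced by {between_reduction:.1%}")
print(f"within-school variance reduced by {within_reduction:.1%}")
```

Under these estimates the between-school variance drops by a much larger fraction than the within-school variance, which is why the residual ICC falls below the null-model value.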
Adding Level 2 Effects
• Including only the level-1 variable forces the between-group and within-group regressions for that effect to be equal.
• Does this seem reasonable? (1.0 = .25?)
[Figure: Y (0 to 8) against X (0 to 10).]
And With Our Data
Adding Level 2 Effects
• If we add the group mean for books, the between-group and within-group relationships can differ.
Hierarchical linear model (HLM):
Linear mixed model (LMM):
If the coefficient on the group mean is zero, then the between and within relationships are not different.
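A sketch of the model with the group mean added, assuming the raw level-1 predictor x_ij (not a group-centered version) stays in the model. The symbols \gamma_{01} and \bar{x}_{.j} are assumed notation for the group-mean coefficient and the group mean.

```latex
% HLM form: the group mean predicts the intercept (notation assumed)
\begin{align*}
Y_{ij}     &= \beta_{0j} + \gamma_{10} x_{ij} + R_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01} \bar{x}_{.j} + U_{0j}
\end{align*}
% LMM form
\begin{align*}
Y_{ij} &= \gamma_{00} + \gamma_{10} x_{ij} + \gamma_{01} \bar{x}_{.j} + U_{0j} + R_{ij}
\end{align*}
% Within-group coefficient: \gamma_{10}
% Between-group coefficient: \gamma_{10} + \gamma_{01}
% If \gamma_{01} = 0, the two coincide.
```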
Estimated parameters
• Now we are estimating 5 parameters:
  • Intercept
  • Books effect
  • School average effect of books
  • Intercept variance
  • Within variance component
Models compared

                  Null Model           Add Books            Add Mbooks
                  Value     SE         Value     SE         Value     SE
Fixed Effects
  Intercept       514.03    17.62      515.62    14.59      365.39    26.67
  books           --        --         17.99     3.70       16.30     3.75
  Mbooks                                                    76.13     12.89
Random Effects
  Intercept       3170.88   1460.23    2115.35   1021.63    574.51    363.05
  Residual        4680.16   432.74     4336.00   402.16     4345.33   403.97

• What changed?
• And the ICC:
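A sketch of two quantities that help answer "What changed?": the implied between-school coefficient and the residual ICC for the Mbooks model. It assumes books enters the model uncentered alongside its school mean, in which case the between-school coefficient is the sum of the two estimates; the transcript does not state the centering, so treat that as an assumption.

```python
# Estimates copied from the "Add Mbooks" column of the table.
books_coef = 16.30     # within-school effect of books
mbooks_coef = 76.13    # additional effect of the school mean of books
intercept_var = 574.51
residual_var = 4345.33

# Assumption: books is not group-mean centered, so the
# between-school coefficient is the sum of the two estimates.
within_coef = books_coef
between_coef = books_coef + mbooks_coef

# Residual ICC after conditioning on books and Mbooks.
icc_mbooks = intercept_var / (intercept_var + residual_var)

print(f"within-school coefficient:  {within_coef:.2f}")
print(f"between-school coefficient: {between_coef:.2f}")
print(f"residual ICC: {icc_mbooks:.4f}")
```

The school mean absorbs most of the remaining between-school variance, so the residual ICC is far below the null-model value of about 0.40.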
Random slopes
Forcing slopes to be equal across groups
Allowing slopes to be unequal across groups
How does the model look?
• With one micro-variable:
• Where
• And
What are all these parts? First, what’s new?
1. A random component for the slope:
2. Variance components
   • Before:
   • Now, add slope variance:
   • And intercept/slope covariance:
How many more parameters are we estimating?
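The pieces listed on this slide can be sketched in symbols as below. U_0j and R_ij appear elsewhere in the transcript; U_{1j}, \tau_0^2, \tau_1^2, \tau_{01}, and \sigma^2 are assumed notation.

```latex
% New random component for the slope (notation assumed)
\begin{align*}
\beta_{1j} &= \gamma_{10} + U_{1j}
\end{align*}
% Variance components before (random intercept model)
\begin{align*}
\operatorname{var}(U_{0j}) &= \tau_0^2, \qquad \operatorname{var}(R_{ij}) = \sigma^2
\end{align*}
% Added with a random slope
\begin{align*}
\operatorname{var}(U_{1j}) &= \tau_1^2, \qquad \operatorname{cov}(U_{0j}, U_{1j}) = \tau_{01}
\end{align*}
```

Relative to the random intercept model, that is two additional parameters: the slope variance and the intercept/slope covariance.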
Putting it all together…
• Linear Mixed Model:
• And:
And how do we interpret ?
• If this were a representative situation:
[Figure: “2 Groups”; Y (0 to 12) against X (0 to 1).]
• What do you think?
Intra-class correlation
• In a random intercept model, recall the ICC:
• Now, variance and covariance of Yij
• So, the ICC is also dependent on the value of x
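The dependence on x can be derived from a random slope model with one predictor. The sketch below assumes the model Y_ij = \gamma_{00} + \gamma_{10} x_{ij} + U_{0j} + U_{1j} x_{ij} + R_{ij}, with assumed symbols \tau_0^2, \tau_1^2, \tau_{01}, and \sigma^2 for the variance components.

```latex
% Variance of one observation, given its x value
\begin{align*}
\operatorname{var}(Y_{ij} \mid x_{ij})
  &= \tau_0^2 + 2\tau_{01} x_{ij} + \tau_1^2 x_{ij}^2 + \sigma^2
\end{align*}
% Covariance of two different observations in the same group
\begin{align*}
\operatorname{cov}(Y_{ij}, Y_{i'j} \mid x_{ij}, x_{i'j})
  &= \tau_0^2 + \tau_{01}(x_{ij} + x_{i'j}) + \tau_1^2 x_{ij} x_{i'j}
\end{align*}
% ICC for two observations that share the same value x
\begin{align*}
\rho_I(x)
  &= \frac{\tau_0^2 + 2\tau_{01} x + \tau_1^2 x^2}
          {\tau_0^2 + 2\tau_{01} x + \tau_1^2 x^2 + \sigma^2}
\end{align*}
```

Setting \tau_1^2 = \tau_{01} = 0 recovers the random intercept ICC, which does not depend on x.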
Ex 1: Random slopes, 1 predictor
• Going back to our earlier example
• Y_ij = math_ij and x_ij = books_ij
• Level 1:
• Level 2:
Results

                  Null Model           Random Intercept     Random Slope
                  Value     SE         Value     SE         Value     SE
Fixed Effects
  Intercept       514.03    17.62      515.62    14.59      505.46    3.10
  books           --        --         17.99     3.70       12.71     0.73
Random Effects
  Intercept       3170.88              2115.35              2157.87
  Int/Slope                                                 36.33
  Slope                                                     11.29
  Residual        4680.16              4336.00              3426.69
Plot of Groups
And the ICC?
• Depends on books, so let’s choose a couple of reasonable values:
  • Most students fall between (-2.36, 2.36)
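A sketch of the ICC at the two endpoints and at zero, using the random slope estimates from the results table (intercept variance 2157.87, intercept/slope covariance 36.33, slope variance 11.29, residual variance 3426.69). The formula is the standard conditional ICC for a random slope model with one predictor; the function name is illustrative.

```python
def icc_at(x, intercept_var, int_slope_cov, slope_var, residual_var):
    """ICC for two students in the same school who share the value x."""
    between = intercept_var + 2.0 * int_slope_cov * x + slope_var * x ** 2
    return between / (between + residual_var)


# Random slope model estimates copied from the results table.
estimates = {
    "intercept_var": 2157.87,
    "int_slope_cov": 36.33,
    "slope_var": 11.29,
    "residual_var": 3426.69,
}

icc_by_books = {x: icc_at(x, **estimates) for x in (-2.36, 0.0, 2.36)}

for x, value in icc_by_books.items():
    print(f"books = {x:5.2f}: ICC = {value:.4f}")
```

With a positive intercept/slope covariance, the ICC rises as books increases across this range.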
L2 Predictors & Cross Level Interactions
• As in random intercept models, we can include level-2 predictors
  • Means of level-1 variables
    • School SES
    • School attitude toward math
  • Natural level-2 predictors
    • School resources (books, facilities)
    • Principal reports of student behavior problems
• These effects can predict the intercept, the slope, or a combination
Simple CLI Model
• Adding a bit to what we had before:
Linear Mixed Model
• Combined:
• Notice the term that multiplies the level-1 predictor by the level-2 predictor.
  • This is a “cross-level” interaction
  • We are trying to explain the slope
• Does the slope for group j depend on some level-2 predictor?
  • Example: does the effect of student language use depend on the average SES of the school?
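A sketch of a simple cross-level interaction model, with one level-1 predictor x_ij and one level-2 predictor z_j that predicts both the intercept and the slope. All Greek symbols and the name z_j are assumed notation.

```latex
% Level 1 (notation assumed)
\begin{align*}
Y_{ij} &= \beta_{0j} + \beta_{1j} x_{ij} + R_{ij}
\end{align*}
% Level 2: z_j predicts both the intercept and the slope
\begin{align*}
\beta_{0j} &= \gamma_{00} + \gamma_{01} z_j + U_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11} z_j + U_{1j}
\end{align*}
% Combined (linear mixed model)
\begin{align*}
Y_{ij} &= \gamma_{00} + \gamma_{01} z_j + \gamma_{10} x_{ij}
          + \gamma_{11} z_j x_{ij}
          + U_{0j} + U_{1j} x_{ij} + R_{ij}
\end{align*}
```

The product term \gamma_{11} z_j x_{ij} is the cross-level interaction: it arises because z_j appears in the level-2 equation for the slope.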
Variance components
• Notice that our variance components do not change
With all of the usual assumptions
Many other possibilities for L2
• We could also have:
Cross-Level Example
• Level 1:
• Level 2:
• Combined:
Results
• Interpretation?

                  Null Model           Random Slope         CLI Model
                  Value     SE         Value     SE         Value     SE
Fixed Effects
  Intercept       514.03    17.62      505.46    3.10       412.91    3.10
  books           --        --         12.71     0.73       14.08     2.53
  Mbooks                                                    47.68     3.36
  books*Mbooks                                              -0.71     1.24
Random Effects
  Intercept       3170.88              2157.87              836.09
  Int/Slope                            36.33                73.56
  Slope                                11.29                39.51
  Residual        4680.16              3426.69              3426.69
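As a starting point for interpretation, the sketch below computes estimate-to-standard-error ratios for the CLI model's fixed effects and the implied books slope at a few school means. Comparing the ratio with 1.96 is a rough large-sample approximation and is not necessarily the test SAS reports; the Mbooks values 0, 1, and 2 are hypothetical, since the transcript does not give the scale or centering of Mbooks.

```python
# CLI model fixed effects copied from the table: name -> (estimate, standard error).
fixed_effects = {
    "Intercept": (412.91, 3.10),
    "books": (14.08, 2.53),
    "Mbooks": (47.68, 3.36),
    "books*Mbooks": (-0.71, 1.24),
}

# Ratio of each estimate to its standard error.
wald_ratios = {name: est / se for name, (est, se) in fixed_effects.items()}

# Rough screen: |ratio| > 1.96 (large-sample normal approximation, an assumption).
clearly_nonzero = {name: abs(ratio) > 1.96 for name, ratio in wald_ratios.items()}


def books_slope(mbooks):
    """Implied effect of books in a school whose mean of books is `mbooks`."""
    return fixed_effects["books"][0] + fixed_effects["books*Mbooks"][0] * mbooks


for name, ratio in wald_ratios.items():
    print(f"{name:13s} ratio = {ratio:7.2f}  |ratio| > 1.96: {clearly_nonzero[name]}")

for mbooks in (0.0, 1.0, 2.0):  # hypothetical school means
    print(f"Mbooks = {mbooks:.1f}: books slope = {books_slope(mbooks):.2f}")
```

By this rough screen the books and Mbooks effects stand out while the cross-level interaction does not, so the books slope changes only slightly across the hypothetical school means.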
Multilevel models
• Just a VERY brief overview
• A major methodological advance
  • Allows for the possibility of randomly varying intercepts
  • Randomly varying slopes
  • We can decompose the variance within and between higher-level units.
  • We can fit cross-level interactions
• We can test rich and complex theories
What do we gain?
• Statistically efficient estimates of regression coefficients.
• Correct standard errors, confidence intervals, and significance tests.
• Can use covariates measured at any of the levels of the hierarchy.
• We can test hypotheses about homogeneity or heterogeneity of groups.
Lots of other topics
• Centering: what and why
• Testing variance components
• Longitudinal analysis
• Cross-classified models
Now, on to SAS!
PROC MIXED
• Basic Syntax:
  1. PROC MIXED options
  2. CLASS statement
  3. MODEL statement
  4. RANDOM statement
  5. TITLE statement
PROC MIXED
PROC MIXED <options go here> ;
<various commands on the following lines>
Options:
• DATA=<name of SAS data set>: which SAS data set to use.
• NOCLPRINT: don’t print classification levels/information.
• COVTEST: hypothesis tests for variances (and covariances).
• METHOD=: estimation method to use.
  • ML = maximum likelihood
  • REML = restricted maximum likelihood (REML is the default)
• EMPIRICAL: empirical / Huber-White / sandwich estimators.
CLASS, MODEL, BY statements
• CLASS statement indicates variables that are class / categorical / nominal / discrete
• MODEL is of the form
  MODEL response = fixed predictors / options
  • SOLUTION requests a solution for the fixed effects
• BY statement produces a separate analysis for each level of the BY variable(s) (e.g., country)
RANDOM & TITLE statement
• RANDOM specifies the random effects:
  RANDOM random predictors / options
  • Option SOLUTION produces the solution for the random effects
  • Option TYPE=UN requests an unstructured covariance matrix, which gives the variance and covariance components
  • Option SUBJECT= specifies the grouping variable
• TITLE adds a title to the procedure output:
  TITLE 'title here';
ODS OUTPUT
• Writes a data file based on your requests.
• Requires certain statements in the PROC MIXED syntax.
  • See TABLE 56.22 in the SAS Help file for details.
• In general: ods output <table name>=<data set name>;
• Example: ods output solutionf=f_full covparms=c_full;
SAS Examples: • Null model:
• RI with L1 predictor: