Longitudinal Data Analysis: Why and How to Do it With Multi-Level Modeling (MLM)? Oi-man Kwok Texas A & M University


Post on 19-Jan-2016



Page 1: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Longitudinal Data Analysis: Why and How to Do it With Multi-Level Modeling (MLM)?

Oi-man Kwok

Texas A & M University

Page 2: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Road Map

• Why do we want to analyze longitudinal data under the multilevel modeling (MLM) framework?
  – Dependency issue
  – Advantages of using MLM over traditional methods (e.g., univariate ANOVA, multivariate ANOVA)
  – Review of important parameters in MLM

• How can we do it in SPSS?

Page 3: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

• Regression model, e.g.:

DV: Test scores in 1st-year grad-level statistics (Stat)
IV: GRE_M (GRE Math test score)
150 students (i = 1, …, 150)

$Stat_i = \beta_0 + \beta_1 \, GRE\_M_i + e_i$

One of the important assumptions of OLS regression? (Observations are independent of each other.)

Page 4: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Ignoring the clustered structure (or dependency between observations) in the analyses can result in:

• Bias in the standard errors

• Bias in the tests of significance and confidence intervals (Type I errors: inflated alpha level, e.g., nominal α = .05 but actual α = .10)

• Non-replicable results

Page 5: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Advantages of MLM over the traditional methods for analyzing longitudinal data

• Univariate ANOVA—Restriction on the error structure: Compound Symmetry (CS) type error structure (higher statistical power but not likely to be met in longitudinal data)

• Multivariate ANOVA—No restriction on the error structure: Unstructured (UN) type error structure (often too conservative, lower statistical power); can only handle completely balanced data (Listwise deletion)

• More…

Page 6: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Analyzing Longitudinal Data:

• Example (based on actual data; variable names changed for ease of presentation): compare two different teaching methods on achievement over time

• Teaching methods: 78 students are randomly assigned to either:
  A. Lecture (control group; 39 students) or
  B. Computer (treatment group; 39 students)

• 4 achievement (Ach) scores (right after the course, and 1, 2, and 3 years after) were collected from each student after treatment (i.e., the statistics course)

Page 7: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

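[Figure: Achievement (y-axis) against Time in years (x-axis: 0, 1, 2, 3), with one line for the Computer group and one for the Lecture group. Time = 0 is the immediate posttest measure.]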

Page 8: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

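Multi-Level Model (MLM)

• Note: Start with a simple growth model

A simple regression model for ONE student (student 36):

$Ach_t = \beta_0 + \beta_1 \, Time_t + e_t$  (t = 0, 1, 2, 3)

[Figure: $Ach_t$ against $Time_t$ (0, 1, 2, 3) for student 36, showing the fitted line with intercept $\beta_0$ and slope $\beta_1$, and the residuals $e_0$, $e_1$, $e_2$, $e_3$.]

$e_t$: Captures the variation of the individual achievement scores around the fitted regression line WITHIN student 36

$V(e_{ti}) = \sigma^2$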

Page 9: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

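$Ach_t = \beta_0 + \beta_1 \, Time_t + e_t$

Compare to

$Ach_{ti} = \beta_{0i} + \beta_{1i} \, Time_{ti} + e_{ti}$  (i = 1, 2, 3, …, 78)

(Micro Level Model)

[Figure: $Ach_{ti}$ against $Time_{ti}$ (0, 1, 2, 3) with separate regression lines for students 27, 36, and 52, each with its own intercept ($\beta_0$ of student 27, 36, 52) and its own slope ($\beta_1$ of student 27, 36).]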

Page 10: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Student ID   $\beta_{0i}$   $\beta_{1i}$
12           13.5           1.25
15           10.5           2.75
23           12.6           .23
27           15.6           .28
28           22.3           1.64
33           36.4           3.27
37           25.2           1.22

[Figure: $Ach_{ti}$ against $Time_{ti}$ (0, 1, 2, 3) with separate regression lines for students 27, 36, and 52, each with its own intercept and slope.]

$\gamma_{00}$: Grand intercept

$\gamma_{10}$: Grand slope

$\tau_{00}$: Variance of the intercepts

$\tau_{11}$: Variance of the slopes

Page 11: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

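[Figure: Ach against Time. The overall model line and the lines for students 27, 36, and 52 all start from the same intercept $\gamma_{00}$ at Time = 0.]

No variation among the 78 intercepts:

$G = \begin{bmatrix} 0 & 0 \\ 0 & \tau_{11} \end{bmatrix}$

compared with

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & \tau_{11} \end{bmatrix}$

$\tau_{00}$: Captures the deviations of the 78 intercepts from the grand intercept $\gamma_{00}$

$\tau_{11}$: Captures the deviations of the 78 slopes from the grand slope $\gamma_{10}$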

Page 12: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

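[Figure: Ach against Time. The overall model line and the lines for students 27, 36, and 52 are parallel, all with the same slope $\gamma_{10}$.]

No variation among the 78 slopes:

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & 0 \end{bmatrix}$

compared with

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & \tau_{11} \end{bmatrix}$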

Page 13: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

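[Figure: Ach against Time, showing the overall model line.]

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & \tau_{11} \end{bmatrix}$

compared with

$G = \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{10} & \tau_{11} \end{bmatrix}$, with $\tau_{01} = \tau_{10}$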

Page 14: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Summary

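• G: Captures between-student differences

• R: Captures within-student random errors

$G = \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{10} & \tau_{11} \end{bmatrix}$

$\gamma_{00}$: Grand intercept

$\gamma_{10}$: Grand slope

$\tau_{00}$: Variance of the intercepts

$\tau_{11}$: Variance of the slopes

$\tau_{01}$: Covariance between intercepts and slopes

$V(e_{ti}) = \sigma^2$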

Page 15: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

MACRO vs. MICRO

• UNITS:

        Educational study   Family study    Longitudinal study
MACRO   School/Class        Family          Individual
MICRO   Student             Family member   Repeated observations

Page 16: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

MACRO vs. MICRO (Cont.)

• MODELS:

  MICRO level model: regression model fitted to the observations within each MACRO unit

  MACRO level model: model that captures the differences between the overall model and the individual regression models from the different MACRO units

Page 17: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

• Dependent variable:
  Math achievement (Achieve; repeated measures / MICRO level)

• Predictors:
  • Repeated-measure (MICRO) level predictor: Time (and any time-varying covariates)
  • Student (MACRO) level predictor: Computer (different teaching methods) (and any time-invariant variables such as gender)

Page 18: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Data format under MANOVA approaches (SPSS data format):

Student   Treat   T0   T1   T2   T3
S1        0       5    3    2    3
S2        1       5    25   --   33
S3        1       --   19   17   26

• S1 has responses at all time points
• S2 has a missing response at time 2 (indicated by "--")
• S3 has a missing response at time 0

• MANOVA: only retains S1 in the analysis

Page 19: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Data format for MANOVA:

Student   Treat   T0   T1   T2   T3
S1        0       5    3    2    3
S2        1       5    25   --   33
S3        1       --   19   17   26

Data format for Multilevel Model (all 3 students are included in the analyses):

Student   Treat   Time   DV
S1        0       0      5
S1        0       1      3
S1        0       2      2
S1        0       3      3
S2        1       0      5
S2        1       1      25
S2        1       3      33
S3        1       1      19
S3        1       2      17
S3        1       3      26
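The restructuring from the MANOVA (wide) layout to the multilevel (long) layout can be sketched in a few lines. This is a minimal illustration in plain Python rather than SPSS; the function name `wide_to_long` and the use of `None` for "--" are assumptions made for the example, not part of the original slides.

```python
# Wide (MANOVA-style) records from the slide; None marks a missing response ("--").
wide = [
    {"Student": "S1", "Treat": 0, "T0": 5, "T1": 3, "T2": 2, "T3": 3},
    {"Student": "S2", "Treat": 1, "T0": 5, "T1": 25, "T2": None, "T3": 33},
    {"Student": "S3", "Treat": 1, "T0": None, "T1": 19, "T2": 17, "T3": 26},
]


def wide_to_long(rows, n_times=4):
    """Return one (Student, Treat, Time, DV) record per observed response."""
    long_rows = []
    for row in rows:
        for t in range(n_times):
            dv = row[f"T{t}"]
            if dv is not None:  # drop only the missing occasion, keep the student
                long_rows.append((row["Student"], row["Treat"], t, dv))
    return long_rows


long_rows = wide_to_long(wide)

# Students that survive listwise deletion (complete on all four occasions).
complete_cases = [
    r["Student"] for r in wide
    if all(r[f"T{t}"] is not None for t in range(4))
]

for rec in long_rows:
    print(*rec)
print("Rows in long format:", len(long_rows))
print("Students kept under listwise deletion:", complete_cases)
```

The long layout keeps every observed score from all three students, whereas the complete-case list shows what a listwise-deletion analysis would retain.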

Page 20: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Student   Treat   Time   DV
S1        0       0      5
S1        0       7      3
S1        0       12     2
S1        0       13     3
S2        1       1      5
S2        1       3      9
S2        1       4      5
S2        1       6      25
S3        1       3      18
S3        1       15     19
S3        1       28     17
S3        1       31     26

Can you transform this dataset back into multivariate format???
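One way to see why the question is rhetorical is to count what a wide layout would need. The sketch below (plain Python, with variable names chosen for illustration) builds one column per distinct measurement time and counts the empty cells.

```python
# Long-format records from the slide: (Student, Treat, Time, DV).
long_rows = [
    ("S1", 0, 0, 5), ("S1", 0, 7, 3), ("S1", 0, 12, 2), ("S1", 0, 13, 3),
    ("S2", 1, 1, 5), ("S2", 1, 3, 9), ("S2", 1, 4, 5), ("S2", 1, 6, 25),
    ("S3", 1, 3, 18), ("S3", 1, 15, 19), ("S3", 1, 28, 17), ("S3", 1, 31, 26),
]

# A wide layout needs one column for every distinct measurement time.
times = sorted({time for _, _, time, _ in long_rows})
students = sorted({student for student, _, _, _ in long_rows})
observed = {(student, time): dv for student, _, time, dv in long_rows}

n_cells = len(students) * len(times)
n_missing = n_cells - len(observed)

# Students with a score in every column, i.e. usable after listwise deletion.
complete = [s for s in students if all((s, t) in observed for t in times)]

print("Distinct time points (wide columns):", len(times))
print("Empty cells in the wide layout:", n_missing, "of", n_cells)
print("Students with complete data:", complete)
```

With individually varying measurement times, most cells of the wide table are empty and no student is complete, so a MANOVA-style analysis would have nothing left to analyze, while the long layout uses every observation.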

Page 21: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Questions

• 1. On average, is there any trend in math achievement over time?

• 2. Are there any differences between students in the trend of math achievement over time? (Do all students have the same trend of math achievement over time?)

Page 22: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

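Micro Level (Level 1):

$Ach_{ti} = \beta_{0i} + \beta_{1i} \, Time_{ti} + e_{ti}$

Macro Level (Level 2):

$\beta_{0i} = \gamma_{00} + U_{0i}$  ($\gamma_{00}$: grand intercept; $Var(U_{0i}) = \tau_{00}$)

$\beta_{1i} = \gamma_{10} + U_{1i}$  ($\gamma_{10}$: grand slope; $Var(U_{1i}) = \tau_{11}$)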

Page 23: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

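Micro Level:

$Mathach_{ti} = \beta_{0i} + \beta_{1i} \, Time_{ti} + e_{ti}$

Macro Level:

$\beta_{0i} = \gamma_{00} + U_{0i}$

$\beta_{1i} = \gamma_{10} + U_{1i}$

Combined Model:

$Mathach_{ti} = \gamma_{00} + \gamma_{10} \, Time_{ti} + U_{0i} + U_{1i} \, Time_{ti} + e_{ti}$

$\gamma_{00}$: Grand intercept

$\gamma_{10}$: Grand slope

$U_{0i} + U_{1i} \, Time_{ti}$: Between-student differences

$e_{ti}$: Within-student errors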

Page 24: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

[Figure: ACH (y-axis) against TIME (x-axis), one trajectory per selected student (SUBID). Red: Computer; Blue: Lecture.]

Page 25: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

$MA_{ti} = \gamma_{00} + \gamma_{10} \, Time_{ti} + U_{0i} + U_{1i} \, Time_{ti} + e_{ti}$

SPSS MIXED syntax:

MIXED mathach WITH Time
  /METHOD = REML
  /FIXED = INTERCEPT Time
  /RANDOM = INTERCEPT Time | SUBJECT(Subid) COVTYPE(UN)
  /PRINT = G SOLUTION TESTCOV.
EXECUTE.

Notes on the syntax:

• MIXED mathach WITH Time: the DV comes first; continuous IVs follow WITH and categorical IVs follow BY
• /METHOD: the default is REML (Restricted Maximum Likelihood); the other option is ML (Maximum Likelihood)
• /FIXED: captures the overall model
• /RANDOM: specifies the random effects, i.e., the effects that capture the between-student differences
• SUBJECT(Subid): identity variable for the macro-level units (e.g., Subid)
• COVTYPE(UN): structure of the G matrix (unstructured)
• /PRINT G: prints the G matrix
• /PRINT SOLUTION: requests the regression coefficients
• /PRINT TESTCOV: produces asymptotic standard errors and Wald Z-tests for the covariance parameter estimates

Page 26: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

SPSS Output

Basic Information

Model Dimension (b)

Effect                                 Number of Levels   Covariance Structure   Number of Parameters   Subject Variables
Fixed Effects: Intercept               1                                         1
Fixed Effects: time                    1                                         1
Random Effects: Intercept + time (a)   2                  Unstructured           3                      subid
Residual                                                                         1
Total                                  4                                         6

a. As of version 11.5, the syntax rules for the RANDOM subcommand have changed. Your command syntax may yield results that differ from those produced by prior versions. If you are using SPSS 11 syntax, please consult the current syntax reference guide for more information.

b. Dependent Variable: Achieve.

Page 27: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Information Criteria (a)

-2 Restricted Log Likelihood            2509.873
Akaike's Information Criterion (AIC)    2517.873
Hurvich and Tsai's Criterion (AICC)     2518.004
Bozdogan's Criterion (CAIC)             2536.819
Schwarz's Bayesian Criterion (BIC)      2532.819

The information criteria are displayed in smaller-is-better forms.

a. Dependent Variable: Achieve.

Page 28: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Type III Tests of Fixed Effects (a)

Source      Numerator df   Denominator df   F         Sig.
Intercept   1              77               871.772   .000
time        1              77               13.701    .000

a. Dependent Variable: Achieve.

Estimates of Fixed Effects (a)

Parameter   Estimate    Std. Error   df   t        Sig.   95% CI Lower Bound   95% CI Upper Bound
Intercept   54.25609    1.8375833    77   29.526   .000   50.5969939           57.9151856
time        2.3760897   .6419278     77   3.701    .000   1.0978482            3.6543313

a. Dependent Variable: Achieve.

The Estimates of Fixed Effects table is requested by the SOLUTION keyword in the PRINT statement (line 5 of the syntax).

Intercept ($\gamma_{00}$): Average MA score at Time = 0

time ($\gamma_{10}$): Average trend of the MA score
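The two fixed-effects tables report the same tests in two forms. As a quick arithmetic check (a sketch in plain Python; the dictionary layout is just for illustration), each t statistic is the estimate divided by its standard error, and for a test with one numerator degree of freedom the F statistic equals the square of t.

```python
# (estimate, standard error) pairs copied from the Estimates of Fixed Effects table.
fixed_effects = {
    "Intercept": (54.25609, 1.8375833),
    "time": (2.3760897, 0.6419278),
}

results = {}
for name, (estimate, std_error) in fixed_effects.items():
    t_value = estimate / std_error      # Wald-type t statistic
    f_value = t_value ** 2              # equals the single-df Type III F
    results[name] = (t_value, f_value)
    print(f"{name}: t = {t_value:.3f}, t squared = {f_value:.2f}")
```

The squared t values can be compared with the F column of the Type III table (871.772 for the intercept and 13.701 for time); small differences in the last digit come from rounding in the printed output.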

Page 29: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Estimates of Covariance Parameters (a)

Parameter                                       Estimate    Std. Error   Wald Z   Sig.   95% CI Lower Bound   95% CI Upper Bound
Residual                                        87.75982    9.9368430    8.832    .000   70.2936565           109.5658788
Intercept + time [subject = subid]   UN (1,1)   201.9517    43.01424     4.695    .000   133.0294456          306.5824032
                                     UN (2,1)   -.1513755   11.31083     -.013    .989   -22.3201972          22.0174463
                                     UN (2,2)   14.58960    5.5482320    2.630    .009   6.9237677            30.7428445

a. Dependent Variable: Achieve.

Residual = $\sigma^2$; UN (1,1) = $\tau_{00}$; UN (2,1) = $\tau_{10}$ (= $\tau_{01}$); UN (2,2) = $\tau_{11}$

The asymptotic standard errors and Wald Z-tests are requested by the TESTCOV keyword in the PRINT statement (line 5 of the syntax).

Random Effect Covariance Structure (G) (a)

                    Intercept | subid   time | subid
Intercept | subid   201.9517            -.1513755
time | subid        -.1513755           14.5895961

Unstructured

a. Dependent Variable: Achieve.

This table is requested by the G keyword in the PRINT statement (line 5 of the syntax). It corresponds to

$G = \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{10} & \tau_{11} \end{bmatrix}$

Page 30: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

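Can I have a simpler G matrix (i.e., $\tau_{01} = \tau_{10} = 0$)?

• Compare

$G = \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{10} & \tau_{11} \end{bmatrix}$  (-2LL: 2509.873)

with

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & \tau_{11} \end{bmatrix}$  (-2LL: ?)

Likelihood Ratio Test!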

Page 31: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Syntax for fitting the simpler G

SPSS syntax:

  /RANDOM = INTERCEPT Time | SUBJECT(Subid) COVTYPE(DIAG)

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & \tau_{11} \end{bmatrix}$

Page 32: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

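Model with $\tau_{01} = \tau_{10} = 0$: -2 Res Log Likelihood (or deviance) = 2509.873 (Choose this)

Model with $\tau_{01} = \tau_{10} \neq 0$: -2 Res Log Likelihood (or deviance) = 2509.873

$\chi^2(1) = .00018$, $p = .99$

The two deviances agree to three decimal places; the difference is only .00018, so the simpler model fits just as well.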
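The p-value for this likelihood ratio test can be reproduced with the standard library alone. The sketch below relies on the fact that, for 1 degree of freedom, the upper-tail chi-square probability equals `erfc(sqrt(x / 2))`; the helper name `chi2_sf_df1` is an assumption made for the example.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Deviance difference between the diagonal and unstructured G models,
# as reported on the slide (both deviances print as 2509.873).
chi_sq = 0.00018
p_value = chi2_sf_df1(chi_sq)

print(f"chi-square(1) = {chi_sq}, p = {p_value:.2f}")
```

A p-value this close to 1 gives no reason to keep the intercept-slope covariance, which is why the diagonal G is retained.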

Page 33: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Compare to the model with $\tau_{11} = 0$

SPSS syntax:

  /RANDOM = INTERCEPT | SUBJECT(Subid) COVTYPE(DIAG)

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & 0 \end{bmatrix}$

Page 34: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

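Model with $\tau_{01} = \tau_{10} = 0$, $\tau_{11} \neq 0$:

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & \tau_{11} \end{bmatrix}$, -2 Res Log Likelihood = 2509.873 (Choose this)

Model with $\tau_{11} = \tau_{01} = \tau_{10} = 0$:

$G = \begin{bmatrix} \tau_{00} & 0 \\ 0 & 0 \end{bmatrix}$, -2 Res Log Likelihood = 2524.387

$\chi^2(1) = 14.51$, $p < .001$ (halved p-value)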
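The test of the slope variance can be reproduced from the two deviances. The sketch below (plain Python, illustrative variable names) computes the chi-square difference and then halves the 1-df p-value, as the slide indicates. Halving is the usual adjustment when the tested variance sits on the boundary of its parameter space (a variance cannot be negative), so the reference distribution is a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom.

```python
import math

deviance_reduced = 2524.387  # -2 Res Log Likelihood, model with tau11 = 0
deviance_full = 2509.873     # -2 Res Log Likelihood, model with tau11 estimated

chi_sq = deviance_reduced - deviance_full

# Upper-tail chi-square probability for 1 df, then the boundary correction.
p_naive = math.erfc(math.sqrt(chi_sq / 2.0))
p_halved = p_naive / 2.0

print(f"chi-square(1) = {chi_sq:.2f}")
print(f"naive p = {p_naive:.6f}, halved p = {p_halved:.6f}")
print("Keep the random slope:", p_halved < 0.05)
```

Both versions of the p-value fall below .001, so the model that lets the slopes vary across students is retained.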

Page 35: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Result of the final model

Estimates of Fixed Effects (a)

Parameter                   Estimate    Std. Error   df       t        Sig.   95% CI Lower Bound   95% CI Upper Bound
Intercept ($\gamma_{00}$)   54.256090   1.836838     89.672   29.538   .000   50.606708            57.905472
time ($\gamma_{10}$)        2.376090    .641668      89.672   3.703    .000   1.101242             3.650938

a. Dependent Variable: Achieve.

Estimates of Covariance Parameters (a)

Parameter                                                                 Estimate    Std. Error   Wald Z   Sig.   95% CI Lower Bound   95% CI Upper Bound
Residual ($\sigma^2$)                                                     87.794973   9.591118     9.154    .000   70.872958            108.757380
Intercept + time [subject = subid]   Var: Intercept ($\tau_{00}$)   201.7136    39.133631    5.154    .000   137.910425           295.034860
                                     Var: time ($\tau_{11}$)        14.556515   4.964819     2.932    .003   7.459959             28.403928

a. Dependent Variable: Achieve.

Random Effect Covariance Structure (G) (a)

                    Intercept | subid   time | subid
Intercept | subid   201.7136            0
time | subid        0                   14.556515

Diagonal

a. Dependent Variable: Achieve.
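The final estimates also imply how the variances and correlations of a student's scores change over time. Under the fitted model with a diagonal G and independent residuals, the covariance between a student's scores at times t and s is τ00 + t·s·τ11, plus σ² when t = s. The sketch below evaluates that expression in plain Python; the function names are chosen for illustration.

```python
import math

# REML estimates from the final (diagonal G) model.
tau00 = 201.7136     # variance of the intercepts
tau11 = 14.556515    # variance of the slopes
sigma2 = 87.794973   # within-student residual variance
times = [0, 1, 2, 3]


def implied_cov(t, s):
    """Model-implied covariance of one student's scores at times t and s."""
    cov = tau00 + t * s * tau11   # contribution of the random intercept and slope
    if t == s:
        cov += sigma2             # the residual adds to the diagonal only
    return cov


def implied_corr(t, s):
    """Model-implied correlation between scores at times t and s."""
    return implied_cov(t, s) / math.sqrt(implied_cov(t, t) * implied_cov(s, s))


variances = [implied_cov(t, t) for t in times]

for t, v in zip(times, variances):
    print(f"Var(Ach at Time={t}) = {v:.2f}")
print(f"Corr(Time 0, Time 1) = {implied_corr(0, 1):.3f}")
print(f"Corr(Time 0, Time 3) = {implied_corr(0, 3):.3f}")
```

The variances grow with time and the correlations differ across pairs of occasions, a pattern that the compound symmetry structure of univariate ANOVA (equal variances, equal covariances) cannot represent.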

Page 36: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

• Q1. On average, is there any trend in math achievement over time?

$\widehat{Mathach}_{ti} = 54.26 + 2.38 \, Time_{ti}$

• Q2. Are there any differences between students in the trend of math achievement over time? (Or, do all students have the same trend of math achievement over time?)

$\tau_{00} = 201.71$, $\tau_{11} = 14.56$

• Q3. If yes to Q2, what causes the differences?

Page 37: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

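• Micro Level (Level 1):

$MA_{ti} = \beta_{0i} + \beta_{1i} \, Time_{ti} + e_{ti}$

(Variance of $e_{ti}$ = $\sigma^2$)

• Macro Level (Level 2):

$\beta_{0i} = \gamma_{00} + \gamma_{01} \, Comp_i + U_{0i}$

$\beta_{1i} = \gamma_{10} + \gamma_{11} \, Comp_i + U_{1i}$

(Variance of $U_{0i}$ = $\tau_{00}$; Variance of $U_{1i}$ = $\tau_{11}$)

• Combined Model:

$MA_{ti} = \gamma_{00} + \gamma_{01} \, Comp_i + \gamma_{10} \, Time_{ti} + \gamma_{11} \, Time_{ti} \times Comp_i + U_{0i} + U_{1i} \, Time_{ti} + e_{ti}$

Null hypothesis: The different teaching methods have the SAME effect on achievement over time

($H_0: \gamma_{11} = 0$)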

Page 38: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

$MA_{ti} = \gamma_{00} + \gamma_{01} \, Comp_i + \gamma_{10} \, Time_{ti} + \gamma_{11} \, Time_{ti} \times Comp_i + U_{0i} + U_{1i} \, Time_{ti} + e_{ti}$

• SPSS MIXED syntax:

MIXED mathach WITH Time Comp
  /METHOD = REML
  /FIXED = INTERCEPT Comp Time Time*Comp
  /RANDOM = INTERCEPT Time | SUBJECT(Subid) COVTYPE(DIAG)
  /PRINT = G SOLUTION TESTCOV.
EXECUTE.

Page 39: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Without Comp in the Macro models

Random Effect Covariance Structure (G) (a)

                    Intercept | subid   time | subid
Intercept | subid   201.7136            0
time | subid        0                   14.556515

Diagonal

a. Dependent Variable: Achieve.

With Comp in the Macro models

Random Effect Covariance Structure (G) (a)

                    Intercept | subid   time | subid
Intercept | subid   176.1636            0
time | subid        0                   9.813461

Diagonal

a. Dependent Variable: Achieve.

Page 40: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

WITHOUT "Comp" in the model:

$G = \begin{bmatrix} 201.71 & 0 \\ 0 & 14.56 \end{bmatrix}$

WITH "Comp" in the model:

$G = \begin{bmatrix} 176.16 & 0 \\ 0 & 9.81 \end{bmatrix}$

Proportion of variance in the intercept ($\tau_{00}$) explained by "Comp" = (201.71 - 176.16)/201.71 = .13 (or 13%)

Proportion of variance in the slope ($\tau_{11}$) explained by "Comp" = (14.56 - 9.81)/14.56 = .33 (or 33%)
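The two proportions can be recomputed directly from the unrounded G matrices. This is a small arithmetic sketch in plain Python; the function name `proportion_explained` is chosen for illustration.

```python
def proportion_explained(var_without, var_with):
    """Proportional reduction in a variance component after adding a predictor."""
    return (var_without - var_with) / var_without


# Variance components without and with Comp in the macro-level models.
tau00_without, tau00_with = 201.7136, 176.1636   # intercept variance
tau11_without, tau11_with = 14.556515, 9.813461  # slope variance

intercept_share = proportion_explained(tau00_without, tau00_with)
slope_share = proportion_explained(tau11_without, tau11_with)

print(f"Intercept variance explained by Comp: {intercept_share:.2f}")
print(f"Slope variance explained by Comp:     {slope_share:.2f}")
```

Teaching method accounts for a larger share of the differences in growth rates than of the differences in initial status.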

Page 41: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Solution for Fixed Effects

Effect      Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept   50.3769    2.4764           76    20.34     <.0001
time        0.5756     0.8445           232   0.68      0.4962
computer    7.7583     3.5021           76    2.22      0.0297
time*comp   3.6009     1.1943           232   3.02      0.0029

$\widehat{Ach}_{ti} = 50.38 + 7.76 \, Comp_i + .58 \, Time_{ti} + 3.60 \, Comp_i \times Time_{ti}$

Page 42: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

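$\widehat{Ach}_{ti} = 50.38 + 7.76 \, Comp_i + .58 \, Time_{ti} + 3.60 \, Comp_i \times Time_{ti}$

Overall model for students in the Lecture method group:

$\widehat{Mathach}_{ti} = 50.38 + .58 \, Time_{ti}$

Overall model for students in the Computer method group:

$\widehat{Mathach}_{ti} = 58.14 + 4.18 \, Time_{ti}$

Random effect:

$G = \begin{bmatrix} 176.16 & 0 \\ 0 & 9.81 \end{bmatrix}$

$V(e_{ti}) = \sigma^2 = 90.00$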
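The group-specific lines follow from setting Comp to 0 (Lecture) or 1 (Computer) in the combined fixed-effects equation. The sketch below does that substitution with the estimates from the Solution for Fixed Effects table; the function names and the 0/1 coding of Comp are assumptions consistent with the slide's equations.

```python
# Fixed-effect estimates from the Solution for Fixed Effects table.
coef = {
    "intercept": 50.3769,
    "comp": 7.7583,
    "time": 0.5756,
    "time_x_comp": 3.6009,
}


def group_line(comp):
    """Return (intercept, slope) of the overall model for Comp = 0 or 1."""
    intercept = coef["intercept"] + coef["comp"] * comp
    slope = coef["time"] + coef["time_x_comp"] * comp
    return intercept, slope


def predict(comp, time):
    """Predicted achievement for a group at a given time (in years)."""
    intercept, slope = group_line(comp)
    return intercept + slope * time


for label, comp in (("Lecture", 0), ("Computer", 1)):
    intercept, slope = group_line(comp)
    print(f"{label}: Ach = {intercept:.2f} + {slope:.2f} * Time")
    print("  predictions at Time 0..3:",
          [round(predict(comp, t), 2) for t in range(4)])
```

The difference between the two slopes is the interaction coefficient (3.60), which is the quantity tested by the null hypothesis H0: γ11 = 0.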

Page 43: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

[Figure: Achievement (y-axis) against Time in years (x-axis), with one line for the Computer group and one for the Lecture group. Time = 0 is the immediate posttest measure.]

Page 44: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Conclusion

• Advantages of using MLM over traditional ANOVA approaches for analyzing longitudinal data:
  – 1. Can flexibly model the variance function
  – 2. Retains the meaning of the random effects ($\tau_{00}$, $\tau_{11}$)
  – 3. Can explore factors that predict individual differences in change over time (e.g., treatment effect)
  – 4. Takes both unequal spacing and missing data into account

Page 45: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

Take Home Exercise

A clinical psychologist wants to examine the impact of the stress level of each family member (STRESS) on his/her level of symptomatology (SYMPTOM). There are 100 families, and families vary in size from three to eight members. The total number of participants is 400.

a) Can you write out the model? (Hint: What is in the micro model? What is in the macro model?)

b) Can you write out the syntax (SPSS) to analyze this model?

Page 46: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?


c) In designing the study, what possible macro predictors do you think the clinical psychologist should include in her study? (e.g. family size?)

d) In designing the study, what possible micro predictors do you think the clinical psychologist should include in her study? (e.g. participant’s neuroticism?)

e) Can you write out the model? (Hint: What is in the micro model? What is in the macro model?)

f) Can you write out the syntax (SPSS) to analyze this model?

Page 47: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

b) SYMPTOMij = γ00 + γ10 STRESSij + U0j + U1j STRESSij + eij

SPSS syntax:

MIXED Symptom WITH Stress
  /FIXED = INTERCEPT Stress
  /RANDOM = INTERCEPT Stress | SUBJECT(Family) COVTYPE(UN)
  /PRINT = G SOLUTION TESTCOV.
EXECUTE.

Page 48: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?


a) Micro-level model:

SYMPTOMij = β0j + β1j STRESSij + eij

Macro-level model:

β0j = γ00 + U0j

β1j = γ10 + U1j

Combined model:

SYMPTOMij = γ00 + γ10 STRESSij + U0j + U1j STRESSij + eij

Page 49: Longitudinal Data Analysis: Why and How to Do it With  Multi-Level Modeling (MLM)?

THE END!

THANK YOU!