G89.2247 Lecture 1 1 G89.2247 Session 12 • Analyses with missing data • What should be reported? Hoyle and Panter McDonald and Moon-Ho (2002)

Upload: ezra-wheeler

Post on 13-Jan-2016


G89.2247 Session 12

• Analyses with missing data

• What should be reported?
    Hoyle and Panter
    McDonald and Moon-Ho (2002)


Missing Data in SEM

• Data can be missing for a variety of reasons
    Study design (planned nesting)
    Longitudinal studies
    Random events
        • Accidents, fire alarms, blackouts
    Systematic nonresponse
        • Refusals
        • Dropouts


Missing Data Mechanisms

• Terms suggested by Rubin
    Rubin (1976), Little & Rubin (1987)

• MISSING COMPLETELY AT RANDOM (MCAR)
    Which data point is missing cannot be predicted by any variable, measured or unmeasured.
        • Prob(M|Y) = Prob(M)
    The missing data pattern is ignorable.
    Analyzing available complete data is just fine.


Missing Data Mechanisms

• MISSING AT RANDOM (MAR)
    Which data point is missing is systematically related to subject characteristics, but these are all measured.
        • Conditional on observed variables, missingness is random
        • Prob(M|Y) = Prob(M|Y_observed)
    E.g., lower-educated respondents might not answer a certain question.
    Missingness can be treated as ignorable.


Missing Data Mechanisms

• NOT MISSING AT RANDOM (NMAR)
    Data are missing because of a process related to the value that is unavailable.
        • Someone was too depressed to come in and report on their depression
        • An abused woman is not allowed to meet the interviewer
    The missing data pattern is not ignorable.
    Whether missing data are MAR or NMAR cannot usually be established empirically.
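
To make the three mechanisms concrete, the following is a minimal simulation sketch, not part of the original lecture. The variable names, the logistic missingness models, and all coefficients are illustrative assumptions. It generates an always-observed covariate x and an outcome y with true mean 0, deletes y under each mechanism, and compares the complete-case mean with a regression-based estimate that conditions on x.

```python
import numpy as np

rng = np.random.default_rng(2247)
n = 20_000
x = rng.standard_normal(n)                    # always observed (e.g., education)
y = 0.6 * x + 0.8 * rng.standard_normal(n)    # outcome; true mean is 0

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def drop(values, prob_missing, rng):
    """Copy `values`, setting each entry to NaN with its own probability."""
    out = values.copy()
    out[rng.random(values.size) < prob_missing] = np.nan
    return out

# Hypothetical missingness models, chosen only for illustration.
y_mcar = drop(y, np.full(n, 0.3), rng)          # Prob(M|Y) = Prob(M)
y_mar = drop(y, logistic(2.0 * x - 1.0), rng)   # depends only on observed x
y_nmar = drop(y, logistic(2.0 * y - 1.0), rng)  # depends on the missing value itself

def regression_mean(x, y_obs):
    """Estimate E[y] by fitting y ~ x on complete cases and predicting for everyone."""
    ok = ~np.isnan(y_obs)
    slope, intercept = np.polyfit(x[ok], y_obs[ok], 1)
    return float(np.mean(intercept + slope * x))

for name, y_obs in [("MCAR", y_mcar), ("MAR", y_mar), ("NMAR", y_nmar)]:
    print(name,
          "complete-case mean:", round(float(np.nanmean(y_obs)), 3),
          "regression-based mean:", round(regression_mean(x, y_obs), 3))
```

In this setup the complete-case mean should be close to 0 only under MCAR. Conditioning on x should recover the mean under MAR but not under NMAR, which is the sense in which MAR missingness is ignorable once the observed predictors are used.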


Approaches to Missing Data

• Listwise deletion
    If a person is missing on any analysis variable, that person is dropped from the analysis.

• Pairwise deletion
    Correlations/covariances are computed using all available pairs of data.

• Imputation of missing data values

• Model-based use of complete data
    EM (expectation-maximization approach)

• SEM-based FIML (full information maximum likelihood)
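
The difference between listwise and pairwise deletion can be sketched in a few lines. The data matrix and function names below are hypothetical, chosen only so the counts are easy to check by hand. Listwise deletion keeps only fully observed rows, while pairwise deletion computes each covariance from every row in which both variables are observed.

```python
import numpy as np

def listwise_cov(X):
    """Covariance matrix from rows with no missing values, plus the rows used."""
    complete = ~np.isnan(X).any(axis=1)
    return np.cov(X[complete], rowvar=False), int(complete.sum())

def pairwise_cov(X):
    """Covariance matrix where each entry uses all rows observed on that pair."""
    p = X.shape[1]
    cov = np.empty((p, p))
    counts = np.empty((p, p), dtype=int)
    for j in range(p):
        for k in range(j, p):
            both = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])
            xj, xk = X[both, j], X[both, k]
            c = np.sum((xj - xj.mean()) * (xk - xk.mean())) / (both.sum() - 1)
            cov[j, k] = cov[k, j] = c
            counts[j, k] = counts[k, j] = both.sum()
    return cov, counts

# Hypothetical data: 5 people, 3 variables, two scattered missing values.
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, np.nan, 1.0],
    [3.0, 4.0, np.nan],
    [4.0, 5.0, 6.0],
    [5.0, 7.0, 8.0],
])

cov_lw, n_lw = listwise_cov(X)
cov_pw, n_pw = pairwise_cov(X)
print("listwise rows used:", n_lw)
print("pairwise rows used per entry:")
print(n_pw)
```

Because each pairwise entry rests on a different subset of people, the resulting matrix is not guaranteed to be positive definite, which is one practical reason to prefer the model-based approaches.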


EM and FIML

• Use available data to infer sample moment matrix.

• Uses information from assumed multivariate distribution

• Patterns of associations can be structured or unstructured.

• Now implemented in AMOS, EQS, Mplus
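
As an illustration of how available data and an assumed multivariate normal distribution yield a moment matrix, here is a minimal EM sketch for an unstructured mean vector and covariance matrix. It is not the routine used by AMOS, EQS, or Mplus. The function name `em_mvn`, the starting values, the stopping rule, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def em_mvn(X, n_iter=500, tol=1e-8):
    """EM estimates of the mean vector and covariance matrix of a
    multivariate normal from data containing NaNs (assumes MCAR or MAR)."""
    X = np.asarray(X, dtype=float)
    X = X[~np.isnan(X).all(axis=1)]        # rows with nothing observed add no information
    n, p = X.shape
    miss = np.isnan(X)
    patterns = np.unique(miss, axis=0)     # distinct missing-data patterns
    mu = np.nanmean(X, axis=0)             # start from available-case moments
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        # E-step: expected values and residual covariance of the missing entries
        filled = np.where(miss, 0.0, X)
        extra = np.zeros((p, p))
        for m in patterns:
            if not m.any():
                continue
            o = ~m
            rows = (miss == m).all(axis=1)
            beta = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            filled[np.ix_(rows, m)] = mu[m] + (X[np.ix_(rows, o)] - mu[o]) @ beta.T
            resid = sigma[np.ix_(m, m)] - beta @ sigma[np.ix_(o, m)]
            extra[np.ix_(m, m)] += rows.sum() * resid
        # M-step: update the moments from the expected sufficient statistics
        new_mu = filled.mean(axis=0)
        centered = filled - new_mu
        new_sigma = (centered.T @ centered + extra) / n
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma

# Simulated example: y is missing whenever the observed x exceeds 0.5 (MAR).
rng = np.random.default_rng(7)
n = 2000
x = rng.standard_normal(n)
y = 0.7 * x + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal(n)
y_obs = np.where(x > 0.5, np.nan, y)
mu_em, sigma_em = em_mvn(np.column_stack([x, y_obs]))
listwise_mean_y = float(np.nanmean(y_obs))
print("EM mean of y:", round(float(mu_em[1]), 3))
print("listwise mean of y:", round(listwise_mean_y, 3))
```

The estimated moments could then be passed to an SEM program as if they were a sample mean vector and covariance matrix. FIML instead maximizes the casewise likelihood directly under the structured model.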


Example of CFA with Means Model

Parameter                Complete (n=400)    Listwise            FIML (EQS)
                         Est.      SE        Est.      SE        Est.      SE
V1 = factor 1     1      1.000               1.000               1.000
V2 = factor 1     .9     0.894     0.0060    0.965     0.0270    0.900     0.0060
V3 = factor 1     .9     0.901     0.0060    0.996     0.0290    0.915     0.0060
V4 = factor 1     .8     0.800     0.0060    0.890     0.0240    0.808     0.0060
V5 = factor 1     .8     0.798     0.0050    0.889     0.0230    0.807     0.0050
V6 = factor 1     .7     0.690     0.0050    0.751     0.0240    0.693     0.0060
V7 = factor 2     1      1.000               1.000               1.000
V8 = factor 2     .9     0.910     0.0110    0.941     0.0440    0.903     0.0110
V9 = factor 2     .9     0.907     0.0120    0.957     0.0500    0.899     0.0130
V10 = factor 2    .8     0.815     0.0110    0.838     0.0430    0.811     0.0110
V11 = factor 2    .7     0.707     0.0110    0.702     0.0330    0.702     0.0120
V12 = factor 2    .5     0.514     0.0090    0.523     0.0370    0.508     0.0100
F1 = mean         100    99.174    0.6910    83.484    2.4660    98.438    0.6380
F2 = mean         50     48.765    0.7010    42.629    2.4520    48.999    0.6410
D1-F1 variance    100    105.250   8.7600    96.575    29.9810   115.293   9.4120
D2-F2 variance    100    118.870   9.9000    117.810   36.1000   119.463   9.8380
D2-F2 covariance  60     70.570    7.4200    55.810    25.4400   71.273    7.6540


Missing Pattern Group Approach

• Suppose that one group is missing a whole set of items related to a latent variable.
    This group can be defined as a separate stratum.
    The effects for the missing variables can be constrained to be equal to the effects estimated in the group with complete data.

• This can be tedious, but it gives FIML results.

• See Enders & Bandalos (2001). The relative performance of FIML for missing data in SEM. Structural Equation Modeling, 8: 430-457.


Multiple Imputation

• Substitute expected values plus noise for missing values.

• Repeat >5 times.

• Combine estimates and standard errors using formulas described by Rubin (1987). See also Schafer & Graham (2002). Missing data: Our view of the state of the art. Psychological Methods, 7: 147-177.
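
The "expected values plus noise" step can be sketched as stochastic regression imputation. This is a simplified illustration and not a full multiple-imputation procedure: it reuses the same fitted regression for every imputation instead of also drawing the regression parameters, and the helper `impute_once`, the simulated data, and the choice of K = 10 are assumptions made for the example.

```python
import numpy as np

def impute_once(x, y, rng):
    """Fill missing y with its regression prediction from x plus random noise."""
    obs = ~np.isnan(y)
    slope, intercept = np.polyfit(x[obs], y[obs], 1)
    resid_sd = np.std(y[obs] - (intercept + slope * x[obs]), ddof=2)
    y_imp = y.copy()
    n_mis = int((~obs).sum())
    # expected value (prediction) plus noise with the residual standard deviation
    y_imp[~obs] = intercept + slope * x[~obs] + rng.normal(0.0, resid_sd, n_mis)
    return y_imp

# Simulated data: true mean of y is 2.0; y is missing when x > 0.3 (MAR).
rng = np.random.default_rng(12)
n = 500
x = rng.standard_normal(n)
y_full = 2.0 + 0.5 * x + rng.standard_normal(n)
y = np.where(x > 0.3, np.nan, y_full)

K = 10  # number of imputed data sets
completed = [impute_once(x, y, rng) for _ in range(K)]

# Analyze each completed data set separately; here the "analysis" is the mean of y.
means = np.array([c.mean() for c in completed])
ses = np.array([c.std(ddof=1) / np.sqrt(n) for c in completed])
print("estimates per imputation:", np.round(means, 3))
print("standard errors per imputation:", np.round(ses, 3))
```

Each completed data set gives its own estimate and standard error. The spread of the estimates across the K data sets reflects the uncertainty due to the missing values, which is what the combining formulas take into account.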


Inference from Multiple Imputation

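• Rubin (1987) recommends computing for each regression weight
    An average across the K imputations

    $$\bar{B} = \frac{1}{K} \sum_{k=1}^{K} B_k$$

    An estimate of the standard error that takes into account the variation over imputations

    $$S_{\bar{B}} = \sqrt{\frac{1}{K} \sum_{k=1}^{K} S_k^2 + \left(1 + \frac{1}{K}\right) \frac{1}{K-1} \sum_{k=1}^{K} \left(B_k - \bar{B}\right)^2}$$

    Here B_k and S_k are the estimate and its standard error from the k-th imputed data set.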
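
Rubin's (1987) combining rules for a single regression weight can be sketched as a short function. The pooled estimate is the average of the K estimates, and the pooled variance is the average squared standard error plus (1 + 1/K) times the between-imputation variance of the estimates. The function name `pool_rubin` and the input numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def pool_rubin(estimates, std_errors):
    """Combine K estimates and standard errors from multiply imputed data."""
    b = np.asarray(estimates, dtype=float)
    s = np.asarray(std_errors, dtype=float)
    k = b.size
    if k < 2:
        raise ValueError("need at least two imputations")
    b_bar = b.mean()                      # average across the K imputations
    within = np.mean(s ** 2)              # average sampling variance
    between = b.var(ddof=1)               # variation of estimates over imputations
    total = within + (1.0 + 1.0 / k) * between
    return float(b_bar), float(np.sqrt(total))

# Hypothetical results for one regression weight from K = 5 imputed data sets.
pooled_b, pooled_se = pool_rubin(
    [0.52, 0.47, 0.55, 0.50, 0.49],
    [0.10, 0.11, 0.10, 0.09, 0.10],
)
print("pooled estimate:", round(pooled_b, 3))
print("pooled standard error:", round(pooled_se, 4))
```

The pooled standard error is never smaller than the root mean square of the per-imputation standard errors, because the between-imputation term can only add variance.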


Communicating SEM Results

• Keeping up with the expert recommendations
    Psychological Methods
    Specialty journals
        • Structural Equation Modeling
        • Multivariate Behavioral Research
        • Applied Psychological Measurement
        • Psychometrika

• Two kinds of audiences
    Researchers interested in the substance of the empirical contribution
    Experts in SEM


Talking Points of Hoyle & Panter, McDonald & Ho

• Model specification
    Theoretical justification
    Identifiability
        • Measurement model
        • Structural model

• Model estimation
    Characteristics of data
        • Distribution form
        • Sample size
        • Missing data


Talking Points of Hoyle & Panter, McDonald & Ho

• Model estimation
    Estimation method: ML, GLS, ULS, ADF
    Goodness of estimates and standard errors

• Model selection and fit statistics

• Alternative and equivalent models

• Reporting results
    Path diagrams
    Tabular information
    Use software conventions?