Estimating and Testing Mediation
David A. Kenny
University of Connecticut http://davidakenny.net/cm/MediationN.ppt
Overview
• Introduction
• The Four-Step Approach
• The Modern Approach
• Test the Indirect Effect
• Power
• Assumptions
• DataToText
Introduction
Interest in Mediation
• Mentions of “mediation” or “mediator” in psychology abstracts:
– 1980: 36
– 1990: 122
– 2000: 339
– 2010: 1,198
Why the Interest in Mediation?
• Understand the mechanism
– theoretical concerns
– cost and efficiency concerns
• Find more proximal endpoints
• Understand why the intervention did not work
– finding the missing link
– compensatory processes
The Beginning Model: Basic Research
The Mediational Model
The Beginning Model: Applied Research
The Four Paths
• X → Y: path c
• X → M: path a
• M → Y (controlling for X): path b
• X → Y (controlling for M): path c′
(standardized or unstandardized)
The Four-Step Approach
In the 1980s Different Pairs of Researchers Proposed a Series of Steps to Test Mediation
• Judd & Kenny (1981)
• James & Brett (1984)
• Baron & Kenny (1986)
The Four Steps
• Step 1: X → Y (test path c)
• Step 2: X → M (test path a)
• Step 3: M (and X) → Y (test path b)
• Step 4: X (and M) → Y (test path c′)
Note that Steps 3 and 4 use the same regression equation.
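The four paths can be estimated with three ordinary least squares regressions. The sketch below is only an illustration: the data are simulated with made-up effect sizes (they are not the Morse et al. data), and the helper name `ols` is hypothetical. It estimates the coefficients only and does not run the significance tests.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

# Simulated data with hypothetical true values a = 1.5, b = 0.5, c' = 0.3.
rng = np.random.default_rng(0)
n = 500
x = rng.integers(0, 2, size=n).astype(float)      # 0/1 treatment indicator
m = 2.0 + 1.5 * x + rng.normal(size=n)            # mediator
y = 1.0 + 0.5 * m + 0.3 * x + rng.normal(size=n)  # outcome

c = ols(y, x)[1]              # Step 1: X -> Y
a = ols(m, x)[1]              # Step 2: X -> M
_, c_prime, b = ols(y, x, m)  # Steps 3 and 4: one regression gives c' and b

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
```

Steps 3 and 4 come from the same call because b and c′ are the two slopes of one equation.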
Total and Partial Mediation
• Total Mediation
– Meet Steps 1, 2, and 3 and find that c′ equals zero.
• Partial Mediation
– Meet Steps 1, 2, and 3 and find that c′ is smaller in absolute value than c.
Example Dataset
• Morse et al.
– J. of Community Psychology, 1994
– treatment → housing contacts → days of stable housing
– persons randomly assigned to treatment groups
– 109 people
(davidakenny.net/papers/twmr/example.txt)
Variables in the Example
• Treatment
– 1 = treated (intensive case management)
– 0 = treatment as usual
• Housing Contacts: total number of contacts during the 9 months after the intervention began
• Stable Housing
– days per month with adequate housing (0 to 30)
– averaged over 7 months, from month 10 to month 16 after the intervention began
Morse et al. Example
Step 1: X → Y            c = 6.558, p = .009
Step 2: X → M            a = 5.502, p = .013
Step 3: M (and X) → Y    b = 0.466, p < .001
Step 4: X (and M) → Y    c′ = 3.992, p = .090
When Can We Conclude Complete Mediation?
When c′ is not statistically different from zero, but c is? NOT REALLY
Ideally when a and b are substantial and c′ is small, not just non-significant.
Standardized vs. Unstandardized
For almost everything in this presentation, the variables can be standardized, making the coefficients betas.
The key thing is to be consistent: If M is standardized in Step 2, it must also be standardized in Step 3.
Presenting Mediation
The Modern Approach
Dissatisfaction with the Steps Approach
• Low power in the test of Step 1
• No single measure and test of mediation
• Work by MacKinnon, Hayes, Preacher, and others has led to the “Modern Approach.”
Decomposition of Effects
Total Effect = Direct Effect + Indirect Effect
c = c′ + ab
(This equality holds exactly for multiple regression, but not necessarily for other estimation methods.)
The Indirect Effect or ab provides one number that summarizes the amount of mediation.
Example: 6.558 = 3.992 + (5.502)(0.466), within rounding
And 100(2.56/6.56) = 39% of the total effect is explained (ab/c or equivalently 1 - c′/c).
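The arithmetic of the decomposition can be checked directly from the four reported coefficients. This is just a worked calculation using the rounded values from the Morse et al. example; the variable names are illustrative.

```python
# Reported (rounded) coefficients from the Morse et al. example.
c, a, b, c_prime = 6.558, 5.502, 0.466, 3.992

ab = a * b                                # indirect effect
total_from_parts = c_prime + ab           # should reproduce c up to rounding
pct_ab_over_c = 100 * ab / c              # percent mediated as ab/c
pct_one_minus = 100 * (1 - c_prime / c)   # percent mediated as 1 - c'/c

print(f"ab = {ab:.3f}")
print(f"c' + ab = {total_from_parts:.3f} (reported c = {c})")
print(f"percent mediated: {pct_ab_over_c:.1f}% or {pct_one_minus:.1f}%")
```

The two percentages differ only in the first decimal place because the inputs are rounded to three decimals.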
Two Sides of the Same Coin
Note that ab = c - c′
That is, the indirect effect exactly equals the amount of the reduction in the total effect (c) after the mediator is introduced.
Some papers (e.g., Baron & Kenny) emphasize one side and others emphasize the other. Current work in mediation is now focused on the “ab” side.
Estimating the Total Effect (c)
The total effect or c can be inferred from direct and indirect effect as c′ + ab.
We need not perform the Step 1 regression to estimate c.
This can be useful in situations when c does not exactly equal c′ + ab.
Inconsistent Mediation
• ab and c′ have different signs
• X as a “suppressor” variable
• Example: Stress and Mood with Coping as a Mediator
• Consequences
– The Total Effect or c may not be significant
– Percent mediated greater than 100%
• Do we have mediation?
– Yes. There is an indirect effect (ab > 0).
– No. There is no effect that is “mediated.”
What to Present
Decomposition of Effects
Total Effect
Indirect Effect(s)
Direct Effects
Less of an emphasis on complete versus partial mediation.
Test of the Indirect Effect
Work on Determining How to Test ab = 0
• The key piece of information in the modern approach is the indirect effect.
• Many researchers developed approaches to this problem.
Strategies to Test ab = 0
• Test a and b separately
• Sobel test
• Bootstrapping
• Monte Carlo Method
Test a and b Separately
• Easy to do
• Works fairly well
• Does not provide a confidence interval for ab.
• Seems too much like the old-fashioned Four-Step Approach.
Sobel Test of Mediation
Compute the standard error of ab, which is denoted as sab:
$s_{ab} = \sqrt{a^2 s_b^2 + b^2 s_a^2}$
Note that sa and sb are the standard errors of a and b, respectively; ta = a/sa and tb = b/sb.
Divide ab by sab and treat that value as a Z.
So if ab/sab is greater than 1.96 in absolute value, reject the null hypothesis that the indirect effect is zero.
Example
a = 5.502 and b = 0.466
sa = 2.182 and sb = 0.100
ab = 2.56; sab = 1.157
Sobel test Z is 2.218, p = .027
We conclude that the indirect effect is statistically different from zero.
Website: http://www.people.ku.edu/~preacher/sobel/sobel.htm
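The Sobel calculation is short enough to do by hand or in a few lines of code. The sketch below uses only the Python standard library; the function name `sobel_test` is illustrative and is not taken from the website above.

```python
from math import sqrt
from statistics import NormalDist

def sobel_test(a, s_a, b, s_b):
    """Return (ab, s_ab, Z, two-sided p) for the Sobel test of the indirect effect."""
    ab = a * b
    s_ab = sqrt(a**2 * s_b**2 + b**2 * s_a**2)  # standard error of ab
    z = ab / s_ab
    p = 2 * (1 - NormalDist().cdf(abs(z)))      # treat Z as standard normal
    return ab, s_ab, z, p

# Values from the example: a = 5.502 (sa = 2.182), b = 0.466 (sb = 0.100).
ab, s_ab, z, p = sobel_test(5.502, 2.182, 0.466, 0.100)
print(f"ab = {ab:.2f}, s_ab = {s_ab:.3f}, Z = {z:.3f}, p = {p:.3f}")
```

Because the inputs are rounded to three decimals, sab can differ from the reported value in the last digit.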
The distribution of ab is highly skewed which lowers the power of the test.
Large values of “ab” are more variable than small values (i.e., 0).
Bootstrapping
• “Nonparametric” way of computing a sampling distribution.
• Re-sampling (with replacement)
• Many trials (computationally intensive)
• Current thinking is NOT to correct for bias (i.e., the mean of the bootstrap estimates differs slightly from the estimate).
• Compute a confidence interval, which is asymmetric.
• Results change slightly from run to run because the interval is empirically derived.
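A percentile bootstrap of ab can be sketched as follows. This is a minimal illustration, not the Hayes & Preacher macro: the data are simulated with made-up effect sizes, the function names are hypothetical, and no bias correction is applied, in line with the bullet above.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Estimate ab: a from M on X, b from Y on X and M."""
    a = np.polyfit(x, m, 1)[0]                         # slope of M on X
    design = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design, y, rcond=None)[0][2]   # slope of M in Y equation
    return a * b

def bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Percentile confidence interval for ab (no bias correction)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)               # resample cases with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    tail = 100 * (1 - level) / 2
    lower, upper = np.percentile(estimates, [tail, 100 - tail])
    return lower, upper

# Simulated data (hypothetical true values a = 0.5, b = 0.4, c' = 0.2).
rng = np.random.default_rng(1)
n = 200
x = rng.integers(0, 2, size=n).astype(float)
m = 0.5 * x + rng.normal(size=n)
y = 0.4 * m + 0.2 * x + rng.normal(size=n)

ab_hat = indirect_effect(x, m, y)
lower, upper = bootstrap_ci(x, m, y)
print(f"ab = {ab_hat:.3f}, 95% percentile CI = [{lower:.3f}, {upper:.3f}]")
```

Changing `seed` changes the interval slightly, which is the run-to-run variation noted above.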
Results of Bootstrapping
95% Percentile Confidence Interval:
Lower     Upper
.5325     5.0341
Note that the CI is asymmetric for an estimate of 2.598. Also, values differ due to sampling error.
(Done using the Hayes & Preacher macro from http://www.afhayes.com/spss-sas-and-mplus-macros-and-code.html.)
Monte Carlo Method
• Save a, b, sa (the standard error of a), and sb from the mediation analysis.
• For some methods you also have the covariance of a and b, or sab, to save.
• Use them and the assumption of normality to generate estimates of a and b and so ab.
• Use this distribution of ab’s to get a confidence interval or a p value.
• Less computationally intensive than bootstrapping, so many trials are feasible.
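The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the Selig & Preacher program: the function name `monte_carlo_ci` is hypothetical, and the covariance of a and b is set to zero by default because it is not reported for the example.

```python
import numpy as np

def monte_carlo_ci(a, s_a, b, s_b, cov_ab=0.0, n_trials=20000, level=0.95, seed=0):
    """Monte Carlo confidence interval for ab, assuming a and b are normal."""
    rng = np.random.default_rng(seed)
    mean = [a, b]
    cov = [[s_a**2, cov_ab], [cov_ab, s_b**2]]
    draws = rng.multivariate_normal(mean, cov, size=n_trials)
    ab = draws[:, 0] * draws[:, 1]                 # simulated indirect effects
    tail = 100 * (1 - level) / 2
    lower, upper = np.percentile(ab, [tail, 100 - tail])
    return lower, upper

# Values from the example: a = 5.502 (sa = 2.182), b = 0.466 (sb = 0.100).
lower, upper = monte_carlo_ci(5.502, 2.182, 0.466, 0.100)
print(f"95% Monte Carlo CI = [{lower:.3f}, {upper:.3f}]")
```

Only four summary numbers are needed, so no raw data have to be resampled.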
Results of the Monte Carlo Method
95% Monte Carlo Confidence Interval (20,000 trials):
Lower Upper
0.551 5.158
Done using the Selig & Preacher web program at http://www.quantpsy.org/medmc/medmc.htm; if a and b are correlated, use Tofighi’s RMediation at http://www.amp.gatech.edu/RMediation
Power
Power of the Test of c
If c′ is zero, then c equals ab.
If both a and b have a moderate effect size, then c has a “smaller than small” effect size (assuming c′ is zero).
Thus, it is quite possible for the tests of a and b to be statistically significant while the test of c is not.
Note that for N = 100, if a and b have moderate effect sizes (r = .3) and c′ = 0, the power of the test of c is only .14 whereas the power for a is .87 and b is .83.
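The power figures for c and a can be roughly reproduced with the Fisher z approximation for a correlation. That approximation is an assumption of this sketch and may not be the method used for the slide, so small differences are expected; the power for b is not computed here because it also depends on the correlation between X and M.

```python
from math import atanh, sqrt
from statistics import NormalDist

def power_correlation(r, n, alpha=0.05):
    """Approximate two-sided power for testing a correlation r with n cases."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    ncp = atanh(r) * sqrt(n - 3)   # Fisher z divided by its standard error
    return (1 - z.cdf(crit - ncp)) + z.cdf(-crit - ncp)

power_a = power_correlation(0.3, 100)         # path a, r = .3
power_c = power_correlation(0.3 * 0.3, 100)   # c = ab = .09 when c' = 0
print(f"power for a = {power_a:.2f}, power for c = {power_c:.2f}")
```

With c′ = 0 the total effect is only .09, which is why its power is so much lower than that of either path.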
Relative Power of c and ab
Several authors, as early as Cox (1960), have noted that the test of the indirect effect has much more power than the test of c, even when the two effects are equal, as they are when c′ is zero.
Kenny and Judd (2014) show that sometimes the test of c needs 75 times the number of cases to have the same power as the test of ab!
Relative Power of the Test of c and c′ when b = 0
When b = 0, c = c′ and in this case c might be statistically significant, yet c′ might not be significant even though the two are equal. Why?
There is multicollinearity between X and M due to path a.
Note that as M becomes a more successful mediator and path a gets larger, multicollinearity becomes more of an issue in the testing of c′ (and b).
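One way to see the cost of multicollinearity is the variance inflation factor. With two predictors, X and M, whose correlation equals the standardized path a, the sampling variance of each slope is multiplied by 1/(1 - a²) relative to uncorrelated predictors, other things being equal. The sketch below just tabulates that factor; it is an illustration, not a full power calculation.

```python
def variance_inflation(a):
    """Variance inflation factor for two predictors correlated at a."""
    return 1.0 / (1.0 - a**2)

for a in (0.1, 0.3, 0.5):
    print(f"a = {a:.1f}: sampling variance multiplied by {variance_inflation(a):.3f}")
```

The factor grows slowly at first and then quickly, so large values of a are the ones that hurt the tests of b and c′.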
Power and the Test of b
Let b = .3 (standardized). What happens to the N needed to have 80% power as path a gets larger:
a      N
.1     86
.3     93
.5     112
(a is standardized; sample size needed for 80% power to reject the null hypothesis that b = 0)
Conclusion: Power of the test of b declines as a increases.
Tool MedPowR
R-based program to help in power analyses.
Can either give the power for a given value of N and a, b, and c’ or give the N needed to achieve a desired level of power.
Program can be downloaded at
http://davidakenny.net/progs/PowMed.R
Power calculations have begun...
Effect   Size   Power   N
c        .390   .986    100
a        .300   .868    100
b        .300   .890    100
c'       .300   .890    100
ab       .090   .773    100
Alpha for all power calculations set to .050.
Power calculations complete.
Assumptions
Taking Assumptions Seriously
• Older Mediation Analysis ignored the strong assumptions required for the analysis.
• Current work, especially that within the Causal Inference tradition, focuses much more on them.
• Researchers need to conduct “Sensitivity Analyses” or “What If?” analyses.
Assumptions: Multiple Regression
• Linearity
– Is a problem in the example; housing contacts has a quadratic effect on stable housing (p = .034).
• Normal Distribution of Errors
– Interval level of measurement of M and Y
• Equal Error Variance
• Independence
– No clustering
• X and M Do Not Interact to Cause Y
No XM Interaction: Linear Mediation
• Called “Moderation” in Baron & Kenny
• Add XM (and possibly other interaction terms, e.g., X²M) when explaining Y.
• Many contemporary analysts now see XM interaction as part of mediation.
• Not significant for the example (p = .476)
Causal Assumptions
• Perfect Reliability
– for M and X
• No Omitted Variables
– all common causes of M and Y, X and M, and X and Y measured and controlled
• No Reverse Causal Effects
– M and Y do not cause X
– Y does not cause M
(Guaranteed if X is manipulated.)
Basic Mediational Causal Model
[Path diagram: X → M (path a), M → Y (path b), X → Y (path c′); disturbances U1 and U2, each with a path fixed at 1]
Note that U1 and U2 are theoretical variables and not “errors” from a regression equation.
Unreliability
• Usually safe to assume that X is perfectly reliable.
• Measurement error in Y does not bias unstandardized regression coefficients.
• Measurement error in M is problematic.
Unreliability in M
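[Path diagram: the basic mediational model with paths a and b running through a latent M (MLatent); the measured M is an indicator of MLatent with measurement error Error1; disturbances U1 and U2; X → Y is path c′]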
Effect of Unreliability in M
• b is attenuated (closer to zero)
• c′ is inflated (given consistent mediation)
– more as a increases
– more as b increases
– Note that the bigger the indirect effect, the greater the bias in c′.
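A small simulation makes the direction of these biases concrete. Everything in this sketch is hypothetical: the true paths (a = .5, b = .5, c′ = .2), the reliability of .7, and the helper name `ols` are chosen only for illustration.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(design, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 100_000
a_true, b_true, c_prime_true = 0.5, 0.5, 0.2   # hypothetical true paths
reliability = 0.7                               # hypothetical reliability of M

x = rng.normal(size=n)
m_true = a_true * x + np.sqrt(1 - a_true**2) * rng.normal(size=n)  # Var(M) = 1
y = b_true * m_true + c_prime_true * x + rng.normal(size=n)

# Add measurement error so that Var(true M) / Var(observed M) = reliability.
error_sd = np.sqrt((1 - reliability) / reliability)
m_obs = m_true + error_sd * rng.normal(size=n)

_, cp_clean, b_clean = ols(y, x, m_true)   # M measured without error
_, cp_noisy, b_noisy = ols(y, x, m_obs)    # M measured with error

print(f"perfect M:    b = {b_clean:.3f}, c' = {cp_clean:.3f}")
print(f"unreliable M: b = {b_noisy:.3f}, c' = {cp_noisy:.3f}")
```

Raising `a_true` or `b_true` in the simulation increases the upward bias in c′, as the bullets above state.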
What to Do about Unreliability in M?
• Improve the reliability
• Adjust estimates using Structural Equation Modeling
• Conduct Sensitivity Analyses assuming different values of reliability.
Omitted Variables
[Path diagram: the basic mediational model (paths a, b, and c′; disturbances U1 and U2) plus an Omitted Variable with paths e and f]
Other Terms for an Omitted Variable
• Third variable.
• Confounder
– Term used in epidemiology
– Becoming increasingly popular
What is the Effect of Omitted Variables?
• Usually, but not always, the sign of ef is the same as b.
– Inflates the estimate of b
– Deflates the estimate of c′ (could produce inconsistent mediation).
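The same kind of simulation shows the omitted-variable bias. The sketch assumes, for illustration only, that path e runs from the omitted variable to M and path f from the omitted variable to Y, that X is randomized, and that the true paths are a = .5, b = .3, and c′ = .2.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(design, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 100_000
a_true, b_true, c_prime_true = 0.5, 0.3, 0.2   # hypothetical true paths
e, f = 0.4, 0.4                                 # hypothetical; ef has the same sign as b

x = rng.normal(size=n)         # X randomized, so unrelated to the omitted variable
omitted = rng.normal(size=n)
m = a_true * x + e * omitted + rng.normal(size=n)
y = b_true * m + c_prime_true * x + f * omitted + rng.normal(size=n)

_, cp_naive, b_naive = ols(y, x, m)            # omitted variable ignored
_, cp_adj, b_adj, _ = ols(y, x, m, omitted)    # omitted variable as a covariate

print(f"ignoring it:     b = {b_naive:.3f}, c' = {cp_naive:.3f}")
print(f"controlling it:  b = {b_adj:.3f}, c' = {cp_adj:.3f}")
```

Making `f` large enough drives the naive c′ below zero, which is the inconsistent mediation mentioned above.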
Effect of Vitamin A Supplements in Northern Sumatra
"Standard" Analysis
Compliance should totally mediate this effect. Why is c' negative?
Results with an Omitted Variable
Path c' fixed to zero. An omitted variable causes M and Y.
What to Do about Omitted Variables?
• Do not omit them: Include them in the analysis as covariates.
• If there is good reason to believe that c′ = 0, they can be allowed for.
• Sometimes the omitted variable is “shared method effects.” If an issue, measure M and Y by different methods.
• Conduct sensitivity analyses.
A Comforting Fact
• Unreliability in M deflates ab and inflates c′.
• An omitted variable usually inflates ab and deflates c′.
• As Fritz, Kenny, & MacKinnon have shown, these two biases can almost exactly offset each other.
Reverse Causation
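[Path diagram: the basic mediational model (paths a, b, and c′; disturbances U1 and U2) plus a reverse causal path g from Y to M]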
Effect of Reverse Causation
• Typically, b and g have the same sign, which likely makes the value of b inflated and the value of c′ deflated.
What to Do about Reverse Causation?
• Longitudinal designs
• If c′ = 0, then the model can be estimated.
• Instrumental variable method.
Timing of the Measurement of the Mediator
• Mediator should be measured after X but before Y.
• X might be measured at the same time as Y (e.g., number of treatment sessions), but it must be assumed that X has not changed since when it affected Y.
Controlling for Prior Values
• Obtain baseline measures of M and Y.
• Control for baseline M and Y in the analysis.
DataToText
DataToText Project
Have the researcher tell DataToText what the research question is.
DataToText performs the requisite analyses.
DataToText gives the results from those analyses:
computer output
a written description
DataToText Macros
• Macro developed to provide text, tables, and figures of a simple mediational analysis.
– SPSS version: MedText
http://davidakenny.net/dtt/mediate.htm
– R version: MedTextR
http://davidakenny.net/dtt/mediateR.htm
Advantages of DataToText
• Does the analyses that should be done, but often are not, e.g., tests for outliers and nonlinearity.
• MedTextR issues up to 20 different warnings.
• Produces a 3-page text describing the results.
• Surprisingly “intelligent”
• Graphics
Links
davidakenny.net/dtt/datatotext.htm
davidakenny.net/dtt/mediate.htm
davidakenny.net/dtt/mediateR.htm
davidakenny.net/dtt/MedTextR.pdf
Topics Not Discussed
Use of SEM
simultaneous estimation
latent variables (reflective and formative)
instrumental variable estimation
Causal Inference Approach
Mediators or outcomes that are categorical or counted
Clustering and multilevel mediation
Mediated moderation and moderated mediation
Conclusion
• Mediational Analyses Are Very Simple
• Mediational Analyses Are Very Difficult
– Difficulties are more in conceptualization and measurement than in the statistical analysis.
davidakenny.net/cm/mediationN.ppt