1G89.2229 Lect 7M
• Statistical power for regression
• Statistical interaction
G89.2229 Multiple Regression
Week 7 (Monday)
Statistical Power Issues in Regression
• Suppose you are interested in the relation of Conscientiousness (a Big Five measure of personality) to truthfulness in internet surveys.
• Suppose a previous study of 100 people showed that for every unit change in C (on a 0-10 scale), T (on a 0-100 scale) increased by 5 points. The regression equation:
» T = 50 + 5*C + e
• Suppose the C effect was just significant at the .05 level.
• What power would you have if you replicated the study EXACTLY with the same population, same measures, and same N?
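A rough way to work out the answer, as a sketch rather than an exact calculation: assume the true effect equals the previous estimate and use a normal approximation to the test statistic. Under those assumptions the new estimate falls above or below the criterion about equally often, so power is close to 50%. The function name `replication_power` is illustrative only.

```python
from statistics import NormalDist

def replication_power(observed_z: float, alpha: float = 0.05) -> float:
    """Two-sided power if the true standardized effect equals observed_z.

    Assumes a normal approximation to the test statistic.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    # Chance that a new estimate lands beyond either critical value
    return (1 - z.cdf(crit - observed_z)) + z.cdf(-crit - observed_z)

# "Just significant" means the observed statistic sat right at the criterion.
just_significant = NormalDist().inv_cdf(0.975)
power = replication_power(just_significant)
print(f"Approximate power of the exact replication: {power:.2f}")
```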
Power considerations
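[Figure: sampling distributions of the observed metric under the Null and the Alternative, with the Criterion marked; horizontal axis "Observed metric" running from -10 to 25.]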
• The power is a function of
» Effect size
» Standard error
• The standard error is a function of
» Sample size
» Amount of residual variation
» Distribution of predictor
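For a single predictor, the three ingredients listed above combine in the usual formula se(b) = s_e / (s_x * sqrt(n - 1)), where s_e is the residual standard deviation and s_x is the standard deviation of the predictor. The sketch below only illustrates that formula; the input values are made up for the example.

```python
import math

def slope_se(residual_sd: float, x_sd: float, n: int) -> float:
    """Standard error of a simple-regression slope: s_e / (s_x * sqrt(n - 1))."""
    return residual_sd / (x_sd * math.sqrt(n - 1))

# Hypothetical baseline values, chosen only for illustration
base = slope_se(residual_sd=10.0, x_sd=2.0, n=100)
more_n = slope_se(residual_sd=10.0, x_sd=2.0, n=400)      # larger sample
less_noise = slope_se(residual_sd=5.0, x_sd=2.0, n=100)   # less residual variation
wider_x = slope_se(residual_sd=10.0, x_sd=4.0, n=100)     # more spread in the predictor

for label, value in [("baseline", base), ("4x sample", more_n),
                     ("half the noise", less_noise), ("double x spread", wider_x)]:
    print(f"{label:>16}: se = {value:.3f}")
```

Halving the residual variation or doubling the spread of the predictor each halve the standard error, while quadrupling N roughly halves it.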
Power Tables and Programs
• For regression, we tend to assume that predictors have normal distributions
• We often use R² as a summary of effect size
• Increase in R² already takes into account
» Distribution of predictor
» Amount of error variation
» Covariation among predictors
Three Designs: Same N, Different Power
• Let’s study the process with the structural model:
» Y = 50 + 5X + e
• Suppose further that three studies were carried out.
» The first samples 25 persons each from X=1 and X=10.
» The second samples 5 persons at each of the ten values from X=1 to X=10.
» The third takes a random sample of 50 persons.
The Results
• All three studies give comparable regression estimates, but standard errors and R² values are very different.
Design 1
Source        SS       df   MS       F
Regression    268.99    1   268.99   22.54
Residual      572.80   48    11.93
Total         841.79   49

              B       se      t
(Constant)    48.98   0.69    70.90
X1             5.15   1.08     4.77

Design 2
Source        SS       df   MS       F
Regression    105.50    1   105.50    8.50
Residual      595.55   48    12.41
Total         701.05   49

              B       se      t
(Constant)    49.42   0.9528  51.87
X2             5.06   1.734    2.92

Design 3
Source        SS       df   MS       F
Regression     36.01    1    36.01    2.90
Residual      596.65   48    12.43
Total         632.66   49

              B       se      t
(Constant)    49.48   1.481   33.41
X3             5.04   2.962    1.70
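The tables report sums of squares but not R² itself. As a quick check, R² can be recovered as SS(Regression) / SS(Total) from the values above; the dictionary layout below is just a convenience for this sketch.

```python
# (SS regression, SS total) copied from the three ANOVA tables above
anova = {
    "Design 1": (268.99, 841.79),
    "Design 2": (105.50, 701.05),
    "Design 3": (36.01, 632.66),
}

# R-squared is the share of total variation accounted for by the regression
r_squared = {name: ss_reg / ss_total for name, (ss_reg, ss_total) in anova.items()}

for name, r2 in r_squared.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

The slopes are nearly identical across designs, yet R² drops from roughly .32 to roughly .06 as the spread of X shrinks.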
Calculating Power for Regression
• Programs and tables stress R² and R² change.
» These values depend on the distribution of X
» Conventions of small, medium, and large effect size might not map onto substantive beliefs about what is large or subtle.
• Example:
» Power and Precision Software
• Recommendation: Study R² for different designs using simulation.
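A minimal simulation along these lines, using only the Python standard library. Several values here are assumptions made for the sketch, not taken from the lecture: X is placed on a 0.1 to 1.0 scale and the residual SD is set to 3.5 because those choices roughly reproduce the standard errors in the results tables; the random design draws X from a normal distribution with mean 0.55 and SD 0.17; and the critical t for 48 df is hard-coded.

```python
import math
import random

T_CRIT_48 = 2.011  # two-sided .05 critical t for df = 48 (hard-coded assumption)

def fit_slope(x, y):
    """Return (t statistic for the slope, R-squared) from simple OLS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx
    sse = syy - b * sxy                      # residual sum of squares
    se = math.sqrt(sse / (n - 2) / sxx)      # standard error of the slope
    return b / se, 1 - sse / syy

def simulate(draw_x, rng, reps=500, intercept=50.0, slope=5.0, resid_sd=3.5):
    """Estimate power and mean R-squared for one design by Monte Carlo."""
    hits, r2_sum = 0, 0.0
    for _ in range(reps):
        x = draw_x(rng)
        y = [intercept + slope * xi + rng.gauss(0, resid_sd) for xi in x]
        t, r2 = fit_slope(x, y)
        hits += abs(t) > T_CRIT_48
        r2_sum += r2
    return hits / reps, r2_sum / reps

# Three designs with N = 50 each (X scale and random-design distribution are assumptions)
designs = {
    "extremes": lambda rng: [0.1] * 25 + [1.0] * 25,
    "even": lambda rng: [k / 10 for k in range(1, 11) for _ in range(5)],
    "random": lambda rng: [rng.gauss(0.55, 0.17) for _ in range(50)],
}

results = {name: simulate(draw, random.Random(7229)) for name, draw in designs.items()}
for name, (power, mean_r2) in results.items():
    print(f"{name:>8}: power = {power:.2f}, mean R^2 = {mean_r2:.3f}")
```

Changing `resid_sd`, the slope, or the design functions shows how power and R² move together even though the structural slope stays fixed.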
Statistical Interaction: Moderation of Effects
• Example
» In many studies of perceived social support, stress, and distress, a three-part pattern is found:
• For persons who are not experiencing an important stress, the relation between social support and distress is small or zero.
• For persons who have a modest amount of stress, social support seems to reduce distress.
• For persons with high stress, social support has a large inverse association with distress.
» This is called the stress-buffering effect of support.
Two Pictures of Interaction
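[Figure 1: Distress (vertical axis) plotted against Stress (horizontal axis), with separate lines for High Support, Some Support, and No Support.]
[Figure 2: path diagram involving Stress, Distress, Support, and an error term e.]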
• The second picture emphasizes moderation (Baron & Kenny)
Representing Interaction in the Regression Equation
• One way to model how the effect of X1 is systematically affected by X2 is to include a multiplicative term in the regression equation
» Y=b0+b1X1+b2X2+b3(X1*X2)+e
» This multiplicative term creates a curved surface in the predicted Y
» If the multiplicative term is needed, but left out, the residuals may display heteroscedasticity
• This multiplicative model is related to the polynomial models studied last week.
Interpreting the Multiplicative Model
• Y=b0+b1X1+b2X2+b3(X1*X2)+e
• The effect (slope) of X1 varies with different values of X2
» For X2=0, the effect is b1
» For X2=1, the effect is b1+b3
» For X2=2, the effect is b1+2b3
• Because b1 is the effect of X1 when X2 is zero, and b2 is the effect of X2 when X1 is zero, it is advisable to CENTER variables involved in interactions so that a value of zero is easy to understand.
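The points above can be sketched in a few lines. The coefficients and centering values below are hypothetical, picked only to make the arithmetic easy to follow; `simple_slope` and `recenter` are illustrative helper names.

```python
def predict(coefs, x1, x2):
    """Predicted Y from Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2)."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def simple_slope(coefs, x2):
    """Effect of X1 at a given value of X2: b1 + b3*X2."""
    _, b1, _, b3 = coefs
    return b1 + b3 * x2

def recenter(coefs, c1, c2):
    """Coefficients after replacing X1 by (X1 - c1) and X2 by (X2 - c2)."""
    b0, b1, b2, b3 = coefs
    return (b0 + b1 * c1 + b2 * c2 + b3 * c1 * c2,  # intercept at X1=c1, X2=c2
            b1 + b3 * c2,                            # X1 effect at X2 = c2
            b2 + b3 * c1,                            # X2 effect at X1 = c1
            b3)                                      # interaction is unchanged

raw = (10.0, 2.0, 1.0, -0.5)          # hypothetical b0, b1, b2, b3
for x2 in (0, 1, 2):
    print(f"X2 = {x2}: slope of X1 = {simple_slope(raw, x2)}")

centered = recenter(raw, c1=3.0, c2=4.0)   # hypothetical sample means of X1 and X2
print("centered coefficients:", centered)
```

Centering changes b0, b1, and b2 but leaves b3 and the predicted values untouched; after centering, b1 describes the effect of X1 at the mean of X2 rather than at X2 = 0.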
Moderation issues
• Scaling of the outcome variable can affect whether an interaction term is needed.
» If we have a simple multiplicative model in Y, it will be additive in ln(Y).
• E(Y|X,W) = bXW
• E(ln(Y)|X,W) = ln(b)+ln(X)+ln(W) (exact up to an additive constant when the error is multiplicative)
• Scaling is especially important if the trajectories of interest do not cross in the region where data is available.
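A small numeric illustration of the scaling point, using noise-free cell means from the multiplicative model Y = bXW. The values of b, X, and W are hypothetical. The "difference of differences" across a 2x2 layout is nonzero on the raw scale but vanishes after taking logs.

```python
import math

b = 2.0  # hypothetical scale constant
# Noise-free cell means for a 2x2 layout of X and W values
cell_means = {(x, w): b * x * w for x in (1.0, 3.0) for w in (1.0, 4.0)}

def interaction_contrast(cells, transform=lambda v: v):
    """Difference of differences across the 2x2 cells; zero means no interaction."""
    f = transform
    effect_of_x_at_high_w = f(cells[(3.0, 4.0)]) - f(cells[(1.0, 4.0)])
    effect_of_x_at_low_w = f(cells[(3.0, 1.0)]) - f(cells[(1.0, 1.0)])
    return effect_of_x_at_high_w - effect_of_x_at_low_w

raw_contrast = interaction_contrast(cell_means)            # raw Y scale
log_contrast = interaction_contrast(cell_means, math.log)  # ln(Y) scale

print(f"raw scale contrast: {raw_contrast:.3f}")
print(f"log scale contrast: {log_contrast:.3f}")
```

In this example the lines for the two W values fan out but never cross, which is the situation where a change of scale can remove the interaction entirely.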
Detecting and testing for scaling effects
• When the variance seems to be related to the level of Y, the hypothesis that the interaction is a simple scaling artifact needs to be considered.
» Showing that the theoretically interesting interaction remains when Y is transformed to ln(Y) is good evidence against a pure scaling explanation
» Showing that ln(Y) increases heteroscedasticity also helps (if it is true)
• Often our theory predicts interaction, and scientists are motivated to demonstrate it.