Correlation and Regression in SPSS
Correlation, Scatterplots, and Regression
Generating results in SPSS; reading SPSS output
• Correlation measures the strength and the direction of a relationship
• Scatterplots present a visual image of the data
• Regression produces a best-fit line to predict the dependent variable from the independent variable
• The significance of the relationship is tested with correlation or regression
[Figure: two scatterplots of Symptom Index (y-axis) vs. dose in mg (x-axis, 0–250). Drug A shows a very good fit; Drug B shows a moderate fit.]
Correlation: Linear Relationships
Strong relationship = good linear fit. Points clustered closely around a line show a strong correlation: the line is a good predictor (a good fit) for the data. The more spread out the points, the weaker the correlation and the poorer the fit. The line is a REGRESSION line (Y = bX + a).
Interpreting the Correlation Coefficient r
• Strong correlation: r > .70 or r < –.70
• Moderate correlation: r between .30 and .70, or between –.30 and –.70
• Weak correlation: r between 0 and .30, or between 0 and –.30
[Scale from r = –1.0 to r = +1.0: |r| below .3 is weak, .3 to .7 is moderate, above .7 is strong, for both positive and negative r.]
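These cutoffs can be written down as a small helper. A minimal Python sketch (the function name is invented, and the slide's ranges overlap at exactly .30 and .70, so the boundary handling here is an assumption):

```python
def correlation_strength(r):
    """Classify a Pearson r using the slide's cutoffs.

    Boundary values (.30, .70) are assigned by assumption,
    since the slide's ranges overlap at those points.
    """
    if abs(r) > 0.70:
        return "strong"
    if abs(r) >= 0.30:
        return "moderate"
    return "weak"

correlation_strength(0.173)  # "weak", like the age/income output later on
```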
Running Correlation in SPSS
Strength – Direction – Significance
• Click Analyze → Correlate → Bivariate
• Move the two variables into the box; click OK
Correlations

                                        age AGE OF      rincome RESPONDENTS
                                        RESPONDENT      INCOME
age AGE OF      Pearson Correlation     1               .173**
RESPONDENT      Sig. (2-tailed)                         .000
                N                       2803            1798
rincome         Pearson Correlation     .173**          1
RESPONDENTS     Sig. (2-tailed)         .000
INCOME          N                       1798            1801

**. Correlation is significant at the 0.01 level (2-tailed).
SPSS Correlation Output
• The value of the correlation coefficient is on the first line: r = +.173
  – Relationship is positive
  – Relationship is weak
• The p-value (Sig.) is on the second line: p < .001 (whenever SPSS shows .000)
  – Relationship is significant
  – Reject H0
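SPSS produces this coefficient through menus, but Pearson's r is straightforward to compute from its definition. A from-scratch Python sketch with hypothetical toy data (not the GSS data in the output above):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from its definition:
    r = sum((x-mx)(y-my)) / sqrt(sum((x-mx)^2) * sum((y-my)^2))
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data, NOT the GSS data shown in the SPSS output
ages = [25, 30, 35, 40, 45, 50, 55, 60]
incomes = [9, 10, 9, 11, 10, 12, 11, 12]
r = pearson_r(ages, incomes)  # about .83: a strong positive correlation
```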
Correlation for Your Project
• Your dependent variable is Interval/Ratio
• Look at the data set and select one other interval/ratio variable that might be related to (predictive of) your dependent variable
• Following the instructions above:
  – run a correlation of that variable
  – run a scatterplot of the variable
GENERATE A SCATTERPLOT TO SEE THE RELATIONSHIPS
• Go to Graphs → Legacy Dialogs → Scatter → Simple
• Click on the dependent variable and move it to the Y-Axis
• Click on the other variable and move it to the X-Axis
• Click OK
The scatterplot might not look promising at first.
• Double-click on the chart to open a Chart Editor window
• Use Options → Bin Element; simply CLOSE this box, since bins are applied automatically
• Dot size now shows the number of cases with each pair of X, Y values
• DO NOT CLOSE THE CHART EDITOR YET!
Add Fit Line (Regression)
• In the Chart Editor: Elements → Fit Line at Total
• Close the dialog box that opens
• Close the Chart Editor window

Edited Scatterplot
• Distribution of cases shown by dots (bins)
• Trend shown by the fit line
Regression
• Regression predicts the dependent variable based on the independent variable
  – Computes the best-fit line for prediction
  – Output includes the slope and intercept for the line
• Hypothesis test based on ANOVA
  – SStotal is computed
  – SStotal is divided into Regression (predicted) and Error (random)
• Effect size = R2 for regression
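The best-fit line and the sum-of-squares decomposition described above can be sketched in a few lines of Python (a from-scratch illustration of the arithmetic, not SPSS's implementation; the data are made up):

```python
def simple_regression(xs, ys):
    """Least-squares fit Y = bX + a, plus the decomposition
    SStotal = SSregression + SSresidual used in the ANOVA test.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                  # slope
    a = my - b * mx                # intercept
    preds = [b * x + a for x in xs]
    ss_total = sum((y - my) ** 2 for y in ys)
    ss_reg = sum((p - my) ** 2 for p in preds)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return b, a, ss_reg, ss_res, ss_total

# Toy data, invented for illustration
b, a, ss_reg, ss_res, ss_total = simple_regression(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
# ss_reg + ss_res equals ss_total
```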
SPSS for Regression
• Analyze → Regression → Linear
Simple Linear Regression (One independent variable)
• Move Dependent Variable into box marked “Dependent”
• Move Independent Variable into box marked “Independent”
• Click OK
Regression Output: Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .173a   .030       .029                2.838

a. Predictors: (Constant), age AGE OF RESPONDENT
ANOVA(b)

Model 1      Sum of Squares   df     Mean Square   F        Sig.
Regression   445.824          1      445.824       55.359   .000a
Residual     14463.845        1796   8.053
Total        14909.669        1797

a. Predictors: (Constant), age AGE OF RESPONDENT
b. Dependent Variable: rincome RESPONDENTS INCOME
Coefficients(a)

                         Unstandardized Coefficients   Standardized Coefficients
Model 1                  B       Std. Error            Beta                        t        Sig.
(Constant)               8.864   .224                                              39.598   .000
age AGE OF RESPONDENT    .037    .005                  .173                        7.440    .000

a. Dependent Variable: rincome RESPONDENTS INCOME
Each element of the output is considered separately in the following slides.
ANOVA Table
• Regression SS refers to variability related to the independent variable (the treatment)
• Residual SS refers to variability not related to the independent variable (the error or chance element)
• For regression, df for the treatment is 1 per variable
• Compute MS and F in the same way as in ANOVA
• If the p-value (Sig.) < α, the regression line fits the data better than a flat line; the relationship is significant
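The MS and F values follow directly from the sums of squares and degrees of freedom reported in the ANOVA table; a quick Python check using those numbers:

```python
# Values copied from the ANOVA table in the SPSS output
ss_reg, df_reg = 445.824, 1
ss_res, df_res = 14463.845, 1796

ms_reg = ss_reg / df_reg   # Mean Square for Regression: 445.824
ms_res = ss_res / df_res   # Mean Square for Residual: about 8.053
f_stat = ms_reg / ms_res   # about 55.36, matching the table's F
```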
The Regression Line Equation
• Y = bX + a
• b is the coefficient for the independent variable
• a is the constant coefficient (the intercept)
• Predict values of Y based on values of X
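Using the B coefficients from the Coefficients table, the prediction equation is Y = .037·age + 8.864. A small Python sketch (the helper name is invented for illustration):

```python
# Coefficients from the SPSS output: income level predicted from age
b = 0.037   # slope (B for age)
a = 8.864   # intercept (Constant)

def predict_income_level(age):
    """Hypothetical helper: predicted income level from Y = bX + a."""
    return b * age + a

y40 = predict_income_level(40)   # 0.037 * 40 + 8.864 = 10.344
```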
Effect Size: R2
• In regression, the effect size is similar to η2 in ANOVA
• R2 = SSregression / SStotal
• Represented by R2 (capital R)
• For simple regression (one variable), use the R Square figure
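The R Square figure can be verified directly from the ANOVA table's sums of squares; a one-line Python check:

```python
# Sums of squares from the ANOVA table in the SPSS output
ss_regression = 445.824
ss_total = 14909.669

r_squared = ss_regression / ss_total   # about .030, matching R Square
```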
Sample Write-Up
Data from the 2004 General Social Survey were used to explore the relationship between age and income, as most Americans expect to earn more money after years in the workforce. Respondents’ age showed a weak positive correlation (r = .173, p < .001) with income level. Linear regression demonstrated a significant positive relationship (F(1, 1796) = 55.359, p < .001). Income increased approximately one-third of an income level for each additional decade of age (b = .037). Due to the large range of income levels at every age (see Figure 1), age accounts for only 3% of the variability in income levels. Older people do tend to earn higher incomes, but other characteristics are probably better predictors of income than age.