cbmr - research - correlation & regression
TRANSCRIPT
![Page 1: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/1.jpg)
Correlation & Regression
![Page 2: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/2.jpg)
Correlation
Finding the relationship between two quantitative variables without being able to infer causal relationships
Correlation is a statistical technique used to determine the degree to which two variables are related
![Page 3: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/3.jpg)
Scatter diagram
• Rectangular coordinates
• Two quantitative variables
• One variable is called independent (X) and the second is called dependent (Y)
• Points are not joined
• No frequency table
![Page 4: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/4.jpg)
Example

| Wt. (kg) | 67 | 69 | 85 | 83 | 74 | 81 | 97 | 92 | 114 | 85 |
|---|---|---|---|---|---|---|---|---|---|---|
| SBP (mmHg) | 120 | 125 | 140 | 160 | 130 | 180 | 150 | 140 | 200 | 130 |
![Page 5: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/5.jpg)
[Figure: Scatter diagram of weight and systolic blood pressure. x-axis: wt (kg); y-axis: SBP (mmHg)]
![Page 6: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/6.jpg)
[Figure: Scatter diagram of weight and systolic blood pressure. x-axis: Wt (kg); y-axis: SBP (mmHg)]
![Page 7: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/7.jpg)
Scatter plots
The pattern of data is indicative of the type of relationship between two variables:
• positive relationship
• negative relationship
• no relationship
![Page 8: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/8.jpg)
Positive relationship
![Page 9: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/9.jpg)
[Figure: scatter plot of height in cm against age in weeks]
![Page 10: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/10.jpg)
Negative relationship
[Figure: scatter plot of reliability against age of car]
![Page 11: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/11.jpg)
No relation
![Page 12: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/12.jpg)
Correlation Coefficient
Statistic showing the degree of relation between two variables
![Page 13: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/13.jpg)
Simple Correlation coefficient (r)
It is also called Pearson's correlation or the product-moment correlation coefficient.
It measures the nature and strength of the association between two variables of the quantitative type.
![Page 14: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/14.jpg)
The sign of r denotes the nature of the association,
while the magnitude (absolute value) of r denotes the strength of the association.
![Page 15: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/15.jpg)
If the sign is +ve, the relation is direct (an increase in one variable is associated with an increase in the other variable, and a decrease in one variable is associated with a decrease in the other variable).
If the sign is -ve, the relationship is inverse or indirect (an increase in one variable is associated with a decrease in the other).
![Page 16: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/16.jpg)
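• The value of r ranges between (-1) and (+1).
• The value of r denotes the strength of the association, as illustrated by the following scale.

| r | Strength | Direction |
|---|---|---|
| -1 | perfect correlation | indirect |
| -1 to -0.75 | strong | indirect |
| -0.75 to -0.25 | intermediate | indirect |
| -0.25 to 0 | weak | indirect |
| 0 | no relation | |
| 0 to 0.25 | weak | direct |
| 0.25 to 0.75 | intermediate | direct |
| 0.75 to 1 | strong | direct |
| 1 | perfect correlation | direct |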
![Page 17: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/17.jpg)
If r = 0, there is no association or correlation between the two variables.
If 0 < |r| < 0.25: weak correlation.
If 0.25 ≤ |r| < 0.75: intermediate correlation.
If 0.75 ≤ |r| < 1: strong correlation.
If |r| = 1: perfect correlation.
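The cut-offs above can be written as a small lookup. The following is only a sketch: the function name `describe_correlation` is made up for illustration, and the thresholds 0.25 and 0.75 are the ones quoted on this slide, not a universal standard.

```python
def describe_correlation(r: float) -> str:
    """Label a correlation coefficient using the 0.25 / 0.75 cut-offs above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength depends on the magnitude only
    if size == 0:
        return "no correlation"
    if size < 0.25:
        strength = "weak"
    elif size < 0.75:
        strength = "intermediate"
    elif size < 1:
        strength = "strong"
    else:
        strength = "perfect"
    # the sign gives the direction of the association
    direction = "direct" if r > 0 else "indirect"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.5))   # intermediate direct correlation
print(describe_correlation(-0.8))  # strong indirect correlation
```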
![Page 18: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/18.jpg)
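How to compute the simple correlation coefficient (r)

$$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}$$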
![Page 19: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/19.jpg)
Example: A sample of 6 children was selected, and their age in years and weight in kilograms were recorded as shown in the following table. It is required to find the correlation between age and weight.

| Serial no. | Age (years) | Weight (kg) |
|---|---|---|
| 1 | 7 | 12 |
| 2 | 6 | 8 |
| 3 | 8 | 12 |
| 4 | 5 | 10 |
| 5 | 6 | 11 |
| 6 | 9 | 13 |
![Page 20: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/20.jpg)
These two variables are of the quantitative type. One variable (age) is called the independent variable and is denoted (X); the other (weight) is called the dependent variable and is denoted (Y). To find the relation between age and weight, compute the simple correlation coefficient using the following formula:

$$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}$$
![Page 21: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/21.jpg)
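| Serial no. | Age (years) (x) | Weight (kg) (y) | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 7 | 12 | 84 | 49 | 144 |
| 2 | 6 | 8 | 48 | 36 | 64 |
| 3 | 8 | 12 | 96 | 64 | 144 |
| 4 | 5 | 10 | 50 | 25 | 100 |
| 5 | 6 | 11 | 66 | 36 | 121 |
| 6 | 9 | 13 | 117 | 81 | 169 |
| Total | ∑x = 41 | ∑y = 66 | ∑xy = 461 | ∑x² = 291 | ∑y² = 742 |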
![Page 22: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/22.jpg)
$$r = \frac{461 - \frac{41 \times 66}{6}}{\sqrt{\left(291 - \frac{(41)^2}{6}\right)\left(742 - \frac{(66)^2}{6}\right)}}$$

r = 0.759
Strong direct correlation
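The hand calculation can be cross-checked with a few lines of Python. This is a minimal sketch of the computational formula for r given above; the function name `pearson_r` is chosen here for illustration, and the data are the six age/weight pairs from the table.

```python
from math import sqrt


def pearson_r(x, y):
    """Simple (Pearson) correlation via the sums-of-products formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator


age = [7, 6, 8, 5, 6, 9]            # x, years
weight = [12, 8, 12, 10, 11, 13]    # y, kg
r = pearson_r(age, weight)
print(round(r, 2))  # 0.76
```

Unrounded, the result is about 0.7596, which the slide reports as 0.759.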
![Page 23: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/23.jpg)
EXAMPLE: Relationship between Anxiety and Test Scores
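| Anxiety (X) | Test score (Y) | X² | Y² | XY |
|---|---|---|---|---|
| 10 | 2 | 100 | 4 | 20 |
| 8 | 3 | 64 | 9 | 24 |
| 2 | 9 | 4 | 81 | 18 |
| 1 | 7 | 1 | 49 | 7 |
| 5 | 6 | 25 | 36 | 30 |
| 6 | 5 | 36 | 25 | 30 |
| ∑X = 32 | ∑Y = 32 | ∑X² = 230 | ∑Y² = 204 | ∑XY = 129 |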
![Page 24: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/24.jpg)
Calculating Correlation Coefficient
$$r = \frac{(6)(129) - (32)(32)}{\sqrt{\left(6(230) - 32^2\right)\left(6(204) - 32^2\right)}} = \frac{774 - 1024}{\sqrt{(356)(200)}} = -0.94$$

r = -0.94
Indirect strong correlation
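This slide uses the formula with numerator and denominator both multiplied through by n, which gives the same r as the earlier form. A short sketch of that variant, applied to the anxiety and test-score data (the name `pearson_r_scaled` is illustrative only):

```python
from math import sqrt


def pearson_r_scaled(x, y):
    """Pearson r in the form (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2)(n*Syy - Sy^2))."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


anxiety = [10, 8, 2, 1, 5, 6]   # X
score = [2, 3, 9, 7, 6, 5]      # Y
r_anxiety = pearson_r_scaled(anxiety, score)
print(round(r_anxiety, 2))  # -0.94
```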
![Page 25: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/25.jpg)
Regression Analyses
Regression: technique concerned with predicting some variables by knowing others
The process of predicting variable Y using variable X
![Page 26: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/26.jpg)
Regression
• Uses a variable (x) to predict some outcome variable (y)
• Tells you how values in y change as a function of changes in values of x
![Page 27: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/27.jpg)
Correlation and Regression
Correlation describes the strength of a linear relationship between two variables
Linear means “straight line”
Regression tells us how to draw the straight line described by the correlation
![Page 28: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/28.jpg)
Regression
• Calculates the "best-fit" line for a certain set of data
• The regression line makes the sum of the squares of the residuals smaller than for any other line
• Regression minimizes residuals

[Figure: scatter diagram of SBP (mmHg) against Wt (kg)]
![Page 29: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/29.jpg)
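By using the least squares method (a procedure that minimizes the vertical deviations of plotted points surrounding a straight line) we are able to construct a best-fitting straight line to the scatter diagram points and then formulate a regression equation in the form of:

$$\hat{y} = a + bX$$

where the slope is

$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$

and the line passes through the point of means:

$$\hat{y} = \bar{y} + b(x - \bar{x})$$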
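As a rough illustration of the least-squares idea, the sketch below fits a line with the slope formula above and checks that shifting the intercept or the slope makes the sum of squared residuals larger. The helper names `fit_line` and `sum_squared_residuals` are assumptions made for this example; the data are the weight and SBP values from the earlier example slide, and the perturbations (5 mmHg, 0.1 mmHg per kg) are arbitrary.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n  # intercept: mean(y) - b * mean(x)
    return a, b


def sum_squared_residuals(x, y, a, b):
    """Sum of squared vertical deviations of the points from y = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


weight = [67, 69, 85, 83, 74, 81, 97, 92, 114, 85]        # kg
sbp = [120, 125, 140, 160, 130, 180, 150, 140, 200, 130]  # mmHg

a, b = fit_line(weight, sbp)
best = sum_squared_residuals(weight, sbp, a, b)

# Any other line, e.g. one with a shifted intercept or slope, fits worse.
others = [
    sum_squared_residuals(weight, sbp, a + da, b + db)
    for da, db in [(5, 0), (-5, 0), (0, 0.1), (0, -0.1)]
]
print(all(best < other for other in others))  # True
```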
![Page 30: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/30.jpg)
Regression Equation
Regression equation describes the regression line mathematically
– Intercept
– Slope

[Figure: scatter diagram of SBP (mmHg) against Wt (kg)]
![Page 31: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/31.jpg)
Linear Equations
Y = bX + a
• a = Y-intercept
• b = Slope = (change in Y) / (change in X)

$$\hat{y} = a + bX$$
![Page 32: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/32.jpg)
Hours studying and grades
![Page 33: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/33.jpg)
Regressing grades on hours
[Figure: linear regression of final grade on number of hours spent studying. Fitted line: Final grade in course = 59.95 + 3.17 * study; R-Square = 0.88]
Predicted final grade in class = 59.95 + 3.17*(number of hours you study per week)
![Page 34: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/34.jpg)
Predict the final grade of…
• Someone who studies for 12 hours:
  – Final grade = 59.95 + (3.17*12)
  – Final grade = 97.99
• Someone who studies for 1 hour:
  – Final grade = 59.95 + (3.17*1)
  – Final grade = 63.12
Predicted final grade in class = 59.95 + 3.17*(hours of study)
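The two predictions are just the regression equation evaluated at different inputs. A tiny sketch, with the intercept 59.95 and slope 3.17 taken from the slide and the function name `predicted_grade` invented for the example:

```python
def predicted_grade(hours, intercept=59.95, slope=3.17):
    """Predicted final grade for a given number of study hours per week."""
    return intercept + slope * hours


print(round(predicted_grade(12), 2))  # 97.99
print(round(predicted_grade(1), 2))   # 63.12
```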
![Page 35: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/35.jpg)
Exercise
A sample of 6 persons was selected; their ages (x variable) and weights (y variable) are shown in the following table. Find the regression equation and the predicted weight when age is 8.5 years.
![Page 36: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/36.jpg)
| Serial no. | Age (x) | Weight (y) |
|---|---|---|
| 1 | 7 | 12 |
| 2 | 6 | 8 |
| 3 | 8 | 12 |
| 4 | 5 | 10 |
| 5 | 6 | 11 |
| 6 | 9 | 13 |
![Page 37: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/37.jpg)
Answer
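| Serial no. | Age (x) | Weight (y) | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 7 | 12 | 84 | 49 | 144 |
| 2 | 6 | 8 | 48 | 36 | 64 |
| 3 | 8 | 12 | 96 | 64 | 144 |
| 4 | 5 | 10 | 50 | 25 | 100 |
| 5 | 6 | 11 | 66 | 36 | 121 |
| 6 | 9 | 13 | 117 | 81 | 169 |
| Total | 41 | 66 | 461 | 291 | 742 |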
![Page 38: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/38.jpg)
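$$\bar{x} = \frac{41}{6} = 6.83, \qquad \bar{y} = \frac{66}{6} = 11$$

$$b = \frac{461 - \frac{41 \times 66}{6}}{291 - \frac{(41)^2}{6}} = 0.92$$

Regression equation

$$\hat{y}_{(x)} = 11 + 0.92\,(x - 6.83)$$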
![Page 39: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/39.jpg)
$$\hat{y}_{(x)} = 4.675 + 0.92x$$

$$\hat{y}_{(8.5)} = 4.675 + 0.92 \times 8.5 = 12.50 \text{ kg}$$

$$\hat{y}_{(7.5)} = 4.675 + 0.92 \times 7.5 = 11.58 \text{ kg}$$
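For comparison, here is a sketch that repeats the exercise without intermediate rounding, using the mean-centred form of the regression equation. The name `predict_weight` is made up for this example. With unrounded values the intercept comes out near 4.69 and the prediction at 8.5 years near 12.54 kg, slightly different from the rounded figures 4.675 and 12.50 kg quoted above.

```python
age = [7, 6, 8, 5, 6, 9]            # x, years
weight = [12, 8, 12, 10, 11, 13]    # y, kg

n = len(age)
mean_x = sum(age) / n
mean_y = sum(weight) / n

# Slope from the sums-of-products formula
s_xy = sum(x * y for x, y in zip(age, weight)) - sum(age) * sum(weight) / n
s_xx = sum(x * x for x in age) - sum(age) ** 2 / n
b = s_xy / s_xx


def predict_weight(x):
    """Mean-centred regression equation: y_hat = mean_y + b * (x - mean_x)."""
    return mean_y + b * (x - mean_x)


intercept = predict_weight(0)  # the 'a' in y_hat = a + b*x
print(round(b, 2))                    # 0.92
print(round(intercept, 2))            # 4.69
print(round(predict_weight(8.5), 2))  # 12.54
print(round(predict_weight(7.5), 2))  # 11.62
```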
![Page 40: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/40.jpg)
[Figure: regression line of weight (in kg) against age (in years)]
We create a regression line by plotting two estimated values of y against their x values,
then extending the line to the right and left.
![Page 41: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/41.jpg)
Exercise 2
The following are the age (in years) and systolic blood pressure of 20 apparently healthy adults.
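| Age (x) | B.P (y) | Age (x) | B.P (y) |
|---|---|---|---|
| 20 | 120 | 46 | 128 |
| 43 | 128 | 53 | 136 |
| 63 | 141 | 60 | 146 |
| 26 | 126 | 20 | 124 |
| 53 | 134 | 63 | 143 |
| 31 | 128 | 43 | 130 |
| 58 | 136 | 26 | 124 |
| 46 | 132 | 19 | 121 |
| 58 | 140 | 31 | 126 |
| 70 | 144 | 23 | 123 |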
![Page 42: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/42.jpg)
• Find the correlation between age and blood pressure using the simple and Spearman's correlation coefficients, and comment.
• Find the regression equation.
• What is the predicted blood pressure for a man aged 25 years?
![Page 43: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/43.jpg)
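| Serial | x | y | xy | x² |
|---|---|---|---|---|
| 1 | 20 | 120 | 2400 | 400 |
| 2 | 43 | 128 | 5504 | 1849 |
| 3 | 63 | 141 | 8883 | 3969 |
| 4 | 26 | 126 | 3276 | 676 |
| 5 | 53 | 134 | 7102 | 2809 |
| 6 | 31 | 128 | 3968 | 961 |
| 7 | 58 | 136 | 7888 | 3364 |
| 8 | 46 | 132 | 6072 | 2116 |
| 9 | 58 | 140 | 8120 | 3364 |
| 10 | 70 | 144 | 10080 | 4900 |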
![Page 44: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/44.jpg)
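| Serial | x | y | xy | x² |
|---|---|---|---|---|
| 11 | 46 | 128 | 5888 | 2116 |
| 12 | 53 | 136 | 7208 | 2809 |
| 13 | 60 | 146 | 8760 | 3600 |
| 14 | 20 | 124 | 2480 | 400 |
| 15 | 63 | 143 | 9009 | 3969 |
| 16 | 43 | 130 | 5590 | 1849 |
| 17 | 26 | 124 | 3224 | 676 |
| 18 | 19 | 121 | 2299 | 361 |
| 19 | 31 | 126 | 3906 | 961 |
| 20 | 23 | 123 | 2829 | 529 |
| Total | 852 | 2630 | 114486 | 41678 |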
![Page 45: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/45.jpg)
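$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{114486 - \frac{852 \times 2630}{20}}{41678 - \frac{(852)^2}{20}} = 0.4547$$

$$\hat{y} = 112.13 + 0.4547x$$

For age 25: B.P = 112.13 + 0.4547 * 25 = 123.49 ≈ 123.5 mmHg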
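Because the table already supplies the column totals, the slope and intercept can be recomputed from those sums alone. The sketch below assumes only the totals shown (n = 20, ∑x = 852, ∑y = 2630, ∑xy = 114486, ∑x² = 41678); the function name `line_from_totals` is hypothetical.

```python
def line_from_totals(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b) of the least-squares line y = a + b*x from column totals."""
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n  # mean(y) - b * mean(x)
    return a, b


a, b = line_from_totals(n=20, sum_x=852, sum_y=2630, sum_xy=114486, sum_x2=41678)
predicted_bp = a + b * 25  # predicted systolic B.P at age 25

print(round(b, 3))             # 0.455
print(round(a, 2))             # 112.13
print(round(predicted_bp, 1))  # 123.5
```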
![Page 46: CBMR - Research - Correlation & Regression](https://reader030.vdocuments.mx/reader030/viewer/2022013013/577cc9f81a28aba711a511d9/html5/thumbnails/46.jpg)
Multiple Regression
Multiple regression analysis is a straightforward extension of simple regression analysis which allows more than one independent variable.