chapter 6
1
Chapter 6
Intercept A and Gradient of regression line, B
2
y = A + Bx
y: Dependent Variable, DV (plotted on the y-axis)
x: Independent Variable, IV (plotted on the x-axis)
A: y-intercept or constant term
B: Gradient or slope of the regression line
3
y = A + Bx is sometimes called a deterministic model, and it gives an exact relationship between y and x.
But in reality the observed value yobs is slightly different from the predicted value ypre.
So y = A + Bx + e, where e is a random error term that takes this difference into account (see slide 5 if you do not understand this concept).
A and B are population parameters, and the regression line is called the population regression line. The values of A and B in the population are called the true values of the y-intercept and slope.
But population data are difficult to obtain, so we use sample data to estimate the population parameters. The values calculated from sample data are estimates, so the y-intercept and the slope for the sample data are denoted as ‘a’ and ‘b’, and yo denotes the predicted or estimated value of y for a given x. The equation yo = a + bx is called the estimated regression model; it gives the regression of y on x based on sample data.
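The distinction between the population parameters A, B and the sample estimates a, b can be illustrated with a small simulation. This is only a sketch: the "true" values `TRUE_A` and `TRUE_B`, the noise level, and the sample size are made-up assumptions, and the estimates are computed with the least squares formulas introduced later in this chapter.

```python
import random


def fit_line(xs, ys):
    """Return least squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b = ss_xy / ss_xx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical "true" population parameters (illustrative values only).
TRUE_A, TRUE_B = 1.0, 0.25

random.seed(0)  # fixed seed so the run is repeatable
xs = [random.uniform(10, 50) for _ in range(200)]
# Each observed y is the deterministic part A + Bx plus a random error e.
ys = [TRUE_A + TRUE_B * x + random.gauss(0, 0.5) for x in xs]

a, b = fit_line(xs, ys)
print(f"true A = {TRUE_A}, sample estimate a = {a:.4f}")
print(f"true B = {TRUE_B}, sample estimate b = {b:.4f}")
```

The estimates should land close to, but not exactly on, the assumed true values, which is the sense in which a and b estimate A and B.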
4
Example 1
Income   Food Expenditure   (both in hundreds of RM)
35       9
49       15
21       7
39       11
15       5
28       8
25       9
5
Scatter Plot for example 1
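[Figure: scatter plot of the Example 1 data with the regression line. At a given x1, the vertical distance between the observed point yobs and the point ypre on the line is the error e.]
e (error) = yobs - ypre
Or e = y - yo
ypre – y value predicted by the regression line or best straight line
yobs – actual y value obtained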
6
Error Sum of Squares, SSE
The sum of errors is always zero for the best straight line, or least squares line, i.e. $\sum e = \sum (y - y_o) = 0$
So to find the line that best fits the points, we cannot minimize the sum of errors, since it will always be zero. Instead we minimize the error sum of squares, SSE
$SSE = \sum e^2 = \sum (y - y_o)^2$
The values of ‘a’ and ‘b’ that give the minimum SSE are called the least squares estimates of A and B, and the regression line obtained with these estimates is called the least squares regression line.
For the least squares regression line, $y_o = a + bx$, where $b = \dfrac{SS_{xy}}{SS_{xx}}$ and $a = \bar{y} - b\bar{x}$
$\bar{y}$ = mean of y scores, $\bar{x}$ = mean of x scores
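The two claims on this slide can be checked numerically with the Example 1 data. The sketch below uses the rounded estimates a = 1.1414 and b = 0.2642 that this chapter derives for Example 1; because those values are rounded, the sum of errors comes out very close to zero rather than exactly zero. The shifted lines are arbitrary comparisons chosen for illustration.

```python
# Example 1 data: income x and food expenditure y.
xs = [35, 49, 21, 39, 15, 28, 25]
ys = [9, 15, 7, 11, 5, 8, 9]


def errors(a, b):
    """Errors e = y - yo for the line yo = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def sse(a, b):
    """Error sum of squares for the line yo = a + b*x."""
    return sum(e ** 2 for e in errors(a, b))


# Rounded least squares estimates quoted for Example 1 in this chapter.
a, b = 1.1414, 0.2642

sum_e = sum(errors(a, b))
print(f"sum of errors             = {sum_e:.4f}")
print(f"SSE at (a, b)             = {sse(a, b):.4f}")
print(f"SSE with a shifted by 0.5 = {sse(a + 0.5, b):.4f}")
print(f"SSE with b shifted by .05 = {sse(a, b + 0.05):.4f}")
```

Any line moved away from the least squares estimates gives a larger SSE, which is why SSE, and not the plain sum of errors, is the quantity that is minimized.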
7
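$SS_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n}$
Can be positive or negative
$SS_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$
Is always positive
Equivalently, $SS_{xy} = \sum (x - \bar{x})(y - \bar{y})$
$SS_{xx} = \sum (x - \bar{x})^2$
$\bar{y}$ = mean of y scores, $\bar{x}$ = mean of x scores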
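The two ways of writing SSxy and SSxx on this slide give the same numbers. A minimal Python sketch, using the Example 1 data, computes both forms so they can be compared; the function names `ss_shortcut` and `ss_deviation` are just labels chosen here.

```python
def ss_shortcut(xs, ys):
    """SSxy and SSxx from the raw sums (the computational form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    return ss_xy, ss_xx


def ss_deviation(xs, ys):
    """SSxy and SSxx from deviations about the means (the definitional form)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    return ss_xy, ss_xx


# Example 1 data: income x and food expenditure y.
xs = [35, 49, 21, 39, 15, 28, 25]
ys = [9, 15, 7, 11, 5, 8, 9]

print("shortcut form: ", ss_shortcut(xs, ys))
print("deviation form:", ss_deviation(xs, ys))
```

The shortcut form is usually the easier one for hand calculation because it only needs the column totals.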
8
Example 1
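Income, x   Food Expenditure, y   xy     $x^2$
35          9                     315    1225
49          15                    735    2401
21          7                     147    441
39          11                    429    1521
15          5                     75     225
28          8                     224    784
25          9                     225    625
$\sum x = 212$, $\sum y = 64$, $\sum xy = 2150$, $\sum x^2 = 7222$
Step 1: Compute $\sum x$, $\sum y$, $\bar{x}$ and $\bar{y}$.
$\sum x = 212$, $\sum y = 64$
$\bar{x} = \sum x / n = 212 / 7 = 30.2857$
$\bar{y} = \sum y / n = 64 / 7 = 9.1429$
Step 2: Compute $\sum xy$ and $\sum x^2$
$\sum xy = 2150$, $\sum x^2 = 7222$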
9
Step 3: Compute SSxy and SSxx
$SS_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n} = 2150 - \dfrac{(212)(64)}{7} = 211.7143$
$SS_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n} = 7222 - \dfrac{(212)^2}{7} = 801.4286$
Step 4: Compute ‘a’ and ‘b’
$b = \dfrac{SS_{xy}}{SS_{xx}} = \dfrac{211.7143}{801.4286} = 0.2642$
$a = \bar{y} - b\bar{x} = 9.1429 - (0.2642)(30.2857) = 1.1414$
The estimated regression model ypre = a + bx is ypre = 1.1414 + 0.2642x
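Steps 1 to 4 can be retraced in Python as a check on the arithmetic. This is a sketch rather than the method used to prepare the slides. It computes the intercept twice: once from values rounded to four decimals, as in the hand calculation, and once at full precision. The full-precision intercept comes out at about 1.1422, slightly different from 1.1414, because the hand calculation rounds b before using it.

```python
xs = [35, 49, 21, 39, 15, 28, 25]  # income, x
ys = [9, 15, 7, 11, 5, 8, 9]       # food expenditure, y
n = len(xs)

# Step 1: sums and means
sum_x, sum_y = sum(xs), sum(ys)
x_bar, y_bar = sum_x / n, sum_y / n

# Step 2: sum of products and sum of squares
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Step 3: SSxy and SSxx
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

# Step 4: slope and intercept
b = ss_xy / ss_xx
a_full = y_bar - b * x_bar  # b kept at full precision
# Intercept from inputs rounded to four decimals, as in the hand calculation
a_rounded = round(y_bar, 4) - round(b, 4) * round(x_bar, 4)

print(f"SSxy = {ss_xy:.4f}, SSxx = {ss_xx:.4f}")
print(f"b = {b:.4f}")
print(f"a (rounded inputs) = {a_rounded:.4f}")
print(f"a (full precision) = {a_full:.4f}")
```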
10
This gives the regression of food expenditure on income. Using this estimated regression model, we can find the predicted value of y for any specific value of x.
E.g. if the monthly income is RM3500, so that x = 35 (in hundreds), then ypre = 1.1414 + (0.2642)(35) = 10.3884 hundred = RM1038.84
But the actual y value when x = 35 is RM900
The error in the prediction is RM900 – RM1038.84 = –RM138.84. This negative error indicates that the predicted value of y is greater than the actual value of y. Thus if we use the regression model, the household food expenditure is overestimated by RM138.84
Calculate what happens when income = RM0?
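The prediction and its error can be reproduced with a short Python sketch. It assumes the rounded estimates a = 1.1414 and b = 0.2642 from the estimated regression model, with income and expenditure both measured in hundreds of RM; the function name `predict` is chosen here for illustration.

```python
A_EST, B_EST = 1.1414, 0.2642  # rounded estimates from the estimated model


def predict(x):
    """Predicted food expenditure (hundreds of RM) for income x (hundreds of RM)."""
    return A_EST + B_EST * x


x, y_actual = 35, 9       # income RM3500, actual food expenditure RM900
y_pre = predict(x)
error = y_actual - y_pre  # e = y - yo

print(f"predicted expenditure: RM{y_pre * 100:.2f}")
print(f"prediction error:      RM{error * 100:.2f}")
print(f"prediction at x = 0:   RM{predict(0) * 100:.2f}")
```

Calling `predict(0)` simply returns the intercept a, which is one way to approach the question about an income of RM0.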
11
Maths Science
32 45
67 56
23 12
86 79
65 73
55 65
32 40
67 77
90 87
31 40
56 49
77 82
10 13
75 76
67 68
77 79
34 45
28 31
44 49
Exercise 1
Calculate the regression equation for the Maths (x scores) and Science (y scores) marks.
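A hand calculation for this exercise can be checked with a few lines of Python. The sketch below applies the same least squares formulas from this chapter to the Maths and Science marks listed above; the function name `fit_line` is a label chosen here, and no expected answer is stated so that the exercise can still be worked by hand first.

```python
def fit_line(xs, ys):
    """Return least squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    b = ss_xy / ss_xx
    a = sum_y / n - b * sum_x / n
    return a, b


# Marks copied from the Exercise 1 table (x = Maths, y = Science).
maths = [32, 67, 23, 86, 65, 55, 32, 67, 90, 31,
         56, 77, 10, 75, 67, 77, 34, 28, 44]
science = [45, 56, 12, 79, 73, 65, 40, 77, 87, 40,
           49, 82, 13, 76, 68, 79, 45, 31, 49]

a, b = fit_line(maths, science)
print(f"Science = {a:.4f} + {b:.4f} * Maths")
```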
12
Maths History
32 70
67 30
23 80
86 45
65 35
55 65
32 65
67 35
90 10
31 70
56 49
77 42
10 90
75 40
67 59
77 51
18 81
28 55
44 49
Exercise 2
Calculate the regression equation for the Maths (x scores) and History (y scores) marks.
13
Exercise 3
Calculate the regression equation for the graph between IQ ranges (x-axis) and the correlation coefficients (r) between Overall Creativity (OC) and Overall Achievement (OA) (y-axis), based on the data on page 77 (Graph 7.1) (Palaniappan, 2006).