Chapter 12: The Regression Line (adele/s1040/Slides/Chapter 12.pdf)



Post on 14-Aug-2021


Page 1:

Chapter 12: The Regression Line

We already know that the regression line goes through the point of averages and has slope = r × (SD of Y) / (SD of X). The equation for the line is

Y = intercept + (slope) X

where slope = r × (SD of Y) / (SD of X) and intercept = ave Y - (slope)(ave X)
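The recipe on this slide can be sketched as a small Python helper. It assumes the slope formula r × (SD of Y) / (SD of X), which agrees with the slope quoted later for California women. The function names and the summary statistics in the usage lines are made up for illustration.

```python
def regression_line(ave_x, sd_x, ave_y, sd_y, r):
    """Return (slope, intercept) of the regression line for predicting Y from X.

    Assumed formulas:
        slope     = r * SD_Y / SD_X
        intercept = ave_Y - slope * ave_X
    """
    slope = r * sd_y / sd_x
    intercept = ave_y - slope * ave_x
    return slope, intercept


def predict(slope, intercept, x):
    """Put in X, get out the predicted Y."""
    return intercept + slope * x


# Hypothetical summary statistics, chosen only to keep the arithmetic simple
slope, intercept = regression_line(ave_x=10, sd_x=2, ave_y=50, sd_y=8, r=0.5)
print(slope, intercept)               # 2.0 30.0
print(predict(slope, intercept, 10))  # 50.0
```

Putting in the average of X returns the average of Y, which is another way of saying that the line goes through the point of averages.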

Page 2:

Slope and intercept

Page 3:

The equation can be used to get a prediction by putting in the value for X and getting out the predicted value for Y. The regression equation:

Y = intercept + (slope) X

Put in X

Get out Y

Page 4:

Example 1:
Midterm: ave = 65, SD = 16
Final: ave = 60, SD = 10
r = 0.7
Find the equation of the regression line for estimating final exam score from midterm score. Estimate the final exam score for someone who got 50 on the midterm.
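One way to work Example 1 is sketched below in Python. It assumes slope = r × (SD of final) / (SD of midterm) and intercept = ave final - (slope)(ave midterm). The variable and function names are chosen here for illustration.

```python
# Summary statistics from Example 1
ave_mid, sd_mid = 65, 16
ave_final, sd_final = 60, 10
r = 0.7

# Assumed formulas: slope = r * SD_y / SD_x, intercept = ave_y - slope * ave_x
slope = r * sd_final / sd_mid
intercept = ave_final - slope * ave_mid


def predict_final(midterm):
    """Predicted final exam score for a given midterm score."""
    return intercept + slope * midterm


print(f"slope = {slope:.4f}")          # 0.4375
print(f"intercept = {intercept:.4f}")  # 31.5625
print(f"predicted final for a midterm of 50: {predict_final(50):.1f}")  # 53.4
```

Under these assumptions the line is final = 31.5625 + 0.4375 × (midterm), and a midterm score of 50 gives a predicted final score of about 53.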

Page 5:

Example 2: For the men aged 18-24 in the HANES sample, the relationship between height and systolic blood pressure can be summarized as follows:

Average height ≈ 70”, SD ≈ 3”
Average b.p. ≈ 124 mm, SD ≈ 14 mm
r = -0.2
Find the equation of the line for estimating blood pressure from height. Predict the blood pressure of a man who is 68” tall.
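Example 2 can be sketched the same way. The Python below uses the assumed formulas slope = r × (SD of b.p.) / (SD of height) and intercept = ave b.p. - (slope)(ave height), then cross-checks the prediction by working in standard units. The variable names are illustrative.

```python
# Summary statistics from Example 2 (height in inches, blood pressure in mm)
ave_h, sd_h = 70, 3
ave_bp, sd_bp = 124, 14
r = -0.2

# Assumed formulas: slope = r * SD_y / SD_x, intercept = ave_y - slope * ave_x
slope = r * sd_bp / sd_h
intercept = ave_bp - slope * ave_h

# Prediction straight from the equation of the line
direct = intercept + slope * 68

# Cross-check: 68 inches is 2/3 of an SD below average height, so the
# predicted b.p. is r * (-2/3) SDs away from the average b.p.
z_height = (68 - ave_h) / sd_h
via_sd_units = ave_bp + r * z_height * sd_bp

print(f"slope = {slope:.2f} mm per inch")  # -0.93
print(f"intercept = {intercept:.2f} mm")   # 189.33
print(f"predicted b.p. at 68 inches: {direct:.1f} mm (check: {via_sd_units:.1f} mm)")
```

Both routes give about 126 mm. Because r is negative, a man who is shorter than average is predicted to have slightly higher than average blood pressure.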

Page 6:

Example 3: California men, aged 25-29 in 2005
Education (years): ave = 12.5, SD = 3
Income: ave = $30,000, SD = $24,000
r = 0.25
Find the equation of the line for estimating income from education.

Estimate the income of a California man with 4 years of education.
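A possible worked version of Example 3 in Python, again assuming slope = r × (SD of income) / (SD of education) and intercept = ave income - (slope)(ave education). The names are illustrative.

```python
# Summary statistics from Example 3 (California men, aged 25-29 in 2005)
ave_edu, sd_edu = 12.5, 3          # years of education
ave_inc, sd_inc = 30_000, 24_000   # dollars
r = 0.25

# Assumed formulas: slope = r * SD_y / SD_x, intercept = ave_y - slope * ave_x
slope = r * sd_inc / sd_edu
intercept = ave_inc - slope * ave_edu


def predict_income(years_of_education):
    """Predicted income in dollars for a given number of years of education."""
    return intercept + slope * years_of_education


print(f"slope = {slope:,.0f} dollars per year of education")        # 2,000
print(f"intercept = {intercept:,.0f} dollars")                      # 5,000
print(f"predicted income at 4 years: {predict_income(4):,.0f} dollars")  # 13,000
```

Under these assumptions the line is income = $5,000 + $2,000 × (years of education), so the estimate for 4 years of education is $13,000.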

Page 7:

California men

Page 8:

Example 4: California women, aged 25-29 in 2005
Education (years): ave = 13, SD = 3.4
Income: ave = $18,000, SD = $20,000
r = 0.37
Find the equation of the line for estimating income from education.

Estimate the income of a California woman with 4 years of education.
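Example 4 can be sketched in Python under the same assumed formulas, slope = r × (SD of income) / (SD of education) and intercept = ave income - (slope)(ave education). The names are illustrative, and the figures are left unrounded, so a hand calculation that rounds the slope first may differ by a few dollars.

```python
# Summary statistics from Example 4 (California women, aged 25-29 in 2005)
ave_edu, sd_edu = 13, 3.4          # years of education
ave_inc, sd_inc = 18_000, 20_000   # dollars
r = 0.37

# Assumed formulas: slope = r * SD_y / SD_x, intercept = ave_y - slope * ave_x
slope = r * sd_inc / sd_edu
intercept = ave_inc - slope * ave_edu

predicted = intercept + slope * 4  # estimate for 4 years of education

print(f"slope = {slope:,.0f} dollars per year of education")     # 2,176
print(f"intercept = {intercept:,.0f} dollars")                   # -10,294
print(f"predicted income at 4 years: {predicted:,.0f} dollars")  # -1,588
```

The estimate comes out negative, which cannot be a real income. Four years of education is 9 years, or about 2.6 SDs, below the average, so the line is being pushed far from the bulk of the data.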

Page 9:

California women

Page 10:

Caution!

For an observational study, the regression line describes the data that you see, but it can NOT be relied on for predicting the results of INTERVENTIONS. In other words, we cannot treat it as a causal relationship.

e.g. For California women, the slope says that ASSOCIATED with each extra year of education, there is an increase of $2,176 in income, on average. Going to school for an extra year will not necessarily CAUSE an increase of $2,176 in income. Those who have a 4-year degree earn, on average, about $8,704 (4 × $2,176) more than those who only completed high school, but getting a degree will not necessarily cause someone’s salary to increase.

WHY NOT? CONFOUNDING FACTORS!

Page 11:

Notes

• We can’t rely on the slope to tell us how y will respond if the investigator changes x unless it is a controlled experiment. In an observational study there are too many confounding factors.
• Sometimes the intercept will not make sense. For example, it might be negative when we would expect it to be zero or positive.
• Never use regression to predict outside the range of your data!

Page 12:

“Least Squares”
• Among all lines, the one with the smallest r.m.s. error is the regression line.
• We call the regression line the “least squares regression line”.
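The least squares property can be checked numerically. The Python sketch below uses a small made-up data set (not from the slides). It builds the regression line from r and the SDs, then compares its r.m.s. error with that of a few other lines. The helper names are illustrative, and the SD here divides by n.

```python
from math import sqrt

# Hypothetical data, made up for this sketch
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def ave(values):
    return sum(values) / len(values)


def sd(values):
    """Root-mean-square deviation from the average (divides by n)."""
    m = ave(values)
    return sqrt(ave([(v - m) ** 2 for v in values]))


def corr(x, y):
    """Correlation coefficient: average product of deviations over the SDs."""
    mx, my = ave(x), ave(y)
    return ave([(a - mx) * (b - my) for a, b in zip(x, y)]) / (sd(x) * sd(y))


def rms_error(slope, intercept, x, y):
    """r.m.s. size of the vertical distances from the points to the line."""
    return sqrt(ave([(b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)]))


# Regression line from the summary statistics
slope = corr(xs, ys) * sd(ys) / sd(xs)
intercept = ave(ys) - slope * ave(xs)
best = rms_error(slope, intercept, xs, ys)

# Other lines: nudge the slope and/or the intercept
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.5), (0.0, -0.5), (0.2, -0.6)]
others = [rms_error(slope + ds, intercept + di, xs, ys) for ds, di in nudges]

print(f"regression line: r.m.s. error = {best:.4f}")
for (ds, di), err in zip(nudges, others):
    print(f"slope {ds:+.1f}, intercept {di:+.1f}: r.m.s. error = {err:.4f}")
```

Every nudged line has a larger r.m.s. error than the regression line. Trying a handful of lines illustrates the claim but does not prove it.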

Page 13:

A good regression example: Hooke’s Law

Hang a weight on a spring and measure the length of the spring. Hooke said the stretch is proportional to the load. Doubling the load doubles the stretch.

Page 14:

Hooke’s law

Slope = 0.05 cm per kg

Intercept = 439.01 cm

Page 15:

Hooke’s law:

length = mx + b, where x is the load in kg

We found slope = 0.05 and intercept = 439.01, so
• we estimate m by 0.05
• we estimate b by 439.01
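With these estimates the fitted line can be used for prediction. The Python sketch below takes slope 0.05 cm per kg and intercept 439.01 cm from the slides. The loads of 4 kg and 8 kg are hypothetical, chosen only to illustrate that doubling the load doubles the predicted stretch.

```python
SLOPE_CM_PER_KG = 0.05   # estimate of m, from the slides
INTERCEPT_CM = 439.01    # estimate of b, the predicted length with no load


def predicted_length(load_kg):
    """Predicted spring length in cm under a given load in kg."""
    return INTERCEPT_CM + SLOPE_CM_PER_KG * load_kg


def predicted_stretch(load_kg):
    """Predicted stretch in cm beyond the no-load length."""
    return SLOPE_CM_PER_KG * load_kg


# Hypothetical loads, not taken from the slides
print(f"length at 4 kg: {predicted_length(4):.2f} cm")    # 439.21
print(f"stretch at 4 kg: {predicted_stretch(4):.2f} cm")  # 0.20
print(f"stretch at 8 kg: {predicted_stretch(8):.2f} cm")  # 0.40
```

The intercept plays the role of the unloaded length and the slope is the stretch per kg, which is the form Hooke's law predicts.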

Page 16:

A bad regression example

Measure the area and perimeter of these rectangles

Page 17:

The correlation between area and perimeter is r = 0.98! The scatter diagram:

Page 18:

Calculations show r = 0.98, slope = 1.6, intercept = -10.51

area = -10.51 + 1.6(perimeter) ? RIDICULOUS!

Regression will not FIND an appropriate model; you have to do the THINKING. Don’t substitute statistics for science!
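A short Python sketch shows why the fitted line is not a sensible model. It uses the slope 1.6 and intercept -10.51 from the slides. The three rectangles are hypothetical, not the ones measured in the lecture.

```python
SLOPE = 1.6
INTERCEPT = -10.51


def fitted_area(perimeter):
    """Area 'predicted' from perimeter by the fitted regression line."""
    return INTERCEPT + SLOPE * perimeter


# Hypothetical rectangles as (width, height) pairs
rectangles = [(1, 1), (5, 5), (1, 9)]

results = []
for width, height in rectangles:
    perimeter = 2 * (width + height)
    true_area = width * height
    results.append((perimeter, true_area, fitted_area(perimeter)))
    print(f"{width} x {height}: perimeter = {perimeter}, "
          f"true area = {true_area}, fitted area = {fitted_area(perimeter):.2f}")
```

The 1 x 1 rectangle gets a negative fitted area. The 5 x 5 and 1 x 9 rectangles share a perimeter of 20, so they get the same fitted area even though their true areas are 25 and 9. Area depends on both sides, not on perimeter alone, however high the correlation in a particular set of rectangles.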