measures of regression and prediction intervals 1 section 9.3

20
Measures of Regression and Prediction Intervals 1 Section 9.3

Upload: jonah-maxwell

Post on 11-Jan-2016

231 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Measures of Regression and Prediction Intervals 1 Section 9.3

Measures of Regression and Prediction Intervals

1

Section 9.3

Page 2: Measures of Regression and Prediction Intervals 1 Section 9.3

Section 9.3 Objectives

2

Interpret the three types of variation about a regression line

Find and interpret the coefficient of determination

Find and interpret the standard error of the estimate for a regression line

Construct and interpret a prediction interval for y

Page 3: Measures of Regression and Prediction Intervals 1 Section 9.3

Variation About a Regression Line

3

Three types of variation about a regression lineTotal variationExplained variationUnexplained variation

To find the total variation, you must first calculateThe total deviationThe explained deviationThe unexplained deviation

Page 4: Measures of Regression and Prediction Intervals 1 Section 9.3

Variation About a Regression Line

4

iy yˆiy y

ˆi iy y

(xi, ŷi)

x

y (xi, yi)

(xi, yi)

Unexplained deviation

ˆi iy yTotal

deviationiy y Explained

deviationˆiy y

y

x

Total Deviation = Explained Deviation = Unexplained Deviation =

Page 5: Measures of Regression and Prediction Intervals 1 Section 9.3

Variation About a Regression Line

5

Total variation • The sum of the squares of the differences

between the y-value of each ordered pair and the mean of y.

Explained variation The sum of the squares of the differences

between each predicted y-value and the mean of y.

2iy y Total

variation =

Explained variation =

2ˆiy y

Page 6: Measures of Regression and Prediction Intervals 1 Section 9.3

Variation About a Regression Line

6

Unexplained variation The sum of the squares of the differences

between the y-value of each ordered pair and each corresponding predicted y-value.

2ˆi iy y Unexplained variation =

The sum of the explained and unexplained variation is equal to the total variation.

Total variation = Explained variation + Unexplained variation

Page 7: Measures of Regression and Prediction Intervals 1 Section 9.3

Coefficient of Determination

7

Coefficient of determinationThe ratio of the explained variation to the

total variation. Denoted by r2

2 Explained variationTotal variation

r

Page 8: Measures of Regression and Prediction Intervals 1 Section 9.3

Example: Coefficient of Determination

8

The correlation coefficient for the advertising expenses and company sales data as calculated in Section 9.1 isr ≈ 0.913. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation?

22 (0.913)

0.834

r

About 83.4% of the variation in the company sales can be explained by the variation in the advertising expenditures. About 16.9% of the variation is unexplained.

Solution:

Page 9: Measures of Regression and Prediction Intervals 1 Section 9.3

The Standard Error of Estimate

9

Standard error of estimateThe standard deviation of the observed yi -

values about the predicted ŷ-value for a given xi -value.

Denoted by se.

The closer the observed y-values are to the

predicted y-values, the smaller the standard error of estimate will be.

2( )ˆ2

i ie

y ys

n

n is the number of ordered pairs in the data set

Page 10: Measures of Regression and Prediction Intervals 1 Section 9.3

The Standard Error of Estimate

10

2

, , , ( ), ˆ ˆ( )ˆ

i i i i i

i i

x y y y yy y

2( )ˆ2

i ie

y ys

n

1. Make a table that includes the column heading shown.

2. Use the regression equation to calculate the predicted y-values.

3. Calculate the sum of the squares of the differences between each observed y-value and the corresponding predicted y-value.

4. Find the standard error of estimate.

ˆ iy mx b

2 ( )ˆi iy y

In Words In Symbols

Page 11: Measures of Regression and Prediction Intervals 1 Section 9.3

Example: Standard Error of Estimate

11

The regression equation for the advertising expenses and company sales data as calculated in section 9.2 is

ŷ = 50.729x + 104.061Find the standard error of estimate.

Solution:Use a table to calculate the sum of the squared differences of each observed y-value and the corresponding predicted y-value.

Page 12: Measures of Regression and Prediction Intervals 1 Section 9.3

Solution: Standard Error of Estimate

12

x y ŷ i (yi – ŷ i)2

2.4 225 225.81 (225 – 225.81)2 = 0.65611.6 184 185.23 (184 – 185.23)2 = 1.51292.0 220 205.52 (220 – 205.52)2 = 209.67042.6 240 235.96 (240 – 235.96)2 = 16.32161.4 180 175.08 (180 – 175.08)2 = 24.20641.6 184 185.23 (184 – 185.23)2 = 1.51292.0 186 205.52 (186 – 205.52)2 = 381.03042.2 215 215.66 (215 – 215.66)2 = 0.4356

Σ = 635.3463

unexplained variation

Page 13: Measures of Regression and Prediction Intervals 1 Section 9.3

Solution: Standard Error of Estimate

13

• n = 8, Σ(yi – ŷ i)2 = 635.3463

2( )ˆ2

i ie

y ys

n

The standard error of estimate of the company sales for a specific advertising expense is about $10.29.

635.3463 10.2908 2

Page 14: Measures of Regression and Prediction Intervals 1 Section 9.3

Prediction Intervals

14

Two variables have a bivariate normal distribution if for any fixed value of x, the corresponding values of y are normally distributed and for any fixed values of y, the corresponding x-values are normally distributed.

Page 15: Measures of Regression and Prediction Intervals 1 Section 9.3

Prediction Intervals

15

A prediction interval can be constructed for the true value of y.

Given a linear regression equation ŷ = mx + b and x0, a specific value of x, a c-prediction interval for y is

ŷ – E < y < ŷ + E where

The point estimate is ŷ and the margin of error is E. The probability that the prediction interval contains y is c.

202 2

( )11( )c e

n x xE t s

n n x x

Page 16: Measures of Regression and Prediction Intervals 1 Section 9.3

Constructing a Prediction Interval for y for a Specific Value of x

16

1. Identify the number of ordered pairs in the data set n and the degrees of freedom.

2. Use the regression equation and the given x-value to find the point estimate ŷ.

3. Find the critical value tc that corresponds to the given level of confidence c.

ˆi iy mx b

Use Table 5 in Appendix B.

In Words In Symbols

d.f. = n – 2

Page 17: Measures of Regression and Prediction Intervals 1 Section 9.3

Constructing a Prediction Interval for y for a Specific Value of x

17

2( )ˆ2

i ie

y ys

n

4. Find the standard error of estimate se.

5. Find the margin of error E.

6. Find the left and right endpoints and form the prediction interval.

In Words In Symbols

202 2

( )11( )c e

n x xE t s

n n x x

Left endpoint: ŷ – E Right endpoint: ŷ + E Interval: ŷ – E < y < ŷ + E

Page 18: Measures of Regression and Prediction Intervals 1 Section 9.3

Example: Constructing a Prediction Interval

18

Construct a 95% prediction interval for the company sales when the advertising expenses are $2100. What can you conclude?Recall, n = 8, ŷ = 50.729x + 104.061, se = 10.290

Solution:Point estimate: ŷ = 50.729(2.1) + 104.061 ≈ 210.592

Critical value:d.f. = n –2 = 8 – 2 = 6 tc = 2.447

215.8, 32.44, 1.975x x x

Page 19: Measures of Regression and Prediction Intervals 1 Section 9.3

Solution: Constructing a Prediction Interval

19

0

2

2

2

2 2

1 8(2.1 1.975)(2.447)(10.290) 1 26.8578 8(32.44) (15

( )

8)

1

.

1( )c e

n x xE t s

n n x x

Left Endpoint: ŷ – E Right Endpoint: ŷ + E

183.735 < y < 237.449

210.592 – 26.857≈ 183.735

210.592 + 26.857

≈ 237.449

You can be 95% confident that when advertising expenses are $2100, the company sales will be between $183,735 and $237,449.

Page 20: Measures of Regression and Prediction Intervals 1 Section 9.3

Section 9.3 Summary

20

Interpreted the three types of variation about a regression line

Found and interpreted the coefficient of determination

Found and interpreted the standard error of the estimate for a regression line

Constructed and interpreted a prediction interval for y