Measures of Regression and Prediction Intervals
Section 9.3
Section 9.3 Objectives
Interpret the three types of variation about a regression line
Find and interpret the coefficient of determination
Find and interpret the standard error of the estimate for a regression line
Construct and interpret a prediction interval for y
Variation About a Regression Line
Three types of variation about a regression line:
Total variation
Explained variation
Unexplained variation
To find the total variation, you must first calculate:
The total deviation
The explained deviation
The unexplained deviation
Variation About a Regression Line
[Figure: graph with axes x and y showing the regression line, the points (xi, yi) and (xi, ŷi), and the total, explained, and unexplained deviations.]
Total deviation = \( y_i - \bar{y} \)
Explained deviation = \( \hat{y}_i - \bar{y} \)
Unexplained deviation = \( y_i - \hat{y}_i \)
Variation About a Regression Line
Total variation: The sum of the squares of the differences between the y-value of each ordered pair and the mean of y.
\[ \text{Total variation} = \sum (y_i - \bar{y})^2 \]
Explained variation: The sum of the squares of the differences between each predicted y-value and the mean of y.
\[ \text{Explained variation} = \sum (\hat{y}_i - \bar{y})^2 \]
Variation About a Regression Line
Unexplained variation: The sum of the squares of the differences between the y-value of each ordered pair and each corresponding predicted y-value.
\[ \text{Unexplained variation} = \sum (y_i - \hat{y}_i)^2 \]
The sum of the explained and unexplained variation is equal to the total variation.
Total variation = Explained variation + Unexplained variation
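As a rough numerical check of this identity, the sketch below computes the three variations for the advertising expenses and company sales data used in this section's examples. It fits the least-squares line directly instead of using the rounded coefficients, so the identity holds up to floating-point error. The variable names are illustrative assumptions, not notation from the slides.

```python
import math

# Advertising expenses (x) and company sales (y) from the section's example table.
x = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]
y = [225, 184, 220, 240, 180, 184, 186, 215]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Unrounded least-squares slope and intercept.
m = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
b = y_bar - m * x_bar
y_hat = [m * xi + b for xi in x]

# The three variations defined above.
total_variation = sum((yi - y_bar) ** 2 for yi in y)
explained_variation = sum((yh - y_bar) ** 2 for yh in y_hat)
unexplained_variation = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

print("total:      ", total_variation)
print("explained:  ", explained_variation)
print("unexplained:", unexplained_variation)
print("identity holds:",
      math.isclose(total_variation, explained_variation + unexplained_variation))
```

The identity depends on ŷ coming from the least-squares line; with rounded coefficients the two sides agree only approximately.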
Coefficient of Determination
Coefficient of determination: The ratio of the explained variation to the total variation. Denoted by r².
\[ r^2 = \frac{\text{Explained variation}}{\text{Total variation}} \]
Example: Coefficient of Determination
The correlation coefficient for the advertising expenses and company sales data as calculated in Section 9.1 is r ≈ 0.913. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation?
Solution:
\[ r^2 \approx (0.913)^2 \approx 0.834 \]
About 83.4% of the variation in the company sales can be explained by the variation in the advertising expenditures. About 16.6% of the variation is unexplained.
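The arithmetic in this example can be sketched in a few lines. The unexplained share is taken as 1 minus the rounded coefficient of determination, which is an assumption about how the percentages were obtained; the variable names are illustrative.

```python
r = 0.913  # correlation coefficient quoted from Section 9.1

# Coefficient of determination, rounded to three decimals as in the example.
r_squared = round(r ** 2, 3)

# Whatever share is not explained by the regression line is unexplained.
unexplained_share = round(1 - r_squared, 3)

print(f"r^2 = {r_squared}")
print(f"explained: {r_squared:.1%}, unexplained: {unexplained_share:.1%}")
```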
The Standard Error of Estimate
Standard error of estimate: The standard deviation of the observed yi-values about the predicted ŷ-value for a given xi-value. Denoted by se.
The closer the observed y-values are to the predicted y-values, the smaller the standard error of estimate will be.
\[ s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} \]
where n is the number of ordered pairs in the data set.
The Standard Error of Estimate
In Words / In Symbols
1. Make a table that includes the column headings shown.
   \( x_i,\ y_i,\ \hat{y}_i,\ (y_i - \hat{y}_i),\ (y_i - \hat{y}_i)^2 \)
2. Use the regression equation to calculate the predicted y-values.
   \( \hat{y}_i = m x_i + b \)
3. Calculate the sum of the squares of the differences between each observed y-value and the corresponding predicted y-value.
   \( \sum (y_i - \hat{y}_i)^2 \)
4. Find the standard error of estimate.
   \( s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} \)
Example: Standard Error of Estimate
The regression equation for the advertising expenses and company sales data as calculated in Section 9.2 is
ŷ = 50.729x + 104.061
Find the standard error of estimate.
Solution:
Use a table to calculate the sum of the squared differences of each observed y-value and the corresponding predicted y-value.
Solution: Standard Error of Estimate
x      y      ŷi        (yi – ŷi)²
2.4    225    225.81    (225 – 225.81)² = 0.6561
1.6    184    185.23    (184 – 185.23)² = 1.5129
2.0    220    205.52    (220 – 205.52)² = 209.6704
2.6    240    235.96    (240 – 235.96)² = 16.3216
1.4    180    175.08    (180 – 175.08)² = 24.2064
1.6    184    185.23    (184 – 185.23)² = 1.5129
2.0    186    205.52    (186 – 205.52)² = 381.0304
2.2    215    215.66    (215 – 215.66)² = 0.4356
Σ(yi – ŷi)² = 635.3463 (unexplained variation)
Solution: Standard Error of Estimate
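n = 8, Σ(yi – ŷi)² = 635.3463
\[ s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{635.3463}{8 - 2}} \approx 10.290 \]
The standard error of estimate of the company sales for a specific advertising expense is about $10,290, since sales are recorded in thousands of dollars.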
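The four guideline steps can be sketched as follows for this example. Predicted values are rounded to two decimals to mirror the solution table, which is an assumption about how the table was built; the variable names are illustrative.

```python
import math

# Step 1: the data columns x and y from the example table.
x = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]
y = [225, 184, 220, 240, 180, 184, 186, 215]
m, b = 50.729, 104.061  # rounded regression coefficients from Section 9.2

# Step 2: predicted y-values, rounded to two decimals as in the table.
y_hat = [round(m * xi + b, 2) for xi in x]

# Step 3: sum of squared differences (the unexplained variation).
sum_sq = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# Step 4: standard error of estimate with n - 2 in the denominator.
n = len(x)
s_e = math.sqrt(sum_sq / (n - 2))

print(f"sum of squares = {sum_sq:.4f}")
print(f"s_e = {s_e:.3f}")
```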
Prediction Intervals
Two variables have a bivariate normal distribution if, for any fixed value of x, the corresponding values of y are normally distributed and, for any fixed value of y, the corresponding x-values are normally distributed.
Prediction Intervals
A prediction interval can be constructed for the true value of y.
Given a linear regression equation ŷ = mx + b and x0, a specific value of x, a c-prediction interval for y is
ŷ – E < y < ŷ + E
where
\[ E = t_c s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n \sum x^2 - \left(\sum x\right)^2}} \]
The point estimate is ŷ and the margin of error is E. The probability that the prediction interval contains y is c.
Constructing a Prediction Interval for y for a Specific Value of x
In Words / In Symbols
1. Identify the number of ordered pairs in the data set n and the degrees of freedom.
   d.f. = n – 2
2. Use the regression equation and the given x-value to find the point estimate ŷ.
   \( \hat{y}_i = m x_i + b \)
3. Find the critical value tc that corresponds to the given level of confidence c.
   Use Table 5 in Appendix B.
Constructing a Prediction Interval for y for a Specific Value of x
In Words / In Symbols
4. Find the standard error of estimate se.
   \( s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} \)
5. Find the margin of error E.
   \( E = t_c s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n \sum x^2 - \left(\sum x\right)^2}} \)
6. Find the left and right endpoints and form the prediction interval.
   Left endpoint: ŷ – E
   Right endpoint: ŷ + E
   Interval: ŷ – E < y < ŷ + E
Example: Constructing a Prediction Interval
Construct a 95% prediction interval for the company sales when the advertising expenses are $2100. What can you conclude?
Recall, n = 8, ŷ = 50.729x + 104.061, se = 10.290
\( \sum x = 15.8,\quad \sum x^2 = 32.44,\quad \bar{x} = 1.975 \)
Solution:
Point estimate: ŷ = 50.729(2.1) + 104.061 ≈ 210.592
Critical value: d.f. = n – 2 = 8 – 2 = 6, tc = 2.447
Solution: Constructing a Prediction Interval
\[ E = t_c s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n \sum x^2 - \left(\sum x\right)^2}} = (2.447)(10.290)\sqrt{1 + \frac{1}{8} + \frac{8(2.1 - 1.975)^2}{8(32.44) - (15.8)^2}} \approx 26.857 \]
Left endpoint: ŷ – E = 210.592 – 26.857 ≈ 183.735
Right endpoint: ŷ + E = 210.592 + 26.857 ≈ 237.449
183.735 < y < 237.449
You can be 95% confident that when advertising expenses are $2100, the company sales will be between $183,735 and $237,449.
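The interval construction can be sketched as a small function. The function name `prediction_interval` and its parameters are hypothetical, chosen for illustration. The critical value tc is passed in as read from the t-table instead of being computed, so the sketch needs only the standard library.

```python
import math

def prediction_interval(x0, n, sum_x, sum_x2, m, b, s_e, t_c):
    """Return (y_hat, E, left, right) for a prediction interval at x0."""
    x_bar = sum_x / n
    y_hat = m * x0 + b  # point estimate from the regression equation
    # Margin of error: t_c * s_e * sqrt(1 + 1/n + n(x0 - x_bar)^2 / (n*sum_x2 - sum_x^2))
    E = t_c * s_e * math.sqrt(
        1 + 1 / n + n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    )
    return y_hat, E, y_hat - E, y_hat + E

# Values recalled in the example: advertising expenses of $2100 means x0 = 2.1.
y_hat, E, left, right = prediction_interval(
    x0=2.1, n=8, sum_x=15.8, sum_x2=32.44,
    m=50.729, b=104.061, s_e=10.290, t_c=2.447,
)
print(f"point estimate = {y_hat:.3f}, E = {E:.3f}")
print(f"{left:.3f} < y < {right:.3f}")
```

Passing the sums Σx and Σx² instead of the raw data keeps the call close to the quantities listed in the example.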
Section 9.3 Summary
Interpreted the three types of variation about a regression line
Found and interpreted the coefficient of determination
Found and interpreted the standard error of the estimate for a regression line
Constructed and interpreted a prediction interval for y