10.1 Simple Linear Regression Ulrich Hoensch Tuesday, December 1, 2009


Source: Rocky Mountain College, cobalt.rocky.edu/.../LectureNotes/Lecture24_Linear_Regression.pdf

10.1 Simple Linear Regression

Ulrich Hoensch

Tuesday, December 1, 2009

The Simple Linear Regression Model

We have two quantitative random variables X (the explanatory variable) and Y (the response variable). The Simple Linear Regression Model assumes that

Y = β0 + β1X + R,

where R ∼ N(0, σ).

- β0, β1, and σ are the parameters of the model and must be estimated from given data of the form (x1, y1), . . . , (xn, yn).

- An estimator for β0 is the intercept b0 of the regression line y = b0 + b1x; an estimator for β1 is the slope b1 of the regression line.

- The value of σ can be estimated using the standard deviation s of the residuals, which is computed as follows:

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}, $$

  where ŷi = b0 + b1xi is the value predicted by the regression line for xi.
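As a minimal sketch of these estimators, the slope, intercept, and residual standard deviation can be computed directly from the standard least-squares formulas. The function name `fit_line` and the four data points are made up for illustration and do not come from the lecture.

```python
import math

def fit_line(xs, ys):
    """Return (b0, b1, s) for the least-squares line y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                # slope estimate
    b0 = y_bar - b1 * x_bar       # intercept estimate
    # Residual standard deviation with n - 2 degrees of freedom
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    return b0, b1, s

# Hypothetical toy data (not from the lecture)
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]
b0, b1, s = fit_line(xs, ys)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, s = {s:.4f}")
```

The divisor n − 2 reflects that two parameters (β0 and β1) were estimated before the residuals were computed.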


Example: Beer and Blood Alcohol

Sixteen student volunteers at Ohio State University drank a randomly assigned number of 12-ounce cans of beer. Thirty minutes later, a police officer measured their blood alcohol content (BAC). Here are the data:

Student    1      2      3      4      5      6      7      8
x: beers   5      2      9      8      3      7      3      5
y: BAC     0.10   0.03   0.19   0.12   0.04   0.095  0.07   0.06

Student    9      10     11     12     13     14     15     16
x: beers   3      5      4      6      5      7      1      4
y: BAC     0.02   0.05   0.07   0.10   0.085  0.09   0.01   0.05

We conduct a Linear Regression t-Test on a TI calculator and interpret the results.

Step 1: Press STAT, select 1: Edit..., and enter the data.

Step 2: Press STAT, arrow over to TESTS, and select LinRegTTest....

Step 3: Enter the list that contains the data for the explanatory variable (X) and the list for the response variable (Y); set Freq: 1. As the alternative hypothesis, we will always select β & ρ: ≠ 0 (explained later); ignore RegEQ.

Step 4: Select Calculate; the calculator displays the results, which are interpreted next.

Interpretation of Results.

- Since b0 = a = −0.01270 and b1 = b = 0.01796, the regression line is y = −0.01270 + 0.01796x.

- Since the p-value is 2.969 · 10^−6, the null hypothesis

  H0 : β1 = ρ = 0

  is rejected. So, there is a significant correlation between BAC and number of beers drunk. In other words, the number of beers is a significant factor for BAC.

- The correlation coefficient is r = 0.894, so there is a strong positive correlation, and the coefficient of determination is r² = 0.7998 ≈ 80% (80% of the variation in BAC is explained by the number of beers).
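The calculator output can be cross-checked by hand. The sketch below recomputes the slope, intercept, r, r², the residual standard deviation s, and the t statistic for the slope from the sixteen data points, using only the standard library. The variable names are illustrative. The two-sided p-value would come from a t distribution with n − 2 = 14 degrees of freedom, which the standard library does not provide, so it is not computed here.

```python
import math

# Data from the table above
beers = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
bac = [0.10, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
       0.02, 0.05, 0.07, 0.10, 0.085, 0.09, 0.01, 0.05]

n = len(beers)
x_bar = sum(beers) / n
y_bar = sum(bac) / n
sxx = sum((x - x_bar) ** 2 for x in beers)
syy = sum((y - y_bar) ** 2 for y in bac)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(beers, bac))

b1 = sxy / sxx                      # slope (b on the calculator)
b0 = y_bar - b1 * x_bar             # intercept (a on the calculator)
r = sxy / math.sqrt(sxx * syy)      # correlation coefficient
sse = syy - b1 * sxy                # sum of squared residuals
s = math.sqrt(sse / (n - 2))        # residual standard deviation
t = b1 / (s / math.sqrt(sxx))       # t statistic for H0: beta1 = 0

print(f"a = {b0:.5f}, b = {b1:.5f}")
print(f"r = {r:.3f}, r^2 = {r * r:.4f}, s = {s:.5f}, t = {t:.2f}")
```

The printed values should agree with the calculator screens to the number of digits quoted above.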

The fitted model is

BAC = −0.01270 + 0.01796 · (number of beers) + R,

where R ∼ N(0, 0.02044); that is, σ is estimated by s = 0.02044.

Application. John has drunk 5 beers. According to the model above, what is his predicted BAC (after 30 minutes), and what is the probability that his BAC will be below the legal limit of 0.08?

- The predicted value is

  BAC = −0.01270 + 0.01796 · 5 = 0.0771.

- P(BAC < 0.08) = normalcdf(−1000000, 0.08, 0.0771, 0.02044) = 55.6%.
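The same calculation can be sketched in Python, with `statistics.NormalDist` (Python 3.8+) standing in for the calculator's normalcdf. The variable names are illustrative; the coefficients and s are the rounded values quoted above.

```python
from statistics import NormalDist

# Rounded estimates from the regression output above
b0, b1, s = -0.01270, 0.01796, 0.02044
beers = 5
legal_limit = 0.08

predicted = b0 + b1 * beers
# BAC is modeled as normal with mean `predicted` and standard deviation s
p_below = NormalDist(mu=predicted, sigma=s).cdf(legal_limit)

print(f"predicted BAC = {predicted:.4f}")
print(f"P(BAC < {legal_limit}) = {p_below:.1%}")
```

Because the normal CDF already integrates from −∞, no artificial lower bound such as −1000000 is needed.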


Multiple Regression and Excel Project 3

Ulrich Hoensch

Thursday, April 23, 2009


Description of MY Project (YOURS must be different)

- Collect the selling prices (including shipping) for 81 TI-84 Plus graphing calculators sold on eBay over the course of one week.

- The response variable is Total Price, the selling price including shipping.

- The explanatory variables are:
  - Color: either BLACK (coded as 0) or SILVER (coded as 1).
  - Condition: USED (coded as 0); NEW (coded as 1); NIB (new in box, coded as 2).
  - Seller Score: scaled using a base-10 logarithm of the actual score to reduce differences in magnitude.
  - Seller Feedback: expressed as a fraction, so that 1 = 100%.

- The model is:
  Total Price = β0 + β1 · Code(Color) + β2 · Code(Condition) + β3 · LOG10(Seller Score) + β4 · Seller Feedback.
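The coding scheme can be sketched as a small helper that turns one listing into the four numeric inputs of the model. The function names, the argument layout, and the sample listing are hypothetical; only the codes themselves come from the description above. The seller score is assumed to be positive so that its logarithm exists.

```python
import math

# Codes taken from the project description
COLOR_CODES = {"BLACK": 0, "SILVER": 1}
CONDITION_CODES = {"USED": 0, "NEW": 1, "NIB": 2}

def encode_listing(color, condition, seller_score, feedback_percent):
    """Map one listing to (Code(Color), Code(Condition), LOG10(score), feedback)."""
    return (
        COLOR_CODES[color],
        CONDITION_CODES[condition],
        math.log10(seller_score),      # assumes seller_score > 0
        feedback_percent / 100.0,      # 100% becomes 1
    )

def predict_price(coefficients, features):
    """Evaluate beta0 + beta1*x1 + ... + beta4*x4 for given coefficients."""
    beta0, *slopes = coefficients
    return beta0 + sum(b * x for b, x in zip(slopes, features))

# Hypothetical listing, for illustration only
features = encode_listing("SILVER", "NIB", 1000, 99.5)
print(features)
```

`predict_price` is left uncalled here because the coefficients only become available once the regression has been run.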

[Figure: Excel Data Sheet]

[Figure: Coded Data Sheet]

Regression Analysis

To perform the regression analysis, follow these steps:

1. Click on Data; then click on Data Analysis. (This is Tools, Data Analysis... for the 1997-2003 version of Excel.)

2. Select Regression, and click OK.

3. Enter the range for the response variable and the range for the explanatory variables (preferably including labels). Set the confidence level to 95%.

[Figure: output of the regression analysis]

Interpretation of Results

The multiple correlation coefficient is R = 0.724, and the coefficient of determination is R² = 0.524, so the model explains only about 52% of the variation in the selling price.

Interpretation of the estimated coefficients (each with the other variables held fixed):

- The coefficient for Code(Color) is β1 = 8.19: we can expect to pay $8.19 more for a silver calculator than for a black calculator.

- The coefficient for Code(Condition) is β2 = 8.46: we can expect to pay $8.46 more for a new calculator than for a used calculator, and a further $8.46 more for a new-in-box calculator than for a new calculator.

- The coefficient for LOG10(Seller Score) is β3 = 2.87: we can expect to pay $2.87 more for a ten-fold increase in a seller's score.

- The coefficient for Seller Feedback is β4 = 255.58: because feedback is coded so that 1 = 100%, a one-percentage-point increase in a seller's feedback corresponds to 0.01 on this scale, so we can expect to pay 255.58 · 0.01 ≈ $2.56 more.
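The arithmetic behind these interpretations can be sketched as follows. The intercept is not needed because it cancels whenever two listings are compared. The dictionary keys and the function `price_change` are illustrative names; the coefficient values are the ones quoted above.

```python
import math

# Estimated coefficients quoted in the interpretation above
coef = {"color": 8.19, "condition": 8.46, "log10_score": 2.87, "feedback": 255.58}

def price_change(d_color=0, d_condition=0, score_ratio=1.0, d_feedback=0.0):
    """Predicted change in Total Price when the inputs change as given.

    d_color, d_condition: change in the coded value
    score_ratio: factor by which the seller score is multiplied
    d_feedback: change in feedback on the 0-to-1 scale (0.01 = one point)
    """
    return (coef["color"] * d_color
            + coef["condition"] * d_condition
            + coef["log10_score"] * math.log10(score_ratio)
            + coef["feedback"] * d_feedback)

print(round(price_change(d_color=1), 2))         # silver instead of black
print(round(price_change(d_condition=2), 2))     # new-in-box instead of used
print(round(price_change(score_ratio=10), 2))    # ten-fold seller score
print(round(price_change(d_feedback=0.01), 2))   # one more percentage point
```

Coding Condition as 0, 1, 2 forces the used-to-new and new-to-NIB steps to have the same price effect, which is why the second call is simply twice β2.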

At the 95% confidence level, all variables are associated with a significant increase in the selling price (both endpoints of the confidence interval are positive), except for the seller feedback (its confidence interval contains zero).
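The decision rule used here, that a coefficient is significant when its confidence interval excludes zero, can be sketched as a one-line check. The interval endpoints below are made up purely for illustration; they are not the values from the Excel output.

```python
def excludes_zero(lower, upper):
    """True if the confidence interval (lower, upper) does not contain zero."""
    return lower > 0 or upper < 0

# Made-up 95% confidence intervals (NOT the Excel output)
examples = {
    "interval A": (1.5, 14.9),     # both endpoints positive
    "interval B": (-30.0, 540.0),  # straddles zero
}

for name, (lower, upper) in examples.items():
    verdict = "significant" if excludes_zero(lower, upper) else "not significant"
    print(f"{name}: ({lower}, {upper}) -> {verdict}")
```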