Multiple Regression and Model Building Chapter 15 Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin

Upload: clara-barber

Post on 03-Jan-2016



Multiple Regression and Model Building

15.1 The Multiple Regression Model and the Least Squares Point Estimate
15.2 Model Assumptions and the Standard Error
15.3 R² and Adjusted R² (This section can be read anytime after reading Section 15.1)
15.4 The Overall F Test
15.5 Testing the Significance of an Independent Variable
15.6 Confidence and Prediction Intervals


Multiple Regression and Model Building Continued

15.7 The Sales Territory Performance Case
15.8 Using Dummy Variables to Model Qualitative Independent Variables
15.9 Using Squared and Interaction Variables
15.10 Model Building and the Effects of Multicollinearity
15.11 Residual Analysis in Multiple Regression
15.12 Logistic Regression


15.1 The Multiple Regression Model and the Least Squares Point Estimate

Simple linear regression used one independent variable to explain the dependent variable
◦ Some relationships are too complex to be described using a single independent variable

Multiple regression uses two or more independent variables to describe the dependent variable
◦ This allows multiple regression models to handle more complex situations
◦ There is no limit to the number of independent variables a model can use

Multiple regression has only one dependent variable
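
As a rough illustration of what "least squares point estimates" means, the sketch below solves the normal equations for b0, b1, …, bk using only the Python standard library. The function name `least_squares` and the data are made up for this example and do not come from the slides; statistical software would normally do this step.

```python
def least_squares(xs, ys):
    """Return [b0, b1, ..., bk], the least squares point estimates.

    xs: one row per observation, each holding k independent-variable values.
    ys: the observed values of the single dependent variable.
    """
    rows = [[1.0] + [float(v) for v in row] for row in xs]  # leading 1 for b0
    p = len(rows[0])  # number of parameters, k + 1
    # Augmented matrix for the normal equations (X'X) b = X'y.
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("independent variables are perfectly collinear")
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for r in range(p):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][p] for i in range(p)]


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2 (no error term),
# so the estimates should recover 1, 2 and 3 up to rounding.
xs = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
b = least_squares(xs, ys)
```

With real data the error term is not zero, so the estimates b0, b1, …, bk only approximate the unknown parameters.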

LO15-1: Explain the multiple regression model and the related least squares point estimates.


15.2 Model Assumptions and the Standard Error

The model is

y = β0 + β1x1 + β2x2 + … + βkxk + ε

Assumptions for multiple regression are stated about the model error terms, the ε's
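
The slide names the standard error without giving its formula. A minimal sketch, assuming the usual definition s = √(SSE / (n − (k + 1))), where SSE is the sum of squared residuals; the numbers are hypothetical.

```python
import math


def standard_error(sse, n, k):
    """Standard error s for a model with k independent variables.

    Assumed definition: s = sqrt(SSE / (n - (k + 1))).
    """
    df = n - (k + 1)  # degrees of freedom left after estimating k + 1 parameters
    if df <= 0:
        raise ValueError("need more observations than parameters")
    return math.sqrt(sse / df)


# Hypothetical values: SSE = 18 from n = 12 observations and k = 2 predictors.
s = standard_error(sse=18.0, n=12, k=2)  # sqrt(18 / 9)
```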

LO15-2: Explain the assumptions behind multiple regression and calculate the standard error.


15.3 R² and Adjusted R²

1. Total variation is given by the formula Σ(yi − ȳ)²

2. Explained variation is given by the formula Σ(ŷi − ȳ)²

3. Unexplained variation is given by the formula Σ(yi − ŷi)²

4. Total variation is the sum of explained and unexplained variation

This section can be covered anytime after reading Section 15.1

LO15-3: Calculate and interpret the multiple and adjusted multiple coefficients of determination.


R² and Adjusted R² Continued

5. The multiple coefficient of determination, R², is the ratio of explained variation to total variation

6. R² is the proportion of the total variation that is explained by the overall regression model

7. The multiple correlation coefficient R is the square root of R²
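
The seven points above can be checked numerically. The sketch below uses a small made-up data set whose fitted values were worked out by hand for this example. The adjusted R² line assumes the standard definition 1 − (1 − R²)(n − 1)/(n − (k + 1)), which the slides name but do not state.

```python
import math

# Made-up data: y observed at x1 = [-1, 1, -1, 1] and x2 = [-1, -1, 1, 1].
# The least squares fit for this data is y_hat = 4 + 1.5*x1 + 2*x2.
y = [1.0, 3.0, 4.0, 8.0]
y_hat = [0.5, 3.5, 4.5, 7.5]
n, k = len(y), 2

y_bar = sum(y) / n
total = sum((yi - y_bar) ** 2 for yi in y)                     # point 1
explained = sum((fi - y_bar) ** 2 for fi in y_hat)             # point 2
unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # point 3

r2 = explained / total  # points 5 and 6
r = math.sqrt(r2)       # point 7
# Assumed standard definition of adjusted R^2 (not given on the slides).
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - (k + 1))
```

Here total variation (26) equals explained (25) plus unexplained (1), as point 4 requires. Adjusted R² comes out lower than R² because it penalizes the number of independent variables.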

LO15-3


15.4 The Overall F Test

To test H0: β1 = β2 = … = βk = 0 versus Ha: At least one of β1, β2, …, βk ≠ 0

The test statistic is

F(model) = [(Explained variation)/k] / [(Unexplained variation)/(n − (k + 1))]

Reject H0 in favor of Ha if F(model) > Fα or p-value < α

Fα is based on k numerator and n − (k + 1) denominator degrees of freedom
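
A minimal sketch of the F(model) calculation and the rejection rule. The variation figures are hypothetical, and the critical value is read from a standard F table rather than computed, so treat it as an assumption of this example.

```python
def f_model(explained, unexplained, n, k):
    """Overall F statistic with k and n - (k + 1) degrees of freedom."""
    df_denominator = n - (k + 1)
    if k <= 0 or df_denominator <= 0:
        raise ValueError("need k >= 1 and n > k + 1")
    return (explained / k) / (unexplained / df_denominator)


# Hypothetical values: n = 20 observations, k = 3 independent variables.
f_stat = f_model(explained=300.0, unexplained=160.0, n=20, k=3)  # (300/3) / (160/16)

# F.05 based on 3 numerator and 16 denominator degrees of freedom (from an F table).
f_critical = 3.24
reject_h0 = f_stat > f_critical
```

Rejecting H0 says that at least one independent variable is significantly related to y; it does not say which one.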

LO15-4: Test the significance of a multiple regression model by using an F test.


15.5 Testing the Significance of an Independent Variable

A variable in a multiple regression model is not likely to be useful unless there is a significant relationship between it and y

To test significance, we use the null hypothesis H0: βj = 0

Versus the alternative hypothesis Ha: βj ≠ 0
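
The slide states the hypotheses but not the test statistic. A sketch assuming the usual t statistic t = bj / s_bj, compared with a t point based on n − (k + 1) degrees of freedom; the estimate, its standard error, and the table value below are hypothetical inputs.

```python
def t_statistic(b_j, s_bj):
    """t statistic for H0: beta_j = 0 (assumed form: estimate / its standard error)."""
    return b_j / s_bj


# Hypothetical regression output: b_j = 2.5 with standard error 0.8.
t = t_statistic(2.5, 0.8)

# t.025 based on 16 degrees of freedom (from a t table), for alpha = .05 two-sided.
t_critical = 2.120
reject_h0 = abs(t) > t_critical
```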

LO15-5: Test the significance of a single independent variable.


15.6 Confidence and Prediction Intervals

The point on the regression line corresponding to particular values x01, x02, …, x0k of the independent variables is ŷ = b0 + b1x01 + b2x02 + … + bkx0k

It is unlikely that this value will equal the mean value of y for these x values

Therefore, we need to place bounds on how far the predicted value might be from the actual value

We can do this by calculating a confidence interval for the mean value of y and a prediction interval for an individual value of y
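
The slides do not give the interval formulas. The sketch below assumes the common textbook forms ŷ ± t·s·√(distance value) for the mean value of y and ŷ ± t·s·√(1 + distance value) for an individual value. Every number, including the distance value that software would normally report, is hypothetical.

```python
import math


def point_prediction(b, x0):
    """y_hat = b0 + b1*x01 + ... + bk*x0k."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x0))


def intervals(y_hat, t_crit, s, distance_value):
    """Return (confidence interval, prediction interval) as (low, high) pairs."""
    half_ci = t_crit * s * math.sqrt(distance_value)
    half_pi = t_crit * s * math.sqrt(1 + distance_value)
    return (y_hat - half_ci, y_hat + half_ci), (y_hat - half_pi, y_hat + half_pi)


# Hypothetical estimates b0, b1, b2 and new values x01, x02.
y_hat = point_prediction([10.0, 2.0, -1.0], [3.0, 4.0])  # 10 + 6 - 4
ci, pi = intervals(y_hat, t_crit=2.0, s=1.5, distance_value=0.25)
```

The prediction interval is always wider than the confidence interval because an individual value of y also varies around its mean.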

LO15-6: Find and interpret a confidence interval for a mean value and a prediction interval for an individual value.


15.8 Using Dummy Variables to Model Qualitative Independent Variables

So far, we have only looked at including quantitative data in a regression model

However, we may wish to include descriptive qualitative data as well
◦ For example, we might want to include the gender of respondents

We can model the effects of different levels of a qualitative variable by using what are called dummy variables
◦ Also known as indicator variables
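
One common way to build dummy variables is to pick a baseline level and create one 0/1 column for each remaining level. The sketch below assumes that coding scheme; the function name `dummy_code` and the store-location data are invented for illustration.

```python
def dummy_code(values, baseline):
    """Code a qualitative variable as 0/1 columns, one per non-baseline level.

    Returns (levels, rows); the baseline level is the row of all zeros.
    """
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical qualitative variable with three levels.
locations = ["street", "mall", "downtown", "mall"]
levels, rows = dummy_code(locations, baseline="street")
```

A variable with m levels needs only m − 1 dummy variables; adding a column for the baseline as well would make the columns perfectly collinear with the intercept.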

LO15-7: Use dummy variables to model qualitative independent variables.


15.9 Using Squared and Interaction Variables

The quadratic regression model relating y to x is: y = β0 + β1x + β2x² + ε

Where:
◦ β0 + β1x + β2x² is the mean value of the dependent variable y
◦ β0, β1, and β2 are regression parameters relating the mean value of y to x
◦ ε is an error term that describes the effects on y of all factors other than x and x²
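
A short sketch of how squared and interaction variables enter a model. The quadratic mean follows the formula above. The interaction term x1·x2 is the usual construction and is assumed here, since this slide only shows the quadratic case. Parameter values are hypothetical.

```python
def quadratic_mean(x, b0, b1, b2):
    """Mean value of y under the quadratic model: b0 + b1*x + b2*x^2."""
    return b0 + b1 * x + b2 * x ** 2


def with_interaction(x1, x2):
    """Independent-variable row extended with the interaction term x1*x2."""
    return [x1, x2, x1 * x2]


mean_y = quadratic_mean(3.0, b0=1.0, b1=2.0, b2=-0.5)  # 1 + 6 - 4.5
row = with_interaction(4.0, 2.5)
```

Both models are still linear in the parameters, so they can be fit by ordinary least squares once the extra columns are created.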

LO15-8: Use squared and interaction variables.


15.10 Model Building and the Effects of Multicollinearity

Multicollinearity is the condition where the independent variables are dependent, related or correlated with each other

Effects
◦ Hinders ability to use t statistics and p-values to assess the relative importance of predictors
◦ Does not hinder ability to predict the dependent (or response) variable

Detection
◦ Scatter plot matrix
◦ Correlation matrix
◦ Variance inflation factors (VIF)
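
The slides list VIF without defining it. A sketch assuming the standard definition VIFj = 1 / (1 − Rj²), where Rj² comes from regressing xj on the other independent variables. With only two independent variables, Rj² is simply their squared correlation, which is the case shown; the data are made up.

```python
def squared_correlation(u, v):
    """Squared simple correlation coefficient between two variables."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv ** 2 / (s_uu * s_vv)


def vif(r2_j):
    """Variance inflation factor (assumed definition: 1 / (1 - Rj^2))."""
    return 1.0 / (1.0 - r2_j)


# Made-up values of two independent variables.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
r2_12 = squared_correlation(x1, x2)
vif_value = vif(r2_12)
```

A VIF of 1 means no linear relationship with the other independent variables; larger values indicate stronger multicollinearity.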

LO15-9: Describe multicollinearity and build a multiple regression model.


15.11 Residual Analysis in Multiple Regression

For an observed value of yi, the residual is

ei = yi − ŷi = yi − (b0 + b1xi1 + … + bkxik)

If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance σ2
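
The residual formula translates directly into code. The data and estimates below are made up for this example; the estimates are the least squares values for this particular data set.

```python
def residuals(b, xs, ys):
    """e_i = y_i - (b0 + b1*x_i1 + ... + bk*x_ik) for each observation."""
    return [
        y - (b[0] + sum(bj * xj for bj, xj in zip(b[1:], row)))
        for row, y in zip(xs, ys)
    ]


# Made-up data with k = 2 independent variables and its least squares estimates.
xs = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
ys = [1.0, 3.0, 4.0, 8.0]
b = [4.0, 1.5, 2.0]

e = residuals(b, xs, ys)
mean_residual = sum(e) / len(e)
```

In practice the residuals are plotted against the fitted values and each independent variable, and checked with a normal plot, to look for violations of the assumptions.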

LO15-10: Use residual analysis to check the assumptions of multiple regression.


15.12 Logistic Regression

Logistic regression and least squares regression are very similar
◦ Both produce prediction equations

The y variable is what makes logistic regression different
◦ With least squares regression, the y variable is a quantitative variable
◦ With logistic regression, it is usually a dummy 0/1 variable
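
A sketch of how a fitted logistic model with one independent variable turns into an estimated probability and an odds ratio. It assumes the standard logistic form p = e^(b0 + b1x) / (1 + e^(b0 + b1x)); the coefficients are hypothetical, not estimates from any data set in the chapter.

```python
import math


def logistic_probability(x, b0, b1):
    """Estimated probability that y = 1 at x, under the assumed logistic form."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))  # same as e^z / (1 + e^z)


def odds_ratio(b1):
    """Factor by which the odds of y = 1 change when x increases by one unit."""
    return math.exp(b1)


# Hypothetical coefficients b0 = -3 and b1 = 0.5.
p = logistic_probability(6.0, b0=-3.0, b1=0.5)  # z = 0, so the odds are even
ratio = odds_ratio(0.5)
```

Unlike a least squares prediction, the estimated probability always stays between 0 and 1.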

LO15-11: Use a logistic model to estimate probabilities and odds ratios.
