
Chapter 13: Constructing a Multiple Regression Model
Hildebrand, Ott and Gray
Basic Statistical Ideas for Managers, Second Edition


Learning Objectives for Ch. 13

• This chapter presents a four-step process for building a multiple linear regression model:

• STEP ONE:
  • Initial Selection of Possible Predictor Variables
  • Incorporating Qualitative Independent Variables by Using Dummy or Indicator Variables
  • Incorporating Lagged Predictor Variables when there is Time-Series Data


Learning Objectives for Ch. 13

• STEP TWO:
  • Addressing Nonlinearity and Interaction Among the Variables
• STEP THREE:
  • Choosing Predictors Using Stepwise and Other Methods
• STEP FOUR:
  • Checking the Assumptions of Linearity, Heteroscedasticity, Normality, and Independence by Doing a Residual Analysis
  • Validating the Model


Section 13.1: Selecting Possible Independent Variables (Step 1)


13.1 Selecting Possible Independent Variables (Step 1)

• The basic purpose of a multiple regression model is to predict a response variable, Y, using two or more predictor or independent variables, xj, j = 1, 2, …, k.

• The objective is to produce a reliable and accurate estimate or prediction of Y.


13.1 Selecting Possible Independent Variables (Step 1)

• This will be affected by which independent variables are chosen.

• The overarching principle is parsimony: build the simplest model possible consistent with producing a good estimate of Y.
  • This will reduce complexity and simplify interpretation.
  • It will also save data collection costs.


13.1 Selecting Possible Independent Variables (Step 1)

• There is no substitute for a thorough understanding of the field in selecting good independent variables.

• In particular, any underlying theory could be very useful in identifying potential independent variables.

• A challenge, however, will be collinearity, as some predictors may be closely linked with others.


13.1 Selecting Possible Independent Variables (Step 1)

• Collinearity exists when the independent variables are correlated among themselves.

• This makes the interpretation of a partial slope ("the change in Y per unit change in x, holding all other x's constant") impossible.

• If several x's are correlated, then we cannot "hold some constant" while we increase another.


13.1 Selecting Possible Independent Variables (Step 1)

• Suppose candidates for the predictor variables have been determined.

• An ad-hoc assessment of their suitability as predictors, and of the extent of any collinearity, is found by:
  • a correlation matrix of all the variables;
  • a matrix plot for every pair of variables.

• These concepts will be explained in the context of Example 13.14.


13.1 Selecting Possible Independent Variables (Step 1)

Example 13.14: Data are collected for 20 independent pharmacies in an attempt to predict prescription volume (sales/month).

• The independent variables are:
  • total floor space (FLOOR_SP);
  • percentage of floor space allocated to the prescription department (PRESC_RX);
  • number of available parking spaces (PARKING);
  • whether or not the pharmacy is located in a shopping center (SHOPCNTR); and,
  • per-capita income of the surrounding community (INCOME).


13.1 Selecting Possible Independent Variables (Step 1)

• The correlation matrix for Example 13.14 follows:

Correlations: VOLUME, FLOOR_SP, PRESC_RX, PARKING, SHOPCNTR, INCOME

           VOLUME  FLOOR_SP  PRESC_RX  PARKING  SHOPCNTR
FLOOR_SP    0.183
            0.440
PRESC_RX   -0.663    -0.751
            0.001     0.000
PARKING    -0.069     0.504    -0.328
            0.772     0.023     0.158
SHOPCNTR   -0.203     0.710    -0.341    0.482
            0.392     0.000     0.141    0.031
INCOME      0.385     0.863    -0.845    0.393     0.645
            0.094     0.000     0.000    0.087     0.002

Cell Contents: Pearson correlation
               P-Value


13.1 Selecting Possible Independent Variables (Step 1)

• A correlation matrix lists the correlation of every possible pair of variables.

• Because the matrix is symmetric, only the lower left part of the matrix is displayed.

• The upper right part of the matrix is a mirror image of the lower left.

• Because a variable is perfectly correlated with itself, these values are not shown.


13.1 Selecting Possible Independent Variables (Step 1)

• The values in the matrix are the correlations of:
  • the variable indicated in the row, and
  • the variable indicated in the column.

• Example 13.14: The correlation between the predictors "Parking" and "Presc_Rx" is -0.328. This indicates that as the percentage of floor space allocated to the prescription department increases, the number of available parking spaces tends to decrease.


13.1 Selecting Possible Independent Variables (Step 1)

• Low correlation (close to 0) between a pair of x’s indicates little to no collinearity.

• If two or more predictors are highly correlated, then we might consider using just one of them.

• Example 13.14: Two concerns here are the high correlations between:
  • "Income" and "Floor Space" [.863], and
  • "Income" and "Presc_Rx" [-.845].

• The predictor that has the highest correlation with Y (“Volume”) is “Presc_Rx.”


13.1 Selecting Possible Independent Variables (Step 1)

• These findings tell us that two pairs of predictor variables are highly correlated: a potential collinearity problem.

• We may want to reconsider this set of predictors.


13.1 Selecting Possible Independent Variables (Step 1)

• The Matrix plot is the visual equivalent of the correlation matrix.

• For each pair of variables, a scatterplot is generated.

• The plots are scanned visually:
  • A distinct linear pattern indicates a correlated pair of variables.
  • A scattering without any obvious pattern indicates a pair of variables with little or no correlation.


13.1 Selecting Possible Independent Variables (Step 1)

• The Matrix Plot for Example 13.14 follows:

[Matrix plot for Example 13.14: a scatterplot for every pair of VOLUME, FLOOR_SP, PRESC_RX, PARKING, SHOPCNTR, and INCOME]


13.1 Selecting Possible Independent Variables (Step 1)

Example 13.14:
• That the correlation between "Income" and "Parking" is only .393, compared to the correlation of .863 between "Income" and "Floor Space," is evident in their scatterplots.

• The scatterplot between “Income” and “Parking” indicates a nonlinear relation between this pair of predictors.

• That the predictor “Presc_Rx” has the highest correlation of any of the predictors with Y is evident in the first row of the Matrix Plot.

• The unique appearance of the scatterplot between “Volume” and “ShopCntr” occurs because a pharmacy is either in a shopping center (1) or it is not (0).


13.1 Selecting Possible Independent Variables (Step 1)

• If the correlation matrix and Matrix plot indicate some of the independent variables are correlated with the others, we need to reconsider these variables.

• One possibility is to combine some of them into a single predictor.

• The correlation matrix and matrix plot may not show the full extent of a collinearity problem because only pairs of predictors are considered. That is why the VIF should be used (Chapter 12).
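These screening steps are easy to reproduce outside Minitab. A minimal Python sketch using pandas and statsmodels follows; the file name "pharmacy.csv" and its column layout are assumptions made for illustration, not part of the original example.

# Sketch: correlation matrix, matrix plot, and VIFs for data like
# Example 13.14's. The file "pharmacy.csv" is hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("pharmacy.csv")   # columns assumed: VOLUME, FLOOR_SP,
                                   # PRESC_RX, PARKING, SHOPCNTR, INCOME

# Correlation matrix of every pair of variables
print(df.corr().round(3))

# Matrix plot: a scatterplot for each pair of variables
pd.plotting.scatter_matrix(df, figsize=(8, 8))

# VIFs catch collinearity that pairwise correlations can miss (Chapter 12)
X = sm.add_constant(df.drop(columns="VOLUME"))
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X.values, i), 2))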


Section 13.2: Using Qualitative Predictors: Dummy Variables (Step 1)


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• Up to now, we have exclusively used quantitative variables in regression. In Example 13.14, total floor space is a quantitative variable.

• Another type of independent variable is a qualitative variable. In Example 13.14, each of the 20 pharmacies either is, or is not, located in a shopping center.

• A dummy or indicator variable is used to model this.

Page 8: Chapter 13: Constructing a Multiple Regression Model...Chapter 13: Constructing a Multiple Regression Model Hildebrand, Ott and Gray Basic Statistical Ideas for Managers Second Edition


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• When there are only two categories, the dummy variable represents the presence or absence of a particular category.

• In Example 13.14, the variable takes on the value 1 if the pharmacy is located in a shopping center, and the value 0 if the pharmacy is not located in a shopping center.

• In Example 13.14, suppose “FLOOR_SP” and “SHOPCNTR” are the only predictors.


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

The Minitab output follows:

Regression Analysis: VOLUME versus FLOOR_SP, SHOPCNTR

The regression equation is
VOLUME = 7.03 + 0.00352 FLOOR_SP - 8.26 SHOPCNTR

Predictor    Coef       SE Coef    T      P      VIF
Constant     7.033      5.350      1.31   0.206
FLOOR_SP     0.003517   0.001585   2.22   0.040  2.0
SHOPCNTR    -8.256      3.656     -2.26   0.037  2.0

S = 5.72922   R-Sq = 25.7%   R-Sq(adj) = 16.9%


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• For a pharmacy in a shopping center,

VOLUME = 7.03 + 0.00352 FLOOR_SP - 8.26
       = (7.03 - 8.26) + 0.00352 FLOOR_SP
       = -1.23 + 0.00352 FLOOR_SP

• For a pharmacy not in a shopping center,

VOLUME = 7.03 + 0.00352 FLOOR_SP

• -8.26 is the estimated difference in sales volume between a pharmacy located in a shopping center (SHOPCNTR = 1) and one not located in a shopping center (SHOPCNTR = 0) for any specified value of Floor Space.
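The same fit is easy to reproduce with statsmodels' formula interface. The sketch below enters only the five worksheet rows shown on a later slide so that it runs as-is; the full 20-pharmacy data set would be needed to reproduce the Minitab coefficients (7.03, 0.00352, -8.26).

# Sketch: regression of VOLUME on FLOOR_SP and the SHOPCNTR dummy.
# Only five pharmacies are entered here, so the numbers will differ
# from the full-data Minitab output above.
import pandas as pd
import statsmodels.formula.api as smf

df5 = pd.DataFrame({
    "VOLUME":   [22, 19, 24, 28, 18],
    "FLOOR_SP": [4900, 5800, 5000, 4400, 3850],
    "SHOPCNTR": [1, 1, 1, 0, 0],
})

fit = smf.ols("VOLUME ~ FLOOR_SP + SHOPCNTR", data=df5).fit()
print(fit.params)   # SHOPCNTR's coefficient is the estimated shopping-center
                    # effect at any fixed floor space, as described above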


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• When there are only two categories, it would be wrong to use a dummy variable for each category.

• In Example 13.14, it would be wrong to have two dummy variables:
  Yes = 1, if a pharmacy is in a shopping center; = 0, otherwise.
  No  = 1, if a pharmacy is not in a shopping center; = 0, otherwise.


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• The worksheet for the first 5 pharmacies follows:

Pharmacy   VOLUME   FLOOR_SP   Yes   No
1          22       4900       1     0
2          19       5800       1     0
3          24       5000       1     0
4          28       4400       0     1
5          18       3850       0     1

• The dummy variables “Yes” and “No” add to one for each pharmacy.

Since these two variables have a correlation of -1, there is severe multicollinearity.


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

The Minitab output follows:

Regression Analysis: VOLUME versus FLOOR_SP, Yes, No

* No is highly correlated with other X variables
* No has been removed from the equation.

The regression equation is

VOLUME = 7.03 + 0.00352 FLOOR_SP - 8.26 Yes

• Minitab recognizes this multicollinearity and removes the “No” variable.


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• Although a qualitative variable can be multi-level, interpretation can be problematic.

• Suppose we want to predict the GPA of an undergraduate student where the qualitative variable is class (freshman, sophomore, junior, senior).

• One possibility is to use an independent variable coded as:

1 = freshman    2 = sophomore    3 = junior    4 = senior

• The codes used were arbitrary.


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• If Class is the only predictor of GPA, then the population model is:

E(GPA) = β0 + β1x

• For freshmen, E(GPA) = β0 + β1

For sophomores, E(GPA) = β0 + 2β1

For juniors, E(GPA) = β0 + 3β1

For seniors, E(GPA) = β0 + 4β1


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• This implies that the change in E(Y) is the same when going from freshmen to sophomores as when going from sophomores to juniors.

• It is better to create three dummy variables for three of the classes, e.g.:

x1 = 1, if freshman; = 0, if not
x2 = 1, if sophomore; = 0, if not
x3 = 1, if junior; = 0, if not


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• Then each variable is either 1 or 0 depending on whether the student is or is not in that class.

• For example, a freshman would have x1 = 1 and x2 = x3 = 0.

• Any student with a zero for all three variables would be in the fourth group.

• Specifically, a student with x1 = x2 = x3 = 0 would be a senior.


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• With three dummy variables, the population model is:

  E(GPA) = β0 + β1x1 + β2x2 + β3x3

• The model for each class is:
  Freshmen:    E(GPA) = β0 + β1
  Sophomores:  E(GPA) = β0 + β2
  Juniors:     E(GPA) = β0 + β3
  Seniors:     E(GPA) = β0

• β2 − β1 is the differential effect between Sophomores and Freshmen.

• β1 is the differential effect between Freshmen and Seniors.
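As a sketch of this coding in Python: pd.get_dummies builds the 0/1 columns, and leaving the senior column out makes seniors the baseline, exactly as on the slide. The GPA values are invented purely to make the example runnable.

# Sketch: three class dummies with seniors as the baseline group.
import pandas as pd
import statsmodels.formula.api as smf

students = pd.DataFrame({
    "gpa":  [3.1, 2.8, 3.4, 3.0, 3.6, 2.9, 3.3, 3.5],   # made-up values
    "year": ["freshman", "sophomore", "junior", "senior"] * 2,
})

dummies = pd.get_dummies(students["year"]).astype(int)
students["x1"] = dummies["freshman"]
students["x2"] = dummies["sophomore"]
students["x3"] = dummies["junior"]   # seniors: x1 = x2 = x3 = 0

fit = smf.ols("gpa ~ x1 + x2 + x3", data=students).fit()
print(fit.params)   # Intercept = senior mean; x1..x3 = differentials vs. seniors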


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• Testing whether or not two population means are equal (Section 9.1) can also be done using regression analysis with a dummy variable.

Exercise 9.5: Company officials are concerned about the length of time a particular drug retains its potency. A random sample (sample 1) of 10 bottles of the product is drawn from current production and analyzed for potency. A second sample (sample 2) is obtained, stored for one year, and then analyzed. The readings obtained are:

Sample 1:  10.6  10.0  10.2  10.7  10.6   9.8  10.8  10.3  10.5  10.2
Sample 2:   9.9   9.8   9.6   9.5   9.7  10.1  10.2  10.1   9.6   9.8

• Management is interested in determining if mean potency has decreased after one year.


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• Assumption: Independent, random samples from two normal populations with parameters (µ1, σ1) and (µ2, σ2), respectively.

• Furthermore, assume σ1² and σ2² are unknown, but equal.

• For Exercise 9.5:
  H0: µ1 = µ2, or µ1 − µ2 = 0
  Ha: µ2 < µ1, or µ2 − µ1 < 0


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• Population Regression Model for Exercise 9.5:

E(Y) = β0 + β1x

where x = 1, if Sample 1
      = 0, if Sample 2

• The null and research hypotheses become:

H0: β1 = 0 vs. Ha: β1 > 0

• The Minitab output follows:


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

Regression Analysis: POTENCY versus x1

The regression equation is
POTENCY = 9.83 + 0.540 x1

Predictor   Coef      SE Coef   T        P
Constant    9.83000   0.09012   109.07   0.000
x1          0.5400    0.1275    4.24     0.000

S = 0.284995   R-Sq = 49.9%   R-Sq(adj) = 47.1%

Unusual Observations
Obs   x1     POTENCY   Fit       SE Fit   Residual   St Resid
5     1.00   9.8000    10.3700   0.0901   -0.5700    -2.11R

R denotes an observation with a large standardized residual.

• Conclusion: Since p-value = .000/2 < .05, reject H0: β1 = 0
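The equivalence can be checked directly. The sketch below runs both the pooled two-sample t test of Section 9.1 and the dummy-variable regression, using the potency readings listed earlier; both give t = 4.24.

# Sketch: Exercise 9.5 two ways — pooled t test and dummy regression.
import numpy as np
import statsmodels.api as sm
from scipy import stats

sample1 = np.array([10.6, 10.0, 10.2, 10.7, 10.6, 9.8, 10.8, 10.3, 10.5, 10.2])
sample2 = np.array([9.9, 9.8, 9.6, 9.5, 9.7, 10.1, 10.2, 10.1, 9.6, 9.8])

# Pooled t test; equal_var=True matches the equal-variance assumption above
t, p_two_sided = stats.ttest_ind(sample1, sample2, equal_var=True)
print(t, p_two_sided / 2)          # halve the p-value for the one-sided test

# The same test as a regression on the dummy x (1 = current, 0 = stored)
y = np.concatenate([sample1, sample2])
x = np.array([1] * 10 + [0] * 10)
fit = sm.OLS(y, sm.add_constant(x)).fit()
print(fit.params)                  # 9.83 and 0.540, as in the Minitab output
print(fit.tvalues)                 # the slope's t statistic is the same 4.24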


13.2 Using Qualitative Predictors: Dummy Variables (Step 1)

• The value of the T-statistic (4.24) is the same as in Section 9.1.
• The NPP shows the requirement that the standardized residuals be normally distributed has been met.

[Normal probability plot of the standardized residuals from the regression approach to Exercise 9.5: Mean ≈ 0, StDev = 1.026, N = 20, AD = 0.291, P-Value = 0.572]


Section 13.3: Lagged Predictor Variables (Step 1)


13.3 Lagged Predictor Variables (Step 1)

• For time-series data, a regression model is frequently used to make forecasts.

• For example, a regression model to forecast monthly paint sales of a home-supply store chain for a region could be:

  Sales_t = β̂0 + β̂1(AdvExp_t) + β̂2(Income_t),

  where Income denotes Median Household Income of the region.

• There are two difficulties with this model.


13.3 Lagged Predictor Variables (Step 1)

• First, future estimates of Sales depend on future estimates of Adv Exp and Income for that time period.

• Secondly, it is likely that income for earlier months, rather than current income, will have more of an effect on current sales.


13.3 Lagged Predictor Variables (Step 1)

• A major question is the number of lags to use.
• Fight the temptation to use many lags:
  • The lagged variables are likely to be severely correlated.
  • Using multiple t-tests increases the overall probability of Type I error.
  • Each lag results in the loss of one observation.

• “Knowledge of the basic process involved is almost always useful in choosing lags.” (Hildebrand, Ott and Gray)

• To illustrate these concepts, consider Exercise 13.42.


13.3 Lagged Predictor Variables (Step 1)

Exercise 13.42: An auto-supply store had 60 months of data on variables that were thought to be relevant to sales. The data include monthly sales in thousands of dollars (SALES), average daily low temperature in degrees Fahrenheit (LOWTEMP), advertising expenditure for the month in thousands of dollars (ADEXP), used-car sales in the previous month (USEDCAR), and month number (MONTH).


13.3 Lagged Predictor Variables (Step 1)

• The variable USEDCAR is already in lagged form.

• Two new lagged variables were created:

LAG(USCA) is actually a two-month lagged variable for used-car sales.

LAG(ADEXP) is a one-month lagged variable for advertising expenditures.
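In pandas, each lagged predictor is one .shift() call. A sketch, assuming the 60 months sit in a hypothetical file "autosupply.csv" with the column names above:

# Sketch: building the lagged predictors of Exercise 13.42.
import pandas as pd

store = pd.read_csv("autosupply.csv")   # hypothetical file name

# .shift(1) moves a series down one month, so row t sees month t-1;
# shifting USEDCAR (already a one-month lag) yields a two-month lag.
store["Lag(ADEXP)"] = store["ADEXP"].shift(1)
store["Lag(USCA)"] = store["USEDCAR"].shift(1)

# The first row now has missing lags, which is why Minitab reports
# "59 cases used, 1 cases contain missing values."
print(store.head(3))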


13.3 Lagged Predictor Variables (Step 1)

• It was decided to use the following predictors:

• LOWTEMP: Drastic weather conditions for the current month immediately impact sales.

• LAG(ADEXP): Most sales at an auto-supply store are need-based rather than impulse-based. Meaningful advertising is retained by a potential customer for a future purchase.

• LAG(USCA): Since a warranty covers the repairs for the first 30 days, the impact of used-car purchasers won’t be felt for 2 months.

• The Minitab output follows:


13.3 Lagged Predictor Variables (Step 1)

Regression Analysis: SALES versus LOWTEMP, Lag(ADEXP), Lag(USCA)

The regression equation is
SALES = 352 - 4.00 LOWTEMP + 5.06 Lag(ADEXP) + 0.0154 Lag(USCA)

59 cases used, 1 cases contain missing values

Predictor    Coef       SE Coef    T      P      VIF
Constant     352.3      117.0      3.01   0.004
LOWTEMP     -4.0023     0.6858    -5.84   0.000  2.8
Lag(ADEXP)   5.0649     0.6640     7.63   0.000  1.8
Lag(USCA)    0.015412   0.007440   2.07   0.043  2.1

S = 54.9720   R-Sq = 83.2%   R-Sq(adj) = 82.3%
Durbin-Watson statistic = 0.410629


13.3 Lagged Predictor Variables (Step 1)

• The good news:
  • Each predictor adds statistically detectable predictive value, given the others.
  • There is no multicollinearity problem.
• The bad news:
  • There is a serious autocorrelation problem, since the Durbin-Watson statistic is 0.411 (Section 13.7).


Section 13.4: Nonlinear Regression Models (Step 2)


13.4 Nonlinear Regression Models

• For 2 predictors, the general form of the first-order fitted model is:

  Ŷ = β̂0 + β̂1x1 + β̂2x2

• The residual, Y − Ŷ, removes the linear effect of x1 and x2.

• If the fitted model should have included an x1² term, a plot of the SRi vs. x1 will show curvature.

• An example illustrating this is in Section 13.6.

• Instead of including an x1² term, another approach is to transform one or more of the existing variables.


13.4 Nonlinear Regression Models

• There is a difference between constant percentage growth and constant additive growth.

• Suppose initial sales of a company are $100 million. The difference between a constant percentage growth of 8% per year versus a constant additive growth of $8 million per year is shown in the table below.

Year          0      1      2      3      4
8% growth     100.0  108.0  116.6  126.0  136.0
$8M growth    100.0  108.0  116.0  124.0  132.0


13.4 Nonlinear Regression Models

• In time series data, the response variable Y frequently changes at an increasing rate.

• For example, if a present amount (P) is invested at a nominal annual interest rate (r) for t years, then the future amount after t years (Ft) is:

Ft = P e^(rt),

under continuous compounding.


13.4 Nonlinear Regression Models

• This nonlinear relation becomes linear after a logarithmic transformation:

ln(Ft) = ln(P) + rt

or

Y = β0 + β1t


13.4 Nonlinear Regression Models

• The general form of the original model is:

Y = β0 e^(β1 x)

• A logarithmic transformation yields

  ln(Y) = ln(β0) + β1x, or
  Y* = β0* + β1x

• For this model, β1 is the percent change in Y when x changes by 1 unit.
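A quick sketch of the transformation: regressing ln(Y) on t for the 8%-growth series from the earlier table recovers the growth rate as the slope, since ln(1.08) ≈ 0.077.

# Sketch: estimating a constant-percentage growth rate on the log scale.
import numpy as np
import statsmodels.api as sm

t = np.arange(0, 5)
sales = 100.0 * 1.08 ** t           # 100.0, 108.0, 116.6, 126.0, 136.0

fit = sm.OLS(np.log(sales), sm.add_constant(t)).fit()
b0, b1 = fit.params
print(np.exp(b0), b1)               # ≈ 100 and ≈ 0.077 = ln(1.08)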


13.4 Nonlinear Regression Models

• “A logarithmic transformation is only one possibility. It is, however, a particularly useful one, because logarithms convert a multiplicative relation to an additive one. A natural logarithm (base e = 2.7182818), often denoted ln(y), is especially useful, because the results are interpretable as percentage changes.

• For example, if a prediction of high school teachers' salaries yields predicted ln(salary) = constant + .042 (years' experience) + other terms, then an additional year's experience, other terms held constant, predicts a 4.2 percent increase in salary. This guideline isn't perfect, but very close for values less than 0.2 or so." (Hildebrand, Ott and Gray)


13.4 Nonlinear Regression Models

• Another example of a nonlinear model is the Cobb-Douglas production function:

Y = c I^α k^β,

where Y is production, I is labor input, k is capital input,and α and β are unknown parameters.

• After a logarithmic transformation, the nonlinear relation becomes linear:

ln(Y) = ln(c) + α ln(I) + β ln(k),

or

Y = β0 + β1 x1 + β2 x2,


13.4 Nonlinear Regression Models

• The general form of the original model is:

Y = β0 (x1^β1)(x2^β2)(x3^β3) … (xk^βk).

• A logarithmic transformation yields:

  ln(Y) = ln(β0) + β1 ln(x1) + … + βk ln(xk), or
  Y* = β0* + β1x1* + … + βkxk*.


13.4 Nonlinear Regression Models

• When there is only one predictor, β1 is the percentage change in Y per percentage change in x.

• When x = price and Y = demand, β1 is the price elasticity of demand.
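A sketch of the one-predictor log-log case: demand data are simulated here from Y = 500·price^(-1.2), so the fitted slope returns the assumed elasticity of -1.2.

# Sketch: price elasticity of demand from a log-log regression.
import numpy as np
import statsmodels.api as sm

price = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
demand = 500.0 * price ** -1.2      # simulated, noise-free demand curve

fit = sm.OLS(np.log(demand), sm.add_constant(np.log(price))).fit()
print(fit.params)                   # ln(500) ≈ 6.21 and the elasticity -1.2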


13.4 Nonlinear Regression Models

• In both models, the error term (ε) was omitted.

• For the first transformed model to have an additive error term, the original model needs to be:

Y = β0 e^(β1 x) e^ε.

• For the second transformed model to have an additive error term, the original model needs to be:

Y = β0 x1^β1 e^ε.


13.4 Nonlinear Regression Models

• In converting a nonlinear model to a linear one, one seldom thinks about the format of ε in the original model.

• It is important to note that the error term in the transformed model must satisfy all of the usual conditions.

• If the original model was Y = β0 x1^β1 e^ε, the transformed model is:

  ln(Y) = ln(β0) + β1 ln(x1) + ε.

• In this case, ε must satisfy all of the usual conditions.


13.4 Nonlinear Regression Models

• Another type of nonlinearity arises when the model includes an interaction term: the product of 2 or more predictors.

• For a first-order model with k = 2, the general form of the fitted model is:

  Ŷ = β̂0 + β̂1x1 + β̂2x2.

  The change in Ŷ per unit change in x1 is β̂1, when x2 is held constant.


13.4 Nonlinear Regression Models

• For a model with an interaction term:

  Ŷ = β̂0 + β̂1x1 + β̂2x2 + β̂12x1x2.

  The change in Ŷ per unit change in x1 is β̂1 + β̂12x2, which depends on the level of x2.

• An interaction term, with a dummy variable as one of the predictors, adds considerable flexibility in model building.


13.4 Nonlinear Regression Models

Example 13.14: Data are collected for 20 independent pharmacies in an attempt to predict prescription volume (sales/month).

• The independent variables used here are:
  • percentage of floor space allocated to the prescription department (PRESC_RX); and
  • whether or not the pharmacy is located in a shopping center (SHOPCNTR).

• From the Minitab output that follows, we see each of the predictors is statistically significant.


13.4 Nonlinear Regression Models

Regression Analysis: VOLUME versus PRESC_RX, SHOPCNTR

The regression equation is
VOLUME = 30.9 - 0.400 PRESC_RX - 5.97 SHOPCNTR

Predictor    Coef       SE Coef   T       P      VIF
Constant     30.869     2.618     11.79   0.000
PRESC_RX    -0.40046    0.07412   -5.40   0.000  1.1
SHOPCNTR    -5.970      1.887     -3.16   0.006  1.1

S = 3.94742   R-Sq = 64.7%   R-Sq(adj) = 60.6%

Unusual Observations
Obs   PRESC_RX   VOLUME   Fit      SE Fit   Residual   St Resid
5     13.0       18.000   25.663   1.813    -7.663     -2.19R
18    42.0        6.000   14.050   1.424    -8.050     -2.19R
R denotes an observation with a large standardized residual.


13.4 Nonlinear Regression Models

• A question of interest is the possible need for an interaction term.

• For a model with an interaction term:

  Ŷ = β̂0 + β̂1x1 + β̂2x2 + β̂12x1x2,

  where Y = VOLUME, x1 = PRESC_RX, and x2 = SHOPCNTR.


13.4 Nonlinear Regression Models

• For a pharmacy in a shopping center:

  Ŷ = (β̂0 + β̂2) + (β̂1 + β̂12)x1.

• For a pharmacy not in a shopping center:

  Ŷ = β̂0 + β̂1x1.

• The interaction term affects the intercept and the slope.

• A visual assessment of the need for interaction is gleaned from a scatterplot with a regression line for each group, which follows.


13.4 Nonlinear Regression Models

[Scatterplot of VOLUME vs. PRESC_RX with separate fitted lines for SHOPCNTR = 0 and SHOPCNTR = 1]


13.4 Nonlinear Regression Models

• Because the separate lines are nearly parallel, there is no evidence of interaction.

• This is confirmed by running a regression model with an interaction term and examining its p-value.

For such a model, the p-value = 0.678 (shown below).


13.4 Nonlinear Regression Models

Regression Analysis: VOLUME versus PRESC_RX, SHOPCNTR, (PRE)(SHP)

The regression equation is
VOLUME = 29.8 - 0.368 PRESC_RX - 4.22 SHOPCNTR - 0.064 (PRE)(SHP)

Predictor     Coef      SE Coef   T       P      VIF
Constant      29.846    3.613     8.26    0.000
PRESC_RX     -0.3679    0.1081   -3.40    0.004  2.3
SHOPCNTR     -4.223     4.560    -0.93    0.368  6.3
(PRE)(SHP)   -0.0643    0.1520   -0.42    0.678  5.6

S = 4.04635   R-Sq = 65.1%   R-Sq(adj) = 58.6%
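The same interaction test can be sketched in Python: the formula term PRESC_RX:SHOPCNTR plays the role of the (PRE)(SHP) product column, and "pharmacy.csv" is again a hypothetical file holding the 20 pharmacies.

# Sketch: testing the interaction term of Example 13.14.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pharmacy.csv")    # hypothetical file name

fit = smf.ols("VOLUME ~ PRESC_RX + SHOPCNTR + PRESC_RX:SHOPCNTR", data=df).fit()
print(fit.pvalues)                  # interaction p-value ≈ 0.678 on the full data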


Section 13.5: Choosing Among Regression Models (Step 3)


13.5 Choosing Among Regression Models

• Concept: Use some objective criterion to select the independent or predictor variables to be in the model.

• Criteria to be considered:
  • Stepwise Regression (forward and backward)
  • Forward Selection
  • Backward Elimination
  • Best Subsets


13.5 Choosing Among Regression Models

• Example 13.14:
  • Assume that data are collected for 20 independent pharmacies in an attempt to predict prescription volume (sales/month).
  • The independent variables are:
    • total floor space (FLOOR_SP);
    • percentage of floor space allocated to the prescription department (PRESC_RX);
    • number of available parking spaces (PARKING);
    • whether or not the pharmacy is located in a shopping center (SHOPCNTR); and,
    • per-capita income of the surrounding community (INCOME).


13.5 Choosing Among Regression Models

• Example 13.14 (continued):
  • Portions of the data are shown below.
  • For 5 predictors, there are 2^5 = 32 models to consider.
  • Consider automatic screening procedures.

VOLUME   FLOOR_SP   PRESC_RX   PARKING   SHOPCNTR   INCOME
22       4900        9         40        1          18
19       5800       10         50        1          20
24       5000       11         55        1          17
28       4400       12         30        0          19
18       3850       13         42        0          10
…        …          …          …         …          …
 7       2900       45         30        1           9
17       2400       46         16        0           3


13.5 Choosing Among Regression Models

• Stepwise (forward and backward) Regression:
  • Let k denote the number of predictors under consideration, including interaction and quadratic terms.
  • Select an "alpha to enter" and an "alpha to remove":
    • "Alpha to enter" – the probability of a Type I error for entering a new predictor into a regression model.
    • "Alpha to remove" – the probability of a Type I error for retaining a predictor that was previously entered into the regression model.
  • In Minitab, the default value for both alphas is 0.15.


13.5 Choosing Among Regression Models

• Stepwise Details:
  • For all of the k possible simple regressions, select the predictor with the largest |t| statistic.
  • If the p-value for that predictor is greater than "alpha to enter," stop and choose a new set of predictors.
  • If the p-value for the largest |t| statistic is less than "alpha to enter," choose that predictor as the first to enter the model.
  • Let x[1] denote the first predictor to enter.


13.5 Choosing Among Regression Models

• Stepwise Details (continued)

• Now consider all possible (k − 1) two-predictor multiple regressions with x[1] as one of the two predictors.

• The second predictor to enter, denoted by x[2], is the predictor with the largest |t| statistic, provided its p-value is less than "alpha to enter." If not, stop with the simple regression obtained previously using x[1].


13.5 Choosing Among Regression Models

• Stepwise Details (continued):
  • If x[2] enters, consider the two-predictor model with x[1] and x[2].
  • Look at the t-test for x[1]. If its p-value is less than "alpha to remove," retain x[1].
  • If not, remove x[1] and use the simple regression with x[2] as the predictor for the next step.
  • The procedure continues until no new variables can be entered; a rough sketch of the idea follows.
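As a rough illustration only, here is a bare-bones forward-entry loop in Python; a full stepwise routine would also re-test previously entered predictors against "alpha to remove."

# Sketch: forward entry by smallest t-test p-value, as described above.
import pandas as pd
import statsmodels.api as sm

def forward_select(data, response, alpha_to_enter=0.15):
    remaining = [c for c in data.columns if c != response]
    chosen = []
    while remaining:
        # p-value of each candidate, given the predictors already entered
        pvals = {}
        for cand in remaining:
            X = sm.add_constant(data[chosen + [cand]])
            pvals[cand] = sm.OLS(data[response], X).fit().pvalues[cand]
        best = min(pvals, key=pvals.get)
        if pvals[best] > alpha_to_enter:
            break                    # no candidate clears "alpha to enter"
        chosen.append(best)
        remaining.remove(best)
    return chosen

df = pd.read_csv("pharmacy.csv")     # hypothetical file, as in earlier sketches
print(forward_select(df, "VOLUME"))  # expected: ['PRESC_RX', 'FLOOR_SP']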

• Consider Example 13.14. The Minitab output follows.


13.5 Choosing Among Regression Models

Stepwise Regression: VOLUME versus FLOOR_SP, PRESC_RX, ...

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.15
Response is VOLUME on 5 predictors, with N = 20

Step             1        2
Constant     25.98    48.29

PRESC_RX    -0.321   -0.582
T-Value      -3.76    -5.67
P-Value      0.001    0.000

FLOOR_SP            -0.0038
T-Value               -3.39
P-Value               0.003

S             4.84     3.84
R-Sq         43.93    66.57
R-Sq(adj)    40.82    62.63
Mallows C-p   10.2      1.6


13.5 Choosing Among Regression Models

• The first variable to enter is the percentage of floor space allocated to the prescription department (PRESC_RX). At the end of the first step, the fitted model is:

  VOLUME = 25.98 - 0.321 PRESC_RX

• The second variable to enter is total floor space (FLOOR_SP). The previously entered variable, PRESC_RX, remains. At the end of the second step, the fitted model is:

  VOLUME = 48.29 - 0.582 PRESC_RX - 0.0038 FLOOR_SP

• The stepwise procedure stops after two steps.

• The same results are obtained if “alpha to enter” and “alpha to remove” are both set at .10 or .05.


13.5 Choosing Among Regression Models

• Forward Selection:
  • At each step, enter the predictor with the largest |t| statistic, provided its p-value is less than "alpha to enter."
  • If not, stop with the previously obtained regression model.
  • In Minitab, the default value for "alpha to enter" is 0.25.
  • Consider Example 13.14 with "alpha to enter" = 0.10. The Minitab output follows.
  • For this example, the final results are the same for both procedures.


13.5 Choosing Among Regression Models

Stepwise Regression: VOLUME versus FLOOR_SP, PRESC_RX, ...

Forward selection.  Alpha-to-Enter: 0.1
Response is VOLUME on 5 predictors, with N = 20

Step             1        2
Constant     25.98    48.29

PRESC_RX    -0.321   -0.582
T-Value      -3.76    -5.67
P-Value      0.001    0.000

FLOOR_SP            -0.0038
T-Value               -3.39
P-Value               0.003

S             4.84     3.84
R-Sq         43.93    66.57
R-Sq(adj)    40.82    62.63
Mallows C-p   10.2      1.6


13.5 Choosing Among Regression Models

• Backward Elimination:
  • Begins with a model that contains all k predictors.
  • Removes them one at a time, without re-entering any.
  • Ends when the |t| statistic for each of the remaining predictors has a p-value less than "alpha to remove."
  • In Minitab, the default value for "alpha to remove" is 0.1.
  • Consider Example 13.14. The Minitab output follows.
  • For this example, all three procedures give the same model.


13.5 Choosing Among Regression Models

Stepwise Regression: VOLUME versus FLOOR_SP, PRESC_RX, ...

Backward elimination.  Alpha-to-Remove: 0.1
Response is VOLUME on 5 predictors, with N = 20

Step             1        2        3        4
Constant     42.09    43.47    42.83    48.29

FLOOR_SP   -0.0024  -0.0023  -0.0025  -0.0038
T-Value      -1.32    -1.34    -1.50    -3.39
P-Value      0.210    0.200    0.152    0.003

PRESC_RX     -0.50    -0.53    -0.53    -0.58
T-Value      -3.05    -4.65    -4.74    -5.67
P-Value      0.009    0.000    0.000    0.000

PARKING     -0.037   -0.040
T-Value      -0.56    -0.63
P-Value      0.582    0.537

SHOPCNTR      -3.1     -2.7     -3.0
T-Value      -0.95    -0.98    -1.14
P-Value      0.356    0.342    0.272

INCOME        0.11
T-Value       0.25
P-Value      0.807

S             4.01     3.88     3.81     3.84
R-Sq         70.01    69.87    69.07    66.57
R-Sq(adj)    59.30    61.84    63.27    62.63
Mallows C-p    6.0      4.1      2.4      1.6


13.5 Choosing Among Regression Models

• Best Subsets or all possible regressions is an alternative approach to model selection.

• For k possible predictors, there are k subsets of models.

• There is the subset of models each with one predictor, the subset with two predictors, …, and finally the subset with all k predictors.

• Consider Example 13.14. The Minitab output follows.


13.5 Choosing Among Regression Models

Best Subsets Regression: VOLUME vs. FLOOR_SP, PRESC_RX, ...

Response is VOLUME

Vars   R-Sq   R-Sq(adj)   Mallows C-p   S        Predictors in model
1      43.9   40.8        10.2          4.8351   PRESC_RX
1      14.8   10.1        23.8          5.9604   INCOME
2      66.6   62.6         1.6          3.8420   FLOOR_SP, PRESC_RX
2      64.7   60.6         2.5          3.9474   PRESC_RX, SHOPCNTR
3      69.1   63.3         2.4          3.8089   FLOOR_SP, PRESC_RX, SHOPCNTR
3      67.9   61.9         3.0          3.8778   (second-best three-predictor model)
4      69.9   61.8         4.1          3.8825   FLOOR_SP, PRESC_RX, PARKING, SHOPCNTR
4      69.3   61.1         4.3          3.9176   (second-best four-predictor model)
5      70.0   59.3         6.0          4.0099   FLOOR_SP, PRESC_RX, PARKING, SHOPCNTR, INCOME


13.5 Choosing Among Regression Models

• Minitab displays only the two best models for each subset, rather than all possible models.

• For each subset, three statistics of model performance are given: R², R²(adj), and Mallows' Cp.

• R² and R²(adj) were discussed in Chapter 12.


13.5 Choosing Among Regression Models

• The Cp criterion:

• For a model with p coefficients [the intercept and (p − 1) partial slopes] corresponding to (p − 1) predictors,

  Cp = [MS(Residual, p coefficients) × (n − p)] / MS(Residual, all coefficients) − (n − 2p)

• If the p-coefficient model has all of the useful predictors, then MS(Residual, p coefficients) ≈ MS(Residual, all coefficients).

• This implies Cp ≈ p for an appropriate model.
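A sketch of the computation, again assuming the hypothetical "pharmacy.csv": for Example 13.14, the two-predictor model's Cp should come out near 1.6, matching the best-subsets output above.

# Sketch: Mallows' Cp from the formula above, for one candidate subset.
import pandas as pd
import statsmodels.api as sm

def mallows_cp(data, response, subset, all_predictors):
    n = len(data)
    full = sm.OLS(data[response], sm.add_constant(data[all_predictors])).fit()
    sub = sm.OLS(data[response], sm.add_constant(data[subset])).fit()
    p = len(subset) + 1              # intercept plus (p - 1) partial slopes
    return sub.mse_resid * (n - p) / full.mse_resid - (n - 2 * p)

df = pd.read_csv("pharmacy.csv")     # hypothetical file name
print(mallows_cp(df, "VOLUME", ["FLOOR_SP", "PRESC_RX"],
                 ["FLOOR_SP", "PRESC_RX", "PARKING", "SHOPCNTR", "INCOME"]))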


13.5 Choosing Among Regression Models

• Consider Example 13.14.

• Using the Cp criterion, which set of predictors should be selected?

• By scanning down the column labeled "Mallows C-p," we see that:
  C-p = 10.2 when p = 2;
  C-p = 1.6 when p = 3;
  C-p = 2.4 when p = 4;
  C-p = 4.1 when p = 5; and
  C-p = 6.0 when p = 6.


13.5 Choosing Among Regression Models

• The subset with p = 3 (or k = 2) has a value of Cp actually less than p.

• The same could be said for the subsets with p = 4 and p = 5.

• The model with p = 3 is chosen because it has the smallest Cp and is more parsimonious.

• For the subset with all k predictors, Cp = p.
• This does not imply that you should use the model with all k predictors.


13.5 Choosing Among Regression Models

• Using the R²(adj) criterion, which set of predictors should be selected?

• For the subset with one predictor, the predictor with the largest R²(adj) is PRESC_RX.

• Minitab also gives the one-predictor model with the next largest R²(adj). That predictor is INCOME.

• For the subset with two predictors, the two predictors with the largest R²(adj) are FLOOR_SP and PRESC_RX.

• The percentage increase in R²(adj) in going from one to two predictors is (62.6 − 40.8)/40.8 = 53.4%, a substantial increase.


13.5 Choosing Among Regression Models

• For the subset with three predictors, the three predictors with the largest R²(adj) are FLOOR_SP, PRESC_RX, and SHOPCNTR.

• The percentage increase in R²(adj) in going from two to three predictors is (63.3 − 62.6)/62.6 = 1.1%.

• Is this negligible increase worth the loss in parsimony?

• The subset with four predictors is not considered since R²(adj) decreases.

• Using the R²(adj) criterion, the subset with two predictors appears to be the most reasonable, and the predictors are FLOOR_SP and PRESC_RX.


13.5 Choosing Among Regression Models

• Hopefully, the procedures will have some predictors in common that can be used as a starting point.

• “In selecting a regression model, a manager should use experience and judgment as well as statistical results. If one model involves reasonable relations and variables, yet does somewhat less well than another, less plausible model on a purely statistical basis, a manager might well choose the first model anyway.” (Hildebrand, Ott and Gray)


Section 13.6: Residual Analysis (Step 4)


13.6 Residual Analysis (Step 4)

• In fitting a regression model, certain assumptions are made regarding the errors (ε) of the true or population model.

• Since the errors are unobservable, we use the residuals (preferably the standardized residuals, SRi).

• If the fitted model does not accurately capture the nature of the data, the SRi’s will exhibit a pattern.

• If the fitted model does accurately capture the nature of the data, a plot of the SRi will be uniformly spread out.


13.6 Residual Analysis (Step 4)

• Standardized residuals (SRi) were defined in Chapter 11:

  SRi = Residual_i / (Standard deviation of residual_i)

• Recall that an SRi is considered large if |SRi| > 2.


13.6 Residual Analysis (Step 4)

• One type of unexplained structure is nonlinearity in some of the predictors (xj).

• To detect this, plot SRi versus each xj.
• If there is nonlinearity in an xj, the plot will show curvature.
• One remedy is to transform the dependent and/or independent variables. Possible transformations: ln's on Y and some or all of the xj, square roots on some or all of the xj, inverses on Y and some or all of the xj.


Exercise 11.33: A government agency responsible for awarding contracts for much of its research work is under careful scrutiny by a number of private companies. One company examines the relationship between the amount of the contract (in $10,000s) and the length of time between the submission of the contract proposal and the contract approval:

  Length (in months), Y:   3    4    6    8    11    14    20
  Size (in $10,000s), x:   1    5   10   50   100   500  1000

The scatterplot of Y vs. x follows.

It is obvious from the scatterplot that the relationship is nonlinear.


In the lower left portion of the scatterplot, the values are clustered together. Both SIZE and LENGTH are more spread out as one moves from left to right.

[Figure: Scatterplot of LENGTH vs. SIZE]


• The use of a ln transformation on both SIZE and LENGTH shrinks this increasing spread.

• The scatterplot of ln(LENGTH) vs. ln(SIZE) follows:

[Figure: Scatterplot of ln(LENGTH) vs. ln(SIZE)]
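A quick numerical check of what the two scatterplots show, using the exercise's seven data points; the R² comparison is an illustration, not part of the exercise:

```python
import numpy as np

# Exercise 11.33 data: contract size vs. approval time.
size   = np.array([1, 5, 10, 50, 100, 500, 1000], dtype=float)
length = np.array([3, 4, 6, 8, 11, 14, 20], dtype=float)

def fit_r2(x, y):
    """Simple-regression R^2 for a candidate transformation of the data."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return 1 - resid @ resid / ((y - y.mean()) ** 2).sum()

print("linear R^2:", round(fit_r2(size, length), 3))
print("ln-ln  R^2:", round(fit_r2(np.log(size), np.log(length)), 3))
```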


• In Exercise 11.33, the use of a ln transformation on both variables was determined by inspecting the scatterplot.

• This kind of visual inspection is not possible in multiple regression.
• To illustrate the usefulness of a plot of SRi vs. each xj, consider a simulated data set.

Example: Consider the Sales vs. Advertising Expenditures and Income example introduced in Chapter 12. However, now the data were simulated from a model that also had the square of Advertising Expenditures as an independent variable. In fitting a model, only the linear effects of Advertising Expenditures and Income were used.


The regression output from Minitab follows:

Regression Analysis: Sales versus Adv Exp, Income

The regression equation is
Sales = -6.37 + 8.03 Adv Exp + 0.944 Income

Predictor   Coef     SE Coef   T       P       VIF
Constant    -6.366   8.578     -0.74   0.482
Adv Exp     8.0292   0.6817    11.78   0.000   1.8
Income      0.9439   0.2370    3.98    0.005   1.8

S = 2.69837   R-Sq = 98.2%   R-Sq(adj) = 97.7%

Unusual Observations
Obs   Adv Exp   Sales    Fit      SE Fit   Residual   St Resid
10    6.00      88.625   84.283   1.720    4.342      2.09R

R denotes an observation with a large standardized residual

The plot of SRi vs. Advertising Expenditures follows.


The curved relationship between SRi and Advertising Expenditures is obvious.

However, including a quadratic term in the model invites multicollinearity.

[Figure: Residuals Versus Adv Exp (response is Sales)]


Example: Consider the simulated data set for Sales vs. Advertising Expenditures and Income introduced above. The Minitab output follows for when the square of Advertising Expenditures is included as a predictor.

Regression Analysis: Sales versus Adv Exp, (AdvExp)^2, Income

The regression equation is
Sales = 0.00 + 1.55 Adv Exp + 0.948 (AdvExp)^2 + 0.993 Income

Predictor    Coef      SE Coef   T       P       VIF
Constant     0.001     2.042     0.00    1.000
Adv Exp      1.5467    0.5946    2.60    0.041   25.6
(AdvExp)^2   0.94800   0.08390   11.30   0.000   24.1
Income       0.99345   0.05440   18.26   0.000   1.8

S = 0.617500   R-Sq = 99.9%   R-Sq(adj) = 99.9%

Although all three predictors are statistically significant, the VIF exceeds 10 for two of the predictors.

• When including a quadratic term, it is recommended that $(x - \bar{x})^2$ be used rather than $x^2$.
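A sketch of why centering helps, computing variance inflation factors for a raw and a centered quadratic term; the uniform x-values are an assumption for illustration:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the other columns (with an intercept)."""
    n, k = X.shape
    out = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ beta
        r2 = 1 - resid @ resid / ((X[:, j] - X[:, j].mean()) ** 2).sum()
        out.append(round(1 / (1 - r2), 1))
    return out

rng = np.random.default_rng(2)
x = rng.uniform(1, 6, size=50)                 # e.g., advertising expenditures
print(vif(np.column_stack([x, x ** 2])))               # raw quadratic: large VIFs
print(vif(np.column_stack([x, (x - x.mean()) ** 2])))  # centered: VIFs near 1
```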


The Minitab output follows for when (AdvExp - 3.2)^2 is used instead of (AdvExp)^2.

Regression Analysis: Sales versus Adv Exp, (AdvExp-3.2)^2, Income

The regression equation is
Sales = -9.71 + 7.61 Adv Exp + 0.948 (Adv Exp - 3.2)^2 + 0.993 Income

Predictor   Coef      SE Coef   T       P       VIF
Constant    -9.706    1.985     -4.89   0.003
Adv Exp     7.6139    0.1603    47.51   0.000   1.9
X^2         0.94800   0.08390   11.30   0.000   1.1
Income      0.99345   0.05440   18.26   0.000   1.8

S = 0.617500   R-Sq = 99.9%   R-Sq(adj) = 99.9%

Multicollinearity is no longer a problem.


• Another requirement is that the error terms be normally distributed.

• Standardized Residuals can be viewed as values of a standard normal random variable.

• Thus, a normal probability plot (NPP) of the SRi's should be linear, and the p-value of a normality test should exceed 0.05.

• Example: Consider the simulated data set for Sales vs. Advertising Expenditures and Income considered above when (AdvExp-3.2)2 is also used as a predictor. The NPP follows:


The linearity of the NPP and the p-value = 0.682 of the Anderson-Darling normality test indicate that the SRi's are normally distributed.

[Figure: Normal Probability Plot of SRES — Mean = 0.06917, StDev = 1.036, N = 10, AD = 0.245, p-value = 0.682]
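A sketch of the same two checks with scipy; note it substitutes the Shapiro-Wilk test for Minitab's Anderson-Darling test, since scipy's Anderson-Darling routine reports critical values rather than a p-value. The residuals here are simulated placeholders:

```python
import numpy as np
from scipy import stats

# sr: standardized residuals from a fitted model (simulated stand-ins here).
rng = np.random.default_rng(3)
sr = rng.standard_normal(10)

# Normal probability plot: the correlation r of ordered SRs vs. normal
# quantiles should be near 1 when the plot is nearly linear.
(osm, osr), (slope, intercept, r) = stats.probplot(sr, dist="norm")
print("NPP correlation:", round(r, 3))

# A formal normality test with a p-value; > 0.05 means no evidence
# against normality.
stat, p = stats.shapiro(sr)
print("p-value:", round(p, 3))
```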


• In Chapter 11, two types of outliers were introduced.

• In simple regression, a high-leverage point is one for which the x-value is, in some sense, far away from most of the x-values.

• A high leverage point is not necessarily bad.

• A high leverage point has the potential to alter the fitted line.


• The concept of a high leverage point in multiple regression was considered in Chapter 12.

• In multiple regression, one must consider not only the range of values of each predictor but the region of values of all the predictors taken together.

• As was demonstrated in Chapter 12, Minitab will indicate a high leverage point by the “X” symbol.


• The other type of outlier is a Y-outlier.
• A Y-outlier is one where |SRi| > 2.

• One problem with outliers is that they can distort the regression equation.

• Another problem is that they can influence the plot of SRi vs. $\hat{Y}_i$ and the NPP of the SRi's.

• In the NPP, if there are a few residuals where |SRi| > 2, then the plot could look nonlinear and have a p-value < 0.05 solely because of these large SRi's.


• The "jackknife" method is another approach for detecting outliers.

• In the jackknife method, a set of n regression models is obtained, each time excluding one of the n observations.

• The coefficients of these models are compared to each other. If an observation is an outlier, the coefficients of the model fit without it should change substantially.
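A sketch of the procedure in Python; the ten (x, y) points reproduce the revised sales example discussed next, with region G changed to (10, 4):

```python
import numpy as np

def ols(y, X):
    """Least-squares fit with intercept; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def jackknife_coefs(y, X):
    """Refit the model n times, leaving out one observation each time."""
    n = len(y)
    return np.array([ols(np.delete(y, i), np.delete(X, i, axis=0))
                     for i in range(n)])

# Data points from the example's jackknife table.
adv   = np.array([1, 2, 1, 3, 2, 4, 10, 5, 5, 6], dtype=float)
sales = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)

coefs = jackknife_coefs(sales, adv.reshape(-1, 1))
for (x, y_), (b0, b1) in zip(zip(adv, sales), coefs):
    print(f"excluding ({x:.0f},{y_:.0f}): intercept={b0:.3f} slope={b1:.3f}")
```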


• To illustrate this concept, consider the sales and advertising expenditures example presented in Chapter 11, where the observation for Region G was changed from (3,4) to (10,4).

• That (10,4) is a high influence point is evident from the fitted line plot which follows.


[Figure: Fitted Regression Line for Sales vs. Adv Exp (Revised Data): Sales_1 = 1.472 + 0.3919 Adv Exp_1; S = 1.08509, R-Sq = 52.9%, R-Sq(adj) = 47.0%]

• Applying the jackknife procedure, the following results are obtained.


• Notice that omitting the point (10,4) resulted in large changes in $\hat{\beta}_0$ and $\hat{\beta}_1$.

  Data Point Excluded   Slope   Intercept
  (1,1)                 0.345   1.76
  (2,1)                 0.351   1.78
  (1,2)                 0.399   1.43
  (3,2)                 0.382   1.58
  (2,3)                 0.416   1.29
  (4,3)                 0.392   1.48
  (10,4)                0.734   0.524
  (5,4)                 0.382   1.45
  (5,5)                 0.363   1.40
  (6,5)                 0.349   1.50


• For multiple regression, the problem becomes more complex as there are k partial slopes that need to be compared.

• The jackknife method is also computationally prohibitive for large n.

• For these situations, Cook's (1977) statistic or similar measures are recommended.
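Cook's statistic is not derived in the slides; a sketch under the standard definition $D_i = SR_i^2 \, h_{ii} / (p\,(1 - h_{ii}))$, with p regression coefficients and simulated placeholder data:

```python
import numpy as np

def cooks_distance(y, X):
    """Cook's D_i = (SR_i^2 / p) * h_ii / (1 - h_ii)."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    H = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
    h = np.diag(H)                           # leverages
    e = y - H @ y                            # residuals
    p = X1.shape[1]
    s2 = e @ e / (n - p)
    sr2 = e ** 2 / (s2 * (1 - h))            # squared standardized residuals
    return sr2 / p * h / (1 - h)

rng = np.random.default_rng(4)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=60)
D = cooks_distance(y, X)
print(np.argsort(D)[-3:])                    # indices of the most influential points
```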


• Regarding the true model, another assumption is constant variance. This means that the error variance ($\sigma_\varepsilon^2$) is constant.

• To investigate this, examine plots of the Standardized Residuals (SRi) vs. the fitted Y's ($\hat{Y}_i$) and each of the predictors.

• If any of these plots is fan (or funnel) shaped, this indicates that the variance of the error term is increasing (or decreasing), i.e., heteroscedasticity.

Example: To demonstrate heteroscedasticity, consider a simulated data set of 30 observations from the true model E(Y) = 1.0 + 0.80x, where the standard deviation of the error is 1.0 for the first 10 observations, 2.0 for the second set of 10 observations, and 3.0 for the last set of 10 observations.
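A sketch of simulating data like the example describes; the x-values 1..30 are an assumption, since the slides do not list them:

```python
import numpy as np

# E(Y) = 1.0 + 0.80x with error standard deviation 1.0, 2.0, 3.0
# for successive blocks of 10 observations.
rng = np.random.default_rng(10)
x = np.arange(1, 31, dtype=float)
sd = np.repeat([1.0, 2.0, 3.0], 10)
y = 1.0 + 0.80 * x + rng.normal(scale=sd)

slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)
# Residual spread grows across the blocks -- the fan shape in the plot.
print([round(resid[i:i + 10].std(), 2) for i in (0, 10, 20)])
```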


The Minitab output follows.

Regression Analysis: Y versus X

The regression equation is
Y = 0.71 + 0.913 X

Predictor   Coef     SE Coef   T      P
Constant    0.713    1.240     0.57   0.570
X           0.9132   0.2870    3.18   0.004

S = 2.567   R-Sq = 26.6%   R-Sq(adj) = 23.9%

Unusual Observations
Obs   X      Y        Fit     SE Fit   Residual   St Resid
26    6.00   14.644   6.193   0.741    8.451      3.44R

R denotes an observation with a large standardized residual

To determine whether or not heteroscedasticity is a problem, look at the plot of SRs vs. $\hat{Y}_i$.


Residuals vs. Fits ($\hat{Y}_i$)

[Figure: Residuals Versus the Fitted Values (response is Y)]

The fan or funnel-shaped pattern indicates heteroscedasticity.


• When there is heteroscedasticity, there are two possible cures.

• One cure is to use weighted least squares.

• The other cure is to re-express the dependent variable.
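A sketch of the first cure, weighted least squares via rescaled rows; the weight choice $w_i = 1/x_i^2$ is an assumption about the variance pattern, not something the slides specify:

```python
import numpy as np

def wls(y, X, w):
    """Weighted least squares: minimize sum_i w_i * (y_i - x_i'beta)^2.
    Equivalent to OLS after scaling each row by sqrt(w_i)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)
    return beta

# Simulated data whose error spread grows with x.
rng = np.random.default_rng(5)
x = np.linspace(1, 10, 50)
y = 1.0 + 0.8 * x + rng.normal(scale=0.3 * x)
print(wls(y, x.reshape(-1, 1), 1.0 / x ** 2))   # roughly [1.0, 0.8]
```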


Examples 13.19 and 13.20:
A very crude model for predicting the price of common stocks might use price per share (Y) as a linear function of previous year's earnings per share (x1), change in earnings per share (x2), and asset value per share (x3). A plot of standardized residuals versus fitted values ($\hat{Y}_i$) for a regression study of 26 stocks shows evidence of heteroscedasticity, since there is a general tendency for the magnitude of the SR's to increase as $\hat{Y}_i$ increases.

Refer to the plot of the SR's vs. $\hat{Y}_i$ that follows.


[Figure: Residuals Versus the Fitted Values (response is PRICE)]


• However, when the dependent variable is defined to be price per share (Y) divided by earnings per share (x1), or the P/E ratio, heteroscedasticity is not apparent.

Refer to the plot of the SR's vs. $\hat{Y}_i$ that follows.


The dilemma now is that neither x2 nor x3 is statistically significant.

[Figure: Residuals Versus the Fitted Values (response is P/E)]


Section 13.7: Autocorrelation (Step 4)


• A requirement in regression is that the error terms be uncorrelated.

• For time-series data, the error terms (ε) at different points in time could be correlated with each other.

• This is called autocorrelation.

• First-order or serial autocorrelation occurs if ε1 is correlated with ε2 , ε2 with ε3 , ε3 with ε4, etc.


• One way of stating that successive error terms are autocorrelated is as follows:

  $\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$

  where ρ is the correlation between successive error terms and the $u_t$ are independent normal random variables.

• This is called the first-order autoregressive error model.

• For many business and economic data sets, ρ is positive.
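A minimal simulation of this error model; ρ = 0.7 is an arbitrary illustrative value:

```python
import numpy as np

# Simulate eps_t = rho * eps_{t-1} + u_t with independent normal shocks u_t.
rng = np.random.default_rng(7)
rho, n = 0.7, 200
u = rng.normal(size=n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + u[t]

# The sample correlation between successive errors should be near rho.
print(round(np.corrcoef(eps[:-1], eps[1:])[0, 1], 2))
```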


• When the error terms are positively autocorrelated, consequences of using the least-squares line are:
  • The residual standard deviation (sε) underestimates the standard deviation of the error term (σε).
  • F and t tests will appear more significant than they really are; that is, p-values are smaller than they should be.
  • R² is larger than it should be.

• Positive autocorrelation leads to delusions of predictability. "We think we can predict more accurately than we actually can." (Hildebrand, Ott and Gray)


• How to detect autocorrelation in the unobservable error terms?

Use the residuals, which estimate the error terms! Determine if the residuals are autocorrelated!

• To illustrate residual autocorrelation, consider Exercise 13.42.


Exercise 13.42:
An auto-supply store had 60 months of data on variables that were thought to be relevant to sales. The data include monthly sales in thousands of dollars (SALES), average daily low temperature in degrees Fahrenheit (LOWTEMP), advertising expenditure for the month in thousands of dollars (ADEXP), used-car sales in the previous month (USEDCAR), and month number (MONTH).

For a preliminary analysis, use Simple Linear Regression with LOWTEMP as the predictor. Selected output is shown below. Is there autocorrelation in the residuals?


• A plot of the residuals vs. the order in which they occur follows.

• There is a tendency for a positive residual to be followed by a positive residual and a negative residual to be followed by a negative residual.

Residuals are positively autocorrelated.

[Figure: Residuals Versus the Order of the Data (response is SALES)]


• The scatterplot for pairs of successive residuals also shows positive first-order autocorrelation.

[Figure: Scatterplot of RES(t) vs. RES(t-1) with Fitted Line]


• The Durbin-Watson statistic is a formal test for first-order autocorrelation.

• Test statistic:

$d = \dfrac{\sum_{t=1}^{n-1}\left[\text{Residual}_{t+1} - \text{Residual}_{t}\right]^{2}}{\sum_{t=1}^{n}\text{Residual}_{t}^{2}}$
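A sketch of the statistic; the two test series (white noise and a random walk) are illustrative, not the exercise's residuals:

```python
import numpy as np

def durbin_watson(e):
    """d = sum of squared successive residual differences over the
    sum of squared residuals."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(6)
white = rng.normal(size=60)
print(round(durbin_watson(white), 2))             # near 2: no autocorrelation
print(round(durbin_watson(np.cumsum(white)), 2))  # positively autocorrelated: near 0
```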


• Properties of d

• 0 ≤ d ≤ 4

• If errors are uncorrelated, d ≈ 2

• If errors are positively correlated, d is close to 0

• If errors are negatively correlated, d is close to 4

• A description of the formal hypothesis testing procedure follows.


• H0: No first-order autocorrelation between consecutive error terms

Ha: Positive first-order autocorrelation between consecutive error terms

• Negative may be used in place of positive in the Ha.

• Ha could be two-sided: positive or negative autocorrelation.

• For most time series in business, use Ha with positive autocorrelation.


• The formal procedure for testing H0 using the d-statistic is as follows:

H0: No autocorrelation

Ha: Positive autocorrelation

Rejection region: d < dL,α

Nonrejection region: d > dU,α

Inconclusive (“possibly significant”) region: dL,α ≤ d ≤ dU,α


• Note: dL,α and dU,α are the lower and upper tabulated values, respectively, corresponding to k independent variables and n observations.

• Tables of the Durbin-Watson test bounds are available (Johnston, 1977).

• In practice, we hope not to reject H0: ρ = 0.
• Rule of thumb: "any value of d less than 1.5 or 1.6 leads us to suspect autocorrelation." (H,O & G)
• To illustrate using the Durbin-Watson test, consider Exercise 13.42.


Exercise 13.42:An auto-supply store had 60 months of data on variables that were thought to be relevant to sales. The data include monthly sales in thousands of dollars (SALES) and average daily low temperature in degrees Fahrenheit (LOWTEMP). For a preliminary analysis, use Simple Linear Regression with LOWTEMP as the predictor. Obtain the Durbin-Watson statistic. Does this statistic indicate that there is a serious autocorrelation problem?

Selected output is shown below.


Regression Analysis: SALES versus LOWTEMP

The regression equation is
SALES = 1026 - 5.68 LOWTEMP

Predictor   Coef      SE Coef   T       P
Constant    1026.09   29.38     34.93   0.000
LOWTEMP     -5.6836   0.5854    -9.71   0.000

S = 80.5599   R-Sq = 61.9%   R-Sq(adj) = 61.2%

Durbin-Watson statistic = 1.16730


• The rule of thumb says a value of d < 1.5 indicates positive autocorrelation.

• Since d = 1.17, reject H0 of no positive autocorrelation.

• The table value at the 5% level of significance is dL,.05 = 1.55. Since d = 1.17 < 1.55, reject H0 of no positive autocorrelation.


• The appropriate remedial measure for autocorrelation depends on the cause.

• If the autocorrelation is due to an omitted predictor, include another predictor variable whose cycles match the cycles of the residuals.

• For example, in regressing monthly sales of a restaurant chain on its monthly advertising expenditures, the addition of a competitor’s monthly advertising expenditures as a predictor could minimize an autocorrelation problem.


• If the model is correctly specified and autocorrelation occurs because the error terms really are autocorrelated, use transformed variables.

• The transformed variables are denoted as $Y_t'$ and $x_t'$, where:

  $Y_t' = Y_t - \hat{\rho}\,Y_{t-1}$,   $x_t' = x_t - \hat{\rho}\,x_{t-1}$,

  and $\hat{\rho}$ denotes the estimate of ρ.

• A quick approach is to let $\hat{\rho} = 1$.
• Thus, the transformed or differenced variables are: $Y_t' = Y_t - Y_{t-1}$ and $x_t' = x_t - x_{t-1}$.
• The Minitab procedure for finding the differenced variables is: Stat > Time Series > Differences
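A sketch of first differencing with np.diff, mirroring what Minitab's Differences command produces; the monthly series here are simulated stand-ins for the exercise's data (the seasonal LOWTEMP pattern and random-walk error term are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(8)
lowtemp = 40 + 15 * np.sin(np.arange(60) * 2 * np.pi / 12) + rng.normal(size=60)
sales = 1026 - 5.7 * lowtemp + np.cumsum(rng.normal(size=60))  # autocorrelated errors

difsales = np.diff(sales)    # Y'_t = Y_t - Y_{t-1}: 59 differences from 60 values
diftemp = np.diff(lowtemp)   # x'_t = x_t - x_{t-1}
slope, intercept = np.polyfit(diftemp, difsales, 1)
print(round(intercept, 1), round(slope, 2))
```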


• Exercise 13.42 (cont'd.): First differences were obtained for the SALES and LOWTEMP variables and denoted as DIFSALES and DIFTEMP, respectively. Note that although there were 60 observations initially, there are 59 first differences. Is there evidence of an autocorrelation problem?

Selected Minitab output follows. At the 5% level of significance, dL,.05 ≈ 1.55 and dU,.05 ≈ 1.62. Since d = 3.03 > 1.62, do not reject H0 of no positive autocorrelation.


Regression Analysis: DIFSALES versus DIFTEMP

The regression equation is
DIFSALES = 4.4 - 6.15 DIFTEMP

59 cases used, 1 cases contain missing values

Predictor   Coef     SE Coef   T       P
Constant    4.40     11.40     0.39    0.701
DIFTEMP     -6.146   1.130     -5.44   0.000

S = 87.5610   R-Sq = 34.2%   R-Sq(adj) = 33.0%

Durbin-Watson statistic = 3.02730


• If anything, differencing has resulted in the conversion of positive autocorrelation to apparent negative autocorrelation. Refer to the plot of the residuals vs. their order.

[Figure: Residuals Versus the Order of the Data (response is DIFSALES)]


Section 13.8: Model Validation


• Assessing the appropriateness of a fitted regression model is much like tuning an automobile.

• When an automobile is tuned, it is tuned for the conditions under which it will typically operate, e.g., sea-level altitude. However, the automobile may not perform properly if operated at high altitudes.

• Similarly, the fitted model was built and tuned using the data in hand.

• However, will the fitted model work just as well for data not used to build the model?


• The fitted model needs to be validated: How well does the fitted model work with different data?

• The different data may be:
  • New data: data collected after the original data; or,
  • A holdout sample from the original data:
    • For cross-sectional data, the holdout sample could be a randomly selected subset of the original data;
    • For time-series data, the holdout sample is usually the most recent data.


• There are several methods for checking validity.

• H,O&G propose the following method:
  • Obtain the residuals using the new data;
  • Check to see if the average error in the validation sample is near 0 and if the standard deviation of the validation errors is reasonably close to the residual standard deviation of the model.


• The mean square forecast error, a.k.a. the mean square prediction error, is:

  $\text{MSPE} = \dfrac{\sum_{i=1}^{n_2}\left(Y_i - \hat{Y}_i\right)^2}{n_2}$

  where n2 is the size of the new sample.

• To illustrate this validation method, consider Exercise 13.55.
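A sketch of the validation arithmetic on simulated placeholder data (not Exercise 13.55's): fit on 120 random rows, then check the mean error and MSPE on the 40 held-out rows:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 160
X = rng.normal(size=(n, 2))
y = 1.0 + X @ np.array([0.5, -0.2]) + rng.normal(scale=0.4, size=n)

# Random split: 120 observations to build the model, 40 to validate it.
idx = rng.permutation(n)
train, hold = idx[:120], idx[120:]
X_tr = np.column_stack([np.ones(len(train)), X[train]])
beta, *_ = np.linalg.lstsq(X_tr, y[train], rcond=None)

# Prediction errors on the holdout sample.
yhat = np.column_stack([np.ones(len(hold)), X[hold]]) @ beta
err = y[hold] - yhat
print(round(err.mean(), 3))          # should be near 0
print(round(np.mean(err ** 2), 3))   # MSPE; compare with s^2 from the fit
```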


Exercise 13.55:

A bank that offers charge cards to customers studies the yearly purchase amount on the card as related to age, income, and years of education of the cardholder, and whether the cardholder owns or rents a home. The variable "owner" equals 1 if the cardholder owns a home and 0 if the cardholder rents a home.


• During the model building phase, if one were to use the first 120 observations, this would exclude homeowners.

• Moral: Randomly select the observations to use in building the model.


• Using 120 randomly selected observations, eliminating multicollinearity by defining the response variable to be "yearly purchase amount / income" (PURCH_1/INC_1), and removing the insignificant predictor, the model is:

  $\hat{Y} = 0.00703 + 0.000470\,(\text{AGE}) - 0.00095\,(\text{OWNER})$
                        (.000)             (.034)

The p-values are shown below the coefficient estimates.


• This model was used during the validation phase to obtain residuals for the other 40 observations.

• Since there is no systematic error in the residuals, the model has been validated.

• Note: As with an automobile, the model must undergo scheduled maintenance.


Keywords: Chapter 13

• Collinearity
• Correlation matrix
• Matrix plot
• Qualitative predictors
• Dummy variables
• Indicator variables
• Lagged variables
• Nonlinear models
• Logarithmic transformation
• Residual plots
• Stepwise regression
• Forward selection
• Backward elimination
• Mallows' Cp
• Outliers
• Jackknife method
• Homoscedasticity
• Heteroscedasticity
• Weighted least squares
• Autocorrelation
• First-order autoregressive model
• Durbin-Watson statistic
• First differences
• Model validation


Summary of Chapter 13

• This chapter presented a four-step process for building a multiple linear regression model:

• STEP ONE:
  • Initial Selection of Possible Predictor Variables
  • Incorporating Qualitative Independent Variables by Using Dummy or Indicator Variables
  • Incorporating Lagged Predictor Variables when there is Time-Series Data


• STEP TWO:
  • Addressing Nonlinearity and Interaction Among the Variables

• STEP THREE:
  • Choosing Predictors Using Stepwise and Other Methods

• STEP FOUR:
  • Checking the Assumptions of Linearity, Heteroscedasticity, Normality and Independence by Doing a Residual Analysis
  • Validating the Model
