Page 1: Chapter 8: Regression Models for Quantitative and Qualitative Predictors

Chapter 8: Regression Models for Quantitative and Qualitative Predictors

Ayona Chatterjee

Spring 2008

Math 4813/5813

Page 2

Polynomial Regression Models

• When the true curvilinear response function is indeed a polynomial.

• When the true curvilinear response function is unknown but the polynomial is a good approximation to the true function.

Page 3

One Predictor Variable – Second Order

• Let us consider a polynomial model in which one predictor variable appears raised to the first and second powers.

• This polynomial is called a second-order model with one predictor.

$$Y_i = \beta_0 + \beta_1 x_i + \beta_{11} x_i^2 + \varepsilon_i, \qquad \text{where } x_i = X_i - \bar{X}$$

Page 4

One Predictor Variable – Second Order

• Note the second order regression equation in one variable represents a parabola.

Here β1 is called the linear effect coefficient and β11 is called the quadratic effect coefficient.

Page 5

One Predictor Variable – Third Order

• The third-order model with one predictor variable is given as

$$Y_i = \beta_0 + \beta_1 x_i + \beta_{11} x_i^2 + \beta_{111} x_i^3 + \varepsilon_i, \qquad \text{where } x_i = X_i - \bar{X}$$

Page 6

Two Predictor Variables – Second Order

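• The regression model

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{11} x_{i1}^2 + \beta_{22} x_{i2}^2 + \beta_{12} x_{i1} x_{i2} + \varepsilon_i$$

where $x_{i1} = X_{i1} - \bar{X}_1$ and $x_{i2} = X_{i2} - \bar{X}_2$, is a second-order model with two predictor variables.

• The equation represents a conic section.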

Page 7

Example of a Quadratic Response Surface

Page 8

Hierarchical Approach to Fitting

• The norm is to fit a second-order or a third-order polynomial and explore if a lower order model is adequate.

• For example, if we have a third-order model in one variable, we may want to test whether β111 = 0, or whether both β11 and β111 equal zero.

• We use extra sums of squares to do the test.

Page 9

Extra Sums of Squares

• To test whether β111 = 0 we would use SSR(x³ | x, x²). To test whether both β11 and β111 equal zero we would use SSR(x², x³ | x).

• Note SSR(x², x³ | x) = SSR(x² | x) + SSR(x³ | x, x²).

• If a polynomial term of a given order is retained, then all related lower-order terms are also retained.
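The decomposition of the extra sums of squares can be checked numerically. The sketch below is only an illustration: the data are hypothetical (they do not come from the slides), the helper name `sse` is made up here, and NumPy's least-squares routine stands in for whatever software the course used. Each extra sum of squares is computed as a drop in the error sum of squares when terms are added.

```python
import numpy as np

# Hypothetical data, invented for illustration only.
X = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
Y = np.array([2.1, 3.9, 7.2, 12.5, 18.1, 26.0, 37.3, 49.8])
x = X - X.mean()  # centered predictor


def sse(*columns):
    """Error sum of squares of a least-squares fit with an intercept."""
    D = np.column_stack([np.ones_like(Y), *columns])
    b, *_ = np.linalg.lstsq(D, Y, rcond=None)
    return float(np.sum((Y - D @ b) ** 2))


sse_1 = sse(x)                # first-order model
sse_2 = sse(x, x**2)          # second-order model
sse_3 = sse(x, x**2, x**3)    # third-order model

# Extra sums of squares are reductions in SSE.
ssr_x2_given_x = sse_1 - sse_2
ssr_x3_given_x_x2 = sse_2 - sse_3
ssr_x2_x3_given_x = sse_1 - sse_3

# F statistics for the two hypotheses, using the MSE of the third-order model.
mse_full = sse_3 / (len(Y) - 4)
f_cubic = ssr_x3_given_x_x2 / mse_full       # H0: beta_111 = 0
f_both = (ssr_x2_x3_given_x / 2) / mse_full  # H0: beta_11 = beta_111 = 0

print(ssr_x2_x3_given_x, ssr_x2_given_x + ssr_x3_given_x_x2)
print(f_cubic, f_both)
```

The two printed sums of squares agree, which is the identity stated above; the F statistics would then be compared with the appropriate F table values.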

Page 10

Regression Function in Terms of X

• To return to the original scale and undo the centering of the predictor variable, we use the following transformations.

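$$\hat{Y} = b_0 + b_1 x + b_{11} x^2$$

The regression equation in terms of the original variables:

$$\hat{Y} = b_0' + b_1' X + b_{11}' X^2$$

where

$$b_0' = b_0 - b_1 \bar{X} + b_{11} \bar{X}^2, \qquad b_1' = b_1 - 2 b_{11} \bar{X}, \qquad b_{11}' = b_{11}$$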
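As a quick check of the back-transformation for a second-order model in one centered predictor, the sketch below converts coefficients to the original scale and confirms that both forms give the same fitted value. The coefficient values, the mean, and the function name `to_original_scale` are hypothetical, chosen only for illustration.

```python
def to_original_scale(b0, b1, b11, x_bar):
    """Convert coefficients of Y-hat = b0 + b1*x + b11*x**2, with x = X - x_bar,
    into coefficients of Y-hat = b0p + b1p*X + b11p*X**2."""
    b0p = b0 - b1 * x_bar + b11 * x_bar ** 2
    b1p = b1 - 2 * b11 * x_bar
    b11p = b11
    return b0p, b1p, b11p


# Hypothetical fitted coefficients and predictor mean.
b0, b1, b11, x_bar = 10.0, 2.0, 0.5, 4.0
b0p, b1p, b11p = to_original_scale(b0, b1, b11, x_bar)

for X in (1.0, 4.0, 7.5):
    x = X - x_bar
    centered = b0 + b1 * x + b11 * x ** 2
    original = b0p + b1p * X + b11p * X ** 2
    # Both parameterizations describe the same fitted curve.
    assert abs(centered - original) < 1e-12

print(b0p, b1p, b11p)
```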

Page 11

Example

• A researcher studied the effects of the charge rate and temperature on the life of a new type of power cell in a small-scale experiment. The charge rate (X1) was controlled at three levels, and so was the ambient temperature (X2). The life of the power cell was the response (Y).

• The researcher decided to fit a second-order polynomial regression model.

Page 12

Data Set - Power Cells Example

Life of cell (Y)    Charge rate (X1)    Temperature (X2)
150                 0.6                 10
86                  1.0                 10
49                  1.4                 10
288                 0.6                 20
157                 1.0                 20
131                 1.0                 20
184                 1.0                 20
109                 1.4                 20
279                 0.6                 30
235                 1.0                 30
224                 1.4                 30

• Scale the units and fit the second-order polynomial regression model.

• Obtain the correlation between each predictor and its square, first for the original variables X and then for the scaled variables x. Has the transformation reduced collinearity?
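One way to carry out this exercise is sketched below with NumPy. The data are the eleven observations listed on the slide. The scaling is an assumption: each predictor is centered at its mean and divided by the spacing between its levels (0.4 for charge rate, 10 for temperature), so the coded values are −1, 0 and 1. The helper name `corr` is made up for this sketch.

```python
import numpy as np

# Power cell data from the slide: life Y, charge rate X1, temperature X2.
Y = np.array([150, 86, 49, 288, 157, 131, 184, 109, 279, 235, 224], dtype=float)
X1 = np.array([0.6, 1.0, 1.4, 0.6, 1.0, 1.0, 1.0, 1.4, 0.6, 1.0, 1.4])
X2 = np.array([10, 10, 10, 20, 20, 20, 20, 20, 30, 30, 30], dtype=float)

# Assumed coding: center at the mean, divide by the spacing between levels.
x1 = (X1 - X1.mean()) / 0.4
x2 = (X2 - X2.mean()) / 10.0

# Design matrix of the second-order model in the coded variables.
D = np.column_stack([np.ones_like(Y), x1, x2, x1**2, x2**2, x1 * x2])
b, *_ = np.linalg.lstsq(D, Y, rcond=None)  # b0, b1, b2, b11, b22, b12


def corr(u, v):
    """Pearson correlation between two vectors."""
    return float(np.corrcoef(u, v)[0, 1])


r_orig = (corr(X1, X1**2), corr(X2, X2**2))   # before coding
r_coded = (corr(x1, x1**2), corr(x2, x2**2))  # after coding

print("coefficients:", np.round(b, 2))
print("corr(X, X^2):", np.round(r_orig, 3))
print("corr(x, x^2):", np.round(r_coded, 3))
```

With this coding the linear and squared terms are uncorrelated, whereas in the original units each predictor is almost perfectly correlated with its own square.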

Page 13

Test of Fit

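• An F test for lack of fit checks how well the model fits the data.

• Define

$$SSPE = \sum_j \sum_i (Y_{ij} - \bar{Y}_j)^2 \quad \text{(replicated observations only)}$$

with n − c degrees of freedom, where c is the number of distinct levels of X and $\bar{Y}_j$ is the mean response at the j-th distinct level, and

$$SSLF = SSE - SSPE, \qquad F^* = \frac{SSLF / (c - p)}{SSPE / (n - c)}$$

where p is the number of regression parameters in the fitted model.

• If F* is greater than the F table value, the model is not a good fit.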
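The lack-of-fit statistic can be computed for the power cell data as sketched below. The coding of the predictors (center at the mean, divide by 0.4 and 10) is an assumption, and the variable names are illustrative. Only the three observations at charge rate 1.0 and temperature 20 are replicates, so they alone contribute to the pure error sum of squares.

```python
import numpy as np
from collections import defaultdict

# Power cell data from the slide.
Y = np.array([150, 86, 49, 288, 157, 131, 184, 109, 279, 235, 224], dtype=float)
X1 = np.array([0.6, 1.0, 1.4, 0.6, 1.0, 1.0, 1.0, 1.4, 0.6, 1.0, 1.4])
X2 = np.array([10, 10, 10, 20, 20, 20, 20, 20, 30, 30, 30], dtype=float)

# Assumed coding of the predictors.
x1 = (X1 - X1.mean()) / 0.4
x2 = (X2 - X2.mean()) / 10.0

# Fit the second-order model and obtain SSE.
D = np.column_stack([np.ones_like(Y), x1, x2, x1**2, x2**2, x1 * x2])
b, *_ = np.linalg.lstsq(D, Y, rcond=None)
sse = float(np.sum((Y - D @ b) ** 2))

# Group responses by distinct (X1, X2) level to get the pure error.
groups = defaultdict(list)
for y, a, t in zip(Y, X1, X2):
    groups[(a, t)].append(y)
sspe = float(sum(sum((y - np.mean(g)) ** 2 for y in g) for g in groups.values()))

n, p, c = len(Y), D.shape[1], len(groups)
sslf = sse - sspe
f_star = (sslf / (c - p)) / (sspe / (n - c))

print(f"c = {c}, SSPE = {sspe:.2f}, SSLF = {sslf:.2f}, F* = {f_star:.2f}")
```

F* is then compared with the F table value on c − p and n − c degrees of freedom (here 3 and 2).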

Page 14

Partial F Test

• Suppose for the given data you want to test if a first-order model is sufficient.

• Here H0: β11 = β22 = β12 = 0.

• The F statistic is

$$F^* = \frac{SSR(x_1^2, x_2^2, x_1 x_2 \mid x_1, x_2)}{3} \div MSE$$
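A sketch of this partial F test for the power cell data follows. It assumes the same coding of the predictors as in the fitting exercise (center at the mean, divide by 0.4 and 10); the helper name `sse` is made up. The extra sum of squares is obtained as the difference in SSE between the first-order and second-order fits.

```python
import numpy as np

# Power cell data from the slide.
Y = np.array([150, 86, 49, 288, 157, 131, 184, 109, 279, 235, 224], dtype=float)
X1 = np.array([0.6, 1.0, 1.4, 0.6, 1.0, 1.0, 1.0, 1.4, 0.6, 1.0, 1.4])
X2 = np.array([10, 10, 10, 20, 20, 20, 20, 20, 30, 30, 30], dtype=float)

# Assumed coding of the predictors.
x1 = (X1 - X1.mean()) / 0.4
x2 = (X2 - X2.mean()) / 10.0


def sse(*columns):
    """Error sum of squares of a least-squares fit with an intercept."""
    D = np.column_stack([np.ones_like(Y), *columns])
    b, *_ = np.linalg.lstsq(D, Y, rcond=None)
    return float(np.sum((Y - D @ b) ** 2))


sse_reduced = sse(x1, x2)                          # first-order model
sse_full = sse(x1, x2, x1**2, x2**2, x1 * x2)      # second-order model

ssr_extra = sse_reduced - sse_full   # SSR(x1^2, x2^2, x1x2 | x1, x2)
mse = sse_full / (len(Y) - 6)        # MSE of the full model
f_star = (ssr_extra / 3) / mse

print(f"extra SSR = {ssr_extra:.1f}, MSE = {mse:.1f}, F* = {f_star:.2f}")
```

A small F* relative to the F table value on 3 and n − 6 degrees of freedom would support dropping the second-order terms.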

Page 15

Interaction Regression Models

• A regression model with p − 1 predictor variables contains additive effects if the response function can be written as

– E{Y} = f1(X1) + f2(X2) + … + fp−1(Xp−1)

• Note that the functions need not all be simple.

• If a response function cannot be written as above, then the model is not additive and interaction terms are present.

Page 16

Interpretation of Interaction Regression Models

• In the presence of an interaction term, the regression coefficients cannot be interpreted as before.

• For a first-order model with an interaction term, E{Y} = β0 + β1X1 + β2X2 + β3X1X2, the change in the mean response with a unit increase in X1 when X2 is held constant is β1 + β3X2 and not just β1.
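The dependence of the X1 effect on the level of X2 can be seen in a small numerical sketch. The coefficient values and function names below are hypothetical, chosen only to illustrate a first-order model with an interaction term.

```python
def mean_response(x1, x2, b0=10.0, b1=2.0, b2=5.0, b3=0.5):
    """E{Y} for a first-order model with interaction (hypothetical coefficients)."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_in_x1(x2, b1=2.0, b3=0.5):
    """Change in E{Y} per unit increase in X1 when X2 is held at x2."""
    return b1 + b3 * x2


for x2 in (1.0, 3.0):
    # Increase X1 from 3 to 4 while holding X2 fixed.
    change = mean_response(4.0, x2) - mean_response(3.0, x2)
    print(f"X2 = {x2}: change = {change}, beta1 + beta3*X2 = {slope_in_x1(x2)}")
```

Because the interaction coefficient is positive in this sketch, the effect of X1 grows as X2 increases.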

Page 17

Reinforcement Effects

• When β1 and β2 are positive, we say the interaction effect between the two quantitative variables is of a reinforcement or synergistic type when the slope of the response function against one of the predictor variables increases for higher levels of the other predictor variable, that is, when β3 is positive.

Page 18

Interference Effects

• When β1 and β2 are positive, we say the interaction effect between the two quantitative variables is of an interference or antagonistic type when the slope of the response function against one of the predictor variables decreases for higher levels of the other predictor variable, that is, when β3 is negative.

Page 19

Implementing an Interaction Model

• There are two points to keep in mind:

– High multicollinearity may exist between some predictors, and hence centering the variables may help in reducing this problem.

– If there is a large number of predictors, there are many possible interaction terms. Choose only the terms that you think will influence the response.

Page 20

Qualitative Predictors

• Example: Y is speed at which an insurance innovation is adopted, X1 is the size of the firm, and another predictor variable to identify type of firm.

• Here let the firm types be stock or mutual company. Thus we can define

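$$X_2 = \begin{cases} 1 & \text{if stock company} \\ 0 & \text{otherwise} \end{cases} \qquad X_3 = \begin{cases} 1 & \text{if mutual company} \\ 0 & \text{otherwise} \end{cases}$$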

Page 21

Principle

• A qualitative variable with c classes will be represented by c − 1 indicator variables, each taking on the values 0 and 1.

• We modify the previous example as

$$X_2 = \begin{cases} 1 & \text{if stock company} \\ 0 & \text{if mutual company} \end{cases}$$
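The reason for using c − 1 rather than c indicator variables can be illustrated numerically: with an intercept in the model, the stock and mutual indicators sum to the column of ones, so the design matrix loses full column rank. The firm sizes and types below are hypothetical, invented for this sketch.

```python
import numpy as np

# Hypothetical firms: size and type, invented for illustration.
firm_type = ["stock", "mutual", "stock", "mutual", "mutual", "stock"]
size = np.array([151.0, 92.0, 175.0, 31.0, 104.0, 277.0])

stock = np.array([1.0 if t == "stock" else 0.0 for t in firm_type])
mutual = 1.0 - stock
ones = np.ones_like(size)

# Two indicators plus an intercept: columns are linearly dependent.
X_both = np.column_stack([ones, size, stock, mutual])
# One indicator (c - 1 = 1) plus an intercept: full column rank.
X_one = np.column_stack([ones, size, stock])

rank_both = int(np.linalg.matrix_rank(X_both))
rank_one = int(np.linalg.matrix_rank(X_one))

print(f"4 columns, rank {rank_both}; 3 columns, rank {rank_one}")
```

With both indicators the four columns have rank 3, so the least-squares coefficients are not uniquely determined; dropping one indicator removes the dependence.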

Page 22

Qualitative Predictor with More than Two Classes

• Consider the regression of tool wear (Y) on tool speed (X1) and tool model. Tool model is a qualitative variable with four possible classes: M1, M2, M3 and M4.

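$$X_2 = \begin{cases} 1 & \text{if tool model M1} \\ 0 & \text{otherwise} \end{cases} \qquad X_3 = \begin{cases} 1 & \text{if tool model M2} \\ 0 & \text{otherwise} \end{cases} \qquad X_4 = \begin{cases} 1 & \text{if tool model M3} \\ 0 & \text{otherwise} \end{cases}$$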

Page 23

Indicator Variables versus Allocated Codes

• An alternative to using indicator variables is to use allocated codes.

• Consider, for instance, the predictor variable “frequency of product use”, which has three classes.

– Frequent user – 3

– Occasional user – 2

– Nonuser – 1

• Here we have Yi = β0 + β1Xi1 + error.

• This coding implies that the mean response changes by the same amount when going from a nonuser to an occasional user as when going from an occasional user to a frequent user.

Page 24

Why indicator variables?

• Indicator variables make no assumptions about the spacing of the classes.

• They rely on the data to show the differential effects.

• Alternative model: Yi = β0 + β1Xi1 + β2Xi2 + error

– Here X1 = 1 for a frequent user

– X2 = 1 for an occasional user

– In all other cases the indicators are zero.
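The difference between the two codings shows up clearly on a small example. The responses below are hypothetical, invented so that the three class means (10, 12 and 21) are unequally spaced. The allocated-code fit forces equal steps between classes, while the indicator fit reproduces the class means.

```python
import numpy as np

# Hypothetical responses: two frequent users, two occasional users, two nonusers.
y = np.array([20.0, 22.0, 11.0, 13.0, 9.0, 11.0])
code = np.array([3.0, 3.0, 2.0, 2.0, 1.0, 1.0])  # allocated codes 3, 2, 1
ones = np.ones_like(y)

# Allocated codes: a single slope, so equal steps between classes.
b_code, *_ = np.linalg.lstsq(np.column_stack([ones, code]), y, rcond=None)
fitted_code = b_code[0] + b_code[1] * np.array([1.0, 2.0, 3.0])

# Indicator variables: X1 = 1 for frequent user, X2 = 1 for occasional user.
X1 = (code == 3).astype(float)
X2 = (code == 2).astype(float)
b_ind, *_ = np.linalg.lstsq(np.column_stack([ones, X1, X2]), y, rcond=None)
fitted_ind = np.array([b_ind[0], b_ind[0] + b_ind[2], b_ind[0] + b_ind[1]])

print("class order: nonuser, occasional, frequent")
print("allocated codes:", np.round(fitted_code, 3))
print("indicators:     ", np.round(fitted_ind, 3))
```

In the indicator model, b0 estimates the nonuser mean, and b1 and b2 estimate how far the frequent and occasional user means lie from it.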

Page 25

Quantitative to Qualitative

• Sometimes we may convert quantitative data to qualitative data, for example ages can be grouped and we can use indicator variables to denote the age groups.

• An alternative coding is to use 1 and –1 for the two levels of a qualitative factor.

Page 26

Comparison of Two or More Regression Functions-Example

• We can compare regression functions using hypothesis testing to see whether or not two functions represent the same response function.

• Examples.

Page 27

Comparison of Two or More Regression Functions-Example

• A company operates two production lines for making soap bars. For each line, the relation between the speed of the line and the amount of scrap for the day was studied. A scatter plot of the data for the two production lines suggests that the regression relation between production line speed and amount of scrap is linear but not the same for the two production lines. The slopes appear to be the same, but the heights of the regression lines differ. A formal test is desired to determine if the two regression lines are identical.

Page 28

Soap Production line - Example

• First fit separate regression models for both production lines.

• Next, combine all the data and, using an indicator variable X2 for the production line, fit a first-order regression model with interaction: E{Y} = β0 + β1X1 + β2X2 + β3X1X2.

• Identity of the regression functions for the two production lines is tested with H0: β2 = β3 = 0, and equality of the two slopes with H0: β3 = 0.
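Both tests can be sketched with the general linear test approach, comparing the SSE of the full interaction model with the SSE of the model implied by each null hypothesis. The data below are hypothetical, invented for illustration and not the data from the study; the variable and helper names are likewise made up.

```python
import numpy as np

# Hypothetical data: line speed, amount of scrap, and production line indicator.
speed = np.array([100, 150, 200, 250, 300, 105, 160, 210, 255, 310], dtype=float)
scrap = np.array([218, 275, 342, 401, 460, 140, 205, 262, 318, 385], dtype=float)
line1 = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)  # X2 = 1 for line 1


def sse(*columns):
    """Error sum of squares of a least-squares fit with an intercept."""
    D = np.column_stack([np.ones_like(scrap), *columns])
    b, *_ = np.linalg.lstsq(D, scrap, rcond=None)
    return float(np.sum((scrap - D @ b) ** 2))


sse_full = sse(speed, line1, speed * line1)  # separate intercepts and slopes
sse_pooled = sse(speed)                      # model under H0: beta2 = beta3 = 0
sse_parallel = sse(speed, line1)             # model under H0: beta3 = 0

n = len(scrap)
mse_full = sse_full / (n - 4)

f_identity = ((sse_pooled - sse_full) / 2) / mse_full  # 2 and n - 4 df
f_slopes = (sse_parallel - sse_full) / mse_full        # 1 and n - 4 df

print(f"F* (identical lines) = {f_identity:.1f}")
print(f"F* (equal slopes)    = {f_slopes:.2f}")
```

Each F* is compared with the F table value on the stated degrees of freedom. In this made-up data set the lines differ mainly in height, so the first statistic is large and the second is small.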