©2006 thomson/south-western 1 chapter 14 – multiple linear regression slides prepared by jeff...
TRANSCRIPT
©2006 Thomson/South-Western 1
Chapter 14 – Multiple Linear Regression
Slides prepared by Jeff Heyl, Lincoln University
Concise Managerial Statistics
KVANLI, PAVUR, KEELING
©2006 Thomson/South-Western 2
Multiple Regression Model
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + e$
Deterministic component
$\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$
Least Squares Estimate
$SSE = \sum (Y - \hat{Y})^2$
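A minimal sketch of the least squares idea, assuming made-up data and hypothetical helper names (`solve_linear_system`, `least_squares`) that do not come from the slides. It solves the normal equations for $b_0, \ldots, b_k$ in plain Python and then computes SSE; a statistics package would normally do this step.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(xs, ys):
    """Return (b, sse): the coefficients b0..bk that minimize SSE, and the SSE."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1 carries the intercept
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    b = solve_linear_system(xtx, xty)
    fitted = [sum(bi * ri for bi, ri in zip(b, r)) for r in rows]
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return b, sse


# Made-up data generated exactly from Y = 1 + 2*X1 + 3*X2 (no error term)
xs = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
b, sse = least_squares(xs, ys)
```

Because the illustrative data contain no error, the fit recovers the generating coefficients and SSE is essentially zero; with real data SSE is positive.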
©2006 Thomson/South-Western 3
Multiple Regression Model
Figure 14.1: the plane $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ plotted against $X_1$ and $X_2$, with observations above the plane ($e$ positive) and below it ($e$ negative)
©2006 Thomson/South-Western 4
Housing Example
Y = Home square footage (100s)
X1 = Annual Income ($1,000s)
X2 = Family Size
X3 = Combined years of education beyond high school for all household members
©2006 Thomson/South-Western 5
Multiple Regression Model
Figure 14.2
©2006 Thomson/South-Western 6
Multiple Regression Model
Figure 14.3
©2006 Thomson/South-Western 7
Multiple Regression Model
Figure 14.4
©2006 Thomson/South-Western 8
Assumptions of the Multiple Regression Model
The errors follow a normal distribution, centered at zero, with common variance
The errors are independent
©2006 Thomson/South-Western 9
Errors in Multiple Linear Regression
Figure 14.5: the plane $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ with the errors $e$ shown at ($X_1 = 30$, $X_2 = 8$) and at ($X_1 = 50$, $X_2 = 2$)
©2006 Thomson/South-Western 10
Multiple Regression Model
An estimate of $\sigma_e^2$
$s^2 = \hat{\sigma}_e^2 = \frac{SSE}{n - (k + 1)} = \frac{SSE}{n - k - 1}$
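The variance estimate above is a single division, sketched here with a hypothetical function name and illustrative numbers (SSE = 24, n = 10 observations, k = 3 predictors) that are not taken from the text.

```python
def error_variance(sse, n, k):
    """Return s^2 = SSE / (n - k - 1), the estimate of the error variance."""
    df = n - k - 1  # n observations minus (k + 1) estimated coefficients
    if df <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return sse / df


s_squared = error_variance(sse=24.0, n=10, k=3)  # 24 / 6
```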
©2006 Thomson/South-Western 11
Hypothesis Test for the Significance of the Model
$H_o$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
$H_a$: at least one of the $\beta$'s $\neq 0$
$F = \frac{MSR}{MSE}$
Reject $H_o$ if $F > F_{\alpha,\,k,\,n-k-1}$
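A short sketch of the F statistic, assuming the usual ANOVA definitions MSR = SSR / k and MSE = SSE / (n - k - 1), which the slide does not spell out. The function name and the sums of squares are illustrative; the critical value is passed in from an F table rather than computed.

```python
def f_statistic(ssr, sse, n, k):
    """Return F = MSR / MSE for a model with k predictors and n observations."""
    msr = ssr / k              # mean square due to regression
    mse = sse / (n - k - 1)    # mean square error, same as s^2
    return msr / mse


# Illustrative values: SSR = 90, SSE = 24, n = 10, k = 3
f_value = f_statistic(ssr=90.0, sse=24.0, n=10, k=3)

# Tabulated F for alpha = .05 with 3 and 6 degrees of freedom (about 4.76)
f_critical = 4.76
reject_h0 = f_value > f_critical
```

Here MSR = 30 and MSE = 4, so F = 7.5 exceeds the tabulated value and $H_o$ would be rejected for these made-up numbers.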
©2006 Thomson/South-Western 12
Associated F Curve
Figure 14.6: F curve with area $\alpha$ in the right tail beyond $F_{\alpha,\,v_1,\,v_2}$, where $H_o$ is rejected
©2006 Thomson/South-Western 13
Test for $H_o$: $\beta_i = 0$
$H_o$: $\beta_1 = 0$ ($X_1$ does not contribute)
$H_a$: $\beta_1 \neq 0$ ($X_1$ does contribute)
$H_o$: $\beta_2 = 0$ ($X_2$ does not contribute)
$H_a$: $\beta_2 \neq 0$ ($X_2$ does contribute)
$H_o$: $\beta_3 = 0$ ($X_3$ does not contribute)
$H_a$: $\beta_3 \neq 0$ ($X_3$ does contribute)
$t = \frac{b_1}{s_{b_1}}$
Reject $H_o$ if $|t| > t_{\alpha/2,\,n-k-1}$
$(1-\alpha) \cdot 100\%$ Confidence Interval
$b_i - t_{\alpha/2,\,n-k-1}\, s_{b_i}$ to $b_i + t_{\alpha/2,\,n-k-1}\, s_{b_i}$
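The t test and the confidence interval for a single coefficient can be sketched as below. The function names and the numbers (an estimate of 2.5 with standard error 1.0) are hypothetical; the critical value 2.447 is the tabulated $t_{.025}$ for 6 degrees of freedom and is supplied by hand.

```python
def t_statistic(b_i, s_b_i):
    """Return t = b_i / s_{b_i} for testing H_o: beta_i = 0."""
    return b_i / s_b_i


def coefficient_interval(b_i, s_b_i, t_critical):
    """Return the (lower, upper) limits b_i -/+ t * s_{b_i}."""
    half_width = t_critical * s_b_i
    return b_i - half_width, b_i + half_width


# Illustrative values with n - k - 1 = 6 degrees of freedom, alpha = .05
t_value = t_statistic(2.5, 1.0)
lower, upper = coefficient_interval(2.5, 1.0, t_critical=2.447)
contributes = abs(t_value) > 2.447
```

The two views agree: the interval (0.053, 4.947) excludes zero exactly when |t| exceeds the critical value.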
©2006 Thomson/South-Western 14
Housing Example
Figure 14.7
©2006 Thomson/South-Western 15
BB Investments Example
BB Investments wants to develop a model to predict the amount of money invested by various clients in their portfolio of high-risk securities
Y = Investment Amount ($)
X1 = Annual Income ($1,000s)
X2 = Economic Index, showing expected increase in interest levels, manufacturing costs, and price inflation (1-100 scale)
©2006 Thomson/South-Western 16
BB Investments Example
Figure 14.8
©2006 Thomson/South-Western 17
BB Investments Example
Figure 14.9
©2006 Thomson/South-Western 18
BB Investments Example
Figure 14.10
©2006 Thomson/South-Western 19
Coefficient of Determination
$SST$ = total sum of squares $= SS_Y = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$
$R^2 = 1 - \frac{SSE}{SST}$
$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}$
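A small sketch of the two formulas on this slide, using hypothetical function names and made-up sums of squares (SSE = 24, SST = 114, n = 10, k = 3).

```python
def r_squared(sse, sst):
    """Return R^2 = 1 - SSE / SST."""
    return 1.0 - sse / sst


def f_from_r_squared(r2, n, k):
    """Return F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


r2 = r_squared(sse=24.0, sst=114.0)       # 90 / 114, about 0.789
f_value = f_from_r_squared(r2, n=10, k=3)
```

With these numbers SSR = SST - SSE = 90, and the F value from $R^2$ equals MSR / MSE = 30 / 4 = 7.5, which shows that the two ways of writing the F statistic are the same test.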
©2006 Thomson/South-Western 20
Partial F Test
$R_c^2$ = the value of $R^2$ for the complete model
$R_r^2$ = the value of $R^2$ for the reduced model
Test statistic
$F = \frac{(R_c^2 - R_r^2) / v_1}{(1 - R_c^2) / v_2}$
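The slide does not define $v_1$ and $v_2$. The sketch below assumes the standard choice: $v_1$ is the number of predictors dropped from the complete model and $v_2 = n - 1 - (\text{number of predictors in the complete model})$. The function name and all numbers are hypothetical.

```python
def partial_f(r2_complete, r2_reduced, n, k_complete, k_reduced):
    """Return (F, v1, v2) for comparing a complete model with a reduced one."""
    v1 = k_complete - k_reduced   # assumed: number of predictors being tested
    v2 = n - k_complete - 1       # assumed: error df of the complete model
    f_value = ((r2_complete - r2_reduced) / v1) / ((1.0 - r2_complete) / v2)
    return f_value, v1, v2


# Illustrative values: 4 predictors versus 2, with n = 25 observations
f_value, v1, v2 = partial_f(0.80, 0.70, n=25, k_complete=4, k_reduced=2)
```

For these values F = (0.10 / 2) / (0.20 / 20) = 5, to be compared with the tabulated $F_{\alpha,\,2,\,20}$.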
©2006 Thomson/South-Western 21
Motormax Example
Motormax produces electric motors in home furnaces. They want to study the relationship between the dollars spent per week in inspecting finished products (X) and the number of motors produced during that week that were returned to the factory by the customer (Y)
©2006 Thomson/South-Western 22
Motormax Example
Figure 14.11
©2006 Thomson/South-Western 23
Quadratic Curves
Figure 14.12: two quadratic curves of $Y$ versus $X$, panels (a) and (b), with $X$ running from 1 to 5 (Y-axis ticks 24, 22, 16 in (a) and 24, 18, 16 in (b))
©2006 Thomson/South-Western 24
Motormax Example
Figure 14.13
©2006 Thomson/South-Western 25
Error From Extrapolation
Figure 14.14: predicted and actual curves of $Y$ versus $X$, with $X$ running from 1 to 5
©2006 Thomson/South-Western 26
Multicollinearity
Occurs when independent variables are highly correlated with each other
Often detectable through pairwise correlations readily available in statistical packages
The variance inflation factor can also be used
$VIF_j = \frac{1}{1 - R_j^2}$
Conclude severe multicollinearity exists when the maximum $VIF_j > 10$
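A sketch of the VIF rule, assuming $R_j^2$ is the $R^2$ obtained when predictor $X_j$ is regressed on the other predictors (the usual definition, not stated on the slide). The function names and the $R_j^2$ values are hypothetical.

```python
def vif(r2_j):
    """Return VIF_j = 1 / (1 - R_j^2)."""
    return 1.0 / (1.0 - r2_j)


def severe_multicollinearity(r2_values, threshold=10.0):
    """Apply the slide's rule: severe if the maximum VIF_j exceeds 10."""
    return max(vif(r2) for r2 in r2_values) > threshold


# Illustrative R_j^2 values for three predictors
mild = severe_multicollinearity([0.20, 0.50, 0.35])    # max VIF = 2
severe = severe_multicollinearity([0.20, 0.95, 0.35])  # max VIF = 20
```

The threshold of 10 corresponds to $R_j^2 = 0.90$, so the rule flags a predictor when the others explain more than 90% of its variation.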
©2006 Thomson/South-Western 27
Multicollinearity Example
Figure 14.15
©2006 Thomson/South-Western 28
Multicollinearity Example
Figure 14.16
©2006 Thomson/South-Western 29
Multicollinearity Example
Figure 14.17
©2006 Thomson/South-Western 30
Multicollinearity
The stepwise selection process can help eliminate correlated predictor variables
Other advanced procedures such as ridge regression can also be applied
Care should be taken during the model selection phase as multicollinearity can be difficult to detect and eliminate
©2006 Thomson/South-Western 31
Dummy Variables
Dummy, or indicator, variables allow for the inclusion of qualitative variables in the model
For example:
$X_1 = 1$ if female, $X_1 = 0$ if male
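A minimal sketch of the coding above, with a hypothetical function name and made-up records. Each design row holds the intercept term, the dummy $X_1$, and one quantitative predictor (annual income is used here only as an illustration).

```python
def female_indicator(gender):
    """Return X1 = 1 if female, 0 if male, following the slide's coding."""
    codes = {"female": 1, "male": 0}
    if gender not in codes:
        raise ValueError(f"unexpected category: {gender!r}")
    return codes[gender]


# Hypothetical records: (gender, annual income in $1,000s)
records = [("female", 52.0), ("male", 48.5), ("female", 61.0)]

# One row per observation: [1 for the intercept, dummy X1, income X2]
design_rows = [[1.0, female_indicator(g), income] for g, income in records]
```

With this coding, the coefficient on $X_1$ is the estimated shift in the mean of $Y$ for the category coded 1 relative to the category coded 0, holding the other predictors fixed.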
©2006 Thomson/South-Western 32
Dummy Variable Example
Figure 14.18
©2006 Thomson/South-Western 33
Stepwise Procedures
Procedures either choose or eliminate variables, one at a time, in an effort to avoid including variables that either have no predictive ability or are highly correlated with other predictor variables
Forward regression: Add one variable at a time until contribution is insignificant
Backward regression: Remove one variable at a time, starting with the "worst," until $R^2$ drops significantly
Stepwise regression: Forward regression with the ability to remove variables that become insignificant
©2006 Thomson/South-Western 34
Stepwise Regression
Figure 14.19
Include $X_3$
Include $X_6$
Include $X_2$
Include $X_5$
Remove $X_2$ (when $X_5$ was inserted into the model, $X_2$ became unnecessary)
Include $X_7$
Remove $X_7$ - it is insignificant
Stop
Final model includes $X_3$, $X_5$, and $X_6$
©2006 Thomson/South-Western 35
Checking Model Assumptions
Checking Assumption 1 - Normal distribution: Construct a histogram
Checking Assumption 2 - Constant variance: Plot residuals versus predicted Y values ($\hat{Y}$)
Checking Assumption 3 - Errors are independent: Durbin-Watson statistic
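The slide names the Durbin-Watson statistic without giving its formula. The sketch below uses the standard definition, $DW = \sum_{t=2}^{n} (e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$, applied to residuals in time order; the function name and residual values are made up.

```python
def durbin_watson(residuals):
    """Return the Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Residuals that alternate in sign (negative autocorrelation)
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])

# Residuals that never change (extreme positive autocorrelation)
dw_constant = durbin_watson([1.0, 1.0, 1.0])
```

The statistic lies between 0 and 4. Values near 2 are consistent with independent errors, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation; formal cutoffs come from Durbin-Watson tables.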
©2006 Thomson/South-Western 36
Detecting Sample Outliers
Sample leverages
Standardized residuals
Cook's distance measure
Standardized residual $= \frac{Y_i - \hat{Y}_i}{s \sqrt{1 - h_i}}$
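A one-function sketch of the standardized residual, where $h_i$ is the sample leverage of observation $i$ and $s$ is the square root of the variance estimate $s^2$. The function name and values are illustrative.

```python
import math


def standardized_residual(y, y_hat, s, h):
    """Return (Y_i - Yhat_i) / (s * sqrt(1 - h_i))."""
    return (y - y_hat) / (s * math.sqrt(1.0 - h))


# Illustrative observation: residual of 2, s = 2, leverage h = 0.75
z = standardized_residual(y=12.0, y_hat=10.0, s=2.0, h=0.75)
```

A high leverage shrinks the denominator, so the same raw residual counts for more at a high-leverage point: here a raw residual of 2 becomes a standardized residual of 2 / (2 × 0.5) = 2.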
©2006 Thomson/South-Western 37
Cook's Distance Measure
$D_i = (\text{standardized residual})^2 \cdot \frac{1}{k + 1} \cdot \frac{h_i}{1 - h_i} = \frac{(Y_i - \hat{Y}_i)^2}{(k + 1) s^2} \cdot \frac{h_i}{(1 - h_i)^2}$
Table 14.1
k        1 or 2    3 or 4    ≥ 5
D_MAX    .8        .9        1.0
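A sketch combining the Cook's distance formula with the Table 14.1 cutoffs. It assumes an observation is flagged as influential when $D_i$ exceeds $D_{MAX}$, which is how such a table is normally read but is not stated on the slide; function names and numbers are hypothetical.

```python
def cooks_distance(y, y_hat, s, h, k):
    """Return D_i = (Y_i - Yhat_i)^2 / ((k + 1) s^2) * h_i / (1 - h_i)^2."""
    return (y - y_hat) ** 2 / ((k + 1) * s ** 2) * h / (1.0 - h) ** 2


def d_max(k):
    """Return the Table 14.1 cutoff for a model with k predictors (k >= 1)."""
    if k <= 2:
        return 0.8
    if k <= 4:
        return 0.9
    return 1.0


# Illustrative observation: residual 2, s = 2, leverage 0.75, k = 3 predictors
d_i = cooks_distance(y=12.0, y_hat=10.0, s=2.0, h=0.75, k=3)
influential = d_i > d_max(3)  # assumed decision rule
```

The same value follows from the first form of the formula: the standardized residual is 2, so $D_i = 2^2 \cdot \frac{1}{4} \cdot \frac{0.75}{0.25} = 3$.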
©2006 Thomson/South-Western 38
Residual Analysis
Figure 14.20
©2006 Thomson/South-Western 39
Residual Analysis
Figure 14.21
©2006 Thomson/South-Western 40
Residual Analysis
Figure 14.22
©2006 Thomson/South-Western 41
Prediction Using Multiple Regression
Figure 14.23
©2006 Thomson/South-Western 42
Prediction Using Multiple Regression
Figure 14.24
©2006 Thomson/South-Western 43
Prediction Using Multiple Regression
Figure 14.25
©2006 Thomson/South-Western 44
Prediction Using Multiple Regression
Confidence and Prediction Intervals
$(1-\alpha) \cdot 100\%$ Confidence Interval for $\mu_{Y|X_0}$
$\hat{Y} - t_{\alpha/2,\,n-k-1}\, s_{\hat{Y}}$ to $\hat{Y} + t_{\alpha/2,\,n-k-1}\, s_{\hat{Y}}$
$(1-\alpha) \cdot 100\%$ Prediction Interval for $Y_{X_0}$
$\hat{Y} - t_{\alpha/2,\,n-k-1} \sqrt{s^2 + s_{\hat{Y}}^2}$ to $\hat{Y} + t_{\alpha/2,\,n-k-1} \sqrt{s^2 + s_{\hat{Y}}^2}$
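A sketch contrasting the two intervals. The function names are hypothetical and the inputs are round, made-up numbers ($\hat{Y} = 50$, $s = 4$, $s_{\hat{Y}} = 3$, and a t value of 2.0 chosen for easy arithmetic rather than read from a table).

```python
import math


def mean_confidence_interval(y_hat, s_y_hat, t_critical):
    """Interval for the mean of Y at X_0: Yhat -/+ t * s_Yhat."""
    half_width = t_critical * s_y_hat
    return y_hat - half_width, y_hat + half_width


def prediction_interval(y_hat, s, s_y_hat, t_critical):
    """Interval for an individual Y at X_0: Yhat -/+ t * sqrt(s^2 + s_Yhat^2)."""
    half_width = t_critical * math.sqrt(s ** 2 + s_y_hat ** 2)
    return y_hat - half_width, y_hat + half_width


ci = mean_confidence_interval(50.0, s_y_hat=3.0, t_critical=2.0)
pi = prediction_interval(50.0, s=4.0, s_y_hat=3.0, t_critical=2.0)
```

The interval for an individual value is always wider, because it adds the error variance $s^2$ to the uncertainty in the estimated mean: here 50 ± 6 for the mean versus 50 ± 10 for a single observation.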
©2006 Thomson/South-Western 45
Interaction Effects
Interaction implies that how the variables occur together has an impact on the prediction of the dependent variable
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + e$
$\mu_Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$
©2006 Thomson/South-Western 46
Interaction Effects
Figure 14.26: $\mu_Y$ versus $X_1$ (from 1 to 2, vertical axis from 10 to 60) for $X_2 = 2$ and $X_2 = 5$. Panel (a): the lines $\mu_Y = 18 + 5X_1$ and $\mu_Y = 30 - 10X_1$. Panel (b): the parallel lines $\mu_Y = 30 + 15X_1$ and $\mu_Y = 18 + 15X_1$
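The lines quoted for Figure 14.26 are consistent with a single model, $\mu_Y = 10 + 15X_1 + 4X_2 - 5X_1X_2$, evaluated at $X_2 = 2$ and $X_2 = 5$. These coefficients are inferred from the lines and are not stated on the slides. The sketch below, with a hypothetical function name, shows how the interaction coefficient changes the slope in $X_1$.

```python
def line_in_x1(x2, b0=10.0, b1=15.0, b2=4.0, b3=-5.0):
    """Return (intercept, slope) of mu_Y as a function of X1 at a fixed X2.

    mu_Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 = (b0 + b2*X2) + (b1 + b3*X2) * X1
    The default coefficients are inferred, not taken from the text.
    """
    return b0 + b2 * x2, b1 + b3 * x2


# With interaction (b3 = -5): the slope depends on X2
with_interaction = [line_in_x1(2), line_in_x1(5)]

# Without interaction (b3 = 0): the lines are parallel
without_interaction = [line_in_x1(2, b3=0.0), line_in_x1(5, b3=0.0)]
```

Under these assumed coefficients the interaction model gives 18 + 5$X_1$ and 30 - 10$X_1$, while dropping the interaction term gives 18 + 15$X_1$ and 30 + 15$X_1$.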
©2006 Thomson/South-Western 47
Quadratic and Second-Order Models
Quadratic Effects
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + e$
Complete Second-Order Models
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \beta_4 X_1^2 + \beta_5 X_2^2 + e$
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_1 X_2 + \beta_5 X_1 X_3 + \beta_6 X_2 X_3 + \beta_7 X_1^2 + \beta_8 X_2^2 + \beta_9 X_3^2 + e$
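A sketch of how the predictor columns of a complete second-order model can be generated from the original variables. The function name is hypothetical; the output order (linear terms, then cross products, then squares) follows the order of the terms in the models on this slide.

```python
from itertools import combinations


def second_order_terms(x):
    """Return the linear terms, all pairwise products, and all squares of x."""
    linear = list(x)
    cross_products = [a * b for a, b in combinations(x, 2)]
    squares = [value * value for value in x]
    return linear + cross_products + squares


# Three predictors X1 = 1, X2 = 2, X3 = 3 give nine terms:
# X1, X2, X3, X1*X2, X1*X3, X2*X3, X1^2, X2^2, X3^2
terms = second_order_terms([1, 2, 3])
```

With two predictors the function returns five terms and with three predictors nine, matching the number of coefficients after $\beta_0$ in the two complete second-order models.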
©2006 Thomson/South-Western 48
Financial Example
Figure 14.27
©2006 Thomson/South-Western 49
Financial Example
Figure 14.28
©2006 Thomson/South-Western 50
Financial Example
Figure 14.29
©2006 Thomson/South-Western 51
Financial Example
Figure 14.30