etc3250: regression - monba.dicook.org · note: if you fall for the dummy variable trap, is a...
TRANSCRIPT
ETC3250: RegressionSemester 1, 2019 Professor Di Cook Econometrics and Business Statistics Monash University
Week 2 (a)
Outline
� Multipleregression
ǡModel
� Each is numerical and is called a predictor.
� The coef�cients measure the effect of eachpredictor after taking account of the effect of all otherpredictors in the model.
� Predictors may be transforms of other predictors. e.g., .
� The model describes a line, plane or hyperplane in thepredictor space.
1 / 21
Outline
� Multipleregression
ǡModel� The coef�cients measure the effect of eachpredictor after taking account of the effect of all otherpredictors in the model.
� Each is numerical and is called a predictor.
� Predictors may be transforms of other predictors. e.g., .
� The model describes a line, plane or hyperplane in thepredictor space.
1 / 21
Outline
� Multipleregression
ǡModel
� Predictors may be transforms of other predictors. e.g., .
� Each is numerical and is called a predictor.
� The coef�cients measure the effect of eachpredictor after taking account of the effect of all otherpredictors in the model.
� The model describes a line, plane or hyperplane in thepredictor space.
1 / 21
Outline
� Multipleregression
ǡModel
� The model describes a line, plane or hyperplane in thepredictor space.
� Each is numerical and is called a predictor.
� The coef�cients measure the effect of eachpredictor after taking account of the effect of all otherpredictors in the model.
� Predictors may be transforms of other predictors. e.g., .
1 / 21
Outline
� Multipleregression
ǡModel
(Chapter3/3.1.pdf)
2 / 21
Outline
� Multipleregression
ǡModel
(Chapter3/3.5.pdf)
3 / 21
Outline
� Multipleregression
ǡModelǡCategoricalvariables
Qualitative variables need to be converted to numeric.
which would result in the model
These are called dummy variables.
4 / 21
Outline
� Multipleregression
ǡModelǡCategoricalvariables
More than two categories
which would result in the model
These are called dummy variables.
5 / 21
Outline
� Multipleregression
ǡModelǡCategoricalvariablesǡOLS
Ordinary least squares is the simplest way to �t the model.Geometrically, this is the sum of the squared distances,parallel to the axis of the dependent variable, betweeneach observed data point and the corresponding point onthe regression surface – the smaller the sum ofdifferences, the better the model �ts the data.
6 / 21
Outline
� Multipleregression
ǡModelǡCategoricalvariablesǡOLSǡDiagnostics
is the proportion of variation explained by the model,and measures the goodness of the �t, close to 1 the modelexplains most of the variability in , close to 0 it explainsvery little.
where (read: Residual Sum of Squares), and (read: Total Sum of Squares).
7 / 21
Outline
� Multipleregression
ǡModelǡCategoricalvariablesǡOLSǡDiagnostics
Residual Standard Error (RSE) is an estimate of thestandard deviation of . This is meaningful with theassumption that .
8 / 21
Outline
� Multipleregression
ǡModelǡCategoricalvariablesǡOLSǡDiagnostics
F statistic tests whether any predictor explains response,by testing
vs at least one is not 0
9 / 21
Outline
� Multipleregression
ǡModelǡCategoricalvariablesǡOLSǡDiagnosticsǡThink about
� Is at least one of the predictors useful in predicting theresponse?� Do all the predictors help to explain , or is only asubset of the predictors useful?� How well does the model �t the data?� Given a set of predictor values, what response valueshould we predict and how accurate is our prediction?
10 / 21
Outline
� Multipleregression� Example
Wage and other data for a group of 3000 male workers inthe Mid-Atlantic region. Interested in predicting wagebased on worker characteristics.
## Observations: 3,000## Variables: 11## $ year <int> 2006, 2004, 2003, 2003, 2005, 2008, 2009, 2008, 2## $ age <int> 18, 24, 45, 43, 50, 54, 44, 30, 41, 52, 45, 34, 3## $ maritl <fct> 1. Never Married, 1. Never Married, 2. Married, 2## $ race <fct> 1. White, 1. White, 1. White, 3. Asian, 1. White## $ education <fct> 1. < HS Grad, 4. College Grad, 3. Some College, 4## $ region <fct> 2. Middle Atlantic, 2. Middle Atlantic, 2. Middle## $ jobclass <fct> 1. Industrial, 2. Information, 1. Industrial, 2. ## $ health <fct> 1. <=Good, 2. >=Very Good, 1. <=Good, 2. >=Very G## $ health_ins <fct> 2. No, 2. No, 1. Yes, 1. Yes, 1. Yes, 1. Yes, 1. ## $ logwage <dbl> 4.318063, 4.255273, 4.875061, 5.041393, 4.318063## $ wage <dbl> 75.04315, 70.47602, 130.98218, 154.68529, 75.0431
11 / 21
Outline
� Multipleregression� Example
ǡTake a look
12 / 21
Outline
� Multipleregression� Example
ǡTake a lookǡTransform
13 / 21
Outline
� Multipleregression� Example
ǡTake a lookǡTransformǡModel
Proposed model
where log Wage, Year information collected, , Education.
14 / 21
Outline
� Multipleregression� Example
ǡTake a lookǡTransformǡModel
lm(formula = logwage ~ year + age + education, data = Wage)
Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -1.745e+01 5.469e+00 -3.191 0.00143 year 1.078e-02 2.727e-03 3.952 7.93e-05 age 5.509e-03 4.813e-04 11.447 < 2e-16 education2. HS Grad 1.202e-01 2.086e-02 5.762 9.18e-09 education3. Some College 2.440e-01 2.195e-02 11.115 < 2e-16 education4. College Grad 3.680e-01 2.178e-02 16.894 < 2e-16 education5. Advanced Degree 5.411e-01 2.362e-02 22.909 < 2e-16
Residual standard error: 0.3023 on 2993 degrees of freedomMultiple R-squared: 0.2631, Adjusted R-squared: 0.2616 F-statistic: 178.1 on 6 and 2993 DF, p-value: < 2.2e-16
15 / 21
Outline
� Multipleregression� Example� Modelling
ǡ Interpretation
� The ideal scenario is when the predictors areuncorrelated.
ǡEach coef�cient can be interpreted and testedseparately.
� Correlations amongst predictors cause problems.ǡThe variance of all coef�cients tends to increase,sometimes dramatically.ǡ Interpretations become hazardous -- when changes, everything else changes.ǡPredictions still work provided new values arewithin the range of training values.
� Claims of causality should be avoided for observationaldata.
16 / 21
Outline
� Multipleregression� Example� Modelling
ǡ Interpretationǡ Interactions
� An interaction occurs when the one variable changesthe effect of a second variable. (e.g., spending on radioadvertising increases the effectiveness of TV advertising).� To model an interaction, include the product in themodel in addition to and .� Hierarchy principleHierarchy principle: If we include an interaction in amodel, we should also include the main effects, even if thep-values associated with their coef�cients are notsigni�cant. (This is because the interactions are almostimpossible to interpret without the main effects.)
17 / 21
Outline
� Multipleregression� Example� Modelling
ǡ Interpretationǡ Interactions
(Chapter3/3.7.pdf)
18 / 21
Outline
� Multipleregression� Example� Modelling
ǡ Interpretationǡ InteractionsǡResiduals
� If a plot of the residuals vs any predictor in the modelshows a pattern, then the relationship is nonlinear.
� If a plot of the residuals vs any predictor notnot in themodel shows a pattern, then the predictor should beadded to the model.
� If a plot of the residuals vs �tted values shows apattern, then there is heteroscedasticity in the errors.(Could try a transformation.)
18 / 21
Outline
� Multipleregression� Example� Modelling
ǡ Interpretationǡ InteractionsǡResiduals
� If a plot of the residuals vs any predictor in the modelshows a pattern, then the relationship is nonlinear.
� If a plot of the residuals vs any predictor notnot in themodel shows a pattern, then the predictor should beadded to the model.
� If a plot of the residuals vs �tted values shows apattern, then there is heteroscedasticity in the errors.(Could try a transformation.)
18 / 21
Outline
� Multipleregression� Example� Modelling
ǡ Interpretationǡ InteractionsǡResiduals
� If a plot of the residuals vs any predictor in the modelshows a pattern, then the relationship is nonlinear.
� If a plot of the residuals vs any predictor notnot in themodel shows a pattern, then the predictor should beadded to the model.
� If a plot of the residuals vs �tted values shows apattern, then there is heteroscedasticity in the errors.(Could try a transformation.)
18 / 21
Outline
� Multipleregression� Example� Modelling
ǡ Interpretationǡ InteractionsǡResiduals
(Chapter3/3.9.pdf)
19 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModel
Let , , and
Then
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimation
Least squares estimationLeast squares estimation
Minimize:
Differentiate wrt and equal to zero gives
(The "normal equation".)
Note:Note: If you fall for the dummy variable trap, is asingular matrix.
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimation
Least squares estimationLeast squares estimation
Minimize:
Differentiate wrt and equal to zero gives
(The "normal equation".)
Note:Note: If you fall for the dummy variable trap, is asingular matrix.
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimation
Least squares estimationLeast squares estimation
Minimize:
Differentiate wrt and equal to zero gives
(The "normal equation".)
Note:Note: If you fall for the dummy variable trap, is asingular matrix.
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimationǡ Likelihood
If the errors are iid and normally distributed, then
So the likelihood is
which is maximized when is minimized.
So MLE OLS.
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimationǡ Likelihood
If the errors are iid and normally distributed, then
So the likelihood is
which is maximized when is minimized.
So MLE OLS.
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimationǡ LikelihoodǡPredictions
Optimal predictionsOptimal predictions
where is a row vector containing the values of theregressors for the predictions (in the same format as ).
Prediction variancePrediction variance
� This ignores any errors in .� 95% prediction intervals assuming normal errors:
.
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimationǡ LikelihoodǡPredictions
Optimal predictionsOptimal predictions
where is a row vector containing the values of theregressors for the predictions (in the same format as ).
Prediction variancePrediction variance
� This ignores any errors in .� 95% prediction intervals assuming normal errors:
.
20 / 21
Outline
� Multipleregression� Example� Modelling� Matrixformulation
ǡModelǡEstimationǡ LikelihoodǡPredictions
Optimal predictionsOptimal predictions
where is a row vector containing the values of theregressors for the predictions (in the same format as ).
Prediction variancePrediction variance
� This ignores any errors in .� 95% prediction intervals assuming normal errors:
.
20 / 21
Made by a human with a computerSlides at https://monba.dicook.org.
Code and data athttps://github.com/dicook/Business_Analytics.
Created using R Markdown with �air by xaringanxaringan, andkunoichikunoichi (female ninja) style.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
21 / 21