regression analysis
ASSIGNMENT
Centre for Management Studies University of Petroleum & Energy Studies
Submitted to: Prof. I. Krishnamurthy
Submitted by: Richa Pandey, MBA (Avm), R120108036
Dataset
Consumption  Income  Liquid Asset
      220.0   238.1         182.7
      222.7   240.9         183.0
      223.8   245.8         184.4
      230.2   248.8         187.0
      234.0   253.3         189.4
      236.2   256.1         192.2
      236.0   255.9         193.8
      234.1   255.9         194.8
      233.4   254.4         197.3
      236.4   254.8         197.0
      239.0   257.0         200.3
      243.2   260.9         204.2
      248.7   263.0         207.6
      253.7   271.5         209.4
      259.9   276.5         211.1
      261.8   281.4         213.2
      263.2   282.0         214.1
      263.7   286.2         216.5
      263.4   287.7         217.3
      266.9   291.0         217.3
      268.9   291.1         218.2
      270.4   294.6         218.5
      273.4   296.1         219.8
      272.1   293.3         219.5
      268.9   291.3         220.5
      270.9   292.6         222.7
      274.4   299.9         225.0
      278.7   302.1         229.4
      283.8   305.9         232.2
      289.7   312.5         235.2
      290.8   311.3         237.2
      292.8   313.2         237.7
      295.4   315.4         238.0
      299.5   320.3         238.4
      298.6   321.0         240.1
      299.6   320.1         243.3
      297.0   318.4         246.1
      301.6   324.8         250.0
Solution
Regressing Consumption (C1) on Income (C2) and liquid asset (C3) for the given data set, we get:
Regression Analysis: C1 versus C2, C3
The regression equation is
C1 = - 10.6 + 0.682 C2 + 0.373 C3
Predictor      Coef  SE Coef      T      P
Constant    -10.627    3.273  -3.25  0.003
C2          0.68166  0.07098   9.60  0.000
C3          0.37252  0.09656   3.86  0.000
S = 1.76348 R-Sq = 99.5% R-Sq(adj) = 99.5%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       2  23165  11583  3724.45  0.000
Residual Error  35    109      3
Total           37  23274

Source  DF  Seq SS
C2       1   23119
C3       1      46
(i) The regression model is: Consumption = -10.6 + 0.682 Income + 0.373 liquid asset.
(ii) R-squared = 99.5%. Because R-squared is very high, multicollinearity may be present, although a high R-squared alone does not establish it.
(iii) The standard errors are not high: 3.273 for the constant, 0.07098 for Income and 0.09656 for liquid asset. Nothing can therefore be concluded from this observation.
(iv) The t-values for Income (C2) and liquid asset (C3) are 9.60 and 3.86 respectively. Both are greater than 2, so both coefficients are statistically significant. Significant t-values alone do not rule out multicollinearity, so the problem may still exist and further checks are needed.
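The Minitab figures discussed in (i) to (iv) can be cross-checked with a short least-squares fit. The sketch below is only an illustration, assuming NumPy is available; the helper name fit_ols and the array name DATA are hypothetical, and the rows are transcribed from the Dataset section.

```python
import numpy as np

# Rows transcribed from the Dataset section:
# Consumption (C1), Income (C2), Liquid Asset (C3).
DATA = np.array([
    [220.0, 238.1, 182.7], [222.7, 240.9, 183.0],
    [223.8, 245.8, 184.4], [230.2, 248.8, 187.0],
    [234.0, 253.3, 189.4], [236.2, 256.1, 192.2],
    [236.0, 255.9, 193.8], [234.1, 255.9, 194.8],
    [233.4, 254.4, 197.3], [236.4, 254.8, 197.0],
    [239.0, 257.0, 200.3], [243.2, 260.9, 204.2],
    [248.7, 263.0, 207.6], [253.7, 271.5, 209.4],
    [259.9, 276.5, 211.1], [261.8, 281.4, 213.2],
    [263.2, 282.0, 214.1], [263.7, 286.2, 216.5],
    [263.4, 287.7, 217.3], [266.9, 291.0, 217.3],
    [268.9, 291.1, 218.2], [270.4, 294.6, 218.5],
    [273.4, 296.1, 219.8], [272.1, 293.3, 219.5],
    [268.9, 291.3, 220.5], [270.9, 292.6, 222.7],
    [274.4, 299.9, 225.0], [278.7, 302.1, 229.4],
    [283.8, 305.9, 232.2], [289.7, 312.5, 235.2],
    [290.8, 311.3, 237.2], [292.8, 313.2, 237.7],
    [295.4, 315.4, 238.0], [299.5, 320.3, 238.4],
    [298.6, 321.0, 240.1], [299.6, 320.1, 243.3],
    [297.0, 318.4, 246.1], [301.6, 324.8, 250.0],
])


def fit_ols(y, X):
    """Least squares with an intercept: coefficients, standard errors, t-values, R-squared."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])            # design matrix with a constant column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)    # least-squares coefficients
    resid = y - A @ coef
    dof = n - A.shape[1]                            # residual degrees of freedom
    s2 = resid @ resid / dof                        # residual variance (S squared)
    se = np.sqrt(np.diag(s2 * np.linalg.inv(A.T @ A)))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, se, coef / se, r2


coef, se, t, r2 = fit_ols(DATA[:, 0], DATA[:, 1:])
for name, c, s, tv in zip(["Constant", "Income", "Liquid asset"], coef, se, t):
    print(f"{name:<13}{c:10.4f}{s:10.4f}{tv:8.2f}")
print(f"R-Sq = {100 * r2:.1f}%")
```

Comparing the printed coefficients, standard errors and t-values with the Minitab table is a quick way to confirm that the data were keyed in correctly.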
Correlations: C2, C3
Pearson correlation of C2 and C3 = 0.988
P-Value = 0.000
The correlation between Income and liquid asset is very high, at 0.988. This is an indication that multicollinearity exists.
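The correlation between the two predictors can be recomputed directly from the data. This is a minimal sketch, assuming NumPy; the array names income and liquid are hypothetical and the values are transcribed from the Income and Liquid Asset columns of the Dataset section.

```python
import numpy as np

# Income (C2) and Liquid Asset (C3), transcribed from the Dataset section.
income = np.array([
    238.1, 240.9, 245.8, 248.8, 253.3, 256.1, 255.9, 255.9, 254.4, 254.8,
    257.0, 260.9, 263.0, 271.5, 276.5, 281.4, 282.0, 286.2, 287.7, 291.0,
    291.1, 294.6, 296.1, 293.3, 291.3, 292.6, 299.9, 302.1, 305.9, 312.5,
    311.3, 313.2, 315.4, 320.3, 321.0, 320.1, 318.4, 324.8,
])
liquid = np.array([
    182.7, 183.0, 184.4, 187.0, 189.4, 192.2, 193.8, 194.8, 197.3, 197.0,
    200.3, 204.2, 207.6, 209.4, 211.1, 213.2, 214.1, 216.5, 217.3, 217.3,
    218.2, 218.5, 219.8, 219.5, 220.5, 222.7, 225.0, 229.4, 232.2, 235.2,
    237.2, 237.7, 238.0, 238.4, 240.1, 243.3, 246.1, 250.0,
])

# Off-diagonal entry of the 2x2 correlation matrix is the Pearson r.
r = np.corrcoef(income, liquid)[0, 1]
print(f"Pearson correlation of C2 and C3 = {r:.3f}")
print(f"r squared = {r * r:.4f}")
```

With a single regressor, the square of this correlation equals the R-squared of a regression of one predictor on the other.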
After dropping the last observation, the results are as follows:
Regression Analysis: C1 versus C2, C3
The regression equation is
C1 = - 12.2 + 0.657 C2 + 0.413 C3
Predictor      Coef  SE Coef      T      P
Constant    -12.185    3.396  -3.59  0.001
C2          0.65690  0.07193   9.13  0.000
C3          0.41272  0.09902   4.17  0.000
S = 1.73621 R-Sq = 99.5% R-Sq(adj) = 99.5%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       2  21647  10824  3590.60  0.000
Residual Error  34    102      3
Total           36  21750

Source  DF  Seq SS
C2       1   21595
C3       1      52
After dropping the last observation, the new regression model is
Consumption = -12.185 + 0.657 Income + 0.413 liquid asset.
The coefficients do not change much after dropping an observation (Income moves from 0.682 to 0.657 and liquid asset from 0.373 to 0.413), so we cannot conclude anything from this observation.
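One way to put "not much change" on a scale is to express each shift in units of the full-sample standard error. This yardstick is an assumption made for illustration and is not part of the original assignment; the dictionary names full and dropped are hypothetical, and the numbers are copied from the two Minitab outputs.

```python
# (coefficient, standard error) from the regression on all 38 observations.
full = {
    "Constant": (-10.627, 3.273),
    "Income": (0.68166, 0.07098),
    "Liquid asset": (0.37252, 0.09656),
}
# Coefficients after dropping the last observation.
dropped = {"Constant": -12.185, "Income": 0.65690, "Liquid asset": 0.41272}

shifts = {}
for name, (coef, se) in full.items():
    change = dropped[name] - coef
    shifts[name] = change / se  # shift measured in standard errors
    print(f"{name:<13} change = {change:+.5f}  ({shifts[name]:+.2f} standard errors)")
```

Every shift is smaller than one standard error in absolute value, which is consistent with the reading that this check is inconclusive.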
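Variance Inflation Factor (VIF):
VIF(Income) = 1 / (1 - R_i^2), where R_i^2 is the R-squared obtained by regressing Income (C2) on the other predictor, liquid asset (C3).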
Regression Analysis: C2 versus C3
The regression equation is
C2 = - 5.64 + 1.34 C3
Predictor     Coef  SE Coef      T      P
Constant    -5.638    7.628  -0.74  0.465
C3         1.34394  0.03528  38.09  0.000
S = 4.14102 R-Sq = 97.6% R-Sq(adj) = 97.5%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       1  24884  24884  1451.15  0.000
Residual Error  36    617     17
Total           37  25502

Unusual Observations
Obs   C3       C2      Fit  SE Fit  Residual  St Resid
 13  208  263.000  273.364   0.726   -10.364    -2.54R
R denotes an observation with a large standardized residual.
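VIF = 1 / (1 - 0.9758) = 41.322
When the VIF of a predictor is more than 10, the predictors are considered highly correlated. Here the VIF of 41.322 is well above 10, so Income and liquid asset are highly collinear.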
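The VIF arithmetic and the rule of ten can be written out as a small sketch. The function name vif is hypothetical, and the value 0.9758 is an assumption: it is the four-decimal R-squared that reproduces the reported VIF, whereas Minitab prints it rounded as 97.6%.

```python
def vif(r_squared_aux):
    """VIF = 1 / (1 - R_i^2), where R_i^2 comes from regressing one predictor on the others."""
    if not 0.0 <= r_squared_aux < 1.0:
        raise ValueError("R-squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared_aux)


# Assumed four-decimal R-squared of Income (C2) regressed on liquid asset (C3).
r2_income_on_liquid = 0.9758
value = vif(r2_income_on_liquid)
print(f"VIF = {value:.3f}")
print("highly correlated" if value > 10 else "not flagged by the rule of ten")
```

With only two predictors the auxiliary R-squared is the same in both directions, so Income and liquid asset share the same VIF.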
Theil's measure
Excluding Income
Regression Analysis: C1 versus C3
The regression equation is
C1 = - 14.5 + 1.29 C3
Predictor      Coef  SE Coef      T      P
Constant    -14.470    6.107  -2.37  0.023
C3          1.28863  0.02825  45.62  0.000
S = 3.31535 R-Sq = 98.3% R-Sq(adj) = 98.3%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       1  22878  22878  2081.45  0.000
Residual Error  36    396     11
Total           37  23274

Unusual Observations
Obs   C3       C1      Fit  SE Fit  Residual  St Resid
 34  238  299.500  292.739   0.844     6.761     2.11R
R denotes an observation with a large standardized residual.
Excluding liquid asset
Regression Analysis: C1 versus C2
The regression equation is
C1 = - 7.16 + 0.952 C2
Predictor     Coef  SE Coef      T      P
Constant    -7.160    3.705  -1.93  0.061
C2         0.95213  0.01300  73.25  0.000
S = 2.07583 R-Sq = 99.3% R-Sq(adj) = 99.3%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       1  23119  23119  5365.14  0.000
Residual Error  36    155      4
Total           37  23274

Unusual Observations
Obs   C2       C1      Fit  SE Fit  Residual  St Resid
 13  263  248.700  243.252   0.432     5.448     2.68R
R denotes an observation with a large standardized residual.
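Theil's measure is m = R^2 - sum over predictors of (R^2 - R^2 with that predictor excluded). Here R^2 = 0.9953 for the full model, 0.9829 when Income is excluded and 0.9933 when liquid asset is excluded.
m = 0.9953 - ((0.9953 - 0.9829) + (0.9953 - 0.9933))
  = 0.9953 - (0.0124 + 0.0020)
  = 0.9809
Since m is not equal to zero, we can say that multicollinearity exists.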
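The calculation of m can be reproduced with a few lines of code. This is a minimal sketch; the function name theil_measure is hypothetical, and the R-squared values are the four-decimal figures used in the calculation above.

```python
def theil_measure(r2_full, r2_without):
    """m = R^2 minus the sum of each predictor's incremental contribution to R^2."""
    return r2_full - sum(r2_full - r2 for r2 in r2_without.values())


r2_full = 0.9953                 # Consumption on Income and liquid asset
r2_without = {
    "Income": 0.9829,            # Consumption on liquid asset only
    "Liquid asset": 0.9933,      # Consumption on Income only
}

m = theil_measure(r2_full, r2_without)
print(f"m = {m:.4f}")
```

Each predictor adds very little once the other is already in the model (0.0124 and 0.0020), which is why m stays so close to the full R-squared.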