multiple regression analysis & multicolinearity by humayun yousaf - hassaan wasti


Upload: hassan-wasti

Post on 25-Dec-2015


INSTITUTE OF BUSINESS ADMINISTRATION KARACHI

MULTIPLE REGRESSION ANALYSIS & MULTICOLLINEARITY

QMDM TERM PAPER

HUMAYUN YOUSAF ERP: 06670

SYED HASSAN MAHMOOD WASTI ERP :06668

Submitted to:

Dr. Abdus Salam


Institute of Business Administration Karachi, December 28th, 2014 Page 1

Table of Contents

1. ABSTRACT

2. MULTIPLE REGRESSION FRAMEWORK

3. APPLICATION OF SPSS

4. MULTICOLLINEARITY IN REGRESSION ANALYSIS

5. APPLICATION OF TECHNIQUE USING E-VIEWS

6. APPLICATION OF TECHNIQUE USING EXCEL

7. MAIN FINDINGS

8. CONCLUSION AND IMPLICATIONS

9. REFERENCES


1. ABSTRACT

The aim of this study is to strengthen our understanding of the statistical technique of multiple regression analysis and of the impact of multicollinearity on such analysis. The term paper uses a real-life example: the price of wheat in Pakistan and the factors affecting it. Results have been obtained using the SPSS, E-VIEWS, and MS-EXCEL software packages.

2. MULTIPLE REGRESSION FRAMEWORK

Multiple regression is used to explore the relationship between one dependent variable and several independent (predictor) variables.

Assumptions:

The following assumptions are made when running a linear regression analysis.

Normality

The residuals (error terms) of the model follow a Normal Distribution.

Homoscedasticity

The variance of the residuals is constant across all values of the predictor variables.

Linearity

The relationship between the dependent variable and each independent variable is linear.

Independent predictor variables

No predictor variable is an exact linear combination of the others, and ideally the predictors are only weakly correlated with each other.


Limitations:

The major conceptual limitation of all regression techniques is that we can

only ascertain relationships, but never be sure about the underlying causal

mechanism.

Objective (Based on Problem):

To find the effect on the price of wheat in Pakistan of variation in the Price of Rice, the Price of Fertilizer and the Wheat Price Index.

Hypothesis:

Null Hypothesis:

The independent variables have no effect on the dependent variable,

i.e. β1 = 0, β2 = 0, β3 = 0

Alternate Hypothesis:

At least one of the independent variables has an effect on the dependent variable,

i.e. at least one of β1, β2, β3 ≠ 0

POW = f(POR, POF,WPI)

So, the Estimation equation will become,

POW = β0 + β1*(POR) + β2*(POF) + β3*(WPI) + ε

Where,

POW = Price of Wheat (Rs/40 kg), dependent variable

POR = Price of Rice (Rs/40 kg), independent variable

POF = Price of Fertilizer (Rs/40 kg), independent variable

WPI = Wheat Price Index (IGC and FAO Asian Wheat Price Indicator), independent variable


Expected Signs of coefficients are as follows:

β1 > 0

β2 > 0

β3 > 0
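To make the estimation equation concrete, the following is a minimal sketch of how β0 to β3 can be obtained by ordinary least squares from the normal equations. The data rows and the "true" coefficients are hypothetical values invented for illustration (the paper's raw data set is not reproduced here), and the helper names `solve_linear_system` and `fit_ols` are assumptions of this sketch, not functions from SPSS, E-VIEWS or EXCEL.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b]; copied so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] that minimise the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical (POR, POF, WPI) rows, NOT the paper's data set.
predictors = [
    (400, 350, 90), (450, 420, 85), (520, 400, 100), (610, 600, 95),
    (700, 650, 120), (900, 800, 110), (1200, 1500, 160), (1600, 1300, 200),
]
true_beta = [25.0, 0.3, 0.003, 1.4]  # illustrative values only
pow_values = [
    true_beta[0] + true_beta[1] * por + true_beta[2] * pof + true_beta[3] * wpi
    for por, pof, wpi in predictors
]

beta_hat = fit_ols(predictors, pow_values)
print([round(b, 4) for b in beta_hat])
```

Because the hypothetical POW values are generated without an error term, the fit should recover the illustrative coefficients; with real data the residuals would be non-zero and the estimates would carry standard errors.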

3. APPLICATION OF SPSS

SPSS was used to run the analysis on annual scale data for 2000 to 2010 (11 observations) obtained from the sources listed in the references, and the following observations were made.

R square

It summarizes the proportion of variance in the dependent variable that is explained by the independent variables. Ideally it should be close to 1. In our case the value is 0.998.

Durbin Watson

The Durbin-Watson statistic is a test statistic used to detect the presence of first-order autocorrelation in the residuals. It ranges from 0 to 4, and a value near 2 indicates no autocorrelation. Our value is 2.9, which suggests that negative autocorrelation exists.
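The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows that calculation; the two residual series are hypothetical and are only meant to show which way the statistic moves, and `durbin_watson` is a name chosen for this example.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Hypothetical residuals that flip sign every period (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0]
# Hypothetical residuals that stay on one side (positive autocorrelation).
drifting = [1.0, 1.2, 0.9, 1.1]

print(durbin_watson(alternating))  # above 2
print(durbin_watson(drifting))     # below 2
```

A value above 2, such as the 2.9 reported here, arises when successive residuals tend to have opposite signs.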


Std Error of Estimate

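This represents the typical distance by which the observed values fall from the regression line, measured in the units of the dependent variable. Lower values are better.

T Statistic

In multiple regression analysis the t statistic tests the hypothesis that a population regression coefficient is 0. It is computed as the estimated coefficient divided by its standard error.

Significance (P Value)

A coefficient is considered statistically significant at the 5% level when its p-value is less than 0.05.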


4. MULTICOLLINEARITY IN REGRESSION ANALYSIS

When high correlations among the explanatory variables lead to erratic point estimates of the coefficients, large standard errors and unsatisfactorily low t statistics, the regression is said to be suffering from multicollinearity.

Checks of Multicollinearity in SPSS:

Tolerance

It is the proportion of variance in an independent variable that cannot be accounted for by the other independent variables, i.e. 1 - R² from regressing that variable on the others. A value less than 0.1 requires further investigation.

VIF (Variance Inflation Factor)

VIF is 1 / Tolerance, and a value greater than 10 requires further investigation.

Eigenvalue

Several eigenvalues close to 0 indicate that the independent variables are highly intercorrelated.

Condition Index

The condition indices are computed as the square roots of the ratios of the largest eigenvalue to each successive eigenvalue. Values greater than 15 indicate a possible problem with collinearity.

Measures for reducing Multicollinearity:

Reduce the population variance of the disturbance term, for example by bringing more relevant variables into the model

Increase the number of observations

Combine the correlated variables

Drop some of the correlated variables


Correlation Matrix:

Box Plot of Variables used in Multiple Regression Analysis:


5. Application of Technique Using E-VIEWS

Dependent Variable: POW

Method: Least Squares

Date:15/12/14 Time: 20:13

Sample (adjusted): 2000 2010

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          26.33039      16.79167     1.568063      0.1609
POR        0.290108      0.018394     15.77183      0.0000
POF        0.002785      0.000696     4.004490      0.0052
WPI        1.398779      0.226640     6.171805      0.0005

R-squared            0.997910    Mean dependent var      542.2727
Adjusted R-squared   0.997014    S.D. dependent var      277.2757
S.E. of regression   15.15175    Akaike info criterion   8.549397
Sum squared resid    1607.030    Schwarz criterion       8.694087
Log likelihood       -43.02169   Hannan-Quinn criter.    8.458191
F-statistic          1113.955    Durbin-Watson stat      2.901714
Prob (F-statistic)   0.000000
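Several figures in this output follow from one another through standard formulas, which makes a quick consistency check possible. The sketch below recomputes the t-statistics, the adjusted R-squared, the F-statistic and the standard error of regression from the other reported values; the variable names are chosen for this example, and small differences from the printed output are expected because the inputs are rounded.

```python
import math

n_obs = 11          # observations reported in the output
n_predictors = 3    # POR, POF, WPI
r_squared = 0.997910
sum_squared_resid = 1607.030

# (coefficient, standard error) pairs copied from the output table.
estimates = {
    "C": (26.33039, 16.79167),
    "POR": (0.290108, 0.018394),
    "POF": (0.002785, 0.000696),
    "WPI": (1.398779, 0.226640),
}

# t-statistic = coefficient / standard error
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

df_resid = n_obs - n_predictors - 1  # residual degrees of freedom
adj_r_squared = 1.0 - (1.0 - r_squared) * (n_obs - 1) / df_resid
f_stat = (r_squared / n_predictors) / ((1.0 - r_squared) / df_resid)
se_regression = math.sqrt(sum_squared_resid / df_resid)

for name, t in t_stats.items():
    print(f"t({name}) = {t:.3f}")
print(f"adjusted R-squared = {adj_r_squared:.6f}")
print(f"F-statistic        = {f_stat:.1f}")
print(f"S.E. of regression = {se_regression:.4f}")
```

With 7 residual degrees of freedom, the recomputed values should agree with the printed t-statistics, adjusted R-squared, F-statistic and S.E. of regression to within rounding.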

DESCRIPTIVE STATISTICS

CORRELATION MATRIX:

       POW        POR        POF        WPI
POW    1.000000   0.986789   0.902957   0.936992
POR    0.986789   1.000000   0.855768   0.883460
POF    0.902957   0.855768   1.000000   0.839871
WPI    0.936992   0.883460   0.839871   1.000000

CORRELATION MATRIX

The correlation matrix shows the pairwise correlation of each independent variable with the dependent variable and with the other independent variables.

CONCLUSION FROM DATA:

The correlation matrix shows that all the independent variables, i.e. Price of Rice, Price of Fertilizer and Wheat Price Index, are strongly correlated with the dependent variable and among themselves.

The highest correlation, 98.68%, exists between Price of Wheat and Price of Rice.

The correlation between Price of Wheat and Price of Fertilizer is 90.30%.

The correlation between Price of Wheat and Wheat Price Index is 93.70%.

The weakest correlation in the matrix, 83.99%, exists between Price of Fertilizer and Wheat Price Index. This can be explained by the supply-side factors that drive the price of fertilizer being different from those that drive international wheat prices.
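The pairwise correlations among the three predictors can be turned into the tolerance and VIF measures described in Section 4, because the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictor correlation matrix. The sketch below does this with the POR, POF and WPI correlations from the matrix above; `invert_matrix` is a small helper written for this example, and the result is an illustration derived from the rounded correlations, not the SPSS collinearity output itself.

```python
def invert_matrix(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment with the identity matrix: [matrix | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: perfect collinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


names = ["POR", "POF", "WPI"]
# Predictor-only block of the correlation matrix reported above.
predictor_corr = [
    [1.000000, 0.855768, 0.883460],
    [0.855768, 1.000000, 0.839871],
    [0.883460, 0.839871, 1.000000],
]

inverse = invert_matrix(predictor_corr)
vif = {name: inverse[i][i] for i, name in enumerate(names)}
tolerance = {name: 1.0 / value for name, value in vif.items()}

for name in names:
    print(f"{name}: VIF = {vif[name]:.2f}, tolerance = {tolerance[name]:.3f}")
```

On these correlations the three VIFs come out between roughly 4 and 6, below the threshold of 10, and the tolerances stay above 0.1, even though the predictors are clearly correlated with one another.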

DESCRIPTIVE STATISTICS:

               POW        POR        POF         WPI
Mean           542.2727   942.7273   856.36      130.7000
Median         415.0000   664.0000   617.600     107.1000
Maximum        950.0000   2039.200   2037.00     232.1000
Minimum        300.0000   424.0000   364.000     85.80000
Std. Dev.      277.2757   621.6864   567.43      48.09239
Skewness       0.726116   0.864522   0.886659    0.901452
Kurtosis       1.777608   2.046407   2.506446    2.602117
Jarque-Bera    1.651476   1.787012   1.552949    1.562356
Probability    0.437912   0.409219   0.460025    0.457866
Sum            5965.000   10370.00   9420.0      1437.700
Sum Sq. Dev.   768818.2   3864939.   5 8453238   23128.78
Observations   11         11         11          11

DESCRIPTIVE STATISTICS OF THE DATA SET

A total of 11 observations were recorded as the data set for the study. The descriptive statistics show the mean, median, maximum and minimum of each variable, together with measures of its dispersion and shape.

CONCLUSION FROM DATA:

The mean of POW is 542.27, with a minimum of 300 and a maximum of 950; the median of POW is 415.

The mean of POR is 942.73, with a minimum of 424 and a maximum of 2039.2.

The mean of POF is 856.36, with a minimum of 364 and a maximum of 2037.

The mean of WPI is 130.7, with a minimum of 85.8 and a maximum of 232.1.
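The Jarque-Bera and Probability rows of the descriptive statistics bear on the normality assumption, and they can be reproduced from the skewness and kurtosis rows. The sketch below uses the standard formula JB = n/6 × (S² + (K − 3)²/4) and the fact that, for a chi-square distribution with 2 degrees of freedom, the upper-tail probability is exp(−JB/2). The function name `jarque_bera` is chosen for this example.

```python
import math


def jarque_bera(n_obs, skewness, kurtosis):
    """Return the Jarque-Bera statistic and its chi-square(2) p-value."""
    jb = n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # upper tail of chi-square with 2 df
    return jb, p_value


# (skewness, kurtosis) pairs copied from the descriptive statistics table.
shape = {
    "POW": (0.726116, 1.777608),
    "POR": (0.864522, 2.046407),
    "POF": (0.886659, 2.506446),
    "WPI": (0.901452, 2.602117),
}

results = {name: jarque_bera(11, s, k) for name, (s, k) in shape.items()}
for name, (jb, p) in results.items():
    print(f"{name}: JB = {jb:.4f}, p = {p:.4f}")
```

All four p-values are well above 0.05, so the test does not reject normality for any of the series, although with only 11 observations it has little power to do so.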

Graphical Analysis:

Overlaying the fitted series, obtained from the estimated equation, on the actual series, obtained by directly plotting the observed values, gives valuable insight into the strength of the estimated equation. It is a useful tool for analyzing the behavior of the values where significant differences occur (spikes and dips between the curves). After spotting these anomalies, one can look into the detail of that particular data entry and find the justification for this behavior. The residual graph gives a graphical view of how well the curve fits.


GRAPH: ACTUAL, FITTED, RESIDUAL GRAPH

This graph indicates that the estimated model fits the data closely. It is also evident from the graph that the actual and fitted lines are highly correlated and follow a similar trend. The residuals are small relative to the price level, ranging from about -25 to +25.

[Figure: Residual, Actual and Fitted series, 2000 to 2010. One axis runs from -30 to 30 and the other from 200 to 1,000.]


6. Application of Technique Using EXCEL

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.998702
R Square            0.997406
Adjusted R Square   0.996295
Standard Error      16.87827
Observations        11

ANOVA
             df   SS         MS        F          Significance F
Regression   3    766824     255608    897.2605   2.07E-09
Residual     7    1994.132   284.876
Total        10   768818.2

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   48.12886       15.70599         3.064363   0.018211   10.99009    85.26763
POR         0.210432       0.031696         6.639085   0.000293   0.135483    0.285381
POF         0.004812       0.0007           6.55710    0.00031    0.003076    0.00654
WPI         0.411274       0.0759           5.41646    0.00099    0.231727    0.59082
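The ANOVA block of the EXCEL output is built from a simple decomposition: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, R Square is the regression share of the total sum of squares, and the Standard Error is the square root of the residual mean square. The sketch below recomputes those quantities from the sums of squares in the output; the variable names are chosen for this example.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression, df_regression = 766824.0, 3
ss_residual, df_residual = 1994.132, 7
ss_total = 768818.2

ms_regression = ss_regression / df_regression  # mean square, regression
ms_residual = ss_residual / df_residual        # mean square, residual
f_statistic = ms_regression / ms_residual
r_square = 1.0 - ss_residual / ss_total
standard_error = math.sqrt(ms_residual)

print(f"MS regression  = {ms_regression:.0f}")
print(f"MS residual    = {ms_residual:.3f}")
print(f"F              = {f_statistic:.4f}")
print(f"R Square       = {r_square:.6f}")
print(f"Standard Error = {standard_error:.5f}")
```

The recomputed values should match the MS, F, R Square and Standard Error entries of the output to within rounding.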


7. Main Findings:

Price of Wheat is directly (positively) related to Price of Rice, Price of Fertilizer and Wheat Price Index

A fitted line equation was obtained to predict the change in Wheat Price due to changes in the independent variables

Multicollinearity was judged not to be a serious problem, although the independent variables are strongly correlated with one another (pairwise correlations between 0.84 and 0.88)

A negative autocorrelation was observed

8. CONCLUSION AND IMPLICATIONS:

A multiple regression model was used to analyze a real-life problem, and different tools were used to obtain the important statistical results. The following conclusions can be drawn from the study:

Rice, being the principal substitute for wheat in Pakistan and in most of the world, plays a vital role in determining the price of wheat. The higher the price of rice, the higher the demand for wheat, and hence the higher its price.

An increase in the Price of Fertilizer raises the price of wheat (the supply curve shifts because of the higher cost of production).

Another important observation made during this analysis was that regional wheat prices have a significant impact on wheat prices in Pakistan. This can be explained by the indirect impact of world wheat prices on the Pakistani government’s wheat procurement policy, and by the direct impact arising because Pakistan has both imported and exported wheat in different years.

9. REFERENCES

1. Paul Dorosh and Abdul Salam (2008): “Wheat Markets and Price Stabilization in Pakistan: An Analysis of Policy Options”

2. Salman Azam Joiya and Adnan Ali Shahzad (2013): “Determinants of High Food Prices”

3. www.fao.org/statistics/en/

4. www.pbs.gov.pk

5. www.finance.gov.pk

6. www.blog.minitab.com/

7. en.wikipedia.org