bivariate analysis module 2
Post on 13-Apr-2017
Quantitative Methods in Management
Module 2: Bivariate Analysis
N J Jaissy
Objective
Learn about:
• Correlation
• Regression and the assumptions in the regression model
• Coefficient of determination
• Tests of significance for the correlation & regression coefficients
Jaissy@gmail.com 2
Reference Textbooks
• Levin & Rubin: Statistics for Management
• Srivastava, Shenoy & Sharma: Quantitative Techniques for Management decisions
• Anderson & Sweeney: Business Statistics
Correlation: Example 1
E.g. the following table shows the relationship between annual R&D expenditure and annual profits (Rs. lakhs)
Year R&D Spend Profit
1995 5 31
1994 11 40
1993 4 30
1992 5 34
1991 3 25
1990 2 20
[Figure: scatter plot of Profit (vertical axis, 0 to 45) against R&D Spend (horizontal axis, 0 to 12), with a linear regression line fitted to the points]
Correlation coefficient
Karl Pearson’s correlation coefficient is an indication of how ‘strong’ the association or correlation is between 2 variables.
It does NOT matter which variable is termed X or Y when computing the correlation coefficient.
Cov(X,Y) = covariance between X & Y = average product of the deviations of X and Y from their respective means
Sx, Sy = standard deviations of X and Y
Simplified formula
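In symbols, the definitions above give Pearson's coefficient as the covariance scaled by the two standard deviations (with the covariance and the standard deviations using the same divisor). The second line is the usual computational form; it is assumed here to be the 'simplified formula' the slide refers to, with all sums running over the n paired observations.

```latex
r = \frac{\operatorname{Cov}(X,Y)}{S_x \, S_y}
  = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^2 \; \sum (Y - \bar{Y})^2}}

r = \frac{n \sum XY - \sum X \sum Y}
         {\sqrt{\left[\, n \sum X^2 - \left(\sum X\right)^2 \right]
                \left[\, n \sum Y^2 - \left(\sum Y\right)^2 \right]}}
```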
Correlation Coefficient: Scale
• The correlation coefficient varies from -1 (perfect negative correlation) to +1 (perfect positive correlation)
• If the correlation coefficient = 0, there is NO linear correlation between the variables (a non-linear association may still exist)
• A higher magnitude of the correlation coefficient means a stronger correlation
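The two ends of the scale can be illustrated with a small sketch. The data below are made up for illustration and the helper name `pearson_r` is hypothetical. The first data set shows that a coefficient of 0 rules out only a straight-line relationship: Y is fully determined by X, yet r is 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: Y = X squared, a perfect but non-linear relationship.
xs = [-2, -1, 0, 1, 2]
r_curve = pearson_r(xs, [x ** 2 for x in xs])

# Made-up data lying exactly on a falling straight line.
r_line = pearson_r(xs, [10 - 3 * x for x in xs])

print(r_curve)  # 0.0
print(r_line)   # -1.0
```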
Correlation Coefficient
Keep in mind, correlation does NOT imply causality!
Question Set 1
1. The following table shows the relationship between annual R&D expenditure and annual profits. How strong is the correlation? Determine it using the correlation coefficient.
Year R&D Spend Profit
1995 5 31
1994 11 40
1993 4 30
1992 5 34
1991 3 25
1990 2 20
Correlation coefficient = 0.91; very strong positive correlation
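The answer can be checked with a short script. This is a minimal sketch that uses the raw-sums (computational) form of Pearson's coefficient; the function and variable names are illustrative, not part of the course material.

```python
import math

# Data from the table above (Rs. lakhs).
rd_spend = [5, 11, 4, 5, 3, 2]
profit = [31, 40, 30, 34, 25, 20]

def pearson_r(xs, ys):
    """Pearson's r via the computational (raw-sums) form."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(rd_spend, profit)
r_swapped = pearson_r(profit, rd_spend)  # roles of X and Y exchanged

print(round(r, 3))          # 0.909
print(round(r_swapped, 3))  # 0.909
```

Exchanging the two lists gives the same value, which is the point made earlier: for correlation it does not matter which variable is called X.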
Question Set 1
2. Given below are the annual income and net savings of a sample of 10 staff belonging to a firm. Is there any association between these two variables? Compute it using the correlation coefficient.
Employee no. Income (Rs '000) Net Savings
1 780 84
2 360 51
3 980 91
4 250 60
5 750 68
6 820 62
7 900 86
8 620 58
9 650 53
10 390 47
Yes, there is a high positive correlation. Correlation coefficient = 0.78
Regression
• A correlation shows that there is a relationship between two variables (X & Y)
• With regression, we try to ‘fit’ a line to the data points.
• This line can be written as an equation, which can then be used for prediction (i.e. for a future value of X, what is Y?)
• Our objective is to find the ‘best fitting’ regression line (i.e. the line that is closest to all the data points, in the least-squares sense)
Regression Equation (2 variables)
X = Independent Variable
Y = Dependent Variable
It DOES matter which variable is termed X or Y
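For one independent variable the regression line is commonly written as below, with a the intercept and b the slope. The symbols a and b, and the least-squares estimates shown, are the standard textbook forms and are an assumption about what this slide displayed.

```latex
\hat{Y} = a + bX, \qquad
b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}, \qquad
a = \bar{Y} - b\,\bar{X}
```

Exchanging X and Y changes the denominator of b, which is why the choice of dependent variable matters here although it does not for the correlation coefficient.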
Regression formula (least squares): Short-cut method (OPTIONAL!)
This is a short-cut formula that can be used instead of the earlier formula; both yield the same answer!
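The equivalence can be checked numerically. In the sketch below the short-cut form is assumed to be the usual raw-sums version, b = (ΣXY - n·X̄·Ȳ) / (ΣX² - n·X̄²), since the slide's own formula is not reproduced here; the function names are illustrative. The data are the R&D and profit figures from Example 1.

```python
rd_spend = [5, 11, 4, 5, 3, 2]        # X, Rs. lakhs
profit = [31, 40, 30, 34, 25, 20]     # Y, Rs. lakhs

def fit_deviation_form(xs, ys):
    """Slope from deviations about the means; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b

def fit_shortcut_form(xs, ys):
    """Slope from raw sums (assumed short-cut form); returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = ((sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y)
         / (sum(x * x for x in xs) - n * mean_x ** 2))
    return mean_y - b * mean_x, b

a1, b1 = fit_deviation_form(rd_spend, profit)
a2, b2 = fit_shortcut_form(rd_spend, profit)
print(a1, b1)  # 20.0 2.0
print(a2, b2)  # 20.0 2.0
```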
Question Set 2
1. The following table shows the relationship between annual R&D expenditure and annual profits. Given that there is a correlation, what would the profit be if R&D spend is Rs. 20 lakhs?
Year R&D Spend Profit
1995 5 31
1994 11 40
1993 4 30
1992 5 34
1991 3 25
1990 2 20
Figures in Rs. lakhs
Profit = Rs. 60 lakhs
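The arithmetic behind this answer, worked from the table with the standard least-squares estimates (the symbols a and b for intercept and slope are a notational assumption):

```latex
\bar{X} = \tfrac{30}{6} = 5, \qquad \bar{Y} = \tfrac{180}{6} = 30

\sum (X - \bar{X})(Y - \bar{Y}) = 100, \qquad \sum (X - \bar{X})^2 = 50

b = \tfrac{100}{50} = 2, \qquad a = \bar{Y} - b\,\bar{X} = 30 - 2 \times 5 = 20

\hat{Y}(X = 20) = 20 + 2 \times 20 = 60
```

An R&D spend of 20 lies well outside the observed range of 2 to 11, so the figure of 60 is an extrapolation and should be read with that caution.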
Question Set 2
2. Mr. Ravi, the production manager of a factory, is studying the relation between batch size and production costs. Is there a correlation? Fit a regression line.
Yes, there is a strong positive correlation. Correlation coefficient = 0.953.
Regression line: Y = 1.864 + 0.044X (X = batch size, Y = production costs in Rs '000)
Batch no. Batch Size Production costs (Rs '000)
1 11 2.1
2 13 2.7
3 18 2.9
4 24 2.9
5 28 3.1
6 32 3
7 38 3.3
8 42 3.7
9 47 4
10 53 4.4
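A short script reproduces the figures for the batch data. It is a sketch with illustrative names, using the deviation form of the least-squares estimates. Carried at full precision the intercept comes out near 1.85; the quoted 1.864 is what results when the slope is first rounded to 0.044.

```python
import math

batch_size = [11, 13, 18, 24, 28, 32, 38, 42, 47, 53]        # X
cost = [2.1, 2.7, 2.9, 2.9, 3.1, 3.0, 3.3, 3.7, 4.0, 4.4]   # Y, Rs '000

def deviations(xs, ys):
    """Return (Sxx, Syy, Sxy, mean_x, mean_y), sums taken about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy, mean_x, mean_y

sxx, syy, sxy, mean_x, mean_y = deviations(batch_size, cost)
r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
b = sxy / sxx                    # slope
a = mean_y - b * mean_x          # intercept at full precision
a_from_rounded_slope = mean_y - round(b, 3) * mean_x

print(f"r = {r:.3f}, b = {b:.4f}, a = {a:.3f}")
print(f"intercept with slope rounded to 0.044: {a_from_rounded_slope:.3f}")
```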
Residual & Standard Error
• A residual is the ‘error’, i.e. the difference between the actual value of Y and the value of Y predicted by the regression line
• For a least-squares line, the sum of all the residuals = 0 (or very close to 0 after rounding). The sum of the squares of the residuals is the sum of squares of error (SSE)
• Standard error = standard deviation of the errors of the regression model
Se = standard error of estimate
Y’ = predicted value of Y using the regression formula
Yi = actual value of Y
n = no. of observations
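As a sketch, the residuals and standard error for the Example 1 data (R&D spend and profit) can be computed as follows. The formula Se = sqrt(Σ(Yi - Y’)² / (n - 2)) is assumed here, with the n - 2 divisor commonly used for a two-variable regression; the slide's own formula is not reproduced in this transcript, and the variable names are illustrative.

```python
import math

rd_spend = [5, 11, 4, 5, 3, 2]        # X, Rs. lakhs
profit = [31, 40, 30, 34, 25, 20]     # Y, Rs. lakhs

# Least-squares line fitted to the data.
n = len(rd_spend)
mean_x, mean_y = sum(rd_spend) / n, sum(profit) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(rd_spend, profit))
     / sum((x - mean_x) ** 2 for x in rd_spend))
a = mean_y - b * mean_x

predicted = [a + b * x for x in rd_spend]                       # Y'
residuals = [y - y_hat for y, y_hat in zip(profit, predicted)]  # Yi - Y'
sse = sum(e ** 2 for e in residuals)                            # sum of squared errors
se = math.sqrt(sse / (n - 2))                                   # assumed n - 2 divisor

print(residuals)       # [1.0, -2.0, 2.0, 4.0, -1.0, -4.0]
print(sum(residuals))  # 0.0
print(sse)             # 42.0
print(round(se, 2))    # 3.24
```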
Coefficient of Determination
• This gives an indication of how ‘good’ a fit the regression line is to the data points
• I.e. how much of the total variation in Y is described or ‘determined’ by the variation in X
Coefficient of determination = (Correlation coefficient)²
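The 'share of variation' reading can be made explicit with a worked case. For the Example 1 data (R&D spend and profit), the least-squares line is Y = 20 + 2X. The labels SST (total sum of squares) and SSE (sum of squared errors) are standard names assumed here.

```latex
SST = \sum (Y - \bar{Y})^2 = 242, \qquad
SSE = \sum (Y - \hat{Y})^2 = 42

r^2 = 1 - \frac{SSE}{SST} = 1 - \frac{42}{242} \approx 0.826 \approx (0.909)^2
```

So about 83% of the variation in profit is accounted for by the variation in R&D spend in that example.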
Question Set 2
3. Gangarams is selling copies of a new statistics book and wants to estimate the link between sales of the book and the number of statistics classes taught each semester at IBS. Use the data collected below. Is there a correlation? What is the estimating equation? How much of the variation in book sales is accounted for by the variation in the no. of classes?
Sales (# of books) No. of classes
33 3
38 7
24 6
61 6
52 10
45 12
65 12
82 13
29 12
63 13
50 14
79 15
Hint: When fitting the regression line, which is the X variable and which is Y?
Estimating equation is: Y = 21.82 + 2.92X
Moderate positive correlation of 0.59
35% of the variation in book sales is explained by the variation in no. of classes
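The hint can be explored with a short sketch (illustrative names, standard least-squares estimates). Taking the number of classes as X and sales as Y reproduces the answer to within rounding. Exchanging the roles gives a different line, while the correlation coefficient is unchanged.

```python
import math

classes = [3, 7, 6, 6, 10, 12, 12, 13, 12, 13, 14, 15]       # no. of classes
sales = [33, 38, 24, 61, 52, 45, 65, 82, 29, 63, 50, 79]     # books sold

def fit_line(xs, ys):
    """Least-squares fit of ys on xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

a, b = fit_line(classes, sales)          # sales explained by classes
a_rev, b_rev = fit_line(sales, classes)  # roles exchanged: a different line
r = pearson_r(classes, sales)

print(f"Y = {a:.2f} + {b:.2f}X, r = {r:.2f}, r^2 = {r * r:.2f}")
print(f"reversed slope = {b_rev:.3f}; b * b_rev = {b * b_rev:.3f}")
```

The product of the two slopes equals r², which ties the regression result back to the coefficient of determination.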
Spearman’s Rank Coefficient
• Used when we want to find the correlation between two sets of data that are ordinal (i.e. can be ranked in order).
Where:
n = no. of paired data entries
d = difference between the ranks of a paired data entry
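With n and d defined this way, the coefficient takes the standard form below. The symbol r_s is a notational assumption, and this form applies when there are no tied ranks.

```latex
r_s = 1 - \frac{6 \sum d^2}{n\,(n^2 - 1)}
```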
Question Set 3
1. Nokia wants to see whether persons who were expected at the time of joining to be better salespeople actually turn out to have better sales records. The HR VP reviewed 10 employees’ job interview summaries, academic records and recommendation letters and ranked them in terms of their potential for success. Their sales records for the last 2 years were also drawn. Is there agreement between the ranking of potential at the time of joining and the ranking based on sales performance?
Sales Person Ranking in potential Two Year Sales Data
Anil 2 400
Babu 4 360
Chandran 7 300
Dilip 1 295
Edward 6 280
Feida 3 350
Gouda 10 200
Haridas 9 260
Ignatius 8 220
Joshua 5 385
YES! Spearman’s rank coefficient = 0.73, which indicates a moderately strong positive correlation
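The calculation can be sketched as follows. The sales figures are first converted to ranks (rank 1 = highest sales, an assumption that matches rank 1 = highest potential), and then the standard no-ties formula 1 - 6Σd² / (n(n² - 1)) is applied. The helper names are illustrative.

```python
names = ["Anil", "Babu", "Chandran", "Dilip", "Edward",
         "Feida", "Gouda", "Haridas", "Ignatius", "Joshua"]
potential_rank = [2, 4, 7, 1, 6, 3, 10, 9, 8, 5]
sales = [400, 360, 300, 295, 280, 350, 200, 260, 220, 385]

def ranks_descending(values):
    """Rank 1 = largest value. Assumes no tied values."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_rs(rank_a, rank_b):
    """Spearman's rank coefficient for untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((ra - rb) ** 2 for ra, rb in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

sales_rank = ranks_descending(sales)
sum_d2 = sum((p - s) ** 2 for p, s in zip(potential_rank, sales_rank))
rs = spearman_rs(potential_rank, sales_rank)

print(sales_rank)    # [1, 3, 5, 6, 7, 4, 10, 8, 9, 2]
print(sum_d2)        # 44
print(round(rs, 2))  # 0.73
```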
Question Set 3
2. Two commentators reviewing the India – Pakistan cricket match gave a ranking to all the players. To what extent are their rankings in sync?
Player Rank (Person A) Rank (Person B)
Player 1 10 9
Player 2 8 5
Player 3 6 4
Player 4 11 2
Player 5 4 6
Player 6 3 11
Player 7 1 1
Player 8 2 7
Player 9 7 8
Player 10 5 10
Player 11 9 3
The two commentators’ rankings are NOT in sync. There is a weak negative correlation between the two rankings. Spearman’s coefficient = -0.136
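The arithmetic behind this result, using the standard no-ties form of Spearman's coefficient (the symbol r_s is a notational assumption), with d taken as Rank A minus Rank B for Players 1 to 11:

```latex
d = 1,\; 3,\; 2,\; 9,\; -2,\; -8,\; 0,\; -5,\; -1,\; -5,\; 6

\sum d^2 = 1 + 9 + 4 + 81 + 4 + 64 + 0 + 25 + 1 + 25 + 36 = 250

r_s = 1 - \frac{6 \times 250}{11\,(11^2 - 1)} = 1 - \frac{1500}{1320} \approx -0.136
```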