regression analysis - how runs are scored
7/24/2019 Regression Analysis - How Runs Are Scored
REGRESSION ANALYSIS
HOW RUNS ARE SCORED BY BANGLADESH CRICKET TEAM
FEBRUARY 27, 2015
February 27, 2015
Ms. Shakila Yasmin
Assistant Professor,
Institute of Business Administration,
University of Dhaka.
Dear Madam,
Here is the report on Regression Analysis: How Runs are Scored by Bangladesh Cricket Team, prepared as a requirement of the course K501 Quantitative Analysis for Managers.
I have tried my best to follow the guidelines that were imparted in class while writing this report. I will be happy to provide further clarification regarding this report whenever necessary.
Writing this report helped me to link classroom learning with real-life situations. It gave me an opportunity to explain a real-life scenario with an important statistical tool. I thank you for providing me this opportunity.
Sincerely,
_____________________________
d b l l
ACKNOWLEDGEMENT
This has been a great opportunity to link a real-life problem with a classroom briefing.
First of all, I would like to thank my honorable supervisor Ms. Shakila Yasmin, Assistant Professor, Institute of Business Administration, for assigning this topic and explaining it to us.
EXECUTIVE SUMMARY
This report is prepared as an integral part of the EMBA Program. The main objective of this report is to carry out a Regression Analysis case study. Thus, the topic for the report, Regression Analysis: How Runs are Scored by Bangladesh Cricket Team, was chosen to work on.
Managers face problems every day that require some analysis of data, and when analysis matters we need to know some statistical tools. One common statistical tool is Regression Analysis. As managers we are often tasked with finding out what has affected our profit, production, schedule or supply chain the most. To do that we first need to understand how the factors of profit, production or supply chain are associated with the outcome and to what extent they explain changes in it. Once we have understood that, we can take measures to improve the situation.
In this report I have first presented the batting performance of the Bangladesh cricket team, which was failing to score big totals and consequently lost 62% of its matches in the period 2010-2014. I have tried to put the problem in a simple way so that it is easily understandable by readers.
As the report progresses I describe the components of regression analysis and the dependent and independent variables of the case. Using the multiple linear regression method and the Excel Data Analysis tool, I have tried to show how the independent variables are related to the dependent variable, Total Runs Scored, and how much each independent variable affects it.
In the next part I have interpreted the result to see how many runs Bangladesh will score if all the independent variables are at 0. I have also shown which of the six factors have a significant relationship with Total Runs Scored and how the factors rank in terms of the strength of that relationship.
CONTENTS
Chapter 1
  Background of the study
  Objective of the Report
  Methodology
Chapter 2
  What is Regression Analysis?
  Use of Regression Analysis
  Learning of Regression from Managerial Perspective
Chapter 3
  Problem Case: Case Study of Bangladesh Cricket Team
Chapter 4
  Regression Analysis Output
    Regression Statistics
    Anova Table
    Coefficient Table
    Residual Table
  Interpretation of the Result
Chapter 5
  Findings
  Recommendation
CONTENT OF TABLES
Table 1: Bangladesh Team performance from 2010 to 2014
Table 2: Runs scored by Bangladesh in last 65 matches and the variables
Table 3: Regression Statistics Table
Table 4: ANOVA Table
Table 5: Coefficient Table
Table 6: Table of Residuals
Table 7: Table of Variables and their Coefficient sorted as per their strength
CONTENT OF FIGURES
Figure 1: Residual Plots
Figure 2: Residual Plot for X1
Figure 3: Residual Plot for X2
Figure 4: Residual Plot for X3
Figure 5: Residual Plot for X4
Figure 6: Residual Plot for X5
Figure 7: Residual Plot for X6
CONTENT OF EQUATIONS
Equation 1: Basic regression equation
Equation 3: Regression Line
CHAPTER 1
Background of the study
During the course K501 Quantitative Analysis for Managers of the EMBA program, regression analysis was discussed in class by our instructor, Ms. Shakila Yasmin, Assistant Professor. This assignment was given to complete our knowledge and understanding of Regression Analysis. This report is prepared and presented to fulfill that requirement of the course.
In our workplaces we face situations every day in which we need to measure what is influencing our profit, sales, production or even employee motivation. To do that we need to know how these relationships work, and sometimes we need to know exactly how much a factor is influencing the outcome. Regression measures this kind of relationship, so to measure it well we need to know the concept of regression well.
To present the case I have chosen to work on the Bangladesh Cricket Team's performances from 2010 to 2014, and I wanted to show how some variables are related to the Total Runs Scored by Bangladesh in ODI matches.
Objective of the Report
The primary objective of this report was to partially fulfill the requirements of the course K501 Quantitative Analysis for Managers. Alongside that objective there were the following objectives as well:
Understanding the Regression Analysis concept
Understanding the business implications of Regression Analysis
Working with a real-life case that presents an opportunity to learn to solve such problems in daily operations
Mastering the use of the Data Analysis ToolPak of Excel to solve regression problems
CHAPTER 2
What is Regression Analysis?
Regression Analysis is a statistical tool for measuring the extent of the relationships among variables. Primarily it focuses on the relationship between a dependent variable and one or more independent variables, using modelling techniques to reveal that relationship. As business managers we are most interested in how the dependent variable changes when any one of the independent variables varies while the other variables remain the same. (Wikipedia, 2015)
This is a simple Linear Regression equation with one dependent and one independent variable:
$\hat{Y} = a + bX$
Equation 1: Basic regression equation
Here,
$\hat{Y}$ = predicted value of the dependent variable
a = the intercept of the regression line
b = the coefficient of the independent variable, in other words the slope of the regression line
X = value of the independent variable
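As an illustration of Equation 1, the following is a minimal sketch of how the intercept a and the slope b can be estimated by ordinary least squares. The data points and the function name fit_line are hypothetical and are not taken from the case.

```python
def fit_line(xs, ys):
    """Estimate intercept a and slope b of Y-hat = a + bX by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on the line Y = 2 + 3X.
xs = [1, 2, 3, 4, 5]
ys = [5, 8, 11, 14, 17]
a, b = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}")
```

With real data the points will not lie exactly on a line, and the fitted a and b are the values that make the sum of squared residuals as small as possible.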
Use of Regression Analysis
Regression Analysis is an everyday tool for business. It helps managers in many ways, such as seeing what contributes most to profit, which factor is most important to employee performance, or whether production is influenced more by the availability of raw materials or by the run time of machines. All these questions can be answered by Regression Analysis. Below are the areas where regression can be used: (Giulio Rocca, 2015)
CHAPTER 3
Problem Case: Case Study of Bangladesh Cricket Team
The Bangladesh cricket team has been playing ODI cricket since 1997, but after 14 years they have not progressed as they were expected to. Between 2010 and 2014 the Bangladesh cricket team played 65 matches and their winning rate was 38%. Knowing the team's capability, team management expected this to be around 60%.
The table below shows the performance of Bangladesh for 2010-2014.
Year    Matches   Won   Lost   Tied   W/L
2010       16       9      7      0   1.29
2011       20       6     14      0   0.43
2012        9       5      4      0   1.25
2013        8       5      3      0   1.67
2014       12       0     12      0   0.00
Total      65      25     40      0   0.63
Table 1: Bangladesh Team performance from 2010 to 2014
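The W/L column and the 38% winning rate can be checked from the yearly figures. The sketch below assumes W/L means wins divided by losses, which is consistent with the values in Table 1. The function name win_loss_ratio is illustrative.

```python
# (year, matches, won, lost, tied) copied from Table 1.
results = [
    (2010, 16, 9, 7, 0),
    (2011, 20, 6, 14, 0),
    (2012, 9, 5, 4, 0),
    (2013, 8, 5, 3, 0),
    (2014, 12, 0, 12, 0),
]


def win_loss_ratio(won, lost):
    """Wins divided by losses; returns None when there are no losses."""
    return won / lost if lost else None


total_matches = sum(r[1] for r in results)
total_won = sum(r[2] for r in results)
total_lost = sum(r[3] for r in results)

for year, matches, won, lost, tied in results:
    ratio = win_loss_ratio(won, lost)
    print(year, "n/a" if ratio is None else f"{ratio:.3f}")

# Overall winning rate: matches won as a share of matches played.
win_rate = total_won / total_matches
overall_ratio = win_loss_ratio(total_won, total_lost)
print(f"Overall W/L = {overall_ratio:.3f}, winning rate = {win_rate:.1%}")
```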
Team management holds the team's batting performance accountable for these results. Since the IPL started in 2008, cricket has emerged as a batsman's game in which batting performance is decisive. So the Bangladesh team management also wanted to see improvement in their batting performance.
Between 2010 and 2014 Bangladesh's batting did not click well: they scored an average of only 211 runs in their 65 matches. This was not enough to match the performance of their opponents. So the Bangladesh team management wanted to see what contributes most to making runs, and chose these variables to measure their contribution to the total runs:
- 50 + Partnership
- Run in First 10 Over
7 234 0 37 70 50 9 113
8 191 2 35 9 33 13 128
9 228 0 42 78 58 22 102
10 177 1 58 57 13 0 155
11 241 3 64 61 106 13 60
12 174 1 53 54 36 29 51
13 200 1 53 60 63 1 81
14 194 1 48 62 9 7 172
15 246 1 40 104 73 63 46
16 189 1 37 85 11 6 153
17 283 3 68 80 55 25 196
18 205 2 67 55 16 36 95
19 58 0 40 0 8 0 30
20 227 3 58 84 32 6 110
21 166 2 30 14 1 11 145
22 78 0 29 0 30 3 16
23 210 2 35 50 51 44 73
24 229 1 25 98 9 81 66
25 295 1 71 97 9 1 253
26 184 1 19 47 53 59 31
27 188 2 32 60 26 12 29
28 245 2 38 102 19 101 173
29 203 2 72 10 39 16 119
30 245 1 40 105 79 20 89
31 258 2 40 98 67 21 117
32 220 1 26 96 12 69 70
33 62 0 37 0 0 10 57
34 91 0 26 0 15 0 23
35 186 1 19 36 34 1 18
36 119 1 23 2 9 0 60
37 241 2 43 76 64 3 120
61 272 2 34 112 52 59 109
62 58 0 38 0 4 11 41
63 217 1 35 87 0 12 152
64 70 0 42 0 0 6 49
65 247 2 41 92 0 72 128
Table 2: Runs scored by Bangladesh in last 65 matches and the variables
The underlying null hypothesis of this problem is that all the coefficients are equal to zero (0), which means that the independent variables have no effect on the dependent variable. The alternate hypothesis is that the independent variables do have an effect on the dependent variable, so the coefficients are not equal to zero (0). This is written as
$H_0: b_1 = b_2 = b_3 = b_4 = b_5 = b_6 = 0$
Or
$H_A: b_i \neq 0$ for at least one $i$
CHAPTER 4
Regression Analysis Output
Once we have run the data through Excel we will have a result that says whether these variables explain the runs scored by Bangladesh well and how significant the result is. The result will also tell how each variable affects Runs Scored and whether that relationship is due to chance or not.
REGRESSION STATISTICS
The table below tells us how strongly the variables are related to each other, how much of the variability in the dependent variable is explained by movement in the independent variables, and the standard error between the estimated and actual values of the dependent variable. The most important values in this table are Multiple R and Adjusted R Square. We will consider Adjusted R Square because it is a comparatively better measure of the coefficient of determination, being adjusted for the sample size and the number of independent variables.
Regression Statistics
Multiple R           0.929339484
R Square             0.863671877
Adjusted R Square    0.849568967
Standard Error       23.44038679
Observations         65
Table 3: Regression Statistics Table
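The relationship between the values in Table 3 can be sketched as follows, using the standard textbook formulas for Multiple R and Adjusted R Square. The count of six independent variables (k) comes from the case description. The variable names are illustrative.

```python
import math

r_square = 0.863671877  # R Square from Table 3
n = 65                  # Observations from Table 3
k = 6                   # number of independent variables in the case

# Multiple R is the square root of R Square.
multiple_r = math.sqrt(r_square)

# Adjusted R Square penalises R Square for the number of predictors
# relative to the sample size.
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(f"Multiple R        = {multiple_r:.4f}")
print(f"Adjusted R Square = {adjusted_r_square:.4f}")
```

Both computed values agree with the figures reported in Table 3 to the precision shown there.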
ANOVA TABLE
The ANOVA table, also known as the analysis of variance table, gives us the significance of the regression and tells us whether the result could have occurred randomly. If the result has occurred randomly, the independent variables may not be associated with the changes in the dependent variable. This table also helps to verify whether the model is significant and indicates which hypothesis will be accepted.
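As a sketch of what the F statistic in an ANOVA table measures, it can be computed from R Square, the number of observations and the number of independent variables. The inputs below are the R Square and Observations values from Table 3. The result is a derived illustration under the standard formula, not a figure copied from the Excel output.

```python
r_square = 0.863671877  # R Square from Table 3
n = 65                  # observations
k = 6                   # independent variables

df_regression = k
df_residual = n - k - 1

# F compares explained variance per regression degree of freedom
# with unexplained variance per residual degree of freedom.
f_statistic = (r_square / df_regression) / ((1 - r_square) / df_residual)
print(f"F({df_regression}, {df_residual}) = {f_statistic:.2f}")
```

An F value far above the critical value for these degrees of freedom (roughly 2.3 at the .05 level in a standard F table) means the null hypothesis that all coefficients are zero is rejected.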
RESIDUAL TABLE
For every predicted value there is an actual value, and the two will differ from one another. So for every observation we get a residual value, the actual value minus the predicted value. These residual values are important for understanding the pattern of the data and are also useful for detecting the nature of the problem (linearity or non-linearity).
Observation  Predicted Run Scored  Residuals   Observation  Predicted Run Scored  Residuals   Observation  Predicted Run Scored  Residuals
 1           155.04                 11.96      23           204.56                  5.44      45           208.11                 12.89
 2           171.46                 14.54      24           223.06                  5.94      46           262.06                 -3.06
 3           236.46                  9.54      25           277.47                 17.53      47           149.20                 34.80
 4           257.89                 -7.89      26           172.22                 11.78      48           241.61                 27.39
 5           202.65                 33.35      27           187.52                  0.48      49           215.13                 36.87
 6           186.72                 16.28      28           281.23                -36.23      50           237.61                  9.39
 7           197.70                 36.30      29           192.80                 10.20      51           249.49                 15.51
 8           170.53                 20.47      30           248.08                 -3.08      52           225.42                 21.58
 9           209.82                 18.18      31           259.81                 -1.81      53           294.16                 14.84
10           211.21                -34.21      32           221.00                 -1.00      54           161.45                  5.55
11           248.94                 -7.94      33           109.51                -47.51      55           223.72                  4.28
12           192.18                -18.18      34            96.39                 -5.39      56           218.62                 21.38
13           205.64                 -5.64      35           141.84                 44.16      57           271.26                  7.74
14           214.54                -20.54      36           117.87                  1.13      58           232.92                -10.92
15           244.44                  1.56      37           238.15                  2.85      59           350.84                -24.84
16           224.12                -35.12      38           315.93                -22.93      60           220.88                -16.88
17           291.24                 -8.24      39           165.10                 46.90      61           271.22                  0.78
18           222.31                -17.31      40           265.73                -31.73      62           107.38                -49.38
19           104.45                -46.45      41           187.72                 13.28      63           223.00                 -6.00
20           257.83                -30.83      42           292.85                 -0.85      64           109.56                -39.56
21           167.70                 -1.70      43           214.71                 12.29      65           251.90                 -4.90
22           100.87                -22.87      44           120.16                 15.84
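The residuals can be reproduced by subtracting each predicted value from the actual run total of the same observation. The sketch below uses five observations whose actual totals appear in Table 2. The function name residual is illustrative.

```python
# (observation, actual run scored, predicted run scored)
# Actual totals come from Table 2; predictions from the residual table.
rows = [
    (7, 234, 197.70),
    (8, 191, 170.53),
    (9, 228, 209.82),
    (10, 177, 211.21),
    (19, 58, 104.45),
]


def residual(actual, predicted):
    """Residual = actual value minus predicted value."""
    return actual - predicted


residuals = [round(residual(actual, predicted), 2) for _, actual, predicted in rows]
for (obs, _, _), value in zip(rows, residuals):
    print(obs, value)
```

A positive residual means the team scored more than the model predicted, and a negative one means it scored less.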
partnership, no run in the first 10 overs, no run between overs 35-50, no run by Shakib, no run by Musfiq and no run by the first 4 batsmen. For this constant, as the P value is less than the significance level (α = .05) and the t-Stat falls in the rejection area, we can say that this constant is not random and did not occur by chance, and the null hypothesis of no impact on the Total Runs Scored is rejected.
All the variables (50+ Partnership, Run in First 10 Overs, Run between 35-50 Overs, Run by Shakib, Run by Musfiq and Run by First 4 Batsmen) are positively related to the Total Runs Scored, and these relationships are not random and did not occur by chance.
The strongest variable among these is 50+ Partnership: each 50+ partnership in the innings increases Total Runs Scored by 13.85, or about 14. Each run scored in Run in First 10 Overs, Run between 35-50 Overs, Run by Shakib, Run by Musfiq and Run by First 4 Batsmen increases the Total Runs Scored by Bangladesh by 0.58, 0.88, 0.25, 0.22 and 0.25 respectively. From this we can sort the variables by the strength of their relationship with the dependent variable as below:
Variables                           Coefficients
Run Scored Between 35 - 50 Over     0.88
Run in First 10 Over                0.58
Run by Shakib                       0.25
Run by First 4 Batsman              0.25
Run by Musfiq                       0.22
Table 7: Table of Variables and their Coefficient sorted as per their strength
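To show how these coefficients combine into a prediction, here is a minimal sketch that applies them to the first match in the appendix (India, June 2010, actual total 167). It assumes the rounded coefficients quoted in the text and the intercept of 72 stated in the Findings. Excel's unrounded values are not available here, so the result only approximates the predicted value of 155.04 in the residual table. The dictionary keys and the function name predict_total are illustrative.

```python
# Rounded coefficients quoted in the text; the intercept of 72 is the
# value stated in the Findings. Unrounded Excel values are not available.
INTERCEPT = 72.0
COEFFICIENTS = {
    "partnership_50_plus": 13.85,
    "run_first_10_overs": 0.58,
    "run_overs_35_50": 0.88,
    "run_by_shakib": 0.25,
    "run_by_musfiq": 0.22,
    "run_by_first_4": 0.25,
}


def predict_total(match):
    """Predicted Total Runs Scored = intercept + sum(coefficient * value)."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * match[name] for name in COEFFICIENTS
    )


# First match in the appendix: India, June 2010.
match = {
    "partnership_50_plus": 1,
    "run_first_10_overs": 59,
    "run_overs_35_50": 0,
    "run_by_shakib": 7,
    "run_by_musfiq": 30,
    "run_by_first_4": 109,
}
predicted = predict_total(match)
print(f"Predicted total: {predicted:.1f}")
```

The small gap between this figure and the Excel prediction comes from rounding the intercept and coefficients.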
From this table we can see that runs scored between overs 35-50 have a stronger relationship with the total than runs scored in the first 10 overs. It is a fact of modern cricket that teams want a slow but steady start on which they can capitalize in the later overs. We also see that the coefficients of runs scored by Shakib and by the first four batsmen are equal, which signifies the important role played by Shakib in the team and the failures of our top order.
From the residuals we have plotted these graphs, in which we can see that the data have heteroscedasticity
Figure 2: Residual Plot for X1
Figure 3: Residual Plot for X2
Figure 4: Residual Plot for X3
Figure 5: Residual Plot for X4
[Residual plots, with Residuals on the vertical axis and the variable on the horizontal axis: 50 + Partnership Residual Plot; Run in First 10 Over Residual Plot; Run Scored Between 35 - 50 Over Residual Plot; Run by First 4 Batsman Residual Plot; Run by SAL Residual Plot; Run by Musfiq Residual Plot]
CHAPTER 5
Findings
We have discussed the variables and the relationship between the independent variables and the dependent one. We have seen how the dependent variable is explained by the independent variables and how much each variable contributes to the Total Runs Scored. Step by step we have reached the regression equation, which enables us to see how different values of the independent variables will affect the Total Runs Scored. We can sum all this up in the following points:
Multiple R is 0.93, which indicates a strong relationship between the independent and dependent variables
85% of the variability in the dependent variable can be explained by the variability in the independent variables
The regression model rejects the null hypothesis; the results are significant and did not occur by chance
All the independent variables have a positive relationship with the dependent variable
No variable's coefficient has occurred by chance or randomly
Even if all the independent variables are at 0, Total Runs Scored would be 72
50+ Partnership has the strongest relationship among the variables
There is randomness in the data, which suggests that the Linear Regression method could be used for this problem
Recommendation:
Team management wanted to see how these features of Bangladesh's batting are associated with the total runs scored by the team. Now that they have found the association and the extent of the relationship between each variable and the total runs scored, they are in a better position to improve the areas needed to score more runs.
In this case they will want to focus on partnership building, as a partnership of 50+ has the most significant
l h h h l d l h h ld f h h
BIBLIOGRAPHY
Archive, C. (2015, February 26). http://cricketarchive.com/cgi-bin/ask_the_scorecard_oracle.cgi. Retrieved from http://cricketarchive.com/
Giulio Rocca. (2015, February 26). How Can Regression Analysis Help as a Manager? Retrieved from The Nest: http://woman.thenest.com/can-regression-analysis-manager-20897.html
Wikipedia. (2015, February 26). Regression Analysis. Retrieved from Wikipedia: http://en.wikipedia.org/wiki/Regression_analysis
Regression Analysis: How Runs are Scored by Bangladesh Cricket Team
Appendices: Original Data of the case
Columns: Year, Opponent, Month, Cricket Archive Serial No, Win/Loss, Run Scored, 50 + Partnership, Run in First 10 Over, Run Scored Between 35 - 50 Over, Run by SAL, Run by Musfiq, Run by First 4 Batsman
2010 India June o2993 L 167 1 59 0 7 30 109
2010 Sri Lanka June o2995 L 186 1 62 21 20 6 101
2010 Pakistan June o2998 L 246 0 42 96 25 1 198
2010 England July o3018 L 250 2 47 90 20 22 169
2010 England July o3025 W 236 1 50 56 1 0 155
2010 England July o3026 L 203 1 59 51 6 0 82
2010 Ireland July o3027 L 234 0 37 70 50 9 113
2010 Ireland July o3028 W 191 2 35 9 33 13 128
2010 NewZealand October o3051 W 228 0 42 78 58 22 102
2010 NewZealand October o3054 W 177 1 58 57 13 0 155
2010 NewZealand October o3056 W 241 3 64 61 106 13 60
2010 NewZealand October o3058 W 174 1 53 54 36 29 51
2010 Zimbabwe December o3071 L 200 1 53 60 63 1 81
2010 Zimbabwe December o3073 W 194 1 48 62 9 7 172
2010 Zimbabwe December o3075 W 246 1 40 104 73 63 46
2010 Zimbabwe December o3078 W 189 1 37 85 11 6 153
2011 India February o3100 L 283 3 68 80 55 25 196
2011 Ireland February o3107 W 205 2 67 55 16 36 95
2011 West Indies March o3117 L 58 0 40 0 8 0 30
2011 England March o3126 W 227 3 58 84 32 6 110
2011 Netherlands March o3131 W 166 2 30 14 1 11 145
2011 South Africa March o3138 L 78 0 29 0 30 3 16
2011 Australia April o3149 L 210 2 35 50 51 44 73
2011 Australia April o3150 L 229 1 25 98 9 81 66
2011 Australia April o3151 L 295 1 71 97 9 1 253
2011 Zimbabwe August o3176 L 184 1 19 47 53 59 31
2011 Zimbabwe August o3178 L 188 2 32 60 26 12 29
2011 Zimbabwe August o3180 L 245 2 38 102 19 101 173
2011 Zimbabwe August o3181 W 203 2 72 10 39 16 119
2011 Zimbabwe August o3183 W 245 1 40 105 79 20 89
2011 West Indies October o3198 L 258 2 40 98 67 21 117
2011 West Indies October o3200 L 220 1 26 96 12 69 70
2011 West Indies October o3202 W 62 0 37 0 0 10 57
2011 Pakistan December o3218 L 91 0 26 0 15 0 23
2011 Pakistan December o3220 L 186 1 19 36 34 1 18
2011 Pakistan December o3222 L 119 1 23 2 9 0 60
2012 Pakistan March o3258 L 241 2 43 76 64 3 120
2012 India March o3261 W 293 3 38 128 49 46 182
2012 Sri Lanka March o3265 W 212 2 41 12 56 1 68
2012 Pakistan March o3267 L 234 2 35 114 68 10 104
2012 West Indies November o3309 W 201 1 49 30 0 16 177
2012 West Indies December o3310 W 292 3 35 102 0 79 210
2012 West Indies December o3311 L 227 2 44 65 0 38 97
2012 West Indies December o3312 L 136 1 37 0 0 27 29
2012 West Indies December o3313 W 221 2 59 56 0 44 62
2013 Sri Lanka March o3349 L 259 2 57 110 0 3 128
2013 Sri Lanka March o3352 W 184 1 62 0 0 9 104
2013 Zimbabwe May o3353 W 269 2 42 104 1 5 99
2013 Zimbabwe May o3354 L 252 0 29 109 34 26 64
2013 Zimbabwe May o3355 L 247 2 44 95 18 32 69
2013 NewZealand October o3423 W 265 2 34 95 0 90 108
2013 NewZealand October o3426 W 247 1 35 87 0 31 145
2013 NewZealand October o3429 W 309 4 67 102 0 2 152
2014 Sri Lanka February o3469 L 167 1 41 12 3 27 141
2014 Sri Lanka February o3470 L 228 2 43 48 24 79 136
2014 Sri Lanka February o3471 L 240 1 42 80 0 30 127
2014 India February o3474 L 279 1 36 95 0 117 224
2014 Afghanistan March o3478 L 222 2 42 97 0 23 74
2014 Pakistan March o3482 L 326 3 39 145 44 51 261
2014 Sri Lanka March o3482 L 204 2 39 79 20 4 93
2014 India June o3497 L 272 2 34 112 52 59 109
2014 India June o3498 L 58 0 38 0 4 11 41
2014 West Indies August o3509 L 217 1 35 87 0 12 152
2014 West Indies August o3511 L 70 0 42 0 0 6 49
2014 West Indies August o3514 L 247 2 41 92 0 72 128