regression analysis - how runs are scored

Upload: raquibul-islam-russeau

Post on 24-Feb-2018


  • 7/24/2019 Regression Analysis - How Runs Are Scored


    REGRESSION ANALYSIS

    HOW RUNS ARE SCORED BY BANGLADESH CRICKET TEAM

    FEBRUARY 27, 2015


    February 27, 2015

    Ms. Shakila Yasmin

    Assistant Professor,

    Institute of Business Administration,

    University of Dhaka.

    Dear Sir,

    Here is the report on "Regression Analysis: How Runs Are Scored by Bangladesh Cricket Team", prepared as a requirement of the course K501 Quantitative Analysis for Managers.

    I have tried my best to follow the guidelines discussed in class while writing this report. I will be happy to provide further clarification regarding this report whenever necessary.

    Writing this report helped me link classroom learning to real-life situations. It gave me an opportunity to explain a real-life scenario with an important statistical tool. I thank you for providing me with this opportunity.

    Sincerely,

    _____________________________


    ACKNOWLEDGEMENT

    This has been a great opportunity to link a real-life problem with classroom learning.

    First of all, I would like to thank my honorable supervisor, Ms. Shakila Yasmin, Assistant Professor, Institute of Business Administration, for assigning this topic and explaining it to us.


    EXECUTIVE SUMMARY

    This report is prepared as an integral part of the EMBA Program. The main objective of this report is to carry out a regression analysis case study. Thus, the topic "Regression Analysis: How Runs Are Scored by Bangladesh Cricket Team" was chosen.

    Managers face problems every day that require some analysis of data, and such analysis requires statistical tools. One common statistical tool is regression analysis. As managers, we are often tasked with finding out what has most affected our profit, production, schedule or supply chain. To do that, we first need to understand how the factors behind profit, production or supply chain are associated with the outcome and to what extent they explain changes in it. Once we understand that, we can take measures to improve the situation.

    In this report I first present the batting performance of the Bangladesh cricket team, which was failing to score big totals and lost 62% of its matches in the period 2010 - 2014. I have tried to state the problem simply so that readers can understand it easily.

    The report then describes the components of regression analysis and the dependent and independent variables of the case. Using multiple linear regression with the Excel Data Analysis tool, I show how the independent variables are related to the dependent variable, Total Runs Scored, and how much each independent variable affects it.

    Finally, I interpret the result to see how many runs Bangladesh would score if all the independent variables were at 0. I also show which of the six factors have a significant relationship with Total Runs Scored and how the factors rank by the strength of that relationship.


    CONTENTS

    Chapter 1
        Background of the study
        Objective of the Report
        Methodology

    Chapter 2
        What is Regression Analysis?
        Use of Regression Analysis
        Learning of Regression from Managerial Perspective

    Chapter 3
        Problem Case: Case Study of Bangladesh Cricket Team

    Chapter 4
        Regression Analysis Output
            Regression Statistics
            Anova Table
            Coefficient Table
            Residual Table
        Interpretation of the Result

    Chapter 5
        Findings
        Recommendation


    CONTENT OF TABLES

    Table 1: Bangladesh Team performance from 2010 to 2014
    Table 2: Runs scored by Bangladesh in last 65 matches and the variables
    Table 3: Regression Statistics Table
    Table 4: ANOVA Table
    Table 5: Coefficient Table
    Table 6: Table of Residuals
    Table 7: Table of Variables and their Coefficient sorted as per their strength

    CONTENT OF FIGURES

    Figure 1: Residual Plots
    Figure 2: Residual Plot for X1
    Figure 3: Residual Plot for X2
    Figure 4: Residual Plot for X3
    Figure 5: Residual Plot for X4
    Figure 6: Residual Plot for X5
    Figure 7: Residual Plot for X6

    CONTENT OF EQUATIONS

    Equation 1: Basic regression equation
    Equation 3: Regression Line


    CHAPTER 1

    Background of the study

    During the course K501 Quantitative Analysis for Managers of the EMBA program, regression analysis was discussed in class by our instructor, Ms. Shakila Yasmin, Assistant Professor. This assignment was given to complete our knowledge and understanding of regression analysis. This report is prepared and presented to fulfill that requirement of the course.

    In our workplaces we face situations every day in which we need to measure what is influencing our profit, sales, production or even employee motivation. To do that, we need to know how these relationships work, and sometimes we need to know exactly how much each factor influences the outcome. Regression measures this kind of relationship, so we need to know the concept well.

    To present the case, I have chosen to work on the Bangladesh Cricket Team's performance between 2010 and 2014 and to show how some variables are related to the Total Runs Scored by Bangladesh in ODI matches.

    Objective of the Report

    The primary objective of this report was to partially fulfill the requirements of the course K501 Quantitative Analysis for Managers. Alongside it, there were the following objectives:

    - Understanding the concept of regression analysis
    - Understanding the business implications of regression analysis
    - Working with a real-life case that offers an opportunity to learn to solve such problems in daily operations
    - Mastering the Data Analysis ToolPak of Excel for regression analysis


    CHAPTER 2

    What is Regression Analysis?

    Regression analysis is a statistical tool for measuring the extent of the relationships among variables. It focuses primarily on the relationship between a dependent variable and one or more independent variables, using modelling techniques to estimate that relationship. As business managers, we are most interested in how the dependent variable changes when one of the independent variables varies while the other variables are held constant. (Wikipedia, 2015)

    This is the simple linear regression equation with one dependent and one independent variable:

    $\hat{y} = a + bx$

    Equation 1: Basic regression equation

    Here,

    $\hat{y}$ = predicted value of the dependent variable

    a = intercept of the regression line

    b = coefficient of the independent variable x, in other words the slope of the regression line
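
    As a minimal sketch of how the intercept and slope of Equation 1 can be estimated, the Python snippet below fits a least-squares line by hand. The function name `fit_simple_regression` is hypothetical, and the six data points are only an illustrative subset (the first six 2010 matches in the appendix), so the fitted values are not the coefficients reported in this report.

```python
def fit_simple_regression(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b = covariance of x and y divided by the variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The fitted line passes through the point of means
    a = mean_y - b * mean_x
    return a, b


# Illustrative subset only: first six 2010 matches from the appendix
runs_overs_35_50 = [0, 21, 96, 90, 56, 51]      # independent variable x
runs_scored = [167, 186, 246, 250, 236, 203]    # dependent variable y

a, b = fit_simple_regression(runs_overs_35_50, runs_scored)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
```

    The report itself uses several independent variables at once (multiple regression), so this single-variable fit only illustrates the meaning of a and b.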

    Use of Regression Analysis

    Regression analysis is an everyday tool for business. It helps managers answer questions such as what contributes most to profit, which factor matters most to employee performance, and whether production is influenced more by the availability of raw materials or by machine run time. Below are the areas where regression can be used: (Giulio Rocca, 2015)


    CHAPTER 3

    Problem Case: Case Study of Bangladesh Cricket Team

    The Bangladesh cricket team has been playing ODI cricket since 1997, but after 14 years it has not progressed as expected. Between 2010 and 2014 the team played 65 matches with a winning rate of 38%. Knowing the team's capability, the team management expected this to be around 60%.

    The table below shows the performance of Bangladesh for 2010 - 2014:

    Year   Match  Won  Lost  Tied  W/L
    2010   16     9    7     0     1.29
    2011   20     6    14    0     0.43
    2012   9      5    4     0     1.25
    2013   8      5    3     0     1.67
    2014   12     0    12    0     0.00
    Total  65     25   40    0     0.63

    Table 1: Bangladesh Team performance from 2010 to 2014
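
    The W/L column and the 38% winning rate can be re-derived from the won and lost counts in Table 1. The short Python sketch below does that arithmetic; the variable names are illustrative only.

```python
# (won, lost) per year, copied from Table 1
results = {2010: (9, 7), 2011: (6, 14), 2012: (5, 4), 2013: (5, 3), 2014: (0, 12)}

total_won = sum(won for won, _ in results.values())
total_lost = sum(lost for _, lost in results.values())
total_matches = total_won + total_lost  # there were no tied matches in this period

win_rate = total_won / total_matches
wl_ratio = total_won / total_lost  # 0.625, shown as 0.63 in Table 1

for year, (won, lost) in results.items():
    print(year, f"W/L = {won / lost:.2f}")
print(f"win rate = {win_rate:.1%}, loss rate = {1 - win_rate:.1%}, W/L = {wl_ratio}")
```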

    The team management holds the team's batting performance accountable for these results. Since the IPL started in 2008, cricket has become a batsman's game in which batting performance decides matches. The Bangladesh team management therefore wanted to see improvement in the team's batting.

    Between 2010 and 2014 Bangladesh's batting did not click: the team scored an average of only 211 runs in its 65 matches, which was not enough to match its opponents. The team management therefore wanted to see what contributes most to scoring runs and chose these variables to measure their contribution to the total runs:

    - 50 + Partnership
    - Run in First 10 Over
    - Run Scored Between 35 - 50 Over
    - Run by Shakib (SAL)
    - Run by Musfiq
    - Run by First 4 Batsman


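    Match | Run Scored | 50 + Partnership | Run in First 10 Over | Run Scored Between 35 - 50 Over | Run by SAL | Run by Musfiq | Run by First 4 Batsman

    7 234 0 37 70 50 9 113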

    8 191 2 35 9 33 13 128

    9 228 0 42 78 58 22 102

    10 177 1 58 57 13 0 155

    11 241 3 64 61 106 13 60

    12 174 1 53 54 36 29 51

    13 200 1 53 60 63 1 81

    14 194 1 48 62 9 7 172

    15 246 1 40 104 73 63 46

    16 189 1 37 85 11 6 153

    17 283 3 68 80 55 25 196

    18 205 2 67 55 16 36 95

    19 58 0 40 0 8 0 30

    20 227 3 58 84 32 6 110

    21 166 2 30 14 1 11 145

    22 78 0 29 0 30 3 16

    23 210 2 35 50 51 44 73

    24 229 1 25 98 9 81 66

    25 295 1 71 97 9 1 253

    26 184 1 19 47 53 59 31

    27 188 2 32 60 26 12 29

    28 245 2 38 102 19 101 173

    29 203 2 72 10 39 16 119

    30 245 1 40 105 79 20 89

    31 258 2 40 98 67 21 117

    32 220 1 26 96 12 69 70

    33 62 0 37 0 0 10 57

    34 91 0 26 0 15 0 23

    35 186 1 19 36 34 1 18

    36 119 1 23 2 9 0 60

    37 241 2 43 76 64 3 120


    61 272 2 34 112 52 59 109

    62 58 0 38 0 4 11 41

    63 217 1 35 87 0 12 152

    64 70 0 42 0 0 6 49

    65 247 2 41 92 0 72 128

    Table 2: Runs scored by Bangladesh in last 65 matches and the variables

    The underlying null hypothesis of this problem is that all the coefficients are equal to zero (0), which means that the independent variables have no effect on the dependent variable. The alternate hypothesis is that at least one independent variable has an effect on the dependent variable, so its coefficient is not equal to zero (0). This is written as

    $H_0: b_1 = b_2 = b_3 = b_4 = b_5 = b_6 = 0$

    Or

    $H_A: b_i \neq 0$ for at least one $i$


    CHAPTER 4

    Regression Analysis Output

    Once we run the regression in Excel on these data, we get a result that shows how well these variables explain the runs scored by Bangladesh and how significant the result is. The result also shows how each variable affects Run Scored and whether that relationship occurred by chance.

    REGRESSION STATISTICS

    The table below tells us how strongly the variables are related to each other and how much of the variability in the dependent variable is explained by movement in the independent variables. It also gives the standard error, the typical difference between the estimated and actual values of the dependent variable. The most important values in this table are Multiple R and Adjusted R Square. We will consider Adjusted R Square because it is a comparatively better measure of the coefficient of determination: it is adjusted for the sample size and the number of independent variables.

    Regression Statistics
    Multiple R          0.929339484
    R Square            0.863671877
    Adjusted R Square   0.849568967
    Standard Error      23.44038679
    Observations        65

    Table 3: Regression Statistics Table
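
    The values in Table 3 are linked by standard formulas, which gives a quick consistency check. The Python sketch below recomputes Multiple R and Adjusted R Square from R Square, assuming n = 65 observations and k = 6 independent variables as used in this report; the variable names are illustrative.

```python
import math

r_square = 0.863671877   # R Square from Table 3
n = 65                   # observations
k = 6                    # independent variables

# Multiple R is the square root of R Square
multiple_r = math.sqrt(r_square)

# Adjusted R Square penalises R Square for the number of predictors
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(f"Multiple R        = {multiple_r:.6f}")
print(f"Adjusted R Square = {adjusted_r_square:.6f}")
```

    Both recomputed values agree with Table 3 to at least six decimal places.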

    ANOVA TABLE

    The ANOVA table, also known as the analysis of variance table, gives the significance of the regression and tells us whether the result could have occurred by chance. If it could have, the independent variables may not be associated with changes in the dependent variable. The table therefore helps to verify whether the model as a whole is significant and indicates which hypothesis will be accepted.
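
    For a regression with an intercept, the overall F statistic of the ANOVA table can be recovered from R Square alone. The Python sketch below applies that identity to the R Square, n = 65 and k = 6 reported in this report. It is a cross-check under those assumptions, not a copy of the Excel ANOVA output, and it does not compute the p-value.

```python
r_square = 0.863671877   # R Square from the Regression Statistics table
n = 65                   # observations
k = 6                    # independent variables

df_regression = k
df_residual = n - k - 1

# F = (explained variation per regression df) / (unexplained variation per residual df)
f_statistic = (r_square / df_regression) / ((1 - r_square) / df_residual)

print(f"df = ({df_regression}, {df_residual}), F = {f_statistic:.2f}")
```

    A large F value relative to the critical value for these degrees of freedom is what leads to rejecting the null hypothesis that all coefficients are zero.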


    RESIDUAL TABLE

    For every predicted value there is an actual value, and the two differ. So for every observation we get a residual, the actual value minus the predicted value. These residuals are important for understanding the pattern in the data and for detecting the nature of the problem (linearity or non-linearity).

    Observation  Predicted Run Scored  Residuals   Observation  Predicted Run Scored  Residuals   Observation  Predicted Run Scored  Residuals
    1            155.04                11.96       23           204.56                5.44        45           208.11                12.89
    2            171.46                14.54       24           223.06                5.94        46           262.06                -3.06
    3            236.46                9.54        25           277.47                17.53       47           149.20                34.80
    4            257.89                -7.89       26           172.22                11.78       48           241.61                27.39
    5            202.65                33.35       27           187.52                0.48        49           215.13                36.87
    6            186.72                16.28       28           281.23                -36.23      50           237.61                9.39
    7            197.70                36.30       29           192.80                10.20       51           249.49                15.51
    8            170.53                20.47       30           248.08                -3.08       52           225.42                21.58
    9            209.82                18.18       31           259.81                -1.81       53           294.16                14.84
    10           211.21                -34.21      32           221.00                -1.00       54           161.45                5.55
    11           248.94                -7.94       33           109.51                -47.51      55           223.72                4.28
    12           192.18                -18.18      34           96.39                 -5.39       56           218.62                21.38
    13           205.64                -5.64       35           141.84                44.16       57           271.26                7.74
    14           214.54                -20.54      36           117.87                1.13        58           232.92                -10.92
    15           244.44                1.56        37           238.15                2.85        59           350.84                -24.84
    16           224.12                -35.12      38           315.93                -22.93      60           220.88                -16.88
    17           291.24                -8.24       39           165.10                46.90       61           271.22                0.78
    18           222.31                -17.31      40           265.73                -31.73      62           107.38                -49.38
    19           104.45                -46.45      41           187.72                13.28       63           223.00                -6.00
    20           257.83                -30.83      42           292.85                -0.85       64           109.56                -39.56
    21           167.70                -1.70       43           214.71                12.29       65           251.90                -4.90
    22           100.87                -22.87      44           120.16                15.84

    Table 6: Table of Residuals
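
    Each residual is the actual Run Scored minus the predicted Run Scored. The Python sketch below reproduces three residuals from the table, taking the actual totals for observations 23 to 25 from the original data in the appendix; the dictionary layout is only for illustration.

```python
# observation number -> (actual Run Scored, predicted Run Scored)
observations = {
    23: (210, 204.56),
    24: (229, 223.06),
    25: (295, 277.47),
}

# residual = actual - predicted, rounded to two decimals as in the table
residuals = {
    obs: round(actual - predicted, 2)
    for obs, (actual, predicted) in observations.items()
}

for obs, residual in residuals.items():
    print(f"observation {obs}: residual = {residual}")
```

    A positive residual means the team scored more than the model predicted, and a negative one means it scored less.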


    partnership, no run in the first 10 overs, no run between overs 35 - 50, no run by Shakib, no run by Musfiq and no run by the first 4 batsmen. For this constant, the P value is less than the significance level (α = .05) and the t-Stat falls in the rejection area, so we can say that this constant did not occur by chance, and the null hypothesis of no impact on the Total Runs Scored is rejected.

    All the variables (50+ Partnership, Run in First 10 Overs, Run between 35 - 50 Overs, Run by Shakib, Run by Musfiq and Run by First 4 Batsman) are positively related to the Total Runs Scored, and none of these relationships occurred by chance.

    The strongest variable among these is 50+ Partnership: each 50+ partnership in the innings increases Total Runs Scored by 13.85, or about 14. Each additional run in Run in First 10 Overs, Run between 35 - 50 Overs, Run by Shakib, Run by Musfiq and Run by First 4 Batsman increases the Total Runs Scored by Bangladesh by 0.58, 0.88, 0.25, 0.22 and 0.25 respectively. From this we can sort the variables by the strength of their relationship with the dependent variable as below:

    Variables                          Coefficients
    Run Scored Between 35 - 50 Over    0.88
    Run in First 10 Over               0.58
    Run by Shakib                      0.25
    Run by First 4 Batsman             0.25
    Run by Musfiq                      0.22

    Table 7: Table of Variables and their Coefficient sorted as per their strength
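
    To show how these coefficients combine into a prediction, the Python sketch below applies them to one match. It assumes the rounded coefficients quoted in the text and an intercept of 72, the rounded constant stated in the Findings; the dictionary keys and the function name `predict_total_runs` are hypothetical. Because the inputs are rounded, the result only approximates the Excel prediction.

```python
INTERCEPT = 72.0  # rounded constant quoted in the Findings (assumption: not the exact Excel value)

# Rounded coefficients as quoted in the text
COEFFICIENTS = {
    "partnership_50_plus": 13.85,
    "run_first_10_overs": 0.58,
    "run_overs_35_50": 0.88,
    "run_by_shakib": 0.25,
    "run_by_musfiq": 0.22,
    "run_by_first_4": 0.25,
}


def predict_total_runs(match):
    """Predicted total = intercept + sum of coefficient * value for each variable."""
    return INTERCEPT + sum(COEFFICIENTS[name] * match[name] for name in COEFFICIENTS)


# Observation 23 (2011, Australia) from the appendix; actual total was 210
observation_23 = {
    "partnership_50_plus": 2,
    "run_first_10_overs": 35,
    "run_overs_35_50": 50,
    "run_by_shakib": 51,
    "run_by_musfiq": 44,
    "run_by_first_4": 73,
}

predicted = predict_total_runs(observation_23)
print(f"predicted total runs = {predicted:.2f}")
```

    The result is within a fraction of a run of the 204.56 listed for observation 23 in the residual table, and the small gap comes from rounding the coefficients.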

    From this table we can see that runs scored between overs 35 - 50 have a stronger relationship with the total (a larger coefficient) than runs scored in the first 10 overs. This reflects modern cricket, where a team wants a slow but steady start that it can capitalize on in the later overs. We also see that the coefficients of runs scored by Shakib and by the first four batsmen are equal, which signifies the important role played by Shakib in the team and the failures of our top order.

    From the residuals we have plotted these graphs, where we can see that the data have heteroscedasticity


    Figure 2: Residual Plot for X1    Figure 3: Residual Plot for X2

    Figure 4: Residual Plot for X3    Figure 5: Residual Plot for X4

    [Residual plots, each with Residuals on the vertical axis: 50 + Partnership Residual Plot; Run in First 10 Over Residual Plot; Run Scored Between 35 - 50 Over Residual Plot; Run by First 4 Batsman Residual Plot; Run by SAL Residual Plot; Run by Musfiq Residual Plot]


    CHAPTER 5

    Findings

    We have discussed the variables and the relationship between the independent variables and the dependent one. We have seen how the dependent variable is explained by the independent variables and how much each variable contributes to the Total Runs Scored. Step by step we have reached the regression equation, which lets us see how different values of the independent variables affect the Total Runs Scored. We can sum this up in the following points:

    - The correlation between the independent and dependent variables (Multiple R) is 93%
    - 85% of the variability in the dependent variable can be explained by the variability in the independent variables
    - The regression model rejects the null hypothesis; the results are significant and did not occur by chance
    - All the independent variables have a positive relationship with the dependent variable
    - No variable's coefficient occurred by chance or randomly
    - Even if all the independent variables are at 0, the Total Runs Scored would be 72
    - 50+ Partnership has the strongest relationship among the variables
    - There is randomness in the data, which suggests that the linear regression method could be used

    Recommendation:

    The team management wanted to see how these features of Bangladesh's batting are associated with the total runs scored by the team. Now that it has found the association and the extent of each variable's relationship with the total runs scored, it is in a better position to improve the areas needed to score more runs.

    In this case the management should focus on partnership building, as partnerships of 50+ have the most significant relationship with the total runs scored.


    BIBLIOGRAPHY

    Archive, C. (2015, February 26). http://cricketarchive.com/cgi-bin/ask_the_scorecard_oracle.cgi. Retrieved from http://cricketarchive.com/: http://cricketarchive.com/

    Giulio Rocca. (2015, February 26). How Can Regression Analysis Help as a Manager? Retrieved from The Nest: http://woman.thenest.com/can-regression-analysis-manager-20897.html

    Wikipedia. (2015, February 26). Regression Analysis. Retrieved from Wikipedia: http://en.wikipedia.org/wiki/Regression_analysis


    Regression Analysis: How Runs are Scored by Bangladesh Cricket Team

    Appendices: Original Data of the Case

    Year | Opponent | Month | Cricket Archive Serial No | Win/Loss | Run Scored | 50 + Partnership | Run in First 10 Over | Run Scored Between 35 - 50 Over | Run by SAL | Run by Musfiq | Run by First 4 Batsman

    2010 India June o2993 L 167 1 59 0 7 30 109

    2010 Sri Lanka June o2995 L 186 1 62 21 20 6 101

    2010 Pakistan June o2998 L 246 0 42 96 25 1 198

    2010 England July o3018 L 250 2 47 90 20 22 169

    2010 England July o3025 W 236 1 50 56 1 0 155

    2010 England July o3026 L 203 1 59 51 6 0 82

    2010 Ireland July o3027 L 234 0 37 70 50 9 113

    2010 Ireland July o3028 W 191 2 35 9 33 13 128

    2010 NewZealand October o3051 W 228 0 42 78 58 22 102

    2010 NewZealand October o3054 W 177 1 58 57 13 0 155

    2010 NewZealand October o3056 W 241 3 64 61 106 13 60

    2010 NewZealand October o3058 W 174 1 53 54 36 29 51

    2010 Zimbabwe December o3071 L 200 1 53 60 63 1 81

    2010 Zimbabwe December o3073 W 194 1 48 62 9 7 172

    2010 Zimbabwe December o3075 W 246 1 40 104 73 63 46

    2010 Zimbabwe December o3078 W 189 1 37 85 11 6 153

    2011 India February o3100 L 283 3 68 80 55 25 196

    2011 Ireland February o3107 W 205 2 67 55 16 36 95

    2011 West Indies March o3117 L 58 0 40 0 8 0 30

    2011 England March o3126 W 227 3 58 84 32 6 110

    2011 Netherlands March o3131 W 166 2 30 14 1 11 145

    2011 South Africa March o3138 L 78 0 29 0 30 3 16

    2011 Australia April o3149 L 210 2 35 50 51 44 73

    2011 Australia April o3150 L 229 1 25 98 9 81 66

    2011 Australia April o3151 L 295 1 71 97 9 1 253

    2011 Zimbabwe August o3176 L 184 1 19 47 53 59 31

    2011 Zimbabwe August o3178 L 188 2 32 60 26 12 29

    2011 Zimbabwe August o3180 L 245 2 38 102 19 101 173

    2011 Zimbabwe August o3181 W 203 2 72 10 39 16 119

    2011 Zimbabwe August o3183 W 245 1 40 105 79 20 89

    2011 West Indies October o3198 L 258 2 40 98 67 21 117

    2011 West Indies October o3200 L 220 1 26 96 12 69 70


    2011 West Indies October o3202 W 62 0 37 0 0 10 57

    2011 Pakistan December o3218 L 91 0 26 0 15 0 23

    2011 Pakistan December o3220 L 186 1 19 36 34 1 18

    2011 Pakistan December o3222 L 119 1 23 2 9 0 60

    2012 Pakistan March o3258 L 241 2 43 76 64 3 120

    2012 India March o3261 W 293 3 38 128 49 46 182

    2012 Sri Lanka March o3265 W 212 2 41 12 56 1 68

    2012 Pakistan March o3267 L 234 2 35 114 68 10 104

    2012 West Indies November o3309 W 201 1 49 30 0 16 177

    2012 West Indies December o3310 W 292 3 35 102 0 79 210

    2012 West Indies December o3311 L 227 2 44 65 0 38 97

    2012 West Indies December o3312 L 136 1 37 0 0 27 29

    2012 West Indies December o3313 W 221 2 59 56 0 44 62

    2013 Sri Lanka March o3349 L 259 2 57 110 0 3 128

    2013 Sri Lanka March o3352 W 184 1 62 0 0 9 104

    2013 Zimbabwe May o3353 W 269 2 42 104 1 5 99

    2013 Zimbabwe May o3354 L 252 0 29 109 34 26 64

    2013 Zimbabwe May o3355 L 247 2 44 95 18 32 69

    2013 NewZealand October o3423 W 265 2 34 95 0 90 108

    2013 NewZealand October o3426 W 247 1 35 87 0 31 145

    2013 NewZealand October o3429 W 309 4 67 102 0 2 152

    2014 Sri Lanka February o3469 L 167 1 41 12 3 27 141

    2014 Sri Lanka February o3470 L 228 2 43 48 24 79 136

    2014 Sri Lanka February o3471 L 240 1 42 80 0 30 127

    2014 India February o3474 L 279 1 36 95 0 117 224

    2014 Afghanistan March o3478 L 222 2 42 97 0 23 74

    2014 Pakistan March o3482 L 326 3 39 145 44 51 261

    2014 Sri Lanka March o3482 L 204 2 39 79 20 4 93

    2014 India June o3497 L 272 2 34 112 52 59 109

    2014 India June o3498 L 58 0 38 0 4 11 41

    2014 West Indies August o3509 L 217 1 35 87 0 12 152

    2014 West Indies August o3511 L 70 0 42 0 0 6 49

    2014 West Indies August o3514 L 247 2 41 92 0 72 128