Predicting the 2016 NBA Finals Outcome Using Arena Software

Upload: ed-orlando

Post on 15-Apr-2017

Page 1: Predicting the 2016 NBA Finals Outcome Using Arena Software

Predicting the 2016 NBA Finals Outcome Using Arena (software)

vs

Author: Ed Orlando, Data Scientist at White Lodging

Course: CS 525 Modeling and Simulation

Submitted to: Dr. James Caristi, Professor and Chair of Computing and Information Sciences

Section 1: Detailed Introduction to the Problem

On Sunday, June 19th, 2016, the Cleveland Cavaliers pulled off what no team in history has ever

accomplished in the NBA finals. Down three (3) games to one (1) in a best of seven series, the Cavaliers

rallied to win three (3) straight games against the Golden State Warriors. Although coming back from 3-1 down

has been accomplished by 10 other teams in the NBA playoffs (including the 2016 Warriors in the Western

Conference finals) no team has ever come back from a 3-1 deficit in the NBA finals (Exner, 2016).

The 2016 NBA finals outcome was considered one of the largest upsets in recent times. Not only did

Cleveland come back from a 3-1 deficit, they beat a Golden State team that held the all-time regular season

record of 73-9. In addition, 2 out of the 3 remaining games to be played were located at Golden State’s

arena (Oakland) where the Warriors won 39 out of 41 games (2015–16 Golden State Warriors Season,

2016). Even before the series started, the Warriors were heavy favorites. “The Cavs beating the Warriors

in this year’s NBA finals would be an upset no matter how you look at it. FiveThirtyEight’s projections give

the Warriors a 69 percent chance of winning the series. But if we factor in the conventional wisdom — not

something we say here very often — the Warriors look even stronger” (Morris, 2016).

Since the NBA finals is a best-of-seven series, experts and casual fans often believe that the best team

usually wins the championship. This is a different setup compared to other popular sporting events, including

the NFL (Super Bowl), NCAA basketball championship, and NCAA football championship, where the victor of

one game in the championship decides the winner. With that said, there are upsets in both formats,

including the NBA finals.

However, what if the NBA finals were a 101 game series? In other words, after simulating the

NBA finals more than one hundred times using a model built in Arena, will this model be consistent with

most of the experts and predict a Golden State Warriors victory in the NBA finals or will it show the Cavaliers

as favorites? The simulation results produced in this study provide the win percent for each team as well

as a range of winning percentages (95% confidence level). These results are compared to FiveThirtyEight’s

projection of the Warriors’ 69 percent chance of winning the NBA finals. The results are also compared to

the betting odds, provided by USA Today, which had the Warriors’ chance of winning listed at 66.7% before

the finals began. Lastly, the results are also compared against what happened in the NBA finals.

Section 2: Goals of the Simulation

There are many goals of the simulation. One of the goals of the study is to build a model that has the

Cavaliers play the Warriors more than 100 times. Those simulations assist in building a predictive model

that shows the odds of who would win in a 101 game series. The model and statistics produced from it

also show the average team and individual player statistics for each of the games. The predictive results

will be compared to the actual NBA finals results as well as to FiveThirtyEight’s projection of a 69

percent chance and the betting odds of a 67 percent chance of the Warriors winning the NBA finals. The simulation reveals

some insights into the likelihood of the Cavaliers winning 4 out of 7 games. Stated differently, throughout

the simulation, are there “streaks” in which the Cavaliers have an opportunity to win a best out of 7 series?

Lastly, it will be determined if this simulation can be used in additional ways. In other words, can the

simulation be utilized and structured in a way where teams from different generations play each other?

For example, what would happen if the 2016 Golden State Warriors played the 1998 Chicago Bulls, both of

which are considered among the best teams of all time?

Section 3: Modeling Assumptions

The model used throughout the study is set up as a stochastic simulation. A stochastic simulation is one

that uses random variables and probability distributions to drive its events, so repeated runs produce different outcomes.

Although stochastic models can be inaccurate at times, the results of the model used throughout this study are

compared to the actual results of the June 2016 NBA finals.

Assumption #1: Individual Players Utilized in Simulation

In order to simplify the number of variables and assumptions used in the model, the simulation used

individual player stats for a maximum of five players for each team. Although teams often use substitutions

throughout the course of an NBA game, this simulation assumed the same five players playing the entire

48 minutes of each game. The five players chosen to play each game are listed below.

Table 1: Golden State Warriors Players Included in Simulation

| Position | Name |
| --- | --- |
| Point Guard | Stephen Curry |
| Shooting Guard | Klay Thompson |
| Small Forward | Andre Iguodala |
| Power Forward | Draymond Green |
| Center | Andrew Bogut |

Table 2: Cleveland Cavaliers Players Included in Simulation

| Position | Name |
| --- | --- |
| Point Guard | Kyrie Irving |
| Shooting Guard | Iman Shumpert |
| Small Forward | LeBron James |
| Power Forward | Kevin Love |
| Center | Tristan Thompson |

Assumption #2: Normalization Adjustment for Minutes Played and Game Statistics

Since only five players from both teams are utilized in the simulation, the minutes played and game

statistics are adjusted to accommodate for the 48-minute game. For example, if a player averages 24

minutes played and the average number of points scored for that player is 20 points per game, the minutes

played are increased to 48 minutes and the number of points scored are increased to 40 points per game.

This adjustment assumes that each player keeps the same pace in all statistics as if he

played all 48 minutes of the game. Although no player actually averages 48 minutes per game during the

regular season, the number of minutes played in the NBA playoffs does often increase for the starters. In

other words, it is not uncommon for starters to play more than 40 minutes or more than 83% of the game.

For example, during the regular season, Kyrie Irving and LeBron James averaged 32.1 and 36.9 minutes per

game, respectively. However, during the NBA finals, Irving averaged 39.0 minutes per game while James

averaged 41.7 minutes per game (2016 NBA Finals Cavaliers vs. Warriors, 2016). “When it’s win or go

home, there’s no more jockeying for better playoff (or lottery) position, no more using throw-away games

to experiment with new lineups, the best players tend to get more minutes per game and no one’s resting

on the second night of (non-existent) back-to-backs” (Morris, 2016).
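
The 48-minute adjustment described above is a simple proportional scaling. The sketch below is a minimal illustration of that arithmetic using the 24-minute, 20-point example from the text; the function name and signature are hypothetical and are not part of the Arena model.

```python
def scale_to_full_game(per_game_stat: float, minutes_played: float,
                       game_minutes: float = 48.0) -> float:
    """Scale a per-game average to a full game, assuming a constant pace."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return per_game_stat * game_minutes / minutes_played


# Example from the text: 20 points in 24 minutes becomes 40 points in 48 minutes.
print(scale_to_full_game(20.0, 24.0))
```

The same scaling factor (48 divided by minutes played) is applied to every statistic for a player, which is why the adjusted figures are inflated relative to real game averages.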

Assumption #3: Game Rules Included in Simulation

The basketball game played in the simulation is varied slightly and does not include all NBA rules. The

game simulated includes the following workflow. The game starts with a referee tip-off that gives a

50/50 chance of either Golden State or Cleveland getting the ball initially. Once a team has control of the

ball there are many things that can happen. Please refer to Section 5 for the specific averages, ranges, and

distributions utilized in the simulation.

- The team holds the ball for an amount of time based on a normal distribution that is calculated

using historical data.

- The ball is distributed to a player. The more shots (field goal attempts) a player

takes on average, the higher his odds of getting the ball. Unlike a typical basketball game, once

a player gets the ball, he will not attempt to pass the ball to another teammate. However, the

following actions (based on discrete probabilities) occur once a player has the ball.

1. The player shoots a 2-point field goal and makes it

a. If true, the individual and the team both receive 2 points

2. The player shoots a 2-point field goal and misses it

a. If true, either the same team can get an offensive rebound or the other team

will receive the ball (odds are based on discrete probabilities)

3. The player shoots a 3-point field goal and makes it

a. If true, the individual and the team both receive 3 points

4. The player shoots a 3-point field goal and misses it

a. If true, either the same team can get an offensive rebound or the other team

will receive the ball (odds are based on discrete probabilities)

5. The player gets the ball blocked or stolen and the other team gets the ball

6. The player is fouled and makes 0 out of 2 free throw attempts

7. The player is fouled and makes 1 out of 2 free throw attempts

8. The player is fouled and makes 2 out of 2 free throw attempts

Please note that if a player is fouled, the maximum number of free throw attempts is

equal to 2. Although in a live game it is possible to get fouled while shooting

a 3 pointer and receive 3 free throw attempts, this is very rare.

- Some of the major simplifications and exclusions in the simulation include the following:

1. Defensive stats and assumptions are utilized using the entire aggregated team stats.

In other words, individual defensive stats will not be tracked.

2. No timeouts are utilized in the simulation since it would not add any benefit to the

model.

3. Since defensive stats are not tracked, no player is eliminated from the game due to too

many fouls. In a real game, a player is eliminated from the game once he commits 6 personal

fouls.

4. Since it is rare in a game, no shot clock violation will occur, although there is a slight chance

of a team or player taking more than 24 seconds to shoot the ball. Instead, an average

and standard deviation of how long the team holds the ball is calculated using historical

data for each team.

Section 4: Methods and Data Collection

Most of the granular data used in the study comes from nba.com. Regular

season statistics from 2016 for both teams are used to develop the simulations. The traditional stats

utilized throughout the simulation include the following:

Table 3: Variable Names and Descriptions

| Variable Name | Variable Description | Variable Explanation | Stat Used in Simulation |
| --- | --- | --- | --- |
| FGM | Field Goals Made | The number of field goals that a team has made. This includes both 2 and 3 pointers. | Not utilized directly |
| FGA | Field Goal Attempts | The number of field goals that a team has attempted. This includes both 2 and 3 pointers. | Not utilized directly |
| FG% | Field Goal Percentage | The percentage of field goals that a team has made. Formula = FGM / FGA. | Not utilized directly |
| 2PM | Two Pointers Made | The number of two-point field goals that a team has made. | Utilized |
| 2PA | Two Pointer Attempts | The number of two-point field goals that a team has attempted. | Utilized |
| 2P% | Two Point % | The percentage of two-point field goals that a team has made. Formula = 2PM / 2PA. | Utilized |
| 3PM | Three Pointers Made | The number of three-point field goals that a team has made. | Utilized |
| 3PA | Three Pointer Attempts | The number of three-point field goals that a team has attempted. | Utilized |
| 3P% | Three Point % | The percentage of three-point field goals that a team has made. Formula = 3PM / 3PA. | Utilized |
| FTM | Free Throws Made | The number of free throws that a team has made. | Utilized |
| FTA | Free Throw Attempts | The number of free throws that a team has attempted. | Utilized |
| FT% | Free Throw Percentage | The percentage of free throws that a team has made. Formula = FTM / FTA. | Utilized |
| OREB | Offensive Rebounds | The number of rebounds a team has collected while on offense. | Utilized with overall team stats |
| DREB | Defensive Rebounds | The number of rebounds a team has collected while on defense. | Not utilized directly |
| AST | Assists | An assist occurs when a player completes a pass to a teammate that directly leads to a field goal. | Utilized using a modified approach (1) |
| TOV | Turnovers | A turnover occurs when a player on offense loses the ball to a player on defense. | Not utilized |
| STL | Steals | A steal occurs when a player on defense takes the ball from a player on offense, causing a turnover. | Utilized with overall opposing team stats (combined with BLK) |
| BLK | Blocks | A block occurs when an offensive player attempts a shot and the defensive player tips the ball, blocking the chance to score. | Utilized with overall opposing team stats (combined with STL) |
| PF | Personal Fouls | The total number of fouls that a team has committed. | Utilized with overall team stats (individual players cannot foul out of game) |
| SEC | Seconds Holding the Ball | The number of seconds the player that is going to create action will hold the ball. | Utilized using average & standard deviation of team stats |
| MIN | Minutes Played | The average number of minutes a player plays in a 48-minute game. | Utilized and modified |

(League Team Stats, 2016)

(1) If the ball exchanges teams from either a defensive rebound, a steal, a block, or after a made field goal / free throw, the ball will be distributed to another teammate or distributed to the same player based on a weighted discrete distribution. The weighting is based on the average number of shots that player takes compared to other players. Described differently, if a player shoots the ball 30% of the time compared to all other players, that player will get passed the ball 30% of the time (on average).

Section 5: Statistics and Figures Utilized in the Simulation

The players’ stats utilized in the simulation (last 50 games of regular season) include the following:

Table 4: Golden State Warriors’ Stats – Prior to Time Normalization Adjustment

Average Statistics (last 50 games of the season)

| Statistic | Abbreviated Name | Stephen Curry | Klay Thompson | Andre Iguodala | Draymond Green | Andrew Bogut | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of Minutes Played | MIN | 33.8 | 33.8 | 27.5 | 35.8 | 21.3 | 30.4 |
| Number of 2 Point Field Goals Made | 2PM | 4.9 | 5.1 | 2.3 | 4.7 | 2.5 | 19.5 |
| Number of 2 Point Field Goals Missed | 2PMS | 3.9 | 4.8 | 1.4 | 3.2 | 1.5 | 14.8 |
| Number of 3 Point Field Goals Made | 3PM | 5.4 | 3.6 | 1.1 | 1.6 | 0.0 | 11.7 |
| Number of 3 Point Field Goals Missed | 3PMS | 6.3 | 5.0 | 1.6 | 2.0 | 0.0 | 14.9 |
| Number of Free Throws Made | FTM | 3.8 | 2.4 | 1.0 | 3.3 | 0.4 | 10.9 |
| Number of Free Throws Missed | FTMS | 0.4 | 0.2 | 0.6 | 1.4 | 0.3 | 2.9 |
| Number of Offensive Rebounds | OREB | 1.0 | 0.4 | 0.9 | 1.7 | 1.8 | 5.8 |
| Number of Defensive Rebounds | DREB | 4.5 | 3.5 | 3.2 | 7.8 | 5.4 | 24.4 |
| Number of Steals | STL | 2.1 | 0.9 | 1.1 | 1.6 | 0.5 | 6.2 |
| Number of Blocks | BLK | 0.2 | 0.5 | 0.3 | 1.6 | 1.6 | 4.2 |
| Number of Fouls | PF | 2.1 | 1.7 | 1.5 | 3.0 | 3.2 | 11.5 |

Note: For MIN, the Total column shows the five-player average.

(League Team Stats, 2016)

Table 5: Golden State Warriors’ Stats – Adjusted for Assumption that Starters Play Entire Game

Average Statistics (last 50 games of the season)

| Statistic | Abbreviated Name | Stephen Curry | Klay Thompson | Andre Iguodala | Draymond Green | Andrew Bogut | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of Minutes Played | MIN | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 |
| Number of 2 Point Field Goals Made | 2PM | 7.0 | 7.2 | 4.0 | 6.3 | 5.6 | 30.2 |
| Number of 2 Point Field Goals Missed | 2PMS | 5.5 | 6.8 | 2.4 | 4.3 | 3.4 | 22.5 |
| Number of 3 Point Field Goals Made | 3PM | 7.7 | 5.1 | 1.9 | 2.1 | 0.0 | 16.8 |
| Number of 3 Point Field Goals Missed | 3PMS | 8.9 | 7.1 | 2.8 | 2.7 | 0.0 | 21.5 |
| Number of Free Throws Made | FTM | 5.4 | 3.4 | 1.7 | 4.4 | 0.9 | 15.9 |
| Number of Free Throws Missed | FTMS | 0.6 | 0.3 | 1.0 | 1.9 | 0.7 | 4.5 |
| Number of Offensive Rebounds | OREB | 1.4 | 0.6 | 1.6 | 2.3 | 4.1 | 9.9 |
| Number of Defensive Rebounds | DREB | 6.4 | 5.0 | 5.6 | 10.5 | 12.2 | 39.6 |
| Number of Steals | STL | 3.0 | 1.3 | 1.9 | 2.1 | 1.1 | 9.5 |
| Number of Blocks | BLK | 0.3 | 0.7 | 0.5 | 2.1 | 3.6 | 7.3 |
| Number of Fouls | PF | 3.0 | 2.4 | 2.6 | 4.0 | 7.2 | 19.2 |

(League Team Stats, 2016)

Table 6: Cleveland Cavaliers’ Stats – Prior to Time Normalization Adjustment

Average Statistics (last 50 games of the season)

| Statistic | Abbreviated Name | Kyrie Irving | J.R. Smith | LeBron James | Kevin Love | Tristan Thompson | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of Minutes Played | MIN | 32.1 | 31.2 | 36.9 | 32.8 | 30.2 | 32.6 |
| Number of 2 Point Field Goals Made | 2PM | 6.1 | 2.0 | 9.6 | 3.8 | 4.2 | 25.7 |
| Number of 2 Point Field Goals Missed | 2PMS | 5.8 | 2.3 | 6.9 | 3.9 | 2.4 | 21.3 |
| Number of 3 Point Field Goals Made | 3PM | 1.7 | 3.0 | 1.4 | 2.6 | 0.0 | 8.7 |
| Number of 3 Point Field Goals Missed | 3PMS | 3.4 | 4.1 | 2.6 | 3.7 | 0.0 | 13.8 |
| Number of Free Throws Made | FTM | 3.3 | 0.5 | 5.2 | 4.2 | 2.6 | 15.8 |
| Number of Free Throws Missed | FTMS | 0.4 | 0.3 | 1.9 | 0.8 | 1.3 | 4.7 |
| Number of Offensive Rebounds | OREB | 0.9 | 0.5 | 1.7 | 2.1 | 3.9 | 9.1 |
| Number of Defensive Rebounds | DREB | 2.1 | 2.4 | 6.1 | 8.3 | 6.3 | 25.2 |
| Number of Steals | STL | 1.1 | 1.1 | 1.3 | 0.6 | 0.5 | 4.6 |
| Number of Blocks | BLK | 0.3 | 0.2 | 0.6 | 0.7 | 0.8 | 2.6 |
| Number of Fouls | PF | 2.0 | 2.5 | 2.1 | 2.3 | 2.5 | 11.4 |

Note: For MIN, the Total column shows the five-player average.

(League Team Stats, 2016)

Table 7: Cleveland Cavaliers’ Stats – Adjusted for Assumption that Starters Play Entire Game

Average Statistics (last 50 games of the season)

| Statistic | Abbreviated Name | Kyrie Irving | J.R. Smith | LeBron James | Kevin Love | Tristan Thompson | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of Minutes Played | MIN | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 |
| Number of 2 Point Field Goals Made | 2PM | 9.1 | 3.1 | 12.5 | 5.6 | 6.7 | 36.9 |
| Number of 2 Point Field Goals Missed | 2PMS | 8.7 | 3.5 | 9.0 | 5.7 | 3.8 | 30.7 |
| Number of 3 Point Field Goals Made | 3PM | 2.5 | 4.6 | 1.8 | 3.8 | 0.0 | 12.8 |
| Number of 3 Point Field Goals Missed | 3PMS | 5.1 | 6.3 | 3.4 | 5.4 | 0.0 | 20.2 |
| Number of Free Throws Made | FTM | 4.9 | 0.8 | 6.8 | 6.1 | 4.1 | 22.7 |
| Number of Free Throws Missed | FTMS | 0.6 | 0.5 | 2.5 | 1.2 | 2.1 | 6.8 |
| Number of Offensive Rebounds | OREB | 1.3 | 0.8 | 2.2 | 3.1 | 6.2 | 13.6 |
| Number of Defensive Rebounds | DREB | 3.1 | 3.7 | 7.9 | 12.1 | 10.0 | 36.9 |
| Number of Steals | STL | 1.6 | 1.7 | 1.7 | 0.9 | 0.8 | 6.7 |
| Number of Blocks | BLK | 0.4 | 0.3 | 0.8 | 1.0 | 1.3 | 3.8 |
| Number of Fouls | PF | 3.0 | 3.8 | 2.7 | 3.4 | 4.0 | 16.9 |

(League Team Stats, 2016)

As indicated earlier, the number of minutes played and all stats are adjusted so that all starters play a

total of 48 minutes. The stats assume that the starters maintain the same level of productivity in

the additional minutes added for each player. In addition, the stats listed in Tables 4 and 6 include the last

50 games of the season and exclude the first 32 games of the season. All players were available for the

games utilized in the statistics above. Stated differently, no players’ statistics listed above were affected

by missing games.

Offensive Rebounds per Field Goal Miss Statistics

When a field goal is missed (either 2 pointer or 3 pointer) each team has the opportunity to acquire the

missed shot and rebound the ball. If the team that missed the field goal recovers

(rebounds) the ball, it is called an offensive rebound (OREB). If the other team recovers the ball, it is

referred to as a defensive rebound (DREB). In order to account for the ratio of how many times the Warriors

and Cavaliers get an offensive rebound, the number of field goals missed (FGMS) also has to be taken into

account.

The ratio listed below is used in the model to determine the chance of each team of getting an offensive

rebound. It is listed with a discrete probability for each team when either a 2 pointer or a 3 pointer is

missed. Unlike in real games, the model assumes that offensive rebounds OREB cannot be made on missed

free throws. However, offensive rebounds from missed free throw attempts occur very infrequently.

Table 8 (Adjusted for 48-Minute Game)

| Statistic | Golden State Warriors | Cleveland Cavaliers |
| --- | --- | --- |
| Offensive Rebounds (OREB) | 9.9 | 13.6 |
| Number of Field Goals Missed (FGMS) | 44.2 | 53.6 |
| OREB / FGMS Ratio | 22.3% | 25.4% |
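
The ratio in Table 8 and its use as a discrete probability can be sketched as follows. This is a minimal illustration, not the Arena implementation: the function names are hypothetical, and because the inputs are the rounded table values, the computed ratios may differ from the published percentages in the last digit.

```python
import random


def offensive_rebound_rate(oreb: float, missed_field_goals: float) -> float:
    """Share of missed field goals recovered by the shooting team."""
    return oreb / missed_field_goals


def offense_keeps_ball(rate: float, rng: random.Random) -> bool:
    """Draw once: True means an offensive rebound, False a defensive rebound."""
    return rng.random() < rate


warriors_rate = offensive_rebound_rate(9.9, 44.2)    # Table 8, Warriors
cavaliers_rate = offensive_rebound_rate(13.6, 53.6)  # Table 8, Cavaliers
print(f"Warriors: {warriors_rate:.3f}, Cavaliers: {cavaliers_rate:.3f}")

rng = random.Random(2016)  # fixed seed so the sketch is repeatable
print(offense_keeps_ball(warriors_rate, rng))
```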

Section 6: Code and Diagrams

The following section will walk through the Arena code, the workflow and list diagrams that will explain

how the simulation works.

Part 1: Initial Start of the Game

In order to get the basketball game started, there is a tipoff that occurs at the beginning of the game. In

the simulation, there is a 50% chance that the Warriors get the ball initially and a 50% chance that the

Cavaliers get the ball to start the game. In other words, each team has the same chance of getting the ball

first.

Figure 1: Start of the Simulation

Part 2: Record Scores, Possible Game Termination, Record Winner, Hold the Ball

Before the ball is distributed to any player, the simulation performs the following actions.

- Is the Game Over? In this section, the simulation checks to see if 48 minutes or more have

passed. It also checks to see if the game is tied. If the time of the game is greater than or equal to

48 minutes and the game is not tied, the simulation will record team and individual stats and

terminate. Please note that there is a high probability that the game runs over 48 minutes, although

most likely by less than a minute. The simulation was set up this way so that all

the recordings would occur without interference.

- Did the Warriors Win? If the game is over, the simulation will look to see if the Warriors have

more points than the Cavaliers. If it is true, it records the Warriors victory and terminates the

simulation.

Figure 2: Decisions for a game over scenario, distributing the ball if game is not over, and figuring out which team won if game is over

- Game Over. After the simulation records a Warriors victory or does not record a victory, it ends

/ terminates the session. An entirely new simulation/game can begin at this time.

- If the game is not over, the game continues and one of the teams holds the ball. If the game has

not reached 48 minutes, or is tied, the game continues and the team that has possession holds the ball.
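
The termination logic in the bullets above can be sketched in a few lines. The source builds this logic in Arena rather than in Python, so the function and variable names below are hypothetical and serve only to make the two checks explicit.

```python
REGULATION_SECONDS = 48 * 60  # 48-minute game expressed in seconds


def game_is_over(elapsed_seconds: float, warriors_points: int,
                 cavaliers_points: int) -> bool:
    """The game ends only when regulation time has passed and the score is not tied."""
    return elapsed_seconds >= REGULATION_SECONDS and warriors_points != cavaliers_points


def warriors_won(warriors_points: int, cavaliers_points: int) -> bool:
    """Checked only after game_is_over() returns True."""
    return warriors_points > cavaliers_points


print(game_is_over(2890, 101, 101))  # tied after 48 minutes, so play continues
print(game_is_over(2890, 104, 101))  # time expired with a leader, so the game ends
```

Because the check runs only between possessions, a game can end slightly after the 2,880-second mark, which matches the overrun noted in the text.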

Part 3: Hold the Ball

Since each team only distributes the ball once based on discrete probabilities (see next part listed

below), the ball needs to be held for a certain period of time. The average amount of time each team held

the ball was not readily available in the data set collected. However, the number of Field Goal Attempts

was available. With this stat, the time before taking a shot could be backed into for each team. The average

number of FGAs for the Warriors was 89.8 attempts with a standard deviation of 10.9 attempts. The

Cavaliers averaged 100.9 attempts with a standard deviation of 14.0. See the full table of statistics listed

below.

Table 8: Calculation of Average and Standard Deviation of the Time Holding the Ball for Each Team

| Statistic | Golden State Warriors | Cleveland Cavaliers |
| --- | --- | --- |
| Average Number of FGAs | 89.8 | 100.9 |
| St Dev Number of FGAs | 10.9 | 14.0 |
| Number of Minutes Per Game | 48 | 48 |
| Number of Seconds Per Game | 2,880 | 2,880 |
| Avg Number of Seconds Per Shot (Seconds per game / Avg FGAs / 2) (1) | 16.0 | 14.3 |
| St Dev Number of Seconds Per Shot | 1.95 | 1.98 |

(1) This stat is divided by 2 in order to account for 2 teams acquiring the ball during the game.

Figure 3: Holding the ball delay process before the distribution of the ball to an individual player.
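
The hold-time calculation can be reproduced with the short sketch below. The mean follows the formula given in the table (seconds per game / average FGAs / 2). The source does not state how the standard deviation was converted from attempts to seconds; scaling the mean by the same coefficient of variation (standard deviation of FGAs divided by average FGAs) is an assumption that happens to reproduce the tabulated 1.95 and 1.98. Function names are hypothetical.

```python
import random

SECONDS_PER_GAME = 48 * 60  # 2,880 seconds


def hold_time_stats(mean_fga: float, sd_fga: float) -> tuple[float, float]:
    """Return (mean, standard deviation) of seconds held before a shot."""
    mean_hold = SECONDS_PER_GAME / mean_fga / 2  # divide by 2: two teams share the clock
    sd_hold = mean_hold * sd_fga / mean_fga      # assumption: same coefficient of variation
    return mean_hold, sd_hold


def sample_hold_time(mean_hold: float, sd_hold: float, rng: random.Random) -> float:
    """Draw one hold time from a normal distribution, floored at zero seconds."""
    return max(0.0, rng.gauss(mean_hold, sd_hold))


warriors_hold = hold_time_stats(89.8, 10.9)
cavaliers_hold = hold_time_stats(100.9, 14.0)
print(warriors_hold, cavaliers_hold)

rng = random.Random(2016)
print(sample_hold_time(*warriors_hold, rng))
```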

Part 4: Distribution of the Ball

Once a team has the ball, a discrete distribution is used to determine which player gets the ball. To

simplify the model, the simulation does not record any assists. The distribution step simply gets the ball to

the player who is going to perform some type of action. The discrete probabilities are computed from

each player’s share of his team’s field goal attempts:

Discrete Probability Formula:

Number of Field Goal Attempts for Player i / Number of Field Goal Attempts for Team j
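
As an illustration of the formula above, the sketch below derives each Warriors player's probability of receiving the ball from the adjusted shot counts in Table 5, treating field goal attempts as makes plus misses for 2 and 3 pointers. The resulting percentages are computed from rounded table values and may not match Figure 5 exactly; the helper names are hypothetical.

```python
import random

# Per-48-minute shot counts from Table 5: (2PM, 2PMS, 3PM, 3PMS)
WARRIORS_SHOTS = {
    "Stephen Curry": (7.0, 5.5, 7.7, 8.9),
    "Klay Thompson": (7.2, 6.8, 5.1, 7.1),
    "Andre Iguodala": (4.0, 2.4, 1.9, 2.8),
    "Draymond Green": (6.3, 4.3, 2.1, 2.7),
    "Andrew Bogut": (5.6, 3.4, 0.0, 0.0),
}


def ball_probabilities(shots: dict) -> dict:
    """Player FGA divided by team FGA, where FGA = makes + misses."""
    fga = {name: sum(counts) for name, counts in shots.items()}
    team_fga = sum(fga.values())
    return {name: attempts / team_fga for name, attempts in fga.items()}


def pick_player(probabilities: dict, rng: random.Random) -> str:
    """Weighted discrete draw of the player who receives the ball."""
    names = list(probabilities)
    weights = [probabilities[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


probs = ball_probabilities(WARRIORS_SHOTS)
for name, p in probs.items():
    print(f"{name}: {p:.1%}")
print(pick_player(probs, random.Random(2016)))
```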

Figure 4: Warriors Distribution Decision

The discrete distribution statistics for each team are shown in the histograms below. The

percentages represent the probability the player is going to receive the ball and have the chance to do

something with it.

Figure 5: Warriors’ discrete probabilities of players’ chances of receiving the ball

Figure 6: Cavaliers’ discrete probabilities of players’ chances of receiving the ball

Part 5: Players’ Decisions and Actions

Once each player has the ball, there are eight (8) main actions that can happen. Please note that there

are “sub-actions” that happen following these main eight actions. The initial eight (8) actions and sub-actions

are listed below:

Table 9: Player Decision and Action Descriptions and Explanations

| Main Action | Description | Sub-Action |
| --- | --- | --- |
| 2PM | Player i shoots a 2-point field goal and successfully makes it | Ball is transferred to other team |
| 2PMS | Player i shoots a 2-point field goal and misses | Player i’s team has the opportunity to get an offensive rebound (OREB) and distribute the ball to any teammate again. The main actions are recalculated each time |
| 3PM | Player i shoots a 3-point field goal and successfully makes it | Ball is transferred to other team |
| 3PMS | Player i shoots a 3-point field goal and misses | Player i’s team has the opportunity to get an offensive rebound (OREB) and distribute the ball to any teammate again. The main actions are recalculated each time |
| BLK / STL (1) | Player i gets the ball stolen or blocked by Player j | Ball is transferred to other team |
| PF 0-0 | Player i gets fouled by Player j (2 free throw shots are assumed) | Player i makes 0 out of 2 free throws. Ball is distributed to other team |
| PF 1-0 | Player i gets fouled by Player j (2 free throw shots are assumed) | Player i makes 1 out of 2 free throws. Ball is distributed to other team |
| PF 1-1 | Player i gets fouled by Player j (2 free throw shots are assumed) | Player i makes 2 out of 2 free throws. Ball is distributed to other team |

(1) The chances of Player i getting the ball stolen or blocked depend on their opponent’s (Player j) defensive statistics. For example, if Stephen Curry is on offense and Kyrie Irving is on defense, and Kyrie averages 1.6 steals per game and 0.4 blocks per game, those same stats are used as Stephen Curry’s odds of having a shot blocked or the ball stolen.
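
The eight main actions and their sub-actions form a small lookup from outcome to points scored and next ball state. The sketch below is a hedged illustration of that mapping. The probabilities in EXAMPLE_PROBS are hypothetical placeholders chosen only so they sum to 1; they are not the values used in the study, and the state names "other_team" and "rebound_contest" are invented for this example.

```python
import random

# (points scored, what happens to the ball next) for each main action
ACTION_EFFECTS = {
    "2PM": (2, "other_team"),
    "2PMS": (0, "rebound_contest"),
    "3PM": (3, "other_team"),
    "3PMS": (0, "rebound_contest"),
    "BLK/STL": (0, "other_team"),
    "PF 0-0": (0, "other_team"),
    "PF 1-0": (1, "other_team"),
    "PF 1-1": (2, "other_team"),
}

# Hypothetical probabilities for illustration only (not taken from the study)
EXAMPLE_PROBS = {
    "2PM": 0.20, "2PMS": 0.16, "3PM": 0.21, "3PMS": 0.25,
    "BLK/STL": 0.08, "PF 0-0": 0.01, "PF 1-0": 0.02, "PF 1-1": 0.07,
}


def resolve_action(probs: dict, rng: random.Random) -> tuple[str, int, str]:
    """Draw one main action and return (action, points, next ball state)."""
    actions = list(probs)
    action = rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]
    points, next_state = ACTION_EFFECTS[action]
    return action, points, next_state


rng = random.Random(2016)
print(resolve_action(EXAMPLE_PROBS, rng))
```

A "rebound_contest" result would then be resolved with the team's offensive rebound probability, while "other_team" hands possession to the opponent.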

Figure 7: Example of S Curry Actions / Decisions

Note: Although the action is not shown above, if the ball is blocked or stolen by opponent, the ball is transferred to the opposing team. In addition, if a 2 or 3 pointer is missed, the team on offense does get a chance of getting an offensive rebound. Those probabilities are listed in Figure 8.

Each player’s decisions and actions vary depending on his regular season stats. Those discrete

probabilities are listed below:

Figure 8: Player Decisions / Actions – Discrete Probabilities

If a player is fouled, the player gets to attempt 2 foul shots. Please note that in an actual game, a player

can get fouled while shooting a three-pointer and get three foul shots. However, this situation is a fairly

rare occurrence. Therefore, the simulation does not take the possibility of getting fouled while shooting a

three-pointer into account. Each player’s probability of making a foul shot is listed below in the following

discrete probabilities. An example of how the odds of a player making 0, 1, or 2 free throws based on an

overall percentage of successful attempts is listed below:

Table 10: Example of Steph Curry’s FTA Probabilities

| Statistic | S. Curry |
| --- | --- |
| Overall FT% | 90.9% |
| Odds of Making 0 out of 2 Free Throws (9.1% x 9.1%) | 0.8% |
| Odds of Making 1 out of 2 Free Throws (2 x 90.9% x 9.1%) | 16.5% |
| Odds of Making 2 out of 2 Free Throws (90.9% x 90.9%) | 82.6% |

Figure 9: Player free throw percentages
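
The split of a two-shot foul into 0, 1, or 2 makes follows a binomial calculation if the two attempts are treated as independent with the same success rate, which is an assumption of this sketch. With a free throw percentage of 90.9%, it gives roughly 0.8%, 16.5%, and 82.6%; the one-make case counts both orders (make then miss, and miss then make). The function name is hypothetical.

```python
def free_throw_outcomes(ft_pct: float) -> dict:
    """Probabilities of making 0, 1, or 2 of 2 independent free throws."""
    miss = 1.0 - ft_pct
    return {
        0: miss * miss,
        1: 2.0 * ft_pct * miss,  # two orderings: make-miss and miss-make
        2: ft_pct * ft_pct,
    }


curry = free_throw_outcomes(0.909)
for makes, prob in curry.items():
    print(f"{makes} of 2: {prob:.1%}")
```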

As mentioned earlier, if a player shoots a 2 or 3 pointer and misses, they have a chance to recover the

ball and get an offensive rebound. The chances of each team of getting an offensive rebound are listed

below.

Table 11: Discrete Probabilities of Each Team Getting an Offensive Rebound

| Team | Discrete Probability of an Offensive Rebound |
| --- | --- |
| Golden State Warriors | 22.3% |
| Cleveland Cavaliers | 25.4% |

Section 7: Experimental Results

As mentioned earlier, in order to get relevant and significant averages and standard deviations, the

simulation was run for 101 replications (games). The significance level used for the half widths produced by

the simulation was 0.05.

Figure 10: View of Arena’s run setup options (setting up 101 repetitions)

The Warriors defeated the Cavaliers in an average of 73.3% of the simulated games, with a half width of 9%. In

other words, the Warriors’ win percentage over a 101 game series is estimated at 64.3% - 82.3% with a 95% confidence

level.
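
A half width of this size can be approximated with a standard t-based confidence interval on the win indicator (1 for a Warriors win, 0 otherwise). The sketch below assumes that construction, which may differ in detail from Arena's internal calculation; it uses 74 wins in 101 games, the count reported later in this section, and a hard-coded 95% t critical value for 100 degrees of freedom. The names are hypothetical.

```python
import math

T_CRIT_95_DF100 = 1.984  # two-sided 95% t critical value, 100 degrees of freedom


def win_rate_interval(wins: int, games: int,
                      t_crit: float = T_CRIT_95_DF100) -> tuple[float, float]:
    """Return (win rate, half width) for a 0/1 win indicator over the games."""
    rate = wins / games
    # Sample standard deviation of a 0/1 variable, with the n - 1 correction
    sample_sd = math.sqrt(rate * (1.0 - rate) * games / (games - 1))
    half_width = t_crit * sample_sd / math.sqrt(games)
    return rate, half_width


rate, half_width = win_rate_interval(74, 101)
print(f"win rate {rate:.1%}, half width {half_width:.1%}")
print(f"interval {rate - half_width:.1%} to {rate + half_width:.1%}")
```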

Figure 11: Arena’s user specified Warriors’ win percentage results

Each of the game results are also recorded in Arena, and can be exported to a .csv file. The results of

each game are visually shown in the graph below.

Figure 12: Margin of victory for each team (all 101 games)

In the graph above, each of the margins of victory for each game for each team is shown. The blue bars

represent the margin of victory for Golden State while the red bars represent the margin of victory for

Cleveland. The largest margin for the Warriors was 46 points while the largest margin of victory for

Cleveland was 22. Although the Warriors obviously appear to be clear favorites over the Cavaliers (74

games to 27), there are many instances where the Cavaliers would win a best out of 7 series. Those

instances or runs are highlighted using the red box to show the cases where the Cavaliers would have won

at least 4 games out of 7. Please note that some red boxes have multiple scenarios in which the Cavaliers

would win a best out of 7 series. For example, the first red box listed in the graph has 6 scenarios in which

the Cavaliers would win a best out of 7 series. Also note that once a team wins 4 games, the winner is

determined and the series ends.

Indicates times when Cleveland is able to win 4 out of 7 games. Please note that there are different combinations that can allow for Cleveland to win 4 out of 7 games.
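
As a complement to counting runs in the graph, the chance of winning a best out of 7 series can be computed directly from a single-game win probability. The sketch below is not part of the original study: it assumes every game is independent with the same win probability and ignores home-court effects. It sums, over the number of losses k (0 to 3), the probability that the fourth win arrives in game 4 + k. The function name is hypothetical.

```python
from math import comb


def series_win_probability(p_game: float, wins_needed: int = 4) -> float:
    """Probability of reaching wins_needed wins before the opponent does."""
    q = 1.0 - p_game
    return sum(
        comb(wins_needed - 1 + losses, losses) * p_game**wins_needed * q**losses
        for losses in range(wins_needed)
    )


cavaliers_game_rate = 27 / 101  # Cavaliers' share of the simulated games
cavaliers_series = series_win_probability(cavaliers_game_rate)
print(f"{cavaliers_series:.1%}")
```

Under these assumptions, a team that wins fewer than half of its individual games has a series win probability well below its single-game rate, because a longer series favors the stronger team.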

Table 16: Individual Stat Comparisons (Simulation Averages v NBA finals Averages)

Note: The stats above are adjusted for players playing entire 48-minute game (while maintaining same “pace”). In other words,

the stats above are slightly inflated.

As the tables above show, the Golden State players whose scoring averages were dramatically lower in the NBA finals than in the simulation were Steph Curry (6.3 points less in the finals) and Klay Thompson (3.9 points less in the finals). “As for Curry, his finals experience was an obstacle course of long-limbed defenders (he shot 40.3 percent from the field), spats with officials (he chucked his mouth guard after he was ejected from Game 6) and volleys from critics, who took jabs at everything from his poor shooting to his choice of sneakers. Game 7 was another slog” (Cacciola, 2016).

On the other side, the Cavaliers who performed better in the NBA finals than in the simulation were Kyrie Irving (9 points more in the finals) and LeBron James (6.3 points more in the finals). “James collected 27 points, 11 rebounds, and 11 assists to punctuate one of the most remarkable individual performances in finals history. James, who was named the finals’ most valuable player got ample help from his teammate Kyrie Irving, whose 3-pointer with 53 seconds remaining gave the Cavaliers the lead” (Cacciola, 2016). Although there were multiple factors involved, one could say that the performances of these four players ultimately led to the Cavaliers’ victory over the Warriors.

Section 8: Sensitivity Analysis

There are two parts to the Sensitivity Analysis section. Part 1 changes the time (in seconds) that each team holds the ball. One scenario is a game simulation that is “sped up”, meaning that each team holds the ball for 5 seconds less (on average); the other is a game that is “slowed down”, meaning that each team holds the ball for 5 seconds more. The original standard deviations remain constant in all scenarios. The number of times each team wins (out of 101) is compared across the scenarios and against the results of the original simulation. The results help determine the variability and validity of the model as this variable changes. The number of times each team wins should remain fairly consistent, and the scenarios should correlate highly with one another. The results of each scenario are listed below:

Table 12: Comparison of Simulation Scenarios (Changing Time of Holding Ball Variable)

Description                                       Original Simulation (in seconds)   Scenario 1: Faster Paced Game   Scenario 2: Slower Paced Game
Average Time Warriors Hold the Ball               16.0                               11.0                            21.0
Standard Deviation Time Warriors Hold the Ball    1.9                                1.9                             1.9
Average Time Cavaliers Hold the Ball              14.3                               9.3                             19.3
Standard Deviation Time Cavaliers Hold the Ball   2.0                                2.0                             2.0
Warrior Win % (out of 101 games)                  73.3%¹                             73.3%                           71.3%

¹ See Section 9 for more detailed results

All three scenarios produce very similar results. In other words, varying the “pace” of the game does not significantly change the number of times each team wins.
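
To illustrate what the three pace scenarios change, the sketch below estimates how many possessions fit into a 48-minute game under each setting. It is only a sketch under stated assumptions: hold times are drawn from a normal distribution with the means and standard deviations of Table 12, possessions simply alternate between the teams, and a 1-second floor guards against negative draws. None of these details is confirmed as the Arena model's actual logic, and the function name `possessions_per_game` is hypothetical.

```python
import random

GAME_SECONDS = 48 * 60  # regulation game length


def possessions_per_game(mean_a, sd_a, mean_b, sd_b, rng, min_hold=1.0):
    """Count alternating possessions until the game clock runs out."""
    clock, count, a_has_ball = 0.0, 0, True
    while clock < GAME_SECONDS:
        mean, sd = (mean_a, sd_a) if a_has_ball else (mean_b, sd_b)
        # Assumption: normally distributed hold time, floored at min_hold.
        clock += max(min_hold, rng.gauss(mean, sd))
        count += 1
        a_has_ball = not a_has_ball
    return count


# (Warriors mean, Cavaliers mean) in seconds, taken from Table 12.
scenarios = {
    "original": (16.0, 14.3),
    "faster": (11.0, 9.3),
    "slower": (21.0, 19.3),
}

rng = random.Random(525)  # fixed seed so the sketch is repeatable
avg_possessions = {}
for name, (gsw_mean, cle_mean) in scenarios.items():
    runs = [possessions_per_game(gsw_mean, 1.9, cle_mean, 2.0, rng) for _ in range(101)]
    avg_possessions[name] = sum(runs) / len(runs)
    print(f"{name}: about {avg_possessions[name]:.0f} possessions per game")
```

Because both teams gain or lose possessions together, a change in pace scales the scoring opportunities of both sides, which is consistent with the win percentages in Table 12 staying close to one another.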

The second part substitutes one player on each team. As a side note, the number of seconds that each team holds the ball reverts to the original simulation values. For the Warriors, Shaun Livingston replaces Andrew Bogut. For the Cavaliers, Iman Shumpert replaces J.R. Smith. The data entered into the simulation for the two replacement players is listed below.

Table 13: Sensitivity Analysis Part 2 - Player Decisions / Actions – Discrete Probabilities

Action       S Livingston   I Shumpert
2PM          40.3%          17.3%
2PMS         32.5%          20.9%
3PM          0.6%           9.6%
3PMS         0.0%           25.1%
BLK / STL¹   15.9%          21.3%
PF 0-0       0.3%           0.2%
PF 1-0       3.0%           1.9%
PF 1-1       7.5%           3.6%

¹ The chances of Player i getting the ball stolen or blocked depend on their opponent’s (Player j) defensive statistics.
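
A discrete probability table like the one above can be sampled directly. The following minimal Python sketch draws actions for S Livingston using the Table 13 percentages as weights. The names `LIVINGSTON_ACTIONS` and `draw_action` are hypothetical, and the sketch ignores the footnote's dependence of BLK / STL on the opposing defender, so it is a simplification rather than the Arena decision module itself.

```python
import random

# Action probabilities (in percent) for S Livingston, copied from Table 13.
LIVINGSTON_ACTIONS = {
    "2PM": 40.3, "2PMS": 32.5, "3PM": 0.6, "3PMS": 0.0,
    "BLK/STL": 15.9, "PF 0-0": 0.3, "PF 1-0": 3.0, "PF 1-1": 7.5,
}


def draw_action(table, rng):
    """Pick one action with probability proportional to its table entry."""
    actions = list(table)
    weights = list(table.values())
    # random.choices normalizes the weights, so rounding to 100.1% is harmless.
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(2016)  # fixed seed so the sketch is repeatable
draws = [draw_action(LIVINGSTON_ACTIONS, rng) for _ in range(10_000)]
share_2pm = draws.count("2PM") / len(draws)
print(f"Share of 2PM outcomes: {share_2pm:.3f}")
```

The same function works for any other player's column, or for the ball-distribution probabilities, by passing a different table.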

Table 14: Sensitivity Analysis Part 2 - Discrete Probabilities of Players’ Chances of Receiving the Ball

Team           Player         Probability of Receiving Ball
Golden State   S Curry        30.9%
Golden State   K Thompson     27.9%
Golden State   A Iguodala     11.8%
Golden State   D Green        16.5%
Golden State   S Livingston   13.0%
Cleveland      K Irving       27.1%
Cleveland      I Shumpert     11.6%
Cleveland      L James        28.3%
Cleveland      K Love         21.9%
Cleveland      T Thompson     11.1%

Although these players were not considered starters, both Shaun Livingston and Iman Shumpert played a significant amount of time for their respective teams during the regular season. Since only one player on each team is replaced, the number of times each team wins should remain fairly similar. The results of each scenario are compared below:

Table 15: Win % Comparison (Original Lineup versus New Lineup)

Description                                 Original Simulation   Substitution of One (1) Player for Each Team
Warrior Win % Average (out of 101 games)¹   73.3%¹                76.2%
Warrior Win % Range (95% Confidence)¹       64.3% to 82.3%        68.2% to 84.2%

Section 9: Validation Analysis

As mentioned earlier, FiveThirtyEight is a website that publishes articles, often built on statistical models and analysis, on subjects including sports, politics, and culture. Nate Silver, a renowned statistician, created the website in 2008. “During the U.S. presidential primaries and United States general election of 2008 the site compiled polling data through a unique methodology derived from Silver's experience in baseball sabermetrics to ‘balance out the polls with comparative demographic data.’” (FiveThirtyEight, 2016). The website produced an article that listed the Warriors with a 69% chance of winning the series before it started (Morris, 2016). Based on betting odds as of June 2, 2016, an article from usatoday.com listed the Warriors’ chance of winning at 66.7% (Neuharth-Keusch, 2016). These two predictions are compared to the simulation prediction in the table below:

Table 16: Odds of Warriors to Win NBA Finals Comparison

Source                                Prediction Odds of Warriors Winning the NBA Finals   95% Confidence Range
Simulation Odds Produced from Arena   73.3%                                                64.3% to 82.3%
FiveThirtyEight Odds                  69.0%
Betting Odds (per usatoday.com)       66.7%
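
As a rough cross-check of the confidence range in the table, the sketch below computes a 95% interval for a win proportion of 74 out of 101 games using the normal approximation. This is an assumption made for illustration: Arena may compute its half-width differently, and the function name `proportion_interval` is hypothetical. Under this approximation the half-width comes out near 8.6 percentage points, close to the roughly 9 points implied by the reported range of 64.3% to 82.3%.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(wins, games, confidence=0.95):
    """Normal-approximation confidence interval for a win proportion."""
    p = wins / games
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / games)
    return p - half_width, p + half_width


# 74 Warriors wins out of 101 simulated games.
low, high = proportion_interval(74, 101)
print(f"95% interval: {low:.1%} to {high:.1%}")

# Do the two outside predictions fall inside the simulated interval?
for label, odds in [("FiveThirtyEight", 0.690), ("Betting odds", 0.667)]:
    print(label, "inside interval:", low <= odds <= high)
```

Both outside predictions fall inside the interval, which matches the comparison the table is meant to support.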

To validate some of the other simulation results, the averages produced by the simulation are compared to regular season statistics for each individual player included in the simulation. Please note that the actual statistics below are adjusted for a 48-minute game.

Table 17: Individual Player Statistics Comparison (Simulation versus Actual Results)

Team           Player       Simulated Average Points Per Game¹   Actual Season Stats (all 82 games)¹
Golden State   S Curry      37.2                                 42.2
Golden State   K Thompson   30.6                                 31.9
Golden State   A Iguodala   12.4                                 12.6
Golden State   D Green      17.5                                 19.3
Golden State   A Bogut      8.7                                  12.5
Cleveland      K Irving     24.4                                 29.8
Cleveland      JR Smith     16.3                                 19.3
Cleveland      L James      27.9                                 34.1
Cleveland      K Love       19.8                                 24.3
Cleveland      T Thompson   9.1                                  13.5

¹ The stats above are adjusted for players playing an entire 48-minute game (while maintaining the same “pace”). In other words, the stats above are slightly inflated.

The differences in the comparisons above are attributed to the various assumptions made in the simulation, including the number of times a player is fouled, when the ball is stolen or blocked, and how many times a team successfully gets an offensive rebound. Another contributor to the point differential is the assumption about how long each team holds the ball (see Table 8 for more detail).
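
The size of these differences can be tabulated directly from Table 17. The short Python sketch below computes, for each player, the actual 48-minute scoring average minus the simulated one. The variable names are hypothetical and the figures are copied from the table; nothing here is re-simulated.

```python
# (simulated points per game, actual points per 48 minutes) from Table 17.
TABLE_17 = {
    "S Curry": (37.2, 42.2), "K Thompson": (30.6, 31.9),
    "A Iguodala": (12.4, 12.6), "D Green": (17.5, 19.3),
    "A Bogut": (8.7, 12.5), "K Irving": (24.4, 29.8),
    "JR Smith": (16.3, 19.3), "L James": (27.9, 34.1),
    "K Love": (19.8, 24.3), "T Thompson": (9.1, 13.5),
}

# Positive gap: the simulation scored the player lower than the season data.
gaps = {
    player: round(actual - simulated, 1)
    for player, (simulated, actual) in TABLE_17.items()
}
mean_gap = sum(gaps.values()) / len(gaps)
largest = max(gaps, key=gaps.get)

for player, gap in gaps.items():
    print(f"{player:<12} {gap:+.1f}")
print(f"Mean gap: {mean_gap:.2f} points; largest gap: {largest}")
```

Every gap is positive, so the simulation understates scoring for all ten players, which fits the explanation that shared assumptions (fouls, steals and blocks, offensive rebounds, hold time) rather than any single player's inputs drive the difference.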

Section 10: Conclusions and Questions

Although the model did not predict a Cavaliers win in the NBA finals, its prediction (73%) was in line with other robust predictors made before the two teams played, including FiveThirtyEight’s 69% chance of the Warriors winning the series and the betting odds reported by USA Today, which put the Warriors’ chances of winning at 67%.

As mentioned earlier, this simulation has the potential to simulate any NBA matchup for a series of games. The odds of each team winning are easily accumulated in the simulation, along with a range at a

95% confidence level. Once the model has been compared against, and thoroughly validated with, other matchups, its uses become almost limitless. The model is unique and extremely flexible in the sense that it uses each player’s individual stats to predict a team’s chance of winning. In other words, any five starters from any team can be input into the model, and teams can play each other to predict an outcome. To take it one step further, even teams from different generations can be entered into the model. However, some adjustments may be needed to account for rule changes in the NBA. For example, the three-point shot was not available until 1979 (Wood, 2011).

Other potential extensions of this program include adding more players to the simulation in order to take bench players into account. One of the factors that led to the Cavaliers’ win over the Warriors was the suspension of Draymond Green in Game 5. It is hard to predict such rare occurrences and to model them in a simulation. With that said, one option would be to use various “starting lineups” for each team and see whether the results remain unchanged. In other words, shuffle the potential lineups using multiple combinations of each team’s top 7-8 players and take an average of the averages to get additional predictions and ranges.
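
The lineup-shuffling idea could be organized along the following lines. This is a sketch under loud assumptions: the roster names and ratings are invented, and `placeholder_simulation` is a hypothetical stand-in that merely averages those made-up ratings. In practice that function would be replaced by a full simulation run for the given lineup.

```python
from itertools import combinations
from statistics import mean


def average_over_lineups(roster, simulate_win_pct, lineup_size=5):
    """Run the supplied simulation for every lineup and average the results."""
    results = {
        lineup: simulate_win_pct(lineup)
        for lineup in combinations(roster, lineup_size)
    }
    return mean(results.values()), results


# Hypothetical ratings for a team's top 8 players (illustrative values only).
RATINGS = {f"Player {i}": 0.60 + 0.03 * i for i in range(1, 9)}


def placeholder_simulation(lineup):
    """Stand-in for a full simulation run: averages the made-up ratings."""
    return mean(RATINGS[player] for player in lineup)


overall, per_lineup = average_over_lineups(list(RATINGS), placeholder_simulation)
print(len(per_lineup), "lineups, average of averages:", round(overall, 3))
```

With 8 candidate players there are 56 possible five-player lineups per team, so running every pairing of lineups between two teams grows quickly; sampling a subset of lineups may be more practical.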

References and Citations

2015–16 Golden State Warriors Season (2016). en.wikipedia.org. Retrieved on 11/14/16 from https://en.wikipedia.org/wiki/2015%E2%80%9316_Golden_State_Warriors_season

2016 NBA Finals Cavaliers vs. Warriors. basketball-reference.com. Retrieved on 11/14/2016 from http://www.basketball-reference.com/playoffs/2016-nba-finals-cavaliers-vs-warriors.html

Cacciola, S. (2016). Cavaliers Defeat Warriors to Win Their First NBA Title. nytimes.com. Retrieved on 11/2/2016 from http://www.nytimes.com/2016/06/20/sports/basketball/golden-state-warriors-cleveland-cavaliers-nba-championship.html?_r=0

Doane, S. & Seward, L. (2012). Applied Statistics in Business and Economics (4th Ed.). New York, NY: McGraw-Hill/Irwin.

Elliott, R. J. & Morrell, C. H. (2010). Learning SAS in the Lab: Third Edition. Boston, MA: Brooks / Cole.

Exner, R. (2016). Eight NBA Teams Have Rallied from 3-1 Deficits to Win a Playoff Series; A Complete Listing. Cleveland.com. Retrieved on 10/31/2016 from https://www.cleveland.com/datacentral/index.ssf/2009/05/eight_nba_teams_have_rallied_f.html

FiveThirtyEight. (2016). en.wikipedia.org. Retrieved on 11/16/16 from https://en.wikipedia.org/wiki/FiveThirtyEight

League Team Stats. nba.com. Retrieved on 6/20/16 from http://stats.nba.com/league/player/#!/?Season=201516&SeasonType=Regular%20Season

Morris, B. (2016). Good News, Warriors Fans: Every Cliché About the NBA Playoffs is True. fivethirtyeight.com. Retrieved on 10/15/16 from http://fivethirtyeight.com/features/good-news-warriors-fans-every-cliche-about-the-nba-playoffs-is-true/

Neuharth-Keusch, A. J. (2016). Betting Odds for the 2016 NBA Finals. usatoday.com. Retrieved on 11/19/2016 from http://www.usatoday.com/story/sports/nba/2016/06/02/betting-odds-2016-nba-finals-warriors-cavaliers-stephen-curry-lebron-james/85312676/

SAS Certification Prep Guide: Base Programming for SAS 9 Third Edition. Cary, North Carolina: SAS Institute, Inc.

Wood, R. (2011). The History of the 3-Pointer. www.usab.com. Retrieved on 11/18/2016 from https://www.usab.com/youth/news/2011/06/the-history-of-the-3-pointer.aspx

Contents

Section 1: Detailed Introduction to the Problem
Section 2: Goals of the Simulation
Section 3: Modeling Assumptions
  Assumption #1: Individual Players Utilized in Simulation
    Table 1: Golden State Warrior Players Included in Simulation
    Table 2: Cleveland Cavalier Players Included in Simulation
  Assumption #2: Normalization Adjustment for Minutes Played and Game Statistics
  Assumption #3: Game Rules Included in Simulation
Section 4: Methods and Data Collection
    Table 3: Variable Names and Descriptions
Section 5: Statistics and Figures Utilized in the Simulation
    Table 4: Golden State Warriors’ Stats – Prior to Time Normalization Adjustment
    Table 5: Golden State Warriors’ Stats – Adjusted for Assumption that Starters Play Entire Game
    Table 6: Cleveland Cavaliers’ Stats – Prior to Time Normalization Adjustment
    Table 7: Cleveland Cavaliers’ Stats – Adjusted for Assumption that Starters Play Entire Game
  Offensive Rebounds per Field Goal Miss Statistics
    Table 8 (Adjusted for 48-Minute Game)
Section 6: Code and Diagrams
  Part 1: Initial Start of the Game
    Figure 1: Start of the Simulation
  Part 2: Record Scores, Possible Game Termination, Record Winner, Hold the Ball
    Figure 2: Decisions for a game over scenario, distributing the ball if game is not over, and figuring out which team won if game is over
  Part 3: Hold the Ball
    Table 8: Calculation of Average and Standard Deviation of the Time Holding the Ball for Each Team
    Figure 3: Holding the ball delay process before the distribution of the ball to an individual player
  Part 4: Distribution of the Ball
    Figure 4: Warriors Distribution Decision
    Figure 5: Warriors’ discrete probabilities of players’ chances of receiving the ball
    Figure 6: Cavaliers’ discrete probabilities of players’ chances of receiving the ball
  Part 5: Players’ Decisions and Actions
    Table 9: Player Decision and Action Descriptions and Explanations
    Figure 7: Example of S Curry Actions / Decisions
    Figure 8: Player Decisions / Actions – Discrete Probabilities
    Table 10: Example of Steph Curry’s FTA Probabilities
    Figure 9: Player free throw percentages
    Table 11: Discrete Probabilities of Each Team Getting an Offensive Rebound
Section 7: Experimental Results
    Figure 10: View of Arena’s run setup options (setting up 101 repetitions)
    Figure 11: Arena’s user specified Warriors’ win percentage results
    Figure 12: Margin of victory for each team (all 101 games)
    Table 16: Individual Stat Comparisons (Simulation Averages v NBA finals Averages)
Section 8: Sensitivity Analysis
    Table 12: Comparison of Simulation Scenarios (Changing Time of Holding Ball Variable)
    Table 13: Sensitivity Analysis Part 2 - Player Decisions / Actions – Discrete Probabilities
    Table 14: Sensitivity Analysis Part 2 - Discrete Probabilities of Players’ Chances of Receiving the Ball
    Table 15: Win % Comparison (Original Lineup versus New Lineup)
Section 9: Validation Analysis
    Table 16: Odds of Warriors to Win NBA Finals Comparison
    Table 17: Individual Player Statistics Comparison (Simulation versus Actual Results)
Section 10: Conclusions and Questions
References and Citations