Predicting the 2016 NBA Finals Outcome Using Arena (software)
Author: Ed Orlando, Data Scientist at White Lodging
Course: CS 525 Modeling and Simulation
Submitted to: Dr. James Caristi, Professor and Chair of Computing and Information Sciences
Section 1: Detailed Introduction to the Problem
On Sunday, June 19th, 2016, the Cleveland Cavaliers pulled off what no team in history has ever
accomplished in the NBA finals. Down three (3) games to one (1) in a best of seven series, the Cavaliers
rallied to win three (3) straight games against the Golden State Warriors. Although coming back down 3-1
has been accomplished by 10 other teams in the NBA playoffs (including the 2016 Warriors in the Western
Conference finals) no team has ever come back from a 3-1 deficit in the NBA finals (Exner, 2016).
The 2016 NBA finals outcome was considered one of the largest upsets in recent times. Not only did
Cleveland come back from a 3-1 deficit, they beat a Golden State team that held the all-time regular season
record of 73-9. In addition, 2 out of the 3 remaining games to be played were located at Golden State’s
arena (Oakland) where the Warriors won 39 out of 41 games (2015–16 Golden State Warriors Season,
2016). Even before the series started, the Warriors were heavy favorites. “The Cavs beating the Warriors
in this year’s NBA finals would be an upset no matter how you look at it. FiveThirtyEight’s projections give
the Warriors a 69 percent chance of winning the series. But if we factor in the conventional wisdom — not
something we say here very often — the Warriors look even stronger” (Morris, 2016).
Since the NBA finals is a best-of-seven series, experts and casual fans often believe that the better team
usually wins the championship. This is a different setup compared to other popular sporting events, including
the NFL (Super Bowl), the NCAA basketball championship, and the NCAA football championship, where a
single game decides the winner. That said, upsets occur in both formats,
including the NBA finals.
However, what if the NBA finals were a best-of-101 series? In other words, after simulating the
NBA finals more than one hundred times using a model built in Arena, will this model be consistent with
most of the experts and predict a Golden State Warriors victory in the NBA finals, or will it show the Cavaliers
as favorites? The simulation results produced in this study provide the win percentage for each team as well
as a range of winning percentages (95% confidence level). These results are compared to FiveThirtyEight’s
projection of the Warriors’ 69 percent chance of winning the NBA finals. The results are also compared to
the betting odds, provided by USA Today, which had the Warriors’ chance of winning listed at 66.7% before
the finals began. Lastly, the results are compared against what actually happened in the NBA finals.
Section 2: Goals of the Simulation
The simulation has several goals. One goal of the study is to build a model that has the
Cavaliers play the Warriors more than 100 times. Those simulations assist in building a predictive model
that shows the odds of who would win a 101-game series. The model and the statistics produced from it
also show the average team and individual player statistics for each of the games. The predictive results
are compared to the actual NBA finals results as well as to FiveThirtyEight’s projection of a 69
percent chance and the betting odds of a 67 percent chance of the Warriors winning the NBA finals. The simulation reveals
some insights into the likelihood of the Cavaliers winning 4 out of 7 games. Stated differently, throughout
the simulation, are there “streaks” in which the Cavaliers have an opportunity to win a best-of-7 series?
Lastly, it will be determined if this simulation can be used in additional ways. In other words, can the
simulation be utilized and structured in a way where teams from different generations play each other?
For example, what would happen if the 2016 Golden State Warriors played the 1998 Chicago Bulls, which
are both considered among the best teams of all time?
Section 3: Modeling Assumptions
The model used throughout the study is set up as a stochastic simulation. A stochastic simulation is one
that uses random variables and probability distributions to generate its outcomes.
Although stochastic models can be very inaccurate at times, the model used throughout this study is
compared to static actual results derived from the June 2016 NBA finals.
Assumption #1: Individual Players Utilized in Simulation
In order to simplify the number of variables and assumptions used in the model, the simulation used
individual player stats for a maximum of five players for each team. Although teams often use substitutions
throughout the course of an NBA game, this simulation assumed the same five players playing the entire
48 minutes of each game. The five players chosen to play each game are listed below.
Table 1: Golden State Warriors Players Included in Simulation

| Position | Name |
|---|---|
| Point Guard | Stephen Curry |
| Shooting Guard | Klay Thompson |
| Small Forward | Andre Iguodala |
| Power Forward | Draymond Green |
| Center | Andrew Bogut |

Table 2: Cleveland Cavaliers Players Included in Simulation

| Position | Name |
|---|---|
| Point Guard | Kyrie Irving |
| Shooting Guard | Iman Shumpert |
| Small Forward | LeBron James |
| Power Forward | Kevin Love |
| Center | Tristan Thompson |
Assumption #2: Normalization Adjustment for Minutes Played and Game Statistics
Since only five players from both teams are utilized in the simulation, the minutes played and game
statistics are adjusted to accommodate for the 48-minute game. For example, if a player averages 24
minutes played and the average number of points scored for that player is 20 points per game, the minutes
played are increased to 48 minutes and the number of points scored are increased to 40 points per game.
This adjustment assumes that each player maintains the same pace in all statistics as if he
played all 48 minutes of the game. Although no player actually averages 48 minutes per game during the
regular season, the number of minutes played in the NBA playoffs does often increase for the starters. In
other words, it is not uncommon for starters to play more than 40 minutes or more than 83% of the game.
For example, during the regular season, Kyrie Irving and LeBron James averaged 32.1 and 36.9 minutes per
game, respectively. However, during the NBA finals, Irving averaged 39.0 minutes per game while James
averaged 41.7 minutes per game (2016 NBA Finals Cavaliers vs. Warriors, 2016). “When it’s win or go
home, there’s no more jockeying for better playoff (or lottery) position, no more using throw-away games
to experiment with new lineups, the best players tend to get more minutes per game and no one’s resting
on the second night of (non-existent) back-to-backs” (Morris, 2016).
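The normalization described in this assumption can be sketched in a few lines of Python. This is only an illustration of the linear scaling, not the Arena model itself; the function name `scale_to_full_game` is hypothetical.

```python
def scale_to_full_game(per_game_stat, minutes_played, game_minutes=48.0):
    """Scale a per-game average to a full game, assuming a constant pace."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return per_game_stat * game_minutes / minutes_played


# Example from the text: 20 points in 24 minutes scales to 40 points in 48.
print(scale_to_full_game(20.0, 24.0))
```

The same scaling applies to every counting stat (made and missed shots, rebounds, steals, blocks, and fouls), which is why the adjusted totals are described as slightly inflated.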
Assumption #3: Game Rules Included in Simulation
The basketball game played in the simulation is simplified and does not include all NBA rules. The
simulated game follows the workflow below. The game starts with a tipoff that gives Golden State and
Cleveland a 50/50 chance of getting the ball first. Once a team has control of the
ball, several things can happen. Please refer to Section 5 for the specific averages, ranges, and
distributions utilized in the simulation.
- The team holds the ball for an amount of time based on a normal distribution that is calculated
using historical data.
- The ball is distributed to a player. The more shots (field goal attempts) a player
takes on average, the higher his odds of getting the ball. Unlike in a typical basketball game, once
a player gets the ball, he will not attempt to pass it to a teammate. Instead, one of the
following actions (based on discrete probabilities) occurs once a player has the ball.
1. The player shoots a 2-point field goal and makes it
a. If true, the individual and the team both receive 2 points
2. The player shoots a 2-point field goal and misses it
a. If true, either the same team can get an offensive rebound or the other team
will receive the ball (odds are based on discrete probabilities)
3. The player shoots a 3-point field goal and makes it
a. If true, the individual and the team both receive 3 points
4. The player shoots a 3-point field goal and misses it
a. If true, either the same team can get an offensive rebound or the other team
will receive the ball (odds are based on discrete probabilities)
5. The player gets the ball blocked or stolen and the other team gets the ball
6. The player is fouled and makes 0 out of 2 free throw attempts
7. The player is fouled and makes 1 out of 2 free throw attempts
8. The player is fouled and makes 2 out of 2 free throw attempts
Please note that if a player is fouled, the maximum number of free throw attempts is
equal to 2. Although in a live game it is possible to get fouled while shooting
a 3 pointer and receive 3 free throw attempts, it is very rare.
- Some of the major actions that are simplified or excluded from the simulation include the following:
1. Defensive stats and assumptions are based on aggregated team stats.
In other words, individual defensive stats are not tracked.
2. No timeouts are utilized in the simulation since they would not add any benefit to the
model.
3. Since individual defensive stats are not tracked, no player is eliminated from the game due to too
many fouls. In a real game, a player is eliminated from the game once he has 6 personal
fouls.
4. Since it is rare in a game, no shot clock violation will occur, even though there is a slight chance
of a team or player taking more than 24 seconds to shoot the ball in the simulation. Instead, an average
and standard deviation of how long the team holds the ball is calculated using historical
data for each team.
Section 4: Methods and Data Collection
Most of the granular data used in the study comes from nba.com. Regular
season statistics from 2016 for both teams are used to develop the simulations. The traditional stats
utilized throughout the simulation include the following:
Table 3: Variable Names and Descriptions

| Variable Name | Variable Description | Variable Explanation | Stat Used in Simulation |
|---|---|---|---|
| FGM | Field Goals Made | The number of field goals that a team has made. This includes both 2 and 3 pointers. | Not utilized directly |
| FGA | Field Goal Attempts | The number of field goals that a team has attempted. This includes both 2 and 3 pointers. | Not utilized directly |
| FG% | Field Goal Percentage | The percentage of field goals that a team has made. Formula = FGM / FGA. | Not utilized directly |
| 2PM | Two Pointers Made | The number of two-point field goals that a team has made. | Utilized |
| 2PA | Two Pointer Attempts | The number of two-point field goals that a team has attempted. | Utilized |
| 2P% | Two Point % | The percentage of two-point field goals that a team has made. Formula = 2PM / 2PA. | Utilized |
| 3PM | Three Pointers Made | The number of three-point field goals that a team has made. | Utilized |
| 3PA | Three Pointer Attempts | The number of three-point field goals that a team has attempted. | Utilized |
| 3P% | Three Point % | The percentage of three-point field goals that a team has made. Formula = 3PM / 3PA. | Utilized |
| FTM | Free Throws Made | The number of free throws that a team has made. | Utilized |
| FTA | Free Throw Attempts | The number of free throws that a team has attempted. | Utilized |
| FT% | Free Throw Percentage | The percentage of free throws that a team has made. Formula = FTM / FTA. | Utilized |
| OREB | Offensive Rebounds | The number of rebounds a team has collected while on offense. | Utilized with overall team stats |
| DREB | Defensive Rebounds | The number of rebounds a team has collected while on defense. | Not utilized directly |
| AST | Assists | An assist occurs when a player completes a pass to a teammate that directly leads to a field goal. | Utilized using a modified approach¹ |
| TOV | Turnovers | A turnover occurs when a player on offense loses the ball to a player on defense. | Not utilized |
| STL | Steals | A steal occurs when a player on defense takes the ball from a player on offense, causing a turnover. | Utilized with overall opposing team stats (combined with BLK) |
| BLK | Blocks | A block occurs when an offensive player attempts a shot and the defensive player tips the ball, blocking the chance to score. | Utilized with overall opposing team stats (combined with STL) |
| PF | Personal Fouls | The total number of fouls that a team has committed. | Utilized with overall team stats (individual players cannot foul out of game) |
| SEC | Seconds Holding the Ball | The number of seconds the player that is going to create action will hold the ball. | Utilized using average & standard deviation of team stats |
| MIN | Minutes Played | The average number of minutes a player plays in a 48-minute game. | Utilized and modified |

(League Team Stats, 2016)
¹ If the ball changes teams after a defensive rebound, a steal, a block, or a made field goal / free throw, the ball will be distributed to another teammate or to the same player based on a weighted discrete distribution. The weighting is based on the average number of shots that player takes compared to other players. Described differently, if a player takes 30% of the team's shots, that player will get passed the ball 30% of the time (on average).
Section 5: Statistics and Figures Utilized in the Simulation
The players’ stats utilized in the simulation (last 50 games of regular season) include the following:

Table 4: Golden State Warriors’ Stats – Prior to Time Normalization Adjustment

| Average Statistics (last 50 games of the season) | Abbreviated Name | Stephen Curry | Klay Thompson | Andre Iguodala | Draymond Green | Andrew Bogut | Total |
|---|---|---|---|---|---|---|---|
| Number of Minutes Played | MIN | 33.8 | 33.8 | 27.5 | 35.8 | 21.3 | 30.4 |
| Number of 2 Point Field Goals Made | 2PM | 4.9 | 5.1 | 2.3 | 4.7 | 2.5 | 19.5 |
| Number of 2 Point Field Goals Missed | 2PMS | 3.9 | 4.8 | 1.4 | 3.2 | 1.5 | 14.8 |
| Number of 3 Point Field Goals Made | 3PM | 5.4 | 3.6 | 1.1 | 1.6 | 0.0 | 11.7 |
| Number of 3 Point Field Goals Missed | 3PMS | 6.3 | 5.0 | 1.6 | 2.0 | 0.0 | 14.9 |
| Number of Free Throws Made | FTM | 3.8 | 2.4 | 1.0 | 3.3 | 0.4 | 10.9 |
| Number of Free Throws Missed | FTMS | 0.4 | 0.2 | 0.6 | 1.4 | 0.3 | 2.9 |
| Number of Offensive Rebounds | OREB | 1.0 | 0.4 | 0.9 | 1.7 | 1.8 | 5.8 |
| Number of Defensive Rebounds | DREB | 4.5 | 3.5 | 3.2 | 7.8 | 5.4 | 24.4 |
| Number of Steals | STL | 2.1 | 0.9 | 1.1 | 1.6 | 0.5 | 6.2 |
| Number of Blocks | BLK | 0.2 | 0.5 | 0.3 | 1.6 | 1.6 | 4.2 |
| Number of Fouls | PF | 2.1 | 1.7 | 1.5 | 3.0 | 3.2 | 11.5 |
(League Team Stats, 2016)

Table 5: Golden State Warriors’ Stats – Adjusted for Assumption that Starters Play Entire Game

| Average Statistics (last 50 games of the season) | Abbreviated Name | Stephen Curry | Klay Thompson | Andre Iguodala | Draymond Green | Andrew Bogut | Total |
|---|---|---|---|---|---|---|---|
| Number of Minutes Played | MIN | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 |
| Number of 2 Point Field Goals Made | 2PM | 7.0 | 7.2 | 4.0 | 6.3 | 5.6 | 30.2 |
| Number of 2 Point Field Goals Missed | 2PMS | 5.5 | 6.8 | 2.4 | 4.3 | 3.4 | 22.5 |
| Number of 3 Point Field Goals Made | 3PM | 7.7 | 5.1 | 1.9 | 2.1 | 0.0 | 16.8 |
| Number of 3 Point Field Goals Missed | 3PMS | 8.9 | 7.1 | 2.8 | 2.7 | 0.0 | 21.5 |
| Number of Free Throws Made | FTM | 5.4 | 3.4 | 1.7 | 4.4 | 0.9 | 15.9 |
| Number of Free Throws Missed | FTMS | 0.6 | 0.3 | 1.0 | 1.9 | 0.7 | 4.5 |
| Number of Offensive Rebounds | OREB | 1.4 | 0.6 | 1.6 | 2.3 | 4.1 | 9.9 |
| Number of Defensive Rebounds | DREB | 6.4 | 5.0 | 5.6 | 10.5 | 12.2 | 39.6 |
| Number of Steals | STL | 3.0 | 1.3 | 1.9 | 2.1 | 1.1 | 9.5 |
| Number of Blocks | BLK | 0.3 | 0.7 | 0.5 | 2.1 | 3.6 | 7.3 |
| Number of Fouls | PF | 3.0 | 2.4 | 2.6 | 4.0 | 7.2 | 19.2 |
(League Team Stats, 2016)
Table 6: Cleveland Cavaliers’ Stats – Prior to Time Normalization Adjustment

| Average Statistics (last 50 games of the season) | Abbreviated Name | Kyrie Irving | J.R. Smith | LeBron James | Kevin Love | Tristan Thompson | Total |
|---|---|---|---|---|---|---|---|
| Number of Minutes Played | MIN | 32.1 | 31.2 | 36.9 | 32.8 | 30.2 | 32.6 |
| Number of 2 Point Field Goals Made | 2PM | 6.1 | 2.0 | 9.6 | 3.8 | 4.2 | 25.7 |
| Number of 2 Point Field Goals Missed | 2PMS | 5.8 | 2.3 | 6.9 | 3.9 | 2.4 | 21.3 |
| Number of 3 Point Field Goals Made | 3PM | 1.7 | 3.0 | 1.4 | 2.6 | 0 | 8.7 |
| Number of 3 Point Field Goals Missed | 3PMS | 3.4 | 4.1 | 2.6 | 3.7 | 0 | 13.8 |
| Number of Free Throws Made | FTM | 3.3 | 0.5 | 5.2 | 4.2 | 2.6 | 15.8 |
| Number of Free Throws Missed | FTMS | 0.4 | 0.3 | 1.9 | 0.8 | 1.3 | 4.7 |
| Number of Offensive Rebounds | OREB | 0.9 | 0.5 | 1.7 | 2.1 | 3.9 | 9.1 |
| Number of Defensive Rebounds | DREB | 2.1 | 2.4 | 6.1 | 8.3 | 6.3 | 25.2 |
| Number of Steals | STL | 1.1 | 1.1 | 1.3 | 0.6 | 0.5 | 4.6 |
| Number of Blocks | BLK | 0.3 | 0.2 | 0.6 | 0.7 | 0.8 | 2.6 |
| Number of Fouls | PF | 2.0 | 2.5 | 2.1 | 2.3 | 2.5 | 11.4 |
(League Team Stats, 2016)

Table 7: Cleveland Cavaliers’ Stats – Adjusted for Assumption that Starters Play Entire Game

| Average Statistics (last 50 games of the season) | Abbreviated Name | Kyrie Irving | J.R. Smith | LeBron James | Kevin Love | Tristan Thompson | Total |
|---|---|---|---|---|---|---|---|
| Number of Minutes Played | MIN | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 | 48.0 |
| Number of 2 Point Field Goals Made | 2PM | 9.1 | 3.1 | 12.5 | 5.6 | 6.7 | 36.9 |
| Number of 2 Point Field Goals Missed | 2PMS | 8.7 | 3.5 | 9.0 | 5.7 | 3.8 | 30.7 |
| Number of 3 Point Field Goals Made | 3PM | 2.5 | 4.6 | 1.8 | 3.8 | - | 12.8 |
| Number of 3 Point Field Goals Missed | 3PMS | 5.1 | 6.3 | 3.4 | 5.4 | - | 20.2 |
| Number of Free Throws Made | FTM | 4.9 | 0.8 | 6.8 | 6.1 | 4.1 | 22.7 |
| Number of Free Throws Missed | FTMS | 0.6 | 0.5 | 2.5 | 1.2 | 2.1 | 6.8 |
| Number of Offensive Rebounds | OREB | 1.3 | 0.8 | 2.2 | 3.1 | 6.2 | 13.6 |
| Number of Defensive Rebounds | DREB | 3.1 | 3.7 | 7.9 | 12.1 | 10.0 | 36.9 |
| Number of Steals | STL | 1.6 | 1.7 | 1.7 | 0.9 | 0.8 | 6.7 |
| Number of Blocks | BLK | 0.4 | 0.3 | 0.8 | 1.0 | 1.3 | 3.8 |
| Number of Fouls | PF | 3.0 | 3.8 | 2.7 | 3.4 | 4.0 | 16.9 |
(League Team Stats, 2016)
As indicated earlier, the number of minutes played and all stats are adjusted so that all starters play a
total of 48 minutes. The adjusted stats assume that the starters maintain the same level of productivity in
the additional minutes added for each player. In addition, the stats listed in Tables 4 and 6 include the last
50 games of the season and exclude the first 32 games of the season. All players were available for the
games utilized in the statistics above. Stated differently, no players’ statistics listed above were affected
by missing games.
Offensive Rebounds per Field Goal Miss Statistics
When a field goal is missed (either a 2 pointer or a 3 pointer), each team has the opportunity to rebound the
missed shot. If the team that missed the field goal recovers
(rebounds) the ball, it is called an offensive rebound (OREB). If the other team recovers the ball, it is
referred to as a defensive rebound (DREB). In order to compute how often the Warriors
and Cavaliers get an offensive rebound, the number of field goals missed (FGMS) also has to be taken into
account.
The ratio listed below is used in the model to determine each team's chance of getting an offensive
rebound. It is applied as a discrete probability for each team whenever either a 2 pointer or a 3 pointer is
missed. Unlike in real games, the model assumes that offensive rebounds (OREB) cannot be made on missed
free throws. However, offensive rebounds from missed free throw attempts occur very infrequently.
Table 8 (Adjusted for 48-Minute Game)

| Statistic | Golden State Warriors | Cleveland Cavaliers |
|---|---|---|
| Offensive Rebounds (OREB) | 9.9 | 13.6 |
| Number of Field Goals Missed (FGMS) | 44.2 | 53.6 |
| OREB / FGMS Ratio | 22.3% | 25.4% |
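As a rough illustration of how the OREB / FGMS ratio can be turned into a per-miss rebound decision, the Python sketch below recomputes the ratio from the rounded table inputs and draws one random outcome. The function names are hypothetical and the snippet is not the Arena logic itself. Recomputing from the rounded inputs gives roughly 22.4% and 25.4%, so the last digit can differ slightly from the listed values.

```python
import random


def offensive_rebound_ratio(oreb, missed_field_goals):
    """Share of missed field goals that the shooting team rebounds."""
    return oreb / missed_field_goals


def gets_offensive_rebound(ratio, rng=random):
    """One Bernoulli draw: True if the offense recovers the missed shot."""
    return rng.random() < ratio


# (OREB, FGMS) pairs taken from the table above.
table_8 = {
    "Golden State Warriors": (9.9, 44.2),
    "Cleveland Cavaliers": (13.6, 53.6),
}
for team, (oreb, fgms) in table_8.items():
    print(f"{team}: {offensive_rebound_ratio(oreb, fgms):.1%}")
```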
Section 6: Code and Diagrams
The following section walks through the Arena model and its workflow, and lists diagrams that explain
how the simulation works.
Part 1: Initial Start of the Game
In order to get the basketball game started, there is a tipoff that occurs at the beginning of the game. In
the simulation, there is a 50% chance that the Warriors get the ball initially and a 50% chance that the
Cavaliers get the ball to start the game. In other words, each team has the same chance of getting the ball
first.
Figure 1: Start of the Simulation
Part 2: Record Scores, Possible Game Termination, Record Winner, Hold the Ball
Before the ball is distributed to any player, the simulation performs the following actions.
- Is the Game Over? In this section, the simulation checks whether 48 minutes or more have
passed. It also checks whether the game is tied. If the game time is greater than or equal to
48 minutes and the game is not tied, the simulation records team and individual stats and
terminates. Please note that there is a high probability that the game runs over 48 minutes,
although most likely by less than a minute. The simulation was set up this way so that all
the recordings occur without interference.
- Did the Warriors Win? If the game is over, the simulation checks whether the Warriors have
more points than the Cavaliers. If so, it records the Warriors victory and terminates the
simulation.
Figure 2: Decisions for a game over scenario, distributing the ball if game is not over, and figuring out which team won if game is over
- Game Over. After the simulation records a Warriors victory or does not record a victory, it ends
/ terminates the session. An entirely new simulation/game can begin at this time.
- If the game is not over (the game clock is under 48 minutes or the score is tied), the game
continues and the team that has possession holds the ball.
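The checks in this part can be summarized with a short Python sketch. It mirrors the decision logic described in the bullets (the time check, the tie check, and the Warriors victory check); the `GameState` class and function names are hypothetical and do not correspond to Arena module names.

```python
from dataclasses import dataclass

GAME_LENGTH_SECONDS = 48 * 60  # regulation length of the simulated game


@dataclass
class GameState:
    clock_seconds: float = 0.0
    warriors_points: int = 0
    cavaliers_points: int = 0


def is_game_over(state):
    """The game ends once 48 minutes have elapsed and the score is not tied."""
    tied = state.warriors_points == state.cavaliers_points
    return state.clock_seconds >= GAME_LENGTH_SECONDS and not tied


def warriors_won(state):
    """Only meaningful once the game is over; a Cavaliers win returns False."""
    return is_game_over(state) and state.warriors_points > state.cavaliers_points
```

Because the check runs only before the ball is distributed, a possession that starts just before the 48-minute mark still finishes, which is consistent with games running slightly past regulation.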
Part 3: Hold the Ball
Since each team only distributes the ball once per possession based on discrete probabilities (see the next part
below), the ball needs to be held for a certain period of time. The average amount of time each team held
the ball was not readily available in the data set collected. However, the number of field goal attempts (FGAs)
was available. With this stat, the time before taking a shot could be backed into for each team. The Warriors
averaged 89.8 FGAs with a standard deviation of 10.9 attempts. The
Cavaliers averaged 100.9 FGAs with a standard deviation of 14.0. See the full table of statistics
below.
Table 8: Calculation of Average and Standard Deviation of the Time Holding the Ball for Each Team

| Statistic | Golden State Warriors | Cleveland Cavaliers |
|---|---|---|
| Average Number of FGAs | 89.8 | 100.9 |
| St Dev Number of FGAs | 10.9 | 14.0 |
| Number of Minutes Per Game | 48 | 48 |
| Number of Seconds Per Game | 2,880 | 2,880 |
| Avg Number of Seconds Per Shot (Seconds per game / Avg FGAs / 2)¹ | 16.0 | 14.3 |
| St Dev Number of Seconds Per Shot | 1.95 | 1.98 |

¹ This stat is divided by 2 in order to account for 2 teams acquiring the ball during the game.
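The hold-time figures can be reproduced with the short Python sketch below. The mean follows the formula given in the table. The standard deviation formula is an assumption: scaling the mean hold time by the same coefficient of variation as the FGAs happens to reproduce the listed 1.95 and 1.98, but the source does not state how those values were derived. Clamping the sampled delay at zero is also an added safeguard, not something described in the study.

```python
import random

SECONDS_PER_GAME = 48 * 60  # 2,880 seconds


def hold_time_parameters(mean_fga, sd_fga):
    """Return (mean, standard deviation) of seconds held per shot."""
    # Divide by 2 because both teams share the game clock.
    mean_hold = SECONDS_PER_GAME / mean_fga / 2
    # Assumption: same relative spread as the field goal attempts.
    sd_hold = mean_hold * sd_fga / mean_fga
    return mean_hold, sd_hold


def sample_hold_time(mean_hold, sd_hold, rng=random):
    """Draw one hold time from a normal distribution, never below zero."""
    return max(0.0, rng.gauss(mean_hold, sd_hold))


warriors = hold_time_parameters(89.8, 10.9)
cavaliers = hold_time_parameters(100.9, 14.0)
print(warriors, cavaliers)
```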
Figure 3: Holding the ball delay process before the distribution of the ball to an individual player.
Part 4: Distribution of the Ball
Once a team has the ball, a discrete distribution is used to determine which player gets the ball. To
simplify the model, the simulation does not record any assists. The distribution simply gets the ball to
the player who will perform the next action. Each player's discrete probability is his share of
the team's field goal attempts.
Discrete Probability Formula:
Number of Field Goal Attempts for Player i / Number of Field Goal Attempts for Team j
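A minimal Python sketch of this formula follows. It assumes that each player's field goal attempts can be taken as the sum of his made and missed 2-point and 3-point shots from Table 5; the resulting shares are illustrative and are not claimed to match the values plotted in the figures. The function names are hypothetical.

```python
import random

# Assumed FGA per player = 2PM + 2PMS + 3PM + 3PMS (Table 5, 48-minute basis).
warriors_fga = {
    "S Curry": 7.0 + 5.5 + 7.7 + 8.9,
    "K Thompson": 7.2 + 6.8 + 5.1 + 7.1,
    "A Iguodala": 4.0 + 2.4 + 1.9 + 2.8,
    "D Green": 6.3 + 4.3 + 2.1 + 2.7,
    "A Bogut": 5.6 + 3.4 + 0.0 + 0.0,
}


def ball_distribution(fga_by_player):
    """Player i's probability = FGA of player i / total FGA of his team."""
    total = sum(fga_by_player.values())
    return {player: fga / total for player, fga in fga_by_player.items()}


def pick_player(probabilities, rng=random):
    """Weighted random draw of the player who receives the ball."""
    players = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(players, weights=weights, k=1)[0]


probabilities = ball_distribution(warriors_fga)
print(pick_player(probabilities))
```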
Figure 4: Warriors Distribution Decision
The discrete distribution statistics for each team are shown in the histograms below. The
percentages represent the probability that a player receives the ball and has the chance to do
something with it.
Figure 5: Warriors’ discrete probabilities of players’ chances of receiving the ball
Figure 6: Cavaliers’ discrete probabilities of players’ chances of receiving the ball
Part 5: Players’ Decisions and Actions
Once a player has the ball, there are eight (8) main actions that can happen. Please note that there
are “sub-actions” that follow these eight main actions. The eight (8) main actions and their sub-actions
are listed below:
Table 9: Player Decision and Action Descriptions and Explanations

| Main Action | Description | Sub-Action |
|---|---|---|
| 2PM | Player i shoots a 2-point field goal and successfully makes it | Ball is transferred to other team |
| 2PMS | Player i shoots a 2-point field goal and misses | Player i's team has the opportunity to get an offensive rebound (OREB) and distribute the ball to any teammate again. The main actions are recalculated each time |
| 3PM | Player i shoots a 3-point field goal and successfully makes it | Ball is transferred to other team |
| 3PMS | Player i shoots a 3-point field goal and misses | Player i's team has the opportunity to get an offensive rebound (OREB) and distribute the ball to any teammate again. The main actions are recalculated each time |
| BLK / STL¹ | Player i has the ball stolen or his shot blocked by Player j | Ball is transferred to other team |
| PF 0-0 | Player i gets fouled by Player j (2 free throw shots are assumed) | Player i makes 0 out of 2 free throws. Ball is distributed to other team |
| PF 1-0 | Player i gets fouled by Player j (2 free throw shots are assumed) | Player i makes 1 out of 2 free throws. Ball is distributed to other team |
| PF 1-1 | Player i gets fouled by Player j (2 free throw shots are assumed) | Player i makes 2 out of 2 free throws. Ball is distributed to other team |

¹ The chances of Player i getting the ball stolen or blocked depend on his opponent’s (Player j) defensive statistics. For example, if Stephen Curry is on offense and Kyrie Irving is on defense, and Kyrie averages 1.6 steals per game and 0.4 blocks per game, those same stats are used as Stephen Curry’s odds of having a shot blocked or the ball stolen.
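The main actions and sub-actions in Table 9 can be expressed as a small lookup in Python. This is a sketch of the outcome rules only: it returns the points scored and whether the offense keeps the ball, with the offensive rebound chance passed in as a parameter. The function name and return shape are hypothetical.

```python
import random

# Points awarded for each scoring action in Table 9.
POINTS = {"2PM": 2, "3PM": 3, "PF 0-0": 0, "PF 1-0": 1, "PF 1-1": 2}
MISSES = {"2PMS", "3PMS"}


def resolve_action(action, oreb_probability, rng=random):
    """Return (points_scored, offense_keeps_ball) for one main action."""
    if action in MISSES:
        # A miss gives the offense a chance at an offensive rebound.
        return 0, rng.random() < oreb_probability
    if action == "BLK / STL":
        return 0, False
    if action in POINTS:
        # Made shots and all free throw outcomes hand the ball over.
        return POINTS[action], False
    raise ValueError(f"unknown action: {action}")


print(resolve_action("3PM", 0.25))
```

When the offense keeps the ball, the model distributes it to a teammate again and redraws a main action, as described in the sub-action column.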
Figure 7: Example of S Curry Actions / Decisions
Note: Although the action is not shown above, if the ball is blocked or stolen by opponent, the ball is transferred to the opposing team. In addition, if a 2 or 3 pointer is missed, the team on offense does get a chance of getting an offensive rebound. Those probabilities are listed in Figure 8.
Each player’s decisions and actions vary depending on his regular season stats. Those discrete
probabilities are listed below:
Figure 8: Player Decisions / Actions – Discrete Probabilities
If a player is fouled, the player attempts 2 foul shots. Please note that in an actual game, a player
can get fouled while shooting a three-pointer and get three foul shots. However, this situation is a fairly
rare occurrence. Therefore, the simulation does not take the possibility of getting fouled while shooting a
three-pointer into account. Each player’s free throw percentage is converted into
discrete probabilities of making 0, 1, or 2 of the 2 free throws. An example based on a player's
overall percentage of successful attempts is listed below:
Table 10: Example of Steph Curry’s FTA Probabilities

| Player | S. Curry |
|---|---|
| Overall FT% | 90.9% |
| Odds of Making 0 out of 2 Free Throws (9.1% x 9.1%) | 0.8% |
| Odds of Making 1 out of 2 Free Throws | 18.2% |
| Odds of Making 2 out of 2 Free Throws (90.9% x 90.9%) | 82.6% |
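For reference, the sketch below shows how the three outcomes could be derived from a single free throw percentage if the two attempts are treated as independent, which is an assumption of this example. It reproduces the 0.8% and 82.6% entries of Table 10. For exactly one make, the independence formula gives about 16.5%, which differs from the 18.2% listed in the table; the sketch does not attempt to reproduce that listed value.

```python
def free_throw_outcomes(ft_pct):
    """Probabilities of making 0, 1, or 2 of 2 independent free throws."""
    miss = 1.0 - ft_pct
    return {
        0: miss * miss,        # miss both attempts
        1: 2 * ft_pct * miss,  # make exactly one (either order)
        2: ft_pct * ft_pct,    # make both attempts
    }


for makes, probability in free_throw_outcomes(0.909).items():
    print(f"{makes} of 2: {probability:.1%}")
```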
Figure 9: Player free throw percentages
As mentioned earlier, if a player shoots a 2 or 3 pointer and misses, his team has a chance to recover the
ball and get an offensive rebound. The chances of each team getting an offensive rebound are listed
below.

Table 11: Discrete Probabilities of Each Team Getting an Offensive Rebound

| Team | Discrete Probability – Offensive Rebound |
|---|---|
| Golden State Warriors | 23.3% |
| Cleveland Cavaliers | 25.4% |
Section 7: Experimental Results
As mentioned earlier, in order to get relevant and significant averages and standard deviations, the
simulation was run for 101 replications (games). The half widths produced by
the simulation correspond to a significance level of 0.05 (95% confidence).
Figure 10: View of Arena’s run setup options (setting up 101 repetitions)
The Warriors defeated the Cavaliers in an average of 73.3% of the simulated games, with a half width of 9%. In
other words, the Warriors’ win percentage over the 101 games falls between 64.3% and 82.3% at a 95% confidence
level.
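The half width can be approximated outside of Arena with a normal-approximation confidence interval for a proportion, as in the Python sketch below. This is an assumption for illustration: Arena's own half-width calculation may differ slightly (for example, by using a t-distribution), but with 74 Warriors wins in 101 games both approaches round to about 9%.

```python
from math import sqrt
from statistics import NormalDist


def win_rate_interval(wins, games, confidence=0.95):
    """Return (win rate, half width) using a normal approximation."""
    p = wins / games
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p * (1 - p) / games)
    return p, half_width


rate, half_width = win_rate_interval(74, 101)
print(f"{rate:.1%} +/- {half_width:.1%}")
```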
Figure 11: Arena’s user specified Warriors’ win percentage results
The result of each game is also recorded in Arena and can be exported to a .csv file. The results of
each game are shown in the graph below.
Figure 12: Margin of victory for each team (all 101 games)
The graph above shows the margin of victory in each game. The blue bars
represent the margin of victory for Golden State while the red bars represent the margin of victory for
Cleveland. The largest margin of victory for the Warriors was 46 points while the largest margin of victory for
Cleveland was 22. Although the Warriors appear to be clear favorites over the Cavaliers (74
games to 27), there are many instances where the Cavaliers would win a best-of-7 series. Those
instances or runs are highlighted using red boxes to show the cases where the Cavaliers would have won
at least 4 games out of 7. Please note that some red boxes contain multiple scenarios in which the Cavaliers
would win a best-of-7 series. For example, the first red box in the graph contains 6 scenarios in which
the Cavaliers would win a best-of-7 series. Also note that once a team wins 4 games, the winner is
determined and the series ends.
Indicates times when Cleveland is able to win 4 out of 7 games. Please note that there are different combinations that can allow for Cleveland to win 4 out of 7 games.
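One way to find such runs programmatically, once the game results are exported, is to start a hypothetical series at every game in the sequence and play it forward until one team reaches 4 wins. The Python sketch below does this for a list of winners. The sliding start position is an assumption about how the scenarios are counted, and the short result list is made up purely to show the mechanics.

```python
def series_winner(results, start, wins_needed=4):
    """Play a series beginning at index `start`; None if results run out."""
    tally = {}
    for team in results[start:]:
        tally[team] = tally.get(team, 0) + 1
        if tally[team] == wins_needed:
            return team  # the series ends as soon as a team reaches 4 wins
    return None


def series_starts_won_by(results, team, wins_needed=4):
    """Indices at which a series could start and be won by `team`."""
    return [
        i for i in range(len(results))
        if series_winner(results, i, wins_needed) == team
    ]


# Made-up sequence of winners: W = Warriors, C = Cavaliers.
example = list("WWCCCCWWWW")
print(series_starts_won_by(example, "C"))
```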
Table 16: Individual Stat Comparisons (Simulation Averages v NBA finals Averages)
Note: The stats above are adjusted for players playing entire 48-minute game (while maintaining same “pace”). In other words,
the stats above are slightly inflated.
As one can witness in the tables above, the players’ averages that were dramatically less in the NBA
finals versus the simulation for the Golden State Warriors were Steph Curry (6.3 points less in the finals)
and Klay Thompson (3.9 points less in the finals). “As for Curry, his finals experience was an obstacle course
of long-limbed defenders (he shot 40.3 percent from the field), spats with officials (he chucked his mouth
guard after he was ejected from Game 6) and volleys from critics, who took jabs at everything from his poor
shooting to his choice of sneakers. Game 7 was another slog” (Cacciola, 2016).
On the other side, the Cavaliers that performed better in the NBA finals versus the simulation were Kyrie
Irving (9 points more in the finals) and LeBron James (6.3 points more in the finals). “James collected 27
points, 11 rebounds, and 11 assists to punctuate one of the most remarkable individual performances in
finals history. James, who was named the finals’ most valuable player got ample help from his teammate
Kyrie Irving, whose 3-pointer with 53 seconds remaining gave the Cavaliers the lead” (Cacciola, 2016).
Although there were multiple factors involved, one could say that the performances of these four players
ultimately led to the Cavaliers’ victory over the Warriors.
Section 8: Sensitivity Analysis
The sensitivity analysis has two parts. Part 1 changes
the time (in seconds) each team holds the ball. One scenario “speeds up” the game,
meaning that each team holds the ball for 5 seconds less (on average), and the other
“slows down” the game, meaning that each team holds the ball for 5 seconds more. The original standard deviations
remain constant in all scenarios. The number of times each team wins (out of 101) is
compared across scenarios and against the results of the original simulation.
The results will help determine the variability and validity of the model as this variable changes. The
number of times each team wins should remain fairly consistent and should have high correlation with one
another. The results of each scenario are listed below:
Table 12: Comparison of Simulation Scenarios (Changing Time of Holding Ball Variable)

| Description | Original Simulation (in seconds) | Scenario 1: Faster Paced Game | Scenario 2: Slower Paced Game |
|---|---|---|---|
| Average Time Warriors Hold the Ball | 16.0 | 11.0 | 21.0 |
| Standard Deviation Time Warriors Hold the Ball | 1.9 | 1.9 | 1.9 |
| Average Time Cavaliers Hold the Ball | 14.3 | 9.3 | 19.3 |
| Standard Deviation Time Cavaliers Hold the Ball | 2.0 | 2.0 | 2.0 |
| Warrior Win % (out of 101 games) | 73.3%¹ | 73.3% | 71.3% |

¹ See Section 9 for more detailed results
All three scenarios produce very similar results. In other words, varying the “pace” of the
game does not significantly change the number of times each team wins.
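To illustrate what the pace change means in practice, the sketch below estimates how many possessions each team would get under the three scenarios in Table 12. It is a rough back-of-the-envelope calculation in Python, not part of the Arena model: it assumes the teams strictly alternate possessions over a 48-minute game and ignores offensive rebounds, fouls, and the spread in hold times. The function name `possessions_per_team` is introduced here for illustration only.

```python
# Rough possession count per game under each pace scenario in Table 12.
# Assumption (not from the Arena model): the teams simply alternate possessions,
# so one "cycle" lasts the sum of the two mean hold times.

GAME_SECONDS = 48 * 60  # regulation game length in seconds

# (Warriors mean hold time, Cavaliers mean hold time), in seconds, from Table 12
scenarios = {
    "original": (16.0, 14.3),
    "faster": (11.0, 9.3),
    "slower": (21.0, 19.3),
}


def possessions_per_team(warriors_hold, cavaliers_hold, game_seconds=GAME_SECONDS):
    """Approximate possessions each team gets when possessions strictly alternate."""
    return game_seconds / (warriors_hold + cavaliers_hold)


for name, (gsw, cle) in scenarios.items():
    count = possessions_per_team(gsw, cle)
    print(f"{name:>8}: about {count:.0f} possessions per team")
```

Under these assumptions the possession count changes substantially between scenarios, yet both teams gain or lose possessions together. That is one plausible reason the win percentage barely moves when the pace changes.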
The second part substitutes one player for each team. The number of seconds that each team holds the
ball reverts to the original simulation statistics. For the Warriors, Shaun Livingston replaces Andrew
Bogut. For the Cavaliers, Iman Shumpert replaces J.R. Smith. The data entered into the simulation for
the two replacement players is listed below.
Table 13: Sensitivity Analysis Part 2 - Player Decisions / Actions – Discrete Probabilities
Action        S Livingston  I Shumpert
2PM           40.3%         17.3%
2PMS          32.5%         20.9%
3PM           0.6%          9.6%
3PMS          0.0%          25.1%
BLK / STL¹    15.9%         21.3%
PF 0-0        0.3%          0.2%
PF 1-0        3.0%          1.9%
PF 1-1        7.5%          3.6%
¹ The chances of Player i getting the ball stolen or blocked depend on their opponent’s (Player j) defensive statistics.
Table 14: Sensitivity Analysis Part 2 - Discrete Probabilities of Players’ Chances of Receiving the Ball
Team          Player        Probability of Receiving Ball
Golden State  S Curry       30.9%
Golden State  K Thompson    27.9%
Golden State  A Iguodala    11.8%
Golden State  D Green       16.5%
Golden State  S Livingston  13.0%
Cleveland     K Irving      27.1%
Cleveland     I Shumpert    11.6%
Cleveland     L James       28.3%
Cleveland     K Love        21.9%
Cleveland     T Thompson    11.1%
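The receiving probabilities in Table 14 define a discrete distribution over the five players on the floor. The sketch below shows, in Python rather than Arena, one way such a draw could be made for each possession. It is an illustration only: the names `receive_pct` and `pick_receiver` are hypothetical, and the weighted draw stands in for whatever decision module the Arena model uses.

```python
import random

# Receiving probabilities (percent) from Table 14, substituted lineups.
receive_pct = {
    "Golden State": {"S Curry": 30.9, "K Thompson": 27.9, "A Iguodala": 11.8,
                     "D Green": 16.5, "S Livingston": 13.0},
    "Cleveland": {"K Irving": 27.1, "I Shumpert": 11.6, "L James": 28.3,
                  "K Love": 21.9, "T Thompson": 11.1},
}


def pick_receiver(team, rng=random):
    """Draw the player who receives the ball; weights need not sum to exactly 100."""
    players = list(receive_pct[team])
    weights = list(receive_pct[team].values())
    return rng.choices(players, weights=weights, k=1)[0]


rng = random.Random(2016)  # fixed seed so the sketch is repeatable
draws = [pick_receiver("Golden State", rng) for _ in range(10_000)]
share = draws.count("S Livingston") / len(draws)
print(f"S Livingston received the ball on {share:.1%} of simulated possessions")
```

Because the published percentages are rounded (the Golden State column sums to 100.1%), the draw treats them as relative weights instead of exact probabilities.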
Although these players were not considered starters, both Shaun Livingston and Iman Shumpert played
significant minutes for their respective teams during the regular season. Since only one player on each
team is replaced, the number of times each team wins should remain fairly similar. The comparison
results of each scenario are listed below:
Table 15: Win % Comparison (Original Lineup versus New Lineup)
Description                               Original Simulation  Substitution of One (1) Player for Each Team
Warrior Win % Average (out of 101 games)  73.3%                76.2%
Warrior Win % Range (95% Confidence)      64.3% to 82.3%       68.2% to 84.2%
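The ranges in Table 15 can be approximated from the win counts alone. The sketch below is a minimal illustration in Python, not the Arena calculation itself. It assumes each of the 101 replications is an independent win/loss outcome, infers the win counts (74 and 77) from the reported percentages, and uses a two-sided 95% t critical value of 1.984 for 100 degrees of freedom. The function name `win_rate_interval` is hypothetical.

```python
import math

T_CRIT_95_DF100 = 1.984  # two-sided 95% t critical value, 100 degrees of freedom


def win_rate_interval(wins, games=101, t_crit=T_CRIT_95_DF100):
    """Mean and t-based 95% half-width for a 0/1 win indicator over `games` runs."""
    p = wins / games
    sample_var = p * (1 - p) * games / (games - 1)  # sample variance of 0/1 outcomes
    half_width = t_crit * math.sqrt(sample_var / games)
    return p, half_width


# 74 and 77 wins out of 101 are inferred from the reported 73.3% and 76.2%.
for label, wins in [("original lineup", 74), ("one substitution per team", 77)]:
    p, hw = win_rate_interval(wins)
    print(f"{label}: {p:.1%} +/- {hw:.1%}")
```

Under these assumptions the half-widths come out between 8 and 9 percentage points, in line with the ranges reported in Table 15. The two intervals overlap heavily, so the difference between 73.3% and 76.2% is well within the run-to-run noise of 101 replications.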
Section 9: Validation Analysis
As mentioned earlier, FiveThirtyEight is a website that publishes articles applying statistical models
and analysis to subjects including sports, politics, and culture. Nate Silver, a
renowned statistician, created the website in 2008. “During the U.S. presidential primaries and United
States general election of 2008 the site compiled polling data through a unique methodology derived
from Silver's experience in baseball sabermetrics to ‘balance out the polls with comparative demographic
data.’” (FiveThirtyEight, 2016). The website published an article that gave the Warriors a 69% chance
of winning the series before it started (Morris, 2016). Based on betting odds as of June 2, 2016, an
article from usatoday.com put the Warriors’ chance of winning at 66.7% (Neuharth-Keusch, 2016). These
two predictions are compared to the simulation prediction in the table below:
Table 16: Odds of Warriors to Win NBA Finals Comparison
Source                               Prediction Odds of Warriors Winning the NBA Finals  95% Confidence Range
Simulation Odds Produced from Arena  73.3%                                               64.3% to 82.3%
FiveThirtyEight Odds                 69.0%
Betting Odds (per usatoday.com)      66.7%
To validate some of the other simulation results, the per-player averages produced by the simulation
are compared to regular season statistics for each individual player included in the
simulation. Please note that the actual statistics below are adjusted for a 48-minute game.
Table 17: Individual Player Statistics Comparison (Simulation versus Actual Results)
Team          Player      Simulated Average Points Per Game¹  Actual Season Stats (all 82 games)¹
Golden State  S Curry     37.2                                42.2
Golden State  K Thompson  30.6                                31.9
Golden State  A Iguodala  12.4                                12.6
Golden State  D Green     17.5                                19.3
Golden State  A Bogut     8.7                                 12.5
Cleveland     K Irving    24.4                                29.8
Cleveland     JR Smith    16.3                                19.3
Cleveland     L James     27.9                                34.1
Cleveland     K Love      19.8                                24.3
Cleveland     T Thompson  9.1                                 13.5
¹ The stats above are adjusted for players playing entire 48-minute game (while maintaining same “pace”). In other words, the stats above are slightly inflated.
The differences in the comparisons above are attributed to the various assumptions made in the
simulation, including the number of times a player is fouled, when the ball is stolen or blocked, and
how often a team successfully gets an offensive rebound. Another contributor to the
point differential is the assumption made about how long each team holds the ball (see Table 8 for
more detail).
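The size of the gap described above can be summarized per team by summing the two columns of Table 17. The short Python sketch below does only that arithmetic; the data are copied from Table 17, and the names `table_17` and `team_totals` are introduced here for illustration.

```python
# Points per game from Table 17: (simulated, actual 48-minute-adjusted season stat)
table_17 = {
    "Golden State": {"S Curry": (37.2, 42.2), "K Thompson": (30.6, 31.9),
                     "A Iguodala": (12.4, 12.6), "D Green": (17.5, 19.3),
                     "A Bogut": (8.7, 12.5)},
    "Cleveland": {"K Irving": (24.4, 29.8), "JR Smith": (16.3, 19.3),
                  "L James": (27.9, 34.1), "K Love": (19.8, 24.3),
                  "T Thompson": (9.1, 13.5)},
}


def team_totals(players):
    """Sum simulated and actual points across a team's five players."""
    simulated = sum(sim for sim, _ in players.values())
    actual = sum(act for _, act in players.values())
    return simulated, actual


for team, players in table_17.items():
    simulated, actual = team_totals(players)
    shortfall = (actual - simulated) / actual
    print(f"{team}: simulated {simulated:.1f}, actual {actual:.1f}, "
          f"shortfall {shortfall:.1%}")
```

The simulation scores below the adjusted season figure for every player, and the team-level shortfall is larger for Cleveland than for Golden State. A difference of that kind could tilt the simulated win percentage toward the Warriors, which is worth keeping in mind when reading Table 16.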
Section 10: Conclusions and Questions
Although the model above did not predict a Cavaliers win in the NBA finals, its prediction for the
Warriors (73%) was close to other robust predictors published before the two teams played, including
FiveThirtyEight’s 69% chance of the Warriors winning the series and the betting odds reported by
USA Today, which put the Warriors’ chance of winning at 67%.
As mentioned earlier, this simulation has the potential to simulate any NBA matchup for a series of
games. The odds of each team winning are easily accumulated in the simulation along with a range at a
95% confidence level. Once the model has been thoroughly validated against other matchups, its uses
become almost limitless. The model is unique and extremely flexible in the sense that it utilizes each
player’s individual stats to predict a team’s chance of winning. In other words, any five starters from
any team can be input into the model, and the two teams can play each other to predict an outcome. To
take it one step further, even teams from different generations can be entered into the model. However,
some adjustments may be needed to account for rule changes that occurred in the NBA. For example, the
ability to shoot three pointers was not available until 1979 (Wood, 2011).
Other potential options for this program include adding more players into the simulation in order to take
bench players into account. One of the factors that led to the Cavaliers’ win over the Warriors was the
suspension of Draymond Green in Game 5. It is hard to predict these rare occurrences and model these
situations in simulation modeling. With that said, one option would be to use various “starting lineups” for
each team and see if the results remain unchanged. In other words, shuffle the potential lineups using
multiple combinations of each team’s top 7-8 players and take an average of the averages to get
additional predictions and ranges.
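The lineup-shuffling idea can be sketched as follows. This is a hypothetical Python outline, not something implemented in the Arena model: the roster labels are placeholders, and `win_rate_for` stands in for a full simulation run of one lineup, which the caller would have to supply.

```python
from itertools import combinations
from statistics import mean


def candidate_lineups(roster, size=5):
    """Every distinct five-player lineup that can be drawn from the roster."""
    return list(combinations(roster, size))


def average_of_averages(lineups, win_rate_for):
    """Average the per-lineup win rates; `win_rate_for` is a caller-supplied function."""
    return mean(win_rate_for(lineup) for lineup in lineups)


# Hypothetical placeholder labels, not actual roster data.
roster = [f"Player {i}" for i in range(1, 9)]  # a team's top 8 players
lineups = candidate_lineups(roster)
print(len(lineups), "possible starting lineups from 8 players")
```

A roster of 7 players yields 21 possible lineups and a roster of 8 yields 56, so pairing every lineup of one team against every lineup of the other grows quickly. Sampling a subset of pairings may be more practical than running them all.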
References and Citations
2015–16 Golden State Warriors Season (2016). en.wikipedia.org. Retrieved on 11/14/16 from https://en.wikipedia.org/wiki/2015%E2%80%9316_Golden_State_Warriors_season
2016 NBA Finals Cavaliers vs. Warriors. basketball-reference.com. Retrieved on 11/14/2016 from http://www.basketball-reference.com/playoffs/2016-nba-finals-cavaliers-vs-warriors.html
Cacciola, S. (2016). Cavaliers Defeat Warriors to Win Their First NBA Title. nytimes.com. Retrieved on 11/2/2016 from http://www.nytimes.com/2016/06/20/sports/basketball/golden-state-warriors-cleveland-cavaliers-nba-championship.html?_r=0
Doane, S. & Seward, L. (2012). Applied Statistics in Business and Economics (4th Ed.). New York, NY: McGraw-Hill/Irwin.
Elliott, R. J. & Morrell, C. H. (2010). Learning SAS in the Lab: Third Edition. Boston, MA: Brooks / Cole.
Exner, R. (2016). Eight NBA Teams Have Rallied from 3-1 Deficits to Win a Playoff Series; A Complete Listing. Cleveland.com. Retrieved on 10/31/2016 from https://www.cleveland.com/datacentral/index.ssf/2009/05/eight_nba_teams_have_rallied_f.html
FiveThirtyEight. (2016). en.wikipedia.org. Retrieved on 11/16/16 from https://en.wikipedia.org/wiki/FiveThirtyEight
League Team Stats. nba.com. Retrieved on 6/20/16 from http://stats.nba.com/league/player/#!/?Season=201516&SeasonType=Regular%20Season
Morris, B. (2016). Good News, Warriors Fans: Every Cliché About the NBA Playoffs is True. fivethirtyeight.com. Retrieved on 10/15/16 from http://fivethirtyeight.com/features/good-news-warriors-fans-every-cliche-about-the-nba-playoffs-is-true/
Neuharth-Keusch, A. J. (2016). Betting Odds for the 2016 NBA Finals. usatoday.com. Retrieved on 11/19/2016 from http://www.usatoday.com/story/sports/nba/2016/06/02/betting-odds-2016-nba-finals-warriors-cavaliers-stephen-curry-lebron-james/85312676/
SAS Certification Prep Guide: Base Programming for SAS 9 Third Edition. Cary, North Carolina: SAS Institute, Inc.
Wood, R. (2011). The History of the 3-Pointer. www.usab.com. Retrieved on 11/18/2016 from https://www.usab.com/youth/news/2011/06/the-history-of-the-3-pointer.aspx
Contents
Section 1: Detailed Introduction to the Problem
Section 2: Goals of the Simulation
Section 3: Modeling Assumptions
Assumption #1: Individual Players Utilized in Simulation
Table 1: Golden State Warrior Players Included in Simulation
Table 2: Cleveland Cavalier Players Included in Simulation
Assumption #2: Normalization Adjustment for Minutes Played and Game Statistics
Assumption #3: Game Rules Included in Simulation
Section 4: Methods and Data Collection
Table 3: Variable Names and Descriptions
Section 5: Statistics and Figures Utilized in the Simulation
Table 4: Golden State Warriors’ Stats – Prior to Time Normalization Adjustment
Table 5: Golden State Warriors’ Stats – Adjusted for Assumption that Starters Play Entire Game
Table 6: Cleveland Cavaliers’ Stats – Prior to Time Normalization Adjustment
Table 7: Cleveland Cavaliers’ Stats – Adjusted for Assumption that Starters Play Entire Game
Offensive Rebounds per Field Goal Miss Statistics
Table 8 (Adjusted for 48-Minute Game)
Section 6: Code and Diagrams
Part 1: Initial Start of the Game
Figure 1: Start of the Simulation
Part 2: Record Scores, Possible Game Termination, Record Winner, Hold the Ball
Figure 2: Decisions for a game over scenario, distributing the ball if game is not over, and figuring out which team won if game is over
Part 3: Hold the Ball
Table 8: Calculation of Average and Standard Deviation of the Time Holding the Ball for Each Team
Figure 3: Holding the ball delay process before the distribution of the ball to an individual player.
Part 4: Distribution of the Ball
Figure 4: Warriors Distribution Decision
Figure 5: Warriors’ discrete probabilities of players’ chances of receiving the ball
Figure 6: Cavaliers’ discrete probabilities of players’ chances of receiving the ball
Part 5: Players’ Decisions and Actions
Table 9: Player Decision and Action Descriptions and Explanations
Figure 7: Example of S Curry Actions / Decisions
Figure 8: Player Decisions / Actions – Discrete Probabilities
Table 10: Example of Steph Curry’s FTA Probabilities
Figure 9: Player free throw percentages
Table 11: Discrete Probabilities of Each Team Getting an Offensive Rebound
Section 7: Experimental Results
Figure 10: View of Arena’s run setup options (setting up 101 repetitions)
Figure 11: Arena’s user specified Warriors’ win percentage results
Figure 12: Margin of victory for each team (all 101 games)
Table 16: Individual Stat Comparisons (Simulation Averages v NBA finals Averages)
Section 8: Sensitivity Analysis
Table 12: Comparison of Simulation Scenarios (Changing Time of Holding Ball Variable)
Table 13: Sensitivity Analysis Part 2 - Player Decisions / Actions – Discrete Probabilities
Table 14: Sensitivity Analysis Part 2 - Discrete Probabilities of Players’ Chances of Receiving the Ball
Table 15: Win % Comparison (Original Lineup versus New Lineup)
Section 9: Validation Analysis
Table 16: Odds of Warriors to Win NBA Finals Comparison
Table 17: Individual Player Statistics Comparison (Simulation versus Actual Results)
Section 10: Conclusions and Questions
References and Citations