MATLAB COMPUTATIONAL FINANCE CONFERENCE 2017

Quantitative Sports Analytics using MATLAB

Robert Kissell, PhD

Robert.Kissell@KissellResearch.com

September 28, 2017

Important Email and Web Addresses

• AlgoSports23/MATLAB Competition

Are you smarter than the Algo?

Email: AlgoSports23@gmail.com

Website: AlgoSports23.com

Please check the website for data updates, and contact AlgoSports23@gmail.com for further information.

Presentation Outline

• Quantitative Sports Modeling

• Modeling Techniques from:

• “Optimal Sports Math, Statistics, and Fantasy”

• Probability Models

• Rank Sports Teams

• Estimate Winning Probability

• Calculate Winning Margin

• Computing Probability of Beating a Spread

• AlgoSports23/MATLAB Competition

• Are you smarter than the Algo!

Transaction Cost Analysis and Algorithm Trading

• A suite of TCA Models and Optimizers has been fully integrated into MATLAB’s Trading Toolbox.

• These tools are being used for Algorithmic Trading and Portfolio Management.

• These include:

• Market Impact Estimation

• Pre-Trade

• Post-Trade

• Trade Schedule Optimization

• Liquidation Cost Analysis

• Portfolio Optimization with TCA

• Various Libraries are Available

• Access to a full suite of TCA libraries and MI Data is available upon request.

• Contact: info@KissellResearch.com or Robert.Kissell@KissellResearch.com

Optimal Sports Math, Statistics, and Fantasy

Key items addressed include:

• Accurately rank sports teams

• Compute winning probability

• Demystify the black-box world of computer models

• Provide insight into the BCS and RPI selection process.

• Select optimal mix of players for a fantasy league competition

• Evaluate player skill and forecast future player performance

• Select team rosters

• Assist in salary negotiation

• Determine Hall of Fame eligibility

• Sabermetrics on Steroids!

What is Quantitative Finance?

• Quantitative Finance is the application of methods and analyses from the different sciences to solve financial problems.

• This includes: Math, Statistics, Physics, Engineering, Economics, Computer Science, Biology, Psychology, Business, etc.

• Quantitative Finance is all about proper utilization of the “Scientific Method” and drawing statistically significant conclusions.

Scientist or Engineer

• A Scientist is someone who “loves” surprises. This is an opportunity to learn and make further advancements. The goal is to learn, improve, and progress.

• An Engineer is someone who “hates” surprises. Surprises are usually an indication that something “failed” or went wrong, and often result in a loss or a slowing of progress.

What about a Quant?

• A Quant is someone who learns from a proper application of the scientific method by finding “Scientific” surprises and “profit” opportunities.

• Quants go to great lengths to learn the cause of these surprises and to ensure that these relationships are statistically significant.

• Quants then seek to implement these scientific surprises without suffering any “Engineering” surprises and losses.

The Scientific Method in Practice

[Diagram: three ways of working with data]

• Scientist: Data → Statistically Significant Conclusion

• Attorney: Desired Outcome → Find supporting data (Data Mining)

• Doctor: Educated Guess → Test Data (Worst-Case Scenario?)

Moral of the Story:

Be a Scientist!

Don’t be that Anti-Scientist!

Quantitative Sports Modeling

What is Quantitative Sports Modeling?

• The application of quantitative tools and analytics, and sound scientific methods, to sports-related problems and questions.

• Quantitative sports modeling uses the same tools as quantitative finance: mathematics, statistics, engineering, machine learning, economics, business, etc.

• Sports Modeling is based on the same framework as Quantitative Finance, but solves a different set of problems.

What do we want to solve?

• Expected Winning Team

• Probability of Winning

• Expected Winning Margin

• Probability of Beating a Specified Margin

• Future Player Performance

• Roster of Players (Best set of Complementary Players)

• Best Mix of Players given Opponent

• Salaries & Salary Negotiation

Sports Modeling Data: What we want to Predict (LHS)

• Win/Loss

• Win Margin

• Probability of winning by more than X points

• Player Statistics (Fantasy Sports)

• Evaluating Player Ability

• Roster Selection

• Salary and Salary Negotiations

• Line-up and Match-ups

• Player Trades

• Hall of Fame Selection

Sports Modeling Data: Explanatory Factors Data (RHS)

• Win/Loss Result

• Game Scores

• Game Data

• Team Statistics (AVG, OBP, ERA, HR, Comp. Ratio)

• Venue Location (Home Field Advantage)

• Momentum

• Players, Injuries

• Career Statistics

• Salary

• Age

• Teammates & Roster

• Principal Component Analysis

Different Sports Prediction Models

• Probability Models

• Non-Linear Regression

• Non-Parametric Statistics

• Neural Networks / Machine Learning

• Sabermetrics on Steroids!

Head-to-Head Competitions – How do we Rank Teams

[Diagram: head-to-head results among teams A, B, C, D, E, F]

Ranking: A; B & C; D & E; F

[Diagram: head-to-head results among teams A, B, C]

Ranking: A, B, C

[Diagram: teams A to F with team G added]

Ranking: A & G; B & C; D & E; F

Ranking: A; B & C & G; D & E; F

[Diagram: teams A to F with team H added]

Ranking: A; B & C; D & E; F & H

Ranking: A; B & C; D & E & H; F

Sports Models To Discuss Today

Probability Models: Probability (X > Y)

• Power Function: $\dfrac{\lambda_x}{\lambda_x + \lambda_y}$

• Logit Regression: $b_0 + b_h - b_a = \ln\left(\dfrac{F^{-1}(z)}{1 - F^{-1}(z)}\right)$

• In probability models, the LHS variable is in (0,1)!

Power Function

The Power function is derived from the Exponential Distribution.

Let,

$f(x) \sim \lambda_x e^{-\lambda_x t}$

$f(y) \sim \lambda_y e^{-\lambda_y t}$

Then,

$Prob(x > y) = \dfrac{\lambda_x}{\lambda_x + \lambda_y}$

where $\lambda_k$ = Team “k” Rating
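The closed form can be checked numerically. The sketch below assumes that each rating acts as the rate of an exponential waiting time and that team x "wins" when its draw arrives first. That reading, the ratings 3 and 1, and the sample size are illustrative assumptions, not values from the talk.

```matlab
% Monte Carlo check of Prob = lambda_x / (lambda_x + lambda_y).
% Assumption: each team's draw is exponential with rate lambda_k, and the
% team whose draw arrives first (smaller value) is counted as the winner.
rng(1);                          % fixed seed for reproducibility
lambdaX = 3;                     % hypothetical rating, team x
lambdaY = 1;                     % hypothetical rating, team y
n = 1e6;                         % number of simulated games

tx = -log(rand(n,1)) / lambdaX;  % exponential draws via the inverse CDF
ty = -log(rand(n,1)) / lambdaY;

pSim    = mean(tx < ty);                  % simulated win frequency for x
pClosed = lambdaX / (lambdaX + lambdaY);  % power function value
fprintf('simulated %.4f, closed form %.4f\n', pSim, pClosed);
```

With a million draws the simulated frequency should agree with the closed form to about two or three decimal places.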

Power Function with Home Field Advantage

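Let X be Home Team

$Prob(X > Y) = \dfrac{\lambda_x + \lambda_0}{\lambda_x + \lambda_y + \lambda_0}$

Let Y be Away Team

$Prob(Y > X) = \dfrac{\lambda_y}{\lambda_x + \lambda_y + \lambda_0}$

$\lambda_k$ = Team “k” Rating

$\lambda_0$ = Home Field Advantage (HFA)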

Power Function: Solving Parameters

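Function

$G_i = \begin{cases} \dfrac{\lambda_x + \lambda_0}{\lambda_x + \lambda_y + \lambda_0} & \text{if home team wins game} \\[2ex] \dfrac{\lambda_y}{\lambda_x + \lambda_y + \lambda_0} & \text{if away team wins game} \end{cases}$

$\text{Max } L = \prod_i G_i$

$\text{Max } \log L = \sum_i \log G_i$

Solve using Maximum Likelihood Estimates (“MLE”)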

Power Function: Estimate Spread

Run Second Regression,

$Spread = d_0 + d_1 \cdot Probability$

Results: $d_0,\ d_1,\ se_Y$
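The second regression is an ordinary least-squares fit, so it can be sketched with the backslash operator. The `prob` and `spread` vectors below are hypothetical toy values, and the variable names are assumptions chosen to mirror the symbols on the slide.

```matlab
% Second-stage OLS: Spread = d0 + d1 * Probability.
% prob and spread are hypothetical toy values, not data from the talk.
prob   = [0.85; 0.60; 0.45; 0.30; 0.70; 0.55];  % predicted home win probability
spread = [14;   3;    -2;   -7;   10;   1];     % actual home team win margin

A     = [ones(size(prob)) prob];   % design matrix with intercept column
d     = A \ spread;                % least-squares estimates [d0; d1]
resid = spread - A*d;              % regression residuals
seY   = sqrt(sum(resid.^2) / (numel(spread) - 2));  % regression standard error

fprintf('d0 = %.3f, d1 = %.3f, seY = %.3f\n', d(1), d(2), seY);
```

The standard error `seY` is kept because it is what later turns an expected margin into a probability of beating a given spread.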

MATLAB – Solving Power Function Parameters

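% Power Function Model
% Num   = matrix of winning team and location (HFA if at home)
% Denom = matrix of all teams including HFA
% b0 = starting values, LB/UB = parameter bounds, options = solver options
[b,fval,exitflag,output] = fmincon(@(b) myPower(b,Num,Denom),...
    b0,[],[],[],[],LB,UB,...
    [],...
    options);
exitflag   % display the solver exit flag

function f = myPower(b,Num,Denom)
% Negative log-likelihood of the observed game results
Z = (Num*b)./(Denom*b);   % per-game probability of the observed winner
f = -sum(log(Z));
end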

Steps to Solve Power Function

• Set up Objective Function:

• Estimate Team Ratings using MLE

• Compute Winning Probabilities using Power Function Formula

• Run Regression of Home Team Win Margin (“Spread”) as a function of Predicted Home Team Winning Probability (“Prob”):

• $Spread = d_0 + d_1 \cdot Prob$

• This provides:

• 1) Probability that Home Team Wins Game

• 2) Expected Home Team Win Margin

• 3) Teams can be ranked based on Model Parameter (from highest to lowest)
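The steps above can be sketched end to end on a toy schedule. Everything in the example is hypothetical: the four teams, the game results, the starting values, and the bounds of 0.01 and 10 are assumptions made for illustration, and the parameter layout (one column per team plus a final HFA column) is one plausible reading of the `Num` and `Denom` matrices.

```matlab
% Toy power-function fit (hypothetical data; requires Optimization Toolbox).
% Each row of games is [homeTeam awayTeam homeWon].
games = [1 2 1; 2 3 1; 3 4 1; 4 1 0; 1 3 1; 2 4 0; ...
         3 1 0; 4 2 1; 2 1 1; 3 2 1; 4 3 0; 1 4 1];
nTeams = 4;
nGames = size(games,1);
hfa    = nTeams + 1;               % column index of the home-field parameter

Num   = zeros(nGames, nTeams+1);   % marks the winner (plus HFA if at home)
Denom = zeros(nGames, nTeams+1);   % marks both teams plus HFA
for g = 1:nGames
    h = games(g,1);  a = games(g,2);
    Denom(g,[h a hfa]) = 1;
    if games(g,3) == 1
        Num(g,[h hfa]) = 1;        % home team won
    else
        Num(g,a) = 1;              % away team won
    end
end

negLogL = @(b) -sum(log((Num*b)./(Denom*b)));  % negative log-likelihood
b0 = ones(nTeams+1,1);             % assumed starting values
LB = 0.01*ones(nTeams+1,1);        % assumed lower bounds
UB = 10*ones(nTeams+1,1);          % assumed upper bounds
options = optimoptions('fmincon','Display','off');
[b,fval,exitflag] = fmincon(negLogL,b0,[],[],[],[],LB,UB,[],options);

% Probability that team 1 beats team 2 when team 1 is at home
pHome = (b(1) + b(hfa)) / (b(1) + b(2) + b(hfa));

% Rank teams from highest to lowest rating
[~,rankOrder] = sort(b(1:nTeams),'descend');
disp(rankOrder');
```

The ratings are only identified up to a common scale factor, since multiplying every parameter by the same constant leaves each probability unchanged. The bounds keep the solver in a finite region, and the ranking does not depend on the scale.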

Logit Regression

Logit Regression Model

Start with Logistic Distribution Function:

$z_1 = \dfrac{1}{1 + \exp\{-(b_0 + b_h - b_a)\}}$

$s$ = Home Pts − Away Pts = Home Team Spread, $(-\infty, +\infty)$

$z = \dfrac{s - avg(s)}{stdev(s)}$, $(-\infty, +\infty)$

$z_1 = F^{-1}(z) = normcdf(z)$, $(0,1)$

Logit Regression Model

We transform the logistic function into the logit regression:

$b_0 + b_h - b_a = \ln\left(\dfrac{z_1}{1 - z_1}\right)$

$s$ = Home Team Spread, $(-\infty, +\infty)$

$z = \dfrac{s - avg(s)}{stdev(s)}$, $(-\infty, +\infty)$

$z_1 = F^{-1}(z) = normcdf(z)$, $(0,1)$

Steps to Solve Logit Spread Regression (Part 1)

• Calculate LHS Spread Values: $s$ = Home Team Spread, $(-\infty, +\infty)$; $z = \dfrac{s - avg(s)}{stdev(s)}$, $(-\infty, +\infty)$; $z_1 = F^{-1}(z) = normcdf(z)$, $(0,1)$

• Solve parameters from OLS

• $b_0 + b_h - b_a = \ln\left(\dfrac{z_1}{1 - z_1}\right)$

• Estimate Home Team Win Margin

• $z_1 = F^{-1}(z) = \dfrac{1}{1 + \exp\{-(b_0 + b_h - b_a)\}}$

• $z = norminv(z_1)$

• $s = z \cdot stdev(s) + avg(s)$
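The forward and inverse transforms in these steps can be checked with a round trip. The sketch below uses hypothetical spreads and replaces `normcdf` and `norminv` with base-MATLAB equivalents built on `erfc` and `erfcinv`; that substitution is an assumption made only to avoid a toolbox dependency.

```matlab
% Forward and inverse transforms for the logit spread regression.
% Phi and PhiInv are base-MATLAB stand-ins for normcdf and norminv.
Phi    = @(z) 0.5 * erfc(-z / sqrt(2));
PhiInv = @(p) -sqrt(2) * erfcinv(2 * p);

s  = [7; -3; 14; -10; 3; 21; -6];   % hypothetical home team spreads
mu = mean(s);
sd = std(s);

z  = (s - mu) / sd;            % standardized spread
z1 = Phi(z);                   % mapped into (0,1)
Y  = log(z1 ./ (1 - z1));      % logit: the regression LHS

z1Back = 1 ./ (1 + exp(-Y));   % logistic function undoes the logit
zBack  = PhiInv(z1Back);       % inverse normal CDF recovers the z-score
sBack  = zBack * sd + mu;      % rescale the z-score back to points

fprintf('max round-trip error: %.2e\n', max(abs(sBack - s)));
```

In the regression itself the fitted value of `b0 + bh - ba` takes the place of `Y` in the inverse steps.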

Steps to Solve Logit Spread Regression (Part 2)

• Run second regression:

• $Actual\ Spread = d_0 + d_1 \cdot Estimated\ Spread$

• $Y = d_0 + d_1 \cdot s$

• $d_0,\ d_1,\ se_Y$

• Compute Home Team Win Probability

• $Prob(Spread > 0)$

• $Prob(Y > 0)$

• $Y \sim N(s,\ se_Y)$
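Given an estimated spread and the regression standard error, the win probability and the probability of beating a point spread follow from the normal CDF. In the sketch below the standard error of 10 and the point spread of 3 are hypothetical, and `Phi` is a base-MATLAB stand-in for `normcdf`.

```matlab
% Win probability and probability of beating a spread, assuming Y ~ N(s, seY).
Phi = @(x) 0.5 * erfc(-x / sqrt(2));   % standard normal CDF (base MATLAB)

s   = 6.7;   % estimated home team spread
seY = 10;    % hypothetical regression standard error
X   = 3;     % hypothetical point spread to beat

pWin   = Phi(s / seY);         % Prob(Y > 0), i.e. 1 - normcdf(0, s, seY)
pCover = Phi((s - X) / seY);   % Prob(Y > X), i.e. 1 - normcdf(X, s, seY)

fprintf('P(win) = %.3f, P(win by more than %g) = %.3f\n', pWin, X, pCover);
```

Raising the margin that must be beaten can only lower the probability, so `pCover` never exceeds `pWin` when `X` is positive.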

MATLAB – Logit Regression

% Logit Regression
% s = home team win margin
%     s>0, home team won game by s; s<0, home team lost game by |s|
z     = zscore(s);              % standardized margin
mu    = mean(s);                % kept to convert predictions back to points
sigma = std(s);
Finv  = normcdf(z);             % map z into (0,1)
Y     = log(Finv./(1-Finv));    % logit transform (regression LHS)
% X = matrix of games, home team = +1, away team = -1
whichstats = {'beta','tstat','r','yhat','mse','rsquare'};
myStats = regstats(Y,X,'linear',whichstats);
beta = myStats.tstat.beta;         % [intercept; team coefficients]
beta = [beta(2:end); beta(1)];     % move the intercept (b0) to the end
TeamRating = beta;
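A self-contained version of the same regression on toy data is sketched below. The games and margins are hypothetical, `Phi` stands in for `normcdf`, and the fit uses the backslash operator instead of `regstats`. Dropping the last team as a zero-rated reference is also an assumption: the +1/-1 columns sum to zero across teams, so some normalization is needed, and the slides do not say which one was used.

```matlab
% Toy logit spread regression (hypothetical data, base MATLAB only).
% Each row of games is [homeTeam awayTeam homeMargin].
games = [1 2 7; 2 3 3; 3 4 -4; 4 1 -10; 1 3 14; 2 4 6; 3 1 -3; 4 2 1];
nTeams = 4;
nGames = size(games,1);
s = games(:,3);

Phi = @(z) 0.5 * erfc(-z / sqrt(2));    % stand-in for normcdf
z1  = Phi((s - mean(s)) / std(s));      % standardized margin mapped to (0,1)
Y   = log(z1 ./ (1 - z1));              % logit transform (regression LHS)

X = zeros(nGames, nTeams);
for g = 1:nGames
    X(g, games(g,1)) =  1;              % home team
    X(g, games(g,2)) = -1;              % away team
end

% Assumption: the last team is the reference, with rating fixed at 0.
A    = [ones(nGames,1) X(:,1:nTeams-1)];
beta = A \ Y;                           % OLS estimates [b0; b1 ... b(nTeams-1)]

b0         = beta(1);                   % intercept (home field term)
TeamRating = [beta(2:end); 0];          % ratings, reference team last

[~, rankOrder] = sort(TeamRating, 'descend');
disp(rankOrder');
```

Only rating differences enter the model, so fixing one team at zero shifts every rating by the same constant and leaves the predicted spreads and the ranking unchanged.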

NFL

NFL Data: Only Three Weeks of Games (47 Games)

Power Function: Estimating Spreads

$prob = \dfrac{\lambda_x + \lambda_0}{\lambda_x + \lambda_y + \lambda_0}$

$spread = d_0 + d_1 \cdot prob$

NFL - Power Function

Estimating Home Team Win Probability:

$prob = \dfrac{\lambda_x + \lambda_0}{\lambda_x + \lambda_y + \lambda_0}$

Estimating Home Team Spread

$s = d_0 + d_1 \cdot prob = -12.601 + 28.154 \cdot prob$

Example: Power Function

New England (Home) vs. Carolina (Away)

New England = 28.954

Carolina = 5.1099

HFA = 0.01

$prob = \dfrac{28.954 + 0.01}{28.954 + 5.109 + 0.01} = 85\%$

Estimating Home Team Spread

$s = -12.601 + 28.154 \cdot 0.85 = +11.3$ (need to adjust)
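The arithmetic in this example can be reproduced directly from the ratings and regression coefficients quoted on the slides; only the variable names below are assumptions.

```matlab
% Power function example: New England (home) vs. Carolina (away).
lamNE  = 28.954;    % New England rating
lamCAR = 5.1099;    % Carolina rating
lam0   = 0.01;      % home field advantage
d0 = -12.601;       % second-regression intercept
d1 = 28.154;        % second-regression slope

prob   = (lamNE + lam0) / (lamNE + lamCAR + lam0);  % home win probability
spread = d0 + d1 * prob;                            % expected home margin

fprintf('prob = %.1f%%, spread = %+.1f\n', 100*prob, spread);
% prints: prob = 85.0%, spread = +11.3
```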

Logit Regression: Estimating Spreads

Est. Spread $= b_0 + b_h - b_a$

Act. Spread $= d_0 + d_1 \cdot$ Est. Spread

NFL – Logit Regression

Estimating Home Team Win Probability:

$\ln\left(\dfrac{z_1}{1 - z_1}\right) = b_0 + b_h - b_a$

Estimating Home Team Spread

$Y$ (Actual Spread) $= d_0 + d_1 \cdot$ Estimated Spread $s$

$d_0,\ d_1,\ se_Y$

$Prob(Y > 0) = 1 - normcdf(0,\ s,\ se_Y)$

Example: Logit Regression

New England (Home) vs. Carolina (Away)

New England = 1.0079

Carolina = 0.4869

HFA = -0.0592

Estimating Home Team Spread:

$s = J\,K\left[\dfrac{1}{1 + \exp(-(1.0079 - 0.4869 - 0.0592))}\right] = +6.7$

Estimating Home Team Win Probability:

$p = f(6.7) = 74\%$

NFL - Predictions

NCAA College Football

College Football: Only Four Weeks of Games (286 Games)

Games with Div 1-FBS Teams Only

NCAA Football: Only Four Weeks of Games

NCAA Football - FBS: Model Results

NCAA Football - FBS: Algorithmic Rankings (after 4 weeks)

NCAA Football - FBS: Week 5 Predictions (Part 1)

NCAA Football - FBS: Week 5 Predictions (Part 2)

AlgoSports23/MATLAB Competition

AlgoSports23 / MATLAB Competition

• Are you Smarter than the Algo!

• Can you Beat the Algo!

AlgoSports23 / MATLAB Competition

Two Important Emails:

Robert.Kissell@KissellResearch.com

AlgoSports23@gmail.com

AlgoSports23 / MATLAB Competition

• Rules of the Competition

• All Analysis & Programming in MATLAB

• Game Results Data will be Posted Weekly

• Game Prediction File will be Posted Weekly

• Return Model Predictions by Specified Date

• Top 23 performing Algorithms each week will be included in the AlgoSports23 Computer Rankings and Prediction

• National Media Attention!

• Are you smarter than the Algo?

AlgoSports23 / MATLAB Competition

Your program and submission need to include the following:

1) Ranking of Teams

2) Prediction of Home Team Winning Margin for all games in a week

Models are measured based on:

1) RMSE

2) Avg Difference

3) Number of Wins
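A sketch of how these three measures might be computed is shown below. The slides do not define them precisely, so the code assumes that "Avg Difference" means the average absolute difference between predicted and actual margins and that "Number of Wins" counts the games where the predicted winner matches the actual winner. The margins are hypothetical.

```matlab
% Scoring a week of predictions (hypothetical margins; definitions assumed).
pred   = [ 7;  -3;  10;  2;  -6];   % predicted home team win margins
actual = [10;   4;   3;  1; -14];   % actual home team win margins

err     = pred - actual;
rmse    = sqrt(mean(err.^2));                 % root mean squared error
avgDiff = mean(abs(err));                     % assumed: mean absolute difference
nWins   = sum(sign(pred) == sign(actual));    % assumed: correct winner picks

fprintf('RMSE = %.3f, Avg Diff = %.1f, Wins = %d of %d\n', ...
        rmse, avgDiff, nWins, numel(pred));
```

For these toy values the predicted winner is correct in four of the five games; the second game is the miss because the model picked the away team and the home team won.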

AlgoSports23 / MATLAB Competition

• Top 23 performing Algorithms each week will be included in the AlgoSports23 Computer Rankings and Prediction!

• National Media Attention!

• Bragging Rights!
