An introduction to football modelling at Smartodds

Robert Johnson, Smartodds Ltd
Oxford SIAM Conference 2011, February 9, 2011

Introduction

Introduction to Smartodds

Practical example: building a football model

What is Smartodds about?

Smartodds provides statistical research and sports modelling in the betting sector

Quant team research and implement the sports models

Primary focus is on Football, however we also model Basketball, Baseball, American Football, Ice Hockey and Tennis

Wide range of interesting problems to work on

Actively recruiting!

Building a Football model

Suppose we decide to build a football model for the English football leagues

Here we model the divisions Premier League, Championship, League 1 and League 2

There are 92 teams in total to model

We want to predict the probability of team A winning against team B where team A and team B could be from any of the 4 leagues

Literature review

Maher (1982) assumed independent Poisson distributions for home and away goals

Means based on each team’s past performance

Dixon and Coles (1997) took this idea further by accounting for fluctuations in performance of individual teams and estimation between leagues

Dixon and Robinson (1998) modelled the scores during a game as a two-dimensional birth process

Model formulation

Assume that home and away goals follow a Poisson distribution

$$\Pr(x \text{ goals}) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

$$\Pr(y \text{ goals}) = \frac{\mu^{y} e^{-\mu}}{y!}$$

To estimate the probabilities of x and y goals we need λ and µ
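As a minimal sketch of the Poisson assumption, the probability of exactly k goals can be computed directly from the formula. The function name `poisson_pmf` and the mean of 1.5 are illustrative choices, not values from the talk.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals when goals are Poisson with the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


# Illustrative mean only (an assumption); the slides estimate lambda and mu later.
example_mean = 1.5
for goals in range(6):
    print(f"Pr({goals} goals) = {poisson_pmf(goals, example_mean):.3f}")
```

The same function serves for home goals (mean λ) and away goals (mean µ).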

Model 1: Mean goals

Assume that home and away teams are expected to score the same number of goals

Take average goals scored in a game in England as 2.56 and divide by two

λ = 1.28

µ = 1.28

However we may believe that there is some advantage associated with playing at home

Model 2: Home Advantage

Include a term to take account of home advantage

λ = γ × τ

µ = γ

γ is the common mean and τ represents the home advantage

Model 2: Home Advantage (Cont)

Mean goals scored by the away team in the four English leagues we model is 1.10, giving

γ = 1.10

This implies mean goals scored by the home team are 2.56 − 1.10 = 1.46

Using the above we can estimate τ as

τ = 1.46/1.10 = 1.33
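The arithmetic above can be reproduced in a few lines. This is only a sketch using the two averages quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide for the four English leagues.
mean_total_goals = 2.56   # average goals per game
mean_away_goals = 1.10    # average goals scored by the away team

gamma = mean_away_goals                      # common mean
mean_home_goals = mean_total_goals - gamma   # 1.46
tau = mean_home_goals / gamma                # home advantage multiplier

lam = gamma * tau   # expected home goals under Model 2
mu = gamma          # expected away goals under Model 2
print(f"gamma={gamma:.2f}, tau={tau:.2f}, lambda={lam:.2f}, mu={mu:.2f}")
```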

Model 3: Team Strengths

Previous attempts assumed all teams of equal strength

Can add team strength parameters for each team

Better teams score more goals. Give each team an attack parameter denoted α

Better teams concede fewer goals. Give each team a defence parameter denoted β

Model 3: Team Strengths (Cont)

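Write λ and µ in terms of the attack and defence parameters of the home and away teams, which we denote by i and j, giving

$$\lambda = \gamma \times \tau \times \alpha_i \times \beta_j$$

$$\mu = \gamma \times \alpha_j \times \beta_i$$

The model is overparameterised, so we apply the constraints

$$\frac{1}{n}\sum_{i=1}^{n} \alpha_i = 1, \qquad \frac{1}{n}\sum_{i=1}^{n} \beta_i = 1.$$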
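To see why the constraints are harmless, the sketch below rescales an arbitrary set of attack and defence values to mean one and absorbs the scale into γ, which leaves every λ and µ unchanged. The three teams and all numbers are made up for illustration, and the helper names `normalise` and `rates` are assumptions.

```python
def normalise(gamma, attack, defence):
    """Rescale attack and defence to mean 1, absorbing the scale into gamma."""
    n = len(attack)
    mean_attack = sum(attack.values()) / n
    mean_defence = sum(defence.values()) / n
    attack_n = {team: a / mean_attack for team, a in attack.items()}
    defence_n = {team: d / mean_defence for team, d in defence.items()}
    return gamma * mean_attack * mean_defence, attack_n, defence_n


def rates(gamma, tau, attack, defence, home, away):
    """Expected home goals (lambda) and away goals (mu) for one fixture."""
    lam = gamma * tau * attack[home] * defence[away]
    mu = gamma * attack[away] * defence[home]
    return lam, mu


# Hypothetical unnormalised parameters for three made-up teams.
gamma, tau = 1.0, 1.3
attack = {"A": 2.0, "B": 1.0, "C": 0.6}
defence = {"A": 0.5, "B": 1.2, "C": 1.9}

gamma_n, attack_n, defence_n = normalise(gamma, attack, defence)
print(rates(gamma, tau, attack, defence, "A", "C"))
print(rates(gamma_n, tau, attack_n, defence_n, "A", "C"))  # same rates
```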

Model 3: Pseudolikelihood

The pseudolikelihood for this model is:

$$L(\gamma, \tau, \alpha_i, \beta_i;\ i = 1, \dots, n) = \prod_{k} \left\{ \exp(-\lambda_k)\, \lambda_k^{x_k} \exp(-\mu_k)\, \mu_k^{y_k} \right\}^{\phi(t - t_k)}$$

φ(·) is an exponential downweighting function, which allows us to place less weight on older games

Other downweighting functions could be used
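Taking logs turns the product into a weighted sum, which is the form usually handed to an optimiser. The sketch below assumes φ(s) = exp(−ξs) with s measured in days; the decay rate `xi`, the `Match` record and the function names are hypothetical and not taken from the talk.

```python
import math
from typing import NamedTuple


class Match(NamedTuple):
    home: str
    away: str
    home_goals: int
    away_goals: int
    days_ago: float   # t - t_k, the age of the match


def phi(age_days: float, xi: float = 0.002) -> float:
    """Exponential downweighting; xi is an assumed decay rate per day."""
    return math.exp(-xi * age_days)


def log_pseudolikelihood(matches, gamma, tau, attack, defence, xi=0.002):
    """Weighted log of the pseudolikelihood, summed over matches k."""
    total = 0.0
    for m in matches:
        lam = gamma * tau * attack[m.home] * defence[m.away]
        mu = gamma * attack[m.away] * defence[m.home]
        log_term = (-lam + m.home_goals * math.log(lam)
                    - mu + m.away_goals * math.log(mu))
        total += phi(m.days_ago, xi) * log_term
    return total


# Made-up results for two hypothetical teams.
matches = [Match("A", "B", 2, 1, 10.0), Match("B", "A", 0, 0, 200.0)]
strengths = {"A": 1.2, "B": 0.8}
print(log_pseudolikelihood(matches, 1.1, 1.33, strengths, strengths))
```

Swapping in a different downweighting function only requires replacing `phi`.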

Estimation techniques

Obtaining the parameter estimates is not straightforward

In this example we have 186 parameters to estimate (92 attack, 92 defence, γ and τ)

Various optimisation techniques could be used to obtain parameter estimates (numerical maximisation of the likelihood function, MCMC)

High dimensional problems may also require more sophisticated computing solutions (MPI)
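One simple way to maximise the likelihood numerically is a coordinate-wise scheme, in which each block of parameters is set to its maximiser with the others held fixed. The sketch below is an illustration of that idea, not the method used at Smartodds; the function name, the toy results and the fixed iteration count are all assumptions. It needs every team to have scored and conceded at least once.

```python
def fit_team_strengths(matches, weights=None, iterations=200):
    """Coordinate-wise maximisation of the weighted Poisson pseudolikelihood.

    matches: list of (home, away, home_goals, away_goals) tuples.
    weights: optional per-match weights, e.g. values of phi(t - t_k).
    """
    if weights is None:
        weights = [1.0] * len(matches)
    data = list(zip(matches, weights))
    teams = sorted({m[0] for m in matches} | {m[1] for m in matches})
    n = len(teams)
    attack = dict.fromkeys(teams, 1.0)
    defence = dict.fromkeys(teams, 1.0)
    gamma, tau = 1.0, 1.0

    for _ in range(iterations):
        # alpha_i = weighted goals scored / expected goals with alpha_i = 1
        scored = dict.fromkeys(teams, 0.0)
        expected = dict.fromkeys(teams, 0.0)
        for (h, a, x, y), w in data:
            scored[h] += w * x
            expected[h] += w * gamma * tau * defence[a]
            scored[a] += w * y
            expected[a] += w * gamma * defence[h]
        attack = {t: scored[t] / expected[t] for t in teams}

        # beta_i = weighted goals conceded / expected goals with beta_i = 1
        conceded = dict.fromkeys(teams, 0.0)
        expected = dict.fromkeys(teams, 0.0)
        for (h, a, x, y), w in data:
            conceded[a] += w * x
            expected[a] += w * gamma * tau * attack[h]
            conceded[h] += w * y
            expected[h] += w * gamma * attack[a]
        defence = {t: conceded[t] / expected[t] for t in teams}

        # Enforce the mean-one constraints on attack and defence.
        mean_attack = sum(attack.values()) / n
        mean_defence = sum(defence.values()) / n
        attack = {t: v / mean_attack for t, v in attack.items()}
        defence = {t: v / mean_defence for t, v in defence.items()}

        # Closed-form gamma and tau given the current attack and defence.
        home_base = sum(w * attack[h] * defence[a] for (h, a, x, y), w in data)
        away_base = sum(w * attack[a] * defence[h] for (h, a, x, y), w in data)
        gamma = sum(w * y for (h, a, x, y), w in data) / away_base
        tau = sum(w * x for (h, a, x, y), w in data) / (gamma * home_base)

    return gamma, tau, attack, defence


# Made-up double round-robin between three hypothetical teams.
toy_matches = [
    ("A", "B", 2, 0), ("B", "A", 1, 1), ("A", "C", 3, 1),
    ("C", "A", 0, 2), ("B", "C", 1, 0), ("C", "B", 2, 2),
]
gamma, tau, attack, defence = fit_team_strengths(toy_matches)
print(f"gamma={gamma:.3f}, tau={tau:.3f}")
print({t: round(v, 3) for t, v in attack.items()})
print({t: round(v, 3) for t, v in defence.items()})
```

With 92 teams a general-purpose optimiser or MCMC sampler, as the slide suggests, may be a better fit; this scheme mainly shows that each update has a closed form.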

Parameter estimates

These are Smartodds’ current estimates of the attack and defence parameters of the top 6 teams in the Premier League

Team        Attack Parameter    Defence Parameter
Chelsea     3.15                0.34
Man Utd     3.08                0.35
Arsenal     2.84                0.37
Man City    2.44                0.42
Tottenham   2.22                0.44
Liverpool   2.12                0.39

Predicting outcomes

Suppose Man Utd are playing at home to Man City

Using the parameter estimates we get

λ = 1.10 × 1.33 × 3.08 × 0.42 = 1.89

µ = 1.10 × 2.44 × 0.35 = 0.94

We can use λ and µ to obtain the probability of Man Utd winning the match
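The two products above can be checked with a short script. It uses only the values quoted in the slides (γ = 1.10, τ = 1.33 and the attack and defence estimates for the two teams); the dictionary layout is an illustrative choice.

```python
gamma, tau = 1.10, 1.33
attack = {"Man Utd": 3.08, "Man City": 2.44}
defence = {"Man Utd": 0.35, "Man City": 0.42}

home, away = "Man Utd", "Man City"
lam = gamma * tau * attack[home] * defence[away]   # expected home goals
mu = gamma * attack[away] * defence[home]          # expected away goals
print(f"lambda = {lam:.2f}, mu = {mu:.2f}")        # lambda = 1.89, mu = 0.94
```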

Predicting outcomes (Cont)

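The probability of a specific score is given as follows

$$\Pr(x, y) = \frac{\lambda^{x} e^{-\lambda}}{x!} \cdot \frac{\mu^{y} e^{-\mu}}{y!}$$

So the probability of the score, Man Utd 2 Man City 1, is

$$\Pr(2, 1) = \frac{1.89^{2} e^{-1.89}}{2!} \cdot \frac{0.94^{1} e^{-0.94}}{1!} = 0.099$$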
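The worked value can be reproduced as follows. This is a sketch under the independence assumption stated above; `score_probability` is an illustrative name.

```python
import math


def score_probability(x: int, y: int, lam: float, mu: float) -> float:
    """Pr(home scores x, away scores y) under independent Poisson goals."""
    home = lam ** x * math.exp(-lam) / math.factorial(x)
    away = mu ** y * math.exp(-mu) / math.factorial(y)
    return home * away


p = score_probability(2, 1, lam=1.89, mu=0.94)
print(f"Pr(2, 1) = {p:.3f}")   # Pr(2, 1) = 0.099
```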


Obtain the probability matrix of all possible scores (columns: Man Utd goals, rows: Man City goals)

        0      1      2      3      4      ...
0       0.059  0.112  0.105  0.066  0.031  ...
1       0.055  0.105  0.099  0.062  0.029  ...
2       0.026  0.049  0.047  0.029  0.014  ...
3       0.008  0.015  0.015  0.009  0.004  ...
4       0.002  0.004  0.003  0.002  0.001  ...
...     ...    ...    ...    ...    ...    ...

Sum over all events where home goals are greater than away goals

Giving the probability that Man Utd win at home to Man City as 59.6%
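The summation over the score matrix can be sketched as below. The matrix is infinite in principle, so the code truncates it at 15 goals per side, which is an assumption that discards only a negligible tail for means of this size. The function names are illustrative.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    return mean ** k * math.exp(-mean) / math.factorial(k)


def outcome_probabilities(lam: float, mu: float, max_goals: int = 15):
    """Home win, draw and away win probabilities from the score matrix."""
    home_win = draw = away_win = 0.0
    for x in range(max_goals + 1):          # home goals
        for y in range(max_goals + 1):      # away goals
            p = poisson_pmf(x, lam) * poisson_pmf(y, mu)
            if x > y:
                home_win += p
            elif x == y:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


home_win, draw, away_win = outcome_probabilities(lam=1.89, mu=0.94)
print(f"home win {home_win:.3f}, draw {draw:.3f}, away win {away_win:.3f}")
```

With the rounded values λ = 1.89 and µ = 0.94 the home win probability comes out close to the 59.6% quoted above.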

Practical issues

Betfair’s odds imply Man Utd has a 63% chance of winning the game, potentially leaving value for a bet on Man City. However, should we bet?

These models take into account no external information about match circumstances

Injuries
Motivation
Fatigue
Newly signed players

So betting off a mathematical model would be dangerous!

Shortcomings of the model

If we compare the expected full-time scores under the model with the observed scores, we find our modelling assumptions don’t hold

Goals don’t have a Poisson distribution
Goals scored by the home and away teams aren’t independent

Dixon and Coles corrected for this by modifying the predicted probabilities of the low scores 0-0, 1-0, 0-1 and 1-1, which in practice makes low-scoring draws more likely

However this isn’t entirely satisfactory; it would be better to model what is happening directly
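For reference, the low-score adjustment from Dixon and Coles (1997) multiplies the independent Poisson probability by a factor that differs from one only for the scores 0-0, 0-1, 1-0 and 1-1. The sketch below follows that published form; the dependence parameter `rho = -0.1` is an assumed value for illustration, not an estimate from the talk, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    return mean ** k * math.exp(-mean) / math.factorial(k)


def low_score_adjustment(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles factor; x = home goals, y = away goals."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def adjusted_score_probability(x, y, lam, mu, rho):
    independent = poisson_pmf(x, lam) * poisson_pmf(y, mu)
    return low_score_adjustment(x, y, lam, mu, rho) * independent


lam, mu, rho = 1.89, 0.94, -0.1   # rho is an assumed illustrative value
for score in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    before = poisson_pmf(score[0], lam) * poisson_pmf(score[1], mu)
    after = adjusted_score_probability(*score, lam, mu, rho)
    print(score, f"{before:.4f} -> {after:.4f}")
```

The four changes cancel exactly, so the adjusted probabilities still sum to one; with a negative `rho` the 0-0 and 1-1 draws gain probability at the expense of 1-0 and 0-1.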

Goal time distribution

[Figure: density of goal times. x-axis: Goal time (mins), 0 to 90; y-axis: Density, 0.00 to 0.05]

Goals in injury time at the end of each half are recorded as 45 / 90 min goals

Goal rate steadily increases over the course of the game

Notice the spikes every 5 minutes in the second half - due to rounding?

Dixon and Robinson’s model

If we assume that the goal scoring processes for the home and away teams are independent homogeneous Poisson processes then our model reduces to the full time model discussed previously.

For match k between teams i and j

$$\lambda_k(t) = \lambda_k = \gamma \times \tau \times \alpha_i \times \beta_j$$

$$\mu_k(t) = \mu_k = \gamma \times \alpha_j \times \beta_i$$

Dixon and Robinson’s model (continued)

Three changes:

1 Goal-scoring rate dependent on the current score

2 Modelling of injury time

3 Increasing goal-scoring intensity through the game (due to tiredness of players)

(1) Goal-scoring rate dependent on current score

Assume that home and away scoring processes are independent Poisson processes

Scoring rates are piecewise constant

Home and away intensities are constant until a goal is scored and only change at these times

Denote $\lambda_{xy}$ and $\mu_{xy}$ as parameters determining the scoring rates when the score is (x, y)

Scoring rates are now

$$\lambda_k(t) = \lambda_{xy} \lambda_k$$

and

$$\mu_k(t) = \mu_{xy} \mu_k$$

Page 75: An introduction to football modelling at Smartodds Oxford SIAM

Estimates of $\lambda(x, y)$ and $\mu(x, y)$

$\hat{\lambda}(0, 0) = 1$, $\hat{\mu}(0, 0) = 1$

$\hat{\lambda}(1, 0) = 0.88$, $\hat{\mu}(1, 0) = 1.35$

$\hat{\lambda}(0, 1) = 1.10$, $\hat{\mu}(0, 1) = 1.07$
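One way to read these estimates is through the probability that the home team scores the next goal. Under the piecewise-constant model alone (no injury-time or time-trend adjustment), two competing Poisson processes give that probability as the home rate divided by the sum of the two rates. The sketch below applies this to the three estimated scores; the function name `next_goal_home_prob` and the equal base rates used in the loop are illustrative assumptions.

```python
# Multipliers (lambda_xy, mu_xy) quoted above for three scores.
ESTIMATES = {
    (0, 0): (1.00, 1.00),
    (1, 0): (0.88, 1.35),
    (0, 1): (1.10, 1.07),
}


def next_goal_home_prob(score, lam_k, mu_k, estimates=ESTIMATES):
    """Probability that the next goal is a home goal at the given score.

    Assumes rates stay constant until the next goal, so the ratio of
    rates gives the probability of which team scores first.
    """
    lam_xy, mu_xy = estimates[score]
    home = lam_xy * lam_k
    away = mu_xy * mu_k
    return home / (home + away)


# Hypothetical example: two evenly matched teams (equal base rates).
for score in ESTIMATES:
    print(score, round(next_goal_home_prob(score, 1.0, 1.0), 3))
```

With equal base rates this prints 0.5 at 0-0, 0.395 at 1-0 and 0.507 at 0-1, so in this illustration the trailing away side becomes the more likely next scorer once the home team leads 1-0.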

Page 78: An introduction to football modelling at Smartodds Oxford SIAM

(2) Increase the scoring rate during injury time

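Goals scored during injury time are recorded as having occurred at either 45 or 90 minutes.

Define two new parameters $\rho_1$ and $\rho_2$ to model injury time.

The adjusted scoring rates are

$$\lambda_k(t) = \begin{cases} \rho_1 \lambda_{xy} \lambda_k & t \in (44, 45] \text{ mins}, \\ \rho_2 \lambda_{xy} \lambda_k & t \in (89, 90] \text{ mins}, \\ \lambda_{xy} \lambda_k & \text{otherwise} \end{cases}$$

and similarly for $\mu_k(t)$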
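The piecewise definition translates directly into a small function. This is only a sketch: the name `home_rate` and its argument order are hypothetical, and the values of $\rho_1$ and $\rho_2$ used in any call are placeholders, since the talk does not quote estimates for them.

```python
def home_rate(t, lam_k, lam_xy, rho1, rho2):
    """Home scoring rate at minute t, with injury-time multipliers.

    t      : match time in minutes, 0 < t <= 90.
    lam_k  : base home rate for the match (goals per minute).
    lam_xy : multiplier for the current score (x, y).
    rho1   : multiplier for the last recorded minute of the first half.
    rho2   : multiplier for the last recorded minute of the second half.
    """
    if 44 < t <= 45:
        return rho1 * lam_xy * lam_k
    if 89 < t <= 90:
        return rho2 * lam_xy * lam_k
    return lam_xy * lam_k
```

The away rate would follow the same pattern with `mu_k` and `mu_xy` in place of `lam_k` and `lam_xy`.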

Page 82: An introduction to football modelling at Smartodds Oxford SIAM

(3) Increasing goal-scoring intensity

Allow the scoring intensities to increase over time

Model scoring rates as time-inhomogeneous Poisson processes with a linear rate of increase

Replace $\lambda_k(t)$ and $\mu_k(t)$ with

$\lambda^*_k(t) = \lambda_k(t) + \xi_1 t$,

$\mu^*_k(t) = \mu_k(t) + \xi_2 t$

$\xi_1$ and $\xi_2$ could be constrained to be positive to ensure that the hazard functions above are always positive, but in practice this is not necessary

Scoring rates are estimated to be about 75% higher at the end of the game than at the start of the game.
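To get a feel for the size of the linear term, the sketch below treats the underlying rate as a constant `base_rate` (that is, it ignores the score and injury-time multipliers, which is a simplifying assumption) and reads "about 75% higher" as exactly 75% at 90 minutes. The function names are hypothetical. The expected number of goals over an interval is the integral of the rate, which for a linear rate has a closed form.

```python
def trended_rate(t, base_rate, xi):
    """Scoring rate at minute t with a linear time trend."""
    return base_rate + xi * t


def expected_goals(t0, t1, base_rate, xi):
    """Expected goals in [t0, t1]: the integral of base_rate + xi * t."""
    return base_rate * (t1 - t0) + 0.5 * xi * (t1 ** 2 - t0 ** 2)


def xi_for_relative_increase(base_rate, increase=0.75, minutes=90.0):
    """xi such that the rate at `minutes` is (1 + increase) times the rate at 0."""
    return increase * base_rate / minutes


# Hypothetical base rate of 0.01 goals per minute at kick-off.
xi = xi_for_relative_increase(0.01)
print(trended_rate(90.0, 0.01, xi))         # rate at full time
print(expected_goals(0.0, 90.0, 0.01, xi))  # expected goals over the match
```

Under these assumptions the trend adds 37.5% to the expected goal count relative to a flat rate, because the average of a linearly increasing rate is the midpoint of its start and end values.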

Page 87: An introduction to football modelling at Smartodds Oxford SIAM

Model usage

This ‘in-running’ model can be useful in its own right (for deriving in-running prices)

Also explains the home/away dependencies and non-Poisson pdfs observed in the data
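As an illustration of how in-running prices might be derived, the sketch below estimates home/draw/away probabilities from the current minute and score by Monte Carlo. It is not the method described in the talk: it uses a simple discrete-time approximation (at most one goal per small time step, with probability rate times step length), and the names `in_running_probs`, `home_rate` and `away_rate` are hypothetical. The two rate arguments are caller-supplied functions of time and score, so any combination of the three changes above can be plugged in.

```python
import random


def in_running_probs(t_now, x, y, home_rate, away_rate,
                     n_sims=5000, dt=0.25, minutes=90.0, rng=random):
    """Estimate outcome probabilities for the rest of a match.

    t_now, x, y : current minute and current score.
    home_rate, away_rate : callables (t, x, y) -> goals per minute.
    dt : step length in minutes; rate * dt must be well below 1 for the
         one-goal-per-step approximation to be reasonable.
    """
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_sims):
        t, h, a = t_now, x, y
        while t < minutes:
            step = min(dt, minutes - t)
            p_home = home_rate(t, h, a) * step
            p_away = away_rate(t, h, a) * step
            u = rng.random()
            if u < p_home:
                h += 1          # home goal in this step
            elif u < p_home + p_away:
                a += 1          # away goal in this step
            t += step
        key = "home" if h > a else "away" if a > h else "draw"
        counts[key] += 1
    return {k: v / n_sims for k, v in counts.items()}
```

A fair price for each outcome would then be based on the reciprocal of its estimated probability; a smaller `dt` or more simulations trades run time for accuracy.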

Page 89: An introduction to football modelling at Smartodds Oxford SIAM

Summary

The Dixon-Coles model is a simple and robust full-time score model, but not all of its assumptions are met

A continuous time model such as the Dixon-Robinson model can model dependencies between home and away scoring rates

Mathematical models cannot model team news (unless this is incorporated into the model somehow)

These models can be extended to other sports by changing the distributions, e.g.

Normal distribution for American Football

Negative binomial for baseball

Page 95: An introduction to football modelling at Smartodds Oxford SIAM

References

M.J. Maher, 1982. Modelling association football scores. Statist. Neerland., 36, 109-118

M. Dixon and S.G. Coles, 1997. Modelling Association Football Scores and Inefficiencies in the Football Betting Market. Applied Statistics, 46(2), 265-280

M. Dixon and M. Robinson, 1998. A birth process model for association football matches. JRSS D, 47(3), 523-538

Page 98: An introduction to football modelling at Smartodds Oxford SIAM

Interested?

If you are interested in sports modelling and possess the following skills:

Postgraduate qualification (at least MMath / MSc, PhD preferred) in mathematics, statistics or another subject with considerable mathematical content

Experience in developing and implementing mathematical / statistical models

Experience of computer programming (preferably in C++, C, R or Python)

Enthusiasm, self-motivation and the ability to work under pressure to strict deadlines

Then email us at [email protected]

For more information see our website: http://www.smartodds.co.uk
