
Modeling transportation networks: Route choice models

Michel Bierlaire

TRANSP-OR Transport and Mobility Laboratory

Ecole Polytechnique Federale de Lausanne

transp-or.epfl.ch


Introduction

Analysis of the supply and demand interactions

• supply: infrastructure, vehicles

• demand: travellers' decisions
  • origin and destination
  • transportation mode
  • departure time
  • route, itinerary

• Underlying mathematical structure: the network

In this lecture, we focus on one of the most complicated issues


Route choice model

Given

• a mono- or multi-modal transportation network (nodes, links, origin, destination)

• an origin-destination pair

• link and path attributes

identify the route that a traveler would select.


Choice model

Assumptions about

1. the decision-maker: n

2. the alternatives
   • Choice set $C_n$
   • $p \in C_n$ is composed of a list of links (i, j)

3. the attributes
   • link-additive: length, travel time, etc.

$$x_{kp} = \sum_{(i,j) \in p} x_{k(i,j)}$$

   • non link-additive: scenic path, usual path, etc.

4. the decision rules: $\Pr(p \mid C_n)$


Shortest path

Decision-makers: all identical

Alternatives

• all paths between O and D
• $C_n = U \;\; \forall n$
• U can be unbounded when loops are present

Attributes: one link-additive generalized cost

$$c_p = \sum_{(i,j) \in p} c_{(i,j)}$$

• traveler independent
• link cost may be negative
• no loop with negative cost must be present, so that $c_p > -\infty$ for all p


Shortest path

Decision rule: the path with the minimum cost is selected

$$\Pr(p) = \begin{cases} K & \text{if } c_p \le c_q \;\; \forall q \in U \\ 0 & \text{otherwise} \end{cases}$$

• K is a normalizing constant so that $\sum_{p \in U} \Pr(p) = 1$.

• K = 1/S, where S is the number of shortest paths between O and D.

• Some methods select one shortest path $p^*$

$$\Pr(p) = \begin{cases} 1 & \text{if } p = p^* \\ 0 & \text{otherwise} \end{cases}$$
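As an illustration of the rule that selects a single shortest path $p^*$, here is a minimal Python sketch of Dijkstra's algorithm. It assumes nonnegative link costs (Dijkstra does not handle the negative costs mentioned earlier), and the adjacency-list format and the function name `shortest_path` are hypothetical choices made for this example.

```python
import heapq

def shortest_path(links, origin, destination):
    """Return (cost, path) of one minimum-cost path, assuming nonnegative link costs.

    links: dict mapping a node to a list of (successor, cost) pairs (assumed format).
    """
    best = {origin: 0.0}      # best known cost from the origin to each node
    pred = {}                 # predecessor of each node on its best path
    queue = [(0.0, origin)]
    done = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in done:
            continue
        done.add(node)
        if node == destination:
            break
        for succ, c in links.get(node, []):
            new_cost = cost + c
            if new_cost < best.get(succ, float("inf")):
                best[succ] = new_cost
                pred[succ] = node
                heapq.heappush(queue, (new_cost, succ))
    if destination not in best:
        return float("inf"), None
    path = [destination]
    while path[-1] != origin:
        path.append(pred[path[-1]])
    return best[destination], path[::-1]

# Small hypothetical network with origin "O" and destination "D"
network = {"O": [("A", 1.0), ("B", 4.0)],
           "A": [("B", 2.0), ("D", 6.0)],
           "B": [("D", 1.0)]}
cost, path = shortest_path(network, "O", "D")
```

In this toy network the selected path is O, A, B, D with cost 4; every other path receives probability 0 under the deterministic rule.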


Shortest path

Advantages:

• well defined

• no need for behavioral data

• efficient algorithms (Dijkstra)

Disadvantages

• behaviorally unrealistic

• instability with respect to variations in cost

• calibration on real data is very difficult
  • inverse shortest path problem is NP complete
  • Burton, Pulleyblank and Toint (1997) The Inverse Shortest Paths Problem With Upper Bounds on Shortest Paths Costs. Network Optimization, Series: Lecture Notes in Economics and Mathematical Systems, Vol. 450, P. M. Pardalos, D. W. Hearn and W. W. Hager (Eds.), pp. 156-171, Springer


Dial’s approach

Dial R. B. (1971) A probabilistic multipath traffic assignment model which obviates path enumeration. Transportation Research Vol. 5, pp. 83-111.

Decision-makers: all identical

Alternatives: efficient paths between O and D

Attributes: link-additive generalized cost

Decision rule: multinomial logit model


Dial’s approach

• Def 1: A path is efficient if every link in it has
  • its initial node closer to the origin than its final node, and
  • its final node closer to the destination than its initial node.

• Def 2: A path is efficient if every link in it has its initial node closer to the origin than its final node.

Efficient path: a path that does not backtrack.
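Def 2 can be checked link by link once the shortest distance from the origin to every node is known. The sketch below assumes those distances are already available in a dictionary; the name `is_efficient` and the distance values are hypothetical.

```python
def is_efficient(path, dist_from_origin):
    """Def 2: every link (i, j) of the path moves strictly away from the origin."""
    return all(dist_from_origin[i] < dist_from_origin[j]
               for i, j in zip(path, path[1:]))

# Assumed shortest distances from the origin "O" to each node
dist = {"O": 0.0, "A": 1.0, "B": 3.0, "D": 4.0}

forward = is_efficient(["O", "A", "B", "D"], dist)      # never backtracks
backtrack = is_efficient(["O", "B", "A", "D"], dist)    # link (B, A) moves back toward O
```

Def 1 would add the symmetric test on distances to the destination.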


Dial’s approach

• Choice set Cn = set of efficient paths (finite, no loop)

• No explicit enumeration

• Every efficient path has a non zero probability to be selected

• Probability to select a path

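$$\Pr(p) = \frac{\exp\left(\theta\left(\sum_{(i,j) \in p^*} c_{(i,j)} - \sum_{(i,j) \in p} c_{(i,j)}\right)\right)}{\sum_{q \in C_n} \exp\left(\theta\left(\sum_{(i,j) \in p^*} c_{(i,j)} - \sum_{(i,j) \in q} c_{(i,j)}\right)\right)}$$

where $p^*$ is the shortest path and $\theta$ is a parameter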


Dial’s approach

Note: the length of the shortest path is constant across Cn

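$$\Pr(p) = \frac{\exp\left(-\theta \sum_{(i,j) \in p} c_{(i,j)}\right)}{\sum_{q \in C_n} \exp\left(-\theta \sum_{(i,j) \in q} c_{(i,j)}\right)} = \frac{e^{-\theta c_p}}{\sum_{q \in C_n} e^{-\theta c_q}}$$

Multinomial logit model with

$$V_p = -\theta c_p$$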
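The logit formula can be evaluated directly when the path costs are listed explicitly. The sketch below does so in Python. Note that it enumerates paths, which Dial's algorithm is designed to avoid, so it only illustrates the probability formula; the function name and the cost values are hypothetical.

```python
import math

def logit_path_probabilities(costs, theta):
    """Pr(p) = exp(-theta * c_p) / sum_q exp(-theta * c_q)."""
    # Subtracting the shortest-path cost mirrors the formulation with p*:
    # it is constant across paths, so the probabilities are unchanged.
    c_min = min(costs)
    weights = [math.exp(-theta * (c - c_min)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

probs = logit_path_probabilities([10.0, 10.0, 12.0], theta=0.5)
shifted = logit_path_probabilities([15.0, 15.0, 17.0], theta=0.5)
```

Two paths with equal cost receive equal probability, the costlier path a smaller one, and adding a constant to all costs leaves the result unchanged.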


Dial’s approach

Advantages:

• probabilistic model, more stable

• calibration parameter θ

• avoid path enumeration

• designed for traffic assignment

Disadvantages:

• MNL assumes independence among alternatives

• efficient paths are mathematically convenient but not behaviorally motivated


Dial’s approach

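[Figure: three paths between O and D. Path 1 has cost c. Paths 2 and 3 share a common section of cost c − δ, then each uses its own link of cost δ.]

$$\Pr(1) = \frac{e^{-\theta c_1}}{\sum_{q \in C} e^{-\theta c_q}} = \frac{e^{-\theta c}}{3 e^{-\theta c}} = \frac{1}{3} \quad \text{for any } c, \delta, \theta$$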


Path Size Logit

• With MNL, the utility of overlapping paths is overestimated

• When δ is small (large overlap), there is some sort of “double counting”

• Idea: include a correction

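$$V_p = -\theta c_p + \beta \ln PS_p$$

where

$$PS_p = \sum_{(i,j) \in p} \frac{c_{(i,j)}}{c_p} \, \frac{1}{\sum_{q \in C} \delta^q_{i,j}}$$

and

$$\delta^q_{i,j} = \begin{cases} 1 & \text{if link } (i,j) \text{ belongs to path } q \\ 0 & \text{otherwise} \end{cases}$$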


Path Size Logit

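[Figure: three paths between O and D. Path 1 has cost c. Paths 2 and 3 share a common section of cost c − δ, then each uses its own link of cost δ.]

$$PS_1 = \frac{c}{c} \cdot \frac{1}{1} = 1$$

$$PS_2 = PS_3 = \frac{c - \delta}{c} \cdot \frac{1}{2} + \frac{\delta}{c} \cdot \frac{1}{1} = \frac{1}{2} + \frac{\delta}{2c}$$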
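The path size values of this example can be reproduced numerically. The sketch below computes $PS_p$ from a list of links per path and plugs it into the corrected utility. The link names, the values c = 1 and δ = 0.5, and the choice θ = β = 1 are assumptions made for illustration, and each path is assumed to use a link at most once.

```python
import math

def path_sizes(paths, link_cost):
    """PS_p = sum over links of p of (c_link / c_p) / (number of paths using the link)."""
    usage = {}
    for links in paths.values():
        for link in links:
            usage[link] = usage.get(link, 0) + 1
    sizes = {}
    for name, links in paths.items():
        c_p = sum(link_cost[l] for l in links)
        sizes[name] = sum(link_cost[l] / c_p / usage[l] for l in links)
    return sizes

def path_size_logit(paths, link_cost, theta, beta):
    """Logit probabilities with V_p = -theta * c_p + beta * ln(PS_p)."""
    sizes = path_sizes(paths, link_cost)
    v = {p: -theta * sum(link_cost[l] for l in links) + beta * math.log(sizes[p])
         for p, links in paths.items()}
    v_max = max(v.values())
    w = {p: math.exp(x - v_max) for p, x in v.items()}
    total = sum(w.values())
    return {p: w[p] / total for p in w}

c, delta = 1.0, 0.5
link_cost = {"direct": c, "shared": c - delta, "a": delta, "b": delta}
paths = {1: ["direct"], 2: ["shared", "a"], 3: ["shared", "b"]}
ps = path_sizes(paths, link_cost)
probs = path_size_logit(paths, link_cost, theta=1.0, beta=1.0)
```

With these values $PS_1 = 1$ and $PS_2 = PS_3 = 0.75$, matching $1/2 + \delta/(2c)$, and the probability of path 1 rises from 1/3 to 0.4.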


Path Size Logit

[Figure: P(1) as a function of delta, for beta = 0.721427]


Path Size Logit

[Figure: P(1) as a function of delta, for beta = 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1.0]


Path Size Logit

Advantages:

• MNL formulation: simple

• Easy to compute

• Exploits the network topology

• Practical

Disadvantages:

• Derived from the theory on nested logit

• Several formulations have been proposed

• Correlated with observed and unobserved attributes

• May give biased estimates


Path Size Logit: readings

• Cascetta, E., Nuzzolo, A., Russo, F., Vitetta, A. 1996. A modified logit route choice model overcoming path overlapping problems. Specification and some calibration results for interurban networks. In Lesort, J.B. (Ed.), Proceedings of the 13th International Symposium on Transportation and Traffic Theory, Lyon, France.

• Ramming, M., 2001. Network Knowledge and Route Choice, PhD thesis, Massachusetts Institute of Technology.

• Ben-Akiva, M., and Bierlaire, M. (2003). Discrete choice models with applications to departure time and route choice. In Hall, R. (ed) Handbook of Transportation Science, 2nd edition pp. 7-38. Kluwer.


Path Size Logit: readings

• Hoogendoorn-Lanser, S., van Nes, R. and Bovy, P. (2005) Path Size Modeling in Multimodal Route Choice Analysis. Transportation Research Record vol. 1921 pp. 27-34

• Frejinger, E., and Bierlaire, M. (2007). Capturing correlation with subnetworks in route choice models, Transportation Research Part B: Methodological 41(3):363-378. doi:10.1016/j.trb.2006.06.003


Random utility models

Decision-makers: with characteristics
• value of time
• access to information
• trip purpose

Alternatives: explicit set of paths

Attributes: both link-additive and path specific

Decision rule: RUM designed to capture correlations

Note: MNL is a random utility model, but the independence assumption is inappropriate. We must relax it.



Relax the independence assumption

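$$\begin{pmatrix} U_{1n} \\ \vdots \\ U_{Jn} \end{pmatrix} = \begin{pmatrix} V_{1n} \\ \vdots \\ V_{Jn} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1n} \\ \vdots \\ \varepsilon_{Jn} \end{pmatrix}$$

that is, $U_n = V_n + \varepsilon_n$, and $\varepsilon_n$ is a vector of random variables.

Assumption about the random term: multivariate distribution

In the rest, we omit n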



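A multivariate random variable ε is represented by a density function

$$f(\varepsilon_1, \ldots, \varepsilon_J)$$

and

$$P(\varepsilon \le x) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_J} f(\varepsilon) \, d\varepsilon_J \ldots d\varepsilon_1$$

where $x \in \mathbb{R}^J$ is a J × 1 vector of constants.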


Multinomial probit model

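$$U = \begin{pmatrix} U_1 \\ \vdots \\ U_J \end{pmatrix} = \begin{pmatrix} V_1 + \varepsilon_1 \\ \vdots \\ V_J + \varepsilon_J \end{pmatrix} = V + \varepsilon$$

Probability to choose path p

$$\Pr(p \mid C) = \Pr(U_j - U_p \le 0 \;\; \forall j \in C)$$

Let $\Delta_i$ be a $(J-1) \times J$ matrix obtained from the $(J-1) \times (J-1)$ identity matrix, where a column of −1 has been added at index i. For J = 3, we have

$$\Delta_1 = \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \qquad \Delta_2 = \begin{pmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \end{pmatrix} \qquad \Delta_3 = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}$$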



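$$U = \begin{pmatrix} U_1 \\ U_2 \\ U_3 \end{pmatrix} \qquad \Delta_2 = \begin{pmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \end{pmatrix}$$

Therefore

$$\Delta_2 U = \begin{pmatrix} U_1 - U_2 \\ U_3 - U_2 \end{pmatrix}$$

In general, we obtain

$$\Pr(p \mid C) = \Pr(U_j - U_p \le 0 \;\; \forall j \in C) = \Pr(\Delta_p U \le 0)$$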
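The construction of the difference matrix is easy to mechanize. The sketch below builds it with plain Python lists and applies it to a utility vector; the function names and the utility values are hypothetical.

```python
def delta_matrix(J, p):
    """(J-1) x J matrix: identity of size J-1 with a column of -1 inserted at index p (1-based)."""
    rows = []
    for r in range(J - 1):
        row = [1 if k == r else 0 for k in range(J - 1)]
        row.insert(p - 1, -1)
        rows.append(row)
    return rows

def apply(matrix, vector):
    """Matrix-vector product on plain lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

U = [2.0, 5.0, 3.0]                           # assumed utilities U1, U2, U3
differences = apply(delta_matrix(3, 2), U)    # [U1 - U2, U3 - U2]
chosen = all(d <= 0 for d in differences)     # alternative 2 has the largest utility
```

Alternative p is chosen exactly when every entry of the product is nonpositive.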



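Assume that $U \sim N(V, \Sigma)$, where $\Sigma \in \mathbb{R}^{J \times J}$ is the variance-covariance matrix.

$$\Delta_p U \sim N(\Delta_p V, \Delta_p \Sigma \Delta_p^T)$$

$$\Pr(p \mid C) = \int_{\varepsilon_J = -\infty}^{0} \cdots \int_{\varepsilon_{p+1} = -\infty}^{0} \int_{\varepsilon_{p-1} = -\infty}^{0} \cdots \int_{\varepsilon_1 = -\infty}^{0} f_p(\varepsilon) \, d\varepsilon_1 \ldots d\varepsilon_{p-1} \, d\varepsilon_{p+1} \ldots d\varepsilon_J$$

$$f_p(\varepsilon) = (2\pi)^{-\frac{J-1}{2}} \left| \Delta_p \Sigma \Delta_p^T \right|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} (\varepsilon - \Delta_p V)^T (\Delta_p \Sigma \Delta_p^T)^{-1} (\varepsilon - \Delta_p V) \right)$$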


Multinomial probit: issues

Variance-covariance matrix

• must be independent of the OD

• must capture the physical overlap of paths

• contains J(J + 1)/2 unknown parameters

• Idea: use a structured variance-covariance matrix, with few parameters

Yai, Iwakura and Morichi (1997) Multinomial probit with structured covariance for route choice behavior. Transportation Research Part B 31(3), pp. 195-207.


Structured variance-covariance

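$$\varepsilon = \varepsilon^\ell + \varepsilon^0$$

where $\varepsilon^0$ is a vector of independent r.v.

It is assumed that

$$\operatorname{var}(\varepsilon^\ell_r) = c_r \sigma^2$$

where

• $c_r$ is the length (or travel time) of path r

• $\sigma$ is an unknown parameter to be estimated

The covariance is computed as follows:

$$\operatorname{cov}(\varepsilon^\ell_r, \varepsilon^\ell_q) = E[\varepsilon^\ell_r \varepsilon^\ell_q] - E[\varepsilon^\ell_r] \, E[\varepsilon^\ell_q] = E[\varepsilon^\ell_r \varepsilon^\ell_q]$$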



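Let

$$\varepsilon^\ell_r = \varepsilon_{r \cap q} + \varepsilon_{r \setminus q}$$

where

• $\varepsilon_{r \cap q}$ corresponds to sections of r overlapping with q

• $\varepsilon_{r \setminus q}$ corresponds to sections of r not overlapping with q

$$E[\varepsilon^\ell_r \varepsilon^\ell_q] = E\left[ (\varepsilon_{r \cap q} + \varepsilon_{r \setminus q})(\varepsilon_{r \cap q} + \varepsilon_{q \setminus r}) \right] = E[\varepsilon^2_{r \cap q}] = \operatorname{var}(\varepsilon_{r \cap q})$$

As before, we define

$$\operatorname{var}(\varepsilon_{r \cap q}) = c_{rq} \sigma^2$$

where $c_{rq}$ is the length (or travel time) of the section common to r and q.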



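$$\varepsilon = \varepsilon^\ell + \varepsilon^0, \qquad \Sigma = \Sigma^\ell + \Sigma^0$$

with

$$\Sigma^\ell = \sigma^2 \begin{pmatrix} c_1 & c_{12} & \cdots & c_{1J} \\ c_{12} & c_2 & \cdots & c_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ c_{1J} & c_{2J} & \cdots & c_J \end{pmatrix}$$

and

$$\Sigma^0 = \sigma_0^2 I$$

Note: only the ratio $\sigma / \sigma_0$ is identified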
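The structured matrix can be assembled from the link composition of each path. The sketch below does so assuming that the overlap length $c_{rq}$ is the total length of the links shared by paths r and q; the data layout, link names and parameter values are hypothetical.

```python
def structured_covariance(paths, link_length, sigma, sigma0):
    """Sigma = sigma^2 * [c_rq] + sigma0^2 * I, with c_rq the length shared by r and q."""
    J = len(paths)
    cov = [[0.0] * J for _ in range(J)]
    for r in range(J):
        for q in range(J):
            shared = set(paths[r]) & set(paths[q])
            # For r == q the shared length is the full path length c_r
            cov[r][q] = sigma ** 2 * sum(link_length[l] for l in shared)
        cov[r][r] += sigma0 ** 2
    return cov

# Assumed example: path 1 is separate, paths 2 and 3 share a section of length 6
link_length = {"direct": 10.0, "shared": 6.0, "a": 4.0, "b": 4.0}
paths = [["direct"], ["shared", "a"], ["shared", "b"]]
Sigma = structured_covariance(paths, link_length, sigma=1.0, sigma0=1.0)
```

Only the two parameters `sigma` and `sigma0` appear, whatever the number of paths, which is the point of the structured specification.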


Multinomial probit: issues

Estimation

• Integral has no closed form

• Numerical integration is cumbersome and almost impossible, except when J is small

• Most efficient estimation method: Bolduc (1999) A practical technique to estimate multinomial probit models in transportation. Transportation Research Part B 33, pp. 63-79.

• Bolduc estimates a model with 9 alternatives

• Yai et al. estimate a model with J = 3.

• In the route choice context, J can be much larger



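• If the CDF $F(\varepsilon_1, \ldots, \varepsilon_J)$ of the distribution is known

$$f(\varepsilon_1, \ldots, \varepsilon_J) = \frac{\partial^J F}{\partial \varepsilon_1 \cdots \partial \varepsilon_J}(\varepsilon_1, \ldots, \varepsilon_J)$$

• The choice probability is

$$\begin{aligned} \Pr(p) &= \Pr(V_1 + \varepsilon_1 \le V_p + \varepsilon_p, \ldots, V_J + \varepsilon_J \le V_p + \varepsilon_p) \\ &= P(\varepsilon_1 \le V_p + \varepsilon_p - V_1, \ldots, \varepsilon_J \le V_p + \varepsilon_p - V_J) \\ &= \int_{\varepsilon_p = -\infty}^{\infty} F_p(V_p + \varepsilon_p - V_1, \ldots, \varepsilon_p, \ldots, V_p + \varepsilon_p - V_J) \, d\varepsilon_p \end{aligned}$$

where $F_p$ denotes the partial derivative of F with respect to $\varepsilon_p$.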


MEV models

Family of models proposed by McFadden (1978) (called GEV)

Idea: a model is generated by a function

$$G : \mathbb{R}^J \to \mathbb{R}$$

From G, we can build

• The cumulative distribution function (CDF)

• The probability model

• The expected maximum utility

McFadden, D. (1978). Modelling the Choice of Residential Location. In A. Karlquist et al. (eds.), Spatial Interaction Theory and Residential Location, North-Holland, Amsterdam, pp. 75-96.


MEV models

1. G is homogeneous of degree µ > 0, that is

$$G(\alpha y) = \alpha^\mu G(y)$$

2. $\lim_{y_i \to +\infty} G(y_1, \ldots, y_i, \ldots, y_J) = +\infty$, for each i = 1, . . . , J,

3. the kth partial derivative with respect to k distinct $y_i$ is non negative if k is odd and non positive if k is even, i.e., for all (distinct) indices $i_1, \ldots, i_k \in \{1, \ldots, J\}$, we have

$$(-1)^k \frac{\partial^k G}{\partial y_{i_1} \ldots \partial y_{i_k}}(y) \le 0, \quad \forall y \in \mathbb{R}^J_+.$$


MEV models

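• CDF: $F(\varepsilon_1, \ldots, \varepsilon_J) = e^{-G(e^{-\varepsilon_1}, \ldots, e^{-\varepsilon_J})}$

• Probability:

$$P(i \mid C) = \frac{e^{V_i + \ln G_i(e^{V_1}, \ldots, e^{V_J})}}{\sum_{j \in C} e^{V_j + \ln G_j(e^{V_1}, \ldots, e^{V_J})}} \quad \text{with } G_i = \frac{\partial G}{\partial x_i}.$$

This is a closed form

• Expected maximum utility: $V_C = \dfrac{\ln G(\ldots) + \gamma}{\mu}$ where γ is Euler's constant.

• Note: $P(i \mid C) = \dfrac{\partial V_C}{\partial V_i}$.

Euler's constant

$$\gamma = -\int_0^{+\infty} e^{-x} \ln x \, dx = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln n \right) \approx 0.57722$$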
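To see how a generating function G produces a choice model, the sketch below evaluates the MEV probability formula for two choices of G. It approximates the partial derivatives $G_i$ by central finite differences, which is an illustrative shortcut and not how such models are estimated in practice. The function names, the nesting of alternatives 1 and 2, and the values µ = 1 and µ_m = 2 are assumptions.

```python
import math

def mev_probabilities(G, V, h=1e-6):
    """P(i) proportional to exp(V_i + ln G_i(e^V)), i.e. to y_i * G_i(y) with y = e^V."""
    y = [math.exp(v) for v in V]
    terms = []
    for i in range(len(y)):
        step = h * y[i]
        up, down = list(y), list(y)
        up[i] += step
        down[i] -= step
        G_i = (G(up) - G(down)) / (2.0 * step)   # central finite difference
        terms.append(y[i] * G_i)
    total = sum(terms)
    return [t / total for t in terms]

def G_logit(y, mu=1.0):
    # G(y) = sum_i y_i^mu generates the multinomial logit model
    return sum(v ** mu for v in y)

def G_nested(y, mu=1.0, mu_m=2.0):
    # Assumed nesting: alternatives 1 and 2 share a nest, alternative 3 is alone
    return (y[0] ** mu_m + y[1] ** mu_m) ** (mu / mu_m) + y[2] ** mu

p_logit = mev_probabilities(G_logit, [0.0, 0.0, 0.0])
p_nested = mev_probabilities(G_nested, [0.0, 0.0, 0.0])
```

With equal utilities the logit G gives 1/3 to each alternative, while the nested G gives the isolated alternative about 0.41, because the two nested alternatives are treated as correlated.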


Network representation of MEV models

Automatic way of defining G based on a network representation

Let (V, E) be a network with link parameters $\alpha_{(i,j)} \ge 0$

Assumptions:

1. No circuit.

2. One node without predecessor: root.

3. J nodes without successor: alternatives.

4. For each node $v_i$, there exists at least one path from the root to $v_i$ such that $\prod_{k=1}^{P} \alpha_{(i_{k-1}, i_k)} > 0$.


Network MEV

For each node vi, we define

• a set of indices $I_i \subseteq \{1, \ldots, J\}$ of $J_i$ relevant alternatives,

• a homogeneous function $G_i : \mathbb{R}^{J_i} \to \mathbb{R}$, and

• a parameter $\mu_i$.

Recursive definition of $I_i$:

• $I_i = \{i\}$ for alternatives,

• $I_i = \bigcup_{j \in \operatorname{succ}(i)} I_j$ for all other nodes.


Network MEV

Recursive definition of Gi:

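For alternatives:

$$G_i : \mathbb{R} \to \mathbb{R} : G_i(x_i) = x_i^{\mu_i}, \quad i = 1, \ldots, J$$

For all others:

$$G_i : \mathbb{R}^{J_i} \to \mathbb{R} : G_i(x) = \sum_{j \in \operatorname{succ}(i)} \alpha_{(i,j)} \, G_j(x)^{\mu_i / \mu_j}$$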
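The recursion translates almost literally into code. The sketch below evaluates G at the root of a small network with two nests and three alternatives, and compares the result with the corresponding cross-nested closed form. The node numbering, the dictionary layout and all parameter values are hypothetical.

```python
def network_G(node, y, succ, alpha, mu):
    """Evaluate G at `node` with the recursion on successors.

    Alternatives are the nodes without successors; y maps each alternative to its value.
    """
    children = succ.get(node, [])
    if not children:
        return y[node] ** mu[node]
    return sum(alpha[(node, j)] * network_G(j, y, succ, alpha, mu) ** (mu[node] / mu[j])
               for j in children)

# Assumed network: root 0, nests 4 and 5, alternatives 1, 2 and 3
succ = {0: [4, 5], 4: [1, 2, 3], 5: [1, 2, 3]}
alpha = {(0, 4): 1.0, (0, 5): 1.0,
         (4, 1): 0.6, (4, 2): 0.3, (4, 3): 0.1,
         (5, 1): 0.2, (5, 2): 0.3, (5, 3): 0.5}
mu = {0: 1.0, 4: 2.0, 5: 3.0, 1: 4.0, 2: 4.0, 3: 4.0}
y = {1: 1.5, 2: 0.7, 3: 2.0}

value = network_G(0, y, succ, alpha, mu)

# Closed form: sum over nests m of alpha_0m * (sum_j alpha_mj * y_j^mu_m)^(mu_0 / mu_m)
closed_form = sum(
    alpha[(0, m)] * sum(alpha[(m, j)] * y[j] ** mu[m] for j in (1, 2, 3)) ** (mu[0] / mu[m])
    for m in (4, 5))
```

The parameters attached to the alternatives cancel out in the recursion, so the two values agree up to rounding.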


Network MEV

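Example: Cross-Nested Logit

$$G = \sum_m \left( \sum_{j \in C} \alpha_{jm} y_j^{\mu_m} \right)^{\mu / \mu_m}$$

[Figure: network with a root node, two nest nodes (4 and 5) and three alternative nodes (1, 2 and 3)]

Functions attached to the nodes:

• alternatives: $y_1^{\mu_1}$, $y_2^{\mu_2}$, $y_3^{\mu_3}$

• nest 5: $\alpha_{51} y_1^{\mu_5} + \alpha_{52} y_2^{\mu_5} + \alpha_{53} y_3^{\mu_5}$

• root: $\sum_{i=4,5} \alpha_{0i} \left( \alpha_{i1} y_1^{\mu_i} + \alpha_{i2} y_2^{\mu_i} + \alpha_{i3} y_3^{\mu_i} \right)^{\mu_0 / \mu_i}$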


Network MEV

• Daly, A., and Bierlaire, M. (2006). A General and Operational Representation of Generalised Extreme Value Models, Transportation Research Part B: Methodological 40(4):285-305. doi:10.1016/j.trb.2005.03.003

• Daly, A. (2001). Recursive nested EV model. ITS Working Paper 559, Institute for Transport Studies, University of Leeds.

• Bierlaire, M. (2002). The network GEV model. In: Proceedings of the 2nd Swiss Transportation Research Conference, Ascona, Switzerland.


Link Nested Logit Model

Vovsha, P. and Bekhor, S., 1998 Link-nested logit model of route choice, Overcoming route overlapping problem, Transportation Research Record 1645, pp. 133-142.

Idea:

• Use the cross-nested model specification

• Alternatives = paths

• Nests = links


Link Nested Logit Model

[Figure: network with links ℓ1, ℓ2, ℓ3 and ℓ4, and the corresponding structure with nests ℓ1, ℓ2, ℓ3, ℓ4 and paths p1, p23, p24]



• In this example, nests ℓ1, ℓ3 and ℓ4 contain a single alternative

• The model collapses to a nested logit model



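[Figure: network with links ℓ1, ℓ2, ℓ3, ℓ4 and ℓ5, and the corresponding structure with nests ℓ1, ..., ℓ5 and paths p123, p423, p125, p425]

[Figure: the same nesting structure with root parameter µ, nest parameters µ1, ..., µ5 and allocation parameters α11, α13, α21, α22, α23, α24, α31, α32, α42, α44, α53, α54]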



• 5 + 12 = 17 parameters to capture the correlation

• Variance covariance matrix: Σ ∈ R4×4

• It is symmetric, so there are J(J + 1)/2 = 10 entries

• For a given correlation matrix, the identification of the associated parameters is not straightforward

• Abbe, E., Bierlaire, M., and Toledo, T. (2007). Normalization and correlation of cross-nested logit models, Transportation Research Part B: Methodological 41(7):795-808. doi:10.1016/j.trb.2006.11.006



Advantages

• Closed form probability

• Rich correlation structure

Disadvantages

• High number of nests

• Not all correlation structures can be reproduced

• Identification of the associated parameters is not straightforward


Factor Analytic Specification

• Links: Bekhor, S., Ben-Akiva, M. and Ramming M.S. (2002). Adaptation of Logit Kernel to Route Choice Situation. Transportation Research Record, 1805, 78-85.

• Subnetworks: Frejinger, E., and Bierlaire, M. (2007). Capturing correlation with subnetworks in route choice models, Transportation Research Part B: Methodological 41(3):363-378. doi:10.1016/j.trb.2006.06.003


Subnetwork component

Sequence of links corresponding to a part of the network which can be easily labeled, and is behaviorally meaningful in actual route descriptions

• Champs-Elysées in Paris

• Fifth Avenue in New York

• Mass Pike in Boston

• City center in Lausanne

Paths sharing a subnetwork component are assumed to be correlated even if they are not physically overlapping


Subnetworks - Example

[Figure: network from O to D with subnetwork components Sa and Sb, and Paths 1, 2 and 3]


Subnetworks - Methodology

• Factor analytic specification of an error component model (based on model presented in Bekhor et al., 2002)

$$U_p = V_p + \sum_s \sqrt{c_{ps}} \, \sigma_s \zeta_s + \nu_p$$

• $c_{ps}$ is the length by which path p overlaps with subnetwork component s

• σs is an unknown parameter

• ζs ∼ N(0, 1)

• νp i.i.d. Extreme Value distributed


Subnetworks - Example

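[Figure: network from O to D with subnetwork components Sa and Sb, and Paths 1, 2 and 3]

$$U_1 = \beta^T X_1 + \sqrt{l_{1a}} \, \sigma_a \zeta_a + \sqrt{l_{1b}} \, \sigma_b \zeta_b + \nu_1$$

$$U_2 = \beta^T X_2 + \sqrt{l_{2a}} \, \sigma_a \zeta_a + \nu_2$$

$$U_3 = \beta^T X_3 + \sqrt{l_{3b}} \, \sigma_b \zeta_b + \nu_3$$

$$\Sigma = \begin{pmatrix} l_{1a} \sigma_a^2 + l_{1b} \sigma_b^2 & \sqrt{l_{1a}} \sqrt{l_{2a}} \, \sigma_a^2 & \sqrt{l_{1b}} \sqrt{l_{3b}} \, \sigma_b^2 \\ \sqrt{l_{1a}} \sqrt{l_{2a}} \, \sigma_a^2 & l_{2a} \sigma_a^2 & 0 \\ \sqrt{l_{3b}} \sqrt{l_{1b}} \, \sigma_b^2 & 0 & l_{3b} \sigma_b^2 \end{pmatrix}$$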


Mixture of MNL

In statistics, a mixture density is a pdf which is a convex linear combination of other pdf's.

If f(ε, θ) is a pdf, and if w(θ) is a nonnegative function such that

$$\int_\theta w(\theta) \, d\theta = 1$$

then

$$g(\varepsilon) = \int_\theta w(\theta) f(\varepsilon, \theta) \, d\theta$$

is also a pdf. We say that g is a mixture of f.

If f is the pdf of a MNL model, g is a mixture of MNL.


Mixture of MNL

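$$U_p = V_p + \sum_s \sqrt{c_{ps}} \, \sigma_s \zeta_s + \nu_p$$

If ζ is given,

$$\Pr(p \mid \zeta) = \frac{e^{V_p + \sum_s \sqrt{c_{ps}} \, \sigma_s \zeta_s}}{\sum_q e^{V_q + \sum_s \sqrt{c_{qs}} \, \sigma_s \zeta_s}}$$

ζ is distributed N(0, I)

$$\Pr(p) = \int_\zeta \frac{e^{V_p + \sum_s \sqrt{c_{ps}} \, \sigma_s \zeta_s}}{\sum_q e^{V_q + \sum_s \sqrt{c_{qs}} \, \sigma_s \zeta_s}} \, \phi(\zeta) \, d\zeta$$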


Mixture of MNL

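$$\Pr(p) = \int_\zeta \frac{e^{V_p + \sum_s \sqrt{c_{ps}} \, \sigma_s \zeta_s}}{\sum_q e^{V_q + \sum_s \sqrt{c_{qs}} \, \sigma_s \zeta_s}} \, \phi(\zeta) \, d\zeta$$

Not a closed form. Simulated Maximum Likelihood is to be used

• Train, K. (2003) Discrete Choice Methods with Simulation, Cambridge University Press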
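The integral can be approximated by averaging the conditional logit probabilities over random draws of ζ, which is the building block of simulated maximum likelihood. The sketch below shows only this simulation step, not the estimation itself. The function name, the number of draws, the overlap lengths and the values of σ are assumptions.

```python
import math
import random

def simulated_probabilities(V, loadings, draws=2000, seed=0):
    """Average the MNL probabilities over draws of zeta ~ N(0, I).

    loadings[p][s] stands for sqrt(c_ps) * sigma_s.
    """
    rng = random.Random(seed)
    n_paths, n_comp = len(V), len(loadings[0])
    avg = [0.0] * n_paths
    for _ in range(draws):
        zeta = [rng.gauss(0.0, 1.0) for _ in range(n_comp)]
        util = [V[p] + sum(loadings[p][s] * zeta[s] for s in range(n_comp))
                for p in range(n_paths)]
        u_max = max(util)                      # subtract the max for numerical stability
        w = [math.exp(u - u_max) for u in util]
        total = sum(w)
        for p in range(n_paths):
            avg[p] += w[p] / total / draws
    return avg

# Assumed setting: path 1 uses components a and b, path 2 only a, path 3 only b
V = [0.0, 0.0, 0.0]
sigma_a, sigma_b = 1.0, 1.0
loadings = [[math.sqrt(2.0) * sigma_a, math.sqrt(1.0) * sigma_b],
            [math.sqrt(2.0) * sigma_a, 0.0],
            [0.0, math.sqrt(1.0) * sigma_b]]
probs = simulated_probabilities(V, loadings)
mnl = simulated_probabilities(V, [[0.0, 0.0]] * 3, draws=10)
```

When all loadings are zero the error components vanish and the simulation returns the plain MNL probabilities.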


Subnetworks

Advantages

• Rich correlation structure

• Flexibility between complexity and realism

Disadvantages

• Non closed form


Summary

• Shortest path

• Dial’s efficient paths

• Path Size Logit

• Multinomial probit

• Link Nested Logit

• Subnetworks


Additional Reading

Prashker, J. N. and Bekhor, S. (2004) Route Choice Models Used in the Stochastic User Equilibrium Problem: A Review, Transport Reviews, 24:4, 437-463

de Palma, A. and Picard, N. (2005) Route choice decision under travel time uncertainty. Transportation Research Part A: Policy and Practice, 39(4), pp. 295–324

Nielsen, O. A., Daly, A. and Frederiksen R. D. (2002) A Stochastic Route Choice Model for Car Travellers in the Copenhagen Region. Networks and Spatial Economics 2(4) pp. 327–346 doi:10.1023/A:1020895427428


Additional Reading

Han, B. (2001) Analyzing car ownership and route choices using discrete choice models, PhD thesis, Department of infrastructure and planning, Royal Institute of Technology, Stockholm, TRITA-IP FR 01-95.

Hoogendoorn-Lanser, S. (2005) Modelling travel behaviour in multi-modal networks, PhD thesis, Delft University of Technology.

Hoogendoorn-Lanser, S., van Nes, R. and Bovy, P. (2005) Path Size and overlap in multi-modal transport networks, a new interpretation. In Mahmassani, H.S. (Ed.), Flow, Dynamics and Human Interaction, Proceedings of the 16th International Symposium on Transportation and Traffic Theory.

Bierlaire, M., and Frejinger, E. (to appear) Route choice modeling with network-free data, Transportation Research Part C: Emerging Technologies (forthcoming) doi:10.1016/j.trc.2007.07.007

