smart traffic lights that learn ! multi-agent reinforcement learning integrated network of...

20
Smart Traffic Lights that Learn ! M ulti-A gent R einforcement L earning I ntegrated N etwork of Adaptive Traffic Signal Controllers M A R L I N Samah El-Tantawy, Ph.D. Post Doctoral Fellow, Dept of Civil Engineering Baher Abdulhai, Ph.D., P.Eng. Director, ITS Centre and Testbed, Dept of Civil Engineering Hossam Abdelgawad, Ph.D., P.Eng. Manager of ITS Centre and Testbed ACGM 2013- Intelligent Transport for Smart Cities

Upload: jonathan-laba

Post on 29-Nov-2014

293 views

Category:

Technology


2 download

DESCRIPTION

The University of Toronto’s Dr. Samah el-Tantawy has developed MARLIN-ATSC (Multi-agent Reinforcement Learning for Integrated Network of Adaptive Traffic Signal Controllers), an artificial intelligence traffic system that uses cameras and inter-system collaboration to determine how best to manage traffic flow throughout a region. MARLIN’s success is largely due to its adaptive nature, which allows it to learn and adjust as it gathers traffic pattern information. Read more: http://www.utne.com/science-and-technology/marlin-transforms-traffic.aspx#ixzz2yxjuDpCP visit http://techronto.com for more information like this.

TRANSCRIPT

Page 1: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Smart Traffic Lights that Learn !

Multi-Agent Reinforcement Learning Integrated Network of Adaptive Traffic Signal Controllers

M A R L I N

Samah El-Tantawy, Ph.D. Post Doctoral Fellow, Dept of Civil Engineering

Baher Abdulhai, Ph.D., P.Eng. Director, ITS Centre and Testbed, Dept of Civil Engineering

Hossam Abdelgawad, Ph.D., P.Eng. Manager of ITS Centre and Testbed

ACGM 2013- Intelligent Transport for Smart Cities

Page 2: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Outline 2

1. In a Nutshell 2. Theory in Brief Reinforcement Learning and Game Theory

3. Applications City of Toronto Testbed

4. Hardware in the Loop Testing Approach Integration with PEEK ATC-1000

Next Steps Q&A

Page 3: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

In a Nutshell 3

Grand objective

Intersections "talk to each other",

Each is affected by what is happening upstream

Each affects what is happening downstream –

Whole network control in one shot from a grand brain is the dream

Issue

Intractable theoretically,

Too complex practically,

Requires massive and very expensive communication

Solution

Decentralized,

Self learning: agents learn to control their local intersection, and

Game theory based: agents learn to collaborate

Page 4: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

What is MARLIN? 4

Artificial-intelligence-based control software

Enables traffic lights to self-learn and self-collaborate with neighbouring traffic lights

Cuts down motorists’ delay, fuel consumption and the negative environmental effects of congestion

Easier to operate (self learning)

Less expensive communication if even necessary (less costly)

Page 5: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

MARLIN-ATSC: Level 4

Evolution of “Adaptive” Signal Control

Level 0 • Fixed-Time

and Actuated Control

• TRANSYT • 1969, UK

Level 1 • Centralized

Control, Off-line Optimization

• SCATS • 1979,

Australia • >50

installations worldwide

Level 2 • Centralized

Control, On-line Optimization

• SCOOT • 1981, UK • >170

installations worldwide

Level 3 • Distributed

Control, Model-Based

• OPAC, RHODES • 1992, USA • 5 installations in

USA

Level 4 • Distributed

Self-Learning Control

• MARLIN-ATSC • 2011, Canada

5

Page 6: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Issues with Leading ATSC Technologies?

• Expensive • Not scalable • Not robust

Centralized

• Relying on an accurate traffic modelling framework

• the accuracy of which is questionable Model-Based

• Increasing the complexity of the system exponentially with the increase in the number of intersections/controllers

Curse of Dimensionality

• Requiring highly skilled labour to operate due to their complexity.

Human Intervention

Requirements

6

Page 7: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Why is MARLIN Different? 7

MARLIN

Self-Learning

Decentralized

Model-Free

Coordinated Scalable

Pattern Sensitive

Generic

Human Intervention Requirements

Centralized

Inefficient Coordination

Model-Based

Curse of Dimensionality

Prediction Requirement

Specific Design

Page 8: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Learning the Control Law: Reinforcement Learning Architecture

8

Environment

RL Architecture

Agent

State Reward Action

Goal: Optimal Control law = mapping between states and actions

)],(),(max[),(),( 111 kkkkk

a

kkkkkkk asQasQrasQasQ

),(maxarg 11 asQa kk

a

k Balancing exploration and exploitation

Q a1 a2

s1 -10 -5

s2 -3 -15

Q Table

Page 9: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

RL-based ATSC Architecture

RL Software Agent

State (Queue Lengths)

Reward

(Delay Savings)

Action (Extend /Switch)

Traffic Simulation Environment

9

Page 10: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

10

Each agent plays a game with each adjacent intersection in its neighborhood

I5 I6 I4

I2 I3 I1

I8 I9 I7

I5 I6 I4

I2 I3 I1

I8 I9 I7

I5 I6 I4

I2 I3 I1

I8 I9 I7

Example for Intermediate Intersection (4 Games )

Example for Edge Intersection ( 3 Games)

Example for Corner Intersection ( 2 Games)

MARLIN- ATSC: Coordination Principle

Page 11: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

MARLIN-ATSC: (a) Independent Mode, (b) Integrated Mode

MARLIN-ATSC Available Modes

Queue Length 1

Delay 1 Extend 1 Queue Length 2

Delay 2 Extend 2

MARLIN-ATSC

Queue Length 1

Delay 1 Extend 1 Queue Length 2

Delay 2 Extend 2

(a)

(b)

11

Page 12: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Large-Scale Application Network-Wide MOE in the Normal Scenario

12

System

MOE

BC

%

Improvments

MARL-TI Vs.

BC

%

Improvments

MARLIN-IC Vs.

BC

% Improvments

MARLIN-IC Vs.

MARL-TI

Average Intersection

Delay (sec/veh)35.27 27% 38% 14%

Throughput (veh) 23084 3% 6% 3%

Avg Queue Length (veh) 8.66 24% 32% 11%

Std. Avg. Queue Length

(veh)2.12 23% 31% 10%

Avg. Link Delay (sec) 9.45 10% 47% 41%

Avg. Link Stop Time (sec) 2.74 6% 26% 21%

Avg. Link Travel Time

(sec)16.81 6% 27% 22%

CO2 Emission Factor

(gm/km)587.28 28% 30% 2%

Page 13: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Large-Scale Application % Improvement in Average Delay

13

MARLIN-IC vs BC

% Improvement Area 1

Area 2

Area 3

Page 14: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Large-Scale Application Average Route Travel Time for Selected Routes

14

0

1

2

3

4

5

6

7

8

1 2 3 4 5 6 7 8 9 10 11 12

Average T

ravel

Tim

e (

min

)

Time Interval (5 min)

Gardiner EB

BC MARL-TI MARLIN-IC

Fre

eway

Page 15: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Large-Scale Application Average Route Travel Time for Selected Routes

15

0

2

4

6

8

10

12

14

16

18

20

1 2 3 4 5 6 7 8 9 10 11 12

Av

era

ge T

ra

vel

Tim

e (

min

)

Time Interval (5 min)

LakeShore EB to Spadina NB

BC MARL-TI MARLIN-IC

Maj

or

Art

eria

l

Page 16: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Controller Interface Device(CID) RS485 to USB

Traffic Signal Controller

RS485 - SDLC protocol

USB - SDLC protocol

Industrial Computer

Ethernet - NTCIP protocol

Paramics Modeller

MARLIN-HILS Architecture 16

Page 17: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

HILS Setup: Demo 17

Page 18: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Conclusion 18

MARLIN state of the art gen4+

Thoroughly developed and tested

Patent Pending Status

On going:

HILS & PEEK ATC-1000 Integration

Potential Field Operation Test

Productization

From TSP to People Priority (PSP)

Page 19: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Samah El-Tantawy [email protected]

Baher Abdulhai [email protected]

Hossam Abdelgawad [email protected]

ACGM 2013- Intelligent Transport for Smart Cities

Page 20: Smart Traffic Lights that Learn !    Multi-Agent Reinforcement Learning Integrated Network  of Adaptive Traffic Signal Controllers

Smart Traffic Lights that Learn !

Multi-Agent Reinforcement Learning Integrated Network of Adaptive Traffic Signal Controllers

M A R L I N

Samah ElTantawy, Ph.D. Post Doctoral Fellow, Dept of Civil Engineering

Baher Abdulhai, Ph.D., P.Eng. Director, ITS Centre and Testbed, Dept of Civil Engineering

Hossam Abdelgawad, Ph.D., P.Eng. Manager of ITS Centre and Testbed

ACGM 2013- Intelligent Transport for Smart Cities