The Use of Reinforcement Learning in Autonomous Driving for Vehicular Path Planning

October 17th, 2017

Vijay Nadkarni

Global Head, Artificial Intelligence

The High Tech Trend in the Automotive Industry

AI and Cloud Connectivity are becoming omnipresent

Autonomous vehicles are disrupting fundamental notions of driving

Increasing use of electric and clean energy vehicles

Ride sharing and service-based monetization are coming to the fore

The Building Blocks of Autonomous Driving

Sensor fusion (cameras, LIDAR, RADAR, GPS, IMU, etc.)

Object Detection & Classification

Actuator Control (steering, throttle control, braking)

Vehicular Maneuvers (aka Path Planning)

The Progression up the AI Value Chain in Autonomous Driving

Rule-based: Identification

CNN-based object detection: Understanding

RL-based vehicular maneuvers: Reasoning

AI Technologies That are Useful in Autonomous Driving

Convolutional Neural Networks: Object detection & classification

Recurrent Neural Networks: Vehicular trajectory prediction

Reinforcement Learning: Vehicular maneuvers & path planning

Why Reinforcement Learning Matters…

Range of Maneuvers: It can learn an extremely large number of roadway maneuvers

Complexity of Maneuvers: It can learn complex maneuvers involving many contextual parameters

Decisions: It can make decisions akin to a human driver

Vehicular Maneuvers – a Foundation of Autonomous Driving

• Objective: a vehicle that can travel from Point A to Point B…on its own

• Conventional model for vehicular maneuvers: Rule-based

• New model for vehicular maneuvers: Reinforcement Learning

• RL shines at learning to handle complex roadway situations

Some Vehicular Maneuvers that Can be Handled by RL

• Adaptive cruise control

• Overtaking with lane change

• Traffic congestion (stop and go traffic)

• Merge onto highway from entrance ramp

• Merge off highway onto exit ramp

• Narrowing of lanes

• Passing a construction zone

• Passing an accident site

• Stopping at a traffic light

• Stopping at a stop sign

• Left or right turn at intersection

• Merging into roundabout

Example: Overtaking Maneuver using Reinforcement Learning

Some Fundamentals of Reinforcement Learning

• Agent observes state of environment

• Agent takes an action in order to achieve a benefit

• Agent receives a reward based on the result of that action; the reward can be positive or negative

• In many cases, the reward may be obtained well in the future

• RL enables the agent to learn the optimal behavior that will maximize the reward
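The point about rewards arriving well in the future is commonly handled by discounting: a later reward still counts toward the total, but with less weight. The sketch below is illustrative only. The function name `discounted_return` and the discount factor `gamma=0.9` are assumptions, not values from this presentation.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each weighted by gamma**t for its time step t.

    gamma is a hypothetical discount factor chosen for illustration.
    """
    total = 0.0
    # Work backwards so each earlier step adds gamma times the later total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# The same reward of 1.0, received immediately versus three steps later.
immediate = discounted_return([1.0, 0.0, 0.0, 0.0])
delayed = discounted_return([0.0, 0.0, 0.0, 1.0])
print(immediate, delayed)
```

With a discount factor below 1, the delayed reward contributes less than the immediate one. This is one way an agent can be made to prefer reaching a benefit sooner while still valuing it when it arrives late.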

Foundations of RL: Markov Decision Process and the Q-Table

Markov Decision Process

• A mathematical framework for modeling decisions when outcomes are partly random and partly under the control of a decision maker.

• The agent observes the environment, executes an action, receives a reward from the environment, and repeats the process
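The observe, act, receive-reward cycle can be sketched with a toy environment. Everything here is hypothetical and chosen only to illustrate the idea: the class `ToyRoadEnv`, its three states, the two actions, and the `slip_prob` parameter are not from the presentation. The random slip models the "partly random" part of the outcome, and the chosen action models the part under the agent's control.

```python
import random


class ToyRoadEnv:
    """Hypothetical 3-state MDP: states 0..2, actions 'stay' or 'advance'."""

    def __init__(self, slip_prob=0.2, seed=0):
        self.rng = random.Random(seed)
        self.slip_prob = slip_prob
        self.state = 0

    def step(self, action):
        # Partly under the agent's control: the action sets the intended move.
        if action == "advance":
            intended = min(self.state + 1, 2)
        else:
            intended = self.state
        # Partly random: with probability slip_prob the move does not happen.
        if self.rng.random() < self.slip_prob:
            intended = self.state
        self.state = intended
        # Reward is given only while the agent is in the final state.
        reward = 1.0 if self.state == 2 else 0.0
        return self.state, reward


env = ToyRoadEnv()
state, total_reward = env.state, 0.0
for _ in range(10):
    action = "advance"                # fixed policy, for illustration only
    state, reward = env.step(action)  # act, then observe new state and reward
    total_reward += reward
```

A learning agent would replace the fixed `"advance"` policy with one that it improves from the rewards it observes.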

Q-Table

           Action A   Action B   Action C
State 0    0.235      0.002      0.506
State 1    0.659      0.121      0.000
State 2    0.020      0.270      0.117

• A table that is built up from repeated observations of state-action pairs and the resulting outcomes (rewards)

• For each state, indicates the action that is expected to yield the maximum reward
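As a minimal sketch, the example Q-table values from this slide can be stored as a nested dictionary and queried for the best action in each state. The update rule shown is the standard tabular Q-learning update, included as an assumption about how such a table could be built up from observations; the presentation does not state which update rule is used. The names `best_action` and `q_update` and the parameters `alpha` and `gamma` are hypothetical.

```python
# Example values copied from the Q-table on this slide.
Q = {
    "State 0": {"Action A": 0.235, "Action B": 0.002, "Action C": 0.506},
    "State 1": {"Action A": 0.659, "Action B": 0.121, "Action C": 0.000},
    "State 2": {"Action A": 0.020, "Action B": 0.270, "Action C": 0.117},
}


def best_action(q_table, state):
    """Return the action with the highest expected reward for this state."""
    actions = q_table[state]
    return max(actions, key=actions.get)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (assumed, not from the slides).

    alpha is a hypothetical learning rate, gamma a hypothetical discount.
    """
    best_next = max(q_table[next_state].values())
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])


greedy_policy = {state: best_action(Q, state) for state in Q}
print(greedy_policy)
```

Reading the table this way, the preferred action is Action C in State 0, Action A in State 1, and Action B in State 2.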

How a Reinforcement Learning Network is Trained for Adaptive Cruise Control (ACC)

• A simulator creates thousands of ACC training vectors

• Desired behavior: Follow the lead vehicle at a safe distance

• Four categories of training vectors:

   • Ego vehicle collides with lead vehicle (high negative reward)

   • Ego vehicle maintains safe distance (positive reward)

   • Ego vehicle maintains unsafe distance (moderate negative reward)

   • Ego vehicle closes in, then distance expands (small negative reward)
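The four categories can be sketched as a reward function over one simulated episode. This is only an illustration under stated assumptions: the function name `acc_episode_reward`, the 30 m safe-gap threshold, and the numeric reward values are all hypothetical. The slides give only the relative sizes of the rewards (high negative, positive, moderate negative, small negative).

```python
def acc_episode_reward(gaps_m, safe_gap_m=30.0):
    """Score one simulated ACC episode.

    gaps_m: distances (metres) from ego vehicle to lead vehicle over time.
    safe_gap_m and all reward magnitudes are hypothetical placeholders.
    """
    if min(gaps_m) <= 0.0:
        return -100.0  # collision with lead vehicle: high negative reward
    if all(gap >= safe_gap_m for gap in gaps_m):
        return 1.0     # safe distance maintained throughout: positive reward
    if gaps_m[-1] >= safe_gap_m:
        return -1.0    # closed in, then distance expanded: small negative
    return -10.0       # still at an unsafe distance: moderate negative


# One hypothetical episode per category.
episodes = {
    "collision": [40.0, 20.0, 0.0],
    "safe": [40.0, 35.0, 32.0],
    "unsafe": [40.0, 20.0, 15.0],
    "closes_then_expands": [40.0, 20.0, 35.0],
}
rewards = {name: acc_episode_reward(gaps) for name, gaps in episodes.items()}
print(rewards)
```

The ordering of the magnitudes is the design choice that matters: a collision must cost far more than any other outcome, and a recovered gap should be penalized less than a gap that stays unsafe.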

How an RL Agent Learns – Box Car Circling Rectangular Track

Final Thoughts – Where Reinforcement Learning is Headed

RL is likely to replace conventional methods for vehicular maneuvers

New algorithms that can handle complex roadway situations are coming

Cloud and on-board triggers will both be relevant