the use of reinforcement learning in autonomous driving for vehicular path planning
TRANSCRIPT
The Use of Reinforcement Learning in Autonomous Driving for Vehicular Path Planning
October 17th, 2017
Vijay Nadkarni
Global Head, Artificial Intelligence
The High Tech Trend in the Automotive Industry
AI and Cloud Connectivity are becoming omnipresent
Autonomous vehicles are disrupting fundamental notions of driving
Increasing use of electric and clean energy vehicles
Ride sharing and service-based monetization are coming to the fore
The Building Blocks of Autonomous Driving
Sensor fusion (cameras, LIDAR, RADAR, GPS, IMU, etc.)
Object Detection & Classification
Actuator Control (steering, throttle control, braking)
Vehicular Maneuvers (aka Path Planning)
The Progression up the AI Value Chain in Autonomous Driving
Rule-based Identification
CNN-based object detection
Understanding
RL-based vehicular maneuvers
Reasoning
AI Technologies That are Useful in Autonomous Driving
Convolutional Neural Networks: Object detection & classification
Recurrent Neural Networks: Vehicular trajectory prediction
Reinforcement Learning: Vehicular maneuvers & path planning
Why Reinforcement Learning Matters…
Range of Maneuvers: It can learn an extremely large number of roadway maneuvers
Complexity of Maneuvers: It can learn complex maneuvers involving many contextual parameters
Decisions: It can make decisions akin to those of a human driver
Vehicular Maneuvers – a Foundation of Autonomous Driving
• Objective: a vehicle that can travel from Point A to Point B…on its own
• Conventional model for vehicular maneuvers: Rule-based
• New model for vehicular maneuvers: Reinforcement Learning
• RL shines at learning to handle complex roadway situations
Some Vehicular Maneuvers that Can be Handled by RL
• Adaptive cruise control
• Overtaking with lane change
• Traffic congestion (stop and go traffic)
• Merge onto highway from entrance ramp
• Merge off highway onto exit ramp
• Narrowing of lanes
• Passing a construction zone
• Passing an accident site
• Stopping at a traffic light
• Stopping at a stop sign
• Left or right turn at intersection
• Merging into roundabout
Example: Overtaking Maneuver using Reinforcement Learning
Some Fundamentals of Reinforcement Learning
• Agent observes the state of the environment
• Agent takes an action in order to achieve a benefit
• Agent receives a reward based on the result of that action – it can be positive or negative
• In many cases, the reward may be obtained well in the future
• RL enables the agent to learn the optimal behavior that will maximize the cumulative reward over time
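The point about rewards that arrive well in the future is commonly made precise with a discounted return, in which later rewards count for less than immediate ones. A minimal sketch in Python, assuming a discount factor `gamma` of 0.9 and two made-up reward sequences (neither comes from the slides):

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each weighted by gamma raised to its time step."""
    total = 0.0
    # Walk backwards so each step folds in the discounted value of what follows.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical episodes: the same reward of 10, received late versus at once.
delayed = [0.0, 0.0, 0.0, 10.0]
immediate = [10.0, 0.0, 0.0, 0.0]

print(discounted_return(delayed))
print(discounted_return(immediate))
```

With a discount factor below 1, the delayed reward is worth less to the agent than the same reward received immediately, which is what pushes it toward behavior that pays off sooner when all else is equal.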
Foundations of RL: Markov Decision Process and the Q-Table
Markov Decision Process
• A mathematical framework for modeling decisions when outcomes are partly random and partly under the control of a decision maker.
• Agent observes the environment, executes an action, receives a reward from the environment, and repeats the process
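The observe, act, receive-reward loop can be sketched with a toy environment. Everything below is hypothetical and chosen only for illustration: the `ToyFollowEnv` class, its five gap states, the reward values, and the hand-written `simple_policy` are assumptions, not part of the talk. The lead vehicle's random behavior stands in for the "partly random" part of the process, and the agent's action for the part under its control.

```python
import random


class ToyFollowEnv:
    """Hypothetical car-following MDP: the state is the gap to the lead vehicle (0..4)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.gap = 2  # start at the desired gap

    def step(self, action):
        # Part under the agent's control: braking opens the gap, accelerating closes it.
        delta = {"brake": 1, "hold": 0, "accelerate": -1}[action]
        # Part that is random: the lead vehicle speeds up, holds, or slows down.
        delta += self.rng.choice((-1, 0, 1))
        self.gap = max(0, min(4, self.gap + delta))
        if self.gap == 0:
            return self.gap, -10.0, True  # collision ends the episode
        reward = 1.0 if self.gap == 2 else -1.0
        return self.gap, reward, False


def simple_policy(gap):
    """Hand-written stand-in for a learned policy."""
    if gap < 2:
        return "brake"
    return "accelerate" if gap > 2 else "hold"


def run_episode(env, policy, max_steps=20):
    state, total = env.gap, 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent observes state, picks an action
        state, reward, done = env.step(action)  # environment returns new state and reward
        total += reward
        if done:
            break
    return total


print(run_episode(ToyFollowEnv(seed=1), simple_policy))
```

In an RL setting the fixed `simple_policy` would be replaced by a policy that is improved from the rewards collected in episodes like this one.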
          Action A   Action B   Action C
State 0   0.235      0.002      0.506
State 1   0.659      0.121      0.000
State 2   0.020      0.270      0.117
Q-Table
• A table that is built up from repeated observations of state-action pairs and the resulting outcomes (rewards)
• For each state, indicates the action that is expected to yield the maximum reward
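As a sketch of how such a table is read and updated, the snippet below uses the values from the Q-table on the slide. The greedy lookup follows directly from the table. The update rule is the standard tabular Q-learning update, shown here as one common way to build the table and not necessarily the method used in the talk; the learning rate `alpha` and discount factor `gamma` are assumed values.

```python
ACTIONS = ["A", "B", "C"]

# Values copied from the Q-table on the slide.
q_table = {
    0: [0.235, 0.002, 0.506],
    1: [0.659, 0.121, 0.000],
    2: [0.020, 0.270, 0.117],
}


def greedy_action(state):
    """Return the action with the highest Q-value for this state."""
    row = q_table[state]
    return ACTIONS[row.index(max(row))]


def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update for one observed transition."""
    a = ACTIONS.index(action)
    # Target: observed reward plus the discounted best value of the next state.
    target = reward + gamma * max(q_table[next_state])
    q_table[state][a] += alpha * (target - q_table[state][a])
    return q_table[state][a]


for s in q_table:
    print(f"State {s}: best action is {greedy_action(s)}")
```

Reading the table this way gives Action C for State 0, Action A for State 1, and Action B for State 2.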
How a Reinforcement Learning Network is Trained for Adaptive Cruise Control (ACC)
• A simulator creates thousands of ACC training vectors
• Desired behavior: Follow the lead vehicle at a safe distance
• Four categories of training vectors:
  • Ego vehicle collides with lead vehicle (high negative reward)
  • Ego vehicle maintains safe distance (positive reward)
  • Ego vehicle maintains unsafe distance (moderate negative reward)
  • Ego vehicle closes in, then distance expands (small negative reward)
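The four categories above can be sketched as a reward function over a simulated gap trace. The slides give only the relative sizes of the rewards, so the numeric values, the 30 m `safe_gap` threshold, the representation of a training vector as a list of gaps, and the name `acc_reward` are all assumptions made for illustration.

```python
def acc_reward(gaps, safe_gap=30.0):
    """Score one training vector: a sequence of ego-to-lead gaps in metres."""
    if min(gaps) <= 0.0:
        return -100.0  # collision with lead vehicle: high negative reward
    if all(g >= safe_gap for g in gaps):
        return 1.0     # safe distance maintained throughout: positive reward
    if gaps[-1] >= safe_gap:
        return -1.0    # closed in, then the gap expanded again: small negative reward
    return -10.0       # ends at an unsafe distance: moderate negative reward


# Hypothetical gap traces, one per category.
print(acc_reward([40.0, 20.0, 0.0]))
print(acc_reward([40.0, 35.0, 32.0]))
print(acc_reward([40.0, 20.0, 35.0]))
print(acc_reward([40.0, 20.0, 15.0]))
```

Ordering the penalties this way tells the agent that a collision is far worse than a temporarily short gap, while still discouraging any approach inside the safe distance.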
How an RL Agent Learns – Box Car Circling Rectangular Track
Final Thoughts – Where Reinforcement Learning is Headed
RL is likely to replace conventional methods for vehicular maneuvers
New algorithms that can handle complex roadway situations are coming
Cloud and on-board triggers will both be relevant