autonomous driving made safe - nvidia · deep learning system for automated scenario modification...
TRANSCRIPT
Autonomous driving made safe
Bio
Celite Milbrandt, Founder
● Austin, Texas since 1998
● Founder of Slacker Radio
○ In-dash for Tesla, GM, and Ford
○ 35M active users in 2008
● Chief Product Officer of RideScout
○ Acquired by MBUSA/Daimler in 2014
Mission statement: Making autonomous vehicle travel safe
Current scenario verification challenges
● Large vehicle fleets
● Driver required to take over in the event of a system error
● Expensive, ad hoc, and incomplete
● Many simple scenarios are missed
● Scenario generation and verification happen in real time and are not easily repeatable
Solution
● Automate scenario test generation for planning testing
● Deep learning system for automated scenario modification and re-generation.
● Leverages existing gaming systems to enable multiphysics simulation
● Generation of realistic Lidar, Radar, Camera, and IMU sensor information for perception system testing
● Enable automated vehicle control performance metrics
● Fast error case regeneration, with derivative regeneration
Testing Perception and Planning
[Diagram: Lidar, radar, IMU, and stereo camera outputs from the simulation engine feed the control system under test, alongside ground truth with scene labeling]
Training Realistic Traffic Behavior
● Each agent/driver must have separate behaviors
● Behaviors must be learned based on different reward structures during training
● Examples of learned behaviors
○ Speeder
○ Brake-happy
○ Cell phone driver
○ Drunk driver
● Behaviors are distributed based on the type of scenarios we want to test against
● Accidents result from the distribution of agents with various learned behaviors
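As a sketch of how behavior-specific reward structures and fleet distributions might fit together (all behavior names, reward terms, and weights here are illustrative assumptions, not the actual monoDrive library):

```python
import random

# Hypothetical reward modifiers: each driver behavior reweights the
# base reward terms the agent is trained on.
BEHAVIOR_MODIFIERS = {
    "speeder":     {"speed":  2.0, "collision": -1.0, "lane_keep": 0.5},
    "brake_happy": {"speed": -0.5, "collision": -2.0, "lane_keep": 1.0},
    "cell_phone":  {"speed":  1.0, "collision": -0.5, "lane_keep": 0.2},
    "drunk":       {"speed":  1.0, "collision": -0.2, "lane_keep": 0.1},
}

def modified_reward(base_terms, behavior):
    """Combine base reward terms using the behavior's weights."""
    weights = BEHAVIOR_MODIFIERS[behavior]
    return sum(weights[k] * v for k, v in base_terms.items())

def sample_fleet(distribution, n_agents, rng=random):
    """Draw a behavior for each agent from the scenario's target distribution."""
    behaviors, probs = zip(*distribution.items())
    return rng.choices(behaviors, weights=probs, k=n_agents)
```

Skewing the distribution toward aggressive behaviors then makes accident-producing scenarios more likely, which is the point of the test fleet.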
Reinforcement Trained Neural Network
● Input layer is an image in our case
● Output layer gives the log probabilities of applying throttle or turning right
● More negative log probabilities correspond to applying the brake or turning left, respectively
● Number of layers and number of neurons per layer are selected based on the convergence characteristics of your desired value function and/or policy
● Reward function is chosen based on the behavior you are trying to emulate
● Comment: control belongs on the CPU; computation lives on the GPU
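A minimal sketch of such a policy network, assuming an 80x80 downsampled input and a single hidden layer (the layer sizes and initialization are illustrative, not the talk's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 80*80 downsampled image in, one hidden layer,
# two outputs: throttle vs. brake, turn-right vs. turn-left.
D, H = 80 * 80, 200
W1 = rng.standard_normal((H, D)) / np.sqrt(D)  # scaled random init
W2 = rng.standard_normal((2, H)) / np.sqrt(H)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def policy_forward(image):
    """Map a flattened 80x80 image to [log P(throttle|s), log P(turnRight|s)].

    A log probability near 0 means apply throttle / turn right; a strongly
    negative one corresponds to brake / turn left, as described above.
    """
    h = np.maximum(0, W1 @ image.ravel())  # ReLU hidden layer
    p = sigmoid(W2 @ h)                    # P(throttle|s), P(turnRight|s)
    return np.log(p)
```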
[Diagram: rewards feed the neuron updater]
Reinforcement Learning
● Simulator Interface
○ Socket-based
○ Python, C++
○ Single simulator instance
● Per-Agent Reward Modifiers
○ Library of reward modifiers
● Agent Hyperparameters
○ Continuous action space
○ Multiple concurrent agents
● Downsampling
○ Full resolution -> 80x80
○ Top-down view or perspective
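The downsampling step can be sketched as a simple block average (the factor and shapes are illustrative; the real pipeline may use a different filter):

```python
import numpy as np

def downsample(frame, n):
    """Block-average an image by an integer factor n (e.g. full res -> 80x80).

    Assumes the height and width are divisible by n; crop beforehand if not.
    Works for grayscale (H, W) or color (H, W, C) frames.
    """
    h, w = frame.shape[:2]
    blocks = frame.reshape(h // n, n, w // n, n, *frame.shape[2:])
    return blocks.mean(axis=(1, 3))
```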
[Diagram: Agent -> Downsampler (↓N, n×m) -> Reward Modifier, outputting P(throttle|s) and P(turnRight|s)]
Learning to Drive: Example of a basic reward system
● Stay in lane
● Don't hit other vehicles
● Maintain a safe distance from the leading vehicle
● Change lanes only to avoid collision
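The four rules above can be sketched as a per-step reward function (the weights and the safe-gap threshold are illustrative assumptions, not the talk's tuned values):

```python
def basic_reward(in_lane, collided, gap_m, changed_lane, collision_imminent,
                 safe_gap_m=10.0):
    """Score one simulation step against the four basic driving rules."""
    r = 0.0
    r += 1.0 if in_lane else -1.0               # stay in lane
    r += -10.0 if collided else 0.0             # don't hit other vehicles
    r += 0.5 if gap_m >= safe_gap_m else -0.5   # keep a safe following distance
    if changed_lane and not collision_imminent:
        r -= 2.0                                # change lanes only to avoid collision
    return r
```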
Basic System Details:
● Examples of our reward functions for different types of drivers
○ Modulate reward with speed
○ Generate negative/positive rewards based on different collision boundaries
○ Generate reward for causing opposing cars to move, swerve, or change direction
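A hedged sketch of one such driver-type reward, combining the three ideas above (the weights, boundary distance, and the `aggression` term are hypothetical):

```python
def driver_reward(speed_mps, boundary_dist_m, caused_swerve,
                  speed_weight=0.1, boundary_m=2.0, aggression=0.0):
    """Illustrative reward for one driver type.

    speed_weight modulates reward with speed; entering the collision
    boundary is penalized; aggressive driver types are *rewarded* for
    making opposing cars move, swerve, or change direction.
    """
    r = speed_weight * speed_mps
    if boundary_dist_m < boundary_m:          # inside the collision boundary
        r -= (boundary_m - boundary_dist_m)   # penalty grows as boundary shrinks
    if caused_swerve:
        r += aggression                       # positive only for aggressive types
    return r
```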
Scalable multi-agent training and testing for A3C...
Example monoDrive Reinforcement Agent
● Continuous action space
● Up to 20 agents (200 in the future)
● Reward based on agent reward function/modifier
● Agent based on Andrej Karpathy's reinforcement learning post: http://karpathy.github.io/2016/05/31/rl/
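Two pieces from that post carry over directly to this kind of agent: discounting the episode's rewards, and turning sampled actions into a REINFORCE-style gradient signal. A minimal sketch:

```python
import numpy as np

def discount_rewards(rewards, gamma=0.99):
    """Discounted return for each step, computed backwards through the episode."""
    out = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out

def policy_gradient_signal(sampled_actions, probs, rewards, gamma=0.99):
    """REINFORCE signal: (action - p), scaled by the standardized return."""
    adv = discount_rewards(np.asarray(rewards, float), gamma)
    adv = (adv - adv.mean()) / (adv.std() + 1e-8)
    return (np.asarray(sampled_actions) - np.asarray(probs)) * adv
```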
Try it out! www.monodrive.io
● Download simulator at www.monodrive.io
○ Coming soon!
○ Early version available on request to [email protected]
● Download sample agent and sample reward at: www.github.com/celite/agent_cm.py
● System Requirements:
○ Windows, Mac, Ubuntu
○ TensorFlow-GPU (or TensorFlow if you have more time than money)
○ 32 GB memory (64 GB recommended)
● Example agent is Python-based but can be anything
● Control interface based on IP sockets
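Since the control interface is plain IP sockets, any language can drive it. As one possible shape for such a link (the length-prefixed JSON framing here is an assumption for illustration, not the documented monoDrive wire protocol):

```python
import json
import socket
import struct

# Assumed wire format: 4-byte big-endian length prefix + JSON body.

def send_msg(sock, obj):
    """Serialize a control message and write it with a length prefix."""
    body = json.dumps(obj).encode()
    sock.sendall(struct.pack(">I", len(body)) + body)

def recv_msg(sock):
    """Read one length-prefixed JSON message from the socket."""
    (length,) = struct.unpack(">I", sock.recv(4))
    body = b""
    while len(body) < length:
        body += sock.recv(length - len(body))
    return json.loads(body)
```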