direct perception for autonomous drivingalaink/orf467f15/deep driving.pdfprof. k prof. f prof. p...

51
Direct Perception for Autonomous Driving Chenyi Chen

Upload: others

Post on 03-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Direct Perception for Autonomous Driving

Chenyi Chen

Page 2: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

DeepDriving

• http://deepdriving.cs.princeton.edu/

Page 3: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Direct Perception

control estimate

Page 4: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Direct Perception (in Driving…)

• Ordinary car detection: Find a car! It’s localized in the image by a red bounding box.

• Direct perception: The car is in the right lane; 16 meters ahead

• Which one is more helpful for driving task?

Page 5: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Machine Learning

Page 6: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

What’s machine learning? • Example problem: face recognition

Prof. K Prof. F Prof. P Prof. V Chenyi

• Training data: a collection of images and labels (names)

Who is this guy?

• Evaluation criterion: correct labeling of new images

Page 7: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

What’s machine learning?

• Example problem: scene classification

road road sea mountain city

• a few labeled training images

What’s the label of this image?

• goal to label yet unseen image

Page 8: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Supervised Learning

• System input: X • System model: fθ(), with parameter θ • System output: Ŷ= fθ(X), • Ground truth label: Y • Loss function: L(θ) • Learning rate: γ

Page 9: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Why machine learning?

• The world is very complicated • We don’t know the exact model/mechanism

between input and output • Find an approximate (usually simplified)

model between input and output through learning

Page 10: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Deep Learning: A sub-field of machine learning

Page 11: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is
Page 12: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is
Page 13: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is
Page 14: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is
Page 15: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is
Page 16: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Some Salient Features

Why deep learning?

It’s a STOP sign!!

How do we detect a stop sign? It’s all about feature!

Page 17: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Why deep learning?

Pedestrian found!!!

How does computer vision algorithm work? It’s all about feature!

Page 18: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Why deep learning? • We believe driving is also related with certain features • Those features determine what action to take next • Salient features can be automatically detected and

processed by deep learning algorithm

Deep Convolutional Neural Network

Page 19: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Deep Convolutional Neural Network (CNN)

Figure courtesy and modified of Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

Convolutional layers Fully connected layers

Parallel computing Deep feature, 4096D, float (4 bytes)

Output, 14D, float (4 bytes) Input image, 280*210*3, int8 (1 byte)

……

Page 20: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

• Input image convolves with a large set of filters, and this process is repeated in many layers.

• Compared to traditional computer vision feature extraction method, e.g. SIFT, HOG, more powerful feature representation of the input image is learnt automatically through convolutions.

How does CNN work?

A visualized example of the set of filters

Page 21: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Why deep learning? • ImageNet Large Scale Visual Recognition Challenge

2012 (ILSVRC2012), the most famous challenge in computer vision field

• 1000 categories • 1.2 million training images • 50,000 validation images • 150,000 testing images • Top-5 error rate* of deep learning: 15.3% • Top-5 error rate of second best (which is non-deep

learning): 26.2%

*Top-5 error rate: the fraction of test images for which the correct label is not among the five labels considered most probable by the model

Page 22: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Result Page of ILSVRC2012 Website

Deep Learning

Deep Learning

Only one team!

Page 23: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Result Page of ILSVRC2014 Website

Deep Learning

Page 24: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Result Page of ILSVRC2014 Website

Deep Learning

Page 25: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Why deep learning?

• Deep learning first impacted the computer vision field in ILSVRC2012

• Now it’s almost dominating the computer vision field

• It only takes less than three years!

Page 26: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

How does our system work?

Page 27: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Basic Ideas

• Extract key parameters from driving scenes images with deep learning CNN

• Compute driving control (optimal control) based on those parameters

Page 28: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

In our specific case…

• Let the deep learning algorithm tell us: • angle: the angle between the car’s heading

and the tangent of the track; • toMarking: the distance between the center

of the car and each lane marking; • dist: the distance between the car and the

preceding car in each lane; • … 14 parameters in total

Page 29: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

In our specific case…

toMarking_MLtoMarking_ML toMarking_MRtoMarking_MRangle

• Illustration of key parameters for car pose and localization

Page 30: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

The cases we are dealing with • Monitoring current lane and left & right neighboring lanes

Page 31: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

• Let the deep learning CNN drive in a racing game -- TORCS

In our experiment

Page 32: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

How to train the system?

Page 33: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Seven Training Tracks

Page 34: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Multiple Asphalt Darkness Level for the Training Tracks

Page 35: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

22 Training Cars

Page 36: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

CNN during training phase • Supervised learning, ground truth extracted from the game engine

Page 37: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

How to run the system?

Page 38: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

CNN during testing phase

• Unseen track with unseen cars

Page 39: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

How does the successfully trained system drive a car?

• The system runs at 10Hz, real time

Page 40: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

A Challenging Test: Perception during Night Driving

Page 41: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is
Page 42: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Next step: incorporating temporal information

Page 43: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Other Interesting Stuffs

Page 44: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

A Taste of Stereo Vision

Page 45: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

3D Scene Reconstruction

• Stereo images • Color • 1382*512 • 10 FPS

Page 46: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Visual Odometry

• Visual odometry computes the trajectory of the vehicle only based on image sequences

Page 47: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Depth Map • Disparity map is computed from grayscale stereo image pairs • Depth map can be derived from disparity map and camera model

Page 48: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Other Demos for Structure-from-Motion

• https://www.youtube.com/watch?v=i7ierVkXYa8

• https://www.youtube.com/watch?v=vpTEobpYoTg

Page 49: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Other Demos for Structure-from-Motion

Page 50: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Other Demos for Structure-from-Motion

Page 51: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is

Q & A