direct perception for autonomous drivingalaink/orf467f15/deep driving.pdfprof. k prof. f prof. p...
TRANSCRIPT
![Page 1: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/1.jpg)
Direct Perception for Autonomous Driving
Chenyi Chen
![Page 3: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/3.jpg)
Direct Perception
control estimate
![Page 4: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/4.jpg)
Direct Perception (in Driving…)
• Ordinary car detection: Find a car! It’s localized in the image by a red bounding box.
• Direct perception: The car is in the right lane; 16 meters ahead
• Which one is more helpful for driving task?
![Page 5: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/5.jpg)
Machine Learning
![Page 6: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/6.jpg)
What’s machine learning? • Example problem: face recognition
Prof. K Prof. F Prof. P Prof. V Chenyi
• Training data: a collection of images and labels (names)
Who is this guy?
• Evaluation criterion: correct labeling of new images
![Page 7: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/7.jpg)
What’s machine learning?
• Example problem: scene classification
road road sea mountain city
• a few labeled training images
What’s the label of this image?
• goal to label yet unseen image
![Page 8: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/8.jpg)
Supervised Learning
• System input: X • System model: fθ(), with parameter θ • System output: Ŷ= fθ(X), • Ground truth label: Y • Loss function: L(θ) • Learning rate: γ
![Page 9: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/9.jpg)
Why machine learning?
• The world is very complicated • We don’t know the exact model/mechanism
between input and output • Find an approximate (usually simplified)
model between input and output through learning
![Page 10: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/10.jpg)
Deep Learning: A sub-field of machine learning
![Page 11: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/11.jpg)
![Page 12: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/12.jpg)
![Page 13: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/13.jpg)
![Page 14: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/14.jpg)
![Page 15: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/15.jpg)
![Page 16: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/16.jpg)
Some Salient Features
Why deep learning?
It’s a STOP sign!!
How do we detect a stop sign? It’s all about feature!
![Page 17: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/17.jpg)
Why deep learning?
Pedestrian found!!!
How does computer vision algorithm work? It’s all about feature!
![Page 18: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/18.jpg)
Why deep learning? • We believe driving is also related with certain features • Those features determine what action to take next • Salient features can be automatically detected and
processed by deep learning algorithm
Deep Convolutional Neural Network
![Page 19: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/19.jpg)
Deep Convolutional Neural Network (CNN)
Figure courtesy and modified of Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
Convolutional layers Fully connected layers
Parallel computing Deep feature, 4096D, float (4 bytes)
Output, 14D, float (4 bytes) Input image, 280*210*3, int8 (1 byte)
……
![Page 20: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/20.jpg)
• Input image convolves with a large set of filters, and this process is repeated in many layers.
• Compared to traditional computer vision feature extraction method, e.g. SIFT, HOG, more powerful feature representation of the input image is learnt automatically through convolutions.
How does CNN work?
A visualized example of the set of filters
![Page 21: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/21.jpg)
Why deep learning? • ImageNet Large Scale Visual Recognition Challenge
2012 (ILSVRC2012), the most famous challenge in computer vision field
• 1000 categories • 1.2 million training images • 50,000 validation images • 150,000 testing images • Top-5 error rate* of deep learning: 15.3% • Top-5 error rate of second best (which is non-deep
learning): 26.2%
*Top-5 error rate: the fraction of test images for which the correct label is not among the five labels considered most probable by the model
![Page 22: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/22.jpg)
Result Page of ILSVRC2012 Website
Deep Learning
Deep Learning
Only one team!
![Page 23: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/23.jpg)
Result Page of ILSVRC2014 Website
Deep Learning
![Page 24: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/24.jpg)
Result Page of ILSVRC2014 Website
Deep Learning
![Page 25: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/25.jpg)
Why deep learning?
• Deep learning first impacted the computer vision field in ILSVRC2012
• Now it’s almost dominating the computer vision field
• It only takes less than three years!
![Page 26: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/26.jpg)
How does our system work?
![Page 27: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/27.jpg)
Basic Ideas
• Extract key parameters from driving scenes images with deep learning CNN
• Compute driving control (optimal control) based on those parameters
![Page 28: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/28.jpg)
In our specific case…
• Let the deep learning algorithm tell us: • angle: the angle between the car’s heading
and the tangent of the track; • toMarking: the distance between the center
of the car and each lane marking; • dist: the distance between the car and the
preceding car in each lane; • … 14 parameters in total
![Page 29: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/29.jpg)
In our specific case…
toMarking_MLtoMarking_ML toMarking_MRtoMarking_MRangle
• Illustration of key parameters for car pose and localization
![Page 30: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/30.jpg)
The cases we are dealing with • Monitoring current lane and left & right neighboring lanes
![Page 31: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/31.jpg)
• Let the deep learning CNN drive in a racing game -- TORCS
In our experiment
![Page 32: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/32.jpg)
How to train the system?
![Page 33: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/33.jpg)
Seven Training Tracks
![Page 34: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/34.jpg)
Multiple Asphalt Darkness Level for the Training Tracks
![Page 35: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/35.jpg)
22 Training Cars
![Page 36: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/36.jpg)
CNN during training phase • Supervised learning, ground truth extracted from the game engine
![Page 37: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/37.jpg)
How to run the system?
![Page 38: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/38.jpg)
CNN during testing phase
• Unseen track with unseen cars
![Page 39: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/39.jpg)
How does the successfully trained system drive a car?
• The system runs at 10Hz, real time
![Page 40: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/40.jpg)
A Challenging Test: Perception during Night Driving
![Page 41: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/41.jpg)
![Page 42: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/42.jpg)
Next step: incorporating temporal information
![Page 43: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/43.jpg)
Other Interesting Stuffs
![Page 44: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/44.jpg)
A Taste of Stereo Vision
![Page 45: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/45.jpg)
3D Scene Reconstruction
• Stereo images • Color • 1382*512 • 10 FPS
![Page 46: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/46.jpg)
Visual Odometry
• Visual odometry computes the trajectory of the vehicle only based on image sequences
![Page 47: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/47.jpg)
Depth Map • Disparity map is computed from grayscale stereo image pairs • Depth map can be derived from disparity map and camera model
![Page 48: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/48.jpg)
Other Demos for Structure-from-Motion
• https://www.youtube.com/watch?v=i7ierVkXYa8
• https://www.youtube.com/watch?v=vpTEobpYoTg
![Page 49: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/49.jpg)
Other Demos for Structure-from-Motion
![Page 50: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/50.jpg)
Other Demos for Structure-from-Motion
![Page 51: Direct Perception for Autonomous Drivingalaink/Orf467F15/Deep Driving.pdfProf. K Prof. F Prof. P Prof. V Chenyi • Training data: a collection of images and labels (names ) Who is](https://reader036.vdocuments.mx/reader036/viewer/2022081400/5f8c1f8ab6f5ec33510fe865/html5/thumbnails/51.jpg)
Q & A