![Page 1: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/1.jpg)
SSD:Single Shot MultiBox Detector
Presenter: Hongjing Zhang, Chen Zhang
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Redd, Cheng-Yang Fu, Alexander C.Berg
![Page 2: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/2.jpg)
1. Review of faster R-CNN2. Model of SSD
a. Multi-scale feature mapb. Bounding boxesc. Network architecture
3. Training of SSDa. Matching strategyb. Loss functionc. Hard negative miningd. Data augmentation
4. Experiments
Table of contents
![Page 3: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/3.jpg)
Performance On VOC2007
Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf
![Page 4: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/4.jpg)
Quick Review of Faster R-CNN
Fast R-CNN + Region Proposal Netsource: faster R-CNN paper by S. Ren
Region Proposal Net: k anchor boxes of different scale and aspect ratio at each position.
![Page 5: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/5.jpg)
SSD: Multi-scale Feature Map
Feature maps from different conv layers of different sizes.
Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf
conv kernel
default box
![Page 6: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/6.jpg)
SSD: Multi-scale Feature Map
Effects on mAP when using different feature maps.
Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf
![Page 7: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/7.jpg)
SSD: Multi-scale Feature Map
Anchor box setting comparison: SSD vs. Faster R-CNN
Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf
![Page 8: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/8.jpg)
SSD: Multi-scale Feature Map
Anchor box setting: aspect ratios
Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf
![Page 9: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/9.jpg)
Default Bounding Boxes - scale and shape
Aspect ratio: a in One extra default case:
![Page 10: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/10.jpg)
Default Bounding Boxes - scale and shape
Aspect ratio: a in One extra default case:
![Page 11: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/11.jpg)
Default Bounding Boxes - scale and shape
Aspect ratio: a in One extra default case:
![Page 12: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/12.jpg)
Default Bounding Boxes - scale and shape
Aspect ratio: a in One extra default case:
![Page 13: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/13.jpg)
Default Bounding Boxes - scale and shape
Aspect ratio: a in
Scale of default boxes computed as a linear function:
Smin = 0.2, Smax = 0.9
One extra default case:
![Page 14: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/14.jpg)
Default Bounding BoxesWhy small boxes in large feature maps?
Source: https://towardsdatascience.com/understanding-ssd-multibox-real-time-object-detection-in-deep-learning-495ef744fab
![Page 15: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/15.jpg)
Default Bounding BoxesWhy small boxes in large feature maps?● large feature map - small receptive field - small object● small feature map - large receptive field - large object
![Page 16: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/16.jpg)
Convolutional predictors for detection
3 x 3 x p kernel Each box: ● cls: # classes (C1, C2, …, Cp)● reg: 4 parameters delta(cx, cy, w, h)
# conv kernels: (Classes+4) x (# Default Boxes)
![Page 17: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/17.jpg)
SSD Network Structure
SSD architecture taken from the original paper
![Page 18: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/18.jpg)
VGG-16 network: by Oxford's Visual Geometry Group (VGG)
Source: https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/
● 3*3 conv kernel● 2*2 pooling with stride = 2
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
![Page 19: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/19.jpg)
Bounding Box Matching Strategy
● Jaccard Overlap (Intersection over Union)
● Matching default boxes to ground truth boxes with IoU > threshold(0.5)
![Page 20: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/20.jpg)
Training Objective● After pairing groundtruth and default boxes, we can write the objective function:
: matching the i-th default box to the j-th ground truth box of category p.
N: matched default boxes. c: class confidence.l: predicted bounding boxg: ground truth bounding box
![Page 21: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/21.jpg)
Training Objective
![Page 22: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/22.jpg)
Hard Negative Mining
● Instead of using all the negative examples, we sort them using the highest confidence loss for each default box and pick the top ones.
● The ratio between negative examples and positive examples is 3:1.
● This method leads to faster optimization and a more stable training.
![Page 23: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/23.jpg)
Data Augmentation
● Making the model more robust to various input object sizes and outputs:
1.Original Images.
2.Sample patch with minimal jaccard scores as 0.1, 0.3, 0.5, 0.7 or 0.9.
3.Randomly sample a patch.
![Page 24: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/24.jpg)
Experiments
Table from paper SSD
PASCAL VOC2007 test detection results.
mAP: SSD > Faster > FastmAP: SSD512 > SSD300 ResolutionMore training data is better07+12+COCO
![Page 25: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/25.jpg)
Experiments
Data AugmentationMore default box shapes is better
Table from Paper SSD
Effects of various design choices and components on SSD performance.
![Page 26: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/26.jpg)
Experiments
Table from paper SSD
Effects of using multiple output layers.
Multiple Output Layers Are Better
![Page 27: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/27.jpg)
Experiments
Plot from paper SSD
Visulization of performance for SSD512 on animals, vehicles and furniture from VOC2007 test.
![Page 28: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/28.jpg)
ExperimentsVisualization of performance for SSD512 on animals, the cumulative fraction of detections.
Plot from paper SSD
![Page 29: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/29.jpg)
ExperimentsVisualization of performance for SSD512 on animals, the distribution of top-ranked false positive types.
Plot from paper SSD
![Page 30: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/30.jpg)
ExperimentsInference Time(Results on VOC2007 test).
Table from paper SSD
![Page 31: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/31.jpg)
Detection Results
Images from paper SSD
![Page 32: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/32.jpg)
Strength Drawbacks ● High Speed
● High Accuracy
● Simple Training(single shot)
● The classification task for small objects is relatively hard for SSD.
![Page 33: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/33.jpg)
Questions
![Page 34: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/34.jpg)
Atrous Algorithm(Dilated Convolution)
![Page 35: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/35.jpg)
Receptive Field
![Page 36: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/36.jpg)
L2 Normalization on Conv4
Figure from paper ParseNet
![Page 37: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/37.jpg)
Why Conv6,Conv7?
![Page 38: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/38.jpg)
About 1x1 conv kernels
In GoogLeNet architecture, 1x1 convolution is used for two purposes
● To reduce the dimensions inside this “inception module”.● To add more non-linearity by having ReLU immediately after every 1x1 convolution.
![Page 39: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/39.jpg)
ExperimentsSensitivity and impact of different object characteristics on VOC2007 test set.
Plot from paper SSD
![Page 40: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander C ...yjlee/teaching/ecs289g-winter2018/SSD.pdfSSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen Zhang Wei Liu,](https://reader030.vdocuments.mx/reader030/viewer/2022040510/5e57b01bf8475c0be319398f/html5/thumbnails/40.jpg)
Object Detection Leaderboard(VOC2012)