SSD: Single Shot MultiBox Detector. Presenters: Hongjing Zhang, Chen Zhang. Authors: Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg

Post on 29-Sep-2020

Page 1: SSD:Single Shot MultiBox Redd, Cheng-Yang Fu, Alexander ...web.cs.ucdavis.edu/~yjlee/teaching/ecs289g-winter2018/...SSD:Single Shot MultiBox Detector Presenter: Hongjing Zhang, Chen

SSD: Single Shot MultiBox Detector

Presenters: Hongjing Zhang, Chen Zhang

Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg


Table of contents

1. Review of Faster R-CNN
2. Model of SSD
   a. Multi-scale feature map
   b. Bounding boxes
   c. Network architecture
3. Training of SSD
   a. Matching strategy
   b. Loss function
   c. Hard negative mining
   d. Data augmentation
4. Experiments


Performance On VOC2007

Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf


Quick Review of Faster R-CNN

Fast R-CNN + Region Proposal Net

Source: Faster R-CNN paper by S. Ren

Region Proposal Net: k anchor boxes of different scale and aspect ratio at each position.
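
As a rough illustration of "k anchor boxes of different scale and aspect ratio at each position", the sketch below builds the anchors for one position. The function name `make_anchors` is hypothetical, and the three scales and three ratios (so k = 9) are assumed from the defaults described in the Faster R-CNN paper, not from this slide.

```python
import math

def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchors centred at (cx, cy).

    Each anchor is (xmin, ymin, xmax, ymax) in pixels.
    The scale and ratio values are assumed defaults, not taken from the slide.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            # Keep the area at s * s while setting height / width = r.
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

anchors = make_anchors(300, 200)
```

The same set of k shapes is reused at every position of the feature map; only the centre changes.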


SSD: Multi-scale Feature Map

Feature maps from different conv layers of different sizes.

Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf

conv kernel

default box


SSD: Multi-scale Feature Map

Effects on mAP when using different feature maps.

Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf


SSD: Multi-scale Feature Map

Anchor box setting comparison: SSD vs. Faster R-CNN

Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf


SSD: Multi-scale Feature Map

Anchor box setting: aspect ratios

Source: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf


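Default Bounding Boxes - scale and shape

Aspect ratio: $a_r \in \{1, 2, 3, \frac{1}{2}, \frac{1}{3}\}$, giving width $w_k^a = s_k \sqrt{a_r}$ and height $h_k^a = s_k / \sqrt{a_r}$

Scale of default boxes computed as a linear function:

$$s_k = s_{min} + \frac{s_{max} - s_{min}}{m - 1}(k - 1), \quad k \in [1, m]$$

$s_{min} = 0.2$, $s_{max} = 0.9$, where m is the number of feature maps used for prediction

One extra default case: for aspect ratio 1, a box with scale $s'_k = \sqrt{s_k s_{k+1}}$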
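
The scale rule can be checked numerically. This is a minimal sketch that assumes the formulas from the SSD paper: scales spaced linearly between Smin = 0.2 and Smax = 0.9, width and height of s·sqrt(a) and s/sqrt(a) for aspect ratio a, and one extra square box at the geometric mean of two neighbouring scales. The function names and the choice of m = 6 feature maps are illustrative assumptions.

```python
import math

def default_box_scales(m, s_min=0.2, s_max=0.9):
    """Return the scales s_1..s_m, linearly spaced from s_min to s_max."""
    if m == 1:
        return [s_min]
    return [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]

def default_box_shapes(s_k, s_next, aspect_ratios=(1.0, 2.0, 3.0, 0.5, 1.0 / 3.0)):
    """Return (width, height) pairs, relative to the image size, for one feature map."""
    shapes = [(s_k * math.sqrt(a), s_k / math.sqrt(a)) for a in aspect_ratios]
    # Extra default case: a square box at the geometric mean of this scale and the next.
    s_extra = math.sqrt(s_k * s_next)
    shapes.append((s_extra, s_extra))
    return shapes

scales = default_box_scales(6)                      # assumes m = 6 feature maps
shapes = default_box_shapes(scales[0], scales[1])   # 6 box shapes for the first map
```

Every shape for a given aspect ratio has the same area, s_k squared, so changing the ratio reshapes the box without changing how much of the image it covers.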


Default Bounding Boxes

Why small boxes in large feature maps?

Source: https://towardsdatascience.com/understanding-ssd-multibox-real-time-object-detection-in-deep-learning-495ef744fab


Default Bounding Boxes

Why small boxes in large feature maps?

● large feature map - small receptive field - small object

● small feature map - large receptive field - large object


Convolutional predictors for detection

3 x 3 x p kernel

Each box:

● cls: # classes (C1, C2, …, Cp)

● reg: 4 parameters delta(cx, cy, w, h)

# conv kernels: (Classes+4) x (# Default Boxes)
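
To make the filter count concrete, here is a small sketch of the arithmetic. The specific numbers are assumptions taken from the SSD300 configuration described in the SSD paper (21 classes including background, six feature maps of side 38, 19, 10, 5, 3 and 1, with 4 or 6 default boxes per location); the function names are illustrative.

```python
def num_predictor_filters(num_classes, boxes_per_location):
    """Filters needed at one feature map: (classes + 4) per default box."""
    return (num_classes + 4) * boxes_per_location

def total_default_boxes(feature_map_sizes, boxes_per_location):
    """Total default boxes over all square feature maps."""
    return sum(size * size * k
               for size, k in zip(feature_map_sizes, boxes_per_location))

# Assumed SSD300 settings: 20 VOC classes + background, six prediction layers.
filters = num_predictor_filters(num_classes=21, boxes_per_location=6)   # 150
total = total_default_boxes([38, 19, 10, 5, 3, 1], [4, 6, 6, 6, 4, 4])  # 8732
```

Under these assumptions a layer with 6 default boxes per location needs 150 filters of size 3 x 3 x p, and the network scores 8732 boxes per image.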


SSD Network Structure

SSD architecture taken from the original paper


VGG-16 network: by Oxford's Visual Geometry Group (VGG)

Source: https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/

● 3*3 conv kernel

● 2*2 pooling with stride = 2

Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).


Bounding Box Matching Strategy

● Jaccard Overlap (Intersection over Union)

● Match each default box to a ground truth box when their IoU exceeds a threshold (0.5)
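
The two bullets above can be sketched as follows. Boxes are assumed to be (xmin, ymin, xmax, ymax) tuples and the function names are illustrative. This sketch covers only the threshold rule from the slide; the SSD paper additionally guarantees each ground truth box its single best-overlapping default box, which is left out here.

```python
def iou(box_a, box_b):
    """Jaccard overlap of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_default_boxes(default_boxes, gt_boxes, threshold=0.5):
    """For each default box, return the index of its matched ground truth box, or None."""
    matches = []
    for d in default_boxes:
        best_j, best_overlap = None, threshold
        for j, g in enumerate(gt_boxes):
            overlap = iou(d, g)
            if overlap > best_overlap:  # must beat the threshold and any earlier match
                best_j, best_overlap = j, overlap
        matches.append(best_j)
    return matches

defaults = [(0, 0, 2, 2), (5, 5, 6, 6), (0, 0, 2, 1.5)]
matches = match_default_boxes(defaults, [(0, 0, 2, 2)])
```

Default boxes that end up with `None` are the negatives that hard negative mining later selects from.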


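Training Objective

● After pairing ground truth and default boxes, we can write the objective function:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$

$x_{ij}^p \in \{0, 1\}$: indicator for matching the i-th default box to the j-th ground truth box of category p.

N: number of matched default boxes.
c: class confidences.
l: predicted bounding box.
g: ground truth bounding box.
α: weight of the localization term.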


Training Objective
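
The equations on this slide did not survive extraction. For reference, the two terms of the objective as defined in the SSD paper are written out below, using the same symbols as the previous slide (x, c, l, g, N) plus d for the default box; this is a transcription from the paper rather than from the slide itself. The localization term is a Smooth L1 loss on box offsets and the confidence term is a softmax loss over classes, with class 0 as background.

```latex
L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}}
    x_{ij}^{p} \, \mathrm{smooth}_{L1}\!\left(l_i^{m} - \hat{g}_j^{m}\right)

\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^{w}}, \quad
\hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^{h}}, \quad
\hat{g}_j^{w} = \log\frac{g_j^{w}}{d_i^{w}}, \quad
\hat{g}_j^{h} = \log\frac{g_j^{h}}{d_i^{h}}

L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_i^{p}\right)
                 - \sum_{i \in Neg} \log\left(\hat{c}_i^{0}\right),
\qquad
\hat{c}_i^{p} = \frac{\exp\left(c_i^{p}\right)}{\sum_{p} \exp\left(c_i^{p}\right)}
```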


Hard Negative Mining

● Instead of using all the negative examples, we sort them by the confidence loss of each default box and pick the top ones.

● The ratio between negative and positive examples is at most 3:1.

● This leads to faster optimization and more stable training.
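
A minimal sketch of the selection step, assuming the per-box confidence losses and the positive/negative labels have already been computed; the function name and the example values are illustrative.

```python
def hard_negative_mining(conf_losses, is_positive, neg_pos_ratio=3):
    """Return indices of the negatives to keep, hardest (highest loss) first."""
    num_pos = sum(1 for pos in is_positive if pos)
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Sort negatives by confidence loss, largest first.
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    # Keep at most neg_pos_ratio negatives per positive.
    return negatives[: neg_pos_ratio * num_pos]

losses = [0.1, 2.0, 0.5, 1.5, 0.05, 0.9, 0.3]
positive = [True, False, False, False, False, False, False]
kept = hard_negative_mining(losses, positive)  # → [1, 3, 5]
```

With one positive box, only the three negatives with the largest loss contribute to training; the three easiest negatives are ignored.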


Data Augmentation

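● To make the model more robust to various input object sizes and shapes, each training image is randomly sampled by one of the following options:

1. Use the entire original image.

2. Sample a patch so that the minimum Jaccard overlap with the objects is 0.1, 0.3, 0.5, 0.7 or 0.9.

3. Randomly sample a patch.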
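
The three options can be sketched as a single sampling routine. Several details here are assumptions rather than content from the slide: coordinates are relative to the image (0 to 1), patch sides are drawn from [0.1, 1] with aspect ratio between 1/2 and 2 (as described in the SSD paper), and "minimal jaccard" is read as requiring every object to overlap the patch by at least the chosen value. Other implementations may accept a patch when any single object satisfies the constraint.

```python
import random

def iou(a, b):
    """Jaccard overlap of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def sample_patch(gt_boxes, rng, max_tries=50):
    """Pick one augmentation option and return a patch in relative coordinates."""
    full_image = (0.0, 0.0, 1.0, 1.0)
    option = rng.choice(["original", 0.1, 0.3, 0.5, 0.7, 0.9, "random"])
    if option == "original":
        return full_image
    min_overlap = 0.0 if option == "random" else option
    for _ in range(max_tries):
        w = rng.uniform(0.1, 1.0)
        h = rng.uniform(0.1, 1.0)
        if not 0.5 <= w / h <= 2.0:   # assumed aspect-ratio limit
            continue
        x = rng.uniform(0.0, 1.0 - w)
        y = rng.uniform(0.0, 1.0 - h)
        patch = (x, y, x + w, y + h)
        if all(iou(patch, g) >= min_overlap for g in gt_boxes):
            return patch
    return full_image  # fall back to the original image if no patch qualifies

rng = random.Random(0)
patch = sample_patch([(0.2, 0.2, 0.8, 0.8)], rng)
```

After sampling, the patch would be resized to the network input size; that step, and the clipping of ground truth boxes to the patch, are not shown.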


Experiments

Table from paper SSD

PASCAL VOC2007 test detection results.

● mAP: SSD > Faster R-CNN > Fast R-CNN

● Resolution: mAP of SSD512 > SSD300

● More training data is better: 07+12+COCO


Experiments

● Data Augmentation

● More default box shapes is better

Table from Paper SSD

Effects of various design choices and components on SSD performance.


Experiments

Table from paper SSD

Effects of using multiple output layers.

Multiple Output Layers Are Better


Experiments

Plot from paper SSD

Visualization of performance for SSD512 on animals, vehicles and furniture from VOC2007 test.


Experiments

Visualization of performance for SSD512 on animals: the cumulative fraction of detections.

Plot from paper SSD


Experiments

Visualization of performance for SSD512 on animals: the distribution of top-ranked false positive types.

Plot from paper SSD


Experiments

Inference Time (Results on VOC2007 test).

Table from paper SSD


Detection Results

Images from paper SSD


Strengths

● High Speed

● High Accuracy

● Simple Training (single shot)

Drawbacks

● The classification task for small objects is relatively hard for SSD.


Questions


Atrous Algorithm (Dilated Convolution)


Receptive Field


L2 Normalization on Conv4

Figure from paper ParseNet


Why Conv6, Conv7?


About 1x1 conv kernels

In the GoogLeNet architecture, 1x1 convolution is used for two purposes:

● To reduce the dimensions inside the "inception module".

● To add more non-linearity by having ReLU immediately after every 1x1 convolution.
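
Both purposes are visible in a small sketch: a 1x1 convolution is the same linear map applied independently to the channel vector of every pixel, so choosing fewer output channels than input channels reduces the dimension, and the ReLU that follows adds the non-linearity. The function name, the nested-list layout (height x width x channels) and the example values are illustrative assumptions.

```python
def conv1x1_relu(feature_map, weights, biases):
    """Apply a 1x1 convolution followed by ReLU.

    feature_map: H x W x C_in nested lists.
    weights:     C_out x C_in nested lists (one row per output channel).
    biases:      C_out values.
    """
    output = []
    for row in feature_map:
        out_row = []
        for pixel in row:
            out_pixel = []
            for w_row, b in zip(weights, biases):
                value = sum(w * x for w, x in zip(w_row, pixel)) + b
                out_pixel.append(max(0.0, value))  # ReLU
            out_row.append(out_pixel)
        output.append(out_row)
    return output

# 1 x 2 image with 3 input channels, reduced to 2 output channels.
fmap = [[[1.0, 2.0, 3.0], [0.0, -1.0, 1.0]]]
out = conv1x1_relu(fmap, weights=[[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], biases=[0.0, 1.0])
# → [[[0.0, 4.0], [0.0, 1.0]]]
```

The spatial size is unchanged (1 x 2); only the channel count drops from 3 to 2.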


Experiments

Sensitivity and impact of different object characteristics on VOC2007 test set.

Plot from paper SSD


Object Detection Leaderboard (VOC2012)