bing: binarized normed gradients for objectness estimation at 300fps cvpr 2014 oral

Post on 04-Jan-2016


BING: Binarized Normed Gradients for Objectness Estimation at 300fps

CVPR 2014 Oral

Outline

• 1. Introduction
• 2. Methodology
    2.1 Normed gradients (NG) and objectness
    2.2 Learning objectness measurement with NG
    2.3 Binarized normed gradients (BING)
• 3. Experimental Evaluation
• 4. Conclusion and Future Work

1. Introduction

* Motivation: Generic object detection

1. Introduction

An objectness measure that is generic over categories has recently become popular.

1. Introduction

* Objectness: a value which reflects how likely an image window covers an object of any category [3].

Objectness proposals help to:

improve computational efficiency

improve detection accuracy

*What is a good objectness measure?

Achieve high object detection rate (DR)

Produce a small number of proposals

Obtain high computational efficiency

Have good generalization ability to unseen object categories

1. Introduction

* We propose a surprisingly simple and powerful feature, “BING”, to help the search for objects using objectness scores.

* We observe that generic objects with well-defined closed boundaries share a surprisingly strong correlation in the norm of their gradients, after their corresponding image windows are resized to a small fixed size (e.g. 8x8).

Use the norm of the gradients as a simple 64D feature (NG feature).

Use a cascaded SVM framework to learn the objectness measure.

2. Methodology

* We scan over a predefined set of quantized window sizes (scales and aspect ratios), i.e. the target window sizes {(Wo, Ho)}. Each window is scored with a linear model w:

$s_l = \langle w, g_l \rangle$  (1)

$l = (i, x, y)$  (2)

s_l: filter score

g_l: NG feature

l: location

i: size

(x, y): position of a window

* We select a small set of proposals from each size i using non-maximal suppression (NMS).
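The scan-and-select step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the greedy NMS variant, and the `radius` and `top_k` parameters are assumptions made for the example. It scores every 8x8 window of a gradient-norm map with a linear filter and keeps a few well-separated maxima.

```python
def filter_score(w, g):
    """Linear filter score: dot product of model w and 64D feature g."""
    return sum(wi * gi for wi, gi in zip(w, g))


def score_map(ng_map, w, win=8):
    """Score every win x win window of a 2D gradient-norm map.

    Returns a dict mapping the window's top-left (x, y) to its score.
    """
    height, width = len(ng_map), len(ng_map[0])
    scores = {}
    for y in range(height - win + 1):
        for x in range(width - win + 1):
            g = [ng_map[y + dy][x + dx] for dy in range(win) for dx in range(win)]
            scores[(x, y)] = filter_score(w, g)
    return scores


def nms(scores, radius=2, top_k=5):
    """Greedy non-maximal suppression (hypothetical variant).

    Visits positions from highest to lowest score and keeps a position only
    if it is more than `radius` pixels (Chebyshev distance) away from every
    position already kept. Stops after `top_k` positions.
    """
    kept = []
    for (x, y), s in sorted(scores.items(), key=lambda kv: -kv[1]):
        if all(max(abs(x - kx), abs(y - ky)) > radius for (kx, ky), _ in kept):
            kept.append(((x, y), s))
        if len(kept) == top_k:
            break
    return kept
```

In a full pipeline this would be repeated once per quantized size i, with the image resized so that a window of that size maps to 8x8.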

2. Methodology

• 2.1 Normed gradients (NG) and objectness

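* Some window sizes (e.g. 10 x 500) are less likely than others (e.g. 100 x 100) to contain an object instance, so the filter score is calibrated per size:

$o_l = v_i \cdot s_l + t_i$  (3)

where v_i and t_i are a learned coefficient and a bias term for each quantized size i.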

* Objects are stand-alone things with well-defined closed boundaries and centers [3, 26, 32].

1. First, we resize the corresponding image windows to a small fixed size (e.g. 8x8).

2. The resized normed gradient map is then defined as the 64D normed gradients (NG) feature of its corresponding window.
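The two steps above can be sketched in plain Python. Several details are assumptions for illustration, since the slides do not state them: nearest-neighbour resizing, central differences with clamped borders for the gradient, and the norm taken as min(|gx| + |gy|, 255) so that each value fits in a byte. The helper names are hypothetical.

```python
def resize_nearest(img, out_h=8, out_w=8):
    """Nearest-neighbour resize of a 2D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    return [[img[(y * h) // out_h][(x * w) // out_w] for x in range(out_w)]
            for y in range(out_h)]


def normed_gradient(img):
    """Per-pixel gradient norm, clamped to one byte: min(|gx| + |gy|, 255)."""
    h, w = len(img), len(img[0])
    ng = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences; neighbours outside the image are clamped.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            ng[y][x] = min(abs(gx) + abs(gy), 255)
    return ng


def ng_feature(window):
    """64D NG feature: resize the window to 8x8, take gradient norms, flatten."""
    small = resize_nearest(window, 8, 8)
    return [v for row in normed_gradient(small) for v in row]


# A 16x32 window with a vertical edge in the middle.
window = [[0] * 16 + [200] * 16 for _ in range(16)]
feature = ng_feature(window)
```

A window of any size or aspect ratio yields a feature of the same length, which is what lets a single 64D linear model be applied at every quantized size.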

2. Methodology

* The NG feature has several advantages:

1. NG features are insensitive to changes in translation, scale, and aspect ratio, which is very useful for detecting objects of arbitrary categories.

2. The NG feature is very efficient to calculate and verify.

2. Methodology

• 2.2 Learning objectness measurement with NG

* Two-stage cascaded SVM [57].

Stage I. We learn a single model w for (1) using a linear SVM.

Pos: ground truth object windows

Neg: randomly sampled background windows

Stage II. We learn v_i and t_i in (3) using a linear SVM.

We evaluate (1) at size i on the training images and use the selected (NMS) proposals as training samples, their filter scores as 1D features, and their labels as determined by the training image annotations.
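Once Stage II has produced a coefficient v_i and bias t_i for each size, they are applied to the Stage I filter score as o_l = v_i * s_l + t_i. The sketch below only shows that application step; the parameter values are made up for illustration and are not learned from any data.

```python
def calibrated_objectness(s, size, v, t):
    """Calibrated score o_l = v_i * s_l + t_i for a window of quantized size i."""
    return v[size] * s + t[size]


# Hypothetical per-size parameters, keyed by (width, height).
v = {(10, 500): 0.5, (100, 100): 1.0}
t = {(10, 500): -1.0, (100, 100): 0.2}

# Two windows with the same Stage I filter score but different sizes.
same_score = 2.0
thin = calibrated_objectness(same_score, (10, 500), v, t)
square = calibrated_objectness(same_score, (100, 100), v, t)
```

With these illustrative values the square window ends up ranked above the thin one even though both received the same filter score, which is the effect the per-size calibration is meant to allow.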

2. Methodology

• 2.3 Binarized normed gradients (BING)

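* To exploit the advantages of binary model approximation [28, 59], we move from NG to BING. The linear model w is approximated by a set of basis vectors:

$w \approx \sum_{j=1}^{N_w} \beta_j a_j$

N_w: the number of basis vectors

a_j: basis vector, $a_j \in \{-1, 1\}^{64}$

β_j: corresponding coefficient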
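One common way to obtain such an approximation is a greedy residual scheme: take the sign pattern of the current residual as the next basis vector, project the residual onto it to get the coefficient, then subtract. The sketch below assumes that scheme; the slides do not spell out the procedure, and the function names and the default of two basis vectors are choices made for the example.

```python
def binarize_model(w, n_w=2):
    """Greedily approximate w by sum_j beta_j * a_j with a_j in {-1, +1}^D."""
    residual = list(w)
    betas, bases = [], []
    for _ in range(n_w):
        # Basis vector: sign pattern of the residual (zero maps to +1).
        a = [1 if r >= 0 else -1 for r in residual]
        # Projection coefficient <a, residual> / ||a||^2, where ||a||^2 = D.
        beta = sum(ai * ri for ai, ri in zip(a, residual)) / len(a)
        betas.append(beta)
        bases.append(a)
        residual = [ri - beta * ai for ri, ai in zip(residual, a)]
    return betas, bases


def squared_error(w, betas, bases):
    """Squared L2 distance between w and its binary approximation."""
    approx = [sum(b * a[k] for b, a in zip(betas, bases)) for k in range(len(w))]
    return sum((wi - ai) ** 2 for wi, ai in zip(w, approx))


# Toy 4D model (a real BING model would be 64D).
w = [0.5, -1.5, 1.0, -1.0]
betas, bases = binarize_model(w, n_w=2)
```

Each added basis vector can only reduce the residual, so the error shrinks (or stays equal) as N_w grows, at the cost of more binary dot products per window.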


2. Methodology

* How to binarize and calculate our NG features efficiently

We approximate the normed gradient values (each saved as a BYTE value) using the top Ng binary bits of the BYTE values.

E.g. Decimal: 210, Binary: 11010010, Top Ng = 4 bits: 1101

The 64D NG feature g_l can therefore be approximated by Ng binarized normed gradients (BING) features, one per retained bit. Keeping only the top 4 bits of the example value gives:

$210 \approx 1 \cdot 2^{8-1} + 1 \cdot 2^{8-2} + 0 \cdot 2^{8-3} + 1 \cdot 2^{8-4} = 208$
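The byte-level approximation can be checked with a few lines of Python. The function names are hypothetical; the arithmetic simply keeps the k-th most significant bit with weight 2^(8-k) for k = 1..Ng.

```python
def top_bits(value, n_g=4):
    """Top n_g bits of an 8-bit value, most significant first."""
    return [(value >> (8 - k)) & 1 for k in range(1, n_g + 1)]


def approximate(value, n_g=4):
    """Rebuild the value from its top n_g bits: sum of bit_k * 2^(8 - k)."""
    return sum(bit << (8 - k) for k, bit in enumerate(top_bits(value, n_g), start=1))


bits = top_bits(210)        # [1, 1, 0, 1]
approx = approximate(210)   # 128 + 64 + 0 + 16 = 208
```

The discarded low bits bound the error: with Ng = 4 the approximation is never more than 15 below the true byte value.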

2. Methodology

* First, a BING feature b_{x,y} and its last row r_{x,y} can be saved in a single INT64 variable and a single BYTE variable, respectively.

* Second, adjacent BING features and their rows have a simple cumulative relation:

The BITWISE SHIFT operator shifts r_{x-1,y} by one bit, automatically discarding the bit which does not belong to r_{x,y}, and makes room to insert the new bit at (x, y) using the BITWISE OR operator.

Similarly, BITWISE SHIFT shifts b_{x,y-1} by 8 bits, automatically discarding the bits which do not belong to b_{x,y}, and makes room to insert r_{x,y}.
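The cumulative relation can be sketched for a single bit plane as below. Python integers are unbounded, so the explicit masks stand in for the automatic overflow of BYTE and INT64 variables; the function names are hypothetical, and the brute-force version is included only to check the incremental one.

```python
MASK8 = 0xFF                   # emulates a BYTE
MASK64 = 0xFFFFFFFFFFFFFFFF    # emulates an INT64 (as an unsigned bit pattern)


def bing_features(bits):
    """Incrementally build the 8x8 bit pattern ending at each (x, y).

    bits is a 2D list of 0/1 values (one bit plane of the NG map).
    """
    height, width = len(bits), len(bits[0])
    feats = [[0] * width for _ in range(height)]
    prev_b = [0] * width           # b_{x, y-1} for every column x
    for y in range(height):
        r = 0                      # r_{x-1, y}, the row ending just left of x
        for x in range(width):
            r = ((r << 1) | bits[y][x]) & MASK8          # shift in the new bit
            b = ((prev_b[x] << 8) | r) & MASK64          # shift in the new row
            feats[y][x] = b
            prev_b[x] = b
    return feats


def bing_brute_force(bits, x, y):
    """Reference: pack the 8x8 window whose bottom-right corner is (x, y)."""
    value = 0
    for yy in range(y - 7, y + 1):
        for xx in range(x - 7, x + 1):
            value = (value << 1) | bits[yy][xx]
    return value
```

Each new feature costs two shifts and two ORs instead of re-reading 64 bits, which is where much of the speed comes from.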


3. Experimental Evaluation

• Proposal quality comparisons
• Generalization ability test

․Data set: VOC2007

․Evaluation metric: DR-#WIN (detection rate given #WIN proposals)

Aspects evaluated: proposal quality, generalization ability, efficiency
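A detection-rate-versus-number-of-windows curve can be computed along the following lines. The slides do not give the matching criterion, so the sketch assumes the usual PASCAL VOC rule that a ground-truth box counts as detected when some proposal overlaps it with intersection-over-union of at least 0.5; the function names and box format are also assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def detection_rate(gt_boxes, ranked_proposals, n_win, thresh=0.5):
    """Fraction of ground-truth boxes covered by the top n_win proposals."""
    top = ranked_proposals[:n_win]
    hits = sum(1 for g in gt_boxes if any(iou(g, p) >= thresh for p in top))
    return hits / len(gt_boxes) if gt_boxes else 0.0


# Toy example: two objects, three proposals sorted by objectness score.
gt = [(0, 0, 10, 10), (20, 20, 40, 40)]
proposals = [(0, 0, 10, 12), (100, 100, 110, 110), (22, 22, 40, 40)]
curve = [detection_rate(gt, proposals, n) for n in (1, 2, 3)]
```

Plotting such a curve over a whole test set, for increasing numbers of windows, gives the DR-#WIN comparison between methods.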

3. Experimental Evaluation

• Computational time


4. Conclusion and Future Work

․Limitations

For some object categories (e.g. a snake, wires, etc.), a bounding box might not localize the object instances as accurately as a segmentation region.

․We present a surprisingly simple, fast, and high quality objectness measure using 8x8 BING features.

․It needs only a few atomic operations (i.e. ADD, BITWISE, etc.).

․The binary operations and memory efficiency make our method suitable for running on low-power devices.

․It is suitable for real-time multi-category object detection applications and large-scale image collections.
