CVPR 2009, Miami, Florida

Subhransu Maji and Jitendra Malik
University of California at Berkeley, Berkeley, CA-94720

Object Detection Using a Max-Margin Hough Transform


Overview
• Overview of probabilistic Hough transform
• Learning framework
• Experiments
• Summary


Our Approach: Hough Transform
• Popular for detecting parameterized shapes (Hough’59, Duda & Hart’72, Ballard’81, …)
• Local parts vote for object pose
• Complexity: # parts * # votes
  – Can be significantly lower than brute force search over pose (for example sliding window detectors)
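The voting step and its cost can be illustrated with a minimal sketch. The function names, the vote table `offsets`, and the `bin_size` parameter are illustrative assumptions, not taken from the talk; only the 2D position is voted on here.

```python
from collections import defaultdict

def hough_vote(parts, offsets, bin_size=10):
    """Accumulate votes for the object center.

    parts   : list of (part_id, x, y) detections in the image
    offsets : dict part_id -> list of (dx, dy) displacements to the center
    The double loop costs (# parts) * (# votes per part).
    """
    acc = defaultdict(float)
    for part_id, x, y in parts:
        votes = offsets.get(part_id, [])
        for dx, dy in votes:
            cx, cy = x + dx, y + dy
            key = (int(cx // bin_size), int(cy // bin_size))
            acc[key] += 1.0 / len(votes)  # each part spreads unit mass over its votes
    return acc

def best_center(acc, bin_size=10):
    """Return the center of the accumulator bin with the largest vote mass."""
    (bx, by), score = max(acc.items(), key=lambda kv: kv[1])
    return ((bx + 0.5) * bin_size, (by + 0.5) * bin_size), score

# Two parts agree on one center; a third votes elsewhere.
parts = [(0, 20, 30), (1, 60, 30), (0, 100, 100)]
offsets = {0: [(20, 10)], 1: [(-20, 10)]}
center, score = best_center(hough_vote(parts, offsets))
```

Only locations that receive votes are ever touched, which is where the saving over an exhaustive scan of all poses comes from.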


Generalized to object detection
Use Hough space voting to find objects (Lowe’99, Leibe et al.’04,’08, Opelt & Pinz’08)
Implicit Shape Model (Leibe et al.’04,’08)
Learning
• Learn appearance codebook
  – Cluster over interest points on training images
• Learn spatial distributions
  – Match codebook to training images
  – Record matching positions on object
  – Centroid is given
[Figure: spatial occurrence distributions over (x, y, s) for several codebook entries]
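The second learning step (match the codebook to training images and record positions relative to the given centroid) might look like the sketch below. It assumes the codebook has already been clustered, uses nearest-neighbor matching under squared L2 distance, and ignores the scale dimension s; all names are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to descriptor (squared L2)."""
    dists = [sum((d - c) ** 2 for d, c in zip(descriptor, entry))
             for entry in codebook]
    return dists.index(min(dists))

def learn_spatial_distributions(training_objects, codebook):
    """Record, per codeword, offsets from matched features to the centroid.

    training_objects: list of (centroid, features), where centroid is (cx, cy)
    and features is a list of (descriptor, (x, y)).
    """
    occurrences = {i: [] for i in range(len(codebook))}
    for (cx, cy), features in training_objects:
        for descriptor, (x, y) in features:
            i = nearest_codeword(descriptor, codebook)
            occurrences[i].append((cx - x, cy - y))
    return occurrences

codebook = [(0.0, 0.0), (1.0, 1.0)]  # assumed output of the clustering step
training_objects = [
    ((50, 50), [((0.1, 0.0), (30, 40)), ((0.9, 1.1), (70, 40))]),
    ((80, 60), [((0.0, 0.2), (60, 50))]),
]
occ = learn_spatial_distributions(training_objects, codebook)
```

The recorded offsets per codeword are exactly the votes that codeword casts at detection time.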


Detection Pipeline
1. Interest points (e.g. SIFT, GB, local patches)
2. Matched codebook entries (KD tree)
3. Probabilistic voting
B. Leibe, A. Leonardis, and B. Schiele. Combined object categorization and segmentation with an implicit shape model, 2004.


Probabilistic Hough Transform
C: codebook; f: features; l: locations
Detection score, with terms labeled:
• Position posterior
• Codeword match
• Codeword likelihood
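The equation itself did not survive extraction. A form consistent with the three labeled terms is sketched below. It assumes the usual ISM-style factorization: a uniform prior over features and locations, a codeword match that depends only on appearance, and a position posterior that depends only on the matched codeword and its location. The indices i (codewords) and j (features) are notation introduced here.

```latex
S(O, x) = \sum_j p(x, O, f_j, l_j)
        \propto \sum_{i,j}
          \underbrace{p(x \mid O, C_i, l_j)}_{\text{position posterior}}\;
          \underbrace{p(C_i \mid f_j)}_{\text{codeword match}}\;
          \underbrace{p(O \mid C_i, l_j)}_{\text{codeword likelihood}}
```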


Learning Feature Weights
Given:
• Appearance codebook, C
• Posterior distribution of object center for each codeword, P(x|…)
To do: learn codebook weights such that the Hough transform detector works well (i.e. better detection rates)
Contributions:
1. Show that these weights can be learned optimally using a max-margin framework.
2. Demonstrate that this leads to improved accuracy on various datasets.


Learning Feature Weights : First Try
Naïve Bayes weights: [equation not recovered]
• Encourages relatively rare parts
• However, rare parts may not be good predictors of the object location
• Need to jointly consider both priors and distribution of location centers
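Why this weighting favors rare parts can be seen in a small sketch. It assumes the Naïve Bayes weight of codeword C_i is proportional to p(C_i | O) / p(C_i), estimated from match counts; that formula and the counts below are assumptions for illustration, since the slide's equation was not recovered.

```python
def naive_bayes_weights(object_counts, total_counts):
    """w_i proportional to p(C_i | O) / p(C_i), estimated from match counts.

    object_counts[i]: matches of codeword i on the object
    total_counts[i] : matches of codeword i anywhere (object + background)
    """
    n_obj = sum(object_counts)
    n_all = sum(total_counts)
    weights = []
    for on_obj, overall in zip(object_counts, total_counts):
        p_c_given_o = on_obj / n_obj
        p_c = overall / n_all
        weights.append(p_c_given_o / p_c if p_c > 0 else 0.0)
    return weights

# Codeword 0 is common; codeword 1 is rare overall.
object_counts = [50, 5]
total_counts = [500, 10]
w = naive_bayes_weights(object_counts, total_counts)
```

The rare codeword receives the larger weight, and nothing in the computation looks at how well it localizes the object center.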


Learning Feature Weights : Second Try
• Location invariance assumption
• Overall score is linear given the matched codebook entries
[Equation: the position posterior and codeword match terms are grouped as activations; the codeword likelihood terms are the feature weights]
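The linearity claim can be made concrete with a short sketch. It assumes the activation of codeword i at a candidate center x is the sum over features of position posterior times codeword match, and that the score is the weighted sum of activations; `toy_posterior` and its offsets are hypothetical stand-ins for the learned spatial distributions.

```python
def activations(matches, position_posterior, x, n_codewords):
    """a_i(x) = sum_j p(x | O, C_i, l_j) * p(C_i | f_j).

    matches: list of (i, l_j, p_match) = codeword index, feature location,
             and match probability p(C_i | f_j).
    position_posterior: function (x, i, l_j) -> p(x | O, C_i, l_j).
    """
    a = [0.0] * n_codewords
    for i, l_j, p_match in matches:
        a[i] += position_posterior(x, i, l_j) * p_match
    return a

def score(weights, a):
    """S(O, x) = w . a(x), which is linear in the weights."""
    return sum(w * ai for w, ai in zip(weights, a))

def toy_posterior(x, i, l_j):
    # Hypothetical posterior: 1 if the vote cast from l_j lands exactly on x.
    offsets = {0: (20, 10), 1: (-20, 10)}
    dx, dy = offsets[i]
    return 1.0 if (l_j[0] + dx, l_j[1] + dy) == x else 0.0

matches = [(0, (20, 30), 1.0), (1, (60, 30), 0.5)]
a = activations(matches, toy_posterior, (40, 40), n_codewords=2)
s = score([2.0, 1.0], a)
```

Because the activations do not depend on the weights, they can be computed once per training location and reused while the weights are learned.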


Max-Margin Training
Training:
1. Construct dictionary
2. Record codeword distributions on training examples
3. Compute “a” vectors on positive and negative training examples
4. Learn codebook weights by max-margin training
Step labels on the slide: Standard ISM model (Leibe et al.’04); Our Contribution
[Equation: max-margin objective; annotations: class label {+1,-1}, activations, non negative]
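Step 4 can be sketched as a soft-margin problem over the activation vectors with a non-negativity constraint on the weights. The objective used here, 0.5·||w||² + C·Σ hinge, and the projected subgradient solver are assumptions chosen for brevity; the talk does not say which solver was used, and the toy data are invented.

```python
def train_max_margin(A, y, C=1.0, lr=0.01, epochs=2000):
    """Minimize 0.5*||w||^2 + C * sum_i max(0, 1 - y_i*(w.a_i + b)), with w >= 0.

    A: list of activation vectors; y: list of labels in {+1, -1}.
    Solver: projected subgradient descent (a swapped-in, simple choice).
    """
    n = len(A[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = list(w), 0.0              # gradient of the regularizer
        for a, label in zip(A, y):
            margin = label * (sum(wi * ai for wi, ai in zip(w, a)) + b)
            if margin < 1.0:               # hinge term is active
                for k in range(n):
                    gw[k] -= C * label * a[k]
                gb -= C * label
        # Gradient step, then projection onto the constraint w >= 0.
        w = [max(0.0, wi - lr * g) for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

# Toy data: the first activation separates the classes, the second does not.
A = [[1.0, 0.2], [0.9, 0.8], [0.1, 0.9], [0.0, 0.3]]
y = [+1, +1, -1, -1]
w, b = train_max_margin(A, y)
```

On this toy set the informative activation ends up with the larger weight, while the uninformative one is held at the non-negativity bound.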


Experiment Datasets
• ETHZ Shape Dataset (Ferrari et al., ECCV 2006): 255 images, over 5 classes (Apple logo, Bottle, Giraffe, Mug, Swan)
• UIUC Single Scale Cars Dataset (Agarwal & Roth, ECCV 2002): 1050 training, 170 test images
• INRIA Horse Dataset (Jurie & Ferrari): 170 positive + 170 negative images (50 + 50 for training)


Experimental Results
Hough transform details
• Interest points: Geometric Blur descriptors at sparse sample of edges (Berg & Malik’01)
• Codebook constructed using k-means
• Voting over position and aspect ratio
• Search over scales
Correct detections (PASCAL criterion)
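The PASCAL criterion counts a detection as correct when its bounding box overlaps the ground truth by more than 50% intersection over union. A minimal sketch follows, assuming boxes are given as continuous (x1, y1, x2, y2) corners; the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_correct(detection, ground_truth, threshold=0.5):
    """PASCAL-style test: overlap must exceed the threshold."""
    return iou(detection, ground_truth) > threshold

gt = (0, 0, 100, 100)
good = (10, 10, 110, 110)   # shifted by 10 pixels
bad = (60, 60, 160, 160)    # shifted by 60 pixels
```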


Learned Weights (ETHZ shape)
[Figure: Important Parts; panels: Naïve Bayes, Max-Margin; color scale: blue (low), dark red (high); annotation: influenced by clutter (rare structures)]


Learned Weights (UIUC cars)
[Figure: Important Parts; panels: Naïve Bayes, Max-Margin; color scale: blue (low), dark red (high)]


Learned Weights (INRIA horses)
[Figure: Important Parts; panels: Naïve Bayes, Max-Margin; color scale: blue (low), dark red (high)]


Detection Results (ETHZ dataset)


Recall @ 1.0 False Positives Per Window


Detection Results (INRIA Horses)

[Plot: legend entry “Our Work”]


Detection Results (UIUC Cars)

[Plot: panel label “INRIA horses”; legend entry “Our Work”]


Hough Voting + Verification Classifier
ETHZ Shape Dataset
• IKSVM was run on top 30 windows + local search
• KAS: Ferrari et al., PAMI’08
• TPS-RPM: Ferrari et al., CVPR’07
Recall @ 0.3 False Positives Per Image
• Better fitting bounding box
• Implicit sampling over aspect-ratio



IKSVM was run on top 30 windows + local search
[Plot: legend entry “Our Work”]



UIUC Single Scale Car Dataset
• IKSVM was run on top 10 windows + local search
• 1.7% improvement
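The two-stage idea (cheap Hough voting proposes windows, a stronger classifier re-scores only the best few) is sketched below. It assumes IKSVM refers to an intersection kernel SVM; the support vectors, coefficients, and window histograms are made up, and the local search step is omitted.

```python
def intersection_kernel(h1, h2):
    """Histogram intersection kernel: sum_i min(h1_i, h2_i)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def svm_score(h, support_vectors, coeffs, bias):
    """Kernel SVM decision value; coeffs hold alpha_i * y_i."""
    return sum(c * intersection_kernel(h, sv)
               for c, sv in zip(coeffs, support_vectors)) + bias

def verify_top_k(candidates, k, support_vectors, coeffs, bias):
    """Re-score only the k best Hough candidates with the verifier.

    candidates: list of (hough_score, window_histogram).
    Returns (verifier_score, hough_score) pairs, best verifier score first.
    """
    top = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    rescored = [(svm_score(h, support_vectors, coeffs, bias), s) for s, h in top]
    return sorted(rescored, reverse=True)

support_vectors = [[0.5, 0.5, 0.0], [0.0, 0.2, 0.8]]   # hypothetical
coeffs = [1.0, -1.0]                                   # hypothetical
candidates = [(3.0, [0.1, 0.1, 0.8]),
              (2.0, [0.6, 0.4, 0.0]),
              (1.0, [0.5, 0.5, 0.0])]
ranked = verify_top_k(candidates, 2, support_vectors, coeffs, bias=0.0)
```

In this toy case the verifier reverses the Hough ranking of the two windows it sees, and the third window is never evaluated, which is the source of the speed advantage.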


Summary
• Hough transform based detectors offer good detection performance and speed.
• To get better performance one may learn
  – Discriminative dictionaries (two talks ago, Gall et al.’09)
  – Weights on codewords (our work)
• Our approach directly optimizes detection performance using a max-margin formulation
• Any weak predictor of object center can be used in this framework
  – E.g. regions (one talk ago, Gu et al., CVPR’09)

Acknowledgements
Work partially supported by: ARO MURI W911NF-06-1-0076 and ONR MURI N00014-06-1-0734
Computer Vision Group @ UC Berkeley

Thank You
Questions?


Backup Slide : Toy Example

Rare but poor localization

Rare and good localization

