window-based models for generic object detection mei-chen yeh 04/24/2012

Window-based models for generic object detection

Mei-Chen Yeh
04/24/2012

Object Detection

• Find the location of an object if it appears in an image
  – Does the object appear?
  – Where is it?

Viola-Jones face detector

P. Viola and M. J. Jones. Robust Real-Time Face Detection. IJCV 2004.

Viola-Jones Face Detector: Results

Paul Viola, ICCV tutorial

Viola-Jones Face Detector: Results

A successful application

Consumer application: iPhoto 2009

http://www.apple.com/ilife/iphoto/

Slide credit: Lana Lazebnik



• Things iPhoto thinks are faces


http://www.flickr.com/groups/977532@N24/pool/


• Can be trained to recognize pets!

http://www.maclife.com/article/news/iphotos_faces_recognizes_cats



Challenges

Slide credit: Fei-Fei Li

Michelangelo 1475-1564

view point variation

illumination

occlusion

Magritte, 1957

scale

Challenges

Xu, Beihong 1943

deformation

Slide credit: Fei-Fei Li

Klimt, 1913

background clutter

Basic framework

• Build/train object model

– Choose a representation

– Learn or fit parameters of model / classifier

• Generate candidates in new image

• Score the candidates

Window-based models: Building an object model

Face/non-face Classifier

Yes, face.

No, not a face.

Given the representation, train a binary classifier

Basic framework

• Build/train object model

– Choose a representation

– Learn or fit parameters of model / classifier

• Generate candidates in new image

• Score the candidates

Window-based models: Generate and score candidates

• Scan the detector at multiple locations and scales

Face/non-face Classifier

Window-based object detection

Car/non-car Classifier

Feature extraction

Training examples

Training:
1. Obtain training data
2. Define features
3. Define classifier

Given new image:
1. Slide window
2. Score by classifier
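The "slide window, score by classifier" step can be sketched as below. This is a minimal illustration, not the detector's actual implementation: the function names, the 1.25 scale step, the stride, and the toy brightness scorer that stands in for a trained classifier are all assumptions.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, float]  # (top, left, window size, score)


def crop(image: Image, top: int, left: int, size: int) -> Image:
    """Return the size x size sub-window whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def sliding_window_detect(
    image: Image,
    score_window: Callable[[Image], float],
    base_size: int = 24,
    scale_step: float = 1.25,
    stride: int = 1,
    threshold: float = 0.5,
) -> List[Box]:
    """Scan square windows of growing size and keep those scoring above threshold."""
    height, width = len(image), len(image[0])
    detections: List[Box] = []
    size = base_size
    while size <= min(height, width):
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                score = score_window(crop(image, top, left, size))
                if score > threshold:
                    detections.append((top, left, size, score))
        # Grow the window; max() guarantees progress for small sizes.
        size = max(size + 1, int(round(size * scale_step)))
    return detections


# Toy stand-in for a trained classifier (an assumption for illustration):
# it scores a window by its mean brightness.
def mean_brightness(window: Image) -> float:
    pixels = [p for row in window for p in row]
    return sum(pixels) / len(pixels)


# 6x6 dark image with a bright 2x2 patch at rows 2-3, columns 3-4.
toy = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (3, 4):
        toy[r][c] = 1.0

hits = sliding_window_detect(toy, mean_brightness, base_size=2, threshold=0.9)
print(hits)
```

Only the window that exactly covers the bright patch passes the threshold here. A real detector would replace `mean_brightness` with the trained face/non-face or car/non-car classifier.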

Viola-Jones detection approach

• Viola and Jones' face detection algorithm
  – The first object detection framework to provide competitive object detection rates in real time
  – Implemented in OpenCV
• Components
  – Features
    • Haar-features
    • Integral image
  – Learning
    • Boosting algorithm
  – Cascade method

Haar-features (1)

• Feature value: the difference between the pixel sums of the white and black areas

Haar-features (2)

• Capture the face symmetry

A 24x24 detection window

Four types of haar features

Type A

Haar-features (3)

Can be extracted at any location with any scale!

Haar-features (4)

• Too many features!
  – location, scale, type
  – 180,000+ possible features associated with each 24 x 24 window
• Not all of them are useful!
• Speed-up strategy
  – Fast calculation of Haar-features
  – Selection of good features (AdaBoost)
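To see why the feature count explodes, the sketch below enumerates every location and scale of four feature types in a 24 x 24 window. The base shapes are assumptions (two-, three-, and four-rectangle layouts with integer scaling only), so the total depends on the enumeration convention and need not match the 180,000+ figure quoted above.

```python
def count_features(window: int, base_w: int, base_h: int) -> int:
    """Count placements of one feature type at every scale and location.

    The feature's width is a multiple of base_w and its height a multiple
    of base_h; each scaled feature may sit anywhere it fits in the window.
    """
    total = 0
    for w in range(base_w, window + 1, base_w):
        for h in range(base_h, window + 1, base_h):
            total += (window - w + 1) * (window - h + 1)
    return total


# Assumed base shapes (width, height) in cells for the four types.
FEATURE_TYPES = {
    "A (two rectangles, side by side)": (2, 1),
    "B (two rectangles, stacked)": (1, 2),
    "C (three rectangles)": (3, 1),
    "D (four rectangles)": (2, 2),
}

counts = {name: count_features(24, w, h) for name, (w, h) in FEATURE_TYPES.items()}
total = sum(counts.values())
for name, n in counts.items():
    print(f"{name}: {n}")
print("total:", total)
```

Under these assumptions the four types already give well over 100,000 candidate features for a window of only 576 pixels, which is why both fast evaluation and feature selection are needed.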

Integral image (1)

Sum of pixel values in the blue area

Example:

Image

2 1 2 3 4 3
3 2 1 2 2 3
4 2 1 1 1 2

Integral image

2  3  5  8 12 15
5  8 11 16 22 28
9 14 18 24 31 39

Time complexity?
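A minimal sketch of the computation, using the example image above. Each pixel is visited once, so the cost is linear in the number of pixels. The function name is illustrative.

```python
from typing import List


def integral_image(image: List[List[int]]) -> List[List[int]]:
    """ii[r][c] = sum of image[0..r][0..c], computed in one pass."""
    height, width = len(image), len(image[0])
    ii = [[0] * width for _ in range(height)]
    for r in range(height):
        row_sum = 0  # running sum of the current row up to column c
        for c in range(width):
            row_sum += image[r][c]
            above = ii[r - 1][c] if r > 0 else 0
            ii[r][c] = above + row_sum
    return ii


image = [
    [2, 1, 2, 3, 4, 3],
    [3, 2, 1, 2, 2, 3],
    [4, 2, 1, 1, 1, 2],
]
ii = integral_image(image)
for row in ii:
    print(row)
```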

Integral image (2)

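[Figure: an image split into four regions, 1 (top-left), 2 (top-right), 3 (bottom-left), 4 (bottom-right); a, b, c, d are the integral-image values at the bottom-right corners of regions 1, 2, 3, 4]

a = sum(1)
b = sum(1+2)
c = sum(1+3)
d = sum(1+2+3+4)

Sum(4) = ?

Sum(4) = d + a - b - c
Four-point calculation!

A, B: 2 rectangles => 6-point
C: 3 rectangles => 8-point
D: 4 rectangles => 9-point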
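The four-point lookup can be sketched as follows. The zero-padded integral image and the function names are assumptions made to keep the indexing simple. For clarity the two-rectangle feature calls the lookup twice (eight reads) instead of sharing corners to reach the 6-point count, and the choice of left = white, right = black is an assumed sign convention.

```python
from typing import List


def padded_integral(image: List[List[int]]) -> List[List[int]]:
    """Integral image with an extra zero row and column, so that
    ii[r][c] is the sum of image rows < r and columns < c."""
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        for c in range(width):
            ii[r + 1][c + 1] = image[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def rect_sum(ii: List[List[int]], top: int, left: int, height: int, width: int) -> int:
    """Sum of a rectangle from four lookups: d + a - b - c."""
    a = ii[top][left]
    b = ii[top][left + width]
    c = ii[top + height][left]
    d = ii[top + height][left + width]
    return d + a - b - c


def two_rect_feature(ii: List[List[int]], top: int, left: int, height: int, width: int) -> int:
    """Two side-by-side rectangles: white (left) sum minus black (right) sum."""
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return white - black


image = [
    [2, 1, 2, 3, 4, 3],
    [3, 2, 1, 2, 2, 3],
    [4, 2, 1, 1, 1, 2],
]
ii = padded_integral(image)
region = rect_sum(ii, top=1, left=2, height=2, width=3)   # rows 1-2, columns 2-4
feature = two_rect_feature(ii, top=0, left=0, height=3, width=6)
print(region, feature)
```

The cost of `rect_sum` does not depend on the rectangle's size, which is what makes features cheap to evaluate at any location and scale.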

Feature selection

• A very small number of features can be combined to form an effective classifier!

• Example: The 1st and 2nd features selected by AdaBoost

Feature selection

• A weak classifier h

f1 f2

f1 > θ (a threshold) => Face!

f2 ≤ θ (a threshold) => Not a Face!

h = 1 if fi > θ, 0 otherwise

Feature selection

• Idea: Combining several weak classifiers to generate a strong classifier

α1h1 + α2h2 + α3h3 + … + αThT ≷ Tthreshold

  – ht: weak classifier (feature, threshold); ht = 1 or 0
  – αt: weight ~ performance of the weak classifier on the training set
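A minimal sketch of evaluating the weighted vote. The feature values, thresholds, and α weights are made-up numbers, and setting the decision threshold to half the total weight is an assumption, not something stated on the slide.

```python
from typing import List, Sequence, Tuple

# A weak classifier is (feature index, threshold); it votes 1 when the
# feature value exceeds the threshold and 0 otherwise.
WeakClassifier = Tuple[int, float]


def weak_predict(weak: WeakClassifier, features: Sequence[float]) -> int:
    index, theta = weak
    return 1 if features[index] > theta else 0


def strong_predict(
    weak_classifiers: List[WeakClassifier],
    alphas: List[float],
    features: Sequence[float],
    threshold: float,
) -> int:
    """Return 1 (face) if the alpha-weighted vote exceeds the threshold, else 0."""
    vote = sum(a * weak_predict(h, features) for h, a in zip(weak_classifiers, alphas))
    return 1 if vote > threshold else 0


# Made-up numbers for illustration only.
weak_classifiers = [(0, 0.5), (1, 10.0), (2, -1.0)]
alphas = [1.2, 0.4, 0.7]
threshold = 0.5 * sum(alphas)  # assumed default: half the total weight

window_a = [0.9, 3.0, 0.0]    # h = 1, 0, 1 -> vote 1.9
window_b = [0.1, 12.0, -2.0]  # h = 0, 1, 0 -> vote 0.4
label_a = strong_predict(weak_classifiers, alphas, window_a, threshold)
label_b = strong_predict(weak_classifiers, alphas, window_b, threshold)
print(label_a, label_b)
```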

Feature selection

• Training Dataset
  – 4916 face images (positive samples)
  – non-face images cropped from 9500 images (negative samples)

AdaBoost

• Each training sample may have different importance!

• Focuses more on previously misclassified samples
  – Initially, all samples are assigned equal weights
  – Weights may change at each boosting round
    • misclassified samples => increase their weights
    • correctly classified samples => decrease their weights

Boosting illustration: 2D case

Weak Classifier 1

Slide credit: Paul Viola

Boosting illustration

Weights Increased


Weak Classifier 2


Weights Increased


Weak Classifier 3


Final classifier is a combination of weak classifiers

Viola-Jones detector: AdaBoost

• Want to select the single rectangle feature and threshold that best separates positive (faces) and negative (non-faces) training examples, in terms of weighted error.

Outputs of a possible rectangle feature on faces and non-faces.

…

Resulting weak classifier:

For next round, reweight the examples according to errors, choose another filter/threshold combo.

Kristen Grauman

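AdaBoost

Original data:           + + + - - - - - + +
Initial weights for each data point: 0.1 0.1 0.1

Boosting round 1 (B1):   + + + - - - - - - -
Weights after round 1:   0.0094 (decreased), 0.0094 (decreased), 0.4623 (increased)
α = 1.9459

  – The samples lie along the feature value fi, from -∞ to ∞
  – α depends on the error rate on the misclassified samples: error ↘ α ↗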
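The round-1 weights on this slide can be reproduced with a short calculation. The sketch assumes the exponential update (multiply by e^α if misclassified, by e^-α if correct, then renormalise) and takes α = 1.9459 from the slide as given instead of recomputing it from an error rate.

```python
import math

truth     = ["+", "+", "+", "-", "-", "-", "-", "-", "+", "+"]
predicted = ["+", "+", "+", "-", "-", "-", "-", "-", "-", "-"]  # output of B1
weights = [0.1] * 10
alpha = 1.9459  # taken from the slide, not recomputed here

# Scale misclassified samples up by e^alpha and correct ones down by e^-alpha.
scaled = [
    w * math.exp(alpha if t != p else -alpha)
    for w, t, p in zip(weights, truth, predicted)
]
# Renormalise so the weights sum to 1 again.
total = sum(scaled)
new_weights = [s / total for s in scaled]

print([round(w, 4) for w in new_weights])
```

The eight correctly classified points drop to about 0.0094 each and the two misclassified points rise to about 0.4623 each, so the next weak classifier is pushed toward the last two samples.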

AdaBoost

……

Learning the classifier

• Initialize equal weights to training samples
• For T rounds
  – normalize the weights
  – select the best weak classifier in terms of the weighted error
  – update the weights (raise the weights of misclassified samples)
• Linearly combine these T weak classifiers to form a strong classifier

AdaBoost Algorithm

Start with uniform weights on training examples {x1,…xn}

For T rounds:
  – Evaluate weighted error for each feature, pick best.
  – Re-weight the examples:
    Incorrectly classified -> more weight
    Correctly classified -> less weight

Final classifier is combination of the weak ones, weighted according to error they had.

Freund & Schapire 1995
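The training loop can be sketched with threshold stumps as weak classifiers. This is an illustrative version under stated assumptions, not the original implementation: it uses the update from the Viola-Jones paper (multiply the weights of correctly classified samples by β = error / (1 - error), with α = log(1/β)), searches thresholds exhaustively, and runs on a made-up 1-D data set.

```python
import math
from typing import List, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, polarity)


def stump_predict(stump: Stump, x: List[float]) -> int:
    j, theta, polarity = stump
    return 1 if polarity * x[j] > polarity * theta else 0


def best_stump(X: List[List[float]], y: List[int], w: List[float]) -> Tuple[Stump, float]:
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X})
        # Candidate thresholds: below the minimum and between neighbours.
        thetas = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for theta in thetas:
            for polarity in (1, -1):
                stump = (j, theta, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost(X: List[List[float]], y: List[int], rounds: int):
    n = len(X)
    w = [1.0 / n] * n                          # uniform initial weights
    stumps, alphas = [], []
    for _ in range(rounds):
        total = sum(w)
        w = [wi / total for wi in w]           # normalise
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) on a perfect stump
        beta = err / (1.0 - err)
        # Shrink the weights of correctly classified samples, so the
        # misclassified ones gain relative weight after normalisation.
        w = [wi * beta if stump_predict(stump, xi) == yi else wi
             for xi, yi, wi in zip(X, y, w)]
        stumps.append(stump)
        alphas.append(math.log(1.0 / beta))
    return stumps, alphas


def predict(stumps, alphas, x: List[float]) -> int:
    vote = sum(a * stump_predict(s, x) for s, a in zip(stumps, alphas))
    return 1 if vote >= 0.5 * sum(alphas) else 0


# Toy 1-D data: no single threshold separates the labels.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, 0, 0, 1, 1]
stumps, alphas = adaboost(X, y, rounds=3)
train_accuracy = sum(predict(stumps, alphas, x) == t for x, t in zip(X, y)) / len(X)
print(stumps, train_accuracy)
```

In the face detector, each column of `X` would be one Haar-feature value, so choosing the best stump in a round is also choosing a feature.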

First two features selected

Feature Selection: Results

Boosting: pros and cons

• Advantages of boosting
  – Integrates classification with feature selection
  – Complexity of training is linear in the number of training examples
  – Flexibility in the choice of weak learners, boosting scheme
  – Testing is fast
  – Easy to implement
• Disadvantages
  – Needs many training examples
  – Often found not to work as well as an alternative discriminative classifier, the support vector machine (SVM)
    • especially for many-class problems

Slide credit: Lana Lazebnik

Viola-Jones detection approach

• Viola and Jones' face detection algorithm
  – The first object detection framework to provide competitive object detection rates in real time
  – Implemented in OpenCV
• Components
  – Features
    • Haar-features
    • Integral image
  – Learning
    • Boosting algorithm
  – Cascade method

• Even if the filters are fast to compute, each new image has a lot of possible windows to search.

• How to make the detection more efficient?

Cascade method

Strong Classifier = (α1h1 + α2h2) + (…) + (… + αThT) ≷ Tthreshold

The three bracketed groups are stages 1, 2, and 3.

Most windows contain no face!
Reject negative windows at an early stage!
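Early rejection can be sketched as a loop over stages that stops at the first failed test. The three-stage cascade, its stumps, weights, and per-stage thresholds below are hypothetical values chosen for illustration. Giving each stage its own threshold is an assumption, following the Viola-Jones paper.

```python
from typing import Callable, List, Sequence, Tuple

# Each stage: (weak classifiers, their alphas, stage threshold).
# A weak classifier maps a window's feature vector to 0 or 1.
Weak = Callable[[Sequence[float]], int]
Stage = Tuple[List[Weak], List[float], float]


def cascade_predict(stages: List[Stage], features: Sequence[float]) -> Tuple[bool, int]:
    """Return (is_face, number of stages evaluated)."""
    for evaluated, (weaks, alphas, threshold) in enumerate(stages, start=1):
        vote = sum(a * h(features) for h, a in zip(weaks, alphas))
        if vote < threshold:
            return False, evaluated  # rejected early; later stages never run
    return True, len(stages)


def stump(index: int, theta: float) -> Weak:
    """Weak classifier: 1 if the chosen feature exceeds theta, else 0."""
    return lambda f: 1 if f[index] > theta else 0


# Hypothetical three-stage cascade with made-up parameters.
stages: List[Stage] = [
    ([stump(0, 0.2)], [1.0], 0.5),
    ([stump(1, 0.5), stump(2, 0.5)], [0.8, 0.6], 0.7),
    ([stump(0, 0.6), stump(1, 0.7), stump(2, 0.7)], [0.5, 0.5, 0.5], 1.0),
]

background = [0.1, 0.9, 0.9]
face_like = [0.8, 0.9, 0.9]
result_bg = cascade_predict(stages, background)
result_face = cascade_predict(stages, face_like)
print(result_bg, result_face)
```

The background window is discarded after a single cheap stage, while only the face-like window pays for all three. Since most windows contain no face, the average cost per window stays close to that of the first stage.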

Viola-Jones detector: summary

Train with 5K positives, 350M negatives
Real-time detector using 38 layer cascade
6061 features in all layers

Implementation available in OpenCV

[Figure: Faces and non-faces -> Train cascade of classifiers with AdaBoost -> Selected features, thresholds, and weights; New image -> Apply to each subwindow]

Kristen Grauman

Questions

• What other categories are amenable to window-based representation?

• Can the integral image technique be applied to compute histograms?

• Alternatives to sliding-window-based approaches?
