
Robust Real-Time Object Detection

Paul Viola & Michael Jones

Introduction

Frontal face detection is achieved:

Comparatively satisfactory detection rates

Efficient reduction of the false-positive rate

Extremely rapid operation: a 384×288-pixel image is processed at 15 frames/second

Contribution of The Paper

Integral image: a new image representation

AdaBoost: effective classifier selection

Cascade structure of increasingly complex classifiers: dramatic decrease in detection time

Simple Rectangle Features

Why not use pixels directly? Features encode domain knowledge that is hard to learn from a finite quantity of training data

Features operate much faster than pixel-based systems

Integral Image

A discrete double integral (cumulative sum) of the original image

A new representation of image for fast calculation of rectangle features

Integral Image

The sum of pixels in rectangle D of the original image can be computed from the integral image as:

P(4) − P(3) − P(2) + P(1)

where P(i) is the integral-image value at corner i of the rectangle
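The corner arithmetic above can be sketched as follows (a minimal NumPy illustration; the function names are ours, not the paper's):

```python
import numpy as np

def integral_image(img):
    """Cumulative row+column sum: ii[y, x] = sum of img[:y, :x]."""
    # A leading row/column of zeros lets corner lookups skip bounds checks.
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of pixels in the h-by-w rectangle with top-left corner (y, x):
    P(4) - P(3) - P(2) + P(1) in the slide's notation."""
    return ii[y + h, x + w] - ii[y + h, x] - ii[y, x + w] + ii[y, x]
```

Any rectangle sum, at any scale or position, costs the same four lookups once the integral image exists.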

Advantages of Integral Image

Image pyramid:

Requires a pyramid of images; a fixed-scale detector works on all those images

Forming the pyramid is computationally expensive

Integral image:

A single feature can be evaluated at any scale and location in a few operations

The integral image is computed in one pass over the original image

Learning Classification Functions

45,394 features associated with each sub-window

A very small number of these features can be combined to form an effective classifier

A variant of AdaBoost is used to:

Select features

Train the classifier

How does AdaBoost work?

Combines a mixture of weak classifiers to form a strong one

The perceptron-like weak learner returns the single-feature classifier having the minimum weighted classification error

The examples are then re-weighted according to the accuracy of the selected classifier

The final strong classifier is a weighted combination of the weak classifiers
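The boosting loop above can be sketched with threshold "stumps" standing in for the single-rectangle-feature weak classifiers (a toy illustration, not the paper's implementation):

```python
import numpy as np

def adaboost(X, y, n_rounds=10):
    """Toy AdaBoost with threshold stumps as weak classifiers.
    X: (n_samples, n_features) feature values; y: labels in {-1, +1}.
    Returns a list of (feature, threshold, polarity, alpha) tuples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)              # example weights, updated each round
    strong = []
    for _ in range(n_rounds):
        best = None
        # Exhaustively pick the stump with minimum weighted error.
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * X[:, j] < pol * thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = max(err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this weak classifier
        w *= np.exp(-alpha * y * pred)          # emphasize misclassified examples
        w /= w.sum()
        strong.append((j, thr, pol, alpha))
    return strong

def predict(strong, X):
    """Weighted vote of the selected weak classifiers."""
    score = np.zeros(len(X))
    for j, thr, pol, alpha in strong:
        score += alpha * np.where(pol * X[:, j] < pol * thr, 1, -1)
    return np.where(score >= 0, 1, -1)
```

In the detector, each "feature" column would be one rectangle-feature response over all training sub-windows, so each round of boosting selects one feature and trains its threshold at the same time.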

How does AdaBoost work?

First and second features selected by AdaBoost

Attentional Cascade

Increases detection performance & reduces computation time

Simpler classifiers are called before more complex ones

A simple two-feature classifier example: 100% detection rate, 40% false positives, about 60 microprocessor instructions (very efficient)
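The early-rejection control flow is simple to sketch (the stage interface here is a hypothetical illustration, not the paper's code):

```python
def cascade_classify(window, stages):
    """Run a window through cascade stages in order and reject at the
    first stage that says 'non-face', so most windows exit after a
    cheap test. stages: list of (score_function, threshold) pairs."""
    for score, threshold in stages:
        if score(window) < threshold:
            return False   # rejected early: no later (expensive) stage runs
    return True            # survived every stage: report a detection
```

Because the vast majority of sub-windows in an image contain no face, almost all of the work is done by the first one or two cheap stages.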

Attentional Cascade

Training of Cascade of Classifiers

The deeper classifiers are trained with harder examples

Simple classifiers in the first stages, complex ones in the deeper parts of the cascade

Complex classifiers take more time to compute

A usable general detection algorithm operates at an 85–95% detection rate and a false-positive rate on the order of 10^-5 to 10^-6

The cascade system works as follows:

With a 10-stage classifier, each stage having a 99% detection rate and a 30% false-positive rate, the overall system achieves:

(0.99^10 ≈) 90% detection rate

(0.30^10 ≈) 6 × 10^-6 false-positive rate
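Per-stage rates compound multiplicatively through the cascade, which is easy to verify:

```python
# A window is detected only if all 10 stages accept it, so both the
# detection rate and the false-positive rate are products over stages.
stages = 10
detection_rate = 0.99 ** stages        # ~0.904, i.e. ~90%
false_positive_rate = 0.30 ** stages   # ~5.9e-6
print(f"D = {detection_rate:.3f}, F = {false_positive_rate:.1e}")
```

This is the key trade-off of the cascade: a small per-stage loss in detections buys an exponential reduction in false positives.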

Training of Cascade of Classifiers

Requirements

Needs to be determined:

Number of stages

Number of features for each stage

Threshold for each stage

Practical Implementation

The user selects acceptable f_i (false-positive rate) and d_i (detection rate) for each layer

Each layer is trained with AdaBoost; the number of features is increased until the target f_i and d_i are met for that layer

If the overall targets F and D are not met, a new layer is added to the cascade
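The training loop above, in Python-style pseudocode (train_adaboost_layer and evaluate are hypothetical helpers, not the paper's API):

```
def train_cascade(pos, neg, f_target, d_target, F_target):
    cascade, F = [], 1.0
    while F > F_target:
        n_features, f = 0, 1.0
        # Grow this layer until it meets its per-layer targets.
        while f > f_target:
            n_features += 1
            layer = train_adaboost_layer(pos, neg, n_features)
            # Lower the layer threshold until d_target is met, then measure f.
            f = evaluate(layer, pos, neg, d_target)
        cascade.append(layer)
        F *= f
        # Only false positives of the current cascade are kept as negatives,
        # so deeper layers train on progressively harder examples.
        neg = [x for x in neg if all(l(x) for l in cascade)]
    return cascade
```

Note the last step: re-collecting negatives after each layer is what makes the deeper classifiers both more complex and more discriminative.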

Results – Structure of Cascade

32 layers – 4,297 features in total

Weeks were spent training the cascade

Layer #          1     2     3    4    5    6    7   …
Features         2     5    20   20   20   50   50   …
False positives  40%   20%
Detection rate   100%  100%

Results – Algorithm Details

All sub-windows (training and testing) are variance-normalized to compensate for lighting conditions

Scaling is achieved by scaling the detector rather than the image

A step size of one pixel is used
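Variance normalization can be sketched as below. In the actual detector, the mean and standard deviation of each window are obtained cheaply from two integral images (one of the image, one of its squared pixels), using var = E[x^2] - E[x]^2; this standalone NumPy version is only illustrative:

```python
import numpy as np

def variance_normalize(window):
    """Normalize a sub-window to zero mean and unit variance so that
    the classifier sees the same contrast regardless of lighting."""
    w = window.astype(np.float64)
    std = w.std()
    if std < 1e-12:                 # flat window: nothing to normalize
        return w - w.mean()
    return (w - w.mean()) / std
```

Equivalently, the feature values themselves can be divided by the window's standard deviation, which avoids touching the pixels at all.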

Results – Algorithm Details

Results

Most sub-windows are rejected in the first or second cascade stage

Face detection on a 384×288 image runs in about 0.067 seconds

15 times faster than Rowley–Baluja–Kanade; 600 times faster than Schneiderman–Kanade

Results

Tested on the MIT+CMU frontal face set: 130 images containing 507 labeled faces
