learning surf cascade for fast and accurate object...

Learning SURF Cascade for Fast and Accurate Object DetectionSupplementary materials

Jianguo Li, Yimin ZhangIntel Labs China

1. Spatial Cell ConfigurationFigure 1 depicts spatial configuration of SURF patches.

Figure 1. Possible spatial cell configuration of local patches. Eachlocal patch has 4 same-size cells, but these cells may be in differentforms. From left to right, 2×2 cells, 4×1 cells, 1×4 cells.

2. L2HYS Feature NormalizationGiven a feature vector v = (v1, . . . , vd), the L2Hys nor-

malization works like below

(1) L2-normalization: ui = vi/√∥v∥22 + ϵ, where ϵ is

small positive value to avoid dividing by zero;

(2) Clipping ui with the following rule

ui =

θ if ui > θ−θ if ui < −θui otherwise,

where θ = 2/√d empirically 1.

(3) Re-normalization: v′i = ui/√∥u∥22 + ϵ, and v′ =

(v′1, . . . , v′d) is the L2Hys normalization result.

3. Results on UMass FDDBFigure 2 depicts ROC curve from both discrete score and

continuous score on FDDB benchmark by SURF cascade.1After normalization, assuming |ui| < θ, thus

∑u2i < dθ2. ui

can be viewed as samples for a Gaussian variable, the variance σ2 =(∑

u2i )/d = 1/d < θ2. As is known, about 95% samples from the

Gaussian distribution fall within the range [−2σ, 2σ]. Therefore, we de-fine θ = 2σ = 2/

√d.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 250 500 750 1000 1250 1500 1750 2000

Tru

e po

sitiv

e ra

te

False positives

SURF_MultiviewSURF_Front_GB

SURF_Front_AB [19]VJGPR [16]

Viola-Jones [21]Mikolajaczyk et al [25]

Subburaman [33]

(a)

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 250 500 750 1000 1250 1500 1750 2000

Tru

e po

sitiv

e ra

te

False positives

SURF_MultiviewSURF_Front_GB

SURF_Front_AB [19]VJGPR [16]

Viola-Jones [21]Mikolajaczyk et al [25]

Subburaman [33]

(b)

Figure 2. (a) Discrete score ROC curves and (b) Continuous scoreROC curves for different methods on UMass FDDB dataset.

4. Example Results of Face and Car DetectionFigure 3 depicts some face detection results on FDDB

and CMU+MIT datasets. Figure 4 depicts some car detec-tion results on TUGRAZ dataset.

1

(a) (b)

(c) (d)

Figure 3. Example detection results on CMU+MIT (a) and UMass FDDB datasets (b,c,d).

(a) (b) (c)

(d) (e) (f)

Figure 4. Some car detection results on TUGRAZ dataset.

learning surf cascade for fast and accurate object...

Documents