object co-detection · • better detection accuracy than single -image methods • better matching...

Object Co-detection

Yingze (Sid) Bao, Yu Xiang, Silvio Savarese
Midwest Workshop 2012, ECCV 2012

Object Detection: A Review

Object Co-Detection
• Detect objects in multiple images
• Establish object correspondences
• Estimate object viewpoint changes

Id: 1

Id: 2

A Coherent Algorithm

Motivations
• Better detection accuracy than single-image methods
• Better matching accuracy than low-level methods
[Figure: Where is the car?]
• Tracking by detection
– Co-detection provides consistent detection across frames
• Semantic structure from motion
[Figure: two views related by R, t]
Bao, Bagra, Chao, Savarese: CVPR 2011, CORP 2012, CVPR 2012

Related Problems
• Object detection in a single image
– References: Viola et al. 2001; Fergus et al. 2003; Leibe et al. 2004; Dalal et al. 2005; Savarese et al. 2006; Felzenszwalb et al. 2009; etc.
• Single instance detection
– Low-level image features
– Small pose variation
– Rich texture
– References: Lowe 1999; Berg et al. 2005; Ferrari et al. 2006; Nister et al. 2006; Rothganger et al. 2006; Hsiao et al. 2010; etc.
• Co-segmentation
– No semantic information
– Hard to handle objects with different poses
– References: Rother et al. 2006; Batra et al. 2010; Hochbaum et al. 2009; etc.
• Region matching
– No semantic information
– May require epipolar geometry validation
– Sensitive to segmentation noise
– References: Schaffalitzky et al. 2001; Tuytelaars et al. 2004; Matas et al. 2004; Toshev et al. 2007; etc.

Object Co-detection is challenging

Pose Variation & Self-occlusion

Object Representation

id, location, pose, visibility and scale

• $p$ : part
• $r$ : root filter
• $V$ : viewpoint

$O = \{r, V, p_1, p_2, \dots, p_n\}$
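The representation above could be held in a small data structure. This is a minimal sketch in Python, not the authors' code: the field names (`root_box`, `viewpoint`, `parts`, `visible`), the box layout, and the azimuth/elevation encoding of the viewpoint are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed box layout: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Part:
    box: Box              # where the part lies in the image
    visible: bool = True  # False when the part is self-occluded at this viewpoint


@dataclass
class ObjectHypothesis:
    """One object O = {r, V, p_1, ..., p_n} in a single image."""
    root_box: Box                   # r: root filter / bounding box
    viewpoint: Tuple[float, float]  # V: assumed (azimuth, elevation) in degrees
    parts: List[Part] = field(default_factory=list)  # p_1 ... p_n

    def visible_parts(self) -> List[int]:
        """Indices of the parts that are not self-occluded."""
        return [i for i, p in enumerate(self.parts) if p.visible]


car = ObjectHypothesis(
    root_box=(40.0, 60.0, 200.0, 90.0),
    viewpoint=(45.0, 10.0),
    parts=[
        Part((50.0, 70.0, 40.0, 40.0)),
        Part((150.0, 70.0, 40.0, 40.0), visible=False),
    ],
)
print(car.visible_parts())  # [0]
```

Keeping visibility on each part makes it easy to skip self-occluded parts later, when two objects are compared part by part.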

Goal of object co-detection

• Identify the matching objects in each input image
• $O_k$: an object in image $I_k$
• $E$: energy based on the co-detection model
– What is the co-detection model?

$$\{O_1, O_2, \dots, O_K\} = \arg\max_{\{O_k\}} E(O_1, O_2, \dots, O_K; \mathbf{I})$$

Co-detection Model

• In the case of 2 input images

• $p$ : part
• $r$ : bounding box
• $V$ : viewpoint

$E_{unit}$

Co-detection Model

Account for visibility!

[Figure labels: Rectification, Match, Rectification]

$E_{match}$

Set of objects Set of input images
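For the two-image case, one plausible reading of the preceding slides is that the energy is the sum of a unitary term per image and a matching term across the pair. The additive form below is an assumption: the slides name the unitary and matching potentials but, as extracted, do not show how they are combined.

```latex
E(O_1, O_2; \mathbf{I}) = E_{unit}(O_1; I_1) + E_{unit}(O_2; I_2) + E_{match}(O_1, O_2; I_1, I_2)
```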

• Unitary potential
– Single-image object detector

Xiang & Savarese CVPR 12

Works with many part-based object detection models:
– Fergus et al. 03; Leibe et al. 04; Savarese & Fei-Fei 06; Kushal et al. 07; Chiu et al. 07; Sun et al. 09; Felzenszwalb 09; Fidler 09

• Matching potential
• Decompose into pair-wise terms

[Figure labels: rectification, rectification]

Match by concatenation of feature vectors (HOG, SIFT, color, etc.)
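To make the pair-wise decomposition concrete, here is a minimal sketch, assuming each part has already been rectified and described by a few cue vectors. The function names, the use of cosine similarity, and the visibility rule (a part contributes only when it is visible in both images) are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Dict, List, Optional, Sequence

Cues = Dict[str, List[float]]  # cue name -> feature vector of one rectified part


def describe(cues: Cues, order: Sequence[str] = ("hog", "sift", "color")) -> List[float]:
    """Concatenate the cue vectors of one part into a single descriptor."""
    out: List[float] = []
    for name in order:
        out.extend(cues[name])
    return out


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def matching_potential(parts1: List[Optional[Cues]],
                       parts2: List[Optional[Cues]]) -> float:
    """Sum of pair-wise part terms; a part given as None counts as not visible."""
    total = 0.0
    for c1, c2 in zip(parts1, parts2):
        if c1 is None or c2 is None:
            continue  # skip parts hidden in either image
        total += cosine(describe(c1), describe(c2))
    return total


wheel = {"hog": [3.0, 0.0], "sift": [0.0, 4.0], "color": [0.0]}
door = {"hog": [1.0, 0.0], "sift": [1.0, 0.0], "color": [1.0]}
# The second part is occluded in the first image, so only the wheel pair is scored.
print(round(matching_potential([wheel, None], [wheel, door]), 3))  # 1.0
```

Skipping parts that are hidden in either image is one simple way to account for visibility; a learned per-cue weighting could replace the plain concatenation.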

Single Image Detector False Alarms

Inference

• Loopy model
– Approximate inference in 2 steps
• Step 1: $\tilde{O}$ = objects with high unitary potential
• Step 2: $O^* = \arg\max_{\tilde{O}} E(\tilde{O}; \mathbf{I})$
– Search over all possible configurations of matching objects in different images
– Goal: detect matching objects
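The two steps can be sketched as follows. This is a minimal illustration, not the authors' code: `top_candidates`, `co_detect`, and the toy scoring functions are hypothetical, and the total energy is assumed to be the sum of the unitary terms plus all pair-wise matching terms.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def top_candidates(dets: Sequence[T], unit: Callable[[T], float], n: int) -> List[T]:
    """Step 1: keep the n detections with the highest unitary potential."""
    return sorted(dets, key=unit, reverse=True)[:n]


def co_detect(dets_per_image: Sequence[Sequence[T]],
              unit: Callable[[T], float],
              match: Callable[[T, T], float],
              n: int = 5) -> Tuple[T, ...]:
    """Step 2: exhaustive argmax of the energy over candidate configurations."""
    pools = [top_candidates(d, unit, n) for d in dets_per_image]

    def energy(config: Tuple[T, ...]) -> float:
        e = sum(unit(o) for o in config)
        for i in range(len(config)):
            for j in range(i + 1, len(config)):
                e += match(config[i], config[j])  # assumed additive pair-wise terms
        return e

    return max(product(*pools), key=energy)


# Toy detections: (detector score, 1-D appearance value).
image1 = [(0.9, 1.0), (0.8, 5.0)]
image2 = [(0.7, 5.0), (0.6, 9.0)]
best = co_detect(
    [image1, image2],
    unit=lambda o: o[0],
    match=lambda a, b: -abs(a[1] - b[1]),  # closer appearance gives a higher potential
)
print(best)  # ((0.8, 5.0), (0.7, 5.0))
```

In the toy example the highest-scoring detection in the first image, (0.9, 1.0), is dropped because it matches nothing in the second image. Pruning to the top n candidates per image keeps the exhaustive search in step 2 small.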

Learn the model
• Learn parameters for $E_{unit}$
– Standard learning process in a part-based object detection model
• Learn parameters for $E_{match}$
– Learn weights w of the different matching cues (e.g. HOG, SIFT, color) in an SVM learning framework
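As a rough illustration of learning cue weights with an SVM, the sketch below assumes each training pair of objects is summarized by a vector of per-cue similarity scores and labeled +1 (same object) or -1 (different objects). The solver is a plain subgradient descent on the L2-regularized hinge loss, swapped in for brevity; the slides do not say which SVM solver or feature encoding was used, and the training data here is made up.

```python
from typing import List, Sequence


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     lam: float = 0.01, lr: float = 0.05,
                     epochs: int = 2000) -> List[float]:
    """Return [w_1, ..., w_d, bias] for a linear SVM trained by subgradient descent."""
    d = len(X[0])
    w = [0.0] * (d + 1)  # last entry is the bias
    n = len(X)
    for _ in range(epochs):
        # Gradient of the regularizer (the bias is not regularized).
        grad = [lam * wi for wi in w[:d]] + [0.0]
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + w[d]
            if yi * score < 1:  # sample violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
                grad[d] -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Signed distance-like score; positive means 'matching objects'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]


# Hypothetical per-cue similarities: [HOG, SIFT, color].
X = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.2, 0.1, 0.3], [0.1, 0.3, 0.2]]
y = [1, 1, -1, -1]
w = train_linear_svm(X, y)
print([score(w, xi) > 0 for xi in X])
```

The first three entries of `w` play the role of the cue weights: a larger weight means that cue counts more when deciding whether two objects match.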

2D Object Part Representation – A Simplification
• Treat different viewpoints as different object categories
– Cannot match objects with large pose variations
• More choices to compute $E_{unit}$
– Fergus et al. CVPR'03
– Leibe et al. 04
– Felzenszwalb 09
– etc.

Experiments

• 3 datasets
– Cars
  • 300 image pairs
  • Pandey et al. 2009, Bao et al. 2010
– Pedestrians
  • 200 image pairs
  • Ess et al. 2007
– 3D objects
  • ~400 image pairs for each of 8 categories
  • Savarese & Fei-Fei 2006

Car dataset
(2D object representation applied)
[Figure panels: single-image detector vs. co-detector]

Pedestrian dataset
(2D object representation applied)
[Figure panels: single-image detector vs. co-detector]

3D object dataset
[Figure panels: single-image detector vs. co-detector; shoe model; SIFT match]

Quantitative Evaluation

• Object detection
• Pose estimation
• Single instance detection

(More results in the paper)

Object Detection Accuracy: average precision (%) by dataset

Dataset      Single-image detector   Co-detection
Car          49.8                    53.5
Pedestrian   59.7                    62.7
3D Object    82.3                    83.0

Single-image detectors: Felzenszwalb et al. 2009 (Car, Pedestrian); Xiang & Savarese 2012 (3D Object)
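Reading the bar labels as (single-image detector, co-detection) pairs per dataset, which is an assumption based on the legend order, the absolute gains in average precision work out as follows:

```python
# Average precision (%) per dataset: (single-image detector, co-detection).
# The pairing of values to methods is assumed from the chart legend order.
ap = {
    "Car": (49.8, 53.5),
    "Pedestrian": (59.7, 62.7),
    "3D Object": (82.3, 83.0),
}

# Absolute gain in percentage points, rounded to one decimal place.
gains = {name: round(co - single, 1) for name, (single, co) in ap.items()}
print(gains)  # {'Car': 3.7, 'Pedestrian': 3.0, '3D Object': 0.7}
```

Under that reading, the gain is largest on the car dataset and smallest on the 3D object dataset.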

Pose Estimation Accuracy (3D Object Dataset)
[Bar chart: accuracy (%), single-image estimator (Xiang & Savarese 2012) vs. object co-detection]

Detecting the same instance (same poses)
[Bar chart: average precision (%), Lowe 2004 vs. co-detection]

Why is co-detection better at the instance detection task?

• Object co-detection problem
– A generalization of the object detection problem
• Our solution
– Exploit existing object representation models
– Measure object similarity by parts
• Experiments
– Superior performance in extensive tests
• Acknowledgement
