object co-detection · • better detection accuracy than single -image methods • better matching...
TRANSCRIPT
Object Co-detection
Yingze (Sid) Bao, Yu Xiang, Silvio SavareseMidwest Workshop 2012, ECCV 2012
Object Detection: A Review
2
Object Co-Detection• Detect objects in multiple images• Establish object correspondences• Estimate object viewpoint changes
Id: 1
Id: 2
A Coherent Algorithm
3
Motivations• Better detection accuracy than single-image methods
4
Motivations• Better detection accuracy than single-image methods• Better matching accuracy than low-level methods
Where is the car?
5
• Better detection accuracy than single-image methods• Better matching accuracy than low-level methods• Tracking by Detection
– Co-detection provides consistent detection across frames
Motivations
6
Motivations• Better detection accuracy than single-image methods• Better matching accuracy than low-level methods• Tracking by Detection• Semantic Structure From Motion
R, t
R, tBao, Bagra, Chao, Savarese, CVPR 2011 CORP 2012 CVPR 2012 7
Related Problems• Object detection in a single image
• Viola et al. 2001• Fergus et al. 2003• Leibe et al. 2004 • Dalal et al. 2005• Savarese et al. 2006• Felzenszwalb et al. 2009• etc….
8
Related Problems• Object detection in a single image• Single instance detection
– Low level image features– Small pose variation– Rich texture
Lowe 1999
• Lowe 1999• Berg et al. 2005• Ferrari et al. 2006• Nister et al. 2006• Rothganger et al.2006• Hsiao et al. 2010• etc.
9
Related Problems• Object detection in a single image• Single instance detection• Co-segmentation
– No semantic information– Hard to handle objects with different poses
• Rother et al. 2006• Batra et al. 2010• Hochbaum et al. 2009• etc.
Rother et al. 2006
10
Related Problems• Object detection in a single image• Single instance detection• Co-segmentation• Region matching
– No semantic information– May require epipolar geometry validation– Sensitive to segmentation noise
• Schaffalitzky et al. 2001• Tuytelaars et al. 2004• Matas et al. 2004• Toshev et al. 2007• etc.
Toshev et al. 2007 11
Object Co-detection is challenging
12
Pose Variation & Self-occlusion
13
Object Representation
id, location, pose, visibility and scale
• 𝑝𝑝 : part • 𝑟𝑟 : root filter• 𝑉𝑉 : view point𝑂𝑂 = 𝑟𝑟,𝑉𝑉, 𝑝𝑝1, 𝑝𝑝2, … , 𝑝𝑝𝑛𝑛
14
Pose Variation & Self-occlusion
15
Goal of object co-detection
• Identify matching objects in every input images respectively
• 𝑂𝑂𝑘𝑘: an object in image 𝐼𝐼𝑘𝑘
• 𝐸𝐸: energy based on the co-detection model– What is the co-detection model?
𝑂𝑂1,𝑂𝑂2, … , 𝑂𝑂𝐾𝐾 = 𝑎𝑎𝑟𝑟𝑎𝑎max{𝑂𝑂𝑘𝑘}
𝐸𝐸(𝑂𝑂1,𝑂𝑂2, … , 𝑂𝑂𝐾𝐾; 𝑰𝑰)
16
Co-detection Model
• In the case of 2 input images
𝑝𝑝 : part 𝑟𝑟 : bounding box𝑉𝑉 : view point
𝐸𝐸𝑢𝑢𝑛𝑛𝑢𝑢𝑢𝑢
17
Co-detection Model
• In the case of 2 input images
Account for visibility!
𝑝𝑝 : part 𝑟𝑟 : bounding box𝑉𝑉 : view point
Rectification
Match
Rectification
𝐸𝐸𝑚𝑚𝑚𝑚𝑢𝑢𝑚𝑚𝑚
𝐸𝐸𝑢𝑢𝑛𝑛𝑢𝑢𝑢𝑢
18
Co-detection Model
• In the case of 2 input images
Account for visibility!
𝑝𝑝 : part 𝑟𝑟 : bounding box𝑉𝑉 : view point
Rectification
Match
Rectification
𝐸𝐸𝑚𝑚𝑚𝑚𝑢𝑢𝑚𝑚𝑚
𝐸𝐸𝑢𝑢𝑛𝑛𝑢𝑢𝑢𝑢
19
Set of objects Set of input images
20
• Unitary potential– Single image object detector
Xiang & Savarese CVPR 12
Good with many part-based object detection models!• Fergus et al. 03• Leibe et al. 04• Silvio & Feifei 06• Kushal et. al. 07• Chiu et. al. 07• Sun et al. 09• Felzenszwalb 09• Filder 09
21
• Matching potential• Decompose into pair-wise terms
rectification rectification
Match by concatenation of feature vectors (HOG, sift, color etc..)
22
Single Image Detector False Alarms
23
Inference
• Loopy model– approximating inference with 2 steps
• Step 1: �𝑂𝑂 objects with high unitary potential
• Step 2: 𝑂𝑂∗ = 𝑎𝑎𝑟𝑟𝑎𝑎max�𝑂𝑂
All possible configurationsof matching objects in different images
Goal: detectmatching objects
24
Learn the model• Learn parameters for
– Standard learning process in a part-based object detection model
• Learn parameters for– Learning weights w of different matching cues (e.g. HOG,
sift, color) in a SVM learning framework
25
26
2D Object Part Representation– A Simplification
• Treat different view points as different object categories– Cannot match objects with large pose variations
• More choices to compute– Fergus et al. CVPR’03– Leibe et al. 04– Felzenszwalb 09– Etc. 27
Experiments
28
Experiments
• 3 Datasets– Cars
• 300 image pairs• Pandey et al. 2009, Bao et al. 2010
– Pedestrians • 200 image pairs• Ess et al. 2007
– 3d objects • ~400 image pairs for 8 categories each• Savarese & Fei-fei. 2006
29
Car dataset
(2d object representation applied)
Single Img. Det. Co-detector30
Pedestrian dataset
(2d object representation applied)
Single Img. Det. Co-detector31
3d object dataset
Single Img. Det. Co-detector
Shoe model
SIFT match
32
33
Quantitative Evaluation
• Object detection• Pose estimation• Single instance detection
(More results in the paper)
34
45
50
55
60
65
70
75
80
85
Car Pedestrain 3D Object
Aver
age
Prec
isio
n (%
)
Datasets
Object Detection Accuracy
Single Image detector
Co-detetion
Car, Pedestrian: Felzenszwalb et al. 20093D Object: Xiang & Savarese 2012
49.8 53
.5
59.7 62
.7
82.3
83.0
35
606570
75
80
85
90
95
100
Accu
racy
(%)
3D Object Dataset
Pose Estimation Accuracy
Single Image Estimator
Object Co-detection
Xiang & Savarese 2012
36
0
20
40
60
80
100
Aver
age
Prec
isio
n (%
)
(same poses)
Detecting the same instance
Lowe 2004
Co-detection
37
Why the co-detection is better at the instance detection task?
38
• Object co-detection problem– A generalization of the object detection problem
• Our solution– Exploit existing object representation models– Measure object similarity by parts
• Experiments– Superior performance in extensive tests
• Acknowledgement
39