Bag of Features Approach: recent work, using geometric information
Post on 19-Dec-2015
Problem
• Search for object occurrences in a very large image collection
Two sub-problems
• Object Category Recognition and Specific Object Recognition
Motivation
• Look for product information
• Look for similar products
Related work on large scale image search
• Most systems build upon the BoF framework [Sivic & Zisserman 03]
  – Large (hierarchical) vocabularies [Nistér & Stewénius 06]
  – Improved descriptor representation [Jégou et al 08, Philbin et al 08]
  – Geometry used in index [Jégou et al 08, Perďoch et al 09]
  – Query expansion [Chum et al 07]
  – …
• Efficiency improved by:
  – Min-hash and geometric min-hash [Chum et al. 07-09]
  – Compressing the BoF representation [Jégou et al. 09]
Local Features - SIFT
Creating a visual vocabulary
[Figure: four panels, numbered 1 to 4]
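The vocabulary step can be sketched as clustering local descriptors and then counting, per image, how many descriptors fall on each cluster centre. The snippet below is a minimal illustration only, assuming plain k-means on toy 2-D points in place of 128-D SIFT descriptors; the function names (`build_vocabulary`, `bof_histogram`) are hypothetical and not taken from any of the cited systems.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_word(descriptor, vocabulary):
    """Index of the closest centroid, i.e. the visual word of this descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))

def build_vocabulary(descriptors, k, iterations=10, seed=0):
    """Plain k-means: returns k centroids that act as visual words."""
    rng = random.Random(seed)
    vocabulary = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, vocabulary)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                vocabulary[i] = [sum(m[j] for m in members) / len(members)
                                 for j in range(dim)]
    return vocabulary

def bof_histogram(descriptors, vocabulary):
    """Bag of features: count of descriptors assigned to each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist

# Toy 2-D "descriptors" standing in for SIFT vectors.
toy = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.3), (5.1, 5.0)]
vocab = build_vocabulary(toy, k=2)
hist = bof_histogram(toy, vocab)
```

Large-scale systems replace the exhaustive nearest-centroid search with hierarchical or approximate methods, which this sketch does not attempt.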
Inverted Index
• Index construction
• Searching
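The two phases can be sketched with a dictionary that maps each visual word to the images containing it. This is an illustrative sketch, assuming histogram intersection as the score (real systems typically use tf-idf weighting); `build_inverted_index` and `search` are hypothetical names.

```python
from collections import Counter, defaultdict

def build_inverted_index(image_words):
    """Index construction.

    image_words: dict mapping image id -> list of visual-word ids.
    Returns: dict mapping word id -> {image id: occurrence count}.
    """
    index = defaultdict(dict)
    for image_id, words in image_words.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count
    return index

def search(index, query_words):
    """Searching: only images sharing at least one word with the query are touched."""
    scores = Counter()
    for word, query_count in Counter(query_words).items():
        for image_id, count in index.get(word, {}).items():
            scores[image_id] += min(query_count, count)  # histogram intersection
    return scores.most_common()

database = {"a": [1, 2, 2, 3], "b": [3, 4], "c": [5]}
index = build_inverted_index(database)
results = search(index, [2, 3])
```

Because the query visits only the posting lists of its own words, the cost depends on the number of matching entries and not on the size of the collection.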
Use geometry
• Possible directions:
  – Change/optimize the spatial verification stage
  – Add geometric information to the index
    • Ordered BOF
    • Bundled features
    • Visual phrases
  – Change the searching algorithm
Survey for today
• Spatial Bag-of-features [Cao, CVPR2010]
• Image Retrieval with Geometry-Preserving Visual Phrases [Zhang Jia Chen, CVPR2011]
• Smooth Object Retrieval using a Bag of Boundaries [Arandjelović Zisserman, ICCV2011]
Spatial BOF
• Basic idea:
Spatial BOF
• Constructing linear and circular ordered bag-of-features:
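One way to picture the two orderings: for the linear case, project feature positions onto a line and build one BOF histogram per interval along it; for the circular case, build one histogram per angular sector around a centre. The sketch below is a simplified illustration, not the procedure of [Cao, CVPR2010]. It assumes features are given as `(x, y, word_id)` tuples and, as a simplification, spreads the linear bins over the range of the projected features; all names are hypothetical.

```python
import math

def linear_ordered_bof(features, vocab_size, angle, num_bins):
    """Concatenated per-bin histograms along a line at the given angle (radians)."""
    ux, uy = math.cos(angle), math.sin(angle)
    projections = [x * ux + y * uy for x, y, _ in features]
    lo, hi = min(projections), max(projections)
    span = (hi - lo) or 1.0  # avoid division by zero when all projections coincide
    bins = [[0] * vocab_size for _ in range(num_bins)]
    for p, (_, _, word) in zip(projections, features):
        b = min(int((p - lo) / span * num_bins), num_bins - 1)
        bins[b][word] += 1
    return [count for hist in bins for count in hist]

def circular_ordered_bof(features, vocab_size, center, num_sectors):
    """Concatenated per-sector histograms around a centre point."""
    cx, cy = center
    sectors = [[0] * vocab_size for _ in range(num_sectors)]
    for x, y, word in features:
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
        s = min(int(theta / (2 * math.pi) * num_sectors), num_sectors - 1)
        sectors[s][word] += 1
    return [count for hist in sectors for count in hist]

row = [(0, 0, 0), (1, 0, 0), (2, 0, 1), (3, 0, 1)]
linear = linear_ordered_bof(row, vocab_size=2, angle=0.0, num_bins=2)

ring = [(1, 1, 0), (-1, 1, 1), (-1, -1, 0), (1, -1, 1)]
circular = circular_ordered_bof(ring, vocab_size=2, center=(0, 0), num_sectors=4)
```

The output is a longer vector of the same kind as a standard BOF histogram, so it can go through the same indexing machinery.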
Spatial BOF
• Translation invariance:
Spatial BOF
• Pros:
  – Gets better performance than BOF+RANSAC on a large-scale dataset*
  – Same format as standard BOF
• Cons:
  – Is dataset dependent because it needs training
    • The authors do not present results for a large-scale dataset with transfer learning from another dataset
• Future work
  – Check it with cross training on a large dataset. Otherwise, it is not worth pursuing further.
Geometry-Preserving Visual Phrases
• Basic idea:
Geometry-Preserving Visual Phrases
• Representation
  – Quantize the image into a 10x10 grid
  – Histogram of GVPs of length k
  – GVP dictionary size is “choose k from N visual words”
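Because the phrase dictionary is combinatorially large, the histogram is hard to store explicitly. A hedged sketch of one way to count phrases shared by two images without enumerating them: let every pair of matching visual words vote for its grid offset, then count the k-subsets inside each offset bin. This assumes translation-only geometry and features given as `(word_id, col, row)` on the 10x10 grid; the function names are hypothetical, and repeated words within an image may be over-counted.

```python
from collections import Counter
from math import comb

def grid_cell(x, y, width, height, grid=10):
    """Quantize a pixel position into a (col, row) cell of a grid x grid layout."""
    col = min(int(x / width * grid), grid - 1)
    row = min(int(y / height * grid), grid - 1)
    return col, row

def count_shared_gvps(query, database, k):
    """Number of length-k phrases that occur in both images with the same layout.

    query, database: lists of (word_id, col, row).
    """
    votes = Counter()
    for wq, cq, rq in query:
        for wd, cd, rd in database:
            if wq == wd:
                votes[(cd - cq, rd - rq)] += 1  # offset between the two occurrences
    # Any k matches that agree on the same offset form one shared phrase.
    return sum(comb(n, k) for n in votes.values())

query = [(1, 0, 0), (2, 1, 0), (3, 0, 1)]
# Same three words shifted by (2, 3) cells, plus one stray occurrence of word 2.
database = [(1, 2, 3), (2, 3, 3), (3, 2, 4), (2, 9, 9)]
pairs = count_shared_gvps(query, database, k=2)
triples = count_shared_gvps(query, database, k=3)
```

With k = 1 the count reduces to the number of matching word pairs, which is the usual BOF-style match count.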
Geometry-Preserving Visual Phrases
• Pros:
  – Outperforms BOV + RANSAC
• Cons:
  – Only translation invariant because of memory
• Future work
BOF for smooth objects
Idea:
• Query object
• The information used for retrieval: segment and gradient
BOF for smooth objects
Results:
BOF for smooth objects
Segmentation phase
• Over-segmentation with super-pixels
• Classification of super-pixels:
  – 3208-dimensional feature vector (median(Mag(Grad)), 4 bits, color histogram, BOF)
  – SVM
• Post-processing
BOF for smooth objects
Boundary description phase:
• Sample points on the boundary
• Calculate HoG at each point in 3 scales
  – 340-dimensional, L2-normalized vector
* The descriptor is not rotation invariant
BOF for smooth objects
Retrieval procedure:
• Boundary descriptors are quantized (k=10k)
• Standard BOF scheme*
• Spatial verification for the top 200 with a loose affine homography (errors up to 100 pixels)
* No spatial information is recorded in the histogram
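The verification step can be sketched as counting how many putative matches agree with a candidate transform and re-ordering the head of the BOF ranking by that count. The snippet is a minimal sketch: it assumes the affine transform has already been estimated elsewhere (for example by RANSAC, not shown), takes the 100-pixel tolerance and top-200 cut-off from the list above, and uses hypothetical function names.

```python
import math

def apply_affine(transform, point):
    """transform = (a, b, c, d, tx, ty) maps (x, y) to (a*x + b*y + tx, c*x + d*y + ty)."""
    a, b, c, d, tx, ty = transform
    x, y = point
    return a * x + b * y + tx, c * x + d * y + ty

def count_inliers(transform, matches, tolerance=100.0):
    """Count matches (src, dst) whose reprojection error is within the tolerance."""
    inliers = 0
    for src, dst in matches:
        px, py = apply_affine(transform, src)
        if math.hypot(px - dst[0], py - dst[1]) <= tolerance:
            inliers += 1
    return inliers

def rerank(ranked_ids, inlier_counts, top_n=200):
    """Re-order only the top_n BOF results by inlier count; the tail is left as is."""
    head = sorted(ranked_ids[:top_n], key=lambda i: -inlier_counts.get(i, 0))
    return head + ranked_ids[top_n:]

shift = (1, 0, 0, 1, 10, 0)  # identity plus a 10-pixel shift in x
matches = [((0, 0), (10, 0)), ((5, 5), (60, 5)), ((0, 0), (500, 500))]
loose = count_inliers(shift, matches)                   # default 100-pixel tolerance
strict = count_inliers(shift, matches, tolerance=10.0)
order = rerank(["a", "b", "c"], {"a": 1, "b": 5, "c": 9}, top_n=2)
```

Python's sort is stable, so images with equal inlier counts keep their original BOF order.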
BOF for smooth objects
• Pros:
  – Solves the smooth object retrieval problem
  – Fast
• Cons:
  – Is dataset dependent because it needs training
  – Limited to objects with “solid” materials: segmentation has to catch the object’s boundary
• Future work
  – Eliminate the training step
Summary
• There is active research in the field of CBIR on exploiting geometric information.
• Each method has its limitations
• Still no widely accepted solution
  – Nothing as widely accepted as spatial verification with RANSAC