Descriptors (Description of Interest Regions with Local Binary Patterns) Yu-Lin Cheng...
TRANSCRIPT
OUTLINE
Scale Invariant Feature Transform (SIFT) Descriptor
Local Binary Pattern (LBP) Descriptor
Center-Symmetric LBP (CS-LBP) Descriptor
Histogram of Oriented Gradients (HOG) Descriptor
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Scale-space Extrema Detection: Stable feature points (scale invariant)
Principle: A local maximum over scales of a combination of normalized derivatives can be treated as a characteristic point of the local structure
Use LoG to find the maximum over scale
[Figure: two response-versus-scale plots, one labeled "bad" and one labeled "good"]
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Scale-space Extrema Detection: Use DoG instead of LoG (computational efficiency)
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Scale-space Extrema Detection: Local extrema detection:
Compare each sample to its 26 neighbors (8 in the same scale, 9 in each of the two adjacent scales)
Keep the same keypoint across all scales
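The 26-neighbor comparison can be sketched as below. This is a minimal illustration, not the reference implementation: `dog` is a hypothetical stack of DoG layers stored as nested lists indexed `[scale][row][column]`, and the caller is assumed to pass only interior positions.

```python
def is_local_extremum(dog, s, y, x):
    """Return True if dog[s][y][x] is strictly larger or strictly smaller
    than all 26 neighbors in the 3x3x3 cube centered on it."""
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)      # scale below, same scale, scale above
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)   # skip the center itself
    ]
    return center > max(neighbors) or center < min(neighbors)
```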
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Scale-space Extrema Detection: Reject points with low contrast
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Accurate keypoint localization: Fit a quadratic function to interpolate the location of the maximum
Eliminate edge responses: Keep a keypoint only if Tr(H)^2 / Det(H) < (r + 1)^2 / r
r: threshold, H: Hessian matrix
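A minimal sketch of the edge-response test, assuming the criterion from the cited Lowe paper, Tr(H)^2 / Det(H) < (r + 1)^2 / r. The function name and argument names are hypothetical, and the default r = 10 is the value suggested in that paper, not one stated on the slide.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """dxx, dyy, dxy are the entries of the 2x2 Hessian H at the keypoint."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign (or a degenerate Hessian): reject.
        return False
    # Small ratio means the two principal curvatures are similar (not an edge).
    return trace * trace / det < (r + 1) ** 2 / r
```

A blob-like point with equal curvatures passes the test, while a point with one curvature much larger than the other (an edge) is rejected.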
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Orientation Assignment: Assign a consistent orientation to achieve rotation invariance
Method:
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Orientation Assignment: Calculate gradient magnitude and direction of
neighboring pixels
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Orientation Assignment: Calculate weighted orientation histogram
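The gradient and weighted-histogram steps can be sketched as follows. This is an illustrative version under stated assumptions: `img` is a hypothetical grayscale image as nested lists, gradients use central differences, each vote is the gradient magnitude times a Gaussian weight centered on the keypoint, and the 36-bin default follows the cited Lowe paper. The function names, `radius`, and `sigma` are hypothetical parameters.

```python
import math

def gradient(img, y, x):
    """Central-difference gradient magnitude and direction in [0, 2*pi)."""
    dx = img[y][x + 1] - img[y][x - 1]
    dy = img[y + 1][x] - img[y - 1][x]
    return math.hypot(dx, dy), math.atan2(dy, dx) % (2 * math.pi)

def orientation_histogram(img, cy, cx, radius, sigma, bins=36):
    """Gaussian-weighted histogram of gradient directions around (cy, cx)."""
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Skip pixels whose central difference would leave the image.
            if not (1 <= y < len(img) - 1 and 1 <= x < len(img[0]) - 1):
                continue
            mag, theta = gradient(img, y, x)
            dist2 = (y - cy) ** 2 + (x - cx) ** 2
            weight = math.exp(-dist2 / (2 * sigma ** 2))
            b = int(theta / (2 * math.pi) * bins) % bins
            hist[b] += weight * mag
    return hist
```

The index of the largest bin gives the dominant orientation that is then assigned to the keypoint.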
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Keypoint Descriptor: Empirical result:
Cell size: 4×4 pixels
Block size: 4×4 cells
Dimension: 4×4 (cells) × 8 (bins) = 128
Weighted magnitude
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
Keypoint Descriptor: Avoid boundary effects
Use trilinear interpolation
Normalization: (illumination invariant)
Normalize to unit length
Threshold the maximum value to 0.2
Matching the magnitudes of large gradients is no longer as important
Renormalize to unit length
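The three normalization steps can be sketched as below. This is a minimal illustration: the function name and the small `eps` guard against division by zero are assumptions, and the entries are assumed non-negative, as histogram values are.

```python
import math

def normalize_descriptor(vec, clip=0.2, eps=1e-12):
    """Normalize to unit length, clip large entries, renormalize."""
    def unit(v):
        norm = math.sqrt(sum(c * c for c in v))
        return [c / (norm + eps) for c in v]

    v = unit(vec)                      # step 1: unit length
    v = [min(c, clip) for c in v]      # step 2: threshold at 0.2
    return unit(v)                     # step 3: renormalize
```

After clipping, a single very large gradient bin no longer dominates the vector, so the distribution of orientations matters more than the exact magnitudes.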
LBP (LOCAL BINARY PATTERN)
A powerful means of texture description
LBP operator: Standard LBP:
Illustration:
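A minimal sketch of the standard 8-neighbor LBP operator, assuming the usual definition: each neighbor contributes a 1 bit when it is greater than or equal to the center pixel. The neighbor ordering in `OFFSETS` and the function name are illustrative assumptions; any fixed ordering works as long as it is used consistently.

```python
# (dy, dx) offsets of the 8 neighbors in a 3x3 window, starting at the
# right neighbor; the ordering is an illustrative choice.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]

def lbp_code(img, y, x):
    """Return the 8-bit LBP code of the pixel at (y, x)."""
    center = img[y][x]
    code = 0
    for i, (dy, dx) in enumerate(OFFSETS):
        if img[y + dy][x + dx] >= center:
            code |= 1 << i   # neighbor i sets bit i
    return code
```

With 8 neighbors there are 2^8 = 256 possible codes, so a region is described by a 256-bin histogram of codes.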
CS-LBP DESCRIPTOR
Interest Region Detection:
Detectors:
1. Hessian-Affine (blob-like structure)
2. Harris-Affine (corner-like structure)
3. Hessian-Laplace (scale-invariant version)
4. Harris-Laplace (scale-invariant version)
Normalized region: 41×41 pixels
CS-LBP DESCRIPTOR
Feature Extraction:
CS-LBP operator: Parameters:
R: radius, R = 1, 2
N: number of neighboring pixels, N = 6, 8
T: threshold, T = 0.2
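A minimal sketch of the CS-LBP operator for N = 8, assuming the center-symmetric definition: each of the N/2 opposite neighbor pairs contributes a 1 bit when their difference exceeds T. As a simplification, the 8 pixels of the 3×3 window stand in for samples on a circle of radius R = 1, pixel values are assumed to lie in [0, 1], and the names below are hypothetical.

```python
# (dy, dx) offsets of the 8 neighbors; entry i + 4 is opposite entry i.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]

def cs_lbp_code(img, y, x, t=0.2):
    """Return the 4-bit CS-LBP code of the pixel at (y, x)."""
    half = len(OFFSETS) // 2
    code = 0
    for i in range(half):
        dy1, dx1 = OFFSETS[i]
        dy2, dx2 = OFFSETS[i + half]
        # Compare center-symmetric pairs instead of neighbor vs. center.
        if img[y + dy1][x + dx1] - img[y + dy2][x + dx2] > t:
            code |= 1 << i
    return code
```

Only N/2 comparisons are made, so N = 8 yields 2^4 = 16 possible codes instead of the 256 of standard LBP.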
Descriptor Construction: Location grids
3×3 cells / 4×4 cells
Avoid boundary effects:
Using bilinear interpolation
Normalized region: 41×41 pixels
CS-LBP DESCRIPTOR
Descriptor Normalization: (illumination invariant)
Normalize to unit length
Thresholding
Renormalize to unit length
Dimension: 2^4 × (4×4) = 256
COMPARISON (SIFT VS. CS-LBP)
Assumption: Computations cannot be reused from the detection algorithm
Comparison:
Conclusion: CS-LBP is computationally more efficient and performs better than SIFT
HOG (HISTOGRAM OF ORIENTED GRADIENTS)
Spatial/Orientation Binning:
Weighted votes: function of magnitude
Avoid aliasing: interpolation
Parameters: number of orientation bins, cell size, block size
[Figure: cell and block]
HOG (HISTOGRAM OF ORIENTED GRADIENTS)
Spatial/Orientation Binning: Parameters:
Number of orientation bins: 9 bins / 18 bins
Cell size: 8×8 pixels
Block size: 2×2 cells
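A minimal sketch of magnitude-weighted voting for one cell, with linear interpolation between the two nearest orientation bins to reduce aliasing. It assumes unsigned gradient angles in degrees (folded into [0, 180)) and the 9-bin setting; spatial interpolation between cells is left out, and the function name is hypothetical.

```python
import math

def cell_histogram(mags, angles, bins=9):
    """mags, angles: gradient magnitudes and angles (degrees) of the
    pixels in one cell, given as flat lists of equal length."""
    width = 180.0 / bins
    hist = [0.0] * bins
    for m, a in zip(mags, angles):
        # Position in units of bins; bin k is centered at (k + 0.5) * width.
        pos = (a % 180.0) / width - 0.5
        lo = math.floor(pos)
        frac = pos - lo
        # Split the vote between the two neighboring bins, wrapping around.
        hist[lo % bins] += m * (1.0 - frac)
        hist[(lo + 1) % bins] += m * frac
    return hist
```

A gradient at a bin center votes entirely for that bin, while one on the border between two bins splits its magnitude equally between them.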
HOG (HISTOGRAM OF ORIENTED GRADIENTS)
Normalization: Group cells into larger blocks and normalize each block separately (illumination invariant)
Normalization Schemes:
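The scheme formulas did not survive in this transcript. As a hedged sketch, the functions below implement the four block normalization schemes described in the cited Dalal paper (L2-norm, L2-Hys, L1-norm, L1-sqrt) for a block vector `v` of non-negative histogram values. The function names and the `eps` default are assumptions.

```python
import math

def l2_norm(v, eps=1e-6):
    """v / sqrt(||v||_2^2 + eps^2)"""
    n = math.sqrt(sum(c * c for c in v) + eps * eps)
    return [c / n for c in v]

def l2_hys(v, clip=0.2, eps=1e-6):
    """L2-norm, clip entries at 0.2, then renormalize."""
    clipped = [min(c, clip) for c in l2_norm(v, eps)]
    return l2_norm(clipped, eps)

def l1_norm(v, eps=1e-6):
    """v / (||v||_1 + eps)"""
    n = sum(abs(c) for c in v) + eps
    return [c / n for c in v]

def l1_sqrt(v, eps=1e-6):
    """Square root of the L1-normalized vector (non-negative input)."""
    return [math.sqrt(c) for c in l1_norm(v, eps)]
```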
HOG VARIATION
‘Object Detection with Discriminatively Trained Part Based Models’
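Pixel-Level Feature Maps: Use [-1, 0, 1] to calculate gradient
Quantize into orientation bins: contrast sensitive (B1), contrast insensitive (B2)
B1(x, y) = round(p θ(x, y) / 2π) mod p
B2(x, y) = round(p θ(x, y) / π) mod p, (p = 9)
F(x, y)_b = r(x, y) if b = B(x, y), 0 otherwise
θ: gradient orientation, r: gradient magnitude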
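A minimal sketch of the orientation quantization, assuming the definitions from the cited part-based models paper: B1 = round(p θ / 2π) mod p (contrast sensitive) and B2 = round(p θ / π) mod p (contrast insensitive), with the pixel feature vector holding the magnitude r in the selected bin and zeros elsewhere. The function names are hypothetical.

```python
import math

def b1(theta, p):
    """Contrast-sensitive bin: distinguishes theta from theta + pi."""
    return round(p * theta / (2 * math.pi)) % p

def b2(theta, p):
    """Contrast-insensitive bin: theta and theta + pi share a bin."""
    return round(p * theta / math.pi) % p

def feature_vector(r, b, p):
    """Sparse pixel feature: magnitude r in bin b, zero in other bins."""
    return [r if k == b else 0.0 for k in range(p)]
```

For example, opposite gradient directions 0 and π fall in different bins under `b1` but in the same bin under `b2`.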
HOG VARIATION
Spatial Aggregation: Rectangular cell: 8×8 pixels
Cell-based feature map: Reduce the size of feature map
Avoid aliasing: Bilinear interpolation
Normalization:
HOG VARIATION
Truncation: maximum 0.2
No renormalization
Dimension: 9 bins × 4 different normalizations = 36 (contrast insensitive)
HOG VARIATION
PCA analysis: Top eigenvectors lie (approximately) in a linear subspace
13-dimensional features: Project the 36-dimensional HOG feature onto u_k, v_k
Projection onto u_k: sum over the 4 normalizations for a fixed orientation
Projection onto v_k: sum over the 9 orientations for a fixed normalization
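The two projections amount to column sums and row sums of a 4 × 9 feature matrix. A minimal sketch, assuming the 36-dimensional feature is stored as 4 rows (normalizations) by 9 columns (orientations); the function name and layout are hypothetical.

```python
def project_13(feat):
    """feat: 4 rows (normalizations) x 9 columns (orientations).
    Returns 9 orientation sums followed by 4 normalization sums."""
    # u_k: sum over the 4 normalizations for each fixed orientation.
    u = [sum(feat[n][o] for n in range(4)) for o in range(9)]
    # v_k: sum over the 9 orientations for each fixed normalization.
    v = [sum(feat[n][o] for o in range(9)) for n in range(4)]
    return u + v
```

The result has 9 + 4 = 13 entries and needs only additions, which is why it is cheaper than projecting onto learned PCA eigenvectors.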
HOG VARIATION
For Contrast Insensitive (B2): 9 bins × 4 different normalizations = 36 (contrast insensitive)
For Contrast Sensitive (B1): 18 bins × 4 different normalizations = 72 (contrast sensitive)
Reduce to (18 + 9) + 4 = 31 dimensions
REFERENCE
“Description of Interest Regions With Local Binary Patterns”, Pattern Recognition ’09, Marko Heikkilä, http://www.tele.ucl.ac.be/~devlees/ref_ELEC2885/projects/RoIdescriptionLBP-pr-accepted.pdf
“Effective Pedestrian Detection Using Center-symmetric Local Binary/Trinary Patterns”, Youngbin Zheng
“Scale-space Theory”, Tony Lindeberg
“Histogram of Oriented Gradients for Human Detection”, CVPR ’05, Navneet Dalal
“Finding People in Images and Videos”, Navneet Dalal
“Feature matching”, Yung-Yu Chuang
“Scale & Affine Invariant Interest Point Detectors”, IJCV ’04, Krystian Mikolajczyk
“Object Detection with Discriminatively Trained Part Based Models”
“Distinctive Image Features from Scale-Invariant Keypoints”, IJCV ’04, David G. Lowe, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.157.3843&rep=rep1&type=pdf