Real-Time Logo Detection and Tracking
DESCRIPTION
A computationally efficient method to detect and track logos in video
TRANSCRIPT
SPIE Conference on Real-Time Image and Video Processing
April 16, 2010 - Brussels
M. George (a), N. Kehtarnavaz (a), M. Rahman (a), M. Carlsohn (b)
(a) Signal and Image Processing Lab, University of Texas at Dallas
(b) Engineering and Consultancy for Computer Vision and Image Communication, Bremen, Germany
This work has been partially supported by the Wireless Terminal Business Unit of Texas Instruments.
Motivation for this work
Existing approaches
Logo detection using SIFT
Real-time logo detection/tracking using online color calibration
Detection results/videoclips
User demands for value added applications on smart phones are increasing
Logo detection can be used to provide consumers with offers linked to logos
Logo detection can also be used together with GPS location services
Challenges: detection should work for any logo size (the smart phone camera sees logos at different distances), for any logo orientation (the smart phone camera can be held at any angle), and under any lighting condition
Detection methods that can accommodate different sizes and orientations:
Moment invariants (a specific location is needed, otherwise background objects would make it fail)
Viola and Jones (training is very time consuming for various orientations)
Scale Invariant Feature Transform (SIFT): most promising and widely used for object detection applications, but slow
Our contribution in this paper is on the real-time aspect
Introducing a hybrid approach that combines SIFT for initial detection with computationally efficient online color calibration and moment invariants for subsequent detection
Robust object detection technique introduced by David Lowe (1999)
Able to detect objects at different scales making it scale invariant
Descriptors using orientation histograms provide rotation invariance
Pyramid of images generated by Gaussian smoothing and subsampling
Difference of Gaussian (DoG) calculated
Maxima and minima points in DoG images are used to denote keypoints
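The pyramid, DoG and extrema steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the base scale `sigma0`, the scale step `k` and the number of levels are all hypothetical choices, and the image is a list of rows of floats.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, normalized so its weights sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(image, sigma):
    """Separable Gaussian blur with edge replication."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[k + radius] * image[y][clamp(x + k, width)]
                 for k in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]
    return [[sum(kernel[k + radius] * rows[clamp(y + k, height)][x]
                 for k in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]

def dog_stack(image, sigma0=1.0, k=2 ** 0.5, levels=4):
    """One octave: blur at sigma0 * k**i, then subtract adjacent levels."""
    blurred = [blur(image, sigma0 * k ** i) for i in range(levels)]
    return [[[b[y][x] - a[y][x] for x in range(len(image[0]))]
             for y in range(len(image))]
            for a, b in zip(blurred, blurred[1:])]

def is_extremum(dogs, level, y, x):
    """True if the sample is strictly above or below all 26 neighbors
    in its own DoG level and the two adjacent levels."""
    center = dogs[level][y][x]
    neighbors = [dogs[level + dl][y + dy][x + dx]
                 for dl in (-1, 0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dl, dy, dx) != (0, 0, 0)]
    return center > max(neighbors) or center < min(neighbors)

def downsample(image):
    """Keep every second row and column to start the next octave."""
    return [row[::2] for row in image[::2]]
```

Candidate keypoints would be the interior samples of the middle DoG levels for which `is_extremum` returns True. A full SIFT implementation also refines and filters these candidates, which is omitted here.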
Figure showing octaves, levels within octaves and Difference of Gaussian (DoG) for scale-space extrema
Taken from “Distinctive Image Features from Scale-Invariant Keypoints” by David Lowe (2004)
SIFT keypoints marked
Gradient magnitude and orientation calculated
360° orientation histogram uses gradient orientations of all neighboring pixels around keypoints
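As a rough sketch of the gradient and orientation histogram step, the code below accumulates gradient magnitudes into bins covering 360° around a keypoint. It assumes central-difference gradients, a square neighborhood and 36 bins of 10° each. The Gaussian weighting of samples used in Lowe's method is left out for brevity, and all names are hypothetical.

```python
import math

def gradient(image, y, x):
    """Central-difference gradient magnitude and orientation in degrees."""
    dx = image[y][x + 1] - image[y][x - 1]
    dy = image[y + 1][x] - image[y - 1][x]
    magnitude = math.hypot(dx, dy)
    orientation = math.degrees(math.atan2(dy, dx)) % 360.0
    return magnitude, orientation

def orientation_histogram(image, ky, kx, radius=2, bins=36):
    """Magnitude-weighted histogram of gradient orientations around (ky, kx)."""
    hist = [0.0] * bins
    bin_width = 360.0 / bins
    for y in range(ky - radius, ky + radius + 1):
        for x in range(kx - radius, kx + radius + 1):
            # Skip samples whose central difference would leave the image.
            if 1 <= y < len(image) - 1 and 1 <= x < len(image[0]) - 1:
                magnitude, orientation = gradient(image, y, x)
                hist[int(orientation // bin_width) % bins] += magnitude
    return hist

def dominant_orientation(hist):
    """Lower edge, in degrees, of the strongest histogram bin."""
    return hist.index(max(hist)) * 360.0 / len(hist)
```

Rotating the descriptor window by the dominant orientation is what gives the descriptor its rotation invariance.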
128-dimensional SIFT descriptor vector provides location, scale and orientation information
Matching of descriptors done through Best Bin First Search (k-d tree search variant).
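To illustrate descriptor matching, the sketch below swaps Best Bin First for an exhaustive nearest-neighbor search. It returns exact nearest neighbors, whereas Best Bin First is an approximate k-d tree search and is much faster on large descriptor sets. The distance-ratio check with a threshold of 0.8 follows Lowe's paper and is an assumption here, not something the slides state. The function name is hypothetical.

```python
import math

def match_descriptors(query, reference, ratio=0.8):
    """Return (query_index, reference_index) pairs that pass the ratio test.

    Exhaustive search stands in for Best Bin First; descriptors are
    equal-length sequences of floats (128 values for SIFT).
    """
    matches = []
    for qi, q in enumerate(query):
        # Distances to every reference descriptor, nearest first.
        dists = sorted((math.dist(q, r), ri) for ri, r in enumerate(reference))
        # Accept only if the best match is clearly better than the runner-up.
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches
```

A query descriptor lying about equally close to two reference descriptors is rejected as ambiguous, which suppresses false matches from background clutter.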
keypoints with gradient and orientation information
SIFT is computationally intensive so here it is just used for initial detection
Subsequent detection or tracking is done using color and moment invariants
K-means clustering is applied to the SIFT-detected logo region in order to extract the logo color under the light source in which the image frame is taken
Online calibrated color is then used to detect/track the logo in subsequent frames
Moment invariants applied to all regions having similar colors to increase robustness of detection
Color is a very effective feature but has the problem of being dependent on the light source (color temperature) under which the image is taken. By using online color calibration, the dependency on the light source is adjusted on-the-fly; we previously introduced this online color calibration for face detection
M. Rahman, N. Kehtarnavaz, and Jianfeng Ren, “A Hybrid Face Detection Approach For Real-Time Deployment On Mobile Devices,” Proceedings of IEEE International Conference on Image Processing (ICIP 2009), Cairo, Egypt, Nov. 2009.
K-means clustering is used to find the most prominent color cluster (black/white can be a dominant color too) in the SIFT detected logo area
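A minimal sketch of finding the most prominent color cluster with k-means, assuming the pixels of the detected logo region are given as (Cb, Cr) pairs. The number of clusters, the farthest-point initialization and the fixed iteration count are illustrative assumptions. The slides do not specify these details, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centers(points, k):
    """Deterministic farthest-point initialization (an assumption)."""
    centers = [tuple(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(tuple(farthest))
    return centers

def kmeans(points, k, iterations=20):
    """Plain k-means; returns the cluster centers and a label per point."""
    centers = init_centers(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center for every pixel.
        labels = [min(range(k), key=lambda c: sq_dist(pt, centers[c]))
                  for pt in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, labels

def dominant_color(points, k=3):
    """Center of the largest cluster and the fraction of pixels it covers."""
    centers, labels = kmeans(points, k)
    counts = [labels.count(c) for c in range(k)]
    best = counts.index(max(counts))
    return centers[best], counts[best] / len(points)
```

For example, `dominant_color(region_pixels)` would return the calibrated logo chrominance for the current lighting, where `region_pixels` is a hypothetical list of (Cb, Cr) values taken from the SIFT-detected area.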
Chrominance values modeled by a Gaussian Mixture Model (GMM)
Large color areas with high color probability are considered
Hu moment invariants (7 invariants) are then used to find the logo area by eliminating similar large color areas
Dominant color cluster in the Cb-Cr color space found on-the-fly and modeled by GMM
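The sketch below shows one way the Cb-Cr color model could be evaluated and cached in a lookup table. It rests on several assumptions: the RGB to CbCr conversion uses the full-range BT.601 (JPEG) coefficients, each mixture component has a diagonal covariance, and `fit_gaussian` fits a single component rather than running a full EM fit. The slides do not specify any of these, and all names are hypothetical.

```python
import math

def rgb_to_cbcr(r, g, b):
    """Full-range BT.601 chrominance (assumed conversion)."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

def fit_gaussian(samples, min_var=4.0):
    """Mean and per-channel variance of (Cb, Cr) samples, with a variance floor."""
    n = len(samples)
    mean = tuple(sum(s[i] for s in samples) / n for i in range(2))
    var = tuple(max(min_var, sum((s[i] - mean[i]) ** 2 for s in samples) / n)
                for i in range(2))
    return mean, var

def gmm_density(cb, cr, components):
    """components: list of (weight, (mean_cb, mean_cr), (var_cb, var_cr))."""
    total = 0.0
    for weight, (m_cb, m_cr), (v_cb, v_cr) in components:
        expo = -0.5 * ((cb - m_cb) ** 2 / v_cb + (cr - m_cr) ** 2 / v_cr)
        total += weight * math.exp(expo) / (2 * math.pi * math.sqrt(v_cb * v_cr))
    return total

def build_lut(components):
    """Precompute the density for every integer (Cb, Cr) pair."""
    return [[gmm_density(cb, cr, components) for cr in range(256)]
            for cb in range(256)]

def color_mask(cbcr_image, lut, threshold):
    """Per-pixel table lookup instead of evaluating exponentials per frame."""
    def index(v):
        return min(255, max(0, int(round(v))))
    return [[lut[index(cb)][index(cr)] >= threshold for cb, cr in row]
            for row in cbcr_image]
```

Building the table once after calibration moves the cost of the exponentials out of the per-frame loop, so tracking needs only one table lookup per pixel.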
Figure panels: SIFT matching, dominant color image, detection after moment invariants
Moment invariants used to detect the logo among similar large color areas
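For illustration, the seven Hu moment invariants of a binary region can be computed from normalized central moments as below. The mask is assumed to be a non-empty list of rows of 0/1 values. `shape_distance` is a hypothetical comparison measure, since the slides do not say how candidate regions are scored against the logo.

```python
def hu_moments(mask):
    """Seven Hu invariants of the non-zero pixels of a binary mask."""
    pixels = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    m00 = float(len(pixels))
    cx = sum(x for x, _ in pixels) / m00
    cy = sum(y for _, y in pixels) / m00

    def eta(p, q):
        # Central moment, normalized for scale.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]

def shape_distance(h1, h2):
    """Largest absolute difference between two invariant vectors (assumed measure)."""
    return max(abs(u - v) for u, v in zip(h1, h2))
```

Among several large regions of similar color, the one whose invariants lie closest to those of the reference logo would be kept. Because the invariants do not change under translation, scaling and rotation, the same reference works at different distances and camera angles.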
Flowchart of our hybrid algorithm
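The control flow of the hybrid approach might look like the sketch below, where the slow SIFT stage runs only until a logo is found and the fast color stage handles subsequent frames. The three callables are hypothetical placeholders for the stages described earlier. Falling back to SIFT when tracking loses the logo is an assumption, since the flowchart itself is not reproduced in this transcript.

```python
def run_hybrid(frames, sift_detect, calibrate_color, color_track):
    """Return one detected region (or None) per frame.

    sift_detect(frame) -> region or None             (slow initial detection)
    calibrate_color(frame, region) -> color model    (online calibration)
    color_track(frame, model) -> region or None      (fast subsequent detection)
    """
    results = []
    color_model = None
    for frame in frames:
        if color_model is None:
            # Slow path: SIFT until the logo is found, then calibrate its color.
            region = sift_detect(frame)
            if region is not None:
                color_model = calibrate_color(frame, region)
        else:
            # Fast path: color plus moment invariants on subsequent frames.
            region = color_track(frame, color_model)
            if region is None:
                # Assumption: tracking lost, re-run SIFT on the next frame.
                color_model = None
        results.append(region)
    return results
```

Because the color model is rebuilt on every SIFT detection, a change of lighting that breaks tracking would be absorbed by the next calibration.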
Sub-block processing
Minimum logo size
SIFT image scale down
Memory access
Lookup table for Gaussian Mixture Model
SIFT image size   Total SIFT Points   Number of Matches   Detection Time (ms)   Detection Rate (%)
160 x 120         115                 17                  727                   98.9
320 x 480         163                 34                  1531                  96.4
640 x 480         238                 28                  4138                  100
Table 1. SIFT detection rates for the Samsung logo using different image sizes

Logo                   Total SIFT Points   Number of Matches   Detection Rate (%)   Detection Time (ms)
DHL                    101                 24                  87.6                 747
UTD                    55                  8                   94.0                 439
IEEE                   45                  9                   98.4                 430
Samsung                115                 17                  98.9                 727
National Instruments   71                  8                   92.2                 499
Table 2. Detection rates and times for different sample logos
Logo                   Tracking Rate (%)   With Filtering (%)   Time (ms)
DHL                    84.2                98.1                 56
UTD                    87.6                98.6                 47
IEEE                   88.2                98.4                 53
Samsung                95.9                99.8                 55
National Instruments   94.7                99.6                 46
Table 3. Tracking results per frame with and without using median filtering.
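The slides do not say which signal is median filtered. As a sketch only, the code below assumes a sliding temporal median applied to a per-frame sequence, such as a detection flag or a tracked coordinate, with an odd window length and edge replication. The function name and window size are hypothetical.

```python
def median_filter(values, window=3):
    """Sliding median over a per-frame sequence (odd window, edges replicated)."""
    half = window // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [sorted(padded[i:i + window])[half] for i in range(len(values))]
```

With a window of 3, an isolated miss in a run of detections (flags `1, 1, 0, 1, 1`) is filled in, and a single outlier position is replaced by a neighboring value. This is one plausible reason why filtering raises the reported tracking rates.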
Videoclips of detection results: IEEE, Samsung, DHL, UTDallas
A computationally efficient logo detection algorithm is developed by combining SIFT for initial detection (~700 ms) with online color-based detection for subsequent frames (~50 ms), providing an average processing rate of 20 fps on a PC platform
Ongoing work involves porting this algorithm to the OMAP mobile platform and implementing it in real time on that platform