An Algorithm for Mobile Vision-Based Localization of Skewed Nutrition Labels that Maximizes Specificity
Vladimir Kulyukin
Department of Computer Science
Utah State University
Logan, UT, USA
Christopher Blay
YouTube, Inc.
Palo Alto, CA, USA
vkedco.blogspot.com
Introduction
● Many nutritionists consider proactive nutrition management to be a key factor in reducing and controlling cancer and diabetes
● According to the U.S. Department of Agriculture, U.S. residents have increased their caloric intake by 523 calories per day since 1970
● Enabling consumers to use computer vision on smartphones to extract nutritional information from nutrition labels (NLs) will likely result in improved nutritional decisions
Outline
● Background
● Skewed NL Localization Algorithm
● Experiments & Results
Relaxation of Alignment Constraints
● In our previous work (Kulyukin et al., IPCV 2013), we developed a vision-based algorithm for localizing horizontally or vertically aligned NLs on smartphones
● The current algorithm improves on the previous one: it handles not only aligned NLs but also NLs that are skewed up to 35-40 degrees from the vertical axis of the captured frame
● The current algorithm is designed to improve specificity, i.e., the percentage of true negative matches out of all possible negative matches
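As a minimal sketch of the definition above, specificity can be computed from the counts of true negatives and false positives. The function and parameter names are illustrative and do not come from the original slides.

```python
def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that were correctly rejected."""
    total_negatives = true_negatives + false_positives
    if total_negatives == 0:
        # Specificity is undefined when there are no negative samples.
        raise ValueError("no negative samples")
    return true_negatives / total_negatives
```

A specificity of 1.0 means that no image without an NL was reported as containing one.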
Nutritional Data Analysis Automation
● Modern nutrition management system designers and developers assume that users understand how to collect nutritional data and can be triggered into data collection with digital prompts
● Many users find it difficult to integrate nutritional data collection into their daily activities due to lack of time, motivation, or training
● The current algorithm is a step in the direction of automating nutritional data collection and analysis
Why Localize NLs?
● Because localized NLs are easier to split into text chunks
● Text chunks tend to OCR better than complete NLs (Kulyukin, Vanka, and Wang, 2013)
Detection of Edges, Lines, Corners
● The algorithm uses three image processing methods: edge detection, line detection, and corner detection
● The algorithm uses the Canny edge detector (CED) to detect edges
● After the edges are detected, the Hough Transform (HT) is applied to detect lines
● Corner detection is done for text spotting because image segments with higher concentrations of corners are likely to contain text
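To illustrate how the HT turns edge pixels into lines, here is a simplified pure-Python sketch of Hough voting in (rho, theta) space. It is not the authors' implementation, and a real system would more likely rely on an image processing library. The function name, the bin sizes, and the `min_votes` parameter are assumptions made for the example.

```python
import math
from collections import Counter

def hough_lines(edge_points, theta_step_deg=1, rho_step=1.0, min_votes=3):
    """Return (rho, theta_deg, votes) for accumulator cells with enough votes.

    Each edge point (x, y) votes for every line rho = x*cos(theta) + y*sin(theta)
    that passes through it; collinear points pile their votes into one cell.
    """
    votes = Counter()
    for x, y in edge_points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), theta_deg)] += 1
    return [
        (rho_bin * rho_step, theta_deg, count)
        for (rho_bin, theta_deg), count in votes.most_common()
        if count >= min_votes
    ]
```

Ten edge points on the horizontal line y = 5 all vote for the cell rho = 5, theta = 90 degrees. Neighbouring theta cells may collect the same number of votes because of the coarse binning.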
Detection of Edges & Lines
Figure panels: Edge Detection, Line Detection
Rotation Correction
● NLs contain higher numbers of lines with the same skew angle
● All detected lines that lie within 35 to 40 degrees of horizontal in either direction (up or down) are used to compute the average skew angle
● After the average skew angle is computed, the image is rotated to align it horizontally
● Corner detection is done after the image rotation
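The skew estimate described above can be sketched as follows, assuming that line detection yields segments as (x1, y1, x2, y2) tuples. The function name, the segment format, and the 40-degree default cutoff are assumptions for illustration. The slides give the cutoff only as a 35 to 40 degree range.

```python
import math

def average_skew_angle(segments, max_skew_deg=40.0):
    """Mean angle in degrees of near-horizontal segments, or None if none qualify."""
    angles = []
    for x1, y1, x2, y2 in segments:
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        # Fold into (-90, 90] so a segment and its reverse give the same angle.
        if angle > 90.0:
            angle -= 180.0
        elif angle <= -90.0:
            angle += 180.0
        # Keep only segments close enough to horizontal.
        if abs(angle) <= max_skew_deg:
            angles.append(angle)
    if not angles:
        return None
    return sum(angles) / len(angles)
```

The returned angle is the amount by which the image would be rotated back to bring the NL lines to horizontal. The rotation itself is left to an image library.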
Corner Projections
● Horizontal & vertical projections of corner pixels are computed
● These projections determine the top, bottom, left, and right boundaries of the region in which most corners lie
● Projection values are averaged and a projection threshold is arbitrarily set to twice the average
● The first and last indexes of each projection whose values exceed the threshold are selected as boundaries
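The projection step can be sketched in a few lines, assuming that the detected corners are given as a binary mask (a list of rows of 0/1 values). The mask representation and the function names are assumptions for illustration. Only the "twice the average" threshold comes from the slide.

```python
def projection_bounds(projection, factor=2.0):
    """First and last index whose value exceeds factor * mean, or None."""
    if not projection:
        return None
    threshold = factor * sum(projection) / len(projection)
    above = [i for i, value in enumerate(projection) if value > threshold]
    if not above:
        return None
    return above[0], above[-1]

def localize_by_corners(corner_mask):
    """Return (top, bottom, left, right) of the corner-dense region, or None."""
    if not corner_mask or not corner_mask[0]:
        return None
    row_counts = [sum(row) for row in corner_mask]            # corners per row
    col_counts = [sum(col) for col in zip(*corner_mask)]      # corners per column
    rows = projection_bounds(row_counts)
    cols = projection_bounds(col_counts)
    if rows is None or cols is None:
        return None
    return rows[0], rows[1], cols[0], cols[1]
```

Returning None when no index clears the threshold is one plausible way to report that no NL was found.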
Experiments & Results
online video
Experimental Design
● 378 images were captured with a Google Nexus 7 Android 4.3 smartphone during a shopping session in a local supermarket
● Of these, 266 contained an NL and 112 did not
● Results were manually categorized into five categories: complete true positives, partial true positives, true negatives, false positives, and false negatives
Complete & Partial True Positives
Complete (left) vs Partial (right) True Positives
NL Localization Results
PR    TR    CR    PR    SP    ACC
0.76  0.42  0.36  0.15  1.0   0.59
● PR (first column) – precision
● TR – total recall
● CR – complete recall
● PR (fourth column) – partial recall
● SP – specificity
● ACC – accuracy
NL Localization Results
● Most false negative matches were caused by blurry images
● Bottles, bags, cans, and jars account for a large share of the false negatives due to HT line detection difficulties
● NLs with irregular layouts were also difficult to detect
NL with Curved Lines & Irregular Layouts
NL with Curved Lines (left); NLs with Irregular Layouts (middle & right)