"combining vision, machine learning and natural language processing to answer everyday...
Combining Vision, Machine Learning and Natural Language Processing to Answer
Everyday Questions
Web Info: http://www.qmscientific.com http://www.priceswarm.com
Contact:
The Problem
Consumers lack tools and data to answer everyday questions simply, accurately and in real time.
• What is the best store to shop at right now for my list?
• Are there cheaper alternatives for products I buy regularly?
• How much do I spend on milk and coffee monthly?
Answers to these questions are buried in a hodgepodge of structured, unstructured, digital and physical data sources.
Solution: GPU-Powered Quazi Platform
[Diagram: data sources (Web, In Store, POS, Receipts, Crowd Source); Quazi Platform (Natural Language Processing, Computer Vision, Machine Learning, GPU)]
Quazi combines proprietary Natural Language Processing, Computer Vision and Machine Learning technology to extract, connect and organize millions of products, prices and consumer preferences from any data source.
Challenges Understanding Everyday Visual Data
Answering everyday shopping questions simply, accurately and in real time is not easy!
• Data heterogeneity
• Everyday data is unstructured
• Fast big data techniques are needed to analyze and connect massive distributed sets
GPU Accelerated Vision Technology
The parallel processing capabilities of modern GPU clusters can enable new vision technologies that provide a deeper understanding of everyday data:
• Fast and robust information extraction.
• Making connections between distributed visual and even non-visual data sources.
• Conceptual clustering of objects allows for higher-order understanding of a scene.
• Construction of visual models from sparse training sets.
Computer Vision for Retail: ReSight®
ReSight exploits QUAZI machine learning models in conjunction with fast visual processing to make sense of retail-based images (e.g. receipt data, product images…)
[Diagram labels: Image capture device, Retail visual data, Global Image Analysis, Global Object Identification, Local Analysis, Text Information, Image Information, Translation Models, Entity Models, Conceptual Clustering, QUAZI, Visual Analytics]
Processing requirements for feature extraction and model prediction at different scales are intensive! GPUs allow for massive parallelization and simultaneous prediction.
Optimized data structure primitives support highly efficient on-device processing schemes.
[Figure: Single GPU Timings for Integral Histogram Computation. Y-axis: Time (ms), 0 to 20. X-axis: Number of Pixels (9600, 38400, 87001, 154401, 348004). Series: OPENCV GPU Module, QMScientific Fast 3D Scan.]
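For readers unfamiliar with the integral histogram benchmarked above, the following is a minimal CPU reference sketch in plain Python. It is not the QMScientific Fast 3D Scan implementation or the OpenCV GPU module; the function names, the bin mapping and the `max_value` parameter are assumptions made for illustration. It shows why the structure is useful: once the table is built, the histogram of any rectangle costs four lookups per bin.

```python
def integral_histogram(image, n_bins, max_value=256):
    """Build H where H[r][c] is the histogram of the rectangle image[0:r][0:c].

    `image` is a list of rows of integer intensities in [0, max_value).
    """
    rows, cols = len(image), len(image[0])
    H = [[[0] * n_bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            cell = H[r + 1][c + 1]
            # Inclusion-exclusion over the three already computed neighbours.
            for k in range(n_bins):
                cell[k] = H[r][c + 1][k] + H[r + 1][c][k] - H[r][c][k]
            # Add the current pixel to its bin (hypothetical uniform binning).
            cell[image[r][c] * n_bins // max_value] += 1
    return H


def region_histogram(H, top, left, bottom, right):
    """Histogram of image[top:bottom][left:right] from four table lookups."""
    n_bins = len(H[0][0])
    return [
        H[bottom][right][k] - H[top][right][k] - H[bottom][left][k] + H[top][left][k]
        for k in range(n_bins)
    ]
```

The double loop is the part a GPU version would parallelize, for example by computing row and column prefix sums per bin; that mapping is not shown here.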
Global Object Identification
Multiple weak-learner models are tuned to identify objects of interest in an image.
Unique features:
• Ability to learn from very few training examples.
• High degree of robustness to lighting, occlusion, and orientation variations.
• Can exploit contextual information that takes into account neighboring regions.
Object-based segmentation (Receipt / Background)
Robust real-time tracking
Fast graph-based methods provide optimal pixel clustering based on spatial contextual constraints and weak-learner responses.
Adaptive Segmentation
Accelerated adaptive blind segmentation methods help identify regions of interest for further feature extraction and analysis.
• Object recognition engine extracts receipt information from the image.
• Intelligent segmentation determines regions of interest.
• Adaptive filtering is used to cluster regions of similarity.
• Fast connected component analysis (CCA) labels connected homogeneous clusters in a region.
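The connected component analysis step can be illustrated with a minimal sketch, assuming a binary mask as input and 4-connectivity. The slides do not say which CCA algorithm or connectivity is used, so the breadth-first flood fill and the name `label_components` below are illustrative choices only.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for each connected region.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue  # background, or already assigned to a region
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

On a receipt image, each labeled region would be a candidate block (for example a character or a line of text) handed on to local feature extraction.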
Local Analysis
Local feature extraction immune to variances in scale and orientation enables better understanding of objects within a region of interest.
Advantages:
• Concise representation of objects for fast and robust classification.
• Effective classification with sparse training examples.
Quazi – Combining NLP and Vision
Matching features: Abbreviation Patterns, Soundex Patterns, Edit Distance, Contextual Features
Receipt text: ssl lng gr wht rce
Candidate expansions:
ssl: Sunny Select
lng: Long, Long
gr: Grams, Grain
wht: White, White
rce: Rice, Rice
Result: Sunny Select Long Grain White Rice
Tags: Brand, Type, Type, Main Concept
Intelligent Similarity Search
Sunny Select Long Grain White Rice, $3.99
Available @ 5 Lbs.
$0.80 / lb.
The OCR engine combines NLP with computer vision and data mining to analyze, enhance and convert raw unstructured text found in physical data into knowledge.
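The receipt example above ("ssl lng gr wht rce") can be sketched with two of the listed features, abbreviation patterns and edit distance. This is an illustrative approximation, not Quazi's proprietary models: the vocabulary, the function names and the subsequence rule for abbreviations are assumptions, and Soundex and contextual features are left out.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete from a
                            curr[j - 1] + 1,             # insert into a
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]


def is_abbreviation(abbr, phrase):
    """Assumed rule: same first letter, and abbr is a subsequence of phrase."""
    compact = phrase.replace(" ", "").lower()
    abbr = abbr.lower()
    if not abbr or not compact or abbr[0] != compact[0]:
        return False
    remaining = iter(compact)
    return all(ch in remaining for ch in abbr)


def candidates(abbr, vocabulary):
    """Vocabulary entries matching the abbreviation, closest first."""
    matches = [p for p in vocabulary if is_abbreviation(abbr, p)]
    return sorted(
        matches,
        key=lambda p: (edit_distance(abbr.lower(), p.replace(" ", "").lower()), p),
    )


# Hypothetical vocabulary taken from the words shown on the slide.
VOCABULARY = ["Sunny Select", "Long", "Grams", "Grain", "White", "Rice"]
```

With this vocabulary, "gr" matches both "Grams" and "Grain" at the same edit distance, and the sketch breaks the tie alphabetically, which is arbitrary. That ambiguity is where the slide's contextual features (here, the neighbouring token "lng") would be needed.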
MaraNatha Creamy Almond Butter
Source: Website
Price: $8.99
Location: San Jose, CA
Chain: Target
MaraNatha Butter Creamy Almond
Source: Website
Price: $6.99
Location: San Jose, CA
Chain: Walmart
Trader Joes Creamy Almond Butter
Source: Blog
Price: ?
Location: ?
Chain: Trader Joe’s
Butter Almond Creamy Unsalted
Source: Receipt
Price: $6.99
Location: San Jose, CA
Chain: Trader Joe’s
MaraNatha Natural Almond Butter
Source: Website
Price: $9.79
Location: San Jose, CA
Chain: Costco
[Diagram labels: ALMOND BUTTER; Creamy, Crunchy; BRANDS, DESCRIPTORS; CREAMY, CRUNCHY, CREAMY, CRUNCHY]
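To make the idea of connecting these listings concrete, here is a minimal sketch that treats each product name as an unordered set of tokens, so that "MaraNatha Creamy Almond Butter" and "MaraNatha Butter Creamy Almond" compare as identical. Token-set Jaccard similarity is a stand-in chosen for illustration and is not described in the slides; the field names and helper functions are hypothetical, and unknown prices ("?") are stored as `None`.

```python
import re


def tokens(name):
    """Lowercased alphanumeric tokens of a product name, order ignored."""
    return frozenset(re.findall(r"[a-z0-9]+", name.lower()))


def jaccard(a, b):
    """Token-set similarity in [0, 1]; insensitive to word order."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


def listings_for_concept(listings, concept):
    """Listings whose name contains every token of the concept."""
    wanted = tokens(concept)
    return [item for item in listings if wanted <= tokens(item["name"])]


def cheapest_known(listings):
    """Lowest-priced listing among those with a known price."""
    priced = [item for item in listings if item["price"] is not None]
    return min(priced, key=lambda item: item["price"])


# The five listings from the slide.
LISTINGS = [
    {"name": "MaraNatha Creamy Almond Butter", "source": "Website", "price": 8.99, "chain": "Target"},
    {"name": "MaraNatha Butter Creamy Almond", "source": "Website", "price": 6.99, "chain": "Walmart"},
    {"name": "Trader Joes Creamy Almond Butter", "source": "Blog", "price": None, "chain": "Trader Joe's"},
    {"name": "Butter Almond Creamy Unsalted", "source": "Receipt", "price": 6.99, "chain": "Trader Joe's"},
    {"name": "MaraNatha Natural Almond Butter", "source": "Website", "price": 9.79, "chain": "Costco"},
]
```

All five listings fall under the concept "almond butter", and the lowest known price among them is $6.99. Separating brands from descriptors such as Creamy and Crunchy, as in the diagram, would need the entity models that this sketch does not attempt.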
Combining Machine Learning + Vision
Image Analysis with Fine Granularity
[Image region labels: Cereal General Mills Kix, Cereal, Cereal, Background]
The combination of object recognition technology, data, and conceptual clustering algorithms allows for deeper image analysis, specifically of retail image data.
PriceSwarm: Consumer Application
Technical Team to Make it Happen
Dr. Hatim F. Alqadah, CTO and Lead Vision Scientist
• PhD Electrical Engineering and M.S. Applied Mathematics
• 8+ years research and development experience including postdoctoral research at the Naval Research Laboratory, Physical Acoustics.
• Expertise in 3D sonar/electromagnetic image reconstruction, object recognition/tracking, and image processing.
• 13+ peer-reviewed publications.
Dr. Faris Alqadah, CEO and Lead Data Scientist
• PhD Computer Science
• Senior Data Scientist @PayPal, Postdoc fellow @Johns Hopkins.
• Expertise in machine learning + data mining.
• 12+ peer-reviewed publications, 2 best paper nominations + award-winning PhD