"fast 3d object recognition in real-world environments," a presentation from vangogh...
TRANSCRIPT
Copyright © 2014 VanGogh Imaging
Ken Lee, CEO
May 29, 2014
Fast 3D Object Recognition
In Real-World Environments
Company Background
• Founded in 2007
• Located in McLean, VA
• Mission: “Provide real-time 3D computer vision technology for embedded and mobile applications”
• Product: ‘Starry Night’ 3D-CV Middleware
• Operating Systems: Android and Linux
• 3D Sensors: PrimeSense, Kinect, and SoftKinetic
• Processors: ARM and Xilinx Zynq
• Applications
• 3D Printing, Parts Inspection, Robotics
• Security, Automotive, Augmented Reality
• Medical, Gaming
Starry Night 3D Middleware
The ‘Starry Night’ Middleware (Unity Plugin)
• Works in busy real-world environments
• Real-time processing
• Tolerant to noise from low-cost scanners
• Efficient
• Fully automated
• Runs on mobile or portable embedded platforms (ARM & Xilinx Zynq FPGA)
• Released on the Avnet Embedded Software Store: June 2014
Starry Night video: https://www.youtube.com/watch?v=Ro1mv007MHo&feature=youtu.be
The ‘Starry Night’ Middleware Blocks
The ‘Starry Night’ Shape-Based Registration
• Reliable — the output is always a fully formed 3D model with known feature points, despite noisy or partial scans
• Easy to use — fully automated process
• Powerful — known data structure for easy analysis and measurement
• Fast — single-step process (not iterative)
Input Scan (Partial) + Reference Model = Full 3D Model
Object Recognition Algorithm
Challenges — Scene
• Busy scene, object orientation, and occlusion
Challenges — Platform
• Mobile and embedded devices
• ARM — A9 or A15, <1 GB RAM
• Existing libraries were built for laptop/desktop platforms
• GPU processing is not always available
• Therefore, we need a very efficient algorithm
Previous Approaches Tried
• Texture-based methods
• Color-based — depends heavily on lighting and the color of the object
• Machine learning — robust, but requires training for each object
• Neither method provides a transform (i.e., orientation)
• 3D methods
• Hough transform — slow
• Geometric hashing — even slower
• Tensor matching — not good for noisy and sparse scenes
• Correspondence-based methods using rigid geometric descriptors — the models must have distinctive feature points, which is not true for most models (e.g., a cylinder)
General Concept
Diagram: the reference object’s descriptor — the distance and surface normals of randomly sampled point pairs — is compared against point pairs in the scene by distance and normal. Pairs that meet the match criteria are then fine-tuned for orientation, location, and transform.
Block Diagram — Example for One Model
Model Descriptor (Pre-processed)
1. Sample all point pairs in the model that are separated by the same distance D.
2. Use the surface normals of each pair to group the pairs into a hash table:

key           point pairs
(α1, β1, Ω1)  P1,P2   P3,P4
(α2, β2, Ω2)  P5,P6   P7,P8   P9,P10   P11,P12
(α3, β3, Ω3)  P13,P14

Note: In the bear example, D = 5 cm, which resulted in 1000 pairs.
Note: The keys are angles derived from the normals of the points:
alpha (α) = angle from the first normal to the second point
beta (β) = angle from the second normal to the first point
omega (Ω) = angle of the plane between the two points
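The pre-processing step can be sketched in Python. This is a minimal sketch, not VanGogh's implementation: only D = 5 cm and the (α, β, Ω) key come from the slides, while the distance tolerance, the angle quantization step, and all helper names are assumptions.

```python
import math
from collections import defaultdict

D = 0.05                 # pair distance in meters (5 cm, as in the bear example)
TOL = 0.002              # distance tolerance (assumed)
STEP = math.radians(15)  # angle quantization step for hash keys (assumed)

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a):   return math.sqrt(dot(a, a))
def unit(a):
    n = norm(a)
    return tuple(x / n for x in a)

def angle(u, v):
    # Angle between two unit vectors, clamped against rounding error.
    return math.acos(max(-1.0, min(1.0, dot(u, v))))

def pair_key(p1, n1, p2, n2):
    """Quantized (alpha, beta, omega) key for an oriented point pair:
    alpha = angle from the first normal to the direction of the second point,
    beta  = angle from the second normal to the direction of the first point,
    omega = angle between the two normals (standing in for the slide's
            'angle of the plane between two points')."""
    alpha = angle(n1, unit(sub(p2, p1)))
    beta = angle(n2, unit(sub(p1, p2)))
    omega = angle(n1, n2)
    return tuple(int(a / STEP) for a in (alpha, beta, omega))

def build_descriptor(points, normals):
    """Hash table mapping quantized angle keys to the list of point-index
    pairs separated by (approximately) the distance D."""
    table = defaultdict(list)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(norm(sub(points[i], points[j])) - D) <= TOL:
                table[pair_key(points[i], normals[i],
                               points[j], normals[j])].append((i, j))
    return table
```

Because the same `pair_key` function is reused at recognition time, matching scene pairs land in the same bucket regardless of how the quantization boundaries fall.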
Object Recognition of the Model (Real-time)
1. Grab the scene.
2. Sample a point pair with distance D using RANSAC.
3. Generate a key using the same hash function.
4. Use the key to retrieve similarly oriented point pairs in the model and a rough transform.
5. Apply the match criteria to find the best match.
6. Use ICP to refine the transform.
Note: The example scene has around 16K points.
Note: We iterated this sampling process 100 times.
Note: The entire process can be easily parallelized.
Very important: Multiple models can be found using a single hash table for each sampled point pair in the scene.
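The real-time loop can be sketched as follows. This is a sketch under assumptions, not the shipped implementation: it reuses a hypothetical quantized (α, β, Ω) hash function and tolerance values, uses simple vote counting as a stand-in for the slides' match criteria, and omits the rough-transform estimation and ICP refinement steps; the 100-iteration sampling count comes from the slides.

```python
import math
import random
from collections import defaultdict

D, TOL, STEP = 0.05, 0.002, math.radians(15)  # distance, tolerance, key step (assumed)
ITERS = 100                                   # the slides iterate the sampling 100 times

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a):   return math.sqrt(dot(a, a))
def unit(a):
    n = norm(a)
    return tuple(x / n for x in a)
def ang(u, v): return math.acos(max(-1.0, min(1.0, dot(u, v))))

def key(p1, n1, p2, n2):
    # Same hash function used to build the model descriptor.
    alpha = ang(n1, unit(sub(p2, p1)))
    beta = ang(n2, unit(sub(p1, p2)))
    omega = ang(n1, n2)
    return tuple(int(a / STEP) for a in (alpha, beta, omega))

def recognize(scene_pts, scene_nrm, model_table, iters=ITERS, seed=0):
    """RANSAC-style sampling: draw random scene point pairs separated by ~D,
    hash them with the same key function, and vote for the model pairs
    retrieved from the table.  (Rough-transform estimation, the full match
    criteria, and ICP refinement are omitted from this sketch.)"""
    rng = random.Random(seed)
    votes = defaultdict(int)
    for _ in range(iters):
        i, j = rng.sample(range(len(scene_pts)), 2)
        if abs(norm(sub(scene_pts[i], scene_pts[j])) - D) > TOL:
            continue  # reject pairs not separated by the model distance D
        k = key(scene_pts[i], scene_nrm[i], scene_pts[j], scene_nrm[j])
        for model_pair in model_table.get(k, ()):
            votes[model_pair] += 1
    return max(votes, key=votes.get) if votes else None
```

Each iteration is independent, which is why the slides note the process parallelizes easily; likewise, a single table holding pairs from several models lets one lookup vote for all of them at once.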
Implementation
• Result — see the object recognition video:
https://www.youtube.com/watch?v=h7whfei0fTw&feature=youtu.be
Performance
Reliability (w/ bear model)
• % false positives — depends on the scene
• Clean scene — <1%
• Noisy scene — 15%
• % negative results (cannot find the object)
• Clean scene — <1%
• Noisy scene — 25% (also takes longer)
• Effect of orientation on success ratio
• Model facing front — >99%
• Model facing backward — >99%
• Model facing sideways — 65%
Image: example of a false positive
Performance — Mobile
• Performance on a Cortex-A15 2 GHz ARM (on an Android mobile)
• Time to find one object
• Single-thread — 4 seconds
• Multi-thread & NEON — 1 second
• Time to find two objects
• Single-thread — 5.2 seconds
• Multi-thread & NEON — 1.4 seconds
Hardware Acceleration — FPGA (Xilinx Zynq)
• Select functions to be implemented in Zynq
• FPGA — matrix operations
• Dual-core ARM — data management + floating point
• Entire implementation done in C++ (Xilinx Vivado HLS)
Performance — Embedded using FPGA
• Note: Currently, only 30% of the computationally intensive functions are implemented on the FPGA, with the rest still running on the ARM A9. Therefore, it should be much faster once we can move most of these to the FPGA.
• Performance on Xilinx Zynq (Cortex-A9 800 MHz + FPGA)
• Time to find one object
• Zynq 7020 — 6 seconds
• Zynq 7045 (est.) — <1 second
• No test result for two objects, but it should scale the same way as on the ARM.
Lessons Learned
• The object recognition implemented is quite reliable
• The algorithm does a great job of recognizing multiple models with minimal penalty
• More improvement is needed for noisy environments and certain object orientations
• Additional improvement in performance is needed
• Algorithm — application-specific parameters (e.g., size of the model descriptor)
• ARM — NEON, algorithm improvement
• FPGA — optimize the use of the FPGA core
Summary
• Key implementation issues
• Model descriptor
• Data structure
• Sampling technique
• Performance
• IMPORTANT: Both ARM & FPGA provide scalability
• Therefore: Real-time object recognition was very difficult, but was successfully implemented on both mobile and embedded platforms
• LIVE DEMO AT THE BOOTH!
Resources
• www.vangoghimaging.com
• Android 3D printing: http://www.youtube.com/watch?v=7yCAVCGvvso
• “Challenges and Techniques in Using CPUs and GPUs for Embedded Vision,” Ken Lee, VanGogh Imaging — http://www.embedded-vision.com/platinum-members/vangogh-imaging/embedded-vision-training/videos/pages/september-2012-embedded-vision-summit
• “Using FPGAs to Accelerate Embedded Vision Applications,” Kamalina Srikant, National Instruments — http://www.embedded-vision.com/platinum-members/national-instruments/embedded-vision-training/videos/pages/september-2012-embedded-vision-summit
• “Demonstration of Optical Flow algorithm on an FPGA” — http://www.embedded-vision.com/platinum-members/bdti/embedded-vision-training/videos/pages/demonstration-optical-flow-algorithm-fpg
• Reference: “An Efficient RANSAC for 3D Object Recognition in Noisy and Occluded Scenes,” Chavdar Papazov and Darius Burschka, Technische Universitaet Muenchen (TUM), Germany
Thank you