
Page 1


A High Performance Robot Vision Algorithm Implemented in Python

S. Chris Colbert¹, Gregor Franz², Konrad Wöllhaf², Redwan Alqasemi¹, Rajiv Dubey¹

¹Department of Mechanical Engineering, University of South Florida
²University of Applied Sciences Ravensburg-Weingarten, Germany

SciPy 2010

Page 2

Why the Need for Autonomous Robot Vision?

It has broad applicability.

Industrial automation

Nuclear waste handling

Assistive and service-oriented robots

According to a 2006 US Census Bureau report, 51.2 million Americans suffer from some form of disability, and 10.7 million of them are unable to independently perform activities of daily living (ADL).

Assistive robots can help with this.

Page 3

The State of the Art

Autonomous object recognition algorithms come in two forms:

A priori knowledge based

Novelty based

Page 4

The State of the Art

How do the a priori knowledge based systems work?

The system starts with one or more models.

3D models, images, features, ...

Match the object against the database.

Retrieve information.

Example: Schlemmer et al.

1 Store shape and appearance.

2 Find shape in range data.

3 Match appearance data.

4 Grasp via visual servoing and matched SIFT points.

Page 5

The State of the Art

How do novelty based systems work?

With lots and lots of data.

Stereo, shape-from-silhouettes, laser rangers.

Full access to the object is typically required.

Long computation times.

Example: Yamazaki et al.

1 Drive robot around object.

2 Capture more than 130 images.

3 Perform dense disparity reconstruction.

4 Wait around 100 s for the computed results.

Page 6

Objectives

Reconstruct the shape and pose of a novel object to a degree of accuracy sufficient to permit grasp and manipulation planning.

Require no a priori knowledge of the object, beyond the assumption that a given object is the object of interest.

Require only a minimal number of images for reconstruction; significantly fewer than the status quo.

Operate efficiently, such that the computation time is negligible in comparison to image capture times.

Page 7

Algorithm Overview

The algorithm has three main phases:

1 Capture three images of the object and generate a silhouette of the object for each image.

2 Use the silhouettes to generate a point cloud that approximates the surface of the object.

3 Improve the approximation by fitting a parametrized shape to the points. The parameters of this shape serve as the model of the object.

Page 8

Image Capture

Three images are captured from disparate locations.

Two frontal, one overhead.

Mutually orthogonal viewing directions are preferred.

But this can be relaxed due to kinematic constraints.

Store the reprojection matrix ${}^{C}_{W}T$ for each image.

Page 9

Pre-Processing: Image Undistortion

Image distortion is corrected according to the equations

$$x_p = \left(1 + k_1 r^2 + k_2 r^4 + k_3 r^6\right) x_d + \left(2 p_1 x_d y_d + p_2 (r^2 + 2 x_d^2)\right)$$

$$y_p = \left(1 + k_1 r^2 + k_2 r^4 + k_3 r^6\right) y_d + \left(2 p_2 x_d y_d + p_1 (r^2 + 2 y_d^2)\right)$$

where $(x_d, y_d)$ are the distorted image points and $(k_1, k_2, k_3, p_1, p_2)$ are the five distortion coefficients.
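As a concrete reading of the equations above, here is a minimal NumPy sketch (the function name is mine, and normalized image coordinates with $r^2 = x_d^2 + y_d^2$ are assumed; in practice a library routine such as OpenCV's undistortion would typically be used):

```python
import numpy as np

def undistort_points(xd, yd, k1, k2, k3, p1, p2):
    """Correct distorted points (xd, yd) per the equations above.

    Assumes normalized image coordinates with r^2 = xd^2 + yd^2.
    """
    r2 = xd ** 2 + yd ** 2
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xp = radial * xd + (2.0 * p1 * xd * yd + p2 * (r2 + 2.0 * xd ** 2))
    yp = radial * yd + (2.0 * p2 * xd * yd + p1 * (r2 + 2.0 * yd ** 2))
    return xp, yp
```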

Page 10

Pre-Processing: Silhouette Generation

Color-based segmentation is used to generate the silhouette of the object in each image.
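The talk does not show the segmentation code. As an illustrative sketch only (the function and the HSV thresholds are hypothetical, chosen for the red test objects described later; the talk itself used the scikits.image OpenCV bindings):

```python
import numpy as np
import cv2

def red_silhouette(bgr):
    """Binary silhouette of a red object via HSV thresholding (sketch)."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    # red hue wraps around 180 in OpenCV, so two ranges are needed
    lo = cv2.inRange(hsv, np.array([0, 120, 70]), np.array([10, 255, 255]))
    hi = cv2.inRange(hsv, np.array([170, 120, 70]), np.array([180, 255, 255]))
    return cv2.bitwise_or(lo, hi)
```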

Page 11

Surface Approximation

Once the images have been captured and preprocessed, the 3D surface of the object is approximated in the form of a point cloud. This comprises two major steps:

1 Create a sphere of points that completely bounds the object.

Requires the calculation of a centroid and radius.

2 Modify the position of each point such that the projection of the point intersects or lies on the edge of the silhouette.

Page 12

Bounding Sphere Construction: Finding the Centroid

1 For each silhouette, project a ray from the camera center through the imaged silhouette centroid.

2 Find the single point which minimizes the sum of squared distances to each ray (see the sketch below).

3 This point is the approximate centroid of the object and is used as the centroid of the sphere.
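The least-squares intersection in step 2 has a small closed form; a minimal sketch (names are mine), assuming each ray is given by a camera center and a direction vector:

```python
import numpy as np

def intersect_rays(origins, directions):
    """Point minimizing the sum of squared distances to all rays.

    origins, directions: (n, 3) arrays, one ray per silhouette.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(origins, directions):
        d = d / np.linalg.norm(d)
        M = np.eye(3) - np.outer(d, d)  # projector onto the plane normal to d
        A += M
        b += M @ c
    return np.linalg.solve(A, b)  # the approximate object centroid
```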

Page 13

Bounding Sphere Construction: Finding the Radius

[Figure: a silhouette with r_max labeled]

1 For each silhouette, find r_max as shown in the figure.

2 Select the silhouette with the largest r_max for further processing.

Page 14

Bounding Sphere Construction: Finding the Radius

[Figure: the camera center, the centroid ray, and the ray through r_max, with points p_1 through p_4 labeled]

$$p_4 = p_3 + (p_3 - p_1)\,t, \qquad t = \frac{-(p_1 - p_2) \cdot (p_3 - p_2)}{(p_1 - p_2) \cdot (p_3 - p_1)}$$

1 Project two rays from the camera center: one through the centroid, and one through r_max.

2 Construct a plane at the centroid that is perpendicular to the centroid ray.

3 Find the point that lies on this plane and on the ray containing r_max (see the sketch below).
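The formula above is a direct line-plane intersection. A minimal sketch, under my reading of the figure (p1 the camera center, p2 the centroid, p3 a point on the ray through r_max):

```python
import numpy as np

def radius_point(p1, p2, p3):
    """Intersect the r_max ray with the plane through the centroid.

    The plane passes through p2 with normal (p1 - p2); the line passes
    through p3 with direction (p3 - p1). Returns p4 in the slide's notation.
    """
    n = p1 - p2
    t = -np.dot(n, p3 - p2) / np.dot(n, p3 - p1)
    return p3 + (p3 - p1) * t
```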

Page 15

Bounding Sphere Construction: Example

Example: A simulated cylinder bounded by the computed sphere.

Points are generated using a simple routine based on the golden ratio (one such routine is sketched below).

The radius of the sphere is generally increased by a factor to ensure complete bounding of the object.
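The slide names only "a simple routine based on the golden ratio"; one common such routine (this exact variant is an assumption, not necessarily the talk's) is the Fibonacci sphere:

```python
import numpy as np

def fibonacci_sphere(n, center, radius):
    """n roughly uniform points on a sphere, spun by the golden angle."""
    i = np.arange(n)
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    z = 1.0 - 2.0 * (i + 0.5) / n  # evenly spaced heights in [-1, 1]
    r = np.sqrt(1.0 - z * z)
    pts = np.stack([r * np.cos(i * golden_angle),
                    r * np.sin(i * golden_angle),
                    z], axis=1)
    return np.asarray(center) + radius * pts
```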

Page 16

Point Cloud Manipulation

[Figure: a bounding-sphere point x_i, its projection p, and the nearest silhouette pixel p′]

x_0 = sphere center; c_0 = camera center; x_i = sphere point; x_i^new = modified x_i

1 Project x_i into the silhouette image to get x′_i.

2 If x′_i intersects the silhouette, do nothing.

3 Otherwise, find the silhouette pixel p′ nearest the projection.

4 Let the line c_0 p′ be L1.

5 Let the line x_0 x_i be L2.

6 Let x_i^new be the point of intersection of lines L1 and L2 (see the sketch below).
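Steps 4-6 amount to intersecting two 3D lines. Since back-projected rays rarely meet exactly, a sketch can take the point on L2 closest to L1 (this tie-break is my assumption):

```python
import numpy as np

def closest_point_on_l2(c0, u, x0, v):
    """Point on L2 (x0 + t*v) closest to L1 (c0 + s*u).

    L1 is the camera ray through p'; L2 runs from the sphere center
    through xi. Returns the modified point xi_new.
    """
    w = x0 - c0
    uu, vv, uv = u @ u, v @ v, u @ v
    t = (uv * (u @ w) - uu * (v @ w)) / (uu * vv - uv * uv)
    return x0 + t * v
```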

Page 17

Point Cloud Manipulation: Results

The procedure is applied to each point once in each silhouette image.

This is a significant improvement over other algorithms.

The result is a rough approximation of the object's surface.

Given an infinite number of images, the approximation would converge to the visual hull.

Rather than capturing more images, we improve the approximation by fitting a superquadric to the points.

Page 18

Shape Fitting: Overview

The final phase of the algorithm is to find a shape that best fits the point cloud. We use superquadrics as our modeling tool for a variety of reasons:

They have a convenient parametrized form which can be used directly for grasp planning.

Their closed-form expression provides a nice basis for non-linear minimization.

Their nature makes them robust to small sources of error.

They are capable of accurately approximating the shape of many objects used in activities of daily living.

Page 19

Superquadrics: Some Possible Shapes

[Figure: a gallery of example superquadric shapes]

Page 20

Superquadrics: Standard Equation

Implicit Superquadric Equation

$$F(x_w, y_w, z_w) = \left( \left( \frac{n_x x_w + n_y y_w + n_z z_w - p_x n_x - p_y n_y - p_z n_z}{a_1} \right)^{\frac{2}{\varepsilon_2}} + \left( \frac{o_x x_w + o_y y_w + o_z z_w - p_x o_x - p_y o_y - p_z o_z}{a_2} \right)^{\frac{2}{\varepsilon_2}} \right)^{\frac{\varepsilon_2}{\varepsilon_1}} + \left( \frac{a_x x_w + a_y y_w + a_z z_w - p_x a_x - p_y a_y - p_z a_z}{a_3} \right)^{\frac{2}{\varepsilon_1}}$$

Evaluates to 1 if a point $(x_w, y_w, z_w)$ lies on the superquadric.

$F(x_w, y_w, z_w)$ is also called the inside-outside function.

17 parameters at first glance; 6 are redundant.
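A direct NumPy transcription of the inside-outside function (function and argument names are mine; the absolute values, standard in superquadric fitting, keep the fractional powers real):

```python
import numpy as np

def inside_outside(pts, n, o, a_dir, p, a1, a2, a3, e1, e2):
    """Evaluate F for an (m, 3) array of world points; F == 1 on the surface."""
    d = pts - p  # note n.(x - p) equals the slide's n.x - p.n, etc.
    x = np.abs(d @ n) / a1
    y = np.abs(d @ o) / a2
    z = np.abs(d @ a_dir) / a3
    return (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)
```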

Page 21

Superquadrics

The parameters $(n_x, n_y, n_z, o_x, o_y, o_z, a_x, a_y, a_z, p_x, p_y, p_z)$ make up the 4x4 transformation matrix that relates the superquadric-centered coordinate system to the world coordinate system.

$${}^{W}_{Q}T = \begin{bmatrix} n_x & o_x & a_x & p_x \\ n_y & o_y & a_y & p_y \\ n_z & o_z & a_z & p_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

The 3x3 rotation portion is orthonormal and can be decomposed into the ZYZ-Euler angles $(\phi, \theta, \psi)$. Thus, the superquadric is parametrized by 11 parameters:

$$\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_{11}) = (a_1, a_2, a_3, \varepsilon_1, \varepsilon_2, \phi, \theta, \psi, p_x, p_y, p_z)$$
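For reference, a sketch of recovering the ZYZ-Euler angles from the rotation block (assumes sin θ ≠ 0; the degenerate case needs separate handling):

```python
import numpy as np

def zyz_euler(R):
    """ZYZ angles from R = Rz(phi) @ Ry(theta) @ Rz(psi)."""
    theta = np.arccos(np.clip(R[2, 2], -1.0, 1.0))  # R[2, 2] = cos(theta)
    phi = np.arctan2(R[1, 2], R[0, 2])
    psi = np.arctan2(R[2, 1], -R[2, 0])
    return phi, theta, psi
```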

Page 22

Superquadrics

What do the 11 parameters represent?

$(a_1, a_2, a_3)$ are the dimensions of the superquadric in the $x$, $y$, and $z$ directions.

$(\varepsilon_1, \varepsilon_2)$ are the shape exponents.

$(\phi, \theta, \psi)$ are the ZYZ-Euler angles, which define the orientation.

$(p_x, p_y, p_z)$ are the $(x, y, z)$ world coordinates of the centroid of the superquadric.

Page 23

Classical Cost Function

Inside-Outside function

$F = F(x_w, y_w, z_w, \lambda_1, \lambda_2, \ldots, \lambda_{11})$

Cost function

$$\min_{\Lambda} \sum_{i=1}^{n} \left( \sqrt{\lambda_1 \lambda_2 \lambda_3}\, \left( F^{\varepsilon_1} - 1 \right) \right)^2$$

Standard cost function as derived by Jaklič, Leonardis, and Solina.

$\sqrt{\lambda_1 \lambda_2 \lambda_3}$ recovers the smallest superquadric.

The $\varepsilon_1$ exponent promotes rapid and robust convergence.

A non-linear gradient descent algorithm is used to find $\Lambda$.

Limits are placed on certain $\lambda_i$ to restrict the range of recoverable shapes.
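Putting the pieces together, a hedged sketch of the fit. The talk says only that SciPy's non-linear minimization was used; the solver choice here and the starting point lam0 and bounds (both user-supplied) are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def rotation_zyz(phi, theta, psi):
    """ZYZ-Euler rotation; its columns are the axes n, o, a."""
    cph, sph = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cps, sps = np.cos(psi), np.sin(psi)
    Rz1 = np.array([[cph, -sph, 0], [sph, cph, 0], [0, 0, 1]])
    Ry = np.array([[cth, 0, sth], [0, 1, 0], [-sth, 0, cth]])
    Rz2 = np.array([[cps, -sps, 0], [sps, cps, 0], [0, 0, 1]])
    return Rz1 @ Ry @ Rz2

def cost(lam, pts):
    """Standard superquadric recovery cost for an (m, 3) point cloud."""
    a1, a2, a3, e1, e2, phi, theta, psi, px, py, pz = lam
    R = rotation_zyz(phi, theta, psi)
    d = (pts - np.array([px, py, pz])) @ R  # points in the superquadric frame
    x, y, z = np.abs(d.T) / np.array([a1, a2, a3])[:, None]
    F = (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)
    return np.sum((np.sqrt(a1 * a2 * a3) * (F ** e1 - 1.0)) ** 2)

# result = minimize(cost, lam0, args=(pts,), bounds=bounds)
```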

Page 24

Error-Rejecting Cost Function

Modified Cost Function

$$\min_{\Lambda} \left[ w \sum_{i=1}^{n} \left( \sqrt{\lambda_1 \lambda_2 \lambda_3}\, \left( F^{\varepsilon_1} - 1 \right) \right)^2 + (1 - w) \sum_{i \,:\, F^{\varepsilon_1} < 1} \left( \sqrt{\lambda_1 \lambda_2 \lambda_3}\, \left( F^{\varepsilon_1} - 1 \right) \right)^2 \right]$$

Penalizes points that lie inside the superquadric.

Forces the superquadric to reject perspective projection errors.

The superquadric will be as large as possible, without exceeding the visual hull.

Empirically determined w = 0.2.
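The modification is essentially a reweighting of the same residuals. A sketch, reusing the inside-outside values F of all cloud points from the previous sketch:

```python
import numpy as np

def modified_cost(F, a1, a2, a3, e1, w=0.2):
    """Error-rejecting cost: extra weight on points inside the superquadric."""
    r = np.sqrt(a1 * a2 * a3) * (F ** e1 - 1.0)
    inside = F ** e1 < 1.0  # points strictly inside the recovered shape
    return w * np.sum(r ** 2) + (1.0 - w) * np.sum(r[inside] ** 2)
```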

Page 25

Cost Function Comparison

[Figure: the surface approximation (left) fit with the standard cost function (center) and the modified cost function (right); the modified cost function yields 20% greater accuracy]

Page 26

Example Reconstruction

Page 27

Example Reconstruction

Page 28

Simulation Trials: Overview

Tested the algorithm against a simulated sphere, cylinder, prism, and cube.

These shapes represent a range of common convex shapes and can be modeled accurately by a superquadric.

The results are reported by comparing the recovered superquadric parameters against the known ground truth.

The volume of the superquadric is also compared against the volume of the object in the form of a fraction (see the sketch below).

The volume fraction vf is a quick and intuitive measure of accuracy.
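The superquadric volume has a closed form (given, e.g., by Jaklič et al.), which makes the volume fraction cheap to compute; a sketch:

```python
from scipy.special import beta

def superquadric_volume(a1, a2, a3, e1, e2):
    """Closed-form superquadric volume.

    Sanity check: a1 = a2 = a3 = r with e1 = e2 = 1 gives 4*pi*r**3 / 3.
    """
    return (2.0 * a1 * a2 * a3 * e1 * e2
            * beta(e1 / 2.0 + 1.0, e1) * beta(e2 / 2.0, e2 / 2.0))

# vf = superquadric_volume(*recovered_params) / true_volume
```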

Page 29

Simulation Trials: Results

[Figure: the four reconstructions, with volume fractions vf = 1.087, 1.088, 1.077, and 1.092]

Page 30

Hardware Setup: Robot

Kuka KR 6/2

Six-axis, low-payload industrial manipulator.

High repeatability: ±0.1 mm

Page 31

Test Setup: Viewing Positions

Due to kinematic limitations of the robot, the viewing directions were not perfectly orthogonal, but they approached such a condition.

Page 32

Test Setup: Test Objects

Four test objects, all red in color, which represent a range of frequently encountered shapes.

Page 33

Experimental Trials: Sources of Error

Several sources of error are introduced in the hardware implementation that are not present in the simulation environment:

Imprecise camera calibration: intrinsics and extrinsics

Robot kinematic uncertainty

Imperfect segmentation

Ground truth measurement uncertainty (must be measured with the robot).

Page 34

Experimental Trials: Battery Box

vf = 1.18

Page 35

Experimental Trials: Cup Stack

vf = 1.13

Page 36

Experimental Trials: Yarn Ball

vf = 1.14

Page 37

Experimental Trials: Cardinal Statue

vf = N/A

Page 38

Performance Evaluation

With respect to the stated objectives:

Most parameters of the reconstruction differ from the ground truth by no more than a few percent. This should be well within the margin of error for most household retrieval tasks.

Contrast this with an error of 10% in the work by Yamazaki et al.

On an Intel QX-9300 at 2.53 GHz, the algorithm executes in ~0.3 seconds on average. This time depends largely on the time required for the non-linear minimization routine to converge.

The work by Yamazaki et al. required in excess of 100 seconds to converge.

Page 39

Python Implementation: Core Algorithm

NumPy

Basic image data structure

Linear algebra

Cython

Image processing and superquadric gradient

Scikits.Image

OpenCV bindings

SciPy

Non-linear minimization

Page 40

Python Implementation: Simulation

Mayavi

Simulator engine

3D renderings

Traits UI

Simulator UI

Page 41

Python Implementation: Hardware Networking

OpenOPC

Robot control and communication

xmlrpclib

Exposes the core algorithm to the network

PIL

Camera JPEG -> NumPy conversion

Page 42

Summary

This work has presented an algorithm for the shape and pose reconstruction of novel objects using just three images.

The algorithm requires fewer images than other algorithms in the published literature, and provides sufficient accuracy for grasp and manipulation planning.

The algorithm provides higher performance than the other algorithms in the published literature.

The algorithm is implemented entirely in Python, using libraries such as NumPy, SciPy, Cython, Mayavi, Traits, OpenOPC, and PIL.
