The 14th IFToMM World Congress, Taipei, Taiwan, October 25-30, 2015

DOI Number: 10.6567/IFToMM.14TH.WC.FA.020

Pose Estimation Using Redundant Measurements and Polar-decomposition Filtering

Xiao He∗ Jorge Angeles† Jozsef Kovecses‡

Department of Mechanical Engineering & Centre for Intelligent Machines, McGill University, Montreal, QC, CANADA

Abstract: Rigid-body pose estimation is a recurrent problem that arises in many application areas, such as robotics, rehabilitation and assembly. As such, the problem has been the subject of intensive research, and it still is. The motivation behind the procedure reported here is the calibration of a spherical parallel manipulator, the Agile Wrist, intended to be used as the pitch-roll-yaw joystick of a haptic device. The procedure is based on redundant measurements, to filter measurement noise, and polar-decomposition filtering, to extract the orthogonal factor of a still-noisy estimate. The measurement system is a binocular camera. The reported results show an acceptable performance of both the data-acquisition system and the computational procedure.

Keywords: rigid-body pose estimation, polar-decomposition theorem, pose-data acquisition, binocular camera

I. Introduction

Knowing the pose (the position of a landmark point and the orientation) of a rigid body is a recurrent problem that arises in multiple application domains: robotics, rehabilitation, assembly, multi-axis machining, haptics, etc. Because of its multiple applications, the problem has been intensively and extensively investigated over the years; it is still a subject of research. It is well known that the pose of a rigid body is fully determined by the position of three noncollinear points of the body [1]. However, determining the pose of the body requires knowledge of the position of the three points in both a reference and a current pose, which brings about the need for accurate measurements of the point coordinates. With every measurement incurring an error, redundant measurements are required to filter the error. The number of measured points can thus vary from four, the very minimum that provides redundancy, to a thousand points, known as clouds [2].

Reported in this paper is a novel algorithm that produces the pose of a rigid body from point-position data provided by a stereovision camera. The algorithm also uses polar-decomposition filtering to remove measurement and round-off errors, as opposed to an optimization approach, which is most common in the literature. The rigid body whose pose is under estimation is the moving plate (MP) of the Agile Wrist, a three-degree-of-freedom (three-dof) spherical parallel manipulator. Due to the symmetries involved in the MP and the Agile Wrist, together with the dimensions involved and the limited vision field of the stereovision system, only four points are tracked in the work reported here.

The use of computer vision systems in object tracking is not new. Most recently, Giancola et al. [3] used a trinocular vision system to reconstruct the position and orientation of a paintbrush tool moving along various trajectories. Reflective markers placed on the tool enabled the camera system to register its movements. Using additional techniques to smooth the acquired trajectory, a robot can ultimately be used to recreate the recorded motions.

With respect to polar-decomposition filtering, Baron and Angeles [4] used this method in their computation of the direct kinematics of parallel manipulators. Specifically, the direct kinematics problem they proposed solves for the pose of the moving platform by analyzing the instrumented and uninstrumented joints on each limb of the manipulator. By definition, an instrumented joint is one whose displacement is measured by sensors. The direct kinematics problem is then formulated as the minimization of a quadratic objective function subject to the non-linear constraints imposed by the orthogonality of the rotation matrix. The polar-decomposition theorem is then used to filter out the noise and round-off errors incurred in the computation of the rotation matrix.

Recently, an approach similar to the one reported here was proposed by Wolf and Sharf [5], who regarded the transformation matrix produced by the noisy measurements of their set of rigid-body points as a strain tensor. The rotation matrix was then produced by means of the concept of a Cosserat tensor. In our approach, we regard the transformation matrix as a general affine transformation. Upon application of the polar-decomposition theorem [6], we extract the orthogonal component of the putative rotation matrix, which still contains unfiltered measurement noise.

Horowitz et al. [2], in turn, reported a method applicable to a cloud of points using an optimization approach. The gist of this method lies in converting an equality-constrained least-squares problem (the constraint being introduced to force the underlying rotation matrix to be proper orthogonal) into an inequality-constrained problem. They do this by resorting to the concept of orbitope, described in the references therein. In a nutshell, a proper orthogonal matrix can be visualized as a point lying on the boundary of a unit sphere in the four-dimensional space of the matrix Euler-Rodrigues parameters. The orbitope becomes the solid four-dimensional sphere. This transformation helps the computation of the solution because solving an inequality-constrained optimization problem is simpler than solving its equality-constrained counterpart. The authors' rationale is that, since the inequality-constrained problem is convex, a) it admits only one optimum, which is thus the global optimum, and b) the optimum lies on the boundary of the orbitope, i.e., on the surface of the unit sphere, which represents a proper orthogonal matrix, namely, a rotation.

II. Problem Formulation

The Agile Wrist originally served as the mechanism carrying the end-effector of a serial robotic arm [7]. This spherical parallel manipulator has been recently converted into a haptic joystick by replacing the original end-effector with a simple joystick handle, as depicted in Fig. 1. As for its size, the Agile Wrist measures roughly 0.38 × 0.38 × 0.46 m.

Fig. 1: The Agile Wrist with its joystick handle attached, and the Bumblebee camera visible at the top

With the joystick column now rigidly attached to the moving plate (MP), the task at hand is to determine the orientation of this rigid body for every pose of the joystick. Traditionally, this can be obtained via the forward displacement analysis (FDA), which uses the kinematic model of the parallel manipulator and joint sensor data to calculate the desired orientation. However, this approach does not take into account the possible machining errors and misalignments that inevitably arise during the manufacturing and assembly of the mechanism. Consequently, the result of the FDA might not reflect the actual orientation of the platform for each pose. Hence, a more robust approach is needed.

The method outlined in this paper uses a 3D stereovision camera, along with coloured recognition markers, to visually estimate the orientation of the MP. The 3D camera, known commercially as the Bumblebee, is able to output the Cartesian coordinates of a coloured marker in real time. In determining the orientation of the joystick, three markers are mounted on the MP to represent the points needed in defining the pose of a rigid body, with a fourth marker on the joystick column serving as a redundant data point in the measurement. The presence of four data points greatly improves the robustness of the algorithm, since the redundancy allows part of the measurement error to be averaged out in a least-squares fit before the result is filtered using polar-decomposition.

III. Data-acquisition System

The Bumblebee stereovision camera, shown in Fig. 1, is mounted on a frame extending from the ceiling above the Agile Wrist. This enables the system to have a top-down view of the MP. Similar to human eyes, each lens on the camera acquires an image of the joystick platform; a three-dimensional picture is then obtained from these images by a process known as image rectification. Currently, the camera is able to visually track the location of a green ball while transmitting its Cartesian coordinates in real time to a host computer running MATLAB and Simulink.

For this project, a spherical marker setup is implemented, with three markers being held by the MP. The centers of these markers form an equilateral triangle fixed to the platform. Additionally, a fourth marker mounted atop the joystick column doubles as the joystick handle, as shown in Fig. 1. With each marker being spherical, a circular profile will always be exposed to the camera regardless of the tilt angle of the MP. However, a current limitation of the visual recognition system is that the Bumblebee camera can only track the location of one marker at a time. Therefore, the coordinates of the four marker points must be captured sequentially.

IV. Pose Estimation and Polar-decomposition

The data capture is carried out for numerous joystick poses, which correspond to various orientations along the pitch-roll-yaw axes. With the raw camera data parsed, the coordinates of the four marker points are obtained for the MP. Thus, we can define four points indicating the marker locations with respect to the camera reference frame. These are denoted as $M_1^j$, $M_2^j$, $M_3^j$, and $M_4^j$. With $j$ indicating the pose number, a reference pose is defined for $j = 0$, which is the pose where the MP is level with the table surface. The four points also define a rigid body in the shape of a tetrahedron whose base is an equilateral triangle, as shown in Fig. 2.

Fig. 2: Tetrahedron defining the orientation of the MP for an arbitrary pose $j$ (figure labels: vertices $M_1^j$ to $M_4^j$, centroid $C^j$, vectors $\mathbf{p}_1^j$ to $\mathbf{p}_4^j$, axes $X$, $Y$, $Z$)

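The coordinates of the four vertices are then redefined with respect to the centroid $C^j$ of the tetrahedron. With $\mathbf{m}_i^j$ and $\mathbf{c}^j$ defined as the position vectors of points $M_i^j$ and $C^j$, the latter is readily calculated as

$$\mathbf{c}^j = \frac{1}{4}\sum_{i=1}^{4} \mathbf{m}_i^j \tag{1}$$

Next, vectors $\{\mathbf{p}_i^j\}_{i=1}^{4}$ connecting $C^j$ to each respective vertex are obtained as

$$\mathbf{p}_i^j = \mathbf{m}_i^j - \mathbf{c}^j \tag{2}$$

Thus, an arbitrary pose $j$ can now be uniquely defined by a $3\times 4$ matrix $\mathbf{P}_j$, whose column vectors are simply $\{\mathbf{p}_i^j\}_{i=1}^{4}$:

$$\mathbf{P}_j = \begin{bmatrix} \mathbf{p}_1^j & \mathbf{p}_2^j & \mathbf{p}_3^j & \mathbf{p}_4^j \end{bmatrix} \tag{3}$$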
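Equations (1) to (3) amount to a few array operations. The sketch below is a minimal illustration in Python with NumPy, not the authors' MATLAB implementation; the function name `centered_point_matrix` and the marker coordinates are hypothetical and serve only to show the shapes involved.

```python
import numpy as np

def centered_point_matrix(markers):
    """Return the centroid c and the 3 x m matrix P of eqs. (1)-(3).

    `markers` holds one marker position per row, in the camera frame.
    """
    m = np.asarray(markers, dtype=float)   # shape (m, 3)
    c = m.mean(axis=0)                     # eq. (1): centroid of the markers
    P = (m - c).T                          # eqs. (2)-(3): centred vectors as columns
    return c, P

# Hypothetical marker coordinates in metres (not data from the paper):
# three markers on an equilateral triangle plus one above its centre.
markers_ref = np.array([
    [0.10, 0.0000, 0.00],
    [-0.05, 0.0866, 0.00],
    [-0.05, -0.0866, 0.00],
    [0.00, 0.0000, 0.12],
])

c0, P0 = centered_point_matrix(markers_ref)
print("centroid:", c0)
print("P0 shape:", P0.shape)
```

Because every column is measured from the centroid, the columns of the resulting matrix sum to the zero vector, which is a quick sanity check on parsed camera data.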

However, with $\{\mathbf{p}_i^j\}_{i=1}^{4}$ being calculated based on the raw camera data, the $\mathbf{P}_j$ matrices are inevitably affected by measurement and round-off errors from the vision system. To improve the accuracy of the pose estimation, these errors in the data are filtered out using polar-decomposition. In this case, the $j$-th pose of the MP can be characterized by a putative rotation matrix $\mathbf{Q}_j$ and the position vector $\mathbf{c}^j$ of the tetrahedron centroid. From eq.(3), the reference pose is denoted as $\mathbf{P}_0$, its relationship to any arbitrary pose $\mathbf{P}_j$ being expressed as

$$\mathbf{P}_j = \mathbf{Q}_j \mathbf{P}_0, \quad j = 1, 2, \ldots, n \tag{4}$$

which, for every value of $j$, represents a matrix equation in the unknown $3\times 3$ matrix $\mathbf{Q}_j$. In order to express this equation in a more familiar form, its two sides are transposed, while swapping the sides of the equation, to yield

$$\mathbf{P}_0^T \mathbf{Q}_j^T = \mathbf{P}_j^T, \quad j = 1, 2, \ldots, n \tag{5}$$

Evidently, the foregoing matrix equation involves $4\times 3 = 12$ scalar equations in $3\times 3 = 9$ unknowns, and hence, the system of equations is overdetermined. Notice that, while the dimensions of the Agile Wrist allow for four points to be clearly visible to the binocular camera, the same procedure can be applied for $m > 3$ points on the MP. In the sequel we limit ourselves to the specific case at hand, in which $m = 4$, but the same procedure can be applied, physical conditions permitting, to $m > 4$ as well.

Given the overdeterminacy of the system of equations, and its linearity, one value of $\mathbf{Q}_j$ satisfying all twelve scalar equations is not possible in general; however, an optimum estimate can be obtained via least squares. In this light, the optimum estimate, represented as $\bar{\mathbf{Q}}_j$, is obtained numerically via the QR-decomposition [9]. Symbolically, the least-squares approximation is represented by means of the Moore-Penrose generalized inverse, namely, as

$$\bar{\mathbf{Q}}_j^T = \left(\mathbf{P}_0 \mathbf{P}_0^T\right)^{-1} \mathbf{P}_0 \mathbf{P}_j^T \tag{6}$$

which readily leads to

$$\bar{\mathbf{Q}}_j = \mathbf{P}_j \mathbf{P}_0^T \left(\mathbf{P}_0 \mathbf{P}_0^T\right)^{-1} \tag{7}$$
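The least-squares step of eqs. (5) to (7) can be sketched as follows. This is an illustrative Python/NumPy version under stated assumptions: the reference matrix `P0`, the 30-degree test rotation and the noise level are hypothetical, and `numpy.linalg.lstsq` relies on an SVD-based LAPACK routine rather than the QR-decomposition cited in the text. Both routes give the same estimate when $\mathbf{P}_0$ has full row rank.

```python
import numpy as np

def estimate_unfiltered_rotation(P0, Pj):
    """Least-squares estimate of Q in Pj = Q P0, for P0 and Pj of shape (3, m)."""
    # Solve the transposed system P0^T X = Pj^T of eq. (5) for X = Q^T.
    X, *_ = np.linalg.lstsq(P0.T, Pj.T, rcond=None)
    return X.T

def estimate_closed_form(P0, Pj):
    """The same estimate written as in eq. (7); needs P0 P0^T to be invertible."""
    return Pj @ P0.T @ np.linalg.inv(P0 @ P0.T)

# Hypothetical centred reference tetrahedron (columns are the p_i vectors), in metres.
P0 = np.array([
    [0.10, -0.0500, -0.0500, 0.00],
    [0.00, 0.0866, -0.0866, 0.00],
    [-0.03, -0.0300, -0.0300, 0.09],
])

# Synthetic "measured" pose: a 30-degree rotation about Z plus small seeded noise.
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
rng = np.random.default_rng(0)
Pj = R_true @ P0 + 1e-3 * rng.standard_normal(P0.shape)

Q_bar = estimate_unfiltered_rotation(P0, Pj)
print(np.round(Q_bar, 4))
print("orthogonality defect:", np.linalg.norm(Q_bar.T @ Q_bar - np.eye(3)))
```

Solving the linear system directly is usually preferable to forming the explicit inverse in eq. (7); the closed form is kept here only to mirror the symbolic expression.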

Thus, eq.(7) can be applied to any platform with four or more markers installed, provided that the markers are not all coplanar, so that $\mathbf{P}_0 \mathbf{P}_0^T$ is invertible. For $n$ poses of the joystick platform, the pose-estimation algorithm should generate $n$ estimates of the corresponding rotation matrices. Realistically, due to the imperfection of the vision system and the inherent round-off errors in the data processing, the least-squares estimate $\bar{\mathbf{Q}}_j$ most likely fails to be orthogonal. A filtering procedure is therefore needed to extract the most likely proper orthogonal matrix from $\bar{\mathbf{Q}}_j$. This is achieved through matrix polar-decomposition, namely,

$$\bar{\mathbf{Q}}_j = \mathbf{Q}_j \mathbf{T}_j, \quad j = 1, 2, \ldots, n \tag{8}$$

Here, $\mathbf{Q}_j$ is the proper orthogonal matrix representing the best estimate of the $j$-th rotation, and $\mathbf{T}_j$ is a positive-definite symmetric matrix containing the measurement error. In the current pose estimation algorithm, $\mathbf{Q}_j$ is extracted from $\bar{\mathbf{Q}}_j$ via the Higham-Schreiber algorithm for polar-decomposition [8]. As for $\mathbf{T}_j$, it can be simply obtained from eq.(8) as

$$\mathbf{T}_j = \mathbf{Q}_j^T \bar{\mathbf{Q}}_j, \quad j = 1, 2, \ldots, n \tag{9}$$
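The factorization of eqs. (8) and (9) can be illustrated with a singular value decomposition. Note that this is a swapped-in technique: the paper cites the iterative algorithm of [8], whereas the sketch below uses the SVD route, which yields the same factors for a nonsingular matrix. The function name `polar_filter` and the test matrices are hypothetical.

```python
import numpy as np

def polar_filter(Q_bar):
    """Split Q_bar into Q (orthogonal) and T (symmetric) so that Q_bar = Q @ T."""
    U, s, Vt = np.linalg.svd(Q_bar)
    Q = U @ Vt                      # orthogonal factor of eq. (8)
    T = Vt.T @ np.diag(s) @ Vt      # symmetric factor, equal to Q.T @ Q_bar (eq. 9)
    return Q, T

# Hypothetical noisy estimate: a rotation about Z times a near-identity stretch.
angle = np.deg2rad(40.0)
R = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
S = np.array([
    [1.02, 0.01, 0.00],
    [0.01, 0.98, -0.02],
    [0.00, -0.02, 1.03],
])
Q_bar = R @ S

Q, T = polar_filter(Q_bar)
print("Q^T Q = 1:", np.allclose(Q.T @ Q, np.eye(3)))
print("det(Q):", np.linalg.det(Q))
```

The orthogonal factor is a proper rotation only when the determinant of the input is positive, which holds for a mildly noisy estimate of a rotation; a strongly corrupted estimate with negative determinant would need separate handling.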

Thus, each $j$-th pose can now be characterized by a pure rotation $\mathbf{Q}_j$, followed by a translation $\mathbf{t}_j$ given by

$$\mathbf{t}_j = \mathbf{c}^j - \mathbf{c}^0, \quad j = 1, 2, \ldots, n \tag{10}$$

For better visualization of the measurement error, the error components can be extracted as

$$\mathbf{E}_j = \mathbf{T}_j - \mathbf{1}, \quad j = 1, 2, \ldots, n \tag{11}$$

where $\mathbf{1}$ is the $3\times 3$ identity matrix. Finally, the error components are quantified as a scalar by using the normalized matrix Frobenius norm¹:

$$e_j = \|\mathbf{E}_j\|_F, \quad j = 1, 2, \ldots, n \tag{12}$$

Thus, for $n$ poses, the algorithm also generates an array of $n$ scalars, each representing the measurement error of the corresponding pose.
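A short sketch of eqs. (9) to (12) follows. It assumes that the normalized Frobenius norm of footnote 1 is the square root of $\mathrm{tr}(\mathbf{E}\mathbf{E}^T)/n$; that reading, the function name `pose_error`, and all numerical values are assumptions made for illustration, not quantities from the paper.

```python
import numpy as np

def pose_error(Q, Q_bar):
    """Scalar error e of eq. (12) for a filtered Q and an unfiltered Q_bar."""
    T = Q.T @ Q_bar                  # eq. (9)
    E = T - np.eye(3)                # eq. (11), with the 3 x 3 identity
    n = E.shape[0]
    # Assumed definition of the normalized Frobenius norm: sqrt(tr(E E^T) / n).
    return np.sqrt(np.trace(E @ E.T) / n)

# Hypothetical example: the filtered rotation is the identity and the
# unfiltered estimate is a pure stretch along X and Z.
Q = np.eye(3)
Q_bar = np.diag([1.03, 1.00, 0.97])
e = pose_error(Q, Q_bar)
print("e =", e)

# Translation of eq. (10) from two hypothetical centroids, in metres.
c_ref = np.array([0.00, 0.00, 0.03])
c_j = np.array([0.01, -0.02, 0.03])
t_j = c_j - c_ref
print("t_j =", t_j)
```

An exactly orthogonal estimate gives an error of zero, so the scalar grows with the departure of the unfiltered matrix from orthogonality.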

V. Estimation Results

Using the Bumblebee camera, the data capture was performed on twenty poses of the joystick platform. Subsequently, the various orientations of the MP were recreated virtually using the camera data and MATLAB. The tetrads representing the MP in its reference pose and at an arbitrary pose corresponding to $j = 14$, by way of example, are shown in Fig. 3.

From the raw camera data, the unfiltered rotation matrix $\bar{\mathbf{Q}}_{14}$, for example, is obtained as

$$\bar{\mathbf{Q}}_{14} = \begin{bmatrix} -0.0675 & 1.0157 & 0.0007 \\ -0.9323 & -0.0151 & 0.3778 \\ 0.4010 & 0.0376 & 0.9949 \end{bmatrix} \tag{13}$$

which is not proper orthogonal, since the product $\bar{\mathbf{Q}}_{14}^T \bar{\mathbf{Q}}_{14}$ does not yield the $3\times 3$ identity matrix, as shown below:

$$\bar{\mathbf{Q}}_{14}^T \bar{\mathbf{Q}}_{14} = \begin{bmatrix} 1.0345 & -0.0394 & 0.0467 \\ -0.0394 & 1.0333 & 0.0324 \\ 0.0467 & 0.0324 & 1.1326 \end{bmatrix} \tag{14}$$

Subsequently, extracting $\mathbf{Q}_{14}$ from $\bar{\mathbf{Q}}_{14}$ through polar-decomposition gives

$$\mathbf{Q}_{14} = \begin{bmatrix} -0.0469 & 0.9988 & -0.0132 \\ -0.9261 & -0.0385 & 0.3754 \\ 0.3744 & 0.0298 & 0.9268 \end{bmatrix} \tag{15}$$

which proves to be proper orthogonal. Finally, after extracting matrix $\mathbf{T}_j$ for all twenty poses, the array of measurement errors is computed. The errors associated with all twenty poses of the MP are recorded in Table I.
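The numbers in eqs. (13) to (15) can be cross-checked with a few lines of Python/NumPy. The sketch below uses the unscaled Newton iteration $\mathbf{X}_{k+1} = \tfrac{1}{2}(\mathbf{X}_k + \mathbf{X}_k^{-T})$, a simpler relative of the algorithm cited in [8]; it is not necessarily the variant the authors ran, and the function name `polar_newton` and its tolerance are assumptions.

```python
import numpy as np

# Unfiltered estimate of eq. (13), copied from the text.
Q_bar_14 = np.array([
    [-0.0675, 1.0157, 0.0007],
    [-0.9323, -0.0151, 0.3778],
    [0.4010, 0.0376, 0.9949],
])

def polar_newton(A, tol=1e-12, max_iter=50):
    """Orthogonal polar factor of a nonsingular matrix by Newton iteration."""
    X = np.array(A, dtype=float)
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X).T)   # average X with its inverse transpose
        if np.linalg.norm(X_next - X, ord="fro") < tol:
            return X_next
        X = X_next
    return X

gram = Q_bar_14.T @ Q_bar_14      # compare with eq. (14)
Q_14 = polar_newton(Q_bar_14)     # compare with eq. (15)
T_14 = Q_14.T @ Q_bar_14          # symmetric factor of eq. (9)

print(np.round(gram, 4))
print(np.round(Q_14, 4))
print("det(Q_14) =", np.linalg.det(Q_14))
```

Each Newton step moves every singular value $\sigma$ to $(\sigma + 1/\sigma)/2$, so a nearly orthogonal input such as this one converges in a handful of iterations.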

TABLE I: Measurement and round-off errors

Pose (j)    Errors (e_j)
1 to 5      0.0198  0.0466  0.0465  0.0691  0.0766
6 to 10     0.0228  0.0321  0.0150  0.0437  0.0352
11 to 15    0.0597  0.0337  0.0596  0.0277  0.0330
16 to 20    0.0360  0.0666  0.0394  0.0666  0.0394

¹ For an $n\times n$ matrix $\mathbf{A}$, this norm is defined as $\sqrt{(1/n)\,\mathrm{tr}(\mathbf{A}\mathbf{A}^T)}$, with $\mathrm{tr}(\cdot)$ denoting the trace of $(\cdot)$.

Fig. 3: Tetrad recreating the MP at (a) the reference pose, and (b) the pose $j = 14$

Since the poses are captured independently from each other, no correlation or pattern is present among the various error values. Moreover, the errors are obtained via the Frobenius norm of the error matrices, and hence, these error values cannot be considered as percentages. Rather, a highly accurate data capture would correspond to a value of $e_j$ that is close to zero, which, by virtue of eqs.(9) and (11), also implies that the unfiltered rotation matrix $\bar{\mathbf{Q}}_j$ is nearly orthogonal. Thus, $e_j$ is also a means of measuring the orthogonality of $\bar{\mathbf{Q}}_j$. In this case, the errors obtained are relatively small, which is also made evident by the off-diagonal entries of $\bar{\mathbf{Q}}_{14}^T \bar{\mathbf{Q}}_{14}$ being close to zero, as the reader can readily verify.

VI. Conclusions

The Bumblebee vision system was successfully used to capture the orientation of the Agile Wrist joystick platform. By incorporating the Moore-Penrose generalized inverse, the pose estimation algorithm reported here can be implemented on any similar platform with four or more recognition markers. For the Agile Wrist, the errors incurred in the measurement of the marker coordinates were filtered out using polar-decomposition. This resulted in each joystick pose being defined through a proper orthogonal rotation matrix. The algorithm proposed here can be further improved by adding multi-object recognition capabilities to the vision system, which will help to streamline the data capture process in the future.

Acknowledgements

The authors acknowledge the research support received from the Natural Sciences and Engineering Research Council of Canada (NSERC) through a Strategic Research Grant; the second author also acknowledges the support received from McGill University's James McGill Professorship of Mechanical Engineering.

References

[1] Angeles, J. Fundamentals of Robotic Mechanical Systems. Theory, Methods, Algorithms. Fourth Edition, Springer, New York.

[2] Horowitz, M.B., Matni, N., and Burdick, J. Convex relaxations of SE(2) and SE(3) for visual pose estimation. In 2014 IEEE International Conference on Robotics & Automation (ICRA), Hong Kong, May 31–June 7, 2014.

[3] Giancola, S., Chiarion, D., and Sala, R. A robot trajectory programming method using multi-camera systems. 2014 Mechatronic and Embedded Systems and Applications (MESA), Senigallia, Sept 10–12, 2014.

[4] Baron, L. and Angeles, J. The direct kinematics of parallel manipulators under joint-sensor redundancy. IEEE Transactions on Robotics and Automation, 16(1):644–651, February 2000.

[5] Wolf, A., Sharf, I., and Rubin, M.B. Using Cosserat point theory for estimating kinematics and soft-tissue deformation during gait analysis. In Advances in Robot Kinematics: Motion in Man and Machine, pp. 63–70, May 2010.

[6] Strang, G. Linear Algebra and Its Applications. Third Edition, Harcourt Brace Jovanovich College Publishers, New York.

[7] Al-Widyan, K., Ma, X.Q., and Angeles, J. The robust design of parallel spherical robots. Mechanism and Machine Theory, 46(3):335–343, March 2011.

[8] Higham, N.J. and Schreiber, R.S. Fast polar decomposition of an arbitrary matrix. SIAM Journal on Scientific and Statistical Computing, 11(4):648–655, 1989.

[9] Golub, G.H. and Van Loan, C.F. Matrix Computations. Third Edition, Johns Hopkins University Press, Baltimore.
