
International Journal of Robotics and Automation, Vol. 30, No. 3, 2015

POSITION-BASED VISUAL SERVOING IN ROBOTIC CAPTURE OF MOVING TARGET ENHANCED BY KALMAN FILTER

Benoit P. Larouche∗ and Zheng H. Zhu∗

Abstract

The paper develops a Kalman filter (KF)-enhanced position-based visual servo control strategy for autonomous robotic capture of a moving target with an eye-in-hand-configured robotic manipulator. A dual KF scheme is developed to process noisy imaging data and to track, intercept, and effect a smooth capture of the target. The first KF addresses image processing errors and temporary loss of target lock to enhance tracking robustness, while the second KF processes noise resulting from the camera's residual vibration due to the manipulator's joint flexibility. This framework not only enhances pose and velocity estimation from noisy images, but also provides smooth pose and velocity estimates for the robotic control system, which improves the efficiency of tracking and capturing a moving target. Furthermore, a composite index of pre-capture measures and a threshold logic function are introduced to activate automatic grasping of the target. Experimental results show that faster and smoother captures of a moving target are achieved with the proposed control strategy.

Key Words

Robotic manipulator, visual servoing, Kalman filter, track and capture, moving target, experiment

1. Introduction

Service robotics is a major focus for the future of automation [1], [2], where great efforts have been devoted to autonomously tracking, interacting with, and servicing targets in homes, offices, hazardous or dangerous environments, and space exploration [3]–[9]. The critical phase of such a task is to track, approach, and finally capture the target. Two types of robotic visual servoing, eye-in-hand and eye-to-hand [10], are generally used for such tasks. The eye-in-hand configuration monitors the target motion by a camera mounted on the manipulator's end-effector, whereas the eye-to-hand configuration observes the motion of the robot and target simultaneously by a fixed camera in the workspace. The former has a local but accurate view of the target and its control accuracy

∗ Department of Earth and Space Science and Engineering, York University, Canada; e-mail: [email protected], [email protected]

Recommended by Prof. Gian Luca Foresti (DOI: 10.2316/Journal.206.2015.3.206-4230)

increases as the camera approaches the target. However, the target may move out of the field of view, leading to unwanted abortion of the capture operation. The latter has a global but coarse view of the target, and its pose estimates are not as accurate as those of the eye-in-hand. Based upon the types of error to be controlled, there are position-based [11], image-based [12], and hybrid [13] visual servo schemes [14], [15]. Among them, position-based visual servoing (PBVS) is widely adopted because it controls the end-effector from its actual pose to the desired one in the three-dimensional (3D) workspace [16] naturally and directly. However, the pose estimation in PBVS is prone to image noise and residual camera vibration in the eye-in-hand configuration if the camera is not properly calibrated. Furthermore, it experiences time delay in image acquisition/processing and 3D pose estimation in dynamic environments, due to the computational complexity and the insufficient update rate of the visual servo system relative to the robotic motion control system; the latter is critical in tracking and capturing a moving target. The KF and its variations are commonly used to overcome these difficulties in PBVS [17]. Existing approaches generally couple the target pose estimation with photogrammetry [17]–[19], which is highly non-linear. An extended KF (EKF) or one of its variations [17] must then be employed, where the measurement model is linearized about the current state estimates. Although effective, the optimality of the KF is compromised. Poor estimation of measurement and process noise and sudden changes of the target pose could further degrade the EKF performance.

To address the above issues in the existing KF-enhanced PBVS, a new and simple dual KF scheme is proposed to decouple the target pose and the image in the KF, which is the main contribution of the work. The new approach employs the first KF in the image space to reduce the errors due to image processing and image jitter. The filtered image data are used by the photogrammetry for efficient pose estimation, since the latter uses the previous pose as the initial seed for its iterative solution process. More importantly, the KF can feed the photogrammetric algorithm with estimated image data of the target in case the vision system loses track of the target momentarily, to avoid abortion of the tracking process


Figure 1. An eye-in-hand robotic manipulator and a target in a 3D workspace.

and thus enhance the robustness of the vision system. The second KF deals with the noise in the pose estimates caused by the high-frequency vibration of the camera during robotic operation due to the flexibility [20] and/or backlash [21] of the joints. The measurement model in this phase employs the pose estimates output from the photogrammetry instead of the image data directly. Thus, the KFs in the image and state spaces are linear and their optimality is achieved separately in each domain. Although the approximation errors from the photogrammetric process may affect the second KF, our experiments show that the assurance of the KF's optimality can minimize their impact and enhance the system accuracy and robustness. In the final capture phase, an adaptive threshold logic function has been developed with five carefully defined misalignment metrics, which is another main contribution of the current work. The function can be gradually improved by continuous learning from the external environment and autonomously controls the end-effector to approach, track, align the gripper with, and capture the target. Finally, the developed KF-enhanced PBVS control strategy has been validated experimentally using a custom-built robotic manipulator.

2. Robotic Manipulator Kinematics

Consider an eye-in-hand, anthropomorphic [16] robotic manipulator as shown in Fig. 1. The robotic arm is composed of three links that are actuated by the torso (θ1), shoulder (θ2), and elbow (θ3) joint motors. The end-effector is connected to the arm by a spherical wrist joint and its orientation is adjusted by two rotational joints: the wrist roll (θ4) and wrist yaw (θ5) angles. The last joint is a translational joint allowing the gripper to grasp the target. The gripper is usually not considered as a robotic joint since it is activated only after the end-effector is aligned with the target. Hence, only the degrees of freedom (DOF) of the five rotational joints are considered in the robotic controller.

For the PBVS, it is natural to describe the target relative to the gripper. Let the spatial position of the end-effector be described by a stationary global Cartesian frame X_0. Define the local frame X_g attached to the gripper with the y and z axes aligned with the rotational axes of wrist roll (θ4) and wrist yaw (θ5), respectively. Then, the kinematic relationship between the rotational joint positions and the

Table 1
Manipulator Joint Actuation Limits

Joint | Minimum | Maximum | Note
θ1 | −90° | 90° | Singularity
θ2 | 0° | 90° | Operational limitation and singularity
θ3 | −60° | 90° | Singularity
θ4 | −90° | 90° | Hardware limit
θ5 | −90° | 90° | Hardware limit

corresponding Cartesian position of the end-effector in the workspace is defined as,

$$\begin{Bmatrix} X_0 \\ 1 \end{Bmatrix} = T_{0g}(\theta) \begin{Bmatrix} X_g \\ 1 \end{Bmatrix} \qquad (1)$$

where T_{0g}(θ) is the 4 × 4 Denavit–Hartenberg transformation matrix from the gripper frame to the global frame and θ = {θ1, θ2, ..., θ5}^T is the vector of joint angles defined in the joint space, respectively.

Similarly, the transformation from the camera frame X_C to the gripper frame X_g can be expressed as,

$$\begin{Bmatrix} X_g \\ 1 \end{Bmatrix} = T_{gc}(\theta) \begin{Bmatrix} X_C \\ 1 \end{Bmatrix} \qquad (2)$$

where T_{gc}(θ) is the 4 × 4 Denavit–Hartenberg transformation matrix from the camera frame to the gripper frame. The camera frame X_C is defined such that the x and y axes lie in the image plane while the z-axis is parallel to the axis of the forearm and points towards the target.

Thus, the velocity and acceleration relationships between the end-effector and the joints are,

$$\dot{X}_e = J(\theta)\dot{\theta} \quad \text{and} \quad \ddot{X}_e = \dot{J}(\theta)\dot{\theta} + J(\theta)\ddot{\theta} \qquad (3)$$

where J(θ) is the Jacobian matrix of the robotic manipulator.

If the Jacobian is invertible, the inverse kinematics can be derived:

$$\dot{\theta} = J^{+}(\theta)\dot{X}_e \qquad (4)$$

where $J^{+}(\theta) = (J^{T}(\theta)J(\theta))^{-1}J^{T}(\theta)$ is the Moore–Penrose pseudo-inverse of the Jacobian matrix.

The inverse kinematics in (4) may lead to multiple solutions or singularities. The problem is alleviated simply by mapping out all possible singularities plus the operational and hardware limitations of each joint in this specific configuration and then limiting the operating ranges of the joints as shown in Table 1. More advanced approaches [22] should be taken to avoid the problem in general cases.
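As a minimal illustration of the velocity-level inverse kinematics in (4), the following sketch resolves a desired end-effector velocity into joint rates with the Moore–Penrose pseudo-inverse. The Jacobian values shown are hypothetical placeholders, not the manipulator parameters of Table 2.

```python
import numpy as np

def joint_rates(J, x_dot_e):
    """Resolve a desired end-effector velocity into joint rates via the
    Moore-Penrose pseudo-inverse, as in (4): theta_dot = J^+(theta) X_dot_e."""
    # np.linalg.pinv evaluates (J^T J)^{-1} J^T for a full-column-rank Jacobian
    # and remains well defined near singular configurations.
    return np.linalg.pinv(J) @ x_dot_e

# Example with a hypothetical 3x5 positional Jacobian (position-only tracking).
J = np.array([[0.3, 0.1, 0.0, 0.0, 0.0],
              [0.0, 0.4, 0.2, 0.0, 0.0],
              [0.1, 0.0, 0.3, 0.0, 0.0]])
x_dot_e = np.array([0.01, 0.00, -0.02])   # desired end-effector velocity [m/s]
print(joint_rates(J, x_dot_e))            # joint rates for the five revolute joints
```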


3. Target Pose Estimation by Photogrammetry

Assume the camera is calibrated and its intrinsic and extrinsic parameters are known. The point feature or marker of the target is projected onto the image plane by a pinhole camera model,

$$x_c = -\frac{f x_{tc}}{z_{tc} - f} \approx -\frac{f x_{tc}}{z_{tc}}, \qquad y_c = -\frac{f y_{tc}}{z_{tc} - f} \approx -\frac{f y_{tc}}{z_{tc}} \qquad (5)$$

where (x_{tc}, y_{tc}, z_{tc}) are the spatial coordinates of the marker with respect to the camera frame, (x_c, y_c) are the projected image coordinates, and f is the focal length of the camera.

The coordinates (x_{tc}, y_{tc}, z_{tc}) are unknown and can be related to the known coordinates (x_t, y_t, z_t) of the marker in the target-fixed frame with respect to its origin by a coordinate transformation,

$$\begin{Bmatrix} x_{tc} \\ y_{tc} \\ z_{tc} \end{Bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{Bmatrix} x_t \\ y_t \\ z_t \end{Bmatrix} + \begin{Bmatrix} T_x \\ T_y \\ T_z \end{Bmatrix} \qquad (6)$$

where (T_x, T_y, T_z) are the spatial coordinates of the origin of the target frame in the camera frame and [r_{ij}] is the rotational matrix from the target-fixed frame to the camera frame, respectively.

Substituting (6) into (5) leads to two independent non-linear equations with six unknown pose parameters (T_x, T_y, T_z, Θ, Φ, Ω) of the target, where (Θ, Φ, Ω) are the Euler angles describing the rotation from the target frame to the camera frame in the sequence of yaw (Ω), pitch (Φ), and roll (Θ). Therefore, at least three non-collinear markers are required to solve for the six unknown pose parameters mathematically, along with a priori knowledge of the markers' coordinates. However, there exist four distinct poses that appear the same in the camera and cannot be differentiated with three markers [14]. Therefore, four non-collinear markers are used to provide system redundancy for a unique solution as well as to tolerate the temporary loss of one marker in the tracking process. The resulting equations are solved by a least-squares method iteratively until the residual errors of the measurement satisfy a pre-set convergence criterion [23]. It is noted that the adoption of the Euler angle representation could lead to a mathematical singularity when the pitch angle approaches ±90°. Fortunately, this situation is eliminated by the control requirement to keep the target in the field of view of the camera for the eye-in-hand configuration.
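A minimal sketch of such an iterative least-squares pose fit is given below, built directly on (5) and (6) and seeded with the previous pose as described in Section 1. The Euler-axis assignment in the rotation matrix and the marker coordinates are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np
from scipy.optimize import least_squares

f = 3.7e-3  # focal length [m]; the fixed value of the experimental camera (Section 6.3)

def rot_cam_from_target(theta, phi, omega):
    """Rotation from the target frame to the camera frame. The axis assignment
    (yaw about z, pitch about y, roll about x) is an assumption for illustration."""
    cz, sz = np.cos(omega), np.sin(omega)
    cy, sy = np.cos(phi), np.sin(phi)
    cx, sx = np.cos(theta), np.sin(theta)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return Rz @ Ry @ Rx

def project(pose, markers_t):
    """Pinhole projection (5) of target-frame markers transformed by (6)."""
    Tx, Ty, Tz, theta, phi, omega = pose
    p_c = markers_t @ rot_cam_from_target(theta, phi, omega).T + np.array([Tx, Ty, Tz])
    return np.column_stack((-f * p_c[:, 0] / p_c[:, 2],
                            -f * p_c[:, 1] / p_c[:, 2])).ravel()

def estimate_pose(image_pts, markers_t, pose_seed):
    """Iterative least-squares fit of the six pose parameters, seeded with the previous pose."""
    return least_squares(lambda p: project(p, markers_t) - image_pts.ravel(),
                         pose_seed).x

# Four known, non-collinear marker coordinates in the target frame [m] (example values).
markers_t = np.array([[0.05, 0.05, 0.0], [-0.05, 0.05, 0.0],
                      [-0.05, -0.05, 0.0], [0.05, -0.02, 0.01]])
true_pose = np.array([0.02, -0.01, 0.60, 0.10, 0.05, -0.08])
image_pts = project(true_pose, markers_t).reshape(-1, 2)
print(estimate_pose(image_pts, markers_t, pose_seed=np.array([0.0, 0.0, 0.5, 0.0, 0.0, 0.0])))
```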

4. Kalman Filter

The proposed dual KF scheme processes the images and the poses in the image and global frames separately. The first KF, which is embedded in the open-source computer vision (OpenCV) library [24], is employed to reduce the image noise from the discretization and/or random jitters

of the digital camera and prevent them from being amplified by the photogrammetric algorithm and propagating into the PBVS. The problem is linear and thus the optimality of the KF is ensured. The details of the embedded KF algorithm can be found in [24] and are not given here. In case of temporary loss of lock on the markers – one of the major concerns in PBVS – the KF provides the estimated image coordinates of the markers to the photogrammetric algorithm for a short period of time until the vision system locks onto the markers again. This effectively increases the robustness of the PBVS.
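For illustration, a marker's image coordinates could be filtered with OpenCV's KalmanFilter as sketched below, assuming a constant-velocity image-space model; the sampling rate and noise levels are placeholders rather than the values used in the experiments.

```python
import numpy as np
import cv2

def make_marker_filter(dt):
    """Constant-velocity KF for one marker's image coordinates (x_c, y_c)."""
    kf = cv2.KalmanFilter(4, 2)  # state: [x, y, vx, vy], measurement: [x, y]
    kf.transitionMatrix = np.array([[1, 0, dt, 0],
                                    [0, 1, 0, dt],
                                    [0, 0, 1,  0],
                                    [0, 0, 0,  1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = 1e-4 * np.eye(4, dtype=np.float32)      # illustrative
    kf.measurementNoiseCov = 1e-2 * np.eye(2, dtype=np.float32)  # illustrative
    return kf

kf = make_marker_filter(dt=1.0 / 15.0)
predicted = kf.predict()                          # used when the marker lock is lost
measured = np.array([[320.0], [240.0]], np.float32)
kf.correct(measured)                              # used when the marker is detected
```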

The second KF addresses the noise resulting from the residual vibration of the camera when the end-effector is in motion as well as the motion of the non-cooperative target. Based on the position–velocity–acceleration approach [25], [26], the system model of the KF is defined as,

$$x_{k+1} = A x_k + B w_k \qquad (7)$$

where $x = \{T_x, \dot{T}_x, \ddot{T}_x, T_y, \dot{T}_y, \ddot{T}_y, T_z, \dot{T}_z, \ddot{T}_z, \Theta, \dot{\Theta}, \ddot{\Theta}, \Phi, \dot{\Phi}, \ddot{\Phi}, \Omega, \dot{\Omega}, \ddot{\Omega}\}^T$ is the state vector of the target defined in the camera frame, A is the state transition matrix, B is the disturbance transition matrix associated with the process noise vector w, and subscripts k and (k+1) are the sample step indices for the time instants t_k and t_{k+1}, respectively.

The transition matrix A is 18 × 18, consisting of six 3 × 3 block-diagonal sub-matrices, while the disturbance transition matrix B is an 18 × 6 sparse matrix with the following non-zero elements:

$$a = \begin{bmatrix} 1 & dt & dt^2/2 \\ 0 & 1 & dt \\ 0 & 0 & 1 \end{bmatrix}, \quad B_{3(i-1)+1,\,i} = dt^3/6, \quad B_{3(i-1)+2,\,i} = dt^2/2, \quad B_{3(i-1)+3,\,i} = dt \qquad (8)$$

where dt is the sampling period and i = 1, 2, ..., 6.

The process noise vector $w_k = \{\dddot{T}_x, \dddot{T}_y, \dddot{T}_z, \dddot{\Theta}, \dddot{\Phi}, \dddot{\Omega}\}^T_k$ contains the jerks of the target and satisfies a zero-mean white Gaussian distribution with covariance Q_k.

The measurement model of the KF is defined as,

$$z_{k+1} = H x_k + v_k \qquad (9)$$

where $z_{k+1} = \{T_x, T_y, T_z, \Theta, \Phi, \Omega\}^T_k$ is the pose measurement of the target output from the photogrammetry model, v_k is the measurement noise vector of a zero-mean white Gaussian distribution with covariance R_k, and H is a 6 × 18 sparse matrix with the following non-zero elements:

$$H_{1,1} = H_{2,4} = H_{3,7} = H_{4,10} = H_{5,13} = H_{6,16} = 1 \qquad (10)$$

It is noted that both (7) and (9) are linear, and thus the optimal estimate of pose and velocity can be achieved by the KF. Furthermore, the process and measurement noise vectors are assumed independent and not correlated with the initial state vector.


Once the system and measurement models are determined, the KF can be implemented in a standard procedure. The noise covariance matrix Q_k is determined, or tuned, straightforwardly by experiments as suggested by [27] because the measurement system is known. However, the noise covariance matrix R_k is difficult to tune because the current study assumes the target is non-cooperative with unpredictable motion. It is tuned extensively by trial-and-error in experiments [28]. Good initial values for the covariance matrices can reduce the settling time of the filter.
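The matrices of the second KF follow mechanically from (8) and (10); a minimal sketch of their construction is given below (the sampling period is a placeholder, and the commented predict/update cycle is the standard linear KF recursion rather than the paper's specific code).

```python
import numpy as np

def pva_matrices(dt):
    """State-space matrices of the second KF per (7)-(10):
    18-state position-velocity-acceleration model driven by target jerk."""
    a = np.array([[1.0, dt, dt**2 / 2.0],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])
    A = np.kron(np.eye(6), a)                  # 18x18, six 3x3 diagonal blocks
    b = np.array([[dt**3 / 6.0], [dt**2 / 2.0], [dt]])
    B = np.kron(np.eye(6), b)                  # 18x6 sparse disturbance matrix
    H = np.zeros((6, 18))
    H[np.arange(6), np.arange(6) * 3] = 1.0    # picks out the six pose states
    return A, B, H

A, B, H = pva_matrices(dt=1.0 / 15.0)          # sampling period assumed for illustration
# Standard linear KF cycle (sketch):
#   x = A @ x;                    P = A @ P @ A.T + B @ Q @ B.T
#   K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
#   x = x + K @ (z - H @ x);      P = (np.eye(18) - K @ H) @ P
```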

5. Control, Guidance, and Interception

The control strategy in the present work is divided into two phases: (i) tracking and approaching, and (ii) capture. Accordingly, separate controllers are designed, with each controller containing several sub-controllers to execute subtasks within each phase.

5.1 Tracking and Approaching

The path planning for tracking and approaching is divided into two parts: the wrist and the end-effector. Assuming there is no obstacle between the end-effector and the target, a direct path for the wrist in the joint space is planned for simplicity. This may result in a curved trajectory of the wrist in the workspace. In the current work, the attention is focused on the KF enhancement for the PBVS and the target is assumed to remain within the field of view of the camera during tracking and approaching. In a future study, the tilt and pan angles of the camera should be included in the control scheme to ensure the target stays within the field of view. As the wrist approaches the target, the orientation of the gripper is controlled in the final phase to ensure it is properly aligned with the target to maximize the potential for a successful capture.

The approach is controlled through a PD controller that employs the relative position error of the wrist, e, and the orientation error φ of the gripper, such as,

$$e = X_0 - X_0^d = \begin{Bmatrix} x_0 \\ y_0 \\ z_0 \end{Bmatrix} - \begin{Bmatrix} x_0^d \\ y_0^d \\ z_0^d \end{Bmatrix} \quad \text{and} \quad \phi = \alpha - \alpha^d = \begin{Bmatrix} \alpha_x \\ \alpha_y \\ \alpha_z \end{Bmatrix} - \begin{Bmatrix} \alpha_x^d \\ \alpha_y^d \\ \alpha_z^d \end{Bmatrix} \qquad (11)$$

where X_0 and X_0^d are the current and desired wrist positions in the global frame, and α and α^d are the current and desired gripper orientations in the local gripper frame X_g, respectively. The desired wrist pose is obtained by the KF-enhanced vision system.

For dynamic tracking and approaching, the end-effector should aim for the intercept position instead of the current position. Thus, the desired pose in (11) should be replaced with the following:

$$X_0^d \leftarrow X_0^d + \dot{X}_0^d(\Delta t + \Delta T) \qquad (12)$$

where Δt is the time interval between two updates of the PBVS, $\Delta T = \|e\|/\|\dot{e}\|$ is the estimated remaining time from the current position to the target position, and the desired velocity $\dot{X}_0^d$ is estimated by the KF. As the end-effector approaches the target, ΔT approaches zero and the estimated intercept becomes more accurate.

The error in the joint space is obtained by the inverse kinematics,

$$\theta_e \approx J^{+} e + T_\phi \phi \quad \text{and} \quad \dot{\theta}_e = J^{+} \dot{e} + T_\phi \dot{\phi} \qquad (13)$$

where T_φ is the transformation matrix taking the angular misalignments from the local frame X_g to the joint space. It should be noted that the velocity of the joint error vector must be limited by the maximum allowed velocity of the end-effector, such that $\|J\dot{\theta}_e\| \le \|\dot{X}_{e,\max}\|$.

Thus, the controller for the path tracking of the end-effector is defined as,

$$\theta_c = K_p \theta_e + K_d \dot{\theta}_e \qquad (14)$$

where K_p and K_d are the gains of the PD controller, respectively.

It is worth noting that the trajectory tracking and the final approach control can be separated to improve the control efficiency in practice. When the end-effector is far from the target, only the position of the wrist is controlled to follow the trajectory. Once the end-effector is close to the target and prepares for a capture, the orientation error of the gripper is considered and both terms in (13) must be used.
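The following sketch puts (13) and (14) together as one control step; gains, the velocity cap, and the shape of T_φ are assumptions for illustration, and the orientation terms would be dropped during the far-range trajectory-tracking phase as noted above.

```python
import numpy as np

def joint_command(J, e, e_dot, phi, phi_dot, T_phi, Kp, Kd, xdot_max):
    """PD tracking command per (13)-(14)."""
    J_pinv = np.linalg.pinv(J)
    theta_e = J_pinv @ e + T_phi @ phi              # joint-space pose error, (13)
    theta_e_dot = J_pinv @ e_dot + T_phi @ phi_dot  # joint-space error rate, (13)
    # Enforce ||J theta_e_dot|| <= ||Xdot_e_max|| by uniform scaling.
    speed = np.linalg.norm(J @ theta_e_dot)
    if speed > xdot_max:
        theta_e_dot *= xdot_max / speed
    return Kp @ theta_e + Kd @ theta_e_dot          # control command, (14)
```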

5.2 Capture Threshold Logic Function

A capture control scheme, analogous to neural network control, has been developed here to activate a dynamic capture. Five grasping offsets are employed to determine if the designated capture feature of the target is aligned and within the grasping range of the gripper,

$$M_1 = \sqrt{\Delta X_g^2 + \Delta Y_g^2 + \Delta Z_g^2}, \qquad M_2 = \sqrt{\Delta Z_g^2} \qquad (15a)$$

$$M_3 = \sqrt{(\alpha_x - \alpha_x^d)^2 + (\alpha_y - \alpha_y^d)^2 + (\alpha_z - \alpha_z^d)^2}, \quad M_4 = \Delta Z_g \tan(\alpha_x - \alpha_x^d) + \Delta Y_g, \quad M_5 = \Delta Z_g \tan(\alpha_y - \alpha_y^d) + \Delta X_g \qquad (15b)$$

Here, M_1 defines the misalignment in total distance, representing a spherical space around the capture point for collision avoidance. However, it does not provide specific guidance for the current robotic configuration, where the end-effector approaches the target horizontally. Accordingly, the M_2 measurement is introduced as a critical horizontal distance for collision avoidance. Similarly, the


M_3 is the general orientation misalignment of the end-effector with respect to the target. However, it does not reflect the effects of the angle misalignments on the projected point of the end-effector on the plane where the capture point resides. The parameters M_4 and M_5 are the projected points of the end-effector on the capture-point plane, which are offset from the capture point due to the angle (pitch and yaw) misalignments. These five offsets form a composite index for the total alignment of the end-effector by weighted summation,

$$h = \sum_{i=1}^{5} w_i M_i \qquad (16)$$

where w_i are the weights to be tuned experimentally. In this work, the weights are all set to 1/5, which implies that each offset is equally important in the following threshold logic (TL) function that actuates the capture action based on the sliding-averaged index, such that,

$$TL: \begin{cases} j < 100 & \text{Initialization approach} \\[4pt] j \ge 100 & \begin{cases} h > 500 & \text{Approach} \\ 85 < h \le 500 & \text{Approach + Alignment} \\ h \le 85 & \text{Grasp} \end{cases} \end{cases} \qquad (17)$$

where j is the number of vision measurements and h is the sliding-averaged composite index. The number of 100 iterations of vision measurements was determined from experiments; it allows the KFs and the TL function to settle down. The threshold values of h for the controller to actuate different tasks were determined by trial-and-error in the experiments and are target-specific. If the target changes, they should be tuned again. Once the approach and alignment (85 < h ≤ 500) of the end-effector is achieved, the weight for M_2 is increased to ensure a collision does not occur.
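A compact sketch of (16)-(17) is given below, assuming the equal weights of 1/5 and the thresholds quoted above; the length of the sliding-average window is an assumption, since it is not stated in the text.

```python
import numpy as np
from collections import deque

weights = np.full(5, 0.2)      # equal weights 1/5, per the text
history = deque(maxlen=20)     # sliding-average window length (assumed)

def capture_action(j, M):
    """Threshold logic (17) driven by the sliding-averaged composite index (16).
    M is the vector of the five grasping offsets M1..M5; thresholds are target-specific."""
    history.append(float(np.dot(weights, M)))
    h = np.mean(history)
    if j < 100:
        return "approach"              # initialization approach
    if h > 500:
        return "approach"
    if h > 85:
        return "approach+alignment"
    return "grasp"
```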

6. Experimental Validation

6.1 Robotic Manipulator

A Pieper-type robotic manipulator described in Fig. 1 has been designed and fabricated to carry out the experimental work. The torso, shoulder, and elbow joints employ stepper motors that are paired with 1:25 ratio planetary gearboxes in order to provide an increase in torque and position accuracy. The wrist actuators are servo motors, mainly due to their light-weight and high-torque profiles. Detailed parameters of the manipulator are given in Table 2.

6.2 Target

The target employed in the experiment possesses a visually low-noise design with four markers as well as a capture bar for the gripper to grasp, shown in the inset in Fig. 2. The target motion was achieved by a specially designed set-up as shown in Fig. 2, where it was suspended by a

Table 2
Robotic Manipulator Specifications

Items | Maximum Dimension
Arm length | 487 mm
Forearm length | 497 mm
Arm mass | 0.687 kg
Forearm mass | 0.704 kg
Gear box | 1:25 ratio
Gear box mass | 0.5 kg
Stepper motor torque | 2 N·m
Stepper motor mass | 0.5 kg
Holding torque | 4 N·m
Step size | 0.225 deg./step

Figure 2. Experimental set-up.

wire attached to an upright stepper motor and a lower-left anchor. This set-up simulates (in two dimensions) a zero-gravity environment (in the horizontal direction), allowing free motion of the target in response to external stimuli. The target path was adjustable by changing the position of the anchor to create a variety of arcs. The motor generates different target motions by varying its speed during a single path set-up, allowing various situations to be tested. Two velocities were used in the testing phases: a low and a high velocity. The low velocity (0.38 cm/s of linear speed at the motor) was defined as the one at which the target can be tracked and captured by the robotic manipulator without any prediction of the intercept position. The high velocity was set at twice the low velocity, i.e., 0.76 cm/s. Although the magnitude of the velocity of the target was constant in each case, its direction was changing along the course as shown in Fig. 2.

6.3 Camera

The vision sensor is a Logitech webcam. The camera possesses an autofocus feature with a variable focal length ranging from 2.0 to 3.7 mm and a true 2-megapixel CCD sensor with 1600 × 1200 pixels. In the experiments, the autofocus feature was disabled and the focal length was fixed at 3.7 mm. The physical size of the CCD sensor was measured as 6 mm × 4 mm.


Table 3
Test Results of Photogrammetry Algorithm

Vision | Accuracy | Error | Precision | Error
x-axis | 93.12% | ±0.23% | 99.16% | ±0.05%
y-axis | 94.50% | ±0.31% | 99.87% | ±0.05%
z-axis | 87.32% | ±0.75% | 97.01% | ±0.04%
Roll | 99.45% | ±0.44% | 99.45% | ±0.08%
Pitch | 88.20% | ±0.80% | 88.45% | ±0.09%
Yaw | 88.13% | ±0.80% | 89.21% | ±0.08%

7. Results and Discussion

7.1 Validation of Computer Vision and Photogrammetry

The computer vision is realized by adopting the OpenCV library together with a custom photogrammetric algorithm. The program identified and locked the vertices of all markers first and then grouped them into four groups, with each group containing five vertices based on the known shape of the markers. Then, the centre coordinates of the markers were calculated by averaging the coordinates of the five vertices within each group and input to the photogrammetric algorithm to calculate the pose of the target with respect to the camera in the 3D workspace. The accuracy of the vision system was evaluated by rotating the target's pitch and yaw angles from 0 to 35° at 5° intervals with respect to the camera and moving the distance from the camera from 10 to 100 cm at 5 cm intervals, respectively. The results are shown in Table 3. It can be seen that the accuracy of the algorithm is very high for the given camera.
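One possible realization of the marker-centre extraction is sketched below: detected contours are approximated by polygons, the five-vertex shapes are taken as markers, and their vertices are averaged. The polygon-approximation route and its parameters are assumptions for illustration; the paper's exact vertex-detection and grouping steps are not specified.

```python
import numpy as np
import cv2

def marker_centers(binary_image):
    """Illustrative marker-centre extraction: approximate each contour by a
    polygon, keep the five-vertex ones (the known marker shape), and average
    the vertices to obtain the centres fed to the photogrammetry."""
    # OpenCV 4 return signature (contours, hierarchy).
    contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centers = []
    for c in contours:
        poly = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(poly) == 5:                        # five vertices per marker group
            centers.append(poly.reshape(-1, 2).mean(axis=0))
    return np.array(centers)
```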

7.2 Kalman Filters

While no work is needed for the embedded KF in the OpenCV library, the second KF's process (Q_k) and measurement (R_k) noise covariance matrices were tuned experimentally. The matrix Q_k was tuned by comparing real poses with the estimates. The matrix R_k was tuned based on the fact that its absolute value is less important than its relative weight in comparison to the Q_k matrix. Therefore, a simulation was performed to locate the overall region for the values of Q_k, several experimental iterations were performed, and the relative weights were chosen from the best results. The following are the values of both covariance matrices,

Figure 3. KF simulation results – the top three figures illustrate the YZ tracking plane while the bottom three illustrate the yaw (Ω) tracking.

$$R_k = \mathrm{diag}(4,\ 2,\ 4,\ 3,\ 1,\ 1) \times 10^{-6}, \qquad Q_k = \mathrm{diag}(5,\ 5,\ 5,\ 5,\ 5,\ 5) \times 10^{-6}$$

Figures 3 and 4 present the simulation results and illustrate the effectiveness of the proposed KF in the state space by presenting the true, noisy, and KF output signals for a baseline case. Figure 3 demonstrates the YZ profile in the camera frame and the yaw (Ω) signal in their true, noisy, and KF output forms, which represent the most critical portion of the tracking information when considering the capture operation. The result illustrates proper


Figure 4. 3D path and rotational track: (a) ideal signals; (b) noisy signals; (c) KF-processed signals.

Figure 5. Low-velocity capture without KF.

Figure 6. Low-velocity capture with KF.

tracking and extraction of the original signal by the KF and validates the process. Figure 4 shows the details: the 3D path in the upper portion and the rotational tracks in the bottom half.

7.3 Low-Velocity Capture

The captures of the target were performed twice, with the target moving on the same path at the same velocity without and with the KF's prediction. The results are shown in Figs. 5 and 6. A "capture flag" was added to the figures to indicate the point at which the gripper closed. Note that all captures presented were successful, meaning that the target was successfully identified, tracked, and caught. Figure 5 shows the robot took a longer time to capture the target (occurring near 76.1 s) without the KF in comparison to Fig. 6, where the capture occurs near 31.3 s with the KF. This is a significant reduction


Table 4
Comparisons of Captures without and with KF

Symbol | Capture Time (s) | θ1 Error | θ2 Error | θ3 Error | Delta Time
Low velocity without KF | 76.1 | 12 | 5 | 16 | 58.8%
Low velocity with KF | 31.3 | 2 | 4 | 12 |
High velocity without KF | 102.0 | 8 | 4 | 13 | 42.4%
High velocity with KF | 58.8 | 4 | 4 | 7 |

in the time required to perform a capture operation by using the KF.

Of even greater interest are the variations in the desired positions when the KF was not used in comparison to when the KF was employed. Table 4 shows the largest errors in the course of the capture, which correlate directly with the increase in capture times. The θ1 and θ3 joints

Figure 7. High-velocity capture without KF.

Figure 8. High-velocity capture with KF.

oscillate much more before settling down to the appropriate values when the KF is not used. Suppressing these oscillations with the KF not only removes the cause of the variations in velocity, but also creates a much more stable and reliable system for capturing.

7.4 High-Velocity Capture

Figures 7 and 8 represent the high-velocity captures without and with the KF. Figure 7 shows, by the capture flag, that the capture occurred at 102.0 s and that θ3 oscillated greatly near the beginning of the capture operation before establishing the correct trend. Due to the higher velocity of the target, the oscillations were reduced in comparison to the low-velocity captures, as the system had a higher ratio of signal to noise.

Figure 8 displays the final high-velocity capture with the KF, and the capture flag indicates a successful capture occurring at 58.8 s, well below the capture time without the KF. Although the θ3 joint still seems to indicate a certain level of oscillation in the system, the number of overshoots and corrections is greatly reduced (3 in comparison to 6 with no prediction). This aided in reducing the time required to settle and effect a capture.


Figure 9. Error histogram of low-velocity capture without KF.

Figure 10. Error histogram of low-velocity capture with KF.

Figure 11. Error histogram of high-velocity capture without KF.


Figure 12. Error histogram of high-velocity capture with KF.

7.5 Error Analysis

The improvements by the KF are further analysed by histograms of joint errors in the tracking process, to represent not only the overall behaviour but also any lead/lag behaviour. Figures 9 and 10 present the comparison of the error histograms of the low-velocity capture. The KF provides a significant overall improvement. For instance, the errors are much more uniformly distributed around the zero mean with the KF in Fig. 10, displaying better overall lead/lag performance.

Figures 11 and 12 represent the high-velocity capture operations without and with the KF. The higher velocity explains the overall higher level of errors corresponding to a longer capture operation. The largest improvement is seen in the θ3 tracking, where the KF results in a much lower and more evenly distributed error histogram. Finally, the overall values for all joints show a distinct improvement with the KF and a better lead/lag behaviour.

8. Conclusion

In summary, the main contributions of the current work are: (i) a dual KF approach to the eye-in-hand PBVS control strategy in order to effect the capture of a passively non-cooperative target; it enhances the robustness of the visual servo system and addresses the time delay problem in PBVS by predicting the rendezvous point for capture. (ii) The capture threshold logic function and the composite index, which provide an easy and efficient way to automate the approach, alignment, and capture of the target. (iii) The experimental validation of the effectiveness of the proposed approach and the associated error analysis. The KF-enhanced capture reduced the time required to effect a capture by over 42% (58.8% for the low-velocity capture and 42.4% for the high-velocity capture). The results experimentally demonstrate the effectiveness and robustness of the proposed visual servoing by quicker and smoother captures of a moving target with the KF.

Acknowledgement

This work is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

References

[1] N.-G. Cui, P. Wang, J.-F. Guo, and X. Cheng, A review of on-orbit servicing, Journal of Astronautics, 28(4), 2007, 33–39.
[2] A. Kroll and S. Soldan, Survey results on status, needs and perspectives for using mobile service robots in industrial applications, Proc. 11th Int. Conf. on Control, Automation, Robotics and Vision, 1(1), Singapore, 2010, 621–626.
[3] U. Reiser, C. Connette, J. Fischer, J. Kubacki, et al., Care-O-Bot 3 – Creating a product vision for service robot applications by integrating design and technology, Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 1(1), 2009, 1992–1998.
[4] A. Jain and C. Kemp, EL-E: An assistive mobile manipulator that autonomously fetches objects from flat surfaces, Autonomous Robots, 28(1), 2010, 45–64.
[5] F. Aghili, A prediction and motion-planning scheme for visually guided robotic capturing of free-floating tumbling objects with uncertain dynamics, IEEE Transactions on Robotics, 28(3), 2012, 634–649.
[6] F. Roe, R. Howard, and L. Murphy, Automated rendezvous and capture system development and simulation for NASA, Proceedings of the SPIE, 1(1), 2004, 118–125.
[7] H. Wang, Y.-H. Liu, and D. Zhou, Adaptive visual servoing using point and line features with an uncalibrated eye-in-hand camera, IEEE Transactions on Robotics, 24(4), 2008, 843–857.
[8] S.S. Srinivasa, D. Berenson, M. Cakmak, A. Collet, et al., Herb 2.0: Lessons learned from developing a mobile manipulator for the home, Proceedings of the IEEE, 100(8), 2012, 2410–2428.
[9] W. Chung, G. Kim, and M. Kim, Development of the multi-functional indoor service robot PSR systems, Autonomous Robots, 22(1), 2007, 1–17.
[10] I. Siradjuddin, L. Behera, M. McGinnity, and S. Coleman, Image-based visual servoing of a 7-DOF robot manipulator using an adaptive distributed fuzzy PD controller, IEEE/ASME Transactions on Mechatronics, 1(1), 2013, 1–12.
[11] F. Chaumette and S. Hutchinson, Visual servo control, IEEE Robotics and Automation Magazine, March, 7, 2007, 109–118.
[12] J. Correa and A. Soto, Active visual perception for mobile robot localization, Journal of Intelligent Robotic Systems, 58(3–4), 2010, 339–354.
[13] E. Muehlenfeld and H. Raubenheimer, Image-processing with matched scanning of contours, Proceedings Society of Photo-Optics Instrumentations and Engineering, 397, 1983, 125–130.
[14] F. Chaumette and S. Hutchinson, Visual servo control – part I: Basic approaches, IEEE Robotics & Automation Magazine, 13(4), 2006, 82–90.
[15] F. Chaumette and S. Hutchinson, Visual servo control – part II: Advanced approaches, IEEE Robotics & Automation Magazine, 14(1), 2007, 109–118.
[16] V. Lippiello, B. Siciliano, and L. Villani, A position-based visual impedance control for robot manipulators, Proc. IEEE Int. Conf. on Robotics and Automation, Roma, 2007, 2068–2073.
[17] S.Y. Chen, Kalman filter for robot vision: A survey, IEEE Transactions on Industrial Electronics, 59(11), 2012, 4409–4420.
[18] K. Hashimoto, A review on vision-based control of robot manipulators, Advanced Robotics, 17(10), 2003, 969–991.
[19] F. Janabi-Sharifi and M. Marey, A Kalman-filter-based method for pose estimation in visual servoing, IEEE Transactions on Robotics, 26(5), 2010, 939–947.
[20] D.H. Kim and W.H. Oh, Robust control design for flexible joint manipulators: Theory and experimental verification, International Journal of Control, Automation, and Systems, 4(4), 2006, 495–505.
[21] F. Mazzini and S. Dubowsky, Experimental validation of the tactile exploration by a manipulator with joint backlash, Journal of Mechanisms and Robotics, 4(1), 2012, 011009–011009-8.
[22] M. Ozdemir and S. Ider, Inverse dynamics control of parallel manipulators around singular configurations, Journal of Mechanical Engineering Science, 14(3), 2007, 50–62.
[23] J. Pomares, I. Perea, and F. Torres, Dynamic visual servoing with chaos control for redundant robots, IEEE/ASME Transactions on Mechatronics, 19(2), 2014, 423–431.
[24] G. Bradski and A. Kaehler, Learning OpenCV: Computer vision with the OpenCV library (Sebastopol, CA: O'Reilly Media Inc., 2008).
[25] Z.H. Zhu, R.V. Mayorga, and A.K. Wong, Dynamic robot manipulator trajectory planning for obstacle avoidance, Mechanics Research Communications, 26(2), 1999, 139–144.
[26] B.P. Larouche and Z.H. Zhu, Autonomous robotic capture of non-cooperative target using visual servoing and motion predictive control, Autonomous Robots, 37(2), 2014, 157–167.
[27] C. Liu, X. Huang, and M. Wang, Target tracking for visual servoing systems based on an adaptive Kalman filter, International Journal of Advanced Robotic Systems, 9(149), 2012, 1–12.
[28] M. Barut, R. Demir, and E. Zerdali, Real-time implementation of bi input-extended Kalman filter-based estimator for speed-sensorless control of induction motors, IEEE Transactions on Industrial Electronics, 59(11), 2012, 4197–4206.

Biographies

Benoit P. Larouche obtained his B.A.Sc. in Mechanical Engineering from the University of Toronto, his M.A.Sc. in Aerospace Engineering from the University of Toronto Institute for Aerospace Studies, and his Ph.D. in Autonomous Robotics from York University, all in Toronto, Canada. His major field of study is satellites and autonomous robotics with a focus on on-orbit servicing and enabling technologies. He concluded his master's degree with the launch of two nano-satellites, CanX-2 and NTS, in 2008, which continue to operate to this date. In addition, he worked on several other currently operational nano-satellite missions. He has published several papers on autonomous capture of non-cooperative targets and nano-satellite design. Dr. Larouche is a Member of AIAA and the Prospectors & Developers Association, an Engineer in Training with Professional Engineers Ontario, and an IAC member since 2007.

Zheng H. Zhu is a professor at the Department of Earth and Space Science and Engineering, York University, Toronto, Canada. He received his B.Eng., M.Eng., and Ph.D. in mechanics from Shanghai Jiao Tong University in China. Furthermore, he received his M.A.Sc. in Robotic Control from the University of Waterloo and his Ph.D. in Mechanical Engineering from the University of Toronto, both in Ontario, Canada. From 1993 to 1995, he worked as a research associate in the Department of Mechanical and Industrial Engineering, University of Toronto. From 1995 to 2006, he was a senior engineer with Curtiss-Wright – Indal Technologies located in Mississauga, Canada. Since 2006, he has been a faculty member of York University. His research interests include on-orbit service robots, dynamics, and control of electrodynamic tether systems. He has published over 140 papers. Dr. Zhu is a licensed Professional Engineer, Associate Fellow of AIAA, Fellow of CSME, Senior Member of IEEE, and Member of ASME.
