Fast and Accurate Humanoid Robot Navigation Guided By Stereovision

Chen Guodong, Ming Xie, Zeyang Xia, Lining Sun, Junhong Ji, Zhijiang Du, Wang Lei

Conference Paper · September 2009 · DOI: 10.1109/ICMA.2009.5246533 · Source: IEEE Xplore



Abstract—Stair climbing and moving-object grasping both require high-precision feedback of feature coordinates. This manuscript describes how to process the information acquired from a stereovision system quickly and accurately, and how to use the resulting data to compensate for the progressive error accumulated by the humanoid robot. Every step in the vision pipeline, from camera calibration through image processing to stereo matching, plays an important role in achieving high precision, while low processing time lets the robot adapt quickly to changes in the environment, especially a dynamic one. Two common humanoid-robot tasks are chosen as experiments to verify the effectiveness of the proposed methods: a stair-climbing experiment verifies the high precision over long distances, using image information to compensate for the progressive error caused by long-distance walking, and a moving-object grasping experiment verifies the low processing time.

Index Terms—Humanoid robot, image processing, camera calibration, feature extraction

I. INTRODUCTION

Despite the stable walking ability of modern humanoid robots, their ability to navigate autonomously and accurately in unknown environments has so far been limited. The ability of biped humanoid robots to step onto obstacles such as stairs or uneven ground makes them ideally suited to environments designed for human beings. If a humanoid robot is pre-programmed to walk a long distance or to climb a stair, the progressive error makes it difficult to cover the pre-input distance and to climb the stair, even when the distance and the stair height have been measured in advance. By using the stereovision system to measure the distance between the robot and the target in real time, the robot can use the image information to compensate for the progressive error whenever it walks abnormally. At a far distance, however, it is difficult to locate obstacles accurately with the stereovision system alone, and little work has been done on using stereovision information to compensate for the progressive error.

Manuscript received January 30, 2009. This paper is supported in part by the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT) (IRT0423), in part by the China Scholarship Council Foundation, and is sponsored by the "LOCH" project.

Chen Guodong is a PhD candidate in the State Key Laboratory of Robotics and System, Harbin Institute of Technology, China, now doing research work at Nanyang Technological University, Singapore (phone: 65-67906591; e-mail: [email protected]).

Sun Lining is a professor in the State Key Laboratory of Robotics and System, Harbin Institute of Technology, China (e-mail: [email protected]).

Xie Ming is an associate professor at Nanyang Technological University, Singapore (e-mail: [email protected]).

Zeyang Xia is a Research Fellow at the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore ([email protected]).

Junhong Ji is an associate professor at Harbin Institute of Technology, China ([email protected]).

Zhijiang Du is a professor at Harbin Institute of Technology, China ([email protected]).

Wang Lei is a master's student at Nanyang Technological University, Singapore.

We treat the problem of navigating the humanoid robot to a goal in a dynamic environment as a closed loop between using vision and planning motions that bring the robot closer to the goal. In this manuscript, we view the problem of navigating the robot to climb a stair from a far distance, without knowing the orientation relationship between the robot and the stair, as a closed loop that lets the robot dynamically walk a long way and then step onto the stair. To test the robot's dynamic performance, we also set up an experiment in which the robot grasps a ball that moves in an unpredictable way.

With a stereo camera it is easy to obtain the disparity between the two images, but if the matching is not good enough the result will contain errors. When navigating the robot to step onto the stair over such a long distance, the error caused by the vision system is outside the allowed range, and over a long walk the progressive error of the mechanism is another factor affecting precision. In this situation, a method based on a ground constraint is used to obtain higher precision, with the vision system providing data to the robot in real time. Because the data acquired from the vision system are not stable while the robot is walking, the robot cannot update them in real time; instead, sampling the data and averaging them fraction by fraction, step by step, makes them stable.
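The sampling-and-averaging scheme described above can be sketched as a small running filter; the window size below is an illustrative assumption, not a value from the paper.

```python
# Stabilize noisy per-frame stereo distance readings by sampling them
# into a sliding window and returning the window average.
from collections import deque

class DistanceStabilizer:
    def __init__(self, window=5):
        self.samples = deque(maxlen=window)  # keeps only the newest samples

    def update(self, measurement):
        """Add one raw stereo distance sample; return the smoothed value."""
        self.samples.append(measurement)
        return sum(self.samples) / len(self.samples)

stab = DistanceStabilizer(window=3)
readings = [2.00, 2.10, 1.90, 2.05]          # metres, illustrative
smoothed = [stab.update(r) for r in readings]
```

Until the window fills, the average is taken over the samples seen so far, so early estimates track the raw readings and later ones are smoothed.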

The rest of this paper is organized as follows. Section II presents related work on robot navigation. Section III introduces the method used to navigate the robot to the stairs and to grasp the moving object autonomously based on vision. Section IV shows the results of the online implementation on the LOCH humanoid robot. Finally, Section V concludes with a discussion and future research directions.



II. RELATED WORK

With stereo cameras it is easy to obtain the depth of an image directly, but this costs a lot of time, especially in stereo correspondence, and it is not a good way to obtain an accurate distance between the goal and the robot. If the robot walks indoors, it is easy to obtain the perspective projection of the distances of points on the floor. Objects lying on the floor, or on the same plane, appear in the same place from one image to the other once the transform between views is applied; objects off that plane appear deformed.

There are papers on indoor navigation covering map-based navigation, map-building-based navigation, and mapless navigation [1]. Some work on biped humanoid robots concentrates on various approaches to 3D navigation, often via planar segmentation [2], assuming precise 3D points can be obtained. As is well known, the farther objects are from the sensor, the larger the errors, and this error is difficult to model.

The 3D information is not accurate enough to be trusted on its own, because some key points in the image plane may be projected to wrong positions, even though most points are correct given an exact calibration result and high image quality. If the object is close to the camera, its 3D information can be obtained, but the humanoid robot is 1.8 m tall; if the stair is 2 m from the robot, the distance between the stair and the eye is long, and high precision cannot be obtained.

Fig. 1 shows the 3D information obtained by a professional stereo camera mounted on the humanoid robot's head; the 3D image (Fig. 1(b)) contains some erroneous points. If we used only the 3D information to locate the stair, we would need to filter and classify the 3D data, which is costly. In this paper we therefore combine 2D and 3D information to navigate the robot; test results show that this achieves high precision.

Fig. 1. (a) Raw image captured by the stereo vision camera; (b) the stereo image.

There is still no fully effective method for recognizing an object such as a stair. Hundreds of papers address pattern recognition, such as gradient-based model matching [3], which does not consider the scale of the model. Others use an affine-transform method to detect the stair, which gives a much better result but cannot run in real time [4]. We have also tried a method widely used in face recognition, FloatBoost learning [5], which achieves about a 70% correct rate. In this paper we assume the stair has been recognized; to recognize it quickly, we paint it green.

III. EFFICIENT METHOD BASED ON GROUND CONSTRAINT

A. Camera Calibration

The stereo vision system is calibrated mainly by Tsai's method. The distortion model is based on Brown's distortion model [6], considering only radial distortion, as shown in Eq. (1) and Eq. (2):

$x_d = x_u(1 + k_1 r^2 + k_2 r^4 + k_5 r^6) + \big(2 k_3 x_u y_u + k_4 (r^2 + 2 x_u^2)\big)$  (1)

$y_d = y_u(1 + k_1 r^2 + k_2 r^4 + k_5 r^6) + \big(k_3 (r^2 + 2 y_u^2) + 2 k_4 x_u y_u\big)$  (2)

To obtain high calibration precision, a method based on the Kalman filter [7] is used. Let $R$ be the rotation matrix and $T$ the translation vector between the camera coordinate system and the reference coordinate system. The rotation matrix $R$ can be described by a unit 4-D vector $q = (q_0, q_1, q_2, q_3)^T$ with $q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$, so that

$R = \begin{bmatrix} r_1 & r_2 & r_3 \\ r_4 & r_5 & r_6 \\ r_7 & r_8 & r_9 \end{bmatrix} = \begin{bmatrix} q_0^2+q_1^2-q_2^2-q_3^2 & 2(q_1 q_2 - q_0 q_3) & 2(q_1 q_3 + q_0 q_2) \\ 2(q_1 q_2 + q_0 q_3) & q_0^2-q_1^2+q_2^2-q_3^2 & 2(q_2 q_3 - q_0 q_1) \\ 2(q_1 q_3 - q_0 q_2) & 2(q_2 q_3 + q_0 q_1) & q_0^2-q_1^2-q_2^2+q_3^2 \end{bmatrix}$  (3)

$T = (x_0, y_0, z_0)^T$  (4)

Take the intrinsic and extrinsic parameters as the state vector, as shown in Eq. (5):

$x_k = (q_0, q_1, q_2, q_3, x_0, y_0, z_0, f_u, f_v, u_0, v_0, k_1, k_2, k_5)^T$  (5)

The state model can be described as

$x_{k+1} = A_k x_k$

in which $x_{k+1}$ is the state vector at instant $k+1$, $x_k$ is the state vector at instant $k$, and $A_k$ is the $14 \times 14$ state transition matrix. During calibration, an image point $(u, v)$ and its corresponding world coordinates $(x_w, y_w, z_w)$ can be observed; take them as the observation vector. Also take the constraint $q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$ as an observation, so the observation vector can be described as

$z_k = \big(h_1(k), h_2(k), h_3(k)\big)^T + \big(n_1(k), n_2(k), n_3(k)\big)^T = H_k(x_k) + N_k$  (6)

in which $H_k(x_k)$ is obtained from the camera projection model and $N_k$ is zero-mean white noise, a function of time.
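The quaternion-to-rotation relation of Eq. (3) can be checked numerically; the sketch below builds the matrix exactly as written, normalizing the input to enforce the unit constraint.

```python
import numpy as np

def quat_to_rotation(q):
    """Rotation matrix of Eq. (3) from a quaternion q = (q0, q1, q2, q3)."""
    q0, q1, q2, q3 = q / np.linalg.norm(q)   # enforce the unit constraint
    return np.array([
        [q0**2+q1**2-q2**2-q3**2, 2*(q1*q2-q0*q3),         2*(q1*q3+q0*q2)],
        [2*(q1*q2+q0*q3),         q0**2-q1**2+q2**2-q3**2, 2*(q2*q3-q0*q1)],
        [2*(q1*q3-q0*q2),         2*(q2*q3+q0*q1),         q0**2-q1**2-q2**2+q3**2],
    ])

# The identity quaternion gives the identity rotation.
R = quat_to_rotation(np.array([1.0, 0.0, 0.0, 0.0]))
```

Any quaternion built this way yields an orthonormal matrix with determinant +1, which is why the filter can carry only four rotation parameters instead of nine.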


$h_1(k) = f_u(k)\,\big(1 + k_1(k) r^2 + k_2(k) r^4 + k_5(k) r^6\big)\,\dfrac{r_1(k) x_w + r_2(k) y_w + r_3(k) z_w + t_x(k)}{r_7(k) x_w + r_8(k) y_w + r_9(k) z_w + t_z(k)} + u_0(k)$

$h_2(k) = f_v(k)\,\big(1 + k_1(k) r^2 + k_2(k) r^4 + k_5(k) r^6\big)\,\dfrac{r_4(k) x_w + r_5(k) y_w + r_6(k) z_w + t_y(k)}{r_7(k) x_w + r_8(k) y_w + r_9(k) z_w + t_z(k)} + v_0(k)$

$h_3(k) = q_0^2 + q_1^2 + q_2^2 + q_3^2 - 1$  (7)

The flow chart for calibrating the vision system is shown in Fig. 2. To obtain high precision, 10 groups of data, with 5 points per group, are used for camera calibration. The re-projection error is 0.02186 in the X direction and -0.2139722 in the Y direction.

Fig. 2. Flow chart of camera calibration: start; input the data; initialize the parameters and set the iteration count $n$; then for $k = 1, \dots, n$ iterate the prediction step

$\hat{x}_k^- = f(\hat{x}_{k-1}, u, 0), \qquad P_k^- = A_k P_{k-1} A_k^T + W_k Q_{k-1} W_k^T$

and the update step

$K_k = P_k^- H_k^T \big(H_k P_k^- H_k^T + V_k R_k V_k^T\big)^{-1}, \qquad \hat{x}_k = \hat{x}_k^- + K_k\big(z_k - h(\hat{x}_k^-, 0)\big), \qquad P_k = (I - K_k H_k) P_k^-$

until $k > n$, then output the parameters and end.
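The predict/update boxes of Fig. 2 follow the standard (extended) Kalman filter recursion. A minimal sketch, with $W$ and $V$ taken as identity (a simplifying assumption) and a toy one-dimensional state rather than the paper's 14-dimensional one:

```python
import numpy as np

def ekf_step(x, P, A, Q, Rn, z, h, H):
    """One predict/update iteration of the Fig. 2 loop.
    h is the measurement function, H its Jacobian at the prior estimate;
    the W and V matrices of the flow chart are taken as identity."""
    x_prior = A @ x                        # state prediction
    P_prior = A @ P @ A.T + Q              # covariance prediction
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + Rn)
    x_post = x_prior + K @ (z - h(x_prior))
    P_post = (np.eye(len(x)) - K @ H) @ P_prior
    return x_post, P_post

# Toy 1-D example: a constant parameter observed directly with noise.
A = np.eye(1); Q = 1e-6 * np.eye(1); Rn = 0.1 * np.eye(1)
H = np.eye(1); h = lambda s: s
x, P = np.zeros(1), np.eye(1)
for z in [1.1, 0.9, 1.0, 1.05]:
    x, P = ekf_step(x, P, A, Q, Rn, np.array([z]), h, H)
```

After a few noisy observations, the estimate converges toward the true value and the covariance shrinks, which is the behavior the calibration loop relies on.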

B. Coordinate System Transform

The coordinate systems and camera model are shown in Fig. 3. The reference plane serves as the world coordinate system; the other coordinate systems, such as the leg coordinate system and the arm coordinate system, are all related to the reference coordinate system.

Fig. 3. Coordinate systems drawn on an undistorted image and the camera model: (a) coordinates plotted on the image; (b) camera model.

The relationship between the shoulder coordinate system and the leg coordinate system can be measured, but because it is not easy to measure the arm's or shoulder's posture in the reference coordinate system, the relationship between the tip of the arm and the reference plane is calibrated in advance. The coordinate of a point $P$ in $O_{ref} x_{ref} y_{ref} z_{ref}$ is unknown; assume its coordinate is $P_{ref} = (P_x^{ref}, P_y^{ref}, P_z^{ref})^T$, so

$P = R\, P_{ref} + T$  (8)

Here $R$, $T$, and $P_{ref}$ are unknown. If the motion of point $P$ is known, with incremental vector $(\Delta x_i, \Delta y_i, \Delta z_i)^T$ at each step, then

$P_i = R\big(P_{ref} + (\Delta x_i, \Delta y_i, \Delta z_i)^T\big) + T, \qquad i = 1, 2, 3, \dots, n$  (9)

There are 10 unknown parameters in Eq. (8), so if the object moves four times, all parameters can be obtained.
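One way to solve Eq. (9) from several known increments is orthogonal Procrustes on the centered point sets. Note that Eq. (9) can be rewritten as $P_i = R\,\Delta_i + c$ with $c = R\,P_{ref} + T$, so from these measurements alone the sketch below (not necessarily the paper's exact solver) recovers $R$ and the combined offset $c$; separating $P_{ref}$ from $T$ requires additional information.

```python
import numpy as np

def fit_rigid(deltas, observed):
    """Given known increments D_i and measured points P_i with
    P_i = R @ D_i + c  (c = R @ P_ref + T, cf. Eq. 9),
    recover R and c via centered least squares + SVD (Kabsch/Procrustes)."""
    D = np.asarray(deltas, float); P = np.asarray(observed, float)
    Dc = D - D.mean(axis=0); Pc = P - P.mean(axis=0)
    U, _, Vt = np.linalg.svd(Pc.T @ Dc)            # cross-covariance SVD
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # det(R) = +1
    R = U @ S @ Vt
    c = P.mean(axis=0) - R @ D.mean(axis=0)
    return R, c

# Synthetic check: a 30-degree rotation about z plus an offset.
th = np.pi / 6
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
c_true = np.array([1.0, 2.0, 3.0])
moves = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], float)
measured = (R_true @ moves.T).T + c_true
R_est, c_est = fit_rigid(moves, measured)
```

The increments must span three dimensions (as above) for the rotation to be fully determined.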

C. Plane Constraint

The transformation between the image plane and the floor, or any other plane, can be represented by a constant 3×3 matrix, i.e., described as a matrix projection. Denote by $T$ the translation matrix and by $R$ the rotation matrix; the coordinate change between the image frame and the camera frame can be described by a projective matrix $M$:

$M = \begin{bmatrix} \alpha & 0 & u_0 & 0 \\ 0 & \alpha & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$  (10)

where $\alpha$ represents the ratio of the camera focal length to the size of one pixel and $(u_0, v_0)$ is the principal point. The matrix $T_1$ from the floor (or another plane) to the image plane can then be obtained:

$T_1 = M R T$  (11)

If $P$ is a point with coordinates $(X, Y, Z, 1)$ in the floor plane or another plane, its corresponding coordinate in the image plane is

$p = T_1 P = (x, y, z)^T$  (12)

Page 5: Fast and Accurate Humanoid Robot Navigation Guided By ... and accurate humanoid robot... · Fast and Accurate Humanoid Robot Navigation Guided By Stereovision Conference Paper ·

It is then easy to obtain the coordinates $(u, v)$ of the projection of $P$ in the image. If $Z = 0$, the above equation can be written as

$\begin{bmatrix} kX \\ kY \\ k \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$  (13)

Denote by $I_{left}$ and $I_{right}$ the images taken by the two cameras. From the camera parameters we know the relationship between $I_{left}$ and $I_{right}$; let $I_{right2left}$ be the image transformed from $I_{right}$ to the left view. If points belong to the same plane, their matching points in $I_{right}$ and $I_{right2left}$ will be at the same position. Comparing $I_{left}$ with $I_{right2left}$, points on the same plane have identical coordinates in both images, so wherever the coordinates of corresponding points coincide, the objects lie on that plane. Fig. 4 shows $I_{left}$ and $I_{right2left}$: the objects on the ground overlap. In this way, it is easy to detect objects on the ground.

Fig. 4. (a) Left image; (b) transformed right image overlapped on the left image.
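The plane-induced mapping of Eq. (13) is a 3×3 homography applied in homogeneous coordinates; a minimal sketch (the matrix H below is illustrative, not a calibrated value):

```python
import numpy as np

def apply_homography(H, uv):
    """Map an image point (u, v) through a 3x3 plane homography, Eq. (13)."""
    u, v = uv
    x = H @ np.array([u, v, 1.0])
    return x[:2] / x[2]                  # divide out the scale factor k

# Illustrative homography: a pure shift of ground-plane coordinates.
H = np.array([[1.0, 0.0,  5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0,  1.0]])
p = apply_homography(H, (10.0, 20.0))    # -> (15.0, 17.0)

# Same-plane test of the text: with the left/right ground homography,
# a point on the floor maps to the same coordinates in both views.
q = apply_homography(np.eye(3), (7.0, 8.0))
```

A real ground homography would be estimated from the calibration; the comparison of $I_{left}$ and $I_{right2left}$ then amounts to checking whether a point's coordinates agree after this mapping.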

D. Image Segmentation and Edge Detection

To extract the main edges and remove image noise, a Gaussian filter is applied and the Canny method is then used to detect the edges. Fig. 5 shows a raw image and its edge image.

Fig. 5. (a) Raw image; (b) its edge image.
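The smoothing-then-edges stage can be sketched in a few lines; this simplified stand-in applies a Gaussian kernel and a Sobel gradient magnitude rather than the full Canny pipeline (no non-maximum suppression or hysteresis).

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian kernel."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def convolve2d(img, k):
    """Naive 'valid' 2-D correlation (enough for a sketch)."""
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

def edge_magnitude(img):
    """Gaussian smoothing followed by Sobel gradient magnitude."""
    smoothed = convolve2d(img, gaussian_kernel())
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    gx = convolve2d(smoothed, sx)
    gy = convolve2d(smoothed, sx.T)
    return np.hypot(gx, gy)

# Vertical step edge: the response peaks near the step, not at the borders.
img = np.zeros((11, 11)); img[:, 6:] = 1.0
edges = edge_magnitude(img)
```

Thresholding `edges` gives a binary edge map comparable in spirit to Fig. 5(b).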

E. Distance and Height Detection

To obtain 3D information, we have to match both images, i.e., find correspondences between them. To reduce the computational complexity, correspondences are not needed for the whole images; we only detect a region of interest and match the ROI.

Because the stair is already known by its green color, we only need to detect its parallel lines and find the stair's direction; the robot can then adjust its own posture and step onto the stair. In this paper we use the Hough transform to detect lines. Because the stair is composed of parallel lines (except from special viewpoints), only the parallel lines need to be detected. A novel method, local-gradient-operator enhancement, is used for peak detection in Hough space; the kernel of the enhancement algorithm is shown in Fig. 6 and Eq. (14).

Fig. 6. Definition of the members inside an 8-neighborhood of an accumulator unit.

$r = \sum_{m=1,\, m \neq 5}^{9} (z_5 - z_m)\cos\theta_m$  (14)

Fig. 7(a) shows the original Hough transform image; Fig. 7(b) shows the Hough transform image after local-gradient enhancement.

Fig. 7. Hough transform image before (a) and after (b) local-gradient-operator enhancement.

As Fig. 8 shows, there are several parallel lines whose slope angles are near zero. Constrained by the ground, one of these parallel lines is in contact with the floor: that is the stair's baseline. Starting above the baseline, and considering the parallel-line relationships, it is easy to obtain the stair's height.
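A minimal Hough-transform accumulator, without the local-gradient enhancement, illustrates how the near-horizontal stair lines are voted for (grid resolutions and the synthetic line are illustrative):

```python
import numpy as np

def hough_peak(points, img_diag, n_theta=180, n_rho=200):
    """Vote edge points into a (rho, theta) accumulator and return the
    strongest line, using rho = x*cos(theta) + y*sin(theta)."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for x, y in points:
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rhos + img_diag) / (2 * img_diag) * (n_rho - 1)).astype(int)
        acc[idx, np.arange(n_theta)] += 1       # one vote per theta bin
    r_i, t_i = np.unravel_index(np.argmax(acc), acc.shape)
    rho = r_i / (n_rho - 1) * 2 * img_diag - img_diag
    return rho, thetas[t_i]

# Horizontal edge y = 7, the kind of near-zero-slope line a stair produces.
pts = [(x, 7.0) for x in range(100)]
rho, theta = hough_peak(pts, img_diag=120.0)
```

The peak lands at theta ≈ π/2 (a horizontal line) and rho ≈ 7 up to the bin quantization; the paper's enhancement of Eq. (14) sharpens exactly these accumulator peaks before they are extracted.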

F. Stair Orientation Determination

Now that the stair's position in the images is known, the end points of the stair's two sides can be detected. Although the baseline of the stair has been detected, stereo matching is still needed to obtain the 3D information of the stair; in this paper, the SAD method is used for stereo matching.

Once the positions of the two sides' end points are known, it is easy to determine the relationship between the humanoid and the stair; in this way, the robot can adjust its own orientation and determine how to approach and climb the stair.
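SAD matching can be sketched as a per-pixel scan along the same scanline of the rectified pair; window size and search range below are illustrative.

```python
import numpy as np

def sad_disparity(left, right, x, y, win=3, max_disp=16):
    """Sum-of-absolute-differences match for pixel (x, y) of the left image
    against candidates on the same scanline of the right image."""
    h = win // 2
    patch = left[y-h:y+h+1, x-h:x+h+1].astype(float)
    best, best_d = np.inf, 0
    for d in range(max_disp):
        if x - d - h < 0:                 # candidate window out of bounds
            break
        cand = right[y-h:y+h+1, x-d-h:x-d+h+1].astype(float)
        cost = np.abs(patch - cand).sum()
        if cost < best:
            best, best_d = cost, d
    return best_d

# Synthetic pair: the right view is the left shifted 4 pixels leftward,
# so every matched pixel should report a disparity of 4.
rng = np.random.default_rng(0)
left = rng.random((20, 40))
right = np.zeros_like(left)
right[:, :-4] = left[:, 4:]
d = sad_disparity(left, right, x=20, y=10)
```

Restricting this scan to the detected end points (the ROI mentioned above) keeps the matching cost low.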

G. Moving Ball Detection and Grasping

The relationship between the arm-hand coordinate system and the reference coordinate system can be determined by the method in Section III.B. In this manuscript a red ball is the target, so the image only needs to be segmented; because there may be some noise, as shown in Fig. 8(a), we choose the mean-shift method to segment the ball. Fig. 8(b) shows the ball detected by the mean-shift method.

Fig. 8. Ball detection: (a) original image; (b) detected image.
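As a simplified stand-in for the mean-shift segmentation, a color threshold plus centroid illustrates locating a red target (the thresholds and the synthetic frame are illustrative assumptions):

```python
import numpy as np

def red_centroid(rgb, r_min=150, dominance=50):
    """Threshold pixels whose red channel is high and dominates green/blue,
    then return the centroid (x, y) of that mask, or None if empty.
    (Simple stand-in for the paper's mean-shift segmentation.)"""
    r = rgb[:, :, 0].astype(int)
    g = rgb[:, :, 1].astype(int)
    b = rgb[:, :, 2].astype(int)
    mask = (r > r_min) & (r - g > dominance) & (r - b > dominance)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return xs.mean(), ys.mean()

# Synthetic frame: grey background with a red square centered at (12, 8).
img = np.full((20, 30, 3), 80, dtype=np.uint8)
img[6:11, 10:15] = [200, 30, 30]
c = red_centroid(img)
```

The centroid feeds the same arm-hand coordinate transform as the mean-shift result would; mean-shift is preferred in the paper because it tolerates the noise visible in Fig. 8(a).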

IV. EXPERIMENTS AND RESULTS ANALYSIS

We use the humanoid robot for the tests; the appearance of the robot is shown in Fig. 9, and the layout of its degrees of freedom is shown in Fig. 10.

Fig. 9. Humanoid LOCH: designed performance view (left) and real performance view (right).

Fig. 10. Layout of the degrees of freedom of the humanoid.

The relationship between every coordinate system and the reference coordinate system was calibrated as in Section III. Using the methods described above, we ran the tests. Fig. 11(a) shows the robot LOCH stepping onto the stair; Fig. 11(b) shows LOCH following the moving ball. The head can also follow the ball, so the ball always stays in the center of the image.

Fig. 11. Experimental tests: (a) LOCH stepping onto a stair; (b) LOCH grasping a moving ball.

Nine detection results are compared with the traditional 3D method. Because the stair height cannot be obtained directly from the floor constraint, the traditional 3D method is used to obtain the height of the stair and then verify the error. The error relative to the measured real value is shown in Fig. 12.


The values obtained with the floor constraint are very near the real values. In practice, because the image is not stable, some errors occur; the data obtained from the raw stereo image are not ideal, and the farther the object, the larger the error, whereas the floor-constraint values are stable. We can therefore verify the stair height based on the distance: after obtaining the stereo data of the stair, the floor constraint is used to check the error. Test results show that the humanoid can find the stair and step onto it with high precision.

Fig.12. Test results comparison

V. CONCLUSION

In this manuscript, an image processing method has been proposed for humanoid navigation in a dynamic environment. The whole information system provides accurate data to the robot in real time, so the robot can correct the progressive error stage by stage. An effective camera calibration method, a good feature extraction method, and an accurate stereo-data acquisition method are all used in this manuscript; integrated together, these methods make the robot work effectively. Test results show that the methods presented here let the robot adapt quickly to changes in the environment, especially a dynamic one.

ACKNOWLEDGMENT

The authors would like to thank the LOCH project and all the members of this project.

REFERENCES

[1] O. D. Faugeras and M. Herbert, "Representation, recognition and locating of 3-D objects," International Conference on Pattern Recognition, 1986, pp. 15-20.

[2] S. Ganapathy, "Decomposition of transformation matrices for robot vision," Proc. IEEE International Conference on Robotics and Automation, 1984, pp. 130-139.

[3] O. D. Faugeras and G. Toscani, "Camera calibration for 3D computer vision," Proc. Int. Workshop on Industrial Application of Machine Vision and Machine Intelligence, New York: IEEE, 1986, pp. 15-20.

[4] R. Y. Tsai, "An efficient and accurate camera calibration technique for 3D machine vision," Proc. IEEE Conference on Computer Vision and Pattern Recognition, 1986, pp. 364-374.

[5] R. Y. Tsai, "A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses," IEEE Journal of Robotics and Automation, 1987, 4, pp. 323-344.

[6] L. M. Song, M. P. Wang, L. Lu, and H. J. Huan, "High precision camera calibration in vision measurement," Optics and Laser Technology, 2007(39), pp. 1413-1420.

[7] J. Weng, P. Cohen, and M. Herniou, "Camera calibration with distortion models and accuracy evaluation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 1992, 14(10), pp. 965-980.

[8] Z. Y. Zhang, "A flexible new technique for camera calibration," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(11), pp. 1330-1334.

[9] Z. Y. Zhang, "Flexible camera calibration by viewing a plane from unknown orientations," IEEE International Conference on Computer Vision, Greece, 1999, pp. 666-673.

[10] R. E. Kalman, "A new approach to linear filtering and prediction problems," Transactions of the ASME - Journal of Basic Engineering, 1960, pp. 35-45.

[11] G. J. Castro, J. M. Gallego, and L. P. E. Cabello, "An effective camera calibration method," IEEE AMC'98-Coimbra, 5th International Workshop, 1998, pp. 171-174.

[12] F. M. Mirzaei and S. I. Roumeliotis, "A Kalman filter-based algorithm for IMU-camera calibration," Proc. 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, USA, Oct. 29 - Nov. 2, 2007, pp. 2427-2434.

[13] F. M. Mirzaei, "A Kalman filter-based algorithm for IMU-camera calibration: observability analysis and performance evaluation," IEEE Transactions on Robotics, 2008, 24(5), pp. 1143-1156.

[14] F. Zhou, J. Zhai, and G. Zhang, "A camera calibration method based on iterated extended Kalman filter using planar target," Sixth Intl. Symp. on Instrumentation and Control Technology: Sensors, Automatic Measurement, Control, and Computer Simulation, 2006.

[15] L. Wang, M. Xie, Z. W. Zhong, H. J. Yang, and J. Li, "Design of dexterous arm-hand for human-assisted manipulation," First International Conference on Intelligent Robotics and Applications, Wuhan, China.

[16] G. Q. Wei and S. D. Ma, "Implicit and explicit camera calibration: theory and experiment," IEEE Trans. Pattern Recognition and Machine Intelligence, 1994, 16(5), pp. 469-480.

[17] D. C. Brown, "Lens distortion for close-range photogrammetry," Photometric Engineering, 1971, 37(8), pp. 855-866.
