
Paper:

Wearable Hand Pose Estimation for Remote Control of a Robot on the Moon

Sota Sugimura and Kiyoshi Hoshino
University of Tsukuba

1-1-1 Tennodai, Tsukuba 305-8573, Japan
E-mail: {s1620795@u, hoshino@esys}.tsukuba.ac.jp
[Received March 20, 2017; accepted August 1, 2017]

In recent years, a plan has been developed to conduct an investigation of an unknown environment by remotely controlling an exploration robot system sent to the Moon. For the robot to successfully perform sophisticated tasks, it must implement multiple degrees of freedom not only in its arms and legs but also in its hands and fingers. Moreover, the robot has to be a humanoid type so that it can use the tools used by astronauts without modification. On the other hand, to manipulate the multiple-degrees-of-freedom robot without learning special manipulation skills and to minimize the psychological burden on operators, a method that utilizes the everyday movements of operators as input to the robot, rather than a special controller, is ideal. In this paper, the authors propose a compact wearable device that estimates the hand pose (hand motion capture) of the subject. The device has a miniature wireless RGB camera placed on the back of the user’s hand, rather than on the palm. Attaching the small camera to the back of the hand may minimize the restraint on the subject’s motions during motion capture. In conventional techniques, the camera is attached to the palm because images of the fingertips always need to be captured before images of any other part of the hand. In contrast, the image processing algorithm proposed herein can estimate the hand pose without first capturing the fingertips.

Keywords: hand pose estimation, wearable, miniature RGB camera, remote control of a robot

1. Introduction

Based on observation data transferred from the lunar probe SELENE (launched by JAXA in 2007), Haruyama et al. found vertical holes of approximately 10 m in diameter on the Moon [1]. Vast underground spaces are assumed to lie beneath them. The vertical holes and underground spaces found on the Moon and Mars are expected to yield clues that will further studies on life science and the formation of telluric planets, and direct exploration of these sites on the Moon is desired.

The underground spaces linked to the vertical holes are protected from cosmic radiation and meteorite impact; moreover, their temperature is more stable than that of the lunar surface. For these reasons, they are assumed to offer extremely advantageous conditions for a lunar base, and an early investigation of their underground environments is expected from the standpoint of using them as one.

Haruyama et al. have been conducting a study under the UZUME (Unprecedented Zipangu Underworld of the Moon (Mars) Exploration) project, expected to be completed in the 2020s, which aims to descend through these vertical holes to the underground spaces in order to explore the subsurface of the Moon and Mars directly [2]. To investigate the underground spaces of the Moon and Mars directly, an exploration system is essential that can land on a spot as close to the vertical hole as possible, reach the underground space at the bottom of the hole, and move through that space, an unknown environment, to make a scientific investigation. To this end, a design of an exploration robot system has been developed with the aim of achieving a system capable of making a scientific investigation in an unknown environment at a site very far from Earth, such as the Moon, by means of manipulation as a geologist on Earth intends [3].

The UZUME project aims to develop a remote-controlled proxy robot capable of making a scientific investigation on behalf of a geologist on Earth.

For this reason, it is desirable to use a remote-controlled robot skillful enough to perform tasks at the same level as human beings. The tasks assumed to be necessary for scientific investigations include, in the case of geological investigations, “inserting a zoom camera into a crack in a rock,” “cutting off a rock to collect samples,” “crushing collected samples and putting them in an analyzer,” and “grinding sample surfaces for observation under a microscope.” Because space robots incur considerable rocket launch costs, for example a transportation cost of US$100/g for a launch toward the Moon, a severe restriction is imposed on the use of resources. To address this problem, it is important to improve the versatility of the robot so that it can skillfully perform various types of tasks with as few additional devices as possible.



Fig. 1. Conceptual scheme of control of the exploration movements of the lunar robot by remote control based on human gestures on Earth (figure labels: Earth, commands, the Moon, information obtained at the site; image cited from JAXA Digital Archives).

To this end, it is necessary to equip the robot with a five-fingered, multi-degree-of-freedom hand so that it can manipulate tools efficiently, a capability that exceeds the existing ones of “grasping something” and “releasing something.”

Most of the environments of the Moon and Mars are unknown to us, and the next movement must be determined depending on the situation, based on information obtained at the site. As a delay occurs in communication between Earth and the Moon, it is desirable that the robot perform basic tasks under its own control. However, the most practical method for enabling the robot to recognize the current situation of the unknown environment and to determine and select appropriate actions is for an operator, much like an air traffic controller, to give commands from Earth to the robot on the Moon by remote control. Moreover, to achieve a deft robot capable of using tools efficiently, multiple degrees of freedom must be incorporated not only in its arms and legs but also in its hands and fingers; however, it is technically very difficult to make the robot perform skillful tasks as well as a human does. To address this problem, a practical method is needed that takes advantage of tele-existence robot technology to build a master/slave system in which the robot follows the movements of an operator on Earth, captured by motion capture, so that the multiple-degrees-of-freedom robot can be manipulated skillfully [4]. Fig. 1 shows the outline of the exploration system. A geologist on Earth, who is required to be able to determine the actual circumstances of unknown environments, plans an optimal trajectory, with a somewhat longer processing time permitted, gives instructions on the necessary movements to the robot, and indicates the targets. In contrast, an autonomous control loop incorporated in the lunar robot executes the hand target position/posture tracking control loop and profile control of the working targets, which require processing with a short time constant and would fail to achieve their objective if the robot started to move only after receiving the geologist’s instructions. The use of a method that captures the operator’s hand motions and enters them into the system, instead of one that relies on

tools such as joysticks and controllers, allows scientists on Earth who do not have considerable experience with the system to give commands for sophisticated movements using tools intuitively and easily, with no need to learn an operating technique. A delay in communication (more than 3 s for an Earth-Moon round trip) may deteriorate the operability of the robot. The authors expect to mitigate this disadvantage by additionally incorporating a predictive delay indicator [5] into the operation system on Earth, and a local auto-corrector into the robot that modifies trajectories and adjusts contact force, based on tactile force, when the robot holds or manipulates objects [6].

However, it is difficult for a motion capture device to measure the large movements of the arms, legs, and body trunk at the same time as the more subtle movements of the fingers. This is due to the differences in body-part size, range of movement, and complexity of the change in geometry of hand movements compared with body movements. Therefore, it is necessary to combine conventional motion capture systems with a technique specialized for measuring the fingers.

If hand pose estimation is to be used together with a motion capture system, technical specifications that do not interfere with the motion capture are required. Such specifications include the ability to estimate the pose of a relatively small hand even while the subject moves freely around a wide optical motion capture studio; moreover, the equipment must have a form and mass that do not make the subject feel uncomfortable. The authors, along with their colleagues, have proposed a non-contact system that performs hand pose estimation immediately and accurately using one or two RGB high-speed cameras [7–9]. However, this non-contact hand pose estimation system has a disadvantage: when the subject moves freely around a wide studio, as with motion capture, the images of a hand far from the camera(s) are very small, making hand pose estimation difficult. To solve this problem, the authors have proposed the use of a wearable camera, rather than a completely non-contact camera, as this may achieve a compact hand pose estimation device.

One of the techniques we identified was the use of a wireless data glove [10]. However, the data glove has disadvantages: it is expensive and feels tight; its strain gauges (sensors) and wiring are easily broken; and it cannot be used to capture quick motions. Furthermore, this technique is not suitable for motions such as manipulating an object with the palm or making a fist.

Digits [11] is an example of a hand pose estimation device that uses a wireless wearable camera. When Digits is used, an infrared sensor or infrared camera is attached to the subject’s wrist. Digits then captures images of the subject’s hand from the palm side and measures the distances between the sensor and the fingers to achieve hand pose estimation. Note that this system, which requires a sensor attached to the wrist, has the disadvantage of responding poorly to flexing and extending motions.


Fig. 2. Appearance of the device (labeled parts: wireless RGB camera, flexible wire, wrist supporter, battery).

In addition, in this system, the sensor must be attached on the palm side of the hand; thus, it may disturb the subject’s motions, including daily actions (e.g., walking). Other techniques have been reported: a study on hand pose estimation using an RGB camera and an AR marker [12], and a study on hand pose estimation based on the convex and concave shapes of the wrist using large numbers of photo-reflectors attached to the wrist [13]. Both of these approaches have disadvantages such as preventing the subject from working and insufficient accuracy.

To address these problems, we propose a compact system that estimates the hand pose of a subject who wears an ultraminiature wireless RGB camera on the back of the hand, rather than on the palm. Attaching the camera to the back of the hand may minimize the restraint on the subject’s motions during motion capture. In conventional techniques, the camera is attached to the palm because images of the fingertips always need to be captured before images of any other parts of the hand. In contrast, the algorithm the authors propose here can estimate the hand pose without first capturing images of the fingertips.

2. System Configuration

2.1. Hardware Configuration

The hardware is composed of a wireless camera, an arm for fixing the camera, and a mobile battery. The hand images acquired by the wireless camera are transferred to a computer that estimates the hand shape. Fig. 2 shows the appearance of the device.

Fig. 3. Schematic of the database.

A small 5.8 GHz Wireless Mini Camera TE60A (3rd Eye Electronics) was used as the wireless camera. Its angle of view is 90°, and the resolution of acquired images is 720×480 pixels.

The fixing arm is composed of a flexible wire and a medical wrist supporter. The cable connecting the camera and the battery is longer than the subject’s arm, and the battery is worn around the subject’s body trunk, where its moment interferes less with movement.

2.2. Building a Database

The database is built by treating information on hand joint angles, hand silhouettes, higher-order local autocorrelation (HLAC) features [14] (described later), and ridge lines as one dataset. The database is a collection of such datasets covering a wide variety of views. Fig. 3 shows the schematic of the database.

The input images used for building the database are created using the three-dimensional human figure software Poser (Curious Labs). As shown in Fig. 4, this software allows CG hand images corresponding to specified hand joint angles to be created. The CG hand images created using this software are formed from the parts of the first to third joints of each finger plus the palm, i.e., 16 parts in total. The individual parts can be scaled three-dimensionally to reflect individual differences in hand shape among subjects. Moreover, changing the camera position parameter in the software yields images acquired from various viewpoints. Images in which the hand area and the background area are binarized are acquired in advance. The hand joint angles specified at that time are retained as the output values of the dataset for estimation.

All the acquired images are preprocessed as described below. First, the image is binarized to separate the background area. Then, the areas are labeled, and the area with the largest label size is assumed to be the hand area. Finally, the hand area image is reduced to a size of 64×64 pixels. Each pixel of the resulting image is retained as


Fig. 4. Input images for creating a database.

(a) Input image (b) Preprocessed image

Fig. 5. Input image and its preprocessed image.

silhouette information indicating whether the pixel is in the foreground or background area. Fig. 5 shows an input image and its preprocessed image.
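To make the preprocessing concrete, the following is a minimal Python sketch using OpenCV. The paper gives no implementation; the function name, the use of Otsu thresholding, and the bounding-box crop before resizing are our assumptions.

```python
import cv2
import numpy as np

def preprocess(image_bgr: np.ndarray) -> np.ndarray:
    """Binarize, keep the largest labeled area as the hand, and reduce it to 64x64."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Binarize to separate the background area (Otsu chooses the threshold).
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Label the areas; the one with the largest size is assumed to be the hand area.
    n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    if n_labels < 2:  # no foreground component found
        return np.zeros((64, 64), dtype=np.uint8)
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))  # label 0 is background
    x = stats[largest, cv2.CC_STAT_LEFT]
    y = stats[largest, cv2.CC_STAT_TOP]
    w = stats[largest, cv2.CC_STAT_WIDTH]
    h = stats[largest, cv2.CC_STAT_HEIGHT]
    hand = np.where(labels == largest, 255, 0).astype(np.uint8)[y:y + h, x:x + w]
    # Reduce the hand-area image to 64x64; each pixel is retained as silhouette information.
    return cv2.resize(hand, (64, 64), interpolation=cv2.INTER_NEAREST)
```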

From the 64×64 pixel image, the contour of the hand is extracted. As shown in Fig. 6, the acquired image is divided into eight blocks vertically and horizontally, creating 64 blocks in total. The number of occurrences of each of the 25 types of HLAC patterns shown in Fig. 7 is counted in each block, and these counts are taken as the HLAC feature. Twenty-five-dimensional information is obtained for each block, i.e., a 1,600-dimensional (25×8×8) feature is acquired for each image.
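As an illustration of this counting, here is a sketch assuming the standard 25 binary HLAC mask patterns of orders 0-2 in a 3×3 window. The mask enumeration (deduplicating translated patterns), the zero border, and all names are our assumptions; whether the counting runs on the contour image or the full silhouette is not specified by the paper.

```python
import numpy as np
from itertools import combinations_with_replacement

def hlac_masks():
    """Enumerate the 25 standard HLAC displacement sets (orders 0-2, 3x3 window)."""
    pts = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    seen, masks = set(), []
    for order in (0, 1, 2):
        for disp in combinations_with_replacement(pts, order):
            s = frozenset(((0, 0),) + disp)
            if len(s) != order + 1:
                continue  # repeated points collapse to a lower-order pattern
            my = min(p[0] for p in s)
            mx = min(p[1] for p in s)
            canon = frozenset((p[0] - my, p[1] - mx) for p in s)
            if canon not in seen:  # drop patterns that are translated duplicates
                seen.add(canon)
                masks.append(sorted(s))
    return masks  # 1 + 4 + 20 = 25 patterns

def hlac_features(binary64: np.ndarray) -> np.ndarray:
    """Count the 25 patterns in each of the 8x8 blocks of a 64x64 binary image."""
    f = (binary64 > 0).astype(np.int32)
    pad = np.pad(f, 1)  # zero border so the 3x3 masks never run off the image
    per_mask = []
    for mask in hlac_masks():
        prod = np.ones((64, 64), dtype=np.int32)
        for dy, dx in mask:  # product over the pixels the pattern touches
            prod *= pad[1 + dy:65 + dy, 1 + dx:65 + dx]
        # Sum the responses inside each 8x8-pixel block -> an 8x8 grid of counts.
        per_mask.append(prod.reshape(8, 8, 8, 8).sum(axis=(1, 3)))
    return np.stack(per_mask, axis=-1).ravel()  # 8 x 8 x 25 = 1,600 dimensions
```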

Individual differences tend to be strongly reflected in the contours of the fingers, for example in their sizes (length and thickness) and the positions of their bases. To address this problem, information on a ridge line passing through the centers of the fingers and the palm is used, as it may be less sensitive to such individual differences. Thinning is applied to the 64×64 pixel image obtained by the preprocessing described in this section, and the coordinates of the foreground area of the thinned image are obtained as

Fig. 6. Hand area divided into 8×8 blocks.

Fig. 7. Input local patterns of the higher-order local autocorrelation function.

Fig. 8. Information on the ridge line.

information on the ridge line. The Hilditch method is used for the thinning. Fig. 8 shows the image formed based on the information on the ridge line.
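A minimal sketch of the ridge line extraction follows. For brevity it uses scikit-image's skeletonize as a stand-in for the Hilditch thinning named above, so the exact skeleton may differ from the authors'; the function name is ours.

```python
import numpy as np
from skimage.morphology import skeletonize  # stand-in for Hilditch thinning

def ridge_line(sil64: np.ndarray) -> np.ndarray:
    """Thin the 64x64 silhouette and return the foreground coordinates (the ridge line)."""
    skeleton = skeletonize(sil64 > 0)  # one-pixel-wide line through fingers and palm
    ys, xs = np.nonzero(skeleton)      # coordinates of the thinned foreground area
    return np.stack([ys, xs], axis=1)  # N x 2 array of ridge-line pixel coordinates
```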

2.3. Searching the Database

The method for preprocessing input images for estimation, and the methods for calculating the HLAC features, ridge line information, and silhouette information, are the same as those used for building the database, described in Section 2.2. The input image



Fig. 9. Example of the penalty.

is compared with the images in the database based on the obtained features, and the most similar image is taken as the result of estimation. Finally, the hand joint angles in the dataset of the estimation result are output. This procedure is described below in detail.

To speed up the search, the search candidates are narrowed down in two stages. Penalties are calculated using the silhouette and ridge line information obtained by the methods described in Section 2.2. Fig. 9 shows an example of the penalty.

First, the area of the input image’s ridge line that protrudes from the silhouette area of the database image is calculated from the ridge line information of the input image and the silhouette area of the dataset, and the value [pixels] of the protruding area is taken as penalty P1. In the first narrowing stage, the datasets whose P1 values exceed the threshold given by the following formula are excluded from the search.

$$th_1 = C \cdot A_{\mathrm{input}} \qquad (1)$$

th_1: Threshold for the first narrowing stage.
C: Constant.
A_input: Area [pixels] of the ridge line of the input image.

Figure 10 shows an example of the first narrowing stage. In this stage, any dataset that has no finger in an area where the input image does may be excluded from the search targets.

After the first narrowing stage has been completed, the second narrowing stage is applied to the remaining datasets. The area of the database image’s ridge line that protrudes from the silhouette area of the input image is calculated from the ridge line information of the dataset and the silhouette area of the input image, and the value [pixels] of the protruding area is taken as the second penalty P2. In the second narrowing stage, the datasets whose P2 values exceed the threshold given by the following formula are excluded from the search.

$$th_2 = C_1 \cdot A_{\mathrm{dataset}}^2 + C_2 \cdot A_{\mathrm{dataset}} \qquad (2)$$

th_2: Threshold for the second narrowing stage.
C_1, C_2: Constants.
A_dataset: Area [pixels] of the ridge line of the database image.

Fig. 10. First narrowing stage.

Fig. 11. Second narrowing stage.

Fig. 12. Thresholds used to narrow the search region (threshold [pixel] versus area of the ridge line [pixel]; curves for formulas (1) and (2)).

Figure 11 shows an example of the second narrowing stage. In this stage, any dataset that has a finger in an area where the input image does not may be excluded from the search targets.

As penalties P1 and P2 increase with the area of the ridge line (specifically, in hand postures with more fingers stretched), the thresholds should be defined so that they increase with the area of the ridge line. Fig. 12 shows a graph of formulas (1) and (2). As shown in the figure, if the threshold is defined in terms of a linear function, it is


less strict for a smaller area of the ridge line and stricter for a larger area. If the threshold is defined in terms of a quadratic function, it is less strict for a larger area and stricter for a smaller area. In a preliminary test, the search targets disappeared because the threshold was too strict in the large-area region when both th1 and th2 were defined as linear functions, and in the small-area region when both were defined as quadratic functions. To address this problem, th1 was defined as a linear function and th2 as a quadratic function, which narrows the search region while retaining the search targets.

These two narrowing stages exclude from the search the datasets whose shapes differ considerably from that of the input image, reducing the search time.
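The two stages can be summarized in code as follows. This sketch uses the constant values reported in Section 3.2 (C = 0.25, C1 = 0.002, C2 = 0.03), represents ridge lines and silhouettes as 64×64 boolean masks, and all names (Dataset, penalty, narrow) are ours.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Dataset:
    silhouette: np.ndarray    # 64x64 boolean foreground mask
    ridge: np.ndarray         # 64x64 boolean ridge-line mask
    hlac: np.ndarray          # 1,600-dimensional HLAC feature
    joint_angles: np.ndarray  # hand joint angles retained for output

def penalty(ridge: np.ndarray, silhouette: np.ndarray) -> int:
    """Area [pixels] of the ridge line that protrudes from the silhouette."""
    return int(np.count_nonzero(ridge & ~silhouette))

def narrow(datasets, in_sil, in_ridge, C=0.25, C1=0.002, C2=0.03):
    """Two-stage narrowing: drop datasets whose P1 or P2 exceeds Eq. (1) or Eq. (2)."""
    a_input = np.count_nonzero(in_ridge)
    survivors = []
    for d in datasets:
        p1 = penalty(in_ridge, d.silhouette)   # input ridge line vs. database silhouette
        if p1 > C * a_input:                   # Eq. (1): linear threshold
            continue
        a_dataset = np.count_nonzero(d.ridge)
        p2 = penalty(d.ridge, in_sil)          # database ridge line vs. input silhouette
        if p2 > C1 * a_dataset**2 + C2 * a_dataset:  # Eq. (2): quadratic threshold
            continue
        survivors.append((d, p1, p2))
    return survivors
```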

After the narrowing stages have been completed, the remaining datasets are searched by calculating dissimilarity. The dissimilarity is calculated by the following formula using the HLAC features and the penalties described in this section:

$$E[i] = \sum_{k=1}^{D} \left(\mathrm{current}_k - \mathrm{data}[i]_k\right)^2 + C_3 \cdot P_1[i] + C_4 \cdot P_2[i] \qquad (3)$$

i: Database number.
E[i]: Dissimilarity between the input image and database number i.
k: Dimension index of the HLAC feature.
D: Number of dimensions of the HLAC feature per image.
current_k: HLAC feature of dimension k of the input image.
data[i]_k: HLAC feature of dimension k of the image with data number i.
C_3, C_4: Constants.
P_1[i]: Penalty based on the silhouette of data number i and the ridge line of the input image.
P_2[i]: Penalty based on the ridge line of data number i and the silhouette of the input image.

After the whole database has been searched, the hand joint angles in the dataset whose dissimilarity is the smallest are output as the result of the estimation.
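Continuing the sketch above, the final search of Eq. (3) might look as follows, with C3 = C4 = 800 taken from Section 3.2; again, the names are ours.

```python
import numpy as np

def estimate(datasets, in_sil, in_ridge, in_hlac, C3=800.0, C4=800.0):
    """Return the joint angles of the dataset with the smallest dissimilarity E[i]."""
    best, best_e = None, np.inf
    for d, p1, p2 in narrow(datasets, in_sil, in_ridge):
        # Eq. (3): squared HLAC distance plus the two weighted penalties.
        e = float(np.sum((in_hlac - d.hlac) ** 2)) + C3 * p1 + C4 * p2
        if e < best_e:
            best, best_e = d, e
    return None if best is None else best.joint_angles
```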

3. Evaluation

3.1. Processing Speed

To study the effectiveness of this system, we first searched the database for hand images similar to those in moving images of human hand motions. With an ultraminiature wireless RGB camera on the back of the hand, the tester

Fig. 13. Views of hand pose estimation.

freely moved his/her fingers. Fig. 13 shows an output image captured on screen during the test as an example of an estimated result. The hand images on the left and right show a captured image and the estimated image from the database, respectively. The sampling frequency of the camera is 30 fps. From this, we can see that a processing speed of approximately 30 fps is obtained even when a personal computer with conventional specifications (Pentium IV 2.8 GHz, 1 GB main memory) is used. We can also see intuitively that similar images are retrieved with fairly good accuracy.

3.2. Search Accuracy

With our proposed system attached to the right hand and a data glove (CyberGlove II from CyberGlove Systems) attached to the left hand, the estimation accuracy of our proposed system was verified by taking the values obtained by the data glove as the measured values while the right and left hands made the same movements. In the experiment, we used a database containing 20,448 datasets, created by flexing/extending and adducting/abducting the four fingers other than the thumb, and by moving the thumb separately. In our experiment, the variables


Fig. 14. Example of the estimated result of the movements of the index finger PIP joint (upper: measured and estimated joint angle [degree] versus time [frame number]; lower: estimation error).

mentioned in Section 2 were set to C = 0.25, C1 = 0.002, C2 = 0.03, and C3 = C4 = 800.

Figures 14 and 15 show the estimated results. Among the experimentally obtained results, a detailed comparison was made between the measured and estimated values for the PIP joint of the index finger, which has an average length and a wide movable range, as a representative of the four fingers other than the thumb, and for the CM joint of the thumb, which moves differently from the other four fingers. In each of Figs. 14 and 15, the upper graph shows the time course of the measured and estimated values, and the lower graph shows the time course of the estimation errors. In this experiment, differences in joint angle values arose even for identical hand postures because the measured values were obtained by the data glove while the estimated values were obtained from Poser. To address this problem, directly before the experiment, the right and left hands assumed several similar postures with our proposed system and the data glove, and based on the differences between the two sets of values, the dynamic range and offset of the estimated values were adjusted to minimize the differences, calibrating the correspondence between the two. The mean value and standard deviation of the estimation errors were −1.73 ± 8.21° for the index finger PIP joint and 0.96 ± 6.65° for the thumb CM joint.
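The dynamic range/offset adjustment described above amounts to fitting a gain and an offset that map the estimated angles onto the glove's measured angles. A minimal least-squares sketch (our formulation, not necessarily the authors' exact procedure) is:

```python
import numpy as np

def calibrate(estimated: np.ndarray, measured: np.ndarray):
    """Fit gain (dynamic range) and offset so the estimates best match the glove values."""
    gain, offset = np.polyfit(estimated, measured, 1)  # least-squares straight line
    return lambda angles: gain * np.asarray(angles) + offset
```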

Figure 16 shows snapshots in which the joint angles in the datasets of the estimated results, rendered on a CG robot hand, are compared with the user’s hand postures, for intuitively easy recognition of the estimation accuracy of our proposed system. Fig. 16 suggests that,

Fig. 15. Example of the estimated result of the movements of the thumb CM joint (upper: measured and estimated joint angle [degree] versus time [frame number]; lower: estimation error).

Fig. 16. Snapshots of estimated results drawn using CG hands.


as the user’s hand postures have been reproduced on the CG robot hand with high accuracy, the hand joint angles were likely estimated with high accuracy.

In addition, the estimated values obtained by our proposed system were sent to a robot hand (Handroid from ITK) to verify that the robot hand could actually grasp objects. Assuming that the robot collects materials such as sand and rocks on the surface of the Moon and works with tools, the experiment was conducted by making the robot hand grasp, or precision-grasp, large and small objects resembling rocks and tools. Fig. 17 shows snapshots of the experimental scene. In the first and second images of Fig. 17, the hand grasps the larger object; in the third image, it precision-grasps the smaller object; and in the fourth image, it grasps the tool. As shown in Fig. 17, an operator can control the movements of the robot hand and make it grasp objects easily, as he/she intends, simply by making everyday movements.

4. Conclusion

In recent years, a plan has been developed to conduct an investigation of an unknown environment by remotely controlling an exploration robot system sent to the Moon. For the robot to successfully carry out sophisticated tasks, it must implement multiple degrees of freedom not only in its arms and legs but also in its hands and fingers. Moreover, the robot must be a humanoid type so that it can use the tools used by astronauts without modification. On the other hand, to manipulate the multiple-degrees-of-freedom robot without learning special manipulation skills and to minimize the psychological burden on operators, a method that utilizes the everyday movements of operators as input to the robot, rather than a special controller, is ideal.

However, differences in device size, the extent of the space through which the subject moves, and the complexity of the shape changes make it difficult for existing optical motion capture systems to measure the movements of the arms, legs, and other large parts of the body with the same system as the movements of the hands and fingers. For this reason, a measurement technology specific to the hands and fingers has to be used alongside conventional motion capture systems.

In the present study, we realized a system that retrieves the shape of the human hand and fingers in real time and with high accuracy, without using any special peripheral equipment such as a range sensor or a PC cluster, by quickly and accurately searching a large image database containing complicated shapes and self-occlusion for similar images. In designing the system, we constructed a database that is adaptable even to differences between individuals, and searched it for images of hands similar to the unknown hand image through feature extraction using higher-order local autocorrelation patterns. In particular, the authors designed an original compact system that estimates the hand pose of a subject who has

Fig. 17. Estimated output values of the robot hand.


attached an ultraminiature wireless RGB camera to the back of his/her hand, rather than to the palm, as attachment of the camera to the back of the hand may minimize the restraint on the subject’s motions during motion capture. The algorithm we proposed here can estimate the hand pose without first capturing the fingertips.

In the system verification experiment, the processing speed of our proposed system was 30 fps on average, indicating that real-time estimation is possible. The calculated mean value and standard deviation of the estimation errors were −1.73 ± 8.21° for the index finger PIP joint and 0.96 ± 6.65° for the thumb CM joint. In another experiment, conducted to verify that transferring the estimates obtained by our proposed system to the robot hand allows it to grasp objects, an operator successfully made the robot hand grasp and precision-grasp objects and tools simply by performing everyday grasping motions. This series of experimental results indicates the effectiveness of our proposed system.

Acknowledgements
A part of this research was conducted with the assistance of the Strategic Information and Communications R&D Promotion Programme (SCOPE) of the Ministry of Internal Affairs and Communications, the KDDI Foundation, and the Adaptable and Seamless Technology Transfer Program through Target-driven R&D (A-STEP) of the Japan Science and Technology Agency (JST). The authors would like to extend their sincere gratitude to these organizations.

References:
[1] J. Haruyama et al., “Possible lunar lava tube skylight observed by SELENE cameras,” Geophysical Research Letters, Vol.36, No.21, L21206, 2009. doi: 10.1029/2009GL040635
[2] J. Haruyama et al., “Mission Concepts of Unprecedented Zipangu Underworld of the Moon Exploration (UZUME) Project,” Trans. of the Japan Society for Aeronautical and Space Sciences, Aerospace Technology Japan, Vol.14, No.ists30, pp. 147-150, 2016.
[3] I. Kawano et al., “System Study of Exploration of Lunar and Mars Holes and Underlying Subsurface,” Proc. of 60th Space Sciences and Technology Conf., 1C07 (JSASS-2016-4039), 2016.
[4] K. Hoshino, N. Igo, M. Tomida, and H. Kotani, “Teleoperating System for Manipulating a Moon Exploring Robot on the Earth,” Int. J. of Automation Technology, Vol.11, No.3, pp. 433-441, 2017.
[5] A. K. Bejczy, W. S. Kim, and S. C. Venema, “The Phantom Robot: Predictive Displays for Teleoperation with Time Delay,” Proc. of IEEE Int. Conf. on Robotics and Automation, pp. 546-551, 1990.
[6] T. J. Tarn and K. Brady, “A framework for the control of time-delayed telerobotic systems,” IFAC Proceedings, Vol.30, No.20, pp. 599-604, 1997.
[7] K. Hoshino, T. Kasahara, M. Tomida, and T. Tanimoto, “Gesture-world environment technology for mobile manipulation – remote control system of a robot with hand pose estimation,” J. of Robotics and Mechatronics, Vol.24, No.1, pp. 180-190, 2012.
[8] K. Hoshino, “Hand Gesture Interface for Entertainment Games,” R. Nakatsu, M. Rauterberg, and P. Ciancarini (Eds.), Handbook of Digital Games and Entertainment Technologies, Springer, pp. 1-20, 2015. ISBN: 978-981-4560-52-8
[9] M. Tomida and K. Hoshino, “Wearable device for high-speed hand pose estimation with a miniature camera,” J. of Robotics and Mechatronics, Vol.27, No.2, pp. 167-173, 2015.
[10] K. Hoshino, “Dexterous robot hand control with data glove by human imitation,” IEICE Trans. on Information and Systems, Vol.E89-D, No.6, pp. 1820-1825, 2006.
[11] D. Kim, O. Hilliges, S. Izadi, A. Butler, J. Chen, I. Oikonomidis, and P. Olivier, “Digits: Freehand 3D Interactions Anywhere Using a Wrist-Worn Gloveless Sensor,” Proc. of the 25th Annual ACM Symposium on User Interface Software and Technology (UIST ’12), pp. 167-176, 2012.
[12] L. A. F. Fernandes, V. F. Pamplona, J. L. Prauchner, L. P. Nedel, and M. M. Oliveira, “Conceptual Image-Based Data Glove for Computer-Human Interaction,” Revista de Informatica Teorica e Aplicada (RITA), Vol.15, No.3, pp. 75-94, 2008.
[13] R. Fukui, M. Watanabe, T. Gyota, M. Shimosaka, and T. Sato, “Hand Shape Classification with a Wrist Contour Sensor: Development of a Prototype Device,” Proc. of UbiComp ’11, pp. 311-314, 2011.
[14] N. Otsu and T. Kurita, “A new scheme for practical, flexible and intelligent vision systems,” Proc. of IAPR Workshop on Computer Vision, pp. 431-435, 1998.

Name: Sota Sugimura

Affiliation: Graduate School of Systems and Information Engineering, University of Tsukuba

Address: 1-1-1 Tennodai, Tsukuba 305-8573, Japan
Brief Biographical History:
2012-2016 Undergraduate Student, University of Tsukuba
2016- Master Candidate, University of Tsukuba
Main Works:
• “A wearable hand pose estimation with reference data selected according to individual differences in hand shape,” 2016 IEEE Int. Symposium on Robotics and Intelligent Sensors (IRIS), T02-5, pp. 1-6, 2016.

Name: Kiyoshi Hoshino

Affiliation: Professor, Graduate School of Systems and Information Engineering, University of Tsukuba

Address: 1-1-1 Tennodai, Tsukuba 305-8573, Japan
Brief Biographical History:
1993- Assistant Professor, Tokyo Medical and Dental University
1995- Associate Professor, University of the Ryukyus
1998-2001 Senior Researcher of PRESTO project, Japan Science and Technology Agency (JST)
2002- Associate Professor, University of Tsukuba
2002-2005 Project Leader of SORST project, JST
2008- Professor, University of Tsukuba
Main Works:
• “Hand Gesture Interface for Entertainment Games,” R. Nakatsu, M. Rauterberg, and P. Ciancarini (Eds.), Handbook of Digital Games and Entertainment Technologies, Springer, pp. 1-20, 2015. ISBN: 978-981-4560-52-8
Membership in Academic Societies:
• The Robotics Society of Japan (RSJ)
• The Institute of Electronics, Information and Communication Engineers (IEICE)
• Japanese Society for Medical and Biological Engineering (JSMBE)
