The 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems October 18-22, 2010, Taipei, Taiwan 978-1-4244-6676-4/10/$25.00 ©2010 IEEE 75



Fig. 2. (a) The Gifu Hand III. (b) The ball tracking setup. The camera for recording the pictures used by the colour blob tracker is positioned above the hand, viewing in the direction of the palm plane. (c) Exemplary picture of the colour blob tracker camera shown in (b). Tracked blobs (ellipses) are marked by surrounding lines: the two balls with green and blue ellipses; the reference points with red and green.

2(a)). While the thumb provides four independent joints resulting in four DOF, the fingers only have three DOF as, in each case, the two distal joints are coupled. For the ball swapping task, the Gifu Hand was mounted on a PA-10 robot arm (Mitsubishi Heavy Industries) in order to adjust the orientation of the Gifu Hand in a similar way as in [10] (see Fig. 2(b)). In addition, a camera was placed above the scene and directed towards the palm of the hand (cf. Fig. 2(b)). Using the camera pictures, a colour blob tracker provides the 2D positions in the palm plane of the two balls relative to the reference blobs near the wrist (see Fig. 2(c)).

The remainder of this paper is organised as follows: In Section II, we briefly review Structured UKR, which we use as a basis for the training of the motion representation described in Section III. The feedback control for the manipulation task is detailed in Section IV, and Section V addresses the experimental results of the proposed framework using the Gifu Hand III. Section VI provides a short conclusion.

II. UNSUPERVISED KERNEL REGRESSION

Unsupervised Kernel Regression (UKR) is a recent approach to learning non-linear continuous manifold representations, that is, to finding a lower-dimensional (latent) representation X = (x_1, x_2, ..., x_N) ∈ R^{q×N} of a set of observed data Y = (y_1, y_2, ..., y_N) ∈ R^{d×N} and a corresponding functional relationship y = f(x). It was introduced as the unsupervised counterpart of the Nadaraya-Watson kernel regression estimator by Meinicke et al. in [7]. Further development has led to the inclusion of general loss functions, a landmark variant, and the generalisation to local polynomial regression [6]. In its basic form, UKR uses the Nadaraya-Watson estimator [9], [18] as smooth mapping f : x ∈ R^q → y ∈ R^d from latent to observed data space:

\[ f(x) = \sum_{i=1}^{N} y_i \, \frac{K_\Theta(x - x_i)}{\sum_j K_\Theta(x - x_j)} . \tag{1} \]

The original estimator gives a smooth, continuous generalisation of the functional relationship between two random variables x and y described by the given data samples (x_i; y_i). Here, K_\Theta(·) is a density kernel (e.g., Gaussian) with associated bandwidth parameters Θ.
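To make Eq. (1) concrete, the following is a minimal sketch of the Nadaraya-Watson estimator in Python with a Gaussian kernel; the function name and the scalar bandwidth parameter are illustrative choices, not part of the paper's implementation:

```python
import numpy as np

def nadaraya_watson(x, X, Y, theta=1.0):
    """Nadaraya-Watson estimate f(x), Eq. (1): a kernel-weighted
    average of the observations y_i with weights K_Theta(x - x_i)."""
    # squared distances of the query point x to each sample x_i
    d2 = np.sum((X - x[:, None]) ** 2, axis=0)
    # Gaussian density kernel with bandwidth theta
    w = np.exp(-0.5 * d2 / theta ** 2)
    w /= w.sum()                 # normalise by sum_j K_Theta(x - x_j)
    return Y @ w                 # weighted average of the y_i
```

With X of shape q×N and Y of shape d×N (as in the text), the estimate smoothly interpolates between the samples; the bandwidth Θ controls how local the averaging is.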

UKR treats Eq. (1) as a mapping from latent space to the original data space in which the manifold is embedded and from which the observed data samples Y = {y_i}, i = 1..N are taken. The associated set X = {x_i}, i = 1..N now plays the role of the input data to the regression function (1). Here, they are treated as latent parameters corresponding to Y. As the scaling and positioning of the x_i are free, the formerly crucial bandwidth parameter Θ becomes irrelevant and we can use unit bandwidths. Thus, the regression function can be denoted as

\[ b_i(x; X) = \frac{K(x - x_i)}{\sum_j K(x - x_j)} , \tag{2} \]

\[ f(x; X) = \sum_{i=1}^{N} y_i \, b_i(x; X) = Y \, b(x; X) , \tag{3} \]

where b(x; X) = (b_1(x; X), b_2(x; X), ..., b_N(x; X))^T ∈ R^N is a vector of basis functions representing the effects of the kernels parametrised by the latent parameters.
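With unit bandwidths, Eqs. (2) and (3) translate almost directly into code. This is a sketch under the same conventions as above (function names are my own, not from the paper):

```python
import numpy as np

def ukr_basis(x, X):
    """b(x; X), Eq. (2): normalised unit-bandwidth Gaussian kernel values."""
    k = np.exp(-0.5 * np.sum((X - x[:, None]) ** 2, axis=0))
    return k / k.sum()

def ukr_f(x, X, Y):
    """f(x; X) = Y b(x; X), Eq. (3): map a latent point to data space."""
    return Y @ ukr_basis(x, X)
```

If the latent points x_i are placed far apart, ukr_f(x_i, X, Y) essentially reproduces y_i exactly, which already hints at the trivial-solution problem that the regularisation of the training objective has to prevent.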

As objective function for the UKR training, the following reconstruction error is used:

\[ R(X) = \frac{1}{N} \sum_i \| y_i - f(x_i; X) \|^2 = \frac{1}{N} \| Y - Y B(X) \|_F^2 . \tag{4} \]

Here, B(X) = (b(x_1; X), b(x_2; X), ..., b(x_N; X)) is an N × N basis function matrix. Note that moving the x_i infinitely apart from each other results in B(X) being the identity matrix, which corresponds to a trivial minimisation solution R(X) = 0. In order to prevent this undesired case, several regularisation methods are possible [6]. Most notably, with UKR one can very efficiently perform leave-one-out cross-validation, that is, reconstruct each y_i without using the y_i term itself. To this end, the only additional step is to zero the diagonal of B(X) before normalising its column sums to 1. For a preselected density kernel, the highly non-linear reconstruction error (4) only depends on the set of latent parameters X and thus can be optimised with respect to X by gradient-based methods. As such methods often suffer from getting stuck in poor local minima, an appropriate initialisation is important. In the case of UKR, results from spectral embeddings, e.g. performed with Isomap [17], can easily be used for the initialisation. An inverse mapping x = f^{-1}(y; X) from data space to latent space is not directly supported in UKR. Instead, one may use an orthogonal projection to define a mapping x̂ = g(y; X) = arg min_x ‖y − f(x; X)‖^2 which approximates f^{-1}(·).
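The leave-one-out regularisation and the projection g(y; X) can be sketched as follows; the grid-based arg min is a crude stand-in for the gradient-based optimisation one would actually use, and all names are illustrative:

```python
import numpy as np

def ukr_B(X, loo=True):
    """N x N basis function matrix B(X); with loo=True the diagonal is
    zeroed before the columns are normalised to sum to 1 (leave-one-out)."""
    D2 = np.sum((X[:, :, None] - X[:, None, :]) ** 2, axis=0)
    K = np.exp(-0.5 * D2)             # unit-bandwidth Gaussian kernel matrix
    if loo:
        np.fill_diagonal(K, 0.0)      # reconstruct each y_i without y_i itself
    return K / K.sum(axis=0, keepdims=True)

def reconstruction_error(X, Y, loo=True):
    """R(X) = (1/N) ||Y - Y B(X)||_F^2, Eq. (4)."""
    return np.sum((Y - Y @ ukr_B(X, loo)) ** 2) / Y.shape[1]

def project(y, X, Y, candidates):
    """g(y; X) = argmin_x ||y - f(x; X)||^2, approximated over a candidate set."""
    def f(x):
        k = np.exp(-0.5 * np.sum((X - x[:, None]) ** 2, axis=0))
        return Y @ (k / k.sum())
    errs = [np.sum((y - f(x)) ** 2) for x in candidates]
    return candidates[int(np.argmin(errs))]
```

The sketch makes the trivial-solution issue visible: for widely spread latent points, the non-LOO error vanishes while the leave-one-out error stays large, so minimising the LOO variant discourages degenerate embeddings.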

In its original form, UKR is a purely unsupervised approach to continuous manifold learning. In order to enable the incorporation of prior knowledge about the structure of the training data, we introduced a structured version of UKR training (e.g. [14]). With Structured UKR, it is possible to represent data with a temporal context, like trajectories of hand positions, in a very easy and robust way. In particular, due to the specific training of Structured UKR, the order of the represented time series of training observations y_i is reflected in their latent parameters x_i and is captured by one specific latent time dimension. In order to represent periodic


in Fig. 6 illustrate the different behaviours of open-loop and closed-loop control for the same initial ball configuration.

Figure 6(a) shows intermediate hand postures from the open-loop control. Here, pics. 1-5 show an unsuccessful attempt to swap the balls. Indeed, as the open-loop control cannot react to unachieved sub-goals of the control, the following movement (pics. 6-11) continues as if the blue (bottom left) ball had been correctly moved between the red (top right) ball and the palm (cf. Fig. 6(b), pic. 11 for the targeted configuration) and thus tries in the following to bring the red ball to the position to which the blue ball returned.

Figure 6(b) depicts the same initial ball configuration as in Fig. 6(a), but using the closed-loop control (on the basis of the same UKR representation). Here, again, pics. 1-5 show an unsuccessful attempt to swap the balls. But as the closed-loop control recognises that the blue ball returned to its initial position, the adequate part of the control is repeated (pics. 7-10) and the goal is eventually reached (pic. 11).

Figure 6(c) shows a control scenario in which the blue (bottom left) ball is manually pushed back to the initial position to prevent the balls from swapping. In this sequence, three attempts to push the ball to the correct position can be seen (pics. 1-4, 5-9, and 10-11).

Readers are encouraged to refer to the accompanying video [15] associated with Fig. 6, as it very intuitively demonstrates the effects of the closed-loop control and its superiority compared to the open-loop version.

Although the closed-loop manipulations (Figs. 6(b-c)) are not perfect in the sense that errors still occur during the ball swapping, the closed-loop control scheme better exploits the underlying UKR representation. It realises a hand motion which is adapted to the current ball configuration and reacts better to unforeseen disturbances during the manipulation. One interesting observation is that the repeated "trying" of the robot to accomplish the sub-goal of bringing one ball into a specific position gives the impression that the robot has a kind of awareness of the current situation, yielding a very natural-looking manipulation motion.
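The repeated "trying" can be pictured as a simple guard on the latent phase: advance only when the current sub-goal is met, otherwise re-localise. This is a heavily simplified, hypothetical sketch; the predicates, the phase stepping, and all names are my assumptions, not the controller detailed in Section IV:

```python
def closed_loop_step(phase, observation, subgoal_reached, relocalise, delta=0.05):
    """One hypothetical feedback iteration on a periodic latent time dimension.

    phase           -- current position on the latent time dimension in [0, 1)
    observation     -- e.g. the 2D ball positions from the blob tracker
    subgoal_reached -- predicate: did the motion segment achieve its sub-goal?
    relocalise      -- maps an observation back to a latent phase (cf. g(y; X))
    """
    if subgoal_reached(observation, phase):
        # sub-goal met: feed forward along the (periodic) latent time dimension
        return (phase + delta) % 1.0
    # sub-goal missed (e.g. a ball rolled back): repeat the adequate segment
    return relocalise(observation)
```

The open-loop controller corresponds to always taking the first branch; the second branch is what produces the observed retry behaviour.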

VI. CONCLUSION

In this paper, we presented a new closed-loop control scheme which operates on the predefined clear structure of Structured UKR manifolds, a modified version of Unsupervised Kernel Regression which we have shown in previous work to be well suited to representing human motion data. The new control scheme extends the existing framework of Structured UKR manifolds that provides means for representing, actuating, and recognising motions. The new mechanism now allows for considering partial observations as sensory input in order to perform closed-loop control. Using this control scheme, we have successfully implemented the ball swapping task on a real 16 DOF robot hand, namely the Gifu Hand III, using the positions of the balls as sensory feedback.

In addition to the proof of the presented concept on a real robot, we could also show that our new control is able to better exploit the underlying motion representation in order to realise a more sophisticated manipulation which adequately adapts to the current situation.

In future work, we plan to combine the closed-loop scheme presented in this paper with a more complex manifold providing more than one latent dimension and thus a broader range of different versions of the same manipulation for different motion parameters (e.g. ball radius).

Since the closed-loop control is based on the unified latent structure of Structured UKR manifolds, we expect the control scheme to be applicable to all applications that can be represented in such manifolds. The control scheme can then be directly used in the presented way, only adjusting the control parameters k and ∆ defining the lengths and durations of the feed-forward steps of the control.

ACKNOWLEDGEMENT

The authors would like to acknowledge support from the German Collaborative Research Centre "SFB 673 - Alignment in Communication" granted by the DFG, the Japanese National Institute of Information and Communications Technology (NICT), and the German Center of Excellence 277 "Cognitive Interaction Technology" (CITEC). Furthermore, the authors would like to thank Ales Ude for providing the colour blob tracking software.

REFERENCES

[1] C. G. Atkeson and S. Schaal. Memory-based neural networks for robot learning. Neurocomputing, 9(3):243-269, 1995.

[2] D. C. Bentivegna, C. G. Atkeson, and G. Cheng. Learning tasks from observation and practice. Robotics and Autonomous Systems, 47(2-3):163-169, 2004.

[3] C. Breazeal and B. Scassellati. Robots that imitate humans. Trends in Cognitive Sciences, 6(11):481-487, 2002.

[4] Y. Demiris and G. Hayes. Imitation as a dual-route process featuring predictive and learning components: a biologically-plausible computational model. In Imitation in Animals and Artifacts. MIT Press, 2002.

[5] A. Ijspeert, J. Nakanishi, and S. Schaal. Learning attractor landscapes for learning motor primitives. In Advances in Neural Information Processing Systems, pages 1523-1530. MIT Press, Cambridge, 2003.

[6] S. Klanke. Learning Manifolds with the Parametrized Self-Organizing Map and Unsupervised Kernel Regression. PhD thesis, Bielefeld University, Bielefeld, Germany, 2007.

[7] P. Meinicke, S. Klanke, R. Memisevic, and H. Ritter. Principal Surfaces from Unsupervised Kernel Regression. IEEE Trans. on PAMI, 27(9):1379-1391, 2005.

[8] T. Mouri, H. Kawasaki, K. Yoshikawa, J. Takai, and S. Ito. Anthropomorphic Robot Hand: Gifu Hand III. In Proc. ICCAS, pages 1288-1293, 2002.

[9] E. A. Nadaraya. On Estimating Regression. Theory of Probability and Its Applications, 9, 1964.

[10] E. Oztop, L.-H. Lin, M. Kawato, and G. Cheng. Dexterous Skills Transfer by Extending Human Body Schema to a Robotic Hand. In Proc. Intl. Conf. on Humanoid Robots (Humanoids), 2006.

[11] S. Schaal. Is imitation learning the route to humanoid robots? Trends in Cognitive Sciences, 3(6):233-242, 1999.

[12] S. Schaal, A. Ijspeert, and A. Billard. Computational approaches to motor learning by imitation. Philosophical Transactions of the Royal Society of London, Ser. B, Biological Sciences, 358(1431):537-547, 2003.

[13] J. Steffen, R. Haschke, and H. Ritter. Towards Dextrous Manipulation Using Manifolds. In Proc. IROS, 2008.

[14] J. Steffen, S. Klanke, S. Vijayakumar, and H. Ritter. Realising Dextrous Manipulation with Structured Manifolds using Unsupervised Kernel Regression with Structural Hints. In ICRA 2009 Workshop: Approaches to Sensorimotor Learning on Humanoid Robots, 2009.

[15] J. Steffen, E. Oztop, and H. Ritter. IROS 2010: accompanying video for this paper. www.techfak.uni-bielefeld.de/∼jsteffen/iros2010/.

[16] J. Steffen, M. Pardowitz, and H. Ritter. Using Struct. UKR Manifolds for Motion Classification and Segmentation. In Proc. IROS, 2009.

[17] J. B. Tenenbaum, V. de Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319-2323, December 2000.

[18] G. S. Watson. Smooth Regression Analysis. Sankhya, Ser. A, 26, 1964.
