
Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, September 28 - October 2, 2004, Sendai, Japan

Appearance-based Concurrent Map Building and Localization using a Multi-Hypotheses Tracker

Josep M. Porta
IAS Group, University of Amsterdam
1098 SJ, Amsterdam, The Netherlands
Email: [email protected]

Ben J.A. Kröse
IAS Group, University of Amsterdam
1098 SJ, Amsterdam, The Netherlands
Email: [email protected]

Abstract--The main drawback of appearance-based robot localization with respect to landmark-based localization is that it requires a map including images taken at known positions in the area where the robot is expected to move. In this paper, we describe a Concurrent Map-building and Localization (CML) system developed within the appearance-based robot localization paradigm. This allows us to combine the good features of appearance-based localization, such as simple sensor processing and robustness, without having to deal with its inconveniences. In our CML system, both the robot's position and the map are represented using Gaussian mixtures. Using this kind of representation, we can deal with the global localization problem while being efficient both in memory and in execution time.

I. INTRODUCTION

Robot localization methods can be divided into two families: methods based on landmark extraction and tracking [1]-[4], and methods based on an appearance model of the environment [5]-[7]. Landmark-based localization methods rely on the assumption that landmarks can be correctly identified and that their position can be accurately extracted from the raw sensor readings. However, the transformation from sensor readings to geometric information is, in general, complex and prone to errors. As a counterpart, in the appearance-based methods the environment is not modeled geometrically, but as an appearance map that includes a collection of sensor readings obtained at known positions, usually placed on a regular grid. The advantage of this representation is that the sensor readings obtained at a given moment can be directly matched with the observations stored in the appearance-based map. Since no complex sensor processing is necessary, appearance-based localization methods are computationally very cheap and, in some cases, they are more robust to noise, certain types of occlusions, and changes in illumination (when an edge detector is used to preprocess the images) than landmark-based methods [8]. The main drawback of appearance-based methods is that localization is only possible in previously mapped areas. The construction of a map is a supervised process that can be quite time-consuming and that is only valid as far as no important modifications of the environment occur. While much work has been done on Concurrent Mapping and Localization (CML) using landmarks [9]-[11], this is not the case within the appearance-based approach. Recent work in this line [12] uses a Kalman filter formulation to estimate the position of the robot. Using a Kalman filter, only one hypothesis can be tracked at a time and, consequently, this system is not able to perform global localization (i.e., localization without any prior information on the robot's position).

In this paper, we introduce a system that is able to perform CML within the appearance-based approach. With this aim, we replace the static, pre-defined map of the environment used in appearance-based localization by an approximation of it obtained and refined by the robot as it moves in the environment. The basic idea we exploit in our system is that, if the robot re-visits an already explored area, it can use the information previously stored to reduce the uncertainty on its position. Additionally, the improvements on the robot's position can be back-propagated to map points stored in previous time slices using trajectory reconstruction techniques [13]. The result, as desired, is a correction of both the robot's position and the map.

Our approach is closely related to the work on map building on cyclic trajectories [14], which is usually done using range sensors such as sonars or laser scanners [15]. Range sensors provide distance information (position of obstacles w.r.t. the robot) that can be integrated into a geometric map of the environment. Contrariwise, our map is not a geometric map of the environment but a map in the space of features derived from images. Due to the different nature of the maps, the main issues of our approach are different from those in existing work. For instance, when using range-based sensors, one of the main issues is that of sensor matching (i.e., to determine the translation and rotation for a given sensor reading so that the match with the map is maximal). In our case, sensor matching is trivial since images are directly compared in the feature space, but the map representation becomes more complex.

We first describe the formalism we use in our localization framework. Next, we describe how to estimate the position of the robot, assuming we have a map. After that, we describe how to extract features from the input images and how to approximate on-line the feature-based map necessary for localization. Finally, we show the results obtained with the new CML system, and we conclude by summarizing our work and pointing out directions for further research.


II. MULTI-GAUSSIAN BASED LOCALIZATION

Probabilistic systems for robot localization can be classified according to the flexibility of the mechanism used to represent the probability distribution on the robot's pose (position and orientation). Systems based on the Kalman filter [3] represent the robot's pose using a single Gaussian. These systems are effective in preventing the error on the robot's location from growing without bound (as would happen using only odometry for localization). Additionally, due to the simplicity of the model, we can formally prove the convergence of the localization and of the associated mapping process, if any. However, using a single Gaussian, it is not possible to track more than one hypothesis at the same time and, due to this, Kalman-based localization systems are unable to deal with the global localization problem (i.e., to determine the location of the robot without any prior knowledge on the robot's pose) or with the kidnap problem. Probabilistic occupancy grids [11], [16] and particle filters [17], [18] can be used to solve the global localization problem. In the probabilistic occupancy grids framework, the area where the robot is expected to move is discretized in small cells and the system maintains the probability for the robot to be in each one of these cells. In the particle filter, the robot pose is estimated using a set of discrete samples in the pose space. Both occupancy grids and particle filters are extremely flexible but also computationally expensive in memory and in execution time per time slice.

A multi-Gaussian probability representation [19]-[21] is an option that lies between Kalman-based systems and particle filters: it can track more than one hypothesis simultaneously (and, so, such systems are also called multi-hypotheses trackers), but it does not reach the degree of flexibility of particle filters. However, for practical purposes, an approximation of the probability on the robot's pose as a set of Gaussian functions is good enough and it has the advantage of being highly intuitive. Additionally, multi-Gaussian systems use fewer resources (in memory and time) and they can provide localization as accurate as that provided by other methods [22]. As a drawback, working with a multi-hypotheses tracker poses a problem of data association (i.e., to determine which parts of the sensor readings to use to update each hypothesis) that is not present in particle filtering or probabilistic occupancy grids.

In this paper, we use the multi-Gaussian approach introduced by Jensfelt and Kristensen in [21]. Our contribution is that, while Jensfelt and Kristensen use a pre-defined map of the environment, our system is able to build the map on-line as the robot explores the environment.

III. ROBOT POSITION ESTIMATION

Probabilistic localization methods aim at improving the estimation of the pose of the robot at time $t$, $x_t$, taking into account the movements of the robot $\{u_1, \ldots, u_t\}$ and the observations of the environment taken by the robot $\{y_1, \ldots, y_t\}$ up to that time. Formally, we want to estimate the posterior $p(x_t \mid \{u_1, y_1, \ldots, u_t, y_t\})$. The Markov assumption states that this probability can be updated from the previous state probability $p(x_{t-1})$ taking into account only the last executed action $u_t$ and the current observation $y_t$. Thus, we only have to estimate $p(x_t \mid u_t, y_t)$. Applying Bayes' rule we have that

$$p(x_t \mid u_t, y_t) \propto p(y_t \mid x_t)\, p(x_t \mid u_t), \quad (1)$$

where the probability $p(x_t \mid u_t)$ can be computed by propagating from $p(x_{t-1} \mid u_{t-1}, y_{t-1})$ using the action model

$$p(x_t \mid u_t) = \int p(x_t \mid u_t, x_{t-1})\, p(x_{t-1} \mid u_{t-1}, y_{t-1})\, dx_{t-1}. \quad (2)$$

Equations 1 and 2 define a prediction-correction system to estimate the pose of the robot.

We use a Gaussian mixture $X_t = \{(x_t^i, \Sigma_t^i, w_t^i) \mid i \in [1, N]\}$ to represent the pose of the robot. Thus,

$$p(x_t \mid u_t, y_t) \propto \sum_{i=1}^{N} w_t^i\, \phi(x_t \mid x_t^i, \Sigma_t^i),$$

with $\phi(x_t^i, \Sigma_t^i)$ a Gaussian centered at $x_t^i$ and with covariance matrix $\Sigma_t^i$. The weight $w_t^i$ ($0 < w_t^i \leq 1$) provides information on the certainty of the pose hypothesis represented by the corresponding Gaussian.
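For illustration only (this sketch is ours, not part of the original paper), the Gaussian mixture representation of the pose takes a few lines of NumPy; the class and function names are our own:

import numpy as np

class Hypothesis:
    # One component of the mixture X_t: mean x_t^i, covariance
    # Sigma_t^i, and weight w_t^i (0 < w <= 1).
    def __init__(self, mean, cov, weight):
        self.mean = np.asarray(mean, dtype=float)
        self.cov = np.asarray(cov, dtype=float)
        self.weight = weight

def gaussian(x, mean, cov):
    # Density of the Gaussian phi(x | mean, cov).
    d = x - mean
    norm = np.sqrt((2 * np.pi) ** len(mean) * np.linalg.det(cov))
    return np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm

def pose_density(x, hypotheses):
    # p(x_t | u_t, y_t), up to a normalization constant.
    return sum(h.weight * gaussian(x, h.mean, h.cov) for h in hypotheses)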

The motion of the robot is modeled as

$$x_t = f(x_{t-1}, u_t) + v_t, \quad (3)$$

with $v_t$ a Gaussian noise with zero mean and covariance $Q$. With this, the prediction step of the localization system (equation 2) amounts to applying equation 3 to each one of the components in $X_t$. Using a linear approximation,

$$x_t^{i*} = f(x_{t-1}^i, u_t),$$
$$\Sigma_t^{i*} = F\, \Sigma_{t-1}^i\, F^\top + G\, Q\, G^\top, \quad (4)$$

with $F$ the Jacobian of $f$ with respect to $x$ and $G$ the Jacobian of $f$ with respect to $u$. With the set $X_{u_t} = \{(x_t^{i*}, \Sigma_t^{i*}, w_t^i) \mid i \in [1, N]\}$ we define

$$p(x_t \mid u_t) \propto \sum_{i=1}^{N} w_t^i\, \phi(x_t \mid x_t^{i*}, \Sigma_t^{i*}).$$
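Continuing the sketch above (again our illustration; f, its Jacobians jac_x and jac_u, and the noise covariance Q are user-supplied and stand in for the motion model of the robot at hand), the prediction step propagates each component independently:

def predict(hypotheses, u, f, jac_x, jac_u, Q):
    # Equations 3-4: propagate every Gaussian through the motion
    # model x_t = f(x_{t-1}, u_t) + v_t, linearizing around the mean.
    predicted = []
    for h in hypotheses:
        F = jac_x(h.mean, u)   # Jacobian of f w.r.t. the state x
        G = jac_u(h.mean, u)   # Jacobian of f w.r.t. the action u
        mean = f(h.mean, u)                    # x_t^{i*}
        cov = F @ h.cov @ F.T + G @ Q @ G.T    # Sigma_t^{i*}
        predicted.append(Hypothesis(mean, cov, h.weight))
    return predicted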

Once we have $p(x_t \mid u_t)$, we have to integrate the information provided by the sensor readings to perform the correction step of the localization system (equation 1). At this point, we assume we have a map of the environment from which we can define $p(y_t \mid x_t)$ as a set of Gaussian functions with parameters $Y_{y_t} = \{(x_t^j, \Sigma_t^j, w_t^j) \mid j \in [1, N']\}$. In the next section, we describe how to create and update the set $Y_{y_t}$. From $Y_{y_t}$ we can define

$$p(y_t \mid x_t) \propto \sum_{j=1}^{N'} w_t^j\, \phi(x_t \mid x_t^j, \Sigma_t^j).$$

If $Y_{y_t}$ has no components ($N' = 0$), the estimation of the robot's pose obtained by applying equation 4 cannot be corrected and we have $p(x_t \mid u_t, y_t) = p(x_t \mid u_t)$.


If $N' > 0$, we have to fuse the Gaussian functions in $X_{u_t}$ with those in $Y_{y_t}$. The direct application of equation 1 requires finding a Gaussian that represents the product of each one of the elements in $X_{u_t}$ with each of those in $Y_{y_t}$. This would produce a quadratic ($N \times N'$) increment in the number of hypotheses. However, if $x_t^{i*} \in X_{u_t}$ and $x_t^j \in Y_{y_t}$ are in different areas of the pose space, it makes little sense to combine their information. This raises the problem of data association: to determine which elements in $X_{u_t}$ and in $Y_{y_t}$ refer to the same positioning hypothesis, so that only those elements are associated. Additionally, this helps to keep the number of hypotheses under a reasonable limit. We perform the data association using the same criterion as in [21]. For each pair $(i, j)$ with $(x_t^{i*}, \Sigma_t^{i*}, w_t^i)$ in $X_{u_t}$ and $(x_t^j, \Sigma_t^j, w_t^j)$ in $Y_{y_t}$, we compute the innovation as

$$\nu_{i,j} = x_t^{i*} - x_t^j, \qquad S_{i,j} = \Sigma_t^{i*} + \Sigma_t^j,$$

and we assume that pose hypothesis $i$ and sensor reading $j$ match if the following condition holds:

$$\nu_{i,j}^\top\, S_{i,j}^{-1}\, \nu_{i,j} < \gamma. \quad (5)$$
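As a minimal sketch of this gating test (ours, reusing the Hypothesis class introduced above):

def matches(hx, hy, gamma):
    # Equation 5: hx (a pose hypothesis) and hy (a map component)
    # match when the normalized innovation is below gamma.
    nu = hx.mean - hy.mean          # innovation nu_{i,j}
    S = hx.cov + hy.cov             # innovation covariance S_{i,j}
    return nu @ np.linalg.solve(S, nu) < gamma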

If there is a match, we update hypothesis $i$ with the sensor information $j$ using the Covariance Intersection rule [23]. This rule permits filtering and estimation to be performed even in the presence of the unmodeled correlations in the sensor readings that are so common in real-world problems. The rule updates the covariance matrix and the mean as

$$(\Sigma_t^i)^{-1} = (1 - \omega)\,(\Sigma_t^{i*})^{-1} + \omega\,(\Sigma_t^j)^{-1},$$
$$x_t^i = \Sigma_t^i \left[ (1 - \omega)\,(\Sigma_t^{i*})^{-1} x_t^{i*} + \omega\,(\Sigma_t^j)^{-1} x_t^j \right], \quad (6)$$

with $\omega = |\Sigma_t^{i*}| \,/\, (|\Sigma_t^{i*}| + |\Sigma_t^j|)$. If the information provided by the sensors is more reliable than the current one, this update results in a reduction of the uncertainty of hypothesis $i$ (a reduction in the size of $\Sigma_t^i$ w.r.t. $\Sigma_t^{i*}$).

Hypotheses on the state not matched with any Gaussian in $Y_{y_t}$ are kept without any modification, except for the weight update described below. Sensor components not matched with any state hypothesis have to be introduced as new hypotheses in $X_t$. This allows us to deal with the kidnap problem.

The confidence on each hypothesis in $X_t$ is represented by the corresponding weight $w_t^i$. Following [24], it seems clear that the more sensor readings recently supporting a given hypothesis, the larger the confidence. Thus, we increase the weight for the hypotheses that are properly matched in the data association process,

$$w_t^i = w_{t-1}^i + \alpha\,(1 - w_{t-1}^i), \quad (7)$$

with $\alpha$ a learning rate in the range $[0, 1]$. For the non-matched hypotheses, the confidence is decreased as

$$w_t^i = (1 - \alpha)\, w_{t-1}^i. \quad (8)$$

Hypotheses with too low a weight are not included in $X_t$.
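The fusion and weight updates can be sketched as follows (our illustration of equations 6-8; alpha is the learning rate of the text):

def covariance_intersection(hx, hy):
    # Equation 6, with omega computed from the covariance determinants.
    Px = np.linalg.inv(hx.cov)
    Py = np.linalg.inv(hy.cov)
    w = np.linalg.det(hx.cov) / (np.linalg.det(hx.cov) + np.linalg.det(hy.cov))
    cov = np.linalg.inv((1 - w) * Px + w * Py)
    mean = cov @ ((1 - w) * Px @ hx.mean + w * Py @ hy.mean)
    return mean, cov

def update_weight(w_prev, matched, alpha):
    # Equation 7 for matched hypotheses, equation 8 otherwise.
    return w_prev + alpha * (1 - w_prev) if matched else (1 - alpha) * w_prev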

IV. IMAGE FEATURE EXTRACTION

Our sensory input for localization consists of images taken by the camera mounted on the robot. A problem with images is their high dimensionality, resulting in large storage requirements and high computational demands. To alleviate this problem, Murase and Nayar [25] proposed to compress images ($z$) to low-dimensional feature vectors ($y$) using a linear projection

$$y = W z.$$

In the work of Murase and Nayar, the projection matrix $W$ is obtained by Principal Component Analysis (PCA) on a supervised training set $T = \{(z_i, x_i) \mid i \in [1, L]\}$ including images $z_i$ obtained at known states $x_i$. However, in our case, we don't have a training set. For this reason, we use a standard linear compression technique: the discrete cosine transform (DCT) [26], [27]. We select this transform since PCA on natural images approaches the DCT in the limit. The DCT computes features from images in a linear way using a projection matrix $W$ defined as

$$W_{j,k} = \cos\!\left(\frac{\pi\, j\,(k - 1/2)}{n}\right), \quad (9)$$

with $j$ the index of the feature to be extracted and $k \leq n$ the index of a given pixel in the image. For a given image, we compute a set of $d$ features ($d$ is typically around 10). Those features capture a very general description of the image, omitting the low-level details and making the comparison between images more meaningful.
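A minimal sketch of this feature extractor (ours; since the OCR of equation 9 is ambiguous, we assume the 1-D DCT basis applied to the flattened image, mirroring the projection y = Wz of the text):

def dct_features(image, d=10):
    # Equation 9: project the image onto the first d DCT basis vectors.
    z = np.asarray(image, dtype=float).ravel()
    n = len(z)
    k = np.arange(1, n + 1)
    W = np.array([np.cos(np.pi * j * (k - 0.5) / n) for j in range(1, d + 1)])
    return W @ z   # the feature vector y, of length d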

V. ENVIRONMENT MAPPING

The manifold of features $y$ can be seen as a function of the pose of the robot $x$, $y = g(x)$. The objective of appearance-based mapping is to approximate $g^{-1}$, since this gives us information about the possible poses of the robot given a set of features.

At a given time, we have a pair $(X_t, y_t)$ with $y_t$ a set of features and $X_t$ the estimation of the pose of the robot. We can use the sequence of such pairs obtained as the robot moves as a first approximation to the map needed for localization. Thus, our map can be initially defined as $M = \{(Y_{y_t}, y_t)\}$, with $Y_{y_t} = X_t$.

As explained in Section III, when the robot re-observes a given set of features $y_t$, we can use $Y_{y_t}$ to improve the location of the robot $X_{u_t}$ after the prediction step of the localization system. We consider that $y'$ is a re-observation of observation $y$ if $\|y - y'\| < \delta$. The parameter $\delta$ discretizes the space of features with a given granularity and determines the area in the environment where an equivalent set of features is observed. For an analysis of the sensitivity of appearance-based features to translations, rotations, and changes in illumination we refer to [28]. Thus, if $y$ is re-observed (up to $\delta$), we can use an old estimation of the robot's pose to improve the current one. However, we can also use the information the other way around: we can improve $Y_{y_t}$ using the additional information provided by the current $X_{u_t}$. For this, we need to introduce new information into a given Gaussian mixture, and a procedure to do that is exactly what we have described in Section III. Therefore, we can improve $Y_{y_t}$ using this same procedure, but with the roles swapped: in the previous section we updated the pose of the robot using the sensor information, and now we adjust the map using the information provided by the pose of the robot. So, we have to swap $X_{u_t}$ with $Y_{y_t}$, $x_t^{i*}$ with $x_t^j$, and $i$ with $j$. The only difference is that we assume the environment to be static and, thus, $Y_{y_t}$ is not updated by any action model.
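The re-observation test can be sketched as a nearest-neighbor query in feature space (our illustration; M is a list of (Y, y) pairs as defined above):

def nearest_map_entry(M, y, delta):
    # Return the stored (Y', y') closest to y, or None when no stored
    # observation lies within delta (i.e., y is a new observation).
    if not M:
        return None
    Y_best, y_best = min(M, key=lambda entry: np.linalg.norm(y - entry[1]))
    if np.linalg.norm(y - y_best) < delta:
        return Y_best, y_best
    return None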

At a given moment, the robot is at a single pose; so, when the state $X_{u_t}$ includes more than one hypothesis, we are uncertain about the robot's location and, consequently, any map update using $X_{u_t}$ would include incorrect map modifications that we would need to undo later on. To avoid this problem, we use a conservative strategy in the map update: when $X_{u_t}$ is not a single hypothesis (i.e., a single Gaussian), no map update is done. If $X_{u_t}$ is a set of Gaussian functions (or a uniform distribution) and, due to the state update, it becomes a single Gaussian, we use the path followed by the robot and the backward action model to add new points to the map corresponding to the time slices where the robot was unsure about its location (and that we can now unambiguously determine). Additionally, if $X_t$ is a single Gaussian and the state update results in a reduction of its covariance, we also backpropagate this error reduction to previous map points. The result of these two backpropagation processes is to add new points to the map (i.e., to extend the map) and to improve the estimation of previously stored map points (i.e., to improve the quality of the map).
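A rough sketch of how this backpropagation could be organized (our assumption; the paper relies on the trajectory reconstruction techniques of [13], and inverse_f, jac_x, and Q below are user-supplied stand-ins for the backward action model):

def backpropagate(actions, pose, cov, inverse_f, jac_x, Q):
    # Walk the stored actions backwards from the newly certain pose,
    # re-estimating the poses attached to earlier map points.
    corrected = []
    for u in reversed(actions):
        pose = inverse_f(pose, u)       # undo action u
        F = jac_x(pose, u)
        cov = F @ cov @ F.T + Q         # uncertainty grows going backwards
        corrected.append((pose.copy(), cov.copy()))
    return corrected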

The mapping strategy just outlined allows us to build an appearance-based map of a given environment along the paths followed by the robot. When the map $M$ is empty, we need to know the initial pose of the robot (the map is built with respect to this pose), but when $M$ has some elements, global localization is possible. To perform global localization, we have to initialize the system with $X_t$ containing a uniform distribution. As soon as the robot observes features already stored in the map (i.e., it visits an already explored area), new hypotheses are added to $X_t$ using the data stored in the corresponding map points. The weight update will determine which one of the hypotheses is the most likely one and will remove the rest from the state $X_t$. When this happens, the new information on the robot pose is backpropagated, adding points to the map.

Figure 1 summarizes the CML algorithm we introduce in this paper.

VI. EXPERIMENTS AND RESULTS

We tested the proposed CML system mapping a corridor 25 meters long. We drove the robot using a joystick all along the corridor and back to the origin. Figure 2-A shows the path followed by the robot according to odometric information. As we can see, there is a large error in the final pose of the robot (about 40 cm in the X axis, 300 cm in the Y axis, and 20 degrees in orientation). In the figure, each one of the arrows corresponds to a robot pose where a map point is stored.

Multi-Hypotheses CML(M):
  Input: M, the map.
  if M is empty
    X <- {(x = 0, Sigma = 0, w = 1)}
  else
    X <- uniform distribution
  do forever:
    Update X according to the action model (equations 4).
    Get an image z.
    Compute the features y of z using the DCT (equation 9).
    (Y', y') <- argmin over (Y'', y'') in M of ||y - y''||
    if ||y - y'|| > delta then
      M <- M U {(X, y)}
    else
      X' = X
      Associate elements in X and Y' (equation 5).
      Update X:
        For the associated elements use equations 6, 7.
        For the non-associated, decrease weight (eq. 8).
        Remove elements in X with too low weight.
      if X is a single Gaussian (x)
        Update Y':
          if x is associated with y' in Y'
            Update y' using equations 6, 7.
          else
            Add x to Y'.
          Decrease weight for y'' in Y', y'' != y' (eq. 8).
          Remove elements in Y' with too low weight.
    if size(X) = 1 and not(uniform(X)) and (uniform(X') or |X| < |X'|)
      Backpropagate(X)

Fig. 1. The multi-hypotheses tracking appearance-based CML algorithm. ||v|| is the norm of vector v and |S| is the determinant of matrix S.

Fig. 2. Path of the robot used for the tests. A: using only odometry. B: the map is corrected when the loop is closed. C: adding new portions of the map (to do that, the robot performs global localization).


Our CML system detects that the final point in the trajectory is close to the initial one by comparing the features of the images obtained at those points. This coincidence allows a drastic reduction of the uncertainty on the robot's location, and this reduction can be back-propagated, improving the quality of the map. Figure 2-B shows the corrected map points after the loop is closed. We can see that the correction affects mainly the points close to the end of the trajectory, where the back-propagated information is more certain than the previously stored one; close to the beginning of the trajectory, the initial information is more reliable than the back-propagated one and the map points are not modified. In figure 2-B, the light gray arrows correspond to poses where there is perceptual aliasing: the same set of features is observed from all the poses plotted in light gray. Perceptual aliasing is one of the reasons why we need to keep track of many hypotheses simultaneously.

Another situation where multiple hypotheses should be considered is when performing global localization: when the robot is started at an unknown pose (but in a previously mapped area) or after a kidnap. In the experiment reported in figure 2-C, we started the robot in a non-mapped area and we drove it to the beginning of the corridor previously mapped. In this case, the robot operates with a uniform distribution about its pose. When consistent matches with already existing map points occur, the robot determines its pose, the uniform distribution is replaced by a Gaussian defined from the map, and new map points along the robot path are added.

In a third experiment, we tested the ability of our system to deal with the kidnap problem, even while the map is in the early phases of its construction. We moved the robot in a closed circuit of 3.5 x 3 meters (figure 3-left) departing from position O. In the first loop, at position A, the robot was kidnapped: lifted, displaced to position B, and rotated 180 degrees. Thus, according to odometry the robot moves from A to C while it is actually moving from B to O. The kidnap introduces a large error while the map is in its initial construction phase (observations initially assigned to points on the path from A to C are actually occurring on the path from B to O). When the robot gets to O, a new hypothesis (the correct one) is introduced into the state set. This new hypothesis is reinforced by the observations and the weight of the other hypotheses decreases. After a few time slices, only the correct hypothesis remains in the set $X_t$, and in the following iterations over the circuit the map is corrected. We can use an example to show how the map correction works. Initially, the incorrect association $(\{(x_1, \Sigma_1, w_1 = 1)\}, y)$ is stored in the map. In the second loop, this map point is updated to $(\{(x_1, \Sigma_1, w_1 = 1), (x_2, \Sigma_2, w_2 = 1)\}, y)$. In the following iterations, the weight of the association $(x_1, y)$ decreases since it is not observed again, while the correct association $(x_2, y)$ is reinforced. After a few loops, the association $(x_1, y)$ becomes weak and it is eventually removed from the map.

Figure 3-Right shows the error in localization as the robot moves around the circuit. The large errors at time slices 40-90 are caused by the initial kidnap but, as the robot gets to O again, the error is canceled and kept low. Every loop around the circuit takes 100 time slices.

Fig. 3. Left: the circuit used for the kidnap experiment. Right: the error in localization for different time slices.

VII. DISCUSSION AND FUTURE WORK

We have introduced a system that is able to simultaneously build an appearance map of the environment and to use this map, still under construction, to improve the localization of the robot. The on-line construction and update of the map allows us to overcome the major hurdles of traditional appearance-based localization. First, the robot can operate in previously unknown areas. Second, we can deal with small/slow changes in the environment: new observations obtained at already explored positions are added to the map, while the old observations at those positions are not used any more and are slowly forgotten. However, if the environment is highly dynamic, our system would fail to define a stable map. Finally, the way in which the map is built guarantees a regular sampling of the feature space and not of the geometric space, as happens in normal appearance-based localization. Sampling the feature space uniformly is essential for achieving good localization since the sensor model is based on the similarities (i.e., the distances) in that space.

An implicit assumption in our mapping strategy is that the robot moves repetitively through the same areas/paths. However, this is a quite reasonable assumption for service robots moving in relatively small offices or houses, which are the kind of environments in which we plan to use our system. For larger environments, other localization methods, as for instance that introduced in [29], would be more adequate than the one presented here.

The proposed CML approach provides position information with a larger error than landmark-based methods. However, rough information on the robot pose is enough for most tasks, assuming that the low-level behaviors of the robot controller are able to deal with local aspects of the environment (obstacles, doors, etc.).

Due to the way in which we define the map, the map error will be small close to the areas where the map is started and will grow for points far away from the origin. Actually, the error for a given map point p is lower bounded by the error in odometry for a direct displacement from the origin of the map O to p. As a reference, our Nomad Scout robot can map an area of 20 x 20 meters with an accuracy below 1 meter. Since the error in odometry limits the area that we can map with a given accuracy, we would like to complement our localization system with additional odometric sensors (accelerometers, vision-based motion detection, etc.) to determine more accurately the relative displacements of the robot.

The results presented in this paper qualify the proposed approach as a promising one, but much work has to be done to complete the system. First, we would like to reformulate our map learning in the context of function approximation with Gaussian mixtures, using the techniques already developed in our group [30], [31]. Second, we have to incorporate mechanisms to properly deal with illumination changes [32]. We would also like to explore the possibility of using different types of features as, for instance, the SIFT features [33]. Additionally, more elaborate ways to perform the data association and the hypotheses pruning are also needed. Finally, another point we want to address in the near future is the integration of our CML system with a navigation/exploration module. For this purpose, we have to devise criteria to select the robot's movements so that accuracy in localization, coverage of the space to map, accuracy in mapping, and efficiency in reaching the desired positions are simultaneously optimized.

ACKNOWLEDGMENTS

This work has been partially supported by the European (ITEA) project "Ambience: Context Aware Environments for Ambient Services".

REFERENCES

[1] A. Elfes, "Sonar-based real-world mapping and navigation," IEEE Journal of Robotics and Automation, vol. 3, no. 3, pp. 249-265, 1987.
[2] I. Horswill, Artificial Intelligence and Mobile Robots. MIT Press, Cambridge, MA, 1998, ch. The Polly System, pp. 125-139.
[3] J. Leonard and H. Durrant-Whyte, "Mobile Robot Localization by Tracking Geometric Beacons," IEEE Transactions on Robotics and Automation, vol. 7, no. 3, pp. 376-382, 1991.
[4] H. Moravec, "Sensor Fusion in Certainty Grids for Mobile Robots," AI Magazine, vol. 9, no. 2, pp. 61-74, 1988.
[5] M. Jogan and A. Leonardis, "Robust Localization Using Eigenspace of Spinning-Images," in Proceedings of the IEEE Workshop on Omnidirectional Vision, Hilton Head Island, South Carolina, 2000, pp. 37-44.
[6] B. Kröse, N. Vlassis, and R. Bunschoten, "Omnidirectional Vision for Appearance-based Robot Localization," Lecture Notes in Computer Science, vol. 2238, pp. 39-50, 2002.
[7] J. Crowley, F. Wallner, and B. Schiele, "Position Estimation Using Principal Components of Range Data," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1998, pp. 3121-3128.
[8] R. Sim and G. Dudek, "Comparing Image-based Localization Methods," in Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI), Acapulco, Mexico, 2003.
[9] G. Dissanayake, P. Newman, S. Clark, H. Durrant-Whyte, and M. Csorba, "A Solution to the Simultaneous Localization and Map Building (SLAM) Problem," IEEE Transactions on Robotics and Automation, vol. 17, no. 3, pp. 229-241, 2001.
[10] H. Feder, J. Leonard, and C. Smith, "Adaptive Mobile Robot Navigation and Mapping," International Journal of Robotics Research, Special Issue on Field and Service Robotics, vol. 18, no. 7, 1999.
[11] S. Thrun, W. Burgard, and D. Fox, "A Probabilistic Approach to Concurrent Mapping and Localization for Mobile Robots," Machine Learning, vol. 31, pp. 29-53, 1998.
[12] P. E. Rybski, S. Roumeliotis, M. Gini, and N. Papanikolopoulos, "Appearance-Based Minimalistic Metric SLAM," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS), Las Vegas, USA, 2003, pp. 194-199.
[13] S. ten Hagen and B. Kröse, "Trajectory Reconstruction for Self-Localization and Map Building," in Proceedings of the IEEE International Conference on Robotics and Automation, Washington D.C., USA, 2002, pp. 1796-1801.
[14] J. Gutmann and K. Konolige, "Incremental Mapping of Large Cyclic Environments," in Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), 1999, pp. 318-325.
[15] F. Lu and E. Milios, "Globally Consistent Range Scan Alignment for Environment Mapping," Autonomous Robots, vol. 4, pp. 333-349, 1997.
[16] W. Burgard, D. Fox, D. Hennig, and T. Schmidt, "Estimating the Absolute Position of a Mobile Robot Using Position Probability Grids," in Proceedings of the National Conference on Artificial Intelligence (AAAI), 1996, pp. 896-901.
[17] M. Pitt and N. Shephard, "Filtering via Simulation: Auxiliary Particle Filters," Journal of the American Statistical Association, vol. 94, no. 446, pp. 590-599, June 1999.
[18] N. Vlassis, B. Terwijn, and B. Kröse, "Auxiliary Particle Filter Robot Localization from High-Dimensional Sensor Observations," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Washington D.C., USA, May 2002, pp. 7-12.
[19] I. Cox and J. Leonard, "Modeling a Dynamic Environment Using a Multiple Hypothesis Approach," Artificial Intelligence, vol. 66, no. 2, pp. 311-344, 1994.
[20] K. O. Arras, J. A. Castellanos, and R. Siegwart, "Feature-Based Multi-Hypothesis Localization and Tracking for Mobile Robots Using Geometric Constraints," in Proceedings of the IEEE International Conference on Robotics and Automation, 2002, pp. 1371-1377.
[21] P. Jensfelt and S. Kristensen, "Active Global Localization for a Mobile Robot Using Multiple Hypothesis Tracking," IEEE Transactions on Robotics and Automation, vol. 17, no. 5, pp. 748-760, 2001.
[22] S. Kristensen and P. Jensfelt, "An Experimental Comparison of Localization Methods, the MHL Session," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS), Las Vegas, USA, 2003, pp. 992-997.
[23] J. Uhlmann, S. Julier, and M. Csorba, "Nondivergent Simultaneous Map Building and Localization Using Covariance Intersection," in Proceedings of the SPIE Aerospace Conference, vol. 3087, 1997.
[24] B. Terwijn, J. Porta, and B. Kröse, "A Particle Filter to Estimate Non-Markovian States," in Proceedings of the 8th International Conference on Intelligent Autonomous Systems (IAS), 2004, pp. 1062-1069.
[25] H. Murase and S. Nayar, "Visual Learning and Recognition of 3-D Objects from Appearance," International Journal of Computer Vision, vol. 14, pp. 5-24, 1995.
[26] K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, Boston, 1990.
[27] R. Clarke, Digital Compression of Still Images and Video. Academic Press, 1995.
[28] J. Porta and B. Kröse, "On the Use of Disparity Maps for Robust Robot Localization under Different Illumination Conditions," in Proceedings of the 11th International Conference on Advanced Robotics (ICAR), Coimbra, Portugal, 2003.
[29] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM 2.0: An Improved Particle Filtering Algorithm for Simultaneous Localization and Mapping that Provably Converges," in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2003.
[30] J. J. Verbeek, N. Vlassis, and B. Kröse, "Efficient Greedy Learning of Gaussian Mixture Models," Neural Computation, vol. 15, no. 2, pp. 469-485, 2003.
[31] Z. Zivkovic and F. van der Heijden, "Recursive Unsupervised Learning of Finite Mixture Models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 5, 2004.
[32] J. Porta, J. Verbeek, and B. Kröse, "Enhancing Appearance-based Robot Localization Using Non-Dense Disparity Maps," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS), Las Vegas, USA, 2003.
[33] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 2004.
