
PLANETARY ROVER ABSOLUTE LOCALIZATION BY COMBINING VISUAL ODOMETRY WITH ORBITAL IMAGE MEASUREMENTS

Manolis Lourakis and Emmanouil Hourdakis

Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH)

P.O.Box 1385, GR-711 10, Heraklion, Crete, Greece

ABSTRACT

Visual Odometry (VO) has established itself as an important localization technology for planetary exploration rovers, being capable of yielding location estimates with small error over medium-sized trajectories. However, due to VO's incremental mode of operation, the estimation error accumulates over time, resulting in considerable drift for long trajectories. This paper proposes a global localization method that counters VO drift by matching boulders extracted from overhead and ground images of a planet and using them periodically to re-localize the rover and refine VO estimates. The performance of the method is evaluated with the aid of overhead imagery of different resolutions. Experimental results demonstrate that a very terse representation, consisting of approximate boulder locations only, suffices for bringing significant accuracy improvements to VO over long traverses.

Key words: Planetary Rover; Visual Localisation; Visual Odometry; Drift; Orbital Images; Boulders.

1. INTRODUCTION

Exploration of planets, and especially of Mars, has in recent years been among the main objectives of missions planned by space agencies like ESA and NASA. These missions include activities such as the characterization of geological material, the detection of water presence, the identification of climate change or the assessment of human habitability potential. Autonomous rovers are indispensable to planetary science, facilitating both in situ exploration and selective collection of samples to be returned to Earth. Consequently, planetary rovers should possess advanced mobility capabilities, which in turn leads to the requirement that they should be able to accurately localize themselves over long traverses using sensory input and on-board processing.

To support this functionality, vision-based approaches have been extensively used to determine a rover's position using on-board cameras [22]. In particular, research has focused on variations of the VO paradigm [26, 31], which, by analogy to wheel odometry, refers to the process of determining the position and orientation (i.e. pose) of a camera rig by successively analyzing the images it acquires over time. Localization in an extraterrestrial environment cannot rely on features such as line segments or special structures such as planar surfaces, which abound in terrestrial man-made urban settings. Martian terrain, for instance, is dusty and has variable morphology, being occasionally covered with self-occluding boulders and geological elements of similar luminance. To account for this, planetary VO relies on naturally occurring point features defined by the local image content.

An inherent drawback of VO is that it determines camera pose for each frame in an incremental fashion; therefore it is sensitive to errors that accumulate over time. Furthermore, VO estimates the position of the rover relative to its starting location. While this might suffice for small to medium-sized traverses, complex planetary missions that involve longer traverses demand that the rover position be expressed in an absolute coordinate system. Such a global localization capacity [2] will allow future missions to overcome potential landing uncertainties introduced by localization errors during descent, and to determine the position of a site using a planet's cartographic reference system.

This paper proposes a global localization method that limits the drift of VO pose estimates by matching boulders among images obtained by a rover's on-board cameras and orthoimages acquired high above the ground by the planet's orbiters. Localizing a robot outdoors using prior knowledge of the environment has been addressed by several authors, e.g. see [17] and references therein. These approaches, however, are tailored to terrestrial environments and employ features such as roads, line segments and building footprints, or sensors such as GPS and LIDAR that are only available on Earth. Thus, the primary contribution of the proposed method is to show that localization improvements can be based on a compact representation derived from common geological structures such as boulders. Due to its reliance on mean statistics to describe boulders and the use of geometrical information to derive position estimates, the method is resilient to changes in illumination and ground morphology.


Experiments demonstrate that the proposed method can localize a rover over long distances, with less than 1% positional error over the entire traverse.

2. PREVIOUS WORK

Over the last decade, VO for planetary rovers has matured to the point that it yields fairly accurate localization results for small and medium-sized traverses. For example, [21] discusses the VO algorithm implemented on NASA's Mars Exploration Rovers (MERs) and its integration in mission operations. However, even though VO can compute accurate relative motion estimates, it suffers from drift errors due to calibration errors, quantization noise, low quality images, mislocalized or mismatched image features, etc. Drift errors accumulate over time and cannot be recovered from in the absence of more elaborate knowledge of the environment.

One way to cope with drift errors is to employ a visual simultaneous localization and mapping (VSLAM) approach. VSLAM is the process of incrementally updating a map of an unknown environment whilst localizing the rover within the said map [34]. However, VSLAM techniques are quite demanding for the limited-performance, radiation-hardened CPUs available on a planetary rover (for example, the RAD6000 and RAD750 flight processors, respectively on-board the MERs and Curiosity, are capable of executing only up to 35 and 400 MIPS). Furthermore, VSLAM is more effective when a vehicle returns to previously visited areas (i.e. closes loops), which is not that common for a planetary rover. Another possibility is the so-called windowed bundle adjustment [24], which performs a local optimization over the most recent frames. Still, even windowed bundle adjustment is challenging for the computational capacity of current rovers.

To mitigate the problems caused by accumulated VO error at a manageable computational cost, researchers have resorted to using additional, non-visual sensors or prior information regarding the environment. In [27], for example, orientation sensors on-board the rover were used to correct the VO orientation error and eventually reduce the drift to a linear function of the distance traveled. Sparse bundle adjustment and an Inertial Measurement Unit (IMU) were employed in [16] to correct VO drift. Other works combine a LIDAR sensor with geo-referenced Digital Elevation Maps (DEMs) in order to align orbital and ground images [4]. Similarly, [15] creates a DEM using the rover stereo images, which is then matched against the DEM obtained from an orbiter. In [25], localization is based on matching a DEM built by the rover against an orbital terrain map, combined with aligning the horizon curve detected by the rover with that rendered from the orbital DEM.

Despite achieving remarkable localization results, the aforementioned methods rely on information that is not always readily available on planetary rovers.

Figure 1: Graphical illustration of the proposed localization approach. Pose corrections are obtained by registering ground boulders, extracted from the rover's stereo images, with boulders from the planet's orbital images.

For example, DEMs must be reconstructed from images of the planet's surface, and require excessive storage capacity as well as special hardware such as GPUs to be rendered quickly [25]. Moreover, in contrast to stereo cameras, which are already being used by rovers to discern geological information, the incorporation of a LIDAR sensor would require additional installations, increasing the vehicle's weight, complexity and power consumption.

3. PROPOSED METHOD

Our proposed localization method uses as features boulders from orthorectified aerial images of a planet's surface (also known as orbital images). The locations of these boulders can be known accurately via offline processing of images acquired from orbit. While traversing the ground, the rover extracts boulders from its images and attempts to match them against the overhead ones. Upon a successful match, the rover can update its VO estimate, thus reducing drift. In this sense, boulders serve as known landmarks allowing the rover to relocalize itself with respect to them [35]. The aforementioned steps are implemented by a pipeline illustrated in Fig. 1, which (i) estimates local relative displacement using VO, (ii) derives a statistical descriptor for boulders in the ground and orbital images, (iii) registers the ground and orbital images by estimating the transformation that minimizes a misalignment error defined on boulder locations and (iv) re-initializes VO using the estimated ground-orbital transformation. In the subsections that follow, each step is described in detail.

Even though high resolution imagery from the HiRISE Mars orbiter [23] and the Martian rovers' stereo cameras is publicly available, no public dataset exists that includes information from both these sources for the same traverse. Consequently, to evaluate our method, we employed a set of synthetic images that correspond to a terrain with morphological properties similar to those found in a Martian environment. The dataset consists of a stereo image sequence with accurate ground truth pose and an orthoimage depicting an area that includes the ground traverse, as would be seen from a high-flying orbiter. The dataset simulates a Mars-like surface featuring rocks of various sizes and their shadows, as well as sand and soil segments (see Fig. 2).


Figure 2: Different snapshots from the simulated ground stereo sequence, showing regions with mineral deposits and soil, occlusion between boulders, and shadows.

3.1. Visual Odometry

Local motion estimates are provided by a feature-based VO pipeline. VO assumes an imaging system consisting of two parallel forward-looking cameras, a configuration that avoids the well-known depth/scale ambiguity and simplifies the estimation of motion when the rover is stationary [13]. VO employs SIFT for feature detection and matching, as well as absolute orientation on reconstructed sparse 3D point clouds for motion estimation. The VO pipeline is briefly described in the remainder of this section, with more details available in [19].

Feature detection and matching concerns the extraction of sparse point features from a scene and their association in successive images. This work employs the Scale-Invariant Feature Transform (SIFT), which detects features as scale-space extrema and then describes them using weighted spatial histograms of gradient orientations [20]. Considering that in the context of VO successive images are not expected to differ considerably, scale-invariance of features is not essential. Therefore, in this work, the computationally expensive scale-space filtering for determining feature locations is avoided and features are detected at a single scale with the Harris operator [12]. For each of these features, a SIFT descriptor capturing the distribution of orientations in a region around it is computed.

Prior to estimating 3D motion, the 2D motion of image points has to be determined by matching them across images. Point matching proceeds according to the standard distance ratio matching strategy [20], as follows. Given an image pair, matches are identified by finding the two nearest neighbors of each interest point from the first image among those in the second, and only accepting a match if the distance to the closest neighbor is less than a fixed fraction of that to the second closest neighbor. This fraction can be raised to select more matches or lowered to select only the most reliable. Distances among SIFT descriptors are usually computed with the Euclidean (L2) norm. Here, higher quality matches are obtained by substituting L2 with the χ² distance, which originates from the χ² test statistic [28, 7]. This is a histogram distance that takes into account the fact that in many natural histograms, the difference between large bins is less important than the difference between small bins and should therefore be given less weight. Compared to more elaborate distances, the use of the χ² distance has been found to offer a good performance / computational cost trade-off.
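As an illustration, the following minimal Python sketch implements ratio-test matching with the χ² distance; the helper names, the 0.8 ratio and the ½ normalization of the χ² distance are assumptions, and descriptors are taken to be non-negative, histogram-like rows of NumPy arrays.

```python
import numpy as np

def chi2_dist(u, V, eps=1e-10):
    """Chi-squared distance between one descriptor u and each row of V."""
    num = (V - u) ** 2
    den = V + u + eps          # eps guards against empty bins (0/0)
    return 0.5 * np.sum(num / den, axis=1)

def ratio_match(desc1, desc2, ratio=0.8):
    """Lowe-style ratio test: accept a match only if the best chi-squared
    distance is below `ratio` times the second-best one."""
    matches = []
    for i, u in enumerate(desc1):
        d = chi2_dist(u, desc2)
        j1, j2 = np.argsort(d)[:2]     # two nearest neighbours
        if d[j1] < ratio * d[j2]:
            matches.append((i, j1))
    return matches
```

Raising `ratio` admits more (but less reliable) matches, exactly as described above.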

Using the feature detection and matching techniques described above, the pose of the employed camera system can be incrementally recovered. More specifically, a number of points are detected and matched between the two stereo views at a certain time instant t. Knowledge of the stereo calibration parameters allows the estimation via triangulation of the 3D points giving rise to the matched image projections. Triangulation recovers 3D points as the intersections of back-projected rays defined by the matched image projections and the camera centers. Since there is no guarantee that back-projected rays will actually intersect in space (i.e., in general they are skew), matched image points should be refined prior to triangulation so as to exactly satisfy the underlying epipolar geometry. This is achieved by computing the points on the epipolar lines that are closest to the original ones. The computation involves minimizing the distances of points to epipolar lines with a non-iterative scheme that boils down to solving a sixth degree polynomial [13]. As this is rather costly in terms of computation, we employ an approximate but much cheaper alternative relying on the Sampson approximation of the distance function [30].
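For reference, the Sampson correction has a simple closed form: each correspondence is shifted along the gradient of the epipolar constraint x₂ᵀFx₁ = 0. The sketch below is an illustration under the assumption of a known fundamental matrix F, not the authors' implementation.

```python
import numpy as np

def sampson_correct(x1, x2, F):
    """First-order (Sampson) correction of a correspondence x1 <-> x2 so that
    it approximately satisfies x2^T F x1 = 0. Points are 2-vectors; F is 3x3."""
    p1 = np.array([x1[0], x1[1], 1.0])
    p2 = np.array([x2[0], x2[1], 1.0])
    Fx1, Ftx2 = F @ p1, F.T @ p2
    e = p2 @ F @ p1                                   # algebraic epipolar error
    d = Fx1[0]**2 + Fx1[1]**2 + Ftx2[0]**2 + Ftx2[1]**2
    delta = e / d
    # shift each point along the gradient of the epipolar constraint
    x1c = np.asarray(x1, float) - delta * Ftx2[:2]
    x2c = np.asarray(x2, float) - delta * Fx1[:2]
    return x1c, x2c
```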

As the two cameras move, detected feature points are tracked over time in both stereo views from time t to t+1. By triangulating the tracked points at time t+1, the camera system motion can be computed as the rigid transformation bringing the 3D points of time t into agreement with those obtained at time t+1. Determining the rigid body transformation that aligns two matched 3D point sets is known as the absolute orientation problem. Its solution involves the minimization of the mean squared distance between the point sets under the sought transformation and is expressed analytically with the aid of unit quaternions [14]. Since incorrectly matched points caused by phenomena such as occlusions, depth discontinuities, repetitive patterns, shadows, etc. cannot be completely avoided in descriptor-based matching, care must be taken so that their influence on the estimation of motion is limited. This is achieved by embedding the estimation of motion in a robust regression framework based on RANSAC [9], which ensures the detection and elimination of spurious matches.
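A possible sketch of this motion estimation step follows. For brevity it solves absolute orientation with the SVD-based (Kabsch) closed form, which gives the same least-squares solution that [14] derives with unit quaternions; the RANSAC sample count and inlier threshold are illustrative assumptions.

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares rigid transform (R, t) with Q ~ R @ P + t; P, Q: (N,3).
    Closed-form SVD solution, equivalent to the quaternion method of [14]."""
    cp, cq = P.mean(0), Q.mean(0)
    H = (P - cp).T @ (Q - cq)                    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                           # reflection-free rotation
    return R, cq - R @ cp

def ransac_motion(P, Q, iters=200, tol=0.05):
    """RANSAC wrapper: fit on minimal 3-point samples, keep the largest
    inlier set, refit on it. tol is an assumed inlier threshold in metres."""
    rng = np.random.default_rng(0)
    best = None
    for _ in range(iters):
        idx = rng.choice(len(P), 3, replace=False)
        R, t = rigid_align(P[idx], Q[idx])
        inl = np.linalg.norm((P @ R.T + t) - Q, axis=1) < tol
        if best is None or inl.sum() > best.sum():
            best = inl
    return rigid_align(P[best], Q[best])         # final fit on all inliers
```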

3.2. Boulder detection

Automatic detection of boulders in images is a difficult problem, since no assumptions can be made regarding the ground morphology, boulder size, texture or color (cf. Fig. 3(a)). For example, images acquired on Mars have small color and illumination variations, therefore shadows drastically reduce the distinctiveness of boulders [33]. To detect boulders, existing approaches have relied on edge detection [5], estimation of protrusion from a locally fitted ground plane [10], or template-based matching [11].


Figure 3: Overhead image detail with visible boulders (top) and detected connected components corresponding to boulders, shown as white blobs (bottom). Red crosses mark the centroid of each sufficiently large component.

Nevertheless, even if a reliable algorithm for boulder detection were available, ground images differ substantially in their appearance compared to overhead ones, due to their very different vantage points. Therefore, boulder matching cannot rely on photometric cues. The approach adopted here is to base boulder detection on geometrical information. The methodology used to detect boulders in orbital and ground images is outlined next.

The process for detecting boulders in orbital images starts by using adaptive Otsu thresholding to convert them to binary, followed by 15 × 15 median filtering to reduce noise while preserving edges. Connected components are then extracted from the resulting binary image and their area is calculated. Only those components whose area is above a certain threshold are retained; their centroids are then computed and stored on the rover. Figure 3(b) demonstrates the result of this process. Since overhead images are geo-registered, the computed centroids are associated with cartographic coordinates.
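This stage maps onto standard OpenCV calls, as in the sketch below; the minimum-area threshold and the assumed polarity (boulders brighter than the ground after binarization) are illustrative.

```python
import cv2
import numpy as np

def orbital_boulders(gray, min_area=50):
    """Return pixel centroids of boulder-sized blobs in an orbital image."""
    # Otsu picks the binarization threshold automatically from the histogram.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # 15x15 median filtering suppresses speckle while preserving blob edges.
    binary = cv2.medianBlur(binary, 15)
    # Label connected components and keep only sufficiently large ones.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    keep = stats[1:, cv2.CC_STAT_AREA] >= min_area    # row 0 is the background
    # Geo-registration then maps the surviving centroids to map coordinates.
    return centroids[1:][keep]
```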

Ground boulders are extracted with the following sequence of operations. First, images are segmented using an approach based on mean shift [8]. Mean shift is a nonparametric kernel density estimation technique based on iterative clustering.

Figure 4: Segmentation of a ground image from the left rover camera. Original image (top); segmented image superimposed semi-transparently on the original (bottom).

It was chosen for the problem at hand due to its robustness to parameter selection, as it works well with clusters of arbitrary size, shape and number, and due to its ability to preserve discontinuities in an image after smoothing. To reduce over-segmentation, a post-processing step that iteratively merges regions with similar intensity characteristics is applied to the output of mean shift. Very small regions are eliminated from further consideration, as it is unlikely that they are visible in the overhead images, while each remaining region is assigned a unique label.
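As a rough stand-in for this stage (not the authors' implementation), the sketch below substitutes OpenCV's pyramid mean-shift filtering for the segmentation of [8] and approximates the region-merging post-process by labeling coarsely quantized intensities; all bandwidths and thresholds are assumed values.

```python
import cv2
import numpy as np
from scipy import ndimage

def segment_ground(bgr, sp=12, sr=20, min_region=200):
    """Return a label image; regions too small to be visible from orbit get 0."""
    # Edge-preserving mean-shift smoothing (stand-in for the segmentation of [8]).
    smoothed = cv2.pyrMeanShiftFiltering(bgr, sp, sr)
    quant = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY) // 16   # coarse intensity bins
    labels = np.zeros(quant.shape, np.int32)
    next_label = 1
    for v in np.unique(quant):            # connected components per intensity bin
        comp, n = ndimage.label(quant == v)
        labels[comp > 0] = comp[comp > 0] + (next_label - 1)
        next_label += n
    # Suppress very small regions, mimicking the pruning step described above.
    ids, counts = np.unique(labels, return_counts=True)
    labels[np.isin(labels, ids[counts < min_region])] = 0
    return labels
```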

Figure 4 shows representative segmentation results for the first left frame of the sequence. It is evident that mean shift is able to preserve edges on the boulders. This drastically improves the segmentation results, since boulders found in the said scenery usually exhibit several color discontinuities on their surface, which can cause other algorithms to fail. Segmented regions are temporally associated between successive frames with the aid of reconstructed 3D points. Specifically, image features extracted from the segmented regions are matched between stereo views and used to recover 3D points via triangulation. For all n segmented regions in frame t and m segmented regions in frame t+1, an n × m voting matrix M is formed. Each newly reconstructed 3D point is assigned two labels:


Figure 5: Boulders matched between successive time instants after voting and grouping the 3D points. Pixels with the same color belong to the same boulder label.

one for the segmented region it projects into in frame t and one for that in frame t+1, and it casts one vote to the corresponding row and column of M. These labels are associated in consecutive frames by solving the assignment problem for M with the Munkres algorithm [3]. Labels with no vote in frame t are discarded, while labels with no vote in frame t+1 are considered as corresponding to newly seen boulders. Figure 5 illustrates how this voting scheme succeeds in temporally matching boulders.
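The voting and assignment step can be written compactly with SciPy, whose linear_sum_assignment routine implements the Hungarian/Munkres method; the sketch assumes each reconstructed 3D point already carries its two region labels.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate_regions(labels_t, labels_t1, n, m):
    """labels_t[k], labels_t1[k]: region indices of 3D point k in frames t and
    t+1 (0..n-1 and 0..m-1). Returns pairs (i, j) of temporally matched regions."""
    M = np.zeros((n, m))
    for i, j in zip(labels_t, labels_t1):
        M[i, j] += 1                        # one vote per reconstructed point
    rows, cols = linear_sum_assignment(M, maximize=True)
    # Keep only assignments supported by at least one vote; unmatched
    # regions in frame t+1 correspond to newly seen boulders.
    return [(i, j) for i, j in zip(rows, cols) if M[i, j] > 0]
```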

Due to mismatched image features and occasional over-segmentation, there is a considerable possibility for point clouds belonging to the same boulder to be assigned different labels. To prevent this, we perform an agglomerative clustering on point coordinates, which employs their Euclidean distances to form groups of uniformly clustered 3D points that represent boulders. Figure 6 illustrates the reassignment of labels to point clouds after clustering, which permits points from the same boulder that have been assigned different labels to be merged into one. After grouping, we use the mean of each 3D point cloud as the representation of the corresponding boulder. Due to relying on the mean statistic of strictly geometrical information, shadows on a boulder, terrain morphological variations and changes in illumination do not compromise the quality of the boulder descriptor.
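A minimal sketch of this grouping step, assuming single-linkage agglomerative clustering and an illustrative merge distance of 0.3 m:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def boulder_centroids(points, merge_dist=0.3):
    """points: (N,3) array of triangulated 3D points. Returns one mean point
    (the boulder descriptor used for registration) per merged cluster."""
    Z = linkage(points, method='single')              # nearest-neighbour merging
    cl = fcluster(Z, t=merge_dist, criterion='distance')
    return np.array([points[cl == c].mean(axis=0) for c in np.unique(cl)])
```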

Figure 6: 3D points projected on the orbital image, drawn in different colors according to their labels, before clustering (top) and after (bottom). (Colors are assigned independently in each image.)

3.3. Ground-orbital registration

Ground boulders are extracted over short image subsequences with the aforementioned procedure. The centroids of the extracted 3D point clouds are transformed into a common coordinate frame using the underlying VO estimates and then projected vertically to obtain a local 2D map. Local maps made up of detected ground boulders are registered with the 2D map formed by overhead boulders, using the Iterative Closest Point (ICP) algorithm with the point-to-point distance metric. As a byproduct, ICP matching yields the similarity transformation aligning the ground with the overhead boulders. Eventually, this transformation is used for correcting the 2D position and in-plane rotational components of the VO estimate. To reduce the computational overhead, the rover selects for matching only those overhead boulders that are in the vicinity of its current position, as estimated using VO. It is noted that the application of ICP assumes that the starting point of the rover trajectory is approximately


known in the orbital map. When this is not the case, initial ground-orbital map registration can be based on a less constrained, albeit more computationally expensive, scheme such as geometric hashing [18].

The ICP algorithm [1] is the standard approach to geometrically align two sets of points whose relative pose is approximately known. In its simplest form, one set of points, the model, is kept fixed and ICP repeatedly transforms the other, called the data set, to align it with the model. This is achieved by alternating between establishing putative point correspondences in the two sets and using them to estimate the geometric transformation that registers these sets. For ICP to converge to the correct result, the initial pose of the data set must be sufficiently close to that of the model, which in turn ensures that sufficient overlap exists between the two. Numerous enhancements to the basic ICP algorithm have been proposed that aim to enlarge its convergence basin, accelerate its rate of convergence or increase its robustness to local minima, outlying points and noise [29, 6]. In this work, the following two enhancements were found to improve convergence speed and accuracy: motion parameter extrapolation using the method outlined in [1], and rejection of 20% to 40% of the worst matches.
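The following sketch illustrates such a 2D point-to-point ICP with worst-match rejection (fixed here at 30%, within the 20-40% range reported above) and a closed-form 2D similarity fit using complex arithmetic; motion parameter extrapolation [1] is omitted for brevity.

```python
import numpy as np
from scipy.spatial import cKDTree

def fit_similarity(p, q):
    """Least-squares 2D similarity q ~ a*p + b, points encoded as complex numbers."""
    pc, qc = p - p.mean(), q - q.mean()
    a = np.sum(np.conj(pc) * qc) / np.sum(np.abs(pc) ** 2)   # scale * rotation
    return a, q.mean() - a * p.mean()

def icp_2d(ground, orbital, iters=50, reject=0.3):
    """Align ground boulder centroids to orbital ones; both are (N,2) arrays."""
    p = ground[:, 0] + 1j * ground[:, 1]
    model = orbital[:, 0] + 1j * orbital[:, 1]
    tree = cKDTree(orbital)
    a_tot, b_tot = 1.0 + 0j, 0.0 + 0j
    for _ in range(iters):
        d, idx = tree.query(np.c_[p.real, p.imag])   # closest orbital boulder
        keep = d <= np.quantile(d, 1.0 - reject)     # drop the worst matches
        a, b = fit_similarity(p[keep], model[idx][keep])
        p = a * p + b                                # move the ground map
        a_tot, b_tot = a * a_tot, a * b_tot + b      # accumulate the transform
    # np.angle(a_tot) is the in-plane rotation correction, b_tot the 2D shift.
    return a_tot, b_tot
```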

4. EXPERIMENTAL RESULTS

Sample results from an experimental evaluation of the method are presented in this section. A sequence of 1667 synthetic stereo images with a resolution of 512 × 384 pixels and a 66° × 52° field of view was employed. The images emulated a stereo camera pair mounted at a height of 0.7 m above the ground, at an angle of 39° with the horizon and with a baseline equal to 0.12 m. An S-shaped, 100 m long rover traverse was simulated (cf. Fig. 7). Ground truth poses were available for all images of the traverse. Plain VO along the entire traverse estimated the final location of the rover with a positional error of around 2.5 m. The simulated orbital orthoimage was 15360 × 15360 pixels and depicted an area of 76.8 × 76.8 m² at a resolution of 0.005 m per pixel. VO estimates were used to transform boulders extracted from the ground images into a common reference frame. The accumulated VO error was rectified every 30 frames, based on the ground-orbital boulder registration.

Figure 7 shows the true rover trajectory compared with those estimated with the proposed method and plain VO. To test the limits of the method, and to determine whether it is resilient to lower resolution orbital images, we have evaluated its accuracy when applied to orbital images that were resampled at increasingly lower resolutions. More specifically, the original orbital image was shrunk to 75%, 50% and 25% of its dimensions and the proposed method was applied to it as well as to all lower resolution orbital images.

Figure 8(a) plots the translational errors corresponding to the poses obtained with the proposed method

Figure 7: Comparison of the rover trajectories estimated using the proposed method at the finest orbital image resolution (red) and plain VO (dashed green) against the simulated rover trajectory (blue).

at each frame and resolution, and illustrates that the method yields highly accurate positional estimates for all resolutions tested, maintaining the corresponding error below 1% over the entire 100 m traverse. The translational error between an estimated translation t̂ and the true translation t is computed as ||t̂ − t||, whereas the rotational error between an estimated rotation matrix R̂ and the true R is arccos((trace(R̂⁻¹R) − 1)/2) and corresponds to the amount of rotation about a unit vector that transfers R̂ to R. A comparison of the rotational errors for the trajectories computed using VO with and without orbital correction is included in Fig. 8(b). From these plots, it is evident that the incorporation of orbital imagery in the localization pipeline achieves considerable improvements in the accuracy of VO estimates, even with orbital images of low resolution.
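For clarity, the two error metrics written out in code (the clipping merely guards arccos against numerical round-off):

```python
import numpy as np

def translational_error(t_est, t_true):
    return np.linalg.norm(t_est - t_true)

def rotational_error_deg(R_est, R_true):
    # Angle of the residual rotation R_est^-1 @ R_true about its axis;
    # R_est^-1 equals R_est.T for a rotation matrix.
    c = (np.trace(R_est.T @ R_true) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))
```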

At this point it is worth mentioning that the proposed VO refinement performs worse than plain VO at the beginning of the trajectory, especially when lower resolution orbital images are used. This is because the accuracy of VO corrections estimated with the proposed method is limited by the orbital image resolution, whereas plain VO is still quite accurate as drift has not yet accumulated significantly. It is also pointed out that VO refinements, either correct or wrong, cause the sudden jumps observed in the orientation error for the two lower resolutions in Fig. 8(b) (cyan and magenta curves).

Table 1 summarizes the method's translation and orientation errors at the end of the traverse for the different resolution orbital images. We conclude this section by noting that even the coarsest of the simulated orbital images has a resolution considerably higher than that provided by current high-resolution orbital imagers such as the HiRISE instrument (i.e. 2 cm vs. 25 cm per pixel). However, orbital imagery of suitable detail can be obtained with super-resolution techniques such as [32], which employs multiple overlapping orbital images to reconstruct higher resolution imagery at 5 cm.


Figure 8: Positional (top) and orientation (bottom) errors over the entire traverse. Plain VO error is plotted in red, while the remaining lines plot the error after correction for different orbital image resolutions: 15360 × 15360 (green), 11520 × 11520 (blue), 7680 × 7680 (cyan) and 3840 × 3840 (magenta).

5. CONCLUSION

Global localization is an important capacity for future planetary exploration missions. Images acquired by orbiters of a planet have opened up the possibility of localizing a planetary rover in a global reference frame, independently of its initial location and traveled distance. This paper has suggested a method that is able to correct the drift in VO by registering boulders detected in ground and orbital images. Due to its reliance on geometrical information, the method is able to match such images despite their large difference in appearance.

Implementation-wise, the proposed method is well-suited to the hardware available on planetary rovers, as it is both computationally and memory efficient.

Table 1: Position and orientation errors at the end of the 100 m traverse for increasingly coarser orbital images.

Resolution (pixels)    Positional error (m)    Rotational error (deg)
15360 × 15360          0.7743                  0.3809
11520 × 11520          0.9939                  0.5378
7680 × 7680            0.6954                  0.9254
3840 × 3840            1.2522                  1.2457

The feature matching process reduces each boulder to a single mean statistic, which does not overwhelm platforms with limited storage. In addition, several parts of the boulder detection pipeline can be parallelized and therefore accelerated with an FPGA-based implementation. As the experimental results demonstrate, the method can be applied successfully to orbital maps of quite coarse resolution. One drawback of the method is that it only provides corrections to longitudinal and lateral movements but not vertical ones. To remedy this, one could obtain a 3D representation of the boulders in the orbital image, or consider additional constraints during the matching process (e.g. using different transforms for the projection of the ground boulders on the orbital image). We plan to explore these two directions in future work.

ACKNOWLEDGMENTS

This work was partially supported by the ESA SPARTAN Extension Activity (SEXTANT) (ESA/ESTEC reference 4000103357/11/NL/EK). The authors thank Marcos Aviles Rodrigalvarez for preparing the datasets used in the experiments.

REFERENCES

[1] Besl, P. and McKay, N. (1992). A Method for Registration of 3-D Shapes. IEEE Trans. Pattern Anal. Mach. Intell., 14(2):239-256.

[2] Boukas, E., Gasteratos, A., and Visentin, G. (2014). Localization of planetary exploration rovers with orbital imaging: a survey of approaches. In ICRA14 Workshop on Modelling, Estimation, Perception and Control of All Terrain Mobile Robots.

[3] Bourgeois, F. and Lassalle, J.-C. (1971). An extension of the Munkres algorithm for the assignment problem to rectangular matrices. Communications of the ACM, 14(12):802-804.

[4] Carle, P. J., Furgale, P. T., and Barfoot, T. D. (2010). Long-range rover localization by matching lidar scans to orbital elevation maps. Journal of Field Robotics, 27(3):344-370.

[5] Castano, R., Estlin, T., Gaines, D., Chouinard, C., Bornstein, B., Anderson, R. C., Burl, M., Thompson, D., Castano, A., and Judd, M. (2007). Onboard autonomous rover science. In 2007 IEEE Aerospace Conference, pages 1-13. IEEE.

[6] Castellani, U. and Bartoli, A. (2012). 3D Shape Registration. In Pears, N., Liu, Y., and Bunting, P., editors, 3D Imaging, Analysis and Applications, pages 221-264. Springer London.

[7] Cha, S.-H. (2007). Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions. International Journal of Mathematical Models and Methods in Applied Sciences, 1(4):300-307.

[8] Comaniciu, D. and Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell., 24(5):603-619.

[9] Fischler, M. A. and Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM, 24(6):381-395.

[10] Gor, V., Castano, R., Manduchi, R., Anderson, R., and Mjolsness, E. (2001). Autonomous rock detection for Mars terrain. Space, pages 1-14.

[11] Gulick, V. C., Morris, R. L., Ruzon, M. A., and Roush, T. L. (2001). Autonomous image analyses during the 1999 Marsokhod rover field test. Journal of Geophysical Research: Planets, 106(E4):7745-7763.

[12] Harris, C. and Stephens, M. (1988). A combined corner and edge detector. In Proc. of the 4th Alvey Vision Conference, pages 147-151.

[13] Hartley, R. and Zisserman, A. (2004). Multiple View Geometry in Computer Vision. Cambridge University Press, second edition.

[14] Horn, B. K. P. (1987). Closed-form solution of absolute orientation using unit quaternions. Journal of the Optical Society of America A, 4(4):629-642.

[15] Hwangbo, J. W., Di, K., and Li, R. (2009). Integration of orbital and ground image networks for the automation of rover localization. In ASPRS 2009 Annual Conference.

[16] Konolige, K., Agrawal, M., and Sola, J. (2011). Large-scale visual odometry for rough terrain. In Robotics Research, pages 201-212. Springer.

[17] Kummerle, R., Steder, B., Dornhege, C., Kleiner, A., Grisetti, G., and Burgard, W. (2011). Large scale graph-based SLAM using aerial images as prior information. Auton. Robots, 30(1):25-39.

[18] Lamdan, Y. and Wolfson, H. J. (1988). Geometric hashing: A general and efficient model-based recognition scheme. In ICCV, pages 238-249.

[19] Lourakis, M., Chliveros, G., and Zabulis, X. (2013). Autonomous Visual Navigation for Planetary Exploration Rovers. In 12th Symposium on Advanced Space Technologies in Automation and Robotics (ASTRA).

[20] Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. Int. J. of Computer Vision, 60(2):91-110.

[21] Maimone, M., Cheng, Y., and Matthies, L. (2007). Two years of visual odometry on the Mars Exploration Rovers. Journal of Field Robotics, 24(3):169-186.

[22] Matthies, L., Maimone, M., Johnson, A., Cheng, Y., Willson, R., Villalpando, C., Goldberg, S., Huertas, A., Stein, A., and Angelova, A. (2007). Computer vision on Mars. International Journal of Computer Vision, 75(1):67-92.

[23] McEwen, A. S., Eliason, E. M., Bergstrom, J. W., Bridges, N. T., Hansen, C. J., Delamere, W. A., Grant, J. A., Gulick, V. C., Herkenhoff, K. E., Keszthelyi, L., Kirk, R. L., Mellon, M. T., Squyres, S. W., Thomas, N., and Weitz, C. M. (2007). Mars Reconnaissance Orbiter's High Resolution Imaging Science Experiment (HiRISE). Journal of Geophysical Research: Planets, 112(E5).

[24] Mouragnon, E., Lhuillier, M., Dhome, M., Dekeyser, F., and Sayd, P. (2006). Real time localization and 3D reconstruction. In CVPR, volume 1, pages 363-370.

[25] Nefian, A., Boyssounousse, X., Edwards, L., Kim, T., Hand, E., Rhizor, J., Deans, M., Bebis, G., and Fong, T. (2014). Planetary rover localization within orbital maps. In ICIP. IEEE.

[26] Nister, D., Naroditsky, O., and Bergen, J. (2004). Visual odometry. In CVPR 2004, volume 1, pages I-652. IEEE.

[27] Olson, C. F., Matthies, L. H., Schoppers, M., and Maimone, M. W. (2003). Rover navigation using stereo ego-motion. Robotics and Autonomous Systems, 43(4):215-229.

[28] Rubner, Y., Puzicha, J., Tomasi, C., and Buhmann, J. M. (2001). Empirical evaluation of dissimilarity measures for color and texture. Comput. Vis. Image Und., 84(1):25-43.

[29] Rusinkiewicz, S. and Levoy, M. (2001). Efficient Variants of the ICP Algorithm. In 3DIM, pages 145-152.

[30] Sampson, P. (1982). Fitting conic sections to very scattered data: An iterative refinement of the Bookstein algorithm. Computer Graphics and Image Processing, 18(1):97-108.

[31] Scaramuzza, D. and Fraundorfer, F. (2011). Visual odometry. IEEE Robot. Automat. Mag., 18(4):80-92.

[32] Tao, Y. and Muller, J.-P. (2014). Super-resolution of repeat-pass orbital imagery at 400 km altitude to obtain rover-scale imagery at 5 cm. In European Planetary Science Congress, volume 9, EPSC2014-201.

[33] Thompson, D. R. and Castano, R. (2007). Performance comparison of rock detection algorithms for autonomous planetary geology. In 2007 IEEE Aerospace Conference, pages 1-9. IEEE.

[34] Williams, B. and Reid, I. (2010). On Combining Visual SLAM and Visual Odometry. In Proc. ICRA, pages 3494-3500.

[35] Zhu, Z., Oskiper, T., Samarasekera, S., Kumar, R., and Sawhney, H. (2007). Ten-fold Improvement in Visual Odometry Using Landmark Matching. In ICCV, pages 1-8.