
Institutionen för systemteknik
Department of Electrical Engineering

Examensarbete (Master's thesis)

An Optimization Based Approach to Visual Odometry Using Infrared Images

Master's thesis carried out in Automatic Control at the Institute of Technology at Linköping University

by

Emil Nilsson

LiTH-ISY-EX--10/4386--SE

Linköping 2010

Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Supervisors: Christian Lundquist, ISY, Linköpings universitet

Jacob Roll, Autoliv Electronics

David Forslund, Autoliv Electronics

Examiner: Thomas Schön, ISY, Linköpings universitet

Linköping, 15 June, 2010


Avdelning, Institution (Division, Department): Division of Automatic Control, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Datum (Date): 2010-06-15

Språk (Language): Engelska/English

Rapporttyp (Report category): Examensarbete

URL för elektronisk version: http://www.control.isy.liu.se
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-57981

ISBN: —
ISRN: LiTH-ISY-EX--10/4386--SE
Serietitel och serienummer (Title of series, numbering), ISSN: —

Titel (Title): En optimeringsbaserad metod för visuell odometri med infraröda bilder / An Optimization Based Approach to Visual Odometry Using Infrared Images

Författare (Author): Emil Nilsson


Nyckelord (Keywords): visual odometry, smoothing and mapping, SAM, SLAM, automobile motion model, FIR, monocular


Abstract

The goal of this work has been to improve the accuracy of a pre-existing algorithm for vehicle pose estimation, which uses intrinsic measurements of vehicle motion and measurements derived from far infrared images.

Estimating the pose of a vehicle, based on images from an on-board camera and intrinsic measurements of vehicle motion, is a problem of simultaneous localization and mapping (SLAM), and it can be solved using the extended Kalman filter (EKF). The EKF is a causal filter, so if the pose estimation problem is to be solved off-line, acausal methods are expected to increase estimation accuracy significantly. In this work the EKF has been compared with an acausal method for solving the SLAM problem called smoothing and mapping (SAM), which is an optimization based method that minimizes process and measurement noise.

Analyses of how improvements in the vehicle motion model, using a number of different model extensions, affect the accuracy of pose estimates have also been performed.

Sammanfattning

The goal of this master's thesis was to improve the precision of an already existing algorithm's estimates of a vehicle's pose (position and orientation), which uses internal measurements of the vehicle's motion as well as measurements obtained from infrared images.

Estimating a vehicle's pose, based on images from a camera on board the vehicle and internal measurements of vehicle motion, is a problem of simultaneous localization and mapping (SLAM), and it can be solved with an extended Kalman filter (EKF). The EKF is a causal filter, so if the pose estimation is to be carried out afterwards, non-causal methods can be used instead, and such methods are expected to give considerably improved precision in the estimates. In this thesis the EKF has been compared with a non-causal method for solving the SLAM problem called smoothing and mapping (SAM), an optimization based method that minimizes process and measurement noise.

Analyses of how improvements of the vehicle motion model, with a number of different model extensions, affect the precision of the pose estimates have also been carried out.


Acknowledgments

First of all I would like to thank David Forslund and Jacob Roll at Autoliv Electronics for their help, in all kinds of aspects, during this master's thesis. Christian Lundquist and my examiner Thomas Schön have also been of great help, constantly sharing ideas and answering questions throughout my work. My thanks also go to Zoran Sjanic and Martin Skoglund for giving me an introduction to SAM.

Furthermore I want to take the opportunity to thank Thomas Karlsson and Göran Forsling at the Mathematics department for, during all of my years at the university, always allowing me to ask them questions, and giving me helpful solutions and explanations to mathematical problems and queries.

Last but not least, I would like to thank my family and Anna for all the encouragement and support you have given me throughout the years.



Contents

1 Introduction
   1.1 Background
   1.2 Problem Formulation
   1.3 Autoliv
   1.4 Camera System Overview
   1.5 Related Work

2 System Modeling
   2.1 Coordinate Frames
   2.2 Rotation Matrices
   2.3 Vehicle Process Model
      2.3.1 Basic Model
      2.3.2 Model Extensions
   2.4 Landmark Parametrization
   2.5 Measurement Model

3 Visual Odometry and SLAM
   3.1 Extended Kalman Filter
   3.2 Feature Detection
   3.3 Feature Association
   3.4 Smoothing and Mapping
   3.5 Feature Association Improvement

4 Results
   4.1 Performance Measures
      4.1.1 Landmark Measurement Residuals
      4.1.2 Trajectory Plot in Camera Image
   4.2 SAM and Model Extensions
   4.3 Feature Association Improvement

5 Concluding Remarks
   5.1 Conclusions
      5.1.1 SAM
      5.1.2 Vehicle Process Model Extensions
      5.1.3 Feature Association Improvement
   5.2 Future Work

Bibliography

A Nomenclature
   A.1 Mathematical Notations
   A.2 Abbreviations
   A.3 Landmark States
   A.4 Vehicle States
   A.5 Variables, Parameters and Functions
   A.6 Indices
   A.7 Top Notations

B Proof of Matrix Singularity

C Derivation of Landmark Measurement Residual Limit

D Results

Chapter 1

Introduction

This chapter describes the problem to be solved in this work, and the context of the problem.

1.1 Background

Autoliv Electronics has, together with the Division of Automatic Control at Linköping University, developed an off-line software tool for estimating the sequence of poses of a vehicle (with pose meaning position and orientation), using images taken by a far infrared (FIR) camera and measurements of vehicle acceleration, speed and yaw rate. This tool uses sensor fusion of the vehicle measurements and FIR images in order to perform simultaneous localization and mapping (SLAM), so it is not only the vehicle poses that are estimated, but also the positions of landmarks seen by the camera. Figure 1.1 illustrates the SLAM problem; it shows five consecutive vehicle poses, and measurements of landmarks from the different vehicle positions. Note that although the camera only measures the direction to landmarks, it is possible to determine a landmark's position as the vehicle moves and more measurements of the landmark are gathered.

1.2 Problem Formulation

The problem to be solved in this work is to find out how the off-line estimation of vehicle poses can be improved in terms of accuracy.

Before this work, pose estimation was performed using an extended Kalman filter (EKF). Since the EKF is a causal method, and the pose sequence estimation is performed off-line, using an acausal method was expected to improve the estimation significantly, so the first order of action was to implement such a method. The problem of recovering the best possible estimate of the poses and landmark positions is called smoothing and mapping (SAM, see for example [1]), and the method proposed was linear least squares SAM, which means iteratively linearizing the vehicle and measurement models and finding a pose sequence estimate by solving a linear least squares problem. Other proposed areas with potential for increasing the estimation accuracy were improvements regarding the vehicle model, the feature detection and association algorithms (how to extract measurements from the FIR images), and robustness against moving objects.

Figure 1.1: Simultaneous localization (of the vehicle, illustrated by rectangles) and mapping (of the landmarks, illustrated by circles) is performed by making some sort of position measurements of the landmarks (e.g. bearing measurements) from the different vehicle positions.

Since the ground truth for the vehicle poses is unknown, the estimation accuracy must be evaluated by indirect measures, so an important part of this work is to find good measures of estimation accuracy.

1.3 Autoliv

Autoliv is a worldwide leading developer and manufacturer of automotive safety systems such as seatbelts, airbags and safety electronics, with most of the major automobile manufacturers as its customers.

The main activity of Autoliv's subsidiary company Autoliv Electronics, for whom this work is performed, is to develop and manufacture electronic control units (ECU) for controlling airbag deployment. Not long ago Autoliv Electronics started development of the Night Vision pedestrian detection system [5], a system that in spite of darkness is able to warn drivers when pedestrians are in the vehicle's path, or moving towards this path. Although the system currently does not automatically detect and warn for animals, they can be seen on the Night Vision display, giving the driver a chance to detect them by using the display in a way similar to the use of rear-view mirrors. The Night Vision system, originally without the pedestrian detection, has been available in production cars since 2005. Figure 1.2 shows the Night Vision system in action, along with visual images for comparison.

Figure 1.2: Comparison between visual images and Night Vision images: (a) city, (b) countryside.

1.4 Camera System Overview

The Night Vision camera registers radiation in the far infrared (FIR) region at 30 Hz, with a resolution of 320×240 pixels, and a spectral range of 8–14 µm. An advantage of using a FIR sensor, compared to near infrared (NIR) sensors, is that body heat radiation from humans and animals lies within the FIR region ([13] suggests that it is easier to detect distant pedestrians using FIR compared to using NIR). On the other hand, non-living objects and structures are not registered as clearly by a FIR camera, so for the sake of estimating vehicle poses the FIR camera might not be the best choice. The camera system is passive, meaning that it does not actively illuminate the road ahead with FIR radiation. Figure 1.3 shows the camera when it is mounted to the car.

Figure 1.3: The placement of the Night Vision camera: (a) camera mounting, (b) close-up view of the camera mounting.


1.5 Related Work

The EKF-based estimation of the vehicle poses that Autoliv used before this work is described in [9]. Descriptions of a few versions of SAM can be found in [1] and [11], where [11] describes the application of SAM to a problem similar to the one in this work. The work in [1] is on the other hand more focused on fast execution when using SAM, using factorizations of the information matrix, and [4] describes how to further improve the execution time by using incremental updates of the factorized information matrix.

In [8] it is shown how the uncertainty in the distance (depth) to landmarks in images is more adequately represented by Gaussian noise when the distance parametrization used is the inverse depth than when regular depth is used.

The model improvements for vehicle pitch dynamics evaluated during this work, namely constant offset, a second order model and influence from acceleration, are described in [2].

Bundle adjustment (which is a computer vision term corresponding to what is here referred to as SAM) is compared to filtering in [12]. The authors evaluate bundle adjustment and filtering in terms of the computational cost for decreasing the robot pose estimate covariance, and found that filtering based methods might only be the better choice when the available processing resources are small; they conclude that bundle adjustment is superior in all other cases. Bundle adjustment is still rather costly, and the authors of [10] state that this is because of the choice of a single privileged coordinate frame; a choice which they avoid by taking a relative approach to bundle adjustment. This approach uses some of the poses as references, and for each other pose or landmark there is one reference pose which it is related to.

In the solution developed during this work, feature detection is performed using the Harris corner detector described in [3], and for feature description image patches are used. For association of features in new images, normalized cross-correlation (NCC), see for example [7], between feature patches and the new images is used. A problem with using image patches as feature descriptors is that they are not invariant to changes in scale and rotation, and [9] suggests using the so-called scale-invariant feature transform (SIFT) described in [6], instead of the Harris detector and NCC of image patches. The SIFT algorithm detects and describes features in images, and two of its advantages are that it is invariant to scale and orientation of the features.


Chapter 2

System Modeling

The system of a vehicle moving in unknown surroundings, with measurements of landmarks which are stationary in these surroundings, and measurements of the vehicle motion, is modeled by a state space model. To begin with we have the state of the vehicle (describing the current status of the position and motion of the vehicle) at time t, denoted x^v_t. Based on the vehicle state x^v_{t-1} and some input u_t (in our case the acceleration of the vehicle), the next vehicle state x^v_t is given by

x^v_t = f(x^v_{t-1}, u_t) + w_t,   (2.1a)

where f is the motion model function and w_t is Gaussian process noise which compensates for model errors. The position at time t of the jth landmark is parametrized by its state x^l_{j,t} and the landmark motion model becomes

x^l_{j,t} = x^l_{j,t-1},   (2.1b)

since we assume that all landmarks are stationary. Then we have, at time t, the vehicle measurements y^v_t according to

y^v_t = h^v(x^v_t) + e^v_t,   (2.1c)

where h^v is the vehicle measurement function and e^v_t is Gaussian measurement noise. Finally, the landmark measurement of the jth landmark at time t is denoted y^l_{j,t} and given by

y^l_{j,t} = h^l(x^v_t, x^l_{j,t}) + e^l_{j,t},   (2.1d)

where h^l is the landmark measurement function and e^l_{j,t} is Gaussian measurement noise.

2.1 Coordinate Frames

There are three relevant coordinate frames for the combined vehicle and camera system:

• World (w): This is considered an inertial frame and it is fixed to the surroundings of the vehicle.

• Vehicle body (b): This frame is fixed to the vehicle, with its origin located in the middle of the rear axis. Coordinate frame b coincides with w at the start of a scenario.

• Camera (c): This frame is fixed relative to b, and it is positioned in the optical center of the camera.

For all three coordinate frames the x-axis is pointing forward, the y-axis is pointing to the right and the z-axis is pointing downwards.

The yaw, pitch and roll angles describe rotations around the z-, y- and x-axes respectively, and unless stated otherwise, the positive direction of angles is given by the right-hand grip rule applied to the coordinate frame axis around which the rotation occurs.

2.2 Rotation Matrices

This section describes how the coordinates of a fixed point are transformed between three dimensional coordinate systems that have different orientation but the same origin.

The three basic rotation matrices for three dimensional space are given by

R_x(θ) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & cos θ & −sin θ \\ 0 & sin θ & cos θ \end{pmatrix},   (2.2a)

R_y(θ) = \begin{pmatrix} cos θ & 0 & sin θ \\ 0 & 1 & 0 \\ −sin θ & 0 & cos θ \end{pmatrix},   (2.2b)

R_z(θ) = \begin{pmatrix} cos θ & −sin θ & 0 \\ sin θ & cos θ & 0 \\ 0 & 0 & 1 \end{pmatrix}.   (2.2c)

For any rotation matrix we have that R(−θ) = R(θ)^{−1}, and we also have R^{−1} = R^T by definition.

If the angles yaw (α), pitch (β) and roll (γ) are used to describe the orientation of coordinate frame κ (e.g. a vehicle) as seen from coordinate frame λ (e.g. the world), the rotation matrix R^{κλ} for converting λ-coordinates to κ-coordinates is given by

R^{κλ} = R_x(−γ) R_y(−β) R_z(−α) = R_x(γ)^T R_y(β)^T R_z(α)^T.   (2.3)

Since rotating from κ to λ is the inverse of rotating from λ to κ (R^{λκ} = (R^{κλ})^{−1}) we get

R^{λκ} = R_z(α) R_y(β) R_x(γ).   (2.4)

Equation (2.5) concludes this theory section by defining a notation for the rotation matrix, as a function of the three rotation angles:

R(α, β, γ) = R^{λκ}.   (2.5)
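To make the rotation conventions concrete, the following minimal NumPy sketch composes R(α, β, γ) = R_z(α)R_y(β)R_x(γ) as in (2.4)–(2.5); it is an illustration only, and the function names are hypothetical.

```python
import numpy as np

def Rx(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def R(alpha, beta, gamma):
    # R^{lambda kappa} in (2.4)-(2.5): rotates kappa-coordinates into lambda-coordinates.
    return Rz(alpha) @ Ry(beta) @ Rx(gamma)

# R^{kappa lambda} in (2.3) is simply the transpose: R(alpha, beta, gamma).T
```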

2.3 Vehicle Process Model

This section describes the vehicle process models, in other words the models for how the vehicle states change over time. The first subsection is about the basic model (also described in [9]) that was used before this work, and the remaining subsections are about the model extensions that have been tested during this work.

2.3.1 Basic Model

Before this work, the vehicle process model used was the one described in this section.

With the states described in Table 2.1, the vehicle state vector at time t is given by

x^v_t = (p_t, v_{x,t}, ψ_t, δ_t, α_t, φ_t)^T,   (2.6a)

p_t = (p_{x,t}, p_{y,t}, p_{z,t})^T.   (2.6b)

The input signal u_t is given by

u_t = v̇_{x,t},   (2.7)

where the vehicle acceleration v̇_{x,t} in practice is measured. By treating v̇_{x,t} as an input signal instead of as a measurement we avoid having to incorporate it as a state in the vehicle state vector.

With T as the sampling time, L as the wheel base of the vehicle and C as a pitch damping parameter, we have


State   Description
p       Vehicle position in world coordinates.
v_x     Velocity of the vehicle in its forward direction.
ψ       Yaw angle (z-axis rotation), relative to the world coordinate frame.
δ       Front wheel angle, relative to the vehicle's forward direction.
α       Road pitch angle, relative to the world coordinate frame xy-plane.
φ       Pitch angle of the vehicle, relative to the road.

Table 2.1: This table contains short descriptions of the vehicle states.

f(x^v_t, u_{t+1}) = \begin{pmatrix} p_{x,t} + T v_{x,t} cos ψ_t cos α_t \\ p_{y,t} + T v_{x,t} sin ψ_t cos α_t \\ p_{z,t} − T v_{x,t} sin α_t \\ v_{x,t} + T u_{t+1} \\ ψ_t + (T v_{x,t}/L) tan δ_t \\ δ_t \\ α_t \\ C φ_t \end{pmatrix},   (2.8)

which describes the vehicle process model.
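As an illustration (not the thesis implementation), a minimal NumPy sketch of the basic process model (2.8); the state ordering follows Table 2.1 and equation (2.6a).

```python
import numpy as np

def f_basic(xv, u_next, T, L, C):
    """Basic vehicle process model (2.8).
    xv = [px, py, pz, vx, psi, delta, alpha, phi]; u_next is the measured acceleration."""
    px, py, pz, vx, psi, delta, alpha, phi = xv
    return np.array([
        px + T * vx * np.cos(psi) * np.cos(alpha),
        py + T * vx * np.sin(psi) * np.cos(alpha),
        pz - T * vx * np.sin(alpha),
        vx + T * u_next,
        psi + T * vx / L * np.tan(delta),
        delta,
        alpha,
        C * phi,
    ])
```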

The vehicle process noise w_t is independent and Gaussian, according to

w_t = B(x^v_{t−1}) ω_t ∼ N(0, Q(x^v_{t−1})),   (2.9a)

Q(x^v_t) = B(x^v_t) Q_ω B(x^v_t)^T,   (2.9b)

B(x^v_t) = \begin{pmatrix} T cos ψ_t cos α_t & 0 \\ T sin ψ_t cos α_t & 0 \\ T sin α_t & 0 \\ 1 & 0 \\ 0 & I_{4×4} \end{pmatrix},   (2.9c)

ω_t = (ω^{v_x}_t, ω^ψ_t, ω^δ_t, ω^α_t, ω^φ_t)^T ∼ N(0, Q_ω),   (2.9d)

Q_ω = diag(q_{v_x}, q_ψ, q_δ, q_α, q_φ),   (2.9e)

where all the q-variables are process noise variance parameters.


2.3.2 Model Extensions

This section describes the vehicle process model extensions tested during this work. Note that these model extensions can be used together in any combination.

We need to introduce some new notation: in order to describe the vehicle process model for one of the vehicle states, we use superscripts. For example, for the yaw angle (ψ) process model we would use the notation f^ψ.

Constant Offset in Car Pitch Angle

Since the stationary pitch angle of the camera, relative to the road, might be non-zero, due to the current load in the vehicle or misaligned camera mounting, the state for the camera pitch offset φ^0 is added to the state vector. This offset state is interpreted as the stationary car pitch angle, around which the car pitch angle oscillates. The process model for the car pitch angle φ and the car pitch angle offset φ^0 becomes

f^φ(x^v_t, u_{t+1}) = C(φ_t − φ^0_t) + φ^0_t,   (2.10a)

f^{φ^0}(x^v_t, u_{t+1}) = φ^0_t,   (2.10b)

where C is the previously mentioned pitch damping parameter.

The camera pitch offset models a constant offset angle, so the process noise variance for φ^0_t is zero. The process noise variance for φ is independent of whether φ^0 is included in the vehicle state and process model or not.

Acceleration Offset in Car Pitch Angle

Acceleration (including deceleration) of the vehicle has significant effects on the car pitch angle. The model of this effect is that the stationary car pitch angle, when the acceleration u is constant, becomes Ku, where K is a vehicle geometry dependent parameter. This leads to

f^φ(x^v_t, u_{t+1}) = C(φ_t − K u_{t+1}) + K u_{t+1}.   (2.11)

This model is very similar to the constant offset model for car pitch; the differences are that for acceleration no new states are required, and the offset due to acceleration is not necessarily constant.

Note that the choice of using a linear relation between acceleration and stationary car pitch angle is mainly motivated by the fact that the springs, which are part of the vehicle suspension, have a linear relation between displacement and force (this is known as Hooke's law).

The process noise variance for φ should not be changed when adding the car pitch acceleration offset to the process model.


Second Order Car Pitch Model

A natural extension of the first order model in (2.8) for the car pitch is to use a second order model, modeling the car pitch as a damped harmonic oscillator, instead of as the oscillation-free damping which the first order model describes.

With τ as the characteristic time (sometimes called relaxation time) and f_n as the natural frequency of the car pitch system, the differential equation for car pitch motion is

φ̈ + (2/τ) φ̇ + (2π f_n)^2 φ = torque / moment of inertia.   (2.12)

Since the torque is unknown it will be represented by process noise and subsequently set to zero in

φ̈ + (2/τ) φ̇ + (2π f_n)^2 φ = 0,   (2.13)

which is the time-continuous equation (without process noise) that the car pitch process model will be based on.

Discretization and approximations of derivatives are performed according to

φ_t = φ(tT),   (2.14a)

φ̇_t = φ̇(tT) ≈ (φ_{t+1} − φ_t)/T,   (2.14b)

φ̈_t = φ̈(tT) ≈ (φ̇_{t+1} − φ̇_t)/T.   (2.14c)

The order of the linear ordinary differential equation in (2.13) is two. This means that a state space representation of the car pitch system requires two states; φ is already a vehicle state, and φ̇ is added to the vehicle state vector.

In the absence of process noise, we can identify φ_{t+1} as f^φ(x^v_t, u_{t+1}) and φ̇_{t+1} as f^{φ̇}(x^v_t, u_{t+1}), so (2.13) and (2.14) give

f^φ(x^v_t, u_{t+1}) = φ_t + T φ̇_t,   (2.15a)

f^{φ̇}(x^v_t, u_{t+1}) = (1 − 2T/τ) φ̇_t − T (2π f_n)^2 φ_t.   (2.15b)

For the second order model of the car pitch to have the same characteristic time τ (i.e. the same damping) as the first order model in (2.8) we seek τ = τ(C). The first order model in (2.8) is a discretization of the continuous time model

φ̇ + (1/τ) φ = 0.   (2.16)


Using (2.14) to discretize (2.16) results in

φ_{t+1} ≈ (1 − T/τ) φ_t.   (2.17)

A comparison of (2.8) and (2.17) finally gives

τ ≈ T/(1 − C).   (2.18)

Using the second order model for car pitch, the process noise variance for φ is lowered by an order of magnitude compared to the process noise variance when using the basic model, making φ tightly connected to φ̇. The process noise variance for φ̇ is then set to approximately correspond to that of φ when the basic model is used, resulting in a value one or two orders of magnitude higher than the basic φ process noise variance.
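A small sketch of one step of the discretized second order pitch model (2.15), with τ chosen from C via (2.18); the function is illustrative only.

```python
import numpy as np

def pitch_second_order(phi, phi_dot, T, C, fn):
    """One step of the damped harmonic oscillator model (2.15a)-(2.15b)."""
    tau = T / (1.0 - C)           # (2.18): same damping as the first order model
    phi_next = phi + T * phi_dot  # (2.15a)
    phi_dot_next = (1.0 - 2.0 * T / tau) * phi_dot - T * (2.0 * np.pi * fn) ** 2 * phi  # (2.15b)
    return phi_next, phi_dot_next
```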

Roll Angle

For several reasons, such as that curves on country roads are banked, and that the car may roll quite a bit when driving on uneven roads such as dirt roads, letting the roll angle be constant at zero, which is done in the basic model where roll is not a vehicle state, might be an inadequate approximation. The process model for the combined roll angle γ of the car and road is given by

f^γ(x^v_t, u_{t+1}) = γ_t.   (2.19)

Since the roll and pitch angles of automobiles have similar behaviour in terms of amplitude and natural frequency, the process noise variance for the roll angle is set to be approximately the same as the basic model car pitch process noise variance.

2.4 Landmark Parametrization

In this section the landmark state is described. At time t we have M_t visible landmarks. (Visible means that the landmark has been measured; a landmark may very well be non-visible although it is present in the FIR image, but it cannot be visible if it is not in the image.) The landmark index, which is its identification, of visible landmark number i ∈ {1, 2, . . . , M_t} at time t is denoted j_t(i).

With the landmark states described in Table 2.2, the landmark state vector is given by


x^l_t = (x^l_{j_t(1),t}, x^l_{j_t(2),t}, . . . , x^l_{j_t(M_t),t})^T,   (2.20a)

x^l_{j,t} = (k^w_{j,t}, θ^w_{j,t}, ϕ^w_{j,t}, ρ_{j,t})^T,   (2.20b)

k^w_{j,t} = (k^w_{j,t,x}, k^w_{j,t,y}, k^w_{j,t,z})^T.   (2.20c)

State   Description
k^w     The position in world coordinates of the camera at the time when the landmark was first seen.
θ^w     The azimuth angle of the landmark as seen from k^w, relative to world coordinate frame directions.
ϕ^w     The elevation angle of the landmark as seen from k^w, relative to world coordinate frame directions, with positive angles towards the positive z-axis.
ρ       The inverse depth (which is the inverse of the distance) from k^w to the landmark.

Table 2.2: This table contains short descriptions of the landmark states.

The reason for using inverse depth rather than regular distance is that we want a more natural way to assign uncertainty to the estimates of distance. Since the uncertainty of state estimates will be represented by normal distributions, we get higher uncertainty for large distances and lower uncertainty for small distances by using inverse depth, and this is precisely what we want.

The landmark state x^l_{j,t} is a parametrization of the position l^w_{j,t} of landmark j at time t, and the relationship between position and state, with the landmark position given in world coordinates, is

l^w_{j,t} = k^w_{j,t} + (1/ρ_{j,t}) m^w_{j,t},   where   m^w_{j,t} = \begin{pmatrix} cos ϕ^w_{j,t} cos θ^w_{j,t} \\ cos ϕ^w_{j,t} sin θ^w_{j,t} \\ sin ϕ^w_{j,t} \end{pmatrix}.   (2.21)
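A sketch of the mapping (2.21) from an inverse depth landmark state to a world position, following the state layout of Table 2.2; the helper is hypothetical.

```python
import numpy as np

def landmark_world_position(kw, theta, phi, rho):
    """Implements (2.21): world position of a landmark from its inverse depth state."""
    m = np.array([
        np.cos(phi) * np.cos(theta),
        np.cos(phi) * np.sin(theta),
        np.sin(phi),
    ])
    return kw + m / rho
```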


2.5 Measurement Model

The model for the vehicle measurements is

h^v(x^v_t) = \begin{pmatrix} v_{x,t} \\ ψ̇_t \end{pmatrix} = \begin{pmatrix} v_{x,t} \\ (v_{x,t}/L) tan δ_t \end{pmatrix},   (2.22)

where L is the wheel base of the vehicle. The landmark measurement model is given by

h^l(x^v_t, x^l_{j,t}) = P_n(p^c_{j,t}) = (1/p^c_{j,t,x}) \begin{pmatrix} p^c_{j,t,y} \\ p^c_{j,t,z} \end{pmatrix},   (2.23a)

p^c_{j,t} = (p^c_{j,t,x}, p^c_{j,t,y}, p^c_{j,t,z})^T = p^c(x^v_t, x^l_{j,t}) = (1/ρ_{j,t}) R^{cb} ( R^{bw} ( ρ_{j,t}(k^w_{j,t} − p_t) + m^w_{j,t} ) − ρ_{j,t} c^b ),   (2.23b)

R^{cb} = R(α_c, β_c, γ_c)^T,   (2.23c)

R^{bw} = \begin{cases} R(ψ_t, α_t + φ_t, γ_t)^T & \text{if the model contains } γ, \\ R(ψ_t, α_t + φ_t, 0)^T & \text{otherwise,} \end{cases}   (2.23d)

where c^b is the position of the camera in the vehicle body coordinate frame, and P_n(p^c) is the so-called normalized pinhole projection of a point p^c, which is given in camera coordinates. Furthermore, P_n generates normalized camera coordinates, and α_c, β_c and γ_c are the yaw, pitch and roll angles of the camera, relative to the vehicle body coordinate frame.

Both of the measurement noises e^v_t and e^l_{j,t} are independent and Gaussian, according to

e^v_t ∼ N(0, S^v),   (2.24a)

e^l_{j,t} ∼ N(0, S^l),   (2.24b)

S^v = \begin{pmatrix} s_{v_x} & 0 \\ 0 & s_ψ \end{pmatrix},   (2.24c)

S^l = \begin{pmatrix} s_c & 0 \\ 0 & s_c \end{pmatrix},   (2.24d)

where all the s-variables are measurement noise variance parameters.

The FIR camera is of course digital, so it generates images consisting of pixels. To translate between pixel coordinates (ȳ, z̄)^T and normalized camera coordinates (y, z)^T, which is the kind of coordinates the landmark measurements y^l_{j,t} are given in, we use

\begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} (ȳ − y_{ic})/f_y \\ (z̄ − z_{ic})/f_z \end{pmatrix},   (2.25)

where (y_{ic}, z_{ic})^T denotes the image center (which is the intersection of the optical axis and the image plane), and f_y and f_z are the focal lengths (given in pixels) in the y-direction and z-direction respectively.
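A compact sketch of the normalized pinhole projection (2.23a) and the pixel-to-normalized-coordinate conversion (2.25); the camera parameters are placeholders and the functions are illustrative.

```python
import numpy as np

def pinhole_normalized(pc):
    """Normalized pinhole projection P_n in (2.23a); pc is given in camera coordinates."""
    return np.array([pc[1] / pc[0], pc[2] / pc[0]])

def pixels_to_normalized(y_pix, z_pix, y_ic, z_ic, fy, fz):
    """Conversion (2.25) from pixel coordinates to normalized camera coordinates."""
    return np.array([(y_pix - y_ic) / fy, (z_pix - z_ic) / fz])
```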


Chapter 3

Visual Odometry and SLAM

This chapter describes how the estimation of vehicle and landmark states is performed, by first explaining how the extended Kalman filter (EKF) is applied to our problem, and then describing the smoothing and mapping (SAM) algorithm, which can be seen as a method to refine the state estimates from the EKF. Finally, an improvement of the feature association is described.

3.1 Extended Kalman Filter

With x^v_t and x^l_t given by (2.6a) and (2.20a) respectively, the true state vector at time t is given by

x_t = \begin{pmatrix} x^v_t \\ x^l_t \end{pmatrix},   (3.1)

and x̂_{t|s} is the estimate of x_t using measurements up until time s ∈ {t − 1, t}.

If the model and the initial values of x̂ and P are accurate, then we have P_{t|s} = cov(x_t − x̂_{t|s}). The model used is of course not perfectly accurate, but we can still interpret P_{t|s} as a measure of how well x̂_{t|s} approximates x_t. However, because of the linearization done in the EKF, P tends to be underestimated, saying that x̂ is less uncertain than it actually is, and this must be kept in mind if we are looking at the actual values in P (e.g. when plotting the confidence bounds for landmark positions). P_{t|s} can be decomposed according to

P_{t|s} = \begin{pmatrix} P^v_{t|s} & P^{vl}_{t|s} \\ P^{lv}_{t|s} & P^l_{t|s} \end{pmatrix},   (3.2)

where P^v_{t|s} is the covariance of the vehicle states, P^{vl}_{t|s} and P^{lv}_{t|s} are the covariances between the landmark states and the vehicle states, and P^l_{t|s} is the covariance of the landmark states.

The vector y^l_t contains all the landmark measurements for time t, and h^l_t(x_t) is the corresponding measurement function; they are defined in (3.3). Remember from Chapter 2 that the landmarks which are measured at time t have indices j_t(i), ∀i ∈ {1, 2, . . . , M_t}, where M_t is the number of measured landmarks at time t. We have that

y^l_t = \begin{pmatrix} y^l_{j_t(1),t} \\ y^l_{j_t(2),t} \\ \vdots \\ y^l_{j_t(M_t),t} \end{pmatrix},   (3.3a)

h^l_t(x_t) = \begin{pmatrix} h^l(x^v_t, x^l_{j_t(1),t}) \\ h^l(x^v_t, x^l_{j_t(2),t}) \\ \vdots \\ h^l(x^v_t, x^l_{j_t(M_t),t}) \end{pmatrix}.   (3.3b)

The EKF uses linearized measurement and process models given by

F_t = ∂f(x^v_{t−1}, u_t)/∂x^v_{t−1} |_{x^v_{t−1} = x̂^v_{t−1|t−1}},   (3.4a)

H^v_t = ∂h^v(x^v_t)/∂x^v_t |_{x^v_t = x̂^v_{t|t−1}},   (3.4b)

H^l_t = ∂h^l_t(x_t)/∂x_t |_{x_t = x̂_{t|t−1}}.   (3.4c)

We then get the time update (i.e. the prediction) of the states according to

x̂^v_{t|t−1} = f(x̂^v_{t−1|t−1}, u_t),   (3.5a)

x̂^l_{t|t−1} = x̂^l_{t−1|t−1},   (3.5b)

P_{t|t−1} = \begin{pmatrix} F_t & 0 \\ 0 & I \end{pmatrix} P_{t−1|t−1} \begin{pmatrix} F_t & 0 \\ 0 & I \end{pmatrix}^T + \begin{pmatrix} Q_t & 0 \\ 0 & 0 \end{pmatrix},   (3.5c)

where Q_t = Q(x̂^v_{t−1|t−1}) (see (2.9)).

The next step is the measurement update, which adjusts the state predictions from the time update, using measurements. S^l_t is the noise covariance for the measurement y^l_t, so it is a block diagonal matrix with M_t of the S^l matrices on its diagonal. With S^l and S^v given by (2.24), the measurement update is given by

x̂_{t|t} = x̂_{t|t−1} + K^v_t (y^v_t − h^v(x̂^v_{t|t−1})) + K^l_t (y^l_t − h^l_t(x̂_{t|t−1})),   (3.6a)

P_{t|t} = P_{t|t−1} − K^v_t (H^v_t  0) P_{t|t−1} − K^l_t H^l_t P_{t|t−1},   (3.6b)

K^v_t = P_{t|t−1} (H^v_t  0)^T ( (H^v_t  0) P_{t|t−1} (H^v_t  0)^T + S^v )^{−1},   (3.6c)

K^l_t = P_{t|t−1} (H^l_t)^T ( H^l_t P_{t|t−1} (H^l_t)^T + S^l_t )^{−1}.   (3.6d)
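The updates (3.5)–(3.6) follow the standard EKF pattern. The sketch below is a generic single-measurement EKF step in NumPy, given here only as an illustration; the thesis implementation splits the vehicle and landmark measurements into the two separate gains K^v_t and K^l_t, and all function handles here are hypothetical.

```python
import numpy as np

def ekf_step(x, P, u, y, f, F_jac, h, H_jac, Q, S):
    """One EKF iteration: time update followed by measurement update,
    written for a single stacked measurement y with noise covariance S."""
    # Time update (prediction)
    x_pred = f(x, u)
    F = F_jac(x, u)
    P_pred = F @ P @ F.T + Q
    # Measurement update
    H = H_jac(x_pred)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + S)
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = P_pred - K @ H @ P_pred
    return x_new, P_new
```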

When a new feature is extracted from an image, a corresponding landmark is added to the landmark state vector x̂_{t|t}, and the covariance matrix P_{t|t} must be expanded to include the uncertainty of the new landmark's states. With p as the vehicle position, ρ_0 as the initial inverse depth, P_{ρ_0} as the initial variance for the inverse depth, y^{nl} = (y^{nl}_y, y^{nl}_z)^T as the measured position of the feature (in normalized camera coordinates), and superscript nl meaning new landmark, the new state vector x̂^{new}_{t|t} and covariance matrix P^{new}_{t|t} are given by

x̂^{new}_{t|t} = \begin{pmatrix} x̂_{t|t} \\ x^{nl} \end{pmatrix},   (3.7a)

x^{nl} = \begin{pmatrix} p̂_{t|t} + R^{wb} c^b \\ arctan y^{nl}_y + ψ̂_{t|t} \\ arctan y^{nl}_z − φ̂_{t|t} \\ ρ_0 \end{pmatrix},   (3.7b)

R^{wb} = \begin{cases} R(ψ_t, α_t + φ_t, γ_t) & \text{if the model contains } γ, \\ R(ψ_t, α_t + φ_t, 0) & \text{otherwise,} \end{cases}   (3.7c)

P^{new}_{t|t} = J \begin{pmatrix} P_{t|t} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & S^l & 0 \\ 0 & 0 & 0 & P_{ρ_0} \end{pmatrix} J^T,   (3.7d)

J = \begin{pmatrix} I & 0 & 0 & 0 \\ ∂x^{nl}/∂p̂_{t|t} \; 0 \; ∂x^{nl}/∂ψ̂_{t|t} \; 0 \; 0 \; ∂x^{nl}/∂φ̂_{t|t} \; 0 \cdots 0 & 0 & ∂x^{nl}/∂y^{nl} & ∂x^{nl}/∂ρ_0 \end{pmatrix},   (3.7e)

where the block partitioning, placed at the same position in the different matrices (which are of equal size), separates the old covariance from the covariance of the new landmark state. Remember that c^b is the position of the camera in the vehicle body coordinate frame, and note that the covariance for the new landmark's camera position state will be given by the covariance for the vehicle position in combination with ∂x^{nl}/∂p̂_{t|t}.

Due to the linearization of the non-linear landmark measurement function h^l, it sometimes happens that the latest estimate of a landmark's inverse depth is negative. This can happen to landmarks that are distant but have measurements indicating that the current distance estimate is too low, so the distance is increased by lowering the inverse depth estimate. To avoid this problem, all inverse depth estimates in x̂^l_{t|t} are forced to be above a positive threshold ρ_min, thereby setting a maximum allowed distance for landmarks. If ρ_min is small enough, making the maximum allowed distance large enough, the measurement errors (for landmarks further away than the maximum allowed distance) due to forcing the inverse depth estimates are negligible.

Algorithm 1 describes the EKF algorithm used. P^v_{1|0} is the initial vehicle state covariance, t_f is the interval for when to search for new features, and M_max is the maximal number of visible landmarks.

Algorithm 1 Sensor fusion using EKF
1: Initialize all elements in x̂^v_{1|1} to zero, except for v̂_{x,1|1} and δ̂_{1|1}, which are given by the measurements y^v_1 using (2.22) with y^v_1 = h^v(x̂^v_{1|1}). Let P_{1|1} = P^v_{1|0}.
2: Generate y^l_1 by extracting features from the first image.
3: Expand x̂_{1|1} and P_{1|1} according to (3.7) using y^l_1.
4: for t = 2 to N do
5:   Generate x̂_{t|t} from x̂_{t−1|t−1} and P_{t|t} from P_{t−1|t−1} using the vehicle measurements y^v_t, but no landmark measurements.
6:   Predict landmark measurements.
7:   Associate features, resulting in landmark measurements y^l_t. (Landmarks whose features cannot be associated with high precision are removed.)
8:   Update x̂_{t|t} and P_{t|t} using the landmark measurements y^l_t. (Landmarks with highly unlikely measurements are removed.)
9:   for all j do
10:    Make sure ρ̂_{j,t|t} ≥ ρ_min.
11:  end for
12:  Make sure P_{t|t} is symmetric.
13:  if t ≡ 0 (mod t_f) and M_t < M_max then
14:    Extract new features (at most M_max − M_t new features).
15:  end if
16: end for

3.2 Feature Detection

In order to find distinctive, and thereby trackable, features in the image, some sort of feature detection must be performed. In this work the so-called Harris detector (see [3]) is used; this detector finds areas where the image contains significant change in more than one direction (thus for example corners, but not straight lines, will be detected). The feature detection algorithm, used in Algorithm 1 and given by Algorithm 2, is run at fixed time intervals, and only if the number of visible landmarks M_t is lower than the maximum allowed number of visible landmarks M_max.


Algorithm 2 Feature detection
1: Smooth the image in order to suppress noise.
2: Compute the image gradient.
3: Compute the Harris measure from the image gradient.
4: Mask the image where the road is expected to be, near the image borders, and around predicted positions of existing features, by setting the Harris measure to zero in these regions.
5: Set n_M = M_max − M_t.
6: while n_M > 0 and the maximum of the Harris measure is above a threshold do
7:   Extract a fixed-size image patch for the feature around the position of the maximum of the Harris measure.
8:   Mask the image around the new feature, by setting the Harris measure to zero in this region.
9:   Decrease n_M by one.
10: end while
11: for all extracted features do
12:  Expand the landmark state and covariance with the new landmark that the extracted feature represents.
13: end for
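A minimal Harris-measure computation in the spirit of steps 1–3 of Algorithm 2 (the masking and patch extraction steps are omitted); the smoothing scale sigma and the constant k are illustrative choices, not values from the thesis.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_measure(image, sigma=1.0, k=0.04):
    """Harris corner measure: smooth, differentiate, form the structure tensor."""
    img = gaussian_filter(image.astype(float), sigma)   # step 1: suppress noise
    gy, gx = np.gradient(img)                           # step 2: image gradient
    Sxx = gaussian_filter(gx * gx, sigma)                # step 3: structure tensor
    Syy = gaussian_filter(gy * gy, sigma)
    Sxy = gaussian_filter(gx * gy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2
```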

3.3 Feature Association

In order to track a feature in a sequence of frames we must perform association, i.e. determine the position of the feature in question in all of the images in the sequence. In this work features are represented and characterized by a small image patch of each feature. Association of one feature image patch with a camera image, i.e. determining where in the camera image the feature image patch (and thus the feature) is located, is performed using normalized cross-correlation (NCC). The feature association in Algorithm 1 is performed according to Algorithm 3 below.
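As an illustration of the NCC matching step used in Algorithm 3 (steps 5–6), a hypothetical patch-matching helper is sketched below; it scores every placement of the patch inside a search region and returns the best position and score.

```python
import numpy as np

def ncc_match(search_region, patch):
    """Return the (row, col) in search_region where the normalized cross-correlation
    with patch is largest, together with that NCC value."""
    ph, pw = patch.shape
    p = (patch - patch.mean()) / (patch.std() + 1e-12)
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(search_region.shape[0] - ph + 1):
        for c in range(search_region.shape[1] - pw + 1):
            w = search_region[r:r + ph, c:c + pw]
            wn = (w - w.mean()) / (w.std() + 1e-12)
            score = float(np.mean(p * wn))
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```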

3.4 Smoothing and Mapping

The idea behind smoothing and mapping (SAM) is to simultaneously estimate the vehicle states and the landmark states, utilizing previous, present and future measurements, as opposed to the EKF which cannot utilize future measurements. This acausal estimation is performed by solving weighted least squares problems iteratively, until the desired accuracy of the converging solution is achieved.

Since the landmarks are stationary, each landmark is represented by a time-independent landmark state. The vehicle however is not stationary, so the vehicle states for all the time steps t ∈ {1, 2, . . . , N} are included in the state vector of the SAM algorithm. Mathematical descriptions of the states used in the SAM algorithm can be found in (3.8).


Algorithm 3 Feature association
1: for all features in x̂^l_t do
2:   Predict the feature position in the image.
3:   Select a search region around the predicted position.
4:   if the search region is completely within the image then
5:     Compute the normalized cross-correlation (NCC) of the current image (limited to the search region) and the feature image patch representing the landmark.
6:     Find the maximum of the NCC.
7:     Compute the maximum of the NCC when a small area around the original maximum has been zeroed out.
8:     if the second maximum NCC-value is not close to the original maximum then
9:       Save the position of the original maximum as the new measurement of the feature.
10:    end if
11:  end if
12: end for

Not all landmarks from the EKF run are included in the SAM run; a threshold for the minimum number of times a landmark has been visible is used to sort out landmarks with few measurements. With M landmarks included in the SAM run, the landmark index of landmark number i ∈ {1, 2, . . . , M} in the SAM run is denoted j(i). The state vector x, containing both landmark states and vehicle states, is given by

x = \begin{pmatrix} x^v \\ x^l \end{pmatrix},   (3.8a)

x^v = (x^v_1, x^v_2, . . . , x^v_N)^T,   (3.8b)

x^l = (x^l_{j(1)}, x^l_{j(2)}, . . . , x^l_{j(M)})^T,   (3.8c)

x^l_j = (θ^w_j, ϕ^w_j, ρ_j)^T.   (3.8d)

The variable x̂ is the current estimate of x. During the first SAM iteration x̂ is based on the state estimate from the EKF. The vehicle states x̂^v_t are simply collected from all the time steps of the EKF run. Landmark states x̂^l_j, on the other hand, are based on the last estimate by the EKF of the landmark state in question.

Instead of including the camera positions in the landmark states x^l_j, we use the vehicle state x^v_{t_c(j)}, from the time t_c(j) when landmark j was first seen, to calculate the camera position. This can be done since we assume that the camera is firmly mounted in the car. We can recompose a landmark state using the function g, which returns the complete landmark state (i.e. camera position, azimuth angle, elevation angle and inverse depth) according to

g(x^v_{t_c(j)}, x^l_j) = \begin{pmatrix} p_{t_c(j)} + R^{wb} c^b \\ x^l_j \end{pmatrix},   (3.9a)

R^{wb} = \begin{cases} R(ψ_{t_c(j)}, α_{t_c(j)} + φ_{t_c(j)}, γ_{t_c(j)}) & \text{if the model contains } γ, \\ R(ψ_{t_c(j)}, α_{t_c(j)} + φ_{t_c(j)}, 0) & \text{otherwise.} \end{cases}   (3.9b)

Note that x^l_{j,t}, with time index, denotes the landmark state that includes the camera position, whereas x^l_j, without time index, denotes the landmark state that excludes the camera position.

Since the number of landmark measurements y^l_{j,t} is not the same for every time step, k is used to enumerate the complete series of landmark measurements. We introduce the notation j_k for the index of the landmark associated with measurement k, and similarly t_k for the time when measurement k was performed. Using the just introduced notation, y^l_{j_k,t_k} is the measurement we get from landmark measurement number k.

Vehicle measurements y^v_t are performed once every time step, so no notation other than the time t is required to enumerate these measurements.

Linearization at x = x̂ of the process model and measurement model, and using the fact that the input signal u_t is non-variable (it is actually a measurement although we treat it as an input signal), results in

x̂^v_t + δx^v_t = f(x̂^v_{t−1}, u_t) + F_t δx^v_{t−1} + w_t,   (3.10a)

y^v_t = h^v(x̂^v_t) + H^v_t δx^v_t + e^v_t,   (3.10b)

y^l_{j_k,t_k} = h^l(x̂^v_{t_k}, g(x̂^v_{t_c(j_k)}, x̂^l_{j_k})) + H^l_k δx^v_{t_k} +   (3.10c)
               + H^{l,c}_k δx^v_{t_c(j_k)} + J_k δx^l_{j_k} + e^l_{j_k,t_k},   (3.10d)

where

F_t = ∂f(x^v_{t−1}, u_t)/∂x^v_{t−1} |_{x^v_{t−1} = x̂^v_{t−1}},   (3.11a)

H^v_t = ∂h^v(x^v_t)/∂x^v_t |_{x^v_t = x̂^v_t},   (3.11b)

H^l_k = ∂h^l(x^v_t, g(x̂^v_{t_c(j_k)}, x̂^l_{j_k}))/∂x^v_t |_{x^v_t = x̂^v_{t_k}},   (3.11c)

H^{l,c}_k = ∂h^l(x̂^v_{t_k}, g(x^v_{t_c}, x̂^l_{j_k}))/∂x^v_{t_c} |_{x^v_{t_c} = x̂^v_{t_c(j_k)}},   (3.11d)

J_k = ∂h^l(x̂^v_{t_k}, g(x̂^v_{t_c(j_k)}, x^l_j))/∂x^l_j |_{x^l_j = x̂^l_{j_k}},   (3.11e)

are Jacobians of the process model and measurement model.

By using the residuals

a_t = x̂^v_t − f(x̂^v_{t−1}, u_t),   (3.12a)

c^v_t = y^v_t − h^v(x̂^v_t),   (3.12b)

c^l_k = y^l_{j_k,t_k} − h^l(x̂^v_{t_k}, g(x̂^v_{t_c(j_k)}, x̂^l_{j_k})),   (3.12c)

we can form a least squares optimization expression, minimizing the sum of squared weighted measurement and process noises, according to

δx* = arg min_{δx} ( Σ_t ( ‖w_t‖²_{Q_t^{−1}} + ‖e^v_t‖²_{(S^v)^{−1}} ) + Σ_k ‖e^l_{j_k,t_k}‖²_{(S^l)^{−1}} )

    = arg min_{δx} ( Σ_t ( ‖F_t δx^v_{t−1} − δx^v_t − a_t‖²_{Q_t^{−1}} + ‖H^v_t δx^v_t − c^v_t‖²_{(S^v)^{−1}} )
                   + Σ_k ‖H^l_k δx^v_{t_k} + H^{l,c}_k δx^v_{t_c(j_k)} + J_k δx^l_{j_k} − c^l_k‖²_{(S^l)^{−1}} ),   (3.13)

where Q_t = Q(x̂^v_{t−1}) (see (2.9)). Note that for t = 1 we need x̂^v_0, which we don't have, so we let F_1 = 0, a_1 = 0 and Q_1 = P^v_{1|0}, which is the initial vehicle state covariance for the EKF estimation. This way the resulting term ‖δx^v_1‖²_{Q_1^{−1}} makes sure that the smoothed state estimate x̂_s stays reasonably close to x̂.

The vector norm ‖·‖_{P^{−1}} used in (3.13) is the so-called Mahalanobis distance, and it is given by

‖e‖²_{P^{−1}} = e^T P^{−1} e = (P^{−T/2} e)^T (P^{−T/2} e) = ‖P^{−T/2} e‖²_2.   (3.14)

A problem with using the Mahalanobis distance is that Q_t is singular (see Appendix B for a proof), so its inverse does not exist. To get around this problem we add ϵI, where ϵ > 0 is a small number, to Q_t before the inversion.
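A small sketch of how a residual can be whitened according to the Mahalanobis weighting (3.14), including the ϵI regularization used when the covariance is singular; eps is a tuning value chosen here for illustration.

```python
import numpy as np

def whiten(residual, cov, eps=1e-9):
    """Return P^{-T/2} e for the Mahalanobis norm (3.14); eps*I handles singular covariances."""
    L = np.linalg.cholesky(cov + eps * np.eye(cov.shape[0]))  # cov ~ L L^T
    return np.linalg.solve(L, residual)                       # L^{-1} e, so ||.||_2^2 = e^T cov^{-1} e
```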

By collecting all the weighted Jacobians (Q_t^{−T/2} F_t, Q_t^{−T/2}, (S^v)^{−T/2} H^v_t, . . . ) in a matrix A, and stacking all the weighted residuals (Q_t^{−T/2} a_t, (S^v)^{−T/2} c^v_t and (S^l)^{−T/2} c^l_k) in a vector b, the least squares optimization can be rewritten in the form of a standard linear least squares problem, according to

δx* = arg min_{δx} ‖A δx − b‖²_2,   (3.15)

and an example of the structure of A is available in Example 3.1. Thereafter the sought smoothed state vector x̂_s is obtained according to

x̂_s = x̂ + δx*.   (3.16)

In order to get good accuracy for the state estimate we repeat the SAM algorithm, using x̂ = x̂_s, until δx* becomes smaller than some predefined threshold.
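The iteration (3.15)–(3.16) is an ordinary Gauss-Newton loop over a linear least squares problem. A schematic sketch, with a hypothetical build_system callback that assembles A and b as in Algorithm 4, could look as follows; in practice A is large and sparse, so a sparse factorization (as discussed in [1]) would replace the dense lstsq call.

```python
import numpy as np

def sam_refine(x_hat, build_system, tol=1e-6, max_iter=20):
    """Iteratively solve (3.15) and apply the update (3.16) until convergence.
    build_system(x_hat) is assumed to return the stacked, weighted (A, b)."""
    for _ in range(max_iter):
        A, b = build_system(x_hat)
        dx, *_ = np.linalg.lstsq(A, b, rcond=None)  # delta x* = arg min ||A dx - b||^2
        x_hat = x_hat + dx                           # (3.16)
        if np.max(np.abs(dx)) < tol:                 # same stop criterion as Algorithm 4
            break
    return x_hat
```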

Example 3.1
To show the structure of the matrix A, an example scenario is used to illustrate how A is composed of the Jacobians H^v_t, H^l_k, H^{l,c}_k and J_k, but with all the weight matrices omitted for better visibility (the symbol ∼ used later on is to be interpreted as "roughly similar to"). The example scenario has:

• N = 5 time steps.

• M = 2 landmarks.

• Measurements of landmark j = 1 at times t ∈ {1, 2, 3, 4}.

• Measurements of landmark j = 2 at times t ∈ {3, 4, 5}.

This results in the following landmark measurements, enumerated by k:

k    1  2  3  4  5  6  7
t_k  1  2  3  3  4  4  5
j_k  1  1  1  2  1  2  2

The structure of A, as generated by Algorithm 4, is described in (3.17). Remember that the subscript k of H^{l,c}_k and H^l_k denotes the landmark measurement number, while the subscript t of H^v_t denotes time.

Note that:

• A_{11} relates vehicle states to vehicle state prediction errors.

• A_{21} relates vehicle states to measurement prediction errors.


• A22 relates landmark states to measurement prediction errors.

A = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix} ∼

\begin{pmatrix}
−I                &           &                    &          &         &      &      \\
F_2               & −I        &                    &          &         &      &      \\
                  & F_3       & −I                 &          &         &      &      \\
                  &           & F_4                & −I       &         &      &      \\
                  &           &                    & F_5      & −I      &      &      \\
H^v_1             &           &                    &          &         &      &      \\
H^{l,c}_1 + H^l_1 &           &                    &          &         & J_1  &      \\
                  & H^v_2     &                    &          &         &      &      \\
H^{l,c}_2         & H^l_2     &                    &          &         & J_2  &      \\
                  &           & H^v_3              &          &         &      &      \\
H^{l,c}_3         &           & H^l_3              &          &         & J_3  &      \\
                  &           & H^{l,c}_4 + H^l_4  &          &         &      & J_4  \\
                  &           &                    & H^v_4    &         &      &      \\
H^{l,c}_5         &           &                    & H^l_5    &         & J_5  &      \\
                  &           & H^{l,c}_6          & H^l_6    &         &      & J_6  \\
                  &           &                    &          & H^v_5   &      &      \\
                  &           & H^{l,c}_7          &          & H^l_7   &      & J_7
\end{pmatrix}   (3.17)

It is time consuming to evaluate the Jacobian H^{l,c}_k, connecting camera position to vehicle state. If only the Jacobian's partial derivatives with respect to vehicle position are allowed to be non-zero, it is possible to reduce the execution time of the evaluation while the speed of convergence is only marginally lowered, and the approximation's effect on the vehicle pose is almost undetectable. As a result of this, the implementation uses the approximate version of H^{l,c}_k. The interpretation of this approximation is that we disregard the change in camera position that stems from changes in the vehicle state angles ψ_{t_c(j_k)} and α_{t_c(j_k)}.

Just as with the EKF algorithm, and for the same reason, it is possible to get negative estimates of landmarks' inverse depths after a SAM iteration. This problem is solved in the SAM algorithm like it is in the EKF algorithm, namely by forcing inverse depth estimates below ρ_min up to this minimal allowed value.

Algorithm 4 summarizes the implementation of the SAM algorithm. n_v is the number of states in the vehicle model (i.e. the length of x^v_t, for any t). n_l is the number of states in the SAM landmark model, for one landmark.


3.5 Feature Association Improvement

In order to derive measurements from the FIR images that contain much information about the vehicle poses we want many measurements, but at the same time we want few landmark states, since these are also estimated from the measurements. A method to increase the number of landmark measurements per landmark is described below.

A major disadvantage with association of features using normalized cross-correlation of image patches is that change in scale is not handled. To overcome this issue, without implementing a whole new feature representation, scaling of the image patches that describe the features can be performed according to

L_{j,t} = 2 ⌊ D_{j,t} L_0 / 2 ⌋ + 1,   (3.18a)

D_{j,t} = ‖l^w_{j,t} − k^w_{j,t}‖ / ‖l^w_{j,t} − (p_t + R^{wb} c^b)‖ = 1 / ( ρ_{j,t} ‖l^w_{j,t} − (p_t + R^{wb} c^b)‖ ),   (3.18b)

where L_{j,t} is the size of the enlarged patch for landmark j at time t, and L_0 is the size of the original patch (with size being the length of the quadratic patches' sides). p_t, k^w_{j,t}, ρ_{j,t} and l^w_{j,t} are EKF estimates of the states and variables described in Section 2.4, and R^{wb} is the same as in (3.7). Note that l^w_{j,t} is not a state and is therefore not itself estimated; it is instead calculated using the estimated landmark states.

The quantity D_{j,t} is a scale factor which, based on the EKF estimates of the landmark state and the vehicle state, estimates how much larger landmark j will appear in the image at time t, compared to when the landmark was first seen and the image patch describing the feature was stored. The scale factor is simply calculated as the distance to the landmark when it was first measured, divided by the current distance to it.

Scaling the feature patches can, besides prolonging the life-time of landmarks, also be expected to improve the quality of landmark measurements. The key to this is that when scaling is not used, and we have a landmark patch which ought to be enlarged, the association of this patch with a new image might become offset in some direction. Consider for example the top of a spruce tree, and say that the feature patch describing this landmark is taken far away so that the patch depicts the upper half of the tree. If we associate this feature patch with an image taken at half the original distance to the tree, it is likely that we succeed in finding an association, but the patch will be associated with the top quarter of the tree, so the measurement is higher up than it should be.

Note that L_{j,t} is D_{j,t} L_0 rounded to the nearest odd integer. This is done to make sure that the enlarged patch also has one pixel which represents the center of the patch. Also note that D_{j,t} is an approximation of the real scale change, but this approximation holds since the landmarks are small compared to the distance to them, resulting in small angles.
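A sketch of the patch size computation (3.18); the resizing of the stored patch to size L_{j,t} (e.g. by interpolation) is left out, and the helper is hypothetical.

```python
import numpy as np

def scaled_patch_size(l_w, k_w, p_t, R_wb, c_b, L0):
    """Compute the scale factor D_{j,t} and the odd patch size L_{j,t} according to (3.18)."""
    cam_now = p_t + R_wb @ c_b                 # current camera position in world coordinates
    D = np.linalg.norm(l_w - k_w) / np.linalg.norm(l_w - cam_now)
    L = 2 * int(np.floor(D * L0 / 2)) + 1      # (3.18a): odd patch size with a center pixel
    return D, L
```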


Algorithm 4 Sensor fusion using SAM
1: Generate x̂^v from all x̂^v_{t|t}.
2: Generate x̂^l from all x̂^l_{t|t}, using the last estimate of the individual landmarks. Exclude landmarks that have fewer measurements than a predefined threshold.
3: Update y^l_{j_k,t_k} (taken from the EKF algorithm) by removing measurements belonging to landmarks removed in the previous step.
4: repeat
5:   Initialize a and c as empty column vectors, and A_{21} and A_{22} as empty matrices. (These variables will grow iteratively as new rows are added to them.)
6:   for t = 1 to N do
7:     if t = 1 then
8:       Set A_{11} = (P^v_{1|0})^{−T/2} (−I  0_{n_v×(N−1)n_v}).
9:       Set a = 0_{n_v×1}.
10:    else
11:      Append Q_t^{−T/2} (0_{n_v×(t−2)n_v}  F_t  −I  0_{n_v×(N−t)n_v}) to A_{11}.
12:      Append Q_t^{−T/2} a_t to a.
13:    end if
14:    Append (S^v)^{−T/2} (0_{2×(t−1)n_v}  H^v_t  0_{2×(N−t)n_v}) to A_{21}.
15:    Append two rows of zeros to A_{22}.
16:    Append (S^v)^{−T/2} c^v_t to c.
17:    for all k : t_k = t do
18:      if t = t_c(j_k) then
19:        Append (S^l)^{−T/2} (0_{2×(t−1)n_v}  H^{l,c}_k + H^l_k  0_{2×(N−t)n_v}) to A_{21}.
20:      else
21:        Append (S^l)^{−T/2} (0_{2×(t_c(j_k)−1)n_v}  H^{l,c}_k  0_{2×(t−t_c(j_k)−1)n_v}  H^l_k  0_{2×(N−t)n_v}) to A_{21}.
22:      end if
23:      Append (S^l)^{−T/2} (0_{2×(j_k−1)n_l}  J_k  0_{2×(M−j_k)n_l}) to A_{22}.
24:      Append (S^l)^{−T/2} c^l_k to c.
25:    end for
26:  end for
27:  Let A = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix}.
28:  Let b = \begin{pmatrix} a \\ c \end{pmatrix}.
29:  Calculate δx* and x̂_s according to (3.15) and (3.16) respectively.
30:  Set x̂ = x̂_s.
31:  for all j do
32:    Make sure ρ̂_j ≥ ρ_min.
33:  end for
34: until max abs(δx*) is smaller than a predefined threshold


Chapter 4

Results

This chapter contains a description of the result presentation, followed by the results of using SAM and the vehicle model extensions, and lastly the results of introducing scale change compensation to the feature association.

4.1 Performance Measures

The results in terms of vehicle pose estimation accuracy of SAM and the vehicle model extensions are given in the form of two measures. The first one, landmark measurement residuals, is an indirect but quantified measure of pose estimation accuracy, whereas the second measure, trajectory plot in camera image, is direct but non-quantified and also to some extent subjective.

4.1.1 Landmark Measurement Residuals

The root mean square, RMS(v), of a vector v = (v_1, . . . , v_N)^T ∈ R^N is

RMS(v) = sqrt( (1/N) Σ_{i=1}^{N} v_i² ).   (4.1)

To measure how well the image positions of landmarks are predicted, i.e. how far off the predictions were from the measurements on average, the RMS(c̄^l) of the landmark measurement residuals c̄^l (given in pixels) is used. With f_y and f_z as the focal lengths (given in pixels) in the y-direction and z-direction respectively, M_k = Σ_{t=1}^{N} M_t as the total number of landmark measurements, and t_e(j) as the end time for landmark j, i.e. the latest time that it was measured, c̄^l is given according to


c̄^l = (c̄^l_1, . . . , c̄^l_{M_k})^T,   (4.2a)

c̄^l_k = \begin{pmatrix} f_y & 0 \\ 0 & f_z \end{pmatrix} c^l_k,   (4.2b)

c^l_k = \begin{cases} y^l_{j_k,t_k} − h^l(x̂^v_{t_k|t_k}, x̂^l_{j_k,t_e(j_k)|t_e(j_k)}) & \text{if EKF estimates are used,} \\ y^l_{j_k,t_k} − h^l(x̂^v_{t_k}, g(x̂^v_{t_c(j_k)}, x̂^l_{j_k})) & \text{if SAM estimates are used.} \end{cases}   (4.2c)

Note that c^l_k is given in normalized camera coordinates.

The fact that landmark measurements are discretized (they are given as integer pixel positions) results in a lower limit for the RMS of the landmark measurement residuals: RMS(c̄^l) ≳ 1/(2√3) ≈ 0.29. See Appendix C for the derivation of this limit.
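The residual statistic of this section amounts to the following computation (a sketch; residuals_px is assumed to hold the stacked pixel residuals c̄^l_k).

```python
import numpy as np

def rms(v):
    """Root mean square (4.1) of a stacked residual vector."""
    v = np.asarray(v, dtype=float).ravel()
    return np.sqrt(np.mean(v ** 2))

# Quantization floor from Appendix C: even perfect predictions rounded to integer
# pixels leave an RMS of about 1/(2*sqrt(3)), i.e. roughly 0.29 pixels.
```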

A drawback of this pose estimation measure is that since it is based on landmark measurements, it depends on which types of landmarks are extracted from the image sequence, and the extracted landmarks are not necessarily the same when using a different subset of the model extensions described in Section 2.3.2. For example, consider a data sequence where we with one model extension get a couple of erroneously associated landmarks and no erroneous associations with another model extension; then it is difficult to compare the pose estimation accuracy based on the RMS(c̄^l) values from these two cases. To make the results more dependable we take the RMS(c̄^l) values for all the 16 combinations of vehicle process model extensions, for both EKF and SAM (a total of 32 values per sequence), and calculate a few averages of the RMS(c̄^l) values for each sequence.

It should also be noted that the measurements of course are not the truth, and their quality depends on how well the feature association algorithm performs.

4.1.2 Trajectory Plot in Camera Image

Plotting the sequence of estimated vehicle positions (the trajectory) in the first FIR camera image provides an intuitive way to evaluate the accuracy of the pose estimates.

The vehicle pose consists of three-dimensional position and three-dimensional orientation, and the part of the pose sequence that the trajectory basically shows is position and yaw angle (since yaw is tightly connected to the sequence of positions), but not pitch angle or roll angle.
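A minimal sketch of how such a trajectory plot can be produced is given below. It assumes a pinhole model with the camera x-axis pointing forward (matching the y/z focal-length convention used above); the rotation, translation and intrinsic parameters are illustrative placeholders rather than quantities taken from the thesis.

```python
import numpy as np

def project_trajectory(positions_w, R_cw, t_cw, fy, fz, cy, cz):
    """Project estimated vehicle positions (world frame, one per row)
    into the first camera image with a pinhole model.

    R_cw, t_cw: world-to-camera rotation matrix and translation.
    fy, fz, cy, cz: focal lengths and principal point in pixels.
    Returns an array of pixel coordinates (y, z), one row per point.
    """
    pts_c = positions_w @ R_cw.T + t_cw       # world -> camera coordinates
    pts_c = pts_c[pts_c[:, 0] > 0.1]          # keep points in front of the camera
    y = cy + fy * pts_c[:, 1] / pts_c[:, 0]   # normalized pinhole projection
    z = cz + fz * pts_c[:, 2] / pts_c[:, 0]
    return np.column_stack([y, z])
```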

The obvious drawback of using the trajectory plot as a measure of pose estimation accuracy is that it is not quantitative but instead has to be assessed subjectively.


4.2 SAM and Model Extensions

We introduce the following abbreviations for the vehicle process model extensions described in Section 2.3.2:

Offset: Constant offset in car pitch angle.

Acc: Acceleration offset in car pitch angle.

Roll: Roll angle.

2nd: Second order model for car pitch.

For each data sequence used to evaluate the SAM algorithm and the model extensions proposed in this thesis, both EKF and SAM estimates have been generated for all 16 combinations of vehicle process model extensions, and all values of RMS(cl) for the sequences are found in Appendix D. This section summarizes the results found in Appendix D, and for each sequence a few vehicle trajectories are plotted in the first FIR camera image of the sequence. To illustrate the trajectory and landmarks, a map of estimated landmark positions and vehicle trajectory is shown for each sequence.

It is important to note that not all sequences contain behaviour that the model extensions are meant to handle. If, for example, the true car pitch offset is zero, the offset model extension cannot be expected to improve the pose estimates for that particular sequence. The same goes for the acceleration offset in car pitch angle for any sequence in which the vehicle moves at constant speed. To give an idea about which model extensions are given a chance to improve the results for a particular sequence, all sequences below are accompanied by a short description of their pitch motion, roll motion and acceleration, but note that it is generally unknown whether or not the car pitch has a constant offset.


Sequence A

The car pitch and roll oscillate quite a bit during this sequence, but the acceleration is virtually zero. Table 4.1 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.1 shows a top view of the estimated map and trajectory for Sequence A, while Figures 4.2 and 4.3 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      2.71
    SAM      0.51

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.52     0.51   0.48   0.50
    Without model    0.50     0.51   0.54   0.53

(b) With and without model extensions, using SAM.

Table 4.1: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence A.

Figure 4.1: Sequence A: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.2: Sequence A: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.3: Sequence A: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence B

In this sequence there are moderate amounts of pitch and roll motion, but the acceleration is very close to zero. Table 4.2 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.4 shows a top view of the estimated map and trajectory for Sequence B, while Figures 4.5 and 4.6 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      1.18
    SAM      0.46

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.45     0.46   0.41   0.46
    Without model    0.46     0.46   0.50   0.45

(b) With and without model extensions, using SAM.

Table 4.2: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence B.

Figure 4.4: Sequence B: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.5: Sequence B: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.6: Sequence B: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence C

In this sequence the car pitch oscillates moderately, but the roll angle oscillation is low. There is, however, some roll motion due to road geometry, and the level of acceleration is low. Table 4.3 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.7 shows a top view of the estimated map and trajectory for Sequence C, while Figures 4.8 and 4.9 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      3.70
    SAM      0.65

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.61     0.67   0.62   0.64
    Without model    0.69     0.64   0.68   0.66

(b) With and without model extensions, using SAM.

Table 4.3: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence C.

Figure 4.7: Sequence C: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.8: Sequence C: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.9: Sequence C: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence D

This sequence contains some, but not much, roll motion. The acceleration is low and the pitch oscillates moderately. Table 4.4 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.10 shows a top view of the estimated map and trajectory for Sequence D, while Figures 4.11 and 4.12 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      1.68
    SAM      0.58

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.54     0.59   0.55   0.59
    Without model    0.61     0.57   0.61   0.57

(b) With and without model extensions, using SAM.

Table 4.4: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence D.

Figure 4.10: Sequence D: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.11: Sequence D: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.12: Sequence D: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence E

This sequence contains almost no acceleration and little roll motion, but the pitch oscillates moderately. Table 4.5 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.13 shows a top view of the estimated map and trajectory for Sequence E, while Figures 4.14 and 4.15 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      0.91
    SAM      0.45

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.44     0.45   0.44   0.45
    Without model    0.46     0.45   0.45   0.45

(b) With and without model extensions, using SAM.

Table 4.5: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence E.

Figure 4.13: Sequence E: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.14: Sequence E: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.15: Sequence E: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence F

In this sequence both the pitch and roll motion are rather extensive, and there is also some acceleration. Table 4.6 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.16 shows a top view of the estimated map and trajectory for Sequence F, while Figures 4.17 and 4.18 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      1.56
    SAM      0.52

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.51     0.52   0.49   0.52
    Without model    0.53     0.51   0.55   0.51

(b) With and without model extensions, using SAM.

Table 4.6: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence F.

Figure 4.16: Sequence F: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.17: Sequence F: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.18: Sequence F: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence G

This sequence exhibits moderate roll motion and acceleration but extensive pitch motion. Table 4.7 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.19 shows a top view of the estimated map and trajectory for Sequence G, while Figures 4.20 and 4.21 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      2.78
    SAM      0.55

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.54     0.56   0.52   0.56
    Without model    0.57     0.54   0.58   0.54

(b) With and without model extensions, using SAM.

Table 4.7: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence G.

Figure 4.19: Sequence G: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.20: Sequence G: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.21: Sequence G: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence H

This sequence exhibits a large constant offset in car pitch (as a result, the horizon lies on average 15-20 pixels below the image center, which can be seen in the images below), extensive roll motion, moderate acceleration, but not much car pitch oscillation. Table 4.8 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.22 shows a top view of the estimated map and trajectory for Sequence H, while Figures 4.23 and 4.24 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      4.23
    SAM      0.70

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.62     0.69   0.61   0.76
    Without model    0.78     0.71   0.79   0.64

(b) With and without model extensions, using SAM.

Table 4.8: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence H.

Figure 4.22: Sequence H: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.23: Sequence H: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.24: Sequence H: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence I

In this sequence the pitch and roll motion are moderate, and there is some acceleration. Table 4.9 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.25 shows a top view of the estimated map and trajectory for Sequence I, while Figures 4.26 and 4.27 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      2.16
    SAM      0.44

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.42     0.43   0.39   0.44
    Without model    0.46     0.45   0.49   0.44

(b) With and without model extensions, using SAM.

Table 4.9: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence I.

Figure 4.25: Sequence I: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.26: Sequence I: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.27: Sequence I: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence J

This sequence contains extensive acceleration and moderate roll and pitch motion. Table 4.10 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.28 shows a top view of the estimated map and trajectory for Sequence J, while Figures 4.29 and 4.30 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      3.33
    SAM      0.68

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.64     0.66   0.66   0.69
    Without model    0.73     0.71   0.71   0.68

(b) With and without model extensions, using SAM.

Table 4.10: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence J.

Figure 4.28: Sequence J: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.29: Sequence J: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.30: Sequence J: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence K

This sequence exhibits moderate roll motion, and extensive acceleration and pitch motion. Table 4.11 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.31 shows a top view of the estimated map and trajectory for Sequence K, while Figures 4.32 and 4.33 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      3.57
    SAM      0.75

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.68     0.74   0.68   0.80
    Without model    0.82     0.76   0.81   0.70

(b) With and without model extensions, using SAM.

Table 4.11: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence K.

Figure 4.31: Sequence K: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.32: Sequence K: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.33: Sequence K: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Sequence L

In this sequence both the acceleration and the roll motion are extensive, but the pitch oscillation is moderate. Table 4.12 presents the results for the mean landmark measurement residuals, when using SAM compared to using only the EKF, and when using the different vehicle model extensions, compared to not using them. Figure 4.34 shows a top view of the estimated map and trajectory for Sequence L, while Figures 4.35 and 4.36 show trajectory plots which compare the different vehicle model extensions, and SAM versus EKF.

    Method   RMS(cl)
    EKF      2.42
    SAM      0.66

(a) EKF versus SAM.

    RMS(cl)          Vehicle process model
                     Offset   Acc    Roll   2nd
    With model       0.69     0.60   0.57   0.69
    Without model    0.63     0.72   0.76   0.64

(b) With and without model extensions, using SAM.

Table 4.12: These tables show the effect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence L.

Figure 4.34: Sequence L: A map of the SAM estimates (x [m] versus y [m]), using the basic vehicle process model.


Figure 4.35: Sequence L: Trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

Figure 4.36: Sequence L: Trajectory for SAM estimates, using the models offset (dotted), acc (dashed), roll (dashdotted) and 2nd (solid).


Smoothed Trajectory

Figure 4.37 shows a part of the trajectory of Sequence A, where it clearly can be seen that the EKF estimated trajectory is rather rough while the SAM estimated trajectory is smooth.

Figure 4.37: Sequence A: Part of the trajectory for EKF estimates (dashdotted) and SAM estimates (solid), using the basic vehicle process model.

4.3 Feature Association Improvement

The feature patch scaling described in Section 3.5 is intended to improve the quality of the landmark measurements and to increase the life-time of the landmarks, with life-time being the number of times a landmark has been visible (measured), resulting in more measurements per landmark state.
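A mean life-time of this kind can be computed by simply counting the number of accepted measurements per landmark index; a minimal Python sketch, assuming a list with one landmark index per accepted measurement over the whole sequence, is:

```python
from collections import Counter

def mean_lifetime(landmark_ids):
    """Average number of measurements per landmark.

    landmark_ids: iterable with one landmark index per accepted
    landmark measurement over the whole sequence.
    """
    counts = Counter(landmark_ids)
    return sum(counts.values()) / len(counts)

# mean_lifetime([0, 1, 0, 2, 0, 1]) == 2.0
```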

For a number of sequences, measures of life-time have been generated, both with and without the patch scaling, using the basic vehicle process model, and these results are presented in Table 4.13. The results are ambiguous, showing both improvement and degradation of the mean landmark life-time.


    Mean life-time   Scaling   No scaling
    Sequence A       24.9      26.6
    Sequence B       27.0      22.6
    Sequence C       47.7      41.2
    Sequence D       33.5      32.8
    Sequence E       32.1      26.9
    Sequence F       20.8      17.6
    Sequence G       22.7      23.2
    Sequence H       32.1      33.2
    Sequence I       40.4      37.7
    Sequence J       35.5      36.0
    Sequence K       37.5      49.1
    Sequence L       14.2      14.2

Table 4.13: Life-time of landmarks, with and without scaling of feature patches.


Chapter 5

Concluding Remarks

This work was aimed at improving the accuracy of vehicle pose estimates, and this chapter contains conclusions about how the different investigated methods performed, based on the results in the previous chapter, and a discussion of areas which might be interesting to investigate in order to further improve the pose estimation accuracy.

It should be noted that this work is in no way exclusive to or dependent on FIR images; other kinds of images, such as NIR images or visual images, may in principle be used instead of FIR images.

5.1 Conclusions

This section contains the conclusions, based on the results, of this work.

5.1.1 SAM

It is clear from the sequences presented in Chapter 4 that the SAM algorithm outperforms the EKF algorithm in terms of pose estimation accuracy. This conclusion is supported both by the landmark measurement residual measure and by the trajectory plots: in the cases where the EKF estimates are clearly erroneous, the SAM estimates are considerably better.

Another aspect of SAM is that the estimated trajectory is smooth, as expected (remember that SAM literally means smoothing and mapping), while the EKF generates estimates that are generally much less smooth, especially when we get erroneous associations of landmarks. An example of this is found in Figure 4.37.

5.1.2 Vehicle Process Model Extensions

As can be seen in the results, the process model extensions are not able to improve the pose estimates as much as the SAM algorithm. They do, however, result in some improvements, and a summary of the results along with conclusions is given in this section.


Constant Offset in Car Pitch Angle

The landmark measurement residual measure shows that this model extension can give some improvement in pose estimation accuracy. The magnitude of the improvement of course depends heavily on how large the car pitch angle offset really is, which is why this model extension cannot provide improvements in accuracy for all sequences.

Note that if we have a sequence with considerable acceleration, and we use the constant offset model extension but not the acceleration model extension, then the constant offset model extension will erroneously try to describe the pitch angle change (caused by the acceleration) as an offset.

Acceleration Offset in Car Pitch Angle

It can be seen from the results that if a sequence contains extensive acceleration, this model extension can decrease the value of the landmark measurement residual measure, indicating improved pose estimation accuracy. To get such results the acceleration has to be at least on the order of 1 m/s². As a reference, a regular automobile can gain speed at a rate of 2-3 m/s² and brake at a rate of 5-10 m/s², so under normal driving conditions accelerations of 1 m/s² or more are common. An advantage of this model extension is that it does not introduce any new vehicle states, so it can be used without compromising the estimation accuracy for acceleration-free sequences.

Roll Angle

This model extension differs from the others in that it introduces a new dimension for the description of vehicle motion. The results reflect this by showing that it is the model extension which provides the largest performance improvement, at least in terms of landmark measurement residuals. However, for some sequences the roll angle has had a tendency to drift slowly, so if longer sequences are to be analyzed this might become a problem.

Second Order Model for Car Pitch

The results for this model extension indicate that it provides no improvement, but rather that it decreases performance. This might be because the true car pitch oscillation process is described adequately by the basic model, so that the second order model basically only adds the burden of estimating more vehicle states, making it more difficult to estimate the pose accurately. Another possibility is that the used values of the model parameters (natural frequency and relaxation time) are not properly set.
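For reference, a second order pitch model of the kind discussed here is commonly written as a damped oscillator driven by process noise. The Python sketch below shows a simple Euler discretization of that generic form; the natural frequency and relaxation time values are placeholders, and this is only an illustration of the model class, not necessarily the exact parametrization used in the thesis.

```python
import numpy as np

def pitch_second_order_step(phi, phi_dot, T, omega0=2 * np.pi * 1.5, tau=0.5, w=0.0):
    """One Euler step of a generic second order (damped oscillator) pitch model:
        phi_ddot = -omega0**2 * phi - (2 / tau) * phi_dot + w
    omega0: natural frequency [rad/s], tau: relaxation time [s],
    w: process noise sample, T: sampling time [s].
    All parameter values here are illustrative placeholders."""
    phi_ddot = -omega0 ** 2 * phi - (2.0 / tau) * phi_dot + w
    return phi + T * phi_dot, phi_dot + T * phi_ddot
```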

5.1.3 Feature Association Improvement

The results show that introducing scale change correction for the association of feature patches most of the time gives landmarks with basically unchanged or slightly longer life-times, but the results for Sequence K show that the scaling might decrease the life-time. The life-time with scaling was still high for that sequence, so it is possible that the risk of shortened life-time is small for sequences in which the life-time is low when scaling is not used. Another possibility for that sequence is that when no scaling was used there were some long-lived landmarks with erroneous associations, and introducing scaling beneficially decreased the life-time of these landmarks by letting only the correct associations remain.

5.2 Future Work

During this work it has become more and more clear that the problem of erroneous association of features is an important issue to be dealt with, since the measurements are the foundation on which SAM and the model extensions rely. The natural way to improve landmark association is to use some other feature descriptor, and possibly also change the feature detection algorithm, but a straightforward approach to reducing the rate of erroneous feature associations is to add logic to the SAM algorithm that removes unstable landmarks. Such an approach might also result in robustness to moving objects, at least to some extent. Another way to improve the measurements derived from the camera images is to extract image measurements of yaw, pitch and roll change rates that are not based on landmarks, but instead use phase correlation or some similar method.
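As an illustration of the last suggestion, the sketch below estimates the translational pixel shift between two consecutive frames with standard FFT-based phase correlation; the shift can then be turned into approximate yaw and pitch change rates by dividing by the focal lengths and the sample time. Sign conventions and sub-pixel refinement are left out, and the function name is ours, not something defined in the thesis.

```python
import numpy as np

def phase_correlation_shift(img1, img2):
    """Estimate the integer-pixel translation between two equally sized
    grayscale images (2-D float arrays) using phase correlation."""
    F1 = np.fft.fft2(img1)
    F2 = np.fft.fft2(img2)
    cross_power = F1 * np.conj(F2)
    cross_power /= np.abs(cross_power) + 1e-12        # keep only the phase
    corr = np.real(np.fft.ifft2(cross_power))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # shifts larger than half the image size correspond to negative shifts
    if dy > img1.shape[0] // 2:
        dy -= img1.shape[0]
    if dx > img1.shape[1] // 2:
        dx -= img1.shape[1]
    return dx, dy

# approximate change rates, assuming small angles, focal lengths fy, fz
# in pixels and sample time T:
#   yaw_rate   ~ dx / (fy * T)
#   pitch_rate ~ dy / (fz * T)
```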


Bibliography

[1] F. Dellaert and M. Kaess. Square root SAM: Simultaneous localization and mapping via square root information smoothing. International Journal of Robotics Research, 25(12):1181–1203, 2006.

[2] E.D. Dickmanns. Dynamic Vision for Perception and Control of Motion. Springer, Secaucus, NJ, USA, 2007.

[3] C. Harris and M. Stephens. A combined corner and edge detector. In Proceedings of the 4th Alvey Vision Conference, pages 147–151, Manchester, UK, August 1988.

[4] M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Incremental Smoothing and Mapping. IEEE Transactions on Robotics, 24(6):1365–1378, 2008.

[5] Q. Lin, F. Tjärnström, J. Roll, and B. Wass. Developing a far infrared based night-vision system with pedestrian detection. VDI Berichte, (2038):153–158, 2008.

[6] D.G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.

[7] Y. Ma, S. Soatto, J. Kosecka, and S.S. Sastry. An Invitation to 3-D Vision: From Images to Geometric Models. Springer, 2004.

[8] J.M.M. Montiel, J. Civera, and A.J. Davison. Unified inverse depth parametrization for monocular SLAM. In Proceedings of Robotics: Science and Systems (RSS), Philadelphia, USA, August 2006.

[9] T.B. Schön and J. Roll. Ego-motion and indirect road geometry estimation using night vision. In IEEE Intelligent Vehicles Symposium, Proceedings, pages 30–35, Xi’an, Shaanxi, China, June 2009.

[10] G. Sibley, C. Mei, I. Reid, and P. Newman. Adaptive Relative Bundle Adjustment. In Proceedings of Robotics: Science and Systems (RSS), Seattle, USA, June 2009.

[11] Z. Sjanic, M. Skoglund, T. B. Schön, and F. Gustafsson. Solving the SLAM Problem for Unmanned Aerial Vehicles Using Smoothed Estimates. In Proceedings of the Reglermöte (Swedish Control Conference), Lund, Sweden, June 2010.


[12] H. Strasdat, J.M.M. Montiel, and A.J. Davison. Real-time Monocular SLAM: Why Filter? In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Anchorage, Alaska, USA, May 2010.

[13] O. Tsimhoni, J. Bärgman, and M.J. Flannagan. Pedestrian detection with near and far infrared night vision enhancement. LEUKOS - Journal of Illuminating Engineering Society of North America, 4(2):113–128, 2007.


Appendix A

Nomenclature

A.1 Mathematical Notations

ḟ                time derivative df/dt of the function f
N(µ, Σ)          normal distribution with mean µ and covariance matrix Σ
∥v∥₂             2-norm of the vector v
arg min_x f(x)   argument which minimizes the function f
max abs(v)       maximum of the absolute values of the elements in the vector v
⌊x⌋              flooring of the scalar x
I_{n×n}          identity matrix with n rows and columns

A.2 Abbreviations

EKF     extended Kalman filter
FIR     far infrared
NCC     normalized cross-correlation
NIR     near infrared
RMS     root mean square
SAM     smoothing and mapping
SLAM    simultaneous localization and mapping

A.3 Landmark States

kw      initial camera position in three dimensions
θw      azimuth angle
ϕw      elevation angle
ρ       inverse depth


A.4 Vehicle States

p       position in three dimensions
vx      velocity of the vehicle in its forward direction
ψ       yaw angle
δ       front wheel angle
α       road pitch angle
φ       vehicle pitch angle
φ0      vehicle pitch angle offset
φ̈       vehicle pitch angular acceleration
γ       roll angle

A.5 Variables, Parameters and Functions

a       vehicle state prediction residuals
A       matrix describing the optimization problem
c       measurement residuals
cb      camera position given in vehicle body coordinates
C       car pitch damping parameter
e       measurement noise
f       vehicle state process function, focal length
F       Jacobian of f with respect to vehicular states
g       landmark state in six parameters as a function of landmark index and SAM states
h       measurement function
H       Jacobian of h with respect to all states or vehicular states
j       landmark index
J       Jacobian of h with respect to landmark states
k       measurement number
l       landmark position
L       wheel base (distance between front and rear wheel axes) of the vehicle
m       directional unit vector from initial camera position to landmark
M       number of landmarks, number of visible landmarks
N       total number of time steps
P       state covariance
Pn      normalized pinhole projection function
Q       process noise covariance
R       rotation matrix
S       measurement noise covariance
t       time
T       sampling time
u       input signal
w       vehicle state process noise
x       state vector
y       measurement vector


A.6 Indices

b       vehicle body coordinate frame
c       camera, camera coordinate frame
j       landmark index
k       measurement number
l       landmark
t       time
v       vehicle
w       world coordinate frame
x       x-coordinate
y       y-coordinate
z       z-coordinate

A.7 Top Notations

x̂       state estimate from the EKF algorithm
x̄       state estimate from the SAM algorithm
ȳ       landmark measurements given in pixel coordinates


Appendix B

Proof of Matrix Singularity

This appendix provides a proof of the statement that the process noise covariance matrix Q(x^v_t) is singular (non-invertible).

We have

$$Q_t = Q(x^v_t) \overset{(2.9)}{=} B(x^v_t)\, Q_\omega\, B(x^v_t)^T \in \mathbb{R}^{n_v \times n_v}, \qquad (\text{B.1a})$$
$$B(x^v_t) \in \mathbb{R}^{n_v \times m}, \qquad (\text{B.1b})$$
$$Q_\omega \in \mathbb{R}^{m \times m}, \qquad (\text{B.1c})$$
$$m < n_v, \qquad (\text{B.1d})$$

which results in

$$\operatorname{rank} Q_t = \operatorname{rank}\!\big(B(x^v_t)\, Q_\omega\, B(x^v_t)^T\big) \le \min\!\big(\operatorname{rank}(B(x^v_t)),\, \operatorname{rank}(Q_\omega B(x^v_t)^T)\big) \le \operatorname{rank}(B(x^v_t)) \le \min(n_v, m) = m. \qquad (\text{B.2})$$

Finally, rank Q_t ≤ m < n_v, so Q_t is not of full rank and hence it is singular.
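A quick numerical sanity check of this statement can be made as follows, with the assumed (illustrative) dimensions n_v = 6 and m = 3:

```python
import numpy as np

nv, m = 6, 3                                   # assumed dimensions, m < nv
rng = np.random.default_rng(0)
B = rng.standard_normal((nv, m))               # stand-in for B(x_t^v)
Q_omega = np.diag(rng.uniform(0.1, 1.0, m))    # positive definite Q_omega
Q_t = B @ Q_omega @ B.T                        # process noise covariance Q(x_t^v)
print(np.linalg.matrix_rank(Q_t))              # 3, i.e. rank m < nv: Q_t is singular
```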


Appendix C

Derivation of Landmark Measurement Residual Limit

Suppose that the undiscretized landmark measurement y (given in pixels) is noise free and unbiased, and that we have the prediction y_p = y. The measurement y_m is y rounded to the nearest integer, resulting in the landmark measurement residual c^l(y) according to

$$c^l(y) = y_m - y_p = \operatorname{round}(y) - y = \lfloor y + 0.5 \rfloor - y. \qquad (\text{C.1})$$

Note that we are considering one scalar component of a single measurement, i.e. we are looking at either the y-component or the z-component of the measurement.

The residual is periodic, so its RMS can be calculated over one period, and we assume that y is uniformly distributed.

$$\mathrm{RMS}(c^l(y)) = \sqrt{\frac{1}{0.5-(-0.5)}\int_{-0.5}^{0.5} c^l(y)^2\, dy} = \sqrt{\int_{-0.5}^{0.5} (0-y)^2\, dy} = \sqrt{\int_{-0.5}^{0.5} y^2\, dy} = \sqrt{\left[\frac{y^3}{3}\right]_{-0.5}^{0.5}} = \sqrt{\frac{1}{3}\left(\frac{1}{2^3}+\frac{1}{2^3}\right)} = \frac{1}{2\sqrt{3}} \approx 0.29. \qquad (\text{C.2})$$

The conclusion from this is that regardless of how good the landmark measurement predictions are (as a result of an accurate vehicle process model), RMS(cl) cannot be expected to be below approximately 0.29. (Note however that this limit is not strict in a mathematical sense, since by chance we might get samples of y resulting in an RMS(cl) smaller than the limit, but the probability for this to happen decreases rapidly as the number of measurements increases.)
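The limit is also easy to verify numerically; the following Monte Carlo sketch draws uniformly distributed sub-pixel positions and reproduces the value 1/(2√3):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.uniform(-0.5, 0.5, size=1_000_000)  # fractional parts of true pixel positions
residual = np.round(y) - y                  # rounding error = measurement residual
print(np.sqrt(np.mean(residual ** 2)))      # approximately 0.2887
print(1 / (2 * np.sqrt(3)))                 # 0.2887, the derived lower limit
```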


Appendix D

Results

Tables D.1–D.12 show the complete results in terms of landmark measurement residuals for all the sequences presented in the results chapter.

Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

2.50 0.55• 2.91 0.54

• 2.49 0.55• 2.73 0.55

• 2.54 0.53• • 2.90 0.54• • 2.88 0.50• • 2.88 0.53

• • 2.47 0.49• • 2.53 0.53

• • 2.65 0.40• • • 2.88 0.49• • • 2.91 0.56• • • 2.78 0.50

• • • 2.56 0.39• • • • 2.77 0.52

Table D.1: Sequence A: Landmark measurement residuals.


Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

1.02 0.49• 1.39 0.52

• 1.00 0.49• 0.94 0.41

• 0.90 0.49• • 1.37 0.51• • 1.36 0.39• • 1.34 0.50

• • 0.95 0.41• • 0.99 0.51

• • 1.10 0.45• • • 1.36 0.39• • • 1.34 0.50• • • 1.31 0.39

• • • 1.20 0.44• • • • 1.31 0.39

Table D.2: Sequence B: Landmark measurement residuals.

Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

3.02 0.71• 6.02 0.62

• 3.14 0.83• 3.45 0.65

• 3.18 0.71• • 4.59 0.61• • 2.82 0.60• • 5.09 0.63

• • 3.46 0.65• • 2.77 0.74

• • 3.35 0.53• • • 2.87 0.58• • • 5.13 0.60• • • 3.78 0.63

• • • 3.07 0.69• • • • 3.47 0.62

Table D.3: Sequence C: Landmark measurement residuals.


Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

1.44 0.66• 1.81 0.56

• 1.34 0.64• 1.93 0.55

• 1.08 0.65• • 1.82 0.56• • 1.64 0.50• • 1.69 0.57

• • 1.90 0.55• • 1.09 0.66

• • 1.91 0.55• • • 1.65 0.50• • • 1.74 0.57• • • 1.57 0.51

• • • 1.97 0.65• • • • 2.37 0.56

Table D.4: Sequence D: Landmark measurement residuals.

Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

0.99 0.47• 0.86 0.44

• 0.97 0.47• 0.95 0.45

• 0.98 0.47• • 0.86 0.44• • 0.90 0.45• • 0.82 0.43

• • 0.93 0.45• • 0.97 0.47

• • 0.94 0.45• • • 0.88 0.42• • • 0.83 0.43• • • 0.84 0.44

• • • 0.92 0.45• • • • 0.85 0.44

Table D.5: Sequence E: Landmark measurement residuals.


Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

1.46 0.55• 1.74 0.53

• 1.48 0.53• 1.47 0.48

• 1.37 0.56• • 1.84 0.53• • 1.61 0.49• • 1.64 0.52

• • 1.47 0.48• • 1.31 0.64

• • 1.39 0.49• • • 1.69 0.49• • • 1.80 0.53• • • 1.57 0.49

• • • 1.40 0.48• • • • 1.69 0.49

Table D.6: Sequence F: Landmark measurement residuals.

Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

1.95 0.53• 3.43 0.54

• 2.04 0.66• 2.03 0.50

• 3.00 0.61• • 3.50 0.54• • 2.75 0.54• • 4.72 0.55

• • 1.80 0.52• • 2.17 0.65

• • 2.51 0.53• • • 3.10 0.48• • • 3.78 0.57• • • 2.47 0.51

• • • 2.11 0.53• • • • 3.12 0.55

Table D.7: Sequence G: Landmark measurement residuals.


Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

3.91 0.75• 4.49 0.67

• 4.20 0.87• 3.91 0.87

• 3.48 0.97• • 5.89 0.65• • 3.50 0.42• • 4.20 0.65

• • 3.18 0.48• • 3.63 0.95

• • 3.63 0.69• • • 3.01 0.41• • • 5.59 0.80• • • 5.65 0.68

• • • 3.47 0.64• • • • 5.86 0.68

Table D.8: Sequence H: Landmark measurement residuals.

Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

2.10 0.51• 2.66 0.55

• 2.19 0.51• 1.87 0.36

• 2.12 0.53• • 2.35 0.53• • 1.83 0.36• • 2.35 0.42

• • 1.66 0.36• • 1.97 0.43

• • 2.91 0.53• • • 1.81 0.36• • • 2.58 0.43• • • 1.76 0.36

• • • 2.70 0.46• • • • 1.74 0.35

Table D.9: Sequence I: Landmark measurement residuals.


Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

2.84 0.75• 4.04 0.73

• 2.83 0.69• 2.99 0.72

• 2.58 0.77• • 3.49 0.67• • 3.70 0.57• • 3.96 0.67

• • 3.00 0.65• • 2.96 0.73

• • 2.93 0.83• • • 3.65 0.63• • • 3.73 0.64• • • 4.25 0.61

• • • 2.41 0.66• • • • 3.85 0.56

Table D.10: Sequence J: Landmark measurement residuals.

Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

4.26 0.84• 7.54 0.72

• 3.41 0.75• 3.16 0.72

• 3.14 0.88• • 3.64 0.67• • 3.25 0.63• • 2.96 0.81

• • 3.24 0.66• • 3.91 1.08

• • 3.33 0.87• • • 3.10 0.59• • • 2.81 0.76• • • 3.48 0.59

• • • 2.58 0.75• • • • 3.24 0.66

Table D.11: Sequence K: Landmark measurement residuals.


Vehicle process model RMS(cl)Offset Acc Roll 2nd EKF SAM

2.53 0.68• 2.82 0.68

• 2.59 0.68• 1.93 0.60

• 2.02 0.77• • 2.57 0.61• • 2.15 0.67• • 4.19 1.29

• • 1.78 0.59• • 2.02 0.66

• • 1.83 0.56• • • 2.30 0.57• • • 2.87 0.66• • • 2.53 0.52

• • • 2.24 0.50• • • • 2.38 0.53

Table D.12: Sequence L: Landmark measurement residuals.