Lecture 13 Visual Inertial Fusion - UZH
Davide Scaramuzza

TRANSCRIPT

Page 1: Lecture 13 Visual Inertial Fusion - UZH

Lecture 13 Visual Inertial Fusion

Davide Scaramuzza

Page 2: Lecture 13 Visual Inertial Fusion - UZH


Outline

Introduction

IMU model and Camera-IMU system

Different paradigms

Filtering

Maximum a posteriori estimation

Fixed-lag smoothing

Page 3: Lecture 13 Visual Inertial Fusion - UZH


What is an IMU?

Inertial Measurement Unit

Angular velocity

Linear acceleration

Page 4: Lecture 13 Visual Inertial Fusion - UZH


What is an IMU?

Different categories

Mechanical

Optical

MEMS

….

For mobile robots: MEMS IMU

Cheap

Power efficient

Light weight and solid state

Page 5: Lecture 13 Visual Inertial Fusion - UZH

MEMS Accelerometer

[Figure: spring, seismic mass, capacitive divider]

A spring-like structure connects the device to a seismic mass vibrating inside a capacitive divider. The capacitive divider converts the displacement of the seismic mass into an electric signal. Damping is provided by the gas sealed in the device.

Page 6: Lecture 13 Visual Inertial Fusion - UZH


MEMS Gyroscopes

MEMS gyroscopes measure the Coriolis forces acting on MEMS vibrating structures (tuning forks, vibrating wheels, or resonant solids)

Their working principle is similar to the halteres of a fly.

Halteres are small structures of some two-winged insects, such as flies. They are flapped rapidly and function as gyroscopes, informing the insect about the rotation of its body during flight.

Page 7: Lecture 13 Visual Inertial Fusion - UZH


Why IMU?

Monocular vision is scale ambiguous.

Pure vision is not robust enough

Low texture

High dynamic range

High speed motion

Robustness is a critical issue: Tesla accident

“The autopilot sensors on the Model S failed to distinguish a white tractor-trailer crossing the highway against a bright sky. ”

Page 8: Lecture 13 Visual Inertial Fusion - UZH


Why vision?

Pure IMU integration will lead to large drift (especially cheap IMUs)

We will see this later mathematically.

Intuition

- Integration of angular velocity to get orientation: error proportional to t

- Double integration of acceleration to get position: a bias in acceleration causes a position error proportional to t²

- Worse, the actual position error also depends on the orientation error.

http://www.vectornav.com/support/library/imu-and-ins

Smartphone accelerometers

Page 9: Lecture 13 Visual Inertial Fusion - UZH


Why visual inertial fusion?

Summary: IMU and vision are complementary

Visual sensor:
• Precise in case of non-aggressive motion
• Rich information for other purposes
ᵡ Limited output rate (~100 Hz)
ᵡ Scale ambiguity in monocular setup
ᵡ Lack of robustness

Inertial sensor:
• Robust
• High output rate (~1,000 Hz)
ᵡ Large relative uncertainty at low acceleration/angular velocity
ᵡ Ambiguity in gravity / acceleration

In common: state estimation based on visual and/or inertial sensors is dead reckoning, which suffers from drift over time (solution: loop detection and loop closure).

Page 10: Lecture 13 Visual Inertial Fusion - UZH


Outline

Introduction

IMU model and Camera-IMU system

Different paradigms

Filtering

Maximum a posteriori estimation

Fixed-lag smoothing

Page 11: Lecture 13 Visual Inertial Fusion - UZH

IMU model: Measurement Model

An IMU measures angular velocity and acceleration in the body frame:

B ω̃_WB(t) = B ω_WB(t) + bᵍ(t) + nᵍ(t)

B ã_WB(t) = R_BW(t) ( W a_WB(t) − W g ) + bᵃ(t) + nᵃ(t)

where the superscript g stands for Gyroscope and a for Accelerometer, the tilde marks a measurement, b the bias, and n the measurement noise.

Notations:
• Left subscript: reference frame in which the quantity is expressed
• Right subscripts {Q}{Frame1}{Frame2}: quantity Q of Frame2 with respect to Frame1
• Noises are all in the body frame

Page 12: Lecture 13 Visual Inertial Fusion - UZH

IMU model: Noise Property

Additive Gaussian white noise nᵍ(t), nᵃ(t):

E[ n(t) ] = 0
E[ n(t₁) n(t₂) ] = σ² δ(t₁ − t₂)

Discretized: n_d[k] = σ_d w[k], with w[k] ~ N(0, 1) and σ_d = σ / √Δt

Bias bᵍ(t), bᵃ(t):

ḃ(t) = σ_b w_b(t)

i.e., the derivative of the bias is white Gaussian noise (a so-called random walk).

Discretized: b_d[k] = b_d[k−1] + σ_bd w[k], with w[k] ~ N(0, 1) and σ_bd = σ_b √Δt

Trawny, Nikolas, and Stergios I. Roumeliotis. "Indirect Kalman filter for 3D attitude estimation."
https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

The biases are usually estimated with the other states • can change every time the IMU is started • can change due to temperature change, mechanical pressure, etc.
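The discrete-time noise and bias model above can be simulated in a few lines; a minimal sketch for a single sensor axis (function name and parameter values are illustrative):

```python
import numpy as np

def simulate_imu_noise(n_steps, dt, sigma, sigma_b, seed=0):
    """Simulate one axis of IMU noise per the discrete model:
    white noise  n_d[k] = (sigma/sqrt(dt)) * w[k],          w[k] ~ N(0,1)
    bias walk    b_d[k] = b_d[k-1] + (sigma_b*sqrt(dt)) * w_b[k]
    """
    rng = np.random.default_rng(seed)
    noise = (sigma / np.sqrt(dt)) * rng.standard_normal(n_steps)
    bias = np.cumsum(sigma_b * np.sqrt(dt) * rng.standard_normal(n_steps))
    return noise, bias

# the white noise has no memory, while the bias wanders slowly
noise, bias = simulate_imu_noise(n_steps=1000, dt=0.005,
                                 sigma=0.002, sigma_b=0.0001)
```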

Page 13: Lecture 13 Visual Inertial Fusion - UZH

IMU model: Integration

Position, per component ({t} stands for the {B}ody frame at time t):

W p(t₂) = W p(t₁) + W v(t₁)(t₂ − t₁) + ∬_{t₁}^{t₂} ( R_Wt ( ã(t) − bᵃ(t) ) + W g ) dt²

• Depends on the initial position and velocity
• The rotation R(t) is computed from the gyroscope

Trawny, Nikolas, and Stergios I. Roumeliotis. "Indirect Kalman filter for 3D attitude estimation."
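To make the drift intuition concrete, here is a toy sketch with made-up numbers: a stationary IMU with an uncorrected accelerometer bias, integrated with a simple Euler scheme, shows the quadratic-in-time position error:

```python
import numpy as np

def integrate_position(p0, v0, a_world, dt, n_steps):
    """Euler double integration: v += a*dt; p += v*dt (orientation fixed)."""
    p, v = np.array(p0, float), np.array(v0, float)
    for _ in range(n_steps):
        v = v + a_world * dt
        p = p + v * dt
    return p

# stationary IMU, gravity perfectly compensated, but an uncorrected
# 0.1 m/s^2 bias is left on the x axis of the world-frame acceleration
bias = np.array([0.1, 0.0, 0.0])
p = integrate_position([0, 0, 0], [0, 0, 0], bias, dt=0.01, n_steps=1000)
# after 10 s the x error is roughly 0.5 * 0.1 * 10^2 = 5 m
```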

Page 14: Lecture 13 Visual Inertial Fusion - UZH


Camera-IMU System

There can be multiple cameras.

Page 15: Lecture 13 Visual Inertial Fusion - UZH


Outline

Introduction

IMU model and Camera-IMU system

Different paradigms

Closed-form solution

Filtering

Fixed-lag smoothing

Maximum a posteriori estimation

Page 16: Lecture 13 Visual Inertial Fusion - UZH

Closed-form Solution (intuitive idea)

From vision, the absolute pose x is known up to a scale s, thus

x = s x̃

From the IMU:

x = x₀ + v₀(t₁ − t₀) + ∬_{t₀}^{t₁} a(t) dt²

By equating them:

s x̃ = x₀ + v₀(t₁ − t₀) + ∬_{t₀}^{t₁} a(t) dt²

As shown in [Martinelli’14], for 6DOF motion both s and v₀ can be determined in closed form from a single feature observation and 3 views; x₀ can be set to 0.

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision, 2014

Page 17: Lecture 13 Visual Inertial Fusion - UZH


Closed-form Solution

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision, 2014

s x̃₁ = v₀(t₁ − t₀) + ∬_{t₀}^{t₁} a(t) dt²

s x̃₂ = v₀(t₂ − t₀) + ∬_{t₀}^{t₂} a(t) dt²

[Figure: camera poses at t₀, t₁, t₂ observing landmark L₁]

In matrix form:

[ x̃₁  (t₀ − t₁) ] [ s  ]   [ ∬_{t₀}^{t₁} a(t) dt² ]
[ x̃₂  (t₀ − t₂) ] [ v₀ ] = [ ∬_{t₀}^{t₂} a(t) dt² ]
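In 1-D, this small linear system can be solved directly; a toy sketch with made-up values consistent with a(t) = 2, v₀ = 1, scale s = 2, and t₀ = 0:

```python
import numpy as np

# up-to-scale vision positions x~_i and IMU double integrals (toy values:
# true trajectory x(t) = v0*t + t^2 with v0 = 1, vision reports x~ = x/s)
t1, t2 = 1.0, 2.0
x_tilde = np.array([1.0, 3.0])
imu = np.array([t1**2, t2**2])      # double integral of a(t) = 2

# rows: [x~_i, -(t_i - t0)] [s; v0] = [imu_i]
A = np.array([[x_tilde[0], -t1],
              [x_tilde[1], -t2]])
s, v0 = np.linalg.solve(A, imu)
# recovers the metric scale s = 2 and initial velocity v0 = 1
```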

Page 18: Lecture 13 Visual Inertial Fusion - UZH


Different paradigms

Loosely coupled: use the outputs of the individual systems

Estimate the states individually from visual and inertial data

Combine the separate states estimations

Tightly coupled: use the internal states

Make use of the raw measurements

- Feature positions

- IMU readings

- …

Example:

- Use IMU for guided feature matching

- Minimizing reprojection error and IMU error together

- …

More accurate, but more implementation effort.

Page 19: Lecture 13 Visual Inertial Fusion - UZH

Different paradigms

[Diagram: images → Feature Extraction → Feature Matching → Motion Estimation → position, attitude; IMU measurements → IMU Integration → position, attitude, velocity. Loosely coupled: fuse the two outputs. Tightly coupled: also feed the feature correspondences into the joint estimation.]

Tightly coupled: more accurate, but more implementation effort.

Page 20: Lecture 13 Visual Inertial Fusion - UZH

Different paradigms

Filtering:
• Estimate only the most recent states (e.g., extended Kalman filter)
• Single linearization → accumulation of linearization errors
• Gaussian approximation of marginalized states
• Fastest

Fixed-lag smoothing:
• Optimize a window of states
• Marginalization; nonlinear least-squares optimization
• Re-linearizes, but still accumulates linearization errors
• Gaussian approximation of marginalized states
• Fast

Maximum-a-posteriori (MAP) estimation:
• Optimize all states
• Nonlinear least-squares optimization
• Re-linearizes; exploits sparse matrices
• Highest accuracy
• Slow

Page 21: Lecture 13 Visual Inertial Fusion - UZH

Filtering: Kalman Filter in a Nutshell

Assumptions: linear system, Gaussian noise.

System dynamics:

x(k+1) = A(k) x(k) + u(k) + v(k)
z(k) = H(k) x(k) + w(k)

x(k): state; u(k): control input (can be 0); z(k): measurement

x(0) ~ N(x₀, P₀), v(k) ~ N(0, Q(k)), w(k) ~ N(0, R(k))

Initialization: x̂_m(0) = x₀, P_m(0) = P₀

Prediction:

x̂_p(k) = A(k−1) x̂_m(k−1) + u(k−1)
P_p(k) = A(k−1) P_m(k−1) Aᵀ(k−1) + Q(k−1)

Measurement update:

P_m⁻¹(k) = P_p⁻¹(k) + Hᵀ(k) R⁻¹(k) H(k)
x̂_m(k) = x̂_p(k) + P_m(k) Hᵀ(k) R⁻¹(k) ( z(k) − H(k) x̂_p(k) )

The update weighs the model prediction against the measurement.
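The prediction and update translate almost line-by-line into code; a minimal sketch using the (equivalent) Kalman-gain form of the measurement update:

```python
import numpy as np

def kf_step(x_m, P_m, u, z, A, H, Q, R):
    """One Kalman filter cycle: predict with the model, correct with z."""
    # Prediction
    x_p = A @ x_m + u
    P_p = A @ P_m @ A.T + Q
    # Measurement update (Kalman-gain form)
    K = P_p @ H.T @ np.linalg.inv(H @ P_p @ H.T + R)
    x_m = x_p + K @ (z - H @ x_p)
    P_m = (np.eye(len(x_m)) - K @ H) @ P_p
    return x_m, P_m

# toy 1-D example: estimate a constant from noisy measurements
A = H = np.eye(1)
Q, R = 0.001 * np.eye(1), 1.0 * np.eye(1)
x, P = np.zeros(1), 10.0 * np.eye(1)
for z in [1.1, 0.9, 1.05]:
    x, P = kf_step(x, P, np.zeros(1), np.array([z]), A, H, Q, R)
# the estimate approaches 1 and the covariance shrinks with each update
```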

Page 22: Lecture 13 Visual Inertial Fusion - UZH

Filtering: Kalman Filter in a Nutshell

Nonlinear system: linearization (Extended Kalman Filter).

System dynamics:

x(k) = q_{k−1}( x(k−1), u(k−1), v(k−1) )
z(k) = h_k( x(k), w(k) )

Process noise, measurement noise, and the initial state are Gaussian.

Prediction:

x̂_p(k) = q_{k−1}( x̂_m(k−1), u(k−1), 0 )
P_p(k) = A(k−1) P_m(k−1) Aᵀ(k−1) + L(k−1) Q(k−1) Lᵀ(k−1)

Measurement update:

K(k) = P_p(k) Hᵀ(k) ( H(k) P_p(k) Hᵀ(k) + M(k) R(k) Mᵀ(k) )⁻¹
x̂_m(k) = x̂_p(k) + K(k) ( z(k) − h_k( x̂_p(k), 0 ) )
P_m(k) = ( I − K(k) H(k) ) P_p(k)

Key idea:
• Linearize around the estimated states
• A(k), L(k), H(k), M(k) are the partial derivatives of q and h with respect to the states and the noise
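The same cycle with Jacobians evaluated at the current estimate; a scalar sketch with a toy nonlinearity z = x² (illustrative, not from the slides), taking the noise Jacobians L and M as identity:

```python
import numpy as np

def ekf_step(x_m, P_m, z, q, h, Ajac, Hjac, Q, R):
    """One EKF cycle; Ajac and Hjac return the Jacobians of q and h
    evaluated at the current estimate (noise Jacobians taken as I)."""
    x_p = q(x_m)
    A = Ajac(x_m)
    P_p = A @ P_m @ A.T + Q
    H = Hjac(x_p)
    K = P_p @ H.T @ np.linalg.inv(H @ P_p @ H.T + R)
    x_m = x_p + K @ (z - h(x_p))
    P_m = (np.eye(len(x_m)) - K @ H) @ P_p
    return x_m, P_m

# toy system: constant state, nonlinear measurement z = x^2 + noise
q = lambda x: x
h = lambda x: x**2
Ajac = lambda x: np.eye(1)
Hjac = lambda x: np.array([[2.0 * x[0]]])
x, P = np.array([1.9]), 0.1 * np.eye(1)
x, P = ekf_step(x, P, np.array([4.1]), q, h, Ajac, Hjac,
                Q=0.01 * np.eye(1), R=0.1 * np.eye(1))
# the estimate moves from 1.9 toward sqrt(4.1) ≈ 2.02
```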

Page 23: Lecture 13 Visual Inertial Fusion - UZH

Filtering: Visual Inertial Formulation

System states:

Tightly coupled: X(t) = [ W p(t); q_WB(t); W v(t); bᵃ(t); bᵍ(t); W L₁; W L₂; ...; W L_K ]

Loosely coupled: X(t) = [ W p(t); q_WB(t); W v(t); bᵃ(t); bᵍ(t) ]

Process model: from the IMU
• Integration of the IMU states (rotation, position, velocity)
• Propagation of the IMU noise, needed for calculating the Kalman filter gain

Page 24: Lecture 13 Visual Inertial Fusion - UZH

Filtering: Visual Inertial Formulation

Measurement model: from the camera.

Transform the point into the camera frame:

C L = R_CB R_BW ( W L − W p ) + C p_CB

Pinhole projection (without distortion):

u = f_x x_C / z_C + c_x
v = f_y y_C / z_C + c_y

Projection Jacobian (dropping C for clarity):

H_proj = [ f_x/z   0       −f_x x/z² ]
         [ 0       f_y/z   −f_y y/z² ]

Frame-transformation Jacobians:

H_Landmark = R_CB R_BW = R_CW   (w.r.t. the landmark W L)
H_pose: involves R_CB R_BW and the body-frame landmark position B L

Combined:

H_X = H_proj H_pose
H_L = H_proj H_Landmark
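A minimal sketch of the measurement model (frame conventions as above; the exact placement of the camera-body translation term is a plausible reading of the slide, and `H_proj` matches the projection Jacobian):

```python
import numpy as np

def project(L_w, R_cb, R_bw, p_wb, p_cb, fx, fy, cx, cy):
    """Transform a world landmark into the camera frame, then apply the
    pinhole model u = fx*x/z + cx, v = fy*y/z + cy (no distortion)."""
    L_c = R_cb @ (R_bw @ (L_w - p_wb)) + p_cb
    x, y, z = L_c
    return np.array([fx * x / z + cx, fy * y / z + cy])

def H_proj(L_c, fx, fy):
    """Jacobian of the pinhole projection w.r.t. the camera-frame point."""
    x, y, z = L_c
    return np.array([[fx / z, 0.0, -fx * x / z**2],
                     [0.0, fy / z, -fy * y / z**2]])

# identity frames: a point 1 m to the right and 2 m ahead
uv = project(np.array([1.0, 0.0, 2.0]), np.eye(3), np.eye(3),
             np.zeros(3), np.zeros(3), fx=100.0, fy=100.0, cx=50.0, cy=50.0)
# lands at u = 100, v = 50
```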

Page 25: Lecture 13 Visual Inertial Fusion - UZH


Filtering: ROVIO

Bloesch, Michael, et al. "Robust visual inertial odometry using a direct EKF-based approach."

Use pixel intensities as measurements.

Page 26: Lecture 13 Visual Inertial Fusion - UZH


Filtering: Potential Problems

Wrong linearization point

Linearization depends on the current estimates of states, which may be erroneous

Linearization around different values of the same variable leads to estimator inconsistency (wrong observability/covariance estimation)

Wrong covariance/initial states

Intuitively, wrong weights for measurements and prediction

May be overconfident/underconfident

Explosion of the number of states

Computational cost is roughly cubic in the number of states

Each 3D point adds 3 variables


Page 28: Lecture 13 Visual Inertial Fusion - UZH


Filtering: MSCKF (Multi-State Constraint Kalman Filter):

used in Google Tango

Key idea:

Keep a window of recent states

Incorporate visual observations without including the point positions in the state vector

Li, Mingyang, and Anastasios I. Mourikis. "High-precision, consistent EKF-based visual–inertial odometry."

Page 29: Lecture 13 Visual Inertial Fusion - UZH

Filtering: Google Tango

Page 30: Lecture 13 Visual Inertial Fusion - UZH

Fusion solved as a non-linear optimization problem

Increased accuracy over filtering methods

Forster, Carlone, Dellaert, Scaramuzza, IMU Preintegration on Manifold for Efficient Visual-Inertial Maximum-a-Posteriori Estimation, Robotics: Science and Systems'15, Best Paper Award Finalist

Optimization-based Approaches (MAP, Fixed-lag smoothing)

{X*, L*} = argmax_{X,L} P(X, L | Z)
         = argmin_{X,L} Σ_{k=1}^{N} || f(x_{k−1}) − x_k ||² + Σ_{i=1}^{M} || h(x_{k_i}, l_{j_i}) − z_i ||²

Page 31: Lecture 13 Visual Inertial Fusion - UZH

MAP: a nonlinear least squares problem

Bayes' theorem:

P(A | B) = P(B | A) P(A) / P(B)

Maximum a posteriori: given the observations, what is the optimal estimate of the states?

Applied to the state estimation problem:
• X: states (position, attitude, velocity, and 3D point positions)
• Z: measurements (feature positions, IMU readings)

Gaussian property: for i.i.d. variables,

f(x₁, ..., x_k | μ, σ²) = Π_i (1/√(2πσ²)) exp( −(x_i − μ)² / (2σ²) )

Maximizing the probability is equivalent to minimizing the sum of squares.

Page 32: Lecture 13 Visual Inertial Fusion - UZH

MAP: a nonlinear least squares problem

SLAM as a MAP problem:

P(X, L | Z) ∝ P(Z | X, L) P(X, L)
           = P(x₀) Π_{i=1}^{M} P(z_i | x_{k_i}, l_{j_i}) Π_{k=1}^{N} P(x_k | x_{k−1})

Assumptions:
• X and L are independent, and there is no prior information about L
• Measurements are independent
• Markov process model

X = {x₁, ..., x_N}: robot states
L = {l₁, ...}: 3D points
Z = {z₁, ..., z_M}: feature positions

Process and measurement models:

x_k = f(x_{k−1})
z_i = h(x_{k_i}, l_{j_i})

Page 33: Lecture 13 Visual Inertial Fusion - UZH

MAP: a nonlinear least squares problem

SLAM as a least squares problem. Starting from

P(X, L | Z) ∝ P(x₀) Π_{i=1}^{M} P(z_i | x_{k_i}, l_{j_i}) Π_{k=1}^{N} P(x_k | x_{k−1})

without the prior, and applying the property of the Gaussian distribution:

{X*, L*} = argmax_{X,L} P(X, L | Z)
         = argmin_{X,L} Σ_{k=1}^{N} || f(x_{k−1}) − x_k ||² + Σ_{i=1}^{M} || h(x_{k_i}, l_{j_i}) − z_i ||²

Notes:
• Each residual is normalized by the variance of the process or measurement noise (the so-called Mahalanobis distance)

Page 34: Lecture 13 Visual Inertial Fusion - UZH

MAP: Nonlinear optimization

Gauss-Newton method:

θ* = argmin_θ Σ_{i=1}^{M} || f_i(θ) − z_i ||²

Solve it iteratively:

ε* = argmin_ε Σ_{i=1}^{M} || f_i(θˢ + ε) − z_i ||²,   θ^{s+1} = θˢ + ε*

Applying the first-order approximation f_i(θˢ + ε) ≈ f_i(θˢ) + J_i ε:

ε* = argmin_ε Σ_{i=1}^{M} || f_i(θˢ) − z_i + J_i ε ||² = argmin_ε || r(θˢ) + J ε ||²

ε* = −( Jᵀ J )⁻¹ Jᵀ r(θˢ)

with the stacked Jacobians and residuals

J = [ J₁; J₂; ...; J_M ],   r = [ r₁; r₂; ...; r_M ],   r_i = f_i(θˢ) − z_i
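The iteration is short in code; a sketch applied to a made-up one-parameter curve fit y = exp(θt) (the model is illustrative, not from the slides):

```python
import numpy as np

def gauss_newton(f, jac, theta0, z, n_iters=10):
    """Minimize sum_i ||f_i(theta) - z_i||^2 by repeated linearization:
    eps* = -(J^T J)^{-1} J^T r(theta); theta <- theta + eps*."""
    theta = np.array(theta0, float)
    for _ in range(n_iters):
        r = f(theta) - z                        # stacked residuals
        J = jac(theta)                          # stacked Jacobians
        theta = theta - np.linalg.solve(J.T @ J, J.T @ r)
    return theta

# fit y = exp(theta * t) to noiseless samples generated with theta = 0.5
t = np.linspace(0.0, 1.0, 5)
z = np.exp(0.5 * t)
f = lambda th: np.exp(th[0] * t)
jac = lambda th: (t * np.exp(th[0] * t)).reshape(-1, 1)
theta = gauss_newton(f, jac, [0.0], z)
# converges to theta ≈ 0.5
```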

Page 35: Lecture 13 Visual Inertial Fusion - UZH

MAP: visual inertial formulation

States:

X_R(k) = [ R_WB(k); p_WB(k); v_WB(k); bᵃ(k); bᵍ(k) ]
X_L = [ W L₁; W L₂; ...; W L_L ]
X = [ X_R(1); X_R(2); ...; X_R(k); X_L ]

Residuals:

f(x_{k−1}) − x_k   (dynamics)
h(x_k, L_j) − z_i   (measurements)

Dynamics Jacobians:
• IMU integration w.r.t. x_{k−1}
• Residual w.r.t. x_k

Measurement Jacobians (same as in the filtering method):
• Feature position w.r.t. the pose
• Feature position w.r.t. the 3D coordinates

Combined: stack all residuals and Jacobians into one least-squares problem, as in Gauss-Newton.

Page 36: Lecture 13 Visual Inertial Fusion - UZH


MAP: why it is slow

Re-linearization

Need to recalculate the Jacobian for each iteration

But it is also an important reason why MAP is accurate

The number of states is large

We will see next: fixed-lag smoothing and marginalization

Re-integration of IMU measurements

The integration from k to k+1 is related to the state estimation at time k

Preintegration

Lupton, Todd, and Salah Sukkarieh. "Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions." Forster, Christian, et al. "IMU preintegration on manifold for efficient visual-inertial maximum-a-posteriori estimation."

Page 37: Lecture 13 Visual Inertial Fusion - UZH

MAP: IMU Preintegration

The raw measurements {ω̃, ã} between two keyframes are summarized into deltas ΔR̃, Δṽ, Δp̃.

Standard: evaluate the error in the global frame:

e_R = R̃(ω̃; R_{k−1})ᵀ R_k
e_v = ṽ(ω̃, ã; v_{k−1}) − v_k
e_p = p̃(ω̃, ã; p_{k−1}) − p_k

The integration must be repeated whenever the previous state (R, p, v) changes!

Preintegration: evaluate relative errors:

e_R = ΔR̃ᵀ ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp

(tilde: predicted from the measurements; plain: current estimate)

Preintegration of the IMU deltas is possible with no initial condition required.
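A sketch of accumulating the preintegrated deltas between two keyframes (biases and noise propagation omitted for brevity; `so3_exp` is the standard Rodrigues formula):

```python
import numpy as np

def so3_exp(phi):
    """Rodrigues formula: map a rotation vector to a rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3)
    k = phi / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def preintegrate(omegas, accs, dt):
    """Accumulate IMU deltas (dR, dv, dp) between two keyframes.
    The deltas depend only on the measurements, not on the initial state,
    so they need not be recomputed when earlier states are re-estimated."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(omegas, accs):
        dp = dp + dv * dt + 0.5 * (dR @ np.asarray(a)) * dt**2
        dv = dv + (dR @ np.asarray(a)) * dt
        dR = dR @ so3_exp(np.asarray(w) * dt)
    return dR, dv, dp

# constant rotation about z, no acceleration: dR is a 90-degree rotation
w = np.array([0.0, 0.0, np.pi / 2])
dR, dv, dp = preintegrate([w] * 100, [np.zeros(3)] * 100, dt=0.01)
```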

Page 38: Lecture 13 Visual Inertial Fusion - UZH


Fusion solved as a non-linear optimization problem

Increased accuracy over filtering methods

Cost terms: IMU residuals + reprojection residuals

Forster, Carlone, Dellaert, Scaramuzza, IMU Preintegration on Manifold for Efficient Visual-Inertial Maximum-a-Posteriori Estimation, Robotics: Science and Systems'15, Best Paper Award Finalist

Optimization-based Approaches (MAP, Fixed-lag smoothing)

Page 39: Lecture 13 Visual Inertial Fusion - UZH


MAP: SVO + IMU Preintegration

[Plot: trajectory accuracy comparison of Google Tango, the proposed method, and OKVIS]

Accuracy: 0.1% of the travel distance

Open Source

Forster, Carlone, Dellaert, Scaramuzza, IMU Preintegration on Manifold for Efficient Visual-Inertial Maximum-a-Posteriori Estimation, Robotics: Science and Systems'15 and IEEE TRO'16, Best Paper Award Finalist

Page 40: Lecture 13 Visual Inertial Fusion - UZH


MAP: SVO + IMU Preintegration

Page 41: Lecture 13 Visual Inertial Fusion - UZH

Fixed-lag smoothing: Basic Idea

Recall MAP estimation:

ε* = −( Jᵀ J )⁻¹ Jᵀ r(θ)

Jᵀ J is also called the Hessian matrix.

The Hessian for full bundle adjustment is n × n, where n is the number of all states (poses and velocities | landmarks).

If only part of the states are of interest, can we think of a way to simplify?

Page 42: Lecture 13 Visual Inertial Fusion - UZH

Fixed-lag smoothing: Marginalization

Schur complement. For a block matrix

M = [ A  B ]
    [ C  D ]

the Schur complement of A in M is D − C A⁻¹ B; the Schur complement of D in M is A − B D⁻¹ C.

Reduced linear system:

[ A  B ] [ x₁ ]   [ b₁ ]
[ C  D ] [ x₂ ] = [ b₂ ]

Left-multiplying by [ I 0; −C A⁻¹ I ] gives

[ A  B           ] [ x₁ ]   [ b₁            ]
[ 0  D − C A⁻¹ B ] [ x₂ ] = [ b₂ − C A⁻¹ b₁ ]

We can then just solve for x2, and (optionally) solve for x1 by back substitution.
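A numerical sketch of this elimination, checked against solving the full system directly (random test matrix, illustrative only):

```python
import numpy as np

def solve_marginalized(A, B, C, D, b1, b2):
    """Solve [[A,B],[C,D]] [x1;x2] = [b1;b2] by eliminating x1:
    (D - C A^{-1} B) x2 = b2 - C A^{-1} b1, then back-substitute x1."""
    Ainv = np.linalg.inv(A)
    S = D - C @ Ainv @ B                 # Schur complement of A in M
    x2 = np.linalg.solve(S, b2 - C @ Ainv @ b1)
    x1 = Ainv @ (b1 - B @ x2)            # optional back substitution
    return x1, x2

# random symmetric positive-definite test system
rng = np.random.default_rng(0)
G = rng.standard_normal((5, 5))
M = G @ G.T + 5 * np.eye(5)
b = rng.standard_normal(5)
x1, x2 = solve_marginalized(M[:3, :3], M[:3, 3:], M[3:, :3], M[3:, 3:],
                            b[:3], b[3:])
# matches the direct solve of the full system
```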

Page 43: Lecture 13 Visual Inertial Fusion - UZH

Fixed-lag smoothing: Marginalization

Generalized Schur complement
• Any principal submatrix can be chosen: select n rows and columns of the same indices (i.e., select any states to marginalize)
• Singular submatrix: use a generalized inverse (e.g., the Moore–Penrose pseudoinverse)

Special structure of SLAM
• Marginalization causes fill-in: the sparse structure is no longer maintained
• The inverse D⁻¹ of a (block-)diagonal matrix is very efficient to calculate

Page 44: Lecture 13 Visual Inertial Fusion - UZH

Fixed-lag smoothing: Implementation

States and formulations are similar to MAP estimation.

Which states to marginalize?

Old states: keep a window of recent frames

Landmarks: structureless

Marginalizing states vs. dropping the states

Dropping the states: loss of information, not optimal

Marginalization: optimal if there is no linearization error, but introduces fill-in, causing performance penalty

Therefore, dropping states is also used to trade accuracy for speed.

Leutenegger, Stefan, et al. "Keyframe-based visual–inertial odometry using nonlinear optimization."

Page 45: Lecture 13 Visual Inertial Fusion - UZH

Fixed-lag smoothing: OKVIS

Page 46: Lecture 13 Visual Inertial Fusion - UZH


Further topics

Rotation parameterization

Rotation is tricky to deal with…

Euler angle / Rotation matrix / Quaternion / SO(3)

Consistency: filtering and fixed-lag smoothing

Linearization around different values of the same variable may lead to error