generation and weighting of 3d point correspondences for improved registration of rgb-d data

Generation and Weighting of 3D Point Correspondences for Improved Registration of RGB-D Data

Kourosh Khoshelham

Daniel Dos Santos

George Vosselman

MAPPING BY RGB-D DATA

RGB-D cameras like Kinect have great potential for indoor mapping;

Kinect captures:

depth + color images @ ~30 fps = sequence of colored point clouds

[Figure: Kinect sensor with IR emitter, RGB camera and IR camera]

REGISTRATION OF RGB-D DATA

Mapping requires registration of consecutive frames;

Registration: transforming all point clouds into one coordinate system (usually that of the first frame).

$$\mathbf{X}_{i,j-1} = \mathbf{R}_j^{j-1}\,\mathbf{X}_{i,j} + \mathbf{t}_j^{j-1}$$

where X_{i,j} is point i in frame j, X_{i,j-1} is point i in frame j-1, and (R_j^{j-1}, t_j^{j-1}) is the transformation from frame j to frame j-1.

REGISTRATION BY VISUAL FEATURES

Extraction and matching of keypoints is done more reliably in RGB images;

Two main components:

Keypoint extraction and matching: SIFT, SURF, …

Outlier detection: RANSAC, M-estimator, …

Processing steps:

1. SURF matches;

2. Conversion to 3D correspondences (using depth data);

3. Removing outliers (RANSAC);

4. Least-squares estimation of registration parameters.
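The least-squares step can be illustrated with a closed-form rigid-body fit. The following is a minimal sketch, assuming the 3D correspondences are already outlier-free and using the SVD-based (Kabsch) solution, which the slides do not name; the function name `estimate_rigid_transform` and the optional `weights` argument are hypothetical.

```python
import numpy as np

def estimate_rigid_transform(X_prev, X_curr, weights=None):
    """Least-squares R, t such that X_prev ~ R @ X_curr + t.

    X_prev, X_curr: (n, 3) arrays of corresponding 3D points in
    frame j-1 and frame j. weights: optional (n,) positive weights
    (hypothetical argument; uniform weights if omitted).
    """
    X_prev = np.asarray(X_prev, dtype=float)
    X_curr = np.asarray(X_curr, dtype=float)
    if weights is None:
        w = np.ones(len(X_curr))
    else:
        w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Weighted centroids of both point sets
    c_prev = w @ X_prev
    c_curr = w @ X_curr

    # Weighted cross-covariance of the centred coordinates
    A = X_curr - c_curr
    B = X_prev - c_prev
    H = (A * w[:, None]).T @ B

    # Rotation from the SVD, with a guard against reflections
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_prev - R @ c_curr
    return R, t
```

With noise-free correspondences the function recovers the transformation exactly (up to rounding); with noisy data it returns the least-squares optimum for the given weights.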

CHALLENGES AND OBJECTIVES

Challenge:

Pairwise registration errors accumulate, resulting in a deformed point cloud

Objective:

More accurate pairwise registration by:

i. Accurate generation of 3D correspondences from 2D points;

ii. Weighting 3D point pairs based on random error of depth.


GENERATION OF 3D POINT CORRESPONDENCES

2D keypoints to 3D point correspondences? (ill-posed)

Do RGB image coordinates relate to depth image coordinates by a shift?

Note: the FOVs of the RGB camera and the IR camera are different!

Our approach:

Transform 2D keypoints from the RGB image to the depth image using the relative orientation between the two cameras;

Search along the epipolar line for the correct 3D coordinates.

Note: the relative orientation parameters are estimated during calibration.


More formally:

Given a keypoint in the RGB frame:

1. calculate the epipolar line in the depth frame using the relative orientation parameters;

2. define a search band along the epipolar line using the minimum and maximum of the range of depth values (0.5 m and 5 m respectively);

For all pixels within the search band:

1. calculate 3D coordinates and re-project the resulting 3D point back to the RGB frame;

2. calculate and store the distance between the re-projected point and the original keypoint;

Return the 3D point whose re-projection has the smallest distance to the keypoint.
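The search procedure above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: both cameras are ideal pinhole cameras without lens distortion, the depth image stores metric depth per pixel, `R` and `t` map depth-camera coordinates to RGB-camera coordinates, and the epipolar segment is obtained by back-projecting the keypoint at the minimum and maximum depth. All function and parameter names (`keypoint_to_3d`, `band`, and so on) are hypothetical.

```python
import numpy as np

def project(K, X):
    """Pinhole projection of a 3D point X (camera frame) to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

def backproject(K, uv, Z):
    """3D point at depth Z on the ray through pixel uv."""
    return Z * (np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0]))

def keypoint_to_3d(kp_rgb, depth, K_rgb, K_d, R, t,
                   z_min=0.5, z_max=5.0, band=2):
    """Return (X_d, dist): the 3D point in the depth-camera frame whose
    re-projection into the RGB image is closest to kp_rgb, and that distance.
    Assumes X_rgb = R @ X_d + t."""
    kp_rgb = np.asarray(kp_rgb, dtype=float)

    # End points of the epipolar segment in the depth image, from the
    # keypoint's viewing ray at the minimum and maximum depth
    ends = []
    for Z in (z_min, z_max):
        X_rgb = backproject(K_rgb, kp_rgb, Z)
        ends.append(project(K_d, R.T @ (X_rgb - t)))
    p0, p1 = ends

    h, w = depth.shape
    n_steps = int(np.ceil(np.linalg.norm(p1 - p0))) + 1
    best, best_dist = None, np.inf
    for s in np.linspace(0.0, 1.0, n_steps):
        c = p0 + s * (p1 - p0)
        u0, v0 = int(round(c[0])), int(round(c[1]))
        # Search band: a few pixels on either side of the line
        for v in range(v0 - band, v0 + band + 1):
            for u in range(u0 - band, u0 + band + 1):
                if not (0 <= u < w and 0 <= v < h):
                    continue
                Z = depth[v, u]
                if not (z_min <= Z <= z_max):
                    continue  # invalid or out-of-range depth
                X_d = backproject(K_d, (u, v), Z)
                dist = np.linalg.norm(project(K_rgb, R @ X_d + t) - kp_rgb)
                if dist < best_dist:
                    best, best_dist = X_d, dist
    return best, best_dist
```

The band half-width trades robustness against calibration error for run time; the value of 2 pixels is a placeholder, not a figure from the slides.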


Finding 3D points in the depth image (right) corresponding to 2D keypoints in the RGB image (left) by searching along epipolar lines (red bands).

ESTIMATING RELATIVE ORIENTATION PARAMETERS

Relative orientation between the RGB camera and IR camera:

approximate by a shift;

estimate by stereo calibration;

estimate by space resection.

Manually measured markers in the disparity (left) and colour image (right) used for the estimation of relative orientation parameters by space resection.

WEIGHTING OF 3D POINT CORRESPONDENCES

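Observation equation in the estimation model:

$$\mathbf{v}_i = \mathbf{X}_{i,j-1} - \mathbf{R}_j^{j-1}\,\mathbf{X}_{i,j} - \mathbf{t}_j^{j-1}$$

Approximate as:

$$\mathbf{v}_i \approx \mathbf{X}_{i,j-1} - \mathbf{X}_{i,j}$$

Note: because of the high frame rate, the transformation parameters between consecutive frames are quite small.

Define weights as:

$$w_i = \frac{k}{\sigma^2_{\mathbf{v}_i}} = \frac{k}{\sigma^2_{\mathbf{X}_{i,j-1}} + \sigma^2_{\mathbf{X}_{i,j}}}$$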


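We use the random error of depth only:

Relation between disparity (d) and depth (Z), where c_0 and c_1 are calibration parameters:

$$Z^{-1} = c_0 + c_1 d$$

Propagation of variance:

$$\sigma_Z^2 = c_1^2\,Z^4\,\sigma_d^2$$

Weight:

$$w_i = \frac{k}{c_1^2\,\sigma_d^2\,\left(Z_{i,j-1}^4 + Z_{i,j}^4\right)}$$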
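The weight definition translates directly into code. A minimal sketch follows; the numeric values of `C1` and `SIGMA_D` are hypothetical placeholders chosen only to make the example run, not calibration results from the slides, and the function names are illustrative.

```python
def depth_variance(Z, c1, sigma_d):
    """Variance of depth Z from propagation of variance: c1^2 * Z^4 * sigma_d^2."""
    return (c1 ** 2) * (Z ** 4) * (sigma_d ** 2)

def correspondence_weight(Z_prev, Z_curr, c1, sigma_d, k=1.0):
    """Weight of a 3D point pair with depths Z_prev (frame j-1) and Z_curr (frame j)."""
    return k / (depth_variance(Z_prev, c1, sigma_d)
                + depth_variance(Z_curr, c1, sigma_d))

# Hypothetical calibration values, for illustration only
C1 = -2.85e-3    # slope of the inverse-depth/disparity relation [1/m per disparity unit]
SIGMA_D = 0.5    # standard deviation of measured disparity [disparity units]

w_near = correspondence_weight(1.0, 1.0, C1, SIGMA_D)
w_far = correspondence_weight(4.0, 4.0, C1, SIGMA_D)
print(w_near / w_far)  # ratio of the two weights
```

Because the variance grows with the fourth power of depth, a point pair at 4 m receives 4^4 = 256 times less weight than a pair at 1 m, independent of the calibration values and of the constant k.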

RESULTS: ACCURACY OF 3D POINT CORRESPONDENCES

[Figure: relative orientation approximated by a shift]

[Figure: relative orientation estimated by stereo calibration]

[Figure: relative orientation estimated by space resection]

EFFECT OF WEIGHTS IN REGISTRATION

Six RGB-D sequences of an office environment;

Trajectories formed closed loops;

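Evaluation by closing error:

$$\mathbf{H} = \mathbf{H}_1^{n}\,\mathbf{H}_n^{1} = \begin{bmatrix}\mathbf{R} & \mathbf{v}\\ \mathbf{0}^{T} & 1\end{bmatrix}$$

where H_1^n is the transformation from the first frame to the last frame, H_n^1 is the transformation from the last frame to the first frame, v is the closing translation and R is the closing rotation.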
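The closing error can be computed from the two homogeneous transformations as sketched below. The slides do not state how the closing distance and closing angle are derived from the closing translation and rotation; this sketch assumes the distance is the Euclidean norm of the translation and the angle is the rotation angle of the axis-angle representation. The function names are hypothetical.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transformation from a rotation R and translation t."""
    H = np.eye(4)
    H[:3, :3] = R
    H[:3, 3] = t
    return H

def closing_error(H_first_to_last, H_last_to_first):
    """Closing distance (same unit as t) and closing angle (degrees).

    For a perfectly closed loop the product of the two transformations
    is the identity, so both values are zero.
    """
    H = H_first_to_last @ H_last_to_first
    R, v = H[:3, :3], H[:3, 3]
    distance = float(np.linalg.norm(v))
    # Rotation angle from the trace; clip guards against rounding errors
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    angle_deg = float(np.degrees(np.arccos(cos_angle)))
    return distance, angle_deg
```

In practice the first transformation would be the chain of pairwise registrations along the trajectory and the second a direct registration of the last frame to the first.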


[Figure: closing distance for the six sequences registered with and without weights]

[Figure: closing angle for the six sequences registered with and without weights]


Average closing errors for registrations with and without weights:

Registration      Average closing distance [cm]   Average closing angle [deg]
without weights   6.42                            6.32
with weights      3.80                            4.74


The trajectory obtained by weighted registration (in blue) is more accurate than the one without weights (in red).

EXAMPLE REGISTRATION RESULTS


CONCLUSIONS

Accurate transformation of keypoints from the RGB space to the 3D space leads to more accurate registration of consecutive frames;

Assigning weights based on the random error of depth improves the accuracy of pairwise registration and sensor pose estimates.

Using weights yields covariance matrices for pose vectors, which can be used to weight pose vectors in the global adjustment = more accurate loop closure

Influence of synchronization errors (between RGB and IR cam)

Fine registration using point and plane correspondences extracted directly from the point cloud.

SUPPLEMENTARY SLIDES


Measurement principle of Kinect

Depth measurement by triangulation:

The laser source emits a laser beam;

A diffraction grating splits the beam to create a pattern of speckles projected onto the scene;

The speckles are captured by the infrared camera;

The speckle image is correlated with a reference image obtained by capturing a plane at a known distance from the sensor;

The result of correlation is a disparity value for each pixel, from which depth can be calculated.

[Figures: IR image of the pattern of speckles projected onto the scene; resulting disparity image]

Depth-disparity relation and calculation of point coordinates

From triangle similarities:

$$Z_k = \frac{Z_o}{1 + \dfrac{Z_o}{f b}\,d}$$

and:

$$X_k = -\frac{Z_k}{f}\,(x_k - x_o + \delta x), \qquad Y_k = -\frac{Z_k}{f}\,(y_k - y_o + \delta y)$$

where:

Zo: distance of the reference plane

f: focal length of the IR camera

d: measured disparity

b: base length between emitter and IR camera

xk, yk: image coordinates of point k

xo, yo: principal point offsets

δx, δy: lens distortion corrections
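The depth-disparity relation and the point coordinate equations can be evaluated as in the sketch below. The sign convention (negative factor in X and Y) follows the usual photogrammetric form of these equations and is an assumption here; the function names and the numeric values in the usage lines are hypothetical, not calibration results from the slides.

```python
def depth_from_disparity(d, Z_o, f, b):
    """Depth Z_k from disparity d: Z_o / (1 + Z_o / (f * b) * d).

    f and d must share one unit (e.g. pixels); Z_o and b share another (e.g. metres).
    """
    return Z_o / (1.0 + (Z_o / (f * b)) * d)

def point_from_pixel(x, y, d, Z_o, f, b, x_o=0.0, y_o=0.0, dx=0.0, dy=0.0):
    """Object coordinates (X, Y, Z) of the point imaged at (x, y) with disparity d.

    x_o, y_o: principal point offsets; dx, dy: lens distortion corrections
    (zero by default, i.e. an undistorted image is assumed).
    """
    Z = depth_from_disparity(d, Z_o, f, b)
    X = -(Z / f) * (x - x_o + dx)
    Y = -(Z / f) * (y - y_o + dy)
    return X, Y, Z

# Hypothetical sensor values, for illustration only
Z_O, F, B = 2.0, 580.0, 0.075   # reference distance [m], focal length [px], base length [m]

on_reference_plane = point_from_pixel(0.0, 0.0, 0.0, Z_O, F, B)
closer_point = point_from_pixel(58.0, 0.0, F * B / Z_O, Z_O, F, B)
```

A disparity of zero places the point on the reference plane (Z equal to Zo); with this formula a disparity of f·b/Zo halves the depth.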

Calibration

Calibration procedure:

Standard calibration of the IR camera:

focal length (f);

principal point offsets (xo, yo);

lens distortion coefficients (in δx, δy);

Depth calibration:

base length (b);

distance of the reference pattern (Zo).

Normalization: d = md' + n

$$Z_k^{-1} = \frac{m}{f b}\,d' + \left(Z_o^{-1} + \frac{n}{f b}\right)$$

Depth calibration parameters: m/(fb) and Zo^{-1} + n/(fb).

Theoretical model of depth random error

Depth equation:

$$Z_k^{-1} = \frac{m}{f b}\,d' + \left(Z_o^{-1} + \frac{n}{f b}\right)$$

Depth random error (by propagation of variance):

$$\sigma_{Z_k} = \left(\frac{m}{f b}\right) Z_k^2\,\sigma_{d'}$$

Random error is a quadratic function of depth.
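The propagation step can be written out explicitly. The short derivation below assumes that the normalized disparity d' is the only random quantity and that the calibration parameters are error-free, which the slide does not state:

```latex
\begin{align}
Z_k &= \left(\frac{m}{fb}\,d' + Z_o^{-1} + \frac{n}{fb}\right)^{-1} \\
\frac{\partial Z_k}{\partial d'} &= -\frac{m}{fb}\left(\frac{m}{fb}\,d' + Z_o^{-1} + \frac{n}{fb}\right)^{-2} = -\frac{m}{fb}\,Z_k^2 \\
\sigma_{Z_k}^2 &= \left(\frac{\partial Z_k}{\partial d'}\right)^2 \sigma_{d'}^2 = \left(\frac{m}{fb}\right)^2 Z_k^4\,\sigma_{d'}^2
\end{align}
```

Taking the square root gives a standard deviation proportional to the squared depth, which is why the variance used for weighting grows with the fourth power of depth.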

Depth random error

Standard deviation of plane fitting residuals as a measure of depth random error;

As expected, depth random error increases quadratically with increasing distance from the sensor.

[Figure: panels labelled 1.0 m, 2.0 m, 3.0 m, 4.0 m and 5.0 m]
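The plane-fitting measure can be sketched as follows. The slides do not say how the plane was fitted; this example assumes an orthogonal least-squares fit via SVD and reports the root-mean-square of the point-to-plane distances. The function name is hypothetical.

```python
import numpy as np

def plane_fit_residual_std(points):
    """RMS of orthogonal residuals after fitting a plane to (n, 3) points.

    The plane passes through the centroid; its normal is the right
    singular vector belonging to the smallest singular value.
    """
    P = np.asarray(points, dtype=float)
    centred = P - P.mean(axis=0)
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)
    normal = Vt[-1]
    residuals = centred @ normal   # signed point-to-plane distances
    return float(np.sqrt(np.mean(residuals ** 2)))
```

Applied to point clouds of a flat wall captured at increasing distances, the returned value would be expected to grow roughly with the square of the distance if the theoretical error model holds.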

Depth resolution

Depth resolution is also proportional to the squared distance from the sensor;

At a maximum range of 5 m depth resolution is 7 cm.

[Figure: distribution of plane fitting residuals on the plane at 4 m distance]

[Figure: side view of the points on the plane at 4 m (effect of depth resolution)]
