sun3d: a database of big spaces reconstructed using sfm and...

hotel_umd/maryland_hotel3 (JSON XML) Loaded

0374

Image Depth Hybrid Clear Delete pillow: 2pillow: 1pillow: 3bed headboard: 1bed headboard: 2bed: 1bed: 2wall: 1pillow: 4lamp: 1night stand: 1telephone: 1sofa chairfloorwall: window sidewall: pillar 1wall: pillar 2curtain: 1testwall: window 2wall: window 3

SUN3D: A Database of Big Spaces Reconstructed using SfM and Object LabelsJianxiong Xiao (Princeton) Andrew Owens (MIT) Antonio Torralba (MIT)

Motivation

References

Generalized Bundle Adjustment

Multi-view Object Annotation

SUN3D: A Place-centric 3D Database

“Data-driven” Brute-force SfM

VISI N GROUP

http://sun3d.cs.princeton.edu

1 meter

UW NYU SUN3DBerkeley

0 10000 200000%

5%

10%

15%

number of frames (30 fps)0 100 200 300 400 500

0%

5%

10%

15%

20%

area coverage (meter2)−90 −60 −30 0 30 60 90

0%

5%

10%

camera tilt angle (degree)−90 −60 −30 0 30 60 90

0%

5%

10%

camera roll angle (degree)

0 0.5 1 1.5 2 2.50%

5%

10%

15%

20%

camera height (meter)0 1000 2000 3000 4000 5000

0%

5%

10%

15%

20%

# frames taken within 1m0 1 2 3 4 5 6 7 8 9 10

0%

5%

10%

15%

20%

25%

distance to surface (meter)0 1 2 3 4 5 6 7 8 9 10

0%

5%

10%

15%

20%

25%

mean distance to surface (meter)

0% 25% 50% 75% 100%0%

5%

10%

15%

20%

area explored by the observer0 1000 2000 3000 4000

0%

10%

20%

30%

40%

# frames that a cell is visible0 10 20 30 40 50 60 70

0%

5%

10%

15%

20%

25%

# viewing directions0% 25% 50% 75% 100%

0%

10%

20%

30%

40%

pixels with valid depth

Geometric Statistics

otherscorridor(5%)

restroom(5%)classroom(6%)

office(6%)dorm(6%)lounge(9%)

hotel(9%)lab(9%)

apartment(17%)

conference room(19%)

dorm room(3%)corridor(2%)kitchen(2%)

playroom(3%)living room(3%)

staircase(4%)

bedroom(13%)

bathroom(20%)others

conference room(8%)restroom(6%)

escalator(6%)cubicle office(5%)basement(4%)

attic(2%)laundromat(2%)

View category distribution.

Scene Statistics

Raw Depth

Improved Depth

Image

Difference Depth

Raw Normals

Improved Normals

Raw Phong

Improved Phong

Raw TSDF (Ours) Cross-bilateral Filter

apartment conference room conference hall restroom classroom dorm hotel room lab lounge office57 m2 47 m2 130 m2 23 m2 109 m2 33 m2 41 m2 55 m2 124 m2 49 m2

R2,t2

R1,t1

Frame 1

Frame 2

Frame 3R3,t3

X1

X2

X3

Frame 1 Frame 2 Frame 3R3,t3

X1

X2 X3

R2,t2 R1,t1

Bounding Box Constraints

0

0wall painting suitcase toilet

headboard chair pillow bathtub

curtain bed door cabinet

Data & Source Code Available

[1] SUN Database: Large-scale Scene Recognition from Abbey to Zoo. J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba. CVPR2010.[2] Indoor Segmentation and Support Inference from RGBD Images. N. Silberman, P. Kohli, D. Hoiem and R. Fergus. ECCV2012.[3] Shape Anchors for Data-driven Multi-view Reconstruction. A. Owens, J. Xiao, A. Torralba and W. T. Freeman. ICCV2013.[4] Recognizing Scene Viewpoint using Panoramic Place Representa-tion. J. Xiao, K. A. Ehinger, A. Oliva and A. Torralba. CVPR2012.

Source Code:• Automatic RGB-D SfM• Generalized Bundle Adjustment• TSDF Depth Improvement• Online Annotation Tool• IO functions to read SUN3D

Ψ(X, s) =∥∥∥max

(0,X− s

2,−X− s

2

)∥∥∥

Frame 1448 Frame 1188 BOW td-idf scores

2D View-based SUN database 3D Place-centric SUN3D database

SUN3D Data:• RGB-D Video for Big Spaces• Camera Poses• Refined Depth Maps• Object Labels (in progress)• Point Cloud

Corrected Reconstruction and Object Annotations (Colored Based on Semantic Categories)

Visualization of the Constraints

With Object-to-Object CorrespondencesWithout Object Constraint

Segmented Point Cloud Merged from Different Views

Idea: Turn object annotations into bounding box constraints to fix loop closure errors. Constrain the points for each object instance to lie inside a single bounding box.

NYU Depth V2 SUN3Draw video yes yes

coverage area part of a room whole room or multi-roomstypical length hundreds of frames thousands of framescamera poses no yesobject label sparse frames whole video

instance label within one frame within the whole videodepth improvement cross-bilateral filtering multi-frame TSDF filling

Differences between NYU and SUN3D

Key Frame Key FramePropagation Result

Frame 76 Frame 121Frame 96

Propagation ResultPropagation Result

More Propagation Results

Existing scene datasets contain only a limited set of views of a place, and they lack representa-tions of complete 3D spaces. We introduce SUN3D, a large-scale RGB-D video database with camera poses and object labels, capturing the full 3D extent of big spaces.

Place category distribution

Data Capturing

The tasks that go into creating such a dataset are di�cult in isolation:• Labeling videos is painstaking• SfM is unreliable for large spaces. But if we combine them together, we make the dataset construction task easier. We use 3D to make object annota-tion easier, and we use object labels to correct large camera pose errors.

min∑c

∑p∈V(c)

(∥∥x̃c

p−K [Rc|tc]Xp

∥∥2+λ0

∥∥∥X̃cp−[Rc|tc]Xp

∥∥∥2

)+λ1

∑o

∑c

∑p∈L(o,c)

Ψ([Ro|to] [Rc|tc]−1X̃c

p, so)2

where

SemanticObject

Labeling

Good Reconstruction Semantic Segmentation

errors

Bad Reconstruction

err

TSDF-based Depth Map Improvement

RGB-D bundle adjustment Bounding box costs

c p∈V(c) o c p∈L(o,c)

• Use conservative loop closing (high precision and low recall)• Capture video in many passes so that similar views appear frequently, and loop closing succeeds.

sun3d: a database of big spaces reconstructed using sfm and...

Documents