hotel_umd/maryland_hotel3 (JSON XML) Loaded
0374
Image Depth Hybrid Clear Delete pillow: 2pillow: 1pillow: 3bed headboard: 1bed headboard: 2bed: 1bed: 2wall: 1pillow: 4lamp: 1night stand: 1telephone: 1sofa chairfloorwall: window sidewall: pillar 1wall: pillar 2curtain: 1testwall: window 2wall: window 3
SUN3D: A Database of Big Spaces Reconstructed using SfM and Object LabelsJianxiong Xiao (Princeton) Andrew Owens (MIT) Antonio Torralba (MIT)
Motivation
References
Generalized Bundle Adjustment
Multi-view Object Annotation
SUN3D: A Place-centric 3D Database
“Data-driven” Brute-force SfM
VISI N GROUP
http://sun3d.cs.princeton.edu
1 meter
UW NYU SUN3DBerkeley
0 10000 200000%
5%
10%
15%
number of frames (30 fps)0 100 200 300 400 500
0%
5%
10%
15%
20%
area coverage (meter2)−90 −60 −30 0 30 60 90
0%
5%
10%
camera tilt angle (degree)−90 −60 −30 0 30 60 90
0%
5%
10%
camera roll angle (degree)
0 0.5 1 1.5 2 2.50%
5%
10%
15%
20%
camera height (meter)0 1000 2000 3000 4000 5000
0%
5%
10%
15%
20%
# frames taken within 1m0 1 2 3 4 5 6 7 8 9 10
0%
5%
10%
15%
20%
25%
distance to surface (meter)0 1 2 3 4 5 6 7 8 9 10
0%
5%
10%
15%
20%
25%
mean distance to surface (meter)
0% 25% 50% 75% 100%0%
5%
10%
15%
20%
area explored by the observer0 1000 2000 3000 4000
0%
10%
20%
30%
40%
# frames that a cell is visible0 10 20 30 40 50 60 70
0%
5%
10%
15%
20%
25%
# viewing directions0% 25% 50% 75% 100%
0%
10%
20%
30%
40%
pixels with valid depth
Geometric Statistics
otherscorridor(5%)
restroom(5%)classroom(6%)
office(6%)dorm(6%)lounge(9%)
hotel(9%)lab(9%)
apartment(17%)
conference room(19%)
dorm room(3%)corridor(2%)kitchen(2%)
playroom(3%)living room(3%)
staircase(4%)
bedroom(13%)
bathroom(20%)others
conference room(8%)restroom(6%)
escalator(6%)cubicle office(5%)basement(4%)
attic(2%)laundromat(2%)
View category distribution.
Scene Statistics
Raw Depth
Improved Depth
Image
Difference Depth
Raw Normals
Improved Normals
Raw Phong
Improved Phong
Raw TSDF (Ours) Cross-bilateral Filter
apartment conference room conference hall restroom classroom dorm hotel room lab lounge office57 m2 47 m2 130 m2 23 m2 109 m2 33 m2 41 m2 55 m2 124 m2 49 m2
R2,t2
R1,t1
Frame 1
Frame 2
Frame 3R3,t3
X1
X2
X3
Frame 1 Frame 2 Frame 3R3,t3
X1
X2 X3
R2,t2 R1,t1
Bounding Box Constraints
0
0wall painting suitcase toilet
headboard chair pillow bathtub
curtain bed door cabinet
Data & Source Code Available
[1] SUN Database: Large-scale Scene Recognition from Abbey to Zoo. J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba. CVPR2010.[2] Indoor Segmentation and Support Inference from RGBD Images. N. Silberman, P. Kohli, D. Hoiem and R. Fergus. ECCV2012.[3] Shape Anchors for Data-driven Multi-view Reconstruction. A. Owens, J. Xiao, A. Torralba and W. T. Freeman. ICCV2013.[4] Recognizing Scene Viewpoint using Panoramic Place Representa-tion. J. Xiao, K. A. Ehinger, A. Oliva and A. Torralba. CVPR2012.
Source Code:• Automatic RGB-D SfM• Generalized Bundle Adjustment• TSDF Depth Improvement• Online Annotation Tool• IO functions to read SUN3D
Ψ(X, s) =∥∥∥max
(0,X− s
2,−X− s
2
)∥∥∥
Frame 1448 Frame 1188 BOW td-idf scores
2D View-based SUN database 3D Place-centric SUN3D database
SUN3D Data:• RGB-D Video for Big Spaces• Camera Poses• Refined Depth Maps• Object Labels (in progress)• Point Cloud
Corrected Reconstruction and Object Annotations (Colored Based on Semantic Categories)
Visualization of the Constraints
With Object-to-Object CorrespondencesWithout Object Constraint
Segmented Point Cloud Merged from Different Views
Idea: Turn object annotations into bounding box constraints to fix loop closure errors. Constrain the points for each object instance to lie inside a single bounding box.
NYU Depth V2 SUN3Draw video yes yes
coverage area part of a room whole room or multi-roomstypical length hundreds of frames thousands of framescamera poses no yesobject label sparse frames whole video
instance label within one frame within the whole videodepth improvement cross-bilateral filtering multi-frame TSDF filling
Differences between NYU and SUN3D
Key Frame Key FramePropagation Result
Frame 76 Frame 121Frame 96
Propagation ResultPropagation Result
More Propagation Results
Existing scene datasets contain only a limited set of views of a place, and they lack representa-tions of complete 3D spaces. We introduce SUN3D, a large-scale RGB-D video database with camera poses and object labels, capturing the full 3D extent of big spaces.
Place category distribution
Data Capturing
The tasks that go into creating such a dataset are di�cult in isolation:• Labeling videos is painstaking• SfM is unreliable for large spaces. But if we combine them together, we make the dataset construction task easier. We use 3D to make object annota-tion easier, and we use object labels to correct large camera pose errors.
min∑c
∑p∈V(c)
(∥∥x̃c
p−K [Rc|tc]Xp
∥∥2+λ0
∥∥∥X̃cp−[Rc|tc]Xp
∥∥∥2
)+λ1
∑o
∑c
∑p∈L(o,c)
Ψ([Ro|to] [Rc|tc]−1X̃c
p, so)2
where
SemanticObject
Labeling
Good Reconstruction Semantic Segmentation
errors
Bad Reconstruction
err
TSDF-based Depth Map Improvement
RGB-D bundle adjustment Bounding box costs
c p∈V(c) o c p∈L(o,c)
• Use conservative loop closing (high precision and low recall)• Capture video in many passes so that similar views appear frequently, and loop closing succeeds.