depth estimation using deep learning

Post on 23-Jan-2018

Depth Images Prediction from a Single RGB Image

Using Deep Learning

Deep Learning

May 2017

Soubhi Hadri

Table of Contents:

Introduction

Existing Solutions

Dataset and Model

Project Code and Results

Introduction

-In 3D computer graphics, a depth map is an image or image channel that contains information relating to the distance of the surfaces of scene objects from a viewpoint.

-RGB-D image: an RGB image and its corresponding depth image.

-A depth image is an image channel in which each pixel relates to the distance between the image plane and the corresponding object in the RGB image.

To approximate the depth of objects:

• Stereo camera: a camera with two or more lenses that simulates human binocular vision.

• Intel RealSense or Microsoft Kinect sensors, which capture RGB-D images.

• Deep learning, the approach taken in this project.
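For comparison with the learning-based approach, a stereo camera recovers depth geometrically. The sketch below uses the standard relation for a rectified stereo pair, depth = focal length × baseline / disparity. The function name and the numeric values are illustrative assumptions and do not come from the slides.

```python
import numpy as np

def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity map (pixels) to depth (meters) for a rectified stereo pair."""
    disparity = np.asarray(disparity_px, dtype=np.float64)
    # Pixels with zero disparity have no measurable depth; mark them as infinitely far.
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Hypothetical camera: 640 px focal length, 10 cm baseline.
depth = disparity_to_depth([[64.0, 32.0], [16.0, 0.0]],
                           focal_length_px=640.0, baseline_m=0.1)
```

Larger disparities correspond to closer objects, which is why halving the disparity doubles the estimated depth.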

Existing Solutions

Deep Learning for depth estimation:

Many recent works estimate the depth map of an RGB image, for example:

Learning Fine-Scaled Depth Maps from Single RGB Images, 7 Feb 2017.

Dataset & Model

Dataset : NYU Depth V2

The NYU-Depth V2 dataset comprises video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect.

The dataset consists of:

• 1,449 labeled pairs of aligned RGB and depth images (2.8 GB).

• 407,024 new unlabeled frames of raw RGB and depth (428 GB).

• Toolbox: useful functions for manipulating the data and labels.

Different parts of the dataset can be downloaded individually.

Authors: Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus (2012).

For this project:

• Office 1-2 dataset (part of the whole dataset).

• 15 GB after processing RAW data.

• 3522 RGB-D images.

Split the data (3522 RGB-D images):

• 20% test: 705 images.

• 80% training and validation: 2817 images, of which 2414 are for training and 403 for validation.
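A split of this shape can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's split_data function: the function name, the random shuffling, and the seed are assumptions, and the sizes (2817 for training plus validation, 403 of those for validation) are the ones read from the split diagram.

```python
import numpy as np

def split_indices(n_samples, trainval_fraction=0.8, n_val=403, seed=0):
    """Return shuffled index arrays for the training, validation and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    # The first 80% (rounded down) is shared by training and validation.
    n_trainval = int(n_samples * trainval_fraction)
    val_idx = order[:n_val]
    train_idx = order[n_val:n_trainval]
    # The remaining 20% is held out for testing.
    test_idx = order[n_trainval:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(3522)
```

Splitting by index keeps each RGB image and its depth label paired, since the same index array can be applied to both.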

[Figure: samples of the data]

The Model for Depth Estimation:

Model proposed by Jan Ivanecký in his master's thesis (2016).

He derived his model from Eigen et al.:

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture, 17 Dec 2015.

Global context network estimates the rough depth map of the whole scene from the input RGB image.

Gradient network estimates horizontal and vertical gradients of the depth map globally, for the whole RGB image.

Refining network improves the rough estimate from the global context network, using the gradients estimated by the gradient network and the input RGB image.
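To make the gradient network's target concrete, the horizontal and vertical gradients of a depth map can be computed from a ground-truth depth image. The sketch below uses simple forward differences. That choice and the function name are assumptions, since the slides do not say how the gradients are defined.

```python
import numpy as np

def depth_gradients(depth):
    """Forward-difference gradients of a 2-D depth map (last column/row left at 0)."""
    depth = np.asarray(depth, dtype=np.float64)
    grad_x = np.zeros_like(depth)
    grad_y = np.zeros_like(depth)
    # Horizontal gradient: difference between neighbouring columns.
    grad_x[:, :-1] = depth[:, 1:] - depth[:, :-1]
    # Vertical gradient: difference between neighbouring rows.
    grad_y[:-1, :] = depth[1:, :] - depth[:-1, :]
    return grad_x, grad_y

# A depth ramp that grows by 1 per column and is constant along each column.
ramp = np.tile(np.arange(4.0), (3, 1))
grad_x, grad_y = depth_gradients(ramp)
```

For the ramp, the horizontal gradient is 1 everywhere except the padded last column, and the vertical gradient is 0.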

Global context network

[Figure: architecture of the global context network]

The model is derived from AlexNet.

Loss Function:

Root mean squared error in log space (rms-log).
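One common definition of this metric is the square root of the mean of (log d − log d*)² over all pixels, where d is the predicted depth and d* the ground truth. The NumPy sketch below implements that definition. The function name and the eps clamp (added so that zero-valued pixels do not produce log(0)) are assumptions, and the project's own loss function may differ in detail.

```python
import numpy as np

def rms_log(pred, target, eps=1e-6):
    """Root mean squared error between log-depths, with values clamped to eps."""
    pred = np.maximum(np.asarray(pred, dtype=np.float64), eps)
    target = np.maximum(np.asarray(target, dtype=np.float64), eps)
    diff = np.log(pred) - np.log(target)
    return float(np.sqrt(np.mean(diff ** 2)))

target = np.array([[0.5, 1.0], [2.0, 4.0]])
same = rms_log(target, target)            # identical maps give zero error
scaled = rms_log(np.e * target, target)   # a constant factor e gives an error of 1
```

Because the error is measured on logarithms, it penalizes relative rather than absolute depth errors: being off by a factor of two costs the same at 1 m as at 4 m.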

Training the Network:

1- Scale the label (depth) images to [0, 1].

2- Subtract 127 from the input images to center the data (a simple form of normalization).

3- Initialize the convolution layers from a pre-trained AlexNet CNN (transfer learning).

4- Train the network in batches (batch size = 32) for 35 epochs.

5- Save the session and model at the end of each epoch.
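The first two steps can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, and the labels are scaled with a per-image min-max, whereas the project may instead scale by a fixed maximum depth or by dataset-wide statistics.

```python
import numpy as np

def preprocess(rgb_uint8, depth):
    """Center an 8-bit RGB image around zero and scale its depth label to [0, 1]."""
    # Step 2: subtract 127 so pixel values fall in [-127, 128].
    rgb = rgb_uint8.astype(np.float32) - 127.0
    # Step 1: min-max scaling of the label (assumed here to be per image).
    depth = depth.astype(np.float32)
    d_min, d_max = depth.min(), depth.max()
    depth01 = (depth - d_min) / max(d_max - d_min, 1e-8)
    return rgb, depth01

rgb = np.array([[[0, 127, 255]]], dtype=np.uint8)
depth = np.array([[1.0, 3.0], [2.0, 5.0]])
x, y = preprocess(rgb, depth)
```

Casting to float32 before subtracting avoids the wrap-around that would occur if 127 were subtracted from unsigned 8-bit values directly.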

Project Functions:

1- split_data: split the data into training, validation and test sets and save them as .npy files.

2- load_data: load the data from the .npy files.

3- plot_imgs: plot a pair of images.

4- get_next_batch: get the next batch from the training data.

5- loss: calculate the loss function.

6- model: create the model (network structure).

7- train: start training.

8- evaluate: evaluate new data after restoring the model.
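As an illustration of what a batching helper such as get_next_batch might look like, here is a minimal NumPy sketch. The slides give only the function's name and purpose, so the signature, the wrap-around behaviour at the end of the data, and the default batch size of 32 (taken from the training slide) are assumptions.

```python
import numpy as np

def get_next_batch(images, labels, batch_index, batch_size=32):
    """Return the batch at position batch_index, wrapping around at the end of the data."""
    n = len(images)
    start = (batch_index * batch_size) % n
    # Modulo indexing lets the last batch of an epoch borrow samples from the start.
    idx = np.arange(start, start + batch_size) % n
    return images[idx], labels[idx]

# Toy data: 70 "images" whose only value is their own index.
images = np.arange(70).reshape(70, 1)
labels = images * 10
batch_x, batch_y = get_next_batch(images, labels, batch_index=2)
```

With 70 samples, the third batch starts at index 64 and wraps to the beginning, so every batch has exactly 32 samples.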

Project Tools and Libraries:

1- TensorFlow.

2- TF-Slim: a lightweight library for defining, training and evaluating complex models in TensorFlow.

3- TensorBoard.

4- NumPy.

5- Matplotlib.

Project Results:

[Figure: training loss curve]

[Figure: results on samples of new data]

Explanation :

• Training data is not sufficient.

In Jan’s experiment:

• Full NYU dataset and 3 datasets generated from the original one.

• Network was trained for 100,000 iterations.

This experiment:

• Training took ~26 hours for 30 epochs.
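The gap between the two experiments can be put in rough numbers. The sketch below assumes one weight update per batch and uses 2817 images (the 80% portion of the split) as an upper bound on the training set size, together with the batch size of 32 from the training slide. The function name is illustrative.

```python
import math

def total_updates(n_train, batch_size, epochs):
    """Number of weight updates, assuming one update per batch."""
    return math.ceil(n_train / batch_size) * epochs

# Upper bound for this experiment: 89 batches per epoch over 30 epochs.
upper_bound = total_updates(n_train=2817, batch_size=32, epochs=30)  # 2670
# Factor by which the 100,000 iterations of Jan's experiment exceed that bound.
ratio = 100_000 / upper_bound
```

Under these assumptions the network here saw at most 2,670 updates, roughly 37 times fewer than the 100,000 iterations reported for Jan’s experiment.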

Project :

The project code and data will be available on GitHub:

https://github.com/SubhiH/Depth-Estimation-Deep-Learning

Resources :

-https://arxiv.org/pdf/1607.00730.pdf

-http://janivanecky.com/

-http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html

Thank You
