numerical optimization for graphics/ai 3d deep …huangqx/2018_cs395_lecture_23.pdf3d generative...

30
Numerical Optimization for Graphics/AI 3D Deep Learning I Qixing Huang Nov. 19 th 2018

Upload: others

Post on 20-May-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Numerical Optimization for Graphics/AI3D Deep Learning I

Qixing Huang

Nov. 19th 2018

Page 2: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

AlexNet

Page 3: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Image Generation

Page 4: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

3D Surface Representations

Triangular mesh

Point cloudImplicit surface

Part-based models

Light Field Representation

Page 5: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

3D Voxel Grids

Page 6: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

3D Deep Learning

mesh binary voxel

3D Shape as Volumetric Representation

[Wu et al. 15]

Page 7: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

3D ShapeNets

• Convolution to enable compositionality

• No pooling to reduce reconstruction error

Layer 1-3 convolutional RBM

Layer 4 fully connected RBM

Layer 5multinomial label + Bernoulli feature

form an associate memory

configurations

Convolutional DeepBelief Network

A Deep Belief Network is a generative graphical model that describes the distribution of input x over class y.

[Wu et al. 15]

Page 8: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

3D ShapeNets

Convolutional DeepBelief Network

3D ShapeNets ≠ CNNs

* 3D ShapeNets can be converted into a CNN,

and discriminatively trained with back-propagation.

generative process

discriminative process

[Wu et al. 15]

Page 9: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Training

Layer-wise pre-training:

Lower four layers are trained by CD

Last layer is trained by FPCD[1]

Fine-tuning:

Wake sleep but keep weights tied

Maximum Likelihood Learning

Convolutional DeepBelief Network

[Wu et al. 15]

Page 10: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

generation process:Gibbs Sampling

Convolutional DeepBelief Network

Sampling[Wu et al. 15]

Page 11: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Sampling

generation process:Gibbs Sampling

object label

[Wu et al. 15]

Page 12: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

3D ShapeNets

Visualization at Different LayersConvolutional DeepBelief Network 12

[Wu et al. 15]

Page 13: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

As a 3D Shape Prior

Sampled Models

chai

rbed

des

kta

ble

nig

hts

tand

sofa

bat

htu

bto

ilet

Convolutional DeepBelief Network 13

[Wu et al. 15]

Page 14: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

3D Generative Adversarial Network [Wu et al. 16]

Page 15: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]
Page 16: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Sparse 3D Convolutional Networks [Ben Graham 2016]

40x40x40 GridSparsity for lower layers

Low resolution for upper layers

Page 17: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Octree classification networks[Wang et al. 18]

The responses of some convolutional filters at different levels on two models are rendered.Red represents a large response and blue a low response.

Page 18: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Discussion

+ Easy to implement

+ Hardware friendly

- Low resolution

- No structural information

- Cannot utilize 2D training data

Page 19: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Light Field Representation

Page 20: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]
Page 21: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]
Page 22: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Discussion

+ Can utilize 2D training data

+ Efficient since using 2D convolutions

+ Top-performing algorithms

-- Redundancy

-- Loss of information per view

-- How to pick views?

? Convolutions on Spheres

Page 23: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Point cloud Representation[Su et al. 17a, Su et al. 17b]

Page 24: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Object Classification on Partial Scans

Input:Partial scan (XYZ)

Output:Category classification

Car Car Airplane

Table

Mug

Page 25: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Object Part Segmentation

Output:Per point label

Input:Point cloud (XYZ)

Page 26: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Semantic Segmentation for Indoor Scenes

Input:Point cloud (XYZRGB) of a room

Output (current performance):Semantic segmentation of the room

wall

floor

table

Page 27: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Uniform Framework: PointNet

Point Set

PointNet

Object Classification

Part Segmentation

Point Feature Learning

Semantic Segmentation in scenes

...

Page 28: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]
Page 29: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]
Page 30: Numerical Optimization for Graphics/AI 3D Deep …huangqx/2018_CS395_Lecture_23.pdf3D Generative Adversarial Network [Wu et al. 16] Sparse 3D Convolutional Networks [Ben Graham 2016]

Model Accuracy

MLP 40%

LSTM 75%

Conv-Max-FC (1 max) 84%

Conv-Max-FC (2 max) 86%

Conv-Max-FC (2 max) + Input Transform 87.8%

Conv-Max-FC (2 max) + Feature Transform 86.8%

Conv-Max-FC (2 max) + Feature Transform + orthogonal regularization

87.4%

Conv-Max-FC (2 max) + Input Transform + Feature Transform + orthogonal regularization

88.9%

ModelNet shape 40-class classification

Best Volumetric CNN: 89.1% However, PointNet is around 5x - 10x faster than Volumetric CNN