Convolutional Neural Networks (Part II)
lvelho.impa.br/ip17/proj/slides/1110GoingDeeperConv.pdf


Page 1

Convolutional Neural Networks (Part II)

08, 10 & 17 Nov, 2016

J. Ezequiel Soto S.
Image Processing 2016

Prof. Luiz Velho

Page 2

Summary & References

08/11: ImageNet Classification with Deep Convolutional Neural Networks
2012, Krizhevsky et al. [source]

10/11: Going Deeper with Convolutions
2015, Szegedy et al. [source]

17/11: Painting Style Transfer for Head Portraits using Convolutional Neural Networks
2016, Selim & Elgharib [source]

+ An Analysis of Deep Neural Network Models for Practical Applications
2016, Canziani & Culurciello [source]

+ Provable Bounds for Learning Some Deep Representations
2013, Arora et al. [source]

Page 3

Going Deeper with Convolutions

Szegedy et al., 2015

Page 4

Outline
● Introduction
● Related Work
● Motivation
● Architecture Detail
● GoogLeNet
● Training
● ILSVRC 2014
● Conclusions

Page 5

Introduction
● GoogLeNet: submission to ILSVRC 2014
● Accuracy + low computational cost (1.5B ops at inference) → real-world applicability
● An efficient CNN architecture: Inception
● Depth in two senses: more network layers, and a new level of organization, the Inception module
● Results: a new state of the art

Page 6

Related Work
● Standard CNN layer: convolution + normalization + max-pooling
● Good results on MNIST, CIFAR and ImageNet (with dropout against overfitting)
● Concerns that max-pooling loses accurate spatial information
● Neuroscience model of primate vision, a stack of filters → inspiration for the Inception module
● Network in Network model (NiN)
● 1×1 convolutions (see the sketch below):
– increase depth
– dimension reduction (reduce computational cost)
● Regions with Convolutional Neural Networks: R-CNN
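Not from the slides: a minimal PyTorch sketch of the dimension-reduction role of 1×1 convolutions (the channel counts are illustrative). A 1×1 convolution is a per-pixel linear map across channels, so it can shrink the channel dimension cheaply before a more expensive larger filter:

```python
import torch
import torch.nn as nn

# Reduce 256 input channels to 64 with a 1x1 convolution: a per-pixel
# linear map across channels, with no spatial mixing at all.
reduce = nn.Conv2d(in_channels=256, out_channels=64, kernel_size=1)

x = torch.randn(1, 256, 28, 28)  # (batch, channels, height, width)
y = reduce(x)
print(y.shape)                   # torch.Size([1, 64, 28, 28])
```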

Pages 7–8: (figures)

Page 9

Motivation
● Improve CNNs by growing them deeper and wider, but…
– too many parameters → overfitting
– computational cost: with two chained layers, uniformly doubling their filters quadruples the computation (see the arithmetic below)
– weights near zero waste computation → theory suggests sparse structures*
– hardware prefers uniform structure, many filters and large batches → efficient use of dense computation

* Theoretical results: 2013, Arora et al., “Provable Bounds for Learning Some Deep Representations”, 54 p.
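The quadratic blow-up can be checked with back-of-envelope arithmetic (sizes are illustrative, not from the slides). The multiply count of a stride-1 convolution is H·W·k²·C_in·C_out, so uniformly doubling the filter banks entering and leaving a layer quadruples its cost:

```python
def conv_mults(h, w, k, c_in, c_out):
    # multiplications of one k x k convolution at stride 1 ('same' padding)
    return h * w * k * k * c_in * c_out

# A 3x3 layer between two banks of 128 filters on a 28x28 feature map...
base = conv_mults(28, 28, 3, 128, 128)
# ...after uniformly doubling both banks to 256 filters:
doubled = conv_mults(28, 28, 3, 256, 256)
print(doubled / base)  # 4.0: doubling the filters quadruples the cost
```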

Page 10

Motivation
“This raises the question whether there is any hope for a next, intermediate step: an architecture that makes use of the extra sparsity, even at filter level, as suggested by the theory, but exploits our current hardware by utilizing computations on dense matrices.”

● The Inception idea…
– a case study trying to approximate Arora’s sparse structure with dense, readily available components (convolutions)
– highly speculative, but with immediate good results

CAUTION: “although the proposed architecture has become a success for computer vision, it is still questionable whether its quality can be attributed to the guiding principles that have lead to its construction”

Page 11

“Given samples from a sparsely connected neural network whose each layer is a denoising autoencoder, can the net (and hence its reverse) be learnt in polynomial time with low sample complexity?”

Video 1 / Video 2

Page 12

Architecture Detail
“finding out how an optimal local sparse structure in a convolutional vision network can be approximated and covered by readily available dense components”

● Translation invariance → convolutional building blocks
● A local construction that is repeated spatially
● Theory suggests analyzing the correlations of the last layer and clustering highly correlated units
● Lower layers: correlations concentrate in local regions → spatial localization
● Avoid “aligned” correlations… use different-sized filters

Page 13: (figure)

Page 14

Architecture Detail
● Higher levels → higher abstraction:
– spatial correlation decreases → increased use of bigger filters (3×3, 5×5)
● Stacking large filters blows up the number of outputs → reduce dimension
● Avoid compressing the information too much while maintaining sparsity → 1×1 convolutions before the larger ones (see the arithmetic below)
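Rough arithmetic for the 1×1 bottleneck (channel counts are illustrative, not from the slides): a direct 5×5 convolution from 256 to 64 channels costs several times more than first projecting 256 → 32 channels with a 1×1 convolution and running the 5×5 on the reduced tensor:

```python
H, W = 28, 28
direct = H * W * 5 * 5 * 256 * 64         # 5x5 conv, 256 -> 64 channels
bottleneck = (H * W * 1 * 1 * 256 * 32    # 1x1 reduction, 256 -> 32
              + H * W * 5 * 5 * 32 * 64)  # 5x5 conv, 32 -> 64
print(direct / bottleneck)                # ~6.9x fewer multiplications
```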

Page 15: (figure)

Page 16

Inception module

Video
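A minimal PyTorch sketch of the published Inception module (a 1×1 branch; 1×1 reduce then 3×3; 1×1 reduce then 5×5; 3×3 max-pool then 1×1 projection; outputs concatenated along the channel axis). The class and argument names here are mine, not the paper's:

```python
import torch
import torch.nn as nn

class Inception(nn.Module):
    def __init__(self, c_in, n1x1, n3x3red, n3x3, n5x5red, n5x5, pool_proj):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(c_in, n1x1, 1), nn.ReLU(inplace=True))
        self.b2 = nn.Sequential(                       # 1x1 reduce, then 3x3
            nn.Conv2d(c_in, n3x3red, 1), nn.ReLU(inplace=True),
            nn.Conv2d(n3x3red, n3x3, 3, padding=1), nn.ReLU(inplace=True))
        self.b3 = nn.Sequential(                       # 1x1 reduce, then 5x5
            nn.Conv2d(c_in, n5x5red, 1), nn.ReLU(inplace=True),
            nn.Conv2d(n5x5red, n5x5, 5, padding=2), nn.ReLU(inplace=True))
        self.b4 = nn.Sequential(                       # max-pool, then 1x1 proj
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(c_in, pool_proj, 1), nn.ReLU(inplace=True))

    def forward(self, x):
        # every branch preserves the spatial size, so the four outputs
        # can be concatenated along the channel dimension
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
```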

Page 17

Architecture Detail
● Lower levels: classic convolutions
● Higher levels: Inception modules*

* The authors do not consider this split strictly necessary; it compensates for inefficiencies in their current training infrastructure.

● Intuition: scale invariance of visual information, handled before abstraction
● The increased computational efficiency of the reductions allows growing both depth and width

● Efficiency: 2–3× faster than similarly performing networks without Inception modules, though the design has to be careful

Page 18

GoogLeNet
● The specific Inception-based design used in the ILSVRC 2014 competition
● The same design was used for 6 of the 7 ensemble models
● 22 layers deep
● Details (see the example below):
– all convolutions include ReLU
– input: 224×224 RGB with zero mean
– “#3×3 reduce” = number of 1×1 filters before the 3×3 convolutions
– “#5×5 reduce” = number of 1×1 filters before the 5×5 convolutions
– “pool proj” = number of 1×1 filters after max-pooling
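This table notation maps directly onto a module like the one sketched on page 16; for instance, inception (3a) from the paper's table takes a 28×28×192 input and emits 64 + 128 + 32 + 32 = 256 channels:

```python
# inception (3a), using the Inception class sketched earlier
inc3a = Inception(192, n1x1=64, n3x3red=96, n3x3=128,
                  n5x5red=16, n5x5=32, pool_proj=32)
x = torch.randn(1, 192, 28, 28)
print(inc3a(x).shape)  # torch.Size([1, 256, 28, 28])
```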

Page 19: (figure)

Page 20

GoogLeNet
● 22 layers (27 counting max-pooling)
● About 100 independent building blocks
● Average pooling before classifying, as in NiN
+ a linear layer: convenience, easy to adapt to other label sets
● Average pooling instead of a fully connected layer gives +0.6% top-1 accuracy
● Dropout remained essential
● Propagate the gradient effectively → middle layers should already discriminate correctly
● Intermediate classifiers: small convolutional networks on top of the Inception modules (4a) and (4d); their losses are added with weight 0.3
● Auxiliary classifiers are ignored at inference; their effect is marginal

Page 21

GoogLeNet
● Auxiliary network (removed at inference; sketched below):
– average pooling: 5×5 filter, stride 3
(4a) → 4×4×512
(4d) → 4×4×528
– 1×1 convolution with 128 filters + ReLU
– FC layer with 1024 units + ReLU
– dropout layer (70%)
– linear + softmax over the 1000 classes
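Under those specifications, the auxiliary head attached at (4a) can be sketched in PyTorch as below (the 14×14×512 input follows from the 4×4×512 pooled output given on the slide). The softmax is left to the loss function; during training the head's loss is added to the total with weight 0.3:

```python
import torch.nn as nn

aux_head = nn.Sequential(
    nn.AvgPool2d(5, stride=3),                     # 14x14x512 -> 4x4x512
    nn.Conv2d(512, 128, kernel_size=1),            # 1x1 conv, 128 filters
    nn.ReLU(inplace=True),
    nn.Flatten(),                                  # 4 * 4 * 128 = 2048 features
    nn.Linear(2048, 1024), nn.ReLU(inplace=True),  # FC layer, 1024 units
    nn.Dropout(p=0.7),                             # 70% dropout
    nn.Linear(1024, 1000),                         # logits for 1000 classes
)
```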

Page 23

Training Methodology
● DistBelief (Google): modest model and data parallelism
– CPU-only implementation; a rough estimate: convergence within a week on a few GPUs, the main limitation being memory
● Stochastic gradient descent:
– momentum 0.9
– fixed learning-rate schedule: decrease by 4% every 8 epochs (see the sketch after this list)
– Polyak-Ruppert averaging of the SGD iterates for the final model
● Many different methods for sampling and training over the images…
– crops of different sizes
– patches covering 8% to 100% of the image area
– aspect ratio in [3/4, 4/3]
– photometric distortions
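A sketch of the optimizer and learning-rate schedule from this slide (the base learning rate is illustrative; the slides do not give one):

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in for model parameters
opt = SGD(params, lr=0.01, momentum=0.9)       # momentum 0.9, as on the slide
sched = StepLR(opt, step_size=8, gamma=0.96)   # -4% every 8 epochs

# call sched.step() once per epoch; a Polyak-Ruppert average of the SGD
# iterates would then be kept separately for the final model
```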

Page 24

ILSVRC 2014: Classification
● No external data for training
● 7 versions of the GoogLeNet model (one wider)
– same initialization (even the same weights, due to an oversight)
– same learning-rate policies
– different sampling
→ ensemble prediction
● Testing (more aggressive cropping than AlexNet):
– 4 scales (shorter side at 256, 288, 320, 352)
– left, center and right squares (top, center, bottom for portrait images)
– per square: the full square plus 4 corner and 1 center crops (224×224)
– each crop also mirrored
→ 4×3×6×2 = 144 crops per image
(not strictly necessary: the marginal benefit decreases)
● Softmax probabilities averaged over all crops and all models, 7×144 = 1008 predictions per image (sketched below)
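Not from the slides, but the averaging step is easy to make concrete: the softmax probabilities from every model and every crop are averaged before taking the arg-max (dummy scores below):

```python
import torch

# dummy logits for 7 models x 144 crops x 1000 classes
logits = torch.randn(7, 144, 1000)
probs = torch.softmax(logits, dim=-1)  # per-crop class probabilities
final = probs.mean(dim=(0, 1))         # average over all 1008 predictions
print(final.argmax())                  # predicted class index
```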

Page 25: (figure)

Page 26: (figure)

Source: 2016, Canziani & Culurciello

Page 27

ILSVRC 2014: Detection
● Task: produce bounding boxes around objects from 200 classes
– a detection is correct if its box overlaps the ground truth by at least 50% (see the check below)
– extraneous detections (false positives) are penalized
● Submission:
– R-CNN approach with the Inception model as region classifier
– selective search (superpixel size doubled) combined with MultiBox proposals
– region classification: ensemble of 6 GoogLeNet models
– no bounding-box regression (unlike R-CNN)
– results reported as mean average precision (mAP)
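The 50% overlap criterion is the usual intersection-over-union (Jaccard) test; a self-contained check, assuming boxes given as (x1, y1, x2, y2):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

# a detection counts as correct when IoU with the ground truth is >= 0.5
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.333... -> not a match
```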

Page 28: (figure)

Page 29: (figure)

Source: 2016, Canziani & Culurciello

Page 30

Conclusions
“...approximating the expected optimal sparse structure by readily available dense building blocks is a viable method for improving neural networks for computer vision.”

● Large quality gain for a small increase in computation
● The detection result is very competitive despite using neither context nor bounding-box regression
● Moving toward sparser architectures is feasible and useful
● Importance of the theoretical analysis! (2013, Arora et al.)

DeepDream (a side result): the examples are creepy… but they show the network run in reverse!
An input image is modified to force it closer to, say, the animal categories.

Pages 31–32: (figures)

Page 33

Will continue, again...