Dynamic Routing Between Capsules · 2018-04-19


Post on 23-Apr-2020


Dynamic Routing Between Capsules

Yiting Ethan Li, Haakon Hukkelaas, and Kaushik Ram Ramasamy


Problems & Results

Object classification in images without losing information about important parts of the picture.

smallNORB: Images of 3D objects in 5 classes (50 toys seen from different angles)
CapsNet (2017): 2.7% error
State of the art (2017): 2.56% error
*CapsNet (2018): 1.4% error

MNIST: Handwritten digit classification
Result: 0.25% error (state of the art)


How would ConvNets achieve rotational invariance?

Problem with CNN


Traditional ConvNet

● Translational invariance: Max Pooling

● Susceptible to affine transformations

● Max Pooling throws away information

● Human brains don’t work like that

Capsule Networks

● Equivariance by “Routing by Agreement”

● Equivariance keeps track of where something is in the image

● Robust to affine transformations

● Makes biological sense

● Achieves inverse rendering (with capsules)

Motivation


Rendering vs. Inverse Rendering


● A capsule is a group of neurons which outputs a vector activation

● The vector represents features related to the object

● A capsule represents the inverse graphics of a patch of the image

● Orientation of vector: Represents properties of the entity

● Length of vector: Represents existence of the entity

What is a capsule?
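The "length represents existence" idea is usually enforced with the squashing non-linearity from the Dynamic Routing paper, which keeps a vector's orientation and maps its length into [0, 1). A minimal NumPy sketch follows; the function name `squash` and the `eps` guard against division by zero are illustrative choices, not taken from the slides.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)          # length factor ||s||^2 / (1 + ||s||^2)
    return scale * s / np.sqrt(sq_norm + eps)  # unit direction times length factor

short = squash(np.array([0.1, 0.0]))   # weak evidence: length close to 0
long_ = squash(np.array([10.0, 0.0]))  # strong evidence: length close to 1
```

Short input vectors are pushed toward zero length and long ones toward unit length, so the length can be read as a probability-like score that the entity is present.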


A Toy Example


Slides heavily inspired by Aurélien Géron [2]


Predict Next Layer’s Output


Strong agreement! The rectangle and triangle capsules should be routed to the boat capsule


Output of capsule j (parent)

Routing Algorithm

Routing coefficient between capsule i and parent capsule j

Predict next layer output
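The formulas that these labels annotated did not survive extraction. As a reference, the corresponding equations from the Dynamic Routing Between Capsules paper are written out here; the symbol names follow the paper and are assumed to match the slide.

```latex
\begin{align}
\hat{u}_{j|i} &= W_{ij}\, u_i && \text{(predict next layer output)} \\
c_{ij} &= \frac{\exp(b_{ij})}{\sum_k \exp(b_{ik})} && \text{(routing coefficient)} \\
s_j &= \sum_i c_{ij}\, \hat{u}_{j|i} \\
v_j &= \frac{\lVert s_j \rVert^2}{1 + \lVert s_j \rVert^2}\, \frac{s_j}{\lVert s_j \rVert} && \text{(output of capsule } j\text{)} \\
b_{ij} &\leftarrow b_{ij} + \hat{u}_{j|i} \cdot v_j
\end{align}
```

Here $u_i$ is the output of child capsule $i$, $W_{ij}$ is the learned transformation matrix, and $b_{ij}$ are routing logits that start at zero.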


Routing Weights

0.5 0.5

0.5 0.5


Compute Next Layer’s Output


● Agreement! Large scalar product between the prediction and the parent output: the routing weight increases

● Disagreement! Small scalar product between the prediction and the parent output: the routing weight decreases

Update Routing Weights


Routing Weights

0.2 0.8

0.1 0.9
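The routing steps walked through above (predict, weight, sum, squash, update) can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the presenters' code: the function names, the 2-D pose vectors, and the toy prediction values are hypothetical, and the numbers it produces are not the ones shown on the slides.

```python
import numpy as np

def squash(s, eps=1e-8):
    sq = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def route(u_hat, iterations=3):
    """u_hat[i, j, :] is child i's prediction for parent j. Returns (v, c)."""
    n_child, n_parent, _ = u_hat.shape
    b = np.zeros((n_child, n_parent))              # routing logits start at 0
    for _ in range(iterations):
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)       # softmax over parents
        s = np.einsum("ij,ijd->jd", c, u_hat)      # weighted sum of predictions
        v = squash(s)                              # parent outputs
        b = b + np.einsum("ijd,jd->ij", u_hat, v)  # agreement = scalar product
    return v, c

# Hypothetical toy data: children = (rectangle, triangle), parents = (house, boat).
# The children disagree about the house pose and agree about the boat pose.
u_hat = np.array([
    [[1.0, 0.0], [0.0, 2.0]],    # rectangle's predictions for house, boat
    [[-1.0, 0.0], [0.0, 2.0]],   # triangle's predictions for house, boat
])
v, c = route(u_hat, iterations=3)
```

With a single iteration the coefficients stay at the uniform value 0.5; with more iterations both children shift their weight toward the boat capsule, where their predictions agree.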


● 70,000 handwritten digits

● 28x28 grayscale images

● Digit classification (10 classes)

The MNIST Dataset


Architecture

Image 28x28

→ Conv1 (256 kernels, 9x9): 256x20x20

→ Conv (256 kernels, 9x9, stride 2): 256x6x6

→ reshape: Capsules 32x8x6x6

→ Wij [8x16]: DigitCaps 10x16
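The layer sizes in the diagram can be checked with a little arithmetic. The sketch below assumes convolutions without padding, which is consistent with the sizes shown; the helper name `conv_out` is illustrative.

```python
def conv_out(size, kernel, stride=1):
    """Spatial output size of a convolution without padding."""
    return (size - kernel) // stride + 1

conv1 = conv_out(28, 9)                  # 20, giving 256x20x20
conv2 = conv_out(conv1, 9, stride=2)     # 6, giving 256x6x6
caps_types = 256 // 8                    # 256 channels split into 32 capsule types of dimension 8
num_primary_caps = caps_types * conv2 * conv2  # 1152 capsules feeding DigitCaps
```

Each of the 1152 eight-dimensional capsules is mapped by its own 8x16 matrix Wij into a 16-dimensional prediction for each of the 10 digit capsules.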


Loss Function
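The formula on this slide did not survive extraction. In the Dynamic Routing paper, classification uses a margin loss on the lengths of the digit capsules, with m+ = 0.9, m- = 0.1 and a down-weighting factor of 0.5 for absent classes. A NumPy sketch under the assumption that the slide showed this loss; the function and argument names are illustrative.

```python
import numpy as np

def margin_loss(lengths, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    """lengths: (batch, classes) capsule lengths; targets: one-hot, same shape."""
    # Present classes are penalised when their capsule is shorter than m_pos.
    present = targets * np.maximum(0.0, m_pos - lengths) ** 2
    # Absent classes are penalised (down-weighted by lam) when longer than m_neg.
    absent = lam * (1.0 - targets) * np.maximum(0.0, lengths - m_neg) ** 2
    return np.sum(present + absent, axis=1).mean()

good = margin_loss(np.array([[0.95, 0.05]]), np.array([[1.0, 0.0]]))
bad = margin_loss(np.array([[0.10, 0.90]]), np.array([[1.0, 0.0]]))
```

A confident correct prediction incurs no loss, while a confident wrong one is penalised both for the missing class and for the spurious one.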


● A decoder is used to reconstruct the object from the capsule representation

● Reconstruction loss: mean-squared error

● Encourages capsules to encode the instantiation parameters of the input digit

Input

Reconstructed

Reconstruction
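As a rough illustration of the reconstruction setup, the sketch below masks out every digit capsule except the target one before decoding, and scores a reconstruction with the mean-squared error named on the slide. The masking step follows the paper; the decoder network itself is left out, and all names and the random stand-in data are hypothetical.

```python
import numpy as np

def mask_capsules(digit_caps, label):
    """Zero every capsule except the one for `label`, then flatten to 160 values."""
    mask = np.zeros((digit_caps.shape[0], 1))
    mask[label] = 1.0
    return (digit_caps * mask).reshape(-1)

def reconstruction_loss(image, reconstruction):
    """Mean-squared error between the input image and the decoder output."""
    return np.mean((image.reshape(-1) - reconstruction.reshape(-1)) ** 2)

rng = np.random.default_rng(0)
digit_caps = rng.normal(size=(10, 16))           # stand-in for the 10x16 DigitCaps output
decoder_in = mask_capsules(digit_caps, label=3)  # only capsule 3 reaches the decoder
```

Because the decoder only sees the 16 values of one capsule, those values have to carry the instantiation parameters needed to redraw the digit.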


Baseline #parameters: 35.4M

CapsNet (with reconstruction) #parameters: 8.2M

CapsNet (without reconstruction) #parameters: 6.8M

MNIST Result
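The two CapsNet parameter counts can be reproduced from the layer sizes in the architecture slides. This is a back-of-the-envelope sketch: the decoder layer sizes (512, 1024, 784) come from the paper rather than the slides, and the convolution bias terms are an assumption.

```python
conv1 = 9 * 9 * 1 * 256 + 256        # 9x9 kernels, 1 input channel, plus biases
primary = 9 * 9 * 256 * 256 + 256    # 9x9 kernels, 256 -> 256 channels, plus biases
digit = 1152 * 10 * 8 * 16           # one 8x16 matrix Wij per (child, parent) pair
capsnet = conv1 + primary + digit    # 6,804,224, about 6.8M

# Decoder: fully connected 160 -> 512 -> 1024 -> 784 (sizes assumed from the paper).
sizes = [160, 512, 1024, 784]
decoder = sum(a * b + b for a, b in zip(sizes, sizes[1:]))
total = capsnet + decoder            # 8,215,568, about 8.2M

saving = 1 - total / 35.4e6          # about 0.77 relative to the 35.4M baseline
```

The last line is where the "77%" figure for MNIST comes from.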


l = label, p = prediction, r = reconstruction target

Predicted 3, reconstructed from 5

Predicted 3, reconstructed from 3

MNIST Results


Capsule Interpretation


MNIST data set with small random affine transformations.

Training data: expanded and translated MNIST dataset

Test set                 Traditional CNN   CapsuleNet
Expanded & Translated    99.22%            99.23%
Affine Transformation    66%               79%

MNIST Results continued


● Two digits fused together

● The two digits overlap by 80% on average

● Training size: 60M, Testing size: 10M

5,0 6,7 4,9

MultiMNIST


White = Input

Red = Digit 1 reconstruction

Green = Digit 2 reconstruction

L:(l1,l2) = Label for digit1 and digit 2

R:(r1,r2) = digits used for reconstruction

MultiMNIST Results

Model          Error (%)
CNN            8.5
Caps (1 itr)   7.1
Caps (3 itr)   5.2


CIFAR10: 60,000 32x32 colour images in 10 classes (airplane, bird, cat, deer, dog, frog, horse, etc.)

Result: 10.6% error
State of the art: ~2.5% error

SVHN: Street View House Numbers

Result: 4.3% error
State of the art: 1.69% error

smallNORB: Images of 3D objects in 5 classes (50 toys seen from different angles)

Result: 2.7% error
State of the art: 2.56% error
CapsNet (2018): 1.4% error

Other Datasets


Pros:

● Requires less training data

● Position and pose are preserved (Equivariance)

● Robust to affine transformations

● Activation vector is easy to interpret

● Fewer trainable parameters required (77% fewer for MNIST)

● Great for overlapping objects

● Good for dealing with segmentation

Cons:

● Computationally heavy

● CapsNet does not allow two instances of the same class at the same location

● Likes to account for everything in the image

● Requires a lot of further research

Discussion


● Capsule: A group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or an object part

● The vector parameters could be: rotation, position, size, texture, ...

● Dynamic routing routes information to higher layers by agreeing on output between layers

● Achieves inverse rendering

● Equivariance: Keeps track of where the entity is in the image.

Summary


[1] Awesome Capsule Networks. (https://github.com/aisummary/awesome-capsule-networks)

[2] Capsule Networks (CapsNets) - Tutorial. (https://www.youtube.com/watch?v=pPN8d0E3900)

[3] Understanding Hinton’s Capsule Networks. Part IV: CapsNet Architecture. (https://medium.com/@pechyonkin/part-iv-capsnet-architecture-6a64422f7dce)

[4] Geoffrey Hinton talk "What is wrong with convolutional neural nets?" (https://www.youtube.com/watch?v=rTawFwUvnLE)

Additional Information on CapsNet