LEARNING TO INFER GRAPHICS PROGRAMS FROM HAND DRAWN IMAGES

Kevin Ellis (MIT), Daniel Ritchie (Brown University), Armando Solar-Lezama (MIT), Joshua B. Tenenbaum (MIT)

Presented by: Maliha Arif

Advanced Computer Vision, CAP 6412

29th March 2018


Page 1: LEARNING TO INFER GRAPHICS PROGRAMS FROM …crcv.ucf.edu/courses/CAP6412/Spring2018/ADV_Presen… ·  · 2018-04-17LEARNING TO INFER GRAPHICS PROGRAMS FROM HAND DRAWN IMAGES Kevin



OUTLINE

Introduction / Motivation

Road map – Trace Hypothesis

Stage 1 – Obtaining a Trace Set using a Neural Network

  Training

  Generalization to Hand Drawings

Stage 2 – Obtaining a Program from the Trace Set

  Search Algorithm

  Compression Factor

Applications


MOTIVATION

Figure panels: Hand Drawn Images; Trace Set (rendered in LaTeX); Synthesized Program


TRACE HYPOTHESIS

Stage 1 Stage 2


STAGE 1: NEURAL NETWORK


NEURAL NETWORK COMPONENTS

CNN layers

Layer 1: 20 convolutions of size 8 × 8, 2 convolutions of size 16 × 4, and 2 convolutions of size 4 × 16, followed by 8 × 8 pooling with a stride of 4.

Layer 2: 10 convolutions of size 8 × 8, followed by 4 × 4 pooling with a stride of 4.

Drawing commands are predicted token by token using MLPs and STNs.

Example: the circle command is predicted first, then its X coordinate, then its Y coordinate.


NEURAL NETWORK

The first token (for example, the circle command) is predicted by multinomial logistic regression, using a softmax layer.

Subsequent tokens are predicted using a differentiable attention mechanism, conditioned on the previously predicted tokens.

The attention mechanism is an STN (Spatial Transformer Network).
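As a minimal sketch of the softmax step for the first token, the snippet below turns raw scores into a probability distribution and picks the most likely command. The command vocabulary and the scores are hypothetical values chosen for illustration; they are not taken from the slides.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical command vocabulary and scores, for illustration only.
COMMANDS = ["circle", "rectangle", "line", "stop"]
scores = [2.0, 0.5, 0.1, -1.0]

probs = softmax(scores)
first_token = COMMANDS[probs.index(max(probs))]
```

Here the highest score belongs to the circle command, so it is chosen as the first token; a real system would sample or search over these probabilities rather than always taking the maximum.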


STN – SPATIAL TRANSFORMER NETWORK

CNNs lack the ability to be spatially invariant in a computationally and parameter-efficient manner.

Max-pooling layers give CNNs only limited spatial invariance, because their receptive fields are fixed and local.

An STN (Spatial Transformer Network) is a dynamic mechanism that can actively spatially transform an image or feature map.

Spatial Transformer Networks: Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, NIPS 2015


STN – SPATIAL TRANSFORMER NETWORKS

Spatial transformers can be incorporated into CNNs to benefit multifarious tasks such as:

Image classification

Co-localization

Spatial attention

We use them here for spatial attention.


STN – SPATIAL TRANSFORMER NETWORKS

The spatial transformer mechanism is split into 3 parts: a localisation network, a grid generator, and a sampler.


STN – SPATIAL TRANSFORMER NETWORKS

Grid Generator + Sampler: Affine transform

Output pixels are defined to lie on a regular grid.

Figure labels: Target, Sampling Grid
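The grid generator and sampler can be sketched as follows, assuming a 2 × 3 affine matrix acting on coordinates normalized to [-1, 1] and bilinear interpolation for the sampler. The function names `affine_grid` and `bilinear_sample` are hypothetical, and images are plain lists of rows, so this is an illustration of the idea rather than the presenters' implementation.

```python
import math

def affine_grid(theta, height, width):
    """Map each output pixel on a regular grid to source coordinates.

    theta is a 2x3 affine matrix acting on normalized coordinates in [-1, 1].
    Assumes height and width are both at least 2.
    """
    grid = []
    for i in range(height):
        y_t = -1.0 + 2.0 * i / (height - 1)
        row = []
        for j in range(width):
            x_t = -1.0 + 2.0 * j / (width - 1)
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            row.append((x_s, y_s))
        grid.append(row)
    return grid

def bilinear_sample(image, grid):
    """Sample the image at the grid's source coordinates (bilinear interpolation)."""
    h, w = len(image), len(image[0])
    out = []
    for row in grid:
        out_row = []
        for x_s, y_s in row:
            # Convert normalized coordinates back to pixel indices.
            x = (x_s + 1.0) * (w - 1) / 2.0
            y = (y_s + 1.0) * (h - 1) / 2.0
            x0, y0 = int(math.floor(x)), int(math.floor(y))
            value = 0.0
            for yy in (y0, y0 + 1):
                for xx in (x0, x0 + 1):
                    if 0 <= yy < h and 0 <= xx < w:
                        weight = max(0.0, 1 - abs(x - xx)) * max(0.0, 1 - abs(y - yy))
                        value += weight * image[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out

image = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
zoom_in = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]  # attends to the central half of the image

same = bilinear_sample(image, affine_grid(identity, 3, 3))
zoomed = bilinear_sample(image, affine_grid(zoom_in, 3, 3))
```

With the identity transform the output reproduces the input; with the zoom transform the output grid covers only the centre of the image, which is how an affine transform can act as spatial attention.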


STN – SPATIAL TRANSFORMER NETWORKS

Figure labels: Target, Source


STAGE 1: NEURAL NETWORK


PROBABILITY DISTRIBUTION OF THE CNN

Distribution over next drawing command

Distribution over traces


TRAINING

The network may predict incorrectly.

To make the system more robust, SMC is employed.

SMC (Sequential Monte Carlo) sampling uses pixel-wise distance as a surrogate for the likelihood function.
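A rough sketch of one SMC resampling step is given below, assuming that particles are candidate traces and that each is weighted by exp(-distance / temperature), where the distance is the pixel-wise difference between its rendering and the target image. The toy `render` function, the `temperature` parameter, and the weighting formula are assumptions made for illustration.

```python
import math
import random

def render(trace, size=4):
    """Toy renderer (hypothetical): each command is a (row, col) pixel to switch on."""
    canvas = [[0] * size for _ in range(size)]
    for r, c in trace:
        canvas[r][c] = 1
    return canvas

def pixel_distance(a, b):
    """Sum of absolute pixel differences between two equally sized images."""
    return sum(abs(pa - pb) for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))

def resample(particles, target, render_fn, temperature=1.0, rng=random):
    """One resampling step: traces whose renderings are closer to the target survive more often."""
    distances = [pixel_distance(render_fn(p), target) for p in particles]
    best = min(distances)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - best) / temperature) for d in distances]
    return rng.choices(particles, weights=weights, k=len(particles))

target = render([(0, 0), (1, 1)])
particles = [[(0, 0), (1, 1)], [(0, 0)], [(3, 3)]]
survivors = resample(particles, target, render, temperature=1e-3, rng=random.Random(0))
```

With a very low temperature, only the particle that matches the target exactly survives; a higher temperature keeps more diverse candidates alive.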


RESULTS

Edit distance = size of the symmetric difference between the ground truth trace set and the set produced by the model.
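The edit distance defined above maps directly onto set operations. In this small sketch, each drawing command is represented as a tuple; that representation and the example commands are hypothetical.

```python
def trace_edit_distance(truth, predicted):
    """Size of the symmetric difference between two trace sets."""
    return len(set(truth) ^ set(predicted))

# Hypothetical commands: (name, coordinates...)
truth = {("circle", 2, 3), ("line", 1, 1, 4, 4)}
predicted = {("circle", 2, 3), ("circle", 5, 5)}

distance = trace_edit_distance(truth, predicted)
```

One command is missing from the prediction and one is spurious, so the distance here is 2.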


GENERALIZING TO HAND DRAWINGS

Actual hand drawings are not used for training; instead, noise is introduced into the rendering of the training target images.

The noise is produced in the following ways:

Rescaling the image intensity by a factor chosen from [0.5, 1.5]

Translating the image by ±3 pixels, chosen uniformly

Rendering the LaTeX using the pencildraw style
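The first two noise sources can be sketched in a few lines. This assumes a grayscale image stored as a list of rows with values in [0, 1], a blank background of 0, integer shifts drawn uniformly from -3 to 3 on each axis, and clipping of intensities at 1; all of these are assumptions. The pencildraw rendering style is not covered here.

```python
import random

def add_noise(image, rng=random):
    """Rescale intensity and translate a grayscale image (list of rows, values in [0, 1])."""
    scale = rng.uniform(0.5, 1.5)           # intensity factor from [0.5, 1.5]
    dy, dx = rng.randint(-3, 3), rng.randint(-3, 3)  # shift of up to 3 pixels per axis
    h, w = len(image), len(image[0])
    noisy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                # Clipping at 1.0 is an assumption of this sketch.
                noisy[y][x] = min(1.0, image[sy][sx] * scale)
    return noisy

image = [[0.0] * 8 for _ in range(8)]
image[4][4] = 1.0
noisy = add_noise(image, random.Random(0))
```

Passing a seeded `random.Random` instance makes the augmentation reproducible, which is convenient when debugging training data.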


LIKELIHOOD FUNCTION

L_learned

The model now predicts 2 scalars: |T1 – T2| and |T2 – T1|,

where T1 is the rendered output with noise and T2 is the rendered output.


EVALUATION

100 hand drawn images on graph paper, snapped to a 16 x 16 grid.

IoU is the intersection over union between the ground truth and the predicted trace set.

We use IoU to measure the system's accuracy at recovering trace sets.
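IoU between two trace sets can be computed with set operations, as in this short sketch. The tuple representation of commands and the convention for two empty sets are assumptions.

```python
def trace_iou(truth, predicted):
    """Intersection over union between two trace sets."""
    truth, predicted = set(truth), set(predicted)
    union = truth | predicted
    if not union:
        return 1.0  # two empty trace sets agree perfectly (a convention assumed here)
    return len(truth & predicted) / len(union)

# Hypothetical commands: (name, coordinates...)
truth = {("circle", 2, 3), ("line", 1, 1, 4, 4)}
predicted = {("circle", 2, 3), ("circle", 5, 5)}

iou = trace_iou(truth, predicted)
```

One command is shared and three distinct commands appear overall, so the IoU is 1/3; a perfect recovery gives 1.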


STAGE 2 : PROGRAM SYNTHESIS


COST FUNCTION

Cost is computed in the following way:

Programs incur a cost of 1 for each command (primitive drawing action, loop, or reflection).

They incur a cost of 1/3 for each unique coefficient they use in a linear transformation.
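The cost rule above can be written as a one-line function. The flat representation used here (a list of command names and a list of coefficients) is a hypothetical simplification of a real program syntax tree.

```python
from fractions import Fraction

def program_cost(commands, coefficients):
    """Cost = 1 per command + 1/3 per unique linear-transformation coefficient."""
    return len(commands) + Fraction(1, 3) * len(set(coefficients))

# Hypothetical program: a loop containing a circle, plus a reflection,
# using the coefficients 3, 3 and 2 in its linear transformations.
commands = ["loop", "circle", "reflect"]
coefficients = [3, 3, 2]

cost = program_cost(commands, coefficients)
```

Three commands cost 3, and the two unique coefficients cost 2/3, for a total of 11/3. Using `Fraction` keeps the thirds exact.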


BIAS OPTIMAL SEARCH POLICY

Bias-optimal search: a search algorithm is n-bias optimal with respect to a distribution P_bias if it is guaranteed to find a solution in σ after searching for at least time n × t(σ) / P_bias[σ], where t(σ) is the time it takes to verify that σ contains a solution.

We use a 1-bias optimal search algorithm, which determines how much time we spend looking in each part σ of the program space.

This implies that we look in the whole program space.
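One way to read 1-bias optimality is as time-sharing: every region of the program space receives a share of the search time proportional to its probability under the bias distribution. The sketch below works under that assumption, which the slides do not spell out. The names `policy`, `verify_time`, and `has_solution` and all the numbers are hypothetical.

```python
def time_to_solution(policy, verify_time, has_solution):
    """Elapsed time until a time-shared search first finishes a region with a solution.

    Each region sigma receives a fraction policy[sigma] of the total search time,
    so it is fully checked once the elapsed time reaches
    verify_time[sigma] / policy[sigma].
    """
    finish_times = [
        verify_time[s] / policy[s]
        for s in policy
        if has_solution[s] and policy[s] > 0
    ]
    return min(finish_times) if finish_times else float("inf")

# Hypothetical regions of the program space.
policy = {"a": 0.5, "b": 0.25, "c": 0.25}
verify_time = {"a": 10.0, "b": 2.0, "c": 4.0}
has_solution = {"a": False, "b": True, "c": True}

elapsed = time_to_solution(policy, verify_time, has_solution)
```

Region "b" needs 2 units of its own time and receives a quarter of the total, so the search succeeds after 8 units. Because every region with nonzero probability keeps receiving time, the whole program space is eventually covered.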


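TRAINING

Policy is given by: (equation not preserved in this transcript)

Define a loss to train the search policy: (equation not preserved in this transcript)

D is the training corpus, i.e. minimum-cost programs for a few trace sets. θ denotes the parameters of the search policy.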


BILINEAR MODEL

where the parameter features denote a one-hot encoding of the parameter settings of sigma, 24 dimensions long
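A bilinear policy of this kind can be sketched as follows, assuming the score of each parameter setting is one_hot(sigma)^T · theta · trace_features and that scores are normalized with a softmax. The normalization, the `theta` matrix, and the trace features are assumptions for illustration; the example uses 3 settings rather than 24 to stay small.

```python
import math

def bilinear_policy(theta, trace_features):
    """Probabilities over parameter settings from a bilinear score.

    Because the parameter encoding is one-hot, each score is simply one row
    of theta dotted with the trace features.
    """
    scores = [sum(w * f for w, f in zip(row, trace_features)) for row in theta]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical weights: one row per parameter setting, one column per trace feature.
theta = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
trace_features = [2.0, 0.0]

probabilities = bilinear_policy(theta, trace_features)
most_likely = probabilities.index(max(probabilities))
```

The one-hot encoding means the model learns a separate linear scoring function of the trace features for each parameter setting.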


SEARCH POLICY - COMPARISON


COMPRESSION (TRACE SET AND PROGRAM )

Compression Factor

21/6 = 3.5x

Compression Factor

13/6 = 2.2x
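The two ratios above can be reproduced with a small helper, reading the numerator as the size of the trace set and the denominator as the size of the synthesized program. That reading is an assumption based on the slide title.

```python
def compression_factor(trace_set_size, program_size):
    """Ratio of trace set size to program size."""
    return trace_set_size / program_size

first = round(compression_factor(21, 6), 1)   # 3.5
second = round(compression_factor(13, 6), 1)  # 13/6 is about 2.17, shown as 2.2
```

A larger factor means the program captures more of the drawing's regularity in fewer commands.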


Ising Model – Lattice Structure

Another example


APPLICATIONS

Correcting errors made by the neural network using the prior probability of programs

Figure panels: Hand Drawing; Trace Set (rendered in LaTeX); Program Synthesizer's Output


APPLICATIONS

Modelling similarity between drawings


APPLICATIONS

Extrapolating figures

Figure panels: Example 1, Example 2


THANK YOU

Discussion