GTC, May 2017. Greg Heinrich.

Photo Editing With Generative Adversarial Networks (GANs)

GAN: WHAT IS A GENERATIVE MODEL?

Learn from the works of Shakespeare:

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Produce:

PANDARUS:
Alas, I think he shall be come approached and the day
When little srain would be attain'd into being never fed,
And who is but a chain and subjects of his death,
I should not sleep.

In Machine Learning

A generative model learns to generate samples that have the same characteristics as the samples in the dataset.
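
As a rough illustration of this definition, here is a minimal sketch of a much simpler generative model than the recurrent network behind the sample above: a character-level bigram model. It is a stand-in chosen for brevity, not the method used in the talk or in the linked post; the function names and the tiny corpus are assumptions made for the example.

```python
import random
from collections import defaultdict


def fit_bigram_model(text):
    """Record, for each character, every character that follows it in the corpus."""
    followers = defaultdict(list)
    for current, following in zip(text, text[1:]):
        followers[current].append(following)
    return followers


def sample(followers, start, length, seed=0):
    """Generate text by repeatedly drawing a follower of the last character."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # dead end: this character never had a successor
            break
        out.append(rng.choice(options))
    return "".join(out)


# Tiny illustrative corpus (an assumption; a real model would use far more text).
corpus = "alas, i think he shall be come approached and the day"
model = fit_bigram_model(corpus)
generated = sample(model, start="a", length=40)
print(generated)
```

Every pair of adjacent characters in the generated string also occurs in the corpus, which is the sense in which the samples share the characteristics of the dataset.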

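BASIC REMINDER: BACKPROP Calculating $\partial E / \partial w_{ij}^{l}$ iteratively

Output of each neuron $j$ of layer $l$:

$$h_j^{l} = \varphi\left(z_j^{l}\right) = \varphi\left(\sum_i w_{ij}^{l}\, h_i^{l-1} + b_j^{l}\right)$$

Gradient of $E$ with respect to each weight (chain rule):

$$\frac{\partial E}{\partial w_{ij}^{l}} = \frac{\partial E}{\partial z_j^{l}}\,\frac{\partial z_j^{l}}{\partial w_{ij}^{l}} = \frac{\partial E}{\partial z_j^{l}}\, h_i^{l-1}$$

$\partial z_j^{l} / \partial w_{ij}^{l}$ only depends on $h_i^{l-1}$, which is calculated during forward prop.

Calculation of $\partial E / \partial z_j^{l}$ (multivariate chain rule, then chain rule):

$$\begin{aligned}
\frac{\partial E}{\partial z_j^{l}}
&= \sum_k \frac{\partial E}{\partial z_k^{l+1}}\,\frac{\partial z_k^{l+1}}{\partial z_j^{l}} \\
&= \sum_k \frac{\partial E}{\partial z_k^{l+1}}\,\frac{\partial z_k^{l+1}}{\partial h_j^{l}}\,\frac{\partial h_j^{l}}{\partial z_j^{l}} \\
&= \sum_k \frac{\partial E}{\partial z_k^{l+1}}\, w_{jk}^{l+1}\, \varphi'\left(z_j^{l}\right) \\
&= \varphi'\left(z_j^{l}\right) \sum_k \frac{\partial E}{\partial z_k^{l+1}}\, w_{jk}^{l+1}
\end{aligned}$$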
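
The recursion above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not code from the talk: the activation is taken to be tanh, the error is taken to be a squared error on the last layer, and the layer sizes and all names are made up for the example. A finite-difference estimate is computed at the end as a sanity check.

```python
import numpy as np


def phi(z):
    return np.tanh(z)


def phi_prime(z):
    return 1.0 - np.tanh(z) ** 2


def forward(x, weights, biases):
    """Return the pre-activations z and the activations h of every layer."""
    hs, zs = [x], []
    for W, b in zip(weights, biases):
        zs.append(hs[-1] @ W + b)  # z_j = sum_i w_ij * h_i + b_j
        hs.append(phi(zs[-1]))
    return zs, hs


def backward(zs, hs, weights, target):
    """Gradients of E = 0.5 * ||h_last - target||^2 for every weight matrix."""
    grads = [None] * len(weights)
    delta = (hs[-1] - target) * phi_prime(zs[-1])  # dE/dz at the last layer
    for l in reversed(range(len(weights))):
        grads[l] = np.outer(hs[l], delta)  # dE/dw_ij = h_i (previous layer) * dE/dz_j
        if l > 0:
            # dE/dz_j = phi'(z_j) * sum_k dE/dz_k (next layer) * w_jk
            delta = phi_prime(zs[l - 1]) * (weights[l] @ delta)
    return grads


rng = np.random.default_rng(0)
sizes = [3, 4, 2]  # hypothetical layer sizes
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [rng.normal(size=n) for n in sizes[1:]]
x, target = rng.normal(size=3), rng.normal(size=2)


def loss(ws):
    _, hs_out = forward(x, ws, biases)
    return 0.5 * np.sum((hs_out[-1] - target) ** 2)


zs, hs = forward(x, weights, biases)
grads = backward(zs, hs, weights, target)

# Finite-difference estimate for one weight of the first layer.
eps = 1e-6
bumped = [W.copy() for W in weights]
bumped[0][1, 2] += eps
numeric = (loss(bumped) - loss(weights)) / eps
print(grads[0][1, 2], numeric)
```

The two printed numbers should agree to several decimal places, since the activations stored during the forward pass are all the backward pass needs.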

GAN: PLAYING THE ADVERSARIAL GAME Learning on a corpus of images

Let’s play a game opposing two agents:

- The Generator, a little imp in the computer who paints images.

- The Discriminator: you are collectively responsible for playing the Discriminator.

The game master (me) randomly picks images from either the corpus or the Generator and shows them to the Discriminator. The goal of the Discriminator is to identify the source of the images: real (from the corpus) or fake (painted by the little imp). The goal of the Generator is to fool the Discriminator.
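
The two goals can be written as two loss functions. The slides do not state a loss, so the sketch below assumes the binary cross-entropy formulation commonly used for GANs, with the Discriminator outputting the probability that an image is real; the probabilities are made-up numbers.

```python
import numpy as np


def discriminator_loss(p_real, p_fake):
    """Low when corpus images score near 1 and generated images score near 0."""
    return -np.mean(np.log(p_real)) - np.mean(np.log(1.0 - p_fake))


def generator_loss(p_fake):
    """Low when the Discriminator scores generated images near 1 (it is fooled)."""
    return -np.mean(np.log(p_fake))


# Hypothetical Discriminator outputs: probability that each image is real.
p_real = np.array([0.9, 0.8])
p_fake_spotted = np.array([0.1, 0.2])  # fakes correctly identified
p_fake_fooled = np.array([0.7, 0.9])   # fakes mistaken for corpus images

print(discriminator_loss(p_real, p_fake_spotted), generator_loss(p_fake_spotted))
print(discriminator_loss(p_real, p_fake_fooled), generator_loss(p_fake_fooled))
```

When the fakes are spotted the Discriminator's loss is low and the Generator's is high; when the Discriminator is fooled the situation is reversed, which is what makes the game adversarial.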

PLAYING THE ADVERSARIAL GAME Is this a veelhoek* from our corpus?

Yes, this red square is a veelhoek!

Note: you don’t have to know what a veelhoek is; you will learn through examples!

* veelhoek is the articulation of a ubiquitous item in the language of a tiny country in Europe that is well known for the inferior quality of its cheese.

PLAYING THE ADVERSARIAL GAME Is this a veelhoek from our corpus?

No, those squiggly lines aren’t right!

PLAYING THE ADVERSARIAL GAME Is this a veelhoek from our corpus?

Yes, even though it’s blue and tiny!

PLAYING THE ADVERSARIAL GAME Is this a veelhoek from our corpus?

No, those rounded corners are a giveaway!

PLAYING THE ADVERSARIAL GAME Is this a veelhoek from our corpus?

No, but it’s a very good fake!

PLAYING THE ADVERSARIAL GAME Is this a veelhoek from our corpus?

No, it’s the same fake as before!

PLAYING THE ADVERSARIAL GAME Is this a veelhoek from our corpus?

No, but it’s a very creative fake!

THE LATENT REPRESENTATION From features to images

A veelhoek is characterized by three features:

- colour,

- size,

- number of faces.

This set of features is known as the “LATENT REPRESENTATION”.

We can generate many real-looking veelhoeks by randomly picking reasonable values of each feature:
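
A minimal sketch of this sampling step follows. The palette and the ranges for size and number of faces are assumptions made for the example; the slides do not give them.

```python
import random

COLOURS = ["red", "green", "blue", "yellow"]  # hypothetical palette


def random_veelhoek(rng):
    """Draw one latent representation: a value for each of the three features."""
    return {
        "colour": rng.choice(COLOURS),
        "size": rng.uniform(0.2, 1.0),  # hypothetical relative size
        "faces": rng.randint(3, 8),     # hypothetical range, both ends included
    }


rng = random.Random(0)
samples = [random_veelhoek(rng) for _ in range(5)]
for s in samples:
    print(s)
```

Each printed dictionary is one latent vector; drawing the corresponding shape is the job of the Generator.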

THE LATENT REPRESENTATION Arithmetic in latent space

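We can perform operations in latent space and have them reflected in feature space:

$$\frac{1}{2}\left(\begin{bmatrix}\text{large}\\ \text{red}\\ 3\ \text{faces}\end{bmatrix} + \begin{bmatrix}\text{small}\\ \text{green}\\ 5\ \text{faces}\end{bmatrix}\right) = \begin{bmatrix}\text{medium}\\ \text{yellow}\\ 4\ \text{faces}\end{bmatrix}$$

Equivalently:

$$\frac{1}{2}\left(\begin{bmatrix}\text{red}\\ \text{large}\\ 3\ \text{faces}\end{bmatrix} + \begin{bmatrix}\text{green}\\ \text{small}\\ 5\ \text{faces}\end{bmatrix}\right) = [\text{image}] + [\text{image}]$$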
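
The averaging can be checked numerically. In the sketch below the encoding of a veelhoek as `[size, red, green, blue, faces]` is an assumption made for illustration, with size 1.0 standing for large and 0.0 for small.

```python
import numpy as np

# Hypothetical encoding: [size, red, green, blue, faces]
large_red_3_faces = np.array([1.0, 1.0, 0.0, 0.0, 3.0])
small_green_5_faces = np.array([0.0, 0.0, 1.0, 0.0, 5.0])

# Midpoint of the two latent vectors.
midpoint = 0.5 * (large_red_3_faces + small_green_5_faces)
print(midpoint)
```

The midpoint has size 0.5 (medium), equal parts of red and green (a yellowish colour) and 4 faces.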

THE GAN SET-UP Connecting the Discriminator to the Generator and the Dataset

Random Latent vector

GAN: NETWORK TOPOLOGY Radford et al. (2015). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv:1511.06434

Generator

Discriminator

TRAINING A GAN ON CELEBRITY FACES*

* CelebFaces dataset

Generating new faces by picking random values of the latent vector

ANALOGIES man is to woman as king is to queen

Reproduction of the famous “king + woman - man = queen” analogy on faces:

Attributes: Man, Blond Hair, Blue Eyes, Smile, Looking Left, Pointy Nose

Top Right: - - - + + +

Bottom Left: + + + - +

Subtract Top Left: + - - - -

Bottom Right: + - - - + +
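
The arithmetic behind the analogy can be sketched with toy vectors. The two-dimensional latent space and the coordinates below are assumptions chosen so the example is easy to follow; real latent vectors would come from a trained model.

```python
import numpy as np

# Hypothetical 2-D latent space: axis 0 = "man", axis 1 = "royal".
z = {
    "king": np.array([1.0, 1.0]),
    "man": np.array([1.0, -1.0]),
    "woman": np.array([-1.0, -1.0]),
    "queen": np.array([-1.0, 1.0]),
}


def nearest(vector, candidates):
    """Name of the candidate whose latent vector is closest to `vector`."""
    return min(candidates, key=lambda name: np.linalg.norm(candidates[name] - vector))


result = z["king"] + z["woman"] - z["man"]
print(result, nearest(result, z))
```

Subtracting "man" and adding "woman" flips the first axis while leaving the second untouched, so the result lands on the vector for "queen".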

MAPPING IMAGES TO LATENT VECTORS Transfer learning: from Discriminator to Encoder

IMAGE RECONSTRUCTIONS Visualizing $G(E(\mathit{image}))$

ATTRIBUTES Calculating attribute vectors

The encoder $E$ may be used to calculate the latent vector for each attribute.

For each $attr$ in $attributes$:

$I_{attr}^{+} = \{im \mid attr\}$ and $I_{attr}^{-} = \{im \mid \overline{attr}\}$ are the sets of images with and without the attribute.

$$z_{attr} = \frac{1}{\left|I_{attr}^{+}\right|}\sum_{im \in I_{attr}^{+}} E(im) \;-\; \frac{1}{\left|I_{attr}^{-}\right|}\sum_{im \in I_{attr}^{-}} E(im)$$

It is then straightforward to add or remove attributes from an image:

From left to right: original image (OI); OI + “young” attribute; OI - “blond hair” + “black hair”; OI - “smile”; OI + “male” + “bald”.
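
The attribute-vector formula and the editing step can be sketched as follows. To keep the example runnable, the "images" are plain 2-D vectors and both the encoder and the generator are identity functions; these stand-ins, the function names and the `strength` parameter are assumptions, not part of the talk.

```python
import numpy as np


def attribute_vector(encode, images_with, images_without):
    """Mean of E(im) over images with the attribute minus the mean over images without."""
    z_with = np.mean([encode(im) for im in images_with], axis=0)
    z_without = np.mean([encode(im) for im in images_without], axis=0)
    return z_with - z_without


def edit(encode, generate, image, z_attr, strength=1.0):
    """Add (strength > 0) or remove (strength < 0) an attribute, then decode."""
    return generate(encode(image) + strength * z_attr)


def identity(im):
    # Stand-in for both E and G; trained networks would be used in practice.
    return np.asarray(im, dtype=float)


smiling = [[1.0, 0.2], [0.8, 0.4]]
not_smiling = [[0.0, 0.3], [0.2, 0.1]]
z_smile = attribute_vector(identity, smiling, not_smiling)

with_smile = edit(identity, identity, [0.1, 0.2], z_smile)
without_smile = edit(identity, identity, [0.9, 0.3], z_smile, strength=-1.0)
print(z_smile, with_smile, without_smile)
```

Several attribute vectors can be added or subtracted in the same call pattern, which mirrors edits such as OI - “blond hair” + “black hair”.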

PLAYING WITH ATTRIBUTES

EXTRACTING ATTRIBUTES …from portraits of illustrious people

DEGENERATOR Getting the essence of your dataset

After convergence, stop updating the discriminator:

DATASET VISUALIZATION Projecting latent vectors on a sphere

THANK YOU Questions?