deep learning - jasonyanglu.github.io · imagesource:liu, ming, yukangding, min xia, xiao liu,...

DEEP LEARNING
Lecture 9: Generative Adversarial Networks

Dr. Yang Lu

Department of Computer Science

[email protected]

GAN Applications

¡ Image Translation

Image source: Choi, Yunjey, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. "Stargan v2: Diverse image synthesis for multiple domains." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8188-8197. 2020.

GAN Applications

¡ Image Translation

Image source: Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "Image style transfer using convolutional neural networks." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2414-2423. 2016.

GAN Applications

¡ Scene Generation

Image source: Tang, Hao, Dan Xu, Yan Yan, Philip HS Torr, and Nicu Sebe. "Local class-specific and global image-level generative adversarial networks for semantic-guided scene generation." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7870-7879. 2020.

GAN Applications

¡ Facial Attribute Manipulation

Image source: Geng, Zhenglin, Chen Cao, and Sergey Tulyakov. "3d guided fine-grained face manipulation." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9821-9830. 2019.

GAN Applications

¡ Facial Attribute Manipulation

Image source: Liu, Ming, Yukang Ding, Min Xia, Xiao Liu, Errui Ding, Wangmeng Zuo, and Shilei Wen. "Stgan: A unified selective transfer network for arbitrary image attribute editing." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3673-3682. 2019.

GAN Applications

¡ Gaze Correction

Image source: Zhang, Jichao, Meng Sun, Jingjing Chen, Hao Tang, Yan Yan, Xueying Qin, and Nicu Sebe. "Gazecorrection: Self-guided eye manipulation in the wild using self-supervised generative adversarial networks." arXiv preprint arXiv:1906.00805 (2019).

GAN Applications

¡ Image Animation

Image source: Siarohin, Aliaksandr, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, and Nicu Sebe. "Animating arbitrary objects via deep motion transfer." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2377-2386. 2019.

GAN Applications

¡ Image Inpainting

Image source: Yu, Jiahui, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S. Huang. "Generative image inpainting with contextual attention." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 5505-5514. 2018.

GAN Applications

¡ Image Inpainting

Image source: Yu, Jiahui, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S. Huang. "Free-form image inpainting with gated convolution." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4471-4480. 2019.

GAN Applications

¡ Image Blending

Image source: Wu, Huikai, Shuai Zheng, Junge Zhang, and Kaiqi Huang. "Gp-gan: Towards realistic high-resolution image blending." In Proceedings of the 27th ACM International Conference on Multimedia, pp. 2487-2495. 2019.

GAN Applications

¡ Image Super-Resolution

Image source: Ledig, Christian, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken et al. "Photo-realistic single image super-resolution using a generative adversarial network." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4681-4690. 2017.
https://www.theverge.com/21298762/face-depixelizer-ai-machine-learning-tool-pulse-stylegan-obama-bias

GAN Applications

¡ Shadow Detection and Removal

Image source: Ding, Bin, Chengjiang Long, Ling Zhang, and Chunxia Xiao. "Argan: Attentive recurrent generative adversarial network for shadow detection and removal." In Proceedings of the IEEE international conference on computer vision, pp. 10213-10222. 2019.

GAN Applications

¡ Makeup

Image source: Li, Tingting, Ruihe Qian, Chao Dong, Si Liu, Qiong Yan, Wenwu Zhu, and Liang Lin. "Beautygan: Instance-level facial makeup transfer with deep generative adversarial network." In Proceedings of the 26th ACM international conference on Multimedia, pp. 645-653. 2018.

GAN Applications

¡ Facial Landmark Detection

Image source: Dong, Xuanyi, Yan Yan, Wanli Ouyang, and Yi Yang. "Style aggregated network for facial landmark detection." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 379-388. 2018.

GAN Applications

¡ Person Re-identification

Image source: Zheng, Zhedong, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, and Jan Kautz. "Joint discriminative and generative learning for person re-identification." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2138-2147. 2019.

GAN Applications

¡ Music Generation

Image source: Yang, Li-Chia, Szu-Yu Chou, and Yi-Hsuan Yang. "MidiNet: A convolutional generative adversarial network for symbolic-domain music generation." arXiv preprint arXiv:1703.10847 (2017).

Generated music samples: https://soundcloud.com/vgtsv6jf5fwq/sets/midinet-samples

Generative vs. Discriminative

¡ Machine learning methods follow two main approaches: generative and discriminative.

¡ Given an observable variable 𝑋 and a target variable 𝑌:
  ¡ A generative model is a statistical model of the data distribution 𝑃(𝑋) or of the joint probability distribution 𝑃(𝑋, 𝑌) on 𝑋×𝑌.
  ¡ A discriminative model is a model of the conditional distribution of 𝑌 given 𝑋: 𝑃(𝑌|𝑋 = 𝑥).

Image source: https://medium.com/@jordi299/about-generative-and-discriminative-models-d8958b67ad32
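To make the distinction concrete, here is a minimal sketch on a toy dataset. The data, the feature values, and the helper names `joint` and `conditional` are all hypothetical. It estimates the joint distribution by counting (the generative quantity) and then derives the conditional from it (the discriminative quantity).

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy data: (observed feature x, label y).
data = [("small", "cat"), ("small", "cat"), ("small", "cat"),
        ("small", "car"), ("large", "car"), ("large", "car")]

def joint(pairs):
    """Generative view: estimate P(X, Y) by counting."""
    n = len(pairs)
    return {xy: Fraction(c, n) for xy, c in Counter(pairs).items()}

def conditional(p_xy, x):
    """Discriminative quantity P(Y | X = x), derived from the joint."""
    p_x = sum(p for (xi, _), p in p_xy.items() if xi == x)
    return {y: p / p_x for (xi, y), p in p_xy.items() if xi == x}

p_xy = joint(data)
p_y_given_small = conditional(p_xy, "small")
print(p_xy[("small", "cat")])   # P(X=small, Y=cat)
print(p_y_given_small)          # P(Y | X=small)
```

A discriminative model would learn only the second quantity directly, while a generative model keeps enough information to also sample new (x, y) pairs.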

Discriminative Approaches

¡ Most supervised learning methods are discriminative approaches.
¡ Given data (𝑥, 𝑦): 𝑥 is the data, 𝑦 is the label.

¡ Goal: learn a function that maps 𝑥 → 𝑦, namely the posterior probability 𝑃(𝑌|𝑋 = 𝑥).

¡ Examples: classification, regression, object detection, face recognition, sentiment classification, etc.

Image source: https://www.autoexpress.co.uk/tesla/model-3
https://timesofindia.indiatimes.com/life-style/relationships/pets/5-things-that-scare-and-stress-your-cat/articleshow/67586673.cms

[Figure: example images with the predicted labels "cat" and "car".]

Generative Approaches

¡ Given training data, generate new samples from the same distribution.
¡ Objectives:
  1. Learn $p_{\text{model}}(x)$ that approximates $p_{\text{data}}(x)$.
  2. Sample a new $x$ from $p_{\text{model}}(x)$.

Image source: http://www.lherranz.org/2018/08/07/imagetranslation/

Outline

¡ GAN

¡ DCGAN

¡ CGAN

¡ WGAN

¡ SAGAN

¡ pix2pix and CycleGAN

¡ SRGAN

GAN

GAN

¡ GANs were proposed by Ian Goodfellow et al. in 2014.

¡ Yann LeCun described GANs as “the most interesting idea in the last 10 years in machine learning”.

¡ Ian Goodfellow presented and explained the paper in a 2-hour tutorial at NIPS 2016.

Image source: https://youtu.be/HGYYEUSm-0Q

Ian Goodfellow at NIPS 2016

GAN: How to Do

¡ A GAN is composed of a generator and a discriminator. Both are neural networks.
  ¡ Generator network: tries to fool the discriminator by generating real-looking images.
  ¡ Discriminator network: tries to distinguish between real and fake images.

Image source: https://sthalles.github.io/intro-to-gans/

GAN: How to Do

¡ The generator and the discriminator tell each other where they went wrong.
  ¡ The discriminator's feedback tells the generator how its fakes were detected.

  ¡ The generator's samples show the discriminator how it was fooled.


Discriminator learning signal

Generator learning signal

GAN: How to Learn

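¡ Given a prior on input noise variables $p_{\mathbf{z}}(\mathbf{z})$, the generator $G_{\theta_g}(\mathbf{z})$ takes $\mathbf{z}$ as input and maps it into data space.

¡ The discriminator $D_{\theta_d}(\mathbf{x})$ takes the data $\mathbf{x}$ as input and outputs the probability that $\mathbf{x}$ came from the real data rather than from the generated data $G_{\theta_g}(\mathbf{z})$.

¡ $D_{\theta_d}$ and $G_{\theta_g}$ have different goals (1 for real, 0 for fake):

  ¡ Generator wants: $D_{\theta_d}(G_{\theta_g}(\mathbf{z})) \to 1$.

  ¡ Discriminator wants: $D_{\theta_d}(\mathbf{x}) \to 1$, $D_{\theta_d}(G_{\theta_g}(\mathbf{z})) \to 0$.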

GAN: How to Learn

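¡ By maximizing the log-likelihood, the overall objective is to train simultaneously over all $\mathbf{x}$ with randomly generated $\mathbf{z}$:

  ¡ train $G_{\theta_g}$ to minimize $\log(1 - D_{\theta_d}(G_{\theta_g}(\mathbf{z})))$;

  ¡ train $D_{\theta_d}$ to maximize $\log D_{\theta_d}(\mathbf{x})$ and $\log(1 - D_{\theta_d}(G_{\theta_g}(\mathbf{z})))$.

¡ In other words, $D_{\theta_d}$ and $G_{\theta_g}$ play the following two-player minimax game:

$$\min_{\theta_g} \max_{\theta_d} \; \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}}[\log D_{\theta_d}(\mathbf{x})] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}[\log(1 - D_{\theta_d}(G_{\theta_g}(\mathbf{z})))]$$

The first term contains the discriminator output for real data $\mathbf{x}$. The second term contains the discriminator output for generated fake data $G_{\theta_g}(\mathbf{z})$.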

GAN: How to Learn

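Objective:

$$\min_{\theta_g} \max_{\theta_d} \; \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}}[\log D_{\theta_d}(\mathbf{x})] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}[\log(1 - D_{\theta_d}(G_{\theta_g}(\mathbf{z})))]$$

Alternate between:

¡ Gradient ascent on the discriminator:

$$\max_{\theta_d} \; \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}}[\log D_{\theta_d}(\mathbf{x})] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}[\log(1 - D_{\theta_d}(G_{\theta_g}(\mathbf{z})))]$$

¡ Gradient descent on the generator:

$$\min_{\theta_g} \; \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}[\log(1 - D_{\theta_d}(G_{\theta_g}(\mathbf{z})))]$$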

GAN: How to Learn

¡ After several steps of training, if $G$ and $D$ have enough capacity, they reach a point at which neither can improve because $p_g = p_{\text{data}}$.
  ¡ The generator can generate realistic images.
  ¡ The discriminator is unable to differentiate between the two distributions, i.e. $D_{\theta_d}(\mathbf{x}) = 1/2$.

Image source: Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative adversarial nets." In Advances in neural information processing systems, pp. 2672-2680. 2014.

[Figure legend: discriminative distribution, data distribution, generative distribution.]

GAN: Algorithm
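The alternating updates (k gradient-ascent steps on the discriminator, then one gradient-descent step on the generator) can be sketched on a 1-D toy problem. This is a minimal illustration under loud assumptions, not the algorithm figure from the paper: the logistic discriminator `D`, the affine generator `G`, the hand-derived gradients, the Gaussian "real" data, and all hyperparameters are hypothetical choices made so that the example runs with the standard library only.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def D(x, w, c):
    # Hypothetical 1-D discriminator: probability that x is real.
    return sigmoid(w * x + c)

def G(z, a, b):
    # Hypothetical 1-D generator: affine map from noise to data space.
    return a * z + b

def value(real, noise, d_params, g_params):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    (w, c), (a, b) = d_params, g_params
    real_term = sum(math.log(D(x, w, c)) for x in real) / len(real)
    fake_term = sum(math.log(1.0 - D(G(z, a, b), w, c)) for z in noise) / len(noise)
    return real_term + fake_term

def d_step(real, noise, d_params, g_params, lr):
    """One gradient-ascent step on V with respect to the discriminator."""
    (w, c), (a, b) = d_params, g_params
    fake = [G(z, a, b) for z in noise]
    # d/ds log(sigmoid(s)) = 1 - D and d/ds log(1 - sigmoid(s)) = -D
    gw = (sum((1 - D(x, w, c)) * x for x in real) / len(real)
          - sum(D(x, w, c) * x for x in fake) / len(fake))
    gc = (sum(1 - D(x, w, c) for x in real) / len(real)
          - sum(D(x, w, c) for x in fake) / len(fake))
    return (w + lr * gw, c + lr * gc)

def g_step(noise, d_params, g_params, lr):
    """One gradient-descent step on E[log(1 - D(G(z)))] for the generator."""
    (w, c), (a, b) = d_params, g_params
    # Chain rule: d/dx log(1 - D(x)) = -D(x) * w, with x = a * z + b.
    ga = sum(-D(G(z, a, b), w, c) * w * z for z in noise) / len(noise)
    gb = sum(-D(G(z, a, b), w, c) * w for z in noise) / len(noise)
    return (a - lr * ga, b - lr * gb)

def train(steps=200, k=1, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    d_params, g_params = (0.0, 0.0), (1.0, 0.0)
    for _ in range(steps):
        for _ in range(k):  # k discriminator updates per generator update
            real = [rng.gauss(4.0, 1.0) for _ in range(batch)]   # assumed real data
            noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
            d_params = d_step(real, noise, d_params, g_params, lr)
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        g_params = g_step(noise, d_params, g_params, lr)
    return d_params, g_params
```

In practice the gradients come from automatic differentiation in a deep learning framework. The manual derivatives here only serve to keep the sketch self-contained.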


GAN: Result


Image source: Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative adversarial nets." In Advances in neural information processing systems, pp. 2672-2680. 2014.

Rightmost column shows the nearest training example of the neighboring sample, in order to demonstrate that the model has not memorized the training set.

GAN: Result

[Figure panels: SVHN, MNIST.]

GAN: Frontiers


Image source: Ian Goodfellow. Samples from Goodfellow et al., 2014, Radford et al., 2015, Liu et al., 2016, Karras et al., 2017, Karras et al., 2018

The GAN Zoo: Explosion of GANs


Source: https://github.com/hindupuravinash/the-gan-zoo

DCGAN


DCGAN

¡ The vanilla GAN uses MLPs, rather than CNNs, in both the generator and the discriminator.

¡ A CNN can be easily applied to the discriminator.

¡ Now the problem is: how can a CNN be used as a generator?
  ¡ Pooling downsamples. How do we upsample?

Fractionally-Strided Convolutions

Worked example (stride 1): each input value multiplies the whole filter, and the scaled copy is added into the output at the position of that input value.

Input (2×2):
1 2
3 4

Filter (3×3):
10 20 30
40 50 60
30 20 10

Output (4×4), accumulated over the four input values:
 10  40  70  60
 70 230 330 240
150 390 430 260
 90 180 110  40

For example, the output entry 230 is built up step by step as 50 + 80 + 60 + 40.

¡ Fractionally-strided convolutions are also called transposed convolutions.
  ¡ PyTorch: torch.nn.ConvTranspose2d.

  ¡ TensorFlow: tf.keras.layers.Conv2DTranspose.

¡ Some researchers call them deconvolutions. However, a true deconvolution is the inverse operation of convolution, which is not the same as a fractionally-strided convolution.

Image source: https://github.com/vdumoulin/conv_arithmetic

A transposed convolution with stride is equivalent to an ordinary convolution over an input that has zeros inserted between its elements and zero-padding around its border.

More linear algebra details: https://arxiv.org/pdf/1603.07285.pdf
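The following sketch implements a single-channel transposed convolution in plain Python in two ways: as a scatter-add of scaled filter copies, and as an ordinary sliding-window convolution over a zero-inserted, zero-padded input with the filter flipped. The function names are hypothetical, and the filter values are the ones used in the 2×2 worked example of this section (with the third filter row taken as 30, 20, 10). Framework layers such as torch.nn.ConvTranspose2d add channels, bias, and padding options that are omitted here.

```python
def conv_transpose2d(x, k, stride=1):
    """Scatter-add view: every input value stamps a scaled copy of the filter."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    out_h, out_w = (h - 1) * stride + kh, (w - 1) * stride + kw
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * k[a][b]
    return out

def conv_via_zero_insertion(x, k, stride=1):
    """Equivalent view: insert zeros, zero-pad, then slide the flipped filter."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    dil_h, dil_w = (h - 1) * stride + 1, (w - 1) * stride + 1
    pad_h, pad_w = kh - 1, kw - 1
    padded = [[0] * (dil_w + 2 * pad_w) for _ in range(dil_h + 2 * pad_h)]
    for i in range(h):
        for j in range(w):
            # stride - 1 zeros end up between neighbouring input values
            padded[pad_h + i * stride][pad_w + j * stride] = x[i][j]
    flipped = [row[::-1] for row in k[::-1]]
    out_h, out_w = dil_h + kh - 1, dil_w + kw - 1
    return [[sum(padded[r + a][c + b] * flipped[a][b]
                 for a in range(kh) for b in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

x = [[1, 2],
     [3, 4]]
k = [[10, 20, 30],
     [40, 50, 60],
     [30, 20, 10]]

for row in conv_transpose2d(x, k):
    print(row)
# [10, 40, 70, 60]
# [70, 230, 330, 240]
# [150, 390, 430, 260]
# [90, 180, 110, 40]
```

With stride 2 the same 2×2 input produces a 5×5 output, which is how a generator can grow spatial resolution layer by layer.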

DCGAN


Image source: Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).

stride=2 everywhere

DCGAN: Visual Results



The generated bedrooms look very nice!

DCGAN: Walking in the Latent Space

¡ If walking in this latent space results in semantic changes to the image generations (such as objects being added and removed), we can reason that the model has learned relevant and interesting representations.

Interpolation between a series of 9 random points in 𝑍 shows that the learned space has smooth transitions.
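Walking in the latent space can be sketched as interpolating between two noise vectors and feeding each intermediate vector to the generator. The sketch below assumes plain linear interpolation. The helper names are hypothetical and the generator itself is left out.

```python
def lerp(z0, z1, t):
    """Point at fraction t on the straight line from z0 to z1."""
    return [(1 - t) * a + t * b for a, b in zip(z0, z1)]

def interpolation_path(z0, z1, steps):
    """Evenly spaced latent vectors from z0 to z1, endpoints included."""
    return [lerp(z0, z1, i / (steps - 1)) for i in range(steps)]

path = interpolation_path([0.0, 2.0], [1.0, 0.0], steps=5)
# Each vector in `path` would be passed through the trained generator G(z).
```

If the generated images change smoothly along `path`, that suggests the model has not simply memorized isolated training examples.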

DCGAN: Vector Arithmetic



For each column, the 𝑍 vectors of samples are averaged. Arithmetic was then performed on the mean vectors creating a new vector 𝑌.
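The averaging and arithmetic described in the caption can be sketched as follows. The 2-D vectors and the concept names are made up for illustration. Real 𝑍 vectors are much higher dimensional, and the result 𝑌 would be fed to the generator.

```python
def mean_vector(vectors):
    """Average several latent vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def analogy(a, b, c):
    """Compute a - b + c component-wise."""
    return [x - y + w for x, y, w in zip(a, b, c)]

# Hypothetical latent vectors of samples showing each visual concept.
smiling_woman = mean_vector([[1.0, 1.0], [3.0, 1.0]])
neutral_woman = mean_vector([[1.0, 0.0], [3.0, 0.0]])
neutral_man = mean_vector([[-2.0, 0.0], [-2.0, 0.0]])

y = analogy(smiling_woman, neutral_woman, neutral_man)  # intended: "smiling man"
```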

DCGAN: Use as Feature Extractor

¡ Train on ImageNet-1k, then use the discriminator’s convolutional features from all layers.

¡ Max-pool each layer's representation to produce a 4 × 4 spatial grid.

¡ These features are then flattened and concatenated to form a 28672-dimensional vector.

¡ A regularized linear L2-SVM classifier is trained on top of them.

CGAN


CGAN

¡ We can’t control what the vanilla GAN generates.
  ¡ Noise is the only input and it is totally random.

¡ How can we tell the GAN what we want it to generate?
  ¡ Straightforward solution: replace the data distribution by a conditional distribution, $p(\mathbf{x}) \to p(\mathbf{x} \mid y)$.

¡ Now the problem becomes:
  ¡ Generator: generate a sample for class $y$.
  ¡ Discriminator: distinguish real samples of class $y$ from generated samples of class $y$.

CGAN

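¡ Both the generator and the discriminator are conditioned on some extra information $\mathbf{y}$:

$$\min_{\theta_g} \max_{\theta_d} \; \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}}[\log D_{\theta_d}(\mathbf{x} \mid \mathbf{y})] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}[\log(1 - D_{\theta_d}(G_{\theta_g}(\mathbf{z} \mid \mathbf{y})))]$$

¡ $\mathbf{y}$ could be any kind of auxiliary information, such as class labels or data from other modalities.
  ¡ E.g. a speech recording of the class name.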

CGAN


Generate MNIST digits by directly feeding one-hot class label.
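One simple way to feed a one-hot class label is to concatenate it to the input of each network. The sketch below shows only that input construction. The helper names, the noise dimension of 100, and the flattened 784-pixel image are assumptions for illustration, and actual CGAN implementations may instead pass the label through its own layer before merging.

```python
import random

def one_hot(label, num_classes=10):
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def conditioned_generator_input(z, label, num_classes=10):
    # The generator sees the noise z together with the class it should produce.
    return list(z) + one_hot(label, num_classes)

def conditioned_discriminator_input(x, label, num_classes=10):
    # The discriminator judges the sample x given the claimed class.
    return list(x) + one_hot(label, num_classes)

rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(100)]      # assumed noise dimension
g_in = conditioned_generator_input(z, label=7)      # ask for a digit "7"
d_in = conditioned_discriminator_input([0.0] * 784, label=7)
```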

WGAN


WGAN

¡ The first paper lists several problems of GANs based on theoretical analysis:
  ¡ Unstable training behavior.

  ¡ The cost function doesn’t reflect the quality of the generated images.

¡ Based on this analysis, the second paper proposes the Wasserstein GAN (WGAN).

Problems of GANs

Problem 1: A good discriminator makes the generator's gradient vanish.

¡ If the discriminator is well trained, the generator's gradient vanishes.

¡ If the discriminator is poorly trained, the generator's gradient has high variance.

¡ The generator is trained well only if the discriminator is neither too good nor too poor.
  ¡ That’s why GANs are extremely difficult to train.

Problems of GANs

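¡ Using the notation of the paper, the original objective, which the discriminator maximizes and the generator minimizes, is:

$$L(D, g_\theta) = \mathbb{E}_{\mathbf{x} \sim P_r}[\log D(\mathbf{x})] + \mathbb{E}_{\mathbf{x} \sim P_g}[\log(1 - D(\mathbf{x}))].$$

¡ Setting the derivative with respect to $D(\mathbf{x})$ to 0 gives the optimal discriminator:

$$D^*(\mathbf{x}) = \frac{P_r(\mathbf{x})}{P_r(\mathbf{x}) + P_g(\mathbf{x})}.$$

¡ Substituting $D^*(\mathbf{x})$ into the loss function gives:

$$L(D^*, g_\theta) = 2\,JS(P_r \| P_g) - 2\log 2,$$

where $JS$ is the Jensen-Shannon divergence. This is the generator's loss under the optimal discriminator.

¡ In most cases the supports of $P_r$ and $P_g$ barely overlap, so $JS(P_r \| P_g) = \log 2$, which is a constant. That makes the generator's gradient 0.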
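A small numerical check illustrates why the loss is flat when the two distributions do not overlap. The discrete distributions below are made-up examples and the function names are hypothetical.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence, using the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def loss_at_optimal_d(p_r, p_g):
    """L(D*, g) = 2 JS(P_r || P_g) - 2 log 2."""
    return 2 * js(p_r, p_g) - 2 * math.log(2)

p_r = [1.0, 0.0, 0.0, 0.0]       # "real" mass on the first bin
near = [0.0, 1.0, 0.0, 0.0]      # generator one bin away
far = [0.0, 0.0, 0.0, 1.0]       # generator three bins away

print(js(p_r, near), js(p_r, far))   # both equal log 2
print(loss_at_optimal_d(p_r, p_r))   # -2 log 2 when the distributions match
```

Moving the generator from `far` to `near` does not change the loss at all, so the loss gives the generator no signal about which direction is better.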


Problems of GANs

Problem 2: The objective of the $-\log D$ trick leads to mode collapse, where the generated samples lack diversity.

¡ Ian Goodfellow also proposed replacing $\log(1 - D(\mathbf{x}))$ by $-\log D(\mathbf{x})$ to address the vanishing generator gradient. This is usually referred to as the $-\log D$ trick.

¡ With this trick, the generator's loss under the optimal discriminator was derived as:

$$L(D^*, g_\theta) = KL(P_g \| P_r) - 2\,JS(P_r \| P_g),$$

where $KL$ is the KL divergence.

Problems of GANs

¡ Optimizing this generator loss may lead to mode collapse.

¡ The loss penalizes
  ¡ slightly when the generator fails to generate some real samples;

  ¡ heavily when the generator generates unrealistic samples.

¡ Thus, the generator is more likely to generate safe samples (low diversity) than to explore new regions of the sample space.

WGAN

¡ Replace the JS and KL divergences by the Wasserstein distance, also known as the Earth-Mover (EM) distance:

$$W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}[\|x - y\|],$$

where $\Pi(P_r, P_g)$ denotes the set of all joint distributions $\gamma(x, y)$ whose marginals are $P_r$ and $P_g$, respectively.

¡ Intuitively, $\gamma(x, y)$ indicates how much “mass” must be transported from $x$ to $y$ in order to transform the distribution $P_r$ into the distribution $P_g$.

¡ The EM distance is then the “cost” of the optimal transport plan.
  ¡ The new loss function can reflect image quality.
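In one dimension, the EM distance between two equally sized samples has a simple closed form: sort both samples and average the absolute differences. The sketch below relies on that special case, which does not carry over to images, and the function name is hypothetical. It shows that the distance shrinks smoothly as the generated samples move toward the real ones, even when the two sets never overlap.

```python
def wasserstein_1d(xs, ys):
    """EM distance between two 1-D empirical samples of equal size."""
    assert len(xs) == len(ys), "this shortcut assumes equally sized samples"
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

real = [0.0, 0.0, 0.0]
for shift in (4.0, 2.0, 1.0, 0.0):
    fake = [shift, shift, shift]
    print(shift, wasserstein_1d(real, fake))   # the distance equals the shift
```

Unlike the JS divergence, which stays at $\log 2$ for any non-overlapping pair, this distance still tells the generator how far away it is.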


WGAN

¡ How do we set up the optimization to minimize the Wasserstein distance?

¡ Only three modifications to the original GAN:
  1. Remove the sigmoid activation in the discriminator.

  2. Remove the log in the losses of both the discriminator and the generator.

  3. Clip the weights to a fixed range $[-c, c]$.

WGAN


Replace Adam by RMSProp

No log

Weight clipping

No sigmoid
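The four annotations (RMSProp instead of Adam, no log, weight clipping, no sigmoid) can be seen together in a 1-D toy version of the training loop. Everything specific here is an assumption made to keep the sketch self-contained: the linear critic f(x) = w·x, the shift-only generator G(z) = z + θ, the Gaussian data, the hand-derived gradients, and the learning rate. Only the overall structure follows the slide.

```python
import math
import random

def rmsprop_update(param, grad, state, lr=0.05, decay=0.9, eps=1e-8):
    """One RMSProp step; returns the new parameter and running average."""
    state = decay * state + (1 - decay) * grad * grad
    return param - lr * grad / (math.sqrt(state) + eps), state

def clip(value, c):
    return max(-c, min(c, value))

def train_wgan(iterations=200, n_critic=5, batch=64, c=0.01, seed=0):
    rng = random.Random(seed)
    w, theta = 0.0, 0.0          # critic f(x) = w * x, generator G(z) = z + theta
    v_w, v_theta = 0.0, 0.0      # RMSProp running averages
    for _ in range(iterations):
        for _ in range(n_critic):
            real = [rng.gauss(3.0, 0.5) for _ in range(batch)]          # assumed data
            fake = [rng.gauss(0.0, 0.5) + theta for _ in range(batch)]
            # Critic loss (no log, no sigmoid): mean f(fake) - mean f(real).
            grad_w = sum(fake) / batch - sum(real) / batch
            w, v_w = rmsprop_update(w, grad_w, v_w)
            w = clip(w, c)                                              # weight clipping
        # Generator loss: -mean f(G(z)); its derivative w.r.t. theta is -w.
        theta, v_theta = rmsprop_update(theta, -w, v_theta)
    return w, theta

w, theta = train_wgan()
print(w, theta)   # theta should have moved from 0 toward the real mean of 3
```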

WGAN: Result

¡ The Wasserstein estimate directly reflects the quality of generated images.

Image source: Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein gan." arXiv preprint arXiv:1701.07875 (2017).

WGAN: Result


DCGAN with WGAN DCGAN with GAN

WGAN without BN GAN without BN

WGAN using MLP GAN using MLP

Image source: Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein gan." arXiv preprint arXiv:1701.07875 (2017).

SAGAN


SAGAN: Motivation

Problem of long-range dependencies.
¡ Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps.

¡ Long-range dependencies can only be processed after passing through several convolutional layers.

¡ As a result, such models fail to capture geometric or structural patterns that occur consistently in some classes.
  ¡ For example, dogs are often drawn with realistic fur texture but without clearly defined separate feet.

SAGAN


Image source: Zhang, Han, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. "Self-attention generative adversarial networks." In International Conference on Machine Learning, pp. 7354-7363. PMLR, 2019.

SAGAN



¡ Three mapping functions $f(x)$, $g(x)$, $h(x)$ are applied to the feature map, and their outputs are combined into an attention-weighted feature.

¡ Is it similar to something we have seen before?
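The computation resembles the query, key, and value attention used in Transformers. Below is a simplified sketch in plain Python: the feature map is treated as a list of N location vectors, the three mappings are plain matrices (standing in for 1×1 convolutions), and the output is added back to the input with a scale `gamma`. The matrix shapes, the random values, and the omission of the final output projection are simplifications assumed for this example.

```python
import math
import random

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X, Wf, Wg, Wh, gamma):
    """X: list of N feature vectors (one per spatial location)."""
    F = [matvec(Wf, x) for x in X]   # f(x_i), plays the role of keys
    Q = [matvec(Wg, x) for x in X]   # g(x_j), plays the role of queries
    H = [matvec(Wh, x) for x in X]   # h(x_i), plays the role of values
    out, attn = [], []
    for j in range(len(X)):
        scores = [sum(a * b for a, b in zip(F[i], Q[j])) for i in range(len(X))]
        beta = softmax(scores)       # how much location j attends to each i
        attn.append(beta)
        o = [sum(beta[i] * H[i][ch] for i in range(len(X)))
             for ch in range(len(H[0]))]
        out.append([gamma * oc + xc for oc, xc in zip(o, X[j])])
    return out, attn

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

X = rand_matrix(4, 3)                       # 4 locations, 3 channels
Wf, Wg, Wh = rand_matrix(2, 3), rand_matrix(2, 3), rand_matrix(3, 3)
out, attn = self_attention(X, Wf, Wg, Wh, gamma=0.0)
```

With `gamma=0.0` the block passes its input through unchanged, so a network can start from purely local features and learn how much non-local evidence to mix in.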

SAGAN



PIX2PIX AND CYCLEGAN


pix2pix

¡ Given pairs of corresponding images, learn to translate an image from one style into the other.

Image source: Isola, Phillip, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. "Image-to-image translation with conditional adversarial networks." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1125-1134. 2017.

pix2pix

¡ The model is based on CGAN.

¡ As an improvement, the generator is tasked not only with fooling the discriminator but also with staying near the ground-truth output.

¡ An $L_1$ penalty is added to the CGAN loss to encourage less blurring:

$$\mathcal{L}_{L1}(G) = \mathbb{E}_{x, y, z}[\|y - G(x, z)\|_1].$$
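How the two terms combine for the generator can be sketched as below. The function names are hypothetical, the images are flattened lists of pixel values, the L1 term is averaged over pixels (a common implementation choice), and the default weight of 100 is an assumption of this example.

```python
def l1_loss(target, output):
    """Mean absolute difference between ground truth y and generated G(x, z)."""
    return sum(abs(t - o) for t, o in zip(target, output)) / len(target)

def generator_loss(adversarial_loss, target, output, lam=100.0):
    """Adversarial term plus weighted L1 reconstruction term."""
    return adversarial_loss + lam * l1_loss(target, output)

y_true = [1.0, 0.0]          # toy ground-truth "image"
y_fake = [0.5, 0.5]          # toy generator output
total = generator_loss(0.7, y_true, y_fake)
```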



pix2pix: More Applications



CycleGAN

¡ Paired examples can be expensive to obtain.

¡ Can we translate from 𝑋 ↔ 𝑌 in an unsupervised manner?


Paired vs. unpaired examples

CycleGAN

¡ Two generators:
  ¡ $G: X \to Y$;

  ¡ $F: Y \to X$.

¡ Two discriminators:
  ¡ $D_X$ aims to distinguish between images $\{x\}$ and translated images $\{F(y)\}$;

  ¡ $D_Y$ aims to distinguish between $\{y\}$ and $\{G(x)\}$.

Image source: Zhu, Jun-Yan, Taesung Park, Phillip Isola, and Alexei A. Efros. "Unpaired image-to-image translation using cycle-consistent adversarial networks." In Proceedings of the IEEE international conference on computer vision, pp. 2223-2232. 2017.

CycleGAN

¡ If we can go from $X$ to $\hat{Y}$ via $G$, then it should also be possible to go from $\hat{Y}$ back to $X$ via $F$.

¡ A cycle consistency loss is added to the original adversarial loss:

$$\mathcal{L}_{cyc}(G, F) = \mathbb{E}_x[\|F(G(x)) - x\|_1] + \mathbb{E}_y[\|G(F(y)) - y\|_1].$$
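The cycle consistency term can be sketched with toy mappings standing in for the two generators. The lambda functions below are hypothetical stand-ins chosen so the result is easy to verify by hand. Real generators are neural networks, and this term is added to the adversarial losses rather than used alone.

```python
def l1(a, b):
    return sum(abs(u - v) for u, v in zip(a, b))

def cycle_loss(G, F, xs, ys):
    """E_x ||F(G(x)) - x||_1 + E_y ||G(F(y)) - y||_1, averaged over samples."""
    forward = sum(l1(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward

G = lambda x: [2 * v for v in x]          # toy X -> Y mapping
F_good = lambda y: [v / 2 for v in y]     # exact inverse of G
F_bad = lambda y: [v / 2 + 1 for v in y]  # inverse with an offset

xs = [[1.0, 2.0]]
ys = [[4.0, 6.0]]
loss_good = cycle_loss(G, F_good, xs, ys)
loss_bad = cycle_loss(G, F_bad, xs, ys)
```

The loss is zero only when the two mappings undo each other, which is what constrains the otherwise unpaired translation.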


CycleGAN


Failed case:

SRGAN


SRGAN

¡ Task: given a low-resolution image, reconstruct the high-resolution image.

¡ Previous work: a residual transposed-convolution CNN trained with an MSE loss has shown great performance on this task, but the reconstructions are blurry.

Bicubic Original

Image source: Ledig, Christian, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken et al. "Photo-realistic single image super-resolution using a generative adversarial network." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4681-4690. 2017.

SRGAN

¡ Generator: similar to ResNet, with learned upsampling layers (sub-pixel convolutions in the paper) that increase the spatial dimensions.

¡ Discriminator: fully-convolutional CNN + MLP classifier + binary cross-entropy.

Image source: Ledig, Christian, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken et al. "Photo-realistic single image super-resolution using a generative adversarial network." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4681-4690. 2017.

SRGAN: VGG Loss

¡ Reconstructions trained with a pixel-wise MSE loss often lack high-frequency content, which results in overly smooth textures.

¡ The authors use a VGG loss based on the ReLU activation layers of the pre-trained 19-layer VGG network.

¡ $\phi_{i,j}$ is the feature map obtained by the $j$-th convolution (after activation) before the $i$-th max-pooling layer.
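The equation on this slide did not survive extraction. For reference, the VGG loss as it is usually written for SRGAN is the mean squared difference between the feature maps of the reconstruction and of the reference image. The symbols $I^{HR}$, $I^{LR}$, $G_{\theta_G}$, $W_{i,j}$ and $H_{i,j}$ are not defined elsewhere in these notes and denote the high-resolution image, the low-resolution input, the generator, and the width and height of the feature map.

```latex
l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}}
  \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}}
  \left( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}\!\left(G_{\theta_G}(I^{LR})\right)_{x,y} \right)^2
```

The only change from a pixel-wise MSE is that both images are first passed through $\phi_{i,j}$, so the comparison happens in feature space.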


Image source: Lecture 4, MSBA434, UCLA

SRGAN: Dual Loss

¡ Weighted sum of adversarial loss and VGG activation loss:
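The equation on this slide did not survive extraction. As a reference sketch, the weighted sum is commonly written as follows, where $l^{SR}_{X}$ is the content loss (here the VGG loss), $l^{SR}_{Gen}$ is the adversarial loss summed over $N$ training samples, and the weight $10^{-3}$ is the value reported in the SRGAN paper. The symbols $I^{LR}$, $G_{\theta_G}$ and $D_{\theta_D}$ are not defined elsewhere in these notes and denote the low-resolution input, the generator, and the discriminator.

```latex
l^{SR} = l^{SR}_{X} + 10^{-3}\, l^{SR}_{Gen},
\qquad
l^{SR}_{Gen} = \sum_{n=1}^{N} -\log D_{\theta_D}\!\left(G_{\theta_G}(I^{LR})\right)
```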


Image source: Lecture 4, MSBA434, UCLA

SRGAN: Results

Image source: Ledig, Christian, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken et al. "Photo-realistic single image super-resolution using a generative adversarial network." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4681-4690. 2017.

General Idea of Using GAN

¡ GANs can be applied to many applications, not just images.

¡ To make sure that a task can be solved by a GAN, we should identify several things:
  ¡ What are the fake samples?

  ¡ What are the real samples?

  ¡ How do we transform fake samples into real samples (generator)?

  ¡ How do we distinguish fake samples from real samples (discriminator)?

Conclusion

After this lecture, you should know:

¡ What is the basic idea of a GAN?

¡ How do the generator and the discriminator improve each other?

¡ How does transposed convolution work?

¡ How do we design an application-specific loss to train together with the adversarial loss?

Suggested Reading

¡ Adversarial Nets Papers: https://github.com/zhangqianhui/AdversarialNetsPapers

¡ Tips and tricks to make GANs work: https://github.com/soumith/ganhacks

¡ The Astonishing Wasserstein GAN (in Chinese): https://zhuanlan.zhihu.com/p/25071913

Assignment 4

¡ Assignment 4 is released. The deadline is 18:00, 11th December.


Thank you!

¡ Any questions?

¡ Don’t hesitate to email me with questions or for discussion.
