ladder vae - arindam.cs.illinois.edu
TRANSCRIPT
![Page 1: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/1.jpg)
Ladder VAE
Hantao Zhang
![Page 2: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/2.jpg)
Introduction
• Ladder VAE (LVAE) was introduced in 2016, building on the VAE (2013).
• Explores the variational inference part of the VAE model
Main change
• Recursively corrects the generative distribution by a data-dependent approximate likelihood
![Page 3: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/3.jpg)
Review of VAE
• Variational inference -> generative
• Hierarchies of conditional stochastic variables
![Page 4: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/4.jpg)
The Problem
• VAE models are hierarchical
• Difficult to optimize as the number of stochastic layers grows
  • (higher-order layers learn nothing)
• Constrained complexity
![Page 5: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/5.jpg)
Main Contribution
• Proposed the Ladder VAE architecture to support a deep hierarchical encoder.
• Verified the importance of BatchNorm (BN) and Warm-Up (WU)
![Page 6: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/6.jpg)
Model Architecture
• Shared information between encoder and decoder
• Deterministic upward pass
• Followed by stochastic downward pass
[Figure panels: VAE, LVAE]
![Page 7: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/7.jpg)
Model cont.
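• Objective
  • $\log p(x) \ge \mathbb{E}_{q_\phi(z|x)}\left[\log \frac{p_\theta(x,z)}{q_\phi(z|x)}\right] = \mathcal{L}(\theta,\phi;x)$
  • $\mathcal{L}(\theta,\phi;x) = -\mathrm{KL}\left(q_\phi(z|x)\,\|\,p_\theta(z)\right) + \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right]$
  • First term: Variational Regularization Term; second term: Reconstruction Error
• Generative arch (Decoder)
  • $p_\theta(z) = p_\theta(z_L)\prod_{i=1}^{L-1} p_\theta(z_i|z_{i+1})$
  • $p_\theta(z_i|z_{i+1}) = N\left(z_i \mid \mu_{p,i}(z_{i+1}),\, \sigma^2_{p,i}(z_{i+1})\right)$, $p_\theta(z_L) = N(z_L \mid 0, I)$
  • $p_\theta(x|z_1) = N\left(x \mid \mu_{p,0}(z_1),\, \sigma^2_{p,0}(z_1)\right)$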
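The top-down factorization of the generative model can be illustrated with ancestral sampling: draw $z_L$ from the standard normal, then each $z_i$ given $z_{i+1}$, and finally $x$ given $z_1$. The following is a minimal NumPy sketch, not the authors' implementation. The layer sizes, the randomly initialized linear maps, and the helper names (`make_layer`, `gaussian_params`, `sample_prior`) are all assumptions made for illustration.

```python
import numpy as np

def softplus(a):
    # Numerically stable softplus; keeps the variances positive.
    return np.logaddexp(0.0, a)

def make_layer(rng, d_in, d_out):
    # Hypothetical linear maps producing a mean and a pre-activation variance.
    return {
        "W_mu": rng.normal(scale=0.1, size=(d_in, d_out)),
        "W_var": rng.normal(scale=0.1, size=(d_in, d_out)),
    }

def gaussian_params(layer, h):
    # mu(h) = Linear(h), sigma^2(h) = Softplus(Linear(h))
    mu = h @ layer["W_mu"]
    var = softplus(h @ layer["W_var"])
    return mu, var

def sample_prior(rng, layers, top_dim, n_samples):
    """Ancestral sampling: z_L ~ N(0, I), z_i ~ p(z_i | z_{i+1}), x ~ p(x | z_1)."""
    z = rng.standard_normal((n_samples, top_dim))  # z_L
    for layer in layers:  # ordered top-down; the last layer maps z_1 to x
        mu, var = gaussian_params(layer, z)
        z = mu + np.sqrt(var) * rng.standard_normal(mu.shape)
    return z

rng = np.random.default_rng(0)
# Assumed sizes: dim(z_3) = 4, dim(z_2) = 8, dim(z_1) = 16, dim(x) = 32.
sizes = [4, 8, 16, 32]
layers = [make_layer(rng, a, b) for a, b in zip(sizes[:-1], sizes[1:])]
x = sample_prior(rng, layers, top_dim=sizes[0], n_samples=5)
```

With untrained random weights the samples are meaningless; the point is only the order in which the conditionals are evaluated.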
![Page 8: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/8.jpg)
Model cont. (Inference arch)
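• VAE
  • $q_\phi(z|x) = q_\phi(z_1|x)\prod_{i=2}^{L} q_\phi(z_i|z_{i-1})$
  • $q_\phi(z_1|x) = N\left(z_1 \mid \mu_{q,1}(x),\, \sigma^2_{q,1}(x)\right)$
  • $q_\phi(z_i|z_{i-1}) = N\left(z_i \mid \mu_{q,i}(z_{i-1}),\, \sigma^2_{q,i}(z_{i-1})\right)$, $i = 2 \dots L$
  • $d(y) = \mathrm{MLP}(y)$
  • $\mu(y) = \mathrm{Linear}(d(y))$
  • $\sigma^2(y) = \mathrm{Softplus}(\mathrm{Linear}(d(y)))$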
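• LVAE
  • $\sigma^2_{q,i} = \dfrac{1}{\hat{\sigma}^{-2}_{q,i} + \sigma^{-2}_{p,i}}$
  • $\mu_{q,i} = \dfrac{\hat{\mu}_{q,i}\,\hat{\sigma}^{-2}_{q,i} + \mu_{p,i}\,\sigma^{-2}_{p,i}}{\hat{\sigma}^{-2}_{q,i} + \sigma^{-2}_{p,i}}$
  • $\sigma_{q,L} = \hat{\sigma}_{q,L}$, $\mu_{q,L} = \hat{\mu}_{q,L}$
  • $q_\phi(z_i|\cdot) = N\left(z_i \mid \mu_{q,i},\, \sigma^2_{q,i}\right)$
  • $d_n = \mathrm{MLP}(d_{n-1})$, $d_0 = x$
  • $\hat{\mu}_{q,i} = \mathrm{Linear}(d_i)$, $i = 1 \dots L$
  • $\hat{\sigma}^2_{q,i} = \mathrm{Softplus}(\mathrm{Linear}(d_i))$, $i = 1 \dots L$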
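The LVAE inference equations can be sketched in NumPy as a deterministic upward pass followed by a stochastic downward pass, where each layer merges the bottom-up estimate with the top-down prior by adding precisions. This is an illustrative sketch under stated assumptions, not the authors' code: each MLP is reduced to a single tanh layer, all weights are random, and the names `init_params`, `precision_weighted`, and `lvae_infer` as well as the layer sizes are hypothetical.

```python
import numpy as np

def softplus(a):
    # Numerically stable softplus; keeps the variances positive.
    return np.logaddexp(0.0, a)

def precision_weighted(mu_hat, var_hat, mu_p, var_p):
    """Merge bottom-up (hat) and top-down (p) Gaussians by adding precisions."""
    prec_hat, prec_p = 1.0 / var_hat, 1.0 / var_p
    var_q = 1.0 / (prec_hat + prec_p)
    mu_q = (mu_hat * prec_hat + mu_p * prec_p) * var_q
    return mu_q, var_q

def init_params(rng, x_dim, hid, z_dims):
    # z_dims[i] is the size of z_{i+1}; all weights are random placeholders.
    def w(a, b):
        return rng.normal(scale=0.1, size=(a, b))
    L = len(z_dims)
    return {
        "up": [w(x_dim if n == 0 else hid, hid) for n in range(L)],
        "mu_hat": [w(hid, z) for z in z_dims],
        "var_hat": [w(hid, z) for z in z_dims],
        # Top-down heads for p(z_i | z_{i+1}), i = 1..L-1.
        "mu_p": [w(z_dims[i + 1], z_dims[i]) for i in range(L - 1)],
        "var_p": [w(z_dims[i + 1], z_dims[i]) for i in range(L - 1)],
    }

def lvae_infer(rng, x, p):
    L = len(p["up"])
    # Deterministic upward pass: d_n = MLP(d_{n-1}), d_0 = x.
    d, mu_hat, var_hat = x, [], []
    for n in range(L):
        d = np.tanh(d @ p["up"][n])  # assumed one-layer "MLP"
        mu_hat.append(d @ p["mu_hat"][n])
        var_hat.append(softplus(d @ p["var_hat"][n]))
    # Top layer: the posterior uses the bottom-up estimate directly.
    mu_q, var_q = mu_hat[-1], var_hat[-1]
    z = mu_q + np.sqrt(var_q) * rng.standard_normal(mu_q.shape)
    zs = [z]
    # Stochastic downward pass for the remaining layers.
    for i in range(L - 2, -1, -1):
        mu_p = z @ p["mu_p"][i]
        var_p = softplus(z @ p["var_p"][i])
        mu_q, var_q = precision_weighted(mu_hat[i], var_hat[i], mu_p, var_p)
        z = mu_q + np.sqrt(var_q) * rng.standard_normal(mu_q.shape)
        zs.append(z)
    return zs[::-1]  # [z_1, ..., z_L]

rng = np.random.default_rng(0)
params = init_params(rng, x_dim=32, hid=20, z_dims=[16, 8, 4])
zs = lvae_infer(rng, rng.standard_normal((5, 32)), params)
```

When the two estimates have equal variance, the merged mean is their average and the merged variance is halved, which is a quick sanity check on `precision_weighted`.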
![Page 9: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/9.jpg)
Warm-Up
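• Motivation
  • A large number of units become inactive in the early stage of training
• Solution
  • Initialize training using the reconstruction error only
  • $\log p(x) \ge \mathbb{E}_{q_\phi(z|x)}\left[\log \frac{p_\theta(x,z)}{q_\phi(z|x)}\right] = \mathcal{L}(\theta,\phi;x)$
  • Warm-up weights the KL term by $\beta$: $-\beta\,\mathrm{KL}\left(q_\phi(z|x)\,\|\,p_\theta(z)\right) + \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right]$, with $\beta$ raised from 0 to 1 during early training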
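A warm-up schedule of this kind can be sketched as follows. The linear ramp, the `warmup_epochs` value, and the function names are assumptions for illustration, and the KL term is written for a single latent layer with a standard normal prior to keep the example short.

```python
import numpy as np

def beta_schedule(epoch, warmup_epochs):
    """Assumed linear warm-up: beta goes from 0 to 1, then stays at 1."""
    return min(1.0, epoch / warmup_epochs)

def kl_to_standard_normal(mu, var):
    # KL( N(mu, diag(var)) || N(0, I) ), summed over latent dimensions.
    return 0.5 * np.sum(var + mu**2 - 1.0 - np.log(var), axis=-1)

def warmup_objective(log_px_given_z, mu, var, beta):
    # -beta * KL + reconstruction term, averaged over the batch.
    return np.mean(log_px_given_z - beta * kl_to_standard_normal(mu, var))

# Toy batch: 3 examples, 2 latent dimensions, made-up reconstruction terms.
mu = np.array([[0.5, -0.5], [1.0, 0.0], [0.0, 0.0]])
var = np.array([[1.0, 0.5], [2.0, 1.0], [1.0, 1.0]])
log_px = np.array([-10.0, -12.0, -9.0])

for epoch in (0, 100, 200):
    beta = beta_schedule(epoch, warmup_epochs=200)
    print(epoch, beta, warmup_objective(log_px, mu, var, beta))
```

At `epoch = 0` the objective reduces to the reconstruction term alone, matching the "reconstruction error only" initialization on the slide.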
![Page 10: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/10.jpg)
Experiments
[Result panels: MNIST, OMNIGLOT, MNIST]
![Page 11: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/11.jpg)
Experiments
Samples from Prior
![Page 12: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/12.jpg)
Experiments: active unit comparison
![Page 13: Ladder VAE - arindam.cs.illinois.edu](https://reader031.vdocuments.mx/reader031/viewer/2022012512/618afb85517cd26b3f5e76ba/html5/thumbnails/13.jpg)
Experiments: PCA analysis