TRANSCRIPT
Generative ConvNet with Continuous Latent Factors
Student: Jerry Xu    Mentor: Prof. Ying Nian Wu
Shanghai Jiao Tong University / University of California, Los Angeles
UCLA CSST --- Aug. 31st, 2016
Introduction: Objective
A model that truly understands the data.
Find a meaningful mapping from data to latent vectors; conversely, every latent vector can reconstruct an image.
[Figure: input images, each a data vector I]
Basic Linear Model: I = WZ + ε
Factor analysis / principal component analysis
I ∈ ℝ^10000, W ∈ ℝ^(10000×20), Z ∈ ℝ^20
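The linear model above can be sketched in a few lines of NumPy. The sizes and the synthetic data below are illustrative placeholders, not the talk's 10000-pixel face images; PCA via SVD stands in for the general factor-analysis fit.

```python
# Sketch of the linear latent-factor model I = W Z + eps, fit by PCA.
# Sizes and data are synthetic placeholders (not the talk's face images).
import numpy as np

rng = np.random.default_rng(0)
D, d, n = 100, 20, 500                 # data dim, latent dim, sample count

# Synthetic data lying near a d-dimensional subspace.
W_true = rng.normal(size=(D, d))
I_data = rng.normal(size=(n, d)) @ W_true.T + 0.01 * rng.normal(size=(n, D))

# PCA: the top-d principal directions give the best linear basis W.
mean = I_data.mean(axis=0)
U, S, Vt = np.linalg.svd(I_data - mean, full_matrices=False)
W = Vt[:d].T                           # learned basis, shape (D, d)

Z = (I_data - mean) @ W                # map data to latent vectors
I_rec = mean + Z @ W.T                 # map latent vectors back to data
err = np.mean((I_data - I_rec) ** 2)   # small: linear data, linear model
```

Both directions of the objective show up here: `Z` is the mapping from data to latent vectors, and `I_rec` is the reconstruction from latent vectors.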
Introduction: Mapping with a Generative ConvNet
Our Model (Generative ConvNet)
Non-linear; multi-layer.
More explicitly: horizontal unfolding (convolutional) and hierarchical unfolding.
f(Z; W): parameterized by a ConvNet.

Factor Analysis Model
Linear; one-layer.
Dimension reduction; dictionary learning; latent factor extraction.
e.g. principal component analysis (PCA), non-negative matrix factorization (NMF), independent component analysis (ICA).

Generalization: I = WZ + ε  →  I = f(Z; W) + ε
Non-linear Model (non-linear PCA): I = f(Z; W) + ε
f(Z; W) = f₁(W₁, f₂(W₂, …, f_d(W_d, Z)…)), a composition of d layers applied top-down from Z.
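The composition f(Z; W) = f₁(W₁, f₂(W₂, …)) can be sketched as a stack of layers applied top-down from Z. The layer sizes and the ReLU non-linearity here are illustrative choices, not the talk's actual architecture.

```python
# Sketch of f(Z; W) = f1(W1, f2(W2, ..., fd(Wd, Z)...)) as a top-down
# multi-layer network. Layer sizes are illustrative, not the talk's net.
import numpy as np

rng = np.random.default_rng(1)

def layer(W, b, x):
    """One unfolding step f_k(W_k, x): affine map followed by ReLU."""
    return np.maximum(0.0, W @ x + b)

# Three layers expanding a 20-dim Z into a 64x64 "image" (flattened).
sizes = [20, 128, 512, 64 * 64]
Ws = [rng.normal(scale=0.1, size=(sizes[k + 1], sizes[k])) for k in range(3)]
bs = [np.zeros(sizes[k + 1]) for k in range(3)]

def f(Z):
    """Hierarchical unfolding: compose the layers from Z to image space."""
    x = Z
    for W, b in zip(Ws, bs):
        x = layer(W, b, x)
    return x

I = f(rng.normal(size=20))             # a synthesized "image", shape (4096,)
```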
Reconstruction Error: err = ‖Y − f(Z; W)‖₂²

Likelihood for data {Y_i}:
L(W, {Z_i}) = Σ_i log p(Y_i, Z_i; W) = −Σ_i [ ‖Y_i − f(Z_i; W)‖² / (2σ²) + ‖Z_i‖² / 2 ] + constant

We need to maximize L(W, {Z_i}) over W and {Z_i}.
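A per-example term of this objective is easy to sketch. The generator f below is a hypothetical linear stand-in, used only so the example is self-contained.

```python
# Sketch of one term of the complete-data log-likelihood:
# log p(Y, Z; W) = -||Y - f(Z; W)||^2 / (2 sigma^2) - ||Z||^2 / 2 + const.
import numpy as np

def log_joint(Y, Z, f, sigma=0.3):
    """Log joint density of (Y, Z), dropping the additive constant."""
    resid = Y - f(Z)
    return -np.sum(resid ** 2) / (2 * sigma ** 2) - np.sum(Z ** 2) / 2

rng = np.random.default_rng(2)
W = rng.normal(size=(50, 10))          # hypothetical linear generator weights
f = lambda Z: W @ Z
Z = rng.normal(size=10)
Y = f(Z)                               # an observation f explains perfectly
# Zero residual: only the Gaussian prior term -||Z||^2 / 2 remains.
value = log_joint(Y, Z, f)
```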
Basic Gradient Descent (on training the Generative ConvNet)

∂L/∂Z_i = (1/σ²) (Y_i − f(Z_i; W)) ∂f(Z_i; W)/∂Z_i − Z_i

∂L/∂W = Σ_{i=1}^{n} (1/σ²) (Y_i − f(Z_i; W)) ∂f(Z_i; W)/∂W
Alternating Gradient Descent (on training the Generative ConvNet)
(1) Inferential back-propagation: for each Y_i, run l steps of gradient descent to update Z_i ← Z_i + η ∂L/∂Z_i.
(2) Learning back-propagation: update W ← W + η ∂L/∂W.
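The two alternating steps can be sketched end-to-end on a toy linear generator f(Z; W) = W Z, where both gradients have closed forms; with a real ConvNet, back-propagation would supply ∂f/∂Z and ∂f/∂W, but the alternation is identical. All sizes, step sizes, and data below are illustrative.

```python
# Alternating back-propagation on a toy linear generator f(Z; W) = W Z.
# Sizes, step sizes, and data are illustrative, not the talk's settings.
import numpy as np

rng = np.random.default_rng(3)
D, d, n, sigma = 20, 3, 100, 1.0
W_true = rng.normal(size=(D, d))
Y = rng.normal(size=(n, d)) @ W_true.T   # observations, one row per Y_i

W = rng.normal(scale=0.1, size=(D, d))   # learned weights
Z = np.zeros((n, d))                     # latent vectors, warm-started

for step in range(300):
    # (1) Inferential back-prop: a few gradient steps on each Z_i.
    for _ in range(10):
        R = Y - Z @ W.T                  # residuals Y_i - f(Z_i; W)
        Z += 0.02 * (R @ W / sigma ** 2 - Z)        # dL/dZ_i
    # (2) Learning back-prop: one gradient step on W (averaged over i).
    R = Y - Z @ W.T
    W += 0.01 * (R.T @ Z) / (n * sigma ** 2)        # dL/dW

err = np.mean((Y - Z @ W.T) ** 2)        # reconstruction error after training
```

The inner loop is the inferential step, the outer update is the learning step; the reconstruction error falls well below the variance of the raw data as the alternation proceeds.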
Langevin Sampling (on training the Generative ConvNet)
Maximizing the observed-data log-likelihood
L(W) = Σ_{i=1}^{n} log p(Y_i; W) = Σ_{i=1}^{n} log ∫ p(Y_i, Z_i; W) dZ_i

Langevin dynamics for inferring Z:
Z_{τ+1} = Z_τ + (Δ²/2) [ (1/σ²) (Y − f(Z_τ; W)) ∂f(Z_τ; W)/∂Z_τ − Z_τ ] + Δ ε_τ
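One Langevin update can be sketched directly from the formula above. The generator is again a hypothetical linear one so ∂f/∂Z is exact; the step size and dimensions are illustrative.

```python
# Sketch of Langevin dynamics for inferring Z given Y, following
# Z' = Z + (Delta^2/2) [ (Y - f(Z)) df/dZ / sigma^2 - Z ] + Delta * eps.
# Linear generator f(Z) = W Z, so df/dZ is exact; sizes are illustrative.
import numpy as np

rng = np.random.default_rng(4)

def langevin_step(Z, Y, W, sigma=0.3, delta=0.05):
    """One Langevin update: gradient of log p(Y, Z; W) plus injected noise."""
    grad = (W.T @ (Y - W @ Z)) / sigma ** 2 - Z
    return Z + 0.5 * delta ** 2 * grad + delta * rng.normal(size=Z.shape)

W = rng.normal(size=(40, 8))
Z_true = rng.normal(size=8)
Y = W @ Z_true                          # an observation with a known latent
Z = np.zeros(8)
for _ in range(200):
    Z = langevin_step(Z, Y, W)
# After burn-in, Z hovers near the posterior mode, close to Z_true.
mse = np.mean((Z - Z_true) ** 2)
```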
Model Evaluation (by reconstruction error)
Reconstruction Error: err = ‖Y − f(Z; W)‖₂²
Evaluated on test, train, and negative examples.
1. Fix the weights W; use Langevin sampling to infer the corresponding Z.
2. Pass Z through the network to reconstruct the image.
3. Compute the reconstruction error between the reconstructed image and the original image.
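The three evaluation steps above can be sketched on a toy linear generator with fixed weights. For brevity the inference below uses plain gradient ascent on log p(Y, Z; W) in place of full Langevin sampling, and all sizes and data are illustrative.

```python
# Evaluation by reconstruction error on a toy linear generator f(Z) = W Z,
# with the weights W held fixed. Sizes and data are illustrative.
import numpy as np

rng = np.random.default_rng(5)
W = rng.normal(size=(40, 8))            # fixed "trained" weights
sigma = 0.3

def infer_z(Y, steps=500, eta=1e-4):
    """Step 1: infer Z by gradient ascent on log p(Y, Z; W)."""
    Z = np.zeros(8)
    for _ in range(steps):
        Z += eta * ((W.T @ (Y - W @ Z)) / sigma ** 2 - Z)
    return Z

Y_test = W @ rng.normal(size=8)         # an image this model can explain
Z = infer_z(Y_test)
Y_rec = W @ Z                           # step 2: reconstruct from Z
err = np.sum((Y_test - Y_rec) ** 2)     # step 3: reconstruction error

Y_neg = 3.0 * rng.normal(size=40)       # a "negative" off-model image
err_neg = np.sum((Y_neg - W @ infer_z(Y_neg)) ** 2)
# err is small; err_neg is much larger, since the model cannot explain Y_neg.
```

This is the same in-model vs. off-model gap the test/train/negative comparisons on the next slides visualize.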
Sample Reconstruction Error
[Figure: original vs. reconstructed images for test, train, and negative samples]
Training on 10,000 face images --- good result
Sample Reconstruction Error
[Figure: original vs. reconstructed images for test, train, and negative samples]
Training on 100 cat images --- overfitting
Sample Reconstruction Error
[Figure: original vs. reconstructed images for test, train, and negative samples]
Training on 20,000 face images --- underfitting
Ways to Make It Better
Tuning the configuration: dimension of Z; learning rate; Langevin step number and step size; momentum.
Modifying the net structure: number of FC layers; adding conv layers; modifying the number of channels; size of the output.
Result of Modifying the Net Structure: doubling the number of channels
[Figure: reconstructions before vs. after doubling the channels]
Interpolation Results
A linear combination of two or more Z vectors inferred from training images, decoded by the network, gives an interpolation between the corresponding images.
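Interpolation amounts to decoding convex combinations of two inferred latent vectors. The decoder and latents below are synthetic placeholders for the talk's trained model.

```python
# Sketch of latent-space interpolation: decode convex combinations of two
# inferred latent vectors Z_a and Z_b. The decoder f is a placeholder.
import numpy as np

rng = np.random.default_rng(6)
W = rng.normal(size=(100, 16))
f = lambda Z: np.tanh(W @ Z)            # hypothetical decoder network

Z_a, Z_b = rng.normal(size=16), rng.normal(size=16)   # inferred latents
frames = [f((1 - t) * Z_a + t * Z_b) for t in np.linspace(0.0, 1.0, 7)]
# frames[0] decodes Z_a, frames[-1] decodes Z_b; the rest morph between them.
```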
Conclusion
The synthesis results are meaningful; this suggests that our model can meaningfully explain the data.
1. vs. the energy-based generative ConvNet: both are generative, but our model does not need time-consuming MCMC for synthesis --- it is easy to synthesize from.
2. vs. the Generative Adversarial Net: a GAN needs a second network, a discriminator, to assist training; our model is simpler.
Reference
Han, Tian, et al. "Learning Generative ConvNet with Continuous Latent Factors by Alternating Back-Propagation." arXiv preprint arXiv:1606.08571 (2016).
Xie, Jianwen, et al. "A Theory of Generative ConvNet." arXiv preprint arXiv:1602.03264 (2016).
Karpathy, Andrej, et al. "Large-Scale Video Classification with Convolutional Neural Networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks." arXiv preprint arXiv:1511.06434 (2015).