![Page 1: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/1.jpg)
Deep Variational Inference
FLARE Reading Group Presentation
Wesley Tansey
9/28/2016
![Page 2: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/2.jpg)
What is Variational Inference?
![Page 5: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/5.jpg)
● Want to estimate some distribution, p*(x)
● Too expensive to estimate
● Approximate it with a tractable distribution, q(x)
What is Variational Inference?
[Figure: p*(x) and q(x)]
![Page 6: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/6.jpg)
● Fit q(x) inside of p*(x)
● Centered at a single mode
  ○ q(x) is unimodal here
  ○ VI behaves like a MAP estimate here (it locks onto one mode)
What is Variational Inference?
[Figure: p*(x) and q(x)]
![Page 7: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/7.jpg)
● Mathematically:
$$\mathrm{KL}(q \,\|\, p^*) = \sum_x q(x) \log \frac{q(x)}{p^*(x)}$$
What is Variational Inference?
Still hard!
p*(x) usually has a tricky normalizing constant
![Page 8: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/8.jpg)
● Mathematically:
$$\mathrm{KL}(q \,\|\, p^*) = \sum_x q(x) \log \frac{q(x)}{p^*(x)}$$
● Use unnormalized p~ instead
What is Variational Inference?
![Page 10: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/10.jpg)
● Mathematically:
$$\mathrm{KL}(q \,\|\, p^*) = \sum_x q(x) \log \frac{q(x)}{p^*(x)}$$
● Use unnormalized p~ instead, where p*(x) = p~(x) / Z:
$$\log \frac{q(x)}{p^*(x)} = \log q(x) - \log p^*(x) = \log q(x) - \log \frac{\tilde{p}(x)}{Z} = \log q(x) - \log \tilde{p}(x) + \log Z$$
What is Variational Inference?
log(Z) is constant with respect to q => can ignore it in our optimization problem
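A small numerical check of the "constant" claim may help here. The sketch below uses a made-up five-state unnormalized target (`p_tilde` and both candidate `q` vectors are illustrative assumptions, not from the slides) and shows that the KL against p* and the KL-style objective against p~ differ by the same amount, log Z, for any q.

```python
import numpy as np

# Hypothetical unnormalized target over 5 discrete states (illustrative values).
p_tilde = np.array([2.0, 1.0, 4.0, 0.5, 2.5])
Z = p_tilde.sum()          # normalizer; only cheap to compute in a toy case like this
p_star = p_tilde / Z


def kl_objective(q, p):
    """sum_x q(x) * log(q(x) / p(x)); p is allowed to be unnormalized."""
    return float(np.sum(q * (np.log(q) - np.log(p))))


q_a = np.full(5, 0.2)                           # uniform candidate
q_b = np.array([0.2, 0.1, 0.4, 0.05, 0.25])     # candidate proportional to p_tilde

# The gap between the two objectives is the same constant for every q.
gap_a = kl_objective(q_a, p_star) - kl_objective(q_a, p_tilde)
gap_b = kl_objective(q_b, p_star) - kl_objective(q_b, p_tilde)
print(gap_a, gap_b, np.log(Z))
```

Because the gap does not depend on q, ranking candidates by the unnormalized objective gives the same ordering as ranking them by the true KL.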
![Page 11: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/11.jpg)
● Classical method
● Uses a factorized q:
$$q(x) = \prod_i q_i(x_i)$$
Mean Field VI
[1] Blei, Ng, Jordan, “Latent Dirichlet Allocation”, JMLR, 2003.
![Page 12: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/12.jpg)
● Example: Multivariate Gaussian
● Product of independent Gaussians for q
● Factorized q has a diagonal (here spherical) covariance, so it misses correlations and underestimates the true covariance
Mean Field VI
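To make the Gaussian example concrete, here is a minimal sketch. It relies on the standard textbook result that, for a Gaussian target with precision matrix Λ, the KL(q || p)-optimal factorized Gaussian has per-coordinate variance 1 / Λ_ii. The specific covariance matrix is an illustrative assumption.

```python
import numpy as np

# Illustrative, strongly correlated 2-D Gaussian target.
Sigma = np.array([[1.0, 0.9],
                  [0.9, 1.0]])
Lambda = np.linalg.inv(Sigma)          # precision matrix

true_marginal_var = np.diag(Sigma)     # variance of each coordinate under p*
mean_field_var = 1.0 / np.diag(Lambda) # variance of each optimal factor q_i

print("true marginal variances:", true_marginal_var)
print("mean-field variances:   ", mean_field_var)
```

With correlation 0.9 the factorized variances come out at 0.19 against a true marginal variance of 1.0, which is the underestimation the slide refers to.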
![Page 14: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/14.jpg)
● Vanilla mean field VI assumes you know all the parameters, θ, of the true distribution, p*(x)
● Enter: Variational Bayes (VB)
Variational Bayes
[1] Blei, Ng, Jordan, “Latent Dirichlet Allocation”, JMLR, 2003.
![Page 15: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/15.jpg)
● VB infers both the latent variables, z, of q(x) and the parameters, θ, of p*(x)
● VB-EM was popularized for LDA [1]
○ E for z, M for θ
Variational Bayes
[1] Blei, Ng, Jordan, “Latent Dirichlet Allocation”, JMLR, 2003.
![Page 16: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/16.jpg)
● VB usually uses a mean field approximation of the form:
$$q(x) = q(z_i \mid \theta) \prod_i q_i(x_i \mid z_i)$$
Variational Bayes
![Page 19: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/19.jpg)
● Requires analytical solutions of expectations w.r.t. q_i
  ○ Intractable in general
● Factored form limits the power of the approximation
Issues with Mean Field VB
Solution: Variational Inference with Normalizing Flows (Rezende and Mohamed, 2015)
Solution: Auto-Encoding Variational Bayes (Kingma and Welling, 2014)
![Page 20: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/20.jpg)
Auto-Encoding Variational Bayes [1]
High-level idea:
1) Optimizing the same lower bound that we get in VB
2) Data augmentation (reparameterization) trick leads to a lower-variance estimator
3) Lots of choices of q(z | x) and p(z) lead to a partially closed-form bound
4) Use a neural network to parameterize qϕ(z | x) and pθ(x | z)
5) SGD to fit everything
[1] Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR, 2014.
![Page 24: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/24.jpg)
● Given N iid data points, (x(1), ..., x(N))
● Maximize the marginal likelihood:
$$\log p_\theta(x^{(1)}, \dots, x^{(N)}) = \sum_{i=1}^{N} \log p_\theta(x^{(i)})$$
1) VB Lower Bound
Always non-negative (KL term)
Lower bound
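The equation these two callouts annotate did not survive extraction. Assuming the slide follows Kingma and Welling's paper, it is most likely the per-datapoint decomposition below, where the KL term is the non-negative part and the remaining term is the lower bound.

```latex
\log p_\theta\big(x^{(i)}\big)
  = \underbrace{D_{KL}\big(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z \mid x^{(i)})\big)}_{\ge 0}
  + \underbrace{\mathcal{L}\big(\theta, \phi; x^{(i)}\big)}_{\text{lower bound}}
\quad\Longrightarrow\quad
\log p_\theta\big(x^{(i)}\big) \ge \mathcal{L}\big(\theta, \phi; x^{(i)}\big)
```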
![Page 31: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/31.jpg)
● Write lower bound
● Rewrite lower bound
● Monte Carlo gradient estimator of expectation part
  ○ Too high variance
1) VB Lower Bound
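The formulas for this slide were images and are missing from the transcript. As a hedged reconstruction based on Kingma and Welling's paper (not necessarily the exact form on the slide), the bound, its rewritten form, and the naive Monte Carlo gradient estimator are:

```latex
\begin{aligned}
\mathcal{L}\big(\theta, \phi; x^{(i)}\big)
  &= \mathbb{E}_{q_\phi(z \mid x^{(i)})}\!\left[-\log q_\phi\big(z \mid x^{(i)}\big) + \log p_\theta\big(x^{(i)}, z\big)\right] \\
  &= -D_{KL}\big(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z)\big)
     + \mathbb{E}_{q_\phi(z \mid x^{(i)})}\!\left[\log p_\theta\big(x^{(i)} \mid z\big)\right] \\[4pt]
\nabla_\phi \, \mathbb{E}_{q_\phi(z)}\big[f(z)\big]
  &= \mathbb{E}_{q_\phi(z)}\big[f(z)\, \nabla_\phi \log q_\phi(z)\big]
  \approx \frac{1}{L} \sum_{l=1}^{L} f\big(z^{(l)}\big)\, \nabla_\phi \log q_\phi\big(z^{(l)}\big),
  \qquad z^{(l)} \sim q_\phi(z)
\end{aligned}
```

The last line is the score-function estimator whose variance the slide calls too high.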
![Page 32: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/32.jpg)
● Rewrite qϕ(z(l) | x)
● Express a sample z from q as a deterministic function of x and an auxiliary noise variable ϵ
● Leads to lower variance estimator
2) Reparameterization trick
![Page 33: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/33.jpg)
● Example: univariate Gaussian
● Can rewrite as sum of mean and a scaled noise variable
2) Reparameterization trick
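The variance claim can be checked on a toy problem. This is a minimal sketch, assuming the test function f(z) = z² and z ~ N(μ, σ²), so the exact gradient of E[f(z)] with respect to μ is 2μ; the particular values of `mu`, `sigma`, and `n` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.5, 1.0, 100_000   # illustrative settings


def f(z):
    return z ** 2


# Reparameterize: z = mu + sigma * eps, with eps ~ N(0, 1).
eps = rng.standard_normal(n)
z = mu + sigma * eps

# Score-function estimator: f(z) * d/dmu log N(z; mu, sigma^2).
score_grads = f(z) * (z - mu) / sigma ** 2

# Reparameterized estimator: f'(z) * dz/dmu, where dz/dmu = 1.
reparam_grads = 2.0 * z

exact = 2.0 * mu
print("exact gradient:      ", exact)
print("score-function: mean", score_grads.mean(), "var", score_grads.var())
print("reparameterized: mean", reparam_grads.mean(), "var", reparam_grads.var())
```

Both estimators are unbiased for the same gradient, but on this toy problem the per-sample variance of the reparameterized one is roughly an order of magnitude smaller.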
![Page 34: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/34.jpg)
● Lots of distributions like this. Three classes given:
  ○ Tractable inverse CDF: Exponential, Cauchy, Logistic, Rayleigh, Pareto, Weibull, Reciprocal, Gompertz, Gumbel, Erlang
  ○ Location-scale: Laplace, Elliptical, Student’s t, Logistic, Uniform, Triangular, Gaussian
  ○ Composition: Log-Normal (exponentiated normal), Gamma (sum of exponentials), Dirichlet (sum of Gammas), Beta, Chi-Squared, F
2) Reparameterization trick
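One example per class may make the list easier to read. The sketch below picks one distribution from each class (Exponential, Laplace, Log-Normal); the parameter values are illustrative assumptions. In every case the randomness comes from a fixed noise source and the distribution's parameters enter only through a deterministic transform.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# 1) Tractable inverse CDF: Exponential(rate) from u ~ Uniform(0, 1).
rate = 2.0
u = rng.uniform(size=n)
z_exp = -np.log1p(-u) / rate          # inverse CDF applied to u

# 2) Location-scale: Laplace(loc, scale) from standard Laplace noise.
loc, scale = 1.0, 0.5
eps = rng.laplace(size=n)             # standard Laplace: loc=0, scale=1
z_lap = loc + scale * eps

# 3) Composition: Log-Normal as an exponentiated Gaussian.
m, s = 0.0, 0.25
z_logn = np.exp(m + s * rng.standard_normal(n))

print(z_exp.mean(), z_lap.mean(), z_logn.mean())
```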
![Page 35: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/35.jpg)
● Yields a new MC estimator
2) Reparameterization trick
![Page 36: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/36.jpg)
● Plug estimator into the lower bound eq.
● KL term often can be integrated analytically
  ○ Careful choice of priors
2) Reparameterization trick
![Page 38: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/38.jpg)
● KL term often can be integrated analytically
  ○ Careful choice of priors
  ○ E.g. both Gaussian
3) Partial closed form
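For the "both Gaussian" case, the KL term has a well-known closed form when q is a diagonal Gaussian and the prior is a standard normal. The sketch below compares that closed form to a Monte Carlo estimate; the function name and the values of `mu` and `log_var` are illustrative assumptions.

```python
import numpy as np


def kl_diag_gaussian_to_std_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * float(np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))


rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])          # illustrative variational mean
log_var = np.array([-0.5, 0.3])     # illustrative variational log-variance

closed_form = kl_diag_gaussian_to_std_normal(mu, log_var)

# Monte Carlo check: average of log q(z) - log p(z) over samples z ~ q.
sigma = np.exp(0.5 * log_var)
z = mu + sigma * rng.standard_normal((200_000, 2))
log_q = -0.5 * np.sum(((z - mu) / sigma) ** 2 + log_var + np.log(2 * np.pi), axis=1)
log_p = -0.5 * np.sum(z ** 2 + np.log(2 * np.pi), axis=1)
mc_estimate = float(np.mean(log_q - log_p))

print(closed_form, mc_estimate)
```

Only the reconstruction term then needs sampling, which is what "partial closed form" refers to.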
![Page 39: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/39.jpg)
● Regularizer (KL term)
● Reconstruction error (expected log-likelihood term)
● Neural nets
  ○ Encode: q(z | x)
  ○ Decode: p(x | z)
4) Auto-encoder connection
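To show how the two terms combine, here is a deliberately tiny NumPy sketch of a single-sample bound estimate. Everything about the architecture is an assumption for illustration: single linear layers stand in for the encoder and decoder networks, weights are random rather than trained, the decoder is Bernoulli, and the prior is a standard normal.

```python
import numpy as np

rng = np.random.default_rng(0)
x_dim, z_dim = 6, 2
x = rng.integers(0, 2, size=x_dim).astype(float)   # one binary data point

# Hypothetical single-layer "networks" with random, untrained weights.
W_mu = rng.normal(0.0, 0.1, (z_dim, x_dim))
W_log_var = rng.normal(0.0, 0.1, (z_dim, x_dim))
W_dec = rng.normal(0.0, 0.1, (x_dim, z_dim))


def encode(x):
    """Return mean and log-variance of the Gaussian q(z | x)."""
    return W_mu @ x, W_log_var @ x


def decode(z):
    """Return Bernoulli means of p(x | z)."""
    return 1.0 / (1.0 + np.exp(-(W_dec @ z)))


def lower_bound(x, eps):
    mu, log_var = encode(x)
    z = mu + np.exp(0.5 * log_var) * eps           # reparameterized sample
    probs = decode(z)
    # Reconstruction term: log p(x | z) for a Bernoulli decoder.
    recon = float(np.sum(x * np.log(probs) + (1.0 - x) * np.log(1.0 - probs)))
    # Regularizer: closed-form KL(q(z | x) || N(0, I)).
    kl = 0.5 * float(np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))
    return recon - kl, recon, kl


value, recon, kl = lower_bound(x, rng.standard_normal(z_dim))
print(value, recon, kl)
```

In practice the weights would be fit by SGD on the negative of this quantity, using an autodiff framework rather than plain NumPy.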
![Page 40: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/40.jpg)
● q(z | x) encodes
● p(x | z) decodes
● “Information layer(s)” need to compress
  ○ Reals = infinite info
  ○ Reals + random noise = finite info
4) Auto-encoder connection (alt.)
More info in Karol Gregor’s DeepMind lecture: https://www.youtube.com/watch?v=P78QYjWh5sM
![Page 41: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/41.jpg)
● Deep networks parameterize both q(z | x) and p(x | z)
● Lower-variance estimator of expected log-likelihood
● Can choose from lots of families of q(z | x) and p(z)
Where are we with VI now? (2013’ish)
![Page 42: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/42.jpg)
● Problem:
  ○ Most parametric families available are simple
  ○ E.g. product of independent univariate Gaussians
  ○ Most posteriors are complex
Where are we with VI now? (2013’ish)
![Page 43: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/43.jpg)
Variational Inference with Normalizing Flows [1]
High-level idea:
1) VAEs are great, but our approximate posterior q(z | x) needs to be simple
2) Take a simple q(z | x) and apply a series of k transformations to z to get q_k(z | x). Metaphor: z “flows” through each transform.
3) Be clever in the choice of transforms (computational issue)
4) Variational posterior q can now converge to the true posterior p in the limit
5) Deep NN now parameterizes q and the flow parameters
[1] Rezende, Danilo Jimenez, and Shakir Mohamed. "Variational inference with normalizing flows." arXiv preprint arXiv:1505.05770 (2015).
![Page 44: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/44.jpg)
● Function that transforms a probability density through a sequence of invertible mappings
What is a normalizing flow?
q0(z | x)
qk(z | x)
![Page 45: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/45.jpg)
● Chain rule lets us write q_k as the product of q_0 and inverse Jacobian determinants
Key equations (1)
![Page 46: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/46.jpg)
● Density q_k(z’) obtained by successively composing k transforms
Key equations (2)
![Page 47: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/47.jpg)
● Log-density of q_k(z’) has a nice additive form
Key equations (3)
![Page 48: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/48.jpg)
● Expectation over q_k can be written as an expectation under q_0
● Cute name: law of the unconscious statistician (LOTUS)
Key equations (4)
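The four "key equations" were images and are absent from the transcript. A hedged reconstruction, following the notation of Rezende and Mohamed's paper (the slide's exact symbols may differ), with each f_k an invertible map and z_k = f_k(z_{k-1}):

```latex
\begin{aligned}
\text{(1)}\quad & q(z') = q(z) \left| \det \frac{\partial f^{-1}}{\partial z'} \right|
                 = q(z) \left| \det \frac{\partial f}{\partial z} \right|^{-1},
                 \qquad z' = f(z) \\[4pt]
\text{(2)}\quad & z_K = f_K \circ \cdots \circ f_2 \circ f_1 (z_0) \\[4pt]
\text{(3)}\quad & \ln q_K(z_K) = \ln q_0(z_0)
                 - \sum_{k=1}^{K} \ln \left| \det \frac{\partial f_k}{\partial z_{k-1}} \right| \\[4pt]
\text{(4)}\quad & \mathbb{E}_{q_K}\big[h(z)\big]
                 = \mathbb{E}_{q_0}\big[h\big(f_K \circ \cdots \circ f_1 (z_0)\big)\big]
\end{aligned}
```

Equation (4) is what lets the bound be estimated with samples from q_0 alone, without ever evaluating q_K directly.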
![Page 49: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/49.jpg)
Types of flows
1) Infinitesimal Flows:
  ○ Can show convergence in the limit
  ○ Skipping (theoretical; computationally expensive)
2) Invertible Linear-Time Flows:
  ○ log-det can be calculated efficiently
![Page 50: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/50.jpg)
● Applies the transform:
where:
Planar Flows
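The transform itself is missing from the transcript. Assuming the slide uses the planar flow from Rezende and Mohamed's paper, f(z) = z + u h(wᵀz + b), the sketch below implements it with h = tanh and checks the O(d) log-determinant formula, log|1 + uᵀψ(z)| with ψ(z) = h'(wᵀz + b) w, against a finite-difference Jacobian. The parameter values and helper names are illustrative.

```python
import numpy as np


def planar_flow(z, u, w, b):
    """Apply f(z) = z + u * tanh(w.z + b); return f(z) and log|det Jacobian|."""
    a = w @ z + b
    f = z + u * np.tanh(a)
    psi = (1.0 - np.tanh(a) ** 2) * w           # h'(a) * w
    log_det = np.log(np.abs(1.0 + u @ psi))     # via the matrix determinant lemma
    return f, log_det


def numerical_jacobian(func, z, h=1e-6):
    """Central-difference Jacobian, used only to check the formula."""
    d = z.size
    J = np.zeros((d, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = h
        J[:, j] = (func(z + step) - func(z - step)) / (2.0 * h)
    return J


# Illustrative parameters; w.u = 0.16 >= -1 keeps the tanh flow invertible.
z = np.array([0.3, -0.7, 1.1])
u = np.array([0.5, 0.2, -0.3])
w = np.array([1.0, -0.5, 0.8])
b = 0.1

f_z, log_det = planar_flow(z, u, w, b)
J = numerical_jacobian(lambda v: planar_flow(v, u, w, b)[0], z)
numeric_log_det = np.log(np.abs(np.linalg.det(J)))
print(log_det, numeric_log_det)
```

The analytic log-det costs one dot product, whereas a generic d × d determinant would cost O(d³).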
![Page 51: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/51.jpg)
● Applies the transform:
where:
Radial Flows
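As with the planar case, the formula was an image. Assuming the radial flow from Rezende and Mohamed's paper, f(z) = z + β h(α, r)(z − z0) with r = ‖z − z0‖ and h(α, r) = 1 / (α + r), the sketch below checks the closed-form determinant [1 + βh]^(d−1) [1 + βh + βh'r] against a finite-difference Jacobian. The parameter values and helper names are illustrative.

```python
import numpy as np


def radial_flow(z, z0, alpha, beta):
    """Apply f(z) = z + beta * h(alpha, r) * (z - z0); return f(z) and log|det J|."""
    diff = z - z0
    r = np.linalg.norm(diff)
    h = 1.0 / (alpha + r)
    h_prime = -1.0 / (alpha + r) ** 2           # derivative of h with respect to r
    f = z + beta * h * diff
    d = z.size
    log_det = ((d - 1) * np.log(np.abs(1.0 + beta * h))
               + np.log(np.abs(1.0 + beta * h + beta * h_prime * r)))
    return f, log_det


def numerical_jacobian(func, z, h=1e-6):
    """Central-difference Jacobian, used only to check the formula."""
    d = z.size
    J = np.zeros((d, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = h
        J[:, j] = (func(z + step) - func(z - step)) / (2.0 * h)
    return J


# Illustrative parameters; beta >= -alpha keeps the flow invertible.
z = np.array([0.3, -0.7, 1.1])
z0 = np.array([0.0, 0.0, 0.5])
alpha, beta = 1.0, 0.5

f_z, log_det = radial_flow(z, z0, alpha, beta)
J = numerical_jacobian(lambda v: radial_flow(v, z0, alpha, beta)[0], z)
numeric_log_det = np.log(np.abs(np.linalg.det(J)))
print(log_det, numeric_log_det)
```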
![Page 52: Deep Variational Inference - University of Texas at Austinml/flare/extra/slides-deepvarinf.pdf · Deep Variational Inference FLARE Reading Group Presentation ... Auto-Encoding Variational](https://reader034.vdocuments.mx/reader034/viewer/2022042308/5ed408d28d46b66d226352a8/html5/thumbnails/52.jpg)
● VI approximates p(x) via a latent variable model
  ○ p(x) = Σ_z p(z) p(x | z)
● VAE introduces an auto-encoder approach
  ○ Reparameterization trick makes it feasible
  ○ Deep NNs parameterize q(z | x) and p(x | z)
● NF takes q(z | x) from simple to complex
  ○ Series of linear-time transforms
  ○ Convergence in the limit
Summary