Modern Computational Statistics
Lecture 20: Applications in Computational Biology
Cheng Zhang
School of Mathematical Sciences, Peking University
December 09, 2019
Introduction 2/23
- While modern statistical approaches have been quite successful in many application areas, there are still challenging areas where complex model structures make those methods difficult to apply.
- In this lecture, we will discuss some recent advances in statistical approaches for computational biology, with an emphasis on evolutionary models.
Challenges in Computational Biology 3/23
Adapted from Narges Razavian 2013
Phylogenetic Inference 4/23
The goal of phylogenetic inference is to reconstruct evolutionary history (e.g., phylogenetic trees) from molecular sequence data (e.g., DNA, RNA, or protein sequences).

Molecular Sequence Data

| Taxa | Characters |
|---|---|
| Species A | ATGAACAT |
| Species B | ATGCACAC |
| Species C | ATGCATAT |
| Species D | ATGCATGC |

[Figure: a phylogenetic tree relating the taxa A, B, C, and D]

There are many modern biological and medical applications, e.g., predicting the evolution of influenza viruses to help vaccine design.
Example: B Cell Evolution 5/23
This happens inside of you!
These inferences guide rational vaccine design.
Bayesian Phylogenetics 6/23
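[Figure: a phylogenetic tree (τ, q) with the observed sequences ATGAAC···, ATGCAC···, ATGCAT···, ATGCAT··· at the leaves and site columns y1, ..., y6; an edge e connects a parent node (pa) to a child node (ch)]

Evolution model: $p(\mathrm{ch} \mid \mathrm{pa}, q_e)$, where $q_e$ is the amount of evolution on edge $e$.

Likelihood

$$p(Y \mid \tau, q) = \prod_{i=1}^{M} \sum_{a^i} \eta(a^i_\rho) \prod_{(u,v) \in E(\tau)} P_{a^i_u a^i_v}(q_{uv})$$

Given a proper prior distribution $p(\tau, q)$, the posterior is

$$p(\tau, q \mid Y) \propto p(Y \mid \tau, q)\, p(\tau, q).$$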
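To make the likelihood concrete, here is a minimal sketch of Felsenstein's pruning algorithm, which evaluates the sum over internal-node states with one post-order pass per site instead of enumerating every assignment. Several things are assumptions that do not come from the slides: the Jukes-Cantor (JC69) substitution model with a uniform root distribution η, a rooted binary tree encoded as nested tuples of (child, branch length) pairs, and the helper names `jc69`, `partial`, and `log_likelihood`. The sequences are the four from the Phylogenetic Inference slide; the tree shape and branch lengths are made up for illustration.

```python
import math

STATES = "ACGT"

def jc69(a, b, t):
    """JC69 transition probability P_ab(t); t is the branch length."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if a == b else 0.25 - 0.25 * decay

def partial(node, site, seqs):
    """Map each state s to P(observed leaves below node | node is in state s)."""
    if isinstance(node, str):  # leaf: indicator of the observed character
        return {s: 1.0 if s == seqs[node][site] else 0.0 for s in STATES}
    out = {s: 1.0 for s in STATES}
    for child, t in node:  # internal node: tuple of (child, branch length) pairs
        below = partial(child, site, seqs)
        for s in STATES:
            out[s] *= sum(jc69(s, c, t) * below[c] for c in STATES)
    return out

def log_likelihood(tree, seqs):
    """log p(Y | tau, q): sites are independent, so per-site logs add up."""
    n_sites = len(next(iter(seqs.values())))
    total = 0.0
    for i in range(n_sites):
        root = partial(tree, i, seqs)
        total += math.log(sum(0.25 * root[s] for s in STATES))  # eta is uniform
    return total

seqs = {"A": "ATGAACAT", "B": "ATGCACAC", "C": "ATGCATAT", "D": "ATGCATGC"}
# Hypothetical topology ((A,B),(C,D)) with made-up branch lengths.
tree = (((("A", 0.1), ("B", 0.1)), 0.05), ((("C", 0.1), ("D", 0.1)), 0.05))
print(log_likelihood(tree, seqs))
```

The recursion costs time linear in the number of nodes per site, which is why the likelihood itself is cheap; the hard part of inference is the space of trees.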
Markov chain Monte Carlo 7/23
Random-walk MCMC (MrBayes, BEAST):
- Simple random perturbations (e.g., Nearest Neighbor Interchange, NNI) generate each new state.

[Figure: an NNI move on a tree]

Challenges for MCMC
- Large search space: (2n − 5)!! unrooted trees for n taxa.
- Intertwined parameter space, low acceptance rates, and difficulty scaling to data sets with many sequences.
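A quick way to see how fast the search space grows is to evaluate the double factorial directly. The function name below is an illustrative choice, not something from the lecture.

```python
def num_unrooted_topologies(n):
    """Number of unrooted binary tree topologies on n labeled taxa: (2n - 5)!!."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count

for n in (4, 8, 20):
    print(n, num_unrooted_topologies(n))
```

With 4 taxa there are 3 topologies, and 8 taxa already give 10395, the count quoted for the synthetic experiment later in the lecture.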
Variational Inference 8/23
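[Figure: the variational family Q, with q*(θ) the member closest to the posterior p(θ|x)]

$$q^*(\theta) = \arg\min_{q \in Q} \mathrm{KL}\left(q(\theta) \,\|\, p(\theta \mid x)\right)$$

- VI turns inference into optimization.
- Specify a variational family of distributions over the model parameters:

$$Q = \{q_\phi(\theta);\ \phi \in \Phi\}$$

- Fit the variational parameters $\phi$ to minimize the distance (often in terms of KL divergence) to the exact posterior.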
Evidence Lower Bound 9/23
$$L(\theta) = \mathbb{E}_{q(\theta)}(\log p(x, \theta)) - \mathbb{E}_{q(\theta)}(\log q(\theta)) \le \log p(x)$$

- The KL divergence is intractable; we maximize the evidence lower bound (ELBO) instead, which only requires the joint probability p(x, θ).
  - The ELBO is a lower bound on log p(x).
  - Maximizing the ELBO is equivalent to minimizing the KL divergence.
- The ELBO strikes a balance between two terms:
  - The first term encourages q to focus probability mass where the model puts high probability.
  - The second term encourages q to be diffuse.
- As an optimization approach, VI tends to be faster than MCMC and is easier to scale to large data sets (via stochastic gradient ascent).
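The lower-bound and equivalence claims on this slide both follow from one identity. It is written out here as a worked step in the slide's notation, with L denoting the ELBO; this is the standard derivation rather than something taken from the slides.

$$
\begin{aligned}
\mathrm{KL}\left(q(\theta) \,\|\, p(\theta \mid x)\right)
&= \mathbb{E}_{q(\theta)}(\log q(\theta)) - \mathbb{E}_{q(\theta)}(\log p(\theta \mid x)) \\
&= \mathbb{E}_{q(\theta)}(\log q(\theta)) - \mathbb{E}_{q(\theta)}(\log p(x, \theta)) + \log p(x) \\
&= \log p(x) - L .
\end{aligned}
$$

Because the KL divergence is nonnegative, L ≤ log p(x). Because log p(x) does not depend on q, raising L by any amount lowers the KL divergence by the same amount.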
Subsplit Bayesian Networks 10/23
Inspired by previous work (Hohna and Drummond 2012; Larget 2013), we can decompose trees into local structures and encode the tree topology space via Bayesian networks!

[Figure: a subsplit Bayesian network with nodes S1, ..., S7, shown together with the node assignments for two rooted trees on taxa A, B, C, D: one with root subsplit (ABC, D) and one with root subsplit (AB, CD); some edges carry the label 1.0]
Probability Estimation Over Tree Topologies 11/23
[Figure: the subsplit Bayesian network (nodes S1, ..., S7) with example rooted trees on taxa A, B, C, D and their subsplit assignments]

Rooted Trees

$$p_{\mathrm{sbn}}(T = \tau) = p(S_1 = s_1) \prod_{i>1} p(S_i = s_i \mid S_{\pi_i} = s_{\pi_i}).$$
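A minimal sketch of how this product could be evaluated for one rooted tree. The encoding is entirely an assumption made for illustration: subsplits are written as strings such as "ABC|D", a tree is its root subsplit plus a list of (child subsplit, parent subsplit) pairs, trivial single-taxon subsplits (which contribute probability 1) are omitted, and the probability tables hold made-up numbers.

```python
import math

def sbn_log_prob(root, edges, root_probs, cond_probs):
    """log p_sbn(T = tau): log root probability plus the log conditional of each child."""
    logp = math.log(root_probs[root])
    for child, parent in edges:
        logp += math.log(cond_probs[(child, parent)])
    return logp

# Hypothetical parameters for rooted trees on taxa A, B, C, D.
root_probs = {"ABC|D": 0.6, "AB|CD": 0.4}
cond_probs = {
    ("A|BC", "ABC|D"): 0.5,
    ("AB|C", "ABC|D"): 0.5,
    ("B|C", "A|BC"): 1.0,
    ("A|B", "AB|C"): 1.0,
    ("A|B", "AB|CD"): 1.0,
    ("C|D", "AB|CD"): 1.0,
}

# The tree ((A,(B,C)),D): root subsplit ABC|D, then A|BC, then B|C.
tree = ("ABC|D", [("A|BC", "ABC|D"), ("B|C", "A|BC")])
print(math.exp(sbn_log_prob(*tree, root_probs, cond_probs)))
```

Working in log space keeps the product stable when trees have many internal nodes.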
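[Figure: an unrooted tree on taxa A, B, C, D with edges numbered 1 to 5; rooting it on different edges (e.g., edge 1 or edge 3) gives different rooted trees, each with its own assignment of the SBN nodes S1, ..., S7]

Unrooted Trees

$$p_{\mathrm{sbn}}(T^{\mathrm{u}} = \tau) = \sum_{s_1 \sim \tau} p(S_1 = s_1) \prod_{i>1} p(S_i = s_i \mid S_{\pi_i} = s_{\pi_i}).$$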
Tree Probability Estimation via SBNs 12/23
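SBNs can be used to learn a probability distribution based on a collection of trees $\mathcal{T} = \{T_1, \cdots, T_K\}$.

$$T_k = \{S_i = s_{i,k},\ i \ge 1\}, \quad k = 1, \ldots, K$$

Rooted Trees
- Maximum Likelihood Estimates: relative frequencies.

$$p^{\mathrm{MLE}}(S_1 = s_1) = \frac{m_{s_1}}{K}, \qquad p^{\mathrm{MLE}}(S_i = s_i \mid S_{\pi_i} = t_i) = \frac{m_{s_i, t_i}}{\sum_{s \in C_i} m_{s, t_i}}$$

Unrooted Trees
- Expectation Maximization

$$p^{\mathrm{EM},(n+1)} = \arg\max_{p}\ \mathbb{E}_{p(S_1 \mid T,\, p^{\mathrm{EM},(n)})} \left( \log p(S_1) + \sum_{i>1} \log p(S_i \mid S_{\pi_i}) \right)$$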
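For rooted trees, the relative-frequency estimates amount to counting. The sketch below is illustrative only: subsplits are strings such as "A|BC" over single-letter taxa, a tree is a root subsplit plus (child, parent) pairs with trivial subsplits omitted, and counts are pooled by (parent subsplit, child clade) rather than by node index, which is a simplifying assumption of this sketch. The sample of four trees is made up.

```python
from collections import Counter

def clade(subsplit):
    """Taxa covered by a subsplit string, e.g. "A|BC" -> "ABC"."""
    return "".join(sorted(subsplit.replace("|", "")))

def fit_sbn_mle(trees):
    """Relative-frequency estimates of root and conditional subsplit probabilities."""
    root_counts, pair_counts, denom = Counter(), Counter(), Counter()
    for root, edges in trees:
        root_counts[root] += 1
        for child, parent in edges:
            pair_counts[(child, parent)] += 1
            # Normalize over all child subsplits of the same clade under this parent.
            denom[(parent, clade(child))] += 1
    k = len(trees)
    root_probs = {s: m / k for s, m in root_counts.items()}
    cond_probs = {
        (c, p): m / denom[(p, clade(c))] for (c, p), m in pair_counts.items()
    }
    return root_probs, cond_probs

t1 = ("ABC|D", [("A|BC", "ABC|D"), ("B|C", "A|BC")])
t2 = ("ABC|D", [("AB|C", "ABC|D"), ("A|B", "AB|C")])
t3 = ("AB|CD", [("A|B", "AB|CD"), ("C|D", "AB|CD")])
root_probs, cond_probs = fit_sbn_mle([t1, t1, t2, t3])
print(root_probs)
print(cond_probs)
```

Keying the denominator by the child's clade matters: a parent such as AB|CD has two children, one splitting AB and one splitting CD, and each needs its own normalization.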
Example: Phylogenetic Posterior Estimation 13/23
[Figure: three panels on DS1. Left and middle: log(estimated probability) versus log(ground truth) for CCD and for SBN-EM, with trees from peak 1 and peak 2 marked. Right: KL divergence versus number of samples for ccd, sbn-sa, sbn-em, sbn-em-α, and srf]

[Zhang and Matsen, NeurIPS 2018]

- Compared to a previous method, CCD (Larget, 2013), SBNs significantly reduce the biases for both high-probability and low-probability trees.
- SBNs perform better in the weak data regime.
Example: Phylogenetic Posterior Estimation 14/23
The last five columns report the KL divergence to ground truth.

| Data set | (#Taxa, #Sites) | Tree space size | Sampled trees | SRF | CCD | SBN-SA | SBN-EM | SBN-EM-α |
|---|---|---|---|---|---|---|---|---|
| DS1 | (27, 1949) | 5.84×10^32 | 1228 | 0.0155 | 0.6027 | 0.0687 | 0.0136 | 0.0130 |
| DS2 | (29, 2520) | 1.58×10^35 | 7 | 0.0122 | 0.0218 | 0.0218 | 0.0199 | 0.0128 |
| DS3 | (36, 1812) | 4.89×10^47 | 43 | 0.3539 | 0.2074 | 0.1152 | 0.1243 | 0.0882 |
| DS4 | (41, 1137) | 1.01×10^57 | 828 | 0.5322 | 0.1952 | 0.1021 | 0.0763 | 0.0637 |
| DS5 | (50, 378) | 2.84×10^74 | 33752 | 11.5746 | 1.3272 | 0.8952 | 0.8599 | 0.8218 |
| DS6 | (50, 1133) | 2.84×10^74 | 35407 | 10.0159 | 0.4526 | 0.2613 | 0.3016 | 0.2786 |
| DS7 | (59, 1824) | 4.36×10^92 | 1125 | 1.2765 | 0.3292 | 0.2341 | 0.0483 | 0.0399 |
| DS8 | (64, 1008) | 1.04×10^103 | 3067 | 2.1653 | 0.4149 | 0.2212 | 0.1415 | 0.1236 |

[Zhang and Matsen, NeurIPS 2018]

Remark: Unlike previous methods, SBNs are flexible enough to provide accurate approximations to real data posteriors!
Variational Bayesian Phylogenetic Inference 15/23
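- Approximating Distribution:

$$Q_{\phi,\psi}(\tau, q) \triangleq \underbrace{Q_\phi(\tau)}_{\text{tree topology}} \cdot \underbrace{Q_\psi(q \mid \tau)}_{\text{branch length}}$$

- Multi-sample Lower Bound:

$$L^K(\phi, \psi) = \mathbb{E}_{Q_{\phi,\psi}(\tau^{1:K},\, q^{1:K})} \log \left( \frac{1}{K} \sum_{i=1}^{K} \frac{p(Y \mid \tau^i, q^i)\, p(\tau^i, q^i)}{Q_\phi(\tau^i)\, Q_\psi(q^i \mid \tau^i)} \right)$$

- Use stochastic gradient ascent (SGA) to maximize the lower bound:

$$\hat{\phi}, \hat{\psi} = \arg\max_{\phi, \psi} L^K(\phi, \psi)$$

- $\phi$: VIMCO/RWS
- $\psi$: the reparameterization trick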
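Given K sampled pairs, a single Monte Carlo estimate of the multi-sample lower bound is a log-mean-exp of the log importance ratios. The sketch below assumes the per-sample log densities have already been computed elsewhere; the function name and the numbers are hypothetical.

```python
import math

def multi_sample_bound(log_joint, log_q):
    """Monte Carlo estimate of L^K from K draws (tau^i, q^i) ~ Q.

    log_joint[i] = log p(Y | tau^i, q^i) + log p(tau^i, q^i)
    log_q[i]     = log Q_phi(tau^i) + log Q_psi(q^i | tau^i)
    """
    log_w = [lj - lq for lj, lq in zip(log_joint, log_q)]
    m = max(log_w)  # subtract the max before exponentiating for stability
    return m + math.log(sum(math.exp(lw - m) for lw in log_w) / len(log_w))

# Made-up log densities for K = 3 samples.
print(multi_sample_bound([-7110.2, -7109.5, -7112.8], [-3.1, -2.7, -4.0]))
```

With K = 1 this reduces to the usual single-sample ELBO estimate; subtracting the maximum matters because phylogenetic log-likelihoods are typically in the thousands of nats below zero.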
Structured Parameterization 16/23
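SBN Parameters

$$p(S_1 = s_1) = \frac{\exp(\phi_{s_1})}{\sum_{s_r \in \mathbb{S}_r} \exp(\phi_{s_r})}, \qquad p(S_i = s \mid S_{\pi_i} = t) = \frac{\exp(\phi_{s|t})}{\sum_{s \in \mathbb{S}_{\cdot|t}} \exp(\phi_{s|t})}$$

Branch Length Parameters

$$Q_\psi(q \mid \tau) = \prod_{e \in E(\tau)} p^{\mathrm{Lognormal}}\left(q_e \mid \mu(e, \tau), \sigma(e, \tau)\right)$$

- Simple Split

$$\mu_s(e, \tau) = \psi^\mu_{e/\tau}, \qquad \sigma_s(e, \tau) = \psi^\sigma_{e/\tau}.$$

- Primary Subsplit Pair (PSP)

$$\mu_{\mathrm{psp}}(e, \tau) = \psi^\mu_{e/\tau} + \sum_{s \in e/\!/\tau} \psi^\mu_s, \qquad \sigma_{\mathrm{psp}}(e, \tau) = \psi^\sigma_{e/\tau} + \sum_{s \in e/\!/\tau} \psi^\sigma_s.$$

[Figure: an edge e separating clades W and Z, where W splits into W1, W2 and Z splits into Z1, Z2; the split parameter ψ^μ(W, Z) gives μ_s(e, τ), and adding the PSP parameters ψ^μ(W1, W2 | W, Z) and ψ^μ(Z1, Z2 | W, Z) gives μ_psp(e, τ)]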
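The two parameterizations above are simple to compute once the parameters are stored in lookup tables. The sketch below is a hedged illustration: the dictionary keys are made-up string labels, the numbers are hypothetical, and positivity of σ is not enforced here, although a real implementation would need to handle it.

```python
import math

def softmax(phi):
    """Turn unconstrained SBN parameters {subsplit: phi} into probabilities."""
    m = max(phi.values())
    z = sum(math.exp(v - m) for v in phi.values())
    return {k: math.exp(v - m) / z for k, v in phi.items()}

def psp_lognormal_params(split, psps, psi_mu, psi_sigma):
    """mu_psp and sigma_psp for one edge: the split term plus one term per PSP."""
    mu = psi_mu[split] + sum(psi_mu[s] for s in psps)
    sigma = psi_sigma[split] + sum(psi_sigma[s] for s in psps)
    return mu, sigma

# Hypothetical parameter values, keyed by made-up string labels.
probs = softmax({"ABC|D": 0.0, "AB|CD": 0.0, "ABD|C": math.log(2.0)})
print(probs)

psi_mu = {"W|Z": -2.0, "W1,W2|W,Z": 0.3, "Z1,Z2|W,Z": -0.1}
psi_sigma = {"W|Z": 0.5, "W1,W2|W,Z": 0.05, "Z1,Z2|W,Z": 0.05}
mu, sigma = psp_lognormal_params(
    "W|Z", ["W1,W2|W,Z", "Z1,Z2|W,Z"], psi_mu, psi_sigma
)
print(mu, sigma)
```

Passing an empty list of PSPs recovers the simple split parameterization, so the PSP form can be seen as adding topology-dependent corrections to a shared per-split value.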
Stochastic Gradient Estimators 17/23
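SBN Parameters $\phi$. With $\tau^j, q^j \overset{\mathrm{iid}}{\sim} Q_{\phi,\psi}(\tau, q)$:

- VIMCO [Mnih and Rezende, ICML 2016]

$$\nabla_\phi L^K(\phi, \psi) \simeq \sum_{j=1}^{K} \left( L^K_{j|-j}(\phi, \psi) - w^j \right) \nabla_\phi \log Q_\phi(\tau^j).$$

- RWS [Bornschein and Bengio, ICLR 2015]

$$\nabla_\phi L^K(\phi, \psi) \simeq \sum_{j=1}^{K} w^j \nabla_\phi \log Q_\phi(\tau^j).$$

Branch Length Parameters $\psi$. $g_\psi(\epsilon \mid \tau) = \exp(\mu_{\psi,\tau} + \sigma_{\psi,\tau} \odot \epsilon)$.

- Reparameterization Trick. Let $f_{\phi,\psi}(\tau, q) = \dfrac{p(Y \mid \tau, q)\, p(\tau, q)}{Q_\phi(\tau)\, Q_\psi(q \mid \tau)}$.

$$\nabla_\psi L^K(\phi, \psi) \simeq \sum_{j=1}^{K} w^j \nabla_\psi \log f_{\phi,\psi}(\tau^j, g_\psi(\epsilon^j \mid \tau^j))$$

where $\tau^j \overset{\mathrm{iid}}{\sim} Q_\phi(\tau)$ and $\epsilon^j \overset{\mathrm{iid}}{\sim} \mathcal{N}(0, I)$.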
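The scalar multipliers in these estimators can be computed from the K values of log f alone. The sketch below rests on assumptions the slide does not spell out: the weights w^j are taken to be the self-normalized importance weights f^j / Σ_i f^i, and the per-sample VIMCO term follows the leave-one-out construction of Mnih and Rezende, where f^j is replaced by the geometric mean of the other samples. All function names and numbers are illustrative, and the gradients of log Q themselves are left to an autodiff tool.

```python
import math
import random

def log_mean_exp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs) / len(xs))

def normalized_weights(log_f):
    """Self-normalized importance weights f^j / sum_i f^i."""
    m = max(log_f)
    w = [math.exp(v - m) for v in log_f]
    z = sum(w)
    return [x / z for x in w]

def vimco_signals(log_f):
    """Per-sample multipliers of grad log Q(tau^j) in the VIMCO estimator (K >= 2)."""
    k = len(log_f)
    bound = log_mean_exp(log_f)
    w = normalized_weights(log_f)
    signals = []
    for j in range(k):
        others = log_f[:j] + log_f[j + 1:]
        geo = sum(others) / (k - 1)  # log of the geometric mean of the other f^i
        baseline = log_mean_exp(others + [geo])  # bound with f^j replaced by that mean
        signals.append(bound - baseline - w[j])
    return signals

def sample_branch_length(mu, sigma, rng):
    """Reparameterized Lognormal draw: q = exp(mu + sigma * eps), eps ~ N(0, 1)."""
    eps = rng.gauss(0.0, 1.0)
    return math.exp(mu + sigma * eps)

log_f = [-7110.0, -7108.5, -7111.2]  # made-up values of log f(tau^j, q^j)
print(normalized_weights(log_f))  # RWS multipliers
print(vimco_signals(log_f))       # VIMCO multipliers
print(sample_branch_length(-2.0, 0.5, random.Random(0)))
```

Because the noise eps is drawn independently of ψ, the sampled branch length is a differentiable function of μ and σ, which is what lets the ψ gradient pass through the sample.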
The VBPI Pipeline 18/23
[Figure: the VBPI pipeline. (1) Sample tree topologies τ^1, ..., τ^K from Q_φ(τ), e.g., by ancestral sampling for SBNs. (2) Sample branch lengths from Q_ψ(q|τ), e.g., Lognormal, to obtain (τ^1, q^1), ..., (τ^K, q^K). (3) Evaluate the multi-sample lower bound L^K(φ, ψ). (4) Apply an SGA update.]
Ancestral sampling for SBNs

[Figure: the subsplit Bayesian network (nodes S1, ..., S7) with example rooted trees on taxa A, B, C, D, illustrating how node assignments are sampled starting from the root node S1]
Performance on Synthetic Data 19/23
A simulation study on unrooted phylogenetic trees with 8 leaves (10395 trees). The target distribution is a random sample from the symmetric Dirichlet distribution $\mathrm{Dir}(\beta \mathbf{1})$, $\beta = 0.008$.

[Figure: three panels. Left: evidence lower bound versus thousand iterations (0 to 200) for EXACT, VIMCO(20), VIMCO(50), RWS(20), and RWS(50). Middle: KL divergence versus thousand iterations for VIMCO(20), VIMCO(50), RWS(20), and RWS(50). Right: variational approximation versus ground truth for VIMCO(50) and RWS(50)]

[Zhang and Matsen, ICLR 2019]

ELBOs approach 0 quickly ⇒ SBN approximations are flexible.
More samples in the multi-sample ELBOs could be helpful.
Performance on Real Data 20/23
[Figure: three panels. Left: KL divergence versus thousand iterations (0 to 200) for VIMCO(10), VIMCO(20), RWS(10), RWS(20), and MCMC. Middle: the same comparison with PSP: VIMCO(10) + PSP, VIMCO(20) + PSP, RWS(10) + PSP, RWS(20) + PSP, and MCMC. Right: VBPI versus GSS]

[Zhang and Matsen, ICLR 2019]

- More samples ⇒ better exploration ⇒ better approximation.
- More flexible branch length distributions across tree topologies (PSP) ease training and improve approximation.
- VBPI outperforms MCMC via much more efficient tree space exploration and branch length updates.
Performance on Real Data 21/23
Marginal Likelihood (NATs)

| Data set | VIMCO(10) | VIMCO(20) | VIMCO(10)+PSP | VIMCO(20)+PSP | SS |
|---|---|---|---|---|---|
| DS1 | -7108.43(0.26) | -7108.35(0.21) | -7108.41(0.16) | -7108.42(0.10) | -7108.42(0.18) |
| DS2 | -26367.70(0.12) | -26367.71(0.09) | -26367.72(0.08) | -26367.70(0.10) | -26367.57(0.48) |
| DS3 | -33735.08(0.11) | -33735.11(0.11) | -33735.10(0.09) | -33735.07(0.11) | -33735.44(0.50) |
| DS4 | -13329.90(0.31) | -13329.98(0.20) | -13329.94(0.18) | -13329.93(0.22) | -13330.06(0.54) |
| DS5 | -8214.36(0.67) | -8214.74(0.38) | -8214.61(0.38) | -8214.55(0.43) | -8214.51(0.28) |
| DS6 | -6723.75(0.68) | -6723.71(0.65) | -6724.09(0.55) | -6724.34(0.45) | -6724.07(0.86) |
| DS7 | -37332.03(0.43) | -37331.90(0.49) | -37331.90(0.32) | -37332.03(0.23) | -37332.76(2.42) |
| DS8 | -8653.34(0.55) | -8651.54(0.80) | -8650.63(0.42) | -8650.55(0.46) | -8649.88(1.75) |

[Zhang and Matsen, ICLR 2019]

- Competitive with the state of the art (stepping-stone, SS), while dramatically reducing cost at test time: VBPI(1000) vs SS(100,000).
- PSP alleviates the demand for large samples, reducing computation while maintaining approximation accuracy.
Conclusion 22/23
- We introduced VBPI, a general variational framework for Bayesian phylogenetic inference.
- VBPI allows efficient learning of both tree topology and branch lengths, providing performance competitive with MCMC while requiring much less computation.
- It can be used for further statistical analysis (e.g., marginal likelihood estimation) via importance sampling.
- There are many possible extensions, including more flexible branch length distributions, more general models, and adaptive transition kernels for MCMC approaches.
References 23/23
- Sebastian Hohna and Alexei J. Drummond. Guided tree topology proposals for Bayesian phylogenetic inference. Syst. Biol., 61(1):1–11, January 2012.
- Bret Larget. The estimation of tree posterior probabilities using conditional clade probability distributions. Syst. Biol., 62(4):501–511, July 2013.
- Zhang, C. and Matsen, F. A. Generalizing Tree Probability Estimation via Bayesian Networks. In Advances in Neural Information Processing Systems, 2018.
- Zhang, C. and Matsen, F. A. Variational Bayesian Phylogenetic Inference. In Proceedings of the 7th International Conference on Learning Representations, 2019.