Islands and Integrals: Processes of Diversification in an Island Archipelago and Bayesian Methods of Comparative Phylogeographical Model Choice
Jamie R. Oaks
Department of Ecology and Evolutionary Biology, University of Kansas
October 16, 2013
Islands and Integrals J. Oaks, University of Kansas 1/53
Climate-driven diversification model
- Repeated coalescence and fragmentation of island complexes
- Prominent paradigm for explaining Philippine biodiversity
- Proposed as model of diversification
Testing climate-driven diversification
Did repeated fragmentation of islands during inter-glacial rises in sea level promote diversification?
Model has testable prediction:
- Temporally clustered divergences among taxa co-distributed across fragmented islands
Climate-driven model: Prediction
[Figure: sea level (m, from 0 to -100) plotted against time (500 to 0 kya). Divergence times T1 to T5 are overlaid and grouped into two divergence events, τ1 and τ2.]
Divergence model choice
T = (T1,T2,T3,T4,T5)
τ = {τ1, τ2}
|τ| = 2
T = (330, 330, 125, 125, 125)
τ = {125, 330}
|τ| = 2
T = (330, 330, 125, 330, 125)
τ = {125, 330}
|τ| = 2
T = (375, 330, 125, 330, 125)
τ = {125, 330, 375}
|τ| = 3
T = (T1,T2,T3,T4,T5)
τ = {τ1, τ2, τ3}
|τ| = 3
In general, for Y taxon pairs:
T = (T1, T2, ..., TY)
τ = {τ1, ..., τ|τ|}
|τ| = number of divergence events
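The mapping from T to τ and |τ| in the examples above can be sketched in a few lines of Python. This is only an illustration of the notation; the function name `summarize_divergence_model` is hypothetical and is not part of msBayes.

```python
def summarize_divergence_model(times):
    """Return (tau, n_events) for a vector of divergence times T.

    tau is the sorted set of distinct divergence times, and n_events is
    its size, written |tau| on the slides.
    """
    tau = sorted(set(times))
    return tau, len(tau)


# The last worked example from the slides: five taxon pairs, three events.
T = (375, 330, 125, 330, 125)
tau, n_events = summarize_divergence_model(T)
print(tau, n_events)  # [125, 330, 375] 3
```

Two vectors T with the same τ, such as (330, 330, 125, 125, 125) and (330, 330, 125, 330, 125), differ only in which pairs are assigned to which event.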
- We want to infer T given DNA sequence alignments X:
\[ p(T \mid X) = \frac{p(X \mid T)\, p(T)}{p(X)} \]
- This approach is implemented in msBayes
- Not that simple
The msBayes model
[Figure: sea level (m) against time (kya), reduced to divergence times T1 and T2 and the divergence event τ2.]
X  Sequence alignments
G  Gene trees
T  Divergence times
Θ  Demographic parameters
Full model:
\[ p(G, T, \Theta \mid X) = \frac{p(X \mid G, T, \Theta)\, p(G, T, \Theta)}{p(X)} \]
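\[
\begin{aligned}
& p(G, T, \theta_A, \theta_{D1}, \theta_{D2}, \tau_B, \zeta_{D1}, \zeta_{D2}, m, \alpha, \upsilon \mid X, \phi, \rho, \nu) \\
& \quad = \frac{1}{p(X)}\, p(T)\, f(\alpha)
\left[ \prod_{i=1}^{Y} p(\theta_{A,i})\, p(\theta_{D1,i}, \theta_{D2,i})\, p(\tau_{B,i})\, p(\zeta_{D1,i})\, f(\zeta_{D2,i})\, p(m_i)
\prod_{j=1}^{k_i} p(X_{i,j} \mid G_{i,j}, \phi_{i,j})\,
p(G_{i,j} \mid T_i, \theta_{A,i}, \theta_{D1,i}, \theta_{D2,i}, \rho_{i,j}, \nu_{i,j}, \upsilon_j, \tau_{B,i}, \zeta_{D1,i}, \zeta_{D2,i}, m_i) \right]
\left[ \prod_{j=1}^{K} f(\upsilon_j \mid \alpha) \right]
\end{aligned}
\]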
Approximate Bayesian computation (ABC)
X → S∗ → Bε(S∗)
Approximate model:
\[ p(G, T, \Theta \mid B_\epsilon(S^*)) = \frac{p(B_\epsilon(S^*) \mid G, T, \Theta)\, p(G, T, \Theta)}{p(B_\epsilon(S^*))} \]
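The chain X → S∗ → Bε(S∗) can be illustrated with a minimal rejection-sampling sketch of ABC. It is a toy under stated assumptions, not the msBayes implementation: the model is a single parameter θ with a U(0, 10) prior, the data are normal draws with mean θ, and the summary statistic is the sample mean. All names (`simulate_summary`, `abc_rejection`) are hypothetical.

```python
import random
import statistics


def simulate_summary(theta, n_obs, rng):
    """Simulate a dataset under theta and reduce it to one summary statistic."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n_obs))


def abc_rejection(observed_summary, epsilon, n_draws, n_obs, rng):
    """Keep prior draws whose simulated summary lies within epsilon of S*."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 10.0)             # draw from the (assumed) prior
        s = simulate_summary(theta, n_obs, rng)    # simulate data, then summarize
        if abs(s - observed_summary) <= epsilon:   # is s inside B_epsilon(S*)?
            accepted.append(theta)
    return accepted


rng = random.Random(1)
posterior_sample = abc_rejection(
    observed_summary=3.0, epsilon=0.1, n_draws=20000, n_obs=20, rng=rng
)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

The accepted draws approximate the posterior given Bε(S∗) rather than given X, which is why the likelihood of the full alignments never has to be computed.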
T    Vector of divergence times across pairs of populations
|τ|  Number of divergence parameters
D_T  The variance of T
Species                         n1   n2
Mammals
Crocidura beatus                12   11
Crocidura negrina-panayensis    12    6
Hipposideros obscurus           19    9
Hipposideros pygmaeus            3   12
Cynopterus brachyotis           20    8
Cynopterus brachyotis            8   14
Haplonycteris fischeri          29    8
Haplonycteris fischeri           9   21
Macroglossus minimus            19    4
Macroglossus minimus             8   10
Ptenochirus jagori               4    7
Ptenochirus jagori               8    8
Ptenochirus minor               30    9
Squamates
Cyrtodactylus gubaot-sumuroi    29    6
Cyrtodactylus annulatus         14    3
Cyrtodactylus philippinicus      6   14
Gekko mindorensis                8   11
Insulasaurus arborens           22   10
Pinoyscincus jagori              8    8
Dendrelaphis marenae             6    6
Anurans
Limnonectes leytensis            4    2
Limnonectes magnus               2    3
Empirical results
Strong support for simultaneous divergence of all 22 taxon pairs
Posterior probability (pp) > 0.96
∼100,000–250,000 years ago
Simulation-based power analyses
What is “simultaneous”?
- Simulate datasets in which all 22 divergence times are random
  - τ ∼ U(0, 0.5 MGA)
  - τ ∼ U(0, 1.5 MGA)
  - τ ∼ U(0, 2.5 MGA)
  - τ ∼ U(0, 5.0 MGA)
  - MGA = millions of generations ago
- Simulate 1000 datasets for each τ distribution
- Analyze all 4000 datasets as we did the empirical data
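The design above can be sketched as follows. This covers only the first step, drawing 22 independent divergence times per replicate; simulating sequence data for each pair and running the analysis are not shown. The constant and function names are hypothetical.

```python
import random

N_PAIRS = 22                             # taxon pairs per simulated dataset
UPPER_BOUNDS_MGA = (0.5, 1.5, 2.5, 5.0)  # upper limits of the four uniform distributions
N_REPLICATES = 1000                      # datasets per distribution


def draw_divergence_times(upper, n_pairs, rng):
    """Draw one independent divergence time per taxon pair from U(0, upper)."""
    return [rng.uniform(0.0, upper) for _ in range(n_pairs)]


rng = random.Random(2013)
simulated = {
    upper: [draw_divergence_times(upper, N_PAIRS, rng) for _ in range(N_REPLICATES)]
    for upper in UPPER_BOUNDS_MGA
}

n_total = sum(len(replicates) for replicates in simulated.values())
print(n_total)  # 4000
```

Because every time is drawn independently, no replicate contains a true shared divergence event, so any strong support for |τ| = 1 in these datasets is an artifact of the method.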
Simulation-based power analyses: Results
[Figure: histograms of the estimated number of divergence events (mode), one panel per distribution: τ ∼ U(0, 0.5 MGA), U(0, 1.5 MGA), U(0, 2.5 MGA), and U(0, 5.0 MGA). In every panel, p(|τ̂| = 1) = 1.0.]
[Figure: densities of the posterior probability of one divergence for the same four distributions.]
Strong support for highly clustered divergences when divergence times are random over 5 million generations
Our empirical results are likely spurious
Why the bias?
Potential causes of the bias:
1. The prior on divergence models
2. Broad uniform priors on many of the model’s parameters, including divergence times
Causes of bias: Prior on divergence models
T = (375, 330, 125, 330, 125)
τ = {125, 330, 375}
|τ| = 3
- msBayes uses a discrete uniform prior on the number of divergence events, |τ|
[Figure: (A) number of divergence models and (B) prior probability of each divergence model, p(M|τ|,i), both plotted against the number of divergence events, |τ|.]
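A short sketch shows why a uniform prior on |τ| is far from uniform over individual divergence models. It assumes that a divergence model is an unordered assignment of the 22 pairs to events, so that the number of models with k events is the number of integer partitions of 22 into exactly k parts. That counting rule and the function name `partitions_into_parts` are assumptions made for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_into_parts(n, k):
    """Count the ways to write n as a sum of exactly k positive integers, ignoring order."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    # Either one part equals 1 (remove it), or every part is >= 2 (subtract 1 from each).
    return partitions_into_parts(n - 1, k - 1) + partitions_into_parts(n - k, k)


N_PAIRS = 22
counts = {k: partitions_into_parts(N_PAIRS, k) for k in range(1, N_PAIRS + 1)}

# A uniform prior on |tau| gives each value of k a mass of 1/22, which is
# then shared equally among all models that have k divergence events.
prior_per_model = {k: 1.0 / (N_PAIRS * counts[k]) for k in counts}

for k in (1, 2, 6, 21, 22):
    print(k, counts[k], round(prior_per_model[k], 5))
```

Under this counting, the single model with one divergence event receives as much prior mass as all the models of any intermediate size combined, giving the prior over models its U shape.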
Causes of bias: Broad priors
- msBayes uses uniform priors on most model parameters, including divergence times
- This requires the use of broad priors
- Models with more divergence-time parameters have much greater parameter space, much of it with low likelihood
- This vast space can cause problems with Bayesian model choice
  - Reduced marginal likelihoods
Causes of bias: Marginal likelihoods
\[ p(X) = \int_{\theta} p(X \mid \theta)\, p(\theta)\, d\theta \]
[Figure: densities of the likelihood p(X | θ) and the prior p(θ), plotted against θ from 0 to 1.]
\[ p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)} \]
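\[ p(\theta_1 \mid X, M_1) = \frac{p(X \mid \theta_1, M_1)\, p(\theta_1 \mid M_1)}{p(X \mid M_1)} \]
\[ p(X \mid M_1) = \int_{\theta_1} p(X \mid \theta_1, M_1)\, p(\theta_1 \mid M_1)\, d\theta_1 \]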
\[ p(M_1 \mid X) = \frac{p(X \mid M_1)\, p(M_1)}{p(X \mid M_1)\, p(M_1) + p(X \mid M_2)\, p(M_2)} \]
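A numerical toy makes the effect of prior width on the marginal likelihood concrete. Everything here is an assumption for illustration: the likelihood is a narrow bell curve in θ centred at 0.2, M1 places a U(0, 1) prior on θ, M2 places a U(0, 10) prior on θ, and the two models have equal prior probability.

```python
import math


def likelihood(theta, peak=0.2, width=0.02):
    """Hypothetical likelihood p(X | theta): sharply peaked around `peak`."""
    z = (theta - peak) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))


def marginal_likelihood(upper, n_steps=100_000):
    """Midpoint-rule integral of p(X | theta) * p(theta) under a U(0, upper) prior."""
    step = upper / n_steps
    prior_density = 1.0 / upper
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * step
        total += likelihood(theta) * prior_density
    return total * step


ml_m1 = marginal_likelihood(upper=1.0)    # narrower prior
ml_m2 = marginal_likelihood(upper=10.0)   # ten times broader prior

# Posterior probability of M1 when p(M1) = p(M2) = 0.5.
post_m1 = ml_m1 * 0.5 / (ml_m1 * 0.5 + ml_m2 * 0.5)
print(ml_m1, ml_m2, post_m1)
```

Both models fit the data equally well at their best θ, yet the broader prior spreads its mass over regions of negligible likelihood, so its marginal likelihood is roughly ten times smaller and the model is penalized accordingly.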
Causes of bias: Marginal likelihoods
Predictions:
- Posterior estimates should be sensitive to priors
- As prior converges to distribution underlying the data, the bias should disappear
Testing prior sensitivity:
1. Analyze empirical data under several different prior settings
   - Results are very sensitive
2. Use simulations to assess behavior when priors are correct
Simulation results: Performance when priors are correct
[Figure: true probability of one divergence plotted against the posterior probability of one divergence, both from 0 to 1.]
msBayes performs well when all assumptions are met
Causes of bias: Marginal likelihoods
Testing prior sensitivity:
3. Use simulations to assess behavior under “ideal” real-world priors
Simulation results: Power with informed priors
[Figure: histograms of the estimated number of divergence events (mode), one panel per distribution: τ ∼ U(0, 0.5 MGA), U(0, 1.5 MGA), U(0, 2.5 MGA), and U(0, 5.0 MGA). Top row: p(|τ̂| = 1) = 1.0 in all four panels. Bottom row, in panel order: p(|τ̂| = 1) = 1.0, 1.0, 0.997, and 0.473.]
[Figure: densities of the posterior probability of one divergence for the same four distributions, again in two rows.]
Causes of bias: Simulation results
Broad uniform priors are reducing marginal likelihoods of models with more divergence events
Even when uniform priors are informed by the data, the bias remains
Potential solution:
More flexible priors
Mitigating the bias
Potential solution:
More flexible priors
[Figure: densities of the likelihood p(X | θ) and the prior p(θ), plotted against θ from 0 to 1.]
Potential solution:
Alternative prior over divergence models (e.g., uniform or Dirichlet process)
[Figure: (A) number of divergence models and (B) prior probability of each divergence model, p(M|τ|,i), both plotted against the number of divergence events, |τ|.]
New method: dpp-msbayes
- Reparameterized the model implemented in msBayes
- Replaced uniform priors on continuous parameters with gamma and beta distributions
- Dirichlet process prior (DPP) over all possible discrete divergence models
- Uniform prior over divergence models
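As a rough illustration of how a Dirichlet process prior spreads mass over divergence models, the sketch below draws models with the standard Chinese restaurant process. It is a generic textbook construction under an assumed concentration parameter, not code from dpp-msbayes; the function name `sample_divergence_model` is hypothetical.

```python
import random


def sample_divergence_model(n_pairs, concentration, rng):
    """Assign each taxon pair to a divergence event via the Chinese restaurant process.

    A pair joins an existing event with probability proportional to the number
    of pairs already in it, or starts a new event with probability proportional
    to the concentration parameter.
    """
    assignments = []
    event_sizes = []
    for _ in range(n_pairs):
        weights = event_sizes + [concentration]
        event = rng.choices(range(len(weights)), weights=weights)[0]
        if event == len(event_sizes):
            event_sizes.append(1)      # open a new divergence event
        else:
            event_sizes[event] += 1    # join an existing divergence event
        assignments.append(event)
    return assignments


rng = random.Random(42)
model = sample_divergence_model(n_pairs=22, concentration=1.5, rng=rng)
n_events = len(set(model))
print(model, n_events)
```

Smaller concentration values favour models with few shared events and larger values favour many independent divergences, so the prior can be tuned or given its own hyperprior instead of being fixed to a U shape.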
dpp-msbayes: Simulation-based assessment
Simulate 50,000 datasets under four models:
- M_msBayes: U-shaped prior on divergence models; uniform priors on continuous parameters
- M_Ushaped: U-shaped prior on divergence models; gamma priors on continuous parameters
- M_Uniform: uniform prior on divergence models; gamma priors on continuous parameters
- M_DPP: DPP prior on divergence models; gamma priors on continuous parameters
Analyze all datasets under each of the models
dpp-msbayes: Simulation-based assessment
Assess power
- Simulate datasets in which all 22 divergence times are random
  - τ ∼ U(0, 0.5 MGA)
  - τ ∼ U(0, 1.5 MGA)
  - τ ∼ U(0, 2.5 MGA)
  - τ ∼ U(0, 5.0 MGA)
  - MGA = millions of generations ago
- Simulate 1000 datasets for each τ distribution
- Analyze all 4000 datasets as we did the empirical data
dpp-msbayes: Simulation results
[Figure: true probability of one divergence plotted against the posterior probability of one divergence (both from 0 to 1), in a grid of panels arranged by data model and analysis model (labels M_msBayes, M_DPP, M_Uniform, M_Ushaped).]
dpp-msbayes: Simulation results
[Figure: histograms of the estimated number of divergence events (mode) under M_msBayes and M_DPP, one panel per τ distribution. Values of p(|τ̂| = 1) reported in the panels:]

Model       U(0, 0.5 MGA)   U(0, 1.5 MGA)   U(0, 2.5 MGA)   U(0, 5.0 MGA)
M_msBayes   1.0             1.0             0.999           0.83
M_DPP       0.926           0.605           0.187           0.003
dpp-msbayes: Simulation results
[Figure: densities of the posterior probability of one divergence under M_msBayes and M_DPP, one panel per distribution: τ ∼ U(0, 0.5 MGA), U(0, 1.5 MGA), U(0, 2.5 MGA), and U(0, 5.0 MGA).]
dpp-msbayes: Simulation results
[Figure: densities of the estimated variance in divergence times (median) under M_msBayes, M_Ushaped, and M_DPP, one panel per τ distribution. Values of p(D̂_T < 0.01) reported in the panels:]

Model       U(0, 0.5 MGA)   U(0, 1.5 MGA)   U(0, 2.5 MGA)   U(0, 5.0 MGA)
M_msBayes   1.0             0.999           0.996           0.637
M_Ushaped   0.914           0.626           0.235           0.004
M_DPP       0.002           0.0             0.0             0.0
dpp-msbayes: Simulation results
- Results confirm the bias of msBayes was caused by
  1. Broad uniform priors
  2. U-shaped prior on divergence models
- The new model shows improved model-choice accuracy, power, and robustness
Testing climate-driven diversification
Did repeated fragmentation of islands during inter-glacial rises in sea level promote diversification?
dpp-msbayes: Philippine diversification
[Figure: posterior probability of the number of divergence events. x-axis: Number of divergence events (1 to 21); y-axis: Posterior probability (0.0 to 0.5). Panels: msBayes, dpp-msbayes.]
dpp-msbayes: Philippine diversification
[Figure: prior and posterior probability of the number of divergence events. x-axis: Number of divergence events (1 to 21); y-axis: Probability (0.00 to 0.12). Panels: Prior, Posterior. Shown with the sea-level curve: x-axis: Time (kya), 0 to 500; y-axis: Sea level (m), 0 to -100.]
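The shape of a prior on the number of divergence events under a Dirichlet process can be computed directly. A minimal sketch, assuming the prior behaves like a Chinese restaurant process in which each taxon pair starts a new divergence event with probability alpha / (alpha + number of pairs already assigned); the concentration value used below is hypothetical and is not the setting used in the talk.

```python
def dpp_prior_num_events(n_taxa, alpha):
    """Return [P(K = 1), ..., P(K = n_taxa)] for the number of events K."""
    probs = [0.0] * (n_taxa + 1)
    probs[1] = 1.0  # the first taxon pair always founds the first event
    for assigned in range(1, n_taxa):
        p_new = alpha / (alpha + assigned)  # chance the next pair starts a new event
        updated = [0.0] * (n_taxa + 1)
        for k in range(1, assigned + 1):
            updated[k] += probs[k] * (1.0 - p_new)  # joins an existing event
            updated[k + 1] += probs[k] * p_new      # starts a new event
        probs = updated
    return probs[1:]


# 22 taxon pairs; alpha = 1.5 is an arbitrary illustrative value.
prior = dpp_prior_num_events(22, alpha=1.5)
mean_events = sum(k * p for k, p in enumerate(prior, start=1))
print([round(p, 3) for p in prior[:8]])
print(round(mean_events, 2))
```

Raising alpha shifts prior mass toward more divergence events, and lowering it favours fewer, which is how a single parameter controls the prior expectation.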
Conclusions
I Our new approximate-Bayesian method of phylogeographical model choice shows improved behavior
  I Improved accuracy, robustness, and power
  I More "honest" estimates regarding uncertainty
I Philippine climate-driven diversification model?
  I Results consistent with prediction of clustered divergences
  I Results suggest multiple co-divergences
  I However, there is a lot of uncertainty
Future directions: Full-Bayesian phylogenetic framework
[Figure: diagram with labels T1 to T5 and τ1 to τ3, shown with the sea-level curve. x-axis: Time (kya), 0 to 500; y-axis: Sea level (m), 0 to -100.]
Software
Everything is on GitHub...
I dpp-msbayes: https://github.com/joaks1/dpp-msbayes
I PyMsBayes: https://github.com/joaks1/PyMsBayes
I ABACUS: Approximate BAyesian C UtilitieS. https://github.com/joaks1/abacus
Open Notebook Science
Everything is on GitHub...
I msbayes-experiments: https://github.com/joaks1/msbayes-experiments
Acknowledgments
Ideas and feedback:
I KU Herpetology
I Holder Lab
I Melissa Callahan
I Mike Hickerson
I Laura Kubatko
I My committee
Computation:
I KU ITTC
I KU Computing Center
I iPlant
Funding:
I NSF
I KU Grad Studies, EEB & BI
I SSB
I Sigma Xi
Photo credits:
I Rafe Brown, Cam Siler, & Jake Esselstyn
I FMNH Philippine Mammal Website:
  I D.S. Balete, M.R.M. Duya, & J. Holden
Gene tree divergences
[Figure: gene tree divergence ages. x-axis: Age (mybp); y-axis: Split (Taxon: Island 1−Island 2). Splits shown:]
Crocidura beatus: Leyte−Samar
Crocidura negrina−panayensis: Negros−Panay
Cynopterus brachyotis: Biliran−Mindanao
Cynopterus brachyotis: Negros−Panay
Cyrtodactylus annulatus: Bohol−Mindanao
Cyrtodactylus gubaot−sumuroi: Leyte−Samar
Cyrtodactylus philippinicus: Negros−Panay
Dendrelaphis marenae: Negros−Panay
Gekko mindorensis: Negros−Panay
Haplonycteris fischeri: Biliran−Mindanao
Haplonycteris fischeri: Negros−Panay
Hipposideros obscurus: Leyte−Mindanao
Hipposideros pygmaeus: Bohol−Mindanao
Limnonectes leytensis: Bohol−Mindanao
Limnonectes magnus: Bohol−Mindanao
Macroglossus minimus: Biliran−Mindanao
Macroglossus minimus: Negros−Panay
Ptenochirus jagori: Leyte−Mindanao
Ptenochirus jagori: Negros−Panay
Ptenochirus minor: Biliran−Mindanao
Insulasaurus arborens: Negros−Panay
Pinoyscincus jagori: Mindanao−Samar
[Plotted points are not recoverable from the transcript; x-axis ticks run from 0.5 to 3.0.]
Causes of bias: Insufficient sampling
I Models with more parameter space are less densely sampled
  I Could explain bias toward small models in extreme cases
  I Predicts large variance in posterior estimates
I We explored empirical and simulation-based analyses with 2, 5, and 10 million prior samples, and estimates were very similar
[Figure: 95% HPD of DT against the number of prior samples. x-axis: Number of prior samples (0 to 1.0 × 10^8); y-axis: 95% HPD DT. Panel A: Unadjusted; panel B: GLM-adjusted.]
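The check described on this slide, repeating the analysis with more prior samples and comparing the estimates, can be illustrated on a toy problem. A minimal sketch, assuming a plain rejection-sampling ABC with a uniform prior on a single mean parameter; this is not the msBayes or dpp-msbayes model, and every name and number below is hypothetical.

```python
import random
import statistics


def abc_rejection(observed_mean, n_prior_samples, n_keep, rng):
    """Toy ABC: keep the prior draws whose simulated summary is closest to the observed one."""
    draws = []
    for _ in range(n_prior_samples):
        theta = rng.uniform(0.0, 10.0)  # draw from a broad uniform prior
        # Summary statistic: mean of 10 simulated observations.
        sim = statistics.fmean(rng.gauss(theta, 1.0) for _ in range(10))
        draws.append((abs(sim - observed_mean), theta))
    draws.sort()  # smallest distance first
    return [theta for _, theta in draws[:n_keep]]


rng = random.Random(1)
for n in (2_000, 5_000, 10_000):
    posterior = abc_rejection(3.0, n, 200, rng)
    print(n, round(statistics.fmean(posterior), 2), round(statistics.stdev(posterior), 2))
```

If the posterior summaries barely change as the number of prior samples grows, sparse sampling of parameter space is unlikely to be driving the result, which is the logic of the comparison on the slide.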