bayesian model choice in cosmology

Post on 10-May-2015


DESCRIPTION

Talk at JSM 2010, Vancouver, B.C.

TRANSCRIPT

Bayesian Model Comparison in Cosmology

Bayesian Model Comparison in Cosmology with Population Monte Carlo

Monthly Notices Royal Astronomical Soc. 405 (4), 2381 - 2390, 2010

Christian P. Robert

Université Paris Dauphine & CREST

http://www.ceremade.dauphine.fr/~xian

Joint works with D., Benabed K., Cappe O., Cardoso J.F., Fort G., Kilbinger M.,

[Marin J.-M., Mira A.,] Prunet S., Wraith D.


Outline

1 Cosmology background

2 Importance sampling

3 Application to cosmological data

4 Evidence approximation

5 Cosmology models

6 lexicon


Cosmology background

Cosmology

A large part of the data to answer some of the major questions in cosmology comes from studying the Cosmic Microwave Background (CMB) radiation (fossil heat released circa 380,000 years after the Big Bang).

The CMB is hugely uniform. Only very sensitive instruments such as WMAP (NASA, 2001) can detect fluctuations in the CMB temperature, e.g. minute temperature variations: one part of the sky has a temperature of 2.7251 Kelvin (degrees above absolute zero), while another part of the sky has a temperature of 2.7249 Kelvin.

[Figure: plots of the CMB dataset; image not recovered]

[Marin & CPR, Bayesian Core, 2007]

Planck

Temperature variations are related to fluctuations in the density of matter in the early universe and thus carry information about the initial conditions for the formation of cosmic structures such as galaxies, clusters, and voids.

Planck: joint mission between the European Space Agency (ESA) and NASA, launched in May 2009. The Planck mission plans to provide datasets of nearly 5 × 10^10 observations to settle many open questions with CMB temperature data. Rather than scalar-valued observations, Planck will provide tensor-valued data and thus is likely to also open up this area of statistical research.


Some questions in cosmology

Will the universe expand forever, or will it collapse?

Is the universe dominated by exotic dark matter and what is its concentration?

What is the shape of the universe?

Is the expansion of the universe accelerating rather than decelerating?

Is the “flat ΛCDM paradigm” appropriate or is the curvature different from zero?

[Adams, The Guide [a.k.a. H2G2], 1979]

Statistical problems in cosmology

Potentially high dimensional parameter space [Not considered here]

Immensely slow computation of likelihoods, e.g. WMAP, CMB, because of numerically costly spectral transforms [Data is a Fortran program]

Nonlinear dependence and degeneracies between parameters introduced by physical constraints or theoretical assumptions

[Figure: two panels with axes (Ωm, w0) and (−M, α)]


Importance sampling

Importance sampling solutions

1 Cosmology background

2 Importance sampling

Adaptive importance sampling

Adaptive multiple importance sampling

3 Application to cosmological data

4 Evidence approximation

5 Cosmology models

6 lexicon

Importance sampling 101

Importance sampling is based on the fundamental identity

$$\pi(f) = \int f(x)\,\pi(x)\,dx = \int f(x)\,\frac{\pi(x)}{q(x)}\,q(x)\,dx$$

If $x_1, \ldots, x_N$ are drawn independently from $q$,

$$\hat\pi(f) = \sum_{n=1}^{N} f(x_n)\,w_n; \qquad w_n = \frac{\pi(x_n)/q(x_n)}{\sum_{m=1}^{N} \pi(x_m)/q(x_m)},$$

provides a converging approximation to π(f) (independent of the normalisation of π).
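The self-normalised estimator above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy target (a N(1, 1) density known up to a constant), the N(0, 2²) proposal and all function names are hypothetical and not taken from the talk.

```python
import math
import random

def self_normalised_is(f, log_target, sample_q, log_q, n, rng):
    """Estimate pi(f) from n draws of q using self-normalised weights."""
    xs = [sample_q(rng) for _ in range(n)]
    # Log-weights log pi(x_n) - log q(x_n); pi may be unnormalised.
    log_w = [log_target(x) - log_q(x) for x in xs]
    shift = max(log_w)  # subtract the maximum to avoid overflow in exp
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return sum(f(x) * wi for x, wi in zip(xs, w)) / total

# Hypothetical toy problem: unnormalised N(1, 1) target, N(0, 2^2) proposal.
def log_target(x):
    return -0.5 * (x - 1.0) ** 2

def log_q(x):
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0 * math.sqrt(2.0 * math.pi))

def sample_q(rng):
    return rng.gauss(0.0, 2.0)

rng = random.Random(0)
mean_estimate = self_normalised_is(lambda x: x, log_target, sample_q, log_q, 20000, rng)
```

Because the weights are normalised by their own sum, the missing constant of the target cancels, which is the "independent of the normalisation of π" remark on the slide.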


Adaptive importance sampling

Initialising importance sampling

PMC/AIS offers a solution to the difficulty of picking q through adaptivity:

Given a target π, PMC produces a sequence $q^t$ of importance functions ($t = 1, \ldots, T$) aimed at approximating π.

First sample produced by a regular importance sampling scheme, $x_1^1, \ldots, x_N^1 \sim q^1$, associated with importance weights

$$w_n^1 = \frac{\pi(x_n^1)}{q^1(x_n^1)}$$

and their normalised counterparts $\bar w_n^1$, providing a first approximation to a sample from π.

Moments of π can then be approximated to construct an updated importance function $q^2$, &c.

Adaptive importance sampling

Optimality criterion?

The quality of approximation can be measured in terms of the Kullback divergence from the target,

$$D(\pi \| q^t) = \int \log\left(\frac{\pi(x)}{q^t(x)}\right) \pi(x)\,dx,$$

and the density $q^t$ can be adjusted incrementally to minimize this divergence.

PMC – Some papers

Cappe et al (2004) - J. Comput. Graph. Stat.

Outline of Population Monte Carlo but missed main point

Celeux et al (2005) - Comput. Stat. & Data Analysis

Rao-Blackwellisation for importance sampling and missing data problems

Douc et al (2007) - ESAIM Prob. & Stat. and Annals of Statistics

Convergence issues proving adaptation is positive where q is a mixture density of random-walk proposals (mixture weights varied)

Cappe et al (2007) - Stat. & Computing

Adaptation of q (mixture density of independent proposals), where weights and parameters vary

Wraith et al (2009) - Physical Review D

Application of Cappe et al (2007) to cosmology and comparison with MCMC

Beaumont et al (2009) - Biometrika

Application of Cappe et al (2007) to ABC settings

Kilbinger et al (2010) - Month. N. Royal Astro. Soc.

Application of Cappe et al (2007) to model choice in cosmology

Adaptive importance sampling (2)

Use of mixture densities

$$q^t(x) = q(x; \alpha^t, \theta^t) = \sum_{d=1}^{D} \alpha_d^t\, \varphi(x; \theta_d^t)$$

[West, 1993]

where

$\alpha^t = (\alpha_1^t, \ldots, \alpha_D^t)$ is a vector of adaptable weights for the D mixture components

$\theta^t = (\theta_1^t, \ldots, \theta_D^t)$ is a vector of parameters which specify the components

ϕ is a parameterised density (usually taken to be multivariate Gaussian or Student-t, the latter preferred)
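As a rough illustration of such a mixture proposal, the sketch below evaluates and samples a one-dimensional mixture. It assumes Gaussian components for brevity (the slide prefers Student-t), and the function names are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, alphas, mus, sigmas):
    """q(x) = sum_d alpha_d * phi(x; mu_d, sigma_d)."""
    return sum(a * normal_pdf(x, m, s) for a, m, s in zip(alphas, mus, sigmas))

def mixture_sample(alphas, mus, sigmas, rng):
    """Pick component d with probability alpha_d, then draw from it."""
    d = rng.choices(range(len(alphas)), weights=alphas)[0]
    return rng.gauss(mus[d], sigmas[d])

# Hypothetical two-component proposal.
alphas, mus, sigmas = [0.3, 0.7], [0.0, 5.0], [1.0, 2.0]
rng = random.Random(0)
draws = [mixture_sample(alphas, mus, sigmas, rng) for _ in range(5000)]
```

Adapting the proposal then amounts to changing `alphas`, `mus` and `sigmas` between iterations while the evaluation and sampling code stays the same.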

Cappe et al (2007) optimal scheme

Update $q^t$ using an integrated EM approach minimising the KL divergence at each iteration

$$D(\pi \| q^t) = \int \log\left(\frac{\pi(x)}{\sum_{d=1}^{D} \alpha_d^t\, \varphi(x; \theta_d^t)}\right) \pi(x)\,dx,$$

equivalent to maximising

$$\ell(\alpha, \theta) = \int \log\left(\sum_{d=1}^{D} \alpha_d\, \varphi(x; \theta_d)\right) \pi(x)\,dx$$

in α, θ.

PMC updates

Maximization of $L^t(\alpha, \theta)$ leads to closed form solutions in exponential families (and for the t distributions).

For instance for $N_p(\mu_d, \Sigma_d)$:

$$\alpha_d^{t+1} = \int \rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx,$$

$$\mu_d^{t+1} = \frac{\int x\,\rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}},$$

$$\Sigma_d^{t+1} = \frac{\int (x - \mu_d^{t+1})(x - \mu_d^{t+1})^{T}\,\rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}}.$$

Empirical updates

And empirical versions,

$$\alpha_d^{t+1} = \sum_{n=1}^{N} \bar w_n^t\, \rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)$$

$$\mu_d^{t+1} = \frac{\sum_{n=1}^{N} \bar w_n^t\, x_n^t\, \rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)}{\alpha_d^{t+1}}$$

$$\Sigma_d^{t+1} = \frac{\sum_{n=1}^{N} \bar w_n^t\, (x_n^t - \mu_d^{t+1})(x_n^t - \mu_d^{t+1})^{T}\, \rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)}{\alpha_d^{t+1}}$$
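A minimal one-dimensional sketch of these empirical updates follows. The slides do not define ρ_d; the code assumes the usual posterior component probability, ρ_d(x) = α_d ϕ(x; θ_d) / Σ_l α_l ϕ(x; θ_l), and Gaussian components. Function and variable names are hypothetical.

```python
import math

def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def pmc_update(xs, norm_w, alphas, mus, variances):
    """One empirical update of a 1-D Gaussian mixture proposal.

    xs: particles drawn from the current mixture; norm_w: their normalised
    importance weights (summing to one).
    """
    # rho[n][d]: assumed posterior probability that x_n came from component d.
    rho = []
    for x in xs:
        dens = [a * normal_pdf(x, m, v) for a, m, v in zip(alphas, mus, variances)]
        total = sum(dens)  # would be zero only if every component underflows
        rho.append([p / total for p in dens])
    new_alphas, new_mus, new_vars = [], [], []
    for d in range(len(alphas)):
        a_d = sum(w * r[d] for w, r in zip(norm_w, rho))
        m_d = sum(w * x * r[d] for w, x, r in zip(norm_w, xs, rho)) / a_d
        v_d = sum(w * (x - m_d) ** 2 * r[d] for w, x, r in zip(norm_w, xs, rho)) / a_d
        new_alphas.append(a_d)
        new_mus.append(m_d)
        new_vars.append(v_d)
    return new_alphas, new_mus, new_vars

# Tiny deterministic illustration with two well-separated clusters.
updated = pmc_update([-1.0, 1.0, 9.0, 11.0], [0.25] * 4,
                     [0.5, 0.5], [0.0, 10.0], [1.0, 1.0])
```

A production version would guard against a component whose updated weight `a_d` collapses to zero; the sketch leaves that out.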

Banana benchmark

Twisted $N_p(0, \Sigma)$ target with $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, changing the second co-ordinate $x_2$ to $x_2 + b(x_1^2 - \sigma_1^2)$

[Figure: target in the $(x_1, x_2)$ plane for $p = 10$, $\sigma_1^2 = 100$, $b = 0.03$]

[Haario et al. 1999]
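The benchmark can be written down directly. The sketch below assumes the convention that the target density is the Gaussian density evaluated at the twisted point (so a draw is obtained by sampling the Gaussian and undoing the twist on the second coordinate); the slide does not spell this out, and the function names are hypothetical.

```python
import math
import random

def banana_log_density(x, b=0.03, sigma1_sq=100.0):
    """Unnormalised log-density of the twisted N_p(0, diag(sigma1^2, 1, ..., 1))."""
    y2 = x[1] + b * (x[0] ** 2 - sigma1_sq)  # second coordinate after the twist
    quad = x[0] ** 2 / sigma1_sq + y2 ** 2 + sum(xi ** 2 for xi in x[2:])
    return -0.5 * quad

def banana_sample(p, rng, b=0.03, sigma1_sq=100.0):
    """Exact draw: sample the untwisted Gaussian, then undo the twist on x2."""
    z = [rng.gauss(0.0, math.sqrt(sigma1_sq))]
    z += [rng.gauss(0.0, 1.0) for _ in range(p - 1)]
    z[1] = z[1] - b * (z[0] ** 2 - sigma1_sq)
    return z

rng = random.Random(0)
draws = [banana_sample(10, rng) for _ in range(20000)]
x2 = [d[1] for d in draws]
mean_x2 = sum(x2) / len(x2)
var_x2 = sum((v - mean_x2) ** 2 for v in x2) / len(x2)
```

Under this convention the twist has unit Jacobian, so the normalising constant is that of the original Gaussian, and with b = 0.03 the variance of x2 is 1 + 2 b² σ1⁴ = 19.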

Simulation

[Figure: six panels over the same axis ranges as the banana benchmark; images not recovered]

Monitoring by perplexity

Stop iterations when further adaptations do not improve $D(\pi \| q^t)$.

The transform $\exp[-D(\pi \| q^t)]$ may be estimated by the normalised perplexity $p = \exp(H_N^t)/N$, where

$$H_N^t = -\sum_{n=1}^{N} \bar w_n^t \log \bar w_n^t$$

is the Shannon entropy of the normalised weights.

Thus, minimization of the Kullback divergence can be approximately connected with the maximization of the (normalised) perplexity (values closer to 1 indicating good agreement between q and π).
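The normalised perplexity is cheap to compute from the normalised weights. A small sketch (the function name is hypothetical) is:

```python
import math

def normalised_perplexity(norm_w):
    """exp(H)/N, with H the Shannon entropy of the normalised weights."""
    n = len(norm_w)
    # Terms with zero weight contribute nothing (0 * log 0 is taken as 0).
    entropy = -sum(w * math.log(w) for w in norm_w if w > 0.0)
    return math.exp(entropy) / n

uniform = normalised_perplexity([0.25, 0.25, 0.25, 0.25])   # all weights equal
degenerate = normalised_perplexity([1.0, 0.0, 0.0, 0.0])    # one particle dominates
```

Equal weights give a value of 1, while a single dominating particle gives 1/N, which is why values close to 1 signal a proposal well matched to the target.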

Monitoring by ESS

A second criterion is the effective sample size (ESS)

$$\mathrm{ESS}_N^t = \left(\sum_{n=1}^{N} \left(\bar w_n^t\right)^2\right)^{-1}$$

which can be interpreted as the number of equivalent iid sample points.
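The effective sample size follows from the same normalised weights. A minimal sketch, with a hypothetical function name:

```python
def effective_sample_size(norm_w):
    """Inverse of the sum of squared normalised weights."""
    return 1.0 / sum(w * w for w in norm_w)

ess_uniform = effective_sample_size([0.25, 0.25, 0.25, 0.25])
ess_two = effective_sample_size([0.5, 0.5, 0.0, 0.0])
ess_one = effective_sample_size([1.0, 0.0, 0.0, 0.0])
```

The value ranges from 1 (one particle carries all the weight) to N (all weights equal); dividing by N gives the normalised ESS plotted alongside the perplexity.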

Simulation

[Figure: NPERPL (top) and NESS (bottom) against iterations 1 to 10]

Normalised perplexity (top panel) and normalised effective sample size (ESS/N) (bottom panel) estimates for the first 10 iterations of PMC

Comparison to MCMC

Adaptive MCMC: Proposal is a multivariate Gaussian with Σ updated/based on previous values in the chain. Scale and update times chosen for optimal results.

[Figure: four panels, fa (top) and fb (bottom), PMC (left) and MCMC (right)]

Evolution of π(fa) (top panels) and π(fb) (bottom panels) from 10k points to 100k points for both PMC (left panels) and MCMC (right panels).

Simulation

[Figure: box plots of the proportion of points inside, for d10, d2 and d1 under PMC and MCMC; panels fc, fe, fh, fd, fg, fi]

Results showing the distributions of the PMC and the MCMC estimates. All estimates are based on 500 simulation runs.


Adaptive multiple importance sampling

Adaptive multiple importance sampling

Full recycling:

At iteration t, design a new proposal $q_t$ based on all previous samples

$$x_1^1, \ldots, x_N^1, \ldots, x_1^{t-1}, \ldots, x_N^{t-1}$$

At each stage, the whole past can be used: if un-normalised weights $\omega_{i,t}$ are preserved along iterations, then all $x_i^t$'s can be pooled together


Caveat

When using several importance functions at once, $q_0, \ldots, q_T$, with samples $x_1^0, \ldots, x_{N_0}^0, \ldots, x_1^T, \ldots, x_{N_T}^T$ and importance weights $\omega_i^t = \pi(x_i^t)/q_t(x_i^t)$, merging through the empirical distribution

$$\sum_{t,i} \omega_i^t\, \delta_{x_i^t}(x) \Big/ \sum_{t,i} \omega_i^t \approx \pi(x)$$

fails to cull poor proposals: very large weights do remain large in the cumulated sample and poorly performing samples overwhelmingly dominate other samples in the final outcome.

Raw mixing of importance samples may be harmful, compared with a single sample, even when most proposals are efficient.


Deterministic mixtures

Owen and Zhou (2000) propose a stabilising recycling of the weights via deterministic mixtures by modifying the importance density $q_t(x_i^t)$ under which $x_i^t$ was truly simulated to a mixture of all the densities that have been used so far

$$\frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\, q_l(x_i^t),$$

resulting in the deterministic mixture weight

$$\omega_i^t = \pi(x_i^t) \Big/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\, q_l(x_i^t).$$
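A small sketch of the deterministic mixture weight: every particle, whichever proposal produced it, is weighted against the sample-size-weighted mixture of all proposals. The function names and the toy densities are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def deterministic_mixture_weights(samples, target, proposals):
    """samples[t] holds the draws from proposals[t] (a density function).

    Returns weights[t][i] = target(x) / [sum_l N_l q_l(x) / sum_j N_j].
    """
    sizes = [len(batch) for batch in samples]
    total = sum(sizes)
    weights = []
    for batch in samples:
        batch_w = []
        for x in batch:
            mix = sum(n * q(x) for n, q in zip(sizes, proposals)) / total
            batch_w.append(target(x) / mix)
        weights.append(batch_w)
    return weights

# Toy check: when the target equals the mixture itself, every weight is one.
q0 = lambda x: normal_pdf(x, 0.0, 1.0)
q1 = lambda x: normal_pdf(x, 5.0, 1.0)
mix_target = lambda x: (2.0 * q0(x) + 1.0 * q1(x)) / 3.0
w = deterministic_mixture_weights([[0.0, 1.0], [5.0]], mix_target, [q0, q1])
```

A particle drawn far in the tail of one poor proposal no longer receives a huge weight as long as some other proposal in the sequence covers that region, which is the stabilising effect described on the slide.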

Unbiasedness

Potential to exploit the most efficient proposals in the sequence $Q_0, \ldots, Q_T$ without rejecting any simulated value nor sample. Poorly performing importance functions are simply eliminated through the erosion of their weights

$$\pi(x_i^t) \Big/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\, q_l(x_i^t)$$

as T increases.

Paradoxical feature of competing acceptable importance weights for the same simulated value well-understood in the cases of Rao-Blackwellisation and of Population Monte Carlo. More intricate here in that only unbiasedness remains [fake mixture]


AMIS

AMIS (or Adaptive Multiple Importance Sampling) uses importance sampling functions ($q_t$) that are constructed sequentially and adaptively, using past t − 1 weighted samples.

i. weights of all present and past variables $x_i^l$ ($1 \le l \le t$, $1 \le i \le N_l$) are modified, based on the current proposals

ii. the entire collection of importance samples is used to build the next importance function.

[Parallel with IMIS: Raftery & Bao, 2010]

The AMIS algorithm

Adaptive Multiple Importance Sampling

At iteration $t = 1, \ldots, T$

1) Independently generate $N_t$ particles $x_i^t \sim q(x|\theta^{t-1})$

2) For $1 \le i \le N_t$, compute the mixture at $x_i^t$

$$\delta_i^t = N_0\, q_0(x_i^t) + \sum_{l=1}^{t} N_l\, q(x_i^t; \theta^{l-1})$$

and derive the weight of $x_i^t$,

$$\omega_i^t = \pi(x_i^t) \Big/ \Big[\delta_i^t \Big/ \sum_{j=0}^{t} N_j\Big].$$

3) For $0 \le l \le t-1$ and $1 \le i \le N_l$, update past weights as

$$\delta_i^l = \delta_i^l + N_t\, q(x_i^l; \theta^{t-1}) \quad\text{and}\quad \omega_i^l = \pi(x_i^l) \Big/ \Big[\delta_i^l \Big/ \sum_{j=0}^{t} N_j\Big].$$

4) Compute the parameter estimate $\theta^t$ based on

$$(x_1^0, \omega_1^0, \ldots, x_{N_0}^0, \omega_{N_0}^0, \ldots, x_1^t, \omega_1^t, \ldots, x_{N_t}^t, \omega_{N_t}^t)$$

[Cornuet, Marin, Mira & CPR, 2009, arXiv:0907.1254]
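The four steps can be sketched for a one-dimensional target. This is a simplified illustration, not the authors' implementation: it assumes Gaussian proposals q(x; θ) with θ = (mean, standard deviation), the same number of particles at every iteration, an initial proposal from the same family, and a moment-matching update for θ. All names are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def amis(target, theta0, n_per_iter, n_iter, rng):
    """1-D AMIS sketch; returns pooled particles, weights and all thetas."""
    thetas = [theta0]   # parameters of every proposal used so far
    xs, delta = [], []  # pooled particles and their mixture sums delta_i
    weights = []
    for _ in range(n_iter + 1):
        mu, sigma = thetas[-1]
        new_xs = [rng.gauss(mu, sigma) for _ in range(n_per_iter)]
        # Step 3: past particles gain the contribution of the newest proposal.
        for i, x in enumerate(xs):
            delta[i] += n_per_iter * normal_pdf(x, mu, sigma)
        # Step 2: new particles are evaluated under every proposal so far.
        for x in new_xs:
            xs.append(x)
            delta.append(sum(n_per_iter * normal_pdf(x, m, s) for m, s in thetas))
        total_n = len(xs)
        weights = [target(x) / (d / total_n) for x, d in zip(xs, delta)]
        # Step 4: moment-matching estimate of theta from the whole pool.
        sw = sum(weights)
        mean = sum(w * x for w, x in zip(weights, xs)) / sw
        var = sum(w * (x - mean) ** 2 for w, x in zip(weights, xs)) / sw
        thetas.append((mean, math.sqrt(var)))
    return xs, weights, thetas

# Hypothetical toy target: unnormalised N(3, 1), started from N(0, 5^2).
rng = random.Random(1)
xs, weights, thetas = amis(lambda x: math.exp(-0.5 * (x - 3.0) ** 2),
                           (0.0, 5.0), 500, 5, rng)
```

Storing the running sums `delta` is what keeps the recycling affordable: each new proposal is evaluated once at every stored particle instead of recomputing the whole mixture.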

Studentised AMIS

When the proposal distribution $q_t$ is a Student's t proposal,

$$T_3(\mu, \Sigma)$$

mean µ and covariance Σ parameters can be updated by estimating first two moments of the target distribution Π

$$\mu^t = \frac{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l\, x_i^l}{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l} \quad\text{and}\quad \Sigma^t = \frac{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l\, (x_i^l - \mu^t)(x_i^l - \mu^t)^{T}}{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l},$$

i.e. using optimal update of Cappe et al. (2007)

Obvious extension to mixtures [and again optimal update of Cappe et al. (2007)]
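The two moment estimates are plain weighted averages over the pooled particles. A multivariate sketch in pure Python, with a hypothetical function name, could read:

```python
def weighted_moments(xs, weights):
    """Weighted mean vector and covariance matrix of pooled particles.

    xs: list of p-dimensional points (lists); weights: unnormalised weights.
    """
    p = len(xs[0])
    sw = sum(weights)
    mu = [sum(w * x[k] for w, x in zip(weights, xs)) / sw for k in range(p)]
    cov = [[sum(w * (x[j] - mu[j]) * (x[k] - mu[k]) for w, x in zip(weights, xs)) / sw
            for k in range(p)]
           for j in range(p)]
    return mu, cov

# Corners of a square with equal weights: mean (1, 1), identity covariance.
mu, cov = weighted_moments([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],
                           [1.0, 1.0, 1.0, 1.0])
```

Only the ratio of weighted sums matters, so the weights need not be normalised beforehand.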


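Simulations

Same banana benchmark

| Target function | p | AMIS | Cappe'07 |
|---|---|---|---|
| E(x1) = 0 | 5 | 0.06558 | 0.06879 |
| E(x1) = 0 | 10 | 0.06388 | 0.11051 |
| E(x1) = 0 | 20 | 0.09167 | 0.17912 |
| E(x2) = 0 | 5 | 0.10215 | 0.11583 |
| E(x2) = 0 | 10 | 0.21421 | 0.22557 |
| E(x2) = 0 | 20 | 0.25316 | 0.29087 |
| Σ_{i=3}^{5} E(xi) = 0 | 5 | 0.00478 | 0.00927 |
| Σ_{i=3}^{10} E(xi) = 0 | 10 | 0.00902 | 0.02099 |
| Σ_{i=3}^{20} E(xi) = 0 | 20 | 0.01666 | 0.04208 |
| var(x1) = 100 | 5 | 2.60672 | 3.92650 |
| var(x1) = 100 | 10 | 7.06686 | 7.48877 |
| var(x1) = 100 | 20 | 8.20020 | 9.71725 |
| var(x2) = 19 | 5 | 2.10682 | 2.96132 |
| var(x2) = 19 | 10 | 3.76660 | 5.08474 |
| var(x2) = 19 | 20 | 4.85407 | 5.98031 |
| Σ_{i=3}^{5} var(xi) = 3 | 5 | 0.00645 | 0.01196 |
| Σ_{i=3}^{10} var(xi) = 8 | 10 | 0.01370 | 0.02636 |
| Σ_{i=3}^{20} var(xi) = 18 | 20 | 0.04609 | 0.06424 |

Root mean square errors calculated over 10 replications for different target functions and dimensions p.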

Simulation (cont’d)

[Figure] 10 replicate ESSs for AMIS (left) and PMC (right) for p = 5, 10, 20.

Simulation (cont’d)

[Figure] 10 replicate absolute errors associated with the estimation of $E(x_1)$ (left column), $E(x_2)$ (center column) and $\sum_{i=3}^{p} E(x_i)$ (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.

Simulation (cont’d)

[Figure] 10 replicate absolute errors associated with the estimation of $\mathrm{var}(x_1)$ (left column), $\mathrm{var}(x_2)$ (center column) and $\sum_{i=3}^{p} \mathrm{var}(x_i)$ (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.


Application to cosmological data

Cosmological data

Posterior distribution of cosmological parameters for recent observational data of CMB anisotropies (differences in temperature across directions) [WMAP], SNIa, and cosmic shear.

Combination of three likelihoods, some of which are available as public (Fortran) code, and of a uniform prior on a hypercube.

Cosmology parameters

Parameters for the cosmology likelihood (C=CMB, S=SNIa, L=lensing)

| Symbol | Description | Minimum | Maximum | Experiment |
|---|---|---|---|---|
| Ωb | Baryon density | 0.01 | 0.1 | C L |
| Ωm | Total matter density | 0.01 | 1.2 | C S L |
| w | Dark-energy eq. of state | -3.0 | 0.5 | C S L |
| ns | Primordial spectral index | 0.7 | 1.4 | C L |
| ∆²R | Normalization (large scales) | | | C |
| σ8 | Normalization (small scales) | | | C L |
| h | Hubble constant | | | C L |
| τ | Optical depth | | | C |
| M | Absolute SNIa magnitude | | | S |
| α | Colour response | | | S |
| β | Stretch response | | | S |
| a | galaxy z-distribution fit | | | L |
| b | galaxy z-distribution fit | | | L |
| c | galaxy z-distribution fit | | | L |

For WMAP5, σ8 is a deduced quantity that depends on the other parameters


Adaptation of importance function

Estimates

| Parameter | PMC | MCMC |
|---|---|---|
| Ωb | 0.0432 +0.0027/−0.0024 | 0.0432 +0.0026/−0.0023 |
| Ωm | 0.254 +0.018/−0.017 | 0.253 +0.018/−0.016 |
| τ | 0.088 +0.018/−0.016 | 0.088 +0.019/−0.015 |
| w | −1.011 ± 0.060 | −1.010 +0.059/−0.060 |
| ns | 0.963 +0.015/−0.014 | 0.963 +0.015/−0.014 |
| 10^9 ∆²R | 2.413 +0.098/−0.093 | 2.414 +0.098/−0.092 |
| h | 0.720 +0.022/−0.021 | 0.720 +0.023/−0.021 |
| a | 0.648 +0.040/−0.041 | 0.649 +0.043/−0.042 |
| b | 9.3 +1.4/−0.9 | 9.3 +1.7/−0.9 |
| c | 0.639 +0.084/−0.070 | 0.639 +0.082/−0.070 |
| −M | 19.331 ± 0.030 | 19.332 +0.029/−0.031 |
| α | 1.61 +0.15/−0.14 | 1.62 +0.16/−0.14 |
| −β | −1.82 +0.17/−0.16 | −1.82 ± 0.16 |
| σ8 | 0.795 +0.028/−0.030 | 0.795 +0.030/−0.027 |

Means and 68% credible intervals using lensing, SNIa and CMB

Advantage of AIS and PMC?

Parallelisation of the posterior calculations
- For the cosmological examples, we used up to 100 CPUs on a computer cluster to explore the cosmology posteriors using AIS/PMC, reducing the computational time from several days for MCMC to a few hours using PMC.

Low variance of Monte Carlo estimates
- For PMC and q closely matched to π, significant reductions in the variance of the Monte Carlo estimates are possible compared to estimates using MCMC. This also translates into a computational saving, with further savings possible by combining samples across iterations.

Simple diagnostics of ‘convergence’ (perplexity)
- For PMC, the perplexity provides a relatively simple measure of sampling adequacy to the target density of interest.


Evidence approximation

Evidence/Marginal likelihood/Integrated Likelihood ...

Central quantity of interest in (Bayesian) model choice

$$E = \int \pi(x)\,dx = \int \frac{\pi(x)}{q(x)}\,q(x)\,dx,$$

expressed as an expectation under any density q with large enough support.

Importance sampling provides a sample $x_1, \ldots, x_N \sim q$ and approximation of the above integral,

$$E \approx \frac{1}{N} \sum_{n=1}^{N} w_n$$

where the $w_n = \pi(x_n)/q(x_n)$ are the (unnormalised) importance weights.
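As a sketch of the evidence estimate, the code below averages the unnormalised weights, taken here as the ratios π(x_n)/q(x_n) with the sum divided by N, and works on the log scale for numerical stability. The toy target exp(−x²/2), whose integral is √(2π), and the function names are hypothetical.

```python
import math
import random

def log_evidence(log_target, sample_q, log_q, n, rng):
    """log of (1/N) sum_n pi(x_n)/q(x_n), computed with a log-sum-exp shift."""
    log_w = []
    for _ in range(n):
        x = sample_q(rng)
        log_w.append(log_target(x) - log_q(x))
    shift = max(log_w)  # largest log-weight, factored out before exponentiating
    return shift + math.log(sum(math.exp(lw - shift) for lw in log_w) / n)

# Hypothetical toy problem: unnormalised target exp(-x^2 / 2), proposal N(0, 2^2).
def log_target(x):
    return -0.5 * x * x

def log_q(x):
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0 * math.sqrt(2.0 * math.pi))

rng = random.Random(0)
ln_e = log_evidence(log_target, lambda r: r.gauss(0.0, 2.0), log_q, 20000, rng)
exact = 0.5 * math.log(2.0 * math.pi)  # log of the true integral sqrt(2 pi)
```

Unlike the self-normalised estimator of π(f), the evidence needs the proposal density q with its normalising constant, since nothing cancels in the plain average.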


Back to the banana ...

Centred d-multivariate normal, $x \sim N_d(0, \Sigma)$ with covariance $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, which is slightly twisted in the first two dimensions by changing $x_2$ to be $x_2 + \beta(x_1^2 - \sigma_1^2)$, where $\sigma_1^2 = 100$ and β controls the degree of curvature.

We integrate over the unnormalised target density

$$E = \int \pi(\beta)\, f(x|\beta, \Sigma)\,d\beta$$

or

$$E = \int \pi(x|\beta, \Sigma)\,dx.$$

Simulation results (1)

[Figure: target in the $(x_1, x_2)$ plane; posterior mean of β and log-evidence after the 10th iteration]

β unknown

Simulation results (2)

[Figure: perplexity, NESS and log-evidence against iteration (1 to 10); log-evidence of the final sample after the 10th iteration]

β = 0.015 known


Cosmology models

Back to cosmology questions

Standard cosmology successful in explaining recent observations, such as CMB, SNIa, galaxy clustering, cosmic shear, galaxy cluster counts, and Lyα forest clustering.

Flat ΛCDM model with only six free parameters (Ωm, Ωb, h, ns, τ, σ8)

Extensions to ΛCDM may be based on independent evidence (massive neutrinos from oscillation experiments), predicted by compelling hypotheses (primordial gravitational waves from inflation) or reflect ignorance about fundamental physics (dynamical dark energy).

Testing for dark energy, curvature, and inflationary models


Extended models

Focus on the dark energy equation-of-state parameter, modeled as

w = −1 (ΛCDM)

w = w0 (wCDM)

w = w0 + w1(1 − a) (w(z)CDM)

In addition, curvature parameter ΩK for each of the above is either ΩK = 0 (‘flat’) or ΩK ≠ 0 (‘curved’).

Choice of models represents simplest models beyond a “cosmological constant” model able to explain the observed, recent accelerated expansion of the Universe.
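The three equation-of-state parameterisations are simple functions of the scale factor a. The sketch below writes them out, using the standard relation a = 1/(1 + z) between scale factor and redshift; the function names are hypothetical.

```python
def scale_factor(z):
    """Scale factor a for redshift z (a = 1 today, at z = 0)."""
    return 1.0 / (1.0 + z)

def w_lcdm(a):
    """Cosmological constant: w = -1 at every epoch."""
    return -1.0

def w_wcdm(a, w0):
    """Constant equation of state w = w0."""
    return w0

def w_linear(a, w0, w1):
    """Linear model w(a) = w0 + w1 * (1 - a)."""
    return w0 + w1 * (1.0 - a)

# Example: a linear model evaluated today and at redshift 1.
today = w_linear(scale_factor(0.0), -1.0, 0.5)
at_z1 = w_linear(scale_factor(1.0), -1.0, 0.5)
```

The models are nested: setting w1 = 0 in the linear model recovers wCDM, and additionally setting w0 = −1 recovers ΛCDM.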

Cosmology priors

Prior ranges for dark energy and curvature models. In case of w(a) models, the prior on w1 depends on w0

| Parameter | Description | Min. | Max. |
|---|---|---|---|
| Ωm | Total matter density | 0.15 | 0.45 |
| Ωb | Baryon density | 0.01 | 0.08 |
| h | Hubble parameter | 0.5 | 0.9 |
| ΩK | Curvature | −1 | 1 |
| w0 | Constant dark-energy par. | −1 | −1/3 |
| w1 | Linear dark-energy par. | −1 − w0 | (−1/3 − w0)/(1 − a_acc) |

Cosmology priors (2)

Component to the matter-density tensor with w(a) < −1/3 for values of the scale factor a > a_acc = 2/3. To limit the state equation from below, we impose the condition w(a) > −1 for all a, thereby excluding phantom energy.

Natural limit on the curvature is that of an empty Universe, i.e. upper boundary on the curvature ΩK = 1. A lower boundary corresponds to an upper limit on the total matter-energy density: ΩK > −1, excluding high-density Universe(s) which are ruled out by the age of the oldest observed objects.

Alternative prior on ΩK could be derived from the paradigm of inflation, but most scenarios imply the curvature to be on the order of 10^−60. The likelihood over such a prior on ΩK is essentially flat for any current and future experiments, hence cannot be assessed.
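The w0-dependent range for w1 follows from the two conditions on w(a) = w0 + w1(1 − a): requiring w(a) > −1 as a → 0 gives w1 > −1 − w0, and requiring w(a) < −1/3 at a = a_acc gives w1 < (−1/3 − w0)/(1 − a_acc). A small sketch with hypothetical function names:

```python
def w1_bounds(w0, a_acc=2.0 / 3.0):
    """Lower and upper prior limits on w1 for a given w0."""
    lower = -1.0 - w0
    upper = (-1.0 / 3.0 - w0) / (1.0 - a_acc)
    return lower, upper

def in_dark_energy_prior(w0, w1, a_acc=2.0 / 3.0):
    """True when (w0, w1) lies inside the prior range described above."""
    if not (-1.0 <= w0 <= -1.0 / 3.0):
        return False
    lower, upper = w1_bounds(w0, a_acc)
    return lower <= w1 <= upper

bounds_at_minus_one = w1_bounds(-1.0)
```

For w0 = −1 this gives w1 between 0 and 2, so the admissible region in the (w0, w1) plane is a wedge bounded by two straight lines.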


PMC setup

q0 is a Gaussian mixture model with D components randomly shifted away from the MLE and covariance equal to the information matrix.

For the dark-energy and curvature models number of iterations T equal to 10, unless perplexity indicated the contrary. Average number of points sampled under an individual mixture-component, N/D, controlled for stable updating component (N = 7500 and D = 10).

For the primordial models T = 5, N = 10000 and D between 7 and 10, depending on the dimensionality.

Parameters controlling the initial mixture means and covariances, chosen as fshift = 0.02, and fvar between 1 and 1.5. Final iteration run with a five-times larger sample

Results

In most cases evidence in favour of the standard model, especially when more datasets/experiments are combined.

Largest evidence is ln B12 = 1.8, for the w(z)CDM model and CMB alone. Case where a large part of the prior range is still allowed by the data, and a region of comparable size is excluded. Hence weak evidence that both w0 and w1 are required, but excluded when adding SNIa and BAO datasets.

Results on the curvature are compatible with current findings: non-flat Universe(s) strongly disfavoured for the three dark-energy cases.
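To put a log Bayes factor such as ln B12 = 1.8 on a more familiar scale, it can be converted to odds and, under equal prior model probabilities (an assumption made here only for illustration), to a posterior model probability. The function names are hypothetical.

```python
import math

def log_bayes_factor(log_e1, log_e2):
    """ln B12 as the difference of two log-evidences."""
    return log_e1 - log_e2

def posterior_probability(ln_b12, prior_odds=1.0):
    """Posterior probability of model 1 given ln B12 and the prior odds."""
    odds = prior_odds * math.exp(ln_b12)
    return odds / (1.0 + odds)

bayes_factor = math.exp(1.8)            # odds of about 6 to 1
prob_model_1 = posterior_probability(1.8)
```

With equal prior odds, ln B12 = 1.8 corresponds to a posterior probability of roughly 0.86 for the favoured model, which is in line with the slide's reading of this as weak evidence.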

Evidence

[Figure: ln B12 against the number of parameters npar (4 to 6), reference model flat ΛCDM, for the models Λ curved, w0 flat, w0 curved, w(z) flat and w(z) curved under CMB, CMB+SN and CMB+SN+BAO; bands labelled inconclusive, weak, moderate and strong]

Evidence (reference model ΛCDM flat)

Posterior outcome

Posterior on dark-energy parameters w0 and w1 as 68% and 95% credible regions for WMAP (solid blue lines), WMAP+SNIa (dashed green) and WMAP+SNIa+BAO (dotted red curves). Allowed prior range as red straight lines.

[Figure: axes w0 (−1.0 to −0.4) and w1 (−0.5 to 2.0)]

PMC stability

[Figure: ln E against iteration for wCDM flat and wCDM curvature]

Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel). Log-evidence

PMC stability

[Figure: perplexity against iteration for wCDM flat and wCDM curvature]

Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel). Perplexity


lexicon

lexicon

BAO, baryon acoustic oscillations

CMB, cosmic microwave background radiation

COBE, cosmic background explorer

ΛCDM, lambda-cold dark matter

Lyα, Lyman-alpha

SNIa, type Ia supernovae

WMAP, Wilkinson microwave anisotropy probe
