on the convergence properties of the wang-landau algorithm

60
Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour On the convergence properties of the Wang-Landau Algorithm Robin J. Ryder CEREMADE, Universit´ e Paris Dauphine and CREST ENSAE – 31 January 2012 joint work with Pierre Jacob (National University of Singapore) and Pierre Del Moral (INRIA & Universit´ e de Bordeaux) R. Ryder (Dauphine) & CREST Wang-Landau 1/ 40

Upload: robin-ryder

Post on 18-Dec-2014

82 views

Category:

Documents


1 download

DESCRIPTION

Talk at ENSAE, January 2013

TRANSCRIPT

Page 1: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

On the convergence properties of theWang-Landau Algorithm

Robin J. RyderCEREMADE, Universite Paris Dauphine

and CREST

ENSAE – 31 January 2012

joint work with Pierre Jacob (National University of Singapore)and Pierre Del Moral (INRIA & Universite de Bordeaux)

R. Ryder (Dauphine) & CREST Wang-Landau 1/ 40

Page 2: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Outline

1 Wang–Landau algorithm

2 Flat Histogram in finite time

3 Parallel Wang–Landau: asymptotic behaviour

R. Ryder (Dauphine) & CREST Wang-Landau 2/ 40

Page 3: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Motivation

X

dens

ity

0.0

0.1

0.2

0.3

0.4

0.5

−5 0 5 10 15

Figure: A mixture of well-separated normal distributions.

R. Ryder (Dauphine) & CREST Wang-Landau 3/ 40

Page 4: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Motivation

X

dens

ity

0.0

0.1

0.2

0.3

0.4

0.5

−5 0 5 10 15

Figure: A mixture of well-separated normal distributions.

R. Ryder (Dauphine) & CREST Wang-Landau 4/ 40

Page 5: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Motivation

X

dens

ity

0.0

0.1

0.2

0.3

0.4

0.5

−5 0 5 10 15

Figure: A mixture of well-separated normal distributions.

R. Ryder (Dauphine) & CREST Wang-Landau 5/ 40

Page 6: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Setting

Partition the state space

X =d⋃

i=1

Xi

Desired frequencies

φ = (φ1, . . . , φd)

Penalized distribution

πθ(x) ∝ π(x)

θ(J(x))

where J(x) such that x ∈ XJ(x).

R. Ryder (Dauphine) & CREST Wang-Landau 6/ 40

Page 7: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

First algorithm

Algorithm 1 Wang-Landau with deterministic schedule (γt)

1: Init θ0 > 0, X0 ∈ X .2: for t = 1 to T do3: Sample Xt from Kθt−1(Xt−1, ·), MH kernel targeting πθt−1 .

4: Update the penalties:

log θt(i)← log θt−1(i) + γt (1IXi(Xt)− φi )

5: end for

R. Ryder (Dauphine) & CREST Wang-Landau 7/ 40

Page 8: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Video

R. Ryder (Dauphine) & CREST Wang-Landau 8/ 40

Page 9: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Video

R. Ryder (Dauphine) & CREST Wang-Landau 8/ 40

Page 10: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Flat Histogram

Issues with the first version

Choice of γ has a huge impact on the results.

Flat Histogram

Define the counters:

νt(i) :=t∑

n=1

1IXi(Xn)

Flat Histogram (FH) is reached when:

maxi∈{1,...,d}

∣∣∣∣νt(i)

t− φi

∣∣∣∣ < c

R. Ryder (Dauphine) & CREST Wang-Landau 9/ 40

Page 11: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Flat Histogram

Idea

Instead of decreasing γt at each time step t, decrease only whenthe Flat Histogram criterion is reached.

In practice

Denote by κt the number of FH criteria reached up to time t.

Use γκt instead of γt at time t.

If FH is reached at time t, reset νt(i) to 0 for all i .

R. Ryder (Dauphine) & CREST Wang-Landau 10/ 40

Page 12: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Wang–Landau with Flat Histogram

Algorithm 2 Wang-Landau with stochastic schedule (γκt )

1: Init θ0 > 0, X0 ∈ X .2: Init κ0 ← 0.3: for t = 1 to T do4: Sample Xt from Kθt−1(Xt−1, ·), MH kernel targeting πθt−1 .5: If (FH) then κt ← κt−1 + 1, otherwise κt ← κt−1.

6: Update the penalties:

log θt(i)← log θt−1(i) + γκt (1IXi(Xt)− φi )

7: end for

R. Ryder (Dauphine) & CREST Wang-Landau 11/ 40

Page 13: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

FH is met in finite time

To be sure that eventually, for any c > 0:

maxi∈{1,...,d}

∣∣∣∣νt(i)

t− φi

∣∣∣∣ < c

we want to prove:

∀i ∈ {1, . . . , d} νt(i)

tP−−−→

t→∞φi

for any fixed γ > 0.

R. Ryder (Dauphine) & CREST Wang-Landau 12/ 40

Page 14: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Various updates

Right update

log θt(i)← log θt−1(i) + γ (1IXi(Xt)− φi ) (1)

Using this update, FH is met in finite time.

Wrong update

θt(i)← θt−1(i) [1 + γ (1IXi(Xt)− φi )]

⇔log θt(i)← log θt−1(i) + log [1 + γ (1IXi

(Xt)− φi )] (2)

Here FH is not necessarily met in finite time.Special case: ∀i φi = 1

d .

R. Ryder (Dauphine) & CREST Wang-Landau 13/ 40

Page 15: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Assumptions

In this talk, there are only two bins: d = 2. Additionally:

Assumption

The bins are not empty with respect to µ and π:

∀i ∈ {1, 2} µ(Xi ) > 0 and π(Xi ) > 0

Assumption

The state space X is compact.

R. Ryder (Dauphine) & CREST Wang-Landau 14/ 40

Page 16: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Assumptions

Assumption

The proposal distribution Q(x , y) is such that:

∃qmin > 0 ∀x ∈ X ∀y ∈ X Q(x , y) > qmin

Assumption

The MH acceptance ratio is bounded from both sides:

∃m > 0 ∃M > 0 ∀x ∈ X ∀y ∈ X m <π(y)

π(x)

Q(y , x)

Q(x , y)< M

R. Ryder (Dauphine) & CREST Wang-Landau 15/ 40

Page 17: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Theorem

Theorem

Consider the sequence of penalties θt introduced in the WLalgorithm. We define:

Zt = logθt(1)

θt(2)= log θt(1)− log θt(2)

Then:Zt

t

L1−−−→t→∞

0

and consequently, with update (1) (FH) is reached in finite timefor any precision threshold c, whereas this is not guaranteed forupdate (2).

R. Ryder (Dauphine) & CREST Wang-Landau 16/ 40

Page 18: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Zt = log θt(1)(

1IXX

XX

)− log θt(2)

(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 19: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Zt = log θ0(1) + γ∑t

n=1(1IX1(Xn)− φ1)(

1IXX

XX

)− log θ0(2) + γ

∑tn=1(1IX2(Xn)− φ2)

(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 20: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Zt = γ(νt(1)− tφ1)(

1IXX

XX

)−γ(νt(2)− tφ2)

(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 21: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Zt = γ(νt(1)− tφ1)(

1IXX

XX

)−γ(t − νt(1)− t(1− φ1))

(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 22: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Zt = γ(νt(1)− tφ1)(

1IXX

XX

)+γ(νt(1)− tφ1))

(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 23: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Zt = 2γ(νt(1)− tφ1)(

1IXX

XX

)(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 24: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Ztt = 2γ

(νt(1)t − φ1

)(1IX

X

XX

)(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 25: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Ztt = 2γ

(νt(1)t − φ1

)L1−−−→

t→∞0(

1IXX

XX

)⇒ νt(1)

t

L1−−−→t→∞

φ1

(1IX

X

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 26: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Consequence

Using update (1), and starting from Z0 = 0:

Ztt = 2γ

(νt(1)t − φ1

)L1−−−→

t→∞0(

1IXX

XX

)⇒ νt(1)

t

L1−−−→t→∞

φ1

(1IX

X

XX

)Using update (2), these simplifications do not occur: νt(i)/tconverges, but not to φi in general.

R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40

Page 27: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Proof

To prove thatZt

t

L1−−−→t→∞

0,

first notice that

Zt+1 − Zt = 2γ (νt+1(1)− νt(1)− φ1)

can only take two different values, which we note +a and −b.

R. Ryder (Dauphine) & CREST Wang-Landau 18/ 40

Page 28: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Proof

Introduce Ut , the increment of Zt :

Zt+1 = Zt + Ut

Lemma

With the introduced processes Zt and Ut , there exists ε > 0 suchthat for all η > 0, there exists Zhi such that, if Zt ≥ Zhi , we havethe following two inequalities:

P[Ut+1 = −b|Ut = +a,Zt ] > ε

P[Ut+1 = −b|Ut = −b,Zt ] > 1− η.

(and the symmetric corollary)

R. Ryder (Dauphine) & CREST Wang-Landau 19/ 40

Page 29: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Proof

time

●●

Zs

Zs+T

Z~

s+T~

5 10 15 20 25 30 35

Figure: We prove that Zt returns below a given horizontal bar wheneverit goes above it, and it does so in finite time. This implies Zt/t → 0.

R. Ryder (Dauphine) & CREST Wang-Landau 20/ 40

Page 30: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Proof

Introduce a crossing time s: Zs−1 ≤ Zhi and Zs ≥ Zhi .Introduce a new process Zt :

Zs = Zs and Zt+1 = Zt + Vt

We can choose Vt such that Zt ≥ Zt and

Lemma

(Vt) is a Markov chain over the space {+a,−b} with transitionmatrix (

1− ε εη 1− η

)where the first state corresponds to +a and the second state to−b.

R. Ryder (Dauphine) & CREST Wang-Landau 21/ 40

Page 31: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Proof

time

●●

Zs

Zs+T

Z~

s+T~

5 10 15 20 25 30 35

Figure: We prove that Zt returns below a given horizontal bar wheneverit goes above it, and it does so in finite time. This implies Zt/t → 0.

R. Ryder (Dauphine) & CREST Wang-Landau 22/ 40

Page 32: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Case d = 2

This shows that Zt/t → 0, hence (FH) is met in finite time.For the case d > 2, we shall use only update 1.

R. Ryder (Dauphine) & CREST Wang-Landau 23/ 40

Page 33: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Case d > 2

To prove that (FH) is met in finite time in the case with d > 2, weneed an extra assumption:

Assumption

The desired frequencies are all rational numbers:

∀i , φi ∈ Q

R. Ryder (Dauphine) & CREST Wang-Landau 24/ 40

Page 34: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Irreducible Markov chain

Under the assumption of rationality, we have the following lemma:

Lemma

Let Θ be the following subset of Rd :

Θ = {z ∈ Rd : ∃(n1, . . . , nd) ∈ Nd zi = ni − φid∑

j=1

nj}

Then denoting by λ the product of the Lebesgue measure µ on Xand of the counting measure on Θ, (Xt , log θt) is λ-irreducible.

Proof: Bezout’s lemma.

R. Ryder (Dauphine) & CREST Wang-Landau 25/ 40

Page 35: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

A limit exists

Since (Xt , log θt) is λ-irreducible, the proportion of visits to anyλ-measurable set of X ×Θ converges to a limit in [0, 1]. Hencethe vector (ν(i)/t) converges to some limit pi . We now need toprove that this limit corresponds to (FH).

R. Ryder (Dauphine) & CREST Wang-Landau 26/ 40

Page 36: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Reductio ad absurdum

Suppose (reductio ad absurdum) that p 6= φ. Then there exist iand j such that

pi − φi < pj − φjand hence

Z j ,it = −νt(i) + νt(j) + t(φi − φj) ∼ t(−pi + φi + pj − φj)→∞

As before, we can construct Ut and Ut and show that Ut decreaseson average, which contradicts the assumption that Z j ,i

t →∞.

R. Ryder (Dauphine) & CREST Wang-Landau 27/ 40

Page 37: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Conclusion for d > 2

Conclusion: using update 1, (FH) is met in finite time for anynumber of bins d .

R. Ryder (Dauphine) & CREST Wang-Landau 28/ 40

Page 38: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Illustration

X

dens

ity

0.0

0.1

0.2

0.3

0.4

0.5

−4 −2 0 2 4

(a) Histogram of the gen-erated sample

iterations

Pro

port

ions

of v

isits

to e

ach

bin

0.0

0.2

0.4

0.6

0.8

1.0

0 1000 2000 3000 4000 5000

(b) Convergence of theproportions of visits toeach bin, using the rightupdate

iterations

Pro

port

ions

of v

isits

to e

ach

bin

0.0

0.2

0.4

0.6

0.8

1.0

0 1000 2000 3000 4000 5000

(c) Convergence of theproportions of visits toeach bin, using the wrongupdate

R. Ryder (Dauphine) & CREST Wang-Landau 29/ 40

Page 39: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Parallel chains

N chains (X(1)t , . . . ,X

(N)t ) instead of one.

targeting the same biased distribution πθt at iteration t,

sharing the same estimated bias θt at iteration t.

Old update of the penalties

log θt(i)← log θt−1(i) + γ (1IXi(Xt)− φi )

(1IX

X

XX

)

(L. Bornn, P. Jacob, P. Del Moral, A. Doucet)

R. Ryder (Dauphine) & CREST Wang-Landau 30/ 40

Page 40: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Parallel chains

N chains (X(1)t , . . . ,X

(N)t ) instead of one.

targeting the same biased distribution πθt at iteration t,

sharing the same estimated bias θt at iteration t.

New update of the penalties

log θt(i)← log θt−1(i) + γ(

1N

∑Nk=1 1IXi

(X(k)t )− φi

) (1IX

X

XX

)

(L. Bornn, P. Jacob, P. Del Moral, A. Doucet)

R. Ryder (Dauphine) & CREST Wang-Landau 30/ 40

Page 41: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Infinite number of chains

Suppose φi = 1/d . We can use either update and choose the“wrong” update

θt(i)← θt−1(i)

[1 + γ

(1

N

N∑k=1

1IXi(X

(k)t )− 1

d

)]

Note that if (X(k)t )Nk=1

iid∼ πθt then

1

N

N∑k=1

1IXi(X

(k)t )

a.s.−−−−→N→∞

∫Xi

πθt (x)dx =ψi

θt−1(i)

(∑k

ψk

θ(k)

)−1

R. Ryder (Dauphine) & CREST Wang-Landau 31/ 40

Page 42: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

This corresponds to update

θt(i)← θt−1(i)

[1 + γ

(∫Xi

πθt (x)dx − 1

d

)]

⇔ θt(i)← θt−1(i)

1 + γ

ψi

θt−1(i)

(∑k

ψk

θ(k)

)−1

− 1

d

Normalize:

θt(i)← θt(i)∑j θt(j)

Then the penalties (θt)t≥0 should converge towards ψ.

R. Ryder (Dauphine) & CREST Wang-Landau 32/ 40

Page 43: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

θt(i)←θt−1(i)

[1+γ

(ψi

θt−1(i)(∑

kψkθ(k))

−1− 1

d

)](1IXX

XX

)∑d

j=1 θt−1(j)

[1+γ

(ψj

θt−1(j)(∑

kψkθ(k))

−1− 1

d

)](1IXX

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40

Page 44: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

θt(i)←θt−1(i)

[1+γ

(ψi

θt−1(i) M−1ψ (θt−1)− 1

d

)](1IXX

XX

)∑d

j=1 θt−1(j)[1+γ

(ψj

θt−1(j) M−1ψ (θt−1)− 1

d

)](1IXX

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40

Page 45: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

θt(i)←θt−1(i)(1−γd ) + ψiγM−1

ψ (θt−1)

(1IXX

XX

)∑d

j=1 θt−1(j)(1−γd ) + ψjγM−1ψ (θt−1)

(1IXX

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40

Page 46: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

θt(i)←θt−1(i)(1−γd ) + ψiγM−1

ψ (θt−1)

(1IXX

XX

)(1−γd )+γM−1

ψ (θt−1)

(1IXX

XX

)

R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40

Page 47: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

θt(i)← θt−1(i)1− γ

d

1− γd

+γM−1ψ (θt−1)

+ ψiγM−1

ψ (θt−1)

1− γd

+γM−1ψ (θt−1)

R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40

Page 48: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

θt(i)← θt−1(i)× (1− α(θt−1)) + ψi × α(θt−1)

R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40

Page 49: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Hence the update can be written as a convex combination

θt(i)← θt−1(i)× (1− α(θt−1)) + ψi × α(θt−1)

where

α(θt−1) =γM−1

ψ (θt−1)

1− γd + γM−1

ψ (θt−1)

It can easily be shown that

∀t ≥ 0 0 < αmin ≤ α(θt) ≤ αmax < 1

provided that ∀i , θ0(i) > 0, ψ(i) > 0.

Of course the assumptions N →∞ andiid∼ sampling are strong.

What happens in practice?

R. Ryder (Dauphine) & CREST Wang-Landau 34/ 40

Page 50: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Animation

Toy example

X = {1, 2, 3, 4, 5}, π = (π1, . . . , π5)

We take the partition: Xi = {i} and launch the Wang–Landaualgorithm for N = 1, 10, 100 and γ = 1.In this case we know the value of ψi : ψi = πi , so we can computethe deterministic sequence of penalties corresponding to N =∞.

Animation

R. Ryder (Dauphine) & CREST Wang-Landau 35/ 40

Page 51: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Animation

N =∞R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40

Page 52: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Animation

N = 100R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40

Page 53: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Animation

N = 10R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40

Page 54: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Animation

N = 1R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40

Page 55: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Animation

R. Ryder (Dauphine) & CREST Wang-Landau 37/ 40

Page 56: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

New take on the Parallel Wang–Landau

Call fψ the function such that:

θt ← fψ(θt−1)

As we saw:fψ(θ) = θ(1− α(θ)) + ψα(θ))

where

α(θ) =γM−1

ψ (θ)

1− γd + γM−1

ψ (θ)

R. Ryder (Dauphine) & CREST Wang-Landau 38/ 40

Page 57: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

New take on the Parallel Wang–Landau (in progress)

In the ideal case we have

θn = fψ ◦ . . . ◦ fψ︸ ︷︷ ︸×n

(θ0) = f nψ (θ0)

If we add perturbations to ψ we also add perturbations to θ, asfollows:

θ1 = f εψ(θ0) = fψ(θ0) + V0

R. Ryder (Dauphine) & CREST Wang-Landau 39/ 40

Page 58: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

New take on the Parallel Wang–Landau (in progress)

In the ideal case we have

θn = fψ ◦ . . . ◦ fψ︸ ︷︷ ︸×n

(θ0) = f nψ (θ0)

If we add perturbations to ψ we also add perturbations to θ, asfollows:

θ1 = f εψ(θ0) = fψ(θ0) + V0

θ2 = f εψ(θ1) = f εψ(fψ(θ0) + V0) = f 2ψ (θ0) + V1

. . .

R. Ryder (Dauphine) & CREST Wang-Landau 39/ 40

Page 59: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

New take on the Parallel Wang–Landau (in progress)

In the ideal case we have

θn = fψ ◦ . . . ◦ fψ︸ ︷︷ ︸×n

(θ0) = f nψ (θ0)

If we add perturbations to ψ we also add perturbations to θ, asfollows:

θ1 = f εψ(θ0) = fψ(θ0) + V0

θ2 = f εψ(θ1) = f εψ(fψ(θ0) + V0) = f 2ψ (θ0) + V1

. . .

The properties of fψ should help control the noise terms(Vn)n≥0. . .

R. Ryder (Dauphine) & CREST Wang-Landau 39/ 40

Page 60: On the convergence properties of the Wang-Landau algorithm

Wang–Landau algorithmFlat Histogram in finite time

Parallel Wang–Landau: asymptotic behaviour

Bibliography

Atchade, Y. and Liu, J. (2010). The Wang-Landau algorithmin general state spaces: applications and convergence analysis.Statistica Sinica, 20:209–233.

Wang, F. and Landau, D. (2001). Determining the density ofstates for classical statistical models: A random walkalgorithm to produce a flat histogram. Physical Review E,64(5):56101.

An Adaptive Interacting Wang-Landau Algorithm forAutomatic Density Exploration, L. Bornn, P. Jacob, P. Del

Moral, A. Doucet, on arXiv.

The Wang-Landau algorithm reaches the Flat Histogramcriterion in finite time, P. Jacob & RR, on arXiv.

R. Ryder (Dauphine) & CREST Wang-Landau 40/ 40