
October 21, 2005 10:37 Rinton-Book10x7 book

Chapter 5

Classical and Quantum Monte Carlo

Alexei Filinov and Michael Bonitz

5.1 Classical systems and the Monte Carlo method

5.1.1 Introduction

This chapter is devoted to the computation of equilibrium (thermodynamic) properties

of classical and quantum systems. In particular, we will be interested in the situation

where the interaction between particles is so strong that it cannot be treated as a small

perturbation. For weakly coupled systems many efficient theoretical and computational

techniques do exist. However, for strongly interacting systems such as nonideal gases or

plasmas, strongly correlated electrons and so on, perturbation methods fail and alternative

approaches are needed. Among them, an extremely successful one is the Monte Carlo (MC)

method which we are going to consider in this chapter.

The Monte Carlo method has been actively developed during the second half of the 20th

century and has yielded an enormous amount of fundamental results in statistical physics

of classical systems, see e.g. [1]−[8] and references therein. The extension of this method

to quantum systems is not trivial and presently a very actively investigated field [9]−[12].

After discussion of classical thermodynamic Monte Carlo we will present an introduction

to one of the most fruitful concepts of quantum Monte Carlo - path integral Monte Carlo

(PIMC).

In the widest sense, the term “Monte Carlo simulation” denotes any simulation which
utilizes random numbers in the simulation algorithm. The name “Monte Carlo” comes
from the famous casinos in Monte Carlo: at the beginning of the 1940s the use of random
numbers was likened to the randomness of the games at the casino, and the methods
were named accordingly.

Since 1953, when N.C. Metropolis and co-workers [13] applied the Monte Carlo method
in the canonical ensemble to a 2D system of hard disks, this method has found
broad application. Today the MC method and its various modifications and extensions
(kinetic, quantum, variational, etc.) are widely used for problems in physics,
chemistry, biology, economics and stock market studies.

It is worth mentioning why computer simulation methods, and the MC method in
particular, are so attractive and popular nowadays. First, simulations can provide detailed
information on model systems which can be chosen arbitrarily. It is possible to
investigate quantities and behaviors which may be difficult or impossible to measure in an
experiment. Second, in comparison with theory, computer simulations provide an “exact”
(within the numerical error) solution of the model system where analytical theories would
require additional approximations. This is why numerical simulations are a very useful
tool which provides a bridge between theory and experiment.

Although MC methods have very broad applications in different fields of
physics and also in other sciences, here we will be interested in problems of the statistical
physics of classical and quantum systems. Our primary concern will be the calculation of
physical observables which appear as a thermal average over some degrees of freedom.
We start with a general consideration of the Monte Carlo algorithm used in
multi-dimensional integration.

5.1.2 Monte Carlo integration

5.1.2.1 Random quantities

When performing MC calculations it is always necessary to model random quantities with
given probability distributions. In practice this problem is solved in the following way:
one first develops methods to sample a quantity uniformly distributed in the interval [0, 1],
i.e. a quantity α with the probability density

$$p_\alpha(x) = \begin{cases} 1, & x \in [0, 1] \\ 0, & x \notin [0, 1], \end{cases} \tag{5.1}$$

and then performs a transformation from α to another quantity which has the desired
distribution [17, 18]. Such an approach is justified by the fact that from independent
realizations of α one can obtain a random quantity, and also a random process, with the
required distribution.

Let us consider practical examples. Suppose we are asked to calculate the integral of
the product of two functions in the region Q

$$\int_Q A(x)\, p(x)\, dx, \qquad p(x) \ge 0, \qquad \int_Q p(x)\, dx = 1. \tag{5.2}$$

As usual, the integral is converted into a sum and the integrand is evaluated at discrete
values $x_i \in Q$. But, in contrast to deterministic methods (such as the trapezoidal
rule or Gaussian quadrature), here the values $x_i$ are not determined in advance but
are generated randomly. For this we need to model a random quantity ξ ∈ Q with the
probability density p(x) and evaluate

$$\int_Q A(x)\, p(x)\, dx \approx \frac{1}{K} \sum_{i=1}^{K} A(\xi_i), \tag{5.3}$$

over K different realizations. Suppose that the distribution function of the random
quantity ξ, i.e.

$$F_\xi(x) \equiv P[\xi < x] = \int_{-\infty}^{x} p(x')\, dx', \tag{5.4}$$

is a strictly monotonic function. Then, to generate the sequence $\xi_i$ with the probability
density p(x), it is necessary to solve the equation $F_\xi(\xi) = \alpha$ (see Fig. 5.1), where α is a
random number uniformly distributed in the interval [0, 1]. Hence, ξ can be found with
the use of the inverse function, $\xi = F_\xi^{-1}(\alpha)$.

Fig. 5.1 Distribution function, Fξ(x), of the random quantity ξ (horizontal axis: random quantity x; vertical axis: distribution function; the construction Fξ(ξi) = αi is indicated).

This has a simple proof. Let us calculate the distribution function of the random
quantity $F_\xi^{-1}(\alpha)$:

$$F_{F_\xi^{-1}(\alpha)}(x) = P[F_\xi^{-1}(\alpha) < x] = P[\alpha < F_\xi(x)] = P[\xi < x] = F_\xi(x), \tag{5.5}$$

i.e. the probability density of $F_\xi^{-1}(\alpha)$ is equal to p(x). These considerations can be
generalized to the case when $F_\xi(x)$ is not strictly monotonic.

Example 1. Suppose that it is necessary to model a random quantity ξ with the probability density

$$p_\xi(x) = \begin{cases} e^{-x}, & x \ge 0 \\ 0, & x < 0; \end{cases} \tag{5.6}$$

then the distribution function of this quantity is

$$F_\xi(x) = \begin{cases} 1 - e^{-x}, & x \ge 0 \\ 0, & x < 0. \end{cases} \tag{5.7}$$

The probability to fall into the interval (−∞, 0) is equal to zero, therefore it is sufficient to model
the behavior of ξ only on the positive half-axis [0,∞). There $F_\xi(x)$ is strictly monotonic and we
can use the equation for the inverse function

$$1 - e^{-\xi} = \alpha, \tag{5.8}$$

$$\xi = -\ln(1 - \alpha). \tag{5.9}$$

Then the procedure to calculate the integral (5.3) consists of the following steps:

(1) generate a random number αi ∈ [0, 1],

(2) compute ξi = ξ(αi) from (5.9) and

(3) evaluate the integrand at the argument x = ξi.
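The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the seed and the test function A(x) = x² are assumptions made here and are not part of the original text.

```python
import math
import random

def sample_exponential(rng):
    """Draw xi with density exp(-x), x >= 0, from xi = -ln(1 - alpha), Eq. (5.9)."""
    alpha = rng.random()            # step (1): uniform in [0, 1)
    return -math.log(1.0 - alpha)   # step (2): 1 - alpha lies in (0, 1], so the log is finite

def mc_average(A, K, rng):
    """Estimate the integral of A(x) exp(-x) over [0, inf) by the sum in Eq. (5.3)."""
    return sum(A(sample_exponential(rng)) for _ in range(K)) / K   # step (3)

rng = random.Random(12345)          # arbitrary fixed seed, for reproducibility only
estimate = mc_average(lambda x: x * x, 200_000, rng)
print(estimate)                     # the exact value of the integral of x^2 exp(-x) is 2
```

The statistical error of such an estimate decreases as 1/√K, so the printed value should agree with the exact result only to about two decimal places.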

Example 2. Let us now suppose that

$$p(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-x^2/2\right);$$

then for statistical sampling of ξ we need to solve the equation

$$\alpha = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\xi} \exp\left(-x^2/2\right) dx. \tag{5.10}$$

In summary, to calculate the integral (5.2) with the MC method we have to: i) solve an
equation similar to Eq. (5.10) for each random number $\alpha_i$; ii) find the corresponding argument
$\xi_i = \xi(\alpha_i)$; iii) evaluate the function value $A(\xi_i)$ for every argument $\xi_i$ and accumulate the
sum (5.3). In some cases this procedure can be time consuming; therefore there exist other
methods for sampling probability distributions without solving an integral equation similar to
(5.10) (these methods also use α as the initial random quantity).
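As a sketch of how Eq. (5.10) can be solved in practice, the following Python fragment delegates the inversion of the Gaussian distribution function to the standard-library routine `statistics.NormalDist.inv_cdf`. The choice of this routine, the seed and the sample size are assumptions made for illustration; the text does not prescribe a particular solver.

```python
import random
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def sample_standard_normal(rng):
    """Solve alpha = F(xi) of Eq. (5.10) for xi by numerical inversion of F."""
    alpha = rng.random()
    while alpha == 0.0:             # inv_cdf is defined only for 0 < alpha < 1
        alpha = rng.random()
    return STANDARD_NORMAL.inv_cdf(alpha)

rng = random.Random(2005)           # arbitrary fixed seed
samples = [sample_standard_normal(rng) for _ in range(50_000)]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)               # should be close to 0 and 1, respectively
```

Each sample costs one numerical inversion, which illustrates why cheaper sampling schemes are preferred when many Gaussian numbers are needed.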

5.1.2.2 Statistical tests

The best way to test whether a random number generator is good enough for an application
is to run the application with two different generators and see whether they produce the
same result. Another possibility is to run the sequence of random numbers produced by the
generator through a number of tests [20], e.g. the χ² test described below. The tested
sequence has to be at least as long as the sequence of random numbers used in the application;
otherwise biases which the application may encounter cannot be found by the test.

Suppose that we have some set of numbers $\epsilon_i$ in the interval [0, 1]. We want to check
whether these numbers can be considered as a sequence of independent realizations of some
random quantity α. Test results are usually reported as a χ² measure. A χ² measure of
±2 is probably random noise, ±3 probably means the generator is biased, and ±4 almost
certainly means the generator is biased. Let us briefly describe the χ²-criterion.

Consider a set of N numbers: $\beta_1, \beta_2, \ldots, \beta_N$. Suppose that they are independent
realizations of some random quantity β(ω). Let us carry out the following procedure. Divide
the real axis into K intervals $\Delta_k$ (k = 1, . . . , K). We introduce the notation: $p_k = P[\beta \in \Delta_k]$
is the probability that the random quantity β is found in the interval $\Delta_k$, and $n_k$ is the number
of values $\beta_i$ which fall into $\Delta_k$. Let us define and calculate the quantity

$$X^2 = \sum_{k=1}^{K} \frac{(n_k - N p_k)^2}{N p_k}. \tag{5.11}$$

If we do this for different sets of $\beta_1, \ldots, \beta_N$, then the obtained values of $X^2$
will have the distribution $\chi^2_{K-1}$ (chi-square with K − 1 degrees of freedom). (Definition:
For n independent random quantities $z_1, \ldots, z_n$ having the standard normal distribution, the
distribution of the random quantity $\xi = \sum_{i=1}^{n} z_i^2$ is called the $\chi^2_n$-distribution with n
degrees of freedom.)

Let us note that, although the last statement is true only in the limiting case N → ∞,
in practice it is sufficient that $N p_k \ge 10$ for all k. Let us call the γ-level of the
distribution $\chi^2_n$ the number $\chi^2_{n,\gamma}$ defined by $P[\chi^2_n > \chi^2_{n,\gamma}] = \gamma$, i.e. the probability that a
quantity with the distribution $\chi^2_n$ is larger than $\chi^2_{n,\gamma}$ is equal to γ. Hence, we can expect that,
if our assumption about $\beta_i$ is true, $X^2$ can be larger than $\chi^2_{K-1,\gamma}$ only with the
probability γ. If this happens, the assumption is considered to be rejected at the
level γ, i.e. with the confidence (1 − γ).
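A minimal Python sketch of the statistic (5.11) follows. The function name and the example counts are assumptions chosen for illustration; the counts are made up so that the arithmetic can be checked by hand.

```python
def chi_square_statistic(counts, probs):
    """X^2 of Eq. (5.11): counts[k] = n_k, probs[k] = p_k, N = sum of all n_k."""
    N = sum(counts)
    return sum((n_k - N * p_k) ** 2 / (N * p_k) for n_k, p_k in zip(counts, probs))

# Hypothetical example: N = 100 values sorted into K = 5 equally probable intervals,
# so that N p_k = 20 >= 10 as required for the chi-square test.
counts = [18, 22, 20, 25, 15]
probs = [0.2] * 5
x2 = chi_square_statistic(counts, probs)
print(x2)   # (4 + 4 + 0 + 25 + 25) / 20 = 2.9
```

The resulting value is then compared with the tabulated γ-level for K − 1 = 4 degrees of freedom.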

Next we will consider several tests.

(a) The Kendall tests.

M.G. Kendall proposed a system of tests which allows one to determine to what degree a
given sequence of numbers is “random”. Numbers satisfying these tests are called
locally random. Let us describe these tests [17].

First, from our sequence $\epsilon_i$ we form a new sequence $b_i$ using the rule $b_i = [L\epsilon_i]$,
where [a] is the integer part of a, and L is an integer larger than one. For
example, in the case L = 10, the sequence $b_i$ is the sequence of the first decimal digits of
$\epsilon_i$.

If the sequence $\epsilon_i$ was formed from independent realizations of a random quantity α,
then the sequence $b_i$ consists of independent realizations of a random quantity which
takes the integer values 0, . . . , L − 1 with the probability p ≡ 1/L. This assumption is tested
with the Kendall tests.

1. Test of frequencies.

The test of frequencies checks the uniform distribution of the numbers 0, . . . , L − 1 in
the sequence $b_i$. To carry out the test, let us calculate the quantity

$$X^2 = \sum_{l=0}^{L-1} \frac{(\nu_l - N p)^2}{N p}, \tag{5.12}$$

where N is the total number of elements $b_i$ and $\nu_l$ is the number of $b_i$ which are equal to l.
Choosing some level γ in the χ²-criterion, we compare $X^2$ with $\chi^2_{L-1,\gamma}$ and accept or
reject the assumption of a uniform distribution at this confidence level.

It is easy to see that this test does not take into account possible correlations of the
numbers $b_i$. For example, consider L = 10 and the sequence

$$b_1, b_2, b_3, \ldots, b_{10}, b_{11}, b_{12}, \ldots = 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, \ldots \tag{5.13}$$

Then $X^2 = 0$ (provided that N is a multiple of 10) and our assumption is
accepted at all levels of the χ²-criterion.
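The test of frequencies, including the counterexample (5.13), can be sketched as follows. The helper names, the seed and the sequence lengths are assumptions made for this illustration.

```python
import random

def first_digits(eps, L):
    """b_i = [L * eps_i]: integers 0..L-1 obtained from numbers eps_i in [0, 1)."""
    return [int(L * e) for e in eps]

def frequency_x2(b, L):
    """X^2 of Eq. (5.12) for a sequence b of integers in 0..L-1, with p = 1/L."""
    N = len(b)
    expected = N / L                     # N p
    counts = [0] * L                     # counts[l] = nu_l
    for value in b:
        counts[value] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

rng = random.Random(1)                   # arbitrary fixed seed
uniform_digits = first_digits([rng.random() for _ in range(10_000)], 10)
periodic_digits = [i % 10 for i in range(1, 1001)]   # 1, 2, ..., 9, 0, 1, 2, ...

x2_uniform = frequency_x2(uniform_digits, 10)
x2_periodic = frequency_x2(periodic_digits, 10)
print(x2_uniform, x2_periodic)           # the periodic sequence gives exactly 0
```

The perfectly regular sequence yields the smallest possible X², which is why a very small X² is as suspicious as a very large one.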

2. Test of pairs.

Let N be an even number. Let us divide the sequence $b_i$ into pairs:

$$(b_1, b_2), (b_3, b_4), (b_5, b_6), \ldots \tag{5.14}$$

The probability that the pair $(b_i, b_{i+1})$ takes some definite value (l, l′) (l and l′ are integer
numbers from 0 to L − 1) is equal to $p = 1/L^2$, due to the assumed independence of the
elements $b_i$. For the same reason the pairs themselves should be independent. Let $\nu_{ll'}$
be the number of pairs which are equal to (l, l′). Then to verify our assumption we should
calculate

$$X^2 = \sum_{l,l'=0}^{L-1} \frac{(\nu_{ll'} - N p/2)^2}{N p/2}, \tag{5.15}$$

and compare the result with $\chi^2_{L^2-1,\gamma}$, where γ is the chosen confidence level.

The test of pairs takes into account pair correlations, but not correlations of higher
order, e.g. the correlations between pairs. Consider the sequence:

(0, 0), (0, 0), (0, 1), (0, 1), (0, 2), (0, 2). (5.16)

It satisfies the test of pairs at all levels of the χ²-criterion.
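A short Python sketch of the statistic (5.15) is given below. It is applied to the periodic sequence (5.13), which passes the test of frequencies but is strongly correlated; the function name and the sequence length are assumptions made for illustration.

```python
def pairs_x2(b, L):
    """X^2 of Eq. (5.15) for non-overlapping pairs (b_1,b_2), (b_3,b_4), ..."""
    n_pairs = len(b) // 2
    expected = n_pairs / L ** 2                  # N p / 2 with p = 1 / L^2
    counts = [[0] * L for _ in range(L)]         # counts[l][l'] = nu_{ll'}
    for i in range(n_pairs):
        counts[b[2 * i]][b[2 * i + 1]] += 1
    return sum((c - expected) ** 2 / expected for row in counts for c in row)

# Periodic sequence 1, 2, ..., 9, 0, 1, 2, ... of Eq. (5.13), N = 1000
periodic_digits = [i % 10 for i in range(1, 1001)]
x2_pairs = pairs_x2(periodic_digits, 10)
print(x2_pairs)   # only 5 of the 100 possible pairs ever occur
```

Here 500 pairs are distributed over just five cells with 100 entries each, while 5 are expected per cell, so X² is far above any tabulated level for 99 degrees of freedom.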

3. Test of series.

We say that the elements of the sequence $b_i$

$$b_{k+1}, \ldots, b_{k+r}$$

form a series of the length r, if

$$b_k \ne b_{k+1} = \ldots = b_{k+r} \ne b_{k+r+1}. \tag{5.17}$$

For example, the sequence

1, 1, 6, 9, 6, 6, 6, 3, 3, 0, . . .

consists of series with the lengths 2, 1, 1, 3, 2, . . ..

Let the sequence $b_i$ consist of series with the lengths $c_1, c_2, c_3, \ldots, c_n$. We want to calculate
the theoretical distribution of the series versus their length. We do it first for the first series:

$$\begin{aligned}
P[c_1 = r] &= P[b_1 = \ldots = b_r \ne b_{r+1}] = \sum_{l=0}^{L-1} P[b_1 = l, \ldots, b_r = l, b_{r+1} \ne l] \\
&= \sum_{l=0}^{L-1} P[b_1 = l] \ldots P[b_r = l]\, P[b_{r+1} \ne l] = \sum_{l=0}^{L-1} p^r (1 - p) = L p^r (1 - p) \\
&= p^{r-1} (1 - p).
\end{aligned} \tag{5.18}$$

Fig. 5.2 Distribution of the intervals ∆r on the real axis (−∞,+∞).

In an analogous way, it can be shown that for the second and all following series
$P[c_i = r] = (1 - p) p^{r-1}$.

To apply the χ² test, i.e. to compare the theoretical distribution of the series with the
empirical results, let us choose the intervals $\Delta_r$ as shown in Fig. 5.2. All series with lengths
larger than some R are grouped in the last interval $\Delta_{R+1}$. Let us denote by $n_r$ the number
of series which fall into $\Delta_r$, and by $p_r$ the probability that a series falls into the interval
$\Delta_r$ ($p_r \equiv P[c_i \in \Delta_r]$). It is obvious that

$$p_r = (1 - p) p^{r-1}, \quad r = 1, \ldots, R, \qquad p_{R+1} = \sum_{r=R+1}^{\infty} (1 - p) p^{r-1} = p^R. \tag{5.19}$$

Taking into account that for the application of the χ² test it is sufficient that $N_s p_r \ge 10$
($N_s$ is the number of realized series), and also that $p_{R+1} \le p_r$ for r = 1, . . . , R (in fact,
$p^R \le (1 - p) p^{r-1}$ for r ≤ R if p ≤ 1/2, which holds in our case since p = 1/L, L ≥ 2),
we can write down the condition for the choice of R:

$$N_s\, p^R \ge 10, \qquad R \le \frac{\log_{10} N_s - 1}{\log_{10} L}. \tag{5.20}$$

In this case the assumption is verified by comparison of $\chi^2_{R,\gamma}$ with the quantity

$$X^2 = \sum_{r=1}^{R} \frac{\left(n_r - N_s (1 - p) p^{r-1}\right)^2}{N_s (1 - p) p^{r-1}} + \frac{\left(n_{R+1} - N_s\, p^R\right)^2}{N_s\, p^R}. \tag{5.21}$$
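The test of series can be sketched in Python as follows. The function names, the seed and the choice R = 3 are assumptions made for this illustration; the unfinished run at the end of the sequence is dropped because its true length is unknown.

```python
import random

def complete_run_lengths(b):
    """Lengths c_1, c_2, ... of the series (runs of equal elements) in b."""
    lengths, current = [], 1
    for prev, cur in zip(b, b[1:]):
        if cur == prev:
            current += 1
        else:
            lengths.append(current)
            current = 1
    return lengths                       # the final, possibly truncated run is omitted

def series_x2(b, L, R):
    """X^2 of Eq. (5.21); runs longer than R are collected in the interval R + 1."""
    p = 1.0 / L
    lengths = complete_run_lengths(b)
    Ns = len(lengths)
    counts = [0] * (R + 2)               # counts[r] = n_r for r = 1..R+1
    for c in lengths:
        counts[min(c, R + 1)] += 1
    x2 = 0.0
    for r in range(1, R + 1):
        expected = Ns * (1 - p) * p ** (r - 1)
        x2 += (counts[r] - expected) ** 2 / expected
    expected_tail = Ns * p ** R
    x2 += (counts[R + 1] - expected_tail) ** 2 / expected_tail
    return x2, Ns

example = complete_run_lengths([1, 1, 6, 9, 6, 6, 6, 3, 3, 0])
print(example)                           # [2, 1, 1, 3, 2], as in the text

rng = random.Random(2024)                # arbitrary fixed seed
digits = [int(10 * rng.random()) for _ in range(100_000)]
x2_series, n_series = series_x2(digits, 10, 3)
print(x2_series, n_series)               # compare x2_series with the chi-square table for R = 3
```

With about 90 000 series and L = 10, condition (5.20) allows R ≤ 3, which motivates the value used above.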

Let us now proceed to the test of blocks [21]. This test allows one to find correlations in
the sequence $\epsilon_i$ missed by the Kendall tests.

4. Test of blocks.

Let us divide the sequence $\epsilon_i$ into blocks of the length n:

$$(\epsilon_1, \ldots, \epsilon_n), (\epsilon_{n+1}, \ldots, \epsilon_{2n}), \ldots$$

and calculate the average of each block

$$\bar{\epsilon}_i = \frac{1}{n} \sum_{j=1}^{n} \epsilon_{(i-1)n+j}, \qquad i = 1, 2, \ldots$$

We can define a new quantity $y_i$

$$y_i = \begin{cases} 1, & \bar{\epsilon}_i \ge 1/2 \\ 0, & \bar{\epsilon}_i < 1/2. \end{cases} \tag{5.22}$$

Suppose that we have $N_b$ such blocks; then we get $N_b$ independent realizations of the
random quantity y with the distribution P[y = 0] = P[y = 1] = 1/2. Let us apply to the
$y_i$ the χ²-test with one degree of freedom:

$$X^2 = \sum_{j=0}^{1} \frac{(n_j - N_b/2)^2}{N_b/2}, \tag{5.23}$$

where $n_j$ is the number of $y_i$ equal to j.

Analyzing the dependence of $X^2$ on the block length n allows one to find local
correlations in the sequence $\epsilon_i$ not found by the other tests [21].
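A compact Python sketch of the test of blocks follows. The function name, the seed and the alternating example sequence are assumptions chosen for illustration.

```python
import random

def block_x2(eps, n):
    """X^2 of Eq. (5.23) for blocks of length n taken from the sequence eps."""
    Nb = len(eps) // n
    ones = 0
    for i in range(Nb):
        block_average = sum(eps[i * n:(i + 1) * n]) / n
        if block_average >= 0.5:         # y_i = 1, Eq. (5.22)
            ones += 1
    zeros = Nb - ones
    return ((zeros - Nb / 2) ** 2 + (ones - Nb / 2) ** 2) / (Nb / 2)

# A strongly correlated sequence: 0.75, 0.25, 0.75, 0.25, ...
alternating = [0.75 if i % 2 == 0 else 0.25 for i in range(1000)]
x2_alternating = block_x2(alternating, 2)
print(x2_alternating)                    # every block average is exactly 0.5, so all y_i = 1

rng = random.Random(7)                   # arbitrary fixed seed
uniform = [rng.random() for _ in range(100_000)]
x2_uniform_blocks = block_x2(uniform, 10)
print(x2_uniform_blocks)                 # compare with the chi-square table, one degree of freedom
```

For the alternating sequence all 500 blocks give y = 1, so X² equals the number of blocks, and the correlation is detected immediately.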

In the next set of problems we consider how to apply statistical tests to random number
generators.

5.1.2.3 Practical applications of statistical tests

Test several random number generators (RNGs): those presented in the program codes
random.for and ranf.for (the program codes are given at the end of this section) and the
linear congruential method programmed by yourself (see Problem 4). Check whether all
empirical statistical tests are passed. The proposed RNGs must produce a sequence of
random numbers uniformly distributed in the interval [0, 1). A specific example which
definitely violates this requirement is the RNG which produces a Gaussian distribution
specified by a central point x0 and a variance σ (gauss.for). For this specific case investigate
how the choice of the parameters x0, σ influences the results.

The obtained empirical values of X², see Eqs. (5.24) and (5.25), must be compared with the
table of the χ² (chi-square) distribution. If X² is less than the 1% entry or greater than
the 99% entry, we reject the numbers as not sufficiently random. If X² lies between the
1% and 5% entries or between the 95% and 99% entries, the numbers are “suspect”; if X² is
between the 5% and 10% or the 90% and 95% entries, the numbers might be “almost suspect”.
The χ² test is often done at least three times on different sets of data, and if at least two
of three results are “suspect”, the numbers are regarded as not sufficiently random.

Problem 1: Frequency test.

This is a test that virtually all random number generators, good, bad, or indifferent, will pass. Divide the range of output values into a fixed number of intervals (for example K = 500) of equal size. Generate random numbers, e.g. N = 100 000, with the procedure myrandom() (in the file random.for). In this procedure one can choose one of three different RNGs, GetRand(), Ranf() or Gaussian(), by setting the variable gen_type = 1, 2, 3. Count the number of values, $n_k$, that fall in each interval. Since the intervals are of equal size, approximately the same number of values should fall in each interval. Calculate the chi-square statistic based on the actual counts, $n_k$, and the expected counts, $N p_k$, for each interval

$$X^2 = \sum_{k=1}^{K} \frac{(n_k - N p_k)^2}{N p_k}, \tag{5.24}$$

and compare it with the critical value for a chi-square random variable with K − 1 degrees of freedom and a probability of 0.95.

• Compile and run the program ./random for each of the three RNGs by changing the variable gen_type = 1, 2, 3 in the file main.for. The output is written to the files sequence.dat and statistics.dat.

• Draw a conclusion about the RNGs (which of them are not uniform).

• Find the critical values of the parameters x0 and σ at which the Gaussian RNG passes the frequency test for the uniform distribution.
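For large numbers of degrees of freedom the table values of the χ² distribution can be approximated in closed form. The Python sketch below mirrors the approximation used in the Fortran listing at the end of this section (stated there for K > 30) and applies the acceptance rule described above. The function names are assumptions, and the “almost suspect” band is omitted because the 10% and 90% points are not part of that table.

```python
import math

# Standard normal quantiles x_p used in the listing for the tabulated probabilities
X_P = {0.01: -2.33, 0.05: -1.64, 0.25: -0.675, 0.50: 0.0,
       0.75: 0.675, 0.95: 1.64, 0.99: 2.33}

def chi2_point(dof, prob):
    """Approximate chi-square table entry for a large number of degrees of freedom."""
    x = X_P[prob]
    return dof + math.sqrt(2.0 * dof) * x + 2.0 * x * x / 3.0 - 2.0 / 3.0

def classify(x2, dof):
    """Apply the rule of the text: reject outside 1%..99%, suspect outside 5%..95%."""
    if x2 < chi2_point(dof, 0.01) or x2 > chi2_point(dof, 0.99):
        return "reject"
    if x2 < chi2_point(dof, 0.05) or x2 > chi2_point(dof, 0.95):
        return "suspect"
    return "acceptable"

dof = 999                                # K - 1 for K = 1000 intervals
print(chi2_point(dof, 0.95))             # approximately 1073.4
print(classify(998.0, dof), classify(1080.0, dof), classify(1200.0, dof))
```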

Problem 2: Test of pairs.

The pairs frequency test checks the correlation between successive numbers in the “random” sequence produced by the function myrandom(). Generate N values and divide the sequence into N/2 pairs. These pairs, when considered as coordinates in a 2D space, should be uniformly distributed over the square [0, 1) × [0, 1). Divide this square into K² sub-squares by partitioning each range [0, 1) into K equal-size intervals. Count the number of pairs (coordinates) that fall in each sub-square. There should be about N/(2K²) in each. Calculate the chi-square statistic based on the actual and expected counts for each sub-square and compare it with the critical value for a chi-square random variable with (K² − 1) degrees of freedom and a probability of 0.95. Use the expression

$$X^2 = \sum_{k=1}^{K^2} \frac{(n_k - N p/2)^2}{N p/2}, \tag{5.25}$$

with the probability p = 1/K². For the pairs test use the values K = 10 and N = 1 000 000 (100 sub-squares and 5 000 pairs expected in each sub-square).

• Modify the program code given in Problem 1.

• Run the program and test the given RNGs (gen_type = 1, 2, 3).

Problem 3: Test of blocks.

Divide the generated sequence of “random” numbers $\epsilon_i$ into blocks of length n:

$$(\epsilon_1, \ldots, \epsilon_n), (\epsilon_{n+1}, \ldots, \epsilon_{2n}), \ldots$$

and calculate the average of each block

$$\bar{\epsilon}_i = \frac{1}{n} \sum_{j=1}^{n} \epsilon_{(i-1)n+j}.$$

Define the quantity $y_i$ by the following rule

$$y_i = \begin{cases} 1, & \bar{\epsilon}_i \ge 1/2 \\ 0, & \bar{\epsilon}_i < 1/2. \end{cases} \tag{5.26}$$

Suppose that we have $N_b$ such blocks; then we get $N_b$ independent realizations of the random quantity y with the distribution P[y = 0] = P[y = 1] = 1/2. Apply to $y_i$ the χ²-test with one degree of freedom:

$$X^2 = \sum_{j=0}^{1} \frac{(n_j - N_b/2)^2}{N_b/2}, \tag{5.27}$$

where $n_j$ is the number of $y_i$ which are equal to j.

Investigate the dependence of X² on the length of the blocks n = 2, 10, 100, . . . , 500 and try to find local correlations in the sequence $\epsilon_i$, i = 1, . . . , 500 000.

• Modify the program code given for Problem 1 for the present case.

• Test the given RNGs (gen_type = 1, 2, 3).

Problem 4: Generating uniform random numbers.

One of the popular random number generators used today is based on the following scheme (introduced by D.H. Lehmer in 1949). First, one needs to choose four numbers:

m, the modulus; m > 0.

a, the multiplier; 0 ≤ a < m.

c, the increment; 0 ≤ c < m.

$\epsilon_0$, the starting value; 0 ≤ $\epsilon_0$ < m.

The desired sequence of random numbers $\epsilon_i$ is then obtained using the recurrence relation

$$\epsilon_{i+1} = (a\, \epsilon_i + c) \bmod m, \qquad i \ge 0. \tag{5.28}$$

This is called a linear congruential sequence. The operation f(x) mod m denotes the remainder of the division of f(x) by m: $f(x) \bmod m \equiv f(x) - [f(x)/m]\, m$, where [. . .] denotes the integer part. For example, the sequence obtained for m = 10 and $\epsilon_0 = a = c = 7$ is 7, 6, 9, 0, 7, 6, 9, 0, . . ., where the first element is $\epsilon_0 = 7$. As this example illustrates, the congruential sequence always gets into a loop. This property is common to all sequences having the general form $\epsilon_{i+1} = f(\epsilon_i)$ (see question 2 below). The length of the repeating cycle is called the period. A useful sequence will have a relatively long period (hence large m). A convenient choice is to let m be the computer-word size, namely $m = 2^n$, where n is the number of binary digits of a computer word, e.g. n = 32 (32 bit) or n = 64 (64 bit).

Let us now investigate the choices of a, c and $\epsilon_0$ that give a period of the maximal length m. The following theorem gives the answer:

Theorem. The linear congruential sequence defined by m, a, c and $\epsilon_0$ has a period of length m if and only if

i) c is relatively prime to m;

ii) b = a − 1 is a multiple of p, for every prime p dividing m;

iii) b is a multiple of 4, if m is a multiple of 4.

The ideas used in the proof can be found e.g. in M. Greenberger, JACM 8 (1961), 383-389; Hull, Dobell, SIAM Review 4 (1962), 230-254.

It can be proved further that for $m = 2^n$ the period of maximum length m can be reached when c is odd, c mod 2 = 1, and a mod 4 = 1. RNGs of this type (with some modifications) are used in the standard libraries of the “C” language (“Watcom” and “Borland”).
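The recurrence (5.28) and the three conditions of the theorem can be sketched in Python as follows. The function names and the small modulus m = 16 used in the second example are assumptions made for illustration.

```python
from itertools import islice
from math import gcd

def lcg(m, a, c, seed):
    """Generate eps_0, eps_1, ... from eps_{i+1} = (a eps_i + c) mod m, Eq. (5.28)."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m

def period(m, a, c, seed):
    """Length of the cycle which the sequence eventually enters."""
    first_seen = {}
    for i, x in enumerate(lcg(m, a, c, seed)):
        if x in first_seen:
            return i - first_seen[x]
        first_seen[x] = i

def prime_factors(n):
    """Set of the primes dividing n (trial division, adequate for small moduli)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(m, a, c):
    """Check conditions i)-iii) of the theorem for a period of length m."""
    b = a - 1
    if gcd(c, m) != 1:
        return False
    if any(b % p != 0 for p in prime_factors(m)):
        return False
    if m % 4 == 0 and b % 4 != 0:
        return False
    return True

print(list(islice(lcg(10, 7, 7, 7), 8)))                   # [7, 6, 9, 0, 7, 6, 9, 0]
print(period(10, 7, 7, 7), has_full_period(10, 7, 7))      # 4 False
print(period(16, 5, 3, 0), has_full_period(16, 5, 3))      # 16 True
```

In the first example b = a − 1 = 6 is not a multiple of the prime 5 dividing m = 10, so condition ii) fails and the period is shorter than m.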



Problem 5: Perform the tests of frequencies, pairs and blocks for the random numbers generated using Eq. (5.28) with the parameters a = 899, c = 0, m = 32768, and with a = 16807, c = 0, $m = 2^{31} - 1$, both with $\epsilon_0 = 12$. What are the periods of the corresponding random number generators? One way to visualize the period of an RNG is to plot $\epsilon_i$ as a function of the step number i. When the period of the random number generator is reached, the plot will begin to repeat itself.

Why is the value $\epsilon_0 = 0$ forbidden for this choice of c?

Problem 6: Perform the tests of frequencies, pairs and blocks for the RNGs with the following parameters: (i) $\epsilon_0 = 47594118$, a = 23, c = 0, $m = 10^8 + 1$; (ii) $\epsilon_0 = 314159265$, $a = 2^{18} + 1$, c = 1, $m = 2^{35}$. Which RNG is not sufficiently random?

Problem 7: Propose your own choice of a, c and $\epsilon_0$ to get a maximal period of the linear congruential sequence with m = 32768 (apply the theorem stated above, given in D.E. Knuth, The Art of Computer Programming, Vol. 2: Seminumerical Algorithms. Reading, Mass., Addison-Wesley, 2000).

Questions to the theory

(1) What is the period of a sequence generated with Eq. (5.28) for the choice a = c = 1?

(2) Suppose that we want to generate a sequence of integers ($\epsilon_0, \epsilon_1, \ldots$) in the range
0 ≤ $\epsilon_i$ < m. Let f(x) be any function such that 0 ≤ x < m implies 0 ≤ f(x) < m.
Consider a sequence formed by the rule $\epsilon_{i+1} = f(\epsilon_i)$. Show that the sequence
is periodic in the sense that there exist numbers ν and µ such that the values
$\epsilon_0, \ldots, \epsilon_{\nu+\mu-1}$ are distinct, but $\epsilon_{i+\nu} = \epsilon_i$ when i ≥ µ. Find the maximum and
minimum possible values of ν and µ.

(3) Note that in Test 2 (test of pairs) we have used N random numbers in order
to make N/2 observations. Explain what error we make if we perform this test on
the pairs $(\epsilon_1, \epsilon_2), (\epsilon_2, \epsilon_3), \ldots, (\epsilon_{N-1}, \epsilon_N)$.

(4) Discuss how to generalize the test of pairs to triples, quadruples, etc.



c ======================================================================

c Block of common data used in different subroutines

c ======================================================================

      MODULE MAINDATA
c Number of elements in the test sequence
      INTEGER :: N
c Number of intervals
      INTEGER :: K
c Seed of the random number generators
      INTEGER :: Iseed
c Random sequence
      REAL*8, DIMENSION(:), ALLOCATABLE :: U
c Number of elements in each interval
      INTEGER, DIMENSION(:), ALLOCATABLE :: nk
c Length of the intervals
      REAL*8 :: dl
c Empirical X2 value
      REAL*8 :: X2
c Table values of the chi-square distribution for K>30
      REAL*8, DIMENSION(7) :: xp, chi2
      REAL*8 :: r_K, Npk
      REAL*4 :: choice
      END MODULE MAINDATA

c ======================================================================

c The program Random_Test performs the test of frequencies

c for three random number generators (see the text)

c ======================================================================

      PROGRAM Random_Test
      USE MAINDATA
      IMPLICIT NONE
      INTEGER :: i, j, iruns
      REAL*8 :: RANF
c Used random number generator
      INTEGER :: gen_type
c Choose a type of random number generator
      gen_type=3
c Seed passed to RANSET; the value is arbitrary (an assumption,
c the seed has to be set before it is used)
      Iseed=12345
c Initialize and test random number generator
      CALL RANTEST(Iseed,gen_type)
c Set initial parameters
      N=100000
      K=INT(FLOAT(N)/100)
      Write(*,*) 'Number of elements in the test sequence: ', N
      Write(*,*) 'Number of the intervals: ', K
      DO iruns=1,3
         if (iruns.EQ.1) then
            ALLOCATE(U(N))
            ALLOCATE(nk(K))
         end if
c Set initial values
         U(1:N)=0.0d0
         nk(1:K)=0
c Interval length
         dl=1.0d0/K
c Check if the condition for application
c of the chi-square test is satisfied
         if (dl*N.LT.10.) then
            write(*,*)'The condition for the chi-square test,'
            write(*,*)'i.e. dl*N >= 10, is not satisfied.'
            write(*,*)'dl*N=',dl*N
            write(*,*)'Increase N or decrease K, stop.'
            stop
         end if
c Calculate table values of the chi-square distribution
c (xp are the standard normal quantiles for
c 1%, 5%, 25%, 50%, 75%, 95% and 99%)
         xp(1)=-2.33d0
         xp(2)=-1.64d0
         xp(3)=-0.675d0
         xp(4)=0.00d0
         xp(5)=0.675d0
         xp(6)=1.64d0
         xp(7)=2.33d0
         r_K=1.0d0*(K-1)
         chi2(:)=0.0d0
         DO i=1,7
            chi2(i)=r_K+DSQRT(2.0d0*r_K)*xp(i)+2.0d0*xp(i)*xp(i)/3.d0
     &              -2.0d0/3.0d0
         END DO
c Generate the test sequence
         DO i=1,N
            CALL MYRANDOM(choice,gen_type)
            U(i)=choice
         END DO ! over i
c Calculate distribution in the intervals
         DO i=1,N
            j=INT(U(i)/dl)+1
            if (j.LE.K) then
               nk(j)=nk(j)+1
            end if
         END DO
c Form the X^2 statistics
         Npk=dl*N
         X2=0.0d0
         DO i=1,K
            X2=X2 + (FLOAT(nk(i))-Npk)**2/Npk
         END DO
         if (iruns.EQ.1) then
            Open(21,File="statistics.dat")
         end if
         write(21,*)'************************ '
         write(21,*)'X2 result: ',X2
         write(21,*)'************************ '
         write(21,*)'chi-square points 1% ',chi2(1)
c Output of the remaining table points; the labels follow
c from the quantiles xp(2),...,xp(7) set above
         write(21,*)'chi-square points 5% ',chi2(2)
         write(21,*)'chi-square points 25% ',chi2(3)
         write(21,*)'chi-square points 50% ',chi2(4)
         write(21,*)'chi-square points 75% ',chi2(5)
         write(21,*)'chi-square points 95% ',chi2(6)
         write(21,*)'chi-square points 99% ',chi2(7)
         write(*,*)'************************ '
         write(*,*)'X2 result: ',X2
      END DO ! over iruns
      Close(21)
      DEALLOCATE(U)
      DEALLOCATE(nk)
      Write(*,*) 'The chi-square test is completed.'
      Stop
      END

c=====================================================================



      SUBROUTINE MYRANDOM(choice,gen_type)
      REAL*4 :: choice
      INTEGER :: Iseed, gen_type
      DOUBLE PRECISION RANF
      if (gen_type.EQ.1) then
         choice = GetRand()
         GOTO 100
      end if
      if (gen_type.EQ.2) then
         choice = RANF(Iseed)
         GOTO 100
      end if
c This is a Gaussian distribution, centered at x=0.5;
c its width is set by the factor 0.40 below.
c Values outside [0,1] are rejected and drawn again.
      if (gen_type.EQ.3) then
    1    call Gaussian(choice)
         choice=0.50d0+0.40d0*choice
         if ((choice.LT.0).OR.(choice.GT.1.)) goto 1
         GOTO 100
      end if
      write(*,*)'Unknown type of rand. number generator, stop.'
      write(*,*)'Possible values are: 1,2,3.'
      stop
  100 CONTINUE
      RETURN
      END SUBROUTINE MYRANDOM

c=====================================================================

c Subroutine Gaussian returns a normal distribution with the
c mean value equal to zero and the variance equal to one,
c using RANF(Iseed) as the source of uniform deviates.
c======================================================================
      SUBROUTINE Gaussian(gauss)
      implicit none
      real*4 gauss
      integer i
      integer Iseed
      REAL*8 RANF
c Local variables
      integer iset
      real*4 gset, gset1, v1, v2, r, fac, gauss1
      real*4 choice
      COMMON /ABSs/ CHOICE
      save iset,gset1
      data iset/0/
      if (iset.eq.0) then
c Pick a point (v1,v2) uniformly inside the unit circle and
c transform it into two independent normal deviates
    1    v1 = 2.*RANF(Iseed) - 1.
         v2 = 2.*RANF(Iseed) - 1.
         r = v1**2 + v2**2
         if ((r.ge.1.).or.(r.eq.0.)) go to 1
         fac = sqrt(-2.*log(r)/r)
         gset = v1*fac
         gset1 = v2*fac
c Return the first deviate and keep the second for the next call
         gauss = gset
         iset = 1
      else
         gauss = gset1
         iset = 0
      endif
      return
      END

c---------------------------------------------------------------------

c Function RANF is a random number generator, fast and rough, machine

c independent. Returns an uniformly distributed deviate in the 0 to 1

c interval. This random number generator is portable,

c machine-independent and reproducible, for any machine with at least

c 32 bits / real number. REF: Press, Flannery, Teukolsky, Vetterling,

c Numerical Recipes (1986), Ref. [19]

c---------------------------------------------------------------------

      FUNCTION RANF(Idum)
c Idum (input): can be used as seed (not used in the present
c random number generator).
      IMPLICIT NONE
      INTEGER Idum
      DOUBLE PRECISION RANF, RCARRY
      RANF = RCARRY()
      RETURN
      END
c---------------------------------------------------------------------
      FUNCTION RANDX(Iseed)
      IMPLICIT NONE
      INTEGER IA, IC, Iseed, M1
      DOUBLE PRECISION RANDX, RM
      PARAMETER (M1=714025, IA=1366, IC=150889, RM=1.D+0/M1)
c
      Iseed = MOD(IA*Iseed+IC, M1)
      RANDX = Iseed*RM
      IF (RANDX.LT.0.D+0) THEN
         STOP '*** Random number is negative ***'
      END IF
c
      RETURN
      END
c---------------------------------------------------------------------
c Initializes random number generator
c---------------------------------------------------------------------
      SUBROUTINE RANSET(Iseed)
      IMPLICIT NONE
      INTEGER Iseed
      CALL RSTART(Iseed)
      RETURN
      END
c---------------------------------------------------------------------
c Initialize Marsaglia list of 24 random numbers.
c---------------------------------------------------------------------
      SUBROUTINE RSTART(Iseeda)
      IMPLICIT NONE
      DOUBLE PRECISION CARRY, ran, RANDX, SEED
      INTEGER i, I24, ISEED, Iseeda, J24
      COMMON /RANDOM/ SEED(24), CARRY, I24, J24, ISEED
      I24 = 24
      J24 = 10
      CARRY = 0.D+0
      ISEED = Iseeda
c get rid of initial correlations in rand by throwing
c away the first 100 random numbers generated.
      DO i = 1, 100
         ran = RANDX(ISEED)
      END DO
c initialize the 24 elements of seed
      DO i = 1, 24
         SEED(i) = RANDX(ISEED)
      END DO
      RETURN
      END
c---------------------------------------------------------------------
c Random number generator from Marsaglia.
c---------------------------------------------------------------------
      FUNCTION RCARRY()
      IMPLICIT NONE
      DOUBLE PRECISION CARRY, RCARRY, SEED, TWOm24, TWOp24, uni
      INTEGER I24, ISEED, J24
      PARAMETER (TWOp24=16777216.D+0, TWOm24=1.D+0/TWOp24)
      COMMON /RANDOM/ SEED(24), CARRY, I24, J24, ISEED
c
c F. James, Comp. Phys. Comm. 60, 329 (1990)
c algorithm by G. Marsaglia and A. Zaman
c base b = 2**24 lags r=24 and s=10
c
      uni = SEED(I24) - SEED(J24) - CARRY
      IF (uni.LT.0.D+0) THEN
         uni = uni + 1.D+0
         CARRY = TWOm24
      ELSE
         CARRY = 0.D+0
      END IF
      SEED(I24) = uni
      I24 = I24 - 1
      IF (I24.EQ.0) I24 = 24
      J24 = J24 - 1
      IF (J24.EQ.0) J24 = 24
      RCARRY = uni
      RETURN
      END

c---------------------------------------------------------------------

      REAL*4 FUNCTION GetRand()
      COMMON /GetRandCOMMON/U(97),C,CD,CM,i97,j97
      REAL*4 UNI
      UNI = U( i97 ) - U( j97 )
      IF ( UNI .LT. 0. ) UNI = UNI + 1.
      U( i97 ) = UNI
      i97 = i97 - 1
      IF ( i97 .LT. 1 ) i97 = 97
      j97 = j97 - 1
      IF ( j97 .LT. 1 ) j97 = 97
      C = C - CD
      IF ( C .LT. 0. ) C = C + CM
      UNI = UNI - C
      IF ( UNI .LT. 0. ) UNI = UNI + 1.
      GetRand = UNI
      RETURN
      END
c---------------------------------------------------------------------
      SUBROUTINE InitGetRand( ij, kl )
      COMMON /GetRandCOMMON/U(97),C,CD,CM,i97,j97
      i = mod(ij/177, 177) + 2
      j = mod(ij , 177) + 2
      k = mod(kl/169, 178) + 1
      l = mod(kl , 169)
      DO ii = 1, 97
         s = 0.0
         t = 0.5
         DO jj = 1, 24
            m = mod( mod(i*j,179)*k , 179 )
            i = j
            j = k
            k = m
            l = mod( 53*l+1 , 169 )
            IF ( mod(l*m,64) .GE. 32 ) s = s + t
            t = 0.5 * t
         END DO
         u(ii) = s
      END DO
      c = ( 362436.0 / 16777216.0)
      cd = ( 7654321.0 / 16777216.0)
      cm = (16777213.0 / 16777216.0)
      i97 = 97
      j97 = 33
      RETURN
      END

c ======================================================================

c Test and initialize the random number generators

c ======================================================================

SUBROUTINE RANTEST(Iseed,gen_type)

IMPLICIT NONE

INTEGER Iseed, i

INTEGER :: gen_type

DOUBLE PRECISION RANF

REAL*4 :: choice

c Initialization of the random number generators

CALL InitGetRand(1802,9373)

CALL RANSET(Iseed)


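c Accept only the supported generator types 1, 2, 3

IF (gen_type.EQ.1) THEN

GOTO 100

END IF

IF (gen_type.EQ.2) THEN

GOTO 100

END IF

IF (gen_type.EQ.3) THEN

GOTO 100

END IF

write(*,*)'Unknown type of rand. number generator, stop.'

write(*,*)'Possible values are: 1, 2, 3.'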

stop

100 CONTINUE

PRINT *, ' ******** test random numbers ***********'

Open(22,File="sequence.dat")

DO i = 1, 200

c Draw the next random number (MYRANDOM as used in SPHERE_VOLUME)

CALL MYRANDOM(choice)

PRINT *, ' i,ranf() ', i,choice

write(22,*)i,choice

END DO



CLOSE(22)

RETURN

END

5.1.2.4 Monte Carlo integration

To understand the advantage of MC integration, e.g. for expressions similar to the integral (5.61), it is instructive to recall how numerical integration works. Consider the 1-D case, where we want to integrate a function f(x) over the interval [a, b]. The simplest approach is a direct summation over n equidistant points separated by the interval ∆x,

$$ I^{1d} = \int_a^b f(x)\,dx \approx \sum_{i=1}^{n} f(x_i)\,\Delta x, \tag{5.29} $$

where

$$ x_i = a + (i - 0.5)\,\Delta x \quad\text{and}\quad \Delta x = \frac{b-a}{n}. \tag{5.30} $$

We take values of f(x) from the midpoint of each interval. One can improve the accuracy

by using the trapezoidal or Simpson’s method, but this is not essential for our present

considerations.
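As an illustration of Eqs. (5.29) and (5.30), the midpoint summation can be sketched in a few lines of Python. This is a minimal sketch, not part of the chapter's Fortran codes; the function name `midpoint_rule` and the test integrand sin(x) are chosen here only for illustration.

```python
import math

def midpoint_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n midpoint samples, Eq. (5.29)."""
    dx = (b - a) / n                       # Eq. (5.30): width of each sub-interval
    # x_i = a + (i - 0.5) * dx is the midpoint of the i-th sub-interval
    return sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1)) * dx

if __name__ == "__main__":
    # The integral of sin(x) over [0, pi] is exactly 2.
    print(midpoint_rule(math.sin, 0.0, math.pi, 100))
```

The cost is n function evaluations in one dimension; the same construction on a grid in M dimensions needs the product of the point counts of all dimensions.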

In M-dimensional space, the generalization of Eq. (5.29) to the region ([a_1, b_1], . . . , [a_M, b_M]) takes the form

$$ I^{Md} = \frac{(b_1-a_1)(b_2-a_2)\cdots(b_M-a_M)}{n_1 n_2 \cdots n_M} \sum_{i_1=1}^{n_1}\sum_{i_2=1}^{n_2}\cdots\sum_{i_M=1}^{n_M} f(\mathbf{x}_i), \tag{5.31} $$

where $\mathbf{x}_i = (x_{i_1}, x_{i_2}, \ldots, x_{i_M})$ is an M-dimensional vector and $n_k$ is the number of integration points along the k-th dimension.

(a) Straightforward sampling.

The sampling methods of MC integration are similar to the summation rules discussed above. Instead of evaluating the function in the middle of regular intervals ∆x [see Eq. (5.29)], we now sample random points x_i with some probability density p(x) and then take the average. For example, if we pick K points x_i in the interval [a, b], the integral becomes

$$ I^{1d} = \int_a^b f(x)\,dx \equiv \int_a^b \frac{f(x)}{p(x)}\,p(x)\,dx \approx \frac{1}{K}\sum_{i=1}^{K}\frac{f(x_i)}{p(x_i)}, \tag{5.32} $$

where the probability of sampling a point in the interval [x, x + dx] is given by p(x)dx. Note that in this case the original function f(x) must be divided by the compensating factor p(x), see Sec. (d).



If the probability density is uniform, p(x) = 1/(b − a), the above expression reduces to

$$ I^{1d} = \frac{b-a}{K}\sum_{i=1}^{K} f(x_i). \tag{5.33} $$

As one can see, this is similar to Eq. (5.29): in both cases the integral is approximated by the length of the interval multiplied by the average of K function values. The difference between Eqs. (5.33) and (5.29) lies in how the points x_i are chosen: on a regular grid or randomly. In Sec. (d) we will show that a reasonable choice of p(x) can substantially reduce the error of the numerical integration without increasing the number K of sampled points.
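The uniform-sampling estimate of Eq. (5.33) can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function name `mc_uniform`, the seed and the test integrand sin(x) are illustrative choices, and Python's built-in generator stands in for the generators listed in Sec. 5.1.2.3.

```python
import math
import random

def mc_uniform(f, a, b, k, rng):
    """Estimate the integral of f over [a, b] from k uniform random points, Eq. (5.33)."""
    total = 0.0
    for _ in range(k):
        x = a + (b - a) * rng.random()    # uniform point in [a, b)
        total += f(x)
    return (b - a) * total / k

if __name__ == "__main__":
    rng = random.Random(12345)            # fixed seed, chosen arbitrarily
    # The exact value of the integral of sin(x) over [0, pi] is 2.
    print(mc_uniform(math.sin, 0.0, math.pi, 100_000, rng))
```

Repeating the run with different seeds gives estimates that scatter around the exact value; the size of this scatter is the subject of Par. (b).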

Now we want to generalize the procedure to M-dimensional integration. In M dimensions we pick a vector

$$ \mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{iM}) $$

at random in the region ([a_1, b_1], [a_2, b_2], . . . , [a_M, b_M]). This can be done by using uniformly distributed random numbers for each dimension. Having chosen K such points (i.e. K sets of vector components), the MC estimate of the M-dimensional integral can be written as

$$ I^{Md} \approx \frac{(b_1-a_1)(b_2-a_2)\cdots(b_M-a_M)}{K}\sum_{i=1}^{K} f(\mathbf{x}_i). \tag{5.34} $$

Comparing the sums (5.31) and (5.34) we notice a crucial difference. Direct numerical integration, Eq. (5.31), requires M nested sums (one for each dimension), whereas MC integration requires only one. This explains why MC integration is so important and advantageous in many dimensions (see the example below). In 1-D there is no major difference between the two methods, but with increasing dimensionality the nested summations in Eq. (5.31) become increasingly time consuming, because the total number of grid points n_1 n_2 . . . n_M grows exponentially with M. The MC sum (5.34) is clearly simpler, and we can use the same number of sampled points K for any high-dimensional integration problem without a significant influence of the discretization on the integration error.

Example: Volume of a hypersphere

We now consider how MC integration works in practice. A simple example is the calculation of the volume of a 3D sphere of radius r. In this case, the region of integration is a circular area in the xy plane and the volume is given by

$$ V = 2\int_{x^2+y^2\le r^2} dx\,dy\; z(x,y). \tag{5.35} $$

Here we have used the fact that the volume can be expressed as a sum of the volumes of parallelepipeds with the base dx dy and different heights z(x, y). For a 3D sphere the height is the distance from the plane z = 0 to the surface of the sphere, i.e.

$$ r^2 = x^2+y^2+z^2 \;\Rightarrow\; f(x,y) \equiv z(x,y) = \sqrt{r^2-(x^2+y^2)}. \tag{5.36} $$



The factor 2 in Eq. (5.35) takes into account that we have considered only the positive half-space z ≥ 0.

We now evaluate the integral in Eq. (5.35) with two methods. First, we use the midpoint approximation, i.e.

$$ V \approx 2\sum_{i=1}^{K_x}\sum_{j=1}^{K_y} f(x_i,y_j)\,\Theta(x_i,y_j)\,\Delta x\,\Delta y, \tag{5.37} $$

where the points x_i, y_j are chosen equidistantly on the interval [−r, r] with the steps ∆x, ∆y, and Θ is the Theta function, which equals 1 if $x_i^2+y_j^2\le r^2$ and is zero otherwise.

The simple MC integration procedure is as follows. First, we select points (x_i, y_i) randomly in the square [−r, r] × [−r, r] and reject those which are outside the circle of radius r. Then we do the MC summation over the points which are inside the circle,

$$ V^{3d} \approx \frac{2S}{K_r}\sum_{i=1}^{K_r} f(x_i,y_i). \tag{5.38} $$

Here, the factor S is the area of the circle, S = πr², and it takes into account that the K_r accepted points have a uniform probability density p(x, y) = 1/(πr²).

Below we give a Fortran code which illustrates this example. The random number generator ranf() is taken from the standard Numerical Recipes [19]. The code is self-explanatory.

PROGRAM SPHERE_VOLUME

REAL*8, PARAMETER :: Pi=3.14159265358979d0

INTEGER :: int_points, points_inside

REAL*8 :: x, y, z, r, r2, r3, sq, f, fsum, fmean, Integ

REAL*4:: choice

int_points=10000

r=1.0d0

r2=r*r

r3=r2*r

fsum=0.0d0

points_inside=0

DO i=1,int_points

c Sample x, y randomly in the range [-r,r]

CALL MYRANDOM(choice)

x=(2.0*choice-1.0)*r

c Draw a second, independent random number for y

CALL MYRANDOM(choice)

y=(2.0*choice-1.0)*r

c Evaluate function f(x,y)

sq=x*x+y*y

if (sq.LT.r2) then

c Take into account only points which are inside the 2D circle

c of the radius r

f=DSQRT(r2-sq)

fsum=fsum+f

points_inside=points_inside+1

end if



END DO ! over i

if (points_inside.EQ.0) then

write(*,*)'No points inside. Increase int_points. Stop.'

stop

end if

c MC estimate of <f>

fmean=fsum/points_inside

c Actual integral: 2 S <f>; the area S=pi*r^2

Integ=2.0d0*Pi*r2*fmean

write(*,*)'Points inside, hit ratio',points_inside,

& (1.0*points_inside/int_points)

write(*,*)'Sphere volume is: ',Integ

c Now print the exact value 4*Pi/3

write(*,*)'Exact result: ',4.0d0*Pi*r3/3.0d0

stop

END PROGRAM SPHERE_VOLUME

Running the code gives the MC estimate of the sphere volume which is quite close to the

exact answer 4π/3 = 4.18879020. We will see below how the uncertainty of MC calculations can

be estimated (Par. (b)).

The generalization of this code to the calculation of the volume of an M-dimensional hypersphere, defined by the condition $\sum_{j=1}^{M} x_j^2 = (\mathbf{x}\cdot\mathbf{x}) \le r^2$, is not difficult. Here we can make use of the Theta function introduced in Eq. (5.37) and rewrite the estimate for the volume of the hypersphere as

$$ V^{Md} \approx \frac{V}{K}\sum_{i=1}^{K} f(\mathbf{x}_i)\,\Theta(\mathbf{x}_i), \tag{5.39} $$

where V = (2r)^M is the volume of the hypercube from which we sample the M-dimensional vectors x = (x_1, x_2, . . . , x_M), the Theta function Θ(x) = 1 if (x · x) ≤ r² and is zero otherwise, and f(x) is the density, which equals 1 for the calculation of the volume. In this case the whole method reduces to the Hit-or-Miss Monte Carlo (see Sec. 5.1.3).
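The hit-or-miss estimate of Eq. (5.39) with f = 1 can be sketched in Python as follows. This is a minimal sketch, not the code used for the results quoted in this section; the function names are illustrative, and Python's built-in generator replaces the chapter's Fortran generators.

```python
import math
import random

def hypersphere_volume_mc(m, r, k, rng):
    """Hit-or-miss estimate of the volume of an m-dimensional sphere, Eq. (5.39) with f = 1.

    Returns the volume estimate and the hit ratio.
    """
    hits = 0
    for _ in range(k):
        # squared length of a uniform random point in the cube [-r, r]^m
        sq = sum((r * (2.0 * rng.random() - 1.0)) ** 2 for _ in range(m))
        if sq <= r * r:                    # Theta function of Eq. (5.39)
            hits += 1
    cube_volume = (2.0 * r) ** m           # V = (2r)^M
    return cube_volume * hits / k, hits / k

def hypersphere_volume_exact(m, r):
    """Analytical volume pi^(M/2) r^M / Gamma(M/2 + 1)."""
    return math.pi ** (m / 2) * r ** m / math.gamma(m / 2 + 1)

if __name__ == "__main__":
    rng = random.Random(1)                 # fixed seed, chosen arbitrarily
    for m in (2, 3, 4):
        volume, ratio = hypersphere_volume_mc(m, 1.0, 100_000, rng)
        print(m, volume, ratio, hypersphere_volume_exact(m, 1.0))
```

Only the inner sum over the m coordinates depends on the dimension, so the cost per sampled point grows linearly with m rather than exponentially.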

It is interesting to compare the efficiency of direct numerical integration with the MC estimate, Eq. (5.39). For the standard numerical quadrature methods, the error (i.e. the difference between the estimate and the exact answer) behaves as $K^{-l/M}$, where K is the number of sampled points, M is the dimensionality and l is the order of the method. For example, for the trapezoidal rule the error is proportional to $K^{-2/M}$, for Simpson's rule to $K^{-4/M}$, and for the higher order Gauss rules to $K^{-(2m-1)/M}$. In contrast, the error of MC integration is independent of the dimensionality and decreases as the inverse square root of K, i.e. as $K^{-1/2}$. Hence, for low dimensions the direct integration methods have a great advantage over the statistical methods, though Monte Carlo algorithms are always easier to implement. The dimension at which the MC method becomes advantageous depends on the quadrature method used: MC converges faster than the trapezoidal rule when M ≥ 5, faster than Simpson's rule when M ≥ 9, etc.
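The crossover dimensions quoted above follow from comparing the two exponents. As a short check (assuming K > 1 and a quadrature error of order l, as in the text):

```latex
K^{-1/2} < K^{-l/M}
\;\Longleftrightarrow\; \frac{1}{2} > \frac{l}{M}
\;\Longleftrightarrow\; M > 2l ,
\qquad\text{so}\quad l = 2 \;\Rightarrow\; M \ge 5, \qquad l = 4 \;\Rightarrow\; M \ge 9 .
```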



M    Quadrature               MC                     Hit ratio     Correct
     time (s)      result     time (s)    result
2    0.00          3.1296     0.07        3.1406     0.78          3.1415
3    1.0 · 10^-4   4.2071     0.09        4.1907     0.52          4.1887
4    1.2 · 10^-3   4.9657     0.12        4.9268     0.31          4.9348
5    0.03          5.2863     0.14        5.2710     0.16          5.2637
6    0.62          5.2012     0.17        5.1721     0.08          5.1677
7    14.9          4.7650     0.19        4.7182     3.7 · 10^-2   4.7247
8    369           4.0919     0.22        4.0724     1.6 · 10^-2   4.0587

Table 5.1

To illustrate this, one can estimate the volume of the M-dimensional sphere using the trapezoidal rule and MC sampling. The number of intervals per dimension in the direct numerical integration was chosen to be 25, and the number of attempts in the MC simulation was always $10^6$. This gives results with an accuracy of about 0.5 percent.

The results are given in Table 5.1. The first column gives the number of dimensions M, the next two columns the execution time and the result of the direct numerical integration, and the following two columns the corresponding data for the MC method. The 6th column shows the hit ratio, i.e. the ratio of the number of points which fall inside the sphere to the total number of points, and the last column contains the correct answer (known analytically), i.e. $\pi^{M/2} r^M/\Gamma(M/2 + 1)$. The times are in seconds.

As expected, for M < 6 the trapezoidal rule is faster, but beyond that it becomes prohibitively slow. Most interesting is that the time required by the MC method hardly increases at all, while the accuracy stays the same. This is what makes Monte Carlo so attractive for high-dimensional integration. It is worth noting, however, the behavior of the hit ratio given in the 6th column: it decreases by roughly a factor of two each time the spatial dimension is increased by one. For high dimensions, e.g. M = 7, 8, only a few percent of all points fall inside the sphere. In the general case, when the function f(x) in Eq. (5.39) has a complicated dependence on its variable x, these few points inside the hypersphere will give only a very rough estimate of the actual value of the integral, which results in a dramatic loss of efficiency of the whole procedure. Hence, one should use more “clever” methods for high-dimensional integration. This strategy will be discussed in Sec. (d).

(b) Error of MC integration.

For the straightforward sampling the questions that arise immediately are: Does the method converge? What is the error of the MC integration for a given number of sampled points K?

The convergence of the method is guaranteed by the following theorem:

Theorem 1 (A.N. Kolmogorov) For the average value of independent realizations f(x_i) of a random quantity f(x), i.e. $\frac{1}{K}\sum_{i=1}^{K} f(\vec{x}_i)$, to converge with unit probability to its mathematical expectation $M f(\vec{x})$, it is necessary and sufficient that this expectation exists.



Fig. 5.3 Fluctuating line: pressure of the Lennard–Jones fluid vs. number of MC steps K (horizontal axis: MC steps; vertical axis: pressure). Horizontal line: the average value of the pressure. σ shows the evaluated value of the variance.

The behavior of the error of the method is given by the central limit theorem:

Theorem 2 (Central Limit Theorem)

Let x_1, x_2, . . . , x_K be random points selected according to a probability density p(x) (e.g. p(x) = 1/(b − a) for the straightforward sampling with x ∈ [a, b]). The following conditions are assumed to be satisfied:

$$ \int_{-\infty}^{+\infty} p(x)\,dx = 1, \quad\text{and}\quad I = \int_{-\infty}^{+\infty} f(x)\,p(x)\,dx = M f(x), \tag{5.40} $$

i.e. the probability density p(x) is normalized and the integral of the function f(x) converges. Let σ² be the variance of the function f(x), i.e.

$$ \sigma^2 = M\left(f(x) - I\right)^2 = \int_{-\infty}^{+\infty} f^2(x)\,p(x)\,dx - I^2. \tag{5.41} $$

Then the following expression holds

$$ P\left( \left| \frac{1}{K}\sum_{i=1}^{K} f(x_i) - I \right| \le \frac{\epsilon\sigma}{\sqrt{K}} \right) = \frac{1}{\sqrt{2\pi}} \int_{-\epsilon}^{+\epsilon} \exp(-x^2/2)\,dx + O(1/\sqrt{K}). \tag{5.42} $$

This is the central limit theorem, which describes the convergence of $\frac{1}{K}\sum_{i=1}^{K} f(x_i)$ to the true average M f(x). During the sampling of random points $\vec{x}_i$, the approximate numerical estimate of the integral, $M_K f = \frac{1}{K}\sum_{i=1}^{K} f(\vec{x}_i)$, exhibits statistical fluctuations around the exact value I (see Fig. 5.3 for an illustration). But, as the theorem states, for a given confidence interval the error bound is proportional to σ and inversely proportional to the square root of K. The estimated value of the integral I can be written as

$$ I = \frac{1}{K}\sum_{i=1}^{K} f(\vec{x}_i) \pm \frac{\sigma}{\sqrt{K}}, \tag{5.43} $$

with the dispersion σ² evaluated from the obtained realizations f(x_i):

$$ \sigma^2 \simeq \frac{1}{K-1}\sum_{i=1}^{K}\left( f(\vec{x}_i) - \frac{1}{K}\sum_{j=1}^{K} f(\vec{x}_j) \right)^2. \tag{5.44} $$

For practical purposes, the convergence of the straightforward sampling is too slow, because the uncertainty of the result decreases only as the inverse square root of the number of sampled points K. However, the error also depends on a second parameter, the variance σ², and reducing it lowers the statistical error at fixed K. This will be discussed in Par. (d).
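The error estimate of Eqs. (5.43) and (5.44) can be evaluated alongside the integral itself. The following is a minimal Python sketch for uniform sampling on [a, b]; the function name `mc_with_error` and the test integrand exp(x) are illustrative assumptions. The random quantity whose mean and dispersion are computed is (b − a) f(x), so that its average is directly the integral.

```python
import math
import random

def mc_with_error(f, a, b, k, rng):
    """Return the uniform-sampling estimate of the integral and its one-sigma error."""
    # realizations of the random quantity (b - a) * f(x), with x uniform in [a, b)
    values = [(b - a) * f(a + (b - a) * rng.random()) for _ in range(k)]
    mean = sum(values) / k                                  # central value of Eq. (5.43)
    var = sum((v - mean) ** 2 for v in values) / (k - 1)    # dispersion, Eq. (5.44)
    return mean, math.sqrt(var / k)                         # error sigma / sqrt(K)

if __name__ == "__main__":
    rng = random.Random(3)                                  # fixed seed, chosen arbitrarily
    # The exact value of the integral of exp(x) over [0, 1] is e - 1.
    for k in (1_000, 10_000, 100_000):
        print(k, mc_with_error(math.exp, 0.0, 1.0, k, rng))
```

Increasing K by a factor of 100 should reduce the reported error by roughly a factor of 10, in line with the K^{-1/2} behavior.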

(c) Efficiency criteria of the MC method.

Let us introduce a measure of the efficiency of the MC method. Let t be the computation time of one realization of the random quantity $f(\vec{x})$ on a PC. The efficiency of the method can be defined as

$$ A_{\rm eff} \equiv \frac{1}{\sigma^2 t}, \tag{5.45} $$

where σ² is the dispersion of the quantity $f(\vec{x})$. The motivation for introducing such a quantity is the following. Consider two MC algorithms with different efficiencies A_eff1 and A_eff2. After completing K samplings, the order of magnitude of the errors can be estimated as

$$ \delta_1 = \sqrt{\frac{\sigma_1^2}{K}} = \sqrt{\frac{\sigma_1^2 t_1}{K t_1}} = \sqrt{\frac{1}{A_{\rm eff1}\,T_1}}, \quad\text{and}\quad \delta_2 = \sqrt{\frac{1}{A_{\rm eff2}\,T_2}}, $$

where $T_{1,2} = K t_{1,2}$ is the full simulation time in each case. From this it follows that to achieve equal accuracy, different full simulation times are required: T_1/T_2 = A_eff2/A_eff1. Hence, to increase the efficiency of the calculations, or equivalently to decrease the time needed to achieve some predefined accuracy, one can follow two strategies:

• decrease the time cost of one MC step and of the calculation of all quantities of interest.

• decrease the dispersion σ² of the measured random quantity $f(\vec{x})$.

A decrease of the time cost of the calculations is usually achieved by developing refined algorithms and program codes, by using optimizing compilers etc., for which the user often has only limited possibilities. Therefore, it is important to develop methods which reduce the dispersion of the measured quantities.



Fig. 5.4 Different possibilities to choose the probability density $p(\vec{x})$ for the sampled points $\vec{x}_i$ used for the evaluation of the integral $I = \int_a^b f(x)\,dx$: p_1(x) = 1/(b − a) corresponds to the straightforward sampling and p_2(x), a Gaussian centered at x_0, to the importance sampling.

(d) Importance sampling.

In the straightforward sampling all points are chosen uniformly, and the probability density of the MC sampling has no connection with the features of the function $f(\vec{x})$. The estimate of the integral will be accurate only if the function itself is nearly uniform. In the opposite case, if the function has a peak in some narrow spatial region (see Fig. 5.4), the uncertainty of the MC integration will be large. Since the error of the MC integration behaves as ∝ σ/√K, it is clear that reducing the variance σ² reduces the error for the same K. For this reason it is more efficient to sample the function at the points from which the main contribution to the integral comes. To this end, the integral over the function $f(\vec{x})$ can be identically rewritten as

$$ I = \int_Q f(\vec{x})\,d\vec{x} = \int_Q \frac{f(\vec{x})}{p(\vec{x})}\,p(\vec{x})\,d\vec{x}, \tag{5.46} $$

where $p(\vec{x})$ is an arbitrary probability density (nonzero wherever $f(\vec{x}) \neq 0$).

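Now Eq. (5.46) can be considered as the expectation value of a new function $f(\vec{x})/p(\vec{x})$. The estimate of this average and its dispersion are given by

$$ M\left[\frac{f(x)}{p(x)}\right] \approx \frac{1}{K}\sum_{i=1}^{K}\frac{f(x_i)}{p(x_i)}, \tag{5.47} $$

$$ \sigma^2[f/p] = M\left[\left(f(x)/p(x)\right)^2\right] - \left(M\left[f(x)/p(x)\right]\right)^2, \tag{5.48} $$

with the result of the integration given in the form

$$ I \approx \frac{1}{K}\sum_{i=1}^{K}\frac{f(x_i)}{p(x_i)} \pm \sqrt{\frac{\sigma^2[f/p]}{K}}. \tag{5.49} $$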

How do we choose p(x) to minimize the error of the estimated value of the integral? In the dispersion σ²[f/p] the last term is equal to the square of the required integral and, hence, is independent of the choice of p(x). Let us demand the following conditions:

$$ M\left[\left(\frac{f(x)}{p(x)}\right)^2\right] = \int_Q \frac{f(x)^2}{p(x)^2}\,p(x)\,dx = \int_Q \frac{f(x)^2}{p(x)}\,dx = \min, \tag{5.50} $$

$$ \int_Q p(x)\,dx = 1. \tag{5.51} $$

Taking the variation of these equations with respect to p(·), we can write down the conditions of the extremum:

$$ \int_Q \frac{f(x)^2}{p(x)^2}\,\delta p(x)\,dx = 0, \quad\text{and}\quad \int_Q \delta p(x)\,dx = 0. \tag{5.52} $$

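From this it follows that the minimum of the dispersion is achieved by the choice p(·) = c |f(·)|, where c is a constant fixed by the normalization (5.51). For a function f(x) that does not change sign, the dispersion then reduces exactly to zero:

$$ \sigma^2\left[\frac{f}{p}\right] = M\left[\left(\frac{f}{p}\right)^2\right] - \left(M\left[\frac{f}{p}\right]\right)^2 = \frac{1}{c^2} - \frac{1}{c^2} = 0. $$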

However, sampling with this probability density requires knowledge of the distribution function (5.4), i.e. $F_{|f|}(t) = P\{|f(\vec{x})| \le t\}$ for $f(\vec{x})$, or in other words of the unknown integral $\int_Q f(x)\,dx$. This problem is even more complicated than the original one. We can, however, choose a probability density p(x) that reproduces the main features of $|f(\vec{x})|$, in particular the behavior of the function in the regions of large variations. Estimating the integral with a random quantity distributed according to such a well-matched density $p(\vec{x})$ also reduces the dispersion; this is the main idea of the importance sampling method.

With the importance sampling we succeed in reducing the statistical uncertainty without increasing the sample size K. The remaining problem is that the optimal function p(x) requires prior knowledge of the integral $\int f(x)\,dx$. However, this can be overcome by using the Metropolis algorithm discussed in the next section.

Example: Gaussian integrand

As an example let us consider different possibilities to calculate the integral $\int_{-L}^{L} f(t)\,dt$ of the function $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/2\sigma^2}$.

• Let us choose as the probability density the function

$$ p(x) = \frac{1}{2L}, \quad x \in [-L, L]. $$

Then the dispersion of the estimated value of the integral is

$$ \sigma^2[f(x)] = \frac{L\sqrt{2}}{\sqrt{2\pi\sigma^2}}\,\Phi(L\sqrt{2}) - 1 \sim \frac{L}{\sigma}, \qquad \Phi(x) = \frac{2}{\sqrt{2\pi\sigma^2}}\int_0^x dt\, e^{-t^2/2\sigma^2}. \tag{5.53} $$

As expected, the error of the estimated value of the integral increases with L. This is because the main contribution to the integral comes from the function values in the interval [−2σ, 2σ], but the uniform distribution does not reflect this fact, and most of the MC steps (those sampling x outside this interval) are spent in vain.

• Taking into account the inefficient choice made above, let us propose a new form of the probability density

$$ p(x) = \frac{1}{2\sigma}, \quad x \in [-\sigma, \sigma], \qquad p(x) = \frac{1}{2L - 2\sigma}, \quad |x| \in (\sigma, L]. \tag{5.54} $$

Then one can show that

$$ \sigma^2[f(x)] = \frac{L}{\sqrt{2\pi\sigma^2}}\left(1 - \Phi(\sigma\sqrt{2})\right) + \frac{\sigma}{\sqrt{2\pi\sigma^2}}\left(2\Phi(\sigma\sqrt{2}) - 1\right) - 1 \sim \frac{\sigma}{\sigma} = 1. $$

Even such a simple choice of the probability density p(x) (which, however, takes into account the features of the integrand) allows us to decrease the dispersion and its dependence on the limits of integration.

In this example we have seen once more the drawback of the straightforward sampling: if the integrand f(·) has a sharp maximum in some domain G ⊂ Q, with the condition that $\int_G f(\vec{t})\,d\vec{t}$ is not negligibly small in comparison with $\int_Q f(\vec{t})\,d\vec{t}$, algorithms with importance sampling are much preferable.
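The effect of the choice of p(x) on the statistical error can be checked numerically. The following Python sketch compares uniform sampling on [−L, L] with a piecewise-constant density for the Gaussian integrand. It is a sketch under stated assumptions: the values σ = 1 and L = 10 are illustrative, and the piecewise-constant density is normalized here so that half of the points fall into [−σ, σ] and half into σ < |x| ≤ L (a normalized variant of the two-region density of this example, not necessarily the authors' exact choice).

```python
import math
import random

SIGMA = 1.0    # width of the Gaussian integrand (illustrative value)
L = 10.0       # half-width of the integration interval (illustrative value)

def f(x):
    """Gaussian integrand of the example."""
    return math.exp(-x * x / (2.0 * SIGMA ** 2)) / math.sqrt(2.0 * math.pi * SIGMA ** 2)

def sample_uniform(rng):
    """Point from p(x) = 1/(2L) on [-L, L]; returns (x, p(x))."""
    return L * (2.0 * rng.random() - 1.0), 1.0 / (2.0 * L)

def sample_two_box(rng):
    """Half of the points in [-SIGMA, SIGMA], half in SIGMA < |x| <= L; returns (x, p(x))."""
    if rng.random() < 0.5:
        return SIGMA * (2.0 * rng.random() - 1.0), 1.0 / (4.0 * SIGMA)
    x = SIGMA + (L - SIGMA) * rng.random()
    if rng.random() < 0.5:
        x = -x                              # outer region, left or right with equal weight
    return x, 1.0 / (4.0 * (L - SIGMA))

def estimate(sampler, k, rng):
    """Mean of f/p and its one-sigma error, following Eqs. (5.47)-(5.49)."""
    ratios = []
    for _ in range(k):
        x, p = sampler(rng)
        ratios.append(f(x) / p)
    mean = sum(ratios) / k
    var = sum((q - mean) ** 2 for q in ratios) / (k - 1)
    return mean, math.sqrt(var / k)

if __name__ == "__main__":
    rng = random.Random(1)                  # fixed seed, chosen arbitrarily
    print("uniform :", estimate(sample_uniform, 50_000, rng))
    print("two-box :", estimate(sample_two_box, 50_000, rng))
```

Both estimates should be close to 1 (the integral of the normalized Gaussian over an interval much wider than σ), with a smaller error for the two-region density at the same K.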

5.1.3 Practical realizations of Monte Carlo integration

Many problems in physics involve averaging over many variables. For the standard nu-

merical methods for d-dimensional integration one can show that if the error decreases as

N−a for d = 1 (N is the number of integration points), then the error decreases as N−a/d

in d dimensions. In contrast, the error of all Monte Carlo integration methods decreases

as N−1/2 independently of the dimensionality of the integral. Because the computational

time is roughly proportional to N in both classical and quantum Monte Carlo methods,

we can conclude that for low dimensions, classical numerical methods such as Simpson’s

rule are preferable to Monte Carlo unless the domain of integration is very complicated.

However, the error in the conventional methods increases with dimensions, and Monte

Carlo methods are essential for higher dimensional integrals, see e.g. Table 5.1.

Several examples below illustrate the application of Monte Carlo to the evaluation of integrals (the program code for Problem 8 is given at the end of this section).



Problem 8: Hit-or-Miss Monte Carlo: The error function.

To find the area under a curve, one can use integral calculus. If the antiderivative has no closed form, as for the normal curve, the area cannot be derived analytically. However, one can use Monte Carlo integration to complete this task. The area under a probability distribution is also known as a probability. In this example, we want to compute the area under the normal distribution curve from 0 to z,

$$ Y(z) = \mathrm{erf}(z) = \int_0^z f(x)\,dx, \qquad f(x) = \frac{2}{\sqrt{\pi}}\, e^{-x^2}. \tag{5.55} $$

We will set z equal to 1.50.

(1) Use the straightforward sampling (uniform distribution) to compute the integral (5.55). The Monte Carlo integration procedure is as follows:

(a) Identify the range of the X and Y coordinates in which the random points will be placed (the random numbers must follow a uniform distribution). In this case, the minimum coordinates of X and Y are both zero. The maximum coordinate of X is given by the user. The maximum coordinate of Y occurs where the X coordinate is zero. In this case, max Y = 2/√π ≈ 1.12838.

(b) Compute the area of the rectangle using the X and Y range. The area is S = (1.50 − 0) × (1.12838 − 0) ≈ 1.69257.

(c) Run the random process. All the random points (X, Y) will land within the rectangle. Count how many points land below the curve.

(d) Divide the number of points below the curve by the total number of points to get their ratio.

(e) Multiply this ratio by the total area of the rectangle to get the probability. This probability should be about 96.5920%. Compare this result to the exact answer, 96.6105%, well known from the tables of the erf(x) function in many textbooks, or computed with the function erf(x) supplied with the program code.

(f) Look through the given program code which implements this algorithm.

(g) Make several calculations for different x, comparing the results with the values obtained using the function erf(x).

(h) Investigate how the absolute error of the integration (the difference between the estimated and the exact value) depends on the number of sampled points N = 100, . . . , 1 000 000. Make a log-log plot of the error as a function of N. What is the approximate functional dependence of the error on N for a large number of sampled points, e.g. N ≥ $10^4$?

(2) Write a second program with the random numbers generated from the normal distribution $p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/2\sigma^2}$ (with σ = 1/√2), and estimate the probability integral. In order to use this method, one must be able to generate random numbers which follow the specific distribution (importance sampling). The probability obtained with 1 000 000 iterations using this method is 96.6024%. In terms of the computer time required to achieve a predefined accuracy, this method should be faster than the straightforward sampling.

(i) Modify the program code for this case. Instead of the uniform distribution use the procedure Gaussian().

(ii) Investigate the behavior of the error by making a log-log plot of the error as a function of N.

(iii) Calculate the dispersion of the integral (5.55),

$$ M\left[\frac{f(x)}{p(x)}\right] \approx \frac{1}{N}\sum_{i=1}^{N}\frac{f(x_i)}{p(x_i)}, \qquad \mathrm{err} = \pm\sqrt{\frac{\sigma^2[f/p]}{N}}, \tag{5.56} $$

$$ \sigma^2[f/p] = M\left[\left(f(x)/p(x)\right)^2\right] - \left(M\left[f(x)/p(x)\right]\right)^2 \tag{5.57} $$

for the cases of the straightforward and importance sampling considered above. Compare the theoretical error with your empirical estimates (the log-log plot of the error as a function of N). Which method is more efficient?

Problem 9: Hit-or-Miss Monte Carlo: Calculation of π.

One of the possibilities to calculate the value of π is based on the geometrical representation

$$ \pi = \frac{4 \times \pi R^2}{(2R)^2} = \frac{4 \times \text{Area of a circle}}{\text{Area of enclosing square}}. \tag{5.58} $$

With this formula π can be calculated using Hit-or-Miss Monte Carlo. Choose points randomly inside the square. The points should be uniformly distributed, that is, each location inside the square should occur with the same probability. Then the following approximation allows us to compute π:

$$ \frac{4 \times \text{Area of a circle}}{\text{Area of enclosing square}} \simeq \frac{4 \times \text{Number of points inside the circle}}{\text{Total number of points}}. \tag{5.59} $$

(i) Based on the previous example, formulate step by step, as is done in Problem 8.1, the algorithm to implement this Monte Carlo calculation.

(ii) Write the corresponding program code. Choose a number of sampled points N sufficient to compute the value of π with 6 significant digits. The exact value of π can be obtained using the standard Fortran function DATAN(x), which returns the arctangent of x (x must be of type REAL(8)), hence π = 4.0d0 ∗ DATAN(1.0d0).

Problem 10: Two-dimensional integration.

Consider a square region in the xy-plane, such that −1 ≤ x ≤ 1 and −1 ≤ y ≤ 1, containing a uniform charge distribution ρ. The electrostatic potential at the point (x_p, y_p) due to this charge distribution is obtained by integrating over the charged region,

$$ \Phi(x_p, y_p) = \frac{\rho}{4\pi\epsilon_0}\int_{-1}^{1}\int_{-1}^{1}\frac{dx\,dy}{\sqrt{(x - x_p)^2 + (y - y_p)^2}}. \tag{5.60} $$

(i) Write a two-dimensional Monte Carlo integration routine to evaluate Φ(x_p, y_p), and create a table of values for x_p, y_p = 2, 4, . . . , 20. Use a sufficient number of points in your integration scheme to guarantee 4 significant digits in the final result.

Problem 11: The Acceptance-Rejection method.

Although the inverse transformation method used in Example 1 of paragraph 5.1.2.1 can, in principle, be used to generate any desired probability density distribution, in practice the method is limited to functions for which the equation ξ = F(x) can be solved analytically for x. Another method for generating nonuniform probability distributions is the acceptance-rejection method due to von Neumann.

Suppose that p(x) is a probability density function that we wish to generate. Consider a positive definite trial function w(x) such that w(x) > p(x) in the entire range of interest. Because the area under the curve p(x) in the range [x, x + ∆x] is the probability of generating x in this range, we can follow a procedure similar to that used in the hit-or-miss method. Generate two numbers at random to define a point in two dimensions which is uniformly distributed under the function w(x). If the point is outside the area under p(x), the point is rejected; if it lies inside the area, we accept it. This procedure implies that the accepted points are uniform in the area under the curve p(x) and their x values are distributed according to p(x).

The procedure to generate a uniform random point (x, y) is the following:

(1) Choose a form of w(x). A convenient choice of w(x) is such that the x-values distributed with the probability w(x) can be generated using the inverse transformation method. Let the total area under the curve w(x) be equal to A.

(2) Generate a uniform random number in the interval [0, A] and use it to obtain a corresponding value of x distributed according to w(x).

(3) For the value of x generated in step (2), generate a uniform random number y in the interval [0, w(x)]. The point (x, y) is uniformly distributed in the area under the trial function w(x). If y ≤ p(x), then accept x as a random number distributed according to p(x).

(4) Repeat steps (2) and (3) many times.

Note that the acceptance-rejection method is efficient only if the trial function w(x) is close to p(x) over the entire range of interest.
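The four steps above can be sketched in Python for a simple case. This is a minimal sketch under stated assumptions: the target density p(x) = (3/4)(1 − x²) on [−1, 1] and the constant trial function w(x) = 3/4 are illustrative choices made here (w touches p at x = 0, which is sufficient for the method), and all names are hypothetical.

```python
import random

def p(x):
    """Target density (illustrative choice): p(x) = 3/4 (1 - x^2) on [-1, 1]."""
    return 0.75 * (1.0 - x * x) if -1.0 <= x <= 1.0 else 0.0

W = 0.75          # step (1): constant trial function w(x) = W >= p(x) on [-1, 1]
A = W * 2.0       # total area under w(x)

def draw(rng):
    """Return one x distributed according to p(x) by acceptance-rejection."""
    while True:
        u = A * rng.random()      # step (2): uniform number in [0, A) ...
        x = u / W - 1.0           # ... mapped to x by inverting the integral of w
        y = W * rng.random()      # step (3): uniform height in [0, w(x))
        if y <= p(x):
            return x              # accepted; otherwise repeat, step (4)

if __name__ == "__main__":
    rng = random.Random(2024)     # fixed seed, chosen arbitrarily
    xs = [draw(rng) for _ in range(20_000)]
    # For this p(x) the mean is 0 and the second moment is 1/5.
    print(sum(xs) / len(xs), sum(x * x for x in xs) / len(xs))
```

The fraction of accepted points equals the ratio of the areas under p(x) and w(x), here 1/A = 2/3, which illustrates why w(x) should follow p(x) closely.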

(i) Write a program based on the acceptance-rejection method to generate a sequence of random numbers ǫ_i distributed according to the normal distribution (5.55). Plot a block diagram (histogram) and check that the obtained distribution of ǫ_i actually corresponds to the distribution (5.55). As a trial function w(x) choose a uniform distribution w(x) = 1/σ in the interval [0, 3σ] and w(x) = 0 outside this interval.

Questions to the theory

• Prove that in d-dimensions the error of the standard numerical methods of inte-

gration decreases as N−a/d and for the Monte Carlo method as N−1/2.

• Propose a probability density p(x, y) which can be used for the evaluation of

the integral (5.60) using the method of the importance sampling. Discuss the

advantages of the chosen form of p(x, y).

• For the problem 11 (the acceptance-rejection method) propose a better choice of

the trial function w(x) which makes the method more efficient for the distribu-

tion (5.55).

Below is the program code for the Problem 8 of this task. Some of the subroutines

(omitted here) have been already given above in Sec. 5.1.2.3.



c ======================================================================


c ======================================================================

MODULE MAINDATA

c Number of sampled points

Integer :: N

INTEGER :: Iseed

c Random numbers which define point in the 2D plane

REAL*8 :: x,y

c Maximal values of x and y

REAL*8 :: MaxX, MaxY

c Estimated area under the curve

REAL*8 :: Area

REAL*8 :: Proportion

c Number of points under the curve

INTEGER :: count

REAL*4 :: choice

c constant pi

REAL*8, PARAMETER :: Pi=3.14159265358979d0

END MODULE MAINDATA

c ======================================================================

c Obtain the area under the normal distribution curve (probability)

c using Monte Carlo Integration Method

c ======================================================================

Program Integration

USE MAINDATA

Implicit None

INTEGER :: i,j

REAL*8 :: RANF

c Used random number generator

INTEGER :: gen_type

REAL*8 :: erfc_x,erf_x, Prob, f_t

INTEGER :: cnt1, cnt2, cut_rate

REAL*8 :: proc_performance

c Choose the type of random number generator

gen_type=1


CALL RANTEST(Iseed,gen_type)

c Set initial parameters

N=1000000

Write(*,*) 'Number of sampled points: ', N

count=0



c Set the limits of integration

MaxX=1.50d0

MaxY=2.0d0/SQRT(Pi)

c To estimate program performance, otherwise can be commented

CALL SYSTEM_CLOCK(count=cnt1,count_rate=cut_rate)

DO i=1,N

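if (gen_type.NE.3) then

c Uniform random numbers for x and y (MYRANDOM as in SPHERE_VOLUME)

CALL MYRANDOM(choice)

x=choice*MaxX

CALL MYRANDOM(choice)

y=choice*MaxY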

f_t=(2.0d0/SQRT(Pi))*exp(-x*x)

if (y.LT.f_t) count=count+1

else


x=choice

if (abs(x).LT.MaxX) count=count+1

end if

END DO ! over i

Proportion = 1.0d0*count/N

Area = MaxY*MaxX

if (gen_type.NE.3) then

Prob=Proportion*Area

else

Prob=Proportion

end if

c To estimate program performance, otherwise can be commented

CALL SYSTEM_CLOCK(count=cnt2)

proc_performance=(cnt2-cnt1)*1.0d0/(cut_rate*1.0d0)

write(*,*)'Process performance (sec.) :', proc_performance

OPEN(5,FILE='performance.dat')

write(5,*)'Process performance (sec.) :', proc_performance

CLOSE(5)

Open(21,File="area.dat")

write(21,*)'************************ '

write(21,*)'Estimated area: ',Prob

write(21,*)'************************ '

write(*,*)'************************ '

write(*,*)'Hit ratio: ',Proportion

write(*,*)'Estimated area: ',Prob

write(*,*)'************************ '

c Exact value

CALL EERFC(MaxX,erfc_x,erf_x)

write(21,*)’Exact result: ’,erf_x

write(*,*)’Exact result: ’,erf_x

write(*,*)’Error : ’,Prob-erf_x

Close(21)

write(*,*) ’End program work. Have a nice day! :) ’

Stop

End

c ======================================================================

c Procedure EERFC returns the value of the error function (ERF) using

c a series expansion in t (t=1/(1+p*x)):

c erf(x)=1-(a1*t+a2*t^2+a3*t^3+a4*t^4+a5*t^5)*exp(-x^2) +e(x)

c the error |e(x)|<1.5*10^-7

c ======================================================================

SUBROUTINE EERFC(x,erfc_x,erf_x)

REAL*8 :: x,erfc_x,erf_x

REAL*8 :: t,t2

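REAL*8 :: znak1

c Coefficients of the approximation, declared explicitly so that they

c are not implicitly typed as single precision

REAL*8 :: p,a1,a2,a3,a4,a5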

DATA p/0.32759110d0/,a1/0.2548295920d0/,a2/-0.2844967360d0/

DATA a3/1.4214137410d0/

DATA a4/-1.4531520270d0/,a5/1.0614054290d0/

t=1.0d0/(1.0d0+p*abs(x))

t2=x**2

erfc_x=t*(a1+t*(a2+t*(a3+t*(a4+a5*t))))

erfc_x=erfc_x*exp(-t2)

erf_x=1.0d0-erfc_x

RETURN

END
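The hit-or-miss integration performed by the Fortran program above can be condensed into a few lines. The following Python sketch is not the authors' code: it assumes Python's built-in generator in place of the generators of Sec. 5.1.2.3, and the function name `erf_hit_or_miss` and the seed are illustrative choices.

```python
import math
import random

def erf_hit_or_miss(max_x, n_points, seed=12345):
    """Estimate erf(max_x) = (2/sqrt(pi)) * int_0^max_x exp(-x^2) dx
    by the hit-or-miss method (uniform points in a bounding rectangle)."""
    rng = random.Random(seed)            # stand-in for the book's generators
    max_y = 2.0 / math.sqrt(math.pi)     # maximum of the integrand, at x = 0
    hits = 0
    for _ in range(n_points):
        x = rng.random() * max_x
        y = rng.random() * max_y
        if y < max_y * math.exp(-x * x): # point lies under the curve
            hits += 1
    proportion = hits / n_points         # "hit ratio"
    return proportion * max_x * max_y    # hit ratio times rectangle area

estimate = erf_hit_or_miss(1.5, 200_000)
print("estimate:", estimate, " exact:", math.erf(1.5))
```

The statistical error decreases as the inverse square root of the number of points, so a tenfold gain in accuracy costs a hundredfold increase in the number of sampled points.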

5.1.4 Monte Carlo integration in statistical physics

5.1.4.1 Observables in statistical mechanics

In the statistical physics of classical systems there exist well-known formulas which connect thermodynamic quantities with the interparticle interaction potential, temperature, density, chemical potential, pressure, etc. For example, for a system of N particles in volume V at temperature T (i.e. the canonical or [NVT] ensemble), the mean value $\langle A_N \rangle \equiv A(N, V, T)$ of an N-particle observable $A_N$ is given by the expression

\[ A(N, V, T) = \frac{1}{Z(N, V, T)} \int_V \cdots \int_V A_N(R)\, e^{-\beta U_N(R)}\, d^N R, \tag{5.61} \]



where $R = (r_1, r_2, \ldots, r_N)$ is the vector of all particle coordinates, $U_N(R)$ is the potential energy of the N interacting particles (assumed to be a known function), $\beta = 1/k_B T$ is the inverse temperature, and $Z(N, V, T)$ is the canonical partition function (see footnote a),

\[ Z(N, V, T) = \frac{1}{N!\,\Lambda^{DN}} \int_V \cdots \int_V e^{-\beta U_N(R)}\, d^N R, \tag{5.62} \]

where $\Lambda = \left(2\pi\hbar^2/m k_B T\right)^{1/2}$, D is the dimensionality of the system, and the factor N! has been inserted to take into account the indistinguishability of identical particles.
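To get a feeling for the length scale $\Lambda$ entering Eq. (5.62), one can evaluate it for a concrete case. The sketch below is an illustration only: the choice of argon at room temperature and atmospheric pressure, and the function names, are assumptions that do not appear in the text.

```python
import math

H_PLANCK = 6.62607015e-34             # J s (exact SI value)
K_BOLTZMANN = 1.380649e-23            # J/K (exact SI value)
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg

def thermal_wavelength(mass_kg, temperature_k):
    """Lambda = (2 pi hbar^2 / (m k_B T))^(1/2), as defined below Eq. (5.62)."""
    hbar = H_PLANCK / (2.0 * math.pi)
    return math.sqrt(2.0 * math.pi * hbar ** 2
                     / (mass_kg * K_BOLTZMANN * temperature_k))

def degeneracy_parameter(number_density, mass_kg, temperature_k, dim=3):
    """n * Lambda^D; values much smaller than 1 indicate classical behavior."""
    return number_density * thermal_wavelength(mass_kg, temperature_k) ** dim

# Illustrative example (assumed values): argon gas at 300 K and 101325 Pa
m_argon = 39.948 * ATOMIC_MASS_UNIT
density = 101325.0 / (K_BOLTZMANN * 300.0)   # ideal-gas number density
print("Lambda [m]:", thermal_wavelength(m_argon, 300.0))
print("n Lambda^3:", degeneracy_parameter(density, m_argon, 300.0))
```

For such a gas $\Lambda$ is far smaller than the mean interparticle distance, which is why the classical expressions of this section apply.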

Eqs. (5.61)-(5.62) mean that if we know the explicit form of $U_N(R)$ then, according to the general formulas of thermodynamics, we can derive expressions for all thermodynamic quantities of interest. They can be calculated as an average of some N-particle observable $A_N$ or as partial derivatives of the partition function. The well-known thermodynamic relations are

(1) Helmholtz free energy, $F(N, V, T) = -\frac{1}{\beta} \ln Z(N, V, T)$.

(2) Internal (total) energy, $E(N, V, T) = -\frac{\partial}{\partial\beta} \ln Z(N, V, T)$.

(3) The mean potential energy $\langle U_N \rangle$ can be obtained from Eq. (5.61) by substituting $A_N(R) \equiv U_N(R)$, where $U_N(R) = \sum_{i<j} u(r_{ij})$ is a pairwise additive function of pair interactions $u(r_{ij})$ depending on the relative interparticle distances $r_{ij} = |r_i - r_j|$.

(4) Pressure, $P = \frac{1}{\beta} \frac{\partial}{\partial V} \ln Z(N, V, T)$.

(5) Entropy, $S = -\partial F(N, V, T)/\partial T = E/T + k_B \ln Z(N, V, T)$.

(6) Heat capacity at constant volume, $C_V = \left(\frac{\partial E}{\partial T}\right)_V = k_B \beta^2 \frac{\partial^2}{\partial\beta^2} \ln Z(N, V, T)$, or, equivalently, the heat capacity can be calculated as the difference of two averages, $C_V \propto \langle E_N^2(R) \rangle - \langle E_N(R) \rangle^2$.
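Relations (2) and (6) can be checked numerically on a system whose partition function is known in closed form. The sketch below is an illustration under stated assumptions: it uses a system with a few discrete energy levels (not the continuous classical system of Eq. (5.62)), units with $k_B = 1$, and hypothetical function names. It compares the derivatives of $\ln Z$, taken by finite differences, with the direct averages $\langle E \rangle$ and $\beta^2(\langle E^2 \rangle - \langle E \rangle^2)$.

```python
import math

def ln_z(beta, levels):
    """Logarithm of the partition function of a discrete-level system."""
    return math.log(sum(math.exp(-beta * e) for e in levels))

def energy_and_cv_from_lnz(beta, levels, h=1e-4):
    """E = -d lnZ/d beta and C_V/k_B = beta^2 d^2 lnZ/d beta^2 (central differences)."""
    plus, mid, minus = ln_z(beta + h, levels), ln_z(beta, levels), ln_z(beta - h, levels)
    d1 = (plus - minus) / (2.0 * h)
    d2 = (plus - 2.0 * mid + minus) / h ** 2
    return -d1, beta ** 2 * d2

def energy_and_cv_from_averages(beta, levels):
    """E = <E> and C_V/k_B = beta^2 (<E^2> - <E>^2) with Boltzmann weights."""
    weights = [math.exp(-beta * e) for e in levels]
    z = sum(weights)
    e1 = sum(w * e for w, e in zip(weights, levels)) / z
    e2 = sum(w * e * e for w, e in zip(weights, levels)) / z
    return e1, beta ** 2 * (e2 - e1 ** 2)

print(energy_and_cv_from_lnz(1.5, [0.0, 1.0]))
print(energy_and_cv_from_averages(1.5, [0.0, 1.0]))
```

Both routes agree to within the finite-difference error, which illustrates why the fluctuation formula is the preferred one in a simulation: it needs only averages that are accumulated anyway.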

By definition, the vector R in the integral (5.61) contains the positions of N microscopic particles, for example N atoms. In macroscopic systems $N \approx 10^{23}$, which means that a direct computer simulation of such a system would outlast a human lifetime. Fortunately, it turns out that a few hundred atoms are often enough to adequately represent (or “sample”) the whole macroscopic system, i.e. to approach the thermodynamic limit reasonably well. In this case the corresponding high-dimensional integrals appearing on the r.h.s. of Eq. (5.61) can be computed efficiently using Monte Carlo techniques. Typically, the macroscopic system is sampled by a set of $10^2 \ldots 10^4$ particles, and MC calculations yield results which reproduce the properties of infinite systems quite accurately.

In quantum mechanics the situation is completely different. Here the classical ex-

pressions for the partition function Z cannot be used for calculations with Monte Carlo

methods. Therefore, to use the Monte Carlo algorithm it is necessary to represent ex-

pectation values of quantum mechanical operators in the form of sums and integrals of

explicit functions of temperature, density and interparticle interaction potential [14]−[16].

In Sec. 5.2 devoted to quantum MC, we will consider specific procedures for evaluating

these quantities.

Footnote a: See also Chapter 1 of this book, Sec. 1.6.



To apply the Monte Carlo method in statistical physics let us consider Eq. (5.61) again. We can see that in the canonical ensemble the equilibrium distribution of states $R_i$ is given by the Boltzmann factor

\[ p_B(R) = e^{-\beta U_N(R)}/Z(N, V, T). \tag{5.63} \]

We should stress that the Metropolis method formulated below is usually applicable to systems in thermal equilibrium. This means that the system has reached the equilibrium state and the distribution of states (5.63) does not change in time.

5.1.4.2 Metropolis method. Markov chain

We can use the probability (5.63) for importance sampling to perform the MC integration in Eq. (5.61). Thus we choose the probability density (5.63) as the distribution of the sampled points in the 3N-dimensional coordinate space. In this case the computation of the quantity A, Eq. (5.61), reduces to simple arithmetic averaging over K points,

\[ A = \int_V \cdots \int_V A_N(R)\, p_B(R)\, d^N R \approx \frac{1}{K} \sum_{i=1}^{K} A(R_i). \tag{5.64} \]

In this way, if the configurational points $R_i$ have the distribution $p_B(R)$, all higher moments $A^\nu(R)$ can be found at once by simple averaging,

\[ \langle A^\nu \rangle \approx \frac{1}{K} \sum_{i=1}^{K} A^\nu(R_i). \tag{5.65} \]

If the quantities $A(R_i)$ obtained in the importance sampling are independent, then convergence is guaranteed by Theorem 1, and the size of the error can be estimated using Theorem 2, see Sec. 5.1.2.4.
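The averaging in Eqs. (5.64) and (5.65) can be illustrated in a case where points with the distribution $p_B$ can be generated directly. The sketch below assumes a single particle in a one-dimensional harmonic potential $U(x) = x^2/2$ (an example not taken from the text), for which $p_B$ is a Gaussian of variance $1/\beta$ and the exact result is $\langle U \rangle = 1/(2\beta)$. The function name and parameters are hypothetical.

```python
import math
import random

def boltzmann_average(observable, beta, k_points, seed=1):
    """Average an observable over K independent points distributed as
    p_B(x) ~ exp(-beta x^2 / 2) and return (mean, error estimate)."""
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(beta)       # width of the Boltzmann distribution
    total = 0.0
    total_sq = 0.0
    for _ in range(k_points):
        a = observable(rng.gauss(0.0, sigma))
        total += a                      # first moment, Eq. (5.64)
        total_sq += a * a               # second moment, Eq. (5.65) with nu = 2
    mean = total / k_points
    variance = total_sq / k_points - mean ** 2
    return mean, math.sqrt(variance / k_points)   # error ~ sigma_A / sqrt(K)

mean_u, err_u = boltzmann_average(lambda x: 0.5 * x * x, beta=2.0, k_points=100_000)
print("<U> =", mean_u, "+/-", err_u, " exact:", 0.25)
```

The error estimate is valid here because the points are independent; for the correlated points of a Markov chain it would underestimate the true error.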

The reason for the high efficiency of the Monte Carlo method is that the microstates $R_i$ are chosen with the probability $p_B(\cdot)$. Hence for the averaging we use only microstates which are typical of thermodynamic equilibrium. Moreover, in the Boltzmann ensemble the statistical uncertainty has a clear physical meaning: the uncertainty in the calculation of the average energy $\langle E(\cdot) \rangle$ is determined by the dispersion, which is proportional to the specific heat, $\sigma^2[E(\cdot)] = \langle (E - \langle E \rangle)^2 \rangle \propto c_v$.

Our next goal is to generate microstates distributed according to $p_B(R)$. Here, however, the partition function Z enters, which is not known a priori. To overcome this problem Metropolis et al. [13] put forward the idea of using a Markov chain (Markov process) such that, starting from an initial state $R_0$, all further states are generated with the distribution $p_B(R)$. The Markov chain is the probabilistic analogue of a trajectory generated by the equations of motion in classical molecular dynamics. What one needs is to specify the transition probabilities from one state $R_i$ to a new state $R_{i+1}$ per unit time. In Monte Carlo applications these transition probabilities, which will be denoted by $\upsilon(R_i, R_{i+1})$, depend on the specific physical problem, and the states R are simply the degrees of freedom of the system which can change their values. To ensure that the states are distributed according to $p_B(R)$, some restrictions must be placed on $\upsilon(R_i, R_{i+1})$ (below we consider general expressions which are valid for an arbitrary probability density p(x); the Boltzmann factor (5.63) is simply one particular example):

(1) Conservation law (the total probability that the system will reach some state $R_{i+1}$ is unity): $\sum_{R_{i+1}} \upsilon(R_i, R_{i+1}) = 1$, for all $R_i$.

(2) The distribution of $R_i$ converges to the unique equilibrium state: $\sum_{R_i} p(R_i)\, \upsilon(R_i, R_{i+1}) = p(R_{i+1})$.

(3) Ergodicity: one can move from any state to any other state in a finite number of steps with a nonzero probability, i.e. the full configurational space is not divided into mutually isolated subsets.

(4) All transition probabilities are non-negative: $\upsilon(R_i, R_{i+1}) \ge 0$, for all $R_i$.

The evolution of the probability p(R) (evolution of the Markov chain) is governed by $\upsilon(R_i, R_{i+1})$ and can be described by the master equation

\[ \frac{dp(R_i)}{dt} = -\sum_{R_{i+1}} \upsilon(R_i, R_{i+1})\, p(R_i) + \sum_{R_{i+1}} \upsilon(R_{i+1}, R_i)\, p(R_{i+1}). \tag{5.66} \]

In thermodynamic equilibrium, $dp(R)/dt = 0$; therefore, the stationary solution of the master equation must satisfy the relation

\[ \sum_{R_{i+1}} \upsilon(R_i, R_{i+1})\, p(R_i) = \sum_{R_{i+1}} \upsilon(R_{i+1}, R_i)\, p(R_{i+1}). \tag{5.67} \]

Together with condition (1) this relation leads to condition (2). It is possible, however, to impose a stricter condition, the detailed balance

\[ p(R_i)\, \upsilon(R_i, R_{i+1}) = p(R_{i+1})\, \upsilon(R_{i+1}, R_i), \tag{5.68} \]

which requires the probability flux between every pair of states to balance separately, so that p(R) is stationary. Assuming ergodicity, detailed balance is sufficient to guarantee that one correctly samples p(R) in the limit of a large number K of MC samples.
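The conditions on the transition probabilities can be verified explicitly for a system small enough to write down the full transition matrix. The sketch below is an illustration under stated assumptions: a system with three discrete states, a symmetric proposal that picks one of the other states with equal probability, and the acceptance rule $\min[1, p(R_{i+1})/p(R_i)]$. The function name is hypothetical.

```python
import math

def metropolis_matrix(energies, beta):
    """Return (p, t): Boltzmann probabilities p[i] and the transition
    matrix t[i][j] of a Metropolis chain on a few discrete states."""
    n = len(energies)
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    p = [w / z for w in weights]
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # propose j with probability 1/(n-1), accept with min(1, p_j/p_i)
                t[i][j] = min(1.0, p[j] / p[i]) / (n - 1)
        t[i][i] = 1.0 - sum(t[i])       # rejected proposals keep the old state
    return p, t

p, t = metropolis_matrix([0.0, 1.0, 2.0], beta=1.0)
for i in range(3):
    print("row sum:", sum(t[i]),
          " stationarity:", sum(p[k] * t[k][i] for k in range(3)) - p[i])
```

Each row sums to one (condition (1)), every pair of states satisfies Eq. (5.68), and summing Eq. (5.68) over the initial state reproduces condition (2).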

The detailed balance equation (5.68) obviously does not specify $\upsilon(R_i, R_{i+1})$ uniquely, and some arbitrariness in the choice of the functions $\upsilon$ remains. One of the common choices of transition probabilities for the Monte Carlo method in the canonical ensemble [13] is

\[ \upsilon(R_i, R_{i+1}) = \begin{cases} p(R_{i+1})/p(R_i) = e^{-\beta\, \delta U_N(R)}, & \text{if } \delta U_N(R) = U_N(R_{i+1}) - U_N(R_i) \ge 0; \\ 1, & \text{otherwise.} \end{cases} \tag{5.69} \]

The physical meaning of the probability (5.69) is obvious: if a “move” from phase space point $R_i$ to $R_{i+1}$ decreases the energy $U_N$, this move is always carried out (i.e. the probability $\upsilon = 1$, second line). The system always goes into an energetically more favorable state. But what about moves which increase the energy, $\delta U_N(R) > 0$? It is tempting to reject such moves. However, this could trap the system in a local energy minimum. Further, it is easy to see that rejecting these moves would violate the detailed balance condition (5.68). This condition is fulfilled by the choice made in the first line of (5.69).



It is possible to prove that the distribution of a sequence of states generated with the help of Eq. (5.69) converges towards the canonical probability $p_B(R_i) = e^{-\beta U_N(R_i)}/Z$.
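The acceptance rule (5.69) and the convergence of the resulting chain can be tried out on the simplest possible example. The sketch below assumes one coordinate in a harmonic potential $U(x) = x^2/2$ (not an example from the text), for which the canonical distribution gives $\langle x^2 \rangle = 1/\beta$. All names and parameter values are illustrative.

```python
import math
import random

def metropolis_accept(delta_u, beta, rng):
    """Acceptance rule of Eq. (5.69): always accept if the energy does not
    increase, otherwise accept with probability exp(-beta * delta_u)."""
    if delta_u <= 0.0:
        return True
    return rng.random() < math.exp(-beta * delta_u)

def sample_x2(beta, n_steps, step=2.0, seed=7):
    """Run a Metropolis chain for U(x) = x^2/2 and return the average of x^2."""
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        x_new = x + step * (2.0 * rng.random() - 1.0)   # symmetric trial move
        if metropolis_accept(0.5 * x_new ** 2 - 0.5 * x ** 2, beta, rng):
            x = x_new
        total += x * x      # a rejected move counts the old state once more
    return total / n_steps

print("<x^2> =", sample_x2(beta=1.0, n_steps=200_000), " exact: 1.0")
```

Note that the partition function Z never appears: only the ratio of Boltzmann factors for the old and new state is needed.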

We now discuss what the “move” $R_i \to R_{i+1}$ means in practice. In principle, there is enormous freedom in the choice of this move. It can be, e.g., a spin flip in the Ising model, a change of the spatial coordinates of a particle in an external potential, a chain deformation in the problem of polymer dynamics and so on. There exist systems with continuous and with discrete degrees of freedom, and it is obviously impossible to enumerate all possibilities. Even for one system we can introduce different types of moves and different transition probabilities, e.g. for a subset of the degrees of freedom, or for a special type of collective excitations in strongly correlated systems when it is energetically more advantageous to change the positions of several particles simultaneously. In most cases, however, the Markov chain is constructed from moves in which only a few (or one) degrees of freedom are changed, because moves which change many degrees of freedom usually have an extremely small acceptance probability: they mostly increase the energy substantially, $\delta U \approx \sum_{i=1}^{N} \delta U_N(R_i) > 0$. In this case, the transition probabilities in Eq. (5.69) are practically zero and the system remains in the same configuration for a very long “time” (i.e. for many MC steps).
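The drop of the acceptance probability with the number of simultaneously changed degrees of freedom is easy to observe numerically. The sketch below assumes N independent coordinates, each in a harmonic potential $x^2/2$ (an illustrative model, not one from the text), and compares trial moves of a single coordinate with trial moves of all coordinates at once. The function name and parameters are hypothetical.

```python
import math
import random

def acceptance_ratio(n_particles, n_moved, beta=1.0, step=1.0,
                     n_trials=2000, seed=3):
    """Fraction of accepted Metropolis moves when n_moved coordinates out of
    n_particles are displaced simultaneously (U = sum of x_i^2 / 2)."""
    rng = random.Random(seed)
    # start from an equilibrium configuration
    x = [rng.gauss(0.0, 1.0 / math.sqrt(beta)) for _ in range(n_particles)]
    accepted = 0
    for _ in range(n_trials):
        idx = rng.sample(range(n_particles), n_moved)
        new = {i: x[i] + step * (2.0 * rng.random() - 1.0) for i in idx}
        delta_u = sum(0.5 * new[i] ** 2 - 0.5 * x[i] ** 2 for i in idx)
        if delta_u <= 0.0 or rng.random() < math.exp(-beta * delta_u):
            for i in idx:
                x[i] = new[i]
            accepted += 1
    return accepted / n_trials

print("one coordinate moved :", acceptance_ratio(100, 1))
print("all coordinates moved:", acceptance_ratio(100, 100))
```

With the same maximal displacement, single-coordinate moves are accepted most of the time, while moves of all 100 coordinates are almost always rejected.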

The great variability and flexibility of the MC method allows it to be applied to many problems of classical statistical physics and also to simulations of quantum systems, a topic which will be discussed in Sec. 5.2.5. The only restrictions on the choice of MC moves are the conditions for the transition probability $\upsilon(R, R')$ introduced above, which must always be satisfied.

5.1.5 Statistical ensembles

Now we consider practical realizations of the MC algorithm in different statistical ensembles. The choice of the equilibrium distribution $p_B(R)$ determines the thermodynamic ensemble of a given system.

As we already noted, with importance sampling we sample a set of states $R_i$ which have the most pronounced effect on the end result. We can do the same in the MC simulation of thermodynamic ensembles. In this case we know the distribution function of points in space p(R); it is just the weight function of the given ensemble, $p(R) = \rho_{ens}(R)$. With this choice we calculate averages as

\[ \langle A \rangle_{ens} = \frac{1}{K} \sum_{R_i} A(R_i), \tag{5.70} \]

where the points $R_i$ are distributed with the probability density $\rho_{ens}(R)$.

5.1.5.1 Canonical ensemble

In the canonical ensemble the number of particles N, the system volume V and the temperature T are conserved (fixed) quantities. The expressions for the canonical partition function and for the observables have already been given in Eqs. (5.61) and (5.62). We are now ready to formulate the MC algorithm for the thermodynamic average of a macroscopic quantity A:

(1) Place the particles (e.g. randomly) at some initial positions $R_0 = (r^{(1)}_0, \ldots, r^{(N)}_0)$.

(2) Specify the maximum number of Monte Carlo steps $K_{max}$ and initialize the observable quantities, AP = 0. Set the current MC step K = 1.

(3) Choose a particle i at random among the N particles [alternative: successive choice of each particle, i.e. a loop over the particle indices].

(4) Choose a displacement vector $\delta r^{(i)} = (\delta r_x, \delta r_y, \delta r_z)$. Each of its components is calculated as $\delta r_{x(y)(z)} = \delta r^{max}_{x(y)(z)} \times \xi_{x(y)(z)}$, where $\xi_{x(y)(z)}$ is a uniform random number in the interval [−1, 1] and $\delta r^{max}_{x(y)(z)}$ is the maximal displacement in each dimension. $\delta r^{max}$ is an adjustable parameter chosen in such a way that it gives some predefined acceptance ratio of trial displacements, e.g. about 50% of the trial particle displacements should be accepted.

(5) Displace particle i to the new position: $r'^{(i)} = r^{(i)} + \delta r^{(i)}$.

(6) Calculate the energy change between the new and old configurations: $\delta U = U(r^{(1)}, \ldots, r'^{(i)}, \ldots, r^{(N)}) - U(r^{(1)}, \ldots, r^{(i)}, \ldots, r^{(N)}) = U(R') - U(R)$.

(7) There are two possible situations: i) if $\delta U \le 0$, accept the new position of particle i, update the coordinates, $r^{(i)} \to r'^{(i)}$ or $R \to R'$, and proceed to step (9); ii) if $\delta U > 0$, proceed to step (8).

(8) If $\delta U > 0$: generate a random number $\xi$ between 0 and 1. Accept the move only if $\xi < e^{-\beta \delta U}$. If the move is not accepted, return the system to the previous configuration $r^{(i)}$.

(9) Accumulate the sum, AP = AP + A(R), for the thermodynamic average and increase the current MC step by one, K = K + 1.

(10) If $K \le K_{max}$ return to step (3). Otherwise proceed to (11).

(11) Estimate the average of the given physical quantity and its statistical error, $\langle A \rangle = AP/K_{max} \pm \sigma_A/\sqrt{K_{max}}$.
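The eleven steps above can be sketched compactly in code. The following Python example is a minimal illustration, not the authors' implementation: it assumes non-interacting particles in a three-dimensional harmonic trap, $U = \sum_i |r^{(i)}|^2/2$, chosen because the exact result $\langle U \rangle = 3N/(2\beta)$ is known. The function name and all parameter values are hypothetical.

```python
import math
import random

def canonical_mc(n_particles=10, beta=1.0, k_max=100_000, dr_max=1.0, seed=11):
    """Metropolis MC in the NVT ensemble for particles in a harmonic trap.
    Returns (<U>, naive error estimate, acceptance ratio)."""
    rng = random.Random(seed)

    def u_one(r):                        # potential energy of one particle
        return 0.5 * (r[0] ** 2 + r[1] ** 2 + r[2] ** 2)

    # (1) random initial positions
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n_particles)]
    u_total = sum(u_one(r) for r in pos)
    # (2) initialize the accumulators
    a_sum, a_sq, accepted = 0.0, 0.0, 0
    for _ in range(k_max):
        i = rng.randrange(n_particles)                                   # (3)
        trial = [c + dr_max * rng.uniform(-1.0, 1.0) for c in pos[i]]    # (4), (5)
        delta_u = u_one(trial) - u_one(pos[i])                           # (6)
        # (7), (8): accept downhill moves always, uphill with exp(-beta dU)
        if delta_u <= 0.0 or rng.random() < math.exp(-beta * delta_u):
            pos[i] = trial
            u_total += delta_u
            accepted += 1
        a_sum += u_total                                                 # (9)
        a_sq += u_total ** 2
    mean = a_sum / k_max                                                 # (11)
    variance = max(a_sq / k_max - mean ** 2, 0.0)
    return mean, math.sqrt(variance / k_max), accepted / k_max

mean_u, err_u, acc = canonical_mc()
print("<U> =", mean_u, " exact:", 15.0, " acceptance:", acc)
```

Because successive configurations differ by one particle only, the samples are strongly correlated, and the naive error estimate returned here is smaller than the true statistical error.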

5.1.5.2 Microcanonical ensemble

In this ensemble the total energy E, which is the sum of the kinetic and potential energy of the system, is kept fixed, and the microcanonical partition function has the form

\[ Z(NVE) = \frac{1}{N!} \frac{1}{h^{3N}} \int d^N R\, d^N P\; \delta\!\left( \sum_{i=1}^{N} \frac{\left(p^{(i)}\right)^2}{2 m_i} + U_N(R) - E \right). \tag{5.71} \]

The main problem here is that in MC simulations, as in the Metropolis algorithm, we do not have access to the kinetic energy and do not calculate it explicitly. To overcome this difficulty it was proposed [22]−[24] to consider the kinetic energy as a free parameter, $E_D$ (“daemon energy”), which replaces the true sum,

\[ Z(NVE) = \frac{1}{N!} \frac{1}{h^{3N}} \int d^N R\, d^N P\; \delta\left( E_D + U_N(R) - E \right). \tag{5.72} \]



This free parameter $E_D$ can serve as a means to exchange energy between particles. Its purpose is easy to understand. The microcanonical ensemble is formed by all microstates with the same total energy E. Therefore, any change of the potential energy U (caused by a particle displacement) must be compensated by an equal but opposite change of the kinetic energy. The algorithm for simulating the microcanonical ensemble can be given as follows (many steps are similar to those of the canonical ensemble):

(1) Set the initial particle positions $R_0$.

(2) Specify the maximum number of Monte Carlo steps $K_{max}$ and initialize the observables, AP = 0. Set the current MC step K = 1.

(3) Set the energy of the daemon, $E_D = 0$.

(4) Choose a particle i at random among the N particles.

(5) Choose a displacement vector $\delta r^{(i)} = (\delta r_x, \delta r_y, \delta r_z)$. Each component is calculated as $\delta r_{x(y)(z)} = \delta r^{max}_{x(y)(z)} \times \xi_{x(y)(z)}$, where $\xi_{x(y)(z)}$ is a uniform random number in the interval [−1, 1] and $\delta r^{max}_{x(y)(z)}$ is the maximal displacement in each dimension.

(6) Displace particle i to a new position: $r'^{(i)} = r^{(i)} + \delta r^{(i)}$.

(7) Calculate the energy change between the new and old configurations: $\delta U = U(r^{(1)}, \ldots, r'^{(i)}, \ldots, r^{(N)}) - U(r^{(1)}, \ldots, r^{(i)}, \ldots, r^{(N)}) = U(R') - U(R)$.

(8) There are two possibilities: i) if $\delta U \le 0$, accept the new position of particle i, i.e. $r^{(i)} \to r'^{(i)}$, set the daemon energy to $E_D = E_D - \delta U$ and proceed to step (9); ii) if $\delta U > 0$: a) if $E_D \ge \delta U$, give the daemon energy to the system, $E_D = E_D - \delta U$, accept the new configuration and proceed to step (9); b) otherwise the configuration is rejected (it would lead to a negative kinetic energy).

(9) If the move is not accepted, return the system to the previous configuration $r^{(i)}$ and proceed to (10).

(10) Sum up the desired physical property, AP = AP + A(R), and increase the current MC step, K = K + 1.

(11) If $K \le K_{max}$ return to step (4) (the daemon energy must not be reset); otherwise proceed to (12).

(12) Calculate the desired average $\langle A \rangle = AP/K_{max}$.

Note that only two steps, (3) and (8), differ from the canonical Metropolis approach. We should stress several points which are important for this algorithm:

i) The total energy (daemon energy plus potential energy), $E_D + U_N(R)$, stays constant.

ii) The energy of the daemon will have the Boltzmann distribution $e^{-E_D/k_B T}$. This follows from the fact that the potential energy has a Boltzmann distribution. It allows us to define the temperature of the system as $k_B T = \langle E_D \rangle$.

iii) The daemon energy is small compared to the total energy of the system (this is valid for systems with a large particle number).

iv) If we increase the number of daemons their energy will be appreciable, and we can move smoothly from the NVE to the NVT ensemble.
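The daemon algorithm can be sketched in a few lines. The example below is an illustration under stated assumptions, not the algorithm of Refs. [22]−[24] in detail: it uses N independent coordinates in a harmonic potential $x^2/2$, starts all coordinates at $x = 1$ so that the total energy is $E = N/2$, and keeps the daemon energy across steps. For this model the exact microcanonical result is $\langle E_D \rangle = E/(N/2 + 1)$. All names are hypothetical.

```python
import random

def daemon_mc(n_particles=100, k_max=200_000, k_equil=20_000,
              dx_max=1.0, seed=5):
    """Microcanonical MC with one daemon for U = sum of x_i^2 / 2."""
    rng = random.Random(seed)
    x = [1.0] * n_particles                 # initial positions, U_0 = N/2
    u = sum(0.5 * xi * xi for xi in x)
    e_total = u                             # conserved total energy E
    e_daemon = 0.0                          # set once, never reset
    ed_sum, samples, min_ed = 0.0, 0, 0.0
    for step in range(k_max):
        i = rng.randrange(n_particles)
        x_new = x[i] + dx_max * rng.uniform(-1.0, 1.0)
        delta_u = 0.5 * x_new ** 2 - 0.5 * x[i] ** 2
        # accept if dU <= 0, or if the daemon can pay for dU > 0
        if delta_u <= e_daemon:
            x[i] = x_new
            u += delta_u
            e_daemon -= delta_u
        min_ed = min(min_ed, e_daemon)
        if step >= k_equil:                 # skip the equilibration phase
            ed_sum += e_daemon
            samples += 1
    u_check = sum(0.5 * xi * xi for xi in x)   # recomputed from positions
    return {"e_total": e_total, "u": u_check, "e_daemon": e_daemon,
            "mean_ed": ed_sum / samples, "min_ed": min_ed}

result = daemon_mc()
print("energy conservation:", result["u"] + result["e_daemon"] - result["e_total"])
print("<E_D> =", result["mean_ed"], " expected:", 50.0 / 51.0)
```

The mean daemon energy plays the role of $k_B T$, so the temperature is an output of the simulation rather than an input.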



5.1.5.3 Grand canonical ensemble

In the grand canonical ensemble, the control variables are the chemical potential µ, the volume V and the temperature T. The total particle number N is therefore allowed to fluctuate. The probability density of states in this ensemble is $p_N \sim e^{-(E - \mu N)/k_B T}$, where µ is the chemical potential, i.e. the derivative of the free energy G with respect to the particle number N at fixed T and V (for these fixed variables G is the Helmholtz free energy, denoted F in Sec. 5.1.4.1),

\[ \mu = \left. \frac{\partial G}{\partial N} \right|_{T,V}. \tag{5.73} \]

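If we allow the particle number to fluctuate, then the grand canonical partition function should be written as follows,

\[ Z(\mu V T) = \sum_{N=0}^{\infty} \frac{1}{N!\, h^{3N}}\, e^{\mu N/k_B T} \int d^N R\, d^N P\; e^{-H_N(R,P)/k_B T}, \tag{5.74} \]

where $H_N(R,P)$ is the Hamiltonian of N particles.

Other thermodynamic quantities follow straightforwardly:

i) Internal (total) energy:

\[ E = \langle H_N(R,P) \rangle = \frac{1}{Z(\mu V T)} \sum_{N} \frac{1}{N!\, h^{3N}}\, e^{\mu N/k_B T} \int d^N R\, d^N P\; H_N(R,P)\, e^{-H_N(R,P)/k_B T} = -\left( \frac{\partial \ln Z(\mu V T)}{\partial \beta} \right)_{\mu,V}. \tag{5.75} \]

ii) Average particle number:

\[ \langle N \rangle = k_B T \left( \frac{\partial \ln Z(\mu V T)}{\partial \mu} \right)_{V,T}. \tag{5.76} \]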

The average value of any physical quantity depending only on the particle coordinates can be calculated as

\[ \langle A \rangle_{\mu,V,T} = \frac{1}{Z(\mu V T)} \sum_{N} (N!)^{-1}\, V^N \left( e^{\mu/k_B T}/\Lambda^3 \right)^N \int d^N x\; A(x)\, e^{-U_N(x)/k_B T}, \tag{5.77} \]

where $\Lambda = \left(2\pi\hbar^2/m k_B T\right)^{1/2}$ is the thermal de Broglie wavelength and we have made a transition to dimensionless coordinates, $x = R/V^{1/3}$ (each coordinate is scaled by the box length), which gives the additional factor $V^N$ in Eq. (5.77). It can be seen from Eq. (5.77) that the probability distribution p which we want to sample with the MC algorithm is now given by

\[ p_N \sim \frac{1}{N!} \exp\left[ -U_N(x)/k_B T + \mu N/k_B T + N \ln(V/\Lambda^3) \right]. \tag{5.78} \]

It is clear from this expression that, in addition to the usual Metropolis scheme, we should include new MC steps which modify the particle number N. One way to do this is to create a reservoir system with “ghost” particles, which move around freely and are sometimes allowed to enter the “real” system and vice versa [25, 26]. Then the system plus the reservoir conserve the particle number, and the problem is effectively reduced to the canonical ensemble.

On the other hand, a simpler (but efficient) method originally proposed in Ref. [27] is often used. In this method three different types of MC moves are introduced:

i) A randomly chosen particle is displaced as in the original Metropolis scheme;

ii) A randomly chosen particle is destroyed;

iii) A new particle is created at a random position in the system.

The criterion for accepting steps ii) and iii) is the following. Let us consider the ratio of the probabilities of two states which differ by one particle. If we destroy a particle, we get

\[ P_D = \frac{p_{N-1}}{p_N} = e^{-(U_{N-1}(x) - U_N(x))/k_B T}\, e^{-\mu/k_B T} \left( \frac{N \Lambda^3}{V} \right). \tag{5.79} \]

Correspondingly, when a particle is created the ratio of the probability densities is

\[ P_C = \frac{p_{N+1}}{p_N} = e^{-(U_{N+1}(x) - U_N(x))/k_B T}\, e^{\mu/k_B T} \left( \frac{V}{\Lambda^3 (N+1)} \right). \tag{5.80} \]

The destruction (creation) move is accepted with the probability $\min\left[1, p_{N'}/p_N\right]$, where $N' = N - 1$ ($N' = N + 1$). The condition of detailed balance is satisfied if destruction and creation moves are attempted with equal probabilities. The position of the created particle is chosen randomly in the whole space. The fastest convergence of the method was found in the case when the probabilities to attempt each of the three different moves (i.e. displacement, destruction and creation) are equal [27], i.e. 1/3 each.

Since our simulations are in the grand canonical ensemble, the thermodynamic parameters T, V and also µ are constants, which have to be specified at the beginning of the simulation. The advantage of simulations in the grand canonical ensemble is that they allow for the direct calculation of the free energy G,

\[ G/N = \mu - \frac{\langle P \rangle_{\mu V T}\, V}{\langle N \rangle_{\mu V T}}, \tag{5.81} \]

using the average values of the particle number N and of the instantaneous pressure $P = \rho k_B T + \frac{1}{3V} \sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i$, where ρ is the particle density. Hence, by determining the free energy of two different structures (phases), we can say which of the two structures is thermodynamically more stable at particular values of µ and T.
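The creation and destruction moves can be tested in the one case where the answer is known exactly: the ideal gas, $U_N = 0$. Then Eqs. (5.79) and (5.80) reduce to $P_D = N/(zV)$ and $P_C = zV/(N+1)$ with $z = e^{\mu/k_B T}/\Lambda^3$, and the particle number follows a Poisson distribution with mean and variance $zV$. The sketch below is a minimal illustration under these assumptions; displacement moves are omitted because for an ideal gas they are always accepted and do not change N. The function name is hypothetical.

```python
import random

def gcmc_ideal_gas(activity_volume, n_steps=200_000, n_equil=5_000, seed=9):
    """Grand canonical MC for an ideal gas; activity_volume = z V with
    z = exp(mu / k_B T) / Lambda^3. Returns (<N>, variance of N)."""
    rng = random.Random(seed)
    n = 0
    n_sum, n_sq, samples = 0.0, 0.0, 0
    for step in range(n_steps):
        if rng.random() < 0.5:
            # creation attempt, accepted with min[1, P_C], Eq. (5.80) with U = 0
            if rng.random() < min(1.0, activity_volume / (n + 1)):
                n += 1
        else:
            # destruction attempt, accepted with min[1, P_D], Eq. (5.79) with U = 0
            if n > 0 and rng.random() < min(1.0, n / activity_volume):
                n -= 1
        if step >= n_equil:
            n_sum += n
            n_sq += n * n
            samples += 1
    mean = n_sum / samples
    return mean, n_sq / samples - mean ** 2

mean_n, var_n = gcmc_ideal_gas(20.0)
print("<N> =", mean_n, " variance =", var_n, " expected: 20 for both")
```

Creation and destruction are attempted with equal probability, as required for detailed balance; with interactions, the factors $e^{-(U_{N\pm1} - U_N)/k_B T}$ would have to be included in the acceptance probabilities.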

5.1.6 Practical applications of the Metropolis algorithm

5.1.6.1 Simulations of the 2D Ising Model (canonical ensemble)

The Ising model is a simple model to study phase transitions. So-called spins sit on the $N^2$ sites of a lattice; each lattice site i is associated with a spin $S_i$ which can take two values: $S_i = +1$ for an “up” spin or $S_i = -1$ for a “down” spin. A particular configuration or microstate of the lattice is specified by a set of values $S_1, S_2, \ldots, S_N$ for all lattice sites. These values could stand for the presence or absence of an atom, or for the orientation of a magnetic moment (up or down). The energy of the model derives from the interaction between the spins plus the energy of the spins in a magnetic field B. We take the energy per pair of neighbors $S_i$ and $S_j$ as $-J S_i S_j$, where J is the spin-spin interaction. The full energy in the Ising model is given by

\[ E = -J \sum_{i,\, j = nn(i)}^{N} S_i S_j - \mu_0 B \sum_{i=1}^{N} S_i, \tag{5.82} \]

where the first sum is over all pairs of spins which are nearest neighbors. The second

term is the energy of interaction of the magnetic moment with an external magnetic field.

When J > 0, then the states ↑↑ and ↓↓ are energetically favored in comparison to the

states ↑↓ and ↓↑. Hence, for J > 0, we expect that the state of lowest energy (the ground

state) is ferromagnetic (all spins have the same direction). If J < 0, the situation is the

opposite, the states ↑↓ and ↓↑ are favored and the system is antiferromagnetic. Below we

will concentrate on the case when J > 0 and consider effects of a finite temperature.

When the temperature T is high compared with J (more precisely, $k_B T \gg J$), the spins are disordered: they take more or less random values. However, when the temperature T drops below the critical point, the spin system “orders” into a state of “broken symmetry”: most of the spins will have the same sign (when J > 0).
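The energy (5.82) with periodic boundary conditions can be written down in a few lines. The sketch below is an illustration with hypothetical names: spins are stored as a square list of lists with entries ±1, and each nearest-neighbor pair is counted once by summing only over the right and upper neighbor of every site.

```python
def ising_energy(spins, j_coupling=1.0, mu0_b=0.0):
    """Energy of Eq. (5.82) on an n x n lattice (n >= 3) with periodic
    boundary conditions; every nearest-neighbor bond is counted once."""
    n = len(spins)
    energy = 0.0
    for x in range(n):
        for y in range(n):
            s = spins[x][y]
            right = spins[(x + 1) % n][y]     # wraps around at the edge
            up = spins[x][(y + 1) % n]
            energy -= j_coupling * s * (right + up)
            energy -= mu0_b * s
    return energy

n = 4
all_up = [[1] * n for _ in range(n)]
checkerboard = [[1 if (x + y) % 2 == 0 else -1 for y in range(n)] for x in range(n)]
print("ferromagnetic state    :", ising_energy(all_up))        # -2 J per spin
print("antiferromagnetic state:", ising_energy(checkerboard))  # +2 J per spin
```

For J > 0 the fully aligned state has the lowest energy, $-2J$ per spin, while the checkerboard state has the highest, in line with the discussion above.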

Now we will specify several equilibrium quantities of interest:

(1) The mean energy $\langle E \rangle$, which is the thermodynamic average of (5.82).

(2) The heat capacity $C = \partial \langle E \rangle/\partial T$. An alternative way of determining C is to use the statistical fluctuations of the total energy: $C = \left( \langle E^2 \rangle - \langle E \rangle^2 \right)/k_B T^2$.

(3) The mean magnetization: $\langle M \rangle = \left\langle \sum_{i=1}^{N} S_i \right\rangle$.

(4) The zero field (B = 0) linear magnetic susceptibility: $\chi = \left( \langle M^2 \rangle - \langle M \rangle^2 \right)/k_B T$, where $\langle M \rangle$ and $\langle M^2 \rangle$ are evaluated at zero magnetic field.
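The fluctuation formulas for C and χ translate directly into code once the energies and magnetizations of the sampled configurations have been recorded. The sketch below assumes that these samples are available as plain lists and that the temperature is given as $k_B T$ in the same energy units; the function name is hypothetical.

```python
def heat_capacity_and_susceptibility(energies, magnetizations, kt):
    """Return (C / k_B, chi) from recorded samples, using
    C = (<E^2> - <E>^2) / (k_B T^2) and chi = (<M^2> - <M>^2) / (k_B T)."""
    k = len(energies)
    e_mean = sum(energies) / k
    e_sq = sum(e * e for e in energies) / k
    m_mean = sum(magnetizations) / k
    m_sq = sum(m * m for m in magnetizations) / k
    c_over_kb = (e_sq - e_mean ** 2) / kt ** 2
    chi = (m_sq - m_mean ** 2) / kt
    return c_over_kb, chi

# Toy data, for illustration only (not simulation results)
print(heat_capacity_and_susceptibility([1.0, 3.0], [-2.0, 2.0], kt=2.0))
```

Both quantities are differences of two averages of similar size, so they converge more slowly than $\langle E \rangle$ and $\langle M \rangle$ themselves and require longer runs.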

Because we are interested in the properties of an infinite system, we have to choose

periodic boundary conditions: periodic repetition of the simulation box to avoid boundary

effects.

One way to investigate the ordering transition of a spin system is to use the importance

sampling. In each step one proposes to flip or change a sign of a single spin Si → −Si; the

acceptance probability is chosen such (depending on J, T and the spins) that each state

occurs with the right probability.
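For a single spin flip the energy change involves only the four nearest neighbors: at B = 0, flipping $S_i$ changes the energy (5.82) by $\delta E = 2 J S_i \sum_{j = nn(i)} S_j$. The sketch below (hypothetical names, spins stored as a list of lists with entries ±1, periodic boundaries, lattice size at least 3) computes this local energy change, applies the Metropolis rule to it, and checks it against the difference of the full energies.

```python
import math
import random

def neighbor_sum(spins, x, y):
    """Sum of the four nearest-neighbor spins with periodic boundaries."""
    n = len(spins)
    return (spins[(x - 1) % n][y] + spins[(x + 1) % n][y]
            + spins[x][(y - 1) % n] + spins[x][(y + 1) % n])

def flip_delta_e(spins, x, y, j_coupling=1.0):
    """Energy change of Eq. (5.82) at B = 0 when spin (x, y) is flipped."""
    return 2.0 * j_coupling * spins[x][y] * neighbor_sum(spins, x, y)

def total_energy(spins, j_coupling=1.0):
    """Full energy at B = 0, each bond counted once (used only as a check)."""
    n = len(spins)
    return -j_coupling * sum(
        spins[x][y] * (spins[(x + 1) % n][y] + spins[x][(y + 1) % n])
        for x in range(n) for y in range(n))

def metropolis_flip(spins, x, y, kt, rng, j_coupling=1.0):
    """Try to flip spin (x, y); return True if the flip was accepted."""
    delta_e = flip_delta_e(spins, x, y, j_coupling)
    if delta_e <= 0.0 or rng.random() < math.exp(-delta_e / kt):
        spins[x][y] = -spins[x][y]
        return True
    return False

rng = random.Random(2)
lattice = [[rng.choice((-1, 1)) for _ in range(6)] for _ in range(6)]
before = total_energy(lattice)
predicted = flip_delta_e(lattice, 2, 3)
lattice[2][3] = -lattice[2][3]
print("local:", predicted, " full difference:", total_energy(lattice) - before)
```

Using the local expression makes the cost of one trial flip independent of the lattice size, which is what makes single-spin-flip simulations of large lattices practical.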

Now we list the main algorithmic steps to simulate the Ising model with the Metropolis

method:

(1) Choose the directions of the spins (randomly) and compute the initial values of the energy and magnetization. In the computation of the energy only nearest neighbors are counted (4 on the 2D lattice); take care to avoid double counting of the interactions.

(2) Choose randomly a lattice site i and make the flip of the spin, Si → −Si. Accept

or reject the trial flip using the Metropolis acceptance probabilities.



(3) Repeat the previous step K times (choose K to be comparable with the number

of lattice sites).

(4) Call the subroutines that record the physical observables after each Monte Carlo

step per spin. Important: check that you have already reached thermal equilib-

rium. Reaching thermal equilibrium can account for a substantial fraction of the

total run time.

(5) At the end of the run various averages are normalized and printed in output-

subroutines. All averages such as the mean energy and the mean magnetization

are normalized by the number of spins.

Below is the program code which realizes this algorithm. Some of the subroutines

(using random number generators) have been already given above in Sec. 5.1.2.3.

c ======================================================================


c ======================================================================

MODULE MAINDATA

c Maximum allowed size of the lattice

INTEGER, Parameter :: Maxlat=64

c Maximum magnetization (Maxlat*Maxlat)

INTEGER, Parameter :: Maxmag=Maxlat*Maxlat

c Array of spins on the lattice

REAL*8, DIMENSION(1:Maxlat,1:Maxlat) :: Spin

c Chosen lattice size (for a given simulation)

INTEGER :: Nsize

c Temperature of the heat bath in the units [kT/J]

REAL*8 :: kT

c Total number of the Monte Carlo Steps

INTEGER :: Total_MC_Steps

c Number of cycles per Monte Carlo step

INTEGER :: Ncycle

c Equilibration time

INTEGER :: nequil

c Current number of cycles and MC steps

INTEGER :: iMC_step, iCycle_per_spin

c Accumulated sums for the calculations of

c averages (Energy, Magnetization)

REAL*8 :: E_sum, M_sum

c Normalization factor for the

c accumulated sums

REAL*8 :: Weight_sum

c Weight of the configuration

REAL*8 :: Weight

c Total number of the accepted spin flips



INTEGER :: Flip_acc

c Total number of spin flips

INTEGER :: Total_Flip

END MODULE MAINDATA

c ======================================================================

c This function returns 4 neighbor spins (S_left,S_up,S_down,S_right)

c for the lattice site (x,y)

c ======================================================================

SUBROUTINE RETURN_NEIGHBORS(x,y,S_left,S_up,S_down,S_right)

USE MAINDATA

INTEGER :: x,y

REAL*8 :: S_left,S_up,S_down,S_right

if (x.EQ.1) then

S_left=Spin(Nsize,y)

else

S_left=Spin(x-1,y)

end if

if (x.EQ.Nsize) then

S_right=Spin(1,y)

else

S_right=Spin(x+1,y)

end if

if (y.EQ.1) then

S_down=Spin(x,Nsize)

else

S_down=Spin(x,y-1)

end if

if (y.EQ.Nsize) then

S_up=Spin(x,1)

else

S_up=Spin(x,y+1)

end if

END SUBROUTINE RETURN_NEIGHBORS

c ======================================================================

c 2d Ising Model

c with periodic boundary conditions(PBC)

c 4 Neighbors

c ======================================================================

Program Ising



USE MAINDATA

Implicit None

Integer :: I,J,Inew,Jnew,Sstmm,Ilat,Jlat,Kk1

REAL*8 :: M1,Ran_Uniform,

& Norm_Dist,Dist(-Maxmag:Maxmag),

& Ttime,Tstart, RANF

INTEGER :: Iseed, iNeighb, ik

REAL*8 :: Diff, Eold, Enew, tmp, P_acc

REAL*8 :: Energy, Energy1, Energy_old

REAL*8 :: Magnet, Magnet1

REAL*8 :: Lnew,Lold,Mnew,Mold

REAL*8 :: S_left,S_up,S_down,S_right

REAL*4 :: choice, GetRand

REAL*8 :: norm

CHARACTER(LEN=*), PARAMETER :: SHORT='(26(1X,G10.4))'

CHARACTER(LEN=*), PARAMETER :: STAND='(26(1X,G13.7))'

INTEGER :: Time_to_write

c Initialize output files

OPEN(3,FILE='meanq.dat')

CLOSE(3,status='delete')

OPEN(31,FILE='current.dat')

CLOSE(31,status='delete')


CALL InitGetRand(1802,9373)

CALL RANTEST(Iseed)

c Set initial parameters of the system

Write(6,*) '2d Ising Model, 4 Neighbors, with Pbc'

c Set lattice size

Nsize=16

If(Nsize.Lt.3.Or.Nsize.Gt.Maxlat) Stop

If(Mod(Nsize,2).Ne.0.Or.Mod(Maxmag,2).Ne.0.Or.

& Mod(Maxlat,2).Ne.0) Stop

c Temperature of the heat bath

kT=3.0d0

c Total number of Monte Carlo steps

Total_MC_Steps=25000

Time_to_write=INT(Total_MC_Steps/500.0)

c In what period of Monte Carlo steps write to disk

if (Time_to_write.LT.1000) Time_to_write=1000

c Number of cycles per Monte Carlo step

Ncycle=Nsize*Nsize



If(Ncycle.Lt.1) then

write(*,*)'Ncycle.Lt.1'

Stop

end if

c Set the number of equilibration steps

nequil = 1500 ! reasonable choice: =INT(Total_MC_Steps/3)

c Set a limit on the equilibration time

If(nequil.Gt.Total_MC_Steps) then

write(6,*)'nequil is larger than Total_MC_Steps'

nequil = INT(Total_MC_Steps/2)

write(6,*)'set nequil=Total_MC_Steps/2',nequil

End if

c Initialization of physical observables (E,M,f(M))

E_sum = 0.0d0

M_sum = 0.0d0

c For spin flip statistics

Flip_acc = 0

Total_Flip = 0

c Initialization of the distribution function f(M)

c of the magnetization

Do I=-Maxmag,Maxmag

Dist(I) = 0.0d0

Enddo

c Initialize initial spin configuration on the lattice

c Nsize = Size Of The Lattice

c Spin(I,J) = Spin Of Site (I,J)

Do I=1,Maxlat

Do J=1,Maxlat

Spin(I,J) = 0.0d0

Enddo

Enddo

Do I=1,Nsize

Do J=1,Nsize

choice=GetRand()

If(choice.Lt.0.5d0) Then

Spin(I,J) = 1.0d0

Else

Spin(I,J) = -1.0d0

Endif

Enddo

Enddo



c Output spin configuration on the screen

DO I=1,Nsize

write(6,*)' ',(INT(Spin(ik,I)),ik=1,Nsize)

END DO

c Calculate Initial Energy and Total Magnetization (Magnet)

Energy = 0.0d0

Magnet = 0.0d0

Do I=1,Nsize

Do J=1,Nsize

Magnet = Magnet + Spin(I,J)

c Apply the periodic boundary conditions

c and get 4 neighbors of the given spin

CALL RETURN_NEIGHBORS(I,J,S_left,S_up,S_down,S_right)

c Now we calculate the total energy of the lattice,

Energy = Energy-Spin(I,J)*(S_left+S_up+S_down+S_right)

Enddo

Enddo

c Take into account that we have double counting for each

c pair interaction

Energy = Energy/2.0d0

c Make the output for the initial values of parameters

Write(6,*)

Write(6,*) 'Lattice Size : ',Nsize

Write(6,*) 'kT : ',kT

Write(6,*) 'Total number of MC steps : ',Total_MC_Steps

Write(6,*) 'Number Of Cycles per spin : ',Ncycle

Write(6,*) 'N. Of equilibration steps : ',nequil

Write(6,*)

Write(6,*)

Write(6,*) 'Initial Energy (per spin) : ',

& Energy/(Nsize*Nsize)

Write(6,*) 'Initial Magnetization (per spin): ',

& Magnet/(Nsize*Nsize)

Write(6,*)

If(Abs(Magnet).Gt.Maxmag) Stop

c Start a loop over Monte Carlo steps (iMC_step)

c and an internal loop over number of spin flips per

c Monte Carlo step

Do iMC_step=1,Total_MC_Steps

Do iCycle_per_spin=1,Ncycle

c Flip a single spin (Metropolis Algorithm): Choose randomly

c the lattice site (Ilat,Jlat) to make a spin flip

100 choice=GetRand()



Ilat = 1 + INT(choice*Dble(Nsize))

101 choice=GetRand()

Jlat = 1 + INT(choice*Dble(Nsize))

c Save the old values of the total magnetization

c and the spin to be flipped -- Lold

Mold = Magnet

Lold = Spin(Ilat,Jlat)

c Flip the spin and change the total magnetization

Lnew = -Lold

Mnew = Mold + Lnew - Lold

c Calculate the energy difference between the new and old

c spin configurations.

c The energy in the old and new configurations

c (only for the chosen spin Spin(Ilat,Jlat))

Diff=0.0d0

Eold=0.0d0

Enew=0.0d0

CALL RETURN_NEIGHBORS(Ilat,Jlat,S_left,S_up,S_down,S_right)

Eold = Eold-Lold*(S_left+S_up+S_down+S_right)

Enew = Enew-Lnew*(S_left+S_up+S_down+S_right)

Diff=Enew-Eold

c Metropolis acceptance/rejection rule

tmp=-Diff/kT

if (tmp.GT.0.) then

c In this case the spin flip is accepted

P_acc=2.0

else

if (tmp.LT.-12.) then

c The acceptance probability is too small

c e^-12 ~ 6*10^-6, hence we set it equal to zero.

P_acc=0.0d0

else

c The usual Boltzmann factor used in the

c Metropolis algorithm

P_acc=DEXP(tmp)

end if

end if

c Count the attempted spin flips

Total_Flip=Total_Flip+1

c Try to make a spin flip using the

c acceptance probability calculated above



choice=GetRand()

if (choice.Lt.P_acc) then

c Update the lattice/energy/magnetisation

Flip_acc=Flip_acc+1 ! Number of accepted spin flips

Energy_old=Energy

Energy=Energy+Diff

Magnet=Mnew

Spin(Ilat,Jlat)=Lnew

end if

Enddo ! over iCycle_per_spin

c Now accumulate the sums for calculations

c of the total energy and magnetization.

c Note: we accumulate averages only if the current

c number of Monte Carlo steps is larger than

c the time (Nequil) needed to equilibrate the system.

If(iMC_step.Gt.Nequil) Then

Weight=1.0d0

E_sum=E_sum+Weight*Energy

M_sum=M_sum+Weight*Magnet

Weight_sum=Weight_sum+1

Dist(INT(Magnet))=Dist(INT(Magnet))+1.0d0

Endif

if (MOD(iMC_step,Time_to_write).EQ.0) then

write(6,*)'Completed MC step: ', iMC_step

write(6,*)'Current energy : ',Energy/(Nsize*Nsize)

write(6,*)'Current magnetization: ',Magnet/(Nsize*Nsize)

write(6,*)'======================================'

end if

c Write mean energy and magnetization into the file

If(iMC_step.Gt.Nequil) Then

c Normalization factor for calculation of averages

norm=1.0d0/(Weight_sum*Nsize*Nsize)

if (MOD(iMC_step,Time_to_write).EQ.0) then

OPEN (3,FILE='meanq.dat',access='sequential',status='new',err=1)

CLOSE (3)

1 OPEN (3,FILE='meanq.dat',access='sequential',position='append',

& status='old')

OPEN(3,FILE='meanq.dat')

write(3,STAND)iMC_step,E_sum*norm,M_sum*norm



CLOSE(3)

end if

If (MOD(iMC_step,5).EQ.0) Then

OPEN (31,FILE='current.dat',

& access='sequential',status='new',err=11)

CLOSE (31)

11 OPEN (31,FILE='current.dat',access='sequential',position='append',

& status='old')

OPEN(31,FILE='current.dat')

write(31,SHORT)iMC_step,Energy/(Nsize*Nsize),Magnet/(Nsize*Nsize)

CLOSE(31)

end if

End if

Enddo ! over iMC_step

Write(6,*)

Write(6,*) 'Average Energy (per spin) : ',E_sum*norm

Write(6,*) 'Average Magnetization (per spin) : ',M_sum*norm

c Convert the counters to double precision before dividing,

c otherwise an integer division would return zero

Write(6,*) 'Fraction of accepted Flips : ',

& Dble(Flip_acc)/Dble(Total_Flip)

Write(6,*)

Write(6,*) 'Energy of the last config. (Simu) : ',

& Energy/(1.0d0*Nsize*Nsize)

Write(6,*) 'Magnetization of the last config. (Simu) : ',

& Magnet/(1.0d0*Nsize*Nsize)

c Distribution function of the magnetization.

c Calculate normalization factor.

Norm_Dist = 0.0d0

Do I=-Maxmag,Maxmag,2

Norm_Dist=Norm_Dist+Dist(I)

Enddo

Norm_Dist=1.0d0/Norm_Dist

Open(21,File="magnetic.dat")

Do I=-Maxmag,Maxmag,2

If(Dist(I).Gt.0.5d0) Then

Write(21,*) I,Dist(I)*Norm_Dist

Endif

Enddo

Close(21)



c Calculate the energy of the last configuration

Energy = 0.0d0

Magnet = 0.0d0

Do I=1,Nsize

Do J=1,Nsize

Magnet = Magnet + Spin(I,J)

CALL RETURN_NEIGHBORS(I,J,S_left,S_up,S_down,S_right)

Energy = Energy-Spin(I,J)*(S_left+S_up+S_down+S_right)

Enddo

Enddo

Energy = Energy/2.0d0

Write(6,*) 'Energy of the last config. (Calc) : ',

& Energy/(Nsize*Nsize)

Write(6,*) 'Magnetization of the last config. (Calc) : ',

& Magnet/(Nsize*Nsize)

Write(6,*)

c Plot spin configuration on the screen

DO I=1,Nsize

write(6,*)' ',(INT(Spin(ik,I)),ik=1,Nsize)

END DO

write(*,*)'Simulation of the Ising model completed.'

Stop

End

5.1.6.2 2D Lennard–Jones fluid

In this example we apply the Metropolis method in the canonical (NVT) ensemble to a system of N particles in a rectangular two-dimensional cell of fixed “volume” V, interacting via the Lennard-Jones potential.

The equation of state of the Lennard-Jones fluid has been investigated extensively by many researchers using both Monte Carlo and Molecular Dynamics simulations, starting with the original work of Wood and Parker [28]. Systematic studies of the equation of state have since been reported by many groups [29, 30, 31]. This well-known physical system is therefore a good example to demonstrate how the Monte Carlo method works in the canonical ensemble.

As a first step we need to specify the model system. For the simulations we assume that our system can be treated classically and that the molecules (or atoms) can be considered as point particles (e.g. the molecules are spherical and chemically inert). We also assume that the force between any pair of molecules depends only on the intermolecular distance $r_{ij} = |\mathbf{r}_{ij}|$, so that the total potential energy is a sum of pair contributions,

$$U = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} u(r_{ij}). \tag{5.83}$$



In principle, the form of $u(r_{ij})$ should be obtained from a first-principles quantum mechanical calculation. Such a calculation is usually very difficult, and a simple phenomenological form of $u(r_{ij})$ is used instead. The most important features of this interaction potential for simple liquids are a strong repulsion at small r (due to the Pauli exclusion principle, which acts when the electron clouds of neighboring molecules overlap) and a weak attraction at large r. The weak attraction at large r is dominated by the mutual polarization of the molecules; the resulting attractive force is called the van der Waals force.

One of the most common phenomenological forms of u(r) is the Lennard–Jones potential:

$$u(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right]. \tag{5.84}$$

This type of potential can be considered a short-range interaction: the total energy is dominated by interactions with neighboring particles that are closer than some cutoff distance $r_c$. If we use periodic boundary conditions (for a square of side L in 2D) and choose $r_c$ smaller than L/2 (half of the simulation cell), then we need to consider the interaction of a given particle i only with the nearest periodic image of any other particle j.
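The nearest-image rule and the cutoff can be sketched in a few lines of Python. This is only an illustration under stated assumptions (a square box of side `box`, positions stored as a list of (x, y) tuples, reduced units with ε = σ = 1 by default); the function names are hypothetical and are not taken from the programs accompanying this chapter.

```python
def minimum_image(dx, box):
    """Map a coordinate difference onto the nearest periodic image."""
    return dx - box * round(dx / box)


def lj_pair(r2, eps=1.0, sigma=1.0):
    """Lennard-Jones pair energy, Eq. (5.84), from the squared distance r2."""
    s6 = (sigma * sigma / r2) ** 3
    return 4.0 * eps * (s6 * s6 - s6)


def total_energy(pos, box, rc):
    """Sum u(r_ij) over all pairs i < j, Eq. (5.83), using only the
    nearest image of each partner and discarding pairs beyond rc."""
    if rc > 0.5 * box:
        raise ValueError("the nearest-image rule requires rc <= L/2")
    energy = 0.0
    n = len(pos)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = minimum_image(pos[i][0] - pos[j][0], box)
            dy = minimum_image(pos[i][1] - pos[j][1], box)
            r2 = dx * dx + dy * dy
            if r2 < rc * rc:
                energy += lj_pair(r2)
    return energy
```

For example, two particles placed close to opposite edges of the box interact through the boundary: with `box = 10` the coordinate difference -9 is mapped onto +1 by `minimum_image`.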

Periodic boundary conditions. The use of periodic boundary conditions implies that the central cell is duplicated an infinite number of times to fill the two- (or three-) dimensional space. Each cell contains the original particles in the same relative positions

as in the central cell. When a particle moves in the original cell, its periodic images move in

the image cells. When a particle enters or leaves the central cell, the move is accompanied

by an image of that particle leaving or entering a neighboring cell through the opposite

face.

Equation of state. The total energy of a classical system can be decomposed as $E = K(\{\mathbf{v}_i\}) + U(\{\mathbf{r}_i\})$, where the kinetic energy K is a function of the particle velocities $\mathbf{v}_i$ only, and the potential energy U is a function of the particle positions $\mathbf{r}_i$ only. Because the velocities appear quadratically in the kinetic energy, the equipartition theorem implies that in thermodynamic equilibrium the contribution of the velocity coordinates to the mean energy is $(1/2)k_BT$ per degree of freedom. Hence, we need to sample only the particle positions, that is, the “configurational” degrees of freedom.

Because the simulation is performed at fixed T, V and N, it samples particle configurations according to the Boltzmann distribution

$$p_B = \frac{1}{Z}\, e^{-\beta E_B}, \quad \text{(canonical distribution)} \tag{5.85}$$

where $\beta = 1/k_BT$, and Z is the normalization constant (partition function).

The physically relevant quantities for a given system are the mean energy, the specific heat and the equation of state. Another interesting quantity is the radial (or pair) distribution function g(r). The function g(r) is a good probe of the local order in the system and yields information on the interactions in the system and the equation of state.
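In a simulation, g(r) is usually estimated by histogramming pair distances and dividing by the number of pairs expected in an ideal (uncorrelated) system of the same density. The Python sketch below shows one possible way to do this in 2D. It assumes a square periodic box of side `box`, a bin width `dr` and `nbins * dr <= box / 2`; all names are illustrative and not part of the chapter's programs.

```python
import math
import random


def radial_distribution(pos, box, dr, nbins):
    """Histogram estimate of g(r) in 2D for one configuration."""
    n = len(pos)
    rho = n / (box * box)
    hist = [0] * nbins
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            dx -= box * round(dx / box)   # nearest periodic image
            dy -= box * round(dy / box)
            k = int(math.sqrt(dx * dx + dy * dy) / dr)
            if k < nbins:
                hist[k] += 2              # the pair counts for both particles
    g = []
    for k in range(nbins):
        # area of the ring between k*dr and (k+1)*dr
        ring_area = math.pi * ((k + 1) ** 2 - k ** 2) * dr * dr
        g.append(hist[k] / (n * rho * ring_area))
    return g


if __name__ == "__main__":
    rng = random.Random(0)
    ideal = [(20.0 * rng.random(), 20.0 * rng.random()) for _ in range(400)]
    print(radial_distribution(ideal, 20.0, 0.5, 10))
```

For uncorrelated random positions the estimate fluctuates around g = 1, which is a convenient check of the normalization. In a production run the histogram would be averaged over many equilibrated configurations.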



If only two-body forces are present, the mean potential energy per particle is

$$\langle u \rangle = \frac{U}{N} = \frac{\rho}{2} \int g(r)\, u(r)\, d\mathbf{r}, \tag{5.86}$$

where ρ = N/V is the density and g(r) is the pair correlation function. This result can be understood as follows. Each particle has $2\pi r \rho\, g(r)\, dr$ neighbors in 2D (or $4\pi r^2 \rho\, g(r)\, dr$ in 3D) in a ring (shell) of radius r and thickness dr. The interaction energy between the given particle and each of these neighbors is u(r). The factor 1/2 corrects for double counting. To obtain this result we define the pair correlation function g(r) as

$$g(r_{12}) = \rho_{2/N}(\mathbf{r}_1, \mathbf{r}_2)/\rho^2, \tag{5.87}$$

where $\rho_{2/N}(\mathbf{r}_1, \mathbf{r}_2)$ is the joint probability distribution to find the first particle (there are N possible ways to pick this particle in the N-particle system) at the position $\mathbf{r}_1$ and the second particle (there are N − 1 ways to pick the second one) at the position $\mathbf{r}_2$. To obtain this function we integrate the configurational probability density over the coordinates of all other particles; the prefactor takes into account the arbitrariness in choosing the first and second particle:

$$\rho_{2/N}(\mathbf{r}_1, \mathbf{r}_2) = N(N-1) \int_V \dots \int_V d\mathbf{r}_3 \dots d\mathbf{r}_N\, P_N(\mathbf{R}), \tag{5.88}$$

$$P_N(\mathbf{R}) = e^{-\beta U(\mathbf{R})} \left[ \int_V \dots \int_V d\mathbf{r}_1 \dots d\mathbf{r}_N\, e^{-\beta U(\mathbf{R})} \right]^{-1}. \tag{5.89}$$

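Now with the definition of the pair correlation function we can derive the expression for the equation of state. First we consider the thermodynamic expression for the pressure as a partial derivative of the logarithm of the partition function (5.62) with respect to the volume,

$$\beta P = \frac{\partial}{\partial V} \ln Z(N, V, \beta) \Big|_{N,\beta}. \tag{5.90}$$

Next, we rescale the coordinates, $\mathbf{r} \to \lambda \mathbf{r}$, so that in three dimensions the volume becomes $V = \lambda^3 V_0$. The volume derivative is then replaced by

$$\frac{\partial}{\partial V} \ln Z = \left( \frac{1}{\partial V/\partial \lambda} \right) \frac{\partial}{\partial \lambda} \ln Z = \left( \frac{1}{3\lambda^2 V_0} \right) \frac{\partial}{\partial \lambda} \ln(Z) \Big|_{N,\beta,\lambda=1}. \tag{5.91}$$

Taking the derivative with respect to the scalar variable λ we directly obtain the (virial) equation of state in the form

$$\beta P = \rho - \frac{\beta \rho^2}{2D} \int g(r)\, r\, \frac{du(r)}{dr}\, d\mathbf{r}, \tag{5.92}$$

where D is the dimensionality of space (in D dimensions the volume scales as $V = \lambda^D V_0$).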

Truncation of Interactions. The truncation of the interatomic potential at $r_c$ results in a systematic error in, e.g., the mean potential energy. One can correct this error by adding a tail contribution:

$$U^{\rm tail} \equiv \frac{\rho}{2} \int_{r_c}^{\infty} g(r)\, u(r)\, 2\pi r\, dr. \tag{5.93}$$



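To simplify the calculations, we can set $g(r)|_{r>r_c} = g_{\rm id}$, i.e. the pair correlation function g(r) at distances larger than the cutoff radius $r_c$ is taken equal to that of the ideal system, $g_{\rm id} = 1$. For the Lennard–Jones potential (in the 2D case) we get the following result:

$$U^{\rm tail} = 2\pi \rho\, \epsilon\, \sigma^2 \left[ \frac{1}{5} \left( \frac{\sigma}{r_c} \right)^{10} - \frac{1}{2} \left( \frac{\sigma}{r_c} \right)^{4} \right]. \tag{5.94}$$

The corresponding correction to the pressure is

$$P^{\rm tail} = -\frac{1}{4} \rho^2 \int_{r_c}^{\infty} r\, \frac{du(r)}{dr}\, 2\pi r\, dr = 3\pi \rho^2 \epsilon\, \sigma^2 \left[ \frac{4}{5} \left( \frac{\sigma}{r_c} \right)^{10} - \left( \frac{\sigma}{r_c} \right)^{4} \right]. \tag{5.95}$$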
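The closed forms (5.94) and (5.95) are easy to check against a direct numerical evaluation of the tail integrals with g(r) = 1. The following Python sketch does this in reduced units (ε = σ = 1); the midpoint rule, the upper integration limit `rmax` and all function names are choices made for this illustration only.

```python
import math


def u_tail_2d(rho, rc, eps=1.0, sigma=1.0):
    """Energy tail correction per particle, Eq. (5.94)."""
    x = sigma / rc
    return 2.0 * math.pi * rho * eps * sigma ** 2 * (x ** 10 / 5.0 - x ** 4 / 2.0)


def p_tail_2d(rho, rc, eps=1.0, sigma=1.0):
    """Pressure tail correction, Eq. (5.95)."""
    x = sigma / rc
    return 3.0 * math.pi * rho ** 2 * eps * sigma ** 2 * (0.8 * x ** 10 - x ** 4)


def midpoint(f, a, b, steps=200000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def u_lj(r):
    return 4.0 * (r ** -12 - r ** -6)


def du_lj(r):
    return 4.0 * (-12.0 * r ** -13 + 6.0 * r ** -7)


def u_tail_numeric(rho, rc, rmax=200.0):
    """Eq. (5.93) with g(r) = 1, integrated up to rmax instead of infinity."""
    return 0.5 * rho * midpoint(lambda r: u_lj(r) * 2.0 * math.pi * r, rc, rmax)


def p_tail_numeric(rho, rc, rmax=200.0):
    """Integral form of Eq. (5.95) with g(r) = 1, integrated up to rmax."""
    return -0.25 * rho ** 2 * midpoint(
        lambda r: r * du_lj(r) * 2.0 * math.pi * r, rc, rmax)


if __name__ == "__main__":
    print(u_tail_2d(0.8, 2.5), u_tail_numeric(0.8, 2.5))
    print(p_tail_2d(0.8, 2.5), p_tail_numeric(0.8, 2.5))
```

Both corrections are negative for typical cutoffs such as $r_c = 2.5\sigma$, because the neglected part of the potential is purely attractive there.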

Reduced Units. In the simulations it is convenient to express all quantities in reduced units. This means that we choose a convenient unit of energy, length and mass and then express all other quantities in terms of these basic units. For the Lennard-Jones system a natural choice of basic units is the following:

• Unit of length: σ

• Unit of energy: ε

• Unit of mass: m (the mass of one atom in the system).

In terms of these reduced units, denoted with superscript ∗, the Lennard-Jones potential is

$$u^*(r^*) = 4 \left[ \left( \frac{1}{r^*} \right)^{12} - \left( \frac{1}{r^*} \right)^{6} \right], \tag{5.96}$$

the potential energy $U^* = U \epsilon^{-1}$, the pressure $P^* = P \sigma^2 \epsilon^{-1}$, the two-dimensional density $\rho^* = \rho \sigma^2$, and the temperature $T^* = k_B T \epsilon^{-1}$.

Main simulation steps. Now we list the main steps to simulate the Lennard-Jones

system with the Metropolis algorithm:

(1) To start the simulation, assign initial positions to all particles in the system. If we wish to simulate the solid state, it is logical to prepare the system in the crystal structure of interest. Important: The crystal state can be metastable, hence it is not a good choice for the initial configuration in simulations of the liquid state.

(2) Choose a particle i at random and displace it to a new position, $x'_i = x_i + \delta x_i$ and $y'_i = y_i + \delta y_i$, using a displacement vector $\delta\mathbf{r} = (\delta x_i, \delta y_i)$ with $\delta x_i = D_r \cdot \xi$ (and analogously, with a new random number, for $\delta y_i$), where $D_r$ is the maximal displacement and ξ is a uniform random number from the interval [−1, 1].

(3) Calculate the energy difference between the new and old configurations.

(4) Accept or reject the new configuration using the Metropolis algorithm (5.69).

(5) Repeat steps (2)-(4) M times (choose M to be comparable to the number of

particles in the system).



(6) Call the subroutines that record the physical observables and increase the current Monte Carlo step counter by one. Important: check that thermal equilibrium has already been reached.

(7) At the end of the simulation calculate all averages such as the mean energy, specific

heat, pressure, radial distribution function, etc.
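Steps (2)-(5) of the list above can be sketched as a single "sweep" in Python. This is a minimal illustration, not the program accompanying the chapter: it assumes reduced units (ε = σ = 1, temperature `temp` standing for $T^*$), a square periodic box, positions stored as a list of (x, y) tuples and a random generator `rng` from the standard library. All names are hypothetical.

```python
import math
import random


def particle_energy(pos, i, xi, yi, box, rc2):
    """Energy of particle i, placed at (xi, yi), with all other particles."""
    energy = 0.0
    for j, (xj, yj) in enumerate(pos):
        if j == i:
            continue
        dx = xi - xj
        dy = yi - yj
        dx -= box * round(dx / box)       # nearest periodic image
        dy -= box * round(dy / box)
        r2 = dx * dx + dy * dy
        if r2 < rc2:
            s6 = 1.0 / r2 ** 3
            energy += 4.0 * (s6 * s6 - s6)
    return energy


def metropolis_sweep(pos, box, temp, dr_max, rc, rng):
    """Perform N trial moves; return the fraction of accepted moves."""
    n = len(pos)
    accepted = 0
    for _ in range(n):                                  # step (5), M = N
        i = rng.randrange(n)                            # step (2)
        x_old, y_old = pos[i]
        x_new = (x_old + dr_max * rng.uniform(-1.0, 1.0)) % box
        y_new = (y_old + dr_max * rng.uniform(-1.0, 1.0)) % box
        d_e = (particle_energy(pos, i, x_new, y_new, box, rc * rc)
               - particle_energy(pos, i, x_old, y_old, box, rc * rc))  # step (3)
        # step (4): accept with probability min(1, exp(-dE/T))
        if d_e <= 0.0 or rng.random() < math.exp(-d_e / temp):
            pos[i] = (x_new, y_new)
            accepted += 1
    return accepted / n
```

Only the energy of the displaced particle is recomputed, since all other pair terms cancel in the difference. The returned acceptance fraction can be used to tune the maximal displacement `dr_max`.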

Investigation of the solid-liquid phase transition. One of the criteria to investigate structural phase transitions (e.g., the solid-liquid phase transition) was proposed by Lindemann [32], who used the vibrations of atoms in the crystal to explain the melting transition. The average amplitude of thermal vibrations increases when the temperature of the solid increases. At some point the amplitude of vibration becomes so large that the atoms start to occupy the space of their nearest neighbors and disturb them, and the melting process begins. According to Lindemann, melting might be expected when the root-mean-square vibration amplitude $\sqrt{\langle \delta^2 \rangle}$ exceeds a certain threshold value, namely when the amplitude reaches at least 10% of the nearest neighbor distance:

$$\frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \langle \delta_i^2 \rangle}}{\frac{1}{N} \sum_{i=1}^{N} \langle a_i \rangle} = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \langle (\mathbf{r}_i - \langle \mathbf{r}_i \rangle)^2 \rangle}}{\frac{1}{N} \sum_{i=1}^{N} \langle a_i \rangle} \ge 0.1, \tag{5.97}$$

where the summation is performed over all particles, $\langle a_i \rangle$ is the average distance of particle i to its nearest neighbor, and $\langle \mathbf{r}_i \rangle$ is the average position of particle i. This quantity grows rapidly when the temperature approaches the melting temperature of the solid phase. This can be found by performing simulations at several values of the density, ρ = 0.2; 0.6; 0.8; 1.2 (fix N and vary the box length L), and temperatures in the range 0.1 < T < 10.0 with the step ∆T = 0.3. For each value of the density find the approximate critical temperature $T_c$ at which the distance fluctuations (5.97) show a rapid growth. Do you ever observe a negative pressure? Using the obtained values of the critical temperature, draw a curve in the ρ − T (density-temperature) plane which characterizes the solid-liquid phase transition.
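The ratio on the left-hand side of (5.97) can be evaluated from a set of stored configurations. The Python sketch below is one possible implementation under simplifying assumptions: the coordinates are "unwrapped" (not folded back into the box), periodic images are ignored when searching for the nearest neighbor, and the averages are plain means over the stored snapshots. The function name is hypothetical.

```python
import math


def lindemann_ratio(snapshots):
    """Left-hand side of Eq. (5.97).

    snapshots: list of configurations, each a list of (x, y) tuples
    with the same particle ordering in every configuration.
    """
    n_snap = len(snapshots)
    n = len(snapshots[0])
    # average positions <r_i>
    mean = [(sum(s[i][0] for s in snapshots) / n_snap,
             sum(s[i][1] for s in snapshots) / n_snap) for i in range(n)]
    # (1/N) sum_i <(r_i - <r_i>)^2>
    msd = 0.0
    for s in snapshots:
        for (x, y), (mx, my) in zip(s, mean):
            msd += (x - mx) ** 2 + (y - my) ** 2
    msd /= n * n_snap
    # (1/N) sum_i <a_i>, with a_i the nearest-neighbor distance of particle i
    a_sum = 0.0
    for s in snapshots:
        for i, (xi, yi) in enumerate(s):
            a_sum += min(math.hypot(xi - xj, yi - yj)
                         for j, (xj, yj) in enumerate(s) if j != i)
    a_mean = a_sum / (n * n_snap)
    return math.sqrt(msd) / a_mean
```

Note that in a liquid the particles diffuse, so the mean positions lose their meaning for long runs and the ratio keeps growing with the length of the simulation; this is the rapid growth used above to locate the transition.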

5.2 Path Integral Monte Carlo

5.2.1 Density matrix and group property

As we have seen in the previous section, classical simulations of a system in equilibrium usually rely on a Boltzmann-type probability distribution, $p_B \sim e^{-U_N(\mathbf{R})/k_BT}/Z$, and the Monte Carlo method can then be used to sample the particle coordinates R. Now the question arises what the appropriate probability density is in the quantum case. The answer is provided by the density operator $\hat\rho$ (see also Ch. 1 of this book, Sec. 1.2). However, here we face two problems. The first is the same as in the classical case: it is difficult to perform an integration over 3N (or more) degrees of freedom. The second is even more severe: the many-body density operator $\hat\rho$ for a system of interacting particles is unknown, i.e. in the general case we do not have an analytical expression for it. This problem was first overcome by Feynman, who proposed to express the density matrix in terms of a path integral. The key idea is to express the unknown density operator $\hat\rho(T)$ at a given temperature through its high-temperature asymptotic, which is known analytically. But the price to pay is high: instead of an (already complicated) 3N-dimensional integral (in the partition function Z), integrations of much higher dimension now occur. Yet, as we have seen above, the Monte Carlo integration technique can manage such problems.

Let us now explain the basis of the path integral approach. A quantum system in equilibrium is characterized by eigenfunctions (wave functions) $|\phi_i\rangle$ and corresponding energy eigenvalues $E_i$. The thermodynamics of a quantum system is fully determined by the many-body density operator, which is defined as the sum of projections on the states $|\phi_i\rangle$ weighted with the probabilities $p_i$,

$$\hat\rho = \frac{1}{Z} \sum_i p_i\, |\phi_i\rangle \langle \phi_i|, \qquad \sum_i p_i = 1. \tag{5.98}$$

In thermodynamic equilibrium (canonical ensemble) the probability distribution $p_i$ is given by the Boltzmann factor

$$\hat\rho_{\rm eq} = \frac{1}{Z} \sum_i |\phi_i\rangle\, e^{-\beta E_i}\, \langle \phi_i|. \tag{5.99}$$

If the temperature is sufficiently low, then the contributions of all excited states $E_i > E_0$ are exponentially small compared to that of the ground state $E_0$. The partition function is the trace of the density operator, given by the sum over all energy eigenvalues

$$Z = {\rm Tr}\, \hat\rho = \sum_i e^{-\beta E_i}. \tag{5.100}$$

In principle, to find Z it is necessary to solve the N-particle Schrödinger equation and find all eigenfunctions $|\phi_i\rangle$ and eigenvalues $E_i$. Then it is necessary to calculate the sum (5.100) over all discrete and continuous states of the system. However, this is rather complicated or, most likely, impossible for a system of many particles.

We can treat this problem in another way and work directly with the density matrix, which already contains contributions from all states. In coordinate representation, $\hat\rho$ becomes an N × N matrix or, equivalently, a function ρ(R, R′) depending on 6N particle coordinates

$$\hat\rho \to \rho(\mathbf{R}, \mathbf{R}'; \beta) = \langle \mathbf{R}| e^{-\beta \hat H} |\mathbf{R}'\rangle = \sum_i \phi_i^*(\mathbf{R})\, \phi_i(\mathbf{R}')\, e^{-\beta E_i}. \tag{5.101}$$

Now, computing the trace of the operator $\hat\rho$ means integrating over the diagonal elements of the density matrix (DM),

$$Z = \int_V d^N\mathbf{R}\; \rho(\mathbf{R}, \mathbf{R}; \beta). \tag{5.102}$$

For the computation of this expression the Monte Carlo method can be applied directly (exactly as in the classical case) if an explicit analytical expression for the diagonal element of the density matrix ρ(R, R; β) is known. Such expressions are, in fact, well known for ideal classical and quantum systems. However, for strongly interacting systems explicit results exist only in a few special cases of exactly solvable models. If we are interested in realistic non-ideal quantum systems, the function ρ(R, R; β) has to be found by some other approach.

A simple and straightforward strategy is to use the group property of the density matrix (for a moment we return to the compact operator notation):

$$\hat\rho = e^{-\beta \hat H} = \left[ e^{-\delta\beta \hat H} \right]^n, \tag{5.103}$$

where n is an integer number and $\delta\beta = \beta/n = 1/(n k_B T)$. This means that the density operator $\hat\rho$ is expressed as a product of n new density operators $e^{-\delta\beta \hat H}$ of the same system, each corresponding to an n times higher temperature. We will see in a moment what the benefit of this representation is.

In the coordinate representation the identity (5.103) has the following form, written for a general non-diagonal matrix element with the fixed end points R and R′,

$$\rho(\mathbf{R}, \mathbf{R}'; \beta) = \int_V d\mathbf{R}_1 \dots \int_V d\mathbf{R}_{n-1}\; \rho(\mathbf{R}, \mathbf{R}_1; \delta\beta)\, \rho(\mathbf{R}_1, \mathbf{R}_2; \delta\beta) \dots \rho(\mathbf{R}_{n-1}, \mathbf{R}'; \delta\beta), \tag{5.104}$$

where the integrations are performed over the additional intermediate coordinates. Notice that even for diagonal matrix elements on the l.h.s. (R = R′), the r.h.s. involves the non-diagonal density matrix elements. Further, to simplify notations, we introduce $q \equiv \mathbf{R} = (\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_N)$ and $q' \equiv \mathbf{R}' = (\mathbf{r}'_1, \mathbf{r}'_2, \dots, \mathbf{r}'_N)$.

Expressions (5.103) and (5.104) are identities at any finite n if the exact expressions are used for the DM on the left and right hand side. However, the formula (5.104) is also very useful for approximate calculations. Let us suppose that $\tilde\rho$ is the (known) high temperature approximation of the original DM. We can use this approximation for all factors on the r.h.s. of Eq. (5.104) and obtain a better approximation $I_n$ of ρ(q, q′; β). Indeed, to estimate the error $A_n$, let $\rho = \tilde\rho + \delta\rho$, then

$$\rho(q, q'; \beta) = I_n + A_n, \qquad I_n = \int dq_1 \dots dq_{n-1}\; \tilde\rho(q, q_1; \delta\beta) \dots \tilde\rho(q_{n-1}, q'; \delta\beta), \tag{5.105}$$

where

$$A_n \approx \sum_{i=1}^{n} \int dq_1 \dots dq_{n-1}\; \tilde\rho(q, q_1; \delta\beta) \dots \delta\rho(q_i, q_{i+1}; \delta\beta) \dots \tilde\rho(q_{n-1}, q'; \delta\beta), \tag{5.106}$$

and all contributions containing two or more factors δρ are neglected. If $\delta\rho \sim (\delta\beta)^\alpha$ with α > 1, then $A_n \sim n (\delta\beta)^\alpha \sim \beta^\alpha n^{1-\alpha}$, which means that $A_n \to 0$ as $n \to \infty$. We can therefore choose n as large as needed to reach a given accuracy of the calculations. So by using the high temperature approximation $\tilde\rho$ on the r.h.s. of Eq. (5.104) it is possible to get an accurate expression for ρ(q, q′; β) at low temperature.
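The convergence with n can be observed numerically for a case where the exact answer is known. The Python sketch below does this for a single particle in a one-dimensional harmonic potential $U(x) = x^2/2$ (units ħ = m = ω = 1), for which $Z = 1/[2\sinh(\beta/2)]$. As the high temperature factor it uses a free-particle Gaussian multiplied by $e^{-\delta\beta U}$, and the integrals over the intermediate coordinates are replaced by sums on a uniform grid, so that the product of n factors becomes a matrix power. The grid parameters and function names are choices made for this illustration.

```python
import math


def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def trotter_partition_function(beta, n, half_width=6.0, npts=61):
    """Trace of the n-fold product of high temperature density matrices."""
    dx = 2.0 * half_width / (npts - 1)
    xs = [-half_width + k * dx for k in range(npts)]
    dbeta = beta / n
    # lambda_delta^{-1} * dx, with lambda_delta^2 = 2*pi*dbeta in these units
    pref = dx / math.sqrt(2.0 * math.pi * dbeta)
    t = [[pref * math.exp(-(xi - xj) ** 2 / (2.0 * dbeta) - dbeta * 0.5 * xi * xi)
          for xj in xs] for xi in xs]
    # t**n by repeated squaring
    result, power, k = None, t, n
    while k:
        if k & 1:
            result = power if result is None else matmul(result, power)
        k >>= 1
        if k:
            power = matmul(power, power)
    return sum(result[i][i] for i in range(npts))


if __name__ == "__main__":
    beta = 2.0
    exact = 1.0 / (2.0 * math.sinh(0.5 * beta))
    for n in (1, 2, 4, 8):
        z_n = trotter_partition_function(beta, n)
        print(n, z_n, abs(z_n - exact))
```

The deviation from the exact value shrinks as n grows, in line with the estimate $A_n \to 0$ given above. The grid approach is only practical for one or two degrees of freedom; for many particles the integrals over the intermediate coordinates are sampled by Monte Carlo instead.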



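The simplest high temperature approximation $\tilde\rho$ can be chosen in the following form (U is a potential bounded from below)

$$\tilde\rho(q, q'; \delta\beta) = \lambda_\delta^{-3N} \exp\left[ -\pi |q - q'|^2/\lambda_\delta^2 - \delta\beta\, U(q) \right], \tag{5.107}$$

where $\lambda_\delta^2 = 2\pi\hbar^2 \delta\beta/m$ is the square of the thermal de Broglie wavelength, m is the particle mass, and U(q) is the potential energy. For a Hamiltonian of the form $\hat H = \hat K + \hat U(q)$, where $\hat K = \sum_i \hat k_i = -\hbar^2 \sum_i \left( \nabla_i^2 / 2m_i \right)$, this follows directly from

$$\langle q| e^{-\delta\beta \hat H} |q'\rangle \approx \langle q| e^{-\delta\beta \hat K} e^{-\delta\beta \hat U} e^{-\delta\beta^2 [\hat K, \hat U]/2} |q'\rangle \approx \langle q| e^{-\delta\beta \hat K} e^{-\delta\beta \hat U} |q'\rangle \left( 1 - \tfrac{1}{2}\, O\!\left[ (1/n)^2 \right] \right) \approx \lambda_\delta^{-3N} e^{-\pi (q' - q)^2/\lambda_\delta^2}\, e^{-\delta\beta\, U(q)}, \tag{5.108}$$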

where we have used the fact that the potential operator is diagonal in the coordinate representation, and the kinetic energy density matrix has been evaluated using an eigenfunction expansion

$$\langle q| e^{-\beta \hat K} |q'\rangle = \int dp'\, dp''\, \langle q|p'\rangle \langle p'| e^{-\beta \sum_{i=1}^{N} p_i^2/2m_i} |p''\rangle \langle p''|q'\rangle = \int dp'\, \langle q|p'\rangle \langle p'|q'\rangle\, e^{-\beta \sum_{i=1}^{N} p_i'^2/2m_i}. \tag{5.109}$$

The last integral is of Gaussian type and can be performed analytically after substitution of the explicit expressions for the plane waves $\langle q|p'\rangle$ and $\langle p'|q'\rangle$.

Note also that $\hat K$ and $\hat U$ do not commute. This gives rise to the commutator in the first line of Eq. (5.108), which is only the first term of a series (the next terms are double and triple commutators) where each successive term contains an additional pre-factor δβ. Neglecting the commutator $[\hat K, \hat U]$ gives an error of the order $O[(\beta/n)^2]$ (neglecting all other commutator terms gives errors of higher order). The inclusion of the first order commutator in the density matrix (in the case when U describes the Coulomb interaction) was considered in Ref. [33].

Substitution of Eq. (5.107) in (5.104) and transition to the limit n → ∞ results in the expression for the density matrix in the form of the Wiener path integral. Different forms of the high temperature approximation $\tilde\rho$ are discussed in Sec. 5.2.2.

Special care must be taken when the interparticle interactions in the potential energy U(q) have a divergence, e.g. for the Coulomb or hard wall potential. In these cases we need to go beyond the approximation (5.107), because the conditions required for the applicability of this approximation (see below) will always be violated when r → 0. For the case of Coulomb interaction, the divergent potential can be replaced, e.g., with the effective quantum Kelbg potential (see paragraph (c)), which is finite at r = 0. The Kelbg potential takes into account two-particle quantum effects in the first order of perturbation theory. We will return to this point later in the discussion of the pair density matrix and pair potentials in Sec. 5.2.2.

The choice of n in Eq. (5.104) is determined by the importance of quantum effects and can be estimated from the following inequalities:

$$A_n/I_n \approx n^{-1/2} \beta \lambda \langle \nabla U \rangle \ll 1, \quad \text{and} \quad A_n/I_n \approx n^{-1/2} \beta \lambda^2 \langle \nabla^2 U \rangle \ll 1, \tag{5.110}$$

where $\lambda^2 = 2\pi\hbar^2\beta/m$. These inequalities are obtained from the treatment of the two-body quantum problem with the pair interaction U. Inequality (5.110) relates to the approximation (5.108). Let us note that the condition of applicability of classical mechanics to the motion of a particle with momentum p in a potential field with the characteristic value $U_0$ is similar to Eq. (5.110) and has the form $\Delta U_x/U_0 \ll 1$, where $\Delta U_x$ is the variation of the potential over the length λ = ħ/p. The generalization to many-body systems in (5.110) is made by introducing the averaged potential field $\langle \nabla U \rangle$ of many particles acting on the given particle.

Besides the estimations (5.110) the real accuracy of the representation (5.104) and the

related thermodynamic quantities can be estimated from calculations with different values

of n. Increase of n should result in convergence of the obtained results to some limiting

values.

5.2.1.1 Quantum statistics. Spin effects

In the above discussion only quantum effects in the particle interaction have been taken into account, while quantum statistical effects related to the Fermi or Bose statistics were ignored. To take them into account it is necessary to consider (anti)symmetric wave functions

$$\phi(q)_{S/A} = \frac{1}{N!} \sum_P (\pm 1)^{\delta P}\, \phi_i(Pq), \tag{5.111}$$

where $Pq = (q_{P_1}, q_{P_2}, \dots, q_{P_N})$ is the action of the permutation operator on the particle indices and δP is the parity of the permutation. The plus sign relates to Bose particles, and the minus sign to Fermi systems. As a result the partition function of indistinguishable particles can be written in the following form

$$Z_{S(A)} = \frac{1}{N!} \sum_\alpha \sum_P (\pm 1)^{\delta P} \int dq\, \langle q, \chi_\alpha|\, e^{-\beta \hat H}\, |Pq, \chi_{P\alpha}\rangle. \tag{5.112}$$

Now the wave function includes spin variables, i.e. $\phi(q) \to |q, \chi_\alpha\rangle$, where q and $\chi_\alpha$ are the coordinate and spin part of the wave function, and $\alpha = (\sigma_1, \sigma_2, \dots, \sigma_N)$ is the complete set of spin variables. If the spin-orbit interaction can be neglected, then the wave function factorizes

$$|q, \chi_\alpha\rangle = |q\rangle\, |\chi_\alpha\rangle. \tag{5.113}$$

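Let us now present the partition function (5.112) in the form of a path integral as we already did in Eq. (5.104)

$$Z_{S(A)} = \frac{1}{N!} \sum_\alpha \sum_P (\pm 1)^{\delta P} \int dq\, dq_1 \dots dq_{n-1}\, \langle \chi_\alpha| \langle q| e^{-\delta\beta \hat H} |q_1\rangle \dots \times \langle q_{n-2}| e^{-\delta\beta \hat H} |q_{n-1}\rangle \langle q_{n-1}| e^{-\delta\beta \hat H} |Pq\rangle |\chi_{P\alpha}\rangle. \tag{5.114}$$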



Here we have used the group property of the density matrix (5.103). Note that, since $P_{ij}^2 = 1$ for any pair permutation $P_{ij}$, we need to apply the permutation operator in Eq. (5.114) only to one of the factors; here we choose the last one. Further, we took into account that the spin part of the DM is known and does not need to be included into the path integral. Using the high temperature approximation (5.107) the partition function in d dimensions takes the form

$$Z_{S(A)} = \frac{1}{N!\, \lambda_\delta^{dnN}} \int dq\, dq_1 \dots dq_{n-1}\; e^{-\delta\beta U(q)} e^{-\delta\beta U(q_1)} \dots e^{-\delta\beta U(q_{n-1})} \times e^{-\pi (q - q_1)^2/\lambda_\delta^2}\, e^{-\pi (q_1 - q_2)^2/\lambda_\delta^2} \dots e^{-\pi (q_{n-2} - q_{n-1})^2/\lambda_\delta^2} \times \sum_\alpha \sum_P (\pm 1)^{\delta P}\, e^{-\pi (q_{n-1} - Pq)^2/\lambda_\delta^2}\, \langle \chi_\alpha|\chi_{P\alpha}\rangle. \tag{5.115}$$

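For the case when the Hamiltonian does not directly depend on the spin S, the spin part of the wave function factorizes, and we have

$$|\chi_\alpha\rangle = |\chi_{\sigma_1}\rangle\, |\chi_{\sigma_2}\rangle \dots |\chi_{\sigma_N}\rangle, \tag{5.116}$$

$$\langle \chi_\alpha|\chi_{P\alpha}\rangle = \sum_{s_1, s_2, \dots, s_N} \chi_{\sigma_1}(s_1)\, \chi^*_{\sigma_{P1}}(s_1) \dots \chi_{\sigma_N}(s_N)\, \chi^*_{\sigma_{PN}}(s_N) = \Delta_{\sigma_1 \sigma_{P1}} \Delta_{\sigma_2 \sigma_{P2}} \dots \Delta_{\sigma_N \sigma_{PN}}. \tag{5.117}$$

For the case of Fermi statistics the last term on the r.h.s. of Eq. (5.115) is a sum of N × N determinants

$$\sum_\alpha \sum_P (-1)^{\delta P}\, e^{-\pi (q_{n-1} - Pq)^2/\lambda_\delta^2}\, \langle \chi_\alpha|\chi_{P\alpha}\rangle = \sum_{\alpha = (\sigma_1, \sigma_2, \dots, \sigma_N)} \det \| \varphi_{ij}\, \Delta_{\sigma_i \sigma_j} \|,$$

with the matrix elements

$$\varphi_{ij} = \exp\left[ -\pi (\mathbf{r}_i^{\,n-1} - \mathbf{r}_j)^2/\lambda_\delta^2 \right], \quad i, j = 1, \dots, N. \tag{5.118}$$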

These determinants can be efficiently computed using standard linear algebra methods. The final result for the fermion partition function $Z_A$ can be written in the following compact form

$$Z_A(\beta) = \frac{1}{N!\, \lambda_\delta^{dnN}} \int dq \dots \int dq_{n-1}\; e^{-\sum_{m=0}^{n-1} S_m} \sum_\alpha \det \| \varphi_{ij}\, \Delta_{\sigma_i \sigma_j} \|, \tag{5.119}$$

$$S_m = \frac{m_e}{2\hbar^2 \delta\beta} (q_m - q_{m+1})^2 + \delta\beta\, V(q_m), \quad \text{with } q_0 = q_n \equiv q, \qquad \varphi_{ij} = \exp\left[ -\frac{m_e}{2\hbar^2 \delta\beta} (\mathbf{r}_i^{\,n-1} - \mathbf{r}_j)^2 \right]. \tag{5.120}$$

Expression (5.119) can be interpreted as a configurational integral where, however, instead of the potential energy divided by $k_BT$ (as in classical systems) the action $S_m$ is present. We can consider this expression as the integral over all possible trajectories in the imaginary time $t \to i\hbar\beta$, which start from the point q, end in the point Pq, and pass through the points $q_m$ at the intermediate “time slices” (mδβ), m = 1, …, n − 1.



In contrast, the Bose density matrix is completely symmetric under an arbitrary permutation of particle labels, and all terms in the sums on the r.h.s. of Eq. (5.115) give a positive contribution to $Z_S$. Compared to the Fermi case it is more difficult to evaluate all N! terms in the sum, because there is no efficient algorithm to calculate this sum explicitly (such a sum is called a “permanent”). We should note, however, that in the case n ≫ 1 (n ≈ 100 … 300 is typical for most numerical simulations) only one particular permutation gives the main contribution to the partition function Z, and all other (N! − 1) terms are negligible. This can be seen directly from the expression (5.118). In this case the matrix elements behave like delta-functions.
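The difference in cost between the two cases can be made concrete with a small Python sketch. For a matrix of exchange factors $\varphi_{ij}$ of the form (5.118), taken here in one dimension and for particles of equal spin, the signed sum over permutations equals the determinant, which Gaussian elimination delivers in $O(N^3)$ operations. For the unsigned sum (the permanent) the sketch simply enumerates all N! permutations. The function names and the test positions are made up for this illustration.

```python
import math
from itertools import permutations


def exchange_matrix(r_last, r_first, lam):
    """phi_ij = exp[-pi (r_i^{n-1} - r_j)^2 / lam^2], cf. Eq. (5.118), in 1D."""
    return [[math.exp(-math.pi * (ri - rj) ** 2 / lam ** 2) for rj in r_first]
            for ri in r_last]


def parity(perm):
    """Number of inversions; its evenness is the parity of the permutation."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def permutation_sum(a, sign):
    """Sum over all N! permutations: sign=-1 gives the determinant,
    sign=+1 the permanent."""
    n = len(a)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(sign) ** parity(perm)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total


def det_gauss(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    n = len(m)
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return det
```

When the wavelength `lam` is much smaller than the particle spacing, the off-diagonal elements vanish and both sums reduce to the identity permutation, which is the delta-function-like behavior mentioned above.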

5.2.1.2 Fermion sign problem

If we again consider simulations of Bose systems, there is a large simplification over the

fermionic case, due to the fact that all terms are positive and, therefore, the terms in (5.115)

can be sampled by Monte Carlo methods. In the Metropolis algorithm it is simply an

additional degree of freedom which we must sample: a discrete variable P in the space of

all possible N ! permutations of N particles.

In contrast, a severe problem arises when the integrand can take both positive and

negative values like the non-diagonal matrix elements of a Fermi system (the pre-factor

taking into account parity of permutations changes its sign). It turns out that in the

case of strong degeneracy (when λ ≫ r, where r is the mean interparticle distance) the

efficiency of the MC procedure is reduced drastically, because the total density matrix is

the difference of two large numbers – a contribution of even and odd permutations, both

of which are very close in their absolute values. One needs to use, in this case, additional

approximations (e.g. the fixed node approximation [34, 35, 36]) or limit the applicability of

the fermionic PIMC to problems where degeneracy is not very high. This problem appears

also in other quantum MC methods (e.g. Diffusion MC) and is called the “fermion sign problem”. It cannot be solved in the general case, although some partial solutions seem to exist for homogeneous [37, 38] and few-particle systems [39].
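The cancellation between even and odd permutations can be illustrated with a toy calculation in Python. For a fixed configuration of a few equally spaced particles in one dimension, the sketch sums Gaussian exchange factors of the form $e^{-\pi (r_i - r_j)^2/\lambda^2}$ separately over even and odd permutations and returns the ratio (even − odd)/(even + odd). This is a simplified stand-in for the average sign in a simulation, not the full path integral, and the function name is hypothetical.

```python
import math
from itertools import permutations


def sign_ratio(n_part, lam_over_r):
    """Return ((even - odd)/(even + odd), even, odd) for n_part particles
    on a 1D chain with unit spacing and wavelength lam_over_r."""
    phi = [[math.exp(-math.pi * (i - j) ** 2 / lam_over_r ** 2)
            for j in range(n_part)] for i in range(n_part)]
    even = 0.0
    odd = 0.0
    for perm in permutations(range(n_part)):
        inversions = sum(1 for i in range(n_part)
                         for j in range(i + 1, n_part) if perm[i] > perm[j])
        weight = 1.0
        for i in range(n_part):
            weight *= phi[i][perm[i]]
        if inversions % 2 == 0:
            even += weight
        else:
            odd += weight
    return (even - odd) / (even + odd), even, odd


if __name__ == "__main__":
    for x in (0.3, 1.0, 3.0):
        print(x, sign_ratio(4, x))
```

For a wavelength well below the particle spacing the ratio is practically one, while for λ larger than the spacing the even and odd contributions become comparable and their difference is a small fraction of their sum. In a Monte Carlo estimate this small difference has to be resolved against the statistical noise of the two large terms.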

As an alternative to the fixed node approximation used in Refs. [34]−[38] the di-

rect fermionic PIMC simulations (DPIMC) have been occasionally attempted by various

groups [40, 41] (and references therein). However, due to the fermionic sign problem,

these simulations have been very inefficient. Recently, a new path integral representation

was proposed for the N-particle density operator [42, 43] which allows for direct fermionic

path integral Monte Carlo simulations of dense plasmas in a wide range of densities and

temperatures.

Leaving this currently very active topic to the interested reader, we now return to the general theory of PIMC and its possible practical applications.

5.2.1.3 Mapping onto an “effective” classical system

As we can see from Eq. (5.119), each particle is now represented by 3n coordinates (n points in space) and can be viewed as a trajectory, a path in configuration space. The inverse temperature argument β can be considered as the imaginary time of the trajectory, and the trajectory is completely defined by a set of n points separated by time intervals δβ, see Fig. 5.5. We can think of the integral in Eq. (5.119) as describing some kind of classical system. What system?

If we look at the final analytical result for the high temperature density matrix,

Eq. (5.119), we recognize the usual Boltzmann factor with the action in the exponent.

This action describes two types of interaction. The first term

\[
\sum_{m=0}^{n-1} \frac{m_e}{2\hbar^2\delta\beta}\,(q^{m} - q^{m+1})^2
= \frac{m_e}{2\hbar^2\delta\beta} \sum_{i=1}^{N} \sum_{m=0}^{n-1} (r_i^{m} - r_i^{m+1})^2 \tag{5.121}
\]

comes from the kinetic energy density matrices of free particles (i denotes the particle index). This energy can be interpreted as the energy of a harmonic oscillator, $U_s = \frac{k}{2}(\Delta x)^2$. Changing one of the coordinates is equivalent to changing the energy of several coupled oscillators. These coupled oscillators ensure that neighboring points on the path are usually at some average distance proportional to the de Broglie wavelength, $\lambda_\delta = (2\pi\hbar^2\delta\beta/m_e)^{1/2}$. As the temperature is lowered, the average size of the path increases because the thermal wavelength increases as well. The oscillators become weaker and the points on the trajectory become more delocalized in coordinate space. Following the analogy with classical systems, a quantum particle can here be interpreted as a polymer with harmonic interactions between neighboring points on its chain. Adding the interaction term $\delta\beta V(q^m)$ in $S_m$, Eq. (5.119), does not qualitatively change this picture. We now have interacting polymers. But these polymers interact only at the same “time slice” along the paths. Indeed, the potential term depends only on $q^m$, which means that only the coordinates from the time slice mδβ are involved.
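The polymer picture can be made concrete with a short sketch. The Python functions below are an illustration only: the names `spring_action` and `potential_action`, the restriction to one spatial dimension, and the reduced units ħ = m_e = 1 are assumptions of this example and do not come from the text. They evaluate the harmonic link term of Eq. (5.121) for closed paths and a pair potential term which couples only beads at the same time slice.

```python
def spring_action(paths, dbeta, mass=1.0, hbar=1.0):
    """Kinetic (spring) term of Eq. (5.121) for closed 1D paths.

    paths[i][m] is the coordinate of particle i at time slice m;
    slice n is identified with slice 0 (ring polymer).
    """
    k = mass / (2.0 * hbar**2 * dbeta)
    total = 0.0
    for path in paths:
        n = len(path)
        for m in range(n):
            total += (path[m] - path[(m + 1) % n]) ** 2
    return k * total


def potential_action(paths, dbeta, pair_potential):
    """dbeta * sum over slices of V; beads interact only at equal m."""
    n = len(paths[0])
    total = 0.0
    for m in range(n):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                total += pair_potential(abs(paths[i][m] - paths[j][m]))
    return dbeta * total
```

For two particles with two slices each, `spring_action([[0.0, 1.0], [2.0, 2.0]], 0.5)` sums the squared link lengths of both rings, while `potential_action` never compares beads that belong to different time slices.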

In the expression for the partition function Z for distinguishable particles the starting

and ending points of the polymers are the same (5.102) – particles are represented as

“closed ring polymers”. For indistinguishable particles the situation is different (5.112). As we know, for Fermi (Bose) systems only totally antisymmetric (symmetric) eigenfunctions contribute to the density matrix. In this case paths are allowed to close on any permutation of their starting positions, q = Pq, see e.g. the particles 1 and 2 in Fig. 5.5. The partition function includes contributions from all N! possible ways of closing the polymers. At high temperature the identity permutation dominates, while at zero temperature all permutations contribute equally. In the “effective” classical system, paths can “cross-link”. For example, a two-atom system of n links can be in two possible permutation states: either two separate paths, each with n points, or one large path with 2n points. In a system of several

particles or in a simulation of a macroscopic system using periodic boundary conditions,

such polymers can wrap around the boundaries of the simulation box. For a macroscopic

Bose system this feature is a direct manifestation of superfluidity. According to Feynman’s

theory [14, 15] the superfluid transition is represented in the classical system of polymers

by formation of macroscopic cycles (paths). These paths stretch across the entire system

and involve on the order of N atoms. For such quantities as the superfluid density, the

specific heat, the condensate fraction etc., we could directly trace changes in their behavior

to these macroscopic exchanges.

As an example, in Fig. 5.5(a),(b) we show how particle statistics is taken into account in

a simulation of five interacting electrons in a harmonic trap. In the left panel, Fig. 5.5 (a)



Fig. 5.5 (a) Snapshot of the configuration of 5 electrons in a 2D harmonic trap (XY plane) from path-integral Monte Carlo simulations. The electrons 3, 4 and 5 are in identity permutations. Electron 1 and electron 2 are involved in a pair exchange. (b) The Y-coordinates of the electrons as a function of the time-slice number m. Labels show electron indices.

we can see a snapshot from PIMC simulations. Each particle is represented by a path (here, for a better graphical representation, the original time-sliced discontinuous trajectories are smoothed by a set of Fourier coefficients, which gives continuous paths). First, we can see three closed polymers representing the electrons labelled 3, 4 and 5. These states correspond to three identity permutations (there is no exchange between these particles). In contrast, the electrons 1 and 2 are involved in a pair exchange and form one closed “polymer”. In Fig. 5.5 (b) we show the Y-coordinates of the electrons at the different time-slices, $y^m|_{m=0}^{n-1}$, with n = 100. Vertical dotted lines show where the starting and end points of a particle trajectory coincide, which is the case for the electrons 3, 4 and 5. Some deviation in the position of the edge points $y^0$ and $y^{n-1}$ is due to the finite discretization of the trajectories (not taken into account by the Fourier smoothing for the edge points). Hence the distance between the edge points can be of the order of the thermal wavelength $\lambda_\delta$. As one can clearly see from this picture, the end points of the 1st and 2nd electron are exchanged due to the pair permutation, and the distance between the edge points of the same trajectory can be several times larger than $\lambda_\delta$.

5.2.1.4 Generalized Metropolis algorithm

Now let us come to the Metropolis Monte Carlo method, which allows one to perform the integration over the intermediate path variables in the expression (5.119). The full configurational space is divided into two parts: i) the spatial coordinates of N particles at n time slices, $q^1, \ldots, q^n$; ii) the N! permutations of the operator P. The product of the exponential factors with the actions $S_m$ can be viewed as the weight of a non-normalized probability distribution. In fact this expression is similar to the classical Maxwell-Boltzmann distribution function, and our problem here is similar to that of classical Monte Carlo (see Sec. 5.1), but now in the (3N · n)-dimensional space.

As was already noted in Section 5.1.4.2, the main idea of the Monte Carlo method is to consider a sequence of states $s_1, \ldots, s_n$ which form a Markov chain (now a state $s_i$ is defined by a set of spatial coordinates $q_i$ and spin variables $\alpha_i$). For this purpose we introduce transition elements υ(s, s′) which specify the probability of a transition from the configuration s to a new configuration s′. The average of some physical quantity A, which takes the value $A(s_i)$ when the system is in the state $s_i$, can be expressed as (using the central limit theorem)

\[
\langle A \rangle = \lim_{L\to\infty}
\frac{\frac{1}{L}\sum_{s'=1}^{L} A(s')/p_{s'}}{\frac{1}{L}\sum_{s'=1}^{L} 1/p_{s'}}, \tag{5.122}
\]

where L is the length of the Markov chain (the total number of sampled configurations)

and ps′ is the probability used in the importance sampling of configurations s′. In the

standard MC method the transition probabilities υ(s, s′) are chosen to be proportional to

the integrand. For example, for the fermion partition function (5.119) this leads to the

following result. First, we can write down the probability density of states in thermal

equilibrium

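\[
p_s = \frac{1}{Z_A}\, e^{-\sum_{m=0}^{n-1} S_m(s)} \cdot \mathrm{abs}\big(\det\|\varphi_{ij}(s)\,\delta_{\sigma_i\sigma_j}\|_{\alpha}\big),
\]
\[
p_{s'} = \frac{1}{Z_A}\, e^{-\sum_{m=0}^{n-1} S_m(s')} \cdot \mathrm{abs}\big(\det\|\varphi_{ij}(s')\,\delta_{\sigma'_i\sigma'_j}\|_{\alpha}\big), \tag{5.123}
\]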

where the function abs(. . .) denotes the absolute value, and the partition function $Z_A$ ensures the proper normalization. The system states are defined as
\[
s = \{q, q^{1}, \ldots, q^{n-1};\ \alpha = (\sigma_1, \ldots, \sigma_N)\}, \qquad
s' = \{q', q'^{1}, \ldots, q'^{n-1};\ \alpha' = (\sigma'_1, \ldots, \sigma'_N)\}. \tag{5.124}
\]

In this case the transition probability takes the form

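\[
\upsilon(s, s') = \frac{p_{s'}}{p_s} = e^{-\sum_{m=0}^{n-1} [S_m(s') - S_m(s)]} \cdot
\frac{\mathrm{abs}\big(\det\|\varphi_{ij}(s')\,\delta_{\sigma'_i\sigma'_j}\|_{\alpha}\big)}{\mathrm{abs}\big(\det\|\varphi_{ij}(s)\,\delta_{\sigma_i\sigma_j}\|_{\alpha}\big)}. \tag{5.125}
\]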

For Bose systems the second factor in the last expression is replaced by a ratio of two “permanents”, i.e. sums over all N! permutations, each taken with a positive sign.

The generalized Metropolis algorithm considered below is widely used nowadays. This modification of the standard Metropolis procedure retains all necessary properties of a Markov process, and the central limit theorem is also applicable. To construct the Markov chain one splits the transition probability υ(s, s′) into a sampling distribution T(s, s′) and an acceptance probability A(s, s′),
\[
\upsilon(s, s') = T(s, s')\,A(s, s'). \tag{5.126}
\]

In the original Monte Carlo procedure, T (s, s′) is chosen to be a constant distribution

inside a cube and zero outside. In contrast, now more general types of sampling are

allowed and trial moves are accepted according to A(s, s′). To satisfy the detailed balance

property, the acceptance probability A(s, s′) must be chosen according to

\[
A(s, s') = \min\left[1, \frac{T(s', s)\, p(s')}{T(s, s')\, p(s)}\right]. \tag{5.127}
\]

The moves which are not accepted with the acceptance probability A(s, s′) are rejected

and the system stays in the old state. Accepted or rejected moves contribute to averages

in the same way.
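A minimal sketch of one step of this scheme is given below. It is not taken from the text: the function names, the use of logarithms of p and T, and the one-dimensional Gaussian test problem with a drifted (hence asymmetric) proposal are assumptions chosen for illustration. The acceptance test implements Eq. (5.127), and a rejected move contributes the old state to the average once more.

```python
import math
import random


def generalized_metropolis_step(s, log_p, propose, log_T, rng):
    """One generalized Metropolis step, Eq. (5.127).

    log_p(s)       : log of the (unnormalized) weight p(s)
    propose(s, rng): draws a trial state s' from T(s, s')
    log_T(a, b)    : log of the sampling density for the move a -> b
    Returns (state, accepted).
    """
    s_new = propose(s, rng)
    log_ratio = (log_T(s_new, s) + log_p(s_new)) - (log_T(s, s_new) + log_p(s))
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return s_new, True
    return s, False  # rejected: the old state is kept


def demo_chain(n_steps=20000, h=0.5, seed=1):
    """Sample p(x) ~ exp(-x^2/2) with a Gaussian proposal shifted along -x."""
    rng = random.Random(seed)
    log_p = lambda x: -0.5 * x * x
    mean = lambda x: x - 0.5 * h * x
    propose = lambda x, r: r.gauss(mean(x), math.sqrt(h))
    log_T = lambda a, b: -(b - mean(a)) ** 2 / (2.0 * h)
    x, n_acc, x2 = 0.0, 0, 0.0
    for _ in range(n_steps):
        x, ok = generalized_metropolis_step(x, log_p, propose, log_T, rng)
        n_acc += ok
        x2 += x * x  # accepted and rejected steps enter in the same way
    return x2 / n_steps, n_acc / n_steps
```

Because the proposal mean depends on the current state, T(s, s′) ≠ T(s′, s) here, and dropping the ratio of sampling densities from the acceptance test would bias the average of x².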

An example of the generalized Metropolis scheme for classical systems is the Smart Monte Carlo, where particle displacements are chosen preferentially in the direction of the force exerted on a given particle by the remaining particles, see e.g. Ref. [5].

Usually, in PIMC simulations we have several “moves” (changes in different degrees of

freedom of the system) with a particular sampling distribution T (s, s′) in each case. We

now address this point in detail.

5.2.1.5 Monte Carlo “moves” in PIMC

(a) Displacement of a whole particle.

This type of “move” is similar to the one from classical Monte Carlo: a whole particle (with randomly chosen index i), in our case a “ring polymer”, i.e. the path representing the given particle i, is displaced by some distance δr in coordinate space. We do not change the form of the path; only the position of its center of mass is varied. In this case the scheme of the Metropolis algorithm is the same as for the canonical ensemble (see Sec. 5.1.5.1). The probability T(s, s′) can be chosen as a uniform distribution, $[(1/2r_x^{\max}) \cdot (1/2r_y^{\max}) \cdot (1/2r_z^{\max})]$. The acceptance probability is A = min[1, υ(s, s′)], with υ(s, s′) defined by Eq. (5.125). The system states s and s′ differ, in this case, only by the new coordinates of the i-th particle, i.e.
\[
s' = \{\ldots,\ r_i + \delta r,\ r_i^{1} + \delta r,\ \ldots,\ r_i^{n-1} + \delta r,\ \ldots\}.
\]
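The displacement move can be sketched as follows. This is a simplified illustration under stated assumptions: a single distinguishable particle in one dimension moving in an external potential, so that the determinant factor of Eq. (5.125) is absent; the function name `displace_path` and its arguments are hypothetical. Since the shape of the path is kept, the spring term does not change and only the potential action enters the acceptance test.

```python
import math
import random


def displace_path(path, dbeta, potential, r_max, rng):
    """Trial rigid shift of a closed 1D path (list of bead positions).

    The shift is drawn uniformly from [-r_max, r_max], a symmetric
    sampling distribution, so A = min[1, exp(-change of action)].
    Returns (path, accepted).
    """
    shift = rng.uniform(-r_max, r_max)
    trial = [r + shift for r in path]
    d_action = dbeta * (sum(potential(r) for r in trial)
                        - sum(potential(r) for r in path))
    if d_action <= 0.0 or rng.random() < math.exp(-d_action):
        return trial, True
    return path, False
```

In a many-particle simulation the potential sum would run over the interactions of the displaced path with all other paths at equal time slices, and for fermions the ratio of determinants would multiply the acceptance weight.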

(b) Single slice moves.

As the next type of move we consider a path deformation. We randomly choose one particle i for which we sample a new path. The beginning of the path, $r_i^{0}$, can be kept fixed. This kind of move corresponds to different fluctuations of the path and to the integration over the intermediate path variables, $dr_i^{1}, dr_i^{2}, \ldots, dr_i^{n-1}$, in Eq. (5.115).

When we change relative positions of the points inside the polymer this leads to a

change of the kinetic energy, see Eq. (5.121). In the single slice move, we select a particle

i or several particles simultaneously (thus we can omit index i and consider q′ as a vector

of N -particle coordinates), select a time slice m and sample new coordinates, q′m, while



keeping the edge points qm−1 and qm+1 fixed. The optimal choice for the sampling distri-

bution T (s, s′) is given by the heat bath rule, which states that the new coordinate should

be chosen according to its equilibrium distribution

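\[
T(q^{m} \to q'^{m}) = \frac{\rho(q^{m-1}, q'^{m}; \delta\beta)\,\rho(q'^{m}, q^{m+1}; \delta\beta)}{\int dq\, \rho(q^{m-1}, q; \delta\beta)\,\rho(q, q^{m+1}; \delta\beta)}
= \frac{\rho(q^{m-1}, q'^{m}; \delta\beta)\,\rho(q'^{m}, q^{m+1}; \delta\beta)}{\rho(q^{m-1}, q^{m+1}; 2\delta\beta)} \tag{5.128}
\]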

Since, in practice, it is difficult to compute the normalization density matrix

ρ(qm−1, qm+1; 2δβ) for a system of interacting particles, the free-particle sampling dis-

tribution (defined only by the kinetic energy operator) is usually used instead [11]

\[
T(q^{m} \to q') = \frac{\rho_{\rm kin}(q^{m-1}, q'; \delta\beta)\,\rho_{\rm kin}(q', q^{m+1}; \delta\beta)}{\rho_{\rm kin}(q^{m-1}, q^{m+1}; 2\delta\beta)}, \tag{5.129}
\]

with the result

\[
T(q \to q') = \frac{1}{2\lambda_\delta^{d/2}} \exp\left[-2\pi (q' - \bar q)^2/\lambda_\delta^2\right]. \tag{5.130}
\]
This is a Gaussian centered at the mid-point $\bar q = \frac{1}{2}(q^{m-1} + q^{m+1})$. The acceptance with this sampling probability will be close to 1 in a system of weakly interacting particles. Interparticle interaction reduces the acceptance, but not strongly. By choosing δβ → 0 we can always reduce the initially strongly coupled system to a weakly coupled one, where the kinetic energy dominates over the potential energy.
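The free-particle single slice move can be sketched in a few lines. The example below is an illustration, not the authors' implementation: the function names are hypothetical, the potential is assumed to enter through the simple term δβV(q) at the moved slice only, and the width of the Gaussian is read off from the exponent of Eq. (5.130), which corresponds to a standard deviation $\lambda_\delta/\sqrt{4\pi}$ for each Cartesian component. With this choice the kinetic factors cancel between the sampling distribution and the weight, so only the change of the potential term is tested.

```python
import math
import random


def sample_midpoint(q_prev, q_next, lam, rng):
    """Draw q' from a Gaussian centered at the mid-point of its neighbors.

    The exponent -2*pi*(q' - qbar)^2 / lam^2 of Eq. (5.130) gives
    sigma = lam / sqrt(4*pi) per coordinate.
    """
    sigma = lam / math.sqrt(4.0 * math.pi)
    return [rng.gauss(0.5 * (a + b), sigma) for a, b in zip(q_prev, q_next)]


def single_slice_move(path, m, dbeta, lam, potential, rng):
    """Resample slice m of an open path segment; slices m-1, m+1 stay fixed.

    path[m] is a list of coordinates. Returns True if the move is accepted.
    """
    trial = sample_midpoint(path[m - 1], path[m + 1], lam, rng)
    d_action = dbeta * (potential(trial) - potential(path[m]))
    if d_action <= 0.0 or rng.random() < math.exp(-d_action):
        path[m] = trial
        return True
    return False
```

For a vanishing potential every trial point is accepted, which is the limit of weak coupling mentioned above.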

Moving only a single point on the trajectory becomes inefficient because neighboring points are connected by the kinetic energy link actions (coupled harmonic oscillators), whose weights fall off exponentially with distance. Hence only very local modifications of the path are possible, which reduces the efficiency of the Metropolis algorithm (it is crucial to explore the whole configuration space, see the integrals in Eq. (5.115), in a reasonable computer time). Thus it is necessary to have moves which change several adjacent time-slices of the polymer simultaneously. In practice, one cuts a slice of a polymer (e.g. of

the i-th particle or several particles), i.e. (qm+1, . . . , qm+m0−1), and samples a new path

between the points qm and qm+m0 . Here the idea of the generalized Metropolis procedure

appears to be very useful. Among the generalized Metropolis procedures, we mention the

“Multigrid method” [44], “Multilevel Monte Carlo” [45] and “Bead Fourier Path Integral

Monte Carlo” [46]. Below we will discuss the idea of “Multilevel Monte Carlo”.

(c) Multilevel moves.

The main idea of the “Multilevel MC” (bisection) algorithm [47] is the following. We start sampling from the middle of the path. If the sampled mid-point (or the points in the center of the path) is not accepted, we do not need to continue. A rejection at this stage typically means that one particle comes too close to another one and the change of the potential energy is large. We can reject such moves early on without wasting time on sampling other points of the path, and only if the current move is accepted do we proceed to sample the other variables. In this way the algorithm becomes much faster, because only little computer time is spent before we decide whether to discard the whole newly constructed part of the polymer or to continue.

An algorithm for multilevel moves comes from the Levy construction [48]. Let us con-

sider a random walk going from one point to another – in our case a trajectory representing

a particle consisting of a finite number of time slices. We want to sample this random

walk, because in PIMC we need to do averages over different configurations of paths. The

way we do it is a recursive procedure.

Suppose we consider a particle trajectory at $m_0 = 2^{l_0-1}$ time slices $q^{m+1}, \ldots, q^{m+m_0-1}$, which corresponds to the imaginary time interval
\[
\beta_0 = \delta\beta \times m_0. \tag{5.131}
\]

We keep the initial and final positions of this interval, $q^{m}$ and $q^{m+m_0}$, fixed. Our task is to construct a new path determined by a set of points
\[
(q'^{m+1}, \ldots, q'^{m+m_0-1}).
\]
We start the construction of the path from its central point $q'^{(m+m_0/2)}$, because it is far away from the edge points and has a larger possibility for strong fluctuations in the potential energy. In particular, the potential energy strongly increases when the coordinate of one particle i, i.e. $q'^{(m+m_0/2)}_i$, is close to the coordinate $q'^{(m+m_0/2)}_j$ of particle j, i.e. when the particle trajectories approach or cross each other. Taking this into account we use the Levy algorithm, where one starts sampling from the middle of the interval. After we pick a middle point we have two intervals. Next, we choose a new mid-point in each of them. We repeat this procedure recursively for every new interval until we reach the necessary number of points on the path, $m_0 = 2^{l_0-1}$. In this procedure, the number of points is doubled on every recursive step, and the points are partitioned in blocks. For example, the block l contains $m_l = 2^{l-1}$ points, and the trajectory is divided into time intervals of length
\[
\beta_l = \delta\beta \times 2^{l_0-l-1} = \beta_0/2^{l} \tag{5.132}
\]

The new coordinates are partitioned and moved in a sequence of steps [11]

• $s_0$: the initial coordinates to be moved, $q^{m+1}, \ldots, q^{m+m_0-1}$.

• $s_1$: the coordinate at the time slice $m + m_0/2$.

• $s_2$: the coordinates belonging to the slices $m + m_0/4$, $m + 3m_0/4$.

• . . .

• $s_{l_0}$: the coordinates at the time slices $m + 1, m + 3, \ldots, m + m_0 - 1$.
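The partition of the time slices into levels can be listed explicitly with a few lines of Python. The helper below is hypothetical; it assumes that the interval between the fixed end points contains $2^{L}$ links, where L (called `n_levels`) is the number of bisection levels, as in the example m = 17, ..., 33 of Fig. 5.7.

```python
def bisection_levels(m, n_levels):
    """Time slices sampled at each level of the bisection.

    The end points m and m + 2**n_levels stay fixed. Level 1 holds the
    single mid-point; every further level holds the mid-points of the
    intervals created so far.
    """
    length = 2 ** n_levels
    levels = []
    for level in range(1, n_levels + 1):
        step = length // 2 ** level  # distance to the two neighbors
        levels.append(list(range(m + step, m + length, 2 * step)))
    return levels
```

For `bisection_levels(17, 4)` the first level is the slice 25, the second level holds the slices 21 and 29, and the last level holds all even slices from 18 to 32.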

In classical Monte Carlo, there is a step-size parameter δ which is adjusted to make the

average acceptance ratio close to 1/2. The analogous parameter in the present method is

the number of levels l0. If l0 is too small, diffusion of the path in the phase space is slowed

down because of the “freezing” effect of the fixed end-points on the sampling. On the other

hand, if the level is chosen to be large, the acceptance ratio becomes small, too. The value

of l0 which gives a reasonable acceptance ratio can be found empirically in each particular



case. One of the possibilities is to choose $l_0$ randomly as an integer from $\{1, 2, \ldots, l_M\}$, where the maximal value $l_M$ is determined by the condition $2^{l_M-1} \le n/M$, where M is

some integer and n is the total number of time slices on the trajectory. In practice, one

can take M equal to 3 or 4.
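As a small worked example of this rule, the following sketch computes $l_M$ and draws a level at random. The function names are assumptions of this illustration, and n ≥ M is assumed so that at least one level exists.

```python
import random


def max_level(n, M):
    """Largest integer l_M with 2**(l_M - 1) <= n / M (assumes n >= M)."""
    l_max = 1
    while 2 ** l_max <= n / M:
        l_max += 1
    return l_max


def random_level(n, M, rng):
    """Pick l_0 uniformly from 1, 2, ..., l_M."""
    return rng.randint(1, max_level(n, M))
```

For n = 100 time slices, M = 3 gives $l_M = 6$ (since 32 ≤ 100/3 < 64) and M = 4 gives $l_M = 5$.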

For moving particle coordinates at time slices we define the transition probability to a

new position at the level l as

\[
T_l(s_l \to s'_l) \equiv T_l(s_l, s'_l) = p_l(s'_l)/p_l(s_l), \tag{5.133}
\]

and the acceptance probability of this move as

\[
A_l(s_l \to s'_l) \equiv A_l(s_l, s'_l) = \min\left[1, \frac{T_l(s_l)\, p_l(s'_l)\, p_{l-1}(s_{l-1})}{T_l(s'_l)\, p_l(s_l)\, p_{l-1}(s'_{l-1})}\right]. \tag{5.134}
\]

We easily verify that, in this case, the detailed balance property is satisfied for each level

l:

\[
\frac{p_l(s)}{p_{l-1}(s)}\, T_l(s_l, s'_l)\, A_l(s_l, s'_l) = \frac{p_l(s')}{p_{l-1}(s')}\, T_l(s'_l, s_l)\, A_l(s'_l, s_l). \tag{5.135}
\]

The simulation procedure is as follows: we compare Al(sl, s′l) with a random number

ξ ∈ [0, 1], and in the case Al(sl, s′l) > ξ proceed to a new level l + 1. The whole trajectory

is updated only if it is accepted at all levels l = 1, . . . , l0. If there was a rejection at one

of the levels (i.e. Al(sl, s′l) < ξ), the trajectory stays in the old position, at least until the

next MC step, when such type of move will be tried again.
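The level-by-level procedure can be sketched for the simplest possible case. The code below is an illustration under several assumptions that are not part of the text: a single particle in one dimension in an external potential, units ħ = m = 1 by default, free-particle Gaussian sampling of the mid-points, and a level weight $p_l$ given by the simple potential term evaluated with the enlarged time step of that level. With these choices the kinetic factors cancel and each level tests only the change of the potential action relative to the previous level.

```python
import math
import random


def bisection_move(path, m, n_levels, dbeta, potential, rng,
                   mass=1.0, hbar=1.0):
    """Try to resample slices m+1 .. m+2**n_levels-1 of a 1D path.

    Mid-points are drawn level by level; the move is abandoned at the
    first rejected level. Returns True if all levels were accepted, in
    which case the path is updated in place.
    """
    length = 2 ** n_levels
    old = list(path[m:m + length + 1])
    trial = old[:]                      # end points are never touched
    prev_delta = 0.0                    # action change at level l-1
    for level in range(1, n_levels + 1):
        step = length // 2 ** level
        tau = step * dbeta              # time step of the links at this level
        sigma = math.sqrt(hbar**2 * tau / (2.0 * mass))
        for j in range(step, length, 2 * step):
            mid = 0.5 * (trial[j - step] + trial[j + step])
            trial[j] = rng.gauss(mid, sigma)
        # potential action on all interior points defined so far
        delta = tau * sum(potential(trial[j]) - potential(old[j])
                          for j in range(step, length, step))
        log_a = -(delta - prev_delta)
        if log_a < 0.0 and rng.random() >= math.exp(log_a):
            return False                # early rejection, old path is kept
        prev_delta = delta
    path[m:m + length + 1] = trial
    return True
```

If the first mid-point lands in a region of very high potential energy, the function returns after a single potential evaluation, which is the saving in computer time described above.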

As a sampling distribution $T_l(q)$ one can use, as in the case of the single slice move, the free-particle sampling, but now the one which follows from the ratio of the kinetic energy density matrices taken at temperature $2^{l} \times \delta\beta$:
\[
T_l(q) = \frac{1}{2^{l}\lambda_\delta^{d/2}} \exp\left[-\pi (q - \bar q)^2/(2^{l-1}\lambda_\delta^2)\right], \tag{5.136}
\]
where the point q is taken at the time slice $q = q^{i+2^{k-1}}$ (with k = 1, . . . , l), and $\bar q = \frac{1}{2}(q^{i} + q^{i+2^{k}})$ is the center of the interval at one level higher.

Further we need to define the probability functions $p_l(s_l)$ at each level. The best choice is a function proportional to the density matrix corresponding to the time step $2^{l-1}\delta\beta$. For example, on the last level $l_0$ this function is uniquely defined as
\[
p_{l_0}(q^{m}; \delta\beta) = \rho(q^{m-1}, q^{m}; \delta\beta)\,\rho(q^{m}, q^{m+1}; \delta\beta), \tag{5.137}
\]
which follows from the form of the integrand in Eq. (5.105).

On the other levels $p_l$ is a matter of our choice; as before, we can choose the probability density to be proportional to the DM at the corresponding inverse temperature. For example, on the first level (l = 1, the coordinate at the time slice $m + m_0/2$) we will have
\[
p_1(q^{m+m_0/2}) = \rho(q^{m}, q^{m+m_0/2}; \beta_0/2)\,\rho(q^{m+m_0/2}, q^{m+m_0}; \beta_0/2). \tag{5.138}
\]



The disadvantage of this form is that it requires a separate calculation of the unknown DM at higher temperatures. It is worth noting that one can use an approximate form of $p_l(s_l)$ for all levels except for the lowest. Therefore, it can be advantageous to use a simplified probability which can be computed faster. An often used approximation is $p_l(s_l) = p_{l_0}(s_l; 2^{l_0-l}\delta\beta)$, which means that we use the same expression for the probability as on the lowest level but with a different temperature argument, $2^{l_0-l}\delta\beta$. The probability on the lowest level, Eq. (5.137), is determined by the N-particle DMs $\rho(q^{m-1}, q^{m}; \delta\beta)$ and $\rho(q^{m}, q^{m+1}; \delta\beta)$; for them we can use the simple high-temperature approximation (5.108) or a more accurate pair approximation, Eq. (5.148), discussed in Sec. 5.2.2.

(d) Sampling of permutations.

The multilevel Metropolis algorithm can be straightforwardly generalized to the sampling of permutations. First, we pick a permutation (from the N! possibilities) which has a non-zero probability. It is evident that local permutation moves consisting of a cyclic exchange of k neighboring particles will be more probable than an exchange of particles which are far apart. We now briefly discuss the changes in the bisection algorithm and then discuss how to pick permutations which give a non-zero contribution (have non-zero weight).

Suppose we have picked a permutation. For it we need a new configuration of paths, because now several particles can be involved in one cyclic permutation, which means that their path variables have to be changed. In this case, the bisection algorithm can be used as before to sample new path coordinates. The difference is that we now sample new trajectories simultaneously for the k particles involved in a cyclic exchange, and that the final points of the sampled trajectories are changed according to the chosen permutation, i.e. $\{q^{m}, q^{m+m_0}\} \to \{q^{m}, Pq^{m+m_0}\}$, see e.g. Fig. 5.7(b). For the k particle indices which are changed by a given permutation, the time slices are removed and new paths are sampled which connect one particle to another (for the k-particle exchange) or a particle to itself (if the given particle undergoes the identity permutation).

It is also possible to explore the whole space of N! permutations by sampling new trajectories for only two particles at a time, i.e. with the use of transpositions (exchanges of only two particle indices). This is possible because any N-particle permutation can be decomposed into an ordered sequence of pair transpositions (from this we can also determine the parity of the given permutation as $\delta_P = (-1)^{n}$, where n is the number of transpositions). In this case, we need to remove and sample new paths (with the bisection algorithm) for only 2 particles simultaneously which, in general, will have a higher acceptance ratio compared to the sampling of k paths. Certainly, in some cases this method will be disadvantageous, e.g. when in the given physical system two-particle permutations are very improbable, or when only permutations of a certain length have non-zero weight, i.e. they can be obtained only by applying several transpositions at the same time. In this case, reaching any cyclic k-particle permutation from the identity permutations at the beginning of the simulation will be problematic. One should start the simulations with the k-particle sampling to get permutations of the most probable length, and only then switch to pair transpositions. We will show below how the method



Fig. 5.6 Snapshot of five electrons in a 2D harmonic trap (XY plane) from the simulations of Ref. [73] before (a) and after (b) an exchange between the particles 1 and 4. The sampling of new paths is performed at time-slices m = 17 − 33 and takes place in the region inside the dotted circle. The paths of all other particles stay unchanged. The beginning (m = 0) (indicated by an arrow with the particle index) and the end (m = 100) point of the paths are shown by filled or crossed circles. After the bisection sampling we apply a pair transposition. As a result the path of particle 1 starting from $r^{m}|_{m=17}$ up to $r^{m}|_{m=100}$ (i.e. the end of the path) is now counted as a new path of particle 4 and vice versa (compare the paths of particles 1 (light grey line) and 4 (grey line) in the panels (a) and (b)). Now we have a new cyclic permutation (shown in the panel (b)) involving four particles 1, 3, 4 and 5, and which corresponds to P′{1, 2, 3, 4, 5} = {3, 2, 5, 1, 4}. Hence the length of the cyclic permutation is increased by one compared to the initial cyclic permutation, P{1, 2, 3, 4, 5} = {1, 2, 5, 3, 4}, shown in the panel (a) and which involves only three particles 3, 4 and 5. The shown paths are Fourier smoothed in the same way as in Fig. 5.5.

of pair transpositions works by considering a 2D system of five electrons in a harmonic trap. In Fig. 5.6 (a),(b) we show particle configurations before and after the pair transposition is applied to particles 1 and 4. This corresponds to a transition from the permutation {1, 2, 5, 3, 4} to a new one, {3, 2, 5, 1, 4}. Each number in these sequences has the following meaning: e.g. “3” at the first place of the sequence {3, 2, 5, 1, 4} shows that particle 1 is exchanged with particle 3; “2” at the second place means that particle 2 is in an identity permutation, etc. In terms of path integrals, this sequence tells us where the path of the current particle ends: at the beginning of the path of some other particle or at its own starting position.
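The bookkeeping behind this notation is easy to check with a short sketch. The helper functions below are hypothetical and use the convention just described: the entry at place i is the particle at whose starting point the path of particle i ends (particles are labelled 1, ..., N).

```python
def cycles(perm):
    """Cycle decomposition of a permutation given as a list of end labels."""
    seen, result = set(), []
    for start in range(1, len(perm) + 1):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i - 1]
        result.append(cycle)
    return result


def parity(perm):
    """+1 for even and -1 for odd permutations.

    A cycle of length k corresponds to k - 1 transpositions.
    """
    n_transpositions = sum(len(c) - 1 for c in cycles(perm))
    return -1 if n_transpositions % 2 else 1


def transpose(perm, i, j):
    """Exchange the end points of the paths of particles i and j."""
    new = list(perm)
    new[i - 1], new[j - 1] = perm[j - 1], perm[i - 1]
    return new
```

Applied to the example of Fig. 5.6, `transpose([1, 2, 5, 3, 4], 1, 4)` returns `[3, 2, 5, 1, 4]`; its cycle decomposition contains one cycle of the four particles 1, 3, 5, 4, whereas the initial permutation contains a cycle of the three particles 3, 5, 4.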

Now we proceed to Fig. 5.7, where the Y -coordinates of the 5 electrons are shown as

a function of time-slices m. Particle indices in Fig. 5.7 (a) and (c) are placed near the

starting and end point of the particle trajectories. Hence, when the sequences of indices

at m = 0 and m = 100 do not coincide (see Figs. 5.7 (a),(b)) it means that there is

a nonidentical permutation between the particles. This sequence, however, is different

from the sequence of particle indices used in the notation of permutations (see the figure



Fig. 5.7 (a),(c) The Y-coordinates of five electrons as a function of the time-slice number m. Labels show particle indices. Thick grey and light grey lines show the paths of the particles “1” and “4” which are exchanged by sampling new paths at time-slices m = 17 − 33 (these time-slices are in the region between two dashed lines). (b) Sampling of new paths using the bisection algorithm for the electrons “1” and “4”. The new paths are constrained at the time-slices m = 17 − 33. Old (new) paths are shown by lines (circles). The filled circles show two mid-points sampled at the level l = 1 (center of the interval, m = 25) and four other mid-points for sub-intervals [17, 25] and [25, 33] sampled at level l = 2. Open circles show final new paths for two particles obtained with the sampling at levels l = 3, 4 and the transposition, i.e. by exchanging the paths starting from m = 33 up to the end point, m = 100.

caption of Fig. 5.6). Here it is determined by values of the Y components of the particle

coordinates and does not result from action of the permutation operator P .

As we can see from Fig. 5.7(a), the paths of particles “1” and “2” are closed (two identity permutations), the three other particles are in one cyclic exchange, and the whole permutation can be denoted as {1, 2, 5, 3, 4} (as we can see, the end of the path “3” coincides with the beginning of the path “5”, the end of the path “4” coincides with the beginning of the path “3”, and the path “5” ends at the starting position of the path “4”). Now we decide to make a transposition between particles “1” and “4”. To do this we randomly choose the time slices where new paths will be sampled. In our case it was m = 17 − 33. First, we exchange the edge points at the time slice m = 33, i.e. $r'^{33}_1 \equiv r^{33}_4$ and $r'^{33}_4 \equiv r^{33}_1$. Hence the positions of the edge points are not sampled; they are a part of the unchanged trajectories. Once the initial and final points are chosen, we use the bisection algorithm to sample two paths connecting the edge points, see Fig. 5.7(b).

Let us now discuss the probability to pick a permutation. If we make the end-point approximation for the density matrix (5.108), we can see that the interaction term is symmetric under particle exchange; it is not influenced by a permutation and hence drops out of the probability. Then the sampling probability for permutations can be chosen in the following form, $T(P) \propto \rho^{S/A}_{\rm kin}(q^{m}, Pq^{m+l}; \beta_l)$, i.e. proportional to the product of free-particle density matrices (see e.g. the exchange determinant in Eq. (5.119) for a Fermi system), which is the symmetric/antisymmetric N-particle density matrix at temperature $\beta_l = \delta\beta \times 2^{l_0-l-1}$, or, more explicitly,

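\[
T(P) = \frac{1}{W(P)}\, e^{-\pi (q^{m} - Pq^{m+l})^2/(\lambda_\delta^2\, 2^{l_0-l-1})}
= \frac{1}{W(P)} \exp\left(-\sum_{i=1}^{N} \pi (r_i^{m} - r_{Pi}^{m+l})^2/(\lambda_\delta^2\, 2^{l_0-l-1})\right), \tag{5.139}
\]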

where W (P ) is a normalization factor. The permutations of paths can be tried at any

time-slice “m” since the permutation operator P commutes with the Hamiltonian. To get

a new permutation we should resample new paths for the time slices (m . . .m + l).

The problem here is that to sample permutations with the probability T (P ) one should

know the normalization factor W (P ) which equals

\[
W(P) = \sum_{P \in \Omega(P)} e^{-\pi (q^{m} - Pq^{m+l})^2/(\lambda_\delta^2\, 2^{l_0-l-1})}, \tag{5.140}
\]
or, in the general case (if we use a more accurate approximation for the high-temperature density matrix, see Sec. 5.2.2),
\[
W(P) = \sum_{P \in \Omega(P)} \rho(q^{m}, Pq^{m+l}; \beta_l), \tag{5.141}
\]
which includes the exchange contribution from both the kinetic and the potential energy terms. The sum runs over all N! elements P of the permutation space Ω(P) and becomes difficult to compute as the system size increases. Hence one can restrict the changes of the current permutation to those in which only a few (2, 3 or 4) particles are exchanged cyclically. In this case the normalization factor can be computed directly. The acceptance probability follows from Eq. (5.127)

\[
A(P \to P') = \min\left[1, \frac{W(P)}{W(P')}\right]. \tag{5.142}
\]

The other way to use the probability (5.139) is to sample permutations of k particles using a random walk through a precomputed table of transition probabilities $w_{i,j}$ with the neighbors. The square matrix of distances between the end points,
\[
w_{i,j} = \exp\left[-(r_i^{m} - r_j^{m+l})^2/(\lambda_\delta^2\, 2^{l_0-l-1})\right], \tag{5.143}
\]
can be computed rather rapidly. The probability for trying a cyclic exchange of k particles with labels $n_1, \ldots, n_k$ is then proportional to
\[
T(P) \propto w_{n_1,n_2}\, w_{n_2,n_3} \cdots w_{n_k,n_1}. \tag{5.144}
\]



It is best to put in the table only permutations that have a probability greater than some threshold. This can be realized by a random walk through this table. Suppose we try a permutation of k particles. The first particle in this cyclic exchange, with the index $n_1$, is chosen randomly from the list of all particles. The second particle, “$n_2$”, is selected according to the probability $p_{n_2} = w_{n_1,n_2}/W_{n_1}$ with the normalization factor $W_{n_1} = \sum_{n_j=1}^{N} w_{n_1,n_j}$. When all k labels are selected and we have verified that they are unique, the trial permutation is accepted with the probability
\[
A = \min\left(1, \frac{\sum_{i=1}^{k} W_{n_i}/w_{n_i,n_i}}{\sum_{i=1}^{k} W_{n_i}/w_{n_i,n_{i+1}}}\right), \tag{5.145}
\]
where $n_{k+1} = n_1$. The sums in this expression arise because any member of the cycle can be chosen as the starting point of the cyclic permutation. The acceptance probability is small due to the fact that the last link $w_{n_k,n_1}$ has only a 1/N chance of closing the whole cycle on itself. But since the construction of each loop containing k links is computationally fast, the overall efficiency is not bad.
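The random walk through the table can be sketched as follows. This is an illustration only: the function names are hypothetical, coordinates are one-dimensional, particle labels start from 0, the argument `lam2` stands for the whole denominator $\lambda_\delta^2\, 2^{l_0-l-1}$ of Eq. (5.143), no threshold is applied to the table, and all table entries are assumed to be non-zero.

```python
import math
import random


def link_table(start, end, lam2):
    """w[i][j] = exp(-(r_i^m - r_j^{m+l})^2 / lam2), cf. Eq. (5.143)."""
    return [[math.exp(-(a - b) ** 2 / lam2) for b in end] for a in start]


def propose_cycle(w, k, rng):
    """Random walk through the table; returns k distinct labels or None."""
    n = len(w)
    labels = [rng.randrange(n)]          # first particle chosen at random
    while len(labels) < k:
        row = w[labels[-1]]
        nxt = rng.choices(range(n), weights=row)[0]   # p = w / W
        if nxt in labels:
            return None                  # labels are not unique
        labels.append(nxt)
    return labels


def cycle_acceptance(w, labels):
    """Acceptance probability of Eq. (5.145) for the cycle n1 -> ... -> nk -> n1."""
    W = [sum(row) for row in w]
    k = len(labels)
    num = sum(W[i] / w[i][i] for i in labels)
    den = sum(W[labels[a]] / w[labels[a]][labels[(a + 1) % k]]
              for a in range(k))
    return min(1.0, num / den)
```

For two particles whose end points coincide with their own starting points, at unit separation and with `lam2 = 1`, the pair exchange is accepted with probability exp(−1), reflecting the longer links of the exchanged paths.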

The physics helps to fix the parameter k. It is important to go beyond pair interac-

tions when the quantum degeneracy becomes high. For example, in the simulations of

liquid helium near the transition point to a superfluid state, it is necessary to have long

permutations involving most of the particles in the system.

(e) Acceptance Ratio.

When we try different kinds of moves in the Metropolis algorithm, it can turn out that some moves are rejected or accepted too frequently. In both cases we lose efficiency, because for a long time (number of MC steps) the system remains trapped in some local region of the coordinate space, and exploring the whole phase space will require

much more computer time. In practice, e.g. in classical Monte Carlo, the parameters

of the moves are usually chosen to get an acceptance ratio of roughly 50% (empirically

chosen value) [3]. For different kinds of moves in PIMC (particle displacement, trajectory

deformation, permutation sampling) such an acceptance ratio will also be preferable, but

it requires a good a priori sampling distribution. Thus it is of great importance to check

the acceptance ratio for each type of move. For example, in the bisection algorithm,

usually one has a poor acceptance at the first level, because in sampling of a mid-point

we use the density matrix of non-interacting particles. This leads to a strong overlap of

particle trajectories at high densities and, therefore, to a strong increase of the interaction

energy. As a result such moves will be accepted only accidentally. But the rest of the

path gets a better chance to succeed, as the sampling distribution (constructed from

the high temperature approximation of the density matrix) becomes more accurate with

decreasing time step τ , i.e. at higher temperatures. One can increase the acceptance ratio

by improving the level action, e.g. by shifting the Gaussian sampling distribution to the

potential minimum and modifying the dispersion.
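As an illustration of monitoring and adjusting the acceptance ratio, the sketch below tunes the maximal displacement of a simple one-dimensional Metropolis walk towards a target ratio of 50%. It is a toy example, not the PIMC code of this chapter: the function name `tune_step`, the growth factor 1.1 and the Gaussian test weight are assumptions made for the example. Such tuning should be restricted to the equilibration phase, since changing the step during production sampling alters the transition probabilities.

```python
import math
import random

def tune_step(log_weight, x0=0.0, delta=0.1, target=0.5,
              n_blocks=200, block_len=200, seed=1):
    """Adjust the maximal displacement `delta` until roughly `target`
    of the trial moves are accepted. Returns (delta, last_block_ratio)."""
    rng = random.Random(seed)
    x, ratio = x0, 0.0
    for _ in range(n_blocks):
        accepted = 0
        for _ in range(block_len):
            x_new = x + delta * (2.0 * rng.random() - 1.0)
            # Metropolis test for a symmetric trial move
            log_u = math.log(rng.random() + 1e-300)
            if log_u < log_weight(x_new) - log_weight(x):
                x, accepted = x_new, accepted + 1
        ratio = accepted / block_len
        # too many acceptances -> larger steps, too few -> smaller steps
        delta *= 1.1 if ratio > target else 1.0 / 1.1
    return delta, ratio

# Gaussian weight exp(-x^2/2), starting from a step that is far too small
delta, ratio = tune_step(lambda x: -0.5 * x * x)
```

In a PIMC code the same bookkeeping would be kept separately for each type of move (displacement, multilevel deformation, permutation), since their acceptance ratios can differ strongly.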

(f) Summary.

Let us now make a list of Monte Carlo moves used in the Metropolis algorithm for a system

of indistinguishable quantum particles:



(1) The move of the whole trajectory of a given particle. The coordinates on all time slices, m = 1, . . . , n, are changed. The determinant, det||ϕij ∆σiσj ||, in the expression (5.119) also changes its value.

(2) Deformation of the trajectory on the time slices m = k, . . . , k + m0 − 1, with $m_0 = 2^{l_0} − 1$, where $l_0$ = 1, 2, 3, . . . is the randomly chosen maximal number of blocks (multilevel moves).

(3) Permutation change: we randomly choose one of the N ! particle permutations. The trajectories have a possibility to end at the position of another particle, $r_{Pi} = r_j$, $i \neq j$. In this type of move the value of the determinant, det||ϕij ∆σiσj ||, in Eq. (5.119) is also changed.

We do not consider here a change of the spin variables, σ1, . . . , σN, and the algorithm

is valid only for the calculations with fixed projections of the particle spins, σi. The

generalization is not difficult. The change of spins will lead to a change of the exchange

determinant, det||ϕij ∆σiσj ||. Due to the factor ∆σiσj now some non-zero elements become

zero and vice versa. The transition probability between two states which differ in the particle spin projections, σ1, . . . , σN → σ′1, . . . , σ′N, can be written as

$$\upsilon(s, s') = \frac{\det||\varphi_{ij}\,\delta_{\sigma'_i\sigma'_j}||}{\det||\varphi_{ij}\,\Delta_{\sigma_i\sigma_j}||}. \tag{5.146}$$

The change of spins may also lead to a change of the parity of the permutation δP. Hence the sign of the density matrix must be changed to the sign of the factor $(−1)^{\delta P}$.

5.2.1.6 Monte Carlo algorithm for PIMC simulations

The brief scheme of the Metropolis algorithm in PIMC simulations consists of several steps,

which we will briefly describe below. One can recognize that some steps are repetitions

of those well known from classical MC in the canonical ensemble. We denote by “s” the system state, which contains all degrees of freedom.

(1) Specify the total number of time slices n. Take into account the error of the

high-temperature representation, e.g. Eq. (5.108), for the inverse temperature

β/n. Check the convergence of the obtained results by performing simulations for

several values of n.

If you use a pair density matrix approximation, precompute in advance the pair

DM tables for each type of interaction in the system.

(2) Choose randomly the positions of the N particles, $R^0 = (r^0_1, \dots, r^0_N)$. At the beginning of the simulation, let the coordinates of each particle i coincide on all time slices, $r^m_i = r^0_i$, m = 1, . . . , n, i.e. use point-like (classical) particles. Set the other system variables, e.g. the spin projections of the particles, etc.

(3) Set a total number of Monte Carlo steps Kmax and initialize all macroscopic

variables to be computed, As = 0. Set the MC step counter K = 1 and the

normalization factor Ws = 0.

(4) Set a “menu” of different Monte Carlo moves for the Metropolis algorithm, e.g.

the whole displacement of particles, multilevel moves, particle permutations, etc.



Also define a sequence and number of repetitions of these steps in the “menu”

before this sequence will be repeated from the beginning at the new MC step.

It is possible to introduce additional types of “moves”, i.e. for subsets of degrees

of freedom of the system to be changed (which e.g. take into account collective

excitations in the system). Define for these new “moves” the corresponding tran-

sition T (s, s′) and acceptance A(s, s′) probabilities, but check that they satisfy

the detailed balance condition and conserve the ergodicity property of the Markov

process.

(5) Choose a particle i at random among the N particles. Another possibility is to

choose particles systematically by organizing a cycle over their indices i = 1, . . . , N .

(6) Proceed with the sampling of the path variables and the corresponding Metropolis

scheme, depending on the realized possibility from the “menu” of Monte Carlo

moves, e.g.:

(a) Particle displacement.

(b) Multilevel deformation of particle trajectories.

(c) Permutation sampling.

(d) . . .

(7) Accept or reject the tried Monte Carlo “move” using the acceptance probability

A(s, s′). Change the system state s → s′ (s = s′) if the “move” was accepted,

otherwise stay in the old configuration s.

(8) Calculate the contribution of the current configuration s to the thermodynamic

average of the physical quantity A:

$A_s = A_s + (\pm 1)^{\delta P} A(s)$ (the sign “±” corresponds to a Bose/Fermi system, δP is the parity of the current permutation). Increase by one the number of performed MC steps: K = K + 1. Update the normalization factor: $W_s = W_s + (\pm 1)^{\delta P}$.

(9) If K < Kmax return to step (5).

(10) Calculate the thermodynamic average as 〈A〉 = As/Ws.
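The bookkeeping of steps (8) to (10) can be sketched in a few lines. The function below is a minimal illustration, assuming that each sampled configuration is already reduced to a pair (value of the observable, parity of the permutation); the name `signed_average` is hypothetical.

```python
def signed_average(samples, bosons=False):
    """Steps (8)-(10): accumulate A_s and W_s and return <A> = A_s / W_s.

    `samples` is an iterable of (value, parity) pairs, where `parity`
    is the parity (0 or 1) of the permutation of that configuration.
    """
    base = 1 if bosons else -1      # (+1)^dP for Bose, (-1)^dP for Fermi
    a_s = 0.0                       # running sum of sign * A(s)
    w_s = 0.0                       # running sum of signs (normalization)
    for value, parity in samples:
        sign = base ** parity
        a_s += sign * value
        w_s += sign
    if w_s == 0.0:
        raise ZeroDivisionError("average sign is zero; more samples are needed")
    return a_s / w_s
```

For fermions the normalization $W_s$ can become small compared to the number of samples, in which case the ratio is dominated by statistical noise; the explicit check above only catches the extreme case of an exactly vanishing sum.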

To illustrate the most important part of the Metropolis algorithm, the bisection algorithm used for sampling new particle coordinates, we provide below a part of the corresponding Fortran code.

c ==================================================================

c Block of some common data used in the given subroutines

c ==================================================================

MODULE MAINDATA

c Dimensionality of the problem

INTEGER, PARAMETER :: Dim=3

c Number of particles

INTEGER, PARAMETER :: NE=19

c Number of factors in the high temperature representation of the

c density matrix

INTEGER, PARAMETER :: NE_Ring=100



c Maximum number of levels in the bisection algorithm

INTEGER, PARAMETER :: max_level=6

c Common choice depending on the value of ’NE_Ring’:

c ’NE_Ring’ - ’max_level’

c 800 - 8, 100 - 6, 50 - 5

c Particles coordinates in the current and new position

REAL*8, DIMENSION(NE,NE_Ring,Dim) :: X, X_New

c Indices of exchange points

INTEGER, DIMENSION(NE) :: N_ex, No_ex

c Array of particle bounds needed for exchange

INTEGER, DIMENSION(NE,2) :: Bound,Old_Bound

c Temperature parameters

REAL*8 :: kT

END MODULE MAINDATA

c ==================================================================

c This procedure samples intermediate points with the Gaussian

c distribution

c ==================================================================

SUBROUTINE GAUSS_MIDPOINT(sigma,R1,R2,R)

USE MAINDATA

REAL*4 :: gauss

REAL*8 :: sigma

REAL*8, DIMENSION(Dim) :: R1, R2, R

INTEGER :: iDim

DO iDim=1,Dim

CALL Gaussian(gauss)

R(iDim)=0.5*(R1(iDim)+R2(iDim))+gauss*sigma

END DO

END SUBROUTINE GAUSS_MIDPOINT

c ==================================================================

c This procedure returns change of the potential energy between

c particles ’n1’ and ’n2’ involved in the pair transposition at

c the time-slice ’point’

c ==================================================================

SUBROUTINE INTER_ENERGY(n1,n2,point,energ)

USE MAINDATA

INTEGER :: n1,n2,point

REAL*8 :: R,R1,R_New,R1_New

REAL*8 :: energ

c this procedure returns the square of the distance between particles

c ’n1’ and ’n2’ at the time-slice ’point’

CALL INTERPARTICLE_DISTANCE(R,R1,X(n1,point,:)

& ,X(n2,point,:),X(n1,1,:),X(n2,1,:))



CALL INTERPARTICLE_DISTANCE(R_New,R1_New,X_New(n1,point,:)

& ,X_New(n2,point,:),X_New(n1,1,:),X_New(n2,1,:))

c Change of the Coulomb energy between the new and old positions

energ=V_e(point)*(1./SQRT(R_New)-1./SQRT(R))

END SUBROUTINE INTER_ENERGY

c ==================================================================

c This procedure realizes the bisection method.

c INPUT parameters are:

c n_path - number of paths involved in the permutation,

c possible values: n_path=2 - two particle transposition

c (exchange) (these two particles can already be involved

c in some permutation cycles. In the current realization

c all N! permutations are realized with the pair transpositions);

c : n_path=1 - sample new path for one single

c particle, no new permutation.

c iStart,iEnd - the indices of points on the path between which

c we sample new path (the two points itself are not changed);

c level - specify the level in the bisection algorithm,

c it is related with the number of sampled points on the path,

c e.g. i_level=level : 1 point - central mid-point ’X’ of the

c interval (iStart,iEnd)

c i_level=level-1 : 2 points - two mid-points of

c the subintervals (iStart,X) and (X,iEnd)

c i_level=level-2 : 4 points,

c ... etc.

c T_acc - acceptance probability;

c

c GLOBAL parameters from the module MAINDATA

c (these parameters are already specified in the external subroutines

c which call the subroutine DO_BISECTION)

c iN_perm(:) - integer logical variable

c iN_perm(:)=0 - single particle path sampling, no new permutation

c iN_perm(:)=1 - two particle transposition;

c

c iN(1:3) - array with the particle indices for which we sample

c a new path.

c X(1:NE,1:NE_Ring,1:Dim),

c X_New(1:NE,1:NE_Ring,1:Dim) - particles coordinates in the

c current and new positions.

c power_2(1:level) - array which contains powers of 2,

c i.e. 2,4,8,etc.

c ==================================================================

SUBROUTINE DO_BISECTION(n_path,iStart,iEnd,



& level,T_acc,regim)

USE MAINDATA

INTEGER :: n_path

INTEGER :: iStart, iEnd, level

c Edge points, between them we will sample a new path

REAL*8, DIMENSION(Dim) :: R1, R2, R

REAL*8 :: dU2,sigma,Prob1,Prob2, dU3

REAL*8 :: Prob_old

INTEGER :: count1,count2

REAL*8 :: T_acc

CHARACTER*5 :: regim

REAL*8 :: energ

if ((n_path.GT.2).OR.(n_path.LT.1)) then

write(*,*) 'Wrong number of paths in bisection',n_path

stop

end if

count2=0

DO i=1,n_path

IF (iN_perm(i).NE.0) THEN

X_New(iN(i),1,:)=X(iN(i),1,:)

X_New(iN(i),iStart,:)=X(iN(i),iStart,:)

c Exchange edge points of two particles at the pair transposition

if (i.EQ.1) then

X_New(iN(1),iEnd,:)=X(iN(2),iEnd,:)

else

X_New(iN(2),iEnd,:)=X(iN(1),iEnd,:)

end if

DO j=1,Dim

if (X_New(iN(i),iEnd,j).NE.X(iN(i),iEnd,j)) count2=1

END DO

END IF

END DO

count1=0

Prob1=0.

c We need to calculate the kinetic energy change - function dEkin(...)

c due to interchange of the edge points (iEnd) in the pair transposition

IF (count2.EQ.1) THEN

DO i=1,n_path

IF (iN_perm(i).NE.0) THEN



Prob1=Prob1+

& dEkin(iN(i),iStart,iEnd,power_2(level))

END IF

END DO

Prob1=BI_PL(level)*Prob1

if (Prob1.GT.17.) then

Prob1=0.0d0

else

Prob1=EXP(-Prob1)

end if

ELSE

Prob1=1.0d0

END IF

if (regim.EQ.'trial') then

T_acc=Prob1

c Check that two exchanged particles have the same spin projection

IF (NUL(iN(1),iN(2)).EQ.0.) T_acc=0.0d0

RETURN

end if

11 dU2=0.0d0

Prob_old=Prob1*Prob1

count1=count1+1

if (count1.GT.1) then

c The bisection failed, the new paths are rejected

T_acc=0.0d0

RETURN

end if

c Start a loop over the points which will be sampled with the

c bisection algorithm. The points are grouped in the sets

c corresponding to different levels.

DO i_level=level,2,-1

DO j=iStart+power_2(i_level-1)

& ,iEnd-power_2(i_level-1),power_2(i_level)

DO ipath=1,n_path

IF (iN_perm(ipath).NE.0) THEN

R1(:)=X_New(iN(ipath),j-power_2(i_level-1),:)

R2(:)=X_New(iN(ipath),j+power_2(i_level-1),:)

c ’sigma’ - the variance of the normal distribution used

c to sample mid-points, see Eq.(5.135).



c ’BI_PL(..)=2 \pi/\lambda^2’ - is a factor in the exponent

c of Eq.(5.135). This factor is an array calculated

c for different values of temperature ’\delta\beta’.

c This temperature varies depending on the variable ’i_level’

c - currently processed level

c in the bisection method.

sigma=0.5*SQRT(1.0d0/BI_PL(i_level-1))

c Sample intermediate point ’R’ using the gaussian distribution

c calculated from the ratio of kinetic energy density matrices,

c see Eq. (X.111)

CALL GAUSS_MIDPOINT(sigma,R1,R2,R)

c Store new sampled points as new particle coordinates

X_New(iN(ipath),j,:)=R(:)

END IF

END DO

c Calculate a difference in the energy due to the interaction

c with the rest particles. For this we use the arrays of

c particles coordinates in the new and old position:

c ’X_New(..)’ and ’X(..)’

DO ipath=1,n_path

IF (iN_perm(ipath).NE.0) THEN

c We consider two cases:

c i) ’simple’ - single particle path sampling, no permutation,

c ii)’psimple’ - two particle exchange (in this case the energy

c difference due to the interaction of two exchanged particles

c is calculated separately in the procedure ’INTER_ENERGY(..)’,

c see above.

c function dE(..) returns a difference in the potential energy

c of the particle ’iN(ipath)’ in the new and old position at

c the time-slice ’j’

if (iN_perm(1).EQ.iN_perm(2)) then

dU2=dU2+dE(iN(ipath),j,j,1,'psimpl')

else

dU2=dU2+dE(iN(ipath),j,j,1,'simple')

end if

END IF

END DO

if (iN_perm(1).EQ.iN_perm(2)) then

CALL INTER_ENERGY(iN(1),iN(2),j,energ)

dU2=dU2+energ

end if



END DO !End of loop over j

dU3=dU2*power_2(i_level-1)

dU3=dU3/kT

if (dU3.GT.17.) then

Prob2=0.0d0

else

if (dU3.LT.-20.) then

Prob2=1.0d0

else

Prob2=EXP(-dU3)

end if

end if

if (abs(Prob_old).LT.1.E-7) then

GOTO 11

else

T_acc=Prob2/Prob_old

end if

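c Draw a random number from the interval [0,1). The original call

c is missing from this listing; the intrinsic RANDOM_NUMBER is

c used here only as a stand-in.

CALL RANDOM_NUMBER(choice)

IF (choice.GT.T_acc) GOTO 11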

Prob_old=Prob2

END DO ! over i_level

c If we are here, the new paths have been accepted

RETURN

END SUBROUTINE DO_BISECTION
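For readers who prefer a compact view of the level structure, the following Python sketch reproduces only the mid-point sampling of the bisection scheme for a free particle. It is a simplified illustration and not a translation of DO_BISECTION: the interaction-dependent acceptance test at each level is omitted, and the names `bisect_free_path` and `lam2` (taken here to mean ħ²δβ/m for a single time slice) are assumptions of this example.

```python
import math
import random

def bisect_free_path(path, i_start, level, lam2, rng):
    """Resample the 2**level - 1 beads strictly between i_start and
    i_start + 2**level for a free particle (no interaction term).

    path : list of coordinate lists, one per time slice (modified in place)
    lam2 : hbar^2 * dtau / m for one time slice (assumed units)
    """
    dim = len(path[i_start])
    for lev in range(level, 0, -1):
        half = 2 ** (lev - 1)               # distance to the two neighbours
        # conditional variance of a mid-point between two fixed beads
        # that are 2*half time slices apart
        sigma = math.sqrt(0.5 * half * lam2)
        for j in range(i_start + half, i_start + 2 ** level, 2 * half):
            left, right = path[j - half], path[j + half]
            path[j] = [0.5 * (left[d] + right[d]) + sigma * rng.gauss(0.0, 1.0)
                       for d in range(dim)]
    return path
```

The coarsest level places a single mid-point, the next level two points, then four, and so on, so that every new point is drawn between two neighbours that are already fixed. In a full implementation a Metropolis test with the change of the interaction action would follow each level, and the move would be abandoned at the first rejection.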

5.2.2 Improved high–temperature N−particle density matrix.

As we have already noted in Sec. 5.2.1, the simple high-temperature approximation (5.108) fails for systems of particles with opposite charges (such as electron-ion plasmas or electron-hole systems in semiconductors). The reason is that particles with attractive interaction can approach each other very closely: in that case the

simplest factorization of the density matrix into the kinetic energy and potential energy

terms (5.108) will obviously fail due to the singular behavior of the Coulomb potential

at zero distances [16]. Thus we need to make a better approximation for the link action

defined as minus the logarithm of the density matrix between two successive points on a

path

$$S^m = -\ln\left[\lambda_{\delta\beta}^{d}\,\rho(q_m, q_{m+1}; \delta\beta)\right], \tag{5.147}$$

where d is the dimensionality of space. The better we make the action between two

successive points on a path (it will become more accurate at larger values of δβ), the



smaller will be the number of time slices we need on the path. The sampling becomes

easier and the estimations of various quantities (see Sec. 5.2.3), e.g. the kinetic energy,

will have a smaller statistical error.

In practice, to decide whether the time step, δβ = β/n, is small enough, one should

study the convergence of different thermodynamic quantities of interest with a series of

long simulations with smaller and smaller time steps, e.g. β/n, β/(2n), β/(4n), etc. A better action reaches the converged result already at a larger time step δβ. The primary quantity to

look at is the energy, since it is related to the partition function. But other quantities such

as the kinetic and potential energy or pair correlation functions should also be studied.

As follows from Eq. (5.147), the exact action is a many-body function. If the interaction is a pair potential, the action will have not only pair terms, but also three-body terms,

four-body terms, etc. Taking them all into account would require enormous amounts of

computer time and calculations would be possible only for small systems. On the other

hand, we know that in the high-temperature and low-density limit only two-particle corre-

lations are important and higher order terms become negligible. As a result, the following

approximation for the N -particle density matrix holds

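$$\rho(R,R';\delta\beta) \approx \prod_{i}^{N} \rho^{[1]}(r_i, r'_i; \delta\beta) \times \prod_{j<k} \frac{\rho^{[2]}(r_j, r_k, r'_j, r'_k; \delta\beta)}{\rho^{[1]}(r_j, r'_j; \delta\beta)\,\rho^{[1]}(r_k, r'_k; \delta\beta)} + O(\rho^{[3]}), \tag{5.148}$$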

where R = r1, . . . , rN specifies the coordinates of all N particles, i, j, k are the particle

indices, and ρ[1] (ρ[2]) is the single (two) particle density matrix. First, we can note that

this expression is exact for a pair of particles by definition. Since most of the collisions

occur between a pair of particles at a time, they are described correctly. The error comes

from the higher-order correlations which, however, are not important at high temperatures.

Thus the problem of an accurate N -particle density matrix is reduced to constructing an

accurate expression for the off-diagonal pair density matrix ρ[2](rj , rk, r′j , r′k; δβ).

In practice, it is crucial that the off-diagonal density matrix, $\rho^{[2]}$, can be quickly evaluated for any given initial $(r_i, r_j)$ and final $(r'_i, r'_j)$ coordinate vectors of the particles. For this reason, before doing the PIMC simulations, we calculate in advance tables of the pair density matrices (PDM) or the pair action (PA), i.e. $S^{[2]} = -\ln[\lambda_{\delta\beta}^{d}\,\rho^{[2]}]$, for each type of interaction in the system. For example, for an electron-hole system one needs to

calculate and store three PA tables corresponding to the electron-electron, hole-hole and

electron-hole interaction.

Using the fact that the initial and final positions cannot be too far apart, one can expand the PA in a power series. It is convenient to use the three distances

$$q = (|r| + |r'|)/2, \qquad s = |r - r'|, \qquad z = |r| - |r'|, \tag{5.149}$$

where we have introduced the relative coordinates $r = r_i - r_j$ and $r' = r'_i - r'_j$. The variables s and z are usually of the order of the thermal wavelength $\lambda_\delta$, and thus the PA can be expanded as (see e.g. Ref. [11])

$$S^{[2]}(r, r'; \delta\beta) = \frac{1}{2}\left[S(r, r; \delta\beta) + S(r', r'; \delta\beta)\right] + \sum_{k=1}^{n}\sum_{j=0}^{k} S_{kj}(q; \delta\beta)\, z^{2j} s^{2(k-j)}. \tag{5.150}$$

The first two terms in Eq. (5.150) are the diagonal elements, whereas the following terms are purely off-diagonal contributions; they are important because they allow one to reduce the number of time slices. The functions $S_{kj}$ are obtained by a least-squares fit of the difference S(r, r′; δβ) − [S(r, r; δβ) + S(r′, r′; δβ)]/2 to the second term on the r.h.s. of Eq. (5.150).

In most cases an accurate fit is already achieved with n = 2 or n = 3.
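The evaluation of Eqs. (5.149) and (5.150) from tabulated data can be sketched as follows. The callables `diag` (the diagonal action as a function of |r|, which presumes a spherically symmetric interaction) and `coeff` (the fitted functions $S_{kj}(q)$) are placeholders for table look-ups; they and the function names are assumptions of this example.

```python
import math

def pair_distances(r, r_prime):
    """Return (q, s, z) of Eq. (5.149) for relative vectors r and r'."""
    abs_r = math.sqrt(sum(x * x for x in r))
    abs_rp = math.sqrt(sum(x * x for x in r_prime))
    q = 0.5 * (abs_r + abs_rp)
    s = math.sqrt(sum((a - b) ** 2 for a, b in zip(r, r_prime)))
    z = abs_r - abs_rp
    return q, s, z

def pair_action(r, r_prime, diag, coeff, order):
    """Evaluate the expansion of Eq. (5.150).

    diag(x)        : diagonal action S(r, r) as a function of x = |r|
    coeff(k, j, q) : fitted functions S_kj(q); both callables are placeholders
    order          : upper limit n of the k-sum
    """
    q, s, z = pair_distances(r, r_prime)
    abs_r = q + 0.5 * z          # |r|  recovered from q and z
    abs_rp = q - 0.5 * z         # |r'|
    total = 0.5 * (diag(abs_r) + diag(abs_rp))
    for k in range(1, order + 1):
        for j in range(k + 1):
            total += coeff(k, j, q) * z ** (2 * j) * s ** (2 * (k - j))
    return total
```

With order n the double sum contains n(n + 3)/2 terms, so for n = 2 only five one-dimensional tables $S_{kj}(q)$ are needed per interaction type in addition to the diagonal action.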

Finally, using the PA tables, we are able to estimate

$$\rho^{[2]}(r_j, r_k, r'_j, r'_k; \delta\beta) = e^{-S^{[2]}(r_{jk},\, r'_{jk};\, \delta\beta)}, \tag{5.151}$$

for every particle pair j, k, substitute this in Eq. (5.148) for the N−body density matrix

and perform the integration in Eq. (5.115) by using the multilevel Metropolis algorithm.

5.2.2.1 Calculation of the pair density matrix.

There are various ways of computing the pair density matrix ρ[2], e.g. one can use a direct

eigenfunction expansion of the density matrix and calculate the contributions from bound

and continuum states. This method is particularly useful for the types of potentials where

analytic expressions for the continuum wave functions exist. In the general case, it is more

efficient to use methods which directly deal with the density matrix, e.g. the matrix-

squaring method [49] or the variational approach [16, 50]. It is possible to construct the

PDM from some effective potentials obtained by a perturbation theory. For example, in

simulations of hydrogen plasmas the Kelbg potential, which is the solution of the two-

particle Bloch equation in the limit of weak coupling, is frequently used. We will briefly

give the main ideas of these methods.

(a) Matrix squaring technique.

The exact off-diagonal pair density matrix can be calculated efficiently by this method

introduced by Storer and Klemm [49]. First, the density matrix is factorized into a center-

of-mass term and a term that is a function of the relative coordinates only

$$\rho(r_i, r_j, r'_i, r'_j; \beta) = \rho_{cm}(R, R'; \beta)\,\rho(r, r'; \beta), \tag{5.152}$$

where R = (miri + mjrj)/(mi +mj), and r = ri − rj , and analogously for R′, r′. For the

case of spherical symmetry of the interaction potential, the relative pair density matrix

is expanded in terms of partial waves. This expansion reads, for the two- and three-

dimensional cases,

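$$\rho_{2D}(r, r'; \beta) = \frac{1}{2\pi\sqrt{r r'}} \sum_{l=-\infty}^{+\infty} \rho_l(r, r'; \beta)\, e^{i l \Theta}, \tag{5.153}$$

$$\rho_{3D}(r, r'; \beta) = \frac{1}{4\pi r r'} \sum_{l=0}^{+\infty} (2l + 1)\, \rho_l(r, r'; \beta)\, P_l(\cos\Theta),$$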



where Θ is the angle between r and r′. Each partial-wave component satisfies the 1D Bloch

equation (see footnote c) for a single particle in an external potential given by the interaction potential

and also the convolution equation,

$$\rho_l(r, r'; \tau) = \int_0^{\infty} dr''\, \rho_l(r, r''; \tau/2)\, \rho_l(r'', r'; \tau/2). \tag{5.154}$$

This is the basic equation of the matrix-squaring method, which allows one to calculate the function $\rho_l$ at a given temperature 1/τ from the same function at twice that temperature. Squaring the density matrix k times lowers the temperature by a factor of $2^k$. Each squaring involves only a one-dimensional integration which, due to

the Gaussian-like nature of the integrand in Eq. (5.154), can be performed quite accurately

and efficiently by standard numerical procedures. To start the matrix-squaring iterations,

Eq. (5.154), one needs a known accurate high-temperature form for the density matrix. A

convenient choice is the semiclassical approximation

$$\rho_l(r, r'; \tau) = \rho^0_l(r, r'; \tau)\, \exp\left(-\frac{\tau}{|r - r'|} \int_r^{r'} V(x)\, dx\right), \tag{5.155}$$

where $\rho^0_l(r, r'; \tau)$ is the partial-wave component of the free-particle density matrix.

Once the pair density matrix ρl(r, r′; τ) is computed for the desired value of τ , it is

substituted into Eqs. (5.153-5.154), and a summation over partial waves readily yields the

full relative density matrix.
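The squaring iteration itself is easy to demonstrate numerically. The sketch below is a deliberately simplified analogue: it works in one Cartesian dimension on the full line instead of with radial partial waves, uses units ħ = m = 1, and replaces the integral by a simple grid sum. The function name `matrix_squaring`, the grid parameters and the harmonic test potential are assumptions of this example; the starting density matrix uses the line average of the potential as in the exponent of Eq. (5.155).

```python
import math

def matrix_squaring(v_avg, beta, n_squarings, x_max=4.0, n_grid=65):
    """Return (grid, rho) with rho[i][j] ~ rho(x_i, x_j; beta) in 1D.

    v_avg(x, y) is the potential averaged along the straight line from
    x to y, as in the exponent of Eq. (5.155). Units: hbar = m = 1.
    """
    h = 2.0 * x_max / (n_grid - 1)
    grid = [-x_max + i * h for i in range(n_grid)]
    tau = beta / 2 ** n_squarings             # start at high temperature
    norm = 1.0 / math.sqrt(2.0 * math.pi * tau)
    rho = [[norm * math.exp(-(x - y) ** 2 / (2.0 * tau) - tau * v_avg(x, y))
            for y in grid] for x in grid]
    for _ in range(n_squarings):
        # rho(x, y; 2 tau) = int dx'' rho(x, x''; tau) rho(x'', y; tau);
        # rho is symmetric, so row j also serves as column j
        rho = [[h * sum(a * b for a, b in zip(row_i, row_j))
                for row_j in rho] for row_i in rho]
    return grid, rho

# Harmonic oscillator V = x^2/2: the line average is (x^2 + x*y + y^2)/6
grid, rho = matrix_squaring(lambda x, y: (x * x + x * y + y * y) / 6.0,
                            beta=1.0, n_squarings=4)
mid = len(grid) // 2                          # index of x = 0
exact = math.sqrt(1.0 / (2.0 * math.pi * math.sinh(1.0)))
```

Here `rho[mid][mid]` can be compared with `exact`, the known diagonal element ρ(0, 0; β = 1) of the harmonic oscillator. The grid spacing must remain smaller than the width of the Gaussian factor at the starting temperature, which limits how many squarings can be performed on a fixed grid.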

(b) Variational approach for the density matrix.

As a second method for solving the off-diagonal Bloch equation one can use a variational

perturbation expansion developed by Feynman and Kleinert [16]. In this procedure the

initial density matrix is presented in the form of a trial path integral which consists of

a suitable superposition of local harmonic oscillator path integrals centered at arbitrary

average positions $x_m$, each with its own squared frequency $\Omega^2(x_m)$. One starts from

decomposing the action in the density matrix as

$$\rho(r, r'; \beta) = \int_{(r,0)\to(r',\hbar\beta)} \mathcal{D}x\; e^{-S[x]/\hbar}, \tag{5.156}$$

$$S[x] = S_{\Omega,x_m}[x] + S_{int}[x], \tag{5.157}$$

with $S_{\Omega,x_m}[x]$ being the action of a trial harmonic oscillator with the potential minimum located at $x_m$, and $\int \mathcal{D}x$ denoting the functional integral over all trajectories. The interaction part

$$S_{int}[x] = \int_0^{\hbar\beta} d\eta \left[ V[x(\eta)] - \frac{1}{2}\mu\,\Omega^2\,[x(\eta) - x_m]^2 \right], \tag{5.158}$$

(Footnote c: See the introduction of this book, Sec. 1.6.)



is defined as the difference between the original potential V(x) and the displaced harmonic oscillator. The $\Omega^2$ term in Eq. (5.158) compensates the contribution of $S_{\Omega,x_m}[x]$ in Eq. (5.157). Now one can calculate the density matrix (5.156) by treating the interaction (5.158) as a perturbation, leading to a moment expansion

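$$\rho(r, r') = \rho_0^{\Omega,x_m}(r, r') \left(1 - \frac{1}{\hbar}\langle S_{int}[x]\rangle^{\Omega,x_m}_{r,r'} + \frac{1}{2\hbar^2}\langle S^2_{int}[x]\rangle^{\Omega,x_m}_{r,r'} - \dots\right) = e^{-\tau W_N^{\Omega,x_m}} \left(\frac{\mu}{2\pi\hbar^2\tau}\right)^{d/2}, \tag{5.159}$$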

with the definition

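$$W_N^{\Omega,x_m} = \frac{d}{2\beta} \ln\frac{\sinh\hbar\beta\Omega}{\hbar\beta\Omega} + \frac{\mu\Omega}{2\hbar\beta\sinh\hbar\beta\Omega}\left[(\tilde r^2 + \tilde r'^2)\cosh\hbar\beta\Omega - 2\tilde r\,\tilde r'\right] - \frac{1}{\beta}\sum_{n=1}^{N} \frac{(-1)^n}{n!\,\hbar^n}\langle S^n_{int}[x]\rangle^{\Omega,x_m}_{r,r'}, \tag{5.160}$$

where d is the space dimensionality and N is the order of the approximation. The function $\rho_0^{\Omega,x_m}(r, r')$ is the trial harmonic oscillator density matrix, $\tilde r = r - x_m$, $\tilde r' = r' - x_m$, and the expectation value of the interaction action on the r.h.s. of Eq. (5.160) is given by

$$\langle S^n_{int}[x]\rangle^{\Omega,x_m}_{r,r'} = \frac{1}{\rho_0^{\Omega,x_m}(r, r')} \int_{\tilde r,0}^{\tilde r',\hbar\beta} \mathcal{D}x \prod_{l=1}^{n} \int_0^{\hbar\beta} d\tau_l\; V_{int}[x(\tau_l) + x_m]\; e^{-\frac{1}{\hbar} A^{\Omega,x_m}[\tilde x + x_m]}. \tag{5.161}$$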

The function $W_N^{\Omega,x_m}$ can be identified as an effective quantum potential which is to be optimized with respect to the variational parameters $\Omega^2(r, r')$, $x_m(r, r')$. Note that, in the high temperature limit, this effective potential goes over to the original potential V(r). The optimal parameter values are determined from the extremum condition

$$\frac{\partial W_N^{\Omega,x_m}(r, r')}{\partial \Omega^2} = 0, \qquad \frac{\partial W_N^{\Omega,x_m}(r, r')}{\partial x_m} = 0. \tag{5.162}$$

The perturbation series (5.160) converges rapidly; in most cases already the first-order approximation $W_1^{\Omega,x_m}$ for the effective potential gives a reasonable estimate of the desired quantities [55].

(c) First order perturbation expansion for the pair density matrix.

Off-diagonal and diagonal Kelbg potential.

The equilibrium pair density matrix at a given inverse temperature β = 1/kBT is the

solution of the two-particle Bloch equation

$$\frac{\partial}{\partial\beta}\rho(r_i, r_j, r'_i, r'_j; \beta) = -H\,\rho(r_i, r_j, r'_i, r'_j; \beta), \qquad H = K_i + K_j + U(r_i, r_j, r'_i, r'_j). \tag{5.163}$$



If the interaction is weak, Eq. (5.163) can be solved by perturbation theory with the

following representation for the two-particle density matrix

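$$\rho_{ij} = \frac{(m_i m_j)^{3/2}}{(2\pi\hbar^2\beta)^3} \exp\left[-\frac{m_i}{2\hbar^2\beta}(r_i - r'_i)^2\right] \exp\left[-\frac{m_j}{2\hbar^2\beta}(r_j - r'_j)^2\right] \exp[-\beta\Phi_{ij}], \tag{5.164}$$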

where i, j are particle indices, $\rho_{ij} \equiv \rho(r_i, r_j, r'_i, r'_j; \beta)$, and $\Phi_{ij}(r_i, r_j, r'_i, r'_j; \beta)$ is the off-diagonal two-particle effective potential. In the following we will consider the application of this result to Coulomb systems. From first-order perturbation theory we get explicitly

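$$\Phi_{ij}(r_{ij}, r'_{ij}; \beta) \equiv e_i e_j \int_0^1 \frac{d\alpha}{d_{ij}(\alpha)}\, \mathrm{erf}\left(\frac{d_{ij}(\alpha)/\lambda_{ij}}{2\sqrt{\alpha(1 - \alpha)}}\right), \tag{5.165}$$

where $d_{ij}(\alpha) = |\alpha r_{ij} + (1 - \alpha) r'_{ij}|$, erf(x) is the error function, $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x dt\, e^{-t^2}$, and $\lambda^2_{ij} = \hbar^2\beta/2\mu_{ij}$ with $\mu^{-1}_{ij} = m^{-1}_i + m^{-1}_j$. The diagonal element ($r'_{ij} = r_{ij}$) of (5.165) is called the Kelbg potential, given by

$$\Phi_{ij}(x_{ij}) = \frac{e_i e_j}{\lambda_{ij} x_{ij}} \left[1 - e^{-x^2_{ij}} + \sqrt{\pi}\, x_{ij} \left(1 - \mathrm{erf}(x_{ij})\right)\right], \tag{5.166}$$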

with $x_{ij} = |r_{ij}|/\lambda_{ij}$. We emphasize that the Kelbg potential is finite at zero distance, reflecting the quantum nature of the two-particle interaction at small distances, which prevents any divergence. From Eq. (5.166) it is also clear that quantum effects become dominant (and the quantum potential deviates from the classical Coulomb potential) at distances $r_{ij} \lesssim \lambda_{ij}$, i.e. below the thermal de Broglie wavelength. In interacting systems, this is only a rough approximation, and at strong coupling the expression for the quantum particle “extension” deviates strongly from $\lambda_{ij}$ and needs to be generalized [52]−[55].
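Both forms of the effective potential are straightforward to evaluate numerically. The sketch below implements Eq. (5.166) directly and Eq. (5.165) with a simple midpoint rule in α; the function names, the default number of quadrature points and the unit product of charges are assumptions of this example.

```python
import math

def kelbg_diagonal(r, lam, ei_ej=1.0):
    """Diagonal Kelbg potential, Eq. (5.166), at distance r."""
    x = r / lam
    if x < 1e-8:
        return ei_ej * math.sqrt(math.pi) / lam    # finite limit at r = 0
    bracket = 1.0 - math.exp(-x * x) + math.sqrt(math.pi) * x * math.erfc(x)
    return ei_ej * bracket / (lam * x)

def kelbg_offdiagonal(r, r_prime, lam, ei_ej=1.0, n_alpha=2000):
    """Off-diagonal potential, Eq. (5.165), by midpoint quadrature in alpha."""
    total = 0.0
    for i in range(n_alpha):
        a = (i + 0.5) / n_alpha
        d = math.sqrt(sum((a * u + (1.0 - a) * v) ** 2
                          for u, v in zip(r, r_prime)))
        arg_scale = 1.0 / (2.0 * lam * math.sqrt(a * (1.0 - a)))
        if d < 1e-12:
            # erf(y)/d -> (2/sqrt(pi)) * arg_scale as d -> 0
            total += 2.0 / math.sqrt(math.pi) * arg_scale
        else:
            total += math.erf(d * arg_scale) / d
    return ei_ej * total / n_alpha
```

For $r'_{ij} = r_{ij}$ the quadrature reproduces the closed form (5.166), and for $r \gg \lambda_{ij}$ both functions approach the Coulomb value $e_i e_j/r$. When the two vectors are antiparallel, $d_{ij}(\alpha)$ passes through zero and the integrand has an integrable peak, so more quadrature points or an adaptive rule may be needed there.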

To obtain a simplified expression for the rather complex quantum potential (5.165) one

can approximate the off-diagonal matrix elements by the diagonal ones. A first possibility

is to approximate the integral over α by the length of the interval multiplied by the integrand at the center, which leads to the so-called KTR potential due to Klakow, Toepffer and Reinhard which (in the diagonal approximation) is often used in quasi-classical molecular dynamics simulations,

$$\Phi_{ij}(r_{ij}, r'_{ij}; \beta) \equiv \frac{e_i e_j}{d_{ij}(1/2)}\, \mathrm{erf}\left(\frac{d_{ij}(1/2)}{\lambda_{ij}}\right), \tag{5.167}$$

where $d_{ij}(1/2) = \frac{1}{2}|r_{ij} + r'_{ij}|$. Alternatively, the integral can be simplified by taking the off-diagonal Kelbg potential only at the center coordinate,

$$\Phi_{ij}(r_{ij}, r'_{ij}; \beta) \approx \Phi_{ij}\left(\frac{|r_{ij}| + |r'_{ij}|}{2}; \beta\right). \tag{5.168}$$

Many authors use the end-point approximation (5.166) for the effective potential $\Phi_{ij}(r_{ij}, r'_{ij}; \beta)$ in the pair density matrix (5.164) because it is computationally very convenient. The pair potential for the interparticle interaction is simply replaced by an effective potential which depends only on the radial variables $|r_{ij}|$, $|r'_{ij}|$. However, the end-point approximation is less accurate than the off-diagonal effective potential [54] and therefore requires a smaller time step δβ in the PIMC representation of the DM.

[Figure 5.8: panels labelled by |r′| = 0.25, 1.0, 2.0 and by T = 1 000 000 K, 250 000 K, 62 500 K; horizontal axis r/a_B from 0 to 3; curves labelled DKP and ODKP; plotted quantity ρ(r, r′; φ).]

Fig. 5.8 The “exact” off-diagonal density matrix ρ(r, r′; φ) for an electron-proton pair vs. the density matrix calculated with the diagonal (DKP) and off-diagonal (ODKP) Kelbg potentials (Ref. [55]). In all figures, results for three angular values are given: φ = 0 (upper curves), φ = π/2 (middle) and φ = π (lower curves). The proton is located at the origin, and the vector r′ (initial electron position) is fixed, |r′| = 0.25; 1.0; 2.0. The vector r (final electron position) is varied; φ is the angle between the vectors r and r′.

In Fig. 5.8 we show the angular dependence of the full off-diagonal two-particle density

matrix calculated with the off-diagonal Kelbg potential – ODKP (5.165) and its diagonal

approximation – DKP (5.166). The density matrix is shown at several temperature values

(T = 1 000 000, 250 000 and 62 500 K) and several angular distances (φ = 0, π/2, π)

between the vectors r ≡ rij , r′ ≡ r′ij (in each of the figures, the top curves correspond

to the case of parallel vectors, φ = 0, the lowest curves to antiparallel vectors, φ = π).

Also, for reference, we give the off-diagonal density matrix obtained from the “exact”

solution of the Bloch equation, cf. paragraph (a). At high temperatures, T ≥ 250 000

K, the Kelbg density matrix does not exhibit large deviations from the exact result. At



T = 1 000 000K, the ODKP density matrix practically coincides with the exact solution,

whereas the DKP approximation shows small deviations. In these cases the perturbation

expansion applies, the coupling parameter Γ ∼ 0.15, see left column of Fig. 5.8. (The

coupling parameter denotes the ratio of the mean interaction energy to the mean kinetic

energy.) With decreasing temperature, the deviations from the exact results grow, see

middle column. To better understand the details of the deviations, we magnified them by

including also results for T = 62 500 K, which is far beyond the scope of the perturbation

theory, T ≈ 0.4 Ry/kB, i.e. Γ ≈ 2.5. Here we observe that, at the origin, the density

matrix of the Kelbg potential is 3 times less than the exact one. The largest errors were

found for the DKP, in particular, in the case when the vectors r, r′ have the opposite

direction (φ = π).

From this comparison we can conclude that both the DKP and the ODKP show satisfactory agreement with the exact result in the cases where perturbation theory applies, $k_BT \gtrsim 2$ Ry. At lower temperatures there is only qualitative agreement. A more

comprehensive study of this topic can be found in Ref. [55].

5.2.3 Calculation of physical observables

Now we will discuss how to calculate thermodynamic properties in the path integral repre-

sentation. The N−particle density matrix ρ(β) contains the complete information about

the system with the observables given by

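$$\langle O\rangle(\beta) = \frac{\mathrm{Tr}[O\,\rho(\beta)]}{\mathrm{Tr}[\rho(\beta)]} = \frac{\int dR\, \langle R|O\,\rho(\beta)|R\rangle}{\int dR\, \langle R|\rho(\beta)|R\rangle} = \frac{\int dR\, dR'\, \langle R|O|R'\rangle \langle R'|\rho(\beta)|R\rangle}{\int dR\, \langle R|\rho(\beta)|R\rangle}. \tag{5.169}$$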

This expression is simplified if the physical observable O is diagonal in the coordinate

representation, i.e. 〈R′|O ρ(β)|R〉 = 〈R|O ρ(β)|R〉 δ(R′ − R). For example, the average

density of particles can be calculated as a simple average over the paths of all N particles

$$\rho(r^*) = \frac{1}{N n} \sum_{i=1}^{N} \sum_{k=0}^{n-1} \langle \delta(r^* - r^k_i)\rangle_{\rho_N}, \tag{5.170}$$

where $\langle \dots \rangle_{\rho_N}$ denotes the thermodynamic average defined in Eq. (5.169). Other quantities

can be obtained as derivatives of the partition function Z. Let us consider, e.g. the

expression for the internal energy

$$E = -\frac{\partial}{\partial\beta}(\ln Z) = -\frac{1}{Z}\frac{\partial Z}{\partial\beta}. \tag{5.171}$$

Such an expression is called the thermodynamic estimator of the energy (a property can often be calculated in different ways; a specific formula used to calculate a physical quantity is called an estimator). Taking into account the high temperature representation of the density matrix (5.108) and applying the derivative, this estimator, for the case of a Fermi system in d dimensions, takes the form

$$
E = E_{kin} + \left\langle \frac{1}{n}\sum_{k=0}^{n-1} V(R^k)\right\rangle
- \left\langle |\phi_{ij}|^{-1}\,\frac{\partial |\phi_{ij}|}{\partial\beta}\right\rangle,
\tag{5.172}
$$

$$
E_{kin} = \frac{d\,n\,N}{2\beta} - \left\langle \sum_{k=0}^{n-1}\frac{n\,m_e}{2\hbar^2\beta^2}\,(R^{k+1} - R^k)^2\right\rangle.
\tag{5.173}
$$

The first two terms of Eq. (5.172) are the kinetic and potential energies; the third term corresponds to the derivative of the exchange determinant, $|\phi_{ij}| = \sum_{\alpha}\det\|\varphi_{ij}\,\delta_{\sigma_i\sigma_j}\|$ (for Bose statistics there will instead be the derivative of the sum over N! terms). The main drawback of this expression is that the kinetic energy Ekin is written as a difference of two large terms, each diverging as the number of time slices goes to infinity, n → ∞, so that the difference is a strongly fluctuating quantity. If we go to the classical limit by fixing n and increasing temperature (decreasing β), the absolute error of the kinetic energy will grow. But we know that the correct result for Ekin in the classical limit approaches the finite value (3/2)kBT per particle (in three dimensions).

This drawback can be eliminated by using the virial energy estimator [56]. The virial estimator is effective at computing quantum corrections to nearly classical systems. In this case the kinetic energy does not fluctuate and goes to the classical kinetic energy as β → 0. At the same time the potential energy term goes over to the classical potential energy.
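The growth of the fluctuations of the thermodynamic estimator with n can be illustrated with a toy model. The following is a minimal sketch, assuming a single free particle in d = 1 with ħ = m = 1, for which closed paths can be sampled exactly by a Brownian-bridge construction; the function names are hypothetical and not part of the text. For this model the mean of Eq. (5.173) is 1/(2β) for every n, while its variance works out to (n − 1)/(2β²).

```python
import math
import random


def thermodynamic_kinetic_energy(n, beta, rng):
    """One sample of the estimator (5.173) for a free particle, d = N = 1."""
    tau = beta / n                      # imaginary-time step (hbar = m = 1)
    x0 = 0.0                            # the closed path starts and ends here
    x = x0
    spring = 0.0                        # accumulates sum_k (x_{k+1} - x_k)^2
    for k in range(n):
        remaining = n - k               # links left before the path closes
        # Exact conditional Gaussian of a Brownian bridge pinned at x0.
        mean = x + (x0 - x) / remaining
        var = tau * (remaining - 1) / remaining
        x_next = rng.gauss(mean, math.sqrt(var))
        spring += (x_next - x) ** 2
        x = x_next
    # Difference of two terms that both grow linearly with n.
    return n / (2.0 * beta) - n * spring / (2.0 * beta ** 2)


def mean_and_variance(n, beta, samples, seed=0):
    """Sample mean and variance of the estimator over independent paths."""
    rng = random.Random(seed)
    values = [thermodynamic_kinetic_energy(n, beta, rng) for _ in range(samples)]
    mean = sum(values) / samples
    var = sum((v - mean) ** 2 for v in values) / (samples - 1)
    return mean, var
```

Calling `mean_and_variance(4, 1.0, 10000)` and `mean_and_variance(64, 1.0, 10000)` should give comparable means but a much larger variance for n = 64, which is the behavior that motivates the virial estimator.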

One method to derive a good energy estimator is to introduce spatial coordinates which are temperature dependent [42], i.e. the coordinates at the time slice m are given by

$$
\mathbf{q}^m = \mathbf{q}^0 + \sum_{i=1}^{m}\lambda_\delta\,\boldsymbol{\xi}^i, \qquad m = 1, \dots, n-1.
\tag{5.174}
$$

Here $\boldsymbol{\xi}^i$ is a set of unit vectors, and $\mathbf{q}^0$ is the particle coordinate at the zero time slice. The advantage of this form can be seen from the transformation of the partition function, Eq. (5.115). For the kinetic energy density matrices we get explicitly

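$$
\begin{aligned}
&\frac{1}{\lambda_\delta^{d n N}}\int\!\dots\!\int d\mathbf{q}\,d\mathbf{q}^1\dots d\mathbf{q}^{n-1}\;
e^{-\frac{\pi}{\lambda_\delta^2}(\mathbf{q}-\mathbf{q}^1)^2}\,
e^{-\frac{\pi}{\lambda_\delta^2}(\mathbf{q}^1-\mathbf{q}^2)^2}\dots
e^{-\frac{\pi}{\lambda_\delta^2}(\mathbf{q}^{n-2}-\mathbf{q}^{n-1})^2} \\
&\quad= \frac{\lambda_\delta^{d(n-1)N}}{\lambda_\delta^{d n N}}\prod_{i=1}^{N}\int\!\dots\!\int d\mathbf{q}_i\,d\boldsymbol{\xi}_i^1\dots d\boldsymbol{\xi}_i^{n-1}\;
\exp\!\left[-\frac{\pi}{\lambda_\delta^2}\,\lambda_\delta^2\sum_{m=1}^{n-1}\left(\sum_{j=1}^{m-1}\boldsymbol{\xi}_i^j-\sum_{k=1}^{m}\boldsymbol{\xi}_i^k\right)^{2}\right] \\
&\quad= \frac{1}{\lambda_\delta^{d N}}\prod_{i=1}^{N}\int\!\dots\!\int d\mathbf{q}_i\,d\boldsymbol{\xi}_i^1\dots d\boldsymbol{\xi}_i^{n-1}\;
e^{-\pi\sum_{m=1}^{n-1}(\boldsymbol{\xi}_i^m)^2}.
\end{aligned}
\tag{5.175}
$$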

Now the temperature dependence enters only through the normalization factor $1/\lambda_\delta^{dN}$ which, by taking the derivative $-\frac{1}{Z}\frac{\partial Z}{\partial\beta}$, contributes the classical kinetic energy term dN/(2β) to the total energy E. On the other hand, the potential energy acquires a dependence on temperature, U(q) → U(q(δβ)), and its β-derivative now consists of two terms

$$
-\frac{\partial e^{-\beta U(q)}}{\partial\beta} = \left(U(q) + \beta\,\frac{\partial U(q)}{\partial q}\,\frac{\partial q}{\partial\beta}\right)e^{-\beta U(q)}.
\tag{5.176}
$$
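The second term of Eq. (5.176) can be made explicit with one additional step. The sketch below assumes, as is not restated in this section, that the thermal wavelength scales as λδ ∝ β^{1/2} at a fixed number of time slices n; with the transformation (5.174) this gives:

```latex
\frac{\partial \mathbf{q}^m}{\partial\beta}
  = \frac{\partial\lambda_\delta}{\partial\beta}\sum_{i=1}^{m}\boldsymbol{\xi}^i
  = \frac{\lambda_\delta}{2\beta}\sum_{i=1}^{m}\boldsymbol{\xi}^i
  = \frac{\mathbf{q}^m-\mathbf{q}^0}{2\beta},
\qquad
\beta\,\frac{\partial U}{\partial\mathbf{q}^m}\,\frac{\partial\mathbf{q}^m}{\partial\beta}
  = \frac{1}{2}\left(\frac{\partial U}{\partial\mathbf{q}^m}\,\Big|\;\lambda_\delta\sum_{i=1}^{m}\boldsymbol{\xi}^i\right).
```

Under this assumption the correction is a virial-like scalar product of the potential gradient at slice m with the displacement of that slice from slice 0, which is the structure of the interaction terms in the kinetic energy expression (5.179).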

The application of the transformation (5.174) can be illustrated by the example of N particles in a harmonic potential (with frequency ω and center coordinate r0) with the Hamiltonian

$$
H = -\sum_{i=1}^{N}\frac{\hbar^2\nabla_i^2}{2m_e} + \frac{1}{2}\sum_{i=1}^{N} m_e\omega^2(\mathbf{r}_i-\mathbf{r}_0)^2 + \sum_{i<j}\frac{e^2}{\epsilon\,|\mathbf{r}_i-\mathbf{r}_j|}.
\tag{5.177}
$$

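The total energy calculated from the thermodynamic estimator $E = -\frac{1}{Z}\frac{\partial Z}{\partial\beta}$ is now

$$
E = \langle E_{kin}\rangle + \langle E_{pot}\rangle - \left\langle |\phi_{ij}|^{-1}\,\frac{\partial|\phi_{ij}|}{\partial\beta}\right\rangle,
\tag{5.178}
$$

$$
E_{kin} = \frac{dN}{2\beta}
+ \frac{m_e\omega^2}{2n}\sum_{i=1}^{N}\sum_{m=1}^{n-1}\left(\mathbf{r}_i^m-\mathbf{r}_0\,\Big|\,\lambda_\delta\sum_{k=1}^{m}\boldsymbol{\xi}_i^k\right)
+ \frac{e^2}{2\epsilon\,n}\sum_{i<j}\sum_{m=1}^{n-1}
\frac{\left(\mathbf{r}_i^m-\mathbf{r}_j^m\,\Big|\,\lambda_\delta\sum_{k=1}^{m}(\boldsymbol{\xi}_i^k-\boldsymbol{\xi}_j^k)\right)}{\left|\mathbf{r}_i^m-\mathbf{r}_j^m\right|^{3}},
\tag{5.179}
$$

$$
E_{pot} = \frac{m_e\omega^2}{2n}\sum_{i=1}^{N}\sum_{m=0}^{n-1}(\mathbf{r}_i^m-\mathbf{r}_0)^2
+ \frac{e^2}{\epsilon\,n}\sum_{i<j}\sum_{m=0}^{n-1}\frac{1}{|\mathbf{r}_i^m-\mathbf{r}_j^m|},
\tag{5.180}
$$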

where (. . . | . . .) is the scalar product of two d-dimensional vectors. The expression for the kinetic energy (5.179) has been obtained from the definition of the thermodynamic kinetic energy estimator as the mass derivative of the partition function

$$
E_{kin} = \frac{m}{\beta Z}\,\frac{\partial Z}{\partial m}.
\tag{5.181}
$$

Note that the expression for the kinetic energy contains only terms which remain finite even when n → ∞, and the statistical fluctuations in the evaluation of this expression are much smaller than those of Eq. (5.173). We can also note that there is a contribution to the kinetic energy from the interaction part of the Hamiltonian, i.e. the second and the third terms in Eq. (5.179). These are the quantum corrections to the classical kinetic energy (d/2)NkBT. As the temperature goes to zero, these terms give the value of the energy of zero-point fluctuations. In the high temperature limit, the thermal extension of the particles vanishes ($\lambda_\delta\sum_i\boldsymbol{\xi}^i \to 0$), and the contribution of the interaction part also vanishes.

Expressions for other thermodynamic quantities have already been given in Sec. 5.1.4.1. For all quantities which are related to the partial derivatives of the partition function one can easily derive corresponding estimators and use PIMC to calculate their thermodynamic averages.



5.2.4 Simulations of macroscopic systems. Finite-size effects

The question of how to correctly simulate infinite (macroscopic) physical systems using only a finite number of particles is one of the key points of many investigations. Although finite systems are very unlike bulk material, periodic systems with only a few particles can, in some cases, be large enough to model many properties of condensed matter, gases or plasmas. These simulations inevitably introduce “finite-size errors”, which in many cases limit the applicability of many-body simulation techniques to extended systems. These errors, however, can be significantly reduced by increasing the number of particles in the simulation cell (periodically repeated over the whole space), or by using modified interaction potentials which effectively take into account the interaction with the particles outside the simulation cell, and hence mimic the many-body correlations occurring in the infinite system. One can also use extrapolation corrections: some functional form f(N) which allows one to extrapolate the thermodynamic properties of a finite N-particle system to the corresponding thermodynamic limit. The extrapolation procedure requires several calculations for different values of N, followed by a fit to the chosen functional form f(N).
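The extrapolation step described above can be sketched as follows. This is only an illustration, assuming the simplest functional form f(N) = E∞ + b/N (the text does not prescribe a particular form here); the function name is hypothetical and any input energies are expected to come from separate simulations at different N.

```python
def fit_one_over_n(particle_numbers, energies):
    """Least-squares fit of E_N = E_inf + b / N; returns (E_inf, b)."""
    xs = [1.0 / n for n in particle_numbers]    # fit is linear in x = 1/N
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(energies) / count
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, energies))
    b = sxy / sxx                               # slope: finite-size coefficient
    e_inf = y_mean - b * x_mean                 # intercept: value at 1/N -> 0
    return e_inf, b
```

The intercept of the straight line in 1/N is the estimate of the thermodynamic limit; with noisy Monte Carlo data one would in practice weight the points by their statistical errors.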

We start by considering a simple model of point particles with charges qi interacting through the Coulomb potential, where the whole system is electrically neutral, $\sum_i q_i = 0$. Our goal will be to choose the corresponding model Hamiltonian such that the full potential energy, evaluated over all pairs of particles in the simulation cell (SC), equals the average potential energy per cell in a system containing an infinite number of periodically repeated original SCs. However, it is well known that the sum of Coulomb interactions 1/r is only conditionally convergent [57] and one needs to specify the boundary conditions at infinity. The usual procedure to find the interaction in the model Hamiltonian is to solve Poisson’s equation with the specified boundary conditions. In this case, one obtains the Ewald interaction [58] (see also the review [3]).

5.2.4.1 Ewald transformation

The key idea of the Ewald summation (to overcome the conditional convergence of the 1/r

interaction) is to assume that every particle with the charge qi (with the charge density in

the form of a δ-function) is surrounded by a screening charge distribution of the opposite

sign. As a result, the total potential produced by this charged cloud plus the δ-charge qi

exactly cancels at large distances (the resulting potential is the Debye potential and shows

an exponential decay). The opposite charge, −qi, is assumed a priori to have a Gaussian

distribution (although such a choice is not always optimal as was shown in Ref. [62]). Now

we can easily calculate the contribution to the electrostatic potential at a point ri using the

fact that all screened charges outside the SC give only a negligible contribution. However,

in the original problem our aim was to calculate the potential resulting from the point

charges. Hence, we must correct this situation by adding the compensating charge density.

As a result we have to take into account all three contributions and exclude Coulomb self-

interactions, i.e. exclude the potential energy due to the interaction of the point particle

with the compensating charge distribution centered at the same particle position ri. This,



however, at the first step, can be taken into account to simplify the calculations, and then

be corrected in the final result. Thus we need to calculate the following sums

$$
U = U_F + U_R - U_{SI} = \frac{1}{2}\sum_{j=1}^{N} q_j\left[\varphi_F(\mathbf{r}_j) + \varphi_R(\mathbf{r}_j) - \varphi_{SI}(\mathbf{r}_j)\right],
\tag{5.182}
$$

where the first term, UF , is the so-called Fourier part of the Ewald sum, the second term,

UR, is the Real-Space part, and the last term, USI , takes into account the net contribution

of the Self-interaction.

The potential ϕF (r) can be found from the solution of Poisson’s equation

$$
-\nabla^2\varphi_F(\mathbf{r}) = 4\pi\rho(\mathbf{r}),
\tag{5.183}
$$

or in the Fourier form,

$$
k^2\varphi_F(\mathbf{k}) = 4\pi\rho(\mathbf{k}).
\tag{5.184}
$$

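The charge distribution ρ(r) is a periodic sum of Gaussian distributions (with the width parameter σ, which is an arbitrary positive constant; the variance of each Gaussian is 1/(2σ))

$$
\rho(\mathbf{r}) = \sum_{i=1}^{N}\sum_{\mathbf{n}} q_i\,(\sigma/\pi)^{3/2}\exp\left[-\sigma\,|\mathbf{r}-(\mathbf{r}_i+\mathbf{n}L)|^2\right].
\tag{5.185}
$$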

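The Fourier transformation of this distribution,

$$
\rho(\mathbf{k}) = \int_V d\mathbf{r}\,\exp(-i\mathbf{k}\mathbf{r})\,\rho(\mathbf{r}) = \sum_{i=1}^{N} q_i\exp(-i\mathbf{k}\mathbf{r}_i)\exp(-k^2/4\sigma),
\tag{5.186}
$$

can be substituted in Poisson’s equation (5.184) with the result

$$
\varphi_F(\mathbf{k}) = \frac{4\pi}{k^2}\sum_{i=1}^{N} q_i\exp(-i\mathbf{k}\mathbf{r}_i)\exp(-k^2/4\sigma).
\tag{5.187}
$$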

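By applying the inverse Fourier transform to Eq. (5.187) one gets the spatial form of the potential

$$
\varphi_F(\mathbf{r}) = \sum_{\mathbf{k}\neq 0}\frac{4\pi}{V k^2}\sum_{i=1}^{N} q_i\exp\left[i\mathbf{k}(\mathbf{r}-\mathbf{r}_i)\right]\exp(-k^2/4\sigma).
\tag{5.188}
$$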

Now the contribution of the first term in Eq. (5.182) can be written explicitly

$$
U_F = \frac{1}{2}\sum_{j=1}^{N} q_j\varphi_F(\mathbf{r}_j) = \frac{1}{2}\sum_{\mathbf{k}\neq 0}\sum_{j,i=1}^{N}\frac{4\pi}{V k^2}\,q_j q_i\exp\left[i\mathbf{k}(\mathbf{r}_j-\mathbf{r}_i)\right]\exp(-k^2/4\sigma).
\tag{5.189}
$$

The term with k = 0 can be excluded from the sum by choosing the corresponding boundary conditions at infinity (i.e. a situation when the periodic system is embedded in a medium with infinite dielectric constant).



Now we will calculate the contribution of the self-energy term USI. We want to compute the potential produced by a Gaussian cloud of the compensating charge. To find this potential we simply solve Poisson’s equation for the charge density of the form

$$
\rho_G = q_j\,(\sigma/\pi)^{3/2}\exp\left(-\sigma(\mathbf{r}-\mathbf{r}_j)^2\right).
\tag{5.190}
$$

This gives (after a short calculation) the Coulomb potential multiplied by the error function,

$$
\varphi_G(r') = \frac{q_j}{r'}\,\mathrm{erf}(\sqrt{\sigma}\,r'),
\tag{5.191}
$$

where $r' = |\mathbf{r}-\mathbf{r}_j|$ and $\mathrm{erf}(x) \equiv \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}dt$. To calculate the self-energy we need to take the value of this potential at the position of the charge qj, i.e. r′ = 0. This leads to the following result

$$
U_{SI} = \frac{1}{2}\sum_{j=1}^{N} q_j\varphi_G(0) = (\sigma/\pi)^{1/2}\sum_{j=1}^{N} q_j^2.
\tag{5.192}
$$

As the last step, we need to calculate the energy due to the point charges screened by oppositely charged Gaussian distributions. The resulting (short-range) potential from these two charge distributions can be immediately written as

$$
\varphi_R(r) = \frac{q_i}{r} - \frac{q_i}{r}\,\mathrm{erf}(\sqrt{\sigma}\,r) = \frac{q_i}{r}\,\mathrm{erfc}(\sqrt{\sigma}\,r),
\tag{5.193}
$$

where we have used the previously obtained result (5.191) and the definition of the complementary error function, erfc(x) = 1 − erf(x). The total contribution of the screened charges to the interaction energy now becomes

$$
U_R = \frac{1}{2}\sum_{i\neq j}^{N} q_j q_i\,\frac{\mathrm{erfc}(\sqrt{\sigma}\,|\mathbf{r}_j-\mathbf{r}_i|)}{|\mathbf{r}_j-\mathbf{r}_i|}.
\tag{5.194}
$$

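The total potential energy is a sum of all three contributions calculated above

$$
U = \frac{1}{2}\sum_{\mathbf{k}\neq 0}\sum_{j,i=1}^{N}\frac{4\pi}{V k^2}\,q_j q_i\exp\left[i\mathbf{k}(\mathbf{r}_j-\mathbf{r}_i)\right]\exp(-k^2/4\sigma)
- (\sigma/\pi)^{1/2}\sum_{j=1}^{N} q_j^2
+ \frac{1}{2}\sum_{i\neq j}^{N} q_j q_i\,\frac{\mathrm{erfc}(\sqrt{\sigma}\,|\mathbf{r}_j-\mathbf{r}_i|)}{|\mathbf{r}_j-\mathbf{r}_i|}.
\tag{5.195}
$$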
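The three contributions of the Ewald sum can be evaluated directly for a small neutral system. The following is a minimal sketch, assuming a cubic cell of side L, Gaussian units, and the complementary error function taken at √σ r as implied by Eq. (5.191). Unlike Eq. (5.194), the real-space loop here also runs over a few periodic images, so that moderate values of σ can be used; the function name and the cutoffs `nmax` and `kmax` are illustrative choices, not part of the text.

```python
import cmath
import itertools
import math


def ewald_energy(charges, positions, box, sigma, nmax=2, kmax=8):
    """Ewald energy per cubic cell: Fourier + real-space - self-interaction."""
    count = len(charges)
    volume = box ** 3
    root_sigma = math.sqrt(sigma)

    # Real-space part: point charges screened by Gaussian clouds.
    u_real = 0.0
    for n in itertools.product(range(-nmax, nmax + 1), repeat=3):
        for i in range(count):
            for j in range(count):
                if i == j and n == (0, 0, 0):
                    continue            # a charge does not interact with itself
                d = [positions[j][a] - positions[i][a] + n[a] * box
                     for a in range(3)]
                r = math.sqrt(sum(x * x for x in d))
                u_real += 0.5 * charges[i] * charges[j] * math.erfc(root_sigma * r) / r

    # Fourier part: smooth Gaussian charge density, k = 0 term excluded.
    u_fourier = 0.0
    for m in itertools.product(range(-kmax, kmax + 1), repeat=3):
        if m == (0, 0, 0):
            continue
        k = [2.0 * math.pi * x / box for x in m]
        k2 = sum(x * x for x in k)
        structure = sum(q * cmath.exp(-1j * sum(ka * ra for ka, ra in zip(k, r)))
                        for q, r in zip(charges, positions))
        u_fourier += (0.5 * 4.0 * math.pi / (volume * k2)
                      * abs(structure) ** 2 * math.exp(-k2 / (4.0 * sigma)))

    # Self-interaction of each point charge with its own Gaussian cloud.
    u_self = math.sqrt(sigma / math.pi) * sum(q * q for q in charges)
    return u_fourier + u_real - u_self
```

A useful consistency check is that the total energy should not depend on σ once both sums are converged. For two opposite unit charges at (0, 0, 0) and (L/2, L/2, L/2), a CsCl-type arrangement, the result should be close to −2.0354/L, which corresponds to the known Madelung constant of that lattice.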

5.2.4.2 Pseudopotentials in quantum simulations

We will now discuss how the boundary conditions can be implemented in quantum sim-

ulations (e.g. in variational or diffusion quantum Monte Carlo methods, and in the path

integral technique).

As we have already seen by performing Ewald summation, the electrostatic potential

can be broken up into a long range part (presented by the Fourier components of the

original interaction potential) and a short range part in real space. One of the straightfor-

ward ways to apply this result, e.g. in Path Integral Monte Carlo, will be to calculate the



pair density matrix (which is used in the high temperature representation of the N -body

density matrix, Eq. (5.148)) separately for the long range and the short range potentials.

For a short and long range potential we can use the matrix squaring technique discussed

in Sec. (a). However, while this potential splitting procedure is well grounded for classical

and quasi-classical systems (when the kinetic energy operator commutes with the poten-

tial energy operator) for quantum systems it is not justified because the two potential

operators do not commute. In principle, one first needs to calculate the full action (or the

density matrix) and only then make the Ewald transformation.

Taking this fact into account, we can try to construct an effective potential (or action) which already includes many-body correlation effects in quantum systems. For example, to obtain an improved long range part of the potential one can use the random phase approximation (RPA). In this case, for the action $S_{RPA}(r, r')$ connected with the density matrix via Eqs. (5.147) and (5.151), and using the assumption that the structure factor of the interacting system, S(k), is close to that of the noninteracting system, S0(k), the Fourier transform of the pseudopotential, $S_{RPA}(k)$, describing correlations at large r can be found by minimizing the variational energy with respect to $S_{RPA}$, leading to the following result [59]

$$
S_{RPA}(k) = -\frac{1}{2S_0(k)} + \left(\frac{1}{2S_0^2(k)} + \frac{4V(k)\,m}{\hbar^2 k^2}\right)^{1/2},
\tag{5.196}
$$

where V (k) is the Fourier transform of the interaction potential. Variational quantum

Monte Carlo calculations [61] have shown that this pseudopotential gives quite satisfactory

results for the energy of a one-component fermion plasma in two and three dimensions.

In what follows we will briefly outline how the Ewald transformation can be applied to an arbitrary pseudopotential which replaces the original interaction potential between two particles (see Ref. [61] for details). As a first step, we recall the method to calculate the “image” potential between two particles with the original interaction $1/r^l$ [60]

$$
u(\mathbf{r}) = \sum_{\mathbf{n}} u_l^R(|\mathbf{r}-\mathbf{n}L|) + \sum_{\mathbf{k}} u_l^F(k)\exp(i\mathbf{k}\mathbf{r}),
\tag{5.197}
$$

where $u_l^R(r)$ and $u_l^F(k)$ represent the contributions of the real-space and Fourier parts, respectively, and are given as

$$
u_l^R(r) = \frac{\Gamma(l/2,\sigma r^2)}{\Gamma(l/2)}\,r^{-l},
\tag{5.198}
$$

$$
u_l^F(k) =
\begin{cases}
\dfrac{\pi^{d/2}\,(2/k)^{d-l}\,\Gamma\!\left(\tfrac{1}{2}(d-l),\,k^2/4\sigma\right)}{V\,\Gamma(l/2)}, & k \neq 0, \\[2ex]
-\dfrac{2\pi^{d/2}}{(d-l)\,V\,\Gamma(l/2)\,\sigma^{(d-l)/2}}, & k = 0.
\end{cases}
\tag{5.199}
$$

In the above expressions L, k are the lattice translational and reciprocal vectors (of the SC) in the d-dimensional space, V is the volume of the SC, and Γ(a, x) is the incomplete gamma function.
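As a quick check of Eqs. (5.198) and (5.199), one can specialize them to the Coulomb case l = 1 in d = 3. The sketch below uses only standard special values of the (upper) incomplete gamma function, which is assumed here to be the convention meant by Γ(a, x):

```latex
\begin{aligned}
&l=1,\ d=3:\qquad
\Gamma(\tfrac12)=\sqrt{\pi},\qquad
\Gamma(\tfrac12,x)=\sqrt{\pi}\,\mathrm{erfc}(\sqrt{x}),\qquad
\Gamma(1,x)=e^{-x},\\[1ex]
&u_1^R(r)=\frac{\Gamma(\tfrac12,\sigma r^2)}{\Gamma(\tfrac12)}\,\frac{1}{r}
        =\frac{\mathrm{erfc}(\sqrt{\sigma}\,r)}{r},\\[1ex]
&u_1^F(k)=\frac{\pi^{3/2}\,(2/k)^{2}\,e^{-k^2/4\sigma}}{V\sqrt{\pi}}
        =\frac{4\pi}{V k^2}\,e^{-k^2/4\sigma},\qquad k\neq 0.
\end{aligned}
```

Both limits have the same structure as the real-space and Fourier terms of the Coulomb Ewald sum: a complementary error function divided by r, and the Gaussian-damped coefficient 4π/(Vk²) that appears in Eq. (5.189).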

Using the introduced notation, we can now perform the Ewald summation for the potential $1/r^l$. As a result we get an expression similar to Eq. (5.195) (which is the particular case l = 1). We can now write down the potential energy of particle j

$$
u_l^j = \sum_{i=1,\,i\neq j}^{N} u_l^R(r_{ji}) + \sum_{\mathbf{k}\neq 0}\sum_{i=1,\,i\neq j}^{N} u_l^F(k)\left(\exp\left[i\mathbf{k}(\mathbf{r}_j-\mathbf{r}_i)\right]-1\right) + u_l^{SI},
\tag{5.200}
$$

where the last term takes into account the self-interaction of particle j with its own images

$$
u_l^{SI} = \sum_{\mathbf{k}\neq 0} u_l^F(k) - \frac{2\sigma^{l/2}}{l\,\Gamma(l/2)}.
\tag{5.201}
$$

The pseudopotential obtained in RPA, Eq. (5.196), is of arbitrary form; hence, to use the above results we should expand it into a series of terms $\alpha_i/r^{l_i}$ keeping $l_i < d$, i.e. less than the dimensionality of space. Such an expansion should approximately describe the asymptotic behavior of the pseudopotential at large distances, $r_{ji} \gg 1$. Taking this into account, it was proposed (see Ref. [61]) that the interaction can be presented approximately in the following form

$$
u(r) =
\begin{cases}
S_{RPA}(r), & r \le L/2, \\
\sum_i \alpha_i/r^{l_i}, & r > L/2.
\end{cases}
\tag{5.202}
$$

If we make such a choice for the interaction potential, then the Ewald procedure gives the same result as Eq. (5.200), but the functions $u_l^R(r) \equiv u^R(r)$ and $u_l^F(k) \equiv u^F(k)$ should now be computed as

$$
u^R(r) =
\begin{cases}
\left(S_{RPA}(r) - \sum_i \alpha_i/r^{l_i}\right) + \sum_i \alpha_i\,u_{l_i}^R(r), & r \le L/2, \\
0, & r > L/2,
\end{cases}
\tag{5.203}
$$

$$
u^F(k) = \sum_i \alpha_i\,u_{l_i}^F(k).
\tag{5.204}
$$

As one can see, in the short range part of the potential, $u^R(r)$, instead of the original RPA potential, $S_{RPA}$, we now have the deviation of this potential from the power law $\alpha_i/r^{l_i}$, plus a contribution from the real-space part of the Ewald potential which is due to the interaction with the “image” cells.

The discussed method can be quite accurate if the interaction potential is well approximated by the series expansion $\alpha_i/r^{l_i}$ outside the simulation cell.

More optimized methods for treating long-range potentials (compared to the Ewald

procedure) can be found in Ref. [62].

5.2.4.3 Elimination of finite-size errors

Let us now discuss two main sources of finite-size errors. The first one comes from the finite-size shell effects. If we consider a noninteracting homogeneous system with periodic boundary conditions (PBC), the single particle states are plane waves. To satisfy PBC, the wave vectors must obey the relation

$$
\mathbf{k}_{\mathbf{n}} = \frac{2\pi\mathbf{n}}{L},
\tag{5.205}
$$

where L is the size of the SC, and n is an integer vector. These states have the energy $E_{\mathbf{n}} = (\hbar^2/2m)\,k_{\mathbf{n}}^2$. If we consider a Fermi system then all occupied states will lie within a circle (a sphere in three dimensions) with the radius equal to the Fermi vector kF. The states are discrete, and the summation over these states gives a kinetic energy which differs from the result obtained when the integration over k-vectors is performed continuously from 0 to kF. This fact introduces a finite-size error in the kinetic energy term which decays algebraically with the particle number as $N^{-\gamma}$.

(a) Twist averaged boundary conditions.

To overcome this drawback it was proposed to use modified, twist-averaged boundary conditions [63]. It is assumed that the wave function Ψ, on reaching the boundary of the SC, can pick up an arbitrary phase θ,

$$
\Psi(\mathbf{r}_1 + \mathbf{n}L, \mathbf{r}_2, \dots) = e^{i\theta}\,\Psi(\mathbf{r}_1, \mathbf{r}_2, \dots) \equiv \Psi(\theta; \mathbf{r}_1, \mathbf{r}_2, \dots).
\tag{5.206}
$$

The PBC corresponds to the particular case θ = 0.

In the framework of this method, the twist average of any physical property O is defined as an average over all possible values of the twist angle

$$
\langle O\rangle = (2\pi)^{-d}\int_{-\pi}^{\pi} d\theta\,\langle\Psi(\theta; \mathbf{r}_1, \dots)|O|\Psi(\theta; \mathbf{r}_1, \dots)\rangle.
\tag{5.207}
$$

The twist angle in many cases has a clear physical origin. For example, when some

system is rotating with the angular frequency ω, one can go into a rotating frame without

changing the physical properties. The twisted boundary conditions are usually used for

the solution of the band structure problem for a periodic solid, where the integration over

θ corresponds to the integration over the first Brillouin zone.

In a more general sense, the twist angle is an additional degree of freedom that can be varied to approach the thermodynamic limit more quickly. For the noninteracting Fermi system considered above, the twisted boundary conditions lead to a displacement of the k-vectors. By considering a set of different twist angles θi (to approximate the integral in Eq. (5.207)) the momentum distribution of the finite number of fermions in k-space becomes very similar to the spherical Fermi surface of the infinite system. Considering from 16 to 32 values of θ it is possible to achieve a significant reduction of the finite-size effects in the kinetic energy term. Hence, the method has proved to be very efficient for weakly interacting quantum systems, where the kinetic energy contribution to the total energy dominates [63]. However, the twist-average method does not speed up the convergence of the finite-size effects due to the potential energy computed with the Ewald method, and further improvements in this direction are required. This leads us to consider the second source of errors, which comes from the potential energy, usually computed using the Ewald summation method.
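The effect of twist averaging on the shell error of the kinetic energy can be reproduced with a few lines of code. The following is a minimal sketch, assuming noninteracting spinless fermions in a 2D square cell at unit density with ħ = m = 1, a fixed particle number for every twist, and a uniform midpoint grid of twist angles; the function names and the grid size are illustrative choices.

```python
import math


def kinetic_energy_per_particle(n_particles, theta=(0.0, 0.0), nmax=6):
    """Fill the n_particles lowest plane-wave levels for a given twist.

    nmax must be large enough for the lattice to contain the Fermi disk.
    """
    box = math.sqrt(n_particles)        # unit density in 2D: L^2 = N
    levels = []
    for nx in range(-nmax, nmax + 1):
        for ny in range(-nmax, nmax + 1):
            # Twisted boundary conditions shift the allowed wave vectors.
            kx = (2.0 * math.pi * nx + theta[0]) / box
            ky = (2.0 * math.pi * ny + theta[1]) / box
            levels.append(0.5 * (kx * kx + ky * ky))
    levels.sort()
    return sum(levels[:n_particles]) / n_particles


def twist_averaged_energy(n_particles, ngrid=8):
    """Average over an ngrid x ngrid midpoint grid of twists in [-pi, pi)^2."""
    step = 2.0 * math.pi / ngrid
    total = 0.0
    for a in range(ngrid):
        for b in range(ngrid):
            theta = (-math.pi + (a + 0.5) * step, -math.pi + (b + 0.5) * step)
            total += kinetic_energy_per_particle(n_particles, theta)
    return total / ngrid ** 2


# Thermodynamic limit for spinless fermions in 2D at unit density:
# k_F^2 = 4 pi, so E/N = k_F^2 / 4 = pi.
E_INFINITE = math.pi
```

For a closed-shell particle number such as N = 13, comparing `kinetic_energy_per_particle(13)` and `twist_averaged_energy(13)` with `E_INFINITE` should show a clearly smaller deviation for the twist-averaged value.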

(b) Periodic Coulomb interaction.

It is clear that the periodic repetition of the original SC over the whole space gives unphys-

ical contributions to the potential energy. If the original simulation cell contains a point

defect, then by applying the PBC this defect will be repeated throughout the whole space,



and the mutual interactions between these defects in the neighboring cells will produce a

significant finite-size error.

Recently, it was shown [65, 66] that a new Hamiltonian which replaces the Hamiltonian of the original system and contains a modified periodic Coulomb interaction can give much smaller finite-size errors compared to the standard Ewald approach. The following simple consideration shows the main drawback of the Ewald approach. If we expand the Ewald interaction, $V^E(r)$, around zero separation then we get (for a cell with cubic symmetry), keeping the first few leading terms [66],

$$
V^E(r) = \frac{1}{r} + c + \frac{2\pi}{3V}\,r^2 + O\!\left(r^4/V^{5/3}\right).
\tag{5.208}
$$

As one can see, the requirement that the Ewald interaction is periodic leads to the deviation

from 1/r. As a result the exchange-correlation energy gets a negative spurious contribution

due to the terms rm. Since this leading correction is inversely proportional to the volume

of the simulation cell, V , the error per electron is inversely proportional to the number

of electrons. To get the correct exchange-correlation energy we must remove this spurious contribution. At the same time we should take into account that the Ewald interaction gives the correct value of the Hartree energy.

To solve this problem it was proposed [67] to use, instead of the Ewald interaction, a modified periodic potential which effectively models the effects of all particles in an infinite system

$$
V = \sum_{i<j} f(\mathbf{r}_i-\mathbf{r}_j) + \sum_i\int_V\left[V^E(\mathbf{r}_i-\mathbf{r}) - f(\mathbf{r}_i-\mathbf{r})\right]n(\mathbf{r})\,d\mathbf{r},
\tag{5.209}
$$

where n(r) is the particle density, and $f(\mathbf{r}) \equiv 1/r_m$ is the Coulomb potential calculated using the minimum image convention, i.e. the interparticle distance r is reduced (if necessary) by removal of the SC lattice vectors, $\mathbf{r}_m = \mathbf{r} - \mathbf{n}L$. This ensures that the interaction potential has the correct translational symmetry.
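The minimum image convention used for f(r) can be sketched in a few lines. This is only an illustration for a cubic (or square) cell of side L; the function names are hypothetical.

```python
import math


def minimum_image(displacement, box_length):
    """Reduce each Cartesian component to the nearest periodic image."""
    return tuple(x - box_length * round(x / box_length) for x in displacement)


def coulomb_minimum_image(r_i, r_j, box_length):
    """f(r) = 1 / r_m for the minimum-image separation of two particles."""
    d = minimum_image(tuple(a - b for a, b in zip(r_i, r_j)), box_length)
    r_m = math.sqrt(sum(x * x for x in d))
    return 1.0 / r_m
```

For example, two particles at x = 0.05 L and x = 0.95 L interact through the image at distance 0.1 L rather than through the direct separation of 0.9 L.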

The first term in Eq. (5.209) describes the direct Coulomb interaction of electrons inside the SC, and the second the potential due to the electrons outside the SC. The mean field potential in the second term contains the particle density n(r), which can be, in principle, calculated self-consistently. The extension of this method to a system of electrons and nuclei can be found in [66].

Calculations using this modified Coulomb interaction [65, 66] have shown that it significantly reduces the finite-size effects. At short distances it correctly describes the interaction between electrons and hence should give a more accurate description of short-range pair correlations.

(c) Finite-size scaling.

In conclusion, we should mention Fermi liquid theory which is usually used to correct finite-size effects of one-component Fermi systems. This method is based on the quasiparticle theory of Landau [64]. According to Landau it is possible to transform a strongly interacting Fermi system to a weakly interacting one of quasiparticles, which behave like an almost ideal Fermi gas. These quasiparticles describe the low-lying excitations of an interacting system from the ground state determined by the Fermi surface. The energy functional can be linearized in the parameter δn, i.e. the deviation of the occupation numbers of quasiparticles from their values in the ground state. We can also assume that these excitations are caused by the boundary conditions. For example, as we have already seen from Eq. (5.205), the Fermi surface of a finite system is not of circular shape. This fact can be considered as an excitation from the ground state, i.e. a perturbation of the perfectly spherical Fermi surface of the infinite system. Applying Landau’s theory one can get the following relation connecting the energies of the infinite, E∞, and finite, EN, systems

$$
E_N = E_\infty + a(r_s)\,(T_N - T_\infty)/r_s^2 + b(r_s)/(N r_s),
\tag{5.210}
$$

where a, b are functions of the density parameter rs which have to be determined from simulations with different particle numbers N. Further, TN is the kinetic energy of the ideal finite system and, hence, the second term describes the finite-size errors in the kinetic energy, which can be, in principle, eliminated using the twist-averaging procedure discussed in Par. (a). The density dependencies of the second ($1/r_s^2$) and third ($1/r_s$) terms are chosen in such a way that the functions a(rs) and b(rs) become practically independent of rs in the high density limit, which makes the fitting procedure simpler in this case.
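At a fixed density parameter rs, Eq. (5.210) is linear in the three unknowns E∞, a and b, so they can be obtained from a linear least-squares fit over several particle numbers. The following is a minimal sketch under that assumption; the function names are hypothetical, and the inputs (energies EN and ideal-gas differences TN − T∞) are expected to come from separate calculations.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    size = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]   # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for c in range(col, size + 1):
                a[row][c] -= factor * a[col][c]
    x = [0.0] * size
    for row in range(size - 1, -1, -1):                 # back substitution
        s = sum(a[row][c] * x[c] for c in range(row + 1, size))
        x[row] = (a[row][size] - s) / a[row][row]
    return x


def fit_finite_size(particle_numbers, delta_t, energies, r_s):
    """Fit E_N = E_inf + a * dT_N / r_s^2 + b / (N r_s); returns (E_inf, a, b)."""
    basis = [[1.0, dt / r_s ** 2, 1.0 / (n * r_s)]
             for n, dt in zip(particle_numbers, delta_t)]
    # Normal equations of the linear least-squares problem.
    normal = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
              for i in range(3)]
    rhs = [sum(row[i] * e for row, e in zip(basis, energies)) for i in range(3)]
    return tuple(solve_linear_system(normal, rhs))
```

At least three different particle numbers are needed, and the fit is only well conditioned if TN − T∞ and 1/N do not vary proportionally over the chosen set of N.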

5.2.5 Applications of PIMC

We close this chapter with the discussion of several applications to physical systems of

current interest which can be treated with the PIMC method.

First, these are systems where quantum effects are important but quantum statistics of particles does not play a significant role. This is the case when the thermal wavelength associated with a particle is no longer negligible compared with the mean interparticle distance, but still several times smaller (quantum degeneracy is low). Examples are atoms cooled down to temperatures of several Kelvin, or systems with strong correlations, where the interaction leads to a spatial localization of the particle wave function.

As an example, below we will show results of numerical calculations for mesoscopic clusters of a few electrons confined in semiconductor quantum dots. Here, the external parameters, such as temperature and the strength of the external confinement, can be changed at will, which allows one to investigate the system both in the classical-like and deep quantum regimes and which leads to physically different behaviors and a number of phase transitions.

5.2.5.1 2D Coulomb clusters

In recent years there has been growing interest in finite quantum systems at high density and/or low temperature. In particular, the behavior of a small number of electrons in quantum

dots is actively investigated, both experimentally [68] and theoretically [39, 69]. The

limiting behavior of two-dimensional (2D) finite quantum systems at zero temperature has

been studied by unrestricted Hartree-Fock calculations [69] which revealed a transition



from a Fermi liquid to an ordered state called “Wigner molecule”. The same crossover at

finite temperature has been recently demonstrated by fermionic path integral Monte Carlo

[39]. It has to be expected that further increase of correlations (increase of the Brueckner

parameter rs, where rs is a measure of the strength of Coulomb correlations, see below)

will lead to a still higher ordered quantum state resembling the Wigner crystal (WC) [70].

On the other hand, for finite classical systems, Monte Carlo simulations have shown

evidence of crystallization for sufficiently large values of the classical coupling parameter

Γ = U/kBT , where U is the interaction energy. These classical clusters consist of well

separated shells [71, 72], and melting proceeds in two stages: first, orientational disordering

of shells takes place - neighboring shells may rotate relative to each other while retaining

their internal order. Further growth of thermal fluctuations leads to shell broadening

and overlap - radial melting. The temperature of radial melting Tr may be many orders

of magnitude higher than the orientational melting temperature To. Large clusters with

N > 100 have a regular triangular lattice structure and exhibit only radial melting.

Now the question arises, how does the behavior of finite electron clusters change at low

temperature, i.e. in the quantum regime? It was found that, indeed, Wigner crystallization

in 2D quantum electron clusters exists and that it is accompanied by two distinct - radial

and orientational - ordering transitions too [73]. However, in contrast to classical clusters,

we observe a new melting scenario which is caused by quantum fluctuations and exists

even at zero temperature (“cold” quantum melting).

We consider a finite unpolarized 2D system of N electrons at temperature T. The electrons interact via the repulsive Coulomb potential and are confined in a harmonic trap of strength ω0. The system is described by the Hamiltonian

$$
H = -\sum_{i=1}^{N}\frac{\hbar^2\nabla_i^2}{2m_i^*} + \frac{1}{2}\sum_{i=1}^{N} m_i^*\omega_0^2 r_i^2 + \sum_{i<j}^{N}\frac{e^2}{\epsilon_b\,|\mathbf{r}_i-\mathbf{r}_j|},
\tag{5.211}
$$

where m∗ and ǫb are the effective electron mass and background dielectric constant, respectively. We use the following length and energy scales: r0, given by $e^2/(\epsilon_b r_0) = m^*\omega_0^2 r_0^2/2$, and Ec, the average Coulomb energy, $E_c = e^2/(\epsilon_b r_0)$. After the scaling transformations r → r/r0, E → E/Ec the Hamiltonian takes the form

$$
H = -\frac{n^2}{2}\sum_{i=1}^{N}\nabla_i^2 + \sum_{i=1}^{N} r_i^2 + \sum_{i<j}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|},
\tag{5.212}
$$

where $n \equiv \sqrt{2}\,l_0^2/r_0^2 = (a_B^*/r_0)^{1/2}$, $a_B^*$ is the effective Bohr radius, and $l_0^2 = \hbar/(m^*\omega_0)$ is the extension of the ground state wave function of noninteracting electrons trapped in a 2D harmonic potential. Further, we define, in analogy to macroscopic systems, the Brueckner parameter $r_s \equiv r_0/a_B^* = 1/n^2$. (Note that in mesoscopic clusters, rs and n characterize only the average electron density.) Finally, we introduce the dimensionless temperature T ≡ kBT/Ec which allows us to define the classical coupling parameter as Γ ≡ 1/T.
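The relations between the trap frequency and the dimensionless parameters can be checked numerically. The following is a minimal sketch, assuming effective atomic units in which ħ = m* = e²/ǫb = 1, so that lengths are measured in units of the effective Bohr radius; the function name and the dictionary keys are illustrative.

```python
import math


def dot_parameters(omega0, k_t):
    """Scales of the harmonically confined cluster in effective atomic units.

    omega0: trap frequency, k_t: thermal energy k_B T (both in effective Hartree).
    """
    r0 = (2.0 / omega0 ** 2) ** (1.0 / 3.0)     # from 1/r0 = omega0^2 r0^2 / 2
    e_c = 1.0 / r0                              # Coulomb energy scale E_c
    l0_sq = 1.0 / omega0                        # oscillator length squared
    n = math.sqrt(2.0) * l0_sq / r0 ** 2        # density parameter n
    r_s = r0                                    # r_s = r0 / a_B with a_B = 1
    t = k_t / e_c                               # dimensionless temperature
    return {"r0": r0, "Ec": e_c, "n": n, "rs": r_s, "T": t, "Gamma": 1.0 / t}
```

In these units the two expressions for n given in the text coincide, and rs = 1/n², so a weaker trap (smaller ω0) corresponds to a larger rs and thus to stronger Coulomb correlations.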

To obtain the configuration and thermodynamic properties of clusters of N electrons described by the Hamiltonian (5.212), it is convenient to perform PIMC simulations using the bisection algorithm discussed above in Par. (c) of Sec. 5.2.1.5. The number of time slices M has to be varied with n and T according to M = nl/T, where l is typically in the range of 1 . . . 10 to achieve an accuracy better than 5% for the quantities (5.213, 5.214), see below. To obtain the phase boundary of the Wigner crystal, calculations in a broad range of parameter values n, T, N have to be performed. For each set of parameters, several independent Monte Carlo runs consisting of approximately 10^6 steps are carried out.

Fig. 5.9 Relative angular and radial fluctuations, Eqs. (5.213, 5.214), in the vicinity of the orientational (OM) and radial melting (RM) transition for N = 12 and N = 19 versus density (Ref. [73]). The upper three figures show snapshots of the “magic” cluster with N = 19 in the three phases (left n = 0.025, middle n = 0.06, right n = 0.14). T = 5.0 × 10^−4. Shown error bars are typical for all curves. [Plot axes: n (horizontal); ur and uφ/10 (vertical).]

Structure of electron clusters. The simulations yield the spatial electron configurations in the trap, examples of which are shown in Fig. 5.9 (shown is the distribution of closed electron paths). One clearly sees the formation of shells. The number of shells and shell occupation depend on N [73]. Clusters in which the particle number on the outer shell is a multiple of that on the inner shell are called magic clusters and are particularly stable against intershell rotation. Let us now discuss the influence of quantum effects on the clusters. In contrast to classical systems, where the electrons are point-like particles, in our case the wave function of each electron has a finite width and may be highly anisotropic, which is typical for low temperature, as is most clearly seen in the left inset of Fig. 5.9 (shown is the distribution of closed electron paths, averaged over the Monte Carlo chain while keeping the electron positions, i.e. the ends of the paths, fixed). This peculiar shape results from a superposition of N-body correlations, quantum effects and the confinement potential. Varying the density and temperature changes the shape over a very broad range, which can lead to qualitative transitions of the cluster, including cold quantum melting, as will be shown below.

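To locate the melting temperatures and densities accurately, we examine the relative mean angular distance fluctuations of particles from different shells

\[
u_\phi \equiv \sqrt{\langle \delta\phi^2 \rangle} = \frac{2}{m_{s_1} m_{s_2}} \sum_{i}^{m_{s_1}} \sum_{j}^{m_{s_2}} \sqrt{\frac{\langle |\phi_j - \phi_i|^2 \rangle}{\langle |\phi_j - \phi_i| \rangle^2} - 1}, \tag{5.213}
\]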

where $\phi_i$ and $\phi_j$ are the angular positions of particles on shells $s_1$ and $s_2$, respectively, and $m_{s_1}$, $m_{s_2}$ are the total numbers of particles on the shells. We further consider the magnitude of the relative distance fluctuations

\[
u_r \equiv \sqrt{\langle \delta r^2 \rangle} = \frac{2}{N(N-1)} \sum_{i<j}^{N} \sqrt{\frac{\langle r_{ij}^2 \rangle}{\langle r_{ij} \rangle^2} - 1}, \tag{5.214}
\]

where $r_{ij}$ is the distance between particles i and j.
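The distance fluctuation of Eq. (5.214) can be estimated directly from a set of sampled particle configurations. The following Python sketch is only an illustration: the function name `relative_distance_fluctuation`, the six-particle ring and the Gaussian noise are assumptions made for this example and are not taken from the simulations of Ref. [73].

```python
import math
import random


def relative_distance_fluctuation(samples):
    """Estimate u_r of Eq. (5.214).

    samples: list of configurations, each a list of (x, y) particle positions.
    """
    n_part = len(samples[0])
    n_samp = len(samples)
    total = 0.0
    for i in range(n_part):
        for j in range(i + 1, n_part):          # sum over pairs i < j
            s1 = s2 = 0.0
            for conf in samples:
                r = math.dist(conf[i], conf[j])
                s1 += r
                s2 += r * r
            mean_r = s1 / n_samp
            mean_r2 = s2 / n_samp
            # max(...) guards against tiny negative values caused by rounding
            total += math.sqrt(max(mean_r2 / mean_r ** 2 - 1.0, 0.0))
    return 2.0 * total / (n_part * (n_part - 1))


# Hypothetical test data: a rigid ring and the same ring with small random displacements.
rng = random.Random(0)
ring = [(math.cos(2 * math.pi * k / 6), math.sin(2 * math.pi * k / 6)) for k in range(6)]
rigid = [ring for _ in range(200)]
noisy = [[(x + rng.gauss(0.0, 0.05), y + rng.gauss(0.0, 0.05)) for x, y in ring]
         for _ in range(200)]

u_rigid = relative_distance_fluctuation(rigid)
u_noisy = relative_distance_fluctuation(noisy)
print(u_rigid, u_noisy)
```

A rigid configuration gives a fluctuation that vanishes up to rounding, while the displaced ring gives a finite value. In a melting study one would look for a step-like growth of this quantity as the density or temperature is varied.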

Fig. 5.9 shows the n-dependence of the fluctuations uφ and ur, at a fixed temperature.

At the density $n \approx n_o^*$ we observe a first jump of the fluctuations. At this point the angle

between particles from neighboring shells can take arbitrary values (angular fluctuations

increase) and two shells may start to rotate relative to each other. This corresponds

to quantum orientational melting. At the same time one can observe a strong (step-

like) increase of the interparticle distance fluctuations. This fact is usually used in the

Lindemann criterion for investigation of phase transitions, see Sec. 5.1.6.2.

If in Fig. 5.9 we proceed to higher densities, we observe a clear second jump of the

fluctuations uφ and ur, which is related to total melting of the electron cluster. It is

instructive to compare the state of the cluster before and after this jump, see middle and

right snapshots in Fig. 5.9, respectively. Evidently, when the density is increased, the

peaks of the wave functions broaden in radial (and angular) direction until their width

becomes comparable with the inter-shell spacing. As a result, the probability of inter-shell

transfer of electrons grows rapidly, causing a sudden increase of the radial fluctuations and

the onset of radial melting.

The obtained values of the melting temperatures and densities are now used to draw the

phase boundaries of the radially and fully ordered states of mesoscopic clusters. The results

are summarized in Fig. 5.10. The phase boundaries have been found to be very sensitive to

the electron number and to the shell symmetry. In contrast to classical clusters, we observe

“cold” orientational and radial melting which is governed by the spread of the electron

wave functions in angular and radial direction. The predictions of the considered model

calculations are expected to be relevant for other systems of particles in external fields.

The physical values of critical temperatures and densities of melting can be recalculated

from the dimensionless parameters n and T. For example, in small 2D islands of electrons [holes] in semiconductor heterostructures such as GaAs/AlGaAs systems, the existence of the solid phase is predicted for carrier densities below approximately $10^8\,\mathrm{cm}^{-2}$ [$(10^9 \ldots 10^{10})\,\mathrm{cm}^{-2}$] and for temperatures below 1.6 K . . . 5.5 K, for confinement potentials of 3 meV . . . 10 meV.



[Figure 5.10: phase diagram plot; horizontal axis: density n, with an inset for the low-density region; curves labelled OM and RM.]

Fig. 5.10 Phase diagram of the mesoscopic 2D Wigner crystal (Ref. [73]). “OM” (“RM”) denotes the orientational (radial) melting curves for N = 10, 11, 12, 19, 20. Inset shows an enlarged picture of the low-density region. Dotted straight lines indicate the radial melting transition of a macroscopic classical and quantum WC. Brueckner parameter follows from the density by $r_s = 1/n^2$. Shown error bars are typical for all curves.

5.2.5.2 Binding energies of excitons, trions, biexcitons in semiconductor nanostructures

We now consider another application of PIMC simulations to excitonic bound states in

two-dimensional semiconductors (quantum wells) containing two types of charged carriers:

negative electrons and positive holes. Such excitonic “atoms” and “molecules” in quantum

confined semiconductors – quantum wells (quasi-2D system), quantum wires (quasi-1D sys-

tem) and quantum dots (quasi-zero-dimensional system) have been intensively investigated

in the last decade. These systems show nontrivial Coulomb correlation effects leading to

interesting optical and transport characteristics not seen in bulk materials. For example, a

strong increase of the binding energy of the excitonic complexes was found experimentally

with decreasing quantum well (QW) width and increasing magnetic field.

The standard theoretical approach to calculate binding energies is to solve the corresponding many-particle Schrödinger equation by means of an appropriate basis expansion.

This works efficiently in simple geometries but is not easily applicable to realistic (im-

perfect) materials, in particular, to QWs with well-width fluctuations which always occur

during sample growth. In this case, a different approach can be applied. It was demon-

strated in Ref. [74] that this problem can be efficiently solved using the path integral

Monte Carlo method without any restrictions on the geometry of the confinement poten-

tial, including quantum well width fluctuation effects. The localization is considered as a

consequence of the local modulation of the thickness of the quantum well of 1-2 monolayers

which corresponds to the experimental findings of Ref. [75].



We consider a single GaAs quantum well grown between two AlxGa1−xAs barriers

(0 ≤ x ≤ 1). The effective mass framework is used to describe the semiconductor material

and the QW structure. Taking into account the isotropy of the in-plane motion of the

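particles, the Hamiltonian for $N_e$ electrons and $N_h$ holes reads:

\[
H = \sum_{a=e,h} \sum_{i=1}^{N_a} \left[ -\frac{\hbar^2}{2 m_{a_i}} \nabla^2_{\mathbf{r}_{a_i}} + V_a(z_{a_i}) + V_a^{\mathrm{loc}}(\mathbf{r}_{a_i}) \right] + \sum_{i<j}^{N_e,\,N_h} \frac{e_i e_j}{\epsilon\,|\mathbf{r}_i - \mathbf{r}_j|}, \tag{5.215}
\]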

where $m_{a_i}$ and $e_i$ are the mass and charge of the i-th particle, $\epsilon$ is the dielectric constant, which we assume equal for the well and for the barrier, $V_{e(h)}$ is the confinement potential associated with the presence of the QW, $V^{\mathrm{loc}}_{e(h)}$ is the lateral (localization) confinement which is due to the fluctuations of the QW width. We take the quantum well growth direction as the z-direction.

For a GaAs/Al$_x$Ga$_{1-x}$As quantum well, we consider the following heights of the square-well potential: $V_e = 0.57 \times (1.155x + 0.37x^2)$ eV for electrons and $V_h = 0.43 \times (1.155x + 0.37x^2)$ eV for holes. In our calculations we use an Al concentration of x = 0.3. Furthermore, the following material parameters are used: $\epsilon = 12.58$, $m_e = 0.067\,m_0$, $m_h = 0.34\,m_0$, where $m_0$ is the mass of the free electron. The units for energy and distance are $H_a^* = 2R_y^* = e^2/(\epsilon a_B) = 11.58$ meV and $a_B = \hbar^2\epsilon/(m_e e^2) = 99.7$ Å, respectively. We have also considered the case of an anisotropic hole mass according to [76], using for the in-plane hole mass a smaller value of $m_h^{\parallel} = 0.112\,m_0$, and in the quantum well growth direction $m_h^z = 0.377\,m_0$. Comparing the binding energies calculated with the isotropic and anisotropic approximations gives important insight about the relevance of band structure details for the properties of excitonic complexes in quantum wells.
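As a quick consistency check of the parameters above, the effective units and the barrier heights can be recomputed from the dielectric constant, the electron mass and the Al concentration. The sketch below is an illustration only: the function names are hypothetical, and the rounded values of the atomic Hartree energy and Bohr radius are inserted here as assumptions. It reproduces the quoted units to within about one percent; the small remaining difference may reflect rounding of the input constants.

```python
# Rounded physical constants (assumed here for illustration)
HARTREE_EV = 27.2114        # atomic Hartree energy in eV
BOHR_ANGSTROM = 0.529177    # Bohr radius in Angstrom


def effective_units(eps, m_rel):
    """Return (effective Hartree in meV, effective Bohr radius in Angstrom)."""
    a_b = BOHR_ANGSTROM * eps / m_rel            # a_B = hbar^2 eps / (m e^2)
    ha = HARTREE_EV * m_rel / eps ** 2 * 1.0e3   # Ha* = e^2 / (eps a_B)
    return ha, a_b


def barrier_heights(x):
    """Square-well heights (V_e, V_h) in eV for Al concentration x."""
    gap = 1.155 * x + 0.37 * x ** 2
    return 0.57 * gap, 0.43 * gap


ha_mev, a_b = effective_units(eps=12.58, m_rel=0.067)
v_e, v_h = barrier_heights(0.3)
print(f"Ha* = {ha_mev:.2f} meV, a_B = {a_b:.1f} A")
print(f"V_e = {v_e * 1e3:.1f} meV, V_h = {v_h * 1e3:.1f} meV")
```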

To simplify the solution of the N-particle Bloch equation$^d$ with the Hamiltonian (5.215) (i.e. the equation of motion for the density matrix $\rho$ of $N_e$ electrons and $N_h$ holes, $N = N_e + N_h$, which is associated with the corresponding stationary Schrödinger equation), we further make the following assumption. It appears that the QW confinement is sufficiently strong and is several times larger than the Coulomb interaction among the particles. Consequently, we can separate the particle motion in the z-direction and in the plane of the QW. In this adiabatic approximation the full N-particle density matrix can be factorized into

\[
\rho(R_{xyz}; \beta) = \rho(Z_e; \beta)\, \rho(Z_h; \beta)\, \rho(R_{xy}; \beta), \tag{5.216}
\]

where $R_{xyz}$ ($R_{xy}$) $= \{\mathbf{r}_{e1}, \mathbf{r}_{e2}, \ldots, \mathbf{r}_{eN_e}; \mathbf{r}_{h1}, \mathbf{r}_{h2}, \ldots, \mathbf{r}_{hN_h}\}$ is a 3D (2D) vector of all particle coordinates, $Z_{e(h)}$ is the z coordinate of all electrons (holes), $\rho(Z_e; \beta)$ and $\rho(Z_h; \beta)$ are the density matrices of free electrons and holes confined in the z direction by the square well, and $\beta = 1/k_BT$ is the inverse temperature. We underline that the density matrix

ρ(Rxy; β) contains all in-plane electron-hole correlations and fully includes the effect of the

localization potential. It obeys the two-dimensional N -particle Bloch equation which is

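obtained by averaging the three-dimensional Bloch equation over z and using Eq. (5.215) and the ansatz (5.216):

\[
\frac{\partial}{\partial\beta}\rho(R_{xy};\beta) = -\left\{ \sum_{a=e,h}\sum_{i=1}^{N_a}\left[ -\frac{\hbar^2}{2m_{a_i}}\nabla^2_{xy} + V_a^{\mathrm{loc},xy} \right] + V^{xy}_{\mathrm{eff}} \right\}\rho(R_{xy};\beta). \tag{5.217}
\]

$^d$ See also the introductory chapter, Sec. 1.6.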

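Here, we have introduced an effective 2D in-plane interaction potential $V^{xy}_{\mathrm{eff}}$ between an electron and a hole

\[
V^{xy}_{\mathrm{eff}}(\beta) = \int dZ_e\, dZ_h \sum_{i<j} \frac{e_i e_j}{\epsilon\,|\mathbf{r}_i - \mathbf{r}_j|}\, \rho(Z_e; \beta)\, \rho(Z_h; \beta) \times \left[ \int dZ_e\, dZ_h\, \rho(Z_e; \beta)\, \rho(Z_h; \beta) \right]^{-1}. \tag{5.218}
\]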
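The z-averaging in Eq. (5.218) can be illustrated for a single electron-hole pair. The sketch below is not the tabulated potential used in the simulations: it assumes, purely for illustration, that both z-densities are given by the ground state of an infinitely deep well (a low-temperature simplification), uses a simple midpoint rule, and works in units of the effective Hartree and effective Bohr radius. The function names are hypothetical.

```python
import math


def z_density(z, width):
    """Unnormalised ground-state density of an infinite well on [-width/2, width/2]
    (an assumption replacing the true finite-well density matrix)."""
    return math.cos(math.pi * z / width) ** 2


def effective_eh_potential(r, width, n_grid=60):
    """z-averaged electron-hole Coulomb potential at in-plane distance r > 0.

    Lengths in effective Bohr radii, energy in effective Hartree.
    """
    h = width / n_grid
    zs = [-0.5 * width + (k + 0.5) * h for k in range(n_grid)]   # midpoint grid
    w = [z_density(z, width) for z in zs]
    num = 0.0
    norm = 0.0
    for ze, we in zip(zs, w):
        for zh, wh in zip(zs, w):
            weight = we * wh
            num += weight * (-1.0) / math.sqrt(r * r + (ze - zh) ** 2)
            norm += weight                   # denominator of Eq. (5.218)
    return num / norm


v_near = effective_eh_potential(0.5, width=1.0)
v_far = effective_eh_potential(20.0, width=1.0)
print(v_near, -1.0 / 0.5)    # smoothed potential is weaker than the bare 2D Coulomb value
print(v_far, -1.0 / 20.0)    # at large distance both agree closely
```

The averaging removes the Coulomb singularity at short in-plane distance and leaves the long-range tail essentially unchanged, which is the qualitative behaviour expected of a smoothed effective 2D potential.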

The effective e-e and h-h potentials are defined analogously. The total localization potential is introduced as ($\delta$ is the magnitude of the well width fluctuation)

\[
V^{\mathrm{loc},xy}_D =
\begin{cases}
E_0(L + \delta) - E_0(L), & \text{if } \sqrt{x^2 + y^2} \le D/2; \\
0, & \text{if } \sqrt{x^2 + y^2} > D/2,
\end{cases} \tag{5.219}
\]

where D is the diameter of the spherical (i.e. cylindrical) defect and E0(L) is the lowest

energy level in a QW of width L.
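To give a feeling for the size of the localization potential (5.219), the lowest level $E_0(L)$ of a finite square well can be found numerically from the standard matching condition for the even ground state, $k \tan(kL/2) = \kappa$. The sketch below is an illustration under stated assumptions: the same effective mass is used inside and outside the well, the value $\hbar^2/(2m_0) \approx 3.81$ eV Å$^2$ is inserted by hand, the width fluctuation of 2.83 Å (roughly one GaAs monolayer) is an example value, and all function names are hypothetical.

```python
import math

HBAR2_OVER_2M0 = 3.80998   # hbar^2 / (2 m0) in eV * Angstrom^2 (assumed constant)


def ground_state_energy(width, depth, m_rel):
    """Lowest level E0 (eV) of a finite square well; width in Angstrom, depth in eV.

    Assumes equal effective mass in well and barrier.
    """
    c = m_rel / HBAR2_OVER_2M0             # k^2 = c * E
    e_inf = (math.pi / width) ** 2 / c     # infinite-well level, where k * L / 2 = pi / 2
    lo, hi = 0.0, min(depth, e_inf) * (1.0 - 1e-9)

    def f(e):
        k = math.sqrt(c * e)
        kappa = math.sqrt(c * (depth - e))
        return k * math.tan(0.5 * k * width) - kappa   # even-state matching condition

    for _ in range(200):                   # bisection with f(lo) < 0 < f(hi)
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def localization_potential(x, y, width, delta, diameter, depth, m_rel):
    """V_D^{loc,xy} of Eq. (5.219) in eV for in-plane position (x, y) in Angstrom."""
    if math.hypot(x, y) <= 0.5 * diameter:
        return (ground_state_energy(width + delta, depth, m_rel)
                - ground_state_energy(width, depth, m_rel))
    return 0.0


# Electron in a 100 Angstrom well with the barrier height for x = 0.3 (about 0.2165 eV).
e0 = ground_state_energy(100.0, 0.2165, 0.067)
v_in = localization_potential(0.0, 0.0, 100.0, 2.83, 400.0, 0.2165, 0.067)
v_out = localization_potential(300.0, 0.0, 100.0, 2.83, 400.0, 0.2165, 0.067)
print(f"E0 = {e0 * 1e3:.1f} meV, V_loc inside = {v_in * 1e3:.2f} meV, outside = {v_out}")
```

Since a wider well lowers the ground-state energy, the potential is negative inside the defect region, i.e. the defect acts as an attractive in-plane trap.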

We can numerically solve the Bloch equation (5.217) using the path integral representation of the density matrix. For the N-particle high-temperature density matrix, $\rho(R,R';\tau)$, we use the pair approximation (5.148) which is valid for approximately $\tau \le 1/(3H_a^*)$. Before doing the PIMC simulations, we calculated in advance i) tables of

the pair density matrices corresponding to the electron-electron, hole-hole and electron-

hole interactions given by the two-particle Bloch equation with the smoothed effective

2D Coulomb potentials, see Eq. (5.218), and ii) two tables with the density matrix of a

single particle (electron and hole) in a 2D cylinder of finite height (for particles localized

at the interface defect). The contributions of all these interactions (correlations) can be

treated as additive, once the used high-temperature pair density matrices correspond to

sufficiently high temperature. We used tables of the pair density matrices at a temperature three times the effective electron-hole Hartree, i.e. $1/\tau = 3H_a^* = 403$ K. By choosing the number of time slices equal to n = 270, the full N-particle density matrix, $\rho(R,R;\beta)$, and all thermodynamic quantities can be accurately evaluated at a temperature T = 1.49 K.
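The relation between the high-temperature step, the number of time slices and the physical temperature is simple arithmetic, since $\beta = n\tau$. The following short sketch, with a hypothetical function name and a rounded value of the Boltzmann constant inserted as an assumption, reproduces the numbers quoted above.

```python
K_B_MEV_PER_K = 0.0861733   # Boltzmann constant in meV/K (rounded)


def temperature_from_slices(ha_mev, inv_tau_in_ha, n_slices):
    """Return (1/tau in K, simulated temperature in K) for beta = n_slices * tau."""
    t_high = inv_tau_in_ha * ha_mev / K_B_MEV_PER_K   # temperature of one time slice
    return t_high, t_high / n_slices


t_high, t_sim = temperature_from_slices(ha_mev=11.58, inv_tau_in_ha=3.0, n_slices=270)
print(f"1/tau = {t_high:.0f} K, T = {t_sim:.2f} K")
```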

As an example of calculations, we present our results for the binding energy of the

positively and negatively charged excitons (trions), see Figs. 5.11(a,b), and biexciton, see

Fig. 5.12. We define the binding energy of the exciton, charged exciton and biexciton as:

\[
\begin{aligned}
E_B(X) &= E_e + E_h - E(X),\\
E_B(X^\pm) &= E(X) + E_{h(e)} - E(X^\pm),\\
E_B(X_2) &= 2E(X) - E(X_2),
\end{aligned} \tag{5.220}
\]

where $E_{e(h)}$ is the energy of a single electron (hole) in the given quantum well with a free particle mean kinetic (thermal) energy $k_BT$, and $E(A)$ is the total energy of the excitonic complex A.

[Figure 5.11: panels (a) and (b); horizontal axis: L (Å), vertical axis: $E_B(X^{\pm})$ (meV). Legend: PIMC curves for $X^-$ and $X^+$ (isotropic and anisotropic hole mass); experimental $X^-$ data labelled Shields et al., Kaur et al., Finkelstein et al., Yan et al.; $X^+$ data labelled Kleinman et al., Blakemore et al., Phillips et al.]

Fig. 5.11 Trion binding energies for isotropic (a) and anisotropic (b) hole mass vs. quantum well width (Ref. [74]). Symbols are experimental data. Full (dashed) lines with symbols are PIMC results when localization is (not) included. Typical numerical errors are shown only for several curves.
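The definitions (5.220) translate directly into a few lines of code that turn total energies, as they would come out of separate PIMC runs, into binding energies. In the sketch below the function name is hypothetical and the input energies are made-up placeholder numbers chosen only to exercise the formulas; they are not simulation results.

```python
def binding_energies(e_e, e_h, e_x, e_x_minus, e_x_plus, e_x2):
    """Binding energies according to Eq. (5.220); all energies in the same unit."""
    return {
        "X": e_e + e_h - e_x,            # exciton
        "X-": e_x + e_e - e_x_minus,     # negative trion: exciton plus free electron
        "X+": e_x + e_h - e_x_plus,      # positive trion: exciton plus free hole
        "X2": 2.0 * e_x - e_x2,          # biexciton: two free excitons
    }


# Placeholder total energies in meV (hypothetical, for illustration only).
eb = binding_energies(e_e=30.0, e_h=10.0, e_x=32.0,
                      e_x_minus=60.5, e_x_plus=40.8, e_x2=62.0)
for name, value in eb.items():
    print(f"E_B({name}) = {value:.1f} meV")
```

Because each binding energy is a small difference of larger total energies, the statistical errors of the individual runs add up, which is why error bars on such quantities deserve attention.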

We compare our theoretical results with the available experimental data for negative

and positive trions (for the full list of references, see Ref. [74]). The theoretical results

corresponding to the inclusion of localization (solid lines) are obtained for the interface defect of diameter D = 400 Å. We can note that the agreement with the experiments is quite good for QW widths L ≥ 150 Å. Specifically, the experimental points for the $X^+$ are close to our theoretical result, see Fig. 5.11(a). Unfortunately, for narrow QWs (L ≤ 150 Å) when localization effects become important, there are currently no available experimental data. This would be of high interest as the two points for the $X^-$ reported

by R. Kaur et al. and Z.C. Yan et al. show a more rapid increase of the binding energy

with the QW width than the one predicted by theory when the localization is not taken

into account. On the contrary, calculations with the QW width fluctuations included agree

well with these data.

The obtained numerical results have two important implications which can be useful in

the interpretation of experimental data. First, comparing the measured binding energy

with our numerical calculations for different defect sizes allows one to characterize certain

experimental parameters, such as the magnitude of the disorder (well width fluctuations)

in a given sample. Secondly, one can verify or predict whether or not the experimentally

observed excitonic states are localized or delocalized in a given experimental set up.

In summary, PIMC simulations are very efficient to solve problems connected with

systems of strongly correlated quantum particles. In contrast to standard semi-classical methods, such as basis function expansions, which are efficient if certain symmetries exist, the PIMC simulations apply to arbitrary geometries. In particular, it is very easy to include disorder effects.

[Figure 5.12: panels (a) isotropic and (b) anisotropic hole mass; horizontal axis: L (Å), vertical axis: $E_B(X_2)$ (meV). Legend: experimental data labelled Birkedal et al., Kim et al., Bar-Ad et al., Pantke et al., Smith et al., Adachi et al., Langbein et al.]

Fig. 5.12 The same as in Fig. 5.11 for the biexciton binding energy (Ref. [74]). Symbols are experimental data.

5.2.6 Conclusion and outlook

This chapter was devoted to a powerful computational method which is based on random

numbers and random (Markov) processes. We have demonstrated that, suitably organized, such methods may be more efficient than common “direct” methods. A simple example is the comparison of MC and conventional discretization methods for computing multidimensional integrals. As we have seen, already for dimensions d ≥ 6 conventional methods fall behind MC techniques, which are then clearly preferable. Hence integrals with dimensionality d in the range $10^3$ to $10^6$ (and above) are feasible on modern computers.

This gives access to a tremendous variety of important problems of classical and quantum

statistical physics, with the number of solvable problems steadily increasing from year to

year. Most importantly, the solutions are of first-principles character, as no assumptions

regarding the role of correlations or quantum effects are made.
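The advantage of Monte Carlo integration in many dimensions can be seen in a few lines. The sketch below is a generic illustration, not code from this chapter: the integrand $\sum_i x_i^2$ on the unit hypercube (exact value d/3), the sample size and the seed are arbitrary choices, and the function names are hypothetical.

```python
import math
import random


def mc_integrate(f, dim, n_samples, rng):
    """Plain Monte Carlo estimate of the integral of f over the unit hypercube.

    Returns (estimate, statistical error), the error being sqrt(variance / n_samples).
    """
    s1 = s2 = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        v = f(x)
        s1 += v
        s2 += v * v
    mean = s1 / n_samples
    var = max(s2 / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(var / n_samples)


def grid_points(dim, points_per_axis):
    """Number of function evaluations of a product-rule grid."""
    return points_per_axis ** dim


rng = random.Random(12345)
dim = 10
estimate, error = mc_integrate(lambda x: sum(xi * xi for xi in x), dim, 20000, rng)
exact = dim / 3.0
print(f"MC: {estimate:.4f} +/- {error:.4f}, exact: {exact:.4f}")
print("grid with 10 points per axis needs", grid_points(dim, 10), "evaluations")
```

The statistical error decreases as the inverse square root of the number of samples regardless of the dimension, whereas the cost of a product grid grows exponentially with d.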

Among the open questions remains the fermion sign problem (see Sec. 5.2.1.2) which

limits simulations of quantum systems (without further assumptions) to temperatures

(densities) above (below) a certain limit. At this point further progress is urgently needed

which will allow for a similarly complete treatment as is already possible for the equilibrium

properties of Bose and classical distinguishable particles.



References

1. K. Binder and D.W. Heermann, Monte Carlo Simulation in Statistical Physics:

An Introduction. Berlin, Springer, 2002.

2. Dieter W. Heermann, Computer simulation methods in theoretical physics. Berlin,

Springer, 1990.

3. Daan Frenkel and Berend Smit, Understanding Molecular Simulations: From Al-

gorithms to Applications. Academic Press: A division of Harcourt, Inc., 2002.

4. V.M. Zamalin, G.E. Norman, and V.S. Filinov, The Monte Carlo Method in Sta-

tistical Thermodynamics, Nauka, Moscow 1977 (in Russian).

5. M.P. Allen and D.J. Tildesley, Computer Simulations of Liquids. Clarendon Press,

Oxford, 1987.

6. David P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical

Physics. Cambridge, Cambridge University Press, 2000.

7. Kurt Binder and A. Baumgartner, The Monte Carlo Method in Condensed Matter

Physics, Berlin, Springer, 1992.

8. G. Ciccotti, D. Frenkel, and I.R. McDonald, Simulations of Liquids and Solids:

Molecular Dynamics and Monte Carlo Methods in Statistical Mechanics. North-

Holland, Amsterdam, 1987.

9. H. Feldmeier and J. Schnack, Molecular dynamics for fermions, Rev. Mod. Phys.

72, 655 (2000).

10. W.M.C. Foulkes, L. Mitas, R.J. Needs and G. Rajagopal, Quantum Monte Carlo

simulations of solids, Rev. Mod. Phys. 73, 33 (2001).

11. D.M. Ceperley, Path Integrals in the Theory of Condensed Helium, Rev. Mod.

Phys. 67, 279 (1995).

12. D.M. Ceperley, Microscopic simulations in physics, Rev. Mod. Phys. 71, 438

(1999).

13. N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller,

Equation of state calculations by fast computing machines, J. Chem. Phys. 21,

1087 (1953).

14. R.P. Feynman, Statistical Mechanics: A Set of Lectures. Addison-Wesley Publish-

ing Company, 1990.

15. R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals, McGraw

Hill, New York 1965.

16. H. Kleinert, Path Integrals in Quantum Mechanics, Statistics and Polymer

Physics, World Scientific, Second edition, 1995.

17. I.M. Sobol, The Monte Carlo Method (Popular Lectures in Mathematics), University of Chicago Press, 1975; [German translation:] I.M. Sobol, Die Monte-Carlo-Methode. Berlin, Dt. Verl. der Wiss., 1991.

18. S.M. Ermakow, [German translation:] Die Monte-Carlo-Methode und verwandte

Fragen. München, Oldenbourg, 1975.

19. W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical

Recipes in FORTRAN: The Art of Scientific Computing, Cambridge University



Press, 1992

20. D.E. Knuth, The art of computer programming / Vol. 2 Seminumerical algorithms.

Reading, Mass., Addison-Wesley, 2000.

21. I. Vattulainen et al., Phys. Rev. Lett. 73, 2513 (1994).

22. M. Creutz, Phys. Rev. Lett. 50, 1411 (1983).

23. G. Bhanot, M. Creutz, and H. Neuberger, Nucl. Phys. B 235, 417 (1984).

24. D.W. Heermann and R.C. Desai, Comput. Phys. Commun. 50, 297 (1988).

25. L.A. Rowley, D. Nicholson, and N.G. Parsonage, Monte Carlo grand canonical

ensemble calculation in a gas-liquid transition region for 12-6 argon, J. Comput.

Phys. 17, 401 (1975).

26. J. Yao, R.A. Greenkorn, and K.C. Chao, Monte Carlo simulation of the grand

canonical ensemble, Mol. Phys. 46, 587 (1982).

27. G.E. Norman and V.S. Filinov, Investigations of phase transitions by the Monte

Carlo method, High Temp. (USSR) 7, 216 (1969).

28. W.W. Wood and F.R. Parker, J. Chem. Phys. 27, 720 (1957).

29. L. Verlet, Computer “experiments” on classical fluids. I. Thermodynamical prop-

erties of Lennard-Jones molecules, Phys. Rev. 159, 98 (1967).

30. J.J. Nicolas, K.E. Gubbins, W.B. Streett, and D.J. Tildesley, Equation of state

for the Lennard-Jones fluid, Mol. Phys. 37, 1429 (1979).

31. J.K. Johnson, J.A. Zollweg, and K.E. Gubbins, The Lennard-Jones equation of

state revisited, Mol. Phys. 78, 591 (1993).

32. F. Lindemann, Z. Phys 11, 609 (1910).

33. V.S. Filinov, High Temperature 13, 1065 (1975) and 14, 225 (1976).

34. D.M. Ceperley, Fermion Nodes, J. Stat. Phys. 63, 1237 (1991).

35. D.M. Ceperley, Path Integral Calculations of Normal Liquid 3He, Phys. Rev. Lett.

69, 331 (1992).

36. D.M. Ceperley, Path Integral Monte Carlo Methods for Fermions in Monte Carlo

and Molecular Dynamics of Condensed Matter Systems, Ed. K. Binder and G.

Ciccotti, Editrice Compositori, Bologna, Italy, 1996.

37. W. Magro, B. Militzer, D. Ceperley, B. Bernu, and C. Pierleoni, Restricted Path

Integral Monte Carlo Calculations of Hot, Dense Hydrogen, in Strongly Coupled

Coulomb Systems, ed. by G. J. Kalman, J. M. Rommel and K. Blagoev. Plenum

Press, New York NY, 1998.

38. B. Militzer and E.L. Pollock, Variational density matrix method for warm, con-

densed matter: Application to dense hydrogen, Phys. Rev. E 61, 3470 (2000).

39. R. Egger, W. Hausler, C.H. Mak, and H. Grabert, Crossover from Fermi Liquid to

Wigner Molecule Behavior in Quantum Dots, Phys. Rev. Lett. 82, 3320 (1999).

40. M. Imada, J. Phys. Soc. Japan 53, 2861 (1984).

41. M.F. Herman, E.J. Bruskin, and B.J. Berne, J. Chem. Phys. 76, 5150 (1982).

42. V.S. Filinov, M. Bonitz, W. Ebeling, and V.E. Fortov, Thermodynamics of hot

dense H-plasmas: Path integral Monte Carlo simulations and analytical approxi-

mations, Plasma Physics and Controlled Fusion 43, 743 (2001).

43. V.S. Filinov, M. Bonitz, D. Kremp, W.D. Kraeft, W. Ebeling, P.R. Levashov,



and V.E. Fortov, Path integral simulations of the thermodynamic properties of

quantum dense plasma, Contrib. Plasma Phys. 41, 135 (2001).

44. A. Goodman and D. Sokal, Phys. Rev. D 56, 1024 (1987).

45. D.M. Ceperley and E.L. Pollock, Path Integral Computation of the Low Temper-

ature Properties of Liquid 4He. Phys. Rev. Lett. 56, 351 (1986).

46. P.N. Vorontsov-Velyaminov, M.O. Nesvit, and R.I. Gorbunov, Bead-Fourier path-

integral Monte Carlo method applied to system of identical particles, Phys. Rev.

E 55 N2, 1979 (1997).

47. D.M. Ceperley and E.L. Pollock, Phys. Rev. Lett. 56, 351 (1986).

48. P. Levy, Compositio Math. 7, 283 (1939).

49. R.G. Storer, J. Math. Phys. 9, 964 (1968); A.D. Klemm, and R.G. Storer, Aust.

J. Phys. 26, 43 (1973).

50. H. Kleinert, Phys. Rev. D 57, 2264 (1998).

51. W. Ebeling, H.J. Hoffmann, and G. Kelbg, Contr. Plasma Phys. 7, 233 (1967)

and references therein.

52. M.-M. Gombert, H. Minoo, Contrib. Plasma Phys. 29, 355 (1989).

53. H. Wagenknecht, W. Ebeling, and A. Forster, Contrib. Plasma Phys. 41, 15

(2001).

54. A. Filinov, M. Bonitz, and W. Ebeling, J. Phys. A: Math.Gen. 36, 5957 (2003).

55. A.V. Filinov, V.O. Golubnychiy, M. Bonitz, W. Ebeling, and J.W. Dufty, Phys.

Rev. E 70, 046411 (2004).

56. M.F. Herman, E.J. Bruskin, and B.J. Berne, J. Chem. Phys. 76, 5150 (1982).

57. S.W. de Leeuw, J.W. Perram, and E.R. Smith, Proc. R. Soc. London, Ser. A

373, 27 (1980).

58. P.P. Ewald, Ann. Phys. (Leipzig) 64, 253 (1921).

59. T. Gaskell, Proc. Phys. Soc. 77, 1182 (1961); 80, 1091 (1962).

60. B.R.A. Nijboer and F.W. de Wette, Physica 23, 309 (1957).

61. D.M. Ceperley, Phys. Rev. B 18, 3126 (1978).

62. V. Natoli and D.M. Ceperley, J. Comput. Phys. 117, 171 (1995).

63. C. Lin, F.H. Zong, and D.M. Ceperley, Phys. Rev. E 64, 016702 (2001).

64. L.D. Landau and E.M. Lifshitz, Statistical Physics, vol. 2. Oxford: Pergamon

Press, 1980.

65. L.M. Fraser, W.M.C. Foulkes, G. Rajagopal, R.J. Needs, S. Kenny, and

A.J. Williamson, Phys. Rev. B 53, 1814 (1996).

66. P.R.C. Kent, R.Q. Hood, A.J. Williamson, R.J. Needs, W.M.C. Foulkes, and

G. Rajagopal, Phys. Rev. B 59, 1917 (1999).

67. A.J. Williamson, G. Rajagopal, R.J. Needs, L.M. Fraser, W.M.C. Foulkes,

Y. Wang, and M.-Y. Chou, Phys. Rev. B 55, R4851.

68. R.C. Ashoori, Nature (London) 379, 413 (1996); N.B. Zhitenev et al., Phys. Rev.

Lett. 79, 2309 (1997).

69. C. Yannouleas and U. Landman, Phys. Rev. Lett. 82, 5325 (1999), and references

therein.

70. E. Wigner, Phys. Rev. 46, 1002 (1934).



71. Yu.E. Lozovik and V.A. Mandelshtam, Phys. Lett. A 145 N5, 269-271 (1990).

72. V.M. Bedanov and F.M. Peeters, Phys. Rev. B 49, 2667 (1994), and references

therein.

73. A.V. Filinov, M. Bonitz, and Yu.E. Lozovik, Phys. Rev. Lett. 86, 3851 (2001).

74. A.V. Filinov, C. Riva, F.M. Peeters, Yu.E. Lozovik, and M. Bonitz, Phys. Rev.

B 70, 035323 (2004).

75. J.G. Tischler, A.S. Bracker, D. Gammon, and D. Park, Phys. Rev. B 66,

081310(R) (2002).

76. R. Winkler, Phys. Rev. B 51, 14 395 (1995).
