Dirty tricks for statistical mechanics
Martin Grant
Physics Department, McGill University
© MG, August 2004, version 0.91
Preface
These are lecture notes for PHYS 559, Advanced Statistical Mechanics, which I've taught at McGill for many years. I'm intending to tidy this up into a book, or rather the first half of a book. This half is on equilibrium; the second half would be on dynamics.
These were handwritten notes which were heroically typed by Ryan Porter over the summer of 2004, and may someday be properly proof-read by me. Even better, maybe someday I will revise the reasoning in some of the sections. Some of it can be argued better, but it was too much trouble to rewrite the handwritten notes. I am also trying to come up with a good book title, amongst other things. The two titles I am thinking of are "Dirty tricks for statistical mechanics" and "Valhalla, we are coming!". Clearly, more thinking is necessary, and suggestions are welcome.
While these lecture notes have gotten longer and longer until they are almost self-sufficient, it is always nice to have real books to look over. My favorite modern text is "Lectures on Phase Transitions and the Renormalization Group", by Nigel Goldenfeld (Addison-Wesley, Reading, Mass., 1992). This is referred to several times in the notes. Other nice texts are "Statistical Physics", by L. D. Landau and E. M. Lifshitz (Addison-Wesley, Reading, Mass., 1970), particularly Chaps. 1, 12, and 14; "Statistical Mechanics", by S.-K. Ma (World Scientific, Phila., 1986), particularly Chaps. 12, 13, and 28; and "Hydrodynamic Fluctuations, Broken Symmetry, and Correlation Functions", by D. Forster (W. A. Benjamin, Reading, Mass., 1975), particularly Chap. 2. These are all great books, with Landau's being the classic. Unfortunately, except for Goldenfeld's book, they're all a little out of date. Some other good books are mentioned in passing in the notes.
Finally, at the present time, while I figure out what I am doing, you can read these notes as many times as you like, make as many copies as you like, and refer to them as much as you like. But please don't sell them to anyone, or use these notes without attribution.
Contents

1 Motivation
2 Review of Stat Mech Ideas
3 Independence of Parts
  3.1 Self Averaging
  3.2 Central Limit Theorem
  3.3 Correlations in Space
4 Toy Fractals
  4.1 Von Koch Snowflake
  4.2 Sierpinski Triangle
  4.3 Correlations in Self-Similar Objects
5 Molecular Dynamics
6 Monte Carlo and the Master Equation
7 Review of Thermodynamics
8 Statistical Mechanics
9 Fluctuations
  9.1 Thermodynamic Fluctuations
10 Fluctuations of Surfaces
  10.1 Lattice Models of Surfaces
  10.2 Continuum Model of Surface
  10.3 Impossibility of Phase Coexistence in d=1
  10.4 Numbers for d=3
11 Broken Symmetry and Correlation Functions
  11.1 Proof of Goldstone's Theorem
12 Scattering
  12.1 Scattering from a Flat Interface
  12.2 Roughness and Diffuseness
  12.3 Scattering From Many Flat Randomly-Oriented Surfaces
13 Fluctuations and Crystalline Order
14 Ising Model
  14.1 Lattice-Gas Model
  14.2 One-dimensional Ising Model
  14.3 Mean-Field Theory
15 Landau Theory and Ornstein-Zernike Theory
  15.1 Landau Theory
  15.2 Ornstein-Zernike Theory of Correlations
16 Scaling Theory
  16.1 Scaling With ξ
17 Renormalization Group
  17.1 ε-Expansion RG
Chapter 1
Motivation
Why is a coin toss random?
Statistical mechanics: weighting for averages is by "equal a priori probability." So, all many-body quantum states with the same energy (for a given fixed volume) have equal weight for calculating probabilities.

This is a mouthful, but at least it gives the right answer. There are two microscopic states, HEADS and TAILS, which are energetically equivalent. So the result is random.

However nice it is to get the right answer, and however satisfying it is to have a prescription for doing other problems, this is somewhat unsatisfying. It is worth going a little deeper to understand how microscopic states (HEADS or TAILS) become macroscopic observations (randomly HEADS or TAILS).
To focus our thinking, it is useful to do an experiment. Take a coin and flip it, always preparing it as heads. If you flip it at least 10-20 cm in the air, the result is random. If you flip it less than this height, the result is not random. Say you make N attempts, and the number of heads is n_heads and the number of tails is n_tails, for different flipping heights h. Then you obtain something like Fig. 1.1.

Figure 1.1: Bias of Flips With Height h (h axis in cm, running from 0 to 10)
There are a lot of other things one could experimentally measure. In fact, my cartoon should look exponential, for reasons we will see later on in the course.

Now we have something to think about: why is the coin toss biased for flips of small height? In statistical mechanics language we would say, "why is the coin toss correlated with its initial state?" Remember, the coin was always prepared as heads. (Of course, if it was prepared randomly, why flip it?) There is a length scale over which those correlations persist. For our example, this scale is ∼ 2-3 cm, which we call ξ, the correlation length. The tailing off of the graph above follows

Correlations ∼ e^{−h/ξ}

which we need more background to show. But if we did a careful experiment, we could show this is true for large h. It is sensible that

ξ ≥ coin diameter

as a little thinking will clarify.

This is still unsatisfying, though. It leads to the argument that coin tosses are random because we flip the coin much higher than its correlation length. Also, we learn that true randomness requires ξ/h → 0, which is analogous to requiring an infinite system in the thermodynamic limit. [In fact, in passing, note that for ξ/h = O(1) the outcome of the flip (a preponderance of heads) does not reflect the microscopic states (HEADS and TAILS equivalent). To stretch a bit, this is related to broken symmetry in another context. That is, the macroscopic state has a lower symmetry than the microscopic state. This happens in phase transitions, where ξ/L = O(1) because ξ ≈ L as L → ∞, where L is the edge length of the system. It also happens in quantum mesoscopic systems, where ξ/L = O(1) because L ≈ ξ and L is somewhat small. This latter case is more like the present example of coin tossing.]
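This decay of the bias can be played with in a toy simulation. Everything below is made up for illustration: the bias model P(heads) = 1/2 + (1/2)e^{−h/ξ} and the value ξ = 2.5 cm are assumptions, not fits to real coin data.

```python
import math
import random

XI = 2.5   # correlation length in cm -- a made-up value, not a fit

def flip_bias(h, n=100_000, seed=1):
    """Fraction of heads in n flips of height h (cm), in a toy model where
    the memory of the prepared state (heads) decays as exp(-h/XI)."""
    rng = random.Random(seed)
    p_heads = 0.5 + 0.5 * math.exp(-h / XI)
    return sum(rng.random() < p_heads for _ in range(n)) / n

for h in [0, 2, 4, 8, 16]:
    print(f"h = {h:2d} cm: fraction of heads = {flip_bias(h):.3f}")
```

For h much larger than ξ the fraction of heads is indistinguishable from 1/2, while for h ≲ ξ the prepared state shows through.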
But what about Newton's laws? Given an initial configuration and a precise force, we can calculate the trajectory of the coin, if we neglect inhomogeneities in the air. Why, then, is the outcome random? The reason is that your thumb cannot apply a force precise enough to flip the coin to within one correlation length of height. The analogous physical problem is that macroscopic operating conditions (a fixed pressure or energy, for example) do not determine microscopic states. Many such microscopic states are consistent with one macroscopic state.
These ideas, correlations on a microscopic scale, broken symmetries, and correlation lengths, will come up throughout the rest of the course. The rest of the course follows this scenario:
• Fluctuations and correlations: What can we do without statistical mechanics?
• What can we do with thermodynamics and statistical mechanics?
• Detailed study of interfaces.
• Detailed study of phase transitions.
The topics we will cover are as current as any issue of Physical Review E. Just take a look.

We'll begin the proper course with a review of, and a reflection on, some basic ideas of statistical mechanics.
Chapter 2
Review of Stat Mech Ideas
In statistical physics, we make a few assumptions, and obtain a distribution to do averages. A famous assumption is that of "equal a priori probability." By this it is meant that all microscopic states, consistent with a particular fixed total energy E and number of particles N, are equally likely. This gives the microcanonical ensemble. It can then be checked that thermodynamic quantities have exceedingly small fluctuations, and that one can derive, from the microcanonical ensemble, the canonical ensemble, where temperature T and volume V are fixed. It is this latter ensemble which is most commonly used. Microscopic states consistent with T and V are not equally likely; they are weighted with the probability factor
e^{−E_state/k_B T}      (2.1)

where E_state is the energy of the state, and k_B is Boltzmann's constant. The probability of being in a state is

ρ = (1/Z) e^{−E_state/k_B T}      (2.2)

where the partition function is

Z = Σ_{states} e^{−E_state/k_B T}      (2.3)

The connection to thermodynamics is from the Helmholtz free energy

F = E − TS      (2.4)

where S is the entropy, and

e^{−F/k_B T} = Z      (2.5)
Note that since F is extensive, that is, F ∼ O(N), we have

Z = (e^{−f/k_B T})^N
where F = fN . This exponential factor of N arises from the sum over stateswhich is typically
∑
{states}
= eO(N) (2.6)
For example, for a magnetic dipole, which can have each constivent up or down,there are evidently
dipole #1 × dipole #2...dipole #N = 2 × 2 × ...2 = 2N = eN ln 2 (2.7)
total states. The total number of states in a volume V is
V N
N !≈ V N
NN= eN ln V
N (2.8)
where the factor N! accounts for indistinguishability.

These numbers are huge in a way which cannot be imagined. Our normal scales sit somewhere between microscopic and astronomical. Microscopic sizes are O(Å), with times O(10^{−12} s). Astronomical sizes, like the size and age of the universe, are O(10^{30} m) and O(10^{18} s). The ratio of these lengths is

10^{30} m / 10^{−10} m ∼ 10^{40}      (2.9)

This is in no way comparable to (!)

10^{10^{23}}      (2.10)
In fact, it seems a little odd (and unfair) that a theorist must average over e^{O(N)} states to explain what an experimentalist does in one experiment.
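The counting above is easy to check by brute force for a small toy system. The sketch below assumes non-interacting two-state parts with made-up energies (0 and ε = 0.7, in units where k_B T = 1); it verifies that summing e^{−E_state/k_B T} over all 2^N states gives Z = (e^{−f/k_B T})^N, i.e. that F = −k_B T ln Z is extensive.

```python
import itertools
import math

kT  = 1.0   # k_B * T, in arbitrary units (illustrative choice)
eps = 0.7   # energy of the "up" state of each part; "down" costs 0 (made up)

def Z_brute(N):
    """Sum exp(-E_state/kT) explicitly over all 2^N microscopic states."""
    return sum(math.exp(-eps * sum(state) / kT)
               for state in itertools.product((0, 1), repeat=N))

z1 = 1.0 + math.exp(-eps / kT)   # single-part partition function

for N in (1, 4, 8):
    F = -kT * math.log(Z_brute(N))
    print(N, Z_brute(N), z1 ** N, F / N)   # Z = z1^N, and F/N is constant
```

The brute-force sum over e^{O(N)} states agrees with the product form, and F/N comes out independent of N, which is extensivity.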
Chapter 3
Independence of Parts
It is worth thinking about experiments a bit to make this clear. Say an experimentalist measures the speed of sound in a sample, as shown in Fig. 3.1.

Figure 3.1: Experiment 1

The experimentalist obtains some definite value of c, the speed of sound. Afterwards, the experimentalist might cut up the sample into many parts, so that his or her friends can do the same experiment, as shown in Fig. 3.2. Each of

Figure 3.2: Experiment 2
these new groups also measures the sound speed, as shown. It would seem that the statistics improve, because 4 experiments are now done?! In fact, one can continue breaking up the system like this, as shown in Fig. 3.3.
Figure 3.3: Experiment 3
Of course, 1 experiment on the big system must be equivalent to many experiments on individual parts of a big system, as long as those individual parts are independent. By this it is meant that a system "self-averages" over all its independent parts.

The number of independent parts is (to use the same notation as the number of particles)

N = O(L³/ξ³)      (3.1)

for a system of edge length L. The new quantity introduced here is the correlation length ξ, the scale over which things are correlated, so that they are not independent. The correlation length must be at least as large as an atom, so

ξ ≥ O(Å)      (3.2)

and indeed usually ξ ≈ O(Å). Hence for a macroscopic length, the number of independent parts is

N ∼ O(10²³)      (3.3)
of course. Now, each independent part has a few degrees of freedom, such as quantum numbers. Hence, the total number of states in a typical system is

∫dq_1 ∫dq_2 ... ∫dq_N

where each ∫dq_i runs over the quantum states of part i. Say ∫dq_i = 4, for an example, so that the total number of states is

4^N = e^{O(N)}      (3.4)

as it should be. Now, however, we have a feeling for the origin of some of the assumptions entering statistical physics.

We can go a little farther with this, and prove the central-limit theorem of probability theory.
3.1 Self Averaging
Consider an extensive quantity X. If a system consists of N independent parts, as shown in Fig. 3.4, then

Figure 3.4: N independent parts, labelled i = 1, 2, ..., N

X = Σ_{i=1}^{N} X_i      (3.5)

where X_i is the value of X in independent part i. The average value of X is

〈X〉 = Σ_{i=1}^{N} 〈X_i〉      (3.6)

But since 〈X_i〉 does not depend on N, we have

〈X〉 = O(N)      (3.7)

It is handy to introduce the intensive quantity

x = X/N      (3.8)
then

〈x〉 = O(1)

This is not very amazing yet. Before continuing, it is worth giving an example of what we mean by the average. For example, it might be a microscopically long, but macroscopically short, time average

〈X〉 = (1/τ) ∫_τ dt X(t)

as drawn in Fig. 3.5. As also indicated there, the quantity 〈(X − 〈X〉)²〉^{1/2} is a natural one to consider.

Figure 3.5: Time average: X(t) fluctuates about 〈X〉, with spread 〈(X − 〈X〉)²〉^{1/2}, over a window τ

Let the deviation of X from its average value be

∆X = X − 〈X〉      (3.9)

Then clearly

〈∆X〉 = 〈X〉 − 〈X〉 = 0

as is clear from the cartoon. The handier quantity is

〈(∆X)²〉 = 〈(X − 〈X〉)²〉 = 〈X² − 2X〈X〉 + 〈X〉²〉 = 〈X²〉 − 〈X〉²      (3.10)
giving the condition, in passing,

〈X²〉 ≥ 〈X〉²      (3.11)

since 〈(∆X)²〉 ≥ 0. Splitting the system into independent parts again gives

〈(∆X)²〉 = 〈 Σ_{i=1}^{N} ∆X_i Σ_{j=1}^{N} ∆X_j 〉

Now, separating the terms with i = j from those with i ≠ j,

〈(∆X)²〉 = Σ_{i=1}^{N} 〈(∆X_i)²〉 + Σ_{i≠j} 〈(∆X_i)(∆X_j)〉      (3.12)
But box i is independent of box j, so

〈∆X_i ∆X_j〉 = 〈∆X_i〉〈∆X_j〉 = 0 × 0 = 0      (3.13)

and

〈(∆X)²〉 = Σ_{i=1}^{N} 〈(∆X_i)²〉

or

〈(∆X)²〉 = O(N)      (3.14)

Also, the intensive x = X/N satisfies

〈(∆x)²〉 = O(1/N)      (3.15)

This implies that fluctuations in quantities are very small in systems of many (N → ∞) independent parts. That is,

X ≈ 〈X〉 {1 + O(〈(∆X)²〉^{1/2}/〈X〉)} ≈ 〈X〉 {1 + O(1/N^{1/2})}      (3.16)

So, in essence, X ≈ 〈X〉. The result

〈(∆X)²〉/〈X〉² = O(1/N)      (3.17)

that relative fluctuations are very small, if N is large, is called the self-averaging lemma. Note that, since N = V/ξ³,

〈(∆X)²〉/〈X〉² = O(ξ³/V)      (3.18)

so fluctuations measure microscopic correlations, and become appreciable for ξ³/V ∼ O(1). This occurs in quantum mesoscopic systems (where V is small), and at phase transitions (where ξ is large), for example.
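The self-averaging lemma is easy to see numerically. A minimal sketch, where each independent part is drawn from a uniform distribution (an arbitrary choice for illustration):

```python
import random
import statistics

def rel_fluctuation(N, samples=2000, seed=0):
    """Estimate <(dX)^2>/<X>^2 for X a sum of N independent parts,
    each uniform on [0, 2] (an arbitrary illustrative distribution)."""
    rng = random.Random(seed)
    xs = [sum(rng.uniform(0.0, 2.0) for _ in range(N)) for _ in range(samples)]
    mean = statistics.fmean(xs)
    return statistics.pvariance(xs, mu=mean) / mean ** 2

for N in (10, 100, 1000):
    # N * O(1/N) should be roughly constant (here the exact value is 1/3)
    print(N, N * rel_fluctuation(N))
```

Multiplying the relative fluctuation by N gives roughly the same number at every N, which is the O(1/N) statement of Eq. 3.17.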
3.2 Central Limit Theorem
We will now prove (to a physicist's rigor) the central-limit theorem for systems of many independent parts. In the same manner as above, we will calculate all the moments, such as the nth moment

〈(∆X)^n〉      (3.19)

or at least their order in N. For simplicity, we will assume that all the odd moments vanish,

〈(∆X)^{2n+1}〉 = 0

for integer n. This means the distribution ρ is symmetric about the mean 〈X〉, as shown in Fig. 3.6. It turns out that one can show that the distribution function becomes symmetric as N → ∞, but let's just assume it, so that the ideas in the algebra are clear.

Figure 3.6: Symmetric distribution functions ρ(x)

It can be shown (see below) that we can then reconstruct the distribution ρ(X) if we know all the even moments

〈(∆X)^{2n}〉

for positive integer n. We know the n = 1 case, so let's do n = 2, as we did 〈(∆X)²〉 above:
〈(∆X)⁴〉 = 〈 Σ_{i=1}^{N} ∆X_i Σ_{j=1}^{N} ∆X_j Σ_{k=1}^{N} ∆X_k Σ_{l=1}^{N} ∆X_l 〉
        = Σ_{i=1}^{N} 〈(∆X_i)⁴〉 + 3 Σ_{i≠j} 〈(∆X_i)²(∆X_j)²〉 + (terms containing a lone 〈∆X_i〉 = 0)
        = O(N) + O(N²)      (3.20)

So, as N → ∞,

〈(∆X)⁴〉 = 3 〈 Σ_{i=1}^{N} (∆X_i)² 〉 〈 Σ_{j=1}^{N} (∆X_j)² 〉

or

〈(∆X)⁴〉 = 3 〈(∆X)²〉² = O(N²)      (3.21)

Note that the largest term (as N → ∞) came from pairing up all the ∆X_i's appropriately. A little thought makes it clear that the largest term in 〈(∆X)^{2n}〉 will result from this sort of pairing, giving

〈(∆X)^{2n}〉 = const. × 〈(∆X)²〉^n = O(N^n)      (3.22)
We will now determine the constant in Eq. 3.22. Note that

〈(∆X)^{2n}〉 = 〈∆X ∆X ∆X ... ∆X ∆X〉

There are (2n − 1) ways to pair off the first ∆X. After those two ∆X's are gone, there are then (2n − 3) ways to pair off the next free ∆X. And so on, until finally we get to the last two ∆X's, which have 1 way to pair off. Hence

〈(∆X)^{2n}〉 = (2n − 1)(2n − 3)(2n − 5) ... (1) 〈(∆X)²〉^n ≡ (2n − 1)!! 〈(∆X)²〉^n

or, more conventionally,

〈(∆X)^{2n}〉 = [(2n)!/(2^n n!)] 〈(∆X)²〉^n      (3.23)
Now we know all the even moments in terms of the first two moments, 〈X〉 and 〈(∆X)²〉. We can reconstruct the distribution with a trick. First, let us be explicit about ρ. By an average we shall mean

〈Q(X)〉_X = ∫_{−∞}^{∞} dX ρ(X) Q(X)      (3.24)

where Q(X) is some quantity. Note that

〈δ(X − X′)〉_{X′} = ρ(X)      (3.25)

where δ(X) is the Dirac delta function! Recall the Fourier representation

δ(X − X′) = (1/2π) ∫_{−∞}^{∞} dk e^{−ik(X−X′)}      (3.26)

Averaging over X′ gives

ρ(X) = 〈δ(X − X′)〉_{X′} = (1/2π) ∫_{−∞}^{∞} dk e^{−ikX} 〈e^{ikX′}〉_{X′}      (3.27)

In statistics, 〈e^{ikX}〉 is called the characteristic function. But we can now expand the exponential (for convenience, let 〈X〉 = 0 for now; one only has to add and subtract it in e^{−ik(X−X′)} = e^{−ik(∆X−∆X′)}, but it makes the algebra messier) as

e^{ikX′} = Σ_{n=0}^{∞} [(ik)^n/n!] (X′)^n

so that we have

ρ(X) = (1/2π) ∫_{−∞}^{∞} dk e^{−ikX} Σ_{n=0}^{∞} [(ik)^n/n!] 〈X^n〉      (3.28)
where the prime superscript has been dropped. Note that Eq. 3.28 means that ρ(X) can be reconstructed from all the 〈X^n〉, in general. For our case, all the odd moments vanish, so

ρ(X) = (1/2π) ∫_{−∞}^{∞} dk e^{−ikX} Σ_{n=0}^{∞} [(−k²)^n/(2n)!] 〈X^{2n}〉

But 〈X^{2n}〉 = [(2n)!/(2^n n!)] 〈X²〉^n from above, so

ρ(X) = (1/2π) ∫_{−∞}^{∞} dk e^{−ikX} Σ_{n=0}^{∞} (1/n!) (−k²〈X²〉/2)^n

which is the exponential again, so

ρ(X) = (1/2π) ∫_{−∞}^{∞} dk e^{−ikX} e^{−k²〈X²〉/2}      (3.29)
(In passing, it is worth remembering that 〈e^{ikX}〉 = e^{−k²〈X²〉/2} for a Gaussian distribution.) The integral can be done by completing the square, adding and subtracting X²/(2〈X²〉), so that

ρ(X) = (1/2π) e^{−X²/(2〈X²〉)} ∫_{−∞}^{∞} dk e^{−(k〈X²〉^{1/2}/2^{1/2} + iX/(2^{1/2}〈X²〉^{1/2}))²}
     = (1/2π) e^{−X²/(2〈X²〉)} (2/〈X²〉)^{1/2} ∫_{−∞}^{∞} du e^{−u²}

using an obvious substitution, where the Gaussian integral is ∫_{−∞}^{∞} du e^{−u²} = √π. Hence, on putting 〈X〉 back as non-zero again,

ρ(X) = [1/(2π〈(∆X)²〉)^{1/2}] e^{−(X−〈X〉)²/(2〈(∆X)²〉)}      (3.30)
which is the Gaussian distribution. This is the central-limit theorem for a system of many independent parts. Its consequences are most transparent if one deals with the intensive quantity x = X/N; then recall

〈x〉 = O(1),    〈(∆x)²〉 = O(1/N)      (3.31)

Letting

〈(∆x)²〉 ≡ a²/N      (3.32)

where a² = O(1), we have, up to normalization,

ρ(x) ∝ e^{−N(x−〈x〉)²/(2a²)}      (3.33)

so the width of the Gaussian is exceedingly small, and it essentially represents a spike, as sketched in Figs. 3.7 and 3.8.

Figure 3.7: Gaussian of width O(1/N^{1/2}) centred at 〈x〉 = O(1) (width exaggerated)

Figure 3.8: Gaussian (width (sort of) to scale)

Which is to say, again, that fluctuations are small, and average values are well defined, for a system of many (N → ∞) independent parts.
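The key pairing result, 〈(∆X)⁴〉 → 3〈(∆X)²〉², and the O(N) growth of 〈(∆X)²〉 can be checked by sampling. A sketch, with each part taken as a ±1 variable (an arbitrary illustrative choice):

```python
import random
import statistics

def central_moments(N, samples=20_000, seed=3):
    """Second and fourth central moments of X = sum of N parts, each +-1
    (the two-valued part distribution is an arbitrary illustrative choice)."""
    rng = random.Random(seed)
    xs = [sum(rng.choice((-1, 1)) for _ in range(N)) for _ in range(samples)]
    mu = statistics.fmean(xs)
    m2 = statistics.fmean((x - mu) ** 2 for x in xs)
    m4 = statistics.fmean((x - mu) ** 4 for x in xs)
    return m2, m4

m2, m4 = central_moments(50)
print("<(dX)^2> =", m2)                                # O(N): about N = 50 here
print("<(dX)^4> / 3<(dX)^2>^2 =", m4 / (3 * m2 * m2))  # approaches 1 as N grows
```

The Gaussian ratio 〈(∆X)⁴〉/3〈(∆X)²〉² comes out close to 1 already at N = 50, up to O(1/N) corrections and sampling noise.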
3.3 Correlations in Space
The results so far apply to a large system, and perhaps it is not surprising that the global fluctuations in a large system are small. These results for the global fluctuations have implications for the correlations of quantities, though. Say the intensive variable x has different values at points ~r in space. (Precisely, let d~r, the volume element, be microscopically large, but macroscopically small.) Consider the correlation function

〈∆x(~r) ∆x(~r′)〉 ≡ C(~r, ~r′)      (3.34)

We can say a few things about C(~r, ~r′) from general considerations.

1. If space is translationally invariant, then the origin of the coordinate system for ~r and ~r′ is arbitrary. It can be placed at point ~r or ~r′. The physical length of interest is (~r − ~r′) or (~r′ − ~r). Hence

C(~r, ~r′) = C(~r − ~r′) = C(~r′ − ~r)      (3.35)
Figure 3.9: Translational invariance (only ~r − ~r′ is important)

and, of course,

〈∆x(~r) ∆x(~r′)〉 = 〈∆x(~r − ~r′) ∆x(0)〉      (3.36)

etc.

2. If space is isotropic, then the direction of ∆x(~r) from ∆x(~r′) is unimportant, and only the magnitude of the distance |~r − ~r′| is physically relevant. Hence

C(~r, ~r′) = C(|~r − ~r′|)      (3.37)

Figure 3.10: Isotropy (only |~r − ~r′| is important)

So what was a function of 6 numbers (~r, ~r′) is now a function of only 1, |~r − ~r′|.

3. Usually, correlations fall off if points are arbitrarily far apart. That is,

〈∆x(r → ∞) ∆x(0)〉 = 〈∆x(r → ∞)〉〈∆x(0)〉 = 0

since each average of a deviation vanishes, or

C(r → ∞) = 0      (3.38)
The scale of the correlation is set by the correlation length.
Now we know

C(|~r − ~r′|) = 〈∆x(~r) ∆x(~r′)〉      (3.39)

subject to these conditions. A constraint on C(|~r − ~r′|) is provided by our previous result

〈(∆x)²〉 = O(1/N)
Figure 3.11: Typical correlation function, decaying over a scale O(ξ)

for a global fluctuation. Clearly,

∆x = ∫d~r ∆x(~r) / ∫d~r = (1/V) ∫d~r ∆x(~r)      (3.40)

So we have

(1/V²) ∫d~r ∫d~r′ 〈∆x(~r) ∆x(~r′)〉 = 〈(∆x)²〉 = O(1/N)      (3.41)

or

∫d~r ∫d~r′ C(~r − ~r′) = O(V²/N)

Let ~R = ~r − ~r′ and ~R′ = ½(~r + ~r′). The Jacobian of the transformation is unity, and we obtain

∫d~R′ ∫d~R C(~R) = V ∫d~R C(~R) = O(V²/N)      (3.42)

So, letting ~R → ~r,

∫d~r C(~r) = O(V/N)

But N = V/ξ³ (or, in d dimensions, N = L^d/ξ^d), so

∫d~r C(~r) = O(ξ³)

or

∫d~r 〈∆x(~r) ∆x(0)〉 = O(ξ³)      (3.43)
for a system of N → ∞ independent parts. This provides a constraint on correlations. It is a neat result in Fourier space. Let

C(~k) ≡ ∫d~r e^{i~k·~r} C(~r)

and

C(~r) = ∫ [d~k/(2π)^d] e^{−i~k·~r} C(~k)      (3.44)

and

〈|∆x(k)|²〉 ≡ ∫d~r e^{i~k·~r} 〈∆x(~r) ∆x(0)〉

Hence the constraint of Eq. 3.43 is

lim_{k→0} C(k) = O(ξ³)

or

lim_{k→0} 〈|∆x(k)|²〉 = O(ξ³)      (3.45)

Many functions satisfy the constraint of Eq. 3.43 (see Fig. 3.11). It turns out that usually

〈∆x(~r) ∆x(0)〉 ∼ e^{−r/ξ}      (3.46)

for large r, which satisfies Eq. 3.43. This is obtained from analyticity of (C(k))^{−1} near k = 0. It can be argued that

1/C(k) ≈ const. + const. k² + ...

so that 〈|∆x(k)|²〉 = const./(const. + k²) near k = 0. Fourier transformation gives the
result of Eq. 3.46 above.

(Note: In the previous notation with boxes, as in Fig. 3.4, we had

∆X = Σ_i ∆X_i

or

∆x = (1/N) Σ_i ∆X_i

This means

(1/V) ∫d~r ∆x(~r) = (1/N) Σ_i ∆X_i      (3.47)

This is satisfied by

∆x(~r) = (V/N) Σ_{i=1}^{N} δ(~r − ~r_i) ∆X_i

where ~r_i is a vector pointing to box i, and ∆X_i = ∆X(~r_i).)

Let me emphasize again that all these results follow from a system having N → ∞ independent parts. To recapitulate, for x = X/N, where X is extensive, N = V/ξ³ is the number of independent parts, V is the volume, and ξ is the correlation length:
〈x〉 = O(1)      (3.7)

〈(∆x)²〉 = O(1/N)      (3.15)

ρ(x) ∝ e^{−(x−〈x〉)²/(2〈(∆x)²〉)}      (3.30)

∫d~r 〈∆x(~r) ∆x(0)〉 = O(ξ³)      (3.43)

The purpose of this course is to examine the cases where V/ξ³ = O(1) and these results break down! It is worthwhile to give some toy examples now where, in particular, Eq. 3.43 breaks down.
Chapter 4
Toy Fractals
Toy fractals are mathematical constructs that seem to exist in fractional dimensions between 1 and 2, or 2 and 3, or what have you. They can provide a useful starting point for thinking about real systems where V/ξ³ = O(1) or, if V = L³, where ξ/L = O(1).

We'll briefly discuss self-affine, or surface, fractals, and volume fractals.

Figure 4.1: Self-Affine Fractal

Figure 4.1 shows a self-affine fractal, where the surface is very jagged or, as is said in physics, rough. The self-affine fractal has the property that its perimeter length P (or surface area in three dimensions) satisfies

P ∼ L^{d_s},  as L → ∞      (4.1)

where d_s is the surface self-affine fractal exponent. For a circle of diameter L,

P_circle = πL

while for a square of width L,

P_square = 4L
In fact, for run-of-the-mill surfaces in d dimensions,

P_nonfractal ∼ L^{d−1}      (4.2)

Usually,

d_s > d − 1      (4.3)

I'll give an example below.

Figure 4.2: Volume Fractal

Figure 4.2 shows a volume fractal; note all the holes with missing stuff. The fractal has the property that its mass M satisfies

M ∼ L^{d_f},  as L → ∞      (4.4)

where d_f is the fractal dimension. For a circle of diameter L, M = (π/4)L², in units of the density, while for a square of width L, M = L². For run-of-the-mill objects,

M_nonfractal ∼ L^d      (4.5)

in d dimensions. Usually,

d_f < d      (4.6)

Note that since the mass density is M/L^d, this implies that

density of a fractal ∼ 1/L^{d−d_f} → 0      (4.7)

as L → ∞, implying fractals float (!?). Think about this.
Two toy examples show that ds > d − 1 and df < d are possible.
4.1 Von Koch Snowflake
To draw the von Koch snowflake, follow the steps shown in Fig. 4.3.

Figure 4.3: Von Koch Snowflake

To find the self-affine fractal exponent for the perimeter, imagine I had (somehow) drawn one of these fractals out to a very, very large number of steps. Say the smallest straight link is of length l. We'll calculate the perimeter as we go through increasing numbers of links in a large, complex fractal (Fig. 4.4). Hence

Figure 4.4:

P_n = 4^n l    when    L_n = 3^n l      (4.8)

Taking the logarithm,

ln(P/l) = n ln 4    and    ln(L/l) = n ln 3

Dividing these into each other,

ln(P/l) / ln(L/l) = ln 4 / ln 3
or

P/l = (L/l)^{d_s}      (4.9)

where

d_s = ln 4 / ln 3 ≈ 1.26 > d − 1 = 1

It is common in the physics literature to write

P(L) = w(L) L^{d−1}      (4.10)

where w(L) is the width or thickness of a rough interface, and

w(L) ∼ L^x,  as L → ∞      (4.11)

where x is the roughening exponent. Clearly (x ≈ 0.26 for the snowflake),

d_s = x + d − 1      (4.12)
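The scaling P/l = (L/l)^{d_s} holds exactly at every generation of the snowflake, as a quick check confirms:

```python
import math

l = 1.0   # smallest link length, in arbitrary units
for n in (2, 5, 10):
    P = 4 ** n * l                  # perimeter after n generations
    L = 3 ** n * l                  # overall size after n generations
    print(n, math.log(P / l) / math.log(L / l))   # = ln4/ln3 at every n

print("d_s =", math.log(4) / math.log(3))
```

The exponent is independent of generation number, which is exactly the self-similarity that makes d_s well defined.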
4.2 Sierpinski Triangle

To draw the Sierpinski triangle, follow the steps shown in Fig. 4.5.

Figure 4.5: Sierpinski Triangle

To find the fractal exponent for the mass, imagine I had (somehow) drawn one of these fractals out to a very large number of steps. Say the smallest filled triangle has mass m and volume v. Then we'll calculate the mass as we go through increasing numbers of triangles (Fig. 4.6). Hence
Figure 4.6:

M_n = 3^n m    when    V_n = 4^n v      (4.13)

or, since V_n = ½ L_n² and v = ½ l², we have

L_n² = 4^n l²,    or    L_n = 2^n l

Taking logarithms,

ln(M/m) = n ln 3    and    ln(L/l) = n ln 2

or, taking the ratio,

ln(M/m) / ln(L/l) = ln 3 / ln 2

and

M/m = (L/l)^{d_f}      (4.14)

where

d_f = ln 3 / ln 2 ≈ 1.585      (4.15)

So d_f < d, where d = 2 here.
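A quick numerical check of d_f = ln 3/ln 2. The construction below is not from the notes: it uses the fact that the cells (i, j) of a 2^n × 2^n grid with i & j == 0 (Pascal's triangle mod 2) reproduce the Sierpinski pattern, with mass M_n = 3^n:

```python
import math

def sierpinski_mass(n):
    """Mass (filled-cell count) of a 2^n x 2^n Sierpinski pattern, built from
    the bitwise rule that cell (i, j) is filled iff i & j == 0 -- a
    construction trick (Pascal's triangle mod 2), not from the notes."""
    L = 2 ** n
    return sum(1 for i in range(L) for j in range(L) if i & j == 0)

for n in (1, 4, 8):
    M, L = sierpinski_mass(n), 2 ** n
    print(L, M, math.log(M) / math.log(L))   # exponent = ln3/ln2 at every n

print("d_f =", math.log(3) / math.log(2))
```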
This is not the common notation in the physics literature. Below, but near, a second-order phase transition, the excess density satisfies

n ∝ |T − T_c|^β

where T_c is the critical temperature. However, the correlation length satisfies

ξ ∝ |T − T_c|^{−ν}

Or, since ξ's largest value is limited by the edge length L of the entire system,

ξ ∼ L ∼ |T − T_c|^{−ν}

Using this above gives

n ∝ [L^{−1/ν}]^β

or, for the excess density near T_c,

n ∝ 1/L^{β/ν}

where β and ν are critical exponents. But, for a fractal,

n ∝ 1/L^{d−d_f}

so

d_f = d − β/ν      (4.16)

where β/ν ≈ 0.415 for the Sierpinski triangle.

For a real system (in the Ising universality class in two dimensions), x = 1/2, so d_s = 3/2, and β/ν = 1/8, so d_f = 15/8.
Figure 4.7:
4.3 Correlations in Self-Similar Objects
Figure 4.8:
A striking thing about a fractal is that it "looks the same" on different length scales. That is to say, your eye sees similar (or, for these toy fractals, identical) correlations whether you look on length scale r or r/b, where b > 1. So far as density correlations go, let

C(r) = 〈∆n(r) ∆n(0)〉

(where the average is, say, over all orientations in space, for convenience). Then the evident self-similarity of fractals implies

C(r) ∝ C(r/b)

or

C(r) = f(b) C(r/b)      (4.17)
It is easy to show that

f(b) = b^{−p}      (4.18)

where p is a constant. Take the derivative with respect to b of Eq. 4.17, letting

r* ≡ r/b      (4.19)

Then

dC(r)/db = 0 = (df/db) C(r*) + f (dC/dr*)(dr*/db)

or, since dr*/db = −r*/b,

(df/db) C(r*) = f (r*/b) (dC/dr*)

and

d ln f(b)/d ln b = d ln C(r*)/d ln r*      (4.20)

Because one side depends only on b, and the other only on r*, each side must be a constant. Hence

d ln f(b)/d ln b = const. ≡ −p      (4.21)

So

f(b) ∝ b^{−p}      (4.22)

and the invariance relation is

C(r) = b^{−p} C(r/b)      (4.23)

(up to a multiplicative constant).

If one lets

b = 1 + ε,    ε ≪ 1      (4.24)

then

C(r) = (1 + ε)^{−p} C(r/(1 + ε)) ≈ (1 − pε) {C(r) − εr ∂C/∂r} + ...

Terms of order ε give

−p C(r) = r ∂C/∂r

the solution of which is

C ∝ 1/r^p      (4.25)

Hence, self-similarity implies homogeneous correlation functions

C(r) = b^{−p} C(r/b)

and power-law correlations

C(r) ∝ 1/r^p      (4.26)
Usually, these apply for large r, with the sign of p chosen to make C(r → ∞) → 0. Note that this is quite different from what occurs for a system of many independent parts, where (we said)

C(r) ∼ e^{−r/ξ}

Figure 4.9: Decay of correlations in a self-similar object: C(r) ∼ 1/r^p

Figure 4.10: Decay of correlations in a system of many independent parts: C(r) ∼ e^{−r/ξ}, decaying on the scale ξ

The restriction

∫d~r C(r) = O(ξ^d)

in d dimensions is evidently unsatisfied for

C(r) ∼ 1/r^p

if d > p, for then the integral diverges! Usually,

p = 2 − η      (4.27)

and η ≥ 0, so d > p in d = 2 and 3.

So fractals, self-affine (rough) surfaces, and second-order phase-transition fluctuations do not satisfy the central-limit theorem. In fact, rough surfaces and phase transitions will be the focus for the rest of the course.
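The divergence of ∫d~r C(r) for a power law with p < d, versus its convergence for an exponential, can be seen directly. A sketch with made-up values d = 2, p = 1.75, and ξ = 1:

```python
import math

def integral_C(Lmax, C, d=2, dr=0.01):
    """Riemann estimate of the d-dimensional integral of C(r) from r = 1 to
    Lmax, i.e. (up to angular constants) the integral of r^{d-1} C(r) dr."""
    total, r = 0.0, 1.0
    while r < Lmax:
        total += r ** (d - 1) * C(r) * dr
        r += dr
    return total

power = lambda r: r ** -1.75       # p = 1.75 < d = 2: a made-up exponent
expo  = lambda r: math.exp(-r)     # exponential decay with xi = 1

for L in (10, 100, 1000):
    print(L, integral_C(L, power), integral_C(L, expo))
```

The exponential column saturates at an L-independent, O(ξ^d) value, while the power-law column keeps growing with system size, which is the breakdown of Eq. 3.43.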
Chapter 5
Molecular Dynamics
At this point, one might say: computers are so big and fierce, why not simply solve the microscopic problem completely? This is the approach of molecular dynamics. It is a very important numerical technique, which is very, very conservative. It is used by physicists, chemists, and biologists.
Imagine a classical solid, liquid, or gas composed of many atoms or molecules. Each atom has a position ~r_i and a velocity ~v_i (or momentum m~v_i).

Figure 5.1: Particles i and j, at positions ~r_i and ~r_j, separated by |~r_i − ~r_j|

The Hamiltonian is (where m is the atom's mass)

H = Σ_{i=1}^{N} m~v_i²/2 + V(~r_1, ..., ~r_N)

Usually, the total potential can be well approximated as the sum of pairwise interactions, i.e.

V(~r_1, ..., ~r_N) = Σ_{i>j} V(|~r_i − ~r_j|)

which only depends on the distances between atoms. This potential usually has
a form like that drawn in Fig. 5.2, where the depth of the well must be

Figure 5.2: Pair potential V(r), with well depth ε near r = r_o

ε ∼ (100's of K) k_B

and

r_o ∼ (a few Å)

This is because, in a vapor phase, the density is small and the well is unimportant; when the well becomes important, the atoms condense into a liquid phase. Hence ε ∼ boiling point of a liquid, and r_o is the separation of atoms in a liquid. Of course, say the mass of an atom is

m ∼ 10 × 10^{−27} kg,    r_o ∼ 2 Å

Then the liquid density is

ρ ∼ m/r_o³ = (10 × 10^{−27} kg)/(10 Å³) = (10^{−27} × 10³ gm)/(10 × (10^{−8} cm)³) ∼ 1 gm/cm³

of course. To convince oneself that quantum effects are unimportant, simply estimate the de Broglie wavelength

λ ∼ h/√(m k_B T) ∼ 1 Å/√T

where T is measured in Kelvin. So this effect is unimportant for most liquids. The counter-example is helium. In any case, quantum effects are usually
marginal, and lead to no quantitative differences of importance. This is simply because the potentials are usually parameterized fits, such as the Lennard-Jones potential

V(r) = 4ε[(σ/r)^{12} − (σ/r)^6]

where

r_o = 2^{1/6} σ ≈ 1.1σ

and

V(r_o) = −ε

by construction. For argon, a good "fit" is

r_o = 3.4 Å,    ε = 120 K

Numerically, it is convenient that V(2r_o) ≪ ε, and is practically zero. Hence one usually considers

V(r) = { Lennard-Jones,  r < 2r_o
       { 0,              r > 2r_o

where r_o = 2^{1/6}σ. Lennard-Jones has a run-of-the-mill phase diagram which is representative of argon or neon, as well as most simple run-of-the-mill pure substances. Simpler potentials like the square well also have the same properties.

Figure 5.3: Phase diagrams of Lennard-Jonesium: P-T, with solid (s), liquid (l), and vapor (v) phases, and T-ρ, with coexistence regions (the solid phase is fcc, as I recall)
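The quoted facts about the Lennard-Jones potential, the minimum at r_o = 2^{1/6}σ with depth −ε and V(2r_o) being practically zero, are easy to verify in reduced units (σ = ε = 1):

```python
def V_lj(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair potential V(r) = 4*eps*[(sigma/r)^12 - (sigma/r)^6],
    here in reduced units eps = sigma = 1."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

r0 = 2.0 ** (1.0 / 6.0)       # claimed minimum r_o = 2^{1/6} sigma ~ 1.12 sigma
print("V(sigma)  =", V_lj(1.0))       # zero crossing at r = sigma
print("V(r_o)    =", V_lj(r0))        # depth -eps at the minimum
print("V(2 r_o)  =", V_lj(2 * r0))    # a few percent of eps: cutoff is harmless
```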
But the hard-sphere potential has a more simple-minded phase diagram, because of the lack of an attractive interaction. The square-well potential is

V(r) = { ∞,   r < σ
       { −ε,  σ < r < σ′
       { 0,   r > σ′

Figure 5.4: Square-well potential

while the hard-sphere potential is

V(r) = { ∞,  r < σ
       { 0,  r > σ

Figure 5.5: Hard-sphere potential, and its phase diagram (P-T and T-ρ): only solid (s) and liquid (l) phases, with their coexistence region

Finally, let's get to the molecular dynamics method. Newton's equations (or Hamilton's) are

d~r_i/dt = ~v_i

and, with ~a_i = ~F_i/m,

d~v_i/dt = −(1/m) Σ_{j≠i} (∂/∂~r_i) V(|~r_i − ~r_j|)
Clearly, we can simply solve these numerically given all the positions and velocities initially, i.e. {~ri(t = 0), ~vi(t = 0)}. A very simple (in fact too simple) way to do this is by

d~ri/dt ≈ (~ri(t + ∆t) − ~ri(t))/∆t

for small ∆t. This gives

~ri(t + ∆t) = ~ri(t) + ∆t ~vi(t)

and

~vi(t + ∆t) = ~vi(t) + ∆t (1/m) ~Fi(t)
where ~Fi(t) is determined by {~ri(t)} only. A better (but still simple) method is due to Verlet. It should be remembered that the discretization method has a little bit of black magic in it.
Verlet notes that

~ri(t ± ∆t) = ~ri(t) ± ∆t ~vi(t) + ((∆t)^2/2) ~ai(t) ± ...

where ~ai = ~Fi/m = d^2~ri/dt^2. Adding the two expansions gives

~ri(t + ∆t) = 2~ri(t) − ~ri(t − ∆t) + (∆t)^2 (1/m) ~Fi(t) + O((∆t)^4)

And use the simple rule

~vi(t) = (~ri(t + ∆t) − ~ri(t − ∆t))/(2∆t)

That's all. You have to keep one extra set of ~ri's, but it is the common method, and works well. To do statistical mechanics, you do time averages, after the system equilibrates. Or ensemble averages over different initial conditions. The ensemble is microcanonical since energy is fixed. Usually one solves it for a fixed volume, using periodic boundary conditions. A common practice is to do a
Figure 5.6: Periodic Boundary Conditions
“canonical” ensemble by fixing temperature through the sum rule

(1/2) m 〈v^2〉 = (1/2) kB T   (per degree of freedom)

or,

T = m〈v^2〉/kB

This is done by constantly rescaling the magnitude of the velocities to be

|v| ≡ 〈v^2〉^{1/2} = √(kBT/m)
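Here is a minimal sketch of the Verlet scheme (my own illustrative code, not the notes' own program): one dimension, units m = ǫ = σ = 1, two atoms started at rest slightly stretched from the potential minimum, which then oscillate about the separation ro.

```python
def forces(x, L, rc=2.5):
    """Pairwise truncated Lennard-Jones forces in 1d, periodic box of size L."""
    N = len(x)
    f = [0.0] * N
    for i in range(N):
        for j in range(i + 1, N):
            dx = x[i] - x[j]
            dx -= L * round(dx / L)        # minimum-image convention
            r = abs(dx)
            if r < rc:
                sr6 = r ** -6
                fr = 24.0 * (2.0 * sr6 * sr6 - sr6) / r   # F(r) = -dV/dr
                f[i] += fr * dx / r
                f[j] -= fr * dx / r
    return f

def verlet_step(x, xold, dt, L):
    """r(t + dt) = 2 r(t) - r(t - dt) + dt^2 F(t)/m, plus central-difference v(t)."""
    f = forces(x, L)
    xnew = [2.0 * xi - xo + dt * dt * fi for xi, xo, fi in zip(x, xold, f)]
    v = [(xn - xo) / (2.0 * dt) for xn, xo in zip(xnew, xold)]
    return xnew, x, v

L, dt = 20.0, 0.005
x = [0.0, 2.0 ** (1.0 / 6.0) + 0.01]   # slightly stretched from the minimum at ro
xold = x[:]                            # start at rest: r(t - dt) = r(t)
for step in range(1000):
    x, xold, v = verlet_step(x, xold, dt, L)
```

Note the “one extra set of ~ri's”: the pair (x, xold) is exactly what the two-step Verlet rule requires.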
What about microscopic reversibility and the arrow of time? Recall the original equations

∂~ri/∂t = ~vi

∂~vi/∂t = −(1/m) Σ_{j (j≠i)} (∂/∂~ri) V (|~ri − ~rj |)

The equations are invariant under time reversal t → −t if

~r → ~r
~v → −~v

How then can entropy increase? This is a deep and uninteresting question. Entropy does increase; it's a law of physics. The real question is how to calculate how fast entropy increases; e.g., calculate a viscosity or a diffusion constant, which set the time scale over which entropy increases.
Nevertheless, here's a neat way to fool around with this a bit. Take a gas of N atoms in a box of size L^d where the energy is fixed such that the average equilibrium separation between atoms is
seq = O(3 or 4)ro
where ro is a parameter giving the rough size of the atoms. Now initialize the atoms in a corner of the box with
s(t = 0) = O(ro)
or smaller (see fig. 5.7). Now, at time t = later, let all ~v → −~v. By microscopicreversibility you should get fig. 5.8. Try it! You get fig. 5.9. It is round off
Figure 5.7: Snapshots at t = 0, t = later, and t = equilibrium, and the corresponding “entropy” versus time.

Figure 5.8: What microscopic reversibility predicts after ~v → −~v at t = later: the “entropy” returns to its initial value at t = 2 later.

Figure 5.9: What you actually get: the “entropy” stays at its equilibrium value.
error, but try to improve your precision. You will always get the same thing. Think about it. The second law is a law, not someone's opinion.
Chapter 6
Monte Carlo and the Master Equation
An important description for nonequilibrium statistical mechanics is provided by the master equation. This equation also provides the motivation for the Monte Carlo method, the most important numerical method in many–body physics.
The master equation, despite the pretentious name, is phenomenology fol-lowing from a few reasonable assumptions. Say a system is defined by a state{S}. This state might, for example, be all the spins ±1 that an Ising model hasat each site. That is {S} are all the states appearing in the
Z = Σ_{S} e^{−E{S}/kBT}   (6.1)
partition function above. We wish to know the probability of being in somestate {S} at a time t,
p({S}, t) =? (6.2)
from knowledge of probabilities of states at earlier times. In the canonicalensemble (fixed T , fixed number of particles)
Peq({S}) ∝ e−E{S}/kBT (6.3)
of course. For simplicity, let's drop the { } brackets and let
{S} ≡ S (6.4)
As time goes on, the probability of being in state S increases because one makes transitions into this state from other (presumably nearby in some sense) states S′. Say the probability of going from S′ → S in a unit time is

W (S, S′) ≡ W (S ← S′)   (read: S from S′)   (6.5)
This is a transition rate as shown in fig. 6.1. Hence
Figure 6.1: Transition rates W (S, S′) and W (S′, S) between the states S′ and S.
W (S, S′)P (S′)   (6.6)

are contributions increasing P (S). However, P (S) decreases because one makes transitions out of S to the states S′ with probability

W (S′, S) ≡ W (S′ ← S)   (6.7)

so that

W (S′, S)P (S)   (6.8)
are contributions decreasing P (S) as time goes on. The master equation incorporates both these processes as

∂P (S, t)/∂t = Σ_{S′} [W (S, S′)P (S′) − W (S′, S)P (S)]   (6.9)
which can be written as

∂P (S, t)/∂t = Σ_{S′} L(S, S′)P (S′)   (6.10)
where the inappropriately named Liouvillian is

L(S, S′) = W (S, S′) − (Σ_{S′′} W (S′′, S)) δ_{S,S′}   (6.11)
Note that the master equation
1. has no memory of events further back in time than the differential element“dt”. This is called a Markov process
2. is linear.
3. is not time reversal invariant as t → −t (since it is first order in time).
In equilibrium,

∂Peq(S)/∂t = 0   (6.12)

since we know this to be the definition of equilibrium. Hence, in Eq. 6.9,

W (S, S′)Peq(S′) = W (S′, S)Peq(S)   (6.13)
which is called detailed balance of transitions or fluctuations. For the canonical ensemble, therefore,

W (S, S′)/W (S′, S) = e^{−(E(S)−E(S′))/kBT}   (6.14)

This gives a surprisingly simple constraint on the transitions W : it is necessary (but not sufficient) that they satisfy Eq. 6.14. In fact, if we are only concerned with equilibrium properties, any W which satisfies detailed balance is allowable. We need only pick a convenient W !
The two most common W 's used in numerical work are the Metropolis rule

W_Metropolis(S, S′) = { e^{−∆E/kBT} , ∆E > 0
                      { 1           , ∆E ≤ 0     (6.15)

and the Glauber rule

W_Glauber(S, S′) = (1/2)(1 − tanh(∆E/2kBT))   (6.16)

where

∆E = E(S) − E(S′)   (6.17)
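Both rules can be checked numerically; here is a small sketch (my own, in units kBT = 1, with arbitrary ∆E values):

```python
import math

def w_metropolis(dE):
    # W = exp(-dE) for dE > 0, and 1 otherwise (units kB T = 1)
    return math.exp(-dE) if dE > 0 else 1.0

def w_glauber(dE):
    # W = (1/2)(1 - tanh(dE/2)) (units kB T = 1)
    return 0.5 * (1.0 - math.tanh(0.5 * dE))

# detailed balance, Eq. 6.14, requires W(dE)/W(-dE) = exp(-dE)
for w in (w_metropolis, w_glauber):
    for dE in (-4.0, -0.5, 0.3, 2.0):
        assert abs(w(dE) / w(-dE) - math.exp(-dE)) < 1e-12
```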
Let us quickly check that W_Metropolis satisfies detailed balance. If E(S) > E(S′),

W (S, S′)/W (S′, S) = e^{−(E(S)−E(S′))/kBT} / 1 = e^{−∆E/kBT}

Now if E(S) < E(S′),

W (S, S′)/W (S′, S) = 1 / e^{−(E(S′)−E(S))/kBT} = e^{−∆E/kBT}

where ∆E = E(S) − E(S′) as above. So detailed balance works. You can check yourself that the Glauber rule works. The form of the rules is shown in figures 6.2 and 6.3. To do numerical work via Monte Carlo, we simply make transitions from state {S′} → {S} using the probability W ({S}, {S′}). A last bit of black magic is that one usually wants the states {S′} and {S} to be “close”. For something like the Ising model, a state {S} is close to {S′} if the two states differ by no more than one spin flip, as drawn in fig. 6.4. Let us do this explicitly for the 2 dimensional Ising model in zero field, the same thing that Onsager solved!
ESTATE = −J Σ_{〈ij〉} Si Sj   (6.18)
Figure 6.2: The Metropolis rule W versus ∆E, at low, moderate, and high T.

Figure 6.3: The Glauber rule W versus ∆E, at low, moderate, and high T ; it approaches W = 1/2 for all ∆E at high T.
Figure 6.4: From the source state {S′}, a single spin flip gives the target state {S} with probability W ({S}, {S′}).
Figure 6.5: The five cases for flipping a spin on the square lattice, with probabilities e^{−8J/kBT} (Case 1), e^{−4J/kBT} (Case 2), and 1 (Cases 3, 4, and 5).
where Si = ±1 are the spins on sites i = 1, 2, ..., N (not to be confused with our notation for a state {S}!), and 〈ij〉 means only nearest neighbours are summed over. The positive constant J is the coupling interaction. If only one spin can possibly flip, note there are only five cases to consider, as shown in
fig. 6.5. To get the probabilities (Metropolis):

For Case 1: EBefore = −4J , EAfter = +4J , ∆E = 8J , W (S, S′) = Prob. of flip = e^{−8J/kBT}

For Case 2: EBefore = −2J , EAfter = +2J , ∆E = 4J , W (S, S′) = Prob. of flip = e^{−4J/kBT}

For Case 3: EBefore = 0 , EAfter = 0 , ∆E = 0 , W (S, S′) = Prob. of flip = 1

For Case 4: EBefore = +2J , EAfter = −2J , ∆E = −4J , W (S, S′) = Prob. of flip = 1

For Case 5: EBefore = +4J , EAfter = −4J , ∆E = −8J , W (S, S′) = Prob. of flip = 1
This is easy if W = 1; if W = e^{−4J/kBT}, then you have to know the temperature in units of J/kB . Let's say, for example,

W = 0.1

Then the spin flips 10% of the time. You can enforce this by comparing W = 0.1 to a random number drawn uniformly between 0 and 1. If the random number is less than or equal to W , then you flip the spin.
Here is an outline of a computer code.

Initialize System
- How many spins? How long to average?
- How hot? (give T = (...)J/kB)
- How much field H?
- Give all N spins some initial value (such as Si = 1).
- Use periodic boundary conditions.

Do time = 0 to maximum time
    Do i = 1 to N
    - Pick a spin at random
    - Check if it flips
    - Update the system
    (doing this N times, each spin is flipped once on average)

    Calculate magnetization per spin 〈M〉 = (1/N)〈Σi Si〉

    Calculate energy per spin 〈E/J〉 = −(1/N)〈Σ_{〈ij〉} SiSj〉

averaging over time (after throwing away some early time).
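The outline can be turned into a short program. The following is a minimal single-spin-flip Metropolis sketch for the zero-field model (my own illustrative code, not the notes' own program; lattice size, temperatures, and run length are arbitrary choices), in units J = kB = 1:

```python
import math, random

def ising_metropolis(Lside=16, T=2.0, steps=1000, seed=1):
    """Single-spin-flip Metropolis for the 2d Ising model, H = 0, J = kB = 1.
    Returns the time-averaged magnetization and energy per spin."""
    rng = random.Random(seed)
    s = [[1] * Lside for _ in range(Lside)]
    N = Lside * Lside
    m_sum = e_sum = 0.0
    nsamples = 0
    for t in range(steps):
        for _ in range(N):                  # one Monte Carlo step = N attempts
            i, j = rng.randrange(Lside), rng.randrange(Lside)
            # sum of the four nearest neighbours, periodic boundary conditions
            nn = (s[(i + 1) % Lside][j] + s[(i - 1) % Lside][j]
                  + s[i][(j + 1) % Lside] + s[i][(j - 1) % Lside])
            dE = 2.0 * s[i][j] * nn         # energy change if s[i][j] flips
            if dE <= 0 or rng.random() <= math.exp(-dE / T):
                s[i][j] = -s[i][j]
        if t >= steps // 2:                 # throw away the early-time transient
            nsamples += 1
            m_sum += sum(map(sum, s)) / N
            e_sum += -sum(s[i][j] * (s[(i + 1) % Lside][j] + s[i][(j + 1) % Lside])
                          for i in range(Lside) for j in range(Lside)) / N
    return m_sum / nsamples, e_sum / nsamples

m_low, e_low = ising_metropolis(T=1.5)    # T < Tc: stays magnetized
m_high, e_high = ising_metropolis(T=4.0)  # T > Tc: magnetization relaxes to zero
```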
Figure 6.6: Structure of the run: initialization at time t = 0, then grinding, where any 1 mcs time step corresponds to N updates, then output.
Figure 6.7: Magnetization M versus time t for high T > Tc, relaxing from 1 to 0. Here t is in Monte Carlo steps: 1 unit of time corresponds to attempting N spin flips.
Figure 6.8: Magnetization versus time for low T < Tc: after a transient, time average to get 〈M〉.
Figure 6.9: 〈M〉 versus T , vanishing as M ∼ |T − Tc|^β with β = 1/8, at Tc ≈ 2.269 J/kB.
Figure 6.10: Energy per spin 〈E/J〉 versus T , rising from −2 at T = 0 through Tc.
If you start at a large temperature and initially Si = 1, the magnetization quickly relaxes to zero (see fig. 6.7). Below Tc, the magnetization quickly relaxes to its equilibrium value (see fig. 6.8). Doing this for many temperatures, for a large enough lattice, gives fig. 6.9. Doing this for the energy per spin gives fig. 6.10, which is pretty boring until one plots the heat capacity per spin

C ≡ (1/J) ∂〈E〉/∂T

Try it. You'll get very close to Onsager's results with fairly small systems,

Figure 6.11: Heat capacity per spin C versus T , diverging as ln |T − Tc| at Tc ≈ 2.269 J/kB.

N = 10 × 10, or 100 × 100, over a thousand or so time steps, and you'll do much better than the mean field theory we did in class.
As an exercise, think how you would do this for the solid–on–solid model

Z = Σ_{States} e^{−EState/kBT}

where

EState = J Σ_{i=1}^{N} |hi − hi+1|

and hi are heights at sites i.
Chapter 7
Review of Thermodynamics
There are a lot of good thermodynamics books, and a lot of bad ones too. The ones I like are “Heat and Thermodynamics” by Zemansky and Dittman (but beware their mediocre discussion of statistical mechanics, and their confusing discussion of phase transitions), and Callen's book, “Thermostatistics” (which is overly formal).
A thermodynamic state of a system is described by a finite number of variables, usually at least two. One needs at least two because of conservation laws for energy E and number of particles N . So one needs at least (E, N ) to specify a system, or two independent variables which could give (E, N ). Some systems need more than two, and for some systems it is a useful approach to only consider one variable, but for now we will consider that two are enough.

There is a deep question lurking here. Thermodynamics describes systems in the thermodynamic limit: N → ∞, volume V → ∞, such that the number density n = N/V = const. Each classical particle has, in three dimensions, 6 independent degrees of freedom describing its position and momentum. So there are not 2, but ∼ 10^23 degrees of freedom! This is a very deep question. Many people have worried about it. But let's take the physics view. We need at least two variables, so let's use two. Experimentally, this describes thermodynamic systems on large length and time scales, far from, for example, phase transitions.
Figure 7.1: Our system: 2 out of T , P , S, E, V , N , ... are independent.
Consider a pure system which can be in a gas, liquid, or simple solid phase. It can be described by temperature T , pressure P , and entropy S, in addition to E, V , and N . Temperature is a “good” thermodynamic variable by the zeroth law of thermodynamics (two systems in thermal equilibrium with a third system are in thermal equilibrium with each other). Entropy is a “good” thermodynamic variable by the second law of thermodynamics (the entropy of an isolated system never decreases).
Variables can be classified by whether they are extensive or intensive. Say one considers a system of two independent parts, as drawn in fig. 7.2.
Figure 7.2: A system with two independent parts.
An extensive variable satisfies
Vtotal = V1 + V2
Etotal = E1 + E2
Stotal = S1 + S2
while an intensive variable satisfies
T = T1 = T2
P = P1 = P2
From mechanics, the work done on a system to displace it by dx is

−F dx

where F is the force. Acting on an area, we have

dW = −P dV

where dW is the small amount of work, P the pressure, and dV the differential volume change.
Two new concepts arise in thermodynamics. Temperature, which is something systems in thermal equilibrium share; and heat, which is a form of energy exchanged between systems at different temperatures.
The law of energy conservation (first law of thermodynamics) is

dE = dW + dQ

where dE is the change in internal energy, dW the work done on the system, and dQ the heat transferred.
The second law of thermodynamics is
dQ = TdS
where S is entropy. Heat always flows from the hot system to the cold system. In fig. 7.3, both sides change due to the transfer of heat Q. For the hot side
Figure 7.3: System with hot and cold parts exchanging heat Q.

(∆S)HOT = −Q/THOT

for the cold side

(∆S)COLD = +Q/TCOLD

Hence

(∆S)TOTAL = Q [1/TCOLD − 1/THOT]

So

(∆S)TOTAL ≥ 0
This is the usual way the second law is presented. Note that this is the only law of physics that distinguishes past and future.
As an aside, it is interesting to note that S must be a “good” thermodynamic variable, or else the second law is violated. If we assume the opposite (that two distinct states have the same S) we can create a perpetual motion machine, as drawn in fig. 7.4.

Figure 7.4: Perpetual motion machine: a cycle in the P–V plane made of an isotherm and an adiabat joining two states with the same (T, S), for which Q = W .

Combining the 1st and 2nd laws gives
dE = −P dV + T dS
where only 2 out of E, P , V , T and S are independent. This is the most important and useful result of thermodynamics.
To get further, we need equations of state, which relate the variables. Within thermodynamics, these are determined experimentally. For example, the ideal
gas equation of state is

P = P (V, T ) = N kB T /V

and

E = E(T )

where usually

E = C T

and C is a constant determined experimentally. Then

C dT = −(NkBT /V ) dV + T dS
So, for an adiabatic process (wherein no heat is exchanged, so dS = 0), a changein volume causes a change in temperature.
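For instance, one can integrate the adiabat numerically and compare with the exact consequence of the equation above, T V^{NkB/C} = const (a sketch of my own, with the illustrative values NkB = 1 and C = 3/2):

```python
# adiabatic ideal gas: setting dS = 0 above gives C dT = -(N kB T / V) dV,
# whose exact solution is T * V**(N kB / C) = const
NkB = 1.0                 # N kB, illustrative units
C = 1.5                   # e.g. C = (3/2) N kB for a monatomic ideal gas
T, V = 300.0, 1.0
V_final, steps = 2.0, 100000
dV = (V_final - V) / steps
for _ in range(steps):
    T += -(NkB / C) * (T / V) * dV    # Euler step along the adiabat
    V += dV

T_exact = 300.0 * (1.0 / V_final) ** (NkB / C)
print(T, T_exact)   # doubling the volume cools the gas
```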
In general, equations of state are not known. Instead one obtains differential equations of state

dP = (∂P/∂V )T dV + (∂P/∂T )V dT

and

dE = (∂E/∂V )T dV + (∂E/∂T )V dT

Clearly, if we know all the (T, V ) dependence of these derivatives, such as (∂E/∂T )V , we know the equations of state. In fact, most work focuses on the derivatives. Some of these are so important they are given their own names.
Isothermal compressibility:            κT = −(1/V )(∂V /∂P )T

Adiabatic compressibility:             κS = −(1/V )(∂V /∂P )S

Volume expansivity:                    α = (1/V )(∂V /∂T )P

Heat capacity at constant volume:      CV = T (∂S/∂T )V

Heat capacity at constant pressure:    CP = T (∂S/∂T )P
Of course, κT = κT (T, V ), etc. These results rest on such firm foundations, and follow from such simple principles, that any microscopic theory must be consistent with them. Furthermore, a microscopic theory should be able to address the unanswered questions of thermodynamics, e.g., calculate κT or CV from first principles.
These quantities can have very striking behavior. Consider the phase diagram of a pure substance (fig. 7.5). The lines denote 1st order phase transitions. The dot denotes a continuous phase transition, often called second order.

Figure 7.5: Solid (s), liquid (l), and vapour (v) phases of a pure substance in the P–T plane.

Figure 7.6: The same phase diagram, with the continuous transition at pressure Pc crossed by raising the temperature through T1 and T2.

Imagine preparing a system at the pressure corresponding to the continuous transition, and increasing temperature from zero through that transition (see fig. 7.6).
Figure 7.7 shows the behavior of κT and CV : at the first-order transition, κT has a delta-function divergence while CV has a cusp. At the continuous transition, both CV and κT have power-law divergences. We will spend a lot of time on continuous, so-called second order transitions in the course.
Figure 7.7: κT and CV versus T at P ≡ Pc, showing the behavior at T1 and T2.
Chapter 8
Statistical Mechanics
We will develop statistical mechanics with “one eye” on thermodynamics. We know that matter is made up of a large number of atoms, and so we expect thermodynamic quantities to be averages of these atomic properties. To do the averages, we need a probability weight.
We will assume that, in an isolated system, the probability of being in a state is a function of the entropy alone. This is because, in thermodynamics, the equilibrium state is the one with the maximum entropy. Hence, the probability weight
ρ = ρ(S)
To get the form of S we need to use
1. S is extensive.
2. S is maximized in equilibrium.
3. S is defined up to a constant.
Since S is extensive, two independent systems of respective entropies S1 and S2
have total entropy (S1 + S2) (see fig. 8.1). But
Figure 8.1: Two independent systems.
ρ(Stotal) = ρ(S1)ρ(S2)
since they are independent (and note I will not worry about normalization for now). Since S is extensive, we have
ρ(S1 + S2) = ρ(S1)ρ(S2)
or
ln ρ(S1 + S2) = ln ρ(S1) + ln ρ(S2)
This is of the form
f(S1 + S2) = f(S1) + f(S2)
where f is some function. The solution is clearly
f(S) = AS
where A is a constant, so
ρ = eAS
But S is defined up to a constant so
ρ = BeS/kB , A = 1/kB
where the constant B is for normalization, and our earlier carelessness with normalization has no consequence. This is the probability weight for the microcanonical ensemble, since S = S(E, V ) in thermodynamics from

T dS = dE + P dV
That is, it is the ensemble for given (E, V ). The more handy ensemble, for given (T, V ), is the canonical one. Split the system into a reservoir and a system. They are still independent, but the reservoir is much bigger than the system (fig. 8.2).

Figure 8.2: A system in contact with a reservoir, at fixed T , V .
Then

S(Ereservoir) = S(Etotal − E) ≈ S(Etotal) − (∂S/∂E)total E

and

S(Ereservoir) = const − E/T
So, for the canonical ensemble

ρstate = const e^{−Estate/kBT}

The normalization factor is

Z = Σ_{states} e^{−Estate/kBT}
The connection to thermodynamics is through the Helmholtz free energy F = E − TS, where

Z = e^{−F/kBT}

We haven't shown this, but it is done in many books.
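As a toy check (my own example, not from the notes: a two-level system with energies 0 and ǫ, units kB = 1), F = −kBT ln Z agrees with F = 〈E〉 − TS computed directly from the canonical probabilities:

```python
import math

def free_energy(eps, T):
    """F = -T ln Z, with Z = 1 + exp(-eps/T) (units kB = 1)."""
    return -T * math.log(1.0 + math.exp(-eps / T))

def mean_energy(eps, T):
    """<E> = sum over states of E p(E), with p proportional to exp(-E/T)."""
    w = math.exp(-eps / T)
    return eps * w / (1.0 + w)

def entropy(eps, T):
    """Gibbs entropy S = -sum p ln p over the two states."""
    w = math.exp(-eps / T)
    p0, p1 = 1.0 / (1.0 + w), w / (1.0 + w)
    return -(p0 * math.log(p0) + p1 * math.log(p1))

eps, T = 1.0, 0.7
assert abs(free_energy(eps, T) - (mean_energy(eps, T) - T * entropy(eps, T))) < 1e-12
```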
Chapter 9
Fluctuations
Landau and Lifshitz are responsible for “organizing” the theory of fluctuations. They attribute the theory to Einstein in their book on statistical physics. I am following their treatment. First, let's do toy systems, where entropy only depends on one variable, x.
The probability weight is

ρ ∼ e^{S(x)/kB}

so the probability of being in x → x + dx is

const e^{S(x)/kB} dx
where the constant gives normalization. At equilibrium, S is maximized, and xis at its mean value of 〈x〉 as drawn in fig. 9.1. Near the maximum we have
Figure 9.1: S versus x, with maximum at 〈x〉.
S(x) = S(〈x〉) + (∂S/∂x)|_{x=〈x〉} (x − 〈x〉) + (1/2)(∂^2S/∂x^2)|_{x=〈x〉} (x − 〈x〉)^2 + ...

But S(〈x〉) is just a constant, ∂S/∂〈x〉 = 0, and ∂^2S/∂〈x〉^2 is negative definite, since S is a maximum. Hence,
S(x)/kB = −(1/2σ^2)(x − 〈x〉)^2
where σ^2 ≡ −kB/(∂^2S/∂〈x〉^2)
So the probability of being in x → x + dx is

const. e^{−(x−〈x〉)^2/2σ^2}

which is simply a Gaussian distribution. Since it is sharply peaked at x = 〈x〉, we will let integrals vary from x = −∞ to x = ∞. Then

〈(x − 〈x〉)^2〉 = σ^2

i.e.,

〈(x − 〈x〉)^2〉 = −kB/(∂^2S/∂〈x〉^2)
This is called a fluctuation–dissipation relation of the first kind. Fluctuations in x are related to changes in entropy. They are also called correlation–response relations. As an aside, the “second kind” of fluctuation–dissipation relations turn up in nonequilibrium theory.
Since S is extensive, we have

〈(∆x)^2〉 = O(1/N)

again, if x is intensive. Note that, from before, we said that

〈(∆x)^2〉 = O(ξ^3/L^3)

so that fluctuations involve microscopic correlation lengths. Now we have

−kB/(∂^2S/∂〈x〉^2) = O(ξ^3/L^3)
So thermodynamic derivatives are determined by microscopic correlations. This is why we need statistical mechanics to calculate thermodynamic derivatives. We need more degrees of freedom than x to do anything all that interesting, but we have enough to do something.
Consider isothermal toys. These are things like balls, pencils, ropes, and pendulums, in thermal equilibrium. They have a small number of degrees of freedom. For constant temperature, using thermodynamics,

T ∆S = ∆E − W

where ∆E ≡ 0 for simple systems, and W is the work done on the system. Hence ∆S = −W/T . Since entropy is only defined up to a constant, this is all we need.
This is how it works. Consider a simple pendulum (fig. 9.2).

Figure 9.2: Simple pendulum of mass m, length l, displaced an angle θ.

The amount of work done by the thermal background to raise the pendulum a height x is

W = mgx

where g is the gravitational acceleration. From the picture,

x = l − l cos θ ≈ lθ^2/2

so

W = mglθ^2/2

and ∆S = −mglθ^2/2T + const., so, since 〈θ〉 = 0, we have

〈θ^2〉 = kBT /mgl
This is a very small fluctuation for a pendulum like those in the physics labs. When the mass is quite small, the fluctuations are appreciable. This is the origin, for example, of fluctuations of small particles in water, called Brownian motion, which was figured out by Einstein. There are many other isothermal toys.
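Plugging in numbers makes the point (a sketch; the masses and lengths are my illustrative choices):

```python
import math

kB = 1.380649e-23   # J/K
g = 9.8             # m/s^2

def theta_rms(m, l, T=300.0):
    """Root-mean-square angular fluctuation, sqrt(<theta^2>) = sqrt(kB T / (m g l))."""
    return math.sqrt(kB * T / (m * g * l))

print(theta_rms(0.1, 1.0))     # lab pendulum, ~1e-10 rad: utterly negligible
print(theta_rms(1e-15, 1e-6))  # micron-scale "pendulum": order 1 rad, appreciable
```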
Figure 9.3: Time to fall versus temperature T for the balanced pencil (of order 5 s).
Landau and Lifshitz give several examples with solutions. My favorite, not given in Landau's book though, is the pencil balanced on its tip. It is interesting to estimate how long it takes to fall as a function of temperature (at zero temperature, one needs to use the uncertainty principle).
9.1 Thermodynamic Fluctuations
We can use this approach to calculate the corrections to thermodynamics. Again imagine we have two weakly interacting systems making up a larger system. Due to a fluctuation, there is a change in entropy of the total system due to each subsystem: “1”, the system we are interested in, and “2”, the reservoir.
Figure 9.4: “1”, the system, and “2”, the reservoir, with N fixed and common T , P .
∆S = ∆S1 + ∆S2 = ∆S1 + (∆E2 + P∆V2)/T

using thermodynamics for the reservoir.
But when the reservoir increases its energy, the system decreases its energy. The same thing for the volume. However, they have the same P (for mechanical equilibrium) and T (for thermal equilibrium). So we have

∆S = ∆S1 − ∆E1/T − (P/T )∆V1
The probability weight is e^{∆S/kB}, so we have

Probability ∝ e^{−(∆E−T∆S+P∆V )/kBT}
where I have dropped the “1” subscript. There are still only 2 relevant variables, so, for example:

∆E = ∆E(S, V )
   = (∂E/∂S)∆S + (∂E/∂V )∆V
     + (1/2)[(∂^2E/∂S^2)(∆S)^2 + 2(∂^2E/∂S∂V )(∆S∆V ) + (∂^2E/∂V^2)(∆V )^2]
to second order. By ∂E/∂V , I mean (∂E/∂V )S , but it is already pretty messy notation. Rearranging:

∆E = (∂E/∂S)∆S + (∂E/∂V )∆V + (1/2)[∆(∂E/∂S)∆S + ∆(∂E/∂V )∆V ]
But

(∂E/∂S)V = T

and

(∂E/∂V )S = −P

so we have

∆E = T∆S − P∆V + (1/2)(∆T∆S − ∆P∆V )
The term in brackets gives the first–order correction to thermodynamics. In the probability weight,

Probability ∝ e^{−(∆T∆S−∆P∆V )/2kBT}
There are still only two independent variables, so, for example,

∆S = (∂S/∂T )V ∆T + (∂S/∂V )T ∆V
   = (CV /T )∆T + (∂P/∂T )V ∆V

using the Maxwell relation (∂S/∂V )T = (∂P/∂T )V .
Likewise,

∆P = (∂P/∂T )V ∆T + (∂P/∂V )T ∆V
   = (∂P/∂T )V ∆T − (1/κT V )∆V
Putting these expressions into the probability weight, and simplifying, gives

Probability ∝ e^{−(CV /2kBT^2)(∆T )^2 − (1/2kBTκT V )(∆V )^2}
Evidently the probabilities for fluctuations in T and V are independent, since

Probability = Prob(∆T ) · Prob(∆V )

Also, all the averages are Gaussians, so by inspection:

〈(∆T )^2〉 = kBT^2/CV

〈(∆V )^2〉 = kBTκT V

and

〈∆T ∆V 〉 = 0
If we had chosen ∆S and ∆P as our independent variables, we would have obtained

Probability ∝ e^{−(1/2kBCP )(∆S)^2 − (κSV /2kBT )(∆P )^2}

so that

〈(∆S)^2〉 = kBCP

〈(∆P )^2〉 = kBT /κSV

〈∆S ∆P 〉 = 0
Higher–order moments are determined by the Gaussian distributions. So, for example,

〈(∆S)^4〉 = 3〈(∆S)^2〉^2 = 3(kBCP )^2
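The fourth-moment relation is a generic property of Gaussians, easily checked by sampling (a sketch; the width σ is arbitrary, and random.gauss stands in for any of the Gaussian fluctuation distributions above):

```python
import random

rng = random.Random(0)
sigma = 2.0
n = 200000
xs = [rng.gauss(0.0, sigma) for _ in range(n)]
m2 = sum(x * x for x in xs) / n       # <x^2>
m4 = sum(x ** 4 for x in xs) / n      # <x^4>
print(m2, m4 / (3.0 * m2 * m2))       # m2 near sigma^2 = 4; the ratio near 1
```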
Note that any thermodynamic quantity is a function of only two variables. Hence, its fluctuations can be completely determined from, for example, 〈(∆T )^2〉, 〈(∆V )^2〉, and 〈∆T ∆V 〉. For example,

∆Q = ∆Q(T, V ) = (∂Q/∂T )V ∆T + (∂Q/∂V )T ∆V

so, since 〈∆T ∆V 〉 = 0,

〈(∆Q)^2〉 = (∂Q/∂T )^2_V 〈(∆T )^2〉 + (∂Q/∂V )^2_T 〈(∆V )^2〉
and

〈(∆Q)^2〉 = (∂Q/∂T )^2_V (kBT^2/CV ) + (∂Q/∂V )^2_T kBTκT V
Check this, for example, with Q = S or Q = P . Note the form of these relationsis
〈(∆x)2〉 = O(1/N)
for intensive variables, and
〈(∆X)2〉 = O(N)
for extensive variables. Also note the explicit relationship between correctionsto thermodynamics (〈(∆T )2〉), thermodynamic derivatives (CV = T (∂S/∂T )V ),and microscopic correlations (1/N = ξ3/L3).
The most important relationship is

〈(∆V )^2〉 = kBTκT V

although not in this form. Consider the number density n = N/V . Since N is fixed,

∆n = ∆(N/V ) = −(N/V^2)∆V
Hence

〈(∆V )^2〉 = (V^4/N^2)〈(∆n)^2〉

so,

〈(∆n)^2〉 = (N^2/V^4) kBTκT V

or

〈(∆n)^2〉 = n^2 kBTκT /V
for the fluctuations in the number density. This corresponds to the total fluc-tuation of number density in our system. That is, if ∆n(~r) is the fluctuation ofnumber density at a point in space ~r,
∆n = (1/V ) ∫ d~r ∆n(~r)
(Sorry for the confusing–looking notation.) Hence

〈(∆n)^2〉 = (1/V^2) ∫ d~r ∫ d~r′ 〈∆n(~r)∆n(~r′)〉
Again, by translational invariance and isotropy of space,

〈∆n(~r)∆n(~r′)〉 = C(|~r − ~r′|)

In the same way as done earlier, we have

〈(∆n)^2〉 = (1/V ) ∫ d~r C(r)
where the dummy integration variable “~r” is “~r − ~r′” of course. But 〈(∆n)^2〉 = n^2 kBTκT /V , so we have the thermodynamic “sum rule”:

∫ d~r C(r) ≡ ∫ d~r 〈∆n(~r)∆n(0)〉 = n^2 kBTκT
If

C(k) ≡ ∫ d~r e^{i~k·~r} C(~r)

we have

lim_{k→0} C(k) = n^2 kBTκT
The quantity C(k) is called the structure factor. It is directly observable by x–ray, neutron, light, etc., scattering. For a liquid, one typically obtains something like fig. 9.5. The point at k = 0 is fixed by the result above.
Figure 9.5: Typical structure factor C(k) for a liquid; the peaks correspond to short–scale order, and the k → 0 value is fixed by the sum rule.
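As a sanity check on the sum rule, take an ideal gas: there κT = 1/(nkBT ), so the prediction is 〈(∆N)^2〉 = N, i.e. Poisson number statistics in any subvolume. A sketch (my own; I use Knuth's Poisson sampler, and the mean count of 10 is arbitrary):

```python
import math, random

rng = random.Random(42)

def poisson(lam):
    """Knuth's algorithm: number of ideal-gas atoms found in a cell of mean count lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

lam, samples = 10.0, 50000
ns = [poisson(lam) for _ in range(samples)]
mean = sum(ns) / samples
var = sum((x - mean) ** 2 for x in ns) / samples
print(mean, var)   # variance = mean, as the compressibility sum rule demands
```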
Chapter 10
Fluctuations of Surfaces
A striking fact is that, although most natural systems are inhomogeneous, with complicated form and structure, almost all systems in thermodynamic equilibrium must have no structure, by Gibbs' phase rule.
As an example, consider a pure system which can be in the homogeneous phases of solid, liquid, or vapour. Each phase can be described by a Gibbs free energy gs(T, P ), gl(T, P ), or gv(T, P ). In a phase diagram, the system chooses its lowest Gibbs free energy at each point (T, P ). Hence, the only possibility for inhomogeneous phases is along lines and points:
gs(T, P ) = gl(T, P )
gives the solidification line
gl(T, P ) = gv(T, P )
gives the vaporization line, while
gs(T, P ) = gl(T, P ) = gv(T, P )
gives the triple point of solid–liquid–vapour coexistence.
Figure 10.1: Solid (s), liquid (l), and vapour (v) phases of a pure substance in the P–T plane.
Inhomogeneous states exist on lines in the phase diagram, a set of measure zero compared to the area taken up by the P–T plane.
Much modern work in condensed–matter physics concerns the study of inhomogeneous states. To get a basis for this, we shall consider the most simple such states: equilibrium coexisting phases, separated by a surface.
Figure 10.2: Phase diagram, with the liquid–vapour line ending at (Tc, Pc), and a cartoon of the system: liquid and vapour coexisting, separated by a surface.
As everyone knows, small droplets are spheres, while much larger droplets are locally flat. This is the same for liquid–vapour systems, oil–water systems, magnetic domain walls, as well as many other systems. The reason for this is simple thermodynamics. A system with a surface has extra free energy proportional to the area of that surface. Using the Helmholtz free energy now, for convenience,
F = (Bulk Energy) + σ ∫ d~x   (10.1)

where the integral ∫ d~x is the surface area, and the positive constant σ is the surface tension. Note that the extra free
Figure 10.3: Coordinate system: ~x along the surface, y perpendicular to it, with bulk energy plus surface energy σ ∫ d~x, in a system of edge length L.
energy due to the surface is minimized by minimizing the surface area.
The coordinate system is introduced in the cartoon above. Note that y is perpendicular to the surface, while ~x is a (d − 1)–dimensional vector parallel to the surface. Furthermore, the edge length of the system is L, so its total volume is

V = L^d

while the total surface area is

A = L^{d−1}
Equation 10.1, ∆F = σL^{d−1}, is enough to get us started at describing fluctuations of a surface at a non–zero temperature. Of course, Eq. 10.1 is thermodynamics, while in statistical mechanics we need a probability function, or equivalently, the partition function
Z = Σ_{states} e^{−Estate/kBT}   (10.2)

to calculate such fluctuations. Strangely enough, using Eq. 10.1 we can construct models for Estate in Eq. 10.2! This sounds suspicious since Estate should be microscopic, but it is a common practice.
We shall make the decision to limit our investigations to long–length–scale phenomena. So it is natural to expect that phenomena on small length scales, like the mass of a quark or the number of gluons, do not affect our calculations. In particle physics they call this renormalizability: if we consider length scales

r > a few Angstroms

all the physics on scales below this only gives rise to coupling constants entering our theory on those larger length scales. The main such coupling constant we shall consider is simply the radius of an atom

ro ≈ 5 Å   (10.3)

so that we consider length scales

r ≥ ro
Usually this appears as a restriction, and the introduction of a coupling constant, in wavenumber k, so that we consider

k ≤ Λ

where Λ ≡ 2π/ro. In particle physics this is called an ultraviolet cutoff. I emphasize that for us Λ or ro is a physically real cutoff giving the smallest length scale on which our theory is well defined.
This is not too dramatic so far. An important realization in statistical mechanics is that many microscopic systems can give rise to the same behaviour on long length scales. In the most simple sense, this means that large–length–scale physics is independent of the mass of the gluon. We can choose any convenient mass we want, and get the same large–length–scale behaviour. In fact, it is commonplace to go much much further than this, and simply invent microscopic models — which are completely unrelated to the true microscopics in question! — which are consistent with thermodynamics. The rule of thumb is to construct models which are easy to solve, but still retain the correct long–length–scale physics.
This insight, which has dramatically changed condensed–matter physics over the last twenty years, is called “universality”. Many microscopic models have the same long–length–scale physics, and so are said to be in the same “universality class”. The notion of a universality class can be made precise in the context of the theory of phase transitions. Even without precision, however, it is commonly applied throughout modern condensed–matter physics!
10.1 Lattice Models of Surfaces
To get a sense of this, let us reconsider our system of interest, an interface. We will construct a “microscopic” model, as shown. We have split the ~x and y axes
Figure 10.4: The physical surface and its microscopic modeled abstraction on a discrete grid.
into discrete elements. Instead of a continuous variable where

−L/2 ≤ x ≤ L/2 and −L/2 ≤ y ≤ L/2   (10.4)

say (in d = 2) we have a discrete sum along the x direction

i = −L/2, ..., −3, −2, −1, 0, 1, 2, 3, ..., L/2

(where L is measured in units of ro, the small–length–scale cutoff), and a similar discrete sum along the y direction

j = −L/2, ..., −3, −2, −1, 0, 1, 2, 3, ..., L/2   (10.5)

Actually, for greater convenience, let the index along the x axis vary as

i = 1, 2, 3, ..., L   (10.6)
All the possible states of the system then correspond to the height of the interface at each point i, called hi, where hi is an integer with −L/2 ≤ hi ≤ L/2, as shown in fig. 10.5. Hence the sum in the partition function is

Σ_{states} = Σ_{hi} = Π_{i=1}^{L} Σ_{hi=−L/2}^{L/2}   (10.7)
Figure 10.5: A surface configuration on the discrete grid: heights h1 = 0, h2 = 1, h3 = 0, h4 = 1, h5 = 4, ..., at sites i = 1, 2, ..., L.
This is a striking simplification of the microscopics of a true liquid–vapour system, but as L → ∞, the artificial constraint imposed by the i, j discrete grid will become unimportant.
Now let us invent a “microscopic” energy Estate = E{hi} which seems physically reasonable, and consistent with the free energy ∆F = σL^{d−1}. Physically, we want to discourage surfaces with too much area, since the free energy is minimized by a flat surface. It is natural for a microscopic energy to act with a
Figure 10.6: Surfaces with large area are discouraged by E_state; flat surfaces are encouraged by E_state.
microscopic range. So we enforce flatness locally:

E_local = +(h_i − h_{i+1})²    (10.8)

so the energy increases if the interface is not locally flat. In total,

E_state = J ∑_{i=1}^{L} (h_i − h_{i+1})²    (10.9)
where J is a positive constant. This is called the discrete Gaussian solid–on–solid model. Equally good models(!) are the generalized solid–on–solid models
E_state^{(n)} = J ∑_{i=1}^{L} |h_i − h_{i+1}|^n    (10.10)
where n = 1 is the solid–on–solid model, and n = 2 is the discrete Gaussian model. With Eq. 10.9, the partition function is
Z = ( ∏_i ∑_{h_i} ) e^{−(J/k_B T) ∑_{i′=1}^{L} (h_{i′} − h_{i′+1})²}    (10.11)
This is for d = 2, but the generalization to d = 3 is easy. Note that

Z = Z(k_B T/J, L)    (10.12)
One can straightforwardly solve Z numerically. Analytically it is a little tough. It turns out to be easier to solve the continuum version of Eq. 10.11, which we shall consider next.
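To see how straightforward the numerical route is, here is a minimal Metropolis Monte Carlo sketch of the discrete Gaussian solid–on–solid model of Eq. 10.9 (not from the notes; the system size, coupling J/k_B T, and sweep count are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
L, sweeps = 64, 2000            # system size and Monte Carlo sweeps
J_over_kT = 0.5                 # coupling J / k_B T (illustrative value)
h = np.zeros(L, dtype=int)      # start from a flat interface

def bond_energy(h, i):
    # energy (in units of k_B T) of the two bonds touching site i,
    # Eq. (10.9) with periodic boundaries
    return J_over_kT * ((h[i] - h[(i + 1) % L]) ** 2 +
                        (h[i] - h[i - 1]) ** 2)

for _ in range(sweeps * L):
    i = int(rng.integers(L))
    e_old = bond_energy(h, i)
    move = int(rng.choice((-1, 1)))
    h[i] += move
    dE = bond_energy(h, i) - e_old
    if dE > 0 and rng.random() >= np.exp(-dE):
        h[i] -= move            # reject the move

w2 = np.mean((h - h.mean()) ** 2)   # squared interface width, Eq. (10.41)
print(w2)                           # nonzero: the interface roughens at T > 0
```

Averaging w² over many runs and sizes reproduces the roughening behaviour derived below for the continuum model.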
10.2 Continuum Model of Surface
Instead of using a discrete mesh, let's do everything in the continuum limit. Let h(~x) be the height of the interface at any point ~x, as shown in fig. 10.7. This
Figure 10.7: The interface is the surface y = h(~x).
is a useful description if there are no overhangs or bubbles, which cannot be described by y = h(~x); these give a multivalued h(~x), as shown.

Figure 10.8: An overhang and a bubble, each giving a multivalued h(~x).

Of course, although it was not mentioned above, the "lattice model" (fig. 10.5) also could not have overhangs or bubbles. The states
the system can have are any value of h(~x) at any point ~x. The continuum analogue of Eq. 10.7 is

∑_{states} = ∑_{h(~x)} = ∏_{~x} ∫ dh(~x) ≡ ∫ Dh(·)    (10.13)
where the last step just shows some common notation for ∏_{~x} ∫ dh(~x). Things are a little subtle because ~x is a continuous variable, but we can work it through. One other weaselly point is that the sum over states is dimensionless, while ∏_{~x} ∫ dh(~x) has dimensions of (∏_x L). This constant factor of dimensionality does not affect our results, so we shall ignore it.
For the energy of a state, we shall simply make it proportional to the amount of area in a state h(~x), as shown in fig. 10.7:

E_state{h(~x)} ∝ (Area of surface y = h(~x))

or

E_state = σ(L′)^{d−1}    (10.14)

where (L′)^{d−1} is the area of the surface y = h(~x), and σ is a positive constant. It is worth stopping for a second to realize that this is consistent with our definition of excess free energy ∆F = σL^{d−1}. In fact, it almost seems a little too consistent, being essentially equal to the excess free energy!
This blurring of the distinction between microscopic energies and macroscopic free energies is common in modern work. It means we are considering a "coarse–grained" or mesoscopic description. Consider a simple example:

e^{−F/k_B T} ≡ Z = ∑_{h} e^{−E_h/k_B T}

where h labels a state. Now say one has a large number of states which have the same energy E_h. Call these states, with the same E_h, H. Hence we can rewrite

∑_{h} e^{−E_h/k_B T} = ∑_{H} g_H e^{−E_H/k_B T}

where

g_H = number of states h for state H

Now let the local entropy be

S_H = k_B ln g_H

So we have

e^{−F/k_B T} = ∑_{H} e^{−F_H/k_B T}

where F_H = E_H − T S_H. This example is particularly simple, since we only partially summed states with the same energy. Nevertheless, the same logic applies if one
does a partial sum over states which might have different energies. This long digression is essentially to rationalize the common notation (instead of Eq. 10.14) of

E_state = F_state = σ(L′)^{d−1}    (10.15)
Let us now calculate (L′)^{d−1} for a given surface y = h(~x).

Figure 10.9: A surface element of length √((dh)² + (dx)²), built from dx and dh.

As shown above, each element (in d = 2) of the surface is

dL′ = √((dh)² + (dx)²) = dx √(1 + (dh/dx)²)    (10.16)
In d dimensions one has

d^{d−1}L′ = d^{d−1}x √(1 + (∂h/∂~x)²)    (10.17)

Integrating over the entire surface gives

(L′)^{d−1} = ∫ d^{d−1}x √(1 + (∂h/∂~x)²)

so that

E_state = σ ∫ d^{d−1}~x √(1 + (∂h/∂~x)²)    (10.18)
Of course, we expect

(∂h/∂~x)² ≪ 1    (10.19)

(that is, that the surface is fairly flat), so expanding to lowest order gives

E_state = σL^{d−1} + (σ/2) ∫ d^{d−1}~x (∂h/∂~x)²    (10.20)

where the second term is the extra surface area due to fluctuations.
The partition function is therefore

Z = ∑_{states} e^{−E_state/k_B T}
or,

Z = e^{−σL^{d−1}/k_B T} ( ∏_{~x} ∫ dh(~x) ) exp[ −(σ/2k_B T) ∫ d^{d−1}~x (∂h/∂~x)² ]    (10.21)
This looks quite formidable now, but it is straightforward to evaluate. In particle physics it is called a free–field theory. Note the similarity to the discrete partition function of Eq. 10.11. The reason Eq. 10.21 is easy to evaluate is that it simply involves a large number of Gaussian integrals,

∫ dh e^{−ah²}

As we know, Gaussian integrals only have two important moments,

⟨h(~x)⟩   and   ⟨h(~x) h(~x′)⟩
All properties can be determined from knowledge of these two moments! However, space is translationally invariant and isotropic in the ~x plane (the plane of the interface), so properties involving two points do not depend upon their exact positions, only the distance between them.

Figure 10.10: Two points ~x₁ and ~x₂ in the ~x plane, separated by |~x₁ − ~x₂|.

That is,

⟨h(~x)⟩ = const.

and

⟨h(~x₁) h(~x₂)⟩ = G(|~x₁ − ~x₂|)    (10.22)
For convenience, we can choose

⟨h⟩ = 0    (10.23)

Then our only task is to evaluate the correlation function

G(x) = ⟨h(~x) h(0)⟩ = ⟨h(~x + ~x′) h(~x′)⟩    (10.24)
To get G(x), we use the partition function from Eq. 10.21, from which the probability of a state is

ρ_state ∝ e^{−(σ/2k_B T) ∫ d^{d−1}~x (∂h/∂~x)²}    (10.25)
The subtle thing here is the gradients, which mix the probability of h(~x) with that of a very nearby h(~x′). Eventually this means we will use Fourier transforms. To see this mixing, consider

∫ d~x (∂h/∂~x)² = ∫ d~x [ (∂/∂~x) ∫ d~x′ h(~x′) δ(~x − ~x′) ]²
= ∫ d~x ∫ d~x′ ∫ d~x′′ h(~x′) h(~x′′) ( (∂/∂~x) δ(~x − ~x′) ) · ( (∂/∂~x) δ(~x − ~x′′) )

using simple properties of the Dirac delta function. Now relabel the dummy variables, ~x → ~x′′, ~x′ → ~x, ~x′′ → ~x′, giving

= ∫ d~x ∫ d~x′ ∫ d~x′′ h(~x) h(~x′) ( (∂/∂~x′′) δ(~x′′ − ~x) ) · ( (∂/∂~x′′) δ(~x′′ − ~x′) )
= ∫ d~x ∫ d~x′ ∫ d~x′′ h(~x) h(~x′) (∂/∂~x) · (∂/∂~x′) ( δ(~x′′ − ~x) δ(~x′′ − ~x′) )
= ∫ d~x ∫ d~x′ h(~x) h(~x′) (∂/∂~x) · (∂/∂~x′) δ(~x − ~x′)
= ∫ d~x ∫ d~x′ h(~x) h(~x′) [ −(∂²/∂~x²) δ(~x − ~x′) ]
(You can also do this by integration by parts; the terms involving surfaces at ∞ all vanish.) Hence,

∫ d^{d−1}~x (∂h/∂~x)² = ∫ d^{d−1}~x ∫ d^{d−1}~x′ h(~x) M(~x − ~x′) h(~x′)    (10.26)

where the interaction matrix is

M(~x) = −(∂²/∂~x²) δ(~x)    (10.27)
which has the form shown in fig. 10.11. This "Mexican hat" interaction matrix appears in many different contexts. It causes h(~x) to interact with h(~x′).

In Fourier space, gradients ∂/∂~x become numbers, or rather wave numbers ~k. We will use this to simplify our project. It is convenient to introduce continuous
Figure 10.11: The interaction matrix M(~x) = −(∂²/∂~x²) δ(~x).
and discrete Fourier transforms; one or the other is more convenient at a given time. We have

h(~x) = ∫ (d^{d−1}~k/(2π)^{d−1}) e^{i~k·~x} h(~k)    (Continuous)
h(~x) = (1/L^{d−1}) ∑_{~k} e^{i~k·~x} h_{~k}    (Discrete)    (10.28)

where

h(~k) = ∫ d^{d−1}~x e^{−i~k·~x} h(~x) = h_{~k}    (10.29)

Closure requires

δ(~x) = ∫ (d^{d−1}~k/(2π)^{d−1}) e^{i~k·~x} = (1/L^{d−1}) ∑_{~k} e^{i~k·~x}    (10.30)

and

δ(~k) = ∫ (d^{d−1}~x/(2π)^{d−1}) e^{i~k·~x}

but

δ_{~k,0} = (1/L^{d−1}) ∫ d^{d−1}~x e^{i~k·~x}    (10.31)

is the Kronecker delta,

δ_{~k,0} = { 1, if ~k = 0 ; 0, otherwise }    (10.32)

not the Dirac delta function. We have used the density of states in ~k space as (L/2π)^{d−1},
so that the discrete → continuum limit is
∑_{~k} → (L/2π)^{d−1} ∫ d^{d−1}~k

and

δ_{~k,0} → (2π/L)^{d−1} δ(~k)    (10.33)
Using the Fourier transforms we can simplify the form of E_state in Eq. 10.20. Consider

∫ d^{d−1}~x (∂h/∂~x)² = ∫ d~x ( (∂/∂~x) ∫ (d~k/(2π)^{d−1}) e^{i~k·~x} h(~k) )²
= ∫ d~x ∫ (d~k/(2π)^{d−1}) ∫ (d~k′/(2π)^{d−1}) (−~k·~k′) h(~k) h(~k′) e^{i(~k+~k′)·~x}
= ∫ (d~k/(2π)^{d−1}) ∫ (d~k′/(2π)^{d−1}) (−~k·~k′) h(~k) h(~k′) [ ∫ d~x e^{i(~k+~k′)·~x} ]
= ∫ (d~k/(2π)^{d−1}) k² h(~k) h(−~k)

after doing the ~k′ integral. But h(~x) is real, so h*(~k) = h(−~k) (from Eq. 10.29), and we have

∫ d^{d−1}~x (∂h/∂~x)² = ∫ (d^{d−1}~k/(2π)^{d−1}) k² |h(~k)|²    (Continuum)

or

∫ d^{d−1}x (∂h/∂~x)² = (1/L^{d−1}) ∑_{~k} k² |h_{~k}|²    (Discrete)    (10.34)
Hence modes of h_{~k} are uncoupled in Fourier space; the matrix M(~x) has been replaced by the number k² (see Eqs. 10.26 and 10.27). Therefore we have

E_state = σL^{d−1} + (σ/2L^{d−1}) ∑_{~k} k² |h_{~k}|²    (10.35)
For the partition function we need to sum out all the states ∑_{states} using the Fourier modes h_{~k} rather than h(~x). This is a little subtle because h(~x) is real while h_{~k} is complex. Of course not all the components are independent, since h_{~k} = h*_{−~k}, so we would double count each mode. To do the sum properly we have

∑_{states} = ∏_{~x} ∫ dh(~x) = ∏′_{~k} ∫ d²h_{~k}    (10.36)
where d²h_{~k} = dℜh_{~k} dℑh_{~k}, and the prime on ∏′_{~k} denotes restriction to half of ~k space, to avoid double counting. For the most part, this prime plays no role in the algebra below.

Figure 10.12: The primed product ∏′_{~k} runs over half of the (k_x, k_y) plane, e.g. k_y > 0 only.

Recall that
the only nontrivial moment is

G(x) = ⟨h(~x) h(0)⟩ = ⟨h(~x + ~x′) h(~x′)⟩

from Eq. 10.23 above. This has a simple consequence for the Fourier mode correlation function ⟨h(~k) h(~k′)⟩. Note
⟨h(~k) h(~k′)⟩ = ∫ d^{d−1}~x ∫ d^{d−1}~x′ e^{−i~k·~x − i~k′·~x′} ⟨h(~x) h(~x′)⟩
= ∫ d~x ∫ d~x′ e^{−i~k·~x − i~k′·~x′} G(~x − ~x′)

Let

~y = ~x − ~x′,   ~y′ = (1/2)(~x + ~x′),   so that   ~x = ~y′ + (1/2)~y,   ~x′ = ~y′ − (1/2)~y

Then d~x d~x′ = d~y d~y′, and

⟨h(~k) h(~k′)⟩ = ∫ d~y e^{−i(~k−~k′)·~y/2} G(y) × ∫ d~y′ e^{−i(~k+~k′)·~y′}

so,

⟨h(~k) h(~k′)⟩ = ∫ d~y e^{−i~k·~y} G(y) (2π)^{d−1} δ(~k + ~k′)
= [ ∫ d~x e^{−i~k·~x} G(~x) ] (2π)^{d−1} δ(~k + ~k′),   letting ~y → ~x
Hence

⟨h(~k) h(~k′)⟩ = { G(~k) (2π)^{d−1} δ(~k + ~k′)   (Continuum) ;  G_{~k} L^{d−1} δ_{~k+~k′, 0}   (Discrete) }    (10.37)

Hence, in Fourier space, the only non–zero correlation is

⟨h_{~k} h_{−~k}⟩ = ⟨|h_{~k}|²⟩
which is directly related to G_{~k}. Using the probability

ρ_state ∝ e^{−(σ/2k_B T L^{d−1}) ∑_{~k} k² |h_{~k}|²}

we have,

⟨|h_{~k}|²⟩ = [ ∏′_{~k′} ∫ d²h_{~k′} |h_{~k}|² e^{−(σ/2k_B T L^{d−1}) ∑_{~k′′} k′′² |h_{~k′′}|²} ] / [ ∏′_{~k′} ∫ d²h_{~k′} e^{−(σ/2k_B T L^{d−1}) ∑_{~k′′} k′′² |h_{~k′′}|²} ]    (10.38)
Except for the integrals involving h_{~k}, everything in the numerator and denominator cancels out. We have to be careful with real and imaginary parts, though. Consider

∑_k k² |h_k|² = ∑_k k² h_k h_{−k}

For the particular term involving h₁₅, for example,

(15)² h₁₅ h₋₁₅ + (−15)² h₋₁₅ h₁₅ = 2 (15)² h₁₅ h₋₁₅

giving a factor of 2. In any case,
⟨|h_{~k}|²⟩ = [ ∫ d²h_{~k} |h_{~k}|² e^{−(2σ/2k_B T L^{d−1}) k² |h_{~k}|²} ] / [ ∫ d²h_{~k} e^{−(2σ/2k_B T L^{d−1}) k² |h_{~k}|²} ]
(These steps are given in Goldenfeld's book in Section 6.3, p. 174.) Now let

h_{~k} = R e^{iθ}

so that

⟨|h_k|²⟩ = [ ∫₀^∞ R dR R² e^{−R²/a} ] / [ ∫₀^∞ R dR e^{−R²/a} ]

where

a = (k_B T/σk²) (1/L^{d−1})    (10.39)

A little fiddling gives

⟨|h_k|²⟩ = a = (k_B T/σk²) (1/L^{d−1})
Hence we obtain, from Eq. 10.37,

⟨h(~k) h(~k′)⟩ = G(~k) (2π)^{d−1} δ(~k + ~k′)

where

G(~k) = (k_B T/σ) (1/k²)   and   G(x) = ⟨h(~x) h(0)⟩    (10.40)
From this correlation function we will obtain the width of the surface.

Figure 10.13: The width W of the surface in the (~x, y) plane.

We will
define the width to be the root–mean–square of the height fluctuations averaged over the ~x plane. That is,

w² = (1/L^{d−1}) ∫ d^{d−1}~x ⟨(h(~x) − ⟨h(~x)⟩)²⟩    (10.41)

which is overcomplicated for our case, since ⟨h(~x)⟩ = 0 and ⟨h(~x) h(~x)⟩ = G(0), from above. So we have the simpler form

w² = G(0)    (10.42)
From the definition of the Fourier transform,

G(0) = ∫ (d^{d−1}~k/(2π)^{d−1}) (k_B T/σk²)    (10.43)

which has a potential divergence as ~k → 0 or ~k → ∞. We shall always restrict our integrations so that

2π/L ≤ |~k| ≤ Λ    (10.44)

where 2π/L is set by the system size and Λ by the lattice constant.
Then it is easy to do the integral of Eq. 10.43. First in d = 2,

G(0) = (k_B T/σ) (1/2π) ∫_{−Λ}^{Λ} dk (1/k²) = (k_B T/πσ) ∫_{2π/L}^{Λ} dk (1/k²)

and

G(0) = (k_B T/2π²σ) L ,   d = 2

so

w = √(k_B T/2π²σ) L^{1/2} ,   d = 2    (10.45)
Similarly, in d = 3,

G(0) = (k_B T/σ) (1/4π²) 2π ∫_{2π/L}^{Λ} k dk/k²

and

G(0) = (k_B T/2πσ) ln(LΛ/2π) ,   d = 3

so,

w = √(k_B T/2πσ) (ln(LΛ/2π))^{1/2} ,   d = 3    (10.46)
It is easy to obtain the L dependence as the dimension of space varies. One obtains

w = { undefined, d ≤ 1 ;  L^X, 1 < d < 3 ;  (ln L)^{1/2}, d = 3 ;  const., d > 3 }    (10.47)

where

X = (3 − d)/2    (10.48)

is called the roughening exponent. The fact that w is undefined for d ≤ 1 is explained below.
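Since w² = G(0) is just the cutoff integral of Eq. 10.43, the L dependence can be checked numerically. A sketch in d = 2 with illustrative values of k_B T/σ and Λ:

```python
import numpy as np
from scipy.integrate import quad

kT_over_sigma, Lam = 1.0, np.pi   # illustrative k_B T / sigma and cutoff Lambda

def width(L):
    # w^2 = G(0) = (k_B T / pi sigma) * int_{2 pi/L}^{Lambda} dk / k^2 in d = 2
    G0, _ = quad(lambda k: kT_over_sigma / (np.pi * k ** 2),
                 2 * np.pi / L, Lam)
    return np.sqrt(G0)

ratio = width(2000.0) / width(1000.0)
print(ratio)   # close to sqrt(2): doubling L multiplies w by L^X with X = 1/2
```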
Eq. 10.47 implies the interface is rough, with a width that increases with L, for d ≤ 3. However, note that the width (for d > 1) is not large compared to the system itself as L → ∞:

w/L = 1/L^{1−X} = 1/L^{(d−1)/2}    (10.49)

Figure 10.14: The width W ∼ L^{1/2} in d = 2 remains small compared to the system size as L → ∞.
Hence the width is small for d > 1. In d = 1, it appears the width is comparable to the system size! This means the interface — and indeed phase coexistence itself — does not exist in d = 1. This is one signature of a lower critical dimension, at and below which phase coexistence cannot occur at any nonzero temperature.
In general, one defines the roughening exponent X by

w ∼ L^X    (10.50)

as L → ∞. This is equivalent to the definition of the surface correlation exponent η_s,

Ĝ(k) ∼ 1/k^{2−η_s}    (10.51)

as k → 0. (Here η_s = 0, of course.) The two exponents are related by the formula

X = (3 − d)/2 − η_s/2    (10.52)

The surface turns out to be self–affine, in the way we discussed earlier for the Von Koch snowflake, since correlation functions like G(~k) are power–law like. The self–affine dimension of the rough surface is

d_s = X + d − 1    (10.53)
To obtain the correlation function in real space is a little awkward, because of the divergence of G(x = 0) as L → ∞. For convenience, consider

g(x) ≡ ⟨(h(~x) − h(0))²⟩    (10.54)

Multiplying this out gives

g(x) = 2(⟨h²⟩ − ⟨h(~x) h(0)⟩)

This is simply

g(x) = 2(G(0) − G(x)) = 2(w² − G(x))    (10.55)

The interpretation of g(x) from Eq. 10.54 is simple: it is the (squared) difference in the heights between two points a distance x apart, as shown in fig. 10.15. Not surprisingly, we expect (for large L)

g(x → L) = 2w²    (10.56)

since G(L) = ⟨h(L) h(0)⟩ = ⟨h(L)⟩⟨h(0)⟩ = 0, because the points 0 and L are so far apart as to be uncorrelated.
In any case, we have

g(x) = 2 (k_B T/σ) (1/(2π)^{d−1}) ∫ d^{d−1}~k (1 − e^{i~k·~x})/k²
= 2 (k_B T/σ) (1/(2π)^{d−1}) ∫ d^{d−1}~k (1 − cos ~k·~x)/k²    (10.57)
Figure 10.15: √g(~x − ~x′) measures the typical height difference between h(~x) and h(~x′).
Consider d = 2. We have

g(x) = 2 (k_B T/σ) (1/2π) ∫_{−Λ}^{Λ} dk (1 − cos kx)/k² = 2 (k_B T/πσ) ∫_{2π/L}^{Λ} dk (1 − cos kx)/k²

Letting u = kx,

g(x) = 2 (k_B T/πσ) x { ∫_{2πx/L}^{Λx} du (1 − cos u)/u² }

Let Λ → ∞ and L → ∞ in the integral (for convenience):

g(x) = 2 (k_B T/πσ) x ( ∫₀^∞ du (1 − cos u)/u² ) = 2 (k_B T/πσ) x (π/2)

so,

g(x) = (k_B T/σ) x ,   d = 2    (10.58)
In d = 3,

g(x) = 2 (k_B T/σ) (1/4π²) ∫₀^{2π} dθ ∫_{2π/L}^{Λ} k dk (1 − cos(kx cos θ))/k²

This is a little more involved, and I will just quote the answer:

g(x) = (k_B T/4πσ) ln xΛ ,   d = 3    (10.59)
In general, one obtains

g(x) = { undefined, d = 1 ;  x^{2X}, 1 < d < 3 ;  ln x, d = 3 ;  const., d > 3 }    (10.60)

where, again, X = (3 − d)/2. This confirms our earlier claim that correlations are power–law like.
(There is a subtlety here involving constants which deserves a minute of time. The limit g(x → L) ∼ L^{2X}, but the constant is not the same as in 2w² as derived above. Reconsider the integral of Eq. 10.58:

g(x) = 2 (k_B T/πσ) x { ∫_{2πx/L}^{Λx} du (1 − cos u)/u² }

Now just let Λ = ∞. Rewriting gives

g(x) = 2 (k_B T/πσ) x { ∫₀^∞ du (1 − cos u)/u² − ∫₀^{2πx/L} du (1 − cos u)/u² }
= 2 (k_B T/πσ) x { π/2 − ∫₀^{2πx/L} du (1 − cos u)/u² }

or, for d = 2,

g(x, L) = (k_B T/σ) x f(x/L)    (10.61)

where

f(x) = 1 − (2/π) ∫₀^{2πx} du (1 − cos u)/u²    (10.62)

Note that

f(x → ∞) = 0   but   f(x → 0) = 1

This is called a scaling function. End of digression.)
Figure 10.16: The scaling function f(x/L), decreasing from 1 at x/L = 0 toward 0 at large x/L.
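The scaling function of Eq. 10.62 is easy to evaluate numerically and to check against its two limits. A sketch (the argument values are illustrative):

```python
import numpy as np
from scipy.integrate import quad

def f(x):
    # f(x) = 1 - (2/pi) * int_0^{2 pi x} du (1 - cos u)/u^2, Eq. (10.62)
    val, _ = quad(lambda u: (1 - np.cos(u)) / u ** 2, 0, 2 * np.pi * x,
                  limit=500)
    return 1 - (2 / np.pi) * val

print(f(1e-4))   # close to 1: f -> 1 as x -> 0
print(f(50.0))   # close to 0: f -> 0 as x -> infinity
```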
We are pretty well done now, except we shall calculate the "renormalized" surface tension from the partition function. Recall from Eq. 10.21,

Z = e^{−σL^{d−1}/k_B T} ( ∏_{~x} ∫ dh(~x) ) exp[ −(σ/2k_B T) ∫ d^{d−1}~x (∂h/∂~x)² ]
Switching to Fourier space using Eqs. 10.34 and 10.36 gives

Z = e^{−σL^{d−1}/k_B T} ( ∏′_{~k} ∫ d²h_{~k} ) exp[ −(σ/2k_B T) (1/L^{d−1}) ∑_{~k} k² |h_{~k}|² ] ≡ e^{−F/k_B T}    (10.63)
So calculating Z gives the thermodynamic free energy F. We will call σ* the "renormalized", or more properly the thermodynamic, surface tension:

σ* ≡ (∂F/∂A)_{T,V,N}

or,

σ* = (∂F/∂L^{d−1})_{T,V,N}    (10.64)
Going back to Eq. 10.63,

Z = e^{−σL^{d−1}/k_B T} ∏′_{~k} ∫ d²h_{~k} e^{−(σk²/2k_B T L^{d−1}) |h_{~k}|²}

since all modes are independent. But

∫ d²h_{~k} = ∫_{−∞}^{∞} dℜh_{~k} ∫_{−∞}^{∞} dℑh_{~k}

or, if h_{~k} = R e^{iθ},

∫ d²h_{~k} = ∫₀^{2π} dθ ∫₀^∞ R dR    (10.65)
Hence

Z = e^{−σL^{d−1}/k_B T} ∏′_{~k} ∫₀^{2π} dθ ∫₀^∞ R dR e^{−(σk²/2k_B T L^{d−1}) R²}

Letting u = (σk²/2k_B T L^{d−1}) R²,

Z = e^{−σL^{d−1}/k_B T} ∏′_{~k} 2π [ 1 / 2(σk²/2k_B T L^{d−1}) ] ∫₀^∞ du e^{−u}
= e^{−σL^{d−1}/k_B T} ∏′_{~k} (2πk_B T L^{d−1}/σk²)
= e^{−σL^{d−1}/k_B T} e^{∑′_{~k} ln(2πk_B T L^{d−1}/σk²)}
= exp( −σL^{d−1}/k_B T + ∑′_{~k} ln(2πk_B T L^{d−1}/σk²) )
Z = exp( −σL^{d−1}/k_B T + (1/2) ∑_{~k} ln(2πk_B T L^{d−1}/σk²) )    (10.66)

(the restricted sum ∑′_{~k} over half of ~k space has become (1/2)∑_{~k} over all of it).
But Z = exp(−F/k_B T), up to a constant, so

F = σL^{d−1} − (k_B T/2) ∑_{~k} ln(2πk_B T L^{d−1}/σk²)    (10.67)
is the free energy due to the surface. The second term gives the contribution due to fluctuations. To take the derivative ∂/∂L^{d−1}, it is important to note that the ~k's get shifted with L, and that it is the combination (kL) which is not a function of L. So to take the derivative, we rewrite

Figure 10.17: As L → L + dL, the allowed wavenumbers shift from k to k′, keeping kL fixed.
F = σL^{d−1} − (k_B T/2) ∑_{~k} ln(2πk_B T L^{d+1}/σ(kL)²)

or,

F = σL^{d−1} − (k_B T/2) ∑_{~k} ln(L^{d−1})^{(d+1)/(d−1)} − (k_B T/2) ∑_{~k} ln(2πk_B T/σ(kL)²)

or,

F = σL^{d−1} − (k_B T/2) ((d + 1)/(d − 1)) (ln L^{d−1}) ∑_{~k} 1 − (k_B T/2) ∑_{~k} ln(2πk_B T/σ(kL)²)    (10.68)
The last term is independent of L^{d−1}, so taking the derivative (∂F/∂L^{d−1})_{T,V,N} = σ* gives

σ* = σ − (k_B T/2) ((d + 1)/(d − 1)) ( (1/L^{d−1}) ∑_{~k} 1 )    (Discrete)

or,

σ* = σ − (k_B T/2) ((d + 1)/(d − 1)) ∫ (d^{d−1}~k/(2π)^{d−1})    (Continuum)    (10.69)
is the renormalized surface tension. Note that fluctuations decrease the value of σ to the lower σ*.

In d = 2,

∫ (d^{d−1}~k/(2π)^{d−1}) = (1/2π) ∫_{−Λ}^{Λ} dk = Λ/π    (10.70)

In d = 3,

∫ (d^{d−1}~k/(2π)^{d−1}) = (1/4π²) ∫₀^{2π} dθ ∫₀^{Λ} k dk = (1/4π²) πΛ² = Λ²/4π    (10.71)

In d = 1,

∫ (d^{d−1}~k/(2π)^{d−1}) = 1    (10.72)

Hence we have

σ* = σ − (3k_B T/2π) Λ ,   d = 2

σ* = σ − (k_B T/4π) Λ² ,   d = 3

and, since the factor (d + 1)/(d − 1) diverges at d = 1, let d = 1 + ǫ for very small ǫ, and

σ* = σ − k_B T/ǫ ,   d = 1 + ǫ ,   ǫ ≪ 1    (10.73)
(It is worth noting that this is the first term in the d = 1 + ǫ expansion, where we have only considered the leading term in

E_state = σ ∫ d^{d−1}~x √(1 + (∂h/∂~x)²) ≈ σL^{d−1} + (σ/2) ∫ d^{d−1}~x (∂h/∂~x)² + ... )
10.3 Impossibility of Phase Coexistence in d=1
Note that, again, the renormalized surface tension vanishes as d → 1 (as ǫ → 0).In our simple calculation σ∗ actually becomes negative infinity at d = 1.
In fact, the lack of phase coexistence in d = 1 is a theorem due to Landau and Peierls, which is worth quickly proving. (See Goldenfeld's book, p. 46.)
A homogeneous phase has an energy

E_Bulk ∝ +L^d    (10.74)

Putting in a surface gives an extra energy

E_Surface ∝ +L^{d−1}    (10.75)

So the extra energy one gets due to adding a surface is

∆E_Surface ∝ +L^{d−1} ;   in d = 1,  ∆E_Surface ∝ +O(1)    (10.76)
Figure 10.18: Two coexisting phases in the (~x, y) plane: E_bulk ∝ L^d, E_surface ∝ L^{d−1}.
Basically, this looks bad for surfaces. Since the energy increases, it looks like one wants to have the fewest possible surfaces, so the picture of fig. 10.18 is that one: two coexisting phases with one well–defined surface.
However, the system's state is chosen by free energy minimization, not energy minimization, where

F = E − TS

Surfaces are good for entropy, because they can occur anywhere along the y axis, as shown in fig. 10.19. The total number of such ways is L along the y axis. Of course there
Figure 10.19: The surface can be placed at many different positions along the y axis.
are also rotations and upside-down interfaces, so that in the end there are L^d ways. The entropy is

∆S = k_B ln(number of states)

so we obtain, for a surface,

∆S = +k_B ln L^d

or,

∆S = +k_B d (ln L)    (10.77)

is the increase in entropy due to having a surface. Hence, the total change in the free energy due to a surface is

∆F = +O(L^{d−1}) − k_B T O(ln L)    (10.78)
So, for d > 1, as L → ∞, the free energy increases if a surface is added; but for d = 1 (at T > 0) the free energy decreases if a surface is added!

This means coexisting phases are stable for d > 1, but in d = 1 they are unstable. Since one interface decreases F, we can add another and another and another, and keep on decreasing F until there are no longer coexisting phases (of bulk energy E ∼ L^d as L → ∞), just a mishmash of tiny regions of phase fluctuations. Hence, for systems with short–ranged forces, at T > 0, there can be no phase coexistence in d = 1. (Short–ranged forces were assumed when we said the surface energy E_Surface ∼ L^{d−1}.)
10.4 Numbers for d=3
Some of these things may seem a little academic, since we have constantly done everything in arbitrary d. Our results for d = 2 apply to films adsorbed on liquid surfaces and to ledges on crystal surfaces, for example. Experiments have been done on these systems, confirming our d = 2 results, such as w ∼ L^{1/2}.

Figure 10.20:

In d = 3, it is worth looking again at two of our main results. From Eq. 10.47,
w = √(k_B T/2πσ) (ln(LΛ/2π))^{1/2}    (10.79)

and

σ* = σ − (k_B T/4π) Λ²

or

σ* = σ(1 − (4πk_B T/σ)(Λ/2π)²)

or

σ* = σ(1 − T/T*)    (10.80)

where

T* ≡ σ / (4πk_B (Λ/2π)²)    (10.81)
Now, reconsider the phase diagram of a pure substance.

Figure 10.21: Phase diagram in the (T, P) plane, with solid (s), liquid (l), and vapour (v) regions, the triple point, and the critical point (T_c, P_c).

Our analysis describes,
for example, the behavior along the liquid–vapour coexistence line. Near the triple point of a typical fluid, the surface tension is such that

√(k_B T_t/2πσ) ≈ 1 Å    (10.82)

where T_t is the triple point temperature. Similarly, one roughly has

Λ/2π ≈ (10 Å)^{-1}    (10.83)

Putting these in Eq. 10.79 above implies

w = 1 Å (ln(L/10 Å))^{1/2}    (10.84)
Putting in numbers is amusing. The width divergence is very weak:

w = { 2.6 Å, L = 10000 Å (light wavelength) ;  4.3 Å, L = 10 cm (coffee cup) ;  5.9 Å, L = 1000 km (ocean) ;  9.5 Å, L = 10^{30} m (universe) }
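These numbers follow directly from Eq. 10.84. A quick sketch reproducing them (all lengths converted to Ångströms):

```python
import numpy as np

def width_angstrom(L_angstrom):
    # w = 1 A * (ln(L / 10 A))^{1/2}, Eq. (10.84); lengths in Angstroms
    return np.sqrt(np.log(L_angstrom / 10.0))

for name, L in [("light wavelength", 1e4),   # 10^4 A
                ("coffee cup",       1e9),   # 10 cm
                ("ocean",            1e16),  # 1000 km
                ("universe",         1e40)]: # 10^30 m
    print(f"{name:16s} w = {width_angstrom(L):.1f} A")
# reproduces 2.6, 4.3, 5.9 and 9.5 Angstroms
```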
(Of course, in a liquid–vapour system, gravity plays a role; certainly it plays a role if one had a liquid–vapour system the size of the universe. The extra energy due to gravity is

E_Gravity = (1/2) ∫ d^{d−1}~x (ρ_l − ρ_v) g h²(~x)

where ρ is the mass density of the liquid or vapour, and g is the gravitational acceleration. This changes our results in a way you should work out for yourself.)
Putting in numbers for the renormalized surface tension gives the result

σ*(T = T_t) ≈ σ(0.3)

which is a much bigger effect! Note that σ* vanishes (assuming σ ≠ σ(T)) as T is increased toward T*. It is tempting, although a little too rough, to interpret

T* = T_c

Putting in numbers gives T* ≈ 1.4 T_t, which is T_c to the right order. More precisely, it is known that

σ* = const. (1 − T/T_c)^μ

for T → T_c, where μ ≈ 1.24 in d = 3. So our theory, with this (over)interpretation, gives μ = 1 instead. This is not too bad, considering we only went to lowest order in the E_state = ∫ d~x √(1 + (∂h/∂~x)²) expansion, and did not consider bulk properties at all.
Chapter 11
Broken Symmetry and Correlation Functions
A major concept in modern work is symmetry. These results for surfaces are an example of more general properties of correlations in systems with broken symmetries.
By a broken symmetry, we mean the following. Note that in the partition function

Z = ∑_{states} e^{−E_state/k_B T}

the energy of a state has a "symmetry",

E_state(h + const.) = E_state(h)    (11.1)

since

E_state = σL^{d−1} + (σ/2) ∫ d^{d−1}~x (∂h/∂~x)²
Naively, since the sum ∑_{states} does not break this symmetry, we would expect it to be respected by any quantity obtained from Z. The two main quantities we have obtained are the moments ⟨h⟩ and ⟨hh⟩. As we found before,

⟨h⟩ = 0    (11.2)

In fact we could have the average value be any result, without changing our algebra. Note that h's expectation value is a definite number. In particular,

⟨h + const.⟩ ≠ ⟨h⟩

This is a common signature of a broken symmetry variable: its correlations and averages do not respect the full symmetry of Z or E_state.
It is also instructive to think of the symmetry of Eq. 11.1 in a more abstract way. Note that homogeneous systems, whether liquid or vapour, are translationally invariant and rotationally invariant. However, a system with a surface, as drawn in fig. 11.1, only has translational invariance and isotropy in the ~x plane of the interface; there is no invariance in the y direction. Hence the system with a surface has a lower symmetry (a broken symmetry) compared to the liquid or vapour system.

Figure 11.1:

This perfect symmetry of a homogeneous system is lost precisely because h, the broken symmetry variable, has a definite value, like ⟨h⟩ = 0.
It is instructive to see that the fact that Z and E_state have this symmetry implies the form of the correlation functions of the broken symmetry variable h. Consider d = 2 only. Since a surface can be anywhere for a given system (it only has a definite value for a given system), it can be in different places for different systems. In a very large system, part of ⟨h⟩ might have one value, while far, far away on the x axis it might want to have another value (fig. 11.2). The free energy cost of such a fluctuation, if it involves very large length scales (equivalent to k → 0), is infinitesimal:

Figure 11.2:
∆E_state(k) = (σ/2) k² |h_k|²

as k → 0.

Figure 11.3: Surface of a very big system.

Indeed, if one has a very, very large system encompassing the four shown in fig. 11.3, the four subsystems behave like independent systems which each have a definite value for ⟨h⟩. Since the energy cost of such fluctuations is infinitesimal as k → 0, we can set up long–wavelength sloshing modes with infinitesimal k_B T. Roughly,
k_B T/2 = (σ/2) k² |h_k|²

and

⟨|h_k|²⟩ ∼ (k_B T/σ) (1/k²)

as we recall. These sloshing modes are capillary waves driven by thermal noise. But we need not know the form of E_state to obtain ⟨|h_k|²⟩ ∼ 1/k². We
only need to know that it breaks a symmetry of Z and E_state. As shown in fig. 11.3, fluctuations occur, say ±∆y, over a long distance ∆x. The values of ∆y (which could be ∼ Å) and ∆x (which could be ∼ meters) are not important for the argument. Let us invent a model where, after one goes a distance ∆x to the right, one goes up or down ∆y, at random, as shown in fig. 11.4.

Figure 11.4: Interface as a random walk (in units of ∆x and ∆y), with liquid below and vapour above.

Of course, this is a one–dimensional random walk along the y axis in "time" x. The root–mean–square deviation of y from its initial value is
y_RMS ∼ x^{1/2}

for large "time" x. Of course y_RMS is related to the correlation function (g(x))^{1/2} from Eq. 10.54, so we have

g(x) ∼ x

Fourier transforming, and fiddling via Eqs. 10.55 and 10.56, gives G(k) ∼ 1/k² for small k, so that

⟨|h_k|²⟩ ∼ 1/k²

as expected. Our argument was only in d = 2, but it can be generalized to other dimensions.
Note that it was the fact that h was a broken symmetry variable which gave rise to the 1/k²; no special details of the interaction were necessary.

Before obtaining the general result, note that the system reacts to a symmetry breaking variable by "shaking it" in such a way as to try and restore the broken symmetry.

Figure 11.5: The rough width w ∼ L^X: a symmetry-breaking variable ends up having its fluctuations try to restore the symmetry.

The rough width of a surface is this attempt to restore the full translational invariance in the y direction, as shown in fig. 11.5. In fact, in d = 1, the fluctuations are so strong that the full symmetry is restored, and one has translational invariance in the y direction (since coexisting phases cannot exist). In 1 < d ≤ 3, the fluctuations do not restore the symmetry, but they make the surface rough, or diffuse, as L → ∞. In d > 3, fluctuations do not affect the flat interface.

Finally, it should be noted that the cartoon on p. 64 is self–similar and self–affine, in the same sense as the von Koch snowflake we discussed earlier. Note the important new ingredient: broken symmetry ⇔ power–law correlations ⇔ self–similar structure.
11.1 Proof of Goldstone’s Theorem
We shall now prove Goldstone's theorem. The theorem is: if B is a broken symmetry variable, where B(~r) is B's local value (~r is a field point in d dimensions), then

⟨B(~k) B(~k′)⟩ = (2π)^d δ(~k + ~k′) G(k)

where

lim_{k→0} G(k) → ∞

and usually

G(k) ∝ 1/k²
We will give not so much a proof of this as an indication of how to work it out for whatever system one is interested in. To make things easier, say

⟨B⟩ = 0

If this is not true, redefine B ⇒ B − ⟨B⟩. Assuming one has identified B as an important broken symmetry variable, then the free energy's dependence on B can be found by a Taylor expansion:

F(B) = F(0) + (∂F/∂B) B + (1/2)(∂²F/∂B²) B² + ...
But F(0) is just a constant, and ∂F/∂B = 0 at equilibrium, so

F(B) = (1/2)(∂²F/∂B²) B² + ...    (11.3)
Of course, by F we mean E_state or F_state in the sense of Eq. 10.15. Now, it turns out to be important to deal with the local value B(~r), or the Fourier component B(~k). If we use the local value for the expansion in Eq. 11.3 we have

F(B) = (1/2) ∫ d^d~r ∫ d^d~r′ (∂²F/∂B(~r)∂B(~r′)) B(~r) B(~r′)    (11.4)
which looks worse than it is. Since ∂²F/∂B(~r)∂B(~r′) is a thermodynamic derivative of some kind, in a system which is (we assume) rotationally invariant and translationally invariant,

∂²F/∂B(~r)∂B(~r′) ≡ C(~r, ~r′) = C(~r − ~r′)    (11.5)
In Fourier space,

F(B) = (1/2) ∫ (d^d~k/(2π)^d) ∫ (d^d~k′/(2π)^d) (∂²F/∂B(~k)∂B(~k′)) B(~k) B(~k′)    (11.6)

But translational invariance gives

∂²F/∂B(~k)∂B(~k′) = (2π)^d δ(~k + ~k′) C(k)    (11.7)

and, if B(~x) is real, so B*(~k) = B(−~k), we obtain (finally)

F(B) = { (1/2) ∫ (d^d k/(2π)^d) C(k) |B(k)|²   (Continuum) ;  (1/2L^d) ∑_{~k} C_k |B_k|²   (Discrete) }    (11.8)
We can obtain the form of C(~k) from expanding close to ~k = 0:

C(~k) = const. + const. k² + const. k⁴ + ...    (11.9)

where we have assumed everything is analytic near k = 0. Hence terms like ~k·~k = k² and (~k·~k)² = k⁴ appear, but not nonanalytic terms like (~k·~k)|~k| = k³ (which involves the absolute value).
Finally, we use the fact that B breaks a symmetry of F. Hence F can have no dependence on B in the thermodynamic limit ~k = 0. Therefore

lim_{k→0} C(k) = 0    (11.10)

and, if we have analyticity,

C(k) = const. k²    (11.11)
for a broken–symmetry variable near ~k = 0. This gives

F(B) = const. ∑_{~k} k² |B_{~k}|²    (11.12)

This is the same form as Eq. 10.35, so we have

⟨B(~k) B(~k′)⟩ = (2π)^d δ(~k + ~k′) G(k)

where

G(k) ∝ 1/k²   as ~k → 0
as claimed. If C(k) is not analytic, as assumed in Eq. 11.9, we get the weaker condition

lim_{k→0} G(k) → ∞

Finally, it is amusing to note that if B is not a broken symmetry variable, from Eq. 11.9 we have

F(B) = const. ∑_{~k} (k² + ξ^{−2}) |B_k|²

where ξ^{−2} is a constant. This gives ⟨B(~k)B(~k′)⟩ = (2π)^d δ(~k + ~k′) G(k), where G(k) ∝ 1/(k² + ξ^{−2}) as k → 0. Fourier transforming gives

⟨B(~r) B(0)⟩ = G(r) ∝ e^{−r/ξ}

as r → ∞, as we anticipated earlier, when correlations were discussed.
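The contrast between the two cases is easy to see numerically: the Fourier transform of the Lorentzian 1/(k² + ξ^{−2}) decays exponentially with rate 1/ξ, while the broken-symmetry case ξ → ∞ gives power-law correlations. A one-dimensional sketch (grid parameters are illustrative):

```python
import numpy as np

N, dx, xi = 4096, 0.1, 2.0            # grid and correlation length (illustrative)
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
Gk = 1.0 / (k ** 2 + 1.0 / xi ** 2)   # Lorentzian G(k), unbroken symmetry

Gr = np.fft.ifft(Gk).real / dx        # G(r) on the grid
r = np.arange(N) * dx

# decay rate of ln G(r) between r = 1 and r = 3
rate = (np.log(Gr[10]) - np.log(Gr[30])) / (r[30] - r[10])
print(rate)                           # close to 1/xi = 0.5, i.e. G(r) ~ e^{-r/xi}
```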
Chapter 12
Scattering
Figure 12.1: Scattering geometry, with incident wavevector ~k_i and final wavevector ~k_f.
Scattering experiments evidently involve the change in momentum of some incident particle or radiation by a sample. The momentum transferred to the sample (in units of ℏ) is

~q = ~k_i − ~k_f    (12.1)

by conservation of momentum. This, along with the intensity of the scattered radiation,

I(~q)    (12.2)

is measured at the detector. We shall only consider elastic scattering, where the energy ω (in units of ℏ) is unchanged. Furthermore, we are implicitly considering the first Born approximation, and thereby neglecting multiple scattering.

The intensity of the scattered radiation might be due to an electromagnetic field. That is,

I(~q) ∝ |E(q)|²    (12.3)
where E is the electric field. Clearly the electric field variations with ~q are due to variations in the dielectric properties of the sample, i.e.,

E(q) ∝ δǫ(q)    (12.4)

where δǫ is the dielectric variability. Now, if δǫ is a function of only a small number of local thermodynamic variables, like density and temperature,

δǫ = (∂ǫ/∂ρ) δρ + (∂ǫ/∂T) δT    (12.5)

Usually ∂ǫ/∂T is exceedingly small, so we have

I(q) ∝ |ρ(q)|²    (12.6)

A similar argument can be constructed if the scattering is not electromagnetic, giving the same result. We'll now focus on |ρ(q)|², letting

S(q) ≡ |ρ(q)|²    (12.7)
be the structure factor. I should note that, although I've skimmed this, the details giving Eq. 12.6 are often experimentally important. But that's their job; ours is to figure out S(q). We will deal with two cases for ρ(q):

1. ρ corresponds to the order parameter, a local density or local concentration.

2. The order parameter is not the local density, but instead corresponds to a sublattice concentration or magnetization, or to a Bragg peak's intensity.
The first of these is easier. Consider a system of uniform density (we'll do interfaces later); then

〈ρ(x) ρ(0)〉 = ρ² (12.8)

of course, so the Fourier transform gives

S(q) ∝ δ(q) (12.9)

A cartoon (fig. 12.2) may make this clear. The second case is more subtle. An example is the sublattice concentration of a binary alloy. Say the atoms are A and B in the alloy, and that locally, A wants to be neighbours with B but not with another A. This makes the ordered state look like a checker–board, as shown in fig. 12.3.

Assuming the scattering is different for A and B atoms, this sort of order can be seen by looking at S(q). However, it must be realized that the ordering of, say, the A atoms is on a sublattice, which has twice the lattice spacing of the original system, as shown in fig. 12.4. Hence the density of A atoms is uniform on a sublattice with twice the lattice constant of the original system, and the
Figure 12.2: A uniform, boring system: 〈ρ(x)ρ(0)〉 = ρ², so S(q) ∝ δ(q) (scattering shown as the shaded region).
Figure 12.3: Perfectly ordered A–B alloy.
Figure 12.4: Sublattice 1 (shaded) has lattice constant 2a; sublattice 2 is unshaded; a is the original lattice spacing.
Figure 12.5: Peaks in S(q) at ±q₀, on the q_y axis or the q_x axis.
scattering will show peaks not at ~q = 0 but at ~q = ~q₀, corresponding to that structure:

S(~q) ∝ ∑_{~q₀} δ(~q − ~q₀) (12.10)

where q₀ = 2π/(2a), and a is the original lattice spacing. The same thing happens in a crystal, where Bragg peaks form at specific
positions in ~q space. By monitoring the height and width of such peaks, the degree of order can be determined. A further complication is that usually the sample will not be so well aligned with the incident radiation that one gets spots on the q_x and q_y axes as shown. Instead they appear at some random orientation (fig. 12.6). This might seem trivial (one could just realign the sample), except that often one has many different crystallite orientations scattering simultaneously, so that all the orientations indicated above are smeared into a ring of radius q₀.

Figure 12.6: Spots at radius q₀ appear at some random orientation, differing from crystallite to crystallite.
Figure 12.7: Average of many crystallites: a ring of radius q₀ in the (q_x, q_y) plane.
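The superlattice peak of Eq. 12.10 is easy to see numerically. Here is a minimal sketch (hypothetical parameters, not from the notes: lattice spacing a = 1, A atoms represented by +1 and B atoms by −1):

```python
import numpy as np

# A perfectly ordered A-B checkerboard on an N x N lattice (a = 1):
# A atoms are +1, B atoms are -1, so the density alternates in sign.
N = 32
i, j = np.indices((N, N))
rho = (-1.0) ** (i + j)

# Structure factor S(q) = |rho(q)|^2 via a discrete Fourier transform.
S = np.abs(np.fft.fft2(rho)) ** 2

# The sublattice spacing is 2a, so the Bragg peak sits at
# q0 = 2*pi/(2a) = pi, i.e. FFT index N/2 in each direction.
peak = np.unravel_index(np.argmax(S), S.shape)
print(peak[0], peak[1])   # N/2, N/2: the (pi, pi) superlattice peak
```

The single sharp peak at (π, π) is the numerical counterpart of the delta functions in Eq. 12.10.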
12.1 Scattering from a Flat Interface
Consider a flat interface without roughening, as drawn in fig. 12.8. For simplicity we will usually consider an interface embedded in dimension d = 2.

Figure 12.8: A flat interface at y = 0: grey scale, and the density profile ρ(y).

If the density is uniform, it can be written as
ρ(x, y) = A θ(y) + B (12.11)
where θ(y) is the Heaviside step function, and A and B are constants. For simplicity, we will let A = 1 and B = 0 from now on:
ρ(x, y) = θ(y) (12.12)
To get S(q) we need the Fourier transform of θ. Let

θ = H + C

where C is a to–be–determined constant. Taking a derivative gives

dθ/dy = dH/dy + 0

or,

dH/dy = δ(y)

Fourier transforming gives

H(k_y) = 1/(ik_y)

so that

θ(k_y) = 1/(ik_y) + C δ(k_y)

To determine C, note that

θ(y) + θ(−y) = 1

so

θ(k_y) + θ(−k_y) = δ(k_y)

Hence C = 1/2, and

θ(k_y) = 1/(ik_y) + (1/2) δ(k_y) (12.13)
The two–dimensional Fourier transform of ρ(x, y) = θ(y) is therefore

ρ(~k) = (1/(ik_y)) δ(k_x) + (1/2) δ(~k) (12.14)

From this we obtain the scattering from a flat interface at y = 0 as the modulus squared,

S(k) = |ρ(k)|² ∝ (1/k_y²) δ(k_x) (12.15)
where we have neglected the trivial bulk delta function. If the unit vector normal to the interface is n, and that along the surface is n⊥, then

S(k) ∝ (1/(~k·n)²) δ(~k·n⊥) (12.16)
As a cartoon (fig. 12.10), the scattering appears as a streak pointed in the direction of n (y in fig. 12.10).

Figure 12.9: Interface geometry: normal n along y, tangent n⊥ along x.

Figure 12.10: Scattering from the interface y = 0: grey scale of S(k), a streak along k_y at k_x = 0.
Figure 12.11: Density profile and grey scale for an interface at y = 0.
Let's also consider the scattering from a line defect, which is similar but has important differences. In this case,

ρ(x, y) = δ(y) (12.17)

Then

ρ(~k) = δ(k_x) (12.18)

and

S(k) ∝ δ(k_x) (12.19)
or, if the unit vector normal to the surface is n, and that along the surface is n⊥,

S(k) ∝ δ(~k·n⊥) (12.20)
for a line defect.

Figure 12.12: Scattering from a line defect at y = 0: grey scale of S(k), a streak at k_x = 0.

Now, if ρ is not the order parameter (like the second case discussed above), so that the density doesn't vary across the interface, but an order parameter corresponding to sublattice concentration or crystalline orientation does, we have to incorporate the wavenumber ~k₀ corresponding to that structure. For simplicity, we will assume that all one has to do is modulate the structure up to that wavenumber. That is, for an interface,

S(k) ∝ (1/|(~k − ~k₀)·n|²) δ((~k − ~k₀)·n⊥) (12.21)
while for a line defect
S(k) ∝ δ((~k − ~k0) · n⊥) (12.22)
12.2 Roughness and Diffuseness
So far this is only geometry. And even as geometry, we have not considered a diffuse interface. Before doing roughness, let's consider a diffuse interface, as shown in fig. 12.13. One example of this is
Figure 12.13: A diffuse interface of width ξ, with no roughness.
ρ_diffuse(x, y) ∼ tanh(y/ξ) (12.23)

so that

S(k) = |ρ_diffuse(k)|² ∼ (δ(k_x)/k_y²)(1 + O(ξk_y)²) (12.24)

for an interface at y = 0. So the diffuseness only affects scattering at k_x = 0
Figure 12.14: Extra scattering due to diffuseness, at k_x = 0 and large k_y.
and large k_y. On the other hand, let us now consider a rough interface, with no diffuseness, where the interface is given by
y = h(x) (12.25)
and
ρRough(x, y) = θ(y − h(x)) (12.26)
Figure 12.15: A very rough interface, with no diffuseness
If h(x) involves a small variation, we can expand accordingly to obtain

ρ_Rough ≈ θ(y) + h(x)δ(y) + ... (12.27)

since dθ/dy = δ(y). Taking the Fourier transform and squaring gives

S(k) = |ρ_Rough(k)|² ∼ δ(k_x)/k_y² + |h(k_x)|² (12.28)
Averaging over thermal fluctuations using

〈|h(k_x)|²〉 ∼ 1/k_x^{2−η} (12.29)

which defines η, gives

S_Rough(~k) ∼ δ(k_x)/k_y² + T/k_x^{2−η} (12.30)
where T is a constant. It would appear, then, that one could deconvolute the geometry δ(k_x)/k_y², the diffuseness O(ξk_y)², and the roughness 1/k_x^{2−η}, since they have different signatures in k space. For excellent samples this is in fact true, but if there are many randomly oriented interfaces present, these signatures are smeared out and washed away.

Figure 12.16: Extra scattering due to roughness.
12.3 Scattering From Many Flat Randomly–Oriented Surfaces
Evidently the scattering from many randomly–oriented surfaces gives a pinwheel–like cluster of streaks, as shown in fig. 12.17.

Figure 12.17: Pinwheel of streaks; the angular average gives a power law 1/k^p.

Angularly averaging over all orientations of the interface gives
S = ∫ dn |ρ_k|² / ∫ dn (12.31)
where n is the unit vector normal to the surface. We will write

n = −sin θ x + cos θ y (12.32)

and

n⊥ = cos θ x + sin θ y (12.33)

with

~k = k cos β x + k sin β y (12.34)

Then, for scattering from a surface, where |ρ_k|² = δ(~k·n⊥)/|~k·n|², we have (in d = 2)
S = (1/2π) ∫₀^{2π} dθ δ[k cos β cos θ + k sin β sin θ] / |−k cos β sin θ + k sin β cos θ|²

  = (1/2π)(1/k³) ∫₀^{2π} dθ δ[cos(θ − β)] / |sin(θ − β)|²

  = (1/2π)(1/k³) ∫₀^{2π} dθ δ[cos θ] / |sin θ|²

But cos θ = 0 at θ = π/2 and 3π/2, where |sin θ|² = 1, so

S_Surfaces = (1/π)(1/k³) (12.35)
Similarly, in arbitrary dimension one has

S_Surfaces ∝ 1/k^{d+1} (12.36)
for scattering from a surface. For scattering from a line defect, which has |ρ(k)|² ∝ δ(k_x), one similarly obtains

S_line defects ∝ 1/k^{d−1} (12.37)

We shall call these results Porod's law. They involve only geometry. The case where ρ is not the order parameter, and a modulated structure exists, is more subtle. If the wavenumber of that structure is ~k₀, then this introduces a new angle via

~k₀ = k₀ cos α x + k₀ sin α y (12.38)
Now there are two possible averages
Figure 12.18:
1. α or β: average over crystallite orientation, or over angles in a non–angularly resolved detector. These are equivalent (look at the cartoon for a few seconds), and so we only need to do one of these two averages.
2. θ, averages over surface orientation.
If we average over α or β, for a fixed n, it is easy to anticipate the answer, as shown in fig. 12.19, where in the cartoon the regions around (k_x = k₀, k_y = 0) retain the singularity of the original δ(k_x − k₀)/k_y².

Figure 12.19: Average over α or β, for an interface with n = y.
Figure 12.20: The θ average smears the streaks into rings of radius k₀.
This is basically a detector problem, so instead we will consider averaging θ first. In fact it is obvious what the result of such an average is: it must be the previous result, but shifted from ~k = 0 to ~k = ~k₀. That is,

S_surface(k) ∝ 1/|~k − ~k₀|^{d+1}

S_line(k) ∝ 1/|~k − ~k₀|^{d−1} (12.39)
If a further average is done over crystallite orientation, we can take these expressions and angularly average them over the angle between ~k and ~k₀. Clearly after averaging we will have

S = S((|k| − |k₀|)/|k₀|) ≡ S(∆k) (12.40)
First, let

φ = β − α (12.41)

so that

(~k − ~k₀)² = k² − 2kk₀ cos φ + k₀²

or, on using

k ≡ k₀(∆k + 1) (12.42)

we have

(~k − ~k₀)² = k₀²{(∆k)² + 2(1 − cos φ)(∆k + 1)} (12.43)
Hence in two dimensions, for a surface, we have

S(∆k) = (1/2πk₀³) ∫₀^{2π} dφ 1/|(∆k)² + 2(1 − cos φ)(∆k + 1)|^{3/2} (12.44)

which is some weird integral. We will work out its asymptotic limits ∆k >> 1 and ∆k << 1.
Firstly, if ∆k >> 1, then

S(∆k) = (1/2πk₀³)(1/(∆k)³) ∫₀^{2π} dφ + ...

or (in arbitrary d)

S(∆k) ∼ 1/(∆k)^{d+1},  ∆k >> 1 (12.45)
If ∆k << 1, consider

S(∆k → 0) =? (1/2πk₀³) ∫₀^{2π} dφ 1/|2(1 − cos φ)|^{3/2}

which diverges due to the φ = 0 (and φ = 2π) contribution. Oops. So we had better keep a little ∆k. For convenience, we will also expand φ around 0:

S(∆k) ≈ (1/2πk₀³) ∫₀^{2π} dφ 1/|(∆k)² + φ²|^{3/2} + ...

Now let u = φ/∆k:

S(∆k) = (1/2πk₀³)(1/(∆k)²) ∫₀^{2π/∆k → ∞} du 1/|1 + u²|^{3/2}

so,

S(∆k) ∼ 1/(∆k)² as ∆k → 0 (12.46)

for all dimensions d.
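Both limits of Eq. 12.44 can be checked by direct numerical integration. A sketch (k₀ = 1; plain periodic-grid quadrature):

```python
import numpy as np

# S(dk) = (1/2 pi k0^3) * integral over phi of
#         [dk^2 + 2(1 - cos phi)(dk + 1)]^(-3/2),   with k0 = 1.
def S(dk, n=200000):
    dphi = 2.0 * np.pi / n
    phi = np.arange(n) * dphi           # periodic grid on [0, 2 pi)
    f = (dk**2 + 2.0 * (1.0 - np.cos(phi)) * (dk + 1.0)) ** (-1.5)
    return f.sum() * dphi / (2.0 * np.pi)

# dk << 1: S ~ 1/dk^2, so S * dk^2 approaches a constant (1/pi here).
small = [S(dk) * dk**2 for dk in (1e-3, 5e-4)]
# dk >> 1: S ~ 1/dk^3, the d = 2 case of the 1/dk^(d+1) tail.
large = [S(dk) * dk**3 for dk in (100.0, 200.0)]
print(small, large)
```

The rescaled values are nearly constant in each limit, confirming the crossover from the 1/(∆k)² center to the 1/(∆k)^{d+1} tail.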
If one does this in d dimensions with

S = 1/|~k − ~k₀|^γ (12.47)

then

S ∼ 1/(k − k₀)^{γ−(d−1)},  (k − k₀)/k₀ << 1
S ∼ 1/(k − k₀)^γ,  (k − k₀)/k₀ >> 1 (12.48)

provided γ − (d − 1) > 0. For the case γ = d − 1, one obtains

S ∼ −ln(k − k₀),  (k − k₀)/k₀ << 1
S ∼ 1/(k − k₀)^{d−1},  (k − k₀)/k₀ >> 1 (12.49)
It should be noted that the actual scattering is given by a weird integral like Eq. 12.44, which is pretty close to an elliptic integral.

Figure 12.21: Scattering from the ring at k₀: ∼ 1/|k − k₀|^{d+1} near the tails, and ∼ 1/|k − k₀|² near the center.

Finally, note the big mess of power laws, and we have not considered diffuseness or roughness! Everything is just geometry.
Chapter 13
Fluctuations and Crystalline Order
Usually it is safer to apply Goldstone’s theorem by “re–deriving” it for what-ever system one is interested in. However, to give an interesting and concreteexample, consider crystalline order in a solid. The density ρ clearly depends on
Figure 13.1: Atoms on a lattice: the density ρ = ρ(~r).
positions ~r, as shown, in a regular crystal. A crystal, although it looks symmetric in a common sense way, actually has very little symmetry compared to a liquid or a vapour. A liquid or a vapour is isotropic in all directions of space and translationally invariant in all directions. A crystalline solid has a very restricted symmetry. In figure 13.1, the system is invariant under (e.g.)
~r → ~r + r0(nx + my) (13.1)
where n and m are integers (positive or negative), and r₀ is the distance between atoms. Clearly then a crystal has very low symmetry, and, indeed, breaks the translational and rotational invariance of space!
This means there should be a broken symmetry variable associated with this, which satisfies Goldstone's theorem. Clearly this variable is associated with the ~r dependence of
Figure 13.2: Invariance only for ~r → ~r′, with ~r′ − ~r = r₀(+1x − 1y) shown here; r₀ is the atomic spacing.
Let us then introduce a variable ~u(~r) which describes the local variations in the density. That is, due to some fluctuation,

~r → ~r + ~u(~r)

and

ρ(~r) → ρ(~r + ~u(~r)) (13.3)
where ~u(~r) is the local displacement vector as shown in fig. 13.3.
Figure 13.3: Jiggled atoms due to thermal fluctuations: ~u(~r) is the displacement of the actual position from the equilibrium averaged position.
Aside: What does it mean to break translational invariance? Under ρ(~r) → ρ(~r + ~u), the local variable corresponding to the broken symmetry is ~u(~r) (fig. 13.4). A definite value for 〈~u(~r)〉, = 0 say, breaks space symmetry (translational and rotational invariance).

Figure 13.4: The local variable ~u(~r) corresponding to the broken symmetry.
By construction,

〈~u(~r)〉 = 0 (13.4)

In Fourier space, if all directions of fluctuation are equally likely,

〈~u(~k)~u(~k′)〉 ∝ (2π)^d δ(~k + ~k′)(1/k²) (13.5)
It is a little hard to see what this does to a density, since ~u appears inside ρ(~r + ~u(~r)). The thing to do is to expand in a Fourier series, but to get an idea, consider d = 1. Atoms don't look like cosines, but this gives the sense of it.
Figure 13.5: Rough depiction of a one–dimensional variation of density via a cosine, ρ(x) ∝ (cos Gx + 1), with period 2π/G.
More precisely,

ρ(x) = ∑_G e^{iGx} a_G (13.6)

where there might be several wavenumbers G giving the density, and a_G is a Fourier coefficient giving the atomic form. In arbitrary dimension we write
Figure 13.6: Better depiction of a one–dimensional variation, ρ(x) = ∑_G e^{iGx} a_G. But the cosine is essentially the right idea.
ρ(~r) = ∑_{~G} e^{i~G·~r} a_{~G} (13.7)
where the ~G are called the reciprocal lattice vectors. Then we have a fluctuation represented by
ρ(~r + ~u(~r)) = ∑_{~G} e^{i~G·(~r + ~u(~r))} a_{~G} (13.8)
〈ρ(~r + ~u(~r))〉 = ∑_{~G} e^{i~G·~r} a_{~G} 〈e^{i~G·~u(~r)}〉 (13.9)
To calculate the average we use the fact that ~u(~r) is a Gaussian variable. Therefore

〈e^{i~G·~u}〉 = e^{−(1/2)〈(~G·~u)²〉} = e^{−(1/2) ~G~G:〈~u~u〉} (13.10)

But we expect ~u to be correlated with ~u only in the same direction. That is,

〈u_α u_β〉 ∝ δ_αβ

so

〈e^{i~G·~u}〉 ≈ e^{−(1/2) G²〈u²〉}
But, from Eq. 13.5 we have

〈u²〉 = ∫ (d^dk/(2π)^d)(1/k²)

or

〈u²〉 = Const.,  d = 3
〈u²〉 = ln L,  d = 2
〈u²〉 = L,  d = 1 (13.11)
So in d = 3,

〈ρ〉 = Const. ∑_{~G} e^{i~G·~r} a_{~G} (13.12)

where the Const. = exp(−(1/2)G²(Const.)) is called the Debye–Waller factor. In d = 2 we have

〈ρ〉 = ∑_{~G} e^{i~G·~r} a_G e^{−(1/2)G² ln L} = ∑_{~G} e^{i~G·~r} a_G (1/L)^{const. G²/2} (13.13)

which is a marginal case. In d = 1 we have

〈ρ〉 = ∑_{~G} e^{i~G·~r} a_G e^{−(1/2)G²L} (13.14)
which is a large effect, wiping out the periodicity as L → ∞. This is the content of the original Landau–Peierls theorem, which states that a one dimensional variation in density, ρ(x) (whether crystal structure or phase coexistence) is impossible: the only G which can survive in Eq. 13.14 is G = 0 as L → ∞, which gives ρ(x) → Constant, due to ~u(~r) fluctuations.
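The infrared divergences in Eq. 13.11 can be checked with a few lines of quadrature. A sketch (assumptions: angular factors dropped, and the integral cut off at k_min = 2π/L, k_max = 1):

```python
import numpy as np

# <u^2> ~ integral of d^dk / k^2, with an infrared cutoff 2*pi/L.
def u2(d, L):
    k = np.geomspace(2.0 * np.pi / L, 1.0, 200000)
    f = k ** (d - 1) / k**2            # k^(d-1) from d^dk, times 1/k^2
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(k))   # trapezoid rule

for d in (3, 2, 1):
    r = u2(d, 1e8) / u2(d, 1e4)
    print(d, round(r, 1))
# d = 3: ratio ~ 1 (a constant); d = 2: ratio ~ 2 (ln L); d = 1: ratio ~ 10^4 (L)
```

Growing L by four decades leaves d = 3 unchanged, roughly doubles d = 2, and multiplies d = 1 by 10⁴, exactly the Const./ln L/L pattern of Eq. 13.11.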
It is more revealing to do these things for the scattering intensity I(k), where
I(k) ∝ |ρk|2 (13.15)
(Scattering is from coherent density fluctuations.) From Eq. 13.7,

ρ_k = ∫ d^dr e^{−i~k·~r} { ∑_{~G} e^{i~G·~r} a_{~G} } = ∑_{~G} a_G ∫ d^dr e^{−i(~k−~G)·~r}

or

ρ_k = ∑_{~G} a_G δ(~k − ~G) (13.16)
so,

ρ*_k = ∑_{~G} a*_G δ(~k − ~G)

and

|ρ_k|² = ∑_{~G,~G′} a_G a*_{G′} δ(~k − ~G) δ(~k − ~G′)

or,

|ρ_k|² ∝ ∑_{~G} |a_G|² δ(~k − ~G) (13.17)
where |a_G|² is an atomic form factor. This gives well defined peaks where ~k = ~G, as shown in fig. 13.7. When thermal fluctuations are present,
Figure 13.7: I_k = |ρ_k|²: scattering gives sharp peaks at the various ~G's, if thermal fluctuations are absent.
ρ_k = ∫ d^d~r e^{−i~k·~r} ( ∑_{~G} e^{i~G·~r} a_G e^{i~G·~u(~r)} )

or,

ρ_k = ∑_{~G} a_G ∫ d^d~r e^{−i(~k−~G)·~r} e^{i~G·~u(~r)} (13.18)
so

ρ*_k = ∑_{~G′} a*_{G′} ∫ d^d~r′ e^{i(~k−~G′)·~r′} e^{−i~G′·~u(~r′)}
Hence

〈|ρ_k|²〉 = ∑_{~G,~G′} a_G a*_{G′} ∫ d^d~r ∫ d^d~r′ e^{−i(~k−~G)·~r} e^{i(~k−~G′)·~r′} 〈e^{i~G·~u(~r) − i~G′·~u(~r′)}〉 (13.19)

This is the same sort of integral as we encountered before, so we can simply use
〈e^{i~G·~u(~r) − i~G′·~u(~r′)}〉 = f(~r − ~r′) (13.20)
and then,

〈|ρ_k|²〉 = ∑_{~G,~G′} a_G a*_{G′} ∫ d^d~r e^{−i(~k−~G)·~r} (2π)^d δ(~G − ~G′) f(~r) (13.21)

where

f(~r) = 〈e^{i~G·~u(~r) − i~G·~u(0)}〉 = e^{−(G²/2)〈(u(~r)−u(0))²〉} = Const. e^{−G²〈u(~r)u(0)〉} (13.22)
so,

〈|ρ_k|²〉 = const. ∑_{~G} |a_{~G}|² ∫ d^dr e^{−i(~k−~G)·~r} e^{−G²〈u(~r)u(0)〉} (13.23)

The correlation function 〈u(~r)u(0)〉 can be worked out as before (knowing 〈|u(k)|²〉 ∼ 1/k²):

〈u(~r)u(0)〉 = Const.,  d = 3
〈u(~r)u(0)〉 = ln r,  d = 2
〈u(~r)u(0)〉 = r,  d = 1 (13.24)
In d = 3,

〈|ρ_k|²〉 = Const. ∑_{~G} |a_G|² δ(~k − ~G)

with the same delta function peaks. In d = 2, the marginal case,

〈|ρ_k|²〉 = Const. ∑_{~G} |a_G|² ∫ d^dr e^{−i(~k−~G)·~r} e^{−Const. ln r}
         = Const. ∑_{~G} |a_G|² ∫ d^dr e^{−i(~k−~G)·~r} / r^{Const.}

(this is the Fourier transform of a power law)

         = Const. ∑_{~G} |a_G|² (1/|~k − ~G|^{d−Const.})
Figure 13.8: Delta function singularities in I(k) at ~k = ~G, for d = 3.
or, letting (d − Const.) ≡ 2 − η, we have

〈|ρ_k|²〉 = Const. ∑_{~G} |a_G|² (1/|~k − ~G|^{2−η}) (13.25)

in d = 2. The thermal fluctuations change the delta function singularities to power–law singularities! Sometimes these are called η–line shapes (eta–line shapes).
Figure 13.9: Power law singularities in I(k) at ~k = ~G, for d = 2.
In d = 1, we have

〈|ρ_k|²〉 = Const. ∑_{~G} |a_G|² ∫ d^dr e^{−i(~k−~G)·~r} e^{−(Const.)r}

This is the Fourier transform of an exponential; but

∫ d^dr e^{i~k′·~r} e^{−r/ξ} ∝ 1/((k′)² + ξ⁻²)

for small k′. So, for d = 1,

〈|ρ_k|²〉 = Const. ∑_{~G} |a_{~G}|² (1/((~k − ~G)² + ξ⁻²)) (13.26)
where ξ⁻² is a constant. The thermal fluctuations remove all singularities at ~k = ~G now! This is another sign that ρ(x) is impossible, and there can be no 1−d crystalline order (or 1−d phase coexistence).

Figure 13.10: No singularities in I(k), for d = 1.

There are two other things associated with broken symmetry which I will not discuss, but you can look up for yourself.
1. Broken symmetries are usually associated with a new (gapless, soft, or acoustic) mode. For surfaces, these propagating modes are capillary waves. For crystalline solids, they are shear–wave phonons.
2. Broken symmetries are associated with a "rigidity". For crystalline solids, this is simply their rigidity: it takes a lot of atoms to break a symmetry of nature, and they all have to work together.
Chapter 14
Ising Model
There are many phenomena where a large number of constituents interact to give some global behavior. A magnet, for example, is composed of many interacting dipoles. A liquid is made up of many atoms. A computer program has many interacting commands. The economy, the psychology of road rage, plate tectonics, and many other things all involve phenomena where the global behavior is due to interactions between a large number of entities. One way or another, almost all such systems end up being posed as an "Ising model" of some kind. It is so common that it is almost a joke.
The Ising model has a large number of spins, each of which is in one of two microscopic states, +1 or −1. The value of a spin at a site is determined by the spin's (usually short–range) interaction with the other spins in the model. Since I have explained it so vaguely, the first, and major, import of the Ising model is evident: it is the simplest nontrivial model of an interacting system. (The only simpler model would have only one possible microscopic state for each spin; I leave it as an exercise for the reader to find the exact solution of this trivial model.) Because of this, whenever one strips the complexities of some interesting problem down to the bone, what is left is usually an Ising model of some form.
In condensed–matter physics, a second reason has been found to use the simplified–to–the–bone Ising model. For many problems where one is interested in large length scale phenomena, such as continuous phase transitions, the additional complexities can be shown to be unimportant. This is called universality. This idea has been so successful in the study of critical phenomena that it has been tried (or simply assumed to be true) in many other contexts.
For now, we will think of the Ising model as applied to anisotropic magnets. The magnet is made up of individual dipoles which can orient up, corresponding to spin +1, or down, corresponding to spin −1. We neglect tilting of the spins,
Figure 14.1: An up arrow is spin S = +1; a down arrow is spin S = −1.
such as տ, or ր, or ց, and so on. For concreteness, consider dimension d = 2, putting all the spins on a square lattice, where there are i = 1, 2, 3, ...N spins. In the partition function, the sum over states is
Figure 14.2: Possible configurations for an N = 4² = 16 system.
∑_{States} = ∑_{s₁=−1}^{+1} ∑_{s₂=−1}^{+1} ... ∑_{s_N=−1}^{+1} = ∏_{i=1}^{N} ∑_{s_i=−1}^{+1}

where each sum has only two terms, of course. The energy in exp(−E_State/k_BT) now must be prescribed. Since we want to model a magnet, where all the spins are +1 or −1 by and large, we make the spin of site i want to be oriented the same as its neighbors. That is, for site i,

Energy of site i = ∑_{neighbors of i} (S_i − S_neighbor)² = ∑_{neighbors of i} (S_i² − 2 S_i S_neighbor + S_neighbor²)

But S_i² = 1 always, so up to a constant (additive and multiplicative),

Energy of site i = −∑_{neighbors of i} S_i S_neighbor
On a square lattice, the neighbors of i are the spins surrounding it (fig. 14.3).

Figure 14.3: Four nearest neighbors on a square lattice: the middle spin is i, and its neighbors are to the north, south, east, and west.

As said before, the idea is that the microscopic constituents (the spins) don't know how big the system is, so they interact locally. The choice of the 4 spins above, below, and to the right and left as "nearest neighbors" is a convention. The spins on the diagonals (NW, NE, SE, SW) are called "next–nearest neighbors". It doesn't actually matter how many are included, so long as the interaction is local. The total energy for all spins is
E_State = −J ∑_{〈ij〉} S_i S_j

where J is a positive interaction constant, and 〈ij〉 denotes a sum over all i and j = 1, 2, ...N, provided i and j are nearest neighbors, with no double counting. Note

∑_{〈ij〉} = ∑_{i=1}^{N} ∑_{j=1}^{N} (neighbors, no double counting) = N × (4/2)
If there are "q" nearest neighbors, corresponding to a different interaction, or a different lattice (say hexagonal), or another dimension of space,

∑_{〈ij〉} = (q/2) N

So the largest (negative) energy is E = −(qJ/2)N. If there is a constant external magnetic field H, which tends to align spins, the total energy is
E_State = −J ∑_{〈ij〉} S_i S_j − H ∑_i S_i

and the partition function is

Z = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1} e^{(J/k_BT) ∑_{〈jk〉} S_j S_k + (H/k_BT) ∑_j S_j} = Z(N, J/k_BT, H/k_BT)
As described above, the number of nearest neighbors is q = 4 for a two–dimensional square lattice. For three dimensions, q = 6 on a simple cubic lattice. The one–dimensional Ising model has q = 2.
The one–dimensional model was solved by Ising for his doctoral thesis (we will solve it below). The two–dimensional model was solved by Onsager, for H = 0, in one of the premiere works of mathematical physics. We will not give that solution; you can read it in Landau and Lifshitz's "Statistical Physics". They give a readable, physical, but still difficult treatment. The three–dimensional Ising model has not been solved.
The most interesting result of the Ising model is that it exhibits broken symmetry for H = 0. In the absence of a field, the energy of a state, and the partition function, have an exact symmetry

S_i ↔ −S_i

(The group is Z₂.) Hence (?)

〈S_i〉 ↔ −〈S_i〉

and

〈S_i〉 =? 0
In fact, for dimensions d = 2 and 3 this is not true. The exact result in d = 2 is surprisingly simple: the magnetization per spin is

m = 〈S_i〉 = (1 − cosech⁴(2J/k_BT))^{1/8},  T < T_c
m = 〈S_i〉 = 0,  T > T_c

The critical temperature T_c ≈ 2.269 J/k_B is where (1 − cosech⁴(2J/k_BT))^{1/8} vanishes.
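The vanishing point can be checked in a couple of lines, using the standard form of Onsager's result (units J = k_B = 1):

```python
import math

# m vanishes when cosech^4(2J/k_B T) = 1, i.e. sinh(2J/k_B Tc) = 1,
# so k_B Tc = 2J / ln(1 + sqrt(2)).  Here J = k_B = 1.
Tc = 2.0 / math.asinh(1.0)     # asinh(1) = ln(1 + sqrt(2))
print(round(Tc, 3))            # 2.269

def m(T):
    s = math.sinh(2.0 / T)
    return (1.0 - s ** -4) ** 0.125 if s > 1.0 else 0.0

print(round(m(2.0), 3), m(2.5))   # nonzero below Tc, zero above
```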
A rough and ready understanding of this transition is easy to get. Consider the
Figure 14.4: Two–dimensional square lattice Ising model magnetization per spin, m = 〈S_i〉, rising from 0 at T_c toward 1 as T → 0.
Helmholtz free energy,

F = E − TS

where S is now the entropy. At low temperatures, F is minimized by minimizing E. The largest negative value E has is −(q/2)JN, when all spins are aligned +1 or −1. So at low temperatures (and rigorously at T = 0), 〈S_i〉 → +1 or 〈S_i〉 → −1. At high temperatures, entropy maximization minimizes F. Clearly entropy is maximized by having the spins helter–skelter random: 〈S_i〉 = 0.
The transition temperature between 〈S_i〉 = 1 and 〈S_i〉 = 0 should be when the temperature is comparable to the energy per spin, k_BT = O((q/2)J), which is in the right ball park. There is no hand–waving explanation for the abrupt increase of the magnetization per spin m = 〈S_i〉 at T_c. Note it is not like a paramagnet, which shows a gradual decrease in magnetization with increasing temperature (in the presence of an external field H).

Figure 14.5: A paramagnet with H > 0: m = 〈S_i〉 decreases gradually with T; no broken symmetry.

One can think of it (or at least remember it) this way: at T_c, one breaks a fundamental symmetry of the system. In some sense, one breaks a symmetry of nature. One can't break such a symmetry in a tentative wishy–washy way. It requires a firm, determined hand. That is, an abrupt change.
The phase diagram for the Ising model (or indeed for any magnet) looks like fig. 14.6.

Figure 14.6: Phase diagram of the Ising model. Only on the line H = 0, T < T_c is there spontaneous magnetization (and coexisting phases); in the m–T plane there is a forbidden region for a single phase below T_c, between the coexistence branches ending at m = ±1; for H > 0 (or H < 0), m is a smooth function of T.

It is very convenient that H is so intimately related with the broken
symmetry of the Ising model. Hence it is very easy to understand that broken symmetry is for H = 0, not H ≠ 0. Things are more subtle in the lattice–gas version of the Ising model.
14.1 Lattice–Gas Model
In the lattice–gas model, one still has a lattice labeled i = 1, 2, ...N, but now there is a concentration variable c_i = 0 or 1 at each site. The simplest model of interaction is the lattice–gas energy

E_LG = −ϑ ∑_{〈ij〉} c_i c_j − µ ∑_i c_i

where ϑ is an interaction constant and µ is the constant chemical potential. One can think of this as a model of a binary alloy (where c_i is the local concentration of one of the two species), or a liquid–vapour system (where c_i is the local density).
This is very similar to the Ising model,

E_Ising = −J ∑_{〈ij〉} S_i S_j − H ∑_i S_i
To show they are exactly the same, let

S_i = 2c_i − 1

Then

E_Ising = −J ∑_{〈ij〉} (2c_i − 1)(2c_j − 1) − H ∑_i (2c_i − 1)

       = −4J ∑_{〈ij〉} c_i c_j + 4J ∑_{〈ij〉} c_i + Const. − 2H ∑_i c_i + Const.

       = −4J ∑_{〈ij〉} c_i c_j + 4J (q/2) ∑_i c_i − 2H ∑_i c_i + Const.

And

E_Ising = −4J ∑_{〈ij〉} c_i c_j − 2(H − Jq) ∑_i c_i + Const.
Since the additive constant is unimportant, this is the same as E_LG, provided

ϑ = 4J

and

µ = 2(H − Jq)

Note that the phase transition here is not at some convenient point µ = 0: it is at H = 0, which is the awkward point µ = −2Jq. This is one reason why the liquid–gas transition is sometimes said — incorrectly — to involve no change of symmetry; it has the same Z₂ symmetry as the Ising model. But the external field must have a carefully tuned value to get that possibility.
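The mapping is easy to verify by brute force. A sketch (hypothetical parameters; a small periodic 1–d ring, so q = 2):

```python
import itertools

# Check: with S_i = 2 c_i - 1, theta = 4J and mu = 2(H - Jq), the Ising
# and lattice-gas energies differ only by a state-independent constant.
N, J, H = 6, 1.3, 0.7
bonds = [(i, (i + 1) % N) for i in range(N)]   # ring, no double counting
q = 2

def E_ising(S):
    return -J * sum(S[i] * S[j] for i, j in bonds) - H * sum(S)

theta, mu = 4.0 * J, 2.0 * (H - J * q)

def E_lg(c):
    return -theta * sum(c[i] * c[j] for i, j in bonds) - mu * sum(c)

diffs = {round(E_ising(tuple(2 * ci - 1 for ci in c)) - E_lg(c), 9)
         for c in itertools.product((0, 1), repeat=N)}
print(len(diffs))    # 1: the same constant for every configuration
```

Every one of the 2^N configurations gives the same energy difference, which is exactly the additive constant dropped above.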
Figure 14.7: The lattice–gas transition: a coexistence line at constant chemical potential terminating at T_c, separating 〈c〉 ≈ 1 from 〈c〉 ≈ 0, compared with the real liquid–gas transition, where the coexistence line P(T) ends at (P_c, T_c).
14.2 One–dimensional Ising Model
It is easy to solve the one–dimensional Ising model. Of course, we already know that it cannot exhibit phase coexistence, from the theorem we proved earlier. Hence we expect
〈Si〉 = m = 0
for all T > 0, where m is the magnetization per spin. The result and the method are quite useful, though.
Z = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1} e^{(J/k_BT) ∑_{j=1}^{N} S_j S_{j+1} + (H/k_BT) ∑_{j=1}^{N} S_j}
Note we have simplified the ∑_{〈ij〉} for one dimension. We need, however, a prescription for the spin S_{N+1}. We will use periodic boundary conditions, so that S_{N+1} ≡ S₁. This is called the Ising ring. You can also solve it, with a little
Figure 14.8: The Ising ring: spins 1, 2, 3, ..., N − 1, N with periodic boundary conditions.
more work, with other boundary conditions. Let the coupling constant be

k ≡ J/k_BT

and the external field be

h ≡ H/k_BT

Then
Z(k, h, N) = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1} e^{∑_{j=1}^{N} (k S_j S_{j+1} + h S_j)}

or, more symmetrically,

Z = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1} e^{∑_j [k S_j S_{j+1} + (h/2)(S_j + S_{j+1})]}

  = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1} ∏_{j=1}^{N} e^{k S_j S_{j+1} + (h/2)(S_j + S_{j+1})}
Let matrix elements of P be defined by

〈S_j|P|S_{j+1}〉 ≡ e^{k S_j S_{j+1} + (h/2)(S_j + S_{j+1})}

There are only four elements:

〈+1|P|+1〉 = e^{k+h}
〈−1|P|−1〉 = e^{k−h}
〈+1|P|−1〉 = e^{−k}
〈−1|P|+1〉 = e^{−k}

Or, in matrix form,

P = ( e^{k+h}  e^{−k} ; e^{−k}  e^{k−h} )
The partition function is

Z = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1} 〈S₁|P|S₂〉〈S₂|P|S₃〉 ... 〈S_N|P|S₁〉

  = ∑_{S₁=−1}^{+1} 〈S₁| P ( ∑_{S₂=−1}^{+1} |S₂〉〈S₂| ) P ( ∑_{S₃=−1}^{+1} |S₃〉〈S₃| ) ... P |S₁〉

But, clearly, the identity operator is

1 ≡ ∑_S |S〉〈S|
so,

Z = ∑_{S₁=−1}^{+1} 〈S₁|P^N|S₁〉 = Trace(P^N)

Say the eigenvalues of P are λ₊ and λ₋. Then

Z = λ₊^N + λ₋^N

Finally, say λ₊ > λ₋. Hence, as N → ∞,

Z = e^{−F/k_BT} = λ₊^N

and

F = −Nk_BT ln λ₊

It remains only to find the eigenvalues of P. We get the eigenvalues from
det(λ − P) = 0

This gives

(λ − e^{k+h})(λ − e^{k−h}) − e^{−2k} = 0

or,

λ² − 2λ e^k cosh h + e^{2k} − e^{−2k} = 0

The solutions are

λ± = [2 e^k cosh h ± √(4 e^{2k} cosh²h − 4 e^{2k} + 4 e^{−2k})] / 2 = e^k [cosh h ± √(cosh²h − 1 + e^{−4k})]
So,

λ₊ = e^k (cosh h + √(sinh²h + e^{−4k}))

is the larger eigenvalue. The free energy is

F = −Nk_BT [k + ln(cosh h + √(sinh²h + e^{−4k}))]
where k = J/k_BT and h = H/k_BT. The magnetization per spin is

m = −(1/N) ∂F/∂H = −(1/Nk_BT) ∂F/∂h

Taking the derivative and simplifying, one obtains

m(h, k) = sinh h / √(sinh²h + e^{−4k})

As anticipated, m(h = 0, k) = 0. There is magnetization for h ≠ 0, although there is no broken symmetry of course. There is a phase transition at T = 0.
Figure 14.9: Magnetization of the 1–d Ising model versus H: steep near H = 0 at low T, smooth at high T, saturating at m = ±1.
14.3 Mean–Field Theory
There is a very useful approximate treatment of the Ising model. We will nowwork out the most simple such “mean–field” theory, due in this case to Braggand Williams.
The idea is that each spin is influenced by the local magnetization, or equiva-lently local field, of its surrounding neighbors. In a magnetized, or demagnetizedstate, that local field does not fluctuate very much. Hence, the idea is to replacethe local field, determined by a local magnetization, with an average field, de-termined by the average magnetization. “Mean” field means average field, notnasty field.
It is easy to solve the partition function if there is only a field, and nointeractions. This is a (very simple) paramagnet. Its energy is
E_State = −H ∑_i S_i

and,

Z = ∏_i ∑_{S_i=−1}^{+1} e^{(H/k_BT) ∑_j S_j}

  = ( ∑_{S₁=±1} e^{(H/k_BT) S₁} ) ... ( ∑_{S_N=±1} e^{(H/k_BT) S_N} )

  = ( ∑_{S=±1} e^{(H/k_BT) S} )^N

so,

Z = (2 cosh(H/k_BT))^N

and

F = −Nk_BT ln(2 cosh(H/k_BT))
so, since

m = −(1/N) ∂F/∂H

then

m = tanh(H/k_BT)
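The factorization can be checked in a few lines by brute force. A sketch (h = H/k_BT chosen arbitrarily):

```python
import itertools, math

# Noninteracting paramagnet: Z factorizes into (2 cosh h)^N, h = H/k_BT.
N, h = 5, 0.4
Z_brute = sum(math.exp(h * sum(S))
              for S in itertools.product((-1, 1), repeat=N))
Z_exact = (2.0 * math.cosh(h)) ** N
print(abs(Z_brute - Z_exact) < 1e-9 * Z_exact)   # True

m = math.tanh(h)    # magnetization per spin, m = -(1/N) dF/dH
print(round(m, 3))
```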
For the interacting Ising model we need to find the effective local field H_i at site i.

Figure 14.10: Noninteracting paramagnet solution, m = tanh(H/k_BT): for fixed H > 0, m decreases smoothly with T.

The Ising model is
E_State = −J ∑_{〈ij〉} S_i S_j − H ∑_i S_i

We want to replace this, in the partition function weighting, with

E_State = −∑_i H_i S_i

Clearly

H_i = −∂E/∂S_i

Thus, from the Ising model,

−∂E/∂S_i = J ∑_{〈jk〉} (S_j δ_{k,i} + δ_{j,i} S_k) + H = 2J ∑_{j neighbors of i (no double counting)} S_j + H

Now let's get rid of the 2 and the double–counting restriction. Then

H_i = J ∑_{j neighbors of i} S_j + H
is the effective local field. To do mean–field theory we let
Hi ≈ 〈Hi〉
where

〈H_i〉 = J ∑_{j neighbors} 〈S_j〉 + H

but all spins are the same on average, so

〈H_i〉 = qJ〈S〉 + H

and the weighting of states is given by

E_State = −(qJ〈S〉 + H) ∑_i S_i

where the prefactor qJ〈S〉 + H is the mean field, 〈H〉.
For example, the magnetization is

m = 〈S〉 = ∑_{S=±1} S e^{〈H〉S/k_BT} / ∑_{S=±1} e^{〈H〉S/k_BT}

where 〈S〉 = 〈S₁〉, and contributions due to S₂, S₃, ... cancel from top and bottom. So

m = tanh(〈H〉/k_BT)

or

m = tanh((qJm + H)/k_BT)
This gives a critical temperature

k_BT_c = qJ

as we will now see. Consider H = 0:

m = tanh(m/(T/T_c))

Close to T_c, m is close to zero, so we can expand the tanh function:

m = m/(T/T_c) − (1/3) m³/(T/T_c)³ + ...
One solution is m = 0. This is stable above T_c, but unstable for T < T_c. Consider T < T_c, where m is small but nonzero. Dividing by m gives

1 = 1/(T/T_c) − (1/3) m²/(T/T_c)³ + ...
Rearranging,

m² = 3(T/T_c)²(1 − T/T_c) + ...

or,

m = √3 (T/T_c)(1 − T/T_c)^{1/2}

for small m, T < T_c. Note that this solution does not exist for T > T_c. For T < T_c, but close to T_c, we have

m ∝ (1 − T/T_c)^β

where the critical exponent β = 1/2.

Figure 14.11: Close to T_c, mean field gives m ∝ (1 − T/T_c)^{1/2}, for H = 0.

All this is really saying is that m is a parabola near T_c, as drawn in fig. 14.12 (within mean field theory).

Figure 14.12: Parabola near m = 0, T = T_c, H = 0: the "classical" exponent β = 1/2 is the same as a parabola.

This is interesting, and useful, and easy to generalize, but only qualitatively correct. Note
kBTc = qJ =
2J , d = 1
4J , d = 2
6J , d = 3
The result for d = 1 is wildly wrong, since T_c = 0. In d = 2, k_B T_c ≈ 2.27J, so the estimate is poor. Also, if we take the exact result from Onsager's solution, m ∝ (1 − T/T_c)^{1/8}, so β = 1/8 in d = 2. In three dimensions k_B T_c ∼ 4.5J and estimates are β = 0.313.... Strangely enough, for d ≥ 4, β = 1/2 (although there are logarithmic corrections in d = 4)!
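The mean–field equation m = tanh[m/(T/T_c)] is easy to solve numerically, and doing so confirms the β = 1/2 behavior just derived. A minimal sketch (the fixed–point solver and the temperature values are my own choices, not from the notes):

```python
import math

def solve_m(t):
    """Solve the mean-field equation m = tanh(m / t), with t = T/Tc,
    by fixed-point iteration starting from the saturated value m = 1."""
    m = 1.0
    for _ in range(100000):
        m_new = math.tanh(m / t)
        if abs(m_new - m) < 1e-15:
            break
        m = m_new
    return m

# Just below Tc, the solution should follow m ≈ 3^(1/2) (T/Tc) (1 - T/Tc)^(1/2)
t = 0.99
m = solve_m(t)
asymptote = math.sqrt(3) * t * math.sqrt(1 - t)
```

Above T_c the same iteration collapses to m = 0, the only solution there.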
For T = T_c, but H ≠ 0, we have

m = tanh(m + H/(k_B T_c))

or

H = k_B T_c (tanh⁻¹ m − m)

Expanding gives

H = k_B T_c (m + (1/3)m³ + ... − m)

so

H = (k_B T_c/3) m³
for m close to zero, at T = T_c. This defines another critical exponent:

Figure 14.13: Shape of the critical isotherm at T = T_c; it is a cubic, H ∝ m³, in classical theory.

H ∝ m^δ

for T = T_c, where δ = 3 in mean–field theory. The susceptibility is
χ_T ≡ (∂m/∂H)_T

This is like the compressibility in liquid–gas systems,

κ_T = −(1/V)(∂V/∂P)_T

Note the correspondence between density and magnetization per spin, and between pressure and field. The susceptibility "asks the question": how easy is it to magnetize something with some small field? A large χ_T means the material is very "susceptible" to being magnetized.
Also, as an aside, the "sum rule" for liquids,

∫ d^d r ⟨Δn(r)Δn(0)⟩ = n² k_B T κ_T

has an analog for magnets:

∫ d^d r ⟨Δm(r)Δm(0)⟩ ∝ χ_T

We do not need the proportionality constant. From above, we have

m = tanh[(k_B T_c m + H)/(k_B T)]
or,

H = k_B T tanh⁻¹ m − k_B T_c m

and

(∂H/∂m)_T = k_B T/(1 − m²) − k_B T_c

so

χ_T = 1/[k_B (T − T_c)]

close to T_c. Or

χ_T ∝ |T − T_c|^{−γ}

where γ = 1. This is true for T → T_c⁺, or T → T_c⁻. To get the energy, and so
Figure 14.14: The susceptibility χ_T diverges near T_c.
the heat capacity, consider

⟨E⟩ = ⟨−J ∑_⟨ij⟩ S_i S_j − H ∑_i S_i⟩

All the spins are independent, in mean–field theory, so

⟨S_i S_j⟩ = ⟨S_i⟩⟨S_j⟩

Hence

⟨E⟩ = −(Jq/2) N m² − HNm
The specific heat is

c = (1/N)(∂⟨E⟩/∂T)

Near T_c, and at H = 0,

m² ∝ { (T_c − T), T < T_c; 0, T > T_c }

So,

⟨E⟩ = { −(const.)(T_c − T), T < T_c; 0, T > T_c }

and

c = { const., T < T_c; 0, T > T_c }
If we include temperature dependence away from T_c, the specific heat looks more like a sawtooth, as drawn in fig. 14.15. This discontinuity in c, a second
Figure 14.15: Discontinuity in C in mean field theory.
derivative of the free energy, is one reason continuous phase transitions were called "second order". Now it is known that this is an artifact of mean field theory. It is written as

c ∝ |T − T_c|^{−α}

where the critical exponent α = 0 (disc.).
To summarize, near the critical point,

m ∼ (T_c − T)^β, T < T_c, H = 0

H ∼ m^δ, T = T_c

χ_T ∼ |T − T_c|^{−γ}, H = 0

and

c ∼ |T − T_c|^{−α}, H = 0
The exponents are:

      mean field   d = 2 (exact)   d = 3 (num.)   d ≥ 4
α     0 (disc.)    0 (log)         0.125...       0 (disc.)
β     1/2          1/8             0.313...       1/2
γ     1            7/4             1.25...        1
δ     3            15              5.0...         3
Something is happening at d = 4. We will see later that mean–field theory works for d ≥ 4. For now we will leave the Ising model behind.

Figure 14.16: Variation of critical exponent β with spatial dimension d: no transition in d = 1, nontrivial exponents in d = 2 and 3, and mean–field theory works for d ≥ 4.
Chapter 15

Landau Theory and
Ornstein–Zernike Theory
Nowadays, it is well established that, near the critical point, one can either solve the Ising model

E = −J ∑_⟨ij⟩ S_i S_j − H ∑_i S_i

with

Z = ∏_i ∑_{S_i=±1} e^{−E/k_B T}
or, the ψ⁴ field theory

F = ∫ d^d x { (K/2)(∇ψ)² + (r/2)ψ² + (u/4)ψ⁴ } − H ∫ d^d x ψ(x)

where K, u, and H are constants, and

r ∝ T − T_c

with the partition function

Z = ∫∫ Dψ(·) e^{−F}

where the k_B T factor has been set to k_B T_c, and incorporated into K, r, u and H (this is ok near T_c).
Look back in the notes for the discussion of the discrete Gaussian model versus the continuous drumhead model of interfaces. Although it may not look like it, the field theory model is a very useful simplification of the Ising model, wherein no important features have been lost.
To motivate their similarities, consider the following. First, the mapping between pieces involving the external field only requires getting used to notation:

Ising Model: −H ∑_{i=1}^N S_i  ⟺  ψ⁴ Field Theory: −H ∫ d^d x ψ(x)
One is simply the continuum limit of the other. The square gradient term is likewise straightforward, as we now see:

Ising Model: (J/2) ∑_{i=1}^N ∑_{j neighbors of i} (S_i − S_j)²  ⟺  ψ⁴ Field Theory: K ∫ d^d x (∇ψ)²

Using S_i² = 1, the left–hand side is

const. − J ∑_⟨ij⟩ S_i S_j
The terms harder to see are the ∫ d^d x [(r/2)ψ² + (u/4)ψ⁴] terms. Let us define

f(ψ) = (r/2)ψ² + (u/4)ψ⁴

This is the free energy for a homogeneous system in the ψ⁴ field theory (that is, where ∇ψ ≡ 0), in the absence of an external field, per unit volume. Note

r > 0, T > T_c
r < 0, T < T_c
So we have the double well form. This may look familiar; it was originally due

Figure 15.1: "Free energy" per volume f(ψ) (no ∇ψ, no H): a single well for T > T_c (r > 0), a double well for T < T_c (r < 0).

to Van der Waals. (In fact, the whole ψ⁴ theory is originally due to Van der Waals.)
Note that, since ψ is determined by the minimum of F, this single or double well restricts ψ to be within a small range. Otherwise, ψ could have any value, since

∫∫ Dψ(·) = ∏_x ∫_{−∞}^{∞} dψ(x)
These unphysical values of ψ are suppressed in the free energy. In the Ising model, the restriction is S_i = ±1. We can lift this restriction with a Kronecker delta:

∏_i ∑_{S_i=−1}^{+1} = ∏_i ∑_{S_i=−∞}^{∞} δ_{S_i²−1} = ∏_i ∑_{S_i=−∞}^{∞} e^{−ln(1/δ_{S_i²−1})}

So, although this is awkward and artificial, it is sort of the same thing. Expanding it out, for small S_i²,

−ln(1/δ_{S_i²−1}) ≈ S_i² − O(S_i⁴)

the double–well form. This can be done in a very precise way near T_c. Here I am just trying to give an idea of what is going on.
So the ψ⁴ theory is similar to the Ising model, where ψ(x) is the local magnetization. We will be concerned with the critical exponents α, β, γ, δ, η, and ν, defined as follows:

F(T) ∼ |T − T_c|^{2−α}, H = 0

so that C = ∂²F/∂T² ∼ |T − T_c|^{−α},

ψ ∼ (T_c − T)^β, H = 0, T < T_c

(∂ψ/∂H)_T = χ_T ∼ |T − T_c|^{−γ}, H = 0

ψ ∼ H^{1/δ}, T = T_c

⟨ψ(x)ψ(0)⟩ ∼ 1/x^{d−2+η}, T = T_c, H = 0

⟨ψ(x)ψ(0)⟩ ∼ { e^{−x/ξ}, T > T_c; ⟨ψ⟩² + O(e^{−x/ξ}), T < T_c }

and

ξ ∼ |T − T_c|^{−ν}, H = 0
This is the first time we have mentioned η and ν in this context. Landau motivates the ψ⁴ theory in a more physical way. He asserts that a continuous phase transition involves going from a disordered phase, with lots of symmetry, to an ordered phase, with a lower symmetry. To describe the transition we need a variable, called the order parameter, which is nonzero in the ordered phase only. For example, in a magnet, the order parameter is the magnetization. In general, identifying the order parameter, and the symmetry broken at T_c, can be quite subtle. This will take us too far afield, so we will only consider a scalar order parameter, where the symmetry broken is Z₂, that is, where the ψ ↔ −ψ symmetry is spontaneously broken. (Note that a magnetic spin confined to dimension d = 2, wherein all directions are equivalent, has a two–component order parameter (ψ_x, ψ_y). The group is U(1). A spin in d = 3 has (ψ_x, ψ_y, ψ_z), if all directions are equivalent; the group is O(3), I think.)

So Landau argues that the free energy to describe the transition must be an analytic function of ψ. Hence
F = ∫ d^d x f(ψ)

where

f(ψ) = (r/2)ψ² + (u/4)ψ⁴ + ...

No odd terms appear, so that the ψ ↔ −ψ symmetry is preserved. In the event an external field breaks the symmetry, another term is added to the free energy, namely

−H ∫ d^d x ψ
At high temperatures, in the disordered phase, we want ψ = 0. Hence

r > 0

for T > T_c. However, at low temperatures, in the ordered phase, we must have ψ ≠ 0. Hence

r < 0

(To make sure everything is still well defined, we need u > 0. There is no need for u to have any special dependence on T near T_c, so we take it to be a constant. Or, if you prefer,

u(T) ≈ u(T_c) + (const.)(T − T_c) + ...

near T_c, where the correction is unimportant. So we simply set u to be constant.) To find r's evident T dependence, we Taylor expand:

r = r₀(T − T_c) + ...
keeping only the lowest–order term, where r₀ > 0. Finally, we know that spatial fluctuations should be suppressed. Expanding in powers of ∇ and ψ, the lowest order nontrivial term is

(K/2) ∫ d^d x |∇ψ|²

where K > 0. Similar terms, like ∫ ψ∇²ψ, turn into this after integration by parts.
Note that this term means inhomogeneities (∇ψ ≠ 0) add free energy. Now we're done. The Landau free energy is

F = ∫ d^d x { f(ψ) + (K/2)|∇ψ|² } − H ∫ d^d x ψ

with

f(ψ) = (r/2)ψ² + (u/4)ψ⁴
In high–energy physics, this is called a ψ⁴ field theory. The ∇² term gives kinetic energy; r ∝ mass of particle; and ψ⁴ gives interactions. Van der Waals wrote down the same free energy (the case with K ≡ 0 is what is given in the textbooks). He was thinking of liquid–gas transitions, so ψ = n − n_c, where n is density and n_c is the density at the critical point. Finally, from the theory of thermodynamic fluctuations, we should expect that (in the limit in which the square gradient and quartic terms can be neglected)

r ∝ 1/κ_T

or equivalently,

r ∝ 1/χ_T
It is worth emphasizing that the whole machinery of quadratic, quartic, and square gradient terms is necessary. For example, note that the critical point is defined thermodynamically as

(∂P/∂V)_{T_c} = 0

and

(∂²P/∂V²)_{T_c} = 0

in a liquid–gas system. In a magnet:

(∂H/∂m)_{T_c} = 0, and (∂²H/∂m²)_{T_c} = 0
Trying to find the equation of state at T = T_c in a liquid, we expand:

P(n, T_c) = P_c + (∂P/∂n)_{T_c}(n − n_c) + (1/2)(∂²P/∂n²)_{T_c}(n − n_c)² + (1/6)(∂³P/∂n³)_{T_c}(n − n_c)³ + ...

For a magnet (where H_c = 0 and m_c = 0),

H(m, T_c) = (∂H/∂m)_{T_c} m + (1/2)(∂²H/∂m²)_{T_c} m² + (1/6)(∂³H/∂m³)_{T_c} m³ + ...
So, at the critical point for both systems — from thermodynamics and analyticity only — we have

(P − P_c) ∝ (n − n_c)^δ

and

H ∝ m^δ

where

δ = 3

Experimentally, δ ≈ 5.0... in three dimensions, while δ = 15 in d = 2. So we have a big subtle problem to solve!
15.1 Landau Theory
We now solve for the homogeneous states of the system. In that case ∇ψ = 0, so we need only consider

F/V = f(ψ) − Hψ = (r₀/2)(T − T_c)ψ² + (u/4)ψ⁴ − Hψ

For H = 0, we find the solution by minimizing F/V (or f):

∂f/∂ψ = 0 = r₀(T − T_c)ψ + uψ³
There are 3 solutions:

ψ = 0, (stable for T > T_c)

ψ = ±[r₀(T_c − T)/u]^{1/2}, (stable for T < T_c)

(To check stability, consider ∂²f/∂ψ².) For H ≠ 0, only one solution is stable for T < T_c. You can check this yourself.

∂/∂ψ (f(ψ) − Hψ) = 0 = ∂f/∂ψ − H

or

H = r₀(T − T_c)ψ + uψ³
Note that this gives

1/χ_T = (∂H/∂ψ)_T = r₀(T − T_c)

close to the critical point, or

χ_T ∝ |T − T_c|^{−1}
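These mean–field predictions can be checked directly by solving the equation of state H = r₀(T − T_c)ψ + uψ³ numerically. A sketch with illustrative constants r₀ = u = T_c = 1 (my choices, not from the notes):

```python
import numpy as np

R0, U, TC = 1.0, 1.0, 1.0  # illustrative Landau constants

def psi_of(T, H):
    """Stable solution of H = r0 (T - Tc) psi + u psi^3,
    i.e. the real root that minimizes f(psi) - H psi."""
    roots = np.roots([U, 0.0, R0 * (T - TC), -H])
    real = roots[np.abs(roots.imag) < 1e-9].real
    f = 0.5 * R0 * (T - TC) * real**2 + 0.25 * U * real**4 - H * real
    return real[np.argmin(f)]

# Below Tc: |psi| = [r0 (Tc - T)/u]^(1/2).  Above Tc: chi = dpsi/dH = 1/[r0 (T - Tc)].
psi0 = abs(psi_of(0.9, 0.0))
chi = (psi_of(1.1, 1e-6) - psi_of(1.1, 0.0)) / 1e-6
```

At T = 0.9 this gives |ψ| ≈ 0.1^{1/2}, and at T = 1.1 the numerical susceptibility is ≈ 1/0.1 = 10, as the formulas above require.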
Also, at T = T_c, this gives

H ∝ ψ³

Finally, note that (at H = 0)

f = (1/2)rψ² + (1/4)uψ⁴

but

ψ² = { 0, T > T_c; −r/u, T < T_c }

so,

f = { 0, T > T_c; −(1/4)(r²/u), T < T_c }

or

f ∝ { 0, T > T_c; (T_c − T)², T < T_c }

but c ∝ ∂²f/∂T², so the heat capacity

c ∝ { 0, T > T_c; const., T < T_c }
In summary, these are the same results as we obtained from the mean field theory of the Ising model. The relationship between analyticity of equations of state, and mean–field theory, is worth thinking about.

α: 0 (disc.)    β: 1/2    γ: 1    δ: 3
15.2 Ornstein–Zernike Theory of Correlations

Ornstein and Zernike give a physically plausible approach to calculating correlation functions. It works well (except near the critical point).
In a bulk phase, say T ≫ T_c (we shall only consider H = 0), ψ has an average value of zero:

⟨ψ⟩ = 0

So we expect ψ⁴ ≪ ψ², and, above T_c,

F ≈ ∫ d^d x [ (K/2)(∇ψ)² + (r/2)ψ² ]
Far below T_c, we expect ψ to be close to its average value

⟨ψ⟩ = ±(−r/u)^{1/2}

where r = r₀(T − T_c) is negative below T_c. Let

Δψ ≡ ψ − ⟨ψ⟩

Then

∇ψ = ∇(Δψ + ⟨ψ⟩) = ∇(Δψ)

Also,

ψ² = (Δψ + ⟨ψ⟩)² = (Δψ)² + 2⟨ψ⟩Δψ + ⟨ψ⟩²

and

ψ⁴ = (Δψ + ⟨ψ⟩)⁴ = (Δψ)⁴ + 4⟨ψ⟩(Δψ)³ + 6⟨ψ⟩²(Δψ)² + 4⟨ψ⟩³(Δψ) + ⟨ψ⟩⁴

Hence, grouping terms by powers of Δψ,

(r/2)ψ² + (u/4)ψ⁴ = const. + (r⟨ψ⟩ + u⟨ψ⟩³)Δψ + [r/2 + (3u/2)⟨ψ⟩²](Δψ)² + O(Δψ)³
But ⟨ψ⟩² = −r/u, so the linear term vanishes and

(r/2)ψ² + (u/4)ψ⁴ = const. − r(Δψ)² + O((Δψ)³)

But (Δψ)³ ≪ (Δψ)², so, below T_c,

F ≈ ∫ d^d x [ (K/2)(∇Δψ)² + |r|(Δψ)² ]
where we have used the fact that r is negative below T_c to write −r = |r|. Both approximate free energies are pretty much the same as the free energy for the interface model:

(σ/2) ∫ d^{d−1}x (∂h/∂x)²
They can be solved exactly the same way. The definitions of Fourier transforms are the same (except for d^{d−1}x → d^d x, and L^{d−1} → L^d, for example), and the arguments concerning translational invariance of space, and isotropy of space, still hold. Let

C(x) ≡ ⟨ψ(x)ψ(0)⟩

Then we have (putting the k_B T back in e^{−F/k_B T})

C(k) = k_B T/(Kk² + r), T > T_c
and since

⟨Δψ(x)Δψ(0)⟩ = ⟨ψ(x)ψ(0)⟩ − |r|/u

using ⟨ψ⟩² = |r|/u, we have

C(k) = k_B T/(Kk² + 2|r|) − (|r|/u)(2π)^d δ(k), T < T_c

These can be Fourier transformed back to real space. As an aside, remember the Green function for the Helmholtz equation satisfies

(∇² − ξ⁻²)G(x) = −δ(x)

Hence

G(k) = 1/(k² + ξ⁻²)
The Green function is given in many books:

G(x) = { (ξ/2) e^{−x/ξ}, d = 1; (1/2π) K₀(x/ξ), d = 2; (1/4πx) e^{−x/ξ}, d = 3 }

for example. Recall the modified Bessel function of zeroth order satisfies K₀(y → 0) = −ln y, and K₀(y → ∞) = (π/2y)^{1/2} e^{−y}. This is more than we need. In discussions of critical phenomena, most books state that if
G(k) ∝ 1/(k² + ξ⁻²)

then

G(x) ∝ (1/x^{d−2}) e^{−x/ξ}
The reason for the seeming sloppiness is that proportionality constants are unimportant, and it is only long–length–scale phenomena we are interested in. Indeed, we should only trust our results for large x, or equivalently small k (since we did a gradient expansion in ∇ψ). In fact, so far as the Fourier transforms go, here is what we can rely on: if

G(k) ∝ 1/(k² + ξ⁻²)

then

G(x) ∝ e^{−x/ξ}

unless ξ = ∞, in which case

G(x) ∝ 1/x^{d−2}
for large x.
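The d = 1 entry in the Green function table above is easy to verify numerically: the Fourier transform of 1/(k² + ξ⁻²) is (ξ/2)e^{−|x|/ξ}. A rough numerical check (the grid and momentum cutoff are my own choices):

```python
import numpy as np

XI = 2.0
k = np.linspace(-400.0, 400.0, 2_000_001)  # wide, fine k-grid
dk = k[1] - k[0]

def G(x):
    """G(x) = (1/2pi) * integral dk cos(kx) / (k^2 + xi^-2), done as a Riemann sum."""
    return np.sum(np.cos(k * x) / (k**2 + XI**-2)) * dk / (2 * np.pi)

# compare with the tabulated d = 1 result (xi/2) e^{-x/xi}
exact_1 = (XI / 2) * np.exp(-1.0 / XI)  # at x = 1
exact_3 = (XI / 2) * np.exp(-3.0 / XI)  # at x = 3
```

The truncation of the k-integral at |k| = 400 costs far less than the tolerance used below, since the integrand falls off as 1/k².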
Using these results, we finally obtain the correlation function, up to a constant of proportionality:

C(x) = { e^{−x/ξ}, T > T_c; 1/x^{d−2}, T = T_c; ⟨ψ⟩² + O(e^{−x/ξ}), T < T_c }

where the correlation length is

ξ = { (K/r)^{1/2}, T > T_c; (K/2|r|)^{1/2}, T < T_c }
Note that the result for T < T_c,

C(x) = ⟨ψ(x)ψ(0)⟩ = ⟨ψ⟩² + O(e^{−x/ξ})

is a rewriting of

⟨Δψ(x)Δψ(0)⟩ = e^{−x/ξ}

In cartoon form: figure 15.2.

Figure 15.2: Correlation length ξ ∼ |T − T_c|^{−1/2}, or ξ ∼ 1/|r|^{1/2}.

Note the correlation length satisfies ξ ∼ 1/|r|^{1/2} above and below T_c, so

ξ ∼ |T − T_c|^{−1/2}
in Ornstein–Zernike theory. The correlation function becomes power–law–like at T_c:

Figure 15.3: Correlation function above, at, and below T_c.

C(x) = 1/x^{d−2}, T = T_c, (H = 0)
Since

ξ ∼ |T − T_c|^{−ν}

and

C(x) ∼ 1/x^{d−2+η}

we have ν = 1/2 and η = 0. This completes the list of critical exponents. In summary:
      Mean Field Ising   Landau O–Z   d = 2 Ising   d = 3 Ising   d ≥ 4 Ising
α     0 (disc.)          0 (disc.)    0 (log)       0.125...      0
β     1/2                1/2          1/8           0.313...      1/2
γ     1                  1            7/4           1.25...       1
δ     3                  3            15            5.0...        3
η     0                  0            1/4           0.041...      0
ν     1/2                1/2          1             0.638...      1/2
So something new is needed.
Chapter 16
Scaling Theory
Historically, the field of phase transitions was in a state of turmoil in the 1960s. Sensible down–to–earth theories like the mean–field approach, Landau's approach, and Ornstein–Zernike theory were not working. A lot of work was put into refining these approaches, with limited success.
For example, the Bragg–Williams mean field theory is quite crude. Refinements can be made by incorporating more local structure. Pathria's book on Statistical Mechanics reviews some of this work. Consider the specific heat of the two–dimensional Ising model. Since Onsager's solution, this has been known exactly. It obeys C ∼ ln |T − T_c| near T_c. The refinements make for much better predictions of C away from T_c. At T_c, all mean field theories give C to be discontinuous. Comparing to the exact result shows this as well. (This is taken from Stanley's book, "Introduction to Phase Transitions and Critical Phenomena".)

Figure 16.1: Specific heat near T_c: crude mean field, better mean field, better still. All mean field theories give C ∼ discontinuous.

Figure 16.2:
Experiments consistently did not (and do not, of course) agree with Landau's prediction for the liquid–gas or binary liquid case. Landau gives

n − n_c ∼ (T_c − T)^β

with β = 1/2. Three–dimensional experiments gave β ≈ 0.3 (usually quoted as 1/3 historically), for many liquids. The most famous paper is by Guggenheim in J. Chem. Phys. 13, 253 (1945).
Figure 16.3:
This is taken from Stanley's book again. The phenomenon at T_c, where ξ → ∞, is striking.
Figure 16.4:
Here I show some (quite old) simulations of the Ising model from Stanley's book. Note the "fractally" structure near T_c. You can easily program this yourself if you are interested. Nowadays, a PC is much, much faster than one needs to study the two–dimensional Ising model. Strange things are also seen experimentally.
For a fluid, imagine we heat it through T_c, fixing the pressure so that it corresponds to phase coexistence (i.e., along the liquid–vapor line in the P–T diagram as drawn). The system looks quite different depending on temperature.

Figure 16.5: P–T diagram: heating along the liquid–vapor coexistence line.
It is transparent, except for the meniscus separating liquid and vapour, for T < T_c and T > T_c. But at T_c, it is milky looking. This is called "critical opalescence". The correlation length ξ, corresponding to fluctuations, is so big it scatters light.

Figure 16.6: At P ≡ P(liquid–vapor coexistence): transparent for T < T_c and T > T_c, but an opaque milky mess at T = T_c.
Rigorous inequalities were established with the six critical exponents. (I should mention that I am assuming any divergence is the same for T → T_c⁺ or T → T_c⁻, which is nontrivial, but turns out to be true.) For example, Rushbrooke proved

α + 2β + γ ≥ 2

Griffiths proved

α + β(1 + δ) ≥ 2

These are rigorous. Other inequalities were proved with some plausible assumptions. These all work for the exponents quoted previously — as equalities! In fact, all the thermodynamic inequalities turned out to be obeyed as equalities. These inequalities were very useful for ruling out bad theories and bad experiments.
Here is a list of all the independent relations between exponents:

α + 2β + γ = 2

γ = ν(2 − η)

γ = β(δ − 1)

and

νd = 2 − α

So it turns out there are only two independent exponents. The last equation is called the "hyperscaling" relation. It is the only one involving spatial dimension d. The others are called "scaling" relations. This is because the derivation of them requires scaling arguments.
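These four relations can be checked against Onsager's exact d = 2 exponents (quoted in the summary table above), using exact rational arithmetic:

```python
from fractions import Fraction as F

# exact d = 2 Ising exponents (Onsager)
alpha, beta, gamma, delta = F(0), F(1, 8), F(7, 4), F(15)
eta, nu, d = F(1, 4), F(1), 2

assert alpha + 2 * beta + gamma == 2   # Rushbrooke
assert gamma == nu * (2 - eta)         # Fisher
assert gamma == beta * (delta - 1)     # Widom
assert nu * d == 2 - alpha             # hyperscaling
```

With only two independent exponents, any pair (say η and ν) fixes all the rest through these relations.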
16.1 Scaling With ξ
Note that near T_c, ξ diverges. Also, correlation functions become power–law like at T_c, according to the approximate Ornstein–Zernike theory as well as the exact result for the Ising model in two dimensions.

So let us make the bold assertion that, since ξ → ∞ at T_c, the only observable information on a large length scale involves quantities proportional to ξ. That is, for example, all lengths scale with ξ.
Let us now derive the hyperscaling relation. The free energy is extensive, and has units of energy, so

F = k_B T V/ξ^d

One can think of V/ξ^d as the number of independent parts of the system. But

ξ ∼ |T − T_c|^{−ν}

so

F ∼ |T − T_c|^{νd}

But the singular behavior in F is from the heat capacity exponent α via

F ∼ |T − T_c|^{2−α}

Hence

νd = 2 − α
This back–of–the–envelope argument is the fully rigorous (or as rigorous as it comes) argument! The hyperscaling relation works for all systems, except the mean field solutions must have d = 4.
To derive the three scaling relations, we assert that correlations are scale invariant:

C(x, T, H) = (1/x^{d−2+η}) C̃(x/ξ, Hξ^y)
Note that, since blocks of spin are correlated on scale ξ, this means only a little H is needed to magnetize the system. So the effect of H is magnified by ξ. The factor is ξ^y, where y must be determined. In a finite–size system we would have

C(x, T, H, L) = (1/x^{d−2+η}) C̃(x/ξ, Hξ^y, L/ξ)

In any case, in the relation above,

ξ ∼ |T − T_c|^{−ν}

For ease of writing, let

t ≡ |T − T_c|
so that ξ ∼ t^{−ν}. Then we have

C(x, T, H) = (1/x^{d−2+η}) C̃(xt^ν, H/t^Δ)

where

Δ = yν
is called the "gap exponent". But the magnetic susceptibility satisfies the sum rule

χ_T ∝ ∫ d^d x C(x, T, H) = ∫ d^d x (1/x^{d−2+η}) C̃(xt^ν, H/t^Δ) = (t^ν)^{−2+η} ∫ d^d(xt^ν) C̃(xt^ν, H/t^Δ)/(xt^ν)^{d−2+η}
so,

χ_T(T, H) = t^{(−2+η)ν} χ̃(H/t^Δ)

where

∫_{−∞}^{∞} d^d(xt^ν) (1/(xt^ν)^{d−2+η}) C̃(xt^ν, H/t^Δ) ≡ χ̃(H/t^Δ)

But

χ(T, H = 0) = t^{−γ}

Hence

γ = (2 − η)ν

called Fisher's law.
Now note

(∂ψ/∂H)(T, H) = χ_T(T, H)

so integrating, we have the indefinite integral

ψ(T, H) = ∫ dH χ(T, H) = ∫ dH t^{−γ} χ̃(H/t^Δ) = t^{Δ−γ} ∫ d(H/t^Δ) χ̃(H/t^Δ) = t^{Δ−γ} ψ̃(H/t^Δ)
But

ψ(T, H = 0) = t^β

So

β = Δ − γ

and the gap exponent

Δ = β + γ

or equivalently, y = (β + γ)/ν. So we have

ψ(T, H) = t^β ψ̃(H/t^{β+γ})
Here is a neat trick for changing the dependence "out front" from t to H. We don't know what the function ψ̃ is. All we know is that it is a function of H/t^{β+γ}. Any factor of H/t^{β+γ} that we multiply or divide into ψ̃ leaves another unknown function, which is still a function of only H/t^{β+γ}. In particular, a crafty choice can cancel the t^β factor:

ψ(T, H) = t^β (H/t^{β+γ})^{β/(β+γ)} (H/t^{β+γ})^{−β/(β+γ)} ψ̃(H/t^{β+γ})

We call the product of the last two factors on the right–hand side ψ̄, and obtain

ψ(T, H) = H^{β/(β+γ)} ψ̄(H/t^{β+γ})

Note that z^{−β/(β+γ)} ψ̃(z) = ψ̄(z). Now
ψ(T_c, H) = H^{1/δ}

is the shape of the critical isotherm, so δ = 1 + γ/β, or

γ = β(δ − 1)

called the Widom scaling law.
Note that ψ = ∂F/∂H, so

F = ∫ dH ψ = ∫ dH t^β ψ̃(H/t^{β+γ}) = t^{2β+γ} ∫ d(H/t^{β+γ}) ψ̃(H/t^{β+γ})

or,

F(T, H) = t^{2β+γ} F̃(H/t^{β+γ})

But

F(T, 0) = t^{2−α}

so 2 − α = 2β + γ, or

α + 2β + γ = 2

which is Rushbrooke's law. The form for the free energy is

F(T, H) = t^{2−α} F̃(H/t^{β+γ})
Note that

lim_{H→0} ∂ⁿF/∂Hⁿ = t^{2−α} t^{−n(β+γ)}

or

F^{(n)} = t^{2−α−n(β+γ)}

So all derivatives of F are separated by a constant "gap" exponent (β + γ):

y_n = 2 − α − n(β + γ)

Figure 16.7: Gap exponent: y_n vs. n for the free energy (n = 0), magnetization (n = 1), and susceptibility (n = 2).
Finally, we will give another argument for the hyperscaling relation. It is not, however, as compelling as the one above, but it is interesting.
Near T_c, we assert that fluctuations in the order parameter are of the same order as ψ itself. Hence

⟨(Δψ)²⟩/⟨ψ⟩² = O(1)

This is the weak part of the argument, but it is entirely reasonable. Clearly

⟨ψ⟩² ∼ t^{2β}
To get ⟨(Δψ)²⟩, recall the sum rule

∫ d^d x ⟨Δψ(x)Δψ(0)⟩ = χ_T

The integral is only nonzero for x ≤ ξ. In that range,

⟨Δψ(x)Δψ(0)⟩ ∼ O(⟨(Δψ)²⟩)

Hence

∫ d^d x ⟨Δψ(x)Δψ(0)⟩ = ξ^d ⟨(Δψ)²⟩ = χ_T

or,

⟨(Δψ)²⟩ = χ_T ξ^{−d}
But

χ_T ∼ t^{−γ}

and

ξ ∼ t^{−ν}

so,

⟨(Δψ)²⟩ = t^{νd−γ}

The relative fluctuation therefore obeys

⟨(Δψ)²⟩/⟨ψ⟩² = t^{νd−γ−2β}

But if, as we argued,

⟨(Δψ)²⟩/⟨ψ⟩² = O(1)

we have

νd = γ + 2β
which is the hyperscaling relation. (To put it into the same form as before, use the Rushbrooke scaling law α + 2β + γ = 2. This gives νd = 2 − α.)
As a general feature,

⟨(Δψ)²⟩/⟨ψ⟩²

measures how important fluctuations are. We discussed this at the beginning of the course. Mean field theory works best when fluctuations are small. So we say that mean–field theory is correct (or at least self–consistent) if its predictions give

⟨(Δψ)²⟩/⟨ψ⟩² ≪ 1

This is called the Ginzburg criterion for the correctness of mean–field theory. But from above,

⟨(Δψ)²⟩/⟨ψ⟩² = t^{νd−γ−2β}

Putting in the results from mean–field theory (ν = 1/2, γ = 1, β = 1/2) gives

⟨(Δψ)²⟩/⟨ψ⟩² = t^{(d−4)/2}
Hence, mean–field theory is self-consistent for d > 4 as T → T_c (t → 0). Mean–field theory doesn't work for d < 4. This d = 4 is called the upper critical dimension: the dimension at or above which fluctuations can be neglected.
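The dimension count in the Ginzburg criterion is a one-liner to tabulate (mean–field exponents as above; the tabulation itself is just my illustration):

```python
NU, GAMMA, BETA = 0.5, 1.0, 0.5  # mean-field exponents

def fluct_exponent(d):
    """Exponent in <(d psi)^2>/<psi>^2 ~ t^(nu d - gamma - 2 beta) = t^((d-4)/2)."""
    return NU * d - GAMMA - 2 * BETA

for d in (2, 3, 4, 5):
    # negative exponent: relative fluctuations diverge as t -> 0
    print(d, fluct_exponent(d))
```

The sign flips at d = 4, reproducing the upper critical dimension.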
Chapter 17
Renormalization Group
The renormalization group is a theory that builds on scaling ideas. It gives a way to calculate critical exponents. The idea is that phase transition properties depend only on a few long wavelength properties. These are the symmetry of the order parameter and the dimension of space, for systems without long–range forces or quenched impurities. We can ignore the short length scale properties, but we have to do that "ignoring" in a controlled, careful way. That is what the renormalization group does.
The basic ideas were given by Kadanoff in 1966, in his "block spin" treatment of the Ising model. A general method, free of inconsistencies, was figured out by Wilson (who won the Nobel prize for his work). Wilson and Fisher figured out a very useful implementation of the renormalization group in the ǫ = 4 − d expansion, where the small parameter ǫ was eventually extrapolated to unity, and so d → 3.
At a critical point, Kadanoff reasoned that spins in the Ising model act together. So, he argued, one could solve the partition function recursively, by progressively coarse–graining on scales smaller than ξ. From the recursion relation describing scale changes, one could solve the Ising model. This would only be a solution on large length scales, but that is what is of interest for critical phenomena.
The partition function is

Z(N, K, h) = ∑_{States} e^{K ∑_⟨ij⟩ S_i S_j + h ∑_i S_i}

where K = J/k_B T and h = H/k_B T.
Now, let us (imagine we) reorganize all of these sums as follows.
1. Divide the lattice into blocks of linear size b (we'll let b = 2 in the cartoon below).

2. Each block will have an internal sum done on it, leading to a block spin S = ±1. For example, the majority–rule transformation is

S_i = { +1, if ∑_{i∈b^d} S_i > 0; −1, if ∑_{i∈b^d} S_i < 0; ±1 randomly, if ∑_{i∈b^d} S_i = 0 }

Clearly, if N is the number of spins originally, N/b^d is the number of spins after blocking.
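The majority–rule transformation is simple to implement for a square lattice of ±1 spins. A sketch (the array layout and the tie–breaking RNG are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def block_spin(spins, b=2):
    """Majority-rule block-spin transform of an N x N array of +-1 spins."""
    N = spins.shape[0]
    # group the lattice into b x b blocks and sum each block
    s = spins.reshape(N // b, b, N // b, b).sum(axis=(1, 3))
    out = np.where(s > 0, 1, -1)
    ties = (s == 0)
    out[ties] = rng.choice([-1, 1], size=int(ties.sum()))  # break ties randomly
    return out

spins = rng.choice([-1, 1], size=(8, 8))
blocked = block_spin(spins)  # 8x8 -> 4x4, i.e. N -> N/b^d with b = 2, d = 2
```

Applying `block_spin` repeatedly implements the m-fold blocking described below, shrinking every correlated feature by a factor b per step.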
It is easy to see that this is a good idea. Any feature correlated on a finite length scale l before blocking becomes l/b after blocking once. Blocking m times means this feature is now on the small scale l/b^m. In particular, consider the correlation length ξ. The renormalization group (RG) equation for ξ is

ξ^{(m+1)} = ξ^{(m)}/b

If the RG transformation has a fixed point (denoted by *), which we always assume is true here,

ξ* = ξ*/b

The only solutions are

ξ* = { 0: trivial fixed points, T = 0 or T = ∞; ∞: nontrivial fixed point, T = T_c }
The critical fixed point is unstable, while the T = 0 and T = ∞ fixed points are stable.

Figure 17.1: ξ vs. T; arrows show flow under RG.
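A concrete example of these flows, not worked out in the notes but standard: for the d = 1 Ising chain at H = 0, summing out every second spin (decimation, b = 2) can be done exactly, giving the recursion K^{(m+1)} = ½ ln cosh(2K^{(m)}). Iterating it shows that every finite K > 0 flows to the trivial fixed point K* = 0 (T = ∞), i.e. there is no phase transition in d = 1:

```python
import math

def decimate(K):
    """Exact b = 2 decimation recursion for the d = 1 Ising chain, H = 0."""
    return 0.5 * math.log(math.cosh(2.0 * K))

K = 2.0  # start at strong coupling (low T)
flow = [K]
for _ in range(50):
    K = decimate(K)
    flow.append(K)
# K decreases monotonically toward the stable K* = 0 fixed point
```

Since K′ < K for every K > 0, there is no nontrivial fixed point, consistent with T_c = 0 for the d = 1 Ising model noted back in Chapter 14.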
Note that the coarse–graining eliminates information on small scales in a controlled way. Consider figure 17.2. Let us consider a blocking transformation

Figure 17.2: A system with a few fluctuations at T > T_c maps, under blocking, to a renormalized system with fewer fluctuations (a lower effective T).
(fig. 17.3). We know the states in the new renormalized system, S_i^{(m+1)} = ±1, for the renormalized spins on sites

i = 1, 2, ..., N^{(m+1)}

Note,

N^{(m+1)} = N^{(m)}/b^d

In the cartoon, N^{(m+1)} = 16, b^d = 2² = 4, and N^{(m)} = 64.

Figure 17.3: Original system (RG level m), blocked with b = 2 into the renormalized configuration (RG level m + 1).
This is all pretty clear: we can easily label all the states in the sum for Z. But how do the spins interact after renormalization? It is clear that, if the original system has short–ranged interactions, the renormalized system will as well.

At this point, historically, Kadanoff simply said that if E^{(m)}_{State} was the Ising model, so was E^{(m+1)}_{State}. The coupling constants K^{(m)} and h^{(m)} would depend on the level of transformation, but that's all. This is not correct, as we will see shortly, but the idea and approach is dead on target.
In any case, assuming this is true,

Z^{(m)}(N^{(m)}, K^{(m)}, h^{(m)}) = ∑_{States} e^{K^{(m)} ∑_⟨ij⟩ S_i^{(m)} S_j^{(m)} + h^{(m)} ∑_i S_i^{(m)}}

But

Z^{(m)} = e^{−N^{(m)} f^{(m)}(h^{(m)}, t^{(m)})}

where t^{(m)} is the reduced temperature obtained from K^{(m)}, and f^{(m)} is the free energy per unit volume. But we must have Z^{(m)} = Z^{(m+1)}, since we obtain Z^{(m+1)} by doing some of the sums in Z^{(m)}. Hence

N^{(m)} f^{(m)}(h^{(m)}, t^{(m)}) = N^{(m+1)} f^{(m+1)}(h^{(m+1)}, t^{(m+1)})

or,

f^{(m)}(h^{(m)}, t^{(m)}) = b^{−d} f^{(m+1)}(h^{(m+1)}, t^{(m+1)})
The only reason h and t vary under RG is due to b. We assume

t^{(m+1)} = b^A t^{(m)}

and

h^{(m+1)} = b^B h^{(m)}

So (and dropping the annoying superscripts from t and h) we have

f^{(m)}(h, t) = b^{−d} f^{(m+1)}(h b^B, t b^A)

Now we use Kadanoff's bold, and unfortunately incorrect, assertion that f^{(m)} and f^{(m+1)} are both simply the Ising model. Hence

f(h, t) = b^{−d} f(h b^B, t b^A)
This is one form of the scaling assumption, which is due to Widom. Widom argued that the free energy density was a homogeneous function. This allows for the possibility of nontrivial exponents (albeit built into the description), and one can derive the scaling relations.

For example, the factor b is not fixed by our arguments. Let

t b^A ≡ 1

Then

f(h, t) = t^{d/A} f(h t^{−B/A}, 1)

or,

f(h, t) = t^{d/A} f̃(h t^{−B/A})

which is of the same form as the scaling relation previously obtained for F.
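This homogeneous form can be seen concretely in mean–field theory, where the equation of state H = tψ + ψ³ (taking r₀ = u = 1, my choice of constants; β = 1/2, β + γ = 3/2) collapses exactly: ψ/t^{1/2} depends only on the combination H/t^{3/2}. A sketch:

```python
import numpy as np

def psi(t, H):
    """Positive solution of H = t psi + psi^3 (mean field above Tc, r0 = u = 1)."""
    roots = np.roots([1.0, 0.0, t, -H])
    real = roots[np.abs(roots.imag) < 1e-9].real
    return real.max()

z = 1.0  # fixed scaling variable z = H / t^(3/2)
g1 = psi(0.01, z * 0.01**1.5) / 0.01**0.5
g2 = psi(0.04, z * 0.04**1.5) / 0.04**0.5
# g1 == g2: both temperatures land on the same scaling function value psi~(z)
```

Indeed, substituting ψ = t^{1/2} g and H = z t^{3/2} into the equation of state reduces it to z = g + g³, with all t dependence gone — the data collapse is exact here.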
Figure 17.4: E^{(m)}/k_B T = −K^{(m)} ∑_⟨ij⟩ S_i^{(m)} S_j^{(m)} vs. E^{(m+1)}/k_B T = −K^{(m+1)} ∑_⟨ij⟩ S_i^{(m+1)} S_j^{(m+1)} + (longer range interactions).

Figure 17.5: Nearest neighbor interaction shown by solid line; next–nearest neighbor interaction (along diagonal) shown by wavy line.
Kadanoff also presented results for the correlation function, but we'll skip that.
Let us consider the interactions between spins again, before and after application of RG. For simplicity, let H = 0. The original system has interactions with its nearest neighbors as drawn in fig. 17.4. The blocked spin treats 4 spins as interacting together, with identical strength, with blocked spins. The nearest neighbor interaction is as a cartoon in fig. 17.5. Note that the direct part of the nearest–neighbor interaction to the right in the renormalized system has 4 spins interacting equivalently with 4 other spins. There are 16 cases, which are replaced by one global effective interaction, all K^{(m+1)} contributions treated equally in the block–spin near neighbor interaction. Those 16 cases have:

2 interactions separated by 1 lattice constant
6 interactions separated by 2 lattice constants
8 interactions separated by 3 lattice constants
2 interactions separated by 4 lattice constants
Let's compare this to the next–nearest neighbor interaction, along the diagonals. There are again 16 cases, all next–nearest neighbor contributions treated equally in the block–spin representation, which have:

1 interaction separated by 2 lattice constants
4 interactions separated by 3 lattice constants
6 interactions separated by 4 lattice constants
4 interactions separated by 5 lattice constants
1 interaction separated by 6 lattice constants

Figure 17.6: Nearest neighbor interaction vs. next–nearest neighbor interaction.
Let us call the block–spin next–nearest neighbor coupling K_nn^{(m+1)}, which lumps all of this into one effective interaction. It is clear from looking at the interactions underlying the block spins that K^{(m+1)} (nearest neighbor) and K_nn^{(m+1)} (next nearest neighbor) are quite similar, although it looks like K_nn^{(m+1)} < K^{(m+1)}. However, Kadanoff's step of setting K_nn^{(m+1)} ≡ 0 looks very drastic; in fact, with the benefit of hindsight, we know it is too drastic.

One can actually numerically track what happens to the Ising model under the block–spin renormalization group. It turns out that longer–range interactions between two spins are generated (as we have described in detail above), as well as three–spin interactions (like ∑ S_i S_j S_k), and other such interactions. Although the interactions remain short ranged, they become quite complicated. It is useful to consider the flow of the energy of a state under application of RG. For H = 0, for the Ising model, the phase diagram is as in fig. 17.7. For
Figure 17.7: Usual Ising model, H = 0: the coupling K = J/k_B T runs from 0 (infinite T) through K_c to large K (low T).
the Ising model with only next nearest neighbor interactions (J ≡ 0) one gets the same phase diagram at H = 0 (fig. 17.8). The phase diagram for an Ising

Figure 17.8: Ising model with no near neighbor interaction: K_nn = J_nn/k_B T runs from 0 (infinite T) through (K_nn)_c to low T.
model with both K and K_nn, namely

E/k_B T = −K ∑_⟨ij⟩ S_i S_j − K_nn ∑_⟨⟨ij⟩⟩ S_i S_j

for H = 0, is drawn in fig. 17.9. The flow under RG is hence as drawn in

Figure 17.9: Phase diagram in the (K, K_nn) plane: a line of 2nd order transitions separates the disordered (infinite T) phase from the low–T ordered phase, passing through (K_c, 0) and (0, (K_nn)_c).
fig. 17.10. Of course there are many other axes on this graph, corresponding to

Figure 17.10: Flow in the (K, K_nn) plane toward the critical fixed point of the RG transformation.
different interactions. One should also note that the interactions developed by the RG depend on the RG. For example, b = 3 would give a different sequence of interactions, and a different fixed point, than b = 2. But, it turns out, the exponents would all be the same. This is an aspect of universality: all models with the same symmetry and spatial dimension share the same critical exponents.
The formal way to understand these ideas was figured out by Wilson. The RG transform takes us from one set of coupling constants K^(m) to another K^(m+1) via a length scale change of b entering a recursion relation:

K^(m+1) = R(K^(m))

At the fixed point,

K* = R(K*)

Let us now imagine we are close enough to the fixed point that we can linearize around it. In principle this is clear. In practice, because of all the coupling constants generated on our way to the fixed point, it is a nightmare. Usually it is done by numerical methods, or by the ǫ = 4 − d expansion described later.
168 CHAPTER 17. RENORMALIZATION GROUP
Using subscripts α, β, ... to label the different coupling constants,

K_α^(m+1) − K*_α = R_α(K^(m)) − R_α(K*)

or,

δK_α^(m+1) = ∑_β M_αβ δK_β^(m)

where

M_αβ = (∂R_α/∂K_β)*

is the linearized RG transform through which "b" appears, and

δK = K − K*
All that is left to do is linear algebra to diagonalize M_αβ. Before doing this, let us simply pretend that M_αβ is already diagonal. Then we can see everything clearly. If

M = diag(λ^(1), λ^(2), ...)

the transform becomes

δK_α^(m+1) = λ^(α) δK_α^(m)
The form of λ^(α) is easy to get. It is clear that acting with the RG on scale b, followed by acting again by b′, is the same as acting once by b·b′. Hence:

M(b) M(b′) = M(bb′)

somewhat schematically, or

λ(b) λ(b′) = λ(bb′)

The solution of this is

λ = b^y

The renormalization group transformation becomes

δK_α^(m+1) = b^{y_α} δK_α^(m)
If y_α > 0, δK_α is relevant, getting bigger with RG.
If y_α = 0, δK_α is marginal, staying unchanged with RG.
If y_α < 0, δK_α is irrelevant, getting smaller with RG.

The singular part of the free energy near the fixed point satisfies

f*(δK_1, δK_2, ...) = b^{−d} f*(b^{y_1} δK_1, b^{y_2} δK_2, ...)
This is the homogeneous scaling form, where we now know how to calculate the y_α's (in principle), and we know why (in principle) only two exponents are necessary to describe the Ising model (presumably the remainder are irrelevant).
In practice, the problem is to find the fixed point. A neat trick is the ǫ = 4 − d expansion. From our discussion above of the Ginzburg criterion, mean-field theory works for d > 4 (it is marginal, with logarithmic corrections, in d = 4 itself). In fact we then expect there to be a very easy-to-find, easy-to-fiddle-with fixed point in d ≥ 4. If we are lucky, this fixed point can be analytically continued to d < 4 in a controllable ǫ = 4 − d expansion. We will do this below. I say "lucky" because the fixed point associated with mean-field theory is not always related to the fixed point we are after (usually it is then a fixed point related to a new phase transition at even higher dimensions, i.e., it corresponds to a lower critical dimension).
Finally let us now return to the linear algebra:

δK_α^(m+1) = ∑_β M_αβ δK_β^(m)

since M_αβ is rarely diagonal. We will assume it is symmetric to make things easier. (If not, we have to define right and left eigenvectors. This is done in Goldenfeld's book.) We define an orthonormal complete set of eigenvectors by

∑_β M_αβ e_β^(γ) = λ^(γ) e_α^(γ)
where λ^(γ) are the eigenvalues and the e^(γ) are the eigenvectors. We expand δK via

δK_α = ∑_γ δK̃_γ e_α^(γ)

or, inverting,

δK̃_γ = ∑_α δK_α e_α^(γ)

where the δK̃_γ are the "scaling fields", linear combinations of the original interactions. Now we act on the linearized RG transform with ∑_α e_α^(γ)(...). This gives

∑_α e_α^(γ) δK_α^(m+1) = ∑_α e_α^(γ) ( ∑_β M_αβ δK_β^(m) )
It is a little confusing, but the superscripts on the e's have nothing to do with the superscripts on the δK's. Using the eigenvalue equation and the symmetry of M, we get

δK̃_γ^(m+1) = λ^(γ) ∑_β e_β^(γ) δK_β^(m)

so,

δK̃_γ^(m+1) = λ^(γ) δK̃_γ^(m)

This is what we had (for δK̃'s rather than δK's) a few pages back. Everything follows the same from there.
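To see the machinery in one place, here is a toy 2×2 example (the matrix entries are invented for illustration, not from any real RG calculation): diagonalize a symmetric M_αβ and check that the scaling fields transform diagonally under one linearized RG step. A sketch in plain Python:

```python
import math

# Hypothetical symmetric linearized RG matrix M_ab (illustrative numbers only)
M = [[1.2, 0.3],
     [0.3, 0.8]]

# Eigenvalues of a symmetric 2x2 matrix from the quadratic formula
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
disc = math.sqrt(tr * tr / 4.0 - det)
lam = [tr / 2.0 + disc, tr / 2.0 - disc]

# Normalized eigenvectors e^(gamma); for symmetric M they are orthogonal
evecs = []
for l in lam:
    ex, ey = M[0][1], l - M[0][0]        # solves (M - l I) e = 0 here
    n = math.hypot(ex, ey)
    evecs.append((ex / n, ey / n))

# A generic deviation from the fixed point, expanded in scaling fields
dK = (0.05, -0.02)
scaling_fields = [e[0] * dK[0] + e[1] * dK[1] for e in evecs]

# One RG step: dK -> M dK; each scaling field just picks up a factor lam
dK1 = (M[0][0] * dK[0] + M[0][1] * dK[1],
       M[1][0] * dK[0] + M[1][1] * dK[1])
for l, e, s in zip(lam, evecs, scaling_fields):
    s1 = e[0] * dK1[0] + e[1] * dK1[1]
    assert abs(s1 - l * s) < 1e-12       # diagonal transformation, as claimed
```

The assertion at the end is exactly the statement δK̃_γ^(m+1) = λ^(γ) δK̃_γ^(m).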
17.1 ǫ–Expansion RG
Reconsider

F[ψ] = ∫ d^d x { (K/2)(∇ψ)² + (r/2)ψ² + (u/4)ψ⁴ }

near T_c, where K and u are constants and

r ∝ T − T_c
Near T_c we can conveniently consider

F/k_B T = ∫ d^d x { (K/2)(∇ψ)² + (r/2)ψ² + (u/4)ψ⁴ }

by absorbing the k_B T, which is nonsingular, within K, r, and u. We can remove the constant K by absorbing it into the length scale, or by letting r → r/K, u → u/K, F → F/K, etc., so that

F/k_B T = ∫ d^d x { (1/2)(∇ψ)² + (r/2)ψ² + (u/4)ψ⁴ }
Now we can easily construct dimensional arguments for all coefficients. Clearly

[F/k_B T] = 1

where [...] means the dimension of the quantity. Hence,

[∫ d^d x (∇ψ)²] = 1

or

[L^{d−2} ψ²] = 1, since [x] = L

or,

[ψ] = 1/L^{d/2−1}

Similarly,

[∫ d^d x (r/2) ψ²] = 1

or

[L^d r (1/L^{d−2})] = 1
or

[r] = 1/L²

Finally,

[∫ d^d x (u/4) ψ⁴] = 1

so

[L^d u (1/L^{2d−4})] = 1

or

[u] = 1/L^{4−d}
Eventually we will be changing scale by a length rescaling factor, so we need to find out how these quantities are affected by a simple change in scale. These are called engineering dimensions.
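This power counting is mechanical enough to script. A small sketch (plain Python; the function name is my own): for a coupling c_n multiplying ψ^n, requiring [∫ d^d x c_n ψ^n] = 1 with [ψ] = 1/L^{d/2−1} gives [c_n] = 1/L^Δ with Δ = d − n(d/2 − 1):

```python
from fractions import Fraction

def coupling_power(n, d):
    """Engineering dimension of the coupling of a psi^n term:
    [c_n] = 1/L**Delta with Delta = d - n*(d/2 - 1),
    from requiring [integral d^dx c_n psi^n] = 1 and [psi] = 1/L**(d/2 - 1)."""
    d = Fraction(d)
    return d - n * (d / 2 - 1)

# Reproduce the dimensions quoted in the text:
print(coupling_power(2, 3))   # [r] = 1/L^2: prints 2 (in any d)
print(coupling_power(4, 3))   # [u] = 1/L^(4-d): prints 1 in d = 3
print(coupling_power(4, 4))   # u is dimensionless (marginal) in d = 4: prints 0
```

A coupling with Δ > 0 grows under naive rescaling (relevant by power counting) and one with Δ < 0 shrinks (irrelevant), which is the logic used for the ψ⁶ and ψ⁸ terms in the note below.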
Note the following,

[ψ²] = 1/L^{d−2}

But at T_c

⟨ψ²⟩ = 1/x^{d−2+η}

where η is nonzero! Also, since r ∝ T − T_c, near T_c

[T − T_c] = 1/L²

but [ξ] = L, so

ξ = (T − T_c)^{−1/2}

near T_c. But in fact

ξ = (T − T_c)^{−ν}

So our results of scaling theory give

[ψ]_{after avg.} = 1/L^{(d−2+η)/2} = (1/L^{d/2−1})(1/L^{η/2})

and

[r]_{after avg.} = 1/L^{1/ν} = (1/L²)(1/L^{1/ν−2})

The extra length which appears on averaging is the lattice constant a ≈ 5 Å. Hence

⟨ψ²⟩ = a^η / x^{d−2+η}

and η/2 is called ψ's anomalous dimension, while 1/ν − 2 is the anomalous dimension of r. By doing the averaging, carefully taking account of changes in scale by L, we can calculate ν and η as well as other quantities.
In this spirit, let us measure quantities in units of their engineering dimensions. Let

ψ′ = ψ/[ψ] = ψ/(1/L^{d/2−1})

and

x′ = x/[x] = x/L

so that (dropping the k_B T under F),

F = ∫ d^d x′ { (1/2)(∇′ψ′)² + (rL²/2) ψ′² + (uL^{4−d}/4) ψ′⁴ }

We can of course let

r′ = r/[r] = r/(1/L²) = rL²

and

u′ = u/[u] = u/(1/L^{4−d}) = uL^{4−d}

so we have

F = ∫ d^d x′ { (1/2)(∇′ψ′)² + (r′/2) ψ′² + (u′/4) ψ′⁴ }

Equivalently, since we are interested in the behaviour at T_c, let rL² ≡ 1 (choosing the scale by the distance from T_c); then

F = ∫ d^d x′ { (1/2)(∇′ψ′)² + (1/2) ψ′² + (u′′/4) ψ′⁴ }

where

u′′ = u (1/r)^{(4−d)/2}
Note that u′ is negligible as L → ∞ for d > 4 and, more revealingly, u′′ is negligible as T → T_c for d > 4. This is the Ginzburg criterion again; note the equivalence between L → ∞ and T → T_c (i.e., r → 0).
It is worthwhile to consider how the coefficient of the quartic term (with its engineering dimension removed) varies as the scale L is varied. Consider the "beta function":

−L du′/dL = −(4 − d) L^{4−d} u = −(4 − d) u′
Figure 17.11: −L du′/dL versus u′ for d > 4, d = 4, and d < 4. For d > 4 the initial value of u′ is decreased as L increases; for d < 4 it is increased.
The direction of the arrows is from

du′/d ln L = −(d − 4) u′ ,  d > 4
du′/d ln L = (4 − d) u′ ,  d < 4

Note that u′ = 0 is a stable fixed point for d > 4 and an unstable fixed point for d < 4. This sort of picture implicitly assumes u′ is small. The renormalization group will give us a more elaborate flow, similar to
−L du′/dL = −(4 − d) u′ + O(u′²)
so that the flow near the fixed pt. determines the dimension of a quantity.
Figure 17.12: −L du′/dL versus u′ including the O(u′²) term: the u′ = 0 fixed point is stable for d > 4, while for d < 4 the flow runs to a new Wilson fixed point at u′ > 0.
Clearly

−L du′/dL = (dim. of u) u′

and similarly

−L dr′/dL = (dim. of r) r′
So the anomalous dimension can be determined by the “flow” near the fixedpoint.
The analysis above is OK for d > 4, and only good for u′ close to zero ford < 4. To get some idea of behavior below 4 dimensions, we stay close to d = 4and expand in ǫ = 4 − d. (See note below)
So consider

F = ∫ d^d x { (K/2)(∇ψ)² + (r/2)ψ² + (u/4)ψ⁴ }

We will find:

                    ǫ = 1 (d = 3)    Exact (d = 3)
α = ǫ/6                 0.17             0.125
β = 1/2 − ǫ/6           0.33             0.313
γ = 1 + ǫ/6             1.17             1.25
δ = 3 + ǫ               4                5.0
ν = 1/2 + ǫ/12          0.58             0.638
η = 0 + O(ǫ²)           0                0.041
(Note: Irrelevant couplings. Consider

∫ d^d x [ (1/2)(∇ψ)² + (r/2)ψ² + (u/4)ψ⁴ + (v/6)ψ⁶ + (w/8)ψ⁸ ]

As before,

[ψ] = 1/L^{d/2−1}
[r] = 1/L²
[u] = 1/L^{4−d}

But similarly,

[v] = 1/L^{2(3−d)}
[w] = 1/L^{3(8/3−d)}

so,

F = ∫ d^d x′ [ (1/2)(∇′ψ′)² + (rL²/2) ψ′² + (uL^{4−d}/4) ψ′⁴ + (vL^{2(3−d)}/6) ψ′⁶ + (wL^{3(8/3−d)}/8) ψ′⁸ ]

Hence v and w look quite unimportant (at least for the Gaussian fixed point) near d = 4. This is called power counting. For the remainder of these notes we will simply assume v and w are always irrelevant.)
We will do a scale transformation in Fourier space,

ψ(x) = ∫_{|k|<Λ} d^d k/(2π)^d e^{ik·x} ψ_k

where (before we start) Λ = π/a, since we shall need the lattice constant to get anomalous dimensions. We will continually apply the RG to integrate out
Figure 17.13: Under the RG the domain of nonzero ψ_k in the (k_x, k_y) plane shrinks from |k| < Λ to |k| < Λ/b; a scale transformation by a factor of b then restores the cutoff, and repeated steps drive the description to k → 0.
small length scale behavior, which, in Fourier space, means integrating out large wavenumber behaviour, as shown schematically in fig. 17.13. Let

ψ(x) = ψ̄(x) + δψ(x)

where ψ(x) contains all degrees of freedom with |k| < Λ (after several transformations), ψ̄(x) the "slow" modes on |k| < Λ/b, and δψ(x) the "fast" modes on Λ/b < |k| < Λ, which are to be integrated out. Also, for convenience,

Λ/b = Λ − δΛ

so that the scale factor

b = Λ/(Λ − δΛ) ≥ 1
Then we explicitly have

ψ̄(x) = ∫_{|k|<Λ−δΛ} d^d k/(2π)^d e^{ik·x} ψ_k

and

δψ(x) = ∫_{Λ−δΛ<|k|<Λ} d^d k/(2π)^d e^{ik·x} ψ_k
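The slow/fast decomposition is easy to demonstrate explicitly in one dimension with a bare-hands discrete Fourier transform (plain Python; the field and the cutoff are arbitrary illustrative choices): project a sampled field onto modes below and within a shell in |k| and check that the two pieces add back up to ψ.

```python
import cmath, math

N = 16                       # number of lattice sites
psi = [math.sin(2 * math.pi * n / N) + 0.3 * math.cos(2 * math.pi * 5 * n / N)
       for n in range(N)]    # a field with one slow mode and one fast mode

def dft(f):
    """Direct O(N^2) discrete Fourier transform; fine for tiny N."""
    return [sum(f[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft_subset(F, keep):
    """Inverse DFT using only the mode indices in `keep`."""
    return [sum(F[k] * cmath.exp(2j * math.pi * k * n / N) for k in keep).real / N
            for n in range(N)]

F = dft(psi)
kcut = 3                                              # plays the role of Lambda/b
slow = [k for k in range(N) if min(k, N - k) <= kcut]
fast = [k for k in range(N) if min(k, N - k) > kcut]

psi_bar = idft_subset(F, slow)      # "slow" modes, |k| < Lambda/b
dpsi = idft_subset(F, fast)         # "fast" shell modes, to be integrated out

# The decomposition psi = psi_bar + dpsi is exact:
assert all(abs(psi[n] - psi_bar[n] - dpsi[n]) < 1e-12 for n in range(N))
```

Here ψ̄ ends up carrying the slow sine component and δψ the fast cosine component, exactly as the momentum-shell split intends.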
Schematically, refer to fig. 17.14. This is called the "momentum-shell" RG. We do this in momentum space rather than real space because it is convenient for factors of

∫ d^d x (∇ψ)²  (coupling different ψ(x)'s)  =  ∫ d^d k/(2π)^d k² |ψ_k|²  (Fourier modes uncoupled)
Figure 17.14: The modes of ψ(x) occupy |k| < Λ; those of ψ̄(x) occupy |k| < Λ − δΛ; those of δψ(x) lie in the shell Λ − δΛ < |k| < Λ.
which are trivial in momentum space.
The size of the momentum shell is

∫_{Λ−δΛ<|k|<Λ} d^d k/(2π)^d = a Λ^{d−1} δΛ

where a is a number that we will later set to unity for convenience, although a(d = 3) = 1/2π², a(d = 2) = 1/2π. Let us now integrate out the modes δψ:
e^{−F̄[ψ̄]}  (scales |k| < Λ − δΛ)  ≡  ∑_{states δψ} e^{−F[ψ̄+δψ]}  (sum over shell scales; F involves scales |k| < Λ)

Essentially we are, very slowly, solving the partition function by integrating out the degrees of freedom at large k bit by bit.
We will only do this once, and thus obtain recursion relations for how tochange scales.
F[ψ̄ + δψ] = ∫ d^d x { (1/2)(∇(ψ̄ + δψ))² + (r/2)[ψ̄ + δψ]² + (u/4)[ψ̄ + δψ]⁴ }
We will expand this out, ignoring odd terms in δψ (which must give zero contribution since they change the symmetry of F), constant terms (which only shift the zero of the free energy), and terms of order higher than quartic like ψ⁶ and ψ⁸ (which are irrelevant near d = 4 by power counting, as we did above).
In that case

(1/2)[∇(ψ̄ + δψ)]² = (1/2)(∇ψ̄)² + (1/2)(∇δψ)² + odd term

(r/2)[ψ̄ + δψ]² = (r/2)ψ̄² + (r/2)(δψ)² + odd term

(u/4)[ψ̄ + δψ]⁴ = (u/4)ψ̄⁴ [1 + δψ/ψ̄]⁴
               = (u/4)ψ̄⁴ [1 + 4 δψ/ψ̄ + 6 (δψ)²/ψ̄² + 4 (δψ)³/ψ̄³ + (δψ)⁴/ψ̄⁴]
               = (u/4)ψ̄⁴ + (3u/2)ψ̄²(δψ)² + O((δψ)⁴) + odd terms
We will neglect the (δψ)⁴ term, since these modes have a small contribution to the k = 0 behavior. This can be self-consistently checked. Hence

F[ψ̄ + δψ] = F[ψ̄]  (involving |k| < Λ − δΛ)  + ∫ d^d x [ (1/2)(∇δψ)² + (1/2)(r + 3uψ̄²)(δψ)² ]

but δψ = ∫_{Λ−δΛ<|k|<Λ} d^d k/(2π)^d e^{ik·x} ψ_k, so

(∇δψ)² = Λ²(δψ)²

for a thin shell, and

F[ψ̄ + δψ] = F[ψ̄] + ∫ d^d x (1/2)(Λ² + r + 3uψ̄²)(δψ)²
and we have,

e^{−F̄[ψ̄]} = e^{−F[ψ̄]} ∑_{states δψ} e^{−∫ d^d x (1/2)(Λ² + r + 3uψ̄²)(δψ)²}

where

∑_{states δψ} = ∏_{all δψ states} ∫ dδψ(x)

The integrals are trivially Gaussian in δψ, so

e^{−F̄[ψ̄]} = e^{−F[ψ̄]} ∏_x ∏_{total no. of δψ states} √(2π/(Λ² + r + 3uψ̄²))

We know the total number of δψ states because we know the total number of modes in the Fourier-space shell:

∫_{Λ−δΛ<|k|<Λ} d^d k/(2π)^d = a Λ^{d−1} δΛ
so

∏_x ∏_{total no.} (...) → e^{(∫ d^d x)(a Λ^{d−1} δΛ) ln(...)}

and

e^{−F̄[ψ̄]} = e^{−F[ψ̄]} e^{−∫ d^d x a Λ^{d−1} δΛ (1/2) ln(Λ² + r + 3uψ̄²) + const.}
Now consider

(1/2) ln[Λ² + r + 3uψ̄²] = (1/2) ln(1 + r/Λ² + (3u/Λ²)ψ̄²) + const.

We expand using ln(1 + x) = x − x²/2 + x³/3 − ...:

= (1/2){ r/Λ² + (3u/Λ²)ψ̄² − (1/2)(r/Λ² + (3u/Λ²)ψ̄²)² + ... }

Dropping the ψ̄-independent pieces (the r/Λ² and (r/Λ²)² terms, which are constants), this is

= (1/2)(3u/Λ² − 3ur/Λ⁴)ψ̄² + (1/4)(−9u²/Λ⁴)ψ̄⁴ + ...
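This bookkeeping is easy to get wrong, so it is worth a quick numerical check (plain Python; the test values of Λ, r, u, and ψ̄ are arbitrary small numbers of mine): the ψ̄-dependent part of (1/2)ln(Λ² + r + 3uψ̄²) should agree with the quadratic-plus-quartic approximation up to corrections of third order in the small couplings.

```python
import math

Lam, r, u = 1.0, 1e-3, 1e-3      # arbitrary small test couplings
psi = 0.5                        # arbitrary slow-field value

def exact(p):
    return 0.5 * math.log(Lam**2 + r + 3 * u * p**2)

# psi-dependent part of the expansion derived in the text
approx = (0.5 * (3 * u / Lam**2 - 3 * u * r / Lam**4) * psi**2
          + 0.25 * (-9 * u**2 / Lam**4) * psi**4)

# Subtracting exact(0) removes the constant (psi-independent) pieces
err = abs((exact(psi) - exact(0.0)) - approx)
assert err < 1e-8                # leftover is third order in r, u
```

The residual is indeed far smaller than either retained term, confirming the coefficients of ψ̄² and ψ̄⁴ above.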
So that

F̄[ψ̄] = F[ψ̄] + ∫ d^d x { (1/2)(3u/Λ² − 3ur/Λ⁴)(aΛ^{d−1}δΛ) ψ̄² + (1/4)(−9u²/Λ⁴)(aΛ^{d−1}δΛ) ψ̄⁴ }

so,

F̄[ψ̄] = ∫ d^d x { (1/2)(∇ψ̄)² + (r̄/2)ψ̄² + (ū/4)ψ̄⁴ }

where,

r̄ = r + (3u/Λ² − 3ur/Λ⁴)(aΛ^{d−1}δΛ)

ū = u + (−9u²/Λ⁴)(aΛ^{d−1}δΛ)
But r̄ = r(Λ − δΛ), etc., so

[r(Λ − δΛ) − r(Λ)]/δΛ = (3u/Λ² − 3ur/Λ⁴) aΛ^{d−1}

and

[u(Λ − δΛ) − u(Λ)]/δΛ = (−9u²/Λ⁴) aΛ^{d−1}

or,

−(1/a) ∂r/∂Λ = (3u/Λ² − 3ur/Λ⁴) Λ^{d−1}

−(1/a) ∂u/∂Λ = (−9u²/Λ⁴) Λ^{d−1}
Set a = 1 (or let u → u/a), and set Λ = 1/L, to obtain

−L ∂r/∂L = −3u L^{2−d} + 3ur L^{4−d}

−L ∂u/∂L = 9u² L^{4−d}
We need to include the trivial change in length due to the overall rescaling of space, with engineering dimensions [r] = 1/L², [u] = 1/L^{4−d}, by letting r′ = rL², u′ = uL^{4−d}. Note

dr/dL = d(r′/L²)/dL = (dr′/dL − 2r′/L)(1/L²)

so

−L dr/dL = (−L dr′/dL + 2r′)(1/L²)

and

−L du/dL = (−L du′/dL + (4 − d)u′)(1/L^{4−d})

But

−3u L^{2−d} = −3u′ L^{2−d}/L^{4−d} = −3u′ (1/L²)

and

−3ur L^{4−d} = −3u′r′ L^{4−d}/(L^{4−d} L²) = −3u′r′ (1/L²)

and

9u² L^{4−d} = 9u′² L^{4−d}/(L^{4−d})² = 9u′² (1/L^{4−d})

Hence we obtain

−L dr′/dL = −2r′ − 3u′ + 3u′r′

and

−L du′/dL = −(4 − d)u′ + 9u′²
These are the recursion relations, to first order. Evidently there is still a trivial Gaussian fixed point r* = u* = 0, but there is a nontrivial "Wilson–Fisher" fixed point which is stable for d < 4! From the u equation (using ǫ = 4 − d)

u* = ǫ/9

and in the r equation (using u* ∼ r* ∼ ǫ with ǫ ≪ 1), 2r* = −3u*, so

r* = −ǫ/6
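These truncated flow equations are simple enough to integrate numerically. A sketch (plain Python; the step size and initial value are arbitrary choices of mine), writing the u′ flow as du′/d ln L = ǫu′ − 9u′², which for ǫ > 0 drives any small positive u′ to the Wilson–Fisher value u* = ǫ/9:

```python
eps = 1.0            # epsilon = 4 - d, i.e. d = 3
u = 0.3              # arbitrary starting coupling
dt = 1e-3            # step in t = ln L

# Euler-integrate du/dt = eps*u - 9*u**2 (the text's -L du'/dL, sign flipped)
for _ in range(20000):
    u += dt * (eps * u - 9 * u * u)

print(abs(u - eps / 9))   # tiny: the flow has reached u* = eps/9

# r' is relevant, so instead of flowing to r* it must be tuned; the fixed-point
# condition 0 = -2r - 3u + 3ur of the truncated equations gives, at u = u*:
r_star = -3 * (eps / 9) / (2 - 3 * (eps / 9))
print(r_star)             # ~ -0.2, consistent with r* = -eps/6 + O(eps^2)
```

Note r′ does not flow *to* r*: it is a relevant direction, so reaching the fixed point requires tuning, which is the RG statement that the temperature must be tuned to T_c.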
Figure 17.15: −L du′/dL versus u′ for d > 4, d = 4, and d < 4. The Gaussian fixed point at u′ = 0 is stable for d > 4; for d < 4 the flow runs to the Wilson–Fisher fixed point at u* > 0.
Linearizing around this fixed point gives the dimensions of r and u as that fixed point is approached. Let δr = r′ − r*, δu = u′ − u*. In the u equation,

−L dδu/dL = −ǫu* − ǫδu + 9(u*)²(1 + 2δu/u*)
          = [18u* − ǫ] δu

so

−L dδu/dL = ǫ δu    (Wilson–Fisher fixed pt.)
So the dimension of u is ǫ. In the r equation,

−L dδr/dL = −2r* − 3u* − 2δr + 3u*δr + (terms proportional to δu)

but δr ≫ δu as L → ∞ (δr grows while δu shrinks), so,

−L dδr/dL = (−2 + 3u*) δr

or,

−L dδr/dL = −2(1 − ǫ/6) δr    (Wilson–Fisher fixed pt., stable for d < 4)
So the dimension of r is −2(1 − ǫ/6). Similarly, near the Gaussian fixed pt., where u* = r* = 0, we have (as we did before)

−L dδu/dL = −ǫ δu

−L dδr/dL = −2 δr

giving the dimensions of r and u there.
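The linearization can be cross-checked numerically by differentiating the beta functions of the truncated recursion relations at the Wilson–Fisher fixed point (plain Python; the finite-difference step is an arbitrary choice of mine). Because ∂β_u/∂r′ = 0 here, the two diagonal derivatives are the eigenvalues, i.e. the dimensions of r and u in the d/d ln L convention:

```python
eps = 1.0                     # d = 3
u_star = eps / 9.0
r_star = -3 * u_star / (2 - 3 * u_star)   # exact zero of the truncated flow

def beta_r(r, u):             # d r'/d ln L from the recursion relations
    return 2 * r + 3 * u - 3 * u * r

def beta_u(r, u):             # d u'/d ln L
    return eps * u - 9 * u * u

h = 1e-6                      # finite-difference step (arbitrary)
y_r = (beta_r(r_star + h, u_star) - beta_r(r_star - h, u_star)) / (2 * h)
y_u = (beta_u(r_star, u_star + h) - beta_u(r_star, u_star - h)) / (2 * h)

print(y_r)   # ~ 2*(1 - eps/6) = 5/3: r is relevant, and y_r = 1/nu
print(y_u)   # ~ -eps: u is irrelevant at the Wilson-Fisher fixed point
```

The agreement with 2(1 − ǫ/6) and −ǫ is exact here only because the flow equations were already truncated at the order used in the text.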
To get the physical exponents, recall

[r] = 1/L^{1/ν}

Hence

ν = 1/2 ,  Gaussian fixed pt.

ν = (1/2)(1 + ǫ/6) ,  Wilson–Fisher fixed pt.
To get another exponent we use

[ψ] = 1/L^{d/2−1+η/2}

or more precisely

[ψ] = (1/L^{d/2−1+η/2}) f(rL^{1/ν}, uL^{−y_u})

where

y_u = −ǫ (Gaussian),  y_u = ǫ (Wilson–Fisher)

is the dimension of u. Choosing the scale

rL^{1/ν} ≡ 1

gives

[ψ] = r^{(ν/2)(d−2+η)} f(1, u r^{ν y_u})

For the Wilson–Fisher fixed point this is straightforward. Since ν y_u > 0, the term u r^{ν y_u} is irrelevant as r → 0. Hence

[ψ] = r^{(ν/2)(d−2+η)} f(1, 0) ≡ r^β

with f(1, 0) a constant.
Now it turns out η = 0 + O(ǫ²) (which can be calculated later via scaling relations), so,

β = (ν/2)(d − 2 + η) = (1/2) · (1/2)(1 + ǫ/6) (2 − ǫ) = (1/2)(1 + ǫ/6)(1 − ǫ/2)

or

β = (1/2)(1 − ǫ/3)    (Wilson–Fisher fixed pt.)
The Gaussian fixed point is more tricky (although academic, since it applies for d > 4). We again obtain

[ψ] = r^{(ν/2)(d−2+η)} f(1, u r^{ν y_u})

but now ν = 1/2, y_u = −ǫ = d − 4, η = 0, so

[ψ] = r^{(d−2)/4} f(1, u r^{(d−4)/2})

This looks like β = (d − 2)/4! Not the well-known β = 1/2. The problem is that u is irrelevant, but dangerous, for T < T_c (where ψ is nonzero). Without this quartic term, the free energy is ill defined. For the equation above, f(1, u → 0) is singular. Its form is easy to see. By dimensions ψ ∼ √(r/u) below T_c, so

f(1, u → 0) = 1/u^{1/2}

and

[ψ]_{u→0} = r^{(d−2)/4} / (u r^{(d−4)/2})^{1/2} = r^{(d−2)/4 − (d−4)/4} / u^{1/2} = (r/u)^{1/2}

So, β = 1/2 for the Gaussian fixed point. Hence for d ≥ 4, the Gaussian fixed point is stable and α = 0, β = 1/2, δ = 3, γ = 1, η = 0, ν = 1/2. (The remainder are obtained carefully with the scaling relations.)
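The dangerous-irrelevant-variable mechanism can be seen directly by minimizing the mean-field free-energy density f = (r/2)ψ² + (u/4)ψ⁴ for r < 0: the minimum sits at ψ ∼ √(−r/u), so the exponent read off from the r-dependence is 1/2 for any fixed u, while ψ itself diverges as u → 0. A numerical sketch (plain Python; the grid and parameter values are arbitrary choices of mine):

```python
import math

def psi_min(r, u):
    """Minimize f(psi) = (r/2) psi^2 + (u/4) psi^4 on a crude grid (for r < 0)."""
    grid = [i * 1e-4 for i in range(200001)]           # psi in [0, 20]
    return min(grid, key=lambda p: 0.5 * r * p * p + 0.25 * u * p**4)

u = 0.01
p1 = psi_min(-1e-2, u)
p2 = psi_min(-1e-3, u)

# Effective exponent beta from the slope of ln(psi) versus ln(-r):
beta = math.log(p1 / p2) / math.log(1e-2 / 1e-3)
print(beta)    # ~ 0.5, independent of u, though psi itself blows up as u -> 0
```

The fitted slope is 1/2 no matter how small u is, which is exactly why u may not simply be set to zero below T_c.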
For d < 4, the Wilson–Fisher fixed point is stable, and since α + 2β + γ = 2, γ = ν(2 − η), γ = β(δ − 1), νd = 2 − α, we have:

                    ǫ = 1        Ising, d = 3
α = ǫ/6                 0.17         0.125...
β = (1/2)(1 − ǫ/3)      0.33         0.313...
γ = 1 + ǫ/6             1.17         1.25...
δ = 3(1 + ǫ/3)          4            5.0...
ν = 1/2 + ǫ/12          0.58         0.638...
η = 0 + O(ǫ²)           0            0.041...
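As a closing check, the O(ǫ) exponents can be evaluated at ǫ = 1 with exact fractions and tested against the scaling relations; α + 2β + γ = 2 happens to hold exactly for these expressions, while the others pick up O(ǫ²) errors. A small sketch in plain Python:

```python
from fractions import Fraction

eps = Fraction(1)                  # epsilon = 4 - d with d = 3
alpha = eps / 6
beta  = Fraction(1, 2) * (1 - eps / 3)
gamma = 1 + eps / 6
delta = 3 * (1 + eps / 3)
nu    = Fraction(1, 2) + eps / 12

# Rushbrooke's relation holds exactly for these O(eps) expressions:
assert alpha + 2 * beta + gamma == 2

# Others, e.g. gamma = beta*(delta - 1), pick up O(eps^2) errors at eps = 1:
print(float(beta * (delta - 1)))   # 1.0, versus gamma = 7/6
print([float(x) for x in (alpha, beta, gamma, delta, nu)])
```

This makes concrete the caveat that the ǫ-expansion numbers in the table are consistent with the scaling relations only to the order in ǫ actually kept.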