Mathematics for Physics: A Guided Tour for Graduate Students


An engagingly written account of mathematical tools and ideas, this book provides a graduate-level introduction to the mathematics used in research in physics.

The first half of the book focuses on the traditional mathematical methods of physics: differential and integral equations, Fourier series and the calculus of variations. The second half contains an introduction to more advanced subjects, including differential geometry, topology and complex variables.

The authors' exposition avoids excess rigour whilst explaining subtle but important points often glossed over in more elementary texts. The topics are illustrated at every stage by carefully chosen examples, exercises and problems drawn from realistic physics settings. These make it useful both as a textbook in advanced courses and for self-study. Password-protected solutions to the exercises are available to instructors at www.cambridge.org/9780521854030.

Michael Stone is a Professor in the Department of Physics at the University of Illinois at Urbana-Champaign. He has worked on quantum field theory, superconductivity, the quantum Hall effect and quantum computing.

Paul Goldbart is a Professor in the Department of Physics at the University of Illinois at Urbana-Champaign, where he directs the Institute for Condensed Matter Theory. His research ranges widely over the field of condensed matter physics, including soft matter, disordered systems, nanoscience and superconductivity.

MATHEMATICS FOR PHYSICS

A Guided Tour for Graduate Students

MICHAEL STONE
University of Illinois at Urbana-Champaign

and

PAUL GOLDBART
University of Illinois at Urbana-Champaign

CAMBRIDGE UNIVERSITY PRESS

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo

Cambridge University Press

The Edinburgh Building, Cambridge CB2 8RU, UK

© M. Stone and P. Goldbart 2009

This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2009

ISBN-13 978-0-511-59516-5 eBook (EBL)
ISBN-13 978-0-521-85403-0 Hardback

Information on this title: www.cambridge.org/9780521854030

Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org

To the memory of Mike's mother, Aileen Stone: 9 × 9 = 81.

To Paul's mother and father, Carole and Colin Goldbart.

Contents

Preface

Acknowledgments

1 Calculus of variations
  1.1 What is it good for?
  1.2 Functionals
  1.3 Lagrangian mechanics
  1.4 Variable endpoints
  1.5 Lagrange multipliers
  1.6 Maximum or minimum?
  1.7 Further exercises and problems

2 Function spaces
  2.1 Motivation
  2.2 Norms and inner products
  2.3 Linear operators and distributions
  2.4 Further exercises and problems

3 Linear ordinary differential equations
  3.1 Existence and uniqueness of solutions
  3.2 Normal form
  3.3 Inhomogeneous equations
  3.4 Singular points
  3.5 Further exercises and problems

4 Linear differential operators
  4.1 Formal vs. concrete operators
  4.2 The adjoint operator
  4.3 Completeness of eigenfunctions
  4.4 Further exercises and problems

5 Green functions
  5.1 Inhomogeneous linear equations
  5.2 Constructing Green functions
  5.3 Applications of Lagrange's identity
  5.4 Eigenfunction expansions
  5.5 Analytic properties of Green functions
  5.6 Locality and the Gelfand–Dikii equation
  5.7 Further exercises and problems

6 Partial differential equations
  6.1 Classification of PDEs
  6.2 Cauchy data
  6.3 Wave equation
  6.4 Heat equation
  6.5 Potential theory
  6.6 Further exercises and problems

7 The mathematics of real waves
  7.1 Dispersive waves
  7.2 Making waves
  7.3 Nonlinear waves
  7.4 Solitons
  7.5 Further exercises and problems

8 Special functions
  8.1 Curvilinear coordinates
  8.2 Spherical harmonics
  8.3 Bessel functions
  8.4 Singular endpoints
  8.5 Further exercises and problems

9 Integral equations
  9.1 Illustrations
  9.2 Classification of integral equations
  9.3 Integral transforms
  9.4 Separable kernels
  9.5 Singular integral equations
  9.6 Wiener–Hopf equations I
  9.7 Some functional analysis
  9.8 Series solutions
  9.9 Further exercises and problems

10 Vectors and tensors
  10.1 Covariant and contravariant vectors
  10.2 Tensors
  10.3 Cartesian tensors
  10.4 Further exercises and problems

11 Differential calculus on manifolds
  11.1 Vector and covector fields
  11.2 Differentiating tensors
  11.3 Exterior calculus
  11.4 Physical applications
  11.5 Covariant derivatives
  11.6 Further exercises and problems

12 Integration on manifolds
  12.1 Basic notions
  12.2 Integrating p-forms
  12.3 Stokes' theorem
  12.4 Applications
  12.5 Further exercises and problems

13 An introduction to differential topology
  13.1 Homeomorphism and diffeomorphism
  13.2 Cohomology
  13.3 Homology
  13.4 De Rham's theorem
  13.5 Poincaré duality
  13.6 Characteristic classes
  13.7 Hodge theory and the Morse index
  13.8 Further exercises and problems

14 Groups and group representations
  14.1 Basic ideas
  14.2 Representations
  14.3 Physics applications
  14.4 Further exercises and problems

15 Lie groups
  15.1 Matrix groups
  15.2 Geometry of SU(2)
  15.3 Lie algebras
  15.4 Further exercises and problems

16 The geometry of fibre bundles
  16.1 Fibre bundles
  16.2 Physics examples
  16.3 Working in the total space

17 Complex analysis
  17.1 Cauchy–Riemann equations
  17.2 Complex integration: Cauchy and Stokes
  17.3 Applications
  17.4 Applications of Cauchy's theorem
  17.5 Meromorphic functions and the winding number
  17.6 Analytic functions and topology
  17.7 Further exercises and problems

18 Applications of complex variables
  18.1 Contour integration technology
  18.2 The Schwarz reflection principle
  18.3 Partial-fraction and product expansions
  18.4 Wiener–Hopf equations II
  18.5 Further exercises and problems

19 Special functions and complex variables
  19.1 The Gamma function
  19.2 Linear differential equations
  19.3 Solving ODEs via contour integrals
  19.4 Asymptotic expansions
  19.5 Elliptic functions
  19.6 Further exercises and problems

A Linear algebra review
  A.1 Vector space
  A.2 Linear maps
  A.3 Inner-product spaces
  A.4 Sums and differences of vector spaces
  A.5 Inhomogeneous linear equations
  A.6 Determinants
  A.7 Diagonalization and canonical forms

B Fourier series and integrals
  B.1 Fourier series
  B.2 Fourier integral transforms
  B.3 Convolution
  B.4 The Poisson summation formula

References

Index

Preface

This book is based on a two-semester sequence of courses taught to incoming graduate students at the University of Illinois at Urbana-Champaign, primarily physics students but also some from other branches of the physical sciences. The courses aim to introduce students to some of the mathematical methods and concepts that they will find useful in their research. We have sought to enliven the material by integrating the mathematics with its applications. We therefore provide illustrative examples and problems drawn from physics. Some of these illustrations are classical but many are small parts of contemporary research papers. In the text and at the end of each chapter we provide a collection of exercises and problems suitable for homework assignments. The former are straightforward applications of material presented in the text; the latter are intended to be interesting, and take rather more thought and time.

We devote the first, and longest, part (Chapters 1–9, and the first semester in the classroom) to traditional mathematical methods. We explore the analogy between linear operators acting on function spaces and matrices acting on finite-dimensional spaces, and use the operator language to provide a unified framework for working with ordinary differential equations, partial differential equations and integral equations. The mathematical prerequisites are a sound grasp of undergraduate calculus (including the vector calculus needed for electricity and magnetism courses), elementary linear algebra and competence at complex arithmetic. Fourier sums and integrals, as well as basic ordinary differential equation theory, receive a quick review, but it would help if the reader had some prior experience to build on. Contour integration is not required for this part of the book.

The second part (Chapters 10–14) focuses on modern differential geometry and topology, with an eye to its application to physics. The tools of calculus on manifolds, especially the exterior calculus, are introduced, and used to investigate classical mechanics, electromagnetism and non-abelian gauge fields. The language of homology and cohomology is introduced and is used to investigate the influence of the global topology of a manifold on the fields that live in it and on the solutions of differential equations that constrain these fields.

Chapters 15 and 16 introduce the theory of group representations and their applications to quantum mechanics. Both finite groups and Lie groups are explored.

The last part (Chapters 17–19) explores the theory of complex variables and its applications. Although much of the material is standard, we make use of the exterior calculus, and discuss rather more of the topological aspects of analytic functions than is customary.

A cursory reading of the Contents of the book will show that there is more material here than can be comfortably covered in two semesters. When using the book as the basis for lectures in the classroom, we have found it useful to tailor the presented material to the interests of our students.

Acknowledgments

A great many people have encouraged us along the way:

Our teachers at the University of Cambridge, the University of California-Los Angeles, and Imperial College London.

Our students: your questions and enthusiasm have helped shape our understanding and our exposition.

Our colleagues, faculty and staff at the University of Illinois at Urbana-Champaign: how fortunate we are to have a community so rich in both accomplishment and collegiality.

Our friends and family: Kyre and Steve and Ginna; and Jenny, Ollie and Greta. We hope to be more attentive now that this book is done.

Our editor Simon Capelin at Cambridge University Press: your patience is appreciated.

The staff of the US National Science Foundation and the US Department of Energy, who have supported our research over the years.

Our sincere thanks to you all.


1 Calculus of variations

We begin our tour of useful mathematics with what is called the calculus of variations. Many physics problems can be formulated in the language of this calculus, and once they are, there are useful tools to hand. In the text and associated exercises we will meet some of the equations whose solution will occupy us for much of our journey.

1.1 What is it good for?

The classical problems that motivated the creators of the calculus of variations include:

(i) Dido's problem: In Virgil's Aeneid, Queen Dido of Carthage must find the largest area that can be enclosed by a curve (a strip of bull's hide) of fixed length.

(ii) Plateau's problem: Find the surface of minimum area for a given set of bounding curves. A soap film on a wire frame will adopt this minimal-area configuration.

(iii) Johann Bernoulli's brachistochrone: A bead slides down a curve with fixed ends. Assuming that the total energy $\tfrac{1}{2}mv^2 + V(x)$ is constant, find the curve that gives the most rapid descent.

(iv) Catenary: Find the form of a hanging heavy chain of fixed length by minimizing its potential energy.

These problems all involve finding maxima or minima, and hence equating some sort of derivative to zero. In the next section we define this derivative, and show how to compute it.

1.2 Functionals

In variational problems we are provided with an expression $J[y]$ that "eats" whole functions $y(x)$ and returns a single number. Such objects are called functionals to distinguish them from ordinary functions. An ordinary function is a map $f : \mathbb{R} \to \mathbb{R}$. A functional $J$ is a map $J : C^\infty(\mathbb{R}) \to \mathbb{R}$, where $C^\infty(\mathbb{R})$ is the space of smooth (having derivatives of all orders) functions. To find the function $y(x)$ that maximizes or minimizes a given functional $J[y]$ we need to define, and evaluate, its functional derivative.


1.2.1 The functional derivative

We restrict ourselves to expressions of the form

$$J[y] = \int_{x_1}^{x_2} f(x, y, y', y'', \ldots, y^{(n)})\,dx, \tag{1.1}$$

where $f$ depends on the value of $y(x)$ and only finitely many of its derivatives. Such functionals are said to be local in $x$.

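Consider first a functional $J = \int f\,dx$ in which $f$ depends only on $x$, $y$ and $y'$. Make a change $y(x) \to y(x) + \varepsilon\eta(x)$, where $\varepsilon$ is a (small) $x$-independent constant. The resultant change in $J$ is

$$\begin{aligned}
J[y + \varepsilon\eta] - J[y] &= \int_{x_1}^{x_2} \left\{ f(x, y + \varepsilon\eta, y' + \varepsilon\eta') - f(x, y, y') \right\} dx \\
&= \int_{x_1}^{x_2} \left\{ \varepsilon\eta \frac{\partial f}{\partial y} + \varepsilon \frac{d\eta}{dx} \frac{\partial f}{\partial y'} + O(\varepsilon^2) \right\} dx \\
&= \left[ \varepsilon\eta \frac{\partial f}{\partial y'} \right]_{x_1}^{x_2} + \int_{x_1}^{x_2} (\varepsilon\eta(x)) \left\{ \frac{\partial f}{\partial y} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \right\} dx + O(\varepsilon^2).
\end{aligned}$$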

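If $\eta(x_1) = \eta(x_2) = 0$, the variation $\delta y(x) \equiv \varepsilon\eta(x)$ in $y(x)$ is said to have "fixed endpoints". For such variations the integrated-out part $[\ldots]_{x_1}^{x_2}$ vanishes. Defining $\delta J$ to be the $O(\varepsilon)$ part of $J[y + \varepsilon\eta] - J[y]$, we have

$$\begin{aligned}
\delta J &= \int_{x_1}^{x_2} (\varepsilon\eta(x)) \left\{ \frac{\partial f}{\partial y} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \right\} dx \\
&= \int_{x_1}^{x_2} \delta y(x) \left( \frac{\delta J}{\delta y(x)} \right) dx.
\end{aligned} \tag{1.2}$$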

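The function

$$\frac{\delta J}{\delta y(x)} \equiv \frac{\partial f}{\partial y} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \tag{1.3}$$

is called the functional (or Fréchet) derivative of $J$ with respect to $y(x)$. We can think of it as a generalization of the partial derivative $\partial J/\partial y_i$, where the discrete subscript $i$ on $y$ is replaced by a continuous label $x$, and sums over $i$ are replaced by integrals over $x$:

$$\delta J = \sum_i \frac{\partial J}{\partial y_i}\,\delta y_i \;\to\; \int_{x_1}^{x_2} dx \left( \frac{\delta J}{\delta y(x)} \right) \delta y(x). \tag{1.4}$$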
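The analogy between the functional derivative and an ordinary partial derivative can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the illustrative integrand $f = \tfrac12 y'^2 + \tfrac12 y^2$ on $[0, \pi]$ with $y = \sin x$, samples $y$ on a uniform grid, and compares $(1/h)\,\partial J/\partial y_i$ with the formula $\partial f/\partial y - d/dx(\partial f/\partial y') = y - y''$. The function names `discrete_J` and `functional_derivative` are hypothetical.

```python
import math

def discrete_J(y, h):
    """Riemann-sum approximation to J[y] = integral of (y'^2/2 + y^2/2) dx."""
    kinetic = sum(((y[i + 1] - y[i]) / h) ** 2 for i in range(len(y) - 1))
    potential = sum(v * v for v in y)
    return 0.5 * h * (kinetic + potential)

def functional_derivative(y, h, i, eps=1e-4):
    """Estimate dJ/dy(x_i) as (1/h) * partial J / partial y_i (central difference)."""
    up = list(y)
    up[i] += eps
    dn = list(y)
    dn[i] -= eps
    return (discrete_J(up, h) - discrete_J(dn, h)) / (2 * eps * h)

n = 200                      # number of grid intervals (illustrative choice)
h = math.pi / n
x = [k * h for k in range(n + 1)]
y = [math.sin(xk) for xk in x]

i = n // 4                   # an interior grid point, x = pi/4
numeric = functional_derivative(y, h, i)
exact = 2 * math.sin(x[i])   # y - y'' = 2 sin x for y = sin x
print(numeric, exact)
```

The factor $1/h$ is what turns the sum over $i$ in (1.4) into an integral over $x$: each grid value $y_i$ carries a weight $h$ in the discretized functional.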

1.2.2 The Euler–Lagrange equation

Suppose that we have a differentiable function $J(y_1, y_2, \ldots, y_n)$ of $n$ variables and seek its stationary points, these being the locations at which $J$ has its maxima, minima and saddle points. At a stationary point $(y_1, y_2, \ldots, y_n)$ the variation

$$\delta J = \sum_{i=1}^{n} \frac{\partial J}{\partial y_i}\,\delta y_i \tag{1.5}$$

must be zero for all possible $\delta y_i$. The necessary and sufficient condition for this is that all partial derivatives $\partial J/\partial y_i$, $i = 1, \ldots, n$, be zero. By analogy, we expect that a functional $J[y]$ will be stationary under fixed-endpoint variations $y(x) \to y(x) + \delta y(x)$ when the functional derivative $\delta J/\delta y(x)$ vanishes for all $x$. In other words, when

$$\frac{\partial f}{\partial y(x)} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'(x)} \right) = 0, \quad x_1 < x < x_2. \tag{1.6}$$

The condition (1.6) for $y(x)$ to be a stationary point is usually called the Euler–Lagrange equation.

That $\delta J/\delta y(x) \equiv 0$ is a sufficient condition for $\delta J$ to be zero is clear from its definition in (1.2). To see that it is a necessary condition we must appeal to the assumed smoothness of $y(x)$. Consider a function $y(x)$ at which $J[y]$ is stationary but where $\delta J/\delta y(x)$ is non-zero at some $x_0 \in [x_1, x_2]$. Because $f(y, y', x)$ is smooth, the functional derivative $\delta J/\delta y(x)$ is also a smooth function of $x$. Therefore, by continuity, it will have the same sign throughout some open interval containing $x_0$. By taking $\delta y(x) = \varepsilon\eta(x)$ to be zero outside this interval, and of one sign within it, we obtain a non-zero $\delta J$, in contradiction to stationarity. In making this argument, we see why it was essential to integrate by parts so as to take the derivative off $\delta y$: when $y$ is fixed at the endpoints, we have $\int \delta y'\,dx = 0$, and so we cannot find a $\delta y'$ that is zero everywhere outside an interval and of one sign within it.

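When the functional depends on more than one function $y$, then stationarity under all possible variations requires one equation

$$\frac{\delta J}{\delta y_i(x)} = \frac{\partial f}{\partial y_i} - \frac{d}{dx}\left( \frac{\partial f}{\partial y_i'} \right) = 0 \tag{1.7}$$

for each function $y_i(x)$.

If the function $f$ depends on higher derivatives, $y''$, $y^{(3)}$, etc., then we have to integrate by parts more times, and we end up with

$$0 = \frac{\delta J}{\delta y(x)} = \frac{\partial f}{\partial y} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) + \frac{d^2}{dx^2}\left( \frac{\partial f}{\partial y''} \right) - \frac{d^3}{dx^3}\left( \frac{\partial f}{\partial y^{(3)}} \right) + \cdots. \tag{1.8}$$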


Figure 1.1 Soap film between two rings.

1.2.3 Some applications

Now we use our new functional derivative to address some of the classic problems mentioned in the introduction.

Example: Soap film supported by a pair of coaxial rings (Figure 1.1). This is a simple case of Plateau's problem. The free energy of the soap film is equal to twice (once for each liquid–air interface) the surface tension of the soap solution times the area of the film. The film can therefore minimize its free energy by minimizing its area, and the axial symmetry suggests that the minimal surface will be a surface of revolution about the $x$-axis. We therefore seek the profile $y(x)$ that makes the area

$$J[y] = 2\pi \int_{x_1}^{x_2} y\sqrt{1 + y'^2}\,dx \tag{1.9}$$

of the surface of revolution the least among all such surfaces bounded by the circles of radii $y(x_1) = y_1$ and $y(x_2) = y_2$. Because a minimum is a stationary point, we seek candidates for the minimizing profile $y(x)$ by setting the functional derivative $\delta J/\delta y(x)$ to zero.

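We begin by forming the partial derivatives

$$\frac{\partial f}{\partial y} = 4\pi\sigma\sqrt{1 + y'^2}, \qquad \frac{\partial f}{\partial y'} = \frac{4\pi\sigma\, y y'}{\sqrt{1 + y'^2}} \tag{1.10}$$

(here $f = 4\pi\sigma\, y\sqrt{1 + y'^2}$ is the free-energy density, with $\sigma$ the surface tension; the constant prefactor drops out of the Euler–Lagrange equation) and use them to write down the Euler–Lagrange equation

$$\sqrt{1 + y'^2} - \frac{d}{dx}\left( \frac{y y'}{\sqrt{1 + y'^2}} \right) = 0. \tag{1.11}$$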


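Performing the indicated derivative with respect to $x$ gives

$$\sqrt{1 + y'^2} - \frac{(y')^2}{\sqrt{1 + y'^2}} - \frac{y y''}{\sqrt{1 + y'^2}} + \frac{y (y')^2 y''}{(1 + y'^2)^{3/2}} = 0. \tag{1.12}$$

After collecting terms, this simplifies to

$$\frac{1}{\sqrt{1 + y'^2}} - \frac{y y''}{(1 + y'^2)^{3/2}} = 0. \tag{1.13}$$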

The differential equation (1.13) still looks a trifle intimidating. To simplify further, we multiply by $y'$ to get

$$0 = \frac{y'}{\sqrt{1 + y'^2}} - \frac{y y' y''}{(1 + y'^2)^{3/2}} = \frac{d}{dx}\left( \frac{y}{\sqrt{1 + y'^2}} \right). \tag{1.14}$$

The solution to the minimization problem therefore reduces to solving

$$\frac{y}{\sqrt{1 + y'^2}} = \kappa, \tag{1.15}$$

where $\kappa$ is an as yet undetermined integration constant. Fortunately this nonlinear, first-order, differential equation is elementary. We recast it as

$$\frac{dy}{dx} = \sqrt{\frac{y^2}{\kappa^2} - 1} \tag{1.16}$$

and separate variables

$$dx = \frac{dy}{\sqrt{\dfrac{y^2}{\kappa^2} - 1}}. \tag{1.17}$$

We now make the natural substitution $y = \kappa\cosh t$, whence

$$dx = \kappa\,dt. \tag{1.18}$$

Thus we find that $x + a = \kappa t$, leading to

$$y = \kappa\cosh\frac{x + a}{\kappa}. \tag{1.19}$$
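As a quick sanity check on the solution, one can confirm numerically that the hyperbolic-cosine profile keeps the combination $y/\sqrt{1 + y'^2}$ of (1.15) constant. This is a minimal sketch under assumptions of our own: the integration constants are written `kappa` and `a`, their values are arbitrary, and the helper names are hypothetical.

```python
import math

def profile(x, kappa, a):
    """Candidate soap-film profile y = kappa * cosh((x + a) / kappa)."""
    return kappa * math.cosh((x + a) / kappa)

def slope(x, kappa, a):
    """Derivative y' of the profile above."""
    return math.sinh((x + a) / kappa)

def first_integral(x, kappa, a):
    """The conserved combination y / sqrt(1 + y'^2)."""
    yp = slope(x, kappa, a)
    return profile(x, kappa, a) / math.sqrt(1.0 + yp * yp)

kappa, a = 0.8, 0.3   # arbitrary illustrative values
values = [first_integral(0.1 * k - 1.0, kappa, a) for k in range(21)]
print(values)         # every entry should agree with kappa up to rounding
```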


Figure 1.2 Hanging chain.

We select the constants $\kappa$ and $a$ to fit the endpoints $y(x_1) = y_1$ and $y(x_2) = y_2$.

Example: Heavy chain over pulleys. We cannot yet consider the form of the catenary, a hanging chain of fixed length, but we can solve a simpler problem of a heavy flexible cable draped over a pair of pulleys located at $x = \pm L$, $y = h$, and with the excess cable resting on a horizontal surface as illustrated in Figure 1.2.

The potential energy of the system is

$$\text{P.E.} = \sum mgy = \rho g \int_{-L}^{L} y\sqrt{1 + (y')^2}\,dx + \text{const.} \tag{1.20}$$

Here the constant refers to the unchanging potential energy

$$2 \times \int_0^h mgy\,dy = mgh^2 \tag{1.21}$$

of the vertically hanging cable. The potential energy of the cable lying on the horizontal surface is zero because $y$ is zero there. Notice that the tension in the suspended cable is being tacitly determined by the weight of the vertical segments.

The Euler–Lagrange equations coincide with those of the soap film, so

$$y = \kappa\cosh\frac{x + a}{\kappa} \tag{1.22}$$

where we have to find $\kappa$ and $a$. We have

$$h = \kappa\cosh\frac{-L + a}{\kappa} = \kappa\cosh\frac{L + a}{\kappa}, \tag{1.23}$$

so $a = 0$ and $h = \kappa\cosh(L/\kappa)$. Setting $t = L/\kappa$ this reduces to

$$\left( \frac{h}{L} \right) t = \cosh t. \tag{1.24}$$

Figure 1.3 Intersection of $y = ht/L$ with $y = \cosh t$.

By considering the intersection of the line $y = ht/L$ with $y = \cosh t$ (Figure 1.3) we see that if $h/L$ is too small there is no solution (the weight of the suspended cable is too big for the tension supplied by the dangling ends) and once $h/L$ is large enough there will be two possible solutions. Further investigation will show that the solution with the larger value of $\kappa$ is a point of stable equilibrium, while the solution with the smaller $\kappa$ is unstable.
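The graphical argument can be reproduced with a few lines of code. The sketch below is an illustration under our own assumptions, not the authors' method: it writes `s` for the ratio h/L, locates the minimum of cosh t - s t, and uses plain bisection to find the intersections when they exist. The function names are hypothetical.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Bisection root finder; assumes g(lo) and g(hi) have opposite signs."""
    glo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        gmid = g(mid)
        if (gmid > 0) == (glo > 0):
            lo, glo = mid, gmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pulley_roots(s):
    """Return the positive solutions t of s*t = cosh(t), where s = h/L."""
    g = lambda t: math.cosh(t) - s * t
    t_min = math.asinh(s)        # g'(t) = sinh(t) - s vanishes here
    if g(t_min) > 0:
        return []                # the line never reaches the cosh curve
    if g(t_min) == 0:
        return [t_min]           # tangent case: a single solution
    hi = t_min + 1.0
    while g(hi) < 0:             # walk right until cosh overtakes the line
        hi += 1.0
    return [bisect(g, 0.0, t_min), bisect(g, t_min, hi)]

print(pulley_roots(1.0))   # h/L too small: no equilibrium
print(pulley_roots(2.0))   # h/L large enough: two equilibria
```

Because t = L/κ, the smaller root corresponds to the larger κ, which the text identifies as the stable configuration.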

Example: The brachistochrone. This problem was posed as a challenge by Johann Bernoulli in 1696. He asked what shape should a wire with endpoints $(0, 0)$ and $(a, b)$ take in order that a frictionless bead will slide from rest down the wire in the shortest possible time (Figure 1.4). The problem's name comes from Greek: βράχιστος means shortest and χρόνος means time.

When presented with an ostensibly anonymous solution, Johann made his famous remark: "Tanquam ex unguem leonem" (I recognize the lion by his clawmark), meaning that he recognized that the author was Isaac Newton.

Johann gave a solution himself, but that of his brother Jacob Bernoulli was superior and Johann tried to pass it off as his. This was not atypical. Johann later misrepresented the publication date of his book on hydraulics to make it seem that he had priority in this field over his own son, Daniel Bernoulli.

We begin our solution of the problem by observing that the total energy

$$E = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2) - mgy = \tfrac{1}{2}m\dot{x}^2(1 + y'^2) - mgy \tag{1.25}$$

of the bead is constant. From the initial condition we see that this constant is zero. We therefore wish to minimize

$$T = \int_0^T dt = \int_0^a \frac{1}{\dot{x}}\,dx = \int_0^a \sqrt{\frac{1 + y'^2}{2gy}}\,dx \tag{1.26}$$

so as to find $y(x)$, given that $y(0) = 0$ and $y(a) = b$. The Euler–Lagrange equation is

$$y y'' + \tfrac{1}{2}(1 + y'^2) = 0. \tag{1.27}$$

Figure 1.4 Bead on a wire.

Again this looks intimidating, but we can use the same trick of multiplying through by $y'$ to get

$$y'\left( y y'' + \tfrac{1}{2}(1 + y'^2) \right) = \frac{1}{2}\frac{d}{dx}\left\{ y(1 + y'^2) \right\} = 0. \tag{1.28}$$

Thus

$$2c = y(1 + y'^2). \tag{1.29}$$

This differential equation has a parametric solution

$$x = c(\theta - \sin\theta), \qquad y = c(1 - \cos\theta), \tag{1.30}$$

(as you should verify) and the solution is the cycloid shown in Figure 1.5. The parameter $c$ is determined by requiring that the curve does in fact pass through the point $(a, b)$.
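The verification invited above can also be done numerically. The following sketch is an illustration with assumed values (c = 1, g = 9.81, endpoint at θ = π) and hypothetical function names: it checks that the cycloid satisfies (1.29), then compares the cycloid's descent time with that of a straight wire joining the same endpoints. The cycloid time uses ds/v = √(c/g) dθ, which follows from (1.30) and v = √(2gy); the straight-line time is the elementary inclined-plane result.

```python
import math

def cycloid_point(theta, c):
    """Point on the cycloid (1.30); y is measured downward."""
    return c * (theta - math.sin(theta)), c * (1.0 - math.cos(theta))

def cycloid_first_integral(theta, c):
    """Evaluate y (1 + y'^2) with y' = (dy/dtheta) / (dx/dtheta)."""
    _, y = cycloid_point(theta, c)
    yp = math.sin(theta) / (1.0 - math.cos(theta))
    return y * (1.0 + yp * yp)

def cycloid_time(theta_end, c, g=9.81):
    """Descent time from theta = 0 to theta_end along the cycloid."""
    return theta_end * math.sqrt(c / g)

def straight_line_time(a, b, g=9.81):
    """Descent time from rest along the straight chord from (0, 0) to (a, b)."""
    return math.sqrt(2.0 * (a * a + b * b) / (g * b))

c = 1.0
checks = [cycloid_first_integral(0.2 * k, c) for k in range(1, 30)]
a, b = cycloid_point(math.pi, c)
print(max(abs(v - 2.0 * c) for v in checks))          # deviation from 2c
print(cycloid_time(math.pi, c), straight_line_time(a, b))
```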

Figure 1.5 A wheel rolls on the x-axis. The dot, which is fixed to the rim of the wheel, traces out a cycloid.

1.2.4 First integral

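How did we know that we could simplify both the soap-film problem and the brachistochrone by multiplying the Euler equation by $y'$? The answer is that there is a general principle, closely related to energy conservation in mechanics, that tells us when and how we can make such a simplification. The $y'$ trick works when the $f$ in $\int f\,dx$ is of the form $f(y, y')$, i.e. has no explicit dependence on $x$. In this case the last term in

$$\frac{df}{dx} = y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'} + \frac{\partial f}{\partial x} \tag{1.31}$$

is absent. We then have

$$\begin{aligned}
\frac{d}{dx}\left( f - y'\frac{\partial f}{\partial y'} \right) &= y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'} - y''\frac{\partial f}{\partial y'} - y'\frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \\
&= y'\left( \frac{\partial f}{\partial y} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \right),
\end{aligned} \tag{1.32}$$

and this is zero if the Euler–Lagrange equation is satisfied.

The quantity

$$I = f - y'\frac{\partial f}{\partial y'} \tag{1.33}$$

is called a first integral of the Euler–Lagrange equation. In the soap-film case

$$f - y'\frac{\partial f}{\partial y'} = y\sqrt{1 + (y')^2} - \frac{y (y')^2}{\sqrt{1 + (y')^2}} = \frac{y}{\sqrt{1 + (y')^2}}. \tag{1.34}$$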

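When there are a number of dependent variables $y_i$, so that we have

$$J[y_1, y_2, \ldots, y_n] = \int f(y_1, y_2, \ldots, y_n;\, y_1', y_2', \ldots, y_n')\,dx \tag{1.35}$$

then the first integral becomes

$$I = f - \sum_i y_i'\frac{\partial f}{\partial y_i'}. \tag{1.36}$$

Again

$$\begin{aligned}
\frac{dI}{dx} &= \frac{d}{dx}\left( f - \sum_i y_i'\frac{\partial f}{\partial y_i'} \right) \\
&= \sum_i \left( y_i'\frac{\partial f}{\partial y_i} + y_i''\frac{\partial f}{\partial y_i'} - y_i''\frac{\partial f}{\partial y_i'} - y_i'\frac{d}{dx}\left( \frac{\partial f}{\partial y_i'} \right) \right) \\
&= \sum_i y_i'\left( \frac{\partial f}{\partial y_i} - \frac{d}{dx}\left( \frac{\partial f}{\partial y_i'} \right) \right),
\end{aligned} \tag{1.37}$$

and this is zero if the Euler–Lagrange equation is satisfied for each $y_i$.

Note that there is only one first integral, no matter how many $y_i$'s there are.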

1.3 Lagrangian mechanics

In his Mécanique Analytique (1788) Joseph-Louis de La Grange, following Jean d'Alembert (1742) and Pierre de Maupertuis (1744), showed that most of classical mechanics can be recast as a variational condition: the principle of least action. The idea is to introduce the Lagrangian function $L = T - V$ where $T$ is the kinetic energy of the system and $V$ the potential energy, both expressed in terms of generalized coordinates $q_i$ and their time derivatives $\dot{q}_i$. Then, Lagrange showed, the multitude of Newton's $F = ma$ equations, one for each particle in the system, can be reduced to

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} = 0, \tag{1.38}$$

one equation for each generalized coordinate $q$. Quite remarkably, given that Lagrange's derivation contains no mention of maxima or minima, we recognize that this is precisely the condition that the action functional

$$S[q] = \int_{t_{\text{initial}}}^{t_{\text{final}}} L(t, q_i; \dot{q}_i)\,dt \tag{1.39}$$

be stationary with respect to variations of the trajectory $q_i(t)$ that leave the initial and final points fixed. This fact so impressed its discoverers that they believed they had uncovered the unifying principle of the universe. Maupertuis, for one, tried to base a proof of the existence of God on it. Today the action integral, through its starring role in the Feynman path-integral formulation of quantum mechanics, remains at the heart of theoretical physics.

Figure 1.6 Atwood's machine.

1.3.1 One degree of freedom

We shall not attempt to derive Lagrange's equations from d'Alembert's extension of the principle of virtual work (leaving this task to a mechanics course) but instead satisfy ourselves with some examples which illustrate the computational advantages of Lagrange's approach, as well as a subtle pitfall.

Consider, for example, Atwood's machine (Figure 1.6). This device, invented in 1784 but still a familiar sight in teaching laboratories, is used to demonstrate Newton's laws of motion and to measure $g$. It consists of two weights connected by a light string of length $l$ which passes over a light and frictionless pulley.

The elementary approach is to write an equation of motion for each of the two weights

$$m_1\ddot{x}_1 = m_1 g - T, \qquad m_2\ddot{x}_2 = m_2 g - T. \tag{1.40}$$

We then take into account the constraint $\dot{x}_1 = -\dot{x}_2$ and eliminate $\ddot{x}_2$ in favour of $\ddot{x}_1$:

$$m_1\ddot{x}_1 = m_1 g - T, \qquad -m_2\ddot{x}_1 = m_2 g - T. \tag{1.41}$$

Finally we eliminate the constraint force, the tension $T$, and obtain the acceleration

$$(m_1 + m_2)\ddot{x}_1 = (m_1 - m_2)g. \tag{1.42}$$


Lagrange's solution takes the constraint into account from the very beginning by introducing a single generalized coordinate $q = x_1 = l - x_2$, and writing

$$L = T - V = \tfrac{1}{2}(m_1 + m_2)\dot{q}^2 - (m_2 - m_1)gq. \tag{1.43}$$

From this we obtain a single equation of motion

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} = 0 \quad\Rightarrow\quad (m_1 + m_2)\ddot{q} = (m_1 - m_2)g. \tag{1.44}$$

The advantage of the Lagrangian method is that constraint forces, which do no net work, never appear. The disadvantage is exactly the same: if we need to find the constraint forces (in this case the tension in the string) we cannot use Lagrange alone.
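To make the division of labour concrete, here is a minimal sketch (the function name `atwood` and the numerical values are our own illustrative choices). The acceleration comes from the single Lagrange equation (1.44); the tension is not supplied by that equation and is recovered by substituting the acceleration back into the first of Newton's equations (1.40).

```python
def atwood(m1, m2, g=9.81):
    """Return (acceleration of m1, string tension) for Atwood's machine.

    x1 is measured downward, so a positive acceleration means m1 descends.
    """
    accel = (m1 - m2) * g / (m1 + m2)   # from (m1 + m2) q'' = (m1 - m2) g
    tension = m1 * (g - accel)          # from m1 x1'' = m1 g - T
    return accel, tension

accel, tension = atwood(3.0, 1.0, g=10.0)
print(accel, tension)
```

With equal masses the acceleration vanishes and the tension reduces to the weight of either mass, as one would expect.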

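Lagrange provides a convenient way to derive the equations of motion in non-cartesian coordinate systems, such as plane polar coordinates.

Consider the central force problem with $F_r = -\partial_r V(r)$. Newton's method begins by computing the acceleration in polar coordinates. This is most easily done by setting $z = re^{i\theta}$ and differentiating twice:

$$\dot{z} = (\dot{r} + ir\dot{\theta})e^{i\theta}, \qquad \ddot{z} = (\ddot{r} - r\dot{\theta}^2)e^{i\theta} + i(2\dot{r}\dot{\theta} + r\ddot{\theta})e^{i\theta}. \tag{1.45}$$

Reading off the components parallel and perpendicular to $e^{i\theta}$ gives the radial and angular acceleration (Figure 1.7)

$$a_r = \ddot{r} - r\dot{\theta}^2, \qquad a_\theta = r\ddot{\theta} + 2\dot{r}\dot{\theta}. \tag{1.46}$$

Newton's equations therefore become

$$m(\ddot{r} - r\dot{\theta}^2) = -\frac{\partial V}{\partial r}, \qquad m(r\ddot{\theta} + 2\dot{r}\dot{\theta}) = 0 \;\Rightarrow\; \frac{d}{dt}(mr^2\dot{\theta}) = 0. \tag{1.47}$$

Setting $l = mr^2\dot{\theta}$, the conserved angular momentum, and eliminating $\dot{\theta}$ gives

$$m\ddot{r} - \frac{l^2}{mr^3} = -\frac{\partial V}{\partial r}. \tag{1.48}$$

(If this were Kepler's problem, where $V = -GmM/r$, we would now proceed to simplify this equation by substituting $r = 1/u$, but that is another story.)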


Figure 1.7 Polar components of acceleration.

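Following Lagrange we first compute the kinetic energy in polar coordinates (this requires less thought than computing the acceleration) and set

$$L = T - V = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) - V(r). \tag{1.49}$$

The Euler–Lagrange equations are now

$$\begin{aligned}
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{r}} \right) - \frac{\partial L}{\partial r} = 0 \quad&\Rightarrow\quad m\ddot{r} - mr\dot{\theta}^2 + \frac{\partial V}{\partial r} = 0, \\
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right) - \frac{\partial L}{\partial \theta} = 0 \quad&\Rightarrow\quad \frac{d}{dt}(mr^2\dot{\theta}) = 0,
\end{aligned} \tag{1.50}$$

and coincide with Newton's.

The first integral is

$$E = \dot{r}\frac{\partial L}{\partial \dot{r}} + \dot{\theta}\frac{\partial L}{\partial \dot{\theta}} - L = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) + V(r), \tag{1.51}$$

which is the total energy. Thus the constancy of the first integral states that

$$\frac{dE}{dt} = 0, \tag{1.52}$$

or that energy is conserved.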
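Both conservation laws can be observed in a direct numerical integration of the equations of motion (1.50). The sketch below is illustrative only: it assumes the attractive potential V(r) = -k/r with m = k = 1, a hand-written fourth-order Runge-Kutta step, and arbitrary initial data for a bound orbit. All function names are hypothetical.

```python
def deriv(state, m=1.0, k=1.0):
    """Right-hand side of (1.50) for the assumed potential V(r) = -k/r."""
    r, rdot, theta, thetadot = state
    rddot = r * thetadot ** 2 - k / (m * r * r)   # m r'' = m r theta'^2 - dV/dr
    thetaddot = -2.0 * rdot * thetadot / r        # d/dt (m r^2 theta') = 0
    return (rdot, rddot, thetadot, thetaddot)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, f):
        return tuple(si + f * di for si, di in zip(s, d))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * dt))
    k3 = deriv(shifted(state, k2, 0.5 * dt))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, m=1.0, k=1.0):
    """Total energy (1.51) with V(r) = -k/r."""
    r, rdot, _, thetadot = state
    return 0.5 * m * (rdot ** 2 + (r * thetadot) ** 2) - k / r

def angular_momentum(state, m=1.0):
    """l = m r^2 theta'."""
    r, _, _, thetadot = state
    return m * r * r * thetadot

state = (1.0, 0.0, 0.0, 1.2)   # (r, r', theta, theta'): a non-circular bound orbit
e0, l0 = energy(state), angular_momentum(state)
for _ in range(5000):
    state = rk4_step(state, 1e-3)
e1, l1 = energy(state), angular_momentum(state)
print(e1 - e0, l1 - l0)        # both drifts should be tiny
```

The residual drift reflects the truncation error of the integrator rather than any failure of the conservation laws.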


Warning: we might realize, without having gone to the trouble of deriving it from the Lagrange equations, that rotational invariance guarantees that the angular momentum $l = mr^2\dot{\theta}$ is constant. Having done so, it is almost irresistible to try to short-circuit some of the labour by plugging this prior knowledge into

$$L = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) - V(r) \tag{1.53}$$

so as to eliminate the variable $\dot{\theta}$ in favour of the constant $l$. If we try this we get

$$L \overset{?}{\to} \tfrac{1}{2}m\dot{r}^2 + \frac{l^2}{2mr^2} - V(r). \tag{1.54}$$

We can now directly write down the Lagrange equation for $r$, which is

$$m\ddot{r} + \frac{l^2}{mr^3} \overset{?}{=} -\frac{\partial V}{\partial r}. \tag{1.55}$$

Unfortunately this has the wrong sign before the $l^2/mr^3$ term! The lesson is that we must be very careful in using consequences of a variational principle to modify the principle. It can be done, and in mechanics it leads to the Routhian or, in more modern language, to Hamiltonian reduction, but it requires using a Legendre transform. The reader should consult a book on mechanics for details.

1.3.2 Noether's theorem

The time-independence of the first integral

$$\frac{d}{dt}\left\{ \dot{q}\frac{\partial L}{\partial \dot{q}} - L \right\} = 0, \tag{1.56}$$

and of angular momentum

$$\frac{d}{dt}\{mr^2\dot{\theta}\} = 0, \tag{1.57}$$

are examples of conservation laws. We obtained them both by manipulating the Euler–Lagrange equations of motion, but also indicated that they were in some way connected with symmetries. One of the chief advantages of a variational formulation of a physical problem is that this connection

Symmetry ⇔ Conservation law

can be made explicit by exploiting a strategy due to Emmy Noether. She showed how to proceed directly from the action integral to the conserved quantity without having to fiddle about with the individual equations of motion. We begin by illustrating her technique in the case of angular momentum, whose conservation is a consequence of the rotational symmetry of the central force problem. The action integral for the central force problem is

$$S = \int_0^T \left\{ \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) - V(r) \right\} dt. \tag{1.58}$$

Noether observes that the integrand is left unchanged if we make the variation

$$\theta(t) \to \theta(t) + \varepsilon\alpha \tag{1.59}$$

where $\alpha$ is a fixed angle and $\varepsilon$ is a small, time-independent, parameter. This invariance is the symmetry we shall exploit. It is a mathematical identity: it does not require that $r$ and $\theta$ obey the equations of motion. She next observes that since the equations of motion are equivalent to the statement that $S$ is left stationary under any infinitesimal variations in $r$ and $\theta$, they necessarily imply that $S$ is stationary under the specific variation

$$\theta(t) \to \theta(t) + \varepsilon(t)\alpha \tag{1.60}$$

where now $\varepsilon$ is allowed to be time-dependent. This stationarity of the action is no longer a mathematical identity, but, because it requires $r$, $\theta$, to obey the equations of motion, has physical content. Inserting $\delta\theta = \varepsilon(t)\alpha$ into our expression for $S$ gives

$$\delta S = \alpha\int_0^T \left\{ mr^2\dot{\theta} \right\} \dot{\varepsilon}\,dt. \tag{1.61}$$

Note that this variation depends only on the time derivative of $\varepsilon$, and not $\varepsilon$ itself. This is because of the invariance of $S$ under time-independent rotations. We now assume that $\varepsilon(t) = 0$ at $t = 0$ and $t = T$, and integrate by parts to take the time derivative off $\varepsilon$ and put it on the rest of the integrand:

$$\delta S = -\alpha\int_0^T \left\{ \frac{d}{dt}(mr^2\dot{\theta}) \right\} \varepsilon(t)\,dt. \tag{1.62}$$

Since the equations of motion say that $\delta S = 0$ under all infinitesimal variations, and in particular those due to any time-dependent rotation $\varepsilon(t)\alpha$, we deduce that the equations of motion imply that the coefficient of $\varepsilon(t)$ must be zero, and so, provided $r(t)$, $\theta(t)$, obey the equations of motion, we have

$$0 = \frac{d}{dt}(mr^2\dot{\theta}). \tag{1.63}$$
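The statement that the first-order change in the action involves only the time derivative of the rotation parameter is an identity, so it can be tested on any path, not just a solution of the equations of motion. The sketch below is our own illustration: the path r(t) = 1 + 0.3t, θ(t) = t², the bump ε(t) = δ sin(πt/T), and all numerical values are arbitrary assumptions, and the function names are hypothetical. It compares the exact change in S with the first-order expression α∫ m r² θ' ε' dt; the discrepancy should shrink like δ².

```python
import math

def integrate(f, t0, t1, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (t1 - t0) / n
    total = f(t0) + f(t1)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(t0 + k * h)
    return total * h / 3.0

m, T, alpha = 1.0, 2.0, 0.7           # illustrative values
r = lambda t: 1.0 + 0.3 * t           # arbitrary path, not a solution
thetadot = lambda t: 2.0 * t          # corresponds to theta(t) = t**2

def epsdot(t, delta):
    """Time derivative of the bump eps(t) = delta * sin(pi t / T)."""
    return delta * (math.pi / T) * math.cos(math.pi * t / T)

def action_change(delta):
    """S[theta + alpha*eps] - S[theta]; the r'^2 and V(r) terms cancel exactly."""
    diff = lambda t: 0.5 * m * r(t) ** 2 * (
        (thetadot(t) + alpha * epsdot(t, delta)) ** 2 - thetadot(t) ** 2)
    return integrate(diff, 0.0, T)

def first_order_prediction(delta):
    """alpha * integral of m r^2 theta' eps' dt, the O(eps) variation."""
    return integrate(
        lambda t: m * r(t) ** 2 * thetadot(t) * alpha * epsdot(t, delta), 0.0, T)

for delta in (1e-2, 1e-3):
    print(delta, action_change(delta) - first_order_prediction(delta))
```

Because the bump vanishes at t = 0 and t = T, it is also an admissible fixed-endpoint variation of the kind used in the integration by parts.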

As a second illustration we derive energy (first integral) conservation for the case that the system is invariant under time translations, meaning that $L$ does not depend explicitly on time. In this case the action integral is invariant under constant time shifts $t \to t + \varepsilon$ in the argument of the dynamical variable:

$$q(t) \to q(t + \varepsilon) \approx q(t) + \varepsilon\dot{q}. \tag{1.64}$$

The equations of motion tell us that the action will be stationary under the variation

$$\delta q(t) = \varepsilon(t)\dot{q}, \tag{1.65}$$

where again we now permit the parameter $\varepsilon$ to depend on $t$. We insert this variation into

$$S = \int_0^T L\,dt \tag{1.66}$$

and find

$$\delta S = \int_0^T \left\{ \frac{\partial L}{\partial q}\varepsilon\dot{q} + \frac{\partial L}{\partial \dot{q}}(\varepsilon\ddot{q} + \dot{q}\dot{\varepsilon}) \right\} dt. \tag{1.67}$$

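This expression contains undotted $\varepsilon$'s. Because of this the change in $S$ is not obviously zero when $\varepsilon$ is time independent, but the absence of any explicit $t$ dependence in $L$ tells us that

$$\frac{dL}{dt} = \frac{\partial L}{\partial q}\dot{q} + \frac{\partial L}{\partial \dot{q}}\ddot{q}. \tag{1.68}$$

As a consequence, for time-independent $\varepsilon$, we have

$$\delta S = \int_0^T \left\{ \varepsilon\frac{dL}{dt} \right\} dt = \varepsilon[L]_0^T, \tag{1.69}$$

showing that the change in $S$ comes entirely from the endpoints of the time interval. These fixed endpoints explicitly break time-translation invariance, but in a trivial manner. For general $\varepsilon(t)$ we have

$$\delta S = \int_0^T \left\{ \varepsilon(t)\frac{dL}{dt} + \frac{\partial L}{\partial \dot{q}}\dot{q}\dot{\varepsilon} \right\} dt. \tag{1.70}$$

This equation is an identity. It does not rely on $q$ obeying the equation of motion. After an integration by parts, taking $\varepsilon(t)$ to be zero at $t = 0, T$, it is equivalent to

$$\delta S = \int_0^T \varepsilon(t)\frac{d}{dt}\left\{ L - \frac{\partial L}{\partial \dot{q}}\dot{q} \right\} dt. \tag{1.71}$$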


Now we assume that q(t) does obey the equations of motion. The variation principlethen says that S = 0 for any (t), and we deduce that for q(t) satisfying the equationsof motion we have

d

dt

{L L

qq

}= 0. (1.72)
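The conservation law (1.72) can be checked numerically on a simple system. The sketch below is only an illustration under stated assumptions: the Lagrangian $L = \tfrac12\dot q^2 + \cos q$ (a pendulum in units where all constants are one), the fourth-order Runge–Kutta integrator and the function names are choices made for this example and do not come from the text.

```python
import math

def lagrangian(q, qdot):
    """L = (1/2) qdot^2 + cos(q); it has no explicit time dependence."""
    return 0.5 * qdot ** 2 + math.cos(q)

def first_integral(q, qdot):
    """E = qdot * dL/dqdot - L, using dL/dqdot = qdot for this Lagrangian."""
    momentum = qdot
    return qdot * momentum - lagrangian(q, qdot)

def derivative(q, qdot):
    """Euler-Lagrange equation qddot = -sin(q), written as a first-order system."""
    return qdot, -math.sin(q)

def rk4_step(q, qdot, dt):
    """One classical Runge-Kutta step."""
    k1q, k1v = derivative(q, qdot)
    k2q, k2v = derivative(q + 0.5 * dt * k1q, qdot + 0.5 * dt * k1v)
    k3q, k3v = derivative(q + 0.5 * dt * k2q, qdot + 0.5 * dt * k2v)
    k4q, k4v = derivative(q + dt * k3q, qdot + dt * k3v)
    q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    v_new = qdot + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return q_new, v_new

q, qdot = 1.0, 0.0
energy_start = first_integral(q, qdot)
for _ in range(2000):
    q, qdot = rk4_step(q, qdot, 0.01)
energy_end = first_integral(q, qdot)
print(energy_start, energy_end)
```

The two printed values should differ only by the small truncation error of the integrator, reflecting the fact that $\dot q\,\partial L/\partial\dot q - L$ is constant along solutions.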

The general strategy that constitutes Noether's theorem must now be obvious: we look for an invariance of the action under a symmetry transformation with a time-independent parameter. We then observe that if the dynamical variables obey the equations of motion, then the action principle tells us that the action will remain stationary under such a variation of the dynamical variables even after the parameter is promoted to being time dependent. The resultant variation of $S$ can only depend on time derivatives of the parameter. We integrate by parts so as to take all the time derivatives off it, and on to the rest of the integrand. Because the parameter is arbitrary, we deduce that the equations of motion tell us that its coefficient in the integral must be zero. This coefficient is the time derivative of something, so this something is conserved.

1.3.3 Many degrees of freedom

The extension of the action principle to many degrees of freedom is straightforward. Asan example consider the small oscillations about equilibrium of a system with N degreesof freedom. We parametrize the system in terms of deviations from the equilibriumposition and expand out to quadratic order. We obtain a Lagrangian

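$$L = \sum_{i,j=1}^{N} \left\{ \frac12 M_{ij}\,\dot q^i \dot q^j - \frac12 V_{ij}\,q^i q^j \right\}, \tag{1.73}$$

where $M_{ij}$ and $V_{ij}$ are $N \times N$ symmetric matrices encoding the inertial and potential energy properties of the system. Now we have one equation

$$0 = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q^i}\right) - \frac{\partial L}{\partial q^i} = \sum_{j=1}^{N} \left( M_{ij}\,\ddot q^j + V_{ij}\,q^j \right) \tag{1.74}$$

for each $i$.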
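To make (1.74) concrete, one can substitute the trial solution $q^j = a^j \cos\omega t$, which turns the equations of motion into $\sum_j (V_{ij} - \omega^2 M_{ij}) a^j = 0$ and hence $\det(V - \omega^2 M) = 0$. The sketch below solves this condition for $N = 2$. The matrices chosen (two unit masses coupled by three unit springs) and the helper name `normal_mode_frequencies` are illustrative assumptions.

```python
import math

def normal_mode_frequencies(M, V):
    """Roots of det(V - w^2 M) = 0 for symmetric, positive-definite 2x2 M and V."""
    # Expand the determinant as a*lam^2 + b*lam + c = 0 with lam = w^2.
    a = M[0][0] * M[1][1] - M[0][1] ** 2
    b = -(M[0][0] * V[1][1] + M[1][1] * V[0][0] - 2 * M[0][1] * V[0][1])
    c = V[0][0] * V[1][1] - V[0][1] ** 2
    disc = math.sqrt(b * b - 4 * a * c)
    lam = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])
    return [math.sqrt(value) for value in lam]

# Hypothetical example: two unit masses joined to walls and to each other by unit springs.
M = [[1.0, 0.0], [0.0, 1.0]]
V = [[2.0, -1.0], [-1.0, 2.0]]
frequencies = normal_mode_frequencies(M, V)
print(frequencies)
```

For this choice the characteristic equation is $\lambda^2 - 4\lambda + 3 = 0$, so the two normal-mode frequencies are $1$ and $\sqrt3$, corresponding to the in-phase and out-of-phase motions.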

1.3.4 Continuous systems

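The action principle can be extended to field theories and to continuum mechanics. Here one has a continuous infinity of dynamical degrees of freedom, either one for each point in space and time or one for each point in the material, but the extension of the variational derivative to functions of more than one variable should possess no conceptual difficulties.

Suppose we are given an action functional $S[\varphi]$ depending on a field $\varphi(x^\mu)$ and its first derivatives

$$\varphi_\mu \equiv \frac{\partial\varphi}{\partial x^\mu}. \tag{1.75}$$

Here $x^\mu$, $\mu = 0, 1, \ldots, d$, are the coordinates of $(d+1)$-dimensional space-time. It is traditional to take $x^0 \equiv t$ and the other coordinates space-like. Suppose further that

$$S[\varphi] = \int L \, dt = \int \mathcal{L}(x^\mu, \varphi, \varphi_\mu)\, d^{d+1}x, \tag{1.76}$$

where $\mathcal{L}$ is the Lagrangian density, in terms of which

$$L = \int \mathcal{L}\, d^d x, \tag{1.77}$$

and the integral is over the space coordinates. Now

$$\begin{aligned}
\delta S &= \int \left\{ \delta\varphi(x)\frac{\partial\mathcal{L}}{\partial\varphi(x)} + \delta(\varphi_\mu(x))\frac{\partial\mathcal{L}}{\partial\varphi_\mu(x)} \right\} d^{d+1}x \\
&= \int \delta\varphi(x) \left\{ \frac{\partial\mathcal{L}}{\partial\varphi(x)} - \frac{\partial}{\partial x^\mu}\left(\frac{\partial\mathcal{L}}{\partial\varphi_\mu(x)}\right) \right\} d^{d+1}x.
\end{aligned} \tag{1.78}$$

In going from the first line to the second, we have observed that

$$\delta(\varphi_\mu(x)) = \frac{\partial}{\partial x^\mu}\,\delta\varphi(x) \tag{1.79}$$

and used the divergence theorem,

$$\int_\Omega \left(\frac{\partial A^\mu}{\partial x^\mu}\right) d^{n+1}x = \int_{\partial\Omega} A^\mu n_\mu \, dS, \tag{1.80}$$

where $\Omega$ is some space-time region and $\partial\Omega$ its boundary, to integrate by parts. Here $dS$ is the element of area on the boundary, and $n_\mu$ the outward normal. As before, we take $\delta\varphi$ to vanish on the boundary, and hence there is no boundary contribution to the variation of $S$. The result is that

$$\frac{\delta S}{\delta\varphi(x)} = \frac{\partial\mathcal{L}}{\partial\varphi(x)} - \frac{\partial}{\partial x^\mu}\left(\frac{\partial\mathcal{L}}{\partial\varphi_\mu(x)}\right), \tag{1.81}$$

and the equation of motion comes from setting this to zero. Note that a sum over the repeated coordinate index $\mu$ is implied. In practice it is easier not to use this formula. Instead, make the variation by hand as in the following examples.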

Example: The vibrating string. The simplest continuous dynamical system is the transversely vibrating string (Figure 1.8). We describe the string displacement by $y(x, t)$.

Figure 1.8 Transversely vibrating string.

Let us suppose that the string has fixed ends, a mass per unit length of $\rho$ and is under tension $T$. If we assume only small displacements from equilibrium, the Lagrangian is

$$L = \int_0^L dx \left\{ \frac12 \rho\,\dot y^2 - \frac12 T\,y'^2 \right\}. \tag{1.82}$$

The dot denotes a partial derivative with respect to $t$, and the prime a partial derivative with respect to $x$. The variation of the action is

$$\begin{aligned}
\delta S &= \iint_0^L dt\,dx \left\{ \rho\,\dot y\,\delta\dot y - T\,y'\,\delta y' \right\} \\
&= \iint_0^L dt\,dx \left\{ \delta y(x,t)\left( -\rho\,\ddot y + T\,y'' \right) \right\}.
\end{aligned} \tag{1.83}$$

To reach the second line we have integrated by parts, and, because the ends are fixed, and therefore $\delta y = 0$ at $x = 0$ and $L$, there is no boundary term. Requiring that $\delta S = 0$ for all allowed variations $\delta y$ then gives the equation of motion

$$\rho\,\ddot y - T\,y'' = 0. \tag{1.84}$$

This is the wave equation describing transverse waves propagating with speed $c = \sqrt{T/\rho}$. Observe that from (1.83) we can read off the functional derivative of $S$ with respect to the variable $y(x, t)$ as being

$$\frac{\delta S}{\delta y(x,t)} = -\rho\,\ddot y(x,t) + T\,y''(x,t). \tag{1.85}$$

In writing down the first integral for this continuous system, we must replace the sum over discrete indices by an integral:

$$E = \sum_i \dot q_i \frac{\partial L}{\partial \dot q_i} - L \;\to\; \int dx \left\{ \dot y(x)\frac{\delta L}{\delta \dot y(x)} \right\} - L. \tag{1.86}$$

When computing $\delta L/\delta\dot y(x)$ from

$$L = \int_0^L dx \left\{ \frac12 \rho\,\dot y^2 - \frac12 T\,y'^2 \right\},$$

we must remember that it is the continuous analogue of $\partial L/\partial \dot q_i$, and so, in contrast to what we do when computing $\delta S/\delta y(x)$, we must treat $\dot y(x)$ as a variable independent of $y(x)$. We then have

$$\frac{\delta L}{\delta \dot y(x)} = \rho\,\dot y(x), \tag{1.87}$$

leading to

$$E = \int_0^L dx \left\{ \frac12 \rho\,\dot y^2 + \frac12 T\,y'^2 \right\}. \tag{1.88}$$

This, as expected, is the total energy, kinetic plus potential, of the string.
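A quick numerical check of (1.88) is to evaluate the energy integral for an exact solution of the wave equation at two different times. The sketch below uses the fundamental standing wave $y = \sin(\pi x/L)\cos(\pi c t/L)$; the parameter values, the midpoint-rule quadrature and the function names are assumptions chosen for illustration.

```python
import math

RHO, T, LENGTH = 2.0, 8.0, 1.0          # illustrative density, tension and length
C = math.sqrt(T / RHO)                  # wave speed c = sqrt(T / rho)

def ydot(x, t):
    """Time derivative of y = sin(pi x / L) cos(pi c t / L)."""
    return -(math.pi * C / LENGTH) * math.sin(math.pi * x / LENGTH) * math.sin(math.pi * C * t / LENGTH)

def yprime(x, t):
    """Space derivative of the same standing wave."""
    return (math.pi / LENGTH) * math.cos(math.pi * x / LENGTH) * math.cos(math.pi * C * t / LENGTH)

def energy(t, n=2000):
    """Midpoint-rule estimate of E = integral of (rho ydot^2 + T y'^2) / 2 over the string."""
    dx = LENGTH / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += 0.5 * RHO * ydot(x, t) ** 2 + 0.5 * T * yprime(x, t) ** 2
    return total * dx

e_early, e_late = energy(0.0), energy(0.37)
print(e_early, e_late)
```

At $t = 0$ the energy is purely potential and later it is shared between the kinetic and potential terms, but the total stays at $\pi^2 T/(4L)$ because $\rho c^2 = T$.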

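The energy–momentum tensor

If we consider an action of the form

$$S = \int \mathcal{L}(\varphi, \varphi_\mu)\, d^{d+1}x, \tag{1.89}$$

in which $\mathcal{L}$ does not depend explicitly on any of the coordinates $x^\mu$, we may refine Noether's derivation of the law of conservation of total energy and obtain accounting information about the position-dependent energy density. To do this we make a variation of the form

$$\varphi(x) \to \varphi(x^\mu + \epsilon^\mu(x)) = \varphi(x^\mu) + \epsilon^\mu(x)\,\partial_\mu\varphi + O(|\epsilon|^2), \tag{1.90}$$

where $\epsilon$ depends on $x \equiv (x^0, \ldots, x^d)$. The resulting variation in $S$ is

$$\begin{aligned}
\delta S &= \int \left\{ \frac{\partial\mathcal{L}}{\partial\varphi}\,\epsilon^\mu\partial_\mu\varphi + \frac{\partial\mathcal{L}}{\partial\varphi_\nu}\,\partial_\nu\!\left(\epsilon^\mu\partial_\mu\varphi\right) \right\} d^{d+1}x \\
&= \int \epsilon^\mu(x)\,\frac{\partial}{\partial x^\nu}\left\{ \mathcal{L}\,\delta^\nu_\mu - \frac{\partial\mathcal{L}}{\partial\varphi_\nu}\,\partial_\mu\varphi \right\} d^{d+1}x.
\end{aligned} \tag{1.91}$$

When $\varphi$ satisfies the equations of motion, this $\delta S$ will be zero for arbitrary $\epsilon^\mu(x)$. We conclude that

$$\frac{\partial}{\partial x^\nu}\left\{ \mathcal{L}\,\delta^\nu_\mu - \frac{\partial\mathcal{L}}{\partial\varphi_\nu}\,\partial_\mu\varphi \right\} = 0. \tag{1.92}$$

The $(d+1)$-by-$(d+1)$ array of functions

$$T^\nu{}_\mu \equiv \frac{\partial\mathcal{L}}{\partial\varphi_\nu}\,\partial_\mu\varphi - \delta^\nu_\mu\,\mathcal{L} \tag{1.93}$$

is known as the canonical energy–momentum tensor because the statement

$$\partial_\nu T^\nu{}_\mu = 0 \tag{1.94}$$

often provides bookkeeping for the flow of energy and momentum.

In the case of the vibrating string, the $\mu = 0, 1$ components of $\partial_\nu T^\nu{}_\mu = 0$ become the two following local conservation equations:

$$\frac{\partial}{\partial t}\left\{ \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2 \right\} + \frac{\partial}{\partial x}\left\{ -T\,\dot y\,y' \right\} = 0, \tag{1.95}$$

and

$$\frac{\partial}{\partial t}\left\{ -\rho\,\dot y\,y' \right\} + \frac{\partial}{\partial x}\left\{ \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2 \right\} = 0. \tag{1.96}$$

It is easy to verify that these are indeed consequences of the wave equation. They are local conservation laws because they are of the form

$$\frac{\partial q}{\partial t} + \operatorname{div}\mathbf{J} = 0, \tag{1.97}$$

where $q$ is the local density, and $\mathbf{J}$ the flux, of the globally conserved quantity $Q = \int q\, d^d x$. In the first case, the local density $q$ is

$$T^0{}_0 = \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2, \tag{1.98}$$

which is the energy density. The energy flux is given by $T^1{}_0 \equiv -T\,\dot y\,y'$, which is the rate that a segment of string is doing work on its neighbour to the right. Integrating over $x$, and observing that the fixed-end boundary conditions are such that

$$\int_0^L \frac{\partial}{\partial x}\left\{ -T\,\dot y\,y' \right\} dx = \left[ -T\,\dot y\,y' \right]_0^L = 0, \tag{1.99}$$

gives us

$$\frac{d}{dt}\int_0^L \left\{ \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2 \right\} dx = 0, \tag{1.100}$$

which is the global energy conservation law we obtained earlier.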
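The local energy conservation law (1.95) can be tested pointwise on a travelling-wave solution $y = f(x - ct)$. The sketch below is illustrative only: the Gaussian profile $f(s) = e^{-s^2}$, the parameter values, the central-difference step and the function names are all assumptions introduced for this example.

```python
import math

RHO, T = 2.0, 8.0                 # illustrative density and tension
C = math.sqrt(T / RHO)            # wave speed

def fprime(s):
    """Derivative of the assumed pulse profile f(s) = exp(-s^2)."""
    return -2.0 * s * math.exp(-s * s)

def energy_density(x, t):
    """(rho/2) ydot^2 + (T/2) y'^2 for y = f(x - c t)."""
    s = x - C * t
    ydot = -C * fprime(s)
    yprime = fprime(s)
    return 0.5 * RHO * ydot ** 2 + 0.5 * T * yprime ** 2

def energy_flux(x, t):
    """-T ydot y' for the same travelling wave."""
    s = x - C * t
    return -T * (-C * fprime(s)) * fprime(s)

def residual(x, t, step=1e-5):
    """Central-difference estimate of d(density)/dt + d(flux)/dx."""
    d_density_dt = (energy_density(x, t + step) - energy_density(x, t - step)) / (2 * step)
    d_flux_dx = (energy_flux(x + step, t) - energy_flux(x - step, t)) / (2 * step)
    return d_density_dt + d_flux_dx

residuals = [residual(x, 0.3) for x in (-0.5, 0.2, 0.9, 1.4)]
print(residuals)
```

Each residual should vanish up to finite-difference error: the energy density at a point changes only because the flux carries energy past it.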


The physical interpretation of $-\rho\,\dot y\,y'$, the locally conserved density appearing in (1.96), is less obvious. With the definition (1.93) it is, up to a sign, the component $T^0{}_1$ of the energy–momentum tensor. If this were a relativistic system, we would immediately identify $\int T^0{}_1\, dx$ as the $x$-component of the energy–momentum 4-vector, and therefore $T^0{}_1$ as the density of $x$-momentum. Now any real string will have some motion in the $x$-direction, but the magnitude of this motion will depend on the string's elastic constants and other quantities unknown to our Lagrangian. Because of this, the $T^0{}_1$ derived from $L$ cannot be the string's $x$-momentum density. Instead, it is the density of something called pseudo-momentum. The distinction between true and pseudo-momentum is best appreciated by considering the corresponding Noether symmetry. The symmetry associated with Newtonian momentum is the invariance of the action integral under an $x$-translation of the entire apparatus: the string, and any wave on it. The symmetry associated with pseudo-momentum is the invariance of the action under a shift $y(x) \to y(x - a)$ of the location of the wave on the string, the string itself not being translated. Newtonian momentum is conserved if the ambient space is translationally invariant. Pseudo-momentum is conserved only if the string is translationally invariant, i.e. if $\rho$ and $T$ are position-independent. A failure to realize that the presence of a medium (here the string) requires us to distinguish between these two symmetries is the origin of much confusion involving "wave momentum".

Maxwell's equations

Michael Faraday and James Clerk Maxwell's description of electromagnetism in terms of dynamical vector fields gave us the first modern field theory. D'Alembert and Maupertuis would have been delighted to discover that the famous equations of Maxwell's A Treatise on Electricity and Magnetism (1873) follow from an action principle. There is a slight complication stemming from gauge invariance but, as long as we are not interested in exhibiting the covariance of Maxwell under Lorentz transformations, we can sweep this under the rug by working in the axial gauge, where the scalar electric potential does not appear.

We will start from Maxwell's equations

$$\begin{aligned}
\operatorname{div}\mathbf{B} &= 0, \\
\operatorname{curl}\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t}, \\
\operatorname{curl}\mathbf{H} &= \mathbf{J} + \frac{\partial\mathbf{D}}{\partial t}, \\
\operatorname{div}\mathbf{D} &= \rho,
\end{aligned} \tag{1.101}$$

and show that they can be obtained from an action principle. For convenience we shall use natural units in which $\mu_0 = \epsilon_0 = 1$, and so $c = 1$ and $\mathbf{D} \equiv \mathbf{E}$ and $\mathbf{B} \equiv \mathbf{H}$.

The first equation $\operatorname{div}\mathbf{B} = 0$ contains no time derivatives. It is a constraint which we satisfy by introducing a vector potential $\mathbf{A}$ such that $\mathbf{B} = \operatorname{curl}\mathbf{A}$. If we set

$$\mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t}, \tag{1.102}$$

then this automatically implies Faraday's law of induction

$$\operatorname{curl}\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}. \tag{1.103}$$

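We now guess that the Lagrangian is

$$L = \int d^3x \left[ \frac12\left\{ \mathbf{E}^2 - \mathbf{B}^2 \right\} + \mathbf{J}\cdot\mathbf{A} \right]. \tag{1.104}$$

The motivation is that $L$ looks very like $T - V$ if we regard $\frac12\mathbf{E}^2 \equiv \frac12\dot{\mathbf{A}}^2$ as being the kinetic energy and $\frac12\mathbf{B}^2 = \frac12(\operatorname{curl}\mathbf{A})^2$ as being the potential energy. The term in $\mathbf{J}$ represents the interaction of the fields with an external current source. In the axial gauge the electric charge density $\rho$ does not appear in the Lagrangian. The corresponding action is therefore

$$S = \int L \, dt = \iint d^3x \left[ \frac12\dot{\mathbf{A}}^2 - \frac12(\operatorname{curl}\mathbf{A})^2 + \mathbf{J}\cdot\mathbf{A} \right] dt. \tag{1.105}$$

Now vary $\mathbf{A}$ to $\mathbf{A} + \delta\mathbf{A}$, whence

$$\delta S = \iint d^3x \left[ -\ddot{\mathbf{A}}\cdot\delta\mathbf{A} - (\operatorname{curl}\mathbf{A})\cdot(\operatorname{curl}\delta\mathbf{A}) + \mathbf{J}\cdot\delta\mathbf{A} \right] dt. \tag{1.106}$$

Here, we have already removed the time derivative from $\delta\mathbf{A}$ by integrating by parts in the time direction. Now we do the integration by parts in the space directions by using the identity

$$\operatorname{div}\left( \delta\mathbf{A} \times (\operatorname{curl}\mathbf{A}) \right) = (\operatorname{curl}\mathbf{A})\cdot(\operatorname{curl}\delta\mathbf{A}) - \delta\mathbf{A}\cdot\left( \operatorname{curl}(\operatorname{curl}\mathbf{A}) \right) \tag{1.107}$$

and taking $\delta\mathbf{A}$ to vanish at spatial infinity, so the surface term, which would come from the integral of the total divergence, is zero. We end up with

$$\delta S = \iint d^3x \left\{ \delta\mathbf{A}\cdot\left[ -\ddot{\mathbf{A}} - \operatorname{curl}(\operatorname{curl}\mathbf{A}) + \mathbf{J} \right] \right\} dt. \tag{1.108}$$

Demanding that the variation of $S$ be zero thus requires

$$-\frac{\partial^2\mathbf{A}}{\partial t^2} = \operatorname{curl}(\operatorname{curl}\mathbf{A}) - \mathbf{J}, \tag{1.109}$$

or, in terms of the physical fields,

$$\operatorname{curl}\mathbf{B} = \mathbf{J} + \frac{\partial\mathbf{E}}{\partial t}. \tag{1.110}$$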

This is Ampère's law, as modified by Maxwell so as to include the displacement current.

How do we deal with the last Maxwell equation, Gauss' law, which asserts that $\operatorname{div}\mathbf{E} = \rho$? If $\rho$ were equal to zero, this equation would hold if $\operatorname{div}\mathbf{A} = 0$, i.e. if $\mathbf{A}$ were solenoidal. In this case we might be tempted to impose the constraint $\operatorname{div}\mathbf{A} = 0$ on the vector potential, but doing so would undo all our good work, as we have been assuming that we can vary $\mathbf{A}$ freely.

We notice, however, that the three Maxwell equations we already possess tell us that

$$\frac{\partial}{\partial t}\left( \operatorname{div}\mathbf{E} - \rho \right) = \operatorname{div}(\operatorname{curl}\mathbf{B}) - \left( \operatorname{div}\mathbf{J} + \frac{\partial\rho}{\partial t} \right). \tag{1.111}$$

Now $\operatorname{div}(\operatorname{curl}\mathbf{B}) = 0$, so the left-hand side is zero provided charge is conserved, i.e. provided

$$\dot\rho + \operatorname{div}\mathbf{J} = 0. \tag{1.112}$$

We assume that this is so. Thus, if Gauss' law holds initially, it holds eternally. We arrange for it to hold at $t = 0$ by imposing initial conditions on $\mathbf{A}$. We first choose $\mathbf{A}|_{t=0}$ by requiring it to satisfy

$$\mathbf{B}|_{t=0} = \operatorname{curl}\left( \mathbf{A}|_{t=0} \right). \tag{1.113}$$

The solution is not unique, because we may add any $\nabla\phi$ to $\mathbf{A}|_{t=0}$, but this does not affect the physical $\mathbf{E}$ and $\mathbf{B}$ fields. The initial "velocities" $\dot{\mathbf{A}}|_{t=0}$ are then fixed uniquely by $\dot{\mathbf{A}}|_{t=0} = -\mathbf{E}|_{t=0}$, where the initial $\mathbf{E}$ satisfies Gauss' law. The subsequent evolution of $\mathbf{A}$ is then uniquely determined by integrating the second-order equation (1.109).

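The first integral for Maxwell is

$$\begin{aligned}
E &= \sum_{i=1}^{3} \int d^3x \left\{ \dot A_i \frac{\delta L}{\delta \dot A_i} \right\} - L \\
&= \int d^3x \left[ \frac12\left\{ \mathbf{E}^2 + \mathbf{B}^2 \right\} - \mathbf{J}\cdot\mathbf{A} \right].
\end{aligned} \tag{1.114}$$

This will be conserved if $\mathbf{J}$ is time-independent. If $\mathbf{J} = 0$, it is the total field energy.

Suppose $\mathbf{J}$ is neither zero nor time-independent. Then, looking back at the derivation of the time-independence of the first integral, we see that if $L$ does depend on time, we instead have

$$\frac{dE}{dt} = -\frac{\partial L}{\partial t}. \tag{1.115}$$

In the present case we have

$$\frac{\partial L}{\partial t} = \int \dot{\mathbf{J}}\cdot\mathbf{A}\, d^3x, \tag{1.116}$$

so that

$$-\int \dot{\mathbf{J}}\cdot\mathbf{A}\, d^3x = \frac{dE}{dt} = \frac{d}{dt}(\text{Field Energy}) - \int \left\{ \mathbf{J}\cdot\dot{\mathbf{A}} + \dot{\mathbf{J}}\cdot\mathbf{A} \right\} d^3x. \tag{1.117}$$

Thus, cancelling the duplicated term and using $\mathbf{E} = -\dot{\mathbf{A}}$, we find

$$\frac{d}{dt}(\text{Field Energy}) = -\int \mathbf{J}\cdot\mathbf{E}\, d^3x. \tag{1.118}$$

Now $\int \mathbf{J}\cdot(-\mathbf{E})\, d^3x$ is the rate at which the power source driving the current is doing work against the field. The result is therefore physically sensible.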

Continuum mechanics

Because the mechanics of discrete objects can be derived from an action principle, it seems obvious that so must the mechanics of continua. This is certainly true if we use the Lagrangian description where we follow the history of each particle composing the continuous material as it moves through space. In fluid mechanics it is more natural to describe the motion by using the Eulerian description in which we focus on what is going on at a particular point in space by introducing a velocity field $\mathbf{v}(\mathbf{r}, t)$. Eulerian action principles can still be found, but they seem to be logically distinct from the Lagrangian mechanics action principle, and mostly were not discovered until the twentieth century.

We begin by showing that Euler's equation for the irrotational motion of an inviscid compressible fluid can be obtained by applying the action principle to a functional

$$S[\phi, \rho] = \int dt \int d^3x \left\{ \rho\frac{\partial\phi}{\partial t} + \frac12\rho(\nabla\phi)^2 + u(\rho) \right\}, \tag{1.119}$$

where $\rho$ is the mass density and the flow velocity is determined from the velocity potential $\phi$ by $\mathbf{v} = \nabla\phi$. The function $u(\rho)$ is the internal energy density.

Varying $S[\phi, \rho]$ with respect to $\rho$ is straightforward, and gives a time-dependent generalization of (Daniel) Bernoulli's equation

$$\frac{\partial\phi}{\partial t} + \frac12\mathbf{v}^2 + h(\rho) = 0. \tag{1.120}$$

Here $h(\rho) \equiv du/d\rho$ is the specific enthalpy.¹ Varying with respect to $\phi$ requires an integration by parts, based on

$$\operatorname{div}(\rho\,\delta\phi\,\nabla\phi) = \rho(\nabla\delta\phi)\cdot(\nabla\phi) + \delta\phi\,\operatorname{div}(\rho\nabla\phi), \tag{1.121}$$

and gives the equation of mass conservation

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0. \tag{1.122}$$

¹ The enthalpy $H = U + PV$ per unit mass. In general $u$ and $h$ will be functions of both the density and the specific entropy. By taking $u$ to depend only on $\rho$ we are tacitly assuming that specific entropy is constant. This makes the resultant flow barotropic, meaning that the pressure is a function of the density only.

Taking the gradient of Bernoulli's equation, and using the fact that for potential flow the vorticity $\operatorname{curl}\mathbf{v}$ is zero and so $\partial_i v_j = \partial_j v_i$, we find that

$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\nabla h. \tag{1.123}$$

We now introduce the pressure $P$, which is related to $h$ by

$$h(P) = \int_0^P \frac{dP}{\rho(P)}. \tag{1.124}$$

We see that $\rho\nabla h = \nabla P$, and so obtain Euler's equation

$$\rho\left( \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} \right) = -\nabla P. \tag{1.125}$$

For future reference, we observe that combining the mass-conservation equation

$$\partial_t\rho + \partial_j\{\rho v_j\} = 0 \tag{1.126}$$

with Euler's equation

$$\rho(\partial_t v_i + v_j\partial_j v_i) = -\partial_i P \tag{1.127}$$

yields

$$\partial_t\{\rho v_i\} + \partial_j\{\rho v_i v_j + \delta_{ij}P\} = 0, \tag{1.128}$$

which expresses the local conservation of momentum. The quantity

$$\Pi_{ij} = \rho v_i v_j + \delta_{ij}P \tag{1.129}$$

is the momentum-flux tensor, and is the $j$-th component of the flux of the $i$-th component $p_i = \rho v_i$ of momentum density.

The relations $h = du/d\rho$ and $\rho = dP/dh$ show that $P$ and $u$ are related by a Legendre transformation: $P = \rho h - u(\rho)$. From this, and the Bernoulli equation, we see that the integrand in the action (1.119) is equal to minus the pressure:

$$-P = \rho\frac{\partial\phi}{\partial t} + \frac12\rho(\nabla\phi)^2 + u(\rho). \tag{1.130}$$

This Eulerian formulation cannot be a "follow the particle" action principle in a clever disguise. The mass conservation law is only a consequence of the equation of motion, and is not built in from the beginning as a constraint. Our variations in $\phi$ are therefore conjuring up new matter rather than merely moving it around.
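The Legendre relation $P = \rho h - u(\rho)$ with $h = du/d\rho$ can be illustrated with a specific equation of state. The sketch below assumes a polytropic internal energy density $u(\rho) = K\rho^\gamma/(\gamma - 1)$, which is a choice made for this example and not something specified in the text; the constants and function names are likewise hypothetical. For this $u$ one expects $P = K\rho^\gamma$.

```python
K, GAMMA = 1.5, 1.4            # illustrative polytropic constants (assumed)

def u(rho):
    """Assumed internal energy density u(rho) = K rho^gamma / (gamma - 1)."""
    return K * rho ** GAMMA / (GAMMA - 1.0)

def h(rho, eps=1e-6):
    """Specific enthalpy h = du/drho, estimated by a central difference."""
    return (u(rho + eps) - u(rho - eps)) / (2.0 * eps)

def pressure(rho):
    """Legendre transform P = rho h - u(rho)."""
    return rho * h(rho) - u(rho)

rho0 = 2.0
p_legendre = pressure(rho0)
p_expected = K * rho0 ** GAMMA
print(p_legendre, p_expected)
```

The agreement of the two numbers reflects the algebra $\rho h - u = K\rho^\gamma\left[\gamma/(\gamma-1) - 1/(\gamma-1)\right] = K\rho^\gamma$.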


1.4 Variable endpoints

We now relax our previous assumption that all boundary or surface terms arising fromintegrations by parts may be ignored. We will find that variation principles can be veryuseful for working out what boundary conditions we should impose on our differentialequations.

Consider the problem of building a railway across a parallel-sided isthmus (Figure 1.9). Suppose that the cost of construction is proportional to the length of the track, but the cost of sea transport being negligible, we may locate the terminal seaports wherever we like. We therefore wish to minimize the length

$$L[y] = \int_{x_1}^{x_2} \sqrt{1 + (y')^2}\, dx, \tag{1.131}$$

by allowing both the path $y(x)$ and the endpoints $y(x_1)$ and $y(x_2)$ to vary. Then

$$\begin{aligned}
L[y + \delta y] - L[y] &= \int_{x_1}^{x_2} (\delta y')\frac{y'}{\sqrt{1 + (y')^2}}\, dx \\
&= \int_{x_1}^{x_2} \left\{ \frac{d}{dx}\left( \delta y\,\frac{y'}{\sqrt{1 + (y')^2}} \right) - \delta y\,\frac{d}{dx}\left( \frac{y'}{\sqrt{1 + (y')^2}} \right) \right\} dx \\
&= \delta y(x_2)\frac{y'(x_2)}{\sqrt{1 + (y')^2}} - \delta y(x_1)\frac{y'(x_1)}{\sqrt{1 + (y')^2}} - \int_{x_1}^{x_2} \delta y\,\frac{d}{dx}\left( \frac{y'}{\sqrt{1 + (y')^2}} \right) dx.
\end{aligned} \tag{1.132}$$

Figure 1.9 Railway across an isthmus.

We have stationarity when both

(i) The coefficient of $\delta y(x)$ in the integral,

$$-\frac{d}{dx}\left( \frac{y'}{\sqrt{1 + (y')^2}} \right), \tag{1.133}$$

is zero. This requires that $y' = \text{const.}$, i.e. the track should be straight.

(ii) The coefficients of $\delta y(x_1)$ and $\delta y(x_2)$ vanish. For this we need

$$0 = \frac{y'(x_1)}{\sqrt{1 + (y')^2}} = \frac{y'(x_2)}{\sqrt{1 + (y')^2}}. \tag{1.134}$$

This in turn requires that $y'(x_1) = y'(x_2) = 0$.

The integrated-out bits have determined the boundary conditions that are to be imposed on the solution of the differential equation. In the present case they require us to build perpendicular to the coastline, and so we go straight across the isthmus. When boundary conditions are obtained from endpoint variations in this way, they are called natural boundary conditions.
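The emergence of the natural boundary conditions can be seen in a discretized version of the railway problem. The sketch below approximates the track by a polyline with all heights, including the two endpoints, free to move, and relaxes it by plain gradient descent on the total length. The grid size, step length, iteration count, starting curve and function names are all assumptions chosen for illustration.

```python
import math

def length(ys, dx):
    """Total length of the polyline through (i*dx, ys[i])."""
    return sum(math.hypot(dx, ys[i + 1] - ys[i]) for i in range(len(ys) - 1))

def gradient(ys, dx):
    """Derivative of the length with respect to every height, endpoints included."""
    grad = [0.0] * len(ys)
    for i in range(len(ys) - 1):
        slope_term = (ys[i + 1] - ys[i]) / math.hypot(dx, ys[i + 1] - ys[i])
        grad[i] -= slope_term
        grad[i + 1] += slope_term
    return grad

def relax(ys, dx, step=0.02, iterations=5000):
    """Gradient descent on the length with free endpoints."""
    ys = list(ys)
    for _ in range(iterations):
        g = gradient(ys, dx)
        ys = [y - step * gi for y, gi in zip(ys, g)]
    return ys

WIDTH, N = 1.0, 10                      # isthmus width and number of segments (assumed)
dx = WIDTH / N
start = [0.3 * i * dx + 0.2 * math.sin(math.pi * i * dx) for i in range(N + 1)]
track = relax(start, dx)
print(max(track) - min(track), length(track, dx))
```

Because the end heights are not pinned, the relaxed track becomes flat, so it meets both coastlines at right angles and its length approaches the width of the isthmus.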

Example: Sliding string. A massive string of linear density $\rho$ is stretched between two smooth posts separated by distance $2L$ (Figure 1.10). The string is under tension $T$, and is free to slide up and down the posts. We consider only small deviations of the string from the horizontal.

As we saw earlier, the Lagrangian for a stretched string is

$$L = \int_{-L}^{L} \left\{ \frac12\rho\,\dot y^2 - \frac12 T (y')^2 \right\} dx. \tag{1.135}$$

Now, Lagrange's principle says that the equation of motion is found by requiring the action

$$S = \int_{t_i}^{t_f} L \, dt \tag{1.136}$$

Figure 1.10 Sliding string.

to be stationary under variations of $y(x, t)$ that vanish at the initial and final times, $t_i$ and $t_f$. It does not demand that $\delta y$ vanish at the ends of the string, $x = \pm L$. So, when we make the variation, we must not assume this. Taking care not to discard the results of the integration by parts in the $x$-direction, we find

$$\delta S = \int_{t_i}^{t_f}\!\int_{-L}^{L} \delta y(x,t)\left\{ -\rho\,\ddot y + T\,y'' \right\} dx\,dt - \int_{t_i}^{t_f} \delta y(L,t)\,T\,y'(L)\, dt + \int_{t_i}^{t_f} \delta y(-L,t)\,T\,y'(-L)\, dt. \tag{1.137}$$

The equation of motion, which arises from the variation within the interval, is therefore the wave equation

$$\rho\,\ddot y - T\,y'' = 0. \tag{1.138}$$

The boundary conditions, which come from the variations at the endpoints, are

$$y'(L, t) = y'(-L, t) = 0, \tag{1.139}$$

at all times $t$. These are the physically correct boundary conditions, because any up-or-down component of the tension would provide a finite force on an infinitesimal mass. The string must therefore be horizontal at its endpoints.

Example: Bead and string. Suppose now that a bead of mass $M$ is free to slide up and down the $y$ axis, and is attached to the $x = 0$ end of our string (Figure 1.11). The Lagrangian for the string–bead contraption is

$$L = \frac12 M[\dot y(0)]^2 + \int_0^L \left\{ \frac12\rho\,\dot y^2 - \frac12 T\,y'^2 \right\} dx. \tag{1.140}$$

Here, as before, $\rho$ is the mass per unit length of the string and $T$ is its tension. The end of the string at $x = L$ is fixed. By varying the action $S = \int L\, dt$, and taking care not to throw away the boundary part at $x = 0$ we find that

Figure 1.11 A bead connected to a string.

$$\delta S = \int_{t_i}^{t_f} \left[ T\,y' - M\,\ddot y \right]_{x=0} \delta y(0,t)\, dt + \int_{t_i}^{t_f}\!\int_0^L \left\{ T\,y'' - \rho\,\ddot y \right\} \delta y(x,t)\, dx\,dt. \tag{1.141}$$

The Euler–Lagrange equations are therefore

$$\begin{aligned}
\rho\,\ddot y(x) - T\,y''(x) &= 0, \quad 0 < x < L, \\
M\,\ddot y(0) - T\,y'(0) &= 0, \quad y(L) = 0.
\end{aligned} \tag{1.142}$$

The boundary condition at $x = 0$ is the equation of motion for the bead. It is clearly correct, because $T y'(0)$ is the vertical component of the force that the string tension exerts on the bead.

These examples led to boundary conditions that we could easily have figured out forourselves without the variational principle. The next example shows that a variationalformulation can be exploited to obtain a set of boundary conditions that might be difficultto write down by purely physical reasoning.

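Harder example: Gravity waves on the surface of water (Figure 1.12). An action suitable for describing water waves is given by² $S[\phi, h] = \int L\, dt$, where

$$L = \int dx \int_0^{h(x,t)} \rho_0 \left\{ \frac{\partial\phi}{\partial t} + \frac12(\nabla\phi)^2 + g y \right\} dy. \tag{1.143}$$

Here $\phi$ is the velocity potential and $\rho_0$ is the density of the water. The density will not be varied because the water is being treated as incompressible. As before, the flow velocity is given by $\mathbf{v} = \nabla\phi$. By varying $\phi(x, y, t)$ and the depth $h(x, t)$, and taking care not to throw away any integrated-out parts of the variation at the physical boundaries, we obtain:

Figure 1.12 Gravity waves on water.

² J. C. Luke, J. Fluid Dynamics, 27 (1967) 395.

$$\begin{aligned}
\nabla^2\phi &= 0, && \text{within the fluid}, \\
\frac{\partial\phi}{\partial t} + \frac12(\nabla\phi)^2 + g y &= 0, && \text{on the free surface}, \\
\frac{\partial\phi}{\partial y} &= 0, && \text{on } y = 0, \\
\frac{\partial h}{\partial t} - \frac{\partial\phi}{\partial y} + \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x} &= 0, && \text{on the free surface}.
\end{aligned} \tag{1.144}$$

The first equation comes from varying $\phi$ within the fluid, and it simply confirms that the flow is incompressible, i.e. obeys $\operatorname{div}\mathbf{v} = 0$. The second comes from varying $h$, and is the Bernoulli equation stating that we have $P = P_0$ (atmospheric pressure) everywhere on the free surface. The third, from the variation of $\phi$ at $y = 0$, states that no fluid escapes through the lower boundary.

Obtaining and interpreting the last equation, involving $\partial h/\partial t$, is somewhat trickier. It comes from the variation of $\phi$ on the upper boundary. The variation of $S$ due to $\delta\phi$ is

$$\delta S = \int \rho_0 \left\{ \frac{\partial}{\partial t}\delta\phi + \frac{\partial}{\partial x}\left( \delta\phi\frac{\partial\phi}{\partial x} \right) + \frac{\partial}{\partial y}\left( \delta\phi\frac{\partial\phi}{\partial y} \right) - \delta\phi\,\nabla^2\phi \right\} dt\,dx\,dy. \tag{1.145}$$

The first three terms in the integrand constitute the three-dimensional divergence $\operatorname{div}(\delta\phi\,\boldsymbol{\Phi})$, where, listing components in the order $t, x, y$,

$$\boldsymbol{\Phi} = \left[ 1, \frac{\partial\phi}{\partial x}, \frac{\partial\phi}{\partial y} \right]. \tag{1.146}$$

The integrated-out part on the upper surface is therefore $\int \rho_0\,(\boldsymbol{\Phi}\cdot\mathbf{n})\,\delta\phi\, d|S|$. Here, the outward normal is

$$\mathbf{n} = \left( 1 + \left(\frac{\partial h}{\partial t}\right)^2 + \left(\frac{\partial h}{\partial x}\right)^2 \right)^{-1/2} \left[ -\frac{\partial h}{\partial t}, -\frac{\partial h}{\partial x}, 1 \right], \tag{1.147}$$

and the element of area

$$d|S| = \left( 1 + \left(\frac{\partial h}{\partial t}\right)^2 + \left(\frac{\partial h}{\partial x}\right)^2 \right)^{1/2} dt\,dx. \tag{1.148}$$

The boundary variation is thus

$$\delta S|_{y=h} = -\int \rho_0 \left\{ \frac{\partial h}{\partial t} - \frac{\partial\phi}{\partial y} + \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x} \right\} \delta\phi\big(x, h(x,t), t\big)\, dx\,dt. \tag{1.149}$$

Requiring this variation to be zero for arbitrary $\delta\phi\big(x, h(x,t), t\big)$ leads to

$$\frac{\partial h}{\partial t} - \frac{\partial\phi}{\partial y} + \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x} = 0. \tag{1.150}$$

This last boundary condition expresses the geometrical constraint that the surface moves with the fluid it bounds, or, in other words, that a fluid particle initially on the surface stays on the surface. To see that this is so, define $f(x, y, t) = h(x, t) - y$. The free surface is then determined by $f(x, y, t) = 0$. Because the surface particles are carried with the flow, the convective derivative of $f$,

$$\frac{df}{dt} \equiv \frac{\partial f}{\partial t} + (\mathbf{v}\cdot\nabla)f, \tag{1.151}$$

must vanish on the free surface. Using $\mathbf{v} = \nabla\phi$ and the definition of $f$, this reduces to

$$\frac{\partial h}{\partial t} + \frac{\partial\phi}{\partial x}\frac{\partial h}{\partial x} - \frac{\partial\phi}{\partial y} = 0, \tag{1.152}$$

which is indeed the last boundary condition.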

1.5 Lagrange multipliers

Figure 1.13 shows the contour map of a hill of height $h = f(x, y)$. The hill is traversed by a road whose points satisfy the equation $g(x, y) = 0$. Our challenge is to use the data $f(x, y)$ and $g(x, y)$ to find the highest point on the road.

When $\mathbf{r}$ changes by $d\mathbf{r} = (dx, dy)$, the height $f$ changes by

$$df = \nabla f\cdot d\mathbf{r}, \tag{1.153}$$

where $\nabla f = (\partial_x f, \partial_y f)$. The highest point, being a stationary point, will have $df = 0$ for all displacements $d\mathbf{r}$ that stay on the road, that is, for all $d\mathbf{r}$ such that $dg = 0$. Thus $\nabla f\cdot d\mathbf{r}$ must be zero for those $d\mathbf{r}$ such that $0 = \nabla g\cdot d\mathbf{r}$. In other words, at the highest point $\nabla f$ will be orthogonal to all vectors that are orthogonal to $\nabla g$. This is possible only if the vectors $\nabla f$ and $\nabla g$ are parallel, and so $\nabla f = \lambda\nabla g$ for some $\lambda$.

Figure 1.13 Road on hill.

To find the stationary point, therefore, we solve the equations

$$\begin{aligned}
\nabla f - \lambda\nabla g &= 0, \\
g(x, y) &= 0,
\end{aligned} \tag{1.154}$$

simultaneously.

Example: Let $f = x^2 + y^2$ and $g = x + y - 1$. Then $\nabla f = 2(x, y)$ and $\nabla g = (1, 1)$. So

$$2(x, y) - \lambda(1, 1) = 0 \;\Rightarrow\; (x, y) = \frac{\lambda}{2}(1, 1),$$

$$x + y = 1 \;\Rightarrow\; \lambda = 1 \;\Rightarrow\; (x, y) = \left( \frac12, \frac12 \right).$$
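The worked example can be confirmed by brute force: parametrize the road $x + y = 1$ by $x$, scan $f$ along it, and then check the multiplier condition $\nabla f = \lambda\nabla g$ at the point found. The scan range, the sample count and the function names below are assumptions for this sketch. For this particular $f$ the stationary point on the road happens to be a minimum of $f$ rather than a maximum.

```python
def f(x, y):
    """Height function of the worked example."""
    return x * x + y * y

def stationary_point_on_road(samples=20001):
    """Scan the road x + y = 1 for the point where f is smallest."""
    best_x, best_value = None, float("inf")
    for i in range(samples):
        x = -2.0 + 4.0 * i / (samples - 1)
        y = 1.0 - x                       # stay on the constraint g = x + y - 1 = 0
        value = f(x, y)
        if value < best_value:
            best_x, best_value = x, value
    return best_x, 1.0 - best_x

x_star, y_star = stationary_point_on_road()
lam = 2.0 * x_star                        # from the x-component of grad f = lam * grad g
mismatch = abs(2.0 * y_star - lam)        # the y-component must give the same lam
print(x_star, y_star, lam, mismatch)
```

The scan lands on $(x, y) = (\tfrac12, \tfrac12)$ with $\lambda = 1$, in agreement with the calculation above.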

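When there are $n$ constraints, $g_1 = g_2 = \cdots = g_n = 0$, we want $\nabla f$ to lie in

$$\left( \langle\nabla g_i\rangle^\perp \right)^\perp = \langle\nabla g_i\rangle, \tag{1.155}$$

where $\langle e_i\rangle$ denotes the space spanned by the vectors $e_i$ and $\langle e_i\rangle^\perp$ is its orthogonal complement. Thus $\nabla f$ lies in the space spanned by the vectors $\nabla g_i$, so there must exist $n$ numbers $\lambda_i$ such that

$$\nabla f = \sum_{i=1}^{n} \lambda_i \nabla g_i. \tag{1.156}$$

The numbers $\lambda_i$ are called Lagrange multipliers. We can therefore regard our problem as one of finding the stationary points of an auxiliary function

$$F = f - \sum_i \lambda_i g_i, \tag{1.157}$$

with the $n$ undetermined multipliers $\lambda_i$, $i = 1, \ldots, n$, subsequently being fixed by imposing the $n$ requirements that $g_i = 0$, $i = 1, \ldots, n$.

Example: Find the stationary points of

$$F(\mathbf{x}) = \frac12\,\mathbf{x}\cdot A\mathbf{x} = \frac12 x_i A_{ij} x_j \tag{1.158}$$

on the surface $\mathbf{x}\cdot\mathbf{x} = 1$. Here $A_{ij}$ is a symmetric matrix.

Solution: We look for stationary points of

$$G(\mathbf{x}) = F(\mathbf{x}) - \frac12\lambda|\mathbf{x}|^2. \tag{1.159}$$

The derivatives we need are

$$\frac{\partial F}{\partial x_k} = \frac12\delta_{ki}A_{ij}x_j + \frac12 x_i A_{ij}\delta_{jk} = A_{kj}x_j, \tag{1.160}$$

and

$$\frac{\partial}{\partial x_k}\left( \frac{\lambda}{2}x_j x_j \right) = \lambda x_k. \tag{1.161}$$

Thus, the stationary points must satisfy

$$\begin{aligned}
A_{kj}x_j &= \lambda x_k, \\
x_i x_i &= 1,
\end{aligned} \tag{1.162}$$

and so are the normalized eigenvectors of the matrix $A$. The Lagrange multiplier at each stationary point is the corresponding eigenvalue.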
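For a $2\times2$ matrix the statement above is easy to test by scanning the unit circle. The sketch below uses the symmetric matrix with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues are $1$ and $3$; the matrix, the angular resolution and the function names are assumptions made for illustration.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]             # illustrative symmetric matrix, eigenvalues 1 and 3

def quadratic_form(theta):
    """x . A x for the unit vector x = (cos theta, sin theta)."""
    x = (math.cos(theta), math.sin(theta))
    ax = (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])
    return x[0] * ax[0] + x[1] * ax[1]

steps = 3600
angles = [math.pi * i / steps for i in range(steps)]
values = [quadratic_form(theta) for theta in angles]

largest, smallest = max(values), min(values)
theta_max = angles[values.index(largest)]
x_max = (math.cos(theta_max), math.sin(theta_max))

# Residual of the eigenvalue equation A x = lambda x at the maximizing direction.
residual = math.hypot(
    A[0][0] * x_max[0] + A[0][1] * x_max[1] - largest * x_max[0],
    A[1][0] * x_max[0] + A[1][1] * x_max[1] - largest * x_max[1],
)
print(largest, smallest, residual)
```

The extreme values of $\mathbf{x}\cdot A\mathbf{x}$ on the circle are the two eigenvalues (so $F = \tfrac12\mathbf{x}\cdot A\mathbf{x}$ takes the values $\lambda/2$ there), and the maximizing direction satisfies the eigenvalue equation (1.162).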

Example: Statistical mechanics. Let $\Gamma$ denote the classical phase space of a mechanical system of $n$ particles governed by a Hamiltonian $H(p, q)$. Let $d\Gamma$ be the Liouville measure $d^{3n}p\, d^{3n}q$. In statistical mechanics we work with a probability density $\rho(p, q)$ such that $\rho(p, q)\,d\Gamma$ is the probability of the system being in a state in the small region $d\Gamma$. The entropy associated with the probability distribution is the functional

$$S[\rho] = -\int_\Gamma \rho\ln\rho\, d\Gamma. \tag{1.163}$$

We wish to find the $\rho(p, q)$ that maximizes the entropy for a given energy

$$E = \int_\Gamma \rho H\, d\Gamma. \tag{1.164}$$

We cannot vary $\rho$ freely as we should preserve both the energy and the normalization condition

$$\int_\Gamma \rho\, d\Gamma = 1 \tag{1.165}$$

that is required of any probability distribution. We therefore introduce two Lagrange multipliers, $1 + \alpha$ and $-\beta$, to enforce the normalization and energy conditions, and look for stationary points of

$$F[\rho] = \int_\Gamma \left\{ -\rho\ln\rho + (\alpha + 1)\rho - \beta\rho H \right\} d\Gamma. \tag{1.166}$$

Now we can vary $\rho$ freely, and hence find that

$$\delta F = \int_\Gamma \left\{ -\ln\rho + \alpha - \beta H \right\} \delta\rho\, d\Gamma. \tag{1.167}$$

Requiring this to be zero gives us

$$\rho(p, q) = e^{\alpha - \beta H(p, q)}, \tag{1.168}$$

where $\alpha$, $\beta$ are determined by imposing the normalization and energy constraints. This probability density is known as the canonical distribution, and the parameter $\beta$ is the inverse temperature $\beta = 1/T$.

Example: The catenary. At last we have the tools to solve the problem of the hanging chain of fixed length. We wish to minimize the potential energy

$$E[y] = \int_{-L}^{L} y\sqrt{1 + (y')^2}\, dx, \tag{1.169}$$

subject to the constraint

$$l[y] = \int_{-L}^{L} \sqrt{1 + (y')^2}\, dx = \text{const.}, \tag{1.170}$$

where the constant is the length of the chain. We introduce a Lagrange multiplier $\lambda$ and find the stationary points of

$$F[y] = \int_{-L}^{L} (y - \lambda)\sqrt{1 + (y')^2}\, dx, \tag{1.171}$$

so, following our earlier methods, we find

$$y = \lambda + \kappa\cosh\frac{(x + a)}{\kappa}. \tag{1.172}$$

We choose $\kappa$, $\lambda$, $a$ to fix the two endpoints (two conditions) and the length (one condition).
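As an illustration of how the three constants are fixed, consider the symmetric case in which both endpoints sit at height $y = 0$ at $x = \pm L$, so that $a = 0$. Integrating $\sqrt{1 + (y')^2} = \cosh(x/\kappa)$ gives the length condition $l = 2\kappa\sinh(L/\kappa)$, which must be solved numerically for $\kappa$. The sketch below does this by bisection; the symmetric set-up, the bracketing interval, the numerical values and the function names are assumptions made for this example.

```python
import math

def chain_length(kappa, half_span):
    """Length of y = lambda + kappa*cosh(x/kappa) between x = -half_span and +half_span."""
    return 2.0 * kappa * math.sinh(half_span / kappa)

def solve_kappa(total_length, half_span, lo=0.05, hi=100.0, tol=1e-12):
    """Bisection for kappa; chain_length decreases monotonically as kappa grows."""
    if not chain_length(hi, half_span) < total_length < chain_length(lo, half_span):
        raise ValueError("chain length is outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chain_length(mid, half_span) > total_length:
            lo = mid                      # chain still too long: kappa must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

HALF_SPAN, TOTAL_LENGTH = 1.0, 3.0        # illustrative values with TOTAL_LENGTH > 2 * HALF_SPAN
kappa = solve_kappa(TOTAL_LENGTH, HALF_SPAN)
lam = -kappa * math.cosh(HALF_SPAN / kappa)   # enforces y(+-L) = 0
sag = -(lam + kappa)                          # depth of the lowest point below the supports
print(kappa, lam, sag)
```

Once $\kappa$ is known, the endpoint condition fixes $\lambda$, and the lowest point of the chain lies at $y = \lambda + \kappa$, below the supports.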

Example: Sturm–Liouville problem. We wish to find the stationary points of the quadratic functional

$$J[y] = \int_{x_1}^{x_2} \frac12\left\{ p(x)(y')^2 + q(x)y^2 \right\} dx, \tag{1.173}$$

subject to the boundary conditions $y(x) = 0$ at the endpoints $x_1$, $x_2$ and the normalization

$$K[y] = \int_{x_1}^{x_2} y^2\, dx = 1. \tag{1.174}$$

Taking the variation of $J - (\lambda/2)K$, we find

$$\delta J = \int_{x_1}^{x_2} \left\{ -(py')' + qy - \lambda y \right\} \delta y\, dx. \tag{1.175}$$

Stationarity therefore requires

$$-(py')' + qy = \lambda y, \qquad y(x_1) = y(x_2) = 0. \tag{1.176}$$

This is the Sturm–Liouville eigenvalue problem. It is an infinite-dimensional analogue of the $F(\mathbf{x}) = \frac12\mathbf{x}\cdot A\mathbf{x}$ problem.

Example: Irrotational flow again. Consider the action functional

$$S[\mathbf{v}, \phi, \rho] = \int \left\{ \frac12\rho\mathbf{v}^2 - u(\rho) + \phi\left( \frac{\partial\rho}{\partial t} + \operatorname{div}\rho\mathbf{v} \right) \right\} dt\, d^3x. \tag{1.177}$$

This is similar to our previous action for the irrotational barotropic flow of an inviscid fluid, but here $\mathbf{v}$ is an independent variable and we have introduced infinitely many Lagrange multipliers $\phi(x, t)$, one for each point of space-time, so as to enforce the equation of mass conservation $\dot\rho + \operatorname{div}\rho\mathbf{v} = 0$ everywhere, and at all times. Equating $\delta S/\delta\mathbf{v}$ to zero gives $\mathbf{v} = \nabla\phi$, and so these Lagrange multipliers become the velocity potential as a consequence of the equations of motion. The Bernoulli and Euler equations now follow almost as before. Because the equation $\mathbf{v} = \nabla\phi$ does not involve time derivatives, this is one of the cases where it is legitimate to substitute a consequence of the action principle back into the action. If we do this, we recover our previous formulation.

1.6 Maximum or minimum?

We have provided many examples of stationary points in function space. We have said almost nothing about whether these stationary points are maxima or minima. There is a reason for this: investigating the character of the stationary point requires the computation of the second functional derivative

$$\frac{\delta^2 J}{\delta y(x_1)\,\delta y(x_2)}$$

and the use of the functional version of Taylor's theorem to expand about the stationary point $y(x)$:

$$J[y + \varepsilon\eta] = J[y] + \varepsilon\int \eta(x)\left.\frac{\delta J}{\delta y(x)}\right|_y dx + \frac{\varepsilon^2}{2}\int \eta(x_1)\eta(x_2)\left.\frac{\delta^2 J}{\delta y(x_1)\,\delta y(x_2)}\right|_y dx_1\,dx_2 + \cdots. \tag{1.178}$$

Since $y(x)$ is a stationary point, the term with $\delta J/\delta y(x)|_y$ vanishes. Whether $y(x)$ is a maximum, a minimum, or a saddle therefore depends on the number of positive and negative eigenvalues of $\delta^2 J/\delta(y(x_1))\delta(y(x_2))$, a matrix with a continuous infinity of rows and columns, these being labelled by $x_1$ and $x_2$, respectively. It is not easy to diagonalize a continuously infinite matrix! Consider, for example, the functional

$$J[y] = \int_a^b \frac12\left\{ p(x)(y')^2 + q(x)y^2 \right\} dx, \tag{1.179}$$

with $y(a) = y(b) = 0$. Here, as we already know,

$$\frac{\delta J}{\delta y(x)} = Ly \equiv -\frac{d}{dx}\left( p(x)\frac{d}{dx}y(x) \right) + q(x)y(x), \tag{1.180}$$

and, except in special cases, this will be zero only if $y(x) \equiv 0$. We might reasonably expect the second derivative to be

$$\frac{\delta}{\delta y}(Ly) \overset{?}{=} L, \tag{1.181}$$

where $L$ is the Sturm–Liouville differential operator

$$L = -\frac{d}{dx}\left( p(x)\frac{d}{dx} \right) + q(x). \tag{1.182}$$

How can a differential operator be a matrix like $\delta^2 J/\delta(y(x_1))\delta(y(x_2))$?

We can formally compute the second derivative by exploiting the Dirac delta "function" $\delta(x)$ which has the property that

$$y(x_2) = \int \delta(x_2 - x_1)\,y(x_1)\, dx_1. \tag{1.183}$$

Thus

$$\delta y(x_2) = \int \delta(x_2 - x_1)\,\delta y(x_1)\, dx_1, \tag{1.184}$$

from which we read off that

$$\frac{\delta y(x_2)}{\delta y(x_1)} = \delta(x_2 - x_1). \tag{1.185}$$

Using (1.185), we find that

$$\frac{\delta}{\delta y(x_1)}\left( \frac{\delta J}{\delta y(x_2)} \right) = -\frac{d}{dx_2}\left( p(x_2)\frac{d}{dx_2}\delta(x_2 - x_1) \right) + q(x_2)\,\delta(x_2 - x_1). \tag{1.186}$$

How are we to make sense of this expression? We begin in the next chapter where we explain what it means to differentiate $\delta(x)$, and show that (1.186) does indeed correspond to the differential operator $L$. In subsequent chapters we explore the manner in which differential operators and matrices are related. We will learn that just as some matrices can be diagonalized so can some differential operators, and that the class of diagonalizable operators includes (1.182).

If all the eigenvalues of $L$ are positive, our stationary point was a minimum. For each negative eigenvalue, there is a direction in function space in which $J[y]$ decreases as we move away from the stationary point.
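A small numerical experiment makes the last statement concrete. Take $p = 1$ and constant $q = -c$ on $[0, \pi]$ in (1.179), so that the eigenfunctions of $L$ vanishing at the endpoints are $\sin nx$ with eigenvalues $n^2 - c$. The stationary point is $y \equiv 0$ with $J = 0$, and moving away from it in the direction $\eta = \sin nx$ changes $J$ by $\varepsilon^2 J[\eta]$. The interval, the choice of $p$ and $q$, the midpoint quadrature and the function name below are all assumptions made for this sketch.

```python
import math

def J(n, c, samples=4000):
    """Midpoint-rule value of J[eta] for eta = sin(n x) on [0, pi], with p = 1 and q = -c."""
    dx = math.pi / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        eta = math.sin(n * x)
        eta_prime = n * math.cos(n * x)
        total += 0.5 * (eta_prime ** 2 - c * eta ** 2)
    return total * dx

all_positive = J(1, 0.5)       # c = 0.5: lowest eigenvalue 1 - 0.5 > 0
descending = J(1, 2.0)         # c = 2: eigenvalue 1 - 2 < 0, J decreases in this direction
ascending = J(2, 2.0)          # c = 2: eigenvalue 4 - 2 > 0, J increases in this direction
print(all_positive, descending, ascending)
```

Analytically $J[\sin nx] = (\pi/4)(n^2 - c)$, so for $c = 0.5$ the point $y \equiv 0$ is a minimum, while for $c = 2$ it is a saddle with exactly one descending direction.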

1.7 Further exercises and problems

Here is a collection of problems relating to the calculus of variations. Some date backto the sixteenth century, others are quite recent in origin.

Exercise 1.1: Asmooth path in the xy-plane is given by r(t) = (x(t), y(t))with r(0) = a,and r(1) = b. The length of the path from a to b is therefore

S[r] = 10

x2 + y2 dt,

where x dx/dt, y dy/dt. Write down the EulerLagrange conditions for S[r] to bestationary under small variations of the path that keep the endpoints fixed, and henceshow that the shortest path between two points is a straight line.

Exercise 1.2: Fermat's principle. A medium is characterized optically by its refractive index n, such that the speed of light in the medium is c/n. According to Fermat (1657), the path taken by a ray of light between any two points makes the travel time stationary between those points. Assume that the ray propagates in the xy-plane in a layered medium with refractive index n(x). Use Fermat's principle to establish Snell's law in its general form $n(x)\sin\psi = \text{constant}$, by finding the equation giving the stationary paths y(x) for

$$F_1[y] = \int n(x)\sqrt{1 + y'^2}\, dx.$$

(Here the prime denotes differentiation with respect to x.) Repeat this exercise for the case that n depends only on y and find a similar equation for the stationary paths of

$$F_2[y] = \int n(y)\sqrt{1 + y'^2}\, dx.$$

By using suitable definitions of the angle of incidence in each case, show that the two formulations of the problem give physically equivalent answers. In the second formulation you will find it easiest to use the first integral of Euler's equation.
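The simplest layered medium has just two layers, and for it Fermat's principle can be checked numerically. The sketch below assumes an interface at x = 0 separating index n1 (x < 0) from n2 (x > 0), fixes two endpoints, and minimizes the optical path over the crossing height by ternary search. The angle is measured from the x-axis, which is the normal to the layers. All numerical values and function names are illustrative assumptions.

```python
import math

def travel_time(y, n1, n2, a=(-1.0, 0.0), b=(1.0, 1.0)):
    """Optical path length (c times travel time) of the ray a -> (0, y) -> b."""
    return (n1 * math.hypot(0.0 - a[0], y - a[1])
            + n2 * math.hypot(b[0] - 0.0, b[1] - y))

def minimise(f, lo, hi, iters=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

n1, n2 = 1.0, 1.5                            # assumed refractive indices
y_star = minimise(lambda y: travel_time(y, n1, n2), -5.0, 5.0)

# Sines of the angles between each ray segment and the x-axis.
sin1 = (y_star - 0.0) / math.hypot(1.0, y_star - 0.0)
sin2 = (1.0 - y_star) / math.hypot(1.0, 1.0 - y_star)
snell_lhs, snell_rhs = n1 * sin1, n2 * sin2
```

At the minimizing crossing point the two products n sin ψ agree to within the accuracy of the search, which is the two-layer form of the statement n(x) sin ψ = constant.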


Problem 1.3: Hyperbolic geometry. This problem introduces a version of the Poincaré model for the non-Euclidean geometry of Lobachevski.

(a) Show that the stationary paths for the functional

$$F_3[y] = \int \frac{1}{y}\sqrt{1 + y'^2}\, dx,$$

with y(x) restricted to lying in the upper half-plane, are semicircles of arbitrary radius and with centres on the x-axis. These paths are the geodesics, or minimum length paths, in a space with Riemann metric

$$ds^2 = \frac{1}{y^2}\left(dx^2 + dy^2\right), \qquad y > 0.$$

(b) Show that if we call these geodesics "lines", then one and only one line can be drawn through two given points.

(c) Two lines are said to be "parallel" if, and only if, they meet at "infinity", i.e. on the x-axis. (Verify that the x-axis is indeed infinitely far from any point with y > 0.) Show that given a line q and a point A not lying on that line, there are two lines passing through A that are parallel to q, and that between these two lines lies a pencil of lines passing through A that never meet q.
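Two of the claims in Problem 1.3 are easy to probe numerically. Because the integrand of F3 does not depend explicitly on x, Euler's equation has the first integral f − y′ ∂f/∂y′ = 1/(y√(1 + y′²)). The sketch below evaluates it along a semicircle centred on the x-axis and finds the constant value 1/R. It also tabulates the hyperbolic length ln(y_top/y_bottom) of a vertical segment. Since ds ≥ |dy|/y, this length is a lower bound for any path between the two heights, so its divergence as y_bottom → 0 shows that the x-axis is infinitely far away. The centre, radius and function names are illustrative assumptions.

```python
import math

def first_integral(x, centre, radius):
    """1/(y sqrt(1 + y'^2)) evaluated on y = sqrt(radius^2 - (x - centre)^2)."""
    y = math.sqrt(radius**2 - (x - centre)**2)
    yprime = -(x - centre) / y
    return 1.0 / (y * math.sqrt(1.0 + yprime**2))

centre, radius = 0.7, 2.0                    # assumed semicircle
values = [first_integral(centre + radius * s, centre, radius)
          for s in (-0.9, -0.5, 0.0, 0.3, 0.8)]

def vertical_distance(y_top, y_bottom):
    """Hyperbolic length of a vertical segment: integral of dy / y."""
    return math.log(y_top / y_bottom)

distances = [vertical_distance(1.0, 10.0**(-k)) for k in (1, 2, 4, 8)]
```

The list `values` is constant at 1/R = 0.5 up to rounding, while `distances` grows without bound as the lower endpoint approaches the x-axis.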

Problem 1.4: Elastic rods. The elastic energy per unit length of a bent steel rod is given by $\tfrac{1}{2} YI/R^2$. Here R is the radius of curvature due to the bending, Y is the Young's modulus of the steel and $I = \iint y^2\, dx\, dy$ is the moment of inertia of the rod's cross-section about an axis through its centroid and perpendicular to the plane in which the rod is bent. If the rod is only slightly bent into the yz-plane and lies close to the z-axis, show that this elastic energy can be approximated as

$$U[y] = \int_0^L \frac{1}{2}\, YI\, (y'')^2\, dz,$$

where the prime denotes differentiation with respect to z and L is the length of the rod. We will use this approximate energy functional to discuss two practical problems.

(a) Euler's problem: The buckling of a slender column. The rod is used as a column which supports a compressive load Mg directed along the z-axis (which is vertical; see Figure 1.14a). Show that when the rod buckles slightly (i.e. deforms with both ends remaining on the z-axis) the total energy, including the gravitational potential energy of the loading mass M, can be approximated by

$$U[y] = \int_0^L \left\{ \frac{YI}{2}\,(y'')^2 - \frac{Mg}{2}\,(y')^2 \right\} dz.$$


Figure 1.14 A rod used as: (a) a column, (b) a cantilever.

By considering small deformations of the form

$$y(z) = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi z}{L}$$

show that the column is unstable to buckling and collapse if $Mg \ge \pi^2 YI/L^2$.

(b) Leonardo da Vinci's problem: The light cantilever. Here we take the z-axis as horizontal and the y-axis as being vertical (Figure 1.14b). The rod is used as a beam or cantilever and is fixed into a wall so that $y(0) = 0 = y'(0)$. A weight Mg is hung from the end z = L and the beam sags in the (−y)-direction. We wish to find y(z) for 0 < z < L. We will ignore the weight of the beam itself.

• Write down the complete expression for the energy, including the gravitational potential energy of the weight.

• Find the differential equation and boundary conditions at z = 0, L that arise from minimizing the total energy. In doing this take care not to throw away any term arising from the integration by parts. You may find the following identity to be of use:

$$\frac{d}{dz}\left(f' g'' - f g'''\right) = f'' g'' - f g''''.$$

• Solve the equation. You should find that the displacement of the end of the beam is $y(L) = -\tfrac{1}{3}\, MgL^3/YI$.
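For part (a) of Problem 1.4, the buckling threshold can be illustrated by evaluating the column energy directly for a single Fourier mode. The sketch below integrates U[y] by the midpoint rule for the trial shape y = a sin(nπz/L) and compares loads just below and just above π²YI/L². The values of YI and L, the amplitude and the function names are assumptions chosen only for illustration.

```python
import math

def column_energy(amplitude, mode, load, stiffness, length, steps=20000):
    """Midpoint-rule estimate of U[y] = int_0^L {(YI/2) y''^2 - (Mg/2) y'^2} dz.

    The trial shape is y = amplitude * sin(mode * pi * z / length);
    'stiffness' stands for the product YI and 'load' for Mg.
    """
    k = mode * math.pi / length
    h = length / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        y1 = amplitude * k * math.cos(k * z)         # y'
        y2 = -amplitude * k * k * math.sin(k * z)    # y''
        total += (0.5 * stiffness * y2 * y2 - 0.5 * load * y1 * y1) * h
    return total

YI, L = 2.0, 3.0                      # illustrative values only
critical = math.pi**2 * YI / L**2     # load quoted in the text
below = column_energy(0.01, 1, 0.9 * critical, YI, L)
above = column_energy(0.01, 1, 1.1 * critical, YI, L)
```

Below the critical load the n = 1 deformation costs energy. Above it the same deformation lowers the energy, so the straight column is no longer a minimum.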

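Exercise 1.5: Suppose that an elastic body $\Omega$ of density $\rho$ is slightly deformed so that the point that was at cartesian coordinate $x_i$ is moved to $x_i + \eta_i(x)$. We define the resulting strain tensor $e_{ij}$ by

$$e_{ij} = \frac{1}{2}\left(\frac{\partial \eta_j}{\partial x_i} + \frac{\partial \eta_i}{\partial x_j}\right).$$

It is automatically symmetric in its indices. The Lagrangian for small-amplitude elastic motion of the body is

$$L[\eta] = \int_\Omega \left\{ \frac{1}{2}\,\rho\,\dot{\eta}_i^2 - \frac{1}{2}\, e_{ij}\, c_{ijkl}\, e_{kl} \right\} d^3x.$$

Here, $c_{ijkl}$ is the tensor of elastic constants, which has the symmetries

$$c_{ijkl} = c_{klij} = c_{jikl} = c_{ijlk}.$$

By varying the $\eta_i$, show that the equation of motion for the body is

$$\rho\,\frac{\partial^2 \eta_i}{\partial t^2} - \frac{\partial}{\partial x_j}\,\sigma_{ji} = 0,$$

where

$$\sigma_{ij} = c_{ijkl}\, e_{kl}$$

is the stress tensor. Show that variations of $\eta_i$ on the boundary $\partial\Omega$ give as boundary conditions

$$\sigma_{ij}\, n_j = 0,$$

where $n_i$ are the components of the outward normal on $\partial\Omega$.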

Problem 1.6: The catenary revisited. We can describe a catenary curve in parametric form as x(s), y(s), where s is the arc-length. The potential energy is then simply $\int_0^L \rho g\, y(s)\, ds$ where $\rho$ is the mass per unit length of the hanging chain. The x, y are not independent functions of s, however, because $\dot{x}^2 + \dot{y}^2 = 1$ at every point on the curve. Here a dot denotes a derivative with respect to s.

(a) Introduce infinitely many Lagrange multipliers $\lambda(s)$ to enforce the $\dot{x}^2 + \dot{y}^2$ constraint, one for each point s on the curve. From the resulting functional derive two coupled equations describing the catenary, one for x(s) and one for y(s). By thinking about the forces acting on a small section of the cable, and perhaps by introducing the angle $\psi$ where $\dot{x} = \cos\psi$ and $\dot{y} = \sin\psi$, so that s and $\psi$ are intrinsic coordinates for the curve, interpret these equations and show that $\lambda(s)$ is proportional to the position-dependent tension T(s) in the chain.

(b) You are provided with a lightweight line of length $\pi a/2$ and some lead shot of total mass M. By using equations from the previous part (suitably modified to take into account the position-dependent $\rho(s)$) or otherwise, determine how the lead should be distributed along the line if the loaded line is to hang in an arc of a circle of radius a (see Figure 1.15) when its ends are attached to two points at the same height.


Figure 1.15 Weighted line.

Figure 1.16 The Poincaré disc of Problem 1.7. The radius OP of the Poincaré disc is unity, while the radius of the geodesic arc PQR is PX = QX = RX = R. The distance between the centres of the disc and arc is $OX = x_0$. Your task in part (c) is to show that $\angle OPX = \angle ORX = 90^\circ$.

Problem 1.7: Another model for Lobachevski geometry (see Problem 1.3) is the Poincaré disc (Figure 1.16). This space consists of the interior of the unit disc $D^2 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \le 1\}$ equipped with the Riemann metric

$$ds^2 = \frac{dx^2 + dy^2}{(1 - x^2 - y^2)^2}.$$

The geodesic paths are found by minimizing the arc-length functional

$$s[\mathbf{r}] \equiv \int ds = \int \left\{ \frac{1}{1 - x^2 - y^2}\sqrt{\dot{x}^2 + \dot{y}^2} \right\} dt,$$

where r(t) = (x(t), y(t)) and a dot indicates a derivative with respect to the parameter t.


(a) Either by manipulating the two Euler–Lagrange equations that give the conditions for s[r] to be stationary under variations in r(t), or, more efficiently, by observing that s[r] is invariant under the infinitesimal rotation

$$\delta x = \varepsilon y, \qquad \delta y = -\varepsilon x$$

and applying Noether's theorem, show that the parametrized geodesics obey

$$\frac{d}{dt}\left(\frac{1}{1 - x^2 - y^2}\,\frac{x\dot{y} - y\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}\right) = 0.$$

(b) Given a point (a, b) within $D^2$, and a direction through it, show that the equation you derived in part (a) determines a unique geodesic curve passing through (a, b) in the given direction, but does not determine the parametrization of the curve.

(c) Show that there exists a solution to the equation in part (a) in the form

$$x(t) = R\cos t + x_0, \qquad y(t) = R\sin t.$$

Find a relation between $x_0$ and R, and from it deduce that the geodesics are circular arcs that cut the bounding unit circle (which plays the role of the line at infinity in the Lobachevski plane) at right angles.
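Before deriving the relation in part (c), it can be useful to test a candidate numerically. The sketch below evaluates the conserved quantity of part (a) along the circle x = R cos t + x0, y = R sin t at several points inside the unit disc. As a trial relation it uses x0² = 1 + R², which is what Pythagoras gives if the angle OPX in Figure 1.16 is a right angle. This is an assumption to be tested here, not a derivation. The radius, the sample points and the function name are likewise illustrative.

```python
import math

def noether_charge(t, R, x0):
    """(x ydot - y xdot) / ((1 - x^2 - y^2) sqrt(xdot^2 + ydot^2)) on the circle."""
    x, y = R * math.cos(t) + x0, R * math.sin(t)
    xdot, ydot = -R * math.sin(t), R * math.cos(t)
    return (x * ydot - y * xdot) / ((1.0 - x * x - y * y) * math.hypot(xdot, ydot))

R = 0.75
offsets = (-0.8, -0.4, 0.0, 0.4, 0.8)        # t = pi + offset lies inside the disc

x0_trial = math.sqrt(1.0 + R * R)            # trial relation x0^2 = 1 + R^2
good = [noether_charge(math.pi + s, R, x0_trial) for s in offsets]

x0_wrong = 1.1                               # a centre that violates the relation
bad = [noether_charge(math.pi + s, R, x0_wrong) for s in offsets]

spread_good = max(good) - min(good)
spread_bad = max(bad) - min(bad)
```

With the trial relation the quantity is constant along the arc (equal to −1/(2R) for this orientation), whereas for the other centre it varies from point to point.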

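Exercise 1.8: The Lagrangian for a particle of charge q is

$$L[\mathbf{x}, \dot{\mathbf{x}}] = \frac{1}{2}\, m\dot{\mathbf{x}}^2 - q\phi(\mathbf{x}) + q\,\dot{\mathbf{x}} \cdot \mathbf{A}(\mathbf{x}).$$

Show that Lagrange's equation leads to

$$m\ddot{\mathbf{x}} = q\left(\mathbf{E} + \dot{\mathbf{x}} \times \mathbf{B}\right),$$

where

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \operatorname{curl} \mathbf{A}.$$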

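Exercise 1.9: Consider the action functional

$$S[\boldsymbol{\omega}, \mathbf{p}, \mathbf{r}] = \int \left\{ \frac{1}{2} I_1\omega_1^2 + \frac{1}{2} I_2\omega_2^2 + \frac{1}{2} I_3\omega_3^2 + \mathbf{p} \cdot \left(\dot{\mathbf{r}} + \boldsymbol{\omega} \times \mathbf{r}\right) \right\} dt,$$

where r and p are time-dependent 3-vectors, as is $\boldsymbol{\omega} = (\omega_1, \omega_2, \omega_3)$. Apply the action principle to obtain the equations of motion for r, p, $\boldsymbol{\omega}$ and show that they lead to Euler's equations

$$I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 = 0,$$
$$I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 = 0,$$
$$I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 = 0,$$

governing the angular velocity of a freely rotating rigid body.

Figure 1.17 Vibrating piano string.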
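Euler's equations of Exercise 1.9 are easy to integrate numerically, and doing so gives a check on their structure: both the kinetic energy ½ Σ I_i ω_i² and the squared angular momentum Σ (I_i ω_i)² should stay constant. The sketch below uses a hand-written fourth-order Runge–Kutta step. The moments of inertia, initial condition, step size and function names are all illustrative assumptions.

```python
def euler_rhs(w, I):
    """Euler's equations solved for the angular accelerations."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, k, s):
        return tuple(b + s * ki for b, ki in zip(base, k))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(wi + dt * (a + 2 * b + 2 * c + d) / 6
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))

def energy(w, I):
    return 0.5 * sum(Ii * wi * wi for Ii, wi in zip(I, w))

def angular_momentum_sq(w, I):
    return sum((Ii * wi) ** 2 for Ii, wi in zip(I, w))

I = (1.0, 2.0, 3.0)          # assumed principal moments of inertia
w = (0.1, 1.0, 0.1)          # start close to rotation about the intermediate axis
E0, L0 = energy(w, I), angular_momentum_sq(w, I)
for _ in range(5000):
    w = rk4_step(w, I, 0.002)
E1, L1 = energy(w, I), angular_momentum_sq(w, I)
```

The two invariants are preserved to within the integration error even though the components of ω change substantially along the trajectory.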

Problem 1.10: Piano string. An elastic piano string can vibrate both transversely and longitudinally, and the two vibrations influence one another (Figure 1.17). A Lagrangian that takes into account the lowest-order effect of stretching on the local string tension, and can therefore model this coupled motion, is

$$L[\xi, \eta] = \int dx \left\{ \frac{1}{2}\rho_0\left[\left(\frac{\partial \xi}{\partial t}\right)^2 + \left(\frac{\partial \eta}{\partial t}\right)^2\right] - \frac{\lambda}{2}\left[\frac{\tau_0}{\lambda} + \frac{\partial \xi}{\partial x} + \frac{1}{2}\left(\frac{\partial \eta}{\partial x}\right)^2\right]^2 \right\}.$$

Here $\xi(x, t)$ is the longitudinal displacement and $\eta(x, t)$ the transverse displacement of the string. Thus, the point that in the undisturbed string had coordinates [x, 0] is moved to the point with coordinates $[x + \xi(x, t), \eta(x, t)]$. The parameter $\tau_0$ represents the tension in the undisturbed string, $\lambda$ is the product of Young's modulus and the cross-sectional area of the string and $\rho_0$ is the mass per unit length.

(a) Use the action principle to derive the two coupled equations of motion, one involving $\partial^2 \xi/\partial t^2$ and one involving $\partial^2 \eta/\partial t^2$.

(b) Show that when we linearize these two equations of motion, the longitudinal and transverse motions decouple. Find expressions for the longitudinal ($c_L$) and transverse ($c_T$) wave velocities in terms of $\tau_0$, $\rho_0$ and $\lambda$.

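(c) Assume that a given transverse pulse $\eta(x, t) = \eta_0(x - c_T t)$ propagates along the string. Show that this induces a concurrent longitudinal pulse of the form $\xi(x - c_T t)$. Show further that the longitudinal Newtonian momentum density in this concurrent pulse is given by

$$\rho_0\,\frac{\partial \xi}{\partial t} = \frac{1}{2}\,\frac{c_L^2}{c_L^2 - c_T^2}\, T^0{}_1$$

where

$$T^0{}_1 \equiv -\rho_0\,\frac{\partial \eta}{\partial x}\,\frac{\partial \eta}{\partial t}$$

is the associated pseudo-momentum density.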

The forces that created the transverse pulse will also have created other longitudinal waves that travel at $c_L$. Consequently the Newtonian x-momentum moving at $c_T$ is not the only x-momentum on the string, and the total "true" longitudinal momentum density is not simply proportional to the pseudo-momentum density.

Exercise 1.11: Obtain the canonical energy–momentum tensor $T^{\nu}{}_{\mu}$ for the barotropic fluid described by (1.119). Show that its conservation leads to both the momentum conservation equation (1.128), and the energy conservation equation

$$\partial_t \mathcal{E} + \partial_i\left\{ v_i\left(\mathcal{E} + P\right) \right\} = 0,$$

where the energy density is

$$\mathcal{E} = \frac{1}{2}\,\rho\,(\nabla\phi)^2 + u(\rho).$$

Interpret the energy flux as being the sum of the convective transport of energy together with the rate of working by an element of fluid on its neighbours.

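Problem 1.12: Consider the action functional³

$$S[\mathbf{v}, \rho, \phi, \beta, \gamma] = \int d^4x \left\{ -\frac{1}{2}\rho\,\mathbf{v}^2 - \phi\left(\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho\mathbf{v})\right) + \rho\beta\left(\frac{\partial \gamma}{\partial t} + (\mathbf{v} \cdot \nabla)\gamma\right) + u(\rho) \right\},$$

which is a generalization of (1.177) to include two new scalar fields $\beta$ and $\gamma$. Show that varying v leads to

$$\mathbf{v} = \nabla\phi + \beta\nabla\gamma.$$

This is the Clebsch representation of the velocity field. It allows for flows with non-zero vorticity

$$\boldsymbol{\omega} \equiv \operatorname{curl} \mathbf{v} = \nabla\beta \times \nabla\gamma.$$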

3 H. Bateman, Proc. Roy. Soc. Lond. A, 125 (1929) 598; C. C. Lin, Liquid Helium in Proc. Int. Sch. Phys. "Enrico Fermi", Course XXI (Academic Press, 1965).


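Show that the equations that arise from varying the remaining fields $\rho$, $\phi$, $\beta$, $\gamma$ together imply the mass conservation equation

$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0,$$

and Bernoulli's equation in the form

$$\frac{\partial \mathbf{v}}{\partial t} + \boldsymbol{\omega} \times \mathbf{v} = -\nabla\left(\frac{1}{2}\mathbf{v}^2 + h\right).$$

(Recall that $h = du/d\rho$.) Show that this form of Bernoulli's equation is equivalent to Euler's equation

$$\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{v} = -\nabla h.$$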

Consequently S provides an action principle for a general inviscid barotropic flow.

Exercise 1.13: Drums and membranes. The shape of a distorted drumskin is described by the function h(x, y), which gives the height to which the point (x, y) of the flat undistorted drumskin is displaced.

(a) Show that the area of the distorted drumskin is equal to

$$\text{Area}[h] = \int dx\, dy\, \sqrt{1 + \left(\frac{\partial h}{\partial x}\right)^2 + \left(\frac{\partial h}{\partial y}\right)^2},$$

where the integral is taken over the area of the flat drumskin.

(b) Show that for small distortions, the area reduces to

$$A[h] = \text{const.} + \frac{1}{2}\int dx\, dy\, |\nabla h|^2.$$

(c) Show that if h satisfies the two-dimensional Laplace equation then A is stationary with respect to variations that vanish at the boundary.

(d) Suppose the drumskin has mass $\rho_0$ per unit area, and surface tension T. Write down the Lagrangian controlling the motion of the drumskin, and derive the equation of motion that follows from it.
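The small-distortion formula in part (b) of Exercise 1.13 can be checked numerically. The sketch below takes a unit-square drumskin with the trial height h = ε sin(πx) sin(πy), for which the constant is 1 and ½∫|∇h|² dx dy = ε²π²/4. It compares the quadratic approximation with a midpoint-rule estimate of the full area integral. The domain, trial shape and function names are illustrative assumptions.

```python
import math

def exact_area(eps, n=400):
    """Midpoint-rule estimate of the area integral on the unit square
    for h = eps * sin(pi x) * sin(pi y)."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        for j in range(n):
            y = (j + 0.5) * step
            hx = eps * math.pi * math.cos(math.pi * x) * math.sin(math.pi * y)
            hy = eps * math.pi * math.sin(math.pi * x) * math.cos(math.pi * y)
            total += math.sqrt(1.0 + hx * hx + hy * hy)
    return total * step * step

def quadratic_area(eps):
    """Flat area 1 plus (1/2) * integral of |grad h|^2, which is eps^2 pi^2 / 4 here."""
    return 1.0 + 0.25 * (eps * math.pi) ** 2

errors = {eps: abs(exact_area(eps) - quadratic_area(eps)) for eps in (0.1, 0.05)}
```

Halving ε reduces the discrepancy by a factor of about sixteen, consistent with the neglected terms being of fourth order in the distortion.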

Problem 1.14: The Wulff construction. The surface-area functional of the previous exercise can be generalized so as to find the equilibrium shape of a crystal. We describe the crystal surface by giving its height z(x, y) above the xy-plane, and introduce the direction-dependent surface tension (the surface free-energy per unit area) $\alpha(p, q)$, where

$$p = \frac{\partial z}{\partial x}, \qquad q = \frac{\partial z}{\partial y}. \qquad (\star)$$

We seek to minimize the total surface free energy

$$F[z] = \int dx\, dy \left\{ \alpha(p, q)\sqrt{1 + p^2 + q^2} \right\},$$

subject to the constraint that the volume of the crystal

$$V[z] = \int z\, dx\, dy$$

remains constant.

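(a) Enforce the volume constraint by introducing a Lagrange multiplier $2\lambda^{-1}$, and so obtain the Euler–Lagrange equation

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial p}\right) + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial q}\right) = 2\lambda^{-1}.$$

Here

$$f(p, q) = \alpha(p, q)\sqrt{1 + p^2 + q^2}.$$

(b) Show in the isotropic case, where $\alpha$ is constant, that

$$z(x, y) = \sqrt{(\alpha\lambda)^2 - (x - a)^2 - (y - b)^2} + \text{const.}$$

is a solution of the Euler–Lagrange equation. In this case, therefore, the equilibrium shape is a sphere.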
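The claim in part (b) can be spot-checked with finite differences before it is proved. The sketch below builds the upper cap z = √((αλ)² − (x − a)² − (y − b)²), forms ∂f/∂p and ∂f/∂q for constant α, and differentiates them numerically. For the upper cap the left-hand side of the Euler–Lagrange equation is negative, so the sketch takes λ < 0. That sign choice, the numerical values and the function names are assumptions made for illustration only.

```python
import math

ALPHA, LAM = 0.8, -1.5        # illustrative values; LAM < 0 suits the upper cap
A0, B0 = 0.2, -0.1            # assumed centre (a, b)
RADIUS = abs(ALPHA * LAM)

def z(x, y):
    """Upper spherical cap of radius |alpha * lambda|."""
    return math.sqrt(RADIUS**2 - (x - A0)**2 - (y - B0)**2)

def grad_f(x, y, h=1e-5):
    """(df/dp, df/dq) for f = ALPHA * sqrt(1 + p^2 + q^2),
    with p, q taken from central differences of z."""
    p = (z(x + h, y) - z(x - h, y)) / (2 * h)
    q = (z(x, y + h) - z(x, y - h)) / (2 * h)
    s = math.sqrt(1.0 + p * p + q * q)
    return ALPHA * p / s, ALPHA * q / s

def euler_lagrange_lhs(x, y, h=1e-3):
    """d/dx (df/dp) + d/dy (df/dq) by central differences."""
    fp_plus, _ = grad_f(x + h, y)
    fp_minus, _ = grad_f(x - h, y)
    _, fq_plus = grad_f(x, y + h)
    _, fq_minus = grad_f(x, y - h)
    return (fp_plus - fp_minus) / (2 * h) + (fq_plus - fq_minus) / (2 * h)

samples = [euler_lagrange_lhs(x, y)
           for x, y in ((0.0, 0.0), (0.5, 0.3), (-0.4, 0.6))]
```

At each sample point the left-hand side agrees with 2/λ to within the finite-difference error, independent of position on the cap.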

An obvious way to satisfy the Euler–Lagrange equation in the general anisotropic case would be to arrange things so that

$$x = \lambda\,\frac{\partial f}{\partial p}, \qquad y = \lambda\,\frac{\partial f}{\partial q}. \qquad (\star\star)$$

(c) Show that ($\star\star$) is exactly the relationship we would have if z(x, y) and $\lambda f(p, q)$ were Legendre transforms of each other, i.e. if

$$\lambda f(p, q) = px + qy - z(x, y),$$

where the x and y on the right-hand side are functions of p, q obtained by solving ($\star$). Do this by showing that the inverse relation is

$$z(x, y) = px + qy - \lambda f(p, q)$$

where now the p, q on the right-hand side become functions of x and y, and are obtained by solving ($\star\star$).


Figure 1.18 Two-dimensional Wulff crystal. (a) Polar plot of surface tension $\alpha$ as a function of the normal n to a crystal face, together with a line perpendicular to n at distance $\alpha$ from the origin. (b) Wulff's construction of the corresponding crystal surface as the envelope of the family of perpendicular lines. In this case, the minimum-energy crystal has curved faces, but sharp corners. The envelope continues beyond the corners, but these parts are unphysical.

For real crystals, $\alpha(p, q)$ can have the property of being a continuous-but-nowhere-differentiable function, and so the differential calculus used in deriving the Euler–Lagrange equation is inapplicable. The Legendre transformation, however, has a geometric interpretation that is more robust than its calculus-based derivation.

Recall that if we have a two-parameter family of surfaces in $\mathbb{R}^3$ given by F(x, y, z; p, q) = 0, then the equation of the envelope of the surfaces is found by solving the equations

$$0 = F = \frac{\partial F}{\partial p} = \frac{\partial F}{\partial q}$$

so as to eliminate the parameters p, q.

(d) Show that the equation

$$F(x, y, z; p, q) \equiv px + qy - z - \lambda\alpha(p, q)\sqrt{1 + p^2 + q^2} = 0$$

describes a family of planes perpendicular to the unit vectors

$$\mathbf{n} = \frac{(p, q, -1)}{\sqrt{1 + p^2 + q^2}}$$

and at a distance $\lambda\alpha(p, q)$ away from the origin.

(e) Show that the equations to be solved for the envelope of this family of planes are exactly those that determine z(x, y). Deduce that, for smooth $\alpha(p, q)$, the profile z(x, y) is this envelope.

Wulff conjectured⁴ that, even for non-smooth $\alpha(p, q)$, the minimum-energy shape is given by an equivalent geometric construction: erect the planes from part (d) and, for each plane, discard the half-space of $\mathbb{R}^3$ that lies on the far side of the plane from the origin. The convex region consisting of the intersection of the retained half-spaces is the crystal. When $\alpha(p, q)$ is smooth this Wulff body is bounded by part of the envelope of the planes. (The parts of the envelope not bounding the convex body, the "swallowtails" visible in Figure 1.18, are unphysical.) When $\alpha(p, q)$ has cusps, these singularities can give rise to flat facets which are often

4 G. Wulff, Zeitschrift für Kristallografie, 34 (1901) 449.
