mp469: diﬀerential equations and complex analysis · 2014. 9. 16. · mp469: diﬀerential...

MP469: Differential Equations and Complex Analysis

Brian Dolan

Ordinary Differential Equations

1. Definitions

An ordinary differential equation involves derivatives of a function y(x) of a singleindependent variable x. A linear ordinary differential equations is one in which y(x) andits derivatives appear linearly, i.e. there are no terms involving y2, yy′ or other non-linearcombinations of y and its derivatives. In this course we shall deal exclusively with linear,second order differential equations, that is linear differential equations in which the highestderivative of y is its second derivative, y′′.

Examples of second order, linear, ordinary differential equations are (in all of theseequations λ is a constant):

• Harmonic Oscillator Equation

y′′ = λy, x ∈ (−∞,∞) (or some subset I ⊆ (−∞,∞))

• Legendre’s Equation

(1− x2)y′′ − 2xy′ = λy, x ∈ [−1, 1] (x = ±1 are singular points)

• Laguerre’s equation

xy′′ + (1− x)y′ = λy, x ∈ [0,∞) (x = 0 is a singular point)

• Hermite’s equation

y′′ − 2xy′ = λy x ∈ (−∞,∞) (or some subset I ⊆ (−∞,∞))

• Bessel’s equation

x2y′′ + xy′ + x2y = λy x ∈ [0,∞) (or some subset I ⊆ [0,∞)).

These equation are all of the form

Ly = λy (1)

1

where λ is a constant and

L = a2(x)d2

dx2+ a1(x)

d

dx+ a0(x),

with a2(x), a1(x) and a0(x) functions of x. L is a second-order, linear, differential opera-tor because it operates on everything to the right, not just by ordinary multiplication butalso by the operation of differentiation. Thus for the example equations:

Harmonic Oscillator: a2(x) = 1, a1(x) = a0(x) = 0

Legendre: a2(x) = 1− x2, a1(x) = −2x, a0(x) = 0

Laguerre: a2(x) = x, a1(x) = 1− x, a0(x) = 0

Hermite: a2(x) = 1, a1(x) = −2x, a0(x) = 0

Bessel: a2(x) = x2, a1(x) = x, a0(x) = x2.

Equations of the formLy = 0 (2)

are called homogeneous differential equations. More generally we shall attempt to solveinhomogeneous equations of the form

Ly(x) = h(x)

where the right-hand side is some given function h(x). Equation (1) will be a central to theanalysis of both homogeneous and inhomogeneous equation, and we shall start by studyingthe former, equation (2) .

For each equation the interval I on which the equation is to be solved must be specified— the nature of the possible solutions can be different for different intervals. The operatorL is called normal if a2(x) 6= 0 in the range of x, I ⊆ (−∞,∞). Points at which a2(x)vanishes are called singular points of the equation. For example Legendre’s equation isnormal in the open interval I = (−1, 1) but not in the closed interval I = [−1, 1], sincea2(x) = 1 − x2 vanishes at the end points x = ±1, which are therefore singular points.In the following we shall always be dealing with ordinary, linear, second order equationswhich are normal in the interval of interest, with the possible exception of the end pointsat which some equations may not be normal. This will always be assumed unless otherwisestated.

In analogy with linear algebra (1) is called an eigenvalue equation, with λ an eigen-value and y an eigenfunction of the differential operator L. For a given L on an intervalI there are in general many solutions of (1) with different eigenvalues. Even for a fixedeigenvalue there are many possible solutions but any solution can always be written aslinear combination of only two different solutions. The two solutions chosen must be lin-early independent in the sense that they are not constant multiples of each other (this isanalogous to expressing a vector in two dimensions as a linear combination of two basisvectors — any two basis vectors will do, provided they are not parallel). Roughly speakingthere are two independent solutions because solving a second order differential equation

2

requires performing two integrations (to undo the two derivatives) and necessarily intro-duces two arbitrary integration constants. We need two further pieces of information to fixthe two integration constants uniquely. This extra information, is usually given in termsof boundary conditions, that is information about the value of the function y(x) and/orits derivatives at the end points of the interval of interest I.

For example if I is the interval [a, b], then boundary conditions are of the form

α1y(a) + α2y(b) + α3y′(a) + α4y

′(b) = γ1

β1y(a) + β2y(b) + β3y′(a) + β4y

′(b) = γ2

where αi, βi and γi are constants. Common choices of boundary conditions are1) α2 = α4 = β1 = β3 = 0, then the two boundary conditions specify the combination

α1y(a) + α3y′(a) = γ1 at one endpoint and β2y(b) + β4y

′(b) = γ2 at the other. Theseare called unmixed boundary conditions.A specific instance of unmixed boundary conditions is α2 = α3 = α4 = β1 = β3 =β4 = 0, then the two boundary conditions specify y at the two endpoints, α1y(a) = γ1and β2y(b) = γ2. These are called Dirichlet boundary conditions.Another specific example of unmixed conditions is α1 = α2 = α4 = β1 = β2 = β3 = 0,then the two boundary conditions specify y′ at the two endpoints, α3y

′(a) = γ1 andβ4y

′(b) = γ1. These are called Neumann boundary conditions.2) An important class of boundary conditions is y(a) = y(b) and y′(a) = y′(b), which

requires α3 = α4 = β1 = β2 = γ1 = γ2 = 0 and α1 = −α2, β3 = −β4. These arecalled periodic boundary conditions.

3) Another important class of boundary conditions is when y(a) and y′(a) are givenwhile y(b) and y′(b) are left free. This corresponds to α2 = α3 = α4 = 0 andβ1 = β2 = β4 = 0 giving α1y(a) = γ1 and β3y

′(a) = γ2. Problems with theseboundary conditions are called initial value problems and are very important intheoretical physics. An important theorem for such problems states that, with theseboundary conditions, there exists a unique solution of any second order linear ordinarydifferential equation in an interval I in which the equation is normal (see Kreider,Kuller Ostberg and Perkins, “An Introduction to Linear Analysis”, page 104).

Examples

Harmonic Oscillator: y′′ = λy

We shall take the closed interval I = [0, π], and consider three cases in turn:

1) λ > 0: the general solution is y(x) = Ae√λx + Be−

√λx with A and B constants.

If we chose the Dirichlet boundary conditions, y(0) = 0, y(π) = 1 then y(0) =

A + B = 0 ⇒ A = −B while y(π) = A(e√λπ − e−

√λπ) = 1 ⇒ A = 1/(2 sinh(

√λπ))

and there is a unique solution for any λ > 0. On the other hand, if we choose asboundary conditions y(0) = y(π) = 0 then y(0) = A + B = 0 ⇒ A = −B and

y(π) = A(e√λπ − e−

√λπ) = 0 ⇒ A = 0 and the only solution is the trivial one, y = 0.

3

2) λ < 0: the general solution is y(x) = A cos(√−λx) + B sin(

√−λx) with A and

B constants. If we chose Dirichlet boundary conditions, y(0) = 0, y(π) = 1 theny(0) = A = 0 while y(π) = B sin(

√−λπ) = 1 ⇒ B = 1/ sin(

√−λπ) and a solution

only exists if√−λ is not an integer.

On the other hand, if we choose as boundary conditions y(0) = y(π) = 0 theny(0) = A = 0 and y(π) = B sin(

√−λπ) = 0 and there is a non-trivial solution

(B 6= 0) only if√−λ = n where n is an integer, i.e. λ = −n2, in which case B is

arbitrary.

3) λ = 0: The general solution is y = A+Bx. Choosing Dirichlet boundary conditions,y(0) = 0, y(π) = 1, gives A = 0 and B = 1/π, so the unique solution is y = x/π.

These examples are instructive because they show that the possible values of theeigenvalue λ are affected by the choice of boundary condition.

Legendre’s equation: (1− x2)y′′ − 2xy′ = λy

In general there are two linearly independent solutions on the interval [−1, 1] for anygiven λ, but one of them diverges logarithmically at the end points, x = ±1, which aresingular points of the equation and so is not allowed when I includes the end points. Ifwe look for a series solution which terminates at a finite power of x, then the possibleeigenvalues are constrained to be λ = −n(n + 1), where n is a positive integer (or zero).The eigenfunctions are then the Legendre Polynomials Pn(x) (or a constant multiple ofthem),

(1− x2)P ′′n − 2xP ′

n + n(n+ 1)Pn = 0.

By convention the Legendre polynomials are usually normalised so that Pn(1) = 1 and thefirst few are:

P0(x) = 1, P1(x) = x, P2(x) =1

2(3x2 − 1), P3(x) =

1

2(5x3 − 3x).

A general formula (known as Rodrigues’ formula) for Pn(x) is

Pn(x) =1

2nn!

dn

dxn(x2 − 1)n (3)

(the prefactor 12nn! is to conform to the convention that Pn(1) = 1). Note that Pn(x) is

an even function, Pn(−x) = Pn(x), if n is even and an odd function, Pn(−x) = −Pn(x), ifn is odd.

4

Laguerre’s equation: xy′′ + (1− x)y′ = λy

Again of the two linearly independent solutions one diverges logarithmically, this timeat the singular point x = 0, so this must be rejected on the interval I = [0,∞). Theother one terminates in a series solution to a finite polynomial in x if and only if λ = −n,n = 0, 1, 2, 3, . . . These are the Laguerre polynomials, Ln(x) satisfying

xL′′n + (1− x)L′

n + nLn = 0.

The first few are:

L0(x) = 1, L1(x) = 1− x, L2(x) =x2

2− 2x+ 1,

L3(x) = −x3

6+

3

2x2 − 3x+ 1

Hermite’s equation: y′′ − 2xy′ = λy

There are two linearly independent solutions, both perfectly well behaved on thewhole real line −∞ < x < ∞. In physics we normally only need one of these, whichcan be constructed as a series solution which terminates if and only if λ = −2n, withn = 0, 1, 2, 3, . . . These are the Hermite polynomials satisfying

H ′′n − 2xH ′

n + 2nHn = 0.

The first few are:

H0(x) = 1, H1(x) = 2x, H2(x) = 4x2 − 2, H3(x) = 8x3 − 12x.

The other linearly independent solution is not a finite polynomial, but an infiniteseries.

5

2. Linear Independence of Functions

For an a finite dimensional vector space a set of k vectors {v1, . . .vk} is linearlyindependent if the only solution of the equation

d1v1 + · · ·dkvk = 0

is d1 = · · · = dk = 0. If this equation has a solution where some or all of the di are non-zerothen the set of vectors is said to be linearly dependent.

Similarly a set of functions {y1(x), . . . yk(x)} is said to be linearly independent on aninterval I if the only solution of the equation

d1y1(x) + · · ·+ dkyk(x) = 0 (4)

on the interval is d1 = · · · = dk = 0, when d1, . . . , dk are constants. If there exists a set ofdi’s which satisfy this equation in which some or all of the di’s is non zero, then the set offunctions {y1(x), . . . , yk(x)} is said to be linearly dependent in the interval I.

Note that it is important to specify the interval I. The two functions y1(x) = x3 andy2(x) = |x|3 are linearly independent for I = (−∞,∞), but they are linearly dependentfor I = [0,∞) since x3 − |x|3 = 0 for positive x and d1 = −d2 = 1 is a non-zero solutionfor the latter range.

There is a simple criterion for checking to see if a set of k functions is linearly indepen-dent on an interval I, when the functions are (k − 1)-times differentiable on that interval.We first derive a set of k-equations by differentiating (4):

d1y1(x) + · · ·+ dkyk(x) = 0

d1y(1)1 (x) + · · ·+ dky

(1)k (x) = 0

...

d1y(k−1)1 (x) + · · ·+ dky

(k−1)k (x) = 0 (5)

where y(m)i = dmyi

dxm is the m-th derivative of yi(x) (sometimes a prime will be used to

denote differentiation: thus y′i = y(1)i , y′′i = y

(2)i , etc.).

Equations (5) can be re-arranged as a k × k matrix equation

y1 · · · yky(1)1 · · · y

(1)k

......

y(k−1)1 · · · y

(k−1)k

d1...dk

=

0...0

. (6)

Define a functionW (x) on I (called theWronskian) byW (x) = det

y1 · · · yky(1)1 · · · y

(1)k

......

y(k−1)1 · · · y

(k−1)k

.

If W (x0) 6= 0 for any point x0 ∈ I then the inverse matrix exists at this point and we can

6

multiply both sides of (6) on the left by the inverse to conclude that d1 = · · · = dk = 0.Hence the set of functions {y1(x), . . . yk(x)} is linearly independent if the Wronskian isnon-zero for any point x ∈ I. Often we deal with the case k = 2, when there are only twofunctions, when W (x) = y1y

′2 − y′1y2.

Example

Consider the two functions y1(x) = cos(x) and y2(x) = sin(x) on the interval I =[−π, π], (here k = 2).

W (x) = det

(cos(x) sin(x)− sin(x) cos(x)

)= cos2(x) + sin2(x) = 1

so in this case W (x) 6= 0, ∀x ∈ I and we can conclude that the functions cos(x) and sin(x)are linearly independent on the interval [−π, π] (actually they are linearly independent onthe whole real line (−∞,∞)).

It is important to note that the converse is not true. If W (x) = 0, ∀x ∈ I we cannot

conclude that {y1(x), . . . , yk(x)} are necessarily linearly dependent in I. For example the

two functions y1(x) = x3 and y2(x) = |x|3 have W (x) = det

(x3 |x|33x2 3x3/|x|

)= 0 on the

whole real line, but are linearly independent for I = (−∞,∞) as we have already seen.We can however tighten the result to make it if and only if by imposing an extra

constraint on the functions. For simplicity we shall just consider the case k = 2, which themain case of interest in this course, but the result generalises to higher k.

Theorem: Two solutions of a normal, 2nd order, homogeneous, linear differentialequation in an interval I are linearly independent if and only if the the Wronskian nevervanishes on I.

Proof: Let y1(x) and y2(x) be two solutions of a second order differential equation of theform specified in the statement of the theorem. If W (x) 6= 0 for some x ∈ I then y1 andy2 are linearly independent as before.

To prove the converse suppose there exists a point x0 ∈ I at which W (x0) = 0. Thenthere exists a non-zero solution {d1, d2} of the equations

d1y1(x0) + d2y2(x0) = 0

d1y′1(x0) + d2y

′2(x0) = 0

because det

(y1(x0) y2(x0)y′1(x0) y′2(x0)

)= 0. Using this solution define the function y(x) =

d1y1(x) + d2y2(x), then y(x0) = 0 and y′(x0) = 0 by construction. Now we use theassumption of the theorem, that y1 and y2 are both solutions of a second order, linear,differential equation which is normal in the interval I. We use y(x0) = y′(x0) = 0 as initialconditions to solve the differential equation in the two halves of the interval I with x > x0

7

and x < x0. We now have an initial value problem and, as stated in the discussion onboundary conditions, the resulting solution is unique. Because the equation is assumedto be homogeneous, one solution with these boundary conditions is y(x) = 0, but theboundary conditions render the solution unique, so this is the only solution and we musthave y(x) = d1y1(x)+ d2y2(x) = 0, ∀x ∈ I. Since d1 and d2 are not both zero we concludethat y1(x) and y2(x) are linearly dependent.

We can use a simlar result to show that there are at most two linearly independentsolutions of any linear homogeneous, second order ordinary differential equation which isnormal on an interval I. To see this let y1(x), y2(x) and y3(x) be three solutions of theequation

a2(x)y′′(x) + a1(x)y

′(x) + a0(x)y(x) = 0. (7)

Then form the Wronskian

W (x) = det

y1 y2 y3y′1 y′2 y′3y′′1 y′′2 y′′3

= det

y1 y2 y3y′1 y′2 y′3

− 1a2(a1y

′1 + a0y1) − 1

a2(a1y

′2 + a0y2) − 1

a2(a1y

′3 + a0y3)

,

where, in the last equality, we have used equation (7) to re-write all the second derivativesin terms of lower derivatives. Now, for each x, we can add a multiple a0/a2 of the firstrow to the third row and a multiple a1/a2 times the second row to the last row — it is aproperty of determinants that this does not change the determinant — hence

W (x) = det

y1 y2 y3y′1 y′2 y′30 0 0

= 0

and the Wronskian necessarily vanishes for all x ∈ I. Choose one x0 ∈ I, e.g. the smallestvalue of x in I if I is closed, and we know that, since the determinant W (x0) = 0 there existthree constants, d1, d2 and d3, not all zero, for which d1y1(x0) + d2y2(x0) + d3y3(x0) = 0and d1y

′1(x0) + d2y

′2(x0) + d3y

′3(x0) = 0. So, as in the proof of the previous theorem, the

functiony(x) = d1y1(x) + d2y2(x) + d3y3(x)

is a solution of the differential equation with the boundary conditions y(x0) = y′(x0) = 0,but the unique solution of the equation with these conditions is y(x) = 0. Hence

y(x) = d1y1(x) + d2y2(x) + d3y3(x) = 0,

with d1, d2 and d3 not all zero, and so y1(x), y2(x) and y3(x) are necessarily linearlydependent. We have thus shown that there can be at most only two linearly independentsolutions of any second order, linear, homogeneous, differential equation.

We can go further and say that, if y1 and y2 are two linearly independent solutionsof (7), then any solution can be written as a linear combination of y1(x) and y2(x). Thisis a central result because it means that we only need to find two linearly independentsolutions in order to find all solutions of the equation, so let’s repeat it with emphasis:

8

Theorem: If y1 and y2 are two linearly independent solutions of (7), then any solutioncan be written as a linear combination of y1(x) and y2(x), y(x) = Ay1(x) +By2(x), withA and B constants.

A Second Solution

We have just seen that, if boundary conditions are not specified, a linear, homogeneousequation can have only two linearly independent solutions. If we know one solution in aninterval I in which the equation is normal then the Wronskian can be used to find asecond solution. Suppose we know one solution y1(x), for example a series solution fromthe Frobenius method, so

y′′1 +a1a2

y′1 +a0a2

y1 = 0, (8)

where division by a2(x) is justified since the equation is normal in I. We can find a secondlinearly independent solution y2(x) by considering W (x) = y1y

′2−y′1y2, differentiating and

then using (8):W ′ = y1y

′′2 − y′′1 y2

= y1

(−a1a2

y′2 −a0a2

y2

)−(−a1a2

y′1 −a0a2

y1

)y2

= −a1a2

(y1y′2 − y′1y2) = −a1

a2W.

This gives a first order equation for W (x) which is easily solved

dW (x)

dx= −a1(x)

a2(x)W (x) ⇒ dW

W= −a1

a2dx ⇒ W (x) = exp

(−∫ x a1(t)

a2(t)dt

)

(provided W > 0). Now W = y1y′2 − y′1y2 = y21

ddx

(y2

y1

)so this gives d

dx

(y2

y1

)=

1y21(x)

exp(−∫ x a1(t)

a2(t)dt)which can be integrated to

y2(x) = y1(x)

∫ x

exp(−∫ t′ a1(t)

a2(t)dt)

y21(t′)

dt′,

provided y1(t′) 6= 0 in the region of the t′ integration.

We can use Wronskians to prove that there are at most two linearly independentsolutions of a second order liner homogeneous ordinary differential equation

a2y′′ + a1y

′ + a0y = 0. (9)

Suppose y1, y2 and y3 are three solution of the same equation,

a2y′′i + a1y

′i + a0yi = 0

9

with i = 1, 2, 3. The Wronskian associated with these three functions is

W (x) = det

y1 y2 y3y′1 y′2 y′3y′′1 y′′2 y′′3

= det

y1 y2 y3y′1 y′2 y′3

−(

a1y′

1+a0y1

a2

)−(

a1y′

2+a0y2

a2

)−(

a1y′

3+a0y3

a2

)

= det

y1 y2 y3y′1 y′2 y′30 0 0

= 0, ∀x ∈ I,

where we have used the fact that the last row is a linear combination of the first twoand adding multiples of one row of a matrix to another does not change the determinant.Choose one point x0 ∈ I, then since the determinant vanishes at this point, W (x0) = 0,and we can find three constants d1, d2 and d3, not all zero, for which

d1y1(x0) + d2y2(x0) + d3y3(x0) = 0

andd1y

′1(x0) + d2y

′2(x0) + d3y

′3(x0) = 0.

Now construct the function y(x) = d1y1(x) + d2y2(x) + d3y3(x). This satisfies the sameequation

a2y′′ + a1y

′ + a0y = 0

with boundary conditions y(x0) = y′(x0) = 0. But these boundary conditions suffice tospecify the solution uniquely and y(x) = 0 also satisfies the differential equation with thesame boundary conditions, hence

y(x) = d1y1(x) + d2y2(x) + d3y3(x) = 0

for all x with d1, d2 and d3 not all zero. Hence y1(x), y2(x) and y3(x) must be linearlydependent and there are at most two linearly independent solutions of (9).

10

3. Orthogonal Function Expansions(Sturm-Liouville Theory)

You are familiar with Fourier series: expansions of functions in in an interval I =[−L, L] in terms of trigonometric functions, sin(πx/L) and cos(πx/L). It is often useful toexpand a function in terms of Legendre polynomials or some other special function. Thiscan be done because these functions have some important properties that follow from thedifferential equations that they satisfy. To describe these properties we first give somedefinitions:

DefinitionA second order, linear, differential operator L defined on an interval I = (a, b) is said

to be in self-adjoint form if it is written in the form

L =d

dx

(p(x)

d

dx

)+ q(x), (10)

where p(x) is a differentiable function that never vanishes in the interval I and q(x) is acontinuous function in I.

The definition of a self-adjoint operator is sufficiently general to include all normal,second order, linear differential operators on (a, b). To see this take such an operator

L = a2(x)d2

dx2+ a1(x)

d

dx+ a0(x)

and multiply it by p(x)a2(x)

(this is allowed because it is assumed that a2(x) 6= 0, ∀x ∈ (a, b)).

Then choosing p(x) so that p′ = (a1

a2)p and q(x) to be q = (a0

a2)p ensures that

L :=

(p(x)

a2(x)

)L = p(x)

d2

dx2+ p′(x)

d

dx+ q(x) =

d

dx

(p(x)

d

dx

)+ q(x)

is in self-adjoint form. The homogeneous equations Ly = 0 and Ly = 0 have exactly thesame solutions, solving one is completely equivalent to solving the other.

The function p(x) can be found from the condition p′ = (a1

a2)p,

p′(x)

p(x)=

a1(x)

a2(x)⇒ p(x) = exp

(∫ x a1(t)

a2(t)dt

).

11

Examples1) The harmonic oscillator equation is already self-adjoint with p(x) = 1 and q(x) = 0.

2) Legendre’s equations is already self-adjoint with p(x) = (1 − x2), p′(x) = −2x, andq(x) = 0 so

d

dx

[(1− x2)

dy

dx

]= λy.

3) Laguerre’s equation is not self-adjoint but can be made so by using

p(x) = exp

{∫ x(1− t

t

)dt

}= xe−x, q(x) = 0

giving

xe−xy′′ + (1− x)e−xy′ = λe−xy ⇒ d

dx

(xe−x dy

dx

)= λe−xy.

4) Hermite’s equation is not self-adjoint but can be made so by using

p(x) = exp

{∫ x

(−2t)dt

}= e−x2

, q(x) = 0

giving

e−x2

y′′ − 2xe−x2

y′ = λe−xy ⇒ d

dx

(e−x2 dy

dx

)= λe−x2

y.

In the last two examples we have had to modify the eigenvalue equation in its originalform, Ly = λy, to include a function on the right hand side,

Ly = λr(x)y,

where r(x) = p(x)/a2(x) is a continuous, non-negative function on I. So for Laguerre’s

equation r(x) = e−x while for Hermite’s equation r(x) = e−x2

. In general r(x) is calledthe weight function for the differential equation.

The self-adjoint form is extremely important in mathematical physics because theireigenfunctions can be shown to have the following properties:

1) Eigenfunctions corresponding to distinct non-zero eigenvalues are linearly independent

2) Eigenfunctions corresponding to distinct eigenvalues are orthogonal (in the sense de-fined below).

3) The eigenfunctions of a self-adjoint operator form a complete set, i.e. they can beused to expand functions in the same way as sin(nx) and cos(nx) are used in Fourierseries.

12

1) Linear Independence

Denote the eigenfunctions of a linear operator by φi and the corresponding eigenvaluesby λi, where i = 1, 2, . . . labels the eigenvalues. In self-adjoint form the eigenvalue equationis

Lφi(x) = λir(x)φi(x).

We shall prove that {φ1, . . . , φk} are linearly independent if {λ1, . . . , λk} are a set of kdistinct non-zero eigenvalues. The proof is by induction.

i) Obviously a single non-zero eigenfunction φ1 is linearly independent, since d1φ1 = 0if and only if d1 = 0.

ii) Assume that eigenfunctions {φ1, . . . , φk} with distinct non-zero eigenvalues {λ1, . . . , λk}are linearly independent, i.e. the only way that

d1φ1 + · · ·+ dkφk = 0 (11)

can be satisfied with di constants is to have d1 = · · · = dk = 0.iii) Let φk+1 be another eigenfunction with non-zero eigenvalue, λk+1 6= λi, 1 ≤ i ≤ k.

We must show that the only way that the equation

d1φ1 + · · ·+ dkφk + dk+1φk+1 = 0 (12)

can be satisfied is to have d1 = · · · = dk = dk+1 = 0.

Applying L to (12) gives

d1Lφ1 + · · ·+ dkLφk + dk+1Lφk+1 = 0

because L is linear. Since Lφi = λir(x)φi this means that

d1λ1rφ1 + · · ·+ dkλkrφk + dk+1λk+1rφk+1 = 0

⇔ d1λ1φ1 + · · ·+ dkλkφk + dk+1λk+1φk+1 = 0. (13)

Multiplying (12) by λk+1 and it subtracting the result from (13) gives

d1(λ1 − λk+1)φ1 + · · ·+ dk(λk − λk+1)φk = 0.

Since, by assumption, λk+1 6= λi and φi are linearly independent for 1 ≤ i ≤ k thiscan only be true if d1 = · · · = dk = 0. Equation (12) then implies that dk+1 = 0 aswell so it must be the case that all of d1 = · · · = dk = dk+1 = 0 for (12) to be true.Hence {φ1, . . . , φk, φk+1} are all linearly independent.

2) Orthogonality

Let φ1(x) and φ2(x) be two eigenfunctions of a self-adjoint operator L on an interval[a, b]. Associate with these two eigenfunctions the integral

φ1.φ2 :=

∫ b

a

φ1(x)φ2(x)r(x)dx, (14)

13

where r(x) is the weight function for L. Note that φ1.φ2 is just a number, it is independentof x. We shall call this number the inner product of the two functions φ1(x) and φ2(x)— this is in analogy to the inner product of two vectors. When φ1.φ2 = 0 the two functionsare said to be orthogonal. We shall show that, if φ1(x) and φ2(x) correspond to distincteigenvalues λ1 and λ2 with λ1 6= λ2, then φ1.φ2 = 0 (subject to some constraints on theboundary conditions which will be derived below). To see this consider

λ1φ1.φ2 − λ2φ1.φ2 = λ1

∫ b

a

φ1(x)φ2(x)r(x)dx− λ2

∫ b

a

φ1(x)φ2(x)r(x)dx

=

∫ b

a

{λ1φ1(x)r(x)}φ2(x)dx−∫ b

a

φ1(x){λ2φ2(x)r(x)}dx

=

∫ b

a

{Lφ1(x)}φ2(x)dx−∫ b

a

φ1(x){Lφ2(x)}dx

=

∫ b

a

{ d

dx

[pdφ1

dx

]+ qφ1

}φ2dx−

∫ b

a

φ1

{ d

dx

[pdφ2

dx

]+ qφ2

}dx.

where p(x) and q(x) are the functions appearing in the self-adjoint form of L in (10). Thetwo terms involving q(x) cancel and the remaining terms can be integrated by parts togive

λ1φ1.φ2 − λ2φ1.φ2 =[p φ′

1φ2]ba −

∫ b

a

pdφ1

dx

dφ2

dxdx−

[p φ1φ

′2]

ba +

∫ b

a

pdφ1

dx

dφ2

dxdx

=[p(φ′

1φ2 − φ1φ′2)]ba.

Since λ1 6= λ2 by assumption we can conclude that φ1.φ2 = 0 if and only if

p(b){φ′1(b)φ2(b)− φ1(b)φ

′2(b)} = p(a){φ′

1(a)φ2(a)− φ1(a)φ′2(a)}. (15)

Three important cases in which these boundary conditions are satisfied are:

1) Periodic boundary conditions: if p(a) = p(b), then periodic boundary conditions,φi(a) = φi(b) and φ′

i(a) = φ′i(b) (for i = 1, 2), automatically satisfy (15). This is the

case for the harmonic oscillator equation, for example.

2) If p(a) = p(b) = 0: then (15) is automatic. An example of this situation is Legendre’sequation with a = −1, b = 1 and p(x) = 1− x2.

3) Unmixed boundary conditions: if the boundary conditions are unmixed and ofthe form α1φi(a) + α3φ

′i(a) = 0 and β2φi(b) + β4φ

′i(b) = 0, for both i = 1 and i = 2,

then both sides of (15) vanish identically.

In all three of these situations any two eigenfunctions corresponding to distinct eigen-values are orthogonal, φ1.φ2 = 0. The term ‘orthogonal’ here is in analogy with ordinary

14

vector spaces: if V is a finite dimensional vector space, of dimension d, with an innerproduct defined on it and L : V → V is a linear operator on V then, in a given basis, Lcan be represented as a matrix with components La

b, a, b = 1, . . . , d. The matrix L then“operates” on V as a linear map of vectors to vectors — if v is a vector with componentsva in some basis then L maps v to v′ where, in matrix notation, (v′)a =

∑db=1 La

bvb. Thevector v is an eigenvector of the matrix L if

Lv = λv,

where λ is just a number, called the eigenvalue of v. Two vectors v1 and v2 are orthogonalif their inner product vanishes, v1.v2 = 0. If v1 and v2 are both eigenvectors of L withdistinct eigenvalues λ1 6= λ2, then

v1.(Lv2)− (Lv1).v2 = v1.(λ2v2)− (λ1v1).v2 = (λ1 − λ2)v1.v2

so v1 and v2 are orthogonal if and only if

v1.(Lv2)− (Lv1).v2 = 0. (16)

Any matrix L which satisfies (16) for all vectors v1 and v2 is called a self-adjoint operator.

3) Completeness

For a finite dimensional vector space a basis is said to be complete if any vector canbe written as a linear sum of the basis vectors. Similarly a set of functions is said to becomplete if any (reasonably well behaved) function can be written as a linear combinationof the functions in the set, with constant co-efficients. This is a generalisation of the conceptof a Fourier series (expanding a periodic function in terms of sines and cosines). Labelthe eigenfunctions corresponding to linearly independent eigenfunctions of a self-adjointdifferential operator in an interval I = [a, b] by λ0, λ1, . . ., in ascending order, and thecorresponding eigenfunctions by φ0(x), φ1(x), · · ·. We assume here that the eigenvalues arediscrete but they need not be distinct — two linearly independent eigenfunctions may havethe same eigenvalue. The eigenfunctions are complete if any square integrable function∗

h(x) in I can be written as a sum

h(x) =∞∑

n=0

cnφn(x) (17)

where cn are constants. More precisely, the set {φn(x)} is a complete set if there existconstants cn such that

limN→∞

∫ b

a

{h(x)−

N∑

n=0

cnφn(x)}2

r(x)dx = 0.

∗ A function h(x) is square integrable on an interval [a, b] if∫ b

ah2dx exists and is finite.

15

The square is to ensure that the integrand is positive everywhere, then there is no possibilitythat h(x) −

∑∞n=0 cnφn(x) 6= 0 but still integrates to zero because of oscillations — the

integral of a positive integrand can only give zero if the integrand vanishes for all x ∈ (a, b).The trigonometric functions, sin(nx) and cos(nx) give a complete set of functions on

I = [−π, π]. Other examples are:• Legendre polynomials Pn(x) are a complete set on I = [−1, 1].• The Laguerre polynomials Ln(x) are a complete set on I = [0,∞).• The Hermite polynomials are a complete set in I = (−∞,∞).We shall not prove that these functions are complete on the stated intervals in this

course, but merely state the fact.When the eigenvalues of the eigenfunctions appearing in the expansion (17) are distinct

we can use orthogonality of the eigenfunctions to determine the constants cn. Since

∫ b

a

h(x)φm(x)r(x)dx =∞∑

n=0

cn

∫ b

a

φn(x)φm(x)r(x)dx

and the integral vanishes unless m = n by orthogonality, so only the term with n = msurvives in the sum. This means that

∫ b

a

h(x)φm(x)r(x)dx = cm

∫ b

a

{φm(x)}2r(x)dx

so the cn in (17) are given by

cn =

∫ b

ah(x)φn(x)r(x)dx

∫ b

a{φn(x)}2r(x)dx

(18)

and the cn can be calculated either by evaluating the integrals analytically or numerically,if necessary.

Examples1) Using harmonic functions sin(nx) and cos(nx), let h(x) = x, x ∈ (−π, π). Since h(x)

is an odd function h(−x) = −h(x), we expect that only sin(nx) will contribute. Thisexpectation is confirmed by observing that

∫ π

−πx cos(nx)dx = 0. Then we only need

to consider φn(x) = sin(nx) in equation (18) giving

cn =

∫ π

−πx sin(nx)dx

∫ π

−πsin2(nx)dx

=

∫ π

−πx sin(nx)dx

π

= − 1

πn

[x cos(nx)

]π−π

+1

nπ

∫ π

−π

cos(nx)dx

= − 1

πn{π cosnπ − (−π) cos(−nπ)}

= − 2

ncosnπ =

2

n(−1)n+1.

16

We conclude that

x = 2∞∑

n=1

(−1)n+1

nsin(nx), −π < x < π. (19)

Note that this formula is not correct for x ≥ π or x ≤ −π, in particular the right-handside is zero for x = ±π.

2) The function h(x) = x is even easier using, for example, Legendre polynomials. SinceP1(x) = x we can see right away, without having to do any integrals, that h(x) = P1(x)in this case, so c1 = 1 and cn = 0, ∀n 6= 1. In fact this expansion is valid for all−∞ < x < ∞ and not just x ∈ [−1, 1].

3) A slightly less trivial example is h(x) = (x−2)2. Since this is a quadratic function wecan get away with using only the first three Legendre polynomials, P2(x) =

12(3x

2−1),P1(x) = x and P0(x) = 1. Then

h(x) = x2 − 4x+ 4 =2

3P2(x) +

1

3− 4x+ 4 =

2

3P2(x)− 4P1(x) +

13

3P0(x).

So c0 = 133 , c1 = −4, c2 = 2

3 and, for n ≥ 4, cn = 0.

4) Now consider the discontinuous function

h(x) =

{V 0 < x < 1−V −1 < x < 00 x = 0,

where V is a positive constant. This is an odd function, h(−x) = −h(x). If we expandthis function using trigonometric functions then again we only need sine functionsh(x) =

∑∞n=1 cn sin(nx) and equation (18) gives

cn = V

(∫ 1

0sin(nπx)dx−

∫ 0

−1sin(nπx)dx

∫ 1

−1sin2(nπx)dx

)

=V

π

{− 1

n

[cos(nπx)

]10+

1

n

[cos(nπx)

]0−1

}

=V

π

{− 1

n

((−1)n − 1

)+

1

n

(1− (−1n)

)}

=2V

nπ

(1− (−1)n

)

=

{4Vπn for n odd0 for n even.

So

h(x) =4V

π

∞∑

m=0

sin((2m+ 1)x

)

2m+ 1, −1 < x < 1.

17

We can also expand this function using Legendre polynomials for x in the range(−1, 1). We already know that the Legendre polynomials are orthogonal and it is a stan-dard conventions to normalise them so that

∫ 1

−1

Pn(x)Pm(x)dx =

(2

2n+ 1

)δn,m. (20)

Note also that Pn(x) is an even function when n is even and an odd function when n isodd, Pn(−x) = (−1)nPn(x), so we will only need odd n in an expansion of h(x). Usingequation (18) again

cn =(2n+ 1)V

2

(∫ 1

0

Pn(x)dx−∫ 0

−1

Pn(x)dx

)

=

{(2n+ 1)V

∫ 1

0Pn(x)dx for n odd

0 for n even.

Of course∫ 1

0Pn(x)dx is a number that depends on n. In fact this can be evaluated

using Rodrigues’ formula (3):

Pn(x) =1

2nn!

dn

dxn(x2 − 1)n

⇒∫ 1

0

Pn(x)dx =1

2nn!

[dn−1

dxn−1(x2 − 1)n

]1

0

= − 1

2nn!

[dn−1

dxn−1(x2 − 1)n

]

x=0

= − 1

2nn!

n∑

j=0

(n

j

)(−1)n−j

[dn−1

dxn−1x2j

]

x=0

.

This vanishes if n is even, while if n is odd only the term in the sum with j = (n− 1)/2 isnon-zero. So we consider n = 2m+ 1 odd, in which case

∫ 1

0

P2m+1(x)dx = − 1

22m+1(2m+ 1)!

(2m+ 1

m

)(−1)m+1(2m)! =

(−1)m

22m+1

(2m)!

m!(m+ 1)!.

So

c2m+1 = (−1)mV(4m+ 3)

22m+1

(2m)!

m!(m+ 1)!

and

h(x) = V∞∑

m=0

(−1)m(4m+ 3)

22m+1

(2m)!

m!(m+ 1)!P2m+1(x)

= V

(3

2P1(x)−

7

8P3(x) +

11

16P5(x)− . . .

).

18

Partial Differential Equations

1. Definition

A partial differential equation involves derivatives of a function u(x, y, . . .) of morethan one independent variable. Partial differential equations necessarily involve partialderivatives such as ∂u

∂x.

Examples of second order, linear, partial differential equations are:

• Wave Equation: u(x, t)∂2u

∂x2= a2

∂2u

∂t2

• Heat Equation: u(x, t)∂2u

∂x2= a2

∂u

∂t

• Laplace’s Equation: u(x, y, z)

∂2u

∂x2+

∂2u

∂y2+

∂2u

∂z2= 0.

In all of these cases limits must be put on the range of x and t (or x, y and z in thecase of Laplace’s equation) and, in order to ensure a unique solution of the equation, onemust specify some properties of the function u at the limit of these ranges. For examplesolving Laplace’s equation in some finite volume of 3-dimensional space requires specifyingproperties of u(x, y, z) on the boundary surface of the volume of interest. If the value of uis given at every point on the boundary then a unique solution exists (Dirichlet boundaryconditions); alternatively a unique solution of Laplace’s equation exists if the derivativeof u in the direction normal to the boundary is specified at every point of the boundary(Neumann boundary conditions).

2. Separation of Variables

The method used for solving partial differential in this course will be that of separa-tion of variables. Consider, for example, the wave equation

∂2u

∂x2= a2

∂2u

∂t2. (21)

The method of solution by separation of variables requires first looking for solutions of theform u(x, t) = T (t)X(x), i.e. solutions which factorise into a function T (t) of t only and afunction X(x) of x only. This then reduces the partial differential equation (21) for u(x, t)to ordinary differential equations for T (t) and X(x), which we can then try to solve using

19

the methods already studied. Of course not all solutions of the partial differential equationcan be factorised, or separated, in this way but, for the equations that are studied in thiscourse, a general solution can always be written as a linear combination of such separatedsolutions, so finding all possible separated solutions leads to the general solution.

ExamplesExample 1): the wave equation

∂2u

∂x2= a2

∂2u

∂t2. (22)

This equation describes, for example, small lateral oscillations of a string or wire stretchedunder tension between two fixed points, with u(x, t) the lateral displacement of the wireat the point x at time t — a violin or a guitar string for instance. We shall not derive theequation in this course, we shall just treat the mathematical problem of finding solutions.Suppose the string has length L and the end points are fixed at x = 0 and x = L, this meansthat the displacement u(0, t) = u(L, t) = 0 for all t. Suppose we also know the initial profileof the string’s position and velocity, that is u(x, 0) = f(x) and

(∂u∂t

)t=0

= v(x) where f(x)and v(x) are known functions. In the separation of variables method we try u = X(x)T (t)and look for a solution of (22) with this form. This implies that

(d2X

dx2

)T = a2X

(d2T

dt2

),

so the partial derivatives have disappeared and been replaced by ordinary derivatives.Dividing through by XT (assuming it is not zero) gives

1

X

(d2X

dx2

)=

a2

T

(d2T

dt2

). (23)

Now notice that the left-hand side of this equation only depends on x while the right-handside only depends on t. Pick a value of t and keep it fixed, then the right-hand side isequal to a fixed number and

1

X

d2X

dx2= λ,

where λ is the value of a2

T

(d2Tdt2

)at our fixed choice of t. The problem is now reduced to

that of solvingd2X

dx2= λX

with λ a constant and this is just the harmonic oscillator equation, with the boundaryconditions that X(0) = X(L) = 0 (since u(0, t) = u(L, t) = 0 for any t). With these

boundary conditions λ < 0 for a non-zero solution, more specifically λ = −n2π2

L2 withn = 1, 2, 3, . . ., and

X(x) = Cn sin(nπx

L

)

20

where Cn is a constant, for the moment arbitrary.Now use this form for X(x) in (23) and we get an ordinary differential equation for

T (t)a2

T

(d2T

dt2

)= λ ⇒

(d2T

dt2

)= −

(nπaL

)2T,

which is the harmonic oscillator equation for T (t). The most general solution of thisequation is

T (t) = An cos

(nπt

aL

)+Bn sin

(nπt

aL

),

where An and Bn are constants.So we have found an infinite number of solutions of (22) with the separated form

u = TX , with u(0, t) = u(L, t) = 0, one for each positive integer n,

un(x, t) =

{An cos

(nπt

aL

)+Bn sin

(nπt

aL

)}Cn sin

(nπxL

).

Clearly the constants Cn are redundant here and we might as well write

un(x, t) =

{An cos

(nπt

aL

)+Bn sin

(nπt

aL

)}sin(nπx

L

)

by redefining An and Bn if necessary.In fact, because the original equation is linear, any linear combination of the un

u(x, t) =

∞∑

n=1

{An cos

(nπt

aL

)+Bn sin

(nπt

aL

)}sin(nπx

L

)(24)

is also a solution that satisfies the boundary conditions u(0, t) = u(L, t) = 0. Notice thatthis sum is not of separated form, in general. In fact (24) is the most general solution of(22) that has u(0, t) = u(L, t) = 0 (a proof of this is not given here).

The final step is to determine the constants An and Bn from the initial conditionsu(x, 0) = f(x) and

(∂u∂t

)t=0

= v(x). From (24)

u(x, 0) =∞∑

n=1

An sin(nπx

L

)= f(x)

and (∂u(x, t)

∂t

)

t=0

=( π

aL

) ∞∑

n=1

nBn sin(nπx

L

)= v(x).

We can now extract An and Bn using

∫ L

0

sin(πnx

L

)sin(πmx

L

)dx =

L

2δmn,

21

for m and n positive integers, to get

Am =2

L

∫ L

0

f(x) sin(πmx

L

)dx, Bm =

2a

πm

∫ L

0

v(x) sin(πmx

L

)dx.

Suppose, for example, that v(x) = 0 so the motion starts from rest at t = 0 withinitial profile

f(x) =

{2dLx 0 ≤ x ≤ L/2

2d(L−xL

), L/2 ≤ x ≤ L.

(25)

So Bn = 0 and

Am =4d

L2

{∫ L/2

0

x sin(mπx

L

)dx+

∫ L

L/2

(L− x) sin(mπx

L

)dx

}=

8d

(πm)2sin(mπ

2

),

(26)Since sin

(mπ2

)is zero if m is even and (−1)(m−1)/2 if m is odd the function (25) can be

expressed as

f(x) =8d

π2

∑

n=1,3,5,...

(−1)(n−1)/2 1

n2sin(nπx

L

).

Note in passing that, evaluating both sides of this expression at x = L/2 using (25), wehave

d =8d

π2

∑

n=1,3,5,...

1

n2⇒

∑

n=1,3,5,...

1

n2=

π2

8.

We can put everything together now and write the full solution, with the initialcondition (25), as

u(x, t) = 8d

π2

∑

n=1,3,5,...

(−1)(n−1)/2 1

n2sin(nπx

L

)cos

(nπt

aL

).

Example 2): the heat equation∂2u

∂x2= a2

∂u

∂t. (27)

This equation describes, for example, the distribution of temperature as a function oftime along a rod: u(x, t) is the temperature at the point x at time t. More generally(27) describes the diffusion of any physical quantity, not just heat, but we shall focus onheat as our example. Suppose, for instance, that the rod has length L and is positionedwith the left-hand end at x = 0 and the right-hand at x = L and that the left-hand endhas a fixed temperature u(0, t) = 0 while the right-hand end has its temperature heldfixed at u(L, t) = T , where T is a constant (in this example T shall be used to denote aconstant temperature and S(t) will be used for the function of t in the separated form:u(x, t) = X(x)S(t)). We shall suppose that the initial temperature profile of the rod att = 0 is a known function of position, so u(x, 0) = f(x) for some given function f(x).

22

Using separation of variables we put u(x, t) = X(x)S(t) into the equation (27) gives

(d2X

dx2

)S = a2X

(dS

dt

),

so the partial derivatives have disappeared and been replaced by ordinary derivatives.Dividing through by XS (assuming it is not zero) gives

1

X

(d2X

dx2

)=

a2

S

(dS

dt

). (28)

Again the left-hand side of this equation only depends on x while the right-hand side onlydepends on t. Pick a value of x and keep it fixed, then the left-hand side is equal to a fixednumber and

1

S

dS

dt=

λ

a2,

where λ is the value of 1X

(d2Xdx2

)at our fixed choice of x. The problem is now reduced to

that of solvingdS

dt=

(λ

a2

)S

with λ a constant. The solutions of this equation are exponential functions

S(t) = Cλeλt/a2

,

where Cλ are constants. If λ were positive then S(t), and so u(x, t), would grow expo-nentially for large t and, if u(x, t) has the physical interpretation of a temperature, this isnot possible, so λ ≤ 0. Writing λ = −µ2 equation (28) now gives an ordinary differentialequation for X(x)

d2X

dx2= −µ2X,

the harmonic oscillator equation again. If µ 6= 0 the most general solution is

X(x) = Aµ cos(µx) +Bµ sin(µx)

while for µ = 0X(x) = A0 +B0x,

where Aµ and Bµ are constants. Imposing the boundary condition that u(0, t) = 0, ∀t,implies that X(0) = 0 which requires Aµ = 0. The other boundary condition, thatu(L, t) = T , ∀t, can easily be accommodated by taking B0 = T

L and µ = nπL with n an

integer (which we can always take to be positive) leading to the form

u(x, t) =Tx

L+

∞∑

n=1

Bn sin(nπx

L

)exp

(−(nπaL

)2t

)(29)

23

where the Cλ have been absorbed into the Bµ and Bµ have been relabelled as Bn.The constants Bn have still to be calculated from the initial temperature profile,

u(x, 0) = f(x), but first observe that the late time behaviour is independent of Bn

limt→∞

u(x, t) =Tx

L, (30)

so at late times u(x, t) becomes independent of t and rises linearly with x from 0 to Tas x goes from 0 to L. This has the intuitive interpretation that, if a rod has one endfixed at temperature 0 and the other at temperature T then one expects the equilibriumtemperature to increase linearly from one end to the other. No matter what the initialtemperature profile is it will always settle down to this equilibrium profile eventually andequation (29) gives us the wherewithal to follow this evolution through quantitatively. Notethat the smaller the constant a2 is the faster (29) relaxes to the equilibrium configuration(30). The constant a2 in (27) is the inverse of the thermal conductivity of the rod —materials with a high thermal conductivity relax to their equilibrium temperature fasterthan materials with a low thermal conductivity.

Returning to the full solution as a function of time (29) consider the case when theinitial temperature profile is constant along the the rod and then jumps discontinuouslyto T and the right-hand end:

u(x, 0) =

{0, 0 ≤ x < LT, x = L.

(31)

A subtlety here is that u(x, 0) is not differentiable at x = L, but this is not an obstacle tosolving the problem – we solve for 0 ≤ x < L, where u(x, 0) is differentiable, and imposethe boundary condition that limx→L u(x, 0) = T .

With the initial condition (31) setting t = 0 in (29) implies

Tx

L+

∞∑

n=1

Bn sin(nπx

L

)=

{0, 0 ≤ x < LT, x = L.

(32)

This requires that the infinite sum cancels the linear term on the left-hand side for 0 ≤x < L (note that the sum vanishes for x = L, so the boundary condition at the right-hand end is automatically satisfied in this expression). We have already worked out theFourier series for x, with −π < x < π, in (19) and it is a simple matter to extend this to−L < x < L, by rescaling x → πx

L in (19),

x =2L

π

∞∑

n=1

(−1)n+1

nsin(nπx

L

), −L < x < L. (33)

The constants Bn can now be read off by putting (33) into (32),

Bn = −2T

π

(−1)n+1

n,

24

so the complete solution is, from (29),

u(x, t) =Tx

L+

2T

π

∞∑

n=1

(−1)n

nsin(nπx

L

)exp

(−(nπaL

)2t

).

Example 3): Laplace’s equation

∂2u

∂x2+

∂2u

∂y2+

∂2u

∂z2= 0. (34)

This equation is important for problems in electrostatics, where it describes the electricpotential u(x, y, z) in a charge-free region of space. Generically we want to solve for u ina volume V that contains no electric charges, but there may be charges outside V , or onits boundary, and these can produce an electric field E = −∇u inside V . The net effectof these charges can be incorporated into boundary conditions on u, i.e. by specifying uon the boundary of V . If V is a cuboid then Cartesian co-ordinates are appropriate, with0 < x < a, 0 < y < b and 0 < z < c for a cuboid of width a, depth b and height c. Supposefirst of all that u vanishes on all except the top face and is given by a specified functionu(x, y, c) = f(x, y) on the top face.

The technique of separation of variables requires looking for solutions of (34) of theform

u(x, y, z) = X(x)Y (y)Z(z),

so, (d2X

dx2

)Y Z +X

(d2Y

dy2

)Z +XY

(d2Z

dz2

)= 0.

Dividing through by XY Z (assuming it is non-zero) we get

1

X

d2X

dx2+

1

Y

d2Y

dy2+

1

Z

d2Z

dz2= 0.

Since the second and third terms in this equation are independent of x we conclude that1X

d2Xdx2 is a constant, independent of x. The same argument allows to conclude that 1

Yd2Ydy2

is independent of y and 1Y

d2Ydz2 is independent of z. So we are led to

d2X

dx2= λX,

d2Y

dy2= µY,

d2Z

dz2= νZ

where λ, µ and ν are constants summing to zero,

λ+ µ+ ν = 0. (35)

Looking for solutions of (34) that factorise has reduced the problem to that of threeharmonic oscillator equations.

25

Since the boundary conditions require that u should vanish for x = 0, x = a, y = 0and y = b it must be the case that X(0) = X(a) = 0 and Y (0) = Y (b) = 0. For non-zerosolutions this is only possible if λ and µ are negative, in particular we are forced to

X(x) = An sin(nπx

a

), λ = −

(nπa

)2

Y (y) = Bm sin(mπy

b

), µ = −

(mπ

b

)2,

where n and m are integers (which we can take to be positive) and An and Bm areconstants.

This now fixes the constant ν in (35) to be

ν =

(n2

a2+

m2

b2

)π2

so the final equation is

d2Z

dz2= νZ, with ν =

(n2

a2+

m2

b2

)π2 > 0.

This has solutionsZ = Cnme

√νz +Dnme−

√νz

where Cnm and Dnm are constants, which be be different for different pairs (n,m). Oneof the boundary conditions was that u(x, y, 0) = 0 so we must impose Z(0) = 0 whichrequires Dnm = −Cnm so

Z = Cnm

(e√νz − e−

√νz)= 2Cnm sinh(

√νz).

Combining this with the forms ofX and Y obtained above gives the following solutionsof Laplace’s equation, one for each pair of positive integers m and n,

unm(x, y, z) = Anm sin(nπx

a

)sin(mπy

b

)sinh(

√νnmz),

where all the constants have been combined into one Anm = 2AnBmCnm. and the different

ν’s are distinguished by subscripts: νnm =(

n2

a2 + m2

b2

)π2. These are all solutions of

Laplace’s equation which vanish on five faces of the cuboid, but not on the top face z = c.The most general solution satisfying these conditions is a linear combination of these,which can be represented by a double sum over n and m:

u(x, y, z) =∞∑

n,m=1

Anm sin(nπx

a

)sin(mπy

b

)sinh

(√(n2

a2+

m2

b2

)πz

),

where νnm has been written in full.

26

To solve the problem uniquely we now need to specify u(x, y, z) on the top face,u(x, y, z) = f(x, y) for some given function f(x, y), and then the constants Anm are deter-mined by demanding that

u(x, y, c) = f(x, y) =

∞∑

n,m=1

Anm sin(nπx

a

)sin(mπy

b

)sinh(

√νnmc).

For example suppose u is constant on the top face, u(x, y, c) = C. Then

∞∑

n,m=1

Anm sin(nπx

a

)sin(mπy

b

)sinh(

√νnmc) = C

and the Anm can extracted explicitly using

∫ a

0

sin(nπx

a

)sin

(n′πx

a

)dx =

a

2δnn′ and

∫ b

0

sin(mπy

b

)sin

(m′πy

b

)dx =

b

2δmm′ ,

for integers n, n′, m and m′. From this we can derive

(An′m′ab

4

)sinh(

√νn′m′c) = C

∫ a

0

sin

(n′πx

a

)dx

∫ b

0

sin

(m′πy

b

)dy.

This vanishes if either n′ or m′ is even, and

An′m′ =16C

π2n′m′ sinh(√νn′m′c)

if both n′ and m′ are odd. This gives

f(x, y) =16C

π2

∞∑

n,m=1,3,5,...

(1

nm

)sin(nπx

a

)sin(mπy

b

)=

{C, for 0 < x < a; 0 < y < b0, for x = 0, a; y = 0, b.

The full solution, with u = C on the top face and zero on the other five faces, is

u(x, y, z) =16C

π2

∞∑

n,m=1,3,5,...

sin(nπxa

)sin(mπyb

)sinh

(πz√

n2

a2 + m2

b2

)

nm sinh

(πc√

n2

a2 + m2

b2

) .

This method of solving Laplace’s equation can easily be extended to the case whereu 6= 0 on all six faces: because the equation is linear in u we can solve it for u = 0 on fivefaces and u 6= 0 on the remaining face; do this for u 6= 0 on each of the six faces in turn;then simply add the six resulting solutions. This will give the unique solution with u 6= 0on all six faces.

27

The choice of Cartesian co-ordinates in the above example of Laplace’s equation istailored to the shape of a cuboid, with boundary conditions on the surface. With othershapes different co-ordinates are often more useful. Consider for example the 2-dimensionalLaplace equation

∂2u

∂x2+

∂2u

∂y2= 0, (36)

for a function u(x, y). Cartesian co-ordinates are appropriate here for a rectangular region,but if we want to solve the equation in the interior of a disc, with a circular boundary,2-dimensional polar co-ordinates are more suitable. Let

x = ρ cosφ, y = ρ sinφ,

where 0 ≤ φ < 2π is a polar angle and 0 ≤ ρ < ∞ a radial co-ordinate in two dimensionswith ρ2 = x2 + y2. To solve (36) in the disc we need to think of u as a function of ρ and φand convert the partial derivatives to polar co-ordinates. The chain rule for differentiationgives

∂u

∂ρ=

(∂x

∂ρ

)(∂u

∂x

)+

(∂y

∂ρ

)(∂u

∂y

)= cosφ

(∂u

∂x

)+ sinφ

(∂u

∂y

)

∂u

∂φ=

(∂x

∂φ

)(∂u

∂x

)+

(∂y

∂φ

)(∂u

∂y

)= −ρ sinφ

(∂u

∂x

)+ ρ cosφ

(∂u

∂y

).

Applying the chain rule a second time gives

∂2u

∂ρ2= cosφ

{cosφ

(∂2u

∂x2

)+ sinφ

(∂2u

∂y∂x

)}+ sinφ

{cosφ

(∂2u

∂x∂y

)+ sinφ

(∂2u

∂y2

)}

= (cosφ)2(∂2u

∂x2

)+ (sinφ)2

(∂2u

∂y2

)+ 2 cosφ sinφ

(∂2u

∂x∂y

),

∂2u

∂φ2= −ρ cosφ

(∂u

∂x

)− ρ sinφ

{−ρ sinφ

(∂2u

∂x2

)+ ρ cosφ

(∂2u

∂y∂x

)}

− ρ sinφ

(∂u

∂y

)+ ρ cosφ

{−ρ sinφ

(∂2u

∂x∂y

)+ ρ cosφ

(∂2u

∂y2

)}

= ρ2(sinφ)2(∂2u

∂x2

)+ ρ2(cosφ)2

(∂2u

∂y2

)− 2ρ2 cosφ sinφ

(∂2u

∂x∂y

)

− ρ

{cosφ

(∂u

∂x

)+ sinφ

(∂u

∂y

)}.

By taking linear combinations of these equations one finds

∂2u

∂ρ2+

1

ρ

∂u

∂ρ+

1

ρ2∂2u

∂φ2=

∂2u

∂x2+

∂2u

∂y2.

Finally the 2-dimensional Laplace’s equation written in polar co-ordinates is:

∂2u

∂ρ2+

1

ρ

∂u

∂ρ+

1

ρ2∂2u

∂φ2= 0. (37)

28

We shall now find solutions to this equation using separation of variables. First note that,if u(ρ, φ) is to be a single valued function of position it must be periodic in the angularvariable φ, u(ρ, φ+ 2π) = u(ρ, φ). With this condition we seek separated solutions of theform

u(ρ, φ) = R(ρ)Φ(φ).

Putting this form of u(ρ, φ) into (37) we get

(d2R

dρ2

)Φ+

1

ρ

(dR

dρ

)Φ+

R

ρ2

(d2Φ

dφ2

)= 0.

Multiplying this by ρ2

RΦwe arrive at

ρ2

R

(d2R

dρ2

)+

ρ

R

(dR

dρ

)+

1

Φ

(d2Φ

dφ2

)= 0.

The by now familiar argument says that 1Φ

(d2Φdφ2

)and ρ2

R

(d2Rdρ2

)+ ρ

R

(dRdρ

)are constants.

Since Φ(φ+ 2π) = Φ(φ) is periodic we must have

Φ(φ) = An cos(nφ) +Bn sin(nφ)

were An and Bn are constants, n is an integer (which we can take to be non-negative) and

d2Φ

dφ2= −n2Φ.

There remains the equation

ρ2

R

(d2R

dρ2

)+

ρ

R

(dR

dρ

)= n2 ⇒ ρ2

(d2R

dρ2

)+ ρ

(dR

dρ

)= n2R.

For n 6= 0 this equation has linearly independent solutions R = ρn and R = ρ−n so thegeneral solution is

R = Cnρn +Dnρ

−n,

where Cn and Dn are constants. For n = 0, ln ρ is a solution (as well as R = const) so thegeneral solution for n = 0 is

R = C0 +D0 ln ρ,

with C0 and D0 again arbitrary constants.Putting all this together, the general solution of (37) is

u(ρ, φ) = C0 +D0 ln ρ+

∞∑

n=1

(Cnρ

n +Dnρ−n)(An cos(nφ) +Bn sin(nφ)) . (38)

29

(For a given n we do not need all four of An, Bn, Cn, and Dn: for instance, if Cn 6= 0, wecan absorb it into a redefinition of An and Bn and Dn.)

If we are to solve the equation in the interior of a disc, of radius a say, then Dn = 0,∀n, since otherwise u would diverge at ρ = 0. In this case the solution (38) reduces to

u(ρ, φ) = C0 +∞∑

n=1

ρn (An cos(nφ) +Bn sin(nφ)) (39)

where Cn has been absorbed into the constants An and Bn. Finally An, Bn and C0 mustbe determined by boundary conditions, such as specifying u(a, φ) on the perimeter of thedisc. For example if we are told that

u(a, φ) = u0 cos(2φ)

on the perimeter, where u0 is a constant, then

u(a, φ) = C0 +∞∑

n=1

an (An cos(nφ) +Bn sin(nφ)) = u0 cos(2φ).

Using orthogonality of the trigonometric functions

∫ 2π

0

cos(mφ) cos(nφ)dφ = πδmn

∫ 2π

0

cos(mφ) sin(nφ)dφ = 0,

for positive integers m and n, we derive

A2 =u0

a2, An = 0 for n 6= 2, C0 = 0 and Bn = 0 for ∀n.

So the full solution with u(a, φ) = u0 cos(2φ) is

u(ρ, φ) = u0

(ρa

)2cos(2φ)

If the origin is not in the region in which (37) is to be solved, then the other termsin (38) must be taken into account. For example, suppose we want to solve (37) in anannulus centred on the origin with the inner ring of radius b and the outer ring of radiusa, with u(a, φ) = u0 and u(b, φ) = u1 constants, independent of φ. Then the angular partof u(ρ, φ) is trivial, An = Bn = 0, and

u(ρ) = C0 +D0 ln ρ.

The ln ρ term is now allowed because ρ = 0 has been excluded from the region of interest.The constants C0 and D0 are determined by

u(a) = C0 +D0 lna = u0 and u(b) = C0 +D0 ln b = u1,

30

so

D0(lna− ln b) = u0 − u1 ⇒ D0 =(u0 − u1)

ln(a/b)

and

C0 = u0 −(u0 − u1

ln(a/b)

)lna =

(u1 lna− u0 ln b

ln(a/b)

).

The solution in the interior of the annulus, b ≤ ρ ≤ a, is now given by

u(ρ) =

(u1 ln a− u0 ln b

ln(a/b)

)+

(u0 − u1)

ln(a/b)ln ρ =

u1 ln(a/ρ) + u0 ln(ρ/b)

ln(a/b).

For many problems involving Laplace’s equation in 3-dimensions it is more convenientto use spherical polar co-ordinates (r, θ, φ) rather than Cartesian co-ordinates (x, y, z).These are related to each other in the usual way by

x = r cosφ sin θ

y = r sinφ sin θ

z = r cos θ.

To translate (34) into a differential equation involving (r, θ, φ) we need the following partialderivatives:

∂x

∂r= cosφ sin θ,

∂y

∂r= sinφ sin θ,

∂z

∂r= cos θ,

∂x

∂θ= r cosφ cos θ,

∂y

∂θ= r sinφ cos θ,

∂z

∂θ= −r sin θ,

∂x

∂φ= −r sinφ sin θ,

∂y

∂φ= r cosφ sin θ,

∂z

∂φ= 0.

Using these the chain rule for differentiation implies that

∂

∂r=

∂x

∂r

∂

∂x+

∂y

∂r

∂

∂y+

∂z

∂r

∂

∂z= cosφ sin θ

∂

∂x+ sinφ sin θ

∂

∂y+ cos θ

∂

∂z,

∂

∂θ=

∂x

∂θ

∂

∂x+

∂y

∂θ

∂

∂y+

∂z

∂θ

∂

∂z= r cosφ cos θ

∂

∂x+ r sinφ cos θ

∂

∂y− r sin θ

∂

∂z,

∂

∂φ=

∂x

∂φ

∂

∂x+

∂y

∂φ

∂

∂y+

∂z

∂φ

∂

∂z= −r sinφ sin θ

∂

∂x+ r cosφ sin θ

∂

∂y.

These can be inverted, by taking linear combination with trigonometric function for exam-ple, to express partial derivatives of Cartesian co-ordinates in terms of polar co-ordinates:

∂

∂x= cosφ sin θ

∂

∂r+

cosφ cos θ

r

∂

∂θ− sinφ

r sin θ

∂

∂φ

∂

∂y= sinφ sin θ

∂

∂r+

sinφ cos θ

r

∂

∂θ+

cosφ

r sin θ

∂

∂φ

∂

∂z= cos θ

∂

∂r− sin θ

r

∂

∂θ.

31

Using these, after a rather tedious calculation, the Laplace equation (34) can be expresseddirectly in terms of spherical polar co-ordinates:

∂2u

∂x2+

∂2u

∂y2+

∂2u

∂z2=

1

r

∂2(ru)

∂r2+

1

r2 sin θ

∂

∂θ

(sin θ

∂u

∂θ

)+

1

r2 sin2 θ

∂2u

∂φ2= 0.

We shall now show how to solve Laplace’s equation in spherical polar co-ordinates

1

r

∂2(ru)

∂r2+

1

r2 sin θ

∂

∂θ

(sin θ

∂u

∂θ

)+

1

r2 sin2 θ

∂2u

∂φ2= 0 (40)

for a function u(r, θ, φ).For simplicity let’s first tackle the problem when u is independent of φ — later we

shall look at the more general case when u also depends on φ. When u is independent ofφ equation (40) is

1

r

∂2(ru)

∂r2+

1

r2 sin θ

∂

∂θ

(sin θ

∂u

∂θ

)= 0.

As before we try a separation of variable form

u =R(r)

rΘ(θ)

(the 1/r is just for convenience, it make the final differential equation for R simpler).Equation (40) then reduces to

1

r

(d2R

dr2

)Θ+

R

r3 sin θ

d

dθ

(sin θ

dΘ

dθ)

)= 0.

Multiplying by r3/(RΘ) gives

r2

R

(d2R

dr2

)+

1

Θ sin θ

d

dθ

(sin θ

dΘ

dθ

)= 0,

and the first term in this equation depends only on r, not on θ, while the second dependsonly on θ, not on r — each term must therefore be a constant. The equation can thereforebe separated into two equations

r2d2R

dr2= −λR

1

sin θ

d

dθ

(sin θ

dΘ

dθ

)= λΘ,

where λ is a constant. The first of these equations can be satisfied by the monomialR = r−n, where λ = −n(n + 1) and by R = rn+1 with the same λ. The general solutionis therefore a linear combination of these

R(r) = Anrn+1 +Bnr

−n (41)

32

with An and Bn constants. The second equation is now

1

sin θ

d

dθ

(sin θ

dΘ

dθ

)= −n(n+ 1)Θ. (42)

This is in fact Legendre’s equation in disguise, because we can change variables from θ tox = cos θ (this is not the Cartesian co-ordinate x), with −1 ≤ x ≤ 1 for 0 ≤ θ ≤ π. Now

d

dθ=

dx

dθ

d

dx= − sin θ

d

dx= −

√1− x

2d

dx,

and (42) is the familiar

d2

dx

{(1− x

2)dΘ

dx

}= (1− x

2)d2Θ

dx2− 2x

dΘ

dx= −n(n+ 1)Θ,

and the solutions are Legendre polynomials

Θ = Pn(x) = Pn(cos θ)

with n = 0, 1, 2, . . . . Combining this with (41) the general solution of (40), with u inde-pendent of the polar angle φ, is (remember u = 1

rRΘ)

u =∞∑

n=0

(Anrn +Bnr

−n−1)Pn(cos θ).

Now consider the more general problem when u depends on all three arguments r, θand φ. Observe first that, because increasing the polar angle φ by 2π, φ → φ+ 2π, bringsus back to the same point, u must be a periodic function of φ, u(r, θ, φ+ 2π) = u(r, θ, φ).Bearing this in mind, we look for a solution of (40) of the form

u(r, θ, φ) =R(r)

rΘ(θ)Φ(φ).

With this form the equation reduces to

1

r

(d2R

dr2

)ΘΦ +

R

r3 sin θ

d

dθ

(sin θ

dΘ

dθ

)Φ+

RΘ

r3 sin2 θ

d2Φ

dφ2= 0.

Now multiplying by (r3 sin2 θ)/(RΘΦ) gives

r2 sin2 θ

R

(d2R

dr2

)+

sin θ

Θ

d

dθ

(sin θ

dΘ

dθ

)+

1

Φ

d2Φ

dφ2= 0. (43)

The first two terms are independent of φ, so the last term must be a constant, which wedenote by −m2, so

d2Φ

dφ2= −m2Φ.

33

Since Φ(φ+ 2π) = Φ(φ) we conclude that m is an integer and

Φ = Cm cos(mφ) +Dm sin(mφ)

with Cm and Dm constants. An alternative way of writing this is

Φ = Cmeimφ + C∗me−imφ,

where the Cm are complex and Cm = 12(Cm − iDm) with C∗

m = 12 (Cm + iDm). Either way

way, equation (43) is now

r2

R

(d2R

dr2

)+

1

Θ sin θ

d

dθ

(sin θ

dΘ

dθ

)− m2

sin2 θ= 0.

The first term here is function of r only while the second and third terms together givea function of θ only, and so must be a constant. Denoting this constant by −n(n + 1) asbefore we have

r2(d2R

dr2

)− n(n+ 1)R = 0

1

sin θ

d

dθ

(sin θ

dΘ

dθ

)+ n(n+ 1)Θ− m2Θ

sin2 θ= 0.

The equation for R has the same solutions as before,

R(r) = Anrn+1 +Bnr

−n,

while the equation for Θ can be written in terms of x = cos θ as

(1− x2)d2Θ

dx2− 2x

dΘ

dx2+ n(n+ 1)Θ− m2

(1− x2)Θ = 0. (44)

Solutions of this equation are called Associated Legendre Functions and are usuallydenoted by Pm

n (x). For m = 0 the solutions are just the usual Legendre polynomials,P 0n(x) = Pn(x). The first few associated Legendre functions with m 6= 0 are:

P 11 (x) = (1− x

2)1/2 = sin θ

P 12 (x) = 3x(1− x

2)1/2 = 3 cos θ sin θ

P 22 (x) = 3(1− x

2) = 3 sin2 θ

P 13 (x) =

3

2(5x2 − 1)(1− x

2)1/2 =3

2(5 cos2 θ − 1) sin θ

P 23 (x) = 15x(1− x

2) = 15 cos θ sin2 θ

P 33 (x) = 15(1− x

2)3/2 = 15 sin3 θ.

An explicit formula for positive m, which is sometimes useful, is

Pmn (x) = (1− x

2)m/2 dm

dxmPn(x).

34

Note that, since Pn(x) is a polynomial of order n in x, Pmn (x) vanishes for m > n, so we

shall restrict to m ≤ n. The sign of m does not matter in (44) and, by convention, P−mn (x)

are defined as being proportional to Pmn (x):

P−mn (x) = (−1)m

(n−m)!

(n+m)!Pmn (x),

so −n ≤ m ≤ n (the factor (−1)m (n−m)!(n+m)! is just a standard convention). The associated

Legendre functions are orthogonal for the same m but different n:

∫ 1

−1

Pmn′ (x)Pm

n (x)dx =

(2

2n+ 1

)(n+m)!

(n−m)!δn′,n.

Now the general solution of (40) can be written as a linear combination of the separatedforms:

u(r, θ, φ) =

∞∑

n=0

Anrn

n∑

m=−n

Pmn (cos θ)

(Cn,meimφ + C∗

n,me−imφ)

+

∞∑

n=0

Bnr−(n+1)

n∑

m=−n

Pmn (cos θ)

(Dn,meimφ +D∗

n,me−imφ), (45)

where Cn,m and Dn,m are complex constants.

3. Some Other Partial Differential Equations

The techniques described above can be used to tackle other partial differential equa-tions of a similar form. Examples are:

• the 2-dimensional heat equation for u(x, y, t) in Cartesian co-ordinates

∂2u

∂x2+

∂2u

∂y2= a2

∂u

∂t

• the 2-dimensional heat equation for u(ρ, φ, t) in polar co-ordinates

∂2u

∂ρ2+

1

ρ

∂u

∂φ+

1

ρ2∂2u

∂φ2= a2

∂u

∂t

• the 3-dimensional heat equation for u(x, y, z, t) in Cartesian co-ordinates

∂2u

∂x2+

∂2u

∂y2+

∂2u

∂z2= a2

∂u

∂t

• the 3-dimensional heat equation for u(r, θ, φ, t) in spherical polar co-ordinates

1

r

∂2(ru)

∂r2+

1

r2 sin θ

∂

∂θ

(sin θ

∂u

∂θ

)+

1

r2 sin2 θ

∂2u

∂φ2= a2

∂u

∂t

35

• the 2-dimensional wave equation for u(x, y, t) in Cartesian co-ordinates

∂2u

∂x2+

∂2u

∂y2= a2

∂2u

∂t2

• the 2-dimensional wave equation for u(ρ, φ, t) in polar co-ordinates

∂2u

∂ρ2+

1

ρ

∂u

∂φ+

1

ρ2∂2u

∂φ2= a2

∂2u

∂t2

• the 3-dimensional wave equation for u(x, y, z, t) in Cartesian co-ordinates

∂2u

∂x2+

∂2u

∂y2+

∂2u

∂z2= a2

∂2u

∂t2

• the 3-dimensional wave equation for u(r, θ, φ, t) in spherical polar co-ordinates

1

r

∂2(ru)

∂r2+

1

r2 sin θ

∂

∂θ

(sin θ

∂u

∂θ

)+

1

r2 sin2 θ

∂2u

∂φ2= a2

∂2u

∂t2.

4. Spherical Harmonics

When a periodic function f(φ+ 2π) = f(φ) is expanded as a Fourier series, in termsof trigonometric functions sin(mφ) and cos(mφ), it can be viewed as a function on a circleparameterised by the angular variable φ, i.e. each point φ on the circle is associated aunique real number f(φ). The expansion can be written in terms of sines and cosines

f(φ) =∞∑

m=0

(Cm cos(mφ) +Dm sin(mφ)) ,

with Cm and Dm real constants, or in terms of complex exponentials

f(φ) =

∞∑

m=0

(Cmeimφ + C∗

me−imφ),

where Cm = 12 (Cm − iDm) and C∗

m = 12 (Cm + iDm).

In a similar way one can expand functions on the surface of a sphere, parameterisedby angular co-ordinates 0 ≤ θ ≤ π and 0 ≤ φ < 2π, in terms of functions called SphericalHarmonics. These can be written in terms of the associated Legendre functions Pm

n (cos θ)and the complex exponential eimφ. They are denoted Y m

n (θ, φ) and are defined by

Y mn (θ, φ) =

√(2n+ 1)

4π

(n−m)!

(n+m)!Pmn (cos θ)eimφ,

36

(again the square root pre-factor is just a standard convention). A function on a sphericalsurface can then be written as

f(θ, φ) =

∞∑

n=0

n∑

m=−n

Cm,nYmn (θ, φ),

where Cm,n are constants. The Y mn are complex and

(Y mn )∗ = (−1)mY m

n

so f(θ, φ) is real if C∗mn = (−1)mCmn.

The first few spherical harmonics are:

Y 00 =

√1

4π,

Y 01 =

√3

4πcos θ, Y 1

1 =

√3

8πsin θ eiφ,

Y 02 =

1

2

√5

4π(3 cos2 θ − 1), Y 1

2 =

√15

8πsin θ cos θ eiφ, Y 2

2 =1

4

√15

2πsin2 θ e2iφ,

The considerations of the previous section show that the Y mn (θ, φ) are eigenfunctions

of the angular part of the Laplace operator (40)

1

sin θ

∂

∂θ

(sin θ

∂

∂θ

)Y mn +

1

sin2 θ

∂2

∂φ2Y mn = −n(n+ 1)Y m

n .

Spherical harmonics are orthogonal∫ 2π

0

dφ

∫ π

0

sin θdθ{Y m′

n′ (θ, φ)}∗

Y mn (θ, φ) = δn′,nδm′,m.

They also satisfy the Addition Theorem for Spherical Harmonics, which weshall state but not prove here. Let (θ, φ) and (θ′, φ′) label two points on the surface of asphere with angular separation γ, so

cos γ = sin θ sin θ′ cos(φ− φ′) + cos θ cos θ′,

then the addition theorem states that

Pn(cos γ) =

(4π

2n+ 1

) n∑

m=−n

{Y mn (θ′, φ′)

}∗Y mn (θ, φ).

Spherical harmonics can be used to expand solutions of the 3-dimensional Laplaceequation in spherical polar co-ordinates (this is just another way of writing equation (45)),

u(r, θ, φ) =

∞∑

n=0

n∑

m=−n

(An,mrn +Bn,mr−(n+1)

)Y mn (θ, φ),

where the constants An,m and Bn,m must satisfy An,−m = (−1)mA∗n,m and Bn,−m =

(−1)mB∗n,m in order to ensure that u(r, θ, φ) is a real function. In order to specify An,m

and Bn,m uniquely boundary conditions on u must be given, usually by specifying thevalue of u on some sphere of fixed radius a, u(a, θ, φ) = f(θ, φ) with f(θ, φ) given.

37

Complex Analysis

1. Complex Functions

Denote the set of all real numbers by R. A complex number, z, is a linear combinationof two real numbers, x and y,

z = x+ iy,

where i2 = −1. Complex numbers can be thought of as pairs of real numbers (x, y) andso can be interpreted as points in a 2-dimensional plane, called the complex plane, withx and y Cartesian co-ordinates in the plane. But they are more than that, because twocomplex numbers cam be multiplied together to give a third. The complex conjugateof a complex number is denote by z∗ with

z∗ = x− iy.

We shall denote the set of all complex numbers by C.A real function f(x) assigns a real number f(x) to every real number x ∈ R. A

complex function f(z) assigns a complex number, f(z), to every z ∈ C. A complexfunction therefore has a real and an imaginary part,

f(z) = u(x, y) + iv(x, y)

where u(x, y) and v(x, y) are real functions. The function u is called the real part of f ,and is often denoted by ℜf , while v is called the imaginary part and is denoted by ℑf .In general f(z) can depend on z and z∗ independently, e.g.

f(z, z∗) = zz∗ = x2 + y2 (in this case v = 0)

f(z, z∗) = z(z∗ + 1) = x2 + y2 + x+ iy,

but it may depend only on z (or only on z∗), e.g.

f(z) = z2 = (x+ iy)2 = x2 − y2 + 2ixy

f(z) = ez = ex+iy = exeiy = ex(cos y + i sin y)

f(z) = cos z =1

2

(eiz + e−iz

)=

1

2

(eix−y + e−ix+y

)

=1

2

(e−y(cosx+ i sinx) + ey(cosx− i sinx)

)

= cosh y cosx− i sinh y sinx.

In these last three examples f(z) can be written purely as a function of z.Sometimes it is convenient to use 2-dimensional polar co-ordinates, x = ρ cosφ and

y = ρ sinφ, soz = ρ(cosφ+ i sinφ) = ρeiφ.

38

In particular, for ρ = 1 and φ = π, we get Euler’s famous formula

eiπ = −1.

Sometimes this polar decomposition is useful in functions, for example

f(z) = ln z = ln(ρeiφ

)= ln(ρ) + iφ.

In particular, for φ = π, we have z = −ρ and, with ρ > 0,

ln(−ρ) = ln(ρ) + iπ

the logarithm of a negative number! There is a subtle ambiguity here though: sincee2πin = 1 for any integer n, we have eiφ = ei(φ+2πn) and

ln(z) = ln(ρei(φ+2πn)

)= ln(ρ) + i(φ+ 2πn).

We get a different value of log(z) for every integer n. This is really just a more extreme

case of the familiar ambiguity√x2 = ±|x| associated with real functions. Indeed

z12 =

√z =

√ρei(φ+2πn) =

√ρ e

iφ

2 eiπn = ±√ρ e

iφ

2 ,

where√ρ is taken to be the positive square root and the plus sign is for n even, the minus

sign for n odd. The usual real case is φ = 0. The complex function z12 is a multi-valued

function, it has two different values depending on whether n is even or odd — these arecalled branches of the function. The function f(z) = ln(z) has an infinite number ofbranches, one for each integer n.

Many formula familiar from real analysis must be interpreted with care in complexanalysis, for example

1x = 1 (46)

for any real number x.* Consider the complex expression, with n an integer,

1z = (e2πin)(x+iy) = e2πinxe−2πny

and again the result is different for every n, there is an infinite number of branches. So1z 6= 1 for all z, unless n = 0, and this has implications for raising any number to a complexpower. As another example take ez, where z = 1 + 2πik with k an integer. Then, sincee = e1+2πin,

ez =(e1+2πin

)1+2πik= e(1+2πin)(1+2πik) = e1−4π2kn+2πi(k+n) = e1−4π2kn (47)

* One way to prove this is to take logarithms, ln(1x) = x ln(1) = 0 for all real x, but theonly real number whose logarithm is zero is one, therefore 1x = 1, for all −∞ < x < ∞.

39

where, in the last equation, we have used e2πi(k+n) = 1 for any integers n and k. So, forfixed k, ez has an infinite number of branches, one for each integer n.

Now what about (ez)z? Well from (47)

ez = e1−4π2kn,

so now

(ez)z=(e1−4π2kn.e2πin

′

)z=(e1−4π2kn+2πin′

)1+2πik

= e1−4π2k(n+n′)+2πi(n′+k−4π2k2n)

= e1−4π2k(n+n′)−8π3ik2n (48)

where a second set of branches has been introduced, labelled by a new integer n′. We nowhave a double infinity of branches, labelled by the two integers n and n′. Compare this to

ez2

=(e1+2πin

)1−4π2k2+4πik.

Expanding the exponent

ez2

=(e1+2πin

)1−4π2k2+4πik

= e1−4π2k(2n+k)+2πi(2k+n′−4π2nk2)

= e1−4π2k(2n+k)−8π3ink2). (49)

Comparing (48) and (49) we see that (ez)z = ez2

if and only if

n+ n′ = 2n+ k ⇒ n′ = n+ k.

This only makes sense, when z = 1 + 2πik, if we are careful how we choose the branches:there is only one set of branches, labelled by n, and we must choose n′ = n + k in (48)giving

(ez)z = ez2

= e1−4π2k(2n+k)−8π3ik2n.

Note that the choice n′ = n + k is exactly what we would we have found if we had neverused e2πi(k+n) = 1 in the last equation of (47), but instead had kept the phase 2π(n+ k)in the exponent when calculating

(ez)z.

In fact it should be clear that we can iterate this to show that

N times︷︸︸︷(((ez)z)z · · ·

)z= ez

N

and the different branches can be consistently treated by replacing

e → e1+2πin

40

on both sides, provided the phase is never dropped, since then

N times︷︸︸︷(((ez)z)z · · ·

)z=

(((e1+2πin

)1+2πik)1+2πik

· · ·)1+2πik

= e(1+2πin)

N times︷︸︸︷(1+2πik)...(1+2πik)

= e(1+2πin)(1+2πik)N =(e1+2πin

)(1+2πik)N

= ezN

is always true, for any integer n.Failure to choose the right branches can lead to errors, for example setting n = n′ = 0

in (48) and (49) and assuming(ez)z

= ez2

gives the equation

e = e1−4π2k2

, (50)

which is obviously only valid for k = 0. There is a whole class of conundrums in the theoryof complex numbers based on this kind of example,* arguments that lead to apparentparadoxes, like (50) with k 6= 0, which are the result of choosing the wrong branch!

2. Differentiation

When f(z) is independent of z∗ a natural definition of its derivative would be

df

dz:= lim

δz→0

δf

δz= lim

δz→0

f(z + δz) − f(z)

δz

but we must be careful to check that this definition is consistent. A potential problemarises because δz can be real or imaginary, or indeed can point in any direction in thecomplex plane. The limit δz → 0 only makes sense if it is independent of this direction.To determine whether or not this is the case we write f = u + iv, δf = δu + iδv andδz = δx+ iδv so

δf

δz=

δu+ iδv

δx+ iδy.

Then, when δy = 0, we have

limδz→0

δf

δz= lim

δx→0

(δu+ iδv

δx

)=

∂u

∂x+ i

∂v

∂x

and, when δx = 0, we have

limδz→0

δf

δz= lim

δy→0

(δu+ iδv

iδy

)= −i

∂u

∂y+

∂v

∂y.

* This known as Clausen’s puzzle.

41

Demanding that these are the same means that their real and imaginary parts must bethe same, so

∂u

∂x=

∂v

∂y,

∂v

∂x= −∂u

∂y. (51)

These conditions are not only necessary for the derivative of f(z) to exist, they are alsosufficient, essentially because an arbitrary direction for δz is just a linear combination ofδz = x and δz = iy.

The conditions (51) that u(x, y) and v(x, y) must satisfy for dfdz

to be well defined arecalled the Cauchy-Riemann conditions.

If a function f(z) is differentiable in some neighbourhood of a point z0 then it is saidto be analytic in that neighbourhood. An alternative name, synonymous with analytic, isholomorphic (from the Greek oλoσ ‘holos’, meaning whole, and µoρφη ‘morphe’, meaningshape). If f(z) is analytic everywhere in the complex plane (except possibly for z → ∞)then it is said to be an entire function.

Examples:

i) f(z) = z2 ⇒ u(x, y) = x2 − y2, v(x, y) = 2xy

∂u

∂x= 2x =

∂v

∂y,

∂v

∂x= 2y = −∂u

∂y.

The derivatives exist everywhere except for x → ∞ and y → ∞ so z2 is an entire function.

ii) f(z) =1

z=

x− iy

x2 + y2⇒ u(x, y) =

x

x2 + y2, v(x, y) = − y

x2 + y2

∂u

∂x=

y2 − x2

(x2 + y2)2=

∂v

∂y,

∂u

∂y= − 2yx

(x2 + y2)2= −∂v

∂x.

The derivatives exist everywhere except at the origin x = y = 0, so 1/z is analytic every-where except at the origin.

Functions that depend on z∗ are not analytic

iii) f(z∗) = z∗ = x− iy ⇒ u(x, y) = x, v(x, y) = −y

∂u

∂x= 1,

∂u

∂y= 0,

∂v

∂x= 0,

∂v

∂y= −1,

so z∗ does not satisfy the Cauchy-Riemann conditions.

iii) f(z, z∗) = zz∗ ⇒ u(x, y) = x2 + y2, v(x, y) = 0

42

∂u

∂x= 2x,

∂u

∂y= 2y,

∂v

∂x=

∂v

∂y= 0,

and again the Cauchy-Riemann conditions are not satisfied.

Analytic functions are related to the 2-dimensional Laplace equation:

∂2u

∂x2=

∂2v

∂x∂y=

∂2v

∂y∂x= −∂2u

∂y2⇒ ∂2u

∂x2+

∂2u

∂y2= 0

∂2v

∂x2= − ∂2u

∂x∂y= − ∂2u

∂y∂x= −∂2v

∂y2⇒ ∂2v

∂x2+

∂2v

∂y2= 0,

so both u(x, y) and v(x, y) satisfy the 2-dimensional Laplace equation.

3. IntegrationAn analytic function f(z) can be integrated along a curve connecting two fixed points,

z0 and z′0, by splitting the curve up into a large number of small segments with endpointsz0 < z1 < z2 < · · · < zN−1 < zN = z′0 and then letting N → ∞. Let zj−1 < ζj < zj befixed points in the j-th segment, as shown below,

z 0

z 1

z2

zN-1

z Nz’0

ζ2

ζΝ−1

ζΝ

ζ1

=

. ..

then the integral can be defined as

∫ z′

0

z0

f(z)dz = limN→∞

N∑

j=1

f(ζj)(zj − zj−1),

and is independent of the choices for ζj in the limit.

Cauchy’s Integral Theorem

If f(z) is analytic within and on a closed curve C, then

∮

C

f(z)dz = 0.

43

The symbol∮here indicates integration around a closed loop and a closed curve C is often

called a contour in complex integration.

Proof: first split f(z) and z up into real and imaginary parts,

∮

C

f(z)dz =

∮

C

(u+ iv)(dx+ iy)

=

∮

C

(udx− vdy) + i

∮

C

(vdx+ udy).

(52)

Now apply Stokes’ theorem which states that, for any vector field V = Vxx+ Vy y with xand y-components Vx and Vy,

∮

C

(Vxdx+ Vydy

)=

∫

S

(∂Vy

∂x− ∂Vx

∂y

)dxdy,

where S is the 2-dimensional region bounded by the curve C. Applying this to the realpart of (52), with Vx = u and Vy = −v gives

∮

C

(udx− vdy) =

∫

S

(−∂v

∂x− ∂u

∂y

)dxdy,

which vanishes from the Cauchy-Riemann conditions (51). Similarly applying Stokes’theorem to the imaginary part of (52), with Vx = v and Vy = u gives

∮

C

(vdx+ udy) =

∫

S

(∂u

∂x− ∂v

∂y

)dxdy,

which again vanishes from the Cauchy-Riemann conditions.This shows that both the real and imaginary parts of

∮Cf(z)dz vanish if f(z) satisfies

the Cauchy-Riemann conditions in the region S bounded by the curve C.An immediate consequence of Cauchy’s integral theorem is that, if a function is an-

alytic in some region containing two points z0 and z′0, then the integral along a curveconnecting z0 and z′0 is independent of the curve chosen, provided it does not does notstray out of the region in which f(z) is analytic.* This is because Cauchy’s integral theo-rem shows that integrating from z0 to z′0 along one curve and then integrating back fromz′0 to z0 along a different curve gives opposite answers. Combining the two integrals intoa single integral along a closed curve passing through z0 and z′0 is then zero.

Cauchy’s Integral Formula

* If the real and imaginary parts f(z) are thought of as potential energies for 2-dimensional problems in mechanics then this is analogous to the concept of conservedforces.

44

If f(z) is analytic within and on a closed curve C, then

∮

C

f(z)

(z − z0)dz = 2πif(z0)

where z0 is an arbitrary point inside C.

There is a convention here that the integral is performed in an anti-clockwise directionaround the curve C — be careful of this: failure to adhere to this convention can introduceminus signs into some of the formulae!

z

z0

C

Proof: although, by assumption, f(z) is analytic within C, f(z)/(z − z0) is not analyticat z = z0. However, if we choose a different curve C′ as shown below, which excludesz0, then f(z)/(z − z0) is analytic within and on C′. Hence

∮C′

f(z)dz = 0 from Cauchy’sintegral formula.

z

2z

1z4z

z0

3z

C’

The curve C′ can be decomposed as

C′ = C12 + C23 + C34 + C41

and then we have∮

C′

f(z)

(z − z0)dz =

∫

C12

f(z)

(z − z0)dz +

∫

C23

f(z)

(z − z0)dz +

∫

C34

f(z)

(z − z0)dz +

∫

C41

f(z)

(z − z0)dz

= 0,

45

where∫C12

f(z)(z−z0)

dz =∫ z2z1

f(z)(z−z0)

dz along the segment C12, etc. In the limit z4 → z1 and

z3 → z2, the segments C12 and C34 coincide, but the integrals are performed in oppositedirections, so ∫

C12

f(z)

(z − z0)dz = −

∫

C34

f(z)

(z − z0)dz, (53)

leaving ∮

C41

f(z)

(z − z0)dz = −

∮

C23

f(z)

(z − z0)dz, (54)

where limz4→z1 C41 = C and C23 is now also a closed curve since we have set z2=z3. Nowwe choose C23 to be a small circle of radius ǫ centred on z0 and parameterise the pointson it by an angle φ, where

z = z0 + ǫ eiφ so dz = iǫ eiφdφ,

and φ increases in an anti-clockwise direction. We must be careful of signs because thecontour C23 is defined above as being traversed in the clockwise direction, so this introducesa minus sign. ∮

C23

f(z)

(z − z0)dz = −

∫ 2π

0

f(z0 + ǫ eiφ)

ǫ eiφ(iǫ eiφ

)dφ

= −i

∫ 2π

0

f(z0 + ǫ eiφ)dφ

−→ǫ→0

− i

∫ 2π

0

f(z0)dφ

= −2πif(z0).

where the limit ǫ → 0 has been taken in the third step. Using this in (54), with C41 = C,now gives Cauchy’s integral formula:

∮

C

f(z)

(z − z0)dz = 2πif(z0).

Differentiating Cauchy’s integral formula with respect to z0 gives us

f ′(z0) =1

2πi

∮

C

f(z)

(z − z0)2dz.

More generally, differentiating n times we get

f (n)(z0) =n!

2πi

∮

C

f(z)

(z − z0)n+1dz, (55)

where f (n)(z0) :=dn

dzn0

f(z0).

46

4. Laurent Expansions and Complex Analyticity

A Laurent expansion of a complex function is a generalisation of Taylor expansions,familiar from real analysis and we shall start by describing Taylor expansions for complexfunctions.

Taylor ExpansionsSuppose a function f(z) is analytic in some region containing a point z0 and let z1 be thenearest point to z0 at which f(z) is non-analytic (z1 might be at infinity, but it could alsobe a finite distance from z0, as in the picture below). Let C be a contour which enclosesz0 but does not extend as far out as z1.

z1

z0

z −z1 0

z’−z0

z’

C

Then, from Cauchy’s integral formula, for any other point z inside C

f(z) =1

2πi

∮

C

f(z′)dz′

(z′ − z)

=1

2πi

∮

C

f(z′)dz′

(z′ − z0)[1−

(z−z0z′−z0

)] .(56)

Now[1−

(z−z0z′−z0

)]−1

can be expanded in the usual way. Let w = z−z0z′−z0

and SN =∑N

n=0 wn. Then

(1− w)SN =

N∑

n=0

wn −N+1∑

n=1

wn = 1− wN+1.

Writing w = ρ eiφ we have ρN+1 → 0 as N → ∞, provided ρ < 1, hence wN+1 → 0 asN → ∞ provided |w| < 1. So we see that

(1− w)S∞ = 1 ⇒ 1

1− w=

∞∑

n=0

wn.

47

This formula should be familiar for real w but it is also valid for complex w, if |w| < 1. Ifthe points z and z0 are chosen so that |z − z0| < |z′ − z0| < 1 for every point z′ on thecontour C, then

1

1−[z−z0z′−z0

] =

∞∑

n=0

(z − z0z′ − z0

)n

and this can be used in (56) to give

f(z) =1

2πi

∞∑

n=0

(z − z0)n

∮

C

f(z′)dz′

(z′ − z0)n+1.

Now, using (55), this can be re-expressed as

f(z) =

∞∑

n=0

1

n!(z − z0)

nf (n)(z0). (57)

This is exactly the same as the usual formula for the Taylor expansion for a real function.For complex functions we see that the formula works provided |z − z0| < |z′ − z0|, wherez′ lies on the contour used in (55). The only requirement that this contour has to satisfyis that f(z) is analytic within and on C, so we may as well take C as large as we can.We can therefore take C to be a circle centred on z0 and just inside z1, the nearest pointof non-analyticity of f(z) to z0. The Taylor expansion (57) works for any point z that iscloser to z0 than z1.

Analytic ContinuationNon-analytic points are extremely important. They are obstructions to extending

series expansions of f(z) beyond certain limits. For example f(z) = 11+z

is non-analyticat z = −1. Taylor expanding about z0 = 0,

1

1 + z= 1− z + z2 − z3 + · · · =

∞∑

n=0

(−1)nzn,

is valid provided |z| < 1. The unit circle centred on the origin, |z| = 1, is called thecircle of convergence for this expansion. The radius of this circle, unity in this case, iscalled the radius of convergence. Strictly speaking 1

1+z and∑∞

n=0(−1)nzn are differentfunctions: they co-incide for |z| < 1 they but are not the same for |z| > 1, since the formeris perfectly well behaved for |z| > 1 while the latter is not even defined in this region as thesum diverges. We might get a different radius of convergence if we Taylor expand about adifferent point though. For example expanding about z0 = i, we have

f(z) =1

1 + z=

1

(1 + i)

1[1 +

(z−i1+i

)]

=1

(1 + i)

∞∑

n=0

(−1)n(z − i

1 + i

)n

,

48

and this expansion converges provided∣∣∣ z−i1+i

∣∣∣ < 1, i.e. provided |z − i| < |1 + i| =√2.

This circle of convergence has radius√2 and is centred at z0 = i: it has a greater radius

of convergence than the expansion about z0 = 0 and the two expansions are valid indifferent regions. This does not mean that this expansion is any better or worse than theprevious one, they are just valid in different regions as shown in the following picture. Theexpansion about z0 = 0 is valid inside the solid circle and the expansion about z0 = iinside the dashed circle, both expansions are valid in the region where the circles overlap.

1-1

i

Thus the sum∑∞

n=0(−1)nzn is not defined outside |z| < 1, but can be continued into thecrescent shaped region above the semi-circular arch between 1 and −1 by expanding about

z0 = i to give 11+i

∑∞n=0(−1)n

(z−i1+i

)n. In this manner the region in which the expansion is

valid can be continued to different areas of the complex plane. By choosing many differentpoints to expand about we can build up a patchwork covering the whole complex plane,excluding the point of non-analyticity z = −1, giving a collection of infinite sums, one foreach expansion point, all of which co-incide with the function 1

1+z in the regions in whichthey converge. This process is called analytic continuation of the original expansion,∑∞

n=0(−1)nzn, outside of the region |z| < 1.Analytic continuation is very important in complex analysis since many functions can

be defined through a series expansion, e.g.

ez = 1 + z +1

2!z2 +

1

3!z3 + · · · =

∞∑

n=0

1

n!zn.

This expansion actually converges for all |z| < ∞, it has an infinite radius of convergenceand is an example of an entire function.

Laurent ExpansionsLet f(z) be analytic at z0 and let z1 be the nearest point of non-analyticity. Suppose

there are no other points of non-analyticity out as far as another point z2. Let z be a pointbetween z1 and z2, so

|z1 − z0| < |z − z0| < |z2 − z0|.

49

This means that f(z) is analytic in the region between the two curves C1 and C2 in thefollowing figure:

z1

z0

z2

C1

C2

z

We can combine C1 and C2 into one continuous curve, C′, by cutting the annularregion between them along the dotted lines indicated in the figure and including the dottedlines in the contour. The function f(z) is analytic in the region enclosed by C1, C2 andthe dotted lines, so Cauchy’s integral formula (54) implies that

f(z) =1

2πi

∮

C′

f(z′)dz′

(z′ − z)

=1

2πi

∮

C1

f(z′)dz′

(z′ − z)− 1

2πi

∮

C2

f(z′)dz′

(z′ − z). (58)

In the second equation above we have taken the limit of the two dotted line segments co-inciding, so that the integrals along them exactly cancel, and we have chosen to integratein an anti-clockwise direction along C1, which then requires a clockwise integration aroundC2 — hence the minus sign in the latter integral.

Expanded the denominators of the integrands in the same way as we did for the Taylorexpansion previously we have, for C1,

1

2πi

∮

C1

f(z′)dz′

(z′ − z)=

1

2πi

∮

C1

f(z′)dz′

(z′ − z0)[1−

(z−z0z′−z0

)]

=1

2πi

∞∑

n=0

(z − z0)n

∮

C1

f(z′)dz′

(z′ − z0)n+1. (59)

The sum converges since, by construction,∣∣∣ z−z0z′−z0

∣∣∣ < 1 for z′ on C1 (see the figure above).

While, for C2,

1

2πi

∮

C2

f(z′)dz′

(z′ − z)= − 1

2πi

∮

C2

f(z′)dz′

(z − z0)[1−

(z′−z0z−z0

)]

50

= − 1

2πi

∞∑

m=0

1

(z − z0)m+1

∮

C2

f(z′)(z′ − z0)mdz′

= − 1

2πi

∞∑

n=1

(z − z0)−n

∮

C2

f(z′)dz′

(z′ − z0)1−n, (60)

where, in the last integral, the summation has been shifted by one, n = m + 1, and the

sum converges since, by construction,∣∣∣ z

′−z0z−z0

∣∣∣ < 1 for z′ on C2. Now substitute (59) and

(60) in (58) and we get

f(z) =1

2πi

{∞∑

n=0

(z − z0)n

∮

C1

f(z′)dz′

(z′ − z0)n+1+

∞∑

n=1

(z − z0)−n

∮

C2

f(z′)dz′

(z′ − z0)−n+1

}

1

2πi

{∞∑

n=0

(z − z0)n

∮

C1

f(z′)dz′

(z′ − z0)n+1+

−∞∑

n=−1

(z − z0)n

∮

C2

f(z′)dz′

(z′ − z0)n+1

}

=∞∑

n=−∞

an(z0)(z − z0)n,

where

an(z0) =

{1

2πi

∮C1

f(z′)dz′

(z′−z0)n+1 for n ≥ 0

12πi

∮C2

f(z′)dz′

(z′−z0)n+1 for n < 0.

Now observe that the integrals involved in calculating an(z0) are completely independentof z, they depend only on z0, and f(z′)/(z′ − z0)

n+1 is analytic in the annulus between z2and z1. Due to Cauchy’s integral theorem, we can distort C1 and C2 until they lie on topof each other, giving a single curve C = C1 = C2, without changing the value of an(z0)(provided neither C1 nor C2 passes through the points of non-analyticity, z1 and z2, in theprocess). Hence we finally arrive at

f(z) =

∞∑

n=−∞

an(z0)(z − z0)n with an(z0) =

1

2πi

∮

C

f(z′)dz′

(z′ − z0)n+1, (61)

where C is any curve threading between the two points of non-analyticity nearest to z0,z1 and z2, as shown below:

z0

z 2

z 1

C

51

Equation (61) is called the Laurent expansion of f(z) about z0. Note that it differsfrom the Taylor expansion in that the summation contains negative powers of (z − z0),this is an inevitable consequence of the existence of the point of non-analyticity, z1, lyingbetween z and z0. The summation may truncate at a finite negative power if there existssome N > 0 such that a−n(z0) = 0, ∀n > N .

As a final point, before going on to give an example of a Laurent expansion, note thatnothing stops us taking the limit z1 → z0 so that z0 becomes a non-analytic point andz2 is the next closest non-analytic point to z0. This limit allows us to discuss expansionsabout non-analytic points.

Example

f(z) =1

z(1− z)

This function has two points of non-analyticity, z = 0 and z = 1. Now perform a Laurentexpansion around the point z0 = 0. The next nearest point of non-analyticity to z = 0 isz = 1, so we can use the expression for an in (61) with the curve C any loop around theorigin which has no point greater than a unit distance from the origin, i.e. |z′| < 1 for allpoints z′ on C. Equation (61) then gives

an(0) =1

2πi

∮

C

dz′

(z′)n+1z′(1− z′)

=1

2πi

∮

C

dz′

(z′)n+2(1− z′)

=1

2πi

∮

C

dz′

(z′)n+2

∞∑

m=0

(z′)m provided |z′| < 1

=1

2πi

∞∑

m=0

∮

C

(z′)m−n−2dz′.

By Cauchy’s integral theorem the integrals are independent of the curve chosen, providedC is restricted to lie between |z| = 1 and z = 0, so we can evaluate the integrals bychoosing C to be a circle of radius a < 1, centred on the origin. On this circle z′ = aeiφ

′

,so dz′ = aieiφ

′

dφ′, and

an(0) =i

2πi

∞∑

m=0

am−n−1

∫ 2π

0

ei(m−n−1)φ′

dφ′.

Now

∫ 2π

0

ei(m−n−1)φ′

dφ′ =

∫ 2π

0

cos[(m− n− 1)φ′]dφ′ + i

∫ 2π

0

sin[(m− n− 1)φ′]dφ′

= 2πδm,n+1,

52

so

an(0) =∞∑

m=0

am−n−1δm,n+1 =

{1, n ≥ −10, n < −1

independent of a.

Using this in (61) gives

f(z) =1

z+ 1 + z + z2 + z3 · · · =

∞∑

n=−1

zn.

This example has been used to illustrate the application of the integral form of an(z0) in(61), but in this particular case there is a quicker way of getting the Laurent expansion.Since the function f(z) = 1

z(1−z) has two factors, 1/z and 1/(1− z), we can Taylor expand

1/(1− z) around z0 = 0 to obtain

1

1− z=

∞∑

n=0

zn,

(for |z| < 1) and then multiply by 1/z to get

f(z) =1

z

∞∑

n=0

zn =

∞∑

n=−1

zn

as before. Sometimes this can be a useful trick to obtain the Laurent expansion of afunction without having to do the integrals in (61).

5. Poles and Branch Cuts

In this section we introduce a classification of the singularities (points of non-analyticity)that a function f(z) might have. If f(z) is non-analytic at a point z0, but is analytic atall neighbouring points, then z0 is called an isolated singularity of f(z). All the non-analytic behaviour we have encountered so far (at least for functions that depend only onz and not both z and z∗) has been of this form, for example f(z) = 1/z has an isolatedsingularity at z = 0. In a Laurent expansion about an isolated singularity z0,

f(z) =

∞∑

n=−∞

an(z0)(z − z0)n,

if an(z0) = 0 for all n < −N and a−N (z0) 6= 0, with N > 0, then f(z) is said to have apole of order N at z0. A pole of order one is called a simple pole. A function whichis analytic everywhere except for isolated poles of finite order is called a meromorphicfunction (from the Greek µǫρoς ‘meros’, meaning part).

53

Examples

i) f(z) =1

z(1− z)=

∞∑

n=−1

zn

has a−1(0) = 1 and an(0) = 0 for all n < −1, so N = 1 in this case, and z = 0 is a simplepole. The other non-analytic point, at z = 1, is also a simple pole.

ii) f(z) =1

z2(1− z)=

∞∑

n=−2

zn

has a pole of order two (sometimes called a double pole) at z = 0 and a simple pole atz = 1,

iii) f(z) = e1/z =∞∑

n=0

1

n!

(1

z

)n

=0∑

n=−∞

1

|n|!zn.

In this example the Laurent series does not terminate at any finite N , but extends all theway down to N = −∞. A singularity of this form is called an essential singularity.

Branch PointsConsider the function

f(z) = zα

when α is not an integer. For example if α = 1/2 then we can use the polar decompositionz = ρeiφ to express f(z) as

f(z) = ρ1/2eiφ/2.

A little thought reveals that this is actually not a very well defined function. Supposewe start at z = 1 and take z around the unit circle centred on z = 0, z = eiφ. Writingf = eiφ/2 as a function of φ we have f(φ = 0) = 1 and as we go round from φ = 0 toφ = 2π we return to z = 1 but we find that f(φ = 2π) = eiπ = −1. Thus f = ±1 at z = 1,reflecting the usual sign ambiguity in a square root. A similar ambiguity in sign is presentfor every value of z, except z = 0, so f(z) = z1/2 really has two values for all z 6= 0 — it issaid to be multi-valued. The function z1/2 is said to have two branches and the pointwhere the two branches coincide, z = 0, is called a branch point.

We can however construct a single-valued function, fs(z), from f(z) = z1/2 at theexpense of introducing a discontinuity. We first of all demand that on the positive realaxis, with φ = 0 and z = ρ = x, fs(x) =

√x, with a plus sign. We then demand that, for

0 ≤ φ < 2π, the phase of fs(z) is φ/2. This means that, when we go all the way roundthe origin to just below the positive real axis, fs(z) has changed sign, but it jumps backto the positive sign on the positive real axis. We now have a single-valued function, butit is discontinuous (and so not differentiable) across the positive real axis. Actually we

54

could have chosen any line emanating from the origin and extending out to infinity to geta single-valued, discontinuous function in this way, but it would be a different functionfor different choices of line. We could even have chosen a curved line, it does not have tobe straight. The line chosen in order to construct a discontinuous single-valued functionassociated with a continuous multi-valued function is called a cut-line for the function.

πα2ie ρ1/2

cut−line

ρ1/2

f(z) =

f(z) =

A cut-line for the function f(z) = zα, with α /∈ Z.In the example chosen above, f(z) = z1/2, the derivative of f , f ′ = 1

2z−1/2, is singular

at the branch-point z = 0, but this need not be the case. For example f(z) = z3/2 also hasa branch-point at z = 0 but its first derivative is perfectly finite there (though its secondderivative is singular at z = 0).

6. Louiville’s Theorem

A function that is everywhere finite and analytic is a constant.

Proof: By assumption f(z) is everywhere finite, so ∃K such that |f(z)| < K, ∀z. Cauchy’sintegral formula then implies that, for any two points z1 and z2,

f(z1)− f(z2) =1

2πi

∮

C

{1

(z′ − z1)− 1

(z′ − z2)

}f(z′)dz′

with C any closed curve encircling both z1 and z2. From this we deduce that

|f(z1)− f(z2)| <K

2π

∣∣∣∣∮

C

{1

(z′ − z1)− 1

(z′ − z2)

}dz′∣∣∣∣ .

We are free to chose C to be a circle of radius R centred on z1 large enough to enclose z2.Parameterise this circle by z′ = z1 +Reiφ

′

, so that dz′ = Rieiφ′

dφ′, and let R > 2|z1 − z2|.Then∣∣∣∣∮

C

{1

(z′ − z1)− 1

(z′ − z2)

}dz′∣∣∣∣ =

∣∣∣∣∮

C

(z1 − z2)

(z′ − z1)(z′ − z2)dz′∣∣∣∣

≤∣∣∣∣∫ 2π

0

(z1 − z2)

(z′ − z2)dφ′

∣∣∣∣ since |z′ − z1| = R

≤∫ 2π

0

|z1 − z2||z′ − z2|

dφ′, since

∣∣∣∣∫

f(z′)dz′∣∣∣∣ ≤

∫|f(z′)dz′|.

55

Choosing R > 2|z2 − z1| implies that |z′ − z2| > R/2, so

|f(z1)− f(z2)| <K

2π

|z1 − z2|(R/2)

∫ 2π

0

dφ′

=2K

R|z2 − z1|

−→R→∞

0.

Since |f(z1)− f(z2)| cannot have any explicit dependence on R we conclude that

|f(z1)− f(z2)| = 0 ⇒ f(z1) = f(z2) ∀z1, z2.

7. Calculus of Residues

Cauchy’s integral formula is extremely useful for calculating definite integrals. Con-sider the Laurent expansion of a function f(z) around an isolated singular point z0,

f(z) =∞∑

n=−∞

an(z0)(z − z0)n.

Take any contour C encircling z0, with no other singularities inside C, and let z be anypoint on C, then

∮

C

an(z0)(z − z0)ndz = an(z0)

(z − z0)n+1

(n+ 1)

∣∣∣∣z

z

= 0, provided n 6= −1.

The case n = −1 must be treated carefully. Since there are no singularities other than z0inside C, the contour can always be distorted to a circle of radius r centered on z0 withoutchanging the value of the integral around C. So

∮

C

a−1(z0)dz

(z − z0)= a−1(z0)

∫ 2π

0

rieiφdφ

reiφ= 2πia−1(z0),

where the circular contour is parameterised by z = z0 + reiφ. This means that

1

2πi

∮

C

f(z)dz = a−1(z0). (62)

This is a very powerful result, it means that we only need to know a−1(z0) in order tocalculate the integral — none of the other an’s is relevant! Note that this is true for anycontour C, provided only that that there are no singularities inside C other than z0. Theco-efficient a−1(z0) is called the residue because, like the grin of the Cheshire cat, it is allthat remains of f(z) after the integral is performed.

56

If C encloses a finite set of isolated singularities, say k + 1 of them, like this

z

z0 z1

zk

. . .

C

then we can still do the integral. First deform C to C′ as shown below,

z

z0 z1

zk

C0 C 1

C k

. . .

C’

Because C′ does not enclose any singularities we have

∮

C′

f(z)dz = 0,

by Cauchy’s integral theorem. Now, when the straight line segments are taken infinitesi-mally close to one another, the integrals along opposing edges cancel and the the perimeterbecomes C, as in the proof of Cauchy’s integral formula given earlier. So

∮

C′

f(z)dz =

∮

C

f(z)dz +

∮

C0

f(z)dz +

∮

C1

f(z)dz + · · ·∮

Ck

f(z)dz = 0.

This can be re-written as

∮

C

f(z)dz = −k∑

j=0

∮

Cj

f(z)dz. (63)

Now each little counter Cj encloses only one singularity, so we can use the result (62),taking note of the fact that the Cj above are being traversed in a clockwise direction and(62) was derived by integrating around C in an anti-clockwise direction, to deduce that

∮

Cj

f(z)dz = −2πia−1(zj).

Equation (63) now reads

∮

C

f(z)dz = 2πi{a−1(z0) + a−1(z1) + · · ·+ a−1(zk)

},

57

leading to

The Residue Theorem:

∮

C

f(z)dz = 2πik∑

j=0

a−1(zj).

As an example of the application of the residue theorem consider the definite integral

∫ ∞

−∞

dx

1 + x2.

Treating the real line −∞ < x < ∞ as the horizontal axis in the complex z-plane, withz = x+ iy, we can extend this to ∫

C

dz

1 + z2

where C is some contour that includes the real line. For example we can take C to bea large semi-circle above the real axis, centred on z = 0 and radius R, with its diameterlying along the real axis. Then, on the semi-circular arch, z = Reiφ and dz = Rieiφdφwith 0 ≤ φ ≤ π, so

∫

C

dz

1 + z2= Ri

∫ π

0

eiφdφ

1 +R2e2iφ+

∫ R

−R

dx

1 + x2.

As R → ∞ the first integral on the right-hand side vanishes so, taking the semi-circle tobe of infinite radius, ∫

C

dz

1 + z2=

∫ ∞

−∞

dx

1 + x2.

Now we can use the residue theorem to evaluate the left-hand side. The integrand

1

1 + z2=

1

(1 + iz)

1

(1− iz)=

1

(z + i)

1

(z − i)

has two simple poles, one at z = i and one at z = −i. The pole at z = −i lies below thereal axis and is outside the contour C, but the one at z = i is inside C, and is the onlypoint of non-analyticity inside C. Expanding about z0 = i

1

z + i=

1

2i

1[1 +

(z−i2i

)] =1

2i

∞∑

n=0

(−1)n(z − i

2i

)n

so1

1 + z2= −1

4

∞∑

n=−1

(−1)n+1

(z − i

2i

)n

.

58

The residue of the simple pole at z0 = i is the n = −1 term:

a−1(z0 = i) =1

2i.

There is only one pole, and hence only one residue, within C so

∫

C

dz

1 + z2= 2πia−1(z0 = i) = π.

Thus we have used the residue theorem to evaluate

∫ ∞

−∞

dx

1 + x2= π.

Evaluating definite integrals using the residue theorem in this way is called the calculusof residues.

8. Fourier Transforms

You are familiar with Fourier series for a function on an interval −T/2 < t < T/2.

f(t) =

∞∑

n=0

An cos

(2πnt

T

)+

∞∑

n=1

Bn sin

(2πnt

T

).

where the constants An and Bn are determined, using orthogonality of the trigonometricfunctions for positive integers n and n′

∫ π

−π

cos(nθ) cos(n′θ)dθ = πδnn′

∫ π

−π

sin(nθ) sin(n′θ)dθ = πδnn′

∫ π

−π

sin(nθ) cos(n′θ)dθ = 0,

to be

An =2

T

∫ T/2

−T/2

f(t) cos

(2πnt

T

)dt, Bn =

2

T

∫ T/2

−T/2

f(t) sin

(2πnt

T

)dt

for positive n, while

A0 =1

T

∫ T/2

−T/2

f(t)dt.

In the Fourier series nπT is a frequency,

ωn =nπ

T,

59

and for large T the ωn are close together for successive n, approaching a continuous variableω as T → ∞. As T → ∞, the sums over n go over to Riemann integrals over angularfrequency, ω, with infinitesimal frequency interval

dω =2π

T.

If∫∞∞ f(t) cos(ωn)dt and

∫∞∞ f(t) sin(ωn)dt are finite An and Bn will vanish as T → ∞ so

we define

a(ωn) :=T

2πAn =

1

π

∫ T/2

−T/2

f(t) cos(ωnt)dt −→T→∞

a(ω) =1

π

∫ ∞

−∞f(t) cos(ωt)dt

b(ωn) :=T

2πBn =

1

π

∫ T/2

−T/2

f(t) sin(ωnt)dt −→T→∞

b(ω) =1

π

∫ ∞

−∞f(t) sin(ωt)dt,

then

f(t) =

∫ ∞

0

(a(ω) cos(ωt) + b(ω) sin(ωt)

)dω.

It is conventional to define the cosine transform fc(ω) and the sine transform fs(ω)of f(t) as

fc(ω) :=

√2

π

∫ ∞

0

f(t) cos(ωt)dt

fs(ω) :=

√2

π

∫ ∞

0

f(t) sin(ωt)dt.

These integrals certainly exist if∫∞0

|f(t)|dt exists and is finite.

If f(−t) = f(t) is an even function then fc(ω) =√

π2a(ω) and if f(−t) = f(t) is an

odd function then fs(ω) =√

π2 b(ω).

Another type of integral transform that is very useful in physics when periodic phe-nomena are under consideration is is the Fourier transform,

f(ω) =

√1

2π

∫ ∞

−∞f(t)eiωtdt. (64)

Specifying f(ω) is completely equivalent to specifying f(t) because

f(t) =

√1

2π

∫ ∞

−∞f(ω)e−iωtdω, (65)

as we shall now show.Using the definition of f(ω) in the right hand side of (65) gives

√1

2π

∫ ∞

−∞f(ω)e−iωtdω =

1

2π

∫ ∞

−∞

(∫ ∞

−∞f(t′)eiωt′dt′

)e−iωtdω

=1

2π

∫ ∞

−∞

(∫ ∞

−∞eiω(t′−t)dω

)f(t′)dt′. (66)

60

We now argue 12π

∫∞−∞ eiω(t′−t)dω can be interpreted as an integral representation of

the Dirac delta-function. First suppose that t 6= t′ in the integral. The integral can becomputed by using complex integration in the complex ω-plane. Consider t < t′ andt > t′ separately. When t < t′ eiω(t′−t) is complex analytic everywhere in the upper-halfcomplex-ω plane, Im(ω) > 0, so we choose the contour C+ below — a large semi-circlein the upper-half plane centred on the origin. As the radius of the semi-circle becomesinfinite

1

2π

∫

C+

eiω(t′−t)dω → 1

2π

∫ ∞

−∞eiω(t′−t)dω = 0

from Cauchy’s integral formula. When t > t′ use the contour C− below and the samereasoning gives

1

2π

∫

C−

eiω(t′−t)dω → 1

2π

∫ ∞

−∞eiω(t′−t)dω = 0.

So 12π

∫∞−∞ eiω(t′−t)dω = 0 when t 6= t′.

C

C

−

+

When t = t′ the integral is divergent. Consider

∫ T

−T

(∫ ∞

−∞eiωτdω

)dτ =

∫ ∞

−∞

[eiωτ

iω

]T

−T

dω = 2

∫ ∞

−∞

sin(ωT )

ωdω = 4

∫ ∞

0

sin(ωT )

ωdω = 2π,

which is finite ∀T , in particular for T → ∞∫ ∞

−∞

(∫ ∞

−∞eiωτdω

)dτ = 2π.

We have shown that∫∞−∞ eiωτdω is zero for τ 6= 0, infinite for τ = 0 and its integral is finite

and equal to 2π. From the defining properties of the Dirac delta-function we conclude that

δ(t− t′) =1

2π

∫ ∞

−∞eiω(t−t′)dω.

Using this in (66)

61

√1

2π

∫ ∞

−∞f(ω)e−iωtdω =

1

2π

∫ ∞

−∞

(∫ ∞

−∞eiω(t′−t)dω

)f(t′)dt′

=

∫ ∞

−∞δ(t′ − t)f(t′)dt′ = f(t)

as claimed.To summarise the Fourier transform and the inverse transform are∗

f(ω) =

√1

2π

∫ ∞

−∞f(t)eiωtdt, f(t) =

√1

2π

∫ ∞

−∞f(ω)e−iωtdω.

The Fourier, sine and cosine transforms are related. If f(t) is an even function theimaginary part of f(ω) vanishes and

fc(ω) = f(ω),

while if If f(t) is an odd function the f(ω) is pure imaginary and

fs(ω) = −if(ω).

These integrals are examples of a class of functions called integral transforms where,given a function f(t), we construct a new function f(ω) as an integral

f(ω) :=

∫ ∞

−∞f(t)K(ω, t)dt

where K(ω, t) is called the kernel of the integral transform.

∗ Note: some texts use a slightly different convention for the definition of the Fouriertransform, namely

f(ω) =

∫ ∞

−∞f(t)eiωtdt, f(t) =

1

2π

∫ ∞

−∞f(ω)e−iωtdω.

This is more natural for physics applications, where ν = 2πω with ν a frequency in Hertz,hence dν = dω

2π . This convention is used in the particle physics module MP466.

62

mp469: diﬀerential equations and complex analysis · 2014. 9. 16. · mp469: diﬀerential...

Documents