Time-Independent Perturbation Theory - University of Michigan, sunkai/teaching/winter_2016/...

2. Time-Independent Perturbation Theory

2.1. Overview

2.1.1. General question

Assume that we have a Hamiltonian,

$$H = H_0 + \lambda H' \tag{2.1}$$

where $\lambda$ is a very small real number. The eigenstates of $H$ should not be very different from the eigenstates of $H_0$. If we already know all eigenstates of $H_0$, can we get the eigenstates of $H$ approximately?

Bottom line: we are studying an approximate method.

2.1.2. Why perturbation theory?

Why do we need to study this approximation method, considering that numerical methods can compute the eigenstates very efficiently and accurately for any Hamiltonian that we consider in this course?

Reason number I: It is part of the history (QM was born before electronic computers became a powerful tool in scientific research).

Reason number II: It reveals universal principles, which are very important and cannot be obtained from numerical simulations alone.

Reason number III: The idea of perturbation theory has a very deep and broad impact in many branches of physics. Perturbation theory is in many cases the only theoretical technique that we have to handle various complex systems (quantum and classical). Example: in quantum field theory (which is in fact a nonlinear generalization of QM), most of the effort goes into developing new ways to do perturbation theory (loop expansions, 1/N expansions, 4-ϵ expansions).

2.1.3. Assumptions

Assumption #1: we know all eigenstates of $H_0$, as well as their corresponding eigenenergies

$$H_0 |\psi_n^0\rangle = E_n^0 |\psi_n^0\rangle \tag{2.2}$$

Assumption #2: we know the perturbation $H'$. What do we mean by knowing $H'$? Here, we mean that we can write down $H'$ using the complete basis of $|\psi_n^0\rangle$, i.e., we know the value of $\langle\psi_n^0|H'|\psi_m^0\rangle$ for any $m$ and $n$.

Assumption #3: we only consider quantum states with discrete eigenenergies.

In general, the energy spectrum of a quantum system (i.e. all eigenvalues of the Hamiltonian) falls into one of the following three general possibilities:

◼ A discrete spectrum: eigenenergies can only take certain discrete values (example: infinitely deep potential wells, e.g. the harmonic potential, $E_n = (n + 1/2)\hbar\omega$)

4 Phys460.nb

◼ A continuous spectrum: eigenenergies can take any (real) value in a certain allowed range (example: a constant potential $V$. Here, any $E \ge V$ is an eigenenergy)

◼ A mixed spectrum: some parts of the spectrum are continuous, while other parts have discrete eigenenergies (example: a finite potential well. Here, we may have some discrete states inside the well. But for $E$ above the top of the potential well, we have a continuous spectrum).

Q: Consider the energy spectrum of an attractive Coulomb (1 /r) potential. Is it discrete, continuous or mixed?

A: It is mixed. When we consider an attractive Coulomb potential, we mostly focus on the negative energy states ($E < 0$). This part of the spectrum is discrete, as we all know very well from the study of the hydrogen atom. But if we look at states with positive energies, there is a continuous spectrum for $E > 0$. For $E > 0$, the system is NOT in a bound state, i.e. the proton and the electron don’t form an atom. In other words, we have a high probability of finding the proton and the electron separated far from each other. There, the attractive potential is very small and negligible, so we have two free particles and only need to consider their kinetic energies. For free particles, we know that any positive energy is an allowed eigenenergy (i.e. we have a continuous spectrum for $E > 0$).

Bottom line: in this chapter, our perturbation theory only considers a discrete spectrum or the discrete part of a mixed spectrum.

Another version of assumption #3: we only consider confined states. (In QM, in most cases, confined states = discrete energy and unconfined states = continuous energy.)

Comment: In QM, we only study discrete states in perturbation theory. But this is NOT true for other branches of physics. For example, in quantum field theory, perturbation theory is applied to continuous spectra.

2.2. Non-degenerate Perturbation Theory

2.2.1. Assumptions

Key assumption: we consider a specific state $|\psi_n^0\rangle$. Here, we assume that $|E_n^0 - E_m^0|$ is much larger than the matrix elements of $\lambda H'$ for any other eigenstate $|\psi_m^0\rangle$.

2.2.2. Preparation #1 wavefunctions

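Since the eigenstates of $H_0$ form a complete basis, we can write down any quantum state as a linear superposition of the $|\psi_m^0\rangle$

$$|\psi\rangle = \sum_m a_m |\psi_m^0\rangle \tag{2.3}$$

Now, if we consider an eigenstate of $H$, $|\tilde\psi_n\rangle$, it can also be written in a similar fashion

$$|\tilde\psi_n\rangle = \sum_m a_m |\psi_m^0\rangle \tag{2.4}$$

As discussed above, if $\lambda$ is small, an eigenstate of $H$ should be similar to an eigenstate of $H_0$. Here, we assume that $|\tilde\psi_n\rangle$ is very close to $|\psi_n^0\rangle$. This means that $a_n \approx 1$ and, for other values of $m \ne n$, $a_m \sim 0$. To highlight this, we separate the term for $|\psi_n^0\rangle$ out from the sum

$$|\tilde\psi_n\rangle = a_n |\psi_n^0\rangle + \sum_{m \ne n} a_m |\psi_m^0\rangle \tag{2.5}$$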

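It turns out that it is usually more convenient to use unnormalized eigenstates. Now, let us define unnormalized eigenstates of $H$

$$|\psi_n\rangle = \frac{1}{a_n} |\tilde\psi_n\rangle = |\psi_n^0\rangle + \sum_{m \ne n} \frac{a_m}{a_n} |\psi_m^0\rangle \tag{2.6}$$

For simplicity, we will now call $a_m / a_n = c_m$

$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m \ne n} c_m |\psi_m^0\rangle \tag{2.7}$$

Because $a_m \sim 0$ and $a_n \sim 1$, we know that $c_m \sim 0$ for small $\lambda$.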

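Comment #1: This state is NOT normalized

$$\langle\psi_n|\psi_n\rangle = 1 + \sum_{m \ne n} |c_m|^2 \ge 1 \tag{2.8}$$

But we can easily normalize it, if we want to

$$|\tilde\psi_n\rangle = \frac{1}{\sqrt{1 + \sum_{m \ne n} |c_m|^2}}\, |\psi_n\rangle \tag{2.9}$$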

Comment #2: (almost) any quantum state can be written in the form of Eq. (2.7). This is because the $|\psi_m^0\rangle$ form a complete basis.

Q: what does the word “almost” mean here?

A: If a state is orthogonal to $|\psi_n^0\rangle$, we cannot write the state in the form of Eq. (2.7), because the coefficient of $|\psi_n^0\rangle$ there is fixed to 1. But we don’t need to worry about it here, because we are doing perturbation theory and we know that the eigenstate of $H$ is close to an eigenstate of $H_0$. So it cannot be orthogonal to $|\psi_n^0\rangle$.

Bottom line: we are not making any assumptions or approximations here. It is just a new way to write down eigenstates of $H$.

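Comment #3: the $c_m$ are functions of $\lambda$, i.e. $c_m(\lambda)$. For small $\lambda$, we can use the Taylor series:

$$c_m = c_m^{(1)} \lambda + c_m^{(2)} \lambda^2 + c_m^{(3)} \lambda^3 + \ldots \tag{2.10}$$

Here, the Taylor series doesn’t contain the 0th order term of $\lambda$ (i.e. the constant term). This is because when $\lambda = 0$, $|\psi_n\rangle = |\psi_n^0\rangle$, and thus $c_m(\lambda) = 0$ at $\lambda = 0$.

As a result,

$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m \ne n} c_m(\lambda) |\psi_m^0\rangle = |\psi_n^0\rangle + \sum_{m \ne n} \sum_{k=1}^{\infty} c_m^{(k)} \lambda^k |\psi_m^0\rangle = |\psi_n^0\rangle + \sum_{k=1}^{\infty} \lambda^k \sum_{m \ne n} c_m^{(k)} |\psi_m^0\rangle \tag{2.11}$$

If we define

$$|\psi_n^k\rangle = \sum_{m \ne n} c_m^{(k)} |\psi_m^0\rangle \tag{2.12}$$

we get

$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda |\psi_n^1\rangle + \lambda^2 |\psi_n^2\rangle + \ldots \tag{2.13}$$

This is Eq. [6.5] in the textbook.

Important: $|\psi_n^1\rangle, |\psi_n^2\rangle, \ldots$ don’t contain $|\psi_n^0\rangle$. In other words, all corrections are orthogonal to $|\psi_n^0\rangle$.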

2.2.3. Preparation #2 eigenenergies

Eigenenergies of $H$ are also functions of $\lambda$, and for small $\lambda$, we can use the Taylor series:

$$E_n(\lambda) = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots \tag{2.14}$$

This is Eq. [6.6] in the textbook.

2.2.4. Schrodinger Equation in the perturbation theory

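$$H |\psi_n\rangle = E_n |\psi_n\rangle \tag{2.15}$$

$$(H_0 + \lambda H') \left( |\psi_n^0\rangle + \lambda |\psi_n^1\rangle + \lambda^2 |\psi_n^2\rangle + \ldots \right) = \left( E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots \right) \left( |\psi_n^0\rangle + \lambda |\psi_n^1\rangle + \lambda^2 |\psi_n^2\rangle + \ldots \right) \tag{2.16}$$

$$\begin{aligned} & H_0 |\psi_n^0\rangle + \lambda \left( H_0 |\psi_n^1\rangle + H' |\psi_n^0\rangle \right) + \lambda^2 \left( H_0 |\psi_n^2\rangle + H' |\psi_n^1\rangle \right) + \ldots \\ &\quad = E_n^0 |\psi_n^0\rangle + \lambda \left( E_n^0 |\psi_n^1\rangle + E_n^1 |\psi_n^0\rangle \right) + \lambda^2 \left( E_n^0 |\psi_n^2\rangle + E_n^1 |\psi_n^1\rangle + E_n^2 |\psi_n^0\rangle \right) + \ldots \end{aligned} \tag{2.17}$$

In perturbation theory, we need to compute two sets of quantities: (1) energy corrections at each order, $E_n^1, E_n^2, \ldots$, and (2) wavefunction corrections at each order, $|\psi_n^1\rangle, |\psi_n^2\rangle, |\psi_n^3\rangle, \ldots$. It turns out that these two sets of quantities are entangled together and we need to compute both of them. At each order, we first compute the energy correction, and then the wavefunction correction.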

2.2.5. Zeroth order

The leading-order terms in the equation are of order $\lambda^0$ (constant)

$$H_0 |\psi_n^0\rangle = E_n^0 |\psi_n^0\rangle \tag{2.18}$$

This is identical to the case of $\lambda = 0$, i.e. the unperturbed system.

2.2.6. First order

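To the order of $\lambda$, we have

$$H_0 |\psi_n^1\rangle + H' |\psi_n^0\rangle = E_n^0 |\psi_n^1\rangle + E_n^1 |\psi_n^0\rangle \tag{2.19}$$

Here, we first compute the energy correction $E_n^1$. This is done by multiplying both sides from the left by $\langle\psi_n^0|$

$$\langle\psi_n^0| H_0 |\psi_n^1\rangle + \langle\psi_n^0| H' |\psi_n^0\rangle = \langle\psi_n^0| E_n^0 |\psi_n^1\rangle + \langle\psi_n^0| E_n^1 |\psi_n^0\rangle \tag{2.20}$$

For the first term on the l.h.s., we use the fact that

$$\langle\psi_n^0| H_0 = \langle\psi_n^0| E_n^0 \tag{2.21}$$

For the last term on the r.h.s., we use the fact that $E_n^1$ is a number (not a quantum operator), and thus $\langle\psi_n^0| E_n^1 |\psi_n^0\rangle = E_n^1 \langle\psi_n^0|\psi_n^0\rangle = E_n^1$

$$\langle\psi_n^0| E_n^0 |\psi_n^1\rangle + \langle\psi_n^0| H' |\psi_n^0\rangle = \langle\psi_n^0| E_n^0 |\psi_n^1\rangle + E_n^1 \tag{2.22}$$

$$\langle\psi_n^0| H' |\psi_n^0\rangle = E_n^1 \tag{2.23}$$

The first order correction in energy is the expectation value of $H'$.

$$E_n = E_n^0 + \lambda \langle\psi_n^0| H' |\psi_n^0\rangle + O(\lambda^2) = \langle\psi_n^0| H_0 |\psi_n^0\rangle + \langle\psi_n^0| \lambda H' |\psi_n^0\rangle + O(\lambda^2) = \langle\psi_n^0| H_0 + \lambda H' |\psi_n^0\rangle + O(\lambda^2) = \langle\psi_n^0| H |\psi_n^0\rangle + O(\lambda^2) \tag{2.24}$$

Bottom line: to first order (i.e. up to corrections of order $\lambda^2$), we can compute the energy using the old wavefunction (the zeroth order wavefunction).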

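Then we compute the first order correction for the wavefunction, $|\psi_n^1\rangle$. To do that, we multiply both sides of the equation from the left by $\langle\psi_m^0|$, where $m \ne n$

$$\langle\psi_m^0| H_0 |\psi_n^1\rangle + \langle\psi_m^0| H' |\psi_n^0\rangle = \langle\psi_m^0| E_n^0 |\psi_n^1\rangle + \langle\psi_m^0| E_n^1 |\psi_n^0\rangle \tag{2.25}$$

For the first term on the l.h.s., we use the fact that

$$\langle\psi_m^0| H_0 = \langle\psi_m^0| E_m^0 \tag{2.26}$$

For the two terms on the r.h.s., we use the fact that $E_n^0$ and $E_n^1$ are both numbers (not quantum operators), so $\langle\psi_m^0| E_n^0 |\psi_n^1\rangle = E_n^0 \langle\psi_m^0|\psi_n^1\rangle$ and $\langle\psi_m^0| E_n^1 |\psi_n^0\rangle = E_n^1 \langle\psi_m^0|\psi_n^0\rangle = 0$. Here, we used the fact that when $m \ne n$, the two quantum states are orthogonal and thus $\langle\psi_m^0|\psi_n^0\rangle = 0$.

$$E_m^0 \langle\psi_m^0|\psi_n^1\rangle + \langle\psi_m^0| H' |\psi_n^0\rangle = E_n^0 \langle\psi_m^0|\psi_n^1\rangle \tag{2.27}$$

So

$$\langle\psi_m^0|\psi_n^1\rangle = \frac{\langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.28}$$

According to the definition of $|\psi_n^1\rangle$

$$|\psi_n^1\rangle = \sum_{m \ne n} c_m^{(1)} |\psi_m^0\rangle \tag{2.29}$$

we have

$$c_m^{(1)} = \langle\psi_m^0|\psi_n^1\rangle = \frac{\langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.30}$$

And therefore,

$$|\psi_n^1\rangle = \sum_{m \ne n} |\psi_m^0\rangle \frac{\langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.31}$$

So

$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda \sum_{m \ne n} |\psi_m^0\rangle \frac{\langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.32}$$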
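The first order wavefunction of Eq. (2.32) can be checked numerically. The sketch below is only an illustration: the three-level energies `E0`, the matrix `V` of elements $\langle\psi_m^0|H'|\psi_n^0\rangle$, and the value of `lam` are made-up numbers, not taken from the course. The exact eigenvector is rescaled so that its component along $|\psi_n^0\rangle$ equals 1, which is the unnormalized convention of Eq. (2.7).

```python
import numpy as np

# Hypothetical 3-level example (numbers chosen only for illustration).
E0 = np.array([0.0, 1.0, 3.0])            # unperturbed energies E_m^0
V = np.array([[0.5, 0.3, 0.1],
              [0.3, -0.2, 0.4],
              [0.1, 0.4, 0.7]])           # V[m, n] = <psi_m^0| H' |psi_n^0>
lam, n = 0.05, 0

def first_order_state(E0, V, n, lam):
    """Components of |psi_n^0> + lam * sum_{m != n} |psi_m^0> V_mn / (E_n^0 - E_m^0)."""
    psi = np.zeros(len(E0))
    psi[n] = 1.0
    for m in range(len(E0)):
        if m != n:
            psi[m] = lam * V[m, n] / (E0[n] - E0[m])
    return psi

# Exact eigenvector of H = H0 + lam * H', rescaled so that the component
# along |psi_n^0> is 1 (the unnormalized convention used in the notes).
vals, vecs = np.linalg.eigh(np.diag(E0) + lam * V)
exact = vecs[:, n] / vecs[n, n]

approx = first_order_state(E0, V, n, lam)
err_zeroth = np.max(np.abs(np.eye(len(E0))[n] - exact))  # error of |psi_n^0> alone
err_first = np.max(np.abs(approx - exact))               # error after adding lam * |psi_n^1>
print(err_zeroth, err_first)
```

With a small `lam`, the error should drop from order $\lambda$ to order $\lambda^2$ once the first order correction is included.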

2.2.7. Second order

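$$H_0 |\psi_n^2\rangle + H' |\psi_n^1\rangle = E_n^0 |\psi_n^2\rangle + E_n^1 |\psi_n^1\rangle + E_n^2 |\psi_n^0\rangle \tag{2.33}$$

Here, we first compute the energy correction $E_n^2$. This is done by multiplying both sides from the left by $\langle\psi_n^0|$

$$\langle\psi_n^0| H_0 |\psi_n^2\rangle + \langle\psi_n^0| H' |\psi_n^1\rangle = \langle\psi_n^0| E_n^0 |\psi_n^2\rangle + \langle\psi_n^0| E_n^1 |\psi_n^1\rangle + \langle\psi_n^0| E_n^2 |\psi_n^0\rangle \tag{2.34}$$

$$\langle\psi_n^0| E_n^0 |\psi_n^2\rangle + \langle\psi_n^0| H' |\psi_n^1\rangle = \langle\psi_n^0| E_n^0 |\psi_n^2\rangle + E_n^1 \langle\psi_n^0|\psi_n^1\rangle + E_n^2 \tag{2.35}$$

The second term on the r.h.s. is zero, because we required $\langle\psi_n^0|\psi_n^1\rangle = 0$ at the beginning.

$$E_n^2 = \langle\psi_n^0| H' |\psi_n^1\rangle \tag{2.36}$$

Bottom line: to compute the second order energy correction, we need to know the wavefunction at first order.

This conclusion is in fact generically true: we need the wavefunction at one order lower to compute the energy correction at a given order.

$$E_n^2 = \langle\psi_n^0| H' |\psi_n^1\rangle = \sum_{m \ne n} \langle\psi_n^0| H' |\psi_m^0\rangle \frac{\langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} = \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.37}$$

$$E_n = E_n^0 + \lambda \langle\psi_n^0| H' |\psi_n^0\rangle + \lambda^2 \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} + O(\lambda^3) \tag{2.38}$$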

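Then we compute the second order correction for the wavefunction, $|\psi_n^2\rangle$. To do that, we multiply both sides of the equation from the left by $\langle\psi_m^0|$, where $m \ne n$

$$\langle\psi_m^0| H_0 |\psi_n^2\rangle + \langle\psi_m^0| H' |\psi_n^1\rangle = \langle\psi_m^0| E_n^0 |\psi_n^2\rangle + \langle\psi_m^0| E_n^1 |\psi_n^1\rangle + \langle\psi_m^0| E_n^2 |\psi_n^0\rangle \tag{2.39}$$

$$E_m^0 \langle\psi_m^0|\psi_n^2\rangle + \langle\psi_m^0| H' |\psi_n^1\rangle = E_n^0 \langle\psi_m^0|\psi_n^2\rangle + E_n^1 \langle\psi_m^0|\psi_n^1\rangle + E_n^2 \langle\psi_m^0|\psi_n^0\rangle \tag{2.40}$$

Since $\langle\psi_m^0|\psi_n^0\rangle = 0$ for $m \ne n$, the last term vanishes and

$$\left( E_n^0 - E_m^0 \right) \langle\psi_m^0|\psi_n^2\rangle = \langle\psi_m^0| H' |\psi_n^1\rangle - E_n^1 \langle\psi_m^0|\psi_n^1\rangle \tag{2.41}$$

$$\begin{aligned} \langle\psi_m^0|\psi_n^2\rangle &= \frac{\langle\psi_m^0| H' |\psi_n^1\rangle}{E_n^0 - E_m^0} - \frac{E_n^1}{E_n^0 - E_m^0} \langle\psi_m^0|\psi_n^1\rangle \\ &= \sum_{m' \ne n} \frac{\langle\psi_m^0| H' |\psi_{m'}^0\rangle}{E_n^0 - E_m^0} \frac{\langle\psi_{m'}^0| H' |\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \sum_{m' \ne n} \frac{\langle\psi_n^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \langle\psi_m^0|\psi_{m'}^0\rangle \frac{\langle\psi_{m'}^0| H' |\psi_n^0\rangle}{E_n^0 - E_{m'}^0} \\ &= \sum_{m' \ne n} \frac{\langle\psi_m^0| H' |\psi_{m'}^0\rangle}{E_n^0 - E_m^0} \frac{\langle\psi_{m'}^0| H' |\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \sum_{m' \ne n} \frac{\langle\psi_n^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \delta_{m,m'} \frac{\langle\psi_{m'}^0| H' |\psi_n^0\rangle}{E_n^0 - E_{m'}^0} \\ &= \sum_{m' \ne n} \frac{\langle\psi_m^0| H' |\psi_{m'}^0\rangle}{E_n^0 - E_m^0} \frac{\langle\psi_{m'}^0| H' |\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \frac{\langle\psi_n^0| H' |\psi_n^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{\left( E_n^0 - E_m^0 \right)^2} \end{aligned} \tag{2.42}$$

According to the definition of $|\psi_n^2\rangle$

$$|\psi_n^2\rangle = \sum_{m \ne n} c_m^{(2)} |\psi_m^0\rangle \tag{2.43}$$

we have

$$c_m^{(2)} = \langle\psi_m^0|\psi_n^2\rangle \tag{2.44}$$

and thus

$$|\psi_n^2\rangle = \sum_{m \ne n} \left[ \sum_{m' \ne n} \frac{\langle\psi_m^0| H' |\psi_{m'}^0\rangle}{E_n^0 - E_m^0} \frac{\langle\psi_{m'}^0| H' |\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \frac{\langle\psi_n^0| H' |\psi_n^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{\left( E_n^0 - E_m^0 \right)^2} \right] |\psi_m^0\rangle \tag{2.45}$$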

2.2.8. Third order

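As for the second order, we can use the same method to show that

$$E_n^3 = \langle\psi_n^0| H' |\psi_n^2\rangle \tag{2.46}$$

So,

$$E_n^3 = \sum_{m \ne n} \sum_{m' \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_{m'}^0\rangle \langle\psi_{m'}^0| H' |\psi_n^0\rangle}{\left( E_n^0 - E_m^0 \right) \left( E_n^0 - E_{m'}^0 \right)} - \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_n^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle \langle\psi_n^0| H' |\psi_m^0\rangle}{\left( E_n^0 - E_m^0 \right)^2} \tag{2.47}$$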

And one can keep doing this for higher and higher order
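The energy corrections (2.23), (2.37) and (2.47) can be compared against exact diagonalization. The following is a minimal sketch under stated assumptions: the three-level `E0`, the real symmetric matrix `V` (standing for the matrix elements of $H'$ in the unperturbed basis) and `lam` are hypothetical illustration values.

```python
import numpy as np

# Hypothetical 3-level example (numbers chosen only for illustration).
E0 = np.array([0.0, 1.0, 3.0])
V = np.array([[0.5, 0.3, 0.1],
              [0.3, -0.2, 0.4],
              [0.1, 0.4, 0.7]])           # V[i, j] = <psi_i^0| H' |psi_j^0>
lam, n = 0.05, 0

others = [m for m in range(len(E0)) if m != n]
gap = {m: E0[n] - E0[m] for m in others}  # E_n^0 - E_m^0

E1 = V[n, n]                                                   # Eq. (2.23)
E2 = sum(V[n, m] * V[m, n] / gap[m] for m in others)           # Eq. (2.37)
E3 = (sum(V[n, m] * V[m, k] * V[k, n] / (gap[m] * gap[k])
          for m in others for k in others)
      - V[n, n] * sum(V[n, m] * V[m, n] / gap[m] ** 2 for m in others))  # Eq. (2.47)

exact = np.linalg.eigvalsh(np.diag(E0) + lam * V)[n]

# errors[k] is the error of the series truncated after the lam**k term.
partial = E0[n]
errors = [abs(partial - exact)]
for order, corr in enumerate([E1, E2, E3], start=1):
    partial += lam ** order * corr
    errors.append(abs(partial - exact))
print(errors)
```

Each additional order should reduce the error by roughly a factor of $\lambda$ times a ratio of matrix elements to level spacings, as long as the assumption of section 2.2.1 holds.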

2.2.9. Summary

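For a Hamiltonian

$$H = H_0 + \lambda H' \tag{2.48}$$

assuming that we know all the eigenstates of $H_0$ ($|\psi_n^0\rangle$), and we know the matrix elements $\langle\psi_{m_1}^0| H' |\psi_{m_2}^0\rangle$ for any two eigenstates of $H_0$ ($|\psi_{m_1}^0\rangle$ and $|\psi_{m_2}^0\rangle$), then we can write down the eigenenergies and eigenstates of $H$ as power series expansions in $\lambda$

$$E_n = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots = E_n^0 + \lambda \langle\psi_n^0| H' |\psi_n^0\rangle + \lambda^2 \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.49}$$

and

$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda |\psi_n^1\rangle + \lambda^2 |\psi_n^2\rangle + \ldots = |\psi_n^0\rangle + \lambda \sum_{m \ne n} |\psi_m^0\rangle \frac{\langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.50}$$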

2.2.10. Second order perturbation always reduces the energy of the ground state

One key conclusion from perturbation theory is that the second order correction always makes the energy of the ground state lower (in comparison to the unperturbed one). This can be seen by looking at $E_n^2$ with $n$ being the ground state

$$E_n^2 = \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} = \sum_{m \ne n} \frac{\left| \langle\psi_n^0| H' |\psi_m^0\rangle \right|^2}{E_n^0 - E_m^0} \tag{2.51}$$

In the numerator, $\langle\psi_n^0| H' |\psi_m^0\rangle$ is the complex conjugate of $\langle\psi_m^0| H' |\psi_n^0\rangle$, so the product is $\left| \langle\psi_n^0| H' |\psi_m^0\rangle \right|^2$, which is non-negative

$$\left| \langle\psi_n^0| H' |\psi_m^0\rangle \right|^2 \ge 0 \tag{2.52}$$

The denominator $E_n^0 - E_m^0 < 0$ if $n$ is the ground state of the unperturbed Hamiltonian (if it is the ground state, then its eigenenergy must be smaller than the eigenenergy of any other state). And therefore

$$\frac{\left| \langle\psi_n^0| H' |\psi_m^0\rangle \right|^2}{E_n^0 - E_m^0} \le 0 \tag{2.53}$$

So

$$E_n^2 = \sum_{m \ne n} \frac{\left| \langle\psi_n^0| H' |\psi_m^0\rangle \right|^2}{E_n^0 - E_m^0} \le 0 \tag{2.54}$$

The equal sign only arises when $\langle\psi_n^0| H' |\psi_m^0\rangle = 0$ for ALL $m \ne n$ (if this is the case, we don’t need to do perturbation theory; the first order perturbation becomes exact). As long as we ignore this very special case, we find that $E_n^2 < 0$ for the ground state, regardless of details.

This conclusion is very important in quantum mechanics, because in many systems the first order perturbation of the ground state happens to be zero, $E_n^1 = 0$. There,

$$E_n = E_n^0 + \lambda^2 E_n^2 + \ldots \tag{2.55}$$

The energy correction is dominated by the second order term, which must be negative for the ground state. Without any calculation, we know immediately that

$$E_n < E_n^0 \tag{2.56}$$

In the first homework, we will see that this relation implies that the speed of light in a (linear) medium can only be slower than in vacuum (i.e., if $E_n > E_n^0$, we would violate special relativity).
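The sign argument of Eqs. (2.51)-(2.54) can be illustrated numerically. In this sketch the level energies `E0` and the random Hermitian matrices `V` (playing the role of $H'$ in the unperturbed basis) are arbitrary choices made only for illustration. By the same argument with all denominators positive, the highest level is pushed up, which the sketch also records.

```python
import numpy as np

def second_order_shift(E0, V, n):
    """E_n^2 = sum_{m != n} |<psi_m^0|H'|psi_n^0>|^2 / (E_n^0 - E_m^0), Eq. (2.51)."""
    mask = np.arange(len(E0)) != n
    return float(np.sum(np.abs(V[mask, n]) ** 2 / (E0[n] - E0[mask])))

rng = np.random.default_rng(0)
E0 = np.array([0.0, 1.0, 2.0, 3.5, 5.0])   # hypothetical non-degenerate levels

ground_shifts, top_shifts = [], []
for _ in range(100):
    # Random complex Hermitian perturbation (illustration only).
    A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
    V = (A + A.conj().T) / 2
    ground_shifts.append(second_order_shift(E0, V, 0))   # lowest level
    top_shifts.append(second_order_shift(E0, V, 4))      # highest level

print(max(ground_shifts), min(top_shifts))
```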

2.3. Brillouin-Wigner Perturbation Theory

2.3.1. Negative sides of Rayleigh–Schrödinger perturbation theory

The perturbation theory discussed above is known as Rayleigh–Schrödinger perturbation theory. It is the approach presented in most textbooks. However, it has some limitations and is not sufficient for some cases.

1. It is too complicated to go to higher order (e.g. third or fourth order corrections).

2. The physical meaning is less clear (Why do we need to sum over all other quantum states? How should we think about the sum?).

3. One needs to compute energies and wavefunctions at the same time (if we only want to know the eigenenergy, can we compute only the energy without bothering with the wavefunction?).

One way to resolve these problems: Brillouin-Wigner Perturbation Theory

2.3.2. Brillouin-Wigner Perturbation Theory

Brillouin-Wigner perturbation theory considers the same setup, and the final conclusions are exactly the same. However, it has a couple of advantages:

1. It offers a nice and simple physical interpretation (a baby version of the Feynman diagrams used in quantum field theory).

2. It is easier to compute higher order corrections (if we want to compute the eigenenergy using a computer, this perturbation theory just needs one very simple iteration).

3. One can compute the energy alone, without worrying about wavefunctions.

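Let’s start from the same setup

$$(H_0 + \lambda H') |\psi_n\rangle = E_n |\psi_n\rangle \tag{2.57}$$

where $|\psi_n\rangle$ represents the same unnormalized eigenstate of $H$

$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m \ne n} c_m |\psi_m^0\rangle \tag{2.58}$$

We can rewrite the equation above as (here $E$ is shorthand for the exact eigenenergy $E_n$)

$$(E - H_0) |\psi_n\rangle = \lambda H' |\psi_n\rangle \tag{2.59}$$

and

$$|\psi_n\rangle = \lambda (E - H_0)^{-1} H' |\psi_n\rangle \tag{2.60}$$

Note: here $(E - H_0)^{-1}$ is a matrix inverse, instead of the inverse of a number, because $H_0$ is an operator, not a number.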

Q: What is a function of an operator, e.g. $f(Q)$?

A: First, we write down the same function as a function of a number and do a power series expansion

$$f(x) = a_0 + a_1 x + a_2 x^2 + \ldots \tag{2.61}$$

Then $f(Q)$ represents the same power series, but with the number $x$ replaced by the operator $Q$

$$f(Q) = a_0 + a_1 Q + a_2 Q^2 + \ldots \tag{2.62}$$

where $Q^2 = Q Q$, etc.

Here, the inverse operator $(E - H_0)^{-1}$ shall be understood in the same way. Because (for $|x| < |E|$)

$$\frac{1}{E - x} = \frac{1}{E} + \frac{x}{E^2} + \frac{x^2}{E^3} + \ldots \tag{2.63}$$

we know that

$$(E - H_0)^{-1} = \frac{1}{E} + \frac{H_0}{E^2} + \frac{H_0^2}{E^3} + \ldots \tag{2.64}$$
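As a quick numerical illustration of Eq. (2.64), the sketch below sums the power series for a small matrix and compares it with the direct matrix inverse. The matrix `H0` and the number `E` are hypothetical values, chosen so that all eigenvalues of `H0` are smaller than `E` in magnitude (the series only converges in that case). The last lines show an equivalent route that works whenever $E$ is not an eigenvalue: apply $1/(E - x)$ to each eigenvalue of $H_0$ in its eigenbasis, which is the property actually used in Eq. (2.65).

```python
import numpy as np

# Hypothetical symmetric H0 with all |eigenvalues| < E (illustration only).
H0 = np.array([[0.2, 0.1, 0.0],
               [0.1, -0.5, 0.3],
               [0.0, 0.3, 0.9]])
E = 2.0

# Power series 1/E + H0/E^2 + H0^2/E^3 + ..., truncated after 80 terms.
series = sum(np.linalg.matrix_power(H0, k) / E ** (k + 1) for k in range(80))

# Direct matrix inverse of (E - H0).
direct = np.linalg.inv(E * np.eye(3) - H0)

# Same operator built in the eigenbasis of H0: 1/(E - eigenvalue) on each eigenvector.
w, U = np.linalg.eigh(H0)
spectral = U @ np.diag(1.0 / (E - w)) @ U.T

max_diff = np.max(np.abs(series - direct))
print(max_diff)
```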

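Now, back to the derivation above. For $m \ne n$,

$$\langle\psi_m^0|\psi_n\rangle = \langle\psi_m^0| \lambda (E - H_0)^{-1} H' |\psi_n\rangle = \frac{\lambda}{E - E_m^0} \langle\psi_m^0| H' |\psi_n\rangle \tag{2.65}$$

Remember that from the definition of $|\psi_n\rangle$

$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m \ne n} c_m |\psi_m^0\rangle \tag{2.66}$$

we have

$$\langle\psi_m^0|\psi_n\rangle = c_m \tag{2.67}$$

for any $m \ne n$. And thus, we get

$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda \sum_{m \ne n} |\psi_m^0\rangle \frac{1}{E - E_m^0} \langle\psi_m^0| H' |\psi_n\rangle \tag{2.68}$$

If we define a quantum operator $R$ as

$$R = \sum_{m \ne n} |\psi_m^0\rangle \frac{1}{E - E_m^0} \langle\psi_m^0| \tag{2.69}$$

we get

$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda R H' |\psi_n\rangle \tag{2.70}$$

Thus,

$$\left( I - \lambda R H' \right) |\psi_n\rangle = |\psi_n^0\rangle \tag{2.71}$$

where $I$ is the identity operator.

So,

$$|\psi_n\rangle = \left( I - \lambda R H' \right)^{-1} |\psi_n^0\rangle \tag{2.72}$$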

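Again, we emphasize that here $\left( I - \lambda R H' \right)^{-1}$ represents a matrix inverse. For a matrix inverse, we can use a Taylor expansion to write it out. We know that

$$(1 - a)^{-1} = 1 + a + a^2 + a^3 + \ldots \tag{2.73}$$

So similarly, we have

$$\left( I - \lambda R H' \right)^{-1} = I + \lambda R H' + \lambda^2 R H' R H' + \lambda^3 R H' R H' R H' + \ldots \tag{2.74}$$

So we have

$$|\psi_n\rangle = \left( I - \lambda R H' \right)^{-1} |\psi_n^0\rangle = |\psi_n^0\rangle + \lambda R H' |\psi_n^0\rangle + \lambda^2 R H' R H' |\psi_n^0\rangle + \lambda^3 R H' R H' R H' |\psi_n^0\rangle + \ldots \tag{2.75}$$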
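Equations (2.72) and (2.75) can be verified numerically for a small matrix. This is a sketch under stated assumptions: `E0`, `V` (the matrix elements of $H'$) and `lam` are hypothetical illustration values, and the exact eigenenergy from a direct diagonalization is used as $E$ inside $R$, so that only the operator identity is being tested.

```python
import numpy as np

# Hypothetical 3-level example (numbers chosen only for illustration).
E0 = np.array([0.0, 1.0, 3.0])
V = np.array([[0.5, 0.3, 0.1],
              [0.3, -0.2, 0.4],
              [0.1, 0.4, 0.7]])
lam, n = 0.05, 0

H = np.diag(E0) + lam * V
E = np.linalg.eigvalsh(H)[n]       # assumption: exact E_n is used inside R

# R = sum_{m != n} |psi_m^0> 1/(E - E_m^0) <psi_m^0|, Eq. (2.69)
R = np.zeros((3, 3))
for m in range(3):
    if m != n:
        R[m, m] = 1.0 / (E - E0[m])

A = lam * R @ V                    # the operator lam * R * H'
psi0 = np.eye(3)[n]                # |psi_n^0> in the unperturbed basis

# Truncated series of Eq. (2.75) versus the direct solve of Eq. (2.71).
series = sum(np.linalg.matrix_power(A, k) @ psi0 for k in range(40))
direct = np.linalg.solve(np.eye(3) - A, psi0)

# The result should be an (unnormalized) eigenvector of H with eigenvalue E.
residual = np.max(np.abs(H @ direct - E * direct))
print(series, residual)
```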


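So we find that

$$|\psi_n^k\rangle = \left( R H' \right)^k |\psi_n^0\rangle \tag{2.76}$$

Previously, we found that

$$E_n^1 = \langle\psi_n^0| H' |\psi_n^0\rangle \tag{2.77}$$

$$E_n^2 = \langle\psi_n^0| H' |\psi_n^1\rangle \tag{2.78}$$

$$E_n^3 = \langle\psi_n^0| H' |\psi_n^2\rangle \tag{2.79}$$

In fact, we can use the same procedure to show that for the $k$th order,

$$E_n^k = \langle\psi_n^0| H' |\psi_n^{k-1}\rangle \tag{2.80}$$

Because we have found that $|\psi_n^{k-1}\rangle = \left( R H' \right)^{k-1} |\psi_n^0\rangle$,

$$E_n^k = \langle\psi_n^0| H' |\psi_n^{k-1}\rangle = \langle\psi_n^0| H' \left( R H' \right)^{k-1} |\psi_n^0\rangle \tag{2.81}$$

So, we have

$$E_n^1 = \langle\psi_n^0| H' |\psi_n^0\rangle \tag{2.82}$$

$$E_n^2 = \langle\psi_n^0| H' R H' |\psi_n^0\rangle = \sum_{m \ne n} \langle\psi_n^0| H' |\psi_m^0\rangle \frac{1}{E_n - E_m^0} \langle\psi_m^0| H' |\psi_n^0\rangle \tag{2.83}$$

$$E_n^3 = \langle\psi_n^0| H' R H' R H' |\psi_n^0\rangle = \sum_{m \ne n} \sum_{m' \ne n} \langle\psi_n^0| H' |\psi_m^0\rangle \frac{1}{E_n - E_m^0} \langle\psi_m^0| H' |\psi_{m'}^0\rangle \frac{1}{E_n - E_{m'}^0} \langle\psi_{m'}^0| H' |\psi_n^0\rangle \tag{2.84}$$

$$\ldots \tag{2.85}$$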

From these formulas we see a pattern.

1. For any $E_n^k$, if we read the formula from right to left, we always start from the unperturbed state $|\psi_n^0\rangle$ and eventually go back to the same state $|\psi_n^0\rangle$.

2. On the path from $|\psi_n^0\rangle$ to $|\psi_n^0\rangle$, we go through several intermediate states $|\psi_m^0\rangle, |\psi_{m'}^0\rangle, \ldots$. For the $k$th order perturbation, we have $k - 1$ intermediate states.

3. To go from one state to another along the path (e.g. from $n$ to $m'$ or from $m'$ to $m$ in $E_n^3$), we use the perturbation $H'$.

4. For each intermediate state, we have a denominator $\frac{1}{E_n - E_m^0}$.

2.3.3. Diagrammatic representation

We can represent $E_n^k$ using diagrams.

1. For each intermediate state, we represent $\frac{1}{E_n - E_m^0}$ as a solid line with the integer $m$ labeling the state.

2. Each $\langle\psi_m^0| H' |\psi_{m'}^0\rangle$ is represented as a dot, and we use $V_{m m'}$ to denote $\langle\psi_m^0| H' |\psi_{m'}^0\rangle$.

3. Connect everything together in the same order as in $E_n^k$.

4. At the two ends of the line, we use two short lines to indicate that we start from and end at the same state $|\psi_n^0\rangle$.

First order, second order, third order: [diagrams not reproduced]

By making the line longer, we can easily write down perturbation terms to any order.

Relations to QFT:

In QFT, we use very similar diagrams, known as Feynman diagrams. There, solid lines are propagators of a particle, $\frac{1}{\omega - \epsilon_0}$, where $\omega$ is the frequency (pretty much the same as the energy $E_n$) and $\epsilon_0$ is the unperturbed energy of the particle (the energy ignoring interactions between particles). In fact, the diagrams we show here are baby versions of Feynman diagrams.

Physical meaning discussed in class (example: two electrons exchange photons to get E&M interactions).

2.3.4. How to compute the energy using Brillouin-Wigner Perturbation Theory?

First, let’s define an abbreviation to make the formulas shorter,

$$V_{ij} = \langle\psi_i^0| H' |\psi_j^0\rangle \tag{2.86}$$

and thus

$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n - E_m^0} + \lambda^3 \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n - E_m^0 \right) \left( E_n - E_{m'}^0 \right)} + \lambda^4 \frac{V_{nm} V_{mm'} V_{m'm''} V_{m''n}}{\left( E_n - E_m^0 \right) \left( E_n - E_{m'}^0 \right) \left( E_n - E_{m''}^0 \right)} + \ldots \tag{2.87}$$

Here all the $m$’s are summed over, but they cannot be the same as $n$. It may look like we can find $E_n$ using this formula, but that is not quite the case yet. This is because on the r.h.s. the denominators also contain $E_n$, i.e. it is an equation for $E_n$, and $E_n$ appears on both sides.

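This equation can be solved easily using an iterative method (e.g. using computer code). One starts from zeroth order, and then goes to first, second, third order, and so on. Every time we need $E_n$ in the $k$th order calculation, we just use the $(k-1)$th order $E_n$ on the r.h.s. Here is how it is done.

First run

$$E_n^{(1)} = E_n^0 + \lambda V_{nn} \tag{2.88}$$

Second run

$$E_n^{(2)} = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^{(1)} - E_m^0} \tag{2.89}$$

Third run

$$E_n^{(3)} = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^{(2)} - E_m^0} + \lambda^3 \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n^{(2)} - E_m^0 \right) \left( E_n^{(2)} - E_{m'}^0 \right)} \tag{2.90}$$

Fourth run

$$E_n^{(4)} = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^{(3)} - E_m^0} + \lambda^3 \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n^{(3)} - E_m^0 \right) \left( E_n^{(3)} - E_{m'}^0 \right)} + \lambda^4 \frac{V_{nm} V_{mm'} V_{m'm''} V_{m''n}}{\left( E_n^{(3)} - E_m^0 \right) \left( E_n^{(3)} - E_{m'}^0 \right) \left( E_n^{(3)} - E_{m''}^0 \right)} \tag{2.91}$$

...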

Another option is using analytic methods, as will be discussed below.
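The iterative runs of Eqs. (2.88)-(2.91) can be sketched in a few lines of code. This is one possible implementation rather than the course's own: the function name `bw_energy` and the three-level test data are hypothetical. Run $k$ keeps terms up to $\lambda^k$ and evaluates every denominator with the energy from run $k-1$; the $j$th term is computed as $\langle\psi_n^0| H' (R H')^{j-1} |\psi_n^0\rangle$ from Eq. (2.81), with $R$ diagonal in the unperturbed basis.

```python
import numpy as np

def bw_energy(E0, V, n, lam, runs=8):
    """Brillouin-Wigner iteration: returns [E_n^(1), E_n^(2), ..., E_n^(runs)]."""
    mask = np.arange(len(E0)) != n
    Vq = V[np.ix_(mask, mask)]         # H' restricted to the states m != n
    v = V[mask, n]                     # column of <psi_m^0|H'|psi_n^0>, m != n
    E = E0[n]                          # zeroth order value, used by the first run
    history = []
    for k in range(1, runs + 1):
        R = 1.0 / (E - E0[mask])       # diagonal entries 1/(E - E_m^0) of R
        total = E0[n] + lam * V[n, n].real
        vec = R * v                    # R H' |psi_n^0>
        for j in range(2, k + 1):
            # lam^j <psi_n^0| H' (R H')^(j-1) |psi_n^0>
            total += lam ** j * np.real(np.vdot(v, vec))
            vec = R * (Vq @ vec)       # apply one more factor of R H'
        E = total
        history.append(E)
    return history

# Hypothetical 3-level example (numbers chosen only for illustration).
E0 = np.array([0.0, 1.0, 3.0])
V = np.array([[0.5, 0.3, 0.1],
              [0.3, -0.2, 0.4],
              [0.1, 0.4, 0.7]])
lam, n = 0.05, 0

history = bw_energy(E0, V, n, lam)
exact = np.linalg.eigvalsh(np.diag(E0) + lam * V)[n]
print([abs(E - exact) for E in history])
```

For a weak perturbation the printed errors should shrink rapidly from run to run, because each run both adds one more power of $\lambda$ and uses a better energy in the denominators.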

2.3.5. Preparation

Consider the following function

$$f(x) = x^2 g(x) = x^2 \left( 1 + a x + b x^2 + c x^3 + \ldots \right) \tag{2.92}$$

If we want to keep $f(x)$ to $O(x^n)$, we only need to keep $g(x)$ to $O(x^{n-2})$. Similarly, for the following function

$$f(x) = \frac{x^2}{g(x)} = \frac{x^2}{1 + a x + b x^2 + c x^3 + \ldots} \tag{2.93}$$

if we want to keep $f(x)$ to $O(x^n)$, we only need to keep $g(x)$ to $O(x^{n-2})$. This will be useful for us later.
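As a worked instance of this order counting (an illustration added here, using only the symbols $a$, $b$, $c$ of Eq. (2.93)), take $n = 4$. Expanding $1/g(x)$ to second order in $x$ is enough:

```latex
\frac{x^2}{1 + a x + b x^2 + c x^3 + \ldots}
  = x^2 \left[ 1 - a x + \left( a^2 - b \right) x^2 + O(x^3) \right]
  = x^2 - a x^3 + \left( a^2 - b \right) x^4 + O(x^5)
```

Only $a$ and $b$, i.e. $g(x)$ to $O(x^2)$, enter the result through $O(x^4)$; the coefficient $c$ first shows up in the $x^5$ term.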

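2.3.6. Iterative method

$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n - E_m^0} + \lambda^3 \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n - E_m^0 \right) \left( E_n - E_{m'}^0 \right)} + \lambda^4 \frac{V_{nm} V_{mm'} V_{m'm''} V_{m''n}}{\left( E_n - E_m^0 \right) \left( E_n - E_{m'}^0 \right) \left( E_n - E_{m''}^0 \right)} + \ldots \tag{2.94}$$

Zeroth order (no $E_n$ on the r.h.s. at this order, so job done)

$$E_n = E_n^0 + O(\lambda) \tag{2.95}$$

First order (no $E_n$ on the r.h.s. at this order, so job done)

$$E_n = E_n^0 + \lambda V_{nn} + O(\lambda^2) \tag{2.96}$$

Second order: we use the $E_n$ obtained at zeroth order for the $\lambda^2$ term

$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n - E_m^0} + O(\lambda^3) = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^0 - E_m^0} + O(\lambda^3) \tag{2.97}$$

This is because the third term on the r.h.s. already has a $\lambda^2$ prefactor. Thus, to keep terms to $O(\lambda^2)$, we only need to keep the denominator to $O(\lambda^0)$.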

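Third order,

$$\begin{aligned} E_n &= E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n - E_m^0} + \lambda^3 \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n - E_m^0 \right) \left( E_n - E_{m'}^0 \right)} + O(\lambda^4) \\ &= E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^0 + \lambda V_{nn} - E_m^0} + \lambda^3 \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n^0 - E_m^0 \right) \left( E_n^0 - E_{m'}^0 \right)} + O(\lambda^4) \end{aligned} \tag{2.98}$$

In the $\lambda^2$ term, we now need to keep $E_n$ to $O(\lambda)$. In the $\lambda^3$ term, we just need to keep $E_n$ to zeroth order.

For the $\lambda^2$ term, we can expand it for small $\lambda$

$$\lambda^2 \frac{V_{nm} V_{mn}}{E_n^0 + \lambda V_{nn} - E_m^0} = \lambda^2 \frac{V_{nm} V_{mn}}{E_n^0 - E_m^0} - \lambda^3 \frac{V_{nm} V_{mn}}{E_n^0 - E_m^0} \frac{V_{nn}}{E_n^0 - E_m^0} + \ldots \tag{2.99}$$

So

$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^0 - E_m^0} + \lambda^3 \left[ \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n^0 - E_m^0 \right) \left( E_n^0 - E_{m'}^0 \right)} - \frac{V_{nm} V_{mn} V_{nn}}{\left( E_n^0 - E_m^0 \right)^2} \right] + O(\lambda^4) \tag{2.100}$$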

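Fourth order,

$$\begin{aligned} E_n &= E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm'} V_{m'n}}{E_n^0 - E_{m'}^0} - E_m^0} \\ &\quad + \lambda^3 \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n^0 + \lambda V_{nn} - E_m^0 \right) \left( E_n^0 + \lambda V_{nn} - E_{m'}^0 \right)} + \lambda^4 \frac{V_{nm} V_{mm'} V_{m'm''} V_{m''n}}{\left( E_n^0 - E_m^0 \right) \left( E_n^0 - E_{m'}^0 \right) \left( E_n^0 - E_{m''}^0 \right)} + O(\lambda^5) \end{aligned} \tag{2.101}$$

$$\begin{aligned} E_n &= E_n^0 + \lambda V_{nn} + \lambda^2 \frac{V_{nm} V_{mn}}{E_n^0 - E_m^0} + \lambda^3 \left[ \frac{V_{nm} V_{mm'} V_{m'n}}{\left( E_n^0 - E_m^0 \right) \left( E_n^0 - E_{m'}^0 \right)} - \frac{V_{nm} V_{mn} V_{nn}}{\left( E_n^0 - E_m^0 \right)^2} \right] \\ &\quad + \lambda^4 \left[ \frac{V_{nm} V_{mm'} V_{m'm''} V_{m''n}}{\left( E_n^0 - E_m^0 \right) \left( E_n^0 - E_{m'}^0 \right) \left( E_n^0 - E_{m''}^0 \right)} - \frac{V_{nm} V_{mn}}{\left( E_n^0 - E_m^0 \right)^2} \frac{V_{nm'} V_{m'n}}{E_n^0 - E_{m'}^0} + \frac{V_{nm} V_{mn}}{\left( E_n^0 - E_m^0 \right)^3} V_{nn}^2 - 2 \frac{V_{nm} V_{mm'} V_{m'n} V_{nn}}{\left( E_n^0 - E_m^0 \right)^2 \left( E_n^0 - E_{m'}^0 \right)} \right] + O(\lambda^5) \end{aligned} \tag{2.102}$$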

2.4. Degenerate Perturbation Theory

In the previous section, we studied the effect of a small perturbation $\lambda H'$ on an eigenstate of $H_0$, $|\psi_n^0\rangle$. The key assumption there is that before we turn on the perturbation (i.e. at $\lambda = 0$), the eigenenergies of all other eigenstates of $H_0$ are very far away from $E_n^0$

$$\left| E_n^0 - E_m^0 \right| \gg \left| \langle\psi_i^0| \lambda H' |\psi_j^0\rangle \right| \tag{2.103}$$

In this section, we will consider the opposite situation, where there is at least one other eigenstate of $H_0$ which has the same eigenenergy as $|\psi_n^0\rangle$. Two states having the same eigenenergy is known as “degeneracy”. So this perturbation theory is known as degenerate perturbation theory.

2.4.1. Why non-degenerate perturbation theory fails in the presence of degeneracy?

In the presence of degeneracy, the perturbation theory that we learned before will fail. To see this, we just need to look at the second order perturbation of the eigenenergy

$$E_n = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots = E_n^0 + \lambda \langle\psi_n^0| H' |\psi_n^0\rangle + \lambda^2 \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.104}$$

Here, we focus on the second order correction:

$$\lambda^2 \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.105}$$

If $H_0$ has another eigenstate $|\psi_{n'}^0\rangle$ with the same eigenenergy, at least one term in this sum will have zero in the denominator and thus will diverge, i.e., when $E_n^0 = E_{n'}^0$, $\frac{1}{E_n^0 - E_{n'}^0} \to \infty$, and thus the theory becomes ill-defined.

NOTE: the same divergence will also arise in higher order corrections. But there is no divergence in the first order correction $E_n^1$.

In power series expansions, an infinite coefficient doesn’t always mean a singularity. It can mean that we missed something in a lower order correction. Here is a simple example. Let’s consider a function $f(x)$, which can be written as the following Taylor expansion at small $x$

$$f(x) = a_0 + a_1 x + a_2 x^2 + \ldots \tag{2.106}$$

Now, assume that I made a mistake in the Taylor expansion for the coefficient $a_1$. Instead of the correct value, $a_1$, I used a wrong coefficient for the linear term, say $b_1$.

$$f(x) = a_0 + b_1 x + (a_1 - b_1) x + a_2 x^2 + \ldots \tag{2.107}$$

In other words, here I missed part of the linear term, $(a_1 - b_1) x$. And thus the coefficients of the higher order terms will also need to be adjusted to absorb this mistake. Let’s try to use the $x^2$ term to correct this error, i.e.

$$f(x) = a_0 + b_1 x + \left( \frac{a_1 - b_1}{x} + a_2 \right) x^2 + \ldots \tag{2.108}$$

Let me define $b_2 = a_2 + \frac{a_1 - b_1}{x}$

$$f(x) = a_0 + b_1 x + b_2 x^2 + \ldots \tag{2.109}$$

Now, once again, I have written my function as a power series expansion. Because I used a wrong coefficient for the linear term, $b_1$, my second order term needs to use this new coefficient. This new coefficient $b_2$ is infinite at small $x$. This is transparent if we notice that when $x \to 0$

$$b_2 = a_2 + \frac{a_1 - b_1}{x} \to \infty \tag{2.110}$$

Bottom line: an infinite coefficient in the second order term (and higher order terms) means that the first order result is incorrect and needs to be revised.
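The failure can be seen in the smallest possible example. The sketch below uses a hypothetical two-level model, $H_0 = \mathrm{diag}(0, \delta)$ and $H' = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ (not a system from the course), and compares the exact lower eigenvalue with the second order Rayleigh–Schrödinger result of Eq. (2.38) as the splitting $\delta$ shrinks.

```python
import numpy as np

lam = 0.01

def lower_level(delta):
    """Exact and second order RS energy of the lower level for the hypothetical
    model H0 = diag(0, delta), H' = [[0, 1], [1, 0]]."""
    H = np.array([[0.0, lam], [lam, delta]])
    exact = np.linalg.eigvalsh(H)[0]
    # E_n^0 + lam * V_nn + lam^2 |V_mn|^2 / (E_n^0 - E_m^0), with V_nn = 0, V_mn = 1
    rs2 = 0.0 + lam * 0.0 + lam ** 2 * 1.0 / (0.0 - delta)
    return exact, rs2

for delta in [1.0, 1e-1, 1e-2, 1e-4, 1e-6]:
    exact, rs2 = lower_level(delta)
    print(f"delta={delta:g}  exact={exact:.6f}  RS2={rs2:.6f}")
```

When $\delta$ is much larger than $\lambda$, the two numbers agree closely. As $\delta \to 0$, the exact energy stays finite and approaches $-\lambda$, a correction that is first order in $\lambda$, while the second order formula grows without bound. This matches the bottom line above: the divergence signals that the first order treatment has to be redone.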

2.4.2. What to do?

Here, let’s first take another look at the second order correction

$$E_n^2 = \sum_{m \ne n} \frac{\langle\psi_n^0| H' |\psi_m^0\rangle \langle\psi_m^0| H' |\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.111}$$

As we know, the problem arises because $E_n^0 = E_m^0$ for certain $m$, and thus we get $1/0 = \infty$. To avoid this singularity, the only thing that we need to do is to require that the numerator also vanishes whenever the denominator is zero, i.e., if $E_n^0 = E_m^0$, we must make sure that $\langle\psi_m^0| H' |\psi_n^0\rangle = 0$.

NOTE: the two factors in the numerator are complex conjugates of each other, $\langle\psi_n^0| H' |\psi_m^0\rangle = \langle\psi_m^0| H' |\psi_n^0\rangle^*$, and thus if one of them is zero, the other is also zero.

Bottom line: for degenerate states, before we start the procedure described in the non-degenerate perturbation theory, we need to first make sure that for any pair of distinct degenerate states, $\langle\psi_m^0| H' |\psi_n^0\rangle = 0$.

2.4.3. Whenever there is a degeneracy, we have an option to choose the basis

A good example is a free particle. Consider a free particle with mass $m$.

$$H_0 = \frac{p^2}{2m} = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \tag{2.112}$$

The eigenstates of $H_0$ arise in pairs (i.e. there is a degeneracy for any excited state). The static Schrodinger equation here is

$$-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \psi(x) = E \psi(x) \tag{2.113}$$

It is a second order differential equation and we know the solutions are just plane waves

$$\psi = A e^{i k x} + B e^{-i k x} \tag{2.114}$$

The eigenenergy for this state is $E = p^2 / 2m = (\hbar k)^2 / 2m$, i.e. the kinetic energy. Here, $A$ and $B$ are two arbitrary coefficients.

For each fixed $k$, we have one eigenenergy $E = (\hbar k)^2 / 2m$, but an infinite number of eigenstates $\psi = A e^{i k x} + B e^{-i k x}$, i.e. a degeneracy. This example is known as a two-fold degeneracy, or we say that two states have the same energy. The reason we say “two states” here is that not all the eigenstates are linearly independent. In fact, we just need two states, $e^{i k x}$ and $e^{-i k x}$; all other eigenstates can be written as linear superpositions of these two. Bottom line: two-fold degeneracy means that any linear combination of these two states is an eigenstate of $H_0$ with the same eigenenergy.

Now, let’s look at the same second order differential equation again.

$$-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \psi(x) = E \psi(x) \tag{2.115}$$

We know that we can also write the solution of this equation as

$$\psi = C \cos k x + D \sin k x \tag{2.116}$$

i.e., instead of using exponentials, we can use sin and cos functions to represent plane waves. Here, once again we find an infinite number of eigenstates with the same eigenenergy, and once again, they are not all independent. We just need two states, $\cos k x$ and $\sin k x$, and all other eigenstates are just linear superpositions of these two. So, again, we reach the same conclusion: the system has a two-fold degeneracy. But earlier we said that the two states are $e^{i k x}$ and $e^{-i k x}$, while now for the two degenerate states we use $\cos k x$ and $\sin k x$.

These different choices are just different bases to represent all the eigenstates. We can choose to use $e^{i k x}$ and $e^{-i k x}$, or $\cos k x$ and $\sin k x$. There is no difference between them. In fact, we can choose any two linearly independent states

$$\psi_1 = A_1 e^{i k x} + B_1 e^{-i k x} \tag{2.117}$$

$$\psi_2 = A_2 e^{i k x} + B_2 e^{-i k x} \tag{2.118}$$

Then we can say that we have two degenerate states $\psi_1(x)$ and $\psi_2(x)$, and we can represent any other eigenstate (with the same eigenenergy) as

$$\psi(x) = X \psi_1(x) + Y \psi_2(x) \tag{2.119}$$

Q: Why do we usually use $e^{i k x}$ and $e^{-i k x}$, or $\cos k x$ and $\sin k x$? Why not use $e^{i k x}$ and $\cos k x$?

A: There is no problem (mathematically) if we choose to use $e^{i k x}$ and $\cos k x$. However, for convenience, it is usually better to use orthonormal bases

$$\int dx \left( e^{i k_1 x} \right)^* e^{i k_2 x} = 2 \pi \delta(k_1 - k_2) \tag{2.120}$$

$$\int dx \left( \cos k x \right)^* \sin k x = 0 \tag{2.121}$$

Q: Why do we use $e^{i k x}$ and $e^{-i k x}$ more often than $\cos k x$ and $\sin k x$ in quantum mechanics?

A: For $H_0$, there is little difference between the two choices. However, if we consider other quantum operators, like momentum, $e^{i k x}$ and $e^{-i k x}$ are a better choice. This is because $e^{i k x}$ and $e^{-i k x}$ are eigenstates of the momentum operator too! So they have not only well-defined energy, but also well-defined momenta ($\hbar k$ and $-\hbar k$ respectively). $\cos k x$ and $\sin k x$ don’t have well-defined momentum. They have a 50% chance of having momentum $\hbar k$ and a 50% chance of having momentum $-\hbar k$.

Bottom line: when you cannot decide which choice of basis is better, look at another quantum operator.

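These conclusions are true generically. Let’s start from a two-fold degeneracy. Assume that for $H_0$, there are two degenerate eigenstates:

$$H_0 |\psi_a^0\rangle = E^0 |\psi_a^0\rangle \tag{2.122}$$

and

$$H_0 |\psi_b^0\rangle = E^0 |\psi_b^0\rangle \tag{2.123}$$

We assume that these two states are orthogonal to each other (otherwise, we make them orthogonal using the Gram-Schmidt procedure). We assume that $\langle\psi_a^0| H' |\psi_b^0\rangle \ne 0$. Here we first prove a fact: if $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$ are both eigenstates of $H_0$ and they have the same eigenvalue $E^0$, then any linear superposition of $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$ is also an eigenstate of $H_0$ with the same eigenenergy. Let's define $|\psi^0\rangle = \alpha |\psi_a^0\rangle + \beta |\psi_b^0\rangle$

$$H_0 |\psi^0\rangle = H_0 \left( \alpha |\psi_a^0\rangle + \beta |\psi_b^0\rangle \right) = \alpha H_0 |\psi_a^0\rangle + \beta H_0 |\psi_b^0\rangle = \alpha E^0 |\psi_a^0\rangle + \beta E^0 |\psi_b^0\rangle = E^0 \left( \alpha |\psi_a^0\rangle + \beta |\psi_b^0\rangle \right) = E^0 |\psi^0\rangle \tag{2.124}$$

Bottom line: if $H_0$ has two degenerate eigenstates $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, we have infinitely many eigenstates with the same eigenvalue, $|\psi^0\rangle = \alpha |\psi_a^0\rangle + \beta |\psi_b^0\rangle$.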

To represent these infinite eigenstates, we need to choose two states as basis, e.g. ψa0 and ψb

0. Then, any eigenstates with eigenenergy

E0 can be written as a superposition of them. When we have one set of basis, we know that we can choose another set of basis (i.e. we can

change to a different set of basis): for example, we can define ψ10 and ψ2

0, where

ψ10 = α1 ψa

0 + β1 ψb0 (2.125)

Phys460.nb 17

ψ20 = α2 ψa

0 + β2 ψb0 (2.126)

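Here, we require $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ to be orthogonal to each other (otherwise, we make them orthogonal using the Gram-Schmidt procedure)

$$\langle\psi_1^0|\psi_2^0\rangle = \left(\langle\psi_a^0| \alpha_1^* + \langle\psi_b^0| \beta_1^*\right)\left(\alpha_2 |\psi_a^0\rangle + \beta_2 |\psi_b^0\rangle\right) = \alpha_1^* \alpha_2 + \beta_1^* \beta_2 = 0 \tag{2.127}$$

and we assume that $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ are normalized

$$\langle\psi_1^0|\psi_1^0\rangle = \left(\langle\psi_a^0| \alpha_1^* + \langle\psi_b^0| \beta_1^*\right)\left(\alpha_1 |\psi_a^0\rangle + \beta_1 |\psi_b^0\rangle\right) = \alpha_1^* \alpha_1 + \beta_1^* \beta_1 = |\alpha_1|^2 + |\beta_1|^2 = 1 \tag{2.128}$$

$$\langle\psi_2^0|\psi_2^0\rangle = \left(\langle\psi_a^0| \alpha_2^* + \langle\psi_b^0| \beta_2^*\right)\left(\alpha_2 |\psi_a^0\rangle + \beta_2 |\psi_b^0\rangle\right) = \alpha_2^* \alpha_2 + \beta_2^* \beta_2 = |\alpha_2|^2 + |\beta_2|^2 = 1 \tag{2.129}$$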

Bottom line: instead of our old states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, we can use $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ as our basis.

2.4.4. Which basis shall we use?

As mentioned above, in general $\langle\psi_a^0|H'|\psi_b^0\rangle \neq 0$. If so, naive perturbation theory has a singularity at second order. Now that we have learned that we can use a different set of unperturbed eigenstates $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$, can we make $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$? If so, it will save the day and get rid of the singularity. The way to do it is very simple. As we learned earlier, if we cannot decide which basis to use, we should look at another quantum operator. Here we do have one more quantum operator, which is $H'$.

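We first write down a 2×2 matrix

$$W = \begin{pmatrix} \langle\psi_a^0|H'|\psi_a^0\rangle & \langle\psi_a^0|H'|\psi_b^0\rangle \\ \langle\psi_b^0|H'|\psi_a^0\rangle & \langle\psi_b^0|H'|\psi_b^0\rangle \end{pmatrix} \tag{2.130}$$

To make the formula shorter, we define

$$W_{ij} = \langle\psi_i^0|H'|\psi_j^0\rangle \tag{2.131}$$

So

$$W = \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \tag{2.132}$$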

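1. $W$ is a Hermitian matrix ($W^\dagger = W$), so its eigenvalues are real

This is straightforward to prove, because $\langle\psi_1|X|\psi_2\rangle^* = \langle\psi_2|X^\dagger|\psi_1\rangle$. In particular, if $X$ is a Hermitian operator (the quantum operator of any physical observable is Hermitian), we have $X = X^\dagger$ and thus $\langle\psi_1|X|\psi_2\rangle^* = \langle\psi_2|X|\psi_1\rangle$. Thus it is easy to see that

$$W_{aa} = W_{aa}^*, \quad W_{bb} = W_{bb}^*, \quad W_{ba} = W_{ab}^* \tag{2.133}$$

$$W^\dagger = (W^T)^* = \begin{pmatrix} W_{aa}^* & W_{ba}^* \\ W_{ab}^* & W_{bb}^* \end{pmatrix} = \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} = W \tag{2.134}$$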

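2. $W$ has two eigenvalues $E_+$ and $E_-$, and each of them has an eigenvector: $\begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix}$ for $E_+$ and $\begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix}$ for $E_-$

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.135}$$

and

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.136}$$

where $E_+$ and $E_-$ are the two eigenvalues.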

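3. We can use these two eigenvectors to define our $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ as

$$|\psi_1^0\rangle = \alpha_1 |\psi_a^0\rangle + \beta_1 |\psi_b^0\rangle \tag{2.137}$$

$$|\psi_2^0\rangle = \alpha_2 |\psi_a^0\rangle + \beta_2 |\psi_b^0\rangle \tag{2.138}$$

As will be shown below, these two states are precisely what we should use for the perturbation theory.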


4. The two eigenvalues are the first-order corrections to the eigenenergy

$$E_1 = E^0 + \lambda E_+ + O(\lambda^2) \tag{2.139}$$

$$E_2 = E^0 + \lambda E_- + O(\lambda^2) \tag{2.140}$$

Q: Why do we have two eigenenergies here?

A: Because we started from two degenerate eigenstates. At $\lambda = 0$, the two states $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ have the same energy. If we now turn on a small perturbation $\lambda H'$, these two states (in general) have different eigenenergies: one is $E_1 = E^0 + \lambda E_+ + O(\lambda^2)$ and the other is $E_2 = E^0 + \lambda E_- + O(\lambda^2)$.

NOTE #1: We say that the perturbation "lifted the degeneracy".

NOTE #2: After the degeneracy is lifted, $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ no longer have the same energy. If the perturbation is small enough, we can now do non-degenerate perturbation theory, i.e. problem solved.
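The four steps above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix entries are made-up illustrative numbers, not values taken from any physical system.

```python
import numpy as np

# Hypothetical 2x2 matrix W_ij = <psi_i^0| H' |psi_j^0> in the (a, b) basis.
W = np.array([[0.30, 0.10 + 0.20j],
              [0.10 - 0.20j, -0.10]])

# eigh is meant for Hermitian matrices; eigenvalues are returned in
# ascending order, so index 0 is E_- and index 1 is E_+.
evals, evecs = np.linalg.eigh(W)
E_minus, E_plus = evals
v_minus, v_plus = evecs[:, 0], evecs[:, 1]   # each column is (alpha, beta)

# <psi_1^0| H' |psi_2^0> in the new basis: expected to vanish.
off_diag = np.vdot(v_plus, W @ v_minus)

# <psi_1^0| H' |psi_1^0> and <psi_2^0| H' |psi_2^0>: expected to be E_+ and E_-.
diag_plus = np.vdot(v_plus, W @ v_plus).real
diag_minus = np.vdot(v_minus, W @ v_minus).real

print(E_plus, E_minus, abs(off_diag))
```

`np.vdot` conjugates its first argument, so it plays the role of the row vector $(\alpha^*\ \beta^*)$ in the formulas.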

2.4.5. A key conclusion: in quantum mechanics, perturbations will in general lift all degeneracy, unless there is a reason saying that the degeneracy shall not be lifted.

In general, in the study of quantum physics, we can never include all terms of the Hamiltonian in a theoretical calculation. We always need some approximations (i.e. we drop some small or unimportant part of the Hamiltonian). For example, in the study of a hydrogen atom, we ignored relativistic effects. We also ignored the magnetic interaction between the electron and the nucleus (remember that both particles have spin. Whenever a charged particle has spin, there is a magnetic dipole. Because both the electron and the proton have magnetic dipoles, there should be a dipole-dipole interaction between them, which was ignored). In addition, we also ignored the Earth's magnetic field, which is always present when we do an experiment (unless we screen it out using some special devices).

Let's use $H_{\text{real}}$ to represent the full Hamiltonian of a real system and $H_{\text{model}}$ to represent the Hamiltonian that we use to analyze the system theoretically. We know that these two are not the same, because we always need some approximation to simplify a real problem, i.e.

$$H_{\text{real}} = H_{\text{model}} + \delta H \tag{2.141}$$

and we can treat $\delta H$ as a perturbation.

Now here comes the question: if we find that two (or more) states have the same eigenenergy (degeneracy) using $H_{\text{model}}$, are these states really degenerate in the real system? The general answer is no (unless there is a reason), because our first-order degenerate perturbation theory told us that any small perturbation will in general lift the degeneracy.

The only exception is when there is a reason (usually based on symmetry) that tells us that $W_{aa}$ is exactly the same as $W_{bb}$ and $W_{ab} = W_{ba} = 0$ precisely. (In a real physical system, in most cases, we cannot say that the value of a quantity is precisely some number. What we really mean is that there is an argument showing that the difference between $W_{aa}$ and $W_{bb}$ is unmeasurably small, and that $W_{ab}$ and $W_{ba}$ are unmeasurably small.)

2.4.6. Proof: $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$

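The proof contains two steps. First, we show

$$\langle\psi_1^0|H'|\psi_2^0\rangle = \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.142}$$

and then we will show

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = 0 \tag{2.143}$$

The first step is very straightforward (it follows from the definition of $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$)

$$|\psi_1^0\rangle = \alpha_1 |\psi_a^0\rangle + \beta_1 |\psi_b^0\rangle \tag{2.144}$$

$$|\psi_2^0\rangle = \alpha_2 |\psi_a^0\rangle + \beta_2 |\psi_b^0\rangle \tag{2.145}$$

so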


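$$\langle\psi_1^0|H'|\psi_2^0\rangle = \left(\langle\psi_a^0| \alpha_1^* + \langle\psi_b^0| \beta_1^*\right) H' \left(\alpha_2 |\psi_a^0\rangle + \beta_2 |\psi_b^0\rangle\right) = \left(\langle\psi_a^0| \alpha_1^* + \langle\psi_b^0| \beta_1^*\right)\left(\alpha_2 H' |\psi_a^0\rangle + \beta_2 H' |\psi_b^0\rangle\right)$$

$$= \alpha_1^* \alpha_2 \langle\psi_a^0|H'|\psi_a^0\rangle + \alpha_1^* \beta_2 \langle\psi_a^0|H'|\psi_b^0\rangle + \beta_1^* \alpha_2 \langle\psi_b^0|H'|\psi_a^0\rangle + \beta_1^* \beta_2 \langle\psi_b^0|H'|\psi_b^0\rangle \tag{2.146}$$

If we remember the definition of the $W$ matrix, $W_{ij} = \langle\psi_i^0|H'|\psi_j^0\rangle$, the r.h.s. of this equation is exactly the same as

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = \alpha_1^* \alpha_2 W_{aa} + \alpha_1^* \beta_2 W_{ab} + \beta_1^* \alpha_2 W_{ba} + \beta_1^* \beta_2 W_{bb} \tag{2.147}$$

So we proved that $\langle\psi_1^0|H'|\psi_2^0\rangle = \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix}$.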

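For the second step, we first consider the situation $E_+ \neq E_-$. According to the eigenequations, we have

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.148}$$

If we multiply both sides from the left by $\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix}$, we get

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} E_- \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.149}$$

Similarly, if we start from the other eigenequation

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.150}$$

and multiply both sides from the left by $\begin{pmatrix} \alpha_2^* & \beta_2^* \end{pmatrix}$, we get

$$\begin{pmatrix} \alpha_2^* & \beta_2^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} \alpha_2^* & \beta_2^* \end{pmatrix} E_+ \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_2^* & \beta_2^* \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.151}$$

If we take the Hermitian conjugate of both sides, we get

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.152}$$

Here we used the fact that $W$ is Hermitian, so $W^\dagger = W$ and $E_+$ is real. Notice that we have shown

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.153}$$

and

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.154}$$

Subtracting the two gives $(E_+ - E_-)(\alpha_1^* \alpha_2 + \beta_1^* \beta_2) = 0$. If $E_+ \neq E_-$, the only way that these two equations can both be valid is that

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = 0 \tag{2.155}$$

So we proved that $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$.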

Q: What happens if $E_+ = E_-$?

A: It turns out that this is the simple case. If $E_+ = E_-$, as will be shown below, $W_{ab} = \langle\psi_a^0|H'|\psi_b^0\rangle = 0$. So there is no divergence from the beginning, and we can start the perturbation theory without worrying about it.

2.4.7. First order perturbation

The calculation described above gives us the zeroth-order wavefunctions (i.e., we should use $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$, instead of $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, as our unperturbed wavefunctions). As we learned earlier (in non-degenerate perturbation theory), the first-order correction to the energy is just

$$E_n^1 = \langle\psi_n^0|H'|\psi_n^0\rangle \tag{2.156}$$

i.e., we use the zeroth-order wavefunction and compute the expectation value of $H'$. Here we have two zeroth-order wavefunctions, $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$, so we need to compute the first-order energy correction for each of them. We will prove in this section that


$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = E_+ \tag{2.157}$$

and

$$E_2^1 = \langle\psi_2^0|H'|\psi_2^0\rangle = E_- \tag{2.158}$$

i.e., the first-order energy corrections for $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ are precisely the two eigenvalues of the $W$ matrix.

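$$\langle\psi_1^0|H'|\psi_1^0\rangle = \left(\langle\psi_a^0| \alpha_1^* + \langle\psi_b^0| \beta_1^*\right) H' \left(\alpha_1 |\psi_a^0\rangle + \beta_1 |\psi_b^0\rangle\right) = \left(\langle\psi_a^0| \alpha_1^* + \langle\psi_b^0| \beta_1^*\right)\left(\alpha_1 H' |\psi_a^0\rangle + \beta_1 H' |\psi_b^0\rangle\right)$$

$$= \alpha_1^* \alpha_1 \langle\psi_a^0|H'|\psi_a^0\rangle + \alpha_1^* \beta_1 \langle\psi_a^0|H'|\psi_b^0\rangle + \beta_1^* \alpha_1 \langle\psi_b^0|H'|\psi_a^0\rangle + \beta_1^* \beta_1 \langle\psi_b^0|H'|\psi_b^0\rangle \tag{2.159}$$

If we remember the definition of the $W$ matrix, $W_{ij} = \langle\psi_i^0|H'|\psi_j^0\rangle$, we realize immediately that this formula is exactly the same as

$$\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = \alpha_1^* \alpha_1 W_{aa} + \alpha_1^* \beta_1 W_{ab} + \beta_1^* \alpha_1 W_{ba} + \beta_1^* \beta_1 W_{bb} \tag{2.160}$$

So we found

$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.161}$$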

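Because $\begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix}$ is an eigenvector of $W$,

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.162}$$

we have

$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.163}$$

Because we have required $\begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix}$ to be normalized, $\begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = \alpha_1^* \alpha_1 + \beta_1^* \beta_1 = 1$, and therefore

$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1^* & \beta_1^* \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \tag{2.164}$$

Similarly, we can show that

$$E_2^1 = \langle\psi_2^0|H'|\psi_2^0\rangle = \begin{pmatrix} \alpha_2^* & \beta_2^* \end{pmatrix} \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \begin{pmatrix} \alpha_2^* & \beta_2^* \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \tag{2.165}$$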

2.4.8. Eigenvalues of the matrix W

In this part, we review the basic ideas of eigenvalues and eigenvectors. We start from the eigenequation defined in the previous section

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = E \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \tag{2.166}$$

This means that

$$W_{aa} \alpha + W_{ab} \beta = E \alpha \tag{2.167}$$

$$W_{ba} \alpha + W_{bb} \beta = E \beta \tag{2.168}$$

These two equations have an obvious and trivial solution $\alpha = \beta = 0$. This solution is NOT what we want, and we will not consider it. To get a nontrivial solution, the eigenvalue $E$ cannot be arbitrary. It can only take one of two values, as we will see below.

Using the first equation, we get

$$\alpha = \frac{W_{ab}}{E - W_{aa}} \beta \tag{2.169}$$

Using the second equation, we get

$$\alpha = \frac{E - W_{bb}}{W_{ba}} \beta \tag{2.170}$$

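The first relation means

$$\frac{\alpha}{\beta} = \frac{W_{ab}}{E - W_{aa}} \tag{2.171}$$

but the second relation requires

$$\frac{\alpha}{\beta} = \frac{E - W_{bb}}{W_{ba}} \tag{2.172}$$

So we have

$$\frac{\alpha}{\beta} = \frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}} \tag{2.173}$$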

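In general, $\frac{W_{ab}}{E - W_{aa}} \neq \frac{E - W_{bb}}{W_{ba}}$, so we find a contradiction. This contradiction means that for a general value of $E$ we only have the trivial solution $\alpha = \beta = 0$. To get a nontrivial solution, we have to require $\frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}}$. This equation is often written in a different form

$$\frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}} \tag{2.174}$$

$$(E - W_{aa})(E - W_{bb}) = W_{ab} W_{ba} \tag{2.175}$$

$$(E - W_{aa})(E - W_{bb}) - W_{ab} W_{ba} = 0 \tag{2.176}$$

$$\det \begin{pmatrix} E - W_{aa} & -W_{ab} \\ -W_{ba} & E - W_{bb} \end{pmatrix} = 0 \tag{2.177}$$

or equivalently

$$\det \left[ \begin{pmatrix} E & 0 \\ 0 & E \end{pmatrix} - \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \right] = 0 \tag{2.178}$$

$$\det (E - W) = 0 \tag{2.179}$$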

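Here, $W$ is the matrix that we defined above

$$W = \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \tag{2.180}$$

and the number $E$ here means $E$ times the identity matrix

$$E \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} E & 0 \\ 0 & E \end{pmatrix} \tag{2.181}$$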

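$\det (E - W) = 0$ means

$$(W_{aa} - E)(W_{bb} - E) - W_{ab} W_{ba} = 0 \tag{2.182}$$

And thus

$$E^2 - (W_{aa} + W_{bb}) E + (W_{aa} W_{bb} - W_{ab} W_{ba}) = 0 \tag{2.183}$$

By definition, $\operatorname{tr} W = W_{aa} + W_{bb}$ and $\det W = W_{aa} W_{bb} - W_{ab} W_{ba}$. Therefore, we can write the same equation as

$$E^2 - (\operatorname{tr} W) E + \det W = 0 \tag{2.184}$$

This equation has two solutions

$$E_\pm = \frac{\operatorname{tr} W \pm \sqrt{(\operatorname{tr} W)^2 - 4 \det W}}{2} \tag{2.185}$$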

As shown above, these two solutions $E_\pm$ are the first-order corrections to the eigenenergy. In perturbation theory, the eigenenergies of these two quantum states are

$$E = E^0 + E_+ \lambda + O(\lambda^2) \tag{2.186}$$

and

$$E = E^0 + E_- \lambda + O(\lambda^2) \tag{2.187}$$

at small $\lambda$.

Comment #1. $\operatorname{tr} W$ and $\det W$ are both real. This is straightforward to prove, if we notice that $W$ is Hermitian. Because $W_{aa} = W_{aa}^*$, $W_{bb} = W_{bb}^*$ and $W_{ba} = W_{ab}^*$, $W_{aa}$ and $W_{bb}$ are real. And $W_{ab} W_{ba} = |W_{ab}|^2$ is also real, so $\operatorname{tr} W = W_{aa} + W_{bb}$ and $\det W = W_{aa} W_{bb} - W_{ab} W_{ba}$ are both real.

Comment #2. $(\operatorname{tr} W)^2 - 4 \det W \geq 0$. Therefore, $E_\pm$ are both real.

$$(\operatorname{tr} W)^2 - 4 \det W = (W_{aa} + W_{bb})^2 - 4 W_{aa} W_{bb} + 4 W_{ab} W_{ba} = (W_{aa} - W_{bb})^2 + 4 |W_{ab}|^2 \geq 0 \tag{2.188}$$

Here we used the fact that $W_{ab}^* = W_{ba}$.

Comment #3. There are in general two possible situations. (a) If $(\operatorname{tr} W)^2 - 4 \det W > 0$, then $E_+ > E_-$, i.e. the two eigenvalues are NOT the same. (b) If $W_{aa} = W_{bb}$ and $W_{ab} = W_{ba} = 0$, then $(\operatorname{tr} W)^2 - 4 \det W = 0$, and thus $E_+ = E_- = \operatorname{tr} W / 2$.

Situation (b) is the easy case, because $W_{ab} = W_{ba} = 0$ means $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$. Remember that the problem we had from the beginning is that the second-order correction, $\frac{\langle\psi_a^0|H'|\psi_b^0\rangle \langle\psi_b^0|H'|\psi_a^0\rangle}{E_a^0 - E_b^0}$, diverges because $E_a^0 = E_b^0$. For situation (b), the numerator is zero, so there is no divergence, and we can just do non-degenerate perturbation theory. Situation (a) is the more generic case. There, as we have shown earlier, the perturbation $H'$ lifts the degeneracy.
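The trace/determinant formula (2.185) is easy to check against a general-purpose eigenvalue routine. The snippet below is a small sketch assuming NumPy; the helper name `eigenvalues_2x2` and the matrix entries are illustrative choices, not part of the notes.

```python
import numpy as np

def eigenvalues_2x2(W):
    """Return (E_plus, E_minus) from E^2 - (tr W) E + det W = 0 for a 2x2 Hermitian W."""
    tr = (W[0, 0] + W[1, 1]).real
    det = (W[0, 0] * W[1, 1] - W[0, 1] * W[1, 0]).real
    disc = tr**2 - 4 * det            # equals (Waa - Wbb)^2 + 4 |Wab|^2 >= 0
    root = np.sqrt(max(disc, 0.0))    # guard against tiny negative rounding errors
    return (tr + root) / 2, (tr - root) / 2

# Hypothetical Hermitian matrix used only for illustration.
W = np.array([[0.30, 0.10 + 0.20j],
              [0.10 - 0.20j, -0.10]])

E_plus, E_minus = eigenvalues_2x2(W)
reference = np.linalg.eigvalsh(W)     # ascending order: [E_minus, E_plus]
print(E_plus, E_minus, reference)
```

For this matrix, $\operatorname{tr} W = 0.2$ and $(W_{aa} - W_{bb})^2 + 4|W_{ab}|^2 = 0.36$, so the formula gives $E_\pm = 0.1 \pm 0.3$ up to floating-point rounding.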

2.4.9. Eigenvectors of the matrix W

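In this section, we will assume that $(\operatorname{tr} W)^2 - 4 \det W > 0$, i.e. situation (a) discussed in the previous section. We have two eigenvalues. For each eigenvalue, we can solve for the corresponding eigenvector. For the eigenvalue $E_+ = \frac{\operatorname{tr} W + \sqrt{(\operatorname{tr} W)^2 - 4 \det W}}{2}$, we have

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = E_+ \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.189}$$

and for the other eigenvalue $E_- = \frac{\operatorname{tr} W - \sqrt{(\operatorname{tr} W)^2 - 4 \det W}}{2}$, we have

$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = E_- \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.190}$$

We will use the first one (the equation for $E_+$) as an example. There, the matrix equation can be written as two separate equations

$$W_{aa} \alpha_1 + W_{ab} \beta_1 = E_+ \alpha_1 \tag{2.191}$$

$$W_{ba} \alpha_1 + W_{bb} \beta_1 = E_+ \beta_1 \tag{2.192}$$

Using the first equation, we get

$$\alpha_1 = \frac{W_{ab}}{E_+ - W_{aa}} \beta_1 \tag{2.193}$$

Using the second equation, we get

$$\alpha_1 = \frac{E_+ - W_{bb}}{W_{ba}} \beta_1 \tag{2.194}$$

These two relations are actually identical, because for any eigenvalue $E$ we have $\frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}}$, as we proved earlier.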

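In addition, we know that $\alpha_1^* \alpha_1 + \beta_1^* \beta_1 = 1$, i.e. the normalization condition. So we have

$$\alpha_1 = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}} \tag{2.195}$$

$$\beta_1 = \frac{E_+ - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}} \tag{2.196}$$

Similarly, we have

$$\alpha_2 = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}} \tag{2.197}$$

$$\beta_2 = \frac{E_- - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}} \tag{2.198}$$

In conclusion, we found that

$$|\psi_1^0\rangle = \alpha_1 |\psi_a^0\rangle + \beta_1 |\psi_b^0\rangle = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}} |\psi_a^0\rangle + \frac{E_+ - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}} |\psi_b^0\rangle \tag{2.199}$$

$$|\psi_2^0\rangle = \alpha_2 |\psi_a^0\rangle + \beta_2 |\psi_b^0\rangle = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}} |\psi_a^0\rangle + \frac{E_- - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}} |\psi_b^0\rangle \tag{2.200}$$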

2.4.10. the very special case

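In general, we expect $E_+ \neq E_-$, i.e. the degeneracy is lifted. What happens if $E_+ = E_-$? From the equation

$$E_\pm = \frac{\operatorname{tr} W \pm \sqrt{(\operatorname{tr} W)^2 - 4 \det W}}{2} \tag{2.201}$$

we know that $E_+ = E_-$ can only arise when $\sqrt{(\operatorname{tr} W)^2 - 4 \det W} = 0$, i.e. $(\operatorname{tr} W)^2 - 4 \det W = 0$. As shown above

$$(\operatorname{tr} W)^2 - 4 \det W = (W_{aa} - W_{bb})^2 + 4 |W_{ab}|^2 \tag{2.202}$$

Both terms on the r.h.s. are non-negative, and thus if we want the whole thing to be zero, we must have

$$(W_{aa} - W_{bb})^2 = 0 \tag{2.203}$$

and

$$4 |W_{ab}|^2 = 0 \tag{2.204}$$

i.e., $W_{aa} = W_{bb}$ and $W_{ab} = 0$.

With $W_{aa} = W_{bb}$ and $W_{ab} = 0$, $W$ is actually proportional to the identity matrix.

$$W = \begin{pmatrix} W_{aa} & 0 \\ 0 & W_{aa} \end{pmatrix} = W_{aa} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{2.205}$$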

This situation is highly unlikely to arise (unless there is a reason) because, in general, the $W$ matrix has four free real values: $W_{aa}$, $W_{bb}$, the real part of $W_{ab}$ and the imaginary part of $W_{ab}$ (note 1: $W_{aa}$ and $W_{bb}$ are real, so they have no imaginary part. note 2: $W_{ba}$ is the complex conjugate of $W_{ab}$, so we don't need to count it as a separate degree of freedom). If you have four real values, what is the probability that $W_{aa}$ matches $W_{bb}$ exactly, without any error bar, and that both the real and imaginary parts of $W_{ab}$ vanish exactly, without any error bar? Without a reason, the chance is zero. So this is a situation that we don't need to worry about much, unless there is a reason.

In most cases, such a special case arises due to symmetry. For example, time-reversal symmetry tells us that there should be two degenerate states (a state and its time-reversed partner). These two states are degenerate for $H_0$, and they should still be degenerate for $H$, so $E_+ = E_-$. For that situation, it turns out that one can directly start from non-degenerate perturbation theory (no singularities will arise), although the states are degenerate. We will discuss a more generic situation later, which covers this case.

2.4.11. Review: quantum states and quantum operators as matrices

Once we choose a basis, any quantum state can be written as a vector (i.e., an N-by-1 matrix).

For a complete basis $\{|\psi_i\rangle\}$, we can write any quantum state as

$$|\psi\rangle = \sum_i c_i |\psi_i\rangle \tag{2.206}$$

where $c_i$ are complex numbers. So to describe a state, we just need to know all the coefficients $c_i$. We can write these $c_i$ as a vector

$$\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} \tag{2.207}$$

These coefficients are

$$c_i = \langle\psi_i|\psi\rangle \tag{2.208}$$

To see this, we multiply both sides of $|\psi\rangle = \sum_i c_i |\psi_i\rangle$ by $\langle\psi_j|$ from the left

$$\langle\psi_j|\psi\rangle = \sum_i c_i \langle\psi_j|\psi_i\rangle = \sum_i c_i \delta_{ij} = c_j \tag{2.209}$$

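Bottom line: a quantum state is a column vector

$$|\psi\rangle \to \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = \begin{pmatrix} \langle\psi_1|\psi\rangle \\ \langle\psi_2|\psi\rangle \\ \langle\psi_3|\psi\rangle \\ \vdots \end{pmatrix} \tag{2.210}$$

The conjugate state is represented by the conjugate vector. By definition, we know that

$$\langle\psi| = \sum_i \langle\psi_i| c_i^* \tag{2.211}$$

so we can write all these $c_i^*$ as a row vector

$$\begin{pmatrix} c_1^* & c_2^* & c_3^* & \dots \end{pmatrix} = \begin{pmatrix} \langle\psi|\psi_1\rangle & \langle\psi|\psi_2\rangle & \langle\psi|\psi_3\rangle & \dots \end{pmatrix} \tag{2.212}$$

Here, we used the fact that $\langle\psi|\psi_i\rangle$ is the complex conjugate of $\langle\psi_i|\psi\rangle$.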

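The inner product of two states is the product of a row vector and a column vector

If we know two quantum states

$$|\psi\rangle \to \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} \tag{2.213}$$

and

$$|\phi\rangle \to \begin{pmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \end{pmatrix} \tag{2.214}$$

then we know

$$\langle\phi| \to \begin{pmatrix} d_1^* & d_2^* & d_3^* & \dots \end{pmatrix} \tag{2.215}$$

so

$$\langle\phi|\psi\rangle \to \begin{pmatrix} d_1^* & d_2^* & d_3^* & \dots \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = d_1^* c_1 + d_2^* c_2 + d_3^* c_3 + \dots \tag{2.216}$$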

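Q: How about a quantum operator?

A: Once we choose a basis, a quantum operator is a matrix.

To understand this, we just need to realize that a quantum operator transforms a quantum state into a different state

$$X |\psi\rangle = |\phi\rangle \tag{2.217}$$

As we have seen, $|\psi\rangle$ is a column vector, and $|\phi\rangle$ is another column vector. Which object transforms a column vector into a different column vector? We know that a matrix can do such a job

$$\begin{pmatrix} x_{11} & x_{12} & x_{13} & \dots \\ x_{21} & x_{22} & x_{23} & \dots \\ x_{31} & x_{32} & x_{33} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = \begin{pmatrix} x_{11} c_1 + x_{12} c_2 + x_{13} c_3 + \dots \\ x_{21} c_1 + x_{22} c_2 + x_{23} c_3 + \dots \\ x_{31} c_1 + x_{32} c_2 + x_{33} c_3 + \dots \\ \vdots \end{pmatrix} = \begin{pmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \end{pmatrix} \tag{2.218}$$

So a quantum operator really behaves like a matrix. In fact, the matrix elements $x_{ij}$ are very easy to compute

$$x_{ij} = \langle\psi_i|X|\psi_j\rangle \tag{2.219}$$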

Q: How about eigenvalues and eigenstates?

A: Matrices also have eigenvalues and eigenvectors

$$\begin{pmatrix} x_{11} & x_{12} & x_{13} & \dots \\ x_{21} & x_{22} & x_{23} & \dots \\ x_{31} & x_{32} & x_{33} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = W \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} \tag{2.220}$$

where the number $W$ denotes the eigenvalue.

1. The matrix of a Hermitian operator is a Hermitian matrix

2. An N×N Hermitian matrix has N eigenvalues, each of which has an eigenvector

3. The eigenvalues of the matrix are the same as the eigenvalues of the corresponding quantum operator

4. Each eigenvector corresponds to an eigenstate, i.e. if $\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix}$ is an eigenvector with eigenvalue $W$, then $|\psi\rangle = \sum_i c_i |\psi_i\rangle$ is an eigenstate of $X$ with eigenvalue $W$.

Final conclusion: for a quantum system, we just need to play with matrices

Only one problem: these matrices are huge (∞×∞)

It is extremely hard to handle big matrices (say 100 million by 100 million). So this approach by itself doesn't make our life easier.
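The dictionary above (state → column vector, bra → conjugated row vector, operator → matrix) maps directly onto array operations. Below is a minimal sketch assuming NumPy, with a made-up three-state basis and a made-up Hermitian matrix `X` chosen only for illustration.

```python
import numpy as np

# Coefficients c_i = <psi_i|psi> and d_i = <psi_i|phi> in some orthonormal basis
# (three basis states here, purely for illustration).
psi = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2)
phi = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)

# <phi|psi> = sum_i d_i^* c_i ; np.vdot conjugates its first argument.
overlap = np.vdot(phi, psi)

# A Hermitian operator X becomes a Hermitian matrix x_ij = <psi_i|X|psi_j>.
X = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

X_psi = X @ psi                          # column vector representing X|psi>
expectation = np.vdot(psi, X_psi).real   # <psi|X|psi>

# Eigenvalues (ascending) and eigenvectors (columns) of the matrix of X.
eigenvalues, eigenvectors = np.linalg.eigh(X)
print(overlap, expectation, eigenvalues)
```

Each column of `eigenvectors` lists the coefficients $c_i$ of one eigenstate of `X` in the chosen basis.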

2.4.12. Degenerate perturbation theory

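$$H = H_0 + \lambda H' \tag{2.221}$$

If we use the eigenstates of $H_0$ as the basis, then $H_0$ corresponds to a diagonal matrix

$$H_0 |\psi_i\rangle = E_i^0 |\psi_i\rangle \tag{2.222}$$

where $i = 1, 2, 3, \dots$ and we require this to be an orthonormal basis

$$\langle\psi_i|\psi_j\rangle = \delta_{ij} \tag{2.223}$$

A matrix element of the matrix is

$$\langle\psi_i|H_0|\psi_j\rangle = \langle\psi_i|E_j^0|\psi_j\rangle = E_j^0 \langle\psi_i|\psi_j\rangle = E_j^0 \delta_{ij} \tag{2.224}$$

So,

$$H_0 \to \begin{pmatrix} E_1^0 & 0 & 0 & \dots \\ 0 & E_2^0 & 0 & \dots \\ 0 & 0 & E_3^0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{2.225}$$

This conclusion is generically true. If we use the eigenstates of an operator as our basis, then this operator is a diagonal matrix (i.e. the off-diagonal terms are all zero), and along the diagonal we just have all the eigenvalues of this quantum operator.

$$\lambda H' \to \lambda \begin{pmatrix} \langle\psi_1|H'|\psi_1\rangle & \langle\psi_1|H'|\psi_2\rangle & \langle\psi_1|H'|\psi_3\rangle & \dots \\ \langle\psi_2|H'|\psi_1\rangle & \langle\psi_2|H'|\psi_2\rangle & \langle\psi_2|H'|\psi_3\rangle & \dots \\ \langle\psi_3|H'|\psi_1\rangle & \langle\psi_3|H'|\psi_2\rangle & \langle\psi_3|H'|\psi_3\rangle & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{2.226}$$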


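In general, $H'$ is NOT a diagonal matrix

$$H = H_0 + \lambda H' \to \begin{pmatrix} E_1^0 & 0 & 0 & \dots \\ 0 & E_2^0 & 0 & \dots \\ 0 & 0 & E_3^0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} + \lambda \begin{pmatrix} \langle\psi_1|H'|\psi_1\rangle & \langle\psi_1|H'|\psi_2\rangle & \langle\psi_1|H'|\psi_3\rangle & \dots \\ \langle\psi_2|H'|\psi_1\rangle & \langle\psi_2|H'|\psi_2\rangle & \langle\psi_2|H'|\psi_3\rangle & \dots \\ \langle\psi_3|H'|\psi_1\rangle & \langle\psi_3|H'|\psi_2\rangle & \langle\psi_3|H'|\psi_3\rangle & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

$$= \begin{pmatrix} E_1^0 + \lambda \langle\psi_1|H'|\psi_1\rangle & \lambda \langle\psi_1|H'|\psi_2\rangle & \lambda \langle\psi_1|H'|\psi_3\rangle & \dots \\ \lambda \langle\psi_2|H'|\psi_1\rangle & E_2^0 + \lambda \langle\psi_2|H'|\psi_2\rangle & \lambda \langle\psi_2|H'|\psi_3\rangle & \dots \\ \lambda \langle\psi_3|H'|\psi_1\rangle & \lambda \langle\psi_3|H'|\psi_2\rangle & E_3^0 + \lambda \langle\psi_3|H'|\psi_3\rangle & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{2.227}$$

This is a very large matrix and is very hard to handle in general.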

However, if there are 2 degenerate states,

$$H_0 = \begin{pmatrix} \ddots & \vdots & \vdots & \\ \dots & E^0 & 0 & \dots \\ \dots & 0 & E^0 & \dots \\ & \vdots & \vdots & \ddots \end{pmatrix} \tag{2.228}$$

i.e., two of the eigenvalues of $H_0$ coincide, or in other words two of the numbers along the diagonal of the matrix of $H_0$ happen to be the same, then we don't need to handle the whole big matrix if $\lambda$ is small. Here, we can do degenerate perturbation theory, and to first order we can forget all other quantum states and only look at the two degenerate ones. What does this mean? Remember that, in general, a complete basis contains an infinite number of states $|\psi_i\rangle$ with $i = 1, 2, \dots, \infty$. As a result, the matrix of a quantum operator has dimension ∞×∞, $\langle\psi_i|X|\psi_j\rangle$ with $i = 1, 2, \dots, \infty$ and $j = 1, 2, \dots, \infty$. Now, if we limit ourselves to the two degenerate states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, then the matrix of our quantum operator only has dimension 2×2, because $i$ and $j$ can only be $a$ or $b$

$$X \to \begin{pmatrix} \langle\psi_a^0|X|\psi_a^0\rangle & \langle\psi_a^0|X|\psi_b^0\rangle \\ \langle\psi_b^0|X|\psi_a^0\rangle & \langle\psi_b^0|X|\psi_b^0\rangle \end{pmatrix} \tag{2.229}$$

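For $H_0$, its matrix is

$$H_0 \to \begin{pmatrix} E^0 & 0 \\ 0 & E^0 \end{pmatrix} \tag{2.230}$$

and for $H'$, the matrix is

$$H' \to \begin{pmatrix} \langle\psi_a^0|H'|\psi_a^0\rangle & \langle\psi_a^0|H'|\psi_b^0\rangle \\ \langle\psi_b^0|H'|\psi_a^0\rangle & \langle\psi_b^0|H'|\psi_b^0\rangle \end{pmatrix} \tag{2.231}$$

So our $H$ is

$$H = H_0 + \lambda H' \to \begin{pmatrix} E^0 + \lambda \langle\psi_a^0|H'|\psi_a^0\rangle & \lambda \langle\psi_a^0|H'|\psi_b^0\rangle \\ \lambda \langle\psi_b^0|H'|\psi_a^0\rangle & E^0 + \lambda \langle\psi_b^0|H'|\psi_b^0\rangle \end{pmatrix} \tag{2.232}$$

The eigenvalues of this matrix are $E^0 + \lambda E_+$ and $E^0 + \lambda E_-$, and the eigenvectors are the same as those we computed in the previous sections.

Bottom line: for degenerate perturbation theory, we can drop all other states (with different eigenenergies) and consider a much smaller Hilbert space (only the degenerate states are considered here). Then our Hamiltonian becomes a very small matrix, and we can diagonalize this small matrix. The eigenvalues are the eigenenergies to first order, and the eigenvectors give us the eigenwavefunctions to zeroth order.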
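To see how well the truncation to the degenerate subspace works, one can compare it with exact diagonalization of a small toy Hamiltonian. The sketch below assumes NumPy; the six-level spectrum, the random perturbation and the value of `lam` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Toy unperturbed spectrum: two degenerate levels at E0 = 1, the rest far away.
E0_levels = np.array([1.0, 1.0, 3.0, 4.5, 6.0, 8.0])
H0 = np.diag(E0_levels)

# Random Hermitian perturbation H' (hypothetical, for illustration only).
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Hp = (A + A.conj().T) / 2

lam = 1e-3

# "Exact" answer: the two lowest eigenvalues of the full 6x6 matrix.
exact = np.linalg.eigvalsh(H0 + lam * Hp)[:2]

# First-order degenerate perturbation theory: keep only the 2x2 block of H'.
W = Hp[:2, :2]
first_order = 1.0 + lam * np.linalg.eigvalsh(W)   # E0 + lam*E_-, E0 + lam*E_+

error = np.max(np.abs(exact - first_order))
print(exact, first_order, error)
```

The difference between the two results should be of order $\lambda^2$, which is the contribution of the discarded states at second order.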

2.4.13. n-fold degeneracy

If $H_0$ has an n-fold degeneracy, and we want to do perturbation theory for these n degenerate states, we just ignore all other states and only keep these n states.

$$H_0 \to \begin{pmatrix} E^0 & 0 & 0 & \dots \\ 0 & E^0 & 0 & \dots \\ 0 & 0 & E^0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}_{n \times n} \tag{2.233}$$

and $H'$ is an n×n matrix with matrix elements $\langle\psi_i^0|H'|\psi_j^0\rangle$, where $i = 1, \dots, n$ and $j = 1, \dots, n$

Then, there are two (equivalent) ways to do the calculation

Option #1: compute the eigenvalues of the n×n matrix of $H'$, $E_1^1, \dots, E_n^1$. Then the eigenenergy including the first-order correction is

$$E_i = E^0 + \lambda E_i^1 + O(\lambda^2) \tag{2.234}$$

where $i = 1, 2, \dots, n$.

Option #2: directly compute the eigenvalues of the n×n matrix of $H = H_0 + \lambda H'$. You will find n eigenvalues, which are

$$E^0 + \lambda E_i^1 \tag{2.235}$$

where $i = 1, 2, \dots, n$.

2.4.14. Nearly-degenerate perturbation theory

What if we have two states that are not exactly degenerate, but nearly degenerate, i.e. two eigenstates of $H_0$, $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, have very similar energies $E_a^0 \sim E_b^0$, but not exactly the same?

Case 1. If $|\lambda H'| \ll |E_a^0 - E_b^0|$, we can do non-degenerate perturbation theory.

Case 2. If $|\lambda H'| \gg |E_a^0 - E_b^0|$, but $|\lambda H'| \ll |E_a^0 - E_m^0|$ and $|\lambda H'| \ll |E_b^0 - E_m^0|$ for any other eigenstate of $H_0$, where $E_m^0$ represents the eigenenergy of another eigenstate of $H_0$ (beyond $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$), we can do nearly-degenerate perturbation theory.

Here, the procedure is similar to degenerate perturbation theory: we ignore all other states and only consider $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$. Now every quantum operator becomes a 2×2 matrix.

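$$H_0 \to \begin{pmatrix} E_a^0 & 0 \\ 0 & E_b^0 \end{pmatrix} \tag{2.236}$$

and for $H'$, the matrix is

$$H' \to \begin{pmatrix} \langle\psi_a^0|H'|\psi_a^0\rangle & \langle\psi_a^0|H'|\psi_b^0\rangle \\ \langle\psi_b^0|H'|\psi_a^0\rangle & \langle\psi_b^0|H'|\psi_b^0\rangle \end{pmatrix} = \begin{pmatrix} V_{aa} & V_{ab} \\ V_{ba} & V_{bb} \end{pmatrix} \tag{2.237}$$

Notice that $H_0$ has two different diagonal components, $E_a^0 \neq E_b^0$. This is the difference between degenerate and nearly-degenerate perturbation theory. Now we consider $H$, which is

$$H = H_0 + \lambda H' \to \begin{pmatrix} E_a^0 + \lambda V_{aa} & \lambda V_{ab} \\ \lambda V_{ba} & E_b^0 + \lambda V_{bb} \end{pmatrix} \tag{2.238}$$

Then we can get the eigenvalues of this matrix, and these are the eigenvalues of $H$ up to first order in perturbation theory.

Case 3. If $\lambda H'$ is too large, even larger than $|E_a^0 - E_m^0|$ and $|E_b^0 - E_m^0|$, then $\lambda H'$ cannot be considered a perturbation and, as a result, we cannot do perturbation theory anymore.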

This result can be easily generalized to cases where we have more than 2 nearly-degenerate states.
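As a worked illustration, the 2×2 matrix in Eq. (2.238) can be diagonalized in closed form with the same quadratic-formula step used for $W$ in Sec. 2.4.8. The shorthand $\bar{E}$ and $\Delta$ is introduced here only for convenience and does not appear elsewhere in these notes; $\lambda > 0$ is assumed in the first limit.

```latex
\bar{E} = \frac{E_a^0 + E_b^0 + \lambda (V_{aa} + V_{bb})}{2}, \qquad
\Delta = E_a^0 - E_b^0 + \lambda (V_{aa} - V_{bb})

E_\pm = \bar{E} \pm \sqrt{\frac{\Delta^2}{4} + \lambda^2 |V_{ab}|^2}

% Limit 1: exact degeneracy, E_a^0 = E_b^0 = E^0.
% The square root becomes (\lambda/2) \sqrt{(V_{aa} - V_{bb})^2 + 4 |V_{ab}|^2},
% so E_\pm = E^0 + \lambda \times (eigenvalues of V), the degenerate result.

% Limit 2: |\Delta| \gg \lambda |V_{ab}| with \Delta > 0.
% Expanding the square root to lowest order in \lambda |V_{ab}| / \Delta:
E_+ \approx E_a^0 + \lambda V_{aa} + \frac{\lambda^2 |V_{ab}|^2}{\Delta}, \qquad
E_- \approx E_b^0 + \lambda V_{bb} - \frac{\lambda^2 |V_{ab}|^2}{\Delta}
```

In the second limit the leading terms coincide with non-degenerate perturbation theory restricted to these two states, which is Case 1 above; the closed form interpolates smoothly between Case 1 and the exactly degenerate result.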

2.4.15. Philosophy behind degenerate and nearly-degenerate perturbation theory

Assume that $H_0$ has n eigenstates $|\psi_a^0\rangle, |\psi_b^0\rangle, \dots$ with very similar (or exactly the same) eigenenergies, all near $E^0$, while all other eigenstates of $H_0$ have energies very different from these (a state $|\psi_m\rangle$ that is not one of the nearly degenerate states has eigenenergy $E_m^0$ very different from $E^0$). Suppose we start our system in one (or some superposition) of these n states, and then perturb the Hamiltonian by a small amount, $H = H_0 + \lambda H'$ with $\lambda$ very small. Because any state $|\psi_m\rangle$ has an energy very different from $E^0$, when the perturbation is small it is (almost) impossible for the system to reach a state $|\psi_m\rangle$ from a state with energy $E^0$.

Note: for a classical system, this would be totally impossible due to energy conservation. In a classical system, if we start from a state with energy $E^0$ and then add a small amount of energy $\delta E$ to the system, the final state must have energy $E^0 + \delta E$, which is very close to $E^0$ if $\delta E$ is small. So it is absolutely impossible to have a final state with energy very different from $E^0$. But for a quantum system, anything is possible (think about quantum tunneling: classically it is impossible, but for a quantum system it becomes possible). However, we know that in quantum mechanics the probability of reaching such a final state is small (although not exactly zero). Since the probability is small, to leading order we can ignore it. This is the key reason why we can ignore all those states $|\psi_m^0\rangle$.

Since it is highly unlikely to reach $|\psi_m\rangle$, we can ignore these states in the leading-order approximation. After we ignore all of them, our Hilbert space becomes very small, with only n quantum states, and thus our quantum operators become n×n matrices. If n is 2, we can easily find the eigenvalues. If n = 3 or 4, we can get the eigenvalues (in analytic form) with a little bit of help (e.g. software like Mathematica). If n > 4 but not extremely large, say a couple of hundred or smaller, we can easily get the eigenvalues numerically using available software packages. If n is 10 or 100 million, it is not easy to get all the eigenvalues of the system, but we can easily get the smallest several or the largest several numerically using techniques like the Lanczos algorithm.

After taking care of the n×n matrices, we may be able to take $|\psi_m^0\rangle$ back into consideration by going to higher orders in the perturbation theory.

2.4.16. Example: (textbook page 262)

Consider a 3D infinite cubical potential well

$$V(x, y, z) = \begin{cases} 0 & \text{if } 0 < x < a,\ 0 < y < a \text{ and } 0 < z < a \\ +\infty & \text{otherwise} \end{cases} \tag{2.239}$$

The Hamiltonian for this system is

$$H_0 = \frac{P^2}{2m} + V(x, y, z) = -\frac{\hbar^2}{2m} \nabla^2 + V(x, y, z) \tag{2.240}$$

The eigenstates of $H_0$ are sine waves

$$\psi^0_{n_x n_y n_z}(x, y, z) = \left(\frac{2}{a}\right)^{3/2} \sin\left(\frac{n_x \pi}{a} x\right) \sin\left(\frac{n_y \pi}{a} y\right) \sin\left(\frac{n_z \pi}{a} z\right) \tag{2.241}$$

where $n_x$, $n_y$ and $n_z$ are positive integers. The eigenenergy for such a state is

$$E^0_{n_x n_y n_z} = \frac{\pi^2 \hbar^2}{2 m a^2} \left(n_x^2 + n_y^2 + n_z^2\right) \tag{2.242}$$

The ground state is obviously $n_x = n_y = n_z = 1$, which has energy $E^0_{111} = \frac{3 \pi^2 \hbar^2}{2 m a^2}$. For simplicity, we will call this energy $E_0^0 = \frac{3 \pi^2 \hbar^2}{2 m a^2}$, where the subscript 0 represents the ground state.

There are three (degenerate) first excited states, with $(n_x, n_y, n_z)$ being (1, 1, 2), (1, 2, 1) or (2, 1, 1).

$$E^0_{112} = E^0_{121} = E^0_{211} = \frac{\pi^2 \hbar^2}{2 m a^2} \left(1 + 1 + 2^2\right) = \frac{3 \pi^2 \hbar^2}{m a^2} \tag{2.243}$$

For simplicity, we will call the energy of these first excited states $E_1^0$, where the subscript 1 means the first excited states. In addition, for simplicity, we define

$$\psi_a = \psi_{112}, \quad \psi_b = \psi_{121}, \quad \psi_c = \psi_{211} \tag{2.244}$$

Now, consider a perturbation

$$H = H_0 + \lambda H' \tag{2.245}$$

where

$$H' = \begin{cases} V_0 & \text{if } 0 < x < a/2,\ 0 < y < a/2 \text{ and } 0 < z < a \\ 0 & \text{otherwise} \end{cases} \tag{2.246}$$

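For the ground state, we do non-degenerate perturbation theory, because there is no degeneracy, and the first-order correction to the energy is

$$E^1_{111} = \langle\psi_{111}|H'|\psi_{111}\rangle = \int dx\, dy\, dz\, \psi^0_{111}(x, y, z)^* H' \psi^0_{111}(x, y, z) = V_0 \int_0^{a/2} dx \int_0^{a/2} dy \int_0^a dz\, \left|\psi^0_{111}(x, y, z)\right|^2$$

$$= V_0 \left(\frac{2}{a}\right)^3 \int_0^{a/2} dx \sin^2\left(\frac{\pi}{a} x\right) \int_0^{a/2} dy \sin^2\left(\frac{\pi}{a} y\right) \int_0^a dz \sin^2\left(\frac{\pi}{a} z\right) = V_0 \left(\frac{2}{a}\right)^3 \times \frac{1}{2} \frac{a}{2} \times \frac{1}{2} \frac{a}{2} \times \frac{1}{2} a = \frac{V_0}{4} \tag{2.247}$$

So the energy of the ground state now becomes

$$E_{111} = E^0_{111} + \lambda E^1_{111} + \dots = \frac{3 \pi^2 \hbar^2}{2 m a^2} + \lambda \frac{V_0}{4} + \dots \tag{2.248}$$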

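For the first excited states, there is a three-fold degeneracy, so we need to define a 3×3 matrix

$$W = \begin{pmatrix} W_{aa} & W_{ab} & W_{ac} \\ W_{ba} & W_{bb} & W_{bc} \\ W_{ca} & W_{cb} & W_{cc} \end{pmatrix} = \begin{pmatrix} \langle\psi_a|H'|\psi_a\rangle & \langle\psi_a|H'|\psi_b\rangle & \langle\psi_a|H'|\psi_c\rangle \\ \langle\psi_b|H'|\psi_a\rangle & \langle\psi_b|H'|\psi_b\rangle & \langle\psi_b|H'|\psi_c\rangle \\ \langle\psi_c|H'|\psi_a\rangle & \langle\psi_c|H'|\psi_b\rangle & \langle\psi_c|H'|\psi_c\rangle \end{pmatrix} \tag{2.249}$$

For the diagonal terms

$$W_{aa} = \langle\psi_a|H'|\psi_a\rangle = \int dx\, dy\, dz\, \psi^0_{112}(x, y, z)^* H' \psi^0_{112}(x, y, z) = V_0 \int_0^{a/2} dx \int_0^{a/2} dy \int_0^a dz\, \left|\psi^0_{112}(x, y, z)\right|^2$$

$$= V_0 \left(\frac{2}{a}\right)^3 \int_0^{a/2} dx \sin^2\left(\frac{\pi}{a} x\right) \int_0^{a/2} dy \sin^2\left(\frac{\pi}{a} y\right) \int_0^a dz \sin^2\left(\frac{2\pi}{a} z\right) = V_0 \left(\frac{2}{a}\right)^3 \times \frac{1}{2} \frac{a}{2} \times \frac{1}{2} \frac{a}{2} \times \frac{1}{2} a = \frac{V_0}{4} \tag{2.250}$$

Similarly, we can show that

$$W_{aa} = W_{bb} = W_{cc} = V_0 / 4 \tag{2.251}$$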

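For $W_{ab}$, we have

$$W_{ab} = \langle\psi_a|H'|\psi_b\rangle = \int dx\, dy\, dz\, \psi^0_{112}(x, y, z)^* H' \psi^0_{121}(x, y, z)$$

$$= V_0 \left(\frac{2}{a}\right)^3 \int_0^{a/2} dx \sin^2\left(\frac{\pi}{a} x\right) \int_0^{a/2} dy \sin\left(\frac{\pi}{a} y\right) \sin\left(\frac{2\pi}{a} y\right) \int_0^a dz \sin\left(\frac{\pi}{a} z\right) \sin\left(\frac{2\pi}{a} z\right) \tag{2.252}$$

One can show that the last integral, $\int_0^a dz \sin\left(\frac{\pi}{a} z\right) \sin\left(\frac{2\pi}{a} z\right)$, is zero. So $W_{ab} = 0$. Similarly, $W_{ac} = 0$, and because $W$ is Hermitian, $W_{ba} = W_{ca} = 0$ as well.

Finally, for $W_{bc}$

$$W_{bc} = \langle\psi_b|H'|\psi_c\rangle = \int dx\, dy\, dz\, \psi^0_{121}(x, y, z)^* H' \psi^0_{211}(x, y, z)$$

$$= V_0 \left(\frac{2}{a}\right)^3 \int_0^{a/2} dx \sin\left(\frac{\pi}{a} x\right) \sin\left(\frac{2\pi}{a} x\right) \int_0^{a/2} dy \sin\left(\frac{\pi}{a} y\right) \sin\left(\frac{2\pi}{a} y\right) \int_0^a dz \sin^2\left(\frac{\pi}{a} z\right) = \frac{16}{9 \pi^2} V_0 \tag{2.253}$$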

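Thus, we have

$$W = \begin{pmatrix} \frac{V_0}{4} & 0 & 0 \\ 0 & \frac{V_0}{4} & \frac{16}{9 \pi^2} V_0 \\ 0 & \frac{16}{9 \pi^2} V_0 & \frac{V_0}{4} \end{pmatrix} \tag{2.254}$$

This matrix has three eigenvalues (all real numbers). The equation for the eigenvalues is

$$\det(E I - W) = 0 \tag{2.255}$$

$$\det \begin{pmatrix} E - \frac{V_0}{4} & 0 & 0 \\ 0 & E - \frac{V_0}{4} & -\frac{16}{9 \pi^2} V_0 \\ 0 & -\frac{16}{9 \pi^2} V_0 & E - \frac{V_0}{4} \end{pmatrix} = 0 \tag{2.256}$$

And thus

$$\left(E - \frac{V_0}{4}\right) \left[\left(E - \frac{V_0}{4}\right)^2 - \left(\frac{16}{9 \pi^2} V_0\right)^2\right] = 0 \tag{2.257}$$

$$\left(E - \frac{V_0}{4}\right) \left(E - \frac{V_0}{4} + \frac{16}{9 \pi^2} V_0\right) \left(E - \frac{V_0}{4} - \frac{16}{9 \pi^2} V_0\right) = 0 \tag{2.258}$$

So the solutions are

$$E_1^1 = \frac{V_0}{4} - \frac{16}{9 \pi^2} V_0 = \frac{V_0}{4} \left[1 - \left(\frac{8}{3\pi}\right)^2\right] \tag{2.259}$$

$$E_2^1 = \frac{V_0}{4} \tag{2.260}$$

$$E_3^1 = \frac{V_0}{4} + \frac{16}{9 \pi^2} V_0 = \frac{V_0}{4} \left[1 + \left(\frac{8}{3\pi}\right)^2\right] \tag{2.261}$$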

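So the energies of the first excited states are

$$E_1 = \begin{cases} E_1^0 + \lambda E_1^1 + \dots = \frac{3 \pi^2 \hbar^2}{m a^2} + \lambda \frac{V_0}{4} \left[1 - \left(\frac{8}{3\pi}\right)^2\right] & \text{state \#1} \\ E_1^0 + \lambda E_2^1 + \dots = \frac{3 \pi^2 \hbar^2}{m a^2} + \lambda \frac{V_0}{4} & \text{state \#2} \\ E_1^0 + \lambda E_3^1 + \dots = \frac{3 \pi^2 \hbar^2}{m a^2} + \lambda \frac{V_0}{4} \left[1 + \left(\frac{8}{3\pi}\right)^2\right] & \text{state \#3} \end{cases} \tag{2.262}$$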

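Now, for the eigenvectors of the $W$ matrix

$$\begin{pmatrix} \frac{V_0}{4} & 0 & 0 \\ 0 & \frac{V_0}{4} & \frac{16}{9 \pi^2} V_0 \\ 0 & \frac{16}{9 \pi^2} V_0 & \frac{V_0}{4} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} = E_1^1 \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} = \frac{V_0}{4} \left[1 - \left(\frac{8}{3\pi}\right)^2\right] \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} \tag{2.263}$$

$$\frac{V_0}{4} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \left(\frac{8}{3\pi}\right)^2 \\ 0 & \left(\frac{8}{3\pi}\right)^2 & 1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} = \frac{V_0}{4} \left[1 - \left(\frac{8}{3\pi}\right)^2\right] \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} \tag{2.264}$$

By canceling $V_0 / 4$ on both sides, we get

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \left(\frac{8}{3\pi}\right)^2 \\ 0 & \left(\frac{8}{3\pi}\right)^2 & 1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} = \left[1 - \left(\frac{8}{3\pi}\right)^2\right] \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} \tag{2.265}$$

It means that

$$\alpha_1 = \left[1 - \left(\frac{8}{3\pi}\right)^2\right] \alpha_1 \tag{2.266}$$

$$\beta_1 + \left(\frac{8}{3\pi}\right)^2 \gamma_1 = \left[1 - \left(\frac{8}{3\pi}\right)^2\right] \beta_1 \tag{2.267}$$

$$\left(\frac{8}{3\pi}\right)^2 \beta_1 + \gamma_1 = \left[1 - \left(\frac{8}{3\pi}\right)^2\right] \gamma_1 \tag{2.268}$$

The first equation means $\alpha_1 = 0$ and the last two mean $\beta_1 = -\gamma_1$. In addition, we know that the normalization condition requires $|\alpha_1|^2 + |\beta_1|^2 + |\gamma_1|^2 = 1$, so

$$\begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} \tag{2.269}$$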

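So,

$$\psi_1(x, y, z) = \alpha_1 \psi_a + \beta_1 \psi_b + \gamma_1 \psi_c = \frac{\psi_b - \psi_c}{\sqrt{2}} = \frac{\psi_{121} - \psi_{211}}{\sqrt{2}} = \left(\frac{2}{a}\right)^{3/2} \frac{\sin\left(\frac{\pi}{a} x\right) \sin\left(\frac{2\pi}{a} y\right) - \sin\left(\frac{2\pi}{a} x\right) \sin\left(\frac{\pi}{a} y\right)}{\sqrt{2}} \sin\left(\frac{\pi}{a} z\right) \tag{2.270}$$

Using the same approach, we find that

$$\begin{pmatrix} \alpha_2 \\ \beta_2 \\ \gamma_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \tag{2.271}$$

and thus

$$\psi_2(x, y, z) = \alpha_2 \psi_a + \beta_2 \psi_b + \gamma_2 \psi_c = \psi_a = \psi_{112} = \left(\frac{2}{a}\right)^{3/2} \sin\left(\frac{\pi}{a} x\right) \sin\left(\frac{\pi}{a} y\right) \sin\left(\frac{2\pi}{a} z\right) \tag{2.272}$$

Finally, for the third eigenstate, we can use the same method to show that

$$\begin{pmatrix} \alpha_3 \\ \beta_3 \\ \gamma_3 \end{pmatrix} = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix} \tag{2.273}$$

and thus

$$\psi_3(x, y, z) = \alpha_3 \psi_a + \beta_3 \psi_b + \gamma_3 \psi_c = \frac{\psi_b + \psi_c}{\sqrt{2}} = \frac{\psi_{121} + \psi_{211}}{\sqrt{2}} = \left(\frac{2}{a}\right)^{3/2} \frac{\sin\left(\frac{\pi}{a} x\right) \sin\left(\frac{2\pi}{a} y\right) + \sin\left(\frac{2\pi}{a} x\right) \sin\left(\frac{\pi}{a} y\right)}{\sqrt{2}} \sin\left(\frac{\pi}{a} z\right) \tag{2.274}$$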
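The matrix elements and eigenvalues of this example can be cross-checked numerically. The following is a rough sketch assuming NumPy, with units chosen so that $a = V_0 = 1$; the helper `overlap_1d` and the midpoint-rule step count are illustrative choices rather than anything prescribed by the textbook.

```python
import math
import numpy as np

a, V0 = 1.0, 1.0   # illustrative units

def overlap_1d(n, m, upper, steps=20000):
    """Midpoint-rule integral of (2/a) sin(n pi x/a) sin(m pi x/a) over [0, upper]."""
    h = upper / steps
    x = (np.arange(steps) + 0.5) * h
    return (2.0 / a) * np.sum(np.sin(n * np.pi * x / a) * np.sin(m * np.pi * x / a)) * h

# Quantum numbers (nx, ny, nz) of psi_a, psi_b, psi_c.
states = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]

W = np.zeros((3, 3))
for i, (nx, ny, nz) in enumerate(states):
    for j, (mx, my, mz) in enumerate(states):
        # H' = V0 on 0<x<a/2, 0<y<a/2, 0<z<a, so the 3D integral factorizes.
        W[i, j] = (V0 * overlap_1d(nx, mx, a / 2)
                      * overlap_1d(ny, my, a / 2)
                      * overlap_1d(nz, mz, a))

kappa = 16.0 / (9.0 * math.pi**2)
expected = np.array([V0 / 4 - kappa * V0, V0 / 4, V0 / 4 + kappa * V0])
numeric = np.linalg.eigvalsh(W)   # ascending order
print(W)
print(numeric, expected)
```

The printed matrix should reproduce the block structure of Eq. (2.254): $\psi_a$ decouples, while $\psi_b$ and $\psi_c$ are mixed by the off-diagonal element $16 V_0 / (9\pi^2)$.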

2.4.17. A small trick for finding ψ1 and ψ2

We use the two-fold degenerate case as an example here, but the conclusion can be easily generalized.

As we have shown above, the key in degenerate perturbation theory is to find a good basis, such that $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$. In the most general situation, we start from a set of states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$. If it is a good set already (i.e., $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$), we don't need to find another basis. We can just use them and

$$E = E^0 + \lambda \langle\psi_a^0|H'|\psi_a^0\rangle + O(\lambda^2) \tag{2.275}$$

and

$$E = E^0 + \lambda \langle\psi_b^0|H'|\psi_b^0\rangle + O(\lambda^2) \tag{2.276}$$

In general, we would not be so lucky, i.e., if we just randomly choose a set of states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, the chance for this basis to be a good one is extremely low (we will in general have $\langle\psi_a^0|H'|\psi_b^0\rangle \neq 0$). Is there a way to help us pick $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$? The answer is yes, in some cases.

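If there is another quantum operator $A$, which commutes with both $H_0$ and $H'$, then we can use the common eigenstates of $A$ and $H_0$ as the basis for $H_0$

$$ H_0\,|\psi_a^0\rangle = E^0\,|\psi_a^0\rangle \tag{2.277} $$

$$ H_0\,|\psi_b^0\rangle = E^0\,|\psi_b^0\rangle \tag{2.278} $$

$$ A\,|\psi_a^0\rangle = A_a\,|\psi_a^0\rangle \tag{2.279} $$

$$ A\,|\psi_b^0\rangle = A_b\,|\psi_b^0\rangle \tag{2.280} $$

In particular, if $A_a \neq A_b$, then $\psi_a^0$ and $\psi_b^0$ are already a good basis.

To see this, we just need to prove that $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$.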

Because we have assumed that $[A, H'] = A H' - H' A = 0$,

$$
0 = \langle\psi_a^0|\,(A H' - H' A)\,|\psi_b^0\rangle = \langle\psi_a^0|A H'|\psi_b^0\rangle - \langle\psi_a^0|H' A|\psi_b^0\rangle = A_a\langle\psi_a^0|H'|\psi_b^0\rangle - A_b\langle\psi_a^0|H'|\psi_b^0\rangle = (A_a - A_b)\,\langle\psi_a^0|H'|\psi_b^0\rangle \tag{2.281}
$$

If $A_a \neq A_b$, this equation means that $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$.

In this situation, although we have a degeneracy, one can just do non-degenerate perturbation theory for $\psi_a^0$ and $\psi_b^0$ (separately), and there will be no singularities at all.
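The argument above can be checked on a small numerical example. The sketch below is illustrative only: the 2×2 matrices, the eigenvalues `A_a` and `A_b`, and all helper names are assumptions chosen for the demonstration, not part of the lecture. It works in the eigenbasis of A and shows that the off-diagonal element of the commutator [A, H'] equals (A_a - A_b) times the off-diagonal element of H', so a vanishing commutator with A_a ≠ A_b forces that element of H' to vanish.

```python
# Illustrative check of Eq. (2.281) with hypothetical 2x2 matrices.

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(x, y):
    """Return [x, y] = x y - y x."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

# A is diagonal in the basis (psi_a, psi_b), with distinct eigenvalues.
A_a, A_b = 1.0, 2.0
A = [[A_a, 0.0], [0.0, A_b]]

def commutator_offdiag(h_ab):
    """Element <a|[A, H']|b> for a real symmetric H' with <a|H'|b> = h_ab."""
    h_prime = [[0.3, h_ab], [h_ab, -0.7]]  # diagonal entries are arbitrary
    return commutator(A, h_prime)[0][1]

for h_ab in (0.5, 0.0):
    # The two printed numbers agree: <a|[A,H']|b> = (A_a - A_b) <a|H'|b>.
    print(h_ab, commutator_offdiag(h_ab), (A_a - A_b) * h_ab)
```

Only the choice `h_ab = 0.0` makes the commutator vanish here, which is the statement of Eq. (2.281) for this toy case.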

2.5. The fine structure of a hydrogen atom

2.5.1. Relativistic correction

In QM I, we solved an ideal model for a hydrogen atom (i.e. a particle in a $1/r$ potential). That model misses some of the physics of a real hydrogen atom, and one of the missing pieces is relativistic effects.

Q: What is the energy of a particle, if the particle is moving at speed $v$ and has rest mass $m$?

$$ E = M c^2 = \frac{m}{\sqrt{1 - \frac{v^2}{c^2}}}\,c^2 \tag{2.282} $$

Q: What is the momentum of a particle, if the particle is moving at speed $v$ and has rest mass $m$?

$$ p = M v = \frac{m}{\sqrt{1 - \frac{v^2}{c^2}}}\,v \tag{2.283} $$

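As a result,

$$ E = \sqrt{p^2 c^2 + m^2 c^4} \tag{2.284} $$

To prove this relation, we start from the r.h.s.,

$$
\sqrt{p^2 c^2 + m^2 c^4}
= \sqrt{\frac{m^2 v^2}{1 - \frac{v^2}{c^2}}\,c^2 + m^2 c^4}
= \sqrt{\frac{m^2 v^2 c^2}{1 - \frac{v^2}{c^2}} + \frac{m^2 c^4\left(1 - \frac{v^2}{c^2}\right)}{1 - \frac{v^2}{c^2}}}
= \sqrt{\frac{m^2 v^2 c^2 + m^2 c^4 - m^2 c^2 v^2}{1 - \frac{v^2}{c^2}}}
= \sqrt{\frac{m^2 c^4}{1 - \frac{v^2}{c^2}}}
= \frac{m}{\sqrt{1 - \frac{v^2}{c^2}}}\,c^2 = E \tag{2.285}
$$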

This relation between $E$ and $p$ is a very important relation in relativistic physics!

Q: What is kinetic energy?

A: First, find the energy of a particle when it is not moving ($p = 0$). Then measure the energy again when it is moving (with momentum $p$). The energy difference between them is the kinetic energy of this particle.

$$ T = E - m c^2 = \sqrt{p^2 c^2 + m^2 c^4} - m c^2 = m c^2\left[\sqrt{1 + \left(\frac{p}{m c}\right)^2} - 1\right] \tag{2.286} $$

When the particle is moving at low velocity ($v \ll c$), $p \ll m c$, and thus $p/(m c) \ll 1$. As a result, we can use the following expansion

$$ \sqrt{1 + x} = 1 + \frac{x}{2} - \frac{x^2}{8} + \dots \tag{2.287} $$

i.e.,

$$ \sqrt{1 + x} - 1 = \frac{x}{2} - \frac{x^2}{8} + \dots \tag{2.288} $$

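So,

$$ T = m c^2\left[\frac{1}{2}\left(\frac{p}{m c}\right)^2 - \frac{1}{8}\left(\frac{p}{m c}\right)^4 + \dots\right] = \frac{p^2}{2 m} - \frac{p^4}{8 m^3 c^2} + \dots \tag{2.289} $$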
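As a quick numerical sanity check of the expansion, the sketch below compares the exact kinetic energy with the two-term series. The units (m = c = 1), the sample momenta, and the function names are assumptions made for illustration only.

```python
import math

def kinetic_exact(p, m, c):
    """Exact relativistic kinetic energy T = sqrt(p^2 c^2 + m^2 c^4) - m c^2.

    Evaluated as p^2 c^2 / (E + m c^2), which is algebraically identical
    but avoids subtracting two nearly equal numbers at small p.
    """
    energy = math.sqrt(p * p * c * c + m * m * c ** 4)
    return p * p * c * c / (energy + m * c * c)

def kinetic_expanded(p, m, c):
    """First two terms of the low-momentum expansion of T."""
    return p ** 2 / (2 * m) - p ** 4 / (8 * m ** 3 * c ** 2)

m, c = 1.0, 1.0  # illustrative units
for p in (0.01, 0.1, 0.5):
    exact = kinetic_exact(p, m, c)
    series = kinetic_expanded(p, m, c)
    print(p, exact, series, exact - series)
```

The mismatch between the two grows quickly with p/(mc), which is the regime where the neglected higher-order terms matter.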

The first term here is the kinetic energy in classical mechanics. In relativistic physics, the kinetic energy is NOT just $p^2/2m$. Instead, we have a lot of corrections. These corrections are small if the particle is moving at low speed. Therefore, we can treat them as a perturbation

$$ H' = -\frac{p^4}{8 m^3 c^2} \tag{2.290} $$

NOTE: this treatment is NOT the rigorous way to combine special relativity with quantum mechanics, because it has one major flaw. At large $p$ or small $m$ (e.g. consider a very light particle), the series will diverge. This problem comes from the fact that we used a square root in the definition of the Hamiltonian. The square root is NOT an analytic function near small $x$, and thus will cause trouble (to see this, think about $f(x) = \sqrt{x}$; one can easily show that its first-order derivative diverges at $x = 0$, $\lim_{x\to 0} f'(x) \to \infty$). The correct way to do it is to use a matrix. Notice that for a matrix, a square root arises naturally (e.g., the eigenvalues of $\begin{pmatrix} m c^2 & p c \\ p c & -m c^2 \end{pmatrix}$ are $\pm\sqrt{p^2 c^2 + m^2 c^4}$; we get a square root without having any square root in the matrix). The person who figured this out is Dirac, and this is Dirac's theory for relativistic fermions.

If we ignore higher order terms, our hydrogen atom should follow this Hamiltonian

$$ H = H_0 + H' \tag{2.291} $$

where $H_0 = \frac{p^2}{2m} + V(r)$. With the perturbation $H'$, the energy of a hydrogen atom will be different from what we computed earlier. How large is the difference? This question can be answered by perturbation theory.

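We already know the energy spectrum of $H_0$,

$$ E_n^0 = -\frac{13.6\ \text{eV}}{n^2} \quad \text{with } n = 1, 2, 3, \dots \tag{2.292} $$

More precisely,

$$ E_n^0 = -\frac{1}{n^2}\,\frac{m}{2\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 \tag{2.293} $$

We often define the Bohr radius $a$ as

$$ a = \frac{\hbar^2}{m}\,\frac{4\pi\epsilon_0}{e^2} \tag{2.294} $$

And then,

$$ E_n^0 = -\frac{1}{n^2}\,\frac{1}{2}\,\frac{e^2}{4\pi\epsilon_0 a} \tag{2.295} $$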

For $E_n$, there are $n^2$ degenerate quantum states (ignoring spin for the moment) $\psi_{nlm}$, where $l$ is the angular momentum quantum number, $l = 0, 1, 2, \dots, n-1$, and $m$ is the quantum number for $L_z$, $m = -l, -l+1, \dots, 0, \dots, l-1, l$

$$ L^2\,\psi_{nlm} = \hbar^2\,l(l+1)\,\psi_{nlm} \tag{2.296} $$

$$ L_z\,\psi_{nlm} = \hbar\,m\,\psi_{nlm} \tag{2.297} $$

For $n = 1$ there is no degeneracy, and we can do non-degenerate perturbation theory. For any $n > 1$, there are $n^2$ degenerate states, and thus we should do degenerate perturbation theory. However, we are very lucky here. We don't need to worry about degenerate perturbation theory, because $\psi_{nlm}$ is already a good basis:

$$ \langle\psi_{nlm}|H'|\psi_{nl'm'}\rangle = 0 \tag{2.298} $$

if $l \neq l'$ or $m \neq m'$.

This is because both $H_0$ and $H'$ commute with $L^2$, and we can also show that both $H_0$ and $H'$ commute with $L_z$. Here, $L^2$ and $L_z$ serve as the $A$ operator that we defined in the previous section. For a fixed $n$, because the degenerate states all have different eigenvalues for $L^2$ and $L_z$ (different $l$ and $m$), $\langle\psi_{nlm}|H'|\psi_{nl'm'}\rangle = 0$. So we don't need to choose any other basis and can start with non-degenerate perturbation theory.

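The correction to the energy is (to the first order)

$$
E_{nlm}^1 = \langle\psi_{nlm}|H'|\psi_{nlm}\rangle
= -\left\langle\psi_{nlm}\right|\frac{p^4}{8 m^3 c^2}\left|\psi_{nlm}\right\rangle
= -\frac{1}{8 m^3 c^2}\langle\psi_{nlm}|p^4|\psi_{nlm}\rangle
= -\frac{1}{8 m^3 c^2}\langle\psi_{nlm}|p^2\,p^2|\psi_{nlm}\rangle
= -\frac{1}{8 m^3 c^2}\left(\langle\psi_{nlm}|p^2\right)\left(p^2|\psi_{nlm}\rangle\right) \tag{2.299}
$$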

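For $|\psi_{nlm}\rangle$, we know that

$$ H_0\,|\psi_{nlm}\rangle = E_n^0\,|\psi_{nlm}\rangle \tag{2.300} $$

$$ \left(\frac{p^2}{2m} + V\right)|\psi_{nlm}\rangle = E_n^0\,|\psi_{nlm}\rangle \tag{2.301} $$

$$ \frac{p^2}{2m}\,|\psi_{nlm}\rangle = \left(E_n^0 - V\right)|\psi_{nlm}\rangle \tag{2.302} $$

$$ p^2\,\psi_{nlm} = 2m\left(E_n^0 - V\right)\psi_{nlm} \tag{2.303} $$

$$ p^2\,|\psi_{nlm}\rangle = 2m\left(E_n^0 - V\right)|\psi_{nlm}\rangle \tag{2.304} $$

The conjugate of this equation gives

$$ \langle\psi_{nlm}|\,p^2 = \langle\psi_{nlm}|\,2m\left(E_n^0 - V\right) \tag{2.305} $$

So,

$$ \langle\psi_{nlm}|p^2\,p^2|\psi_{nlm}\rangle = \langle\psi_{nlm}|\,2m\left(E_n^0 - V\right)\,2m\left(E_n^0 - V\right)|\psi_{nlm}\rangle = 4m^2\,\langle\psi_{nlm}|\left(E_n^0 - V\right)^2|\psi_{nlm}\rangle \tag{2.306} $$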

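Here, $V = -\frac{e^2}{4\pi\epsilon_0}\frac{1}{r}$ and

$$
\begin{aligned}
\langle\psi_{nlm}|\left(E_n^0 - V\right)^2|\psi_{nlm}\rangle
&= \langle\psi_{nlm}|\left[(E_n^0)^2 - 2E_n^0 V + V^2\right]|\psi_{nlm}\rangle \\
&= \langle\psi_{nlm}|(E_n^0)^2|\psi_{nlm}\rangle - 2\langle\psi_{nlm}|E_n^0 V|\psi_{nlm}\rangle + \langle\psi_{nlm}|V(r)^2|\psi_{nlm}\rangle \\
&= (E_n^0)^2 + 2E_n^0\,\frac{e^2}{4\pi\epsilon_0}\left\langle\psi_{nlm}\right|\frac{1}{r}\left|\psi_{nlm}\right\rangle + \left(\frac{e^2}{4\pi\epsilon_0}\right)^2\left\langle\psi_{nlm}\right|\frac{1}{r^2}\left|\psi_{nlm}\right\rangle
\end{aligned} \tag{2.307}
$$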

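Without going into details, we will just show the results here

$$ \left\langle\psi_{nlm}\right|\frac{1}{r}\left|\psi_{nlm}\right\rangle = \frac{1}{n^2 a} \tag{2.308} $$

$$ \left\langle\psi_{nlm}\right|\frac{1}{r^2}\left|\psi_{nlm}\right\rangle = \frac{1}{n^3\left(l + \frac{1}{2}\right)a^2} \tag{2.309} $$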
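These two expectation values can be spot-checked numerically for a few low-lying states. The sketch below is an illustration under stated assumptions: lengths are measured in units of the Bohr radius (a = 1), the three radial functions are the standard normalized hydrogen R_nl written out by hand, and the midpoint-rule integrator with its `r_max` and `steps` parameters is a hypothetical helper, not something from the lecture.

```python
import math

def radial_expectation(radial, power, r_max=60.0, steps=60000):
    """Midpoint-rule estimate of <r^(-power)> = integral of r^2 R(r)^2 r^(-power) dr."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h  # midpoints avoid evaluating at r = 0
        total += radial(r) ** 2 * r ** (2 - power)
    return total * h

# Normalized hydrogen radial functions R_nl(r) in units where a = 1.
RADIAL = {
    (1, 0): lambda r: 2.0 * math.exp(-r),
    (2, 0): lambda r: (1.0 - r / 2.0) * math.exp(-r / 2.0) / math.sqrt(2.0),
    (2, 1): lambda r: r * math.exp(-r / 2.0) / (2.0 * math.sqrt(6.0)),
}

for (n, l), radial in RADIAL.items():
    inv_r = radial_expectation(radial, 1)
    inv_r2 = radial_expectation(radial, 2)
    # Compare with Eq. (2.308) and Eq. (2.309) at a = 1.
    print(n, l, inv_r, 1 / n ** 2, inv_r2, 1 / (n ** 3 * (l + 0.5)))
```

Each printed pair should agree to within the integration error, including the l dependence of the 1/r² average at fixed n = 2.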

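Thus,

$$
\begin{aligned}
E_{nlm}^1 &= -\frac{1}{8 m^3 c^2}\langle\psi_{nlm}|p^2\,p^2|\psi_{nlm}\rangle
= -\frac{1}{8 m^3 c^2}\,4m^2\left[(E_n^0)^2 + E_n^0\,\frac{e^2}{2\pi\epsilon_0}\,\frac{1}{n^2 a} + \left(\frac{e^2}{4\pi\epsilon_0}\right)^2\frac{1}{n^3\left(l + \frac{1}{2}\right)a^2}\right] \\
&= -\frac{1}{8 m^3 c^2}\,4m^2\left[(E_n^0)^2 + \frac{2E_n^0}{n^2}\,\frac{e^2}{4\pi\epsilon_0 a} + \frac{1}{n^3\left(l + \frac{1}{2}\right)}\left(\frac{e^2}{4\pi\epsilon_0 a}\right)^2\right]
\end{aligned} \tag{2.310}
$$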

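As we have shown earlier,

$$ E_n^0 = -\frac{1}{n^2}\,\frac{1}{2}\,\frac{e^2}{4\pi\epsilon_0 a} \tag{2.311} $$

$$ -2n^2 E_n^0 = \frac{e^2}{4\pi\epsilon_0 a} \tag{2.312} $$

$$
\begin{aligned}
E_{nlm}^1 &= -\frac{1}{2 m c^2}\left[(E_n^0)^2 + \frac{2E_n^0}{n^2}\,\frac{e^2}{4\pi\epsilon_0 a} + \frac{1}{n^3\left(l + \frac{1}{2}\right)}\left(\frac{e^2}{4\pi\epsilon_0 a}\right)^2\right] \\
&= -\frac{1}{2 m c^2}\left[(E_n^0)^2 - \frac{2E_n^0}{n^2}\,2n^2 E_n^0 + \frac{1}{n^3\left(l + \frac{1}{2}\right)}\left(-2n^2 E_n^0\right)^2\right]
= -\frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right]
\end{aligned} \tag{2.313}
$$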

So, the eigenenergy in a H atom shall be

$$ E_{nlm} = E_n^0 + E_{nlm}^1 + \dots = E_n^0 - \frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] + \dots \tag{2.314} $$

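The zeroth order term

$$ E_n^0 = -\frac{1}{2n^2}\,\frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 \tag{2.315} $$

is proportional to

$$ E_n^0 \propto \frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = \left(\frac{e^2}{4\pi\epsilon_0\hbar c}\right)^2 m c^2 \tag{2.316} $$

The prefactor $\frac{e^2}{4\pi\epsilon_0\hbar c}$ is a very important physical constant, known as the fine structure constant.

$$ \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.317} $$

So,

$$ E_n^0 \propto \alpha^2\,m c^2 \tag{2.318} $$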

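The first order term

$$ E_{nlm}^1 \propto \frac{(E_n^0)^2}{m c^2} \propto \alpha^4\,m c^2 \tag{2.319} $$

If we compare the first order and zeroth order terms,

$$ \frac{E_{nlm}^1}{E_n^0} \propto \frac{\alpha^4\,m c^2}{\alpha^2\,m c^2} = \alpha^2 = \left(\frac{1}{137.036}\right)^2 \sim \frac{1}{10\,000} \tag{2.320} $$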
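To put numbers on this estimate, the sketch below evaluates the zeroth order energy and the first order relativistic correction in eV. It is illustrative only: the electron rest energy `MC2_EV` (about 511 keV) is a standard value that does not appear in the text, and the function names are hypothetical.

```python
ALPHA = 1 / 137.036      # fine structure constant, Eq. (2.317)
MC2_EV = 510998.95       # electron rest energy m c^2 in eV (assumed standard value)

def e0(n):
    """Zeroth order energy E_n^0 = -(alpha^2 / 2 n^2) m c^2, in eV."""
    return -0.5 * ALPHA ** 2 * MC2_EV / n ** 2

def relativistic_correction(n, l):
    """First order correction of Eq. (2.313), in eV."""
    return -(e0(n) ** 2) / (2 * MC2_EV) * (4 * n / (l + 0.5) - 3)

for n, l in ((1, 0), (2, 0), (2, 1)):
    shift = relativistic_correction(n, l)
    # The ratio shift / E_n^0 is of order alpha^2.
    print(n, l, e0(n), shift, shift / e0(n))
```

For the ground state the shift comes out a little below a milli-electronvolt, compared with a binding energy of about 13.6 eV.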

So indeed, perturbation theory works here, i.e. the higher order term is much smaller than the leading order. (Remember that Taylor expansions only converge when the small parameter $\lambda$ is small enough. Here, our small parameter is $\alpha$, which is smaller than 1%.)

The relativistic correction is indeed a small correction in a H atom.

NOTE: the fine structure constant is one of the most important physical constants. It is dimensionless. It involves special relativity (it contains the speed of light), it involves quantum mechanics (it has $\hbar$ in its definition), and it also involves E&M (it has $\epsilon_0$ and $e^2$). In this section, we showed that we are lucky with the H atom: because $\alpha$ is small, the relativistic correction is indeed small and thus we can do perturbation theory. In QFT (QED), small $\alpha$ means that the interactions between particles (quantum electrodynamics) are a small perturbation. To the leading order, we can treat a particle as a free particle, and then add E&M interactions as a perturbation. Because the small parameter $\alpha$ is so small, perturbation theory in QED converges very fast. First order perturbation theory gives us an accuracy of the order $\alpha^1 \sim 10^{-2}$. Second order perturbation theory increases the accuracy to $\alpha^2 \sim 10^{-4}$. By going to 5th order in perturbation theory, we can get an accuracy of the order $\alpha^5 \sim 10^{-10}$. This is the reason why QED is such a successful theory.

2.5.2. Spin-orbit coupling

In QM I, we treated the spin of an electron as an independent quantity, independent of the orbital angular momentum. For example, in a hydrogen atom, any eigenwavefunction $\psi_{nlm}(x, y, z)$ actually represents two degenerate states: (1) one electron with wavefunction $\psi_{nlm}(x, y, z)$ and spin up, and (2) one electron with wavefunction $\psi_{nlm}(x, y, z)$ and spin down. This conclusion remains the same after we take into account the relativistic correction (now the eigenenergy depends on both $n$ and $l$, but every eigenwavefunction still represents two degenerate states once we take the spin into account).

In this section, we will consider one more effect, which was ignored previously. This effect tells us that the spin of an electron and its orbital motion are coupled together.

Warning: you may find the derivation in this section very disturbing. Multiple times, after some derivation, we will say the following without providing much justification: "by the way, this result is in fact not quite right, and we will need to throw in an extra factor of 2 to get the correct answer." The reason for these extra factors of 2 is that this section is NOT treating spin-orbit coupling in the rigorous and correct way, which requires Dirac's equation. Instead, what we are trying to do here is to use various tricks to recover Dirac's final conclusion without using Dirac's equation. These tricks (they are not rigorous at all) get some parts of the story right, but in many cases they lead to wrong results. Because we already know the right answer from Dirac's equation, whenever we find that these tricks fail to give the correct answer, we will correct it by adding some extra factor. Within our derivation, these extra factors look totally unreasonable and weird, but if one starts from Dirac's equation, the results are all very natural and straightforward. Bottom line: please don't take these derivations very seriously, because they are not supposed to give a (fully) correct description after all. But the physics, at the end of the day, is correct.

Magnetic dipole of an electron

If a charged particle moves in circles, it creates a circular current, and the circular current results in a magnetic dipole moment (according to E&M). To see this, we use a simple model. Assume that we have a ring, and there is a charged particle (with charge $q$) moving around the ring. The dipole moment is

$$ \vec{\mu} = \frac{1}{2}\,q\,\vec{r}\times\vec{v} \tag{2.321} $$

where $q$ is the charge of the particle, and $\vec{r}$ and $\vec{v}$ are the location and velocity of the particle.

$$ \vec{\mu} = \frac{1}{2}\,q\,\vec{r}\times\vec{v} = \frac{q}{2m}\,m\,\vec{r}\times\vec{v} = \frac{q}{2m}\,\vec{L} \tag{2.322} $$

where $\vec{L} = m\,\vec{r}\times\vec{v}$ is the angular momentum.

Similarly, if we have a spinning charged particle, the angular momentum from the spin will also result in a magnetic dipole. The dipole moment from spin is also proportional to the angular momentum of the spin, but with an extra factor known as the g-factor

$$ \vec{\mu} = g\,\frac{q}{2m}\,\vec{S} \tag{2.323} $$

The charge of an electron is $-e$ (negative charge), so

$$ \vec{\mu} = -g\,\frac{e}{2m}\,\vec{S} \tag{2.324} $$

and $g$ is a number whose value is really close to 2. In this course, we will say that $g = 2$ for simplicity, but in reality, $g = 2.00231930436182$. In Dirac's theory, $g$ is exactly 2. The reason the real value of $g$ is a little bit larger than 2 is the interaction between electrons and photons (light), which isn't considered in Dirac's equation.

Note: in many cases, people absorb the minus sign into the definition of $g$,

$$ \vec{\mu} = g\,\frac{e}{2m}\,\vec{S} \tag{2.325} $$

where $g = -2$. But no matter what convention one adopts,

$$ \vec{\mu} = -\frac{e}{m}\,\vec{S} \tag{2.326} $$

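We can define the Bohr magneton, which is a fundamental physical constant

$$ \mu_B = \frac{e\hbar}{2m} = 9.27400968(20)\times 10^{-24}\ \text{J/T} \tag{2.327} $$

and

$$ \vec{\mu} = g\,\frac{e}{2m}\,\vec{S} = g\,\frac{e\hbar}{2m}\,\frac{\vec{S}}{\hbar} = g\,\mu_B\,\frac{\vec{S}}{\hbar} \tag{2.328} $$

For an electron, the magnetic dipole is $\pm\mu_B$. We demonstrate this by considering the dipole moment along the z direction

$$ \mu_z = g\,\mu_B\,\frac{S_z}{\hbar} \tag{2.329} $$

The spin operator $S_z$ has eigenvalues $\pm\hbar/2$, and $g = -2$. For an eigenstate of $S_z$,

$$
\mu_z = \begin{cases}
-\mu_B & \text{if the } S_z \text{ eigenvalue is } +\hbar/2 \text{, i.e. spin up} \\
+\mu_B & \text{if the } S_z \text{ eigenvalue is } -\hbar/2 \text{, i.e. spin down}
\end{cases} \tag{2.330}
$$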

Effective B field from the nucleon

If we stand on an electron (using the electron as our reference frame), we will find that the nucleon is moving around us in a circle. Because the nucleon has positive charge $+e$, when it moves around us, it generates a circular current and thus leads to a magnetic field. According to the Biot-Savart law in E&M, the B field generated by a wire with current $I$ is

$$ \vec{B} = \frac{\mu_0 I}{4\pi}\int\frac{d\vec{l}\times\vec{r}}{r^3} \tag{2.331} $$

where $d\vec{l}$ is a small section of the wire, with direction parallel to the wire, and $\vec{r}$ is the displacement between the wire element and the place at which we want to measure the B field. For the circular motion of the nucleon, the wire here is a circle and we want to know the B field at the center of the circle

$$ B = \frac{\mu_0 I}{4\pi}\int_0^{2\pi}\frac{(r\,d\theta)\,r}{r^3} = \frac{\mu_0 I}{4\pi r}\int_0^{2\pi}d\theta = \frac{\mu_0 I}{4\pi r}\,2\pi = \frac{\mu_0 I}{2r} \tag{2.332} $$

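The current $I$ here is

$$ I = \frac{e}{T} \tag{2.333} $$

where $e$ is (the absolute value of) the charge of an electron (remember that the charge of the nucleon in a hydrogen atom is $+e$), and $T$ is the time it takes for the nucleon to go around the circle once.

$$ I = \frac{e}{T} = \frac{e}{2\pi/\omega} = \frac{e\,\omega}{2\pi} \tag{2.334} $$

Here, $\omega$ is the angular velocity. Notice that this angular velocity is the same as the angular velocity $\omega$ of the electron in the rest frame of the nucleon. With $L = m r^2\omega$ for the electron,

$$ I = \frac{e\,\omega}{2\pi} = \frac{e\,L}{2\pi m r^2} \tag{2.335} $$

So

$$ B = \frac{\mu_0 I}{2r} = \frac{\mu_0}{2}\,\frac{e\,L}{2\pi m r^3} = \epsilon_0\mu_0\,\frac{e\,L}{4\pi\epsilon_0 m r^3} \tag{2.336} $$

Notice that $\epsilon_0\mu_0 = 1/c^2$ and, in addition, it is easy to see that $\vec{B}\parallel\vec{L}$, so we get

$$ \vec{B} = \epsilon_0\mu_0\,\frac{e\,\vec{L}}{4\pi\epsilon_0 m r^3} = \frac{e}{4\pi\epsilon_0}\,\frac{\vec{L}}{m c^2 r^3} \tag{2.337} $$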

Magnetic dipole in a B field

We showed above that, in the frame of the electron, the electron feels a B field generated by the nucleon

$$ \vec{B} = \frac{e}{4\pi\epsilon_0}\,\frac{\vec{L}}{m c^2 r^3} \tag{2.338} $$

and the electron has a magnetic dipole

$$ \vec{\mu} = -\frac{e}{m}\,\vec{S} \tag{2.339} $$

A dipole in a B field has the energy

$$ H' = -\vec{\mu}\cdot\vec{B} = \frac{e^2}{4\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec{S}\cdot\vec{L} \tag{2.340} $$

Here, our naive tricks miss a factor of 1/2 in comparison to the correct result (from Dirac's equation). The right result should be

$$ H_{SO} = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec{S}\cdot\vec{L} \tag{2.341} $$

Therefore, we shall add one extra term to the Hamiltonian

$$ H = H_0 + H_{SO} \tag{2.342} $$

Here, $H_0$ is what we learned in QM I, and $H_{SO}$ is the new term. We treat $H_0$ as the unperturbed Hamiltonian, and treat $H_{SO}$ as a small perturbation.

Basis without HSO

For the orbital angular momentum, we know the commutation relations

$$ [L_x, L_y] = i\hbar L_z \tag{2.343} $$

$$ [L_y, L_z] = i\hbar L_x \tag{2.344} $$

$$ [L_z, L_x] = i\hbar L_y \tag{2.345} $$

or we can write the same formula as

$$ [L_i, L_j] = i\hbar\,\epsilon_{ijk}\,L_k \tag{2.346} $$

where $\epsilon_{ijk}$ is the Levi-Civita symbol.

For $\vec{L}$, we know that we can define the operator $L^2$

$$ L^2 = L_x^2 + L_y^2 + L_z^2 \tag{2.347} $$

And we know that it commutes with $L_z$: $[L^2, L_z] = 0$. Because the components of $\vec{L}$ do not commute with each other, we cannot measure all three components of the angular momentum at the same time, due to the uncertainty principle. But we can measure $L^2$ and $L_z$ at the same time, by defining common eigenstates for these two operators

$$ L^2\,\psi_{lm}(x, y, z) = l(l+1)\hbar^2\,\psi_{lm}(x, y, z) \tag{2.348} $$

$$ L_z\,\psi_{lm}(x, y, z) = m\hbar\,\psi_{lm}(x, y, z) \tag{2.349} $$

Here, $l$ is a non-negative integer, $l = 0, 1, 2, \dots$, and $m$ is an integer between $-l$ and $+l$. The eigenwavefunctions are very easy to write down in spherical coordinates

$$ \psi_{lm}(r, \theta, \phi) = R(r)\,Y_l^m(\theta, \phi) \tag{2.350} $$

where $R(r)$ is an arbitrary function of $r$ (the function doesn't depend on $\theta$ or $\phi$), and $Y_l^m(\theta, \phi)$ are a set of special functions known as the spherical harmonics.

For spins, we have the same commutation relations,

$$ [S_x, S_y] = i\hbar S_z \tag{2.351} $$

$$ [S_y, S_z] = i\hbar S_x \tag{2.352} $$

$$ [S_z, S_x] = i\hbar S_y \tag{2.353} $$

And again, we can define

$$ S^2 = S_x^2 + S_y^2 + S_z^2 \tag{2.354} $$

And same as above, $[S^2, S_z] = 0$, so we can measure $S^2$ and $S_z$ at the same time.

$$ S^2\,|s, m\rangle = s(s+1)\hbar^2\,|s, m\rangle \tag{2.355} $$

$$ S_z\,|s, m\rangle = m\hbar\,|s, m\rangle \tag{2.356} $$

where $s$ is a non-negative integer or half-integer, $s = 0, 1/2, 1, 3/2, \dots$ Once $s$ is determined, $m = -s, -s+1, \dots, s-1, s$. For electrons, $s = 1/2$, and thus $m = -1/2$ or $+1/2$.
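For spin 1/2 these relations can be verified directly with 2×2 matrices. The sketch below is illustrative and rests on one assumption that the text does not state explicitly: it represents the spin operators as S_i = (ħ/2)σ_i with the standard Pauli matrices, in units where ħ = 1. The helper names are hypothetical.

```python
# Spin-1/2 operators S_i = (hbar / 2) * sigma_i, in units where hbar = 1.
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(x, y):
    """Return [x, y] = x y - y x."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

def s_squared():
    """Return S^2 = Sx^2 + Sy^2 + Sz^2."""
    squares = [matmul(s, s) for s in (SX, SY, SZ)]
    return [[sum(sq[i][j] for sq in squares) for j in range(2)]
            for i in range(2)]

print(commutator(SX, SY))  # compare with i * SZ, Eq. (2.351)
print(s_squared())         # compare with s(s+1) = 3/4 times the identity
```

The same helpers can be used to check the two cyclic permutations, Eqs. (2.352) and (2.353).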

In addition, we know that $\vec{S}$ and $\vec{L}$ commute with each other,

$$ [\vec{S}, \vec{L}] = 0 \tag{2.357} $$

Without spin-orbit coupling (i.e. for the unperturbed Hamiltonian $H_0$), we can easily prove that $[H_0, \vec{S}] = [H_0, \vec{L}] = 0$. As a result, we find that the following operators commute with one another: $H_0$, $L^2$, $L_z$, $S^2$, $S_z$. So we can require our quantum states to be common eigenstates of all these operators: $|n, l, m, s, s_z\rangle$

$$ H_0\,|n, l, m, s, s_z\rangle = -\frac{13.6\ \text{eV}}{n^2}\,|n, l, m, s, s_z\rangle \tag{2.358} $$

$$ L^2\,|n, l, m, s, s_z\rangle = l(l+1)\hbar^2\,|n, l, m, s, s_z\rangle \tag{2.359} $$

$$ L_z\,|n, l, m, s, s_z\rangle = m\hbar\,|n, l, m, s, s_z\rangle \tag{2.360} $$

$$ S^2\,|n, l, m, s, s_z\rangle = \frac{3}{4}\hbar^2\,|n, l, m, s, s_z\rangle \tag{2.361} $$

$$ S_z\,|n, l, m, s, s_z\rangle = s_z\hbar\,|n, l, m, s, s_z\rangle \tag{2.362} $$

where $s_z = +1/2$ or $-1/2$. For an electron, we know that $s$ is always $1/2$, so we don't really need to write it out: $|n, l, m_l, m_s\rangle$. Compared to the results without spin ($\psi_{nlm}$), the only thing we get here is an extra index $s_z = \pm 1/2$. This quantum number tells us whether the spin is pointing up or down. At the end of the day, we didn't get anything beyond what we already knew, except that we now need to specify whether the spin of the electron is up or down.

With SO coupling, the basis described above is NOT a good option, because in general $\langle n, l, m, s_z|H_{SO}|n', l', m', s_z'\rangle \neq 0$, i.e. to do degenerate perturbation theory, we will need a new basis.

Basis with HSO

To get the proper basis, we could go through the derivation that we demonstrated for degenerate perturbation theory. Here, instead, we will use a trick to get the correct basis directly. The trick is what we proved earlier. We know that if we can find a quantum operator $A$ which commutes with both $H_0$ and the perturbation $H_{SO}$, we can use the common eigenstates of $A$ and $H_0$ as a basis. If every state in this set has a different eigenvalue for $A$, then it is a good basis for degenerate perturbation theory.

In the previous section (relativistic correction), we used $L^2$ and $L_z$ to serve as the $A$ operator. Here, after taking into account the spin and for the perturbation $H_{SO}$, we will need to use $L^2$, $S^2$, $J^2$ and $J_z$ as $A$.

If an electron has both orbital and spin angular momenta, we can add them up to get the total angular momentum

$$ \vec{J} = \vec{L} + \vec{S} \tag{2.363} $$

or equivalently,

$$ J_x = L_x + S_x \tag{2.364} $$

$$ J_y = L_y + S_y \tag{2.365} $$

$$ J_z = L_z + S_z \tag{2.366} $$

For the components of $\vec{J}$, we have the same commutation relations

$$ [J_x, J_y] = i\hbar J_z \tag{2.367} $$

$$ [J_y, J_z] = i\hbar J_x \tag{2.368} $$

$$ [J_z, J_x] = i\hbar J_y \tag{2.369} $$

And we can also define $J^2$ as

$$ J^2 = J_x^2 + J_y^2 + J_z^2 \tag{2.370} $$

Same as for $L$ and $S$, we know that $[J^2, J_z] = 0$. So we can measure $J^2$ and $J_z$ at the same time

$$ J^2\,|j, m\rangle = j(j+1)\hbar^2\,|j, m\rangle \tag{2.371} $$

$$ J_z\,|j, m\rangle = m\hbar\,|j, m\rangle \tag{2.372} $$

where $j$ is a non-negative integer or half-integer, $j = 0, 1/2, 1, 3/2, \dots$ Once $j$ is determined, $m = -j, -j+1, \dots, j-1, j$.

If we have a particle with spin quantum number $s$ and orbital angular momentum quantum number $l$, then $j = l+s, l+s-1, \dots, |l-s|$. NOTE: $j$ cannot be negative. For spin $s = 1/2$, this means that $j = l - 1/2$ or $j = l + 1/2$ for $l \geq 1$, and $j = 1/2$ if $l = 0$.

◼ If we put an electron in an s-wave state ($l = 0$), the total angular momentum is $j = 1/2$

◼ If we put an electron in a p-wave state ($l = 1$), the total angular momentum is $j = 1/2$ or $3/2$

◼ If we put an electron in a d-wave state ($l = 2$), the total angular momentum is $j = 3/2$ or $5/2$

◼ If we put an electron in an f-wave state ($l = 3$), the total angular momentum is $j = 5/2$ or $7/2$

◼ ...
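The addition rule listed above is easy to enumerate by machine. The following sketch is illustrative; the function name `allowed_j` is hypothetical, and exact fractions are used so that half-integers are represented without rounding.

```python
from fractions import Fraction

def allowed_j(l, s=Fraction(1, 2)):
    """List j = |l - s|, |l - s| + 1, ..., l + s for given l and s."""
    l, s = Fraction(l), Fraction(s)
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values

for l, name in enumerate("spdf"):
    print(name, [str(j) for j in allowed_j(l)])
```

For an electron (s = 1/2) this reproduces the s, p, d, f entries of the list; passing another value of `s` gives the rule for other spins.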

In our homework, we will show that $J^2$, $J_z$, $L^2$, $S^2$ commute with $H_{SO}$. It is also straightforward to see that $J^2$, $J_z$, $L^2$, $S^2$ all commute with $H_0$, so we can use them as our $A$ operator. In addition, it is also easy to verify that these four operators commute with each other, so we can define common eigenstates for $H_0$, $J^2$, $J_z$, $L^2$, $S^2$ and use these common eigenstates as our basis for perturbation theory

$$ H_0\,|n, l, s, j, j_z\rangle = -\frac{13.6\ \text{eV}}{n^2}\,|n, l, s, j, j_z\rangle \tag{2.373} $$

$$ L^2\,|n, l, s, j, j_z\rangle = l(l+1)\hbar^2\,|n, l, s, j, j_z\rangle \tag{2.374} $$

$$ S^2\,|n, l, s, j, j_z\rangle = s(s+1)\hbar^2\,|n, l, s, j, j_z\rangle = \frac{3}{4}\hbar^2\,|n, l, s, j, j_z\rangle \tag{2.375} $$

$$ J^2\,|n, l, s, j, j_z\rangle = j(j+1)\hbar^2\,|n, l, s, j, j_z\rangle \tag{2.376} $$

$$ J_z\,|n, l, s, j, j_z\rangle = j_z\hbar\,|n, l, s, j, j_z\rangle \tag{2.377} $$

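Notice that

$$ J^2 = \vec{J}\cdot\vec{J} = (\vec{L} + \vec{S})\cdot(\vec{L} + \vec{S}) = \vec{L}\cdot\vec{L} + \vec{S}\cdot\vec{S} + 2\,\vec{S}\cdot\vec{L} = L^2 + S^2 + 2\,\vec{S}\cdot\vec{L} \tag{2.378} $$

As a result,

$$ \vec{S}\cdot\vec{L} = \frac{J^2 - L^2 - S^2}{2} \tag{2.379} $$

So we can write our perturbation as

$$ H_{SO} = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec{S}\cdot\vec{L} = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\frac{J^2 - L^2 - S^2}{2} \tag{2.380} $$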
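Because the basis states are eigenstates of J², L² and S², the operator S·L is diagonal in this basis with eigenvalue (ħ²/2)[j(j+1) - l(l+1) - s(s+1)]. The sketch below tabulates that number in units of ħ²; the function name is hypothetical and the choice of l = 1 is only an example.

```python
from fractions import Fraction

def s_dot_l(j, l, s=Fraction(1, 2)):
    """Eigenvalue of S.L in units of hbar^2, from Eq. (2.379)."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    return (j * (j + 1) - l * (l + 1) - s * (s + 1)) / 2

l = 1
for j in (Fraction(1, 2), Fraction(3, 2)):
    # (2j + 1) is the number of j_z values in the multiplet.
    print(f"l={l}, j={j}: <S.L> = {s_dot_l(j, l)} hbar^2, multiplicity {2 * j + 1}")
```

Weighting each eigenvalue by its multiplicity 2j + 1 gives zero for l = 1, so in this example the spin-orbit term splits the p level without moving its center.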

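First order perturbation theory gives

$$
\begin{aligned}
E_{n,l,s,j,j_z}^1 &= \langle n, l, s, j, j_z|H_{SO}|n, l, s, j, j_z\rangle
= \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{2 m^2 c^2}\left\langle n, l, s, j, j_z\right|\frac{J^2 - L^2 - S^2}{r^3}\left|n, l, s, j, j_z\right\rangle \\
&= \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{2 m^2 c^2}\left\langle n, l, s, j, j_z\right|\frac{j(j+1) - l(l+1) - s(s+1)}{r^3}\,\hbar^2\left|n, l, s, j, j_z\right\rangle \\
&= \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - s(s+1)}{m^2 c^2}\left\langle n, l, s, j, j_z\right|\frac{1}{r^3}\left|n, l, s, j, j_z\right\rangle
\end{aligned} \tag{2.381}
$$

The average value of $1/r^3$ is known for the unperturbed Hamiltonian

$$ \left\langle n, l, s, j, j_z\right|\frac{1}{r^3}\left|n, l, s, j, j_z\right\rangle = \int d\vec{r}\;\psi_{n,l,m}^*(r, \theta, \phi)\,\frac{1}{r^3}\,\psi_{n,l,m}(r, \theta, \phi) = \frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} \tag{2.382} $$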

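So

$$
\begin{aligned}
E_{n,l,s,j,j_z}^1 &= \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - s(s+1)}{m^2 c^2}\left\langle n, l, s, j, j_z\right|\frac{1}{r^3}\left|n, l, s, j, j_z\right\rangle \\
&= \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{m^2 c^2}\,\frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} \\
&= \frac{e^2}{16\pi\epsilon_0}\,\frac{1}{n^4 a^2}\,\frac{\hbar^2}{a}\,\frac{1}{m^2 c^2}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l\left(l + \frac{1}{2}\right)(l+1)}\,n
\end{aligned} \tag{2.383}
$$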

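Remember, the Bohr radius is

$$ a = \frac{\hbar^2}{m}\,\frac{4\pi\epsilon_0}{e^2} \tag{2.384} $$

and the eigenenergies of the unperturbed Hamiltonian are

$$ E_n = -\frac{1}{n^2}\,\frac{e^2}{8\pi\epsilon_0 a} \tag{2.385} $$

so that $\hbar^2/a = m e^2/(4\pi\epsilon_0)$ and

$$
\begin{aligned}
E_{n,l,s,j,j_z}^1 &= \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - s(s+1)}{m^2 c^2}\left\langle n, l, s, j, j_z\right|\frac{1}{r^3}\left|n, l, s, j, j_z\right\rangle \\
&= \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{m^2 c^2}\,\frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} \\
&= \frac{e^2}{16\pi\epsilon_0}\,\frac{1}{n^4 a^2}\,\frac{m e^2}{4\pi\epsilon_0}\,\frac{1}{m^2 c^2}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l\left(l + \frac{1}{2}\right)(l+1)}\,n \\
&= \left(\frac{e^2}{8\pi\epsilon_0}\,\frac{1}{n^2 a}\right)^2\frac{1}{m c^2}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l\left(l + \frac{1}{2}\right)(l+1)}\,n
= \frac{E_n^2}{m c^2}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l\left(l + \frac{1}{2}\right)(l+1)}\,n
\end{aligned} \tag{2.386}
$$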

Recall that in the previous section, we found that the relativistic correction is (at the first order)

$$ -\frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] \tag{2.387} $$

If we combine both effects together, to the first order, the energy is

$$ E_{n,j,l,s,j_z} = E_n^0 + \frac{(E_n^0)^2}{2 m c^2}\left[\frac{2\left(j(j+1) - l(l+1) - \frac{3}{4}\right)}{l\left(l + \frac{1}{2}\right)(l+1)}\,n - \frac{4n}{l + \frac{1}{2}} + 3\right] + \dots \tag{2.388} $$

Notice that $j = l + 1/2$ or $l - 1/2$, so we have $l = j - 1/2$ or $j + 1/2$. For $l = j - 1/2$, we find that

$$ E_{n,j,l,s,j_z} = E_n^0 - \frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \dots \tag{2.389} $$

For $l = j + 1/2$, we find exactly the same result

$$ E_{n,j,l,s,j_z} = E_n^0 - \frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \dots \tag{2.390} $$

So we conclude that, no matter what, we have

$$ E_{n,j,l,s,j_z} = E_n^0 - \frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \dots \tag{2.391} $$

After taking into account both SO coupling and the relativistic correction, we find that the energy of a quantum state only depends on $n$ and $j$,

$$ E_{n,j} = E_n^0 - \frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \dots \tag{2.392} $$
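The collapse of the two l-dependent corrections into a single j-dependent formula can be confirmed with exact arithmetic. The sketch below is illustrative: it works in units of (E_n^0)²/(2mc²), the function names are hypothetical, and it is restricted to l ≥ 1 because the spin-orbit expression is of the form 0/0 at l = 0.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def combined_bracket(n, l, j):
    """Bracket of Eq. (2.388): spin-orbit plus relativistic terms (l >= 1)."""
    spin_orbit = 2 * n * (j * (j + 1) - l * (l + 1) - Fraction(3, 4)) \
        / (l * (l + HALF) * (l + 1))
    relativistic = -(Fraction(4 * n) / (l + HALF) - 3)
    return spin_orbit + relativistic

def fine_structure_bracket(n, j):
    """Bracket of Eq. (2.392), including its overall minus sign."""
    return -(Fraction(4 * n) / (j + HALF) - 3)

for n in range(2, 5):
    for l in range(1, n):
        for j in (l - HALF, l + HALF):
            print(n, l, j, combined_bracket(n, l, j), fine_structure_bracket(n, j))
```

In every row the last two columns coincide, for both j = l - 1/2 and j = l + 1/2.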

This is the fine structure correction in a hydrogen atom.

According to this formula, the fine structure correction always reduces the energy of a state by a very small fraction (the correction is of order $\alpha^2 \sim 10^{-4}$, which is a 0.01% change). The smaller $j$ is, the bigger this correction is. So s-wave states (with $l = 0$) get the largest modification. States with $l > 0$, e.g. p-wave, d-wave, etc., split into two different energy levels (with $j = l - 1/2$ and $j = l + 1/2$, and the former has lower energy than the latter).

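NOTE: the fine structure correction can also be written as

$$ E_{n,j} = E_n^0\left[1 - \frac{E_n^0}{2 m c^2}\left(\frac{4n}{j + 1/2} - 3\right) + \dots\right] \tag{2.393} $$

Because

$$ E_n^0 = -\frac{1}{2n^2}\,m c^2\left(\frac{e^2}{4\pi\epsilon_0\hbar c}\right)^2 = -\frac{\alpha^2}{2n^2}\,m c^2 \tag{2.394} $$

where $\alpha$ is the fine structure constant

$$ \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.395} $$

we can rewrite the formula as

$$ E_{n,j} = E_n^0\left[1 + \frac{\alpha^2}{4n^2}\left(\frac{4n}{j + 1/2} - 3\right) + \dots\right] = E_n^0\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \dots\right] = -\frac{13.6\ \text{eV}}{n^2}\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \dots\right] \tag{2.396} $$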
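As an illustration of the size of the effect, the sketch below evaluates the last form of the formula for the n = 2 levels. The function name is hypothetical, and the numbers inherit the rounded value 13.6 eV used in the text, so they should be read as estimates.

```python
ALPHA = 1 / 137.036   # fine structure constant, Eq. (2.395)
RYDBERG_EV = 13.6     # rounded value used in the text

def fine_structure_energy(n, j):
    """E_{n,j} in eV from Eq. (2.396), first order in alpha^2."""
    return -RYDBERG_EV / n ** 2 * (1 + ALPHA ** 2 / n ** 2 * (n / (j + 0.5) - 0.75))

e_half = fine_structure_energy(2, 0.5)
e_three_half = fine_structure_energy(2, 1.5)
print(e_half, e_three_half, e_three_half - e_half)
```

The j = 1/2 level lies below the j = 3/2 level, and the gap between them is a few times 10⁻⁵ eV, tiny compared with the 3.4 eV binding energy of the n = 2 shell.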

2.6. The Zeeman effect

In the previous section, we found that after considering relativistic effects (i.e., fine structure), the eigenenergies in a hydrogen atom only depend on the quantum numbers $n$ and $j$. In particular, the energy is independent of $j_z$.

For a fixed $j$, the states with $j_z = -j, -j+1, \dots, j-1, j$ all have the same energy, i.e. there is a $(2j+1)$-fold degeneracy.

In this section, we will show that in the presence of an external B field, this $(2j+1)$-fold degeneracy will be lifted.

In a magnetic field, the energy of a magnetic dipole is $E = -\vec{\mu}\cdot\vec{B}$. So for an atom

$$ H_Z' = -(\vec{\mu}_L + \vec{\mu}_S)\cdot\vec{B} \tag{2.397} $$

where $\vec{\mu}_L$ is the magnetic dipole moment from the orbital motion

$$ \vec{\mu}_L = -\frac{e}{2m}\,\vec{L} = -\mu_B\,\frac{\vec{L}}{\hbar} \tag{2.398} $$

and $\vec{\mu}_S$ is the magnetic dipole moment from the electron spin

$$ \vec{\mu}_S = -2\times\frac{e}{2m}\,\vec{S} = -\frac{e}{m}\,\vec{S} = -2\mu_B\,\frac{\vec{S}}{\hbar} \tag{2.399} $$

where $\mu_B = \frac{e\hbar}{2m} = 5.788\times 10^{-5}\ \text{eV/T}$ is the Bohr magneton. Here, $\vec{L}$ and $\vec{S}$ are the angular momenta from the orbital motion and the electron spin, respectively.

$$ H_Z' = \mu_B\,\frac{\vec{L} + 2\vec{S}}{\hbar}\cdot\vec{B} \tag{2.400} $$

Without loss of generality, we will set $\vec{B}$ to be along the z direction,

$$ \vec{B} = B\,\hat{z} \tag{2.401} $$

As a result,

$$ H_Z' = \mu_B B\,\frac{L_z + 2 S_z}{\hbar} \tag{2.402} $$

Consider

$$ H = H_0 + H_r' + H_{SO}' + H_Z' \tag{2.403} $$

where $H_0$ is the Hamiltonian that we studied in QM I (kinetic energy + $1/r$ attraction), $H_r'$ is the relativistic correction, $H_{SO}'$ is the SO coupling, and $H_Z' = \mu_B B\,\frac{L_z + 2 S_z}{\hbar}$.

2.6.1. Difficulty

For the Hamiltonian $H$ above, the key difficulty lies in the fact that the last two terms, $H_{SO}'$ and $H_Z'$, do not commute with each other. For $H_Z'$, we must know $L_z$ and $S_z$. However, we learned earlier that $H_{SO}'$ doesn't commute with $L_z$ and $S_z$ ($H_{SO}'$ commutes with $J_z$, but not with $S_z$ or $L_z$, as we showed in our homework). So we cannot measure $H_{SO}'$ together with $L_z$ and $S_z$ at the same time, but $H_Z'$ needs information about $L_z$ and $S_z$. This is the conflict.

NOTE: this problem comes from the g-factor of electrons. In $\vec{\mu}_L = -\mu_B\frac{\vec{L}}{\hbar}$ and $\vec{\mu}_S = -2\mu_B\frac{\vec{S}}{\hbar}$, the prefactors are DIFFERENT! (They differ by a factor of 2, i.e., the g-factor.) If there were no such extra factor $g = 2$, things would be very easy. Then $H_Z' = \mu_B B\,\frac{L_z + S_z}{\hbar} = \mu_B B\,\frac{J_z}{\hbar}$, so we would only need $J_z$. But unfortunately, $H_Z'$ is not proportional to $J_z$.

2.6.2. Strong field

When $H_Z' \gg H_{SO}'$, we can treat $H_{SO}'$ and $H_r'$ (the two are comparable, as we learned in the previous section) as perturbations, and thus our unperturbed Hamiltonian is

$$ H_0 + H_Z' \tag{2.404} $$

The eigenstates of this Hamiltonian are the same as the eigenstates of $H_0$: $|n, l, m_l, m_s\rangle$. Here, $|n, l, m_l\rangle$ are the eigenwavefunctions that we learned in QM I, and we add back the spin $S_z$ quantum number $m_s$

$$ L^2\,|n, l, m_l, m_s\rangle = \hbar^2\,l(l+1)\,|n, l, m_l, m_s\rangle \tag{2.405} $$

$$ S_z\,|n, l, m_l, m_s\rangle = m_s\hbar\,|n, l, m_l, m_s\rangle \tag{2.406} $$

$$ L_z\,|n, l, m_l, m_s\rangle = m_l\hbar\,|n, l, m_l, m_s\rangle \tag{2.407} $$

$$ H_0\,|n, l, m_l, m_s\rangle = -\frac{13.6\ \text{eV}}{n^2}\,|n, l, m_l, m_s\rangle \tag{2.408} $$

$$
\begin{aligned}
H_Z'\,|n, l, m_l, m_s\rangle &= \frac{\mu_B B}{\hbar}(L_z + 2S_z)\,|n, l, m_l, m_s\rangle
= \frac{\mu_B B}{\hbar}L_z\,|n, l, m_l, m_s\rangle + 2\,\frac{\mu_B B}{\hbar}S_z\,|n, l, m_l, m_s\rangle \\
&= \frac{\mu_B B}{\hbar}\,\hbar m_l\,|n, l, m_l, m_s\rangle + 2\,\frac{\mu_B B}{\hbar}\,\hbar m_s\,|n, l, m_l, m_s\rangle
= \mu_B B\,(m_l + 2m_s)\,|n, l, m_l, m_s\rangle
\end{aligned} \tag{2.409}
$$

So our zeroth order eigenenergy is

$$ E_{n,l,m_l,m_s}^0 = -\frac{13.6\ \text{eV}}{n^2} + \mu_B B\,(m_l + 2m_s) \tag{2.410} $$

At $B = 0$, we know that the energy is independent of $m_l$ and $m_s$, i.e., these quantum states are degenerate (at least $2\times(2l+1)$-fold degenerate). For finite $B$, however, these states split.

Example: consider the states with $n = 2$ and $l = 1$ (first excited states with orbital angular momentum quantum number $l = 1$). There, $m_l = -1, 0, +1$ and $m_s = -\frac{1}{2}$ or $+\frac{1}{2}$. At $B = 0$, all these six states are degenerate ($E = -13.6/4 = -3.4$ eV). In the presence of a strong B field,

$$
E_{n,l,m_l,m_s}^0 = \begin{cases}
-3.4\ \text{eV} + 2\mu_B B & m_l = +1 \text{ and } m_s = +1/2 \\
-3.4\ \text{eV} + \mu_B B & m_l = 0 \text{ and } m_s = +1/2 \\
-3.4\ \text{eV} & m_l = -1 \text{ and } m_s = +1/2 \text{, or } m_l = +1 \text{ and } m_s = -1/2 \\
-3.4\ \text{eV} - \mu_B B & m_l = 0 \text{ and } m_s = -1/2 \\
-3.4\ \text{eV} - 2\mu_B B & m_l = -1 \text{ and } m_s = -1/2
\end{cases} \tag{2.411}
$$
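The level pattern in this example can be generated by enumerating all (m_l, m_s) pairs and grouping them by the coefficient m_l + 2 m_s of μ_B B. The sketch below is illustrative; the function name `zeeman_shifts` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def zeeman_shifts(l):
    """Group the (m_l, m_s) states of a given l by the value of m_l + 2 m_s."""
    groups = defaultdict(list)
    for ml in range(-l, l + 1):
        for ms in (Fraction(-1, 2), Fraction(1, 2)):
            groups[ml + 2 * ms].append((ml, ms))
    return dict(groups)

shifts = zeeman_shifts(1)
for coefficient in sorted(shifts, reverse=True):
    states = ", ".join(f"(m_l={ml}, m_s={ms})" for ml, ms in shifts[coefficient])
    print(f"shift = {coefficient} mu_B B : {states}")
```

For l = 1 the six states fall into five levels, with the unshifted level doubly occupied, as in the table above.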

Now, we consider $H_{SO}'$ and $H_r'$. Because we assumed that they are much smaller than $H_0$ and $H_Z'$, we treat them as perturbations and compute the first order correction to the eigenenergy

$$ E_{n,l,m_l,m_s}^1 = \langle n, l, m_l, m_s|\,H_r' + H_{SO}'\,|n, l, m_l, m_s\rangle \tag{2.412} $$

The relativistic correction is the same as what we learned before

$$ \langle n, l, m_l, m_s|H_r'|n, l, m_l, m_s\rangle = -\frac{(E_n^0)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] \tag{2.413} $$

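Because

$$ E_n^0 = -\frac{1}{2n^2}\,\frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = -\frac{13.6}{n^2}\ \text{eV} \tag{2.414} $$

$$ \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.415} $$

we know that

$$ \frac{E_n^0}{m c^2} = -\frac{1}{2n^2}\,\frac{1}{\hbar^2 c^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = -\frac{1}{2n^2}\,\alpha^2 \tag{2.416} $$

so

$$ \langle n, l, m_l, m_s|H_r'|n, l, m_l, m_s\rangle = -\frac{E_n^0}{2}\,\frac{E_n^0}{m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] = \frac{E_n^0}{2}\,\frac{\alpha^2}{2n^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] = -13.6\ \text{eV}\,\frac{\alpha^2}{n^4}\left[\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right] \tag{2.417} $$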

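For spin-orbit coupling,

$$ \langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = \left\langle n, l, m_l, m_s\right|\frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec{S}\cdot\vec{L}\left|n, l, m_l, m_s\right\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2}\left\langle n, l, m_l, m_s\right|\frac{\vec{S}\cdot\vec{L}}{r^3}\left|n, l, m_l, m_s\right\rangle \tag{2.418} $$

Notice that in the zeroth order wavefunctions $|n, l, m_l, m_s\rangle$, the spin and orbital angular momenta are independent of each other, so

$$ \langle n, l, m_l, m_s|\,\vec{S}\cdot\vec{L}\,|n, l, m_l, m_s\rangle = \langle\vec{S}\rangle\cdot\langle\vec{L}\rangle = \langle S_x\rangle\langle L_x\rangle + \langle S_y\rangle\langle L_y\rangle + \langle S_z\rangle\langle L_z\rangle \tag{2.419} $$

In QM I, we learned that for eigenstates of $L^2$ and $L_z$, $\langle L_x\rangle = \langle L_y\rangle = 0$. Similarly, $\langle S_x\rangle = \langle S_y\rangle = 0$. And thus

$$ \langle n, l, m_l, m_s|\,\vec{S}\cdot\vec{L}\,|n, l, m_l, m_s\rangle = \langle\vec{S}\rangle\cdot\langle\vec{L}\rangle = \langle S_z\rangle\langle L_z\rangle = m_s\hbar\,m_l\hbar = m_s m_l\hbar^2 \tag{2.420} $$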

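As a result,

$$ \langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2}\left\langle n, l, m_l, m_s\right|\frac{\vec{S}\cdot\vec{L}}{r^3}\left|n, l, m_l, m_s\right\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{m_s m_l\hbar^2}{m^2 c^2}\left\langle n, l, m_l, m_s\right|\frac{1}{r^3}\left|n, l, m_l, m_s\right\rangle \tag{2.421} $$

and

$$ \left\langle n, l, m_l, m_s\right|\frac{1}{r^3}\left|n, l, m_l, m_s\right\rangle = \int d\vec{r}\;\psi_{n,l,m}^*(r, \theta, \phi)\,\frac{1}{r^3}\,\psi_{n,l,m}(r, \theta, \phi) = \frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} \tag{2.422} $$

where $a$ is the Bohr radius, $a = \frac{\hbar^2}{m}\frac{4\pi\epsilon_0}{e^2}$.

So

$$ \langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{m_s m_l\hbar^2}{m^2 c^2}\left(\frac{m}{\hbar^2}\,\frac{e^2}{4\pi\epsilon_0}\right)^3\frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} = \frac{1}{2}\left(\frac{e^2}{4\pi\epsilon_0 c\hbar}\right)^2\left(\frac{e^2}{4\pi\epsilon_0}\right)^2\frac{m}{\hbar^2}\,\frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} \tag{2.423} $$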

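Because

$$ E_n^0 = -\frac{1}{2n^2}\,\frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = -\frac{13.6}{n^2}\ \text{eV} \tag{2.424} $$

$$ \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.425} $$

we can rewrite the formula as

$$ \langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = 13.6\ \text{eV}\;\alpha^2\,\frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} \tag{2.426} $$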

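So our first order correction is

$$
\begin{aligned}
E_{n,l,m_l,m_s}^1 &= \langle n, l, m_l, m_s|\,H_r' + H_{SO}'\,|n, l, m_l, m_s\rangle
= -13.6\ \text{eV}\,\frac{\alpha^2}{n^4}\left[\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right] + 13.6\ \text{eV}\;\alpha^2\,\frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} \\
&= \frac{13.6\ \text{eV}}{n^3}\,\alpha^2\left[\frac{3}{4n} - \frac{1}{l + \frac{1}{2}} + \frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)}\right]
= \frac{13.6\ \text{eV}}{n^3}\,\alpha^2\left[\frac{3}{4n} - \frac{l(l+1) - m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)}\right]
\end{aligned} \tag{2.427}
$$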

So

$$ E_{n,l,m_l,m_s} = E_{n,l,m_l,m_s}^0 + E_{n,l,m_l,m_s}^1 = -\frac{13.6\ \text{eV}}{n^2} + \mu_B B\,(m_l + 2m_s) + \frac{13.6\ \text{eV}}{n^3}\,\alpha^2\left[\frac{3}{4n} - \frac{l(l+1) - m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)}\right] \tag{2.428} $$

Bottom line: at very strong field (second term much larger than the last one), the energy splittings between the levels are proportional to $B$ and the slope is proportional to $\mu_B(m_l + 2m_s)$. The eigenstates are (almost) $|n, l, m_l, m_s\rangle$, where $m_s$ and $m_l$ are good quantum numbers. (We should arrange the states according to the orbital and spin angular momenta, NOT the total angular momentum $j$.)
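The strong-field formula can be evaluated numerically as follows. This is a sketch under stated assumptions: it uses the rounded constants quoted in the text (13.6 eV and μ_B = 5.788×10⁻⁵ eV/T), the field value of 10 T is an arbitrary illustrative choice, the function name is hypothetical, and the expression is only meaningful for l ≥ 1 because the bracket is of the form 0/0 at l = 0.

```python
ALPHA = 1 / 137.036
RYDBERG_EV = 13.6
MU_B_EV_PER_T = 5.788e-5

def strong_field_energy(n, l, ml, ms, b_tesla):
    """Energy of Eq. (2.428) in eV, for l >= 1."""
    zeeman = MU_B_EV_PER_T * b_tesla * (ml + 2 * ms)
    bracket = 3 / (4 * n) - (l * (l + 1) - ms * ml) / (l * (l + 0.5) * (l + 1))
    fine = RYDBERG_EV / n ** 3 * ALPHA ** 2 * bracket
    return -RYDBERG_EV / n ** 2 + zeeman + fine

B_FIELD = 10.0  # tesla, illustrative value only
for ml in (1, 0, -1):
    for ms in (0.5, -0.5):
        print(ml, ms, strong_field_energy(2, 1, ml, ms, B_FIELD))
```

At this field the Zeeman term is several times 10⁻⁴ eV per unit of m_l + 2 m_s, while the fine structure term is of order 10⁻⁵ eV, so the ordering of the levels is set by m_l + 2 m_s.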

2.6.3. Weak field

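When $H_Z' \ll H_{SO}'$, we can treat $H_Z'$ as a small perturbation. The zeroth order Hamiltonian (i.e. ignoring $H_Z'$) is what we studied in the previous section. There, we know that the eigenstates are $|n, j, l, s, m_j\rangle$ and the eigenenergy is

$$ E_{n,j} = E_n^0\left[1 + \frac{\alpha^2}{4n^2}\left(\frac{4n}{j + 1/2} - 3\right) + \dots\right] = E_n^0\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \dots\right] = -\frac{13.6\ \text{eV}}{n^2}\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \dots\right] \tag{2.429} $$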

In first order perturbation theory, the energy correction is

$$E^1_{n,j,l,s,m_j} = \langle n,j,l,s,m_j|\,H_Z'\,|n,j,l,s,m_j\rangle = \langle n,j,l,s,m_j|\,\mu_B\,\frac{\vec L+2\vec S}{\hbar}\cdot\vec B\,|n,j,l,s,m_j\rangle = \frac{\mu_B}{\hbar}\,\langle n,j,l,s,m_j|\,\vec B\cdot\vec L+2\,\vec B\cdot\vec S\,|n,j,l,s,m_j\rangle \tag{2.430}$$

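The key is to compute the expectation values of $\vec L$ and $\vec S$ for eigenstates of $J^2$ and $J_z$. Here, we use the fact that

$$\langle\vec L\rangle \parallel \langle\vec S\rangle \parallel \langle\vec J\rangle \tag{2.431}$$

So

$$\langle\vec L\rangle = \left\langle\frac{(\vec L\cdot\vec J)\,\vec J}{J^2}\right\rangle \tag{2.432}$$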

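Here,

$$\langle\vec B\cdot\vec L\rangle = \left\langle\frac{(\vec L\cdot\vec J)(\vec B\cdot\vec J)}{J^2}\right\rangle = \langle n,j,l,s,m_j|\,\frac{(\vec L\cdot\vec J)(\vec B\cdot\vec J)}{J^2}\,|n,j,l,s,m_j\rangle \tag{2.433}$$

For $\vec B \parallel z$, we have $\vec B\cdot\vec J = B J_z$, so

$$\langle\vec B\cdot\vec L\rangle = \langle n,j,l,s,m_j|\,\frac{(\vec L\cdot\vec J)\,B J_z}{J^2}\,|n,j,l,s,m_j\rangle \tag{2.434}$$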

Here, we use the fact that

$$J_z\,|n,j,l,s,m_j\rangle = m_j\hbar\,|n,j,l,s,m_j\rangle \tag{2.435}$$

$$J^2\,|n,j,l,s,m_j\rangle = j(j+1)\hbar^2\,|n,j,l,s,m_j\rangle \tag{2.436}$$

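so

$$\langle\vec B\cdot\vec L\rangle = \langle n,j,l,s,m_j|\,\frac{(\vec L\cdot\vec J)\,B J_z}{J^2}\,|n,j,l,s,m_j\rangle = \langle n,j,l,s,m_j|\,\frac{(\vec L\cdot\vec J)\,B\,m_j\hbar}{j(j+1)\hbar^2}\,|n,j,l,s,m_j\rangle = \frac{B\,m_j}{j(j+1)\hbar}\,\langle n,j,l,s,m_j|\,\vec L\cdot\vec J\,|n,j,l,s,m_j\rangle \tag{2.437}$$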

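For $\vec L\cdot\vec J$, we use the fact that

$$\vec S = \vec J - \vec L \tag{2.438}$$

$$\vec S\cdot\vec S = (\vec J-\vec L)\cdot(\vec J-\vec L) = \vec J\cdot\vec J + \vec L\cdot\vec L - 2\,\vec L\cdot\vec J \tag{2.439}$$

Thus,

$$\vec L\cdot\vec J = \frac{\vec J\cdot\vec J + \vec L\cdot\vec L - \vec S\cdot\vec S}{2} \tag{2.440}$$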

So,

$$\begin{aligned}
\langle n,j,l,s,m_j|\,\vec L\cdot\vec J\,|n,j,l,s,m_j\rangle &= \langle n,j,l,s,m_j|\,\frac{\vec J\cdot\vec J+\vec L\cdot\vec L-\vec S\cdot\vec S}{2}\,|n,j,l,s,m_j\rangle \\
&= \langle n,j,l,s,m_j|\,\frac{\hbar^2 j(j+1)+\hbar^2 l(l+1)-\hbar^2 s(s+1)}{2}\,|n,j,l,s,m_j\rangle \\
&= \frac{\hbar^2}{2}\left[\,j(j+1)+l(l+1)-s(s+1)\,\right]
\end{aligned} \tag{2.441}$$

We know that $s = 1/2$ for electrons, so

$$\langle n,j,l,s,m_j|\,\vec L\cdot\vec J\,|n,j,l,s,m_j\rangle = \frac{\hbar^2}{2}\left[\,j(j+1)+l(l+1)-\frac34\,\right] \tag{2.442}$$

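Therefore,

$$\langle\vec B\cdot\vec L\rangle = \frac{B\,m_j}{j(j+1)\hbar}\,\langle n,j,l,s,m_j|\,\vec L\cdot\vec J\,|n,j,l,s,m_j\rangle = \frac{B\,m_j}{j(j+1)\hbar}\,\frac{\hbar^2}{2}\left[\,j(j+1)+l(l+1)-\frac34\,\right] = \frac{B\,m_j\hbar}{2j(j+1)}\left[\,j(j+1)+l(l+1)-\frac34\,\right] \tag{2.443}$$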

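Similarly, we can prove that

$$\langle\vec B\cdot\vec S\rangle = \frac{B\,m_j}{j(j+1)\hbar}\,\langle n,j,l,s,m_j|\,\vec S\cdot\vec J\,|n,j,l,s,m_j\rangle = \frac{B\,m_j}{j(j+1)\hbar}\,\frac{\hbar^2}{2}\left[\,j(j+1)-l(l+1)+\frac34\,\right] = \frac{B\,m_j\hbar}{2j(j+1)}\left[\,j(j+1)-l(l+1)+\frac34\,\right] \tag{2.444}$$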
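Formulas (2.443) and (2.444) can be checked against a direct calculation for one state. The sketch below takes $\vec B$ along $z$, works in units of $\hbar$, and assumes the standard Clebsch-Gordan decompositions for $l = 1$, $s = 1/2$ (these are not derived in the notes); the function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def projected_lz_sz(j, l, mj):
    """<L_z> and <S_z> in units of hbar from Eqs. (2.443)-(2.444), with s = 1/2."""
    jj = j * (j + 1)
    ll = l * (l + 1)
    ss = Fraction(3, 4)
    lz = mj * (jj + ll - ss) / (2 * jj)
    sz = mj * (jj - ll + ss) / (2 * jj)
    return lz, sz


def direct_lz_sz(components):
    """<L_z> and <S_z> for a state given as (probability, ml, ms) components.

    L_z and S_z are diagonal in the |ml, ms> basis, so only probabilities matter.
    """
    lz = sum(p * ml for p, ml, ms in components)
    sz = sum(p * ms for p, ml, ms in components)
    return lz, sz


# Assumed Clebsch-Gordan decompositions for l = 1, mj = 1/2:
# |j=3/2, mj=1/2> =  sqrt(2/3)|ml=0, up> + sqrt(1/3)|ml=1, down>
# |j=1/2, mj=1/2> = -sqrt(1/3)|ml=0, up> + sqrt(2/3)|ml=1, down>
STATE_J32 = [(Fraction(2, 3), 0, HALF), (Fraction(1, 3), 1, -HALF)]
STATE_J12 = [(Fraction(1, 3), 0, HALF), (Fraction(2, 3), 1, -HALF)]

if __name__ == "__main__":
    print(projected_lz_sz(Fraction(3, 2), 1, HALF), direct_lz_sz(STATE_J32))
    print(projected_lz_sz(HALF, 1, HALF), direct_lz_sz(STATE_J12))
```

In both cases the two routes agree, and $\langle L_z\rangle + \langle S_z\rangle = m_j$ as it must.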

And thus

$$\begin{aligned}
E^1_{n,j,l,s,m_j} &= \langle n,j,l,s,m_j|\,H_Z'\,|n,j,l,s,m_j\rangle = \frac{\mu_B}{\hbar}\,\langle n,j,l,s,m_j|\,\vec B\cdot\vec L+2\,\vec B\cdot\vec S\,|n,j,l,s,m_j\rangle \\
&= \frac{\mu_B}{\hbar}\left\{\frac{B\,m_j\hbar}{2j(j+1)}\left[\,j(j+1)+l(l+1)-\frac34\,\right] + 2\,\frac{B\,m_j\hbar}{2j(j+1)}\left[\,j(j+1)-l(l+1)+\frac34\,\right]\right\} \\
&= \frac{\mu_B}{\hbar}\,\frac{B\,m_j\hbar}{2j(j+1)}\left[\,3j(j+1)-l(l+1)+\frac34\,\right] \\
&= \mu_B B\,m_j\,\frac{1}{2j(j+1)}\left[\,3j(j+1)-l(l+1)+\frac34\,\right]
\end{aligned} \tag{2.445}$$

We can define an atomic g-factor (the Landé g-factor),

$$g_j = \frac{1}{2j(j+1)}\left[\,3j(j+1)-l(l+1)+\frac34\,\right] = \frac32-\frac{l(l+1)-3/4}{2j(j+1)} = \frac32-\frac{\left(l+\frac32\right)\left(l-\frac12\right)}{2j(j+1)} \tag{2.446}$$

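We know that $j = l + 1/2$ or $j = l - 1/2$. If $j = l + 1/2$, we find that

$$g_j = \frac32-\frac{\left(l+\frac32\right)\left(l-\frac12\right)}{2j(j+1)} = \frac32-\frac{\left(l+\frac32\right)\left(l-\frac12\right)}{2\left(l+\frac12\right)\left(l+\frac32\right)} = \frac32-\frac{l-\frac12}{2\left(l+\frac12\right)} = \frac32-\frac{\left(l+\frac12\right)-1}{2\left(l+\frac12\right)} = \frac32-\frac12+\frac{1}{2\left(l+\frac12\right)} = 1+\frac{1}{2l+1} \tag{2.447}$$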

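If $j = l - 1/2$,

$$g_j = \frac32-\frac{\left(l+\frac32\right)\left(l-\frac12\right)}{2j(j+1)} = \frac32-\frac{\left(l+\frac32\right)\left(l-\frac12\right)}{2\left(l-\frac12\right)\left(l+\frac12\right)} = \frac32-\frac{l+\frac32}{2\left(l+\frac12\right)} = \frac32-\frac{\left(l+\frac12\right)+1}{2\left(l+\frac12\right)} = \frac32-\frac12-\frac{1}{2\left(l+\frac12\right)} = 1-\frac{1}{2l+1} \tag{2.448}$$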

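With this $g_j$-factor,

$$E^1_{n,j,l,s,m_j} = g_j\,\mu_B B\,m_j \tag{2.449}$$

The first-order energy correction is proportional to $m_j$ (the component of the total angular momentum along the field).

Total energy:

$$E_{n,j} = -\frac{13.6\ \mathrm{eV}}{n^2}\left[1+\frac{\alpha^2}{n^2}\left(\frac{n}{j+1/2}-\frac34\right)\right] + g_j\,\mu_B B\,m_j \tag{2.450}$$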
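The g-factor algebra above is easy to verify with exact rational arithmetic. The following is a minimal sketch that compares the general expression (2.446) with the closed forms (2.447) and (2.448), and then evaluates the weak-field energy (2.450). The Bohr magneton value is an assumed CODATA figure and the function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
RYDBERG_EV = 13.6               # eV, the value used in the notes
ALPHA = 1 / 137.036             # fine-structure constant, Eq. (2.425)
MU_B_EV_PER_T = 5.7883818e-5    # Bohr magneton in eV/T (assumed CODATA value)


def lande_g(j, l):
    """General expression for g_j with s = 1/2, Eq. (2.446)."""
    jj = j * (j + 1)
    return (3 * jj - l * (l + 1) + Fraction(3, 4)) / (2 * jj)


def lande_g_closed_form(j, l):
    """Closed forms for j = l + 1/2 and j = l - 1/2, Eqs. (2.447)-(2.448)."""
    if j == l + HALF:
        return 1 + Fraction(1, 2 * l + 1)
    if j == l - HALF:
        return 1 - Fraction(1, 2 * l + 1)
    raise ValueError("j must be l + 1/2 or l - 1/2")


def weak_field_energy_ev(n, j, l, mj, b_tesla):
    """Total energy in the weak-field regime, Eq. (2.450), in eV."""
    fine = -RYDBERG_EV / n**2 * (1 + ALPHA**2 / n**2 * (n / (float(j) + 0.5) - 0.75))
    zeeman = float(lande_g(j, l)) * MU_B_EV_PER_T * b_tesla * float(mj)
    return fine + zeeman


if __name__ == "__main__":
    for l in range(0, 4):
        for j in ([l + HALF] if l == 0 else [l - HALF, l + HALF]):
            print(f"l={l}, j={j}: g_j={lande_g(j, l)}")
```

For the $n = 2$ levels this gives $g_j = 2$ for $l = 0$, $g_j = 2/3$ for $l = 1,\ j = 1/2$, and $g_j = 4/3$ for $l = 1,\ j = 3/2$.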

2.6.4. Intermediate-field

H = H0 + Hr ' + HSO ' + Hz ' (2.451)

When $H_{SO}' \sim H_Z'$, we should treat the last three terms together as the perturbation. Here, we can treat the problem using degenerate perturbation theory.

For $H_0$, we consider the $n = 2$ states (first excited states). In QM I, we learned that there are 4 degenerate states: one s-wave state with $l = 0$ and three p-wave states ($l = 1$, and $m_l = -1, 0, 1$). If we include spin, there are 4×2 = 8 states. In first order degenerate perturbation theory, we can ignore all other states except the $n = 2$ states, and focus only on these 8 states. So the perturbation Hamiltonian is now an 8×8 matrix, which is shown in the textbook (page 248). The eigenvalues of this matrix give us the first order corrections to the energy.

2.7. Summary

Objective: compute eigenvalues for the Hamiltonian

$$H\psi_n = E_n\psi_n \tag{2.452}$$

Key assumption:

$$H = H_0 + \lambda H' \tag{2.453}$$

where the second term $\lambda H'$ is much smaller than the first.

2.7.1. nondegenerate perturbation theory

Step 1: solve for the eigenstates of $H_0$

$$H_0\,\psi_n^0 = E_n^0\,\psi_n^0 \tag{2.454}$$

If the state that we consider has no degeneracy, we use nondegenerate perturbation theory.

Step 2: first order correction

$$E_n^1 = \langle\psi_n^0|\,H'\,|\psi_n^0\rangle \tag{2.455}$$

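Step 3: second order correction

$$E_n^2 = \sum_{m\neq n}\langle\psi_n^0|\,H'\,|\psi_m^0\rangle\,\frac{1}{E_n^0-E_m^0}\,\langle\psi_m^0|\,H'\,|\psi_n^0\rangle = \sum_{m\neq n}\frac{\left|\langle\psi_m^0|\,H'\,|\psi_n^0\rangle\right|^2}{E_n^0-E_m^0} \tag{2.456}$$

Step 4: eigenenergy

$$E_n = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \dots \tag{2.457}$$

Wave functions:

$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda\sum_{m\neq n}|\psi_m^0\rangle\,\frac{1}{E_n^0-E_m^0}\,\langle\psi_m^0|\,H'\,|\psi_n^0\rangle + \dots \tag{2.458}$$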
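The steps above can be sketched in a few lines and compared with the exact answer for a two-level system, where the eigenvalues are known in closed form. This is a minimal illustration assuming a real symmetric $H'$ given as a nested list; the function names and the sample numbers are hypothetical.

```python
import math


def perturbative_energy(e0, h1, lam, n):
    """E_n through second order, Eqs. (2.455)-(2.457). Level n must be nondegenerate."""
    first = h1[n][n]
    second = sum(abs(h1[m][n]) ** 2 / (e0[n] - e0[m])
                 for m in range(len(e0)) if m != n)
    return e0[n] + lam * first + lam ** 2 * second


def first_order_state(e0, h1, lam, n):
    """Coefficients of |psi_n> in the unperturbed basis, Eq. (2.458), not normalized."""
    return [1.0 if m == n else lam * h1[m][n] / (e0[n] - e0[m])
            for m in range(len(e0))]


def exact_two_level(e0, h1, lam):
    """Exact eigenvalues (lower, upper) of the 2x2 matrix H0 + lam * H'."""
    a = e0[0] + lam * h1[0][0]
    d = e0[1] + lam * h1[1][1]
    b = lam * h1[0][1]
    mean = (a + d) / 2
    root = math.hypot((a - d) / 2, b)
    return mean - root, mean + root


# Hypothetical sample values, chosen with e0[0] < e0[1].
E0 = [0.0, 1.0]
H1 = [[0.5, 0.3], [0.3, -0.2]]
LAM = 0.01

if __name__ == "__main__":
    lower, upper = exact_two_level(E0, H1, LAM)
    print("level 0:", perturbative_energy(E0, H1, LAM, 0), "exact:", lower)
    print("level 1:", perturbative_energy(E0, H1, LAM, 1), "exact:", upper)
    print("state 0:", first_order_state(E0, H1, LAM, 0))
```

The remaining discrepancy between the perturbative and exact energies is of order $\lambda^3$, as expected from truncating (2.457) after the second-order term.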

2.7.2. degenerate perturbation theory

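Step 1: solve for the eigenstates of $H_0$

$$H_0\,\psi_n^0 = E_n^0\,\psi_n^0 \tag{2.459}$$

If the state that we consider is degenerate (i.e. at least one other state has the same eigenenergy), we use degenerate perturbation theory. For a two-fold degeneracy,

$$H_0\,\psi_a^0 = E^0\,\psi_a^0 \tag{2.460}$$

and

$$H_0\,\psi_b^0 = E^0\,\psi_b^0 \tag{2.461}$$

Step 2: create an n×n matrix (if there is an n-fold degeneracy). For n = 2,

$$W = \begin{pmatrix}\langle\psi_a^0|\,H'\,|\psi_a^0\rangle & \langle\psi_a^0|\,H'\,|\psi_b^0\rangle\\ \langle\psi_b^0|\,H'\,|\psi_a^0\rangle & \langle\psi_b^0|\,H'\,|\psi_b^0\rangle\end{pmatrix} \tag{2.462}$$

Step 3: the eigenvalues of the matrix, $E_+$ and $E_-$, are the first order corrections

$$E_1 = E^0 + \lambda E_+ + O(\lambda^2) \tag{2.463}$$

$$E_2 = E^0 + \lambda E_- + O(\lambda^2) \tag{2.464}$$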

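Wavefunctions: eigenvectors

$$\begin{pmatrix}W_{aa} & W_{ab}\\ W_{ba} & W_{bb}\end{pmatrix}\begin{pmatrix}\alpha_1\\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_1\\ \beta_1\end{pmatrix} \tag{2.465}$$

and

$$\begin{pmatrix}W_{aa} & W_{ab}\\ W_{ba} & W_{bb}\end{pmatrix}\begin{pmatrix}\alpha_2\\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_2\\ \beta_2\end{pmatrix} \tag{2.466}$$

where $E_+$ and $E_-$ are the two eigenvalues.

$$\psi_1^0 = \alpha_1\,\psi_a^0 + \beta_1\,\psi_b^0 \tag{2.467}$$

$$\psi_2^0 = \alpha_2\,\psi_a^0 + \beta_2\,\psi_b^0 \tag{2.468}$$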
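For a two-fold degeneracy, the eigenvalue problem for $W$ can be solved in closed form. The following is a minimal sketch, assuming a real symmetric $W$ (so $W_{ba} = W_{ab}$); the function names are illustrative.

```python
import math


def diagonalize_w(waa, wab, wbb):
    """Eigenpairs of the real symmetric matrix W = [[waa, wab], [wab, wbb]].

    Returns ((E_plus, (alpha1, beta1)), (E_minus, (alpha2, beta2))) with
    normalized eigenvectors, as in Eqs. (2.465)-(2.466).
    """
    if wab == 0:
        # W is already diagonal: psi_a and psi_b are the "good" states.
        if waa >= wbb:
            return (waa, (1.0, 0.0)), (wbb, (0.0, 1.0))
        return (wbb, (0.0, 1.0)), (waa, (1.0, 0.0))
    mean = (waa + wbb) / 2
    root = math.hypot((waa - wbb) / 2, wab)
    pairs = []
    for energy in (mean + root, mean - root):
        # (wab, energy - waa) solves the first row of (W - energy) v = 0.
        alpha, beta = wab, energy - waa
        norm = math.hypot(alpha, beta)
        pairs.append((energy, (alpha / norm, beta / norm)))
    return pairs[0], pairs[1]


def first_order_energies(e0, lam, waa, wab, wbb):
    """E_1 and E_2 from Eqs. (2.463)-(2.464), dropping the O(lambda^2) terms."""
    (e_plus, _), (e_minus, _) = diagonalize_w(waa, wab, wbb)
    return e0 + lam * e_plus, e0 + lam * e_minus


if __name__ == "__main__":
    # Hypothetical example: purely off-diagonal coupling between the two states.
    print(diagonalize_w(0.0, 1.0, 0.0))
    print(first_order_energies(1.0, 0.1, 0.0, 1.0, 0.0))
```

With purely off-diagonal coupling, the good zeroth-order states are the symmetric and antisymmetric combinations of $\psi_a^0$ and $\psi_b^0$.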

2.7.3. Perturbation theory in matrix formula (example: homework 2.3)

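This approach usually uses the eigenstates of $H_0$ as the basis,

$$H_0\,\psi_n^0 = E_n^0\,\psi_n^0 \tag{2.469}$$

With a complete basis, we can write an operator as a matrix

$$(H_0)_{mn} = \langle\psi_m^0|\,H_0\,|\psi_n^0\rangle \tag{2.470}$$

If we use the eigenstates of $H_0$ as the basis, the matrix is diagonal and the diagonal components are the eigenvalues of $H_0$

$$H_0 \to \begin{pmatrix}E_1^0 & 0 & 0 & \dots\\ 0 & E_2^0 & 0 & \dots\\ 0 & 0 & E_3^0 & \dots\\ \vdots & \vdots & \vdots & \ddots\end{pmatrix} \tag{2.471}$$

Using the same basis, we can write $H'$ as a matrix

$$\lambda H' \to \lambda\begin{pmatrix}\langle\psi_1|H'|\psi_1\rangle & \langle\psi_1|H'|\psi_2\rangle & \langle\psi_1|H'|\psi_3\rangle & \dots\\ \langle\psi_2|H'|\psi_1\rangle & \langle\psi_2|H'|\psi_2\rangle & \langle\psi_2|H'|\psi_3\rangle & \dots\\ \langle\psi_3|H'|\psi_1\rangle & \langle\psi_3|H'|\psi_2\rangle & \langle\psi_3|H'|\psi_3\rangle & \dots\\ \vdots & \vdots & \vdots & \ddots\end{pmatrix} \tag{2.472}$$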

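In many cases, we only need to keep a small number of states (e.g. only the three states with the lowest energy)

$$H_0 = \begin{pmatrix}E_1^0 & 0 & 0\\ 0 & E_2^0 & 0\\ 0 & 0 & E_3^0\end{pmatrix} \tag{2.473}$$

and

$$\lambda H' = \lambda\begin{pmatrix}\langle\psi_1|H'|\psi_1\rangle & \langle\psi_1|H'|\psi_2\rangle & \langle\psi_1|H'|\psi_3\rangle\\ \langle\psi_2|H'|\psi_1\rangle & \langle\psi_2|H'|\psi_2\rangle & \langle\psi_2|H'|\psi_3\rangle\\ \langle\psi_3|H'|\psi_1\rangle & \langle\psi_3|H'|\psi_2\rangle & \langle\psi_3|H'|\psi_3\rangle\end{pmatrix} \tag{2.474}$$

Objective: compute the eigenstates of the $H$ matrix

$$H = \begin{pmatrix}E_1^0 & 0 & 0\\ 0 & E_2^0 & 0\\ 0 & 0 & E_3^0\end{pmatrix} + \lambda\begin{pmatrix}\langle\psi_1|H'|\psi_1\rangle & \langle\psi_1|H'|\psi_2\rangle & \langle\psi_1|H'|\psi_3\rangle\\ \langle\psi_2|H'|\psi_1\rangle & \langle\psi_2|H'|\psi_2\rangle & \langle\psi_2|H'|\psi_3\rangle\\ \langle\psi_3|H'|\psi_1\rangle & \langle\psi_3|H'|\psi_2\rangle & \langle\psi_3|H'|\psi_3\rangle\end{pmatrix} \tag{2.475}$$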

Assumption: the second matrix is much smaller than the first.

Nondegenerate perturbation:

Among the three unperturbed eigenvalues, $E_1^0$, $E_2^0$ and $E_3^0$, if one of them is different from the other two (e.g. $E_1^0$ is different), then we can use nondegenerate perturbation theory.

First order perturbation:

$$E_1^1 = \langle\psi_1|\,H'\,|\psi_1\rangle \tag{2.476}$$

Notice that it is just one element of the second matrix in $H$.

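Second order perturbation:

$$E_1^2 = \sum_{m\neq 1}\langle\psi_1^0|\,H'\,|\psi_m^0\rangle\,\frac{1}{E_1^0-E_m^0}\,\langle\psi_m^0|\,H'\,|\psi_1^0\rangle = \frac{\langle\psi_1^0|\,H'\,|\psi_2^0\rangle\langle\psi_2^0|\,H'\,|\psi_1^0\rangle}{E_1^0-E_2^0} + \frac{\langle\psi_1^0|\,H'\,|\psi_3^0\rangle\langle\psi_3^0|\,H'\,|\psi_1^0\rangle}{E_1^0-E_3^0} \tag{2.477}$$

Notice that the denominators are built from elements of the first matrix in $H$, and the numerators from elements of the second matrix.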

Degenerate perturbation:

Among the three unperturbed eigenvalues, $E_1^0$, $E_2^0$ and $E_3^0$, if two (or more) of them are identical (e.g. $E_2^0$ and $E_3^0$ have the same value), then we can use degenerate perturbation theory.

Step one: create the $W$ matrix

$$W = \begin{pmatrix}\langle\psi_2^0|\,H'\,|\psi_2^0\rangle & \langle\psi_2^0|\,H'\,|\psi_3^0\rangle\\ \langle\psi_3^0|\,H'\,|\psi_2^0\rangle & \langle\psi_3^0|\,H'\,|\psi_3^0\rangle\end{pmatrix} \tag{2.478}$$

Step two: compute the eigenvalues of $W$, which are the first order corrections
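As an illustration of both recipes applied to one matrix, consider a hypothetical three-level example (the numbers are made up for illustration and are not taken from the homework), in which level 1 is nondegenerate and levels 2 and 3 are degenerate:

```latex
H = \begin{pmatrix} 0 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
  + \lambda \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 2\\ 0 & 2 & 0 \end{pmatrix}

% Level 1 (nondegenerate): Eqs. (2.476) and (2.477)
E_1^1 = 0, \qquad
E_1^2 = \frac{1\cdot 1}{0-1} + \frac{0\cdot 0}{0-1} = -1
\quad\Rightarrow\quad E_1 \approx -\lambda^2

% Levels 2 and 3 (degenerate): Eq. (2.478)
W = \begin{pmatrix} 0 & 2\\ 2 & 0 \end{pmatrix}, \qquad
\det(W - E) = E^2 - 4 = 0
\quad\Rightarrow\quad E_\pm = \pm 2
\quad\Rightarrow\quad E \approx 1 \pm 2\lambda
```

Here the degenerate pair splits already at first order in $\lambda$, while the nondegenerate level shifts only at second order because its diagonal element of $H'$ vanishes.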
