time-independent perturbation theory - university of michigansunkai/teaching/winter_2016/... ·...
2. Time-Independent Perturbation Theory
2.1. Overview
2.1.1. General question
Assume that we have a Hamiltonian
$$H = H_0 + \lambda H_1 \tag{2.1}$$
where λ is a very small real number (the perturbation $H_1$ is written as $H'$ in the rest of this chapter). The eigenstates of $H$ should not be very different from the eigenstates of $H_0$. If we already know all eigenstates of $H_0$, can we get the eigenstates of $H$ approximately?
Bottom line: we are studying an approximate method.
2.1.2. Why perturbation theory?
Why do we need to study this approximation method, considering that numerical methods can compute the eigenstates very efficiently and accurately for any Hamiltonian that we consider in this course?
Reason number I: It is part of the history (QM was born before the electronic computer became a powerful tool in scientific research).
Reason number II: It reveals universal principles, which are very important and cannot be obtained from numerical simulations alone.
Reason number III: The idea of perturbation theory has a very deep and broad impact in many branches of physics. Perturbation theory is in many cases the only theoretical technique that we have to handle various complex systems (quantum and classical). Example: in quantum field theory (which is in fact a nonlinear generalization of QM), most of the effort goes into developing new ways to do perturbation theory (loop expansions, 1/N expansions, 4-ϵ expansions).
2.1.3. Assumptions
Assumption #1: we know all eigenstates of $H_0$, as well as their corresponding eigenenergies
$$H_0 |\psi_n^0\rangle = E_n^0 |\psi_n^0\rangle \tag{2.2}$$
Assumption #2: we know the perturbation $H'$. What do we mean by knowing $H'$? Here, we mean that we can write down $H'$ using the complete basis of $|\psi_n^0\rangle$, i.e., we know the value of $\langle\psi_n^0|H'|\psi_m^0\rangle$ for any m and n.
Assumption #3: we only consider quantum states with discrete eigenenergies
In general, the energy spectrum of a quantum system (i.e. all eigenvalues of the Hamiltonian) falls into one of the following three general
possibilities
◼ A discrete spectrum: eigenenergies can only take certain discrete values (example: infinitely deep potential wells, e.g. the harmonic potential with $E_n = (n + 1/2)\hbar\omega$)
4 Phys460.nb
◼ A continuous spectrum: eigenenergies can take any (real) value in a certain allowed range (example: a constant potential V. Here, any E ≥ V is an eigenenergy)
◼ A mixed spectrum: some parts of the spectrum are continuous, while other parts have discrete eigenenergies (example: a finite potential well. Here, we may have some discrete states inside the well, but for E above the top of the potential well, we have a continuous spectrum).
Q: Consider the energy spectrum of an attractive Coulomb (1 /r) potential. Is it discrete, continuous or mixed?
A: It is mixed. When we consider an attractive Coulomb potential, we mostly focus on the negative energy states (E < 0). This part of the spectrum is discrete, as we all know very well from the study of the hydrogen atom. But if we look at states with positive energies, there is a continuous spectrum for E > 0. For E > 0, the system is NOT in a bound state, i.e. the proton and the electron don’t form an atom. In other words, we have a high probability of finding the proton and the electron far from each other. There, the attractive potential is very small and negligible, so we have two free particles and only need to consider their kinetic energies. For free particles, we know that any positive energy is an allowed eigenenergy (i.e. we have a continuous spectrum for E > 0).
Bottom line: in this chapter, our perturbation theory only considers a discrete spectrum or the discrete part of a mixed spectrum.
Another version of assumption #3: we only consider confined states. (In QM, in most cases, confined states = discrete energies and unconfined states = continuous energies.)
Comment: In QM, we only study discrete states in perturbation theory. But this is NOT true for other branches of physics. For example, in quantum field theory, perturbation theory is applied to continuous spectra.
2.2. Non-degenerate Perturbation Theory
2.2.1. Assumptions
Key assumption: we consider a specific state $|\psi_n^0\rangle$ and assume that $|E_n^0 - E_m^0|$ is much larger than the matrix elements of $\lambda H'$ for any other eigenstate $|\psi_m^0\rangle$.
2.2.2. Preparation #1 wavefunctions
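Since the eigenstates of $H_0$ form a complete basis, we can write down any quantum state as a linear superposition of $|\psi_m^0\rangle$
$$|\psi\rangle = \sum_m a_m |\psi_m^0\rangle \tag{2.3}$$
Now, if we consider a (normalized) eigenstate of $H$, $|\tilde\psi_n\rangle$, it can also be written in a similar fashion
$$|\tilde\psi_n\rangle = \sum_m a_m |\psi_m^0\rangle \tag{2.4}$$
As discussed above, if λ is small, an eigenstate of $H$ should be similar to an eigenstate of $H_0$. Here, we assume that $|\tilde\psi_n\rangle$ is very close to $|\psi_n^0\rangle$. This means that $a_n \approx 1$ and, for other values of $m \neq n$, $a_m \sim 0$. To highlight this, we separate the term for $|\psi_n^0\rangle$ out from the sum
$$|\tilde\psi_n\rangle = a_n |\psi_n^0\rangle + \sum_{m\neq n} a_m |\psi_m^0\rangle \tag{2.5}$$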
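It turns out that it is usually more convenient to use unnormalized eigenstates. Now, let us define unnormalized eigenstates of $H$
$$|\psi_n\rangle = \frac{1}{a_n}|\tilde\psi_n\rangle = |\psi_n^0\rangle + \sum_{m\neq n}\frac{a_m}{a_n}|\psi_m^0\rangle \tag{2.6}$$
For simplicity, we will now call $a_m/a_n = c_m$
$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m\neq n} c_m |\psi_m^0\rangle \tag{2.7}$$
Because $a_m \sim 0$ and $a_n \sim 1$, we know that $c_m \sim 0$ for small λ.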
Comment #1: This state is NOT normalized
$$\langle\psi_n|\psi_n\rangle = 1 + \sum_{m\neq n}|c_m|^2 \geq 1 \tag{2.8}$$
But we can easily normalize it, if we want to
$$|\tilde\psi_n\rangle = \frac{1}{\sqrt{1+\sum_{m\neq n}|c_m|^2}}\,|\psi_n\rangle \tag{2.9}$$
Comment #2: (almost) any quantum state can be written in the form of Eq. (2.7), up to an overall factor. This is because the $|\psi_m^0\rangle$ form a complete basis.
Q: what does the word “almost” mean here?
A: If a state is orthogonal to $|\psi_n^0\rangle$, the coefficient of $|\psi_n^0\rangle$ is zero and cannot be rescaled to 1, so we cannot write the state in the form of Eq. (2.7). But we don’t need to worry about it here, because we are doing perturbation theory and we know that the eigenstates of $H$ are close to the eigenstates of $H_0$. So the state cannot be orthogonal to $|\psi_n^0\rangle$.
Bottom line: we are not making any assumptions or approximations here. It is just a new way to write down eigenstates of $H$.
Comment #3: the $c_m$ are functions of λ, i.e. $c_m(\lambda)$. For small λ, we can use the Taylor series:
$$c_m = c_m^{(1)}\lambda + c_m^{(2)}\lambda^2 + c_m^{(3)}\lambda^3 + \ldots \tag{2.10}$$
Here, the Taylor series doesn’t contain the 0th order term in λ (i.e. the constant term). This is because when λ = 0, $|\psi_n\rangle = |\psi_n^0\rangle$, and thus $c_m(\lambda) = 0$ at λ = 0.
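As a result,
$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m\neq n} c_m(\lambda)\,|\psi_m^0\rangle = |\psi_n^0\rangle + \sum_{m\neq n}\sum_{k=1}^{\infty} c_m^{(k)}\lambda^k\,|\psi_m^0\rangle = |\psi_n^0\rangle + \sum_{k=1}^{\infty}\lambda^k\sum_{m\neq n} c_m^{(k)}\,|\psi_m^0\rangle \tag{2.11}$$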
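If we define
$$|\psi_n^k\rangle = \sum_{m\neq n} c_m^{(k)}\,|\psi_m^0\rangle \tag{2.12}$$
we get
$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda|\psi_n^1\rangle + \lambda^2|\psi_n^2\rangle + \ldots \tag{2.13}$$
This is Eq. [6.5] in the textbook.
Important: $|\psi_n^1\rangle, |\psi_n^2\rangle, \ldots$ do not contain $|\psi_n^0\rangle$. In other words, all corrections are orthogonal to $|\psi_n^0\rangle$.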
2.2.3. Preparation #2 eigenenergies
Eigenenergies of $H$ are also functions of λ, and for small λ, we can use the Taylor series:
$$E_n(\lambda) = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots \tag{2.14}$$
This is Eq. [6.6] in the textbook.
2.2.4. Schrodinger Equation in the perturbation theory
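$$H|\psi_n\rangle = E_n|\psi_n\rangle \tag{2.15}$$
$$(H_0 + \lambda H')\left(|\psi_n^0\rangle + \lambda|\psi_n^1\rangle + \lambda^2|\psi_n^2\rangle + \ldots\right) = \left(E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots\right)\left(|\psi_n^0\rangle + \lambda|\psi_n^1\rangle + \lambda^2|\psi_n^2\rangle + \ldots\right) \tag{2.16}$$
$$H_0|\psi_n^0\rangle + \lambda\left(H_0|\psi_n^1\rangle + H'|\psi_n^0\rangle\right) + \lambda^2\left(H_0|\psi_n^2\rangle + H'|\psi_n^1\rangle\right) + \ldots = E_n^0|\psi_n^0\rangle + \lambda\left(E_n^0|\psi_n^1\rangle + E_n^1|\psi_n^0\rangle\right) + \lambda^2\left(E_n^0|\psi_n^2\rangle + E_n^1|\psi_n^1\rangle + E_n^2|\psi_n^0\rangle\right) + \ldots \tag{2.17}$$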
In perturbation theory, we need to compute two sets of quantities: (1) the energy corrections at each order, $E_n^1, E_n^2, \ldots$, and (2) the wavefunction corrections at each order, $|\psi_n^1\rangle, |\psi_n^2\rangle, |\psi_n^3\rangle, \ldots$. It turns out that these two sets of quantities are entangled together and we need to compute both of them. At each order, we will first compute the energy correction, and then the wavefunction correction.
2.2.5. Zeroth order
The leading order terms in the equation are of order $\lambda^0$ (constant)
$$H_0|\psi_n^0\rangle = E_n^0|\psi_n^0\rangle \tag{2.18}$$
This is identical to the case of λ = 0, i.e. the unperturbed system.
2.2.6. First order
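To the order of λ, we have
$$H_0|\psi_n^1\rangle + H'|\psi_n^0\rangle = E_n^0|\psi_n^1\rangle + E_n^1|\psi_n^0\rangle \tag{2.19}$$
Here, we first compute the energy correction $E_n^1$. This is done by multiplying both sides by $\langle\psi_n^0|$ from the left
$$\langle\psi_n^0|H_0|\psi_n^1\rangle + \langle\psi_n^0|H'|\psi_n^0\rangle = \langle\psi_n^0|E_n^0|\psi_n^1\rangle + \langle\psi_n^0|E_n^1|\psi_n^0\rangle \tag{2.20}$$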
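For the first term on the l.h.s., we use the fact that
$$\langle\psi_n^0|H_0 = \langle\psi_n^0|E_n^0 \tag{2.21}$$
For the last term on the r.h.s., we use the fact that $E_n^1$ is a number (not a quantum operator), and thus $\langle\psi_n^0|E_n^1|\psi_n^0\rangle = E_n^1\langle\psi_n^0|\psi_n^0\rangle = E_n^1$
$$\langle\psi_n^0|E_n^0|\psi_n^1\rangle + \langle\psi_n^0|H'|\psi_n^0\rangle = \langle\psi_n^0|E_n^0|\psi_n^1\rangle + E_n^1 \tag{2.22}$$
$$\langle\psi_n^0|H'|\psi_n^0\rangle = E_n^1 \tag{2.23}$$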
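The first order correction in energy is the expectation value of $H'$ in the unperturbed state.
$$E_n = E_n^0 + \lambda\langle\psi_n^0|H'|\psi_n^0\rangle + O(\lambda^2) = \langle\psi_n^0|H_0|\psi_n^0\rangle + \langle\psi_n^0|\lambda H'|\psi_n^0\rangle + O(\lambda^2) = \langle\psi_n^0|H_0 + \lambda H'|\psi_n^0\rangle + O(\lambda^2) = \langle\psi_n^0|H|\psi_n^0\rangle + O(\lambda^2) \tag{2.24}$$
Bottom line: to first order (i.e. up to corrections of order $\lambda^2$), we can compute the energy using the old wavefunction (the zeroth order wavefunction).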
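Then we compute the first order correction for the wavefunction, $|\psi_n^1\rangle$. To do that, we multiply both sides of the equation by $\langle\psi_m^0|$, where $m \neq n$
$$\langle\psi_m^0|H_0|\psi_n^1\rangle + \langle\psi_m^0|H'|\psi_n^0\rangle = \langle\psi_m^0|E_n^0|\psi_n^1\rangle + \langle\psi_m^0|E_n^1|\psi_n^0\rangle \tag{2.25}$$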
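For the first term on the l.h.s., we use the fact that
$$\langle\psi_m^0|H_0 = \langle\psi_m^0|E_m^0 \tag{2.26}$$
For the two terms on the r.h.s., we use the fact that $E_n^0$ and $E_n^1$ are both numbers (not quantum operators), so $\langle\psi_m^0|E_n^0|\psi_n^1\rangle = E_n^0\langle\psi_m^0|\psi_n^1\rangle$ and $\langle\psi_m^0|E_n^1|\psi_n^0\rangle = E_n^1\langle\psi_m^0|\psi_n^0\rangle = 0$. Here, we used the fact that when $m \neq n$, the two quantum states are orthogonal and thus $\langle\psi_m^0|\psi_n^0\rangle = 0$.
$$E_m^0\langle\psi_m^0|\psi_n^1\rangle + \langle\psi_m^0|H'|\psi_n^0\rangle = E_n^0\langle\psi_m^0|\psi_n^1\rangle \tag{2.27}$$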
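So
$$\langle\psi_m^0|\psi_n^1\rangle = \frac{\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.28}$$
According to the definition of $|\psi_n^1\rangle$
$$|\psi_n^1\rangle = \sum_{m\neq n} c_m^{(1)}\,|\psi_m^0\rangle \tag{2.29}$$
we have
$$c_m^{(1)} = \langle\psi_m^0|\psi_n^1\rangle = \frac{\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.30}$$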
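And therefore,
$$|\psi_n^1\rangle = \sum_{m\neq n}|\psi_m^0\rangle\,\frac{\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.31}$$
So
$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda\sum_{m\neq n}|\psi_m^0\rangle\,\frac{\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.32}$$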
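The first order results above can be checked numerically on a small matrix. The following is a minimal sketch, not part of the original notes: the function name `first_order` and the 3-level numbers are made up for illustration, and the unperturbed basis is assumed to be the one in which $H_0$ is diagonal, so that `V[m, k]` plays the role of $\langle\psi_m^0|H'|\psi_k^0\rangle$.

```python
import numpy as np

def first_order(E0, V, n):
    """First order corrections for level n (Eqs. 2.23 and 2.30).

    E0 : 1D array of unperturbed, non-degenerate energies E_m^0
    V  : real symmetric matrix with V[m, k] = <psi_m^0| H' |psi_k^0>
    """
    E1 = V[n, n]                              # expectation value of H'
    c1 = np.zeros(len(E0))
    for m in range(len(E0)):
        if m != n:
            c1[m] = V[m, n] / (E0[n] - E0[m])
    return E1, c1

# Illustrative numbers (hypothetical, not taken from the notes).
E0 = np.array([0.0, 1.0, 2.5])
V = np.array([[0.3, 0.2, 0.1],
              [0.2, -0.1, 0.4],
              [0.1, 0.4, 0.2]])
lam, n = 1e-3, 0

E1, c1 = first_order(E0, V, n)
approx_E = E0[n] + lam * E1
approx_state = np.eye(len(E0))[n] + lam * c1   # components in the psi_m^0 basis

# Exact diagonalization for comparison; H0 is diagonal in this basis.
evals, evecs = np.linalg.eigh(np.diag(E0) + lam * V)
exact_E = evals[n]
exact_state = evecs[:, n] / evecs[n, n]        # rescale so the psi_n^0 coefficient is 1

print(abs(exact_E - approx_E))                     # of order lam**2
print(np.max(np.abs(exact_state - approx_state)))  # of order lam**2
```

Dividing the exact eigenvector by its own $|\psi_n^0\rangle$ component reproduces the unnormalized convention of Eq. (2.7), which is what makes the comparison with Eq. (2.32) direct.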
2.2.7. Second order
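$$H_0|\psi_n^2\rangle + H'|\psi_n^1\rangle = E_n^0|\psi_n^2\rangle + E_n^1|\psi_n^1\rangle + E_n^2|\psi_n^0\rangle \tag{2.33}$$
Here, we first compute the energy correction $E_n^2$. This is done by multiplying both sides by $\langle\psi_n^0|$ from the left
$$\langle\psi_n^0|H_0|\psi_n^2\rangle + \langle\psi_n^0|H'|\psi_n^1\rangle = \langle\psi_n^0|E_n^0|\psi_n^2\rangle + \langle\psi_n^0|E_n^1|\psi_n^1\rangle + \langle\psi_n^0|E_n^2|\psi_n^0\rangle \tag{2.34}$$
$$\langle\psi_n^0|E_n^0|\psi_n^2\rangle + \langle\psi_n^0|H'|\psi_n^1\rangle = \langle\psi_n^0|E_n^0|\psi_n^2\rangle + E_n^1\langle\psi_n^0|\psi_n^1\rangle + E_n^2 \tag{2.35}$$
The second term on the r.h.s. is zero, because we required $\langle\psi_n^0|\psi_n^1\rangle = 0$ at the beginning.
$$E_n^2 = \langle\psi_n^0|H'|\psi_n^1\rangle \tag{2.36}$$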
Bottom line: to compute the second order perturbation, we need to know wavefunction at the first order.
This conclusion is in fact generically true. We need wavefunction at lower order to compute energy correction at one order higher.
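$$E_n^2 = \langle\psi_n^0|H'|\psi_n^1\rangle = \sum_{m\neq n}\langle\psi_n^0|H'|\psi_m^0\rangle\,\frac{\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} = \sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.37}$$
$$E_n = E_n^0 + \lambda\langle\psi_n^0|H'|\psi_n^0\rangle + \lambda^2\sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} + O(\lambda^3) \tag{2.38}$$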
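Then we compute the second order correction for the wavefunction, $|\psi_n^2\rangle$. To do that, we multiply both sides of the equation by $\langle\psi_m^0|$, where $m \neq n$
$$\langle\psi_m^0|H_0|\psi_n^2\rangle + \langle\psi_m^0|H'|\psi_n^1\rangle = \langle\psi_m^0|E_n^0|\psi_n^2\rangle + \langle\psi_m^0|E_n^1|\psi_n^1\rangle + \langle\psi_m^0|E_n^2|\psi_n^0\rangle \tag{2.39}$$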
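$$E_m^0\langle\psi_m^0|\psi_n^2\rangle + \langle\psi_m^0|H'|\psi_n^1\rangle = E_n^0\langle\psi_m^0|\psi_n^2\rangle + E_n^1\langle\psi_m^0|\psi_n^1\rangle + E_n^2\langle\psi_m^0|\psi_n^0\rangle \tag{2.40}$$
Since $\langle\psi_m^0|\psi_n^0\rangle = 0$ for $m \neq n$, the last term vanishes and
$$\left(E_n^0 - E_m^0\right)\langle\psi_m^0|\psi_n^2\rangle = \langle\psi_m^0|H'|\psi_n^1\rangle - E_n^1\langle\psi_m^0|\psi_n^1\rangle \tag{2.41}$$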
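$$\begin{aligned}
\langle\psi_m^0|\psi_n^2\rangle &= \frac{\langle\psi_m^0|H'|\psi_n^1\rangle}{E_n^0 - E_m^0} - \frac{E_n^1}{E_n^0 - E_m^0}\langle\psi_m^0|\psi_n^1\rangle \\
&= \sum_{m'\neq n}\frac{\langle\psi_m^0|H'|\psi_{m'}^0\rangle}{E_n^0 - E_m^0}\,\frac{\langle\psi_{m'}^0|H'|\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \sum_{m'\neq n}\frac{\langle\psi_n^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0}\,\langle\psi_m^0|\psi_{m'}^0\rangle\,\frac{\langle\psi_{m'}^0|H'|\psi_n^0\rangle}{E_n^0 - E_{m'}^0} \\
&= \sum_{m'\neq n}\frac{\langle\psi_m^0|H'|\psi_{m'}^0\rangle}{E_n^0 - E_m^0}\,\frac{\langle\psi_{m'}^0|H'|\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \sum_{m'\neq n}\frac{\langle\psi_n^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0}\,\delta_{m,m'}\,\frac{\langle\psi_{m'}^0|H'|\psi_n^0\rangle}{E_n^0 - E_{m'}^0} \\
&= \sum_{m'\neq n}\frac{\langle\psi_m^0|H'|\psi_{m'}^0\rangle}{E_n^0 - E_m^0}\,\frac{\langle\psi_{m'}^0|H'|\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \frac{\langle\psi_n^0|H'|\psi_n^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{\left(E_n^0 - E_m^0\right)^2}
\end{aligned} \tag{2.42}$$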
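According to the definition of $|\psi_n^2\rangle$
$$|\psi_n^2\rangle = \sum_{m\neq n} c_m^{(2)}\,|\psi_m^0\rangle \tag{2.43}$$
we have
$$c_m^{(2)} = \langle\psi_m^0|\psi_n^2\rangle \tag{2.44}$$
and thus
$$|\psi_n^2\rangle = \sum_{m\neq n}\left[\sum_{m'\neq n}\frac{\langle\psi_m^0|H'|\psi_{m'}^0\rangle}{E_n^0 - E_m^0}\,\frac{\langle\psi_{m'}^0|H'|\psi_n^0\rangle}{E_n^0 - E_{m'}^0} - \frac{\langle\psi_n^0|H'|\psi_n^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{\left(E_n^0 - E_m^0\right)^2}\right]|\psi_m^0\rangle \tag{2.45}$$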
2.2.8. Third order
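As for the second order, we can use the same method to show that
$$E_n^3 = \langle\psi_n^0|H'|\psi_n^2\rangle \tag{2.46}$$
So,
$$E_n^3 = \sum_{m\neq n}\sum_{m'\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_{m'}^0\rangle\langle\psi_{m'}^0|H'|\psi_n^0\rangle}{\left(E_n^0 - E_m^0\right)\left(E_n^0 - E_{m'}^0\right)} - \sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_n^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle\langle\psi_n^0|H'|\psi_m^0\rangle}{\left(E_n^0 - E_m^0\right)^2} \tag{2.47}$$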
And one can keep doing this for higher and higher order
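To see the second and third order energy formulas at work, the sketch below evaluates Eqs. (2.37) and (2.47) for a small real symmetric matrix and compares the truncated series with exact diagonalization. The function name `rs_energy_corrections` and all numbers are illustrative assumptions, not taken from the notes.

```python
import numpy as np

def rs_energy_corrections(E0, V, n):
    """Return (E1, E2, E3) for level n from Eqs. (2.23), (2.37), (2.47).

    E0 : unperturbed non-degenerate energies, V[m, k] = <psi_m^0|H'|psi_k^0>.
    """
    others = [m for m in range(len(E0)) if m != n]
    gap = {m: E0[n] - E0[m] for m in others}       # E_n^0 - E_m^0
    E1 = V[n, n]
    E2 = sum(V[n, m] * V[m, n] / gap[m] for m in others)
    E3 = sum(V[n, m] * V[m, mp] * V[mp, n] / (gap[m] * gap[mp])
             for m in others for mp in others)
    E3 -= V[n, n] * sum(V[m, n] * V[n, m] / gap[m] ** 2 for m in others)
    return E1, E2, E3

# Illustrative numbers (hypothetical).
E0 = np.array([0.0, 1.0, 2.5])
V = np.array([[0.3, 0.2, 0.1],
              [0.2, -0.1, 0.4],
              [0.1, 0.4, 0.2]])
lam, n = 0.01, 0

E1, E2, E3 = rs_energy_corrections(E0, V, n)
second = E0[n] + lam * E1 + lam**2 * E2
third = second + lam**3 * E3
exact = np.linalg.eigvalsh(np.diag(E0) + lam * V)[n]

err_second = abs(exact - second)   # dominated by the lam**3 term
err_third = abs(exact - third)     # dominated by the lam**4 term
print(err_second, err_third)
```

Adding the third order term should shrink the error by roughly one more power of λ, which is a quick sanity check on the signs in Eq. (2.47).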
2.2.9. Summary
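For a Hamiltonian
$$H = H_0 + \lambda H' \tag{2.48}$$
assuming that we know all the eigenstates of $H_0$ ($|\psi_n^0\rangle$) and the matrix elements $\langle\psi_{m_1}^0|H'|\psi_{m_2}^0\rangle$ for any two eigenstates of $H_0$ ($|\psi_{m_1}^0\rangle$ and $|\psi_{m_2}^0\rangle$), we can write down the eigenenergies and eigenstates of $H$ as power series in λ
$$E_n = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots = E_n^0 + \lambda\langle\psi_n^0|H'|\psi_n^0\rangle + \lambda^2\sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.49}$$
and
$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda|\psi_n^1\rangle + \lambda^2|\psi_n^2\rangle + \ldots = |\psi_n^0\rangle + \lambda\sum_{m\neq n}|\psi_m^0\rangle\,\frac{\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.50}$$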
2.2.10. Second order perturbation always reduces the energy of the ground state
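One key conclusion from the perturbation theory is that the second order correction always makes the energy of the ground state lower (in comparison to the unperturbed one). This can be seen by looking at $E_n^2$
$$E_n^2 = \sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} = \sum_{m\neq n}\frac{\left|\langle\psi_n^0|H'|\psi_m^0\rangle\right|^2}{E_n^0 - E_m^0} \tag{2.51}$$
In the numerator, $\langle\psi_n^0|H'|\psi_m^0\rangle$ is the complex conjugate of $\langle\psi_m^0|H'|\psi_n^0\rangle$, so the product is $\left|\langle\psi_n^0|H'|\psi_m^0\rangle\right|^2$, which is non-negative
$$\left|\langle\psi_n^0|H'|\psi_m^0\rangle\right|^2 \geq 0 \tag{2.52}$$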
The denominator $E_n^0 - E_m^0 < 0$ if n is the ground state of the unperturbed Hamiltonian (if it is the ground state, then its eigenenergy must be smaller than the eigenenergy of any other state). And therefore
$$\frac{\left|\langle\psi_n^0|H'|\psi_m^0\rangle\right|^2}{E_n^0 - E_m^0} \leq 0 \tag{2.53}$$
So
$$E_n^2 = \sum_{m\neq n}\frac{\left|\langle\psi_n^0|H'|\psi_m^0\rangle\right|^2}{E_n^0 - E_m^0} \leq 0 \tag{2.54}$$
The equal sign only arises when $\langle\psi_n^0|H'|\psi_m^0\rangle = 0$ for ALL $m \neq n$ (if this is the case, we don’t need to do perturbation theory; the first order perturbation becomes exact). As long as we ignore this very special case, we find that $E_n^2 < 0$ for the ground state, regardless of details.
This conclusion is very important in quantum mechanics, because in many systems the first order correction to the ground state energy happens to be zero, $E_n^1 = 0$. There,
$$E_n = E_n^0 + \lambda^2 E_n^2 + \ldots \tag{2.55}$$
The energy correction is dominated by the second order term, which must be negative for the ground state. Without any calculation, we know immediately that
$$E_n < E_n^0 \tag{2.56}$$
In the first homework, we will see that this relation implies that the speed of light in a (linear) medium can only be slower than in vacuum (i.e., if $E_n > E_n^0$, we would violate special relativity).
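The sign argument above does not depend on the details of $H'$, which can be illustrated with random matrices. This is a minimal sketch under stated assumptions: random 5-level spectra and random Hermitian perturbations generated with NumPy (seed and sizes chosen arbitrarily), with `second_order_ground` a hypothetical helper that evaluates Eq. (2.54) for the lowest level.

```python
import numpy as np

rng = np.random.default_rng(0)

def second_order_ground(E0, V):
    """Second order correction, Eq. (2.54), for the lowest unperturbed level."""
    n = int(np.argmin(E0))
    return sum(abs(V[n, m]) ** 2 / (E0[n] - E0[m])
               for m in range(len(E0)) if m != n)

results = []
for _ in range(100):
    E0 = rng.uniform(0.0, 10.0, size=5)            # random non-degenerate spectrum
    A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
    V = (A + A.conj().T) / 2                       # random Hermitian perturbation
    results.append(second_order_ground(E0, V))

print(max(results))   # the largest value found is still negative
```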
2.3. Brillouin-Wigner Perturbation Theory
2.3.1. Negative sides of Rayleigh–Schrödinger perturbation theory
The perturbation theory discussed above is known as Rayleigh–Schrödinger perturbation theory. It is presented in most textbooks. However, this approach has some limitations and is not sufficient in some cases.
1. It is too complicated to go to higher order (e.g. third order or fourth order corrections)
2. The physical meaning is less clear (Why do we need to sum over all other quantum states? How should we think about the sum?)
3. One needs to compute energies and wavefunctions at the same time (if we only want to know the eigenenergy, can we compute only the energy without bothering with the wavefunction?)
One way to resolve these problems: Brillouin-Wigner Perturbation Theory
2.3.2. Brillouin-Wigner Perturbation Theory
Brillouin-Wigner Perturbation Theory considers the same setup, and the final conclusions are exactly the same. However, it has a couple of advantages
1. It offers a nice and simple physical interpretation (a baby version of the Feynman diagrams used in quantum field theory)
2. It is easier to compute higher order corrections (if we want to compute the eigenenergy using a computer, this perturbation theory just needs one very simple iteration)
3. One can compute the energy alone, without worrying about wavefunctions.
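Let’s start from the same setup
$$(H_0 + \lambda H')|\psi_n\rangle = E_n|\psi_n\rangle \tag{2.57}$$
where $|\psi_n\rangle$ represents the same unnormalized eigenstate of $H$
$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m\neq n} c_m|\psi_m^0\rangle \tag{2.58}$$
We can rewrite the equation above as (here $E$ is shorthand for the exact eigenenergy $E_n$)
$$(E - H_0)|\psi_n\rangle = \lambda H'|\psi_n\rangle \tag{2.59}$$
and
$$|\psi_n\rangle = \lambda (E - H_0)^{-1} H'|\psi_n\rangle \tag{2.60}$$
Note: here $(E - H_0)^{-1}$ is a matrix inverse, not the inverse of a number, because $H_0$ is an operator, not a number.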
Q: What is a function of an operator, e.g. $f(Q)$?
A: First, we write down the same function as a function of a number and do a power series expansion
$$f(x) = a_0 + a_1 x + a_2 x^2 + \ldots \tag{2.61}$$
Then $f(Q)$ represents the same power series, but with the number $x$ substituted by the operator $Q$
$$f(Q) = a_0 + a_1 Q + a_2 Q^2 + \ldots \tag{2.62}$$
where $Q^2 = Q\,Q$, etc.
Here, the inverse $(E - H_0)^{-1}$ shall be understood in the same way. Because
$$\frac{1}{E - x} = \frac{1}{E} + \frac{x}{E^2} + \frac{x^2}{E^3} + \ldots \tag{2.63}$$
we know that
$$(E - H_0)^{-1} = \frac{1}{E} + \frac{H_0}{E^2} + \frac{H_0^2}{E^3} + \ldots \tag{2.64}$$
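The power series definition of the operator inverse can be compared with a direct matrix inverse. The sketch below is only an illustration: the 2×2 matrix standing in for $H_0$ and the value of $E$ are made up, and they are chosen so that every eigenvalue of $H_0$ is smaller than $|E|$ in magnitude, which is the condition under which the series in Eq. (2.64) converges.

```python
import numpy as np

# Hypothetical 2x2 Hermitian matrix playing the role of H0, and a number E
# with |E| larger than every eigenvalue of H0 in magnitude.
H0 = np.array([[0.1, 0.2],
               [0.2, -0.3]])
E = 1.0

exact = np.linalg.inv(E * np.eye(2) - H0)

# Partial sum of Eq. (2.64): 1/E + H0/E^2 + H0^2/E^3 + ...
series = sum(np.linalg.matrix_power(H0, k) / E ** (k + 1) for k in range(60))

print(np.max(np.abs(series - exact)))   # close to machine precision
```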
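Now, back to the derivation above:
$$\langle\psi_m^0|\psi_n\rangle = \langle\psi_m^0|\lambda (E - H_0)^{-1} H'|\psi_n\rangle = \frac{\lambda}{E - E_m^0}\langle\psi_m^0|H'|\psi_n\rangle \tag{2.65}$$
Remember that from the definition of $|\psi_n\rangle$
$$|\psi_n\rangle = |\psi_n^0\rangle + \sum_{m\neq n} c_m|\psi_m^0\rangle \tag{2.66}$$
we have
$$\langle\psi_m^0|\psi_n\rangle = c_m \tag{2.67}$$
for any $m \neq n$. And thus, we get
$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda\sum_{m\neq n}|\psi_m^0\rangle\,\frac{1}{E - E_m^0}\,\langle\psi_m^0|H'|\psi_n\rangle \tag{2.68}$$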
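If we define a quantum operator $R$ as
$$R = \sum_{m\neq n}|\psi_m^0\rangle\,\frac{1}{E - E_m^0}\,\langle\psi_m^0| \tag{2.69}$$
we get
$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda R H'|\psi_n\rangle \tag{2.70}$$
Thus,
$$(I - \lambda R H')|\psi_n\rangle = |\psi_n^0\rangle \tag{2.71}$$
where $I$ is the identity operator.
So,
$$|\psi_n\rangle = (I - \lambda R H')^{-1}|\psi_n^0\rangle \tag{2.72}$$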
Again, we emphasize that here $(I - \lambda R H')^{-1}$ represents a matrix inverse. For a matrix inverse, we can use a Taylor expansion to write it out. We know that
$$(1 - a)^{-1} = 1 + a + a^2 + a^3 + \ldots \tag{2.73}$$
So similarly, we have
$$(I - \lambda R H')^{-1} = I + \lambda R H' + \lambda^2 R H' R H' + \lambda^3 R H' R H' R H' + \ldots \tag{2.74}$$
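So we have
$$|\psi_n\rangle = (I - \lambda R H')^{-1}|\psi_n^0\rangle = |\psi_n^0\rangle + \lambda R H'|\psi_n^0\rangle + \lambda^2 R H' R H'|\psi_n^0\rangle + \lambda^3 R H' R H' R H'|\psi_n^0\rangle + \ldots \tag{2.75}$$
So we find that
$$|\psi_n^k\rangle = (R H')^k|\psi_n^0\rangle \tag{2.76}$$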
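Previously, we found that
$$E_n^1 = \langle\psi_n^0|H'|\psi_n^0\rangle \tag{2.77}$$
$$E_n^2 = \langle\psi_n^0|H'|\psi_n^1\rangle \tag{2.78}$$
$$E_n^3 = \langle\psi_n^0|H'|\psi_n^2\rangle \tag{2.79}$$
In fact, we can use the same procedure to show that for the kth order,
$$E_n^k = \langle\psi_n^0|H'|\psi_n^{k-1}\rangle \tag{2.80}$$
Because we have found that $|\psi_n^{k-1}\rangle = (R H')^{k-1}|\psi_n^0\rangle$,
$$E_n^k = \langle\psi_n^0|H'|\psi_n^{k-1}\rangle = \langle\psi_n^0|H'(R H')^{k-1}|\psi_n^0\rangle \tag{2.81}$$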
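So, we have
$$E_n^1 = \langle\psi_n^0|H'|\psi_n^0\rangle \tag{2.82}$$
$$E_n^2 = \langle\psi_n^0|H' R H'|\psi_n^0\rangle = \sum_{m\neq n}\langle\psi_n^0|H'|\psi_m^0\rangle\,\frac{1}{E_n - E_m^0}\,\langle\psi_m^0|H'|\psi_n^0\rangle \tag{2.83}$$
$$E_n^3 = \langle\psi_n^0|H' R H' R H'|\psi_n^0\rangle = \sum_{m\neq n}\sum_{m'\neq n}\langle\psi_n^0|H'|\psi_m^0\rangle\,\frac{1}{E_n - E_m^0}\,\langle\psi_m^0|H'|\psi_{m'}^0\rangle\,\frac{1}{E_n - E_{m'}^0}\,\langle\psi_{m'}^0|H'|\psi_n^0\rangle \tag{2.84}$$
$$\ldots \tag{2.85}$$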
From these formulas we see a pattern.
1. For any $E_n^k$, if we look at the formula from right to left, one always starts from the unperturbed state $|\psi_n^0\rangle$ and eventually goes back to the same state $|\psi_n^0\rangle$.
2. In the path from $|\psi_n^0\rangle$ to $|\psi_n^0\rangle$, we go through several intermediate states $|\psi_m^0\rangle, |\psi_{m'}^0\rangle, \ldots$. For the kth order perturbation, we have k - 1 intermediate states.
3. To go from one state to another along the path (e.g. from n to m’ or from m’ to m in $E_n^3$), we use the perturbation $H'$.
4. For each intermediate state, we have a factor $\frac{1}{E_n - E_m^0}$.
2.3.3. Diagrammatic representation
We can represent $E_n^k$ using diagrams.
1. For each intermediate state, we represent $\frac{1}{E_n - E_m^0}$ as a solid line with the integer m labeling the state.
2. For each $\langle\psi_m^0|H'|\psi_{m'}^0\rangle$, we represent it as a dot, and we use $V_{m m'}$ to represent $\langle\psi_m^0|H'|\psi_{m'}^0\rangle$.
3. Connect everything together in the same order as in $E_n^k$.
4. At the two ends of the line, we use two short lines to indicate that we start from and end at the same state $|\psi_n^0\rangle$.
[Figure: diagrams for the first order, second order, and third order terms]
By making the line longer, we can write down easily perturbation terms to any order.
Relations to QFT:
In QFT, we use very similar diagrams, known as Feynman diagrams. There, solid lines are the propagator of a particle, $\frac{1}{\omega - \epsilon_0}$, where ω is the frequency (pretty much the same as the energy $E_n$) and $\epsilon_0$ is the unperturbed energy of the particle (the energy ignoring interactions between particles). In fact, the diagrams we show here are baby versions of Feynman diagrams.
Physics meaning discussed in class: (example: two electrons exchange photons to get E&M interactions).
2.3.4. How to compute the energy using Brillouin-Wigner Perturbation Theory?
First, let’s define an abbreviation to make the formulas shorter,
$$V_{ij} = \langle\psi_i^0|H'|\psi_j^0\rangle \tag{2.86}$$
and thus
$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n - E_m^0} + \lambda^3\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n - E_m^0\right)\left(E_n - E_{m'}^0\right)} + \lambda^4\frac{V_{nm}V_{mm'}V_{m'm''}V_{m''n}}{\left(E_n - E_m^0\right)\left(E_n - E_{m'}^0\right)\left(E_n - E_{m''}^0\right)} + \ldots \tag{2.87}$$
Here all the m, m', m'' are summed over, but they cannot be the same as n. It may look like we can find $E_n$ directly from this formula, but that is not quite the case yet. This is because on the r.h.s. the denominators also contain $E_n$, i.e. it is an equation for $E_n$, and $E_n$ arises on both sides.
This equation can be solved easily using an iterative method (e.g. using a computer code). One starts from the zeroth order, and then goes to first, second, third order, and so on. Every time we need $E_n$ in the kth order calculation, we just use the (k - 1)th order $E_n$ on the r.h.s. Here is how it is done
First run
$$E_n^{(1)} = E_n^0 + \lambda V_{nn} \tag{2.88}$$
Second run
$$E_n^{(2)} = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^{(1)} - E_m^0} \tag{2.89}$$
Third run
$$E_n^{(3)} = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^{(2)} - E_m^0} + \lambda^3\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n^{(2)} - E_m^0\right)\left(E_n^{(2)} - E_{m'}^0\right)} \tag{2.90}$$
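Fourth run
$$E_n^{(4)} = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^{(3)} - E_m^0} + \lambda^3\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n^{(3)} - E_m^0\right)\left(E_n^{(3)} - E_{m'}^0\right)} + \lambda^4\frac{V_{nm}V_{mm'}V_{m'm''}V_{m''n}}{\left(E_n^{(3)} - E_m^0\right)\left(E_n^{(3)} - E_{m'}^0\right)\left(E_n^{(3)} - E_{m''}^0\right)} \tag{2.91}$$
...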
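The runs above can be sketched in a few lines of code. This is an illustrative implementation under stated assumptions, not the code used in the course: the function name `brillouin_wigner_energy` is hypothetical, $H_0$ is assumed diagonal in the chosen basis, `V[m, k]` stands for $V_{mk}$, and the numbers are made up. Run k keeps terms up to $\lambda^k$ and uses the energy from run k - 1 in every denominator.

```python
import numpy as np

def brillouin_wigner_energy(E0, V, n, lam, runs):
    """Iterate Eq. (2.87) as in Eqs. (2.88)-(2.91)."""
    N = len(E0)
    others = np.arange(N) != n                 # intermediate states exclude n
    E = E0[n] + lam * V[n, n]                  # first run, Eq. (2.88)
    for k in range(2, runs + 1):
        R = np.zeros(N)
        R[others] = 1.0 / (E - E0[others])     # diagonal of R, built from the previous E
        vec = lam * V[:, n]                    # lam * H'|psi_n^0>
        total = E0[n] + lam * V[n, n]
        for order in range(2, k + 1):
            vec = lam * (V @ (R * vec))        # apply lam * H' R once more
            total += vec[n]                    # <psi_n^0| ... |psi_n^0> at this order
        E = total
    return E

# Illustrative numbers (hypothetical).
E0 = np.array([0.0, 1.0, 2.5])
V = np.array([[0.3, 0.2, 0.1],
              [0.2, -0.1, 0.4],
              [0.1, 0.4, 0.2]])
lam, n = 0.1, 0

bw = brillouin_wigner_energy(E0, V, n, lam, runs=15)
exact = np.linalg.eigvalsh(np.diag(E0) + lam * V)[n]
print(bw, exact)
```

For this weak perturbation the iteration settles onto the exact lowest eigenvalue after a handful of runs; for larger λ or small energy gaps the series may converge slowly or not at all.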
Another option is using analytic methods, as will be discussed below.
2.3.5. Preparation
Consider the following function
$$f(x) = x^2 g(x) = x^2\left(1 + a x + b x^2 + c x^3 + \ldots\right) \tag{2.92}$$
If we want to keep $f(x)$ to $O(x^n)$, we only need to keep $g(x)$ to $O(x^{n-2})$. Similarly, for the following function
$$f(x) = \frac{x^2}{g(x)} = \frac{x^2}{1 + a x + b x^2 + c x^3 + \ldots} \tag{2.93}$$
If we want to keep $f(x)$ to $O(x^n)$, we only need to keep $g(x)$ to $O(x^{n-2})$. This will be useful for us later.
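2.3.6. Iterative method
$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n - E_m^0} + \lambda^3\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n - E_m^0\right)\left(E_n - E_{m'}^0\right)} + \lambda^4\frac{V_{nm}V_{mm'}V_{m'm''}V_{m''n}}{\left(E_n - E_m^0\right)\left(E_n - E_{m'}^0\right)\left(E_n - E_{m''}^0\right)} + \ldots \tag{2.94}$$
Zeroth order (no $E_n$ on the r.h.s., so job done)
$$E_n = E_n^0 + O(\lambda) \tag{2.95}$$
First order (no $E_n$ on the r.h.s., so job done)
$$E_n = E_n^0 + \lambda V_{nn} + O(\lambda^2) \tag{2.96}$$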
Second order: we use the $E_n$ obtained at zeroth order for the $\lambda^2$ term
$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n - E_m^0} + O(\lambda^3) = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^0 - E_m^0} + O(\lambda^3) \tag{2.97}$$
This is because the third term on the r.h.s. already has a $\lambda^2$ prefactor. Thus to keep to $O(\lambda^2)$, we only need to keep the denominator to $O(\lambda^0)$.
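Third order,
$$\begin{aligned}
E_n &= E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n - E_m^0} + \lambda^3\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n - E_m^0\right)\left(E_n - E_{m'}^0\right)} + O(\lambda^4) \\
&= E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^0 + \lambda V_{nn} - E_m^0} + \lambda^3\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n^0 - E_m^0\right)\left(E_n^0 - E_{m'}^0\right)} + O(\lambda^4)
\end{aligned} \tag{2.98}$$
In the $\lambda^2$ term, we now need to keep $E_n$ to $O(\lambda)$. In the $\lambda^3$ term, we just need to keep $E_n$ to the zeroth order.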
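For the $\lambda^2$ term, we can expand it for small λ
$$\lambda^2\frac{V_{nm}V_{mn}}{E_n^0 + \lambda V_{nn} - E_m^0} = \lambda^2\frac{V_{nm}V_{mn}}{E_n^0 - E_m^0} - \lambda^3\frac{V_{nm}V_{mn}}{E_n^0 - E_m^0}\,\frac{V_{nn}}{E_n^0 - E_m^0} + \ldots \tag{2.99}$$
So
$$E_n = E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^0 - E_m^0} + \lambda^3\left[\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n^0 - E_m^0\right)\left(E_n^0 - E_{m'}^0\right)} - \frac{V_{nm}V_{mn}V_{nn}}{\left(E_n^0 - E_m^0\right)^2}\right] + O(\lambda^4) \tag{2.100}$$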
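Fourth order,
$$\begin{aligned}
E_n = {} & E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm'}V_{m'n}}{E_n^0 - E_{m'}^0} - E_m^0} \\
& + \lambda^3\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n^0 + \lambda V_{nn} - E_m^0\right)\left(E_n^0 + \lambda V_{nn} - E_{m'}^0\right)} + \lambda^4\frac{V_{nm}V_{mm'}V_{m'm''}V_{m''n}}{\left(E_n^0 - E_m^0\right)\left(E_n^0 - E_{m'}^0\right)\left(E_n^0 - E_{m''}^0\right)} + O(\lambda^5)
\end{aligned} \tag{2.101}$$
$$\begin{aligned}
E_n = {} & E_n^0 + \lambda V_{nn} + \lambda^2\frac{V_{nm}V_{mn}}{E_n^0 - E_m^0} + \lambda^3\left[\frac{V_{nm}V_{mm'}V_{m'n}}{\left(E_n^0 - E_m^0\right)\left(E_n^0 - E_{m'}^0\right)} - \frac{V_{nm}V_{mn}V_{nn}}{\left(E_n^0 - E_m^0\right)^2}\right] \\
& + \lambda^4\left[\frac{V_{nm}V_{mm'}V_{m'm''}V_{m''n}}{\left(E_n^0 - E_m^0\right)\left(E_n^0 - E_{m'}^0\right)\left(E_n^0 - E_{m''}^0\right)} - \frac{V_{nm}V_{mn}}{\left(E_n^0 - E_m^0\right)^2}\,\frac{V_{nm'}V_{m'n}}{E_n^0 - E_{m'}^0} + \frac{V_{nm}V_{mn}}{\left(E_n^0 - E_m^0\right)^3}V_{nn}^2 - 2\,\frac{V_{nm}V_{mm'}V_{m'n}V_{nn}}{\left(E_n^0 - E_m^0\right)^2\left(E_n^0 - E_{m'}^0\right)}\right] + O(\lambda^5)
\end{aligned} \tag{2.102}$$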
2.4. Degenerate Perturbation Theory
In the previous section, we studied the effect of a small perturbation $\lambda H'$ on an eigenstate of $H_0$, $|\psi_n^0\rangle$. The key assumption there is that before we turn on the perturbation (i.e. at λ = 0), the eigenenergies of all other eigenstates of $H_0$ are very far away from $E_n^0$
$$\left|E_n^0 - E_m^0\right| \gg \left|\langle\psi_i^0|\lambda H'|\psi_j^0\rangle\right| \tag{2.103}$$
In this section, we will consider the opposite situation, where there is at least one other eigenstate of $H_0$ which has the same eigenenergy as $|\psi_n^0\rangle$. Two states having the same eigenenergy is known as “degeneracy”. So this perturbation theory is known as degenerate perturbation theory.
2.4.1. Why non-degenerate perturbation theory fails in the presence of degeneracy?
In the presence of degeneracy, the perturbation theory that we learned before will fail. To see this, we just need to look at the second order perturbation of the eigenenergy
$$E_n = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \ldots = E_n^0 + \lambda\langle\psi_n^0|H'|\psi_n^0\rangle + \lambda^2\sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} + \ldots \tag{2.104}$$
Here, we focus on the second order correction:
$$\lambda^2\sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.105}$$
If $H_0$ has another eigenstate $|\psi_{n'}^0\rangle$ with the same eigenenergy, at least one term in this sum will have zero in the denominator and thus will diverge, i.e., when $E_n^0 = E_{n'}^0$, $\frac{1}{E_n^0 - E_{n'}^0} \to \infty$, and thus the theory becomes ill-defined.
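The divergence can be made concrete with a two-level toy model in which the unperturbed gap is shrunk towards zero. This is a sketch with made-up numbers (λ is absorbed into `V`, and `second_order_energy` is a hypothetical helper evaluating Eq. (2.105)); the exact shift from diagonalization is shown next to the second order formula.

```python
import numpy as np

def second_order_energy(E0, V, n):
    """Second order correction of Eq. (2.105), with lam absorbed into V."""
    return sum(abs(V[n, m]) ** 2 / (E0[n] - E0[m])
               for m in range(len(E0)) if m != n)

V = np.array([[0.0, 0.1],
              [0.1, 0.0]])                 # fixed coupling between the two levels
gaps = [1.0, 1e-2, 1e-4, 1e-6]             # unperturbed gap E_1^0 - E_0^0

E2_values = []
exact_shifts = []
for g in gaps:
    E0 = np.array([0.0, g])
    E2_values.append(second_order_energy(E0, V, 0))
    exact_shifts.append(np.linalg.eigvalsh(np.diag(E0) + V)[0] - E0[0])

for g, e2, ex in zip(gaps, E2_values, exact_shifts):
    print(g, e2, ex)
```

The second order formula grows like the inverse of the gap, while the exact shift of the lower level stays bounded by the size of the coupling, so the blow-up is a failure of the expansion rather than of the physics.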
NOTE: the same divergence will also arise in higher order corrections. But there is no divergence in the first order correction $E_n^1$.
In power series expansions, an infinite coefficient doesn’t always mean a singularity. It can mean that we missed something in a lower order correction.
Here is a simple example: Let’s consider a function $f(x)$, which can be written as the following Taylor expansion at small x
$$f(x) = a_0 + a_1 x + a_2 x^2 + \ldots \tag{2.106}$$
Now, assume that I made a mistake in the Taylor expansion for the coefficient $a_1$. Instead of the correct value, $a_1$, I used a wrong coefficient for the linear term, say $b_1$.
$$f(x) = a_0 + b_1 x + (a_1 - b_1)\,x + a_2 x^2 + \ldots \tag{2.107}$$
In other words, here I missed part of the linear term, $(a_1 - b_1)\,x$, and thus the coefficients of the higher order terms need to be adjusted to absorb this mistake. Let’s try to use the $x^2$ term to correct this error, i.e.
$$f(x) = a_0 + b_1 x + \left(\frac{a_1 - b_1}{x} + a_2\right)x^2 + \ldots \tag{2.108}$$
Let me define $b_2 = a_2 + \frac{a_1 - b_1}{x}$
$$f(x) = a_0 + b_1 x + b_2 x^2 + \ldots \tag{2.109}$$
Now, once again, I wrote my function as a power series. Because I used a wrong coefficient for the linear term, $b_1$, my second order term needs to use the new coefficient $b_2$, which is infinite at small x. This is transparent if we notice that when $x \to 0$
$$b_2 = a_2 + \frac{a_1 - b_1}{x} \to \infty \tag{2.110}$$
Bottom line: an infinite coefficient in the second order term (and higher order terms) means that the first order result is incorrect and needs to be revised.
2.4.2. What to do?
Here, let’s first take another look at the second order correction
$$E_n^2 = \sum_{m\neq n}\frac{\langle\psi_n^0|H'|\psi_m^0\rangle\langle\psi_m^0|H'|\psi_n^0\rangle}{E_n^0 - E_m^0} \tag{2.111}$$
As we know, the problem arises because $E_n^0 = E_m^0$ for certain m, and thus we get $1/0 = \infty$. To avoid this singularity, the only thing that we need to do is to require that the numerator also vanishes whenever the denominator is zero, i.e., if $E_n^0 = E_m^0$, we must make sure that $\langle\psi_m^0|H'|\psi_n^0\rangle = 0$.
NOTE: the two factors in the numerator are complex conjugates of each other: $\langle\psi_n^0|H'|\psi_m^0\rangle = \langle\psi_m^0|H'|\psi_n^0\rangle^*$, and thus if one of them is zero, the other is also zero.
Bottom line: For degenerate states, before we start the procedure described in the non-degenerate perturbation theory, we need to first make sure that for any pair of degenerate states, $\langle\psi_m^0|H'|\psi_n^0\rangle = 0$
2.4.3. Whenever there is a degeneracy, we have an option to choose the basis
A good example is a free particle. Consider a free particle with mass m.
$$H_0 = \frac{p^2}{2m} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} \tag{2.112}$$
The eigenstates of $H_0$ arise in pairs (i.e. there is a degeneracy for any excited state). The static Schrodinger equation here is
$$-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}\psi(x) = E\,\psi(x) \tag{2.113}$$
It is a second order differential equation and we know the solutions are just plane waves
$$\psi = A\,e^{ikx} + B\,e^{-ikx} \tag{2.114}$$
The eigenenergy for this state is $E = p^2/2m = (\hbar k)^2/2m$, i.e. the kinetic energy. Here, A and B are two arbitrary coefficients.
For each fixed k, we have one eigenenergy $E = (\hbar k)^2/2m$, but an infinite number of eigenstates $\psi = A\,e^{ikx} + B\,e^{-ikx}$, i.e. a degeneracy. This example is known as a two-fold degeneracy, or we say that two states have the same energy. The reason we say “two states” here is that not all the eigenstates are linearly independent. In fact, we just need two states, $e^{ikx}$ and $e^{-ikx}$; all other eigenstates can be written as linear superpositions of these two. Bottom line: two-fold degeneracy means that any linear combination of these two states is an eigenstate of $H_0$ with the same eigenenergy.
Now, let’s look at the same second order differential equation again.
$$-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}\psi(x) = E\,\psi(x) \tag{2.115}$$
We know that we can also write the solution of this equation as
$$\psi = C\cos kx + D\sin kx \tag{2.116}$$
i.e., instead of using exponentials, we can use sin and cos functions to represent plane waves. Here, once again we find an infinite number of eigenstates with the same eigenenergy, and once again, they are not all independent. We just need two states, $\cos kx$ and $\sin kx$, and all other eigenstates are just linear superpositions of these two. So, again, we reach the same conclusion: the system has a two-fold degeneracy. But earlier we said that the two states are $e^{ikx}$ and $e^{-ikx}$, while now for the two degenerate states we use $\cos kx$ and $\sin kx$.
These different choices are just different bases to represent all the eigenstates. We can choose to use $e^{ikx}$ and $e^{-ikx}$, or $\cos kx$ and $\sin kx$. There is no difference between them. In fact, we can choose any two linearly independent states
$$\psi_1 = A_1 e^{ikx} + B_1 e^{-ikx} \tag{2.117}$$
$$\psi_2 = A_2 e^{ikx} + B_2 e^{-ikx} \tag{2.118}$$
Then we can say that we have two degenerate states $\psi_1(x)$ and $\psi_2(x)$, and we can represent any other eigenstate (with the same eigenenergy) as
$$\psi(x) = X\,\psi_1(x) + Y\,\psi_2(x) \tag{2.119}$$
Q: Why do we usually use $e^{ikx}$ and $e^{-ikx}$, or $\cos kx$ and $\sin kx$? Why not use $e^{ikx}$ and $\cos kx$?
A: There is no problem (mathematically) if we choose to use $e^{ikx}$ and $\cos kx$. However, for convenience, it is usually better to use orthonormal bases
$$\int dx\,\left(e^{ik_1x}\right)^* e^{ik_2x} = 2\pi\,\delta(k_1 - k_2) \tag{2.120}$$
$$\int dx\,(\cos kx)^*\sin kx = 0 \tag{2.121}$$
Q: Why do we use $e^{ikx}$ and $e^{-ikx}$ more often than $\cos kx$ and $\sin kx$ in quantum mechanics?
A: For $H_0$, there is little difference between the two choices. However, if we consider other quantum operators, like momentum, $e^{ikx}$ and $e^{-ikx}$ are a better choice. This is because $e^{ikx}$ and $e^{-ikx}$ are eigenstates of the momentum operator too! So they have not only well-defined energy, but also well-defined momenta ($\hbar k$ and $-\hbar k$ respectively). $\cos kx$ and $\sin kx$ don’t have well-defined momentum. They have a 50% chance of having momentum $\hbar k$ and a 50% chance of having momentum $-\hbar k$.
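The 50% statement can be read off from Euler's formula. The short worked step below is added for illustration; it uses only the plane-wave form of Eq. (2.114), with $A$ and $B$ the coefficients of $e^{ikx}$ and $e^{-ikx}$, and takes the relative weights $|A|^2$ and $|B|^2$ as the relative probabilities of the two momenta.

```latex
\cos kx = \tfrac{1}{2}\,e^{ikx} + \tfrac{1}{2}\,e^{-ikx},
\qquad
\sin kx = \tfrac{1}{2i}\,e^{ikx} - \tfrac{1}{2i}\,e^{-ikx}

% In both cases |A| = |B| = 1/2, so the relative weights are equal:
P(+\hbar k) = \frac{|A|^2}{|A|^2 + |B|^2} = \frac{1}{2},
\qquad
P(-\hbar k) = \frac{|B|^2}{|A|^2 + |B|^2} = \frac{1}{2}
```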
Bottom line: when you cannot decide which choice of basis is better, look at another quantum operator.
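These conclusions are true generically. Let’s start from a two-fold degeneracy. Assume that for $H_0$, there are two degenerate eigenstates:
$$H_0|\psi_a^0\rangle = E^0|\psi_a^0\rangle \tag{2.122}$$
and
$$H_0|\psi_b^0\rangle = E^0|\psi_b^0\rangle \tag{2.123}$$
We assume that these two states are orthogonal to each other (otherwise, we make them orthogonal, using the Gram-Schmidt procedure). We assume that $\langle\psi_a^0|H'|\psi_b^0\rangle \neq 0$. Here we first prove a fact: if $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$ are both eigenstates of $H_0$ and they have the same eigenvalue $E^0$, then any linear superposition of $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$ is also an eigenstate of $H_0$ with the same eigenenergy. Let's define $|\psi^0\rangle = \alpha|\psi_a^0\rangle + \beta|\psi_b^0\rangle$
$$H_0|\psi^0\rangle = H_0\left(\alpha|\psi_a^0\rangle + \beta|\psi_b^0\rangle\right) = \alpha H_0|\psi_a^0\rangle + \beta H_0|\psi_b^0\rangle = \alpha E^0|\psi_a^0\rangle + \beta E^0|\psi_b^0\rangle = E^0\left(\alpha|\psi_a^0\rangle + \beta|\psi_b^0\rangle\right) = E^0|\psi^0\rangle \tag{2.124}$$
Bottom line: if $H_0$ has two degenerate eigenstates $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, we have infinitely many eigenstates with the same eigenvalue, $|\psi^0\rangle = \alpha|\psi_a^0\rangle + \beta|\psi_b^0\rangle$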
To represent these infinitely many eigenstates, we need to choose two states as a basis, e.g. $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$. Then, any eigenstate with eigenenergy $E^0$ can be written as a superposition of them. When we have one basis, we know that we can choose another one (i.e. we can change to a different basis): for example, we can define $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$, where
$$|\psi_1^0\rangle = \alpha_1|\psi_a^0\rangle + \beta_1|\psi_b^0\rangle \tag{2.125}$$
$$|\psi_2^0\rangle = \alpha_2|\psi_a^0\rangle + \beta_2|\psi_b^0\rangle \tag{2.126}$$
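Here, we require $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ to be orthogonal to each other (otherwise, we make them orthogonal, using the Gram-Schmidt procedure)
$$\langle\psi_1^0|\psi_2^0\rangle = \left(\langle\psi_a^0|\alpha_1^* + \langle\psi_b^0|\beta_1^*\right)\left(\alpha_2|\psi_a^0\rangle + \beta_2|\psi_b^0\rangle\right) = \alpha_1^*\alpha_2 + \beta_1^*\beta_2 = 0 \tag{2.127}$$
and we assume that $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ are normalized
$$\langle\psi_1^0|\psi_1^0\rangle = \left(\langle\psi_a^0|\alpha_1^* + \langle\psi_b^0|\beta_1^*\right)\left(\alpha_1|\psi_a^0\rangle + \beta_1|\psi_b^0\rangle\right) = \alpha_1^*\alpha_1 + \beta_1^*\beta_1 = |\alpha_1|^2 + |\beta_1|^2 = 1 \tag{2.128}$$
$$\langle\psi_2^0|\psi_2^0\rangle = \left(\langle\psi_a^0|\alpha_2^* + \langle\psi_b^0|\beta_2^*\right)\left(\alpha_2|\psi_a^0\rangle + \beta_2|\psi_b^0\rangle\right) = \alpha_2^*\alpha_2 + \beta_2^*\beta_2 = |\alpha_2|^2 + |\beta_2|^2 = 1 \tag{2.129}$$
Bottom line: instead of our old states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, we can use $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ as our basis.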
2.4.4. Which basis shall we use?
As mentioned above, in general, $\langle\psi_a^0|H'|\psi_b^0\rangle \neq 0$. If so, the naive perturbation theory will have a singularity at the second order. Now, we have learned that we can choose to use a different set of unperturbed eigenstates $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$, so can we make $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$? If so, it will save the day and get rid of the singularity. The way to do it is very simple. As we learned earlier, if we cannot decide which basis to use, we shall look at another quantum operator. Here we do have one more quantum operator, which is $H'$.
We first write down a 2×2 matrix
W =ψa
0 H ' ψa0 ψa
0 H ' ψb0
ψb0 H ' ψa
0 ψb0 H ' ψb
0(2.130)
To make the formula shorter, we define
Wij = ψi0 H ' ψ j
0 (2.131)
So
W = Waa Wab
Wba Wbb (2.132)
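1. W is a Hermitian matrix ($W^\dagger = W$), so its eigenvalues are real
This is pretty straightforward to prove, because $\langle\psi_1|X|\psi_2\rangle^* = \langle\psi_2|X^\dagger|\psi_1\rangle$. In particular, if $X$ is a Hermitian operator (the quantum operator of any physical observable is Hermitian), we have $X = X^\dagger$ and thus $\langle\psi_1|X|\psi_2\rangle^* = \langle\psi_2|X|\psi_1\rangle$. Thus it is easy to see that
$$W_{aa} = W_{aa}^* \quad\text{and}\quad W_{bb} = W_{bb}^* \quad\text{and}\quad W_{ba} = W_{ab}^* \tag{2.133}$$
$$W^\dagger = (W^*)^T = \begin{pmatrix} W_{aa}^* & W_{ba}^* \\ W_{ab}^* & W_{bb}^* \end{pmatrix} = \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} = W \tag{2.134}$$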
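2. W has two eigenvalues $E_+$ and $E_-$, and each of them has an eigenvector, $\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix}$ for $E_+$ and $\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix}$ for $E_-$
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \tag{2.135}$$
and
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.136}$$
where $E_+$ and $E_-$ are the two eigenvalues.
3. We can use these two eigenvectors to define our $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ as
$$|\psi_1^0\rangle = \alpha_1|\psi_a^0\rangle + \beta_1|\psi_b^0\rangle \tag{2.137}$$
$$|\psi_2^0\rangle = \alpha_2|\psi_a^0\rangle + \beta_2|\psi_b^0\rangle \tag{2.138}$$
As will be shown below, these two states are precisely what we should use for the perturbation theory.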
4. The two eigenvalues are the first order corrections to the eigenenergy
$$E_1 = E^0 + \lambda E_+ + O(\lambda^2) \tag{2.139}$$
$$E_2 = E^0 + \lambda E_- + O(\lambda^2) \tag{2.140}$$
Q: Why do we have two eigenenergies here?
A: Because we started from two degenerate eigenstates. At $\lambda = 0$, the two states $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ have the same energy. Now, if we turn on a small perturbation $\lambda H'$, we find that these two states (in general) have different eigenenergies. One of them is $E_1 = E^0 + \lambda E_+ + O(\lambda^2)$ and the other is $E_2 = E^0 + \lambda E_- + O(\lambda^2)$.
NOTE #1: We say that the perturbation “lifted the degeneracy”.
NOTE #2: After we lift the degeneracy, $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ no longer have the same energy. If the perturbation is small enough, we can now do non-degenerate perturbation theory, i.e. problem solved.
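The recipe of this section (build W, diagonalize it, use the eigenvectors as the new unperturbed states) can be sketched numerically. The following is a minimal illustration, assuming NumPy is available; the entries of `W` and the values of `E0` and `lam` are made-up numbers chosen only for the example, not values from these notes.

```python
import numpy as np

# Hypothetical matrix elements W_ij = <psi_i^0|H'|psi_j^0> in the basis (a, b).
W = np.array([[0.30, 0.10 - 0.20j],
              [0.10 + 0.20j, -0.10]])
assert np.allclose(W, W.conj().T), "W must be Hermitian"

# eigh returns real eigenvalues in ascending order; eigenvectors are the columns.
evals, evecs = np.linalg.eigh(W)
E_minus, E_plus = evals
v_plus, v_minus = evecs[:, 1], evecs[:, 0]   # (alpha1, beta1) and (alpha2, beta2)

# <psi_1^0|H'|psi_2^0> = (alpha1*, beta1*) W (alpha2, beta2)^T; vdot conjugates its first argument.
off_diagonal = np.vdot(v_plus, W @ v_minus)
# <psi_1^0|H'|psi_1^0>, i.e. the first order energy correction of psi_1^0.
diag_plus = np.vdot(v_plus, W @ v_plus).real

E0, lam = 1.0, 0.01   # hypothetical unperturbed energy and small parameter
E1 = E0 + lam * E_plus
E2 = E0 + lam * E_minus
print(abs(off_diagonal), diag_plus, E_plus, E1, E2)
```

In this sketch the off-diagonal element in the new basis vanishes to machine precision and the diagonal element reproduces $E_+$, which is what items 3 and 4 above claim.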
2.4.5. A key conclusion: in quantum mechanics, perturbations will in general lift all degeneracy, unless there is a reason saying that the degeneracy shall not be lifted.
In general, in the study of quantum physics, we can never include all terms of the Hamiltonian in our theoretical calculation. We always need some approximations (i.e. we drop some small/unimportant part of the Hamiltonian). For example, in the study of a hydrogen atom, we ignored relativistic effects. We also ignored the magnetic interaction between the electron and the nucleus (remember that both particles have spin. Whenever a charged particle has spin, there is a magnetic dipole. Because both the electron and the proton have magnetic dipoles, there should be a dipole-dipole interaction between them, which was ignored). In addition, we also ignored the Earth's magnetic field, which is always present when we do an experiment (unless we screen it out using some special devices).
Let’s use $H_{\text{real}}$ to represent the full Hamiltonian of a real system and $H_{\text{model}}$ to represent the Hamiltonian that we use to theoretically analyze the system. We know that these two are not the same, because we always need some approximation to simplify a real problem, i.e.
$$H_{\text{real}} = H_{\text{model}} + \delta H \tag{2.141}$$
We can treat $\delta H$ as a perturbation.
Now here comes the question: if we find that two (or more) states have the same eigenenergy (degeneracy) using $H_{\text{model}}$, are these states really degenerate in the real system? The general answer is no (unless there is a reason), because our first order degenerate perturbation theory tells us that any small perturbation will in general lift the degeneracy.
The only exception is when there is a reason (usually based on symmetry) telling us that $W_{aa}$ is exactly the same as $W_{bb}$ and $W_{ab} = W_{ba} = 0$ precisely. (In a real physical system, in most cases, we cannot say that the value of a quantity is precisely some number. What we really mean is that there is an argument showing that the difference between $W_{aa}$ and $W_{bb}$ is unmeasurably small and that $W_{ab}$ and $W_{ba}$ are unmeasurably small.)
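2.4.6. Prove: $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$
Here the proof contains two steps: first
$$\langle\psi_1^0|H'|\psi_2^0\rangle = \begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.142}$$
and then we will show
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = 0 \tag{2.143}$$
The first step is very straightforward (it follows from the definition of $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$)
$$|\psi_1^0\rangle = \alpha_1|\psi_a^0\rangle + \beta_1|\psi_b^0\rangle \tag{2.144}$$
$$|\psi_2^0\rangle = \alpha_2|\psi_a^0\rangle + \beta_2|\psi_b^0\rangle \tag{2.145}$$
so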
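$$\begin{aligned}\langle\psi_1^0|H'|\psi_2^0\rangle &= \left(\langle\psi_a^0|\alpha_1^* + \langle\psi_b^0|\beta_1^*\right)H'\left(\alpha_2|\psi_a^0\rangle + \beta_2|\psi_b^0\rangle\right) = \left(\langle\psi_a^0|\alpha_1^* + \langle\psi_b^0|\beta_1^*\right)\left(\alpha_2 H'|\psi_a^0\rangle + \beta_2 H'|\psi_b^0\rangle\right) \\ &= \alpha_1^*\alpha_2\langle\psi_a^0|H'|\psi_a^0\rangle + \alpha_1^*\beta_2\langle\psi_a^0|H'|\psi_b^0\rangle + \beta_1^*\alpha_2\langle\psi_b^0|H'|\psi_a^0\rangle + \beta_1^*\beta_2\langle\psi_b^0|H'|\psi_b^0\rangle\end{aligned} \tag{2.146}$$
The r.h.s. of this equation is in fact exactly the same as the matrix product, if we remember the definition of the W matrix, $W_{ij} = \langle\psi_i^0|H'|\psi_j^0\rangle$:
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = \alpha_1^*\alpha_2 W_{aa} + \alpha_1^*\beta_2 W_{ab} + \beta_1^*\alpha_2 W_{ba} + \beta_1^*\beta_2 W_{bb} \tag{2.147}$$
So, we proved that $\langle\psi_1^0|H'|\psi_2^0\rangle = \begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix}$.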
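For the second step, we first consider the situation where $E_+ \neq E_-$. According to the eigenequations, we have
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.148}$$
If we multiply both sides from the left by $(\alpha_1^*, \beta_1^*)$, we get
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = \begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}E_-\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.149}$$
Similarly, if we start from the other eigenequation
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \tag{2.150}$$
we get
$$\begin{pmatrix}\alpha_2^* & \beta_2^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = \begin{pmatrix}\alpha_2^* & \beta_2^*\end{pmatrix}E_+\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_2^* & \beta_2^*\end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \tag{2.151}$$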
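If we take the Hermitian conjugate of both sides
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_+\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.152}$$
Here we used the fact that W is Hermitian, so $W^\dagger = W$ and $E_+$ is real. Notice that we have shown
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_+\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.153}$$
and
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.154}$$
If $E_+ \neq E_-$, the only way that these two equations can both be valid is that
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_+\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = 0 \tag{2.155}$$
So we proved that $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$.
Q: What will happen if $E_+ = E_-$?
A: It turns out that this is the simple case. If $E_+ = E_-$, as will be shown below, $W_{ab} = \langle\psi_a^0|H'|\psi_b^0\rangle = 0$. So, there is no divergence from the beginning. We can start the perturbation theory without worrying about divergences.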
2.4.7. First order perturbation
The calculation described above provides us with the zeroth order wavefunctions (i.e., we should use $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$, instead of $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, as our unperturbed wavefunctions). As we learned early on (in non-degenerate perturbation theory), the first order correction to the energy is just
$$E_n^1 = \langle\psi_n^0|H'|\psi_n^0\rangle \tag{2.156}$$
i.e., we use the zeroth order wavefunction and compute the expectation value of $H'$. Here, we have two zeroth order wavefunctions, $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$, so we need to compute the first order energy correction for each of them. We will prove in this section
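$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = E_+ \tag{2.157}$$
and
$$E_2^1 = \langle\psi_2^0|H'|\psi_2^0\rangle = E_- \tag{2.158}$$
i.e., the first order energy corrections for $|\psi_1^0\rangle$ and $|\psi_2^0\rangle$ are precisely the two eigenvalues of the W matrix.
$$\begin{aligned}\langle\psi_1^0|H'|\psi_1^0\rangle &= \left(\langle\psi_a^0|\alpha_1^* + \langle\psi_b^0|\beta_1^*\right)H'\left(\alpha_1|\psi_a^0\rangle + \beta_1|\psi_b^0\rangle\right) = \left(\langle\psi_a^0|\alpha_1^* + \langle\psi_b^0|\beta_1^*\right)\left(\alpha_1 H'|\psi_a^0\rangle + \beta_1 H'|\psi_b^0\rangle\right) \\ &= \alpha_1^*\alpha_1\langle\psi_a^0|H'|\psi_a^0\rangle + \alpha_1^*\beta_1\langle\psi_a^0|H'|\psi_b^0\rangle + \beta_1^*\alpha_1\langle\psi_b^0|H'|\psi_a^0\rangle + \beta_1^*\beta_1\langle\psi_b^0|H'|\psi_b^0\rangle\end{aligned} \tag{2.159}$$
If we remember the definition of the W matrix, $W_{ij} = \langle\psi_i^0|H'|\psi_j^0\rangle$, we realize immediately that this formula is exactly the same as
$$\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = \alpha_1^*\alpha_1 W_{aa} + \alpha_1^*\beta_1 W_{ab} + \beta_1^*\alpha_1 W_{ba} + \beta_1^*\beta_1 W_{bb} \tag{2.160}$$
So we found
$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = \begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \tag{2.161}$$
Because $\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix}$ is an eigenvector of W,
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \tag{2.162}$$
$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = \begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \tag{2.163}$$
Because we have required $\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix}$ to be normalized, $\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = \alpha_1^*\alpha_1 + \beta_1^*\beta_1 = 1$, and therefore
$$E_1^1 = \langle\psi_1^0|H'|\psi_1^0\rangle = \begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_1^* & \beta_1^*\end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+ \tag{2.164}$$
Similarly, we can show that
$$E_2^1 = \langle\psi_2^0|H'|\psi_2^0\rangle = \begin{pmatrix}\alpha_2^* & \beta_2^*\end{pmatrix}\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_2^* & \beta_2^*\end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_- \tag{2.165}$$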
2.4.8. Eigenvalues of the matrix W
In this part, we review basic ideas of eigenvalues and eigenvectors. We start from the eigenequation defined in the previous section
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha \\ \beta\end{pmatrix} = E\begin{pmatrix}\alpha \\ \beta\end{pmatrix} \tag{2.166}$$
This means that
$$W_{aa}\,\alpha + W_{ab}\,\beta = E\,\alpha \tag{2.167}$$
$$W_{ba}\,\alpha + W_{bb}\,\beta = E\,\beta \tag{2.168}$$
These two equations have an obvious and trivial solution $\alpha = \beta = 0$. This solution is NOT what we want and we will not consider it. To get a nontrivial solution, the eigenvalue $E$ cannot be an arbitrary value. It can only be one of two values, as we will see below.
Using the first equation, we get
$$\alpha = \frac{W_{ab}}{E - W_{aa}}\,\beta \tag{2.169}$$
Using the second equation, we get
$$\alpha = \frac{E - W_{bb}}{W_{ba}}\,\beta \tag{2.170}$$
The first relation means
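$$\frac{\alpha}{\beta} = \frac{W_{ab}}{E - W_{aa}} \tag{2.171}$$
but the second relation requires
$$\frac{\alpha}{\beta} = \frac{E - W_{bb}}{W_{ba}} \tag{2.172}$$
So we have
$$\frac{\alpha}{\beta} = \frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}} \tag{2.173}$$
In general, $\frac{W_{ab}}{E - W_{aa}} \neq \frac{E - W_{bb}}{W_{ba}}$, so we find a contradiction. This contradiction means that for a general value of $E$, we only have the trivial solution $\alpha = \beta = 0$. To get a nontrivial solution, we have to require $\frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}}$. This equation is often written in a different form
$$\frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}} \tag{2.174}$$
$$(E - W_{aa})(E - W_{bb}) = W_{ab}W_{ba} \tag{2.175}$$
$$(E - W_{aa})(E - W_{bb}) - W_{ab}W_{ba} = 0 \tag{2.176}$$
$$\det\begin{pmatrix} E - W_{aa} & -W_{ab} \\ -W_{ba} & E - W_{bb} \end{pmatrix} = 0 \tag{2.177}$$
or equivalently
$$\det\left[\begin{pmatrix} E & 0 \\ 0 & E \end{pmatrix} - \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\right] = 0 \tag{2.178}$$
$$\det(E - W) = 0 \tag{2.179}$$
Here, W is the matrix that we defined above
$$W = \begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix} \tag{2.180}$$
and the number $E$ here means $E$ times the identity matrix
$$E\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} E & 0 \\ 0 & E \end{pmatrix} \tag{2.181}$$
$\det(E - W) = 0$ means
$$(W_{aa} - E)(W_{bb} - E) - W_{ab}W_{ba} = 0 \tag{2.182}$$
And thus
$$E^2 - (W_{aa} + W_{bb})E + (W_{aa}W_{bb} - W_{ab}W_{ba}) = 0 \tag{2.183}$$
By definition, $\operatorname{tr} W = W_{aa} + W_{bb}$ and $\det W = W_{aa}W_{bb} - W_{ab}W_{ba}$. Therefore, we can write the same equation as
$$E^2 - (\operatorname{tr} W)\,E + \det W = 0 \tag{2.184}$$
This equation has two solutions
$$E_\pm = \frac{\operatorname{tr} W \pm \sqrt{(\operatorname{tr} W)^2 - 4\det W}}{2} \tag{2.185}$$
As shown above, these two solutions $E_\pm$ are the first order corrections to the eigenenergy. In perturbation theory, the eigenenergies of these two quantum states are
$$E = E^0 + E_+\lambda + O(\lambda^2) \tag{2.186}$$
and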
$$E = E^0 + E_-\lambda + O(\lambda^2) \tag{2.187}$$
at small $\lambda$.
Comment #1. $\operatorname{tr} W$ and $\det W$ are both real. This is straightforward to prove, if we notice that W is Hermitian. Because $W_{aa} = W_{aa}^*$, $W_{bb} = W_{bb}^*$ and $W_{ba} = W_{ab}^*$, both $W_{aa}$ and $W_{bb}$ are real. And $W_{ab}W_{ba} = |W_{ab}|^2$ is also real, so $\operatorname{tr} W = W_{aa} + W_{bb}$ and $\det W = W_{aa}W_{bb} - W_{ab}W_{ba}$ are both real.
Comment #2. $(\operatorname{tr} W)^2 - 4\det W \geq 0$. Therefore, $E_\pm$ are both real.
$$(\operatorname{tr} W)^2 - 4\det W = (W_{aa} + W_{bb})^2 - 4W_{aa}W_{bb} + 4W_{ab}W_{ba} = (W_{aa} - W_{bb})^2 + 4|W_{ab}|^2 \geq 0 \tag{2.188}$$
Here we used the fact that $W_{ab}^* = W_{ba}$.
Comment #3. There are in general two possible situations. (a) If $(\operatorname{tr} W)^2 - 4\det W > 0$, then $E_+ > E_-$, i.e. the two eigenvalues are NOT the same. (b) If $W_{aa} = W_{bb}$ and $W_{ab} = W_{ba} = 0$, then $(\operatorname{tr} W)^2 - 4\det W = 0$, and thus $E_+ = E_- = \operatorname{tr} W/2$.
Situation (b) is the easy case, because $W_{ab} = W_{ba} = 0$ means $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$. Remember that the problem we had from the beginning is that the second order correction, $\frac{\langle\psi_a^0|H'|\psi_b^0\rangle\langle\psi_b^0|H'|\psi_a^0\rangle}{E_a^0 - E_b^0}$, diverges because $E_a^0 = E_b^0$. For situation (b), the numerator is zero, so there is no divergence, and thus we can just do non-degenerate perturbation theory. Situation (a) is the more generic case. There, as we have shown early on, the perturbation $H'$ lifts the degeneracy.
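As a quick numerical cross-check of the trace/determinant formula (2.185), the sketch below compares it with a general-purpose eigenvalue routine. It assumes NumPy; the function name `eigenvalues_from_trace_det` and the entries of `W` are hypothetical choices made only for this illustration.

```python
import numpy as np

def eigenvalues_from_trace_det(W):
    """Return (E_plus, E_minus) of a 2x2 Hermitian matrix via Eq. (2.185)."""
    tr = np.trace(W).real
    det = np.linalg.det(W).real
    disc = tr**2 - 4.0 * det         # equals (Waa - Wbb)^2 + 4|Wab|^2 >= 0
    root = np.sqrt(max(disc, 0.0))   # guard against tiny negative round-off
    return (tr + root) / 2.0, (tr - root) / 2.0

# Hypothetical Hermitian W, situation (a): the degeneracy is lifted.
W = np.array([[0.30, 0.10 - 0.20j],
              [0.10 + 0.20j, -0.10]])
E_plus, E_minus = eigenvalues_from_trace_det(W)
reference = np.linalg.eigvalsh(W)    # ascending order: [E_minus, E_plus]

# Situation (b): W proportional to the identity, so E_plus = E_minus = tr W / 2.
E_plus_b, E_minus_b = eigenvalues_from_trace_det(0.25 * np.eye(2))
print(E_plus, E_minus, reference, E_plus_b, E_minus_b)
```

For larger degenerate subspaces there is no such closed form in general, and a library routine such as `numpy.linalg.eigvalsh` is the natural tool.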
2.4.9. Eigenvectors of the matrix W
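In this section, we will assume that $(\operatorname{tr} W)^2 - 4\det W > 0$, i.e. situation (a) discussed in the previous section. We have two eigenvalues. For each eigenvalue, we can solve for the corresponding eigenvector. For the eigenvalue $E_+ = \frac{\operatorname{tr} W + \sqrt{(\operatorname{tr} W)^2 - 4\det W}}{2}$, we have
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} = E_+\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \tag{2.189}$$
and for the other eigenvalue $E_- = \frac{\operatorname{tr} W - \sqrt{(\operatorname{tr} W)^2 - 4\det W}}{2}$, we have
$$\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = E_-\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} \tag{2.190}$$
We will use the first one as an example (the equation for $E_+$). There, the matrix equation can be written as two separate equations
$$W_{aa}\,\alpha_1 + W_{ab}\,\beta_1 = E_+\,\alpha_1 \tag{2.191}$$
$$W_{ba}\,\alpha_1 + W_{bb}\,\beta_1 = E_+\,\beta_1 \tag{2.192}$$
Using the first equation, we get
$$\alpha_1 = \frac{W_{ab}}{E_+ - W_{aa}}\,\beta_1 \tag{2.193}$$
Using the second equation, we get
$$\alpha_1 = \frac{E_+ - W_{bb}}{W_{ba}}\,\beta_1 \tag{2.194}$$
These two relations are actually identical, because for any eigenvalue $E$, we have $\frac{W_{ab}}{E - W_{aa}} = \frac{E - W_{bb}}{W_{ba}}$, as we proved early on.
In addition, we know that $\alpha_1^*\alpha_1 + \beta_1^*\beta_1 = 1$, i.e. the normalization condition. So we have
$$\alpha_1 = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}} \tag{2.195}$$
$$\beta_1 = \frac{E_+ - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}} \tag{2.196}$$
Similarly, we have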
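$$\alpha_2 = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}} \tag{2.197}$$
$$\beta_2 = \frac{E_- - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}} \tag{2.198}$$
In conclusion, we found that
$$|\psi_1^0\rangle = \alpha_1|\psi_a^0\rangle + \beta_1|\psi_b^0\rangle = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}}\,|\psi_a^0\rangle + \frac{E_+ - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_+ - W_{aa})^2}}\,|\psi_b^0\rangle \tag{2.199}$$
$$|\psi_2^0\rangle = \alpha_2|\psi_a^0\rangle + \beta_2|\psi_b^0\rangle = \frac{W_{ab}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}}\,|\psi_a^0\rangle + \frac{E_- - W_{aa}}{\sqrt{|W_{ab}|^2 + (E_- - W_{aa})^2}}\,|\psi_b^0\rangle \tag{2.200}$$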
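The closed-form eigenvectors (2.195)-(2.198) can be checked numerically. The sketch below assumes NumPy; `eigenvector_formula` is a hypothetical helper name and the entries of `W` are illustrative. Note that the formula requires $W_{ab} \neq 0$, otherwise numerator and denominator can both vanish.

```python
import numpy as np

def eigenvector_formula(W, E):
    """(alpha, beta) for eigenvalue E from Eqs. (2.195)-(2.198); needs Wab != 0."""
    Waa, Wab = W[0, 0].real, W[0, 1]
    norm = np.sqrt(abs(Wab)**2 + (E - Waa)**2)
    return np.array([Wab, E - Waa]) / norm

# Hypothetical Hermitian W with a nonzero off-diagonal element.
W = np.array([[0.30, 0.10 - 0.20j],
              [0.10 + 0.20j, -0.10]])
E_minus, E_plus = np.linalg.eigvalsh(W)

v1 = eigenvector_formula(W, E_plus)    # (alpha1, beta1)
v2 = eigenvector_formula(W, E_minus)   # (alpha2, beta2)

residual_1 = np.linalg.norm(W @ v1 - E_plus * v1)
residual_2 = np.linalg.norm(W @ v2 - E_minus * v2)
overlap = abs(np.vdot(v1, v2))         # should vanish: the two states are orthogonal
print(residual_1, residual_2, overlap)
```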
2.4.10. the very special case
In general, we expect $E_+ \neq E_-$, i.e. the degeneracy is lifted. What will happen if $E_+ = E_-$? From the equation
$$E_\pm = \frac{\operatorname{tr} W \pm \sqrt{(\operatorname{tr} W)^2 - 4\det W}}{2} \tag{2.201}$$
we know that $E_+ = E_-$ can only arise when $\sqrt{(\operatorname{tr} W)^2 - 4\det W} = 0$, i.e. $(\operatorname{tr} W)^2 - 4\det W = 0$. As shown above
$$(\operatorname{tr} W)^2 - 4\det W = (W_{aa} - W_{bb})^2 + 4|W_{ab}|^2 \tag{2.202}$$
Both terms on the r.h.s. are non-negative, and thus if we want the whole thing to be zero, we must have
$$(W_{aa} - W_{bb})^2 = 0 \tag{2.203}$$
and
$$4|W_{ab}|^2 = 0 \tag{2.204}$$
i.e., $W_{aa} = W_{bb}$ and $W_{ab} = 0$.
With $W_{aa} = W_{bb}$ and $W_{ab} = 0$, W is actually proportional to the identity matrix.
$$W = \begin{pmatrix} W_{aa} & 0 \\ 0 & W_{aa} \end{pmatrix} = W_{aa}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{2.205}$$
This situation is highly unlikely to arise (unless there is a reason) because, in general, the W matrix has four free real values to pick: $W_{aa}$, $W_{bb}$, the real part of $W_{ab}$ and the imaginary part of $W_{ab}$ (note 1: $W_{aa}$ and $W_{bb}$ are real, so they don’t have an imaginary part. note 2: $W_{ba}$ is the complex conjugate of $W_{ab}$, so we don’t need to consider it as a separate degree of freedom). If you have four real values, what is the probability that $W_{aa}$ matches $W_{bb}$ exactly, without any error bar, and that both the real and imaginary parts of $W_{ab}$ vanish exactly, without any error bar? Without a reason, the chance is zero. So this is a situation that we don’t need to worry about much, unless there is a reason.
In most cases, such a special case arises due to symmetry. For example, time reversal symmetry tells us that there should be two degenerate states (a state and its time-reversed state). Then these two states are degenerate for $H_0$, and they should still be degenerate for $H$, so $E_+ = E_-$. For that situation, it turns out that one can directly start from non-degenerate perturbation theory (no singularities will arise), although the states are degenerate. We will discuss a more generic situation later, which covers this case.
2.4.11. Review: quantum states and quantum operators as matrices
Once we choose a basis, any quantum state can be written as a vector (i.e., an N-by-1 matrix).
For a complete basis $\{|\psi_i\rangle\}$, we can write any quantum state as
$$|\psi\rangle = \sum_i c_i|\psi_i\rangle \tag{2.206}$$
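where the $c_i$ are complex numbers. Here, we find that if we want to describe a state, we just need to know all the coefficients $c_i$. We can write these $c_i$ as a vector
$$\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} \tag{2.207}$$
These coefficients are
$$c_i = \langle\psi_i|\psi\rangle \tag{2.208}$$
To see this, we multiply both sides of $|\psi\rangle = \sum_i c_i|\psi_i\rangle$ by $\langle\psi_j|$ from the left
$$\langle\psi_j|\psi\rangle = \sum_i c_i\langle\psi_j|\psi_i\rangle = \sum_i c_i\,\delta_{ij} = c_j \tag{2.209}$$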
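Bottom line: a quantum state is a column vector
$$|\psi\rangle \to \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = \begin{pmatrix} \langle\psi_1|\psi\rangle \\ \langle\psi_2|\psi\rangle \\ \langle\psi_3|\psi\rangle \\ \vdots \end{pmatrix} \tag{2.210}$$
A conjugate state is then represented by the conjugate vector. By definition, we know that
$$\langle\psi| = \sum_i \langle\psi_i|\,c_i^* \tag{2.211}$$
so we can write all these $c_i^*$ as a row vector
$$\begin{pmatrix} c_1^* & c_2^* & c_3^* & \cdots \end{pmatrix} = \begin{pmatrix} \langle\psi|\psi_1\rangle & \langle\psi|\psi_2\rangle & \langle\psi|\psi_3\rangle & \cdots \end{pmatrix} \tag{2.212}$$
Here, we used the fact that $\langle\psi|\psi_i\rangle$ is the complex conjugate of $\langle\psi_i|\psi\rangle$.
The inner product of two states is the product of a row vector and a column vector
If we know two quantum states
$$|\psi\rangle \to \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} \tag{2.213}$$
and
$$|\phi\rangle \to \begin{pmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \end{pmatrix} \tag{2.214}$$
then we know
$$\langle\phi| \to \begin{pmatrix} d_1^* & d_2^* & d_3^* & \cdots \end{pmatrix} \tag{2.215}$$
so
$$\langle\phi|\psi\rangle \to \begin{pmatrix} d_1^* & d_2^* & d_3^* & \cdots \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = d_1^* c_1 + d_2^* c_2 + d_3^* c_3 + \ldots \tag{2.216}$$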
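Q: How about a quantum operator?
A: Once we choose a basis, a quantum operator is a matrix.
To understand this, we just need to realize that a quantum operator transforms a quantum state into a different state
$$X|\psi\rangle = |\phi\rangle \tag{2.217}$$
As we know, $|\psi\rangle$ is a column vector, and $|\phi\rangle$ is another column vector. Which object transforms a column vector into a different column vector? We know that a matrix can do such a job
$$\begin{pmatrix} x_{11} & x_{12} & x_{13} & \cdots \\ x_{21} & x_{22} & x_{23} & \cdots \\ x_{31} & x_{32} & x_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = \begin{pmatrix} x_{11}c_1 + x_{12}c_2 + x_{13}c_3 + \ldots \\ x_{21}c_1 + x_{22}c_2 + x_{23}c_3 + \ldots \\ x_{31}c_1 + x_{32}c_2 + x_{33}c_3 + \ldots \\ \vdots \end{pmatrix} = \begin{pmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \end{pmatrix} \tag{2.218}$$
So a quantum operator is really similar to a matrix. In fact, the matrix elements $x_{ij}$ are very easy to compute
$$x_{ij} = \langle\psi_i|X|\psi_j\rangle \tag{2.219}$$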
Q: How about eigenvalues and eigenstates?
A: Matrices also have eigenvalues and eigenvectors
$$\begin{pmatrix} x_{11} & x_{12} & x_{13} & \cdots \\ x_{21} & x_{22} & x_{23} & \cdots \\ x_{31} & x_{32} & x_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} = W\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix} \tag{2.220}$$
(here $W$ denotes a number, the eigenvalue).
1. The matrix of a Hermitian operator is a Hermitian matrix
2. An N×N Hermitian matrix has N eigenvalues, each of which has an eigenvector
3. The eigenvalues of the matrix are the same as the eigenvalues of the corresponding quantum operator
4. Each eigenvector corresponds to an eigenstate, i.e. if $\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \end{pmatrix}$ is an eigenvector with eigenvalue $W$, then $|\psi\rangle = \sum_i c_i|\psi_i\rangle$ is an eigenstate of $X$ with eigenvalue $W$.
Final conclusion: for a quantum system, we just need to play with matrices
Only one problem: these matrices are huge (∞×∞)
It is extremely hard to handle big matrices (say 100 million by 100 million). So this approach by itself doesn’t make our life easier.
2.4.12. Degenerate perturbation theory
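$$H = H_0 + \lambda H' \tag{2.221}$$
If we use the eigenstates of $H_0$ as the basis, then $H_0$ corresponds to a diagonal matrix
$$H_0|\psi_i\rangle = E_i^0|\psi_i\rangle \tag{2.222}$$
where $i = 1, 2, 3, \ldots$ and we require this to be an orthonormal basis
$$\langle\psi_i|\psi_j\rangle = \delta_{ij} \tag{2.223}$$
A matrix element of the matrix is
$$\langle\psi_i|H_0|\psi_j\rangle = \langle\psi_i|E_j^0|\psi_j\rangle = E_j^0\langle\psi_i|\psi_j\rangle = E_j^0\,\delta_{ij} \tag{2.224}$$
So,
$$H_0 \to \begin{pmatrix} E_1^0 & 0 & 0 & \cdots \\ 0 & E_2^0 & 0 & \cdots \\ 0 & 0 & E_3^0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{2.225}$$
This conclusion is generically true. If we use the eigenstates of an operator as our basis, then this operator is a diagonal matrix (i.e. the off-diagonal terms are all zero). And along the diagonal, we just have all the eigenvalues of this quantum operator.
$$\lambda H' \to \lambda\begin{pmatrix} \langle\psi_1|H'|\psi_1\rangle & \langle\psi_1|H'|\psi_2\rangle & \langle\psi_1|H'|\psi_3\rangle & \cdots \\ \langle\psi_2|H'|\psi_1\rangle & \langle\psi_2|H'|\psi_2\rangle & \langle\psi_2|H'|\psi_3\rangle & \cdots \\ \langle\psi_3|H'|\psi_1\rangle & \langle\psi_3|H'|\psi_2\rangle & \langle\psi_3|H'|\psi_3\rangle & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{2.226}$$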
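In general, $H'$ is NOT a diagonal matrix
$$\begin{aligned} H = H_0 + \lambda H' &\to \begin{pmatrix} E_1^0 & 0 & 0 & \cdots \\ 0 & E_2^0 & 0 & \cdots \\ 0 & 0 & E_3^0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} + \lambda\begin{pmatrix} \langle\psi_1|H'|\psi_1\rangle & \langle\psi_1|H'|\psi_2\rangle & \langle\psi_1|H'|\psi_3\rangle & \cdots \\ \langle\psi_2|H'|\psi_1\rangle & \langle\psi_2|H'|\psi_2\rangle & \langle\psi_2|H'|\psi_3\rangle & \cdots \\ \langle\psi_3|H'|\psi_1\rangle & \langle\psi_3|H'|\psi_2\rangle & \langle\psi_3|H'|\psi_3\rangle & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \\ &= \begin{pmatrix} E_1^0 + \lambda\langle\psi_1|H'|\psi_1\rangle & \lambda\langle\psi_1|H'|\psi_2\rangle & \lambda\langle\psi_1|H'|\psi_3\rangle & \cdots \\ \lambda\langle\psi_2|H'|\psi_1\rangle & E_2^0 + \lambda\langle\psi_2|H'|\psi_2\rangle & \lambda\langle\psi_2|H'|\psi_3\rangle & \cdots \\ \lambda\langle\psi_3|H'|\psi_1\rangle & \lambda\langle\psi_3|H'|\psi_2\rangle & E_3^0 + \lambda\langle\psi_3|H'|\psi_3\rangle & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}\end{aligned} \tag{2.227}$$
This is a very large matrix and is very hard to handle in general.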
However, if there are 2 degenerate states,
$$H_0 = \begin{pmatrix} \ddots & \vdots & \vdots & \\ \cdots & E^0 & 0 & \cdots \\ \cdots & 0 & E^0 & \cdots \\ & \vdots & \vdots & \ddots \end{pmatrix} \tag{2.228}$$
i.e., two of the eigenvalues of $H_0$ coincide, or say two of the numbers along the diagonal of the matrix of $H_0$ happen to be the same, then we don’t need to handle the whole big matrix, if $\lambda$ is small. Here, we can do degenerate perturbation theory, and to first order, we can forget all other quantum states and only look at the two degenerate ones. What does this mean? Remember that, in general, a complete basis contains an infinite number of states $|\psi_i\rangle$ with $i = 1, 2, \ldots, \infty$. As a result, the matrix of a quantum operator has dimension ∞×∞, $\langle\psi_i|X|\psi_j\rangle$ with $i = 1, 2, \ldots, \infty$ and $j = 1, 2, \ldots, \infty$. Now, if we limit ourselves to the two degenerate states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, then the matrix of our quantum operator only has dimensions 2×2, because $i$ and $j$ here can only be $a$ or $b$
$$X \to \begin{pmatrix} \langle\psi_a^0|X|\psi_a^0\rangle & \langle\psi_a^0|X|\psi_b^0\rangle \\ \langle\psi_b^0|X|\psi_a^0\rangle & \langle\psi_b^0|X|\psi_b^0\rangle \end{pmatrix} \tag{2.229}$$
For $H_0$, its matrix is
$$H_0 \to \begin{pmatrix} E^0 & 0 \\ 0 & E^0 \end{pmatrix} \tag{2.230}$$
and for $H'$, the matrix is
$$H' \to \begin{pmatrix} \langle\psi_a^0|H'|\psi_a^0\rangle & \langle\psi_a^0|H'|\psi_b^0\rangle \\ \langle\psi_b^0|H'|\psi_a^0\rangle & \langle\psi_b^0|H'|\psi_b^0\rangle \end{pmatrix} \tag{2.231}$$
So our $H$ is
$$H = H_0 + \lambda H' \to \begin{pmatrix} E^0 + \lambda\langle\psi_a^0|H'|\psi_a^0\rangle & \lambda\langle\psi_a^0|H'|\psi_b^0\rangle \\ \lambda\langle\psi_b^0|H'|\psi_a^0\rangle & E^0 + \lambda\langle\psi_b^0|H'|\psi_b^0\rangle \end{pmatrix} \tag{2.232}$$
The eigenvalues of this matrix are $E^0 + \lambda E_+$ and $E^0 + \lambda E_-$. And the eigenvectors are the same as the ones we computed in the previous sections.
Bottom line: for degenerate perturbation theory, we can drop all other states (with different eigenenergies), and consider a much smaller Hilbert space (only the degenerate states are considered here). Then, our Hamiltonian becomes a very small matrix, and we can diagonalize this small matrix. The eigenvalues are the eigenenergies to first order. And the eigenvectors give us the eigenwavefunctions to zeroth order.
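To see how well the truncation to the degenerate subspace works, one can compare it with a full diagonalization in a small toy model. The sketch below assumes NumPy and uses a made-up 4×4 example (two degenerate levels at `E0` plus two far-away levels); all numbers are hypothetical and serve only as an illustration.

```python
import numpy as np

E0 = 1.0
H0 = np.diag([E0, E0, 3.0, 5.0])           # two degenerate levels plus two distant ones
Hp = np.array([[0.2, 0.5, 0.3, 0.1],       # hypothetical real symmetric perturbation H'
               [0.5, -0.1, 0.2, 0.4],
               [0.3, 0.2, 0.7, 0.6],
               [0.1, 0.4, 0.6, -0.3]])
lam = 1e-3

# "Exact" answer: diagonalize the full matrix and keep the two levels near E0.
exact = np.linalg.eigvalsh(H0 + lam * Hp)[:2]

# Degenerate perturbation theory: keep only the 2x2 block of H' in the degenerate subspace.
W = Hp[:2, :2]
first_order = E0 + lam * np.linalg.eigvalsh(W)

error = np.max(np.abs(exact - first_order))
print(exact, first_order, error)
```

In this toy model the discrepancy is of order $\lambda^2$, consistent with the statement that the 2×2 block gives the eigenenergies to first order.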
2.4.13. n-fold degeneracy
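If $H_0$ has an n-fold degeneracy, and we want to do perturbation theory for these n degenerate states, we just ignore all other states and only keep these n states.
$$H_0 \to \begin{pmatrix} E^0 & 0 & 0 & \cdots \\ 0 & E^0 & 0 & \cdots \\ 0 & 0 & E^0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}_{n\times n} \tag{2.233}$$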
and $H'$ is an n×n matrix with matrix elements $\langle\psi_i^0|H'|\psi_j^0\rangle$, where $i = 1, \ldots, n$ and $j = 1, \ldots, n$.
Then, there are two (equivalent) ways to do the calculation
Option #1: compute the eigenvalues of the n×n matrix of $H'$, $E_1^1, \ldots, E_n^1$. Then the eigenenergy including the first order correction is
$$E_i = E^0 + \lambda E_i^1 + O(\lambda^2) \tag{2.234}$$
where $i = 1, 2, \ldots, n$.
Option #2: directly compute the eigenvalues of the n×n matrix of $H = H_0 + \lambda H'$. You will find n eigenvalues, and they are
$$E^0 + \lambda E_i^1 \tag{2.235}$$
where $i = 1, 2, \ldots, n$.
2.4.14. Nearly-degenerate perturbation theory
What if we have two states that are not exactly degenerate, but nearly degenerate, i.e. two eigenstates of $H_0$, $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, have very similar energies $E_a^0 \sim E_b^0$, but not exactly the same?
Case 1. If $|\lambda H'| \ll |E_a^0 - E_b^0|$, we can do non-degenerate perturbation theory.
Case 2. If $|\lambda H'| \gg |E_a^0 - E_b^0|$, but $|\lambda H'| \ll |E_a^0 - E_m^0|$ and $|\lambda H'| \ll |E_b^0 - E_m^0|$ for any other eigenstate of $H_0$, where $E_m^0$ represents the eigenenergy of another eigenstate of $H_0$ (beyond $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$), we can do nearly-degenerate perturbation theory.
Here, the procedure is similar to degenerate perturbation theory: we ignore all other states and only consider $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$. Now, every quantum operator becomes a 2×2 matrix.
$$H_0 \to \begin{pmatrix} E_a^0 & 0 \\ 0 & E_b^0 \end{pmatrix} \tag{2.236}$$
and for $H'$, the matrix is
$$H' \to \begin{pmatrix} \langle\psi_a^0|H'|\psi_a^0\rangle & \langle\psi_a^0|H'|\psi_b^0\rangle \\ \langle\psi_b^0|H'|\psi_a^0\rangle & \langle\psi_b^0|H'|\psi_b^0\rangle \end{pmatrix} = \begin{pmatrix} V_{aa} & V_{ab} \\ V_{ba} & V_{bb} \end{pmatrix} \tag{2.237}$$
Notice that $H_0$ has two different diagonal components, $E_a^0 \neq E_b^0$. This is the difference between degenerate and nearly-degenerate perturbation theory. Now we consider $H$, which is
$$H = H_0 + \lambda H' \to \begin{pmatrix} E_a^0 + \lambda V_{aa} & \lambda V_{ab} \\ \lambda V_{ba} & E_b^0 + \lambda V_{bb} \end{pmatrix} \tag{2.238}$$
Then, we can get the eigenvalues of this matrix, and these are the eigenvalues of $H$ up to first order in perturbation theory.
Case 3. If $\lambda H'$ is too large, even larger than $|E_a^0 - E_m^0|$ and $|E_b^0 - E_m^0|$, then $\lambda H'$ cannot be considered a perturbation and, as a result, we cannot do perturbation theory anymore.
This result can be easily generalized to cases where we have more than 2 nearly-degenerate states.
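For the 2×2 matrix (2.238), the eigenvalues can be written in closed form as the mean of the two diagonal entries plus or minus the square root of (half their difference)² + λ²|V_ab|². The sketch below evaluates this and compares it with a numerical diagonalization. It assumes NumPy; the function name `two_level_energies` and all numbers are hypothetical illustration values.

```python
import numpy as np

def two_level_energies(Ea, Eb, V, lam):
    """Eigenvalues of [[Ea + lam*Vaa, lam*Vab], [lam*Vba, Eb + lam*Vbb]], cf. Eq. (2.238)."""
    da = Ea + lam * V[0, 0].real
    db = Eb + lam * V[1, 1].real
    mean = 0.5 * (da + db)
    half_gap = np.hypot(0.5 * (da - db), lam * abs(V[0, 1]))
    return mean - half_gap, mean + half_gap

# Hypothetical nearly degenerate levels and perturbation (Case 2: lam*V exceeds the level spacing).
Ea, Eb = 1.000, 1.002
V = np.array([[0.2, 0.5],
              [0.5, -0.1]])
lam = 0.01

low, high = two_level_energies(Ea, Eb, V, lam)
reference = np.linalg.eigvalsh(np.diag([Ea, Eb]) + lam * V)
print(low, high, reference)
```

Setting `Ea == Eb` in this sketch reduces it to the degenerate case, where the two levels become $E^0 + \lambda E_\pm$.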
2.4.15. Philosophy behind degenerate and nearly-degenerate perturbation theory
Assume that $H_0$ has n eigenstates with very similar (or exactly the same) eigenenergies ($|\psi_a^0\rangle$, $|\psi_b^0\rangle$, …, all with energy near $E^0$), but all other eigenstates of $H_0$ have energies very different from these states ($|\psi_m\rangle$ has eigenenergy $E_m^0$, where m is not one of the nearly degenerate states, and for any such m, $E_m^0$ is very different from $E^0$). Suppose we start our system from one (or some superposition) of these n states, and then perturb the Hamiltonian by a small amount, $H = H_0 + \lambda H'$ with $\lambda$ very small. Because any state $|\psi_m\rangle$ has an energy much different from $E^0$, when the perturbation is small, it is (almost) impossible for the system to reach a state $|\psi_m\rangle$ from a state with energy $E^0$.
Note: for a classical system, this would be totally impossible due to energy conservation. In a classical system, if we start from a state with energy $E^0$ and then add a small amount of energy $\delta E$ to the system, the final state must have energy $E^0 + \delta E$, which is very close to $E^0$ if $\delta E$ is small. So, it is absolutely impossible to have a final state with energy very different from $E^0$. But for a quantum system, anything is possible (think about quantum tunneling: classically it is impossible, but for a quantum system it becomes possible). However, we know that in quantum mechanics, the probability for us to reach such a final state is small (although not exactly zero). Since the probability is small, to leading order, we can ignore it. This is the key reason why we can ignore all those states $|\psi_m^0\rangle$.
Since it is highly unlikely to reach $|\psi_m\rangle$, we can ignore these states in the leading order approximation. After we ignore all of them, our Hilbert space becomes very small, with only n quantum states. And thus our quantum operators become n×n matrices. If n is 2, we can easily find the eigenvalues. If n = 3 or 4, we can get the eigenvalues (in analytic form) with a little bit of help (e.g. software like Mathematica). If n > 4 (but not extremely huge), say a couple of hundred or smaller, we can easily get the eigenvalues numerically using available software packages. If n is 10 or 100 million, it is not easy to get all the eigenvalues of the system, but we can still get the smallest several or the largest several numerically using techniques like the Lanczos algorithm.
After taking care of the n×n matrices, we may be able to take $|\psi_m^0\rangle$ back into consideration by going to higher orders in perturbation theory.
2.4.16. Example: (textbook page 262)
Consider a 3D infinite cubical potential well
$$V(x, y, z) = \begin{cases} 0 & \text{if } 0 < x < a,\ 0 < y < a \text{ and } 0 < z < a \\ +\infty & \text{otherwise} \end{cases} \tag{2.239}$$
The Hamiltonian for this system is
$$H_0 = \frac{P^2}{2m} + V(x, y, z) = -\frac{\hbar^2}{2m}\nabla^2 + V(x, y, z) \tag{2.240}$$
The eigenstates of $H_0$ are sine waves
$$\psi^0_{n_x n_y n_z}(x, y, z) = \left(\frac{2}{a}\right)^{3/2}\sin\!\left(\frac{n_x\pi}{a}x\right)\sin\!\left(\frac{n_y\pi}{a}y\right)\sin\!\left(\frac{n_z\pi}{a}z\right) \tag{2.241}$$
where $n_x$, $n_y$ and $n_z$ are positive integers. The eigenenergy for such a state is
$$E^0_{n_x n_y n_z} = \frac{\pi^2\hbar^2}{2ma^2}\left(n_x^2 + n_y^2 + n_z^2\right) \tag{2.242}$$
The ground state is obviously $n_x = n_y = n_z = 1$, which has energy $E^0_{111} = \frac{3\pi^2\hbar^2}{2ma^2}$. For simplicity, we will call this energy $E_0^0 = \frac{3\pi^2\hbar^2}{2ma^2}$, where the subscript 0 represents the ground state.
There are three (degenerate) first excited states with $(n_x, n_y, n_z)$ being (1,1,2), (1,2,1) or (2,1,1).
$$E^0_{112} = E^0_{121} = E^0_{211} = \frac{\pi^2\hbar^2}{2ma^2}\left(1 + 1 + 2^2\right) = \frac{3\pi^2\hbar^2}{ma^2} \tag{2.243}$$
For simplicity, we will call the energy of these first excited states $E_1^0$, where the subscript 1 means that this is for the first excited states. In addition, for simplicity, we define
$$\psi_a = \psi_{112} \quad\text{and}\quad \psi_b = \psi_{121} \quad\text{and}\quad \psi_c = \psi_{211} \tag{2.244}$$
Now, consider a perturbation
$$H = H_0 + \lambda H' \tag{2.245}$$
where
$$H' = \begin{cases} V_0 & \text{if } 0 < x < a/2,\ 0 < y < a/2 \text{ and } 0 < z < a \\ 0 & \text{otherwise} \end{cases} \tag{2.246}$$
For the ground state, we shall do non-degenerate perturbation theory, because there is no degeneracy, and the first order correction to the energy is
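$$\begin{aligned} E^1_{111} &= \langle\psi_{111}|H'|\psi_{111}\rangle = \int dx\,dy\,dz\ \psi^0_{111}(x,y,z)^*\,H'\,\psi^0_{111}(x,y,z) = V_0\int_0^{a/2}dx\int_0^{a/2}dy\int_0^{a}dz\ \left|\psi^0_{111}(x,y,z)\right|^2 \\ &= V_0\left(\frac{2}{a}\right)^3\int_0^{a/2}dx\,\sin^2\!\left(\frac{\pi}{a}x\right)\int_0^{a/2}dy\,\sin^2\!\left(\frac{\pi}{a}y\right)\int_0^{a}dz\,\sin^2\!\left(\frac{\pi}{a}z\right) = V_0\left(\frac{2}{a}\right)^3\times\frac{1}{2}\,\frac{a}{2}\times\frac{1}{2}\,\frac{a}{2}\times\frac{1}{2}\,a = \frac{V_0}{4}\end{aligned} \tag{2.247}$$
So, the energy of the ground state now becomes
$$E_{111} = E^0_{111} + \lambda E^1_{111} + \ldots = \frac{3\pi^2\hbar^2}{2ma^2} + \lambda\frac{V_0}{4} + \ldots \tag{2.248}$$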
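For the first excited states, there is a three-fold degeneracy, so we need to define a 3×3 matrix
$$W = \begin{pmatrix} W_{aa} & W_{ab} & W_{ac} \\ W_{ba} & W_{bb} & W_{bc} \\ W_{ca} & W_{cb} & W_{cc} \end{pmatrix} = \begin{pmatrix} \langle\psi_a|H'|\psi_a\rangle & \langle\psi_a|H'|\psi_b\rangle & \langle\psi_a|H'|\psi_c\rangle \\ \langle\psi_b|H'|\psi_a\rangle & \langle\psi_b|H'|\psi_b\rangle & \langle\psi_b|H'|\psi_c\rangle \\ \langle\psi_c|H'|\psi_a\rangle & \langle\psi_c|H'|\psi_b\rangle & \langle\psi_c|H'|\psi_c\rangle \end{pmatrix} \tag{2.249}$$
For the diagonal terms
$$\begin{aligned} W_{aa} &= \langle\psi_a|H'|\psi_a\rangle = \int dx\,dy\,dz\ \psi^0_{112}(x,y,z)^*\,H'\,\psi^0_{112}(x,y,z) = V_0\int_0^{a/2}dx\int_0^{a/2}dy\int_0^{a}dz\ \left|\psi^0_{112}(x,y,z)\right|^2 \\ &= V_0\left(\frac{2}{a}\right)^3\int_0^{a/2}dx\,\sin^2\!\left(\frac{\pi}{a}x\right)\int_0^{a/2}dy\,\sin^2\!\left(\frac{\pi}{a}y\right)\int_0^{a}dz\,\sin^2\!\left(\frac{2\pi}{a}z\right) = V_0\left(\frac{2}{a}\right)^3\times\frac{1}{2}\,\frac{a}{2}\times\frac{1}{2}\,\frac{a}{2}\times\frac{1}{2}\,a = \frac{V_0}{4}\end{aligned} \tag{2.250}$$
Similarly, we can show that
$$W_{aa} = W_{bb} = W_{cc} = V_0/4 \tag{2.251}$$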
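For $W_{ab}$, we have
$$\begin{aligned} W_{ab} &= \langle\psi_a|H'|\psi_b\rangle = \int dx\,dy\,dz\ \psi^0_{112}(x,y,z)^*\,H'\,\psi^0_{121}(x,y,z) \\ &= V_0\left(\frac{2}{a}\right)^3\int_0^{a/2}dx\,\sin^2\!\left(\frac{\pi}{a}x\right)\int_0^{a/2}dy\,\sin\!\left(\frac{\pi}{a}y\right)\sin\!\left(\frac{2\pi}{a}y\right)\int_0^{a}dz\,\sin\!\left(\frac{\pi}{a}z\right)\sin\!\left(\frac{2\pi}{a}z\right)\end{aligned} \tag{2.252}$$
One can show that the last integral, $\int_0^a dz\,\sin\!\left(\frac{\pi}{a}z\right)\sin\!\left(\frac{2\pi}{a}z\right)$, is zero. So $W_{ab} = 0$. Similarly, $W_{ac} = 0$, and because W is Hermitian, $W_{ba} = W_{ca} = 0$ as well.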
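Finally, for $W_{bc}$
$$\begin{aligned} W_{bc} &= \langle\psi_b|H'|\psi_c\rangle = \int dx\,dy\,dz\ \psi^0_{121}(x,y,z)^*\,H'\,\psi^0_{211}(x,y,z) \\ &= V_0\left(\frac{2}{a}\right)^3\int_0^{a/2}dx\,\sin\!\left(\frac{\pi}{a}x\right)\sin\!\left(\frac{2\pi}{a}x\right)\int_0^{a/2}dy\,\sin\!\left(\frac{\pi}{a}y\right)\sin\!\left(\frac{2\pi}{a}y\right)\int_0^{a}dz\,\sin^2\!\left(\frac{\pi}{a}z\right) = \frac{16}{9\pi^2}V_0\end{aligned} \tag{2.253}$$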
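Thus, we have
$$W = \begin{pmatrix} \frac{V_0}{4} & 0 & 0 \\ 0 & \frac{V_0}{4} & \frac{16}{9\pi^2}V_0 \\ 0 & \frac{16}{9\pi^2}V_0 & \frac{V_0}{4} \end{pmatrix} \tag{2.254}$$
This matrix has three eigenvalues (all real numbers). The equation for the eigenvalues is
$$\det(E\,I - W) = 0 \tag{2.255}$$
$$\det\begin{pmatrix} E - \frac{V_0}{4} & 0 & 0 \\ 0 & E - \frac{V_0}{4} & -\frac{16}{9\pi^2}V_0 \\ 0 & -\frac{16}{9\pi^2}V_0 & E - \frac{V_0}{4} \end{pmatrix} = 0 \tag{2.256}$$
And thus
$$\left(E - \frac{V_0}{4}\right)\left[\left(E - \frac{V_0}{4}\right)^2 - \left(\frac{16}{9\pi^2}V_0\right)^2\right] = 0 \tag{2.257}$$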
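$$\left(E - \frac{V_0}{4}\right)\left(E - \frac{V_0}{4} + \frac{16}{9\pi^2}V_0\right)\left(E - \frac{V_0}{4} - \frac{16}{9\pi^2}V_0\right) = 0 \tag{2.258}$$
So the solutions are
$$E_1^1 = \frac{V_0}{4} - \frac{16}{9\pi^2}V_0 = \frac{V_0}{4}\left[1 - \left(\frac{8}{3\pi}\right)^2\right] \tag{2.259}$$
$$E_2^1 = \frac{V_0}{4} \tag{2.260}$$
$$E_3^1 = \frac{V_0}{4} + \frac{16}{9\pi^2}V_0 = \frac{V_0}{4}\left[1 + \left(\frac{8}{3\pi}\right)^2\right] \tag{2.261}$$
So the energies of the first excited states are
$$E_1 = \begin{cases} E_1^0 + \lambda E_1^1 + \ldots = \dfrac{3\pi^2\hbar^2}{ma^2} + \lambda\dfrac{V_0}{4}\left[1 - \left(\dfrac{8}{3\pi}\right)^2\right] & \text{state \#1} \\[2ex] E_1^0 + \lambda E_2^1 + \ldots = \dfrac{3\pi^2\hbar^2}{ma^2} + \lambda\dfrac{V_0}{4} & \text{state \#2} \\[2ex] E_1^0 + \lambda E_3^1 + \ldots = \dfrac{3\pi^2\hbar^2}{ma^2} + \lambda\dfrac{V_0}{4}\left[1 + \left(\dfrac{8}{3\pi}\right)^2\right] & \text{state \#3} \end{cases} \tag{2.262}$$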
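Now, for the eigenvectors of the W matrix
$$\begin{pmatrix} \frac{V_0}{4} & 0 & 0 \\ 0 & \frac{V_0}{4} & \frac{16}{9\pi^2}V_0 \\ 0 & \frac{16}{9\pi^2}V_0 & \frac{V_0}{4} \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} = E_1^1\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} = \frac{V_0}{4}\left[1 - \left(\frac{8}{3\pi}\right)^2\right]\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} \tag{2.263}$$
$$\frac{V_0}{4}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \left(\frac{8}{3\pi}\right)^2 \\ 0 & \left(\frac{8}{3\pi}\right)^2 & 1 \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} = \frac{V_0}{4}\left[1 - \left(\frac{8}{3\pi}\right)^2\right]\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} \tag{2.264}$$
By canceling $V_0/4$ on both sides, we get
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \left(\frac{8}{3\pi}\right)^2 \\ 0 & \left(\frac{8}{3\pi}\right)^2 & 1 \end{pmatrix}\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} = \left[1 - \left(\frac{8}{3\pi}\right)^2\right]\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} \tag{2.265}$$
It means that
$$\alpha_1 = \left[1 - \left(\frac{8}{3\pi}\right)^2\right]\alpha_1 \tag{2.266}$$
$$\beta_1 + \left(\frac{8}{3\pi}\right)^2\gamma_1 = \left[1 - \left(\frac{8}{3\pi}\right)^2\right]\beta_1 \tag{2.267}$$
$$\left(\frac{8}{3\pi}\right)^2\beta_1 + \gamma_1 = \left[1 - \left(\frac{8}{3\pi}\right)^2\right]\gamma_1 \tag{2.268}$$
The first equation means $\alpha_1 = 0$ and the last two mean $\beta_1 = -\gamma_1$. In addition, the normalization condition requires $|\alpha_1|^2 + |\beta_1|^2 + |\gamma_1|^2 = 1$, so
$$\begin{pmatrix}\alpha_1 \\ \beta_1 \\ \gamma_1\end{pmatrix} = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} \tag{2.269}$$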
So,
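$$\psi_1(x, y, z) = \alpha_1\psi_a + \beta_1\psi_b + \gamma_1\psi_c = \frac{\psi_b - \psi_c}{\sqrt{2}} = \frac{\psi_{121} - \psi_{211}}{\sqrt{2}} = \left(\frac{2}{a}\right)^{3/2}\frac{\sin\!\left(\frac{\pi}{a}x\right)\sin\!\left(\frac{2\pi}{a}y\right) - \sin\!\left(\frac{2\pi}{a}x\right)\sin\!\left(\frac{\pi}{a}y\right)}{\sqrt{2}}\,\sin\!\left(\frac{\pi}{a}z\right) \tag{2.270}$$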
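Using the same approach, we find that
$$\begin{pmatrix}\alpha_2 \\ \beta_2 \\ \gamma_2\end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \tag{2.271}$$
and thus
$$\psi_2(x, y, z) = \alpha_2\psi_a + \beta_2\psi_b + \gamma_2\psi_c = \psi_a = \psi_{112} = \left(\frac{2}{a}\right)^{3/2}\sin\!\left(\frac{\pi}{a}x\right)\sin\!\left(\frac{\pi}{a}y\right)\sin\!\left(\frac{2\pi}{a}z\right) \tag{2.272}$$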
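Finally, for the third eigenstate, we can use the same method to show that
$$\begin{pmatrix}\alpha_3 \\ \beta_3 \\ \gamma_3\end{pmatrix} = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix} \tag{2.273}$$
and thus
$$\psi_3(x, y, z) = \alpha_3\psi_a + \beta_3\psi_b + \gamma_3\psi_c = \frac{\psi_b + \psi_c}{\sqrt{2}} = \frac{\psi_{121} + \psi_{211}}{\sqrt{2}} = \left(\frac{2}{a}\right)^{3/2}\frac{\sin\!\left(\frac{\pi}{a}x\right)\sin\!\left(\frac{2\pi}{a}y\right) + \sin\!\left(\frac{2\pi}{a}x\right)\sin\!\left(\frac{\pi}{a}y\right)}{\sqrt{2}}\,\sin\!\left(\frac{\pi}{a}z\right) \tag{2.274}$$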
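The whole worked example can be cross-checked numerically. The sketch below assumes NumPy, sets $a = V_0 = 1$ (an arbitrary choice of units made for this illustration), evaluates the factorized integrals with a simple midpoint rule, and diagonalizes the resulting 3×3 matrix. The helper names `overlap_1d` and `W_element` are hypothetical.

```python
import numpy as np

a, V0 = 1.0, 1.0   # hypothetical units: box size and perturbation strength set to 1
states = {"a": (1, 1, 2), "b": (1, 2, 1), "c": (2, 1, 1)}   # (nx, ny, nz)

def overlap_1d(n, m, upper, num=20000):
    """(2/a) * integral from 0 to `upper` of sin(n pi x/a) sin(m pi x/a) dx, midpoint rule."""
    h = upper / num
    x = (np.arange(num) + 0.5) * h
    return (2.0 / a) * np.sum(np.sin(n * np.pi * x / a) * np.sin(m * np.pi * x / a)) * h

def W_element(s, t):
    (nx, ny, nz), (mx, my, mz) = states[s], states[t]
    # H' = V0 only for 0 < x < a/2, 0 < y < a/2, 0 < z < a, so the 3D integral factorizes.
    return V0 * overlap_1d(nx, mx, a / 2) * overlap_1d(ny, my, a / 2) * overlap_1d(nz, mz, a)

labels = ["a", "b", "c"]
W = np.array([[W_element(s, t) for t in labels] for s in labels])

# First order corrections (ascending) and zeroth order states (columns, in the basis a, b, c).
evals, evecs = np.linalg.eigh(W)
print(np.round(W, 6))
print(evals)
```

With these units the three eigenvalues should agree with $V_0/4 - 16V_0/(9\pi^2)$, $V_0/4$ and $V_0/4 + 16V_0/(9\pi^2)$ up to the quadrature error.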
2.4.17. A small trick for finding ψ1 and ψ2
We use a two-fold degeneracy here as an example, but the conclusion can be easily generalized.
As we have shown above, the key in degenerate perturbation theory is to find a good basis, such that $\langle\psi_1^0|H'|\psi_2^0\rangle = 0$. In the most general situation, we start from a set of states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$. If it is a good set already (i.e., $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$), we don't need to find another basis. We can just use them and
$$E = E^0 + \lambda\langle\psi_a^0|H'|\psi_a^0\rangle + O(\lambda^2) \tag{2.275}$$
and
$$E = E^0 + \lambda\langle\psi_b^0|H'|\psi_b^0\rangle + O(\lambda^2) \tag{2.276}$$
In general, we would not be so lucky, i.e., if we just randomly choose a set of states $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$, the chance for this basis to be a good basis is extremely low (we will in general have $\langle\psi_a^0|H'|\psi_b^0\rangle \neq 0$). Is there a way to help us pick $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$? The answer is yes, for some cases.
If there is another Hermitian quantum operator $A$, which commutes with both $H_0$ and $H'$, then we can use the common eigenstates of $A$ and $H_0$ as the basis for $H_0$
$$H_0|\psi_a^0\rangle = E^0|\psi_a^0\rangle \tag{2.277}$$
$$H_0|\psi_b^0\rangle = E^0|\psi_b^0\rangle \tag{2.278}$$
$$A|\psi_a^0\rangle = A_a|\psi_a^0\rangle \tag{2.279}$$
$$A|\psi_b^0\rangle = A_b|\psi_b^0\rangle \tag{2.280}$$
In particular, if $A_a \neq A_b$, then $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$ are already a good basis.
To see this, we just need to prove that $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$.
Because we have assumed that $[A, H'] = AH' - H'A = 0$,
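$$0 = \langle\psi_a^0|\left(AH' - H'A\right)|\psi_b^0\rangle = \langle\psi_a^0|AH'|\psi_b^0\rangle - \langle\psi_a^0|H'A|\psi_b^0\rangle = A_a\langle\psi_a^0|H'|\psi_b^0\rangle - A_b\langle\psi_a^0|H'|\psi_b^0\rangle = (A_a - A_b)\langle\psi_a^0|H'|\psi_b^0\rangle \tag{2.281}$$
If $A_a \neq A_b$, this equation means that $\langle\psi_a^0|H'|\psi_b^0\rangle = 0$.
For this situation, although we have a degeneracy, one can just do non-degenerate perturbation theory for $|\psi_a^0\rangle$ and $|\psi_b^0\rangle$ (separately) and there will be no singularities at all.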
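The argument above can be illustrated with small matrices. In the sketch below (assuming NumPy), the degenerate subspace is two-dimensional, so $H_0 = E^0$ times the identity commutes with everything and only $A$ and $H'$ need to be specified. Both are built, purely for illustration, from the same hypothetical unitary `U`, which guarantees $[A, H'] = 0$; all numerical values are made up.

```python
import numpy as np

theta, phi = 0.7, 0.3   # arbitrary angles defining a hypothetical unitary matrix U
c, s = np.cos(theta), np.sin(theta)
U = np.array([[c, -np.exp(-1j * phi) * s],
              [np.exp(1j * phi) * s, c]])

# A and H' share the eigenvectors stored in the columns of U, hence [A, H'] = 0.
A = U @ np.diag([1.0, 2.0]) @ U.conj().T      # eigenvalues Aa = 1, Ab = 2 (distinct)
Hp = U @ np.diag([0.4, -0.2]) @ U.conj().T
commutator = A @ Hp - Hp @ A

# Good basis: the eigenvectors of A (columns of `basis`).
_, basis = np.linalg.eigh(A)
Hp_in_A_basis = basis.conj().T @ Hp @ basis
off_diagonal = abs(Hp_in_A_basis[0, 1])
first_order = np.diag(Hp_in_A_basis).real      # <psi_a|H'|psi_a>, <psi_b|H'|psi_b>
print(off_diagonal, first_order)
```

Here the off-diagonal element of $H'$ vanishes in the eigenbasis of $A$, so the diagonal elements can be used directly as first order corrections, as in (2.275) and (2.276).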
2.5. the fine structure of a hydrogen atom
2.5.1. Relativistic correction
In QMI, we solved an ideal model for a hydrogen atom (i.e. a particle in 1 /r potential). In a real hydrogen atom, that model missed some of the
physics, and one of them is relativistic effects.
Q: what is the energy of a particle, if the particle is moving at speed v and the rest mass m.
E = M c2 =m
1 -v2
c2
c2
(2.282)
Q: what is the momentum of a particle, if the particle is moving at speed v and the rest mass m.
p = M v =m
1 -v2
c2
v
(2.283)
As a result,
E = p2 c2 + m2 c4 (2.284)
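To prove this relation, we start from the r.h.s.,
$$\sqrt{p^2 c^2 + m^2 c^4} = \sqrt{\frac{m^2 v^2}{1 - v^2/c^2}\,c^2 + m^2 c^4} = \sqrt{\frac{m^2 v^2 c^2}{1 - v^2/c^2} + \frac{m^2 c^4\,(1 - v^2/c^2)}{1 - v^2/c^2}} = \sqrt{\frac{m^2 v^2 c^2 + m^2 c^4 - m^2 c^2 v^2}{1 - v^2/c^2}} = \sqrt{\frac{m^2 c^4}{1 - v^2/c^2}} = \frac{m}{\sqrt{1 - v^2/c^2}}\,c^2 = E \tag{2.285}$$
This relation between $E$ and $p$ is a very important relation in relativistic physics!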
Q: What is the kinetic energy?
A: First, find the energy of a particle when it is not moving ($p = 0$). Then measure the energy again when it is moving (with momentum $p$). The energy difference between the two is the kinetic energy of this particle.
$$T = E - m c^2 = \sqrt{p^2 c^2 + m^2 c^4} - m c^2 = m c^2\left[\sqrt{1 + \left(\frac{p}{m c}\right)^2} - 1\right] \tag{2.286}$$
When the particle is moving at low velocity ($v \ll c$), $p \ll m c$, and thus $p/mc \ll 1$. As a result, we can use the following expansion
$$\sqrt{1 + x} = 1 + \frac{x}{2} - \frac{x^2}{8} + \ldots \tag{2.287}$$
i.e.,
$$\sqrt{1 + x} - 1 = \frac{x}{2} - \frac{x^2}{8} + \ldots \tag{2.288}$$
So,
$$T = m c^2\left[\frac{1}{2}\left(\frac{p}{m c}\right)^2 - \frac{1}{8}\left(\frac{p}{m c}\right)^4 + \ldots\right] = \frac{p^2}{2 m} - \frac{p^4}{8 m^3 c^2} + \ldots \tag{2.289}$$
The first term here is the kinetic energy in classical mechanics. In relativistic physics, the kinetic energy is NOT just $p^2/2m$. Instead, there are many corrections. These corrections are small if a particle is moving at low speed. There, we can treat them as a perturbation
$$H' = -\frac{p^4}{8 m^3 c^2} \tag{2.290}$$
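To get a feeling for how good the truncation in (2.289) is, the following sketch compares the exact kinetic energy (2.286) with the two-term expansion, both in units of $m c^2$ and as a function of $x = p/(mc)$. The function names and the sample values of $x$ are illustrative choices.

```python
import math

def kinetic_exact(x):
    """Exact relativistic kinetic energy (2.286) in units of m c^2, x = p/(m c)."""
    return math.sqrt(1.0 + x * x) - 1.0

def kinetic_series(x):
    """First two terms of (2.289) in units of m c^2: x^2/2 - x^4/8."""
    return x**2 / 2.0 - x**4 / 8.0

for x in (0.01, 0.1, 0.5):
    exact = kinetic_exact(x)
    approx = kinetic_series(x)
    rel_error = abs(exact - approx) / exact
    print(f"x = {x}: exact = {exact:.6e}, series = {approx:.6e}, "
          f"relative error = {rel_error:.1e}")
```

The relative error grows quickly with $x$, which is the numerical face of the statement that the corrections are small only at low speed.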
NOTE: this treatment is NOT the rigorous way to combine special relativity with quantum mechanics, because it has one major flaw. At large $p$ or small $m$ (e.g., consider a very light particle), the series will diverge. This problem comes from the fact that we used a square root in the definition of the Hamiltonian. The square root is NOT an analytic function near $x = 0$, and thus will cause trouble (to see this, think about $f(x) = \sqrt{x}$; one can easily show that its first derivative diverges at $x = 0$, $\lim_{x\to 0} f'(x) \to \infty$). The correct way to do it is to use a matrix. Notice that for a matrix, a square root arises naturally (e.g., the eigenvalues of $\begin{pmatrix} m c^2 & p c \\ p c & -m c^2 \end{pmatrix}$ are $\pm\sqrt{p^2 c^2 + m^2 c^4}$; we get a square root without having any square root in the matrix). The person who figured this out is Dirac, and this is Dirac's theory of relativistic fermions.
If we ignore higher order terms, our hydrogen atom should follow this Hamiltonian
$$H = H_0 + H' \tag{2.291}$$
where $H_0 = \frac{p^2}{2 m} + V(r)$. With the perturbation $H'$, the energy of a hydrogen atom will be different from what we computed earlier. How large is the difference? This question can be answered by perturbation theory.
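We already know the energy spectrum of $H_0$,
$$E_n^0 = -\frac{13.6\ \mathrm{eV}}{n^2} \quad \text{with } n = 1, 2, 3, \ldots \tag{2.292}$$
More precisely,
$$E_n^0 = -\frac{1}{n^2}\,\frac{m}{2\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 \tag{2.293}$$
We often define the Bohr radius $a$ as
$$a = \frac{\hbar^2}{m}\,\frac{4\pi\epsilon_0}{e^2} \tag{2.294}$$
And then,
$$E_n^0 = -\frac{1}{n^2}\,\frac{1}{2}\,\frac{e^2}{4\pi\epsilon_0 a} \tag{2.295}$$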
For $E_n$, there are $n^2$ degenerate quantum states (ignoring spin for the moment) $\psi_{nlm}$, where $l$ is the angular momentum quantum number, $l = 0, 1, 2, \ldots, n-1$, and $m$ is the quantum number for $L_z$, $m = -l, -l+1, \ldots, 0, \ldots, l-1, l$:
$$L^2\,\psi_{nlm} = \hbar^2\, l(l+1)\,\psi_{nlm} \tag{2.296}$$
$$L_z\,\psi_{nlm} = \hbar\, m\,\psi_{nlm} \tag{2.297}$$
For $n = 1$ there is no degeneracy, and we can do non-degenerate perturbation theory. For any $n > 1$, there are $n^2$ degenerate states, and thus we should do degenerate perturbation theory. However, we are very lucky here. We don't need to worry about degenerate perturbation theory, because $\psi_{nlm}$ is already a good basis:
$$\langle\psi_{nlm}|H'|\psi_{nl'm'}\rangle = 0 \tag{2.298}$$
if $l \neq l'$ or $m \neq m'$.
This is because both $H_0$ and $H'$ commute with $L^2$, and we can also show that both $H_0$ and $H'$ commute with $L_z$. Here, $L^2$ and $L_z$ serve as the $A$ operator that we defined in the previous section. For a fixed $n$, any two distinct degenerate states differ in their eigenvalue of $L^2$ or of $L_z$ (different $l$ or $m$), so $\langle\psi_{nlm}|H'|\psi_{nl'm'}\rangle = 0$. So we don't need to choose any other basis and can start with non-degenerate perturbation theory.
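The correction to the energy is (to first order)
$$E^1_{nlm} = \langle\psi_{nlm}|H'|\psi_{nlm}\rangle = -\left\langle\psi_{nlm}\left|\frac{p^4}{8 m^3 c^2}\right|\psi_{nlm}\right\rangle = -\frac{1}{8 m^3 c^2}\,\langle\psi_{nlm}|p^4|\psi_{nlm}\rangle = -\frac{1}{8 m^3 c^2}\,\langle\psi_{nlm}|p^2\,p^2|\psi_{nlm}\rangle \tag{2.299}$$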
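For $|\psi_{nlm}\rangle$, we know that
$$H_0|\psi_{nlm}\rangle = E_n^0|\psi_{nlm}\rangle \tag{2.300}$$
$$\left(\frac{p^2}{2m} + V\right)|\psi_{nlm}\rangle = E_n^0|\psi_{nlm}\rangle \tag{2.301}$$
$$\frac{p^2}{2m}|\psi_{nlm}\rangle = \left(E_n^0 - V\right)|\psi_{nlm}\rangle \tag{2.302}$$
$$p^2|\psi_{nlm}\rangle = 2m\left(E_n^0 - V\right)|\psi_{nlm}\rangle \tag{2.303}$$
The conjugate of this equation gives
$$\langle\psi_{nlm}|\,p^2 = \langle\psi_{nlm}|\,2m\left(E_n^0 - V\right) \tag{2.305}$$
So,
$$\langle\psi_{nlm}|p^2\,p^2|\psi_{nlm}\rangle = \langle\psi_{nlm}|\,2m\left(E_n^0 - V\right)\,2m\left(E_n^0 - V\right)|\psi_{nlm}\rangle = 4m^2\,\langle\psi_{nlm}|\left(E_n^0 - V\right)^2|\psi_{nlm}\rangle \tag{2.306}$$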
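Here, $V = -\frac{e^2}{4\pi\epsilon_0}\frac{1}{r}$ and
$$\langle\psi_{nlm}|\left(E_n^0 - V\right)^2|\psi_{nlm}\rangle = \langle\psi_{nlm}|\left(E_n^0\right)^2 - 2E_n^0 V + V^2|\psi_{nlm}\rangle = \left(E_n^0\right)^2 - 2E_n^0\,\langle\psi_{nlm}|V|\psi_{nlm}\rangle + \langle\psi_{nlm}|V(r)^2|\psi_{nlm}\rangle = \left(E_n^0\right)^2 + 2E_n^0\,\frac{e^2}{4\pi\epsilon_0}\left\langle\psi_{nlm}\left|\frac{1}{r}\right|\psi_{nlm}\right\rangle + \left(\frac{e^2}{4\pi\epsilon_0}\right)^2\left\langle\psi_{nlm}\left|\frac{1}{r^2}\right|\psi_{nlm}\right\rangle \tag{2.307}$$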
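Without going into details, we will just quote the results here:
$$\left\langle\psi_{nlm}\left|\frac{1}{r}\right|\psi_{nlm}\right\rangle = \frac{1}{n^2 a} \tag{2.308}$$
$$\left\langle\psi_{nlm}\left|\frac{1}{r^2}\right|\psi_{nlm}\right\rangle = \frac{1}{n^3\left(l + \frac{1}{2}\right) a^2} \tag{2.309}$$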
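Thus,
$$E^1_{nlm} = -\frac{1}{8 m^3 c^2}\,\langle\psi_{nlm}|p^2\,p^2|\psi_{nlm}\rangle = -\frac{1}{8 m^3 c^2}\,4m^2\left[\left(E_n^0\right)^2 + E_n^0\,\frac{e^2}{2\pi\epsilon_0}\,\frac{1}{n^2 a} + \left(\frac{e^2}{4\pi\epsilon_0}\right)^2\frac{1}{n^3\left(l + \frac{1}{2}\right) a^2}\right] = -\frac{1}{8 m^3 c^2}\,4m^2\left[\left(E_n^0\right)^2 + \frac{2E_n^0}{n^2}\,\frac{e^2}{4\pi\epsilon_0 a} + \frac{1}{n^3\left(l + \frac{1}{2}\right)}\left(\frac{e^2}{4\pi\epsilon_0 a}\right)^2\right] \tag{2.310}$$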
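As we have shown earlier,
$$E_n^0 = -\frac{1}{n^2}\,\frac{1}{2}\,\frac{e^2}{4\pi\epsilon_0 a} \tag{2.311}$$
$$-2n^2 E_n^0 = \frac{e^2}{4\pi\epsilon_0 a} \tag{2.312}$$
$$E^1_{nlm} = -\frac{1}{2 m c^2}\left[\left(E_n^0\right)^2 + \frac{2E_n^0}{n^2}\,\frac{e^2}{4\pi\epsilon_0 a} + \frac{1}{n^3\left(l + \frac{1}{2}\right)}\left(\frac{e^2}{4\pi\epsilon_0 a}\right)^2\right] = -\frac{1}{2 m c^2}\left[\left(E_n^0\right)^2 - \frac{2E_n^0}{n^2}\,2n^2 E_n^0 + \frac{1}{n^3\left(l + \frac{1}{2}\right)}\left(-2n^2 E_n^0\right)^2\right] = -\frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] \tag{2.313}$$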
So, the eigenenergy of a H atom is
$$E_{nlm} = E_n^0 + E^1_{nlm} + \ldots = E_n^0 - \frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] + \ldots \tag{2.314}$$
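As a rough numerical illustration of (2.313), the sketch below evaluates the first-order relativistic shift in eV. It assumes $|E_1^0| = 13.6$ eV as in the text and an electron rest energy $m c^2 \approx 0.511$ MeV; the function names are illustrative.

```python
RYDBERG_EV = 13.6        # |E_1^0| in eV, value used in the text
MC2_EV = 510998.95       # electron rest energy m c^2 in eV (assumed standard value)

def unperturbed_energy(n):
    """E_n^0 = -13.6 eV / n^2, eq. (2.292)."""
    return -RYDBERG_EV / n**2

def relativistic_correction(n, l):
    """First-order shift from H' = -p^4/(8 m^3 c^2), eq. (2.313), in eV."""
    e0 = unperturbed_energy(n)
    return -(e0**2) / (2.0 * MC2_EV) * (4.0 * n / (l + 0.5) - 3.0)

for n in (1, 2, 3):
    for l in range(n):
        shift = relativistic_correction(n, l)
        ratio = shift / unperturbed_energy(n)
        print(f"n={n}, l={l}: shift = {shift:.3e} eV, shift/E_n^0 = {ratio:.2e}")
```

The shift is always negative and removes the degeneracy in $l$ within a given $n$, while leaving the degeneracy in $m$ untouched.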
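The zeroth order term
$$E_n^0 = -\frac{1}{2n^2}\,\frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 \tag{2.315}$$
is proportional to
$$E_n^0 \propto \frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = \left(\frac{e^2}{4\pi\epsilon_0\hbar c}\right)^2 m c^2 \tag{2.316}$$
The prefactor $\frac{e^2}{4\pi\epsilon_0\hbar c}$ is a very important physical constant, known as the fine structure constant,
$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.317}$$
So,
$$E_n^0 \propto \alpha^2\, m c^2 \tag{2.318}$$
The first order term
$$E^1_{nlm} \propto \frac{\left(E_n^0\right)^2}{m c^2} \propto \alpha^4\, m c^2 \tag{2.319}$$
If we compare the first order and zeroth order terms,
$$\frac{E^1_{nlm}}{E_n^0} \propto \frac{\alpha^4\, m c^2}{\alpha^2\, m c^2} = \alpha^2 = \left(\frac{1}{137.036}\right)^2 \approx 5.3\times10^{-5} \sim 10^{-4} \tag{2.320}$$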
So indeed, perturbation theory works, i.e., the higher order term is much smaller than the leading order. (Remember that Taylor expansions only converge when the small parameter $\lambda$ is small enough. Here, our small parameter is $\alpha$, which is smaller than 1%.)
The relativistic correction is indeed a small correction in a H atom.
NOTE: the fine structure constant is one of the most important physical constants. It is dimensionless. It involves special relativity (it contains the speed of light). It involves quantum mechanics (having $\hbar$ in its definition) and it also involves E&M (having $\epsilon_0$ and $e^2$). In this section, we showed that we are lucky with the H atom: because $\alpha$ is small, the relativistic correction is indeed small and thus we can do perturbation theory. In QFT (QED, quantum electrodynamics), small $\alpha$ means that the interactions between particles are a small perturbation. To leading order, we can treat a particle as a free particle, and then add E&M interactions as a perturbation. Because the small parameter $\alpha$ is so small, in QED our perturbation theory converges very fast. First order perturbation theory gives us an accuracy of order $\alpha^1 \sim 10^{-2}$. Second order perturbation theory increases the accuracy to $\alpha^2 \sim 10^{-4}$. By going to 5th order in perturbation theory, we can get an accuracy of order $\alpha^5 \sim 10^{-11}$. This is the reason why QED is such a successful theory.
2.5.2. Spin-orbit coupling
In QM I, we treated the spin of an electron as an independent quantity, independent of the orbital angular momentum. For example, in a hydrogen atom, any eigenwavefunction $\psi_{nlm}(x, y, z)$ actually stands for two degenerate states: (1) one electron with wavefunction $\psi_{nlm}(x, y, z)$ and spin up and (2) one electron with wavefunction $\psi_{nlm}(x, y, z)$ and spin down. This conclusion remains the same after we take into account the relativistic correction (now the eigenenergy depends on both $n$ and $l$, but every eigenwavefunction still stands for two degenerate states once we take the spin into account).
In this section, we will consider one more effect, which was ignored previously. This effect tells us that the spin of an electron and its orbital motion are coupled together.
Warning: you may find the derivation in this section very disturbing. Multiple times, after some derivation, we will say the following without providing much justification: "by the way, this result is in fact not quite right, and we will need to throw in an extra factor of 2 to get the correct answer." The reason for these extra factors of 2 is that this section is NOT treating spin-orbit coupling in the rigorous and correct way, which requires Dirac's equation. Instead, what we are trying to do here is to use various tricks to recover Dirac's final conclusion without using Dirac's equation. These tricks (they are not rigorous at all) get some parts of the story right, but in many cases they lead to wrong results. Because we already know the right answer from Dirac's equation, whenever we find that these tricks fail to give the correct answer, we will correct it by adding some extra factor. Within our derivation, these extra factors look totally unreasonable and weird, but if one starts from Dirac's equation, the results are all very natural and straightforward. Bottom line: please don't take these derivations very seriously, because they are not supposed to give a (fully) correct description after all. But the physics, at the end of the day, is correct.
Magnetic dipole of an electron
If a charged particle moves in circles, it creates a circular current, and the circular current results in a magnetic dipole moment (according to E&M). We use a simple model to demonstrate this physics. Assume that we have a ring, and there is a charged particle (with charge $q$) moving around the ring. The dipole moment is
$$\vec\mu = \frac{1}{2}\,q\,\vec r\times\vec v \tag{2.321}$$
where $q$ is the charge of the particle, and $\vec r$ and $\vec v$ are the location and velocity of the particle.
$$\vec\mu = \frac{1}{2}\,q\,\vec r\times\vec v = \frac{q}{2m}\,m\,\vec r\times\vec v = \frac{q}{2m}\,\vec L \tag{2.322}$$
where $\vec L = m\,\vec r\times\vec v$ is the angular momentum.
Similarly, if we have a spinning charged particle, the angular momentum from the spin will also result in a magnetic dipole. The dipole moment from spin is also proportional to the angular momentum of the spin, but with an extra factor known as the g-factor
$$\vec\mu = g\,\frac{q}{2m}\,\vec S \tag{2.323}$$
The charge of an electron is $-e$ (negative charge), so
$$\vec\mu = -g\,\frac{e}{2m}\,\vec S \tag{2.324}$$
and $g$ is a number whose value is really close to 2. In this course, we will say that $g = 2$ for simplicity, but in reality, $g = 2.00231930436182$. In Dirac's theory, $g$ is exactly 2. The reason the real value of $g$ is a little bit larger than 2 is the interaction between electrons and photons (light), which isn't considered in Dirac's equation.
Note: in many cases, people absorb the minus sign into the definition of $g$,
$$\vec\mu = g\,\frac{e}{2m}\,\vec S \tag{2.325}$$
where $g = -2$. But no matter what convention one adopts,
$$\vec\mu = -\frac{e}{m}\,\vec S \tag{2.326}$$
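We can define the Bohr magneton, which is a fundamental physical constant
$$\mu_B = \frac{e\hbar}{2m} = 9.27400968(20)\times10^{-24}\ \mathrm{J/T} \tag{2.327}$$
and
$$\vec\mu = g\,\frac{e}{2m}\,\vec S = g\,\frac{e\hbar}{2m}\,\frac{\vec S}{\hbar} = g\,\mu_B\,\frac{\vec S}{\hbar} \tag{2.328}$$
For an electron, the magnetic dipole is $\pm\mu_B$. We demonstrate this by considering the dipole moment along the $z$ direction
$$\mu_z = g\,\mu_B\,\frac{S_z}{\hbar} \tag{2.329}$$
The spin operator $S_z$ has eigenvalues $\pm\hbar/2$, and $g = -2$. For an eigenstate of $S_z$,
$$\mu_z = \begin{cases} -\mu_B & \text{if the } S_z \text{ eigenvalue is } +\hbar/2 \text{, i.e., spin up} \\ +\mu_B & \text{if the } S_z \text{ eigenvalue is } -\hbar/2 \text{, i.e., spin down} \end{cases} \tag{2.330}$$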
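The numbers quoted for the Bohr magneton can be reproduced from its definition (2.327). The sketch below uses standard SI values for $e$, $\hbar$ and the electron mass (these constants are assumptions brought in for the illustration, not values from the notes) and also evaluates $\mu_z$ for the two spin states with the convention $g = -2$.

```python
E_CHARGE = 1.602176634e-19      # elementary charge in C (SI value, assumed)
HBAR = 1.054571817e-34          # reduced Planck constant in J s (assumed)
M_ELECTRON = 9.1093837015e-31   # electron mass in kg (assumed)

# Bohr magneton, eq. (2.327)
mu_B = E_CHARGE * HBAR / (2.0 * M_ELECTRON)   # in J/T
mu_B_eV = mu_B / E_CHARGE                     # in eV/T

def mu_z(ms, g=-2.0):
    """z-component of the spin magnetic moment, eq. (2.329), for S_z = ms * hbar."""
    return g * mu_B * ms

print(f"mu_B = {mu_B:.5e} J/T = {mu_B_eV:.4e} eV/T")
print(f"spin up:   mu_z / mu_B = {mu_z(+0.5) / mu_B:+.1f}")
print(f"spin down: mu_z / mu_B = {mu_z(-0.5) / mu_B:+.1f}")
```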
Effective B field from the nucleon
If we stand on the electron (using the electron as our reference frame), we will find that the nucleon is moving around us in a circle. Because the nucleon has positive charge $+e$, when it moves around us it generates a circular current and thus a magnetic field. According to the Biot-Savart law in E&M, the B field generated by a wire with current $I$ is
$$\vec B = \frac{\mu_0 I}{4\pi}\int\frac{d\vec l\times\vec r}{r^3} \tag{2.331}$$
where $d\vec l$ is a small section of the wire, with direction parallel to the wire, and $\vec r$ is the displacement between the wire and the place at which we want to measure the B field. For circular motion of the nucleon, the wire here is a circle and we want to know the B field at the center of the circle
$$B = \frac{\mu_0 I}{4\pi}\int_0^{2\pi}\frac{(r\,d\theta)\,r}{r^3} = \frac{\mu_0 I}{4\pi r}\int_0^{2\pi}d\theta = \frac{\mu_0 I}{4\pi r}\,2\pi = \frac{\mu_0 I}{2r} \tag{2.332}$$
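The current $I$ here is
$$I = \frac{e}{T} \tag{2.333}$$
where $e$ is (the absolute value of) the charge of an electron (remember that the nucleon in a hydrogen atom has charge $+e$), and $T$ is how long it takes for the nucleon to go around the circle once.
$$I = \frac{e}{T} = \frac{e}{2\pi/\omega} = \frac{e\,\omega}{2\pi} \tag{2.334}$$
Here, $\omega$ is the angular velocity. Notice that this angular velocity is the same as the angular velocity of the electron in the rest frame of the nucleon, $\omega = L/(m r^2)$, so
$$I = \frac{e\,\omega}{2\pi} = \frac{e L}{2\pi m r^2} \tag{2.335}$$
So
$$B = \frac{\mu_0 I}{2r} = \frac{\mu_0}{2}\,\frac{e L}{2\pi m r^3} = \epsilon_0\mu_0\,\frac{e L}{4\pi\epsilon_0 m r^3} \tag{2.336}$$
Notice that $\epsilon_0\mu_0 = 1/c^2$ and, in addition, it is easy to see that $\vec B \parallel \vec L$, so we get
$$\vec B = \epsilon_0\mu_0\,\frac{e\vec L}{4\pi\epsilon_0 m r^3} = \frac{e}{4\pi\epsilon_0}\,\frac{\vec L}{m c^2 r^3} \tag{2.337}$$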
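To see what size of field (2.337) implies, the sketch below plugs in illustrative values: $r$ equal to the Bohr radius and $L = \hbar$. Both choices are order-of-magnitude assumptions made for this estimate only, and the constants are standard SI values that do not come from the notes.

```python
import math

E_CHARGE = 1.602176634e-19      # C (assumed SI value)
HBAR = 1.054571817e-34          # J s (assumed)
M_ELECTRON = 9.1093837015e-31   # kg (assumed)
EPS0 = 8.8541878128e-12         # F/m (assumed)
C_LIGHT = 299792458.0           # m/s

COULOMB = E_CHARGE / (4.0 * math.pi * EPS0)   # e / (4 pi eps0)

# Bohr radius, eq. (2.294)
bohr_radius = HBAR**2 * 4.0 * math.pi * EPS0 / (M_ELECTRON * E_CHARGE**2)

def effective_field(L, r):
    """Magnitude of the B field seen by the electron, eq. (2.337), in tesla."""
    return COULOMB * L / (M_ELECTRON * C_LIGHT**2 * r**3)

B_estimate = effective_field(HBAR, bohr_radius)
print(f"a = {bohr_radius:.4e} m, B ~ {B_estimate:.1f} T")
```

A field of this size is large by laboratory standards, which is one way to see why an external field has to be quite strong before it dominates over spin-orbit coupling.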
Magnetic dipole in a B field
We showed above that, in the frame of the electron, the electron feels a B field generated by the nucleon
$$\vec B = \frac{e}{4\pi\epsilon_0}\,\frac{\vec L}{m c^2 r^3} \tag{2.338}$$
and the electron has a magnetic dipole
$$\vec\mu = -\frac{e}{m}\,\vec S \tag{2.339}$$
A dipole in a B field has energy
$$H' = -\vec\mu\cdot\vec B = \frac{e^2}{4\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec S\cdot\vec L \tag{2.340}$$
Here, our naive tricks are off by a factor of 2 in comparison to the correct result (from Dirac's equation). The right result is
$$H_{SO} = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec S\cdot\vec L \tag{2.341}$$
Therefore, we add one extra term to the Hamiltonian
$$H = H_0 + H_{SO} \tag{2.342}$$
Here, $H_0$ is what we learned in QM I, and $H_{SO}$ is the new term. We treat $H_0$ as the unperturbed Hamiltonian, and treat $H_{SO}$ as a small perturbation.
Basis without HSO
For the orbital angular momentum, we know the commutation relations
$$[L_x, L_y] = i\hbar L_z \tag{2.343}$$
$$[L_y, L_z] = i\hbar L_x \tag{2.344}$$
$$[L_z, L_x] = i\hbar L_y \tag{2.345}$$
or we can write the same formulas as
$$[L_i, L_j] = i\hbar\,\epsilon_{ijk}\,L_k \tag{2.346}$$
where $\epsilon_{ijk}$ is the Levi-Civita symbol.
For $\vec L$, we know that we can define the operator $L^2$
$$L^2 = L_x^2 + L_y^2 + L_z^2 \tag{2.347}$$
and we know that it commutes with $L_z$: $[L^2, L_z] = 0$. Because the components do not commute with one another, we cannot measure all three components of the angular momentum at the same time (uncertainty principle). But we can measure $L^2$ and $L_z$ at the same time, by defining common eigenstates of these two operators
$$L^2\,\psi_{lm}(x, y, z) = l(l+1)\,\hbar^2\,\psi_{lm}(x, y, z) \tag{2.348}$$
$$L_z\,\psi_{lm}(x, y, z) = m\,\hbar\,\psi_{lm}(x, y, z) \tag{2.349}$$
Here, $l$ is a non-negative integer, $l = 0, 1, 2, \ldots$, and $m$ is an integer between $-l$ and $+l$. The eigenwavefunctions are very easy to write down in spherical coordinates
$$\psi_{lm}(r, \theta, \phi) = R(r)\,Y_l^m(\theta, \phi) \tag{2.350}$$
where $R(r)$ is an arbitrary function of $r$ (it doesn't depend on $\theta$ or $\phi$), and $Y_l^m(\theta, \phi)$ are a set of special functions known as the spherical harmonics.
For spins, we have the same commutation relations,
$$[S_x, S_y] = i\hbar S_z \tag{2.351}$$
$$[S_y, S_z] = i\hbar S_x \tag{2.352}$$
$$[S_z, S_x] = i\hbar S_y \tag{2.353}$$
And again, we can define
$$S^2 = S_x^2 + S_y^2 + S_z^2 \tag{2.354}$$
And same as above, $[S^2, S_z] = 0$, so we can measure $S^2$ and $S_z$ at the same time.
$$S^2|s, m\rangle = s(s+1)\,\hbar^2\,|s, m\rangle \tag{2.355}$$
$$S_z|s, m\rangle = m\,\hbar\,|s, m\rangle \tag{2.356}$$
where $s$ is a non-negative integer or half-integer, $s = 0, 1/2, 1, 3/2, \ldots$ Once $s$ is determined, $m = -s, -s+1, \ldots, s-1, s$. For electrons, $s = 1/2$, and thus $m = -1/2$ or $+1/2$.
In addition, we know that $\vec S$ and $\vec L$ commute with each other,
$$[\vec S, \vec L] = 0 \tag{2.357}$$
Without spin-orbit coupling (i.e., for the unperturbed Hamiltonian $H_0$), we can easily prove that $[H_0, \vec S] = [H_0, \vec L] = 0$. As a result, we find that the following operators commute with one another: $H_0$, $L^2$, $L_z$, $S^2$, $S_z$. So we can require our quantum states to be common eigenstates of all these operators, $|n, l, m, s, s_z\rangle$:
$$H_0|n, l, m, s, s_z\rangle = -\frac{13.6\ \mathrm{eV}}{n^2}\,|n, l, m, s, s_z\rangle \tag{2.358}$$
$$L^2|n, l, m, s, s_z\rangle = l(l+1)\,\hbar^2\,|n, l, m, s, s_z\rangle \tag{2.359}$$
$$L_z|n, l, m, s, s_z\rangle = m\,\hbar\,|n, l, m, s, s_z\rangle \tag{2.360}$$
$$S^2|n, l, m, s, s_z\rangle = \frac{3}{4}\,\hbar^2\,|n, l, m, s, s_z\rangle \tag{2.361}$$
$$S_z|n, l, m, s, s_z\rangle = s_z\,\hbar\,|n, l, m, s, s_z\rangle \tag{2.362}$$
where $s_z = +1/2$ or $-1/2$. For an electron, we know that $s$ is always $1/2$, so we don't really need to write it out: $|n, l, m_l, m_s\rangle$. Compared to the results without spin ($\psi_{nlm}$), the only thing we get here is an extra index $s_z = \pm1/2$. This quantum number tells us whether the spin is pointing up or down. At the end of the day, we didn't get anything beyond what we already knew, except that we now need to specify whether the spin of the electron is up or down.
With SO coupling, the basis described above is NOT a good choice, because in general $\langle n, l, m, s_z|H_{SO}|n', l', m', s_z'\rangle \neq 0$, i.e., to do degenerate perturbation theory, we will need a new basis.
Basis with HSO
To get the proper basis, we could go through the derivation that we demonstrated for degenerate perturbation theory. Here, instead, we will use a trick to get the correct basis directly. The trick is what we proved earlier. We know that if we can find a quantum operator $A$ which commutes with both $H_0$ and the perturbation $H_{SO}$, we can use the common eigenstates of $A$ and $H_0$ as a basis. If every state in this set has a different eigenvalue of $A$, then it is a good basis for degenerate perturbation theory.
In the previous section (relativistic correction), we used $L^2$ and $L_z$ to serve as the $A$ operator. Here, after taking spin into account and for the perturbation $H_{SO}$, we will need to use $L^2$, $S^2$, $J^2$ and $J_z$ as $A$.
If an electron has both orbital and spin angular momenta, we can add them up to get the total angular momentum
$$\vec J = \vec L + \vec S \tag{2.363}$$
or equivalently,
$$J_x = L_x + S_x \tag{2.364}$$
$$J_y = L_y + S_y \tag{2.365}$$
$$J_z = L_z + S_z \tag{2.366}$$
For the components of $\vec J$, we have the same commutation relations
$$[J_x, J_y] = i\hbar J_z \tag{2.367}$$
$$[J_y, J_z] = i\hbar J_x \tag{2.368}$$
$$[J_z, J_x] = i\hbar J_y \tag{2.369}$$
And we can also define $J^2$ as
$$J^2 = J_x^2 + J_y^2 + J_z^2 \tag{2.370}$$
Same as for $L$ and $S$, we know that $[J^2, J_z] = 0$. So we can measure $J^2$ and $J_z$ at the same time
$$J^2|j, m\rangle = j(j+1)\,\hbar^2\,|j, m\rangle \tag{2.371}$$
$$J_z|j, m\rangle = m\,\hbar\,|j, m\rangle \tag{2.372}$$
where $j$ is a non-negative integer or half-integer, $j = 0, 1/2, 1, 3/2, \ldots$ Once $j$ is determined, $m = -j, -j+1, \ldots, j-1, j$.
If we have a particle with spin quantum number $s$ and orbital angular momentum quantum number $l$, then $j = l + s, l + s - 1, \ldots, |l - s|$. NOTE: $j$ cannot be negative. For spin $s = 1/2$, this means that $j = l - 1/2$ or $j = l + 1/2$ for $l \geq 1$, and $j = 1/2$ if $l = 0$.
◼ If we put an electron in an s-wave state ($l = 0$), the total angular momentum is $j = 1/2$
◼ If we put an electron in a p-wave state ($l = 1$), the total angular momentum is $j = 1/2$ or $3/2$
◼ If we put an electron in a d-wave state ($l = 2$), the total angular momentum is $j = 3/2$ or $5/2$
◼ If we put an electron in an f-wave state ($l = 3$), the total angular momentum is $j = 5/2$ or $7/2$
◼ ...
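The rule for the allowed values of $j$ is easy to enumerate. The sketch below (function names are illustrative) lists $j$ and $j_z$ using exact fractions, and checks that the number of states is the same in the $|l, m_l\rangle|s, m_s\rangle$ basis and in the $|j, j_z\rangle$ basis, i.e., $\sum_j (2j+1) = (2l+1)(2s+1)$.

```python
from fractions import Fraction

def allowed_j(l, s=Fraction(1, 2)):
    """Return j = |l - s|, |l - s| + 1, ..., l + s as exact fractions."""
    l, s = Fraction(l), Fraction(s)
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values

def allowed_jz(j):
    """Return j_z = -j, -j + 1, ..., j (in units of hbar)."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]

for l, name in enumerate("spdf"):
    js = allowed_j(l)
    count = sum(len(allowed_jz(j)) for j in js)
    print(f"{name}-wave (l={l}): j = {[str(j) for j in js]}, "
          f"{count} states, expected {(2 * l + 1) * 2}")
```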
In our homework, we will show that $J^2$, $J_z$, $L^2$, $S^2$ commute with $H_{SO}$. It is also straightforward to see that $J^2$, $J_z$, $L^2$, $S^2$ all commute with $H_0$, so we can use them as our $A$ operator. In addition, it is also easy to verify that these four operators commute with each other, so we can define common eigenstates of $H_0$, $J^2$, $J_z$, $L^2$, $S^2$ and use these common eigenstates as our basis for perturbation theory
$$H_0|n, l, s, j, j_z\rangle = -\frac{13.6\ \mathrm{eV}}{n^2}\,|n, l, s, j, j_z\rangle \tag{2.373}$$
$$L^2|n, l, s, j, j_z\rangle = l(l+1)\,\hbar^2\,|n, l, s, j, j_z\rangle \tag{2.374}$$
$$S^2|n, l, s, j, j_z\rangle = s(s+1)\,\hbar^2\,|n, l, s, j, j_z\rangle = \frac{3}{4}\,\hbar^2\,|n, l, s, j, j_z\rangle \tag{2.375}$$
$$J^2|n, l, s, j, j_z\rangle = j(j+1)\,\hbar^2\,|n, l, s, j, j_z\rangle \tag{2.376}$$
$$J_z|n, l, s, j, j_z\rangle = j_z\,\hbar\,|n, l, s, j, j_z\rangle \tag{2.377}$$
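Notice that
$$J^2 = \vec J\cdot\vec J = \left(\vec L + \vec S\right)\cdot\left(\vec L + \vec S\right) = \vec L\cdot\vec L + \vec S\cdot\vec S + 2\,\vec S\cdot\vec L = L^2 + S^2 + 2\,\vec S\cdot\vec L \tag{2.378}$$
As a result,
$$\vec S\cdot\vec L = \frac{J^2 - L^2 - S^2}{2} \tag{2.379}$$
So we can write our perturbation as
$$H_{SO} = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec S\cdot\vec L = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\frac{J^2 - L^2 - S^2}{2} \tag{2.380}$$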
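First order perturbation theory gives
$$E^1_{n,l,s,j,j_z} = \langle n, l, s, j, j_z|H_{SO}|n, l, s, j, j_z\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{2 m^2 c^2}\left\langle n, l, s, j, j_z\left|\frac{J^2 - L^2 - S^2}{r^3}\right|n, l, s, j, j_z\right\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{2 m^2 c^2}\left\langle n, l, s, j, j_z\left|\frac{j(j+1) - l(l+1) - s(s+1)}{r^3}\,\hbar^2\right|n, l, s, j, j_z\right\rangle = \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - s(s+1)}{m^2 c^2}\left\langle n, l, s, j, j_z\left|\frac{1}{r^3}\right|n, l, s, j, j_z\right\rangle \tag{2.381}$$
The average value of $1/r^3$ is known for the unperturbed Hamiltonian
$$\left\langle n, l, s, j, j_z\left|\frac{1}{r^3}\right|n, l, s, j, j_z\right\rangle = \int d\vec r\;\psi^*_{n,l,m}(r, \theta, \phi)\,\frac{1}{r^3}\,\psi_{n,l,m}(r, \theta, \phi) = \frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} \tag{2.382}$$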
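So
$$E^1_{n,l,s,j,j_z} = \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - s(s+1)}{m^2 c^2}\left\langle n, l, s, j, j_z\left|\frac{1}{r^3}\right|n, l, s, j, j_z\right\rangle = \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{m^2 c^2}\,\frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} = \frac{e^2}{16\pi\epsilon_0}\,\frac{1}{n^4 a^2}\,\frac{\hbar^2}{a}\,\frac{1}{m^2 c^2}\,\frac{n\left[j(j+1) - l(l+1) - \frac{3}{4}\right]}{l\left(l + \frac{1}{2}\right)(l+1)} \tag{2.383}$$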
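Remember, the Bohr radius is
$$a = \frac{\hbar^2}{m}\,\frac{4\pi\epsilon_0}{e^2} \tag{2.384}$$
and the eigenenergies of the unperturbed Hamiltonian are
$$E_n = -\frac{1}{n^2}\,\frac{e^2}{8\pi\epsilon_0 a} \tag{2.385}$$
so, using $\hbar^2/a = m e^2/(4\pi\epsilon_0)$,
$$E^1_{n,l,s,j,j_z} = \frac{e^2\hbar^2}{16\pi\epsilon_0}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{m^2 c^2}\,\frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} = \frac{e^2}{16\pi\epsilon_0}\,\frac{1}{n^4 a^2}\,\frac{m e^2}{4\pi\epsilon_0}\,\frac{1}{m^2 c^2}\,\frac{n\left[j(j+1) - l(l+1) - \frac{3}{4}\right]}{l\left(l + \frac{1}{2}\right)(l+1)} = \left(\frac{e^2}{8\pi\epsilon_0}\,\frac{1}{n^2 a}\right)^2\frac{1}{m c^2}\,\frac{n\left[j(j+1) - l(l+1) - \frac{3}{4}\right]}{l\left(l + \frac{1}{2}\right)(l+1)} = \frac{E_n^2}{m c^2}\,\frac{n\left[j(j+1) - l(l+1) - \frac{3}{4}\right]}{l\left(l + \frac{1}{2}\right)(l+1)} \tag{2.386}$$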
Recall that in the previous section, we found that the relativistic correction is (at first order)
$$-\frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] \tag{2.387}$$
If we combine both effects, to first order, the energy is
$$E_{n,j,l,s,j_z} = E_n^0 + \frac{\left(E_n^0\right)^2}{2 m c^2}\left\{\frac{2n\left[j(j+1) - l(l+1) - \frac{3}{4}\right]}{l\left(l + \frac{1}{2}\right)(l+1)} - \frac{4n}{l + \frac{1}{2}} + 3\right\} + \ldots \tag{2.388}$$
Notice that $j = l + 1/2$ or $l - 1/2$, so we have $l = j - 1/2$ or $j + 1/2$. For $l = j - 1/2$, we find that
$$E_{n,j,l,s,j_z} = E_n^0 - \frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \ldots \tag{2.389}$$
For $l = j + 1/2$, we find exactly the same result
$$E_{n,j,l,s,j_z} = E_n^0 - \frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \ldots \tag{2.390}$$
So we conclude that, no matter what, we have
$$E_{n,j,l,s,j_z} = E_n^0 - \frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \ldots \tag{2.391}$$
After taking into account both SO coupling and the relativistic correction, we find that the energy of a quantum state depends only on $n$ and $j$,
$$E_{n,j} = E_n^0 - \frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{j + 1/2} - 3\right] + \ldots \tag{2.392}$$
This is the fine structure correction in a hydrogen atom.
According to this formula, the fine structure correction always reduces the energy of a state by a very small fraction (the relative correction is of order $\alpha^2 \sim 10^{-4}$, i.e., a change of about 0.01%). The smaller $j$ is, the bigger this correction is. So states with $j = 1/2$, which include all s-wave states ($l = 0$), get the largest modification. States with $l > 0$, e.g. p-wave, d-wave, etc., split into two different energy levels (with $j = l - 1/2$ and $j = l + 1/2$, and the former has lower energy than the latter).
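NOTE: the fine structure correction can also be written as
$$E_{n,j} = E_n^0\left[1 - \frac{E_n^0}{2 m c^2}\left(\frac{4n}{j + 1/2} - 3\right) + \ldots\right] \tag{2.393}$$
Because
$$E_n^0 = -\frac{1}{2n^2}\,m c^2\left(\frac{e^2}{4\pi\epsilon_0\hbar c}\right)^2 = -\frac{\alpha^2}{2n^2}\,m c^2 \tag{2.394}$$
where $\alpha$ is the fine structure constant
$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.395}$$
we can rewrite the formula as
$$E_{n,j} = E_n^0\left[1 + \frac{\alpha^2}{4n^2}\left(\frac{4n}{j + 1/2} - 3\right) + \ldots\right] = E_n^0\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \ldots\right] = -\frac{13.6\ \mathrm{eV}}{n^2}\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \ldots\right] \tag{2.396}$$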
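Two claims made above can be checked with a few lines of code: that the combined bracket in (2.388) depends only on $j$ (not on which of $l = j \pm 1/2$ is used), and how large the resulting splitting is. The sketch below uses exact fractions for the first check and (2.396) for the second. The function names are illustrative, and the check is restricted to $l \geq 1$, where the spin-orbit expression is well defined.

```python
from fractions import Fraction

ALPHA = 1 / 137.036    # fine structure constant, value quoted in the text
RYDBERG_EV = 13.6      # |E_1^0| in eV
HALF = Fraction(1, 2)

def bracket_from_l(n, l, j):
    """Spin-orbit plus relativistic terms of (2.388), in units of (E_n^0)^2/(2 m c^2)."""
    n, l, j = Fraction(n), Fraction(l), Fraction(j)
    spin_orbit = 2 * n * (j * (j + 1) - l * (l + 1) - Fraction(3, 4)) / (
        l * (l + HALF) * (l + 1))
    relativistic = -(4 * n / (l + HALF) - 3)
    return spin_orbit + relativistic

def bracket_from_j(n, j):
    """The same quantity written with j only, as in (2.392)."""
    return -(4 * Fraction(n) / (Fraction(j) + HALF) - 3)

def fine_structure_energy(n, j):
    """E_{n,j} from (2.396) in eV, dropping higher-order terms."""
    return -RYDBERG_EV / n**2 * (1 + ALPHA**2 / n**2 * (n / (j + 0.5) - 0.75))

# l-independence for every l >= 1 and both j = l - 1/2 and j = l + 1/2
consistent = all(
    bracket_from_l(n, l, j) == bracket_from_j(n, j)
    for n in range(2, 6) for l in range(1, n) for j in (l - HALF, l + HALF)
)
splitting = fine_structure_energy(2, 1.5) - fine_structure_energy(2, 0.5)
print("bracket depends on j only:", consistent)
print(f"n = 2 splitting between j = 3/2 and j = 1/2: {splitting:.3e} eV")
```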
2.6. The Zeeman effect
In the previous section, we found that after considering relativistic effects (i.e., fine structure), the eigenenergies of a hydrogen atom depend only on the quantum numbers $n$ and $j$. In particular, the energy is independent of $j_z$.
For a fixed $j$, the states with $j_z = -j, -j+1, \ldots, j-1, j$ all have the same energy, i.e., there is a $(2j+1)$-fold degeneracy.
In this section, we will show that in the presence of an external B field, this $(2j+1)$-fold degeneracy is lifted.
In a magnetic field, the energy of a magnetic dipole is $E = -\vec\mu\cdot\vec B$. So for an atom
$$H_Z' = -\left(\vec\mu_L + \vec\mu_S\right)\cdot\vec B \tag{2.397}$$
where $\vec\mu_L$ is the magnetic dipole moment from orbital motion
$$\vec\mu_L = -\frac{e}{2m}\,\vec L = -\mu_B\,\frac{\vec L}{\hbar} \tag{2.398}$$
and $\vec\mu_S$ is the magnetic dipole moment from electron spin
$$\vec\mu_S = -2\times\frac{e}{2m}\,\vec S = -\frac{e}{m}\,\vec S = -2\mu_B\,\frac{\vec S}{\hbar} \tag{2.399}$$
where $\mu_B = \frac{e\hbar}{2m} = 5.788\times10^{-5}\ \mathrm{eV/T}$ is the Bohr magneton. Here, $\vec L$ and $\vec S$ are the angular momenta from orbital motion and electron spin, respectively, so
$$H_Z' = \mu_B\,\frac{\vec L + 2\vec S}{\hbar}\cdot\vec B \tag{2.400}$$
Without loss of generality, we set $\vec B$ along the $z$ direction,
$$\vec B = B\,\hat z \tag{2.401}$$
As a result,
$$H_Z' = \mu_B B\,\frac{L_z + 2 S_z}{\hbar} \tag{2.402}$$
Consider
$$H = H_0 + H_r' + H_{SO}' + H_Z' \tag{2.403}$$
where $H_0$ is the Hamiltonian that we studied in QM I (kinetic energy plus $1/r$ attraction), $H_r'$ is the relativistic correction, $H_{SO}'$ is the SO coupling, and $H_Z' = \mu_B B\,\frac{L_z + 2 S_z}{\hbar}$.
2.6.1. Difficulty
For the Hamiltonian $H$ above, the key difficulty lies in the fact that the last two terms, $H_{SO}'$ and $H_Z'$, do not commute with each other. For $H_Z'$, we must know $L_z$ and $S_z$. However, we learned earlier that $H_{SO}'$ doesn't commute with $L_z$ or $S_z$ ($H_{SO}'$ commutes with $J_z$, but not with $S_z$ or $L_z$, as we showed in our homework). So, we cannot measure $H_{SO}'$ together with $L_z$ and $S_z$ at the same time, but $H_Z'$ needs information about $L_z$ and $S_z$. This is the conflict.
NOTE: this problem comes from the g-factor of the electron. In $\vec\mu_L = -\mu_B\frac{\vec L}{\hbar}$ and $\vec\mu_S = -2\mu_B\frac{\vec S}{\hbar}$, the prefactors are DIFFERENT! (They differ by a factor of 2, i.e., the g-factor.) If there were no extra factor $g = 2$, things would be very easy. Then $H_Z' = \mu_B B\,\frac{L_z + S_z}{\hbar} = \mu_B B\,\frac{J_z}{\hbar}$, so we would only need $J_z$. But unfortunately, $H_Z'$ is not proportional to $J_z$.
2.6.2. Strong field
When $H_Z' \gg H_{SO}'$, we can treat $H_{SO}'$ and $H_r'$ (the two are comparable, as we learned in the previous section) as the perturbation, and thus our unperturbed Hamiltonian is
$$H_0 + H_Z' \tag{2.404}$$
The eigenstates of this Hamiltonian are the same as the eigenstates of $H_0$: $|n, l, m_l, m_s\rangle$. Here, $|n, l, m_l\rangle$ are the eigenwavefunctions that we learned in QM I, and we add back the spin $S_z$ quantum number $m_s$
$$L^2|n, l, m_l, m_s\rangle = \hbar^2\, l(l+1)\,|n, l, m_l, m_s\rangle \tag{2.405}$$
$$S_z|n, l, m_l, m_s\rangle = m_s\,\hbar\,|n, l, m_l, m_s\rangle \tag{2.406}$$
$$L_z|n, l, m_l, m_s\rangle = m_l\,\hbar\,|n, l, m_l, m_s\rangle \tag{2.407}$$
$$H_0|n, l, m_l, m_s\rangle = -\frac{13.6\ \mathrm{eV}}{n^2}\,|n, l, m_l, m_s\rangle \tag{2.408}$$
$$H_Z'|n, l, m_l, m_s\rangle = \frac{\mu_B B}{\hbar}\,(L_z + 2 S_z)\,|n, l, m_l, m_s\rangle = \frac{\mu_B B}{\hbar}\,\hbar m_l\,|n, l, m_l, m_s\rangle + 2\,\frac{\mu_B B}{\hbar}\,\hbar m_s\,|n, l, m_l, m_s\rangle = \mu_B B\,(m_l + 2 m_s)\,|n, l, m_l, m_s\rangle \tag{2.409}$$
So our zeroth order eigenenergy is
$$E^0_{n,l,m_l,m_s} = -\frac{13.6\ \mathrm{eV}}{n^2} + \mu_B B\,(m_l + 2 m_s) \tag{2.410}$$
At $B = 0$, we know that the energy is independent of $m_l$ and $m_s$, i.e., the states are (at least) $2\times(2l+1)$-fold degenerate. For finite $B$, however, these states split.
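Example: consider the states with $n = 2$ and $l = 1$ (first excited states with orbital angular momentum quantum number $l = 1$). There, $m_l = -1, 0, +1$ and $m_s = -\frac{1}{2}$ or $+\frac{1}{2}$. At $B = 0$, all six states are degenerate ($E = -13.6/4\ \mathrm{eV} = -3.4\ \mathrm{eV}$). In the presence of a strong B field,
$$E^0_{n,l,m_l,m_s} = \begin{cases} -3.4\ \mathrm{eV} + 2\mu_B B & m_l = +1 \text{ and } m_s = +1/2 \\ -3.4\ \mathrm{eV} + \mu_B B & m_l = 0 \text{ and } m_s = +1/2 \\ -3.4\ \mathrm{eV} & m_l = -1 \text{ and } m_s = +1/2 \text{, or } m_l = +1 \text{ and } m_s = -1/2 \\ -3.4\ \mathrm{eV} - \mu_B B & m_l = 0 \text{ and } m_s = -1/2 \\ -3.4\ \mathrm{eV} - 2\mu_B B & m_l = -1 \text{ and } m_s = -1/2 \end{cases} \tag{2.411}$$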
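The level pattern in (2.411) follows from enumerating $m_l + 2m_s$ over all states of a given $l$. A minimal sketch of this bookkeeping (the function name is illustrative):

```python
from collections import defaultdict
from fractions import Fraction

def strong_field_shifts(l):
    """Group the states |l, m_l, m_s> by the Zeeman shift m_l + 2 m_s (units of mu_B * B)."""
    half = Fraction(1, 2)
    levels = defaultdict(list)
    for ml in range(-l, l + 1):
        for ms in (-half, half):
            levels[ml + 2 * ms].append((ml, ms))
    return dict(levels)

shifts = strong_field_shifts(1)
for shift in sorted(shifts, reverse=True):
    states = ", ".join(f"(m_l={ml}, m_s={ms})" for ml, ms in shifts[shift])
    print(f"shift = {shift} mu_B B: {states}")
```

For $l = 1$ the six states fall into five levels, with the unshifted level being doubly degenerate, in agreement with (2.411).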
Now, we consider $H_{SO}'$ and $H_r'$. Because we assumed that they are much smaller than $H_0$ and $H_Z'$, we treat them as the perturbation and compute the first order correction to the eigenenergy
$$E^1_{n,l,m_l,m_s} = \langle n, l, m_l, m_s|H_r' + H_{SO}'|n, l, m_l, m_s\rangle \tag{2.412}$$
The relativistic correction is the same as what we learned before
$$\langle n, l, m_l, m_s|H_r'|n, l, m_l, m_s\rangle = -\frac{\left(E_n^0\right)^2}{2 m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] \tag{2.413}$$
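Because
$$E_n^0 = -\frac{1}{2n^2}\,\frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = -\frac{13.6}{n^2}\ \mathrm{eV} \tag{2.414}$$
$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.415}$$
we know that
$$\frac{E_n^0}{m c^2} = -\frac{1}{2n^2}\,\frac{1}{\hbar^2 c^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = -\frac{1}{2n^2}\,\alpha^2 \tag{2.416}$$
so
$$\langle n, l, m_l, m_s|H_r'|n, l, m_l, m_s\rangle = -\frac{E_n^0}{2}\,\frac{E_n^0}{m c^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] = \frac{E_n^0}{2}\,\frac{\alpha^2}{2n^2}\left[\frac{4n}{l + \frac{1}{2}} - 3\right] = -13.6\ \mathrm{eV}\,\frac{\alpha^2}{n^4}\left[\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right] \tag{2.417}$$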
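For spin-orbit coupling,
$$\langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = \left\langle n, l, m_l, m_s\left|\frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2 r^3}\,\vec S\cdot\vec L\right|n, l, m_l, m_s\right\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2}\left\langle n, l, m_l, m_s\left|\frac{\vec S\cdot\vec L}{r^3}\right|n, l, m_l, m_s\right\rangle \tag{2.418}$$
Notice that in the zeroth order wavefunctions $|n, l, m_l, m_s\rangle$, the spin and orbital angular momenta are independent of each other, so
$$\langle n, l, m_l, m_s|\vec S\cdot\vec L|n, l, m_l, m_s\rangle = \langle\vec S\rangle\cdot\langle\vec L\rangle = \langle S_x\rangle\langle L_x\rangle + \langle S_y\rangle\langle L_y\rangle + \langle S_z\rangle\langle L_z\rangle \tag{2.419}$$
In QM I, we learned that for eigenstates of $L^2$ and $L_z$, $\langle L_x\rangle = \langle L_y\rangle = 0$. And similarly, $\langle S_x\rangle = \langle S_y\rangle = 0$. And thus
$$\langle n, l, m_l, m_s|\vec S\cdot\vec L|n, l, m_l, m_s\rangle = \langle\vec S\rangle\cdot\langle\vec L\rangle = \langle S_z\rangle\langle L_z\rangle = m_s\hbar\,m_l\hbar = m_s m_l\,\hbar^2 \tag{2.420}$$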
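As a result,
$$\langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{1}{m^2 c^2}\left\langle n, l, m_l, m_s\left|\frac{\vec S\cdot\vec L}{r^3}\right|n, l, m_l, m_s\right\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{m_s m_l\,\hbar^2}{m^2 c^2}\left\langle n, l, m_l, m_s\left|\frac{1}{r^3}\right|n, l, m_l, m_s\right\rangle \tag{2.421}$$
and
$$\left\langle n, l, m_l, m_s\left|\frac{1}{r^3}\right|n, l, m_l, m_s\right\rangle = \int d\vec r\;\psi^*_{n,l,m}(r, \theta, \phi)\,\frac{1}{r^3}\,\psi_{n,l,m}(r, \theta, \phi) = \frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3 a^3} \tag{2.422}$$
where $a$ is the Bohr radius, $a = \frac{\hbar^2}{m}\,\frac{4\pi\epsilon_0}{e^2}$.
So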
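$$\langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = \frac{e^2}{8\pi\epsilon_0}\,\frac{m_s m_l\,\hbar^2}{m^2 c^2}\left(\frac{m}{\hbar^2}\,\frac{e^2}{4\pi\epsilon_0}\right)^3\frac{1}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} = \frac{1}{2}\left(\frac{e^2}{4\pi\epsilon_0 c\hbar}\right)^2\left(\frac{e^2}{4\pi\epsilon_0}\right)^2\frac{m}{\hbar^2}\,\frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} \tag{2.423}$$
Because
$$E_n^0 = -\frac{1}{2n^2}\,\frac{m}{\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 = -\frac{13.6}{n^2}\ \mathrm{eV} \tag{2.424}$$
$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137.036} \tag{2.425}$$
we can rewrite the formula as
$$\langle n, l, m_l, m_s|H_{SO}'|n, l, m_l, m_s\rangle = 13.6\ \mathrm{eV}\;\alpha^2\,\frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} \tag{2.426}$$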
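So our first order correction is
$$E^1_{n,l,m_l,m_s} = \langle n, l, m_l, m_s|H_r' + H_{SO}'|n, l, m_l, m_s\rangle = -13.6\ \mathrm{eV}\,\frac{\alpha^2}{n^4}\left[\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right] + 13.6\ \mathrm{eV}\;\alpha^2\,\frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)\,n^3} = \frac{13.6\ \mathrm{eV}}{n^3}\,\alpha^2\left[\frac{3}{4n} - \frac{1}{l + \frac{1}{2}} + \frac{m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)}\right] = \frac{13.6\ \mathrm{eV}}{n^3}\,\alpha^2\left[\frac{3}{4n} - \frac{l(l+1) - m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)}\right] \tag{2.427}$$
So
$$E_{n,l,m_l,m_s} = E^0_{n,l,m_l,m_s} + E^1_{n,l,m_l,m_s} = -\frac{13.6\ \mathrm{eV}}{n^2} + \mu_B B\,(m_l + 2 m_s) + \frac{13.6\ \mathrm{eV}}{n^3}\,\alpha^2\left[\frac{3}{4n} - \frac{l(l+1) - m_s m_l}{l\left(l + \frac{1}{2}\right)(l+1)}\right] \tag{2.428}$$
Bottom line: at very strong field (second term much larger than the last one), the energy splittings between the levels are proportional to $B$ and the slope is proportional to $\mu_B(m_l + 2 m_s)$. The eigenstates are (almost) $|n, l, m_l, m_s\rangle$, where $m_s$ and $m_l$ are good quantum numbers. (We should label the states by the orbital and spin angular momenta, NOT by the total angular momentum $j$.)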
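As a consistency check on the algebra in (2.427), the sketch below evaluates the relativistic term (2.417) and the spin-orbit term (2.426) separately and compares their sum with the closed form. It is restricted to $l \geq 1$, where the expressions are well defined; the function names are illustrative and the constants are the values quoted in the text.

```python
ALPHA = 1 / 137.036    # fine structure constant, value quoted in the text
RYDBERG_EV = 13.6      # |E_1^0| in eV

def relativistic_term(n, l):
    """<H_r'> from (2.417), in eV."""
    return -RYDBERG_EV * ALPHA**2 / n**4 * (n / (l + 0.5) - 0.75)

def spin_orbit_term(n, l, ml, ms):
    """<H_SO'> in the |n, l, m_l, m_s> basis from (2.426), in eV (requires l >= 1)."""
    return RYDBERG_EV * ALPHA**2 * ms * ml / (l * (l + 0.5) * (l + 1) * n**3)

def combined_term(n, l, ml, ms):
    """Closed form (2.427), in eV (requires l >= 1)."""
    bracket = 3 / (4 * n) - (l * (l + 1) - ms * ml) / (l * (l + 0.5) * (l + 1))
    return RYDBERG_EV / n**3 * ALPHA**2 * bracket

for ml in (-1, 0, 1):
    for ms in (-0.5, 0.5):
        total = relativistic_term(2, 1) + spin_orbit_term(2, 1, ml, ms)
        print(f"n=2, l=1, m_l={ml:+d}, m_s={ms:+.1f}: "
              f"sum = {total:.4e} eV, closed form = {combined_term(2, 1, ml, ms):.4e} eV")
```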
2.6.3. Weak field
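When $H_Z' \ll H_{SO}'$, we can treat $H_Z'$ as a small perturbation. The zeroth order Hamiltonian (i.e., ignoring $H_Z'$) is what we studied in the previous section. There, we know that the eigenstates are $|n, j, l, s, m_j\rangle$ and the eigenenergy is
$$E_{n,j} = E_n^0\left[1 + \frac{\alpha^2}{4n^2}\left(\frac{4n}{j + 1/2} - 3\right) + \ldots\right] = E_n^0\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \ldots\right] = -\frac{13.6\ \mathrm{eV}}{n^2}\left[1 + \frac{\alpha^2}{n^2}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right) + \ldots\right] \tag{2.429}$$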
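In first order perturbation theory, the energy correction is
$$E^1_{n,j,l,s,m_j} = \langle n, j, l, s, m_j|H_Z'|n, j, l, s, m_j\rangle = \left\langle n, j, l, s, m_j\left|\mu_B\,\frac{\vec L + 2\vec S}{\hbar}\cdot\vec B\right|n, j, l, s, m_j\right\rangle = \frac{\mu_B}{\hbar}\,\langle n, j, l, s, m_j|\vec B\cdot\vec L + 2\,\vec B\cdot\vec S|n, j, l, s, m_j\rangle \tag{2.430}$$
The key is to compute the expectation values of $\vec L$ and $\vec S$ for eigenstates of $J^2$ and $J_z$. Here, we use the fact that
$$\langle\vec L\rangle \parallel \langle\vec S\rangle \parallel \langle\vec J\rangle \tag{2.431}$$
So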
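$$\langle\vec L\rangle = \left\langle\frac{\left(\vec L\cdot\vec J\right)\vec J}{J^2}\right\rangle \tag{2.432}$$
Here,
$$\langle\vec B\cdot\vec L\rangle = \left\langle\frac{\left(\vec L\cdot\vec J\right)\left(\vec B\cdot\vec J\right)}{J^2}\right\rangle = \left\langle n, j, l, s, m_j\left|\frac{\left(\vec L\cdot\vec J\right)\left(\vec B\cdot\vec J\right)}{J^2}\right|n, j, l, s, m_j\right\rangle \tag{2.433}$$
For $\vec B \parallel \hat z$, we have $\vec B\cdot\vec J = B J_z$, so
$$\langle\vec B\cdot\vec L\rangle = \left\langle n, j, l, s, m_j\left|\frac{\left(\vec L\cdot\vec J\right) B J_z}{J^2}\right|n, j, l, s, m_j\right\rangle \tag{2.434}$$
Here, we use the fact that
$$J_z|n, j, l, s, m_j\rangle = m_j\,\hbar\,|n, j, l, s, m_j\rangle \tag{2.435}$$
$$J^2|n, j, l, s, m_j\rangle = j(j+1)\,\hbar^2\,|n, j, l, s, m_j\rangle \tag{2.436}$$
so
$$\langle\vec B\cdot\vec L\rangle = \left\langle n, j, l, s, m_j\left|\frac{\left(\vec L\cdot\vec J\right) B J_z}{J^2}\right|n, j, l, s, m_j\right\rangle = \left\langle n, j, l, s, m_j\left|\frac{\left(\vec L\cdot\vec J\right) B\,m_j\hbar}{j(j+1)\,\hbar^2}\right|n, j, l, s, m_j\right\rangle = \frac{B\,m_j}{j(j+1)\,\hbar}\,\langle n, j, l, s, m_j|\vec L\cdot\vec J|n, j, l, s, m_j\rangle \tag{2.437}$$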
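For $\vec L\cdot\vec J$, we use the fact that
$$\vec S = \vec J - \vec L \tag{2.438}$$
$$\vec S\cdot\vec S = \left(\vec J - \vec L\right)\cdot\left(\vec J - \vec L\right) = \vec J\cdot\vec J + \vec L\cdot\vec L - 2\,\vec L\cdot\vec J \tag{2.439}$$
Thus,
$$\vec L\cdot\vec J = \frac{\vec J\cdot\vec J + \vec L\cdot\vec L - \vec S\cdot\vec S}{2} \tag{2.440}$$
So,
$$\langle n, j, l, s, m_j|\vec L\cdot\vec J|n, j, l, s, m_j\rangle = \left\langle n, j, l, s, m_j\left|\frac{\vec J\cdot\vec J + \vec L\cdot\vec L - \vec S\cdot\vec S}{2}\right|n, j, l, s, m_j\right\rangle = \frac{\hbar^2 j(j+1) + \hbar^2 l(l+1) - \hbar^2 s(s+1)}{2} = \frac{\hbar^2}{2}\left[j(j+1) + l(l+1) - s(s+1)\right] \tag{2.441}$$
We know that $s = 1/2$ for electrons, so
$$\langle n, j, l, s, m_j|\vec L\cdot\vec J|n, j, l, s, m_j\rangle = \frac{\hbar^2}{2}\left[j(j+1) + l(l+1) - 3/4\right] \tag{2.442}$$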
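Therefore,
$$\langle\vec B\cdot\vec L\rangle = \frac{B\,m_j}{j(j+1)\,\hbar}\,\langle n, j, l, s, m_j|\vec L\cdot\vec J|n, j, l, s, m_j\rangle = \frac{B\,m_j}{j(j+1)\,\hbar}\,\frac{\hbar^2}{2}\left[j(j+1) + l(l+1) - 3/4\right] = \frac{B\,m_j\,\hbar}{2j(j+1)}\left[j(j+1) + l(l+1) - 3/4\right] \tag{2.443}$$
Similarly, we can prove that
$$\langle\vec B\cdot\vec S\rangle = \frac{B\,m_j}{j(j+1)\,\hbar}\,\langle n, j, l, s, m_j|\vec S\cdot\vec J|n, j, l, s, m_j\rangle = \frac{B\,m_j}{j(j+1)\,\hbar}\,\frac{\hbar^2}{2}\left[j(j+1) - l(l+1) + 3/4\right] = \frac{B\,m_j\,\hbar}{2j(j+1)}\left[j(j+1) - l(l+1) + 3/4\right] \tag{2.444}$$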
And thus
$$E^1_{n,j,l,s,m_j} = \langle n, j, l, s, m_j|H_Z'|n, j, l, s, m_j\rangle = \frac{\mu_B}{\hbar}\,\langle n, j, l, s, m_j|\vec B\cdot\vec L + 2\,\vec B\cdot\vec S|n, j, l, s, m_j\rangle = \frac{\mu_B}{\hbar}\left\{\frac{B\,m_j\,\hbar}{2j(j+1)}\left[j(j+1) + l(l+1) - 3/4\right] + 2\,\frac{B\,m_j\,\hbar}{2j(j+1)}\left[j(j+1) - l(l+1) + 3/4\right]\right\} = \frac{\mu_B}{\hbar}\,\frac{B\,m_j\,\hbar}{2j(j+1)}\left[3j(j+1) - l(l+1) + 3/4\right] = \mu_B B\,m_j\,\frac{1}{2j(j+1)}\left[3j(j+1) - l(l+1) + 3/4\right] \tag{2.445}$$
We can define an atomic g-factor,
$$g_j = \frac{1}{2j(j+1)}\left[3j(j+1) - l(l+1) + 3/4\right] = \frac{3}{2} - \frac{l(l+1) - 3/4}{2j(j+1)} = \frac{3}{2} - \frac{\left(l + \frac{3}{2}\right)\left(l - \frac{1}{2}\right)}{2j(j+1)} \tag{2.446}$$
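We know that $j = l + 1/2$ or $j = l - 1/2$. If $j = l + 1/2$, we find that
$$
\begin{aligned}
g_j &= \frac{3}{2} - \frac{\left(l + \frac{3}{2}\right)\left(l - \frac{1}{2}\right)}{2 j(j+1)}
= \frac{3}{2} - \frac{\left(l + \frac{3}{2}\right)\left(l - \frac{1}{2}\right)}{2 \left(l + \frac{1}{2}\right)\left(l + \frac{3}{2}\right)}
= \frac{3}{2} - \frac{l - \frac{1}{2}}{2 \left(l + \frac{1}{2}\right)} \\
&= \frac{3}{2} - \frac{l + \frac{1}{2} - 1}{2 \left(l + \frac{1}{2}\right)}
= \frac{3}{2} - \frac{1}{2} + \frac{1}{2 \left(l + \frac{1}{2}\right)}
= 1 + \frac{1}{2l + 1}
\end{aligned} \tag{2.447}
$$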
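If $j = l - 1/2$,
$$
\begin{aligned}
g_j &= \frac{3}{2} - \frac{\left(l + \frac{3}{2}\right)\left(l - \frac{1}{2}\right)}{2 j(j+1)}
= \frac{3}{2} - \frac{\left(l + \frac{3}{2}\right)\left(l - \frac{1}{2}\right)}{2 \left(l - \frac{1}{2}\right)\left(l + \frac{1}{2}\right)}
= \frac{3}{2} - \frac{l + \frac{3}{2}}{2 \left(l + \frac{1}{2}\right)} \\
&= \frac{3}{2} - \frac{l + \frac{1}{2} + 1}{2 \left(l + \frac{1}{2}\right)}
= \frac{3}{2} - \frac{1}{2} - \frac{1}{2 \left(l + \frac{1}{2}\right)}
= 1 - \frac{1}{2l + 1}
\end{aligned} \tag{2.448}
$$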
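The two closed forms can be checked against the general expression (2.446) with exact rational arithmetic. The following is a minimal sketch; the function names `lande_g` and `lande_g_closed_form` are illustrative choices, not names from the notes, and $s = 1/2$ is assumed throughout.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def lande_g(l, j):
    """General expression (2.446) for s = 1/2."""
    l, j = Fraction(l), Fraction(j)
    return (3 * j * (j + 1) - l * (l + 1) + Fraction(3, 4)) / (2 * j * (j + 1))


def lande_g_closed_form(l, j):
    """Closed forms (2.447) and (2.448)."""
    l, j = Fraction(l), Fraction(j)
    if j == l + HALF:
        return 1 + 1 / (2 * l + 1)
    if j == l - HALF:
        return 1 - 1 / (2 * l + 1)
    raise ValueError("for s = 1/2, j must be l + 1/2 or l - 1/2")


# Compare both expressions for the first few values of l.
for l in range(4):
    for j in (l + HALF, l - HALF):
        if j > 0:  # j = -1/2 does not exist for l = 0
            general = lande_g(l, j)
            assert general == lande_g_closed_form(l, j)
            print(f"l={l}, j={j}: g_j = {general}")
```

Using `Fraction` avoids rounding, so the equality test between the two forms is exact rather than approximate.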
With this $g_j$ factor,
$$E^1_{n,j,l,s,m_j} = g_j\, \mu_B B\, m_j \tag{2.449}$$
The first-order energy correction is proportional to $m_j$ (the component of the total angular momentum along the field).
Total energy:
$$E_{n,j} = -\frac{13.6\ \text{eV}}{n^2}\left[ 1 + \frac{\alpha^2}{n^2}\left( \frac{n}{j + 1/2} - \frac{3}{4} \right) \right] + g_j\, \mu_B B\, m_j \tag{2.450}$$
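To get a feeling for the numbers, the total energy (2.450) can be evaluated for the $n = 2$ levels in a weak field. This is only a sketch: the values of the fine-structure constant and the Bohr magneton, the field strength of 0.01 T, and the function names are assumptions made here for illustration.

```python
RYDBERG_EV = 13.6            # ground-state binding energy in eV, as used in (2.450)
ALPHA = 1 / 137.036          # fine-structure constant (assumed value)
MU_B_EV_PER_T = 5.788e-5     # Bohr magneton in eV per tesla (assumed value)


def lande_g(l, j):
    """g_j for s = 1/2, using the closed forms (2.447) and (2.448)."""
    if j == l + 0.5:
        return 1 + 1 / (2 * l + 1)
    if j == l - 0.5:
        return 1 - 1 / (2 * l + 1)
    raise ValueError("for s = 1/2, j must be l + 1/2 or l - 1/2")


def weak_field_energy(n, l, j, m_j, b_tesla):
    """Energy in eV from (2.450): fine structure plus first-order Zeeman shift."""
    fine_structure = -RYDBERG_EV / n**2 * (
        1 + ALPHA**2 / n**2 * (n / (j + 0.5) - 0.75)
    )
    zeeman = lande_g(l, j) * MU_B_EV_PER_T * b_tesla * m_j
    return fine_structure + zeeman


B = 0.01  # tesla; chosen so that mu_B * B is much smaller than the fine-structure splitting
for l, j in [(0, 0.5), (1, 0.5), (1, 1.5)]:
    m_j = -j
    while m_j <= j:
        energy = weak_field_energy(2, l, j, m_j, B)
        print(f"n=2, l={l}, j={j}, m_j={m_j:+.1f}: E = {energy:.9f} eV")
        m_j += 1
```

The loop lists all eight $n = 2$ states; within each $(l, j)$ level the spacing between adjacent $m_j$ values is $g_j \mu_B B$.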
2.6.4. Intermediate-field
$$H = H_0 + H_r' + H_{SO}' + H_z' \tag{2.451}$$
When $H_{SO}' \sim H_z'$, neither term dominates, so we treat the last three terms together as the perturbation and use degenerate perturbation theory.
For $H_0$, we consider the $n = 2$ states (the first excited states). In QM I, we learned that there are 4 degenerate states: one s-wave state with $l = 0$ and three p-wave states ($l = 1$ and $m_l = -1, 0, 1$). Including spin, there are 4×2 = 8 states. In first-order degenerate perturbation theory, we can ignore all states except the $n = 2$ states and focus on these 8 states only. The perturbation Hamiltonian is then an 8×8 matrix, which is shown in the textbook (page 248). The eigenvalues of this matrix give the first-order corrections to the energy.
2.7. Summary
Objective: compute eigenvalues for the Hamiltonian
Hψn = En ψn (2.452)
Key assumption:
$$H = H_0 + \lambda H' \tag{2.453}$$
where the second term $\lambda H'$ is much smaller than the first.
2.7.1. Nondegenerate perturbation theory
Step 1: solve for the eigenstates of $H_0$
$$H_0\,|\psi_n^0\rangle = E_n^0\,|\psi_n^0\rangle \tag{2.454}$$
If the state that we consider has no degeneracy, we use nondegenerate perturbation theory
Step 2: first-order correction
$$E_n^1 = \langle \psi_n^0 | H' | \psi_n^0 \rangle \tag{2.455}$$
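Step 3: second-order correction
$$
E_n^2 = \sum_{m \neq n} \langle \psi_n^0 | H' | \psi_m^0 \rangle\, \frac{1}{E_n^0 - E_m^0}\, \langle \psi_m^0 | H' | \psi_n^0 \rangle
= \sum_{m \neq n} \frac{\left| \langle \psi_m^0 | H' | \psi_n^0 \rangle \right|^2}{E_n^0 - E_m^0} \tag{2.456}
$$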
Step 4: eigenenergy
$$E_n = E_n^0 + \lambda E_n^1 + \lambda^2 E_n^2 + \dots \tag{2.457}$$
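Wave functions:
$$|\psi_n\rangle = |\psi_n^0\rangle + \lambda \sum_{m \neq n} |\psi_m^0\rangle\, \frac{1}{E_n^0 - E_m^0}\, \langle \psi_m^0 | H' | \psi_n^0 \rangle + \dots \tag{2.458}$$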
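The four steps can be tried on a two-level system, where the exact eigenvalue is available in closed form for comparison. This is a minimal sketch with made-up numbers: the energies in `E0`, the matrix `H1` (standing for the matrix elements of $H'$) and the value of `lam` are hypothetical and chosen only for illustration, and $H'$ is assumed real.

```python
import math

# Hypothetical two-level system, chosen only for illustration.
E0 = [0.0, 1.0]            # unperturbed energies E_1^0 and E_2^0 (non-degenerate)
H1 = [[0.2, 0.5],
      [0.5, -0.1]]         # real symmetric matrix of <psi_m^0|H'|psi_n^0>
lam = 0.1                  # the small parameter lambda

n, m = 0, 1                # correct state n; m is the only other state

first = H1[n][n]                               # first-order correction, (2.455)
second = H1[m][n] ** 2 / (E0[n] - E0[m])       # second-order correction, (2.456)
coeff_m = lam * H1[m][n] / (E0[n] - E0[m])     # admixture of psi_m^0 in (2.458)

e_first_order = E0[n] + lam * first
e_second_order = e_first_order + lam**2 * second   # (2.457) truncated at lambda^2

# Exact lower eigenvalue of the 2x2 matrix H0 + lam * H1, for comparison.
h11 = E0[0] + lam * H1[0][0]
h22 = E0[1] + lam * H1[1][1]
h12 = lam * H1[0][1]
e_exact = 0.5 * (h11 + h22) - math.hypot(0.5 * (h11 - h22), h12)

print(f"through first order : {e_first_order:.6f}")
print(f"through second order: {e_second_order:.6f}")
print(f"exact               : {e_exact:.6f}")
print(f"first-order admixture of the other state: {coeff_m:.3f}")
```

Including the second-order term brings the estimate noticeably closer to the exact value; the remaining error is of order $\lambda^3$.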
2.7.2. Degenerate perturbation theory
Step 1: solve for the eigenstates of $H_0$
$$H_0\,|\psi_n^0\rangle = E_n^0\,|\psi_n^0\rangle \tag{2.459}$$
If the state that we consider is degenerate (i.e. at least one other state has the same eigenenergy), we use degenerate perturbation theory. For a two-fold degeneracy,
$$H_0\,|\psi_a^0\rangle = E^0\,|\psi_a^0\rangle \tag{2.460}$$
and
$$H_0\,|\psi_b^0\rangle = E^0\,|\psi_b^0\rangle \tag{2.461}$$
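Step 2: create an n×n matrix (if there is an n-fold degeneracy). For a two-fold degeneracy,
$$
W = \begin{pmatrix}
\langle \psi_a^0 | H' | \psi_a^0 \rangle & \langle \psi_a^0 | H' | \psi_b^0 \rangle \\
\langle \psi_b^0 | H' | \psi_a^0 \rangle & \langle \psi_b^0 | H' | \psi_b^0 \rangle
\end{pmatrix} \tag{2.462}
$$
Step 3: the eigenvalues of this matrix, $E_+$ and $E_-$, are the first-order corrections
$$E_1 = E^0 + \lambda E_+ + O(\lambda^2) \tag{2.463}$$
$$E_2 = E^0 + \lambda E_- + O(\lambda^2) \tag{2.464}$$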
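Wavefunctions: eigenvectors
$$
\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix}
= E_+ \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \tag{2.465}
$$
and
$$
\begin{pmatrix} W_{aa} & W_{ab} \\ W_{ba} & W_{bb} \end{pmatrix}
\begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix}
= E_- \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} \tag{2.466}
$$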
where E+ and E- are the two eigenvalues.
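$$|\psi_1^0\rangle = \alpha_1 |\psi_a^0\rangle + \beta_1 |\psi_b^0\rangle \tag{2.467}$$
$$|\psi_2^0\rangle = \alpha_2 |\psi_a^0\rangle + \beta_2 |\psi_b^0\rangle \tag{2.468}$$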
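For a two-fold degeneracy the whole procedure reduces to diagonalizing a 2×2 matrix, which can be done by hand. The sketch below assumes a real symmetric $W$ (so $W_{ba} = W_{ab}$); the function name `diagonalize_2x2` and the numerical values are hypothetical and only illustrate the steps.

```python
import math


def diagonalize_2x2(w_aa, w_ab, w_bb):
    """Eigenpairs of the real symmetric matrix [[w_aa, w_ab], [w_ab, w_bb]].

    Returns [(E_plus, (alpha1, beta1)), (E_minus, (alpha2, beta2))] with
    normalized eigenvectors, as in (2.465) and (2.466).
    """
    mean = 0.5 * (w_aa + w_bb)
    half_gap = math.hypot(0.5 * (w_aa - w_bb), w_ab)
    results = []
    for k, energy in enumerate((mean + half_gap, mean - half_gap)):
        if w_ab != 0:
            # The first row of (W - E) v = 0 is solved by v = (w_ab, E - w_aa).
            alpha, beta = w_ab, energy - w_aa
        else:
            # W is already diagonal: psi_a and psi_b are the eigenvectors.
            a_is_upper = w_aa >= w_bb
            alpha, beta = (1.0, 0.0) if (k == 0) == a_is_upper else (0.0, 1.0)
        norm = math.hypot(alpha, beta)
        results.append((energy, (alpha / norm, beta / norm)))
    return results


# Hypothetical example: degenerate level E0 coupled only off-diagonally.
E0, lam = 3.0, 0.01
(e_plus, (alpha1, beta1)), (e_minus, (alpha2, beta2)) = diagonalize_2x2(0.0, 1.0, 0.0)

print("E_1 =", E0 + lam * e_plus, " with (alpha1, beta1) =", (alpha1, beta1))
print("E_2 =", E0 + lam * e_minus, " with (alpha2, beta2) =", (alpha2, beta2))
```

In this example the perturbation lifts the degeneracy symmetrically, and the two zeroth-order states are the symmetric and antisymmetric combinations of $\psi_a^0$ and $\psi_b^0$. For a larger degenerate subspace a numerical eigenvalue routine would be used instead of the closed form.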
2.7.3. Perturbation theory in matrix form (example: homework 2.3)
This case usually uses the eigenstates of $H_0$ as the basis,
$$H_0\,|\psi_n^0\rangle = E_n^0\,|\psi_n^0\rangle \tag{2.469}$$
With a complete basis, we can write an operator as a matrix
$$(H_0)_{mn} = \langle \psi_m^0 | H_0 | \psi_n^0 \rangle \tag{2.470}$$
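If we use the eigenstates of $H_0$ as the basis, the matrix is diagonal and the diagonal components are the eigenvalues of $H_0$
$$
H_0 \to \begin{pmatrix}
E_1^0 & 0 & 0 & \cdots \\
0 & E_2^0 & 0 & \cdots \\
0 & 0 & E_3^0 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{2.471}
$$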
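Using the same basis, we can write $H'$ as a matrix
$$
\lambda H' \to \lambda \begin{pmatrix}
\langle \psi_1^0 | H' | \psi_1^0 \rangle & \langle \psi_1^0 | H' | \psi_2^0 \rangle & \langle \psi_1^0 | H' | \psi_3^0 \rangle & \cdots \\
\langle \psi_2^0 | H' | \psi_1^0 \rangle & \langle \psi_2^0 | H' | \psi_2^0 \rangle & \langle \psi_2^0 | H' | \psi_3^0 \rangle & \cdots \\
\langle \psi_3^0 | H' | \psi_1^0 \rangle & \langle \psi_3^0 | H' | \psi_2^0 \rangle & \langle \psi_3^0 | H' | \psi_3^0 \rangle & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{2.472}
$$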
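In many cases, we only need to keep a small number of states (e.g. only the three states with the lowest energy)
$$
H_0 = \begin{pmatrix}
E_1^0 & 0 & 0 \\
0 & E_2^0 & 0 \\
0 & 0 & E_3^0
\end{pmatrix} \tag{2.473}
$$
and
$$
\lambda H' = \lambda \begin{pmatrix}
\langle \psi_1^0 | H' | \psi_1^0 \rangle & \langle \psi_1^0 | H' | \psi_2^0 \rangle & \langle \psi_1^0 | H' | \psi_3^0 \rangle \\
\langle \psi_2^0 | H' | \psi_1^0 \rangle & \langle \psi_2^0 | H' | \psi_2^0 \rangle & \langle \psi_2^0 | H' | \psi_3^0 \rangle \\
\langle \psi_3^0 | H' | \psi_1^0 \rangle & \langle \psi_3^0 | H' | \psi_2^0 \rangle & \langle \psi_3^0 | H' | \psi_3^0 \rangle
\end{pmatrix} \tag{2.474}
$$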
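Objective: compute the eigenstates of the $H$ matrix
$$
H = \begin{pmatrix}
E_1^0 & 0 & 0 \\
0 & E_2^0 & 0 \\
0 & 0 & E_3^0
\end{pmatrix}
+ \lambda \begin{pmatrix}
\langle \psi_1^0 | H' | \psi_1^0 \rangle & \langle \psi_1^0 | H' | \psi_2^0 \rangle & \langle \psi_1^0 | H' | \psi_3^0 \rangle \\
\langle \psi_2^0 | H' | \psi_1^0 \rangle & \langle \psi_2^0 | H' | \psi_2^0 \rangle & \langle \psi_2^0 | H' | \psi_3^0 \rangle \\
\langle \psi_3^0 | H' | \psi_1^0 \rangle & \langle \psi_3^0 | H' | \psi_2^0 \rangle & \langle \psi_3^0 | H' | \psi_3^0 \rangle
\end{pmatrix} \tag{2.475}
$$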
Assumption: the second matrix is much smaller than the first
Nondegenerate perturbation:
Among the three unperturbed eigenvalues, $E_1^0$, $E_2^0$ and $E_3^0$, if one of them is different from the other two (e.g. $E_1^0$ is different), then we can use nondegenerate perturbation theory for that state.
First-order perturbation:
$$E_1^1 = \langle \psi_1^0 | H' | \psi_1^0 \rangle \tag{2.476}$$
Notice that this is just one diagonal element of the second matrix in $H$.
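Second-order perturbation:
$$
E_1^2 = \sum_{m \neq 1} \langle \psi_1^0 | H' | \psi_m^0 \rangle\, \frac{1}{E_1^0 - E_m^0}\, \langle \psi_m^0 | H' | \psi_1^0 \rangle
= \frac{\langle \psi_1^0 | H' | \psi_2^0 \rangle \langle \psi_2^0 | H' | \psi_1^0 \rangle}{E_1^0 - E_2^0}
+ \frac{\langle \psi_1^0 | H' | \psi_3^0 \rangle \langle \psi_3^0 | H' | \psi_1^0 \rangle}{E_1^0 - E_3^0} \tag{2.477}
$$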
Notice that the denominators are differences of diagonal elements of the first matrix in $H$, and the numerators are products of off-diagonal elements of the second matrix.
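Because both corrections are read directly off the two matrices, they are easy to compute for a truncated basis of any size. The following sketch uses a hypothetical three-state example; the function name `nondegenerate_corrections` and all numbers are assumptions for illustration, with `h1` holding the matrix elements of $H'$ in the basis of (2.469).

```python
def nondegenerate_corrections(e0, h1, n):
    """First- and second-order corrections for state n (0-based index).

    e0 holds the diagonal of the first matrix in H; h1 is the second matrix
    without the factor lambda. State n must not be degenerate with any other.
    """
    others = [m for m in range(len(e0)) if m != n]
    if any(e0[m] == e0[n] for m in others):
        raise ValueError("state is degenerate: use degenerate perturbation theory")
    first = h1[n][n]                                                      # as in (2.476)
    second = sum(h1[n][m] * h1[m][n] / (e0[n] - e0[m]) for m in others)   # as in (2.477)
    return first, second


# Hypothetical three-state example.
E0 = [1.0, 2.0, 4.0]
H1 = [[0.5, 1.0, 2.0],
      [1.0, 0.0, 1.0],
      [2.0, 1.0, -0.5]]
lam = 0.05

first, second = nondegenerate_corrections(E0, H1, 0)
energy = E0[0] + lam * first + lam**2 * second
print(f"E_1^1 = {first}, E_1^2 = {second:.6f}, E_1 ~ {energy:.6f}")
```

The explicit check for equal unperturbed energies marks the point where the denominators in (2.477) would vanish and the degenerate treatment is needed instead.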
Degenerate perturbation:
Among the three unperturbed eigenvalues, $E_1^0$, $E_2^0$ and $E_3^0$, if two (or more) of them are identical (e.g. $E_2^0$ and $E_3^0$ have the same value), then we can use degenerate perturbation theory.
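Step one: create the $W$ matrix
$$
W = \begin{pmatrix}
\langle \psi_2^0 | H' | \psi_2^0 \rangle & \langle \psi_2^0 | H' | \psi_3^0 \rangle \\
\langle \psi_3^0 | H' | \psi_2^0 \rangle & \langle \psi_3^0 | H' | \psi_3^0 \rangle
\end{pmatrix} \tag{2.478}
$$
Step two: compute the eigenvalues of $W$, which are the first-order corrections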
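For a 2×2 matrix the eigenvalues follow from the characteristic equation in closed form. As a worked expression, using the shorthand $W_{mn} = \langle \psi_m^0 | H' | \psi_n^0 \rangle$ (a label introduced here only for brevity) and the fact that $W$ is Hermitian:

```latex
E_\pm = \frac{W_{22} + W_{33}}{2}
        \pm \sqrt{\left( \frac{W_{22} - W_{33}}{2} \right)^2 + |W_{23}|^2},
\qquad
E = E_2^0 + \lambda E_\pm + O(\lambda^2)
```

Here $E_2^0 = E_3^0$ is the common unperturbed energy, so the two degenerate levels split by $2\lambda\sqrt{\left((W_{22} - W_{33})/2\right)^2 + |W_{23}|^2}$ at first order.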