
Lie Algebras in Particle Physics 2020/2021 (NWI-NM101B)¹

Timothy Budd

Version: May 5, 2021

Contents

1 Introduction and basic definitions

2 Finite groups

3 Introduction to Lie groups and Lie algebras

4 Irreducible representations of su(2)

5 The adjoint representation

6 Root systems, simple roots and the Cartan matrix

7 Irreducible representations of su(3) and the quark model

8 Classification of compact, simple Lie algebras

¹ These lecture notes are an extended version of notes by F. Saueressig and are loosely based on the book Lie Algebras in Particle Physics by H. Georgi.

1 Introduction and basic definitions

Group theory is the mathematical theory underlying the notion of symmetry. Understanding the symmetries of a system is of great importance in physics. In classical mechanics, Noether’s theorem relates symmetries to conserved quantities of the system. Invariance under time translations implies the conservation of energy. Invariance with respect to spatial translations leads to the conservation of momentum, and rotational invariance to the conservation of angular momentum. These conserved quantities play a crucial role in solving the equations of motion of the system, since they can be used to simplify the dynamics. In quantum mechanics, symmetries allow one to classify eigenfunctions of, say, the Hamiltonian. The rotational symmetry of the hydrogen atom implies that its eigenstates can be grouped by their total angular momentum and angular momentum in the z-direction, which completely fixes the angular part of the corresponding wave functions.

In general a symmetry is a transformation of an object that preserves some properties of the object. The two-dimensional rotational symmetries SO(2) correspond to the linear transformations of the Euclidean plane that preserve lengths and orientation. A permutation of n particles preserves local particle number. Since a symmetry is a transformation, the composition of two symmetries of an object is always another symmetry. The concept of a group is introduced exactly to capture the way in which such symmetries compose.

Definition 1.1 (Group). A group (G, ·) is a set G together with a binary operation G × G → G that associates to every ordered pair f, g of elements a third element f · g. It is required to satisfy:

(i) If f, g ∈ G then h = f · g is also an element of G, i.e. the group is closed under multiplication. (“closure”)

(ii) The group multiplication is associative. For f, g, h ∈ G we have that f · (g · h) = (f · g) · h. (“associativity”)

(iii) There is an identity element e such that for all f ∈ G, e · f = f · e = f. (“identity”)

(iv) Every element f ∈ G has an inverse, denoted by f⁻¹, such that f · f⁻¹ = f⁻¹ · f = e. (“inverse”)

Let us start by looking at a few abstract examples (check that these indeed satisfy the group axioms!):

Example 1.1. G = {1,−1} with the product given by multiplication.

Example 1.2. The integers under addition (Z,+).

Example 1.3. The permutation group (S_X, ◦) of a set X: S_X is the set of bijections {ψ : X → X} and ◦ is given by composition, i.e.

if ψ : X → X and ψ′ : X → X then ψ ◦ ψ′ : X → X : x ↦ ψ(ψ′(x)).

In particular, if n is a positive integer we write $S_n := S_{\{1,2,\ldots,n\}}$ for the permutation group on {1, 2, . . . , n}.

Example 1.4. The general linear group G = GL(N, R) consisting of all real N × N matrices with non-zero determinant, with product given by matrix multiplication.
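For finite examples, the invitation to check the group axioms can be carried out mechanically. The following is a minimal sketch, not part of the original notes: the helper names `is_group`, `multiply` and `compose` are hypothetical, and permutations of X = {0, 1, 2} are stored as tuples of images (an assumption made for convenience).

```python
from itertools import permutations, product

def is_group(elements, op):
    """Brute-force check of axioms (i)-(iv) of Definition 1.1 on a finite set."""
    elements = list(elements)
    # (i) closure: every product must land back in the set
    if any(op(f, g) not in elements for f, g in product(elements, repeat=2)):
        return False
    # (ii) associativity: f.(g.h) == (f.g).h for all triples
    if any(op(f, op(g, h)) != op(op(f, g), h)
           for f, g, h in product(elements, repeat=3)):
        return False
    # (iii) identity: some e with e.f == f.e == f for all f
    identities = [e for e in elements
                  if all(op(e, f) == f and op(f, e) == f for f in elements)]
    if not identities:
        return False
    e = identities[0]
    # (iv) inverse: every f has some g with f.g == g.f == e
    return all(any(op(f, g) == e and op(g, f) == e for g in elements)
               for f in elements)

def multiply(f, g):
    return f * g

def compose(p, q):
    """(p o q)(x) = p(q(x)) for permutations stored as tuples of images."""
    return tuple(p[q[x]] for x in range(len(q)))

sign_group = [1, -1]                      # Example 1.1
s3 = list(permutations(range(3)))         # Example 1.3 with X = {0, 1, 2}

print(is_group(sign_group, multiply))     # True
print(is_group(s3, compose))              # True
print(is_group([1, 2], multiply))         # False: 2 * 2 = 4 is not in the set
```

The infinite groups of Examples 1.2 and 1.4 cannot be checked by enumeration in this way; for them the axioms have to be verified by hand.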

Based on these examples we should make a few remarks:

• The product of a group is not necessarily implemented as a multiplication, as should be clear from Examples 1.2 and 1.3. One is free to choose a notation different from f = g · h to reflect the product structure, e.g. f = g ◦ h, f = g + h, f = gh. Often we will write G instead of (G, ·) to indicate a group if the product is clear from the context.

• Notice that there is no requirement that g · h = h · g in the definition of a group, and indeed Examples 1.3 (for sets X with at least three elements) and 1.4 (for N ≥ 2) contain elements g and h such that g · h ≠ h · g.

Definition 1.2 (Abelian group). A group G is abelian when

g · h = h · g for all g, h ∈ G.

Otherwise the group is called non-abelian.

• The groups in the examples are of different size:

Definition 1.3 (Order of a group). The order of a group G is the number of elements in its set. A group of finite order is called a finite group, and a group of infinite order is an infinite group.

For example, Sn is a finite group of order n!, while GL(N,R) is an infinite group.

Although one may use different operations to define a group (multiplication, addition, composition, . . . ) this does not mean that the resulting abstract groups are necessarily distinct. In fact, we will identify any two groups that are related through a relabeling of their elements, i.e. any two groups that are isomorphic.

Definition 1.4 (Isomorphic groups). Two groups G and G′ are called isomorphic, G ≅ G′, if there exists a bijective map φ : G → G′ that preserves the group structure, i.e.

φ(g · h) = φ(g) · φ(h) for all g, h ∈ G. (1.1)

Example 1.5 (Cyclic group). For a positive integer n the three groups defined below are all isomorphic (see exercises). Any one of them may be taken as the definition of the cyclic group of order n.

• the set Z_n := {0, 1, 2, . . . , n − 1} equipped with addition modulo n;

• the set C_n ⊂ S_n of cyclic permutations on {1, 2, . . . , n} equipped with composition;

• the nth roots of unity U_n := {z ∈ C : z^n = 1} equipped with multiplication.
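As a numerical illustration of one of these isomorphisms, the sketch below checks that the map k ↦ e^{2πik/n} from the first group to the third satisfies (1.1) and is injective. The function names `phi`, `is_homomorphism` and `is_injective` are hypothetical, and the comparison uses floating-point tolerances, so this is a sanity check rather than a proof.

```python
import cmath

def phi(k, n):
    """Candidate isomorphism Z_n -> U_n, k |-> exp(2*pi*i*k/n)."""
    return cmath.exp(2j * cmath.pi * k / n)

def is_homomorphism(n, tol=1e-12):
    """Check phi((k + l) mod n) == phi(k) * phi(l) for all k, l in Z_n."""
    return all(abs(phi((k + l) % n, n) - phi(k, n) * phi(l, n)) < tol
               for k in range(n) for l in range(n))

def is_injective(n, tol=1e-9):
    """Distinct elements of Z_n must map to distinct roots of unity."""
    return all(abs(phi(k, n) - phi(l, n)) > tol
               for k in range(n) for l in range(k))

for n in range(1, 8):
    print(n, is_homomorphism(n), is_injective(n))
```

Since Z_n and U_n both have n elements, injectivity already implies that the map is a bijection.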

Abstract group theory is a vast topic of which we will only cover a few aspects that are most relevant for physics. Central in this direction are representations.

Definition 1.5 (Representation). A representation D of a group G on a vector space V is a mapping D : G → GL(V) of the elements of G to invertible linear transformations of V, i.e. elements of the general linear group GL(V), such that the product in G agrees with composition in GL(V),

D(g)D(h) = D(g · h) for all g, h ∈ G. (1.2)

Often V = R^n or V = C^n, meaning that D(g) is given by an n × n matrix and the product in GL(V) is simply matrix multiplication. The dimension of a representation is the dimension of V.

Similarly to the isomorphism of groups, we consider two representations D : G → GL(V) and D′ : G → GL(V′) of G to be equivalent (or isomorphic, D ≅ D′) if they are related by a similarity transform, meaning that there exists an invertible linear map S : V → V′ such that

D′(g) = S D(g) S⁻¹ for any g ∈ G. (1.3)


Having introduced abstract groups and their representations, we disentangle two aspects of the symmetry of a physical system: the abstract group captures how symmetry transformations compose, while the representation describes how the symmetry transformations act on the system.²

This separation is very useful in practice, thanks to the fact that there are far fewer abstract groups than conceivable physical systems with symmetries. Understanding the properties of some abstract group teaches us something about all possible systems that share that symmetry group. As we will see later in the setting of Lie groups, one can to a certain extent classify all possible abstract groups with certain properties. Furthermore, for a given abstract group one can then try to classify all its (inequivalent) representations, i.e. all the ways in which the group can be realized as a symmetry group in the system.

For instance, the isospin symmetry of pions in subatomic physics and the rotational symmetry of the hydrogen atom share (almost) the same abstract group. This means that their representations are classified by the same set of quantum numbers (the total (iso)spin and (iso)spin in the z-direction).

First we will focus on representations of finite groups, while the remainder of the course deals with Lie groups, i.e. infinite groups with the structure of a finite-dimensional manifold (like U(1), SO(3), SU(2), . . .).

² By definition the representations we are considering act as linear transformations on a vector space. So we only cover physical systems with a linear structure in which the symmetries are linear transformations. This is not much of a restriction, since all quantum mechanical systems are of this form.


2 Finite groups

When dealing with finite groups one may conveniently summarize the structure of a group $G = \{e, g_1, g_2, \ldots\}$ in its group multiplication table, also known as its Cayley table:

\[
\begin{array}{c|cccc}
\cdot & e & g_1 & g_2 & \cdots \\ \hline
e & e & g_1 & g_2 & \cdots \\
g_1 & g_1 & g_1 \cdot g_1 & g_1 \cdot g_2 & \cdots \\
g_2 & g_2 & g_2 \cdot g_1 & g_2 \cdot g_2 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
\tag{2.1}
\]

Note that the closure and inverse axioms of G (axioms (i) and (iv) in Definition 1.1), together with associativity, imply that each column and each row contains all elements of the group exactly once. Associativity itself, on the other hand, is not easily visualized in the table.

2.1 Example: The group of order 3

We can easily deduce using the Cayley table that there exists a unique abstract group of order 3. Indeed, if G = {e, a, b} with e the identity, then we know that the Cayley table is of the form

\[
\begin{array}{c|ccc}
\cdot & e & a & b \\ \hline
e & e & a & b \\
a & a & ? & ? \\
b & b & ? & ?
\end{array}
\tag{2.2}
\]

There is only one way to fill in the question marks such that each column and each row contains the three elements e, a, b:

\[
\begin{array}{c|ccc}
\cdot & e & a & b \\ \hline
e & e & a & b \\
a & a & b & e \\
b & b & e & a
\end{array}
\tag{2.3}
\]

It is easily checked that this multiplication is associative and thus defines a group. In fact, we have already encountered this group as a special case of the cyclic group of Example 1.5 when n = 3, i.e. G is isomorphic to the group Z_3 of addition modulo 3 (with isomorphism given by e → 0, a → 1, b → 2 for instance).
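The uniqueness argument can be confirmed by exhaustive search. The sketch below, with hypothetical helper names `completions` and `is_associative`, tries all 3⁴ ways of filling the four unknown entries, keeps those in which every row and column contains e, a, b exactly once, and then tests associativity.

```python
from itertools import product

elements = ["e", "a", "b"]

def completions():
    """Yield every filling of the four unknown entries of the order-3 table
    such that each row and each column contains e, a, b exactly once."""
    for aa, ab, ba, bb in product(elements, repeat=4):
        # table[(g, h)] stores the product g . h
        table = {("e", x): x for x in elements}
        table.update({(x, "e"): x for x in elements})
        table.update({("a", "a"): aa, ("a", "b"): ab,
                      ("b", "a"): ba, ("b", "b"): bb})
        rows_ok = all(sorted(table[(g, h)] for h in elements) == sorted(elements)
                      for g in elements)
        cols_ok = all(sorted(table[(g, h)] for g in elements) == sorted(elements)
                      for h in elements)
        if rows_ok and cols_ok:
            yield table

def is_associative(table):
    return all(table[(f, table[(g, h)])] == table[(table[(f, g)], h)]
               for f, g, h in product(elements, repeat=3))

tables = list(completions())
print(len(tables))                 # 1
print(is_associative(tables[0]))   # True
```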

Given the abstract group, one can construct representations. We first specify the vector space on which the representation acts to be C. In this case one has the trivial representation

\[ D(e) = 1 , \quad D(a) = 1 , \quad D(b) = 1 , \tag{2.4} \]

and the non-trivial representation corresponding to rotations by 120 degrees

\[ D(e) = 1 , \quad D(a) = e^{2\pi i/3} , \quad D(b) = e^{4\pi i/3} . \tag{2.5} \]

A straightforward computation verifies that these representations indeed obey the multiplication laws indicated in the group multiplication table. Both representations (2.4) and (2.5) are complex one-dimensional. It is then a natural question whether there are other representations of Z_3 which we can construct. This brings us to the so-called regular representation.

Definition 2.1 (Regular representation). The regular representation of a finite group $G = \{g_1, g_2, \ldots, g_n\}$ is obtained by choosing the vector space V to be the one spanned by its group elements,

\[ V = \Big\{ \sum_{i=1}^{n} \lambda_i |g_i\rangle : \lambda_i \in \mathbb{C} \Big\} . \tag{2.6} \]

For h, g ∈ G the relation

\[ D(h)|g\rangle \equiv |h \cdot g\rangle \tag{2.7} \]

then defines a representation D : G → GL(V), by extending to linear combinations,

\[ D(h) \sum_{i=1}^{n} \lambda_i |g_i\rangle = \sum_{i=1}^{n} \lambda_i |h \cdot g_i\rangle . \tag{2.8} \]

The dimension of the regular representation thus equals the order of the group.

We use the example Z_3 to illustrate how the definition (2.7) allows one to construct the matrices D(g) of the regular representation explicitly. The technique used in the construction is general and can be applied to any finite group. As starting point we choose a basis for the vector space, $|g_i\rangle$, i = 1, 2, 3, where $g_1 = e$, $g_2 = a$, $g_3 = b$ are the group elements. We introduce a scalar product, requiring that the basis is orthonormal

\[ \langle g_i | g_j \rangle = \delta_{ij} . \tag{2.9} \]

The matrices D(h) implicitly defined through the relation (2.7) are constructed by computing their matrix elements

\[ [D(h)]_{ij} = \langle g_i | D(h) | g_j \rangle = \langle g_i | h \cdot g_j \rangle . \tag{2.10} \]

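The explicit computation uses the action of the representation on the basis elements

\[ D(a)|e\rangle = |a\rangle , \quad D(a)|a\rangle = |b\rangle , \quad D(a)|b\rangle = |e\rangle , \quad \ldots , \tag{2.11} \]

and subsequently evaluating the scalar products. The resulting matrices are

\[
D(e) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} , \quad
D(a) = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} , \quad
D(b) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} .
\tag{2.12}
\]

Using standard matrix multiplication, one can verify that these matrices indeed obey the group multiplication laws (2.3). Thus (2.12) indeed is a representation of Z_3.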
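The construction of the regular representation from the Cayley table can be sketched in a few lines. The names `table`, `regular_matrix` and `matmul` are hypothetical; the only inputs are the multiplication table (2.3) and the matrix-element formula (2.10).

```python
elements = ["e", "a", "b"]

# Cayley table of the group of order 3: table[(h, g)] = h . g
table = {("e", "e"): "e", ("e", "a"): "a", ("e", "b"): "b",
         ("a", "e"): "a", ("a", "a"): "b", ("a", "b"): "e",
         ("b", "e"): "b", ("b", "a"): "e", ("b", "b"): "a"}

def regular_matrix(h):
    """[D(h)]_ij = <g_i | h . g_j>: 1 if g_i equals h . g_j, else 0."""
    return [[1 if gi == table[(h, gj)] else 0 for gj in elements]
            for gi in elements]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

D = {g: regular_matrix(g) for g in elements}

# Representation property D(g) D(h) = D(g . h) for all pairs
homomorphism_ok = all(matmul(D[g], D[h]) == D[table[(g, h)]]
                      for g in elements for h in elements)

for g in elements:
    print(g, D[g])
print(homomorphism_ok)   # True
```

Replacing `elements` and `table` by the data of another finite group yields its regular representation in the same way.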

2.2 Irreducible representations

A key role in classifying the representations of a group is played by the so-called irreducible representations.

Definition 2.2 (Invariant subspace). An invariant subspace W of a representation D : G → GL(V) is a linear subspace W ⊂ V such that

D(g)ψ ∈ W for any ψ ∈ W and g ∈ G. (2.13)

Trivially W = V and W = {0} are always invariant, so we are usually more interested in proper invariant subspaces W, meaning that W ⊂ V is invariant and W ≠ {0} and W ≠ V.

Definition 2.3 ((Ir)reducible representation). A representation is reducible if it has a proper invariant subspace, and irreducible otherwise.

Sometimes it is convenient to formulate this in terms of projection operators.³ A representation is reducible if and only if there exists a non-trivial projection operator P : V → V satisfying

P D(g) P = D(g) P for all g ∈ G . (2.14)

Indeed, the image of V under P is then an invariant subspace.

³ Recall that a projection operator P satisfies P² = P and, as a consequence, P(1 − P) = 0.


Definition 2.4 (Completely reducible representation). A representation is called completely reducible if it is equivalent to a representation whose matrix elements have block-diagonal form,

\[
D(g) = \begin{pmatrix} D_1(g) & 0 & \cdots \\ 0 & D_2(g) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}
\tag{2.15}
\]

where the $D_j$ are (not necessarily different) irreducible representations. Equivalently, eq. (2.15) may be written as the direct sum of irreducible representations each acting on a subspace

\[ D(g) = D_1(g) \oplus D_2(g) \oplus \ldots . \tag{2.16} \]

We now show that the 3-dimensional representation of Z_3 is completely reducible. The first step shows that the representation is reducible. The projector defining the invariant subspace is given by

\[
P = \frac{1}{3} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} .
\tag{2.17}
\]

An explicit computation shows that P D(g) P = D(g) P for all g ∈ {e, a, b}. Notably, the projector singles out the completely symmetric state ∝ |e〉 + |a〉 + |b〉, which is manifestly invariant under permutations.

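To show that the representation is completely reducible, we construct the similarity transform diagonalizing the matrices (2.12). Setting $\omega \equiv e^{2\pi i/3}$ this transformation is given by

\[
S = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega^2 & \omega \\ 1 & \omega & \omega^2 \end{pmatrix} , \qquad
S^{-1} = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{pmatrix} .
\tag{2.18}
\]

Evaluating $D'(g) = S D(g) S^{-1}$ yields

\[
D'(e) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} , \quad
D'(a) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \omega^2 & 0 \\ 0 & 0 & \omega \end{pmatrix} , \quad
D'(b) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & \omega^2 \end{pmatrix} .
\tag{2.19}
\]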

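Thus the regular representation of Z_3 can be written as the direct sum of irreducible representations, $D'(g) = D_t \oplus \bar{D}_1 \oplus D_1$, where $D_t$ is the trivial representation (2.4), $D_1$ is the complex 1-dimensional representation (2.5), and $\bar{D}_1$ is its complex conjugate, with $\bar{D}_1(a) = \omega^2$ and $\bar{D}_1(b) = \omega$.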
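Both steps of this reduction can be checked numerically. The sketch below is an illustration only: it re-enters the regular representation matrices, the projector and the similarity transform by hand, and the helpers `matmul` and `close` are hypothetical names; comparisons are done up to a floating-point tolerance.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)     # omega = e^{2 pi i / 3}
r = 1 / 3 ** 0.5                     # normalization 1 / sqrt(3)

D = {"e": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
     "a": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
     "b": [[0, 1, 0], [0, 0, 1], [1, 0, 0]]}
P = [[1 / 3] * 3 for _ in range(3)]
S = [[r, r, r], [r, r * w**2, r * w], [r, r * w, r * w**2]]
S_inv = [[r, r, r], [r, r * w, r * w**2], [r, r * w**2, r * w]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A)))

# Reducibility: P D(g) P = D(g) P for every group element
projector_ok = all(close(matmul(matmul(P, D[g]), P), matmul(D[g], P)) for g in D)

# Complete reducibility: S D(g) S^{-1} is diagonal
D_prime = {g: matmul(matmul(S, D[g]), S_inv) for g in D}

print(projector_ok)
for g in D:
    print(g, [D_prime[g][i][i] for i in range(3)])   # diagonal entries
```

The printed diagonal entries are, up to rounding, the values of the three one-dimensional representations on each group element.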

One should think of the irreducible representations as the analogues of prime numbers. Just as any positive integer can be decomposed as a product of primes, any representation of a finite group can be decomposed as a direct sum of irreducible representations.

Theorem 2.1. Every representation of a finite group is completely reducible and thus similar to a representation of the form of (2.15).

The proof relies on a number of ingredients.

Definition 2.5 (Unitary representation). A representation D : G → GL(V) on a complex vector space V is unitary if D(g)†D(g) = 1 for all g ∈ G.

In the exercises you will prove the following two results.

Theorem 2.2. Any representation of a finite group is equivalent to a unitary representation.


Theorem 2.3. If a unitary representation of a finite group has an invariant subspace W, then its orthogonal complement W⊥ is invariant as well.

Proof of Theorem 2.1. Let D : G → GL(V) be a representation. By Theorem 2.2 we may assume without loss of generality that D is unitary. If D is irreducible we are done. Otherwise it has a proper invariant subspace W (by Definition 2.3), while its orthogonal complement W⊥ is invariant too (by Theorem 2.3). Then there exist unitary representations $D_1$ and $D_2$ such that D(g) is similar to the block diagonal form

\[
D(g) \cong \begin{pmatrix} D_1(g) & 0 \\ 0 & D_2(g) \end{pmatrix} ,
\tag{2.20}
\]

where $D_1(g)$ corresponds to the action of G on W and $D_2(g)$ that on W⊥. We may iterate the procedure for $D_1$ and $D_2$ separately, until we are left with only irreducible representations. This happens after a finite number of steps since upon decomposition the dimensions of the representations strictly decrease.

Remark: Theorem 2.2 and Theorem 2.1 do not hold in general for infinite groups. For instance, the group of integers Z under addition (Example 1.2) has the two-dimensional representation

\[
D(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} .
\tag{2.21}
\]

It is reducible since it leaves the subspace spanned by the first basis vector $(1, 0)^T$ invariant, but it is not similar to a block diagonal form. Therefore it is not completely reducible.
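The claims in this remark can be made concrete with a short sketch (helper names `matmul` and `apply` are hypothetical). If D were similar to a block diagonal form, the blocks would be one-dimensional and D(1) would be diagonalizable; since its only eigenvalue is 1, that would force D(1) = 1. The code checks instead that N = D(1) − 1 is non-zero while N² = 0, which rules out diagonalizability.

```python
def D(x):
    """The two-dimensional representation of (Z, +): [[1, x], [0, 1]]."""
    return [[1, x], [0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

xs = range(-5, 6)

# Representation property: D(x) D(y) = D(x + y) on a sample of integers
is_representation = all(matmul(D(x), D(y)) == D(x + y) for x in xs for y in xs)

# The line spanned by (1, 0) is mapped to itself, so D is reducible
line_invariant = all(apply(D(x), [1, 0]) == [1, 0] for x in xs)

# N = D(1) - 1 is non-zero but squares to zero: D(1) is not diagonalizable
N = [[D(1)[i][j] - (1 if i == j else 0) for j in range(2)] for i in range(2)]
N_squared = matmul(N, N)

print(is_representation, line_invariant)   # True True
print(N, N_squared)                        # [[0, 1], [0, 0]] [[0, 0], [0, 0]]
```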

2.3 Schur’s lemma

The importance of irreducible representations for physics lies in the property that they allow one to separate the dynamics of the system from the structures determined by the symmetries of the problem. To illustrate this, we first discuss Schur’s lemma before applying it in quantum mechanics.

Theorem 2.4 (Schur’s lemma - Part I). Let $D_1 : G \to GL(V_1)$ and $D_2 : G \to GL(V_2)$ be irreducible representations of a group G, and $A : V_2 \to V_1$ a linear operator. If A intertwines $D_1$ and $D_2$, meaning that $D_1(g) A = A D_2(g)$ for all g ∈ G, then A = 0 or $D_1$ and $D_2$ are equivalent.

Proof. The proof is done in two steps.

A is injective or A = 0: Let us consider the subspace $W_2 \subset V_2$ annihilated by A. For any vector $|\mu\rangle \in V_2$ we have that

\[ |\mu\rangle \in W_2 \implies D_1(g) A |\mu\rangle = 0 \implies A D_2(g) |\mu\rangle = 0 \implies D_2(g)|\mu\rangle \in W_2 \tag{2.22} \]

for any g ∈ G, but that means $W_2$ is a (not necessarily proper) invariant subspace of $D_2$. Since $D_2$ is irreducible, $W_2 = \{0\}$ or $W_2 = V_2$, meaning that A is injective or A = 0.

A is surjective or A = 0: Let us consider the image $W_1 = A(V_2) \subset V_1$. For any vector $|\nu\rangle \in V_1$ we have

\begin{align}
|\nu\rangle \in W_1 &\implies |\nu\rangle = A|\mu\rangle \text{ for some } |\mu\rangle \in V_2 \tag{2.23} \\
&\implies D_1(g)|\nu\rangle = D_1(g) A |\mu\rangle = A D_2(g)|\mu\rangle \tag{2.24} \\
&\implies D_1(g)|\nu\rangle \in W_1 \tag{2.25}
\end{align}

for any g ∈ G, but that means $W_1$ is an invariant subspace of $D_1$. Since $D_1$ is irreducible, $W_1 = V_1$ or $W_1 = \{0\}$, meaning that A is surjective or A = 0.


Together this implies that A = 0 or A is invertible. The latter case means that $D_1$ and $D_2$ are equivalent, since then $D_1(g) = A D_2(g) A^{-1}$ for all g ∈ G.

This result is useful if you want to determine whether two different-looking irreducible representations are equivalent: if you can find any non-zero operator A that intertwines them, then they are equivalent.

Theorem 2.5 (Schur’s lemma - Part II). Let D : G → GL(V) be a finite-dimensional irreducible representation on a complex vector space V. If a linear operator A : V → V commutes with all elements of D, i.e. [D(g), A] = 0 for all g ∈ G, then A is proportional to the identity operator.

Proof. The proof uses that every finite-dimensional complex matrix A has at least one eigenvector,⁴ A|µ〉 = λ|µ〉. From the assumption [D(g), A] = 0 it then follows that

0 = D(g)(A − λ1)|µ〉 = (A − λ1)D(g)|µ〉 for all g ∈ G. (2.26)

Applying Theorem 2.4, with $D_1 = D_2 = D$, to the operator A − λ1 (which commutes with all D(g) and therefore intertwines D with itself), it follows that A − λ1 = 0 or A − λ1 is invertible. It cannot be invertible since (A − λ1)|µ〉 = 0 with |µ〉 ≠ 0, thus A = λ1.

In other words, if you can find any operator A that is not proportional to the identity and that commutes with a representation D, then D is reducible.
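As a small illustration of this criterion, consider the regular representation of the group of order 3. Because the group is abelian, the matrix A = D(a) commutes with every D(g), yet it is not a multiple of the identity, so the representation must be reducible. The sketch below (hypothetical helpers `matmul` and `commutator`, matrices entered by hand) checks both facts.

```python
D = {"e": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
     "a": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
     "b": [[0, 1, 0], [0, 0, 1], [1, 0, 0]]}

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
A = D["a"]

# A commutes with every element of the representation ...
commutes_with_all = all(commutator(D[g], A) == ZERO for g in D)
# ... but is not of the form lambda * identity
proportional_to_identity = all(A[i][j] == (A[0][0] if i == j else 0)
                               for i in range(3) for j in range(3))

print(commutes_with_all, proportional_to_identity)   # True False
```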

A direct consequence of Schur’s lemma is that states transforming under an irreducible representation are essentially unique. Suppose that the matrices D(g) are fixed for all g ∈ G. Any further basis transformation would correspond to an operator A satisfying $A D(g) A^{-1} = D(g)$, but according to Schur’s lemma A ∝ 1, corresponding only to trivial phase transformations.

We close this subsection with an application of irreducible representations in quantum mechanics. In this case the dynamics of the system is encoded in the Hamilton operator H and observables correspond to the eigenvalues of hermitian operators O, i.e. satisfying O† = O. Symmetries are implemented through a unitary representation D(g) of the symmetry group G acting on the entire Hilbert space of the system. In general, the representation D(g) is reducible. If it is completely reducible, one can choose a basis in which the action of D is block diagonal. Each set of states belonging to a given block transforms with respect to an irreducible representation $D_a(g)$ of G.

It is then natural to introduce an orthonormal set of states |a, j, x〉 on the Hilbert space satisfying $\langle a, j, x | b, k, y \rangle = \delta_{ab}\delta_{jk}\delta_{xy}$. Here a, b label the subspaces on which the irreducible representations act, j, k label the basis element within such a subspace, and x, y are collective labels for all other quantum numbers. Notably, this structure has profound consequences at the level of the matrix elements of the operators O

\[ \langle a, j, x | O | b, k, y \rangle . \tag{2.27} \]

First notice that D(g) is block diagonal when acting on the basis |a, j, x〉:

\[ \langle a, j, x | D(g) | b, k, y \rangle = \delta_{ab}\, \delta_{xy}\, [D_a(g)]_{jk} . \tag{2.28} \]

If D(g) generates a symmetry, it commutes with the operators O, [D(g), O] = 0, and in particular with the Hamiltonian H. This implies

\begin{align}
0 &= \langle a, j, x | [O, D(g)] | b, k, y \rangle \tag{2.29} \\
&= \sum_{a',k',x'} \Big( \langle a, j, x | O | a', k', x' \rangle \langle a', k', x' | D(g) | b, k, y \rangle - \langle a, j, x | D(g) | a', k', x' \rangle \langle a', k', x' | O | b, k, y \rangle \Big) \tag{2.30} \\
&= \sum_{k'} \Big( \langle a, j, x | O | b, k', y \rangle [D_b(g)]_{k'k} - [D_a(g)]_{jk'} \langle a, k', x | O | b, k, y \rangle \Big) . \tag{2.31}
\end{align}

⁴ Any non-constant polynomial has at least one (complex) root, so in particular the characteristic polynomial λ ↦ det(A − λI). This root is an eigenvalue of A.

The last line is precisely the structure entering Schur’s lemma, with the matrix $[A]_{jk} = \langle a, j, x | O | b, k, y \rangle$ (for fixed a, b, x, y) intertwining $D_a$ and $D_b$. As a consequence

\[
\begin{aligned}
\text{Part I (Theorem 2.4):} \quad & \langle a, j, x | O | b, k, y \rangle = 0 , && \text{if } a \neq b , \\
\text{Part II (Theorem 2.5):} \quad & \langle a, j, x | O | b, k, y \rangle \propto \delta_{jk} , && \text{if } a = b .
\end{aligned}
\tag{2.32}
\]

We conclude that there exists some function $f_a(x, y)$ such that

\[ \langle a, j, x | O | b, k, y \rangle = f_a(x, y)\, \delta_{ab}\, \delta_{jk} . \tag{2.33} \]

Thus the dependence of the matrix elements of O on the quantum numbers a, j is fixed by group theory, and all physics is stored in the unknown coefficients $f_a(x, y)$. This entails in particular that if the Hamiltonian commutes with all elements D(g) of a representation of G, one can choose the eigenstates of H to transform according to irreducible representations of G. If an irreducible representation appears only once in the Hilbert space, every state transforming within this representation is an eigenstate of H with the same eigenvalue.

2.4 Characters of a representation

The decomposition of a representation into its irreducible building blocks utilizes the characters of the representation.

Definition 2.6 (Character). The character $\chi_D : G \to \mathbb{C}$ of a representation D of G is defined through the trace of the linear operator D(g),

\[ \chi_D(g) = \mathrm{Tr}\, D(g) = \sum_i [D(g)]_{ii} . \tag{2.34} \]

The character is invariant under similarity transformations, and therefore equivalent representations D and D′ have the same character, $\chi_D = \chi_{D'}$. As we will see, the converse is also true for irreducible representations: if two irreducible representations share the same character, then they are equivalent. More precisely, characters of inequivalent irreducible representations are orthogonal as we will now prove.

Theorem 2.6 (Orthogonality of characters). Let G be a group of order N and D, D′ irreducible representations. Then

\[
\sum_{g \in G} \overline{\chi_D(g)}\, \chi_{D'}(g) =
\begin{cases}
0 & \text{if } D, D' \text{ inequivalent} \\
N & \text{if } D, D' \text{ equivalent.}
\end{cases}
\tag{2.35}
\]

This will be a direct consequence of the following striking result. Let us consider the quite large family of functions G → C consisting of all matrix elements $[D(g)]_{ij}$ of all irreducible representations D of G. According to the following theorem these are all orthogonal when summed over the elements of the group.

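Theorem 2.7 (Orthogonality of matrix elements). Let D : G → GL(V) and D′ : G → GL(V′) be irreducible representations of G with dimensions $n_D$ and $n_{D'}$. Then

\[
\sum_{g \in G} [D(g^{-1})]_{ij}\, [D'(g)]_{kl} =
\begin{cases}
0 & \text{if } D, D' \text{ inequivalent} \\
\dfrac{N}{n_D}\, \delta_{jk}\, \delta_{il} & \text{if } D = D'.
\end{cases}
\tag{2.36}
\]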


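Proof. Let X : V′ → V be any linear map. We claim that the operator A : V′ → V defined by

\[ A = \sum_{g \in G} D(g^{-1})\, X\, D'(g) \tag{2.37} \]

intertwines D and D′. To see this we evaluate D(h)A as follows,

\begin{align}
D(h) A &= \sum_{g \in G} D(h)\, D(g^{-1})\, X\, D'(g) \tag{2.38} \\
&= \sum_{g \in G} D(h \cdot g^{-1})\, X\, D'(g) \tag{2.39} \\
&= \sum_{g \in G} D((g \cdot h^{-1})^{-1})\, X\, D'(g) \tag{2.40} \\
&= \sum_{g' \in G} D((g')^{-1})\, X\, D'(g' \cdot h) \qquad (g' = g \cdot h^{-1}) \tag{2.41} \\
&= \sum_{g' \in G} D((g')^{-1})\, X\, D'(g')\, D'(h) = A\, D'(h) . \tag{2.42}
\end{align}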

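By Schur’s lemma A = 0 if D and D′ are inequivalent, while A ∝ 1 if D = D′. Notice also that if D = D′, then

\[
\mathrm{Tr}(A) = \sum_{g \in G} \mathrm{Tr}\big(D(g^{-1})\, X\, D(g)\big) = \sum_{g \in G} \mathrm{Tr}\big(D(g)^{-1} X D(g)\big) = \sum_{g \in G} \mathrm{Tr}(X) = N\, \mathrm{Tr}(X),
\tag{2.43}
\]

and therefore we must have

\[ A = \frac{N}{n_D}\, \mathrm{Tr}(X)\, 1 . \tag{2.44} \]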

Let $\{|i\rangle : i = 1, \ldots, n_D\}$ be a basis for V and $\{|k'\rangle : k = 1, \ldots, n_{D'}\}$ a basis for V′. Then $X = |j\rangle\langle k'|$ with fixed j, k is an example of a linear map V′ → V. The matrix element $[A]_{il}$ of the associated operator is precisely the left-hand side of (2.36):

\[
[A]_{il} = \langle i | A | l' \rangle = \sum_{g \in G} \langle i | D(g^{-1}) | j \rangle \langle k' | D'(g) | l' \rangle = \sum_{g \in G} [D(g^{-1})]_{ij}\, [D'(g)]_{kl} .
\tag{2.45}
\]

By our previous observation this vanishes if D and D′ are inequivalent. If on the other hand D = D′, we have $\mathrm{Tr}(X) = \delta_{jk}$ and therefore by (2.44),

\[
[A]_{il} = \frac{N}{n_D}\, \mathrm{Tr}(X)\, \delta_{il} = \frac{N}{n_D}\, \delta_{jk}\, \delta_{il} .
\tag{2.46}
\]

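Remark 2.1. Note that Theorem 2.7 does not apply to non-identical but equivalent representations D ≅ D′. As an aside let us see what happens in this case. By definition of equivalence there exists an invertible linear map S : V → V′ such that $D'(g) = S D(g) S^{-1}$. Then we find

\begin{align}
\sum_{g \in G} [D(g^{-1})]_{ij}\, [D'(g)]_{kl}
&= \sum_{g \in G} [D(g^{-1})]_{ij} \sum_{m,n} [S]_{km}\, [D(g)]_{mn}\, [S^{-1}]_{nl} \nonumber \\
&= \sum_{m,n} [S]_{km}\, [S^{-1}]_{nl} \sum_{g \in G} [D(g^{-1})]_{ij}\, [D(g)]_{mn} \nonumber \\
&= \frac{N}{n_D} \sum_{m,n} [S]_{km}\, [S^{-1}]_{nl}\, \delta_{in}\, \delta_{jm} \qquad \text{(by Theorem 2.7)} \nonumber \\
&= \frac{N}{n_D}\, [S]_{kj}\, [S^{-1}]_{il} , \tag{2.47}
\end{align}

so instead of the orthogonality one finds the matrix elements of the similarity transform and its inverse.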


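Proof of Theorem 2.6. Since the characters are invariant under similarity transformations, we may assume without loss of generality that D and D′ are unitary (Theorem 2.2) and that D = D′ if they are equivalent. For a unitary representation $\chi_D(g^{-1}) = \mathrm{Tr}\, D(g)^\dagger = \overline{\chi_D(g)}$, and therefore

\[
\sum_{g \in G} \overline{\chi_D(g)}\, \chi_{D'}(g) = \sum_{g \in G} \chi_D(g^{-1})\, \chi_{D'}(g) = \sum_{i=1}^{n_D} \sum_{k=1}^{n_{D'}} \sum_{g \in G} [D(g^{-1})]_{ii}\, [D'(g)]_{kk} .
\tag{2.48}
\]

By the previous theorem this vanishes if D and D′ are inequivalent, while for D = D′ it equals

\[
\sum_{i=1}^{n_D} \sum_{k=1}^{n_D} \frac{N}{n_D}\, \delta_{ik} = N .
\tag{2.49}
\]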
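A quick numerical check of character orthogonality is possible for the cyclic group Z_n. The sketch assumes the n one-dimensional representations D_k(g) = e^{2πikg/n}, k = 0, …, n − 1, of Z_n (for n = 3 these are the trivial representation, the representation by rotations over 120 degrees, and its complex conjugate); being one-dimensional they are irreducible and equal to their own characters. The names `chi` and `inner` are hypothetical.

```python
import cmath

def chi(k, g, n):
    """Character of the one-dimensional representation D_k of Z_n."""
    return cmath.exp(2j * cmath.pi * k * g / n)

def inner(k, l, n):
    """Sum over g in Z_n of conj(chi_k(g)) * chi_l(g)."""
    return sum(chi(k, g, n).conjugate() * chi(l, g, n) for g in range(n))

n = 3
for k in range(n):
    # each row should be (numerically) N on the diagonal and 0 elsewhere
    print([round(abs(inner(k, l, n)), 10) for l in range(n)])
```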

The orthogonality relation entails the following profound consequences:

1) Theorem 2.7 shows that the matrix elements of inequivalent irreducible unitary representations are orthogonal and non-vanishing (and therefore linearly independent) functions of the group elements. It can be shown (but we will not do it here) that these form a complete basis for the complex-valued functions on G. As a consequence, if G has order N and $n_a$ are the dimensions of its irreducible representations then one has the remarkable formula

\[ N = \sum_a n_a^2 . \tag{2.50} \]

2) Consider a representation D of the group G with characters $\chi_D(g)$. Its completely reduced form contains the irreducible representation $D_a$ precisely $m_a$ times, where

\[ m_a = \frac{1}{N} \sum_{g \in G} \overline{\chi_{D_a}(g)}\, \chi_D(g) . \tag{2.51} \]

This can be seen as follows. Since the characters are independent of the choice of basis on the linear space, we can choose a basis where D(g) is block diagonal,

\[
D = \underbrace{D_a \oplus \cdots \oplus D_a}_{m_a \text{ times}} \oplus \underbrace{\cdots}_{\text{other irreducible representations}} .
\tag{2.52}
\]

In this form it is clear that $\chi_D = \sum_a m_a \chi_{D_a}$ and then (2.51) follows from Theorem 2.6.

3) The characters can be used to construct projectors onto the subspaces transforming under the irreducible representation $D_a(g)$:

\[ P_a = \frac{n_a}{N} \sum_{g \in G} \overline{\chi_{D_a}(g)}\, D(g) . \tag{2.53} \]

Again, this is a consequence of the orthogonality relation (2.36), which traced over i = j implies, for irreducible unitary representations $D_a$ and $D_b$,

\[ \frac{n_a}{N} \sum_{g \in G} \overline{\chi_{D_a}(g)}\, [D_b(g)]_{lm} = \delta_{ab}\, \delta_{lm} . \tag{2.54} \]

In the block-diagonal basis for D this gives “1” on subspaces transforming with respect to $D_a$ and zero otherwise.
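Consequences 2) and 3) can be illustrated on the regular representation of Z_3. The sketch below assumes the three one-dimensional irreducible representations D_k(g) = ω^{kg} with ω = e^{2πi/3}, labels the group elements e, a, b by the exponents 0, 1, 2 (using b = a · a), and uses hypothetical helper names; the multiplicity and projector formulas are evaluated with complex-conjugated characters and n_a = 1.

```python
import cmath

N = 3
w = cmath.exp(2j * cmath.pi / 3)

# Regular representation of Z_3, indexed by the exponent g (e = 0, a = 1, b = 2)
D = {0: [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
     1: [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
     2: [[0, 1, 0], [0, 0, 1], [1, 0, 0]]}

def chi_irrep(k, g):
    """Character of the one-dimensional irreducible representation D_k."""
    return w ** (k * g)

def chi_regular(g):
    return sum(D[g][i][i] for i in range(3))

def multiplicity(k):
    """m_k = (1/N) sum_g conj(chi_k(g)) chi_D(g)."""
    return sum(chi_irrep(k, g).conjugate() * chi_regular(g) for g in range(N)) / N

def projector(k):
    """P_k = (n_k/N) sum_g conj(chi_k(g)) D(g), with n_k = 1."""
    return [[sum(chi_irrep(k, g).conjugate() * D[g][i][j] for g in range(N)) / N
             for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

print([round(multiplicity(k).real, 10) for k in range(N)])   # each irrep once
print(close(projector(0), [[1 / 3] * 3 for _ in range(3)]))  # symmetric projector
```

Each irreducible representation appears exactly once, consistent with N = 1² + 1² + 1² = 3, and the projector for k = 0 is the matrix with all entries 1/3 that singles out the completely symmetric state.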

This concludes our survey of finite groups.


3 Introduction to Lie groups and Lie algebras

If the elements of a group G depend smoothly, g = g(α), on a set of continuous parameters $\alpha_a$, a = 1, . . . , n, then G is called a Lie group. A prototypical example is the rotation group SO(2), implementing rotations in the two-dimensional plane: the group elements depend on a single continuous parameter given by the rotation angle α.

A matrix representation for SO(2) acting on R² is given by

\[
D(g(\alpha)) = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} .
\tag{3.1}
\]

Since $D(g)^{-1} = D(g)^\dagger = D(g)^T$ this representation is unitary.
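The basic properties of this representation are easy to confirm numerically. The following sketch (hypothetical helpers `rot`, `matmul`, `transpose`, `close`) checks at a few sample angles that rotation matrices compose by adding angles, that the transpose is the inverse, and that α and α + 2π give the same matrix.

```python
import math

def rot(alpha):
    """Rotation matrix for angle alpha in the two-dimensional plane."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s], [s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

print(close(matmul(rot(0.3), rot(1.1)), rot(0.3 + 1.1)))        # composition
print(close(matmul(transpose(rot(0.7)), rot(0.7)), IDENTITY))   # D^T D = 1
print(close(rot(0.7 + 2 * math.pi), rot(0.7)))                  # 2 pi periodicity
```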

Formally, a Lie group is defined as follows:

Definition 3.1 (Lie group). A Lie group of dimension n is a group (G, ·) whose set G possesses the additional structure of an n-dimensional smooth manifold. The group structures, composition (g, h) → g · h and inversion g → g⁻¹, are required to be smooth maps.

Recall that, informally speaking, a set G is an n-dimensional smooth manifold if the neighbourhood of any element can be smoothly parametrized by n parameters $(\alpha_1, \ldots, \alpha_n) \in \mathbb{R}^n$. In terms of such a parametrization the composition and inversion correspond locally to maps $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ and $\mathbb{R}^n \to \mathbb{R}^n$ respectively, which we require to be smooth in the usual sense.

We see that SO(2) is a 1-dimensional Lie group because the rotations of R² are smoothly parametrized by a single parameter α ∈ R. Since α and α + 2π are mapped to the same matrix D(g(α)), the manifold structure of SO(2) is that of a circle (of length 2π).

Recall that a representation D of a group G on a vector space V is a mapping G → GL(V) that is compatible with the group structure of G. In the case of a Lie group G one requires that D is smooth, meaning that the matrices D(g(α)) depend smoothly on the parameters $\alpha_a$.

3.1 Generators of a Lie group

When parameterizing a Lie group, we adopt the convention that $\alpha_a = 0$ corresponds to the identity, i.e., at the level of the group $g(\alpha)|_{\alpha=0} = e$ or for a representation $D(g(\alpha))|_{\alpha=0} = 1$. It turns out that a lot of information on the Lie group can be recovered by studying the vicinity of the identity. The construction of the generators exploits that the representation depends smoothly on the parameters $\alpha_a$. As a consequence any representation admits a Taylor expansion at the identity,

\[ D(g(\alpha)) = 1 + \alpha_a X_a + O(\alpha^2), \tag{3.2} \]

where the generators $X_a$ are defined as⁵

\[ X_a \equiv \frac{\partial}{\partial \alpha_a} D(g(\alpha)) \Big|_{\alpha=0} . \tag{3.3} \]

⁵ Note that in the physics literature one often uses a convention with an additional imaginary unit in the definition of the generator, i.e. $D(g(\epsilon)) = 1 + i\, \epsilon_a \tilde{X}_a + \ldots$ and $\tilde{X}_a \equiv -i X_a$. The reason is that most Lie groups encountered in physics are unitary, in which case the generators $\tilde{X}_a$ are Hermitian while the $X_a$ are antihermitian (see section 3.3 below). We will stick to the mathematical convention without the i, which is more convenient when considering the algebraic structure.


In (3.2) we have employed the Einstein summation convention to sum over repeated indices: $x_a y_a := \sum_{a=1}^{n} x_a y_a$.

Applying (3.3) to (3.1) shows that SO(2) possesses a single generator given by

\[
X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} .
\tag{3.4}
\]

There is a natural way to recover group elements from linear combinations of the generators. Notice that if the vector $\alpha_a$ is not particularly small, then $1 + \alpha_a X_a$ does not give an element of the group, because we have neglected the contributions from the higher order terms in (3.2). Luckily we can choose an integer k and construct a group element in k steps: In the limit where k becomes large, $1 + \frac{\alpha_a}{k} X_a$ does represent a group element, close to the identity, that we can then compose k times with itself. This motivates the introduction of the exponential map.

Definition 3.2 (The exponential map). Sufficiently close to unity, the elements of a representation can be obtained as the exponential of the generators

\[
D(g(\alpha)) = \lim_{k \to \infty} \Big( 1 + \frac{\alpha_a X_a}{k} \Big)^k = \sum_{m=0}^{\infty} \frac{1}{m!} (\alpha_a X_a)^m \equiv e^{\alpha_a X_a} .
\tag{3.5}
\]

This construction applies to any representation D of the group. In the exercises you will check that the exponential map of multiples of (3.4) reproduces the parametrization (3.1), i.e.

\[
e^{\alpha X} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} ,
\tag{3.6}
\]

and therefore the exponential map of SO(2) covers the whole group. This turns out to be true for all the Lie groups that we will consider in this course:

Fact 3.1. If G is a connected⁶ and compact⁷ Lie group and D : G → GL(V) a finite-dimensional representation, then for any g ∈ G one can find a linear combination $\alpha_a X_a$ such that $D(g) = e^{\alpha_a X_a}$.
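For SO(2) both forms of the exponential map can be compared numerically with the rotation matrix. The sketch below is an illustration under stated assumptions: `exp_series` truncates the power series after 30 terms, `exp_limit` evaluates (1 + A/k)^k for the finite value k = 2²⁰ by repeated squaring, and all helper names are hypothetical. The series agrees with the rotation matrix to rounding accuracy, while the finite-k product only agrees up to an error of order 1/k.

```python
import math

X = [[0.0, -1.0], [1.0, 0.0]]            # generator of SO(2)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def exp_series(A, terms=30):
    """Truncated power series: sum of A^m / m! for m < terms."""
    result, term = IDENTITY, IDENTITY
    for m in range(1, terms):
        term = scale(1.0 / m, matmul(term, A))   # A^m / m! from A^(m-1) / (m-1)!
        result = add(result, term)
    return result

def exp_limit(A, squarings=20):
    """(1 + A/k)^k with k = 2**squarings, computed by repeated squaring."""
    k = 2 ** squarings
    M = add(IDENTITY, scale(1.0 / k, A))
    for _ in range(squarings):
        M = matmul(M, M)
    return M

def rot(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s], [s, c]]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

alpha = 1.0
print(max_diff(exp_series(scale(alpha, X)), rot(alpha)))   # rounding-level error
print(max_diff(exp_limit(scale(alpha, X)), rot(alpha)))    # small, of order 1/k
```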

If all generators mutually commute, i.e. $[X_a, X_b] = 0$ for all a, b = 1, . . . , n, then it is easily seen from the definition of the exponential map that $A = \alpha_a X_a$ and $B = \beta_a X_a$ satisfy

\[ e^A \cdot e^B = e^B \cdot e^A = e^{A+B} \qquad (\text{if } [X_a, X_b] = 0) . \tag{3.7} \]

In other words the Lie group is necessarily Abelian and the group multiplication is equivalent to addition of the corresponding generators.

6 A Lie group is connected if any element in $G$ can be reached from the identity by a continuous path. An example of a Lie group that is not connected is O(2), the set of 2 × 2 orthogonal real matrices $M$. Since $\det(M) = \pm 1$ there is no way to reach a matrix with determinant −1 along a continuous path starting at the identity matrix, which has determinant 1.

7 A Lie group defined via a set of matrices is compact if the absolute value of all matrix elements is bounded by some constant.

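For general non-Abelian Lie groups, however, we know that there are $A = \alpha_a X_a$ and $B = \beta_a X_a$ that fail this relation, i.e.
$$e^{A} e^{B} \neq e^{A+B} \neq e^{B} e^{A}. \tag{3.8}$$
Note that we do have
$$e^{\lambda A} e^{\mu A} = e^{(\lambda+\mu)A} \quad \text{for } \lambda, \mu \in \mathbb{R}. \tag{3.9}$$
In particular, setting $\lambda = 1$ and $\mu = -1$ we find for the group inverse of $e^{A}$,
$$e^{A} e^{-A} = e^{0 \cdot A} = 1 \quad\Longrightarrow\quad (e^{A})^{-1} = e^{-A}. \tag{3.10}$$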

3.2 Lie algebras

As we will see below, the generators $X_a$ span a vector space $\mathfrak{g} = \{\alpha_a X_a : \alpha_a \in \mathbb{R}\}$ with a particular structure, known as a Lie algebra.

Definition 3.3 (Lie algebra). A Lie algebra $\mathfrak{g}$ is a vector space together with a “Lie bracket” $[\cdot,\cdot]$ that satisfies

(i) closure: if $X, Y \in \mathfrak{g}$ then $[X, Y] \in \mathfrak{g}$;

(ii) linearity: $[aX + bY, Z] = a[X,Z] + b[Y,Z]$ for $X, Y, Z \in \mathfrak{g}$ and $a, b \in \mathbb{R}$;

(iii) antisymmetry: $[X, Y] = -[Y, X]$ for $X, Y \in \mathfrak{g}$;

(iv) Jacobi identity: $[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0$ for $X, Y, Z \in \mathfrak{g}$.

When working with matrices, as in our case, we can take the Lie bracket to be defined by the commutator of the matrix multiplication
$$[X, Y] := XY - YX. \tag{3.11}$$
It is then easy to see that this bracket satisfies linearity and antisymmetry. In the exercises you will verify that it also satisfies the Jacobi identity by expanding the commutators. Properties (ii)-(iv) thus follow directly from the elements being represented by matrices. We will now see that the closure (i) of the generators $X_a$ follows from the closure of the group multiplication.

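Let $A, B \in \mathfrak{g}$, i.e. $A$ and $B$ are linear combinations of the generators $X_a$, and let $\epsilon \in \mathbb{R}$. Then typically
$$e^{\epsilon A} e^{\epsilon B} \neq e^{\epsilon B} e^{\epsilon A}, \quad \text{or equivalently} \quad e^{-\epsilon A} e^{-\epsilon B} e^{\epsilon A} e^{\epsilon B} \neq 1. \tag{3.12}$$
However, by the closure of the group, $e^{-\epsilon A} e^{-\epsilon B} e^{\epsilon A} e^{\epsilon B}$ must be some group element. Let us determine this group element for small $\epsilon$. Expanding to second order in $\epsilon$,
$$e^{-\epsilon A} e^{-\epsilon B} e^{\epsilon A} e^{\epsilon B} = \left(1 - \epsilon A + \tfrac{1}{2}\epsilon^2 A^2\right)\left(1 - \epsilon B + \tfrac{1}{2}\epsilon^2 B^2\right)\left(1 + \epsilon A + \tfrac{1}{2}\epsilon^2 A^2\right)\left(1 + \epsilon B + \tfrac{1}{2}\epsilon^2 B^2\right) + O(\epsilon^3).$$
The terms proportional to $\epsilon A$, $\epsilon B$, $\epsilon^2 A^2$, $\epsilon^2 B^2$ all cancel, leaving
$$e^{-\epsilon A} e^{-\epsilon B} e^{\epsilon A} e^{\epsilon B} = 1 + \epsilon^2 (AB - BA) + O(\epsilon^3) = \exp\left(\epsilon^2 [A, B] + O(\epsilon^3)\right). \tag{3.13}$$
Since this is true for $\epsilon$ arbitrarily small, we must have that $[A, B]$ is a generator, thus $[A, B] \in \mathfrak{g}$ and (i) follows.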
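The expansion (3.13) can be checked numerically. The sketch below is an illustration only: as test matrices we take two antisymmetric 3 × 3 matrices (elements of so(3), our choice), and the value `eps = 1e-2`, the truncation of the exponential series and all helper names are assumptions introduced here. The residual after subtracting $1 + \epsilon^2[A,B]$ should be of order $\epsilon^3$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scaled(m, c):
    return [[c * x for x in row] for row in m]

def added(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(m, terms=25):
    """Matrix exponential via the truncated power series in (3.5)."""
    result, term = identity(len(m)), identity(len(m))
    for p in range(1, terms):
        term = scaled(matmul(term, m), 1.0 / p)
        result = added(result, term)
    return result

# Two non-commuting antisymmetric test matrices (illustrative choice).
A = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
B = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
eps = 1e-2

# Left-hand side of (3.13): e^{-eps A} e^{-eps B} e^{eps A} e^{eps B}.
left = matmul(matmul(expm(scaled(A, -eps)), expm(scaled(B, -eps))),
              matmul(expm(scaled(A, eps)), expm(scaled(B, eps))))

# Second-order prediction 1 + eps^2 [A, B].
commutator = added(matmul(A, B), scaled(matmul(B, A), -1.0))
predicted = added(identity(3), scaled(commutator, eps ** 2))

residual = max(abs(left[i][j] - predicted[i][j])
               for i in range(3) for j in range(3))
print(residual, eps ** 2)
```

The residual is much smaller than the $\epsilon^2$ term itself, consistent with the $O(\epsilon^3)$ remainder in (3.13).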

We conclude that to any Lie group $G$ we can associate a Lie algebra $\mathfrak{g}$ spanned by the generators of $G$. In mathematical terms the linear space spanned by the generators is the tangent space of the manifold $G$ at the identity $e \in G$, so we find that the tangent space of a Lie group $G$ naturally possesses a Lie algebra structure $\mathfrak{g}$. By convention, we will use Fraktur font to denote Lie algebras: the Lie algebras of the Lie groups SO(N), SU(N), ... are denoted by $\mathfrak{so}(N)$, $\mathfrak{su}(N)$, ...

Since the generators $X_a$ form a basis of the Lie algebra, the commutator of two generators $[X_a, X_b]$ is again a real linear combination of generators. The real coefficients are called the structure constants.

Definition 3.4 (Structure constants). Let $G$ be a Lie group with generators $X_a$, $a = 1, \ldots, n$. The structure constants $f_{abc}$ of the Lie algebra $\mathfrak{g}$ spanned by the $X_a$ are defined through the relation
$$[X_a, X_b] = f_{abc} X_c. \tag{3.14}$$

The properties of the structure constants are summarized as follows:

1) The structure constants do not depend on the representation, since they express how infinitesimal symmetry transformations compose. They are thus a property of the underlying abstract Lie group. Note that they do depend on the choice of basis of generators (we will come back to this dependence on the basis in Section 5.2).

2) The anti-symmetry of the commutator implies that the structure constants are anti-symmetric in the first two indices,
$$f_{abc} = -f_{bac}. \tag{3.15}$$

3) The Jacobi identity
$$[X_a, [X_b, X_c]] + [X_b, [X_c, X_a]] + [X_c, [X_a, X_b]] = 0 \tag{3.16}$$
can be equivalently stated as a relation among the structure constants:
$$f_{bcd} f_{ade} + f_{abd} f_{cde} + f_{cad} f_{bde} = 0. \tag{3.17}$$

4) The structure constants are sufficient to compute the product of the group elements exactly. This follows from the Baker-Campbell-Hausdorff formula for exponentials of matrices,
$$e^{X} e^{Y} = \exp\Big(X + Y + \tfrac{1}{2}[X, Y] + \tfrac{1}{12}\big([X,[X,Y]] + [Y,[Y,X]]\big) + \text{more multi-commutators}\Big). \tag{3.18}$$
The multi-commutators can again be expressed in terms of the structure constants.
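Properties 2) and 3) are easy to test on a concrete set of structure constants. As a sketch, we take $f_{abc} = \epsilon_{abc}$, which are the structure constants of su(2) in the basis used later in these notes; the function name `levi_civita` and the use of indices 0, 1, 2 instead of 1, 2, 3 are conventions chosen here for illustration.

```python
from itertools import product

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}, eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2

idx = range(3)
f = [[[levi_civita(a, b, c) for c in idx] for b in idx] for a in idx]

# Property 2), eq. (3.15): f_abc = -f_bac.
antisymmetric_ok = all(f[a][b][c] == -f[b][a][c]
                       for a, b, c in product(idx, repeat=3))

# Property 3), eq. (3.17): f_bcd f_ade + f_abd f_cde + f_cad f_bde = 0,
# with the repeated index d summed over.
jacobi_ok = all(
    sum(f[b][c][d] * f[a][d][e] + f[a][b][d] * f[c][d][e] + f[c][a][d] * f[b][d][e]
        for d in idx) == 0
    for a, b, c, e in product(idx, repeat=4)
)
print(antisymmetric_ok, jacobi_ok)
```

Since all entries are integers the checks are exact; the same two loops apply to any other table of structure constants.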

The importance of the Lie algebra arises from the property that it allows one to study properties and representations of a group at the level of linear algebra. In particular, the exponential map allows one to translate proofs back and forth between the two perspectives.

Two families of Lie groups, the special unitary groups SU(N) and special orthogonal groups SO(N), play an important role in physics. We also include the definition of the less familiar family of compact symplectic groups Sp(N), because together SU(N), SO(N) and Sp(N) form the only infinite families of the (yet to be introduced) simple compact Lie groups, as we will see in the last chapter of these lecture notes.

3.3 The special unitary group SU(N)

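For $N \geq 2$, the special unitary group in $N$ dimensions is
$$SU(N) = \left\{ U \in \mathbb{C}^{N \times N} : U^\dagger U = 1,\ \det U = 1 \right\}. \tag{3.19}$$
Its importance as a symmetry group stems from the fact that it leaves invariant the $N$-dimensional scalar product familiar from quantum mechanics,
$$\langle \xi | \eta \rangle \equiv \xi_a^* \eta_a, \qquad \xi, \eta \in \mathbb{C}^N, \tag{3.20}$$
where we used the summation convention. Indeed, we observe that for any $U \in SU(N)$,
$$\langle U\xi | U\eta \rangle = (U\xi)_a^* (U\eta)_a = U_{ab}^* \xi_b^* U_{ac} \eta_c = \xi_b^* (U^\dagger)_{ba} U_{ac} \eta_c \overset{U^\dagger U = 1}{=} \xi_b^* \delta_{bc} \eta_c = \langle \xi | \eta \rangle. \tag{3.21}$$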

Note that we have not used that $\det U = 1$. The most general linear transformations preserving the inner product thus form the larger unitary group
$$U(N) = \left\{ U \in \mathbb{C}^{N \times N} : U^\dagger U = 1 \right\}. \tag{3.22}$$
The determinant of a unitary matrix $U \in U(N)$ necessarily satisfies $|\det U| = 1$. Therefore U(N) and SU(N) differ only by a phase: any unitary matrix $U \in U(N)$ can be written as
$$U = e^{i\alpha} U', \qquad e^{iN\alpha} = \det U, \qquad U' \in SU(N). \tag{3.23}$$
Since this phase commutes with all elements of the group it does not play a role in understanding the algebraic structure of unitary matrices.⁸ For this reason we concentrate on SU(N) rather than U(N).

In order to determine the corresponding Lie algebra we need to translate the conditions (3.19) for the Lie group elements into conditions on the generators. Writing an arbitrary group element in terms of the exponential map, $U(\alpha) \equiv e^{\alpha_a X_a}$, one has $U^\dagger = e^{\alpha_a X_a^\dagger}$ and $U^{-1} = e^{-\alpha_a X_a}$. The condition $U^\dagger = U^{-1}$ then implies that the generators are antihermitian, $X_a^\dagger = -X_a$. In addition, $\det U = 1$ imposes that the generators must be traceless, $\operatorname{tr} X_a = 0$. This property follows from
$$\log \det U = 0 \overset{!}{=} \operatorname{tr} \log e^{\alpha_a X_a} = \alpha_a \operatorname{tr} X_a. \tag{3.24}$$
Since this must hold for any value of the $\alpha_a$ the generators must be traceless. Thus the Lie algebra $\mathfrak{su}(N)$ consists of all antihermitian, trace-free matrices,
$$\mathfrak{su}(N) = \left\{ X \in \mathbb{C}^{N \times N} : X^\dagger = -X,\ \operatorname{tr} X = 0 \right\}. \tag{3.25}$$
The number of generators can be deduced from the following counting argument: an $N \times N$ matrix with complex entries has $2N^2$ real parameters. The condition that the matrix is antihermitian imposes $N^2$ constraints. Requiring that the matrices are trace-free removes an additional generator (the one corresponding to global phase rotations). Hence
$$\dim \mathfrak{su}(N) = N^2 - 1. \tag{3.26}$$
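The counting argument can be made concrete by writing down an explicit basis. The sketch below uses one possible choice (real antisymmetric matrices, $i$ times real symmetric off-diagonal matrices, and traceless imaginary diagonal matrices); this particular basis, the helper names and the Gaussian-elimination rank check are assumptions made here for illustration, not the construction used in the notes.

```python
def su_basis(n):
    """One possible basis of antihermitian traceless n x n matrices."""
    basis = []
    for a in range(n):
        for b in range(a + 1, n):
            antisym = [[0j] * n for _ in range(n)]
            antisym[a][b], antisym[b][a] = 1 + 0j, -1 + 0j   # E_ab - E_ba
            sym = [[0j] * n for _ in range(n)]
            sym[a][b], sym[b][a] = 1j, 1j                    # i (E_ab + E_ba)
            basis += [antisym, sym]
    for k in range(n - 1):
        diag = [[0j] * n for _ in range(n)]
        diag[k][k], diag[k + 1][k + 1] = 1j, -1j             # i (E_kk - E_{k+1,k+1})
        basis.append(diag)
    return basis

def is_antihermitian(m):
    n = len(m)
    return all(m[i][j] == -m[j][i].conjugate() for i in range(n) for j in range(n))

def is_traceless(m):
    return sum(m[i][i] for i in range(len(m))) == 0

def real_rank(rows, tol=1e-12):
    """Rank of a list of real vectors by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if abs(rows[r][col]) > tol), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

def flatten(m):
    """View a complex matrix as a real vector (real parts, then imaginary parts)."""
    return [z.real for row in m for z in row] + [z.imag for row in m for z in row]

dims = {n: real_rank([flatten(m) for m in su_basis(n)]) for n in (2, 3, 4)}
all_valid = all(is_antihermitian(m) and is_traceless(m)
                for n in (2, 3, 4) for m in su_basis(n))
print(dims, all_valid)
```

For $N = 2, 3, 4$ the rank over the reals comes out as $N^2 - 1$, in line with (3.26).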

3.4 The special orthogonal group SO(N)

For $N \geq 2$, the special orthogonal group SO(N) is the real analogue of the special unitary group,
$$SO(N) = \left\{ \Lambda \in \mathbb{R}^{N \times N} : \Lambda^T \Lambda = 1,\ \det \Lambda = 1 \right\}. \tag{3.27}$$
This time it is the Euclidean scalar product on $\mathbb{R}^N$ that is preserved,
$$\langle \xi, \eta \rangle \equiv \xi_a \eta_a, \tag{3.28}$$
since for any $\Lambda \in SO(N)$ we have
$$\langle \Lambda\xi, \Lambda\eta \rangle = (\Lambda\xi)_a (\Lambda\eta)_a = \xi_b (\Lambda^T)_{ba} \Lambda_{ac} \eta_c \overset{\Lambda^T \Lambda = 1}{=} \xi_b \delta_{bc} \eta_c = \langle \xi, \eta \rangle. \tag{3.29}$$
Again SO(N) is not the largest group of matrices preserving the inner product, because we have not used that $\det \Lambda = 1$. The largest group $O(N) = \left\{ \Lambda \in \mathbb{R}^{N \times N} : \Lambda^T \Lambda = 1 \right\}$ is that of all orthogonal $N \times N$ matrices, including reflections. Since any orthogonal matrix has determinant ±1, O(N) really consists of two disjoint components, one for each sign of the determinant. The component containing the identity matrix is precisely SO(N). We will focus on the latter because it is connected (and compact), and therefore any element can be obtained via the exponential map.

8 We will see later that the absence of this phase in SU(N) entails that the Lie algebra $\mathfrak{su}(N)$ is simple (in a precise sense), while $\mathfrak{u}(N)$ is not.

For any $\Lambda \in SO(N)$ we may write $\Lambda = e^{\alpha_a X_a}$ with $X_a$ real $N \times N$ matrices. Then $\Lambda^T = e^{\alpha_a X_a^T}$ and $\Lambda^{-1} = e^{-\alpha_a X_a}$. Thus $\Lambda^T = \Lambda^{-1}$ implies that the generators of the Lie algebra $\mathfrak{so}(N)$ are antisymmetric matrices, $X_a^T = -X_a$. The restriction on the determinant implies that the generators are traceless, but this already follows from antisymmetry. Hence we find that
$$\mathfrak{so}(N) = \left\{ X \in \mathbb{R}^{N \times N} : X^T = -X \right\}. \tag{3.30}$$
The number of generators follows from counting the independent entries in an antisymmetric $N \times N$ matrix and is given by
$$\dim \mathfrak{so}(N) = \tfrac{1}{2} N (N - 1). \tag{3.31}$$

3.5 The compact symplectic group Sp(N)

For $N \geq 1$, the compact symplectic group $Sp(N) \subset SU(2N)$ consists of the unitary $2N \times 2N$ matrices that leave invariant the skew-symmetric scalar product
$$(\xi, \eta) \equiv \sum_{a,b=1}^{2N} \xi_a J_{ab} \eta_b, \qquad J = \begin{pmatrix} 0 & 1_N \\ -1_N & 0 \end{pmatrix}. \tag{3.32}$$
Hence it satisfies
$$Sp(N) = \left\{ U \in \mathbb{C}^{2N \times 2N} : U^\dagger U = 1,\ U^T J U = J \right\}. \tag{3.33}$$
The last condition can be equivalently written as $U^{-1} = J^{-1} U^T J$. In terms of the exponential map this means that
$$e^{-\alpha_a X_a} = J^{-1} e^{\alpha_a X_a^T} J = e^{\alpha_a J^{-1} X_a^T J}, \tag{3.34}$$
where the last equality follows easily from the power series expansion of the exponential. Since $J^{-1} = -J$, it follows that the generators of the Lie algebra satisfy
$$X_a^T = J X_a J. \tag{3.35}$$
Together with the fact that the generators have to be antihermitian for $U$ to be unitary, this leads us to identify the Lie algebra as
$$\mathfrak{sp}(N) = \left\{ X \in \mathbb{C}^{2N \times 2N} : X^\dagger = -X,\ X^T = J X J \right\}. \tag{3.36}$$
With some work one may deduce from these conditions that $\mathfrak{sp}(N)$ has dimension $N(2N+1)$.
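As a small consistency check (a worked example added here, using the Pauli matrices $\sigma_a$ that are introduced in Section 4.1), consider $N = 1$. Then $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = i\sigma_2$, and for the antihermitian matrices $X_a = -\tfrac{i}{2}\sigma_a$ one finds

```latex
\begin{aligned}
J X_a J &= -\tfrac{i}{2}\,(i\sigma_2)\,\sigma_a\,(i\sigma_2) = \tfrac{i}{2}\,\sigma_2 \sigma_a \sigma_2, \\
\sigma_2 \sigma_a \sigma_2 &= -\sigma_a \ \ (a = 1, 3), \qquad \sigma_2 \sigma_2 \sigma_2 = \sigma_2, \\
\Rightarrow\quad J X_1 J &= X_1 = X_1^T, \qquad J X_2 J = -X_2 = X_2^T, \qquad J X_3 J = X_3 = X_3^T .
\end{aligned}
```

All three matrices therefore satisfy both conditions in (3.36), so $\mathfrak{sp}(1)$ contains the three-dimensional algebra $\mathfrak{su}(2)$, in agreement with the dimension formula $N(2N+1) = 3$ for $N = 1$.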

4 Irreducible representations of su(2)

The exponential map provides a connection between Lie groups and the underlying algebra. As a consequence we can construct representations of the Lie group from representations of the corresponding Lie algebra.

Note that we have been a bit sloppy in our notation for the Lie algebra and its representations. In the previous section we introduced the Lie algebra $\mathfrak{g}$ of a Lie group $G$ by looking at a representation $D : G \to GL(\mathbb{C}^N)$ and considering the expansion of $D(g(\alpha))$ around the identity $g(0) = e$. The linear space spanned by the generators should be understood as one particular representation (which we could denote by $D : \mathfrak{g} \to \mathbb{C}^{N \times N}$) of an abstract Lie algebra $\mathfrak{g}$. Even though the dimension of the representation (i.e. the size $N$ of the matrices) depends on the representation, the dimension $n = \dim(G)$ of the Lie algebra $\mathfrak{g}$ (i.e. the number of generators) is fixed. We have argued that the structure constants $f_{abc}$ can be understood as a property of the abstract Lie algebra. Therefore the problem of finding other, say, $N$-dimensional representations of a Lie algebra $\mathfrak{g}$ is equivalent to finding a collection of $n$ generators $X_a \in \mathbb{C}^{N \times N}$ satisfying (3.14).

The goal of this section is to construct all finite-dimensional irreducible representations of the Lie algebra $\mathfrak{su}(2)$.

4.1 The fundamental representation of su(2)

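We have already seen one representation of $\mathfrak{su}(2)$ in Section 3.3, namely the defining or fundamental representation
$$\mathfrak{su}(2) = \left\{ X \in \mathbb{C}^{2 \times 2} : X^\dagger = -X,\ \operatorname{tr} X = 0 \right\}. \tag{4.1}$$
A basis of traceless Hermitian matrices, well known from quantum mechanics, is given by the Pauli matrices
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{4.2}$$
satisfying
$$\sigma_a^\dagger = \sigma_a, \qquad \operatorname{tr}[\sigma_a] = 0, \qquad \sigma_a \sigma_b = \delta_{ab}\, 1 + i \epsilon_{abc}\, \sigma_c. \tag{4.3}$$
Here $\epsilon_{abc}$ is the completely anti-symmetric ε-tensor with $\epsilon_{123} = 1$. To obtain a basis of antihermitian traceless matrices, we then only have to multiply the Pauli matrices by $i$. In particular, if we set
$$X_a = -\tfrac{i}{2} \sigma_a \tag{4.4}$$
then
$$[X_a, X_b] = \epsilon_{abc} X_c, \qquad (a, b, c = 1, 2, 3). \tag{4.5}$$
It follows that the structure constants of $\mathfrak{su}(2)$ are $f_{abc} = \epsilon_{abc}$. Any other set of three antihermitian matrices satisfying (4.5) constitutes another representation of $\mathfrak{su}(2)$.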
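The commutation relations (4.5) for $X_a = -\tfrac{i}{2}\sigma_a$ can be verified directly. The following sketch does so with plain nested lists of complex numbers; the helper names and the 0-based indexing of the generators are our own conventions.

```python
from itertools import product

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}, eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2

# Pauli matrices (4.2), indexed 0, 1, 2 instead of 1, 2, 3.
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Antihermitian generators X_a = -(i/2) sigma_a, eq. (4.4).
X = [[[-0.5j * z for z in row] for row in s] for s in sigma]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def rhs(a, b):
    """epsilon_abc X_c with the index c summed over."""
    return [[sum(levi_civita(a, b, c) * X[c][i][j] for c in range(3))
             for j in range(2)] for i in range(2)]

su2_relations_hold = all(close(commutator(X[a], X[b]), rhs(a, b))
                         for a, b in product(range(3), repeat=2))
print(su2_relations_hold)
```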

4.2 Irreducible representations of su(2) from the highest-weight construction

Recall that a representation $D : G \to GL(V)$ is irreducible if it has no proper invariant subspace $W \subset V$, i.e. a subspace such that $D(g)w \in W$ for all $w \in W$ and $g \in G$. An invariant subspace $W$ can be equivalently characterized in terms of the Lie algebra $\mathfrak{g}$, namely that $X_a w \in W$ for all $w \in W$ and all generators $X_a$. An irreducible representation of $\mathfrak{su}(2)$ therefore corresponds to a set of $N \times N$ antihermitian matrices satisfying (4.5) that do not leave any proper subspace $W \subset \mathbb{C}^N$ invariant.

Such representations can be constructed systematically via the highest-weight construction. The first step in the construction diagonalizes as many generators as possible. From quantum mechanics we recall that two (Hermitian or antihermitian) operators can be diagonalized simultaneously if and only if they commute. For $\mathfrak{su}(2)$ this implies that only one generator may be taken diagonal, which without loss of generality we take to be $X_3$. Since $X_3$ is antihermitian,
$$J_3 \equiv i X_3 \tag{4.6}$$
is a real diagonal $N \times N$ matrix. The remaining generators $X_1$ and $X_2$ are recast into raising and lowering operators
$$J_\pm \equiv \frac{1}{\sqrt{2}} (J_1 \pm i J_2) = \frac{i}{\sqrt{2}} (X_1 \pm i X_2), \qquad J_a \equiv i X_a. \tag{4.7}$$

Since $X_a^\dagger = -X_a$, one has $J_- = (J_+)^\dagger$. Furthermore, the commutator (4.5) implies
$$[J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = J_3. \tag{4.8}$$
The key property of the raising and lowering operators is that they raise and lower the eigenvalues of $J_3$ by one unit. To be precise, suppose $|m\rangle$ is an eigenvector of $J_3$ with eigenvalue $m$, which is necessarily real because $J_3$ is Hermitian. Then $J_\pm |m\rangle$ is an eigenvector of $J_3$ with eigenvalue $m \pm 1$ (unless $J_\pm |m\rangle = 0$), because
$$J_3 J_\pm |m\rangle = J_\pm J_3 |m\rangle \pm J_\pm |m\rangle = (m \pm 1)\, J_\pm |m\rangle, \tag{4.9}$$
where we used the commutator (4.8) in the first step.

The states transforming within a given irreducible representation are found from the highest-weight construction. Since the representation is finite-dimensional and $J_3$ is Hermitian, $J_3$ has a largest eigenvalue $j \in \mathbb{R}$. Let $|j\rangle$ be one of the corresponding (normalized) eigenvectors, which we call a highest-weight state,
$$J_3 |j\rangle = j\, |j\rangle, \qquad \langle j | j \rangle = 1. \tag{4.10}$$
The property that it is the highest-weight state implies that it is annihilated by the raising operator,
$$J_+ |j\rangle = 0, \tag{4.11}$$
since by construction there cannot be any state with $J_3$-eigenvalue $j + 1$. Applying $J_-$ lowers the $J_3$-eigenvalue by one unit, and we denote the corresponding eigenvector by $|j - 1\rangle$,
$$J_- |j\rangle = N_j\, |j - 1\rangle, \tag{4.12}$$
where we introduced a real normalization constant $N_j > 0$ to ensure $\langle j - 1 | j - 1 \rangle = 1$. The value of $N_j$ can be computed from the normalization of the highest-weight state and the commutator. Employing the definition (4.12),
$$N_j^2\, \langle j - 1 | j - 1 \rangle = \langle j | J_+ J_- | j \rangle = \langle j | [J_+, J_-] | j \rangle = \langle j | J_3 | j \rangle = j. \tag{4.13}$$
Thus we conclude that $N_j = \sqrt{j}$. Conversely, applying $J_+$ to $|j - 1\rangle$ leads back to the state $|j\rangle$:
$$J_+ |j - 1\rangle = J_+ \frac{1}{N_j} J_- |j\rangle = \frac{1}{N_j} [J_+, J_-] |j\rangle = \frac{1}{N_j} J_3 |j\rangle = \frac{j}{N_j} |j\rangle = N_j\, |j\rangle. \tag{4.14}$$
We can continue this process and define $|j - k\rangle$ to be the properly normalized state obtained after $k$ applications of $J_-$ on $|j\rangle$, assuming it does not vanish. In particular we introduce the normalization coefficients $N_{j-k} > 0$ via
$$J_- |j - k\rangle \equiv N_{j-k}\, |j - k - 1\rangle. \tag{4.15}$$

We claim that the states $|j\rangle, |j - 1\rangle, \ldots$ span an invariant subspace $W$ of the representation. It is clear from the construction that $W$ is invariant under $J_3$ and $J_-$, so it remains to verify that $J_+ |j - k\rangle \in W$ for all $k$. This can be checked by induction: we have seen that always $J_+ |j - 1\rangle \in W$. Now suppose $J_+ |j - k\rangle \in W$ for some $k \geq 1$. Then
$$J_+ |j - k - 1\rangle = \frac{1}{N_{j-k}} J_+ J_- |j - k\rangle = \frac{1}{N_{j-k}} \left( J_- J_+ |j - k\rangle + J_3 |j - k\rangle \right), \tag{4.16}$$
but both terms are in $W$: $J_- J_+ |j - k\rangle \in W$ because $J_+ |j - k\rangle \in W$ by the induction hypothesis, and $J_3 |j - k\rangle = (j - k) |j - k\rangle \in W$. Hence $W$ is a non-trivial invariant subspace. If we require the representation to be irreducible, $|j\rangle, |j - 1\rangle, \ldots$ must form a basis for the full vector space. This implies in particular that our choice of highest-weight state $|j\rangle$ was unique (up to phase).

The definition (4.15) allows us to derive the recursion relation
$$|N_{j-k}|^2 - |N_{j-k+1}|^2 = j - k. \tag{4.17}$$
This follows from
$$\begin{aligned} |N_{j-k}|^2 &= |N_{j-k}|^2\, \langle j - k - 1 | j - k - 1 \rangle = \langle j - k | J_+ J_- | j - k \rangle \\ &= \langle j - k | [J_+, J_-] | j - k \rangle + \langle j - k | J_- J_+ | j - k \rangle = j - k + |N_{j-k+1}|^2. \end{aligned} \tag{4.18}$$
Noticing that the recursion relation (4.17) is a telescopic sum, the solution for the normalization coefficients is obtained as
$$|N_{j-k}|^2 = |N_j|^2 + \sum_{l=1}^{k} (j - l) = \tfrac{1}{2} (k + 1)(2j - k). \tag{4.19}$$

In order for the number of states transforming within the representation to be finite, there must also be a lowest-weight state with $J_3$-eigenvalue $j - l$, $l \in \mathbb{N}$. By definition this state is annihilated by $J_-$,
$$J_- |j - l\rangle = 0, \tag{4.20}$$
since there is no state with $J_3$-eigenvalue $j - l - 1$. This implies that the norm $N_{j-l}$ introduced in (4.15) and computed in (4.19) has to vanish,
$$N_{j-l} = \frac{1}{\sqrt{2}} \sqrt{(2j - l)(l + 1)} \overset{!}{=} 0. \tag{4.21}$$
This happens if and only if $j = 0, 1/2, 1, 3/2, \ldots$ and $l = 2j$. In other words, one obtains a finite-dimensional representation if $j$ is either integer or half-integer. The finite-dimensional irreducible representations of $\mathfrak{su}(2)$ can then be labeled by their $j$-value, and are called the spin-$j$ representations. The spin-$j$ representation is $(2j + 1)$-dimensional and the $J_3$-eigenvalues of the states cover the range $-j \leq m \leq j$ with unit steps.
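To see the formulas (4.19) and (4.21) at work, it may help to evaluate them for $j = 1$ (a worked example added here, not taken from the original notes):

```latex
\begin{aligned}
k = 0:&\quad |N_{1}|^2 = \tfrac12 \cdot 1 \cdot 2 = 1, &\quad J_-|1\rangle &= |0\rangle, \\
k = 1:&\quad |N_{0}|^2 = \tfrac12 \cdot 2 \cdot 1 = 1, &\quad J_-|0\rangle &= |-1\rangle, \\
k = 2:&\quad |N_{-1}|^2 = \tfrac12 \cdot 3 \cdot 0 = 0, &\quad J_-|-1\rangle &= 0 .
\end{aligned}
```

The chain terminates after $l = 2 = 2j$ steps, leaving the three states $|1\rangle, |0\rangle, |-1\rangle$ of the spin-1 representation.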

We have proved that any $N$-dimensional irreducible representation of $\mathfrak{su}(2)$ is equivalent to the spin-$j$ representation with $2j + 1 = N$, meaning that we can find a basis $|m\rangle \equiv |j, m\rangle$, $m = -j, -j + 1, \ldots, j$ such that
$$J_3 |j, m\rangle = m\, |j, m\rangle, \qquad J_- |j, m\rangle = N_m\, |j, m - 1\rangle, \qquad J_+ |j, m\rangle = N_{m+1}\, |j, m + 1\rangle, \tag{4.22}$$
$$N_m = \frac{1}{\sqrt{2}} \sqrt{(j + m)(j - m + 1)}. \tag{4.23}$$

In particular we can read off explicit matrices for the generators in the spin-$j$ representation,
$$[J_3]_{m'm} = \langle j, m' | J_3 | j, m \rangle = m\, \delta_{m',m} \tag{4.24}$$
and
$$[J_+]_{m'm} = \langle j, m' | J_+ | j, m \rangle = N_{m+1} \langle j, m' | j, m + 1 \rangle = \frac{1}{\sqrt{2}} \sqrt{(j + m + 1)(j - m)}\; \delta_{m',m+1}. \tag{4.25}$$
The matrix representation of the lowering operator is obtained via $J_- = (J_+)^\dagger$. It remains to check that this spin-$j$ representation is indeed an irreducible representation. It is easy to verify that the matrices $J_3$, $J_+$, $J_-$ satisfy the commutation relations (4.8). Solving (4.6) and (4.7) then gives explicit matrices $X_a$ satisfying the commutation relations $[X_a, X_b] = \epsilon_{abc} X_c$. To see that the representation is irreducible let us consider an arbitrary (non-vanishing) vector
$$|\psi\rangle = \sum_{m=-j}^{j} a_m\, |j, m\rangle. \tag{4.26}$$
Note that we can always find a $k \geq 0$ such that
$$J_+^k |\psi\rangle \propto |j, j\rangle, \tag{4.27}$$
namely $k = 2j$ if $a_{-j} \neq 0$, or $k = 2j - 1$ if $a_{-j} = 0$ and $a_{-j+1} \neq 0$, etcetera. But then we can obtain all basis vectors from $|\psi\rangle$ by repeated application of $J_-$ and $J_+$ via
$$J_-^l J_+^k |\psi\rangle \propto |j, j - l\rangle. \tag{4.28}$$
This means that if $|\psi\rangle$ is in some invariant subspace then all basis vectors are in this subspace, so the spin-$j$ representation has no proper invariant subspaces! This completes the proof of the following theorem.

Theorem 4.1. A complete list of inequivalent irreducible representations of $\mathfrak{su}(2)$ is given by the $(2j + 1)$-dimensional spin-$j$ representations (4.22) for $j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$.

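We close this section by checking that for $j = 1/2$ we reproduce the two-dimensional fundamental representation of $\mathfrak{su}(2)$. Using the basis $|1/2, 1/2\rangle$, $|1/2, -1/2\rangle$ we evaluate (4.24) to
$$J_3 = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{1}{2} \sigma_3 = i X_3. \tag{4.29}$$
From (4.25) we obtain
$$J_+ = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad J_- = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{4.30}$$
Converting back to $X_1$ and $X_2$ via
$$X_1 = -\frac{i}{\sqrt{2}} (J_+ + J_-) = -\frac{i}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad X_2 = -\frac{1}{\sqrt{2}} (J_+ - J_-) = -\frac{i}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \tag{4.31}$$
one recovers $X_a = -\tfrac{i}{2} \sigma_a$ with $\sigma_a$ the Pauli matrices (4.2).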
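The explicit matrices (4.24) and (4.25) are straightforward to build for any spin. The sketch below does this with the basis ordered as $m = j, j-1, \ldots, -j$ and checks the commutation relations (4.8) numerically; the function names, the use of the integer `two_j = 2j` as input, and the basis ordering are choices made here for illustration.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spin_matrices(two_j):
    """Return (J3, J+, J-) in the spin-j representation, j = two_j / 2.

    Row/column index i corresponds to the state |j, m> with m = j - i.
    """
    dim = two_j + 1
    j = two_j / 2
    j3 = [[(j - i) if i == k else 0.0 for k in range(dim)] for i in range(dim)]
    j_plus = [[0.0] * dim for _ in range(dim)]
    for i in range(1, dim):
        # (4.25) with m = j - i: N_{m+1} = sqrt((j + m + 1)(j - m) / 2)
        j_plus[i - 1][i] = math.sqrt((two_j - i + 1) * i / 2)
    j_minus = [[j_plus[k][i] for k in range(dim)] for i in range(dim)]  # transpose
    return j3, j_plus, j_minus

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][k] - ba[i][k] for k in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][k] - b[i][k]) < tol for i in range(n) for k in range(n))

def satisfies_su2_relations(two_j):
    """Check [J3, J+-] = +-J+- and [J+, J-] = J3, eq. (4.8)."""
    j3, jp, jm = spin_matrices(two_j)
    minus_jm = [[-x for x in row] for row in jm]
    return (close(commutator(j3, jp), jp)
            and close(commutator(j3, jm), minus_jm)
            and close(commutator(jp, jm), j3))

checks = {two_j: satisfies_su2_relations(two_j) for two_j in range(0, 7)}
j_plus_half = spin_matrices(1)[1]   # spin-1/2 raising operator, cf. (4.30)
print(checks, j_plus_half)
```

For `two_j = 1` the matrices reduce to (4.29) and (4.30), and the same function gives, for instance, the 3 × 3 spin-1 matrices for `two_j = 2`.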

5 The adjoint representation

5.1 Definition

Let $G$ be a Lie group and $\mathfrak{g}$ the corresponding Lie algebra in some $N$-dimensional matrix representation. If the dimension of $G$, i.e. the number of generators of the Lie algebra $\mathfrak{g}$, is $n$, then $G$ possesses a natural $n$-dimensional representation with underlying vector space given by the Lie algebra $\mathfrak{g}$ itself.

Definition 5.1 (Adjoint representation of a Lie group). The adjoint representation $D : G \to GL(\mathfrak{g})$ of $G$ is given by
$$D(g) X = g X g^{-1} \quad \text{for } g \in G,\ X \in \mathfrak{g}. \tag{5.1}$$
This definition makes sense because the elements of $G$ and $\mathfrak{g}$ are both $N \times N$ matrices. To see that $g X g^{-1} \in \mathfrak{g}$, notice that
$$t \mapsto g\, e^{tX} g^{-1} \tag{5.2}$$
determines a smooth path in $G$ that visits the identity at $t = 0$. Hence,
$$\partial_t \left( g\, e^{tX} g^{-1} \right) \Big|_{t=0} = g X g^{-1} \in \mathfrak{g}. \tag{5.3}$$

Let us look at the corresponding representation of the Lie algebra. To this end we consider a generator $Y \in \mathfrak{g}$ and expand the action of $D(1 + \epsilon Y)$ to first order in $\epsilon$,
$$D(1 + \epsilon Y) X = (1 + \epsilon Y) X (1 - \epsilon Y + \ldots) = X + \epsilon [Y, X] + \ldots = (1 + \epsilon [Y, \cdot\,] + \ldots) X. \tag{5.4}$$
Hence the generator $Y$ acts on the Lie algebra $\mathfrak{g}$ by taking the Lie bracket $[Y, \cdot\,]$. Viewing the elements $X \in \mathfrak{g}$ as states $|X\rangle$ we may thus summarize the representation as
$$Y |X\rangle = |[Y, X]\rangle. \tag{5.5}$$
To obtain explicit matrices $[T_a]_{bc}$ for this representation, we choose a basis of generators $X_a$ spanning the Lie algebra $\mathfrak{g}$ and consider an inner product⁹ such that $\langle X_a | X_b \rangle = \delta_{ab}$. Recall that the commutator of the generators is given in terms of the structure constants by
$$[X_a, X_b] = f_{abc} X_c. \tag{5.6}$$
To find the matrix element $[T_a]_{bc}$ corresponding to the action of $X_a$ on $\mathfrak{g}$ we use the inner product to find
$$[T_a]_{bc} \overset{(2.10)}{=} \langle X_b | X_a | X_c \rangle \overset{(5.5)}{=} \langle X_b | [X_a, X_c] \rangle \overset{(5.6)}{=} f_{acd} \langle X_b | X_d \rangle = f_{acb}. \tag{5.7}$$
This constitutes the adjoint representation of $\mathfrak{g}$:

Definition 5.2 (Adjoint representation of a Lie algebra). The adjoint representation of a Lie algebra $\mathfrak{g}$ with structure constants $f_{abc}$ is generated by the matrices
$$[T_a]_{bc} \equiv f_{acb}. \tag{5.8}$$
The generators $T_a$ satisfy the commutator relation (5.6). This can be verified by substituting the definition (5.8) into (5.6) and applying the Jacobi identity (3.17) (see exercises).
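For a concrete case, the matrices (5.8) can be written out for $\mathfrak{su}(2)$, where $f_{abc} = \epsilon_{abc}$. The sketch below (with 0-based indices and helper names chosen here) builds the three 3 × 3 matrices $T_a$ and checks that they satisfy $[T_a, T_b] = \epsilon_{abc} T_c$; it illustrates the statement for this one algebra and is not a general proof.

```python
from itertools import product

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}, eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

idx = range(3)
# Adjoint generators (5.8): [T_a]_{bc} = f_{acb} with f = epsilon.
T = [[[levi_civita(a, c, b) for c in idx] for b in idx] for a in idx]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in idx] for i in idx]

def rhs(a, b):
    """f_{abc} T_c with the index c summed over."""
    return [[sum(levi_civita(a, b, c) * T[c][i][j] for c in idx) for j in idx]
            for i in idx]

adjoint_ok = all(commutator(T[a], T[b]) == rhs(a, b)
                 for a, b in product(idx, repeat=2))
print(T[0], adjoint_ok)
```

The matrices that come out are real and antisymmetric, i.e. they are also a basis of the 3 × 3 antisymmetric matrices in (3.30).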

The dimension of the adjoint representation equals the dimension of the Lie group. This is similar to the regular representation encountered in the context of finite groups, where the elements of the group served as a basis of the linear space on which the representation acts. Now it is not the group elements but the generators that span the linear space.

9 We will see in a minute that the algebra has a canonical inner product given by the Cartan-Killing metric. Here the inner product is arbitrarily defined by our choice of basis.

5.2 Cartan-Killing metric

There is a natural metric on the space of generators:

Definition 5.3 (Cartan-Killing metric). The Cartan-Killing metric
$$\gamma_{ab} := \operatorname{tr}[T_a T_b] = [T_a]_{cd} [T_b]_{dc} = f_{adc} f_{bcd} \tag{5.9}$$
provides a scalar product on the space of generators of the adjoint representation. (Note that the trace tr is with respect to the matrix indices of the generators $[T_a]_{bc}$.)

Fact 5.1. As a side remark let us mention that if $\mathfrak{g}$ is a simple¹⁰ Lie algebra and one is not worried about the precise normalization, one may equally compute the Cartan-Killing metric by taking the trace $\gamma_{ab} = c\, \operatorname{tr}[X_a X_b]$ in any irreducible representation other than the adjoint one, because they are all equal up to a positive real factor $c$.

A natural question concerns the freedom in writing down the generators $T_a$. Since the generators $X_a$ constitute a basis of a linear space, there is the freedom of transforming to a new basis $X'_a$ by performing an invertible, linear transformation $L$:
$$X'_a = L_{ba} X_b. \tag{5.10}$$
The change of basis induces a change in the structure constants
$$f_{abc} \mapsto f'_{abc} = L_{da} L_{eb}\, f_{def}\, [L^{-1}]_{cf}. \tag{5.11}$$
This is seen by applying the transformation (5.10) to the commutator
$$[X'_a, X'_b] \overset{(5.6)}{=} f'_{abc} X'_c \overset{(5.10)}{=} f'_{abc} L_{fc} X_f \overset{!}{=} L_{da} L_{eb} [X_d, X_e] \overset{(5.6)}{=} L_{da} L_{eb} (f_{def} X_f). \tag{5.12}$$
As a consequence the generators of the adjoint representation transform according to
$$[T_a]_{bc} \mapsto [T'_a]_{bc} \overset{(5.8)}{=} f'_{acb} \overset{(5.11)}{=} L_{da} L_{fc}\, f_{dfe}\, [L^{-1}]_{be} \overset{(5.8)}{=} L_{da} L_{fc} [T_d]_{ef} [L^{-1}]_{be} = L_{da} [L^{-1} T_d L]_{bc}. \tag{5.13}$$
This, in turn, implies that the Cartan-Killing metric transforms as
$$\gamma_{ab} = \operatorname{tr}[T_a T_b] \mapsto \gamma'_{ab} = \operatorname{tr}[T'_a T'_b] = L_{ca} L_{db} \operatorname{tr}\left[ L^{-1} T_c L\, L^{-1} T_d L \right] = L_{ca} L_{db}\, \gamma_{cd}, \tag{5.14}$$
where we used the cyclicity of the trace.

Since the Cartan-Killing metric is symmetric in the generator indices $a, b$, one may choose the basis transformation $L$ such that after the transformation $\gamma_{ab}$ is of the diagonal form
$$\gamma_{ab} = \operatorname{tr}[T_a T_b] = \begin{pmatrix} -1_{n_-} & 0 & 0 \\ 0 & +1_{n_+} & 0 \\ 0 & 0 & 0 \cdot 1_{n_0} \end{pmatrix} \tag{5.15}$$
for some integers $n_-$, $n_+$ and $n_0$ that add up to $n$. This fact about symmetric matrices may be familiar to you from general relativity (where the spacetime metric can be brought by a local basis transformation to Minkowski form, i.e. $n_- = 1$, $n_+ = 3$) and is known under the name of Sylvester's law of inertia.

10 A Lie algebra is simple if its adjoint representation is irreducible. A more useful definition of a simple Lie algebra will be given in Chapter 8. For now it is sufficient to know that the Lie algebras $\mathfrak{su}(N)$ for $N \geq 2$ and $\mathfrak{so}(N)$ for $N \geq 3$ are all simple.

The block containing the negative eigenvalues is generated by the $n_-$ “compact” (rotation-type) generators, the positive eigenvalues belong to the $n_+$ “non-compact” (boost-type) generators, and the zero entries originate from nilpotent generators, for which $(X_a)^p = 0$ for some $p > 1$. Thus the eigenvalues of the Cartan-Killing metric provide non-trivial information on the compactness of the Lie algebra.

Definition 5.4 (Compact Lie algebra). A Lie algebra $\mathfrak{g}$ is compact if the Cartan-Killing metric is negative definite, meaning that its eigenvalues $k_a$ satisfy $k_a < 0$ for all $a$.

Adopting the basis where the Cartan-Killing metric is diagonal has the consequence that the structure constants of a compact Lie algebra $\mathfrak{g}$ become completely antisymmetric¹¹
$$f_{abc} = f_{bca} = f_{cab} = -f_{bac} = -f_{acb} = -f_{cba}. \tag{5.16}$$
The proof of this property is rather instructive. It expresses the structure constants in terms of the Cartan-Killing metric. Normalizing the generators such that $k_a = -1$ for all $T_a$ one has
$$\operatorname{tr}([T_a, T_b]\, T_c) \overset{(5.6)}{=} f_{abd} \operatorname{tr}(T_d T_c) \overset{(5.9)}{=} f_{abd}\, \gamma_{dc} \overset{\gamma_{dc} = -\delta_{dc}}{=} -f_{abc}. \tag{5.17}$$
The cyclicity of the trace then shows that
$$\begin{aligned} f_{abc} &= -\operatorname{tr}([T_a, T_b]\, T_c) = -\operatorname{tr}(T_a T_b T_c - T_b T_a T_c) \\ &= -\operatorname{tr}(T_b T_c T_a - T_c T_b T_a) = -\operatorname{tr}([T_b, T_c]\, T_a) = f_{bca}. \end{aligned} \tag{5.18}$$
Together with the anti-symmetry of the structure constants in the first two indices, this establishes the relation (5.16).

5.3 Casimir operator

Suppose $\mathfrak{g}$ is compact with generators $X_a$ in any irreducible representation. Then we can introduce an important quadratic operator.

Definition 5.5 (Casimir operator). The (quadratic) Casimir operator is given by
$$C := \gamma^{ab} X_a X_b, \tag{5.19}$$
where $\gamma^{ab} = [\gamma^{-1}]_{ab}$ is the inverse of the Cartan-Killing metric $\gamma_{ab}$.

Note that generally $C \notin \mathfrak{g}$ since it is not built from commutators of the generators, but it makes perfect sense as a matrix acting on the same vectors as $X_a$.

Theorem 5.1. The Casimir operator is independent of the choice of basis $X_a$ and commutes with all generators, i.e. $[C, X_a] = 0$.

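Proof. To see that it is independent of the basis let us look at a transformation $X_a \mapsto X'_a = L_{ba} X_b$ as before. Then
$$C \mapsto C' \overset{(5.19)}{=} (\gamma')^{ab} X'_a X'_b = (\gamma')^{ab} L_{ca} X_c\, L_{db} X_d \tag{5.20}$$
$$\overset{(5.14)}{=} \left[ L (L^T \gamma L)^{-1} L^T \right]^{cd} X_c X_d = \gamma^{cd} X_c X_d = C. \tag{5.21}$$
Without loss of generality we may then assume that we have chosen generators such that $\gamma_{ab} = -\delta_{ab}$, as in (5.17), so that $C = -X_a X_a$. Then we find for any $b$,
$$[C, X_b] \overset{(5.19)}{=} -[X_a X_a, X_b] = -\left( X_a [X_a, X_b] + [X_a, X_b] X_a \right) \tag{5.22}$$
$$\overset{(5.6)}{=} -\left( X_a f_{abc} X_c + f_{abc} X_c X_a \right) \overset{(5.16)}{=} 0, \tag{5.23}$$
where the last step uses that $f_{abc}$ is antisymmetric in $a$ and $c$ while $X_a X_c + X_c X_a$ is symmetric.

11 This is the basis which is typically adopted when studying non-abelian gauge groups in quantum field theory.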

Note that the Casimir operator does not just commute with all generators, but also with the group elements of $G$ in the corresponding representation $D$. Indeed, using the exponential map (Definition 3.2) any element $g \in G$ can be written as $D(g) = \exp(\alpha_a X_a)$ and therefore satisfies
$$[C, D(g)] = [C, \exp(\alpha_a X_a)] \overset{\text{Thm. 5.1}}{=} 0. \tag{5.24}$$
This puts us precisely in the setting of Schur's Lemma, Theorem 2.5, which allows us to conclude that if $D$ is irreducible the Casimir operator is proportional to the identity,
$$C = C_D\, 1, \tag{5.25}$$
where $C_D$ is a positive real number that only depends on the representation $D$. It can therefore be used to label or distinguish irreducible representations.

As an example let us look at $\mathfrak{su}(2)$ with the standard basis
$$[X_a, X_b] = \epsilon_{abc} X_c. \tag{5.26}$$
In the exercises you will determine that the corresponding Cartan-Killing metric is
$$\gamma_{ab} = -2 \delta_{ab}. \tag{5.27}$$
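The value (5.27) can be cross-checked by evaluating the last expression in (5.9), $\gamma_{ab} = f_{adc} f_{bcd}$, with $f_{abc} = \epsilon_{abc}$. The short sketch below does exactly this; the 0-based indices and the helper name `levi_civita` are conventions chosen here.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}, eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2

idx = range(3)
f = [[[levi_civita(a, b, c) for c in idx] for b in idx] for a in idx]

# Cartan-Killing metric (5.9): gamma_ab = f_adc f_bcd, summed over c and d.
gamma = [[sum(f[a][d][c] * f[b][c][d] for c in idx for d in idx) for b in idx]
         for a in idx]
print(gamma)
```

The result is minus twice the identity matrix, so the metric is negative definite and $\mathfrak{su}(2)$ is compact in the sense of Definition 5.4.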

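The Casimir operator is therefore
$$C = \gamma^{ab} X_a X_b = -\frac{1}{2} \left( X_1^2 + X_2^2 + X_3^2 \right) \overset{J_a = i X_a}{=} \frac{1}{2} \left( J_1^2 + J_2^2 + J_3^2 \right). \tag{5.28}$$
Given what we know about the total angular momentum (or spin) in quantum mechanics it should not come as a surprise that it commutes with the generators $J_a$. To compute the value $C_j$ in the spin-$j$ representation it is sufficient to focus on a single state, say the highest-weight state $|j\rangle = |j, m = j\rangle$. Using that
$$J_+ J_- + J_- J_+ \overset{(4.7)}{=} \frac{1}{2} (J_1 + i J_2)(J_1 - i J_2) + \frac{1}{2} (J_1 - i J_2)(J_1 + i J_2) = J_1^2 + J_2^2, \tag{5.29}$$
we find
$$C |j\rangle \overset{(5.28)}{=} \frac{1}{2} \left( J_3^2 + J_+ J_- + J_- J_+ \right) |j\rangle = \frac{1}{2} \left( J_3^2 + [J_+, J_-] + 2 J_- J_+ \right) |j\rangle \tag{5.30}$$
$$\overset{(4.8)}{=} \frac{1}{2} \left( J_3^2 + J_3 + 2 J_- J_+ \right) |j\rangle \overset{(4.11)}{=} \frac{1}{2} \left( J_3^2 + J_3 \right) |j\rangle, \tag{5.31}$$
and therefore $C_j = \tfrac{1}{2} j (j + 1)$. We see that the Casimir operator indeed distinguishes between the different irreducible representations of $\mathfrak{su}(2)$.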
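As a quick check of this value (a worked example added here), take the fundamental representation $j = 1/2$, where $J_a = \tfrac{1}{2}\sigma_a$ and $\sigma_a^2 = 1$ by (4.3):

```latex
C = \tfrac{1}{2}\left(J_1^2 + J_2^2 + J_3^2\right)
  = \tfrac{1}{2}\cdot\tfrac{1}{4}\left(\sigma_1^2 + \sigma_2^2 + \sigma_3^2\right)
  = \tfrac{3}{8}\,1
  = \tfrac{1}{2}\cdot\tfrac{1}{2}\left(\tfrac{1}{2} + 1\right) 1 .
```

This agrees with $C_j = \tfrac{1}{2} j(j+1)$ at $j = 1/2$, and $C$ is indeed proportional to the identity as required by (5.25).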

For larger Lie algebras, like $\mathfrak{su}(N)$ with $N \geq 3$, this is no longer the case since irreducible representations are typically labeled by more than one parameter. Luckily in general the quadratic Casimir operator $C$ is not the only operator that commutes with all group elements. Further Casimir invariants that are polynomials in $X_a$ of order higher than two can then be constructed. It turns out that all Casimir invariants together do distinguish the irreducible representations.

6 Root systems, simple roots and the Cartan matrix

6.1 Cartan subalgebra

Central to the classification of Lie algebras and their representations is the selection of a maximal set of commuting generators. Mathematically this is described by the Cartan subalgebra of a Lie algebra. First we need to know what a subalgebra is.

Definition 6.1 (Subalgebra). A subalgebra $\mathfrak{h}$ of a Lie algebra $\mathfrak{g}$ is a linear subspace $\mathfrak{h} \subset \mathfrak{g}$ on which the Lie bracket closes: $[X, Y] \in \mathfrak{h}$ for all $X, Y \in \mathfrak{h}$. This can also be denoted
$$[\mathfrak{h}, \mathfrak{h}] \subset \mathfrak{h}. \tag{6.1}$$
If $H_i \in \mathfrak{g}$ for $i = 1, \ldots, m$ form a collection of commuting Hermitian generators, $[H_i, H_j] = 0$ for all $i$ and $j$, then the $H_i$ are called Cartan generators. Trivially they span a subalgebra of $\mathfrak{g}$.

Remark 6.1. Here you may rightfully object that the Lie algebras $\mathfrak{g}$ we have encountered so far do not contain any Hermitian generators (except the trivial $0 \in \mathfrak{g}$). Indeed, in the case of the special unitary and orthogonal groups, the generators $X_a$ were antihermitian, $X_a^\dagger = -X_a$. In order to find Hermitian generators we have to allow taking complex linear combinations $\beta_a X_a$, $\beta_a \in \mathbb{C}$, instead of just real linear combinations $\alpha_a X_a \in \mathfrak{g}$, $\alpha_a \in \mathbb{R}$. In mathematical terms such complex linear combinations take values in the complexified Lie algebra $\mathfrak{g}_{\mathbb{C}} = \mathfrak{g} + i \mathfrak{g}$. Note that we have already implicitly used such generators in Section 4.2 when introducing the lowering and raising operators $J_\pm = \frac{i}{\sqrt{2}} (X_1 \pm i X_2)$, which are elements of $\mathfrak{g}_{\mathbb{C}}$ but not of $\mathfrak{g}$ (they are not antihermitian). When dealing with representation theory and classification of Lie algebras it is typically a lot easier to work with $\mathfrak{g}_{\mathbb{C}}$. We will do so implicitly in these lectures even when we write just $\mathfrak{g}$!

Definition 6.2 (Cartan subalgebra). A subalgebra h of a Lie algebra g is a Cartan subalgebraof g if [h, h] = 0 and it is maximal, in the sense that there are no elements X ∈ g \ h such that[h, X] = 0.

Although a Lie algebra g can have many Cartan subalgebras, they are all equivalent in the sensethat they can be related by a basis transformation of g. Therefore we often talk about the Cartansubalgebra of g.

Definition 6.3 (Rank). The rank of g is the dimension of its Cartan subalgebra.

The Cartan generators Hi form a maximal set of generators that can be simultaneously diagonal-ized. Hence, we may choose a basis {|µ, x〉} of states such that

Hi|µ, x〉 = µi|µ, x〉, (6.2)

where x is some additional label to specify the state (in case of multiplicities).

Definition 6.4 (Weights). The vector of eigenvalues µ = (µ1, . . . , µr) of a state is called a weight or weight vector.

Example 6.1 (su(2)). The rank of su(2) is 1 since no two generators commute. The Cartan generator (spanning the Cartan subalgebra of su(2)) is typically chosen to be J3 = iX3. The weights of the spin-j representation of su(2) are simply j, j − 1, . . . ,−j.

6.2 Roots

For the remainder of this Chapter we focus on the adjoint representation of a compact Lie algebra g, where states |X〉 are labeled by generators X ∈ g. Suppose g has rank r. States associated to Cartan generators have weight vector (0, . . . , 0) ∈ Rr, since for all i, j = 1, . . . , r we have

$$H_i|H_j\rangle \overset{(5.5)}{=} |[H_i, H_j]\rangle \overset{\text{Cartan}}{=} 0. \tag{6.3}$$

Conversely, by the maximality criterion (Definition 6.2), any state with zero weight vector is a Cartan generator.

The non-Cartan states can be taken to be simultaneous eigenstates of the Cartan generators

Hi|Eα〉 = αi|Eα〉 for all i = 1, . . . , r. (6.4)

Later we will see that the eigenstate |Eα〉 is uniquely specified (up to scalar multiplication) by the weight vector α = (α1, . . . , αr).

Definition 6.5 (Roots). A nonvanishing weight vector α = (α1, . . . , αr) of the adjoint representation is called a root.

At the level of g, (6.4) implies that

[Hi, Eα] = αiEα. (6.5)

Thus the generators Eα act similarly to the raising and lowering operators we saw in the case of su(2). Notice that, contrary to the Cartan generators, the generators Eα cannot be Hermitian. Indeed, taking the conjugate of (6.5) we find

[Hi, E†α] = −αiE†α. (6.6)

Instead we can choose the normalization such that E†α = E−α, similarly to the relation J†+ = J− between the raising and lowering operators in su(2).

The Cartan-Killing metric induces a scalar product on the space of states

$$\langle Y | X \rangle = \mathrm{tr}\left(Y^\dagger X\right), \tag{6.7}$$

where we have included the conjugate transpose since we wish to compute inner products of non-Hermitian operators Eα as well. The states |Hi〉 and |Eα〉 can then be normalized such that

〈Hi|Hj〉 = δij , 〈Eα|Eβ〉 = δαβ, (6.8)

where $\delta_{\alpha\beta} := \prod_{i=1}^{r} \delta_{\alpha_i\beta_i}$. The set {Hi} ∪ {Eα} then forms an orthonormal basis of g.

6.3 Root system

It turns out that the collection of root vectors α satisfies very rigid conditions, which we capture in the following definition. Here we use the notation |α|² = α · α and α · β = αiβi.

Definition 6.6 (Root system). A root system Φ of rank r is a finite subset of non-zero vectors in Rr satisfying

(i) Φ spans Rr.

(ii) The only scalar multiples of α ∈ Φ that also belong to Φ are ±α.

(iii) For any α, β ∈ Φ the ratio (α · β)/(α · α) is a half-integer, i.e. an integer multiple of 1/2.

(iv) For every α, β ∈ Φ the Weyl reflection β − 2 (α · β)/(α · α) α of β in the hyperplane perpendicular to α is also in Φ.
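The four conditions of Definition 6.6 can be tested mechanically on a candidate set of vectors. The following is a minimal sketch for rank 2 only, applied to the six roots ±α1, ±α2, ±α3 of su(3) listed in (7.7). The integer encoding (a, b) ↔ (a/2, b√3/2) and the function names are assumptions made here to keep all inner products exact; they are not notation from these notes.

```python
from fractions import Fraction
from itertools import product

def dot(u, v):
    # Hypothetical encoding: the pair (a, b) stands for the vector (a/2, b*sqrt(3)/2),
    # so the Euclidean inner product is the exact rational number (a*a' + 3*b*b')/4.
    return Fraction(u[0] * v[0] + 3 * u[1] * v[1], 4)

def det(u, v):
    # Vanishes exactly when u and v are parallel.
    return u[0] * v[1] - u[1] * v[0]

def is_root_system(roots):
    """Check properties (i)-(iv) of Definition 6.6 for a set of vectors in the plane."""
    roots = set(roots)
    # (i) the roots span R^2 if some pair is not parallel
    ok = any(det(a, b) != 0 for a, b in product(roots, repeat=2))
    for a, b in product(roots, repeat=2):
        # (ii) the only multiples of a in the set are +a and -a
        if det(a, b) == 0 and b not in (a, (-a[0], -a[1])):
            ok = False
        # (iii) a.b / a.a is a half-integer, i.e. n = 2 a.b / a.a is an integer
        n = 2 * dot(a, b) / dot(a, a)
        if n.denominator != 1:
            ok = False
            continue
        # (iv) the Weyl reflection b - n a is again in the set
        reflected = (b[0] - int(n) * a[0], b[1] - int(n) * a[1])
        if reflected not in roots:
            ok = False
    return ok

# alpha1 = (1, 0), alpha2 = (1/2, sqrt(3)/2), alpha3 = (-1/2, sqrt(3)/2) in the encoding above
alphas = [(2, 0), (1, 1), (-1, 1)]
su3_roots = alphas + [(-a, -b) for a, b in alphas]

print(is_root_system(su3_roots))       # all six roots
print(is_root_system(su3_roots[:-1]))  # one root removed: a reflection leaves the set
```

Dropping a single root breaks property (iv), which illustrates how rigid the conditions are.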

We will now show that

Theorem 6.1. The set {α} of roots of a compact Lie algebra g forms a root system.

We will verify the properties one by one. Property (i) follows from the compactness of g:

Proof of property (i). Suppose the roots do not span Rr. Then there exists a nonzero vector βi such that β · α = 0 for all α ∈ Φ. But that means that all states |Eα〉 are annihilated by βiHi, βiHi|Eα〉 = β · α|Eα〉 = 0, and the same is true by definition for the states |Hj〉. Since those states together form a basis of g, it follows that [βiHi, g] = 0. Since βiHi is represented by the zero matrix in the adjoint representation, it has vanishing norm in the Cartan-Killing metric, in contradiction with the compactness of g.

The next steps rely on the fact that to each pair ±α of roots one may associate an su(2)-subalgebra. If α is a root, then [Eα, E−α] ∈ h has to be in the Cartan subalgebra h. To see this we use the Lie bracket (6.5) to compute

Hi|[Eα, E−α]〉 = HiEα|E−α〉 = (EαHi + αiEα)|E−α〉 = (−αi + αi)Eα|E−α〉 = 0. (6.9)

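On the other hand, using (6.7) and (6.8),

$$
\begin{aligned}
\langle H_i | [E_\alpha, E_{-\alpha}] \rangle &= \mathrm{tr}(H_i [E_\alpha, E_{-\alpha}]) \\
&= \mathrm{tr}(H_i E_\alpha E_{-\alpha}) - \mathrm{tr}(H_i E_{-\alpha} E_\alpha) \\
&= \mathrm{tr}(E_{-\alpha} [H_i, E_\alpha]) \\
&= \alpha_i\, \mathrm{tr}(E_{-\alpha} E_\alpha) \\
&= \alpha_i \langle E_\alpha | E_\alpha \rangle = \alpha_i.
\end{aligned}
\tag{6.10}
$$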

Since the Hi form an orthonormal basis of the Cartan subalgebra we conclude that

[Eα, E−α] = αiHi. (6.11)

On the basis of this relation we introduce the su(2) generators E± and E3 by

$$E_\pm := |\alpha|^{-1} E_{\pm\alpha}, \qquad E_3 := |\alpha|^{-2} \alpha_i H_i. \tag{6.12}$$

It is straightforward to check that they indeed satisfy the su(2) algebra

[E+, E−] = E3, [E3, E±] = ±E±. (6.13)

Let us use this structure to verify that the root vector α specifies the generator Eα uniquely. Suppose that there exists another state |E′α〉 with root vector α orthogonal to |Eα〉. Then the state E−α|E′α〉 has zero weight vector and is thus a Cartan generator, which can be identified following a computation similar to (6.10):

$$\langle H_i | E_{-\alpha} | E'_\alpha \rangle = -\alpha_i\, \mathrm{tr}(E_{-\alpha} E'_\alpha) = 0. \tag{6.14}$$

Thus the projection on the Cartan subalgebra vanishes. As a consequence |E′α〉 must be a lowest weight state, E−|E′α〉 = 0. But

$$E_3 |E'_\alpha\rangle = |\alpha|^{-2} \alpha_i H_i |E'_\alpha\rangle = |E'_\alpha\rangle. \tag{6.15}$$

Thus the state |E′α〉 has an E3-eigenvalue m = +1. This is in contradiction with the fact shown in Chapter 4 that the lowest weight states of an su(2)-representation must have a non-positive E3-eigenvalue (namely −j in the spin-j representation). Thus we conclude that one cannot have two generators Eα and E′α corresponding to the same root. Similar arguments lead to the following.

Proof of property (iii). Let us consider a root β different from α. From (6.12) it follows that

$$E_3 |E_\beta\rangle = |\alpha|^{-2} \alpha_i H_i |E_\beta\rangle = \frac{\alpha \cdot \beta}{\alpha \cdot \alpha} |E_\beta\rangle. \tag{6.16}$$

But from the su(2) representation theory the eigenvalue (α · β)/(α · α) has to be half-integer.

Proof of property (ii). Notice that it is sufficient to show that if α is a root then tα cannot be a root for t > 1. To this end, suppose to the contrary that α′ = tα is a root. Applying property (iii) to α and tα we find that both t and 1/t must be half-integers, so the only option is t = 2.

Let us consider the su(2)-subalgebra (6.12) associated to the root α. The state |Eα′〉 has E3-eigenvalue 2 and therefore it has a component in a spin-j representation for some integer j ≥ 2. Acting with the lowering operator E− on this component we construct a state with root vector α. But this state is not a multiple of |Eα〉, since |Eα〉 lives in the spin-1 representation (because E+|Eα〉 = 0 and E3|Eα〉 = |Eα〉). We have shown that this is impossible, finishing our proof by contradiction.

Proof of property (iv). Suppose α and β are two distinct roots. The state |Eβ〉 has E3-eigenvalue m = (α · β)/(α · α) with respect to the su(2)-representation (6.12), with m half-integer. Hence |Eβ〉 can be decomposed into components transforming in various spin-j representations for j ≥ |m|. If m ≥ 0 then acting 2m times with the lowering operator produces a non-vanishing state $E_-^{2m}|E_\beta\rangle$ satisfying

$$H_i E_-^{2m} |E_\beta\rangle = -2m\alpha_i E_-^{2m} |E_\beta\rangle + E_-^{2m} H_i |E_\beta\rangle = (\beta_i - 2m\alpha_i) E_-^{2m} |E_\beta\rangle. \tag{6.17}$$

It therefore has root equal to β − 2mα = β − 2(α · β)/(α · α) α. Similarly, if m < 0 one may act 2|m| times with the raising operator to produce a non-vanishing state $E_+^{2|m|}|E_\beta\rangle$ with root β + 2|m|α = β − 2(α · β)/(α · α) α.

We conclude that the roots of any compact Lie algebra g form a root system Φ with rank r equal to the rank of g, i.e. the dimension of its Cartan subalgebra. Although we will not prove this in detail, any root system Φ characterizes a unique compact Lie algebra.

Fact 6.1. Compact Lie algebras g are in 1-to-1 correspondence with root systems Φ.

6.4 Angles between roots

Property (iii) has a direct consequence for the angle θαβ between any two roots α, β ∈ Φ for which α ≠ ±β. Not only α · β/|α|² but also α · β/|β|² must be a half-integer. Hence, by the cosine rule

$$4\cos^2\theta_{\alpha\beta} = \frac{4(\alpha\cdot\beta)^2}{|\alpha|^2|\beta|^2} = \frac{2(\alpha\cdot\beta)}{|\alpha|^2}\,\frac{2(\alpha\cdot\beta)}{|\beta|^2} \in \mathbb{Z} \tag{6.18}$$

is an integer. By inspection one deduces that there are only a few possibilities for the angle θαβ and the length ratio |α|/|β|, which are summarized as follows:

4 cos² θαβ    θαβ             |α|²/|β|²
0             π/2             arbitrary
1             π/3, 2π/3       1
2             π/4, 3π/4       2 or 1/2
3             π/6, 5π/6       3 or 1/3
                                              (6.19)
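The "inspection" behind table (6.19) amounts to a small enumeration: the two integers 2(α · β)/|α|² and 2(α · β)/|β|² have the same sign, their product equals 4 cos² θαβ, and this product is at most 3 because α ≠ ±β. A minimal sketch of that enumeration follows; the names n_ab, n_ba and allowed_pairs are chosen here for illustration only.

```python
import math
from fractions import Fraction

def allowed_pairs():
    """Enumerate integer pairs (n_ab, n_ba) = (2 a.b/|a|^2, 2 a.b/|b|^2) allowed by (6.18).

    Returns rows (4 cos^2(theta), theta in degrees, |a|^2/|b|^2 or None if undetermined).
    """
    rows = []
    for n_ab in range(-3, 4):
        for n_ba in range(-3, 4):
            product = n_ab * n_ba
            # both integers carry the sign of a.b, and 4 cos^2(theta) < 4 for a != +-b
            if product < 0 or product > 3:
                continue
            # a.b = 0 forces both integers to vanish simultaneously
            if (n_ab == 0) != (n_ba == 0):
                continue
            cos_theta = math.copysign(math.sqrt(product) / 2, n_ab)
            theta = round(math.degrees(math.acos(cos_theta)))
            # n_ba / n_ab = |a|^2 / |b|^2; the ratio is arbitrary for orthogonal roots
            ratio = None if n_ab == 0 else Fraction(n_ba, n_ab)
            rows.append((product, theta, ratio))
    return rows

for row in sorted(allowed_pairs(), key=lambda r: (r[0], r[1])):
    print(row)
```

The angles produced are 90°, 60°/120°, 45°/135° and 30°/150°, with squared length ratios 1, 2 or 1/2, and 3 or 1/3 respectively, in agreement with (6.19).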

6.5 Simple roots

In order to proceed to classify root systems it is useful to adopt a convention as to which of the two roots ±α should be interpreted as a raising operator.

Definition 6.7 (Positive root). A root vector α = (α1, . . . , αr) is called positive if its first non-zero component is positive, and negative otherwise.

If α is positive we call |α|⁻¹Eα a raising operator, while if α is negative |α|⁻¹Eα is a lowering operator.

Definition 6.8 (Simple root). A simple root of g is a positive root that is not a sum of two other positive roots.

Simple roots will be highlighted with a hat on top of the root vector in the sequel.

Lemma 6.1. The simple roots satisfy the following important properties:

(i) A root system of rank r has precisely r simple roots, which are linearly independent. They thus form a basis of Rr.

(ii) All other roots can be obtained from successive Weyl reflections of simple roots (see property (iv) of the root system).

(iii) If α and β are simple roots then α− β is not a root.

(iv) If α and β are different simple roots, then their scalar product is nonpositive, α · β ≤ 0.

Part (i) is proved in the exercises and we skip the proof of (ii). The two remaining parts are easily deduced as follows.

Proof of Lemma 6.1(iii). Assume that α and β are simple roots and α − β is a root. Then either α − β or β − α is a positive root. But then either α = (α − β) + β or β = (β − α) + α can be written as the sum of two positive roots. This contradicts the assumption that α and β are simple.

Proof of Lemma 6.1(iv). From (iii) it follows that E−α|Eβ〉 = |[E−α, Eβ]〉 = 0 since β − α cannot be a root. Hence |Eβ〉 is a lowest-weight state in the su(2)-representation of α, but then its E3-eigenvalue α · β/(α · α) cannot be positive.
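Definitions 6.7 and 6.8 translate directly into a procedure for extracting the simple roots from a list of roots. The sketch below applies it to the six roots of su(3) from (7.7) and also checks parts (iii) and (iv) of Lemma 6.1. As an assumption made for exactness, a pair (a, b) encodes the vector (a/2, b√3/2); this rescaling does not affect the sign of any component, so the positivity convention is unchanged.

```python
from fractions import Fraction

def dot(u, v):
    # (a, b) stands for (a/2, b*sqrt(3)/2), hence u.v = (a*a' + 3*b*b')/4 (hypothetical encoding)
    return Fraction(u[0] * v[0] + 3 * u[1] * v[1], 4)

def is_positive(root):
    """Definition 6.7: the first non-zero component is positive."""
    for component in root:
        if component != 0:
            return component > 0
    raise ValueError("the zero vector is not a root")

def simple_roots(roots):
    """Definition 6.8: positive roots that are not a sum of two positive roots."""
    positive = [r for r in roots if is_positive(r)]
    sums = {(a[0] + b[0], a[1] + b[1]) for a in positive for b in positive}
    return sorted(r for r in positive if r not in sums)

alphas = [(2, 0), (1, 1), (-1, 1)]                    # alpha1, alpha2, alpha3 of (7.7)
su3_roots = alphas + [(-a, -b) for a, b in alphas]

simple = simple_roots(su3_roots)
print(simple)                                         # encodes (1/2, -sqrt(3)/2) and (1/2, sqrt(3)/2)

first, second = simple
difference = (first[0] - second[0], first[1] - second[1])
print(difference in su3_roots)                        # Lemma 6.1(iii): expected False
print(dot(first, second))                             # Lemma 6.1(iv): expected non-positive
```

The two vectors found agree with the simple roots quoted in Example 6.2.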

The information about the r simple roots αi, i = 1, . . . , r is conveniently stored in the Cartan matrix:

Definition 6.9 (Cartan matrix). Given the r simple roots of a simple, compact Lie algebra, the r × r-dimensional Cartan matrix A is defined by the matrix elements

$$A_{ij} = \frac{2\,\alpha^i \cdot \alpha^j}{|\alpha^j|^2}. \tag{6.20}$$

From the definition it follows that

• the diagonal elements of the Cartan matrix are always 2.

• the off-diagonal elements encode the angles and relative lengths of the simple roots. Assuming Aij ≤ Aji the possible values are

(Aij , Aji)    (0, 0)    (−1,−1)    (−2,−1)    (−3,−1)
angle          π/2       2π/3       3π/4       5π/6
                                                          (6.21)

Example 6.2 (Cartan matrix of su(3)). The two simple roots are

$$\alpha^1 = \left(\tfrac12, \tfrac{\sqrt3}{2}\right), \qquad \alpha^2 = \left(\tfrac12, -\tfrac{\sqrt3}{2}\right). \tag{6.22}$$

Evaluating the scalar products of the roots gives

$$|\alpha^1|^2 = |\alpha^2|^2 = 1, \qquad \alpha^1 \cdot \alpha^2 = -\tfrac12. \tag{6.23}$$

Substituting these values into the definition of the Cartan matrix (6.20) gives

$$A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}. \tag{6.24}$$
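The computation of Example 6.2 can be repeated mechanically from definition (6.20). This is a small sketch under the assumption (made here, not in the notes) that a simple root (x, y) is stored as the integer pair (2x, 2y/√3), so that the inner products (6.23) come out as exact fractions.

```python
from fractions import Fraction

def dot(u, v):
    # (a, b) stands for the vector (a/2, b*sqrt(3)/2); hypothetical exact encoding
    return Fraction(u[0] * v[0] + 3 * u[1] * v[1], 4)

def cartan_matrix(simple):
    """A_ij = 2 (alpha^i . alpha^j) / |alpha^j|^2, following (6.20)."""
    return [[int(2 * dot(a, b) / dot(b, b)) for b in simple] for a in simple]

# alpha^1 = (1/2, sqrt(3)/2) and alpha^2 = (1/2, -sqrt(3)/2) from (6.22)
su3_simple = [(1, 1), (1, -1)]

print(dot(su3_simple[0], su3_simple[0]), dot(su3_simple[0], su3_simple[1]))  # 1 and -1/2, cf. (6.23)
print(cartan_matrix(su3_simple))                                             # cf. (6.24)
```

Since (6.20) only involves ratios of inner products, the result is insensitive to an overall rescaling or rotation of the simple roots.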

7 Irreducible representations of su(3) and the quark model

After SU(2), whose representations we have already studied, SU(3) is arguably the most important (simple) compact Lie group in particle physics. The simplest application is the quark model that describes mesons and baryons in terms of their constituent quarks. The main idea is that the three lightest quarks, the up, down and strange quark, are related by SU(3) flavour symmetry, spanning a 3-dimensional vector space that transforms under the fundamental representation. Their anti-particles transform in the anti-fundamental representation (to be defined below). Any composite states of quarks and anti-quarks live in tensor products of these representations, which necessarily decompose into irreducible representations of SU(3). The similarities between the masses of certain mesons and baryons can be explained by recognizing the particles with similar masses as living in the same irreducible representation. To understand this grouping let us start by classifying the irreducible representations.

Remark 7.1. This approximate flavour-SU(3) symmetry is unrelated to the exact color-SU(3) symmetry of the Standard Model gauge group SU(3) × SU(2) × U(1) that was discovered later.

7.1 The algebra

As we have seen in Section 3.3, the fundamental representation of the Lie algebra su(3) is given by the traceless antihermitian 3 × 3 matrices,

$$\mathfrak{su}(3) = \left\{ X \in \mathbb{C}^{3\times 3} : X^\dagger = -X,\ \mathrm{tr}\,X = 0 \right\}.$$

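A basis of Hermitian traceless 3 × 3 matrices, analogous to the Pauli matrices of su(2), is given by the Gell-Mann matrices

$$
\begin{aligned}
\lambda_1 &= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_2 &= \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_3 &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \\
\lambda_4 &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, &
\lambda_5 &= \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, &
\lambda_6 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \\
\lambda_7 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, &
\lambda_8 &= \frac{1}{\sqrt3} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}.
\end{aligned}
\tag{7.1}
$$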

They are normalized such that tr(λaλb) = 2δab. It is customary to take the generators in the fundamental representation to be

$$X_a \equiv -i T_a \equiv -\tfrac{i}{2}\lambda_a, \qquad \mathrm{tr}(T_a T_b) = \tfrac12 \delta_{ab}, \qquad a, b = 1, \ldots, 8. \tag{7.2}$$

The structure constants fabc are completely antisymmetric and the only nonvanishing constants (with indices in ascending order) are

$$f_{123} = 1, \qquad f_{147} = f_{246} = f_{257} = f_{345} = -f_{156} = -f_{367} = \tfrac12, \qquad f_{458} = f_{678} = \tfrac{\sqrt3}{2}. \tag{7.3}$$

Any other set of Hermitian matrices T1, . . . , T8 satisfying

[Ta, Tb] = ifabcTc (7.4)

therefore determines a representation Xa ≡ −iTa of su(3) (and of SU(3) via the exponential map).
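The list (7.3) can be reproduced numerically from the Gell-Mann matrices (7.1). The sketch below relies on the identity tr([λa, λb]λc) = 4i fabc, which follows from (7.4) with Ta = λa/2 together with the normalization tr(λaλb) = 2δab. The helper names matmul, trace and f are illustrative choices, and plain Python lists are used in place of a matrix library.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def trace(A):
    return sum(A[i][i] for i in range(3))

s = 1 / math.sqrt(3)
# Gell-Mann matrices lambda_1, ..., lambda_8 as in (7.1)
lam = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
]

def f(a, b, c):
    """Structure constant f_abc for indices 1..8, using tr([l_a, l_b] l_c) = 4i f_abc."""
    A, B, C = lam[a - 1], lam[b - 1], lam[c - 1]
    AB, BA = matmul(A, B), matmul(B, A)
    comm = [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]
    return (trace(matmul(comm, C)) / 4j).real

# Collect the non-vanishing constants with indices in ascending order
nonzero = {
    (a, b, c): round(f(a, b, c), 6)
    for a in range(1, 9) for b in range(a + 1, 9) for c in range(b + 1, 9)
    if abs(f(a, b, c)) > 1e-12
}
for indices, value in sorted(nonzero.items()):
    print(indices, value)
```

The loop prints nine index triples, matching the nine constants listed in (7.3).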

7.2 Cartan subalgebra and roots

By examining the structure constants (7.3) one may observe that at most two generators can be found to commute, meaning that the Lie algebra su(3) is of rank 2. The Cartan subalgebra is conventionally taken to be spanned by the generators H1 = T3 and H2 = T8, which in the fundamental representation are diagonal. These generators have clear physical interpretations in the quark model, since

$$I_3 = H_1 = T_3, \qquad Y = \tfrac{2}{\sqrt3} H_2 = \tfrac{2}{\sqrt3} T_8, \qquad Q = I_3 + \tfrac12 Y \tag{7.5}$$

are respectively the isospin, hypercharge and electrical charge generators of the quarks.

The remaining generators can be combined into raising and lowering operators Eα,

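$$E_{\pm\alpha^1} = \tfrac{1}{\sqrt2}(T_1 \pm i T_2), \qquad E_{\pm\alpha^2} = \tfrac{1}{\sqrt2}(T_4 \pm i T_5), \qquad E_{\pm\alpha^3} = \tfrac{1}{\sqrt2}(T_6 \pm i T_7) \tag{7.6}$$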

with corresponding roots

$$\alpha^1 = (1, 0), \qquad \alpha^2 = \left(\tfrac12, \tfrac12\sqrt3\right), \qquad \alpha^3 = \left(-\tfrac12, \tfrac12\sqrt3\right). \tag{7.7}$$

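The root diagram therefore looks as follows: the six roots ±α1, ±α2, ±α3 all have unit length and lie at the corners of a regular hexagon centred at the origin.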
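The roots (7.7) can be checked directly against the defining relation (6.5), [Hi, Eα] = αiEα, and against (6.11), [Eα, E−α] = αiHi. The sketch below works in the fundamental representation and uses the observation, which follows from (7.1), that T1 + iT2, T4 + iT5 and T6 + iT7 are the elementary matrices with a single entry 1 in positions (1,2), (1,3) and (2,3). All helper names are illustrative.

```python
import math

def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]

def unit(j, k, value):
    # matrix with a single non-zero entry `value` in row j, column k (0-based)
    m = [[0.0] * 3 for _ in range(3)]
    m[j][k] = value
    return m

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

c = 1 / (2 * math.sqrt(3))
H = [diag(0.5, -0.5, 0.0), diag(c, c, -2 * c)]        # H1 = T3 and H2 = T8
r = 1 / math.sqrt(2)
E = [unit(0, 1, r), unit(0, 2, r), unit(1, 2, r)]     # E_alpha1, E_alpha2, E_alpha3 of (7.6)

def root_of(Ea):
    """Read off (alpha_1, alpha_2) from [H_i, E] = alpha_i E, checking every matrix entry."""
    (j, k), = [(j, k) for j in range(3) for k in range(3) if Ea[j][k] != 0.0]
    alpha = []
    for Hi in H:
        comm = commutator(Hi, Ea)
        a = comm[j][k] / Ea[j][k]
        assert all(math.isclose(comm[x][y], a * Ea[x][y], abs_tol=1e-12)
                   for x in range(3) for y in range(3))
        alpha.append(a)
    return tuple(alpha)

def satisfies_6_11(Ea, alpha):
    """Check [E_alpha, E_-alpha] = alpha_i H_i, with E_-alpha the (real) transpose of E_alpha."""
    lhs = commutator(Ea, transpose(Ea))
    rhs = [[alpha[0] * H[0][x][y] + alpha[1] * H[1][x][y] for y in range(3)] for x in range(3)]
    return all(math.isclose(lhs[x][y], rhs[x][y], abs_tol=1e-12)
               for x in range(3) for y in range(3))

roots = [root_of(Ea) for Ea in E]
print(roots)
print([satisfies_6_11(Ea, alpha) for Ea, alpha in zip(E, roots)])
```

Because H1 and H2 are diagonal here, each root is simply a difference of two diagonal entries, which is why the three raising operators produce three distinct roots.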

7.3 Representations

Let us assume T1, . . . , T8 are Hermitian and form an irreducible representation of su(3).

Recall from (6.13) that for each root αk, the operators E±αk and αkiHi form an su(2)-subalgebra

[Eαk , E−αk ] = αkiHi, [αkiHi, E±αk ] = ±E±αk , (7.8)

which may or may not be reducible (note that |αk| = 1). In any case our knowledge about su(2)-representations implies that the eigenvalues of αkiHi are half-integers. These eigenvalues are precisely the inner products αk · µ of the roots with the weights µ = (µ1, µ2) ∈ R2 of the representation. Hence, the weights must lie on the vertices of the triangular lattice depicted in the figure below. If µ is a weight, meaning that Hi|Ψ〉 = µi|Ψ〉 for some state |Ψ〉, then E±αk |Ψ〉 is a state with weight µ ± αk provided E±αk |Ψ〉 ≠ 0. Using a similar reasoning as for the roots in Section 6.3 it follows that if αk · µ > 0 then all of µ, µ − αk, µ − 2αk, . . . , µ − 2(αk · µ)αk are also weights. In particular, the set of weights is invariant under the Weyl reflections µ → µ − 2(αk · µ)αk, i.e. reflections in the lines orthogonal to the roots α1, α2 and α3. Moreover, any two states with different weights can be obtained from each other by actions of the operators E±αk , because the collection of all states accessible from some initial state spans an invariant subspace, which has to be the full vector space because of irreducibility. Therefore any two weights must differ by an integer linear combination of the roots αk. Combining these observations we conclude that the set of weights necessarily occupies a convex polygon that is invariant under Weyl reflections (thus either a hexagon or a triangle), like in the following figure.

7.4 Highest weight states

In Chapter 4 we have seen that irreducible representations of su(2) are characterized by their highest weights (j in the spin-j representation). Something similar is true for su(3), but now the weights are two-dimensional so we need to specify what it means for a weight to be highest. To this end we make use of the simple roots of su(3) which we have determined to be

$$\hat\alpha^1 = \alpha^2 = \left(\tfrac12, \tfrac12\sqrt3\right), \qquad \hat\alpha^2 = -\alpha^3 = \left(\tfrac12, -\tfrac12\sqrt3\right) \tag{7.9}$$

in Example 6.2.

Definition 7.1 (Highest weight). A state |Ψ〉 that is annihilated by the raising operators Eα̂1 and Eα̂2 associated to the simple roots is called a highest weight state. Its weight is called the highest weight.

Any irreducible representation can be shown to have a unique highest weight with multiplicity one. In the weight diagram above it corresponds to the right-most weight (i.e. largest H1-eigenvalue).

Definition 7.2 (Label). The label (p, q) of an irreducible representation is given in terms of the highest weight µ by

$$p = 2\,\hat\alpha^1 \cdot \mu, \qquad q = 2\,\hat\alpha^2 \cdot \mu. \tag{7.10}$$

The irreducible representations of su(3) satisfy the following properties, which we do not prove here.

• Each label (p, q), p, q = 0, 1, 2, . . ., corresponds to a unique irreducible representation of su(3).

• The set of weights of the representation with label (p, q) can be determined as follows. By inverting (7.10) the highest weight is µ = (½(p + q), (p − q)/(2√3)). The Weyl reflections of µ determine the corners of a convex polygon P, that is either a hexagon or a triangle. The complete set of weights then corresponds to those vectors of the form µ + n1α̂1 + n2α̂2 with n1, n2 ∈ Z that are contained in P.

• The multiplicities of the weights (i.e. the number of linearly independent states with that weight) can be deduced from the diagram. The weights occur in nested “rings” that are either hexagonal (the outer ones) or triangular (the inner ones). The weights on the outer ring have multiplicity 1. Each time one moves inwards from a hexagonal ring to the next ring the multiplicity increases by 1. Inside triangular rings the multiplicities do not change anymore.

Examples of weight diagrams with small labels are shown below. The highest weights are circled (in blue), and multiplicities are indicated on the weights when they are larger than 1.

As one may verify for these examples, the dimension of the representation is generally given by

$$N = \tfrac12 (p+1)(q+1)(p+q+2). \tag{7.11}$$
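The dimension formula (7.11) can be cross-checked against the ring rule for multiplicities stated above. The sketch below rests on a counting assumption that is not spelled out in the notes: for k < min(p, q) the k-th ring is a hexagon whose sides consist of p − k and q − k steps, so it carries 3(p + q − 2k) weights of multiplicity k + 1, and what remains is a filled triangle with |p − q| + 1 weights per side, all of multiplicity min(p, q) + 1.

```python
def dimension(p, q):
    """Dimension of the irreducible representation (p, q) according to (7.11)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2

def dimension_from_rings(p, q):
    """Sum the multiplicities ring by ring (counting rule as assumed in the lead-in)."""
    m, d = min(p, q), abs(p - q)
    total = 0
    for k in range(m):
        # k-th hexagonal ring: 3(p + q - 2k) weights, each with multiplicity k + 1
        total += (k + 1) * 3 * (p + q - 2 * k)
    # remaining filled triangle: (d + 1)(d + 2)/2 weights, each with multiplicity m + 1
    total += (m + 1) * (d + 1) * (d + 2) // 2
    return total

for label in [(1, 0), (0, 1), (1, 1), (3, 0), (2, 1), (2, 2)]:
    print(label, dimension(*label), dimension_from_rings(*label))
```

For the labels (1, 0), (1, 1) and (3, 0) this gives 3, 8 and 10, the dimensions of the triplet, octet and decuplet.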

For p ≠ q the two irreducible representations (p, q) and (q, p) are not equivalent but related by complex conjugation. To see this, let Ta be the generators of the irreducible (p, q)-representation. Since [Ta, Tb] = ifabcTc with real structure constants fabc we have that

$$[-T_a^*, -T_b^*] = [T_a, T_b]^* = i f_{abc} (-T_c^*), \tag{7.12}$$

showing that the matrices $-T_a^*$ also generate a representation, which is automatically irreducible too. All eigenvalues of H1 = T3 and H2 = T8 have changed sign, thus changing the set of weights to those of the (q, p) representation. The irreducible representations are often denoted by their dimension (7.11) in bold face N instead of their label (p, q), where if p > q the representation N corresponds to (p, q) and N̄ to (q, p).

Let us discuss the interpretation of some of these representations in the quark model.

• 3, (p, q) = (1, 0): This is the fundamental representation (7.2) of su(3) as can be easily checked by determining the eigenvalues of H1 = ½λ3 and H2 = ½λ8. The three basis states, |u〉, |d〉, |s〉 correspond to the up, down and strange quark respectively. Their arrangement determines the values of the isospin and charge via (7.5).

• 3̄, (p, q) = (0, 1): the complex conjugate representation or anti-fundamental representation. Since the corresponding states have opposite quantum numbers as compared to the fundamental representation, it stands to reason that they correspond to the anti-quarks |ū〉, |d̄〉, |s̄〉.

• 8, (p, q) = (1, 1): this is the octet or adjoint representation, since the non-zero weights are exactly the roots of su(3). It appears in the decomposition 3 ⊗ 3̄ = 8 ⊕ 1 of the tensor product of the fundamental and anti-fundamental representation, and therefore describes bound states of a quark with an anti-quark, which are mesons. It also appears in the decomposition 3 ⊗ 3 ⊗ 3 = 10 ⊕ 8 ⊕ 8 ⊕ 1 of the tensor product of three fundamental representations, describing bound states of three quarks, which are baryons. The baryons in the adjoint representation have spin 1/2.

• 10, (p, q) = (3, 0): the decuplet representation also appears in the decomposition of 3 ⊗ 3 ⊗ 3 and describes the spin-3/2 baryons.
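For the fundamental representation the quantum numbers in (7.5) can be read off from the diagonal entries of T3 = ½λ3 and Y = (2/√3)T8 = diag(1, 1, −2)/3. The sketch below tabulates them with exact fractions. The quark content uud and udd of the proton and neutron is a standard fact that is assumed here rather than derived in these notes, and the helper names are illustrative.

```python
from fractions import Fraction as F

# Diagonal entries of I3 = T3 and Y = (2/sqrt(3)) T8 on the basis states |u>, |d>, |s>
I3 = {"u": F(1, 2), "d": F(-1, 2), "s": F(0)}
Y = {"u": F(1, 3), "d": F(1, 3), "s": F(-2, 3)}

def charge(quark):
    """Electric charge Q = I3 + Y/2 from (7.5), in units of the elementary charge."""
    return I3[quark] + Y[quark] / 2

def total_charge(quarks):
    # charges are additive for composite states in a tensor product
    return sum(charge(q) for q in quarks)

for quark in "uds":
    print(quark, I3[quark], Y[quark], charge(quark))

# Assumed quark content: proton = uud, neutron = udd
print(total_charge("uud"), total_charge("udd"))
```

The anti-quarks in the 3̄ carry the opposite values, since all eigenvalues of H1 and H2 change sign under complex conjugation.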

We close this section with several remarks concerning the quark model.

• The existence of quarks was unknown in the early sixties. It is on the basis of the nice grouping (the Eightfold way) of the discovered mesons and baryons into octets and decuplets that Gell-Mann and Zweig (independently in 1964) postulated the existence of more fundamental particles living in the fundamental representation of su(3) (dubbed “quarks” by Gell-Mann).

• The su(3) flavor symmetry is only an approximate symmetry of the standard model (contrary to the su(3) color symmetry of QCD) because, among other reasons, the quark masses are quite different (mu = 2.3 MeV, md = 4.8 MeV, ms = 95 MeV). Since the difference in masses between the up and down quark is much smaller than between the up and strange quark, one would expect that the symmetry relating the up and down quarks is more accurate. This smaller symmetry corresponds precisely to the su(2)-subalgebra associated to the root α1 = (1, 0), which has the isospin I3 as Cartan generator. Indeed, the masses of the mesons and baryons in the corresponding su(2)-multiplets (horizontal lines in the diagrams) are quite close, e.g. the proton and neutron among the spin-1/2 baryons.

8 Classification of compact, simple Lie algebras

8.1 Simple Lie algebras

Recall (Definition 5.4) that a Lie algebra is called compact if its Cartan-Killing metric is positive definite. An important result, that we will not prove, is that any compact Lie algebra g decomposes as a direct sum of compact, simple Lie algebras

g = g1 ⊕ · · · ⊕ gk. (8.1)

Hence, in order to classify compact Lie algebras it is sufficient to classify compact, simple Lie algebras and then consider all possible compositions. In order to introduce the concept of simple Lie algebras, we need a few definitions.

Recall that a subspace h ⊂ g is called a subalgebra of g if [h, h] ⊂ h. A stronger condition is the following.

Definition 8.1 (Invariant subalgebra). A subalgebra h of a Lie algebra g is called invariant if [X,Y ] ∈ h for all X ∈ h and Y ∈ g, i.e.

[h, g] ⊂ h. (8.2)

For a Lie algebra g with a basis X1, . . . , Xn of generators and corresponding structure constants fabc the k-dimensional subspace spanned by X1, . . . , Xk is a subalgebra precisely when

fijα = 0 for 1 ≤ i, j ≤ k, k < α ≤ n, (8.3)

while it is invariant if in addition

fiβα = 0 for 1 ≤ i ≤ k, k < α, β ≤ n. (8.4)

Definition 8.2 (Simple Lie algebra). A Lie algebra g is simple if it has no invariant subalgebras (other than {0} and g itself).

Equivalently, a Lie algebra is simple precisely when its adjoint representation is irreducible. Indeed, h is an invariant subalgebra precisely when X|Y 〉 = |[X,Y ]〉 ∈ h for any X ∈ g and Y ∈ h. This is precisely the condition for h ⊂ g to be an invariant subspace for the adjoint representation.

8.2 Dynkin diagrams

In the previous chapter we have seen that compact Lie algebras can be characterized in various equivalent ways (with increasing simplicity) by

• the root system Φ = {αi};

• the collection of simple roots {α̂i};

• the Cartan matrix Aij = 2 α̂i · α̂j / (α̂j · α̂j).

The Cartan matrix is often visualized via its Dynkin diagram.

Definition 8.3 (Dynkin diagram). The Dynkin diagram describing a Lie algebra is constructedas follows:

• simple roots are represented by nodes •

• max{|Aij |, |Aji|} gives the number of lines connecting the nodes representing the roots αi and αj .

• if the roots connected by lines have different lengths, one adds an arrow pointing towards the shorter root (hint: think of the arrow as a “greater than” symbol >).
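The rules of Definition 8.3 can be phrased as a short function that turns a Cartan matrix into a list of edges. This is only a sketch: the tuple format (i, j, number of lines, index of the shorter root or None) is a hypothetical representation chosen here. The arrow direction uses |Aij|/|Aji| = |αi|²/|αj|², which follows from (6.20).

```python
def dynkin_edges(A):
    """Edges of the Dynkin diagram encoded by the Cartan matrix A.

    Each edge is (i, j, lines, shorter), where `shorter` is the index of the shorter
    root (the node the arrow points to) or None if both roots have equal length.
    """
    edges = []
    rank = len(A)
    for i in range(rank):
        for j in range(i + 1, rank):
            lines = max(abs(A[i][j]), abs(A[j][i]))
            if lines == 0:
                continue  # orthogonal simple roots are not connected
            if abs(A[i][j]) > abs(A[j][i]):
                shorter = j   # |alpha_i| > |alpha_j|
            elif abs(A[i][j]) < abs(A[j][i]):
                shorter = i
            else:
                shorter = None
            edges.append((i, j, lines, shorter))
    return edges

su3 = [[2, -1], [-1, 2]]                 # Cartan matrix (6.24)
triple = [[2, -3], [-1, 2]]              # off-diagonal pair (-3, -1) from table (6.21)
two_nodes = [[2, 0], [0, 2]]             # two orthogonal simple roots

print(dynkin_edges(su3))
print(dynkin_edges(triple))
print(dynkin_edges(two_nodes))
```

The first matrix gives two nodes joined by a single line, the second a triple line with an arrow, and the third two disconnected nodes.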

In the exercises it is shown that the Dynkin diagram of a simple compact Lie algebra is connected. Otherwise the Dynkin diagram consists of a number of connected components corresponding to the simple compact Lie algebras into which it decomposes as a direct sum.

From the definition it follows that there is only one Dynkin diagram of rank 1 and that there are four diagrams of rank 2:

Four of these correspond to familiar Lie algebras, while G2 is the first exceptional Lie algebra that we encounter.

If we go to higher rank, then not every graph built from three or more nodes connected by (multiple) links corresponds to the Dynkin diagram of a compact Lie algebra. Indeed, there have to exist r linearly independent vectors αi, whose inner products give rise to the links in the diagram. Luckily this is the only criterion that a diagram has to satisfy to qualify as a Dynkin diagram. As we will see now it puts severe restrictions on the types of graphs that can appear.

8.3 Classification of compact, simple Lie algebras

In this section we focus only on connected Dynkin diagrams, since we want to classify simple Lie algebras. For the moment we also forget about the arrows on the Dynkin diagrams, since we are just going to consider what angles between the simple roots are possible.

Lemma 8.1 (Dynkin diagrams of rank 3). The only Dynkin diagrams of rank 3 are

Proof. This results from the fact that the sum of angles between 3 linearly independent vectors must be less than 360°. Computing the angles in the first diagram, one has 120° + 120° + 90° = 330°. The analogous computation for the second diagram yields 120° + 135° + 90° = 345°. This also implies that the diagrams shown below are not valid, since the angles add up to 360°:

Adding even more lines only increases the sum of angles further.
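The conclusion of Lemma 8.1 can be cross-checked with a different but equivalent criterion, swapped in here for the angle-sum argument of the proof: three unit vectors with pairwise angles strictly between 0 and π are linearly independent exactly when the determinant of their Gram matrix is positive. With n lines between two nodes the cosine of the angle is −√n/2, which reproduces the angles in table (6.21). The function names below are illustrative.

```python
import math
from itertools import product

def gram_det(n12, n13, n23):
    """Gram determinant of three unit vectors joined by n12, n13, n23 lines.

    With c_ij = -sqrt(n_ij)/2 the determinant 1 + 2 c12 c13 c23 - c12^2 - c13^2 - c23^2
    simplifies to 1 - (n12 + n13 + n23)/4 - sqrt(n12 n13 n23)/4.
    """
    return 1 - (n12 + n13 + n23) / 4 - math.sqrt(n12 * n13 * n23) / 4

def valid_rank3_diagrams():
    """Connected diagrams on three nodes (up to relabeling) with linearly independent roots."""
    valid = set()
    for lines in product(range(4), repeat=3):
        connected = sum(1 for n in lines if n > 0) >= 2
        if connected and gram_det(*lines) > 1e-9:
            valid.add(tuple(sorted(lines)))
    return valid

print(valid_rank3_diagrams())
print(gram_det(1, 1, 1), gram_det(2, 0, 2), gram_det(1, 0, 3))   # borderline cases with angle sum 360 degrees
```

Only the chains single-single and single-double survive, while the triangle of single lines, two adjacent double lines, and a single line next to a triple line all give a vanishing determinant.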

This has important consequences for diagrams of higher rank, because of the following.

Lemma 8.2 (Subsets of a Dynkin diagram). A connected subset of nodes from a Dynkin diagram together with the links connecting them is again a Dynkin diagram.

Proof. Taking a subset of simple roots preserves the corresponding lengths and angles, and the vectors are still linearly independent.

In particular, a triple line cannot occur in diagrams with three or more nodes, because otherwise there would be a subset of rank 3 with a triple line. Hence, we have:

Lemma 8.3 (No triple lines). G2 is the only Dynkin diagram with a triple line.

Another way to shrink a diagram is by shrinking single lines.

Lemma 8.4 (Contraction of a single line). If a Dynkin diagram contains two nodes connected by a single line, then shrinking the line and merging the two nodes results again in a Dynkin diagram.

Proof. Suppose that α and β are connected by a single line. We claim that replacing the pair of vectors α and β by the single vector α + β preserves the necessary properties. By Lemma 8.1 and Lemma 8.2 each other node γ is connected to at most one of α or β (otherwise there would be a triangle subdiagram). If γ was connected to α then γ · (α + β) = γ · α, while if γ was connected to β then γ · (α + β) = γ · β. Since also |α + β|² = |α|² = |β|², all lines other than the contracted one are preserved. The resulting set of vectors is also still linearly independent.

As a consequence we have that

Lemma 8.5 (No loops). A Dynkin diagram cannot contain loops.

Proof. By Lemma 8.1 a loop must always contain a single edge, which can be contracted by Lemma 8.4. Hence, every loop can be contracted to a triangle, in contradiction with Lemma 8.1.

Lemma 8.6 (At most one double line). A Dynkin diagram can contain at most one double line.

Proof. If there are two or more double lines, then one can always contract single lines until two double lines become adjacent. But this is not allowed by Lemma 8.1.

At this point we know that Dynkin diagrams always have the shape of a tree with at most one of its links being a double line. To further restrict branching of the tree the following result is quite useful.

Lemma 8.7 (Merging two single lines). If a Dynkin diagram has a node with (at least) two dangling single lines, then merging them into a double line gives another Dynkin diagram. In other words, the following is a valid operation on Dynkin diagrams (with the gray blob representing the remaining part of the diagram):

Proof. Replacing α and β by α + β, linear independence of the resulting set of vectors is automatic. Hence it remains to check the inner products between γ and α + β. Using that |α|² = |β|² = |γ|², α · β = 0, and 2α · γ/|γ|² = 2β · γ/|γ|² = −1, it follows indeed that

$$\frac{2\gamma\cdot(\alpha+\beta)}{|\gamma|^2} = -2 \qquad \text{and} \qquad \frac{2\gamma\cdot(\alpha+\beta)}{|\alpha+\beta|^2} = -1. \tag{8.5}$$

This corresponds precisely with the diagram on the right.
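The two ratios in (8.5) can be checked on an explicit configuration. The vectors below are an illustrative choice (not taken from the notes) satisfying the hypotheses of the proof: α, β and γ have equal length, α · β = 0, and γ is joined to each of α and β by a single line.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Hypothetical example vectors in R^4 with integer entries
alpha = (1, -1, 0, 0)
gamma = (0, 1, -1, 0)
beta = (0, 0, 1, -1)

# Hypotheses used in the proof of Lemma 8.7
assert dot(alpha, alpha) == dot(beta, beta) == dot(gamma, gamma)
assert dot(alpha, beta) == 0
assert Fraction(2 * dot(alpha, gamma), dot(gamma, gamma)) == -1
assert Fraction(2 * dot(beta, gamma), dot(gamma, gamma)) == -1

merged = tuple(a + b for a, b in zip(alpha, beta))   # the merged root alpha + beta

ratio_gamma = Fraction(2 * dot(gamma, merged), dot(gamma, gamma))
ratio_merged = Fraction(2 * dot(gamma, merged), dot(merged, merged))
print(ratio_gamma, ratio_merged)   # the two ratios appearing in (8.5)
```

The pair of values (−2, −1) is the entry of table (6.21) for a double line, with α + β the longer of the two roots.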

As a consequence a Dynkin diagram can only contain one particular junction.

Lemma 8.8 (At most one junction or double line). A Dynkin diagram contains one double line, or one junction, or neither. A junction is a node with more than two neighbours. Any junction must have exactly three neighbours, each connected to it by a single line.

Proof. By repeated contraction (Lemma 8.4) and merging (Lemma 8.7) any diagram with a different type of junction (with more than three neighbours or involving double lines) can be reduced to a diagram with two adjacent double lines, which is not allowed. The same is true if a diagram has two junctions, or both a junction and a double line.

The remaining step is to get rid of some remaining diagrams on a case-by-case basis, by showing that they cannot satisfy linear independence.

Lemma 8.9 (Diagrams failing linear independence). The following diagrams cannot be realized by linearly independent vectors:

Proof. The labels µ1, . . . , µr indicated on the diagram are such that $\sum_{j=1}^{r} \mu_j \alpha^j = 0$. To see this one may straightforwardly compute $\left|\sum_{j=1}^{r} \mu_j \alpha^j\right|^2 = \sum_{i,j=1}^{r} \mu_i \mu_j\, \alpha^i \cdot \alpha^j = 0$ using the inner products αi · αj as prescribed by the diagram. Such a linear relation implies that the vectors αj are not linearly independent.

This is all we need to deduce the final classification theorem. In general, connected Dynkin diagrams are denoted by a capital letter denoting the family and a subscript denoting the rank, e.g. B2 for the Dynkin diagram of the rank-2 Lie algebra so(5). The same notation is also sometimes confusingly used to specify the corresponding Lie algebra and/or Lie group.

Theorem 8.1 (Classification of simple, compact Lie algebras). Any simple, compact Lie algebra has a Dynkin diagram equal to a member of one of the four infinite families An, Bn, Cn, Dn, n ≥ 1, or to one of the exceptional Lie algebras G2, F4, E6, E7, or E8. In other words its Dynkin diagram appears in the following list:

Proof. Let’s have a look at the three cases of Lemma 8.8.

No junction or double line: Then the diagram is clearly of the form of An for some n ≥ 1.

One double line: The diagram has no junction. If the double line occurs at an extremity then it is of the form of Bn or Cn. If it sits in the middle, then by Lemma 8.9(d) and (e) it can have only one single edge on both sides, i.e. of the form F4.

One junction: Let’s denote the lengths of the paths emanating from the junction by ℓ1 ≤ ℓ2 ≤ ℓ3. Using Lemma 8.9: (a) implies ℓ1 = 1; (b) implies ℓ2 ≤ 2; and (c) implies ℓ2 = 1 or ℓ3 ≤ 4. It is then easy to see that Dn (ℓ2 = 1), E6 (ℓ2 = ℓ3 = 2), E7 (ℓ2 = 2, ℓ3 = 3), and E8 (ℓ2 = 2, ℓ3 = 4) are the only possibilities.

As one may note by looking at the infinite families for small n, there are some overlaps between the families. Since there is only one rank-1 diagram it makes sense to label it by A1, and require n ≥ 2 for the remaining families. However, B2 = C2 and D3 = A3, while D2 consisting of two disconnected nodes does not correspond to a simple Lie algebra. Hence, to make the classification unique one may require n ≥ 1 for An, n ≥ 2 for Bn, n ≥ 3 for Cn, and n ≥ 4 for Dn.

Finally, going back to the common Lie groups that we identified in Chapter 6 it is possible to obtain the following identification:

Cartan label    Lie algebra    dimension       rank
An              su(n + 1)      (n + 1)² − 1    n ≥ 1
Bn              so(2n + 1)     (2n + 1)n       n ≥ 2
Cn              sp(n)          (2n + 1)n       n ≥ 3
Dn              so(2n)         (2n − 1)n       n ≥ 4
G2                             14              2
F4                             52              4
E6                             78              6
E7                             133             7
E8                             248             8

From the overlaps described above it follows that we have the following isomorphisms between Lie algebras,

Cartan label    Lie algebra
A1, B1, C1      so(3) ≃ su(2) ≃ sp(1)
B2, C2          so(5) ≃ sp(2)
A3, D3          so(6) ≃ su(4)
                                          (8.6)

as well as the nonsimple so(4) = su(2)⊕ su(2) (see exercises).
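A quick consistency check on these identifications is that isomorphic algebras must have equal dimension. The sketch below evaluates the dimension formulas from the table for the overlapping labels; the function names are illustrative, and matching dimensions are of course only a necessary condition, not a proof of isomorphism.

```python
def dim_A(n):
    return (n + 1) ** 2 - 1      # su(n + 1)

def dim_B(n):
    return (2 * n + 1) * n       # so(2n + 1)

def dim_C(n):
    return (2 * n + 1) * n       # sp(n)

def dim_D(n):
    return (2 * n - 1) * n       # so(2n)

# A1, B1, C1: so(3), su(2), sp(1)
print(dim_A(1), dim_B(1), dim_C(1))
# B2, C2: so(5), sp(2)
print(dim_B(2), dim_C(2))
# A3, D3: su(4), so(6)
print(dim_A(3), dim_D(3))
# D2: so(4) versus su(2) + su(2)
print(dim_D(2), 2 * dim_A(1))
```

The same formulas give dimension 24 for A4, i.e. su(5), which is used in the next section.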

8.4 Concluding remark on unified gauge theories

The idea behind Grand Unified Theories (GUT) is to embed the gauge group of the standard model of particle physics, SU(3) × SU(2) × U(1), in a simple, compact Lie group. In other words we want to recognize su(3) ⊕ su(2) ⊕ u(1) as a subalgebra in one of the simple, compact Lie algebras of Theorem 8.1. Based on the Dynkin diagrams, one can determine which Lie algebras are suited for this purpose. The simplest possibility turns out to be A4, i.e. su(5).

Let us demonstrate more generally how one can obtain a subalgebra from a rank-r Lie algebra g by removing a node α0 from its Dynkin diagram. We label the simple roots as α0, . . . , αr−1 and choose a basis of Cartan generators H0, . . . , Hr−1 such that the H0-component of αj vanishes for j ≥ 1. Then the Cartan generators H1, . . . , Hr−1 together with the roots arising from Weyl reflections of α1, . . . , αr−1 span a rank-(r − 1) subalgebra g1 ⊂ g, whose Dynkin diagram is the original one with the node α0 removed. However, one can extend the subalgebra to g1 ⊕ u(1) by adding the Cartan generator H0, since by construction it commutes with all of g1.

In particular, if we remove one of the middle nodes of the diagram A4 of su(5) then the remaining diagram decomposes in an A2 and an A1, which together form the Dynkin diagram for su(3) ⊕ su(2). Hence we find the explicit subalgebra

su(3)⊕ su(2)⊕ u(1) ⊂ su(5), (8.7)

which forms the basis of the famous Georgi-Glashow GUT based on SU(5).
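The node-removal construction is easy to mimic for the chain-shaped diagrams An. The sketch below is a hypothetical helper (not from the notes) that lists the ranks of the pieces left after deleting one node of An and compares ranks and dimensions on both sides of (8.7).

```python
def pieces_after_removal(n, removed):
    """Ranks of the A-type pieces left after deleting node `removed` (0-based) from A_n."""
    if not 0 <= removed < n:
        raise ValueError("node index out of range")
    left, right = removed, n - removed - 1
    return sorted(size for size in (left, right) if size > 0)

def dim_su(k):
    return k * k - 1   # dimension of su(k), i.e. of A_(k-1)

# Removing a middle node of A4 (su(5)) leaves A1 and A2, i.e. su(2) and su(3)
middle = pieces_after_removal(4, 1)
print(middle, pieces_after_removal(4, 2))

# Removing an end node leaves A3, i.e. su(4)
print(pieces_after_removal(4, 0))

# Rank and dimension bookkeeping for su(3) + su(2) + u(1) inside su(5)
rank_sub = sum(middle) + 1                               # + 1 for the u(1) generated by H0
dim_sub = sum(dim_su(size + 1) for size in middle) + 1
print(rank_sub, dim_sub, dim_su(5))
```

The subalgebra has the full rank 4 of su(5) but only 12 of its 24 dimensions; the remaining 12 generators of su(5) lie outside su(3) ⊕ su(2) ⊕ u(1).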
