Basic Concepts in Algebra

§1. Notations and terminologies

(1.1) Some symbols

• ∀: “for all”

• ∃: “there exists”

• ↦: maps to (under a map/function)

• lhs := rhs (“rhs” is the definition of “lhs”)

• X ∖ S: the complement in X of a subset S ⊆ X

• 2^S, where S is a set: the set of all subsets of S

• Z: the set of all integers

• N: the set of all natural numbers (including 0)

• Q: the set of all rational numbers

• R: the set of all real numbers

• C: the set of all complex numbers

• N_{>0}: the set of all positive integers

• M2(R): the set of all 2 × 2 matrices with entries in R

• GL3(Q): the set of all invertible 3 × 3 matrices A with entries in Q, i.e. there exists a 3 × 3 matrix B with entries in Q such that A · B = B · A = I3, where I3 is the 3 × 3 matrix whose diagonal entries are 1 and off-diagonal entries are 0.

• GL2(Z): the set of all 2 × 2 matrices A with entries in Z such that there exists a 2 × 2 matrix B with entries in Z satisfying A · B = B · A = I2; this amounts to the condition that the determinant of A is ±1.

• R[x]: the set of all polynomials with coefficients in R

• Q[x, y]: the set of all polynomials in two variables x and y with all coefficients in Q

• Z[x, y, z]: the set of all polynomials in three variables x, y, z with all coefficients in Z.

(1.2) Definition Let f : S → T be a map from a set S to a set T .


(a) The map f is said to be injective (or, f is an injection) if for any two elements s1, s2 ∈ S, f(s1) = f(s2) only if s1 = s2. Another standard terminology for the same concept: f is one-to-one.

(b) The map f is said to be surjective (or, f is a surjection) if for every element t ∈ T, there exists an element s ∈ S such that f(s) = t. In other words the image f(S) of f is equal to the target T of the map f. Another standard terminology for the same concept: f is onto.

(c) The map f is said to be bijective (or, f is a bijection) if f is both injective and surjective. Another standard terminology for the same concept: f is a one-to-one and onto correspondence.
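For maps between finite sets the three conditions of 1.2 can be checked by brute force. The following is a minimal sketch; the helper names `is_injective`, `is_surjective`, `is_bijective` and the sample maps `shift` and `square` are illustrative choices, not notation from these notes.

```python
def is_injective(f, S):
    """f(s1) == f(s2) only if s1 == s2, for s1, s2 in the finite set S."""
    images = [f(s) for s in S]
    return len(set(images)) == len(images)

def is_surjective(f, S, T):
    """Every t in T is f(s) for some s in S (assumes f maps S into T)."""
    return {f(s) for s in S} == set(T)

def is_bijective(f, S, T):
    return is_injective(f, S) and is_surjective(f, S, T)

S = range(5)                      # the set {0, 1, 2, 3, 4}
shift = lambda s: (s + 2) % 5     # a bijection from S to S
square = lambda s: (s * s) % 5    # neither injective nor surjective on S

print(is_bijective(shift, S, S), is_injective(square, S), is_surjective(square, S, S))
```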

(1.3) Definition Let S1, . . . , Sn be sets. The product S1 × · · · × Sn of S1, . . . , Sn is the set consisting of all n-tuples (x1, . . . , xn) such that xi ∈ Si for all i = 1, . . . , n. Such a product is also denoted ∏_{i=1}^{n} Si, where “∏” is the symbol for a product. The above definition in words can also be expressed by the following formula

∏_{i=1}^{n} Si = S1 × · · · × Sn := {(x1, . . . , xn) | xi ∈ Si ∀ i = 1, . . . , n}

(1.4) Remark We will not need to use the product of an infinite family of sets. In case you wonder what such an infinite product is, suppose that I is an indexing set, and (Si)_{i∈I} is a family of sets indexed by I.

(a) The disjoint union ⊔_{i∈I} Si of the sets Si is the set of all pairs (i, x) where i is an element of the indexing set I and x is an element of the set Si indexed by i. For each j ∈ I, the set Sj is naturally identified with the subset of all elements in ⊔_{i∈I} Si of the form (j, x) such that x ∈ Sj. (Another notation for the disjoint union is “∐”, because the “coproduct” of sets is nothing but the disjoint union, so ⊔_{i∈I} Si can also be written as ∐_{i∈I} Si. However we want to avoid such notation because the coproduct of a finite family of abelian groups is the same as their product, but the coproduct of an infinite family of non-trivial abelian groups is a proper subgroup of the product group.)

(b) The product ∏_{i∈I} Si is the set of all functions

f : I → ⊔_{i∈I} Si

from I to the disjoint union ⊔_{i∈I} Si of the sets Si such that f(i) lies in (the subset identified with) Si for every i ∈ I.

(1.5) Definition Let f : X → S and g : Y → S be maps of sets. The fiber product of f : X → S and g : Y → S, denoted by X ×_{f,S,g} Y or X ×_S Y for short, is the subset of X × Y consisting of all pairs (x, y) ∈ X × Y such that f(x) = g(y). Denote by π1 : X ×_S Y → X and π2 : X ×_S Y → Y the two “projections”, defined by

π1(x, y) = x , π2(x, y) = y ∀(x, y) ∈ X ×_S Y .

Clearly f ◦ π1 = g ◦ π2 by construction. The triple (X ×_S Y, π1, π2) satisfies the following universal property:

For any maps of sets u : T → X, v : T → Y such that f ◦ u = g ◦ v, there exists a unique map h : T → X ×_S Y such that u = π1 ◦ h and v = π2 ◦ h.
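For finite sets the fiber product and the map h of the universal property can be written down directly. The sketch below uses made-up sample data (the sets `X`, `Y`, `T` and the maps `f`, `g`, `u`, `v` are illustrative assumptions); the point is that h(t) = (u(t), v(t)) is forced by the two conditions u = π1 ◦ h and v = π2 ◦ h.

```python
def fiber_product(X, Y, f, g):
    """All pairs (x, y) in X x Y with f(x) == g(y)."""
    return [(x, y) for x in X for y in Y if f(x) == g(y)]

pi1 = lambda pair: pair[0]   # projection to X
pi2 = lambda pair: pair[1]   # projection to Y

X, Y = range(6), range(6)
f = lambda x: x % 2          # f : X -> S, with S the integers
g = lambda y: y % 3          # g : Y -> S
P = fiber_product(X, Y, f, g)

# A test set T with maps u, v such that f(u(t)) == g(v(t)) for all t.
T = range(6)
u = lambda t: t
v = lambda t: t % 2
h = lambda t: (u(t), v(t))   # the unique map T -> P compatible with u and v

print(len(P), [h(t) for t in T])
```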

(1.6) Definition An equivalence relation on a set S is a subset R ⊆ S × S satisfying the following properties.

• (R is reflexive) (x, x) ∈ R for all x ∈ S. In words, every element of S is equivalent to itself under the equivalence relation R.

• (R is symmetric) If (x, y) ∈ R then (y, x) ∈ R. In words, if x is equivalent to y then y is equivalent to x.

• (R is transitive) If x, y, z are elements in S such that (x, y) ∈ R and (y, z) ∈ R, then (x, z) ∈ R. In words, if x is equivalent to y and y is equivalent to z, then x is equivalent to z.

(1.7) Definition Let R be an equivalence relation on a set S.

(a) The equivalence class under R containing an element x ∈ S is the set of all elements y ∈ S such that (x, y) ∈ R. (So an equivalence class is a subset of S. Note that any two equivalence classes are either identical or disjoint. So the equivalence relation R partitions the set S into a disjoint union of equivalence classes.)

(b) The set S/R is the set of all equivalence classes in S with respect to R. (So each element of the set S/R is a subset of S, i.e. S/R is a set of subsets of S.)

Example: Two integers a, b are said to be congruent modulo 37 (notation: a ≡ b (mod 37)) if their difference is divisible by 37. Being congruent modulo 37 is an equivalence relation on Z. The set of all equivalence classes for this equivalence relation is denoted by Z/37Z. Note that each element of Z/37Z is a subset of Z; one such element is 1 + 37Z, consisting of all integers n such that you get 1 as the remainder if you divide n by 37.

Of course you can replace 37 by any integer N and define an equivalence relation in a similar way.
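The partition into equivalence classes can be computed for a finite set. The sketch below does this for congruence modulo N on a finite window of integers; taking N = 5 and the window −10, . . . , 9 is an arbitrary illustrative choice, and `equivalence_classes` is a hypothetical helper name.

```python
def equivalence_classes(S, related):
    """Partition the finite collection S into classes of the equivalence relation `related`."""
    classes = []
    for x in S:
        for cls in classes:
            # By symmetry and transitivity it suffices to compare with one representative.
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

N = 5
congruent = lambda a, b: (a - b) % N == 0
classes = equivalence_classes(range(-10, 10), congruent)

for cls in classes:
    print(cls)
```

Each printed list is the part of one class a + NZ that lies in the chosen window.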

(1.8) Definition A partial ordering on a set S is a relation on S, written a ≼ b if this relation holds for the ordered pair (a, b), such that the following properties hold.

(i) a ≼ a for all a ∈ S.

(ii) If a ≼ b and b ≼ c, a, b, c ∈ S, then a ≼ c.

(iii) If a, b ∈ S, a ≼ b and b ≼ a, then a = b.

A partial ordering on S is said to be a total ordering if property (iv) below holds.

(iv) For any two elements a, b ∈ S, either a ≼ b or b ≼ a.

(1.9) Definition Let ≼ be a partial ordering on a set S.

(a) An upper bound of a subset T ⊆ S is an element b ∈ S such that t ≼ b for all t ∈ T.

(b) A maximal element of a subset T ⊆ S is an element m ∈ T such that there is no element in T which is bigger than m. In other words, if t ∈ T and m ≼ t, then t = m.

(1.10) Zorn’s Lemma. Let S be a non-empty partially ordered set such that every totally ordered subset T ⊆ S has an upper bound in S. Then there exists a maximal element in S.

Zorn’s Lemma is an equivalent form of the axiom of choice in set theory, which is known to be independent of the basic axioms in standard set theory and is consistent if the standard set theory is, i.e. assuming it will not lead to a contradiction unless the standard set theory already does (an inconceivable scenario, since then almost all of the known mathematics would have to be abandoned). Most mathematicians use Zorn’s lemma freely.

(1.11) Definition (a) Two sets S1 and S2 are said to have the same cardinality if there exists a bijection between S1 and S2; we write Card(S1) = Card(S2) if this is the case.

(b) We say that the cardinality of a set S1 is less than or equal to that of a set S2 if there exists an injection from S1 to S2. When S1 is non-empty, this property is equivalent (under the axiom of choice) to the existence of a surjection from S2 to S1. Notation: Card(S1) ≤ Card(S2).

(1.12) Basic facts about cardinality, assuming the axiom of choice.

(i) If Card(S1) ≤ Card(S2), and Card(S2) ≤ Card(S3), then Card(S1) ≤ Card(S3).

(ii) If Card(S1) ≤ Card(S2), and Card(S2) ≤ Card(S1), then Card(S1) = Card(S2).

(iii) Either Card(S1) ≤ Card(S2) or Card(S2) ≤ Card(S1) for any two sets S1 and S2.

§2. Groups

(2.1) Definition A group is a triple (G, µ, e), where G is a set, µ : G × G → G is a binary operation and e is an element of G, such that the following properties are satisfied.

• (e is a unity element for the group law µ) µ(x, e) = µ(e, x) = x for all x ∈ G.


• (associativity) µ(x, µ(y, z)) = µ(µ(x, y), z) for all x, y, z ∈ G.

• (existence of inverse) For every element x ∈ G, there exists an element y ∈ G such that µ(x, y) = µ(y, x) = e. [It is easy to check that this element y is uniquely determined by the above property; it is called the inverse of x and denoted x^{-1}.]

When the group law µ is understood, we will suppress the symbol µ and write “x · y” for µ(x, y). Moreover we will often suppress both the group law µ and the unity element e, and simply use the underlying set G as the notation for the group, if that causes no confusion. We often write x^n for the product of x with itself n times; x^{-n} means (x^{-1})^n.

(2.2) Definition A group G is said to be commutative if xy = yx for all x, y ∈ G. As a synonym, an abelian group (in honor of the mathematician Abel) is the same as a commutative group. For an abelian group we often use the symbol “+” instead of “·”. When using the additive notation for an abelian group G, x^n in the multiplicative notation becomes n · x, the result of applying the group law to n copies of an element x of G.

(2.3) Definition A subgroup of a group G is a subset H of G which contains the unity element e of G such that x · y ∈ H and x^{-1} ∈ H for all x, y ∈ H.

(2.4) Definition Let S be a subset of a group G. The subgroup of G generated by S is the smallest subgroup of G which contains S; it is the subset of G consisting of the unity element e and all finite products

x1 · x2 · · · xn (n ∈ N_{>0})

where each xi is either an element of S or is the inverse of an element of S.

(2.5) Definition Let G be a group and let H be a subgroup of G.

(a) The left H-coset which contains an element a ∈ G is the subset a · H = {a · h | h ∈ H} of G. Being in the same left H-coset is an equivalence relation on G: two elements x and y are equivalent if and only if x^{-1} · y ∈ H.

(b) G/H is the set of all left H-cosets. (So G/H is a set of subsets of G.)

(c) The right H-coset which contains an element a ∈ G is the subset H · a of G; H\G is the set of all right H-cosets in G.

Note that the map x ↦ x^{-1} induces a bijection between G/H and H\G. The common cardinality of G/H and H\G is called the index of H in G, denoted [G : H] (which can be either a positive integer or ∞). We have |G| = [G : H] · |H| if G is a finite group.
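The statements of 2.5 can be checked by enumeration in a small example. The sketch below takes G to be the group of permutations of {0, 1, 2} (labels start at 0 only for programming convenience) and H the two-element subgroup generated by a transposition; both choices, and the helper names, are illustrative assumptions.

```python
from itertools import permutations

def compose(p, q):
    """The permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))     # all 6 permutations of {0, 1, 2}
H = [(0, 1, 2), (1, 0, 2)]           # identity and the transposition swapping 0 and 1

left_cosets = {frozenset(compose(a, h) for h in H) for a in G}    # G/H
right_cosets = {frozenset(compose(h, a) for h in H) for a in G}   # H\G

# x -> x^{-1} sends each left coset a·H onto the right coset H·a^{-1}.
inverted = {frozenset(inverse(x) for x in coset) for coset in left_cosets}

print(len(left_cosets), len(right_cosets), inverted == right_cosets)
```

In this example the left and right cosets are different subsets of G, although there are equally many of them.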

(2.6) Definition A group homomorphism from a group (G1, µ1, e1) to a group (G2, µ2, e2) is a map h : G1 → G2 such that h(e1) = e2 and h(µ1(x, y)) = µ2(h(x), h(y)) for all x, y ∈ G1. In other words, h respects the group structures.


Suppose that h is a homomorphism from G1 to G2. Then the image under h of a subgroup of G1 is a subgroup of G2, and the inverse image under h of a subgroup of G2 is a subgroup of G1.

(2.7) Definition A group homomorphism h : G1 → G2 is an isomorphism if there exists a group homomorphism f : G2 → G1 such that f ◦ h = idG1 and h ◦ f = idG2.

(2.8) Definition Let h be a homomorphism from a group G1 to a group G2. The kernel of h, denoted by Ker(h), is the subset of G1 consisting of all elements x ∈ G1 such that h(x) = eG2, where eG2 is the unity element of the target group G2.

(2.9) Definition A normal subgroup N of a group G is a subgroup of G such that x y x^{-1} ∈ N for all elements x ∈ G and all elements y ∈ N. Notation: N ⊴ G.

Remark A subgroup H of a group G is normal if and only if every left H-coset is a right H-coset. In particular every subgroup of index two is normal.

(2.10) Remark Let h be a homomorphism from a group G1 to a group G2.

(a) The kernel Ker(h) of h is a normal subgroup of G1. More generally the inverse image under h of any normal subgroup of the target group G2 is a normal subgroup of the source group G1.

(b) h is injective if and only if Ker(h) is the trivial subgroup {eG1} of G1.

Terminology. A group endomorphism is a homomorphism from a group to itself. A group automorphism is an isomorphism from a group to itself.

(2.11) Definition Let (G1, ·1, e1) and (G2, ·2, e2) be groups. The product G1 × G2 of the underlying sets has a natural group structure, where the group law is defined by

(x, y) · (u, v) = (x ·1 u, y ·2 v) ∀x, u ∈ G1, ∀y, v ∈ G2 .

The resulting group structure on G1 ×G2 is called the product of G1 and G2.

The injective group homomorphisms h1 : G1 −→ G1 × G2 and h2 : G2 −→ G1 × G2, defined by

h1(x) = (x, e2) ∀x ∈ G1 , h2(y) = (e1, y) ∀ y ∈ G2 ,

identify G1 and G2 with normal subgroups of the group G1 × G2 which intersect trivially and generate the group G1 × G2. Note also that any element of h1(G1) commutes with any element of h2(G2). Conversely, if G1 and G2 are normal subgroups of a group G such that G1 ∩ G2 = {eG} and G is the smallest subgroup of G which contains G1 and G2, then the map α : G1 × G2 −→ G defined by

α((x, y)) = x · y , ∀x ∈ G1, ∀y ∈ G2

is a group isomorphism.


(2.12) Definition Let G be a group. Let u be an element of G.

(a) A conjugate of u in G is an element of the form x · u · x^{-1} with x ∈ G.

(b) The conjugacy class in G containing u is the set of all conjugates of u; it is a subset of G, denoted by Gu.

One can check without difficulty that the surjective map x ↦ x · u · x^{-1} from G to the conjugacy class of u induces a bijection

G/ZG(u) ⥲ Gu , x ZG(u) ↦ x u x^{-1}

from G/ZG(u) to the conjugacy class Gu, where ZG(u) is the centralizer of u in G defined in 2.13.

(2.13) Definition Let H be a subgroup of a group G, and let u be an element of G.

(a) The centralizer of u in G, denoted by ZG(u), is the subset of all elements x ∈ G such that xu = ux (or equivalently x u x^{-1} = u). It is easily checked that ZG(u) is a subgroup of G.

(b) The normalizer of H in G, denoted by NG(H), is the subset of all elements x ∈ G such that x · H · x^{-1} = H. It is easily checked that NG(H) is a subgroup of G; moreover it contains H and ZG(H), the subgroup of all elements of G which commute with every element of H. It is clear that NG(H) = G if and only if H is a normal subgroup of G.

(2.14) Quotient groups Let N be a normal subgroup of a group (G, µ, e). Consider the set G/N of all (left) N-cosets in G. It is not difficult to check that the map

µ̄ : (G/N) × (G/N) −→ G/N , (x · N, y · N) ↦ xy · N ∀x, y ∈ G

is well-defined, i.e. independent of the choice of representatives in the cosets. Let ē be the coset N. It is easy to check that (G/N, µ̄, ē) is a group, and the natural surjective map

π : G −→ G/N , π(x) = x · N ∀x ∈ G

is a group homomorphism. Moreover the kernel Ker(π) of the homomorphism π is the normal subgroup N.

Important property: There is a natural bijection between subgroups of the quotient group G/N and subgroups of G containing N: To any subgroup H̄ of G/N, associate the subgroup π^{-1}(H̄) of G. Conversely, to every subgroup H of G which contains N, associate the subgroup π(H) = H/N of G/N. Under this correspondence, H̄ = H/N is normal in G/N if and only if H is normal in G. Moreover if H is normal, then we have a natural isomorphism between G/H and (G/N)/(H/N).


(2.15) Definition Let S be a set. A permutation on S is a bijection σ : S → S from S to itself. The set of all permutations of S forms a group, denoted by Perm(S), where the group law is given by composition (of bijections).

Remark When S = {1, 2, . . . , n} for a positive integer n, we usually write Sn for the group of all permutations of these n labels. It is a finite group with n! elements.

(2.16) Definition There is a group homomorphism sgn : Sn → µ2 = {±1}, defined by

∏_{1≤i<j≤n} (x_{σ(i)} − x_{σ(j)}) = sgn(σ) · ∏_{1≤i<j≤n} (xi − xj) ,

where x1, . . . , xn are variables. A permutation σ ∈ Sn is said to be even (resp. odd) if sgn(σ) = 1 (resp. if sgn(σ) = −1). The subgroup of Sn consisting of all even permutations is called the alternating group in n letters, denoted An.
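In the defining product of 2.16, the factor for a pair i < j changes sign exactly when σ(i) > σ(j), so sgn(σ) is (−1) raised to the number of such pairs (inversions). The sketch below uses this to compute sgn and checks the homomorphism property by brute force on S4; permutations are written as tuples on the labels 0, . . . , n−1, which is a programming convention and not the notation of these notes.

```python
from itertools import permutations

def sgn(sigma):
    """Sign of a permutation of {0, ..., n-1}; sigma[i] is the image of i."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """The permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

S4 = list(permutations(range(4)))
is_homomorphism = all(sgn(compose(p, q)) == sgn(p) * sgn(q) for p in S4 for q in S4)
A4 = [p for p in S4 if sgn(p) == 1]   # the even permutations

print(is_homomorphism, len(S4), len(A4))
```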

(2.17) Definition Let p be a prime number. A p-group is a finite group whose cardinality is a power of p.

(2.18) Definition Let G be a group. The commutator subgroup of G, denoted by (G, G), is the subgroup of G generated by all elements of the form x · y · x^{-1} · y^{-1}, where x and y are elements of G.

Remark The commutator subgroup (G, G) of G is a normal subgroup of G and G/(G, G) is an abelian group. Moreover the quotient homomorphism π : G −→ G/(G, G) has the following property: if h : G → A is a group homomorphism and A is an abelian group, then there exists a unique group homomorphism g : G/(G, G) −→ A such that h = g ◦ π.

(2.19) Definition Let G be a group.

(a) The set Autgrp(G) of all group automorphisms of G forms a group; the group law is given by composition of automorphisms.

(b) For every element x ∈ G, denote by AdG(x) the group automorphism of G which sends every element y ∈ G to x · y · x^{-1}. The map x ↦ AdG(x) is a group homomorphism

AdG : G −→ Autgrp(G) .

Note that Ker(AdG) is the center Z(G) of G.

(2.20) Definition Let N and H be subgroups of a group G. We say that G is a semi-direct product of N and H with N normal if N is a normal subgroup of G, N ∩ H = {eG} and G is generated by H and N. Notation: G ≅ N ⋊ H.


Note that G is not determined by N and H up to isomorphism. In other words there are examples of non-isomorphic groups G and G′, such that G is a semi-direct product of a normal subgroup N1 and a subgroup H1, G′ is a semi-direct product of a normal subgroup N2 and a subgroup H2, and there exist group isomorphisms H1 ⥲ H2, N1 ⥲ N2.

(2.21) Definition Let H and N be groups, and let α : H −→ Autgrp(N) be a group homomorphism. The semi-direct product N ⋊_α H attached to (H, N, α) is the group with underlying set N × H, whose group law is defined by the following formula:

(n1, h1) · (n2, h2) := (n1 · α(h1)(n2), h1 · h2) ∀n1, n2 ∈ N, ∀h1, h2 ∈ H .

We have natural injective group homomorphisms j : H −→ N ⋊_α H and ι : N −→ N ⋊_α H given by

j(h) = (eN, h) ∀h ∈ H, ι(n) = (n, eH) ∀n ∈ N .

It is easy to see that ι(N) is a normal subgroup of N ⋊_α H intersecting the subgroup j(H) trivially, and N ⋊_α H is generated by ι(N) and j(H). Moreover α is naturally identified with the conjugation action of elements of j(H) on the normal subgroup ι(N).

Conversely, suppose that H is a subgroup of a group G, N is a normal subgroup of G, and G is a semi-direct product of N and H. Denote by α the homomorphism from H to Autgrp(N) given by

α(h)(n) = h · n · h^{-1} ∀h ∈ H, ∀n ∈ N .

Then one can check that G is isomorphic to the semi-direct product N ⋊_α H.
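The group law of 2.21 can be tried out on a small case. The sketch below takes N = Z/3Z and H = Z/2Z (both written additively) with α(h) equal to multiplication by (−1)^h; this particular choice of N, H and α is an illustrative assumption, and it produces a non-commutative group with 6 elements.

```python
from itertools import product

def alpha(h):
    """alpha(h) is the automorphism n -> (-1)^h * n of Z/3Z."""
    return lambda n: n % 3 if h % 2 == 0 else (-n) % 3

def mul(a, b):
    """Group law (n1, h1)·(n2, h2) = (n1 + alpha(h1)(n2), h1 + h2), written additively in N and H."""
    (n1, h1), (n2, h2) = a, b
    return ((n1 + alpha(h1)(n2)) % 3, (h1 + h2) % 2)

elements = list(product(range(3), range(2)))   # underlying set N x H
identity = (0, 0)

associative = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                  for a in elements for b in elements for c in elements)
has_inverses = all(any(mul(a, b) == identity == mul(b, a) for b in elements) for a in elements)

# Conjugating iota(1) = (1, 0) by j(1) = (0, 1), which is its own inverse here.
conjugated = mul(mul((0, 1), (1, 0)), (0, 1))

print(associative, has_inverses, conjugated)
```

The last line illustrates that conjugation by j(h) acts on ι(N) through α(h): the element ι(1) is sent to ι(α(1)(1)) = ι(2).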

§3. Vector spaces

(3.1) Definition Let (F, +F, ·F, 0F, 1F) be a field.¹ A vector space over F (or an F-vector space) is a quadruple (V, +V, µ, 0V), where V is a set, 0V is an element of V, +V is a binary operation on V, and µ : F × V −→ V is a map, such that the following conditions are satisfied.

• (V,+V , 0V ) is an abelian group.

• µ(a, µ(b, v)) = µ(a ·F b, v) for all a, b ∈ F and all v ∈ V .

• µ(a+ b, v) = µ(a, v) + µ(b, v) for all a, b ∈ F and all v ∈ V .

• µ(a, v + w) = µ(a, v) + µ(a, w) for all a ∈ F and all v, w ∈ V .

• µ(0F , v) = 0V for all v ∈ V .

• µ(1F , v) = v for all v ∈ V .

¹ See 5.10 for the definition of fields.


We often suppress the symbol “µ” and write a · v or av for µ(a, v) if no confusion is possible. An element of V is often called a “vector” in V.

(3.2) Definition Let V1 and V2 be vector spaces over the same field F. A linear transformation over F from V1 to V2 is a map T : V1 → V2 that preserves the structure of F-vector spaces. More precisely, T is a group homomorphism for the underlying abelian groups, and T(a · v) = a · T(v) for all a ∈ F and all v ∈ V1. Synonym: F-linear map.

(3.3) Remark Let T : V1 → V2 be a linear transformation between vector spaces over a field F.

(1) The kernel Ker(T) of T is the vector subspace of V1 consisting of all elements v ∈ V1 such that T(v) = 0 in V2. The image Im(T) of the map T forms a vector subspace of V2 over F.

(2) A more uniform terminology for “linear transformations” is “homomorphisms between F-vector spaces”.

(3) A linear transformation is injective if and only if its kernel is trivial.

(4) A (linear) endomorphism of a vector space V over a field F is a linear transformation from V to itself. A linear endomorphism of V is also called a linear operator on V.

(5) The set of all F-linear endomorphisms of V is denoted by EndF(V). The set of all F-linear transformations from an F-vector space V1 to an F-vector space V2 is denoted by HomF(V1, V2). Both EndF(V) and HomF(V1, V2) are vector spaces over F.

(6) The set of all F-linear automorphisms of V is denoted by GLF(V); it has a natural group structure, with the group law given by composition.

(7) The dual vector space to an F-vector space V is the F-vector space HomF(V, F). An element of HomF(V, F) is said to be a linear functional on V.

(3.4) Definition Let V be a vector space over a field F . Let v1, . . . , vm be elements of V .

(a) An F-linear combination (or linear combination for short) of v1, . . . , vm is an expression of the form

∑_{i=1}^{m} ai · vi

with ai ∈ F.

(b) The F-linear span of v1, . . . , vm is the smallest F-vector subspace of V which contains v1, . . . , vm. It is easy to see that the linear span of v1, . . . , vm consists of all F-linear combinations of v1, . . . , vm.


(c) We say that the vectors v1, . . . , vm are linearly independent over F if for every m-tuple (a1, . . . , am) of elements of F, ∑_{i=1}^{m} ai · vi = 0 only if a1 = · · · = am = 0. In other words, every non-trivial linear combination of the vi’s is not equal to the zero vector.

(d) We say that the list of vectors v1, . . . , vm forms an F-basis of V if it is linearly independent over F and its F-linear span is equal to V. Two other equivalent properties for the list of vectors v1, . . . , vm to be an F-basis:

(†) v1, . . . , vm is a maximal linearly independent list of vectors in V.

(‡) v1, . . . , vm is a minimal list of vectors in V which spans V.

(3.5) Proposition Let V be a vector space over a field F .

(a) (Existence of basis) V has an F -basis.

(b) Let S and T be subsets of V. If V is the F-linear span of S and T is linearly independent over F, then Card(T) ≤ Card(S).

(c) (definition of dimension) Any two F-bases of V have the same cardinality. The common cardinality of all bases of an F-vector space V is called the dimension of V, denoted dimF(V).

Remark The statement 3.5 holds for arbitrary vector spaces, including those which are infinite dimensional, i.e. cannot be spanned by a finite number of elements. The meaning of cardinality is taken in the sense of set theory as in 1.11.

(3.6) Some basic properties.

(1) Any two vector spaces over the same field F which have the same dimension are isomorphic as F-vector spaces.

(2) dimF(HomF(V1, V2)) = d1 · d2 if V1 and V2 are finite dimensional, where di = dimF(Vi) for i = 1, 2.

(3) For any linear transformation T : V1 → V2 between F-vector spaces, we have a natural isomorphism

V1/Ker(T) ⥲ Im(T) .

In particular dimF(V1) = dimF(Ker(T)) + dimF(Im(T)).

(3.7) Let V and W be finite dimensional vector spaces over a field F, and let T : V → W be an F-linear transformation. Let v1, . . . , vn be an F-basis of V, and let w1, . . . , wm be an F-basis of W. The matrix representation of T for the bases v1, . . . , vn and w1, . . . , wm is the m × n matrix A = (a_{i,α}) whose entries a_{i,α}, 1 ≤ i ≤ m, 1 ≤ α ≤ n, are determined by

T(vα) = ∑_{i=1}^{m} a_{i,α} wi ∀α = 1, . . . , n .


(3.8) Definition Let T ∈ EndF(V) be a linear operator on a finite dimensional vector space V over a field F. Let v1, . . . , vn be an F-basis of V, and let A ∈ Mn(F) be the matrix representation of T, whose entries a_{i,j} are determined by

T(vj) = ∑_{i=1}^{n} a_{i,j} vi , j = 1, . . . , n .

(i) The trace of T is given by

Tr(T) = ∑_{i=1}^{n} a_{i,i} .

(ii) The determinant of T is given by

det(T) = ∑_{σ∈Sn} sgn(σ) ∏_{i=1}^{n} a_{i,σ(i)} .

(iii) The characteristic polynomial of T is the polynomial det(x · Idn − A), where n = dimF(V). Here x · Idn − A is an n × n matrix with coefficients in the polynomial ring F[x], and the determinant of this matrix is computed using the displayed formula for determinants in (ii) above.

The trace, determinant and characteristic polynomial are independent of the choice of the basis v1, . . . , vn, hence are intrinsic invariants of the linear operator T. Up to sign the trace and determinant are coefficients of the characteristic polynomial: Write the characteristic polynomial of T as x^n + a_{n−1} x^{n−1} + · · · + a1 x + a0, then

Tr(T) = −a_{n−1} , det(T) = (−1)^n a0 .
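The permutation-sum formula for the determinant in 3.8 (ii) translates directly into code, and the two relations Tr(T) = −a_{n−1} and det(T) = (−1)^n a0 can then be checked numerically. The sketch below does this for one 3 × 3 integer matrix chosen purely for illustration; it evaluates the characteristic polynomial p at x = 0, 1, −1 and, for n = 3 only, recovers a2 from p(1) + p(−1) = 2·a2 + 2·a0.

```python
from itertools import permutations

def sgn(sigma):
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Sum over all permutations sigma of sgn(sigma) * prod_i A[i][sigma(i)]."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sgn(sigma)
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def char_poly_at(A, x):
    """Value of det(x·Id - A) at a number x."""
    n = len(A)
    return det([[(x if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)])

A = [[2, 1, 0], [0, 3, 4], [5, 0, 1]]    # sample matrix, n = 3
a0 = char_poly_at(A, 0)                  # constant term of the characteristic polynomial
# For a monic cubic p(x) = x^3 + a2 x^2 + a1 x + a0: p(1) + p(-1) = 2*a2 + 2*a0.
a2 = (char_poly_at(A, 1) + char_poly_at(A, -1)) // 2 - a0

print(trace(A), -a2, det(A), (-1) ** 3 * a0)
```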

(3.9) Definition Let T ∈ EndF(V) be a linear operator on a finite dimensional vector space V over a field F.

(i) An element λ ∈ F is an eigenvalue of T if Ker(T − λ · IdV) ≠ (0).

(ii) The eigenspace attached to an eigenvalue λ of T is the vector subspace Ker(T − λ · IdV); its non-zero elements are called eigenvectors of T for the eigenvalue λ.

(iii) The linear operator T is diagonalizable if and only if there exists an F-basis of V consisting of eigenvectors; equivalently V is the linear span of the eigenspaces of T.

(3.10) Definition Let V be a vector space over a field F .

(a) The dual vector space of V is

ᵗV := HomF(V, F) ,

consisting of all F-linear transformations from V to F (also called the linear functionals on V).


(b) We have a natural F-linear map can : V → ᵗ(ᵗV) from V to the dual of the dual of V, defined by

v ↦ (α ↦ α(v)) ∀v ∈ V, ∀α ∈ ᵗV .

In other words, the canonical map sends a vector v ∈ V to the linear functional can(v) on the dual of V, given by “evaluating at v”.

The canonical map can : V −→ ᵗ(ᵗV) is an isomorphism if dimF(V) < ∞, but it is not an isomorphism if dimF(V) = ∞.

(c) Suppose that dimF(V) = n < ∞. The dual basis to an F-basis v1, . . . , vn is the basis ᵗv1, . . . , ᵗvn of ᵗV, where ᵗvi is given by

ᵗvi(vj) = 1 if j = i, and ᵗvi(vj) = 0 if j ≠ i, ∀j = 1, . . . , n

for i = 1, . . . , n.

(3.11) Definition The dual (or transpose) of an F-linear transformation T ∈ HomF(V, W) is the map ᵗT : ᵗW → ᵗV such that

ᵗT(β)(v) = β(T(v)) ∀β ∈ ᵗW, ∀v ∈ V .

Suppose that dimF(V) = n < ∞, dimF(W) = m < ∞, v1, . . . , vn is an F-basis of V, and w1, . . . , wm is an F-basis of W. Let A ∈ Mm×n(F) be the matrix representation of T with respect to the above bases. Let ᵗv1, . . . , ᵗvn and ᵗw1, . . . , ᵗwm be the dual bases of the above bases, for ᵗV and ᵗW respectively. Then the matrix representation of ᵗT with respect to the two dual bases is ᵗA ∈ Mn×m(F), the transpose of A.

§4. Group actions

(4.1) Definition Let (G, ·, e) be a group. A left action of G on a set S is a map µ : G × S −→ S satisfying the following conditions.

• µ(e, s) = s for all s ∈ S.

• µ(x, µ(y, s)) = µ(x · y, s) for all x, y ∈ G and all s ∈ S.

(4.2) Remark

(1) When there is no possible confusion we will suppress the symbol “µ” and write x · s or xs for µ(x, s).

(2) A left action µ of a group G on a set S defines a group homomorphism ρ : G −→ Perm(S) such that ρ(x)(s) = µ(x, s). Conversely every group homomorphism from G to Perm(S) gives rise to a left action of G on S, according to the same formula.

(3) The notion of a right action of G is defined in a similar way.

(4) A left G-action µ on a set S can be turned into a right G-action ν on S, and vice versa, by

µ(g, s) = ν(s, g^{-1}) ∀g ∈ G, ∀s ∈ S .


(4.3) Definition Let V be a vector space over a field F. Let G be a group. A left linear action of G on V is a left action µ : G × V −→ V such that

µ(g, a · u + b · v) = a · µ(g, u) + b · µ(g, v) ∀g ∈ G, ∀a, b ∈ F, ∀u, v ∈ V .

In other words µ induces a group homomorphism ρ : G −→ GLF(V) given by

ρ(g)(v) = µ(g, v) .

A group homomorphism ρ : G → GLF(V) is also called a linear representation of G on the vector space V.

(4.4) Definition Suppose we have a left action (S,G, µ) of a group G on a set S.

(1) The G-orbit of an element s ∈ S, denoted by G · s, is the subset of S consisting of all elements x ∈ S such that x = µ(g, s) for some element g ∈ G.

(2) Any two G-orbits in S are either equal or disjoint. In other words the left G-action partitions the set S into a disjoint union of G-orbits. The set of all (left) G-orbits on S is denoted G\S; it is a set of subsets of S.

(3) The stabilizer of an element s ∈ S, denoted StabG(s), is the subset of G consisting of all elements g ∈ G such that µ(g, s) = s; it is a subgroup of G.

The map from G to the G-orbit containing s, which sends every element g ∈ G to the element µ(g, s) in S, defines a bijection

G/StabG(s) ⥲ G · s .

(4) The fixer of a subset T ⊆ S, denoted FixG(T), is the subset of G consisting of all elements g ∈ G such that µ(g, t) = t for all t ∈ T; it is a subgroup of G.

(5) The stabilizer of a subset T ⊆ S, denoted StabG(T), is the subset of G consisting of all elements g ∈ G such that µ(g, t) ∈ T and µ(g^{-1}, t) ∈ T for all t ∈ T. The last condition is equivalent to µ(g, T) = T; StabG(T) is a subgroup of G containing FixG(T).
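For a finite group the bijection in (3) gives |G| = |G · s| · |StabG(s)|, and all the subgroups defined in 4.4 can be listed by enumeration. The sketch below uses the natural action of the permutations of {0, 1, 2} on that set; the example and the helper names are illustrative assumptions.

```python
from itertools import permutations

G = list(permutations(range(3)))     # permutations of S = {0, 1, 2}, as tuples

def act(g, s):
    """The left action mu(g, s): apply the permutation g to the point s."""
    return g[s]

def orbit(s):
    return {act(g, s) for g in G}

def stabilizer(s):
    return [g for g in G if act(g, s) == s]

def fixer(T):
    return [g for g in G if all(act(g, t) == t for t in T)]

def set_stabilizer(T):
    # For a finite subset T, the condition mu(g, T) = T.
    return [g for g in G if {act(g, t) for t in T} == set(T)]

print(orbit(0), stabilizer(0))
print(fixer({0, 1}), set_stabilizer({0, 1}))
```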

(4.5) The conjugation action. Let G be a group. The conjugation action (or the adjoint action) of G on itself is the map

µ^{Ad}_G : G × G −→ G , µ^{Ad}_G(x, y) = AdG(x)(y) = x · y · x^{-1} .

Recall that AdG is defined in 2.19. In many cases, orbits and stabilizers for the conjugation action, or for the action on 2^G induced by the conjugation action, have received our attention before, where 2^G is the set of all subsets of G.


(1) The stabilizer subgroup of an element x ∈ G under the conjugation action is ZG(x), the centralizer subgroup of x in G.

(2) Each orbit of the conjugation action of G on itself is a conjugacy class in G. For each element x ∈ G, the map from G to the conjugacy class containing x, given by sending every element y ∈ G to the element AdG(y)(x) = y · x · y^{-1}, establishes a bijection from G/ZG(x) to the conjugacy class of x in G; this is a special case of 4.4 (3).

The decomposition of G into a disjoint union of conjugacy classes is a special case of the decomposition into orbits. When G is a finite group, counting the cardinality using this decomposition gives the following class equation for G:

|G| = |Z(G)| + ∑_{i∈I} |G| / |ZG(xi)| ,

where I is a finite set which parametrizes the set of all conjugacy classes of G which are not contained in Z(G) (i.e. have more than one element), and xi is an element of the conjugacy class Ci parametrized by i.

(3) Let H be a subgroup of G, which can be regarded as a subset of G. The stabilizer of the subset H ⊂ G under the conjugation action is the normalizer subgroup NG(H) of H, while the fixer subgroup of this subset is the centralizer subgroup ZG(H) = FixG(H).

We can also regard H as an element [H] of 2^G; the stabilizer subgroup of the element [H] ∈ 2^G under the action on 2^G induced by the conjugation action is StabG(H) = NG(H). We have a bijection from G/StabG(H) to the set of all subgroups of G conjugate to H, again as a special case of 4.4 (3).
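The class equation in (2) can be verified by brute force for a small group. The sketch below does this for the group of all permutations of {0, 1, 2, 3}, a group with 24 elements chosen only as an example; the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """The permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(4)))

def conjugacy_class(u):
    return frozenset(compose(compose(x, u), inverse(x)) for x in G)

def centralizer(u):
    return [x for x in G if compose(x, u) == compose(u, x)]

center = [z for z in G if len(centralizer(z)) == len(G)]
classes = {conjugacy_class(u) for u in G}
big_classes = [c for c in classes if len(c) > 1]       # classes not contained in the center

# Right-hand side of the class equation, with one representative x_i per class.
rhs = len(center) + sum(len(G) // len(centralizer(next(iter(c)))) for c in big_classes)

print(len(G), rhs, sorted(len(c) for c in classes))
```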

The following is an easy consequence of the class equation and induction.

(4.6) Proposition Let p be a prime number and G be a non-trivial finite p-group, i.e. |G| is a power of p with |G| > 1.

(i) The center Z(G) of G is non-trivial.

(ii) There exists a finite chain of subgroups {e} = G0 ⊊ G1 ⊊ G2 ⊊ · · · ⊊ Gm = G such that Gi ⊴ G for each i = 1, . . . , m and x · y · x^{-1} · y^{-1} ∈ G_{i−1} for all x ∈ G, all y ∈ Gi and all i = 1, . . . , m. In particular the quotient group Gi/G_{i−1} is abelian for all i = 1, . . . , m.

(4.7) Definition Let p be a prime number and let G be a finite group. A p-subgroup H in G is a Sylow p-subgroup of G if the index [G : H] of H in G is relatively prime to p. In other words, |H| is the highest power of p which divides |G|.

(4.8) Theorem (Sylow) Let G be a finite group. Let p be a prime number which divides |G|.


(1) There exists a Sylow p-subgroup H in G.

(2) Any two Sylow p-subgroups H1 and H2 of G are conjugate in G; i.e. there exists an element x ∈ G such that x · H1 · x^{-1} = H2.

(3) [G : NG(H)] ≡ 1 (mod p) for every Sylow p-subgroup H of G. In other words the number of Sylow p-subgroups has the form 1 + a · p for some a ∈ N.
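The three statements of 4.8 can be observed in a small case. The sketch below takes G to be the group of permutations of {0, 1, 2, 3} (so |G| = 24 = 8 · 3) and p = 3; this example is an illustrative assumption. A Sylow 3-subgroup then has exactly 3 elements, hence is generated by any of its non-identity elements, which is what makes the enumeration short. The same shortcut would not find the Sylow 2-subgroups, which have 8 elements and are not cyclic.

```python
from itertools import permutations

def compose(p, q):
    """The permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def generated(g):
    """The cyclic subgroup generated by g, as a frozenset."""
    e = tuple(range(len(g)))
    elems, x = {e}, g
    while x != e:
        elems.add(x)
        x = compose(x, g)
    return frozenset(elems)

def conjugate(x, H):
    """The subgroup x·H·x^{-1}."""
    return frozenset(compose(compose(x, h), inverse(x)) for h in H)

G = list(permutations(range(4)))
sylow3 = {generated(g) for g in G if len(generated(g)) == 3}   # all subgroups with 3 elements

H = next(iter(sylow3))
conjugates = {conjugate(x, H) for x in G}
normalizer = [x for x in G if conjugate(x, H) == H]

print(len(sylow3), conjugates == sylow3, len(G) // len(normalizer))
```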

§5. Rings, ideals and factorization

(5.1) Definition A ring is a quintuple (R, +, ·, 0, 1), where R is a set, 0 and 1 are elements of R, and + : R × R → R and · : R × R → R are two binary operations, satisfying the following conditions.

• (R, +, 0) is a commutative group.

• 1 ≠ 0.

• 1 · x = x · 1 = x for all x ∈ R.

• (associativity) (x · y) · z = x · (y · z) for all x, y, z ∈ R.

• (distributive laws) (x + y) · z = (x · z) + (y · z) and x · (y + z) = (x · y) + (x · z) for all x, y, z ∈ R.

A ring R is said to be commutative if x · y = y · x for all x, y ∈ R. We often suppress the “·” in formulas when there is no possible confusion. A subset S of a ring R is a subring of R if (S, +, 0) is a subgroup of (R, +, 0), 1 ∈ S, and x · y ∈ S for all x, y ∈ S.

(5.2) Examples of rings.

• Z, Q, R, C are commutative rings.

• R[x], C[x, y], Z[x], Q[x, y, z] are commutative rings.

• The matrix rings M_2(Z), M_3(Q), M_4(R), M_5(C) with the standard definition of addition and multiplication are non-commutative rings.

• The set C(R) of all R-valued continuous functions on R, with “+” and “·” given by the sum and product of values of functions, forms a commutative ring.

• The subset N of Z is not a subring of Z.

• Let R be a commutative ring. The set R[x] of all polynomials in a variable x with all coefficients in R is a ring, where the two operations “+” and “·” are given by the standard formulas.

• Let M be a commutative group. The set End_grp(M) of all group endomorphisms of M has a natural structure as a ring, where the multiplication is given by composition of endomorphisms. Such rings are usually non-commutative.

• Let V be a vector space over a field F. Then the set End_F(V) of all F-linear endomorphisms of V has a natural structure as a ring, where the multiplication is given by composition of endomorphisms. If dim_F(V) = n < ∞, then the ring End_F(V) is isomorphic to the matrix ring M_n(F); see 5.7 for the definition of ring isomorphisms.

(5.3) Definition Let R be a ring. An element u ∈ R is a unit of R if there exists an element v ∈ R such that u · v = v · u = 1. Note that such an element v is uniquely determined by the above condition; we say that v is the inverse of u. Denote by R^× the set of all units in R. It is easy to check that (R^×, ·, 1) forms a group; this group is commutative if R is.

Examples.

(i) Z× = {1,−1}

(ii) Let V be a vector space over a field F. We have End_F(V)^× = GL_F(V); this group is isomorphic to the matrix group GL_n(F) if dim_F(V) = n.

(iii) If R is an integral domain, then the group of units R[x]^× in the ring R[x] of all polynomials in one variable x with coefficients in R is equal to R^×. (Some hypothesis on R is needed: in (Z/4Z)[x] we have (1 + 2x)^2 = 1, so 1 + 2x is a unit of positive degree.)

(iv) (Z/8Z)^× ≅ (Z/2Z) × (Z/2Z).
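Example (iv) can be verified directly. The following sketch computes the units of Z/8Z from the definition and checks that every unit squares to 1; a group of order 4 with this property has no element of order 4, so it is a product of two groups of order 2. The function name `units` is chosen here for illustration.

```python
from math import gcd

def units(n):
    """Units of Z/nZ: residues u such that u*v ≡ 1 (mod n) for some v."""
    return [u for u in range(n) if any((u * v) % n == 1 for v in range(n))]

U8 = units(8)
# Cross-check: u is a unit modulo n exactly when gcd(u, n) = 1.
coprime = [u for u in range(8) if gcd(u, 8) == 1]
# Squares of all units; a single value 1 means every element has order at most 2.
squares = {(u * u) % 8 for u in U8}
```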

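(5.4) Group rings. Let G be a finite group and let R be a commutative ring. Let R[G] be the set of all formal sums of the form

$$\sum_{x \in G} a_x\,[x], \qquad a_x \in R \ \ \forall x \in G .$$

Define addition by adding coefficients:

$$\Big(\sum_{x \in G} a_x\,[x]\Big) + \Big(\sum_{x \in G} b_x\,[x]\Big) = \sum_{x \in G} (a_x + b_x)\,[x] ,$$

and define multiplication by

$$\Big(\sum_{x \in G} a_x\,[x]\Big) \cdot \Big(\sum_{x \in G} b_x\,[x]\Big) = \sum_{x \in G} \Big(\sum_{u \in G} a_u \cdot b_{u^{-1}x}\Big)[x] .$$

Then R[G] is a ring, and

$$G \longrightarrow R[G]^{\times}, \qquad x \mapsto [x]$$

is a group homomorphism.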
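The multiplication in R[G] is a convolution of coefficient functions. Here is a minimal sketch for R = Z and G = S_3, assuming an element of Z[G] is stored as a dictionary from group elements to non-zero coefficients; the names `add`, `mul` and `basis` are illustrative, not standard.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def add(a, b):
    """Add coefficients term by term, dropping zero coefficients."""
    out = {x: a.get(x, 0) + b.get(x, 0) for x in set(a) | set(b)}
    return {x: c for x, c in out.items() if c != 0}

def mul(a, b):
    """Sum of a_u * b_v [u v] over all u, v; the coefficient of [x] is
    the sum over u of a_u * b_{u^-1 x}, as in the convolution formula."""
    out = {}
    for u, au in a.items():
        for v, bv in b.items():
            x = compose(u, v)
            out[x] = out.get(x, 0) + au * bv
    return {x: c for x, c in out.items() if c != 0}

def basis(x):
    """The element [x] of the group ring."""
    return {x: 1}

G = list(permutations(range(3)))
s, t = (1, 0, 2), (0, 2, 1)          # two transpositions in S_3
a = add(basis(s), {t: -2})           # the element [s] - 2[t]
```

With these definitions [x] · [y] = [xy] for all x, y ∈ G, and [s] · [t] ≠ [t] · [s], so Z[S_3] is not commutative.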

(5.5) Definition Let R be a ring.

(a) A subgroup I of (R, +, 0) is a left ideal (resp. right ideal) if x · I ⊆ I for all x ∈ R (resp. if I · x ⊆ I for all x ∈ R). A subset I of a ring R is an ideal if it is both a left and a right ideal; in other words I is stable under both the right and the left multiplication by arbitrary elements of R.

(b) Let S be a subset of a ring R. The ideal (resp. left ideal, resp. right ideal) generated by S is the smallest ideal (resp. left ideal, resp. right ideal) of R which contains S; in other words it is the intersection of all ideals (resp. left ideals, resp. right ideals) in R which contain S. Explicitly, the left ideal generated by S is the subset of R consisting of all finite sums of the form

$$\sum_{j \in J} x_j s_j ,$$

where J is a finite indexing set, x_j ∈ R and s_j ∈ S for every j ∈ J. Similarly the ideal generated by S is the subset of R consisting of all finite sums of the form

$$\sum_{j \in J} x_j s_j y_j ,$$

with x_j, y_j ∈ R and s_j ∈ S for every element j in the finite indexing set J.

(5.6) Definition Let R be a ring and let I and J be ideals. The ideal I · J is the ideal generated by the subset {x · y | x ∈ I, y ∈ J} ⊂ R. If I and J are generated by subsets S, T ⊂ R respectively, then the ideal I · J is the subset of R consisting of all finite sums of the form

$$\sum_{k \in K} x_k \cdot s_k \cdot y_k \cdot t_k \cdot z_k ,$$

where K is a finite indexing set, x_k, y_k, z_k ∈ R, s_k ∈ S, t_k ∈ T for all k ∈ K.

(5.7) Definition Suppose (R, +_R, ·_R, 0_R, 1_R) and (S, +_S, ·_S, 0_S, 1_S) are rings. A ring homomorphism from the ring R to the ring S is a map h : R → S which respects the ring structure. In other words

• h(0R) = 0S,

• h(1R) = 1S,

• h(x+R y) = h(x) +S h(y) ∀x, y ∈ R, and

• h(x ·R y) = h(x) ·S h(y) ∀x, y ∈ R.

The kernel of a ring homomorphism h : R → S, denoted by Ker(h), is the subset of R consisting of all elements a ∈ R such that h(a) = 0_S; it is an ideal of R.

A ring homomorphism h : R → S as above is said to be a ring isomorphism if there exists a ring homomorphism h′ : S → R such that h′ ◦ h = id_R and h ◦ h′ = id_S.

(5.8) Remark Suppose that h : R→ S is a ring homomorphism as above.

(1) The image under h of a subring of R is a subring of S.

(2) The inverse image under h of a subring of S is a subring of R.

(3) The inverse image under h of an ideal (resp. left ideal, resp. right ideal) of S is an ideal (resp. left ideal, resp. right ideal) of R. [However if we replace “subring” by “ideal” (or left/right ideal) in the statement (1) above, the resulting statement is false.]

(5.9) Quotient rings. Let I be a proper ideal of a ring R (i.e. I ≠ R). Since (R, +, 0) is a commutative group, we can consider the quotient group R/I for the addition, whose elements are subsets of R of the form a + I with a ∈ R. Consider the natural surjective map

π : R −→ R/I , x ↦ x + I .

It turns out that the condition that I is an ideal implies that there is a (necessarily unique) ring structure on R/I such that π is a ring homomorphism. (Check that the map

(R/I) × (R/I) −→ R/I , ((a + I), (b + I)) ↦ ab + I ∀a, b ∈ R

is well-defined.)

Important property: There is a natural bijection between ideals of the quotient ring R/I and ideals of R containing I: To any ideal J of R/I, associate the ideal π^−1(J) of R. Conversely, to every ideal J of R which contains I, associate the ideal π(J) of R/I. The same statements hold if we replace “ideal(s)” by “left ideal(s)” in the above; similarly for right ideals.
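The bijection can be made concrete for R = Z and I = 12Z. The ideals of Z containing 12Z are the ideals dZ with d a divisor of 12, and the sketch below checks that their images are exactly the ideals of Z/12Z. It is only an illustration; it relies on the fact that every ideal of Z/nZ is principal, and the function names are chosen here.

```python
def ideals_of_Zn(n):
    """All ideals of Z/nZ, each stored as a frozenset of residues.
    Every ideal of Z/nZ is principal, so it suffices to run over generators."""
    return {frozenset((g * x) % n for x in range(n)) for g in range(n)}

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

n = 12
ideals = ideals_of_Zn(n)
# Image in Z/nZ of the ideal dZ of Z, for each divisor d of n (i.e. dZ ⊇ nZ).
images = {frozenset((d * k) % n for k in range(n)) for d in divisors(n)}
```

Both sets have 6 elements, one for each divisor 1, 2, 3, 4, 6, 12 of 12.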

(5.10) Definition (1) A division ring is a ring R such that for every element 0 ≠ x ∈ R, there exists an element y ∈ R such that x · y = y · x = 1; in other words every non-zero element of R is a unit of R.

(2) A field is a commutative division ring.

(3) A field F is said to be algebraically closed if and only if every non-constant polynomial f(x) ∈ F[x] with coefficients in F has at least one root in F; equivalently, every non-constant polynomial in F[x] is a product of linear polynomials (i.e. polynomials of degree one).

(5.11) Examples of fields and division rings.

• The rings Q, R, C are fields.

• Z is not a field.

• The polynomial rings R[x], Z[x, y] are not fields.

• The ring H of Hamiltonian quaternions is a 4-dimensional vector space over R with 1, i, j, k as an R-basis, where 1 is the unity element for the multiplication, and i, j, k are elements of H with the following properties:

i^2 = j^2 = k^2 = −1, i · j = k = −j · i, j · k = i = −k · j, k · i = j = −i · k

An easy calculation shows that the inverse of a non-zero element a + bi + cj + dk in H with a, b, c, d ∈ R is (a^2 + b^2 + c^2 + d^2)^−1 · (a − bi − cj − dk), so H is a non-commutative division ring.

• The matrix rings M2(R), M3(Z) are not division rings.

• The fields Q and R are not algebraically closed. The field C of all complex numbers is algebraically closed. The last statement is the famous Fundamental Theorem of Algebra, first proved by Gauss in the 18th century.

• The set of all complex numbers z which are roots of some non-constant polynomial f(x) ∈ Q[x] with coefficients in Q is a subfield of C, called the field of all algebraic numbers and denoted by Q^alg.
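The multiplication rules and the inverse formula for the quaternions in the list above can be checked mechanically. The sketch below assumes a quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d), with rational coefficients for exact arithmetic; `qmul` and `qinv` are names chosen here.

```python
from fractions import Fraction

def qmul(p, q):
    """Product of quaternions (a, b, c, d) = a + bi + cj + dk,
    expanded using i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(q):
    """Inverse of a non-zero quaternion: conjugate divided by a^2 + b^2 + c^2 + d^2."""
    a, b, c, d = q
    n = Fraction(a*a + b*b + c*c + d*d)
    return (a / n, -b / n, -c / n, -d / n)

one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
q = (1, 2, 3, 4)
```

For instance `qmul(i, j)` is k while `qmul(j, i)` is −k, and `qmul(q, qinv(q))` is the unity element.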

(5.12) Definition An integral domain is a commutative ring such that x · y ≠ 0 if x ≠ 0 and y ≠ 0.

It is clear that every field is an integral domain.

Examples.

• Z, Q, R, C are integral domains.

• Z/6Z is not an integral domain: the product of the non-zero elements 2 + 6Z and 3 + 6Z is zero.

• Every subring of an integral domain is an integral domain. In particular every subring of a field is an integral domain.

• Let R be an integral domain. Then the polynomial ring R[x] consisting of all polynomials in a variable x with all coefficients in R is an integral domain.

(5.13) Definition Let R_1, . . . , R_n be rings. The product set R_1 × · · · × R_n, consisting of all n-tuples (x_1, x_2, . . . , x_n) such that x_i ∈ R_i for all i = 1, . . . , n, has a natural ring structure, with addition and multiplication given coordinate-by-coordinate. We call it the product ring of R_1, . . . , R_n; the zero element (for addition) is (0_{R_1}, . . . , 0_{R_n}), while the unity element for multiplication is (1_{R_1}, . . . , 1_{R_n}).

Remark (i) If R_1, . . . , R_n are commutative and n ≥ 2, then the product ring R_1 × · · · × R_n is not an integral domain. For instance (1, 0, . . . , 0) · (0, 1, 0, . . . , 0) = (0, 0, . . . , 0).

(ii) Each projection pr_i : R_1 × · · · × R_n → R_i is a ring homomorphism. However the “inclusion map” ι_i : R_i → R_1 × · · · × R_n, which sends every element x ∈ R_i to the n-tuple whose i-th component is x and whose other components are 0, is not a ring homomorphism if n ≥ 2: 1_{R_i} is not sent to the unity element of the product ring.

(5.14) Definition Let I be an ideal in a commutative ring R.

(i) I is a maximal ideal if I ≠ R and the only ideals of R containing I are I and R. Equivalently, the quotient ring R/I is a field.

(ii) I is a prime ideal if I ≠ R and for all elements x, y ∉ I, their product x · y is again not in I. Equivalently, the quotient ring R/I is an integral domain.

It is clear from the two alternative definitions that every maximal ideal is a prime ideal.

The equivalence of the two definitions in (i) comes from the following easy fact: A commutative ring is a field if and only if it has exactly two ideals, (0) and the whole ring itself.

Examples.

(i) In the ring Z[x], {0}, (3), (x), (3, x) are all prime ideals; among them only (3, x) is a maximal ideal.

(ii) The finite ring Z/100Z has only two prime ideals, 2Z/100Z and 5Z/100Z; both are also maximal ideals.

(iii) In the ring R = ℝ[x, y, z]/(x^2 + y^2 + z^2), where ℝ denotes the field of real numbers, the principal ideal generated by the image of x is a prime ideal, but is not a maximal ideal. The ideal m := (x, y − 1) is maximal and the quotient R/m is isomorphic to C.
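Example (ii) is small enough to confirm by exhaustive search. The sketch below assumes that the ideals of Z/100Z are the images dZ/100Z for the divisors d of 100, and tests the definition of a prime ideal directly; the function names are illustrative.

```python
def ideal(d, n):
    """The ideal dZ/nZ of Z/nZ, as a frozenset of residues."""
    return frozenset((d * k) % n for k in range(n))

def is_prime_ideal(I, n):
    """I is proper, and x*y in I implies x in I or y in I (brute force)."""
    if len(I) == n:
        return False
    return all(x in I or y in I
               for x in range(n) for y in range(n) if (x * y) % n in I)

n = 100
prime_gens = [d for d in range(1, n + 1)
              if n % d == 0 and is_prime_ideal(ideal(d, n), n)]
```

The search returns the generators 2 and 5 only. The corresponding quotients have 2 and 5 elements respectively, so they are fields and both ideals are maximal.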

Notation. Let R be a commutative ring. Denote by Spec(R) the set of all prime ideals of R, called the spectrum of R. The subset of Spec(R) consisting of all maximal ideals is called the maximal spectrum of R and denoted by MaxSpec(R).

(5.15) Remark Suppose that R_1, . . . , R_n are commutative rings.

(i) There is a bijection α from the disjoint union of the spectra Spec(R_i) to the spectrum Spec(∏_{i=1}^n R_i) of the product ring, which sends a prime ideal ℘ in Spec(R_i) to pr_i^−1(℘), the inverse image of ℘ in the product ring under the i-th projection homomorphism pr_i : ∏_{i=1}^n R_i −→ R_i.

(ii) The bijection α in (i) induces a bijection from the disjoint union of the maximal spectra MaxSpec(R_i) to the maximal spectrum MaxSpec(∏_{i=1}^n R_i) of the product ring.

(5.16) Definition A principal ideal domain (PID) is an integral domain such that every ideal is principal, i.e. can be generated by one element.

Examples.

• Z is a principal ideal domain.

• If F is a field, then the polynomial ring F[x] is a principal ideal domain. (Exercise: use the division algorithm to prove this assertion.)

• Z[x] is not a principal ideal domain. Similarly Q[x, y] is not a principal ideal domain.

• The ring of Gaussian integers Z[√−1], consisting of all complex numbers of the form a + b√−1 with a, b ∈ Z, is a principal ideal domain.

(5.17) Definition (Euclidean domains) An integral domain R is a Euclidean domain if there is a function σ : R − {0} → N such that a “division algorithm” holds, namely: For every element a ∈ R and every non-zero element b ∈ R − {0}, there are elements q, r ∈ R such that

a = q · b + r , σ(r) < σ(b) if r ≠ 0 .
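For the Gaussian integers Z[√−1] one can take σ(a + b√−1) = a^2 + b^2 and obtain q by rounding the exact quotient a/b to the nearest Gaussian integer. The following is a sketch of that standard argument, with a Gaussian integer stored as a pair of integers; the function names are chosen here for illustration.

```python
def norm(z):
    """sigma(a + b*sqrt(-1)) = a^2 + b^2."""
    a, b = z
    return a * a + b * b

def mul(z, w):
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def nearest(n, d):
    """Nearest integer to n/d for d > 0, computed as floor(n/d + 1/2)."""
    return (2 * n + d) // (2 * d)

def divmod_gauss(a, b):
    """Return (q, r) with a = q*b + r and norm(r) < norm(b), for b != 0."""
    nb = norm(b)
    t = mul(a, (b[0], -b[1]))      # a * conjugate(b), so that a/b = t / nb
    q = (nearest(t[0], nb), nearest(t[1], nb))
    qb = mul(q, b)
    r = (a[0] - qb[0], a[1] - qb[1])
    return q, r

q, r = divmod_gauss((27, 23), (8, 1))
```

Each coordinate of a/b − q has absolute value at most 1/2, so σ(r) ≤ σ(b)/2 < σ(b). In the example the quotient is 4 + 2√−1 and the remainder is −3 + 3√−1, of norm 18 < 65.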

Remark (i) It is easy to see that every Euclidean domain is a PID.

(ii) Examples of Euclidean domains do not come in abundance. The better-known examples of Euclidean domains include Z, polynomial rings F[x] over a field F, and the ring of Gaussian integers Z[√−1]. These are also the better-known examples of PID’s.

(iii) There are examples of PID’s which are not Euclidean domains, but it takes effort in these examples to prove the non-existence of Euclidean algorithms.

(iv) For an algebraic number field K, the ring O_K of algebraic integers in K is a PID if and only if the class number of K is equal to one.[2] Algebraic integers and class numbers are typically explained in books on algebraic number theory.

(5.18) In the ring Z of integers, the relation “a | b” (a divides b) can be better thought of as a relation between ideals: (a) ⊃ (b). (Here (a) is the general notation for the ideal (of Z in the present case) generated by a, i.e. the ideal aZ.) The same statement holds for any commutative ring. For principal ideal domains the familiar elementary concepts for the arithmetic of Z can be naturally generalized, and are usually best thought of in terms of ideals. We illustrate this point below.

[2] An algebraic number field is a subfield K of C consisting of algebraic numbers, i.e. roots of non-constant polynomials in Q[x], which is finite dimensional as a vector space over Q. The subring O_K of K, consisting of all elements of K which are roots of some monic polynomial in Z[x], is called the ring of algebraic integers in K. Two non-zero ideals I_1, I_2 in O_K are said to be equivalent if there exists an element a ∈ K^× such that a · I_1 = I_2. A theorem of Dirichlet asserts that there are only a finite number of equivalence classes of non-zero ideals in O_K; that number is called the class number of K.

Unique factorization in a PID. One formulation of this property for a PID is the following:

For every non-zero proper ideal I, there exist a positive integer m, mutually distinct maximal ideals ℘_1, . . . , ℘_m and positive integers e_1, . . . , e_m such that

$$I = \prod_{j=1}^{m} \wp_j^{e_j} .$$

Moreover the positive integer m is uniquely determined by I, and the m-tuples (℘_1, . . . , ℘_m) and (e_1, . . . , e_m) are uniquely determined up to permutation.

gcd and lcm in a PID. The familiar concepts “greatest common divisor” and “least common multiple” in Z can also be formulated in terms of ideals:

(gcd(a, b)) = (a, b) (= aZ + bZ), (lcm(a, b)) = (a) ∩ (b) (= aZ ∩ bZ) .

It is clear how to extend the concepts of gcd and lcm to the context of principal ideal domains. Here is a useful special case.

Suppose that F is a field and f_1(t), . . . , f_m(t) ∈ F[t] are polynomials such that there is no (non-constant) irreducible polynomial p(t) which divides all the f_i(t)’s. Then there exist polynomials g_1(t), . . . , g_m(t) ∈ F[t] such that ∑_{i=1}^m g_i(t) f_i(t) = 1.

Proof. The gcd of the elements f_i(t) for i = 1, . . . , m is equal to 1, which means that the ideal of F[t] generated by the elements f_i(t) is equal to the whole ring F[t].
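In the PID Z the identity (gcd(a, b)) = aZ + bZ is made explicit by the extended Euclidean algorithm, which produces the coefficients s, t with s · a + t · b = gcd(a, b). The sketch below treats the integer case only; the same scheme works in F[t] with polynomial division. The function name is chosen here for illustration.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:                      # normalize the generator to be non-negative
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

a, b = 240, 46
g, s, t = extended_gcd(a, b)
lcm = abs(a * b) // g                  # non-negative generator of (a) ∩ (b)
```

Here g = 2 with 2 = (−9) · 240 + 47 · 46, so 2 ∈ 240Z + 46Z, and the ideal 240Z ∩ 46Z is generated by 5520.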

(5.19) Definition Two non-zero elements a, b in a principal ideal domain R are said to be relatively prime if the ideal (a, b) = aR + bR generated by a and b is equal to R.

(5.20) Definition Let F be a field, and let F[x] be the ring of all polynomials in one variable x with coefficients in F.

(i) Let f(x) = a_d x^d + a_{d−1} x^{d−1} + · · · + a_1 x + a_0 be a polynomial in F[x]. The derivative of f, denoted by f′(x), is the polynomial

$$f'(x) := d\,a_d\,x^{d-1} + (d-1)\,a_{d-1}\,x^{d-2} + \cdots + a_1 \in F[x] .$$

(ii) A polynomial f(x) ∈ F[x] is separable if it is relatively prime to its derivative f′(x); in other words the ideal of F[x] generated by f(x) and its derivative f′(x) is equal to F[x].

Suppose that E is an algebraically closed field which contains F as a subfield. Then an element f(x) ∈ F[x] is separable if and only if the polynomial f(x) does not have multiple roots in E.

A simple example of an inseparable polynomial: Let F = F_p(t), the fraction field of F_p[t]; see 5.23 example (2). The polynomial x^p − t ∈ F_p(t)[x] is not separable, because its derivative p · x^{p−1} is zero.
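The separability criterion is easy to test by computing gcd(f, f′) with the Euclidean algorithm. The sketch below works over F = Q only (so it cannot reproduce the characteristic p example above) and assumes a polynomial is stored as its list of coefficients [a_0, a_1, . . . , a_d]; all function names are illustrative.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f):
    return trim([i * c for i, c in enumerate(f)][1:])

def poly_mod(f, g):
    """Remainder of f on division by a non-zero polynomial g, over Q."""
    f, g = trim(f), trim(g)
    while len(f) >= len(g):
        factor = Fraction(f[-1]) / g[-1]
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[i + shift] -= factor * c   # cancels the leading term of f exactly
        f = trim(f)
    return f

def poly_gcd(f, g):
    """A generator of the ideal (f, g) of Q[x] (not normalized to be monic)."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, poly_mod(f, g)
    return f

def is_separable(f):
    """f is separable iff gcd(f, f') is a non-zero constant."""
    return len(poly_gcd(f, derivative(f))) == 1
```

For example x^2 − 2 is separable, while (x − 1)^2 = x^2 − 2x + 1 is not: its gcd with its derivative is a constant multiple of x − 1.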

(5.21) Proposition Let R be a commutative ring and let I be a proper ideal of R (i.e. I ≠ R). Then there exists a maximal ideal M of R which contains I.

Prop. 5.21 follows very quickly from Zorn’s Lemma; see 1.10 or Lemma 1.9 in the Appendix of Artin’s book for the statement of Zorn’s lemma. Here is a sketch of the proof of 5.21.

Consider the set 𝒥 consisting of all ideals J of R which contain I but do not contain the unity element 1. The inclusion relation is a partial ordering on 𝒥. Clearly 𝒥 is non-empty, since I ∈ 𝒥. Let’s check that every non-empty totally ordered subset T ⊆ 𝒥 has an upper bound in 𝒥: The union I′ = ∪_{J∈T} J is an ideal which contains every element J ∈ T, and I′ does not contain 1, so I′ is an upper bound of T in 𝒥. By Zorn’s lemma the partially ordered set 𝒥 contains a maximal element M, which is nothing but a maximal ideal of R containing I.

(5.22) Theorem (Nullstellensatz) Let I = (f_1, . . . , f_r) be an ideal in the polynomial ring C[x_1, . . . , x_n] generated by polynomials f_i(x_1, . . . , x_n) ∈ C[x_1, . . . , x_n] for i = 1, . . . , r. Let V ⊆ C^n be the subset of common zeroes of the polynomials f_1, . . . , f_r, i.e.

V = {(z_1, . . . , z_n) ∈ C^n | f_i(z_1, . . . , z_n) = 0 ∀i = 1, . . . , r} .

If g(x_1, . . . , x_n) is an element of C[x_1, . . . , x_n] such that g(z_1, . . . , z_n) = 0 for all (z_1, . . . , z_n) ∈ V, then some power of g(x_1, . . . , x_n) is in the ideal I.

This is the classical form of Hilbert’s Nullstellensatz; see Theorem 8.7 of Chapter 10 of Artin’s book. The same statement holds if the field C is replaced by an arbitrary algebraically closed field F.

(5.23) Definition Let R be an integral domain. A field of fractions for R is an injective ring homomorphism j : R ↪ F from R to a field F such that F is the smallest field containing the image j(R) of R; equivalently, for every element x ∈ F, there exists a non-zero element b ∈ R such that j(b) · x ∈ j(R).

A field of fractions of an integral domain R can be constructed as the set of all equivalence classes on R × (R − {0}), modulo the following equivalence relation:

(a, b) ∼ (a′, b′) ⇐⇒ a · b′ = b · a′ ∀ a, b, a′, b′ ∈ R, b ≠ 0, b′ ≠ 0

[Think of the equivalence class containing (a, b) as the element j(a) · j(b)^−1 in the fraction field F.] Below is a universal property for a/the field of fractions of an integral domain R:

For any injective ring homomorphism ι : R → K from R to a field K, there exists a unique field homomorphism α : F → K such that ι = α ◦ j.

Examples.

(1) Q is the fraction field of Z.

(2) The fraction field of a polynomial ring F[x] over a field F, denoted by F(x) and called the field of rational functions over F in one variable x, consists of fractions of the form f(x)/g(x) with f(x), g(x) ∈ F[x], g(x) ≠ 0, modulo the usual equivalence relation

$$\frac{f_1(x)}{g_1(x)} = \frac{f_2(x)}{g_2(x)} \quad \text{if and only if} \quad f_1(x) \cdot g_2(x) = f_2(x) \cdot g_1(x) .$$

(3) More generally, if R is an integral domain with fraction field F, then the fraction field of R[x] is naturally isomorphic to F(x).

(5.24) Definition Let F be a field. Let h : Z −→ F be the (only) ring homomorphism from Z to F.

(i) If Ker(h) ≠ (0), then Ker(h) = pZ for some prime p; in this case we say that F has characteristic p.

(ii) If Ker(h) = (0), then we say that F has characteristic 0.

(iii) The prime subfield of F is the smallest subfield contained in F. It is isomorphic to F_p := Z/pZ if F has characteristic p > 0, and isomorphic to Q if F is of characteristic 0.

(5.25) Definition Let R be an integral domain.

(i) A non-zero element a ∈ R is irreducible if a ∉ R^× and a cannot be factored non-trivially; i.e. if a = b · c, b, c ∈ R, then b ∈ R^× or c ∈ R^×.

(ii) Let x be a non-zero element in R which is not a unit, i.e. x ∉ R^×. We say that unique factorization holds for x if the following holds.

(a) There exist an element u ∈ R^× and irreducible elements y_1, . . . , y_m ∈ R, m > 0, such that x = u · ∏_{i=1}^m y_i.

(b) Suppose that x = v · ∏_{j=1}^n z_j, where n ∈ N_{>0}, v ∈ R^× is a unit and z_j is an irreducible element in R for j = 1, . . . , n. Then m = n, and there exist a permutation σ ∈ S_m and units u_1, . . . , u_m ∈ R^× such that z_i = u_i · y_{σ(i)} for i = 1, . . . , m.

(iii) We say that R is a unique factorization domain (UFD) if unique factorization holds for every non-zero element of R which is not a unit.

Remark (1) Let P be the set of all principal ideals in R which are not equal to R, partially ordered by inclusion. A non-zero element x ∈ R which is not a unit is irreducible in R if and only if the principal ideal (x) is a maximal element in P.

(2) Property (ii)(a) holds for all elements 0 ≠ x ∉ R^× in R (existence of factorization) if and only if every increasing chain in P stabilizes after a finite number of steps.

(3) Property (ii)(b) holds for all elements 0 ≠ x ∉ R^× in R (uniqueness of factorization) if and only if the principal ideal generated by any irreducible element x ∈ R is a prime ideal in R.

(4) We can define gcd and lcm in a UFD R. It is better to think of these concepts in terms of principal ideals. For instance the gcd of a finite number of elements a_1, . . . , a_m in R, not all equal to 0, is the smallest principal ideal (b) which contains (a_i) for all i = 1, . . . , m. Similarly, the lcm of a finite number of non-zero elements a_1, . . . , a_m in R is the largest principal ideal (c) which is contained in (a_i) for all i = 1, . . . , m.

(5) An equivalent statement of conditions (a), (b) above is: the principal ideal (x) can be factored as a product of principal ideals generated by irreducible elements in an essentially unique way.

(5.26) An example of an integral domain which is not a UFD.

Let R = Z + √−5 Z = {a + b√−5 ∈ C | a, b ∈ Z}, a subring of C, hence an integral domain. Note that the function σ : R → N which sends any element a + b√−5 ∈ R to a^2 + 5b^2 satisfies σ(x · y) = σ(x) · σ(y). Using the function σ we see immediately that R^× = {±1}. It is also easy to check that 3 and 7 are irreducible elements in R: Suppose that 3 = (a + b√−5)(c + d√−5), a, b, c, d ∈ Z, a + b√−5 ≠ ±1, c + d√−5 ≠ ±1. Evaluating the function σ using this decomposition, we get 9 = (a^2 + 5b^2) · (c^2 + 5d^2) with both factors greater than 1, hence a^2 + 5b^2 = 3, which has no integer solutions; this is a contradiction. The same argument works for 7, because a^2 + 5b^2 = 7 has no integer solutions either.

We have two factorizations

21 = 3 · 7 = (4 + √−5) · (4 − √−5)

of the non-zero element 21 ∈ R. However neither 4 + √−5 nor 4 − √−5 is divisible by the irreducible element 3. Therefore R is not a UFD.
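The arithmetic behind this example can be checked by machine. The sketch below assumes an element a + b√−5 is stored as the pair (a, b) and uses the multiplicative function a^2 + 5b^2; the function names are chosen here for illustration.

```python
def norm(a, b):
    """The multiplicative function a^2 + 5*b^2 on Z[sqrt(-5)]."""
    return a * a + 5 * b * b

def mul(x, y):
    """(a + b*sqrt(-5)) * (c + d*sqrt(-5)) as a pair of integers."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)

def elements_of_norm(n):
    """All pairs (a, b) with a^2 + 5*b^2 == n, found by a bounded search."""
    bound = int(n ** 0.5) + 1
    return [(a, b) for a in range(-bound, bound + 1)
                   for b in range(-bound, bound + 1) if norm(a, b) == n]

def divisible_by_3(x):
    """3 divides a + b*sqrt(-5) in the ring iff 3 divides both a and b."""
    return x[0] % 3 == 0 and x[1] % 3 == 0
```

There are no elements of norm 3 or 7, the only elements of norm 1 are ±1, and `mul((4, 1), (4, -1))` equals `mul((3, 0), (7, 0))`, namely 21, while 3 divides neither 4 + √−5 nor 4 − √−5.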

(5.27) Remark The failure of unique factorization for the ring R above turns out to be of minor nature when one examines it from the vantage point of factorization of ideals. The truth is that every non-zero ideal of R can be factored into a product of maximal ideals in an essentially unique way. However we get into the unpleasant situation of non-uniqueness of factorization if we insist on using only principal ideals for factoring.

To see what is really happening in the above example, consider the following ideals in R,

P_1 := (3, 4 + √−5), P_2 := (3, 4 − √−5), Q_1 := (7, 4 + √−5), Q_2 := (7, 4 − √−5).

One can verify that P_1, P_2, Q_1, Q_2 are maximal ideals in R, and we have the following factorizations

(3) = P_1 · P_2, (7) = Q_1 · Q_2, (4 + √−5) = P_1 · Q_1, (4 − √−5) = P_2 · Q_2

of ideals in R, which completely “explain” the non-uniqueness of the two factorizations of the element 21.

Ideals were introduced by Kummer to salvage the unique factorization property for rings such as R above. He called them ideal numbers, which is the historical origin of the name “ideals”.

(5.28) Proposition Let R be a UFD, and let R[x] be the ring of all polynomials with coefficients in R. Recall that the degree of a non-zero element a_0 + a_1 x + . . . + a_d x^d is d if a_d ≠ 0. Recall also that (R[x])^× = R^×.

(1) R[x] is a UFD.

(2) Every irreducible element of R is an irreducible element of R[x].

(3) Every element f(x) = a_0 + a_1 x + · · · + a_d x^d of positive degree d ≥ 1 such that gcd(a_0, . . . , a_d) = (1) and such that f(x) is irreducible in F[x], where F is the fraction field of R, is an irreducible element in R[x].

(4) Every irreducible element in R[x] is of the form described in (2) or (3) above.

Remark It is easy to see that if R is a ring and R[x] is a UFD, then R is a UFD.

(5.29) Corollary The following rings are UFDs.

(i) Z, Z[x], Z[x1, . . . , xn]

(ii) F , F [x], F [x1, . . . , xn], where F is a field.

(iii) Z[√−1], Z[√−1][x_1, . . . , x_n].

§6. Modules

(6.1) Definition A left module over a ring (R, +_R, ·_R, 0_R, 1_R), or a left R-module, is a quadruple (M, +_M, µ, 0_M), where (M, +_M, 0_M) is a commutative group, and

µ : R×M −→M

is a map satisfying the following properties.

• µ(a, x+M y) = µ(a, x) +M µ(a, y) for all a ∈ R and all x, y ∈M ,

• µ(a+R b, x) = µ(a, x) +M µ(b, x) for all a, b ∈ R and all x ∈M ,

• µ(a, µ(b, x)) = µ(a ·R b, x) for all a, b ∈ R and all x ∈M .

• µ(1, x) = x for all x ∈M .

• µ(0, x) = 0_M for all x ∈ M.

Remark (1) We usually suppress the symbol “µ” and write a · x or ax for µ(a, x) when no confusion is possible.

(2) Right modules are defined in a similar way.

(3) There is no difference between left and right R-modules if R is a commutative ring; then we will suppress “left” or “right” and only say “R-modules”.

(4) A left R-module structure on an abelian group M is the same as a ring homomorphism R −→ End_grp(M).

Examples.

1. Every abelian group can be regarded as a module over Z. In other words a Z-module is nothing other than an abelian group.

2. A module over a field F is the same as a vector space over F .

(6.2) Definition Let F be a field and let G be a finite group. Then a module over the group ring F[G] of the group G over F is the same as an F-linear action of G on an F-vector space V, or equivalently a homomorphism from G to the linear group GL_F(V). Any of the three equivalent concepts will be called a linear representation of G.

The correspondence between the three equivalent notions can be described as follows. Let V be the underlying vector space of the linear representation. Let µ : F[G] × V −→ V be the left F[G]-module structure on V, let ν : G × V −→ V be the corresponding left linear G-action on V, and let ρ : G → GL_F(V) be the corresponding group homomorphism. Recall that a typical element of F[G] is a formal sum ∑_{x∈G} a_x · [x] with “coefficients” a_x ∈ F. Then we have

ρ(x)(v) = ν(x, v) = µ(1 · [x], v) ∀x ∈ G ∀ v ∈ V

and

$$\mu\Big(\sum_{x \in G} a_x \cdot [x],\ v\Big) = \sum_{x \in G} a_x \cdot \nu(x, v) = \sum_{x \in G} a_x \cdot \rho(x)(v)$$

for all elements ∑_{x∈G} a_x · [x] ∈ F[G] and all v ∈ V.

(6.3) Another important class of examples comes from linear algebra. Suppose that V is a finite dimensional vector space over a field F and T ∈ End_F(V) is an F-linear operator on V. Then V has a structure as a module over the polynomial ring F[x], with the module structure given by

f(x) · v := f(T)(v) , ∀ f(x) ∈ F[x], ∀ v ∈ V .

Here f(T)(v) := ∑_{i=0}^m a_i T^i(v) if f(x) = ∑_{i=0}^m a_i x^i. Equivalently, there is a unique F-linear ring homomorphism h_T : F[x] → End_F(V) such that h_T(x) = T. The “pull-back” of the tautological End_F(V)-module structure on V by the ring homomorphism h_T gives V an F[x]-module structure.

Assume that V is finite dimensional over F. Then dim_F(End_F(V)) = dim_F(V)^2 < ∞, and the kernel Ker(h_T) of the ring homomorphism h_T : F[x] → End_F(V) defined in 6.3 is non-zero, because F[x] is infinite dimensional over F. Therefore there exists a unique non-constant monic polynomial g(x) ∈ F[x] which generates the ideal Ker(h_T).

(6.4) Definition Notation as above. The generator g(x) of the kernel Ker(h_T) of the ring homomorphism h_T : F[x] → End_F(V) which sends x to T is called the minimal polynomial of the linear operator T.

Among monic polynomials f(x) in F[x] such that f(T)(v) = 0 for all v ∈ V, the minimal polynomial is the one with the smallest degree.
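The module structure f(x) · v := f(T)(v) and the notion of minimal polynomial can be illustrated with small integer matrices. This is a sketch only: it assumes T is given as a square matrix (a list of rows) and f as its coefficient list [a_0, . . . , a_m], and the function names are chosen here.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at(coeffs, T):
    """f(T) = a0*I + a1*T + ... + am*T^m, evaluated by Horner's rule."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, T)
        for i in range(n):
            result[i][i] += a
    return result

def act(coeffs, T, v):
    """The module action f(x) . v := f(T)(v)."""
    M = poly_at(coeffs, T)
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

N = [[0, 1], [0, 0]]      # nilpotent operator
S = [[2, 0], [0, 2]]      # scalar operator 2*I
zero = [[0, 0], [0, 0]]
```

For N the polynomial x does not act as zero but x^2 does, so the minimal polynomial of N is x^2. For S the polynomial x − 2 already acts as zero, so the minimal polynomial x − 2 has smaller degree than the characteristic polynomial (x − 2)^2.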

(6.5) Definition An R-submodule of a left R-module M is a subset M′ of M such that M′ is a subgroup of (M, +_M, 0_M) and a · x ∈ M′ for all a ∈ R and all x ∈ M′.

(6.6) Definition The R-submodule generated by a subset S ⊂ M of M is the smallest R-submodule of M which contains S; in other words it is the intersection of all R-submodules of M which contain S. More explicitly it is the subset consisting of all elements which can be written as a finite sum

$$\sum_{i \in I} x_i \cdot s_i , \qquad x_i \in R,\ s_i \in S \ \ \forall i \in I .$$

A left R-module M is of finite type if M can be generated by a finite subset of M .

(6.7) Definition Let R be a ring and M,N be left R-modules.

(1) A module homomorphism from M to N is a map h : M → N such that h is a group homomorphism for the abelian groups underlying M and N, and h(a · x) = a · h(x) for all a ∈ R and all x ∈ M.

(2) The kernel of a module homomorphism h : M → N is the subset of M, denoted by Ker(h), consisting of all elements x ∈ M such that h(x) = 0 in N. It is a left R-submodule of M.

(3) Denote by Hom_R(M, N) the set of all left R-module homomorphisms from M to N; it has a natural structure as a commutative group, with group law given by addition. When R is commutative, Hom_R(M, N) has a natural structure as an R-module.

(4) We write EndR(M) for HomR(M,M).

(6.8) Examples.

(1) Let R be a ring. We can consider R as a left R-module using the multiplication law inR. Then left R-submodules of this left R-module R are exactly the left ideals of R.

(2) Let R be a ring, let M be a left R-module and let I be a set. Denote by M^I the set of all maps f : I → M; it has a natural left R-module structure, where the sum of two elements f and g is the map i ↦ f(i) + g(i) ∀i ∈ I, and the module structure is given by

(a · f)(i) = a · f(i) ∀a ∈ R, ∀f ∈ M^I, ∀i ∈ I .

This left R-module M^I is called the direct product of copies of M indexed by I.

[Note that when M = R, R^I also has a natural right R-module structure, where the product (f · b) of an element f ∈ R^I by an element b ∈ R on the right is defined to be (f · b)(i) = f(i) · b for all i ∈ I. Moreover this right R-module structure is compatible with the previous left R-module structure, in the sense that (a · f) · b = a · (f · b) for all a ∈ R, all f ∈ R^I and all b ∈ R. The standard terminology is “R^I is an (R, R)-bimodule”. We will leave the formal definition of the notion of (R_1, R_2)-bimodules and their elementary properties to the reader as we will not use this notion.]

(3) Let R be a ring, M be a left R-module and let I be a set. The direct sum of copies of M indexed by I is the R-submodule of M^I consisting of all elements f ∈ M^I such that there exists a finite subset J ⊂ I (which may depend on f) with the property that f(i) = 0 for all i ∉ J.

We write M^{⊕I} for the direct sum; when I = {1, . . . , n} for some natural number n ∈ N, we also write M^{⊕n}.

(4) A free left R-module is a left module isomorphic to the left R-module R^{⊕I} for some set I. When I = {1, . . . , n} for some n ∈ N, the free module R^{⊕I} is written R^{⊕n}, or R^n for short.

(5) Let I be an indexing set. Let R be a ring, and let {M_i | i ∈ I} be a family of left R-modules indexed by I. Let ⊔_{i∈I} M_i be the formal disjoint union of the M_i’s. The direct product ∏_{i∈I} M_i of the M_i’s is the set of all maps f : I −→ ⊔_{i∈I} M_i such that f(i) ∈ M_i for all i ∈ I. The direct sum ∐_{i∈I} M_i of the M_i’s is the subset of ∏_{i∈I} M_i consisting of all maps f such that there exists a finite subset J ⊂ I (which may depend on f) with the property that f(i) = 0 for all i ∉ J. The addition of elements in ∏_{i∈I} M_i (resp. in ∐_{i∈I} M_i) and left multiplication with elements of R are defined coordinate-wise.

It is clear that ∐_{i∈I} M_i = ∏_{i∈I} M_i if I is finite. When I = {1, 2, . . . , n}, n ∈ N, we often write M_1 ⊕ · · · ⊕ M_n or M_1 × · · · × M_n.

(6.9) Some basic properties. Let N be a left R-module, and let {M_i | i ∈ I} be a family of left R-modules as in 6.8 (5).

(i) The map Hom_R(R, N) −→ N which sends every left R-module homomorphism h : R → N to the element h(1) ∈ N is a bijection.

(ii) The map α : Hom_R(N, ∏_{i∈I} M_i) −→ ∏_{i∈I} Hom_R(N, M_i) which sends an R-module homomorphism h : N → ∏_{i∈I} M_i to the element α(h) ∈ ∏_{i∈I} Hom_R(N, M_i), given by

α(h)(i) = pr_i ◦ h ∀i ∈ I ,

is a bijection. Here pr_i : ∏_{j∈I} M_j −→ M_i is the “i-th projection”, which sends a typical element f : I → ⊔_{j∈I} M_j in ∏_{j∈I} M_j to the i-th component f(i) of f.

(iii) The map β : Hom_R(∐_{i∈I} M_i, N) → ∏_{i∈I} Hom_R(M_i, N) which sends an R-module homomorphism h : ∐_{i∈I} M_i → N to the element β(h) ∈ ∏_{i∈I} Hom_R(M_i, N) given by

β(h)(i) = h ◦ ι_i ∀i ∈ I ,

is a bijection. Here ι_i : M_i → ∐_{j∈I} M_j is the natural “inclusion map” from M_i to the direct sum, such that for any element x ∈ M_i, ι_i(x) is the map from I to ⊔_{j∈I} M_j given by

$$\iota_i(x)(j) = \begin{cases} x & \text{if } j = i, \\ 0 & \text{if } j \neq i . \end{cases}$$

(6.10) Definition Let $N$ be an $R$-submodule of a left $R$-module $M$. We define a left $R$-module structure on the quotient group $M/N$ as follows. For any $a \in R$ and any $x \in M$, the product $a \cdot (x + N)$ of $a$ with the element $x + N \in M/N$ is defined to be the element $a \cdot x + N \in M/N$. It is straightforward to check that this product is well defined and gives $M/N$ the structure of a left $R$-module.

In many ways working with modules is the same as doing linear algebra over rings instead of fields. So far everything is formal. The next result on the structure of finitely generated modules over a PID is a main result for an introductory course on algebra.

(6.11) Theorem Let $R$ be a principal ideal domain. Let $M$ be a finitely generated module over $R$. Let $M_{\mathrm{tor}}$ be the torsion $R$-submodule of $M$, consisting of all elements $x \in M$ such that there exists a non-zero element $a \in R$ with $a \cdot x = 0$.

(1) There exists a natural number $r \in \mathbb{N}$ and an $R$-module isomorphism $M \cong R^r \oplus M_{\mathrm{tor}}$. The natural number $r$ is uniquely determined by $M$; it is called the rank of $M$.

(2) Let $N$ be a finitely generated torsion module. Then there exists a finite number of mutually distinct maximal ideals $\wp_1, \dots, \wp_m$, $m \in \mathbb{N}$, such that for each $i = 1, \dots, m$ the $R$-submodule $N[\wp_i^\infty]$ of $N$, given by

$$N[\wp^\infty] := \{ x \in N \mid \wp^n \cdot x = \{0\} \text{ for some } n \in \mathbb{N}_{>0} \},$$

is non-zero, and the natural $R$-homomorphism $N[\wp_1^\infty] \oplus \cdots \oplus N[\wp_m^\infty] \to N$, defined by

$$(x_1, \dots, x_m) \mapsto x_1 + \cdots + x_m, \quad x_i \in N[\wp_i^\infty] \ \ \forall i = 1, \dots, m,$$

is an isomorphism.

(3) Let $\wp$ be a maximal ideal of $R$ and let $N$ be a finitely generated $R$-module such that $\wp^n \cdot N = \{0\}$ for some $n \in \mathbb{N}_{>0}$. Then there exists a natural number $a \in \mathbb{N}$, positive integers $e_1, \dots, e_a$, and an $R$-module isomorphism

$$N \cong R/\wp^{e_1} \oplus \cdots \oplus R/\wp^{e_a}.$$

The natural number $a$ is uniquely determined by $N$, and the positive integers $e_1, \dots, e_a$ are uniquely determined by $N$ up to permutation.

Remark (a) The isomorphism in statement (1) implies that the torsion submodule $M_{\mathrm{tor}}$ is of finite type over $R$, i.e. it is a finitely generated $R$-module. Similarly the $\wp_i$-primary component $N[\wp_i^\infty]$ of the torsion module $N$ in (2) is of finite type over $R$.

(b) The statement (1) implies that every torsion-free finitely generated $R$-module is free. In particular every torsion-free finitely generated abelian group is isomorphic to $\mathbb{Z}^r$ for a unique natural number $r$.

(c) Thm. 6.11, applied to the principal ideal domain $\mathbb{Z}$, gives the structure theorem for finitely generated abelian groups:

Let $A$ be a finitely generated abelian group. Then there exist

– natural numbers $r, m \in \mathbb{N}$,

– mutually distinct prime numbers $p_1, \dots, p_m$,

– a finite non-decreasing sequence $e_{i,1} \le \dots \le e_{i,a_i}$ of positive integers attached to the prime $p_i$ for each $i = 1, \dots, m$, and

– a group isomorphism

$$A \cong \mathbb{Z}^r \oplus \bigoplus_{1 \le i \le m,\ 1 \le j \le a_i} \mathbb{Z}/p_i^{e_{i,j}}\mathbb{Z}.$$

The integers $r$ and $m$ are uniquely determined by $A$, the prime numbers $p_1, \dots, p_m$ are uniquely determined by $A$ up to permutation, and the number $a_i$ and the non-decreasing sequence of positive integers $e_{i,1} \le \cdots \le e_{i,a_i}$ attached to the prime number $p_i$ are uniquely determined by $A$.
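As an illustration of Remark (c) in the finite case ($r = 0$), the following is a minimal Python sketch that lists the isomorphism classes of abelian groups of a given order: for each prime $p_i$ dividing the order, the exponents $e_{i,1} \le \dots \le e_{i,a_i}$ form a partition of the exponent of $p_i$. The function names are hypothetical and chosen only for this example.

```python
from itertools import product


def prime_factorization(n):
    """Return {p: exponent} for a positive integer n, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def partitions(n, smallest=1):
    """Yield the partitions of n as non-decreasing tuples with parts >= smallest."""
    if n == 0:
        yield ()
        return
    for first in range(smallest, n + 1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def finite_abelian_groups(order):
    """List the isomorphism classes of abelian groups of the given order.

    Each class is returned as a sorted list of prime powers q, standing for
    the direct sum of the cyclic groups Z/qZ.
    """
    per_prime = []
    for p, exponent in sorted(prime_factorization(order).items()):
        # One choice of exponents e_{i,1} <= ... <= e_{i,a_i} per partition.
        per_prime.append([[p ** e for e in part] for part in partitions(exponent)])
    return [sorted(sum(choice, [])) for choice in product(*per_prime)]
```

For example, `finite_abelian_groups(12)` lists the two classes $\mathbb{Z}/2 \oplus \mathbb{Z}/2 \oplus \mathbb{Z}/3$ and $\mathbb{Z}/3 \oplus \mathbb{Z}/4$.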


(6.12) Corollary Let $V$ be a non-zero finite dimensional vector space over a field $F$. Let $T \in \mathrm{End}_F(V)$ be a linear operator on $V$. We give $V$ the structure of an $F[x]$-module such that each polynomial $f(x) \in F[x]$ operates on $V$ as $f(T)$.

(1) There exist a positive integer $m$ and distinct monic irreducible polynomials $f_1(x), \dots, f_m(x)$ such that $V$ decomposes into the direct sum of the non-zero $T$-stable linear subspaces $V[f_i^\infty]$, where

$$V[f_i^\infty] := \{ v \in V \mid f_i(T)^n(v) = 0 \text{ for some } n \in \mathbb{N}_{>0} \}.$$

(2) For each $i = 1, \dots, m$, there exists a sequence of natural numbers $0 < e_{i,1} \le \dots \le e_{i,a_i}$ and an $F[x]$-module isomorphism

$$V[f_i^\infty] \cong F[x]/\bigl(f_i(x)^{e_{i,1}}\bigr) \oplus \cdots \oplus F[x]/\bigl(f_i(x)^{e_{i,a_i}}\bigr).$$

(3) For each $i = 1, \dots, m$, the positive integers $a_i$ and $0 < e_{i,1} \le \dots \le e_{i,a_i}$ attached to the irreducible polynomial $f_i(x)$ are uniquely determined by the linear operator $T$ on $V$.

(4) The minimal polynomial of the operator $T$ is

$$\prod_{i=1}^{m} f_i(x)^{e_{i,a_i}}.$$

(5) The characteristic polynomial of $T$ is

$$\prod_{i=1}^{m} f_i(x)^{\sum_{1 \le j \le a_i} e_{i,j}}.$$

Remark Recall that the characteristic polynomial of $T$ is $\det\bigl(x \cdot \mathrm{Id}_{\dim_F(V)} - A\bigr)$, where $A$ is a matrix representation of the linear operator $T$. The square matrix $x \cdot \mathrm{Id}_{\dim_F(V)} - A$ is an element of $M_{\dim_F(V)}(F[x])$, and the determinant is a monic polynomial in $F[x]$.

(6.13) Cor. 6.12 is a formulation of the theory of rational canonical forms in terms of modules over the polynomial ring $F[x]$. We make it more explicit by choosing a suitable basis for each direct summand corresponding to a factor of the form $F[x]/(f(x)^e) = F[x]/I$, where $f(x)$ is a monic irreducible polynomial in $F[x]$ and $I$ is the principal ideal generated by $f(x)^e$. The linear operator in question is induced by the element $x \in F[x]$, operating on the quotient $W = F[x]/(f(x)^e)$ via “multiplication by $x$”. Write $f(x) = x^d + a_{d-1} x^{d-1} + \dots + a_1 x + a_0$. Then $\dim_F(W) = de$. The following list of vectors in $W = F[x]/I$

$$\begin{array}{llcl}
v_1 := x^{d-1} f(x)^{e-1} + I & v_2 := x^{d-2} f(x)^{e-1} + I & \cdots & v_d := f(x)^{e-1} + I \\
v_{d+1} := x^{d-1} f(x)^{e-2} + I & v_{d+2} := x^{d-2} f(x)^{e-2} + I & \cdots & v_{2d} := f(x)^{e-2} + I \\
\quad\vdots & \quad\vdots & \ddots & \quad\vdots \\
v_{(e-1)d+1} := x^{d-1} + I & v_{(e-1)d+2} := x^{d-2} + I & \cdots & v_{ed} := 1 + I
\end{array}$$


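is a basis of $W$, such that the matrix representation of the linear operator “multiplication by $x$” has a simple form, called an irreducible block for the rational canonical form of the operator $T$, or an irreducible rational canonical block for short. Such an irreducible rational canonical block, corresponding to a factor $F[x]/(f(x)^e)$ where $f(x)$ is a monic irreducible polynomial of degree $d$ in $F[x]$, is a $de \times de$ matrix in block form, with $e$ diagonal blocks of size $d \times d$, each occupied by the same cyclic $d \times d$ matrix associated to the irreducible monic polynomial $f(x)$ of degree $d$. All entries outside these diagonal blocks are zero, except for $e - 1$ entries equal to 1 in the “inner upper diagonal corners”. We illustrate it below in the case when $f(x) = x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$ and $e = 3$; the block is (blank entries are 0)

$$\left(\begin{array}{cccc|cccc|cccc}
-a_3 & 1 & 0 & 0 & & & & & & & & \\
-a_2 & 0 & 1 & 0 & & & & & & & & \\
-a_1 & 0 & 0 & 1 & & & & & & & & \\
-a_0 & 0 & 0 & 0 & 1 & & & & & & & \\
\hline
 & & & & -a_3 & 1 & 0 & 0 & & & & \\
 & & & & -a_2 & 0 & 1 & 0 & & & & \\
 & & & & -a_1 & 0 & 0 & 1 & & & & \\
 & & & & -a_0 & 0 & 0 & 0 & 1 & & & \\
\hline
 & & & & & & & & -a_3 & 1 & 0 & 0 \\
 & & & & & & & & -a_2 & 0 & 1 & 0 \\
 & & & & & & & & -a_1 & 0 & 0 & 1 \\
 & & & & & & & & -a_0 & 0 & 0 & 0
\end{array}\right)$$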

Note that the irreducible polynomial $f(x) \in F[x]$ and the exponent $e$ can be immediately “read off” from such an irreducible rational canonical block. When $d = 1$, i.e. the irreducible polynomial $f(x)$ is a linear polynomial $x - \lambda$, we have the familiar Jordan blocks. Below is an illustration in the case $e = 5$:

$$\begin{pmatrix}
\lambda & 1 & 0 & 0 & 0 \\
0 & \lambda & 1 & 0 & 0 \\
0 & 0 & \lambda & 1 & 0 \\
0 & 0 & 0 & \lambda & 1 \\
0 & 0 & 0 & 0 & \lambda
\end{pmatrix}.$$

A matrix representation of a linear operator $T \in \mathrm{End}_F(V)$ on a finite dimensional vector space $V$ in block-diagonal form, such that each diagonal block is an irreducible rational canonical block, is called the rational canonical form of $T$.
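The block described in (6.13) can be assembled mechanically. The sketch below is a hedged illustration in plain Python (the function names are hypothetical): it builds the $de \times de$ block from the coefficients $a_0, \dots, a_{d-1}$ of $f$ and the exponent $e$, and evaluates $f$ at a matrix so that one can check, in examples, that $f(B)^e = 0$ while $f(B)^{e-1} \neq 0$.

```python
def rational_block(coeffs, e):
    """Build the irreducible rational canonical block for f(x)^e.

    coeffs = [a_0, a_1, ..., a_{d-1}] are the lower coefficients of the monic
    polynomial f(x) = x^d + a_{d-1} x^{d-1} + ... + a_0.
    """
    d = len(coeffs)
    size = d * e
    block = [[0] * size for _ in range(size)]
    for k in range(e):
        off = k * d
        for r in range(d):
            # First column of the k-th diagonal block: -a_{d-1}, ..., -a_0.
            block[off + r][off] = -coeffs[d - 1 - r]
        for c in range(1, d):
            # The 1's just above the diagonal inside the k-th diagonal block.
            block[off + c - 1][off + c] = 1
        if k > 0:
            # The extra 1 in the "inner upper diagonal corner".
            block[off - 1][off] = 1
    return block


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def poly_at_matrix(coeffs, m):
    """Evaluate the monic polynomial with lower coefficients coeffs at m (Horner's rule)."""
    n = len(m)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    result = identity  # the leading coefficient is 1
    for a in reversed(coeffs):
        result = mat_mul(result, m)
        result = [[result[i][j] + a * identity[i][j] for j in range(n)] for i in range(n)]
    return result
```

With `coeffs = [-2]` and `e = 3` this returns the $3 \times 3$ Jordan block with $\lambda = 2$; with `coeffs = [1, 0]` (that is, $f(x) = x^2 + 1$) and `e = 2` it returns a $4 \times 4$ block $B$ for which $f(B) \neq 0$ and $f(B)^2 = 0$.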

(6.14) Corollary Notation as in 6.12.

(1) The linear operator $T$ is diagonalizable (i.e. $V$ is spanned by eigenvectors of $T$) if and only if the minimal polynomial of $T$ is a product of distinct polynomials of degree one in $F[x]$.


(2) (Cayley-Hamilton) The minimal polynomial of $T$ divides the characteristic polynomial of $T$. In other words $f_{\mathrm{char},T}(T) = 0$, where $f_{\mathrm{char},T}(x) \in F[x]$ is the characteristic polynomial of $T$.

(3) Two matrices in $M_n(F)$ are conjugate if and only if they have the same rational canonical form.

(4) Every matrix in Mn(F ) is conjugate to its transpose.

(6.15) Remark Cor. 6.12 summarizes the theory of canonical forms for a linear operator on a finite dimensional vector space $V$ over a field $F$. Below are some related definitions and their basic properties, of interest only when the base field $F$ is not algebraically closed.

(i) We say that a linear operator $T$ on $V$ is reduced if the subring of $\mathrm{End}_F(V)$ generated by $F$ and $T$ is isomorphic to the product of a finite number of fields.

(ii) Because this subring is isomorphic to the image of the ring homomorphism

$$h_T : F[x] \to \mathrm{End}_F(V)$$

whose kernel is the principal ideal generated by the minimal polynomial of $T$, this subring is isomorphic to

$$F[x] \Big/ \Bigl( \prod_{i=1}^{m} f_i(x)^{e_{i,a_i}} \Bigr) \cong \prod_{i=1}^{m} \Bigl( F[x] \big/ \bigl( f_i(x)^{e_{i,a_i}} \bigr) \Bigr),$$

a product of quotient rings of the form $F[x]/(f(x)^e)$ with $f(x)$ irreducible and $e \ge 1$.

(iii) It follows quickly from (ii) above that $T$ is reduced if and only if one of the following equivalent conditions holds.

– The quotient ring $F[x]/\bigl(f_i(x)^{e_{i,a_i}}\bigr)$ is a field for each $i = 1, \dots, m$.

– The exponent $e_{i,a_i}$ of the irreducible polynomial $f_i(x)$ in the minimal polynomial of $T$ is equal to 1 for all $i = 1, \dots, m$.

– The image $h_T(F[x])$ of the ring homomorphism $h_T$ is reduced, i.e. if $y$ is an element of the ring $h_T(F[x])$ such that $y^N = 0$ for some positive integer $N$, then $y = 0$.

[Note. An element $y \in R$ such that $y^N = 0$ for some positive integer $N$ is called nilpotent. The set of all nilpotent elements in a commutative ring $R$ is an ideal of $R$, called the radical of $R$. A commutative ring $R$ is reduced if its radical is $\{0\}$; equivalently the only nilpotent element in $R$ is 0.]

(iv) We say that the linear operator $T$ is semi-simple if the minimal polynomial of $T$ is separable.

It is clear that if $T$ is semi-simple then $T$ is reduced. If the field $F$ is of characteristic 0, then being semi-simple and being reduced are equivalent.


(v) An element $\lambda \in F$ is an eigenvalue of $T$ if and only if $(x - \lambda)$ divides the minimal polynomial of $T$; equivalently $(x - \lambda)$ is one of the irreducible polynomials $f_i(x)$ in the statement of 6.12. The $T$-invariant vector subspace $V[(x - \lambda)^\infty]$ is called the generalized $\lambda$-eigenspace of $T$; it contains the eigenspace $\mathrm{Ker}(T - \lambda\, \mathrm{Id}_V)$ of $T$.

§7. Tensor product of vector spaces

We fix a field $F$ throughout this section.

(7.1) Definition Let $V_1, \dots, V_m, W$ be $F$-vector spaces. A map $T : V_1 \times \cdots \times V_m \to W$ is multilinear over $F$ if

$$T(v_1, \dots, a v_i + b v_i', v_{i+1}, \dots, v_m) = a\, T(v_1, \dots, v_i, \dots, v_m) + b\, T(v_1, \dots, v_i', \dots, v_m)$$

for all $v_1 \in V_1$, $v_2 \in V_2$, \dots, $v_i, v_i' \in V_i$, \dots, $v_m \in V_m$, all $a, b \in F$ and all $i = 1, \dots, m$. The map $T$ is said to be bilinear (resp. trilinear) if $m = 2$ (resp. $m = 3$).

(7.2) Definition Let $V, W$ be vector spaces over $F$. Denote by $V^{\times n}$ the product of $n$ copies of $V$.

(1) A map $S : V^{\times n} \to W$ is symmetric if $S(v_{\sigma(1)}, \dots, v_{\sigma(n)}) = S(v_1, \dots, v_n)$ for every list of vectors $v_1, \dots, v_n \in V$ and all permutations $\sigma \in S_n$.

(2) A map $A : V^{\times n} \to W$ is alternating if $A(v_1, \dots, v_n) = 0$ for every list of vectors $v_1, \dots, v_n \in V$ such that $v_i = v_j$ for some $i \neq j$ with $1 \le i, j \le n$. For a multilinear map $A$, this condition implies that $A(v_1, \dots, v_n) = \mathrm{sgn}(\sigma)\, A(v_{\sigma(1)}, \dots, v_{\sigma(n)})$ for every list of vectors $v_1, \dots, v_n \in V$ and all permutations $\sigma \in S_n$.

(7.3) Definition Let $V_1, \dots, V_m$ be vector spaces over $F$. A tensor product of $V_1, \dots, V_m$ is an $F$-multilinear map $\alpha : V_1 \times \cdots \times V_m \to U$ which satisfies the following universal property: For any multilinear map $T : V_1 \times \cdots \times V_m \to W$, there exists a unique $F$-linear map $f : U \to W$ such that $T = f \circ \alpha$.

The above universal property implies that the tensor product, if it exists, is unique up to unique isomorphism. In other words, if $\beta : V_1 \times \cdots \times V_m \to U'$ is another tensor product of $V_1, \dots, V_m$, then there exists a unique isomorphism $\delta : U \xrightarrow{\sim} U'$ with $\beta = \delta \circ \alpha$. A “general nonsense construction” shows that a tensor product exists; write $\alpha : V_1 \times \cdots \times V_m \to V_1 \otimes \cdots \otimes V_m$ for a/the tensor product of $V_1, \dots, V_m$. Denote by $v_1 \otimes \cdots \otimes v_m$ the image under $\alpha$ of an element $(v_1, \dots, v_m) \in V_1 \times \cdots \times V_m$.

(7.4) Remark Because the double dual of a finite dimensional vector space over $F$ is canonically isomorphic to the vector space itself, if $V_1, \dots, V_m$ are finite dimensional vector spaces over $F$, then $V_1 \otimes \cdots \otimes V_m$ is naturally isomorphic to the $F$-linear dual

$$\mathrm{Hom}_F\bigl(\mathrm{ML}(V_1 \times \cdots \times V_m, F), F\bigr)$$

of the space $\mathrm{ML}(V_1 \times \cdots \times V_m, F)$ of all multilinear maps from $V_1 \times \cdots \times V_m$ to $F$.


[Exercise: Describe the natural map from $V_1 \times \cdots \times V_m$ to $\mathrm{Hom}_F(\mathrm{ML}(V_1 \times \cdots \times V_m, F), F)$.]

(7.5) Lemma Let $V$ be a vector space over $F$, $n \in \mathbb{N}$. Let $\alpha : V^{\times n} \to V^{\otimes n}$ be the tensor product of $n$ copies of $V$.

(1) There exists a symmetric multilinear map $\beta : V^{\times n} \to S^n V$ with the following universal property: For any symmetric multilinear map $S : V^{\times n} \to W$, there exists a unique $F$-linear map $f : S^n V \to W$ such that $S = f \circ \beta$.

(2) There exists an alternating multilinear map $\gamma : V^{\times n} \to \Lambda^n V$ with the following universal property: For any alternating multilinear map $A : V^{\times n} \to W$, there exists a unique $F$-linear map $g : \Lambda^n V \to W$ such that $A = g \circ \gamma$.

(3) The $F$-linear map $\pi_1 : V^{\otimes n} \to S^n V$ such that $\beta = \pi_1 \circ \alpha$ is surjective. (So the symmetric product $S^n V$ is naturally a quotient of $V^{\otimes n}$.) Denote by $v_1 \cdot v_2 \cdot \ldots \cdot v_n$ the element $\beta(v_1, \dots, v_n) \in S^n V$.

(4) The $F$-linear map $\pi_2 : V^{\otimes n} \to \Lambda^n V$ such that $\gamma = \pi_2 \circ \alpha$ is surjective. (So the alternating product $\Lambda^n V$ is naturally a quotient of $V^{\otimes n}$.) Write $v_1 \wedge \cdots \wedge v_n$ for the element $\gamma(v_1, \dots, v_n) \in \Lambda^n V$.

By convention, $S^0 V = F = \Lambda^0 V$.

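(7.6) Lemma Let $U, V, W$ be vector spaces over $F$. We have the following natural isomorphisms.

• $U \otimes V \cong V \otimes U$, under which an element $u \otimes v$ is mapped to $v \otimes u$, $\forall (u, v) \in U \times V$.

• $(U \oplus V) \otimes W \cong (U \otimes W) \oplus (V \otimes W)$, $U \otimes (V \oplus W) \cong (U \otimes V) \oplus (U \otimes W)$.

• $(U \otimes V) \otimes W \cong U \otimes (V \otimes W)$, and both are naturally isomorphic to $U \otimes V \otimes W$.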

(7.7) Lemma Let $V$ be a finite dimensional vector space, and let $v_1, \dots, v_m$ be an $F$-basis of $V$.

(1) The set of vectors $v_{i_1} \otimes \cdots \otimes v_{i_n}$, where the index $(i_1, \dots, i_n)$ runs through all $n$-tuples in $\{1, 2, \dots, m\}^n$, is an $F$-basis of $V^{\otimes n}$. In particular $\dim_F(V^{\otimes n}) = m^n$.

(2) The set of vectors $v_{i_1} \wedge \cdots \wedge v_{i_n}$, where the index $(i_1, \dots, i_n)$ runs through all $n$-tuples in $\{1, 2, \dots, m\}^n$ with $i_1 < i_2 < \cdots < i_n$, is an $F$-basis of $\Lambda^n_F V$. In particular $\dim_F(\Lambda^n_F V) = \binom{m}{n}$.

(3) The set of vectors $v_{i_1} \cdot v_{i_2} \cdot \ldots \cdot v_{i_n}$, where the index $(i_1, \dots, i_n)$ runs through all $n$-tuples in $\{1, 2, \dots, m\}^n$ with $i_1 \le i_2 \le \cdots \le i_n$, is an $F$-basis of $S^n_F V$. In particular we have $\dim_F(S^n_F V) = \binom{m+n-1}{m-1} = \binom{m+n-1}{n}$.
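The three index sets in Lemma 7.7 are exactly what the Python standard library enumerates with `product`, `combinations` and `combinations_with_replacement`, so the dimension formulas can be checked by counting. The following is a small sketch under that observation; the function name `basis_indices` is hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb


def basis_indices(m, n):
    """Index tuples (i_1, ..., i_n) labelling the bases in Lemma 7.7, for dim V = m."""
    labels = range(1, m + 1)
    return {
        "tensor": list(product(labels, repeat=n)),                    # all n-tuples
        "exterior": list(combinations(labels, n)),                    # i_1 < ... < i_n
        "symmetric": list(combinations_with_replacement(labels, n)),  # i_1 <= ... <= i_n
    }


def expected_dimensions(m, n):
    """The closed formulas of Lemma 7.7 for the three dimensions."""
    return {"tensor": m ** n, "exterior": comb(m, n), "symmetric": comb(m + n - 1, n)}
```

For $m = 3$ and $n = 2$ the counts are 9, 3 and 6; that $9 = 3 + 6$ is consistent with the decomposition of $V \otimes V$ into symmetric and skew-symmetric tensors when $2 \in F^\times$.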


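(7.8) Lemma (1) For $i = 1, \dots, n$ let $T_i : V_i \to W_i$ be an $F$-linear map. Then there is a unique $F$-linear map

$$T_1 \otimes \cdots \otimes T_n : V_1 \otimes \cdots \otimes V_n \to W_1 \otimes \cdots \otimes W_n$$

such that $(T_1 \otimes \cdots \otimes T_n)(v_1 \otimes \cdots \otimes v_n) = T_1(v_1) \otimes \cdots \otimes T_n(v_n)$ for all $(v_1, \dots, v_n) \in V_1 \times \cdots \times V_n$.

(2) Let $T : V \to W$ be an $F$-linear map. Then there are $F$-linear maps

$$S^n T : S^n V \to S^n W \quad \text{and} \quad \Lambda^n T : \Lambda^n V \to \Lambda^n W$$

characterized by

$$S^n T(v_1 \cdot \ldots \cdot v_n) = T(v_1) \cdot \ldots \cdot T(v_n) \quad \text{and} \quad \Lambda^n T(v_1 \wedge \cdots \wedge v_n) = T(v_1) \wedge \cdots \wedge T(v_n)$$

for all $(v_1, \dots, v_n) \in V^{\times n}$.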

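(7.9) Lemma (1) In the situation of 7.8 (1), with $W_i = V_i$ finite dimensional for all $i$ (so that the traces are defined), we have

$$\mathrm{Tr}(T_1 \otimes \cdots \otimes T_n) = \mathrm{Tr}(T_1) \cdot \ldots \cdot \mathrm{Tr}(T_n).$$

(2) Notation as in 7.8 (2), with $W = V$ of finite dimension $m$. Assume that the characteristic polynomial of $T$ splits into a product of linear factors in $F[x]$. Let $a_1, \dots, a_m$ be the eigenvalues of $T$, listed with multiplicity. Then

$$\mathrm{Tr}(S^n T) = \sum_{1 \le i_1 \le i_2 \le \cdots \le i_n \le m} \ \prod_{j=1}^{n} a_{i_j}, \qquad \mathrm{Tr}(\Lambda^n T) = \sum_{1 \le i_1 < i_2 < \cdots < i_n \le m} \ \prod_{j=1}^{n} a_{i_j}.$$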
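The two sums in (7.9)(2) are the complete homogeneous and the elementary symmetric polynomials in the eigenvalues. A minimal Python sketch, with hypothetical function names, evaluates them directly from a list of eigenvalues given with multiplicity:

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def trace_symmetric_power(eigenvalues, n):
    """Sum of a_{i_1} * ... * a_{i_n} over 1 <= i_1 <= ... <= i_n <= m."""
    return sum(prod(c) for c in combinations_with_replacement(eigenvalues, n))


def trace_exterior_power(eigenvalues, n):
    """Sum of a_{i_1} * ... * a_{i_n} over 1 <= i_1 < ... < i_n <= m."""
    return sum(prod(c) for c in combinations(eigenvalues, n))
```

For eigenvalues 1, 2, 3 and $n = 2$ the values are 25 and 11, and $25 + 11 = 36 = \mathrm{Tr}(T)^2 = \mathrm{Tr}(T \otimes T)$, as (7.9)(1) predicts.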

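(7.10) Lemma Suppose that $n! \in F^\times$, that is $n! \cdot 1 \neq 0$ in the base field $F$. Let $\pi_1 : V^{\otimes n} \to S^n V$ and $\pi_2 : V^{\otimes n} \to \Lambda^n V$ be the projections from $V^{\otimes n}$ to the symmetric and alternating product respectively. The permutation group $S_n$ operates naturally on $V^{\otimes n}$, such that

$$\sigma(v_1 \otimes \cdots \otimes v_n) = v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)} \quad \forall \sigma \in S_n,\ \forall v_1, \dots, v_n \in V.$$

Let $\mathrm{Sym}(V^{\otimes n}) \subset V^{\otimes n}$ (resp. $\mathrm{Skew}(V^{\otimes n}) \subset V^{\otimes n}$) be the set of all symmetric (resp. skew symmetric) tensors in $V^{\otimes n}$, consisting of all elements $x \in V^{\otimes n}$ such that $\sigma(x) = x$ (resp. $\sigma(x) = \mathrm{sgn}(\sigma) \cdot x$) for all $\sigma \in S_n$.

(1) The projections $\pi_1$ and $\pi_2$ induce isomorphisms

$$\mathrm{Sym}(V^{\otimes n}) \xrightarrow{\sim} S^n V \quad \text{and} \quad \mathrm{Skew}(V^{\otimes n}) \xrightarrow{\sim} \Lambda^n V.$$

(2) Both $\mathrm{Sym}(V^{\otimes n})$ and $\mathrm{Skew}(V^{\otimes n})$ are stable under the action of $T^{\otimes n}$ on $V^{\otimes n}$ for all $T \in \mathrm{End}_F(V)$.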

Remark When $n = 2$ and $2 \in F^\times$, we have $V \otimes V = \mathrm{Sym}(V^{\otimes 2}) \oplus \mathrm{Skew}(V^{\otimes 2})$. This is no longer true for $n \ge 3$: When $n \ge 3$, $S_n$ has irreducible representations of dimension at least two, and $V^{\otimes n}$ decomposes into a direct sum of linear subspaces corresponding to various symmetry patterns for $S_n$; each of the direct summands is stable under the action of $T^{\otimes n}$ for every linear operator $T \in \mathrm{End}_F(V)$. The number of these direct summands is the number of irreducible representations of $S_n$, which is equal to the number of partitions of the integer $n$ (e.g. $4 = 3 + 1 = 2 + 2 = 2 + 1 + 1 = 1 + 1 + 1 + 1$ gives 5 ways to partition 4 into a sum of positive integers).


§8. Linear representation of finite groups

We recall the definition of linear representations; see 6.2.

(8.1) Definition Let G be a group and let F be a field.

(1) A linear representation of $G$ on a vector space $V$ over $F$ is a group homomorphism $\rho : G \to \mathrm{GL}_F(V)$, or equivalently an $F$-linear left action of $G$ on $V$. If $G$ is a finite group, the above is also equivalent to a left $F[G]$-module structure on $V$.

(2) A subrepresentation of a linear representation $(V, \rho)$ of $G$ on an $F$-vector space $V$ is a vector subspace $W$ of $V$ which is invariant under $G$; i.e. $\rho(g)(w) \in W$ for all $g \in G$ and all $w \in W$.

(3) Let $W$ be a subrepresentation of $(V, \rho)$ as in (2) above. The quotient representation of $V$ by $W$ is the homomorphism $\bar\rho : G \to \mathrm{GL}_F(V/W)$ induced by $\rho$, i.e.

$$\bar\rho(g)(v + W) = \rho(g)(v) + W \quad \forall g \in G,\ \forall v \in V.$$

We often suppress the symbol $\rho$ and abbreviate $\rho(g)(v)$ to $g \cdot v$ if it does not lead to confusion.

(8.2) Examples.

(i) The trivial $F$-linear representation of $G$ is the representation $(F, \mathbf{1}_G)$, where

$$\mathbf{1}_G : G \to \mathrm{GL}_F(F) \cong F^\times$$

is the trivial group homomorphism. More generally a representation $(V, \rho)$ is said to be trivial if the group homomorphism $\rho : G \to \mathrm{GL}_F(V)$ is the trivial homomorphism.

(ii) Suppose that $G$ is a finite group. The regular representation is the $F$-linear representation whose underlying $F$-vector space is the group ring $F[G]$, with the left $F[G]$-module structure given by the product law in the ring $F[G]$. In other words

$$\rho(g)\Bigl( \sum_{x \in G} a_x\, [x] \Bigr) = \sum_{x \in G} a_x\, [gx] = \sum_{y \in G} a_{g^{-1}y}\, [y]$$

for every $g \in G$ and every element $\sum_{x \in G} a_x\, [x] \in F[G]$.

(8.3) Definition Let $F$ be a field and let $(V, \rho_V)$, $(W, \rho_W)$ be $F$-linear representations of a group $G$.

(i) An $F$-linear map $T : V \to W$ is $G$-equivariant if $T(\rho_V(g)(v)) = \rho_W(g)(T(v))$ for all $g \in G$ and all $v \in V$. A $G$-equivariant linear transformation is also called an intertwining operator; if $G$ is finite it is the same as an $F[G]$-module homomorphism.


(ii) The kernel of an equivariant $F$-linear map $T : V \to W$ is the subrepresentation $\mathrm{Ker}(T)$ of $V$ consisting of all elements $v \in V$ such that $T(v) = 0$.

(iii) Two $F$-linear representations $V, W$ of $G$ are isomorphic if there exist $G$-equivariant $F$-linear maps $\alpha : V \to W$ and $\beta : W \to V$ such that $\alpha \circ \beta = \mathrm{Id}_W$ and $\beta \circ \alpha = \mathrm{Id}_V$.

(8.4) Definition Let $F$ be a field. A non-zero $F$-linear representation $(V, \rho_V)$ of a group $G$ is irreducible if $V$ and $\{0\}$ are the only subrepresentations of $V$.

If $G$ is finite, then every irreducible $F$-linear representation $V$ of $G$ is finite dimensional: for any non-zero element $v$ of $V$, the linear span of the finite set $\{x \cdot v \mid x \in G\}$ is a non-zero subrepresentation of $V$, hence equal to $V$.

(8.5) Lemma Let $(V, \rho)$ be a linear representation of a finite group $G$ over a field $F$, and let $W$ be a subrepresentation of $V$. Suppose that $\mathrm{Card}(G) \cdot 1 \neq 0$ in $F$, i.e. either $\mathrm{char}(F) = 0$, or $\mathrm{char}(F) = p > 0$ and $\mathrm{Card}(G) \not\equiv 0 \pmod{p}$.

(i) There exists an equivariant $F$-linear map $\pi : V \to W$ such that $\pi(w) = w$ for all $w \in W$. Note that $\pi \circ \pi = \pi$.

(ii) $V$ is the direct sum of the two subrepresentations $W$ and $\mathrm{Ker}(\pi)$.

Under the assumption on $F$ in 8.5, every finite dimensional $F$-linear representation of a finite group $G$ is isomorphic to the direct sum of a finite number of irreducible representations.

Construction of a map $\pi$ satisfying the requirements in (i): Pick an $F$-linear transformation $h : V \to W$ such that $h(w) = w$ for all $w \in W$; there are plenty of such. Define $\pi : V \to W$ by

$$\pi(v) := \mathrm{Card}(G)^{-1} \cdot \sum_{x \in G} \rho(x)\bigl(h(\rho(x^{-1})(v))\bigr).$$

[The idea is to average the linear transformation $h : V \to W$ over $G$ to produce a $G$-equivariant one; this averaging process only changes the effect of the linear transformation outside $W$.]
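The averaging construction can be tried out on a small example. The sketch below is an illustration under assumptions not made in the notes: it takes $G = S_3$ acting on $V = \mathbb{Q}^3$ by permuting coordinates, $W$ the line spanned by $(1, 1, 1)$, and $h(x, y, z) = (x, x, x)$, and works with exact rational arithmetic. All names in the code are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def perm_matrix(sigma):
    """Matrix of the linear map e_j -> e_{sigma(j)} on Q^n (0-based indices)."""
    n = len(sigma)
    return [[Fraction(int(sigma[j] == i)) for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def average(h, group):
    """Return Card(G)^{-1} * sum over x of rho(x) h rho(x)^{-1}, for permutation matrices rho(x)."""
    n = len(h)
    total = [[Fraction(0)] * n for _ in range(n)]
    for rho_x in group:
        rho_x_inv = transpose(rho_x)  # a permutation matrix is inverted by transposing
        term = mat_mul(mat_mul(rho_x, h), rho_x_inv)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return [[entry / len(group) for entry in row] for row in total]


# G = S_3 acting on V = Q^3 by permuting coordinates; W is the line spanned by (1, 1, 1).
group = [perm_matrix(sigma) for sigma in permutations(range(3))]
# h(x, y, z) = (x, x, x) maps V into W and fixes W pointwise, but is not equivariant.
h = [[Fraction(1), Fraction(0), Fraction(0)] for _ in range(3)]
pi = average(h, group)
```

In this example every entry of `pi` equals 1/3, so $\pi(x, y, z)$ is the constant vector with entries $(x + y + z)/3$; it commutes with every permutation matrix, and its kernel is the subrepresentation of vectors with coordinate sum 0.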

(8.6) Proposition (Schur’s lemma) Let $F$ be a field, let $G$ be a finite group and let $(V, \rho_V)$ and $(W, \rho_W)$ be irreducible $F$-linear representations. Let $T : V \to W$ be a $G$-equivariant $F$-linear map.

(i) If the representations V and W are not isomorphic, then T = 0.

(ii) If V = W and F is algebraically closed, then T = a · IdV for some a ∈ F .

In the rest of this section we assume that $G$ is finite and the base field $F$ is $\mathbb{C}$.


(8.7) Definition (1) Let $(V, \rho)$ be a finite dimensional linear representation of a group $G$ over $\mathbb{C}$. The character of $\rho$ is the function $\chi_\rho : G \to \mathbb{C}$ on $G$ defined by

$$\chi_\rho(x) = \mathrm{Tr}(\rho(x)) \quad \forall x \in G.$$

(2) A function $f : G \to \mathbb{C}$ is a class function if $f(x \cdot y \cdot x^{-1}) = f(y)$ for all $x, y \in G$. Let $G^\natural$ be the set of all conjugacy classes of $G$ and let $\pi : G \to G^\natural$ be the natural projection; then a class function on $G$ is a function of the form $\nu \circ \pi$, where $\nu : G^\natural \to \mathbb{C}$.

Note that $\chi_\rho(x)$ is the sum of a finite number of roots of 1, namely the eigenvalues of $\rho(x)$, and the complex conjugates of these eigenvalues are the eigenvalues of $\rho(x^{-1})$; hence the complex conjugate $\chi_\rho(x)^*$ of $\chi_\rho(x)$ is equal to $\chi_\rho(x^{-1})$.

Examples.

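(a) The character of the trivial representation of $G$ on $\mathbb{C}$ is the function on $G$ with constant value $1 \in \mathbb{C}$.

(b) The character $\chi_{\mathrm{reg}}$ of the regular representation of $G$ is

$$\chi_{\mathrm{reg}}(x) = \begin{cases} \mathrm{Card}(G) \cdot 1 & \text{if } x = e_G, \\ 0 & \text{if } x \neq e_G. \end{cases}$$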

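(8.8) Definition For any two functions $\varphi, \psi : G \to \mathbb{C}$, put

$$(\varphi|\psi) := \frac{1}{\mathrm{Card}(G)} \sum_{x \in G} \varphi(x)\, \psi(x)^*,$$

where $\psi(x)^*$ is the complex conjugate of $\psi(x)$, and

$$\langle \varphi|\psi \rangle := \frac{1}{\mathrm{Card}(G)} \sum_{x \in G} \varphi(x)\, \psi(x^{-1}).$$

Note that $(\varphi|\psi) = (\psi|\varphi)^*$ and $\langle \varphi|\psi \rangle = \langle \psi|\varphi \rangle$. If $\psi$ is the character of a finite dimensional representation of $G$, then $\psi(x)^* = \psi(x^{-1})$ for all $x \in G$ and $(\varphi|\psi) = \langle \varphi|\psi \rangle$.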

(8.9) Proposition (i) If $\chi$ is the character of an irreducible complex representation, then $(\chi|\chi) = 1$.

(ii) If $\chi_1$ and $\chi_2$ are the characters of two non-isomorphic complex irreducible representations, then $(\chi_1|\chi_2) = 0$.

(8.10) Proposition The regular representation $\mathbb{C}[G]$ of $G$ decomposes into a direct sum of irreducible complex representations of $G$:

$$\mathbb{C}[G] \cong \bigoplus_{i=1}^{h} (V_i, \rho_i)^{\oplus n_i},$$

where $(V_1, \rho_1), \dots, (V_h, \rho_h)$ are mutually non-isomorphic irreducible complex representations of $G$, and $n_1, \dots, n_h$ are positive integers.

(1) $n_i = \dim(V_i)$ for each $i = 1, \dots, h$.

(2) $\sum_{i=1}^{h} n_i^2 = \mathrm{Card}(G)$.

(3) Every irreducible complex representation of $G$ is isomorphic to $(V_i, \rho_i)$ for some $i$ with $1 \le i \le h$.

(4) The characters $\chi_1, \dots, \chi_h$ of the irreducible representations $(V_1, \rho_1), \dots, (V_h, \rho_h)$ form an orthonormal basis of the $\mathbb{C}$-vector space $H$ of all class functions on $G$, where the inner product on $H$ is given by $(\varphi, \psi) \mapsto (\varphi|\psi)$.

(5) In particular the number $h$ of non-isomorphic irreducible representations is equal to $\mathrm{Card}(G^\natural)$, the number of conjugacy classes in $G$.

(8.11) Character table. Let $G$ be a finite group, let $h$ be the number of conjugacy classes of $G$, let $(\mathbb{C}, \mathbf{1}_G) = (V_1, \rho_1), \dots, (V_h, \rho_h)$ be a set of representatives of isomorphism classes of irreducible complex linear representations of $G$, and let $\chi_1, \dots, \chi_h$ be their characters. Let $\{e_G\} = C_1, \dots, C_h$ be the conjugacy classes of $G$. Pick elements $x_i \in C_i$ for $i = 1, \dots, h$, and let $c_i = \mathrm{Card}(C_i)$. The character table is the $h \times h$ matrix, whose rows are indexed by the irreducible characters of $G$ and whose columns are indexed by the conjugacy classes of $G$, such that its $(i, \alpha)$-entry is $\chi_i(x_\alpha)$.

Every entry in the first row of the character table is 1 because $\chi_1$ is the trivial character. The entries in the first column of the character table are the dimensions of the irreducible representations $(V_i, \rho_i)$, also called the degrees of the irreducible characters $\chi_i$.

The orthogonality relations of characters are expressed in the following orthogonality relations of the character table.

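(1) $$\sum_{\alpha=1}^{h} c_\alpha\, \chi_i(x_\alpha)\, \chi_j(x_\alpha)^* = \begin{cases} \mathrm{Card}(G) & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$

(2) $$\sum_{i=1}^{h} \chi_i(x_\alpha)\, \chi_i(x_\beta)^* = \begin{cases} \dfrac{\mathrm{Card}(G)}{c_\alpha} & \text{if } \alpha = \beta, \\ 0 & \text{if } \alpha \neq \beta. \end{cases}$$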
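Both orthogonality relations can be checked on a concrete table. The sketch below uses the character table of the symmetric group $S_3$, a standard example that is not taken from these notes, with classes ordered as identity, transpositions, 3-cycles; the variable and function names are hypothetical.

```python
# Character table of S_3 (standard example, assumed here for illustration).
# Columns: identity, transpositions, 3-cycles.
# Rows: trivial character, sign character, character of the 2-dimensional representation.
class_sizes = [1, 3, 2]
table = [
    [1, 1, 1],
    [1, -1, 1],
    [2, 0, -1],
]
order = sum(class_sizes)  # Card(G) = 6


def row_product(i, j):
    """Sum over classes of c_alpha * chi_i(x_alpha) * conj(chi_j(x_alpha))."""
    return sum(c * a * b.conjugate() for c, a, b in zip(class_sizes, table[i], table[j]))


def column_product(alpha, beta):
    """Sum over irreducible characters of chi_i(x_alpha) * conj(chi_i(x_beta))."""
    return sum(row[alpha] * row[beta].conjugate() for row in table)
```

Here `row_product(i, j)` is 6 when `i == j` and 0 otherwise, while `column_product(alpha, alpha)` gives 6, 2 and 3 for the three classes, that is $\mathrm{Card}(G)/c_\alpha$.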

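(8.12) Definition Let $(\rho_1, V_1)$ and $(\rho_2, V_2)$ be finite dimensional $\mathbb{C}$-representations of finite groups $G_1$ and $G_2$ respectively. Define a $\mathbb{C}$-representation

$$\rho_1 \boxtimes \rho_2 : G_1 \times G_2 \to \mathrm{GL}(V_1 \otimes V_2)$$

by

$$(\rho_1 \boxtimes \rho_2)(x_1, x_2) = \rho_1(x_1) \otimes \rho_2(x_2);$$

we call it the external tensor product of $(\rho_1, V_1)$ and $(\rho_2, V_2)$. Its character is

$$\chi_{\rho_1 \boxtimes \rho_2}(x_1, x_2) = \chi_{\rho_1}(x_1) \cdot \chi_{\rho_2}(x_2) \quad \forall (x_1, x_2) \in G_1 \times G_2.$$

If $G_1 = G_2 = G$, the restriction of $\rho_1 \boxtimes \rho_2$ to the diagonal subgroup $G \cong \Delta_G \subset G \times G$ is called the internal tensor product of $\rho_1$ and $\rho_2$, denoted by $\rho_1 \otimes \rho_2$; its character is

$$\chi_{\rho_1 \otimes \rho_2}(x) = \chi_{\rho_1}(x) \cdot \chi_{\rho_2}(x) \quad \forall x \in G.$$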

(8.13) Proposition Let $G_1, G_2$ be finite groups.

(i) Let $(\rho_1, V_1)$ and $(\rho_2, V_2)$ be irreducible $\mathbb{C}$-representations of $G_1$ and $G_2$ respectively. Then $(\rho_1 \boxtimes \rho_2, V_1 \otimes V_2)$ is an irreducible $\mathbb{C}$-representation of $G_1 \times G_2$.

(ii) Every irreducible $\mathbb{C}$-representation of $G_1 \times G_2$ is isomorphic to $(\rho_1 \boxtimes \rho_2, V_1 \otimes V_2)$ for an irreducible $\mathbb{C}$-representation $(\rho_1, V_1)$ of $G_1$ and an irreducible $\mathbb{C}$-representation $(\rho_2, V_2)$ of $G_2$.

Here is a useful fact about the dimension of irreducible complex representations of a finite group.

The degree of every complex irreducible representation of $G$ divides $[G : Z(G)]$, where $Z(G)$ is the center of the finite group $G$.

(8.14) Definition Let $G$ be a finite group, let $H$ be a subgroup of $G$, and let $(W, \theta)$ be a finite dimensional complex representation of $H$. Let

$$V = \mathrm{Ind}_H^G(W) := \{ f : G \to W \mid f(hx) = \theta(h)(f(x)) \ \ \forall h \in H,\ \forall x \in G \}.$$

Let $\rho : G \to \mathrm{GL}_{\mathbb{C}}(V)$ be the linear left action of $G$ on $V$ defined by

$$(\rho(y)(f))(x) = f(xy) \quad \forall x, y \in G.$$

We say that the representation $(\mathrm{Ind}_H^G(W), \rho)$ of $G$ is induced by the representation $(W, \theta)$ of the subgroup $H$.

The character $\chi_\rho$ of the representation $(\mathrm{Ind}_H^G(W), \rho)$ of $G$ induced by the representation $(W, \theta)$ of the subgroup $H$ is given by the following formula in terms of the character $\chi_\theta$ of $(W, \theta)$.

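$$\chi_\rho(x) = \frac{1}{\mathrm{Card}(H)} \sum_{\substack{s \in G \\ s x s^{-1} \in H}} \chi_\theta(s x s^{-1}) \quad \forall x \in G.$$

More explicitly, let $C_x$ be the conjugacy class of $x$ in $G$, and write $C_x \cap H$ as a disjoint union

$$C_x \cap H = C_1 \sqcup \cdots \sqcup C_m,$$

where each $C_i$ is a conjugacy class in $H$. Then we have

$$\chi_\rho(x) = \frac{\mathrm{Card}(G)}{\mathrm{Card}(H) \cdot \mathrm{Card}(C_x)} \sum_{i=1}^{m} \mathrm{Card}(C_i) \cdot \chi_\theta(x_i),$$

where $x_i$ is an element of $C_i$ for each $i = 1, \dots, m$.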
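The first formula for the induced character translates directly into code. The following is a minimal sketch under assumptions chosen for illustration: group elements are permutations written as tuples, $G = S_3$, $H$ is the subgroup of order 2 generated by a transposition, and the characters of $H$ used are integer-valued so that exact fractions suffice. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def compose(p, q):
    """Composite permutation p after q, i.e. i -> p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def induced_character(group, chi_theta, x):
    """Value at x of the character induced from H to G.

    chi_theta is a dict {h: chi_theta(h)} whose keys are exactly the elements
    of the subgroup H; its values are assumed to be integers in this sketch.
    """
    total = 0
    for s in group:
        y = compose(compose(s, x), inverse(s))  # s x s^{-1}
        if y in chi_theta:
            total += chi_theta[y]
    return Fraction(total, len(chi_theta))


G = list(permutations(range(3)))
identity, swap, cycle = (0, 1, 2), (1, 0, 2), (1, 2, 0)
trivial_on_H = {identity: 1, swap: 1}
sign_on_H = {identity: 1, swap: -1}
```

Inducing the trivial character of $H$ gives the values 3, 1, 0 on the identity, a transposition and a 3-cycle; this is the sum of the trivial character and the 2-dimensional irreducible character of $S_3$, in agreement with Frobenius reciprocity.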


(8.15) Further properties of induced representations.

(i) Frobenius reciprocity: If $f$ is a class function on $G$, then

$$(f|\chi_\rho)_G = (f_H|\chi_\theta)_H,$$

where $f_H$ is the restriction to $H$ of the class function $f$ on $G$, and the inner products are calculated on $G$ and $H$ respectively.

(ii) Mackey’s criterion: The complex representation $(\mathrm{Ind}_H^G(W), \rho)$ of $G$ is irreducible if and only if $(W, \theta)$ is irreducible and for every element $s \in G \smallsetminus H$, the two representations $\theta|_{sHs^{-1} \cap H}$ and $\theta^s$ of $sHs^{-1} \cap H$ are disjoint (i.e. do not contain any irreducible representation in common). Here $\theta|_{sHs^{-1} \cap H}$ is the restriction to $sHs^{-1} \cap H$ of the representation $\theta$ of $H$, and $\theta^s$ is given by

$$\theta^s(x) := \theta(s^{-1} x s) \quad \forall x \in H \cap sHs^{-1}.$$

(iii) Artin’s theorem: Every character of a finite group $G$ is a $\mathbb{Q}$-linear combination of characters of representations induced from cyclic subgroups of $G$.

(iv) Brauer’s theorem: Every character of a finite group $G$ is a $\mathbb{Z}$-linear combination of characters of representations induced from subgroups $H \subseteq G$ such that $H$ is isomorphic to the product of a cyclic group with a $p$-group for some prime $p$.

