Diophantine geometry

Prof. Gisbert Wüstholz, ETH Zurich

Spring Semester, 2013

Contents

About these notes
Introduction

1 Algebraic Number Fields

2 Rings in Arithmetic
2.1 Integrality
2.2 Valuation rings
2.3 Discrete valuation rings

3 Absolute Values and Completions
3.1 Basic facts about absolute values
3.2 Absolute Values on Number Fields
3.3 Lemma of Nakayama
3.4 Ostrowski's Theorem
3.5 Product Formula for Q
3.6 Product formula for number fields

4 Heights and Siegel's Lemma
4.1 Absolute values and heights
4.2 Heights in affine and projective spaces
4.3 Height of matrices
4.4 Absolute height
4.5 Liouville estimates
4.6 Siegel's Lemma

5 Logarithmic Forms
5.1 Historical remarks
5.2 The Theorem
5.3 Extrapolations
5.4 Multiplicity estimates

6 The Unit Equation
6.1 Units and S-units
6.2 Unit and regulator

7 Integral points in P1 \ {0, 1, ∞}

8 Appendix: Lattice Theory
8.1 Lattices
8.2 Geometry of numbers
8.3 Norm and Trace
8.4 Ramification and discriminants

About these notes

These are notes from the course on Diophantine Geometry of Prof. Gisbert Wüstholz in the Spring of 2013.

Please attribute all errors first to the scribe.¹

Introduction

The primary goal will be to consider the unit equation and especially its effective solution via linear forms in logarithms.

Lecture 1 introduces the primary object of study in algebraic number theory: an algebraic number field. Lecture 2 will move on to arithmetic rings, including local rings, valuation rings and Dedekind rings. Dirichlet proved a basic result:

Theorem 0.1 (Dirichlet). The group of units of the ring of integers of a number field is finitely generated.

Lecture 3 treats the general theory of absolute values.

Heights and Siegel's Lemma provide the subject matter of Lecture 4. We will begin with the notion of an absolute value for a general algebraic number field and study how absolute values split upon extension of the number field. This information determines the height, a key tool for modern number theory (see, e.g., the six hundred page tome, Heights, by Bombieri and Gubler). Siegel's lemma will be treated only in a very elementary form. We will examine the problem of finding integer solutions of equations of the form

\[ x_1 a_1 + \cdots + x_n a_n = 0, \qquad a_1, \dots, a_n \in \mathbb{Z} \text{ not all zero}. \]

¹ Jonathan Skowera, [email protected]

The discussion proceeds in Lecture 5 to linear forms in logarithms. Baker received the Fields Medal for a version of the following theorem: (the best version, presented here, is due to

Theorem 0.2. Let $a_1, \dots, a_n$ be n integers different from 0 and 1, and let $b_1, \dots, b_n$ be any integers. Then

\[ |a_1^{b_1} \cdots a_n^{b_n} - 1| > f(A, B, n) > A^{-C(n) \log B}, \qquad A = \max(|a_i|), \quad B = \max(|b_i|, e), \]

where C(n) is an effective constant.²

The proof requires much machinery, so we will prove a simpler version. A trivial lower bound would be $A^{-CB}$.

The theorem includes an absolute value, and for a general algebraic number field, the various embeddings of the number field into the complex numbers C induce various absolute values.

Fix a finitely generated subgroup Γ ⊂ K×. Then one can determine exactly (in principle) the solutions x and y of the unit equation. Hence the whole chain of results can be solved effectively. Solutions of covers of $\mathbb{P}^1$ follow from this, as does an effective form of Siegel's theorem.

Roth received the Fields Medal for the following theorem. Note that it is not effective!

Theorem 0.3 (Thue-Siegel-Dyson-Schneider-Roth). Let α be algebraic and irrational, and let ε > 0. Then

\[ |x - \alpha y| < \max(|x|, |y|)^{-(1+\varepsilon)} \]

has only finitely many solutions.

In Thue's version from 1909, n = deg α and the exponent is n/2. Because this is the non-effective side of the story, we will leave it aside for the course and concentrate on the arithmetic approach.

Theorem 0.4 (Fermat's Last Theorem/Wiles' Theorem). For n ≥ 3, the equation $x^n + y^n = 1$ has only finitely many solutions x, y ∈ Q.

Precursors to the theorem are the work of Masser and Wüstholz.

In Lecture 6, we will briefly touch on lattice theory. Let B be a simply connected domain in $\mathbb{R}^3$ containing a single lattice point: the origin. As the domain expands, it grows to contain more and more points. If the expanded domain λB, $\lambda \in \mathbb{R}_{>1}$, contains a lattice point on its boundary, then the number λ is said to be a minimum of B.

² We will call an effective constant any constant which one can, at least in principle, compute.

Beginning with a smaller B generally increases the values of the successive minima, which are a function of the domain, $\lambda_1(B) < \lambda_2(B) < \cdots$. The $\lambda_i$ depend on two parameters: the volume vol(B) of the domain and the determinant $\mathrm{vol}(\mathbb{R}^3/\Lambda)$ of the lattice. We only touch the surface of this very interesting theory, whose modern developments include Hermitian lattices and their algebraic geometry, including generalizations of theorems like Riemann-Roch.

In Lecture 7 we will discuss unit equations, which are basic tools for solving a large class of diophantine equations and diophantine problems.

Definition 0.5 (Unit equation). Let K be an algebraic number field, and let Γ ⊂ K× be a finitely generated subgroup. The unit equation is x + y = 1, where x, y ∈ Γ.
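To make the definition concrete, here is a minimal sketch of a bounded search for solutions of the unit equation. It assumes K = Q and takes Γ to be the group generated by −1, 2 and 3; the function name `s_unit_solutions` and the exponent bound `max_exp` are illustrative choices, not part of the notes. The search only looks inside a box of exponents, so it finds solutions but does not prove that the list is complete.

```python
from fractions import Fraction
from itertools import product

def s_unit_solutions(primes=(2, 3), max_exp=4):
    """Search for x + y = 1 with x, y in the group generated by -1 and `primes`.

    Only exponents in [-max_exp, max_exp] are tried, so this is a bounded
    search, not a proof that the list is complete.
    """
    exponents = range(-max_exp, max_exp + 1)
    gamma = set()
    for exps in product(exponents, repeat=len(primes)):
        value = Fraction(1)
        for p, e in zip(primes, exps):
            value *= Fraction(p) ** e
        gamma.update((value, -value))
    # Keep the pairs (x, 1 - x) for which both entries lie in the search box.
    return sorted((x, 1 - x) for x in gamma if (1 - x) in gamma)

solutions = s_unit_solutions()
for x, y in solutions:
    print(f"{x} + {y} = 1")
```

Among the pairs found are 9 + (−8) = 1 and 1/2 + 1/2 = 1. Making the search effective, i.e., bounding the exponents a priori, is exactly what linear forms in logarithms provide.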

In Lecture 8, we will examine the Thue equation.³

Definition 0.6 (Thue equation). The Thue equation of degree d is of the form

\[ f(x, y) = 1 \]

for a homogeneous polynomial f of degree d with at least 3 distinct zeros and coefficients in Q.

The unit equation has further ramifications for solutions of $y^n = f(x)$.

The projective closure $\bar{C}$ of the solution set $C = \{(x, y) \mid y^n = f(x)\}$ forms a covering of $\mathbb{P}^1$ of degree n (technically, an algebraic curve $\bar{C}$ with a morphism $\bar{C} \to \mathbb{P}^1$). The curve is defined over Q, and its genus can be calculated by the Hurwitz formula. The equation has only finitely many solutions in integers x and y. This fact relies on the unit equation.

Conjecture 0.7 (Mordell conjecture). If C has genus g > 1 over the field Q, then it has only finitely many rational points.

Faltings proved this fact in 1983 in the more general case where the curve C may be over any finite extension of Q.

Conjecture 0.8 (Shafarevich's Conjecture). Fix a number field K and a finite set of primes S. Then there are only finitely many elliptic curves (up to isomorphism) over K which have good reduction outside of S.

³ Thue's 150th birthday incidentally occurs this year.

The general Shafarevich conjecture makes the same claim for abelian varieties.

Good reduction means the following: let $y^2 = x^3 + ax + b$, a, b ∈ Z, be reduced modulo p. Its discriminant has the form $\Delta = 4a^3 + 27b^2$. When p | ∆, the discriminant vanishes modulo p, and hence the reduced curve is not elliptic (it is not smooth). In all other cases, the curve is said to have good reduction at p.
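The criterion above can be sketched in a few lines. The code below applies the condition p | ∆ literally to the given equation, with ∆ = 4a³ + 27b² as normalised in the text; the helper names `prime_divisors` and `bad_primes` are illustrative. The prime 2 needs separate care for equations of this shape, which the sketch ignores.

```python
def discriminant(a, b):
    """Discriminant 4a^3 + 27b^2 in the normalisation used in the text."""
    return 4 * a**3 + 27 * b**2

def prime_divisors(n):
    """Prime divisors of a non-zero integer n by trial division."""
    n = abs(n)
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def bad_primes(a, b):
    """Primes p dividing the discriminant of y^2 = x^3 + ax + b."""
    delta = discriminant(a, b)
    if delta == 0:
        raise ValueError("singular cubic: the discriminant is zero")
    return prime_divisors(delta)

print(bad_primes(-1, 0))   # y^2 = x^3 - x
print(bad_primes(0, 1))    # y^2 = x^3 + 1
print(bad_primes(1, 1))    # y^2 = x^3 + x + 1
```

For a fixed finite set S of primes, Shafarevich's conjecture concerns the curves whose list of bad primes is contained in S.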

The Mordell conjecture is a consequence of the Shafarevich conjecture, according to Parshin.

Faltings proved the Shafarevich conjecture, and hence the Mordell conjecture. The proof reduces the problem to a covering of $\mathbb{P}^1$, which hence has finitely many solutions. For a fixed ∆, the covering is $4a^3 = -27b^2 + \Delta$.

Theorem 0.9 (Siegel's theorem). Let f(x, y) = 0 define a curve of genus at least one with coefficients in Q. Then there are only finitely many solutions (x, y) ∈ Z × Z.

Certain questions can be generalized to number fields of class number one, which are those number fields whose rings of integers have unique factorization of elements into primes (equivalently, in which every ideal is principal).

Conjecture 0.10 (Gauß conjecture on class numbers). There are only finitely many imaginary quadratic number fields with class number one.

Baker demonstrated the truth of this conjecture by showing that

\[ |\Delta_K| \le 163 \]

for all $K = \mathbb{Q}(\sqrt{n})$, n < 0, of class number one.


Lecture 1

Algebraic Number Fields

Let K ⊇ Q be a finite extension of fields. Then $K = \sum_{i=1}^{n} \mathbb{Q} \cdot \omega_i$ is in particular a finite-dimensional Q-vector space, where n = [K : Q] is called the degree of the field extension.

The Galois group of the extension K/Q is

\[ \mathrm{Gal}(K/\mathbb{Q}) = \{\varphi : K \xrightarrow{\sim} K \mid \varphi \text{ fixes } \mathbb{Q}\} \]

(in fact, the condition is vacuous, since homomorphisms send 1 to 1). The extension need not be normal, hence not Galois, but the set

\[ \mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) = \{\sigma : K \to \mathbb{C} \text{ homomorphism}\} \]

is a measure of the distance of the field from being Galois and gives a criterion: the extension K ⊇ Q is Galois if and only if $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) \cong \mathrm{Gal}(K/\mathbb{Q})$. Furthermore, there is an action,

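[Diagram: the embedding ρ : K → C factors as an isomorphism K ≅ ρ(K) followed by the inclusion ρ(K) ⊆ C; the image is denoted K′, with π acting on it.]

where the rightmost notation means that the Galois group acts on the image (π ∈ Gal(K/Q)). So $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C})$ is a principal homogeneous space under π.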

Theorem 1.1 (Primitive element theorem). The finite extension K/Q has a primitive element, i.e., there exists an α ∈ K such that K = Q(α) = Q[α], where the first field is the field of rational functions in α, which is equal to the polynomial ring $\mathbb{Q}[\alpha] = \mathbb{Q} \cdot 1 + \mathbb{Q} \cdot \alpha + \cdots + \mathbb{Q} \cdot \alpha^{n-1}$. Hence

\[ K \cong \mathbb{Q}[T]/(f_\alpha(T)), \qquad f_\alpha(T) \text{ irreducible}, \quad f_\alpha(\alpha) = 0. \]

Let $O_K \subset K$ be the ring of integers of K, i.e.,

\[ O_K = \{\alpha \in K \mid \text{minimal polynomial } f_\alpha = T^n + a_1 T^{n-1} + \cdots + a_{n-1} T + a_n, \ a_i \in \mathbb{Z}\}. \]

It is not immediate that this forms a ring, so we must return to this point.

Rings and Ideals

Convention 1.2. Rings R are always commutative and unitary. Homomorphisms of rings always send 1 to 1.

A ⊆ B means that A is a subset of B, while A ⊂ B means that A ⊆ B and A ≠ B.

Recall that a ring is entire¹ if 0 ≠ 1 and the ring is without zero divisors, i.e., ab = 0 implies a = 0 or b = 0.

• A subset I ⊆ R is an ideal if and only if I is an R-submodule of R.

• An ideal p ⊂ R is prime if and only if rs ∈ p implies r ∈ p or s ∈ p for all r, s ∈ R.² Equivalently, p is prime if and only if R/p is entire.

• An ideal q ⊂ R is primary if and only if rs ∈ q implies r ∈ q or $s^n \in q$ for some n.³ These are powers of prime ideals in the ring of integers of a number field.

• An ideal p ⊂ R is maximal if and only if R/p is a field.

Let M be an R-module. Then the annihilator ideal of M in R is

\[ \mathrm{Ann}(M) = 0 : M = \{r \in R : rM = 0\} \subseteq R. \]

For example, if R = Z and M = Z/6Z, then the annihilator ideal is Ann(M) = 6Z.

To M are associated prime ideals which occur as annihilators of submodules generated by one element:

\[ \mathrm{ass}(M) = \{p \subset R \text{ prime ideal} \mid \exists m \in M : p = \mathrm{Ann}(Rm)\}. \]

Returning to the viewpoint of K as a Q-vector space, to each Q-basis $B = \{\omega_1, \dots, \omega_n\} \subset K$ is associated a discriminant,

\[ \Delta(\omega_1, \dots, \omega_n) = \det(\sigma_i \omega_j)^2, \]

where 1 ≤ i, j ≤ n and $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) = \{\sigma_1, \dots, \sigma_n\}$. Using the primitive element theorem, this can be made more explicit. Let $K = \mathbb{Q}[\alpha] = \mathbb{Q} \cdot 1 + \mathbb{Q} \cdot \alpha + \cdots + \mathbb{Q} \cdot \alpha^{n-1} \subset \mathbb{C}$. Then the primitive element α has a minimal polynomial $f_\alpha$, the lowest degree polynomial with integer coefficients such that $f_\alpha(\alpha) = 0$.

¹ Sometimes called integral.

² Compare to the case of integers and a prime number p: mn ∈ (p) ⇒ m ∈ (p) or n ∈ (p) means p | mn ⇒ p | m or p | n.

³ Compare to the case of integers and a prime power $p^n$: $ab \in (p^n) \Rightarrow a \in (p^n)$ or $b^n \in (p^n)$ means $p^n \mid ab \Rightarrow p^n \mid a$ or $p^n \mid b^n$.

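\[
\begin{aligned}
f_\alpha(T) &= a_0 T^n + a_1 T^{n-1} + \cdots + a_{n-1} T + a_n, \qquad a_i \in \mathbb{Z} \\
&= a_0 \prod_{i=1}^{n} (T - \alpha_i) = a_0 \prod_{i=1}^{n} (T - \sigma_i \alpha),
\end{aligned}
\]

\[ \mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) = \{\sigma_1, \dots, \sigma_n\}, \qquad \sigma_i(\alpha) = \alpha_i. \]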

The discriminant is a polynomial in the coefficients $a_i$ which vanishes exactly when $f_\alpha(T)$ has a multiple root. The exact expression appears as the determinant of a Sylvester matrix, but the situation becomes much simpler if the roots $\alpha_i$ are used instead. In that case, a multiple root means that $\alpha_i = \alpha_j$ for some i ≠ j, and the Vandermonde determinant calculates the discriminant:

\[
\Delta(B) = \Delta(1, \alpha, \dots, \alpha^{n-1}) =
\begin{vmatrix}
1 & \alpha_1 & \cdots & \alpha_1^{n-1} \\
1 & \alpha_2 & \cdots & \alpha_2^{n-1} \\
\vdots & \vdots & & \vdots \\
1 & \alpha_n & \cdots & \alpha_n^{n-1}
\end{vmatrix}^2
= \prod_{i<j} (\alpha_j - \alpha_i)^2.
\]

For example, let K = Q[i]. Then α = i and

\[ f_\alpha(T) = T^2 + 1 = (T - i)(T + i) = (T - i)(T - \sigma(i)), \]

where σ : K → K exchanges i and −i. In this case, the discriminant of the associated basis is $\Delta(1, i) = (i - (-i))^2 = (2i)^2 = -4$. Note that this recovers the usual discriminant of a quadratic polynomial from algebra: $b^2 - 4ac$.
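The product formula over pairs of roots is easy to check numerically. The following is a small sketch, with the function name `vandermonde_discriminant` chosen for illustration; it works with floating-point or complex roots, so the second example is only approximately equal to the exact value.

```python
from itertools import combinations

def vandermonde_discriminant(roots):
    """Product of (alpha_j - alpha_i)^2 over all pairs i < j."""
    disc = 1
    for a_i, a_j in combinations(roots, 2):
        disc *= (a_j - a_i) ** 2
    return disc

# K = Q[i]: the conjugates of alpha = i are i and -i.
print(vandermonde_discriminant([1j, -1j]))

# Monic quadratic T^2 + bT + c with numerically computed roots,
# compared with the familiar expression b^2 - 4c.
b, c = 3, 1
root = (b * b - 4 * c) ** 0.5
roots = [(-b + root) / 2, (-b - root) / 2]
print(vandermonde_discriminant(roots), b * b - 4 * c)
```

The first line reproduces the value −4 computed for Q[i] in the text.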

Norm and Trace

The norm of an element in a number field is just the product of all its conjugates. But the norm and the trace can also be defined by linear algebra, using the left regular representation of K,

\[ \rho : K \to \mathrm{End}_{\mathbb{Q}}(K), \qquad x \mapsto (a \mapsto x \cdot a). \]

Via this representation, the norm and trace on K are defined as the group homomorphisms

\[ N : K^\times \to \mathbb{Q}^\times, \quad x \mapsto \det(\rho(x)), \qquad \mathrm{Tr} : K \to \mathbb{Q}, \quad x \mapsto \mathrm{tr}(\rho(x)). \]
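As an illustration of the linear-algebra definition, here is a minimal sketch for a quadratic field Q(√d), assuming the Q-basis (1, √d). The helper `regular_representation` and the encoding of x = a + b√d by the pair (a, b) are illustrative assumptions, not notation from the notes.

```python
from fractions import Fraction

def regular_representation(a, b, d):
    """Matrix of multiplication by x = a + b*sqrt(d) in the basis (1, sqrt(d)).

    Columns are the images of the basis vectors:
    x * 1 = a + b*sqrt(d) and x * sqrt(d) = b*d + a*sqrt(d).
    """
    a, b = Fraction(a), Fraction(b)
    return [[a, b * d],
            [b, a]]

def norm(a, b, d):
    m = regular_representation(a, b, d)
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]   # determinant of rho(x)

def trace(a, b, d):
    m = regular_representation(a, b, d)
    return m[0][0] + m[1][1]                       # trace of rho(x)

# x = 1 + 2i in Q(i): the conjugates are 1 + 2i and 1 - 2i.
print(norm(1, 2, -1), trace(1, 2, -1))
```

Here the determinant equals the product of the two conjugates and the trace equals their sum, which is the content of the question posed in the text for this special case.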

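Question: Does $N(x) = \mathrm{Norm}_{K/\mathbb{Q}}(x)$? Does $\mathrm{Tr}(x) = \mathrm{Trace}_{K/\mathbb{Q}}(x)$?

We observe that if $B = \{\omega_1, \dots, \omega_n\} \subset O_K$ is a Q-basis of K consisting of elements of the ring of integers, then its discriminant $\Delta(\omega_1, \dots, \omega_n)$ is an integer.

Theorem 1.3. Let $B = \{\omega_1, \dots, \omega_n\} \subset O_K$ be a basis such that $|\Delta(\omega_1, \dots, \omega_n)|$ is minimal among all such bases. Then $O_K = \mathbb{Z}\omega_1 + \cdots + \mathbb{Z}\omega_n$.

Two such minimal bases differ by an element in $GL_n(\mathbb{Z})$.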

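Proof. Let $\alpha \in O_K$. Then

\[
\begin{aligned}
\alpha &= a_1 \omega_1 + \cdots + a_n \omega_n, \qquad a_i \in \mathbb{Q}, \ 1 \le i \le n, \\
&= m_1 \omega_1 + \cdots + m_n \omega_n + (r_1 \omega_1 + \cdots + r_n \omega_n), \qquad m_i = \lfloor a_i \rfloor, \ r_i = a_i - m_i.
\end{aligned}
\]

Assume that not all $r_i$ are zero, and without loss of generality, let us assume that $r_1 \ne 0$. Change basis:

\[
\begin{aligned}
\omega'_1 &:= \alpha - m_1 \omega_1 - \cdots - m_n \omega_n, \\
\omega'_2 &:= \omega_2, \\
&\ \ \vdots \\
\omega'_n &:= \omega_n.
\end{aligned}
\]

So

\[
\Delta(\omega'_1, \dots, \omega'_n) = \det
\begin{pmatrix}
r_1 & r_2 & \cdots & r_n \\
0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}^2
\Delta(\omega_1, \dots, \omega_n) = r_1^2 \, \Delta(\omega_1, \dots, \omega_n).
\]

Since $0 < r_1 < 1$, the elements $\omega'_1, \dots, \omega'_n$ form a basis in $O_K$ with $|\Delta(\omega'_1, \dots, \omega'_n)| < |\Delta(\omega_1, \dots, \omega_n)|$, a contradiction.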
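The scaling by $r_1^2$ in the proof can be seen in a small example. The sketch below assumes K = Q(√5) and encodes a + b√5 as the pair (a, b); the closed formula in `discriminant` comes from expanding det(σ_i ω_j) for the two embeddings √5 ↦ ±√5 and is specific to quadratic fields.

```python
from fractions import Fraction

def discriminant(omega1, omega2, d):
    """Discriminant of the basis (omega1, omega2) of Q(sqrt(d)).

    Each omega is a pair (a, b) standing for a + b*sqrt(d). With the two
    embeddings sqrt(d) -> +sqrt(d), -sqrt(d), det(sigma_i omega_j) equals
    2*(a2*b1 - a1*b2)*sqrt(d), whose square is rational.
    """
    (a1, b1), (a2, b2) = omega1, omega2
    return 4 * d * (Fraction(a1) * b2 - Fraction(a2) * b1) ** 2

one = (1, 0)
sqrt5 = (0, 1)
golden = (Fraction(1, 2), Fraction(1, 2))   # (1 + sqrt(5))/2, a root of T^2 - T - 1

print(discriminant(one, sqrt5, 5))    # basis {1, sqrt(5)}
print(discriminant(one, golden, 5))   # basis {1, (1 + sqrt(5))/2}
```

The integral element (1 + √5)/2 has coefficient 1/2 on √5, so replacing √5 by it divides the discriminant by 4, from 20 to 5, exactly as in the change of basis in the proof.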


Lecture 2

Rings in Arithmetic

In this lecture we consider localization, integrality and generalities on valuation rings.

Definition 2.1. Let R be a commutative, unital ring and S ⊂ R a multiplicative set, i.e., a set with 1 ∈ S such that x, y ∈ S implies xy ∈ S. Then the localization of R with respect to S makes precise the notion of a ring of fractions $\{\frac{r}{s} \mid r \in R, s \in S\}$. In particular, the localization $S^{-1}R$ can be formalized as a ring of (equivalence classes of) pairs (r, s):

\[ S^{-1}R := (R \times S)/\sim, \qquad (r, s) \sim (r', s') \Leftrightarrow \exists t \in S : t(rs' - r's) = 0, \]

\[ [(r, s)] + [(r', s')] = [(rs' + r's, ss')], \qquad [(r, s)] \cdot [(r', s')] = [(rr', ss')], \qquad 0 = [(0, 1)], \quad 1 = [(1, 1)]. \]

Direct calculation shows that addition and multiplication are well-defined on equivalence classes.

The reason for the requirement that S be multiplicative becomes clear: the denominators of the pairs contain terms of the form ss′. These formulas follow from the analogy with fractions, i.e., $(r, s) \leftrightarrow \frac{r}{s}$. For example,

\[ \frac{r}{s} = \frac{r'}{s'} \Leftrightarrow rs' = r's \Leftrightarrow rs' - r's = 0. \]

The additional t in the definition of ∼ deals with the case where R has zero-divisors, i.e., the counter-intuitive case when neither t nor rs′ − r′s is zero and yet their product is. For entire rings with 0 ∉ S, the t may be left off.
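A small sketch shows the role of t when zero-divisors are present. It assumes R = Z/6Z and the multiplicative set S = {1, 2, 4} of powers of 2 modulo 6; the function name `equivalent` and the encoding of pairs as tuples are illustrative.

```python
def equivalent(pair1, pair2, S, modulus):
    """Test (r, s) ~ (r', s') in S^{-1}(Z/modulus Z) using the definition with t."""
    (r, s), (r2, s2) = pair1, pair2
    return any((t * (r * s2 - r2 * s)) % modulus == 0 for t in S)

# R = Z/6Z, S = {1, 2, 4}: closed under multiplication modulo 6.
S = {1, 2, 4}
print(equivalent((3, 1), (0, 1), S, 6))   # t = 2 gives 2 * 3 = 6 = 0 in Z/6Z
print(equivalent((1, 1), (0, 1), S, 6))   # no t in S kills 1
```

Without the factor t, the pairs (3, 1) and (0, 1) would be distinct, since 3·1 − 0·1 = 3 is non-zero in Z/6Z; with it, the class of 3 becomes zero in the localization.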

There is a well-defined, canonical morphism,

\[ \iota_S : R \to S^{-1}R, \qquad r \mapsto [(r, 1)]. \]

If R is entire, and 0 ∉ S, then $\iota_S$ is injective, and $S^{-1}R$ is entire.

• Let R = K and S = K×. Then localization has no effect, since a field already contains all fractions, i.e., $S^{-1}R = K$.

• Localization at a multiplicative set containing 0 always produces the zero ring, i.e., if 0 ∈ S, then $S^{-1}R = 0$ is the zero ring.

• Let p ⊂ R be a prime ideal. Then S := R \ p is a multiplicative set by the definition of a prime ideal. The localization at S is denoted by $R_p := (R \setminus p)^{-1}R$. It has a unique maximal ideal $pR_p$. In fact, there is a bijective correspondence,

\[ \{I \subset p \text{ ideal} \mid (\forall s \in R \setminus p)\ sa \in I \Rightarrow a \in I\} \longleftrightarrow \{J \subset R_p \text{ ideal}\}, \qquad I \mapsto IR_p, \qquad \iota_p^{-1}(J) \mapsfrom J. \]

This holds in particular for prime ideals q ⊂ p.

• Let R = Z and S = Z \ {0}. Then $S^{-1}R = \{\frac{a}{b} \mid a, b \in \mathbb{Z}, b \ne 0\} = \mathbb{Q}$.

• More generally, let R be entire and S = R \ {0}. Then $S^{-1}R = \mathrm{Frac}(R)$ is the field of fractions of R.

• Let R = Z and p be a prime number. Letting S = Z \ pZ gives the localization $S^{-1}R = \mathbb{Z}_{(p)} = \{\frac{a}{b} \mid a, b \in \mathbb{Z}, (p, b) = 1\}$.¹ The only ideals are the zero ideal (0) and powers of the maximal ideal $(p^n)$ for n ∈ N.

• Again let R = Z and p be a prime. Letting $S = \{p^n \mid n \in \mathbb{N}\}$ and localizing gives the ring of fractions whose denominators are powers of p, i.e., $S^{-1}\mathbb{Z} = \{\frac{a}{p^n} \mid a \in \mathbb{Z}, n \in \mathbb{N}\}$.

Definition 2.2. A ring of the form $R_p$ is called a local ring.² Equivalently, a local ring is a ring with a unique maximal ideal. They are called local in analogy with geometry, where the ring of germs of functions (i.e., locally defined functions) at a fixed point p of a smooth manifold has a unique maximal ideal formed by the functions vanishing at p.

2.1 Integrality

Let R be a commutative, unitary, entire ring. An (associative, commutative, unitary) R-algebra A is an R-module A with a bilinear multiplication β : A × A → A, i.e., a · (b + c) = a · b + a · c and (b + c) · a = b · a + c · a. We identify³ R with its isomorphic image in A under the ring morphism R → A which maps $r \mapsto r \cdot 1_A$.

¹ The notation (p, b) means the greatest common divisor gcd(p, b) of p and b.

² That is, R′ is local if there exists a ring R and prime ideal p ⊂ R such that $R' \cong R_p$.

³ That is, we abuse notation to write R ⊂ A.

Definition 2.3. An element a ∈ A is integral over R if and only if there exists g(T) ∈ R[T] of the form

\[ g(T) = T^n + g_1 T^{n-1} + \cdots + g_{n-1} T + g_n \]

such that g(a) = 0.

The weakness of this definition is that it does not immediately show why sums and products of integral elements are integral. The theorem below remedies this.

Definition 2.4. An R-algebra A is integral over R if and only if every a ∈ A is integral over R. For example, R is integral over R, since for any r ∈ R, the polynomial g(T) = T − r has zero g(r) = 0.

Theorem 2.5 (Equivalent characterizations of integrality). There are three equivalent conditions for the integrality of an element of an R-algebra.

(i) a ∈ A is integral over R.

(ii) R[a] is a finitely generated R-module, where R[a] is the smallest subalgebra of A containing a, i.e., $R[a] := \bigcap_{A' \ni a} A'$.

(iii) There exists an R[a]-module M which is finitely generated as an R-module and satisfies $\mathrm{Ann}_{R[a]}(M) = 0$.

Proof. (i) ⇒ (ii): Let $M_q := \langle 1, a, a^2, \dots, a^{n+q} \rangle \subset R[a] = \sum_{k \ge 0} R a^k$ for q ≥ 0. We have

\[
\begin{aligned}
& 0 = g(a) = a^n + g_1 a^{n-1} + \cdots + g_{n-1} a + g_n \\
\Rightarrow\ & a^n = -g_1 a^{n-1} - \cdots - g_n \\
\Rightarrow\ & a^{n+q} = -g_1 a^{n+q-1} - \cdots - g_n a^q \in M_{q-1} \\
\Rightarrow\ & M_0 \subset M_q \subset M_{q-1} \subset \cdots \subset M_0 \\
\Rightarrow\ & R[a] = M_0 \text{ is finitely generated.}
\end{aligned}
\]

(ii) ⇒ (iii): How can we find an M? We use the only candidate we have: M = R[a]. Then $\mathrm{Ann}_{R[a]}(R[a]) = 0$, because R[a] contains 1.

(iii) ⇒ (i): This follows immediately from the following lemma, applied with I = R.

Lemma 2.6. Let M be an R[a]-module which is finitely generated over R such that $\mathrm{Ann}_{R[a]}(M) = 0$, and let I ⊆ R be an ideal with aM ⊂ IM. Then a is a zero of a polynomial $g(T) = T^n + g_1 T^{n-1} + \cdots + g_n$ with $g_i \in I$ for all i.

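Proof of Lemma 2.6. Let $M = Rm_1 + \cdots + Rm_n$. Since aM ⊂ IM, multiplication by a can be written in terms of these generators as $a m_i = \mu_{i1}(a) m_1 + \cdots + \mu_{in}(a) m_n$ with $\mu_{ij}(a) \in I$. This defines a matrix $\mu(a) = (\mu_{ij}(a)) \in M_{n \times n}(I)$. Let $\Phi := a \cdot I_{n \times n} - \mu(a)$, and then

\[
a \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix}
= \mu(a) \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix}
\ \Rightarrow\
(a \cdot I_{n \times n} - \mu(a)) \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} = 0.
\]

Multiplying on the left by the adjugate matrix⁴ shows that det Φ = 0:

\[
(\det \Phi) \cdot \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} = 0
\Rightarrow \det \Phi \cdot m_i = 0
\Rightarrow \det \Phi \cdot M = 0
\Rightarrow \det \Phi \in \mathrm{Ann}(M) = 0
\Rightarrow \det \Phi = 0.
\]

Hence $g(T) = \det(T \cdot I_{n \times n} - \mu(a))$ is a polynomial of the required form, since g(a) = det Φ = 0.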
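The determinant argument is constructive, and a small sketch may help. It assumes R = Z, a = √2 + √3 and the module M = Z·1 + Z·√2 + Z·√3 + Z·√6; the matrix `mu` records multiplication by a on these generators. The characteristic polynomial is computed with the Faddeev-LeVerrier recursion, which is a choice made for this sketch and not a method taken from the notes.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(matrix):
    """Coefficients of det(T*I - matrix), leading coefficient first.

    Uses the Faddeev-LeVerrier recursion; for integer matrices every
    division below is exact.
    """
    n = len(matrix)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    coeffs = [1]
    m = identity
    for k in range(1, n + 1):
        am = mat_mul(matrix, m)
        c = -sum(am[i][i] for i in range(n)) // k
        coeffs.append(c)
        m = [[am[i][j] + c * identity[i][j] for j in range(n)] for i in range(n)]
    return coeffs

# Row i expresses a * m_i in the generators (1, sqrt2, sqrt3, sqrt6), a = sqrt2 + sqrt3:
#   a * 1     = sqrt2 + sqrt3
#   a * sqrt2 = 2 + sqrt6
#   a * sqrt3 = 3 + sqrt6
#   a * sqrt6 = 3*sqrt2 + 2*sqrt3
mu = [[0, 1, 1, 0],
      [2, 0, 0, 1],
      [3, 0, 0, 1],
      [0, 3, 2, 0]]

g = char_poly(mu)
print(g)

a = 2 ** 0.5 + 3 ** 0.5
print(sum(coef * a ** (len(g) - 1 - i) for i, coef in enumerate(g)))
```

The resulting monic polynomial with integer coefficients vanishes at √2 + √3 (up to rounding in the numerical check), which exhibits the sum of two integral elements as integral.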

Corollary 2.7. (i) An element a ∈ A is integral over R if and only if there exists a finite⁵ R-algebra A′ ⊂ A such that R[a] ⊂ A′.

(ii) The set $\bar{R}$ of elements of A which are integral over R is a subring of A called the integral closure of R in A.

Proof. Part (i). ⇒: Since a is integral over R, part (ii) of the above theorem shows that R[a] is finitely generated, so we may take A′ = R[a]. ⇐: Let A′ be a finite R-algebra and R[a] ⊂ A′. Hence aA′ ⊂ A′. So A′ is an R[a]-module with $\mathrm{Ann}_{R[a]}(A') = 0$, since 1 ∈ R[a] ⊂ A′. Using the (iii) ⇒ (i) part of the above theorem shows that a is integral over R.

Part (ii). Let $a, b \in \bar{R}$. Then there exist R-subalgebras A′, A′′ ⊂ A such that R[a] ⊂ A′ and R[b] ⊂ A′′ and A′, A′′ are finitely generated R-modules. The product A′A′′ is also finitely generated.⁶ Since R[a + b], R[ab] ⊂ R[a, b] and R[a, b] ⊂ R[a] · R[b] ⊂ A′A′′, applying Part (i) gives the proof.

The outlook for all this is the application to the case when K ⊇ Q ⊃ Z for A = K and R = Z, where K is a number field. Then the ring of integers in K is $O_K := \bar{\mathbb{Z}}$, the integral closure of Z in K.

Looking forward, we will consider absolute values, which correspond roughly to valuations, and then move on to heights.

⁴ The adjugate matrix adj(Φ) always exists and satisfies $\mathrm{adj}(\Phi) \cdot \Phi = (\det \Phi) \cdot I_{n \times n}$.

⁵ We use “finite” to mean “finitely generated as a module”.

⁶ Proof: $A' = \sum_{i=1}^{n} R a_i$, $A'' = \sum_{j=1}^{m} R b_j \Rightarrow A'A'' = \sum_{i,j} R a_i b_j$.

2.2 Valuation rings

We again let R be a commutative, unitary ring and let U(R) = R× be the units⁷ of R.

Theorem 2.8. Let $I = \bigcup_{a \subset R} a$ be the union of all proper ideals⁸ of R. Then I = R \ U(R), and I is an ideal if and only if R has a unique maximal ideal.

Proof. The first claim is proved by showing both inclusions.

I ⊇ R \ U(R): Let x ∈ R \ U(R) and let a = xR. If a were the whole ring, then we would have 1 ∈ xR. In other words, there would be an element y ∈ R with xy = 1. But then x would be a unit. Contradiction. Hence a is a proper ideal and x ∈ a ⊆ I.

I ⊆ R \ U(R): Let x ∈ I, so that x ∈ a ⊂ R for some proper ideal a. If x were a unit, then a would contain $x^{-1}x = 1$ and hence be the whole ring, a contradiction. So x ∈ R \ U(R).

The second claim is proved by showing both directions of implication.

(⇒) Assume I is an ideal. Since I contains all proper ideals and is itself proper (it does not contain the unit 1), it is by definition maximal. Let m be another maximal ideal. Then m is a proper ideal, hence I contains m. Since m is maximal and I is a proper ideal, m = I, i.e., I is unique.

(⇐) Assume R has a unique maximal ideal. Let x, y be elements of I. By the first part, they are not units, so xR, yR ≠ R. Let m be the unique maximal ideal, which must contain both xR and yR. Thus it contains both x and y. Hence x − y ∈ m and rx ∈ m for all r ∈ R. Since I contains all proper ideals, it also contains m, so x − y, rx ∈ I, i.e., I is an ideal.

Note that the theorem implies that U(R) = R \ I.
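The criterion of the theorem can be tested on small finite rings. The sketch below assumes R = Z/nZ, where the non-units are the residues sharing a factor with n; the helper names `non_units` and `is_ideal` are illustrative.

```python
from math import gcd

def non_units(n):
    """Non-units of Z/nZ: residues sharing a common factor with n."""
    return {x for x in range(n) if gcd(x, n) != 1}

def is_ideal(subset, n):
    """Check closure under subtraction and under multiplication by Z/nZ."""
    return (0 in subset
            and all((x - y) % n in subset for x in subset for y in subset)
            and all((r * x) % n in subset for r in range(n) for x in subset))

print(non_units(8), is_ideal(non_units(8), 8))   # Z/8Z: unique maximal ideal (2)
print(non_units(6), is_ideal(non_units(6), 6))   # Z/6Z: maximal ideals (2) and (3)
```

In Z/6Z the non-units 2 and 3 differ by the unit 1, so their set is not closed under subtraction, in line with the existence of two maximal ideals.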

Definition 2.9. Let R be entire with quotient field K = Frac R. The ring R is called a valuation ring if either r ∈ R or $r^{-1} \in R$ for every r ∈ K \ {0}.

Definition 2.10. We will denote by I(R) the set of ideals of R.

⁷ An element r ∈ R is a unit if it is invertible with respect to multiplication, i.e., there exists $r^{-1} \in R$ such that $r^{-1}r = 1_R$.

⁸ An ideal I ⊂ R is a proper ideal if I ≠ R.

Theorem 2.11. Let R be a valuation ring. Then the set of ideals, I(R), is totally ordered by inclusion.

Proof. Let I, J ∈ I(R). If I = J, then there is nothing to show, so let I ≠ J. Without loss of generality, let I contain an x which is not in J. Let y ∈ J \ {0}. Then

\[ \left(\frac{x}{y}\right) \cdot y = x \notin J \text{ and } y \in J \ \Rightarrow\ \frac{x}{y} \notin R. \]

By hypothesis, R is a valuation ring, hence $y/x = (x/y)^{-1} \in R$ and y = (y/x) · x ∈ I. Since y was arbitrary, J ⊂ I.

The total ordering on I(R) immediately implies a couple of corollaries.

Corollary 2.12. Let R be a valuation ring. Then R has a unique maximal ideal, denoted m(R).

Corollary 2.13. Let R be a valuation ring. Then U(R) = R \ m(R).

Corollary 2.14. Let R be a valuation ring. Then $K \setminus R = \{x \in U(K) \mid x^{-1} \in m(R)\}$.

Proof. ⊆: By the definition of a valuation ring, if x ∉ R, then $x^{-1} \in R$ and also $x^{-1} \notin U(R)$ (otherwise $x = (x^{-1})^{-1} \in U(R) \subset R$). Hence $x^{-1} \in m(R)$.

⊇: Let x ∈ U(K) with $x^{-1} \in m(R)$. If x were in R, then $1 = x \cdot x^{-1}$ would lie in m(R), since m(R) is an ideal, and m(R) would be the whole ring. Hence x is not in R at all, that is, x ∈ K \ R.

Theorem 2.15. Let R be a valuation ring. Then R is integrally closed in K.

Proof. Let x ∈ K be integral over R. Assume that x ∉ R. Since R is a valuation ring, this implies $x^{-1} \in R$. By definition of integrality, there is a polynomial relation,

\[ x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0. \]

Division by $x^n$ and rearrangement gives

\[ 1 = -a_1 x^{-1} - \cdots - a_{n-1} x^{-(n-1)} - a_n x^{-n} = x^{-1}\left(-a_1 - \cdots - a_{n-1} x^{-(n-2)} - a_n x^{-(n-1)}\right). \]

Since x ∉ R, Corollary 2.14 gives $x^{-1} \in m(R)$, and the second factor lies in R. In that case, 1 ∈ m(R), a contradiction. So the assumption was false and x ∈ R, that is, R is integrally closed.

2.3 Discrete valuation rings

Let R be an entire ring, and K = FracR be its quotient field.

Definition 2.16 (Real valuation and value group). Let K be a field. A real valuation on K is a function ν : K → R ∪ {∞} such that

1. ν(xy) = ν(x) + ν(y).

2. ν(x + y) ≥ min{ν(x), ν(y)}.

3. ν(x) = ∞ ⇔ x = 0.

The image ν(K \ {0}) ⊆ R is called the value group.

Definition 2.17 (Discrete valuation). Let K be a field. A real valuation on K such that the value group is a subgroup of Z is called a discrete valuation, in which case the value group has the form e · Z for some e > 0.
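The standard example of a discrete valuation is the p-adic valuation on Q. The following sketch, with the illustrative function name `nu`, computes it for rational numbers and checks the first two axioms on one pair of sample values.

```python
from fractions import Fraction
from math import inf

def nu(x, p):
    """p-adic valuation of a rational number x, with nu(0) = infinity."""
    x = Fraction(x)
    if x == 0:
        return inf
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

x, y = Fraction(18, 5), Fraction(3, 4)
print(nu(x, 3), nu(y, 3), nu(x * y, 3), nu(x + y, 3))
print(nu(x * y, 3) == nu(x, 3) + nu(y, 3))
print(nu(x + y, 3) >= min(nu(x, 3), nu(y, 3)))
```

Here the value group is all of Z, since ν(p) = 1.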

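We introduce the notation

\[ R_\nu := \{x \in K \mid \nu(x) \ge 0\} \subset K \ \text{(subring)}, \qquad \mathfrak{m}_\nu := \{x \in K \mid \nu(x) > 0\} \subset R_\nu \ \text{(ideal)}. \]

Then $R_\nu$ is a discrete valuation ring. (Let $x \in K \setminus R_\nu$. By the definition of $R_\nu$, this means ν(x) < 0. Hence $\nu(x^{-1}) = -\nu(x) > 0$, from which we conclude that $x^{-1} \in R_\nu$.)

In addition, $\mathfrak{m}_\nu$ is the unique maximal ideal whose existence is guaranteed by Corollary 2.12. Let $\mathfrak{m}_n := \{x \in R_\nu \mid \nu(x) \ge n\}$ for $n \in \mathbb{Z}_{>0}$. Then

\[ \mathfrak{m}_1 = \mathfrak{m}_\nu \supseteq \mathfrak{m}_2 \supseteq \cdots \supseteq \mathfrak{m}_n \supseteq \cdots \]

We take $\pi \in \mathfrak{m}_1$ with ν(π) = 1. In fact, π is uniquely determined up to multiplication by a unit in $R_\nu$. Let ν(π′) = 1. Then

\[ \nu\left(\frac{\pi'}{\pi}\right) = 0 \Rightarrow \frac{\pi'}{\pi} \in R_\nu \setminus \mathfrak{m}_\nu = U(R_\nu) \Rightarrow \frac{\pi'}{\pi} = \varepsilon \in U(R_\nu) \Rightarrow \pi' = \varepsilon\pi. \]

The same argument shows that $\mathfrak{m}_n = \mathfrak{m}^n$ for all n, where $\mathfrak{m} = \mathfrak{m}_\nu$:

\[ x \in \mathfrak{m}_n \setminus \mathfrak{m}_{n+1} \Rightarrow \nu(x) = n \Rightarrow \nu\left(\frac{x}{\pi^n}\right) = 0 \Rightarrow x = \varepsilon\pi^n \Rightarrow x \in \mathfrak{m}^n \Rightarrow \mathfrak{m}_n = \mathfrak{m}^n. \]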

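Proposition 2.18. Let R be a discrete valuation ring. Then R is a principal ideal domain.

Proof. Let I ⊆ R be a non-zero ideal. Then $\nu(I \setminus \{0\}) \subseteq \mathbb{Z}_{\ge 0}$ has a least element n. Choose x ∈ I with ν(x) = n; then $x = \varepsilon\pi^n$ for a unit ε, so $(\pi^n) \subseteq I$. Conversely, every y ∈ I has ν(y) ≥ n, so $y \in \mathfrak{m}_n = (\pi^n)$. Hence $I = \mathfrak{m}^n = (\pi^n)$ is a principal ideal.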

Theorem 2.19. Let R be a noetherian, entire local ring such that every non-zero prime ideal is maximal. Then the following are equivalent:

(i) R is a discrete valuation ring.

(ii) R is integrally closed.

(iii) $\mathfrak{m}$ is a principal ideal.

(iv) $\mathfrak{m}/\mathfrak{m}^2$ is an $R/\mathfrak{m}$-vector space of dimension one.

(v) For all ideals a ≠ 0 in R, there is an n ≥ 0 with $a = \mathfrak{m}^n$.

(vi) There exists a π ∈ R such that for all non-zero ideals a ⊆ R, $a = (\pi)^n$ for some n ≥ 0.

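Lemma 2.20. If R is a discrete valuation ring, then the module quotient $\mathfrak{m}^n/\mathfrak{m}^{n+1}$ is an $R/\mathfrak{m}$-vector space for all n, and it has dimension one.

Proof. Since $\mathfrak{m}^{n+1}$ is an R-module, multiplication by r in $\mathfrak{m}^n/\mathfrak{m}^{n+1}$ is well-defined:

\[ r : x + \mathfrak{m}^{n+1} \mapsto rx + r\mathfrak{m}^{n+1} \subseteq rx + \mathfrak{m}^{n+1}. \]

Hence $\mathfrak{m}^n/\mathfrak{m}^{n+1}$ is an R-module. Since $\mathrm{Ann}(\mathfrak{m}^n/\mathfrak{m}^{n+1}) = \mathfrak{m}$, it is furthermore an $R/\mathfrak{m}$-module. Finally, writing $\mathfrak{m}^n = (\pi^n)$, the isomorphism

\[ \mathfrak{m}^n/\mathfrak{m}^{n+1} \xrightarrow{\sim} R/\mathfrak{m}, \qquad 0 \ne x \in \mathfrak{m}^n, \ x = \varepsilon\pi^n \mapsto \varepsilon \bmod \mathfrak{m}, \]

shows that it is a one-dimensional vector space over $R/\mathfrak{m}$.

Let R be a discrete valuation ring. We give it the topology generated by the basis of open neighborhoods of 0:

\[ R \supset \mathfrak{m} \supset \mathfrak{m}^2 \supset \cdots \supset \mathfrak{m}^n \supset \cdots \supset 0. \]

The topology is separated if

\[ \bigcap_{n \ge 0} \mathfrak{m}^n = (0). \]

If R is noetherian, then the intersection $I := \bigcap_{n \ge 0} \mathfrak{m}^n$ satisfies $\mathfrak{m}I = I$, and hence I = (0) by a lemma of Nakayama (to be proved later).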

Lecture 3

Absolute Values and Completions

Today we introduce the general theory of absolute values and completions with respect to them. Then we examine the particular case of absolute values on number fields. Finally, we prove Nakayama's Lemma.

3.1 Basic facts about absolute values

Definition 3.1 (Absolute value). Let K be a field. An absolute value on Kis a function | · | : K → R≥0 such that the following three properties hold.

1. |x+ y| ≤ |x|+ |y|.

2. |xy| = |x| · |y|.

3. |x| = 0⇔ x = 0.

An absolute value is called non-archimedean if it additionally satisfies theinequality

|x+ y| ≤ max(|x|, |y|)

Lemma 3.2. Let (K, | · |) be a valued field (a field with absolute value). Then |1| = | − 1| = 1.

Proof. By multiplicativity, 1 = 1 · 1 and 1 = (−1)(−1) imply that |1| = |1| · |1| and |1| = | − 1| · | − 1|. Since only 0 has absolute value 0, we may divide the first equation by |1|, showing that |1| = 1. The second equation then gives $|-1|^2 = 1$, and since | − 1| ≥ 0, also | − 1| = 1.

Proposition 3.3. The absolute value | · | is non-archimedean if and only if | · | is bounded on the image of Z in K.

Proof. ⇒: Let the absolute value be non-archimedean. We proceed by induction on n. In the base case, |1| = 1 by Lemma 3.2. For the general case, let $n \in \mathbb{Z}_{>1}$. Then |n| = |(n − 1) + 1| ≤ max(|n − 1|, |1|) ≤ 1. Since | − n| = |n| and |0| = 0, the absolute value is bounded by 1 on Z.

⇐: Now let the absolute value be bounded on Z. Let x, y ∈ K. Consider the binomial formula

\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k, \]

\[ |x + y|^n \le \sum_{k=0}^{n} \left|\binom{n}{k}\right| \cdot M^n \le (n + 1)\, c\, M^n, \qquad M = \max(|x|, |y|), \quad c = \sup_{m \in \mathbb{Z}} |m| < \infty. \]

Hence

\[ |x + y| \le (n + 1)^{1/n} c^{1/n} M \quad \text{for all } n \in \mathbb{N}. \]

Taking the limit as n goes to infinity gives |x + y| ≤ M.
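Both sides of the proposition can be observed on the p-adic absolute value on Q. The sketch below assumes the normalisation $|x|_p = p^{-\nu_p(x)}$, which is standard but not yet introduced at this point of the notes; the function name `abs_p` is illustrative. It only samples finitely many values, so it illustrates the statement rather than proving it.

```python
from fractions import Fraction

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-nu_p(x)) on Q, with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

# Bounded on Z: the sampled integers all satisfy |n|_5 <= 1.
print(max(abs_p(n, 5) for n in range(-100, 101)))

# Ultrametric inequality on a grid of rationals.
samples = [Fraction(a, b) for a in range(-6, 7) for b in (1, 2, 5, 25)]
print(all(abs_p(x + y, 5) <= max(abs_p(x, 5), abs_p(y, 5))
          for x in samples for y in samples))
```

By contrast, the usual absolute value on Q is unbounded on Z and is archimedean.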

A straightforward calculation shows that every absolute value $|\cdot| : K \to \mathbb{R}_{\ge 0}$ induces a metric $d : K \times K \to \mathbb{R}_{\ge 0}$ by the formula

\[ d(x, y) = |x - y|. \]

Definition 3.4 (Completion). Let (K, | · |) be a field with absolute value. Let

\[ S = \{(x_n) \in K^{\mathbb{N}} \mid (\forall \varepsilon > 0)(\exists N)(\forall m, n > N)(|x_m - x_n| < \varepsilon)\} \]

be the set of Cauchy sequences. The completion, $\hat{K}$, of the field K with respect to | · | is

\[ \hat{K} = S/\sim, \qquad (x_n) \sim (y_n) \Leftrightarrow \lim_{n \to \infty} |x_n - y_n| = 0, \]

with the canonical embedding of K given by the map to constant sequences,

\[ \iota : K \to \hat{K}, \qquad x \mapsto [(x)_{n \in \mathbb{N}}]. \]

The operations are defined element-wise, i.e., $[(x_n)] + [(y_n)] = [(x_n + y_n)]$ and $[(x_n)] \cdot [(y_n)] = [(x_n y_n)]$. These are well-defined and give $\hat{K}$ the structure of a field.

The following can be easily checked:

• The completion $\hat{K}$ is complete, i.e., every Cauchy sequence in $\hat{K}^{\mathbb{N}}$ converges in $\hat{K}$.

• We extend | · | to an absolute value on $\hat{K}$ by taking limits, i.e., $|[(x_n)]| := \lim_{n \to \infty} |x_n|$.

A simple question asks whether an arbitrary field admits an absolute value. In fact, every field admits at least the trivial absolute value.

Definition 3.5 (Trivial absolute value). Let K be a field. The trivial absolute value on K is defined by |0| = 0 and |x| = 1 for all x ≠ 0 in K.

Lemma 3.6. If (K, | · |) is a finite field with absolute value, then |x| = 1 for all x ∈ K \ {0}.

Proof. The absolute value must take the zero element $0_K$ to 0. Every non-zero element x satisfies $x^{q-1} = 1$, where q is the number of elements of K, so $|x|^{q-1} = |x^{q-1}| = |1| = 1$ and hence |x| = 1.

Definition 3.7 (Equivalent absolute values). Let K be a field. Two absolute values $|\cdot|_1$ and $|\cdot|_2$ on K are said to be equivalent if they define the same topology.

Theorem 3.8. Let K be a field, and let $|\cdot|_1$ and $|\cdot|_2$ be non-trivial absolute values on K. They are equivalent if and only if $|x|_1 < 1$ implies $|x|_2 < 1$ for all x ≠ 0. In that case, there exists a λ > 0 such that

\[ |x|_2 = |x|_1^\lambda \qquad \forall x \in K \setminus \{0\}. \]

Proof. Let x ∈ K \ {0}. Then |x| < 1 if and only if $\lim_{n \to \infty} |x^n| = 0$. But equivalent topologies mean that the sequence $(x^n)$ converges to 0 with respect to $|\cdot|_1$ if and only if it does so with respect to $|\cdot|_2$. Hence

\[ |\cdot|_1 \sim |\cdot|_2 \Leftrightarrow \left(\lim_{n \to \infty} |x^n|_1 = 0 \Leftrightarrow \lim_{n \to \infty} |x^n|_2 = 0\right) \Leftrightarrow (|x|_1 < 1 \Leftrightarrow |x|_2 < 1). \]

This proves the first part.

For the second part, let $x_0 \in K \setminus \{0\}$ be such that $|x_0|_1 \ne 1$ (it exists since the absolute values are non-trivial). If $|x_0|_1 < 1$, then replace $x_0$ by its inverse, so that in either case, $|x_0|_1 > 1$. The goal is to show that $|x_0|_2 = |x_0|_1^\lambda$. Solving for λ suggests the definition,

\[ \lambda = \frac{\log |x_0|_2}{\log |x_0|_1}. \]

This definition is well-defined, because $|x_0|_1$ is neither 1 nor ∞. It depends a priori on $x_0$. Our task will be to show that it is in fact independent of this choice. Let x ∈ K \ {0} be arbitrary, and let α ∈ R be such that $|x|_1 = |x_0|_1^\alpha$. Let $(s_n)$ and $(r_n)$ be sequences in Q whose limits are both α (i.e., $\lim_{n \to \infty} s_n = \lim_{n \to \infty} r_n = \alpha$), but such that $r_n \ge \alpha$ for all n while $s_n \le \alpha$ for all n. Let $r_n = a_n/b_n$ with $a_n, b_n \in \mathbb{Z}$ and $b_n > 0$. Hence

\[
|x|_1 = |x_0|_1^\alpha \le |x_0|_1^{r_n}
\Rightarrow \left|\frac{x^{b_n}}{x_0^{a_n}}\right|_1 = \frac{|x|_1^{b_n}}{|x_0|_1^{a_n}} \le 1
\Leftrightarrow \left|\frac{x^{b_n}}{x_0^{a_n}}\right|_2 \le 1
\Rightarrow \frac{|x|_2^{b_n}}{|x_0|_2^{a_n}} \le 1
\Rightarrow |x|_2 \le |x_0|_2^{r_n}.
\]

Taking the limit as n goes to infinity results in the inequality $|x|_2 \le |x_0|_2^\alpha$. The same argument with $(r_n)$ replaced by $(s_n)$ gives $|x|_2 \ge |x_0|_2^\alpha$. Hence

\[ |x|_2 = |x_0|_2^\alpha = |x_0|_1^{\lambda\alpha} = |x|_1^\lambda \]

for all x ∈ K \ {0}.

Definition 3.9 (Discrete absolute value). An absolute value is discrete if and only if $|K \setminus \{0\}| \subseteq \mathbb{R}_{\ge 0}$ is a discrete subset.

A (non-trivial) discrete absolute value satisfies $|K \setminus \{0\}| = q^{\mathbb{Z}}$ for some $0 < q < 1$.

Definition 3.10 (Homomorphism of valued fields). A homomorphism of valued fields, $\sigma : (K, |\cdot|) \to (K', |\cdot|')$, is a field homomorphism $\sigma : K \to K'$ such that $|\sigma x|' = |x|$. In other words, $\sigma^*(K', |\cdot|') = (K, |\cdot|)$.

Next time we will restrict ourselves to the special case of number fields. We consider the set of homomorphisms from the field into the complex numbers, which form a valued field. Each such homomorphism induces an absolute value on the field, and by this construction we obtain all archimedean absolute values on the field. Then we will determine all absolute values on the rationals by the theorem of Ostrowski, which gives information about absolute values on number fields more generally. The absolute values on the rationals satisfy the product formula, i.e., the product of all absolute values of a non-zero rational number is one. Then we will be prepared to study heights.

3.2 Absolute Values on Number Fields

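Let $K \supseteq \mathbb{Q}$ be a number field, and let

$$\Sigma := \mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C})$$

be the set of field homomorphisms from $K$ to $\mathbb{C}$ (these are automatically $\mathbb{Q}$-linear). The standard absolute value on $\mathbb{C}$ is $|z| = (z \cdot \bar z)^{1/2}$ and makes $(\mathbb{C}, |\cdot|)$ into a valued field.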

Conjugation $F_\infty : z \mapsto \bar z$ preserves the standard absolute value and fixes the real numbers. It is known in this context as the Frobenius morphism at infinity. The group $\{1, F_\infty\}$ acts on $\Sigma$.

Given an embedding $\sigma : K \to \mathbb{C}$, the standard absolute value on $\mathbb{C}$ can be pulled back to give an absolute value on $K$ defined by

$$|x|_\sigma := |\sigma x|, \qquad x \in K.$$

Since $F_\infty$ preserves the absolute value, the pullback $|\cdot|_{F_\infty \sigma}$ gives an equivalent absolute value on $K$. Such pull-backs give all archimedean absolute values on a number field $K$.

To illustrate these notions, let us consider a simple example. Let $K = \mathbb{Q}(\sqrt{2})$. Then $\Sigma = \{\sigma, \tau\}$. These are two embeddings in $\mathbb{C}$,

$$\sigma : a + b\sqrt{2} \mapsto a + b\sqrt{2}, \qquad \tau : a + b\sqrt{2} \mapsto a - b\sqrt{2},$$

inducing two absolute values,

$$|a + b\sqrt{2}|_\sigma = |a + b\sqrt{2}|, \qquad |a + b\sqrt{2}|_\tau = |a - b\sqrt{2}|.$$

On the other hand, if $K = \mathbb{Q}(\sqrt{-2})$, then the embeddings

$$a + b\sqrt{-2} \mapsto a + b\sqrt{-2}, \qquad a + b\sqrt{-2} \mapsto a - b\sqrt{-2},$$

induce the same absolute value ($|a + b\sqrt{-2}| = \sqrt{a^2 + 2b^2}$), and there is just one (archimedean) absolute value on $K$.

In general, the set of embeddings can be decomposed into real and complex embeddings as $\Sigma = \Sigma_{\mathbb{R}} \cup \Sigma_{\mathbb{C}}$. Let $r = |\Sigma_{\mathbb{R}}|$ and $2s = |\Sigma_{\mathbb{C}}|$. Then $r + 2s = [K : \mathbb{Q}]$, and $r + s$ is the total number of archimedean absolute values.
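The quadratic examples above can be checked numerically. The following is a minimal Python sketch, assuming a squarefree integer $d \neq 0, 1$; the function name `archimedean_values` is hypothetical and not part of the notes. It returns one value per infinite place, so the length of the result is $r + s$.

```python
import math

def archimedean_values(a: float, b: float, d: int) -> list[float]:
    """Archimedean absolute values of a + b*sqrt(d) in Q(sqrt(d)).

    Assumes d is a squarefree integer, d not in {0, 1}.
    Returns one value per infinite place:
    two values for d > 0 (r = 2, s = 0), one value for d < 0 (r = 0, s = 1).
    """
    if d > 0:
        root = math.sqrt(d)
        # the two real embeddings sigma and tau
        return [abs(a + b * root), abs(a - b * root)]
    # the two complex embeddings are conjugate and give the same modulus
    return [abs(complex(a, b * math.sqrt(-d)))]

print(archimedean_values(1, 1, 2))    # two real places of Q(sqrt(2))
print(archimedean_values(1, 1, -2))   # one complex place of Q(sqrt(-2))
```

For $1 + \sqrt{-2}$ the single value is $\sqrt{1 + 2} = \sqrt{3}$, in line with the formula $\sqrt{a^2 + 2b^2}$.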

It remains to classify non-archimedean absolute values on $K$. To that end, let $(K, |\cdot|)$ be a non-archimedean valued field. Then

$$\mathcal{O} := \{x \in K \mid |x| \le 1\} \supseteq \mathfrak{p} := \{x \in K \mid |x| < 1\}.$$

The set $\mathcal{O}$ is a ring. Indeed, if $x, y \in \mathcal{O}$, then both $x - y$ and $xy$ are in $\mathcal{O}$, because $|x - y| \le \max(|x|, |-y|) \le 1$ and $|xy| = |x||y| \le 1$. It is even a valuation ring, because for all $x \in K^\times$, either $|x| \le 1$ or $|x| > 1$, hence either $x \in \mathcal{O}$ or $x^{-1} \in \mathcal{O}$.


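On the other hand, if we begin with $\mathcal{O}_K \subset K$, the integral closure of $\mathbb{Z}$ in $K$, and $\mathfrak{p} \subset \mathcal{O}_K$ is a non-zero prime ideal, then $\mathfrak{p}$ is maximal, as we now show. Since $\mathfrak{p} \cap \mathbb{Z} = (p)$ is a prime ideal, the number $p$ is prime. Furthermore, the quotient $\mathcal{O}_K/\mathfrak{p}$ is integral over $\mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$. Indeed, let $[x] \in \mathcal{O}_K/\mathfrak{p}$ be an equivalence class with $x \in \mathcal{O}_K$ its representative. Since $x$ is integral over $\mathbb{Z}$, it satisfies a monic polynomial relation with integral coefficients which can be reduced modulo $\mathfrak{p}$:

$$x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0 \;\Rightarrow\; [x]^n + [a_1][x]^{n-1} + \cdots + [a_{n-1}][x] + [a_n] = [0].$$

The second relation shows that $[x]$ is integral over $\mathbb{Z}/p\mathbb{Z}$.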

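Proposition 3.11. Retaining the above notation, the ring $\mathcal{O}_K/\mathfrak{p}$ is a field. More generally, an integral domain which is integral over a field is a field.

Proof. We must show that every non-zero element has an inverse. Let $[x] \in \mathcal{O}_K/\mathfrak{p}$ be non-zero. We would like to find a $[y] \in \mathcal{O}_K/\mathfrak{p}$ such that $[x][y] = [1]$. Since $[x]$ is integral over $\mathbb{Z}/p\mathbb{Z}$, it satisfies a monic relation as above. Choosing $n$ minimal and using that $\mathcal{O}_K/\mathfrak{p}$ is an integral domain (because $\mathfrak{p}$ is prime), we have $[a_n] \neq [0]$, and therefore

$$[1] = -[a_n]^{-1}\big([x]^{n-1} + [a_1][x]^{n-2} + \cdots + [a_{n-1}]\big)[x].$$

Letting $[y] = -[a_n]^{-1}([x]^{n-1} + [a_1][x]^{n-2} + \cdots + [a_{n-1}])$ finishes the proof. The same proof holds for the general case.

Since the quotient by the ideal $\mathfrak{p}$ is a field, $\mathfrak{p}$ must be a maximal ideal.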

Proposition 3.12. The ring of integers $\mathcal{O}_K$ is integrally closed and every nonzero prime ideal of $\mathcal{O}_K$ is maximal.

Since $\mathcal{O}_K$ is a noetherian domain, the immediate consequence of the proposition is that $\mathcal{O}_K$ is a Dedekind domain, i.e., an integrally closed, noetherian domain of Krull dimension one. Krull dimension one means exactly that every nonzero prime ideal is maximal.

In a Dedekind domain, every nonzero proper ideal factors uniquely into a product of prime ideals. This unique factorization property makes Dedekind domains the most important rings in arithmetic. They also enjoy the property that all primary ideals are powers of prime ideals.

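We have begun with a prime ideal $\mathfrak{p}$ in $\mathcal{O}_K$, and we would now like to construct an absolute value from it. To begin, its powers define a filtration,

$$\mathcal{O}_K \supset \mathfrak{p} \supseteq \mathfrak{p}^2 \supseteq \mathfrak{p}^3 \supseteq \cdots \supseteq \mathfrak{p}^n \supseteq \cdots \supseteq (0).$$

We claim that the above chain of ideals tends to the zero ideal in the limit, i.e.,

$$I := \bigcap_{n \ge 0} \mathfrak{p}^n = (0). \tag{3.1}$$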


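We will show this after proving Nakayama's lemma.

This vanishing implies that the inclusions of the above filtration are strict and define a basis for a separated topology on $\mathcal{O}_K$. In this case, the filtration determines a valuation by the definition

$$\mathrm{ord}_{\mathfrak{p}} : \mathcal{O}_K \to \mathbb{Z} \cup \{\infty\}, \qquad x \mapsto \min\{e \in \mathbb{N} \mid x \in \mathfrak{p}^e \setminus \mathfrak{p}^{e+1}\},$$

with $\mathrm{ord}_{\mathfrak{p}}(0) = \infty$.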

A straightforward calculation shows that this is indeed a valuation. It can be used to define an absolute value on $\mathcal{O}_K$ by choosing $0 < q < 1$ and letting

$$|x|_{\mathfrak{p}} := q^{\mathrm{ord}_{\mathfrak{p}} x}.$$

This absolute value depends on a choice, namely a choice of real number $q$ between 0 and 1. But different choices yield equivalent absolute values.
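For $K = \mathbb{Q}$ and $\mathfrak{p} = (p)$ the construction can be made concrete. The sketch below is only an illustration under that assumption; the helper names `ord_p` and `abs_p` are hypothetical. It checks that two choices $q_1, q_2$ of the base are related by $|x|_2 = |x|_1^{\lambda}$ with $\lambda = \log q_2 / \log q_1$, as in Theorem 3.8, and that the ultrametric inequality holds on a few sample values.

```python
from fractions import Fraction
import math

def ord_p(x: Fraction, p: int) -> int:
    """p-adic order of a non-zero rational number x."""
    if x == 0:
        raise ValueError("ord_p(0) is +infinity")
    n, d, e = x.numerator, x.denominator, 0
    while n % p == 0:      # powers of p in the numerator count positively
        n //= p
        e += 1
    while d % p == 0:      # powers of p in the denominator count negatively
        d //= p
        e -= 1
    return e

def abs_p(x: Fraction, p: int, q: float) -> float:
    """|x|_p := q ** ord_p(x) for a chosen base 0 < q < 1."""
    return 0.0 if x == 0 else q ** ord_p(x, p)

p, q1, q2 = 5, 0.5, 0.2
lam = math.log(q2) / math.log(q1)          # exponent relating the two choices
for x in (Fraction(50, 3), Fraction(7, 125), Fraction(-10)):
    a1, a2 = abs_p(x, p, q1), abs_p(x, p, q2)
    assert math.isclose(a2, a1 ** lam)     # equivalent absolute values
    assert abs_p(x + 1, p, q1) <= max(a1, 1.0)   # ultrametric inequality
```

The choice $q = 1/p$ gives the usual normalized $p$-adic absolute value on $\mathbb{Q}$.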

Definition 3.13 (Place). A place of a number field $K$ is an equivalence class of absolute values on $K$. There are two kinds of places. An infinite place is a class containing an archimedean absolute value, while a finite place contains non-archimedean absolute values.

Every prime ideal $\mathfrak{p} \subset \mathcal{O}_K$ defines a place by taking the equivalence class of absolute values containing $|\cdot|_{\mathfrak{p}}$. If $p = \operatorname{char} \mathcal{O}_K/\mathfrak{p}$, then ordp(x) | p for all x ∈ K.

3.3 Lemma of Nakayama

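Proposition 3.14. Let $R$ be a Dedekind domain, and let $M$ be a finitely generated $R$-module. Let $\mathfrak{a} \subset R$ be an ideal, and let $\varphi \in \mathrm{End}_R(M)$¹ be such that $\varphi(M) \subseteq \mathfrak{a} \cdot M$². Then $\varphi$ satisfies an equation

$$\varphi^n + a_1 \varphi^{n-1} + \cdots + a_{n-1}\varphi + a_n = 0, \qquad a_i \in \mathfrak{a}.$$

¹ Note that $\mathrm{End}_R(M)$ is an $R$-module containing $R$ by the canonical morphism $r \mapsto r \cdot \mathrm{id}$.
² The set $\{\varphi \in \mathrm{End}_R(M) \mid \varphi(M) \subseteq \mathfrak{a} \cdot M\}$ is a right ideal in $\mathrm{End}_R(M)$.

Proof. Let $m_1, \dots, m_k$ be generators for $M$ over $R$. Then the hypothesis $\varphi(M) \subseteq \mathfrak{a} \cdot M$ implies

$$\varphi(m_i) = \sum_{j=1}^{k} a_{ij} m_j, \qquad a_{ij} \in \mathfrak{a}.$$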


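This can be rewritten as

$$\sum_{j=1}^{k} (\delta_{ij}\varphi - a_{ij}) m_j = 0, \qquad i = 1, \dots, k.$$

Letting $\Psi$ be the matrix $(\delta_{ij}\varphi - a_{ij})$ with entries in the commutative ring $R[\varphi]$, the system has the form $\Psi \cdot m = 0$. Multiplying both sides by the adjugate of $\Psi$ gives $d \cdot m = 0$ for $d := \det(\delta_{ij}\varphi - a_{ij})$. Hence $d\, m_i = 0$ for all $i$. Since the $m_i$ generate $M$, it must be that $d = 0$ as an endomorphism of $M$. Expanding the determinant gives a monic polynomial in $\varphi$ of degree $n = k$ whose lower coefficients lie in $\mathfrak{a}$, so the relation $\det(\delta_{ij}\varphi - a_{ij}) = 0$ is the desired equation.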

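Corollary 3.15. Let $R$ again be a Dedekind domain with proper ideal $\mathfrak{a} \subset R$, and let $M$ be a finitely generated $R$-module. If $\mathfrak{a}M = M$ then there exists $x \in R$ such that $x \equiv 1 \ (\mathfrak{a})$ and $xM = 0$.

Proof. Letting $\varphi := \mathrm{id}$ and applying the previous proposition gives $\mathrm{id}^n + a_1 \mathrm{id}^{n-1} + \cdots + a_n = 0$ for some $a_i \in \mathfrak{a}$. Let $x = 1 + a_1 + \cdots + a_n$. Then $x \equiv 1 \ (\mathfrak{a})$ and $xM = (\mathrm{id}^n + a_1 \mathrm{id}^{n-1} + \cdots + a_n)M = 0$.

Lemma 3.16 (Nakayama). With $R$ and $M$ as above, if $\mathfrak{a} \subset R$ is an ideal contained in each maximal ideal, and $\mathfrak{a}M = M$, then $M = 0$.

Proof. Let $x$ be as in the corollary. Then $x \equiv 1 \ (\mathfrak{a})$. Hence $x \equiv 1 \ (\mathfrak{m})$ for all maximal ideals $\mathfrak{m}$, i.e., $x - 1 \in \mathfrak{m}$. If $x$ were in $\mathfrak{m}$, then $1$ would also be in $\mathfrak{m}$, so $x$ is not in any maximal ideal. Hence $x$ is a unit, so that $x^{-1} \in R$. Thus $xM = 0$ implies that $M = x^{-1}xM = 0$.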

Returning to the condition (3.1), we claim that

$$I = \mathfrak{p} \cdot I. \tag{3.2}$$

The inclusion $\mathfrak{p} \cdot I \subseteq I$ follows straight from the definition. In the other direction, the difficulty is that an element $x \in I = \bigcap_n \mathfrak{p}^n$ might not factor uniformly into a product $x = yz$ for $y \in \mathfrak{p}$ and $z \in \mathfrak{p}^n$ for all $n$. (It does of course factor separately for each $n$, but this is not sufficient.) Since $\mathcal{O}_K$ is noetherian, this factorization is a consequence of the Artin-Rees Lemma.

Lemma 3.17 (Artin-Rees). Let $I \subset R$ be an ideal in a Noetherian ring $R$. Let $M$ be a finitely generated $R$-module and $N \le M$ a submodule. There exists an integer $k \ge 1$ such that

$$I^n M \cap N = I^{n-k}\big((I^k M) \cap N\big)$$

for all $n \ge k$.


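Applying this to the case where the ideal is $\mathfrak{p}$, $M := \mathcal{O}_K$ and $N := I = \bigcap_n \mathfrak{p}^n$ gives

$$\mathfrak{p}^n \cap I = \mathfrak{p}^{n-k}(\mathfrak{p}^k \cap I), \qquad n \ge k.$$

Since $I \subseteq \mathfrak{p}^n$ for every $n$, both intersections equal $I$, so this can be rewritten as $I = \mathfrak{p}^{n-k} I$. Letting $n = k + 1$ results in the desired equality.

Were the equality (3.2) to hold for a non-zero $I$, it would seem strange, because the equality could be interpreted as saying that elements of $I$ are infinitely divisible. In other words, one might take $x \in I$ and factor out an element of $\mathfrak{p}$, e.g., $x = p_1 \cdot x_1$. Iteratively applying the equality, an arbitrary number of elements of $\mathfrak{p}$ may be factored out, $x = p_1 \cdots p_n x_n$.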

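More precisely, localizing the exact sequence

$$0 \to \mathfrak{p}I \to I \to I/\mathfrak{p}I \to 0$$

at $\mathfrak{p}$ produces

$$0 \to (\mathfrak{p}I)_{\mathfrak{p}} \xrightarrow{\ \sim\ } I_{\mathfrak{p}} \to (I/\mathfrak{p}I)_{\mathfrak{p}} = 0 \to 0.$$

By writing out a generic element, one sees that $(\mathfrak{p}I)_{\mathfrak{p}} \cong \mathfrak{a} I_{\mathfrak{p}}$, where $\mathfrak{a} = \mathfrak{p}\mathcal{O}_{K,\mathfrak{p}}$. Localization is exact, so this gives

$$\mathfrak{a} I_{\mathfrak{p}} \cong I_{\mathfrak{p}}.$$

Since $\mathfrak{a}$ is the unique maximal ideal in the localization, we may apply Nakayama's lemma to infer that $I_{\mathfrak{p}} = 0$. Since $\mathcal{O}_K$ has no zero-divisors, this implies that the intersection $I = 0$, which was to be shown.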

We have constructed in a very natural way a set of absolute values. It is not yet clear whether these are all. This will be a central point of the next lecture. In the case of $K = \mathbb{Q}$, it will turn out that, in principle, we have already found them all. The result must then be extended to arbitrary number fields $K$. In this direction we will prove the product formula and then proceed to heights.

3.4 Ostrowski’s Theorem

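To summarize the results so far, if $K \supseteq \mathbb{Q}$ is a number field and $\mathcal{O}_K \supset \mathbb{Z}$ is the integral closure of $\mathbb{Z}$ in $K$, then the absolute values fall into two categories:

Archimedean absolute values. These arise as the pull-backs of $(\mathbb{C}, |\cdot|)$ by the embeddings of $K$ in $\mathbb{C}$, $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C})$.

Non-archimedean absolute values. The ring $\mathcal{O}_K$ is a Dedekind domain. Every non-zero prime ideal in $\mathcal{O}_K$ is maximal, and there is a descending chain $\mathcal{O}_K \supset \mathfrak{p} \supset \mathfrak{p}^2 \supset \cdots \supset (0)$ such that $\bigcap_{n=0}^{\infty} \mathfrak{p}^n = (0)$ (which implies that the inclusions are strict). The associated absolute value is $|x|_{\mathfrak{p}} := \lambda^{\mathrm{ord}_{\mathfrak{p}} x}$ for $0 < \lambda < 1$.

Each non-archimedean absolute value defines a valuation ring $\mathcal{O}_{\mathfrak{p}} \supset \mathcal{O}_K$ by $\mathcal{O}_{\mathfrak{p}} := \mathcal{O}_{K,\mathfrak{p}}$, the localization at the set $S = \mathcal{O}_K \setminus \mathfrak{p}$. These local rings intersect to give the original ("global") ring, i.e., $\bigcap_{\mathfrak{p}} \mathcal{O}_{\mathfrak{p}} = \mathcal{O}_K$.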

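Remark 3.18. A non-archimedean absolute value can be normalized. First a prime number $p$ can be associated to $\mathfrak{p}$ by intersecting $\mathfrak{p} \cap \mathbb{Z} = (p)$ and choosing the positive generator. Let $q$ be chosen so that $q^{-\mathrm{ord}_{\mathfrak{p}} p} = p$, i.e., so that $|p|_{\mathfrak{p}} = |p|_p$, the $p$-adic absolute value of $p$. The correct choice is $q = p^{-1/e_{\mathfrak{p}}}$, where $e_{\mathfrak{p}} = \mathrm{ord}_{\mathfrak{p}}\, p$. This is the normalization.

Remark 3.19. This works also for $\mathbb{Q}$ replaced by $\mathbb{Q}_p$. There are inclusions $K \supseteq \mathbb{Q}_p \supseteq \mathbb{Z}_p$. The ring of integers $\mathcal{O}_{\mathfrak{p}} \subset K$ is the integral closure of $\mathbb{Z}_p$ in $K$.

Applying this theory to $\mathbb{Q}$ gives the absolute values

$$|x|_\infty = \max(x, -x) \quad \text{(archimedean)}, \qquad |x|_p = p^{-\mathrm{ord}_p x} \quad \text{(non-archimedean, normalized by } \lambda = \tfrac{1}{p}\text{)}.$$

These are all absolute values: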

Theorem 3.20 (Ostrowski³). Up to equivalence these are all the non-trivial absolute values on $\mathbb{Q}$.

³ Ostrowski was professor at Basel.

Proof. We fix a non-trivial absolute value on $\mathbb{Q}$. Then by Lemma 3.2, $|1| = |-1| = 1$. Let $k \in \mathbb{N}$ be a natural number. Then

$$|k| = |\underbrace{1 + \cdots + 1}_{k}| \le k|1| = k \;\Rightarrow\; |-k| \le k.$$

For integers $m, n > 1$, define

$$N := \max(1, |n|).$$

We can expand $m$ in terms of powers of $n$, i.e., $m = m_0 + m_1 n + \cdots + m_d n^d$ where $0 \le m_i < n$ and $m_d \neq 0$. The necessary power of $n$ may be detected by the inequality $m \ge n^d$, which rewritten becomes $d \le \log m / \log n$. Applying absolute values to the expansion of $m$ gives

$$\begin{aligned}
|m| &\le |m_0| + |m_1||n| + \cdots + |m_d||n|^d \\
&\le m_0 + m_1|n| + \cdots + m_d|n|^d \\
&< n(1 + |n| + \cdots + |n|^d) \\
&\le n(1 + N + \cdots + N^d) \le n(d + 1)N^d.
\end{aligned}$$

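Choose an integer $s > 0$, and replace $m$ by $m^s$. Then $d$ is replaced by the number of base-$n$ digits of $m^s$ minus one, which is at most $s \log m / \log n$. Writing $\delta := \log m / \log n$, this gives

$$|m|^s = |m^s| \le n(\delta s + 1)N^{\delta s}.$$

Taking the $s$th root gives

$$|m| \le n^{1/s} (\delta s + 1)^{1/s} N^{\delta}.$$

The limit as $s$ approaches infinity is $|m| \le N^{\delta} = N^{\log m / \log n}$, which rewritten gives

$$|m|^{\frac{1}{\log m}} \le N^{\frac{1}{\log n}} = \max(1, |n|)^{\frac{1}{\log n}}.$$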

The maximum divides into two cases according to whether the absolute value is archimedean. In the first case, suppose that $|n| > 1$ for all integers $n > 1$, so that $\max(1, |n|) = |n|$. Interchanging the roles of $m$ and $n$ in the above derivation gives the above inequality with $m$ and $n$ exchanged, i.e., $|n|^{1/\log n} \le \max(1, |m|)^{1/\log m} = |m|^{1/\log m}$. Combining these two inequalities produces an equality,

$$|m|^{\frac{1}{\log m}} = |n|^{\frac{1}{\log n}}.$$

Since the equality holds for all $m, n > 1$, the common value does not depend on $n$. There is thus a $\lambda > 0$ such that $|n|^{\frac{1}{\log n}} = e^{\lambda}$. Hence,

$$|n| = e^{\lambda \log n} = n^{\lambda}, \qquad n \in \mathbb{Z}_{>1}.$$

In fact, for all rational numbers $r = p/q$, we have

$$|r| = |r|_\infty^{\lambda}$$

for the standard absolute value $|\cdot|_\infty$. This holds for integers, because $|-n| = |(-1)(n)| = |-1| \cdot |n| = n^{\lambda} = |n|_\infty^{\lambda}$, and $|-1| = |1| = 1 = 1^{\lambda}$ and $|0| = 0 = 0^{\lambda}$. For a general rational number, $qr = p$ implies $|q||r| = |p|$. Hence

$$|r| = |p|/|q| = |p|_\infty^{\lambda}/|q|_\infty^{\lambda} = |p/q|_\infty^{\lambda} = |r|_\infty^{\lambda}.$$

By Theorem 3.8, this means that $|\cdot| \sim |\cdot|_\infty$.


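On the other hand, suppose that $\max(1, |n|) = 1$ for some integer $n > 1$, i.e., $|n| \le 1$. Then the inequality $|m|^{1/\log m} \le \max(1, |n|)^{1/\log n} = 1$ shows that $|m| \le 1$ for all integers $m$. Proposition 3.3 then implies that this is a non-archimedean absolute value bounded by 1 on $\mathbb{Z}$. We define

$$\mathcal{O} := \{x \in \mathbb{Q} \mid |x| \le 1\}, \qquad \mathfrak{p} := \{x \in \mathcal{O} \mid |x| < 1\}.$$

The ring $\mathcal{O}$ is a discrete valuation ring, and the ideal $\mathfrak{p} = (\pi)$ intersects $\mathbb{Z}$ in either $\mathfrak{p} \cap \mathbb{Z} = (p)$ or $(0)$. If $|\cdot|$ is non-trivial, then there exists an $x \neq 0$ with $|x| \neq 1$. Writing $x = a/b$ with integers $a, b$, we have $|a| \neq |b|$, and since both are at most 1, one of the integers $a, b$ has absolute value less than one and hence lies in $\mathfrak{p}$. So it is impossible that $\mathfrak{p} \cap \mathbb{Z}$ should be $(0)$.

Let therefore $p$ be the positive generator of $\mathfrak{p} \cap \mathbb{Z}$, and let $|\cdot|_p$ be the absolute value corresponding to $p$, scaled so that $|p|_p = |p|$. Let $n = p^k m$ be an integer factored so that $p \nmid m$, i.e., $m \notin \mathfrak{p} \cap \mathbb{Z}$. Hence $m \in \mathcal{O} \setminus \mathfrak{p}$, so $|m| = 1 = |m|_p$. Putting this together gives

$$|n| = |p^k m| = |p|^k |m| = |p|_p^k |m|_p = |n|_p.$$

The equality may be extended to rational numbers, proving that $|\cdot| \sim |\cdot|_p$.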

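3.5 Product Formula for Q

Theorem 3.21 (Product Formula). Let $M_{\mathbb{Q}}$ be the set of normalized absolute values on $\mathbb{Q}$. For $x \in \mathbb{Q}^\times$, we have

$$\prod_{\nu \in M_{\mathbb{Q}}} |x|_\nu = 1.$$

Proof. Let $P(x) := \prod_\nu |x|_\nu$. We show that $P(x) = 1$. Since $P : \mathbb{Q}^\times \to \mathbb{R}$ is multiplicative and $P(-1) = 1$, it suffices to show the formula for prime $x$, and then apply prime decomposition to an arbitrary integer. Let $x = p$ be prime, and let $q$ be any finite prime. Then

$$\mathrm{ord}_q\, p = \begin{cases} 1 & q = p \\ 0 & q \neq p. \end{cases}$$

Therefore

$$\prod_\nu |p|_\nu = |p|_\infty |p|_p \cdot \prod_{q \neq p} |p|_q = p \cdot p^{-1} \cdot \prod_{q \neq p} 1 = 1.$$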

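Remark 3.22. Let $\mathbb{Q}_\nu$ be the completion of $\mathbb{Q}$ at $\nu \in M_{\mathbb{Q}}$, i.e., $\mathbb{Q}_\nu = \mathbb{Q}_p$ if $\nu = p$ is a prime and $\mathbb{Q}_\nu = \mathbb{R}$ if $\nu = \infty$. Let

$$\mathbb{A}_{\mathbb{Q}} \subseteq \prod_\nu \mathbb{Q}_\nu$$

be the subset of $(x_\nu)_{\nu \in M_{\mathbb{Q}}}$ with $|x_\nu|_\nu \le 1$ for all but finitely many $\nu$, i.e., where $x_\nu$ is a $p$-adic integer for all but finitely many $p$. Then $\mathbb{A}_{\mathbb{Q}}$ is a ring called the ring of adeles, with an embedding of $\mathbb{Q}$ by

$$\iota : \mathbb{Q} \to \mathbb{A}_{\mathbb{Q}}, \qquad x \mapsto (x)_\nu.$$

By the product formula, the image of $\mathbb{Q}^\times$ lies in the hypersurface cut out by the equation $\prod_\nu |x_\nu|_\nu = 1$. This theory was initially developed by A. Weil.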

3.6 Product formula for number fields

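We now examine the general case of number fields. Let $(K, |\cdot|_\nu)$ be a valued field, and $(K_\nu, |\cdot|_\nu)$ its completion. Let $L \supseteq K$ be a finite, separable extension. Hence $L = K[x]$ for some $x \in L$, and

$$\pi : K[T] \to L, \qquad T \mapsto x$$

is a surjective ring homomorphism with kernel $(f)$ for an irreducible polynomial $f$. Its quotient is $L \cong K[T]/(f)$, and changing coefficients to the completion $K_\nu$ produces $L \otimes_K K_\nu \cong K_\nu[T]/(f)$. Factoring the polynomial $f = f_1 \cdots f_n$ into irreducible factors over the completion $K_\nu$ (they are pairwise distinct because $f$ is separable) leads to a finite set of pairwise orthogonal idempotents. Namely, by the Chinese Remainder Theorem,

$$K_\nu[T]/(f) = K_\nu[T]/\big((f_1) \cap \cdots \cap (f_n)\big) \cong \prod_{i=1}^{n} K_\nu[T]/(f_i) =: \prod_{i=1}^{n} L_i.$$

Hence $E_\nu := L \otimes_K K_\nu \cong \prod_{i=1}^{n} L_i$ for fields $L_i \supseteq K_\nu$. The projections $\varepsilon_i \in \mathrm{End}_{K_\nu}(E_\nu)$ onto the factors, $\varepsilon_i : E_\nu \to L_i$, are idempotents. Writing $w$ for the index $i$, they decompose the identity as $1 = 1 \otimes 1 = \sum_w \varepsilon_w$ and multiply according to the rule

$$\varepsilon_w \cdot \varepsilon_{w'} = \delta_{w,w'}\varepsilon_w,$$

for the Kronecker delta $\delta_{w,w'}$. These two properties can be seen using the diagram

$$E_\nu \cong \prod_{i=1}^{n} L_i \xrightarrow{\ \mathrm{pr}_i\ } L_i \hookrightarrow \prod_{j=1}^{n} L_j \cong E_\nu,$$

whose composite is $\varepsilon_i$.

There are only finitely many prime ideals $\mathfrak{p}_w \subset E_\nu$, one for each $\varepsilon_w$,

$$\mathfrak{p}_w := \bigoplus_{w' \neq w} E_\nu \varepsilon_{w'} = \mathrm{Ann}(E_\nu \varepsilon_w) = \Big\{x = \sum_{w'} x_{w'}\varepsilon_{w'} \in E_\nu \;\Big|\; x\varepsilon_w = 0\Big\}.$$

They are all maximal. Letting $x = \sum_{w'} x_{w'}\varepsilon_{w'}$, the condition that $0 = x\varepsilon_w$ becomes $0 = \sum_{w'} x_{w'}\varepsilon_{w'}\varepsilon_w = x_w\varepsilon_w$, which happens exactly when $x_w = 0$. Hence the morphism $\pi_w$ is a quotient by a maximal ideal, and $E_\nu\varepsilon_w \cong (E_\nu/\mathfrak{p}_w) \cdot \varepsilon_w = L_w$ is a field.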

Constructing the absolute values

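All absolute values of $L$ will be found as absolute values of $L_w$ in the following manner. The above reasoning may be summarized by the diagram,

[Commutative diagram: the inclusions $K \hookrightarrow K_\nu$ and $\lambda : K \hookrightarrow L$; the inclusions $K_\nu \hookrightarrow E_\nu$ and $L \hookrightarrow E_\nu$, where $E_\nu := L \otimes_K K_\nu = \bigoplus_w E_\nu\varepsilon_w$; the projection $\pi_w : E_\nu \to L_w := E_\nu\varepsilon_w$; and the composites $\iota_w : K_\nu \to L_w$ and $\kappa_w : L \to L_w$.]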

By the density of the embedding of $K$ in its completion, the extension of $|\cdot|_\nu$ to $K_\nu$ is unique. We have already constructed it above by taking limits. The embedding $L \to L \otimes_K K_\nu$ embeds $L$ as a dense subset, and thus any absolute value on $L$ extends uniquely to $E_\nu$. For each norm on $L_w$ which pulls back to $|\cdot|_\nu$ on $K_\nu$, we can pull it back to $L$ to get a norm extending the given norm $|\cdot|_\nu$ on $K$.

Each idempotent defines an extension $\|\cdot\|_w$ of the absolute value $\|\cdot\|_\nu$ to $E_\nu$:

$$\|x\|_\nu = \sum_{i=1}^{n} |\mathrm{pr}_i x|_\nu = \sum_{i=1}^{n} |x_i|_\nu, \qquad x = (x_1, \dots, x_n) \in K_\nu^n.$$

This norm makes $E_\nu$ into a complete $K_\nu$-vector space. It is an extension: $\iota_w^* \|\cdot\|_w = |\cdot|_\nu$. In turn, it defines an absolute value on $L$ by

$$|x|_w := \|\kappa_w x\|_w.$$

This absolute value extends the original absolute value on $K$, i.e., $|\lambda x|_w = |x|_\nu$ for $x \in K$. We have shown,

Lemma 3.23. Let $K$ be a number field, and let $L \supseteq K$ be a finite, separable extension. In the above notation, every idempotent $\varepsilon_w$ in $E_\nu$ defines an absolute value $|\cdot|_w$ on $L$ which extends $|\cdot|_\nu$.


Letting $n_{w/\nu} := [L_w : K_\nu]$ and using the fact that tensoring with a field is flat, we arrive at an important formula,

$$[L : K] = \dim_{K_\nu} E_\nu = \sum_{w \mid \nu} [L_w : K_\nu] = \sum_{w \mid \nu} n_{w/\nu}.$$

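Lemma 3.24. $\|x\|_w^{n_{w/\nu}} = |N_{L_w/K_\nu}(x)|_\nu$.

Proof. Let $L_w \supseteq K_\nu$. Then every $\sigma \in \mathrm{Gal}(L_w/K_\nu)$ preserves the norm, i.e., $\sigma^* \|\cdot\|_w = \|\cdot\|_w$, because the norm on $L_w$ which extends the norm on $K_\nu$ is unique. Then

$$\|x\|_w^{n_{w/\nu}} = \prod_\sigma \|x\|_w = \prod_\sigma \|x^\sigma\|_w = \Big\|\prod_\sigma x^\sigma\Big\|_w = \|N_{L_w/K_\nu}(x)\|_w = |N_{L_w/K_\nu}(x)|_\nu.$$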

The decomposition of $E_\nu$ gives a morphism

$$x \mapsto (T_x : y \mapsto xy) \in \mathrm{End}_{K_\nu} E_\nu.$$

The idempotents induce a decomposition $T_x = \oplus_w T_{x\varepsilon_w}$ with $T_{x\varepsilon_w} \in \mathrm{End}_{K_\nu}(L_w)$. Then,

$$N_{E_\nu/K_\nu}(x) = \det T_x = \prod_w \det T_{x\varepsilon_w} = \prod_w N_{L_w/K_\nu}(x\varepsilon_w).$$

This leads to the theorem,


Lecture 4

Heights and Siegel’s Lemma

4.1 Absolute values and heights

The product formula plays an important role in calculating with heights. Let $K \supseteq \mathbb{Q}$ be a number field. The places $\nu$ of $K$ have the following form.

• $\nu \mid \infty$: There are two cases. For every real embedding $\sigma : K \to \mathbb{R}$, there is a real place with absolute value $|x|_\nu := |\sigma x|$. Complex places have the form $|x|_\nu = |\sigma x|$ and correspond to pairs $(\sigma, \bar\sigma)$ of an embedding $\sigma : K \to \mathbb{C}$ and its conjugate.

• $\nu \mid p$: We classified the finite places in Lemma 3.23. Each has an absolute value of the form $|x|_\nu = p^{-(f_\nu/n_\nu)\,\mathrm{ord}_\nu x}$. Here $n_\nu := [K_\nu : \mathbb{Q}_p]$ is the degree of the completion as an extension of $\mathbb{Q}_p$. Letting $\mathfrak{p} \subseteq \mathcal{O}_\nu$ be the valuation ideal of $\nu$, we define $f_\nu := \dim_{\mathbb{F}_p} \mathcal{O}_\nu/\mathfrak{p} = [k_\nu : \mathbb{F}_p]$.

What is $|p|_\nu$? To calculate it, let $e_\nu = \mathrm{ord}_\nu\, p$. Then $n_\nu = e_\nu \cdot f_\nu$, so

$$|p|_\nu = p^{-\frac{f_\nu}{n_\nu}\mathrm{ord}_\nu p} = p^{-\frac{f_\nu e_\nu}{n_\nu}} = p^{-1}.$$

The product formula for $K$ (see below) motivates the definition of the normalization of an absolute value.

Definition 4.1 (Normalization of an Absolute Value). The normalization of an absolute value of a number field $K \supseteq \mathbb{Q}$ is

$$\|x\|_\nu := |x|_\nu^{n_\nu},$$

where $n_\nu = [K_\nu : \mathbb{Q}_p]$ if $\nu$ is a finite place, and

$$n_\nu = \begin{cases} 1 & \sigma \text{ real} \\ 2 & \sigma \text{ complex} \end{cases}$$

when $\nu$ is an infinite place.


With this definition in hand, the product formula for $K$ takes the form

$$\prod_\nu \|x\|_\nu = \prod_\nu |x|_\nu^{n_\nu} = 1, \qquad x \in K,\ x \neq 0.$$

Definition 4.2. Let $K/\mathbb{Q}$ be a finite extension. The (logarithmic) height of an element $x \in K$ is

$$h_K(x) := \sum_\nu \log^+ \|x\|_\nu,$$

where $\log^+ r := \max\{0, \log r\}$.

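The height enjoys a number of properties inherited from the norm. It is non-negative, $h_K(x) \ge 0$, it satisfies $h_K(xy) \le h_K(x) + h_K(y)$, and it satisfies a triangle inequality up to a constant, $h_K(x + y) \le h_K(x) + h_K(y) + [K : \mathbb{Q}] \log 2$. Invariance under multiplication by $\lambda \in K^\times$ holds for the heights of points introduced in Section 4.2, not for $h_K$ on elements (for instance $h_{\mathbb{Q}}(1) = 0$ while $h_{\mathbb{Q}}(2) = \log 2$).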

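Example 4.3. Let $K = \mathbb{Q}$, and let $x \in \mathbb{Q}^\times$. We may factor its numerator and denominator:

$$x = \frac{a}{b} = \pm\prod_{p \text{ prime}} p^{\varepsilon(p)} = \pm\prod_{\varepsilon(p) \ge 0} p^{\varepsilon(p)} \prod_{\varepsilon(p) < 0} p^{\varepsilon(p)}.$$

The height can be calculated as follows.

$$\log^+ |x|_\infty = \begin{cases} \sum_p \varepsilon(p) \log p & |x| \ge 1 \\ 0 & |x| < 1 \end{cases}$$

$$\log^+ |x|_p = \begin{cases} -\varepsilon(p) \log p & \varepsilon(p) \le 0 \\ 0 & \varepsilon(p) > 0 \end{cases}$$

If $x = \frac{3}{5}$, then

$$\log^+ |x|_\infty = 0, \qquad \log^+ |x|_3 = 0, \qquad \log^+ |x|_5 = -(-1) \cdot \log 5 = \log 5.$$

Hence $h_{\mathbb{Q}}(\tfrac{3}{5}) = \log 5 = \log(\max(3, 5))$.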
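The computation in the example can be repeated for any rational number. The Python sketch below sums $\log^+ |x|_\nu$ place by place, following Definition 4.2 for $K = \mathbb{Q}$, and compares the result with $\log \max(|a|, |b|)$ for $x = a/b$ in lowest terms. The function name `height_Q` is hypothetical.

```python
from fractions import Fraction
import math

def height_Q(x: Fraction) -> float:
    """h_Q(x) = sum over all places v of log+ |x|_v, computed place by place."""
    if x == 0:
        return 0.0
    # archimedean place
    total = max(0.0, math.log(float(abs(x))))
    # finite places: only primes dividing the denominator have |x|_p > 1
    d, p = x.denominator, 2
    while d > 1:
        e = 0
        while d % p == 0:
            d //= p
            e += 1
        total += e * math.log(p)   # log+ |x|_p = e * log p when p^e || denominator
        p += 1
    return total

for x in (Fraction(3, 5), Fraction(7, 2), Fraction(-12, 35)):
    expected = math.log(max(abs(x.numerator), x.denominator))
    assert math.isclose(height_Q(x), expected)
```

`Fraction` reduces to lowest terms automatically, which is what makes the comparison with $\log \max(|a|, |b|)$ valid.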

Exercise 4.4. What is $h(\sqrt{2})$?

4.2 Heights in affine and projective spaces

Let $\mathbb{A}^n(K) = \{(x_1, \dots, x_n) \mid x_i \in K\}$ be affine $n$-space.


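Definition 4.5. The norm¹ of a point $x \in \mathbb{A}^n(K)$ is defined by

$$\|x\|_\nu := \begin{cases} \max_i \|x_i\|_\nu & \nu \text{ non-archimedean} \\ \big(\sum_{i=1}^{n} \tau(x_i)^2\big)^{1/2} & \nu \text{ real} \\ \sum_{i=1}^{n} \tau(x_i)\overline{\tau(x_i)} & \nu \text{ complex,} \end{cases}$$

where $\tau$ denotes the embedding corresponding to the infinite place $\nu$.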

Definition 4.6 (Heights in Affine Space). The (logarithmic) height of a point $x \in \mathbb{A}^n(K)$ is

$$h_K(x) := \sum_\nu \log \|x\|_\nu.$$

The lack of $\log^+$ means that the height of $x \in K$ (Definition 4.2) might not agree with the height of $x$ regarded as a point of $\mathbb{A}^1(K)$.

These heights are compatible. For example, given the embedding

$$i_1 : \mathbb{A}^1 \to \mathbb{A}^2, \qquad x \mapsto (1, x),$$

the height on $\mathbb{A}^1$ agrees with the restriction of the height on $\mathbb{A}^2$, i.e., $h_K(i_1(x)) = h_K(1, x) = \sum_\nu \log \|x\|_\nu = h_K(x)$.

Recall that projective space is defined as the quotient of the non-zero points of affine space by the multiplicative group of the field:

$$\mathbb{P}^n(K) = \big(\mathbb{A}^{n+1}(K) \setminus \{0\}\big)/K^\times.$$

We can extend the definition of heights to projective space by virtue of a lemma:

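Lemma 4.7. Let $h = h_K$ be the height on $\mathbb{A}^n(K)$ and let $x \in \mathbb{A}^n(K)$ be non-zero. Then for all $\lambda \in K^\times$, $h(\lambda x) = h(x)$.

Proof. By the product formula for $\lambda$,

$$h(\lambda x) = \sum_\nu \log \|\lambda x\|_\nu = \sum_\nu \log \big(\|\lambda\|_\nu \|x\|_\nu\big) = \underbrace{\sum_\nu \log \|\lambda\|_\nu}_{=0} + \sum_\nu \log \|x\|_\nu = h(x).$$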

Definition 4.8 (Heights in Projective Space). The (logarithmic) height of a point $(x_0 : \cdots : x_n) \in \mathbb{P}^n(K)$ can be defined using the definition for affine space, because heights are invariant under the $K^\times$-action:

$$h_K((x_0 : \cdots : x_n)) := h_K(x_0, \dots, x_n).$$

¹ Note that this does not agree with many standard definitions, where $\max_i \|x_i\|_\nu$ is also used for the archimedean case. But our definition agrees with the matrix definition to follow.


4.3 Height of matrices

We will denote by $M_{m,n}(K)$ the $K$-vector space of $m \times n$ matrices.

There are two ways of assigning height to $X \in M_{m,n}(K)$. The naive definition takes the height of the corresponding point in affine space under the isomorphism $M_{m,n}(K) \cong \mathbb{A}^{mn}(K)$ defined by

$$X \mapsto (\dots, x_{ij}(X), \dots), \qquad x_{ij}(X) = (i, j)\text{th entry}.$$

The better way is due to Bombieri-Vaaler.

Definition 4.9 (Height of a Matrix). The height of a matrix $X \in M_{m,n}(K)$ is defined as follows. Assume $m \le n$, and let $I \subseteq \{1, \dots, n\}$ be a choice of $m$ columns (i.e., $|I| = m$). Let $X_I$ denote the $m \times m$ submatrix of $X$ whose column indices appear in $I$. Let

$$H_\nu(X) := \begin{cases} \max_{|I| = m} \|\det X_I\|_\nu & \nu \text{ non-archimedean} \\ \|\det X X^T\|_\nu^{1/2} & \nu \text{ real} \\ \|\det X \bar X^T\|_\nu & \nu \text{ complex.} \end{cases}$$

Then the height of $X$ is defined to be

$$h(X) = \sum_\nu \log H_\nu(X).$$

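Remark 4.10. A more elegant way to introduce these heights applies the inclusion

$$i : M_{m,n}(K) \to \mathbb{A}^N, \qquad X \mapsto (\dots, \det X_I, \dots)$$

for $N = \binom{n}{m}$ and $m \le n$. Then the case $m = 1$ has $M_{1 \times n} = \mathbb{A}^n$, and we recover heights for affine space.

$$H_\nu(X) = \begin{cases} \max_I \|\det X_I\|_\nu & \nu \nmid \infty \\ \|\det(X \bar X^T)\|_\nu^{1/2} & \nu \mid \infty \end{cases}$$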

The properties of the heights of matrices include the following.

• $h(X) \ge 0$.

• There are only finitely many $X \in M_{m,n}(K)/K^\times$ such that $h(X) \le C$ for given $C$.

• Let $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ for $X_i \in M_{m_i,n}(K)$, and let $m = m_1 + m_2$. Then

$$h(X) \le h(X_1) + h(X_2).$$

• Let $P$ be an element of $K[x_1, \dots, x_n]_d \subseteq K[x_1, \dots, x_n]$, the polynomials of degree at most $d$. There is an isomorphism

4.4 Absolute height

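Since $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2})$, we now have two definitions of height for an $x \in \mathbb{Q}$: the height on $\mathbb{Q}$ and the height on $\mathbb{Q}(\sqrt{2})$. These do not agree. The height in $\mathbb{Q}(\sqrt{2})$ will be $[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}]$ times the height in $\mathbb{Q}$. This property holds in general and leads to a definition of height preserved by field extension.

Exercise 4.11. Let $p \in \mathbb{Z}$ be prime. Calculate $h_{\mathbb{Q}}(p)$ and $h_{\mathbb{Q}(\sqrt{2})}(p)$.

Definition 4.12. The absolute height of a non-zero element $x \in \overline{\mathbb{Q}} \setminus \{0\}$ is given by choosing a number field $K \subset \overline{\mathbb{Q}}$ containing $x$ and defining

$$h(x) := \frac{1}{[K : \mathbb{Q}]} h_K(x).$$

This does not depend on the choice of $K \ni x$.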

4.5 Liouville estimates

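Let $x = \frac{u}{v} \in \mathbb{Q}$ be non-zero with coprime $u, v \in \mathbb{Z}$. Then

$$|x| \ge \frac{1}{|v|} \ge \frac{1}{\max(|u|, |v|)} = e^{-h(x)}. \tag{4.1}$$

This can be generalized as follows.

Proposition 4.13. Let $L = \ell_1 T_1 + \cdots + \ell_n T_n$ for $\ell_i \in \mathbb{Z}$. Let $K \supseteq \mathbb{Q}$ be a number field, $w$ an infinite place of $K$, and $\xi \in K^n$. If $L(\xi) \neq 0$, then

$$\log \|L(\xi)\|_w \ge -(h(\xi) + h(L)) + \log \|L\|_w + \log \|\xi\|_w,$$

where $|L| = \big(\sum_{i=1}^{n} \ell_i^2\big)^{1/2}$.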


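Proof. By the product formula,

$$1 = \prod_\nu \|L(\xi)\|_\nu = \|L(\xi)\|_w \cdot \prod_{\nu \neq w} \|L(\xi)\|_\nu \le \|L(\xi)\|_w \prod_{\nu \neq w} \|L\|_\nu \cdot \|\xi\|_\nu.$$

To justify the inequality, note that the finite places $\nu$ satisfy

$$|L(\xi)|_\nu = |\ell_1\xi_1 + \cdots + \ell_n\xi_n|_\nu \le \max_i |\ell_i\xi_i|_\nu \le \max_i |\ell_i|_\nu \cdot \max_i |\xi_i|_\nu,$$

so $\|L(\xi)\|_\nu \le H_\nu(L) \cdot H_\nu(\xi)$. For the rest, one has $\nu \mid \infty$. For the infinite places $\nu$, the Cauchy-Schwarz inequality gives

$$|L(\xi)|_\nu = |\langle \ell, \xi \rangle|_\nu \le \langle \ell, \ell \rangle^{1/2} \langle \xi, \xi \rangle^{1/2}.$$

Therefore, using $\prod_\nu \|L\|_\nu = e^{h(L)}$ and $\prod_\nu \|\xi\|_\nu = e^{h(\xi)}$,

$$1 \le \|L(\xi)\|_w \prod_\nu \|L\|_\nu \cdot \prod_\nu \|\xi\|_\nu \cdot \|L\|_w^{-1} \cdot \|\xi\|_w^{-1} = \|L(\xi)\|_w \, e^{h(L)} e^{h(\xi)} \cdot \|L\|_w^{-1} \cdot \|\xi\|_w^{-1}.$$

Taking logarithms gives the result.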

This generalizes the equation (4.1) as follows. Let $L = T_2$, and let $\xi = (1, x)$. Then $L(\xi) = x$, $h(L) = 0$ and $h(\xi) = \log \max(|u|, |v|)$, so we conclude by the proposition,

$$\log |x| \ge -\log \max(|u|, |v|) + \log\Big(1 + \Big|\frac{u}{v}\Big|\Big) = -h(x) + \log\Big(1 + \Big|\frac{u}{v}\Big|\Big) \ge -h(x).$$

4.6 Siegel’s Lemma

Let $N > M > 0$ be integers. Following Bombieri-Vaaler, we consider a system of $M$ linear forms in $N$ variables,

$$L_i(T_1, \dots, T_N) = \sum_{j=1}^{N} \ell_{ij} T_j, \qquad 1 \le i \le M.$$

We want a small solution of $L_i(T) = 0$ for $1 \le i \le M$.


Lemma 4.14 (Siegel's Lemma). The system has a non-trivial solution $x = (x_1, \dots, x_N) \in \mathcal{O}_K^N$ such that

$$h(x) \le \Big(\frac{2}{\pi}\Big)^{s} \sqrt{D}^{\,1/4} \Big(\max_i h(L_i)\Big)^{\frac{M}{N-M}},$$

where $d = [K : \mathbb{Q}]$, $D = \mathrm{discr}(K)$, and $s$ is the number of real places of $K$.

Phrasing the system of equations as a matrix equation, previous properties imply a bound on the height of the matrix by the height of the column vectors. Instead of proving this, we prove a light version of Siegel's lemma.

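Lemma 4.15 (Siegel's Lemma: Light Version). Let $N > M > 0$, and consider the system

$$\sum_{j=1}^{N} a_{ij} T_j = 0, \qquad 1 \le i \le M, \quad a_{ij} \in \mathbb{Z},$$

and let $V_i = \max_j |a_{ij}|$, which we assume to be non-zero. Then the system has a non-trivial solution $\xi = (\xi_1, \dots, \xi_N) \in \mathbb{Z}^N$ such that, for all $i$,

$$|\xi_i| < 2 + \Big(\prod_{j=1}^{M} N V_j\Big)^{\frac{1}{N-M}}.$$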

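Proof. We apply Dirichlet's box principle. Let $H > 0$ be an integer, and let $C$ be the cube given by $|x_i| \le H$ for $1 \le i \le N$. It contains $(2H + 1)^N$ integer points. Let $\alpha : \mathbb{R}^N \to \mathbb{R}^M$ be the linear map $\alpha = (\alpha_1, \dots, \alpha_M)$, $\alpha_i = \sum_j a_{ij} T_j$. The image $\alpha(C)$ is contained in the box $C'$ defined by $y \in C'$ if and only if $|y_i| \le N \cdot V_i H$ for all $i$. Then $C'$ contains $\prod_i (2N V_i H + 1)$ integer points. If $H$ is sufficiently large, then $\alpha|_C$ is not injective on integer points. Thus there exist integer points $\xi' \neq \xi''$ in $C$ such that $\alpha(\xi') = \alpha(\xi'')$.

Let $\xi := \xi' - \xi'' \neq 0$. Then $\alpha(\xi) = 0$ and $|\xi_i| \le |\xi'_i| + |\xi''_i| \le 2H$. It remains to see how large $H$ must be. We choose $H$ such that

$$\Big(\prod_j N V_j\Big)^{\frac{1}{N-M}} \le 2H < \Big(\prod_j N V_j\Big)^{\frac{1}{N-M}} + 2.$$

Then

$$(2H + 1)^N = (2H + 1)^{N-M} \cdot (2H + 1)^M > (2H + 1)^M \prod_j N V_j \ge \prod_j (2N V_j H + 1) = |C'|.$$

In other words, the number of integer points in $C$ is greater than the number of integer points in $C'$, which contains the image of $\alpha$, so the box principle applies and $|\xi_i| \le 2H$ gives the stated bound.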
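The box-principle argument is constructive enough to run on small systems. The following Python sketch, with the hypothetical function name `small_solution`, enumerates the integer points of the cube $|x_i| \le H$ for the $H$ chosen in the proof, records their images, and returns the difference of the first two points with the same image. It is a brute-force illustration of the proof, assuming every row has a non-zero entry, and not an efficient algorithm.

```python
from itertools import product
import math

def small_solution(A: list[list[int]]) -> tuple[int, ...]:
    """Non-zero integer solution xi of A xi = 0 found via the box principle.

    A is an M x N integer matrix with N > M > 0 and no zero row.
    """
    M, N = len(A), len(A[0])
    assert N > M > 0
    V = [max(abs(a) for a in row) for row in A]        # V_i = max_j |a_ij|
    assert all(v > 0 for v in V)
    X = math.prod(N * v for v in V) ** (1.0 / (N - M))
    H = math.ceil(X / 2)                               # X <= 2H < X + 2
    seen: dict[tuple[int, ...], tuple[int, ...]] = {}
    for point in product(range(-H, H + 1), repeat=N):
        image = tuple(sum(a * t for a, t in zip(row, point)) for row in A)
        if image in seen:
            # two distinct points with the same image: their difference is a solution
            return tuple(s - t for s, t in zip(point, seen[image]))
        seen[image] = point
    raise AssertionError("the counting argument guarantees a collision")

A = [[1, 2, -3], [2, -1, 1]]
xi = small_solution(A)
assert any(xi) and all(sum(a * t for a, t in zip(row, xi)) == 0 for row in A)
```

For this matrix $\prod_j N V_j = 9 \cdot 6 = 54$ and $N - M = 1$, so the lemma guarantees a solution with all $|\xi_i| < 56$; the search typically stops much earlier than the full cube.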

Siegel’s lemma is essentially the best possible bound.


Lecture 5

Logarithmic Forms

This lecture treats one of the central contributions to number theory of the twentieth century. It goes back to a problem of Gauß.

Let $\alpha_1, \dots, \alpha_n \in \overline{\mathbb{Q}}^\times$ and $b_1, \dots, b_n \in \mathbb{Z}$. We define

$$\Gamma := \langle \alpha_1, \dots, \alpha_n \rangle \subset \overline{\mathbb{Q}}^\times.$$

If $|\alpha - 1| := |\alpha_1^{b_1} \cdots \alpha_n^{b_n} - 1| \neq 0$, then we want to know how close this value can approach zero in terms of $B := \max\{|b_1|, \dots, |b_n|\}$. In other words, we want a lower bound $|\alpha - 1| > C$.

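There is a trivial lower bound. Assume that $\alpha_i \in \mathcal{O}_K$. Then $N(\alpha - 1) \ge 1$. Indeed, $N(\alpha - 1) = \prod_{\nu \mid \infty} |\alpha - 1|_\nu$. For each $\nu \mid \infty$,

$$|\alpha - 1|_\nu \le |\alpha|_\nu + 1 \le \big(\max_i H_\nu(\alpha_i)\big)^B + 1 \le (2H_\nu(\alpha))^B =: C_\nu^B.$$

Hence, writing $C^B := \prod_{\mu \neq \nu} C_\mu^B$,

$$|\alpha - 1|_\nu \cdot C^B \ge |\alpha - 1|_\nu \cdot \prod_{\mu \neq \nu} |\alpha - 1|_\mu \ge 1,$$

so that $|\alpha - 1|_\nu \ge C^{-B}$.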

5.1 Historical remarks

We ask whether this can be improved, for example, to a bound of the form

$$|\alpha - 1| > c^{-B\varepsilon}, \qquad \varepsilon > 0.$$

This would show that diophantine equations of the type

$$f(x, y) = 1,$$

for a homogeneous, irreducible $f \in \mathbb{Z}[X, Y]$ of degree at least 3, have only finitely many integer solutions. The size can be effectively bounded.

The equation $y^2 = x^3 + ax + b$ for $a, b \in \mathbb{Z}$ has only finitely many solutions $(x, y) \in \mathbb{Z}^2$.

Similarly, the hyperelliptic curve $y^k = F(x)$ is a ramified, cyclic¹ $k$-fold covering of $\mathbb{P}^1$. The fundamental group $\pi_1(C)$ may be regarded as the Galois group of the covering. In general, any rational function $r \in \mathbb{Q}(C)$ defines a ramified cover $C \to \mathbb{P}^1$.

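This may be reformulated in terms of logarithms,

$$\Lambda = b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n, \qquad b_i \in \mathbb{Z}.$$

Remark 5.1. If the $b_i$ are replaced by algebraic numbers $\beta_i$, then $\alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$ is expected to be transcendental.

Problem 5.2 (Hilbert's 7th Problem, 1900 ICM Paris). Let $\alpha \neq 0, 1$ and $\beta$ be algebraic numbers with $\beta$ irrational. Is $\alpha^\beta$ transcendental?

Another of Hilbert's problems was the Riemann Hypothesis. Although the Riemann Hypothesis remains unsolved, he considered his seventh problem to be harder.

Theorem 5.3 (Gel'fand, Schneider, 1934). If $\alpha, \beta \in \overline{\mathbb{Q}}$ with $\alpha \neq 0, 1$ and $\beta \notin \mathbb{Q}$, then $\alpha^\beta \notin \overline{\mathbb{Q}}$.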

Fixing $\alpha_1, \dots, \alpha_n$, we want a lower bound for $|\Lambda|$ ($\Lambda \neq 0$) in terms of $B$. In 1966, Baker showed that

$$\log |\Lambda| > -c(\log B)^k$$

whenever $k > 2n + 1$. He improved the result in 1969 (?) to include all cases where $k > n$. Around 1970, Feldman showed that this holds so long as $k > 1$.

We write $f \gg g$ for functions $f$ and $g$ if there exists a constant $C > 0$ such that $f \ge Cg$. In this case, $|\Lambda|$ and $B$ are functions of the $b_i$. In its final form, the theorem says that

$$\log |\Lambda| \gg -\log B.$$

Remark 5.4. Suppose that $\alpha_1, \dots, \alpha_n$ are rational integers $a_1, \dots, a_n$, let $A_n \ge A_i \ge |a_i|$, and let $\Omega = \log A_1 \cdots \log A_n$ and $\Omega' = \Omega/\log A_n$. Then Baker showed that $\log |\Lambda| \gg -\log B \cdot \Omega \log \Omega'$. Wüstholz improved this to $\log |\Lambda| \gg -\log B \cdot \Omega$, which is currently the best form.

Conjecture 5.5.

$$\log |\Lambda| \gg -\log B\,(\log A_1 + \cdots + \log A_n)$$

(This is closely related to the abc-conjecture, but we omit the relation.)

¹ A cyclic covering $C \to \mathbb{P}^1$ is one such that $\pi_1(C)$ is a cyclic group.


5.2 The Theorem

Theorem 5.6. Let $b_1, \dots, b_n \in \mathbb{Z}$ and let $B \ge 2$ be a real number such that $|b_i| \le B$ for all $i$. Let $\alpha_1, \dots, \alpha_n \in K^\times$ for a number field $K \subseteq \overline{\mathbb{Q}}$. If $\Lambda = b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n \neq 0$, then there exists an effectively computable constant $C > 0$ such that

$$\log |\Lambda| > -C \log B.$$

The constant $C$ depends on $\alpha_1, \dots, \alpha_n$ and $d = [K : \mathbb{Q}]$.

Remark 5.7. The theorem can also be stated as follows: There exists a $C > 0$ such that if

$$\log |\Lambda| \le -C \log B$$

then $\Lambda = 0$. In other words, $\alpha_1, \dots, \alpha_n$ are multiplicatively dependent.

Preparations for the proof

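Feldman introduced $\Delta$-functions. For each integer $k \ge 1$, we define a function from $\mathbb{C}$ to $\mathbb{C}$ by

$$\Delta(z, k) := \frac{(z + 1) \cdots (z + k)}{k!},$$

and we set $\Delta(z, 0) := 1$. The $\Delta$ functions satisfy a number of properties.

• The $\Delta$ functions map integers to integers, i.e., $\Delta(\mathbb{Z}, k) \subseteq \mathbb{Z}$.

• $|\Delta(z, k)| \le \dfrac{(|z| + k)^k}{k!} \le e^{|z| + k}$.

• The set $\{\Delta(z, k) \mid k \in \mathbb{N}\}$ is a $\mathbb{Q}$-basis of the polynomial ring $\mathbb{Q}[z]$.

• The $\mathbb{Z}$-submodule $U = \{f \in \mathbb{Q}[z] \mid f(\mathbb{Z}) \subseteq \mathbb{Z}\} \subseteq \mathbb{Q}[z]$ of functions which preserve integers is generated by the $\Delta$ functions:

$$U = \bigoplus_{k \ge 0} \mathbb{Z}\,\Delta(z, k).$$

For all integers $\ell > 0$ and $m \ge 0$, we define

$$\Delta(z, k, \ell, m) = \frac{1}{m!}\Big(\frac{d}{dz}\Big)^{m} \Delta(z, k)^{\ell} = \frac{1}{2\pi i} \int_{|\zeta - z| = 1} \frac{\Delta(\zeta, k)^{\ell}\, d\zeta}{(\zeta - z)^{m+1}}.$$

Hence we arrive at a bound which does not depend on $m$:

$$|\Delta(z, k, \ell, m)| \le e^{(|z| + k)\ell}. \tag{5.1}$$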


We have successfully played the game to find a useful bound. But there are no games without pain, so the question is, where does the pain enter? The answer is that differentiating to define ∆(z, k, ℓ, m) introduces denominators. It no longer maps integers to integers. Let

$$v(k) = \operatorname{lcm}(1,\dots,k) = \prod_{p \text{ prime},\ p \le k} p^{\lfloor \log k/\log p \rfloor}.$$

By the Prime Number Theorem, the number of primes less than or equal to k is asymptotically k/ log k. Therefore $v(k) \le 4^{k}$. By this definition, we have assuaged the pain, since

$$v(k)\,\Delta(z,k,\ell,m)$$

maps integers to integers.
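The prime-power formula for v(k) and the bound v(k) ≤ 4^k can be tested for small k. This is a minimal sketch; the function names `lcm_up_to` and `prime_power_product` are illustrative, and the primality test is plain trial division.

```python
from math import gcd


def lcm_up_to(k):
    """Return v(k) = lcm(1, ..., k)."""
    value = 1
    for j in range(1, k + 1):
        value = value * j // gcd(value, j)
    return value


def prime_power_product(k):
    """Product over primes p <= k of the largest power of p not exceeding k."""
    result = 1
    for p in range(2, k + 1):
        is_prime = all(p % q for q in range(2, int(p ** 0.5) + 1))
        if is_prime:
            power = p
            while power * p <= k:
                power *= p
            result *= power
    return result


print(lcm_up_to(10), prime_power_product(10))
```

The largest power of p not exceeding k is exactly p raised to ⌊log k / log p⌋, which is why the two functions should agree.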

Proof: The auxiliary function

We introduce the notation c1, c2, c3, . . . for positive constants which do not depend on B (although perhaps they depend on αi or K), and which can be effectively computed.

We begin the proof by assuming that log |Λ| ≤ −C log B. Without loss of generality, we assume that bn ≠ 0. Knowing how the proof will end, we define constants

$$L := [k^{n/(n+1)}\log k], \qquad h := [\log(kB)], \qquad L' := h-1$$

for some large k ∈ N. We define a function on $\mathbb{C}^n$ by

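$$\Phi(z_0,\dots,z_{n-1}) := \sum_{0\le\lambda_{-1}\le L'}\ \sum_{0\le\lambda_0,\dots,\lambda_n\le L} p(\lambda)\,\Delta(z_0+\lambda_{-1},h,\lambda_0+1,0)\,\alpha_1^{\gamma_1 z_1}\cdots\alpha_{n-1}^{\gamma_{n-1}z_{n-1}}$$

where the function p(λ) is unknown, and

$$\gamma_i = \lambda_i - \frac{b_i}{b_n}\lambda_n, \qquad 1\le i\le n-1.$$

We shall determine p(λ) as an element of K = Q(α1, . . . , αn).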

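Lemma 5.8. There exist p(λ) ∈ K, not all 0, of height at most $c_1^{hk}$ such that

$$S := \sum_{\lambda} p(\lambda)\,\Delta(\ell+\lambda_{-1},h,\lambda_0+1,m_0)\prod_{j=1}^{n}\alpha_j^{\lambda_j\ell}\prod_{j=1}^{n-1}\gamma_j^{m_j} = 0 \tag{5.2}$$

for 0 ≤ ℓ < h, $m_j \ge 0$ (0 ≤ j ≤ n − 1) and $m_0+\cdots+m_{n-1} < k$.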


Proof. This is a typical application of Siegel's Lemma. Up to a factor of the form $(\log\alpha_1)^{m_1}\cdots(\log\alpha_{n-1})^{m_{n-1}}$, the formula of the lemma can be written as

$$\frac{1}{m_0!}\left(\frac{d}{dz_0}\right)^{m_0}\left(\frac{d}{dz_1}\right)^{m_1}\cdots\left(\frac{d}{dz_{n-1}}\right)^{m_{n-1}}\Phi(z_0,z_1,\dots,z_{n-1}).$$

Taking the $z_0$ derivative of Φ only changes 0 into $m_0$ in the fourth argument of the ∆ function (by the properties of ∆). The rest of the derivatives add a factor of $(\log\alpha_1)^{m_1}\cdots(\log\alpha_{n-1})^{m_{n-1}}$. To make them disappear, we define

$$D_j := (\log\alpha_j)^{-1}\frac{d}{dz_j}, \qquad D_0 := \frac{d}{dz_0}.$$

The new system is

$$\frac{1}{m_0!}D_0^{m_0}D_1^{m_1}\cdots D_{n-1}^{m_{n-1}}\Phi(z_0,z_1,\dots,z_{n-1}).$$

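We reformulate the powers appearing in Φ (in the third line we restrict to the diagonal $z_1=\cdots=z_{n-1}=z$):

$$\begin{aligned}
\alpha_1^{\gamma_1 z_1}\cdots\alpha_{n-1}^{\gamma_{n-1}z_{n-1}}
&= \alpha_1^{(\lambda_1-\frac{b_1}{b_n}\lambda_n)z_1}\cdots\alpha_{n-1}^{(\lambda_{n-1}-\frac{b_{n-1}}{b_n}\lambda_n)z_{n-1}}\\
&= \alpha_1^{\lambda_1 z_1}\cdots\alpha_{n-1}^{\lambda_{n-1}z_{n-1}}\left(\alpha_1^{-\frac{b_1}{b_n}z_1}\cdots\alpha_{n-1}^{-\frac{b_{n-1}}{b_n}z_{n-1}}\right)^{\lambda_n}\\
&= \alpha_1^{\lambda_1 z_1}\cdots\alpha_{n-1}^{\lambda_{n-1}z_{n-1}}\left(\alpha_1^{-\frac{b_1}{b_n}}\cdots\alpha_{n-1}^{-\frac{b_{n-1}}{b_n}}\right)^{\lambda_n z}\\
&= \alpha_1^{\lambda_1 z_1}\cdots\alpha_{n-1}^{\lambda_{n-1}z_{n-1}}\,(\alpha_n')^{\lambda_n z}
\end{aligned}$$

where we define $\alpha_n' := \alpha_1^{-b_1/b_n}\cdots\alpha_{n-1}^{-b_{n-1}/b_n}$. Recall the Schwarz lemma: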

Lemma 5.9 (Schwarz Lemma). Let f : D → D be a holomorphic function from the open unit disk D = {z : |z| < 1} ⊂ C to itself. If f(0) = 0, then |f(z)| ≤ |z| for all z ∈ D, and furthermore, |f′(0)| ≤ 1. If additionally |f(z)| = |z| for some non-zero z or if |f′(0)| = 1, then $f(z) = e^{i\theta}z$ for some θ ∈ R.

We estimate the difference |α′n − αn|. By the power series $e^z = 1 + z + z^2/2! + z^3/3! + \cdots$,

$$|e^{z}-1| \le |z|\,e^{|z|} \quad\text{for all } z\in\mathbb{C}.$$
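As a sanity check, the inequality can be tested on random complex numbers. This is only an illustrative sketch; the function name, the sampling box and the floating-point tolerance are assumptions.

```python
import cmath
import random
from math import exp


def exp_minus_one_bound_holds(z):
    """Check |e^z - 1| <= |z| e^{|z|}, allowing a tiny floating-point tolerance."""
    left = abs(cmath.exp(z) - 1)
    right = abs(z) * exp(abs(z))
    return left <= right * (1 + 1e-12) + 1e-12


random.seed(0)
samples = [complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(10000)]
print(all(exp_minus_one_bound_holds(z) for z in samples))
```

The bound follows term by term from the power series, since $|z^n/n!| \le |z|\cdot|z|^{n-1}/(n-1)!$ for n ≥ 1.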


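Letting $Q = b_n(\log\alpha_n' - \log\alpha_n) = -\Lambda$, so that $|Q| \le B^{-C}$ by assumption,

$$\begin{aligned}
|\alpha_n'-\alpha_n| &= |\alpha_n|\,|\alpha_n'\alpha_n^{-1}-1| = |\alpha_n|\cdot|e^{\log\alpha_n'-\log\alpha_n}-1|\\
&= |\alpha_n|\,|e^{Q/b_n}-1| \le |\alpha_n|\,|Q/b_n|\,e^{|Q/b_n|}\\
&\le |\alpha_n|\,|Q|\,e^{|Q|}\\
&\le |\alpha_n|\,B^{-C}e^{B^{-C}}\\
&\le B^{-\frac{3}{4}C}.
\end{aligned}$$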

At the last step, we impose a condition on the constant C. It must satisfy $eB^{-C} \le 1$. Since B ≥ 2, this will be true if $e2^{-C} \le 1$ is satisfied, a condition which does not depend on B.

We now bound the height of (5.2). The proof divides here into two cases.

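In the first case, we consider archimedean places ν. We use the bound (5.1). Evaluating z at ℓ gives that |z| ≤ h. Observing that k = λ0 + 1 ≤ L′ and that ℓ ≤ L gives that

$$\Delta(\ell+\lambda_{-1},h,\lambda_0+1,m_0) \le e^{(L'+h)L}.$$

Again observing the bounds on the sums, $\prod_{j=1}^{n}\alpha_j^{\lambda_j\ell} \le \prod_{j=1}^{n}H_\nu(\alpha_j)^{Lh}$. Using the definition of $\gamma_i$ bounds the final product in equation (5.2) to give

$$\begin{aligned}
S &\le c_2^{(L'+h)L}\prod_{j=1}^{n}H_\nu(\alpha_j)^{Lh}\prod_{j=1}^{n-1}\left(\max\left(1,\frac{\|b_j\|_\nu}{\|b_n\|_\nu}\right)2L\right)^{m_j}\\
&\le c_2^{(L'+h)L}\prod_{j=1}^{n}H_\nu(\alpha_j)^{Lh}\,(2BL)^{m_1+\cdots+m_{n-1}}\\
&\le c_2^{(L'+h)L}\prod_{j=1}^{n}H_\nu(\alpha_j)^{Lh}\,(2BL)^{k}.
\end{aligned}$$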

Second, we consider the case when ν is a finite place. Then

$$S \le \|v(h)\|_\nu^{-m_0}\cdot\prod_{j=1}^{n}H_\nu(\alpha_j)^{Lh}\,\|b_n^{-1}\|_\nu^{k}.$$

Altogether this makes

$$H(S) \le c_3^{hk}.$$

For an application of Siegel's Lemma, we need to bound the heights of the coefficients as well as the quantity $\frac{M}{N-M}$, where N is the number of variables, and M is the number of equations in (5.2).


• The coefficients are bounded by $c_3^{hk}$.

• $N = (L'+1)(L+1)^{n+1} \ge h\cdot\left(k^{\frac{n}{n+1}}\log k\right)^{n+1} \ge 2hk^{n}$.

• $M = h\cdot\binom{k+n-1}{n} \le hk^{n}$.

This gives $\frac{M}{N-M} \le 1$, and now Siegel's Lemma gives the result.
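The counting of unknowns and equations can be illustrated for concrete parameters. This is a sketch under the definitions of L, h and L′ given above; the function name `siegel_counts` and the sample values n = 2, k = 100, B = 10 are arbitrary choices made for illustration.

```python
from math import comb, floor, log


def siegel_counts(n, k, B):
    """Return (N, M): the number of unknowns p(lambda) and of equations in (5.2)."""
    L = floor(k ** (n / (n + 1)) * log(k))
    h = floor(log(k * B))
    N = h * (L + 1) ** (n + 1)      # L' + 1 = h choices for lambda_{-1}, L + 1 for each of lambda_0..lambda_n
    M = h * comb(k + n - 1, n)      # h values of ell times the number of admissible (m_0, ..., m_{n-1})
    return N, M


n, k, B = 2, 100, 10
N, M = siegel_counts(n, k, B)
h = floor(log(k * B))
print(N, M, M / (N - M))
```

For these sample values the ratio M/(N − M) is far below 1, which is the shape of input that Siegel's Lemma needs.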

We look now at the holomorphic functions

$$f_m(z) = \frac{1}{m_0!}D_0^{m_0}D_1^{m_1}\cdots D_{n-1}^{m_{n-1}}\Phi(z_0,\dots,z_{n-1})\Big|_{z_0=\cdots=z_{n-1}=z}.$$

We will sometimes suppress the m in the notation and simply write f(z). Similar estimates as in the proof of Lemma 5.8 give

$$|f(z)| \le c_4^{hk+L|z|} \tag{5.3}$$

for $m_0+\cdots+m_{n-1} < k$.

Exercise 5.10. Prove the estimate of equation 5.3.

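Definition 5.11. Let

$$g(\ell) = \sum_{\lambda}p(\lambda)\,\Delta(\ell+\lambda_{-1},h,\lambda_0+1,0)\,\alpha_1^{\lambda_1\ell}\cdots\alpha_n^{\lambda_n\ell}\,\gamma_1^{m_1}\cdots\gamma_{n-1}^{m_{n-1}}.$$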

Exercise 5.12. For $0 \le \ell < hk^{\frac{1}{2(n+1)}}(\log k)^{-1}$,

$$|f(\ell)-g(\ell)| \le B^{-\frac{1}{2}C}.$$

Lemma 5.13. For all integers ℓ with $0 \le \ell < hk^{\frac{1}{2(n+1)}}(\log k)^{-1}$, we either have g(ℓ) = 0 or

$$|f(\ell)| > c_6^{-hk-L\ell}.$$

Proof. If g(ℓ) = 0, then there is nothing to show, so let g(ℓ) ≠ 0. We use the product formula to show that

$$|g(\ell)| > c_7^{-hk}.$$

Then

$$|f(\ell)| = |g(\ell) + (f(\ell)-g(\ell))| \ge |g(\ell)| - |f(\ell)-g(\ell)| \ge |g(\ell)| - B^{-\frac{1}{2}C} \ge \frac{1}{2}|g(\ell)|$$

because, as we shall see, we may take C sufficiently large so that

$$B^{-\frac{1}{2}C} < \frac{1}{2}c_7^{-hk}.$$

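Since g(ℓ) ≠ 0, by Lemma 5.8, ℓ ≥ h. If ν | ∞, then

$$\prod_{j=1}^{n}\alpha_j^{\lambda_j\ell} \le \max(1,\|\alpha_j\|_\nu), \qquad \prod_{j=1}^{n}\gamma_j^{m_j} \le \left(\max\left(1,\frac{\|b_j\|}{\|b_n\|}\right)2\|L\|_\nu\right)^{m_j}.$$

The first part of g(ℓ) can also be bounded (we omit this) to give a total bound of

$$\|g(\ell)\|_\nu \le c_7^{hk}\prod_{j=1}^{n}\max(1,\|\alpha_j\|_\nu)\left(\max\left(1,\frac{\|b_j\|}{\|b_n\|}\right)2\|L\|_\nu\right)^{m_j}.$$

If ν ∤ ∞, then g(ℓ) has a correspondingly different bound,

$$\|g(\ell)\|_\nu \le \|v(h)\|_\nu^{-m_0}\prod_{j=1}^{n}\max(1,\|\alpha_j\|_\nu)\left(\max\left(1,\frac{\|b_j\|}{\|b_n\|}\right)2\|L\|_\nu\right)^{m_j}.$$

Since Lℓ ≤ hk, this implies that

$$1 = \prod_{\nu}\|g(\ell)\|_\nu \le c_7^{hk}\,\|g(\ell)\|.$$

Hence $|g(\ell)| > c_7^{-hk}$.

The inequality $B^{-\frac{1}{2}C} < \frac{1}{2}c_7^{-hk}$ can be achieved for C sufficiently large by a chain of reasoning which produces a list of constants c8, . . . imposing a finite number of lower bounds on C. Choosing C greater than all these lower bounds will then satisfy this inequality.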

5.3 Extrapolations

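We now turn our attention to the zeros of f(ℓ) in the rectangle bounded by h and k:

[Figure: the (ℓ, m)-plane, showing the rectangle with sides h and k together with the rectangle with sides S and T = [k/2].]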


Recall that h = [log(kB)]. Our aim is to show that, if f(ℓ) has zeros in the h × k rectangle, then it has zeros in the rectangle S × T, where

$$S := [hk^{\frac{1}{2(n+1)}}(\log k)^{-1}], \qquad T := [k/2].$$

Proposition 5.14. For 0 ≤ ℓ < S and m0, . . . , mn−1 ≥ 0 with m0 + · · · + mn−1 < T, we have g(ℓ) = 0.

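Proof. We use Cauchy's integral formula. Let

$$F(z) := \prod_{j=0}^{h-1}(z-j)^{T}$$

and consider the function φ(z) := f(z)/F(z). We will use the notation

$$|G(z)|_r := \max_{|z|=r}|G(z)|$$

for the maximum of a holomorphic function on the disk of radius r. By the Schwarz lemma 5.9, $|\varphi(z)|_S \le |\varphi(z)|_R$ for $R := [k^{\frac{1}{2(n+1)}}S]$. Hence

$$|f(z)|_S \le |f(z)|_R\,\frac{|F(z)|_S}{|F(z)|_R} \qquad\text{and}\qquad \frac{S^{hk}}{(R/2)^{hk}} = \left(\frac{2}{k^{\frac{1}{2(n+1)}}}\right)^{hk}.$$

Here we use |z − j| ≥ |z| − j and R − h ≥ R/2, and

$$|\varphi(z)| \le |\varphi(z)|_R\left(\frac{S}{R}\right)^{hk}.$$

The function φ(z) is meromorphic with poles at 0, . . . , h − 1. To apply the Cauchy integral formula, we define contours:

[Figure: the circle Γ of radius R about the origin, enclosing the small circles Γ0, Γ1, . . . around the integers 0, 1, . . . , h − 1.]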


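Here Γ is the circle of radius R and the $\Gamma_s$ are circles of radius 1/2 centered at integral points on the x-axis. Then

$$\frac{1}{2\pi i}\int_{\Gamma}\frac{\varphi(z)}{z-\ell}\,dz = \varphi(\ell) + \frac{1}{2\pi i}\sum_{s=0}^{h-1}\sum_{t=0}^{T-1}\frac{f^{(t)}(s)}{t!}\int_{\Gamma_s}\frac{1}{F(z)}\,\frac{(z-s)^{t}}{z-\ell}\,dz.$$

Multiplying by F(ℓ) gives

$$\underbrace{\frac{1}{2\pi i}\int_{\Gamma}\frac{F(\ell)}{F(z)}\,\frac{f(z)}{z-\ell}\,dz}_{I} = f(\ell) + \underbrace{\frac{1}{2\pi i}\sum_{s=0}^{h-1}\sum_{t=0}^{T-1}\frac{f^{(t)}(s)}{t!}\int_{\Gamma_s}\frac{F(\ell)}{F(z)}\,\frac{(z-s)^{t}}{z-\ell}\,dz}_{J}.$$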

(Note the definitions of I and J.) We want to exclude the possibility that $|f(\ell)| > c_6^{-hk-L\ell}$. To do this, we show that $I, J < c_8^{-hk}$. We bound F(ℓ)/F(z) by first bounding F(z) from below on the circle of radius R, and then bounding F(ℓ) from above.

On Γ, the circle of radius R, |z − j| ≥ |z| − j = R − j ≥ R/2. Hence,

$$|F(z)| > 2^{-hT}R^{hT}, \qquad |F(\ell)| = \prod_{j=0}^{h-1}|\ell-j|^{T} \le S^{hT}.$$

Together, this gives

$$\left|\frac{F(\ell)}{F(z)}\right| \le 2^{hk}\,k^{-\frac{hT}{2(n+1)}}, \qquad z\in\Gamma. \tag{5.4}$$
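The two elementary bounds behind (5.4) combine to |F(ℓ)/F(z)| ≤ (2S/R)^{hT} for z on Γ, and this can be tested numerically. The following is only a sketch: the function names and the small sample values h = 4, T = 3, S = 12, R = 40 are assumptions chosen so that R ≥ 2h, and logarithms are used to avoid overflow.

```python
import cmath
from math import log, pi


def log_abs_F(z, h, T):
    """Return log|F(z)| for F(z) = prod_{j=0}^{h-1} (z - j)^T, assuming F(z) != 0."""
    return T * sum(log(abs(z - j)) for j in range(h))


def check_ratio_bound(h, T, S, R, samples=360):
    """Check |F(ell)/F(z)| <= (2S/R)^(hT) for integers h <= ell < S and z on |z| = R."""
    log_bound = h * T * log(2 * S / R)
    for ell in range(h, S):  # F(ell) = 0 for 0 <= ell < h, so those cases are trivial
        for m in range(samples):
            z = R * cmath.exp(2j * pi * m / samples)
            if log_abs_F(ell, h, T) - log_abs_F(z, h, T) > log_bound + 1e-9:
                return False
    return True


print(check_ratio_bound(h=4, T=3, S=12, R=40))
```

With $R = k^{\frac{1}{2(n+1)}}S$ the quantity $(2S/R)^{hT}$ becomes $2^{hT}k^{-\frac{hT}{2(n+1)}}$, which is at most the right-hand side of (5.4) since T ≤ k.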

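We have $L|z| = LR = k^{\frac{n}{n+1}}\log k\cdot k^{\frac{1}{2(n+1)}}S = kh$, so

$$|f(z)| \le c_9^{hk+L|z|} \le c_{10}^{2hk}, \qquad z\in\Gamma. \tag{5.5}$$

Using equations (5.4) and (5.5), we can now bound the integral I by

$$I \le 2\cdot 2^{hk}\left(\frac{S}{R}\right)^{hT}c_{10}^{2hk} = c_{11}^{hk}\,k^{-\frac{hT}{2(n+1)}} \le B^{-\frac{1}{4}C}.$$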

To bound the double sum of integrals, J, we write the derivatives of f in terms of partial derivatives of ∆,

$$f_m^{(t)}(z) = \left(\frac{\partial}{\partial z_0}+\cdots+\frac{\partial}{\partial z_{n-1}}\right)^{t}\frac{1}{m_0!}D_0^{m_0}\cdots D_{n-1}^{m_{n-1}}\Phi(z_0,\dots,z_{n-1})\Big|_{z_0=\cdots=z_{n-1}=z}, \qquad t = m_0+\cdots+m_{n-1}.$$


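Recall that $\frac{\partial}{\partial z_0} = D_0$ and $\frac{\partial}{\partial z_i} = \log\alpha_i\,D_i$ for 1 ≤ i ≤ n − 1. The powers of the derivative are then

$$\left(\frac{d}{dz}\right)^{t} = (D_0+\log\alpha_1 D_1+\cdots+\log\alpha_{n-1}D_{n-1})^{t} = \sum_{\tau}q(\tau)\frac{1}{\tau_0!}D^{\tau}$$

where we use the notation $D^{\tau} = D_0^{\tau_0}\cdots D_{n-1}^{\tau_{n-1}}$. The main point is that these differential operators are not defined over Q.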

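$$\begin{aligned}
f_m^{(t)}(z) &= \sum_{\tau}q(\tau)\frac{1}{m_0!\,\tau_0!}D_0^{m_0+\tau_0}\cdots D_{n-1}^{m_{n-1}+\tau_{n-1}}\Phi(z_0,\dots,z_{n-1})\Big|_{z_0=\cdots=z_{n-1}=z}\\
&= \sum_{\tau}q^{*}(\tau)\frac{1}{(m_0+\tau_0)!}D_0^{m_0+\tau_0}\cdots D_{n-1}^{m_{n-1}+\tau_{n-1}}\Phi(z_0,\dots,z_{n-1})\Big|_{z_0=\cdots=z_{n-1}=z}\\
&= \sum_{\tau,\ |m|\le k-1}q^{*}(\tau)f_{m+\tau}(z), \qquad 0\le|m|\le T-1
\end{aligned}$$

where

$$q^{*}(\tau) = \frac{t!}{\tau_1!\cdots\tau_{n-1}!}\binom{\tau_0+m_0}{\tau_0}(\log\alpha_1)^{\tau_1}\cdots(\log\alpha_{n-1})^{\tau_{n-1}}.$$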

Hence $c^{k}\cdot\sum\frac{t!}{\tau_1!\cdots\tau_{n-1}!} \le c^{k}\cdot n^{T}$. And if m0 + · · · + mn−1 ≤ T − 1, then the summation goes over $f_{m+\tau}(z)$ with m0 + · · · + mn−1 ≤ k − 1.

$$|f_{m+\tau}(s)| \le \underbrace{|g_{m+\tau}(s)|}_{=0} + \underbrace{|f_{m+\tau}(s)-g_{m+\tau}(s)|}_{<B^{-C/2}}$$

We arrive at an upper bound for the derivatives,

$$|f_m^{(t)}(s)| < c_{12}^{hk}B^{-C/2}, \qquad 0\le t<T-1,\ 0\le s\le h-1. \tag{5.6}$$

Finally, we need a bound on the integrals over $\Gamma_s$. For z ∈ $\Gamma_s$,

$$|z-j| \ge \begin{cases} 1/2 & s = j,\\ (s-j)/2 & 0\le j<s,\\ (j-s)/2 & s<j\le h-1. \end{cases}$$

Then

$$|F(z)| \ge 2^{-h}\,s!\,(h-s)!\,(h-s)^{-1},$$

where $s!\,(h-s)! \le \binom{h}{s}h!$.


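Stirling's formula gives that $h! \ge \left(\frac{h}{e}\right)^{h}\sqrt{2\pi h}$. Finally, $|F(z)| \ge (8e)^{-hT}h^{hT}$. This gives for z ∈ $\Gamma_s$,

$$\left|\frac{F(\ell)}{F(z)}\right| \le \frac{S^{hT}}{h^{hT}}(8e)^{hT} \le \left(\frac{k^{\frac{1}{2(n+1)}}h}{h}\right)^{hT}(8e)^{hT} \le \left(8e\,k^{\frac{1}{2(n+1)}}\right)^{hT}. \tag{5.7}$$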
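The Stirling lower bound used here is easy to confirm for small h. This is a minimal sketch; the function name and the tested range are illustrative assumptions.

```python
from math import e, factorial, pi, sqrt


def stirling_lower_bound(h):
    """Return (h/e)^h * sqrt(2*pi*h), the lower bound for h! used in the text."""
    return (h / e) ** h * sqrt(2 * pi * h)


for h in (1, 5, 10, 20):
    print(h, factorial(h), stirling_lower_bound(h))
```

The ratio of h! to this bound tends to 1 from above, so the inequality is sharp up to a factor close to 1.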

We are finally ready to bound the double sum of integrals using equations (5.7) and (5.6) to give

$$J \le \sum_{s=0}^{h-1}\sum_{t=0}^{T-1}\frac{1}{t!}\,c^{hk}B^{-\frac{1}{2}C}k^{\frac{hT}{2(n+1)}} \le \left(\sum_{s=0}^{h-1}\sum_{t=0}^{T-1}\frac{1}{t!}\right)B^{-\frac{1}{4}C} \le he\,B^{-\frac{1}{4}C}$$

provided that $B^{\frac{1}{4}C} > (ck)^{hk}$.

The bounds on I and J together bound the absolute value of f(ℓ):

$$|f(\ell)| \le |I|+|J| \le B^{-\frac{1}{4}C}(1+he) \le B^{-\frac{1}{5}C}.$$

Details for abbreviated parts of the proof can be found in Baker and Wüstholz.

Proposition 5.15. If log |Λ| < −C log B we have g(ℓ) = 0 for all 0 ≤ ℓ < S and all non-negative integers m0, . . . , mn−1 with m0 + · · · + mn−1 ≤ T.

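Proof. Let

$$P(X_0,\dots,X_n) := \sum_{\lambda_{-1}=0}^{L'}\ \sum_{\lambda_0,\dots,\lambda_n\le L}p(\lambda_{-1},\dots,\lambda_n)\,\Delta(X_0+\lambda_{-1},h,\lambda_0+1,0)\,X_1^{\lambda_1}\cdots X_n^{\lambda_n}.$$

Then for m0, . . . , mn−1 with m0 + · · · + mn−1 < T,

$$g(\ell) = \frac{1}{m_0!}D_0^{m_0}\cdots D_{n-1}^{m_{n-1}}P(X_0,\dots,X_n)\Big|_{(X_0,\dots,X_n)=(\ell,\alpha_1^{\ell},\dots,\alpha_n^{\ell})} = \Phi(\ell,\dots,\ell)$$

where $D_0 = \frac{\partial}{\partial X_0}$ and $D_i = X_i\frac{\partial}{\partial X_i}-\frac{b_i}{b_n}X_n\frac{\partial}{\partial X_n}$. Hence P satisfies

$$D_0^{m_0}\cdots D_{n-1}^{m_{n-1}}P(\ell,\alpha_1^{\ell},\dots,\alpha_n^{\ell}) = 0$$

for all 0 ≤ ℓ < S and all m0, . . . , mn−1 such that m0 + · · · + mn−1 < T.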


5.4 Multiplicity estimates

The presentation here can be supplemented by the book by Baker and Wüstholz. If

$$S(T/n)^{n} \ge k^{\varepsilon}L'L^{n+1}$$

for sufficiently large ε and k, then α1, . . . , αn are multiplicatively dependent, i.e., there exist X1, . . . , Xn ∈ Z not all zero, such that

$$\alpha_1^{X_1}\cdots\alpha_n^{X_n} = 1. \tag{5.8}$$

This is satisfied for our P(X0, . . . , Xn). Siegel's lemma says that $ST^{n} \le L'L^{n}$ and extrapolation says that $ST^{n} > k^{\varepsilon}L'L^{n}n^{n}$.

Theorem 5.16. If α1, . . . , αn are multiplicatively independent, then

log |Λ| > −C log B

for some effectively computable constant C > 0 which depends only on [K : Q], h(α1), . . . , h(αn) and n.

If the terms are multiplicatively dependent, then taking the logarithm of equation (5.8) gives a linear relation (assuming, after renumbering, that Xn ≠ 0),

$$\log\alpha_n = -\frac{1}{X_n}\left(X_1\log\alpha_1+\cdots+X_{n-1}\log\alpha_{n-1}\right).$$

The nth case is thereby reduced to the (n − 1)th case.


Lecture 6

The Unit Equation

6.1 Units and S-units

Let K ⊃ Q be a number field, and let $\mathcal{O}_K$ be its ring of integers. Let Σ := Hom(K, C). Then |Σ| = r + 2s where r is the number of homomorphisms which factor through R → C and s is the number of pairs of complex conjugate embeddings.

Theorem 6.1 (Dirichlet’s Unit Theorem). With the above notation,

$$\mathcal{O}_K^{\times} \cong \mu(K)\times\mathbb{Z}^{t}, \qquad t = r+s-1$$

where µ(K) is the group of roots of unity in K.

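Let $\varepsilon_1,\dots,\varepsilon_t \in \mathcal{O}_K^{\times}$ be a basis for $\mathcal{O}_K^{\times}/\mu(K)$. Then $x \in \mathcal{O}_K^{\times} \Leftrightarrow x = \zeta\cdot\varepsilon_1^{x_1}\cdots\varepsilon_t^{x_t}$ for some x1, . . . , xt ∈ Z and some ζ ∈ µ(K). In this lecture, we solve the unit equation:

Definition 6.2 (Unit equation). The unit equation is

$$x+y = 1, \qquad x,y\in\mathcal{O}_K^{\times}. \tag{6.1}$$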
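A concrete instance may help. In K = Q(√5) the ring of integers is Z[φ] with φ = (1 + √5)/2, and every unit has the form ±φ^e. The sketch below searches a bounded range of exponents for solutions of x + y = 1; the pair representation (a, b) for a + bφ and all function names are choices made for this illustration, not notation from the notes.

```python
# Elements of Z[phi], phi = (1 + sqrt(5)) / 2, are stored as pairs (a, b) meaning a + b*phi.

def mul(u, v):
    """Multiply in Z[phi], using phi^2 = phi + 1."""
    a, b = u
    c, d = v
    return (a * c + b * d, a * d + b * c + b * d)


def norm(u):
    """Norm of a + b*phi down to Q: (a + b*phi)(a + b*phi') = a^2 + ab - b^2."""
    a, b = u
    return a * a + a * b - b * b


def unit(sign, exponent):
    """Return sign * phi^exponent, where phi^(-1) = phi - 1 is stored as (-1, 1)."""
    base = (0, 1) if exponent >= 0 else (-1, 1)
    result = (sign, 0)
    for _ in range(abs(exponent)):
        result = mul(result, base)
    return result


def unit_equation_solutions(bound):
    """List pairs (x, y) of units with x + y = 1 and x = +-phi^e, |e| <= bound."""
    solutions = []
    for sign in (1, -1):
        for exponent in range(-bound, bound + 1):
            x = unit(sign, exponent)
            y = (1 - x[0], -x[1])          # y = 1 - x
            if abs(norm(y)) == 1:          # y is a unit exactly when its norm is +-1
                solutions.append((x, y))
    return solutions


print(unit_equation_solutions(10))
```

For example, x = φ⁻¹ = φ − 1 and y = φ⁻² = 2 − φ give a solution, since (φ − 1) + (2 − φ) = 1.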

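Definition 6.3 (S-units). If S is a finite set of primes in $\mathcal{O}_K$, then the S-units are the invertible elements of

$$\mathcal{O}_{K,S} = \left\{\frac{a}{b}\ \middle|\ a,b\in\mathcal{O}_K,\ b\notin\mathfrak{p}\text{ for }\mathfrak{p}\notin S\right\} = \{x\in K \mid |x|_\nu\le 1\text{ for }\nu\notin S\}.$$

Note that $\mathcal{O}_{K,S} \supset \mathcal{O}_K$.

Definition 6.4 (S-unit equation). The S-unit equation is x + y = 1 for $x,y\in\mathcal{O}_{K,S}^{\times}$. A solution of the S-unit equation gives a solution of ax + by = 1 with $x,y\in\mathcal{O}_K$.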


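Let $x,y\in\mathcal{O}_K^{\times}$ be solutions of (6.1). Put

$$x = \xi\,\varepsilon_1^{x_1}\cdots\varepsilon_t^{x_t},\quad \xi\in\mu(K),\ x_i\in\mathbb{Z}, \qquad y = \eta\,\varepsilon_1^{y_1}\cdots\varepsilon_t^{y_t},\quad \eta\in\mu(K),\ y_i\in\mathbb{Z}.$$

Let X := max(|xi|) and Y := max(|yi|). We shall bound max(X, Y). Without loss of generality, we assume that X ≤ Y. We assume that there exists ν | ∞ such that

$$-\log\|y\|_\nu \ll -Y,$$

that is, $\log\|y\|_\nu \ge \varepsilon Y$ for some constant ε > 0. We will show that this assumption is actually no restriction using the product formula. Equation (6.1) implies that

$$\left\|\frac{x}{y}+1\right\| = \left\|\frac{1}{y}\right\| = \frac{1}{\|y\|} \le e^{-\varepsilon Y}$$

where the norm is the ν-norm. Now

$$\frac{x}{y} = \frac{\xi}{\eta}\,\varepsilon_1^{x_1-y_1}\cdots\varepsilon_t^{x_t-y_t} = \zeta\,\varepsilon_1^{x_1-y_1}\cdots\varepsilon_t^{x_t-y_t}$$

so

$$\frac{x}{y}+1 = \zeta\,\varepsilon_1^{x_1-y_1}\cdots\varepsilon_t^{x_t-y_t}+1 = e^{z}+1$$

where $z = \sum_{i=1}^{t}(x_i-y_i)\log\varepsilon_i + iq\pi$ for some q ∈ Q depending on the root of unity ζ.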

Lemma 6.5. Let z ∈ C satisfy

$$|e^{z}+1| \le \frac{1}{4}.$$

Then there exists an integer k such that

$$|z-ik\pi| \le \frac{4}{3}|e^{z}+1|.$$

Proof. We take the branch of the logarithm which satisfies $\log e^{z} = z$ for 0 ≤ Im z < 2π.

Let $\zeta = e^{z}+1$, and let w lie in this strip such that z = w modulo 2πi. Then $w = \log e^{z} = \log(\zeta-1) = i\pi+\log(1-\zeta)$. The hypothesis $|\zeta| = |e^{z}+1| \le \frac{1}{4}$ and the power series formula $\log(1-\zeta) = -\sum_{n=1}^{\infty}\frac{\zeta^{n}}{n}$ together imply that

$$|\log(1-\zeta)| \le \frac{4}{3}|\zeta| = \frac{4}{3}|e^{z}+1|.$$
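Lemma 6.5 can be tested numerically. The sketch below samples points z near odd multiples of iπ and takes k to be the nearest integer to Im z/π; the sampling region, the tolerance and the function name are assumptions made for this illustration.

```python
import cmath
import random
from math import pi


def lemma_6_5_holds(z):
    """Return True if the hypothesis fails or the conclusion of Lemma 6.5 holds for z."""
    zeta = cmath.exp(z) + 1
    if abs(zeta) > 0.25:
        return True  # hypothesis |e^z + 1| <= 1/4 not satisfied, nothing to check
    k = round(z.imag / pi)  # candidate integer k
    return abs(z - 1j * k * pi) <= (4 / 3) * abs(zeta) + 1e-12


random.seed(0)
samples = [
    complex(random.uniform(-0.3, 0.3),
            (2 * random.randint(-5, 5) + 1) * pi + random.uniform(-0.3, 0.3))
    for _ in range(10000)
]
print(all(lemma_6_5_holds(z) for z in samples))
```

Under the hypothesis, z differs from an odd multiple of iπ by log(1 − ζ), whose modulus is at most 1/3, so rounding Im z/π recovers that odd integer.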


The Lemma implies that there is some k ∈ Q such that

$$|z-ik\pi| = |\Lambda| \le \frac{4}{3}\left|\frac{x}{y}+1\right|$$

where $\Lambda = \sum_{i=1}^{t}(x_i-y_i)\log\varepsilon_i - k\log(-1)$. Thus log |Λ| ≪ −Y, where Y = max(|xi|, |yi|). We now have a linear combination of logarithms, and we would like to apply Baker's theorem. This requires

|xi − yi| ≤ 2 max(|xi|, |yi|) = 2Y and |k| ≪ Y.

These both hold. Thus − log Y ≪ log |Λ|. Comparing this with log |Λ| ≪ −Y bounds Y. But this reasoning depends on the hypothesis that at some place w we have log |y|w ≫ Y.

6.2 Unit and regulator

In this section, we denote the units of K by $U_K = \mathcal{O}_K^{\times}$. Then ε ∈ $U_K$ implies that N(ε) = ±1. Let ε1, . . . , εt be a set of generators of $U_K$, and let Σ0 := {σ1, . . . , σr+s} ⊆ Σ ⊆ Hom(K, C) be a set of embeddings.

Definition 6.6 (Regulator). Let

$$M := \left(\log|\sigma_j(\varepsilon_i)|\right)_{\substack{i=1,\dots,t\\ j=1,\dots,t+1}} = \left(\log|\varepsilon_i|_w\right) \in M_{t+1,t}(\mathbb{R}).$$

We take any submatrix R in $M_{t,t}(\mathbb{R})$. Then the regulator of K is det R, which is independent of the choice of R.
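For a real quadratic field the matrix M is small enough to write out. The sketch below uses K = Q(√2) with fundamental unit ε = 1 + √2, so t = 1 and M has two rows and one column; the variable names are illustrative, and determinants are compared in absolute value, which is an assumption about how the sign is to be treated.

```python
from math import log, sqrt

# Images of the fundamental unit 1 + sqrt(2) under the two real embeddings of Q(sqrt 2).
epsilon_embeddings = [1 + sqrt(2), 1 - sqrt(2)]

# M has t + 1 = 2 rows (one per embedding) and t = 1 column (one per generator).
M = [[log(abs(value))] for value in epsilon_embeddings]

# Each 1 x 1 submatrix is a single entry; its absolute value is the candidate regulator.
minors = [abs(row[0]) for row in M]
print(minors)
```

Both minors agree because the norm of ε is ±1, so the logarithms of its absolute values sum to zero.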

Taking the logarithm of the absolute value of y gives

$$\log|y|_w = y_1\log|\varepsilon_1|_w+\cdots+y_t\log|\varepsilon_t|_w, \qquad w\in\Sigma_0.$$

As w ranges through the embeddings Σ0, this produces a system of equations for y1, . . . , yt. Hence

$$Y = \max|y_i| \ll \max_w(\log|y|_w).$$

So there exists a w such that log |y|w ≫ Y, and there are finitely many solutions to the unit equation.


Lecture 7

Integral points in P1 \ {0, 1, ∞}

Let K be a number field, and let X = Spec R be an affine variety for R = K[f1, . . . , fn]. The variety X embeds in affine space by the morphism x ↦ (f1(x), . . . , fn(x)) ∈ $\mathbb{A}^n$.

Definition 7.1 (Integral points). The integral points of X are

$$X(\mathcal{O}_K) = \{P\in X \mid f_1(P),\dots,f_n(P)\in\mathcal{O}_K\}.$$

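Let X be projective over K, and let D be a very ample divisor.

[Commutative diagram: $X_K \to \mathbb{P}^N_K \to \mathbb{P}^N_{\mathcal{O}_K}$, together with $X_{\mathcal{O}_K}\setminus D_{\mathcal{O}_K} \to X_{\mathcal{O}_K} \leftarrow D_{\mathcal{O}_K}$, lying over Spec K and Spec $\mathcal{O}_K$, with sections $\sigma_K$, $\sigma_{\mathcal{O}_K}$ and σ.]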

The image σ(Spec $\mathcal{O}_K$) does not intersect $D_{\mathcal{O}_K}$.

Now let $X = \mathbb{P}^1_K\setminus\{0,1,\infty\}$ be the projective line with three punctures. It can be expressed as the affine variety $X = \mathbb{A}^1\setminus\{0,1\}$ whose regular functions include T, $\frac{1}{T}$, T − 1 and $\frac{1}{T-1}$. These generate the coordinate ring $R = K[T, T^{-1}, T-1, (T-1)^{-1}]$.

By definition, a point P ∈ X is an integral point if and only if the value of these regular functions is integral. Since 1/T(P) is integral, $T(P)\in\mathcal{O}_K^{\times}$. Similarly, 1/(T − 1)(P) is integral, so $(T-1)(P)\in\mathcal{O}_K^{\times}$. We conclude that the integral points of $\mathbb{P}^1_K\setminus\{0,1,\infty\}$ are solutions of the unit equation:

$$T(P) + (1-T)(P) = 1, \qquad T(P),\ (1-T)(P)\in\mathcal{O}_K^{\times}.$$


Examples

Let E be the elliptic curve in normal form $y^2 = x(x-1)(x-\lambda)$. It forms a ramified cover

$$E\setminus\{E_1,E_2,E_3\} \to \mathbb{P}^1\setminus\{0,1,\infty\}.$$

Integrality of points in E is intimately related to integrality of points in the image, of which there are finitely many. This allows one to conclude that E also has finitely many integral points.

A super-elliptic curve has normal form

$$y^{m} = x(x-1)(x-\lambda).$$

On any super-elliptic curve, there are only finitely many integral points, and they can be determined effectively.

This should be compared with Siegel's theorem that the set of integral points on an algebraic curve of genus at least one is finite. But Siegel's theorem is not effective, hence the work we have done.


Lecture 8

Appendix: Lattice Theory

8.1 Lattices

Let (E, 〈., .〉) be a finite dimensional Euclidean Q-vector space, where

〈., .〉 : E × E → Q

is a positive-definite bilinear form.

A subgroup Λ ⊂ E is called discrete if for all bounded B ⊂ E, the intersection B ∩ Λ is finite.

Theorem A.1. A subgroup Λ ⊂ E is discrete if and only if

dim(QΛ) = rk(Λ)

Proof. (⇐) Let l1, . . . , lr ∈ Λ be free generators from QΛ (r ≤ dim(E)). By assumption, these are linearly independent over Q and can therefore be completed to a basis of E. Because B ⊂ E is bounded, the coordinate functions of B ∩ Λ are also bounded, and hence B ∩ Λ is finite.

(⇒) We choose a basis l1, . . . , lr ∈ Λ of QΛ and let

$$F(\Lambda) := \left\{\sum_{i=1}^{r}x_i l_i \ \middle|\ x_i\in\mathbb{Q},\ -1/2 < x_i \le 1/2\right\}.$$

The set F(Λ) is bounded and therefore F(Λ) ∩ Λ is finite by assumption. We set $\Lambda' := \sum_i \mathbb{Z}l_i \subset \Lambda$, and then the inclusion Λ/Λ′ → QΛ/Λ′ ≅ F(Λ) shows that the quotient Λ/Λ′ is also finite. Therefore the sublattice Λ′ ⊆ Λ has finite index, so that Λ is finitely generated and torsion free as a subgroup of E. It is thus free and hence

dim(QΛ) = rk(Λ′) = rk(Λ).

A discrete subgroup of maximal rank is called a lattice.


We now equip (E, 〈., .〉) with the following measure: given an orthonormal basis e1, . . . , en of E, we can produce an isomorphism e between E and $\mathbb{Q}^n$. We then set $d\mu = e^{*}d\mu_L$, where $d\mu_L$ indicates the Lebesgue measure. We obtain in this way a volume,

$$\operatorname{Vol}(F(\Lambda)) = \int_{F(\Lambda)}d\mu.$$

Theorem A.2. Let Λ ⊂ E be a lattice, and e1, . . . , en, an orthonormal basisof E with Λ = Φ(Ze1 + · · ·+ Zen). Then,

Vol(F (Λ)) = | det(Φ)|.

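Proof. The pullback satisfies $\Phi^{*}d\mu = |\det(\Phi)|\,d\mu$, and therefore,

$$\begin{aligned}
\operatorname{Vol}(F(\Lambda)) &= \int_{F(\Lambda)}d\mu = \int_{F(\Phi(\mathbb{Z}e_1+\cdots+\mathbb{Z}e_n))}d\mu\\
&= \int_{F(\mathbb{Z}e_1+\cdots+\mathbb{Z}e_n)}\Phi^{*}d\mu\\
&= |\det(\Phi)|\int_{F(\mathbb{Z}e_1+\cdots+\mathbb{Z}e_n)}d\mu\\
&= |\det(\Phi)|\int_{F(\mathbb{Z}^n)}d\mu_L = |\det(\Phi)|.
\end{aligned}$$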

We note further that $\det(\Phi)^2 = \det(\Phi\Phi^{T}) = \det(\langle l_i,l_j\rangle)$ for $l_i = \Phi(e_i)$.
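The identity between the squared determinant and the Gram determinant can be checked on a small example. This is a sketch for a lattice in the plane with the standard scalar product; the basis vectors and helper names are arbitrary illustrative choices.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def gram(basis):
    """Gram matrix (<l_i, l_j>) for the standard scalar product."""
    return [[sum(a * b for a, b in zip(u, v)) for v in basis] for u in basis]


# l_1 = Phi(e_1) and l_2 = Phi(e_2), written in the orthonormal basis e_1, e_2.
basis = [(2, 1), (1, 3)]
phi = [list(vector) for vector in basis]

print(abs(det2(phi)), det2(gram(basis)))
```

Here |det Φ| = 5 is the volume of the fundamental domain, and the Gram determinant is 25.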

8.2 Geometry of numbers

We consider now the lattice Λ ⊆ $\mathbb{R}^n$ of the Euclidean space $\mathbb{R}^n$ viewed with a standard scalar product with respect to which the canonical basis is orthonormal. We call a set S ⊂ $\mathbb{R}^n$ convex if for every pair of points x, y ∈ S, the linear combinations λx + µy ∈ S for all λ + µ = 1 with 0 ≤ λ, µ ≤ 1. We call the set S symmetric if S = −S. For a closed, symmetric and convex set K ⊂ $\mathbb{R}^n$, we define the successive minima of the lattice Λ with respect to K by

$$\lambda_k := \inf\{\lambda \mid \lambda K \text{ contains } k \text{ linearly independent lattice points}\}.$$

It clearly holds that

$$\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n.$$

Writing d(Λ) = det(Λ), we formulate the following theorem of Minkowski.


Theorem A.3 (Minkowski). With the above notation,

$$\frac{1}{n!} \le \lambda_1\cdots\lambda_n\,\frac{\operatorname{Vol}(K)}{2^{n}d(\Lambda)} \le 1.$$

Proof. We refer the reader to J. W. S. Cassels, An Introduction to the Geometry of Numbers.

We now let $\delta(K,\Lambda) = \frac{\operatorname{Vol}(K)}{2^{n}d(\Lambda)}$, so that

$$\frac{1}{n!} \le \lambda_1\cdots\lambda_n\cdot\delta(K,\Lambda) \le 1.$$

In particular,

$$\lambda_1^{n} \le \lambda_1\cdots\lambda_n \le \delta(K,\Lambda)^{-1}.$$

If δ(K, Λ) ≥ 1, then λ1 ≤ 1, which means that K contains a nonzero lattice point. This is the content of the following corollary.

Corollary A.4 (Minkowski’s Lattice Point Theorem). If, under the aboveassumptions, the inequality

Vol(K) ≥ 2nd(Λ),

holds, then K contains a non-trivial lattice point.

This immediately implies a corollary.

Corollary A.5. Let l1, . . . , ln ∈ Hom($\mathbb{R}^n$, R) be linear forms, and let 0 < c1, . . . , cn ∈ R with c1 · · · cn ≥ | det(l1, . . . , ln)| d(Λ). Then there is a 0 ≠ u ∈ Λ with

$$|l_j(u)| \le c_j$$

for 1 ≤ j ≤ n.

Proof. We define K by |lj| ≤ cj for 1 ≤ j ≤ n. Then,

$$\operatorname{Vol}(K) = 2^{n}c_1\cdots c_n/|\det(l_1,\dots,l_n)| \ge 2^{n}d(\Lambda).$$
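A familiar special case of Corollary A.5 is Dirichlet's approximation theorem. Take Λ = Z², the forms l1(u) = u1 − αu2 and l2(u) = u2 (determinant 1), and the bounds c1 = 1/Q, c2 = Q. The sketch below finds the promised lattice point by brute force; the function name and the choice α = √2 are illustrative assumptions.

```python
from math import sqrt


def small_lattice_point(alpha, Q):
    """Search Z^2 for u = (p, q) != 0 with |p - alpha*q| <= 1/Q and |q| <= Q."""
    for q in range(1, Q + 1):
        p = round(alpha * q)  # the best integer p for this q
        if abs(p - alpha * q) <= 1 / Q:
            return (p, q)
    return None


print(small_lattice_point(sqrt(2), 10))
```

Restricting to q ≥ 1 loses nothing: a point with q = 0 would need |p| ≤ 1/Q, and the set K is symmetric.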


8.3 Norm and Trace

Let L/K be a finite field extension. Then L is in particular a finite dimensional K-vector space. We consider ξ ∈ L and define $T_\xi \in \operatorname{End}_K(L)$ by $T_\xi(x) = \xi x$ and let

$$N_{L/K}(\xi) := \det(T_\xi)$$

$$\operatorname{Tr}_{L/K}(\xi) := \operatorname{Tr}(T_\xi).$$

The norm and trace enjoy the following properties:

(i) NL/K(ξη) = NL/K(ξ)NL/K(η),

(ii) NL/K(1) = 1,

(iii) TrL/K(ξ + η) = TrL/K(ξ) + TrL/K(η).

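Therefore the norm and trace are homomorphisms from $L^{*}$ to $K^{*}$ and from L to K respectively.

According to the definition $T_\xi(x) = \xi x$, which means $x \in \ker(\xi\,\mathrm{id} - T_\xi)$ and hence $\det(\xi\,\mathrm{id} - T_\xi) = 0$. Therefore ξ is a root of

$$\det(T\,\mathrm{id} - T_\xi) = T^{n} - \operatorname{Tr}_{L/K}(\xi)T^{n-1} + \cdots + (-1)^{n}N_{L/K}(\xi) = \prod_{\sigma\in\operatorname{Hom}(L/K)}(T-\sigma\xi).$$

Hence a theorem.

Theorem A.6. For separable extensions L/K we have,

(i) $\det(T\,\mathrm{id} - T_\xi) = \prod_{\sigma\in\operatorname{Hom}(L/K)}(T-\sigma\xi)$,

(ii) $N_{L/K}(\xi) = \prod_{\sigma}\sigma\xi$,

(iii) $\operatorname{Tr}_{L/K}(\xi) = \sum_{\sigma}\sigma\xi$.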
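Theorem A.6 can be illustrated for L = Q(√2) over K = Q. In the basis (1, √2), multiplication by ξ = a + b√2 has an explicit matrix, and its determinant and trace should match the product and sum of the two conjugates a ± b√2. The helper names and the sample element 3 + √2 are choices made for this sketch.

```python
from math import sqrt


def multiplication_matrix(a, b):
    """Matrix of T_xi for xi = a + b*sqrt(2) in the basis (1, sqrt 2); columns are images."""
    # xi * 1 = a + b*sqrt(2) and xi * sqrt(2) = 2b + a*sqrt(2).
    return [[a, 2 * b], [b, a]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def trace2(m):
    return m[0][0] + m[1][1]


a, b = 3, 1
T_xi = multiplication_matrix(a, b)
conjugates = [a + b * sqrt(2), a - b * sqrt(2)]

print(det2(T_xi), conjugates[0] * conjugates[1])
print(trace2(T_xi), conjugates[0] + conjugates[1])
```

For ξ = 3 + √2 this gives norm 7 and trace 6, in agreement with (3 + √2)(3 − √2) = 7.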

The following statements are left to the reader as an exercise.

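(i) M ⊇ L ⊇ K ⇒ $\operatorname{Tr}_{M/K} = \operatorname{Tr}_{L/K}\circ\operatorname{Tr}_{M/L}$,

(ii) $N_{M/K} = N_{L/K}\circ N_{M/L}$,

(iii) If L/K is separable, the function

$$\operatorname{Tr}_{L/K} : L\times L \to K, \qquad (\xi,\eta)\mapsto\operatorname{Tr}(\xi\eta)$$

is non-degenerate, symmetric and bilinear.

(iv) If L ⊇ Ks ⊇ K, where Ks is the separable closure of K in L and L ⊃ Ks is purely inseparable, then for ξ ∈ Ks

$$\operatorname{Tr}_{L/K}(\xi) = [L:K_s]\operatorname{Tr}_{K_s/K}(\xi)$$

and

$$N_{L/K}(\xi) = N_{K_s/K}(\xi)^{[L:K_s]}.$$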

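Now let K be an algebraic number field. Then the norm of an ideal $\mathfrak{a} \subseteq \mathcal{O}_K$ is defined by

$$N_K(\mathfrak{a}) := [\mathcal{O}_K : \mathfrak{a}].$$

For ξ ∈ $\mathcal{O}_K$ we set $N_K(\xi) := N_K(\xi\mathcal{O}_K)$; this definition agrees with the one given earlier.

Since $\mathcal{O}_K$ is a Dedekind domain,

$$\mathfrak{a} = \prod\mathfrak{p}^{\nu_{\mathfrak{p}}(\mathfrak{a})}$$

and by the Chinese Remainder theorem it follows that

$$\mathcal{O}_K/\mathfrak{a} \cong \prod\mathcal{O}_K/\mathfrak{p}^{\nu_{\mathfrak{p}}(\mathfrak{a})},$$

hence

$$N_K(\mathfrak{a}) = \prod N_K(\mathfrak{p}^{\nu_{\mathfrak{p}}(\mathfrak{a})}).$$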

Lemma A.7. With the above notation,

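$$\mathfrak{p}^{n}/\mathfrak{p}^{n+1} \cong \mathcal{O}/\mathfrak{p}.$$

Proof. There are isomorphisms,

$$\mathcal{O}/\mathfrak{p} \cong \mathcal{O}_{\mathfrak{p}}/\mathfrak{p}\mathcal{O}_{\mathfrak{p}} \cong \mathcal{O}_{\mathfrak{p}}/\pi\mathcal{O}_{\mathfrak{p}}$$

and

$$\mathfrak{p}^{n}/\mathfrak{p}^{n+1} \cong \mathfrak{p}^{n}\mathcal{O}_{\mathfrak{p}}/\mathfrak{p}^{n+1}\mathcal{O}_{\mathfrak{p}} \cong \pi^{n}\mathcal{O}_{\mathfrak{p}}/\pi^{n+1}\mathcal{O}_{\mathfrak{p}}.$$

The final term is an $\mathcal{O}_{\mathfrak{p}}/\pi\mathcal{O}_{\mathfrak{p}}$-vector space of dimension one. It follows that

$$\mathfrak{p}^{n}/\mathfrak{p}^{n+1} \cong \mathcal{O}/\mathfrak{p}.$$

The immediate conclusion gives that $N_K(\mathfrak{p}^{n}) = N_K(\mathfrak{p})^{n}$, hence

$$N_K(\mathfrak{a}) = \prod N_K(\mathfrak{p})^{\nu_{\mathfrak{p}}(\mathfrak{a})}.$$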


8.4 Ramification and discriminants

Let R be a Dedekind domain with fraction field K, and furthermore let L ⊇ K be a finite, separable extension containing R′, the integral closure of R in L.

Definition A.8. Let P be a nonzero prime ideal in R′ and P ∩ R = p. We say that P is ramified over R if either e(P) > 1 or k(P) is non-separable over k(p). One says the prime ideal p is ramified over R′ if there is a prime ideal P in R′ which lies over p and is ramified over R.

Let V be a vector space, B = {v1, . . . , vn} a basis, and β : V × V → K a bilinear form. We set

$$d(B) := \det(\beta(v_i,v_j))$$

and call this the discriminant of the basis B with respect to β. If L/K is a finite field extension of number fields and a ⊂ L is a finitely generated $\mathcal{O}_K$-module, then we choose a free submodule $a_1\mathcal{O}_K+\cdots+a_n\mathcal{O}_K$ and examine the discriminant

$$d = \det(\operatorname{Tr}_{L/K}(a_i a_j)).$$

The ideal generated in $\mathcal{O}_K$ by all such d is called the discriminant of a and is denoted by $\delta_{L/K}(a)$. In the particular case when a = $\mathcal{O}_L$, one calls $\delta_{L/K} = d(\mathcal{O}_L)$ the discriminant of L/K. It is an ideal in $\mathcal{O}_K$.

Proposition A.9. Let S be a multiplicative set in $\mathcal{O}_K$. Then

$$\delta_{L/K}(a)_S = \delta_{L/K}(a_S).$$

Proof. A basis of a is a basis of $a_S$, and every basis $B_S \subseteq a_S$ defines a basis of a by multiplying by an element s ∈ S. From this follows the conclusion, because

$$d_{sB} = s^{2[L:K]}d_B.$$

The relation between ramification and discriminant is given through the following theorem.

Theorem A.10. The prime ideals of R which are ramified in R′ are exactly those ideals which divide the discriminant $\delta_{L/K}$.

Proof. Let p ⊂ $\mathcal{O}_K$ be a prime ideal, and S = $\mathcal{O}_K \setminus$ p. By Proposition A.9, we can localize at S and consider the rings R = $S^{-1}\mathcal{O}_K$ as well as R′ = $S^{-1}\mathcal{O}_L$. Then p is ramified in $\mathcal{O}_L$ exactly when pR is ramified in R′. The ring R is a discrete valuation ring and therefore a principal ideal ring. For this reason, R′ is a free R-module which can be written as R′ = x1R ⊕ · · · ⊕ xrR, etc. (see Gerald J. Janusz, Algebraic Number Fields).


Minkowski Space1

Let F : C → C be complex conjugation, i.e., Gal(C/R) = {1, F}, and Σ = Hom_Q(K, C). Then we obtain the embedding

$$j : K \to \Gamma(\Sigma,\mathbb{C}) \cong K\otimes_{\mathbb{Q}}\mathbb{C}, \qquad x\mapsto j(x),\quad j(x)(\tau) = \tau(x),\ \tau\in\Sigma,$$

where Γ(Σ, C) = {z : Σ → C}. The complex conjugation F operates on Σ by τ ↦ F(τ) = F ∘ τ and on Γ(Σ, C) by z ↦ F(z) via F(z)(τ) = F(z(F(τ))) = (F ∘ z ∘ F)(τ). The space Γ(Σ, C) possesses a non-trivial involution x ↦ F(x), and we obtain an eigenspace decomposition,

$$\Gamma(\Sigma,\mathbb{C}) = \Gamma(\Sigma,\mathbb{C})^{+}\oplus\Gamma(\Sigma,\mathbb{C})^{-}$$

into the eigenspaces with eigenvalues 1 and −1 respectively. For z ∈ Γ(Σ, C)⁺ we have z(F(τ)) = F(z(τ)), in other words, z and F commute. The embedding j factors through Γ(Σ, C)⁺ because

$$\begin{aligned}
F(j(x))(\tau) &= (F\circ j(x)\circ F)(\tau)\\
&= F(j(x)(F(\tau)))\\
&= F(F(\tau)(x))\\
&= (F\circ F\circ\tau)(x)\\
&= \tau(x)\\
&= j(x)(\tau)
\end{aligned}$$

for an arbitrary τ ∈ Σ, and therefore j(x) ∈ Γ(Σ, C)⁺. This space together with the Euclidean scalar product that we now introduce is called the Minkowski space.

We define a sesquilinear form on Γ := Γ(Σ, C) by

$$(x,y)\in\Gamma\times\Gamma \mapsto \langle x,y\rangle := \sum_{\tau}x(\tau)F(y(\tau)).$$

It satisfies the property that F〈x, y〉 = 〈y, x〉 and

$$\begin{aligned}
\langle Fx,Fy\rangle &= \sum_{\tau}F(x)(\tau)\,F(F(y)(\tau))\\
&= \sum_{\tau}(F\circ x\circ F)(\tau)\,(F\circ(F\circ y\circ F))(\tau)\\
&= F\left(\sum_{\tau}x(\tau)F(y(\tau))\right) = F\langle x,y\rangle,
\end{aligned}$$

1J. Neukirch, Algebraische Zahlentheorie, III.4


that means F commutes with Tr, so that,

Tr(x) = Tr(F(x)) = F(Tr(x))

for x ∈ Γ(Σ, C)⁺, and therefore Tr ∘ j = Tr_{K/Q} as claimed.

We now construct a canonical measure on (Γ(Σ, C)⁺, 〈, 〉). Assume the volume of an orthonormal parallelepiped P is Vol(P) = 1. Let b1, . . . , bn be a basis of Γ(Σ, C)⁺, and let

$$P(b_1,\dots,b_n) = \{x_1b_1+\cdots+x_nb_n \mid 0\le x_i<1\}$$

be the parallelepiped defined by the basis. The transformation

$$b_i = \sum_{k}a_{ik}e_k, \qquad 1\le i\le n$$

defines a matrix A := (a_{ik}), and we obtain

$$\begin{aligned}
\operatorname{Vol}(P(b_1,\dots,b_n))^{2} &= \det(\langle b_i,b_j\rangle)\\
&= \det\left(\left\langle\sum_{k}a_{ik}e_k,\sum_{l}a_{jl}e_l\right\rangle\right)\\
&= \det\left(\sum_{k}a_{ik}a_{jk}\right) = \det(AA^{T}).
\end{aligned}$$

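The basis b1, . . . , bn generates the lattice,

Λ = Zb1 + · · · + Zbn ⊆ Γ(Σ, C)⁺

and we define the lattice volume to be,

d(Λ) := Vol(P(b1, . . . , bn)).

Example A.11. $\mathcal{O}_K$ is a free Z-module of the form

$$\mathcal{O}_K = \mathbb{Z}\omega_1+\cdots+\mathbb{Z}\omega_n,$$

and therefore

$$j(\mathcal{O}_K) = \mathbb{Z}j(\omega_1)+\cdots+\mathbb{Z}j(\omega_n).$$

It follows that

$$d(\mathcal{O}_K)^{2} = \det(\langle j\omega_k,j\omega_l\rangle) = |d_K| = |\det(\operatorname{Tr}_{K/\mathbb{Q}}(\omega_k\omega_l))|.$$

For an ideal a ⊂ $\mathcal{O}_K$, Λ := j(a) ⊆ Γ(Σ, C)⁺ is a sublattice of j($\mathcal{O}_K$), and one obtains the relation,

$$d(a) = N(a)\,d(\mathcal{O}_K) = N(a)\sqrt{|d_K|}.$$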
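Example A.11 can be made concrete for K = Q(√2), where ω1 = 1, ω2 = √2 is an integral basis and both embeddings are real. The sketch below compares the Gram determinant of j(ω1), j(ω2) with the determinant of the trace form; the pair representation (a, b) for a + b√2 and the helper names are assumptions of this illustration.

```python
from math import sqrt

# Elements a + b*sqrt(2) of Q(sqrt 2) are stored as pairs (a, b).
basis = [(1, 0), (0, 1)]  # omega_1 = 1, omega_2 = sqrt(2)


def j(element):
    """Image under j: the values of the two real embeddings sqrt(2) -> +-sqrt(2)."""
    a, b = element
    return (a + b * sqrt(2), a - b * sqrt(2))


def trace_of_product(u, v):
    """Tr_{K/Q}(u*v), computed exactly: (a + b r)(c + d r) = (ac + 2bd) + (ad + bc) r."""
    a, b = u
    c, d = v
    return 2 * (a * c + 2 * b * d)


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Both embeddings are real, so the conjugation F in the scalar product acts trivially.
gram = [[sum(x * y for x, y in zip(j(u), j(v))) for v in basis] for u in basis]
trace_form = [[trace_of_product(u, v) for v in basis] for u in basis]

print(det2(gram), det2(trace_form), sqrt(det2(trace_form)))
```

Both determinants equal 8, the discriminant of Q(√2), so the lattice volume of j(O_K) is √8.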


The logarithmic Minkowski space2

2J. Neukirch, Algebraische Zahlentheorie, I.7
