
A Proof of Lagrange’s Four Square Theorem

Using Quaternion Algebras

Drew Stokesbary

Spring 2007

Abstract

Many prime numbers can be expressed as a sum of the squares of two other numbers. This paper explores which numbers can be written as a sum of the squares of four numbers. This question is deeply related to a number system known as quaternion algebra, which will be developed in this paper to describe what numbers can be written as the sum of four squares.

In 1770, Joseph Louis Lagrange proved that every positive integer can be expressed as a sum of the squares of four integers, which has henceforth been called Lagrange’s Theorem, and we will eventually prove this very result. In order to reach this conclusion, we will introduce a number system called a Quaternion Algebra, which may be roughly thought of as an extension of the complex numbers or the Gaussian integers.

After introducing the fundamentals of quaternions and exploring some peculiarities of arithmetic in this number system, we will begin traveling along the road towards a proof of Lagrange’s Theorem. We will use the arithmetic of quaternions to examine the notion of a norm and of a unit in this number system. These concepts will allow us to develop a division algorithm for quaternions and prove the existence of a greatest (right-hand) common divisor. Finally, we will see what it means to be a prime quaternion.

All of these properties of quaternions will eventually allow us to prove that every positive integer can be written as a sum of four squares.


1 Quaternion Arithmetic

Formally, we define the quaternions H to be the set of all ordered quadruples (a1, a2, a3, a4), where a1, a2, a3, a4 ∈ R. Addition is defined as

(1) (a1, a2, a3, a4) + (b1, b2, b3, b4) = (a1 + b1, a2 + b2, a3 + b3, a4 + b4),

and multiplication is defined as

(2) (a1, a2, a3, a4) · (b1, b2, b3, b4) = (c1, c2, c3, c4),

where

c1 = a1b1 − a2b2 − a3b3 − a4b4,

c2 = a1b2 + a2b1 + a3b4 − a4b3,

c3 = a1b3 − a2b4 + a3b1 + a4b2,

c4 = a1b4 + a2b3 − a3b2 + a4b1.

(3)

We use Greek letters to represent quaternions, and if α = (a1, a2, a3, a4), then we say that a1, a2, a3, and a4 are the coordinates of α.

Definition. If α = (a1, a2, a3, a4) and β = (b1, b2, b3, b4), then α and β are equal if (a1, a2, a3, a4) = (b1, b2, b3, b4).

The equation $x^2 = -1$ has the quadruples (0, 1, 0, 0), (0, 0, 1, 0), and (0, 0, 0, 1) as solutions. We denote these solution quadruples by the symbols i, j, and k, respectively. In other words,

i = (0, 1, 0, 0),(4)

j = (0, 0, 1, 0),(5)

and k = (0, 0, 0, 1).(6)

Although

(7) $i^2 = j^2 = k^2 = -1$,

we can see from the definition of equality that

(8) $i \neq j$, $j \neq k$, and $i \neq k$.


From the definition of multiplication, i, j, and k also satisfy

jk = i = −kj,

ki = j = −ik,

ij = k = −ji.

(9)

From this fact, we can see that multiplication of quaternions is not commutative. This peculiarity about quaternions is what sets them apart from other number systems.

If α = (a1, a2, a3, a4), then we can also write α = a1 + a2i + a3j + a4k. In fact, the aforementioned definition of multiplication comes from expanding the product (a1 + a2i + a3j + a4k)(b1 + b2i + b3j + b4k) according to the distributive property and the multiplicative definitions of i, j, and k.
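The multiplication rule can be checked mechanically. The following is a minimal sketch, assuming quaternions are stored as plain 4-tuples; the function names `qmul` and `qneg` are our own illustrative choices and not part of the paper. It transcribes equation (3) and confirms the relations in equations (7) and (9).

```python
def qmul(a, b):
    """Multiply two quaternions given as 4-tuples, following equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def qneg(a):
    """Negate every coordinate of a quaternion."""
    return tuple(-x for x in a)


ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

# Equation (7): each of i, j, k squares to -1.
squares = [qmul(u, u) for u in (I, J, K)]

# Equation (9): ij = k but ji = -k, so multiplication is not commutative.
ij, ji = qmul(I, J), qmul(J, I)
```

Here `ij` and `ji` differ only in sign, which is exactly the failure of commutativity described above.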

In addition to the set H of quaternions, which we defined as

(10) H = {(a1, a2, a3, a4) | a1, a2, a3, a4 ∈ R},

we can define other sets of quaternions, such as

HQ = {(a1, a2, a3, a4) | a1, a2, a3, a4 ∈ Q},(11)

and HZ = {(a1, a2, a3, a4) | a1, a2, a3, a4 ∈ Z}.(12)

We will now consider the set H′ ⊂ H, which we define as

(13) $H' = \{(a_1, a_2, a_3, a_4) \mid a_1, a_2, a_3, a_4 \in \mathbb{Z} \text{ or } a_1, a_2, a_3, a_4 \in \tfrac{1}{2}\mathbb{Z}_{\text{Odd}}\}$,

where $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$ is the set of all odd integers divided by 2. The reasoning for defining this odd (no pun intended!) set of quaternions will soon become clear, but before we explore some of the deeper properties of quaternions, we will first show that the set H′ is closed under addition and multiplication, and that the set is a non-commutative ring.

Theorem 1. H′ is closed under addition.

Proof. Suppose α, β ∈ H′ and

α = a1 + a2i + a3j + a4k(14)

and β = b1 + b2i + b3j + b4k.(15)

Then, α+β = (a1, a2, a3, a4)+(b1, b2, b3, b4) = (a1+b1, a2+b2, a3+b3, a4+b4).


If the coordinates of α and β are integers, then the coordinates of α + β are also integers, and so α + β ∈ H′.

If the coordinates of α and β are not integers, then they are odd integers divided by 2. The sum of any two odd integers is an even integer, and an even integer divided by two is always an integer. Thus if the coordinates of α and β are not integers, then the coordinates of α + β are integers, and α + β ∈ H′.

If the coordinates of either α or β are integers, and the coordinates of the other are not integers, then the coordinates of α + β are the sums of an integer and half of an odd integer. Each of these sums will be half of an odd integer, and thus α + β ∈ H′.

Theorem 2. H′ is closed under multiplication.

Proof. First, we will define a special quaternion ξ so that

(16) $\xi = \tfrac{1}{2}(1 + i + j + k)$.

Thus, any integer quaternion can now be written in the manner

(17) $\rho = r_1\xi + r_2 i + r_3 j + r_4 k$,

where r1, r2, r3, r4 ∈ Z. If the coordinates of ρ are integers, then r1 will be even, and if the coordinates of ρ are non-integers, then r1 will be odd. Any quaternion written in this form will be an integer quaternion.

Using the definition of multiplication, we then compute

(18) $\xi^2 = \tfrac{1}{2}(1 + i + j + k) \cdot \tfrac{1}{2}(1 + i + j + k) = \tfrac{1}{2}(-1 + i + j + k) = \xi - 1$,

(19) $\xi i = \tfrac{1}{2}(1 + i + j + k)\, i = \tfrac{1}{2}(-1 + i + j - k) = -\xi + i + j$,

(20) $i \xi = i \cdot \tfrac{1}{2}(1 + i + j + k) = \tfrac{1}{2}(-1 + i - j + k) = -\xi + i + k$,

(21) $\xi j = \tfrac{1}{2}(1 + i + j + k)\, j = \tfrac{1}{2}(-1 - i + j + k) = -\xi + j + k$,

(22) $j \xi = j \cdot \tfrac{1}{2}(1 + i + j + k) = \tfrac{1}{2}(-1 + i + j - k) = -\xi + i + j$,

(23) $\xi k = \tfrac{1}{2}(1 + i + j + k)\, k = \tfrac{1}{2}(-1 + i - j + k) = -\xi + i + k$,

(24) $k \xi = k \cdot \tfrac{1}{2}(1 + i + j + k) = \tfrac{1}{2}(-1 - i + j + k) = -\xi + j + k$.

Each of these products is in the form of equation (17), so each is itself an integer quaternion.

If we take α = (a1, a2, a3, a4) and β = (b1, b2, b3, b4), with α, β ∈ H′, and rewrite each so that it is in the form of equation (17), then by the definition of multiplication and the results of equations (18) through (24), we can see that the product αβ can also be written in the form of equation (17). Therefore, αβ is an integer quaternion, and H′ is indeed closed under multiplication.
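Theorems 1 and 2 can be spot-checked on a finite sample. The sketch below is an illustration only, not a proof: the membership test `in_hurwitz`, the helper names, and the particular sample of coordinates are all our own assumptions. It uses exact rational arithmetic from Python's standard `fractions` module.

```python
from fractions import Fraction
from itertools import product


def in_hurwitz(q):
    """True if all coordinates are integers, or all are odd integers over 2."""
    q = [Fraction(x) for x in q]
    all_integers = all(x.denominator == 1 for x in q)
    # A fraction in lowest terms with denominator 2 has an odd numerator.
    all_half_odd = all(x.denominator == 2 for x in q)
    return all_integers or all_half_odd


def qadd(a, b):
    """Coordinate-wise sum, as in equation (1)."""
    return tuple(x + y for x, y in zip(a, b))


def qmul(a, b):
    """Product of two quaternions, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


# A small, arbitrary sample of H': some integer and some half-odd quadruples.
integer_sample = list(product((-1, 0, 2), repeat=4))
half_odd_sample = list(product((Fraction(-3, 2), Fraction(1, 2)), repeat=4))
sample = integer_sample + half_odd_sample

closed = all(
    in_hurwitz(qadd(a, b)) and in_hurwitz(qmul(a, b))
    for a in sample
    for b in sample
)

# The special quaternion of equation (16) and its square from equation (18).
xi = (Fraction(1, 2),) * 4
xi_squared = qmul(xi, xi)
```

On this sample `closed` comes out true, and `xi_squared` agrees with ξ − 1 from equation (18).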

Since we have shown that H′ is closed under addition and multiplication, we will now show that H′ is a non-commutative ring.

Theorem 3. H′ is a non-commutative ring.

Proof. If we say that α, β, and γ are of the form

α = (a1, a2, a3, a4),(25)

β = (b1, b2, b3, b4),(26)

and γ = (c1, c2, c3, c4),(27)

we can then show that each of the seven conditions for a non-commutative ring holds.

For all α, β ∈ H′, we see from the definition of addition that

α + β = (a1, a2, a3, a4) + (b1, b2, b3, b4)

= (a1 + b1, a2 + b2, a3 + b3, a4 + b4)

= (b1 + a1, b2 + a2, b3 + a3, b4 + a4)

= (b1, b2, b3, b4) + (a1, a2, a3, a4)

= β + α.

(28)

Thus, the commutative law of addition holds.

For all α, β, γ ∈ H′, we see from the definition of addition that

(α + β) + γ = [(a1, a2, a3, a4) + (b1, b2, b3, b4)] + (c1, c2, c3, c4)

= (a1 + b1, a2 + b2, a3 + b3, a4 + b4) + (c1, c2, c3, c4)

= (a1 + b1 + c1, a2 + b2 + c2, a3 + b3 + c3, a4 + b4 + c4)

= (a1, a2, a3, a4) + (b1 + c1, b2 + c2, b3 + c3, b4 + c4)

= (a1, a2, a3, a4) + [(b1, b2, b3, b4) + (c1, c2, c3, c4)]

= α + (β + γ).

(29)

Thus, the associative law of addition holds.

It is clear that (0, 0, 0, 0) ∈ H′. For all α ∈ H′,

(30) (0, 0, 0, 0) + α = (0 + a1, 0 + a2, 0 + a3, 0 + a4) = (a1, a2, a3, a4) = α.


Thus there exists an element which is an additive identity.

Since (a1, a2, a3, a4) ∈ H′, then (−a1, −a2, −a3, −a4) ∈ H′. We can see that (a1, a2, a3, a4) + (−a1, −a2, −a3, −a4) = (a1 − a1, a2 − a2, a3 − a3, a4 − a4) = (0, 0, 0, 0), so every α ∈ H′ has an additive inverse in H′.

For all α, β, γ ∈ H′, we can see from the definition of multiplication that (αβ)γ = α(βγ), so multiplication in H′ is associative.

The element (1, 0, 0, 0) is in the set H′, and from the definition of multiplication, we see that for all α ∈ H′,

(31) (1, 0, 0, 0)α = (1, 0, 0, 0)(a1, a2, a3, a4) = (a1, a2, a3, a4) = α

and

(32) α(1, 0, 0, 0) = (a1, a2, a3, a4)(1, 0, 0, 0) = (a1, a2, a3, a4) = α.

Thus H′ includes an element which is a multiplicative identity.

If α, β, γ ∈ H′, then β + γ = (b1 + c1, b2 + c2, b3 + c3, b4 + c4). Each coordinate of a product in equation (3) is a sum of terms of the form $\pm a_r b_s$, so it is linear in the coordinates of each factor. Substituting $b_s + c_s$ for $b_s$ therefore gives α(β + γ) = αβ + αγ, and in the same way (β + γ)α = βα + γα. Thus multiplication is distributive over addition.

Therefore, each of the seven laws necessary for the set H′ to be a non-commutative ring holds.

Now that various arithmetic properties of quaternions have been established (namely, the definitions of addition and multiplication for quaternions and the fact that the integer quaternions are closed under addition and multiplication and form a non-commutative ring), we can now begin to develop certain intermediate theorems about quaternions which will help us in our quest to prove Lagrange’s Theorem.

2 Quaternion Norms

The norm is an important concept in other number systems like Z[i]. Here we draw on many of the ideas from such number systems in order to develop the idea of a norm for quaternions.

The first idea we will borrow from other number systems is that of the conjugate. Recall that in C, the number (a + bi) has a conjugate, namely, (a − bi). We define something similar for quaternions.

Definition. Let α = a1 + a2i + a3j + a4k. Then $\bar{\alpha}$ = a1 − a2i − a3j − a4k is the conjugate of α.


Note that if α ∈ H′, then clearly $\bar{\alpha}$ ∈ H′. Armed with the conjugate of a quaternion, we can define the norm of a quaternion in the same way as in number systems like Z[i] or C.

Definition. For any quaternion α, $N(\alpha) = \alpha\bar{\alpha}$ is the norm of α.

Lemma 1. Let α = a1 + a2i + a3j + a4k. Then

(33) $N(\alpha) = \alpha\bar{\alpha} = \bar{\alpha}\alpha = a_1^2 + a_2^2 + a_3^2 + a_4^2$.

Proof. If α = a1 + a2i + a3j + a4k, then $\bar{\alpha}$ = a1 − a2i − a3j − a4k and $N(\alpha) = \alpha\bar{\alpha}$. From the definition of multiplication,

(34) $\alpha\bar{\alpha} = (a_1 + a_2 i + a_3 j + a_4 k)(a_1 - a_2 i - a_3 j - a_4 k) = c_1 + c_2 i + c_3 j + c_4 k$,

where

$c_1 = a_1 a_1 - a_2(-a_2) - a_3(-a_3) - a_4(-a_4) = a_1^2 + a_2^2 + a_3^2 + a_4^2$,

$c_2 = a_1(-a_2) + a_2 a_1 + a_3(-a_4) - a_4(-a_3) = 0$,

$c_3 = a_1(-a_3) - a_2(-a_4) + a_3 a_1 + a_4(-a_2) = 0$,

$c_4 = a_1(-a_4) + a_2(-a_3) - a_3(-a_2) + a_4 a_1 = 0$.

(35)

Substituting the results of equation (35) into equation (34), we obtain

(36) $\alpha\bar{\alpha} = a_1^2 + a_2^2 + a_3^2 + a_4^2$.

Similarly, we see that

(37) $\bar{\alpha}\alpha = (a_1 - a_2 i - a_3 j - a_4 k)(a_1 + a_2 i + a_3 j + a_4 k) = c_1 + c_2 i + c_3 j + c_4 k$,

where

$c_1 = a_1 a_1 - (-a_2)a_2 - (-a_3)a_3 - (-a_4)a_4 = a_1^2 + a_2^2 + a_3^2 + a_4^2$,

$c_2 = a_1 a_2 + (-a_2)a_1 + (-a_3)a_4 - (-a_4)a_3 = 0$,

$c_3 = a_1 a_3 - (-a_2)a_4 + (-a_3)a_1 + (-a_4)a_2 = 0$,

$c_4 = a_1 a_4 + (-a_2)a_3 - (-a_3)a_2 + (-a_4)a_1 = 0$.

(38)

After substituting the results of equation (38) into equation (37), we again obtain

(39) $\bar{\alpha}\alpha = a_1^2 + a_2^2 + a_3^2 + a_4^2$.

Therefore,

(40) $N(\alpha) = \alpha\bar{\alpha} = \bar{\alpha}\alpha = a_1^2 + a_2^2 + a_3^2 + a_4^2$.
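Lemma 1 is easy to confirm on a concrete quaternion. This is a small sketch under the assumption that quaternions are 4-tuples; `qmul`, `conj`, and `norm` are hypothetical helper names chosen for illustration.

```python
def qmul(a, b):
    """Product of two quaternions, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: keep the first coordinate, negate the other three."""
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    """Sum of the squares of the coordinates, as in equation (33)."""
    return sum(x * x for x in a)


alpha = (1, 2, 3, 4)
left = qmul(alpha, conj(alpha))    # alpha times its conjugate
right = qmul(conj(alpha), alpha)   # conjugate times alpha
```

Both products have zero i, j, and k coordinates, and their first coordinate equals 1 + 4 + 9 + 16 = 30, matching `norm(alpha)`.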


Remember the seemingly bizarre way we defined the set H′? The reason for constructing the set in such a manner was so that the norm of any element of the set would be an integer.

Lemma 2. Let α ∈ H′ where α = (a1, a2, a3, a4) ≠ 0. Then N(α) ∈ N.

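Proof. Since α ∈ H′, there are two possibilities. Either a1, a2, a3, a4 ∈ Z or a1, a2, a3, a4 ∈ $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$.

First consider the case where a1, a2, a3, a4 ∈ Z. Because multiplication is closed over the integers, we know that

(41) $a_1^2, a_2^2, a_3^2, a_4^2 \in \mathbb{Z}$,

and because addition is closed as well, we know

(42) $a_1^2 + a_2^2 + a_3^2 + a_4^2 \in \mathbb{Z}$.

Now consider the alternative, that a1, a2, a3, a4 ∈ $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$. If we use $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$ to mean the set of all odd integers divided by 2, then the set of all odd integers divided by 4 will be denoted as $\tfrac{1}{4}\mathbb{Z}_{\text{Odd}}$. If $\tfrac{m}{2} \in \tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$, then $(\tfrac{m}{2})^2 = \tfrac{m^2}{4}$. Since m is odd, then $m^2$ is odd, and $\tfrac{m^2}{4} \in \tfrac{1}{4}\mathbb{Z}_{\text{Odd}}$. Thus, if a1, a2, a3, a4 ∈ $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$, then we can see that

(43) $a_1^2, a_2^2, a_3^2, a_4^2 \in \tfrac{1}{4}\mathbb{Z}_{\text{Odd}}$.

Moreover, writing m = 2n + 1 gives $m^2 = 4n(n + 1) + 1$, so the square of an odd integer leaves a remainder of 1 when divided by 4. The four numerators therefore sum to a multiple of 4, and when we add these four numbers from the set $\tfrac{1}{4}\mathbb{Z}_{\text{Odd}}$, we obtain an integer, so that

(44) $a_1^2 + a_2^2 + a_3^2 + a_4^2 \in \mathbb{Z}$.

Therefore, for this case as well, N(α) ∈ Z.

In addition, the square of any real number is non-negative, so N(α) ∈ N ∪ {0}. However, since α ≠ 0, we know it has at least one non-zero coordinate. Thus $a_1^2 + a_2^2 + a_3^2 + a_4^2 > 0$, and N(α) ≠ 0. Therefore, N(α) ∈ N.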
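The half-odd case of Lemma 2 can be illustrated numerically. The sketch below assumes exact fractions from Python's standard `fractions` module and checks a small, arbitrarily chosen range of half-odd coordinates; it is an illustration rather than a substitute for the proof.

```python
from fractions import Fraction
from itertools import product


def norm(a):
    """Sum of the squares of the coordinates, computed exactly."""
    return sum(Fraction(x) ** 2 for x in a)


# All odd integers from -5 to 5, each divided by 2.
half_odds = [Fraction(m, 2) for m in range(-5, 6, 2)]

# Norms of every quadruple built from those coordinates.
norms = [norm(q) for q in product(half_odds, repeat=4)]
all_positive_integers = all(n.denominator == 1 and n > 0 for n in norms)

# One worked instance: (1 + 9 + 25 + 49) / 4 = 84 / 4 = 21.
example = norm((Fraction(1, 2), Fraction(3, 2), Fraction(-5, 2), Fraction(7, 2)))
```

Every norm in the sample is a positive integer, even though each individual square is a quarter of an odd integer.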

Definition. Suppose α ∈ H′. Then we say α is an integer quaternion.

The term integer quaternion comes from the fact that the norm of a quaternion in the set H′ is an integer, which was proven in Lemma 2.


Definition. Suppose α ∈ H′. Then we say α is odd if N(α) is odd and even if N(α) is even.

In order to prove deeper results about quaternions, we need the norm to have the property that N(αβ) = N(α)N(β). Before we can prove this fact, however, we must first prove the following lemma, which has some lengthy arithmetic.

Lemma 3. Let α = a1 + a2i + a3j + a4k and β = b1 + b2i + b3j + b4k. Then $\overline{\alpha\beta} = \bar{\beta}\bar{\alpha}$.

Proof. By the definition of multiplication,

(45) αβ = c1 + c2i + c3j + c4k,

where

c1 = a1b1 − a2b2 − a3b3 − a4b4,

c2 = a1b2 + a2b1 + a3b4 − a4b3,

c3 = a1b3 − a2b4 + a3b1 + a4b2,

c4 = a1b4 + a2b3 − a3b2 + a4b1.

(46)

By definition of the conjugate,

(47) $\overline{\alpha\beta} = c_1 - c_2 i - c_3 j - c_4 k$.

After establishing what $\overline{\alpha\beta}$ looks like, we now turn to $\bar{\beta}\bar{\alpha}$. By the definition of the conjugate, we see that

(48) $\bar{\alpha} = a_1 - a_2 i - a_3 j - a_4 k$

and

(49) $\bar{\beta} = b_1 - b_2 i - b_3 j - b_4 k$,

and by the definition of multiplication, we have

(50) $\bar{\beta}\bar{\alpha} = c'_1 + c'_2 i + c'_3 j + c'_4 k$,

where

c′1 = b1a1 − (−b2)(−a2) − (−b3)(−a3) − (−b4)(−a4),

c′2 = b1(−a2) + (−b2)a1 + (−b3)(−a4) − (−b4)(−a3),

c′3 = b1(−a3) − (−b2)(−a4) + (−b3)a1 + (−b4)(−a2),

c′4 = b1(−a4) + (−b2)(−a3) − (−b3)(−a2) + (−b4)a1.

(51)


By simple algebraic manipulation, we can transform equation (51) as

c′1 = b1a1 − b2a2 − b3a3 − b4a4 = a1b1 − a2b2 − a3b3 − a4b4,

c′2 = −b1a2 − b2a1 + b3a4 − b4a3 = −a1b2 − a2b1 − a3b4 + a4b3,

c′3 = −b1a3 − b2a4 − b3a1 + b4a2 = −a1b3 + a2b4 − a3b1 − a4b2,

c′4 = −b1a4 + b2a3 − b3a2 − b4a1 = −a1b4 − a2b3 + a3b2 − a4b1.

(52)

Notice that this makes

c′1 = c1, c′2 = −c2, c′3 = −c3, and c′4 = −c4.(53)

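Substituting equation (53) into equation (50), we find that

(54) $\bar{\beta}\bar{\alpha} = c_1 - c_2 i - c_3 j - c_4 k$.

From equations (47) and (54), it is clear that

(55) $\overline{\alpha\beta} = \bar{\beta}\bar{\alpha}$.

Now our desired property of the norm, that N(αβ) = N(α)N(β), follows easily from Lemma 3.

Lemma 4. Let α and β be quaternions. Then N(αβ) = N(α)N(β).

Proof. Since N(β) is a real number, it commutes with every quaternion, so $N(\alpha\beta) = \alpha\beta\,\overline{\alpha\beta} = \alpha\beta\bar{\beta}\bar{\alpha} = \alpha N(\beta)\bar{\alpha} = \alpha\bar{\alpha}N(\beta) = N(\alpha)N(\beta)$.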
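Lemmas 3 and 4 can be seen at work on a pair of integer quaternions. In the sketch below the helper names and the two sample quaternions are our own choices. Read coordinate by coordinate, the norm identity says that a product of two sums of four squares is again a sum of four squares.

```python
def qmul(a, b):
    """Product of two quaternions, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: keep the first coordinate, negate the other three."""
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    """Sum of the squares of the coordinates."""
    return sum(x * x for x in a)


alpha = (1, 2, 3, 4)     # norm 30
beta = (2, -1, 0, 3)     # norm 14
prod = qmul(alpha, beta)

# Lemma 3: the conjugate of a product reverses the order of the factors.
lemma3_holds = conj(prod) == qmul(conj(beta), conj(alpha))

# Lemma 4: the norm of the product is the product of the norms.
lemma4_holds = norm(prod) == norm(alpha) * norm(beta)
```

For these inputs `prod` is (−8, 12, −4, 14), and 64 + 144 + 16 + 196 = 420 = 30 · 14.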

3 Quaternion Units

We now turn to units, an idea which has a place in nearly every number system imaginable. In N, 1 is a unit; in Z, 1 and −1 are units; in Z[i], there are four units: 1, −1, i, and −i; and in Zp, every non-zero element is a unit. Yet, despite the existence of different units in different number systems, the definition of a unit remains the same in each. We will take this same definition and apply it to quaternions.

Definition. We say ε ∈ H′ is a unit if there is a quaternion α ∈ H′ so that εα = αε = 1.


In number systems which have the concept of a norm, such as Z[i], a unit u has the property that N(u) = 1. To show this is true for quaternions, we must first define the inverse of a quaternion.

Definition. Suppose α is nonzero and α ∈ H′. Then the inverse of α, written $\alpha^{-1}$, is

(56) $\alpha^{-1} = \dfrac{\bar{\alpha}}{N(\alpha)}$.

We saw that a quaternion α ∈ H′ has a conjugate $\bar{\alpha}$ ∈ H′ and a norm N(α) ∈ Z. Thus we can see that it has an inverse $\alpha^{-1}$ ∈ HQ. However, if we know that N(α) = 1, then $\alpha^{-1} = \bar{\alpha}$, so $\alpha^{-1}$ ∈ H′. This fact is the gateway for our discussion of units.

Lemma 5. α is a unit if and only if N(α) = 1.

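Proof. Suppose first that N(α) = 1. Then by equation (56), $\alpha^{-1} = \bar{\alpha}$, which is an element of H′. From equations (33) and (56), we see

(57) $\alpha\alpha^{-1} = \dfrac{\alpha\bar{\alpha}}{N(\alpha)} = \dfrac{N(\alpha)}{N(\alpha)} = 1$

and

(58) $\alpha^{-1}\alpha = \dfrac{\bar{\alpha}\alpha}{N(\alpha)} = \dfrac{N(\alpha)}{N(\alpha)} = 1$.

It is here we see that α and $\alpha^{-1}$ are both units.

Conversely, suppose α is a unit, so that αε = 1 for some ε ∈ H′. Multiplying on the left by $\alpha^{-1}$ shows that ε = $\alpha^{-1}$, so $\alpha^{-1}$ ∈ H′. Now we can take the norm of both sides of equation (57) or (58) to obtain

(59) $N(\alpha\alpha^{-1}) = N(1)$,

which by Lemma 4 we know is

(60) $N(\alpha)N(\alpha^{-1}) = 1$.

By Lemma 2, both N(α) and $N(\alpha^{-1})$ are positive integers, so the only solution to equation (60) is

(61) $N(\alpha) = N(\alpha^{-1}) = 1$.

Therefore, it is indeed the case that α is a unit if and only if N(α) = 1.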


Since we know that α ∈ H′ must have coordinates in either the set Z or $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$, it would seem reasonable to conjecture that there are only a finite number of units. Thus we will now take a moment to examine exactly how many quaternions are units in H′, and also what they are.

Theorem 4. There are 24 units in H′.

Proof. Say α = a1 + a2i + a3j + a4k and α ∈ H′. From Lemma 5, if α is a unit, then N(α) = 1. Since α ∈ H′, then either a1, a2, a3, a4 ∈ Z or a1, a2, a3, a4 ∈ $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$.

If a1, a2, a3, a4 ∈ Z, then we know $a_1^2, a_2^2, a_3^2, a_4^2 \in \mathbb{N} \cup \{0\}$. Since N(α) = 1, then we can see that exactly one of a1, a2, a3, and a4 must equal ±1, while the other three must equal 0. This provides for a total of 8 possible units.

On the other hand, if a1, a2, a3, a4 ∈ $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$, then we can observe that $a_1^2, a_2^2, a_3^2, a_4^2 \in \tfrac{1}{4}\mathbb{Z}_{\text{Odd}}$, so each square is at least $\tfrac{1}{4}$. If N(α) = 1, then the only solution in fourths of odd integers comes when $a_1^2 = a_2^2 = a_3^2 = a_4^2 = \tfrac{1}{4}$. Thus, $a_1 = \pm\tfrac{1}{2}$, $a_2 = \pm\tfrac{1}{2}$, $a_3 = \pm\tfrac{1}{2}$, and $a_4 = \pm\tfrac{1}{2}$. These values for a1, a2, a3, and a4 present 16 possible units.

Therefore, we have 8 units when the coordinates of the units are integers and 16 units when the coordinates are non-integers, for a total of 24 different units in H′.

Now knowing the coordinates of all 24 units, we immediately reach the following corollary.

Corollary 1. The units in H′ are:

$\pm 1$, $\pm i$, $\pm j$, $\pm k$, and $\tfrac{1}{2}(\pm 1 \pm i \pm j \pm k)$.
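The count in Theorem 4 can be reproduced by direct enumeration. The sketch below relies on Lemma 5 (a unit is exactly an element of norm 1) and restricts the search to coordinates in {−1, 0, 1} and {−1/2, 1/2}, since any coordinate of larger absolute value already forces the norm above 1. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product


def norm(a):
    """Sum of the squares of the coordinates."""
    return sum(x * x for x in a)


# Integer case: coordinates drawn from -1, 0, 1.
integer_units = [q for q in product((-1, 0, 1), repeat=4) if norm(q) == 1]

# Half-odd case: coordinates drawn from -1/2 and 1/2.
halves = (Fraction(-1, 2), Fraction(1, 2))
half_odd_units = [q for q in product(halves, repeat=4) if norm(q) == 1]

units = integer_units + half_odd_units
```

The two lists have 8 and 16 entries respectively, giving the 24 units of Corollary 1.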

Another concept for quaternions which we will borrow from other number systems is the associate. The following definition for the associate of a quaternion is identical to the definition for associates in number systems such as Z[i].

Definition. Let α be a quaternion. If ε ∈ H′ is a unit, then εα and αε are called associates of α. If β = εα, then it is said that β is an associate of α, written β ∼ α.

We will now prove four lemmas which will be immensely helpful in many of the more difficult proofs which lie ahead.


Lemma 6. Suppose α, β ∈ H′. If α ∼ β, then N(α) = N(β).

Proof. From the definition of associates, there is a unit ε ∈ H′ such that

(62) α = εβ.

Taking the norm of equation (62), we obtain

(63) N(α) = N(εβ) = N(ε)N(β)

But from Lemma 5, N(ε) = 1, so therefore equation (63) reduces to

(64) N(α) = N(β),

as desired.

Lemma 7. If α ∼ β and β ∼ γ, then α ∼ γ.

Proof. From the definition of associates, if α ∼ β and β ∼ γ, then there exist units ε1 and ε2 so that

α = ε1β and β = ε2γ.(65)

Combining these two equations, we see that

(66) α = (ε1ε2)γ.

What is the product ε1ε2? Since ε1 and ε2 are themselves units, it would seem that ε1ε2 is also a unit. We can verify their product is a unit as well by taking the norm:

(67) N(ε1ε2) = N(ε1)N(ε2) = 1 · 1 = 1.

By Lemma 5, since N(ε1ε2) = 1, we see that indeed ε1ε2 is also a unit. Knowing this, we can refer back to equation (66) and see that α ∼ γ.

Lemma 8. α ∼ β if and only if β ∼ α.

Proof. If α ∼ β, then there exists a unit ε so that

(68) α = εβ.


We can multiply both sides of equation (68) by $\varepsilon^{-1}$, the inverse of ε, so that

(69) $\varepsilon^{-1}\alpha = \varepsilon^{-1}\varepsilon\beta$.

Recall from the discussion of inverses that $\varepsilon^{-1}\varepsilon = 1$ and that if ε is a unit, then $\varepsilon^{-1}$ is also a unit. Thus,

(70) $\beta = \varepsilon^{-1}\alpha$,

and since $\varepsilon^{-1}$ is a unit, then we can say β ∼ α. Finally, the converse can be proven by merely switching α and β.

Lemma 9. If α ∈ H′ and β ∼ α, then β ∈ H′.

Proof. If β ∼ α, then there exists a unit ε ∈ H′ such that

(71) β = εα.

From Theorem 2, we know that H′ is closed under multiplication, that is, the product of two elements of H′ is itself an element of H′. Thus, from equation (71), we can easily see that β ∈ H′.

The following theorem has a difficult proof and, while the result is certainly true, may appear to be meaningless and irrelevant. However, this could not be further from the truth, as we will later see that this theorem provides a crucial link in the proofs of what numbers can be written as a sum of four integer squares.

Theorem 5. If α = a1 + a2i + a3j + a4k ∈ H′, then there is β = b1 + b2i + b3j + b4k so that β ∼ α and β ∈ HZ.

Proof. It is given that α ∈ H′, so we know either a1, a2, a3, a4 ∈ Z or a1, a2, a3, a4 ∈ $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$.

The simple case is when a1, a2, a3, a4 ∈ Z. Let ε be a unit such that ε = 1. Take

(72) β = εα.

Thus, β ∼ α. Furthermore, from the definition of multiplication, we see that

(73) εα = α,


so

(74) β = α.

From the definition of equal quaternions, the coordinates of β are the same as those of α, and thus β ∈ HZ and β ∼ α.

The other case is when a1, a2, a3, a4 ∈ $\tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$. Through simple algebra, we can manipulate the terms of α into the form δ + γ, where

(75) $\delta = d_1 + d_2 i + d_3 j + d_4 k$, $d_1, d_2, d_3, d_4 \in \mathbb{Z}_{\text{Even}}$, and $\gamma = \tfrac{1}{2}(\pm 1 \pm i \pm j \pm k)$,

so that

(76) $\alpha = \delta + \gamma$.

From Corollary 1, we know γ is a unit, as is $\bar{\gamma}$, so

(77) $\gamma\bar{\gamma} = 1$.

Because each of d1, d2, d3, and d4 is even, according to the definition of multiplication, the coordinates of $\delta\bar{\gamma}$ will be integers. Since $\bar{\gamma}$ is a unit, by taking $\beta = \alpha\bar{\gamma}$, it is plain that β ∼ α. It follows from equation (76) that

(78) $\beta = \alpha\bar{\gamma} = (\delta + \gamma)\bar{\gamma} = \delta\bar{\gamma} + \gamma\bar{\gamma}$.

Therefore, because $\delta\bar{\gamma}$ has integer coordinates and $\gamma\bar{\gamma} = 1$, the coordinates of $\delta\bar{\gamma} + \gamma\bar{\gamma}$, and thus β, are integers. Therefore, β ∼ α and β ∈ HZ.
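The construction in the proof of Theorem 5 can be carried out explicitly. The following is a sketch under our own assumptions: quaternions are 4-tuples of exact fractions, the function name `integer_associate` is hypothetical, and the sign of each coordinate of γ is chosen by looking at the odd numerator modulo 4 so that the leftover δ has even coordinates. As in the proof, the unit multiplies α on the right.

```python
from fractions import Fraction


def qmul(a, b):
    """Product of two quaternions, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: keep the first coordinate, negate the other three."""
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    """Sum of the squares of the coordinates."""
    return sum(x * x for x in a)


def integer_associate(alpha):
    """Return an associate of alpha (an element of H') with integer coordinates."""
    alpha = tuple(Fraction(x) for x in alpha)
    if all(x.denominator == 1 for x in alpha):
        return alpha  # the unit 1 already works
    # Each coordinate is m/2 with m odd. Choosing s = +1 when m = 1 (mod 4)
    # and s = -1 when m = 3 (mod 4) makes m/2 - s/2 an even integer.
    gamma = tuple(
        Fraction(1, 2) if x.numerator % 4 == 1 else Fraction(-1, 2)
        for x in alpha
    )
    return qmul(alpha, conj(gamma))


alpha = (Fraction(1, 2), Fraction(3, 2), Fraction(-5, 2), Fraction(7, 2))
beta = integer_associate(alpha)
```

For this α the unit is γ = ½(1 − i − j − k), the even part is δ = 2i − 2j + 4k, and the resulting β has integer coordinates and the same norm as α.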

We have seen that norms, units, and associates of quaternions have near-identical definitions to their counterparts living in number systems like Z[i]. In the area of divisors, however, quaternions begin to distinguish themselves from these other number systems. This is due to the fact that, unlike for the Gaussian integers, multiplication of quaternions is not commutative. Naturally then, it makes sense that division in H differs from division in Z[i].

Definition. If α, β, γ ∈ H′ and γ = αβ, then we say α is a left-hand divisor of γ and write α⌉γ, and that β is a right-hand divisor of γ and write β⌈γ.

The distinction between left- and right-hand divisors is necessary because, in general, αβ ≠ βα. As we saw earlier, multiplication is not commutative. For the purposes of this paper, we will work with right-hand divisors, but for consistency’s sake only. Every proof involving a right-hand divisor could easily be modified to prove a similar result using left-hand divisors.

(Note, however, multiplication between an integer and a quaternion is commutative, so if α ∈ Z or β ∈ Z, then αβ = βα, and the distinction between left- and right-hand divisors is unnecessary.)

The following theorem is necessary in order to prove the existence of a division algorithm in H′. In order for long division to be “useful,” repeated long division must terminate; that is, the remainder must be less than the divisor. This theorem proves just that: if there is long division, then the remainder term will be less than the divisor.

Theorem 6. Suppose κ ∈ H′ and m ∈ Z with m ≠ 0. Then there exists λ ∈ H′ so that

(79) $N(\kappa - m\lambda) < m^2$.

Proof. First note that because m ∈ Z, the norm of m is given by $N(m) = m^2$, so it may also be said that we are proving

(80) N(κ − mλ) < N(m).

If κ = (k1, k2, k3, k4), and λ = (l1, l2, l3, l4), then

(81) κ − mλ = (k1 − ml1, k2 − ml2, k3 − ml3, k4 − ml4).

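We want to find when N(κ − mλ) < N(m), which happens when

(82) $(k_1 - ml_1)^2 + (k_2 - ml_2)^2 + (k_3 - ml_3)^2 + (k_4 - ml_4)^2 < m^2$.

Observe that if each $|k_i - ml_i| < |\tfrac{1}{2}m|$, then $(k_i - ml_i)^2 < \tfrac{1}{4}m^2$, and

(83) $(k_1 - ml_1)^2 + (k_2 - ml_2)^2 + (k_3 - ml_3)^2 + (k_4 - ml_4)^2 < 4 \cdot \tfrac{1}{4}m^2 = m^2$.

So we merely need to find each $l_i$ so that

(84) $|k_i - ml_i| < |\tfrac{1}{2}m|$,

which, for m > 0, occurs when

(85) $\dfrac{2k_i - m}{2m} < l_i < \dfrac{2k_i + m}{2m}$.

We can see that $\dfrac{2k_i + m}{2m} - \dfrac{2k_i - m}{2m} = 1$, that is, the interval between $\dfrac{2k_i - m}{2m}$ and $\dfrac{2k_i + m}{2m}$ is 1. This ensures there exists at least one $l_i$ in the interval such that $l_i \in \mathbb{Z}$ or $l_i \in \tfrac{1}{2}\mathbb{Z}_{\text{Odd}}$.

For λ to lie in H′, the four coordinates must all be integers or all be halves of odd integers. If the intervals do not allow such a common choice, we can relax inequality (84) as follows. Let λ′ take each $l_i$ to be the integer nearest $k_i/m$, and let λ″ take each $l_i$ to be the half of an odd integer nearest $k_i/m$. If $d_i \le \tfrac{1}{2}$ is the distance from $k_i/m$ to the nearest integer, then $N(\kappa - m\lambda') + N(\kappa - m\lambda'') = m^2 \sum_{i=1}^{4} [d_i^2 + (\tfrac{1}{2} - d_i)^2] \le m^2$, so the smaller of the two norms is at most $\tfrac{1}{2}m^2$.

Finding each coordinate of λ, l1, l2, l3, and l4, in this manner ensures that λ ∈ H′ and N(κ − mλ) < N(m).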
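One way to make the choice of λ concrete is sketched below. This is an assumption-laden illustration rather than a transcription of the proof: instead of searching the open intervals of (85) coordinate by coordinate, the hypothetical function `find_lambda` builds two candidates (all coordinates rounded to the nearest integer, and all coordinates rounded to the nearest half of an odd integer) and keeps whichever leaves the smaller remainder norm, which guarantees that the result lies in H′.

```python
from fractions import Fraction
from math import floor


def norm(q):
    """Sum of the squares of the coordinates."""
    return sum(x * x for x in q)


def remainder(kappa, m, lam):
    """Coordinates of kappa - m*lam, as in equation (81)."""
    return tuple(k - m * l for k, l in zip(kappa, lam))


def find_lambda(kappa, m):
    """Return lam in H' with N(kappa - m*lam) < m^2, for a non-zero integer m."""
    kappa = tuple(Fraction(k) for k in kappa)
    x = [k / m for k in kappa]  # exact coordinates of kappa / m
    integer_choice = tuple(Fraction(round(t)) for t in x)
    half_odd_choice = tuple(Fraction(floor(t)) + Fraction(1, 2) for t in x)
    return min(
        (integer_choice, half_odd_choice),
        key=lambda lam: norm(remainder(kappa, m, lam)),
    )


kappa, m = (7, -3, 10, 1), 4
lam = find_lambda(kappa, m)
rem = remainder(kappa, m, lam)
```

For κ = 7 − 3i + 10j + k and m = 4, the integer candidate leaves a remainder of norm 7 and the half-odd candidate a remainder of norm 3, so the latter is returned; both are below m² = 16.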


Armed with Theorem 6, we now look to prove the existence of a full-fledged division algorithm for H′.

Theorem 7 (Division Algorithm). Suppose α, β ∈ H′, and β ≠ 0. Then there are λ, γ ∈ H′ so that

α = λβ + γ, with 0 ≤ N(γ) < N(β).(86)

Proof. Define κ ∈ H′ and m ∈ N such that

$\kappa = \alpha\bar{\beta}$ and $m = N(\beta)$. (87)

From Theorem 6, we know there exists λ ∈ H′ such that $N(\kappa - m\lambda) < m^2$. Now, using such a λ derived from Theorem 6, define γ as

(88) $\gamma = \alpha - \lambda\beta$.

This satisfies α = λβ + γ from equation (86). From the definitions of κ and m, we see that

(89) $(\alpha - \lambda\beta)\bar{\beta} = \alpha\bar{\beta} - \lambda\beta\bar{\beta} = \kappa - m\lambda$,

so thus

(90) $N[(\alpha - \lambda\beta)\bar{\beta}] = N(\kappa - m\lambda)$.

From Theorem 6, we know that $N(\kappa - m\lambda) < m^2$, so

(91) $N[(\alpha - \lambda\beta)\bar{\beta}] = N(\alpha - \lambda\beta)N(\bar{\beta}) < m^2$.

Since $N(\bar{\beta}) = \bar{\beta}\beta = m$, we can apply the cancellation law to see

(92) $N(\alpha - \lambda\beta) < m$,

and from the definitions of γ and m, we can substitute to conclude

(93) $N(\gamma) < N(\beta)$.
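The proof of Theorem 7 translates almost line for line into a procedure. The sketch below is one possible rendering under our own assumptions: quaternions are 4-tuples of exact fractions, and the helper `find_lambda` picks λ by comparing an all-integer rounding with an all-half-odd rounding of κ/m, which is our choice of how to realize Theorem 6. The names `divide_right` and `find_lambda` are hypothetical.

```python
from fractions import Fraction
from math import floor


def qmul(a, b):
    """Product of two quaternions, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: keep the first coordinate, negate the other three."""
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    """Sum of the squares of the coordinates."""
    return sum(x * x for x in a)


def find_lambda(kappa, m):
    """Pick lam in H' with N(kappa - m*lam) < m^2 (our reading of Theorem 6)."""
    x = [k / m for k in kappa]
    candidates = (
        tuple(Fraction(round(t)) for t in x),                   # all integers
        tuple(Fraction(floor(t)) + Fraction(1, 2) for t in x),  # all half-odd
    )
    return min(
        candidates,
        key=lambda lam: norm(tuple(k - m * l for k, l in zip(kappa, lam))),
    )


def divide_right(alpha, beta):
    """Return (lam, gamma) with alpha = lam*beta + gamma and N(gamma) < N(beta)."""
    alpha = tuple(Fraction(x) for x in alpha)
    beta = tuple(Fraction(x) for x in beta)
    m = norm(beta)                    # equation (87)
    kappa = qmul(alpha, conj(beta))   # equation (87)
    lam = find_lambda(kappa, m)
    gamma = tuple(a - c for a, c in zip(alpha, qmul(lam, beta)))  # equation (88)
    return lam, gamma


alpha, beta = (7, 3, -2, 5), (2, 1, 0, -1)
lam, gamma = divide_right(alpha, beta)
```

The quotient λ sits on the left of β, matching the right-hand division of equation (86), and the remainder γ has norm strictly below N(β) = 6.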


4 Quaternion GCDs

In addition to norms and units, we are now familiar with other important ideas about quaternions, namely division and divisors. In other number systems, we might want to consider when a number is the greatest common divisor, or GCD, of two other numbers. If d is the GCD of a and b, then we can easily see that d divides all linear combinations of a and b. This knowledge can be useful for many reasons, such as proving unique prime factorization within a number system or finding solutions of linear Diophantine equations. Also in other number systems, the GCD has important connections to primality, and such a connection also exists for quaternions. For this reason, an examination of GCDs in H′ now will aid our later study of primes in H′.

As we have done with concepts such as the norm, units, and associates, in order to define a GCD in H′, we look to the definition of a GCD in other number systems. However, knowing that multiplication is not commutative, we must again be careful to distinguish between left- and right-hand divisors.

Definition. Let α, β be quaternions. Then α and β have a greatest common right-hand divisor δ, denoted gcdr(α, β), if

(1) δ is a right-hand divisor of both α and β, and
(2) every common right-hand divisor of α and β is also a right-hand divisor of δ.

Theorem 8 (Greatest Common Divisor). Given α, β ∈ H′, where at least one of α and β is non-zero, then α and β have a greatest common right-hand divisor δ, which is unique up to associates and can be written in the form

(94) δ = µα + νβ,

where µ, ν ∈ H′.

Proof. Let Γ be a set defined as

(95) Γ = {N(µα + νβ) | µ, ν ∈ H′ and µα + νβ ≠ 0}.

By Lemma 2, the norm of a non-zero element of H′ is a positive integer, so Γ ⊆ N. Furthermore, taking $\mu = \bar{\alpha}$ and $\nu = \bar{\beta}$, we see that $\mu\alpha + \nu\beta = N(\alpha) + N(\beta)$, which is non-zero because at least one of α and β is non-zero, so Γ ≠ ∅. Thus, the Well Ordering Principle applies to Γ, so we know there exists g0 = N(µ0α + ν0β) such that g0 ≤ g for all g ∈ Γ. Also, define δ = µ0α + ν0β. We want to show that δ is a greatest common right-hand divisor of α and β.

By Theorem 7, we know that given α, δ ∈ H′, we can find λ, γ ∈ H′ so that

(96) α = λδ + γ,

where 0 ≤ N(γ) < N(δ). Since δ = µ0α + ν0β, we can substitute to show

(97) α = λ(µ0α + ν0β) + γ,

and then rearrange to show

(98) γ = (1 − λµ0)α + (−λν0)β.

On account of the closure of addition and multiplication, (1 − λµ0), (−λν0) ∈ H′, and thus N(γ) ∈ Γ if N(γ) > 0. Assume momentarily that N(γ) ∈ Γ. By Theorem 7, we know that N(γ) < N(δ); however, g0 = N(δ) is the least element of Γ, so N(γ) ∉ Γ and thus we know N(γ) = 0, and also γ = 0. Equation (96) now becomes

(99) α = λδ,

and by definition, δ⌈α (that is, δ is a right-hand divisor of α). Similarly, we can see that δ⌈β.

Now let κ ∈ H′ and assume that κ⌈α and κ⌈β. Then it follows that κ⌈µ0α and κ⌈ν0β, and that κ⌈(µ0α + ν0β). Since δ = µ0α + ν0β, then κ⌈δ.

Thus, δ is a right-hand divisor of α and β, and an arbitrary common right-hand divisor of α and β is also a divisor of δ. Therefore, by definition, δ is a greatest common right-hand divisor.
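The proof above is an existence argument based on the Well Ordering Principle. To actually compute a greatest common right-hand divisor, one can instead repeat the division of Theorem 7, in the manner of the Euclidean algorithm for integers. The sketch below takes that different route; it is not the method of the proof, and all function names in it are hypothetical. The remainder step uses the same two-candidate rounding idea as our reading of Theorem 6.

```python
from fractions import Fraction
from math import floor


def qmul(a, b):
    """Product of two quaternions, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: keep the first coordinate, negate the other three."""
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    """Sum of the squares of the coordinates."""
    return sum(x * x for x in a)


def find_lambda(kappa, m):
    """Pick lam in H' with N(kappa - m*lam) < m^2."""
    x = [k / m for k in kappa]
    candidates = (
        tuple(Fraction(round(t)) for t in x),
        tuple(Fraction(floor(t)) + Fraction(1, 2) for t in x),
    )
    return min(
        candidates,
        key=lambda lam: norm(tuple(k - m * l for k, l in zip(kappa, lam))),
    )


def right_remainder(alpha, beta):
    """Remainder gamma in alpha = lam*beta + gamma with N(gamma) < N(beta)."""
    lam = find_lambda(qmul(alpha, conj(beta)), norm(beta))
    return tuple(a - c for a, c in zip(alpha, qmul(lam, beta)))


def gcd_right(alpha, beta):
    """A greatest common right-hand divisor, found by repeated division."""
    alpha = tuple(Fraction(x) for x in alpha)
    beta = tuple(Fraction(x) for x in beta)
    while any(beta):  # stop once the remainder is zero
        alpha, beta = beta, right_remainder(alpha, beta)
    return alpha


# Example: alpha = 1 + 3j - 2k has norm 14, which shares the factor 7 with N(7).
delta = gcd_right((1, 0, 3, -2), (7, 0, 0, 0))
```

Because the norms of the remainders strictly decrease, the loop terminates. In this example the result has norm 7, and its coordinates give 7 = 1 + 1 + 4 + 1 as a sum of four squares.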

Theorem 9. Suppose α, β ∈ H′. If gcdr(α, β) = δ and δ is not a unit, then gcd[N(α), N(β)] = N(δ), and N(δ) > 1.

Proof. It is given that

(100) gcdr(α, β) = δ,

and so δ is a right-hand divisor of α and β, and there exist γ1, γ2 ∈ H′ such that

α = γ1δ(101)

and β = γ2δ.(102)


Taking the norm of both sides of these equations gives

N(α) = N(γ1δ) = N(γ1)N(δ)(103)

and N(β) = N(γ2δ) = N(γ2)N(δ).(104)

From Lemma 2, the norm of a quaternion is an integer, and from divisibility for the integers, we can say that

N(δ)|N(α) and N(δ)|N(β).(105)

Thus N(δ) is, at minimum, a common divisor of N(α) and N(β).

Another consequence of equation (100) is that if there exists λ ∈ H′ that is also a right-hand divisor of α and β, then λ must be a right-hand divisor of δ. If such λ exists, then there is γ′ ∈ H′ such that δ = γ′λ. Substituting this into equations (101) and (102), we see that

α = γ1γ′λ(106)

and β = γ2γ′λ.(107)

Following similar reasoning as above, this implies that

N(λ)|N(α), N(λ)|N(β), and N(λ)|N(δ).(108)

Therefore, if gcdr(α, β) = δ, we can draw two conclusions. First, that N(δ)|N(α) and N(δ)|N(β). Second, if there exists λ such that N(λ)|N(α) and N(λ)|N(β), then it must also be the case that N(λ)|N(δ). These are the two criteria for N(δ) to be the greatest common divisor of N(α) and N(β), so gcd[N(α), N(β)] = N(δ). In addition, it is given that δ is not a unit, so N(δ) ≠ 1, which, along with Lemma 2, implies that N(δ) > 1.

Theorem 10. Suppose α ∈ H′ and let β = m ∈ N. Then gcdr(α, β) = 1 if and only if gcd[N(α), m] = 1.

Proof. By Theorem 8, the following statements are equivalent:

(109) gcdr(α, β) = 1,

and there exist µ, ν ∈ H′ such that

(110) 1 = µα + νβ.


Equation (110) can be rearranged in the form

(111) µα = 1 − νβ.

Substituting m for β in equation (111) and taking the norm of both sides gives

(112) $N(\mu\alpha) = N(1 - m\nu) = (1 - m\nu)(1 - m\bar{\nu})$,

which can be expanded to

(113) $N(\mu)N(\alpha) = 1 - m\nu - m\bar{\nu} + m^2 N(\nu)$

and then rearranged as

(114) $N(\mu)N(\alpha) + m\nu + m\bar{\nu} - m^2 N(\nu) = 1$.

Let d be an integer such that d = gcd[N(α), m]. By definition, d is a common divisor of N(α) and m, so d|N(α) and d|m. Note that $\nu + \bar{\nu}$ is twice the first coordinate of ν, which is an integer because ν ∈ H′. Thus, each of the following is true as well:

$d \mid N(\mu)N(\alpha)$, $d \mid m(\nu + \bar{\nu})$, and $d \mid -m^2 N(\nu)$.

Therefore, from the properties of divisibility for the integers, we know that $d \mid [N(\mu)N(\alpha) + m\nu + m\bar{\nu} - m^2 N(\nu)]$, and given equation (114), d|1. Since d is a positive integer, d must equal 1. Therefore, gcd[N(α), m] = 1. Note that because $N(\beta) = m^2$, the statement gcd[N(α), m] = 1 is equivalent to gcd[N(α), N(β)] = 1.

5 Primes

We now look at what it means to be a prime quaternion. By proving several statements about primes in the set H′, and exploring how these primes relate to primes in Z, we will eventually be able to prove that any prime number can be written as a sum of the squares of four integers.

Before we can prove anything about primes in H′, we need to know exactly what constitutes a prime in H′. For this, we draw from the definition used for primes in sets such as Z[i] and Zm.

Definition. A non-zero quaternion π is prime in H′ if, for any α, β ∈ H′ such that π = αβ, either α or β is a unit (but not both).


Now that we have established what it means to be prime in H′, we can begin to prove certain theorems connecting primes in H′ with primes in Z. The following theorem is similar to results in Z[i]: a Gaussian integer is prime in Z[i] if its norm is prime in Z.

Theorem 11. Let π be a quaternion. If N(π) is prime in Z, then π is prime in H′.

Proof. It is given that π ∈ H′ and that N(π) is prime in Z. Define α, β ∈ H′ such that

(115) π = αβ.

Taking the norm of both sides of equation (115) gives N(π) = N(αβ) = N(α)N(β). Since N(π) is prime in Z, by the definition of prime, either N(α) or N(β) must be a unit in Z, and since norms are non-negative, that norm equals 1. From Lemma 5, this implies that either α or β is a unit in H′. Given equation (115), if either α or β is a unit, then by definition, π must be prime in H′.

Before we can continue to prove things about primes in H′, we must first state an auxiliary theorem that will be needed later.

Theorem 12. Suppose p ∈ Z is an odd prime (p 6= ±2). Then there existx, y ∈ Z such that

(116) 1 + x2 + y2 ≡ 0 (mod p),

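where 0 ≤ x < p and 0 ≤ y < p. (The value 0 must be allowed: for p = 5, every solution has x = 0 or y = 0, for example x = 0 and y = 2.)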
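For small primes, suitable x and y can be found by exhaustive search. The Python sketch below is only an illustration (the function name find_xy is ours). It lets x and y range over 0, 1, ..., p − 1, an assumption worth stating because for p = 5 every solution has x = 0 or y = 0.

```python
def find_xy(p):
    """Return (x, y) with 0 <= x, y < p and 1 + x^2 + y^2 = 0 (mod p).

    Returns None if the search finds nothing; p is assumed to be an odd prime.
    """
    for x in range(p):
        for y in range(p):
            if (1 + x * x + y * y) % p == 0:
                return x, y
    return None


for p in (3, 5, 7, 11, 13):
    x, y = find_xy(p)
    # Each line shows a prime together with one solution of (116).
    print(p, x, y, (1 + x * x + y * y) % p)
```

The search takes on the order of p^2 steps, which is adequate for illustration but not for large primes.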

Theorem 13. If an integer p is prime in Z, then p is not prime in H′.

Proof. Let p be an integer that is prime in Z. If p = 2, then we can see that

(117) 2 = (1 + i)(1 − i).

Of course, (1 + i) and (1 − i) are in the set H′, but neither is a unit (each has norm 2), so 2 is not prime in H′. We can therefore assume that p > 2.

By Theorem 12, there exist r, s ∈ Z such that

(118) 1 + r^2 + s^2 ≡ 0 (mod p),

where 0 ≤ r < p and 0 ≤ s < p. Now define α ∈ H′ so that

(119) α = 1 + 0i + sj − rk,

where r and s are obtained from Theorem 12. Thus N(α) = 1 + r^2 + s^2, and from equation (118), N(α) ≡ 0 (mod p). It follows from the properties of modular arithmetic that p|N(α), and it is trivial that p|p. Therefore, gcd[N(α), p] ≥ p. By definition, p > 1, so it is easy to see that gcd[N(α), p] ≠ 1. Using the result of Theorem 10, this implies that gcdr(α, p) ≠ 1. Now, we define δ such that

(120) δ = gcdr(α, p),

and we know that δ is not a unit in H′. Furthermore, because δ is also a common right-hand divisor of α and p, we can say that δ is a right-hand divisor of α and of p. Thus, there exist λ1, λ2 ∈ H′ such that

(121) α = λ1δ

and

(122) p = λ2δ.

Assume by way of contradiction that λ2 is a unit in H′. Then from equation (122), we see that p ∼ δ. This in turn implies that p is a right-hand divisor of α. Given equation (119), there must exist γ = c1 + c2i + c3j + c4k ∈ H′ such that 1 + sj − rk = γp = pc1 + pc2i + pc3j + pc4k. However, no such γ exists: comparing the first coordinates requires pc1 = 1, and no c1 ∈ Z or c1 ∈ (1/2)ZOdd satisfies this when p > 2. Thus, a contradiction arises, and it cannot be the case that λ2 is a unit. Hence, p = λ2δ, where neither λ2 nor δ is a unit, and therefore, p is not prime in H′.
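Two steps of this proof can be illustrated with a short computation. The Python sketch below, with helper names of our own choosing, checks the factorization 2 = (1 + i)(1 − i) and, for an odd prime p, builds α = 1 + sj − rk by brute-force search (allowing r or s to be 0) and confirms that p divides N(α). It does not compute gcdr(α, p), which would require the division algorithm developed earlier. The fourth product coordinate is assumed to be the standard c4 = a1b4 + a2b3 − a3b2 + a4b1.

```python
def qmul(a, b):
    """Product of two quaternions given as 4-tuples of coordinates."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,  # assumed standard c4
    )


def norm(a):
    """Norm N(a) = a1^2 + a2^2 + a3^2 + a4^2."""
    return sum(x * x for x in a)


def alpha_for(p):
    """Return alpha = 1 + 0i + sj - rk with p | N(alpha), found by search."""
    for r in range(p):
        for s in range(p):
            if (1 + r * r + s * s) % p == 0:
                return (1, 0, s, -r)
    raise ValueError("no (r, s) found; p is assumed to be an odd prime")


# The case p = 2: (1 + i)(1 - i) = 2.
print(qmul((1, 1, 0, 0), (1, -1, 0, 0)))  # prints (2, 0, 0, 0)

# The case p > 2, here p = 7: the norm of alpha is a multiple of p.
alpha = alpha_for(7)
print(alpha, norm(alpha), norm(alpha) % 7)
```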

We saw in Theorem 11 the connection between prime quaternions and prime integers, a result for quaternions which is identical to the one for Gaussian integers. The following theorem is the converse of Theorem 11, but its analogue is not true for primes in Z[i] (for example, 3 is prime in Z[i], but its norm 9 is not prime in Z).

Theorem 14. Let π be a prime in H′. Then N(π) is prime in Z.

Proof. It is given that π is a quaternion which is prime in H′. However, assume by way of contradiction that the norm N(π) is not prime in Z. This means that there exist integers a and p, both not units, so that

(123) N(π) = ap.


Furthermore, assume that p is a prime factor of N(π). Since by construction, p|N(π), and also trivially, p|p, then p is a common divisor of N(π) and p, so gcd[N(π), p] ≥ p. Because p > 1, it is also true that gcd[N(π), p] ≠ 1. From Theorem 10, it follows that gcdr(π, p) ≠ 1. Define δ such that

(124) δ = gcdr(π, p).

Thus δ ≁ 1, so δ is not a unit in H′. Since δ is the greatest common right-hand divisor of π and p, it is also a common right-hand divisor of π and p. Hence, there exist λ1, λ2 ∈ H′ such that

(125) π = λ1δ

and

(126) p = λ2δ.

Since π is prime and δ is not a unit, then λ1 is a unit, and π ∼ δ. From Lemma 6, N(π) = N(δ). Taking the norm of both sides of equation (126) gives N(p) = p^2 = N(λ2δ) = N(λ2)N(δ), and substituting N(π) for N(δ) gives

(127) p^2 = N(λ2)N(π).

By combining this with equation (123) and performing simple algebra, we obtain

(128) p = aN(λ2).

Because p was defined to be a prime, then by definition, N(λ2) must equal either 1 or p. Assume temporarily that N(λ2) = 1. Then from Lemma 5, λ2 would be a unit in H′. From equation (126), it would follow that p ∼ δ, and from Lemmas 7 and 8, p ∼ π. But then p would be prime in H′, which violates Theorem 13. This contradiction means that N(λ2) cannot equal 1, and instead must equal p. Substituting this result into equation (127) gives p^2 = pN(π), which reduces to p = N(π). Therefore, since p is prime in Z and N(π) = p, then N(π) is prime in Z.

We can now combine some of our results to form a single statement that is stronger than previous theorems.

Theorem 15. Let π be a quaternion. Then π is prime in H′ if and only if N(π) is prime in Z.

Proof. This theorem easily follows from Theorems 11 and 14.


6 Numbers that are Sums of 4 Squares

Now that we know some of the relationships between primes in H′ and Z, we can easily prove that all prime numbers can be expressed as a sum of four integer squares. This in turn will allow us to see what other numbers are sums of four squares.

Theorem 16. Let p ∈ N be a prime number. Then there are integers n1, n2, n3, and n4 so that p = n1^2 + n2^2 + n3^2 + n4^2.

Proof. Since it is given that p is prime in N, we know that p is also prime in Z. By Theorem 13, we know p is not prime in H′. Thus, there exist α, π ∈ H′ such that

(129) p = απ,

where α and π are not units in H′. Taking the norm of this gives N(p) = p^2 = N(απ) = N(α)N(π). Since p is prime in Z, the only positive factors of p^2 in Z are 1, p, and p^2. Hence, either

(130) N(α) = 1, N(α) = p, or N(α) = p^2.

However, since α and π are not units in H′, then it cannot be the case that N(α) = 1 or N(π) = 1. This rules out the possibilities that N(α) = 1 and N(α) = p^2, leaving only that N(α) = p. Thus, the only solution to p^2 = N(α)N(π) is N(α) = N(π) = p.

Thus, N(π) = p, so N(π) is prime in Z. If we say that π = p1 + p2i + p3j + p4k, then we know that either every pi ∈ Z or every pi ∈ (1/2)ZOdd.

If pi ∈ Z, then we can define n1, n2, n3, and n4 such that

(131) n1 = p1, n2 = p2, n3 = p3, and n4 = p4,

and so it is plain that p = N(π) = n1^2 + n2^2 + n3^2 + n4^2, where n1, . . . ∈ Z.

If instead pi ∈ (1/2)ZOdd, then by Theorem 5, there exists π′ ∈ H′ such that π′ ∼ π and π′ has integer coordinates. If we define n1, n2, n3, and n4 to be the coordinates of π′, so

(132) π′ = n1 + n2i + n3j + n4k,

then using Lemma 6, we see that p = N(π) = N(π′) = n1^2 + n2^2 + n3^2 + n4^2, where ni ∈ Z.

Thus, for any prime p, there exist integers n1, n2, n3, and n4 so that p = n1^2 + n2^2 + n3^2 + n4^2.
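Theorem 16 guarantees that a representation exists, but the proof does not produce one directly. For small primes, one can simply search. The following Python sketch (the name four_squares is ours, and exhaustive search is a stand-in for the quaternion argument, not an implementation of it) looks for n1 ≥ n2 ≥ n3 ≥ n4 ≥ 0.

```python
from math import isqrt


def four_squares(n):
    """Return (n1, n2, n3, n4), n1 >= n2 >= n3 >= n4 >= 0, whose squares sum to n.

    Exhaustive search; returns None if no representation is found.
    """
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            for c in range(min(b, isqrt(n - a * a - b * b)), -1, -1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                # Accept only if rest is a perfect square not exceeding c^2.
                if d <= c and d * d == rest:
                    return a, b, c, d
    return None


for p in (2, 3, 5, 7, 11, 13):
    rep = four_squares(p)
    # Each line shows a prime, one representation, and the check p == sum.
    print(p, rep, sum(x * x for x in rep) == p)
```

Because any representation can be reordered so that its entries are non-increasing, restricting the search to ordered quadruples loses no solutions.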


Now that we have proven that a prime number can be expressed as the sum of four integer squares, we now turn to composite numbers to show that they too can be expressed as such a sum.

Theorem 17. Let m ∈ N be a composite number. Then there are integers n1, n2, n3, and n4 so that m = n1^2 + n2^2 + n3^2 + n4^2.

Proof. This will be proven by induction. Because m is a composite number, there exist primes p1, p2, . . . , pn ∈ N such that m = p1p2 · · · pn, where n is the number of prime factors of m, counted with multiplicity. If n = 1, then m would be prime, which violates the initial assumption that m is composite. Therefore, assume that n ≥ 2.

First, take n = 2, so that

(133) m = p1p2.

The numbers p1 and p2 are prime, so from Theorem 16, there exist integers, say a1, a2, a3, a4, b1, b2, b3, b4, such that

(134) p1 = a1^2 + a2^2 + a3^2 + a4^2

and

(135) p2 = b1^2 + b2^2 + b3^2 + b4^2.

Now, define α, β ∈ H′ such that

(136) α = a1 + a2i + a3j + a4k

and

(137) β = b1 + b2i + b3j + b4k.

Observe that a consequence of this definition of α and β is that N(α) = p1 and N(β) = p2. If we let µ = m1 + m2i + m3j + m4k = αβ, then from Theorem 2, µ ∈ H′, and thus either m1, . . . ∈ Z or m1, . . . ∈ (1/2)ZOdd.

If m1, . . . ∈ Z, then we can simply define n1, n2, n3, and n4 so that

(138) n1 = m1, n2 = m2, n3 = m3, and n4 = m4.

Thus, we can see that

(139) N(µ) = n1^2 + n2^2 + n3^2 + n4^2.

In addition, using the definition of µ and equation (133), we can see that N(µ) = N(αβ) = N(α)N(β) = p1p2 = m. Therefore, after combining this result with equation (139), we obtain

(140) m = n1^2 + n2^2 + n3^2 + n4^2,

where n1, n2, n3, n4 ∈ Z.

If instead we had m1, . . . ∈ (1/2)ZOdd, then by Theorem 5, we know there exists µ′ ∈ H′, an associate of µ = αβ, with integer coordinates. In this case, we can instead define n1, n2, n3, and n4 so that

(141) µ′ = n1 + n2i + n3j + n4k.

Hence, N(µ′) = n1^2 + n2^2 + n3^2 + n4^2. On account of Lemma 6 and equation (133), we see that N(µ′) = N(αβ) = N(α)N(β) = p1p2 = m. Thus, for this case as well,

(142) m = n1^2 + n2^2 + n3^2 + n4^2,

where n1, n2, n3, n4 ∈ Z.

Now, take n ≥ 2 and assume that for a = p1p2 · · · pn, there exist integers a1, a2, a3, and a4 such that a = a1^2 + a2^2 + a3^2 + a4^2. We must show that for m = p1p2 · · · pn p_{n+1}, there exist integers n1, n2, n3, and n4 such that m = n1^2 + n2^2 + n3^2 + n4^2. Given the assumptions about a, we can write m as m = a p_{n+1}. From Theorem 16, since p_{n+1} is prime, there exist integers b1, b2, b3, and b4 such that p_{n+1} = b1^2 + b2^2 + b3^2 + b4^2. If here we define α, β ∈ H′ such that

(143) α = a1 + a2i + a3j + a4k

and

(144) β = b1 + b2i + b3j + b4k,

we find ourselves in the same situation as when we set n = 2. Thus, in the same manner, we can obtain a quaternion µ, where either

(145) µ = αβ or µ ∼ αβ,

and where

(146) µ = n1 + n2i + n3j + n4k.

Thus, N(µ) = n1^2 + n2^2 + n3^2 + n4^2 and N(µ) = N(αβ) = N(α)N(β) = a p_{n+1} = m. We can now see that for this case as well, there exist n1, n2, n3, n4 ∈ Z such that

(147) m = n1^2 + n2^2 + n3^2 + n4^2,

which concludes the proof. Therefore, any composite number can be written as the sum of the squares of four integers.
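The n = 2 step amounts to multiplying two quaternions with integer coordinates. As a hedged illustration, the Python sketch below (helper names ours; the fourth product coordinate is assumed to be the standard c4 = a1b4 + a2b3 − a3b2 + a4b1) combines 3 = 1^2 + 1^2 + 1^2 + 0^2 and 7 = 2^2 + 1^2 + 1^2 + 1^2 into a representation of 21. When α and β both have integer coordinates, the product formula yields integer coordinates, so the half-odd case does not arise in this particular computation.

```python
def qmul(a, b):
    """Product of two quaternions given as 4-tuples of coordinates."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,  # assumed standard c4
    )


def norm(a):
    """Norm N(a) = a1^2 + a2^2 + a3^2 + a4^2."""
    return sum(x * x for x in a)


alpha = (1, 1, 1, 0)  # N(alpha) = 3
beta = (2, 1, 1, 1)   # N(beta) = 7

mu = qmul(alpha, beta)
print(mu)        # prints (0, 4, 2, 1)
print(norm(mu))  # prints 21, since 0^2 + 4^2 + 2^2 + 1^2 = 21
```

Repeating the multiplication once per additional prime factor mirrors the inductive step of the proof.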


There is one last case to consider, the trivial case of when a number is 1 (recall that 1 is neither prime nor composite). The following Lemma proves that 1 can also be expressed as a sum of the squares of four integers.

Lemma 10. There are integers n1, n2, n3, and n4 so that 1 = n1^2 + n2^2 + n3^2 + n4^2.

Proof. Take n1 = 1 and n2 = n3 = n4 = 0. Thus, 1 = 1^2 + 0^2 + 0^2 + 0^2.

Finally, we can now formally prove that every positive integer can be expressed as a sum of the squares of four integers, which was our goal from the beginning. This result is known as Lagrange’s Theorem.

Theorem 18 (Lagrange’s Theorem). For every n ∈ N, there are integers n1, n2, n3, and n4 so that

(148) n = n1^2 + n2^2 + n3^2 + n4^2.

Proof. Every n ∈ N is either prime, composite, or 1, so there are three cases to consider. From Theorem 16, Theorem 17, and Lemma 10, it is clear that if n falls under any of these cases, then there exist integers n1, n2, n3, and n4 such that n = n1^2 + n2^2 + n3^2 + n4^2.
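The three cases of the proof suggest a simple procedure, sketched below in Python under our own naming: factor n by trial division, represent each prime factor by exhaustive search (a stand-in for Theorem 16), and multiply the corresponding quaternions as in Theorem 17. The empty product handles n = 1 as in Lemma 10. The fourth product coordinate is assumed to be the standard c4 = a1b4 + a2b3 − a3b2 + a4b1. This is an illustration for small n, not an efficient algorithm.

```python
from math import isqrt


def qmul(a, b):
    """Product of two quaternions given as 4-tuples of coordinates."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,  # assumed standard c4
    )


def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def four_squares_prime(p):
    """Four integers whose squares sum to p, found by exhaustive search."""
    for a in range(isqrt(p) + 1):
        for b in range(a + 1):
            for c in range(b + 1):
                rest = p - a * a - b * b - c * c
                if rest < 0:
                    break  # larger c only makes rest more negative
                d = isqrt(rest)
                if d * d == rest:
                    return (a, b, c, d)
    return None


def lagrange(n):
    """Return four non-negative integers whose squares sum to n >= 1."""
    q = (1, 0, 0, 0)  # the case n = 1
    for p in prime_factors(n):
        q = qmul(q, four_squares_prime(p))
    return tuple(abs(x) for x in q)


# Check the statement of the theorem for every n below 200.
print(all(sum(x * x for x in lagrange(n)) == n for n in range(1, 200)))
```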
