chapter 3 high-school algebra revisited

Chapter 3

High-school algebra revisited

In this chapter, I will review some of the basic constructions from high-school algebra from the perspective of this book.

3.1 The rules of the game

3.1.1 The axioms

Algebra deals with the manipulation of symbols. This means that symbols are altered and combined according to certain rules. In high school, the algebra you studied was mainly based on the properties of the real numbers. This means that when you write x you mean an unknown or yet-to-be-determined real number. In this section, I shall describe the rules, or axioms, that you use for doing algebra with real numbers. The primary operations we are interested in are addition x + y and multiplication x × y. As usual, I shall abbreviate the operation of multiplication by concatenation, which simply means we write xy. Sometimes, it is helpful to denote multiplication as follows: x · y. Of course, there are two other familiar operations: subtraction and division. We shall see that these should be treated in a different way: subtraction as the inverse of addition, and division as the inverse of multiplication.

Both addition and multiplication require two inputs and then deliver one output with the inputs and outputs all being taken from the same set. They are therefore examples of what are called binary operations and are the commonest kinds of operations in algebra. For example, as we shall see later, matrix addition and matrix multiplication are both binary operations, the vector product of two vectors is a binary operation, and the intersection and union of two sets are both binary operations. I shall use ∗ to mean any binary operation defined on some specified set X.

[Figure: a box labelled ∗ with two inputs, a and b, and one output, a ∗ b.]

The two most important properties a binary operation may have are commutativity and associativity. A binary operation is commutative if

a ∗ b = b ∗ a

in all cases. That is, the order in which you carry out the operation is not important. Addition and multiplication of real numbers and, as we shall see later, of complex numbers are commutative. But we shall also meet binary operations that are not commutative: both matrix multiplication and vector products are examples. Commutativity is therefore not automatic. A binary operation is associative if

(a ∗ b) ∗ c = a ∗ (b ∗ c)

in all cases. Remember that the brackets tell you how to work out the product. Thus (a ∗ b) ∗ c means first work out a ∗ b, let’s call it d, and then work out d ∗ c. Almost all the binary operations we shall meet in this book are associative, the one important exception being the vector product.

In order to show that a binary operation ∗ is associative, we have in principle to check that all possible products (a ∗ b) ∗ c and a ∗ (b ∗ c) are equal. To show that a binary operation is not associative, we simply have to find specific values for a, b and c so that (a ∗ b) ∗ c ≠ a ∗ (b ∗ c). Here are examples of both of these possibilities.

Example 3.1.1. Let’s take the set of real numbers R and investigate a new binary operation denoted by ◦ that is defined as follows

a ◦ b = a + b + ab.

We shall prove that it is associative. First, we have to understand what it is we have to show. From the definition of associativity, we have to prove that

(a ◦ b) ◦ c = a ◦ (b ◦ c)

for all real numbers a, b and c. To do this, we calculate first the left-hand side and then the right-hand side and then verify they are equal. Because we are trying to prove a result true for all real numbers, we cannot choose specific values of a, b and c. We first calculate (a ◦ b) ◦ c. Using the axioms for real numbers, we get that

(a ◦ b) ◦ c = (a + b + ab) ◦ c = (a + b + ab) + c + (a + b + ab)c

which is equal to a + b + c + ab + ac + bc + abc. Now we calculate a ◦ (b ◦ c). We get that

a ◦ (b ◦ c) = a ◦ (b + c + bc) = a + (b + c + bc) + a(b + c + bc)

which is equal to a + b + c + ab + ac + bc + abc. We now see that we get the same answers however we bracket the product and so we have proved that the binary operation ◦ is associative.

Example 3.1.2. Let’s take the set N and define the binary operation ⊕ as follows

$a \oplus b = a^2 + b^2$.

I shall show that this binary operation is not associative. Let’s calculate first (1 ⊕ 2) ⊕ 3. By definition this is computed as follows

$(1 \oplus 2) \oplus 3 = (1^2 + 2^2) \oplus 3 = 5 \oplus 3 = 5^2 + 3^2 = 25 + 9 = 34$.

Now we calculate 1 ⊕ (2 ⊕ 3) as follows

$1 \oplus (2 \oplus 3) = 1 \oplus (2^2 + 3^2) = 1 \oplus (4 + 9) = 1 \oplus 13 = 1^2 + 13^2 = 1 + 169 = 170$.

Therefore

(1 ⊕ 2) ⊕ 3 ≠ 1 ⊕ (2 ⊕ 3).

It follows that the binary operation ⊕ is not associative.
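The two examples above can be explored numerically. The following is only a sketch: the function names `circ`, `oplus` and `find_counterexample` are my own, and a search over finitely many sample values can refute associativity but can never prove it, which is why Example 3.1.1 needed an algebraic argument.

```python
from fractions import Fraction
from itertools import product

def circ(a, b):
    """The operation a ∘ b = a + b + ab from Example 3.1.1."""
    return a + b + a * b

def oplus(a, b):
    """The operation a ⊕ b = a^2 + b^2 from Example 3.1.2."""
    return a * a + b * b

def find_counterexample(op, samples):
    """Return the first triple with (a op b) op c != a op (b op c), or None."""
    for a, b, c in product(samples, repeat=3):
        if op(op(a, b), c) != op(a, op(b, c)):
            return (a, b, c)
    return None

# Exact rational samples avoid floating-point rounding.
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

print(find_counterexample(circ, samples))             # None
print(oplus(oplus(1, 2), 3), oplus(1, oplus(2, 3)))   # 34 170
print(find_counterexample(oplus, [1, 2, 3]) is not None)  # True
```

A single failing triple settles non-associativity; the absence of one among the samples is merely consistent with the proof in Example 3.1.1.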

We are now ready to state the algebraic axioms that form the basis of high-school algebra. We shall split them up into three groups: those dealing only with addition, those dealing only with multiplication, and finally those that deal with both operations together.

Axioms for addition


(F1) Addition is associative. Let x, y and z be any real numbers. Then (x + y) + z = x + (y + z).

(F2) There is an additive identity. The number 0 (zero) is the additive identity. This means that for any real number x we have that x + 0 = x = 0 + x. Thus adding zero to a number leaves it unchanged.

(F3) Each element has a unique additive inverse. This means that for each number x there is another number, written −x, with the property that x + (−x) = 0 = (−x) + x. The number −x is called the additive inverse of the number x.

(F4) Addition is commutative. Let x and y be any real numbers. Then x + y = y + x. The word commutative means that the order in which you add the numbers does not matter.

The first thing to understand is that none of these axioms should be surprising. They should all agree with your intuition.

Axioms for multiplication

(F5) Multiplication is associative. Let x, y and z be any real numbers. Then (xy)z = x(yz).

(F6) There is a multiplicative identity. The number 1 is the multiplicative identity. This means that for any real number x we have that 1x = x = x1.

(F7) Each non-zero number has a unique multiplicative inverse. Let x ≠ 0. Then there is a unique real number written $x^{-1}$ with the property that $x^{-1}x = 1 = xx^{-1}$. The number $x^{-1}$ is called the multiplicative inverse of x. It is, of course, the number $\frac{1}{x}$. It is very important to observe that zero does not have a multiplicative inverse.

(F8) Multiplication is commutative. Let x and y be any real numbers. Then xy = yx. Once again the word commutative means that the order in which you carry out the operations doesn’t matter. In this case, the operation is multiplication.

The axioms for multiplication are very similar to those for addition. The only real difference between them is axiom (F7). This expresses the fact that you cannot divide by zero.


Linking axioms

(F9) 0 ≠ 1.

(F10) The additive identity is a multiplicative zero. This means that 0x = 0 = x0. If you multiply any real number by 0 then you get 0.

(F11) Multiplication distributes over addition on the left and the right. There are actually two distributive laws: the left distributive law

x(y + z) = xy + xz

and the right distributive law

(y + z)x = yx + zx.

Let me come back to the omission of subtraction and division. These are not viewed as binary operations in their own right. Instead, we define a − b to mean a + (−b). Thus to subtract b means the same thing as adding −b. Likewise, we define a ÷ b, when b ≠ 0, to mean $a \times b^{-1}$. Thus to divide by b is to multiply by $b^{-1}$.

We have missed out one further ingredient in algebra, and that is the properties of equality.

Properties of equality

(E1) If a = b then c+ a = c+ b.

(E2) If a = b then ca = cb.

Example 3.1.3. When I talked about algebra in Chapter 1, I mentioned that the usual way of solving a linear equation in one unknown depended on the properties of real numbers. Let me now show you how we use the above axioms to solve ax + b = 0 where a ≠ 0. Throughout, I use without comment the two properties of equality I have listed above.

$$\begin{aligned}
ax + b &= 0 \\
(ax + b) + (-b) &= 0 + (-b) && \text{by (F3)} \\
ax + (b + (-b)) &= 0 + (-b) && \text{by (F1)} \\
ax + 0 &= 0 + (-b) && \text{by (F3)} \\
ax &= 0 + (-b) && \text{by (F2)} \\
ax &= -b && \text{by (F2)} \\
a^{-1}(ax) &= a^{-1}(-b) && \text{by (F7) since } a \neq 0 \\
(a^{-1}a)x &= a^{-1}(-b) && \text{by (F5)} \\
1x &= a^{-1}(-b) && \text{by (F7)} \\
x &= a^{-1}(-b) && \text{by (F6)}
\end{aligned}$$

I don’t propose that you go into quite such gory detail when solving equations, but I wanted to show you what actually lay behind the rules that you might have been taught at school.
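As a small illustration of the derivation above, here is a sketch in Python using exact rational arithmetic. The function name `solve_linear` is my own; the code simply forms the additive inverse of b and the multiplicative inverse of a and multiplies them, as in the last line of the example.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve ax + b = 0 for x, assuming a != 0, as x = a^{-1}(-b)."""
    a, b = Fraction(a), Fraction(b)
    if a == 0:
        raise ValueError("a must be non-zero: zero has no multiplicative inverse")
    neg_b = -b            # additive inverse of b, which exists by (F3)
    a_inv = 1 / a         # multiplicative inverse of a, which exists by (F7)
    return a_inv * neg_b

x = solve_linear(3, 6)
print(x)               # -2
print(3 * x + 6 == 0)  # True
```

The explicit test for a = 0 mirrors the remark after (F7) that zero has no multiplicative inverse.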

Example 3.1.4. We can use our axioms to prove that −1 × −1 = 1, something which is hard to understand in any other way. By definition, −1 is the additive inverse of 1. This means that 1 + (−1) = 0. Let us calculate (−1)(−1) − 1. We have that

(−1)(−1)− 1 = (−1)(−1) + (−1) by definition of subtraction

= (−1)(−1) + (−1)1 since 1 is the multiplicative identity

= (−1)[(−1) + 1] by the left distributivity law

= (−1)0 by properties of additive inverses

= 0 by properties of zero

Hence (−1)(−1) = 1. In other words, the result follows from the usual rules of algebra.

3.1.2 Indices

We usually write $a^2$ rather than aa, and $a^3$ instead of aaa. In this section, I want to review the meaning of algebraic expressions such as $a^{r/s}$ where $\frac{r}{s}$ is any rational number. Our starting point is a result that I would encourage you to assume as an axiom at a first reading. I have included the proof to show you a more sophisticated example of proof by induction.


Lemma 3.1.5 (Generalized associativity). Let ∗ be any binary operation defined on a set X. If ∗ is associative then however you bracket a product such as

$x_1 * \dots * x_n$

you will always get the same answer.

Proof. If $x_1, x_2, \dots, x_n$ are elements of the set X then one particular bracketing will play an important role in our proof

$x_1 * (x_2 * (\cdots (x_{n-1} * x_n) \cdots))$

which we write as $[x_1 x_2 \dots x_n]$.

The proof is by strong induction on the length n of the product in question. The base case is where n = 3 and is just an application of the associative law. Assume that n ≥ 4 and that for all k < n, all bracketings of a sequence of k elements of X lead to the same answer. This is therefore the induction hypothesis for strong induction. Let X denote any properly bracketed expression obtained by inserting brackets into the sequence $x_1, x_2, \dots, x_n$. Observe that the computation of such a bracketed product involves computing n − 1 products. This is because at each step we can only compute the product of adjacent letters $x_i * x_{i+1}$. Thus at each step of our calculation we reduce the number of letters by one until there is only one letter left. However the expression may be bracketed, the final step in the computation will be of the form Y ∗ Z, where Y and Z will each have arisen from properly bracketed expressions. In the case of Y it will involve a bracketing of some sequence $x_1, x_2, \dots, x_r$, and for Z the sequence $x_{r+1}, x_{r+2}, \dots, x_n$ for some r such that 1 ≤ r ≤ n − 1. Since Y involves a product of length r < n, we may assume by the induction hypothesis that $Y = [x_1 x_2 \dots x_r]$. Observe that $[x_1 x_2 \dots x_r] = x_1 * [x_2 \dots x_r]$. Hence by associativity,

$X = Y * Z = (x_1 * [x_2 \dots x_r]) * Z = x_1 * ([x_2 \dots x_r] * Z)$.

But $[x_2 \dots x_r] * Z$ is a properly bracketed expression of length n − 1 in $x_2, \dots, x_n$ and so using the induction hypothesis must equal $[x_2 x_3 \dots x_n]$. It follows that $X = [x_1 x_2 \dots x_n]$. We have therefore shown that all possible bracketings yield the same result in the presence of associativity.

We illustrate a special case of the above proof in the example below.


Example 3.1.6. Take n = 5. Then the notation $[x_1 x_2 x_3 x_4 x_5]$ introduced in the above proof means $x_1 * (x_2 * (x_3 * (x_4 * x_5)))$. Consider the product $((x_1 * x_2) * x_3) * (x_4 * x_5)$. Here we have $Y = (x_1 * x_2) * x_3$ and $Z = x_4 * x_5$. By associativity $Y = x_1 * (x_2 * x_3)$. Thus $Y * Z = (x_1 * (x_2 * x_3)) * (x_4 * x_5)$. But this is equal to $x_1 * ((x_2 * x_3) * (x_4 * x_5))$ again by associativity. By the induction hypothesis $(x_2 * x_3) * (x_4 * x_5) = x_2 * (x_3 * (x_4 * x_5))$, and so

$((x_1 * x_2) * x_3) * (x_4 * x_5) = x_1 * (x_2 * (x_3 * (x_4 * x_5)))$,

as required.
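The structure of the proof, in which every bracketing ends with a final step Y ∗ Z, translates directly into a recursive enumeration of all bracketings. The sketch below is illustrative only: the names `all_bracketings`, `circ` and `oplus` are my own, and the two operations are borrowed from Examples 3.1.1 and 3.1.2.

```python
def all_bracketings(xs, op):
    """Set of values obtained from every way of bracketing xs[0] op ... op xs[-1]."""
    if len(xs) == 1:
        return {xs[0]}
    results = set()
    # The final step is always Y op Z, with Y built from xs[:r] and Z from xs[r:].
    for r in range(1, len(xs)):
        for y in all_bracketings(xs[:r], op):
            for z in all_bracketings(xs[r:], op):
                results.add(op(y, z))
    return results

def circ(a, b):
    return a + b + a * b      # associative (Example 3.1.1)

def oplus(a, b):
    return a * a + b * b      # not associative (Example 3.1.2)

print(all_bracketings((1, 2, 3, 4, 5), circ))   # {719}
print(all_bracketings((1, 2, 3), oplus))        # the two values 34 and 170
```

For the associative operation every bracketing collapses to a single value, as the lemma predicts; for the non-associative one the set has more than one element.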

If a binary operation is associative then the above lemma tells us that computing products of elements is straightforward because we never have to worry about how to evaluate them as long as we maintain the order of the elements. We now consider a special case of this result. Let a be any real number. Define the nth power $a^n$ of a, where n is a natural number, as follows: $a^1 = a$ and $a^n = a a^{n-1}$ for any n ≥ 2. Generalized associativity tells us that $a^n$ can in fact be calculated in any way we like because we shall always obtain the same answer. The following result should be familiar. I shall ask you to prove it in the exercises.

Lemma 3.1.7 (Laws of exponents). Let m, n ≥ 1 be any natural numbers.

1. $a^{m+n} = a^m a^n$.

2. $(a^m)^n = a^{mn}$.

It follows from the above lemma that powers of the same element a commute with one another: $a^m a^n = a^n a^m$ as both products equal $a^{m+n}$. Our goal now is to define what $a^m$ means when m is an arbitrary rational number. We shall be guided by the requirement that the above laws of exponents should still hold. We may extend the laws of exponents to allow m or n to be 0. The only way to do this is to define $a^0 = 1$, where 1 is the identity and a ≠ 0.

An extreme case! What about $0^0$? This is a can of worms. For this book, it is probably best to define $0^0 = 1$.

We have explained what $a^n$ means when n is positive but what can we say when the exponent is negative? In other words, what does $a^{-n}$ mean? We assume that the rules above still apply. Thus whatever $a^{-n}$ means we should have that $a^{-n}a^n = a^0 = 1$. It follows that $a^{-n} = \frac{1}{a^n}$. With this interpretation we have defined $a^n$ for all integer values of n.

We now investigate what $a^{1/n}$ should mean. If the laws of exponents are to continue holding, then $(a^{1/n})^n = a^1 = a$. It follows that $a^{1/n} = \sqrt[n]{a}$.

We may now calculate $a^{r/s}$: it is equal to

$a^{r/s} = (\sqrt[s]{a})^r$.

How do we calculate $(ab)^n$? This is just ab times itself n times. But the order in which we multiply a’s and b’s doesn’t matter and so we can arrange all the a’s to the front. Thus $(ab)^n = a^n b^n$.
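The definitions of this section can be checked on examples. In the sketch below, the recursive function `power` is my own name for the definition in the text ($a^0 = 1$, $a^n = a a^{n-1}$, $a^{-n} = 1/a^n$), applied to a non-zero `Fraction` so that the arithmetic is exact. Rational exponents involve roots, so for those I fall back on floating-point numbers, where equality can only be tested approximately.

```python
from fractions import Fraction
import math

def power(a, n):
    """Integer power of a non-zero Fraction a, following the definitions in the text."""
    if n < 0:
        return 1 / power(a, -n)     # a^(-n) = 1 / a^n
    if n == 0:
        return Fraction(1)          # a^0 = 1
    return a * power(a, n - 1)      # a^n = a * a^(n-1)

a, b = Fraction(3, 2), Fraction(-5, 7)
for m in range(-3, 4):
    for n in range(-3, 4):
        assert power(a, m + n) == power(a, m) * power(a, n)
        assert power(power(a, m), n) == power(a, m * n)
    assert power(a * b, m) == power(a, m) * power(b, m)

# a^(r/s) compared with (s-th root of a)^r, here with a = 2, r = 3, s = 2.
r, s = 3, 2
lhs = 2.0 ** (r / s)
rhs = (2.0 ** (1 / s)) ** r
print(math.isclose(lhs, rhs))  # True
```

Passing these checks for a few values is evidence, not proof; the laws themselves are the subject of the exercises.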

We also have similar results for addition. We define 2x = x + x and nx = x + . . . + x where the x occurs n times. We have 1x = x and 0x = 0.

Let $\{a_1, \dots, a_n\}$ be a set of n elements. If we write them all in some order $a_{i_1}, \dots, a_{i_n}$ then we have what is called a permutation of the elements. The following lemma can be treated as an axiom and the proof omitted until later.

Lemma 3.1.8 (Generalized commutativity). Let ∗ be an associative and commutative binary operation on a set X. Let $a_1, \dots, a_n$ be any n elements of X. Then

$a_1 * \dots * a_n = a_{i_1} * \dots * a_{i_n}$.

Proof. First prove by induction the result that

$a_1 * \dots * a_n * b = b * a_1 * \dots * a_n$.

Let $a_1, \dots, a_n, a_{n+1}$ be n + 1 elements. Consider the product $a_{i_1} * \dots * a_{i_n} * a_{i_{n+1}}$. Suppose that $a_{n+1} = a_{i_r}$. Then

$a_{i_1} * \dots * a_{i_r} * \dots * a_{i_n} * a_{i_{n+1}} = (a_{i_1} * \dots * a_{i_n}) * a_{n+1}$

where the expression in the brackets is a product of some permutation of the elements $a_1, \dots, a_n$. We have used here our result above. But by the induction hypothesis (IH), we may write $a_{i_1} * \dots * a_{i_n} = a_1 * \dots * a_n$. Hence the original product equals $a_1 * \dots * a_n * a_{n+1}$, as required.
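Generalized commutativity can be illustrated by evaluating a product over every ordering of its factors. This is a sketch with names of my own choosing (`products_over_orders`); it uses addition and multiplication of integers as the associative, commutative operations, and subtraction as an operation that is neither.

```python
from functools import reduce
from itertools import permutations
from operator import add, mul, sub

def products_over_orders(elements, op):
    """Values of a_{i_1} op ... op a_{i_n} over every ordering, evaluated left to right."""
    return {reduce(op, order) for order in permutations(elements)}

elements = [2, 3, 5, 7]
print(products_over_orders(elements, add))           # {17}
print(products_over_orders(elements, mul))           # {210}
print(len(products_over_orders(elements, sub)) > 1)  # True
```

For addition and multiplication all 24 orderings give one value; for subtraction they do not, so the hypotheses of the lemma cannot simply be dropped.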

3.1.3 Sigma notation

At this point, it is appropriate to introduce some useful notation. Let $a_1, a_2, \dots, a_n$ be n numbers. Their sum is $a_1 + a_2 + \dots + a_n$ and because of generalized associativity we don’t have to worry about brackets. We now abbreviate this as

$$\sum_{i=1}^{n} a_i.$$

Here $\sum$ is Greek ‘S’ and stands for Sum. The letter i is called a subscript. The equality i = 1 tells us that we start the value of i at 1. The n written above the $\sum$ stands for the equality i = n and tells us that we end the value of i at n. Although I have started the sum at 1, I could, in other circumstances, have started at 0, or any other appropriate number. This notation is very useful and can be manipulated using the rules above. If 1 < s < n, then we can write

$$\sum_{i=1}^{n} a_i = \sum_{i=1}^{s} a_i + \sum_{i=s+1}^{n} a_i.$$

If b is any number then

$$b\left(\sum_{i=1}^{n} a_i\right) = \sum_{i=1}^{n} b a_i$$

is the generalized distributivity law that you are asked to prove in the exercises. These uses of sigma-notation shouldn’t cause any problems.
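Sigma notation corresponds closely to a loop that runs the subscript from its lower to its upper limit. The sketch below uses made-up numbers and stores $a_1, \dots, a_n$ in a Python list, so that $a_i$ is `a[i - 1]`; it checks the splitting rule and the generalized distributivity law on that one example.

```python
a = [4, 8, 15, 16, 23, 42]   # a_1, ..., a_n stored as a[0], ..., a[n-1]
n = len(a)
b = 3

total = sum(a[i - 1] for i in range(1, n + 1))       # sum over i = 1, ..., n

s = 2                                                # any s with 1 < s < n
head = sum(a[i - 1] for i in range(1, s + 1))        # sum over i = 1, ..., s
tail = sum(a[i - 1] for i in range(s + 1, n + 1))    # sum over i = s+1, ..., n
print(total == head + tail)                          # True

scaled = sum(b * a[i - 1] for i in range(1, n + 1))  # sum of b * a_i
print(b * total == scaled)                           # True
```

Note that `range(1, n + 1)` is needed to include the upper limit n, since Python ranges stop one short of their second argument.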

The most complicated use of $\sum$-notation arises when we have to sum up what is called an array of numbers $a_{ij}$ where 1 ≤ i ≤ m and 1 ≤ j ≤ n. This arises in matrix theory, for example. For concreteness, I shall give the example where m = 3 and n = 4. We can therefore think of the numbers $a_{ij}$ as being arranged in a 3 × 4 array as follows:

$$\begin{array}{cccc}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34}
\end{array}$$

Observe that the first subscript tells you the row and the second subscript tells you the column. Thus $a_{23}$ is the number in the second row and the third column. Now we can add these numbers up in two different ways getting the same answer in both cases. The first way is to add the numbers up along the rows. So, we calculate the following sums

$$\sum_{j=1}^{4} a_{1j}, \quad \sum_{j=1}^{4} a_{2j}, \quad \sum_{j=1}^{4} a_{3j}.$$

We then add up these three numbers

$$\sum_{j=1}^{4} a_{1j} + \sum_{j=1}^{4} a_{2j} + \sum_{j=1}^{4} a_{3j} = \sum_{i=1}^{3}\left(\sum_{j=1}^{4} a_{ij}\right).$$

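The second way is to add the numbers up along the columns. So, we calculate the following sums

$$\sum_{i=1}^{3} a_{i1}, \quad \sum_{i=1}^{3} a_{i2}, \quad \sum_{i=1}^{3} a_{i3}, \quad \sum_{i=1}^{3} a_{i4}.$$

We then add up these four numbers

$$\sum_{i=1}^{3} a_{i1} + \sum_{i=1}^{3} a_{i2} + \sum_{i=1}^{3} a_{i3} + \sum_{i=1}^{3} a_{i4} = \sum_{j=1}^{4}\left(\sum_{i=1}^{3} a_{ij}\right).$$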

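The fact that

$$\sum_{i=1}^{3}\left(\sum_{j=1}^{4} a_{ij}\right) = \sum_{j=1}^{4}\left(\sum_{i=1}^{3} a_{ij}\right)$$

is a consequence of the generalized commutativity law that you are asked to prove in the exercises. We therefore have in general that

$$\sum_{i=1}^{m}\left(\sum_{j=1}^{n} a_{ij}\right) = \sum_{j=1}^{n}\left(\sum_{i=1}^{m} a_{ij}\right).$$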
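The interchange of the two sums is easy to see in code, where each $\sum$ becomes one level of a nested loop. The following sketch uses an arbitrary 3 × 4 array of my own choosing, stored as a list of rows, so that $a_{ij}$ is `a[i - 1][j - 1]`.

```python
a = [
    [2, 0, 1, 5],
    [3, 1, 4, 1],
    [2, 7, 1, 8],
]
m, n = 3, 4

# Rows first: for each i, sum over j, then add up the m row totals.
rows_first = sum(sum(a[i][j] for j in range(n)) for i in range(m))

# Columns first: for each j, sum over i, then add up the n column totals.
cols_first = sum(sum(a[i][j] for i in range(m)) for j in range(n))

print(rows_first, cols_first)  # 35 35
```

Both orders visit the same twelve numbers, only in a different sequence, which is exactly why generalized commutativity applies.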

3.1.4 Infinite sums

What I have defined so far are finite sums and form part of algebra. There are also infinite sums

$$\sum_{i=1}^{\infty} a_i$$

which form part of analysis, the subject that provides the foundations for calculus. There is one place where we use infinite sums in everyday life, and that is in the decimal representations of numbers. Thus the fraction $\frac{1}{3}$ can be written as 0.3333 . . . and this is in fact an infinite sum: it means the infinite sum

$$\sum_{i=1}^{\infty} \frac{3}{10^i}.$$


But in general infinite sums are problematic. For example, consider the infinite sum

$$S = \sum_{i=1}^{\infty} (-1)^{i+1}.$$

So, this is just

S = 1 − 1 + 1 − 1 + . . .

What is S? Your first instinct might be to say 0 because

S = (1 − 1) + (1 − 1) + . . .

But it could equally well be 1 calculated as follows

S = 1 + (−1 + 1) + (−1 + 1) + . . .

In fact, it could even be $\frac{1}{2}$ since S + S = 1 and so $S = \frac{1}{2}$. There is clearly something seriously awry here, and it is that infinite sums have to be handled very carefully if they are to make sense. Just how is the business of analysis and won’t be an issue in this book.
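One way to see the difference between the two infinite sums above is to look at what happens when only finitely many terms are added. The notion of a partial sum used here belongs to analysis and is not developed in this book, so treat the following purely as an illustrative sketch; the helper name `partial_sums` is my own.

```python
from fractions import Fraction
from itertools import accumulate, count, islice

def partial_sums(term, how_many):
    """The first `how_many` partial sums of term(1) + term(2) + ..."""
    return list(islice(accumulate(term(i) for i in count(1)), how_many))

# 1 - 1 + 1 - 1 + ... : the partial sums never settle down.
print(partial_sums(lambda i: (-1) ** (i + 1), 6))   # [1, 0, 1, 0, 1, 0]

# 3/10 + 3/100 + ... : the gap to 1/3 shrinks by a factor of 10 each time.
thirds = partial_sums(lambda i: Fraction(3, 10 ** i), 5)
print([str(Fraction(1, 3) - p) for p in thirds])
# ['1/30', '1/300', '1/3000', '1/30000', '1/300000']
```

The first list oscillates between 1 and 0, which is consistent with the conflicting answers obtained for S; the second gets steadily closer to one third.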

Warning! ∞ is not a number. It simply tells us to keep adding on terms for increasing values of i without end, so we never write

$$\frac{3}{10^{\infty}}.$$

Exercises 3.1

1. Prove the following identities using the axioms introduced.

(a) $(a + b)^2 = a^2 + 2ab + b^2$.

(b) $(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$.

(c) $a^2 - b^2 = (a + b)(a - b)$.

(d) $(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$.

2. Calculate the following.

(a) $2^3$.

(b) $2^{1/3}$.

(c) $2^{-4}$.

(d) $2^{-3/2}$.

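3. Assume that $a_{ij}$ are assigned the following values

$$\begin{array}{llll}
a_{11} = 1 & a_{12} = 2 & a_{13} = 3 & a_{14} = 4 \\
a_{21} = 5 & a_{22} = 6 & a_{23} = 7 & a_{24} = 8 \\
a_{31} = 9 & a_{32} = 10 & a_{33} = 11 & a_{34} = 12
\end{array}$$

Calculate the following sums.

(a) $\sum_{i=1}^{3} a_{i2}$.

(b) $\sum_{j=1}^{4} a_{3j}$.

(c) $\sum_{i=1}^{3}\left(\sum_{j=1}^{4} a_{ij}^2\right)$.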

4. Let a, b, c ∈ R. If ab = ac is it true that b = c? Explain.

5. Laws of exponents.

(a) Prove by induction that $a^{m+n} = a^m a^n$. To do this, fix m and then prove the result by induction on n. Deduce that it holds for all m.

(b) Prove by induction that $(a^m)^n = a^{mn}$. To do this, fix m and then prove the result by induction on n. Deduce that it holds for all m.

6. Prove by induction that the left generalized distributivity law holds

$a(b_1 + b_2 + b_3 + \dots + b_n) = ab_1 + ab_2 + ab_3 + \dots + ab_n$,

for any n ≥ 2.

3.2 Solving quadratic equations

The previous section might have given the impression that algebraic calculations are routine. In fact, once you pass beyond linear equations, they usually require good ideas. The first place where a good idea is needed is in solving quadratic equations. Quadratic equations were solved by the Babylonians and the Egyptians and are dealt with in all school algebra courses. I have included them here because I want to show you that you don’t have to remember a formula to solve such equations; what you have to remember is a method. Let’s recall some definitions. An expression of the form

$ax^2 + bx + c$

where a, b, c are numbers and a ≠ 0 is called a quadratic polynomial or a polynomial of degree 2. The numbers a, b, c are called the coefficients of the quadratic. A quadratic where a = 1 is said to be monic. A number r such that

$ar^2 + br + c = 0$

is called a root of the polynomial. The problem of finding all the roots of a quadratic is called solving the quadratic. Usually this problem is stated in the form: ‘solve the quadratic equation $ax^2 + bx + c = 0$’. It is called an equation because we have set the polynomial equal to zero. I shall now show you how to solve a quadratic equation without having to remember a formula. Observe first that if $ax^2 + bx + c = 0$ then

$$x^2 + \frac{b}{a}x + \frac{c}{a} = 0.$$

Thus it is enough to find the roots of monic quadratics. We shall solve this equation by trying to do the following: write $x^2 + \frac{b}{a}x$ as a perfect square plus a number. This will turn out to be the crux of solving the quadratic. We shall illustrate our construction by using some diagrams. First, we represent geometrically the expression $x^2 + \frac{b}{a}x$.


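[Figure: a square of side x next to a red rectangle with sides x and $\frac{b}{a}$; a dotted line divides the rectangle.]

Now cut the red rectangle into two pieces along the dotted line and rearrange them as shown below.

[Figure: the square of side x with two rectangles, each with sides x and $\frac{b}{2a}$, and a small dotted square.]

It is now geometrically obvious that if we add in the small dotted square, we get a new bigger square. This explains why the procedure is called completing the square. We now express in algebraic terms what these diagrams suggest.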

$$x^2 + \frac{b}{a}x = \left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2}\right) - \frac{b^2}{4a^2} = \left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2}.$$


We therefore have that

$$x^2 + \frac{b}{a}x = \left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2}.$$

Look carefully at what we have done here: we have rewritten the left-hand side as a perfect square (the first term on the right-hand side) plus a number (the second term on the right-hand side). It follows that

$$x^2 + \frac{b}{a}x + \frac{c}{a} = \left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2} + \frac{c}{a} = \left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}.$$

Setting the last expression equal to zero and rearranging, we get

$$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}.$$

Now take square roots of both sides, remembering that a non-zero number has two square roots:

$$x + \frac{b}{2a} = \pm\sqrt{\frac{b^2 - 4ac}{4a^2}}$$

which of course simplifies to

$$x + \frac{b}{2a} = \pm\frac{\sqrt{b^2 - 4ac}}{2a}.$$

Thus

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},$$

the usual formula for finding the roots of a quadratic.

Example 3.2.1. Solve the quadratic equation

$$2x^2 - 5x + 1 = 0$$

by completing the square. Divide through by 2 to make the quadratic monic giving

$$x^2 - \frac{5}{2}x + \frac{1}{2} = 0.$$

We now want to write

$$x^2 - \frac{5}{2}x$$

as a perfect square plus a number. We get

$$x^2 - \frac{5}{2}x = \left(x - \frac{5}{4}\right)^2 - \frac{25}{16}.$$

Thus our quadratic becomes

$$\left(x - \frac{5}{4}\right)^2 - \frac{25}{16} + \frac{1}{2} = 0.$$

Rearranging and taking roots gives us

$$x = \frac{5}{4} \pm \frac{\sqrt{17}}{4} = \frac{5 \pm \sqrt{17}}{4}.$$

We now check our answer by substituting each of our two roots back into the original quadratic and ensuring that we get zero in both cases.
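The method of the example can be written as a short routine that follows the same steps: make the quadratic monic, complete the square, take roots. This is a sketch only; the function name `solve_by_completing_square` is my own, it works in floating point (so the final check is approximate), and it returns only real roots.

```python
import math

def solve_by_completing_square(a, b, c):
    """Real roots of ax^2 + bx + c = 0 (a != 0), found by completing the square."""
    p, q = b / a, c / a          # monic form: x^2 + px + q = 0
    half = p / 2                 # (x + p/2)^2 = p^2/4 - q
    rhs = half * half - q
    if rhs < 0:
        return []                # no real number squares to a negative number
    root = math.sqrt(rhs)
    return sorted({-half - root, -half + root})

roots = solve_by_completing_square(2, -5, 1)
print(roots)
for x in roots:
    # Substitute back into the original quadratic, as the example recommends.
    print(math.isclose(2 * x * x - 5 * x + 1, 0, abs_tol=1e-12))  # True
```

Collecting the two candidates in a set means a repeated root is reported once.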

For the quadratic equation

$ax^2 + bx + c = 0$

the number $D = b^2 - 4ac$, called the discriminant of the quadratic, plays an important role.

• If D > 0 then the quadratic equation has two distinct real solutions.

• If D = 0 then the quadratic equation has one real root repeated. In this case, the monic quadratic $x^2 + \frac{b}{a}x + \frac{c}{a}$ is the perfect square $\left(x + \frac{b}{2a}\right)^2$.

• If D < 0 then we shall see that the quadratic equation has two complex roots which are complex conjugate to each other. This is called the irreducible case.
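The three cases above amount to a simple test on the sign of D. Here is a minimal sketch with function names of my own (`discriminant`, `classify`), intended for integer or exact rational coefficients so that the comparison with zero is reliable.

```python
def discriminant(a, b, c):
    """D = b^2 - 4ac for the quadratic ax^2 + bx + c."""
    return b * b - 4 * a * c

def classify(a, b, c):
    """Describe the roots of ax^2 + bx + c = 0 from the sign of D."""
    d = discriminant(a, b, c)
    if d > 0:
        return "two distinct real roots"
    if d == 0:
        return "one repeated real root"
    return "two complex conjugate roots"

print(classify(2, -5, 1))   # two distinct real roots (D = 17)
print(classify(1, 2, 1))    # one repeated real root (D = 0)
print(classify(1, 0, 1))    # two complex conjugate roots (D = -4)
```

With floating-point coefficients the test `d == 0` would be fragile, which is why exact inputs are assumed here.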

If we put $y = ax^2 + bx + c$ then we may draw the graph of this equation. The roots of the original quadratic therefore correspond to the points where this graph crosses the x-axis. The diagrams below illustrate the three cases that can arise.

[Figure: three graphs of $y = ax^2 + bx + c$, one for each of the cases D > 0, D = 0 and D < 0.]

Exercises 3.2

1. Calculate the discriminants of the following quadratics and so determine whether they have two distinct roots, or repeated roots, or no real roots.

(a) $x^2 + 6x + 5$.

(b) $x^2 - 4x + 4$.

(c) $x^2 - 2x + 5$.

2. Solve the following quadratic equations by completing the square. Check your answers.

(a) $x^2 + 10x + 16 = 0$.

(b) $x^2 + 4x + 2 = 0$.

(c) $2x^2 - x - 7 = 0$.

3. I am thinking of two numbers x and y. I tell you their sum a and their product b. What are x and y in terms of a and b?

4. Let $p(x) = x^2 + bx + c$ be a monic quadratic with roots $x_1$ and $x_2$. Express the discriminant of p(x) in terms of $x_1$ and $x_2$.

5. This question is an interpretation of part of Book X of Euclid. We shall be interested in numbers of the form $a + \sqrt{b}$ where a and b are rational and b > 0 where $\sqrt{b}$ is irrational¹.

(a) If $\sqrt{a} = b + \sqrt{c}$ where $\sqrt{c}$ is irrational then b = 0.

(b) If $a + \sqrt{b} = c + \sqrt{d}$ where a and c are rational and $\sqrt{b}$ and $\sqrt{d}$ are irrational then a = c and $\sqrt{b} = \sqrt{d}$.

(c) Prove that the square roots of $a + \sqrt{b}$ have the form $\pm(\sqrt{x} + \sqrt{y})$.

3.3 Order

In addition to algebraic operations, the real numbers are also ordered: we can always say of two real numbers whether they are equal or whether one of them is bigger than the other. I shall write down first the axioms for order that hold both for rational and real numbers. The following notation is important. If a ≤ b and a ≠ b then we write a < b and say that a is strictly less than b.

Axioms for order

(O1) For every element a we have a ≤ a.

Footnote 1 (to Exercise 5 of Exercises 3.2): Remember that irrational means not rational.


(O2) If a ≤ b and b ≤ a then a = b.

(O3) If a ≤ b and b ≤ c then a ≤ c.

(O4) Given any two elements a and b then either a ≤ b or b ≤ a or a = b.

If a > 0 then we say that it is positive and if a < 0 we say it is negative.

(O5) If a ≤ b and c ≤ d then a + c ≤ b + d.

(O6) If a ≤ b and c is positive then ac ≤ bc.

The only axiom that you really have to watch is (O6). Here is an example of a proof using these axioms.

Example 3.3.1. We prove that a ≤ b if, and only if, 0 ≤ b − a, that is, if, and only if, b − a is positive or zero. Since this statement involves an ‘if, and only if’ there are, as usual, two statements to be proved. Suppose first that a ≤ b. By axiom (O5), we may add −a to both sides to get a + (−a) ≤ b + (−a). But a + (−a) = 0 and b + (−a) = b − a, by definition. It follows that 0 ≤ b − a. Now we prove the converse. Suppose that 0 ≤ b − a. By definition, b − a = b + (−a). Thus 0 ≤ b + (−a). By axiom (O5), we may add a to both sides to get 0 + a ≤ (b + (−a)) + a. But 0 + a = a and (b + (−a)) + a quickly simplifies to b. We have therefore proved that a ≤ b, as required.

Exercises 3.3

1. Prove that between any two distinct rational numbers there is another rational number.

2. Prove the following using the axioms.

(a) If a ≤ b then −b ≤ −a.

(b) $a^2$ is positive for all a ≠ 0.

(c) If 0 < a < b then $0 < b^{-1} < a^{-1}$.

3.4 The real numbers

The axioms I have introduced so far apply equally well to both the rational numbers Q and the real numbers R. But we have seen that although Q ⊆ R the two sets are not equal because we have proved that $\sqrt{2} \notin Q$. In fact, we shall see later that there are many more irrational numbers than there are rational numbers. In this section, I shall explain the fundamental difference between rationals and reals. This material will not be needed in the rest of this book; instead, its role is to connect with the foundations of calculus, that is, with analysis.

It is convenient to write K to mean either Q or R in what follows because I want to make the same definitions for both sets. A non-empty subset A ⊆ K is said to be bounded above if there is some number b ∈ K so that for all a ∈ A we have that a ≤ b. For example, the set $A = \{2^n : n \geq 0\}$ is not bounded above since its elements get bigger and bigger without limit. On the other hand, the set $B = \{(\frac{1}{2})^n : n \geq 0\}$ is bounded above, for example by 1. A non-empty subset A as above is said to have a least upper bound if you can find a number a ∈ K with the following two properties: first of all, a must be an upper bound for A and second of all if b is any upper bound for A then a ≤ b. We shall now apply these definitions to a result we obtained earlier.

Let

$A = \{a : a \in Q \text{ and } a^2 \leq 2\}$

and let

$B = \{a : a \in R \text{ and } a^2 \leq 2\}$.

Then A ⊆ Q and B ⊆ R. Both sets are bounded above: the number $1\frac{1}{2}$, for example, works in both cases. However, I shall prove that the subset A does not have a least upper bound, whereas the subset B does.

Let’s consider subset A first. Suppose that r were a least upper bound. I claim that $r^2$ would have to equal 2 which is impossible because we have proved that $\sqrt{2}$ is irrational.

Suppose first that $r^2 < 2$. Then I claim there is a rational number $r_1$ such that $r < r_1$ and $r_1^2 < 2$. Choose any rational number h such that 0 < h < 1 and

$$h < \frac{2 - r^2}{2r + 1}.$$

Put $r_1 = r + h$. By construction $r_1 > r$. We calculate $r_1^2$ as follows

$$r_1^2 = r^2 + 2rh + h^2 = r^2 + (2r + h)h < r^2 + (2r + 1)h < r^2 + (2 - r^2) = 2.$$

Thus $r_1^2 < 2$ as claimed. But this contradicts the fact that r is an upper bound of the set A.

Suppose now that $2 < r^2$. Then I claim that I can find a rational number $r_1$ such that $r_1 < r$ and $2 < r_1^2$. Put $h = \frac{r^2 - 2}{2r}$ and define $r_1 = r - h$. Clearly, $0 < r_1 < r$. We calculate $r_1^2$ as follows

$$r_1^2 = r^2 - 2rh + h^2 = r^2 - (r^2 - 2) + h^2 > r^2 - (r^2 - 2) = 2.$$

But this contradicts the fact that r is supposed to be a least upper bound, since $r_1$ is then a smaller upper bound of A.

We have therefore proved that if r is a least upper bound of A then $r = \sqrt{2}$. But this is impossible because we have proved that $\sqrt{2}$ is irrational. Thus the set A does not have a least upper bound in the rationals. However, by essentially the same reasoning the set B does have a least upper bound in the reals: the number $\sqrt{2}$. This motivates the following definition. It is this axiom that is needed to develop calculus properly.

The completeness axiom for R

Every non-empty subset of the reals that is bounded above has a leastupper bound.
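The two constructions used in the argument about the set A can be tried out with exact rational arithmetic. In the sketch below the function names are my own, and the particular choice of h in the first function (half the bound, capped at one half) is just one convenient rational satisfying the two conditions the text asks for.

```python
from fractions import Fraction

def larger_with_square_below_2(r):
    """Given rational r > 0 with r^2 < 2, return rational r1 > r with r1^2 < 2."""
    bound = (2 - r * r) / (2 * r + 1)
    h = min(Fraction(1, 2), bound / 2)   # rational with 0 < h < 1 and h < bound
    return r + h

def smaller_with_square_above_2(r):
    """Given rational r > 0 with r^2 > 2, return rational r1 < r with r1^2 > 2."""
    h = (r * r - 2) / (2 * r)
    return r - h

r = Fraction(7, 5)                   # 49/25 < 2
r1 = larger_with_square_below_2(r)
print(r < r1, r1 * r1 < 2)           # True True

r = Fraction(3, 2)                   # 9/4 > 2
r1 = smaller_with_square_above_2(r)
print(0 < r1 < r, r1 * r1 > 2)       # True True
```

Applying either function repeatedly produces rationals whose squares creep towards 2 without ever reaching it, which is the numerical face of the missing least upper bound in Q.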

The Peano Axioms

Set theory is supposed to be a framework in which all of mathematics can take place. Let me briefly sketch out how we can construct the real numbers using set theory. The starting point is the Peano axioms studied by G. Peano (1858–1932). These deal with a set P and an operation on this set called the successor function which for each n ∈ P produces a unique element $n^+$. The following four axioms should hold:

(P1) There is a distinguished element of P that we denote by 0.

(P2) There is no element n ∈ P such that $n^+ = 0$.

(P3) If m, n ∈ P and $m^+ = n^+$ then m = n.

(P4) If X ⊆ P is such that 0 ∈ X and such that $n^+ \in X$ whenever n ∈ X, then X = P.

By using ideas from set theory, one shows that P is essentially the set of natural numbers together with its operations of addition and multiplication.

The natural numbers are deficient in that it is not always possible to solve equations of the form a + x = b because of the lack of negative numbers. However, we can use set theory to construct Z from N by using ordered pairs. The idea is to regard (a, b) as meaning a − b. However, there are many names for the same negative number so we should have (0, 1) and (2, 3) and (3, 4) all signifying the same number: namely, −1. To make this work, one uses another idea from set theory, that of equivalence relations which we shall meet later. This gives rise to the set Z. Again using ideas from set theory, the usual operations can be constructed on Z.

But the integers are deficient because we cannot always solve equations of the form ax + b = 0 because of the lack of rational numbers. To construct them we use ordered pairs again. This time (a, b), where b ≠ 0, is interpreted as $\frac{a}{b}$. But again we have the problem of multiple names for what should be the same number. Thus (1, 2) should equal (−1, −2) should equal (2, 4) and so forth. Once again this problem is solved by using an equivalence relation, and once again, the set which arises, which is denoted by Q, is endowed with the usual operations.

As we have seen, the rationals are deficient in not containing numbers like $\sqrt{2}$. The intuitive idea behind the construction of the reals from the rationals is that we want to construct R as all the numbers that can be approximated arbitrarily closely by rational numbers. To do this, we form the set of all subsets X of Q which have the following characteristics: X ≠ ∅, X ≠ Q, if x ∈ X and y ≤ x then y ∈ X, and X doesn’t have a biggest element. These subsets are called Dedekind cuts and should be regarded as defining the real number r so that X consists of all the rational numbers less than r.
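The ordered-pair constructions of Z and Q described above leave the equivalence relations for later. As a sketch only, and assuming the standard choices (which the text does not state here), two pairs name the same integer when a + d = c + b and the same rational when ad = cb; the function names `same_integer` and `same_rational` are my own.

```python
def same_integer(p, q):
    """(a, b) and (c, d) both stand for a - b = c - d exactly when a + d = c + b."""
    (a, b), (c, d) = p, q
    return a + d == c + b

def same_rational(p, q):
    """(a, b) and (c, d), with b and d non-zero, both stand for a/b = c/d exactly when ad = cb."""
    (a, b), (c, d) = p, q
    return a * d == c * b

print(same_integer((0, 1), (2, 3)), same_integer((2, 3), (3, 4)))      # True True
print(same_rational((1, 2), (-1, -2)), same_rational((1, 2), (2, 4)))  # True True
print(same_integer((0, 1), (1, 0)))                                    # False
```

Both tests are phrased so that they use only operations already available before the new numbers are built: addition of natural numbers in the first case, multiplication of integers in the second.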
