lecture notes on discrete mathematics · 5 ordering and strong induction 34 ... question of the...

M . A N D R E W M O S H I E R

C O N T E M P O R A R YD I S C R E T E

M AT H E M AT I C S

M & H P U B L I S H I N G

Copyright © 2018 M. Andrew Moshier

PUBLISHED BY M&H PUBLISHING

MANDREWMOSHIER .GITHUB . IO/LNDM/

Contents

What is this about? 4

I Natural Numbers and Induction 9

1 The Natural Numbers 11

2 Arithmetic 18

3 Laws of Arithmetic 22

4 Examples of Recursion and Induction 31

5 Sets as Types of Mathematical Data 37

6 Functions and Operations 44

7 The Set of Natural Numbers 60

8 Ordering and Strong Induction 67

3

9 Minimum and Maximum 73

10 Divisibility: Ordering the Natural Numbers by Multiplication 77

11 Greatest Common Divisors and Least Common Multiples 85

12 Prime Numbers 94

What is this about?

D ISCRETE MATHEMATICS consists of many individual topics that, imprecisely, contrast with continuousmathematics, e.g., topics like calculus, real analysis, differential equations, and so on. But a sharp contrastbetween discrete and continuous mathematics is mainly a convenience. We put some topics in one basket,others in a different basket, but things are more connected than that.

Discrete mathematics texts tend to be disorganized, featuring disparate topics that start looking like anIsland of Misfit Math — stuff that just does not fit anywhere else. In this text, I take the view that discretemathematics comprises a coherent set of ideas centered around some key themes:

• The natural numbers and closely related inductive structures are central.

• Mathematical data is “typed”. Natural numbers consistute a type of data; real numbers constitute adifferent type of data; polynomials with real coefficients constitute yet another type of data, and so on.

• Algebraic (equational and inequational) reasoning provides key techniques.

• Mathematics itself is computational — at least it is extremely productive to think that way.

• Abstraction and use of analogy is a crucial means of discovering new ideas, and of tying what we knowtogether.

In these notes, we deal head on with mathematics as the study of abstract structure. This can be frustrat-ing at first because the concrete applications are not always obvious. The pay off comes when we see thatgeneralization from particulars leads to much wider applications than we could have anticipated.

For example, 1239283+ 111111

(a number with more that 100 trillion digits) is the same as 111111

+ 1239289.These numbers are far too big to add explicitly. We can not actually hope to perform the additions andcompare the results. Yet, we are completely confident that if we could perform the two additions, the resultswould be the same.

We should be able to convince ourselves that m+n = n+m is true for any two counting numbers, not justby taking it for granted. Figuring out how to convince someone that something like this must true is actuallythe main activity of mathematics.

Persuasion

Mathematicians spend a lot of time thinking about what is, or isn’t true. But that would be pretty useless,if they did not then communicate the result of their thinking. To actually put mathematics to work, you must

5

learn how to pursuade someone else to accept a proposed solution to a problem. It is not enough to knowhow to solve problems.

The standards for persuasion in mathematics are very high. We do not settle for “preponderance ofevidence” or “beyond reasonable doubt”, or “that seems reasonable.” We aim for “beyond any doubt.” Inmost human endeavours (including in the sciences), that standard would be paralyzing, but in mathematics,where abstract structure is the object of investigation, it is not only within reach, but it is the standard thatmakes sense.

In mathematics, a proof is a convincing argument. So it falls into the literary genre of pursuasive writing.This point of view is critical for understanding how mathematicians work.

The emphasis on writing is important. After all, no one cares what I think, unless I manage to convincehim or her that my thinking is on track.

Most exercises in these notes are not of the sort: “”Solve for x: 4x− 3 = x2". Rather, an exercise might say“Show that the equation 4x− 3 = x2 has two solutions.” The former would tempt you simply to write (witha box around the answer) something like x = 1, 3. The latter asks you to convince the reader that these twovalues are the only solutions.

Particularly in the first part of the text, I will emphasize fairly formulaic writing. This is meant to giveyou practice with precise mathematical writing.

Roughly, an Outline

These notes start by investigating two ideas that are fundamental to all of mathematics.

• First, we look carefully at the idea of counting things. You might think there is not much to say aboutsomething you have been doing reliably for most of your life. But it turns out that arithmetic (additionand multiplication, for example) is tightly related to counting. Understanding arithmetic relies on under-standing how counting works.

• Second is the idea of putting things into lists. Again, you have been doing this a long time. You writedown shopping lists, you take two lists and merge them into one list, and so on. Lists are crucial foreverything in mathematics, even how we write a number in base ten. After all, what exactly does 3429mean? A numeral is formed by putting symbols (digits) next to each other into a certain order. So 3429 isnot the same numeral as 3492. To understand this, one needs to understand position. That means puttingthings in order. Lists provide a general way to think about this.

After looking at the basics of counting and of lists, we turn to the foundational language of contemporarymathematics: sets, functions and relations. Informally, a set is simply a collection of things, such as N — thecollection of natural numbers. A function is a correlation of the things in one set with things in another set.A familiar example is f(x) = x2 that correlates with any real number x, the square x2. A relation is what issays it is: a way to relate things in one set to things in another.

To make sense of sets, functions and relations, we need to consider some basic questions:

• What does it mean say two sets are equal?

• What does it mean to say two functions are equal?

• What does it mean to say two relations are equal?

6

• How can we specify or “build” particular sets?

• How can we specify or “build” particular functions?

• How can we speficy or “build” particular relations?

The middle part of this text concerns these questions.In the last part of the text, we investigate a variety of useful mathematical ideas using the techniques de-

veloped in the first two parts. The emphasis is mostly on practical ideas that either emerge from computing,or are directly useful there.

Some basics

Before we launch into mathematics proper, we need to sort out some basic notation and discuss how to readthis text. As usual, we typically use single letters, for example, in x2 + 3x+ 1, as placeholders for values. Theseare usually called “variables”.

There is nothing special about the name ‘x’. I might as well have written y2 + 3y+ 1 or a2 + 3a+ 1. Thedifference is only apparent when we make assumptions about the variables. For example, look at these two:

• “Suppose x is a positive real number. Consider x2 + 3x+ 1.”

• “Suppose x is a positive real number. Consider y2 + 3y+ 1.”

Clearly, they mean different things. In the first one, the polynomial expression is guaranteed to be positive.In the second one, the polynomial is not constrained at all (and if y is −1, might be negative). Contrast thiswith these two:

• “Suppose x is a positive real number. Consider x2 + 3x+ 1.”

• “Suppose y is a positive real number. Consider y2 + 3y+ 1.”

Evidently, these two really do mean the same thing. It is important to keep details like this in mind. Tounderstand how a variable should be interpreted look for the context in which it was first mentioned.

Frequently, I will use letters in certain parts of the alphabet to refer to specific types on values. For exam-ple, I will mostly usem, n and p to refer to natural numbers, and keep using x, y and z for generic values(not from a particular type of value). Usually f, g, h usually stand for functions. Later on, other types ofentities will be needed. I will use Greek letters for some things. So it is in your interest to be familiar withthe Greek alphabet — at least, some commonly used letters.

Sometimes, a datum should have no meaning on its own. Symbols are just names with no interpretation.I will typically write a symbol in a type face like so: waffles. Thus x is a symbol, whereas x is a variable. Inhandwritten text, a symbol might be indicated by underlining.

Two other type faces will come in handy: san serif and SMALL CAPS. Usage will not make much sensenow. I’ll wait to explain how I intend to use them when the time is right.

Parentheses are used in mathematics is several distinct ways. This, unfortunately, can be a sorce ofmisunderstanding. Here are some of the ways they are used.

• Function application. We frequently write f(5) to mean “apply the function f to the value 5.

• Pairs, triples and so on. One may write (4, 5) for the pair consisting of 4 then 5.

7

• Ranges. In some situations, the notation (0, 1) stands for the collection of real numbers that lie strictlybetween 0 and 1.

• Grouped sub-expressions. In an expression such a+ b · c, you know to evaluate b · c before added a to theresult. But to get the result of multiplying a+ bwith c, you would write (a+ b) · c.

Usually, these different uses of parentheses will not cause any confusion because the context will make itclear what is meant. I just caution you to pay close attention.

Throughout the text, I refer to certain parts as algorithms or definitions. The distinction is not precise. Theidea is that an algorithm is meant to convey a precise way of calculating something; a definition is meantto convey a way to talk about something. For example, addition of two natural numbers is presented as analgorithm because there is a precise sequence of concrete steps to take when adding. But ≤ (the relation ofbeing less than or equal to) is presented as a definition because reallym ≤ n just means thatm plus someother natural number equals n. There are no concrete steps in mind for determining whetherm ≤ n is true— though one could design an algorithm to test whetherm ≤ n is true or not.

This usage of the word algorithm is not meant to be informal. An algorithm in the text is not a programin a particular programming language. Rather, an algorithm is just a precise description of a procedure thatmight be carried out by hand or by computer.

Part I

Natural Numbers and Induction

10 CONTEMPORARY DISCRETE MATHEMATICS

OUR EARLIEST MATHEMATICAL EXPERIENCE is learning to count. Arithmetic on the natural numbers,simple as it seems, exhibits many of the features of higher contemporary mathematics that concern usthroughout this course. By starting with a careful look at arithmetic, you prepare yourself for what comes inthis course and in later mathematics.

An important technique for reasoning about natural numbers and many other mathematical structures iscalled induction. Induction is the main theme of Part I, and is used throughout the rest of the text.

1The Natural Numbers

THE NATURAL NUMBERS have to do with counting. They do not in-CHAPTER GOALS

Investigate the structure of countingand explain that structure via postulatesfor the natural numbers.

clude negatives or fractions or irrationals. In this chapter, the structureof natural numbers is the topic. To hone in on that structure, we lookat similar structures that fail to capture basic aspects of counting. Bo-gus structures can be ruled out by being precise about how countingbehaves. The result is a system of easily remembered postulates thatdistinguish the structure of natural numbers from all others.

The natural numbers constitute the most important data type ofmathematics and computing. Indeed, the 19th century mathematicianKronecker said “Die ganzen Zahlen hat der liebe Gott gemacht, allesandere ist Menschenwerk” (the counting numbers were made by God,the rest is man’s handiwork). The task at hand here is to characterizethe system of natural numbers, commonly refered to by the symbol N,as the system that, in a precise sense, captures the idea of counting (oriteration or repetition).

1.1 The Basic Picture

A question “how many X’s are there?” is distinct from “How much x isINFORMAL DEFINITION OF NATU-

RAL NUMBERS

A counting or natural number is anumber that can be used to answer aquestion of the form “How many X’s arethere?”

there?”. The latter sort of question could be answered with 12 (as long

as we know how we are measuring). On the other hand, the answer toa “how many” question could be 1, 2, 3 and so on. The answer couldalso be 0 (for example, “how many math professors own unicorns?”).On the other hand, “how many?” can not be answered with 1

2 , π, −5and so on.

Natural numbers may be pictured like stepping stones as in Fig-ure 1.1.


0 1 2 3 . . . Figure 1.1: A picture of the naturalnumbers

Not all “stepping stone” pictures are acceptable as pictures of thenatural numbers. Figures 1.2, 1.3 and 1.4 illustrate three ways not topicture the natural numbers.

0 . . .

Figure 1.2: Forks in the path

0 Figure 1.3: Nowhere to go

. . . . . . Figure 1.4: Nowhere to start

Examples of how things can go wrongplay an important role in math. They arecalled counter-examples.

These incorrect pictures can be ruled out by explaining the basicvocabulary of counting.

THE NATURAL NUMBERS 13

VOCABULARY 1: Basic Vocabulary of Natural Numbers

The natural numbers have the following features.

• There is a special natural number, zero, denoted by 0.

• For any natural number n, there is a unique next natural num- The notation ny is not standard. I useit to distinguish the successor of n fromn+ 1, because the latter presupposes thatwe already know what + means. It istoo early in the discussion to do that.

ber, called the successor of n. In these notes, the successor of n isdenoted by ny.

The collection of natural numbers is denoted by N. To indicate This (N) is very common usage. Anymathematician will know immediatelythat you mean the natural numberswhen you use this symbol.

that a variable n is intended to range over natural numbers, we writen ∈N.

According to Vocabulary 1, expressions like 0, 0y, 0yy, 0yyy

denote natural numbers. Of course, we usually abbreviate these bywriting 0, 1, 2, 3. But the characters 1, 2, 3, etc., are not related to eachother. Vocabulary 1 makes it clear that 0y is meant to be the numberafter 0; 0yy, the number after that, and so on. We will want to be ableto switch between the familiar “decimal” notation and “successor”notation whenever it is convenient. Assuming the plus sign is part of We will not worry about parentheses

too much in this notation. For example,(0yy)yyy and 0yyyyy mean thesame thing.

an augmented vocabulary, 2+ 2 = 4 could be written as

0yy + 0yy = 0yyyy.

The next exercises emphasize the switch back and forth.


Exercises

Convert the following expres-sions from decimal notation tosuccessor notation.

1. 9

2. 10

3. 4+ 3

4. n+ 4

Convert the following expres-sions from successor notation todecimal notation.

5. 0yyyy

6. n+ 0y = 0yy +m

7. 5yy

8. 0y + 0yy

1.2 Narrowing to actual counting

F IGURES 1.5 AND 1.6 illustrate correct pictures of Vocabulary 1.Obviously, they are not pictures of the natural numbers. It is notenough just to spell out carefully how to talk about counting. We alsoneed to explain what is not allowed. Postulates play that role.

0

Figure 1.5: A strange way to count

0

Figure 1.6: Another strange way tocount

Figure 1.5 is flawed because 0 has a predecessor: a value n satisfyingny = 0. Figure 1.6 is flawed because an element has two distinctpredecessors: 0y = 0yyyy.

We can rule these problems out by postulating behavior of 0 andy.


POSTULATE 1: Nothing Precedes 0

For every natural number n, it is the case that ny 6= 0.

POSTULATE 2: Predecessors are Unique

For any natural numbersm and n, ifmy = ny thenm = n. In other words, if p is the successor of anatural number, then it’s predecessor isunique.

These postulates eliminate Figures 1.5, 1.6 and similar bad pictures.But there is still a subtle problem. Consider Figure 1.7.

0 . . .

?

Figure 1.7: A model of the naturalnumbers?

This picture uses the basic vocabulary, and satisfies the first twopostulates. Yet, it is not a picture of natural numbers because it has“extra stuff” in it. In this case, ? is the only “extra stuff,” but we couldhave concocted more complicated examples.

The challenge is to rule out any possible “extra stuff”. Were weto erase the circle labelled ? and any arrows leading to and fromit, the remaining part of Figure 1.7 would still live up to the basicvocabulary of natural numbers (Vocabulary 1). This is exactly whatmakes for “extra stuff”: elements that can be removed without doingany harm to our ability to talk about all natural numbers. Clearly,we can not remove 0 because the basic vocabulary requires it. Andwe can not remove 0y because the basic vocabulary requires 0 tohave a successor. Likewise, we cannot remove 0yy because the basicvocbulary requires 0y to have a successor. And so on. So the valuesthat can be reached by a sequence of steps (that is, by counting) from0must stay. Anything else would be extra. This leads to our lastpostulate.


POSTULATE 3: The Axiom of Induction

No natural numbers can be removed without violating Vocabu- This postulate is different from theprevious two. In the Postulates 1 and 2,the axiom refers directly to the things(natural numbers) being talked aboutusing the vocabulary. The Axiom ofInduction refers to the whole collectionof natural numbers at once. Statementsthat refer to individual things are calledfirst-order statements. Those that refer towhole collections are called second-order.

lary 1.

Believe it or not, the vocabulary and three postulates we havestated here completely characterize the standard picture of the naturalnumbers. In other words, any picture that satisfies these will look thesame. A rigorous proof of this is possible, but not necessary for now.

To avoid overly wordy expressions, we let N stand for the collec-tion of all natural numbers, and let n ∈ N indicate the suppositionthat n is a natural number. So for example, saying “Every n ∈ N iseither odd or even” is the same as saying “Every natural number n iseither odd or even.”

In Chapter 5, we will make the notation n ∈ N more precise, andwill generalize to other “data types”.

Exercises

9. Each of the following pictures violates the vocabulary or one ormore of our postulates. For each, explain what is violated.

(a) 0

(b)

(c) 0

(d) 0

10. I have in mind a picture for Vocabulary 1 an Postulates 1 and 2. Exercise 10 shows that in the naturalnumbers, if n 6= 0 then n = ky forsome k. In other words, every n ∈ N

falls into exactly one of two mutuallyexclusive cases: either n = 0 or n has apredecessor.

Furthermore, in that picture, I have in mind an element n for which(a) n 6= 0 and (b) n has no predecessor (that is, n 6= ky for everynatural number k). Convince me that the picture fails to satisfyPostulate 3.

11. Draw three different pictures of situations that satisfy all the pos-tulates except that they fail Postulate 1. So there will be an arrowfrom a bubble into the bubble labelled 0. The result must satisfy allother postulates including the Axiom of Induction.

2Arithmetic

ADDING AND MULTIPLYING arise from counting. In this chapter,CHAPTER GOALS

Define addition and multiplicationas mechanical computations based oncounting.

we explore how to define them computationally, purely in terms ofcounting.

2.1 Basic Arithmetic Operations

Addition works by counting ahead. For example, to add 4+ 5, startwith 4 and then count five more. This is how you first learned to add.Multiplication works by counting a number of additions. To multiply2 · 3, add 2 three times: 2+ 2+ 2. The following capture the idea.

ALGORITHM 1: Addition

The sum ofm ∈N and n ∈N is a natural numberm+n, calculatedby the following:

m+ 0 = m

m+ ky = (m+ k)y for any k

ARITHMETIC 19

ALGORITHM 2: Multiplication

The product ofm ∈ N and n ∈ N, is a natural numberm · n, In this text, I will always denote multi-plication with an explicit ·. Though itis conventional to write things like 2minstead of 2 ·m, I keep the multiplicationsign to make things clearer.

calculated by the following:

m · 0 = 0m · ky = m+ (m · k) for any k

An algorithm is a precise specification for how to calculate some-thing. The foregoing are both algorithms in so far as they spell outhow to calculatem+n andm · n for anym and n.

The vocabulary and postulates of natural numbers ensure thatthere are indeed unique operations + and · that satisfy the equations.A detailed proof of this fact is not illuminating right now. We willbe better off proving that recursive algorithms like this always work.Then sum and product will turn out just to be particular instances of ageneral method for specifying algorithms. We will return to this pointin Chapter 7. For now, let us just sketch why the definition of + makessense.

Let us say that a natural number n is “good form” ifm + n isdefined uniquely by the equations, and “bad form” if not. Evidently,0 is good form because of the equationm+ 0 = m. Suppose some k isgood form. Then ky is also good form because if we supposem+ k

is defined uniquely, the second equation defining addition tells us thatm+ ky = (m+ k)y. So the numbers that are good form constitutea picture of Vocabulary 1: 0 is good form and if k is good form, so isky. The Axiom of Induction tells us there are no bad natural numbers.For if there were, we could remove all of them and still have a modelof the vocabulary. But Induction says we can not remove anything. Sothere can not be any bad numbers.

In the next chapter, we will establish a proof pattern called “simplearithmetic induction” generalizing this argument. This is one of themainstays of mathematics and computing.

Induction also shows that the equations for products define anoperation. Writing out an argument now won’t be helpful, though. Itwill be better to wait to look at the general pattern.


EXAMPLE 1:

Calculate 4+ 3:When you need to calculate something“by hand”, use this format. Write thefirst expression followed by =, followedby the first step in the calculation. Alignthe equals signs and write subsequentsteps on new lines. This makes it mucheasier to check your own work. Alsouse another column to write notesexplaining why a step is valid. Theseexplanatory notes are not alwaysneeded when a step is obvious, but fornow get in the habit of writing this way.

3+ 4 = 0yyy + 4 [3 abbreviates 0yyy]

= (0yy + 4)y [ky +n = (k+n)y]

= (0y + 4)yy [Same reason]

= (0+ 4)yyy [Same reason]

= 4yyy [0+n = n]

= (0yyyy)yyy [4 abbreviates 0yyyy]

= 0yyyyyyy [Remove unneeded parentheses]

= 7 [7 abbreviates 0yyyyyyy]

A product can be calculated similarly. Consider 3 · 2.

3 · 2 = 3 · 0yy · 3 [2 abbreviates 0yy]

= 3+ (3 · 0y) [m · ky = m+ (m · k)]= 3+ (3+ (3 · 0)) [Same reason]

= 3+ (3+ 0) [m · 0 = 0]= 3+ 3 [m+ 0 = m]

= 3+ 0yyy [3 abbreviates 0yyy]

= (3+ 0yy)y [m+ ky = (m+ k)y]

= (3+ 0y)yy [Same reason]

= (3+ 0)yyy [Same reason]

= 3yyy [m+ 0 = m]

= (0yyy)yyy [3 abbreviates 0yyy]

= 0yyyyyy [Remove unnecessary parentheses]

= 6 [6 abbreviates 0yyyyyy]

We certainly will not want to calculate this way in real life. Even asimple calculation like 3 · 2 = 6 took thirteen steps. But these examples You knew this already. All we have

done is make it explicit.show how addition and multiplication are built from counting.

Exercises

12. Calculate these sums, following the previous example to writeeach step of your calculation explicitly. Include the reason for each

ARITHMETIC 21

step (as in the previous example). Take care to lay out the chain ofequalities correctly, and do not skip any steps.

(a) 2+ 4

(b) 4+ 2

(c) 3+ (3+ 1)

(d) (3+ 3) + 1

(e) 0+ 3

13. Notice that it takes more steps to calculate 2+ 4 than 4+ 2, eventhough they produce the same answer. Explain why.

14. Calculate the following values, writing each step explicity.

(a) 2 · 3

(b) 0 · 2

(c) 2 · (2 · 2)

(d) 3 · (2+ 1)

(e) (3 · 2) + (3 · 1)

15. Write a definition of exponentiation via equations form0 and formky . Follow the pattern of definition for addition and multiplica-tion.

3Laws of Arithmetic

YOU DO NOT NEED TO CALCULATE anything to know thatCHAPTER GOALS

Remind ourselves of the basic laws ofaddition and multipication. Establishthe pattern of proof by simple arithmeticinduction. Illustrate how to use thepattern by proving that these laws hold.

335 · (23+ 17) = 335 · 23+ 335 · 17.

This is because you know that multiplication always distributes overaddition. This is a law of arithmetic. You can think of several otherlaws we use all the time, such as commutativity —m+ n = n+m—and associativity —m+ (n+ p) = (m+n) + p.

We can prove that these laws hold by virtue of the structure of We use the names of these laws fre-quently. Many of them show up innon-numerical contexts as well andhave common, familiar names. So youwill do yourself a favor by memorizingthem. A few of them will be more ob-scure to you right now, but they are allimportant for actually using arithmatic.

counting and the definitions of addition and multiplication. In thischapter, we will see that the Axiom of Induction gives rise to animportant pattern for proving facts about natural numbers. Later wewill generalize the techique to other kinds of mathematical data.

The following table summarizes several useful laws of arithmeticon the natural numbers. Most of them will be familiar to you.

LAWS 1: Basic Laws of Arithmetic

For any natural numbers,m, n and p: This table is organized to emphasizesimilarities between addition andmultiplication. Pay attention to that.

LAWS OF ARITHMETIC 23

Associativity m+ (n+ p)= (m+n) + p

m · (n · p)= (m · n) · p

Commutativity m+n= n+m

m · n= n ·m

Identity m+ 0= m = 0+m

m · 1= m = 1 ·m

Positivity ifm+n = 0 then n = 0

Integrality ifm · n = 1 then n = 1

Cancellativity ifm+ p = n+ p thenm = n

ifm · (py) = n · (py) thenm = n

Distributivity m · (n+ p)= (m · n) + (m · p)

Case Distinction ifm 6= 0 thenm = ky for some kNo Zero Divisors my · ny 6= 0

The Law of Case Distinction was the subject of Chapter 1 Exer-cise 10. Go back and look at that exercise again.

3.1 Inductive Proofs

Suppose we wish to prove that every natural number has some prop-erty. For example, we wish to prove that 0+m = m for all naturalnumbersm. This is obviously true. But why? Our definition of +says thatm+ 0 = m, not the other way around. So something needsto be explained. We could try saying “addition is commutative, so0+m = m+ 0 and by the definitionm+ 0 = m”. But now we haveanother problem. How do we know that addition is commutative?Again, the defining equations for addition do not explicitly tell us this.

The Axiom of Induction holds the key to proving that some fact istrue for all natural numbers. The idea is simple, but will take somegetting used to. Back to the example of 0+m = m. We could simplytry to check for allmwhether this is true. First, is 0+ 0 = 0 true? Yes.Is 0+ 0y = 0y? Now we need to do a bit of calculating. 0+ 0y =

(0+ 0)y according to the definition of addition. But we just noted that0+ 0 = 0. Now is 0+ 0yy = 0yy? Again, we will need to do somecalculating. Evidently, “just check everything” is not going to workbecause we will never complete the task.


On the other hand, think about the natural numbers that do happento satisfy 0+m = m. If just these natural numbers provide us with amodel of Vocabulary 1, then the Axiom of Induction asserts that wecan not remove any natural numbers from the ones that happen tosatisfy 0+m = m. So all natural numbers must.

Let us see how it works in practice. Say that a natural numberm isnice if it has the property 0+m = m. We want to prove that all naturalnumbers are nice. What we really do is show that the nice naturalnumbers form a picture of Vocabulary 1.

• Evidently, 0 is nice because 0+ 0 = 0. That is part of the algorithmfor addition.

• Suppose k is a nice natural number.

• For the nice natural numbers to form a model of Vocabulary 1, itmust be the case that ky is also nice. So we check:

0+ ky = (0+ k)y [By the addition of algorithm.]

= ky [Because we suppose k is nice]

Putting these together, we see that the nice natural numbers form thecharacteristic “stepping stone” picture: zero is nice, any successor of anice number is nice. So the Axiom of Induction ensures that all naturalnumbers are nice.

A proof employing the Axiom of Induction in this way is called aproof by simple arithmetic induction, or just a proof by induction, forshort. We will see more general forms of induction later.


OUTLINE OF INDUCTIVE PROOFS

To make inductive proofs easier to understand, we often write themin the three-step outline suggested above. To prove that all naturalnumbers are “nice”:

• [Basis] Prove that 0 is nice.

• [Inductive Hypothesis] Assume that k is nice for some (unspecified)k ∈N.

• [Inductive Step] Prove that ky is also nice. You may use the inductive hypothesis(the assumption that k is nice) in theinductive step.From these, the nice numbers constitute a model of Vocabulary 1).So

by the Axiom of Induction, all natural numbers are nice.

The familiar laws of arithmetic are now provable by induction.We do not have to accept them “from authority.” We can check forourselves why they must be true.

PROPOSITION 1: Addition is associative.

For any natural numbersm, n and p,

m+ (n+ p) = (m+n) + p.

PROOF : Suppose thatm and n are fixed natural numbers (not known A common technique is to suppose thatsome of the values in the proof are fixed,but otherwise not special in any way. Aproof that does not use anything specialabout them will be valid for all.

to us). We prove the result by induction on p.

• [Basis] The goal of the basis is the show thatm+ (n+ 0) = (m+

n) + 0. Butm+ (n+ 0) = m+n = (m+n) + 0 due to the definitionof +.

• [Inductive Hypothesis] Assume thatm+ (n+ k) = (m+n) + k forsome k ∈N.

• [Inductive Step] The goal is to show thatm+ (n+ ky) = (m+n) +

ky.

m+ (n+ ky) = m+ (n+ k)y [Def. of +]

= (m+ (n+ k))y [Same reason]

= ((m+n) + k)y [Inductive Hypothesis]

= (m+n) + ky [Def. of +]


Therefore (by the Axiom of Induction),m+(n+p) = (m+n)+p holdsfor all natural numbers p. Since this argument does not depend onany extra assumptions aboutm and n, it holds for all natural numbersm and n. �

Mathematicians typically use thesymbol � as a punctuation mark toindicate the end of a proof.

In the remainder of this section, we illustrate the technique ofsimple arithmetic induction by proving other laws of arithmetic.

PROPOSITION 2: 0 is the identity for addition.

For anym, it is the case thatm+ 0 = m andm = 0+m. Because we are currently discussingnatural numbers and nothing else, itis safe to write “For anym” instead of“For any natural numberm”.

PROOF : The first equality is true by the definition of +. The secondequality,m = 0+m, was proved above as an introduction to induction.�

To prove that addition is commutative, we need another fact abouthow successor and addition interact.

LEMMA 1: Successors migrate Mathematicians use the word lemma toindicate that a certain fact is needed tomake other proofs easier. It is sort of likea subroutine.For anym and n,m+ny = my +n.

PROOF : We prove the claim by induction on n. Supposem is somefixed natural number.

• [Basis] The goal is to show thatm+ 0y = my + 0.

m+ 0y = (m+ 0)y [Def. of +]

= my [Def. of +]

= my + 0 [Def. of +]

• [Inductive Hypothesis] Supposem+ ky = my + k for some k ∈N.

• [Inductive Step] The goal is to show thatm+ kyy = my + ky.

m+ kyy = (m+ ky)y Def. of +

= (my + k)y [Inductive Hypothesis]

= my + ky [Def. of +]

So (m+ n)y = my + n for all n. Because the proof does not dependon any assumption aboutm, it is valid for allm. �


Roughly speaking this lemma, and the definition of +, permit us tomovey anywhere within an addition: my+n = (m+n)y = m+ny.So we are free to move a successor “out of the way” whenever weneed to. The next proof illustrates the point.

PROPOSITION 3: Addition is commutative.

For any natural numbersm and n,

m+n = n+m

PROOF : We need to show thatm+ n = n+m for allm and n. Thistime, the proof is by induction onm. Fix a value for n ∈N.

• [Basis] The goal is to show that 0+n = n+ 0. But 0+n = n = n+ 0

holds because of Proposition 2 and the definition of +.

• [Inductive Hypothesis] Assume that k+n = n+ k for some k.

• [Inductive Step] The goal is to show that ky +n = n+ ky.

ky +n = (k+n)y [Lemma 1]

= (n+ k)y [Inductive Hypothesis]

= n+ ky [Def. of +]

Therefore,m+ n = n+m for allm. Since this argument does notdepend on any assumptions about n, it is valid for all n. �

The next law may be less familiar to you. Roughly, it says that wecan “subtract” equals and get equals. But because actual subtractiondoes not generally make sense for natural numbers (5 − 7meansnothing without introducing negative numbers), cancellativity is thebest we can do.


PROPOSITION 4: Addition is cancellative.

For any natural numbersm, n and p, ifm+ p = n+ p thenm = n.

PROOF : The proof is by induction on p. Assume thatm and n are This proof is a little subtler than theprevious ones. But notice that it stillfollows the same form: Basis, InductiveHypothesis, Inductive Step.

some fixed natural numbers.

• [Basis] The goal is to show that ifm + 0 = n + 0, thenm = n.Supposem + 0 = n + 0. Then immediately by definition of +,m = n.

• [Inductive Hypothesis] Assume that the following statement is truefor some k ∈N: ifm+ k = n+ k thenm = n.

• [Inductive Step] The goal is to show that ifm+ ky = n+ ky thenm = n. Supposem+ ky = n+ ky [call this supposition (*) forreference]. Then

(m+ k)y = m+ ky [Def. of +]

= n+ ky [By the supposition (*)]

= (n+ k)y [Definition of +]

Thus (m+ k)y = (n+ k)y. Hencem+ k = n+ k by Postulate 2. Soby the inductive hypothesis,m = n.

Therefore,m+ p = n+ p impliesm = n for all p. Since this argumentdoes not depend on any assumptions regardingm and n, it is valid forallm and all n. �

To prove that multiplication is commutative, we will need thefollowing lemmas (analogous to Proposition 2 and Lemma 1).

LEMMA 2: 0 is an “annihilator”

For any natural number n, it is the case that 0 · n = 0 = n · 0.

PROOF : The algorithm for multiplication directly specifies thatn · 0 = 0.

The proof that 0 · n = n is by induction on n.

• [Basis] The goal is to show that 0 · 0 = 0. But this is directly from thealgorithm for multiplication.

• [Inductive Hypothesis] Assume that 0 · k = 0 for some k ∈N.


• [Inductive Step] We must show that 0 · ky = 0.

0 · ky = 0+ 0 · k [Definition of ·]= 0+ 0 [Inductive Hypothesis]

= 0 [Definition of +]

�

LEMMA 3: Successors migrate in products


my · n = m · n+n.

PROOF : The proof is by induction on n. Assumem is fixed.

• [Basis] The goal is to show thatmy · 0 = m · 0+ 0. Clearlymy · 0 =0 = 0+ 0 = m · 0+ 0 all follow from the definitions of + and ·.

• [Inductive Hypothesis] Supppose thatmy · k = m · k+ k for somek ∈N.

• [Inductive Step] The goal is to show thatmy · ky = m · ky + ky.

my · ky = my +my · k [Exercise]

= (m+my · k)y [Exercise]

= (m+ (m · k+ k))y [Exercise]

= ((m+m · k) + k)y [Exercise]

= (m · ky + k)y [Exercise]

= m · ky + ky [Exercise]

Because the proof does on depend on assumptions aboutm, it is truefor allm. �

Some of the other laws are left as exercises.

Exercises

16. Prove that 1 is the identity for multiplication. That is, 1 ·m = m andm = m · 1.


17. Write out the entire proof of Lemma 3 providing the justificationsfor each line of the calculation in the inductive step.

18. Prove that multiplication distributes over addition [m · (n+ p) =

m · n+m · p] by induction on p. You can use any of the lemmas andpropositions we have already proved.

(a) Prove the basis: m · (n+ 0) = m · n+m · 0.

(b) Write the inductive hypothesis.

(c) Prove the inductive step: m · (n+ ky) = m · n+m · ky

19. Prove that multiplication is associative [m · (n · p) = (m · n) · p] byinduction on p.

(a) Prove the basis: m · (n · 0) = (m · n) · 0.

(b) Write the inductive hypothesis.

(c) Prove the inductive step: m · (n · ky) = (m · n) · ky. Hint: Usethe Law of Distribution, which you just proved.

20. Prove that multiplication is commutative. Hint: Use the two Lem-mas we proved right before these exercises.

21. Prove the Law of Positivity: ifm+ n = 0, then n = 0. [Hint: UseCase Distinction.]

22. Prove the Law of Integrality: ifm · n = 1, then n = 1. [Hint: Showthat if n 6= 1, thenm · n 6= 1. By Case Distinction, n 6= 1means thateither n = 0 or n = kyy for some k.]

23. Prove the Law of No Zero Divisors. Hint: my · ny = my +my · n For the record, we don’t prove thecancellation law for multiplication here.We return to it Chapter 9 when we haveother techniques.

by definition of multiplication. No explicit induction is needed forthis proof.

4Examples of Recursion and Induction

ARITHMETIC OPERATIONS ON NATURAL NUMBERS like sum andCHAPTER GOALS

Define several additional operationson natural numbers using recursion.Prove useful facts about the operations,partly to get more practice deployingthe proof pattern of simple arithmeticinduction.

product have been defined by recursion. In this chapter, we look atseveral other operations that arise in applications. Some of these, suchas factorials, are familiar to you. Others are less so.

ALGORITHM 3: Factorial

For natural number n, define the factorial of n, written n!, by thefollowing:

0! = 1

(ky)! = k! · ky

ALGORITHM 4: Rising exponent

For natural numbers n andm, define the rising nth exponent ofm,writtenmn, by the following:

m0 = 1

mky = m · (my)k

The sense in which rising exponent generalizes factorial is summa-rized in the following lemmas. The first two establish a law similar tothe familiar fact about standard exponents that bx+y = bx · by.


LEMMA 4: Successor in rising exponents


mny = mn · (m+n).

PROOF : The proof is by induction on n.

• [Basis] The goal is to show thatm0y = m0 · (m+ 0) for allm ∈ N.Butm0y = m · (my)0 = m And likewisem0 · (m+ 0) = m.

• [Inductive hypothesis] Suppose that for some k ∈ N, it is the casethatmky = mk · (m+ k) for allm : N

• [Inductive Step] The goal is to show that mkyy = mky · (m+ ky) I omit the justifications for these calcula-tions because, by now, you ought to beable to supply them for yourself.

for allm ∈N.

mkyy = m · (my)ky

= m · (my)k · (my + k)

= mky · (m+ ky)

�

LEMMA 5: The law of rising exponents

For any natural numbersm, n and p,

mn+p = mn · (m+n)p.

PROOF : The proof is by induction on p.

• [Basis] The goal is to show thatmn+0 = mn · (m+ n)0. This isclearly true for anym and n because n+ 0 = n and (m+n)0 = 1.

• [Inductive hypothesis] Suppose that for some k ∈N,

mn+k = mn · (m+n)k

for all natural numbersm and n.

• [Inductive step] The goal is to show that

mn+ky = mn · (m+n)ky

EXAMPLES OF RECURSION AND INDUCTION 33

for anym and n.

mn+ky = m(n+k)y

= mn+k · (m+n+ k)

= mn · (m+n)k · (m+n+ k)

= mn · (m+n)ky

[By Lemma 4]

�

LEMMA 6: Factorial is a special case of rising exponent

For any natural number n,

n! = 1n

PROOF : We prove this by induction on n.

• [Basis] The goal is to show that 0! = 10. But 0! = 1 and 10 = 1 bydefinition.

• [Inductive hypothesis] Suppose that k! = 1k for some k,

• [Inductive step] The goal is to show that (ky)! = 1ky .

1ky

= 1k · (1+ k)= k! · ky

(ky)!

�

The triangular numbers are the numbers that are defined analo-gously to factorial. But instead of multiplying, we add.


ALGORITHM 5: Triangular numbers

Define tri(n) by the following:

tri(0) = 0

tri(ky) = tri(k) + ky

By unwinding this for exanple, we get tri(4) = (((0+ 1) + 2) + 3) + 4.Generally, a triangular number is the sum of consecutive naturalnumbers.

You may already know this formula:

tri(n) =n(n+ 1)

2.

There are nice geometric “proofs” to convince yourself that this is true.But these rely on informal (but nevertheless reliable) intuition aboutgeometry. For our purposes, though, we are interested in being moreprecise and in proving a generalization of this to higher dimensions,where geometric intuitions stop working.

For three dimensions, the idea is to stack balls into a triangularpyramid (a regular tetrahedron). So the number of balls needed istri(1) + tri(2) + · · · tri(n). In four dimensions (which you and I can’tvisualize), the idea is to “stack” increasing sized tetrahedra. And soon for even higher dimensions. We would like to figure out whetherthere is a nice equation that describes all of these higher dimensionalpyramids.

To start, let’s translate the known formula to our language fornatural numbers. At present, we do not have division. So althoughn(n+1)

2 it is perfectly clear, but it is not expressible in the notation wehave. Nevertheless, the equation can be rewritten into our language as

2 · tri(n) = n · (n+ 1).

Even better, we can replace n+ 1with ny, and recognize that n · ny

is actually the “rising square” n2. So an equation that expresses thesame fact as tri(n) = n(n+1)

2 , but uses the available notation is

2 · tri(n) = n2.

EXAMPLES OF RECURSION AND INDUCTION 35

Exercises

24. Prove that for every natural number n, it is the case that 2 · tri(n) =n2.

Let’s see what happens in three dimensions. If we think of tetra-hedral numbers as stacking triangles, we can define them by a verysimilar recursion to the triangular numbers.

ALGORITHM 6: Tetrahedral numbers

Define tet(n) by the following:

tet(0) = 0

tet(ky) = tet(k) + tri(ky)

Computing the first few values tet(n) and of n · ny · nyy, we see apattern.

n tri(n) tet(n) n · ny · nyy

0 0 0 0

1 1 0+ 1 = 1 6

2 3 1+ 3 = 4 24

3 6 4+ 6 = 10 60

4 10 10+ 10 = 20 120

5 15 15+ 20 = 35 210

Notice that 6 · tet(n) = n · ny · nyy, at least for the values we havecalculated to far. That seems promising, but now we have a puzzle.How is 6 related to three dimensional objects?

The three dimensional case of tetrahedra amounts to summingthe first n 2-dimensional values (tri(−)). Extending to dy dimen-sions, we sum the first n, d-dimensional values. Let us refer to thed-dimensional objects as “hyper-triangles”, and write trid(n) as thed-dimensional hyper-triangle with n on a side. We can thus definetridy(n) recursively:

tridy(0) = 0

tridy(ny) = tridy(n) + trid(n

y)

In particular, we expect tri2(n) to be the nth triangular number, so that


tri3(n) is the nth tetrahedral number, and so on.We have a problem, though. For this definition to work, we need to

say what tri0(n) is. That is geometrically a strange thing to ask about,since is seems to deal with 0-dimensional “hyper-triangles”. But let’ssee what happens. Evidently,

tri2(ny) = ny + tri2(n)

andtri2(n

y) = tri1(ny) + tri2(n).

So it has to be the case that tri1(ny) = ny. We already agreed thattri1(0) = 0. So in fact, tri1(n) = n for all n ∈N.

Unfolding the definition again leads to

ny = tet1(ny) = tri0(n

y) + tri1(n) = tri0(ny) +n.

So tet0(ny) = 1. Hence we can use the following complete definition.

tet0(n) = 1

tetdy(0) = 0

tetdy(ny) = tetdy(n) + tetd(n

y)

Exercises

25. Calculate tet4(3) in detailed steps.

26. Prove that for all natural number d and n, it is the case that d! ·trid(n) = nd. [Hint: Prove this by induction on d. The goal of theinductive step will be to show that (ky)! · triky(n) = nky for all n.Prove that by induction on n.]

5Sets as Types of Mathematical Data

THE DEVELOPMENT OF PROGRAMMING LANGUAGES has made theCHAPTER GOALS

In this Chapter, we introduce the basicstructure, vocabulary and notation forsets. This includes set membershipnotation x ∈ A and subset notationA ⊆ B. We also introduce the axiom ofextensionality, which characterizes setsas pure unstructured collections.

importance of data types clear. In most contemporary programminglanguages, each datum has an associated type. For example, codemight involve integer data separate from character data and separatefrom, say, floating point data (roughly, floating point data approxi-mates real number data in a way that is computationally tractible).Other data types include things like matrices, arrays, lists, and muchmore.

In mathematics, we speak about natural numbers, real numbers, inte-ger polynomials, complex matrices, continuous functions on the reals andso on. These are roughly the mathematical versions of programminglangauge data types. They help us to organize the data in much thesame way that data types do.

There are differences, however, between how mathematicianstypically think about types of data and how computer scientists do.

In mathematics, it is best to take as generous an outlook as possibleregarding what consistutes a type of data. This is for the practicalreason that there is no point in limiting th idea too soon. The mathe-matical notion of a set does the job.

In computing, we are forced to be very constrained in what shouldcount as a data type. This is for the different practical reason that aprogram has to be executible on an actual computer. So in computingneed a very precise (and relatively small) system of types that can beimplemented on a computer.

A set is, in the formulation of Cantor, jedes Viele, welches sich als Einesdenken lasst “any multiplicity which can be comprehended as one”.So N is a set because it is possible to think of the entire collectionof natural numbers as a single entity. For another example, severalplaying cards taken together form a single deck of cards; the deck


is comprehended as one thing. The several students taking DiscreteMath right now can be comprehended as one class. And so on.

Several sets of mathematical data are already known to you, and goby standard symbols.

SOME IMPORTANT SETS

The following sets are denoted by the special symbols: These symbols are very common. Youwill be understood by any mathemati-cian if you use them.

N = the set of natural numbers: 0, 1, 2, . . .

Z = the set of integers: . . . , −2, −1, 0, 1, 2, ldots

Q = the set of rational numbers:1

2,24

23and so on

R = the set of real numbers

C = the set of complex numbers

In Chapter ??, we introduced the notation n ∈ N to indicate that nis a natural number. We extend that notation to all other sets.

VOCABULARY 2: Basic Vocabulary of Sets

A set is an entity A that we can think of as consisting of elements.For any thing x, either x is in A or x is not in A. We write x ∈ A if

x is in A and x /∈ A if x is not in A. When x ∈ A, we say that x is anelement of A.

For variety, all of the following phrases mean the same thing:

• x is in A

• x is an element of A

• x is a member of A

• A contains x

• x belongs to A

To describe a set, we have to describe, in a precise way, its elements.For a small set, we can simply list the elements, using braces ({ and })as follows.

SETS AS TYPES OF MATHEMATICAL DATA 39

EXAMPLE 2:

A standard deck of poker cards can be described by

DECK = {A♣, 2♣, 3♣, 4♣, 5♣, 6♣, 7♣, 8♣, 9♣, 10♣, J♣, Q♣, K♣,

A♦, 2♦, 3♦, 4♦, 5♦, 6♦, 7♦, 8♦, 9♦, 10♦, J♦, Q♦, K♦,

A♠, 2♠, 3♠, 4♠, 5♠, 6♠, 7♠, 8♠, 9♠, 10♠, J♠, Q♠, K♠,

A♥, 2♥, 3♥, 4♥, 5♥, 6♥, 7♥, 8♥, 9♥, 10♥, J♥, Q♥, K♥}

The elements are arranged here conveniently, but we could just aswell have listed the cards in any shuffled order. The set (that is, thedeck itself) would be the same regardless of how it is shuffled.

The notation used braces {. . .} is a standard way to indicate that thisis meant to be a set.

We can also describe the set of ranks and the set of suits:

RANK = {A, 2, 3, 4, 56, 7, 8, 9, 10, J, Q, K}

SUIT = {♣,♦,♠,♥}.

As you know, the deck consists of 52 cards: one card for each possiblerank and suit. So DECK really is built from RANK and SUIT. We willreturn to other examples like this latter. For now, we note that DECK

could be described as RANK × SUIT.In a deck a five card hand is a subset of DECK consisting of five cards.

For example, {2♥, 2♣, 2♦, 10♠, 10♦} is a subset representing a fullhouse.

To describe larger sets (including infinite ones) we can not list theelements. So we will need other notation. For now, let us concentrateon small finite sets that we can list.

5.1 Subsets and Extensionality

A set is meant to be a collection: some things are in, others are not.Unlike a list, a set has no “initial” element. For example, the set {1, 2, 3}is the same as the set {2, 3, 1}, because they have exactly the sameelements; in contrast, the list [1, 2, 3] is not the same as the list [2, 3, 1]because, in particular, they have different initial items. To make thisprecise, we need to be clear about when sets are equal. That is madeeasier by introducing the important idea of subsets.


DEFINITION 1: Subsets

For sets X and Y, we say that X is a subset of Y provided that everyelement of X is an element of Y. We write this as X ⊆ Y, and say thatX is included in Y, or Y includes X. We may also write Y ⊇ X to meanthe same thing, and say that Y is a superset of X.

If X is not a subset of Y, we write X 6⊆ Y. If X ⊆ Y and Y 6⊆ X, then Xis called a proper subset of Y. To indicate that X is a proper subset ofY, we may write X ( Y. Some people write X ⊂ Y for proper subsets,but I will never use that notation.

To say X ⊆ Y is to say that for any x, if x ∈ X then x ∈ Y. In plainEnglish, we may translate it as “all Xs are Ys.” For example, supposeP is the set of all professors, and H is the set of all human beings. ThenP ⊆ H is the assertion that “all professors are human beings”.

EXAMPLE 3:

Here are some examples and counter-examples of the subset relation.

• {1, 2, 3} ⊆ {0, 1, 2, 3}.

• {{1}, } 6⊆ {0, 1} because the only element of the first set is {1}, and{1} /∈ {0, 1}.

• X ⊆ X for any set X because, trivially, every element of X is anelement of X.

• {} ⊆ X for any set X because every element of {} (there are none) isan element of X.

• {1, 2, 3} 6⊆ {0, 2, 3} because 1 ∈ {1, 2, 3} but 1 /∈ {0, 2, 3}.

• {1, 2, 3} ⊆ {2, 3, 1}.

• {♠} ⊆ SUIT.


Exercises

For each of the following pairs of sets, determine whether or not thefirst is a subset of the second. Explain each answer in one sentence.

27. {0, 1} and {1, 0}

28. {a,b, c,d} and {a,b,d, e, c}

29. {} and {{}}

30. {0, 3, 6, 10} and {10, 9, 8, 7, 6, 5, 4, 2, 1, 0}

31. {1, 2, 1} and {1, 2, 3}

We can summarize two important properties of ⊆.

Reflexivity For any set X, X ⊆ X. We say ⊆ is reflexive.

Transitivity For any sets X, Y and Z, if X ⊆ Y and Y ⊆ Z, then X ⊆ Z. Another familiar example of reflexivityand transitivity is the relation ≤ onthe natural numbers. In fact there aremany examples of reflexive, transitiverelations throughout mathematics. Suchrelations are called pre-orders and are thesubject of Chapter ??.

We say ⊆ is transitive.

Suppose X ⊆ Y and Y ⊆ X. Then, by definition X and Y have exactlythe same elements. Because sets are meant to be pure collections —with no other structure than ∈— X and Y must be equal. So we statethis as another princple.

POSTULATE 4: Set Extensionality

Anti-symmetry For sets X and Y, if X ⊆ Y and Y ⊆ X, then X = Y. We Is ≤ anti-symmetric on natural num-bers?say ⊆ is anti-symmetric.

Based on extensionality, we can already establish the followinguseful fact.


LEMMA 7: Uniqueness of the empty set

There is exactly one empty set.

PROOF : The set built from an empty list { } has no elements. So thereis at least one empty set.

Suppose E is a set with no elements. Then E ⊆ { } because eachelement of E (there are none) is an element of { }. Similarly, { } ⊆ E

because each element of { } (again, there are none) is an element of E.So by Principle 4, E = { }. �

DEFINITION 2: Empty set

The set { } is denoted by ∅.

Set extensionality expresses the idea that a set by itself does nothave any structure other than its membership. To emphasize this,sometimes it is useful to depict a set with elements scattered aboutsomething like

d

c b

a

.

Evidently, a re-arrangement of the elements does not change thedepicted set. So

d

c b

a

depicts the same set.Depicting a subset of a set is a matter of drawing a smaller bound-

ary around some of the elements as in

d

c b

a.


Exercises

Draw depictions of the following sets

32. {1, 4, 5, 2, 3}

33. {1, 2, 3, . . . , 23}

34. {a,b, c,d, e} and {c, e, f,g} on the same diagram

35. {a, e,b, c, e} [sic]

36. {1, 3, 6, 7} and {1, 3, 5, 6, 7, 9} on the same diagram

37. {FALSE, TRUE}

38. {•}

39. {TRUE, FALSE, 3, 5, 1, •}

6Functions and Operations

FUNCTIONS (perhaps in your calculus courses) are often talked aboutCHAPTER GOALS

Sets, by themselves, are nearly uselesswithout the related ideas of functionsand relations. In this chapter we intro-duce functions, and how to reason aboutthem.

as operations. For example,

sqr(x) = x2

can be seen as an operation that transforms a number x into anothernumber x2. But it can also be seen as an attribute of numbers (the“square of x” is an attribute that each number x possesses). The “oper-ational” view suggests that sqr has to do with calculating or computing.

Consider the square of the successor of a number, defined by

s-sqr(n) = (ny)2

Now considerf(n) = (n2 + 2 · n)y

The calculations of s-sqr and f are quite different. In the first, onesimply finds the successor of n and squares that. In the second, onesquares the value, doubles the value, adds those two results andthen takes the successor of that sum. Though s-sqr and f involve verydifferent steps, they arrive at the same results because for all n, it isthe case that (ny)2 = n2 + (2 · n)y.

So we have a decision to make. Should we regard s-sqr and f to beunequal (because they involve different computational steps) or equal(because they always lead to the same results)? If this seems like astrange question to you, you are in good company. Mathematicianstook centuries to settle on the concept of function that we use today.In contemporary mathematics, we agree that functions are equal ifthey produce equal results. So s-sqr = f. This is important, in partbecause it allows us to distinguish between calculation and the resultsof calculation. Another useful feature is that a function does have to

FUNCTIONS AND OPERATIONS 45

be defined by a calculation at all.

VOCABULARY 3: Basic Vocabulary of Functions

For a set X and a set Y, a function from X to Y is an entity fwith thefollowing feature.

• Each element a ∈ X determines an element of f(a) ∈ Y, spoken as“f of a”, or sometimes as “f evaluated at a.”

The parentheses are sometimes omitted, writing fa instead of f(a).Also, for some special purposes, fa is an alternative notation Thepoint is that a function from X to Y together with an element a ∈ Xdetermine an element f(a) ∈ Y.

To talk about functions, we also use the following language:

• We write f : X→ Y to indicate that f is a function from X to Y.

• For a function f : X→ Y, the set X is called the domain of f and Y iscalled the codomain of f. We write dom(f) for the domain of f andcod(f) for the codomain.

• For any set X, there is a function from X to X, called the identityfunction on X. We denote it by idX.

• For any two functions f : X → Y and g : Y → Z (note that the Occasionally, the circle is omitted, sogfmeans the same as g ◦ f (similarto the way the multiplication sign issometimes omitted when we write xyinstead of x · y). Notice that functionevaluation also can sometimes bewritten as fa instead of f(a). You willsee shortly why this is not going tocause any confusion.

codomain of fmust match the domain of g), there is a functiong ◦ f : X → Z. This is called the composition of g with f (or gfollowing f). Whenever dom(g) = cod(f), we say that g and f arecomposible.

A function may sometimes also be called a map, or a transforma-tion.

Function evaluation deals with how a function interacts withelements of the domain and codomain. Function composition dealswith how functions interact with other functions. We will need tounderstand how evaluation and composition relate. This is closelyrelated to the question of when two functions are equal.

6.1 Function Extensionality

The examples of + and u, and s-sqr and f, from the introductionillustrate that we need a way to say when two functions are equal.


PRINCIPLE 1: Function Extensionality

For functions f : X→ Y and g : X→ Y, Important: Equality of functions onlymakes sense when the two functionsshare the same domain and the samecodomain.

f = g if and only if f(x) = g(x) for all x ∈ X.

Now we turn to the interaction of functions and elements of sets.Together, functions and sets constitute what is called a category.We will encounter other categories later. For now we use the terminformally.

LEMMA 8: Sets and functions form a category

Sets and functions form a category via composition and identityfunctions. That is,

• for functions f : W → X, g : X→ Y and h : Y → Z,

h ◦ (g ◦ f) = (h ◦ g) ◦ f.

• For function g : X→ Y,

g = g ◦ idX and

g = idY ◦ g.

PROOF : The proof of this is an exercise. It follows from functionextensionality and another principle introduced below. �

Principle 1 and Lemma 8 are tied together by establishing how g ◦ fand id are evaluated.


PRINCIPLE 2: Composition and identities behave“element-wise”

• For any composible functions f : X → Y and g : Y → Z and any Suppose we omit most parenthesesand the ◦ in composition. Then thestatement about composition is (gf)a =

g(fa). The statement about identityis that idXa = a. Compare thesetwo equalities to the associativityand identity rules for composition:(hg)f = h(gf) and idYf = f. It appearsthat function evaluation f(a) may be aspecial sort of function composition. Wemake good on that idea later.

element a ∈ X, it is the case that

(g ◦ f)(a) = g(f(a)).

• For any set Y and any element a ∈ Y, it is the case that

a = idY(a).

Exercises

40. Prove that Lemma 8 is a consequence of Principles 1 and 2.

To define a function f : X → Y amounts to specifying a rule, ora calculation. We often do this by giving the function a name andspelling out the rule at the same time. So we might define a functionf : R→ R (R being the set of real numbers) by

f(x) = x2 + 4x+ 2.

Sometimes it is useful to have a rule without giving it a name. Forthat, we will use the “maps to” arrow 7→ . So we may define the samefunction f by saying that f : R→ R is given by the rule

x 7→ x2 + 4x+ 2.

The rule x 7→ x2 + 4x+ 2 is the same as the rule y 7→ y2 + 4y+ 2. Thevariable only serves as a place holder, so its particular name does notmatter. Saying that “f : R → R is given by the rule x 7→ x2 + 4x+ 2

means the same as saying that “f is defined by f(x) = x2 + 4x+ 2”.

6.2 Internal Diagrams

To depict a function on small finite sets, we can use the internal dia-grams of Chapter 5 with arrows indicating the input/output correla-


tion. For example,

d

c

b

a

4

3

2

1

depicts a function from the set {a,b, c,d} to the set {1, 2, 3, 4}.Composition can also be illustrated using internal diagrams. For

example,

d

c

b

a

4

3

2

1

u

v

w

x

zf

g

g ◦ f

Shows the composition of g following f.

Exercises

Use internal diagrams for the following exercises.

41. Depict four different functions from the set {1, 2, 3} to the set {FALSE, TRUE}. [Draw four different dia-grams.]

42. Depict all of the functions from {•} to {a,b, c}.

43. Depict all of the functions from {a,b, c} to {•}.

44. Are there any functions from {a,b} to ∅?

45. Are there any functions from ∅ to {a,b}? If there are, how many?

46. For each of the following diagrams, determine whether or not the diagram depicts a function. If not,explain why not.


(a)d

c

b

a

4

3

2

1

(b)d

c

b

a

4

3

2

1

(c)d

c

b

a

4

3

2

1

(d)d

c

b

a

4

3

2

1

47. Let A = {1, 2, 3}. Let B = {a,b, c, e} and let C = {FALSE, TRUE}. Depict some functions f : A→ B, g : B→ C,and g ◦ f.

48. Think about how you might depict a function h : A → A using only one picture of the set A. Describewhat you would do, and provide an example.

49. Suppose f : R → R is given by the rule x 7→ x2, suppose g : R → R is given by the rule x 7→ x − 1.Write rules for f ◦ g and g ◦ fwithout using the symbols f and g. Explain whether or not it is the case thatf ◦ g = g ◦ f.

6.3 External Diagrams

Internal diagrams are useful for getting the idea of function compo-sition, but they have limited utility because we can only depict smallfinite sets this way. When we are not concerned about the detailedinnards of small sets, but only with how functions interact, an individ-

ual function can be depicted simply as X f−→ Y. So a composition offunctions can be depicted as in

X Y

Z

f

gg ◦ f

We do not really need to draw g ◦ f as a separate arrow because the pathfrom X to Y to Z is already an implicit depiction of g ◦ f. So the simpler


diagram

X Y

Z

f

g

conveys the same information, namely, that f : X → Y and g : Y → Z

are functions and therefore, g ◦ f : X → Z is too. Notice that we alsodid not bother to draw the identity functions for X, Y and Z. If we diddraw them, they would appear as “loops”.

Now a diagram such as this

W X

Y Z

f

gh

k

depicts ten functions all together: for identity functions, the namedfunctions f, g, h, and k, and the two composite functions g ◦ f and k ◦ h.We say that the diagram commutes if g ◦ f = k ◦ h. More generally, toassert that a diagram commutes means all the compositions depictedimplicitly in the diagram that can be equal are equal. So a commuta-tive diagram is a graphical depiction of equations. This takes somegetting used to, but it is worth the practice.

Consider the Law of Associativity for composition. For composiblefunctions, we can draw the basic situation as

W X

Y Z

f

g

h

Now including the compositions g ◦ f and h ◦ g, we get

W X

Y Z

f

g

h

g ◦ fh ◦ g

Associativity says that this diagram commutes. Likewise, the Identity


Laws say that the diagrams

X

Y Y

ff

idY and

X X

Y

f

idX

f

both commute. We will encounter many specific examples later.Consider the functions f : N → N, g : N → N, h : N → N and

k : N→N defined by

f(n) = n+ 1

g(n) = n2

h(n) = 2n

k(n) = n3

We can form a diagram depicting g ◦ f and k ◦ h as in

N N

N N

f

gh

k

but this diagram does not commute because, for example, g ◦ f(2) =9 6= 64 = k ◦ h(2). The mere fact that we can draw such a diagram andthat the paths corresponding to g ◦ f and k ◦ h start and end at the sameplaces is not a guarantee that the thing commutes.

Exercises

50. For each of the following pairs of functions N → N, determinewhether they are equal and explain why or why not.

(a) f(n) = 2n+ 3 and g(m) = 2m+ 3

(b) f(n) = 2n+1 − 1 and g(n) =∑n

i=0 2i

(c) f(n) = n2 + 5n+ 6 and g(n) = (n+ 3)(n+ 2)

(d) f(n) = n4 − 10n3 + 35n2 + 50n+ 24 and g(n) = 24

51. Let R denote the set of all real numbers. Explain why tan (thetangent “function”) does not actually define a function from R to R.

52. Suppose the following functions exist: f : W → X, g : X → Y,


a : W → Z, b : Y → Z. Draw a commutative diagram asserting thatb ◦ g ◦ f = a.

53. Suppose the following functions exist: f : C → A, g : C → B,h : C → P, p : P → A and q : P → B. Draw a single commutativediagram asserting that f = p ◦ h and g = q ◦ h.

54. Define functions from N to N as follows:

f(n) = n+ 1

g(n) = n2

h(n) = n2 + 2n

Draw a diagram depicting the compositions g ◦ f and f ◦ h. Does thediagram commute?

6.4 Functions on finite sets

Recall the sets DECK, RANK and SUIT from Chapter 5. Given a cardc ∈ DECK, there is evidently an asignment of a rank. For example, therank of 5♥ is 5. This suggests that a function rank : DECK → RANK.Likewise, there ought to be a function suitDECKSUIT that “picks out”the suit of any card.

The rules defining rank and suit are obvious, but to make themcompletely explicit, we can build a table:

We do not really need the table in Figure 6.1 because all the valuesare so easily predictable. The cards themselves are systematically builtfrom all possible pairings of a rank and a suit. We will look at thegeneral idea of building sets by pairing later.

For a less uniform example, suppose we have a set corresponing tosix students S = {s1, s2, s3, s4, s5, s6}. Each student has a first nameand a last name. So we can specify their names as in Table 6.2:


c ∈ DECK rank(c) suit(c) c ∈ DECK rank(c) suit(c)

A♣ A ♣ A♦ A ♦2♣ 2 ♣ 2♦ 2 ♦3♣ 3 ♣ 3♦ 3 ♦4♣ 4 ♣ 4♦ 4 ♦5♣ 5 ♣ 5♦ 5 ♦6♣ 6 ♣ 6♦ 6 ♦7♣ 7 ♣ 7♦ 7 ♦8♣ 8 ♣ 8♦ 8 ♦9♣ 9 ♣ 9♦ 9 ♦10♣ 10 ♣ 10♦ 10 ♦J♣ J ♣ J♦ J ♦Q♣ Q ♣ Q♦ Q ♦K♣ K ♣ K♦ K ♦A♠ A ♣ A♥ A ♥2♠ 2 ♠ 2♥ 2 ♥3♠ 3 ♠ 3♥ 3 ♥4♠ 4 ♠ 4♥ 4 ♥5♠ 5 ♠ 5♥ 5 ♥6♠ 6 ♠ 6♥ 6 ♥7♠ 7 ♠ 7♥ 7 ♥8♠ 8 ♠ 8♥ 8 ♥9♠ 9 ♠ 9♥ 9 ♥10♠ 10 ♠ 10♥ 10 ♥J♠ J ♠ J♥ J ♥Q♠ Q ♠ Q♥ Q ♥K♠ K ♠ K♥ K ♥

Table 6.1: The rank and suit functions

x ∈ S forename(x) family(x)

s1 Carol Binkleys2 Pat Pattersons3 Kelly Treebaums4 Jamie Doodles5 Kelly Greens6 Violet Green

Table 6.2: Names of students in S


Exercises

55. Provide a table defining a function (call it f0) from the set A =

{0, 1, 2, 3} to B = {N,S,E,W} and table defining a function (call it g0)from B to A so that idA = g0 ◦ f0 and idB = f0 ◦ g0.


{0, 1, 2, 3} to B = {N,S,E,W} and table defining a function (call it g1)from B to A so that idA = g1 ◦ f1 but idB 6= f1 ◦ g1.


{0, 1, 2, 3} to B = {N,S,E,W} and table defining a function (call it g2)from B to A so that idA 6= g2 ◦ f2 but idB 6= f2 ◦ g2.

58. The table in Figure 6.3 does not represent a function f from the setT = {a,b, c,d} to the set W = {0, 1, 2, 3, 4, 5, 6}. Explain why.

x ∈ T f(x)

a 0b 1c 2d 1a 2

Table 6.3: A table that does not define afunction

6.5 Multi-argument Functions

The basic ideas of sets and functions between sets is quite general,but does not account for operations like addition and multiplication.These operations depend on two arguments: a natural numberm anda natural number n determine the natural numberm+ n. To bring“binary” operations like this into fold, we would like to be able to saythat addition and multiplication are functions. To do that, we need tofigure out how to specify their domain and codomain.

The set DECK defined above has a feature that gives as a clue onhow to deal with binary operations. The individual elements of DECK

represent playing cards. Each one has a rank and a suit, representedby the functions in Table 6.1.

What makes DECK special is that every possible rank paired withevery possible suit is represented by exactly one card. We can say thatin terms of the functions. For each r ∈ RANK and each s ∈ SUIT,


there is a unique card c ∈ DECK so that rank(c) = r and suit(c) = s.The notation we used for card, such as 2♥, was meant to remind usthat a card is really just a rank and a suit.

Suppose we wanted to define a function This should be the samething as assigning a value to each pair (r, s) where r is a rank and s isa suit. For example, suppose we define value functions for ranks andsuits by the tables

r ∈ RANK r-val(r) r ∈ RANK r-val(r)

2 1 9 83 2 10 94 3 J 105 4 Q 116 5 K 127 6 A 138 7

Table 6.4: Values assigned to card ranks

s ∈ SUIT s-val(s)

♣ 1♦ 2♠ 3♥ 4

Table 6.5: Values assigned to card ranks

We can now assign a value to each card by combining rank and suitvalues. For example, we might define

c-val(c) = 4 · r-val (rank(c) − 1) + s-val (suit(c))

We might as well have written this as a function that depends on arank and a suit:

rs-val(r, s) = 4 · r-val (r− 1) + s-val (s)

This is essentially the same as c-val because a card are completelydetermined by a rank and a suit.

Exercises

59. What is c-val(3♥)?

60. What is the value of rs-val(3,♥)?

61. Suppose I tell you c-val(c) = 10. What card is c?


62. Suppose I tell you rs-val(r, s) = 14. What is r? What is s?

You are quite used to other collections that are a lot like DECK. Forexample, you almost certainly picture a point the plane as somethinglike Figure 6.1. To specify exactly which point it pictured, we can giveit’s “coordinates”. In this example, it has 3.5 as x coordinate and 2.25as y coordinate. So the analytic plane consists of points. And eachpoint corresponds to exacty one pair of real numbers.

x axis

y axis

−4

−4

−3

−3

−2

−2

−1

−1

1

1

2

2

3

3

4

4

Figure 6.1: A point in the analytic plane

The set DECK is similar. It consists of cards. And each card corre-sponds to exactly one pair consisting of a rank and a suit. In fact, wecan picture the deck as a similar “coordinate system”.


2 3 4 5 6 7 8 9 10 J Q K A

♣

♦

♠

♥ Figure 6.2: Five cards in DECK

Exercises

63. Figure 6.2 shows elements of DECK. What poker hand does thiscorrespond to?

64. Draw a figure similar to Figure 6.2 picturing a full house.

65. Draw a figure similar to Figure 6.2 picturing a straight that is not astraight flush.

We can generalize our examples by establishing a principle thatsays, essentially, any two sets can be combined into a set of pairs. Butfirst, we need to be clear about the main ingredients.

The relationship between sets DECK, RANK and SUIT is summa-rized in the two functions rank : DECK → RANK and suit : DECK →SUIT.

Suppose we are given two other functions f : X → RANK andg : X → SUIT, where X could be any set. Then for any x ∈ X, f(x) is arank, g(x) is suit. So together f(x) and g(x) determine a card. In otherwords, there is a function h : X→ DECK so that rank(h(x)) = f(x) andsuit(h(x)) = suit(x) for all x ∈ X. This leads to the following definitionand principle.


DEFINITION 3: Product Sets

For sets X and Y, a span consists of two functions X f←− C g−→ Y.Note that the two functions have the same domain.

For sets X and Y, a product of X and Y is a span Xp←− P p ′−→ Y so

that for any span X f←− C g−→ Y there is exactly one function T h−→ P

for which f = p ◦ h and g = p ′ ◦ h.

This definition tells us how we should expect DECK to behave inrelation to ranks and suits. Namely,

RANKrank←− DECK

suit−→ SUIT

ought to constitute a product. But the definition does not guaranteethat we can make a product of any two sets. We need a postulate tomake that explicit.

POSTULATE 5: Cartesian Products Exist.

For any sets X and Y, there is a set, denoted by X× Y with functions

XprX,Y←− X× Y pr ′X,Y−→ Y.

Together these data constitute a product of X and Y. In particular,

for any span X f←− C g−→ Y, there is exacly one function 〈f,g〉 : C →X× Y for which prX,Y ◦ 〈f,g〉 = f and pr ′X,Y ◦ 〈f,g〉 = g.

In general, the elements of X× Y are denoted as pairs (x,y) where Mathematicians usually think concretelyof X × Y as the set consisting of allsuch pairs. This can be made precise inseveral ways. One way is to postulatethat V ⊆ X andW ⊆ X implies thatV ×W ⊆ X× Y.

x ∈ X and y ∈ Y. Using this notation, we can define the two projectionfunctions by

prX,Y(x,y) = x

pr ′X,Y(x,y) = y

With products, a binary function, such as addition, is simply afunction from N ×N to N. In the Chapter ??, we investigate thetechnique of recursion for defining functions on natural numbers.

The pairing notation allows us to describe the product of any two


sets. Explicitly,

RANK × SUIT = {(A,♣), (2,♣), (3,♣), (4,♣), (5,♣), (6,♣), (7,♣),(8,♣), (9,♣), (10,♣), (J,♣), (Q,♣), (K,♣),(A,♦), (2,♦), (3,♦), (4,♦), (5,♦), (6,♦), (7,♦),(8,♦), (9,♦), (10,♦), (J,♦), (Q,♦), (K,♦),(A,♠), (2,♠), (3,♠), (4,♠), (5,♠), (6,♠), (7,♠),(8,♠), (9,♠), (10,♠), (J,♠), (Q,♠), (K,♠),(A,♥), (2,♥), (3,♥), (4,♥), (5,♥), (6,♥), (7,♥),(8,♥), (9,♥), (10,♥), (J,♥), (Q,♥), (K,♥)}

Compare this to the earlier definition of DECK. Clearly, the two setsdiffer only in how we have written the pairs. So we can supposeDECK and RANK × SUIT are equal. Or we can suppose that a pairlike (K,♥) is not a playing card, but that it merely corresponds to aplaying card.

Evidently, we can also define products of three sets, four sets, etc.,by repeating the × construction. That is, (X× Y)× Z is formally theset consisting of all pairs (p, z) where p ∈ X × Y and z ∈ Z. Butp ∈ X× Y means that p = (x,y) where x ∈ X and y ∈ Y. So an elementof (X× Y)× Z looks like ((x,y), z), using the pairing notation. Wemight as well use tripling notation for the same thing, writing (x,y, z)instead of ((x,y), z).

7The Set of Natural Numbers

THE SYMBOL N denotes the set of natural numbers. It is time that weCHAPTER GOALS

In this chapter, we investigate whatdistinguishes N, the of natural numbers,from other sets. The main idea is that 0,successors and the Axiom of Inductioncharacterize N as a set that universallysupports recursion.

make the structure of N explicit within our developing ideas aboutsets and functions..

The structure of N brings set theory into contact with computationvia the crucial idea of recursion. For example, n! (the factorial of n) isa function from N to N specified recursively by

0! = 1

ky! = ky · k!

The axioms for N are chosen precisely to ensure that a specificationlike this is guaranteed to define a function uniquely.

7.1 Sequences and Simple Recurrences

Let us make the informal word sequence official.

DEFINITION 4: Sequences

For a set X, a sequence in X is a function σ : N→ X.For a sequence σ in X and natural number n, we may write σn

instead of σ(n). In any case, the idea is that σ0,σ1,σ2, . . . are elementsof X.

As we studied in previous lectures, the basic vocabulary of naturalnumbers is that (i) there is a starting natural number, 0, and (ii) foreach natural number k there is a next one, ky.

To discuss zero and successor in the language of sets and functions,we stipulate that 0 ∈ N and that successor is a function suc : N → N

THE SET OF NATURAL NUMBERS 61

(we think of it as being given by the rule k 7→ ky). Evidently, thenotation suc is only needed to bring ky into the standard function/ap-plication notation. So N is not just a set: it is a set equipped with aspecial element and a function.

Suppose X is some other set, b ∈ X is an element, and r : X → X

is a function. In other words, (X,b, r) has structure like (N, 0,�).So Xmay be the natural numbers, but it certainly does not have tobe. It has its own vocabulary b is “a beginning” and for each x ∈ X,the element r(x) comes “after” x. Then we should be able define asequence σ in X, so that σ0 = b, σ1 = r(b), σ2 = r(r(b)), σ3 =

r(r(r(b))), and so on.In general, σk should be determined by starting with b and repeat-

edly applying r a total of k times. In other words, we can summarize σby two equations:

σ0 = b

σ�(k) = r(σk)

We can make this precise.

DEFINITION 5: Simple Recurrences and Counting Sets

A simple recurrence is a set with distinguished element b ∈ X andfunction r : X → X. We will say the b and r form a simple recurrenceon X. To make things easy to read, we will write (X,b, r) for a simplerecurrence on X.

A universal recurrence is a simple recurrence (N, z, s) so that forany other simple recurrence (X,b, r) there is exactly one function

Nf−→ X so that

f(z) = b

f(s(n)) = r(f(n)) for all n ∈ N


POSTULATE 6:

[N is a Counting Set] The set N of natural numbers with 0 and suc is auniversal recurrence.

Consider a simple recurrence 1b̂−→ X

r←− X. This principle tells usthere is a unique function f : N→ X so that

f(0) = b

f(ky) = r(f(k))

In fact, this is just what Postulate 6 tells us: every simple recurrence ona set X determines a sequence in X.

On the other hand, it is not the case that every sequence is de-termined by a simple recurrence. Take for example, the sequence0, 1, 0, 2, 0, 3, . . .. This can not be defined (at least not directly) by giv-ing an initial entry and specifying successive entries based only onimmediate predecessors.

We need additonal techniques for defining more general functionsby recursion.

7.2 Primitive Recursion

Evidently, addition, multiplication, factorial, and other familiar func-tions should be definable using Principle 6. But there are problems toovercome: Addition is not a sequence, and factorial is not definable bya simple recurrence. We need a scheme that generalizes simple recur-sion to permit (i) dependence on parameters not directly involved inthe recursion, and (ii) dependence on n at each stage of the recursion.

DEFINITION 6:

A primitive recurrence in X with parameters in P consists of twofunctions P b−→ X

r←− P×N× X.A parametric sequence in X with parameters in P is a function

a : P×N→ X.

This allows for a very general definition of functions by recursion.


POSTULATE 7: Functions defined by primitive recursion

For any primitive recurrence P b−→ Xr←− P ×N× X, there is a Actually, this will be a theorem, once

we establish some additional postulatesfor sets. For now, we will take it to be abasic principle.

Stronger forms of recursive definitionare possible, but these require a moresophisticated understanding of the rolethat N plays.

unique function f : P×N→ X satisfying:

f(p, 0) = b(p)

f(p,ky) = r(p,k, f(p,k))

We say that f is defined by primitive recursion.

If the parameter set is trivial (we don’t actually want extra parame-ters) we can omit them in the notation.

EXAMPLE 4:

The “predecessor” function is defined by the scheme

pred(0) = 0

pred(ny) = n.

Addition add : N×N→N is defined by

add(m, 0) = m

add(m,ky) = add(m,k)y.

Of course, we writem+ n instead of add(m,n). Notice that, formally,we should give a functions b : N→N and r : N×N×N→N

Monus (subtraction in N) can be defined by

m −̇ 0 = my

m −̇ny = pred(m −̇n)

Factorial is defined by

0! = 1

(ny)! = ny · n!

Exercises


66. Define addition N×N+−→ N explicitly by primitive recursion.

That is, find functions Nb−→N and N×N×N

r−→N so that

m+ 0 = b(m)

m+ ky = r(m,k,m+ k)

67. Define multiplication N×N·−→N by explicit primitive recursion.

You may use addition in the definition.

68. For a given function f : N → N, find a primitive recurrencethat defines the function Σf : N → N by the informal rule n 7→∑n−1

i=0 f(i). That is, the result will be the sequence 0, f(0), f(0) + f(1),f(0) + f(1) + f(2), . . . .

69. This is a challenging exercise. Suppose f : N→N. Define µf : N→N so that µf(m) is the least value n < m for which f(n) > 0 andism otherwise. In other words, µf(m) will always be a value lessthan or equal tom. If f(n) = 0 for all n < m, then µf(m) = m.Otherwise f(µf(m)) > 0 and if f(n) > 0 for some n < m, thenµf(m) ≤ n.

70. Define by recursion, “halving” as a function from N to N. That is,define h : N→N so that h(2n) = n and h(2n+ 1) = n.

7.3 Mutual Recursion and Other Generalizations

Often, we want to define two or more functions simultaneously interms of each other. For example, consider the following “definitions”of two functions f and g.

f(0) = 0

g(0) = 1

f(ny) = f(n) + g(n)

g(ny) = f(n)

So f(1) = f(0) + g(0) = 1; f(2) = f(1) + g(1) = f(1) + f(0) = 1;f(3) = f(2) + g(2) = f(2) + f(1) = 2. In general, f defines the Fibonaccisequence.

Although the equations defining f and g do not follow the formatof primitive recursion, they seem to be “safe”. The functions f andg are defined mutually: we can not compute one wihout the other.Nevertheless, cartesian products allow us to “package” f : N → N


and g : N → N into a single function 〈f,g〉 : N → N×N. This latterfunction can be defined by simple recursion. In particular, defineh : N→N×N by

h(0) = (0, 1)

h(ny) = (pr(h(n)) + pr ′(h(n)), pr(h(n)))

Then it is easily checked that f = pr ◦ h and g = pr ′ ◦ h are the functionswe sought. So there was no harm in writing their mutual definition aswe did. Indeed, a general scheme for mutually recursive definitionscould be worked out.

7.4 Why Induction Works

Returning to our original postulates of the natural numbers, let usconsider whymy = ny should implym = n. Recall that Thepredecessor function is defined by primitive recursion:

pred(0) = 0

pred(ky) = k

So ifmy = ny, thenm = pred(my) = pred(ny) = n. In otherwords, our ability to calculate predecessor by recursion forces thepostulate to be true.

Similarly, take a set with two distinct elements. For example,BOOL = {false, true}. The specific elements are not important, butthe fact that false 6= true is. Now define is-zero : N→ BOOL by

is-zero(0) = true

is-zero(ky) = false

Now suppose 0 = ny for some n. Then the recursive definition ofis-zero would mean that true = is-true(0) = is-true(ny) = false. That isnot possible. So 0 can not equal any ny.

The Axiom of Induction also follows from our ability to definefunctions recursively. We can reformulate the Axiom as saying that ifP is a subset of N satisfying

• 0 ∈ P and

• if k ∈ P then ky∈ P holds for all k ∈N,

then P = N. We can show that this is true using recursion. It is easierto reason a bit more generally. Suppose P is a set, and f : P → N is afunction so that 0 = f(p0) for some p0 ∈ P. Suppose, moreover, that


t : P → P is a function satisfying f(t(p)) = f(p)y for all p ∈ P. Then weclaim that for every n ∈ N, there is some p ∈ P so that f(p) = n. Bysimple recursion, there is a function g : N→ P defined by

g(0) = p0

g(ky) = t(g(k))

So f ◦ g is a function from N to N. It satisfies the following:

(f ◦ g)(0) = f(p0)= 0

(f ◦ g)(ky) = f(t(g(k))

= f(g(k))y

= (f ◦ g)(k)y

That is, f ◦ g satisfies the same recurrence is the identity on N. Since theprimitive recursion postulate requires that any recurrence determinesexactly one function, it must be the case that idN = f ◦ g. So for anyn ∈ N, f(g(n)) = n.

Now take f to be an inclusion map inclP : P →N where P is a subsetof N. The conditions on f translate to saying that 0 ∈ P and that P isclosed under taking successors. So for every n ∈ N, it must be thecase that n ∈ P. Hence the two are equal.

In summary, Postulate 7 says that (primitive) recursion is a permis-sible way to define functions involving natural numbers. This leadsto proofs that our original natural number postulates (from Chapter 1)are forced to be true. In effect, they are redundant now. This reinforcesthe idea that computation (in the form of recursion) is foundational formathematics.

8Ordering and Strong Induction

THE STANDARD ORDERING of natural numbersm ≤ n does notCHAPTER GOALS

In this chapter, we definem ≤ n

andm < n for natural numbers andderive some useful properties of theserelations. This leads to reformulations ofinduction that will be useful later.

need much explanation. But in fact, it arises in a precise way fromaddition. This fact permits us to reason about order of numbers usingthe arithmetic.

DEFINITION 7: Comparison of Natural Numbers

For natural numbersm and n, say thatm is less than or equal to n(writtenm ≤ n) if and only ifm+ d = n for some d : N. Also definem < n (m is strictly less than n) to mean thatm+ dy = n for somenatural number d.

For example, 4 ≤ 9 because 4+ 5 = 9. On the other hand 5 � 3

because there is no natural number d for which 5+ d = 3. Likewise,5 ≮ 5 because there is no d for which 5+ dy = 5.

From this definition, we can immediately infer some useful facts These three properties of ≤ are impor-tant enough to have names. We saythat ≤ is reflexive, transitive and anti-symmetric. We will study these and otherproperties of relations in Chapter ??.

about ≤. For example,m ≤ m is always true becausem+ 0 = m. Ifm ≤ n and n ≤ p thenm ≤ p. And ifm ≤ n and n ≤ m, thenm = n.I leave the proofs of these and a few other facts as exercises.

Another propery of ≤ that is a bit more subtle to prove is linearity.That is, for any two natural numbersm and n, eitherm ≤ n or n < m.


LEMMA 9: ≤ is linear

For allm ∈N and n ∈N, eitherm ≤ n or n < m.

PROOF :We proceed by induction onm.

Basis: Clearly, 0 ≤ n for every n because 0+n = n.

Inductive hypothesis: Suppose that for some k ∈ N, it is the case thatfor every n ∈N, either k ≤ n or n < k.

Inductive step: The goal is to show that for every n ∈ N, either ky ≤n or n < ky. Consider some n. By the inductive hypothesis, eitherk ≤ n or n < k. We look at the two cases separately.

If n < k, then n+dy = k for some natural number d. So n+dyy =

ky, and hence n < ky.

If k ≤ n, then k + d = n for some natural number d. Now d

must either be 0 or some successor. If d = 0, then k = n. Son+ 1 = ny = ky. Hence n < ky. If d is a successor, then d = ey

for some e. So ky + e = k+ ey = n. That is, ky ≤ n. All cases havebeen accounted for, so either ky ≤ n or n < ky.

�

Linearity justifies our intuition that natural numbers are ordered sothey can be arranged in a straight line with smaller numbers to the leftand larger ones to the right. In other words, this jibes with the pictureswe have been drawing all along.

Now that we know the natural numbers are linearly ordered,we can return to the question of cancellativity for multiplication —having punted on a proof of this earlier.

PROPOSITION 5: Multiplication by a non-zero is cancellative.

For all natural numbersm, n and p, ifm · py = n · py thenm = n.

PROOF : Supposem · py = n · py. Because of linear ordering of the The laws of arithmetic used herewere proved by induction earlier. Soimplicitly, this lemma still depends oninduction, even though we do not usethat pattern explicitly.

natural numbers, eitherm ≤ n or n < m. Without loss of generality,assumem ≤ n. That is,m+ d = n for some d. So

m · py + 0 = n · py = (m+ d) · py = m · py + d · py.

ORDERING AND STRONG INDUCTION 69

By cancellativity of addition, 0 = d · py. So by the Law of No ZeroDivisors, d = 0. �

The ordering on natural numbers permits us to think about a newform of induction, sometimes called strong induction even though it isequivalent to simple arithmetic induction.

Suppose P(k) is a statement about natural numbers. That is P(5)might be true or false, P(2918) must be true or false, and so on. LetP∗(k) be the statement: P(j) is true for all j < k. So P∗(5) is true just incase P(0), P(1), . . . , P(4) are all true.

An inductive proof that P∗(m) holds for allm looks like this:

STRONG INDUCTION , PRELIMINARY VERSION

Suppose P(m) is a property of natural numbers. Let P∗(m) be therelated property so that P∗(m) holds if and only if P(j) holds for eachj < m. Then the following is an outline of a proof by induction thatP∗(m) holds for allm.

Basis: There is no j < 0. So it is automatically true that P(j) holds forall j < 0. In other words, P∗(0) holds no matter what P is.

Inductive Hypothesis: Suppose P∗(k) holds for some natural number k.That is, P(j) holds for every j < k.

Inductive Step: The goal is to prove that P∗(ky) holds. But by theinductive hypothesis, P(j) holds for every j < k. So if P(k) alsoholds, then P∗(knxt) holds. So the inductive step for P∗(ky) iscomplete just by proving that P(k) holds.

We can simplify this into two parts (because the basis is trivial).

STRONG INDUCTION , USABLE VERSION

Strong inductive hypothesis: Suppose that for some k, it is the case thatP(j) holds for all j < k.

Strong inductive step: Prove that P(k) holds.

Conclude by strong induction that P(m) holds for allm.


EXAMPLE 5:

We claim that the Fibonacci numbers grow exponentially fast. Inparticular, φn ≤ fib(n+ 2), where φ = 1+

√5

2 . First, check for yourselfthat φ2 = φ+ 1.

The proof is by strong induction.

Strong Inductive Hypothesis Suppose that for some k, it is the case thatφj ≤ f(j+ 2) holds for all j < k.

Inductive Step For k < 2, check that φ0 = 1 ≤ fib(2) and φ1 = φ ≤fib(3). So the claim is true for k = 0 and k = 1. For values k ≥ 2, k asjyy for some j. So

φk = φ2 ·φj = (φ+ 1)φj = φjnxt +φj.

By the strong inductive hypothesis, φjy ≤ fib(ky) and φj ≤ fib(k).So φk ≤ fib(k+ 2).

Hence, φn ≤ fib(n+ 2) for all n.

Minimization

Suppose P(−) is a property of natural numbers. Let’s refer to any n forwhich P(n) holds as a satisfier of P. A least satisfier of P is any n so that(i) n is a satisfier of P and (ii) ifm is a satisfier of P, then n ≤ m.

If P has any satisfier at all, it must have a least satisfier. For supposeP has no least satisfier. Then consider the property ¬P defined by¬P(n) is true if and only if P(n) is false. Evidently, if ¬P(j) is true forevery j < k, then ¬P(k) must also hold. Otherwise, P(k) would hold,so kwould be a satisfier, and since by the strong inductive hypothesis,no natural number less that k is a satisfier, kwould have to be theleast satisfier. We supposed that no such thing exists. So by stronginduction, ¬P(k) holds for every k. In other words, P(k) is false for allk.

We can restate this simply.

ORDERING AND STRONG INDUCTION 71

THEOREM 1: Minimization Principle for Natural Numbers

Any property of natural numbers that is satisfied by at least onenatural number has a least satisfier.

PROOF : The proof is covered essentially in the foregoing paragraph. �

We will not illustrate this principle here, but you will encounter theidea in other courses. It is sometimes a convenient way to think aboutan inductive proof.

8.1 Laws of ordered arithmetic

The arithmetic laws for addition and multiplication, as stated inChapter ??, are all to do with equality. But now that we have theordering of natural numbers, we ought to spend at how ≤ interactswith arithmetic.

LEMMA 10: Addition is monotonic

For any natural numbers,m, n and p, ifm ≤ n, thenm+ p ≤ n+ p.

PROOF : This is an exercise. �

LEMMA 11: Addition is order-cancellative

For any natural numbersm, n and p, ifm+ p ≤ n+ p, thenm ≤ n.PROOF : This is an exercise. �


LEMMA 12: Multiplication is monotonic

For any natural numbersm, n and p, ifm ≤ n, thenm · p ≤ n · p.PROOF : This is an exercise. �

LEMMA 13: Multiplication by a positive natural number is order-can-cellative

For any natural numbersm, n and p, ifm · py ≤ n · py, thenm ≤ n.PROOF : This is an exercise. �

Exercises

71. Prove Lemma 10.

72. Prove Lemma 11.

73. Prove Lemma 12.

74. Prove Lemma 13.

9Minimum and Maximum

THE ORDERING OF NATURAL NUMBERS gives rise to two otherCHAPTER GOALS

In this chapter, we define min(m,n)and max(m,n) for natural numbers andextend the laws of arithmetic to includethem.

operations: min and max. The minimum of two natural numbersmand n is, of course, the smaller of the two. It makes sense to say thisbecause ≤ is linear. That is, eitherm ≤ n— in which casem is theminimum – or n < m – in which case n is the minimum. We writemin(m,n) for this. The maximum is written as max(m,n). Both ofthese can be specified by algorithms.

ALGORITHM 7: Mimimum

For natural numbersm and n, min(m,n) is calculated by

min(0,k) = 0

min(jy, 0) = 0

min(jy, ky) = min(j, k)y

ALGORITHM 8: Maximum

For natural numbersm and n, max(m,n) is calculated by

max(0,k) = k

max(jy, 0) = jy

max(jy,ky) = max(j,k)y

Exercises


75. Calculate explicitly min(3, 6).

76. Calculate explicitly max(4, 3).

The important property of min and max is summarized by thefollowing lemma.

LEMMA 14: Characterizing min and max

For natural numbersm, n and p,

p ≤ min(m,n) ⇐⇒ p ≤ m and p ≤ nmax(m,n) ≤ p ⇐⇒ m ≤ p and n ≤ p.

PROOF : We look only at min because the proof for max is essentiallyidentical.

By cases, eitherm ≤ n or n < m. In the first case, min(m,n) = m,so min(m,n) ≤ m by reflexivity, and min(m,n) ≤ n becausem ≤ n.In the latter case, min(m,n) = n by reflexivity. Again min(m,n) ≤ mbecause we have assumed n < m.

Suppose p ≤ m and p ≤ n. By linearity, eitherm ≤ n or n < m. Ineither case, p ≤ min(m,n). �

It is worth noting that min(m,n) is actually the unique naturalnumber satisfying the two conditions of the lemma. That is, suppose(i) q ≤ m q ≤ n and (ii) for all natural numbers p, if p ≤ m and p ≤ nthen p ≤ q. Then by (i), q ≤ min(m,n). By (ii), min(m,n) ≤ q. Soq = min(m,n). This tells us that min(m,n is, in a sense, optimal forsolving the two inequalities x ≤ m and x ≤ n simultaneously. Wesummarize the lemma by saying that min(m,n) is the greast lowerbound ofm and n. Likewise, max(m,n) is the least upper bound.

Some simple facts about min and max derive directly from thesecharacterizations, all of which can be proved using the above charac-terizations plus some facts about addition.

In particular, together with addition, min and max make the natu-ral numbers into something called a distributive lattice ordered monoid.Basically, this means that addition together with min and max coop-erate in specific ways that augment the basic laws of arithmetic. Themost useful laws having to do with min and max follow.

MINIMUM AND MAXIMUM 75

BASIC LAWS OF min AND max

For any natural numbers,m, n and p: Like the basic laws of arithmetic pre-sented in Chapter ??, there laws areorganized here to emphasize similaritiesbetween min and max. Pay attention tothat.

Characterization via ≤m ≤ min(n,p) if and only ifm ≤ n andm ≤ pmax(m,n) ≤ p if and only ifm ≤ p and n ≤ p

Associativity min(m, min(n,p))= min(min(m,n),p)max(m, max(n,p))= max(max(m,n),p)

Commutativity min(m,n)= min(n,m)

max(m,n)= max(n,m)

Idempotency min(m,m)= m

max(m,m)= m

Absorptivity m= min(m, max(n,m))

m= max(m, min(n,m))

Distributivity m+ min(n,p)= min(m+n,m+ p)

m+ max(n,p)= max(m+n,m+ p)

max(m, min(n,p))= min(max(m,n), max(m,p))min(m, max(n,p)= max(min(m,n), min(m,p))

m ·min(n,p)= min(m · n,m · p)m ·max(n,p)= max(m · n,m · p)

Modularity m+n= min(m,n) + max(m,n)

We will not prove most of these as they follow easily from arith-metic laws.

LEMMA 15:

m+ min(n,p) = min(m+n,m+ p) for all natural numbersm,n,p.PROOF : Because min(n,p) ≤ n and addition is monotonic, m + This is a good illustration of why anti-

symmetry of ≤ is useful. We wish toprovem+ min(n,p) = min(m+ n,m+

p). Instead, we provem+ min(n,p) ≤min(m + n,m + p), and separately,min(m+ n,m+ p) ≤ m+ min(n,p).Then we let antisymmetry take us therest of the way.

min(n,p) ≤ m + n. Likewisem + min(n,p) ≤ m + p. Som +

min(n,p) ≤ min(m+n,m+ p).So to complete the proof, we must show that min(m+n,m+ p) ≤

m+ min(n,p). Suppose k ≤ min(m+n,m+ p). Then k ≤ m+n andk ≤ m+ p. If k ≤ m then k ≤ m+ min(n,p) obviously. Otherwise,m < k by Linearity. So m+dy = k for some d. Hence m+dy ≤ m+n


andm+ dy ≤ m+ p. Since ≤ is order reflecting, dy ≤ min(n,p).Consequently, k = m+ dy ≤ m+ min(n,p). We have thus shown thatk ≤ min(m+ n,m+ p) implies k ≤ m+ min(n,p). In particular, thisapplies to min(m+n,m+ p). �

Exercises

77. Calculate the following values. Show work.

(a) min(5, min(4, 6))

(b) min(5, max(4, 6))

(c) min(340, max(234, 340))

(d) min(5, max(3, min(max(1, 2), 7)))

(e) min(5+ max(4+ min(3+ max(7, 8), 3+ min(7, 8)), 6), 7)

78. Prove that min distributes over max.

79. Prove that multiplication distributes over min.

10Divisibility: Ordering the Natural Numbers by Multipli-cation

D IS IBILITY of natural numbers is a relation defined by analogy withCHAPTER GOALS

Develop the analogue of ≤ defined bymultiplication instead of by addition.Illustrate the value of the analogy toprove useful properties of divisibility.Introduce general division for naturalnumbers.

the standard order. In the standard order,m ≤ n means thatm+ d = n

for some d. Because multiplication satisfies many of the same lawsas addition(it is commutative, associative, etc.), a similar definition ispossible in terms of multiplication.

DEFINITION 8: Divisibility

For natural numbersm and n, say thatm divides n if and only ifm · q = n for some natural number q. We writem | nwhenm dividesn, and writem - nwhenm does not divide n.

An alternative way to say thatm divides n is n is a multiple ofm.

The distinction betweenm ≤ n andm | n is precisely that theformer is defined by addition and the latter by multiplication. So wecan transfer many facts about ≤ to facts about | simply by noticingthat they depend on analogous laws of arithmetic.

For example, the relation ≤ is reflexive — meaning thatm ≤ m forallm— because 0 is the identity for addition. The relation | is reflexivesimilarly because 1 is the identity for multiplication. That ism | m

becausem · 1 = m. Likewise, ≤ is transitive because addition isassociative; | is transitive because multiplication is associative.

Divisibility is also anti-symmetric — meaning thatm | n and n | m

implies thatm = n. But a proof of this hints at an important differencebetween | and ≤.

Recall that we proved thatm ≤ n and n ≤ m impliesm = n using


cancellativity of addition. But multiplication is only cancellative fornon-zeros. That is,m · p = n · p impliesm = n only when p > 0. So weneed to treat 0 as a special case.

LEMMA 16: Divisibility is anti-symmetric

For any natural numbersm and n ifm | n and n | m, thenm = n.

PROOF : Supposem and n are natural numbers satisfyingm | n andn | m. Thus, there are natural numbers q and r so that

m · q = n

n · r = m

Supposem = 0. Then by the first equation n = 0 as well. Supposem = py for some p. Then

py · q · r = py

So cancelling py yields q · r = 1. But natural numbers also satisfy theLaw of Integrality. So q = 1. �

The divisibility relation begins to be more interesting when werealize that it is not linear. For example, 4 does not divide 13 and 13does not divide 4. Apparently, the structure of the natural numberswith respect to | is much more complicated than with respect to ≤.

Note that 1 | m is true for anym, simply because 1 ·m = m. Andm | 0 is true for anym becausem · 0 = 0. So 1 is “at the bottom” of thedivisibility relation and 0 is “at the top”. It may seem strange to claim that 0 | 0.

Some people find it so irritating thatthey simply rule 0 out of consideration,and declare that 0 | 0 is undefined. Tech-nically, this means they are committedto sayingm | n if and only if there isa unique q so thatm · q = n. Since0 · q = 0 for all q, there is no uniquevalue. This is fine, but I prefer to under-stand 0 | 0 just to mean 0 · q = 0 for someq.

Figure 10.1 shows a fragment of the natural numbers with respectto divisibility.

DIVISIBILITY: ORDERING THE NATURAL NUMBERS BY MULTIPLICATION 79

1

3

9

27

2

6

18

54

4

12

36

108

8

24

72

216

5

15

45

135

10

30

90

270

20

60

180

540

40

120

360

1080

25

75

225

675

50

150

450

1350

100

300

900

2700

200

600

1800

5400

0

...

...

...

...

...

...

...

Figure 10.1: Part of the divisibilityrelation

Exercises

For each of the following pairs (m,n) of natural numbers, determinewhether or notm | n.

80. (4, 202)

81. (7, 49)

82. (11, 1232)

83. (9, 19384394)

84. (n, 6n)

85. (26, 65)

Before closing this section, we note additional facts about howdivisibility interacts with arithmetic.


LEMMA 17: Technical facts about divisibility

1. m | n impliesm | n · p

2. Ifm | n, thenm | p if and only ifm | (n+ p)

3. If n < m andm | (pm+n), then n = 0.

4. m | n and n > 0 impliesm ≤ n.

PROOF : Whenm = 0, all of these are trivial, as 0 | n holds if and onlyif n = 0. So supposem is positive.

1. is due associativity.

2. is due to distributivity, and cancellativity of addition.

3. Suppose n < m andmd = mp+ n. If we can show that d = p,then it follows that n = 0. First, note thatmp ≤ md. And sincem ispositive, it can be cancelled. So p ≤ d.

Second,md = mp+ n < mp+m = m(py). Again, cancellingmyields d < py. So d ≤ p.

4. follows from (3).

�

Exercises

86. Fill in the details of the proofs of Lemma 17:1,2 and 4.

10.1 Quotients and Remainders

Without rational numbers, division still makes sense, provided weaccount for the fact that sometimes numbers don’t divide evenly.When you first learned about division, you learned about quotients


and remainders. For example, you may have calculated a long division,

925

13)12034

11700

334260

7465

9

r 9

showing that the quotient is 925with remainder 9. In any correctlong divisionm÷ n, the remainder (call it r) must be less then n. Inthe example, 9 < 13. The quotient (call it q) and r together satisfym = n · q+ r. In the example, 12034 = 13 · 925+ 9.

So the long division calculations you performed in elementaryschool were really an implementation of the following fact.

THEOREM 2: Natural number division

For any natural numberm and any positive natural number n,there is a unique pair of natural numbers q and r satisfying

1. m = n · q+ r and

2. r < n.

PROOF : First, we prove that if q and r satisfy the stated conditions,and so do q ′ and r ′, then q = q ′ and r = r ′. This will prove thatthere can be at most one pair of natural numbers satisfying the statedconditions.

Suppose n · q+ r = n · q ′ + r ′ and r < n and r ′ < n. If r < r, thenthere is some e so that r+ ey = r ′. Hence q · n = q ′ · n+ ey. But thenn | n · q ′ + ey. So nmust divide ey. But ey is strictly less that n, sothis is impossible. Thus it can not be the case that r < r ′. For the samereason, it can not be the case that r ′ < r. So r = r ′. By cancellativityof addition, q · n = q ′ · n. And since n is positive, cancellativity ofmultiplication yields q = q ′.

Now we prove that there actually is a pair of numbers q and rsatisfying the conditions. For this, we proceed by induction onm. Fixa positive natural number n.

• [Basis] The goal is to find q and r so that 0 = s · n+ r and r < n.Clearly, q = 0 and r = 0 do the job.


• [Inductive hypothesis] Assume that for some k : N, there arenatural numbers p and s satisfying k = p · n+ s and s < n.

• [Inductive Step] The goal is to find q and r so that ky = q · n+ r andr < n. By the inductive hypothesis, ky = p · n+ sy. There are twocases to consider: either sy < n, or sy = n— sy can not be strictlygreater than n because s < n. Suppose sy < n. Then let q = p andlet r = sy. Suppose sy = n. Then ky = p · n+ n = py · n+ 0. Solet q = py and r = 0.

So the claim is true for allm : N and all positive n : N. �

This theorem indicates that division of natural numbers works,as long as we account for both the quotient and the remainder. It isuseful to have notation for these.

DEFINITION 9:

For any natural numberm and positive natural number n, letm//ndenote the quotient andm mod n the remainder of dividingm by n.That is,m//n andm mod n are the unique integers so that

1. m = n · (m//n) + (m mod n) anditem 0 ≤ m mod n < n.

The notation // is borrowed from the programming langaugePython. It is intended to avoid confusion with real number divisionx/y. Python, Java, C and many other languages use a percent signfor remainder, but unfortunately, its precise meaning differs fromlangauge to language. The notation mod avoids this ambiguity.

Notice that // can be characterized in relation to ≤. First, evidentlyifm ≤ n and p > 0, thenm// p ≤ n// p. A proof of this is a simpleexercise. So division by a positive p is monotonic.


LEMMA 18: Division is adjoint to multiplication

For natural numbersm, n and p, with p positive, it is the case thatp ·m ≤ n if and only ifm ≤ n// p.

PROOF : Supposem ≤ n// p. Then p ·m ≤ p · (n// p) becausedivision of p is monotonic. Moreover, p · (n// p) + (n mod p) = n. Sop · p ≤ n. Conversely, suppose p ·m ≤ n. Then (p ·m)// p ≤ n// p.Butm = (p ·m)// p. �

The following shows that a dual version of division is also possible.That is, form : N and positive n : N, there is a natural number s sothat s ≤ p is and only ifm ≤ n · p.

COROLLARY 1: Natural number division, dual version

For any natural numberm and any positive natural number p,there is a unique pair of natural numbers s and t satisfying

1. m+ t = p · s and

2. t < p.

Moreover, for any natural number n, s ≤ n if and only ifm ≤ n · p.

PROOF : By division, there are natural numbers q and r so thatm =

n · q+ r and r < n. If r = 0, thenm+ r = n · q. Otherwise, r < n andthere is a “difference” t so that r+ t = n. Clearly, t < n because r > 0.Som+ t = n · q+ r+ t = n · q+n = n · (q+ 1).

Suppose s ≤ p. Thenm ≤ n · s ≤ n · p. Conversely, ifm ≤ n · p,then

n · s = m+ t < n · p+n = n · (p+ 1).

But multiplication by n is order reflecting, so s < p+ 1. That is, s ≤ p.�

Exercises

87. Calculate the following:

(a) 24// 7


(b) 10000000 mod 10000001

(c) 13 mod 8

(d) 8 mod 5

(e) 5 mod 3

(f) 3 mod 2

(g) 2 mod 1

88. Show that for anym : N, any positive n : N and any positivenatural number p, it is the case that

(p ·m)//(p · n) = m//n

and(p ·m) mod (p · n) = p · (m mod n).

11Greatest Common Divisors and Least Common Multi-ples

Recall that min(m,n) is characterized as the greatest lower bound of Write better intro and addgoalsWrite better intro and addgoalsm and nwith respect to the standard order ≤. Likewise, max(m,n) is

the least upper bound. For anym, n and p, they satisfy

p ≤ min(m,n) ⇐⇒ p ≤ m and p ≤ nmax(m,n) ≤ p ⇐⇒ m ≤ p and n ≤ p

This chapter concerns analogous operations with respect to divisi-bility. The idea is simple. If any two natural numbers have a greatestlower bound and a least upper bound with respect to ≤, then perhapsany they also have a greatest lower bound and a least upper boundwith repect to divisibility.

In fact, they do. The greatest lower bound with respect to divisibil-ity is called the greast common divisor and is denoted gcd(m,n). Theleast upper bound with respect to divisibility is called the least commonmultiple and is denoted by lcm(m,n). But it is not immediately obvi-ous that a greatest common divisor or least common multiple always As you know, greatest common divisors

and least common multiples play usefulroles in manipulating fractions. Forexample, to add 1

6 + 215 , we first find

that 30 is least common multiple of 6and 15. So 1

6 = 530 and 2

15 = 430 . Thus

16 + 2

15 = 930 . The result is not reduced

to simplest terms. Continuing, we notethat 3 is the greatest common divisor of9 and 30. So 9

30 = 3·33·10 = 3

10 .

exists.Our first task it to show that they do. That is, for every two natural

numbersm and n, we seek natural numbers gcd(m,n) and lcm(m,n)so that

p|gcd(m,n) ⇐⇒ p | m and p | n

lcm(m,n) | p ⇐⇒ m | p and n | p

A proof that every two natural numbers have a greatest commondivsior is quite old. Euclid included such a proof in his geometrytext. The proof given here is based on a proof due to the 18th centurymathematician Bezout. Like Bezout’s actual proof, this proof gives


us more information about the greatest common divisor. Bezout’soriginal theorem is proved in terms of integers, not natural numbers,using negative numbers and subtraction. We work this out in theterms of natural numbers only. The proof is longer and more compli-cated, essentially because it replaces subtraction with steps that usecancellation. The advantages are that the proof demonstrates how touse cancellativity in a proof, and that the result only involves naturalnumbers (integers are not needed).

THEOREM 3: Bezout’s Theorem

For any natural numbersm and n for whichm ≤ n, there arenatural numbers g, a and b so that

• g is the greatest common divisor ofm and n, and

• g+ a ·m = b · n

PROOF : The proof proceeds by strong induction onm. For the pur-pose of this proof, let us say that three natural numbers g, a and bconstitute a Bezout triple for (m,n) if

• g is the greatest common divisor ofm and n, and

• g+ a ·m = b · n

So the claim is that for any natural numbersm and n, ifm ≤ n, thenthere is a Bezout triple for (m,n).

Strong inductive hypothesis Suppose that for some k it is the case thatfor all j < k and all n ≥ j, there is a Bezout triple for (j,n).

Inductive step: The goal is to find a Bezout triple for (k,n) for any ngreater than or equal to k.

If k = 0, then clearly n divides k and divides n, so n is the greatestcommon divisor. And n+ 0 · k = 1 · n. Hence n, 0 and 1 form aBezout triple for (k,n).

Suppose 0 < k ≤ n, then n mod k < k. So by the strong inductivehypothesis, there is a Bezout triple for (n mod k) and k. That is,we can pick natural numbers g ′, a ′ and b ′ so that g ′ is the greatestcommon divisor of (n mod k) and k, and

g ′ + a ′ · (n mod k) = b ′ · k (11.1)

GREATEST COMMON DIVISORS AND LEAST COMMON MULTIPLES 87

Because g ′ is the greatest common divisor of n mod k and k, itthe case that g ′ | k(n// k). So g ′ | (k(n// k) + n mod k). Butn = k(n// k) + n mod k. Hence g ′ divides both k and n. Supposec | k and c | n. Then c | (k(n// k) +n mod k). So c | (n mod k). Thatis, g ′ is also the greatest common divisor of k and n.

It remains to find a and b satisfying

g ′ + a · k = b · n

so that g ′, a and b constitute the desired Bezout triple for (k,n).

From Equation 11.1, by adding equals to each side and using thefact that n = k · (n// k) +n mod k,

g ′ + a ′ · (n mod k) = b ′ · k (11.2)

g ′ + a ′ · (k · (n// k) +n mod k) = b ′ · k+ a ′ · k · (n// k) (11.3)

g ′ + a ′ · n =(b ′ + a ′ · (n// k)

)· k. (11.4)

So g ′, a ′ and b ′ + a ′ · (n// k) form a Bezout triple for (n,k). But weseek a Bezout triple for (k,n).

By Corollary 1, choose s and t so that

a ′ + t = k · s. (11.5)

Adding t · n to both sides of Equation 11.4, then using Equation 11.5,

g ′ + (a ′ + t) · n =(b ′ + a ′ · (n// k)

)· k+ t · n (11.6)

g ′ + (s · n) · k =(b ′ + a ′ · (n// k)

)· k+ t · n (11.7)

By linearity, either b ′ +a ′ · (n// k) ≤ n · s or n · s < b ′ +a ′ · (n// k).Consider two cases separately.

Supposeb ′ + a ′ · (n// k) + d = n · s (11.8)

for some d. Then we claim that g ′ + d · k = t · n. By adding d · kto both sides of Equation 11.7, using Equation 11.8, then cancellings · n · k,

g ′ + (s · n+ d) · k = (b ′ + a ′ · (n// k) + d) · k+ t · n (11.9)

g ′ + (s · n+ d) · k = (s · n) · k+ t · n (11.10)

g ′ + d · k = t · n (11.11)

So g ′, d and t form a Bezout triple for (k,n). This completes thecase when b ′ + a ′ · (n// k) ≤ n · s.


Supposen · s+ e = b ′ + a ′ · (n// k) (11.12)

for some positive e. Recall that 0 < k ≤ n. So n is positive. Againusing Corollary 1, pick natural numbers u and v so that

e+ v = n · u. (11.13)

In this case, we claim that g ′ + u · k = (t+ v · k) · n. SubstitutingEquation 11.12 into Equation 11.7, cancelling s · n · k, adding v · k toboth sides, and substituting Equation 11.13,

g ′ + (s · n) · k = (s · n+ e) · k+ t · n (11.14)

g ′ = e · k+ t · n (11.15)

g ′ + v · k = (e+ v) · k+ t · n (11.16)

g ′ + v · k = (u · k+ t) · n (11.17)

So g ′, v and u · k+ t constitute a Bezout triple of (k,n). This com-pletes the case when n · s < b ′ + a ′ · (n// k).

�

The theorem establishes that the greatest common divisor exists forany two natural numbersm ≤ n. But because gcd(m,n) = gcd(n,m),it actually shows that the order does not matter, except with respect to Actually, the last half of the proof

involved converting a Bezout triplefor (n,k) into a Bezout triple for (k,n).So in principle, we could adapt theproof the show directly that the orderof arguments does not matter in anyexcept whenm = 0.

determining Bezout triples.An algorithm for calculating Bezout triples, and therefore gcd(m,n),

can be extracted from the proof. The details of calculating Bezouttriples is complicated by the fact that a case distinction plays a role.We can extract a simpler algorithm for calculating gcd(m,n) withoutfinding actual Bezout triples. This is essentially the algorithm thatEuclid provided.

ALGORITHM 9: Euclid’s algorithm

For natural numbersm and n, gcd(m,n) can be computed via thefollowing:

gcd(0,n) = n

gcd(m,n) = gcd(n mod m,m) for anym > 0

The least common multiple ofm and n also exists and relates to


gcd(m,n) in the following way.

THEOREM 4: Least common multiples exist

Any two natural numbersm and n have a least common multiple.Moreover, if we write lcm(m,n) for the least common multiple, thenm · n = gcd(m,n) · lcm(m,n).

PROOF : If gcd(m,n) = 0, thenm = 0 and n = 0. So 0 is their onlycommon multiple, andm · n = gcd(m,n) · lcm(m,n).

Suppose that gcd(m,n) is positive and thatm ≤ n (or swapm andn if not). So gcd(m,n) · p = m and gcd(m,n) · q = n for some naturalnumbers p and q.

Let s = gcd(m,n) · p · q. Clearly, s is a multiple ofm and is amultiple of n, and gcd(m,n) · s = m · n. So s is a common multiple ofm and n. So to prove the result, we must show that s is the least. Thatis, if r is also a common multiple ofm and n, then r is a multiple of s.

Supposem · p ′ = r and n · q ′ = r. Then r = gcd(m,n) · p · p ′ =gcd(m,n) · q · q ′. So p · p ′ = q · q ′. Let t = p · p ′.

By Bezout’s Theorem, find natural numbers a and b so that

gcd(m,n) + a ·m = b · n.

Sot · gcd(m,n) + t · a ·m = t · b · n. (11.18)

But t · a ·m = s · a · q ′, and t · b ·n = s · b · p ′. So Equation 11.18 can bewritten as

r+ s · a · q ′ = s · b · p ′ (11.19)

Since r+ s ·a ·q ′ and s ·b ·p ′ are both multiples of s, r is also a multipleof s. �

We can take the result of this theorem as a definition.


DEFINITION 10: Least common multiple as an operation

For any natural numbersm and n, let lcm(m,n) denote the least The proof of Theorem 4 can be usedto extract an algorith for computinglcm(m,n). The simplest, but not mostefficient, method is to take lcm(m,n) =(m · n)// gcd(m,n).

common multiple ofm and n.

Theorem 4 actually tells us more than mere existenc of lcm(m,n).In fact, it shows that gcd(m,n) · lcm(m,n) = m · n. This is ex-actly analogous to one of the laws we established in Chapter ??:min(m,n) + max(m,n) = m + n. Though the proof of Theorem 4is much more complicated, the two results are clearly similar. Theyare in fact closely related. We will see how after investigating primenumbers in Chapter 12.

As one should expect now gcd and lcm are closely analogous tomin and max. Many of laws we enumerated for min and max carryover. Compare the following table of laws to the table on page 75


BASIC LAWS OF gcd AND lcm

For any natural numbersm, n and p:Characterization via |

m|gcd(n,p) if and only ifm | n andm | p

lcm(m,n) | p if and only ifm | p and n | p

Associativity gcd(m, gcd(n,p))= gcd(gcd(m,n),p)lcm(m, lcm(n,p))= lcm(lcm(m,n),p)

Commutativity gcd(m,n)= gcd(n,m)

lcm(m,n)= lcm(n,m)

Idempotency gcd(m,m)= m

lcm(m,m)= m

Absorptivity m= gcd(m, lcm(n,m))

m= lcm(m, gcd(n,m))

Distributivity m · gcd(n,p)= gcd(m · n,m · p)m · lcm(n,p)= lcm(m · n,m · p)

lcm(m, gcd(n,p))= gcd(lcm(m,n), lcm(m,p))gcd(m, lcm(n,p)= lcm(gcd(m,n), gcd(m,p))

Modularity m · n= gcd(m,n) · lcm(m,n)

11.1 Relative Primality

One characterization of prime numbers is that p is prime if and onlyif p is greater than 1 and gcd(p,n) = 1 for all n lying strictly between1 and p. For example, 5 is prime. It is easy to check that gcd(5, 2) =

gcd(5, 3) = gcd(5, 4) = 1. More generally, pairs of numbers forwhich gcd(m,n) = 1 play an important role in many parts of discretemathematics. Hence the following definition.


DEFINITION 11: Relatively Prime Numbers

Natural numbersm and n are said to be relatively prime if and onlyif gcd(m,n) = 1.

For now, the idea of relatively prime numbers is not much directuse. But notice that when you “reduce” a fraction, such a 24

15 , you findthe greatest common divisor gcd(24, 15) and factor it from the numera-tor and denominator. In this case, the result is 3·8

3·5 = 85 . The result has

a numerator and denominator that are relatively prime. In fact, thisis a useful definition of “reduced fraction”. I won’t pursue this ideahere because fractions are not our concern for now. Nevertheless, acouple of useful lemmas regarding relative primality are in order Youcan work out how that relate to fractions.

LEMMA 19: Factoring out a greatest common divisor results in rela-tively prime numbers.

Supposem and n are any two natural numbers. Choose p and qso thatm = gcd(m,n) · p and n = gcd(m,n) · q. Then p and q arerelatively prime.

PROOF : This is an exercise. �

LEMMA 20: Products of relative primes are relatively prime

If gcd(m,p) = 1 and gcd(n,p) = 1, then gcd(m · n,p) = 1.PROOF : In case p = 0, this is trivial since gcd(m, 0) = m andgcd(n, 0) = n. Otherwise, gcd(n,p) = gcd(p mod n,n) andgcd(m,p) = gcd(p mod m,m) Suppose gcd(m · n,p) > 1, gcd(m,p) =1 and gcd(n,p) = 1. So gcd(m · n,p) divides p. Evidently, it can notdividem or n. So there is a non-zero remainder in both cases. Thatis,m = q · gcd(m · n,p) + r and n = q ′ · gcd(m · n,p) + r ′ for someq, r,q ′, r ′ where 0 < r, r ′ < gcd(m · n,p). �


LEMMA 21: A cancellation law for divsibility

Ifm and n are relatively prime andm | n · p, thenm | p.

PROOF : Suppose gcd(m,n) = 1 andm divides n · p. Eitherm ≤n or n < m. Supposem ≤ n, Bezout’s Theorem yields naturalnumbers s and t so that 1 + s ·m = t · n. Notice that t can not be0. So p + p · s ·m = p · t · n. Sincem divides the right side of thisequation, it divide the left. Sincem also divides p · s ·m, it mustdivide p. Suppose n < m, then again Bezout’s Theorem yields naturalnumbers s and t so that 1+ s . . . n = t ·m. Again t 6= 0. Som | t · n · pand t · n · p = p+ s ·m · p. So againmmust also divide p.�

Exercises

89. For each of the following pairs of numbers, calculate their greatestcommon divisor and indicate which pairs are relatively prime.

90. Recall the Fibonacci numbers. Show that fib(n) and fib(n+ 1) arealways relativey prime.

91. Show that for any natural numbersm andm · n + 1 are alwaysrelatively prime.

12Prime Numbers

PRIME NUMBERS ARE BUILDING BLOCKS for the natural num-CHAPTER GOALS

Define rigorously the prime numbersand prove the Fundamental Theorem ofArithmetic (each positive natural num-bers has a unique prime factorization).

bers. As such they are fundamental to many branched of discretemathematics.

We all know what a prime number is: a number that is not 1 andhas no factors smaller than itself. One might ask why 1 is excluded.After all, 1 does not have any factors smaller than itself either. So atfirst blush, it seems like an arbitrary thing to exclude. In fact, histori-cally 1 has been regarded as a prime number. But that actually makesthings more difficult. So now, everyone agrees that 1 should not beconsidered to be prime. In the next paragraphs, we look at why thismakes sense.

Recall that the product of a list of natural numbers is definedinductively by

•∏

[ ] = 1

•∏n : L = n ·

∏L.

DEFINITION 12: Prime Numbers

A prime number is a positive natural number p so that for any listL of natural numbers, if p |

∏L, then p | n for some n ε L.

With this definition, 1 is not prime for exactly the same reason that6 is not. In both cases, all we need to show is the number divides aproduct of natural numbers, but does not divide any of the factorsin that product. For 1, just observe that 1 divides

∏[ ], but 1 does not

divide any item on the list [ ] ( because there are no items on the emptylist). Likewise, 6 divides

∏[2, 3] but 6 does not divide any item on the

PRIME NUMBERS 95

list [2, 3]. In contrast, 2 is prime because if 2 |∏Lwith L being a list of

natural numbers, at least one of the items on the list must be even.Notice that for both of the counter-examples, 6 and 1, we were able

to choose a product that does not just divide the number, but is equalto the number. This suggests that the definition of primality is relatedto a seemingly more general notion.

DEFINITION 13: Irreducible numbers

An irreducible number is a positive natural number p so that forany list L of natural numbers, if p =

∏L, then p ε L.

This definition is most likely what you were told is the definition ofprimality. That is, a prime number is usually defined to be any naturalnumber greater than 1 that can not be factored into properly smallenumbers. It is not hard to see that this is precisely what irreducibilitymeans.

The distinction between irreducibilty and primality is an importantone in other contexts. We will encounter some of those later whenwe discuss modular arithmetic. As it happens though, in the naturalnumbers primes and irreducibles are the same. So why bother to havetwo definitions now?

First, in situations very closely related to the natural numbers,primes and irreducibles really are not the same. So it is helpful to beaware of the distinction now. Second, the proof that the two conceptsare the same for natural numbers is instructive. To show that allprimes are irreducible is fairly easy. The proof that all irreducibles areprime is more complicated.

LEMMA 22: All primes are irreducible

Any prime number is also an irreducible number.

PROOF : Suppose p is prime. Consider a list L so that p =∏L. Then

p divides∏L. So there is some n ε L so that p divides n. But n also

divides∏L. Because divisibility is anti-symmetric, p = n. �

To prove the converse, we need a fact that (in a different form) wsknown to Euclid.


LEMMA 23: Euclid’s Lemma

If p is irreducible, then for anym eitherm and p are relatively prime,orm is a multiple of p.

PROOF : Suppose p is irreducible. In particular, it is positive, sogcd(p,m) 6= 0.

Suppose 1 < gcd(p,m) < p. Then for some q, gcd(p,m) · q = p.But then 1 < q < p as well. So p is not irreducible. This contradictsthe assumption, so gcd(p,m) must either equal 1 or be greater than orequal to p. But gcd(p,m) ≤ p always holds when p is positive.�

THEOREM 5: Irreducibles are prime

Every irreducible number is prime.PROOF : Suppose p is irreducible. By induction on lists, we show thatif p |

∏L holds for some list L, then p | n for some n on the list.

Basis The basis holds vacuously because∏

[ ] = 1 6= p.

Inductive Hypothesis Suppose K is a list of natural numbers so that ifp |∏K, then p | n for some n on the list K.

Inductive Step The goal is to show that for any natural numberm, ifp |∏

(m : K), then p | n for some n on the listm : K. That is, eitherp | m or p | n for some n on the list K.

By Euclid’s Lemma, eitherm is a multiple of p orm and p arerelatively prime. If it is the former, then p | m. If it is the latter, thenp |∏K by Lemma 21. So by the inductive hypothesis, p | n for

some n on the list K.

�

PRIME NUMBERS 97

LEMMA 24: All irreducibles are prime

Any irredicuble number is a prime number.

PROOF : Suppose r is an irreducible number. We prove by induction onlists that if r |

∏L, then r | n for some n on the list.

Basis If r is irreducible, it is greater than 1. So r can not divide∏

[ ].The basis is true vacuously.

Inductive Hypothesis Suppose that K is a list of positive natural num-bers so that if r |

∏K, then p | n for some n on the list K.

Inductive Step The goal is to show that for any positivem, if r divides∏(m : K), then r divides some n on the listm : K. That is, either

r | m or r | n for some n on the list K.

Suppose that r |∏

(m : K). Consider g = gcd(r,m). Because bothr andm are positive, g is also positive and r = g · q for some q.Because r is irreducible, either g ≥ r or q ≥ r. So one or the otheris equals r. If g = r, then r | m. Otherwise, r andm are relativelyprime. So by Lemma 21,m | m ·

∏K,m |

∏K. So the inductive

hypothesis ensures thatm | n for some n on the list K.

�

Consequently, we can use the terms “irreducible” and “prime”interchangibly for natural numbers (also for integers when they comeup). But bear in mind that the two concepts differ in other contexts.

Our next task is to prove two facts you are been told many times.First, the Fundamental Theorem of Arithmetic says that every positivenatural number can be factored in essentially one way into primenumbers. Second, we show that there are infinitely many primes.

We split the proof of Fundamental Theorem into two parts. First,we prove that every positive natural number factors into irreducibles.Then we show that any factorization into primes is unique except forthe order in which the primes are listed.


LEMMA 25:

For any positive natural numberm, there is a list P consisting only ofirreducible numbers so thatm =

∏P.

PROOF : Here we use strong induction.

• [Strong Inductive Hypothesis] Assume that for some positive k, itis the case that for every 0 < j < k, there is a list of primes P so thatj =∏P.

• [Strong Inductive Step] There are three cases to consider. Eitherk = 1, or k = i · j for some positive i and j strictly less than k, orneither of these holds.

Suppose k = 1. Then we let P = [ ]. This is a (trivial) list of irre-ducibles whose product is k.

Suppose k = i · j where i and j are both positive and strictly less thank. By the inductive hypothesis, there are lists of primes Q and R sothat i =

∏Q and j =

∏R. Hence k =

∏Q ·∏R =∏

(Q++ R).

Suppose k is neither equal to 1, nor equal to i · j for any two naturalnumbers strictly less than k. Then k itself is irreducible. The proofof this is by induction on lists.

Basis Because k 6=∏

[ ], the basis is vacously true. That is if k =∏[ ], then kwould appear on the list [ ].

Inductive hypothesis Suppose that for a list K, if k =∏K, then k

appears on the list.

Inductive step The goal is to show that for any n, if k =∏

(n : K),then k appears on the list n : K. That is, either k = n or k ε K.

By definition,∏

(n : K) = n ·∏K. But the case of k under

consideration is that k = i · j implies either i ≥ k or j ≥ k. Becausek is positive, either k = n or k =

∏K. In the first case, k ε (n : K).

In the second case, by the inductive hypothesis for K, k ε K, sok ε (n : K).

So k is irreducible. And since k =∏

[k], it is the product of a list ofirreducibles.

These are the only possible cases. So the strong inductive step isproved.

�

PRIME NUMBERS 99

The list of irreducibles constructed in Lemma 25 is often called aprime factorization ofm (because we know irreducible and primemean the same thing). Generally, a composite has more than onesuch factorization for trivial reasons. For example, the lists [2, 3]and [3, 2] both factor 6 into irreducibles. But the only way two suchfactorizations of the same number can differ is by the order in whichthe factors are listed. If we insist that our lists are sorted in increasingorder, this ambiguity is avoided.

Recall from Chapter ??.?? that because N is ordered by ≤, we cansay what it means to have a sorted list of natural numbers. Namely, alist L of natural numbers is sorted if Li ≤ jj whenever i ≤ j < len(P).

LEMMA 26:

Sorted lists of primes are determined by their productsFor any sorted lists of primes, P and Q, if

∏P =∏Q then P = Q. In other words, the factorization of a

positive natural number into primes isunique up to the order in which we listthe factors.

PROOF : The proof is by list induction on P.

Basis For any list Q of irreducibles, if∏

[ ] =∏Q, then Qmust be the

empty list as well. I leave the details of checking this to you.

Inductive Hypothesis Suppose that for some sorted list K consistingof primes it is the case that for every sorted list P of primes, if∏K =∏P, then K = P.

Inductive Step Consider some primem so thatm : K is sorted. Thegoal is to show that for any sorted list of primes Q, if

∏m : K =∏

Q, thenm : K = Q. This is by list induction on Q.

Basis As before,∏

[ ] = 1. So the basis holds vacuously.

Inductive Hypothesis Suppose that for some list sorted list J, it is thecase that if

∏m : K =

∏J, thenm : K = J.

Inductive Step Consider prime n so that n : J is sorted. Ifm = n,then

∏K =

∏J. So by the main inductive hypothesis, K = J. So

m : K = n : J.

To complete the inductive step we show thatm < n and n < mare both impossible. Ifm < n, thenm is relatively prime to n.Som |

∏J, but because n : J is a sorted list of primes,m is also

relatively prime to each item in J. This contradicts primality ofm.Likewise, n < m contradicts primality of n for the sae reason byswapping the roles ofm and n, and of K and J.

So we have completed the iductive proof that for any sorted list of


primes Q, if∏m : K =

∏Q, thenm : K = Q. This is the needed

inductive step.

�

The preceeding lemmas show that every positivem has a uniquesorted prime factorization, typically called the prime factorization.

THEOREM 6: Fundamental Theorem of Arithmetic

Every positive natural number has a unique sorted prime factoriza-tion.

PROOF : All that remains is to the remark that if P is a prime factoriza-tion ofm, then P can be sorted into increasing order. The result has thesame product P because of commutativity. �

For example, 24 is factored as [2, 2, 2, 3] and 800 is factored as[2, 2, 2, 2, 2, 5, 5]. We can get a more efficent representation by listingthe number of times each prime is repeated. So we can represent 800by [5, 0, 2], signifying that 800 = 253052. Notice that in this notation,we need the middle 0 as a “place holder” to indicate that our numberdoes not have any 3 factors. Also “trailing zeros” in this notation donot make a difference. [5, 0, 3, 0] also represents 800. The extra 0 at theend simply tells us that 800 is not divisible by the next prime (7). Wewill investigate this notation informally (without supplying oroofs)after establishing that we have a plentiful supply of primes. The ad-vantage of the notation is that it allows us to see immediately how gcdand lcm are not merely analogous to min and max, but are actuallyclosely related computationally.

THEOREM 7:

There are infinitely many primesFor every natural numberm, there is a prime p greater thanm.

PROOF : We prove this by showing that no finite list of prime exhaustsall of them.

Suppose L is a non-empty list consisting of primes. To show thatL is missing something, consider the numberm = 1 +

∏L. Since∏

L ≥ 1,m > 1. Som has a non-empty irreducible factorization, sayM. ClearlyM does not have any item in common with L (this couldbe proved explicitly by induction on L). So any item on the listM is

PRIME NUMBERS 101

missing from L SinceM is not empty, L can not be an exhaustive list ofall primes. �

DEFINITION 14:

We can enumerate the primes: 2, 3, 5, 7, . . . in increasing order. Forevery natural number k, let pk be the kth prime. That is, the numberspk satisfy

p0 = 2

pky) = the smallest prime q so that pk < q

Notice that this is well-defined because for anym there is a primegreater thanm. We would not know this if we did not know there areinfinitely many primes.

Exercises

92. What is p10?

93. What is the prime factorization of 1440?

Using pk, we can succinctly represent any positive natural num-ber by a list of exponents of primes. Namely, for a list R of naturalnumbers, define Rpe (for “prime exponent representation”) to be

Rpe :=∏

i<lenR

pRii .

For example, [1, 2, 0, 2]pr = 21325072 = 882.For a positive natural number n, let PE(n) denote the unique list so

that (i) PE(n) 6= = n and (ii) PE(n) does not contain any trailing zeroes.Define P+Q for two lists of natural numbers by adding the items of

the two lists itemwise. That is,

P+[] = P

[]+Q = Q

m : P+n : Q = (m+n) : (P+Q)

Then it is clear (we will not give a proof) that Ppe ·Qpe = (P+Q)pe.


Define a relation P � Q on lists of natural numbers by

• [] � Q always,

• m : P � [] never, and

• m : P � n : Q if and only ifm ≤ n and P � Q.

Then Ppe | Qpe if and only if P � Q. In other words, if we list theexponents of primes that constitute given numbersm and n, we cancompare them for divisibility simply by comparing the exponents by≤.

This leads to the relation between min and gcd. Namely, define anoperation on lists of natural numbers by taking minima itemwise.

min(P, [ ]) = P

min([ ],Q) = Q

min(m : P,n : Q) = min(m,n) : min(P,Q)

Then it is also easy to check that min(P,Q)pe = gcd(Ppe,Qpe) for anytwo lists of natural numbers.

A similar definition of max will lead to max(P,Q)pe = lcm(Ppe,Qpe)

for any two lists of natural numbers as well.Now the fact thatm · n = gcd(m,n) · lcm(m,n) becomes clearly a

corollary of the fact thatm+n = min(m,n) + max(m,n). Namely, forany two lists of natural numbers, P and Q,

P+Q = min(P,Q)+max(P,Q)

is easily proved by induction on lists. And the preceding paragraphsshow thatm · n = gcd(m,n) · lcm(m,n) follows from this.

In principle, gcd(m,n) could be calculated by first factoringm andn into primes, then using min on the resulting lists, then convertingthat back to a natural number. This method, however, is extremelyinefficient because finding the prime factorization of a large number isdifficult.

lecture notes on discrete mathematics · 5 ordering and strong induction 34 ... question of the...

Documents