Course Notes, Discrete Mathematics 1, University of Limerick (jkcray.maths.ul.ie/ms4111/slides.pdf)

Course Notes

for

MS4111/MS4132

Discrete Mathematics 1

J. Kinsella

February 5, 2002


Contents

1 Preamble
  1.1 Availability of Notes
  1.2 Aims/Objectives
  1.3 Prime Text
  1.4 Syllabus
  1.5 Tutorials & Assessment
  1.6 Warning
  1.7 In Praise of Lectures

2 Elementary Logic
  2.1 Statement Calculus
    2.1.1 Connectives
    2.1.2 Rules of Statement Calculus
    2.1.3 Tautologies and Contradictions
  2.2 Predicate Calculus
    2.2.1 Quantifiers
  2.3 Revision Exercises

3 Proof Techniques
  3.1 Direct Enumeration
  3.2 Direct Proof
  3.3 Contrapositive Proof
  3.4 Proof by Contradiction
  3.5 Proof by Induction
    3.5.1 A Variation on the Principle of Induction

4 Set Theory
  4.1 Equality of Sets
  4.2 Set Operators
  4.3 Paradoxes in Naive Set Theory
    4.3.1 Russell’s Paradox
  4.4 Power Sets

5 Functions
  5.1 Domain, Co-domain & Range
  5.2 Properties of Functions
  5.3 Operations on Functions

6 Recurrence Relations
  6.1 Sequences and Recurrence Relations
  6.2 Derivation of a Recurrence Relation
  6.3 Iteration Technique
  6.4 Substitution Technique: Introduction
  6.5 Substitution Technique: Details
    6.5.1 Two Distinct Real Roots
    6.5.2 Equal Real Roots
    6.5.3 Complex Roots
  6.6 Substitution Technique: Inhomogeneous Case
    6.6.1 Polynomial Inhomogeneous Term
    6.6.2 Trigonometrical Inhomogeneous Term

7 Application: Analysis of Algorithms
  7.1 Selection Sort Algorithm
  7.2 Binary Search Algorithm
  7.3 Insertion Sort
  7.4 Merge Sort

A Supplementary Material
  A.1 In Praise of Lectures
  A.2 Wiles’ Proof


1 Preamble

1.1 Availability of Notes

The Notes for this course are freely available via the Web at http://jkcray.maths.ul.ie/ms4111.html. You may also access the notes (and other related material) on the local area network (L.A.N.) via the network share \\JKCRAY\PUBLIC in the folder MS4111. You will need to use the Map Network Drive command, available by right-clicking on the “My Computer” or “ITD Computer” icon on the top left of the display. Enter the network share above and the system will allocate a drive letter. Press OK and a window will open showing the folders available. Select the folder MS4111 and double-click on the file Slides.



1.2 Aims/Objectives

• To introduce students to the language of Discrete Mathematics.

• To show its relevance, particularly in Computer Science.

• To reinforce development of problem-solving skills.



1.3 Prime Text

• Discrete Mathematics

– R. Johnsonbaugh

– Maxwell Macmillan

– Available in Bookshop



1.4 Syllabus

1. Elementary Logic

2. Proof Techniques

3. Set Theory

4. Functions

5. Recurrence Relations

6. Application: Analysis of Algorithms



1.5 Tutorials & Assessment

• Two lectures/week.

• One tutorial/week - to be assigned.

• Problems will be assigned each week on Thursday to be attempted for the following week’s tutorial.

• The module will be assessed via at least one “mid-term” assessment together with an end-of-semester assessment.



1.6 Warning

It should be pointed out that St. Augustine warned the faithful as follows:

The Good Christian should beware of Mathematicians and those who make empty prophecies. The danger already exists that the Mathematicians have made a covenant with the devil to darken the spirit and to confine man in the bonds of Hell.

Caveat emptor.



1.7 In Praise of Lectures

See Appendix A.1 for an interesting discussion on the pros & cons of attending lectures & taking notes.



2 Elementary Logic

Definition 2.1 A statement in logic is an English (or any natural language) sentence which has a definite truth value, i.e. is either true or false. The terms statement, proposition & assertion are all equivalent.

Example 2.1 Here are some English sentences. Some are statements, some not.



Sentence                        Statement?

“Two is a number”               Yes
“x is positive”                 No
“Mathematics is interesting”    Yes
“God exists”                    Yes
“Please pay attention”          No
“There is life on Mars”         Yes

Figure 2.1: Example sentences



2.1 Statement Calculus

Statements are the fundamental building blocks of “The Calculus of Propositions” or “Statement Calculus”. We consider in this subsection “calculations” using/concerning statements.

2.1.1 Connectives

We need to introduce “connectives” to combine simple statements into compound statements. Introduce a simple symbolic notation where capital letters (A, B, C, ...) can stand for any statement. First, we define the concept of equality for statements:



Definition 2.2 (Equality) Two statements are equal if they have the same truth value, irrespective of the truth values of their component statements, if any.

We use the symbol ≡ instead of = to represent equality for statements to remind ourselves that we are doing Statement Calculus, not Arithmetic.



AND Connective

Definition 2.3 The AND connective is defined by saying A AND B is TRUE only when both A and B are TRUE: otherwise A AND B is FALSE.

Notation: Use T to represent the truth value TRUE and use F to represent the truth value FALSE. Use the symbol ∧ to represent the AND connective. We now introduce the device known as a “truth table” which allows unambiguous definitions in statement calculus. Simply list all 2^n possible combinations of truth values for the n component statements.



A   B   A ∧ B

T   T     T
T   F     F
F   T     F
F   F     F

Figure 2.2: Truth Table for AND

The statement A ∧ B is our first example of a “compound statement”. This is called the conjunction of A and B.
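The truth-table device can be sketched in a few lines of Python. This is only an illustration and not part of the course text: the helper names `truth_table` and `show` are assumptions introduced here. The sketch lists all 2^n combinations of truth values for n component statements and evaluates a compound statement on each.

```python
from itertools import product

def truth_table(statement, n):
    """Return the rows of the truth table of `statement`.

    `statement` is any function of n truth values (a hypothetical helper,
    introduced only for this illustration).
    """
    rows = []
    # product([True, False], repeat=n) yields all 2**n combinations,
    # starting with all-True, in the same row order as the tables in the notes.
    for values in product([True, False], repeat=n):
        rows.append(values + (statement(*values),))
    return rows

def show(value):
    """Render a truth value as T or F."""
    return "T" if value else "F"

# Truth table for A AND B (compare Figure 2.2).
for row in truth_table(lambda a, b: a and b, 2):
    print(" ".join(show(v) for v in row))
```

Passing a different function, for example `lambda a, b: a or b`, produces the table of another connective in the same way.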



Example 2.2 Life is real and life is earnest.

We “parse” the sentence into its component statements A “Life is real” and B “Life is earnest” and write it as A ∧ B. The compound statement A ∧ B is true when both A & B are true and false otherwise.



OR Connective

Definition 2.4 A OR B (A ∨ B) is FALSE only when both A, B are FALSE and is TRUE otherwise.

A   B   A ∨ B

T   T     T
T   F     T
F   T     T
F   F     F

Figure 2.3: Truth Table for OR



We refer to “A OR B” as the disjunction of A & B. This is an “inclusive” OR, i.e. A ∨ B is TRUE when either A or B or both are TRUE. An “exclusive” OR (XOR) can also be defined, as in the following example.

Example 2.3 Either “it will rain this evening” (A) or “I will cut the grass” (B). This does not correspond to A ∨ B but to (A ∨ B) ∧ (A ∧ B)′.

Here the quote mark ′ is the symbol used for the NOT connective, which we now define. (NOT A is also often written Ā, i.e. A with an overbar.)
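As a quick check of Example 2.3, the following sketch evaluates (A ∨ B) ∧ (A ∧ B)′ for all four combinations of truth values and compares it with “A and B differ”. The function name `exclusive_or` is an assumption made for this illustration.

```python
from itertools import product

def exclusive_or(a, b):
    # (A OR B) AND NOT (A AND B), as in Example 2.3.
    return (a or b) and not (a and b)

for a, b in product([True, False], repeat=2):
    # For truth values, a != b holds exactly when a and b differ,
    # which is the intended meaning of "either ... or ... but not both".
    assert exclusive_or(a, b) == (a != b)
    print(a, b, exclusive_or(a, b))
```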



NOT Connective

Definition 2.5 The statement “NOT A” is TRUE when A is FALSE and vice-versa.

A A′

T F

F T

Figure 2.4: Truth Table for NOT



IMPLICATION Connective

In ordinary English usage, we understand the statement “A implies B” to be true if both A & B are true and false if A is true and B is not. What if A is false? This case is not usually of interest in ordinary conversation but for “implies” to be a connective, “A implies B” must have a definite truth value for all values of A & B.

We say that “A implies B” is “vacuously true” if A ≡ F, irrespective of the truth value of B. Consider the following example:

Example 2.4 “If every day is Christmas Day then I am Santa Claus” is TRUE!



We use the “⇒” symbol to stand for the implication connective. We can now define A ⇒ B using a truth table.

A   B   A ⇒ B

T T T

T F F

F T T

F F T

Figure 2.5: Truth Table for ⇒



We need to define the terms “necessary” and “sufficient” which are often used in mathematical discussions.

Definition 2.6 Say that “A is sufficient for B” if A ⇒ B.

Definition 2.7 Say that “A is necessary for B” if B ⇒ A.



EQUIVALENCE Connective

Definition 2.8 We say that “A is equivalent to B” if “A is necessary and sufficient for B”.

We use the symbol ⇔ to stand for the equivalence connective. It should be clear that Definition 2.8 corresponds to the following truth table:

A   B   A ⇔ B

T T T

T F F

F T F

F F T

Figure 2.6: Truth Table for ⇔



Note that A ⇔ B is T precisely when A and B have the same truth value. A final note on equivalence: ⇔ is a connective, ≡ is not. Put simply, connectives can be used to combine statements into more complex statements; relational operators like ≡ allow us to compare statements.
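The link between Definition 2.8 (“necessary and sufficient”) and the truth table for ⇔ can be checked mechanically. The sketch below is only an illustration; the function names `implies` and `equivalent` are assumptions introduced here.

```python
from itertools import product

def implies(a, b):
    # Truth table of Figure 2.5: false only when a is true and b is false.
    return not (a and not b)

def equivalent(a, b):
    # "A is necessary and sufficient for B": (B => A) AND (A => B).
    return implies(a, b) and implies(b, a)

for a, b in product([True, False], repeat=2):
    # A <=> B should be T exactly when A and B have the same truth value.
    assert equivalent(a, b) == (a == b)
    print(a, b, equivalent(a, b))
```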



2.1.2 Rules of Statement Calculus

We have defined a set of connectives (operators) analogously to the definitions of +, −, ×, ÷ in arithmetic. Corresponding to the familiar rules of Arithmetic, we can list the Rules of Statement Calculus.



First recall the Rules of Arithmetic (a, b, c stand for any real numbers):

Commutative Law

a + b = b + a
a × b = b × a (2.1)

Associative Law

a + (b + c) = (a + b) + c
a × (b × c) = (a × b) × c (2.2)

Distributive Law

a × (b + c) = (a × b) + (a × c) (2.3)

Zero-One Law

a × 0 = 0
a × 1 = a
a + 0 = a (2.4)



(There are, of course, other rules in Arithmetic, in particular those involving Division ÷, but they are not of interest here.)



We now list the corresponding Laws for Statement Calculus; each must be checked by constructing a truth table. (In the following, T & F are used as abbreviations for TRUE & FALSE respectively.)



Commutative Law

A ∨ B ≡ B ∨ A
A ∧ B ≡ B ∧ A (2.5)

Associative Law

A ∨ (B ∨ C) ≡ (A ∨ B) ∨ C
A ∧ (B ∧ C) ≡ (A ∧ B) ∧ C (2.6)

Distributive Law

A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)
A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C) (2.7)



Identity Law

A ∧ T ≡ A
A ∨ F ≡ A (2.8)

Complement Law

A ∧ A′ ≡ F
A ∨ A′ ≡ T (2.9)



Exercise 2.1 Check the five Laws using Truth Tables.

The importance of the Laws is that they allow us to “do algebra” with statements, i.e. to manipulate so-called “Boolean” or Logical expressions like a ∧ (b ⇒ (c ⇔ a)′) and simplify them where possible.
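Checking a Law by truth table amounts to comparing both sides on every assignment of truth values. The following is a minimal sketch, assuming a helper called `same_truth_table` (a name invented here), applied to the first Distributive Law of (2.7); the remaining Laws can be checked the same way.

```python
from itertools import product

def same_truth_table(lhs, rhs, n):
    """True if lhs and rhs agree on all 2**n assignments, i.e. lhs ≡ rhs."""
    return all(lhs(*v) == rhs(*v) for v in product([True, False], repeat=n))

# First Distributive Law (2.7): A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)
distributive_holds = same_truth_table(
    lambda a, b, c: a and (b or c),
    lambda a, b, c: (a and b) or (a and c),
    3,
)

# A non-law for contrast: A ∨ B does not have the same table as A ∧ B.
not_a_law = same_truth_table(lambda a, b: a or b, lambda a, b: a and b, 2)

print(distributive_holds, not_a_law)
```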



2.1.3 Tautologies and Contradictions

Definition 2.9 A tautology is a statement which has the truth value T irrespective of the truth values of its component statements, if any.

Example 2.5

T ≡ T (2.10)
A ∨ A′ ≡ T (2.11)
(A ⇒ B) ⇔ (B′ ⇒ A′) ≡ T (2.12)

Definition 2.10 A contradiction is a statement which has truth value F irrespective of the truth values of its component statements, if any.



Example 2.6

F ≡ F (2.13)
A ∧ A′ ≡ F (2.14)
(A ⇒ B) ⇔ (B′ ⇒ A′)′ ≡ F (2.15)
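Whether a compound statement is a tautology, a contradiction or neither can be decided by inspecting the last column of its truth table. The sketch below illustrates this for (2.12) and (2.14); the helper names `implies` and `classify` are assumptions made for this example only.

```python
from itertools import product

def implies(a, b):
    # False only when a is true and b is false (Figure 2.5).
    return not (a and not b)

def classify(statement, n):
    """Return 'tautology', 'contradiction' or 'neither' for n components."""
    values = [statement(*v) for v in product([True, False], repeat=n)]
    if all(values):
        return "tautology"
    if not any(values):
        return "contradiction"
    return "neither"

def contrapositive(a, b):
    # (2.12): (A ⇒ B) ⇔ (B′ ⇒ A′); == on truth values plays the role of ⇔.
    return implies(a, b) == implies(not b, not a)

def a_and_not_a(a):
    # (2.14): A ∧ A′
    return a and not a

print(classify(contrapositive, 2))
print(classify(a_and_not_a, 1))
print(classify(lambda a, b: a or b, 2))
```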

Exercise 2.2 Construct a truth table for the compound statement (A′ ∨ B). Can we draw any conclusions?



2.2 Predicate Calculus

The following are examples of statements:

1. Everybody likes mathematics.

2. Just because you are paranoid doesn’t mean they are not out to get you.

3. All integers of the form 2 × x, where x is an integer, are even.

4. All integers of the form 2^n − 1, where n is a positive integer, are prime.

They have in common the property that they refer to classes (or sets) of objects. This makes the techniques we have developed so far (Statement Calculus) inadequate. (Consider example 2 in the list above.)



The key idea is that of a predicate. A predicate is a logical expression which contains one or more “free variables”. When values are assigned to the free variable(s), we obtain a statement. For example, “If x (free variable) is an integer, then 2 × x is an even integer” is an example of a predicate. We can use a familiar notation (function notation) to refer to predicates: for example let the above predicate be represented as e(x); then e(3) stands for the statement “If 3 is an integer, then 2 × 3 is an even integer”.



2.2.1 Quantifiers

Definition 2.11 Given a predicate, p(x), the statement “For all x, p(x)” is true only if the predicate p(x) becomes a true statement for any value x = x0 of the free variable x.

We use the notation ∀x, p(x) to represent “For all x, p(x)”. The symbol ∀ is referred to as the “universal quantifier”.

Example 2.7 ∀x, [(x ∈ ℝ) ⇒ x^2 ≥ 0] is a TRUE statement.

An alternative and neater notation for the above example is:

Example 2.8 ∀x ∈ ℝ, x^2 ≥ 0



In general, given a predicate p(x), we can restrict the range of variation of the free variable by either:

∀x, [(x ∈ S) ⇒ p(x)]

or we can use a more economical notation:

Definition 2.12

∀x ∈ S, p(x) ≡ ∀x, [(x ∈ S) ⇒ p(x)]



We now define our second quantifier, ∃:

Definition 2.13 Given a predicate, p(x), the statement “There is an x, such that p(x)” is true only if the predicate p(x) becomes a true statement for at least one value of the free variable x, say x0.

We use the notation ∃x | p(x) to represent “There is an x, such that p(x)”. The symbol ∃ is referred to as the “existential quantifier”.

Again, we can restrict the range of possible values for the free variable by either:

∃x | [(x ∈ S) ∧ p(x)]

or, again more economically, by:

Definition 2.14

∃x ∈ S | p(x) ≡ ∃x | [(x ∈ S) ∧ p(x)]
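On a finite universe of discourse the two quantifiers can be evaluated exhaustively, which gives a simple way to see Definitions 2.12 and 2.14 at work. The sketch below is an illustration only: the universe, the set S and the sample predicate p are all assumptions chosen for this example.

```python
# Assumption for illustration only: a small finite universe of discourse,
# so that "for all" and "there exists" can be checked case by case.
universe = range(-5, 6)
S = {1, 2, 3}

def p(x):
    # Sample predicate p(x): "x squared is at least 1" (false only for x = 0).
    return x * x >= 1

def implies(a, b):
    return not (a and not b)

# Definition 2.12: ∀x ∈ S, p(x)  ≡  ∀x, [(x ∈ S) ⇒ p(x)]
forall_restricted = all(p(x) for x in S)
forall_explicit = all(implies(x in S, p(x)) for x in universe)

# Definition 2.14: ∃x ∈ S | p(x)  ≡  ∃x | [(x ∈ S) ∧ p(x)]
exists_restricted = any(p(x) for x in S)
exists_explicit = any((x in S) and p(x) for x in universe)

print(forall_restricted, forall_explicit)
print(exists_restricted, exists_explicit)
```

Note that p(0) is false, yet the explicit universal form is still true: 0 is not in S, so the implication is vacuously true for x = 0.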



Rules for negating quantifiers

The following rules explain how the negation of a statement formed with quantifiers can be obtained.

Theorem 2.1 (Negating Quantifiers)

(∀x, p(x))′ ≡ ∃x | p′(x) (2.16)
(∃x | p(x))′ ≡ ∀x, p′(x) (2.17)

Proof: (Eq. 2.16)

If (∀x, p(x))′ ≡ T, then (∀x, p(x)) ≡ F and so it is not true that for all x, p(x) ≡ T. Therefore there must be at least one choice (say x0) such that p(x0) ≡ F or equivalently p′(x0) ≡ T. We conclude that ∃x | p′(x) is true, as required.

Exercise 2.3 Construct the corresponding argument for Eq. 2.17.
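Both negation rules can be observed on a finite universe, where ∀ corresponds to Python's `all` and ∃ to `any`. This is a finite check under assumed data (the universe and the sample predicate are invented here), not a substitute for the argument asked for in the exercise.

```python
universe = range(1, 11)   # assumed finite universe, for illustration only

def p(x):
    # Sample predicate p(x): "x is even".
    return x % 2 == 0

# (2.16): (∀x, p(x))′ ≡ ∃x | p′(x)
lhs_216 = not all(p(x) for x in universe)
rhs_216 = any(not p(x) for x in universe)

# (2.17): (∃x | p(x))′ ≡ ∀x, p′(x)
lhs_217 = not any(p(x) for x in universe)
rhs_217 = all(not p(x) for x in universe)

print(lhs_216, rhs_216)
print(lhs_217, rhs_217)
```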



Comment 2.1 Of course we have not yet said what we mean by a theorem or proof!

Example 2.9 Consider again the “Just because you are paranoid doesn’t mean they are not out to get you” example above. Define the two predicates “x is paranoid” = p(x) and “They are out to get x” = b(x). Now the English statement translates into

(∀x, [p(x) ⇒ b′(x)])′
  ≡ ∃x | [p(x) ⇒ b′(x)]′
  ≡ ∃x | [p′(x) ∨ b′(x)]′
  ≡ ∃x | [p(x) ∧ b(x)]

Translate this back into English!



Example 2.10 Start with the (?) true statement: “Everybody likes maths”. Let m(x) be the predicate: “x likes mathematics”. In symbolic form, we get: ∀x, m(x). To negate this, use the appropriate rule above, yielding: ∃x | m(x)′. In English: “There is someone who dislikes maths”.

Question: How do we negate predicates of the form ∀x ∈ S, p(x) or ∃x ∈ S | p(x)? Just use the more explicit notation above (Definitions 2.12 and 2.14) to derive the following results:

Exercise 2.4

(∀x ∈ S, m(x))′ ≡ (∃x ∈ S | m(x)′)
(∃x ∈ S | m(x))′ ≡ (∀x ∈ S, m(x)′)



2.3 Revision Exercises

1. Find the antecedent and consequent in each of the following statements.

(a) Healthy plant growth follows from sufficient water.

(b) Increased availability of microcomputers is a necessary condition for further technological progress.

(c) Errors will be introduced only if there is a modification of the program.

(d) Fuel savings imply good insulation or wearing overcoats indoors.



2. Several forms of negation are given for each of the following statements. Which are correct?

(a) Some people like mathematics.
  i. Some people dislike mathematics.
  ii. Everybody dislikes mathematics.
  iii. Everybody likes mathematics.

(b) The answer is either 2 or 3.
  i. Neither 2 nor 3 is the answer.
  ii. The answer is not 2 or 3.
  iii. The answer is not 2 and it is not 3.

(c) All people are tall and thin.
  i. Someone is short and fat.
  ii. No one is tall and thin.
  iii. Someone is short or fat.



3. Let P be the statement “Eliminating unemployment is a key element of public policy”, let Q be the statement “There will be an election soon” and let R be the statement “The important thing is not to rock the boat”. Give simple English sentences for each of the following statements.

P′                      (P ∧ Q)′

Q′ ∧ R′                 (P ∨ Q) ⇒ R

(P ⇒ (Q ∧ R))′          Q′ ∨ (P ⇒ R)



4. Let P be the statement “Whales are aquatic mammals”. Let Q be the statement “Fish is a healthy food”. Let R be the statement “It is wrong to eat animals”. Represent each of the following English sentences symbolically:

(a) If whales are aquatic animals, then it is wrong to eat animals.

(b) Either fish is a healthy food or it is the case both that whales are not aquatic animals and that it is not wrong to eat animals.

(c) It is not the case that if whales are aquatic mammals and it is wrong to eat animals, then fish is not a healthy food.

5. Given that in Question 3, P is false, Q is true and R is true; determine the truth value of each of the compound statements (i) to (vi) given there using either truth tables or the calculus of propositions.



6. Using letters for the component statements, translate the following compound statements into symbolic notation.

(a) If prices go up, then housing will be plentiful and expensive; but if housing is not expensive, then it will still be plentiful.

(b) Either going to bed or going swimming is a sufficient condition for changing clothes; however, changing clothes does not mean going swimming.

(c) Either it will rain or it will snow, but not both.

(d) If Janet wins or if she loses, she will be tired.

(e) Just because you are paranoid doesn’t mean they aren’t out to get you.



7. Construct truth tables for the following statements, where A, B and C are statements. Note any tautologies or contradictions.

(A ⇒ B) ⇔ A′ ∨ B                    (A ∧ B) ∨ C ⇒ A ∧ (B ∨ C)

A ∧ (A′ ∨ B′)′                      A ∧ B ⇒ A′

(A ⇒ B) ⇒ ((A ∨ C) ⇒ (B ∨ C))       (A ⇒ B) ∨ (B ⇒ A)

8. Suppose that A, B and C are conditions that will be true or false when a certain computer program is executed. Suppose further that you want the program to carry out a certain task only when A or B is true (but not both) and C is false. Using A, B and C and the connectives AND, OR and NOT, write a statement which will be true only under these conditions (in the language of your choice).

9. Prove that (B′ ∧ (A ⇒ B)) ⇒ A′ is a tautology.



10. A certain island exists (but where?) whose inhabitants are all either knights or knaves. Knights always tell the truth, knaves always lie (declare to be true what is false and vice versa). An intrepid logician lands on the island and meets some inhabitants. The first (A) says “I am a knight”. The second (B) says “If I am a knave, then the Prime Minister (P) is a knight”. The third (C) says “I am a knight, the Emperor (E) is a knave, and if the Grand Vizier (V) is a knave, then I am a knave”. What conclusions can the logician draw as to the affiliation (knight or knave) of A, B, C, E, P and V?

11. The logician falls asleep and dreams that a native of the island speaks to him and says “I am a knave”. Can this dream be true?



12. Let S represent the assertion “S is a knight”. Let s stand for any assertion made by speaker S. Show that S ⇔ s is a tautology. With this observation, solve the problems given in question 7 using symbolic notation. (Difficult!)

13. Transcribe each of the following English sentences into logical notation using quantifiers: (Use x to represent “cat” and p(x) to represent the predicate “x has whiskers”).

(a) All cats have whiskers.

(b) There exists a cat without whiskers.

(c) There does not exist a cat without whiskers.

(d) No cat has whiskers.



14. Transcribe each of the following English sentences into logical notation using quantifiers: (Use x to represent “person”, p(x) to represent the predicate “x is truthful”, q(x) for “x is forgetful” and r(x) for “x is successful”).

(a) If everyone is truthful, then everyone is successful.

(b) It is true of all people that if they are truthful, then they are successful.

(c) It is not the case that there is a person who is both forgetful and truthful.

(d) If there is a person who is both forgetful and truthful, then that person is successful.



15. Negate each of the following statements in symbolic notation. Simplify the resulting statements using the appropriate rules from predicate calculus and explain which you are using.

∃x | ∀y, f(x) > g(y)                ∀y, ∃x | x^2 = y^3

∀x ∀y, [(y > 0) ⇒ (xy > 0)]

16. Let p(x) be the predicate “x ≥ 1” and let q(x) be the predicate “x ≤ 3”. Let r(x) be the predicate “x < 1”. Determine which of the following statements are true and which false if the universe of discourse is the set of real numbers:

∀x, (p(x) ∨ q(x))                   (∀x, p(x)) ∨ (∀x, q(x))

(∃x | p(x)) ∧ (∃x | r(x))           ∃x | (p(x) ∧ r(x))

17. Write the converse and the contrapositive of each statement in Q.1.

18. “All statements contained in this Question are false”. Discuss.



3 Proof Techniques

We begin with a general discussion. Results in mathematics can always be expressed in the form P ⇒ Q where P, Q are statements, or more generally in the form ∀x, y, ⋯; p(x, y, ⋯) ⇒ q(x, y, ⋯), where p, q are predicates and x, y, ⋯ are the free variables. For example, the uncontroversial result “the product of 2 even integers is an even integer” can be stated as ∀x, y; (x, y ∈ E) ⇒ (x × y ∈ E). (This can of course be written in a more usual notation as ∀x, y ∈ E; x × y ∈ E.) In the following discussions, we will use the simple compound statement P ⇒ Q as our “prototype” theorem but in practice theorems are usually stated using predicates and quantifiers as they will generally make claims about classes or sets of objects; for example the natural numbers.



Definition 3.1 (Theorem) To assert that the statement P ⇒ Q (or more generally ∀x, y, ⋯; p(x, y, ⋯) ⇒ q(x, y, ⋯)) is a theorem is just to say that it is a tautology, i.e. that it has the truth value T.

Definition 3.2 (Proof) To prove a theorem is to show that the truth value is indeed T.

We know from the definition of the implication connective that it has truth value T unless P ≡ T and Q ≡ F. So a proof consists in showing that if P ≡ T then so is Q. If the theorem is stated using predicates and quantifiers, ∀x, y, ⋯; p(x, y, ⋯) ⇒ q(x, y, ⋯), then we need to show that, irrespective of the choice of values x0, y0, ⋯ for the free variables x, y, ⋯, if the statement p(x0, y0, ⋯) is true, then so is the statement q(x0, y0, ⋯).

Definition 3.3 Terminology: We call P (or more generally p) the hypothesis and Q (q) the conclusion.



A textbook proof usually is an example of “deductive reasoning”, i.e. a logically valid sequence of steps which establish the truth of Q from the truth of P.

Definition 3.4 Terminology: A conjecture is an implication P ⇒ Q which has not been verified/proved. Once proved, a conjecture is called a theorem.

Example 3.1 Consider the conjecture “all odd numbers are prime”. Check 1 is odd, 1 is prime. OK. 3 is odd, 3 is prime. OK. 5 is odd, 5 is prime. OK. 7 is odd, 7 is prime. OK.

So the conjecture is true......? and so is a theorem....?



The moral is that conjectures about sets of numbers must be checked for all possible elements of the set. If the set is infinite, this is clearly not possible. What to do? The solution is to recognise that the definition 2.11 of the universal quantifier requires that the implication p(x) ⇒ q(x) be true for any choice x0 of the free variable x. So we simply take a “fixed but arbitrary choice” of x, say x0, and try to show that from p(x0) we can conclude q(x0).



Example 3.2 Consider the conjecture: “All numbers of the form 2^n − 1 (where n is an integer) are prime”. Express the conjecture as an implication: ∀n; n ∈ ℤ ⇒ 2^n − 1 ∈ ℙ.

For the conjecture to be a theorem, the implication must be true for any choice n0 ∈ ℤ. So we need to check that irrespective of the value of n0, the implication holds. Now, in fact, the conjecture is not true, so any attempt to prove it is doomed. Suppose that we suspect this to be the case: we need to find a counter-example, in the case of the present example a value for n0 for which 2^(n0) − 1 is not prime.

Exercise 3.1 Just try n0 = 1, 2, 3, ⋯ until the conjecture fails.

Similarly for Example 3.1, 9 is a counter-example.
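The search for a counter-example can be automated. The sketch below is an illustration under stated assumptions: the helper names `is_prime` and `first_counter_example` are invented here, the primality test is plain trial division, and the search starts at n = 2 so that the result does not depend on whether 1 is counted as prime.

```python
def is_prime(m):
    """Trial division; only integers m >= 2 are treated as candidates."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def first_counter_example(start=2, limit=50):
    """Smallest n in [start, limit] with 2**n - 1 not prime, else None."""
    for n in range(start, limit + 1):
        if not is_prime(2**n - 1):
            return n
    return None

n0 = first_counter_example()
print(n0, 2**n0 - 1)   # the value of n found and the composite number 2**n - 1
```

A single counter-example is enough to refute the conjecture; no further values need to be examined.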

If the set of possible values for the free variable(s) happens to be finite, then we can check case-by-case.

Example 3.3 “All odd integers from 1–7 inclusive are prime”.



It will often happen that a conjecture can be broken down into a finite number of special cases, each of which may have infinitely many elements. The advantage of breaking down into special cases (e.g. n even or n odd) is that very often the implication can be checked more easily for the special cases.

Example 3.4 (Fermat’s Last Theorem) The Theorem states: ∀n ≥ 3, [∃x, y, z ∈ ℕ s.t. x^n + y^n = z^n]′. In other words, there is no non-trivial integer solution to the equation x^n + y^n = z^n for n > 2. (For n = 2 we have, for example, 3^2 + 4^2 = 5^2.) But in fact, until recently, it was only a conjecture.



An interesting discussion on the history of Fermat’s Theorem and how it was finally proved may be found in Appendix A.2.



3.1 Direct Enumeration

If there are only a finite number of possibilities (cases), check them all. For example, consider the conjecture: “All odd integers from 1 to 7 inclusive are prime”. Just check the conjecture ∀n ∈ {1, 3, 5, 7}, n ∈ ℙ for each value of n. This approach is only useful in trivial applications. Non-trivial conjectures are usually statements about infinite sets/classes.



3.2 Direct Proof

Given a conjecture P ⇒ Q, we assume P true, then based on this assumption and using any other “relevant” information available, build a chain of reasoning leading to the conclusion that Q is true. More generally, given an implication of the form ∀x, p(x) ⇒ q(x), we need to show that for an arbitrary but fixed choice of x (x0 say), given p(x0) true, then based on this assumption and using any other “relevant” information available we can build a chain of reasoning leading to the conclusion that q(x0) is true. The crucial point is that x0 must be arbitrary, in other words the proof must not depend on the particular choice of x0.



We begin with an example conjecture which is clearly true, so that we can concentrate on the logic of the proof method.

Example 3.5 Conjecture that “If an integer is divisible by 6, then it is divisible by 3”. Using predicate notation we obtain ∀x ∈ ℤ, (6|x) ⇒ (3|x). Choose an arbitrary x0 ∈ ℤ. If the result follows for this x0, it must hold for all x. Now regard x0 as fixed. We can now construct a chain of reasoning starting with the hypothesis:

Proof:

6|x0                            Hypothesis
x0 = 6 × k, for some k ∈ ℤ      By definition of “divides”
x0 = (2 × 3) × k                Known fact about 6
x0 = 3 × (2 × k)                Associativity, Commutativity, Closure for +, ×
3|x0                            Conclusion □
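The chain of reasoning above can be spot-checked numerically. The sketch below is a finite sanity check on an assumed sample range, not a proof: the proof works for an arbitrary x0, whereas code can only examine finitely many values. The helper name `divides` is an assumption introduced for this illustration.

```python
def divides(d, x):
    """d | x : there is an integer k with x = d * k."""
    return x % d == 0

# Finite check of ∀x ∈ ℤ, (6|x) ⇒ (3|x) on an assumed sample range.
sample = range(-600, 601)
violations = [x for x in sample if divides(6, x) and not divides(3, x)]

# The rewriting step used in the proof: if x = 6k then x = 3 × (2k).
witnesses_ok = all(6 * k == 3 * (2 * k) for k in range(-100, 101))

print(violations, witnesses_ok)
```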



Example 3.6 Consider the conjecture: ∀x, y ∈ ℕ, x < y ⇒ x^2 < y^2. Choose x0, y0 arbitrary but fixed in ℕ. Our chain of reasoning is now:

x0 < y0, x0 > 0, y0 > 0         Hypothesis
x0 × x0 < x0 × y0               Known property of inequalities
x0^2 < x0 × y0                  Rewriting
x0 × y0 < y0 × y0               x0 < y0
x0^2 < y0^2                     Inequality is transitive; Conclusion □

Comment 3.1 We will often not trouble to explicitly label the particular “arbitrary fixed” choice of x as (say) x0 but merely note that x is now to be regarded as fixed. This allows proofs to read better.



Example 3.7 Consider the conjecture: ∀x, y ∈ ℝ+, x^2 < y^2 ⇒ x < y. Choose x, y fixed and arbitrary.

x^2 < y^2
x/y < y/x
?...

It is possible to construct a direct proof for this result. Try it! However, our next proof technique makes the proof trivial.



3.3 Contrapositive Proof

Recall that (A ⇒ B) ≡ (B′ ⇒ A′) where (B′ ⇒ A′) is the contrapositive of (A ⇒ B). So, to prove P ⇒ Q just construct a direct proof for Q′ ⇒ P′.

Example 3.8 Consider the previous example. Choose x, y arbitrary and fixed as usual. RTP (required to prove): (x^2 < y^2) ⇒ (x < y). We simply take the contrapositive: (x ≥ y) ⇒ (x^2 ≥ y^2). Now construct a direct proof of the latter implication.



3.4 Proof by Contradiction

Given the conjecture (which is to be proved) P ⇒ Q, we note that (P ⇒ Q) ≡ (P′ ∨ Q). So a proof consists in showing that (P′ ∨ Q) ≡ T. This is equivalent to showing that (P ∧ Q′) ≡ F. The technique is thus to assume P ≡ T and Q ≡ F and try to deduce a falsehood (e.g. 1 = 0).

Comment 3.2 This approach is particularly useful when neither P nor Q on their own suffice to construct a direct proof.



Example 3.9 Conjecture that √2 is irrational. In predicate notation (rather artificially) we can restate this as ∀x, (x = √2) ⇒ (x ∈ ℚ)′. (We use the symbol ℚ to refer to the rational numbers.) A direct proof is certainly not obvious!

We begin our Proof by Contradiction by choosing an arbitrary fixed x such that x = √2 and x is rational. Thus we conclude (as x is fixed) that √2 = p/q for some integers p, q ≠ 0. We can and do assume that p, q have no common factors. (This is the key to the proof.) So 2 = p^2/q^2 and thus p^2 = 2q^2. We conclude that p^2 is even.

Exercise 3.2 Show p^2 even ⇒ p even.

As p is even, we can write ∃r ∈ ℤ s.t. p = 2r. Therefore 4r^2 = 2q^2 and q^2 = 2r^2 so q^2 is even and (using the result of Exercise 3.2) so is q. Thus p and q share the common factor 2. But we assumed p and q had no common factors. So our conclusion is a falsehood as required.



Comment 3.3 We use the symbol □ at the end of a line to signify the end of a proof. Some writers use the letters “q.e.d.” which stand for “quod erat demonstrandum”, i.e. “what was to be proved”.



3.5 Proof by Induction

We can introduce the idea of induction with the aid of an analogy (i.e. a comparison with another situation which has similar features). Consider a ladder being climbed by a person P and suppose we know that

1. P can reach the first rung.

2. From any accessible rung, P can reach the next.

Our common sense tells us that we can conclude that P can reach any rung. The basis for this conclusion is that the number of rungs is finite. We are not entitled to arrive at this conclusion if the ladder is infinite.

Comment 3.4 This is unlikely to be a problem in everyday life.



Suppose that we are told some predicate p(n) becomes a true statement for n = 1 (i.e. p(1) ≡ T). Suppose that we are also told that if the predicate is true for any fixed integer i, then p(i+1) ≡ T. It is reasonable to conclude that ∀n ∈ ℕ, p(n). However this conclusion is based on the same common-sense reasoning which led to the conclusion that the ladder can be climbed. Just as in that case, the reasoning is only valid if the number of values for the free variable n is finite.



Comment 3.5 We can make the ladder analogy exact by letting p(n) = “P can reach the nth rung”; then assumption 1 translates to p(1) ≡ T and assumption 2 translates to: ∀i ∈ ℕ, p(i) ⇒ p(i+1).

To allow us to use this plausible but unsupported line of reasoning for arbitrary predicates p(n) and for the full set ℕ, we add an extra “Axiom” or property of the positive integers ℕ called the “Principle of Induction”.



Definition 3.5 (Induction (Weak)) Given a predicate $p(n)$ (where $n$ is restricted to the positive integers $\mathbb{N}$) then if

$$p(1) \equiv T \tag{3.1}$$

and

$$\forall i \in \mathbb{N},\ p(i) \Rightarrow p(i+1) \equiv T \tag{3.2}$$

then

$$\forall n \in \mathbb{N},\ p(n) \equiv T \tag{3.3}$$

Comment 3.6 We often shorten p(k) ≡ T to p(k).



Example 3.10 Conjecture: $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$; $n \in \mathbb{N}$.

Proof:

[Basis step] Check $p(1)$: $1 = \frac{1(1+1)}{2}$.

[Inductive step] Assume $p(i)$: $1 + 2 + \cdots + i = \frac{i(i+1)}{2}$. Use the following strategy: try to transform $\mathrm{LHS}_i$ into $\mathrm{LHS}_{i+1}$. Use the same process on $\mathrm{RHS}_i$. Then rearrange. Applying the strategy (adding $i+1$ to both sides): $1 + 2 + \cdots + i + (i+1) = \frac{i(i+1)}{2} + (i+1)$. Now rearrange the RHS. We find that:

$$1 + 2 + \cdots + i + (i+1) = \frac{(i+1)(i+2)}{2}$$

as required.

Therefore by the Principle of Induction, $\forall n \in \mathbb{N}, p(n)$; i.e. $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$.
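A quick numerical spot-check of the conjecture can be reassuring before (or after) writing the proof. The sketch below is only an illustration: the function names `sum_first` and `closed_form` are invented here, and checking finitely many values of $n$ is evidence for the formula, not a substitute for the inductive proof.

```python
def sum_first(n: int) -> int:
    """Left-hand side of p(n): 1 + 2 + ... + n, added term by term."""
    return sum(range(1, n + 1))


def closed_form(n: int) -> int:
    """Right-hand side of p(n): n(n+1)/2, which is always an integer."""
    return n * (n + 1) // 2


# Spot-check p(n) for n = 1, ..., 100.
for n in range(1, 101):
    assert sum_first(n) == closed_form(n)
```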



Comment 3.7 It is often possible to devise an alternative “constructive proof” where we derive the result from (for example) the left hand side. Where this can be done, the resulting proof is usually clearer than an inductive proof and should be used where possible.

Comment 3.8 We will usually use “sigma” notation for sums and “pi” notation for products in the following examples, as defined in the following definitions.

Definition 3.6 $\sum_{i=1}^{n} f_i \equiv f_1 + f_2 + \cdots + f_n$.

Definition 3.7 $\prod_{i=1}^{n} f_i \equiv f_1 \times f_2 \times \cdots \times f_n$.



Example 3.11 Conjecture: $\forall n \in \mathbb{N}$, $1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$.

Proof:

[Basis Step] $1^2 = \frac{1(1+1)(2 \times 1 + 1)}{6}$. The equation is satisfied so the Basis Step is complete.



[Inductive Step] Assume $1^2 + 2^2 + \cdots + i^2 = \frac{i(i+1)(2i+1)}{6}$. RTP that $1^2 + 2^2 + \cdots + i^2 + (i+1)^2 = \frac{(i+1)(i+2)(2i+3)}{6}$. We now change (to reduce the amount of writing) to sigma notation; the inductive assumption now reads:

$$\sum_{k=1}^{i} k^2 = \frac{i(i+1)(2i+1)}{6}. \tag{3.4}$$

Now RTP that $\sum_{k=1}^{i+1} k^2 = (i+1)(i+2)(2i+3)/6$. Add $\mathrm{term}_{i+1} = (i+1)^2$ to both sides of Eq. 3.4 to obtain:

$$\sum_{k=1}^{i+1} k^2 = \frac{i(i+1)(2i+1)}{6} + (i+1)^2. \tag{3.5}$$

Simplifying the RHS of the latter equation yields the required result. (Check.)
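One way to carry out the check is to take out the common factor $(i+1)$ and then factorise the remaining quadratic. The steps below are a sketch of that simplification, starting from the inductive assumption with $(i+1)^2$ added:

```latex
\begin{align*}
\frac{i(i+1)(2i+1)}{6} + (i+1)^2
  &= \frac{(i+1)\,\bigl[\,i(2i+1) + 6(i+1)\,\bigr]}{6} \\
  &= \frac{(i+1)(2i^2 + 7i + 6)}{6} \\
  &= \frac{(i+1)(i+2)(2i+3)}{6}.
\end{align*}
```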



So by the Principle of Induction, $\forall n \in \mathbb{N}$, $\sum_{k=1}^{n} k^2 = n(n+1)(2n+1)/6$ as required.



Definition 3.8 (Factorial) Define “$n$ factorial” by $n! = n \cdot (n-1) \cdot (n-2) \cdots 3 \cdot 2 \cdot 1$. So $3! = 6$, $4! = 24$ etc. We define $0! = 1$.



Example 3.12 Conjecture: $\sum_{i=1}^{n} i\,(i!) = (n+1)! - 1$.

Proof:

[B.S.] $p(1)$: $1 \cdot (1!) = 2! - 1$.

[I.S.] Assume $p(k)$ for some fixed but arbitrary $k$. So we assume that:

$$\sum_{i=1}^{k} i\,(i!) = (k+1)! - 1. \tag{3.6}$$

Now RTP that $\sum_{i=1}^{k+1} i\,(i!) = (k+2)! - 1$. Add $t_{k+1} = (k+1)(k+1)!$ to both sides of Eq. 3.6. It follows that $\sum_{i=1}^{k+1} i\,(i!) = (k+1)! - 1 + (k+1)(k+1)!$. When we simplify the RHS, it reduces to the RHS of $p(k+1)$ as required.

Exercise 3.3 Check.

By the Principle of Induction, we conclude that $\sum_{i=1}^{n} i\,(i!) = (n+1)! - 1$.
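As with the earlier sums, the identity can be spot-checked numerically. This is a hedged sketch: the names `sum_i_factorial` and `closed_form_factorial` are invented for illustration, and the check covers only the first few values of $n$.

```python
from math import factorial


def sum_i_factorial(n: int) -> int:
    """Left-hand side: the sum of i * i! for i = 1, ..., n."""
    return sum(i * factorial(i) for i in range(1, n + 1))


def closed_form_factorial(n: int) -> int:
    """Right-hand side: (n+1)! - 1."""
    return factorial(n + 1) - 1


# Compare both sides for n = 1, ..., 20.
for n in range(1, 21):
    assert sum_i_factorial(n) == closed_form_factorial(n)
```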



Example 3.13 Conjecture: “Any 2 given positive integers are equal.” Define $p(n)$: “If the maximum of any 2 given positive integers $a, b$ is $n$, then $a = b$”. Suppose that we are given that $\forall n, p(n)$. Then, given any 2 positive integers $a, b$, the larger of the two equals some integer $n$. So $a = b$. It follows that our conjecture can be restated as $\forall n, p(n)$.

Proof:

[B.S.] $p(1)$: “If the max of any 2 given positive integers $a, b$ is 1, then $a = b$”. True.



[I.S.] Assume $p(k)$: “If the maximum of any 2 given positive integers $a, b$ is $k$, then $a = b$”. RTP $p(k+1)$: “If the maximum of any 2 given positive integers $a, b$ is $k+1$, then $a = b$”. So suppose that the maximum of two given arbitrary positive integers $a, b$ is $k+1$. RTP that $a = b$. Suppose that $b$ is the larger (wlog). It follows that $b = k+1$. Now consider $a-1$, $b-1$. The larger of these two numbers ($b-1$) equals $k$. So by the inductive hypothesis $a-1 = b-1$ and trivially $a = b$.

So by the Principle of Induction, all positive integers are equal. Something wrong here surely???

Exercise 3.4 Figure out what is wrong!



3.5.1 A Variation on the Principle of Induction

Recall the Principle of Induction: $\{p(1) \wedge [\forall k \in \mathbb{N},\ p(k) \Rightarrow p(k+1)]\} \Rightarrow \forall n, p(n)$. This version is usually called the Weak Principle of Induction. We can now define the “Strong” Principle of Induction:

Definition 3.9 (Strong Principle of Induction) Given an arbitrary predicate $p$, then

$$\{p(1) \wedge [\forall k \in \mathbb{N},\ (p(1) \wedge p(2) \wedge \cdots \wedge p(k)) \Rightarrow p(k+1)]\} \Rightarrow \forall n, p(n). \tag{3.7}$$

Comment 3.9 We say that an axiom or theorem is “strong” if it allows results to be deduced from it that cannot readily be deduced from a “weaker” version.



Equivalence of Strong & Weak Induction We do not need to add the “Strong” Principle of Induction to our list of axioms for the positive integers if we can show that it is equivalent to the simpler “weak” version. Begin by defining the “Well Ordering Principle”. We are not primarily interested in the W.O.P. for itself, but rather as a means to the end of proving the equivalence of Strong & Weak Induction.

Definition 3.10 (Well Ordering Principle) If $X$ is a non-empty set of positive integers, then $X$ has a least element; i.e. there is an element in $X$ (call it $x_0$) s.t. $x_0$ is less than or equal to all other elements of $X$. In predicate notation, $\forall X \subseteq \mathbb{N}, X \neq \emptyset$; $\exists x_0 \in X$ s.t. $\forall x \in X, x_0 \leq x$.

Comment 3.10 This result is obviously true for finite sets.

Exercise 3.5 Check.



Just as we were not able to prove that the original (Weak) Principle of Induction was valid, we have to either state the Well Ordering Principle (W.O.P.) as an axiom or prove that it is equivalent to an existing axiom. In fact we will show (using a so-called “Circular Chain of Deduction”) that the Well Ordering Principle (W.O.P.) $\equiv$ S.P.I. $\equiv$ W.P.I. We begin by proving three so-called Lemmas (subsidiary theorems).



Lemma 3.1 The Weak Principle of Induction (W.P.I.) implies the Well Ordering Principle (W.O.P.).

Proof:

RTP that “for any arbitrary fixed non-empty set $X \subseteq \mathbb{N}$, $X$ has a least element”. We use a trick where we define $p(n)$: “If $n \in X$, then $X$ contains a least element”. If $\forall n, p(n)$, then our result follows. (Check.) We can assume the Weak Principle of Induction (W.P.I.) so the proof is a standard “proof by induction”.

[B.S.] $p(1)$: True, since if $1 \in X$ then (as $X$ is a set of positive integers) 1 is the least element of $X$.



[I.S.] Assume $p(k)$: i.e. “If $k \in X$, then $X$ contains a least element”. In predicate notation, we have for any set $X \subseteq \mathbb{N}$ that $k \in X \Rightarrow \exists x_0 \in X$ s.t. $\forall x \in X$, $x_0 \leq x$. RTP: $p(k+1)$. So choose a fixed but arbitrary set $X \subseteq \mathbb{N}$ and suppose that $k+1 \in X$. RTP that $X$ contains a least element. Consider the new set $X' = X \cup \{k\}$. As $X'$ contains $k$, $X'$ has a least element (by the inductive hypothesis), called $x'_0$ (say). Is $x'_0$ in $X$ or not?

[Case 1] If $x'_0 \in X$, then it is the least element of $X$ (as it is the least element of $X'$, which contains $X$).

[Case 2] Suppose $x'_0$ is not in $X$. Then we must have $x'_0 = k$, so every element of $X$ is greater than $k$, i.e. at least $k+1$. Conclude (as $k+1 \in X$ by assumption) that $k+1$ is the least element of $X$.

So $X$ contains a least element and therefore we have $p(k+1)$.

Thus by the Weak Principle of Induction (W.P.I.), $\forall n \in \mathbb{N}$, $p(n)$.



Lemma 3.2 The Well Ordering Principle (W.O.P.) implies the Strong Principle of Induction (S.P.I.).

Proof:

RTP that for any predicate $p(n)$, (B.S.) $\wedge$ (I.S.) $\Rightarrow \forall n, p(n)$. (The I.S. referred to here is the “strong” version; see Def. 3.9.)



So, let $p(n)$ be an arbitrary fixed predicate. Assume the (B.S.) and the (I.S.) hold. RTP that $\forall n, p(n)$. The only “relevant information” at our disposal is that we can assume the Well Ordering Principle (W.O.P.). We use a “Proof by Contradiction”, i.e. we assume (B.S.) $\wedge$ (I.S.) $\wedge$ $(\forall n \in \mathbb{N}, p(n))'$; in effect we assume that $\exists$ at least one $n$ s.t. $p(n)'$. Let $X$ be the set of positive integers $n$ for which $p(n)'$, i.e. $X = \{n \in \mathbb{N} \mid p(n)'\}$. We know that $X$ is not the empty set (why?), so we can apply the Well Ordering Principle (W.O.P.). Therefore $\exists x_0 \in X$ s.t. $x \geq x_0$, $\forall x \in X$. In words, $x_0$ is the first integer for which $p(n)$ is false. So for all $x < x_0$, $p(x) \equiv T$. (Remember we assumed $p(1) \equiv T$, the basis step, so $x_0 > 1$.) So we have $p(1) \equiv T, \cdots, p(x_0 - 1) \equiv T$. But the Inductive Step (I.S.) implies that $p(x_0) \equiv T$, which contradicts our earlier conclusion that $p(x_0) \equiv F$. Contradiction. Therefore $\forall n, p(n)$.



Lemma 3.3 The Strong Principle of Induction (S.P.I.) implies the Weak Principle of Induction (W.P.I.).

Proof:

The result is trivial, precisely because the Strong Principle of Induction (S.P.I.) is “stronger” than the Weak Principle of Induction (W.P.I.). In other words, the Strong Principle of Induction (S.P.I.) asserts that (B.S.) $\wedge$ (I.S.)$_S$ $\Rightarrow \forall n, p(n)$. But the Weak Principle of Induction (W.P.I.) asserts that (B.S.) $\wedge$ (I.S.)$_W$ $\Rightarrow \forall n, p(n)$. The key observation is that if (I.S.)$_W \equiv T$ then certainly (I.S.)$_S \equiv T$, so whenever the hypotheses of the W.P.I. hold, the S.P.I. applies and yields $\forall n, p(n)$.

Exercise 3.6 Why?



We are now in a position to state & prove our Theorem:

Theorem 3.4 The Strong Principle of Induction (S.P.I.) is equivalent to the Weak Principle of Induction (W.P.I.).

Proof:

We have established that (Lemma 3.1) Weak Principle of Induction (W.P.I.) $\Rightarrow$ Well Ordering Principle (W.O.P.), that (Lemma 3.2) Well Ordering Principle (W.O.P.) $\Rightarrow$ Strong Principle of Induction (S.P.I.) and that (Lemma 3.3) Strong Principle of Induction (S.P.I.) $\Rightarrow$ Weak Principle of Induction (W.P.I.). It follows that the three properties of the positive integers are equivalent and that in particular Weak Principle of Induction (W.P.I.) $\equiv$ Strong Principle of Induction (S.P.I.).




1. Provide counter-examples to the following statements:

(a) Every geometric figure with four right angles is a square.

(b) If a real number is not positive, then it must be negative.

(c) All people with red hair have green eyes or are tall.

(d) All people with red hair have green eyes and are tall.

2. Give a direct proof that the product of odd integers is odd.

3. Prove by contradiction that the product of odd integers is odd.

4. Prove by contraposition that if a number $x$ is positive then so is $x + 1$.

5. Let $x$ and $y$ be positive numbers and prove that $x < y$ if and only if $x^2 < y^2$.



6. Prove that for any positive integer $n$,

$$2 + 6 + 10 + \cdots + (4n - 2) = 2n^2.$$

7. Prove that for any positive integer $n$, the number $2^{2n} - 1$ is divisible by 3.

8. Prove that $2^n < n!$ for any positive integer $n \geq 4$.

9. Prove each of the following identities by mathematical induction:

$$\sum_{i=1}^{n} i\,(2^i) = 2 + (n-1)2^{n+1}$$

$$\sum_{i=1}^{n} 2\,(3^{i-1}) = 3^n - 1$$

$$\sum_{i=1}^{n} i\,(i!) = (n+1)! - 1$$



10. Prove by mathematical induction (De Moivre’s Theorem) that

$$\forall n \in \mathbb{N},\ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta.$$

11. Prove by mathematical induction that

$$(\cos\theta)(\cos 2\theta)(\cos 4\theta)(\cos 8\theta)\cdots(\cos 2^{n-1}\theta) = \frac{\sin 2^n\theta}{2^n \sin\theta}.$$

12. What is wrong with the following “proof” by mathematical induction? Show that $\forall n \in \mathbb{N}, p(n)$, where

$$p(n): n = n + 1.$$

Assume $p(k)$ for some fixed but arbitrary $k$. Then

$$k = k + 1.$$

Adding 1 to both sides yields $k + 1 = k + 2$ as required!



13. What is wrong with the following “proof” by mathematical induction? Show that all students in the University are taking the same degree course. Let $p(n)$ be the statement that any group of exactly $n$ students in the University are taking the same degree course. First check $p(1)$; this is trivial as a single student can only take a single course! Now assume $p(k)$, i.e. given any group of exactly $k$ students, all are taking the same course. To prove $p(k+1)$, we must consider an arbitrary group of $k+1$ students. Exclude one student (e.g. the one with the highest Q.C.A.) from the group. By the inductive hypothesis, all the remainder must be studying for the same degree, degree course X (say). Now remove any other person from the original group (e.g. the one with the lowest Q.C.A.). Again the remaining group must all be studying for the same degree and, as the two groups of $k$ students have $k-1$ students in common, all must be studying for degree course X. The result follows.



4 Set Theory

This Chapter is largely revision material. Set Theory ideas should already be familiar: it is difficult to progress far in Mathematics without them & we have used set notation in Chapters 2 & 3.

Definition 4.1 (Set) A set is any “well-defined” collection of objects or elements.

Definition 4.2 Write “$x$ is an element of $X$” as $x \in X$. (We have been using this notation informally up to this point.)



We define a specific set either by listing its elements, e.g. $A = \{1, 7, 3\}$, or by defining it using a predicate, e.g. $B = \{x \in \mathbb{Z} \mid (x \leq 5) \wedge (x > -3)\}$, or in general $C = \{x \mid p(x)\}$. The range of variation of the free variable $x$ may be restricted (as we have already seen) using an expression of the form $D = \{x \in Y \mid p(x)\}$; e.g. $E = \{x \mid (x \in \mathbb{Z}) \wedge (x \leq 5) \wedge (x > -3)\}$ is the same set as $B$ above.

Definition 4.3 (Cardinality) The cardinality of a finite set $X$ is the number of elements in $X$. We write the cardinality of $X$ as $|X|$.

Example 4.1 $|\{1, 3, 7\}| = 3$.

Definition 4.4 (Null Set) The null set (written as $\{\ \}$ or $\emptyset$) is the set with no elements.

Example 4.2 The null set $\emptyset = \{x \mid p(x) \wedge p'(x)\}$ where $p(x)$ is any predicate.
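These ways of defining a set map quite directly onto Python's built-in `set` type and set comprehensions. The sketch below is only illustrative: since $\mathbb{Z}$ is infinite, it scans a finite window of integers (the name `WINDOW` and its bounds are assumptions made here), and the predicate `p` is an arbitrary choice.

```python
# Z is infinite, so this sketch only scans a finite window of integers.
WINDOW = range(-10, 11)

A = {1, 7, 3}                                  # defined by listing its elements
B = {x for x in WINDOW if x <= 5 and x > -3}   # defined by a predicate


def p(x: int) -> bool:
    """An arbitrary predicate, chosen only for illustration."""
    return x % 2 == 0


# No x can satisfy both p(x) and its negation, so this set is empty.
null_set = {x for x in WINDOW if p(x) and not p(x)}

print(sorted(B))           # [-2, -1, 0, 1, 2, 3, 4, 5]
print(len(A), len(B))      # 3 8
print(null_set == set())   # True
```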



4.1 Equality of Sets

Definition 4.5 (Set Equality) We say that two sets are equal if they have the same elements.

We can use the idea of a subset to test for equality.

Definition 4.6 (Subsets) We say a set $X$ is a subset of a set $Y$ (written $X \subset Y$) if all elements of $X$ are elements of $Y$.

Comment 4.1 Now we can observe that two sets $X, Y$ are equal if and only if $X \subset Y$ and $Y \subset X$.



Testing for Equality This observation leads us to a practical technique for testing for set equality when the sets are defined (as they usually are) using predicates. The key step is how to prove $X \subset Y$. RTP $\forall x, (x \in X) \Rightarrow (x \in Y)$. We simply choose $x$ an arbitrary fixed element of $X$. Now RTP $x \in Y$. More specifically, suppose we are given $X = \{x \mid p(x)\}$ and $Y = \{x \mid q(x)\}$. To show $X \subset Y$ using the above technique, choose $x$ fixed, arbitrary in $X$. It follows that $p(x) \equiv T$. Now construct a proof that $p(x) \Rightarrow q(x)$. If we can do this we will have established that all elements of $X$ are elements of $Y$.

Example 4.3 Given $X = \{x \in \mathbb{Z} \mid (4|x)\}$ and $Y = \{x \in \mathbb{Z} \mid (2|x)\}$; to show that $X \subset Y$, choose $x$ arbitrary and fixed in $X$. Therefore $4|x$ and so $x = 4 \times n$ for some integer $n$. We therefore have $x = 2 \times (2n)$ and so $2|x$ as required. It follows that $q(x) \equiv T$ and so $x \in Y$. But $x$ was an arbitrary element of $X$; therefore all elements of $X$ are elements of $Y$. Therefore $X \subset Y$.
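The element-by-element test can be mimicked in code. The following is a sketch under an explicit assumption: the infinite set $\mathbb{Z}$ is replaced by a finite window of integers, so the output illustrates the argument rather than proving it. The helper name `is_subset` is invented here.

```python
WINDOW = range(-40, 41)   # finite stand-in for Z (an assumption of this sketch)

X = {x for x in WINDOW if x % 4 == 0}   # elements with 4 | x
Y = {x for x in WINDOW if x % 2 == 0}   # elements with 2 | x


def is_subset(s: set, t: set) -> bool:
    """Element-by-element test: every x in s must also lie in t."""
    return all(x in t for x in s)


print(is_subset(X, Y))   # True
print(is_subset(Y, X))   # False: for example 2 is in Y but not in X
print(X == Y)            # False, since only one inclusion holds
```

Two sets are equal exactly when `is_subset` succeeds in both directions, which matches Comment 4.1.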



4.2 Set Operators

Definition 4.7 (Union) Define the union of 2 sets $X, Y$ as the smallest set containing all the elements of both $X$ and $Y$; written $X \cup Y$.

Definition 4.8 (Intersection) Define the intersection of 2 sets $X, Y$ as the set containing all elements common to both; written $X \cap Y$.

Comment 4.2 Suppose $X = \{x \mid p(x)\}$; $Y = \{x \mid q(x)\}$. Then

$$X \cup Y = \{x \mid p(x) \vee q(x)\} \tag{4.1}$$

$$X \cap Y = \{x \mid p(x) \wedge q(x)\}. \tag{4.2}$$

Example 4.4 Consider the identity (equation) $X \cup (Y \cap Z) = (X \cup Y) \cap (X \cup Z)$. We can use our general technique for proving set equalities: show LHS $\subset$ RHS & RHS $\subset$ LHS.



[Case I: LHS $\subset$ RHS] Let $x$ be a fixed arbitrary element of LHS. RTP $x \in$ RHS. We have that $x \in X \cup (Y \cap Z)$. There are two sub-cases:

[$x \in X$] Therefore $x$ is an element of both $X \cup Y$ and $X \cup Z$ and thus $x \in (X \cup Y) \cap (X \cup Z)$ = RHS as required.

[$x \in (Y \cap Z)$] Exercise: check this case.

[Case II: RHS $\subset$ LHS] Let $x$ be a fixed arbitrary element of RHS. RTP $x \in$ LHS. We have $x \in X \cup Y$ and $x \in X \cup Z$. Again there are two sub-cases:

[$x \in X$] Trivially, we must have $x \in X \cup (Y \cap Z)$ = LHS as required.

[$x \notin X$] In this case $x \in Y$ and $x \in Z$ so certainly $x \in (Y \cap Z)$ and therefore $x \in X \cup (Y \cap Z)$ = LHS as required.
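For a small universal set, the identity can also be confirmed by brute force over every choice of $X, Y, Z$. The sketch below assumes a three-element universe `U` (an arbitrary choice) and uses invented helper names; it checks finitely many cases and does not replace the element-chasing proof.

```python
from itertools import combinations


def all_subsets(universe):
    """Every subset of a finite universe, returned as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


U = {0, 1, 2}
subsets = all_subsets(U)


def distributive_law_holds() -> bool:
    """Check X ∪ (Y ∩ Z) == (X ∪ Y) ∩ (X ∪ Z) for all subsets X, Y, Z of U."""
    return all((X | (Y & Z)) == ((X | Y) & (X | Z))
               for X in subsets for Y in subsets for Z in subsets)


print(len(subsets), distributive_law_holds())   # 8 True
```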



Definition 4.9 (Complement) If a set $X$ is defined by a predicate $p(x)$, i.e. $X = \{x \mid p(x)\}$, then we can define the complement of $X$: $X^c = \{x \mid p(x)'\}$.

Proving Equality (Again) A much neater way of proving set identities is to note the equivalence between set union and logical OR (Eq. 4.1) and between set intersection and logical AND (Eq. 4.2):

Union ($\cup$)          OR ($\vee$)
Intersection ($\cap$)   AND ($\wedge$)

It follows that (writing $Z = \{x \mid r(x)\}$) the set $X \cap (Y \cup Z)$ is precisely the set $\{a \mid p(a) \wedge (q(a) \vee r(a))\}$.



But it follows from the Distributive Law for Statement Calculus (Eq. 2.7) that this is equal to $\{a \mid (p(a) \wedge q(a)) \vee (p(a) \wedge r(a))\}$, which is just $(X \cap Y) \cup (X \cap Z)$, the required result.

Definition 4.10 (Universal Set) The universal set $U$ is the set containing all elements of interest, or equivalently the set of which all sets under consideration are subsets. For example if we are working with the positive integers $\mathbb{N}$ then $U = \mathbb{N}$.

Comment 4.3 For all sets $X$, $X \cup X^c = U$ and $X \cap X^c = \emptyset$.



4.3 Paradoxes in Naive Set Theory

A paradox is simply a contradictory result (real or apparent) which arises from a theory. Paradoxes often motivate new work which “tidies up” and clarifies a branch of mathematics. One famous paradox is due to Bertrand Russell and is known as Russell’s Paradox.

4.3.1 Russell’s Paradox

Define a set $A$ = “The set of all sets which contain themselves as elements”. (Using symbolic notation: $A = \{X \mid X \in X\}$.) Now consider the set $I$ = “The set of all infinite sets”. The set $I$ is certainly non-empty as (for example) $\mathbb{R} \in I$; $\mathbb{N} \in I$; $\mathbb{Z} \in I$. Moreover $I$ is itself an infinite set as it has infinitely many elements.



For example $\{0, 1, 2, \ldots\}$; $\{1, 2, \ldots\}$; $\{2, \ldots\}$; $\cdots$ are all infinite sets and are therefore in $I$. There are infinitely many such sets, so $I$ is infinite and $I \in I$. Hence $I \in A$ and therefore $A \neq \emptyset$.

Now define $B$ = “The set of all sets which do not contain themselves as elements”. It is easy to find examples of sets in $B$.

Exercise 4.1 Give some examples of sets in $B$.

So $B \neq \emptyset$. We now pose the question: “Is $B \in B$?” The answer must surely be either yes or no!

[Case 1: $B \in B$] Therefore $B$ contains itself as an element. But every element of $B$ is a set which does not contain itself as an element, so $B \notin B$. Contradiction!

[Case 2: $B \notin B$] Therefore $B$ does not contain itself as an element, which is exactly the property required for membership of $B$. So $B \in B$. Again a contradiction!



So the definitions of the sets $A, B$ are paradoxical. The resolution of the paradox is a more careful definition of the term “set” which excludes phrases like “the set of all sets which $\cdots$”.



4.4 Power Sets

We begin with a definition.

Definition 4.11 Given a non-empty set $A$, define the power set of $A$, written $\mathcal{P}(A)$, to be the set of all subsets of $A$.

Example 4.5 Given $A = \{a, b, c\}$, then $\mathcal{P}(A) = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{b, c\}, \{c, a\}, \{a, b, c\}\}$. Note that $|\mathcal{P}(A)| = 2^3 = 8$.

This suggests the following result, which is left as an exercise:

Exercise 4.2 Show that for any finite set $A$ of cardinality $n$, $|\mathcal{P}(A)| = 2^n$.
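For small finite sets the power set can be enumerated directly, which gives a way to experiment with the count before attempting the exercise. This is a minimal sketch: the function name `power_set` is invented here, and subsets are stored as `frozenset` objects because Python sets cannot contain ordinary (mutable) sets.

```python
from itertools import chain, combinations


def power_set(a):
    """Return the set of all subsets of a finite set a, each as a frozenset."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}


A = {"a", "b", "c"}
P = power_set(A)

print(len(P))                                # 8
print(frozenset() in P, frozenset(A) in P)   # True True
```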




1. If $X$ and $Y$ are nonempty sets and $X \times Y = Y \times X$, what can we conclude about $X$ and $Y$?

2. In the following problems, determine whether the statement is true or false. If the statement is false, give a counterexample. Take $A, B, C$ as arbitrary subsets of a given universal set $U$.

(a) $A \times (B \cup C) = (A \times B) \cup (A \times C)$

(b) $(A \times B) \cap C = A \cap C \times B \cap C$

(c) $A \times (B - C) = (A \times B) - (A \times C)$

(d) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

(e) $(A \cap B)^c = A^c \cup B^c$

(f) $A - (B \times C) = (A - B) \times (A - C)$

(g) $A \cap (B \times C) = (A \cap B) \times (A \cap C)$

(h) $A \times \{\} = \{\}$



(i) $A \cap (B \cup C) = (A \cap B) \cup (A \times C)$

(j) $(A \cup B)^c = A^c \cap B^c$



5 Functions

We saw in Ch. 4 that a set is an unordered collection, e.g. $\{a, b\} = \{b, a\}$. We begin with a definition:

Definition 5.1 (Ordered Pair) An ordered pair $(a, b)$ is a set $\{a, b\}$ with the additional property that $(a, b) = (c, d)$ iff $a = c$ and $b = d$.

Definition 5.2 Given 2 non-empty sets $X, Y$, define the “Cartesian Product” of $X, Y$, written $X \times Y$, to be “the set of all ordered pairs $(x, y)$ s.t. $x \in X$, $y \in Y$”, i.e. $X \times Y = \{(x, y) \mid x \in X, y \in Y\}$.

Example 5.1 Suppose that $X = \{a, b, c\}$ and $Y = \{1, 2\}$. Then $X \times Y = \{(a, 1), (a, 2), (b, 1), (b, 2), (c, 1), (c, 2)\}$.

Exercise 5.1 Check that if $X, Y$ are finite sets, then $|X \times Y| = |X| \cdot |Y|$.
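Ordered pairs and Cartesian products have natural counterparts in Python: tuples and `itertools.product`. The sketch below reuses the sets of Example 5.1; the variable name `X_times_Y` is an assumption made for illustration.

```python
from itertools import product

X = {"a", "b", "c"}
Y = {1, 2}

# The Cartesian product as a set of ordered pairs (tuples).
X_times_Y = set(product(X, Y))

print(len(X_times_Y) == len(X) * len(Y))              # True
print(("a", 1) in X_times_Y, (1, "a") in X_times_Y)   # True False

# Order matters for pairs but not for sets.
print(("a", 1) == (1, "a"), {"a", 1} == {1, "a"})     # False True
```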



Definition 5.3 We can consider subsets of $X \times Y$. Any such subset is called a relation. Formally: a relation is a set of ordered pairs.

Example 5.2 Suppose that $X$ = {Tom, Mary, Anne, Eoin} and $Y$ = {Computer Science, Mathematics, Ethnomusicology, French}. Use the Table to show which subjects each student studies.

Student   Course
Tom       Computer Science
Mary      Mathematics
Anne      French
Eoin      Ethnomusicology
Anne      Mathematics
Mary      Computer Science



Now we can represent the relationship “studies” by the relation

$R$ = {(Tom, Computer Science), (Mary, Mathematics), (Anne, French), (Eoin, Ethnomusicology), (Anne, Mathematics), (Mary, Computer Science)}

Sets of ordered pairs (relations) where no two ordered pairs have the same first element are of particular interest.

Definition 5.4 (Function) If a relation $R \subset X \times Y$ has this property then we say that $R$ is a function from $X$ to $Y$ and write

$$R : X \to Y.$$



We can rewrite this definition of a function in a formal notation:

Definition 5.5 A relation $R \subset X \times Y$ is a function $R : X \to Y$ if $(x, y_1) \in R$ and $(x, y_2) \in R \Rightarrow y_1 = y_2$.

Example 5.3 Take $X = Y = \mathbb{N}$. Consider the relation $f = \{(n, 2n+1) \mid n \in X\}$. Clearly $f \subset X \times Y$ as all elements of $f$ are ordered pairs of positive integers. Is $f$ a function? Let $(n, y_1), (n, y_2) \in f$. We need to show that $y_1 = y_2$. We must have $y_1 = 2n+1$; $y_2 = 2n+1$ so of course $y_1 = y_2$ and so $f$ is a function.

Comment 5.1 (Notation) If $f : X \to Y$ is a function, then write $((x, y) \in f) \equiv (y = f(x))$. The second form is of course the one familiar from calculus courses. We will use both notations interchangeably.

Example 5.4 Consider the previous Example; we can write $y = f(n) = 2n + 1$.



Example 5.5 Consider $X = \{1, 2, 3, 4, 5\}$, $Y = \{0, 1, 2\}$, $f = \{(x, x \bmod 3) \mid x \in X\}$ ($a \bmod b$ is the remainder when $a$ is divided by $b$). We can list the elements of $f$: $f = \{(1, 1), (2, 2), (3, 0), (4, 1), (5, 2)\}$. We can conclude that $f$ is a function as (by inspection) no 2 distinct ordered pairs have the same first element.
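The “by inspection” test can be automated for finite relations. The sketch below represents a relation as a Python set of tuples; the helper name `is_function` is invented here. It is applied to the relation of Example 5.5 and to the “studies” relation of Example 5.2.

```python
def is_function(relation) -> bool:
    """True if no two ordered pairs share a first element but differ in the second."""
    seen = {}
    for x, y in relation:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


X = {1, 2, 3, 4, 5}
f = {(x, x % 3) for x in X}

studies = {
    ("Tom", "Computer Science"), ("Mary", "Mathematics"),
    ("Anne", "French"), ("Eoin", "Ethnomusicology"),
    ("Anne", "Mathematics"), ("Mary", "Computer Science"),
}

print(sorted(f))              # [(1, 1), (2, 2), (3, 0), (4, 1), (5, 2)]
print(is_function(f))         # True
print(is_function(studies))   # False: Mary and Anne each appear twice
```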

Exercise 5.2 Suppose $f = \{(1, 2), (2, 0), (1, 1), (3, 1)\}$. Is $f$ a function?



5.1 Domain, Co-domain & Range

Definition 5.6 (Domain, Informal) If $f \subset X \times Y$ ($f : X \to Y$), say $X$ is the domain of $f$.

Example 5.6 Suppose that $X = \{1, 2, 3, 4, 5\}$ and $f : X \to Y : x \to x \bmod 2$. We can write $f = \{(x, x \bmod 2) \mid x \in X\}$ and we can write it more explicitly as $f = \{(1, 1), (2, 0), (3, 1), (4, 0), (5, 1)\}$. Clearly the domain of the function is just $X$.

Example 5.7 Given $X = \{1, 2, 3, 4, 5\}$, $Y = \{a, b, c\}$ and $f = \{(1, a), (4, c)\}$, then the above definition implies that the domain is $X$. In fact, a more sensible conclusion is that the domain is $\{1, 4\}$ as $f \subset \{1, 4\} \times \{a, b, c\}$.



This example motivates the following improved definition:

Definition 5.7 (Domain, Formal) Given a function $f : X \to Y$, the domain of $f$ is the smallest set containing all the first elements of the ordered pairs in $f$.

Example 5.8 In this example we consider three familiar functions from our new perspective:

1. $\sin(x)$ Formally $\sin = \{(x, \sin(x)) \mid x \in \mathbb{R}\}$. The domain of $\sin$ is clearly $\mathbb{R}$.

2. $\ln(x)$ Formally $\ln = \{(x, \ln(x)) \mid x \in \mathbb{R}^+\}$. The domain of $\ln$ is $\mathbb{R}^+$.

3. $\sqrt{x}$ Formally $\sqrt{\ } = \{(x, \sqrt{x}) \mid x \in \mathbb{R}^+ \cup \{0\}\}$. The domain of $\sqrt{\ }$ is $\mathbb{R}^+ \cup \{0\}$.

Comment 5.2 Using our two notations we can refer to $\ln$ by $\ln : \mathbb{R}^+ \to \mathbb{R}$ or $\ln \subset \mathbb{R}^+ \times \mathbb{R}$.



Definition 5.8 (Codomain) If $f : X \to Y$ ($f \subset X \times Y$), we say that $Y$ is the co-domain of $f$.

Example 5.9 Consider our three example functions again:

1. $\sin(x)$ The co-domain is $\mathbb{R}$ but could equally well be defined to be $[-1, 1]$.

2. $\ln(x)$ The co-domain is $\mathbb{R}$. In fact no smaller set would do.

3. $\sqrt{x}$ The co-domain is $\mathbb{R}$ but could be $\mathbb{R}^+ \cup \{0\}$. Note that the usual problem of resolving the ambiguity in the sign of $\sqrt{x}$ arises here. Until we do, $\sqrt{x}$ is not a function. We usually take the positive root, as here.

Definition 5.9 Given a function $f : X \to Y$ ($f \subset X \times Y$), define $R(f)$ (the range of $f$) to be the set $\{y \in Y \mid (x, y) \in f \text{ for some } x \in X\}$ or equivalently $\{y \in Y \mid y = f(x) \text{ for some } x \in X\}$.



Comment 5.3 The definition implies that $R(f)$ is always a subset of the co-domain of $f$.

Example 5.10 Suppose that $X = \{1, 2, 3, 4, 5\}$, $Y = \{a, b, c\}$ and $f = \{(1, a), (4, c)\}$. The co-domain is just $Y$ while the range is $\{a, c\}$ which is a subset of $Y$.
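For a finite function stored as a set of ordered pairs, the formal domain and the range are simply the sets of first and second elements. The sketch below uses the data of Example 5.10; the helper names `domain` and `range_of` are invented for illustration.

```python
def domain(f):
    """The set of first elements of the ordered pairs in f (Def. 5.7)."""
    return {x for x, _ in f}


def range_of(f):
    """The set of second elements of the ordered pairs in f (Def. 5.9)."""
    return {y for _, y in f}


Y = {"a", "b", "c"}
f = {(1, "a"), (4, "c")}

print(sorted(domain(f)))                    # [1, 4]
print(sorted(range_of(f)))                  # ['a', 'c']
print(range_of(f) <= Y, range_of(f) == Y)   # True False
```

The last line shows that the range is a subset of the co-domain without being equal to it.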

Example 5.11 Consider again our three example functions ($\sin$ / $\ln$ / $\sqrt{x}$):

1. $\sin(x)$ The range is $[-1, 1]$.

2. $\ln(x)$ The range is $\mathbb{R}$.

3. $\sqrt{x}$ The range is $\mathbb{R}^+ \cup \{0\}$.



5.2 Properties of Functions

Recall that $R(f) \subset$ the co-domain of $f$. This motivates the following definition:

Definition 5.10 Say that a function $f : X \to Y$ is “onto” or “surjective” if $R(f)$ = the co-domain of $f$.

Example 5.12 Again we examine our three example functions ($\sin$ / $\ln$ / $\sqrt{x}$), each taken with co-domain $\mathbb{R}$.

1. $\sin(x)$ Not onto.

2. $\ln(x)$ Onto.

3. $\sqrt{x}$ Not onto.

Comment 5.4 By restricting the co-domain $Y$ to the range $R(f)$ we can always make a function onto.



Comment 5.5 (Terminology) If $f$ is onto, we say $f$ is a function from $X$ onto $Y$; otherwise we say $f$ is a function from $X$ to $Y$.

Definition 5.11 (1–1) A function $f : X \to Y$ is 1–1 or “injective” if

$$\forall x_1, x_2 \in X,\ (x_1, y) \in f \text{ and } (x_2, y) \in f \Rightarrow x_1 = x_2$$

or equivalently

$$\forall x_1, x_2 \in X,\ (y = f(x_1)) \text{ and } (y = f(x_2)) \Rightarrow x_1 = x_2.$$

Exercise 5.3 Consider our three example functions ($\sin$ / $\ln$ / $\sqrt{x}$). Are they 1–1?



Comment 5.6 (Alternative definition of 1–1) A more “user-friendly” definition of the 1–1 property is that a function $f : X \to Y$ is 1–1 if no two ordered pairs have the same second element. Can you see that this is equivalent to Def. 5.11?

Example 5.13 Consider $f : \mathbb{R} \to \mathbb{R} : x \to x^2$. Is $f$ 1–1? Assume $\exists x_1, x_2 \in \mathbb{R}$ s.t. $f(x_1) = f(x_2)$. Therefore $x_1^2 = x_2^2$. Can we conclude that $x_1 = x_2$? No. (Why?) Therefore $f$ is not 1–1.

Definition 5.12 (Bijection) If a function $f : X \to Y$ is 1–1 and onto, we say that it is a bijection or that $f$ is bijective.
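For finite functions stored as sets of ordered pairs, the 1–1 and onto properties can be tested mechanically. The sketch below is illustrative only: the helper names are invented, the input is assumed to already be a function, and the squaring function is sampled on five integers rather than on all of $\mathbb{R}$.

```python
def is_injective(f) -> bool:
    """1-1: no two ordered pairs of the function f share a second element."""
    seconds = [y for _, y in f]
    return len(seconds) == len(set(seconds))


def is_surjective(f, codomain) -> bool:
    """Onto: the range of f equals the stated co-domain."""
    return {y for _, y in f} == set(codomain)


def is_bijective(f, codomain) -> bool:
    return is_injective(f) and is_surjective(f, codomain)


square = {(x, x * x) for x in range(-2, 3)}   # x -> x^2 on {-2, ..., 2}
shift = {(x, x + 1) for x in range(1, 4)}     # x -> x + 1 on {1, 2, 3}

print(is_injective(square))                   # False: (-2, 4) and (2, 4)
print(is_bijective(shift, {2, 3, 4}))         # True
print(is_bijective(shift, {2, 3, 4, 5}))      # False: 5 is never reached
```

The last two lines illustrate that whether a function is onto depends on the choice of co-domain.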

Bijections have the following nice property (which you should check).

Comment 5.7 If $f$ is a bijection $f : X \to Y$ then $\forall x \in X$, $\exists$ a unique $y \in Y$ such that $(x, y) \in f$ and also $\forall y \in Y$, $\exists$ a unique $x \in X$ such that $(x, y) \in f$.

Exercise 5.4 Prove that the above comment holds.



We can now define the inverse of a function, provided that it is a bijection.

Definition 5.13 (Inverse) If $f : X \to Y$ is a bijection then define “$f$ inverse” (written $f^{-1} : Y \to X$) by $f^{-1} = \{(y, x) \mid (x, y) \in f\}$.
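With functions stored as sets of ordered pairs, the inverse is obtained by swapping the two entries of every pair. A minimal sketch follows; the bijection `f` below is an invented example and `inverse` is a hypothetical helper name.

```python
def inverse(f):
    """Swap the entries of every ordered pair of f."""
    return {(y, x) for x, y in f}


# A small bijection from {1, 2, 3} to {"a", "b", "c"} (illustrative data).
f = {(1, "a"), (2, "b"), (3, "c")}
f_inv = inverse(f)

print(sorted(f_inv))         # [('a', 1), ('b', 2), ('c', 3)]
print(inverse(f_inv) == f)   # True: inverting twice returns the original
```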

We need to establish that this definition gives us a function; in fact we can do better.

Theorem 5.1 The set of ordered pairs $f^{-1} \subset Y \times X$ as defined above is a function, and is a bijection.



Proof:

1. RTP $f^{-1}$ is a function. Assume false, so $f^{-1}$ is not a function. (We seek to establish a contradiction.) Therefore at least two ordered pairs must have the same first element. So we have for some $x_1 \neq x_2 \in X$ that $(y, x_1) \in f^{-1}$ and $(y, x_2) \in f^{-1}$. But we can rewrite these statements (using the definition of $f^{-1}$) as $(x_1, y) \in f$ and $(x_2, y) \in f$. But we assumed that $f$ is 1–1, so $x_1 = x_2$. Contradiction. Therefore $f^{-1}$ is a function.

2. RTP $f^{-1}$ is 1–1. Assume not. Therefore $\exists y_1 \neq y_2 \in Y$ s.t. $(y_1, x) \in f^{-1}$ and $(y_2, x) \in f^{-1}$. Again, we can translate this into a statement about $f$; namely $(x, y_1) \in f$ and $(x, y_2) \in f$. But $f$ is a function, so $y_1 = y_2$. Contradiction. So $f^{-1}$ is 1–1.



3. RTP $f^{-1}$ is onto. Assume not. Therefore $\exists x_0 \in X$ s.t. $\forall y \in Y$, $(y, x_0) \notin f^{-1}$. This is equivalent to saying that $\exists x_0 \in X$ s.t. $\forall y \in Y$, $(x_0, y) \notin f$. But $x_0$ is in the domain of $f$. Contradiction, so $f^{-1}$ is onto.

We conclude that $f^{-1}$ is a bijection.



5.3 Operations on Functions

Just as we can perform arithmetic on numbers using the familiar arithmetic operators, we can define corresponding operations on functions. The definitions are simple:

Definition 5.14 (Function Sum) Given two functions $f, g$ defined on a set $X$, we define the sum of $f, g$ (written $f + g$) by $f + g = \{(x, y_1 + y_2) \mid x \in X; (x, y_1) \in f \ \&\ (x, y_2) \in g\}$. Using the more familiar notation: $(f + g)(x) = f(x) + g(x)$.

Example 5.14 The function $\sin + 3\cos$ is just the function whose value for any $x \in \mathbb{R}$ is $\sin(x) + 3\cos(x)$.



Similarly, we can define the difference and product of two functions:

Definition 5.15 (Function Difference) Given two functions $f, g$ defined on a set $X$, we define the difference of $f, g$ (written $f - g$) by $f - g = \{(x, y_1 - y_2) \mid x \in X; (x, y_1) \in f \ \&\ (x, y_2) \in g\}$. Using the more familiar notation: $(f - g)(x) = f(x) - g(x)$.

Example 5.15 The function $\sin - \cos$ is just the function whose value for any $x \in \mathbb{R}$ is $\sin(x) - \cos(x)$.

Definition 5.16 (Function Product) Given two functions $f, g$ defined on a set $X$, we define the product of $f, g$ (written $fg$) by $fg = \{(x, y_1 y_2) \mid x \in X; (x, y_1) \in f \ \&\ (x, y_2) \in g\}$. Using the more familiar notation: $(fg)(x) = f(x)g(x)$.

Example 5.16 The function $\sin \cos$ is just the function whose value for any $x \in \mathbb{R}$ is $\sin(x)\cos(x)$.



Another useful operation which we can define on functions is composition.

Definition 5.17 (Function Composition) Given a function $f : X \to Y$ and a function $g : Y \to Z$, we can define a new function $g \circ f : X \to Z$ by $g \circ f = \{(x, z) \mid (x \in X) \wedge (\exists y \in Y \text{ s.t. } (x, y) \in f \ \&\ (y, z) \in g)\}$. In the more familiar notation: $(g \circ f)(x) = g(f(x)), \forall x \in X$.

Exercise 5.5 Prove that when $g \circ f$ is defined as above, then it is indeed a function.

Example 5.17 Let $X = \{1, 2, 3, 4\}$, $Y = \{a, b, c\}$ and $Z = \{p, q, r\}$. Suppose $f = \{(1, a), (2, a), (3, b), (4, c)\}$ and $g = \{(a, p), (b, q), (c, r)\}$. Then $g \circ f = \{(1, p), (2, p), (3, q), (4, r)\}$.
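The set-of-pairs definition of composition translates almost word for word into a set comprehension: pair $x$ with $z$ whenever some intermediate $y$ links them. The sketch below reproduces Example 5.17; the function name `compose` is an assumption made here.

```python
def compose(g, f):
    """g ∘ f as a set of pairs: (x, z) whenever (x, y) is in f and (y, z) is in g."""
    return {(x, z) for x, y in f for y2, z in g if y == y2}


f = {(1, "a"), (2, "a"), (3, "b"), (4, "c")}
g = {("a", "p"), ("b", "q"), ("c", "r")}

print(sorted(compose(g, f)))   # [(1, 'p'), (2, 'p'), (3, 'q'), (4, 'r')]
```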

Exercise 5.6 Which of the functions f , g, g ◦ f are 1–1, onto?




1. Determine whether each of the following sets of ordered pairs is a function from $X = \{1, 2, 3, 4\}$ to $Y = \{a, b, c, d\}$. If it is a function, find its domain and range and determine whether it is one-to-one (injective) or onto (surjective). If it is one-to-one and onto, describe the inverse function as a set of ordered pairs and give the domain and range of the inverse function.

$\{(1, a), (2, a), (3, c), (4, b)\}$
$\{(1, c), (2, a), (3, b), (4, c), (2, d)\}$
$\{(1, c), (2, d), (2, a), (4, b)\}$
$\{(1, d), (2, d), (4, a)\}$
$\{(1, b), (2, b), (3, b), (4, b)\}$

2. Give an example of a function that is one-to-one, but not onto.

3. Give an example of a function that is onto, but not one-to-one.

4. Give an example of a function that is neither onto nor one-to-one.



5. Given $g : X \to Y$ and $f : Y \to Z$, verify or find a counter-example for each of the following propositions:

(a) If $f$ and $g$ are one-to-one, then $f \circ g$ is one-to-one.

(b) If $f$ and $g$ are onto, then $f \circ g$ is onto.

(c) If $f \circ g$ is one-to-one, then $g$ is one-to-one.

(d) If $f \circ g$ is onto, then $f$ is onto.

(e) If $f \circ g$ is onto, then $g$ is onto.

(f) If $f \circ g$ is one-to-one, then $f$ is one-to-one.

(g) If $f$ and $g$ are one-to-one and onto, then $f \circ g$ is one-to-one and onto.

6. Let $f : X \to Y$. Prove that $f$ is one-to-one if and only if $f(A \cap B) = f(A) \cap f(B)$, $\forall A, B \subset X$.

7. Let $f : X \to Y$. Show that $f$ is one-to-one if and only if whenever $g$ is a one-to-one function from any set $A$ to $X$, $f \circ g$ is one-to-one.



8. Let $f : X \to Y$. Show that $f$ is onto $Y$ if and only if whenever $g$ is a function from $Y$ onto any set $Z$, $g \circ f$ is onto $Z$.

9. If $X$ and $Y$ are sets, define $X$ to be equivalent to $Y$ if there is a one-to-one and onto function from $X$ to $Y$. Show that for any set $X$, $X$ is not equivalent to $\mathcal{P}(X)$ (the power set of $X$).

10. (Difficult!) Given that $f : \mathbb{N} \to \mathbb{N}$ satisfies the inequality $\forall n \in \mathbb{N}, f(n+1) > f(f(n))$, show that $f$ is given by $\forall n \in \mathbb{N}, f(n) = n$.



6 Recurrence Relations

In this Chapter, we consider lists (sequences) of numbers $a_1, a_2, \cdots$ and formulas like $a_n = a_{n-1} + a_{n-2}$. We will learn in this Chapter how to solve such equations and in Chapter 7 how they arise when solving important problems in Computer Science.

6.1 Sequences and Recurrence Relations

Definition 6.1 A sequence is a function $f : \mathbb{N} \to \mathbb{R} : n \to a_n$.

Comment 6.1 Alternatively a sequence may be thought of as an ordered list of real numbers (of arbitrary length). We often refer to the sequence $a_1, a_2, \cdots$ as “the sequence $\{a_n\}$” for brevity.



Recurrence relations are equations which express the “general term” $a_n$ of a sequence $a_1, a_2, \cdots$ in terms of one or more of its predecessors. The recurrence equation can be used to calculate successive terms in the sequence given (say) the values of $a_1$ and $a_2$. Often the recurrence relation is the result of the mathematical analysis of some problem (as we will see below and in the next Chapter). Simply calculating successive terms in the sequence can be useful but often we seek to calculate a “closed form” or explicit formula for the general term $a_n$ in terms of $n$. The most general form of interest is $a_n = f(a_{n-1}, a_{n-2}, \cdots, a_1)$. However equations as complex as (for example) $a_n = \log(a_{n-1} + a_{n-2}\sin^2(a_{n-1}a_{n-3})) + a_{n-4}$ can rarely be solved in closed form. The best we could do for this example would be to calculate $a_5$ given $a_1, a_2, a_3, a_4$; then calculate $a_6$ in terms of $a_2, a_3, a_4, a_5$; etc.


Exercise 6.1 Find $a_5, a_6, a_7$ given $a_1 = 2$, $a_2 = 3$, $a_3 = 4$, $a_4 = 5$. Can you see any pattern to the numbers?

To have any hope of finding systematic solution techniques, we first need to restrict ourselves to simple special cases which are still of interest.

Definition 6.2 A recurrence relation is linear if the unknowns $a_n, a_{n-1}, \dots$ appear only as linear combinations, i.e. as terms of the form $\alpha a_{n-1} + \beta a_{n-2} + \gamma a_{n-3} + \cdots$. A recurrence relation which is not linear is called non-linear.

Example 6.1 The recurrence relation $a_n = 3a_{n-1}^2 + 4a_{n-3}$ is non-linear. The recurrence relation $a_n = 3a_{n-1} + 4a_{n-3} + 7$ is linear.


Definition 6.3 A recurrence relation is second-order if $a_n$ can be expressed in terms of its immediate predecessors $a_{n-1}, a_{n-2}$ only. In general the order of a recurrence relation is the number of steps back in the sequence you need to go in order to calculate $a_n$.

Example 6.2 The recurrence relation $a_n = 4a_{n-3}^3 + 5a_{n-7}$ is seventh-order. The recurrence relation $a_n = 2a_{n-1} + 5a_{n-2}^2$ is second-order.

Definition 6.4 A recurrence relation has constant coefficients if each term on the RHS only involves terms in the sequence and constant numeric values.

Example 6.3 The recurrence relation $a_n = 3a_{n-2}^4/\ln(a_{n-3}) + 4a_{n-6}$ has constant coefficients. The recurrence relation $a_n = 2n\,a_{n-1} + 4n\,a_{n-2}$ does not.


Definition 6.5 A recurrence relation is homogeneous if every term on the RHS involves one or more terms in the sequence. Otherwise the recurrence relation is said to be inhomogeneous.

Example 6.4 The recurrence relation $a_n = a_{n-3}^3 \sin^2(a_{n-4}) + 7a_{n-2}^3$ is homogeneous. The recurrence relation $a_n = a_{n-3}^3 \sin^2(a_{n-4}) + 7a_{n-2}^3 + 7n^2 + 4$ is inhomogeneous.

Example 6.5 The “Fibonacci sequence” $1, 1, 2, 3, 5, 8, \dots$ can be generated by the recurrence relation $a_n = a_{n-1} + a_{n-2}$, where $a_1 = a_2 = 1$. This is an example of a linear, homogeneous, second-order recurrence relation with constant coefficients.


6.2 Derivation of a Recurrence Relation

Consider the famous “Tower of Hanoi” problem. It may be posed as follows. Start with $n$ heavy disks stacked on the left-handmost of three posts in decreasing order of size (smallest at the top). The disks are to be moved (one at a time) from Post 1 to Post 3, while obeying the rule that a larger disk may not be placed on top of a smaller one.

Figure 6.1: Tower of Hanoi at Year Zero (n disks)


Many years later the disks have been moved to post number three.

Figure 6.2: Tower of Hanoi at Year ? (n disks)

It is not immediately obvious how long it takes to perform this task for $n$ disks, in units of time where one unit is the time required to carry one disk from one Post to another. The solution to the problem leads to the idea of an “algorithm”.

Definition 6.6 An algorithm is an unambiguous sequence of operations which solves a specified problem in a finite time.


We need to specify an algorithm to solve the Tower of Hanoi problem. This is easily done for small values of $n$, but it rapidly becomes complicated for (say) $n > 3$. The insight which leads to a simple algorithm is the idea of recursion.

Definition 6.7 (Recursion) Informally, a recursive algorithm is one which refers to itself.

Consider the following recursive algorithm for the Tower of Hanoi problem:

To solve the n-disk problem:

• A: Move the top n − 1 disks to the middle Post.

• B: Move the biggest (base) disk to its final destination (RH Post).

• C: Move the top n − 1 disks from the middle Post to the RH Post.


Comment 6.2 Note that the procedure or algorithm is not explicit: we always describe the solution procedure for a problem of size $n$ in terms of the procedure for a problem of size $n - 1$.

Exercise 6.2 Persuade yourself that the above algorithm works by trying it for (say) $n = 3, 4$.

Let $a_n$ be the time taken to solve the problem for $n$ disks. Then $a_1 = 1$ and $a_2 = 3$. We can now derive a recurrence relation for this algorithm. Clearly $a_n = T_A + T_B + T_C$ (where $T_A, T_B, T_C$ are the times taken to accomplish steps A, B, C of the algorithm respectively). Steps A and C each solve an $(n-1)$-disk problem and step B moves a single disk, so we must have $a_n = a_{n-1} + 1 + a_{n-1}$ (substituting for $T_A, T_B, T_C$). Therefore the recurrence relation is:

$a_n = 2a_{n-1} + 1$ (6.1)

which together with the starting value $a_1 = 1$ concludes our analysis of the problem.
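The three steps A, B, C can be sketched in Python as follows. This is an illustrative sketch rather than part of the original notes: the function name `hanoi`, the post numbering 1, 2, 3 and the `(disk, from_post, to_post)` move format are all assumptions made for the example. Counting the moves it generates gives a direct check on Eq. 6.1.

```python
def hanoi(n, source=1, target=3, spare=2, moves=None):
    """Return the list of (disk, from_post, to_post) moves for n disks.

    Disks are numbered 1 (smallest) to n (largest); names are illustrative.
    """
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, source, spare, target, moves)   # step A: top n-1 disks to the middle post
    moves.append((n, source, target))            # step B: base disk to its destination
    hanoi(n - 1, spare, target, source, moves)   # step C: n-1 disks from middle to destination
    return moves


# The move counts obey a_n = 2 a_{n-1} + 1 with a_1 = 1.
counts = [len(hanoi(n)) for n in range(1, 8)]
for previous, current in zip(counts, counts[1:]):
    assert current == 2 * previous + 1
```

Here one "unit of time" is taken to be one move, so `len(hanoi(n))` plays the role of $a_n$.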


All that remains is to find a solution to the recurrence relation, i.e. to find an expression for $a_n$ in terms of $n$ (as a function of $n$).

We will consider two general solution techniques: the Iteration Technique and the Substitution Technique. The first has the advantage of being applicable to any recurrence relation, although it may not always yield a closed-form solution. The second is only applicable to linear, homogeneous, second-order recurrence relations with constant coefficients, but will always yield a closed-form solution. (It can be extended to inhomogeneous recurrence relations, as we will see.)


6.3 Iteration Technique

The technique consists simply of repeatedly substituting for $a_{n-1}, a_{n-2}, \dots$ in the RHS of the recurrence relation from the LHS. In other words, given a recurrence relation of the form $a_n = f(a_{n-1}, a_{n-2}, \dots)$, substitute for $a_{n-1} = f(a_{n-2}, a_{n-3}, \dots)$, $a_{n-2} = f(a_{n-3}, a_{n-4}, \dots)$, etc. We try to infer the general pattern (if any) by studying the formulas obtained for $a_n$ in the first few substitutions. (This is an example of inductive reasoning.) After we guess a formula for $a_n$, we check it by substituting into the original recurrence relation.


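Example 6.6 Consider the Tower of Hanoi recurrence relation, Eq. 6.1. We will apply the above technique.

$$
\begin{aligned}
a_n &= 2a_{n-1} + 1, \qquad n \ge 2 \\
&= 2(2a_{n-2} + 1) + 1 \\
&= 2^2 a_{n-2} + 2 + 1 \\
&= 2^2(2a_{n-3} + 1) + 2 + 1 \\
&= 2^3 a_{n-3} + 2^2 + 2 + 1 \\
&= 2^4 a_{n-4} + 2^3 + 2^2 + 2 + 1 \\
&\;\;\vdots \\
&= 2^{n-1} a_1 + 2^{n-2} + \cdots + 2^2 + 2 + 1
\end{aligned}
$$

Finally, using $a_1 = 1$ and summing the geometric series, we get $a_n = 1 + 2 + 2^2 + \cdots + 2^{n-1} = \frac{2^n - 1}{2 - 1} = 2^n - 1$. Therefore $a_n = 2^n - 1$.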


Exercise 6.3 Check that this satisfies the recurrence relation, i.e. that $a_n = 2a_{n-1} + 1$.

More examples may be found in the Exercises at the end of the Chapter.
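A guessed closed form can also be spot-checked numerically before attempting a proof. The sketch below is only an illustration (the helper names `iterate` and `closed_form` are assumptions, not notation from these notes): it generates terms of a first-order recurrence directly and compares them with the formula $2^n - 1$ for the first few $n$. Agreement on finitely many terms is evidence, not a proof.

```python
def iterate(step, a1, n):
    """Return a_n, starting from a_1 = a1 and applying a_k = step(a_{k-1}) repeatedly."""
    a = a1
    for _ in range(n - 1):
        a = step(a)
    return a


def closed_form(n):
    """Candidate closed form for the Tower of Hanoi recurrence."""
    return 2 ** n - 1


# Compare the recurrence a_n = 2 a_{n-1} + 1, a_1 = 1 with the candidate formula.
for n in range(1, 21):
    assert iterate(lambda a: 2 * a + 1, 1, n) == closed_form(n)
```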



6.4 Substitution Technique — Introduction

We begin with an example. The Fibonacci sequence is generated by the recurrence relation $a_n = a_{n-1} + a_{n-2}$; $a_1 = a_2 = 1$. We want to find an expression for $a_n$ in terms of $n$. Try a simple substitution for $a_n$ which contains a parameter for which we can solve. In other words, we use a trial solution or “ansatz” which satisfies the equation for some choice of a free parameter.

In our work, the trial solution will usually take the form $a_n = t^n$ where $t$ is to be determined. So

$a_n = a_{n-1} + a_{n-2}$ (6.2)

$t^n = t^{n-1} + t^{n-2}$ (6.3)

for all $n \ge 3$. Dividing across by $t^{n-2}$, we find that $t$ must satisfy $t^2 - t - 1 = 0$. So $t = \frac{1 \pm \sqrt{5}}{2} = t_\pm$ (say). We therefore have two solutions $a_n = t_+^n$ and $a_n = t_-^n$.


But we need the solution for $a_n$ to give the right answer when $n = 1, 2$, i.e. $a_1 = 1$ and $a_2 = 1$. To sort out the difficulty we need the following theorem.

Theorem 6.1 If the sequences $p_n$ and $q_n$ both satisfy the recurrence relation $a_n = \alpha a_{n-1} + \beta a_{n-2}$ then for any choice of $A, B$ (constants) so does the new sequence $A p_n + B q_n$.

Proof: We have (substituting $p_n, q_n$ into the recurrence relation and multiplying across by $A, B$ respectively):

$A p_n = \alpha p_{n-1} A + \beta p_{n-2} A$ (6.4)

$B q_n = \alpha q_{n-1} B + \beta q_{n-2} B$ (6.5)

and adding we obtain

$A p_n + B q_n = \alpha(A p_{n-1} + B q_{n-1}) + \beta(A p_{n-2} + B q_{n-2})$. (6.6)


We now introduce the idea of a general solution to a recurrence relation. This will allow us to find a solution which has the right values for $a_1, a_2$.

Definition 6.8 A general solution to a linear, homogeneous, second-order recurrence relation with constant coefficients is a sequence with two free parameters ($A, B$ say), any choice of which yields a solution to the recurrence relation.

So from our 2 solutions $t_+^n, t_-^n$ we can create a “general solution” $A t_+^n + B t_-^n$. We can choose the constants $A, B$ so that the correct values for $a_1, a_2$ are obtained.

Definition 6.9 A solution which does not have free parameters is called a particular solution.


Back to the Fibonacci recurrence relation. We now find the correct choice of $A, B$ by requiring that

$a_1 \equiv 1 = A t_+ + B t_-$ (n = 1) (6.7)

$a_2 \equiv 1 = A t_+^2 + B t_-^2$ (n = 2) (6.8)

Solving, we find $A = \frac{1}{\sqrt{5}}$ and $B = -\frac{1}{\sqrt{5}}$. So the particular solution which satisfies $a_1 = 1$, $a_2 = 1$ is

$a_n = \frac{1}{\sqrt{5}} \left( \frac{(1+\sqrt{5})^n}{2^n} - \frac{(1-\sqrt{5})^n}{2^n} \right)$.

Note that for large $n$, $a_n \approx \frac{1}{\sqrt{5}} \frac{(1+\sqrt{5})^n}{2^n}$.
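The closed form and its large-$n$ approximation can be compared with the recurrence numerically. The following is a minimal sketch, with function names chosen purely for illustration; it uses floating-point arithmetic, so the comparison is only reliable for moderate $n$ (rounding error grows with $n$).

```python
from math import sqrt


def fib_recurrence(n):
    """a_n from a_n = a_{n-1} + a_{n-2} with a_1 = a_2 = 1."""
    a, b = 1, 1  # (a_1, a_2)
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def fib_closed(n):
    """Particular solution A t_+^n + B t_-^n with A = 1/sqrt(5), B = -1/sqrt(5)."""
    s5 = sqrt(5)
    return (((1 + s5) / 2) ** n - ((1 - s5) / 2) ** n) / s5


def fib_approx(n):
    """Dominant term only; the t_-^n term shrinks because |t_-| < 1."""
    s5 = sqrt(5)
    return ((1 + s5) / 2) ** n / s5


for n in (1, 2, 5, 10, 20):
    print(n, fib_recurrence(n), fib_closed(n), fib_approx(n))
```

Rounding either floating-point value to the nearest integer recovers the exact term for the values of $n$ tried here.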



6.5 Substitution Technique — Details

Our general technique for solving a recurrence relation of the form

$a_n = c_1 a_{n-1} + c_2 a_{n-2}$ (6.9)

is now clear. Begin with a trial solution $a_n = t^n$, where $t$ is to be determined. Substituting in the recurrence relation yields a quadratic in $t$:

$t^2 = c_1 t + c_2$ (6.10)

When we solve the quadratic, as usual three cases arise:

1. Two real distinct roots.

2. Equal real roots.

3. Two distinct complex roots.



We consider each case separately.

6.5.1 Two Distinct Real Roots

This is the most straightforward case. The Fibonacci example which we have already studied illustrates the procedure in this case.

6.5.2 Equal Real Roots

So the solution to the quadratic is just $t = r$ (say). A general quadratic takes the form $t^2 + et + f = 0$. If there is a double root at $t = r$ then we can write the quadratic as $(t - r)^2 = 0$. Expanding this expression and equating coefficients of $t$ and $1$, we find that $e = -2r$ and $f = r^2$. Now, given our recurrence relation Eq. 6.9, the quadratic being solved is Eq. 6.10, i.e. $t^2 - c_1 t - c_2 = 0$. Therefore $c_1 = 2r$ and $c_2 = -r^2$ and so

$\frac{c_2}{c_1} = -\frac{1}{2} r$. (6.11)

We need a second solution to the recurrence relation as we need 2 unknowns in the general solution corresponding to the 2 starting values for $a_1, a_2$. We set out to show that a second trial solution is yielded by $a_n = n r^n$.

Lemma 6.2 If $r$ is a double root of Eq. 6.10, the sequence $a_n = n r^n$ satisfies the recurrence relation Eq. 6.9.


Proof: Substituting $a_n = n r^n$ in the recurrence relation, we find that $r$ must satisfy:

$n r^n = c_1 (n-1) r^{n-1} + c_2 (n-2) r^{n-2}$ (6.12)

The term proportional to $n$ and the remainder must separately vanish, as the equation must hold for all $n$, so we obtain the two equations: $r^n - c_1 r^{n-1} - c_2 r^{n-2} = 0$ (holds as $r$ is a root of the quadratic Eq. 6.10) and $c_1 r^{n-1} + 2c_2 r^{n-2} = 0$ (as $c_1, c_2$ satisfy Eq. 6.11).

So the general solution for the Double Real Root case is

$a_n = \alpha r^n + \beta n r^n$. (6.13)


Example 6.7 Consider the recurrence relation $a_n = 4a_{n-1} - 4a_{n-2}$; $a_1 = 1$, $a_2 = 3$. The substitution $a_n = t^n$ reduces to the quadratic $t^2 - 4t + 4 = 0$ whose only solution is $r = 2$. The general solution is $a_n = \alpha 2^n + \beta n 2^n$. When we incorporate the starting values for $a_1, a_2$, we get the pair of equations:

$1 = 2\alpha + 2\beta$ (n = 1) (6.14)

$3 = 4\alpha + 8\beta$ (n = 2) (6.15)

whose solution is $\alpha = \beta = 1/4$. So the “particular solution” (the solution which satisfies the given values for $a_1, a_2$) is $a_n = \frac{1}{4} 2^n + \frac{1}{4} n 2^n = 2^{n-2}(n+1)$.
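The result of Example 6.7 can be checked with exact rational arithmetic. This is a small sketch under the assumption that Python's `fractions.Fraction` is an acceptable stand-in for hand calculation; the function names are illustrative only.

```python
from fractions import Fraction


def by_recurrence(n):
    """a_n from a_n = 4 a_{n-1} - 4 a_{n-2} with a_1 = 1, a_2 = 3."""
    a, b = 1, 3  # (a_1, a_2)
    for _ in range(n - 1):
        a, b = b, 4 * b - 4 * a
    return a


def by_formula(n, alpha=Fraction(1, 4), beta=Fraction(1, 4)):
    """Double-root general solution alpha * 2^n + beta * n * 2^n."""
    return alpha * 2 ** n + beta * n * 2 ** n


for n in range(1, 11):
    assert by_formula(n) == by_recurrence(n)
```

Changing `alpha` and `beta` gives other members of the general solution; only the values found above reproduce the stated starting values.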



6.5.3 Complex Roots

We have the general solution $a_n = \alpha (r_1)^n + \beta (r_2)^n$ where $r_1, r_2$ are the 2 roots of $t^2 - c_1 t - c_2 = 0$. In general, given $t^2 + et + f = 0$, the solution is of course $\frac{-e \pm \sqrt{e^2 - 4f}}{2}$. If $e^2 - 4f < 0$ (the case where the roots are complex) then call $\frac{1}{2}\sqrt{e^2 - 4f} = iB$ where $B$ is real and positive and $i = \sqrt{-1}$. Call $-\frac{e}{2} = A$. Then the roots are $r_1 = A + iB$ and $r_2 = A - iB$. Note that the two roots come in a “mirror image” or “conjugate” pair. We can write any complex number $A + iB$ in so-called “polar” form $D(\cos\theta + i\sin\theta)$ where $D = \sqrt{A^2 + B^2}$ and $\tan\theta = B/A$.

Now De Moivre’s Theorem states that $(\cos\theta + i\sin\theta)^n = \cos n\theta + i \sin n\theta$. So we can write the general solution to our recurrence relation as

$\alpha D^n(\cos n\theta + i\sin n\theta) + \beta D^n(\cos n\theta - i\sin n\theta)$ (6.16)


The problem with this form for the solution is that $\alpha, \beta$ are (in general) complex numbers while we know that the solution to a recurrence relation where the coefficients and starting values are all real must itself be real. (Why?) If we write $\alpha = \alpha_1 + i\alpha_2$ and $\beta = \beta_1 + i\beta_2$ and substitute into Eq. 6.16 we find that the general solution is

$D^n \big( \alpha_1 \cos n\theta - \alpha_2 \sin n\theta + i(\alpha_1 \sin n\theta + \alpha_2 \cos n\theta) + \beta_1 \cos n\theta + \beta_2 \sin n\theta + i(-\beta_1 \sin n\theta + \beta_2 \cos n\theta) \big)$

However, using the fact that the solution is real, we can drop all the terms prefixed by $i$ and write the general solution as:

$D^n[\gamma \cos n\theta + \delta \sin n\theta]$ (6.17)

where $\gamma, \delta$ are arbitrary constants and $D, \theta$ are computed from the quadratic as explained above.

Example 6.8 Consider the recurrence relation $a_n = 2a_{n-1} - 5a_{n-2}$; $a_1 = 1$, $a_2 = 3$. The usual substitution $a_n = t^n$ yields the quadratic $t^2 - 2t + 5 = 0$ whose roots are $1 \pm 2i$. So (in terms of our previous analysis), $A = 1$ and $B = 2$. First calculate $D = \sqrt{A^2 + B^2}$ which gives $D = \sqrt{5}$. Now we need to calculate the angle $\theta$ from the equation $\tan\theta = B/A$. We therefore have $\tan\theta = 2$ and so $\theta = \arctan 2$. (There is no need to numerically evaluate $\theta$ at this stage.)


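Our general solution is therefore $5^{n/2}(\gamma \cos n\theta + \delta \sin n\theta)$. We have the “initial conditions” $a_1 = 1$ and $a_2 = 3$ so we have

$1 = \sqrt{5}(\gamma \cos\theta + \delta \sin\theta)$ (n = 1)

$3 = 5(\gamma \cos 2\theta + \delta \sin 2\theta)$ (n = 2)

and, substituting for $\theta$ (using $\cos\theta = 1/\sqrt{5}$, $\sin\theta = 2/\sqrt{5}$, $\cos 2\theta = -3/5$ and $\sin 2\theta = 4/5$),

$1 = \gamma + 2\delta$

$3 = -3\gamma + 4\delta$

Exercise 6.4 Solve for $\gamma, \delta$ and write the general term $a_n$.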
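The polar-form recipe can be tried numerically. The sketch below is an illustration only: `polar_solution` is a hypothetical helper name, it works in floating point, and it uses `atan2(B, A)` in place of $\arctan(B/A)$ so that the angle is also correct when $A \le 0$. It fits $\gamma, \delta$ to the two starting values and returns the resulting real-valued formula, which can then be compared with terms generated directly from the recurrence.

```python
from math import atan2, cos, sin, sqrt


def polar_solution(c1, c2, a1, a2):
    """Return n -> D^n (gamma cos(n theta) + delta sin(n theta)) for
    a_n = c1 a_{n-1} + c2 a_{n-2} when t^2 = c1 t + c2 has complex roots."""
    disc = c1 * c1 + 4 * c2          # discriminant of t^2 - c1 t - c2 = 0
    if disc >= 0:
        raise ValueError("roots are real; use the real-root cases instead")
    A = c1 / 2                        # real part of the roots
    B = sqrt(-disc) / 2               # positive imaginary part
    D = sqrt(A * A + B * B)
    theta = atan2(B, A)
    # Fit gamma, delta to a_1 and a_2 by solving a 2x2 linear system (Cramer's rule).
    m11, m12 = D * cos(theta), D * sin(theta)
    m21, m22 = D ** 2 * cos(2 * theta), D ** 2 * sin(2 * theta)
    det = m11 * m22 - m12 * m21
    gamma = (a1 * m22 - m12 * a2) / det
    delta = (m11 * a2 - m21 * a1) / det
    return lambda n: D ** n * (gamma * cos(n * theta) + delta * sin(n * theta))


a = polar_solution(2, -5, 1, 3)       # the recurrence of Example 6.8
print([round(a(n)) for n in range(1, 7)])
```

The printed values can be checked by hand against $a_n = 2a_{n-1} - 5a_{n-2}$.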



6.6 Substitution Technique: Inhomogeneous Case

We now know how to solve linear homogeneous order 2 recurrence relations with constant coefficients, i.e. recurrence relations of the form $a_n = c_1 a_{n-1} + c_2 a_{n-2}$. We will now extend the technique to inhomogeneous recurrence relations of the form $a_n = c_1 a_{n-1} + c_2 a_{n-2} + f(n)$ where $f$ is an arbitrary given function of $n$.

Example 6.9 Consider the recurrence relations $a_n = 2a_{n-1} - a_{n-3} + n^2 + 1$ and $a_n = 2a_{n-1} + 4a_{n-2} + \sin(n)$.

The technique for homogeneous recurrence relations doesn’t work here: if we substitute $a_n = t^n$ in such a recurrence relation, we get $t^n = c_1 t^{n-1} + c_2 t^{n-2} + f(n)$, which cannot be solved for $t$ in general without $t$ depending on $n$. The extended technique can be stated as a theorem:


Theorem 6.3 Let $a_n$ be the general solution of the recurrence relation

$a_n = c_1 a_{n-1} + c_2 a_{n-2} + f(n)$. (6.18)

Then, given a particular solution $p_n$ (see Def. 6.9), we can write $a_n = p_n + h_n$, where $h_n$ is a general solution of the homogeneous equation which can be constructed by deleting $f(n)$ from Eq. 6.18, namely $a_n = c_1 a_{n-1} + c_2 a_{n-2}$.


Proof: Given

$a_n = c_1 a_{n-1} + c_2 a_{n-2} + f(n)$ (6.19)

$h_n = c_1 h_{n-1} + c_2 h_{n-2}$ (6.20)

and assuming we have a particular solution $p_n$

$p_n = c_1 p_{n-1} + c_2 p_{n-2} + f(n)$ (6.21)

then, subtracting Eq. 6.21 from Eq. 6.19 we have that

$(a_n - p_n) = c_1(a_{n-1} - p_{n-1}) + c_2(a_{n-2} - p_{n-2})$. (6.22)

This equation has the same form as Eq. 6.20; so, as $h_n$ is the general solution of the homogeneous equation, we must have $a_n - p_n = h_n$. Thus

$a_n = h_n + p_n$, (6.23)

as required.


Example 6.10 Consider the recurrence relation $a_n = 5a_{n-1} - 6a_{n-2} + 9$ with initial values $a_1 = 1$ and $a_2 = -1$.

• First we solve the homogeneous recurrence relation $a_n = 5a_{n-1} - 6a_{n-2}$. The solution is $h_n = \alpha 2^n + \beta 3^n$. (Check.)

• Now we need a particular solution of the full equation. We try the simplest possible choice: $p_n = K$, a constant. Substituting into the recurrence relation yields $K = 5K - 6K + 9$, i.e. $2K = 9$, so we have $K = 4\frac{1}{2}$ and therefore $p_n = 4\frac{1}{2}$.

• So the general solution of the inhomogeneous recurrence relation is $a_n = h_n + p_n = \alpha 2^n + \beta 3^n + 4\frac{1}{2}$.

Exercise 6.5 Find the particular solution to the above recurrence relation which satisfies the given initial conditions.
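Before fitting the initial conditions, it can be reassuring to confirm that the general solution of Example 6.10 really does satisfy the recurrence for any choice of $\alpha, \beta$. The following sketch (helper names are illustrative assumptions) does this with exact fractions for a few arbitrary parameter values; it deliberately leaves the fitting of $a_1, a_2$ to the reader.

```python
from fractions import Fraction

K = Fraction(9, 2)  # the constant particular solution p_n = 4 1/2


def general(n, alpha, beta):
    """h_n + p_n = alpha 2^n + beta 3^n + K for the recurrence of Example 6.10."""
    return alpha * 2 ** n + beta * 3 ** n + K


def satisfies(alpha, beta, n):
    """True if the sequence obeys a_n = 5 a_{n-1} - 6 a_{n-2} + 9 at index n."""
    return general(n, alpha, beta) == (
        5 * general(n - 1, alpha, beta) - 6 * general(n - 2, alpha, beta) + 9
    )


# Arbitrary parameter choices: every one of them gives a solution.
for alpha, beta in [(0, 0), (1, -2), (Fraction(3, 7), Fraction(-5, 2))]:
    assert all(satisfies(alpha, beta, n) for n in range(3, 12))
```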


We need a structured approach to the task of finding a particular solution to a given inhomogeneous recurrence relation. We adopt as our guiding principle that we should use the simplest solution possible. Consider a succession of increasingly complex possible forms for $f(n)$.

6.6.1 Polynomial Inhomogeneous Term

Suppose that $f(n)$ is a polynomial in $n$. In particular, suppose that $f(n) = An^2 + Bn + C$, a quadratic in $n$. Then our recurrence relation takes the form $a_n = c_1 a_{n-1} + c_2 a_{n-2} + An^2 + Bn + C$. For a trial solution $p_n$ to satisfy this equation, it must be at least a quadratic in $n$, i.e. at least $p_n = Kn^2 + Ln + M$. In general the trial solution $p_n$ must be a polynomial of degree at least equal to that of $f(n)$.


Example 6.11 Consider the example $a_n = 3a_{n-1} + 7a_{n-2} + 2n^2 + 6n + 1$. Substituting $p_n = An^2 + Bn + C$ for $a_n$ in the recurrence relation gives us

$An^2 + Bn + C = 3(A(n-1)^2 + B(n-1) + C) + 7(A(n-2)^2 + B(n-2) + C) + 2n^2 + 6n + 1$. (6.24)

Now gather together on the LHS the $n^2$, $n$ and constant terms. They must each sum to 0. (Why?) We have

$A - 3A - 7A - 2 = 0$ (6.25)

$B + 6A - 3B + 28A - 7B - 6 = 0$ (6.26)

$C - 3A + 3B - 3C - 28A + 14B - 7C - 1 = 0$. (6.27)

Solving for $A, B, C$ yields a particular solution.
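The three equations form a triangular system: the $n^2$ equation gives $A$, the $n$ equation then gives $B$, and the constant equation gives $C$. The sketch below carries this out with exact fractions for a general recurrence $a_n = c_1 a_{n-1} + c_2 a_{n-2} + r_2 n^2 + r_1 n + r_0$. The function name and the parameter names `r2, r1, r0` are assumptions introduced for the illustration, and the formulas inside come from expanding $p_{n-1}$ and $p_{n-2}$ exactly as in Eq. 6.24.

```python
from fractions import Fraction


def quadratic_particular(c1, c2, r2, r1, r0):
    """Coefficients (A, B, C) of p_n = A n^2 + B n + C solving
    a_n = c1 a_{n-1} + c2 a_{n-2} + r2 n^2 + r1 n + r0."""
    s = Fraction(1 - c1 - c2)
    if s == 0:
        # t = 1 is a root of the quadratic, so a quadratic trial solution fails.
        raise ValueError("degenerate case: try a higher-degree trial solution")
    A = r2 / s                                          # n^2 terms
    B = (r1 - (2 * c1 + 4 * c2) * A) / s                # n terms
    C = (r0 + c1 * (A - B) + c2 * (4 * A - 2 * B)) / s  # constant terms
    return A, B, C


A, B, C = quadratic_particular(3, 7, 2, 6, 1)  # the recurrence of Example 6.11


def p(n):
    return A * n * n + B * n + C


# The trial solution satisfies the full recurrence exactly.
for n in range(3, 10):
    assert p(n) == 3 * p(n - 1) + 7 * p(n - 2) + 2 * n * n + 6 * n + 1
```

The `ValueError` branch corresponds to the situation where the homogeneous solution already contains a constant term.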


Exercise 6.6 Use Maple to solve the above set of equations and solve the above recurrence relation for the initial values $a_1 = 1$, $a_2 = 3$. Do you really need Maple?

The problem may be “degenerate” and this rule may not work, e.g. $a_n = 4a_{n-1} - 3a_{n-2} + 15$. Try $p_n = K$. Then $K$ must satisfy $K = 4K - 3K + 15$. So $K = K + 15$ and therefore $0 = 15$, which is impossible. The problem is degenerate as $h_n = \alpha 3^n + \beta (1)^n = \alpha 3^n + \beta$, so a constant already solves the homogeneous equation. Try $p_n = Kn$. (There is no need for a constant term as $h_n$ already includes one.) Then

$Kn = 4K(n-1) - 3K(n-2) + 15$

Exercise 6.7 Try solving for $K$.


6.6.2 Trigonometrical Inhomogeneous Term

Suppose that $f(n)$ is a cos or sin. For example, consider the recurrence relation $a_n = 2a_{n-1} - 3a_{n-2} + 2\cos(n) - 7\sin(n)$. The most general $p_n$ needed takes the form $A\cos n + B\sin n$.

Exercise 6.8 Substitute the above trial solution in the given recurrence relation and solve for $A, B$.


1. Suppose that you have n dollars and that each day you buy either orange juice (1 dollar), milk (2 dollars) or beer (2 dollars). If c_n is the number of ways of spending all the money, show that

c_n = c_{n−1} + 2c_{n−2}.

2. Let c_n denote the number of regions into which the plane is divided by n lines. Assume that each pair of lines meet in a point, but that no three lines meet in a point. Derive a recurrence relation for the sequence c_1, c_2, . . .. Hint: What is the effect of adding one extra line?

3. If S is a bit string, let C(S) be the maximum number of consecutive 0’s in S. Let s_n be the number of n-bit strings S with C(S) ≤ 2. Develop a recurrence relation for s_1, s_2, . . ..


4. Ackermann’s function A(m,n) is defined in Johnsonbaugh by the recurrence relations

A(m, 0) = A(m− 1, 1), m = 1, 2, . . .

A(m,n) = A(m− 1, A(m,n− 1)), m = 1, 2, . . . n = 1, 2, . . .

and the initial conditions

A(0, n) = n+ 1, n = 0, 1, 2, . . .

(a) Use induction to show that

A(1, n) = n+ 2, n = 0, 1, . . .

(b) Use induction to show that

A(2, n) = 3 + 2n, n = 0, 1, . . .

(c) Guess a formula for A(3, n) and prove it using induction.



5. Solve the following recurrence relations with the given initial conditions:

$a_n = -3a_{n-1}$; $a_0 = 2$,

$a_n = 2n\,a_{n-1}$; $a_0 = 1$,

$a_n = a_{n-1} + n$; $a_0 = 0$,

$2a_n = 7a_{n-1} - 3a_{n-2}$; $a_0 = a_1 = 1$,

$2a_n = 7a_{n-1} - 3a_{n-2} + n + 1$; $a_0 = a_1 = 1$,

$2a_n = 7a_{n-1} - 3a_{n-2} + \cos n$; $a_0 = a_1 = 1$,

$2a_n = 7a_{n-1} - 5a_{n-2} + 1$; $a_0 = a_1 = 1$,

$\sqrt{a_n} = \sqrt{a_{n-1}} + 2\sqrt{a_{n-2}}$; $a_0 = a_1 = 1$,

$a_n = \sqrt{a_{n-1}}/a_{n-2}^2$; $a_0 = 1$, $a_1 = 2$,

$a_n = -2n\,a_{n-1} + 3n(n-1)\,a_{n-2}$; $a_0 = 1$, $a_1 = 2$.



7 Application: Analysis of Algorithms

We have already defined the term algorithm; the definition is repeated here for convenience.

Definition 7.1 (Algorithm) An algorithm is an unambiguous sequence of steps which completes a task in a finite time.

In this Chapter we use recurrence relations to analyse the execution time needed by algorithms. Given a particular algorithm, we will find a recurrence relation (and initial conditions) that define a sequence $a_1, a_2, \dots$, where $a_n$ is the time taken by the algorithm to solve a problem of size $n$.


We usually distinguish between

• Best Case: the time taken under the most favourable assumptions possible.

• Worst Case: the time taken under the least favourable assumptions possible.

• Average Case: the time taken when we average over all possible input types.

In many situations, the average case is the one of interest, though it can be useful to know the best and worst case execution times as they bracket the actual time for a particular problem. Often we are more interested in the approximate rate of growth of the execution time as $n$ increases than in the exact formula.


Example 7.1 Suppose that the worst case time for an algorithm is

$t(n) = 60n^2 + 5n + 1$

for an input of size $n$. For large $n$, it is reasonable to say that $t(n) \approx 60n^2$.

Note that if time in the latter example was measured in seconds then

$t(n) = n^2 + \frac{5n}{60} + \frac{1}{60}$

when time is measured in minutes. This change of units does not change the rate of growth in $t(n)$; it just changes $t(n)$ by a constant factor. For this reason, when we want to describe how the execution time grows as $n$ increases, we not only look for the dominant (most important) term (e.g. $60n^2$), but also ignore constant coefficients (factors).


For the present example, we would say that $t(n)$ grows like $n^2$ as $n$ increases. We say that $t(n)$ is of order at most $n^2$ and write

$t(n) = O(n^2)$

which is spoken as “$t(n)$ is big oh of $n^2$” or just “$t(n)$ is of order $n^2$”. We now define these terms formally:

Definition 7.2 Let $f$ and $g$ be functions with domain $\{1, 2, 3, \dots\}$. We write

$f(n) = O(g(n))$

and say $f(n)$ is of order at most $g(n)$ if there exists a positive constant $C$ such that

$|f(n)| \le C|g(n)|$

for all $n$ sufficiently large (greater than some minimum value $N$).


Example 7.2 Since

$60n^2 + 5n + 1 \le 60n^2 + 5n^2 + n^2 = 66n^2$ for $n \ge 1$,

we may take $C = 66$ in Def. 7.2 to get

$60n^2 + 5n + 1 = O(n^2)$.

Example 7.3 The function $2n + 3\log_2 n < 2n + 3n = 5n$ (as $\log_2 n < n$ for $n \ge 1$). So

$2n + 3\log_2 n = O(n)$.

Example 7.4 If $k$ is a positive integer, then

$1^k + 2^k + \cdots + n^k \le n^k + n^k + \cdots + n^k = n \cdot n^k = n^{k+1}$

for $n \ge 1$; so $1^k + 2^k + \cdots + n^k = O(n^{k+1})$.


Example 7.5 Since (by the Triangle Inequality)

$$
\begin{aligned}
|3n^3 + 6n^2 - 4n + 2| &\le 3n^3 + 6n^2 + 4n + 2 \\
&\le 3n^3 + 6n^3 + 4n^3 + 2n^3 \\
&= 15n^3 \quad \text{for } n \ge 1,
\end{aligned}
$$

it follows that $3n^3 + 6n^2 - 4n + 2 = O(n^3)$.
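The constants found in Examples 7.2, 7.3 and 7.5 can be sanity-checked numerically. The helper below is a hypothetical illustration (the name `bounded` and its parameters are assumptions): it tests the inequality of Def. 7.2 over a finite range of $n$ only, so a `True` result is supporting evidence for a bound rather than a proof of it.

```python
from math import log2


def bounded(f, g, C, N, upto=1000):
    """Check |f(n)| <= C |g(n)| for every integer n with N <= n <= upto."""
    return all(abs(f(n)) <= C * abs(g(n)) for n in range(N, upto + 1))


# Example 7.2: C = 66 works for all n >= 1 in the tested range.
print(bounded(lambda n: 60 * n**2 + 5 * n + 1, lambda n: n**2, C=66, N=1))
# Example 7.3: C = 5.
print(bounded(lambda n: 2 * n + 3 * log2(n), lambda n: n, C=5, N=1))
# Example 7.5: C = 15.
print(bounded(lambda n: 3 * n**3 + 6 * n**2 - 4 * n + 2, lambda n: n**3, C=15, N=1))
# A bound that is too small in order fails: 60n^2 + 5n + 1 is not below 66 n.
print(bounded(lambda n: 60 * n**2 + 5 * n + 1, lambda n: n, C=66, N=1))
```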


7.1 Selection Sort Algorithm

Consider the following algorithm: Selection Sort. It sorts a sequence $s_1, s_2, \dots, s_n$ into ascending order by first selecting the largest item and placing it last and then recursively sorting the remaining elements.

[Input:] A sequence $s_1, s_2, \dots, s_n$

[Output:] $s_1, s_2, \dots, s_n$ arranged into ascending order.


Example 7.6 Algorithm 7.1 (Selection Sort)

 1 begin
 2   if n = 1
 3     then STOP
 4   fi
 5   MaxIndex := 1
 6   for i := 2 to n do
 7     if s[i] > s[MaxIndex]
 8       then MaxIndex := i
 9     fi
10   od
11   SWAP s[n], s[MaxIndex]
12   Selection Sort (s[1], . . . , s[n-1])
13 end


Now, to analyse the “time-complexity” of this algorithm, we count the number of comparisons $t_n$ at line 7. (For this algorithm, the best-case, average-case and worst-case times are all the same.) We immediately find the initial condition

$t_1 = 0$.

To find a recurrence relation for the sequence $t_1, t_2, \dots$, we simulate the execution of the algorithm for an arbitrary input of size $n > 1$. We count the number of comparisons at each line and then sum these numbers to find the total number of comparisons $t_n$. At line 7 there are $n - 1$ comparisons (as the body of the for loop is executed $n - 1$ times, for $i = 2, \dots, n$) and at line 12 there are $t_{n-1}$ comparisons as we are sorting a list of size $n - 1$.

So we have the required recurrence relation:

$t_n = t_{n-1} + n - 1$.
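The comparison count can be observed directly by instrumenting an implementation. The following Python sketch is an adaptation of Algorithm 7.1, not a transcription: it uses 0-based indexing, sorts the list in place, and returns the number of element comparisons; the function name and the `n` parameter are assumptions made for the example.

```python
def selection_sort(s, n=None):
    """Sort s[0:n] in place (ascending) and return the number of comparisons made."""
    if n is None:
        n = len(s)
    if n <= 1:
        return 0                      # t_1 = 0: nothing to compare
    max_index = 0
    comparisons = 0
    for i in range(1, n):             # n - 1 iterations
        comparisons += 1              # the comparison on line 7 of the pseudocode
        if s[i] > s[max_index]:
            max_index = i
    s[n - 1], s[max_index] = s[max_index], s[n - 1]
    return comparisons + selection_sort(s, n - 1)   # plus t_{n-1}


data = [5, 2, 7, 3, 1, 4]
count = selection_sort(data)
print(data, count)
```

Whatever the initial order of the six items, the count is the same, reflecting the remark that best, average and worst cases coincide for this algorithm.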


We can solve this using the Iteration Method as follows:

$$
\begin{aligned}
t_n &= t_{n-1} + n - 1 \\
&= (t_{n-2} + n - 2) + n - 1 \\
&= (t_{n-3} + n - 3) + (n - 2) + n - 1 \\
&\;\;\vdots \\
&= t_1 + 1 + 2 + \cdots + (n-2) + (n-1) \\
&= 0 + 1 + 2 + \cdots + (n-1) \\
&= \frac{n(n-1)}{2} = O(n^2).
\end{aligned}
$$


7.2 Binary Search Algorithm

In this subsection, we consider a classic “recursive” algorithm and analyse its complexity. First, a technical definition.

Definition 7.3 Define the “floor” of a real number $x$ to be the largest integer $\le x$. Write this as $\lfloor x \rfloor$.

Example 7.7 The floor of 3.7 is 3, the floor of 2.0 is 2, the floor of −3.1 is −4.

Consider the following algorithm: Binary Search. It searches an ordered sequence $s_1, s_2, \dots, s_n$ for a key and returns the index (position) of the key if it is found and zero if it is not.

[Input:] A sequence $s_i, s_{i+1}, \dots, s_j$, $i \ge 1$, sorted into ascending order, together with a value key which is to be searched for.

[Output:] An index $k$ for which s[k] = key, or, if key is not in the sequence, the value 0.


Algorithm 7.2 (Binary Search)

 1 begin
 2   left := i
 3   right := j
 4   if left > right
 5     then
 6       k := 0
 7       STOP (Failed)
 8   fi
 9   k := ⌊(left + right)/2⌋
10   if (key = s[k])
11     then STOP (Succeeded)
12   fi
13   if (key < s[k])
14     then j := k − 1
15     else i := k + 1
16   fi
17   Binary Search (s[i], . . . , s[j]) (Recursive Call)
18 end


Define $t_n$ for this problem to be the number of times the algorithm is invoked (called) in the worst case for a list of $n$ items.

Suppose that $n = 1$. Then the sequence consists of one element $s_i$ and $i = j$. In the worst case, the item will not be found at line 10 so the algorithm will be invoked a second time at line 17. However, now $i > j$ and so the algorithm will terminate with failure at line 7. So the algorithm has been invoked twice and therefore:

$t_1 = 2$.

Suppose now that $n > 1$. In this case $i < j$, so at line 4, the condition left > right will be false. In the worst case, the item will not be found at line 10 and so the algorithm will be invoked at line 17. The invocation at line 17 will require a total of $t_m$ invocations, where $m$ is the size of the sequence that is input at line 17.


Since the sizes of the left and right sides of the original sequence are $\lfloor \frac{n-1}{2} \rfloor$ and $\lfloor \frac{n}{2} \rfloor$ respectively, and as the worst case occurs when the larger of the two is searched, the total number of invocations at line 17 will be $t_{\lfloor n/2 \rfloor}$. The original invocation together with the invocations at line 17 gives the total number of invocations. We therefore have the recurrence relation:

$t_n = 1 + t_{\lfloor n/2 \rfloor}$; $t_1 = 2$. (7.1)

We now need to solve the recurrence relation 7.1. Rather than try to solve for all $n$, consider the special case $n = 2^k$, which allows the recurrence relation to be solved exactly. We obtain:

$t_{2^k} = 1 + t_{2^{k-1}}$; $k \ge 1$. (7.2)

We can greatly simplify this with the substitution $u_k = t_{2^k}$, which yields:

$u_k = 1 + u_{k-1}$; $k \ge 1$, $u_0 = 2$. (7.3)


It is easy to see that the solution to this recurrence relation is $u_k = k + 2$, $k \ge 0$. Reverting to the original sequence $\{t_n\}$, we have $t_{2^k} = k + 2$ and so, when $n$ is a power of two, we have

$t_n = \log_2(n) + 2$. (7.4)

We could conjecture that $t_n = \log_2(n) + 2$ for all $n$, but this cannot be correct as $\log_2(n)$ is only an integer for $n$ a power of 2. The simplest change we can make to the solution which might work is to try $t_n = \lfloor \log_2(n) \rfloor + 2$. We could simply substitute into the recurrence relation Eq. 7.1 but to do this is precisely to prove the formula by induction; and Strong Induction at that!

Exercise 7.1 Discuss.


Theorem 7.1 Given the recurrence relation Eq. 7.1, the solution is $t_n = \lfloor \log_2(n) \rfloor + 2$, $n \ge 1$.

Proof: Use the Strong Principle of Induction.

[Basis Step] Straightforward.

[Inductive Step] Assume that $t_k = \lfloor \log_2(k) \rfloor + 2$ for all $k < n$. We need to show that $t_n = \lfloor \log_2(n) \rfloor + 2$. (Note that we need Strong Induction, as we need to assume the formula holds in particular for $k = \lfloor \frac{n}{2} \rfloor$, not just for $k = n - 1$.) We need to consider the cases $n$ even and $n$ odd separately.


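[n Even] We have $t_n = t_{\lfloor n/2 \rfloor} + 1$. As $n$ is even, we can replace $\lfloor \frac{n}{2} \rfloor$ by $\frac{n}{2}$. So

$t_n = t_{n/2} + 1$.

Now use the inductive hypothesis:

$$
\begin{aligned}
t_{n/2} &= \lfloor \log_2(\tfrac{n}{2}) \rfloor + 2 \\
&= \lfloor \log_2(n) - \log_2(2) \rfloor + 2 \\
&= \lfloor \log_2(n) \rfloor - 1 + 2 \\
&= \lfloor \log_2(n) \rfloor + 1
\end{aligned}
$$

and so

$$
\begin{aligned}
t_n &= t_{n/2} + 1 \\
&= \lfloor \log_2(n) \rfloor + 1 + 1 \\
&= \lfloor \log_2(n) \rfloor + 2
\end{aligned}
$$

as required.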


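[n Odd] As $n$ is odd, we must replace $\lfloor \frac{n}{2} \rfloor$ by $\frac{n-1}{2}$. So

$t_n = t_{(n-1)/2} + 1$.

Now use the inductive hypothesis:

$$
\begin{aligned}
t_{(n-1)/2} &= \lfloor \log_2(\tfrac{n-1}{2}) \rfloor + 2 \\
&= \lfloor \log_2(n-1) - \log_2(2) \rfloor + 2 \\
&= \lfloor \log_2(n-1) \rfloor - 1 + 2 \\
&= \lfloor \log_2(n-1) \rfloor + 1.
\end{aligned}
$$

But, as $n$ is odd (and $n > 1$), $\log_2(n)$ cannot be a whole number and so

$\lfloor \log_2(n-1) \rfloor = \lfloor \log_2(n) \rfloor$

The final steps are as for the Even Case.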


We can therefore conclude that the execution time for a Binary Search of an Ordered List is $O(\log_2(n))$. This is very fast compared with the times for searching an unordered list.
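The invocation count analysed above can be observed by instrumenting a recursive implementation. The sketch below is an adaptation of Algorithm 7.2 rather than a transcription: it uses Python's 0-based indexing, returns `None` instead of 0 on failure (since 0 is a valid Python index), and carries the running invocation count in an extra `calls` argument. All of these names and conventions are assumptions made for the example.

```python
def binary_search(s, key, lo=0, hi=None, calls=1):
    """Search sorted list s[lo..hi] for key.

    Returns (index or None, number of invocations so far).
    """
    if hi is None:
        hi = len(s) - 1
    if lo > hi:
        return None, calls            # failed: empty range
    k = (lo + hi) // 2                # floor of the midpoint
    if key == s[k]:
        return k, calls               # succeeded
    if key < s[k]:
        hi = k - 1
    else:
        lo = k + 1
    return binary_search(s, key, lo, hi, calls + 1)   # recursive call


# Searching for a key larger than every element always takes the larger
# (right-hand) side, which is the worst case described in the text.
for n in (1, 2, 3, 4, 7, 8, 100):
    index, calls = binary_search(list(range(n)), n)
    print(n, calls)
```

For a positive integer `n`, Python's `n.bit_length() - 1` equals $\lfloor \log_2 n \rfloor$, which gives a convenient exact comparison with Theorem 7.1.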


7.3 Insertion Sort

This algorithm sorts the sequence $s_1, \dots, s_n$ into increasing order by recursively sorting the first $n - 1$ elements and then inserting $s_n$ in the correct position.

[Input:] A sequence $s_1, s_2, \dots, s_n$.

[Output:] The sequence $s_1, s_2, \dots, s_n$ arranged into ascending order.


Algorithm 7.3 (Insertion Sort)

 1 begin
 2   if n = 1
 3     then
 4       STOP
 5   fi
 6   Insertion Sort (s[1], . . . , s[n-1]) (Recursive Call)
 7   i := n − 1
 8   temp := s[n]
 9   while (i ≥ 1) ∧ (s[i] > temp) do
10     s[i + 1] := s[i]
11     i := i − 1
12   od
13   s[i + 1] := temp
14 end


Let $t_n$ be the number of times the comparison (s[i] > temp) in line 9 is made in the worst case. We can assume that, if $i < 1$, the comparison is not made.

Exercise 7.2 Check that the algorithm correctly sorts the list 5, 2, 7, 3, 1, 4.

Exercise 7.3 Show that the worst case arises when the list to be sorted is in descending order.

Clearly $t_1 = 0$ and $t_2 = 1$.

Exercise 7.4 Check that $t_3 = 3$.

In general, for $n > 1$, the recursive call at line 6 yields $t_{n-1}$ comparisons (as the input to this call is a list of $n - 1$ elements). In the worst case the comparison in the while loop at line 9 is made $n - 1$ times. So the total number of comparisons for a list of size $n$ is $t_{n-1} + n - 1$.


We can write the recurrence relation:

$t_n = t_{n-1} + n - 1$; $t_1 = 0$.

Exercise 7.5 Show that the solution is

$t_n = \frac{n(n-1)}{2}$.

Clearly we can conclude that $t_n = O(n^2)$. More importantly, we find that Insertion Sort and Selection Sort have execution times of the same order. We end this Chapter by considering a more complex sorting algorithm with much faster performance.
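As with Selection Sort, the comparison count can be measured. The sketch below adapts Algorithm 7.3 to Python (0-based indexing, in-place sorting, and a returned comparison count, all of which are choices made for this illustration). Following the convention stated above, the comparison with `temp` is only counted when the index is still in range.

```python
def insertion_sort(s, n=None):
    """Sort s[0:n] in place (ascending); return the number of (s[i] > temp) comparisons."""
    if n is None:
        n = len(s)
    if n <= 1:
        return 0
    comparisons = insertion_sort(s, n - 1)   # recursive call: sorts the first n-1 items
    temp = s[n - 1]
    i = n - 2
    while i >= 0:
        comparisons += 1                     # the comparison on line 9 of the pseudocode
        if s[i] <= temp:
            break
        s[i + 1] = s[i]                      # shift the larger item one place right
        i -= 1
    s[i + 1] = temp
    return comparisons


worst = [6, 5, 4, 3, 2, 1]                   # descending order
print(insertion_sort(worst), worst)
best = [1, 2, 3, 4, 5, 6]                    # already sorted
print(insertion_sort(best), best)
```

Unlike Selection Sort, the count here depends on the input order: an already sorted list needs only one comparison per inserted item.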


7.4 Merge Sort

The Merge Sort algorithm takes a sequence $s_i, \dots, s_j$ and divides it into two nearly equal sequences $s_i, \dots, s_m$ and $s_{m+1}, \dots, s_j$, where $m = \lfloor \frac{i+j}{2} \rfloor$. Each of these sequences is then recursively sorted (by dividing in two and sorting, ...) after which they are combined to produce a sorted version of the original sequence. The process of combining two sorted sequences into a new list (also in ascending order) is called merging.

We begin by describing the Merge algorithm.

[Input:] Two sorted sequences $s_i, \dots, s_m$ and $s_{m+1}, \dots, s_j$.

[Output:] The sequence $c_i, \dots, c_j$ consisting of the two input sequences combined (in ascending order).


Algorithm 7.4 (Merge)

 1 begin
 2   p := i (p is the index in the first input sequence)
 3   q := m + 1 (q is the index in the second input sequence)
 4   r := i (r is the index in the combined output sequence)
 5   while (p ≤ m) ∧ (q ≤ j) do
 6     if (s[p] < s[q]) then
 7       c[r] := s[p]
 8       p := p + 1
 9     else
10       c[r] := s[q]
11       q := q + 1
12     fi
13     r := r + 1
14   od
15   while (p ≤ m) do (Copy remainder of first sequence)
16     c[r] := s[p]
17     p := p + 1
18     r := r + 1
19   od
20   while (q ≤ j) do (Copy remainder of second sequence)
21     c[r] := s[q]
22     q := q + 1
23     r := r + 1
24   od
25 end


Exercise 7.6 Try stepping through the algorithm with the two sequences 1, 3, 4 and 2, 4, 5, 6 as input.

In the Merge algorithm, the comparison of elements in the sequences occurs at line 6, inside the while loop that starts at line 5. This loop will execute as long as (p ≤ m) and (q ≤ j). So, in the worst case, the algorithm requires $j - i$ comparisons. (This worst case occurs when members of the first and second sequences are alternately selected to enter the combined sequence; e.g. s(1, . . . , 3) = (1, 3, 5) and s(4, . . . , 6) = (2, 4, 6).)

We now describe the main algorithm (which uses the Merge algorithm as a tool):


[Input:] A sequence $s_i, \dots, s_j$.

[Output:] The sequence $s_i, \dots, s_j$ in ascending order.

Algorithm 7.5 (Merge Sort)

1 begin
2   if i = j then STOP
3   fi
4   m := ⌊(i + j)/2⌋
5   Merge Sort (s[i], . . . , s[m]) (Recursive Call, 1st Part)
6   Merge Sort (s[m+1], . . . , s[j]) (Recursive Call, 2nd Part)
7   Merge (s(i, . . . , m) and s(m + 1, . . . , j)) (Merge Sorted Parts)
8   s(i, . . . , j) := c(i, . . . , j) (Copy c onto s.)
9 end
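Algorithms 7.4 and 7.5 can be sketched together in Python. This is an adaptation under stated assumptions, not a line-by-line transcription: instead of writing into an auxiliary array `c` and copying back, each function returns a new list together with the number of element comparisons it made, and the split point is chosen so that the first part has $\lfloor (n+1)/2 \rfloor$ items, matching $m = \lfloor (i+j)/2 \rfloor$ for a sequence of $n = j - i + 1$ items.

```python
from math import log2


def merge(left, right):
    """Merge two sorted lists; return (merged list, number of comparisons)."""
    merged, comparisons = [], 0
    p = q = 0
    while p < len(left) and q < len(right):
        comparisons += 1                 # the comparison s[p] < s[q]
        if left[p] < right[q]:
            merged.append(left[p])
            p += 1
        else:
            merged.append(right[q])
            q += 1
    merged.extend(left[p:])              # copy remainder of first sequence
    merged.extend(right[q:])             # copy remainder of second sequence
    return merged, comparisons


def merge_sort(s):
    """Return (sorted copy of s, total number of comparisons)."""
    if len(s) <= 1:
        return list(s), 0
    m = (len(s) + 1) // 2                # size of the first part
    left, c_left = merge_sort(s[:m])
    right, c_right = merge_sort(s[m:])
    merged, c_merge = merge(left, right)
    return merged, c_left + c_right + c_merge


print(merge([1, 3, 5], [2, 4, 6]))       # alternating case: 5 comparisons for 6 items
for n in (2, 8, 64, 1000):
    _, count = merge_sort(list(range(n, 0, -1)))
    print(n, count, 4 * n * log2(n))
```

The last column printed is $4n\log_2 n$, included only for comparison with the measured counts.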


Finally, we need to determine the worst case execution time of the Merge Sort algorithm. Because of the complexity of the argument, we state it as a Theorem. We want to show that the algorithm has a worst case execution time of $O(n \log_2 n)$, i.e. that $t_n \le Cn \log_2 n$ for some positive constant $C$.

Theorem 7.2 The Merge Sort algorithm requires at most $4n \log_2 n$ comparisons to sort $n$ items in the worst case.

Proof: Let $t_n$ be the number of comparisons required by Merge Sort to sort $n$ items in the worst case. Clearly $t_1 = 0$. If $n > 1$, then at line 5 at most $t_{\lfloor (n+1)/2 \rfloor}$ comparisons are needed. Similarly at line 6 at most $t_{\lfloor n/2 \rfloor}$ comparisons are needed.

Exercise 7.7 Check; consider the cases $n$ even and $n$ odd.


&

$

%

At line 7, at most $n - 1$ comparisons are needed. Therefore:

\[ t_n \le t_{\lfloor (n+1)/2 \rfloor} + t_{\lfloor n/2 \rfloor} + n \tag{7.5} \]

(We have replaced $n - 1$ by $n$; note the $\le$. This simplifies the following discussion. Purists may like to try Exercise 2 below.) The price we pay is the introduction of the inequality, which means we do not have a recurrence relation. Now define the sequence $u_1, u_2, \ldots$ by the recurrence relation

\[ u_n = u_{\lfloor (n+1)/2 \rfloor} + u_{\lfloor n/2 \rfloor} + n; \quad u_1 = 0. \tag{7.6} \]

We use the standard trick of considering the special case $n = 2^k$, which yields:

\[ u_{2^k} = 2u_{2^{k-1}} + 2^k; \quad u_1 = 0. \]

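or, defining $v_k = u_{2^k}$:

\[ v_k = 2v_{k-1} + 2^k; \quad v_0 = 0. \]

We can solve this recurrence relation using the Iteration Method.

\begin{align*}
v_k &= 2v_{k-1} + 2^k \\
    &= 2\left[2v_{k-2} + 2^{k-1}\right] + 2^k \\
    &= 2^2 v_{k-2} + 2 \cdot 2^k \\
    &= 2^2\left[2v_{k-3} + 2^{k-2}\right] + 2 \cdot 2^k \\
    &= 2^3 v_{k-3} + 3 \cdot 2^k \\
    &\;\;\vdots \\
    &= 2^k v_0 + k 2^k = k 2^k
\end{align*}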

So, when $n$ is a power of 2 (so that $k = \log_2 n$),

\[ u_n = n \log_2 n. \tag{7.7} \]
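Eq. 7.7 can be spot-checked numerically. The following Python sketch evaluates the recurrence of Eq. 7.6 directly and compares it with $k 2^k$ for the first few powers of 2. The function name `u` and the use of memoisation are choices of this sketch, and a check of finitely many values is of course not a proof.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def u(n):
    """u_n from Eq. 7.6: u_1 = 0, u_n = u_floor((n+1)/2) + u_floor(n/2) + n."""
    if n == 1:
        return 0
    return u((n + 1) // 2) + u(n // 2) + n


# For n = 2^k the closed form is k * 2^k, i.e. n * log2(n).
for k in range(0, 11):
    n = 2 ** k
    assert u(n) == k * n

print(u(8))   # 24
```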

We want to show that $t_n \le 4n \log_2 n$, for $n = 1, 2, 3, \ldots$.

Exercise 7.8 It is easy to check that the result holds for $n = 1, 2$.

We may assume that $n > 2$. We assume the results of Lemma 7.3 and Lemma 7.4 below; namely that

\[ u_n \le u_{n+1}; \quad n = 1, 2, 3, \ldots \tag{7.8} \]

and

\[ t_n \le u_n; \quad n = 1, 2, 3, \ldots \tag{7.9} \]

Let $n > 2$ be arbitrary but fixed. First, choose $k$ such that $2^k \le n < 2^{k+1}$. Note that $k \ge 1$ because $n > 2$, and that $k \le \log_2 n$ because $2^k \le n$. Then, by Eq. 7.7 and Eq. 7.8, we have:

\[ u_n \le u_{2^{k+1}} = (k+1)2^{k+1} \le (k+k)2^{k+1} = 4k2^k \le 4n \log_2 n. \]

Finally, by Eq. 7.9, we have

\[ t_n \le 4n \log_2 n, \]

as required.
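The two facts used in the proof, namely that the sequence $u_n$ is non-decreasing (Eq. 7.8) and that $u_n \le 4n \log_2 n$, can be sanity-checked for small $n$. The Python sketch below does this for $n \le 1000$; the helper name `u_sequence` and the cut-off 1000 are arbitrary choices of this sketch, and the check is numerical evidence only.

```python
import math


def u_sequence(limit):
    """Return [None, u_1, ..., u_limit] using the recurrence of Eq. 7.6."""
    u = [None, 0]
    for n in range(2, limit + 1):
        u.append(u[(n + 1) // 2] + u[n // 2] + n)
    return u


LIMIT = 1000
u = u_sequence(LIMIT)

# Eq. 7.8: u_n <= u_{n+1}
monotone = all(u[n] <= u[n + 1] for n in range(1, LIMIT))

# Bound used in Theorem 7.2: u_n <= 4 n log2(n)
bounded = all(u[n] <= 4 * n * math.log2(n) for n in range(1, LIMIT + 1))

print(monotone, bounded)   # True True
```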

Finally, we tidy up the loose ends by proving the two Lemmas quoted in the above proof.

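Lemma 7.3 Given the definitions used in Theorem 7.2, then

\[ u_n \le u_{n+1}; \quad n = 1, 2, 3, \ldots, \]

i.e. Eq. 7.8 above.

Proof: Use the Strong Principle of Induction.

[Basis Step] $u_1 = 0 < 2 = 2u_1 + 2 = u_2$.

[Inductive Step] Assume that the inequality holds for $k < n$. Now,

\begin{align*}
u_{n+1} &= u_{\lfloor (n+2)/2 \rfloor} + u_{\lfloor (n+1)/2 \rfloor} + n + 1 \\
        &\ge u_{\lfloor (n+1)/2 \rfloor} + u_{\lfloor n/2 \rfloor} + n = u_n,
\end{align*}

as required. (The inductive hypothesis applies term by term: each subscript on the first line is either equal to, or one more than, the corresponding subscript on the second line, and the subscripts on the second line are less than $n$.)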

Lemma 7.4 Given the definitions used in Theorem 7.2, then

\[ t_n \le u_n; \quad n = 1, 2, 3, \ldots, \]

i.e. Eq. 7.9 above.

Proof: Again, use the Strong Principle of Induction.

[Basis Step] By definition of $u_1$, we have $t_1 = 0 \le 0 = u_1$.

[Inductive Step] We assume that the inequality holds for $k < n$. Now, by Eq. 7.5 we have

\begin{align*}
t_n &\le t_{\lfloor (n+1)/2 \rfloor} + t_{\lfloor n/2 \rfloor} + n \\
    &\le u_{\lfloor (n+1)/2 \rfloor} + u_{\lfloor n/2 \rfloor} + n \\
    &= u_n
\end{align*}

as required.

In conclusion, we have found that Merge Sort is $O(n \log_2 n)$ in the worst case and is therefore much more efficient than either Insertion Sort or Selection Sort. The analysis of the average case execution time is of great importance but is too involved for the present course.


1. Consider the following algorithm: ListSearch

Find the largest and smallest elements in an array $s[1], \ldots, s[n]$. The largest element is returned in “large” and the smallest element in “small”.

Algorithm 7.6 (ListSearch)

 1 begin
 2   if (n = 1)
 3     then large := s[1]
 4          small := s[1]
 5          return
 6   fi
 7   m := ⌊(n+1)/2⌋
 8   call LISTSEARCH(s[1], .., s[m])
 9   large_left := large
10   small_left := small
11   call LISTSEARCH(s[m + 1], .., s[n])
12   large_right := large
13   small_right := small
14   if (large_left > large_right)
15     then large := large_left
16     else large := large_right
17   fi
18   if (small_left > small_right)
19     then small := small_right
20     else small := small_left
21   fi
22   return
23 end

Let $t_n$ be the number of comparison steps (lines 14 and 18) required for an input of size $n$. Establish the recurrence relation

\[ t_n = t_{\lfloor n/2 \rfloor} + t_{\lfloor (n+1)/2 \rfloor} + 2. \]

Solve the recurrence relation.
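To experiment with the recurrence in Exercise 1, the following Python sketch implements ListSearch and counts the comparisons between elements. It assumes 0-based indices and returns the results as a tuple instead of setting the global variables "large" and "small"; the name `list_search` and its signature are hypothetical.

```python
def list_search(s, lo=0, hi=None):
    """Return (large, small, comparisons) for the non-empty slice s[lo..hi]."""
    if hi is None:
        hi = len(s) - 1
    if lo == hi:                          # n = 1: no element comparisons
        return s[lo], s[lo], 0
    n = hi - lo + 1
    m = lo + (n + 1) // 2 - 1             # first part has floor((n+1)/2) items
    large_left, small_left, c_left = list_search(s, lo, m)
    large_right, small_right, c_right = list_search(s, m + 1, hi)
    large = large_left if large_left > large_right else large_right
    small = small_right if small_left > small_right else small_left
    return large, small, c_left + c_right + 2   # two comparisons per merge step


print(list_search([7, 2, 9, 4, 1]))   # (9, 1, 8)
```

Running it for several input sizes gives values of $t_n$ against which a proposed solution of the recurrence can be tested.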

2. Show that when the recurrence relation

\[ t_n = t_{\lfloor (n+1)/2 \rfloor} + t_{\lfloor n/2 \rfloor} + n - 1; \quad t_1 = 0 \]

is solved for $n = 2^k$, the result is $t_{2^k} = k2^k - 2^k + 1$.

3. Use a plotting package (e.g. Maple) to plot the functions $n^2$ and $n \log_2 n$. How big does $n$ have to be for the two functions to differ by a factor of 10?

4. Consider the following algorithm, Exponential–1.

[Input:] A real number $a$ and a positive integer $n$.

[Output:] exp = $a^n$.

Algorithm 7.7 (Exponential–1)

 1 begin
 2   if n = 1 then
 3     exp := a
 4     STOP
 5   fi
 6   m := ⌊n/2⌋
 7   exp1 := Exponential–1 (a, m)        (Recursive Call 1)
 8   exp2 := Exponential–1 (a, n − m)    (Recursive Call 2)
 9   exp := exp1 · exp2
10 end

(a) Let $t_n$ be the number of multiplications (line 9) required to compute $a^n$. Find a recurrence relation and initial conditions for this algorithm.

(b) Solve the recurrence relation found in the previous question; your answer should be $t_n = n - 1$, $n = 1, 2, \ldots$. (First solve the recurrence relation for $n = 2^k$, then use induction to show the formula holds for $n = 1, 2, 3, \ldots$.)

(c) What is the order of Exponential–1?

5. Consider the following algorithm, Exponential–2.

[Input:] A real number $a$ and a positive integer $n$.

[Output:] exp = $a^n$.

Algorithm 7.8 (Exponential–2)

 1 begin
 2   if n = 1 then
 3     exp := a
 4     STOP
 5   fi
 6   m := ⌊n/2⌋
 7   exp := Exponential–2 (a, m)    (Recursive Call)
 8   exp := exp · exp
 9   if n is odd then
10     exp := exp · a
11   fi
12 end

Again, let $t_n$ be the number of multiplications (lines 8 and 10).

(a) Show that, for $n > 1$,

\[ t_n = \begin{cases} t_{(n-1)/2} + 2 & \text{if $n$ is odd} \\ t_{n/2} + 1 & \text{if $n$ is even} \end{cases} \]

(b) Find $t_1, t_2, t_3, t_4$.

(c) Solve the above recurrence relation for the case where $n$ is a power of 2. You should find $t_n = \log_2 n$.

(d) Prove that $t_n \le 2 \log_2 n$ for $n = 1, 2, 3, \ldots$.

(e) What is the order of Exponential–2?

6. Which of the two exponential algorithms would you recommend and why?
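For Exercises 4 to 6 it can help to run both algorithms and count the multiplications. The Python sketch below is one possible transcription of Algorithms 7.7 and 7.8. The function names and the returned (value, count) pairs are assumptions of this sketch, which is intended for experimentation and not as a model answer.

```python
def exponential_1(a, n):
    """Return (a**n, multiplications) following Algorithm 7.7."""
    if n == 1:
        return a, 0
    m = n // 2
    exp1, count1 = exponential_1(a, m)        # recursive call 1
    exp2, count2 = exponential_1(a, n - m)    # recursive call 2
    return exp1 * exp2, count1 + count2 + 1   # one multiplication (line 9)


def exponential_2(a, n):
    """Return (a**n, multiplications) following Algorithm 7.8."""
    if n == 1:
        return a, 0
    exp, count = exponential_2(a, n // 2)     # single recursive call
    exp, count = exp * exp, count + 1         # squaring (line 8)
    if n % 2 == 1:
        exp, count = exp * a, count + 1       # extra factor when n is odd (line 10)
    return exp, count


# Tabulate the multiplication counts of both algorithms for a few exponents.
for n in (1, 2, 3, 4, 16, 100):
    print(n, exponential_1(2, n)[1], exponential_2(2, n)[1])
```

Comparing the two columns of counts as $n$ grows gives some evidence for the question posed in Exercise 6.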

A Supplementary Material

The following material is included in the hope that the reader will find it interesting, perhaps even entertaining!

A.1 In Praise of Lectures

The following is quoted verbatim from a short lecture published on the Internet by T. W. Korner of the Dept. of Pure Mathematics and Mathematical Statistics, University of Cambridge. It may be found at: http://www.dpmms.cam.ac.uk/site2000/Staff/korner01.html

The Ibis was a sacred bird to the Egyptians and worshippers acquired merit by burying them with due ceremony. Unfortunately the number of worshippers greatly exceeded the number of birds dying of natural causes so the temples bred Ibises in order that they might be killed and then properly buried.

So far as many mathematics students are concerned university mathematics lectures follow the same pattern. For these students attendance at lectures has a magical rather than a real significance. They attend lectures regularly (religiously, as one might say) taking care to sit as far from the lecturer as possible (it is not good to attract the attention of little understood but powerful forces) and take complete notes. Some lecturers give out the notes at such speed (often aided by the technological equivalent of the Tibetan prayer wheel, an overhead projector) that the congregation is fully occupied but most fail in this task.

The gaps left empty are filled by the more antisocial elements with surreptitious (or not so surreptitious) conversation[a], reading of newspapers and so on whilst the remainder doodle or daydream. The notes of the lecture are then kept untouched until the holidays or, more usually, the week before the exams when they are carefully highlighted with day-glow yellow pens (a process known as revision). When more than 50% of the notes have been highlighted, revision is said to be complete, the magical power of the notes is exhausted and they are carefully placed in a file never to be consulted again.

[a] A lecture is a public performance like a concert or a theatrical event. Television allows channel hopping and conversation. At public performances, private conversation, however interesting to the participants, distracts the rest of the audience from the matter in hand. It must be added that just as good eaters make good cooks so good audiences make for good lectures. A lecturer will give a better lecture to a quiet and attentive audience than to a noisy and inattentive one.

(Sometimes the notes are ceremonially burnt at the end of the student’s university career thereby giving a vivid demonstration of the value placed on the academic side of fifteen years of education.)

Many students would say that there is an element of caricature in my description. They would agree that the lectures they attend are incomprehensible and boring but claim that they have to come to find out what is going to be examined. However, even if this was the case, they would still be behaving irrationally. The invention of the Xerox machine means that only one student need attend each lecture, the remainder being freed for organised games, social events and so on[a].

[a] In the past some universities made lectures compulsory. In Cambridge during the early 19th century attendance at lectures was not compulsory but attendance at Chapel was. ‘The choice’ thundered supporters of compulsory chapel ‘is between compulsory religion and no religion at all’. ‘The difference’ replied one opponent ‘. . . is too subtle for my grasp’.

Nor would this student need to take very extensive notes since everything done in the lecture is better done in the textbooks.

Even the least experienced observer can see that the average lecturer makes lots of little mistakes. Usually these are just ‘mis-speakings’ or misprints sometimes spotted by the lecturer, sometimes vocally corrected by a wide awake member of the audience, sometimes silently corrected by the note taker but often passing unnoticed into students’ notes to puzzle or confuse them later. The experienced observer will note that, though the general outlines of proofs are reasonably well done, the fine detail is often tackled inefficiently or vaguely with, for example, a four line proof where one line will do. A lecture takes place in real time, so to speak, with 50 minutes of mathematics occupying 50 minutes of exposition whereas a chapter of a book that takes ten minutes to read may have taken as many days to compose.

When the author of a book encounters a problem she can stop and think about it; the lecturer must press on regardless. If the notation becomes too complex or it becomes clear that some variation in an early definition would be helpful the author can go back and change it; the lecturer is committed to her earlier choices. When her book is finished the author can reread it and revise at leisure. She will get her friends to read the manuscript and they, viewing it with fresh eyes, will be able to suggest corrections and improvements. Finally, if she is wise, she will offer a graduate student a suitable monetary reward for each error found.

Even with all these precautions, errors will still slip through, but it is almost certain that the book will provide a clearer, simpler and more accurate exposition than any lecture notes[a].

Students may feel under some obligation to go to lectures; their teachers are under no such compulsion. Yet mathematicians go to seminars, colloquium talks, graduate courses all of which are lectures under another name. Why, if lectures have all the disadvantages that I have shown, do they persist in going to them? The surprising answer is that many mathematicians find it easier to learn from lectures than from books. In my opinion there are several interlinked reasons for this.

[a] At one time it was the custom for beginning lecturers to spend their first couple of years producing a perfect set of lecture notes, in effect a book. For the rest of their professional lives their lectures consisted of reading these notes out at dictation speed. Their exposition was then clear, simple and accurate but, in view of the invention of printing some centuries earlier, the same result could have been obtained more efficiently.

(1) A lecture presents the mathematics as a growing thing and not as a timeless snapshot. We learn more by watching a house being built than by inspecting it afterwards.

(2) As I said above, the mathematics of a lecture is composed in real time. If the mathematics is hard the lecturer and, therefore, her audience are compelled to go slowly but they can speed past the easy parts. In a book the mathematics, whether hard or easy, slips by at the same steady pace.

(3) Some lecturers are too shy, some too panic stricken and a few (but very few) too vain or too lazy to respond to the mood of the audience. Most lecturers can sense when an audience is puzzled and respond by giving a new explanation or illustration. When a lecture is going well they can seize the moment to push the audience just a little further than they could normally expect to go. A book cannot respond to our moods.

(4) The author of a book can seldom resist the temptation to add just one extra point. (Why should she, when purchasers and publishers prefer to deal in ‘proper’ books rather than slim pamphlets?) The lecturer is forced by the lecture format to concentrate on the essentials.

(5) In a book the author is on her best behaviour; remarks which go down well in lectures look flat on the printed page. A lecturer can say ‘This is boring but necessary’ or ‘It took me three days to work this out’ in a way an author cannot.

There is another advantage of lectures which is of particular importance to beginners. There is a slogan ‘We learn mathematics by doing mathematics’ which like many slogans conceals one truth behind another. We do not learn to play the violin by playing the violin or rock climbing by climbing rocks. We learn by watching experts doing these things and then imitating them.

Practice is an essential part of learning but unguided practice is generally useless and often worse than useless. People who teach themselves to program acquire a mass of bad programming habits which (unless they wish to remain hackers all their lives) they then have to painfully unlearn. Mathematics textbooks show us how mathematicians write mathematics (admittedly an important skill to acquire) but lectures show us how mathematicians do mathematics.

In his book Science Awakening Van Der Waerden makes the following suggestive remarks about the decline of the ancient Greek mathematical tradition.

Reading a proof in Apollonius requires extended and concentrated study. Instead of a concise algebraic formula, one finds a long sentence, in which each line segment is indicated by two letters which have to be located in the figure. To understand the line of thought one is compelled to transcribe these sentences in modern concise formulas. The ancients did not have this tool; instead they had the oral tradition.

An oral tradition makes it possible to indicate the line segments with the fingers; one can emphasise essentials and point out how the proof was found. All of this disappears in the written formulation of the strictly classical style. The proofs are logically sound, but they are not suggestive. One feels caught in a logical mousetrap, but one fails to see the guiding line of thought.

As long as there was no interruption, as long as each generation could hand over its method to the next, everything went well and the science flourished. But as soon as some external cause brought about an interruption in the oral tradition, and only books remained it became extremely difficult to assimilate the work of the great precursors and next to impossible to pass beyond it.

In my view students should treat lectures not as a note taking exercise but as a dialogue between themselves and the lecturer. They should try to follow the argument as it emerges and not just take it down blindly. ‘But’ the reader will exclaim ‘this is an impossible and futile counsel of perfection’ and, after having thrown these notes into the nearest available wastepaper basket, she may well resolve her indignation into a series of questions.

What about note taking? If you look at experienced mathematicians in a lecture you will see that their note taking is an automatic process which leaves them free to concentrate on the lecture. Most mathematics lecturers follow two conventions which make automatic, or at least semi-automatic, note taking possible:

(a) Everything that is written on the blackboard is to be copied down and nothing that is spoken need be taken down.

(b) It is the responsibility of the lecturer to ensure that what appears on the board forms a decent set of notes without further editing.

Semi-automatic note taking is a skill that has to be learnt, but it seems to be an easy one to acquire.

Would it be better not to take notes? Some mathematicians never take notes but most find that note taking helps them concentrate on the job in hand.

(When the audience at a seminar stop taking notes the experienced seminar speaker knows that they have lost interest and are now using her as a gently babbling source of white noise whilst they think their own thoughts.) Further, even the largest blackboard will eventually be erased and notes allow you to glance back to earlier parts of the lecture.

What should you do if you get lost? The first and most important thing is to remember that most mathematicians are lost most of the time during lectures. (If you do not believe me, ask around.) Attending a mathematics lecture is like walking through a thunderstorm at night. Most of the time you are lost, wet and miserable but at rare intervals there is a flash of lightning and the whole countryside is lit up. Once you realise that your plight is neither an infallible sign of your incurable stupidity nor a clear indication of the lecturer’s total incompetence but simply a normal occurrence, it is clear how you should act.

You should continue taking notes, watching all the time for a point where the lecturer changes the subject (or finishes a proof or whatever) and you can rejoin her exposition as an active partner.

It is obvious that if you study your lecture notes after the lecture with the object of understanding the point where the lecturer has got to you will have a better chance of understanding the next lecture. If you are one of the majority of the students who find this a counsel of perfection then you could at least use the five minutes before the next lecture rereading the last part of your notes. (If you do not even do this, at least ask yourself why you do not do this.)

What should you do if you understand nothing at all of what is going on? At an advanced level it is possible for an entire course of 24 lectures to be devoted to the proof of a single theorem.

If you get really lost in such a course (and probably by the end everybody except, perhaps, the lecturer will be really lost) you stay lost. However first and second year undergraduate lectures consist of a set of short topics chained together in some reasonable order. Even if you completely fail to understand one topic there is no reason why you should not understand the next (even if you do not understand the proof of Cauchy’s theorem you can still use it). On the other hand if incomprehensible topic succeeds incomprehensible topic then taking notes in the hope that all will become clear when you revise is not an adequate response. You should swallow your pride and consult your director of studies.

What about questions? There are three types of questions that an audience can ask.

(a) Questions of Correction If you think the lecturer has missed out a minus sign or written α when she meant β then you should always ask.

No lecturer likes to spend a blackboard of calculations sinking further into the mire because her audience has failed to point out an error on line one. Sometimes very polite students wait until after a lecture to point out errors with the result that the lecturer knows that she has made an error but that she cannot correct it. So the rule is ask and ask at once.

(b) Questions of Incomprehension It takes considerable courage to admit that you do not understand something in front of other people. However if you do not understand something it is likely that many others in the audience will be in the same boat and you will have their silent thanks.

You will usually also have the audible and honest thanks of the lecturer since, as I have indicated above, most lecturers prefer to keep in touch with the audience[a]. (There is a small and unfortunate minority who would prefer to lecture to an empty room, but give your lecturer the benefit of the doubt and ask.)

(c) Questions of Extension If you are in the happy position of understanding everything the lecturer says then you may wish her to go further into a topic. Your modest request to hear more about the general case is unlikely to go down well with the rest of the audience who are still struggling with the particular case.

[a] I have often thought that the technology of the TV game-show should be adapted to the lecture theatre. Each seat would have a concealed button which the auditors could press when they wanted the lecturer to slow down. The ‘votes’ could be added and the result shown on a dial visible only to the lecturer who would then be in the position of a motorist trying to keep to the speed limit.

Such questions should be left until after the lecture when the lecturer will be happy to oblige (few mathematicians can resist an invitation to talk more about their subject).

If you find yourself asking more than one question per lecture, examine your motives.

It is noticeable that at seminars it is often the most distinguished mathematicians who ask the simplest (if they were not so distinguished, one might say naive) questions. It is, I suppose, possible that they only began to ask such questions after they became distinguished, but I believe that a willingness to ask when they do not know is a characteristic of many great minds[a].

[a] Though there is no unique recipe for greatness. When the very great physicist Bohr was visiting the great physicist Landau in Moscow he was invited to give a talk to the graduate students with Landau translating. Bohr concluded his talk with the assertion ‘I attribute my success to the fact that I have never been afraid to let my students tell me what a fool I am’. The Russian translation ended ‘I attribute my success to the fact that I have never been afraid to tell my students what fools they are’.

Mathematical sayings tend to have multiple attributions (perhaps because mathematicians remember processes rather than isolated facts like names). The ancient Greeks attributed the following saying to Euclid among others. Ptolemy, King of Egypt, asked Euclid to teach him geometry. ‘O King’ replied Euclid ‘in Egypt there are royal roads and roads for the common people, but there are no royal roads in geometry.’ Mathematics is hard, there are no easy ways to understanding but the lecture, properly used, is the easiest way that I know.

A.2 Wiles’ Proof

The following is an excerpt from an extensive discussion of the history of Fermat’s Last Theorem and attempts to prove it; it can be found at http://www.vertigo.co.uk/fermat/book10.htm

Figure A.1: Pierre de Fermat

“I have discovered a truly marvellous proof, which this margin is too narrow to contain . . . ” With these words the seventeenth-century French mathematician Pierre de Fermat threw down the gauntlet to future generations. Fermat’s Last Theorem looked simple enough for a child to solve, yet the finest mathematical minds would be baffled by the search for the proof.

Over three hundred and fifty years were to pass before a mild-mannered Englishman finally cracked the mystery in 1995. Fermat by then was far more than a theorem. Whole lives had been devoted to the quest for a solution.

There was the nineteenth century mathematician Sophie Germain, who had to take on the identity of a man to conduct her research in a field forbidden to females.

The dashing Evariste Galois scribbled down the results of his research deep into the night before sauntering out to die in a duel. The Japanese genius Yutaka Taniyama killed himself in despair, while the German industrialist Paul Wolfskehl claimed Fermat had saved him from suicide.

Figure A.2: Andrew Wiles

Andrew Wiles had dreamed of proving Fermat ever since he first read about the theorem as a boy of ten in his local library. Whilst the hopes of others had been dashed, his dream was destined to come true, but only after years of toil and frustration, of exhilarating breakthrough and crashing disappointment. The true story of how mathematics’ most challenging problem was made to yield up its secrets is a thrilling tale of endurance, ingenuity and inspiration.

To succeed Wiles required enormous determination to overcome the periods of self-doubt. He describes his experience of doing mathematics in terms of a journey through a dark unexplored mansion: “You enter the first room of the mansion and it’s completely dark. You stumble around bumping into the furniture but gradually you learn where each piece of furniture is. Finally, after six months or so, you find the light switch, you turn it on, and suddenly it’s all illuminated. You can see exactly where you were. Then you move into the next room and spend another six months in the dark. So each of these breakthroughs, while sometimes they’re momentary, sometimes over a period of a day or two, they are the culmination of, and couldn’t exist without, the many months of stumbling around in the dark that precede them.”

In 1993, Wiles returned to Cambridge to announce his proof. Headlines around the world declared that the Last Theorem had been solved, but at this point the proof had not been checked. It turned out that there was a flaw in Wiles’ argument, and the proof crashed to the ground.

Wiles immediately locked himself away again, trying to fix the error. The mistake seemed to become harder and harder to overcome as the months passed. Eventually he invited a former student, Richard Taylor, to work with him, but still there seemed to be no progress.

On September 19th 1994, Wiles made the crucial breakthrough: “I was sitting at my desk one Monday morning, when suddenly, totally unexpectedly, I had this incredible revelation. It was so indescribably beautiful, it was so simple and so elegant. I couldn’t understand how I’d missed it and I just stared at it in disbelief for twenty minutes. Then during the day I walked around the department, and I’d keep coming back to my desk looking to see if it was still there. It was still there. I couldn’t contain myself, I was so excited. It was the most important moment of my working life. Nothing I ever do again will mean as much.”
