
Lecture notes for Mathematical Logic I

Phil 513 — Kevin C. Klement

Fall 2011


CONTENTS

Introduction
  A. The Topic
  B. Metalanguage and Object Language
  C. Set Theory
  D. Mathematical Induction

1 Metatheory for Propositional Logic
  A. The Syntax of Propositional Logic
  B. The Semantics of Propositional Logic
  C. Reducing the Number of Connectives
  D. Axiomatic Systems and Natural Deduction
  E. Axiomatic System L
  F. The Deduction Theorem
  G. Soundness and Consistency
  H. Completeness
  I. Independence of the Axioms

2 Metatheory for Predicate Logic
  A. The Syntax of Predicate Logic
  B. The Semantics of Predicate Logic
  C. Countermodels and Semantic Trees
  D. An Axiom System
  E. The Deduction Theorem in Predicate Logic
  F. Doing without Existential Instantiation
  G. Metatheoretic Results for System PF
  H. Identity Logic

3 Peano Arithmetic and Recursive Functions
  A. The System S
  B. The Quasi-Fregean System F
  C. Numerals
  D. Ordering, Complete Induction and Divisibility
  E. Expressibility and Representability


  F. Primitive Recursive and Recursive Functions
  G. Number Sequence Encoding
  H. Representing Recursive Functions in System S

4 Gödel's Results and their Corollaries
  A. The System ,
  B. System S as its Own Metalanguage
  C. Arithmetization of Syntax
  D. Robinson Arithmetic
  E. Diagonalization
  F. ω-Consistency, True Theories and Completeness
  G. Gödel's First Incompleteness Theorem
  H. Church's Thesis
  I. Löb's Theorem / Gödel's Second Theorem
  J. Recursive Undecidability
  K. Church's Theorem


INTRODUCTION

A. The Topic

This is a course in logical metatheory, i.e., the study of logical systems. It is probably very different from (and considerably harder than) any logic courses you may have taken before. Those courses (such as our Phil 110 and 310) may have involved learning logical systems and the symbols they employ: what they are, how they are used, and how they relate to English. You learned how to construct formal deductions or proofs within certain logical systems. What you were proving was not anything about a logical system; it was instead either not about anything at all (because the problem never told you what the symbols meant in that context), or about some made-up people or things. (E.g., perhaps you had to prove something about some wacky folks named 'Jay' and 'Kay'.)

If you have taken Intermediate Logic you have mastered the "boring" part of logic: classical propositional logic and first-order predicate logic. You may have been exposed to some relatively more advanced and difficult topics: free logic, basic set theory, and, if you're lucky, modal logic. However, the more advanced logical systems become, the more controversial they get. For example, I think "free logic" is a philosophical disaster and should be taught only as something to avoid. Obviously, my colleagues don't always agree with me. Philosophers (and others) widely disagree about what the right form of modal logic is. So if you plan on continuing your logical education, it's probably about high time you started thinking about what makes a logical system a good one. Does it need to conform to natural language? Does it need to conform to the metaphysical structure of the world? Does it need to conform to the ordinary reasoning habits of philosophers and mathematicians when they're not self-consciously thinking about logic? These are difficult questions.

Minimally, we can pretty much all agree on the following. For a logical system to be a good one, it has to have the features it was designed to have. For example, the derivation rules for propositional logic you learned in your first logic course were designed so that any argument for which it is possible to construct a derivation of the conclusion from the premises is a valid one according to truth tables. If a system of derivation were set up with this aim but included, along with modus ponens and modus tollens, additionally the inference rule of affirming the consequent, i.e.,

From A ⇒ B and B infer A

then clearly the system would be inadequate, because there would be invalid arguments for which one could construct a derivation.

Logical metatheory is the branch of logic that studies logical systems themselves. In this course, rather than using a logical system to prove things about Jay and Kay, we'll be proving things about logical systems. However, it's best not to start with the controversial ones. People disagree about how to do relevance logic, or deontic logic, or paraconsistent logic, and even whether or not these branches of logic are worth doing at all. These are not the ideal places to begin to learn how to prove things about logical systems; it's best to start at the beginning. We'll be starting with propositional logic. In our first unit, we'll be proving about our logical system for propositional logic that every deduction possible within it is valid according to truth tables, and conversely, that every argument valid according to truth tables has a corresponding deduction, or in other words that it is sound and complete. We'll then move on to proving things about first-order predicate logic.

Lastly, we'll move on to the logic of mathematics, the basic reasoning patterns involved in mathematics and the basic principles of arithmetic. We'll show how, as the logical system under study gets more complex, so does the apparatus one needs in order to prove things about it. We'll also discover some interesting results about logical systems of a certain sort, specifically that they don't always quite live up to their original intent. For example, we will be studying the attempt made in the late 1890s and early 1900s to fully capture all truths of elementary number theory within a single deductive system, and show that the attempt failed, and even that what they had hoped is impossible! This is one of the results of Gödel's incompleteness theorems. But let's start at the beginning: metatheory for simple propositional logic.

B. Metalanguage and Object Language

Modern logical systems tend to make use of their own symbolic languages; hence among the things studied in logical metatheory are the languages of logical systems.

Definition: The object language is the languagebeing studied, or the language under discussion.

Definition: The metalanguage is the languageused when studying the object language.

In this course, the object languages will be the symbolic languages of propositional and predicate logic. The metalanguage is English. To be more precise, it is a slightly more technical variant of English than ordinary English. This is because, in addition to the symbols of our object language, we'll be adding some technical terms and even some symbols to ordinary English to make our lives easier.

The use/mention distinction

English already has some handy devices that make it a good metalanguage. Specifically, it has things like quotation marks that we can use for mentioning an expression as opposed to using it. Kevin is not a name, but "Kevin" is. Many words are verbs, but "verbs" itself is only one word and it is not a verb. This sentence mentions the word "however". This sentence, however, both uses and mentions the word "however". You get the idea.

The logic of the metalanguage

We'll be using the metalanguage to prove things about the object language, and proving anything requires logical vocabulary. Luckily, English has handy words like "all", "or", "and", "not", "if", and it allows us to add new words if we want, like "iff" for "if and only if". Of course, our object languages also have logical vocabularies, and have signs like "⇒", "¬", "∨", "∀". But we'd better restrict those signs to the object language unless we want to get ourselves confused.

But we do want our metalanguage to be very clear and precise. For that reason, when we use the word "or", unless noted otherwise, we mean by this the inclusive meaning of "or". Similarly, if we use the phrase "if . . . then . . . " in this class we always mean the material conditional unless stated otherwise. (This makes our metalanguage slightly more precise than ordinary English.) The same sorts of logical inferences that apply in the object language also apply in the metalanguage. So

If blah blah blah then yadda yadda.
Blah blah blah.
Therefore, yadda yadda.

. . . is a valid inference form. You have to use logic to study logic. There's no getting away from it. However, I'm not going to bother stating all the logical rules that are valid in the metalanguage, since I'd need to do that in the metametalanguage, and that would just get me started on an infinite regress. The rule of thumb is: if it's OK in the object language, it's OK in the metalanguage too.

Metalinguistic variables

Ordinary English doesn't really use variables, but they make our lives a lot easier. Since the metalanguage is usually used in this course to discuss the object language, the variables we use most often in the metalanguage are variables used to talk about all or some expressions of the object language. Especially when we get to predicate logic, where the object language itself contains variables, we don't want to get the variables of the object language confused with those of the metalanguage. Since predicate logic uses letters like 'x' and 'y' as variables in the object language, it is important to be clear about which language a given variable belongs to. This can be done by making the metalanguage's variables distinctive. For example, I use fancy script letters like 'A' and 'B' in the metalanguage to mean any object-language expression of a certain specified type. For example, I might write something like:

If A is a sentence of predicate logic, then A contains no variables not bound by a quantifier.

Notice that, in that statement, the variable 'A' is used, not mentioned. The letter 'A' is not itself used in predicate logic, and contains no variables bound or free. It's something I use in the metalanguage in place of mentioning a particular object language expression. So A might be "Fa" or it might be "(∀x)(Fx ⇒ Gx)", etc.

A typical use of these is to represent any object language expression or set of expressions matching certain patterns. This happens for example in stating the inference rules of the object language. Just look at the lists of rules you used when learning logic. Whichever book you used, modus ponens didn't look (or shouldn't have looked) like:

P ⇒ Q
P
Q

Instead, it looked something like:

A ⇒ B
A
B

Why? Well, if you used the object-language version, modus ponens would only apply when the antecedent is 'P' and the consequent is 'Q', and so the following wouldn't have counted as an instance of the rule:

(S ∨ T) ⇒ R
S ∨ T
R

The only way to get the rule to cover an infinite number of possible cases is to state it schematically, i.e., using variables of the metalanguage to describe any object language expressions of certain forms. Hence, variables in the metalanguage used in this way are called schematic letters.

In your homework and exams, you may prefer to use Greek letters instead of script letters, which may be easier to draw in a more distinctive way. You may do whatever you wish provided I can tell the difference between object language and metalanguage variables.

Schematic letters will be used every single day in this class. Better make friends with them quick.

C. Set Theory

Generally, in order to do logical metatheory for a given logical system, the logical apparatus of the metalanguage has to be at least as complex, and usually more complex, than that of the object language. So in order to do metatheory for propositional and predicate logic, we'll need something stronger, and in particular, we'll need some set theory. Note that this course is not a course on set theory; we're not going to be studying logical systems for set theory. Instead, we're going to presuppose or use some set theoretical notation in our metalanguage, i.e., English. Therefore, you should think of all the signs and variables in this section as an expansion of English. This semester at least, set theory will be something we use when we study propositional and predicate logic; not something we are studying.


This means that we can be relatively informal about it. This is good because the exact rules and principles of set theory are still controversial. There are different systems, e.g., "ZF set theory", "NBG set theory", "the theory of types", and so on. Luckily we don't need to get into those details, because all we'll need for this course is the rudiments they all share.

Sets

Definition: A set is a collection of entities for which it is determined, for every entity of a given type, that the entity either is or is not included in the set.

Definition: An urelement is a thing that is not a set.

Definition: An entity A is a member of set Γ iff it is included in that set.

We write this as: "A ∈ Γ". We write "A ∉ Γ" to mean that A is not a member of Γ.

Sets are determined entirely by their members: for sets Γ and ∆, Γ = ∆ iff for all A, A ∈ Γ iff A ∈ ∆.

Definition: A singleton or unit set is a set containing exactly one member.

"{A}" means the set containing A alone. Generally, "{A1, . . . , An}" means the set containing all of A1, . . . , An, but nothing else.

The members of sets are not ordered, so from {A, B} = {C, D} one cannot infer that A = C, only that either A = C or A = D.

Definition: If ∆ and Γ are sets, ∆ is said to be a subset of Γ, written "∆ ⊆ Γ", iff all members of ∆ are members of Γ; and ∆ is said to be a proper subset of Γ, written "∆ ⊂ Γ", iff all members of ∆ are members of Γ, but not all members of Γ are members of ∆.

Definition: If ∆ and Γ are sets, the union of ∆ and Γ, written "∆ ∪ Γ", is the set that contains everything that is a member of either ∆ or Γ.

Definition: The intersection of ∆ and Γ, written "∆ ∩ Γ", is the set that contains everything that is a member of both ∆ and Γ.

Definition: The relative complement of ∆ and Γ, written "∆ − Γ", is the set containing all members of ∆ that are not members of Γ.

Definition: The empty set or null set, written "∅", "Λ" or "{ }", is the set with no members.

Definition: If Γ and ∆ are sets, then they are disjoint iff they have no members in common, i.e., iff Γ ∩ ∆ = ∅.
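
Because we only use the rudiments of set theory, a quick computational illustration may help fix the definitions just given. The following is a minimal sketch of my own (not part of the notes), modelling finite sets with Python's frozenset; the variable names are arbitrary.

# A minimal sketch of the finite-set operations defined above, using frozensets.
Gamma = frozenset({1, 2, 3})
Delta = frozenset({2, 3, 4})

print(Gamma | Delta)             # union of Gamma and Delta: {1, 2, 3, 4}
print(Gamma & Delta)             # intersection: {2, 3}
print(Gamma - Delta)             # relative complement: {1}
print(Gamma <= (Gamma | Delta))  # subset test: True
print(Gamma < Gamma)             # proper subset test: False
print(Gamma.isdisjoint(Delta))   # disjointness: False, since 2 and 3 are shared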

Ordered n-tuples and relations

Definition: An ordered n-tuple, written "〈A1, . . . , An〉", is something somewhat like a set, except that the elements are given a fixed order, so that 〈A1, . . . , An〉 = 〈B1, . . . , Bn〉 iff Ai = Bi for all i such that 1 ≤ i ≤ n.

An ordered 2-tuple, e.g., 〈A, B〉, is also called an ordered pair. An entity is identified with its 1-tuple.

Definition: If Γ and ∆ are sets, then the Cartesian product of Γ and ∆, written "Γ × ∆", is the set of all ordered pairs 〈A, B〉 such that A ∈ Γ and B ∈ ∆.

Generally, "Γⁿ" is used to represent the set of all ordered n-tuples consisting entirely of members of Γ. Notice that Γ² = Γ × Γ.

The following definition is philosophically problematic, but a common way of speaking in mathematics.

Definition: An n-place relation (in extension) on set Γ is any subset of Γⁿ.

A 2-place relation is also called a binary relation. Binary relations are taken to be sets of ordered pairs. A 1-place relation is also called (the extension of) a property.

Definition: If R is a binary relation, then the domain of R is the set of all A for which there is a B such that 〈A, B〉 ∈ R.


Definition: If R is a binary relation, the range of R is the set of all B for which there is an A such that 〈A, B〉 ∈ R.

Definition: The field of R is the union of the domain and range of R.

Definition: If R is a binary relation, R is reflexive iff 〈A, A〉 ∈ R for all A in the field of R.

Definition: If R is a binary relation, R is symmetric iff for all A and B in the field of R, 〈A, B〉 ∈ R only if 〈B, A〉 ∈ R.

Definition: If R is a binary relation, R is transitive iff for all A, B and C in the field of R, if 〈A, B〉 ∈ R and 〈B, C〉 ∈ R then 〈A, C〉 ∈ R.

Definition: A binary relation R is an equivalence relation iff R is symmetric, transitive and reflexive.

Definition: If R is an equivalence relation, then the R-equivalence class on A, written "[A]R", is the set of all B such that 〈A, B〉 ∈ R.
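
For small finite relations, the properties just defined can be checked mechanically. The sketch below is my own illustration (the helper names are made up): it treats a binary relation in extension as a Python set of ordered pairs, tests reflexivity, symmetry and transitivity over its field, and collects an equivalence class.

# A minimal sketch: checking relation properties for a small finite relation.
def field(R):
    return {a for (a, b) in R} | {b for (a, b) in R}

def is_reflexive(R):
    return all((a, a) in R for a in field(R))

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def equivalence_class(R, a):
    return {b for (x, b) in R if x == a}

# "Has the same parity as" on {0, 1, 2, 3}: an equivalence relation.
R = {(a, b) for a in range(4) for b in range(4) if a % 2 == b % 2}
print(is_reflexive(R), is_symmetric(R), is_transitive(R))   # True True True
print(equivalence_class(R, 0))                              # {0, 2}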

Functions

Definition: A function (in extension) is a binary relation which, for all A, B and C, if it includes 〈A, B〉 then it does not also contain 〈A, C〉 unless B = C.

So if F is a function and A is in its domain, then there is a unique B such that 〈A, B〉 ∈ F; this unique B is denoted by "F(A)".

Definition: An n-place function is a function whose domain consists of n-tuples. For such a function, we write "F(A1, . . . , An)" to abbreviate "F(〈A1, . . . , An〉)".

Definition: An n-place operation on Γ is a function whose domain is Γⁿ and whose range is a subset of Γ.

Definition: If F is a function, then F is one-one iff for all A and B in the domain of F, F(A) = F(B) only if A = B.
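
Being a function, and being one-one, are likewise mechanical checks when the relation in question is a finite set of pairs. A minimal sketch, again my own and purely illustrative:

# A minimal sketch: a function in extension as a finite set of ordered pairs.
def is_function(R):
    # No A is paired with two distinct values.
    return all(b == d for (a, b) in R for (c, d) in R if a == c)

def is_one_one(F):
    # No two distinct arguments share a value.
    return all(a == c for (a, b) in F for (c, d) in F if b == d)

F = {(0, 0), (1, 2), (2, 4), (3, 6)}     # n maps to 2n on {0, 1, 2, 3}
print(is_function(F), is_one_one(F))      # True True
print(dict(F)[2])                         # F(2) = 4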

Cardinal numbers

Definition: If Γ and ∆ are sets, then they are equinumerous, written "Γ ≅ ∆", iff there is a one-one function whose domain is Γ and whose range is ∆.

Definition: Sets Γ and ∆ have the same cardinality or cardinal number if and only if they are equinumerous.

Definition: If Γ and ∆ are sets, then the cardinal number of Γ is said to be smaller than the cardinal number of ∆ iff there is a set Z such that Z ⊆ ∆ and Γ ≅ Z, but there is no set W such that W ⊆ Γ and W ≅ ∆.

Definition: If Γ is a set, then Γ is denumerable iff Γ is equinumerous with the set of natural numbers {0, 1, 2, 3, 4, . . . , (and so on ad inf.)}.

Definition: Aleph null, also known as aleph naught, written "ℵ0", is the cardinal number of any denumerable set.

Definition: If Γ is a set, then Γ is finite iff either Γ = ∅ or there is some positive integer n such that Γ is equinumerous with the set {1, . . . , n}.

Definition: A set is infinite iff it is not finite.

Definition: A set is countable iff it is either finite or denumerable.

Homework
Assuming that Γ, ∆ and Z are sets, R is a relation, F is a function, and A and B are any entities, informally verify the following:
(1) A ∈ {B} iff A = B
(2) if Γ ⊆ ∆ and ∆ ⊆ Z then Γ ⊆ Z
(3) if Γ ⊆ ∆ and ∆ ⊆ Γ then Γ = ∆
(4) (Γ ∪ ∆) ∪ Z = Γ ∪ (∆ ∪ Z)
(5) (Γ ∩ ∆) ∩ Z = Γ ∩ (∆ ∩ Z)
(6) Γ ∩ ∅ = ∅ and Γ ∪ ∅ = Γ
(7) Γ − Γ = ∅
(8) (Γ ∩ ∆) ∪ (Γ − ∆) = Γ
(9) Γ¹ = Γ
(10) If R is an equivalence relation, then ([A]R = [B]R iff 〈A, B〉 ∈ R) and (if [A]R ≠ [B]R then [A]R and [B]R are disjoint).
(11) Addition can be thought of as a 2-place operation on the set of natural numbers.
(12) Γ ≅ Γ
(13) The set of even non-negative integers is denumerable.
(14) The set of all integers, positive and negative, is denumerable.

D. Mathematical Induction

We'll also be expanding the logic of the metalanguage by allowing ourselves the use of mathematical induction, a powerful tool of mathematics.

Definition: The principle of mathematical induction states the following:
If (φ is true of 0), then if (for all natural numbers n, if φ is true of n, then φ is true of n + 1), then φ is true of all natural numbers.

To use the principle of mathematical induction to arrive at the conclusion that something is true of all natural numbers, one needs to prove the two antecedents, i.e.:

Base step. φ is true of 0

Induction step. For all natural numbers n, if φ is true of n, then φ is true of n + 1.

Typically, the induction step is proven by means of a conditional proof in which it is assumed that φ is true of n, and from this assumption it is shown that φ must be true of n + 1. In the context of this conditional proof, the assumption that φ is true of n is called the inductive hypothesis.

From the principle of mathematical induction, one can derive a related principle:

Definition: The principle of complete (or strong) induction states that:
If (for all natural numbers n, whenever φ is true of all numbers less than n, φ is also true of n) then φ is true of all natural numbers.

In this class, we rarely use these principles in the metalanguage. Instead, we use some corollaries that come in handy in the study of logical systems. Mendelson does not give these principles special names, but I will.

Definition: The principle of wff induction states that:
For a given logical language, if φ holds of the simplest well-formed formulas (wffs) of that language, and φ holds of any complex wff provided that φ holds of those simpler wffs out of which it is constructed, then φ holds of all wffs.

This principle is often used in logical metatheory. It is a corollary of mathematical induction. Actually, it is a version of it. Let φ′ be the property a number has if and only if all wffs of the logical language having that number of logical operators have φ. If φ is true of the simplest well-formed formulas, i.e., those that contain zero operators, then 0 has φ′. Similarly, if φ holds of any wffs that are constructed out of simpler wffs provided that those simpler wffs have φ, then whenever a given natural number n has φ′, then n + 1 also has φ′. Hence, by mathematical induction, all natural numbers have φ′, i.e., no matter how many operators a wff contains, it has φ. In this way wff induction simply reduces to mathematical induction.

Similarly, this principle is usually utilized by proving the antecedents, i.e.:

Base step. φ is true of the simplest well-formed formulas (wffs) of that language; and

Induction step. φ holds of any wffs that are constructed out of simpler wffs provided that those simpler wffs have φ.

Again, the assumption made when establishing the induction step that φ holds of the simpler wffs is called the inductive hypothesis.

We’ll also be using:

Definition: The principle of proof induction:
In a logical system that contains derivations or proofs, if φ is true of a given step of a proof whenever φ is true of all previous steps of the proof, then φ is true of all steps of the proof.


The principle of proof induction is an obvious corollary of the principle of complete induction. The steps in a proof can be numbered; we're just applying complete induction to those numbers.

Homework
Answer any of these we don't get to in class:
(1) Let φ be the property a number x has just in case the sum of all numbers leading up to and including x is x(x + 1)/2. Use the principle of mathematical induction to show that φ is true of all natural numbers.
(2) Let φ be the property a number x has just in case it is either 0 or 1 or it is evenly divisible by a prime number greater than 1. Use the principle of complete induction to show that φ is true of all natural numbers.
(3) Let φ be the property a wff A of propositional logic has if and only if it has an even number of parentheses. Use the principle of wff induction to show that φ holds of all wffs of propositional logic. (If needed, consult the definition of a wff in propositional logic in the next unit.)
(4) Consider a logical system for propositional logic that has only one inference rule: modus ponens. Use the principle of proof induction to show that every line of a proof in this system is true if the premises are true.


UNIT 1

METATHEORY FOR PROPOSITIONAL LOGIC

A. The Syntax of Propositional Logic

We finally turn to our discussion of the logical metatheory for propositional logic (also known as sentential logic). In particular, we shall limit our study to classical (bivalent) truth-functional propositional logic. We first sketch the make-up of the object language under study. The syntax of a language consists of the rules governing how its expressions can and cannot be combined.

The basic building blocks are statement letters, connectives and parentheses.

Definition: A statement letter is any uppercase letter of the Roman alphabet written with or without a numerical subscript.

Examples: 'A', 'B', 'P', 'Q1', 'P13', and 'N123' are all statement letters. The numerical subscripts are used in case we would ever need to deal with more than 26 simple statements at once. Hence 'P1' and 'P2' are counted as different statement letters.

Definition: A propositional connective is any of the signs '¬', '⇒', '⇔', '∧' and '∨'.

Definition: A well-formed formula* (abbreviated wff) is defined recursively as follows:

(i) any statement letter is a wff;
(ii) if A is a wff, then so is ¬A;¹
(iii) if A and B are wffs, then so is (A ⇒ B);
(iv) if A and B are wffs, then so is (A ⇔ B);
(v) if A and B are wffs, then so is (A ∨ B);
(vi) if A and B are wffs, then so is (A ∧ B);
(vii) nothing that cannot be constructed by repeated applications of the above steps is a wff.

* The above definition is provisional; we shall later amend it. This tells us everything we need to know about the "syntax" or "grammar" of propositional logic.
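
To see the recursive definition at work, here is a small sketch of my own (not part of the notes) that represents object-language wffs as nested Python tuples rather than strings: a statement letter is a string such as 'P', a negation is ('not', A), and a binary compound is ('imp', A, B), ('iff', A, B), ('or', A, B) or ('and', A, B). The checker simply mirrors clauses (i)-(vii).

# A minimal sketch: wffs as nested tuples, checked by structural recursion.
def is_statement_letter(x):
    # An uppercase Roman letter, optionally followed by a numerical subscript, e.g. 'P' or 'Q1'.
    return (isinstance(x, str) and len(x) >= 1 and x[0].isalpha() and x[0].isupper()
            and (len(x) == 1 or x[1:].isdigit()))

def is_wff(x):
    if is_statement_letter(x):
        return True                                    # clause (i)
    if isinstance(x, tuple) and len(x) == 2 and x[0] == 'not':
        return is_wff(x[1])                            # clause (ii)
    if isinstance(x, tuple) and len(x) == 3 and x[0] in ('imp', 'iff', 'or', 'and'):
        return is_wff(x[1]) and is_wff(x[2])           # clauses (iii)-(vi)
    return False                                       # clause (vii)

# (A ⇒ (B ∨ ¬C)) and an ill-formed candidate:
print(is_wff(('imp', 'A', ('or', 'B', ('not', 'C')))))   # True
print(is_wff(('not', 'A', 'B')))                          # False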

You may be familiar with a slightly different notation. I am sticking with the book.

                         Mendelson's sign   Alternatives
Negation                 ¬                  ∼, −
Conjunction              ∧                  &, •
Disjunction              ∨                  +
Material conditional     ⇒                  →, ⊃
Material biconditional   ⇔                  ↔, ≡

Feel free to use whatever signs you prefer. I might not even notice.

¹ Here we are not really using the phrase "¬A", since this definition is in the metalanguage and '¬' is not part of English. Nor, however, are we mentioning it, since 'A' is not a part of the object language. Really we should be using special "quasi-quotation" marks, also known as Quine corners, where ⌜¬A⌝ is the object language expression formed by concatenating '¬' to whatever expression A is. Although imprecise, I forgo Quine corners and rely just on context, to avoid a morass of these marks, and to allow for another use of the same notation Mendelson uses in chap. 3.


Parentheses Conventions

The chart above also gives the ranking of the connectives used when omitting parentheses. Sometimes when a wff gets really complicated, it's easier to leave off some of the parentheses. Because this leads to ambiguities, we need conventions regarding how to read them. When parentheses are omitted, and it is unclear which connective has greater scope, the operator nearer the top of the list above should be taken as having narrower scope, and the operator nearer the bottom of the list should be taken as having wider scope. For example:

A ⇒ B ∨ C

is an abbreviation of:

(A ⇒ (B ∨ C))

whereas:

A ⇒ B ⇔ C

is an abbreviation of:

((A ⇒ B) ⇔ C)

When the operators are the same, the convention is association to the left, i.e., the leftmost occurrence is taken to have narrowest scope. So

A ⇒ B ⇒ C

is an abbreviation of:

((A ⇒ B) ⇒ C)

Obviously, for '∨' and '∧', this last convention is less important, since (A ∨ B) ∨ C is logically equivalent with A ∨ (B ∨ C), and similarly, (A ∧ B) ∧ C is equivalent with A ∧ (B ∧ C).

Sometimes parentheses cannot be left off without making the wff mean something else:

A ⇒ (B ⇔ C)

cannot be written

A ⇒ B ⇔ C.

B. The Semantics of Propositional Logic

To give a semantic theory for a language is to specify the rules governing the meanings of the expressions of that language. In truth-functional propositional logic, however, nothing regarding the meaning of the statement letters over and above their truth or falsity is relevant for determining the truth conditions of the complex wffs in which they appear. Moreover, the meanings of the connectives are thought to be exhausted by the rules governing how the truth-value of the wffs they are used to construct depends on the truth values of the statement letters out of which they are constructed.

In short, everything relevant to the logical semantics of a wff of propositional logic is given by its truth table. I assume you already know how to construct truth tables. E.g.:

(P ∨ Q) ∨ ¬ (Q ⇒ (P ⇔ Q))
 T T T  T F  T T  T T T
 T T F  T F  F T  T F F
 F T T  T T  T F  F F T
 F F F  F F  F T  F T F

Roughly, this shows us everything we need to know about the meaning of the wff "(P ∨ Q) ∨ ¬(Q ⇒ (P ⇔ Q))".

To get serious with our study, we need a number of definitions.

Definition: A truth-value assignment is any function whose domain is the statement letters of propositional logic, and whose range is a nonempty subset of the truth values {TRUTH, FALSITY} (T and F for short).

Informally, each row of a truth table represents a different truth-value assignment. Each row represents a different possible assignment of truth values to the statement letters making up the wff or wffs in question.

In virtue of the way it is constructed out of truth-functional connectives, every wff is determined to be either true or false (and not both) for any given truth-value assignment to the statement letters making it up. The truth value of a statement for a given truth-value assignment is represented in the final column of a truth table, underneath its main connective.
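
The fact that a truth-value assignment settles the truth value of every wff is itself a structural recursion, and can be written out as one. The sketch below is my own illustration, reusing the nested-tuple representation of wffs from the earlier sketch; an assignment is given as a Python dictionary from statement letters to True/False.

# A minimal sketch: evaluating a wff under a truth-value assignment.
def truth_value(wff, assignment):
    if isinstance(wff, str):                       # statement letter
        return assignment[wff]
    op = wff[0]
    if op == 'not':
        return not truth_value(wff[1], assignment)
    left = truth_value(wff[1], assignment)
    right = truth_value(wff[2], assignment)
    if op == 'and':
        return left and right
    if op == 'or':
        return left or right
    if op == 'imp':
        return (not left) or right
    if op == 'iff':
        return left == right
    raise ValueError("not a wff")

# (P ∨ Q) ∨ ¬(Q ⇒ (P ⇔ Q)) on the row where P is F and Q is F:
wff = ('or', ('or', 'P', 'Q'), ('not', ('imp', 'Q', ('iff', 'P', 'Q'))))
print(truth_value(wff, {'P': False, 'Q': False}))   # False, matching the table above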

Definition: A wff is a tautology iff it is true for every possible truth-value assignment to its statement letters.

The wff "(P ∨ Q) ∨ ¬(Q ⇒ (P ⇔ Q))" is not a tautology, because it is false for the truth-value assignment that makes both 'P' and 'Q' false, even though all the other truth-value assignments make it true. However, the wff "(P ∨ Q) ∨ (Q ⇒ (P ⇔ Q))" is a tautology, because it is true for every truth-value assignment, i.e., on every row of a truth table.

Abbreviation: The notation:

⊨ A

means that A is a tautology.

Definition: A wff is a self-contradiction iff it is false for every possible truth-value assignment.

Definition: A wff is contingent iff it is true for some possible truth-value assignments and false for others.

Definition: A wff A is said to logically imply a wff B iff there is no possible truth-value assignment to the statement letters making them up that makes A true and B false.

Abbreviation: The notation:

A ⊨ B

means that A logically implies B. Note that this sign is part of the metalanguage; it is an abbreviation of the English words ". . . logically implies . . . ". The sign '⊨' is not used in the object language. So "A ⊨ (B ⊨ C)" and "A ⇒ (B ⊨ C)" are nonsense.

Definition: Two wffs A and B are logically equivalent if and only if every possible truth-value assignment to the statement letters making them up gives them the same truth value.

Abbreviation: The notation

A ⫤⊨ B

is used to mean that A and B are logically equivalent.

Definition: If Γ is a set of wffs and A is a wff, then A is a logical consequence of Γ if and only if there is no truth-value assignment to the statement letters making up the wffs in Γ and A that makes every member of Γ true but makes A false.

To say that A is a logical consequence of Γ is the same as saying that an argument with the members of Γ as its premises and A as its conclusion is valid by truth tables.

Abbreviation: The notation:

Γ ⊨ A

is used to mean that A is a logical consequence of Γ.

These four uses of the sign "⊨" are related in intuitive ways. A tautology can in effect be thought of as something that is true in virtue of logic alone, or the conclusion of a logically valid argument that begins without any premises at all! I.e., "⊨ A" means the same as "∅ ⊨ A".

Definition: Two wffs A and B are said to be consistent or mutually satisfiable if and only if there is at least one truth-value assignment to the statement letters making them up that makes both A and B true.
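
For wffs with only a few statement letters, all of these semantic notions can be checked by brute force, simply by running through every truth-value assignment. The following sketch is mine (self-contained, in the same tuple notation as before) and defines tautology and logical-consequence tests in exactly that way.

# A minimal sketch: brute-force tests of tautologyhood and logical consequence.
from itertools import product

def truth_value(wff, v):
    if isinstance(wff, str):
        return v[wff]
    if wff[0] == 'not':
        return not truth_value(wff[1], v)
    a, b = truth_value(wff[1], v), truth_value(wff[2], v)
    return {'and': a and b, 'or': a or b,
            'imp': (not a) or b, 'iff': a == b}[wff[0]]

def letters(wff):
    return {wff} if isinstance(wff, str) else set().union(*(letters(p) for p in wff[1:]))

def assignments(wffs):
    ls = sorted(set().union(*(letters(w) for w in wffs)))
    return (dict(zip(ls, row)) for row in product([True, False], repeat=len(ls)))

def is_tautology(wff):
    return all(truth_value(wff, v) for v in assignments([wff]))

def is_consequence(premises, conclusion):
    return all(truth_value(conclusion, v)
               for v in assignments(list(premises) + [conclusion])
               if all(truth_value(p, v) for p in premises))

# (P ∨ Q) ∨ (Q ⇒ (P ⇔ Q)) is a tautology; and {A ⇒ B, A} ⊨ B.
print(is_tautology(('or', ('or', 'P', 'Q'), ('imp', 'Q', ('iff', 'P', 'Q')))))   # True
print(is_consequence([('imp', 'A', 'B'), 'A'], 'B'))                             # True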

It is time to get our first practice proving things in the metalanguage. Again, we're going to use English to prove something about the logical language of propositional logic. We can be somewhat informal about the logical structure of our proof, since we haven't fully laid out a deductive system for the metalanguage. But it's usually best to number the steps of the proof just like an object language deduction and be as clear as possible about how the proof works.

Here’s what we’re going to prove:


Result: For any wffs A and B, A ⊨ B iff ⊨ (A ⇒ B). (Logical implication is equivalent with tautologyhood of material implication.)

What we're proving is a biconditional; in particular, we're proving that one statement logically implies another iff the corresponding object language conditional statement is a tautology. We'll prove this biconditional using the same strategy we'd use if we were going to prove a biconditional in some object language deduction system. In particular, we prove the conditional going one way, and then the other. So the proof goes like this:

Proof:
(1) Assume that A ⊨ B. We need to show that (A ⇒ B) is a tautology.
(2) Suppose for reductio ad absurdum (indirect proof) that (A ⇒ B) is not a tautology. This means that there is some truth-value assignment that does not make (A ⇒ B) true. It must make (A ⇒ B) false.
(3) According to the truth table rules for the sign '⇒', this means it must make A true and B false.
(4) However, this contradicts the assumption that A ⊨ B, since that rules out any truth-value assignment making A true and B false.
(5) Our supposition at line (2) must be mistaken, and so ⊨ (A ⇒ B) after all.

Lines (1)–(5) represent a "conditional proof" in the metalanguage that if A ⊨ B then ⊨ (A ⇒ B). We need to go the other way as well.
(6) Assume that ⊨ (A ⇒ B).
(7) Assume for reductio ad absurdum that it is not true that A ⊨ B. This means that there is at least one truth-value assignment that makes A true but B false.
(8) Since there is at least one truth-value assignment that makes A true and B false, there is at least one truth-value assignment that makes (A ⇒ B) false, given the rules for constructing truth tables for the sign '⇒'.
(9) However, this contradicts our assumption at line (6). Hence, A ⊨ B after all.

Lines (6)–(9) represent a "conditional proof" that if ⊨ (A ⇒ B) then A ⊨ B. Putting the two together we get that A ⊨ B iff ⊨ (A ⇒ B). This is what we were aiming to prove. ∎

(From here on out, I use "∎" to demarcate the end of a proof in the metalanguage.)

Be careful about not mixing up the object language and the metalanguage. Assuming that "not ⊨ A" is not the same as assuming that "⊨ ¬A". After all, if A is contingent, neither it nor its negation is a tautology. The sign "¬" should never be used for negation in the metalanguage, nor "⇔" used instead of "iff", etc. If you wish, you can write "⊭ A" to mean "not-⊨ A", but never "¬ ⊨ A". That's not even meaningful!

Result: A ⫤⊨ B iff ⊨ (A ⇔ B).

Proof:
Similar to the previous example. ∎

Result: For any wffs A and B, if ⊨ A and ⊨ (A ⇒ B) then ⊨ B.

First, to be clear about what we're doing, we're not proving that modus ponens is a valid reasoning form. That would be to prove that

{(A ⇒ B), A} ⊨ B

The above is true, and easily proven, but it's not what we're after. Instead, we're proving something a bit stronger, namely that modus ponens preserves tautologyhood, i.e., that if both A and A ⇒ B are tautologies, then B is a tautology as well.

Proof:
What we're proving is a conditional. We assume the antecedent and attempt to prove the consequent.
(1) Assume that ⊨ A and ⊨ (A ⇒ B).


(2) This means that both A and (A ⇒ B) are tautologies, i.e., that every possible truth-value assignment to the statement letters making them up makes them true.
(3) Suppose for reductio ad absurdum that there were some truth-value assignment (row of a truth table) making B false.
(4) Notice that because every truth-value assignment makes (A ⇒ B) true, if it makes B false it must make A false as well.
(5) From lines (3) and (4) we get the result that there is a truth-value assignment making A false.
(6) However, it follows from line (2) that no truth-value assignment makes A false.
(7) Lines (5) and (6) are a contradiction, and so our assumption at line (3) is false, and so ⊨ B.
(8) Therefore, by conditional proof, if ⊨ A and ⊨ (A ⇒ B) then ⊨ B. ∎

In your book, there are also proofs of the following:

Result: If ⊨ A, and B is the result of (uniformly) replacing certain statement letters in A by complex wffs, then ⊨ B.

Result: If A is a wff containing wff C in one or more places, and B is just like A except containing wff D in those places where A contains C, then if C ⫤⊨ D then A ⫤⊨ B.

C. Reducing the Number of Connectives

When working within a given language, usually the more complex it is, the easier it is to say what you want, because you have more vocabulary in which to say it. However, when you're trying to prove something about the language, it's usually easier if it is simpler, because the more complex the language is, the more there is to say about it. When doing logical metatheory, it's usually to our advantage to whittle our object language (and the logical calculi we develop in it) down as far as possible. To that end, we ask, do we really need all five connectives (¬, ⇒, ⇔, ∧ and ∨)?

After all, our object language is not inadequate in any way by not including a sign for the exclusive sense of disjunction, since we can represent it using other signs, e.g., as (A ∨ B) ∧ ¬(A ∧ B) or ¬(A ⇔ B), etc. And no, we don't need all five of the ones we have. First we'll show that we could get by with just three, and later two, and finally one.

Result (Adequate Connectives): Every possible truth function can be represented by means of the connectives '∧', '∨' and '¬' alone.

Proof:
We'll prove this somewhat informally.
(1) Assume that A is some wff built using any set of truth-functional connectives, including, if you like, connectives other than our five. (A might make use of some three or four-place truth-functional connectives, or connectives such as the exclusive or, or any others you might imagine for bivalent logic.)
(2) What we're going to show is that there is a wff B formed only with the connectives '∧', '∨' and '¬' that is logically equivalent with A.
(3) In order for it to be logically equivalent to A, the wff B that we construct must have the same final truth value for every possible truth-value assignment to the statement letters making up A, or in other words, it must have the same final column in a truth table.

(4) Let P1, P2, . . . , Pn be the distinct statement letters making up A. For some possible truth-value assignments to these letters, A may be true, and for others A may be false. The only hard case would be the one in which A is contingent. Clearly tautologies and self-contradictions can be constructed with the signs '∧', '∨' and '¬', and all tautologies are logically equivalent to one another, and all self-contradictions are equivalent to one another, so in those cases our job is easy. Let us suppose instead that A is contingent.

(5) Let us construct a wff B in the following way.
a) Consider in turn each possible truth-value assignment to the letters P1, P2, . . . , Pn. For each truth-value assignment, construct a conjunction made up of those letters the truth-value assignment makes true, along with the negations of the letters the truth-value assignment makes false.

Example: Suppose the letters involved are 'A', 'B' and 'C'. This means that there are eight possible truth-value assignments, corresponding to the eight rows of a truth table. We construct an appropriate conjunction for each.

A B C   Conjunction
T T T   A ∧ B ∧ C
T T F   A ∧ B ∧ ¬C
T F T   A ∧ ¬B ∧ C
T F F   A ∧ ¬B ∧ ¬C
F T T   ¬A ∧ B ∧ C
F T F   ¬A ∧ B ∧ ¬C
F F T   ¬A ∧ ¬B ∧ C
F F F   ¬A ∧ ¬B ∧ ¬C

b) From the resulting conjunctions, form a complex disjunction made up of those conjunctions formed in step a) for which the corresponding truth-value assignment makes A true.

Example: Suppose for the example above that the final column of the truth table for A is as follows (just at random):

A B C   A
T T T   T
T T F   T
T F T   F
T F F   F
F T T   T
F T F   F
F F T   F
F F F   F

This means that we form a disjunction using as the disjuncts those conjunctions formed in step a) for those rows that make A true. The others are left out. In this case:

(A ∧ B ∧ C) ∨ (A ∧ B ∧ ¬C) ∨ (¬A ∧ B ∧ C)

The three conjunctions in the disjunction correspond to the three truth-value assignments that make A true.

(6) The wff B constructed in step (5) is logically equivalent to A. Consider that for those truth-value assignments making A true, one of the conjunctions making up the disjunction B is true, and hence the whole disjunction is true as well. For those truth-value assignments making A false, none of the conjunctions making up B is true, because each conjunction will contain at least one conjunct that is false on that truth-value assignment.

Example: Let us construct a truth table for the formula we constructed during our last step:

A B C   (A ∧ B ∧ C)   (A ∧ B ∧ ¬C)   (¬A ∧ B ∧ C)   whole disjunction
T T T        T              F              F                T
T T F        F              T              F                T
T F T        F              F              F                F
T F F        F              F              F                F
F T T        F              F              T                T
F T F        F              F              F                F
F F T        F              F              F                F
F F F        F              F              F                F

By examining the final column for this truth table, we see that it has the same final column as that given for A.

(7) This establishes our result. The example was arbitrary; the same process would work regardless of the number of statement letters or the final column for the statement involved. ∎
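
The construction in step (5) is entirely mechanical, so it can be written as a short program. The sketch below is my own illustration: given the statement letters and the rows (truth-value assignments) on which the target wff is to come out true, it builds the corresponding disjunction of conjunctions using only '∧', '∨' and '¬', in the tuple notation used earlier.

# A minimal sketch of the adequacy construction: build a wff in ∧, ∨, ¬
# whose truth table has a specified final column.
def conjunction_for(assignment, letters):
    # One conjunct per letter: the letter if true on this row, its negation if false.
    parts = [L if assignment[L] else ('not', L) for L in letters]
    wff = parts[0]
    for p in parts[1:]:
        wff = ('and', wff, p)
    return wff

def dnf_for(true_rows, letters):
    # true_rows: the truth-value assignments (rows) on which the target is true.
    # (The contingent case; tautologies and contradictions are handled separately.)
    disjuncts = [conjunction_for(row, letters) for row in true_rows]
    wff = disjuncts[0]
    for d in disjuncts[1:]:
        wff = ('or', wff, d)
    return wff

# The example from the text: true exactly on rows TTT, TTF and FTT.
rows = [{'A': True, 'B': True, 'C': True},
        {'A': True, 'B': True, 'C': False},
        {'A': False, 'B': True, 'C': True}]
print(dnf_for(rows, ['A', 'B', 'C']))
# prints the tuple encoding of ((A ∧ B ∧ C) ∨ (A ∧ B ∧ ¬C)) ∨ (¬A ∧ B ∧ C)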


Reducing Further

The above result means that any set of connectives in which we can always find equivalent forms for (A ∨ B), (A ∧ B) and ¬A is an adequate set of connectives. This means we can reduce still further. We don't need all three. We can get by with two in any of three ways.

Corollary: All truth-functions can be defined using only ¬ and ∨.

Proof:
The form ¬(¬A ∨ ¬B) is equivalent to (A ∧ B) and could be used instead of the latter in the proof above. ∎

Corollary: All truth-functions can be defined using only ¬ and ∧.

Proof:
The form ¬(¬A ∧ ¬B) is equivalent with (A ∨ B) and could be used instead of the latter in the proof above. ∎

Corollary: All truth-functions can be defined using only ¬ and ⇒.

Proof:Note that

(¬A ⇒ B) ⫤⊨ (A ∨ B) and
¬(A ⇒ ¬B) ⫤⊨ (A ∧ B)

and so the former forms can be used in place of the latter forms in the proof above. ∎

Reducing Still Further

Actually, if we started from a different basis, we could get by with just one connective. The most common way to do this is with the Sheffer stroke, written '|'. It has the following truth table:

A B   (A | B)
T T      F
T F      T
F T      T
F F      T

"A | B" could be read "not both A and B", and indeed is equivalent to ¬(A ∧ B). However, as our aim is to reduce all operators to '|', it is best not to think of the meanings of '¬' or '∧' as playing a role.

Corollary: All truth-functions can be defined using only the Sheffer stroke.

Proof:Note that:

(A | A) ⫤⊨ ¬A
((A | A) | (B | B)) ⫤⊨ (A ∨ B)
((A | B) | (A | B)) ⫤⊨ (A ∧ B)

and just for kicks, we can add:

(A | (B | B)) ⫤⊨ (A ⇒ B)
(((A | A) | (B | B)) | (A | B)) ⫤⊨ (A ⇔ B)

Hence, forms using the Sheffer stroke can be substituted in the proof above. ∎

Another way is with the Sheffer/Peirce dagger, written '↓' ("neither . . . nor . . . "), which has the truth table:

A B   (A ↓ B)
T T      F
T F      F
F T      F
F F      T


Corollary: All truth-functions can be defined using only the Sheffer/Peirce dagger.

Proof:
It suffices to note that:

(A ↓ A) ⫤⊨ ¬A
((A ↓ A) ↓ (B ↓ B)) ⫤⊨ (A ∧ B)
((A ↓ B) ↓ (A ↓ B)) ⫤⊨ (A ∨ B) ∎
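
Each of the equivalences claimed for '|' and '↓' can be verified by comparing truth tables, and with only two letters that is a four-row check. Here is a brief sketch of my own that does so by brute force.

# A minimal sketch: verify the Sheffer stroke and dagger equivalences by truth tables.
from itertools import product

def nand(a, b): return not (a and b)     # A | B
def nor(a, b):  return not (a or b)      # A ↓ B

checks = {
    "(A|A) is ¬A":            lambda a, b: nand(a, a) == (not a),
    "((A|A)|(B|B)) is A∨B":   lambda a, b: nand(nand(a, a), nand(b, b)) == (a or b),
    "((A|B)|(A|B)) is A∧B":   lambda a, b: nand(nand(a, b), nand(a, b)) == (a and b),
    "(A|(B|B)) is A⇒B":       lambda a, b: nand(a, nand(b, b)) == ((not a) or b),
    "(A↓A) is ¬A":            lambda a, b: nor(a, a) == (not a),
    "((A↓A)↓(B↓B)) is A∧B":   lambda a, b: nor(nor(a, a), nor(b, b)) == (a and b),
    "((A↓B)↓(A↓B)) is A∨B":   lambda a, b: nor(nor(a, b), nor(a, b)) == (a or b),
}

for name, test in checks.items():
    print(name, all(test(a, b) for a, b in product([True, False], repeat=2)))
# Every line prints True.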

But that's it. '|' and '↓' are the only binary connectives from which all truth functions can be derived. In fact, we can prove this.

Result: No binary operator besides '|' and '↓' is by itself sufficient to capture all truth functions.

Proof:
(1) Suppose there were some other binary connective # that was adequate by itself.
(2) We know immediately that (A # B) must be false when A and B are both true. If not, then it would be impossible to form something equivalent to a contradiction, since the "top row" of the truth table (the truth-value assignment making all statement letters true) would always make a wff true.
(3) For similar reasons, (A # B) must be true when A and B are both false, or else it would be impossible to form something equivalent to a tautology.

(4) Lines (2) and (3) give us this much of the table for #:

A B   A # B
T T     F
T F     ?
F T     ?
F F     T

The question is how to fill in the remaining ?'s.

(5) If we fill both in with T's, we get the Sheffer stroke. If we fill both in with F's, we get the Sheffer/Peirce dagger. That rules out two of the four remaining possibilities.
(6) If we fill them in with T and F respectively, the result is equivalent with ¬B, and if we fill them in with F and T, the result is equivalent with ¬A.
(7) Negation is clearly insufficient for defining all other truth functions (by itself, it can define only two truth functions). So the remaining options are inadequate. There are no possibilities left. Our # is impossible. ∎
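
The same result can be double-checked by brute force, at least for truth functions of two letters: represent a candidate connective by its four-row table, close up the set of two-place truth functions definable from it (starting from the projections onto A and B), and see whether all sixteen appear. The sketch below is my own finite check, not a replacement for the argument above; it reports that exactly two connectives pass, namely the tables of '|' and '↓'.

# A minimal sketch: which binary connectives are adequate by themselves?
from itertools import product

ROWS = list(product([True, False], repeat=2))     # the four assignments to A, B

def table(f):
    # A two-place truth function as a tuple of its four values.
    return tuple(f(a, b) for a, b in ROWS)

def closure(conn):
    # Start with the projections A and B; keep applying conn until nothing new appears.
    funcs = {table(lambda a, b: a), table(lambda a, b: b)}
    while True:
        new = {tuple(conn(f[i], g[i]) for i in range(4)) for f in funcs for g in funcs}
        if new <= funcs:
            return funcs
        funcs |= new

adequate = []
for values in product([True, False], repeat=4):   # all 16 binary connectives
    conn = dict(zip(ROWS, values))
    if len(closure(lambda x, y, c=conn: c[(x, y)])) == 16:
        adequate.append(values)

print(len(adequate))   # 2: only the tables of NAND ('|') and NOR ('↓')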

There are, however, triadic connectives and connectives of four or more places that work.

Austere Syntax

We noted earlier that having a reduced vocabulary in the object language makes proving things about it in the metalanguage easier, because there is less to say. So we might decide to revise our definition of a well-formed formula, and make it just this simple:

(i) Any statement letter is a wff;
(ii) if A and B are wffs, then so is (A | B);
(iii) nothing that cannot be constructed by repeated applications of the above is a wff.

However, there are trade-offs. The Sheffer stroke is less psychologically natural, and the rules of inference governing the Sheffer stroke are far less intuitive than anything as simple as modus ponens and modus tollens.

In this course, we take an intermediate route, and take '⇒' and '¬' as our only primitive connectives. Therefore, we now officially revise our definition of a wff as follows:

Definition: A(n official) well-formed formula (wff) is defined recursively as follows:

(i) Any statement letter is a wff;
(ii) if A is a wff, then so is ¬A;
(iii) if A and B are wffs, then so is (A ⇒ B);
(iv) nothing that cannot be constructed by repeated applications of the above steps is a wff.

We can continue to use the signs '⇔', '∧' and '∨', but treat them as mere abbreviations. They are definitional shorthands, just like the conventions we adopted regarding parentheses:

Abbreviations:

(A ∨ B) abbreviates (¬A ⇒ B)
(A ∧ B) abbreviates ¬(A ⇒ ¬B)
(A ⇔ B) abbreviates ¬((A ⇒ B) ⇒ ¬(B ⇒ A))

Whenever one of these signs appears, what is really meant is the wff obtained by replacing the definiendum with the definiens. So, e.g.,

(P ∧ Q) ∧ (R ∨ S)

is just a shorthand abbreviation for

¬(¬(P ⇒ ¬Q) ⇒ ¬(¬R ⇒ S))

Similarly, (P ∧ ¬P) means ¬(P ⇒ ¬¬P), and (P ∨ ¬P) means (¬P ⇒ ¬P).
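
Since the abbreviations are purely notational, "unabbreviating" a wff is a simple structural transformation. The following sketch (mine, in the tuple notation used earlier) rewrites any wff into one whose only connectives are '⇒' and '¬', following the three abbreviation clauses above.

# A minimal sketch: expand ∨, ∧ and ⇔ into the official primitives ⇒ and ¬.
def expand(wff):
    if isinstance(wff, str):                      # statement letter
        return wff
    if wff[0] == 'not':
        return ('not', expand(wff[1]))
    a, b = expand(wff[1]), expand(wff[2])
    if wff[0] == 'imp':
        return ('imp', a, b)
    if wff[0] == 'or':                            # (A ∨ B) abbreviates (¬A ⇒ B)
        return ('imp', ('not', a), b)
    if wff[0] == 'and':                           # (A ∧ B) abbreviates ¬(A ⇒ ¬B)
        return ('not', ('imp', a, ('not', b)))
    if wff[0] == 'iff':                           # (A ⇔ B) abbreviates ¬((A ⇒ B) ⇒ ¬(B ⇒ A))
        return ('not', ('imp', ('imp', a, b), ('not', ('imp', b, a))))
    raise ValueError("not a wff")

# (P ∧ Q) ∧ (R ∨ S) becomes ¬(¬(P ⇒ ¬Q) ⇒ ¬(¬R ⇒ S)), as in the text:
print(expand(('and', ('and', 'P', 'Q'), ('or', 'R', 'S'))))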

D. Axiomatic Systems and Natural Deduction

Our next topic is proofs or deductions in the object language. You learned a deduction system for propositional logic in your first logic course. Most likely, it was what is called a natural deduction system, and contained 15 or more rules of inference. There are many competing natural deduction systems out there. The following are based loosely on the systems of Kalish and Montague, Gentzen and Fitch, respectively.

Examples:

(1) Hardegree’s System

Inference rules
⇒O: From A ⇒ B and A infer B. From A ⇒ B and ¬B infer ¬A.
∨O: From A ∨ B and ¬A infer B. From A ∨ B and ¬B infer A.
∧O: From A ∧ B infer A. From A ∧ B infer B.
⇔O: From A ⇔ B infer A ⇒ B. From A ⇔ B infer B ⇒ A.
DN: From ¬¬A infer A. From A infer ¬¬A.
∨I: From A infer A ∨ B. From A infer B ∨ A.
∧I: From A and B infer A ∧ B.
⇔I: From A ⇒ B and B ⇒ A infer A ⇔ B.
✕I: From A and ¬A infer ✕.
✕O: From ✕ infer A.
¬⇒O: From ¬(A ⇒ B) infer A ∧ ¬B.
¬∨O: From ¬(A ∨ B) infer ¬A. From ¬(A ∨ B) infer ¬B.
¬∧O: From ¬(A ∧ B) infer A ⇒ ¬B.
¬⇔O: From ¬(A ⇔ B) infer ¬A ⇔ B.

Additional proof techniques
CD: Start a subderivation assuming A. If you derive B, you may end the subderivation and infer A ⇒ B.
ID: Start a subderivation assuming A. If you derive ✕, you may end the subderivation and infer ¬A. OR: Start a subderivation assuming ¬A. If you derive ✕, you may end the subderivation and infer A.

Here there are 21 inference rules and 3 additional proof techniques.

(2) Copi’s System

Inference rules
MP: From A ⇒ B and A infer B.
MT: From A ⇒ B and ¬B infer ¬A.
DS: From A ∨ B and ¬A infer B.
HS: From A ⇒ B and B ⇒ C infer A ⇒ C.
Simp: From A ∧ B infer A.
Conj: From A and B infer A ∧ B.
Add: From A infer A ∨ B.
CD: From A ∨ B and (A ⇒ D) ∧ (B ⇒ C) infer D ∨ C.
Abs: From A ⇒ B infer A ⇒ (A ∧ B).

Replacement rules
DN: Replace A with ¬¬A or vice versa.
Com: Replace A ∨ B with B ∨ A or vice versa. Replace A ∧ B with B ∧ A or vice versa.
Assoc: Replace A ∨ (B ∨ C) with (A ∨ B) ∨ C or vice versa. Replace A ∧ (B ∧ C) with (A ∧ B) ∧ C or vice versa.
Dist: Replace A ∧ (B ∨ C) with (A ∧ B) ∨ (A ∧ C) or vice versa. Replace A ∨ (B ∧ C) with (A ∨ B) ∧ (A ∨ C) or vice versa.


Trans: Replace A ⇒ B with ¬B ⇒ ¬A or vice versa.
Impl: Replace A ⇒ B with ¬A ∨ B or vice versa.
Equiv: Replace A ⇔ B with (A ⇒ B) ∧ (B ⇒ A) or vice versa. Replace A ⇔ B with (A ∧ B) ∨ (¬A ∧ ¬B) or vice versa.
Exp: Replace (A ∧ B) ⇒ C with A ⇒ (B ⇒ C) or vice versa.
Taut: Replace A with A ∧ A or vice versa. Replace A with A ∨ A or vice versa.

Additional proof techniques
CP: Start a subderivation assuming A. If you derive B, you may end the subderivation and infer A ⇒ B.
IP: Start a subderivation assuming A. If you derive B ∧ ¬B, you may end the subderivation and infer ¬A.

Here we have 23 rules and two additional proof techniques.

A natural deduction system is a system designed to include as its inference rules those steps of reasoning that are most psychologically simple and easy. Usually, this means that some of the rules are redundant. Consider, e.g., modus tollens (MT) in the Copi/Cohen system. It is redundant given the rules of transposition and modus ponens. Instead of using MT, one could always use them.

1. P ⇒ Q
2. ¬Q
3. ¬Q ⇒ ¬P     1 Trans
4. ¬P          2, 3 MP

Natural deduction systems contrast with axiomatic systems. Axiomatic systems aim to be as minimal as possible. They employ as few basic principles and rules as possible. For them, sticking to what is psychologically most natural or convenient is not the prime goal.

Generally, when working within a deduction system, proofs are easier when the system is more complex, because you have more rules to work with. However, when proving things about a deduction system, it's much easier when the system is as simple and minimal as possible.

Therefore, in what follows we attempt to construct a relatively minimalistic deduction system for propositional logic; a system, moreover, that was custom made for our new revised definition of a well-formed formula. In that system, officially, all wffs are built up only using the signs '⇒' and '¬'. The other signs can be utilized as abbreviations or shorthand notations, but they are not parts of the official symbolism.

E. Axiomatic System L

This system uses the restricted definition of a wff in which ⇒ and ¬ are the only primitive connectives.

First we need some definitions:

Definition: An axiom of L is any wff of one of the following three forms:

(A1) A ⇒ (B ⇒ A)
(A2) (A ⇒ (B ⇒ C)) ⇒ ((A ⇒ B) ⇒ (A ⇒ C))
(A3) (¬A ⇒ ¬B) ⇒ ((¬A ⇒ B) ⇒ A)

Note: strictly speaking, there are an infinite number of axioms, because every instance of these forms is an axiom. Instances of (A1) include not only "P ⇒ (Q ⇒ P)" but also complicated instances such as "(¬A ⇒ B) ⇒ (¬(¬D ⇒ ¬¬M) ⇒ (¬A ⇒ B))".

Hence (A1) is not itself an axiom; it is an axiom schema. System L has an infinity of axioms, but three axiom schemata.

System L has only one inference rule, viz., modus ponens: from A ⇒ B and A, infer B.

Definition: A proof in L of a conclusion A from a set of premises Γ is a finite ordered sequence of wffs B1, B2, . . . , Bn, such that the last member of the sequence, Bn, is the conclusion A, and for each member of the sequence Bi, where 1 ≤ i ≤ n, either (1) Bi is a member of the premise set Γ, or (2) Bi is an axiom of L, or (3) Bi follows from previous members of the sequence by modus ponens.


To put this less formally, L is a deduction system in which each step must be either a premise, an axiom, or a modus ponens inference. There are no other rules.

All proofs are direct. There are no indirect orconditional proofs.

Contrast the simplicity of this system with the natural deduction systems above. Yet, this system is no less powerful. Indeed, in a week or two we will prove that it is complete, i.e., that everything that should be provable in it is provable in it.

Abbreviation: We use the notation

Γ ⊢ A (or Γ ⊢L A)

to mean that there is at least one proof (in L) of A from Γ, or that A is provable from the set of premises Γ. If Γ has one or just a few members, we write simply: B ⊢ A or B, C ⊢ A, etc. (The sign '⊢' is called the turnstile.)

Definition: A theorem of L is any wff A such that ∅ ⊢ A.

In other words, a theorem is a wff that can be proven without using any premises.

Abbreviation: We use the notation

⊢ A

to mean that A is a theorem.

Here is a proof showing that "P ⇒ P" is a theorem of L:

1. P ⇒ ((P ⇒ P) ⇒ P)                                      instance of A1
2. P ⇒ (P ⇒ P)                                             instance of A1
3. (P ⇒ ((P ⇒ P) ⇒ P)) ⇒ ((P ⇒ (P ⇒ P)) ⇒ (P ⇒ P))         instance of A2
4. (P ⇒ (P ⇒ P)) ⇒ (P ⇒ P)                                 1, 3 MP
5. P ⇒ P                                                    2, 4 MP

Here we see with lines 1 and 2 that different instances of the same axiom schema are quite often used within the same proof. Line 3 is a typical instance of (A2), making A and C into 'P' and B into "(P ⇒ P)".
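
Because System L has so few moving parts, checking a purported proof is itself a purely mechanical matter: every line must be a premise, an instance of (A1)-(A3), or follow from two earlier lines by modus ponens. The sketch below is my own illustration (not code from Mendelson or these notes) over the tuple-encoded wffs used earlier; it verifies the five-line proof of "P ⇒ P".

# A minimal sketch: a proof checker for System L over tuple-encoded wffs.
def imp(a, b): return ('imp', a, b)
def neg(a):    return ('not', a)

def is_axiom(f):
    if f[:1] != ('imp',):
        return False
    _, x, y = f
    # (A1)  A ⇒ (B ⇒ A)
    if y[:1] == ('imp',) and y[2] == x:
        return True
    # (A2)  (A ⇒ (B ⇒ C)) ⇒ ((A ⇒ B) ⇒ (A ⇒ C))
    if (x[:1] == ('imp',) and x[2][:1] == ('imp',) and
        y[:1] == ('imp',) and y[1][:1] == ('imp',) and y[2][:1] == ('imp',) and
        y[1][1] == x[1] and y[2][1] == x[1] and
        y[1][2] == x[2][1] and y[2][2] == x[2][2]):
        return True
    # (A3)  (¬A ⇒ ¬B) ⇒ ((¬A ⇒ B) ⇒ A)
    if (x[:1] == ('imp',) and x[1][:1] == ('not',) and x[2][:1] == ('not',) and
        y[:1] == ('imp',) and y[1][:1] == ('imp',) and
        y[1][1] == x[1] and y[1][2] == x[2][1] and y[2] == x[1][1]):
        return True
    return False

def checks_out(lines, premises=()):
    # Each line must be a premise, an axiom, or follow from earlier lines by MP.
    for i, f in enumerate(lines):
        ok = (f in premises or is_axiom(f) or
              any(g == ('imp', h, f) for g in lines[:i] for h in lines[:i]))
        if not ok:
            return False
    return True

P = 'P'
proof = [imp(P, imp(imp(P, P), P)),                                  # A1
         imp(P, imp(P, P)),                                          # A1
         imp(imp(P, imp(imp(P, P), P)),
             imp(imp(P, imp(P, P)), imp(P, P))),                     # A2
         imp(imp(P, imp(P, P)), imp(P, P)),                          # 1, 3 MP
         imp(P, P)]                                                  # 2, 4 MP
print(checks_out(proof))   # True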

In general, proofs in an axiomatic system are longer and less natural than in natural deduction. We make it up to ourselves by never proving the same thing twice. Notice that the above proof suffices for the particular theorem "P ⇒ P". However, the exact same line of reasoning would work for any statement of the form A ⇒ A. Whatever A is, there is a proof of the form:

1. A ⇒ ((A ⇒ A) ⇒ A)                                      A1
2. A ⇒ (A ⇒ A)                                             A1
3. (A ⇒ ((A ⇒ A) ⇒ A)) ⇒ ((A ⇒ (A ⇒ A)) ⇒ (A ⇒ A))         A2
4. (A ⇒ (A ⇒ A)) ⇒ (A ⇒ A)                                 1, 3 MP
5. A ⇒ A                                                    2, 4 MP

Not only that, but we could introduce the appropriate five steps into any derivation whenever we wanted something of the form A ⇒ A. Just like every instance of A ⇒ (B ⇒ A) is an axiom, every instance of A ⇒ A is a theorem. Hence we call it a theorem schema. Let us call this schema "Self Implication" (Self-Imp).

Once we have given a proof for a theorem schema, from then on we treat its instances as though they were axioms, and allow ourselves to make use of it in any later proof just by citing the previous proof. This is allowable since, if need be, we could always just repeat the steps of the original proof in the middle of the new proof. Here's a proof showing that for any wff A, it holds that:

¬¬A ⊢ A

1. ¬¬A                                       Premise
2. ¬¬A ⇒ (¬A ⇒ ¬¬A)                          A1
3. ¬A ⇒ ¬¬A                                  1, 2 MP
4. (¬A ⇒ ¬¬A) ⇒ ((¬A ⇒ ¬A) ⇒ A)              A3
5. (¬A ⇒ ¬A) ⇒ A                             3, 4 MP
6. ¬A ⇒ ¬A                                   (Self-Imp)
7. A                                         5, 6 MP

Strictly speaking, steps such as 6 are not allowed;however, we could remedy this by simply insert-ing the appropriate steps from our previous proofschema. This gets tedious. Our motto in this classis to never prove something again once you’ve al-ready proven it once.

Not only that, but the result that for any wUA , we have ¬¬A ` A is also the sort of thingthat might come in handy down the road. It is not

18

Page 22: Mathematical Logic I

a theorem schema, since it involved a premise, anddoes not show anything to be a theorem. How-ever, what it does show is that whenever we havearrived at something of the form ¬¬A within aproof, we could do the same steps above to arrive atA . We’re allowed to skip these steps, and cite andabove result. In eUect, we’ve added a new inferencerule to our system. We haven’t really added to thesystem, since we could always Vll in the missingsteps.

Hence a result of this form is called a derivedrule. Let us call this derived rule double negation(DN) for obvious reasons. (Actually, it is only halfof double negation. We’d also need to show thatA ` ¬¬A , which is diUerent.)

Your book just gives theorem schemata andderived rules generic names like “Prop. 1.11a”. Per-sonally, I Vnd them easier to remember if I makeup my own descriptive names and abbreviationslike “(Self-Imp)” and “(DN)”. You can more or lessdo as you like. When I do my grading I won’t re-ally be looking at how you annotate your proofs.I’ll be looking more at the content of the proofsthemselves.

Another thing I Vnd helpful that the bookdoesn’t do is recognize that each step in a proofis itself a result, since we could have stopped theproof there. Hence I like to use the sign “`” beforeany step of a proof I arrive at without using anypremises, and similarly, for those steps that didrequire a premise, I like to make note of this bywriting the premise before the sign “`”. So for theVrst example, I prefer:

1. ` A ⇒ ((A ⇒ A )⇒ A ) A12. ` A ⇒ (A ⇒ A ) A13. ` (A ⇒ ((A ⇒ A )⇒ A ))⇒

((A ⇒ (A ⇒ A ))⇒ (A ⇒ A )) A24. ` (A ⇒ (A ⇒ A ))⇒ (A ⇒ A ) 1, 3 MP5. ` A ⇒ A 2, 4 MP

And for the second, I prefer to write:

1. ¬¬A ` ¬¬A Premise2. ` ¬¬A ⇒ (¬A ⇒ ¬¬A ) A13. ¬¬A ` ¬A ⇒ ¬¬A 1, 2 MP4. ` (¬A ⇒ ¬¬A )⇒ ((¬A ⇒ ¬A )⇒ A )

A35. ¬¬A ` (¬A ⇒ ¬A )⇒ A 3, 4 MP

6. ` ¬A ⇒ ¬A (Self-Imp)7. ¬¬A ` A 5, 6 MP

Written this way, every single line becomes a meta-theoretic result. Moreover, it shows which lines ina proof are justiVed by which premises, and whichlines were justiVed without using any premises.(When a premise is introduced, it is its own jus-tiVcation.) Here we see that in the Vrst proof ev-ery line was a theorem, but in the second proof,some lines were theorems, but others required theassumption at line 1. The disadvantage of this no-tation is that it is more to write, which gets tediousespecially when more than one premise is involved.You can do much the same thing by abbreviatingusing line numbers, e.g., by writing line 3 insteadas:3. [1] ` ¬A ⇒ ¬¬A 1, 2 MPwith the “[1]” representing line 1, and so on.

F. The Deduction Theorem

If you do your homework, you’ll be chugging awayat a number of interesting and worthwhile new the-orem schemata and derived rules. Today we showsomething more radical: we show that the natu-ral deduction method of conditional proof, whilenot strictly speaking allowed in System L, is unnec-essary in L, because there is a rote procedure fortransforming a would-be conditional proof into adirect proof. To be more precise, we’re going toprove the following meta-theoretic result:

Result (The Deduction Theorem):If Γ∪{C } ` A , then Γ ` C ⇒ A . Or, in otherwords, if we can construct a proof for a certainresult A using a set of premises Γ along withan additional premise or assumption, C , then itis always possible, using the original set alone, toconstruct a proof for the conditional statementC ⇒ A .

Proof:(1) Assume that Γ ∪ {C } ` A . This means that

there is a proof, i.e., an ordered sequence of

19

Page 23: Mathematical Logic I

wUs B1,B2, . . . ,Bn that satisVes the deVni-tion of being a proof of A from Γ ∪ {C }.

(2) We’re going to use the technique of proof in-duction (see page 6) to show that for every stepin this proof, Bi, where 1 ≤ i ≤ n, it holdsthat Γ ` C ⇒ Bi.

(3) An argument by proof induction works by Vrstmaking an inductive hypothesis. Let Bi be anarbitrary step in the proof. We’re allowed toassume as an inductive hypothesis that for allearlier steps in the proof Bj such that j < i itholds that Γ ` C ⇒ Bj . We need to show thatthe same holds for Bi given this assumption.

(4) Because Bi is a step in a proof of A fromΓ ∪ {C }, Bi is any one of these three things:a) Bi is a premise, i.e., it is a member of

Γ ∪ {C }.b) Bi is an axiom of L.c) Bi followed from previous steps in the

proof by modus ponens.We will show for any of these cases, it holdsthat Γ ` C ⇒ Bi.Case a) : Bi is a premise. This means that ei-ther Bi is C or it is a member of Γ. If Bi

is C , then C ⇒ Bi is the same as C ⇒ C ,and hence an instance of (Self-Imp), which canbe introduced into any proof. In that case,Γ ` C ⇒ Bi. If Bi is a member of Γ thenclearly Γ ` Bi. We can introduce the axiomBi ⇒ (C ⇒ Bi) as an instance of (A1), andso by MP we can conclude Γ ` C ⇒ Bi.Case b) : Bi is an axiom. Hence we can intro-duce Bi into any proof at any time. By (A1),Bi ⇒ (C ⇒ Bi) is also an axiom. Henceby MP we get ` C ⇒ Bi, and a fortioriΓ ` C ⇒ Bi.Case c) : Bi followed from previous steps in theproof by modus ponens. This is the hard case.By the deVnition of modus ponens, there mustbe two previous members of the sequence, Bj

and Bk from which it followed, with Bj takingthe form Bk ⇒ Bi. By the inductive hypoth-esis, it holds that Γ ` C ⇒ Bj and Γ ` C ⇒Bk. Because Bj takes the form Bk ⇒ Bi, thismeans Γ ` C ⇒ (Bk ⇒ Bi). We can thenintroduce the axiom (C ⇒ (Bk ⇒ Bi)) ⇒((C ⇒ Bk) ⇒ (C ⇒ Bi)) as an instance

of (A2). By two applications of MP, we getΓ ` C ⇒ Bi.

(5) Hence, for every step Bi in the original proof,we can push the assumption C through tomake it an antecedent. This is true of thelast step in the proof, Bn, which must be A ,since A was the conclusion of the originalproof. Hence, Γ ` C ⇒ Bn means thatΓ ` C ⇒ A . e

The above proof of the deduction theorem isfairly hard to follow in the abstract, but the ideabehind it is actually very simple. What it means isthat for every proof making use of some numberof assumptions or premises, we can eliminate oneof the premises and make it an antecedent on eachline of the original proof. There is a rote proce-dure for transforming each line into a line with theeliminated assumption or premise as an antecedent.We follow this procedure for the example given onthe next page. The deduction theorem works al-most as a substitute for conditional proof; moreprecisely, however, it shows that conditional proofin the object language is not needed.

Applying the Deduction Theorem

Last class we covered this proof schema:

1. ¬¬A ` ¬¬A Premise2. ` ¬¬A ⇒ (¬A ⇒ ¬¬A ) A13. ¬¬A ` ¬A ⇒ ¬¬A 1, 2 MP4. ` (¬A ⇒ ¬¬A )⇒ ((¬A ⇒ ¬A )⇒ A )

A35. ¬¬A ` (¬A ⇒ ¬A )⇒ A 3, 4 MP6. ` ¬A ⇒ ¬A (Self-Imp)7. ¬¬A ` A 5, 6 MP

We used a premise to arrive at our conclusion; thededuction theorem tells us that there is a proof notmaking use of the premise, in which the premiseof the original argument becomes an antecedenton the result, i.e.:

` ¬¬A ⇒ A

The proof of the deduction theorem provides uswith a way of transforming the above proof schemainto one for the result that ` ¬¬A ⇒ A .

20

Page 24: Mathematical Logic I

We take the steps of the original proof oneby one, and depending on what kind of case it is,we treat it appropriately. In transforming eachstep, the goal is to push the discharged premiseto the other side of the turnstyle, and arrive at“` ¬¬A ⇒ . . .”.

Line 1 is the discharged premise. It falls in “case a)”from the previous page. It becomes:

1. ` ¬¬A ⇒ ¬¬A (Self-Imp)

Line 2 appeals to an axiom. It falls in “case b)”. So,it becomes:

2. ` ¬¬A ⇒ (¬A ⇒ ¬¬A ) A13. ` (¬¬A ⇒ (¬A ⇒ ¬¬A ))⇒

(¬¬A ⇒ (¬¬A ⇒ (¬A ⇒ ¬¬A ))) A14. ` ¬¬A ⇒ (¬¬A ⇒ (¬A ⇒ ¬¬A ))2,3 MPLine 3 is gotten by MP. It falls in “case c)”:

5. ` (¬¬A ⇒ (¬¬A ⇒ (¬A ⇒ ¬¬A )))⇒((¬¬A ⇒ ¬¬A )⇒

(¬¬A ⇒ (¬A ⇒ ¬¬A ))) A26. ` (¬¬A ⇒ ¬¬A )⇒

(¬¬A ⇒ (¬A ⇒ ¬¬A )) 4, 5 MP7. ` ¬¬A ⇒ (¬A ⇒ ¬¬A ) 1, 6 MPLine 4 also appeals to an axiom. We treat it justlike we treated line 2:

8. ` (¬A ⇒ ¬¬A )⇒ ((¬A ⇒ ¬A )⇒ A )A3

9. ` ((¬A ⇒ ¬¬A )⇒ ((¬A ⇒ ¬A )⇒ A ))⇒ (¬¬A ⇒ ((¬A ⇒ ¬¬A )⇒

((¬A ⇒ ¬A )⇒ A ))) A110. ` ¬¬A ⇒ ((¬A ⇒ ¬¬A )⇒

((¬A ⇒ ¬A )⇒ A )) 8, 9 MPLine 5 is gotten at by MP, so “case c)” again:

11. ` (¬¬A ⇒ ((¬A ⇒ ¬¬A )⇒ ((¬A ⇒¬A )⇒ A )))⇒ ((¬¬A ⇒ (¬A ⇒¬¬A ))⇒ (¬¬A ⇒ ((¬A ⇒ ¬A )⇒ A )))

A212. ` (¬¬A ⇒ (¬A ⇒ ¬¬A ))⇒

(¬¬A ⇒ ((¬A ⇒ ¬A )⇒ A )) 10,11 MP13. ` ¬¬A ⇒ ((¬A ⇒ ¬A )⇒ A ) 7, 12 MP

Line 6 appeals to a theorem schema. Strictly speak-ing we should write out the intermediate steps, butto save time we can treat it like an axiom, and usethe method for case b):

14. ` ¬A ⇒ ¬A (Self-Imp)

15. ` (¬A ⇒ ¬A )⇒ (¬¬A ⇒ (¬A ⇒ ¬ A ))A1

16. ` ¬¬A ⇒ (¬A ⇒ ¬A ) 14, 15 MPLine 7 is another MP step:

17. ` (¬¬A ⇒ ((¬A ⇒ ¬A )⇒ A ))⇒((¬¬A ⇒ (¬A ⇒ ¬A ))⇒ (¬¬A ⇒ A ))

A218. ` (¬¬A ⇒ (¬A ⇒ ¬A ))⇒ (¬¬A ⇒ A )

13, 17 MP19. ` ¬¬A ⇒ A 16, 18 MPWe’ve transformed our original 7 step proof into a19 step proof for the result we were after. Noticethat in the new proof, every single step is a theo-rem; the hypothesis is removed entirely. The Vnalline shows that all wUs of the form ¬¬A ⇒ Aare theorems.

This procedure can be lengthy, but it sure iseUective! The proofs that result from the trans-formation procedure are not usually the most el-egant ones possible Notice, e.g., that lines 2 and7 are identical, so we could have skipped lines 3–7! However, we followed the recipe provided onthe previous page blindly, since we know that thatprocedure will work in every case.

Since we know we can always transform theone kind of proof into the other, from here on out(well, except in tonight’s homework), wheneveryou have a result of the form A ` B, just goahead and conclude ` A ⇒ B, annotating with“DT”. (In eUect, this allows you to do “conditionalproofs” in our System L.)

Important Derived Rules for System L

The following are either proven in your book, as-signed for homework, or not worth our time tobother proving now.

Remember that (A ∨ B) is deVned as (¬A ⇒B) and (A ∧ B) is deVned as ¬(A ⇒ ¬B), etc.

Derived rules: My name/abbreviation:A ⇒ B,B ⇒ C ` A ⇒ C Syllogism (Syll)A ⇒ (B ⇒ C ) ` B ⇒ (A ⇒ C )

Interchange (Int)A ⇒ B ` ¬B ⇒ ¬A Transposition (Trans)¬A ⇒ ¬B ` B ⇒ A Transposition (Trans)A ⇒ B,¬B ` ¬A Modus Tollens (MT)

21

Page 25: Mathematical Logic I

¬¬A ` A Double Negation (DN)A ` ¬¬A Double Negation (DN)¬A ` A ⇒ B False Antecedent (FA)A ` B ⇒ A True Consequent (TC)A ,¬B ` ¬(A ⇒ B) True Ant/False C.(TAFC)¬(A ⇒ B) ` A True Antecedent (TA)¬(A ⇒ B) ` ¬B False Antecedent (FC)A ⇒ B,¬A ⇒ B ` B Inevitability (Inev)A ∨ A ` A Redundancy (Red)A ` A ∧ A Redundancy (Red)A ` A ∨ B Addition (Add)A ` B ∨ A Addition (Add)A ∨ B ` B ∨ A Commutativity (Com)A ∧ B ` B ∧ A Commutativity (Com)A ⇔ B ` B ⇔ A Commutativity (Com)(A ∧ B) ∧ C ` A ∧ (B ∧ C )

Associativity (Assoc)A ∧ (B ∧ C ) ` (A ∧ B) ∧ C ” (Assoc)(A ∨ B) ∨ C ` A ∨ (B ∨ C ) ” (Assoc)A ∨ (B ∨ C ) ` (A ∨ B) ∨ C ” (Assoc)A ∧ B ` A SimpliVcation (Simp)A ∧ B ` B SimpliVcation (Simp)A ,B ` A ∧ B Conjunction Intro (Conj)A ⇒ B,B ⇒ A ` A ⇔ B

Biconditional Intro (BI)A ,B ` A ⇔ B Biconditional Intro (BI)¬A ,¬B ` A ⇔ B Biconditional Intro (BI)A ⇔ B ` A ⇒ B Biconditional Elim (BE)A ⇔ B ` B ⇒ A Biconditional Elim (BE)A ⇔ B,A ` B Bic. Modus Ponens (BMP)A ⇔ B,B ` A Bic. Modus Ponens (BMP)A ⇔ B,¬A ` ¬B Bic. Modus Tollens (BMT)A ⇔ B,¬B ` ¬A Bic. Modus Tollens (BMT)

As you see, I prefer Copi’s abbreviations. You canuse whatever abbreviations you prefer, providedthat you don’t use a derived rule until you’ve givena proof schema for it! Once you do it, you canalways refer back to it.

G. Soundness and Consistency

Time to get to the really good stuU—the importantresults of this chapter.

Generally, we say that a logical system is soundif and only if everything provable in it ought to

be provable in it given the intended semantics forthe signs utilized in the language. Generally, wesay that a logical system is consistent if and onlyif there is no wU A such that both A and ¬A areprovable in the system.

We now show that L has these features.

Result (Soundness): System L is sound, i.e., forany wU A , if ` A then � A . In other words,every theorem of L is a tautology.

Proof:(1) Assume ` A . This means that there is a se-

quence of wUs B1,B2, . . . ,Bn constitutingproof of A in which every step is either anaxiom or derived from previous steps by MP.

(2) We shall show by proof induction that everystep of such a proof is a tautology. We assumeas inductive hypothesis that all the steps priorto a given step Bi are tautologies. We nowneed to show that Bi is a tautology.

(3) Bi is either an axiom or derived from previ-ous steps by MP. If it is an axiom, then it is atautology. (A simple truth table for the threeaxiom schemata shows that all instances aretautologies.) By an earlier result (see p. 3), any-thing derived from MP from tautologies is alsoa tautology. Hence, Bi is a tautology.

(4) By proof induction, all steps of the proof aretautologies, including the last step, which is A .Hence � A . e

Corollary (Consistency): System L is consis-tent, i.e., there is no wU A such that both ` Aand ` ¬A .

Proof:

Suppose for reductio that there is some A suchthat ` A and ` ¬A .

Since L is sound, � A and � ¬A .

22

Page 26: Mathematical Logic I

By the deVnition of a tautology, every truth-valueassignment makes both A true and ¬A true.

However, no truth-value assignment can makeboth A and ¬A true, and so our assumption isimpossible. e

Here we see that consistency is a corollary ofsoundness. Here’s another.

Corollary: If {B1,B2, . . . ,Bn} ` A then{B1,B2, . . . ,Bn} � A .

Proof:The reason is that if

{B1,B2, . . . ,Bn} ` A

then by multiple applications of the deduction the-orem,

` (B1 ⇒ (B2 ⇒ . . . (Bn ⇒ A ))).

Then, by soundness, we can conclude:

� (B1 ⇒ (B2 ⇒ . . . (Bn ⇒ A )))

Then by simple reWections on the rules governingtruth tables, it is obvious that:

{B1,B2, . . . ,Bn} � A

In other words, only logically valid arguments haveproofs in L. e

H. Completeness

Our next task is to prove the converse of soundness,i.e., that if � A then ` A .

Unfortunately, the word “complete” is usedwith two diUerent meanings in mathematical logic.On one meaning (used by Emil Post), a system issaid to be complete if and only if for every wUA , either A or ¬A is a theorem of the system.System L is obviously not complete in this sense,since for a contingent statement, neither it nor its

negation is a tautology, and hence neither it norits negation is a theorem. The other sense of com-pleteness is the converse of soundness, i.e., thateverything that should be provable in the systemgiven the semantics of the signs it employs is infact provable. This notion of “completeness” wasVrst used by Kurt Gödel, and is sometimes called“semantic completeness”. System L is complete inthis sense. Before we prove this, we Vrst need toprove something else.

Composition Lemma

(Something that is proven only as a means towardsproving something else is called a lemma.)

Most likely, one of the Vrst things you learnedabout propositional logic is how to compute thetruth value of a given statement if you are giventhe truth values of all its statement-letters. This iswhat you do when you Vll in a row of a truth table.P Q R P ⇒ ¬ (Q ⇒ R)T T F T T T (T F F)

In system L, this corresponds to the result that,if, for every statement letter in A , you are giveneither it or its negation as a premise, you shouldbe able to derive either the truth or the falsity ofA . For the example just given, we should have:{P,Q,¬R} ` P ⇒ ¬(Q⇒ R). We do!

1. {P,Q,¬R} ` P Premise2. {P,Q,¬R} ` Q Premise3. {P,Q,¬R} ` ¬R Premise4. {P,Q,¬R} ` ¬(Q⇒ R) 2, 3 (TAFC)5. ` ¬(Q⇒ R)⇒ [P ⇒ ¬(Q⇒ R)] A16. {P,Q,¬R} ` P ⇒ ¬(Q⇒ R) 4, 5 MP

Let us prove this result in a general form.

Result (Composition Lemma): If A is a wUwhose statement letters are P1, . . . ,Pn, andthere is a truth-value assignment f such that setΓ contains Pi iU f assigns the truth value T toPi, and Γ contains ¬Pi iU f assigns the truthvalue F to Pi, then if f makes A true, thenΓ ` A , and if f makes A false, then Γ ` ¬A .

23

Page 27: Mathematical Logic I

Proof:We show this by wU induction.

Base step: Let A be a statement letter. Then theonly statement letter making up A is A itself. Iff assigns T to A , then A ∈ Γ and hence Γ ` A .Similarly, if f assigns F to A , then ¬A ∈ Γ, andhence Γ ` ¬A .

Induction step: Because all complex wUs are builtup using the signs ¬ and⇒, we need to show twothings, (a) if A takes the form ¬B then the aboveholds for A assuming it holds of B, and (b) if Atakes the form B ⇒ C , then the above holds of Aassuming it holds of B and C .

First let’s show part (a).

Suppose A takes the form ¬B. If f makes A true,then it must make B false. By our assumption,Γ ` ¬B, which is the same as Γ ` A , which iswhat we want. If f makes A false, it must makeB true. By our assumption Γ ` B, and by (DN)Γ ` ¬¬B, which is the same as Γ ` ¬A .

Now let’s show part (b).

Suppose A takes the form B ⇒ C . If f makesA true, it must make either B false or C true.If f makes B false, then by our assumption Γ `¬B, and by the derived rule (FA), it follows thatΓ ` B ⇒ C , i.e., Γ ` A . If f makes C true, byour assumption Γ ` C and so by the derived rule(TC), we get Γ ` B ⇒ C , i.e., Γ ` A . On theother hand, if f makes A false, it must make Btrue and C false. By the assumption, Γ ` B andΓ ` ¬C . Then by the derived rule (TAFC), we getΓ ` ¬(B ⇒ C ), or in other words, Γ ` ¬A .

This completes the induction step, and hence theComposition Lemma follows by wU induction. e

We are now ready to tackle completeness.

Result (Completeness): System L is semanti-cally complete, i.e., for any wU A , if � A then` A .

Proof:(1) Assume that � A , and let the statement letters

making it up be P1, . . . ,Pn.

Example: For illustration purposes only, we’llassume A contains only three statement let-ters ‘P ’, ‘Q’ and ‘R’.

(2) As a tautology, every truth-value assignmentto those statement letters makes A true.

(3) By the Composition Lemma, it follows that forevery for set Γ that contains either Pi or ¬Pi

but not both for each i such that 1 ≤ i ≤ n, wehave Γ ` A .

Example: Consider the truth table for A ; it is atautology, true on every row. Each row gives usa diUerent result from the Composition Lemma,but always a diUerent way of proving A .

P Q R A Result of lemmaT T T T {P,Q,R} ` AT T F T {P,Q,¬R} ` AT F T T {P,¬Q,R} ` AT F F T {P,¬Q,¬R} ` AF T T T {¬P,Q,R} ` AF T F T {¬P,Q,¬R} ` AF F T T {¬P,¬Q,R} ` AF F F T {¬P,¬Q,¬R} ` A

(4) By the Deduction Theorem, we can concludethat if ∆ is a set containing either Pi or ¬Pi

for each i such that 1 ≤ i ≤ n − 1, we haveboth ∆ ` Pn ⇒ A and ∆ ` ¬Pn ⇒ A .By the derived rule (Inev), we can conclude∆ ` A .

Example: What we’re doing here taking thelast statement letter or negation in eachpremise set and removing it by the deduc-tion theorem, thereby making it an antecedent.However, since we have both the case with theaXrmative antecedent and the case with thenegative antecedent, they drop oU by (Inev).

{P,Q} ` R⇒ A}so {P,Q} ` A{P,Q} ` ¬R⇒ A

{P,¬Q} ` R⇒ A}so {P,¬Q} ` A{P,¬Q} ` ¬R⇒ A

{¬P,Q} ` R⇒ A}so {¬P,Q} ` A{¬P,Q} ` ¬R⇒ A

{¬P,¬Q} ` R⇒ A}so {¬P,¬Q} ` A{¬P,¬Q} ` ¬R⇒ A

24

Page 28: Mathematical Logic I

(5) By continued application of the same processdescribed in step (4), we can successively elimi-nate the members of the premise sets, arrivingultimately at the results that `P1 ⇒ A and` ¬P1 ⇒ A . Again, by (Inev), it follows that` A .

Example: We just continue the same process:{P,Q} ` A so P ` Q⇒ A

}P ` A{P,¬Q} ` A so P ` ¬Q⇒ A

{¬P,Q} ` A so ¬P ` Q⇒ A}¬P ` A{¬P,¬Q} ` A so ¬P ` ¬Q⇒ A

Finally we get both ` P ⇒ A and ` ¬P ⇒A , and can conclude ` A by (Inev).

Actually, from the above proof, the proof of thecomposition lemma and the proof of the deductiontheorem, we could write an algorithm for teachinga computer how to construct a derivation for anygiven tautology of our language. (Most such proofs,however, would be several thousand steps long.)

Corollary: if B1, . . . ,Bn � A , thenB1, . . . ,Bn ` A . For every valid argument,there is a deduction for it in our very minimalSystem L.

Proof:See the proof of the converse of the above, givenas a corollary to Soundness on p. 23, and run it inthe other direction. e

Corollary: For any wU A , ` A iU � A . (Alland only tautologies are theorems of L.)

Proof: Combine Soundness and Completeness.

I. Independence of the Axioms

You might think that in establishing soundness andcompleteness, we have shown that System L is

exactly what it was intended to be: all and onlylogical truths are provable in it, and all and onlyvalid arguments have proofs in it. The only way inwhich it might be criticized if it is were redundant,i.e., if it contained more axioms than necessary.Our task today it to show that we needed all threeaxiom schemata. We’ll show that it is impossible toderive any one of the axiom schemata as a theoremschema using only the other two axiom schemataand MP.

This may seem like a diXcult task: how doesone prove that something is not provable fromsomething else? Consider: if we expanded L byadding axioms that are not tautologies, then obvi-ously, we could prove that the new axioms wereindependent because everything provable from theaxioms of L alone is a tautology. We can’t use thismethod to establish the independence of A1 fromA2 and A3, however, because all are tautologies.

Instead, we focus on a diUerent, made-up prop-erty a wU can have, called selectness.

Definition: A schmuth-value assignment isany function mapping statement letters of the lan-guage of propositional logic to the set {0, 1, 2}.

In eUect, such an assignment is something that as-signs either 0, 1 or 2 to each statement letter. Thisis rather like a truth-value assignment, which mapsstatement letters to T and F, except that here thereare more possibilities.

Indirectly a schmuth-value assignment deter-mines a schmuth-value for complex wUs accordingto the following charts:

A ¬A0 11 12 0

A B A ⇒ B0 0 00 1 20 2 21 0 21 1 21 2 02 0 02 1 02 2 0

These charts allow us to construct schmuth tables.Let us see what one looks like for an instance of(A1).

25

Page 29: Mathematical Logic I

A ⇒ (B ⇒ A)0 0 0 0 00 2 1 2 00 0 2 0 01 0 0 2 11 0 1 2 11 2 2 0 12 0 0 2 22 0 1 0 22 0 2 0 2

The Vnal schmuth-value of this formula for eachschmuth-value assignment is given underneaththe main connective (the Vrst “⇒”). Here we seethat this wU is schmtingent, i.e., it has diUerentschmuth-values for diUerent schmuth-value assign-ments.

Contrast this with the possible instances of(A3):

(¬A ⇒¬B) ⇒ ((¬A ⇒B)⇒A )1 0 2 1 0 0 1 0 2 0 0 01 0 2 1 1 0 1 0 2 1 0 01 0 2 0 2 0 1 0 0 2 0 01 1 2 1 0 0 1 1 2 0 0 11 1 2 1 1 0 1 1 2 1 0 11 1 2 0 2 0 1 1 0 2 2 10 2 2 1 0 0 0 2 0 0 2 20 2 2 1 1 0 0 2 2 1 0 20 2 0 0 2 0 0 2 2 2 0 2

Instances of (A3) are schmtologies, i.e., have theschmuth-value 0 for any possible schmuth-valueassignment.

Definition: We say that a wU is select if and onlyif it is a schmtology, i.e., it has schmuth-value 0 forany possible schmuth-value assignment.

Similarly, all the instances of (A2) are select. I’llspare you the 27 row table. You’ll just have to takemy word for it.

Result: Modus ponens preserves selectness.

Proof:Suppose A is select, i.e., has schmuth value 0 forevery possible schmuth-value assignment. Simi-larly, suppose that A ⇒ B is select, i.e., has 0 forevery possible schmuth-value assignment. ThenB must be select as well. We can see this by theschmuth table rules for ‘⇒’. If B were not select,then it would have 1 or 2 as value for some assign-ment. If so, then A ⇒ B and A could not bothbe select, because A ⇒ B has value 2 when Ahas 0 and B gets 1 or 2 as value. e

Result: Axiom schema (A1) is independent of(A2) and (A3).

Proof:Suppose we had an axiom system in which ouronly axiom schemata were (A2) and (A3) and ouronly inference rule were modus ponens. If so, thenevery theorem of the system would be select, sincethe axioms are select and everything derived fromselect wUs by MP is also select. Because someinstances of (A1) are not select, this means someinstances of (A1) would not be theorems of this sys-tem. Hence, not all instances of (A1) are derivablefrom (A2), (A3) and MP alone. e

A similar procedure can show that (A2) is inde-pendent of (A1) and (A3). We again consider func-tions assigning one of {0, 1, 2} to each statementletter, but instead use the diUerent rules below forcomplex wUs:

A ¬A0 11 02 1

A B A ⇒ B0 0 00 1 20 2 11 0 01 1 21 2 02 0 02 1 02 2 0

26

Page 30: Mathematical Logic I

We then deVne a notion of grotesqueness. A com-plex wU is grotesque if and only if it comes outwith value 0 using these revised rules for any pos-sible assignment of 0, 1 or 2 to all its statementletters.

It turns out that all instances of (A1) and (A3)are grotesque, but some instances of (A2) are not.Modus ponens preserves grotesqueness. So (A2) isindependent of (A1) and (A3).

For homework, you’ll be proving the indepen-dence of (A3) from (A1) and (A2). Relatively, that’sthe easiest, since it doesn’t require three values,and can be done with assignments into {0, 1}, pro-vided that one changes the rule governing how thevalue for ¬A is determined by the value of A .

These independence results establish that thereis no redundancy in our axiom schemata; wecouldn’t simply remove one of them and be leftwith a complete system.

In one sense, our system is “as minimal as pos-sible,” but in another sense it isn’t. We can’t simplyremove any of the ones we have, but we could startwith completely diUerent axiom schemata. Several“rival” axiomatizations are possible; you can Vnd alist of some of them in your book pp. 45–46. Ax-iomatizations have been found in which there isonly one axiom schema. Just like the decision touse both ‘⇒’ and ‘¬’ instead of ‘|’, however, thereare diminishing returns to minimalism. The proofsin such systems for even the most mundane re-sults often require an insane number of steps andinsanely complicated axioms.

However, in case you’re curious, the Vrst com-plete system for propositional logic using a singleaxiom schema was discovered by Jean Nicod in1917, and it uses the SheUer stroke instead of ‘⇒’and ‘¬’. An axiom is any instance of the singleschema:

(A | (B | C )) | ((D | (D | D)) |((E | B) | ((A | E ) | (A | E ))))

The only inference rule is: From A | (C | B) andA infer B.

Are you glad I didn’t make you use that sys-tem?

27

Page 31: Mathematical Logic I

UNIT 2

METATHEORY FOR PREDICATE LOGIC

A. The Syntax of PredicateLogic

Onwards and upwards. Our Vrst task is to describeour new language.

Definition: An individual variable is one of thelowercase letters ‘ x’, ‘ y’, or ‘ z’, written with or with-out a numerical subscript:

Examples: ‘x’, ‘x1’, ‘x12’, ‘y’, ‘y2’, ‘z’, ‘z13’, etc.

I use the unitalicized letters ‘x’, ‘y’ and ‘z’ as ob-ject language variables, and italicized letters inthe same range—‘x’, ‘y’, ‘z’ —as metalinguisticschematic letters for any object-language variables.Thus, e.g.,

(∀x)(Fx⇒ Gx)Schematically represents all of “(∀x)(Fx ⇒ Gx)”and “(∀y)(Fy ⇒ Gy)” and “(∀x3)(Fx3 ⇒ Gx3)”,and so on. The diUerence is subtle, and usually notso important to keep straight. After all, object lan-guage variables tend to be interchangeable; thesedo not mean anything diUerent. This is why I’m us-ing notation that does not emphasize the diUerence.Still, we do need a technical means for diUerentiat-ing between the two when it is necessary.

OXcially, Mendelson only uses ‘xn’, and not‘yn’ or ‘zn’, although he doesn’t stick to this. Hisvariables are always italicized; the only diUer-ence beween object language and metalanguageis whether a particular numerical subscript occurs,

or only a variable one: the diUerence between ‘x2’and ‘xi’. It simpliVes some things when we get tothe semantics to use only one letter ‘x’, but I Vndstatements with multiple variables much easier toread with ‘x’, ‘y’ and ‘z’ instead of ‘x1’, ‘x2’ and‘x3’.

Definition: An individual constant is one of thelowercase letters from ‘ a’ to ‘ e’, written with or with-out a numerical subscript.

Examples: ‘a’, ‘a2’, ‘b’, ‘c124’, etc.

Again, I use them unitalicized for object-languageconstants, and italicized, when (very rarely) I needto make a schematic statement about any constant.

Again, Mendelson only uses ‘an’.

Definition: A predicate letter is one of the upper-case letters from ‘A’ to ‘T ’, written with a numericalsuperscript ≥ 1, and with or without a numericalsubscript.

Examples: ‘A1’, ‘R2’, ‘H4’, ‘F 12 ’, ‘G

34’, etc.

Even when italicized, take these to be object lan-guage constants; script letters such as P are usedin their place schematically if need be.

The superscript indicates how many terms thepredicate letter takes to form a statement. A predi-cate letter with a superscript ‘1’ is called amonadicpredicate letter. A predicate letter with a super-script ‘2’ is called a binary or dyadic predicateletter.

28

Page 32: Mathematical Logic I

It is customary to leave these superscripts oUwhen it is obvious from context what they must be.E.g., “R2(a, b)” can be written simply “R(a, b)”.

OXcially Mendelson only uses ‘Amn ’.

Definition: A function letter is one of the lower-case letters from ‘f ’ to ‘ l’, written with a numericalsuperscript ≥ 1, and with or without a numericalsubscript.

Examples: ‘f 1’, ‘g2’, ‘h33’, etc.

The numerical superscript indicates how manyargument places the function letter has. A func-tion letter with a superscript ‘1’ is called amonadicfunction letter; a function letter with a superscript‘2’ is called a binary/dyadic function letter, etc.

Here too, it is customary to leave these super-scripts oU when it is obvious from context whatthey must be. E.g., “f 1(x)” can be written simply“f(x)”.

Definition: A term of the language is deVned re-cursively as follows:

(i) all individual variables are terms;(ii) all individual constants are terms;(iii) if F is a function letter with superscript n,

and t1, . . . , tn are terms, then F (t1, . . . , tn)is a term;

(iv) nothing that cannot be constructed by repeatedapplications of the above is a term.

Examples: ‘a’, ‘x’, “f(a)”, “g(x, f(y))”, etc.

As evinced above, I use italicized lowercase lettersfrom later on in (but not the end of) the alphabet,such as ‘t’, ‘r’, etc., schematically for any terms.

Definition: An atomic formula is any expressionof the form P(t1, . . . , tn) where P is a predicateletter with superscript n, and t1, . . . , tn are all terms.

Examples: “F 1(a)”, “F 1(f(x))”, “R34(a, b, c)”,

“H4(x, b, y, g(a, x))”, etc.

If you used Hardegree’s Intermediate textbook, youmay be used to using hard brackets ‘[’ and ‘]’ in-stead of soft brackets for atomic formulas. Mendel-son uses soft brackets for both, as does almost

everyone else. Here, I follow Mendelson, thoughI’ll put hard brackets to another use in a minute.

However, I adopt the convention that if theterms in an atomic formula contain no function let-ters, the parentheses and commas may be removed.

Examples: “Fx” is shorthand for “F 1(x)”, and“Rab” is shorthand for “R2(a, b)”.

Definition: A well-formed formula (wU) is re-cursively deVned as follows:

(i) any atomic formula is a wU;(ii) if A is a wU, then ¬A is a wU;(iii) if A and B are wUs, then (A ⇒ B) is a wU;(iv) If A is a wU and x is an individual variable,

then ((∀x) A ) is a wU;(v) nothing that cannot be constructed by repeated

applications of the above is a wU.

Mendelson puts parentheses around quantiVers.Other notations for “(∀x)” include “(x)”, “∀x”, and“∧x”. Again, you can use whatever notation you

want, and I might not even notice.We continue to use the same conventions as

last unit for dropping parentheses. In Mendelson’spractice, the quantiVer is taken to fall between ¬,∧, ∨ and⇒ ,⇔ in the ranking. In other words:

(∀x)Fx⇔ Ga

abbreviates:

(((∀x)Fx)⇔ Ga)

Whereas:(∀x)Fx ∨ Ga

abbreviates:

((∀x)(Fx ∨ Ga))

This is unusual on Mendelson’s part and I shallavoid making use of this convention in cases simi-lar to this last one.

The existential quantiVer is introduced by def-inition. The deVnitions for the other connectivesremain unchanged.

29

Page 33: Mathematical Logic I

Abbreviations:

((∃x) A ) abbreviates ¬((∀x)¬A )(A ∧ B) abbreviates ¬(A ⇒ ¬B)(A ∨ B) abbreviates (¬A ⇒ B)

(A ⇔ B) abbreviates¬((A ⇒ B)⇒ ¬(B ⇒ A ))

Definition: A Vrst-order language is any logi-cal language that makes use of the above deVnitionof a wU, or modiVes it at most by restricting whichconstants, function letters and predicate letters areutilized (provided that it uses at least one predicateletter). E.g., a language that does not use functionletters still counts as a Vrst-order language.

Free and Bound Variables

Definition: When a quantiVer (∀x) occurs as partof a wU A , the scope of the quantiVer is deVned asthe smallest part ((∀x) B) of A such that ((∀x) B)is itself a wU.

Definition: If x is a variable that occurs within awU A , then an occurrence of x in A is said to bea bound occurrence iU it occurs in the scope of aquantiVer of the form (∀x) within A ; otherwise, theoccurrence of x is said to be a free occurrence.

Examples:1. All three occurrences of ‘x’ in

“(∀x)(Fx⇒ Fx)” are bound.2. The (solitary) occurrence of ‘x’ in

“Fx⇒ (∀y)Gy” is free.

Definition: A variable x that occurs within a wUA is said to a bound variable, or simply bound, iUthere is at least one bound occurrence of x in A .

Definition: A variable x that occurs within a wUA is said to be a free variable, or simply free, iUthere is at least one free occurrence of x in A .

Notice that ‘x’ is both bound and free in “Fx ⇒(∀x)Gx”, because some occurrences are bound andone is free.

Definition: A wU A is said to be closed iU Acontains no free variables; otherwise A is said to beopen.

Open formulas may be very unfamiliar to some ofyou. In Hardegree’s books, you never see some-thing like “Fx” by itself as a line of a proof: there,variables are always bound by quantiVers. Onlyconstants appear on their own.

To not be thrown by wUs including free vari-ables, try not to equate the notion of a true/falsesentence with the notation of a wU. In fact:

Definition: A sentence is a closed wU.

Normally, we’ll only call sentences “true” or “false”.For wUs containing free variables, we say they aresatisVed by some values of the variables, and notsatisVed by others.

In a derivative sense, however, we’ll say that anopen wU is “true” iU any values we choose for thefree variable(s) would satisfy them. However, theseare semantic issues and we’re still doing syntax.

Examples:1. If ‘R2’ means “. . . is taller than . . . ”, and ‘b’ is an

individual constant standing for Barack Obama,then the open wU, “R2(x, b)”, is satisVed by allvalues of the variable that are things taller thanObama, and in our derivative sense, we say thatthis wU is not true because it is not satisVed byevery value of the variable.

2. The open wU “R2(x, b)⇒ R2(x, b)”, (whateverour interpretation) is satisVed by all values ofthe variable, and, derivatively, is regarded astrue.

Why do we need both “(∀x)R2(x, b)” and“R2(x, b)”? This will become clearer when we getto the system of deduction. (Actually this is in parthistorical accident; axiom systems that do withoutfree variables have been devised, but they are morecomplicated.)

The diUerence is roughly the same as betweenany and all.

Definition: If A is a wU, t is a term and x is avariable, then t is free for x in A iU no free oc-currence of x in A lies within the scope of somequantiVer (∀y) where y is a variable occurring in t .

30

Page 34: Mathematical Logic I

Basically, this means that if you substitute t forall the free occurrences of x within A , you won’tend up inadvertently “binding” any of the variablesthat happen to occur in t.

Examples:1. ‘a’ is free for ‘x’ in “(∀y)Gy⇒ Gx”.2. “f 2(x, z)” is free for ‘x’ in “(∀y)Gy⇒ Gx”.3. ‘z’ is not free for ‘x’ in “(∀y)Gy⇒ (∀z)Rxz”.4. “f 2(a, z)” is not free for ‘x’ in

“(∀y)Gy⇒ (∀z)Rxz”.5. “f 2(a, z)” is free for ‘x’ in

“(∀y)Gy⇒ (∀z) (∀x)Rxz”.6. All terms are free for ‘x’ in “(∀y)Gy”.7. All terms are free for ‘y’ in “(∀y)Gy”.

I write A [x] for an arbitrary wU that may or maynot contain x free. If the notation A [t] is used inthe same context, it means the result of substitutingthe term t for all free occurrences of x (assumingthere are any) in A [x].

Examples:1. If A [x] is “Fx”, then A [y] is “Fy”.2. If A [x] is “(∀y)R(y, x)” then A [f(b)] is

“(∀y)R(y, f(b))”.3. If A [z] is “F z⇒ Gz”, then A [d] is “Fd⇒ Gd”.4. If A [x] is “Fx ⇒ (∀x)Gx”, then A [d] is

“Fd⇒ (∀x)Gx”.5. If A [x] is “Fa” then A [y] is “Fa”.

Mendelson writes A (x) and A (t) instead of A [x]and A [t]. I think my notation makes it clearerthat these signs are parts of the metalanguage, andthat the parentheses that appear here are not theparentheses used in atomic formula or in functionterms.

Similarly, I write A [x, y] for an arbitrary wUthat may or may not contain x and y free, and inthe same context I use A [t, s] for the result of sub-stituting t for all free occurrences of x, and s forall free occurrences of y, in A [x, y].

Examples:1. If A [x, y] is “Rxy”, then A [a, b] is “Rab”.2. If A [x, y] is “(∀z)(Rzx ∧ Ryz)”, then A [a, b]

is “(∀z)(Rza ∧ Rbz)”.

B. The Semantics of PredicateLogic

Over the next couple weeks, we’ll introduce anaxiomatic system of deduction for predicate logic,and prove it complete, consistent and sound, justlike we did for propositional logic. In other words,we’ll prove that every theorem is logically true,and that every logically true wU is a theorem. But,what is a logically true wU in predicate logic?

In propositional logic, a logically true wU isjust a tautology, and we had a decision procedurefor determining which wUs are logically true andwhich are logically false, and which arguments arevalid and which are invalid, viz., truth tables.

However, you may never have learned any-thing similar for determining whether a given pred-icate logic statement is valid or invalid accordingto its semantics (i.e., its form and the meaning of itslogical constants). We don’t teach it here at UMassin Intro or Intermediate Logic. We must make upfor this glaring omission immediately!

The notion of a “truth-value assignment” frompropositional logic is replaced with the notion ofan interpretation or model.

Definition: An interpretation M consists of thefollowing four things:1. The speciVcation of some non-empty set D to

serve as the domain of quantiVcation for thelanguage.This set is the sum total of entities the quanti-Vers are interpreted to “range over”. The domainmight include numbers only, or people only, oranything else you might imagine. The domainof quantiVcation is sometimes also known asthe universe of discourse.

2. An assignment, for each individual constant inthe language, some Vxed member of D for whichit is taken to stand.For a given constant c, this member is denotedin the metalanguage by “(c)M”.

3. An assignment, for each predicate letter with su-perscript n in the language, some subset of Dn.That is, the interpretation assigns to each predi-cate letter a set of n-tuples from D.

31

Page 35: Mathematical Logic I

For a given predicate letter Pn, this set is de-noted in the metalanguage by “(Pn)M”. Thisset can be thought of as the extension of thepredicate letter under the interpretation.

4. An assignment, for each function letter with su-perscript n in the language, some n-place opera-tion on D.In other words, each function letter is assigneda set of ordered pairs, the Vrst member of whichis itself some n-tuple of members of D, and thesecond member of which is some member ofD. This set of ordered pairs is a function, sofor each n-tuple in its domain, there is a uniqueelement of D in its range. So if D is the setof natural numbers, a two-place function let-ter F 2 might be assigned the addition opera-tion, i.e., the set of all ordered pairs of the form〈〈n,m〉, p〉 such that n+m = p. This operationcan be thought of as the mapping, or function-in-extension, represented by the function letterunder the interpretation.

In a sense, the four parts of a model Vx the mean-ings of the quantiVers, constants, predicate lettersand function letters, respectively. (Or at the veryleast, they Vx as much of their meanings as is rele-vant in an extensional logical system such as Vrst-order predicate logic.) This leaves only somethingto be said about variables.

Sequences

Each model is associated with a certain domain oruniverse of discourse. Variables are allowed to takediUerent values within that domain. A variableis given a value by what is called a (denumerable)sequence.

Definition: A denumerable sequence or vari-able assignment for domain D is a function whosedomain is the set of positive natural numbers, andwhose range is a subset of D.

What does this have to do with assigning values tothe variables?

We Vrst note that while there are an inVnitenumber of variables (since we can always use dif-ferent subscripts), we can arrange them in a Vxed

order and assign them a numbered position in thatordering. I will utilize the ordering:

x, y, z, x1, y1, z1, x2, y2, z2, . . .

For any given variable x, its position in this ordercan be determined with the formula:

p = 3n+ k

Where p is the number of the position, n is thenumber of the subscript on the variable (or 0 if ithas none), and k is either 1, 2 or 3 depending onwhether the letter used is ‘x’, ‘y’ or ‘z’.

(Because your book does not use ‘y’ or ‘z’ oX-cially, it can simply order the variables accordingto their subscripts.)

For each interpretation M, there will be some(usually inVnite) number of sequences of the ele-ments of its domain D. A denumerable sequencecan be thought of as an ordered list of elements ofthe domain D that has a beginning but has no end,in which the members of D are arranged in anyorder, with or without repetition or patterns.

So if D is the set containing the members ofthe Beatles, each of the columns below representsa diUerent denumerable sequence.

s1 s2 s3 s41 John John Ringo Ringo2 John Paul Paul Ringo3 John George John Ringo4 John Ringo John Paul5 John John Ringo Ringo6 John Paul George Ringo7 John George John Ringo8 John Ringo George Paul9 John John Ringo George...

......

......

There are inVnitely many more such sequences inaddition to those listed.

Think of each member of a sequence as thevalue of the variable that occupies the correspond-ing position. The variable ‘x’ is correlated with theVrst position in such sequences; ‘y’ is correlatedwith the second position and so on. So s1 makesJohn the value of every variable. s2 diUerent Beat-les the values of diUerent variables in a patternedway. And s3 does so in an unpatterned way.

32

Page 36: Mathematical Logic I

Therefore, each sequence correlates every vari-able of the language with a member of the domainof that interpretation. For a given variable x andsequence s, in the metalanguage, we use “s(x)” todenote the member of D which s correlates withx. Hence, s3(‘y’) is Paul. Given the assignmentmade for the constants and function letters in M,derivatively, each sequence correlates every termof the language with a member of D. If c is anindividual constant, then let s(c) be (c)M. Then forfunction terms, let s(F (t1, . . . , tn)) be the entity εin D such that 〈〈s(t1), . . . , s(tn)〉, ε〉 ∈ (F )M.

A sequence acts just as an assignment of valuesto the variables. Within a given interpretation M,an open wU might be satisVed by some and notothers.

Satisfaction

Definition: The notion of satisfaction is deVnedrecursively. For a given interpretation M with domainD:

(i) If A is an atomic wU P(t1, . . . , tn), then se-quence s satisVes A iU the ordered n-tupleformed by those entities in the domain D thats correlates with t1, . . . , tn is in the extensionof P for interpretation M, or more precisely,〈s(t1), . . . , s(tn)〉 ∈ (P)M.

(ii) Sequence s satisVes a wU of the form ¬A iU sdoes not satisfy A .

(iii) Sequence s satisVes a wU of the form (A ⇒B) iU either s does not satisfy A or s doessatisfy B.

(iv) Sequence s satisVes a wU of the form (∀x) AiU every sequence s∗ that diUers from s at mostwith regard to what entity of D it correlateswith the variable x satisVes A .

Roughly speaking, each sequence assigns a mem-ber of D to each free variable, and from there, onecan determine whether or not the wU is satisVedby that variable assignment as one would expect.

The notion of satisfaction is important becauseit is used to deVne the notion of truth. (This is theheart of Tarski’s formal semantics.)

Truth and other Semantic Notions

The truth of a wU of predicate logic is relative toan interpretation, just like whether or not a wUof propositional logic is true is relative to a giventruth-value assignment.

Definition: A wU A is true for interpretationM iU every denumerable sequence one can form fromthe domain D of M satisVes A .

Abbreviation: Used in the metalanguage:

�M A

Means that A is true for M. (The subscript on � isnecessary here.)

Definition: A wU A is false for interpretationM iU no denumerable sequence one can form fromthe domain D of M satisVes A .

Notice that while closed wUs are always either trueor false (but not both) for an interpretation M; openwUs can be neither.

However, for all wUs A , A is true for M iU¬A is false, and A is false iU ¬A is true.

Here are some other important consequencesof these deVnitions (elaborated upon in your book,pp. 61–64):

• Modus ponens preserves truth in an interpre-tation, i.e., for all interpretations M, if �M Aand �M (A ⇒ B) then �M B.

• If A is an open wU, and B is obtained fromA by binding all free variables of A withinitial quantiVers ranging over the wholewU, then for any interpretation M, �M AiU �M B.

• Every instance of a truth-table tautology istrue for every interpretation.

• If A is a sentence, then for every interpreta-tion M, either �M A or �M ¬A .

• If t is free for x in A [x], then any wU ofthe form (∀x) A [x] ⇒ A [t] is true for allinterpretations.

• If A does not contain x free, then any wU ofthe form (∀x)(A ⇒ B)⇒ (A ⇒ (∀x) B)is true for all interpretations.

33

Page 37: Mathematical Logic I

Definition: If M is an interpretation, and Γ is a setof wUs all of which are true for M, then M is called amodel for Γ.

Notice that every interpretation will be a model forsome sets of wUs.

So it is appropriate to equate the notion of amodel with the notion of an interpretation. Infact, the study of formal semantics for artiVciallanguages is sometimes called “model theory”. Itend to use the words “model” and “interpretation”interchangeably.

Definition: A wU A is said to be logically trueor logically valid iU A is true for every possibleinterpretation.

Abbreviation: The notation:

� A

(leaving oU any subscript) means that A is logi-cally valid. Because interpretations are analogousfor truth-value assignments in propositional logic,this deVnition is analogous to the deVnition of atautology given in our last unit; this is why thenotation “�” is appropriate.

What interpretations are possible? Do we knowhow many? (In a footnote, Mendelson, ratherdubiously, equates interpretations with “possibleworlds”. This is misleading in many ways, but itcan sometimes be helpful to think of it in this way.)

• It is impossible that both � A and � ¬A .

Definition: A wU A is said to be satisVable iUthere is at least one interpretation for which there isat least one sequence that satisVes A .

• Hence, � A iU ¬A is not satisVable, and� ¬A iU A is not satisVable.

Derivatively, a set of wUs Γ is said to be (mutually)satisVable iU there is at least one interpretation forwhich there is at least one sequence that satisVesevery member of Γ.

Definition: A wU A is said to be contradictoryiU it is not satisVable. Hence, A is contradictory iU� ¬A .

Definition: A wU A is said to logically imply awU B iU in every interpretation, every sequence thatsatisVes A also satisVes B.

Abbreviation: The notation:

A � B

means that wU A logically implies B.

Definition: A wU A is a logical consequence ofa set of wUs Γ iU in every interpretation, every se-quence that satisVes every member of Γ also satisVesA .

Abbreviation: Similarly

Γ � A

means that A is a logical consequence of Γ.

Definition: AwU A is said to be logically equiv-alent to a wU B iU in every interpretation M, Aand B are satisVed by the same sequences.

Abbreviation: The notation

A �� B

means that A and B are logically equivalent.

• It follows that A �� B iU A � B andB � A .

This rounds out our presentation of the importantsemantic concepts for predicate logic.

34

Page 38: Mathematical Logic I

C. Countermodels andSemantic Trees

If you are like many students, in your introduc-tory logic courses, you were taught the truth-tablemethod for showing the validity or invalidity ofan argument in propositional logic, but were nevertaught an analogous method for showing the inva-lidity of an argument in predicate logic. To be sure,you were probably taught a deduction system forpredicate logic; but such deductions can only beused to show that an argument is valid, not that anargument is invalid.

The deVnitions above tell us more or less whatthe process should be like. Just like showinga propositional logic argument to be invalid in-volves Vnding a truth-value assignment makingthe premises true and the conclusion false, the ap-propriate method for predicate logic involves Vnd-ing an interpretation containing a sequence thatsatisVes all the premises but not the conclusion.

Consider the following argument:

FaGa⇒ FaGa

The conclusion of this argument is not a logicalconsequence of its premises. The reason is thatthere are sequences in some interpretations thatsatisfy the premises but not the conclusion. All wehave to do is describe one.

Consider the model B in which (1) the domainof quantiVcation is the set {Britney Spears}, (2) theassignment to all constants, including ‘a’, is Brit-ney Spears, (3) all predicate letters are assigned anempty extension except the predicate letter ‘F 1’,which is assigned the extension {Britney Spears},(4) all function letters are assigned operations map-ping ordered n-tuples of the members of the set{Britney Spears} onto Britney Spears.

This model has only one sequence, that whichassigns every variable to Britney Spears. Call thissequence s. Since s(‘a’) ∈ (‘F 1’)B, s satisVes “Fa”.Hence, s also satisVes “Ga ⇒ Fa”. However, sdoes not satisfy “Ga”, since s(‘a’) /∈ (‘G1’)B. (Re-call that (‘G1’)B is ∅.)

Because there is at least one sequence in at least

one interpretation that satisVes all the premises ofthe argument but does not satisfy the conclusion,the conclusion is not a logical consequence of thepremises.

The same reasoning shows that the wU:

Fa⇒ ((Ga⇒ Fa)⇒ Ga)

is not logically valid.

Definition: A model in which there is a sequencethat does not satisfy a given wU A (or set of wUs Γ)is called a countermodel to A (or to Γ.)

In propositional logic, there is an eUective proce-dure that always identiVes a “counter truth-valueassignment” if one exists, or shows that there arenone. (Truth tables.) When a wU is logically valid,is there always a method for proving that there areno countermodels? If a wU is not valid, is therealways an eUective method for Vnding its counter-models?

As it turns out, no, there isn’t. There is amethod that works a lot of the time, but it isn’t al-ways eUective. This procedure involves construct-ing what are called semantic trees.

Semantic trees work very similarly to abbre-viated truth tables, i.e., those you do by simplyassuming that the complex wU is F and attempt-ing either to Vnd a truth-value assignment in linewith this assumption, or to show that no truth-value assignment ever could be in line with thisassumption.

To test whether a certain wU is satisVable wewrite ‘T’ next to it. To test whether it is a logicaltruth we write ‘F’ next to it to determine whether itis possible for it not to be satisVed. To test whethera certain group of wUs could be satisVable whileanother is not, we write ‘T’ next to those whichare to be satisVed and ‘F’ next to those which arenot. (This could be useful in testing the validity ofan argument.) The book does not write ‘T’s and‘F’s, but just wUs and their negations, but I think itbetter to stress the semantic nature of this exercise.(This is not meant as a replacement for a system ofdeduction.)

We then apply the rules below to the state-ments, depending on their main connectives. These

35

Page 39: Mathematical Logic I

break down how the satisfaction of a given wU de-pends on its parts, to see if the proposal is possible.When a certain possibility might be true in morethan one way, the tree branches to explore bothpossibilities.

Semantic Tree Rules

Here are the rules for the primitive connectives.

NegationsT (¬A )

...F A

F ¬A...

T A

ConditionalsT (A ⇒ B)

...

F A T B

F (A ⇒ B)...

T AF B

Universal QuantiVerT (∀x) A [x]

...T A [t1]

...T A [tn]

(for all closed terms ti occurring on this branch oftree)

F (∀x) A [x]...

F A [c](where c is some new constant unused in tree)

Afterwards, reapply rule for any “T (∀y) B” (or“F (∃y) B”) lines that were applied earlier on the

current branch.

Atomic FormulasT P(t1, . . . , tn)

Check to see whether F P(t1, . . . , tn) appearspreviously on branch. If so, close branch with 6.

If not, do nothing.

F P(t1, . . . , tn)Check to see whether T P(t1, . . . , tn) appearspreviously on branch. If so, close branch with 6.

If not, do nothing.

Strictly speaking, the above rules suXce, since wecould rewrite any wU containing other connectives,or the existential quantiVer, in unabbreviated form.However, it can be seen that the rules will be equiv-alent to the following additional tree rules. Youmay choose either to use or not use these.

Disjunctions

T (A ∨ B)...

T A T B

F (A ∨ B)...

F AF B

ConjunctionsT (A ∧ B)

...T AT B

F (A ∧ B)...

F A F B

36

Page 40: Mathematical Logic I

BiconditionalsT (A ⇔ B)

...

T AT B

F AF B

F (A ⇔ B)...

T AF B

F AT B

Existential QuantiVersT (∃x) A [x]

...T A [c]

(where c is some new constant unused in tree)Afterwards, reapply rule for any “T (∀y) B” (or

“F (∃y) B”) lines on the current branch.

F (∃x) A [x]...

F A [t1]...

F A [tn](for all closed terms ti occurring on this branch of

tree)

When a tree branches, you’re considering diUerentways of making good on your original assumption.“The current branch” is considered everything youcan reach by tracing upwards but not downwardsfrom the current location.

Rules can be applied in any order, but generally,it’s more helpful to apply other rules before the “T(A ⇒ B)” and “T (∀x) A [x]” rules.

If you continue this procedure, you will achieveone of three results:(1) Every branch of the tree will close.

In this case, the initial assumption turned outto be impossible. Therefore, � B. The treeitself can be transformed into a proof in themetalanguage that B has no countermodels.

(2) You will have applied the rules to every wU insome branch without it closing.In this case, the branch remaining open can beused to construct a model and sequence for theoriginal hypothesis. (This will be a counter-model to B if you assumed B was unsatisVed,etc.) Choose a domain with as many entities asthere closed terms on the branch, and assigneach term ti to one of the entities of the domain,(ti)M, and, for each n-place predicate letter Pon the branch, include 〈(t1)M, . . . , (tn)M〉 in(P)M iU the assumption “T P(t1, . . . , tn)” oc-curs on that branch. The described model willhave a sequence that satisVes all and only thosewUs that have a T next to them in the branch.

(3) You will be stuck in an inVnite loop of steps, andnever Vnish the tree. In this case, there is likelya model that will have a sequence that satisVesall the initial assumptions, but it may be onewith an inVnitely large domain. With creativeinsight, you may be able to determine what thismodel will be like, but there is no algorithm fordoing this.

By changing the initial assumption, we can usetrees also to test whether or not a sentence is con-tradictory (by, e.g., assuming T B at the start), orwhether two sentences are equivalent (by deter-mining whether their biconditional can be unsatis-Vable), and so on.

Examples:

1. Let us first use a tree to show that

(∀x)(Fx ⇒ Gx), (∀x)(Gx ⇒ Hx) ⊨ (∀x)(Fx ⇒ Hx)

We do this by exploring the possibility of a sequence satisfying the premises but not the conclusion, and show that this is impossible.


T (∀x)(Fx ⇒ Gx)
T (∀x)(Gx ⇒ Hx)
F (∀x)(Fx ⇒ Hx)
F (Fa ⇒ Ha)
T Fa
F Ha
T (Fa ⇒ Ga)
├─ F Fa  ✗
└─ T Ga
   T (Ga ⇒ Ha)
   ├─ F Ga  ✗
   └─ T Ha  ✗

The above can easily be transformed into a metalanguage proof. Such a proof would begin: "Suppose for reductio ad absurdum there is some sequence s in some model M such that s satisfies '(∀x)(Fx ⇒ Gx)' and '(∀x)(Gx ⇒ Hx)' but not '(∀x)(Fx ⇒ Hx)'. By the definition of satisfaction, there must be some other sequence s′, differing from s by at most what it assigns to the variable 'x', that does not satisfy '(Fx ⇒ Hx)'; let us call the entity s′ assigns to 'x', α. [α plays the role of 'a' in the tree, though we should not assume anything about the constant 'a'.] Any sequence that assigns α to 'x' will satisfy 'Fx' but not 'Hx' . . . ", and so on, matching the lines of the tree. Branching will result in a proof by cases in the metalanguage, where each case leads to a different contradiction.

2. We now show that

⊭ (((∃x)Fx) ∧ ((∃y)Gy)) ⇒ (∃x)(Fx ∧ Gx)

We do this by constructing a countermodel via a tree, assigning 'F' to the above wff.

F (((∃x)Fx) ∧ ((∃y)Gy)) ⇒ (∃x)(Fx ∧ Gx)
T ((∃x)Fx) ∧ ((∃y)Gy)
F (∃x)(Fx ∧ Gx)
T (∃x)Fx
T (∃y)Gy
T Fa
T Gb
F (Fa ∧ Ga)
├─ F Fa  ✗
└─ F Ga
   F (Fb ∧ Gb)
   ├─ F Fb
   └─ F Gb  ✗

Although two branches closed, one remains open. We can use this to construct a countermodel, M. Let D = {α, β}, ('a')ᴹ = α, ('b')ᴹ = β, ('F¹')ᴹ = {α}, ('G¹')ᴹ = {β}. In any such model, no sequence satisfies the above wff. Hence this wff is not logically valid.
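The open-branch recipe in case (2) above is mechanical enough to sketch in code. The following is only an illustration of my own (in Python, with made-up names like branch_T and branch_F), not part of Mendelson's apparatus: it reads off exactly the countermodel just described from the signed atomic wffs on the open branch.

```python
# A minimal sketch (not from the notes) of the open-branch recipe:
# read a model off the signed atomic wffs on an open branch.
# The names branch_T / branch_F are illustrative, not Mendelson's.

branch_T = [("F", ("a",)), ("G", ("b",))]   # "T P(t1,...,tn)" lines on the open branch
branch_F = [("G", ("a",)), ("F", ("b",))]   # "F P(t1,...,tn)" lines on the open branch

# Domain: one entity per closed term on the branch; each term denotes itself.
domain = sorted({t for _, args in branch_T + branch_F for t in args})

# Extension of each predicate letter: exactly the tuples with a "T" line.
extension = {}
for pred, args in branch_T:
    extension.setdefault(pred, set()).add(args)

def true_atom(pred, args):
    """Truth of a closed atomic wff in the constructed model."""
    return args in extension.get(pred, set())

# Sanity check: every "T" atom comes out true and every "F" atom false,
# as the recipe promises for an open (non-contradictory) branch.
assert all(true_atom(p, a) for p, a in branch_T)
assert not any(true_atom(p, a) for p, a in branch_F)
print(domain, extension)   # ['a', 'b'] {'F': {('a',)}, 'G': {('b',)}}
```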

3. For an example of an infinite tree, consider one attempting to show that

⊭ (∀x)(∃y)Rxy ⇒ Ga

which looks like this:

F (∀x)(∃y)Rxy ⇒ Ga
T (∀x)(∃y)Rxy
F Ga
T (∃y)Ray
T Rab
T (∃y)Rby
T Rbc
T (∃y)Rcy
T Rcd
T (∃y)Rdy
T Rde
...

Clearly, there is no end to this tree, but it's also pretty clear that it does describe a model. Let D = the set of natural numbers, ('R²')ᴹ be the less-than relation, ('G¹')ᴹ be the property of being odd, and the constants stand for the natural numbers in order, beginning with ('a')ᴹ = 0.


Mendelson goes so far as to give metatheoretic proofs that whenever a tree closes, the wff(s) in question is (are) unsatisfiable (or logically true, if F was assumed), and that the appropriate kind of model exists if the tree doesn't close, and so on.

D. An Axiom System

As our system of deduction for predicate logic, we introduce the following. (Below, 'A', 'B', and 'C' are used as schematic letters representing wffs, 'x' as a schematic letter for individual variables, and 't' for terms. 'Γ' is used as a metalinguistic variable ranging over sets of object-language wffs.)

The First-Order Predicate Calculus (System PF)¹

Definition: An axiom of PF or logical axiom is any wff of one of the following five forms:

(A1) A ⇒ (B ⇒ A)
(A2) (A ⇒ (B ⇒ C)) ⇒ ((A ⇒ B) ⇒ (A ⇒ C))
(A3) (¬A ⇒ ¬B) ⇒ ((¬A ⇒ B) ⇒ A)
(A4) (∀x)A[x] ⇒ A[t], for all instances such that t is free for x in A[x]
(A5) (∀x)(A ⇒ B) ⇒ (A ⇒ (∀x)B), for all instances such that A contains no free occurrences of x.
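As an aside (an illustration of my own, not from the notes): the proviso in (A4) that t be free for x blocks would-be instances in which the substituted term is captured by a quantifier. For example, y is not free for x in (∃y)Rxy, and the corresponding "instance"

(∀x)(∃y)Rxy ⇒ (∃y)Ryy

is not logically valid: read R as "less than" on the natural numbers and the antecedent is true while the consequent is false.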

Definition: The inference rules of PF are:

Modus ponens (MP): From (A ⇒ B) and A, infer B.

Generalization (Gen): From A infer (∀x) A .

Abbreviation:

Γ ⊢ A and simply ⊢ A

are defined as you might expect. In this unit, unless otherwise specified, '⊢' means '⊢PF'.

Definition: A first-order theory K is an axiomatic system in the language of predicate logic that can be obtained from the above by adding zero or more proper or non-logical axioms.

Proper axioms are added to represent the basic principles of a certain area of thought. E.g., we might form a first-order theory for the study of the solar system by using the constants 'a₁', ..., 'a₉' as the nine planets (and Pluto), 'b₁' for the sun, etc., using 'O²' for the orbiting relation, etc., and adding as axioms certain laws of physics stated in the language of predicate logic, and so on.

Note that every theorem schema of L corresponds to a theorem schema of PF (or any other first-order theory). Since L is complete, if A is a truth-table tautology, then ⊢PF A. You may cite this in your proofs by writing "Taut" as justification. Similarly, every derived rule of L corresponds to a derived rule of PF. You may make use of this in your derivations by using the abbreviations on p. 21, or simply writing "SL" [System L] as justification. Alternatively, you may utilize the notation used within your favorite natural deduction system for propositional logic, or abbreviate the names given with the derived rules listed in sec. 2.5 of your textbook.

All first-order theories, including the bare-bones PF, have the following additional derived rules.

Result (UI, ∀O or rule A4): (∀x)A[x] ⊢ A[t], where t is free for x in A[x]. (Universal instantiation.)

Proof:
Follows directly from (A4) by MP.

¹PF stands for "Full Predicate calculus", i.e., the calculus within a syntax including all possible constants, predicate letters and function letters. The "Pure Predicate calculus", PP, is the same, but excluding all constants or function letters from the syntax. Mendelson gives these abbreviations in Chapter 3. There are predicate calculi that are neither pure nor full.


Result (EG, ∃I or E4): A[t, t] ⊢ (∃x)A[t, x], where t is free for x in A[t, x]. (Existential generalization; the repetition of t here indicates that not all the occurrences of t need to change.)

Proof:
The following schema shows the object-language steps necessary.

1. A[t, t] ⊢ A[t, t]  (Premise)
2. A[t, t] ⊢ ¬¬A[t, t]  1 SL (DN)
3. ⊢ (∀x)¬A[t, x] ⇒ ¬A[t, t]  A4
4. A[t, t] ⊢ ¬(∀x)¬A[t, x]  2, 3 SL (MT)
5. A[t, t] ⊢ (∃x)A[t, x]  4 definition of ∃
∎

Result (Sub or Repl): A[x] ⊢ A[t], where t is free for x in A[x]. (The rule of substitution or replacement of free variables.)

Proof:
Schematically:

1. A[x] ⊢ A[x]  (Premise)
2. A[x] ⊢ (∀x)A[x]  1 Gen
3. A[x] ⊢ A[t]  2 UI
∎

E. The Deduction Theorem in Predicate Logic

The deduction theorem does not hold generally in the first-order predicate calculus PF, nor would we want it to. After all, in the semantics of predicate logic, it is not the case that ⊨ Fx ⇒ (∀x)Fx, and similarly in the system of deduction, while we have Fx ⊢ (∀x)Fx by Gen, we should not have ⊢ Fx ⇒ (∀x)Fx. We therefore state and prove the deduction theorem in the following restricted form:

Result (DT): If Γ ∪ {C} ⊢ A and in the proof B₁, ..., Bₙ of A from Γ ∪ {C}, no step is obtained by an application of Gen that both (i) is applied to a previous step that depends upon having C in the premise set, and (ii) uses a variable occurring free in C, then Γ ⊢ C ⇒ A.

Proof:
(1) Assume the complex antecedent of DT. We will show, using proof induction, that for every step Bᵢ in the proof B₁, ..., Bₙ of A from Γ ∪ {C}, it holds that Γ ⊢ C ⇒ Bᵢ. We are entitled to assume that we have already gotten Γ ⊢ C ⇒ Bⱼ for all steps Bⱼ prior to Bᵢ.

(2) Because Bᵢ is a step in the proof of A from Γ ∪ {C}, the cases we have to consider are that: (a) Bᵢ is a member of Γ, (b) Bᵢ is C, (c) Bᵢ is an axiom, (d) Bᵢ follows from previous steps in the proof by MP, and (e) Bᵢ follows from a previous step by an application of Gen obeying the restriction mentioned above. We consider each case.

Case (a). Bᵢ is a member of Γ. Hence Γ ⊢ Bᵢ, and by SL, Γ ⊢ C ⇒ Bᵢ.

Case (b). Bᵢ is C. Then C ⇒ Bᵢ is simply C ⇒ C, a simple tautology, whence Γ ⊢ C ⇒ Bᵢ.

Case (c). Bᵢ is an axiom. Hence ⊢ Bᵢ and by SL, ⊢ C ⇒ Bᵢ. A fortiori, Γ ⊢ C ⇒ Bᵢ.

Case (d). Bᵢ follows from previous members of the series by MP. Therefore there are previous members of the series Bⱼ and Bₖ such that Bⱼ takes the form Bₖ ⇒ Bᵢ. By the inductive hypothesis, we already have both Γ ⊢ C ⇒ Bₖ and Γ ⊢ C ⇒ (Bₖ ⇒ Bᵢ). By SL, Γ ⊢ C ⇒ Bᵢ.

Case (e). Bᵢ follows from a previous member of the series by an application of Gen obeying the restriction mentioned above. Therefore, there is a previous step Bⱼ such that Bᵢ takes the form (∀x)Bⱼ for some variable x. Because of the restriction, either obtaining Bⱼ did not depend on having C in the premise set, or C does not contain x free. In the first subcase, Γ ⊢ Bⱼ, and hence by Gen, we have Γ ⊢ (∀x)Bⱼ, i.e., Γ ⊢ Bᵢ. By SL, then, Γ ⊢ C ⇒ Bᵢ, as usual. In the second subcase, we first note that we have Γ ⊢ C ⇒ Bⱼ by the inductive hypothesis. By Gen, we obtain Γ ⊢ (∀x)(C ⇒ Bⱼ). Because C does not contain x free, as an instance of (A5) we have ⊢ (∀x)(C ⇒ Bⱼ) ⇒ (C ⇒ (∀x)Bⱼ). By MP, Γ ⊢ C ⇒ (∀x)Bⱼ, i.e., Γ ⊢ C ⇒ Bᵢ. ∎

Obviously, in the proof establishing that Fx ⊢ (∀x)Fx, Gen is applied to a step that both depends on Fx and makes use of a variable occurring free in Fx. (Note that invoking the "Sub" or "Repl" derived rule requires the same.) So we cannot conclude ⊢ Fx ⇒ (∀x)Fx.

However, such is not the case with the proof:

1. (∀x)Fx ⊢ (∀x)Fx  Premise
2. ⊢ (∀x)Fx ⇒ Fy  (A4)
3. (∀x)Fx ⊢ Fy  1, 2 MP
4. (∀x)Fx ⊢ (∀y)Fy  3 Gen

This we transform as follows:

1. ⊢ (∀x)Fx ⇒ (∀x)Fx  (Taut)
2. ⊢ (∀x)Fx ⇒ Fy  A4
3. ⊢ (∀x)Fx ⇒ ((∀x)Fx ⇒ Fy)  2 SL
4. ⊢ (∀x)Fx ⇒ Fy  1, 3 SL
5. ⊢ (∀y)((∀x)Fx ⇒ Fy)  4 Gen
6. ⊢ (∀y)((∀x)Fx ⇒ Fy) ⇒ ((∀x)Fx ⇒ (∀y)Fy)  A5
7. ⊢ (∀x)Fx ⇒ (∀y)Fy  5, 6 MP

Again, this is not the most elegant proof. (Getting line 4 by SL is silly, since it's an axiom, and indeed, the same one introduced at line 2.) However, that's what case (d) called for, and following the rote procedure always works.

From here on out, you can use this to shorten your proofs, but bear in mind the restrictions. You need to make sure that you don't apply it when you've used Gen on a variable appearing free in an assumption!

F. Doing without Existential Instantiation

The natural deduction rule of "Existential Instantiation" or "Existential Elimination" (EI, ∃O) recommends that from a given existentially quantified statement, one should infer the corresponding statement with the quantifier removed, and some new or unused constant in place of the variable. (Mendelson calls this "Rule C", for choice.) However, note that for most wffs A[x] it is not the case that

(∃x)A[x] ⊨ A[c]

for any constant c. Thus, e.g., we ought not have (∃x)Fx ⊢ F(c) for any constant c, even an unused one. Within an interpretation M, every constant c is assigned a fixed entity of the domain, viz., (c)ᴹ. There simply is no inferring that (c)ᴹ is in the extension of the predicate letter 'F', viz., ('F')ᴹ, simply on the assumption that something is. This so-called "rule" of natural deduction is logically invalid, and should be done away with. Luckily, we don't need it. Bearing in mind Exercise 2.32d from your homework, we have:

(New-DR) If A does not contain x free, then (∀x)(B ⇒ A) ⊢ (∃x)B ⇒ A.

With this, we have the following "conversion" from a pseudo-proof that uses EI to a proof that doesn't.

PSEUDO-PROOF:

(∃x)Fx ⊢* (∃x)(Fx ∨ Gx)

1. (∃x)Fx ⊢ (∃x)Fx  Premise
2. (∃x)Fx ⊢* Fa  1 EI/∃O
3. (∃x)Fx ⊢* Fa ∨ Ga  2 SL
4. (∃x)Fx ⊢* (∃x)(Fx ∨ Gx)  3 EG

CONVERSION:

(∃x)Fx ⊢ (∃x)(Fx ∨ Gx)

1. (∃x)Fx ⊢ (∃x)Fx  Premise
2. Fx ⊢ Fx  Premise
3. Fx ⊢ Fx ∨ Gx  2 SL
4. Fx ⊢ (∃x)(Fx ∨ Gx)  3 EG
5. ⊢ Fx ⇒ (∃x)(Fx ∨ Gx)  4 DT


6. ⊢ (∀x)(Fx ⇒ (∃x)(Fx ∨ Gx))  5 Gen
7. ⊢ (∃x)Fx ⇒ (∃x)(Fx ∨ Gx)  6 New-DR
8. (∃x)Fx ⊢ (∃x)(Fx ∨ Gx)  1, 7 MP

We will show that whenever a pseudo-proof is possible with EI/∃O, a conversion to a real proof similar to the above is always possible. Stated very simply: replace every step arrived at by EI with a premise similar to it, except containing a variable not occurring free in any lines of the pseudo-proof instead of the "new" constant. Continue the proof as normal, then push the new premise through with the deduction theorem. Generalize, and apply New-DR. Then, along with the existential statement to which you applied EI, and MP, you get the result. Let us state this result more formally.

Definition: Pseudo-derivability or ⊢*: Γ ⊢* B iff there is an ordered series of wffs A₁, ..., Aₙ where Aₙ is B, and for each step Aᵢ where 1 ≤ i ≤ n, either:
(a) Aᵢ is an axiom;
(b) Aᵢ is a member of Γ;
(c) there is some previous step in the series, Aⱼ, such that Aⱼ takes the form (∃x)C[x], and Aᵢ takes the form C[c], where c is a constant that does not occur in any previous step of the pseudo-proof, nor in B, nor in any premise in Γ (i.e., Aᵢ was derived by the pseudo-rule, EI);
(d) Aᵢ follows from previous steps by MP;
(e) Aᵢ follows from a previous step by Gen, but not using a variable that is free in some previous step of the series C[c] arrived at by EI.

Result: If Γ ⊢* B then Γ ⊢ B. (The non-necessity of EI.)

Proof:

(1) Assume Γ ⊢* B, and let A₁, ..., Aₙ be the steps of the pseudo-proof.

(2) Let (∃x₁)C₁[x₁], ..., (∃xₘ)Cₘ[xₘ] be the members of the pseudo-proof to which EI is applied (in order), and let C₁[c₁], ..., Cₘ[cₘ] be the results of these EI steps (in order).

(3) It is obvious that if we expand Γ by adding {C₁[c₁], ..., Cₘ[cₘ]}, we can prove B without EI, or, in other words, Γ ∪ {C₁[c₁], ..., Cₘ[cₘ]} ⊢ B, since we are simply adding the results of our EI steps to our premise set.

(4) Because no application of Gen is made to a free variable of Cₘ[cₘ] after it is introduced, by the deduction theorem we have:

Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]} ⊢ Cₘ[cₘ] ⇒ B

(5) Pick some variable y that does not occur free anywhere in the series A₁, ..., Aₙ (preferably xₘ). Replace cₘ with y everywhere in the proof for Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]} ⊢ Cₘ[cₘ] ⇒ B. The result will also be a proof. Notice that cₘ does not occur anywhere in the set Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]}, since it was new when we introduced it. So there is no reason it must be used rather than the variable.

(6) Hence, Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]} ⊢ Cₘ[y] ⇒ B.

(7) By Gen, Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]} ⊢ (∀y)(Cₘ[y] ⇒ B).

(8) Because y does not occur free in the proof, and B is Aₙ, B does not contain y free. Hence, by (New-DR), Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]} ⊢ (∃y)Cₘ[y] ⇒ B.

(9) Because Cₘ[cₘ] was arrived at in the original pseudo-derivation by EI on some wff of the form (∃xₘ)Cₘ[xₘ], which either is (∃y)Cₘ[y], or can be used to get it, it must be that Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]} ⊢ (∃y)Cₘ[y]. Thus, by MP, Γ ∪ {C₁[c₁], ..., Cₘ₋₁[cₘ₋₁]} ⊢ B.

(10) By the same procedure described in steps (5)–(9), we can eliminate Cₘ₋₁[cₘ₋₁] from the premise set, and so on, until we have eliminated everything except the members of Γ. Hence, Γ ⊢ B. ∎

This proof shows us that we don't need Existential Instantiation, very much like we don't need Conditional Proof. However, because we have this metatheoretic result, we now know that whenever we are able to carry out a pseudo-proof making use of the rule, we could transform it into a proof that does not make use of the rule. Hence, it is innocuous to pretend as if we do have such a rule. From here on out we allow ourselves to make use of "Rule C" or "EI" in our proofs, even though strictly speaking there is no such rule. We must be careful, however, to obey the restrictions for what counts as a "pseudo-derivation", as defined above. I like to mark the pseudo-steps with '⊢*' rather than '⊢'; you can stop using the '*' when the dummy constant no longer appears.

G. Metatheoretic Results for System PF

Result (Soundness): For all wffs A, if ⊢ A then ⊨ A.

Proof:
Every instance of the axiom schemata is logically valid. (This can be verified using semantic trees.) MP and Gen preserve logical validity. (In the case of Gen, note that a wff is logically valid iff it is satisfied by all sequences in all models. If an open wff is satisfied by all sequences in a model, then the corresponding wff with one of the variables bound with an initial quantifier will also be satisfied by all sequences in that model.) If ⊢ A, then A is derivable from the axioms by some finite number of steps of MP and Gen, each preserving validity, and hence, ⊨ A. ∎

Result (Consistency): There is no wff A such that both ⊢ A and ⊢ ¬A.

Proof:
Suppose for reductio ad absurdum that there is a wff A such that ⊢ A and ⊢ ¬A. By soundness, ⊨ A and ⊨ ¬A. In other words, every sequence in every model satisfies both A and ¬A. But a sequence satisfies ¬A iff it does not satisfy A, so any arbitrary sequence will both satisfy and not satisfy A, which is absurd. ∎

Ultimately, we also want to prove the completeness of PF. We'll get there, but we first need to prove a number of lemmas.

Result (Denumerability of wffs): The set of wffs of the language of predicate logic is denumerable, i.e., we can generate a one–one correspondence between the set of natural numbers and the set of wffs.

Proof:
All wffs are built up of the simple signs: '(', ',', ')', '⇒', '¬', '∀', as well as the individual constants, variables, predicate letters and function letters.

A. We define a function g that associates each simple sign with a different natural number.

(1) Let g('(') = 3, g(')') = 5, g(',') = 7, g('¬') = 9, g('⇒') = 11, and g('∀') = 13.

(2) If c is a constant, and n is the number of its subscript (if c has no subscript, then n = 0), then depending on which letter of the alphabet is used, let k be either 1, 2, 3, 4 or 5 (1 for 'a', 2 for 'b', etc.), and let g(c) = 7 + 8(5n + k).

(3) If x is a variable, and n is the number of its subscript (if x has no subscript, then n = 0), then depending on which letter of the alphabet is used, let k be either 1, 2, or 3 (1 for 'x', 2 for 'y' and 3 for 'z'), and let g(x) = 13 + 8(3n + k).

(4) If F is a function letter, and n is the number of its subscript (if F has no subscript, then n = 0) and m is the number of its superscript, then depending on which letter of the alphabet is used ('f' through 'l'), let k be one of 1 through 7, and let g(F) = 1 + 8(2^m · 3^(7n+k)).


(5) If P is a predicate letter, and n is the number of its subscript (if P has no subscript, then n = 0) and m is the number of its superscript, then depending on which letter of the alphabet is used ('A' through 'T'), let k be one of 1 through 20, and let g(P) = 3 + 8(2^m · 3^(20n+k)).

We can now define the value of g for formulas in virtue of its value for simple signs.

(6) Let (p₀, p₁, p₂, p₃, ...) be the sequence of prime integers in order starting with 2. (There is no greatest prime.) Hence p₀ = 2, p₁ = 3, p₂ = 5, and so on.

(7) Let μ₀μ₁μ₂···μᵣ be some string of signs from the syntax of predicate logic. It might be something ill-formed like "⇒)x∀a₁₂(", or it might be a well-formed formula like "(∀x₁)(F(x₁) ⇒ F(x₁))". Here, μ₀ is the first sign in the string, μ₁ is the second sign, and so on. For all such strings, let g(μ₀μ₁μ₂...μᵣ) = p₀^g(μ₀) · p₁^g(μ₁) · p₂^g(μ₂) · ... · pᵣ^g(μᵣ).

(8) For a given expression A, the number g(A) is called the Gödel number of A. Notice that because the Gödel numbers of the different simple signs are all different, so are the Gödel numbers of strings of signs, since for different strings, these numbers will have different prime factorizations.

(9) Let N − {0} be the set of natural numbers greater than zero, C the subset of natural numbers that are Gödel numbers of wffs, and W the set of all wffs. Consider now the function w(x) from N − {0} onto C, whose value for x as argument is the xth smallest natural number that is the Gödel number of a wff of predicate logic. Consider also the function s(x) from C onto W, whose value for any Gödel number of a wff is that wff. Then the function s(w(n + 1)) is a 1–1 correspondence between the set of natural numbers and the set of wffs. ∎
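As an illustration (a toy transcription of my own, not part of the notes or of Mendelson's text), the numbering just described is easy to compute for short strings. The sketch below implements clauses (1)–(3) for the six simple signs and for subscript-free constants and variables, and clauses (6)–(7) for strings.

```python
# A toy transcription (not from the notes) of clauses (1)-(3) and (6)-(7):
# Gödel numbers for the six simple signs, for subscript-free constants
# a..e and variables x, y, z, and for strings built out of such signs.

SIMPLE = {'(': 3, ')': 5, ',': 7, '¬': 9, '⇒': 11, '∀': 13}
CONSTANTS = {c: 7 + 8 * (5 * 0 + k) for k, c in enumerate('abcde', start=1)}  # n = 0
VARIABLES = {v: 13 + 8 * (3 * 0 + k) for k, v in enumerate('xyz', start=1)}   # n = 0

def g_sign(sign: str) -> int:
    """g for a single sign (subscript-free case only)."""
    for table in (SIMPLE, CONSTANTS, VARIABLES):
        if sign in table:
            return table[sign]
    raise ValueError(f"sign {sign!r} not handled in this toy version")

def primes():
    """Yield 2, 3, 5, 7, ... -- the sequence p0, p1, p2, ... of clause (6)."""
    n = 2
    while True:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def g_string(signs: str) -> int:
    """Clause (7): the product of p_i raised to the power g(i-th sign)."""
    total = 1
    for p, sign in zip(primes(), signs):
        total *= p ** g_sign(sign)
    return total

print(g_sign('x'))      # 21, since g('x') = 13 + 8(3*0 + 1)
print(g_string('¬x'))   # 2^9 * 3^21 = 5355700839936
```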

Corollary: The set of closed wffs is also denumerable.

Proof:
As above, with the set of closed wffs (and their Gödel numbers) substituted for W (and C).

We're inching closer to completeness. Before moving on, I want to make note of some differences between my proof of completeness and Mendelson's. Mendelson prefers to speak of different first-order theories. Remember that a first-order theory is an axiomatic system gotten by adding additional axioms to the axioms of PF. Really, talking about what theorems are provable in a given system K, where the additional axioms of K are the members of a set Γ, is equivalent to speaking about what is provable in the bare-bones system PF beginning with Γ as a set of premises, since clearly:

⊢K A iff Γ ⊢PF A

It's really a matter of taste whether we view the proof as being about different theories or as being about different premise sets. I prefer to speak about premise sets, since then we don't have to deal with any sense of '⊢' other than '⊢PF'. But the differences are trivial.

Before moving on, let us introduce some new metalinguistic definitions.

Definition: A set of wffs Γ is said to be consistent iff there is no wff B such that both Γ ⊢ B and Γ ⊢ ¬B. (Otherwise, Γ is inconsistent.)

Definition: A set of wffs Γ is said to be maximal iff for every closed wff B, either B ∈ Γ or ¬B ∈ Γ.

Definition: A set of wffs Γ is said to be universal iff for every wff B[x] that contains at most x free, if it is the case for all closed terms t that B[t] ∈ Γ, then (∀x)B[x] ∈ Γ.

We now move on to our next important Lemma on the way to completeness.

Result (LEL): If Γ is a consistent set of closed wffs, then there is a set of closed wffs ∆ such that: (a) Γ ⊆ ∆, (b) ∆ is consistent, (c) ∆ is maximal, and (d) ∆ is universal. (The Lindenbaum Extension Lemma.)


Proof:
(1) Assume that Γ is a consistent set of closed wffs.

(2) For convenience, we assume that none of the constants 'e', 'e₁', 'e₂', 'e₃', ..., etc., occur anywhere in the wffs in Γ. (If this assumption is not warranted, we could use another denumerable sequence of constants, e.g., the 'b's or the 'c's, or even add a new sequence of constants 'o', 'o₁', 'o₂', 'o₃', ..., to the language if need be.)

(3) By the denumerability of the set of closed wffs of the language, we can arrange them in an infinite sequence:

A₁, A₂, A₃, ..., etc.

Making use of this sequence, let us recursively define an infinite sequence of sets of wffs:

Γ₀, Γ₁, Γ₂, ..., etc.

as follows:
a) Let Γ₀ = Γ.
b) We define Γₙ₊₁ in terms of Γₙ in one of the following three ways:
(i) if Γₙ ∪ {Aₙ₊₁} is consistent, then let Γₙ₊₁ = Γₙ ∪ {Aₙ₊₁};
(ii) if Γₙ ∪ {Aₙ₊₁} is inconsistent and Aₙ₊₁ does not take the form (∀x)B[x], then let Γₙ₊₁ = Γₙ ∪ {¬Aₙ₊₁};
(iii) if Γₙ ∪ {Aₙ₊₁} is inconsistent and Aₙ₊₁ does take the form (∀x)B[x], then let Γₙ₊₁ = Γₙ ∪ {¬Aₙ₊₁} ∪ {¬B[eₓ]}, where 'eₓ' is the first member of the sequence 'e', 'e₁', 'e₂', 'e₃', ..., that does not occur in Γₙ.

(4) Let ∆ be the union of all of the members of the Γ-sequence (i.e., Γ₀ ∪ Γ₁ ∪ Γ₂ ∪ ... etc.)

(5) Obviously, Γ ⊆ ∆. This establishes part (a) of the consequent of the Lemma.

(6) Every member of the Γ-sequence is consistent. We prove this by mathematical induction.

Base step: Γ₀ is Γ, and it is consistent ex hypothesi.

Induction step: Suppose Γₙ is consistent. It follows that Γₙ₊₁ is consistent by a proof by cases:

Case (i): Γₙ₊₁ = Γₙ ∪ {Aₙ₊₁} and Γₙ ∪ {Aₙ₊₁} is consistent, so Γₙ₊₁ is consistent.

Case (ii): Γₙ₊₁ = Γₙ ∪ {¬Aₙ₊₁}, and Γₙ ∪ {Aₙ₊₁} is inconsistent.
• Hence there is some B such that both Γₙ ∪ {Aₙ₊₁} ⊢ B and Γₙ ∪ {Aₙ₊₁} ⊢ ¬B.
• By SL, Γₙ ∪ {Aₙ₊₁} ⊢ B ∧ ¬B.
• Because Aₙ₊₁ is closed, it follows by DT that Γₙ ⊢ Aₙ₊₁ ⇒ (B ∧ ¬B).
• But ⊢ ¬(B ∧ ¬B) by SL.
• By MT, Γₙ ⊢ ¬Aₙ₊₁.
• Suppose for reductio that Γₙ₊₁ is inconsistent.
• So there is some C such that Γₙ ∪ {¬Aₙ₊₁} ⊢ C and Γₙ ∪ {¬Aₙ₊₁} ⊢ ¬C.
• By reasoning parallel to the above, by SL and the deduction theorem, we also have Γₙ ⊢ Aₙ₊₁.
• So Γₙ is inconsistent.
• This contradicts the inductive hypothesis. Hence Γₙ₊₁ is consistent.

Case (iii): Γₙ₊₁ = Γₙ ∪ {¬Aₙ₊₁} ∪ {¬B[eₓ]}, Γₙ ∪ {Aₙ₊₁} is inconsistent and Aₙ₊₁ takes the form (∀x)B[x].
• By the same reasoning as in the previous case, Γₙ ⊢ ¬Aₙ₊₁.
• Suppose for reductio that Γₙ₊₁ is inconsistent.
• So there is some C such that Γₙ ∪ {¬Aₙ₊₁} ∪ {¬B[eₓ]} ⊢ C and Γₙ ∪ {¬Aₙ₊₁} ∪ {¬B[eₓ]} ⊢ ¬C.
• By SL, Γₙ ∪ {¬Aₙ₊₁} ∪ {¬B[eₓ]} ⊢ C ∧ ¬C.
• By DT, Γₙ ∪ {¬B[eₓ]} ⊢ ¬Aₙ₊₁ ⇒ (C ∧ ¬C).
• By MP, Γₙ ∪ {¬B[eₓ]} ⊢ C ∧ ¬C.
• Because Aₙ₊₁ is closed and it takes the form (∀x)B[x], the wff B[eₓ] is also closed.
• By DT, Γₙ ⊢ ¬B[eₓ] ⇒ (C ∧ ¬C).
• ⊢ ¬(C ∧ ¬C), and so by SL, Γₙ ⊢ B[eₓ].
• 'eₓ' is not included in Γₙ. Hence, we can replace 'eₓ' with the variable x throughout the proof for Γₙ ⊢ B[eₓ] and the result will also be a proof. Hence Γₙ ⊢ B[x].
• By Gen, Γₙ ⊢ (∀x)B[x], which is the same as Γₙ ⊢ Aₙ₊₁.
• So Γₙ is inconsistent, which contradicts the inductive hypothesis.
• Hence Γₙ₊₁ is consistent.

(7) It follows from (6) that ∆ is consistent.
a) Note that the Γ-sequence is constantly expanding: for all j and k such that j < k, Γⱼ ⊆ Γₖ. Crudely, ∆ can be thought of as the upper limit of the expansion.
b) So every finite subset of ∆ is a subset of some Γᵢ for some suitably large i.
c) However, every proof from ∆ has only a finite number of steps, and hence only makes use of a finite subset of ∆.
d) If there were some B such that both ∆ ⊢ B and ∆ ⊢ ¬B, for some suitably large i, it would have to be that both Γᵢ ⊢ B and Γᵢ ⊢ ¬B.
e) This is impossible because all the members of the Γ-sequence are consistent by (6).
f) Hence, ∆ is consistent.
g) This establishes part (b) of the consequent of the Lemma.

(8) ∆ is obviously maximal as well.
a) All closed wffs are members of the sequence A₁, A₂, ..., etc.
b) For each Aᵢ, either it or its negation is a member of Γᵢ, and Γᵢ ⊆ ∆.
c) So for all closed wffs A₁, A₂, ..., etc., either it or its negation is included in ∆.
d) This establishes part (c) of the consequent of the Lemma.

(9) Finally, ∆ is also universal.
a) We show this by reductio. Suppose otherwise, i.e., suppose that there is a wff B[x] that contains at most x free, such that for all closed terms t, B[t] ∈ ∆, but (∀x)B[x] ∉ ∆.
b) (∀x)B[x] is closed, so because ∆ is maximal, it must be that ¬(∀x)B[x] ∈ ∆.
c) Because (∀x)B[x] is closed, it also follows that (∀x)B[x] is a member of the A-sequence, i.e., (∀x)B[x] is Aₙ₊₁ for some number n.
d) Obviously, however, since (∀x)B[x] ∉ ∆, it follows that Γₙ₊₁ is not obtained from Γₙ using case (i).
e) Nor was it obtained using case (ii), since Aₙ₊₁ is of the form (∀x)B[x].
f) This leaves case (iii), so Γₙ₊₁ is Γₙ ∪ {¬Aₙ₊₁} ∪ {¬B[eₓ]}.
g) Hence for some x, ¬B[eₓ] ∈ Γₙ₊₁ and so ¬B[eₓ] ∈ ∆.
h) But by our assumption, it holds for all closed terms t that B[t] ∈ ∆.
i) All constants, 'eₓ' included, are closed terms, so B[eₓ] ∈ ∆.
j) Hence, both ∆ ⊢ ¬B[eₓ] and ∆ ⊢ B[eₓ].
k) However, this is impossible, because we have already shown ∆ to be consistent.
l) Our supposition has been shown to be impossible, hence ∆ is universal.
m) This establishes part (d) of the consequent of the Lemma.

(10) By suitably defining ∆, we have shown each of parts (a)–(d) of the consequent of the Lemma on the basis of the assumption of its antecedent. Hence, the Lemma is established. ∎

We've just shown that beginning with any consistent set of sentences, we can keep adding to it ad infinitum to get a maximally consistent set of sentences of the language.

We pause again for a new definition:


Definition: A model or interpretation M is a denumerable model iff its domain of quantification D is denumerable (as defined on p. 5).

Result (MCL): If ∆ is a consistent, maximal, and universal set of closed wffs, then there is at least one denumerable model for ∆. (The Maximal Consistency Lemma.)

Proof:
(1) Assume that ∆ is a consistent, maximal, and universal set of closed wffs. We can then describe a denumerable model M for ∆ using the following procedure.

(2) Essentially, we'll let all the closed terms of the language stand for themselves. (Another possible way of constructing a model would be to let each closed term stand for its Gödel number. However, let us proceed using the former method.)

(3) Let the domain of quantification D of M be the set of closed terms of the language of first-order predicate logic. Note that there are denumerably many closed terms, so M is a denumerable model.

(4) For each constant c, let (c)ᴹ be c itself. So, for example, ('a')ᴹ is 'a', ('b₁₂')ᴹ is 'b₁₂', etc.

(5) For each function letter F with superscript n, let (F)ᴹ be that n-place operation on D which includes all ordered pairs of the form 〈〈t₁, ..., tₙ〉, F(t₁, ..., tₙ)〉, i.e., the operation that has the closed term F(t₁, ..., tₙ) as value for 〈t₁, ..., tₙ〉 as argument.

Example: The operation ('f¹')ᴹ, which M assigns to the monadic function letter 'f¹', will contain such ordered pairs as 〈'a', "f¹(a)"〉, 〈'b₁₂', "f¹(b₁₂)"〉, and 〈"f¹(a)", "f¹(f¹(a))"〉, and so on.

(6) For each predicate letter P with superscript n, let (P)ᴹ be that subset of Dⁿ that includes the n-tuple 〈t₁, ..., tₙ〉 iff the atomic wff P(t₁, ..., tₙ) is included in ∆.

Example: The extension of 'F¹' under M, viz., ('F¹')ᴹ, will include the term 'a' just in case "F¹(a)" ∈ ∆, and will exclude 'a' just in case "¬F¹(a)" ∈ ∆, and so on.

(7) We must now prove that this interpretation M is a model for ∆, i.e., that for all wffs A, if A ∈ ∆, then ⊨M A. We will actually prove something stronger, i.e., that for all closed wffs A, A ∈ ∆ iff ⊨M A. (∆ only contains closed wffs, so we need not worry about open wffs.) We prove this by wff induction.

Base step: A is a closed atomic formula.
• Hence, A takes the form P(t₁, ..., tₙ) where P is a predicate letter with superscript n and t₁, ..., tₙ are closed terms.
• Because closed terms contain no variables, all sequences in M will associate each tᵢ with itself.
• So by the definition of satisfaction, all sequences in M will satisfy A iff 〈t₁, ..., tₙ〉 ∈ (P)ᴹ.
• By the definition of truth in an interpretation, ⊨M A iff 〈t₁, ..., tₙ〉 ∈ (P)ᴹ.
• By our characterization of M under (6) above, 〈t₁, ..., tₙ〉 ∈ (P)ᴹ iff P(t₁, ..., tₙ) ∈ ∆.
• So P(t₁, ..., tₙ) ∈ ∆ iff ⊨M A, i.e., A ∈ ∆ iff ⊨M A.

Induction step: Assume as inductive hypothesis that it holds for all closed wffs B with fewer connectives than A, that B ∈ ∆ iff ⊨M B. We will then show that it holds for the complex closed wff A that A ∈ ∆ iff ⊨M A. This proceeds by a proof by cases on the make-up of A.

Case (a): A takes the form ¬B, where B is also closed and has one fewer connective than A.
• By the inductive hypothesis, B ∈ ∆ iff ⊨M B.
• Because ∆ is consistent, if A ∈ ∆, then B ∉ ∆.
• Because ∆ is maximal, if B ∉ ∆, then A ∈ ∆.
• So B ∉ ∆ iff A ∈ ∆.


• Hence A ∈ ∆ iff not-⊨M B.
• Since B is closed, ⊨M ¬B iff not-⊨M B.
• Hence, A ∈ ∆ iff ⊨M ¬B, i.e., A ∈ ∆ iff ⊨M A.

Case (b): A takes the form B ⇒ C, where B and C are closed wffs with fewer connectives.

First we prove that if A ∈ ∆ then ⊨M A.
• Suppose A ∈ ∆.
• Since ∆ is maximal and consistent, B ∈ ∆ or ¬B ∈ ∆, but not both, and likewise with C.
• However, because B ⇒ C ∈ ∆ and ∆ is consistent, it cannot be that both B ∈ ∆ and ¬C ∈ ∆, so either ¬B ∈ ∆ or C ∈ ∆.
• By the inductive hypothesis, B ∈ ∆ iff ⊨M B, and C ∈ ∆ iff ⊨M C.
• By the same reasoning given for the previous case, ¬B ∈ ∆ iff ⊨M ¬B.
• So either ⊨M ¬B or ⊨M C.
• By the definition of satisfaction for conditionals, it follows that ⊨M B ⇒ C, i.e., ⊨M A.

Now we prove that if ⊨M A then A ∈ ∆.
• Suppose ⊨M A, i.e., ⊨M B ⇒ C.
• Because B and C are closed, by the definition of satisfaction for conditionals, we have either ⊨M ¬B or ⊨M C.
• By the inductive hypothesis, B ∈ ∆ iff ⊨M B and C ∈ ∆ iff ⊨M C.
• Again, by the reasoning given for the previous case, ¬B ∈ ∆ iff ⊨M ¬B.
• So either ¬B ∈ ∆ or C ∈ ∆.
• Because ∆ is maximal, either B ⇒ C ∈ ∆ or ¬(B ⇒ C) ∈ ∆.
• If ¬(B ⇒ C) ∈ ∆, then ∆ would be inconsistent, because ⊢ ¬(B ⇒ C) ⇒ B and ⊢ ¬(B ⇒ C) ⇒ ¬C.
• So B ⇒ C ∈ ∆, i.e., A ∈ ∆.

Putting these two results together, we get that A ∈ ∆ iff ⊨M A.

Case (c): A takes the form (∀x)B[x], where B[x] contains fewer connectives, and B[x] contains at most x free.

First we prove that if A ∈ ∆ then ⊨M A.
• Suppose A ∈ ∆, i.e., (∀x)B[x] ∈ ∆.
• Because B[x] contains at most x free, for all closed terms t, B[t] is a closed wff.
• Because ∆ is maximal, for all closed terms t, either B[t] ∈ ∆ or ¬B[t] ∈ ∆.
• However, since ∆ is consistent, it must be that for all closed terms t, B[t] ∈ ∆.
• By the inductive hypothesis, for all closed terms t, ⊨M B[t].
• Because the domain of quantification for M is D, and D consists of the set of closed terms, and every closed term is interpreted as standing for itself, a sequence of M will satisfy B[x] iff it satisfies B[t] for that closed term t that gets assigned to x in that sequence.
• Because all sequences of M satisfy B[t] for all closed terms t, all sequences of M will satisfy B[x], and hence all sequences of M will satisfy (∀x)B[x].
• Hence, ⊨M (∀x)B[x], i.e., ⊨M A.

We now prove that if ⊨M A then A ∈ ∆.
• Suppose ⊨M A, i.e., all sequences of M satisfy (∀x)B[x].
• Hence, all sequences of M satisfy B[x], regardless of what entity in the domain gets assigned to x.
• Because the domain of quantification for M is D, and D consists of the set of closed terms, and every closed term is interpreted as standing for itself, a sequence of M will satisfy B[x] iff it satisfies B[t] for that closed term t that gets assigned to x in that sequence.
• So, for all closed terms t, ⊨M B[t].
• By the inductive hypothesis, it follows that, for all closed terms t, B[t] ∈ ∆.
• Because ∆ is universal, it follows that (∀x)B[x] ∈ ∆, i.e., A ∈ ∆.

Putting these together, we get that A ∈ ∆ iff ⊨M A.

(8) By induction, regardless of A's length, A ∈ ∆ iff ⊨M A. So M is a model for ∆. This establishes the Lemma. ∎

If we follow Mendelson and think of a model as a sort of possible world, a maximally consistent set of sentences can be thought of as a maximally descriptive yet consistent description of a possible world. This lemma says roughly that for every maximally descriptive consistent description of a possible world, one exists for which that description is true.

Result (The Modeling Lemma): A set of closed wffs Γ is consistent iff it has a denumerable model (i.e., there is at least one denumerable model for Γ).

Proof:
This biconditional breaks down into:

(MLa) If a set of closed wffs Γ has a denumerable model, then Γ is consistent.

(MLb) If a set of closed wffs Γ is consistent, then Γ has a denumerable model.

Instead of proving (MLa) directly, we shall prove the following stronger thesis:

(MLa)* If a set of closed wffs Γ has any model, then Γ is consistent.

Proof of (MLa)* and (MLa):
(1) Assume the opposite for reductio ad absurdum. I.e., assume that Γ is a set of closed wffs, and there is at least one model M for Γ, but that Γ is inconsistent.
(2) Hence, there is some A such that Γ ⊢ A and Γ ⊢ ¬A.
(3) This means that A and ¬A are each derivable from the members of Γ along with the axioms of PF by zero or more applications of MP and Gen.
(4) All the axioms of PF are logically valid, and hence true in M.
(5) Similarly, all the members of Γ are true in M by hypothesis.
(6) However, both MP and Gen preserve truth in an interpretation, so it must be that both ⊨M A and ⊨M ¬A.
(7) By the definition of truth in an interpretation, every sequence in M satisfies both A and ¬A.
(8) However, a sequence satisfies ¬A iff it does not satisfy A, so any arbitrary sequence of M will both satisfy and not satisfy A, which is absurd.
(9) Hence (MLa)* must be true. Regardless of the size of the domain, any set of closed wffs that can be modeled is consistent. This includes those with denumerable models, so (MLa)* entails (MLa).

Proof of (MLb):
(1) Assume that Γ is a consistent set of closed wffs.
(2) By LEL, there is a set of closed wffs ∆ such that: (a) Γ ⊆ ∆, (b) ∆ is consistent, (c) ∆ is maximal, and (d) ∆ is universal.
(3) By MCL, there is an interpretation M that is a denumerable model for ∆.
(4) So for all closed wffs A, if A ∈ ∆, then ⊨M A.
(5) Because Γ ⊆ ∆, for all closed wffs A, if A ∈ Γ then A ∈ ∆.
(6) So for all closed wffs A, if A ∈ Γ then ⊨M A.
(7) Therefore, M is also a denumerable model for Γ. ∎

The following is not needed for completeness, but is an interesting and surprising result of (MLa)* and (MLb).

Corollary (The Skolem-Löwenheim Theorem): If a set Γ of closed wffs of first-order predicate logic has any sort of model, then it has a denumerable model.

Proof:
By the stronger (MLa)*, if Γ has any sort of model, then it is consistent. By (MLb), if it is consistent, it has a denumerable model.

Finally we turn to completeness:

Result (Completeness): For all wffs A, if ⊨ A then ⊢ A.

Proof:
(1) Suppose ⊨ A, but suppose for reductio ad absurdum that it is not the case that ⊢ A.
(2) Let B be the universal closure of A, i.e., if the free variables of A are x₁, ..., xₙ, then B is (∀x₁)...(∀xₙ)A.
(3) Universal closure preserves truth in an interpretation, so ⊨ B.
(4) B has no free variables left, so B is closed.
(5) The singleton set containing ¬B alone, {¬B}, must be consistent. Here's a proof of this by reductio:
a) Suppose there were some C such that {¬B} ⊢ C and {¬B} ⊢ ¬C.
b) By SL, {¬B} ⊢ C ∧ ¬C.
c) Since B is closed, so is ¬B, and so by DT, we have ⊢ ¬B ⇒ (C ∧ ¬C).
d) But ⊢ ¬(C ∧ ¬C), so by SL, ⊢ B.
e) But A is derivable from B by universal instantiation, so it would follow that ⊢ A, which contradicts our earlier assumption.
f) Hence, {¬B} is consistent.
(6) By the Modeling Lemma, {¬B} has a denumerable model. Hence there is an interpretation M such that ⊨M ¬B.
(7) But we also have ⊨ B, and hence ⊨M B.
(8) By the definition of truth in an interpretation, every sequence in M satisfies both B and ¬B.
(9) However, a sequence satisfies ¬B iff it does not satisfy B, so any arbitrary sequence of M will both satisfy and not satisfy B, which is absurd.
(10) We've shown our supposition to be impossible, thereby establishing completeness indirectly. ∎

Corollary: If Γ ⊨ A then Γ ⊢ A.

Proof:
Follows from minor modifications on the above proof. ∎

Unfortunately, this proof does not, as in the Propositional Calculus (System L), provide a "recipe" for constructing a proof of any given logical truth in PF. We have simply proven that any given logical truth must be derivable, because if it were not, there would exist a countermodel to its logical validity.

The completeness of the first-order predicate calculus was first proven by Kurt Gödel in 1930, and so this is sometimes called "Gödel's Completeness Theorem," although his way of proving it was actually very different from ours. (It was first proven our way by Leon Henkin in 1949.) However, Gödel is much more famous for his incompleteness theorems than his completeness theorem.

H. Identity Logic

To add identity to first-order predicate logic, we simply pick a 2-place predicate letter (say, 'I²') to use to stand for the identity relation, and make the appropriate additions to our logical system.


Syntax

Officially, the syntax is unchanged. We already had 'I²' as a predicate letter. We are simply fixing its intended meaning.

However, it is useful to introduce abbreviations such as the following.

Abbreviations:

t = u abbreviates I²(t, u)
t ≠ u abbreviates ¬I²(t, u)
(∃₁x)A[x] abbreviates (∃x)A[x] ∧ [(∀x)(∀y)(A[x] ∧ A[y] ⇒ x = y)], where y is the first variable not occurring in A[x].
(∃ₙ₊₁x)A[x] abbreviates (∃y)(A[y] ∧ (∃ₙx)(x ≠ y ∧ A[x])), where y is the first variable not occurring in A[x].

The above definition defines (∃₂x)A[x] in terms of (∃₁x)A[x], and (∃₃x)A[x] in terms of (∃₂x)A[x], and so on. (We could do even better by beginning with (∃₀x)A[x] for (∀x)¬A[x].)
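For concreteness (a worked unpacking of my own, just to illustrate how the abbreviations cascade): (∃₂x)Fx abbreviates (∃y)(Fy ∧ (∃₁x)(x ≠ y ∧ Fx)), which in turn unpacks to

(∃y)(Fy ∧ [(∃x)(x ≠ y ∧ Fx) ∧ (∀x)(∀z)((x ≠ y ∧ Fx) ∧ (z ≠ y ∧ Fz) ⇒ x = z)])

where z is the first variable not occurring in (x ≠ y ∧ Fx).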

Because the syntax is unchanged, the set of wffs and the set of closed wffs remain denumerable.

System of Deduction

Definition: The first-order predicate calculus with identity, or System PF=, is the system obtained from PF by adding the following axiom and axiom schema:

(A6) (∀x) x = x
(A7) x = y ⇒ (A[x, x] ⇒ A[x, y]), for all instances in which y is free for x in A[x, x], and A[x, y] is obtained from A[x, x] by replacing some, but not necessarily all, free occurrences of x with y.

Definition: A first-order theory with identity [equality] is any first-order theory that has all theorems of PF= formulable in its syntax as theorems (i.e., it is a theory built on PF in which (A6) is either an axiom or theorem, and all instances of (A7) are either axioms or theorems). This includes PF= itself.

Some easy theorems and derived rules:

Result (Ref=): ⊢PF= t = t, for any term t. (Reflexivity of identity.)

Proof:
Direct from (A6) by universal instantiation.

Result (LL/Sub=): t = u, A[t, t] ⊢PF= A[t, u], for all terms t and u that are free for all variables in A[x, y], and where A[t, u] arises from A[t, t] by replacing some or all occurrences of t with u. (Leibniz's law.)

Proof:
Derived from (A7) by Gen on both x and y, then universal instantiation to t and u, and MP × 2. It may be necessary to do some bound variable juggling, but this is no problem.

Result (Sym=): t = u ⊢PF= u = t, for any terms t and u. (Symmetry of identity.)

Proof:
t = u, t = t ⊢PF= u = t is an instance of LL, and we have ⊢PF= t = t by reflexivity.

Result (Trans=): t = u, u = v ⊢PF= t = v, for any terms t, u and v. (Transitivity of identity.)

Proof:
u = v, t = u ⊢PF= t = v is an instance of LL.

The deduction theorem and replacement for existential instantiation are unchanged by the addition.


Semantics for Identity Logic

We intend 'I²' to stand for the identity relation. So:

Definition: An interpretation M is a normal model iff, for M, ('I²')ᴹ is the set of all and only ordered pairs of the form 〈o, o〉 of objects o included in the domain of quantification D of M.

Definition: A wff A is identity-valid iff it is true for all normal models. I abbreviate this as: ⊨= A.

Note that all wffs that are logically valid simpliciter (⊨ A) are identity-valid (⊨= A), but not vice-versa. Note that (A6) and all allowed instances of (A7) are identity-valid. (Proving this is homework.)

Some Important Metatheoretic Results for PF= and other Theories with Identity

Result (Soundness): For all wffs A, if ⊢PF= A then ⊨= A.

(Proof is left as part of an exam question.)

Result (Consistency): There is no wff A such that ⊢PF= A and ⊢PF= ¬A.

(Proof is left as part of an exam question.)

Result: Any first-order theory K in which (A6) is an axiom or theorem, and all instances of (A7) in which A[x, x] is an atomic formula with no individual constants are either axioms or theorems, is a first-order theory with identity. (The possibility of reducing (A7).)

Proof:
(1) Assume that K is a first-order theory in which (A6) is an axiom or theorem, and all instances of (A7) in which A[x, x] is an atomic formula with no individual constants are either axioms or theorems. We shall prove that all instances of (A7) can be derived regardless of the complexity of A[x, x], by wff induction.

(2) Base step: A[x, x] is atomic. By hypothesis, (A7) is a theorem of K for all cases in which A[x, x] is an atomic formula with no individual constants. All others can be obtained by Gen and universal instantiation.

(3) Induction step: We assume that all instances of (A7) hold for instances of A[x, x] that are simpler than a given instance, and need to show that for the given instance of A[x, x], (A7) holds as well. This proceeds by a proof by cases of the possible make-up of the instance of A[x, x] in question.

Case (a): A[x, x] takes the form ¬B[x, x].
i) Let C[x] be B[z, x]. Clearly, C[x] is the same complexity as B[x, x].
ii) By the inductive hypothesis, we have this instance of (A7): ⊢K x = y ⇒ (C[x] ⇒ C[y]).
iii) By manipulating variables with Gen and UI, we get: ⊢K y = x ⇒ (C[y] ⇒ C[x]).
iv) Because we have atomic instances, we have: ⊢K x = y ⇒ (x = x ⇒ y = x), and so with (A6) and SL we get: ⊢K x = y ⇒ y = x.
v) So by SL: ⊢K x = y ⇒ (C[y] ⇒ C[x]).
vi) That is: ⊢K x = y ⇒ (B[z, y] ⇒ B[z, x]).
vii) By Gen on z and UI to x we get: ⊢K x = y ⇒ (B[x, y] ⇒ B[x, x]).
viii) By SL: ⊢K x = y ⇒ (¬B[x, x] ⇒ ¬B[x, y]), i.e., ⊢K x = y ⇒ (A[x, x] ⇒ A[x, y]).

Case (b): A[x, x] is of the form (B[x, x] ⇒ C[x, x]).
i) By the inductive hypothesis, we have: ⊢K x = y ⇒ (C[x, x] ⇒ C[x, y]).
ii) By the same procedure described in the previous case: ⊢K x = y ⇒ (B[x, y] ⇒ B[x, x]).
iii) By MP: x = y ⊢K B[x, y] ⇒ B[x, x].
iv) Similarly: x = y ⊢K C[x, x] ⇒ C[x, y].
v) Clearly, if we add B[x, x] ⇒ C[x, x] as a further premise, we could complete a syllogism, i.e.: x = y, B[x, x] ⇒ C[x, x] ⊢K B[x, y] ⇒ C[x, y].
vi) By DT × 2, we have: ⊢K x = y ⇒ ((B[x, x] ⇒ C[x, x]) ⇒ (B[x, y] ⇒ C[x, y])), which is: ⊢K x = y ⇒ (A[x, x] ⇒ A[x, y]).

Case (c): A[x, x] takes the form (∀z)B[x, x, z].
i) By the inductive hypothesis: ⊢K x = y ⇒ (B[x, x, z] ⇒ B[x, y, z]).
ii) By MP: x = y ⊢K B[x, x, z] ⇒ B[x, y, z].
iii) Hence, by UI and MP: x = y, (∀z)B[x, x, z] ⊢K B[x, y, z].
iv) By Gen: x = y, (∀z)B[x, x, z] ⊢K (∀z)B[x, y, z].
v) By DT × 2, we have: ⊢K x = y ⇒ ((∀z)B[x, x, z] ⇒ (∀z)B[x, y, z]), which is: ⊢K x = y ⇒ (A[x, x] ⇒ A[x, y]).

(4) Hence, regardless of the complexity of A[x, x], we have the appropriate instance of (A7). Therefore, all instances of (A7) are theorems of K.

(5) K is a first-order theory (one built by expanding PF by adding proper axioms). Hence K has all instances of (A1)–(A5) as axioms. (A6) is either an axiom or a theorem of K, and all instances of (A7) are theorems. Hence all axioms of PF= are theorems of K. Moreover, K has all the inference rules of PF, and hence all the inference rules of PF=.

(6) All theorems of PF= are derived from (A1)–(A7) by the inference rules. Therefore, all theorems of PF= are theorems of K.

(7) Therefore, K is a first-order theory with identity. ∎

Result: If M is a model for the set of axioms of PF=, then there is a normal model M* such that for all wffs A, ⊨M A iff ⊨M* A. (Contracting Models to Normal Models.)

Proof:
(1) Assume that M is a model for the set of axioms of PF=.

(2) It does not follow from this that M is a normal model, i.e., it does not follow that ('I²')ᴹ only consists of ordered pairs of the form 〈o, o〉 of objects o included in the domain of quantification D of M. However, we do know the following things about ('I²')ᴹ:
a) Because M makes (A6) true, ('I²')ᴹ must be a reflexive relation in the set-theoretic sense.
b) Because M makes the instance of (A7), x = y ⇒ (x = x ⇒ y = x), true, and because it is reflexive (so all sequences satisfy x = x), ('I²')ᴹ must also be a symmetric relation in the set-theoretic sense.
c) Because M makes the instance of (A7), x = y ⇒ (x = z ⇒ y = z), true, and because it is symmetric, ('I²')ᴹ must also be a transitive relation in the set-theoretic sense.
d) So, ('I²')ᴹ must be an equivalence relation.
e) Let us call this equivalence relation E. For any object o in the domain D of M, [o]E is the E-equivalence class on o, i.e., the set of p such that 〈o, p〉 ∈ E.


(3) We can then construct a normal model M* in the following way.
a) Let the domain of quantification for M*, viz., D*, be the set of all E-equivalence classes formed from members of D. I.e., if D is {o₁, o₂, o₃, ...} then let D* be {[o₁]E, [o₂]E, [o₃]E, ...}.
b) For all constants c, let (c)ᴹ* be [(c)ᴹ]E.
c) For all function letters F with superscript n, let (F)ᴹ* be the n-place operation on D* that includes the ordered pair 〈〈[o₁]E, ..., [oₙ]E〉, [o_q]E〉 iff (F)ᴹ includes 〈〈o₁, ..., oₙ〉, o_q〉.
d) For all predicate letters P with superscript n, let (P)ᴹ* be the subset of D*ⁿ that includes the ordered n-tuple 〈[o₁]E, ..., [oₙ]E〉 iff (P)ᴹ includes 〈o₁, ..., oₙ〉.

(4) It follows that M* is normal. Because ('I²')ᴹ is E, ('I²')ᴹ* is the set of ordered pairs that contains 〈[o]E, [p]E〉 iff 〈o, p〉 ∈ E. Because E is an equivalence relation, 〈o, p〉 ∈ E iff [o]E = [p]E.

(5) We now prove that ⊨M A iff ⊨M* A for all wffs A. Note that this will be the case when A is satisfied by all sequences in one interpretation iff it is satisfied by all sequences in the other. Each sequence s of M of the form o₁, o₂, o₃, ... corresponds to a sequence s′ of M* of the form [o₁]E, [o₂]E, [o₃]E, .... For each such sequence pair, s and s′, it is apparent that for any term t, s′(t) is [s(t)]E. We now prove that for all wffs A, for all such sequence pairs s and s′, sequence s (of M) will satisfy A iff the corresponding sequence s′ (of M*) satisfies A, by wff induction.

Base step: A is atomic, i.e., it takes the form P(t₁, ..., tₙ). Then s satisfies A iff 〈s(t₁), ..., s(tₙ)〉 ∈ (P)ᴹ, and s′ satisfies A iff 〈[s(t₁)]E, ..., [s(tₙ)]E〉 ∈ (P)ᴹ*. By our description of M* above, 〈[s(t₁)]E, ..., [s(tₙ)]E〉 ∈ (P)ᴹ* iff 〈s(t₁), ..., s(tₙ)〉 ∈ (P)ᴹ, so s satisfies A iff s′ satisfies A.

Induction step: Assume as inductive hypothesis that it holds for all wffs B simpler than A that, for all such sequence pairs s and s′, s satisfies B iff s′ satisfies B. We must show that it holds for A as well. Proof by cases.

Case (a): A takes the form ¬B. By the inductive hypothesis, s satisfies B iff s′ satisfies B. By the definition of satisfaction, s satisfies A iff s does not satisfy B, and the same holds for s′. So, s does not satisfy A iff s′ does not satisfy A, and hence s satisfies A iff s′ satisfies A.

Case (b): A takes the form B ⇒ C. By the inductive hypothesis, s satisfies B iff s′ satisfies B, and s satisfies C iff s′ satisfies C. s satisfies A iff it either does not satisfy B or it does satisfy C, and similarly for s′, so s satisfies A iff s′ satisfies A.

Case (c): A takes the form (∀x)B[x]. Now, s will satisfy A iff all sequences ς in M differing from s at most with regard to what entity gets assigned to x satisfy B[x], and s′ will satisfy A iff all sequences ς′ in M* differing from s′ at most with regard to what entity gets assigned to x satisfy B[x]. Each such sequence ς in M corresponds to such a sequence ς′ of M* and vice-versa. By the inductive hypothesis, it will hold that ς satisfies B[x] iff ς′ satisfies B[x], so s satisfies A iff s′ satisfies A.

So regardless of the length of A, for such a sequence pair s and s′, s satisfies A iff s′ satisfies A. Such sequence pairs will exhaust the sequences of M and M*, so it follows that for all wffs A, ⊨M A iff ⊨M* A.

(6) Obviously, it follows from this that M* is also a model for the set of axioms of PF=. This establishes the result. ∎
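As a small aside (a toy illustration of my own, not from the notes), the contraction in step (3) can be carried out concretely for a finite interpretation. The names D, E, F, and constants below are made up for the example; E is an equivalence relation that also respects F, as the instances of (A7) guarantee in the lemma.

```python
# Quotient a finite interpretation by the equivalence relation E that
# interprets 'I2', so that in the contracted model identity is genuine
# identity of equivalence classes (an illustrative sketch only).

from itertools import product

D = {1, 2, 3}                                    # toy domain
E = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}     # ('I2')^M: an equivalence relation
F = {(1,), (2,)}                                 # extension of a one-place predicate letter
constants = {"a": 1, "b": 3}

def cls(o):
    """[o]_E: the E-equivalence class of o, as a frozenset."""
    return frozenset(p for p in D if (o, p) in E)

D_star = {cls(o) for o in D}                             # domain of M*
constants_star = {c: cls(o) for c, o in constants.items()}
F_star = {tuple(cls(o) for o in tup) for tup in F}       # clause (3d) for predicates
E_star = {(cls(o), cls(p)) for (o, p) in E}              # the new extension of 'I2'

# In M*, 'I2' really is identity: <X, Y> is in E_star iff X == Y.
assert all((X == Y) == ((X, Y) in E_star) for X, Y in product(D_star, repeat=2))
print(len(D), "objects collapse to", len(D_star), "equivalence classes")  # 3 -> 2
```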

Result (Completeness): For all wffs A, if ⊨= A then ⊢PF= A.

This proof is left as an exam question, but it requires the above possibility of contracting models to normal models.


UNIT 3

PEANO ARITHMETIC AND RECURSIVE FUNCTIONS

A. The System S

Definition: Stated in English, the Peano postulates (also called the Peano axioms or the Peano-Dedekind Axioms) are the following five principles:

(P1) Zero is a natural number.
(P2) Every natural number has a successor which is also a natural number.
(P3) Zero is not the successor of any natural number.
(P4) No two natural numbers have the same successor.
(P5) If something is both true of zero, and true of the successor of a number whenever it is true of that number, then it is true of all natural numbers (i.e., the principle of mathematical induction).

Your book discusses the history of these principles in more detail, but in the late 19th and early 20th century it was widely believed that all the truths of number theory (pure arithmetic) could be derived as theorems from these principles. But so long as they are simply stated in English, and not introduced within a precisely formulated logical calculus, this is a difficult supposition to test.

If these are the only truths we take as axiomatic, to get truths regarding addition, multiplication, etc., we'd also need certain principles of set theory, and the proper axiomatization of set theory is still very controversial. However, without set theory, we can obtain more or less the same results by taking the notions of addition and multiplication as primitive functions within a more or less standard first-order predicate theory, and adding a few additional axioms.

Any system that has the same mathematical theorems is called a Peano arithmetic.

Mendelson’s System S: Syntax

The new system does not require the addition of anything new to the syntax of standard first-order predicate logic. In fact, we give the system S a less complicated syntax than PF by making the following restrictions:

A. There is only one predicate letter:

I²

and again, instead of I²(t, u) we write (t = u).

B. There is only one constant:

a

but as an alternative, we use the numeral '0'.

C. There are three function letters:

f¹   f²₁   f²₂

but instead of writing f¹(t), we write t′; instead of writing f²₁(t, u), we write (t + u); and instead of writing f²₂(t, u), we write (t · u).


D. There are still denumerably many variables, as before.

Hence, all atomic wffs are identity statements. Other wffs are built from atomic ones as before.

Mendelson’s System S: Semantics

The system S has a single intended interpretation, its so-called "standard model". (Although it does have other models.)

Definition: The standard model for S can be characterized as follows:
1. The domain of quantification is the set of natural numbers {0, 1, 2, 3, ...}.
2. The interpretation of the constant 'a' is the number zero.
3. The interpretation of the predicate letter 'I²' is the identity relation on the set of natural numbers.
4. The interpretation of the function letter 'f¹' is the set of ordered pairs in which the second element is the number one greater than the first element, e.g., 〈0, 1〉, 〈1, 2〉 and 〈2, 3〉, etc. The interpretation of 'f²₁' is the set of ordered pairs in which the first element is itself an ordered pair of natural numbers, and the second element is the sum of those two numbers, e.g., 〈〈2, 3〉, 5〉, etc. The interpretation of 'f²₂' is the set of ordered pairs in which the first element is itself an ordered pair of natural numbers, and the second element is the product of those two numbers, e.g., 〈〈2, 3〉, 6〉, etc.

The axioms of system S (listed below) are true in the standard model. Because S has a model, by the Modeling Lemma, it is consistent. However, because the proof of the Modeling Lemma requires mathematical methods such as the principle of mathematical induction in the metalanguage, and system S contains object-language translations of these very principles, this 'proof' of consistency appears somewhat circular. It is customary therefore to state this result in a somewhat weaker way: e.g., assuming that ordinary mathematical reasoning (as reflected in the metalanguage) is consistent, so is system S.

Axiomatization of S

The system is built upon the predicate calculus; bearing in mind the restrictions on the syntax mentioned above, its axioms include instances of axiom schemata (A1) through (A5) of the predicate calculus. Its only two primitive inference rules are Gen and MP. (The deduction theorem, etc., holds in the same form.) We add the following:

Definition: A Proper Axiom of S is any one of (S1)–(S8) listed below, or any instance of (S9).

(S1) x = y ⇒ (x = z ⇒ y = z)
(S2) x = y ⇒ x′ = y′
(S3) 0 ≠ x′
(S4) x′ = y′ ⇒ x = y
(S5) x + 0 = x
(S6) x + y′ = (x + y)′
(S7) x · 0 = 0
(S8) x · y′ = (x · y) + x
(S9) A[0] ⇒ ((∀x)(A[x] ⇒ A[x′]) ⇒ (∀x)A[x])
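In the standard model, the recursion equations (S5)–(S8) determine addition and multiplication once successor is given. The following small sketch (my own illustration, not Mendelson's) computes them exactly by those equations:

```python
# Addition and multiplication computed only via the recursion equations
# (S5)-(S8), with successor as the sole arithmetical primitive (a sketch).

def suc(x: int) -> int:            # the standard-model meaning of f1: x |-> x'
    return x + 1

def add(x: int, y: int) -> int:
    if y == 0:
        return x                   # (S5)  x + 0 = x
    return suc(add(x, y - 1))      # (S6)  x + y' = (x + y)'

def mul(x: int, y: int) -> int:
    if y == 0:
        return 0                   # (S7)  x . 0 = 0
    return add(mul(x, y - 1), x)   # (S8)  x . y' = (x . y) + x

assert add(2, 3) == 5 and mul(2, 3) == 6   # matches <<2,3>,5> and <<2,3>,6> above
```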

Result: S is a first-order theory with identity.

Although (A6) and (A7) of the predicate calculus with identity are not taken as axioms, they are derivable as theorems in this system from the above. It will be recalled from our last unit that we proved that if (A6) is a theorem, and those instances of (A7) involving atomic formulas are theorems, then other instances of (A7) follow. Some of the principles necessary for getting instances of (A7) involving atomic wffs are proved below, some are proved in the book, and some are left as homework.

Result: The theorems and derived rules governing reflexivity, symmetry and transitivity of identity hold in S, i.e.:

(A6) ⊢S (∀x) x = x
(Ref=) ⊢S t = t, for any term t.
(Sym=T) ⊢S (∀x)(∀y)(x = y ⇒ y = x)
(Sym=) t = u ⊢S u = t, for any terms t, u.
(Trans=T) ⊢S (∀x)(∀y)(∀z)(x = y ⇒ (y = z ⇒ x = z))
(Trans=) t = u, u = s ⊢S t = s, for all terms t, u, s.

Proof:
Demonstration of (A6):

1. ⊢S x + 0 = x  (S5)
2. ⊢S x = y ⇒ (x = z ⇒ y = z)  (S1)
3. ⊢S (∀x)(∀y)(∀z)(x = y ⇒ (x = z ⇒ y = z))  2 Gen×3
4. ⊢S x + 0 = x ⇒ (x + 0 = x ⇒ x = x)  3 UI×3
5. ⊢S x = x  1, 4 MP×2
6. ⊢S (∀x) x = x  5 Gen

(Ref=) follows directly from (A6) by UI.

For (Sym=T):

1. ⊢S x = y ⇒ (x = z ⇒ y = z)  (S1)
2. ⊢S (∀x)(∀y)(∀z)(x = y ⇒ (x = z ⇒ y = z))  1 Gen×3
3. ⊢S x = y ⇒ (x = x ⇒ y = x)  2 UI×3
4. ⊢S x = x ⇒ (x = y ⇒ y = x)  3 SL
5. ⊢S x = x  Ref=
6. ⊢S x = y ⇒ y = x  4, 5 MP
7. ⊢S (∀x)(∀y)(x = y ⇒ y = x)  6 Gen×2

(Sym=) follows by UI×2 and MP.

(Trans=T):

1. ⊢S (∀x)(∀y)(∀z)(x = y ⇒ (x = z ⇒ y = z))  (S1) Gen×3
2. ⊢S x₁ = y₁ ⇒ (x₁ = z₁ ⇒ y₁ = z₁)  1 UI×3
3. ⊢S (∀x₁)(∀y₁)(∀z₁)(x₁ = y₁ ⇒ (x₁ = z₁ ⇒ y₁ = z₁))  2 Gen×3
4. ⊢S y = x ⇒ (y = z ⇒ x = z)  3 UI×3
5. ⊢S x = y ⇒ y = x  (Sym=T) UI×2
6. ⊢S x = y ⇒ (y = z ⇒ x = z)  4, 5 SL
7. ⊢S (∀x)(∀y)(∀z)(x = y ⇒ (y = z ⇒ x = z))  6 Gen×3

(Trans=) follows by UI×3 and MP×2. ∎

Result (MI): A[0], (∀x)(A[x] ⇒ A[x′]) ⊢S (∀x)A[x], for any variable x and wff A[x]. (Derived rule for mathematical induction.)

Proof:
This follows from (S9) and MP×2. ∎

(From here on out I shall often ignore the fact that (S1)–(S9) are stated with particular variables, rather than schematically for all variables or all terms, and treat, e.g., anything of the form x = x + 0 as if it counted as (S5); obviously it takes only Gen and UI to move from ‘x’ to any other variable.)

Result (Sub+): ⊢S (∀x)(∀y)(∀z)(x = y ⇒ x + z = y + z). (Substitution of identicals for addition.)

Proof:
1. x = y ⊢S x = y  (Premise)
2. ⊢S x = x + 0  (S5)
3. ⊢S y = y + 0  (S5)
4. x = y ⊢S x = y + 0  1, 3 Trans=
5. ⊢S x + 0 = x  2 Sym=
6. x = y ⊢S x + 0 = y + 0  4, 5 Trans=
7. ⊢S x = y ⇒ x + 0 = y + 0  6 DT
8. x = y ⇒ x + z = y + z ⊢S x = y ⇒ x + z = y + z  (Premise)
9. x = y ⇒ x + z = y + z, x = y ⊢S x + z = y + z  1, 8 MP
10. ⊢S (∀x)(∀y)(x = y ⇒ x′ = y′)  (S2) Gen×2
11. ⊢S x + z = y + z ⇒ (x + z)′ = (y + z)′  10 UI×2
12. x = y ⇒ x + z = y + z, x = y ⊢S (x + z)′ = (y + z)′  9, 11 MP
13. ⊢S x + z′ = (x + z)′  (S6)
14. ⊢S y + z′ = (y + z)′  13 Gen, UI
15. x = y ⇒ x + z = y + z, x = y ⊢S x + z′ = y + z′  12, 13, 14 Trans=, Sym=
16. ⊢S (x = y ⇒ x + z = y + z) ⇒ (x = y ⇒ x + z′ = y + z′)  15 DT×2
17. ⊢S (∀z)[(x = y ⇒ x + z = y + z) ⇒ (x = y ⇒ x + z′ = y + z′)]  16 Gen
18. ⊢S (∀z)(x = y ⇒ x + z = y + z)  7, 17 MI
19. ⊢S (∀x)(∀y)(∀z)(x = y ⇒ x + z = y + z)  18 Gen×2
∎

The proofs of the following are in the book:

Result: Analogues of (S5), (S6) and (Sub+), flipped.
⊢S (∀x) x = 0 + x
⊢S (∀x)(∀y) x′ + y = (x + y)′
⊢S (∀x)(∀y)(∀z)(x = y ⇒ z + x = z + y)

Result (Com+/Assoc+): Commutativity and associativity of addition.
⊢S (∀x)(∀y) x + y = y + x
⊢S (∀x)(∀y)(∀z) ((x + y) + z) = (x + (y + z))

Tonight’s homework includes proving analogous results for multiplication. (I.e., you’ll prove substitution within multiplication, (Sub·), flipped versions of (S7), (S8) and (Sub·), as well as (Com·).) We get from these that S is a theory with identity, as substitution is allowed in all contexts.

B. The Quasi-Fregean System F

Suppose you wanted to construct an axiomatic system for mathematics, but did not want to take ‘·’, ‘+’, ‘′’, and ‘0’ as primitive, and instead wanted to define them. One initially attractive way would be to do this within an axiomatic set theory, in a way such as the following, which I’m calling system F. This system is not Frege’s system, but a crude oversimplification thereof.

Syntax

1. We add to the syntax of predicate logic the following subnective, which yields a term for any variable x and wff A [x]:

{x|A [x]}

This is read, “the set of all x such that A [x]”.
2. All occurrences of x in a term of the form {x|A [x]} are considered to be bound.
3. We also choose a two-place predicate letter E² to use for the membership relation. An expression of the form (t ∈ u) is shorthand for E²(t, u), and (t ∉ u) is shorthand for ¬E²(t, u).

Axiomatization

The system F contains analogues of axiom schemata (A1) through (A7) of the predicate calculus with identity (PF=), the inference rules MP and Gen, and the following two additional axiom schemata:

(A8) (∀x)(A [x] ⇔ x ∈ {y|A [y]}), for all cases in which the variable y is free for x in A [x].

(A9) (∀x)(x ∈ {y|A [y]} ⇔ x ∈ {z|B[z]}) ⇒ {y|A [y]} = {z|B[z]}, where {y|A [y]} and {z|B[z]} do not contain x free.

Some Intuitive Definitions / Abbreviations

(In the following, x, y and z are the first three variables that do not occur in the terms t and u; note also that some of these are abbreviations of terms, others are abbreviations of wffs.)

Set theoretic definitions
(t ∩ u) for {x|x ∈ t ∧ x ∈ u}
(t ∪ u) for {x|x ∈ t ∨ x ∈ u}
t̄ for {x|x ∉ t}
(t ⊆ u) for (∀x)(x ∈ t ⇒ x ∈ u)
V for {x|x = x}
∅ for {x|x ≠ x}
{t} for {x|x = t}
{t, u} for {x|x = t ∨ x = u}


〈t, u〉 for {{t}, {t, u}}
(t × u) for {x| (∃y)(∃z)(x = 〈y, z〉 ∧ y ∈ t ∧ z ∈ u)}
Dom(t) for {x| (∃y)(〈x, y〉 ∈ t)}
Rng(t) for {x| (∃y)(〈y, x〉 ∈ t)}
Fld(t) for (Dom(t) ∪ Rng(t))
Inv(t) for {x| (∃y)(∃z)(x = 〈y, z〉 ∧ 〈z, y〉 ∈ t)}
Fnct(t) for (∀x)(∀y)(∀z)(〈x, y〉 ∈ t ∧ 〈x, z〉 ∈ t ⇒ y = z)
Biject(t) for (Fnct(t) ∧ Fnct(Inv(t)))

Mathematical definitions
(t ≅ u) for (∃x)(Biject(x) ∧ Dom(x) = t ∧ Rng(x) = u)
Card(t) for {x|x ≅ t}
0 for Card(∅)
t′ for {x| (∃y)(y ∈ x ∧ (x ∩ {y}̄) ∈ t)}
1 for 0′
2 for 1′
3 for 2′ [. . . and so on for other numerals]
N for {x| (∀y)(0 ∈ y ∧ (∀z)(z ∈ y ⇒ z′ ∈ y) ⇒ x ∈ y)}
Fin(t) for (∃x)(x ∈ N ∧ t ∈ x)
Infin(t) for ¬Fin(t)
Denum(t) for (t ≅ N)
Ctbl(t) for (Fin(t) ∨ Denum(t))
(t ≤ u) for (∃x)(∃y)(∃z)(x ∈ t ∧ y ∈ u ∧ z ⊆ y ∧ x ≅ z)
(t < u) for ((t ≤ u) ∧ ¬(u ≤ t))
§(t) for {x|x ∈ N ∧ x < t}
(t + u) for Card((§(t) × {0}) ∪ (§(u) × {1}))
(t · u) for Card(§(t) × §(u))

Results

With these definitions in place, one can derive Peano’s postulates as theorems in the following forms:

(P1) ⊢F 0 ∈ N
(P2) ⊢F x ∈ N ⇒ x′ ∈ N
(P3) ⊢F x ∈ N ⇒ 0 ≠ x′
(P4) ⊢F x ∈ N ∧ y ∈ N ⇒ (x′ = y′ ⇒ x = y)
(P5) ⊢F A [0] ∧ (∀x)(A [x] ⇒ A [x′]) ⇒ (∀x)(x ∈ N ⇒ A [x])

As well as analogues of Mendelson’s other axioms:

(S5F) ⊢F x ∈ N ⇒ x + 0 = x
(S6F) ⊢F x ∈ N ∧ y ∈ N ⇒ (x + y′) = (x + y)′
(S7F) ⊢F x · 0 = 0
(S8F) ⊢F x ∈ N ∧ y ∈ N ⇒ (x · y′) = ((x · y) + x)

Also, we have, e.g.:

⊢F ((∃1x) A [x]) ⇔ ({x|A [x]} ∈ 1)
⊢F ((∃2x) A [x]) ⇔ ({x|A [x]} ∈ 2)
⊢F ((∃3x) A [x]) ⇔ ({x|A [x]} ∈ 3)
And so on.

Disaster

The system F, unfortunately, is inconsistent due to Russell’s paradox:

⊢F {x|x ∉ x} ∉ {x|x ∉ x} ⇔ {x|x ∉ x} ∈ {x|x ∉ x}

Proof: Direct from (A8) and universal instantiation. Whence both ⊢F {x|x ∉ x} ∈ {x|x ∉ x} and ⊢F {x|x ∉ x} ∉ {x|x ∉ x}.

Hence ⊢F A for all wffs A, making the system entirely unsuitable for mathematics. In this system we have both ⊢F 1 + 1 = 2 and ⊢F 1 + 1 = 3!

Poor Frege.

Homework
Without using Russell’s paradox or other contradiction, prove ⊢F {x} = {y} ⇒ x = y.

Some History

In the late 19th century, Euclid’s axiomatization of geometry came under new scrutiny. Many mathematicians began to investigate the axiomatization of arithmetic as well. In 1879 German mathematician Richard Dedekind surmised that five principles formed the basis of all pure arithmetic.

In the 1880s and 1890s, the adequacy of those five principles (and others) was studied in depth, most importantly by a group of Italian mathematicians led by Giuseppe Peano. In order to consider them more systematically, Peano urged that the principles be written using a rigorously defined symbolic notation for logic and set theory, which he was still developing at that time. Given Peano’s role in systematizing and popularizing the above principles, they have since come to be called the “Peano axioms” or “Peano postulates”.

In the year 1900, Peano presented some of his findings at the International Congress of Philosophy in Paris. In the audience was a young English polyglot whose main contribution to academia was a fellowship thesis on the compatibility of non-Euclidean geometry with Hegelian idealism. He was so impressed with Peano’s work that over the next few months he had not only mastered Peanist logic, but had suggested several improvements. This was the 28-year-old Bertrand Russell.

Russell suggested that it wasn’t enough to state the axioms of arithmetic in logical notation. One needed also to be explicit about the rules and principles governing that logical notation, because only then could one really test what is and what isn’t provable. The axioms of arithmetic needed to be supplemented with the axioms of logic. However, in attempting to axiomatize logic and set theory, bearing in mind Georg Cantor’s definition of cardinal numbers in terms of one-one correspondences, Russell became convinced that given suitable definitions of the notions of ‘zero’, ‘successor’ and ‘natural number’, the Peano (so-called) ‘axioms’ could actually be derived as theorems from the axioms of logic alone. Russell, having just taught a course on Leibniz, saw this as vindication of Leibniz’s theory that mathematical truths are simply more complicated truths of logic: a theory now known as logicism.

Russell began work on writing what he imagined to be a two volume work called The Principles of Mathematics. In volume one, he would explain the reduction of mathematics to logic informally (in English), and in volume two, he would set out to derive all of pure mathematics within an axiomatic system whose only axioms were logical axioms. However, in mid-1901, after he had finished the bulk of vol. I, he discovered the paradox of sets that now bears his name. Realizing that this made the most natural axiomatization of set theory inconsistent, Russell started to look for a philosophically adequate solution that would nevertheless salvage most of the work he (and others such as Dedekind and Peano) had done. Not finding an easy solution, Russell decided to publish vol. I with only a preliminary discussion of the contradiction and possible ways of avoiding it, leaving a complete solution of the inconsistency within the formal system for further development in vol. II.

While finishing vol. I in 1901–1902, Russell did a search of recent literature on the foundations of mathematics, and in so doing rediscovered the works of Gottlob Frege. Frege, working in almost complete isolation, had already in his 1884 Grundlagen der Arithmetik (trans. Foundations of Arithmetic) given a list of basic principles of arithmetic very similar to Dedekind’s, but had also suggested, like Russell, that given suitable definitions in terms of notions of pure logic, these principles could be derived from logical principles alone. In fact, Frege had already developed the core of an axiomatic system for logic in his 1879 work Begriffsschrift, and in his later 1893 magnum opus, Grundgesetze der Arithmetik, vol. I (trans. Basic Laws of Arithmetic), Frege expanded that system by adding axioms for “value-ranges” (in effect, class theory), and had begun to derive the elementary truths of number theory. While Russell was delighted to find such common ground between his work and Frege’s, he also discovered that Frege’s system fell prey to his paradox, and was therefore inconsistent. He broke the news gently to Frege in a letter. Here is a translation of that letter, as well as Frege’s response (both originally written in German):

Dear Colleague: [16 June 1902]

I have known your Grundgesetze der Arithmetik for a year and a half, but only now have I been able to find the time for the thorough study I intend to devote to your writings. I find myself in full accord with you on all main points, especially in your rejection of any psychological element in logic and in the value you attach to a conceptual notation for the foundations of mathematics and of formal logic, which, incidentally, can hardly be distinguished. On many questions of detail, I find discussions, distinctions and definitions in your writings for which one looks in vain in other logicians . . .

I have encountered a difficulty only on one point. You assert (p. 17) that a function could also constitute the indefinite element. This is what I used to believe, but this view now seems to me dubious because of the following contradiction: Let w be the predicate of being a predicate which cannot be predicated of itself. Can w be predicated of itself? From either answer, the contradictory follows. We must therefore conclude that w is not a predicate. Likewise, there is no class (as a whole) of those classes which, as wholes, are not members of themselves . . .

On the fundamental questions where symbols fail, the exact treatment of logic has remained very backward; I find yours to be the best treatment I know in our time; and this is why I have allowed myself to express my deep respect for you. It is very regrettable that you did not get around to publishing the second volume of your Grundgesetze; but I hope that this will still be done.

Yours sincerely,
Bertrand Russell

Dear Colleague: [22 June 1902]

Many thanks for your interesting letter of 16 June. I am glad that you agree with me in many things and that you intend to discuss my work in detail . . .

Your discovery of the contradiction has surprised me beyond words and, I would almost say, left me thunderstruck, because it has rocked the ground on which I intended to build arithmetic. It seems accordingly that . . . my Basic Law [axiom] V is false . . . I must give some further thought to the matter. It is all the more serious as the collapse of my Law V seems to undermine not only the foundations of my arithmetic but the only possible foundations for arithmetic as such . . . Your discovery is at any rate a very remarkable one, and it may perhaps lead to a great advance in logic, undesirable as it may seem at first sight . . .

The second volume of my Grundgesetze is to appear shortly. I shall have to give it an appendix where I will do justice to your discovery. If only I could find the right way of looking at it!

Yours sincerely,
Gottlob Frege

Unfortunately, the hastily prepared solution Frege included in an Appendix to vol. II of Grundgesetze was unsuccessful, and leads to a similar, but more complicated, contradiction. For Russell’s part, it took him seven more years to find a solution he was happy with. By then, volume II of the Principles had grown so big, and had deviated so far from the plan laid out in vol. I, that Russell, along with his new collaborator, Alfred North Whitehead, decided to rename it Principia Mathematica, which was itself split into three volumes, published in 1910, 1911 and 1913. (Principia dropped set theory as such, and instead re-interpreted talk of ‘classes’ in mathematics using notions not involving sets, but instead higher-order quantification over ‘propositional functions’ divided into ramified ‘types’.)

Meanwhile, other mathematicians had developed consistent set theories whose axioms, however, did not seem to have the character of self-evidence usually thought to be required of logical truths. Such mathematicians still thought much of mathematics could be reduced to set theory, but denied that set theory was a branch of logic. The first system was developed by Ernst Zermelo in 1908, and was added to, and made more rigorous, by Adolf Frænkel in 1922. Their system is now called ZF or Zermelo-Frænkel set theory. Another was suggested by John von Neumann in 1925, expanded by Paul Bernays and Kurt Gödel in the 1930s, and is now called NBG set theory. Two more set theories, NF and ML, were developed by W. V. Quine in 1937 and 1940. New versions continue to be discovered, such as George Boolos’s “New V”, which stays close to Frege’s original system with only a slight modification to Frege’s Basic Law V. However, let us return to Mendelson’s System S for the time being.

C. Numerals

The following were either proven in your homework, or follow from those results.

(LL) t = u, A [t, t] ⊢S A [t, u], and u = t, A [t, t] ⊢S A [t, u], for all terms t and u that are free for x in A [x, x].
(Canc+T) ⊢S (∀x)(∀y)(x + z = y + z ⇒ x = y)
(Canc+) t + s = u + s ⊢S t = u

Definition: Numerals are the primary or canonical terms used in a given language to stand for specific natural numbers.

We have numerals in both the object language and the metalanguage. In standard English (our metalanguage) the numerals are the signs:

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, . . . , etc.

In the language of system S, the numerals constitute the following series of closed terms:

0, 0′, 0′′, 0′′′, 0′′′′, 0′′′′′, 0′′′′′′, 0′′′′′′′, . . . , etc.

Let us now introduce a metalanguage function, n̄, that yields, for a given number n, the numeral of S for that number. We define this function recursively in the metalanguage as follows:

Abbreviation: 0̄ is the constant ‘0’; the numeral for n + 1 is n̄′. (The ‘+’ here is the metalanguage ‘+’.)

So 2̄ is “0′′”, 5̄ is “0′′′′′”, and the numeral for 25 is “0′′′′′′′′′′′′′′′′′′′′′′′′′” (twenty-five accents).
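(Purely as an illustration, and not part of Mendelson’s text: the overbar function is easy to mimic in an ordinary programming language. The following Python sketch, with the hypothetical name numeral, builds the numeral of S for a given metalanguage number as a string.)

    # A minimal sketch (not from the notes): the metalanguage map from a
    # natural number n to the numeral of S denoting it, rendered as the
    # string "0" followed by n successor strokes.
    def numeral(n: int) -> str:
        if n == 0:
            return "0"               # base case: 0-bar is the constant '0'
        return numeral(n - 1) + "'"  # the numeral for n+1 is n-bar followed by a stroke

    assert numeral(2) == "0''"
    assert numeral(5) == "0'''''"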

Result (+1): ⊢S x + 1̄ = x′

Proof:
1. ⊢S x + 0′ = (x + 0)′  (S6)
2. ⊢S x + 0 = x  (S5)
3. ⊢S x + 0′ = x′  1, 2 LL
4. ⊢S x + 1̄ = x′  3 def. 1̄
∎

Result (·1): ⊢S x · 1̄ = x

Proof:
1. ⊢S x · 0′ = (x · 0) + x  (S8) Gen, UI
2. ⊢S x · 0 = 0  (S7)
3. ⊢S x · 0′ = 0 + x  1, 2 LL
4. ⊢S 0 + x = x  (S5), (Com+)
5. ⊢S x · 0′ = x  3, 4 Trans=
6. ⊢S x · 1̄ = x  5 def. 1̄
∎

Result (·2): ⊢S x · 2̄ = x + x

Proof:
1. ⊢S x · 0′′ = (x · 0′) + x  (S8) Gen, UI
2. ⊢S x · 0′ = x  Above, def. 1̄
3. ⊢S x · 0′′ = x + x  1, 2 LL
4. ⊢S x · 2̄ = x + x  3 def. 2̄
∎

Result (0+): ⊢S x + y = 0 ⇒ (x = 0 ∧ y = 0)

Proof:
1. x + 0 = 0 ⊢S x + 0 = 0  (Premise)
2. ⊢S 0 + 0 = 0  (S5) Gen, UI
3. x + 0 = 0 ⊢S x + 0 = 0 + 0  1, 2 LL
4. x + 0 = 0 ⊢S x = 0  3 Canc+
5. ⊢S 0 = 0  Ref=
6. x + 0 = 0 ⊢S x = 0 ∧ 0 = 0  4, 5 SL
7. ⊢S x + 0 = 0 ⇒ (x = 0 ∧ 0 = 0)  6 DT
8. ⊢S x + y′ = (x + y)′  (S6)
9. ⊢S 0 ≠ (x + y)′  (S3) Gen, UI
10. ⊢S (x + y)′ = 0 ⇒ 0 = (x + y)′  (Sym=T) UI×2
11. ⊢S (x + y)′ ≠ 0  9, 10 MT
12. ⊢S x + y′ ≠ 0  8, 11 LL
13. ⊢S x + y′ = 0 ⇒ (x = 0 ∧ y′ = 0)  12 SL
14. ⊢S [x + y = 0 ⇒ (x = 0 ∧ y = 0)] ⇒ [x + y′ = 0 ⇒ (x = 0 ∧ y′ = 0)]  13 SL
15. ⊢S (∀y){[x + y = 0 ⇒ (x = 0 ∧ y = 0)] ⇒ [x + y′ = 0 ⇒ (x = 0 ∧ y′ = 0)]}  14 Gen
16. ⊢S (∀y)[x + y = 0 ⇒ (x = 0 ∧ y = 0)]  7, 15 MI
17. ⊢S x + y = 0 ⇒ (x = 0 ∧ y = 0)  16 UI
∎

Proofs of the following are left as homework or are given in the book:

(0·) ⊢S x ≠ 0 ⇒ (x · y = 0 ⇒ y = 0)
(1+) ⊢S x + y = 1̄ ⇒ [(x = 0 ∧ y = 1̄) ∨ (x = 1̄ ∧ y = 0)]
(1·) ⊢S x · y = 1̄ ⇒ (x = 1̄ ∧ y = 1̄)
(∃Succ) ⊢S x ≠ 0 ⇒ (∃y)(x = y′)
(Canc·) ⊢S x ≠ 0 ⇒ (y · x = z · x ⇒ y = z)
(∃Succs) ⊢S x ≠ 0 ⇒ [x ≠ 1̄ ⇒ (∃y)(x = y′′)]

We also get the following very important results, stated in the metalanguage.

Result (Num≠): For all natural numbers n and m, if n ≠ m, then ⊢S n̄ ≠ m̄.

Proof:
Assume that n ≠ m. Hence what we need to prove is a statement of the form:

⊢S 0′′...(n times)...′ ≠ 0′′...(m times)...′

where one side has more ′-signs than the other. Perform a reductio in the object language, taking as a premise the wff 0′′...(n times)...′ = 0′′...(m times)...′. By successive applications of (S4) and MP, depending on whether n < m or m < n, you’ll get either that:

0′′...(n times)...′ = 0′′...(m times)...′ ⊢S 0 = 0′′...(m−n times)...′

or that:

0′′...(n times)...′ = 0′′...(m times)...′ ⊢S 0′′...(n−m times)...′ = 0

However, the negations of these follow from (S3), or (S3) and Sym=, and UI. So, by DT and MT, we get that ⊢S 0′′...(n times)...′ ≠ 0′′...(m times)...′, i.e., ⊢S n̄ ≠ m̄. ∎

Result (Num+): For all natural numbers n and m, ⊢S n̄ + m̄ = k̄, where k = n + m.

Proof:
We use induction on m in the metalanguage. First, for the base, the numeral for n + 0 is simply n̄, and we have ⊢S n̄ = n̄ + 0 by (S5) and Sym=. For the induction step, assume ⊢S n̄ + m̄ = k̄, where k = n + m. We need the corresponding result for m + 1, i.e., ⊢S n̄ + (m̄)′ = (k̄)′ (since the numeral for m + 1 is m̄′ and the numeral for n + (m + 1) is k̄′). This follows by (S2), (S6) and Sym=. ∎

Result (Num·): For all natural numbers n and m, ⊢S n̄ · m̄ = k̄, where k = n · m.

You will prove the above as part of your homework.

D. Ordering, Complete Induction and Divisibility

Abbreviations:

(t < u) for (∃x)(x ≠ 0 ∧ t + x = u), where x is the first variable not in t or u
(t ≤ u) for (t < u) ∨ (t = u)
(t > u) for (u < t)
(t ≥ u) for (u ≤ t)
(t ≮ u) for ¬(t < u)
and we define (t ≰ u), (t ≯ u) and (t ≱ u) similarly.


Result (Irref<): ⊢S x ≮ x

Proof:
1. ⊢S x + 0 = x  (S5)
2. ⊢S x + y = x + 0 ⇒ y = 0  (Canc+T) Gen, UI, Com+
3. ⊢S x + y = x ⇒ y = 0  1, 2 LL
4. ⊢S y ≠ 0 ⇒ x + y ≠ x  3 SL
5. ⊢S ¬¬(y ≠ 0 ⇒ x + y ≠ x)  4 DN
6. ⊢S ¬(y ≠ 0 ∧ x + y = x)  5 def. ∧
7. ⊢S (∀y)¬(y ≠ 0 ∧ x + y = x)  6 Gen
8. ⊢S ¬(∃y)(y ≠ 0 ∧ x + y = x)  7 DN, def. ∃
9. ⊢S x ≮ x  8 defs. <, ≮
∎

Result (Trans<): ⊢S x < y ⇒ (y < z ⇒ x < z)

Proof:
1. x < y ⊢S (∃z)(z ≠ 0 ∧ x + z = y)  Pr, def. <
2. y < z ⊢S (∃x)(x ≠ 0 ∧ y + x = z)  Pr, def. <
3. x < y ⊢*S b ≠ 0 ∧ x + b = y  1 “Rule C”
4. y < z ⊢*S c ≠ 0 ∧ y + c = z  2 “Rule C”
5. x < y ⊢*S x + b = y  3 SL
6. y < z ⊢*S y + c = z  4 SL
7. x < y, y < z ⊢*S (x + b) + c = z  5, 6 LL
8. x < y, y < z ⊢*S x + (b + c) = z  7 Assoc+
9. ⊢*S b + c = 0 ⇒ (b = 0 ∧ c = 0)  (0+), Gen, UI
10. x < y ⊢*S b + c ≠ 0  3, 9 SL
11. x < y, y < z ⊢*S b + c ≠ 0 ∧ x + (b + c) = z  8, 10 SL
12. x < y, y < z ⊢S (∃y)(y ≠ 0 ∧ x + y = z)  11 EG
13. x < y, y < z ⊢S x < z  12 def. <
14. ⊢S x < y ⇒ (y < z ⇒ x < z)  13 DT×2
∎

Others (many assigned as homework):

(Ref≤) ⊢S x ≤ x
(Anti-Sym<) ⊢S x < y ⇒ y ≮ x
(Trans≤) ⊢S x ≤ y ⇒ (y ≤ z ⇒ x ≤ z)
(Trans≤<) ⊢S x ≤ y ⇒ (y < z ⇒ x < z)
(Order) ⊢S x = y ∨ x < y ∨ y < x
  ⊢S x < y ∨ x ≥ y
  ⊢S x ≤ y ∨ x > y
(≤≥to=) ⊢S x ≤ y ∧ x ≥ y ⇒ x = y
(0≤) ⊢S 0 ≤ x
(0<) ⊢S 0 < x′
(≮0) ⊢S x ≮ 0
(≤0) ⊢S x = 0 ⇔ x ≤ 0
(<Succ) ⊢S x < x′
(<Succ≤) ⊢S x < y ⇔ x′ ≤ y
  ⊢S x < y′ ⇔ x ≤ y
(+Pres≤) ⊢S x ≤ x + y
(+Pres<) ⊢S y ≠ 0 ⇒ x < x + y
(·Pres≤) ⊢S y ≠ 0 ⇒ x ≤ x · y
(·Pres<) ⊢S x ≠ 0 ⇒ (y > 1̄ ⇒ x < x · y)
(Canc+<) ⊢S x < y ⇔ x + z < y + z
(Canc+≤) ⊢S x ≤ y ⇔ x + z ≤ y + z
(Canc·<) ⊢S z ≠ 0 ⇒ (x < y ⇔ x · z < y · z)
(Canc·≤) ⊢S z ≠ 0 ⇒ (x ≤ y ⇔ x · z ≤ y · z)
(OrdC) ⊢S (∀x)[((∀y)(y < x ⇒ A [y]) ∧ (∀y)(y ≥ x ⇒ B[y])) ⇒ (∀y)(A [y] ∨ B[y])]
  ⊢S (∀x)[((∀y)(y ≤ x ⇒ A [y]) ∧ (∀y)(y > x ⇒ B[y])) ⇒ (∀y)(A [y] ∨ B[y])]

Result: For any natural number n, ⊢S (x = 0 ∨ . . . ∨ x = n̄) ⇔ x ≤ n̄.

Proof:
We use induction on n in the metalanguage.

(1) The base case is (≤0).
(2) For the induction step, as inductive hypothesis, assume:

⊢S (x = 0 ∨ . . . ∨ x = n̄) ⇔ x ≤ n̄.

(3) We need the corresponding claim for n + 1, which, by the definition of the overbar, is:

⊢S (x = 0 ∨ . . . ∨ x = n̄ ∨ x = n̄′) ⇔ x ≤ n̄′

(4) The left-to-right conditional follows by a proof by cases: by (2), (Trans≤<) and (<Succ) for the first n cases, and, for the final case, by the obvious tautology:

⊢S x = n̄′ ⇒ x ≤ n̄′

(5) For the right-to-left conditional, first note that by the definition of ≤, we have:

x ≤ n̄′ ⊢S x < n̄′ ∨ x = n̄′

(6) By (<Succ≤), we have:

⊢S x < n̄′ ⇒ x ≤ n̄

(7) By this and the inductive hypothesis, (2), then:

⊢S x < n̄′ ⇒ (x = 0 ∨ . . . ∨ x = n̄)

(8) By obvious propositional logic rules:

⊢S x < n̄′ ⇒ (x = 0 ∨ . . . ∨ x = n̄ ∨ x = n̄′)

(9) The following is an obvious tautology:

⊢S x = n̄′ ⇒ (x = 0 ∨ . . . ∨ x = n̄ ∨ x = n̄′)

(10) So by a proof by cases starting with (5), we get:

x ≤ n̄′ ⊢S (x = 0 ∨ . . . ∨ x = n̄ ∨ x = n̄′)

(11) Therefore, the right-to-left conditional follows by the deduction theorem. This establishes the biconditional, and completes the induction. ∎

Corollary: For any natural number n and wff A [x], ⊢S (A [0] ∧ . . . ∧ A [n̄]) ⇔ (∀x)(x ≤ n̄ ⇒ A [x]).

Corollary: For any natural number n, ⊢S (x = 0 ∨ . . . ∨ x = n̄) ⇔ x < k̄, where k = n + 1.

Corollary: For any natural number n and wff A [x], ⊢S (A [0] ∧ . . . ∧ A [n̄]) ⇔ (∀x)(x < k̄ ⇒ A [x]), where k = n + 1.

New Forms of Induction

Result (CI): ⊢S (∀x)((∀y)(y < x ⇒ A [y]) ⇒ A [x]) ⇒ (∀x) A [x]
(The principle of strong or complete mathematical induction.)

Proof:
For the proof, like any conditional, we begin by assuming the antecedent, abbreviated as ($).

1. ($) ⊢S (∀x)((∀y)(y < x ⇒ A [y]) ⇒ A [x])  (Pr)

Rather than directly proceeding to derive (∀x) A [x], we instead attempt to show (∀x)(∀z)(z ≤ x ⇒ A [z]) by normal (“weak”) induction on x.

2. z ≤ 0 ⊢S z = 0  (Pr), (≤0) SL
3. ($) ⊢S (∀y)(y < 0 ⇒ A [y]) ⇒ A [0]  1 UI
4. ⊢S y ≮ 0  (≮0) Gen, UI
5. ⊢S y < 0 ⇒ A [y]  4 SL
6. ⊢S (∀y)(y < 0 ⇒ A [y])  5 Gen
7. ($) ⊢S A [0]  3, 6 MP
8. ($), z ≤ 0 ⊢S A [z]  2, 7 LL
9. ($) ⊢S z ≤ 0 ⇒ A [z]  8 DT
10. ($) ⊢S (∀z)(z ≤ 0 ⇒ A [z])  9 Gen

This establishes the base step. Next:

11. (∀z)(z ≤ x ⇒ A [z]) ⊢S z ≤ x ⇒ A [z]  (Pr), UI
12. ⊢S z ≤ x′ ⇒ z < x′ ∨ z = x′  Taut, def. ≤
13. ⊢S z < x′ ⇔ z ≤ x  (<Succ≤), Gen, UI
14. (∀z)(z ≤ x ⇒ A [z]) ⊢S z < x′ ⇒ A [z]  11, 13 SL
15. (∀z)(z ≤ x ⇒ A [z]) ⊢S y < x′ ⇒ A [y]  14 Gen, UI
16. (∀z)(z ≤ x ⇒ A [z]) ⊢S (∀y)(y < x′ ⇒ A [y])  15 Gen
17. ($) ⊢S (∀y)(y < x′ ⇒ A [y]) ⇒ A [x′]  1 UI
18. ($), (∀z)(z ≤ x ⇒ A [z]) ⊢S A [x′]  16, 17 MP
19. ($), (∀z)(z ≤ x ⇒ A [z]) ⊢S z = x′ ⇒ A [z]  18 PF=
20. ($), (∀z)(z ≤ x ⇒ A [z]) ⊢S z ≤ x′ ⇒ A [z]  12, 14, 19 SL
21. ($), (∀z)(z ≤ x ⇒ A [z]) ⊢S (∀z)(z ≤ x′ ⇒ A [z])  20 Gen
22. ($) ⊢S (∀z)(z ≤ x ⇒ A [z]) ⇒ (∀z)(z ≤ x′ ⇒ A [z])  21 DT
23. ($) ⊢S (∀x)[(∀z)(z ≤ x ⇒ A [z]) ⇒ (∀z)(z ≤ x′ ⇒ A [z])]  22 Gen

This establishes the induction step, whence:

24. ($) ⊢S (∀x)(∀z)(z ≤ x ⇒ A [z])  10, 23 MI
25. ($) ⊢S x ≤ x ⇒ A [x]  24 UI×2
26. ⊢S x ≤ x  (Ref≤)
27. ($) ⊢S A [x]  25, 26 MP
28. ($) ⊢S (∀x) A [x]  27 Gen
29. ⊢S (∀x)((∀y)(y < x ⇒ A [y]) ⇒ A [x]) ⇒ (∀x) A [x]  28 DT
∎

Corollary (LNP): ⊢S (∃x) A [x] ⇒ (∃x)(A [x] ∧ (∀y)(y < x ⇒ ¬A [y]))
(The Least Number Principle.)

Proof:
This is roughly the transposition of (CI), with ¬A [x] substituted for A [x]. See book for details. ∎

Corollary (MID): ⊢S (∀x)(A [x] ⇒ (∃y)(y < x ∧ A [y])) ⇒ (∀x)¬A [x]
(The Method of Infinite Descent.)

Proving this is homework, but it follows from LNP.

Divisibility

While the division function cannot be defined for the natural numbers alone, the relation of divisibility can be so defined.

Abbreviation: t|u for (∃x)(u = t · x), where x is the first variable that does not occur in t or u.

This can be read as “u is evenly divisible by t”, “t evenly divides u”, or as “u is a multiple of t”.

Result (Ref|): ⊢S x|x

Proof:
1. ⊢S x = x · 1̄  (·1), Sym=
2. ⊢S (∃y)(x = x · y)  1 EG
3. ⊢S x|x  2 def. |
∎

Result (1|): ⊢S 1̄|x

Proof:
1. ⊢S x = x · 1̄  (·1), Sym=
2. ⊢S x = 1̄ · x  1 (Com·), Trans=
3. ⊢S (∃y)(x = 1̄ · y)  2 EG
4. ⊢S 1̄|x  3 def. |
∎

Result (|0): ⊢S x|0

Proof:
1. ⊢S 0 = x · 0  (S7) Sym=
2. ⊢S (∃y)(0 = x · y)  1 EG
3. ⊢S x|0  2 def. |
∎

Result (Trans|): ⊢S x|y ∧ y|z ⇒ x|z

Proof:
1. x|y ∧ y|z ⊢S x|y  (Premise) SL
2. x|y ∧ y|z ⊢S y|z  (Premise) SL
3. x|y ∧ y|z ⊢S (∃z)(y = x · z)  1 def. |
4. x|y ∧ y|z ⊢S (∃x)(z = y · x)  2 def. |
5. x|y ∧ y|z ⊢*S y = x · b  3 “Rule C”
6. x|y ∧ y|z ⊢*S z = y · c  4 “Rule C”
7. x|y ∧ y|z ⊢*S z = (x · b) · c  5, 6 LL
8. x|y ∧ y|z ⊢*S z = x · (b · c)  7 Assoc·, Trans=
9. x|y ∧ y|z ⊢S (∃y)(z = x · y)  8 EG
10. x|y ∧ y|z ⊢S x|z  9 def. |
11. ⊢S x|y ∧ y|z ⇒ x|z  10 DT
∎

Further results (either proven in the book, or assigned as homework):
⊢S y ≠ 0 ∧ x|y ⇒ x ≤ y
⊢S x|y ∧ y|x ⇒ x = y
⊢S x|y ⇒ x|(y · z)
⊢S x|y ∧ x|z ⇒ x|(y + z)
⊢S x|1̄ ⇒ x = 1̄
⊢S x|y ∧ x|y′ ⇒ x = 1̄

Result (UQR): ⊢S (∀x)(∀y)(y ≠ 0 ⇒ (∃1z)(∃1z₁)(x = (y · z) + z₁ ∧ z₁ < y))
(Uniqueness of quotient and remainder.)

(The proof of this is somewhat complicated, but is sketched in the book.)

At this point, we can do virtually all elementary arithmetic for natural numbers in system S.

E. Expressibility and Representability

We’ve now seen that number-theoretic relations such as <, ≤, |, etc., can be defined in S, even though they were not taken as primitive predicate letters. It is also easy to see that certain functions on the natural numbers, such as the squaring function, n², could be defined in S. Our topic over the next few days will involve general results about what sort of mathematical functions and relations can be expressed or represented in system S (and similar systems), and what sort cannot be.

In the metatheory, functions and relations are considered set-theoretically. An n-place relation, for example, is considered to be a set of n-tuples. An n-place function is considered as a set of ordered pairs, the first elements of which are themselves n-tuples. For most purposes, however, we can think of them more informally as argument/value mappings.

Let N be the set of natural numbers {0, 1, 2, . . . }. We then define the following:

Definition: A number-theoretic relation is any subset of Nⁿ for some n (i.e., any set of n-tuples of natural numbers).

Examples: Being even, being odd, and being prime are one-place number-theoretic relations (properties). Being greater than, being divisible by, etc., are two-place number-theoretic relations. We are here identifying being even with the set of even numbers, and being greater than with a set of ordered pairs of numbers.

Definition: A number-theoretic function is a function whose domain is Nⁿ for some n, and whose range is a subset of N.


Examples: Addition and multiplication are both two-place number-theoretic functions. The function that yields, for a given natural number n as argument, the nth prime, is a one-place number-theoretic function.

Within a given mathematical system such as S, some functions and relations may be definable and some may not be definable. Let us make this more precise.

Below, we assume that K is an axiom system with numerals for natural numbers (e.g., System S).

Definition: A given n-place number-theoretic relation R is said to be expressible in K iff there is a wff A [x₁, . . . , xₙ] with x₁, . . . , xₙ as its free variables such that, for any natural numbers k₁, . . . , kₙ:

(i) If R holds for 〈k₁, . . . , kₙ〉, then ⊢K A [k̄₁, . . . , k̄ₙ];
(ii) If R does not hold for 〈k₁, . . . , kₙ〉, then ⊢K ¬A [k̄₁, . . . , k̄ₙ].

Definition: A given n-place number-theoretic function F is said to be representable in K iff there is a wff A [x₁, . . . , xₙ, y] with x₁, . . . , xₙ and y as its free variables such that, for any natural numbers k₁, . . . , kₙ and m:

(i) If the value of F for 〈k₁, . . . , kₙ〉 as argument is m, then ⊢K A [k̄₁, . . . , k̄ₙ, m̄];
(ii) ⊢K (∃1y) A [k̄₁, . . . , k̄ₙ, y].

(Note that A [x₁, . . . , xₙ, y] might be an identity statement of the form y = F(x₁, . . . , xₙ), where F is a function letter, but it need not be; it could instead be any wff containing y and x₁, . . . , xₙ free satisfying the above conditions.)

Definition: A given n-place number-theoretic function F is said to be strongly representable in K iff there is a wff A [x₁, . . . , xₙ, y] with x₁, . . . , xₙ and y as its free variables such that, for any natural numbers k₁, . . . , kₙ and m:

(i) If the value of F for 〈k₁, . . . , kₙ〉 as argument is m, then ⊢K A [k̄₁, . . . , k̄ₙ, m̄];
(ii) ⊢K (∃1y) A [x₁, . . . , xₙ, y].

There are only denumerably many wffs within our language, but there is a non-denumerably infinite number of n-place number-theoretic relations and functions, so not all of them can be represented in a theory such as S.

Examples:

1. The identity relation on the set of natural numbers is expressible in S by the wff x₁ = x₂, since:
(a) If k₁ = k₂, then k̄₁ is the same term as k̄₂, so ⊢S k̄₁ = k̄₂ is an instance of (Ref=).
(b) Result (Num≠), above, established that for any natural numbers k₁ and k₂, if k₁ ≠ k₂, then ⊢S k̄₁ ≠ k̄₂.
2. The less than relation is expressible in S by the wff x₁ < x₂, i.e., (∃x)(x ≠ 0 ∧ x₁ + x = x₂).
3. The “zero function”, whose value is 0 for any natural number as argument, is strongly representable in S (or any other theory with identity) by the wff (x₁ = x₁ ∧ y = 0), since:
(a) For any natural number k, ⊢S (k̄ = k̄ ∧ 0 = 0)
(b) ⊢S (∃1y)(x₁ = x₁ ∧ y = 0)
4. The successor function is strongly representable in S by y = x₁′.
5. The “projection functions” Uⁿᵢ are functions which, for any n arguments, simply return their ith argument as value. E.g., U⁴₃(5, 8, 2, 13) = 2, and U⁴₃(7, 1, 0, 16) = 0. They are strongly representable in S (or any other theory with identity) by wffs of the form

(x₁ = x₁ ∧ . . . ∧ xₙ = xₙ ∧ y = xᵢ)

since:
(a) For any 〈k₁, . . . , kₙ〉, the value of Uⁿᵢ is kᵢ, and ⊢S (k̄₁ = k̄₁ ∧ . . . ∧ k̄ₙ = k̄ₙ ∧ k̄ᵢ = k̄ᵢ).
(b) ⊢S (∃1y)(x₁ = x₁ ∧ . . . ∧ xₙ = xₙ ∧ y = xᵢ).

Result: If K is a first-order theory with identity, then the number-theoretic function F is representable in K iff it is strongly representable in K.

Sketch of proof:
Note that part (ii) of the definition of strong representability entails (by Gen and UI) part (ii) of the definition of representability, so the right-to-left conditional holds. For the left-to-right conditional, suppose that F is represented in K by A [x₁, . . . , xₙ, y]. Then we can construct a wff B[x₁, . . . , xₙ, y] of the following form:

((∃1y)(A [x₁, . . . , xₙ, y]) ∧ A [x₁, . . . , xₙ, y]) ∨ (¬(∃1y)(A [x₁, . . . , xₙ, y]) ∧ y = 0)

Then F will be strongly represented by this complex wff, because (i) when m is the value of F for 〈k₁, . . . , kₙ〉 as argument, if the appropriate numerals replace x₁, . . . , xₙ and y in the above, the first disjunct is derivable, and so the whole is, and (ii) ⊢PF= (∃1y) B[x₁, . . . , xₙ, y].

The details of the proof of this theorem of PF= are sketched in the book, but, informally, either the first conjunct of the first disjunct must hold, or the first conjunct of the second disjunct must hold (but not both). In the former case, there is exactly one y such that A [x₁, . . . , xₙ, y], and in the latter, there is, of course, always exactly one y such that y = 0.

Result: Number-theoretic functions defined by substitution of strongly representable functions within strongly representable functions are also strongly representable. More precisely, if F is an n-place function whose value for 〈k₁, . . . , kₙ〉 is g(h₁(k₁, . . . , kₙ), . . . , hₘ(k₁, . . . , kₙ)), where g and h₁, . . . , hₘ are all strongly representable in K, then F is also strongly representable in K.

Sketch of proof:
Suppose that g is (strongly) represented in K by the wff B[x₁, . . . , xₘ, y] and h₁ through hₘ are (strongly) represented by the wffs A₁[x₁, . . . , xₙ, y] through Aₘ[x₁, . . . , xₙ, y]. It is then possible to represent F with the wff:

(∃z₁) . . . (∃zₘ)(A₁[x₁, . . . , xₙ, z₁] ∧ . . . ∧ Aₘ[x₁, . . . , xₙ, zₘ] ∧ B[z₁, . . . , zₘ, y])

A proof that the above wff satisfies parts (i) and (ii) of the definition of strong representability for F is given in the book, but the result is somewhat intuitively obvious.

For example, if multiplication can be strongly represented in K by A [x₁, x₂, y], and addition can be represented in K by B[x₁, x₂, y], then the function whose value for two natural numbers n and m is the product of n and m added to itself (i.e., nm + nm) can be represented in K by the wff:

(∃z₁)(∃z₂)(A [x₁, x₂, z₁] ∧ A [x₁, x₂, z₂] ∧ B[z₁, z₂, y])

Roughly, this says there is a z₁ and a z₂ where both z₁ and z₂ are the product of x₁ and x₂, and y is the sum of z₁ and z₂.

Characteristic Functions and Graphs

Definition: If R is an n-place number-theoretic relation, its characteristic function, written CR, is the n-place number-theoretic function defined as follows:

CR(k₁, . . . , kₙ) = 0 if R holds for 〈k₁, . . . , kₙ〉,
                    1 if not.

Examples:

(a) C<(3, 7) = 0 but C<(7, 3) = 1, etc.
(b) C=(2, 2) = 0 but C=(2, 3) = 1, etc.
(c) C|(3, 27) = 0 but C|(3, 26) = 1, etc.

Note that this is the reverse of many programming languages, etc., in which the “Boolean number” 1 is used for truth, and 0 is used for falsity.
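(As a quick illustration of the 0-for-truth convention, and not part of the book: here is a Python sketch of the characteristic function of the less-than relation, under the hypothetical name C_less.)

    # A minimal sketch (not from the book): the characteristic function of
    # the two-place less-than relation, with 0 for truth as described above.
    def C_less(x: int, y: int) -> int:
        return 0 if x < y else 1

    assert C_less(3, 7) == 0
    assert C_less(7, 3) == 1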

Result: For any theory K (e.g., system S) in which ⊢K 0 ≠ 1̄, the relation R is expressible in K iff its characteristic function CR is representable in K.

Proof:
(1) To see the truth of the left-to-right conditional, note that if R is expressed in K by the wff A [x₁, . . . , xₙ], then CR can be represented by the wff:

(A [x₁, . . . , xₙ] ∧ y = 0) ∨ (¬A [x₁, . . . , xₙ] ∧ y = 1̄).

(2) For the right-to-left conditional, note that if CR is represented in K by some wff A [x₁, . . . , xₙ, y], then given that ⊢K 0 ≠ 1̄, R can be expressed by the wff A [x₁, . . . , xₙ, 0].

Example: C< can be represented in S by the wff

((∃x)(x ≠ 0 ∧ x₁ + x = x₂) ∧ y = 0) ∨ (¬(∃x)(x ≠ 0 ∧ x₁ + x = x₂) ∧ y = 1̄).

Definition: If F is an n-place number-theoretic function, then the graph of F, written GF, is the (n + 1)-place number-theoretic relation that holds for 〈k₁, . . . , kₙ, kₙ₊₁〉 iff kₙ₊₁ is the value of F for 〈k₁, . . . , kₙ〉 as argument.

Example: The graph of the addition function is the relation that holds between three numbers just in case the sum of the first two numbers is the last.

Result: For any theory K, the function F is representable in K iff the graph of F is expressible in K.

(Proving this is homework.)

We will be making a good deal of use of characteristic functions in what follows, but relatively little use of graphs.
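(Again purely illustratively, and not from the text: the graph of addition, viewed as a three-place relation, can be sketched in Python as follows.)

    # A minimal sketch (not from the text): the graph of the two-place
    # addition function, as a three-place relation on natural numbers.
    def G_add(k1: int, k2: int, k3: int) -> bool:
        return k1 + k2 == k3

    assert G_add(2, 3, 5)
    assert not G_add(2, 3, 6)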

F. Primitive Recursive and Recursive Functions

We have been discussing number-theoretic functions and relations and what it is for them to be expressible or representable within an axiomatic system. There are two very important categories of functions that any axiomatic theory for number theory should be able to represent, viz., primitive recursive functions and recursive functions.

These two categories of functions have broad importance not only within logic, but also in mathematics and computer science generally. We shall later prove that the functions in these categories are representable within System S (i.e., Peano arithmetic). Our current task, however, is simply to get a better understanding of what it is for a function to fall into one or both of these two groups. For the moment, therefore, we’re putting system S on the shelf and will be discussing these functions entirely in the metalanguage. Therefore, all the mathematical notation that appears over the next several pages is the notation of ordinary mathematics, not the notation used within system S or any other formal theory.

Definition: The initial functions are the following functions:
(1) The (one-place) zero function Z, the value of which is 0 for any argument (i.e., for all x, Z(x) = 0).
(2) The (one-place) successor function N, the value of which is always the number one greater than its argument. (Note, we write this as N(x), not x′, to avoid confusing the metalanguage function sign and its counterpart in the object language of system S.)
(3) The (n-place) projection functions Uⁿᵢ, which, for any n arguments, simply return their ith argument as value. (There is a different one for each n and i.)

The following are not functions, but rules used for obtaining one function from others already defined.

Definition: An n-place function f is said to be obtained by substitution from the m-place function g and the n-place functions h₁, . . . , hₘ whenever the value of f can be determined as follows:

f(x₁, . . . , xₙ) = g(h₁(x₁, . . . , xₙ), . . . , hₘ(x₁, . . . , xₙ))

Definition: An (n + 1)-place function f is said to be obtained by recursion from the n-place function g and the (n + 2)-place function h iff both (i) the value of f can be determined as follows when 0 is its last argument:

f(x₁, . . . , xₙ, 0) = g(x₁, . . . , xₙ),

and (ii) whenever its last argument is other than 0, its value can be determined from its value for the argument’s predecessor as follows:

f(x₁, . . . , xₙ, y + 1) = h(x₁, . . . , xₙ, y, f(x₁, . . . , xₙ, y))

Or, in the case of a one-place function, we say that f can be obtained by recursion from the constant k (where k is a particular natural number) and the 2-place function h whenever its values can be determined as follows:

f(0) = k
f(y + 1) = h(y, f(y))
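(To make the recursion rule concrete, here is a Python sketch of my own, not the book’s, building addition from the initial functions with exactly the g and h shape just described.)

    # A minimal sketch (not from the book) of the recursion rule: addition
    # obtained by recursion from g = U^1_1 and h(x, y, w) = N(w), mirroring
    # x + 0 = x and x + (y + 1) = N(x + y).
    def Z(x): return 0                      # zero function
    def N(x): return x + 1                  # successor function
    def U(i, *args): return args[i - 1]     # projection U^n_i, with n = len(args)

    def add(x, y):
        if y == 0:
            return U(1, x)                  # g(x) = U^1_1(x) = x
        return N(add(x, y - 1))             # h(x, y - 1, add(x, y - 1)) = N(add(x, y - 1))

    assert add(2, 3) == 5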

Definition: An n-place function f is said to be obtained by the choice of least rule from the (n + 1)-place function g whenever the value of f can be characterized as follows:

f(x₁, . . . , xₙ) = the least natural number y such that g(x₁, . . . , xₙ, y) = 0.

(If there is not always such a y, f cannot be defined in this way.)

Note that if g is the characteristic function of some relation R, then the value of f(x₁, . . . , xₙ) will be the least y such that R holds for 〈x₁, . . . , xₙ, y〉.

Abbreviation: µy R(x₁, . . . , xₙ, y) means “the least y such that R holds of 〈x₁, . . . , xₙ, y〉.”

µ operates as a subnective, much like the use of the sign ι for descriptions. This “restricted µ-operator” is used in the metalanguage only. Your book also calls the choice of least rule the “restricted µ-operator rule”, for reasons that should be apparent.

Examples: µy(y > 6) = 7 and µy(y is prime and even) = 2.
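(Illustration only, not from the text: the choice of least rule can be mimicked by an unbounded search in Python, here under the hypothetical name mu; as noted above, it yields a total function only when a suitable y is guaranteed to exist.)

    # A minimal sketch (not from the notes) of the choice-of-least rule:
    # the least y with g(x1, ..., xn, y) = 0. The loop terminates only when
    # such a y exists, which is exactly the proviso stated above.
    def mu(g, *xs):
        y = 0
        while g(*xs, y) != 0:
            y += 1
        return y

    # the least y such that y > 6, via a characteristic function with 0 for truth:
    assert mu(lambda y: 0 if y > 6 else 1) == 7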

Definition: A number-theoretic function f is said to be primitive recursive iff it can be obtained from the initial functions by some finite number of applications of the rules of substitution and/or recursion.

Definition: A number-theoretic function f is said to be recursive iff it can be obtained from the initial functions by some finite number of applications of substitution, recursion, and/or the choice of least rule. (These are also called general recursive functions.)

Obviously, all primitive recursive functions are recursive (though we shall later prove that the converse does not hold).

Derivatively, a number-theoretic relation is said to be primitive recursive iff its characteristic function is primitive recursive.

A given subset of the natural numbers can be thought of as a one-place relation (property) on the natural numbers. So a given set of natural numbers can also (derivatively) be called primitive recursive (or recursive) iff all and only its members share some number-theoretic property the characteristic function of which is primitive recursive (or recursive).

The following sorts of manipulations always preserve (primitive) recursiveness:

Result: If the n-place function f is (primitive) recursive, then so is the (n + 1)-place function g, whose value, g(x₁, . . . , xₙ, xₙ₊₁), is always simply f(x₁, . . . , xₙ), so that the last argument to g is always simply ignored. (Adding dummy variables.)

Proof:
Function g can be defined by substitution using f and the projection functions:

g(x₁, . . . , xₙ, xₙ₊₁) = f(Uⁿ⁺¹₁(x₁, . . . , xₙ₊₁), . . . , Uⁿ⁺¹ₙ(x₁, . . . , xₙ₊₁)) ∎

Result: If the n-place function f is (primitive) recursive, then so is the n-place function g whose value, g(. . . , xᵢ, . . . , xⱼ, . . . ), is always f(. . . , xⱼ, . . . , xᵢ, . . . ). (Permuting variables.)

Proof:
Again, using substitution and projection:

g(. . . , xᵢ, . . . , xⱼ, . . . ) = f(. . . , Uⁿⱼ(x₁, . . . , xₙ), . . . , Uⁿᵢ(x₁, . . . , xₙ), . . . ) ∎

Result: If the (n + 1)-place function f is (primitive) recursive, then so is the n-place function g whose value, g(x₁, . . . , xₙ), is always f(x₁, x₁, . . . , xₙ). (Identifying variables.)

Proof:
Our method is similar to the above:

g(x₁, . . . , xₙ) = f(Uⁿ₁(x₁, . . . , xₙ), Uⁿ₁(x₁, . . . , xₙ), . . . , Uⁿₙ(x₁, . . . , xₙ)) ∎

The practical effect of these three results, especially when combined, is that they strengthen the substitution rule so that not all the h’s need to be n-place functions, nor do they have to put the variables in the same order as f, nor do they have to make use of all the x’s, etc. Similar results follow for the g and h used in the recursion rule. (We shall simply put this into practice from now on.)

Result: For any n, the n-place zero function Zⁿ is primitive recursive.

Proof:
This follows by substitution, since:

Zⁿ(x₁, . . . , xₙ) = Z(Uⁿ₁(x₁, . . . , xₙ)) ∎

Result: For any n and k, the n-place constant functions Cⁿₖ, the value of which for any n arguments is always k regardless of what the arguments are, are primitive recursive.

Proof:
This can be proven by induction on k. For k = 0, the n-place constant function is the same as the n-place zero function. For the rest, the n-place constant function whose value is always k + 1 can be defined by substitution since:

Cⁿₖ₊₁(x₁, . . . , xₙ) = N(Cⁿₖ(x₁, . . . , xₙ)) ∎

One of the consequences of this is that by using such functions in place of one of the h’s in the definition of substitution, we can in effect simply place a given natural number into the appropriate argument spot of g. Similarly, if we use such a function in place of the g in recursion, we can simply identify the value of f when its last argument is 0 with a fixed natural number, even when n > 0. (We do this, e.g., in the definition of xʸ below.)

The class of recursive functions has been proven equivalent to the class of Turing machine-computable functions, or roughly, those whose value a calculator or computer can in principle determine using a mechanical procedure given enough time. This may provide some intuitive insights as we continue our discussion of them.

Result: The functions below are primitive recursive.

(a) Addition: x + y. Definable by recursion:
  x + 0 = U¹₁(x) = x
  x + (y + 1) = N(x + y)
(b) Multiplication: x · y. Recursion again:
  x · 0 = Z(x) = 0
  x · (y + 1) = (x · y) + x
(c) x to the power of y: xʸ. Recursion:
  x⁰ = C¹₁(x) = 1
  xʸ⁺¹ = (xʸ) · x
(d) Predecessor: δ(x). Recursion:
  δ(0) = 0
  δ(y + 1) = U²₁(y, δ(y)) = y
(e) Subtract-as-much-as-you-can: x ∸ y. Recursion:
  x ∸ 0 = x
  x ∸ (y + 1) = δ(x ∸ y)
(f) Absolute difference: |x − y|. Substitution:
  |x − y| = (x ∸ y) + (y ∸ x)
(g) Signum: sg(x). Substitution:
  sg(x) = x ∸ δ(x)
  (Yields 1 for everything except 0, for which it yields 0. This function and the next are very helpful in defining characteristic functions.)
(h) Reverse signum: s̄g(x). Substitution:
  s̄g(x) = 1 ∸ sg(x)
  (Yields 0 for everything except 0, for which it yields 1.)
(i) Factorial: x! Recursion:
  0! = 1
  (y + 1)! = y! · (y + 1)
(j) Minimum of 2 arguments: min(x, y). Substitution:
  min(x, y) = x ∸ (x ∸ y)
(k) For any n > 2, the minimum of n arguments, because each such function can be defined by substitution using the previous one:
  min(x₁, . . . , xₙ, xₙ₊₁) = min(min(x₁, . . . , xₙ), xₙ₊₁)
(l) Maximum of 2 (or more) arguments:
  max(x, y) = y + (x ∸ y)
  max(x₁, . . . , xₙ, xₙ₊₁) = max(max(x₁, . . . , xₙ), xₙ₊₁)
(m) Remainder upon division of y by x: rm(x, y). Recursion:
  rm(x, 0) = 0
  rm(x, y + 1) = N(rm(x, y)) · sg(|x − N(rm(x, y))|)
(n) Quotient upon division of y by x: qt(x, y). (Rounded down.) Recursion:
  qt(x, 0) = 0
  qt(x, y + 1) = qt(x, y) + s̄g(|x − N(rm(x, y))|)

Remember that all the mathematical notation on the list above is the notation of ordinary mathematics. We have not shown how these functions could be represented in System S or any other axiomatic system built upon the predicate calculus (at least not yet anyway).
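(As a sanity check, and not part of the text: a few of the items above, written with ordinary Python arithmetic just to verify the defining equations for predecessor, subtract-as-much-as-you-can, signum and reverse signum.)

    # A minimal sketch (not from the book) checking the defining equations of
    # a few items above: predecessor, subtract-as-much-as-you-can, signum and
    # reverse signum.
    def pred(x):                 # (d) delta(x)
        return 0 if x == 0 else x - 1

    def monus(x, y):             # (e) x ∸ y
        return max(x - y, 0)

    def sg(x):                   # (g) sg(x) = x ∸ delta(x): 1 for nonzero, 0 for 0
        return monus(x, pred(x))

    def sg_bar(x):               # (h) reverse signum: 1 for 0, 0 otherwise
        return monus(1, sg(x))

    assert (sg(0), sg(5)) == (0, 1)
    assert (sg_bar(0), sg_bar(5)) == (1, 0)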

Bounded Sums and Products

The following notation:

∑_{z<y} f(x₁, . . . , xₙ, z)

stands for the (n + 1)-place bounded sum function g, whose value for 〈x₁, . . . , xₙ, y〉 as argument is the sum of all the values of f for 〈x₁, . . . , xₙ, 0〉 through 〈x₁, . . . , xₙ, y − 1〉.

Result: If f is (primitive) recursive, then so is the bounded sum g, as explained above.

Proof:
The function g can be defined by recursion as follows:

g(x₁, . . . , xₙ, 0) = 0
g(x₁, . . . , xₙ, y + 1) = g(x₁, . . . , xₙ, y) + f(x₁, . . . , xₙ, y) ∎

Similar results follow for the form:

∑_{z≤y} f(x₁, . . . , xₙ, z)

This is definable by substitution, since:

∑_{z≤y} f(x₁, . . . , xₙ, z) = ∑_{z<y+1} f(x₁, . . . , xₙ, z)

Similarly for doubly bounded sums:

∑_{y<z<v} f(x₁, . . . , xₙ, z) = ∑_{z<δ(v∸y)} f(x₁, . . . , xₙ, z + y + 1)

The following notation:

∏_{z<y} f(x₁, . . . , xₙ, z)

stands for the (n + 1)-place bounded product function g, whose value for 〈x₁, . . . , xₙ, y〉 as argument is the product of all the values of f for 〈x₁, . . . , xₙ, 0〉 through 〈x₁, . . . , xₙ, y − 1〉.

Result: If f is (primitive) recursive, so is the bounded product g, as characterized above.

Proof:
Again g can be defined by recursion, since:

g(x₁, . . . , xₙ, 0) = 1
g(x₁, . . . , xₙ, y + 1) = g(x₁, . . . , xₙ, y) · f(x₁, . . . , xₙ, y) ∎

Similar results follow for bounded products for all z ≤ y, as well as doubly bounded products (y < z < v).

Bounded sums can be used in clever ways to ‘scan’ ranges of numbers and ‘count’ those numbers with certain characteristics. For example, consider the tally function τ, whose value for x as argument is the number of factors of x less than or equal to x itself. This function can be defined as follows:

τ(x) = ∑_{z≤x} s̄g(rm(z, x))

This function ‘scans’ the numbers up to and including x, and each time it encounters one the remainder of which is 0 when divided into x, the value of the reverse signum function is 1, and so one more is added to the bounded sum.
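(Here is a small Python check of my own of the bounded-sum idea behind τ, counting divisors with the reverse-signum trick; the scan starts at 1 here simply to sidestep the degenerate z = 0 case.)

    # A quick check (mine, not the book's) of the bounded-sum idea behind
    # the tally function: count the divisors of x, scanning from 1 upward.
    def tally(x: int) -> int:
        return sum(1 if x % z == 0 else 0 for z in range(1, x + 1))

    assert tally(12) == 6   # divisors 1, 2, 3, 4, 6, 12
    assert tally(7) == 2    # 7 is prime: divisors 1 and 7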

Relations and Recursion

Recall that a relation is said to be (primitive) recursive iff its characteristic function is a (primitive) recursive function.

Definition: The negation of a number-theoretic relation R, viz., “not-R”, is the relation that holds of a given n-tuple of natural numbers 〈k₁, . . . , kₙ〉 iff R does not hold of 〈k₁, . . . , kₙ〉.

Definition: The conjunction of number-theoretic relations R and S, written “R-and-S”, is the relation that holds of 〈k₁, . . . , kₙ〉 iff R and S both hold of 〈k₁, . . . , kₙ〉.

Definition: The disjunction of two relations, written “R-or-S”, is the relation that holds for 〈k₁, . . . , kₙ〉 iff either R or S holds for 〈k₁, . . . , kₙ〉.

(Similar terminology is used for other propositional connectives.)

If we think of relations set-theoretically, conjunctions are really intersections, disjunctions are really unions, negations are really complements, etc.

Mendelson sometimes uses notation such as “R ∨ S” and “¬R” for negations and disjunctions of relations. This notation can be misleading, because it is still part of the metalanguage, not the object language. Therefore, I use the English words.

Result: If R and S are (primitive) recursive relations, then so are their negations, conjunctions, disjunctions, and so on.

Proof:
By definition, if R and S are (primitive) recursive, their characteristic functions CR and CS are (primitive) recursive, in which case the characteristic functions of their negations and disjunctions can be defined as follows:

Cnot-R(x₁, . . . , xₙ) = s̄g(CR(x₁, . . . , xₙ))
CR-or-S(x₁, . . . , xₙ) = CR(x₁, . . . , xₙ) · CS(x₁, . . . , xₙ)

Other propositional operations on relations can be defined in terms of disjunction and negation. ∎

I use the notation:

∃z<y R(x₁, . . . , xₙ, z)

in the metalanguage(!) to stand for the relation Q that holds for 〈x₁, . . . , xₙ, y〉 iff the relation R holds for at least one ordered (n + 1)-tuple of the form 〈x₁, . . . , xₙ, z〉 where z < y. (Mendelson writes instead (∃z)z<y R(x₁, . . . , xₙ, z), but I find this too close to the notation used in the object language, and potentially confusing.)

Result: If R is a (primitive) recursive number-theoretic relation, so is the existentially quantified relation Q, as annotated above.

Proof:
The characteristic function for Q can be defined in terms of the characteristic function for R by substitution as follows:

CQ(x₁, . . . , xₙ, y) = ∏_{z<y} CR(x₁, . . . , xₙ, z)

Note that if R holds for at least one 〈x₁, . . . , xₙ, z〉 where z < y, then, for that 〈x₁, . . . , xₙ, z〉, the value of the characteristic function will be 0, in which case the value of the bounded product will also be 0. If there is no such z, then the value of the characteristic function will always be 1, and so the bounded product will also yield 1 as value. ∎

Similar results follow for bounded existential quantifiers using ≤, doubly bounded existential quantifiers, etc.

The notation:

∀z<y R(x₁, . . . , xₙ, z)

is used in the metalanguage to stand for the relation Q that holds for 〈x₁, . . . , xₙ, y〉 just in case the relation R holds for all ordered (n + 1)-tuples of the form 〈x₁, . . . , xₙ, z〉 where z < y.

Result: If the number-theoretic relation R is (primitive) recursive, then so is the bounded universally quantified relation Q, as annotated above.

Proof:

CQ(x₁, . . . , xₙ, y) = sg(∑_{z<y} CR(x₁, . . . , xₙ, z))

Here the bounded sum ‘scans’ the values of z less than y and adds one whenever it finds one for which R does not hold for 〈x₁, . . . , xₙ, z〉. Therefore, the value of the signum function is 0 iff this ‘scan’ finds no such z.

(We could also have defined this using the bounded existential quantifier, since ∀z<y . . . is the same as not-∃z<y not- . . . .) ∎

Similar results follow for bounded universal quantifiers using ≤, and doubly bounded ones, etc.

The notation:

µz_{z<y} R(x₁, . . . , xₙ, z)

is used to stand for the function g whose value for 〈x₁, . . . , xₙ, y〉 as argument is the least number z less than y for which the relation R holds for 〈x₁, . . . , xₙ, z〉 if there is such a z, and whose value is y if there is no such z.

Result: If the relation R is (primitive) recursive, then so is the function g, defined by the bounded µ-operator above.

Proof:
Function g can be defined using the characteristic function of R by substitution as follows:

g(x₁, . . . , xₙ, y) = ∑_{z<y} ∏_{w≤z} CR(x₁, . . . , xₙ, w)

(As z increases, the bounded product will keep adding 1 to the bounded sum so long as R does not hold for any 〈x₁, . . . , xₙ, w〉 where w ≤ z. As soon as a z is reached for which R does hold for some 〈x₁, . . . , xₙ, w〉 where w ≤ z, the bounded product will stop adding to the bounded sum, and so the result will be identical to the least such z.) ∎

Note that because functions defined using the bounded µ-operator do not make use of the unbounded µ-operator, it is possible for such functions to be primitive recursive, not simply recursive, provided that CR is primitive recursive.

Similar results follow for bounded µ-operators using ≤ instead of <, and doubly bounded µ-operators.

Result: The relations and functions listed below are primitive recursive.

1. The identity relation, since:

C=(x, y) = sg(|x − y|)

2. The relation of being less than:

C<(x, y) = s̄g(y ∸ x)

3. The relation of being evenly divisible by:

C|(x, y) = sg(rm(x, y))

4. The property of being prime:

CPr(x) = C=(τ(x), 2)

(Recall that τ(x) is the number of divisors of x less than or equal to x. Remember that a number is prime iff it has exactly two such divisors, 1 and itself.)

5. The function p_x, whose value for x as argument is the xth prime number:

p_0 = 2
p_{y+1} = µz_{z ≤ (p_y)! + 1} (p_y < z and Pr(z))

(The bound placed on z comes from Euclid’s proof that there is no greatest prime number.)

It is a well known mathematical result that every positive integer x has a unique prime factorization,

x = (p_0)^{a_0} · (p_1)^{a_1} · . . . · (p_k)^{a_k}

where a_0, . . . , a_k are the series of exponents of the first k primes, and p_k is the largest prime that evenly divides x.

6. The function (x)_y, whose value is the exponent on the yth prime in the prime factorization of x, is primitive recursive:

(x)_y = µz_{z<x} ((p_y)^z | x and not-((p_y)^{z+1} | x))

This will return the least z such that (p_y)^z goes evenly into x but (p_y)^{z+1} does not.

7. The notation ℓ(x), read “the length of x”, is used for the function that has as value the number of prime factors of x (i.e., the number of y’s such that (x)_y is not zero). This function is also primitive recursive. See the book for details.
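(Illustration only, not from the book: the three helper functions just listed, p_x, (x)_y and ℓ(x), can be mimicked in Python as follows, under the hypothetical names p, exp_of and length.)

    # A minimal sketch (not from the book) of the helper functions p_x,
    # (x)_y and l(x), here called p, exp_of and length.
    def is_prime(n: int) -> bool:
        return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

    def p(x: int) -> int:                  # p(0) = 2, p(1) = 3, p(2) = 5, ...
        n, count = 1, -1
        while count < x:
            n += 1
            if is_prime(n):
                count += 1
        return n

    def exp_of(x: int, y: int) -> int:     # (x)_y: exponent of p(y) in x
        e, q = 0, p(y)
        while x % q == 0:
            x, e = x // q, e + 1
        return e

    def length(x: int) -> int:             # number of y with (x)_y nonzero
        return sum(1 for y in range(x) if exp_of(x, y) != 0)

    assert p(3) == 7
    assert exp_of(360, 0) == 3             # 360 = 2^3 * 3^2 * 5
    assert length(12) == 2                 # 12 = 2^2 * 3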

Result: If the functions g₁, . . . , gₘ and the relations R₁, . . . , Rₘ are all (primitive) recursive, then so is the function f whose value can be informally characterized as follows:

f(x₁, . . . , xₙ) = g₁(x₁, . . . , xₙ)  if R₁(x₁, . . . , xₙ)
                   . . .
                   gₘ(x₁, . . . , xₙ)  if Rₘ(x₁, . . . , xₙ)

Proof:
The above definition is equivalent to the following:

f(x₁, . . . , xₙ) = (g₁(x₁, . . . , xₙ) · s̄g(CR₁(x₁, . . . , xₙ))) + . . . + (gₘ(x₁, . . . , xₙ) · s̄g(CRₘ(x₁, . . . , xₙ))) ∎
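(In ordinary programming terms, my gloss and not the book’s: the trick is just to multiply each gᵢ by a 0-or-1 ‘switch’ built from the relevant characteristic function and then add, as in this two-case Python sketch.)

    # A two-case sketch (mine) of definition by cases:
    # f(x) = x // 2 if x is even, and 3x + 1 if x is odd.
    def C_even(x: int) -> int:             # characteristic function, 0 = true
        return 0 if x % 2 == 0 else 1

    def sg_bar(n: int) -> int:             # reverse signum: the 0/1 'switch'
        return 1 if n == 0 else 0

    def f(x: int) -> int:
        # 1 - C_even(x) is the characteristic function of being odd
        return (x // 2) * sg_bar(C_even(x)) + (3 * x + 1) * sg_bar(1 - C_even(x))

    assert f(10) == 5
    assert f(7) == 22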

G. Number Sequence Encoding

Although this is somewhat counterintuitive, there are only denumerably many n-tuples of natural numbers for any positive integer n. The following chart shows one way of enumerating all ordered pairs of natural numbers (each pair is followed by the number assigned to it):

〈0, 0〉: 0   〈1, 0〉: 2   〈2, 0〉: 5    〈3, 0〉: 9    〈4, 0〉: 14   〈5, 0〉: 20   · · ·
〈0, 1〉: 1   〈1, 1〉: 4   〈2, 1〉: 8    〈3, 1〉: 13   〈4, 1〉: 19   · · ·
〈0, 2〉: 3   〈1, 2〉: 7   〈2, 2〉: 12   〈3, 2〉: 18   · · ·
〈0, 3〉: 6   〈1, 3〉: 11  〈2, 3〉: 17   · · ·
〈0, 4〉: 10  〈1, 4〉: 16  · · ·
〈0, 5〉: 15  · · ·
...

We begin by enumerating all pairs whose elements add up to zero, then those whose elements add up to 1, then those that add up to 2, etc., in a systematic way by moving upwards along the diagonals. If this is continued ad infinitum, no ordered pair will be left out, and all natural numbers will be used.

Moreover, the 2-place function whose value, for a given x and y, is the natural number corresponding to 〈x, y〉 in this enumeration can be defined thus:

ρ(x, y) = qt(2, x² + y² + 2xy + x + y) + x

This function is primitive recursive. So are the inverse functions ρ*₁(z) and ρ*₂(z), whose values for a given z are the first and second elements respectively of the corresponding ordered pair.
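(Not part of the text: here is a Python check of ρ against the chart, with the inverse recovered by a simple brute-force search rather than by the primitive recursive definitions the book would use.)

    # A minimal check (not from the text) of the pairing function rho and a
    # brute-force inverse.
    def rho(x: int, y: int) -> int:
        return (x * x + y * y + 2 * x * y + x + y) // 2 + x

    def rho_inv(z: int):
        x = 0
        while True:
            for y in range(z + 1):
                if rho(x, y) == z:
                    return (x, y)
            x += 1

    assert rho(2, 0) == 5 and rho(0, 3) == 6
    assert rho_inv(14) == (4, 0)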

Similar sorts of mappings can be devised for enumerating all 3-tuples, 4-tuples, etc. Mendelson actually does these sorts of mappings in a slightly different way (less easy to put on a chart), calls his function σ² instead of ρ, and calls the inverse functions σ²₁ and σ²₂. He then proves that it’s possible to define a function σᵏ similar to σ² for any k > 0, as well as corresponding inverse functions, and that all such functions can be shown to be primitive recursive. We will not have much call for this, as it is superseded by the uniform method below.

Arbitrary Number Sequence Encoding

We can actually devise a single uniform method for encoding any finite sequence (of arbitrary length) of positive integers with a single natural number. This has a variety of uses, and plays a crucial role in Gödel numbering.

We do it as follows. Suppose the finite sequence we want to encode is the following:

a₀, a₁, a₂, . . . , aₖ

This sequence can be “encoded” using the number obtained by raising the first k + 1 prime numbers (starting with 2) to these numbers as powers in order, and multiplying, so that the above becomes:

(p_0)^{a_0} · (p_1)^{a_1} · (p_2)^{a_2} · . . . · (p_k)^{a_k}

The result is a single positive integer: most likely, a very large one, but a single integer nonetheless. If we used the same method for ‘encoding’ any different finite sequence of positive integers, the result would always be a different integer, because it would have a different prime factorization.
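(Illustration only: a Python sketch of this prime-power encoding, under the hypothetical name encode.)

    # A minimal sketch (not from the notes) of prime-power sequence encoding:
    # encode([a0, a1, ..., ak]) = 2^a0 * 3^a1 * 5^a2 * ...
    def primes():
        n = 1
        while True:
            n += 1
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                yield n

    def encode(seq):
        code, gen = 1, primes()
        for a in seq:
            code *= next(gen) ** a
        return code

    assert encode([3, 1, 2]) == 2**3 * 3**1 * 5**2   # = 600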

There are certain primitive recursive functions that are very helpful in working with and manipulating the numbers used for such encoding. Three of them, p_x, (x)_y and ℓ(x), have already been discussed.

The function (x)_y can be used to ‘retrieve’ a given element of the sequence from the number used to encode it. For example, if x encodes the sequence a₀, a₁, a₂, . . . , aₖ, then we see that (x)_i = a_i for any 0 ≤ i ≤ k.

Suppose that x encodes the sequence:a0, a1, a2, . . . , ak. Suppose also that y encodesthe sequence:

b0, b1, b2, . . . , bj

Suppose that now we want to encode the sequencethat puts the b-sequence after the a-sequence:

a0, a1, a2, . . . , ak, b0, b1, b2, . . . , bj

Note that we cannot simply multiply x and y: thiswould lead to the number that encodes:

(a0 + b0), (a1 + b1), (a2 + b2), . . . , etc.

78

Page 82: Mathematical Logic I

This is not what we want. Instead, we deVne thefollowing function of x and y:

x ∗ y = x ·∏

z<`~(y)(p`~(x)+z)(y)z

The function x ∗ y, as you can see, is primitiverecursive, and is called the juxtaposition function,because it is used in juxtaposing sequences of pos-itive integers. (They must be positive integers: ifeither sequence were to contain 0, the `~ functionwon’t return the correct sequence length.)

Do not be mislead by the fact that a numberof programming languages, and software, etc., usethe sign ‘∗’ for multiplication. That is not what thesign ‘∗’ used here means.

Besides its use in Gödel numbering, numbersequence encoding can be used in the recursivedeVnitions of certain functions that might not oth-erwise seem recursive.

Course-of-Values Recursion

Because a single number can be used to encode aVnite sequence of numbers, it is possible to deVnea function whose value for y as argument encodesthe sequence (“course”) of values of another func-tion for all arguments leading up to and includingy.

If f is a (n + 1)-place number-theoretic func-tion, then the notation ‘f#’ is used for the(n + 1)-place number-theoretic function whosevalue for 〈x1, . . . , xn, y〉 is the number that en-codes the series of values for f for all (n + 1)-tuples starting with 〈x1, . . . , xn, 0〉 and endingwith 〈x1, . . . , xn, y − 1〉.

Result: A function f is (primitive) recursive iUf# is (primitive) recursive.

Proof:We prove this in both directions.(a) Suppose that f has already been shown to be

(primitive) recursive. One can then obtain f#

by substitution as follows:

f#(x1, . . . xn, y) =∏z<y

(pz)f(x1,...xn,z)

(b) On the other hand, suppose that f# has al-ready been shown to be (primitive) recursive.One can then obtain f by substitution as fol-lows:

f(x1, . . . xn, y) = (f#(x1, . . . , xn, y + 1))ye

Sometimes it is easier to deVne f# recursivelythan it is to deVne f , especially, a function whosevalue for a given number depends not only uponits value for the previous number, but upon morethan one or even all of its prior values. Such func-tions are said to be obtained by course-of-valuesrecursion, rather than simple recursion.

Example: Consider fib(x) whose value for any xis the xth item in the Fibonacci sequence, whichadds the previous two members to get the next(staring with 1, 1):

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . , and so on.

This function cannot simply be deVned by the re-cursion rule, because its value for y + 1 dependsnot only on its value for y but also on its value fory − 1.

However, fib # can be obtained by the simplerecursion rule as follows:

fib #(0) = 0fib #(y + 1) = (sg(C<(y, 2)) · (4y + 2)) +

(C<(y, 2) · fib #(y) · (py)(fib #(y))δ(y)+(fib #(y))δ(δ(y)))Since fib # is primitive recursive, and fib can beobtained from it by substitution:

fib(x) = (fib #(x+ 1))xThe function fib is also primitive recursive.

Similar results follow for those relations one wouldwant to deVne in a similar sort of way. Generally,we can say that if a function f is obtained from(primitive) recursive functions by course-of-valuesrecursion, then f is (primitive) recursive itself. (Afuller proof is given in the book.)

Course-of-values recursion is to simple recur-sion what strong induction is to weak induction.

79

Page 83: Mathematical Logic I

Gödel’s β-Function

Consider the following primitive recursive func-tion, deVned as follows:

β(x1, x2, x3) = rm(1 + ((x3 + 1) · x2), x1)

Surprisingly, for any series of natural numberswith n+ 1 members

k0, k1, k2, . . . , kn

one can Vnd two Vxed natural numbers b and csuch that, for any i, such that i ≤ n, β(b, c, i) = ki.

To see this, Vrst, let c be

(max(n, k0, k1, k2, . . . , kn))!

Next, consider the following sequence:

u0, u1, u2, . . . , un

where each ui = 1 + ((i+ 1) · c), for all i ≤ n.No two members of the u-series have a factor

in common other than 1. (It is a matter of tediousarithmetic to show this.)

It follows from this and a known principle ofmodular arithmetic, the Chinese remainder theorem,that there is at least one number b such that theremainder upon division of b by ui is always kifor every i from 0 to n. (The proof of this is moretedious arithmetic.)

Therefore, because each ui is 1 + ((i+ 1) · c),it follows that rm(1 + ((i+ 1) · c), b) = ki, whichis to say that β(b, c, i) = ki.

Example: Suppose that our k-sequence is simply:

1, 2, 1

Then n = 2 and max(2, 1, 2, 1) is also 2, and V-nally, c = 2!, which is also 2.

Then, the u-series is:

3, 5, 7

(These numbers share no common factor.)It follows by the Chinese remainder theo-

rem that there at least one number b such thatrm(3, b) = 1, and rm(5, b) = 2, and rm(7, b) = 1.(In this case, b could be 22, or 127, etc.)

For this sequence, for all 0 ≤ i ≤ 2,

β(22, 2, i) = ki

The upshot of all this is that it provides yet an-other method of talking indirectly about sequencesof numbers. Each sequence corresponds to a band c, which, when combined together with theβ-function, give us a way of ‘retrieving’ elementsin the sequence. Claims made about sequencesof numbers can be transformed into claims madeabout the b and c that it would be appropriate touse as the Vrst two arguments to the β-functionfor that sequence.

This will also help us prove that all recursivefunctions are representable in S.

For further discussion of the Chinese Re-mainder Theorem, see the book, p. 184, 188,419, and: http://www.cut-the-knot.org/blue/chinese.shtml

H. Representing RecursiveFunctions in System S

Our next task is to show that every recursive func-tion is representable in System S.

Recall that recursive functions are those ob-tained from the initial functions (the zero function,the successor function and the projection func-tions) by some Vnite number of applications of thesubstitution, recursion and choice of least rules.

Therefore, in order to prove our result, we needonly show the following: (a) the initial functionsare representable in S, (b) the rule of substitutionpreserves representability in S, (c) the rule of re-cursion preserves representability in S, and (d) thechoice of least rule preserves representability in S.

On pp. 69–70, we showed how the initial func-tions could be (strongly) represented in S, and wealso discussed why it is that the substitution rulepreserves (strong) representability.

Therefore, what’s left is to show that the re-cursion and choice of least rules preserve repre-sentability. We begin with the easier of the two.

80

Page 84: Mathematical Logic I

Result (Choice-of-least Lemma): The choiceof least rule preserves representability in S: Moreprecisely, if a given (n + 1)-place number-theoretic function g is representable in S, andf is an n-place number theoretic function whosevalue for a given 〈x1, . . . , xn〉 can be character-ized as follows:

f(x1, . . . , xn) = the least natural number y

such that g(x1, . . . , xn, y) = 0

then, f is also representable in S.

Proof:1. Suppose that g is represented in S by the wU

E [x1, . . . , xn, xn+1, y]. By deVnition, then:(1a) if the value of g for 〈k1, . . . , kn, kn+1〉 as

argument ism, then`S E [k1, . . . , kn, kn+1,m]; and

(1b) `S (∃1y) E [k1, . . . , kn, kn+1, y].2. We can represent f in S using the wU:

E [x1, . . . , xn, y, 0] ∧(∀z)(z < y ⇒ ¬E [x1, . . . , xn, z, 0])

3. We need to show that:(3a) If the value of f for 〈k1, . . . , kn〉 is j, then,

`S E [k1, . . . , kn, j, 0] ∧(∀z)(z < j ⇒ ¬E [k1, . . . , kn, z, 0])

(3b) `S (∃1y)(E [k1, . . . , kn, y, 0] ∧(∀z)(z < y ⇒ ¬E [k1, . . . , kn, z, 0]))

4. To show (3a), Vrst assume that the value of ffor 〈k1, . . . , kn〉 is j. Then j must be the least ysuch that g(k1, . . . , kn, y) = 0. Hence, by (1a):(4a) `S E [k1, . . . , kn, j, 0]

For any 0 ≤ i < j, g(k1, . . . , kn, i) issomething other than 0, so, by (1a) and(1b), we can get:

(4b) `S ¬E [k1, . . . , kn, 0, 0] ∧¬E [k1, . . . , kn, 1, 0] ∧ . . . ∧

¬E [k1, . . . , kn, j − 1, 0]By a corollary proven on p. 66, it followsfrom (4b) that:

(4c) `S (∀z)(z < j ⇒ ¬E [k1, . . . , kn, z, 0])Conjoining (4a) with (4c), we get (3a).

5. To get (3b), since we have (3a), we need onlyprove uniqueness. We Vrst make an assumption:(5a) E [k1, . . . , kn, y, 0] ∧

(∀z)(z < y ⇒ ¬E [k1, . . . , kn, z, 0])By the theorem (Order), we have that:

(5b) `S y = j ∨ y < j ∨ y > j.However, the second and third disjuncts of(5b) lead to contradictions with (5a), (4a)and (4c), which leaves only the Vrst, andso, by DT:

(5c) `S E [k1, . . . , kn, y, 0] ∧ (∀z)(z < y ⇒¬E [k1, . . . , kn, z, 0])⇒ y = j

By Gen, (3a), EG, and exercise 2.70e, weget (3b). e

Result (Recursion Lemma): The recursion rulepreserves representability in S: More precisely,if a given n-place number-theoretic function gis representable in S, and a given (n+ 2)-placenumber-theoretic function h is also representablein S, and f is an (n+ 1)-place number-theoreticfunction, for which it is true for all x1, . . . , xnand y that:

(i) f(x1, . . . , xn, 0) = g(x1, . . . , xn)(ii) f(x1, . . . , xn, y + 1) =

h(x1, . . . , xn, y, f(x1, . . . , xn, y))then, f is also representable in S.

Proof:(This proof is very complex, perhaps the most com-plex single proof of the semester. It is given indetail in the book. I’m not going to try to recreateall the details here, but will merely give a roughoutline.)1. Suppose that f is obtained from g and h recur-

sively as suggested above. Then, if m is thevalue of f for some 〈x1, . . . , xn, y〉, there mustbe some Vnite sequence,

v0, v1, . . . , vy

(the course-of-values of f for all arguments lead-ing up to and including y) where vy is m andalso: v0 = g(x1, . . . , xn), and for all 0 ≤ i < y,

81

Page 85: Mathematical Logic I

vi+1 = h(x1, . . . , xn, i, vi)E.g., if f(x, y) is xy, the v-sequence would be:

1, x, x2, x3, . . . , xy

I.e., C11(x), C1

1(x) · x, (C11(x) · x) · x, etc.

2. However, talk about any Vnite sequence can beproxyed using Gödel’s β-function, i.e.:

β(x1, x2, x3) = rm(1 + ((x3 + 1) · x2), x1)

This can be strongly represented in S by the wU:

(∃z)(x1 = ((1 + ((x3 + 1) · x2)) · z) + y ∧y < 1 + ((x3 + 1) · x2))

Hereafter we’ll use “Bt[x1, x2, x3, y]” as short-hand for the above. The proof that this wUstrongly represents the β function comes easilyfrom (UQR) and the expressibility of the lessthan relation by “x1 < x2”.

3. We can then use Bt[x1, x2, x3, y] to constructstatements in S that make assertions about V-nite sequences, and in particular, those Vnitesequences that correspond to partial courses-of-values of recursive functions for all argumentsup to a given point. This is all we need to repre-sent such functions.

4. We supposed that g and h are representable inS. Suppose that the wU that represents g is:

A [x1, . . . , xn, y]

And suppose that the wU that represents h is:

E [x1, . . . , xn, xn+1, xn+2, y]

By deVnition, then, for any natural numbersk1, . . . , kn, kn+1, kn+2 andm:(4a) if the value of g for 〈k1, . . . , kn〉 ism, then

`S A [k1, . . . , kn,m];(4b) `S (∃1y) A [k1, . . . , kn, y];(4c) if the value of h for 〈k1, . . . , kn, kn+1, kn+2〉

ism, then `S E [k1, . . . , kn, kn+1, kn+2,m];(4d) `S (∃1y) E [k1, . . . , kn, kn+1, kn+2, y].

5. Given the above, it is possible to represent fwith the following wU:

(∃z1) (∃z2)(

(∃y2)(Bt[z1, z2, 0, y2] ∧

A [x1, . . . , xn, y2]) ∧ Bt[z1, z2, xn+1, y] ∧(∀z3)

(z3 < xn+1 ⇒

(∃y3) (∃y4)(Bt[z1, z2, z3, y3] ∧

Bt[z1, z2, z′3, y4] ∧ E [x1, . . . , xn, z3, y3, y4])

))Ugly! What on earth does this say?!

• What we want it to say is that y isthe value of the recursive function f for〈x1, . . . , xn, xn+1〉 as argument. Does it?

• Remember that the β function is used totalk indirectly about Vnite sequences. Be-cause each Vnite sequence corresponds toa Vxed b and c such that β(b, c, i) is alwaysthe ith member of the sequence, quantiV-cation over sequences can in eUect be doneby quantifying over two numbers. The ex-istential quantiVcation over z1 and z2 atthe start of this wU in eUect says “there isa Vnite sequence such that . . . ”.

• Given that Bt[. . .] represents the β func-tion, and A [. . .] represents g, the Vrst con-junct on the inside says that there is a y2at the start (0-spot) of the sequence, andit’s the value of g for 〈x1, . . . xn〉. This ba-sically says how the sequence of values ofthe recursive function begins.

• Next, it says that y is at the xn+1-spot ofthe sequence of values, which is to be ex-pected if y is the value of f when f ’s lastargument is xn+1.

• Lastly, given that E [. . .] represents h, itsays that for each previous spot in the se-quence (the z3-spot, where z3 < xn+1), themember of the sequence at the next spot(y4) is obtained from the member at thez3-spot (y3) in the appropriate way fromthe h function.

6. This will (hopefully) be much clearer with an ex-ample. With xx2

1 , the functions used in its recur-sive deVnition are the constant function whosevalue is always 1 (this plays the role of g) and

82

Page 86: Mathematical Logic I

multiplication (this plays the role of h). Mak-ing some minor simpliVcations, these are repre-sented in S by the wUs y = 1 and y = x1 · x2.According to the above recipe, the function xx2

1is represented by the following wU:

(∃z1) (∃z2)(

(∃y2)(Bt[z1, z2, 0, y2] ∧ y2 = 1)

∧ Bt[z1, z2, x2, y] ∧ (∀z3)(z3 < x2 ⇒

(∃y3) (∃y4)(Bt[z1, z2, z3, y3] ∧ Bt[z1, z2, z′3, y4]

∧ y4 = y3 · x1)))

This says that there is a sequence of naturalnumbers with x2+1 members, the Vrst of whichis 1, the last of which is y, and each one relatesto the previous one by being its product whenmultiplied by x1. With some thought, it is clearthat this is the case if and only if y = xx2

1 .7. Of course, we still need to prove that the result-

ing wU satisVes the conditions for representingf , i.e., we need to show that:(7a) If the value of f for 〈k1, . . . , kn, kn+1〉 as

argument ism, then:`S (∃z1) (∃z2)

((∃y2)(Bt[z1, z2, 0, y2] ∧

A [k1, . . . , kn, y2]) ∧ Bt[z1, z2, kn+1,m] ∧(∀z3)(z3 < kn+1 ⇒(∃y3) (∃y4)(Bt[z1, z2, z3, y3] ∧Bt[z1, z2, z

′3, y4] ∧

E [k1, . . . , kn, z3, y3, y4]))), and

(7b) `S (∃1y) (∃z1) (∃z2)((∃y2)(Bt[z1, z2, 0, y2]

∧ A [k1, . . . , kn, y2]) ∧ Bt[z1, z2, kn+1, y] ∧(∀z3)(z3 < kn+1 ⇒(∃y3) (∃y4)(Bt[z1, z2, z3, y3] ∧Bt[z1, z2, z

′3, y4] ∧

E [k1, . . . , kn, z3, y3, y4])))

8. Sigh. We don’t have time. The result followsfrom the nature of the recursive deVnition of fin terms of g and h, as well as the representabil-ity of g, h and the β function by A [. . .], E [. . .],and Bt[. . .], respectively. The full proof is givenin the book.

9. Let us content ourselves with an example. 13 is1. Hence we should have:(9a) `S (∃z1) (∃z2)

((∃y2) Bt[z1, z2, 0, y2] ∧

y2 = 1) ∧ Bt[z1, z2, 3, 1] ∧

(∀z3)(z3 < 3⇒

(∃y3) (∃y4)(Bt[z1, z2, z3, y3] ∧Bt[z1, z2, z

′3, y4] ∧ y4 = y3 · 1)

))The sequence 1, 1, 1, 1 is ‘encoded’ using the βfunction with b = 1 and c = 6, and we have:(9b) `S Bt[1, 6, 0, 1](9c) `S Bt[1, 6, 1, 1](9d) `S Bt[1, 6, 2, 1](9e) `S Bt[1, 6, 3, 1](9f) `S 1 = 1(9g) `S 1 = 1 · 1The theorem (9a) follows from these theoremsalong with theorems proven on the orderinghandout, and existential generalization.

10. Similar results will follow for any other argu-ments to xx2

1 . Certain other results are neededfor (7b) but they can be proved in similar fash-ion. e

Result: Every recursive function is representablein System S.

Proof:The initial functions are all representable in SystemS, and whatever can be obtained from functionsrepresentable in S by the rules of substitution, re-cursion and choice of least is also representable inS. It follows by the deVnition of a recursive func-tion that all recursive functions are representablein S. e

Corollary: All primitive recursive number-theoretic functions are representable in S.

Proof:It follows from the deVnitions of primitive recur-sive and recursive functions that all primitive re-cursive functions are recursive. e

83

Page 87: Mathematical Logic I

Corollary: All recursive number-theoretic rela-tions are expressible in System S, including allprimitive recursive ones.

Proof:By deVnition, their characteristic functions are re-cursive, and hence their characteristic functionsare representable in S. We have already establishedthat whenever a relation’s characteristic functionis representable in a given theory with identity, therelation is expressible in that theory. e

This gives us an intuitive sense of the strengthof system S; more or less, it has as theorems theappropriate arithmetical results regarding all recur-sive functions and relations, i.e., those that can inprinciple be calculated by a mechanical procedureby computer, calculator or similar device.

84

Page 88: Mathematical Logic I

UNIT 4

GÖDEL’S RESULTS AND THEIR COROLLARIES

A. The System ,

We normally think of the wUs in a logical systemas having meaning, or at least as having a mean-ing given an interpretation, such as the standardinterpretation for System S. However, it is possibleto think of an axiomatic system as just a system ofrules for manipulating syntactic strings.

Consider the following simple system for ma-nipulating strings of symbols:

Syntax

The basic syntactic units are the signs ‘2’ and ‘#’.A formula is any string of one or more of

these two signs, such as: “#”, “2#”, “#2##2”or “2##2#2”.

A well-formed formula (wU) is any formulathat begins with “2”. So “2#2##2”, “2222”,“2##”, and “2#2” are all wUs. However,“##2#2” and “#22#2” are not wUs.

Semantics

The wUs of system , do not have any intendedinterpretation or meaning. (This is not to say thatthey cannot be interpreted as having a meaning,however.) The system is only intended to be agame of string manipulation for the very easilyamused.

“Axiomatization”

The system has one “axiom”: 2The system has one “inference rule”:add circle: if A is a wU, from A , infer A #.A theorem is any wU that can be ‘derived’ from

the axiom by some Vnite number of applications ofthe inference rule.

Hence, the following are theorems:2#2##2###2####

and so on . . .

Metatheory

Schmödel Numbering

Since the system has no intended meaning, notionssuch as completeness and soundness do not apply.

However, this does not mean that we cannotprove anything about it. We can prove, e.g., thatnot every wU is a theorem, etc.

Metalogical results for System , can bemade simpler by coordinating every wU with itsSchmödel number. Schmödel numbering is mucheasier than Gödel numbering. To get the Schmödelnumber of a string of signs for ,, simply replaceevery 2 with the digit ‘1’ and every # with thedigit ‘0’, and think of the result as a numeral writ-ten in binary notation. Let the Schmödel number

85

Page 89: Mathematical Logic I

of the wU be the number that this binary numeralsigniVes.

Examples: Hence the Schmödel number of“2##2#” is 18 (10010 in binary), and theSchmödel number of “2#####” is 32 (100000 inbinary).

Result: The following results hold of ,.(a) No two wUs of , have the same Schmödel

number.(b) The number 0 is the only natural number

that is not the Schmödel number of a wU of,.

(c) The number 1 is the only number that is aSchmödel number of an axiom of ,.

(d) If n and m are Schmödel numbers of wUsof ,, then the wU corresponding to m fol-lows from the wU corresponding to n by addcircle iUm = 2n.

(e) A natural number n is the Schmödel num-ber of a theorem of , iU it is a power of2.

(The proofs of these results are fairly obvious.)We might even say this: although they had no

intended meaning, it is possible to think of the wUsof, as simply standing for numbers, and it is possi-ble to think of all the metatheoretic properties andrelations of wUs of , as being number-theoreticproperties/relations. The 1-place relation (prop-erty) of being a theorem of ,, corresponds fully tothe number-theoretic property of being a power of2.

System S as Metalanguage

Because all the number-theoretic properties andrelations one would need to do metatheory for ,are recursive, it turns out that System S could beused a metalanguage for System ,. For example,“x is a power of 2” can be expressed in S using thewU:

x = 1 ∨(2|x ∧ (∀y)(y > 2 ∧ ¬2|y ⇒ ¬y|x)

)

This says that x = 1, or 2 and nothing odd above 2divides evenly in to x. In eUect, we can prove that2##### is a theorem of , in S, since:

`S 32 = 1 ∨(2|32 ∧

(∀y)(y > 2 ∧ ¬2|y ⇒ ¬y|32))

Similarly, we can prove in S that 2####2 is nota theorem of ,, since:

`S ¬(

33 = 1 ∨ (2|33 ∧

(∀y)(y > 2 ∧ ¬2|y ⇒ ¬y|33)

))Because all recursive relations are expressible in S,in eUect, all metatheory for , could be done in Srather than English. The numerals of S in eUect actas its “names” for wUs of ,.

B. System S as its OwnMetalanguage

You’ve probably guessed what’s coming. System Scan partially act as its own metalanguage as well,because every wU of System S corresponds to aGödel number, and many (though not all) metathe-oretic properties and relations of wUs of S cor-respond to recursive number-theoretic propertiesand relations of their Gödel numbers.

Since all recursive number-theoretic relationsare expressible in S, System S can in eUect be usedto say, and even be used to provemany things aboutitself. The result is a strange collapse of the meta-language into the object language.

Although much more diXcult to characterizethan their counterparts for system ,, the followingnumber-theoretic relations are primitive recursive,and therefore can be captured in S:

• being the Gödel number of a wU of S;• being the Gödel number of an axiom of S;• being the Gödel number of a wU that followsby MP from wUs with Gödel numbers are nandm;

• being the Gödel number of a wU followingby Gen from a wU with Gödel number n;

86

Page 90: Mathematical Logic I

• being a number that “encodes” a Vnite se-quence of Gödel numbers whose correspond-ing wUs, in order, constitute a proof of thewU with Gödel number n, etc.

• These properties and relations are entirelyarithmetical in nature, just like being a powerof 2 is entirely arithmetical in nature.

Gödelian Results

Gödel found a trick to make it possible, for anysystem that can do enough mathematics to expressrecursive properties and relations, to construct aclosed wU written entirely in the language of thatsystem that in eUect “says” that its own Gödel num-ber is not the Gödel number of a theorem of thatsystem. It then follows that if the system is consis-tent, it cannot be complete.1. Suppose that for System S the wU in question

is abbreviated as G . Note that since G is builtup entirely in the syntax of S, it is “really” amathematical statement, involving only 0, ′, +, ·,=, variables, and the logical constants.Note that:(1a) G is true in the standard interpretation for

S iU not-`S G .(1b) ¬G is true in the standard interpretation

iU `S G .2. Suppose for reductio that both:

(2a) System S is consistent, i.e., there is no wUA such that `S A and `S ¬A .

(2b) System S is complete, i.e., for all wUs A ,if A is true in the standard interpretation,then `S A .

3. It follows that G is not true in the standard in-terpretation. If it were true, by (2b) it would bea theorem, but by (1a) it would also not be atheorem, which is impossible.

4. Since G is closed, and it is not true in the stan-dard interpretation, ¬G is true in the standardinterpretation. It then follows by (1b) that `S G ,but it follows from (2b) that `S ¬G . Hence, S isinconsistent, which contradicts (2a).

5. Because S appears to be consistent, we mustconclude that it is incomplete.Note that this means that there are purely arith-

metical truths written in the syntax of System S

that cannot be derived within System S. These onlyinvolve only 0, ′, +, ·, =. So they are truths ofthe natural numbers, i.e., of number theory. Peanoarithmetic, therefore, fails as a complete axiomati-zation of number theory.

The defect is not localized to System S. Any ax-iomatic system for mathematics of which it is truethat: (i) all recursive relations are expressible in it,(ii) it has an arithmetizable syntax (its wUs can beGödel-numbered), (iii) the relation that holds be-tweenm and n just in casem “encodes” a sequenceof wUs of the system that constitutes a proof of thewU of which n is the Gödel number, is a recursiverelation, either is inconsistent, or fails to captureall truths of number theory, for similar reasons.

Adding more axioms and/or inference rules willnot help; this will simply change the recursiveproperties and relations involved, but there willstill exist unprovable truths.

In fact, Gödel himself Vrst proved his resultsnot for a Vrst-order system like S, but for higher or-der logics similar to Whitehead and Russell’s Prin-cipia Mathematica, in his classic 1931 paper, “Überformal unentscheidbare Sätze der Principia Math-ematica und verwandter Systeme.” (“On FormallyUndecidable Propositions of Principia Mathematicaand Related Systems.”)

Could we just give up one of (i) through (iii)?None are promising. Obviously, no axiomatizationfor mathematics that couldn’t express recursive re-lations could be adequate. It is not known how toconstruct a syntax that is not arithmetizable but isstill learnable, and similarly it is not known howto construct an axiom system that is learnable anduseable in practice but in which the relation men-tioned above would not be recursive. (These arewidely believed to be impossible.)

The upshot of this: It is impossible to captureall arithmetical truths within a learnable axiomaticsystem.

Over the next few weeks, we’ll be looking atthese and similar results, more precisely, and inmore detail. This handout is only meant as a roughsketch, and is somewhat crude and oversimpliVed.

The underlying idea of arithmetizing meta-logic has also made possible bringing to bear thefull array of mathematical knowledge to issues in

87

Page 91: Mathematical Logic I

metalogic, which has led to many other interestingresults besides Gödel’s.

C. Arithmetization of Syntax

We start with the process of Gödel numbering;note that because of diUerences in the way I origi-nally laid out the syntax and the way Mendelsondid, there are some very subtle but unimportantdiUerences in our way of doing Gödel numberingbelow.

Gödel Numbers for Simple Signs

Every formula constructed in the syntax of (Vrst-order) predicate logic is built up from the followingsimple signs: ‘(’, ‘,’, ‘)’, ‘⇒’, ‘¬’, ‘∀’, as well as the in-dividual constants, variables, predicate letters andfunction letters.

The process of Gödel numbering begins bydeVning a function g that assigns to each simplesign a diUerent odd positive integer:1. Firstly, we let . . .

g(‘(’) = 3,g(‘)’) = 5,g(‘, ’) = 7,g(‘¬’) = 9,g(‘⇒’) = 11, andg(‘∀’) = 13.

2. If c is a constant, and n is the number of itssubscript (if c has no subscript, then n = 0),then depending on which letter of the alphabetis used, let k be either 1, 2, 3, 4 or 5 (1 for ‘a’, 2for ‘b’, etc.), and let g(c) = 7 + 8(5n+ k).

3. If x is a variable, and n is the number of itssubscript, then depending on which letter ofthe alphabet is used, let k be either 1, 2, or3 (1 for ‘x’, 2 for ‘y’ and 3 for ‘z’), and letg(x) = 13 + 8(3n+ k).

4. If F is a function letter, and n is the numberof its subscript, and m is the number of its su-perscript, then depending of which letter of thealphabet is used (‘f’ through ‘l’), let k be one of1 through 7, and let g(F ) = 1 + 8(2m3(7n+k)).

5. If P is a predicate letter, and n is the numberof its subscript and m is the number of its su-perscript, then depending of which letter of thealphabet is used (‘A’ through ‘T’), let k be one of1 through 20, and let g(P) = 3+8(2m3(20n+k)).

Gödel Numbers for Strings

Each string of simple symbols built from these sim-ple signs can then be correlated with a Vnite se-quence of the above numbers. This includes bothwell-formed and ill-formed formulas, and functionterms. Hence, the wU “I2(a, a)” (i.e., “a = a”) iscorrelated with:

629859, 3, 15, 7, 15, 5

1. We can then extend the notion of Gödel num-bering to cover strings by coordinating eachformula with the number that encodes the se-quence of numbers of its simple symbols in or-der, so:

g(“I2(a, a)”) = 262985933515771115135

2. The Gödel numbers of strings of symbols do notoverlap with Gödel numbers of simple symbols,since the latter are always odd, and the formerare always even (all have 2 in their prime factor-ization).

3. Note that we must distinguish between theGödel number of the simple symbol ‘(’ and theGödel number of the one-character-long string“(”. The former is 3; the latter is 23, i.e., 8.

Gödel Numbers for Sequences of Formu-las

1. Each Vnite sequence of formulas or other strings(e.g., a proof) can be correlated with a Vnite se-quence of Gödel numbers. E.g., if our sequenceof wUs is:

A0,A1, . . . ,Ak

This can be correlated with the sequence:

g(A0), g(A1), . . . , g(Ak)

We can then extend the notion of Gödel num-bering to cover such sequences by using the

88

Page 92: Mathematical Logic I

numbers that “encode” the sequences of theirGödel numbers. Hence for the above, we have:

g(A0,A1, . . . ,Ak) = 2g(A0)3g(A1) . . . pg(Ak)k

2. Similarly, Gödel numbers of sequences of for-mulas also do not overlap with Gödel numbersof singular formulas. While both are alwayseven, in their prime factorizations, in the latter,2 is always raised to an odd power, and in theformer, 2 is always raised to an even power.

3. Also we must distinguish between theGödel number of a formula itself, and theGödel number of a one-membered wU-sequence. The Gödel number of “I2(a, a)” is262985933515771115135, but the Gödel number ofsequence consisting of this formula alone is 2raised to the power of 262985933515771115135.

Working Backwards

Not only do we know the algorithm for determin-ing the Gödel number of some expression of predi-cate logic, there is also a fairly simple algorithm forworking in the reverse direction: i.e., given a nat-ural number, determining what, if anything, thatnumber Gödelizes.

Odd numbers below 15 are obvious; all otherodd numbers represent either variables, constants,predicate-letters or function-letters depending onwhether the remainder is 1, 3, 5 or 7, respectively,when divided by 8.

For an even number, you must determine itsprime factorization and then work from there.

Examples:• 77 is odd. Its remainder when divided by 8is 5. Hence it is a variable. 77 = 13 + 64 =13 + (8 · 8) = 13 + (8 · ((3 · 2) + 2)). De-coding, we see that this is the number of thevariable ‘y2’.

• The prime factorization of 4,060,435,238,092,800,000,000,000,000,000,000,000 is 2513352375,which corresponds to the wU “A1(b)”.

• A logically unimportant point of trivia: allGödel numbers of wUs are evenly divisibleby 1000 or a higher power of 10. Do you seewhy?

Recursive Syntax Arithmetization

DiUerent Vrst-order languages use diUerent con-stants, function-letters and predicate-letters. E.g.,System S only has the constant ‘a’, the predicateletter ‘I2’ and function letters ‘f 1’, ‘f 2

1 ’, ‘f22 ’ (i.e.,

0, =, ′, + and ·). System PF, however, allows anyconstant, function-letter or predicate letter. Thepure predicate-calculus (PP) is just like PF exceptthat it has no constants or function-letters.

Definition: A theory K is said to have a (prim-itive) recursive vocabulary iU the followingnumber-theoretic properties are (primitive) recursive:(a) IC(x): x is the Gödel number of a constant used

(allowed) in K.(b) FL(x): x is the Gödel number of a function-

letter used (allowed) in K.(c) PL(x): x is the Gödel number of a predicate-

letter used (allowed) in K.

Systems S, PF and PP all have primitive recursivevocabularies. For S, e.g., IC(x) is the property xhas iU x = 15; for PF, IC(x) is the property x has

iUy

∃y<x

(x = 7 + 8y); for PP, IC(x) is the property xhas iU x 6= x (i.e., the empty set).

Indeed, it is diXcult to imagine a theory with-out a recursive vocabulary. For such a theory, therewould be no eUective method to determine whethera given symbol (e.g., ‘b312’) was allowable or not!

Result: The following property is primitive re-cursive:Vbl(x): x is the Gödel number of a variable.

Proof:All standard Vrst-order theories use all variables,

and so Vbl(x) iUy

∃y<x

(x = 21 + 8y), and the latter

is primitive recursive. e

89

Page 93: Mathematical Logic I

Result: For any theory with a (primitive) recur-sive vocabulary, the number-theoretic properties,relations and functions listed below are (primi-tive) recursive.

(For some, I give the arithmetical formulas char-acterizing them in the metalanguage; for the rest,consult the book. For the most part, they can be re-cursively characterized fairly easily with the func-tions used to do encoding, ‘∗’ especially. A rarefew involve course-of-values recursion or the like.)

(a) EVbl(x): x is the Gödel number of a single-symbol string consisting of a variable alone,

i.e.:y

∃y<x

(Vbl(y) and x = 2y).(b) (property) EIC(x): x is the Gödel number of a

single-symbol string consisting of a constant

alone, i.e.:y

∃y<x

(IC(y) and x = 2y).(c) (property) EFL(x): x is the Gödel number of a

single-symbol string consisting of a function-

letter alone, i.e.:y

∃y<x

(FL(y) and x = 2y).(d) (property) EPL(x): x is the Gödel number of a

single-symbol string consisting of a predicate-

letter alone, i.e.:y

∃y<x

(PL(y) and x = 2y).(e) (function) ArgT(x): the superscript on the

function-letter with Gödel number x.(f) (function) ArgP(x): the superscript on the

predicate-letter with Gödel number x.(g) (property) Gd(x): x is the Gödel number of

any string of signs allowed in the theory.(h) (property) Trm(x): x is the Gödel number of a

term of the theory.(i) (property) Atfml(x): x is the Gödel number of

an atomic formula of the theory.(j) (property) Fml(x): x is the Gödel number of a

wU of the theory. Actually, Mendelson’s def-inition is wrong, even for his own notation,

but we can put it as:

Atfml(x) ory

∃y<x

(Fml(y) and x = 29 ∗ y) ory

∃y<x

z

∃z<x

((Fml(y) and Fml(z) and

x = 23 ∗ y ∗ 211 ∗ z ∗ 25) or

(Fml(y) and EVbl(z) and

x = 23 ∗ 23 ∗ 213 ∗ z ∗ 25 ∗ y ∗ 25))

I.e., x is either the Gödel number of an atomicformula or there is a lower number y that isthe Gödel number of a wU A and x is theGödel number of ¬A , or there are two lowernumbers y and z such that either y and z arethe Gödel numbers of A and B and x is theGödel number of (A ⇒ B) or y is the Gödelnumber of a wU A and z is the Gödel numberof a variable v and x is the Gödel number of((∀v) A ).(The error was pointed out to be my a studentin Korea who did not give his or her name.The correction is mine.)

(k) (relation) MP(x, y, z): z is the Gödel number ofa wU that follows by MP from the wUs whoseGödel numbers are x and y, i.e.: (Fml(x) andFml(y) and Fml(z)) and (either (y = 23 ∗ x ∗211 ∗ z ∗ 25) or (x = 23 ∗ y ∗ 211 ∗ z ∗ 25)).(I.e., x, y and z correspond respectively to wUsA , B and C , and A is (B ⇒ C ) or B is(A ⇒ C ).

(l) (relation) Gen(x, y): y is the Gödel number ofa wU that follows by Gen from the wU whoseGödel number is x.

(m) (function) Sub(x, y, z): the Gödel number ofwhat results from substituting the term withGödel number y for all free occurrences of thevariable with Gödel number z in the wU withGödel number x.

(n) (relation) Fr(x, y): x is the Gödel number of awU that contains free occurrences of the vari-able with Gödel number y.

(o) (relation) Ff(x, y, z): x is the Gödel number ofa term that is free for the variable with Gödelnumber y in the wU with Gödel number z.

90

Page 94: Mathematical Logic I

(p) (property) AxA1(x): x is the Gödel numberof an instance of axiom schema (A1), i.e.:y

∃y<x

z

∃z<x

(Fml(y) and Fml(z) and x = 23 ∗ y ∗

211 ∗ 23 ∗ z ∗ 211 ∗ y ∗ 25 ∗ 25).(q) (properties) AxA2(x), AxA3(x), AxA4(x),

AxA5(x), etc., are characterized similarly.(r) (property) LAX(x): x is the Gödel number of

an instance of one of (A1)–(A5) (“a logical ax-iom”).

(s) (function) Neg(x): the Gödel number of thenegation of the wU with Gödel number x, i.e.:Neg(x) = 29 ∗ x.

(t) (function) Cond(x, y): the Gödel number ofthe conditional with the wUs with Gödel num-bers x and y as antecedent and consequent,respectively.

(u) (property) Sent(x): x is the Gödel number of aclosed wU (a sentence).

(v) (function) Clos(x): the Gödel number of theuniversal closure of wU with Gödel number x.

Theory-SpeciVc Functions and Relations

In any theory such as S that uses the speciVc con-stant ‘a’ as its numeral for 0, and constructs theremaining numerals using the function-letter ‘f 1’for ‘successor of’, the following are primitive re-cursive:(a) (function) Num(x): the Gödel number of the

numeral standing for the number x. This isdeVned by recursion as follows:

Num(0) = 215 (i.e., the Gödel number of “a”)

Num(y + 1) = 249 ∗ 23 ∗ Num(y) ∗ 25

(b) (property) Nu(x): x is the Gödel number of a

numeral, i.e.,y

∃y<x

(x = Num(y)).(c) the diagonalization function D(y): this is

an evil function that, if its argument is theGödel number of a wU A [x] containing thevariable ‘x’ free , returns as value the Gödelnumber of the wU obtained by substitutingthe numeral for the Gödel number of A [x]for all free occurrences of ‘x’ in A [x], i.e.:D(y) = Sub(y,Num(y), 21)

(21 is the Gödel number of the variable ‘x’.)

Definition: A theory K is said to have a (primi-tive) recursive axiom set iU, for that theory, thefollowing property is (primitive) recursive:

PrAx(x): x is the Gödel number of a proper (non-logical) axiom of the theory K.

Result: System S has a primitive recursive ax-iom set.

Proof:1. (property): AxS1(x): x is the Gödel number of

(S1), i.e.:x = 233629859537211171329175191123329629859

3133721417433747553115962985961367297177337

793833893.This the Gödel of the the wU:

(I2(x, y)⇒ (I2(x, z)⇒ I2(y, z)))2. (properties) AxS2(x) through AxS8(x) can be

characterized similarly.3. (property): AxS9(x): x is the Gödel number of

an instance of schema (S9), i.e.:

y

∃y<x

z

∃z<x

(EVbl(2z) and Fml(y) and

x = 23 ∗ Sub(y, 215, z) ∗ 211 ∗ 23 ∗ 23 ∗ 23 ∗

213 ∗ 2z ∗ 25 ∗ 23 ∗ y ∗ 211 ∗

Sub(y, 249 ∗ 23 ∗ 2z ∗ 25, z) ∗ 25 ∗ 25 ∗ 211 ∗

23 ∗ 23 ∗ 213 ∗ 2z ∗ 25 ∗ y ∗ 25 ∗ 25 ∗ 25)4. PrAx for System S is then just the disjunction

of AxS1 through AxS1; hence S has a primitiverecursive axiom set. e

Result: For any Vrst-order axiomatic systemwith both a (primitive) recursive vocabulary anda (primitive) recursive axiom set, the number-theoretic properties and relations listed below arealso (primitive) recursive.

91

Page 95: Mathematical Logic I

(a) (property) Ax(x): x is the Gödel number of anaxiom (logical or proper) of the system.(This is simply the disjunction of LAX andPrAx.)

(b) (property) Prf(x): x is the Gödel number of aproof of the theory (i.e., it encodes a sequenceof wUs such that each member of the sequenceis either an axiom, or follows from previousmembers of the sequence by Gen or MP.)

• This property is deVned by course-of-values recursion for properties, i.e., one-place relations. To really prove that it isrecursive, we would Vrst have to showthat the course-of-values function, CPrf#,for the characteristic function of Prf, viz.,CPrf, is recursive, then obtain CPrf fromCPrf#.

• Intuitively, however, we know that it isa recursive relation if we know how todetermine whether or not it applies toa given number when we already knowhow to determine whether it or not ap-plies to any smaller number. Note that:

Prf(x) iU [eithery

∃y<x

(Ax(y) and x = 2y) ory

∃y<x

z

∃z<`~(y)

w

∃w<x

(Prf(y) and Gen((y)z, w) and

x = y ∗ 2w) ory

∃y<x

z

∃z<`~(y)

v

∃v<`~(y)

w

∃w<x

(Prf(y)

and MP((y)z, (y)v, w) and x = y ∗ 2w) ory

∃y<x

z

∃z<x

(Prf(y) and Ax(z) and x = y ∗ 2z)]

• Roughly, this says, x is the Gödel num-ber of a proof iU either (i) it encodes asequence consisting of an axiom by it-self, (ii) it is obtained from the Gödelnumber of a shorter proof by appendingthe Gödel number of a new Gen step tothe encoding, (iii) it is obtained from theGödel number of a shorter proof by ap-pending the Gödel number of a new MPstep to the encoding, or (iv) it is obtained

from the Gödel number of a shorter proofby appending the Gödel number of a newaxiom to the encoding.

(c) (relation) Pf(x, y): x is the Gödel number ofa proof in the theory of the wU with Gödelnumber y, which is characterized this way:

Prf(x) and y = (x)δ(`~(x))

In other words, x is a proof, and y is the ex-ponent in the greatest prime number in theprime factorization of x (i.e., y is the last num-ber encoded in x.)

(Strictly speaking, since PrAx is deVned diUerentlyin diUerent theories depending on what axiomsthey have, the above are also deVned diUerentlyfor diUerent theories.)

We can now prove all functions representablein S are recursive. We noted earlier that, roughlyspeaking, recursive functions represent those acomputer can in principle given enough time de-termine algorithmically. The following argumentsuggests roughly that if a function is representablein a system like S, then, here’s one way for a com-puter to compute its value, i.e., go through theGödel numbers of proofs in S. For each, check if itproves anything using the wU used to represent thefunction for the values in question. If it does, thenthe value of the function is whatever the proof in Sproves it to be. Keep looking until you Vnd such aproof. Since the function has a value, and there isa proof in S that the function has that value, you’lleventually Vnd it this way.

Result: For any theory K (such as S) with a sys-tem of numerals, a recursive vocabulary and ax-iom set, and for which the following principleholds:

(%) For any natural numbers r and s, if`K r = s then r = s.

For any n-place number-theoretic function f , iff is representable in K, then f is recursive.

92

Page 96: Mathematical Logic I

Proof:(1) Assume K is such a theory and assume that f

is representable in K.(2) By the deVnition of representability, there is

some wU A [x1, . . . , xn, y] with x1, . . . , xn andy as its free variables such that, for all naturalnumbers k1, . . . , kn, andm:(2a) If the value of f for 〈k1, . . . , kn〉 as argu-

ment ism, then `K A [k1, . . . , kn,m];(2b) `K (∃1y) A [k1, . . . , kn, y].

(3) Let c be the Gödel number of the wUA [x1, . . . , xn, y] that represents f .

(4) Consider the (n + 2)-place number-theoreticrelation PA such that PA (z1, . . . , zn, u, v) iU vis the Gödel number of a proof in K of the wU:

A [z1, . . . , zn, u]

Or in other words, PA holds for z1, . . . , zn, uand v iU v is the Gödel number of an object-language proof essentially to the eUect that

f(z1, . . . , zn) = u.

(5) We can prove that PA is a recursive relation.(5a) Because K has a recursive vocabulary and

axiom set, the following functions andrelation, discussed above, are recursive:Pf(x, y), Sub(x, y, z), and Num(x).

(5b) Note that c is the Gödel num-ber of A [x1, . . . , xn, y], and 29 isthe Gödel number of ‘y’, and theGödel numbers of ‘x1’, . . . , ‘xn’ are45, 69, 93, . . . (increases by 24) . . . ,(21 + 24n).Then we can see that:Sub(c, Num(u), 29) is the Gödel numberof A [x1, . . . , xn, u], and soSub(Sub(c, Num(u), 29), Num(z1),45) is the Gödel number ofA [z1, x2, . . . , xn, u].Repeating this process, we can see that:Sub(. . . Sub(Sub(c, Num(u), 29), Num(z1),45) . . . , Num(zn), 21 + 24n) is the Gödelnumber of A [z1, . . . , zn, u].

(5c) So we can obtain PA by substitution:

PA (z1, . . . , zn, u, v) =Pf(v, Sub(. . . Sub(Sub(c,Num(u), 29),

Num(z1), 45) . . . ,Num(zn), 21+24n))

Since Pf, Sub and Num are recursive, sois PA .

(6) For any natural numbers k1, . . . , kn, r and j,if PA (k1, . . . , kn, r, j), then the value of f for〈k1, . . . , kn〉 as argument is r.(6a) Assume PA (k1, . . . , kn, r, j).(6b) By the deVnition of PA ,

`K A [k1, . . . , kn, r].(6c) f is a number-theoretic function;

so it must have some value s for〈k1, . . . , kn〉 as argument. By (2a),`K A [k1, . . . , kn, s].

(6d) By (2b), (6b) and (6c), `K r = s.(6e) By principle (%), it must be that r = s.

(7) For any 〈z1, . . . , zn〉, f will have somevalue, u, and by (2a) there will be a prooffor A [z1, . . . , zn, u] in K. Hence for any〈z1, . . . , zn〉 there will be a sequence:

u, v

Where v is the Gödel number of a proof ofA [z1, . . . , zn, u] in K, which is to say:

PA (z1, . . . , zn, u, v).

Let w be the number that encodes the abovesequence, i.e., 2u3v. Hence:

PA (z1, . . . , zn, (w)0, (w)1).

(8) Then f can be obtained from PA using the“choice of least” rule and the function (x)y . Con-sider:

f(z1, . . . , zn) =(µw(PA (z1, . . . , zn, (w)0, (w)1)))0

This function will return the u such that w isthe least number that encodes a sequence u, vsuch that:

PA (z1, . . . , zn, u, v)

93

Page 97: Mathematical Logic I

By (6) above, this u will be the value of f for〈z1, . . . , zn〉. Because PA is a recursive rela-tion, and we obtained f using the “choice ofleast” rule from PA and the primitive recursivefunction (x)y, f is recursive. e

Corollary: In any theory meeting the conditionsabove, all expressible number-theoretic relationsare recursive.

Corollary: A number-theoretic function f isrepresentable in S if and only if it is recursive,and a number-theoretic relation R is expressiblein S if and only if it is recursive.

D. Robinson Arithmetic

After Gödel discovered his famous results forPeano arithmetic, the mathematician RaphaelRobinson decided to see how weak he could makean axiomatic system in which it would still be thecase that all (and only) recursive number-theoreticfunctions are representable. Here is the result,slightly modiVed by Mendelson. (Robinson Arith-metic is usually called system Q; with Mendelson’schange, we call it RR.)

The System RR

The syntax and intended semantics for RR are thesame as for system S. For its deductive theory, itconsists of the logical axioms (A1)–(A5), the infer-ence rules MP and Gen, and the following properaxioms.(RR1) x = x(RR2) x = y⇒ y = x(RR3) x = y⇒ (y = z⇒ x = z)(RR4) x = y⇒ x′ = y′(RR5) x = y⇒ (x + z = y + z ∧ z + x = z + y)

(RR6) x = y⇒ (x · z = y · z ∧ z · x = z · y)(RR7) x′ = y′ ⇒ x = y(RR8) 0 6= x′(RR9) x 6= 0⇒ (∃y)(x = y′)(RR10) x + 0 = x(RR11) x + y′ = (x + y)′(RR12) x · 0 = 0(RR13) x · y′ = (x · y) + x(RR14) (x = (x1 · x2) + y ∧ y < x1) ∧

(x = (x1 · x3) + z ∧ z < x1)⇒ y = zNotice that all of the above are particular axioms,not axiom schemata. RR has exactly 14 properaxioms, while, strictly speaking, S has inVnitelymany. (RR14) was added by Mendelson to makeit easier to prove that Gödel’s β-function is repre-sented by RR, but the really interesting thing aboutthis system is that its proper axioms are Vnite, notwhether they are 13 or 14 in number.

The primary diUerence between RR and S isthat RR does not contain something equivalent to(S9): Peano arithmetic’s principle of mathematicalinduction. However it does add axioms that areequivalent to many of the important theorems onewould use (S9) to get in System S. This includes(Ref/Trans/Sub=), (Sub+), (Sub·), etc., which areneeded to get Leibniz’s law, or (A7) of PF=.

Definition: First-order K is a subtheory of theoryK′ if and only if every theorem of K is a theorem K′;K is a proper subtheory of K′ if K is a subtheory ofK′, but K′ is not a subtheory of K.

Example: RR is a proper subtheory of S.

Just how weak is RR?

RR does very well when it comes to dealing withnumerals, and in general in proving things aboutclosed terms. For example:

For any natural numbers n andm,(a) `RR n+m = n+m;(b) `RR n · m = n · m;(c) if n 6= m, then `RR n 6= m;(d) if n < m, then `RR n < m;(e) `RR x ≤ 0;(f) `RR x ≤ n⇒ (x = 0 ∨ . . . ∨ x = n) etc.

94

Page 98: Mathematical Logic I

Without an induction principle, however, there aremany similar results making use of variables andquantiVers of the system that one cannot prove inRR. For example, the following are not theorems ofsystem RR:

(∀x) (∀y)x+ y = y + x

(∀x) (∀y)x · y = y · x(∀x) (∀y) (∀z)(x+ y) + z = x+ (y + z)(∀x) (∀y) (∀z)x · (y + z) = (x · y) + (x · z)

However, the lack of such principles does not in-terfere with results about which number-theoreticfunctions are representable, and which number-theoretic relations are expressible. Recall that thedeVnitions of expressibility and representabilityprimarily have to do with getting the appropriatetheorems for the right numerals, not for gettinggeneral results stated with quantiVers. In fact . . .

Result: A number-theoretic function is repre-sentable in RR iU it is recursive, just as in systemS. Similarly, a number-theoretic relation is ex-pressible in RR iU it is recursive.

The proofs of these results for RR are almost ex-actly the same as the corresponding proofs for S.It would be matter of tedious backtracking to seethis. This is not surprising, since RR was customtailored to allow these results to go through. In theproof of these results for S, we rarely appealed totheorems that require (S9), and in those few occa-sions in which we did, we have been given a newaxiom of RR that works just as well.

Obviously, RR is incomplete, and too weak forwhat we wanted. However, it will turn out to beuseful later in the unit to have a weaker systemwith only a Vnite number of proper axioms, tomake certain other things easier to prove, espe-cially Church’s theorem.

E. Diagonalization

Preliminaries

Abbreviation: pA q is shorthand for the object-language numeral for the Gödel number of A .

Example: The Gödel number of “I2(a, a)” is262985933515771115135, so pI2(a, a)q is the numeral262985933515771115135, which is actually ‘0’ fol-lowed by 262985933515771115135 successor functionsigns (′).

Insofar as system S (or similar theory) can partlyact as its own metalanguage, such numeralsas pI2(a, a)q act as its “name” for its own wU“I2(a, a)”.

Result (The Fixed-Point Theorem): For anytheory K (such as S or RR) such that (i) Kis a theory with identity, (ii) K has a systemof numerals and a recursive vocabulary, (iii)all recursive number-theoretic functions arerepresentable in K, it holds that for any wU E [x]containing x as its only free variable, there is aclosed wU B such that:

`K B ⇔ E [pBq]

This theorem states that for any wU of the formE [x], there is closed wU B such that, within K, Bis equivalent to the claim that E [x] holds for B’sown Gödel number.

Example: Consider the wU of S, “x = 0”. By thetheorem, there is some wU C such that

`S C ⇔ pC q = 0.

This can be thought of this way: C says of it-self that its own Gödel number is 0. (In this case,`S ¬C .)

The proof of the theorem relies on the (evil) diago-nalization function D, introduced on p. 91. Recallthat when its argument is the Gödel number of a

95

Page 99: Mathematical Logic I

wU of the form A [x], the value of D is the Gödelnumber of the formula obtained by substituting thenumeral for the Gödel number of A [x] for all freeoccurrences of ‘x’ in A [x].

Proof:(1) Assume that K is a theory meeting conditions

(i)–(iii) above, and then consider any wU E [x]having ‘x’ as its only free variable.

(2) Because K has a recursive vocabulary and asystem of numerals, the function D for K is arecursive number-theoretic function.

(3) Because all recursive functions are repre-sentable in K, and D is a recursive function,there is some wU D [x, y], such that, for all nat-ural numbers k andm:(3a) If D(k) = m, then `K D [k,m].(3b) `K (∃1y) D [k, y].

(4) Now consider the following wU:(4a) (∀y)(D [x, y]⇒ E [y])This wU more or less says that E holds of theGödel number obtained from x by the diago-nalization function.

(5) Let p be the Gödel number of the wU (4a).(5a) p is p(∀y)(D [x, y]⇒ E [y])q

Consider now the following closed wU,which hereafter we’ll call B:

(5b) B is (∀y)(D [p, y]⇒ E [y])This wU, B, says that E holds of theGödel number obtained from p from thediagonalization function. Let q be theGödel number of B. Hence:

(5c) q is pBqNotice that (4a) is itself of the form A [x].Hence, the value of the diagonalizationfunction for its Gödel number p, will bethe Gödel number of B, i.e., q:

(5d) D(p) = qNotice that because B says that E holdsof the Gödel number obtained from pfrom the diagonalization function, andq is the Gödel number of B, B in eUectssays that E holds of its own Gödel num-ber.

(6) By (5d) and (3a) and (3b), we can conclude:(6a) `K D [p, q](6b) `K (∃1y) D [p, y]

(7) We can now prove the biconditional:`K B ⇔ E [pBq]

1. B `K B (Premise)2. B `K (∀y)(D [p, y]⇒ E [y]) 1 (5b)3. B `K D [p, q]⇒ E [q] 2 UI4. B `K E [q] 3, (6a), MP5. B `K E [pBq] 4 (5c)6. `K B ⇒ E [pBq] 5 DT7. E [pBq] `K E [pBq] (Premise)8. D [p, y] `K D [p, y] (Premise)9. D [p, y] `K y = q 8, (6a), (6b) PF=10. D [p, y] `K y = pBq 9 (5c)11. E [pBq],D [p, y] `K E [y] 7, 10 LL12. E [pBq] `K D [p, y]⇒ E [y] 11 DT13. E [pBq] `K (∀y)(D [p, y]⇒ E [y]) 12 Gen14. E [pBq] `K B 13 (5b)15. `K E [pBq]⇒ B 14 DT16. `K B ⇔ E [pBq] 6, 15 SLThis establishes the theorem. e

This establishes the theorem. The Fixed-Point The-orem makes a certain kind of ‘self-reference’ possi-ble, which leads to all sorts of fun results.

F. ř-Consistency, TrueTheories and Completeness

Definition: A theory K with a system of numer-als is said to be ř-consistent iU for every wU A [y]containing y as its only free variable, if it is true forevery natural number n that

`K ¬A [n],

then it is not the case that `K (∃y) A [y].

Basically, a system is ř-consistent if whenever youcan prove that ¬A [y] holds for each particularnumbers, you cannot then also prove the quanti-Ved statement there is some number y such thatA [y].Definition: A theory K with the same syntax as Sis said to be a true arithmetical theory iU all itsproper axioms are true in the standard interpretation.

Remember that the standard interpretation is theinterpretation M such that (i) the domain of quan-tiVcation D of M is the set of natural numbers, (ii)

96

Page 100: Mathematical Logic I

(‘0’)M is zero, (iii) (‘=’)M is the identity relation onthe set of natural numbers, and (iv) (‘+’)M is the ad-dition function, (‘·’)M is the multiplication function,and (‘′’)M is the successor function.

Result: For any theory K, if K is ř-consistent,then it is consistent.

Proof:Suppose K is ř-consistent. Then it cannot be in-consistent, because every wU is provable in aninconsistent system. Assume for reductio that K isinconsistent. For any wU A [y] containing y as itsonly free variable, it will be true for every naturalnumber n that `K ¬A [n], but it will also hold that`K (∃y) A [y]. So K is not ř-consistent. e

Result: All theorems of a true arithmetical the-ory are true in the standard interpretation.

Proof:By supposition, all the proper axioms of K are truein the standard interpretation, and so are the logi-cal axioms, and MP and Gen preserve truth in aninterpretation. e

Result: If a theory K is a true arithmetical the-ory, then it must be ř-consistent.

Proof:In such a theory, suppose that for every naturalnumber n, `K ¬A [n]. Then, for every n, ¬A [n]is true in the standard interpretation M. Hence,(∀y)¬A [y] must be true in the standard interpreta-tion, because for every natural number n, (n)M = n,and the natural numbers exhaust the domain ofquantiVcation of M. Hence (∃y) A [y] cannot be atheorem of K, because if it were, it would be true inthe standard interpretation, and it cannot be, sinceit is the negation of (∀y)¬A [y].

Corollary: Systems S and RR are ř-consistent.

Proof:Both are true arithmetical theories.

Completeness and Decidability

Recall that there are two widespread deVnitionsof the word “complete” in mathematical logic, asdiscussed on p. 23.

1. On one deVnition, (the deVnition I prefer), a sys-tem is said to be “complete” iU every wU that shouldbe a theorem in virtue of the intended semanticsfor the system is a theorem. (This deVnition wasVrst used by Gödel.)

Examples:

(a) System PF was designed to have, as theorems,all wUs that are logically valid. Hence to proveit complete, we needed to prove that if � Athen `PF A .

(b) System PF= was designed to have, as theo-rems, all wUs that are identity-valid. Hence toprove it complete, we needed to prove that if�= A then `PF= A .

(c) System S was designed to have, as theorems,all wUs that are true in the standard interpre-tation. Hence to be complete, it would haveto be the case that if A is true in the standardinterpretation, then `S A .

The Vrst deVnition makes completeness about therelationship between the semantics of the systemand its system of deduction.

2. On the other deVnition, a system K is said to becomplete (or, as I like to say, “maximal”) iU for ev-ery closed wU A , either `K A or `K ¬A . (ThisdeVnition was Vrst used by Polish American math-ematician Emil Post.)

97

Page 101: Mathematical Logic I

This definition has nothing directly to do with semantics, only with the system of deduction.

Systems PF and PF= are "complete" in Gödel's sense, but not in Post's. Indeed, it would be a bad thing if PF were complete in Post's sense, because a wff should be a theorem of PF iff it is a logical truth, and so, for any contingent wff A, neither it nor its negation should be a theorem of PF.

However, given S's limited syntax and single intended interpretation, the two definitions of completeness coincide. Why?

• Within a given interpretation, every closed wff is either true, or its negation is true. Because S aims to capture everything that is true in the standard interpretation, it could be complete in Gödel's sense only if it is complete in Post's sense, because to capture all truths, for every closed wff A, it must capture either A or ¬A, depending on which is true in the standard interpretation.

• Unfortunately S is complete in neither sense, because Gödel showed that any theory similar to S has "undecidable sentences".

Definition: For a given closed wff A in a system K, A is called an undecidable sentence iff neither ⊢K A, nor ⊢K ¬A.

Obviously, any system with undecidable sentences is incomplete in Post's sense.

Note that the word "undecidable" is also used with a different meaning in mathematical logic, although applied to systems rather than individual sentences. We'll actually discuss this meaning on p. 104. But first, (drumroll please) . . .

G. Gödel's First Incompleteness Theorem

Result: For any theory with identity K (e.g., S or RR) such that (i) K has a recursive axiom set and vocabulary, (ii) every recursive function is representable in K and every recursive relation is expressible in K, and (iii) K is ω-consistent, there is at least one undecidable sentence in K, G (called "the Gödel sentence for K").
(Gödel's First Incompleteness Theorem)

Proof:
1. Assume K is a theory with the characteristics above.
2. Because K is ω-consistent, it is consistent.
3. Because K has a recursive axiom set, the number-theoretic relation Pf(x, y), which holds between x and y iff x is the Gödel number of a proof in K of the wff with Gödel number y, is a recursive relation.
4. Because every recursive relation is expressible in K, Pf is expressible in K. Hence there is some wff Pf[x1, x2] such that, for all natural numbers k1 and k2:
(4a) If Pf holds for 〈k1, k2〉, then ⊢K Pf[k1, k2].
(4b) If Pf does not hold for 〈k1, k2〉, then ⊢K ¬Pf[k1, k2].
5. Consider the following wff:

(∀y)¬Pf[y, x]

Because every recursive function is representable in K, the Fixed-Point Theorem is applicable to the above wff, and hence there is a closed wff, which we'll call G, such that:
(5a) ⊢K G ⇔ (∀y)¬Pf[y, ⌜G⌝]
In effect, G is equivalent to the assertion that no natural number is the Gödel number of a proof of G in K, i.e., G asserts that it is not provable.
6. We must now prove that G is an undecidable sentence of K. Let q be the Gödel number of G.

7. We will first show that it is not the case that ⊢K G, by reductio.
(7a) Assume that ⊢K G.
(7b) Then there must be some proof of G in K. This proof must have a Gödel number, r.
(7c) Hence Pf(r, q).
(7d) By (4a), ⊢K Pf[r, q].
(7e) The Gödel number of G is q, so q is ⌜G⌝.
(7f) Hence, ⊢K Pf[r, ⌜G⌝].
(7g) But, by (7a) and (5a), ⊢K (∀y)¬Pf[y, ⌜G⌝].
(7h) Hence ⊢K ¬Pf[r, ⌜G⌝].
(7i) By (7h) and (7f), K is inconsistent, contradicting (2) above.
8. Hence it is not the case that ⊢K G. This means that no natural number is the Gödel number of a proof of G in K. Hence, for all natural numbers n, the relation Pf does not hold for 〈n, q〉.
9. From (8) and (4b), we can conclude that:
(9a) For all natural numbers n, ⊢K ¬Pf[n, q].
(9b) The Gödel number of G is q, so q is ⌜G⌝.
(9c) Hence, (9a) means that for all natural numbers n, ⊢K ¬Pf[n, ⌜G⌝].
(9d) Because K is ω-consistent, from (9c) we can infer that ⊬K (∃y)Pf[y, ⌜G⌝].
10. We now show that it is not the case that ⊢K ¬G, again by reductio.
(10a) Assume that ⊢K ¬G.
(10b) By (5a) and (10a), ⊢K ¬(∀y)¬Pf[y, ⌜G⌝].
(10c) This abbreviates to ⊢K (∃y)Pf[y, ⌜G⌝].
(10d) But (10c) contradicts (9d).
11. Hence neither ⊢K G nor ⊢K ¬G. Since G is closed, G is an undecidable sentence of K. QED. ∎

Corollary: All theories to which Gödel's first theorem applies are incomplete in Post's sense.

Proof: All have at least one undecidable sentence, and hence fail to satisfy this definition of completeness. ∎

Corollary: Systems S and RR have undecidable sentences, and hence are incomplete in Post's sense.

Proof: They have the features necessary for the applicability of Gödel's theorem. ∎

Corollary: Any theory K to which the above theorem applies, with the same syntax as S and RR and the same intended semantics as S and RR (including S and RR themselves), is also incomplete in Gödel's sense.

Proof: For every undecidable sentence, either it or its negation is true in the standard interpretation, and hence there are sentences that are true in the standard interpretation but are not theorems of K. ∎

In particular, the Gödel sentence G of K is true in the standard interpretation but is not a theorem of K. As we have just seen, for K, neither ⊢K G nor ⊢K ¬G. Since G is closed, either G or ¬G must be true in the standard interpretation. However, since G asserts its own unprovability, and, in fact, G is not provable in K, we can conclude that G is true.

Notice that the Gödel sentence G of some applicable theory K is a wff written entirely in the syntax of S. Interpreted with the standard interpretation, it is a sentence about natural numbers, built entirely out of the symbols '0', '′', '+', '·', '=', bound variables and logical signs. Moreover, it is true. Hence, it seems that not all truths of arithmetic can be captured in any recursively axiomatizable, ω-consistent theory.

We can consistently add the Gödel sentence G of some theory K to that theory as a new axiom, to obtain the theory KG. Since KG has a different axiom set from K, the number-theoretic property PrAx will be different for KG from what it was for K, but it will still be recursive, and hence the relation Pf will also be different but still recursive. Hence, there will be a different wff Pf∗[x, y] that expresses the new Pf-relation, and a different Gödel sentence G∗, different from G, which is an undecidable sentence of KG. We can continue adding new axioms one by one all we like; we'll never achieve completeness.
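The self-reference that the Fixed-Point Theorem supplies at step (5a) can be mimicked, very loosely, at the level of strings. The following toy Python sketch is only an analogy (real diagonalization operates on Gödel numbers and numerals, not on English text), but it shows how substituting a description of a template into the template itself yields a sentence that talks about itself:

    def diagonalize(template):
        """Substitute a quoted copy of the template for its free place 'x'."""
        return template.replace('x', repr(template))

    template = 'there is no proof of the sentence obtained by diagonalizing x'
    G = diagonalize(template)
    print(G)
    # The printed sentence is itself the diagonalization of the template,
    # so it says, in effect, that there is no proof of itself.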


Gödel's first incompleteness theorem involves ω-consistency, not simple consistency. As J. B. Rosser showed five years later, a similar result can be proved involving consistency proper.

Result: For any theory with identity K (e.g., S or RR) such that (i) K has a recursive axiom set and vocabulary, (ii) every recursive function is representable in K and every recursive relation is expressible in K, (iii) for every natural number n, it holds that:
($) ⊢K x ≤ n ⇒ x = 0 ∨ x = 1 ∨ . . . ∨ x = n
(@) ⊢K x ≤ n ∨ n ≤ x
and (iv) K is consistent, there is at least one undecidable sentence in K, R (called "the Rosser sentence for K").
(The Gödel–Rosser Theorem)

Proof:
1. Assume that K is such a theory. So all recursive functions and relations are representable/expressible in K. Hence the relation Pf is expressible in K and the function Neg (whose value, for any Gödel number of a wff, is the Gödel number of the negation of that wff) is representable in K. Let the wff that expresses Pf be Pf[x1, x2], and let the wff that represents Neg be Neg[x, y]. Hence, for all natural numbers k1 and k2:
(1a) If Pf holds for 〈k1, k2〉, then ⊢K Pf[k1, k2].
(1b) If Pf does not hold for 〈k1, k2〉, then ⊢K ¬Pf[k1, k2].
(1c) If Neg(k1) = k2, then ⊢K Neg[k1, k2].
(1d) ⊢K (∃1y)Neg[k1, y].
2. Consider the following open wff, hereafter abbreviated as E[x]:

(∀z)(Pf[z, x] ⇒ (∀y)(Neg[x, y] ⇒ (∃z1)(z1 ≤ z ∧ Pf[z1, y])))

This wff says that, for all z, z is the Gödel number of a proof in K of the wff with Gödel number x only if, for any number y that is the Gödel number of the negation of the wff whose Gödel number is x, there is a number z1 less than or equal to z which is the Gödel number of a proof in K of the wff with Gödel number y. Notice that if E[x] holds for a given x, then either there is no Gödel number of a proof of the wff with Gödel number x (and hence that wff is not a theorem of K), or there is also a proof of the negation of the wff with Gödel number x, and K is inconsistent.
3. The Fixed-Point Theorem applies to E[x]. Hence there is a closed wff R such that:
(3a) ⊢K R ⇔ E[⌜R⌝]
R in effect asserts of its own Gödel number that either it is not the Gödel number of a theorem, or its negation is also a theorem (and K is inconsistent).
4. Let q be the Gödel number of R, and p be the Gödel number of ¬R.
5. It then cannot be the case that ⊢K R.
(5a) Assume for reductio that ⊢K R.
(5b) Then there is some n such that Pf(n, q), and by (1a) it follows that ⊢K Pf[n, q].
(5c) By (5a) and (3a), ⊢K E[⌜R⌝], i.e., ⊢K E[q].
(5d) Expanding (5c), by UI and (5b), we get:
⊢K (∀y)(Neg[q, y] ⇒ (∃z1)(z1 ≤ n ∧ Pf[z1, y])).
(5e) Note that Neg(q) = p, and so by (1c), ⊢K Neg[q, p].
(5f) So by (5d) and (5e), we get: ⊢K (∃z1)(z1 ≤ n ∧ Pf[z1, p]).
(5g) By (5a) and K's consistency, ⊬K ¬R. So for all natural numbers s, Pf does not hold for 〈s, p〉. A fortiori, for all natural numbers s less than or equal to n, we have, by (1b), ⊢K ¬Pf[s, p].
(5h) By PF= rules, from (5g) it follows that, for all natural numbers s less than or equal to n, we have ⊢K x = s ⇒ ¬Pf[x, p].
(5i) By ($), (5h) and a big proof by cases: ⊢K x ≤ n ⇒ ¬Pf[x, p].
(5j) By SL, Gen and variable juggling, (5i) becomes: ⊢K (∀z1)¬(z1 ≤ n ∧ Pf[z1, p]).
(5k) But from (5j) and (5f), we get that K is inconsistent, which is impossible.

6. By a similar process of reasoning, we can show that it is not the case that ⊢K ¬R.
(6a) Assume ⊢K ¬R for reductio.
(6b) Then there is some n such that Pf(n, p), and by (1a) it follows that ⊢K Pf[n, p].
(6c) By (6b) and PF=, we have ⊢K n ≤ x ⇒ (∃z1)(z1 ≤ x ∧ Pf[z1, p]).
(6d) By (5), for all natural numbers s, Pf does not hold for 〈s, q〉, and so by (1b), ⊢K ¬Pf[s, q].
(6e) By a proof by cases similar to that in (5i), we get: ⊢K x ≤ n ⇒ ¬Pf[x, q].
(6f) By (@), (6c) and (6e), we can derive that: ⊢K ¬Pf[x, q] ∨ (∃z1)(z1 ≤ x ∧ Pf[z1, p]).
(6g) By the same reasoning as (5e), ⊢K Neg[q, p].
(6h) By (6g) and (1d), we get: ⊢K (∀y)(Neg[q, y] ⇒ y = p).
(6i) From (6f), SL, and variable juggling: ⊢K Pf[z, q] ⇒ (∃z1)(z1 ≤ z ∧ Pf[z1, p]).
(6j) Using (6h) and (6i), we get the following proof:

1. Pf[z, q] ⊢K Pf[z, q] (Premise)
2. Pf[z, q] ⊢K (∃z1)(z1 ≤ z ∧ Pf[z1, p]) 1, (6i), MP
3. Neg[q, y] ⊢K Neg[q, y] (Premise)
4. Neg[q, y] ⊢K y = p 3, (6h), UI, MP
5. Pf[z, q], Neg[q, y] ⊢K (∃z1)(z1 ≤ z ∧ Pf[z1, y]) 2, 4 LL
6. ⊢K (∀z)(Pf[z, q] ⇒ (∀y)(Neg[q, y] ⇒ (∃z1)(z1 ≤ z ∧ Pf[z1, y]))) 5, DT, Gen, DT, Gen

(6k) Note that the conclusion of (6j) is ⊢K E[q].
(6l) But q is the Gödel number of R, so ⊢K E[⌜R⌝].
(6m) By (6l) and (3a), ⊢K R.
(6n) By (6m) and (6a), K is inconsistent, which is impossible.
7. By (5) and (6), neither ⊢K ¬R nor ⊢K R. So R is an undecidable sentence of K. QED. ∎

The results of Gödel and Rosser we have just seen can more or less be summarized this way: no system for number theory with a recursive axiom set can be complete.

Definition: A theory K is said to be recursively axiomatizable iff there is a theory K* with exactly the same theorems as K such that K* has a recursive axiom set.

Notice that a theory does not itself have to have a recursive axiom set to be recursively axiomatizable.

However, it is easy to prove that if a given theory is incomplete, then any theory with exactly the same theorems will also be incomplete. Hence, no recursively axiomatizable system for number theory can be complete.

What about a system that is not recursively axiomatizable? The results of Gödel and Rosser would not apply to it, and so it might very well be able to capture all arithmetical truths. But what would such a system be like? If we accept Church's thesis (see below), such a theory must be very strange indeed.

H. Church’s Thesis

Definition: A number-theoretic function is said to be effectively computable iff there exists a purely mechanical procedure or algorithm (one that does not require original insight or ingenuity) whereby one could determine the value of the function for any given argument or arguments.

Definition: A number-theoretic relation is said to be effectively decidable iff there exists a purely mechanical procedure or algorithm whereby one could determine whether or not it applies to any given number or numbers.

Definition: Church's thesis is the supposition that a number-theoretic function is effectively computable iff it is recursive. (Or equivalently, that a number-theoretic relation is effectively decidable iff it is recursive.)
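For illustration, here is a minimal Python sketch of the less controversial direction of the thesis: a function introduced by the recursive schemata is effectively computable, because its defining equations are themselves a mechanical procedure for evaluating it. The equations below are essentially the recursion equations for addition and multiplication in terms of successor; the Python rendering is only illustrative:

    def succ(x):
        return x + 1

    def add(x, y):
        # add(x, 0) = x ; add(x, y + 1) = succ(add(x, y))
        return x if y == 0 else succ(add(x, y - 1))

    def mult(x, y):
        # mult(x, 0) = 0 ; mult(x, y + 1) = add(mult(x, y), x)
        return 0 if y == 0 else add(mult(x, y - 1), x)

    print(add(2, 3), mult(2, 3))   # 5 6

Evaluating either function for given arguments requires no ingenuity at all: one simply unwinds the defining equations finitely many times.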

Church's thesis has never been proven. The reason is that the notion of "purely mechanical procedure, not requiring ingenuity" cannot be made more precise without begging the question. (I.e., if we simply define it in recursive mathematical terms, Church's thesis becomes uninteresting.)

However, more than a half-century of research in computability and computer science has failed to produce a clear counterexample to Church's Thesis.

There is clearly a mechanical procedure for working forwards and backwards between wffs and their Gödel numbers. So if we accept Church's thesis, a system that does not have a recursive axiom set would be one in which there is no effective procedure for determining whether or not a given wff is an axiom.
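A sketch of that two-way procedure, using prime-exponent coding: assign each primitive symbol a code number and encode a string s1 . . . sk as 2 raised to the code of s1, times 3 raised to the code of s2, and so on; decoding is just prime factorization. The particular symbol codes in the table below are hypothetical stand-ins (the official assignment is the one given in the arithmetization chapter), but the point survives any reasonable assignment: both directions are mechanical.

    def nth_prime(n):
        """The nth prime, 1-indexed: 2, 3, 5, ... (slow trial division, fine for a sketch)."""
        found, cand = 0, 1
        while found < n:
            cand += 1
            if all(cand % d for d in range(2, int(cand ** 0.5) + 1)):
                found += 1
        return cand

    CODE = {'(': 3, ')': 5, ',': 7, '~': 9, '=>': 11, '0': 15, "'": 17, '=': 19}  # hypothetical table
    SYMBOL = {v: k for k, v in CODE.items()}

    def godel_number(symbols):
        g = 1
        for i, s in enumerate(symbols, start=1):
            g *= nth_prime(i) ** CODE[s]
        return g

    def decode(g):
        symbols, i = [], 1
        while g > 1:
            e = 0
            while g % nth_prime(i) == 0:
                g //= nth_prime(i)
                e += 1
            symbols.append(SYMBOL[e])
            i += 1
        return symbols

    wff = ['(', '0', '=', '0', ')']
    assert decode(godel_number(wff)) == wff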

While in the abstract one can speak of "systems" or "theories" that are not recursively axiomatizable, it is impossible actually to describe one fully. In such a system, there would be no effective way to determine whether or not a given wff was an axiom, and hence no effective way to determine whether or not a given alleged "proof" was allowed. It is difficult to believe that such a system would be fully learnable or usable in practice.

For example, in order to "cheat", we could "create" a system in which every wff that is true in the standard interpretation is an axiom. Obviously, such a theory would be complete. However, there would be no way of determining whether a given wff counts as an axiom or not. (E.g., is Goldbach's conjecture true in the standard interpretation?)

The notion of a system in which there is no effective procedure for determining whether or not a given wff is a theorem is somewhat less troubling. In fact, we shall later prove that S, RR and even simple PF are like this. In these systems, it takes ingenuity to determine whether a given wff is a theorem, because it takes ingenuity to find the appropriate proof. However, there is at least an effective procedure, once given an alleged proof, of determining whether or not it is an acceptable proof in that system. (In other words, while the property of being the Gödel number of a theorem is not recursive for these systems, the relation Pf(x, y) is recursive.)
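Here is a minimal sketch of that contrast: checking an alleged proof requires no ingenuity, so long as some decidable axiom test is supplied. Representing wffs as strings and using modus ponens as the only rule are simplifying assumptions, not the official apparatus of any of these systems:

    def follows_by_mp(b, earlier):
        """True if b follows from two earlier lines by modus ponens: A and (A => B)."""
        return any(cond == f"({a} => {b})" for a in earlier for cond in earlier)

    def check_proof(lines, is_axiom):
        """True iff every line is an axiom or follows from earlier lines by MP."""
        for i, line in enumerate(lines):
            if not (is_axiom(line) or follows_by_mp(line, lines[:i])):
                return False
        return True

    # Toy usage with two stipulated "axioms":
    axioms = {"p", "(p => q)"}
    print(check_proof(["p", "(p => q)", "q"], axioms.__contains__))  # True
    print(check_proof(["q"], axioms.__contains__))                   # False

So long as is_axiom is itself mechanical (a recursive axiom set), the whole check is mechanical; what is not mechanical in general is finding the list of lines in the first place.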

I. Löb's Theorem / Gödel's Second Theorem

The Hilbert-Bernays Derivability Conditions

Gödel's first incompleteness theorem, applied to S, involves using the Fixed-Point Theorem to yield:

⊢S G ⇔ (∀y)¬Pf[y, ⌜G⌝]

Recall that the wff Pf[x1, x2] expresses the relation Pf that holds between x1 and x2 just in case x1 is the Gödel number of a proof of the wff with Gödel number x2.

Abbreviation: We shall now introduce the following new abbreviation:

Bew[x] is shorthand for (∃y)Pf[y, x]

While this wff does not express in S the property of being the Gödel number of a theorem of S, this is its meaning in the standard interpretation. (This abbreviation derives from the German word "beweisbar", meaning provable.)

Definition: The Hilbert-Bernays derivability conditions are the following three results, for any wffs A and B:
(HB1) If ⊢S A, then ⊢S Bew[⌜A⌝].
(HB2) ⊢S Bew[⌜A ⇒ B⌝] ⇒ (Bew[⌜A⌝] ⇒ Bew[⌜B⌝])
(HB3) ⊢S Bew[⌜A⌝] ⇒ Bew[⌜Bew[⌜A⌝]⌝].

Similar results hold not only for S, but for any recursively axiomatizable extension of S.

For homework, you will prove (HB1). It follows fairly easily from the fact that Pf[x1, x2] expresses Pf in S. (HB2) and (HB3) are more difficult to prove, but follow in a similar way.

Curry's Paradox (also known as Löb's Paradox)

Consider the following "proof" of the existence of Santa Claus. Consider the sentence:
(C) "If this sentence is true, then Santa Claus exists."
I.e., let "C" be defined as "C ⇒ E", where "E" means that Santa Claus exists. Then:

1. C ⊢L C (Premise)
2. C ⊢L? C ⇒ E 1, def. C
3. C ⊢L? E 1, 2 MP
4. ⊢L? C ⇒ E 3 DT
5. ⊢L? C 4, def. C
6. ⊢L? E 4, 5 MP


Is it a theorem of propositional logic that Santa Claus exists? Well, no, because it is not legitimate in system L to define something in terms of itself. But in System S we do have the following odd result:

Result (Löb's Theorem): For any closed wff A, if ⊢S Bew[⌜A⌝] ⇒ A, then ⊢S A.

Proof:
(1) Assume that ⊢S Bew[⌜A⌝] ⇒ A.
(2) If the wff A is closed, then the wff Bew[x] ⇒ A has exactly one free variable. Hence, by the Fixed-Point Theorem, there is some wff L such that:
(2a) ⊢S L ⇔ (Bew[⌜L⌝] ⇒ A)
Notice that L asserts of itself that if it is provable, then A is true. Assume for a conditional proof that L is provable. Because of what L says, it follows that if it is provable, then A holds. We've assumed that it is provable. Hence, A holds. Discharging the assumption: if L is provable, then A holds. But this is what L says. Our conditional proof is a proof of L. Hence, L is provable, and so is A.

(3) Making this more formal:

1. ⊢S L ⇒ (Bew[⌜L⌝] ⇒ A) (2a), SL
2. ⊢S Bew[⌜L ⇒ (Bew[⌜L⌝] ⇒ A)⌝] 1, (HB1)
3. ⊢S Bew[⌜L⌝] ⇒ Bew[⌜Bew[⌜L⌝] ⇒ A⌝] 2, (HB2), MP
4. ⊢S Bew[⌜Bew[⌜L⌝] ⇒ A⌝] ⇒ (Bew[⌜Bew[⌜L⌝]⌝] ⇒ Bew[⌜A⌝]) (HB2)
5. ⊢S Bew[⌜L⌝] ⇒ (Bew[⌜Bew[⌜L⌝]⌝] ⇒ Bew[⌜A⌝]) 3, 4 SL
6. ⊢S Bew[⌜L⌝] ⇒ Bew[⌜Bew[⌜L⌝]⌝] (HB3)
7. ⊢S (Bew[⌜L⌝] ⇒ (Bew[⌜Bew[⌜L⌝]⌝] ⇒ Bew[⌜A⌝])) ⇒ ((Bew[⌜L⌝] ⇒ Bew[⌜Bew[⌜L⌝]⌝]) ⇒ (Bew[⌜L⌝] ⇒ Bew[⌜A⌝])) (A2)
8. ⊢S Bew[⌜L⌝] ⇒ Bew[⌜A⌝] 5, 6, 7 MP×2
9. ⊢S Bew[⌜A⌝] ⇒ A Assumed at (1)
10. ⊢S Bew[⌜L⌝] ⇒ A 8, 9 SL
11. ⊢S L (2a), 10 SL
12. ⊢S Bew[⌜L⌝] 11, (HB1)
13. ⊢S A 10, 12 MP

(4) We have shown that ⊢S A by assuming that ⊢S Bew[⌜A⌝] ⇒ A. This establishes Löb's theorem. ∎

Corollary: Consider the Henkin sentence, i.e., the wff H, very much like Gödel's G, except that instead of asserting its own unprovability, H asserts its own provability:

⊢S H ⇔ Bew[⌜H⌝]

(The above is obtained from the Fixed-Point Theorem as you might expect.) It holds that ⊢S H.

Proof: Immediate by the right-to-left half of the biconditional, and Löb's theorem. ∎

Since S is a true arithmetical theory, H is true in the standard interpretation, despite the "intuition" that H could just as easily have been disprovable.

Löb's theorem also leads to the result that the consistency of S cannot be proven in S itself, even though there is a wff of S whose meaning in the standard interpretation is that S is consistent.

The result that Peano Arithmetic, or any extension thereof, cannot be used to prove its own consistency, was one of the original "incompleteness results" first proved by Gödel in 1931. Although Gödel proved this result in a different way, Löb's theorem provides us with a fairly easy proof of it.

Abbreviation: Let ConS be an abbreviation for the following closed wff of system S:

(∀x)(∀y)¬(Neg[x, y] ∧ Bew[x] ∧ Bew[y])

Bearing in mind that Neg[x, y] represents the function Neg(x), whose value, for a given Gödel number of a wff as argument, is the Gödel number of the negation of that wff, the above wff in effect says that it is not the case for any wff that both it and its negation are provable. Assuming that S is consistent, ConS is true in the standard interpretation. However, it is not a theorem of S.

Result: If S is consistent, then ⊬S ConS.
(Gödel's Second Incompleteness Theorem)

Proof:
1. Assume S is consistent, and assume for reductio that ⊢S ConS.
2. Since ⊢S 0 ≠ 1, by (HB1) we have ⊢S Bew[⌜0 ≠ 1⌝].
3. By UI on ConS, we get ⊢S ¬(Neg[⌜0 = 1⌝, ⌜0 ≠ 1⌝] ∧ Bew[⌜0 = 1⌝] ∧ Bew[⌜0 ≠ 1⌝]).
4. Because Neg[x, y] represents the Neg function, we have in S that ⊢S Neg[⌜0 = 1⌝, ⌜0 ≠ 1⌝].
5. By (2), (3), (4) and SL, we get that ⊢S ¬Bew[⌜0 = 1⌝].
6. By (A1), ⊢S ¬Bew[⌜0 = 1⌝] ⇒ (0 ≠ 1 ⇒ ¬Bew[⌜0 = 1⌝]).
7. By (5) and (6), ⊢S 0 ≠ 1 ⇒ ¬Bew[⌜0 = 1⌝].
8. By transposition on (7), we get: ⊢S Bew[⌜0 = 1⌝] ⇒ 0 = 1.
9. By (8) and Löb's theorem, it follows that ⊢S 0 = 1!
10. Since ⊢S 0 ≠ 1, this means that S is inconsistent, contrary to our hypothesis. The assumption that ⊢S ConS must be mistaken. ∎

A similar result will hold for any extension of S, or generally, for any system with a recursive axiom set in which all recursive relations/functions are expressible/representable, and for which the Hilbert-Bernays derivability conditions hold.

We might put it this way: if a given axiomatic system for number theory is "sufficiently strong", then if it is consistent, it cannot be used to prove its own consistency.

Precisely because S is (we hope!) consistent, ConS is true in the standard interpretation. However, it is not a theorem of S.

While ConS seems to make a metatheoretic assertion about the system S, taken with the standard interpretation it is simply an assertion about numbers and their arithmetical properties. It is yet another example of a truth of arithmetic that Peano arithmetic fails to capture. Hence, this too shows that system S is incomplete.

As with Gödel's first incompleteness result, adding additional axioms, even ConS itself, will not yield a complete system. Let us consider the system S* obtained from S by adding ConS as an axiom. While it is easily shown that S* is consistent (at least if S is consistent), there will then be a different wff ConS∗ that, for similar reasons, will not be a theorem of S*.

This also shows that there are limitations to the extent to which S (or any other consistent system) can properly be used as the metalanguage in which to conduct its own metatheory.

In fact, there are no closed wffs A for which it is provable in S that A is not a theorem of S. (This can be seen by careful reflection on steps of the proof of Gödel's second theorem.) While S can be used to prove of itself that certain sentences are theorems, it cannot be used to prove that any sentences aren't theorems.

The last point is actually the same as the point that "Bew[x]" does not express in S the property of being the Gödel number of a theorem of S, which we'll discuss further below.

J. Recursive Undecidability

Definition: An axiomatic system K is said to be recursively decidable iff the following number-theoretic property is recursive:

TK(x) : x is the Gödel number of a theorem of K.

(If a system is not recursively decidable, then it is said to be recursively undecidable.)

The notion of a recursively decidable system should not be confused with the notion of a decidable sentence. A system can be recursively decidable while nevertheless having undecidable sentences.

Notice that a theory can be recursively axiomatizable without being recursively decidable.


If we accept Church's thesis, a recursively undecidable system is one in which there is no effective or mechanical procedure for determining whether or not any given wff is a theorem of the system.

If we extend Gödel numbering to include wffs of propositional logic (which is simple enough to do), we could show that System L (propositional logic) is recursively decidable. A wff is a theorem of L iff it is a tautology. There is a mechanical procedure (truth tables) to determine, for any given wff, whether or not it is a tautology.
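A minimal sketch of that mechanical procedure, assuming an illustrative nested-tuple representation of wffs of L (not the official syntax): enumerate every row of the truth table and check that the wff comes out true in all of them.

    from itertools import product

    def value(wff, assignment):
        """Truth value of a wff under a truth-value assignment (dict on statement letters)."""
        if isinstance(wff, str):                    # a statement letter
            return assignment[wff]
        op, *args = wff
        vals = [value(a, assignment) for a in args]
        if op == '~':   return not vals[0]
        if op == '&':   return vals[0] and vals[1]
        if op == 'v':   return vals[0] or vals[1]
        if op == '=>':  return (not vals[0]) or vals[1]
        if op == '<=>': return vals[0] == vals[1]
        raise ValueError(op)

    def letters(wff):
        if isinstance(wff, str):
            return {wff}
        return set().union(*(letters(a) for a in wff[1:]))

    def is_tautology(wff):
        """Check every row of the (finite) truth table; no ingenuity required."""
        ls = sorted(letters(wff))
        return all(value(wff, dict(zip(ls, row)))
                   for row in product([True, False], repeat=len(ls)))

    # (P => Q) => (~Q => ~P) is a tautology; P => Q is not.
    print(is_tautology(('=>', ('=>', 'P', 'Q'), ('=>', ('~', 'Q'), ('~', 'P')))))  # True
    print(is_tautology(('=>', 'P', 'Q')))                                          # False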

However, as we will prove shortly, no similar mechanical procedure exists for systems S, RR or even PF. Semantic trees will not work in every case, and constructing derivations requires insight and ingenuity; it is not a mechanical procedure.

Result: If K is a theory with identity such that (i) K has a recursive vocabulary and system of numerals, (ii) all recursive number-theoretic functions are representable in K and all recursive number-theoretic relations are expressible in K, and (iii) K is consistent (e.g., S or RR), then K is recursively undecidable.
(The Recursive Undecidability Principle)

Proof:
1. Assume that K is a theory with the characteristics above, and assume for reductio that K is recursively decidable.
2. Then the number-theoretic property TK is recursive.
3. Because all recursive number-theoretic relations are expressible in K, TK is expressible in K by some wff T[x]. By the definition of expressibility, for all natural numbers n:
(3a) If TK holds for n, then ⊢K T[n].
(3b) If TK does not hold for n, then ⊢K ¬T[n].
4. The above leads to a Gödel-like sentence, asserting its own unprovability. However, when constructed using T rather than Bew, with (3b), this will lead to an inconsistency.
5. By the Fixed-Point Theorem, there is some closed wff W such that:
(5a) ⊢K W ⇔ ¬T[⌜W⌝].
6. Let q be the Gödel number of W. Hence q is the same as ⌜W⌝.
7. Let us first prove by reductio that ⊬K W.
(7a) Assume that ⊢K W.
(7b) Hence q is the Gödel number of a theorem of K. In other words, TK holds for q.
(7c) By (3a) and (7b), ⊢K T[q], i.e., ⊢K T[⌜W⌝].
(7d) By (5a) and (7c), ⊢K ¬W.
(7e) By (7d) and (7a), K is inconsistent, which is impossible.
8. We have just proven that ⊬K W. However, this also leads to contradiction.
(8a) W is not a theorem of K. Hence q is not the Gödel number of a theorem of K. I.e., TK does not hold for q.
(8b) By (3b), it follows that ⊢K ¬T[q], i.e., ⊢K ¬T[⌜W⌝].
(8c) By (5a) and (8b), we get ⊢K W, which contradicts (7).
9. Therefore, our assumption that TK is a recursive property must be mistaken. Hence, K is recursively undecidable. This establishes the principle. ∎

Corollary: There is no wff of S that expresses the property TS of being the Gödel number of a theorem of S.

Proof: By the above, TS is not recursive, and a number-theoretic property is expressible in S if and only if it is recursive. ∎

The wff "Bew[x]" means that x is the Gödel number of a theorem of S, but it does not express that property. A principle similar to (3b) does not hold for Bew. Otherwise, since G is not a theorem of S, we would be able to prove that

⊢S ¬Bew[⌜G⌝]

and hence G itself, and S would be inconsistent.


Taken with Church's Thesis, the Undecidability Principle means that there is no effective procedure for determining whether or not a given wff is a theorem of S or RR. (Perhaps that will make you feel better about those object-language proofs in S you found difficult: after all, if a computer can't be programmed to find a proof of any given theorem of S, why should you be expected to?)
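By contrast, the best a computer can do in general is search: enumerate candidate proofs and check each one. The sketch below assumes a one-place proof checker (for instance, the earlier check_proof with an axiom test already supplied) and a hypothetical enumeration all_wffs_up_to(k) of the finitely many wffs of size at most k. The search halts exactly when the target is a theorem; when it is not, the loop runs forever, so this is a semi-decision procedure, not a decision procedure:

    from itertools import count, product

    def search_for_proof(target, all_wffs_up_to, check_proof):
        """Blind search through candidate proofs; halts just in case target is a theorem."""
        for k in count(1):                           # ever-larger size bound
            wffs = all_wffs_up_to(k)
            for length in range(1, k + 1):
                for candidate in product(wffs, repeat=length):
                    if candidate[-1] == target and check_proof(list(candidate)):
                        return list(candidate)       # an acceptable proof of target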

The Recursive Undecidability Principle also leads to results such as Church's Theorem and Tarski's theorem.

Tarski's theorem has a proof-structure very similar to the above. (Indeed, the book presents Tarski's theorem as a corollary of the Recursive Undecidability Principle. However, I'll give a separate, more intuitive proof closer to the proof Tarski himself gave.)

Definition: A number-theoretic property P is said to be arithmetical iff there is some wff A[x] with 'x' as its only free variable, containing no constants other than '0', no predicate-letters other than '=', and no function signs other than '′', '+' and '·' (i.e., in the syntax of S/RR), such that, for all natural numbers n, P holds of n iff A[n] is true in the standard interpretation.

The System N

The proof of Tarski's theorem makes reference to the "cheater" system N, which has the same recursive syntax as S, but contains every wff that is true in the standard interpretation as an axiom. Obviously, by the Gödel-Rosser theorem, N is not recursively axiomatizable. Therefore, we cannot fully describe it nor list its axioms. However, we can still consider it as an abstract possibility.

Result: N is a true arithmetical theory.

Proof:
All its axioms are true in the standard interpretation, and the inference rules preserve truth in an interpretation, so all its theorems are also true. In fact, the set of its theorems is identical with the set of its axioms. Specifically:

(!) For any wff A, A is true in the standard interpretation iff ⊢N A. ∎

Corollary: N is a theory with identity, since (A6) and all instances of (A7) are true in the standard interpretation.

Result: Every number-theoretic function and relation that is representable/expressible in S is also representable/expressible in N.

Proof: Because S is a true arithmetical theory, every theorem of S is an axiom of N. ∎

Corollary: All recursive number-theoretic functions and relations are representable/expressible in N.

Result (Tarski's Theorem): The number-theoretic property Tr, which holds of a given natural number x iff x is the Gödel number of a wff that is true in the standard interpretation, is not arithmetical.

Proof:
1. Assume for reductio that Tr is arithmetical. Then there is some wff Tr[x] such that, for all natural numbers n, Tr holds of n iff Tr[n] is true in the standard interpretation.
2. Hence, for all natural numbers n, Tr holds of n iff ⊢N Tr[n].


3. Because N has the standard interpretation as a model, by the Modeling Lemma, it is consistent.
4. N is complete in Post's sense, because for every closed wff, either it or its negation is true in the standard interpretation. Hence, for every closed wff B, either ⊢N B or ⊢N ¬B.
5. Because all recursive functions are representable in N, the Fixed-Point Theorem applies. There is a closed wff A such that:
(5a) ⊢N A ⇔ ¬Tr[⌜A⌝]
Notice that A asserts of itself that it is not true in the standard interpretation. Let q be the Gödel number of A. So q is the same as ⌜A⌝.
6. This leads to the liar paradox. Because A asserts of itself that it is not true in the standard interpretation, it is true iff it is not true.

7. By (4), it holds that either ⊢N A or ⊢N ¬A. Both are impossible. First, we will show that assuming ⊢N A leads to a contradiction.
(7a) Assume that ⊢N A.
(7b) Hence, by (!), A is true in the standard interpretation.
(7c) By (7b), the Gödel number of A, q, is the Gödel number of a wff that is true in the standard interpretation. In other words, Tr holds of q.
(7d) By (2) and (7c), ⊢N Tr[q], i.e., ⊢N Tr[⌜A⌝].
(7e) By (7d) and (5a), ⊢N ¬A.
(7f) By (7a) and (7e), we get that N is inconsistent, which is impossible.
8. However, it is also impossible that ⊢N ¬A.
(8a) Assume that ⊢N ¬A.
(8b) By (5a) and (8a), ⊢N Tr[⌜A⌝], i.e., ⊢N Tr[q].
(8c) By (8b) and (2), Tr holds of q.
(8d) But q is the Gödel number of A, and so A is true in the standard interpretation.
(8e) By (!) and (8d), ⊢N A, and once again we get that N is inconsistent, which is impossible.

9. Because N is consistent, our assumption that Tr is arithmetical must be mistaken. In other words, there cannot be any such "truth predicate" as Tr formulable in the syntax of N/S/RR. ∎

This result can be paraphrased as the claim that the truth or falsity of a sentence of arithmetic is not equivalent to an arithmetical property of its Gödel number. Because the arithmetical properties of the Gödel number of a wff encode the syntactic features of a wff, this means that whether an arithmetical claim is true or false does not boil down to its syntactic features.

Arguably, this deals a significant blow to formalism in the philosophy of mathematics: the theory that mathematics is the study of rules for manipulating meaningless syntactic strings.

Because all recursive properties are arithmetical (you proved this as homework), a corollary of this result is that the property of being an arithmetical truth is not recursive. Accepting Church's Thesis, then, there is no effective or mechanical procedure by which to determine the truth or falsity of any arbitrary arithmetical claim.

Shucks. I guess we can't program our computers to determine the truth or falsity of Goldbach's conjecture! They'll just have to keep working at it.

K. Church’s Theorem

We begin by proving the following as a lemma.

Result (The Finite Extension Principle): For any first-order theories K and K* with the same syntax, if K* is obtained from K by adding a finite number of axioms to the axioms of K, then if K* is recursively undecidable, K is also recursively undecidable.

Proof:
1. Assume that K* is obtained from K by adding the particular wffs A1, . . . , An as axioms. Assume that K* is recursively undecidable, but assume for reductio that K is recursively decidable.
2. Because K and K* have the same syntax, the wffs A1, . . . , An may be used as hypotheses in K.
3. It is obvious that for every wff E, {A1, . . . , An} ⊢K E iff ⊢K∗ E.
4. Let B1, . . . , Bn be the universal closures of A1, . . . , An. In any first-order theory, a wff is interderivable with its closure. Hence, {A1, . . . , An} ⊢K E iff {B1, . . . , Bn} ⊢K E.
5. By the deduction theorem and SL rules: {B1, . . . , Bn} ⊢K E iff ⊢K (B1 ∧ . . . ∧ Bn) ⇒ E.
6. By (3), (4) and (5), we get that ⊢K (B1 ∧ . . . ∧ Bn) ⇒ E iff ⊢K∗ E.
7. Let c be the Gödel number of (B1 ∧ . . . ∧ Bn). By (6) it follows that, for any natural number n, n is the Gödel number of a theorem of K* iff 2³ ∗ c ∗ 2¹¹ ∗ n ∗ 2⁵ is the Gödel number of a theorem of K.
8. We've assumed that K is recursively decidable. Hence TK is a recursive property. However, given (7), the characteristic function of the property of being the Gödel number of a theorem of K*, TK∗, could easily be obtained by substitution using the characteristic function of TK.
9. Hence TK∗ is also a recursive property, and K* is recursively decidable, which contradicts the assumption at (1) that K* is recursively undecidable. Therefore, the assumption that K is recursively decidable must be mistaken. ∎
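A sketch of step (8): the characteristic function of TK∗ comes from that of TK by composing it with the recursive function n ↦ 2³ ∗ c ∗ 2¹¹ ∗ n ∗ 2⁵. The toy juxtaposition below is meant to mirror the ∗ operation on sequence codes from the chapter on number sequence encoding; the helper names are hypothetical.

    def nth_prime(n):
        """The nth prime, 1-indexed (2, 3, 5, ...); trial division is fine for a sketch."""
        found, cand = 0, 1
        while found < n:
            cand += 1
            if all(cand % d for d in range(2, int(cand ** 0.5) + 1)):
                found += 1
        return cand

    def exponents(code):
        """Recover the exponent sequence a1, ..., ak from 2^a1 * 3^a2 * ... * pk^ak."""
        seq, i = [], 1
        while code % nth_prime(i) == 0:
            e = 0
            while code % nth_prime(i) == 0:
                code //= nth_prime(i)
                e += 1
            seq.append(e)
            i += 1
        return seq

    def juxt(a, b):
        """Toy juxtaposition of two sequence codes."""
        out = 1
        for i, e in enumerate(exponents(a) + exponents(b), start=1):
            out *= nth_prime(i) ** e
        return out

    def decider_for_Kstar(is_theorem_of_K, c):
        """Characteristic function of T_K*, by substitution into that of T_K;
        c is the Goedel number of (B1 & ... & Bn)."""
        def is_theorem_of_Kstar(n):
            # n codes a theorem of K* iff 2^3 * c * 2^11 * n * 2^5 codes a theorem of K
            return is_theorem_of_K(juxt(juxt(juxt(juxt(2 ** 3, c), 2 ** 11), n), 2 ** 5))
        return is_theorem_of_Kstar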

Result (Church's Theorem): The first-order predicate calculus, system PF, is recursively undecidable.

Proof:
1. Assume for reductio that PF is recursively decidable, i.e., that the number-theoretic property TPF is recursive.
2. Consider now the system PS, the predicate calculus in the language of arithmetic. This is the system that has the same syntax as S and RR but does not contain any proper axioms. Hence its only axioms are the instances of axiom schemata (A1)–(A5) that contain no constants other than '0' ('a'), no predicate-letters other than '=' ('I²'), and no function signs other than '′' ('f₁¹'), '+' ('f₁²'), and '·' ('f₂²').

3. System RR is a theory with identity with a recursive vocabulary and system of numerals. All recursive number-theoretic functions are representable in RR and all recursive number-theoretic relations are expressible in RR. Since RR is consistent, it follows by the Recursive Undecidability Principle that RR is recursively undecidable.
4. RR has only a finite number of proper axioms, i.e., it is a finite extension of PS. By (3) and the Finite Extension Principle, it follows that PS is also recursively undecidable.
5. Because PS has the same syntax as S, the number-theoretic property FmlS that applies to a number x iff x is the Gödel number of a wff of the syntax of S is the same as the number-theoretic property FmlPS of being the Gödel number of a wff of PS. Because FmlS is primitive recursive, so is FmlPS.
6. The system PS is just like system PF except that its theorems are those theorems of PF that are wffs in the more limited syntax of PS.
7. It follows from (6) that the number-theoretic property of being a theorem of PS, namely TPS, is the conjunction of the properties TPF and FmlPS.
8. However, the conjunction of two recursive number-theoretic properties is itself recursive. Hence, by (1), (5) and (7), TPS is recursive, which contradicts (4).
9. Hence, our assumption that TPF is recursive must be mistaken. This establishes the theorem. ∎

Because PF is both complete and sound, a wff A is a theorem of PF iff it is logically valid.

Therefore, if we accept Church's Thesis, Church's Theorem amounts to the claim that there is no effective or mechanical procedure for determining whether or not a given wff of the language of predicate logic is logically valid.

Doesn't this seem like a good place to end the semester? I thought so.


INDEX OF SYMBOLS AND DEFINITIONS

A, B, C, etc., 3
∈, ∉, 4, 59
{. . .}, 4
⊆, ⊂, 4
∪, ∩, −, 4
∅, 4
〈. . .〉, 4
×, Γn, 4
[A]R, 5
≅, 5, 60
ℵ0, 5
¬, ∨, ∧, ⇒, ⇔, 8, 16
⊨, ⊭, 10, 33–34
2, 11
|, ↓, 14
⊢, 18, 39
∀, 29
∃, 30
A[x], 31
(X)M, 31
s(t), 33
6, 36
⊢∗, 42
g( ), 43, 88–89
=, ≠, 51
(∃nx), 51
⊨=, 52
′, +, ·, 0, 56
{x | A[x]}, 59
n̄, 63
<, >, ≤, ≥, ≮, ≯, ≰, ≱, 64
t|u, 67
CR, 70
GF, 71
Uni, 69, 71

Z(x), 71
µy(. . .), 72
(∃x)x<y, (∀x)x<y, 75
µz z<y(. . .), 76
∗, 79
f#, 79
β, Bt, 82
⌜A⌝, 95
Pf, 98
Neg, 100
Bew, 102

aleph null/naught, 5
argument places, 29
arithmetical property, 106
arithmetization (of syntax), 87
atomic formula, 29
axiom, 17, 39
axiom schema, 17
axiomatic systems, 17

base step, 6
beta function, 82
binary, 4, 28, 29
bound occurrence, 30
bound variable, 30
bounded µ-operator, 76
bounded product, 74
bounded sum, 74

cardinality/cardinal number, 5
Cartesian product, 4
characteristic function, 70
choice of least rule, 72
Church's theorem, 107


Church's thesis, 101
closed formula, 30
complete induction, 6
completeness, 23, 97
conjunction of relations, 75
connective, 8
consistent, 10, 44
constant, 28
constant function, 73
contingent, 10
contradictory, 34
countable, 5
countermodel, 35
course-of-values recursion, 79
Curry's paradox, 102

dagger, 14
decidable relation, 101
decidable sentence, 98
decidable theory, 104
deduction theorem, 19, 40
denumerable, 5
denumerable model, 47
denumerable sequence, 32
derived rule, 19
diagonalization, 95
diagonalization function, 91
disjoint, 4
disjunction of relations, 75
divisibility, 67
domain (of a function/relation), 4
domain of quantification, 31
dyadic, 28, 29

effectively computable, 101
effectively decidable, 101
empty set, 4
encoding, 77
equality, 50
equinumerous, 5
equivalence class, 5
equivalence relation, 5
expressible, 69

false, 33
Fibonacci sequence, 79
field, 5
finite, 5
finite extension, 107
finite extension principle, 107
first-order language, 30
first-order theory, 39
first-order theory with identity, 51
formula, 8, 15, 29, 85
free for, 30
free occurrence, 30
free variable, 30
function, 5, 68
function letter, 29
function, characteristic, 70

general recursive, 72
generalization, 39
Gödel number, 44, 88
Gödel sentence, 98
Gödel's β-function, 80
Gödel's first incompleteness theorem, 98
Gödel's second incompleteness theorem, 104
Gödel–Rosser theorem, 100
graph, 71
grotesque, 27

Henkin sentence, 103
Hilbert-Bernays derivability conditions, 102

identity, 50
identity-valid, 52
inconsistent, 44
independence, 25
individual constant, 28
individual variable, 28
induction, 6, 58
induction step, 6
inductive hypothesis, 6
inference rule, 7, 39
infinite, 5
infinite descent, 67
initial function, 71
interpretation, 31
intersection, 4

juxtaposition, 79

least number principle, 67
Leibniz's law, 51
lemma, 23


length, 77
liar paradox, 107
Lindenbaum extension lemma, 44
Löb's paradox, 102
Löb's theorem, 103
logical axiom, 17, 39
logical consequence, 10, 34
logically equivalent, 10, 34
logically imply, 10, 34
logically true, 34
logically valid, 34
logicism, 61

mathematical induction, 6, 58
maximal, 44
member, 4
metalanguage, 2
metalinguistic variables, 3
metatheory, 1
method of infinite descent, 67
model, 31
model for, 34
modeling lemma, 49
modus ponens, 7, 39
monadic, 28, 29

n-place function, 5
n-place operation, 5
n-place relation, 4
n-tuple, 4
natural deduction, 16
negation of a relation, 75
Nicod system, 27
normal model, 52
null set, 4
number-theoretic function, 68
number-theoretic relation, 68
numeral, 63

object language, 2
ω-consistent, 96
one-one function, 5
open formula, 30
operation, 5
operator, 8
order, 64
ordered n-tuple, 4
ordered pair, 4
overbar, 63

parentheses conventions, 9, 29
Peano arithmetic, 56
Peano axioms/postulates, 56
Peirce dagger, 14
predicate calculus, 39
predicate calculus with identity, 51
predicate letter, 28
primitive recursive, 72
projection function, 69, 71
proof, 17
proof induction, 6
proper axiom (RR), 94
proper axiom (S), 57
proper subset, 4
proper subtheory, 94
property, 4
propositional connective, 8
pseudo-derivability, 42
pure predicate calculus, 39

range, 5
recursion, 72
recursive axiom set, 91
recursive function, 72
recursive property/relation/set, 72
recursive vocabulary, 89
recursively axiomatizable, 101
recursively decidable, 104
reflexive, 5
relation, 4, 68
relative complement, 4
representable, 69
restricted µ-operator, 72
Robinson arithmetic, 94
Russell's paradox, 60

satisfaction/satisfy, 33
satisfiable, 10, 34
schematic letter, 3
Schmödel number, 85
schmtingent, 26
schmuth tables, 25
schmuth-value assignment, 25
scope, 30
select, 26
self-contradiction, 10


semantic tree, 36
sentence, 30
sequence, 32
set, 4
Sheffer stroke, 14
Sheffer/Peirce dagger, 14
signum, 74
singleton, 4
Skolem-Löwenheim theorem, 50
smaller, 5
sound/soundness, 22
standard model for S, 57
statement letter, 8
strong induction, 6
strongly representable, 69
subset, 4
substitution, 71
subtheory, 94
successor, 57, 71
symmetric, 5
syntax, 8
system ,, 85
system F, 59
system L, 17
system N, 106
system PF, 39
system PF=, 51
system PP, 39
system PS, 108
system RR, 94
system S, 56

Tarski's theorem, 106
tautology, 10
term, 29
theorem, 18, 85
theorem schema, 18
transitive, 5
true, 33
true arithmetical theory, 96
truth-value assignment, 9
turnstile, 18

undecidable sentence, 98
undecidable theory, 104
union, 4
unit set, 4
universal, 44
universe of discourse, 31
urelement, 4
use and mention, 2

valid, 34
variable, 28
variable assignment, 32

well-formed formula (wff), 8, 15, 29, 85
wff induction, 6

zero function, 69
