
Predicate Logic

Mike Prest
Department of Mathematics
University of Manchester
Manchester M13 9PL
UK
mprest@maths.man.ac.uk

September 28, 2005


Contents

1 Introduction: The domain of logic

2 Propositional Logic
   2.1 Propositional terms

3 A propositional calculus: Hilbert-style
   3.1 Soundness
   3.2 Completeness

4 Introduction to predicate logic: languages and structures
   4.1 The basic language
   4.2 Enriching the language
   4.3 L-structures
   4.4 Basic examples
   4.5 Interpretation and formulas: the problem
   4.6 A solution: valuations
   4.7 The satisfaction relation
   4.8 Formulas with parameters

5 Theories and models
   5.1 Theories and elementary equivalence
   5.2 Isomorphism of L-structures
   5.3 Substructures and elementary substructures
   5.4 A little set theory
   5.5 Downwards Löwenheim-Skolem theorem
   5.6 Upwards Löwenheim-Skolem theorem

6 Predicate Calculus: the Completeness Theorem
   6.1 Substitution
   6.2 Another proof system for propositional logic
   6.3 A proof system for predicate logic
   6.4 Proofs involving extra constants
   6.5 Consistency
   6.6 The Completeness Theorem
   6.7 Herbrand structures

Please let me know of any typos or errors that you come across.


1 Introduction: The domain of logic

By logic here we mean either propositional logic (the logic of combining statements) or first-order predicate logic (a logic which can be used for constructing statements). Since propositional logic is a part of predicate logic we begin with the former.

Propositional logic can be seen as expressing the basic "laws of thought" which are used not just in mathematics but also in everyday discourse. Predicate logic, which can also be thought of as "the logic of quantifiers", is strong enough to express essentially all formal mathematical argument.

Most of the examples that we will use are taken from mathematics but we do use natural language examples to illustrate some of the basic ideas. The natural language examples will be rather "bare", reflecting the fact that these formal languages can capture only a small part of the meanings and nuances of ordinary language. There are logics which capture more (modality, uncertainty, etc.) of natural language but they have made almost no impact on mathematics (as opposed to philosophy and computer science), because predicate logic is already enough for expressing the results of mathematical thinking.

By the way, one should be clear on the distinction between the formal expression of mathematics (which is as precise and as formal as one wishes it to be) and the process of mathematical thinking and informal communication of mathematics (which uses mental imagery and all the usual devices of human communication).

2 Propositional Logic

2.1 Propositional terms

Propositional logic is the logic of combining already formed statements. It begins with careful and completely unambiguous descriptions of how to use the "propositional connectives" which are "and", "or", "not", "implies". But first we should be clear on what is meant by a "statement" (the words "assertion" and "proposition" will be used interchangeably with "statement").

The distinguishing feature of a statement is that it is either true or false. "The moon is made of cheese" is a (false) statement and "1 + 1 = 2" is a (true, essentially by definition) statement. Fortunately, in order to deal with the logic of statements, we do not need to know whether a given statement is true or false: it might not be immediately obvious whether "113456 × 65421 = 880459536" is true or false but certainly it is a statement. A more interesting example is "There are infinitely many prime pairs.", where by a prime pair we mean a pair, p, p + 2, of numbers, two apart, where both are prime (for instance 3 and 5 form a prime pair, as do 17 and 19 but not 19 and 21). It is a remarkable fact that, to date, no-one has been able to decide whether this statement is true or false. Yet it is surely (*) either false (after some large enough number there are no more prime pairs) or true (given any prime pair there is always a larger prime pair somewhere out there).

On the other hand, the following are not statements.
"Is 7 a prime number?"
"Add 1 and 1."

The first is a question, the second a command. What about

"x is a prime number.":

is this a statement? The answer is, "It depends.": if the context is such that x already has been given a value then it will be a statement (since then either x is a prime number or is not) but otherwise, if no value (or other sufficient information) has been assigned to x then it is not a statement.

Here's a silly example (where we can't tell whether something is a statement or not). Set x = 7 if there are infinitely many prime pairs but leave the value of x unassigned if there are not. Is "x is a prime number" a statement? Answer: (to date) we can't tell!

But the example is silly and quite off the path of what we will be doing.

When we discuss mathematical properties of, for instance, numbers, we use variables, x, y to stand for these numbers. This allows us to make general assertions. So we can say "For all integers x, y we have x + y = y + x." instead of listing all the examples of this assertion: ..., "0 + 1 = 1 + 0", "1 + 1 = 1 + 1", ..., "2 + 5 = 5 + 2", ... (not that we could list all the assertions covered by this general assertion, since there are infinitely many of them). In the same way, although we will use particular statements as examples, most of the time we use variables p, q, r to stand for statements in order that we may make general assertions.

You might notice that in the paragraph above I assigned different uses to the words "assertion" and "statement" (although earlier I said that I would use these interchangeably). This is because I was making statements about statements. That can be confusing, so I used "assertion" for the first (more general, "meta", "higher") type of use and "statement" for the second type of use. In logic we make statements about statements (and even statements about statements which are themselves statements about statements ...) and you just have to get used to that (so keep your wits about you) but I have tried to make my own use of the English language help rather than hinder clarity.

As indicated already, propositional logic is the logic of "and", "or", "not", "implies" and "iff". The words in quotes are propositional connectives: they operate on propositions (and propositional variables) to give new propositions.

Initially we define these connectives somewhat informally in order to emphasise their intuitive meaning. Then we give their exact definition after we have been more precise about the context and have introduced the idea of (truth) valuation.


First, notation: we write ∧ for "and", ∨ for "or", ¬ for "not", → for "implies" and ↔ for "iff". So if p is the proposition "the moon is made of cheese" and q is the proposition "mice like cheese" then p ∧ q, p ∨ q, ¬p, p → q, p ↔ q respectively may be read as "the moon is made of cheese and mice like cheese", "the moon is made of cheese or mice like cheese", "the moon is not made of cheese", "if the moon is made of cheese then mice like cheese" and "the moon is made of cheese iff mice like cheese".

A crucial observation is that the truth value (true or false) of a statement obtained by using these connectives only depends on the truth values of the "component propositions". Check through the examples given to see if you agree (you might have some doubts about the last two examples: we will discuss these). Also, for instance, you may not know whether or not the following are true statements: "the third homology group of the torus is trivial", "every module is stable" but you know that the combined statement "the third homology group of the torus is trivial and every module is stable" is true exactly if each of the separate statements is true. That is why it makes sense to apply these propositional connectives to propositional variables as well as just to propositions.

So now the formal definition.

We start with a collection, p, q, r, p_0, p_1, et cetera of symbols which we call propositional variables. Then we define, by induction, the propositional terms by the following clauses:

(0) every propositional variable is a propositional term;
(i) if s and t are propositional terms then so are: s ∧ t, s ∨ t, ¬s, s → t, s ↔ t;
(ii) that's it (more formally, there are no propositional terms other than those which are so by virtue of the first two clauses).

Remark: You can see some (all too typical) abuse of notation in clause (i), where the symbols "s" and "t" are not being used as propositional terms but, rather, variables ranging over propositional terms. It is quite typical for texts on logic to fail to point out such uses even though logic has a reputation for precision! Maybe there's a reason. (Perhaps it would make logic even more confusing?)

The above is an inductive definition, with (0) being the base case and (i) the induction step(s), but it's a more complicated inductive structure than that which uses the natural numbers (as "indexing structure"). For there are many base cases (any propositional variable), not just one (0 in ordinary induction) and there are (as given) five types of inductive step, not just one ("add 1" in ordinary induction).

Example 2.1 Starting with propositional variables p, q, r, these are propositional terms by clause (0) and then, by clause (i), so are p ∧ p, p ∧ q, ¬q, q → p for instance. Then, by clause (i) again, (p ∧ p) ∧ p, (p ∧ q) → ¬r, (q → p) → (q → p) are further propositional terms. Further applications of clause (i) allow us to build up more and more complicated propositional terms. So you can see that these little clauses have large consequences. The last clause simply says that every propositional term has to be built up in this way.

Notice how we have to use parentheses to write propositional terms. This is just like the use in arithmetic and algebra: without parentheses the expression (−3 + 5) × 4 would read −3 + 5 × 4 and the latter is ambiguous. At least, it would be, if we had not become used to the hierarchy of arithmetical symbols by which − binds more closely than × and ÷, and those bind more closely than + and −. Of course parentheses are still needed but such a hierarchy reduces the number needed and leads to easier readability. A similar hierarchy is used for propositional terms, by which ¬ binds more closely than ∧ and ∨, which bind more closely than → and ↔ (at least those are my conventions, unfortunately they are not universal). Therefore ¬p ∧ q → r means ((¬p) ∧ q) → r rather than ¬(p ∧ q) → r or (¬p) ∧ (q → r) or ¬(p ∧ (q → r)).

You will recall that in order to prove results about things which are defined by induction (on the natural numbers) it is usually necessary to use proof by induction. The same is true here: one deals with the base case (propositional variables) then the inductive steps. In this case there are five different types of inductive step but we'll see later that some of the propositional connectives can be defined in terms of the others. For instance using ∧ and ¬ (or using → and ¬) we can define all the others. Having made that observation, we then need only prove the inductive steps for ∧ and ¬ (or for → and ¬).

Proofs of assertions about propositional terms (and, as we will see later, for the more complicated predicate formulas) which follow their inductive construction are often called "proofs by induction on complexity of terms".

Now for the key idea of a (truth) valuation. Fix some set of propositional variables, and hence the corresponding set of propositional terms. A valuation is a function v from the set of propositional terms to the two-element set (really, the two-element boolean algebra, but you can ignore this parenthetical comment) {T, F} ("True" and "False", though some people prefer to use {0, 1}) which satisfies the following conditions. For all propositional terms s, t we have

v(s ∧ t) = T iff v(s) = T and v(t) = T;
v(s ∨ t) = T iff v(s) = T or v(t) = T;
v(¬s) = T iff v(s) = F;
v(s → t) = T iff v(s) = F or v(t) = T;
v(s ↔ t) = T iff the values of v(s) and v(t) are the same, that is, if v(s) = v(t).

There's quite a lot to say about this definition. We start with a formal but important point. Namely, because all propositional terms are built up from the propositional variables using the propositional connectives, any valuation is completely determined by its values on the propositional variables (this is the formal statement of the point we made (the "crucial observation") when discussing mice, cheese and homology groups).

For instance if v(p) = v(q) = T and v(r) = F then we have, since v is a valuation, v(p ∨ r) = T and hence v(¬(p ∨ r)) = F. Similarly, for any propositional term, t, built from p, q and r, the value v(t) is determined by the above conditions. That does actually need proof. I don't mean the (obvious, easily proved by induction) point that this process works (in the sense that it gives a value): rather the more subtle point that there might be more than one way of building up a propositional term and hence, just conceivably, there might be two different construction routes - one which leads to the value T, and the other to F. This does not, in fact, happen: every propositional term (with all parentheses included) has a unique "construction tree". We won't give a proof: not that the proof is difficult (though it can be difficult to see how on earth to start), rather we leave it as an exercise.

So every valuation is determined by its values on the propositional variables. Another point, which we won't prove (by induction), is that there are no restrictions on these values - we can assign values to the propositional variables as we like and the extension to all propositional terms will be a well-defined valuation. Therefore if we take terms in n propositional variables there will be 2^n valuations on these.
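To make these inductive definitions concrete, here is a small Python sketch (mine, not part of the notes): propositional terms are represented as nested tuples, and a valuation given only on the propositional variables is extended to every term by recursion on its construction - which is exactly the sense in which a valuation is determined by its values on the variables.

# Illustrative sketch. Propositional terms as nested tuples:
#   ("var", "p"), ("not", t), ("and", s, t), ("or", s, t), ("imp", s, t), ("iff", s, t)

def evaluate(term, v):
    """Extend a valuation v (a dict from variable names to True/False)
    to an arbitrary propositional term, by recursion on its construction."""
    op = term[0]
    if op == "var": return v[term[1]]
    if op == "not": return not evaluate(term[1], v)
    if op == "and": return evaluate(term[1], v) and evaluate(term[2], v)
    if op == "or":  return evaluate(term[1], v) or evaluate(term[2], v)
    if op == "imp": return (not evaluate(term[1], v)) or evaluate(term[2], v)
    if op == "iff": return evaluate(term[1], v) == evaluate(term[2], v)
    raise ValueError("not a propositional term")

def variables(term):
    """The set of propositional variables occurring in a term."""
    if term[0] == "var":
        return {term[1]}
    return set().union(*(variables(sub) for sub in term[1:]))

# v(p) = v(q) = T and v(r) = F determine the value of every term built from p, q, r:
v = {"p": True, "q": True, "r": False}
t = ("not", ("or", ("var", "p"), ("var", "r")))
print(evaluate(t, v))      # False, matching v(not(p or r)) = F in the text
print(len(variables(t)))   # 2 variables occur in t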

Truth tables are tables showing evaluation of valuations on propositional terms. They can also be used to show the effect of the propositional connectives on truth values. Note that "or" is used in the inclusive sense ("one or the other or both") rather than the exclusive ("one or the other but not both").

p | q | p ∧ q
T | T | T
T | F | F
F | T | F
F | F | F

p | q | p ∨ q
T | T | T
T | F | T
F | T | T
F | F | F

p | q | p → q
T | T | T
T | F | F
F | T | T
F | F | T

p | ¬p
T | F
F | T

p | q | p ↔ q
T | T | T
T | F | F
F | T | F
F | F | T

You might feel that the truth table for → does not capture what you consider to be the meaning of "implies" but, if we are to regard it as a function on truth values (whatever the material connection or lack thereof between its "input" propositions) then the definition given is surely the right one. Or just regard p → q as an abbreviation for ¬p ∨ q.
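As a quick check of that last remark, the following few lines of Python (illustrative only) compare the truth table for → given above with ¬p ∨ q on all four rows.

from itertools import product

# The rows of the table for "implies", written out explicitly, against (not p) or q:
implies_table = {(True, True): True, (True, False): False,
                 (False, True): True, (False, False): True}
for p, q in product([True, False], repeat=2):
    assert implies_table[(p, q)] == ((not p) or q)
print("p -> q and (not p) or q agree on all four rows")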

We will see some examples of truth tables in the lectures. They can be used to determine whether a propositional term t is a tautology, meaning that v(t) = T for every valuation v, or a contradiction, meaning that v(t) = F for every valuation v, or neither. Their use implicitly assumes the following fact: if t is a propositional term and if v and w are valuations which agree on all propositional terms occurring in t then v(t) = w(t). This is pretty obvious and is easily proved by induction (on complexity of terms).
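The truth-table test for tautologies and contradictions can be sketched as follows (illustrative; it assumes the evaluate and variables helpers from the sketch in Section 2.1).

from itertools import product

def classify(term):
    vs = sorted(variables(term))
    values = [evaluate(term, dict(zip(vs, row)))
              for row in product([True, False], repeat=len(vs))]
    if all(values):
        return "tautology"
    if not any(values):
        return "contradiction"
    return "neither"

p, q = ("var", "p"), ("var", "q")
print(classify(("or", p, ("not", p))))   # tautology
print(classify(("and", p, ("not", p))))  # contradiction
print(classify(("imp", p, q)))           # neither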

By the way, I didn't define what I mean by a propositional term occurring in a propositional term but I think you know what I mean and I refer you to the definition of "free variables" of a formula (near the beginning of Section 4) where you can see how something like this can be given a precise definition.

A number of my comments are skirting round the issue of just how precise one should be in logic. By its nature, which is a consequence of its subject-matter, in logic one often has to be careful about things which normally one would scarcely notice. But one should not confuse the subject-matter of logic with our thinking about that subject-matter. So I take the view that there is no more need to prove mathematically obvious things in logic than in any other part of mathematics. (Of course, one should be prepared, if necessary, to provide proofs of such "obvious" things!)

We say that two propositional terms, s and t, are logically equivalent, and write s ≡ t, if v(s) = v(t) for every valuation v. It is equivalent that s ↔ t be a tautology. Let's prove that.

Suppose s ≡ t so, if v is any valuation, then v(s) = v(t) so, from the definition of valuation, v(s ↔ t) = T. This is so for every valuation so, by definition of tautology, s ↔ t is a tautology. For the converse, suppose that s ↔ t is a tautology and let v be any valuation. Then v(s ↔ t) = T and so (definition of valuation) v(s) = v(t). Then, by definition of equivalence, s and t are logically equivalent. We see that the proof was just an easy exercise from the definitions.

If S is a set of propositional terms and t is a propositional term then we write S |= t if for every valuation v with v(S) = T, by which we mean v(s) = T for every s ∈ S, we have v(t) = T: "whenever S is true so is t". We will use the notation v |= S as an alternative to v(S) = T (for reasons which will be obscure until we introduce the corresponding notion for predicate logic).


3 A propositional calculus: Hilbert-style

Given a propositional term, one may test whether or not it is a tautology by, for example, constructing its truth table. This is regarded as a "semantic" test because it is in terms of valuations. This test is recursive in the sense that we have a procedure which, after a finite amount of time, is guaranteed to tell us whether or not the term is a tautology.

More generally, suppose that S is a finite set of propositional terms and that t is a propositional term. If we want to determine whether or not S |= t (that is, whether every valuation which makes everything in S "true" also makes t "true") then we can do so. Namely, we take all the propositional variables which occur anywhere in any member of S or in t, draw up the truth table for each term in S and for t and then check that every valuation (= row of the truth table) which makes every member of S true also makes t true.
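Here is that procedure as a Python sketch (illustrative; it again assumes the evaluate and variables helpers from the Section 2.1 sketch): S |= t is decided by running through every valuation of the variables occurring in S or in t.

from itertools import product

def entails_semantically(S, t):
    vs = sorted(set(variables(t)).union(*(variables(s) for s in S)))
    for row in product([True, False], repeat=len(vs)):
        v = dict(zip(vs, row))
        if all(evaluate(s, v) for s in S) and not evaluate(t, v):
            return False   # found a valuation making S true but t false
    return True

p, q, r = ("var", "p"), ("var", "q"), ("var", "r")
print(entails_semantically([("imp", p, q), ("imp", q, r)], ("imp", p, r)))  # True
print(entails_semantically([("imp", p, q)], ("imp", q, p)))                 # False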

In the case of predicate logic, however, it turns out that there is no corresponding algorithm for determining whether or not a proposition ("sentence" in that context) is a tautology or whether the truth of a finite set of propositions implies the truth of another proposition. (In fact the set of tautologies will be "recursively enumerable" but not recursive.) The best we can do is to find a method of "generating" all tautologies, or of generating from a set, S, of "axioms" all consequences of those axioms. Such a generating method is a (propositional or predicate) calculus. In this section we will describe one such calculus. When we come to predicate logic we will describe another calculus (which will also include a different style of calculus for propositional logic).

Our (Hilbert-style) calculus will consist of certain axioms and one rule of deduction. There are infinitely many axioms, being all the propositional terms of one of the forms:

(i) s → (t → s)
(ii) (r → (s → t)) → ((r → s) → (r → t))
(iii) ¬¬s → s
(iv) (¬s → ¬t) → (t → s)

where r, s and t may be any propositional terms. Thus, for instance, the following is an axiom: (p ∧ ¬r) → ((s ∨ t) → (p ∧ ¬r)).

We refer to (i)-(iv) as axiom schemas.

The single rule of deduction, modus ponens, says that, from s and s → t, we may deduce t.

Then we define the notion of entailment or logical implication, written ⊢, within this calculus. Let S be a set (not necessarily finite) of propositional terms and let s, t be propositional terms.

(i) If t is an axiom then S ⊢ t ("logical axiom" LA)
(ii) If s ∈ S then S ⊢ s ("non-logical axiom" NLA)
(iii) If S ⊢ s and S ⊢ s → t then S ⊢ t ("modus ponens" MP)
((iv) That's it (like clause (ii) in the definition of propositional term).)

We read S ⊢ t as "S entails t" or "S logically implies (within this particular calculus) t".

This definition is, like various definitions we have seen before, an inductive one: it allows chains of entailments. Here is an example of a deduction of p → r from S = {p → q, q → r}.

1. S ⊢ q → r                                  NLA
2. S ⊢ (q → r) → (p → (q → r))                LA(i)
3. S ⊢ p → (q → r)                            MP 1,2
4. S ⊢ (p → (q → r)) → ((p → q) → (p → r))    LA(ii)
5. S ⊢ (p → q) → (p → r)                      MP 3,4
6. S ⊢ p → q                                  NLA
7. S ⊢ p → r                                  MP 5,6

(The right-hand entries are there to help any reader follow/check the deduction.)
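Purely as an illustration (nothing like this appears in the notes), the deduction above can be written out in Python and its modus ponens steps checked mechanically; the axiom lines are built directly from schemas (i) and (ii), so only the NLA and MP justifications are verified.

def imp(s, t):
    return ("imp", s, t)

def ax_i(s, t):        # schema (i):  s -> (t -> s)
    return imp(s, imp(t, s))

def ax_ii(r, s, t):    # schema (ii): (r -> (s -> t)) -> ((r -> s) -> (r -> t))
    return imp(imp(r, imp(s, t)), imp(imp(r, s), imp(r, t)))

p, q, r = ("var", "p"), ("var", "q"), ("var", "r")
S = [imp(p, q), imp(q, r)]

# The seven lines of the deduction of p -> r from S, with their justifications.
lines = [
    (imp(q, r),                    ("NLA",)),
    (ax_i(imp(q, r), p),           ("LA",)),
    (imp(p, imp(q, r)),            ("MP", 1, 2)),
    (ax_ii(p, q, r),               ("LA",)),
    (imp(imp(p, q), imp(p, r)),    ("MP", 3, 4)),
    (imp(p, q),                    ("NLA",)),
    (imp(p, r),                    ("MP", 5, 6)),
]

for term, just in lines:
    if just[0] == "NLA":
        assert term in S
    elif just[0] == "MP":
        a, b = lines[just[1] - 1][0], lines[just[2] - 1][0]
        # one cited line must be the premise, the other must read (premise -> this line)
        assert b == imp(a, term) or a == imp(b, term)
    # "LA" lines were produced by the schema builders above, so nothing more is checked.
print("deduction checks out")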

Note that if S ⊆ T and if S ⊢ s then T ⊢ s because any deduction (such as that above) of s from S may be changed into a deduction of s from T simply by replacing every occurrence of "S" by "T".

Warning: it can be surprisingly difficult to find deductions, even of simple things, in this calculus. The ("Gentzen-style/Natural Deduction") calculus that we will use later allows, so I am told, deductions to be found more easily. Well, maybe with experience in using them, both become "natural" but for me they are of theoretical interest only. (I find it of interest that there is a calculus for which one can prove a completeness theorem (3.11) but I find trying to find deductions within a particular calculus deeply unfascinating (more honestly, extremely frustrating!). But those who are interested in reasoning per se and who believe that it can be modelled by such calculi will have a different view.)

For any such deductive calculus there are two central issues: soundness and completeness. We say that a deductive calculus is sound if we cannot deduce contradictions by using it. Equivalently (check that you see why), if S ⊢ t then S |= t. And we say that a deductive calculus is complete if it is strong enough to deduce all consequences, that is if S |= t implies S ⊢ t.

So soundness is "If we can deduce t from S then, whenever S is true, t is true." and completeness is "If t is true whenever S is true then there will be a deduction of t from S.".

In the remainder of this section we will give a proof of soundness (this is the easier part) and completeness for the calculus above. Going through this you will gain some acquaintance with ideas which will be used in the, considerably more complicated, proof of (soundness and) completeness for the calculus which we will introduce for predicate logic.


3.1 Soundness

Suppose that S is a set of propositional terms and that t is a propositional term. We have to show that if S ⊢ t is true then so is S |= t. So suppose that S ⊢ t and let v be a valuation with v |= S (i.e. v(s) = T for every s ∈ S). We must show that v |= t (that is, v(t) = T).

The idea of the proof is as follows. The fact that S ⊢ t means that there is a deduction of t from S. Any such deduction is given by a sequence of (logical and non-logical) axioms and applications of modus ponens. If we show that v assigns "T" to every axiom and that modus ponens preserves "T" then every consequence of a deduction will be "T". More precisely, we argue as follows.

If r is a logical axiom then (go back and check that all those axioms are actually tautologies!) r is a tautology, so certainly v(r) = T. If r is a non-logical axiom then r ∈ S so, by assumption on v, we have v(r) = T. Suppose now that we have an application of MP in the deduction of t. That application has the form (perhaps with intervening lines and the first two lines occurring in the opposite order)

S ⊢ r
S ⊢ r → r′
S ⊢ r′

for some propositional terms r, r′. We may assume inductively (inducting on the length of the deduction) that v(r) = T and that v(r → r′) = T. Then, from the list of conditions for v to be a valuation, it follows that v(r′) = T, as required.

On the very last line of the deduction we have

S ⊢ t

so our argument shows that v(t) = T, and we conclude that the calculus is sound.

3.2 Completeness

Our first step is to prove the Deduction Theorem, which allows us to move terms in and out of the set of non-logical axioms. If we write something like "S ⊢ t" in a mathematical assertion (as opposed to this being a line of a formal deduction) you should read this as saying "There is a deduction of t from S."

Theorem 3.1 (Deduction Theorem) Let S be a set of propositional terms and let s and t be propositional terms. Then S ⊢ (s → t) iff S ∪ {s} ⊢ t.

Proof. Both directions of the proof are really instructions on how to transform a deduction of one into another.

From a deduction of S ⊢ (s → t) we may obtain a deduction of t from S ∪ {s} by first replacing each occurrence of S (to the right of "⊢") by an occurrence of S ∪ {s} (and noting that this is still a valid deduction), then adding two more lines at the end, namely

S ∪ {s} ⊢ s   (NLA)
S ∪ {s} ⊢ t   (MP, line above and line before that).

Note that this does give a deduction of t from S ∪ {s}.

For the converse, suppose that there is a deduction of t from S ∪ {s}. This deduction is a sequence of lines

S ∪ {s} ⊢ t_i   for i = 1, . . . , n, where t_n = t.

We will replace each of these lines by some new lines.

If t_i is a logical axiom or a member of S then we replace the i-th line by

S ⊢ t_i                    LA or NLA
S ⊢ t_i → (s → t_i)        LA(i)
S ⊢ s → t_i                MP

If t_i is s then we replace the i-th line by lines constituting a deduction of s → s from S (the proof of 3.2 but with "S" to the left of each "⊢").

If the i-th line is obtained by an application of modus ponens then there are line numbers j, k < i such that t_k is t_j → t_i. In our transformed deduction there will be corresponding (also earlier) lines reading

S ⊢ s → t_j   and
S ⊢ s → (t_j → t_i)

so we replace the old i-th line by the lines

S ⊢ (s → (t_j → t_i)) → ((s → t_j) → (s → t_i))   Ax(ii)
S ⊢ (s → t_j) → (s → t_i)                          MP (line above and one of the earlier ones)
S ⊢ s → t_i                                        MP (line above and one of the earlier ones).

What we end up with is a (valid - you should check that you see this) deduction with last line

S ⊢ s → t_n,

as required (recall that t_n is t). (It's worthwhile applying the process described to an example just to clarify how this works.) □

Next, some lemmas, the first of which was used in the proof above.

Lemma 3.2 For every propositional term s there is a deduction (independent of s) with last line ⊢ s → s and hence for every set S of propositional terms there is a deduction with last line S ⊢ s → s.

Proof. Here's the deduction.

1. ⊢ (s → ((s → s) → s)) → ((s → (s → s)) → (s → s))   Ax(ii)
2. ⊢ s → ((s → s) → s)                                  Ax(i)
3. ⊢ (s → (s → s)) → (s → s)                            MP(1,2)
4. ⊢ s → (s → s)                                        Ax(i)
5. ⊢ s → s                                              MP(3,4)

To obtain the second statement just put "S" to the left of each "⊢" and note that the deduction is still valid. □


We'll abbreviate the statement of the following lemmas as in the statement of the Deduction Theorem. Throughout s and t are any propositional terms.

Lemma 3.3 ⊢ s → (¬s → t)

Proof. The first part of the proof is just to write down a deduction which takes us close to the end. Then there are two applications of the Deduction Theorem. We've actually incorporated those uses, labelled DT, into the deduction itself, as a derived rule of deduction. An alternative would be to stop the deduction at the line "7. {s, ¬s} ⊢ t MP(5,6)" and then say "Therefore {s, ¬s} ⊢ t. By the Deduction Theorem it follows that {s} ⊢ ¬s → t and then, by the Deduction Theorem again, ⊢ s → (¬s → t) follows."

1. {s, ¬s} ⊢ ¬s → (¬t → ¬s)         Ax(i)
2. {s, ¬s} ⊢ ¬s                      NLA
3. {s, ¬s} ⊢ ¬t → ¬s                 MP(1,2)
4. {s, ¬s} ⊢ (¬t → ¬s) → (s → t)     Ax(iv)
5. {s, ¬s} ⊢ s → t                   MP(3,4)
6. {s, ¬s} ⊢ s                       NLA
7. {s, ¬s} ⊢ t                       MP(5,6)
8. {s} ⊢ ¬s → t                      DT
9. ⊢ s → (¬s → t)                    DT  □

In the next proof we use more derived rules of deduction.

Lemma 3.4 ⊢ (s → ¬s) → ¬s

Proof.

1. {s → ¬s} ⊢ ¬¬s → s                             Ax(iii)
2. {s → ¬s, ¬¬s} ⊢ s                              DT
3. {s → ¬s, ¬¬s} ⊢ s → ¬s                         NLA
4. {s → ¬s, ¬¬s} ⊢ ¬s                             MP(2,3)
5. {s → ¬s, ¬¬s} ⊢ s → (¬s → ¬(s → s))            Lemma 3.3
6. {s → ¬s, ¬¬s} ⊢ ¬s → ¬(s → s)                  MP(2,5)
7. {s → ¬s, ¬¬s} ⊢ ¬(s → s)                       MP(4,6)
8. {s → ¬s} ⊢ ¬¬s → ¬(s → s)                      DT
9. {s → ¬s} ⊢ (¬¬s → ¬(s → s)) → ((s → s) → ¬s)   Ax(iv)
10. {s → ¬s} ⊢ (s → s) → ¬s                       MP(8,9)
11. {s → ¬s} ⊢ s → s                              Lemma 3.2
12. {s → ¬s} ⊢ ¬s                                 MP(10,11)
13. ⊢ (s → ¬s) → ¬s                               DT  □

Lemma 3.5 ⊢ s → ¬¬s


Proof.

⊢ ¬¬¬s → ¬s                     Ax(iii)
⊢ (¬¬¬s → ¬s) → (s → ¬¬s)       Ax(iv)
⊢ s → ¬¬s                       MP  □

Lemma 3.6 ⊢ ¬s → (s → t)

Proof. Exercise! □

Lemma 3.7 ⊢ s → (¬t → ¬(s → t))

Proof. Exercise! □

Now, define a set S of (propositional) terms to be consistent if there is some term t such that there is no deduction of t from S. Accordingly, say that a set S is inconsistent if for every term t one has S ⊢ t. You might reasonably have expected the definition of S being consistent to be that no contradiction can be deduced from S. But the definition just given is marginally more useful and is equivalent to the definition just suggested (this follows once we have proved 3.11).

Lemma 3.8 The set S of terms is inconsistent iff for some term s we have S ⊢ ¬(s → s).

Proof. The direction "⇒" is immediate from the definition.

For the other direction, we suppose that there is some term s such that S ⊢ ¬(s → s). It must be shown that for every term t we have S ⊢ t. Here is the proof.

1. S ⊢ s → s                                   Lemma 3.2
2. S ⊢ (s → s) → ¬¬(s → s)                     Lemma 3.5
3. S ⊢ ¬¬(s → s)                               MP(1,2)
4. S ⊢ ¬¬(s → s) → (¬t → ¬¬(s → s))            Ax(i)
5. S ⊢ ¬t → ¬¬(s → s)                          MP(3,4)
6. S ⊢ (¬t → ¬¬(s → s)) → (¬(s → s) → t)       Ax(iv)
7. S ⊢ ¬(s → s) → t                            MP(5,6)
8. S ⊢ ¬(s → s)                                by assumption
9. S ⊢ t                                       MP(7,8)  □

Lemma 3.9 Let S be a set of terms and let s be a term. Then S ∪ {s} is inconsistent iff S ⊢ ¬s.

Proof. Suppose first that S ∪ {s} is inconsistent. Then, by definition, S ∪ {s} ⊢ ¬s. So, by the Deduction Theorem, we have S ⊢ s → ¬s. Since also ⊢ (s → ¬s) → ¬s (3.4) and hence S ⊢ (s → ¬s) → ¬s, we can apply modus ponens to obtain S ⊢ ¬s.

For the converse, suppose that S ⊢ ¬s and let t be any term. It must be shown that S ∪ {s} ⊢ t. We have S ∪ {s} ⊢ s and also, by 3.3, S ∪ {s} ⊢ s → (¬s → t). So, by modus ponens, S ∪ {s} ⊢ ¬s → t follows. Since S ⊢ ¬s also S ∪ {s} ⊢ ¬s so another application of modus ponens gives S ∪ {s} ⊢ t. This shows that S ∪ {s} is inconsistent, as required. □

The next lemma is an expression of the finite character of the notion of deduction.

Lemma 3.10 Suppose that S is a set of terms and that s is a term such that S ⊢ s. Then there is a finite subset, S′, of S such that S′ ⊢ s.

Proof. Any derivation (of s from S) has only a finite number of lines and hence uses only a finite number of non-logical axioms. Let S′ be the, finite, set of all those actually used. Replace S by S′ throughout the deduction to obtain a valid deduction, showing that S′ ⊢ s. □

In the proof of the next theorem we make use of the observation that all the propositional connectives may be defined using just ¬ and → and so, in order to check that a function v from the set of propositional terms to {T, F} is a valuation, it is enough to check the defining clauses for ¬ and → only.

Theorem 3.11 (Completeness Theorem for Propositional Logic, version 1) Suppose that S is a consistent set of propositional terms. Then there is a valuation v such that v |= S.

Proof. Let Γ = {T : T is a consistent set of terms and T ⊇ S} be the set of all sets of terms which contain S and are still consistent. We begin by showing, using Zorn's lemma (see 3.15 below, for this), that

Γ has a maximal element.

So let ∆ be a subset of Γ which is totally ordered by inclusion. Let T = ⋃∆ be the union of all the sets in ∆. It has to be shown that T ∈ Γ and the only possibly non-obvious point is that T is consistent. If it were not then, choosing any term s, there would be a deduction T ⊢ ¬(s → s). By 3.10 there would be a finite subset T′ of T with T′ ⊢ ¬(s → s). Since ∆ is totally ordered and since T′ is finite there would be some T_0 ∈ ∆ such that T_0 ⊇ T′. But then we would have T_0 ⊢ ¬(s → s). By 3.8 it would follow that T_0 is inconsistent, contradicting the fact that T_0 ∈ ∆ ⊆ Γ.

This shows that every totally ordered subset of Γ has an upper bound in Γ and so Zorn's Lemma gives the existence of a maximal element, T say, of Γ. That is, T is a maximal consistent set of terms containing S. What we will do is define the valuation v by v(r) = T if r ∈ T and v(r) = F if r ∉ T, but various things have to be proved in order to show that this really does give a valuation.

First, we show that T is “deductively closed” in the sense that

(*1) if T ⊢ r then r ∈ T.

Suppose, for a contradiction, that we had T ⊢ r but r ∉ T. Then, by maximality of T, the set T ∪ {r} would have to be inconsistent and hence, by 3.9, T ⊢ ¬r. By 3.3, T ⊢ r → (¬r → t) for any term t, so two applications of modus ponens give T ⊢ t. Since t was arbitrary that shows inconsistency of T - contradiction. Therefore (*1) is proved.

Next we show that T is “complete” in the sense that

(*2) for every term t either t ∈ T or ¬t ∈ T.

For, suppose that t ∉ T. Then, by maximality of T, the set T ∪ {t} is inconsistent so, by 3.9, T ⊢ ¬t. Therefore, by (*1), ¬t ∈ T.

Then we show that

(*3) s → t ∈ T iff ¬s ∈ T or t ∈ T.

For the direction "⇐" suppose first that ¬s ∈ T. Then, by 3.6 and (*1), s → t ∈ T. On the other hand if t ∈ T then s → t ∈ T by Axiom (i) and (*1). For the converse, "⇒", if we have neither ¬s nor t in T then, by (*2), both s and ¬t are in T. Then, by 3.7 and (*1), we have ¬(s → t) ∈ T and so, by consistency of T, s → t ∉ T, as required.

Now define the (purported) valuation v by v(t) = T iff t ∈ T. Since S ⊆ T certainly v |= S so it remains to show that v really is a valuation. First, if v(t) = T then t ∈ T so (consistency of T) ¬t ∉ T so v(¬t) = F. Conversely, if v(t) = F then t ∉ T so ((*2)) ¬t ∈ T so v(¬t) = T. That dealt with the ¬ clause in the definition of valuation. The → clause is direct from (*3) which, in terms of v, becomes v(s → t) = T iff v(¬s) = T or v(t) = T, that is (by what we just showed), iff v(s) = F or v(t) = T, as required. □

Theorem 3.12 (Completeness Theorem for Propositional Logic, version 2) Let S be a set of propositional terms and let t be a propositional term. Then S ⊢ t iff S |= t.

Proof. The direction "⇒" is the Soundness Theorem. For the converse, suppose that S ⊬ t. Then, by Axiom (iii) and modus ponens, S ⊬ ¬¬t. It then follows from 3.9 that S ∪ {¬t} is consistent so, by the first version of the Completeness Theorem, there is a valuation v such that v |= S and v |= ¬t so certainly we cannot have v |= t. Therefore S ⊭ t, as required. □


Theorem 3.13 (Compactness Theorem for Propositional Logic, version 1) Let S be a set of propositional terms. There is a valuation v such that v |= S iff for every finite subset S′ of S there is a valuation v′ with v′ |= S′.

Proof. One direction is immediate: if v |= S then certainly v |= S′ for any (finite) subset S′ of S. For the converse suppose, for a contradiction, that there is no v with v |= S. Then, by the Completeness Theorem (version 1), S is inconsistent. Choose any term s. Then, by definition of inconsistent, S ⊢ ¬(s → s). So, by 3.10, there is a finite subset, S′, of S with S′ ⊢ ¬(s → s). By 3.8, S′ is inconsistent. So by Soundness there is no valuation v′ with v′ |= S′, as required. □

Theorem 3.14 (Compactness Theorem for Propositional Logic, version 2) Let S be a set of propositional terms and let t be a propositional term. Then S |= t iff there is some finite subset S′ of S such that S′ |= t.

Proof. Exercise. □

Theorem 3.15 (Zorn's Lemma) Suppose that (P, ≤) is a partially ordered set such that every chain has an upper bound, that is, if {a_i}_{i∈I} ⊆ P is totally ordered (for all i, j either a_i ≤ a_j or a_j ≤ a_i) then there is some a ∈ P with a ≥ a_i for all i ∈ I. Then there is at least one maximal element in P (i.e. an element with nothing in P strictly above it).

This is a consequence of, in fact is equivalent to, the Axiom of Choice from set theory.


4 Introduction to predicate logic: languages and structures

In this section we present the main definitions and ideas. When we come to discuss the satisfaction relation (between structures and sentences) a difficulty will appear, one solution to which, in this section, we will outline but which will be dealt with in detail, and by another route, later. In this way you will, I hope, have a clear idea of the issues before having to deal with the technicalities.

In the case of propositional logic there was essentially just one language: in the case of predicate logic there are many, in the sense that when defining any such language one has to make a choice from certain possible ingredients. There is, however, a basic language which contains none of these extra ingredients and we discuss that first. Actually even for the basic language there is a choice: whether or not to include a symbol for equality. The choice between inclusion or exclusion of equality rather depends on the types of application one has in mind. I find it natural to include equality so I will give the definitions for that case but I will add comments pointing out what differences there would be had we not included a symbol "=".

4.1 The basic language

The basic (first-order, finitary, with equality) language L_0 has the following:
(i) all the propositional connectives ∧, ∨, ¬, →, ↔
(ii) countably many variables x, y, u, v, v_0, v_1, ...
(iii) the existential quantifier ∃
(iv) the universal quantifier ∀
(v) a symbol for equality =

Then we go on to define "terms" and "formulas". Both of these, in different ways, generalise the notion of "propositional term" so remember that the word "term" in predicate logic has a different meaning from that in propositional logic.

Formulas and free variables A term of L_0 is nothing other than a variable (you'll see what terms really are when we discuss languages with constant or function symbols). The free variable of such a term x (say) is just the variable, x, itself: fv(x) = {x}.

An atomic formula of L_0 is an expression of the form s = t where s and t are terms. The set of free variables of the atomic formula s = t is given by fv(s = t) = fv(s) ∪ fv(t).

(Without a symbol for equality there are no atomic formulas for this language and hence, see below, no formulas at all. That is, for languages without equality there is no "basic language".)


The following clauses define what it means to be a formula of L_0 (and, alongside, we define what are the free variables of any formula):

(0) every atomic formula is a formula;
(i) if φ is a formula then so is ¬φ, fv(¬φ) = fv(φ);
(ii) if φ and ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ, and fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ);
(iii) if φ is a formula and x is any variable then ∃xφ and ∀xφ are formulas, and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}.
((iv) plus the usual "that's it" clause but from now on you should take these as implicitly there)

A sentence is a formula σ with no free variables (i.e. fv(σ) = ∅).

Just as with propositional logic we do not need all the above, because we may define some symbols in terms of the others. For instance, ∧ and ¬, alternatively → and ¬, suffice for the propositional connectives. Also each of the quantifiers may be defined in terms of the other using negation: ∀xφ is logically equivalent to ¬∃x¬φ (and ∃x is equivalent to ¬∀x¬) so we may (and in inductive proofs will) drop reference to ∀ in the last clause of the definition.

We also remark that we follow natural usage in writing, for instance, x ≠ y rather than ¬(x = y).

If φ is a formula then it is so by virtue of the above definition, so it has a "construction tree" and we refer to any formula occurring in this tree as a subformula of φ. We also use this term to refer to a corresponding substring of φ. Remember that any formula is literally a string of symbols (usually we mean in the abstract rather than a particular physical realisation) and so we can also refer to an occurrence of a particular (abstract) symbol in a formula.

As well as defining the set of free variables of a formula we need to define the notion of free occurrence of a variable. To do that, if x is a variable then:

(i) every occurrence of x in any atomic formula is free;
(ii) the free occurrences of x in ¬φ are just the free occurrences of x in its subformula φ;
(iii) the free occurrences of x in φ ∧ ψ are just the free occurrences of x in φ together with the free occurrences of x in ψ;
(iv) there are no free occurrences of x in ∃xφ.

In a formula of the form Qxφ we refer to φ as the scope of the quantifier Q (∃ or ∀). Any occurrence of x in Qxφ which is a free occurrence of x in φ (the latter regarded as a subformula of Qxφ) is said to be bound by that initial occurrence of the quantifier Qx. So a quantifier Qx binds the free occurrences of x within its scope.

A comment on use of variables when you are constructing formulas. Note that bound variables are "dummy variables": the formulas ∃x f(x) = y and ∃z f(z) = y are, intuitively, equivalent. A formula with nested occurrences of the same variable being bound can be confusing to read: ∃x(∀x(f(x) = x) → f(x) = x) could be written less confusingly as ∃x(∀y(f(y) = y) → f(x) = x). Of course these are not the same formula but one can prove that they are logically equivalent and the second is preferable.

Another informal notation that we will sometimes use is to "collapse repeated quantifiers", for example to write ∀x, y (x = y → y = x) instead of ∀x∀y(x = y → y = x). Sometimes the abbreviations ∃!, ∃≤n, ∃=n are useful.
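For instance (a standard unfolding which is not spelled out at this point in the notes), ∃!xφ, "there is exactly one x such that φ", can be read as an abbreviation for

∃x(φ ∧ ∀y(φ[y/x] → y = x))

where y is a variable not occurring in φ and φ[y/x] denotes the result of substituting y for the free occurrences of x in φ (substitution is treated carefully in Section 6.1). The abbreviations ∃≤n and ∃=n ("there are at most n", "there are exactly n") can be unfolded similarly using = and the propositional connectives.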

4.2 Enriching the language

The language L_0 described above has little expressive power: using it, we can say various things about the notion of equality but the following list just about exhausts the kinds of things that can be said.

∀x(x = x);
∀x∀y(x = y → y = x);
∀x∀y∀z(x = y ∧ y = z → x = z);
∃x∃y∃z(x ≠ y ∧ y ≠ z ∧ x ≠ z ∧ ∀w(w = x ∨ w = y ∨ w = z));
∃x(x ≠ x).

If, for instance, we wish to deal with groups in the context of a formal language, then we had better have some way of referring, in the language, to the "multiplication" of the group. So we should enrich the language by adding to it a "binary operation symbol". Or if we wish to deal with posets then we should have a "binary relation symbol" with which to refer to the partial order.

Precisely what we should add to the language L_0 depends on the type of structures whose properties we wish to capture within our formal language. We therefore suppose that we have, at our disposal, the following kinds of symbols with which we may enrich the language:

n-ary function symbols such as f (= f(x_1, . . . , x_n)); (since an operation is simply a function regarded in a slightly different way, we don't need to introduce operation symbols as well as function symbols, but we do use "operation notation" where appropriate, writing, for instance, x + y rather than +(x, y))

n-ary relation symbols such as R (= R(x_1, . . . , x_n)) (1-ary relation symbols, such as P (= P(x)), are also termed (1-ary) predicate symbols);

constant symbols such as c.

In fact constant symbols can be regarded as 0-ary function symbols. Indeed, any n-ary function symbol can be replaced by an (n + 1)-ary relation symbol (together with an axiom saying that the relation is "functional"). So, strictly speaking, it is not necessary to introduce anything other than relation symbols. (However, as we will see, the notion of "substructure" is dependent on such choices of language.)


Formulas of an enriched language Suppose that L is the language L_0 enriched by as many function, relation and constant symbols as we require (the signature of L is a term used when referring to these extra symbols). Exactly what is in L will depend on our purpose: in particular, L need not have function and relation and constant symbols, although I will, for the sake of a uniform treatment, write as if all kinds are represented. If S is the set of "extra" symbols we have added then we will write L = L_0 ∨ S. (It is notationally convenient to regard L as being, formally, the set of all formulas of L, so then, writing, for example, φ ∈ L makes literal sense. Thus the "∨" should be understood as some sort of "join", not union of sets.)

The terms of L, and their free variables, are defined inductively by:
(i) each variable x is a term, fv(x) = {x};
(ii) each constant symbol c is a term, fv(c) = ∅;
(iii) if f is an n-ary function symbol and if t_1, . . . , t_n are terms, then f(t_1, . . . , t_n) is a term, fv(f(t_1, . . . , t_n)) = fv(t_1) ∪ · · · ∪ fv(t_n).

The atomic formulas of L (and their free variables) are defined as follows:
(i) if s, t are terms then s = t is an atomic formula, fv(s = t) = fv(s) ∪ fv(t);
(ii) if R is an n-ary relation symbol and if t_1, . . . , t_n are terms, then R(t_1, . . . , t_n) is an atomic formula, fv(R(t_1, . . . , t_n)) = fv(t_1) ∪ · · · ∪ fv(t_n).

The formulas of L (and their free variables) are defined as follows:
(0) every atomic formula is a formula;
(i) if φ is a formula then so is ¬φ, fv(¬φ) = fv(φ);
(ii) if φ and ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ, and fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ);
(iii) if φ is a formula and x is any variable then ∃xφ and ∀xφ are formulas, and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}.

A sentence of L is a formula σ of L with no free variables (i.e. fv(σ) = ∅).

Since formulas were constructed by induction we prove things about them by induction ("on complexity") and, just as in the case of propositional terms, the issue of unique readability raises its head. Such inductive proofs will be valid only provided we know that there is basically just one way to construct any given formula (for two routes would give two paths through the induction and hence, conceivably, different answers). Unique readability does hold for formulas, and also (we defined terms inductively so we follow the construction to prove things about them) for terms as well! Both proofs are done by induction (on complexity) and are not difficult: the proof for terms is pretty obvious; that for formulas looks at the first symbol after the opening "(" and then, if necessary, looks for the point within a formula where the number of "(" minus the number of ")" is equal to 1 (in order to make this work, at each inductive stage of formula construction one should put parentheses round the result).
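As an illustration (my sketch, not part of the notes), terms and formulas of an enriched language, together with their free-variable sets, can be represented and computed by direct recursion on the clauses above.

# Terms:    ("var", "x"), ("const", "c"), ("func", "f", t1, ..., tn)
# Formulas: ("eq", s, t), ("rel", "R", t1, ..., tn), ("not", A), ("and", A, B),
#           ("or", A, B), ("imp", A, B), ("iff", A, B), ("exists", "x", A), ("forall", "x", A)

def fv_term(t):
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "const":
        return set()
    if kind == "func":
        return set().union(*(fv_term(arg) for arg in t[2:]))
    raise ValueError("not a term")

def fv(phi):
    kind = phi[0]
    if kind == "eq":
        return fv_term(phi[1]) | fv_term(phi[2])
    if kind == "rel":
        return set().union(*(fv_term(arg) for arg in phi[2:]))
    if kind == "not":
        return fv(phi[1])
    if kind in ("and", "or", "imp", "iff"):
        return fv(phi[1]) | fv(phi[2])
    if kind in ("exists", "forall"):
        return fv(phi[2]) - {phi[1]}
    raise ValueError("not a formula")

x, y = ("var", "x"), ("var", "y")
print(fv(("exists", "y", ("rel", "R", x, y))))                    # {'x'}: a formula, not a sentence
print(fv(("forall", "x", ("exists", "y", ("rel", "R", x, y)))))   # set(): a sentence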


4.3 L-structures

Suppose that L is a language of the sort discussed above.

Formulas and sentences do not take on meaning until they are interpreted in a particular structure. Roughly, having fixed a language, a structure for that language provides: a set for the variables to range over (so, if M is the set, then "∀x" will mean "for all x in M"); an element of that set for each constant symbol to name (so each constant symbol c of the language will name a particular, fixed element of the structure); for each function symbol of the language an actual function (of the correct arity) on that set; for each relation symbol of the language an actual relation (of the correct arity) on that set. Here's the precise definition.

An L-structure M (or structure for the language L) is a non-empty set M, called the domain or underlying set of M, we write M = |M|, together with an interpretation in M of each of the function, relation and constant symbols of L. By an interpretation of one of these symbols we mean the following (and we also insist that the symbol "=" for equality be interpreted as actual equality between elements of M):

(i) if f is an n-ary function symbol, then the interpretation of f in M, which is denoted f^M, must be a function from M^n to M;
(ii) if R is an n-ary relation symbol, then the interpretation of R in M, which is denoted R^M, must be a subset of M^n (in particular, the interpretation of a 1-ary predicate symbol is a subset of M);
(iii) if c is a constant symbol, then the interpretation of c in M, which is denoted c^M, must be an element of M.

If no confusion should arise from doing so, the superscript "M" may be dropped (thus the same symbol "f" is used for the function symbol and for the particular interpretation of this symbol in a given L-structure) but in many situations, especially in proofs, we will retain the notational distinction.
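As a sketch of what an L-structure amounts to in code (illustrative; the notes contain no such code), a structure is just a domain together with dictionaries interpreting the function, relation and constant symbols; the example is a structure for a language with a single binary relation symbol R (the kind discussed in the next subsection), interpreted as a three-element digraph.

class Structure:
    """An L-structure: a non-empty domain plus interpretations of the symbols of L."""
    def __init__(self, domain, functions=None, relations=None, constants=None):
        self.domain = set(domain)          # the underlying set M
        self.functions = functions or {}   # f -> an n-ary function on M
        self.relations = relations or {}   # R -> a subset of M^n, stored as a set of tuples
        self.constants = constants or {}   # c -> an element of M

# The language L_0 with one extra binary relation symbol R: a three-element digraph.
M = Structure(domain={0, 1, 2},
              relations={"R": {(0, 1), (1, 2), (2, 0)}})
print((0, 1) in M.relations["R"])   # True: an arrow from vertex 0 to vertex 1
print((1, 0) in M.relations["R"])   # False: no arrow from 1 back to 0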

4.4 Basic examples

The basic language An L_0-structure is simply a set, so L_0-structures have limited value as illustrations of definitions and results.

In lectures we will give a variety of examples but those below all use the language L which contains just one extra binary relation symbol R.

Directed graphs An L = L_0 ∨ {R(−, −)}-structure M consists of a set M together with an interpretation of the binary relation symbol R as a particular subset, R^M, of M × M. That is, an L-structure consists of a set together with a specified binary relation on that set.


Given such a structure, its directed graph, or digraph for short, has for its vertices the elements of M and has an arrow going from vertex a to vertex b iff (a, b) ∈ R^M. This gives an often useful graphical way of picturing or even defining a relation R^M (note that the digraph of a relation specifies the relation completely).

Certain types of binary relation are of particular importance in that they occur frequently in mathematics (and elsewhere).

Posets A partially ordered set (poset for short) consists of a set P and a binary relation on it, usually written ≤, which satisfies:

for all a ∈ P, a ≤ a (≤ is reflexive);
for all a, b, c ∈ P, a ≤ b and b ≤ c implies a ≤ c (≤ is transitive);
for all a, b ∈ P, if a ≤ b and b ≤ a then a = b (≤ is weakly antisymmetric).

The Hasse diagram of a poset is a diagrammatic means of representing a poset. It is obtained by connecting a point on the plane representing an element a of the poset to each of its immediate successors (if there are any) by a line which goes upwards from that point. We say that b is an immediate successor of a if a < b (i.e. a ≤ b and a ≠ b) and if a ≤ c ≤ b implies a = c or c = b; we also then say that a is an immediate predecessor of b.

Equivalence relations An equivalence relation, ≡, on a set X is a binary relation which satisfies:

for all a ∈ X, a ≡ a (≡ is reflexive);
for all a, b ∈ X, a ≡ b implies b ≡ a (≡ is symmetric);
for all a, b, c ∈ X, a ≡ b and b ≡ c implies a ≡ c (≡ is transitive).

The (≡-)equivalence class of an element a ∈ X is denoted [a]≡, a/≡ or similar, and is {b ∈ X : b ≡ a}. The key point is that equivalence classes are equal or disjoint: if a, b ∈ X then either [a] = [b] or [a] ∩ [b] = ∅. Thus the distinct ≡-equivalence classes partition X into disjoint subsets.
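As a small illustration of that last point (not from the notes), the following sketch computes the equivalence classes of a relation given as a set of pairs and checks that the distinct classes do partition X.

def equivalence_classes(X, pairs):
    """The distinct classes [a] = {b in X : (b, a) in pairs}."""
    classes = []
    for a in X:
        cls = frozenset(b for b in X if (b, a) in pairs)
        if cls not in classes:
            classes.append(cls)
    return classes

# Congruence modulo 3 on X = {0, ..., 8}.
X = set(range(9))
pairs = {(a, b) for a in X for b in X if a % 3 == b % 3}
classes = equivalence_classes(X, pairs)
print(sorted(sorted(c) for c in classes))   # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
assert set().union(*classes) == X                                     # the classes cover X
assert all(c == d or not (c & d) for c in classes for d in classes)   # and are pairwise disjoint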

4.5 Interpretation and formulas: the problem

Suppose that σ is a sentence of the language L and that M is an L-structure. Then σ may be "interpreted" in M and, so interpreted, it will make a definite assertion "about M" which will be either true or false.

For example if L_0 is the basic language (with equality) and M is a set with two elements, regarded as an L_0-structure, then the L_0-sentence ∃x∃y ∀z((z = x) ∨ (z = y)) has an obvious "meaning in M" and hence a truth value (clearly true in this case).

We will give a precise definition of what we mean by "interpreting σ in M" but, in order to do so, we must consider the interpretation of general formulas in M. The reason for this is that we will define the notion of interpretation by induction on the complexity of a formula and the inductive construction of a sentence goes via formulas which may (and generally will) have free variables.


But then we have the following problem: a formula, when interpreted in a structure, need not have a truth value. For instance, the formula ∃yR(x, y) has free variable x and its interpretation in M asserts the existence of an element a ∈ M (say) such that x is R-related to a. This "assertion" does not have a truth value, since its truth or falsity depends on the value we assign to x: in general, for some values of x the "assertion" will be true and for some it will be false. What we are saying, therefore, is that the truth or falsity of ∃yR(x, y) cannot be measured unless we substitute for the free variable x a particular element of M. But the result of substituting elements of a structure into formulas of a language is certainly not a formula of the language. The process certainly makes intuitive sense: how are we to formalise it mathematically? In the next section we present one solution to this problem.

4.6 A solution: valuations

The aim here is to give meaning to the expression M |= φ where M is an L-structure and φ is a sentence of L. In order to reduce the number of cases which have to be treated in definitions and proofs we take ¬ and → to be the basic propositional connectives and treat the other propositional connectives as being defined in terms of these. Also we regard ∀ as being a shorthand for ¬∃¬.

Suppose that L is a predicate language. A valuation, (M, v), for L consists of an L-structure M together with a function v from the set of variables of L to the underlying set, M, of M. So to give a valuation is to give a structure for the language together with a value, in that structure, for each variable of the language. What we do is to define, by induction on complexity, the notion of a valuation satisfying a formula: given a formula φ and valuation (M, v) we define the notion written as M |=_v φ (various notations are in use) and read as "the valuation (M, v) satisfies φ" or "φ is true in M under the valuation v" (etc.). We also tend to write x^v rather than v(x) when this fits better with other pieces of notation.

Of course the basic idea is that under a particular valuation a value is assigned to each variable and so even formulas with free variables are assigned a definite truth value under any given valuation. We will introduce an alternative, more intuitive, notation in the section after the next.

Before we can give the inductive definition we have to show how a valuation on variables, that is, a function from the set of variables to a set M, extends to a function from the set of all terms of L to M. That also is by induction and is something with which you are already familiar in specific cases.

For instance, consider an expression (x + 1) × y. If we assign values to the variables x and y then, of course, we can extend this assignment to give a value to (x + 1) × y: first we note that the constant symbol 1 already has been assigned a value, 1^M; also the symbol "+" has been mapped to an actual binary function, +^M, on M and so x + 1 can be mapped to x^v +^M 1^M; furthermore, × has been mapped to an actual binary function, ×^M, on M and so (x + 1) × y can be mapped to (x^v +^M 1^M) ×^M y^v ∈ M.

We use the same method in the general case. For convenience we will use the same notation, v, for the map on variables and for the extension of this map to arbitrary terms. Precisely, given a valuation (M, v) we extend v to terms as follows:

(i) if c is a constant symbol of L set v(c) = c^M;
(ii) if f is an n-ary function symbol of L and t1, . . . , tn are terms of L then set v(f(t1, . . . , tn)) = f^M(v(t1), . . . , v(tn)).

That this does give a well-defined map follows from unique readability for terms.
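As an illustration of this inductive extension (again an addition to the notes, with the encoding of terms and all names invented for the example): terms are represented as nested tuples, a structure supplies the interpretations c^M and f^M, and a valuation is a dictionary from variable names to elements.

    # Terms: a variable is a string such as 'x'; a constant symbol is
    # ('const', name); a compound term is ('app', f_name, (t1, ..., tn)).
    def eval_term(t, struct, v):
        """Extend the valuation v (variables -> elements) to all terms."""
        if isinstance(t, str):                     # a variable: use v
            return v[t]
        if t[0] == 'const':                        # c is sent to c^M
            return struct['consts'][t[1]]
        if t[0] == 'app':                          # f(t1,...,tn) goes to f^M(v(t1),...,v(tn))
            f = struct['funcs'][t[1]]
            return f(*(eval_term(s, struct, v) for s in t[2]))
        raise ValueError('not a term: %r' % (t,))

    # Example: the term (x + 1) * y evaluated in the integers with v(x) = 4, v(y) = 5.
    Z = {'consts': {'1': 1},
         'funcs': {'+': lambda a, b: a + b, '*': lambda a, b: a * b}}
    term = ('app', '*', (('app', '+', ('x', ('const', '1'))), 'y'))
    print(eval_term(term, Z, {'x': 4, 'y': 5}))    # 25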

We need one more piece of notation: if (M, v) is a valuation, x is a variable and a ∈ M then we denote by v^x_a the function from the set of variables to M defined by v^x_a(y) = v(y) if y ≠ x and v^x_a(x) = a. So this is the function which is just like v except that the value at x is (possibly) changed so as to send x to a.

4.7 The satisfaction relation

Now we can define the satisfaction relation M |=_v φ, read as "(M, v) satisfies φ" or "M satisfies φ under v", between valuations and formulas.

(i) M |=_v t1 = t2 iff v(t1) = v(t2)
(ii) M |=_v R(t1, . . . , tn) iff R^M(v(t1), . . . , v(tn)) holds
(iii) M |=_v ¬φ iff M ⊭_v φ
(iv) M |=_v φ → ψ iff M ⊭_v φ or M |=_v ψ
(v) M |=_v ∃xφ iff for some a ∈ M we have M |=_{v^x_a} φ

Lemma 4.1 With notation as above, M |=_v ∀xφ iff for every a ∈ M we have M |=_{v^x_a} φ.

Proof. By definition M |=_v ∀xφ, that is M |=_v ¬∃x¬φ, iff M ⊭_v ∃x¬φ, which is the case iff for no a ∈ M is it true that M |=_{v^x_a} ¬φ which, in turn, is the case iff for no a ∈ M is it true that M ⊭_{v^x_a} φ and that is the case iff for every a ∈ M we have M |=_{v^x_a} φ. □
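For a structure with a finite underlying set the clauses (i)-(v), together with Lemma 4.1, can be executed directly. The sketch below is an addition to the notes; it reuses the hypothetical eval_term function and term encoding of the earlier sketch and adds a tuple encoding of formulas.

    # Formulas: ('eq', t1, t2), ('rel', R_name, (t1, ..., tn)),
    # ('not', phi), ('imp', phi, psi), ('exists', x, phi).
    def satisfies(struct, v, phi):
        """Decide M |=_v phi, for a structure whose 'domain' is a finite list."""
        tag = phi[0]
        if tag == 'eq':
            return eval_term(phi[1], struct, v) == eval_term(phi[2], struct, v)
        if tag == 'rel':
            args = tuple(eval_term(t, struct, v) for t in phi[2])
            return args in struct['rels'][phi[1]]
        if tag == 'not':
            return not satisfies(struct, v, phi[1])
        if tag == 'imp':
            return (not satisfies(struct, v, phi[1])) or satisfies(struct, v, phi[2])
        if tag == 'exists':                        # clause (v): try each a in M as the value of x
            x, body = phi[1], phi[2]
            return any(satisfies(struct, {**v, x: a}, body) for a in struct['domain'])
        raise ValueError('not a formula: %r' % (phi,))

    # Example: in the 2-element structure with R^M = {(0, 1)}, the formula
    # ∃y R(x, y) is satisfied under a valuation sending x to 0 but not to 1.
    M = {'domain': [0, 1], 'consts': {}, 'funcs': {}, 'rels': {'R': {(0, 1)}}}
    print(satisfies(M, {'x': 0}, ('exists', 'y', ('rel', 'R', ('x', 'y')))))   # True
    print(satisfies(M, {'x': 1}, ('exists', 'y', ('rel', 'R', ('x', 'y')))))   # False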

There is a certain extravagance in assigning values to every variable when we are really only interested in the values assigned to the (finitely many) free variables of a formula φ. It is intuitively clear that the values assigned to the variables which do not appear anywhere in φ have no bearing on whether or not M |=_v φ. Only slightly less obvious is the fact that the values assigned to variables which appear in φ only bound by quantifiers have no effect on the truth of M |=_v φ. That these statements are indeed true is the content of the Dependency Theorem.


Theorem 4.2 (Dependency Theorem) Suppose that M is an L-structure and that φ is a formula of L. Let v and w be functions from the set of variables of L to M such that v(x) = w(x) for every variable x which occurs free in φ. Then M |=_v φ iff M |=_w φ.

Proof. The proof is, of course, by induction on complexity of φ. Also, in order to deal with the "∃" case, we have to allow v and w to change so the induction hypothesis is: "given M, L and φ as in the statement, for all functions v, w such that v(x) = w(x) for every variable which occurs free in φ we have M |=_v φ iff M |=_w φ" and this is what is proved by induction on φ.

First we need a similar statement about terms which is that, if t is any term and if v(x) = w(x) for every variable x which occurs in t, then t^v = t^w. This is proved by induction on complexity of t: the base cases, where t is a variable or a constant symbol, are immediate and, for the inductive step, if t is f(t1, . . . , tn) and, inductively, v(t1) = w(t1), . . . , v(tn) = w(tn) then v(f(t1, . . . , tn)) = f^M(v(t1), . . . , v(tn)) (by definition) = f^M(w(t1), . . . , w(tn)) (by induction) = w(f(t1, . . . , tn)).

Next, if φ is atomic, either t1 = t2 or R(t1, . . . , tn), then, in the first case, we have M |=_v t1 = t2 iff v(t1) = v(t2) (by definition) iff w(t1) = w(t2) (what was just proved about terms) iff M |=_w t1 = t2 (definition again), and similarly for the other case.

If φ has the form ¬ψ then we have M |=_v φ, that is, M |=_v ¬ψ, iff M ⊭_v ψ (definition) iff M ⊭_w ψ (inductive hypothesis) iff M |=_w ¬ψ (by definition), that is, iff M |=_w φ. (Actually we used, without comment, the obvious fact that if v and w agree on the free variables of φ = ¬ψ then they agree on the free variables of ψ.)

If φ has the form φ′ → ψ′ then we have M |=_v φ iff M ⊭_v φ′ or M |=_v ψ′ (by definition) iff M ⊭_w φ′ or M |=_w ψ′ (induction hypothesis) iff M |=_w φ (definition) (here we used that the free variables of φ′ and ψ′ are among those of φ).

Finally, if φ has the form ∃xψ then we have M |=_v φ iff there is a ∈ M with M |=_{v^x_a} ψ (definition) iff there is a ∈ M with M |=_{w^x_a} ψ (we have fv(ψ) ⊆ fv(φ) ∪ {x} and, by assumption, v and w agree on all free variables of φ, so the same is true of v^x_a and w^x_a, and these also agree on x (because we forced it), so this follows by induction) iff M |=_w ∃xψ, that is, iff M |=_w φ, as required. □

In the proof of the Completeness Theorem we will use a stronger form of this result which deals with the case where there are two languages (typically L′ will be an enrichment of L, that is, will contain more symbols than L). The proof of the stronger form is, apart from obvious changes, just like that of 4.2.

Proposition 4.3 (Dependency theorem, stronger form) Suppose that (M, v) is a valuation for the language L, that (N, w) is a valuation for the language L′ and that φ is a formula of both L and L′. Suppose that the underlying sets, M and N, are identical. Suppose also that for every constant symbol c which occurs in φ (hence is a symbol of both languages) c^M = c^N (this makes sense since M = N) and similarly for every function symbol and every relation symbol which occurs in φ. Suppose, furthermore, that v(x) = w(x) for every variable x which occurs free in φ. (In short, (M, v) and (N, w) agree on φ.) Then M |=_v φ iff N |=_w φ.

As a corollary of the Dependency theorem we have the notion of a structure satisfying a sentence, which is what we were aiming for.

Corollary 4.4 Let σ be a sentence of the language L and let M be an L-structure. Then either, for every valuation (M, v), we have M |=_v σ, or, for every valuation (M, v), we have M ⊭_v σ. In the former case we write M |= σ and in the latter M ⊭ σ.

Proof. Since a sentence has no free variables it follows, by 4.2, that for any valuations (M, v) and (M, w) we have M |=_v σ iff M |=_w σ. □

Notice that an immediate corollary is that M ⊭ σ iff M |= ¬σ.

We say that a formula φ of the language L is satisfiable if there is some valuation (M, v) of L with M |=_v φ. We say that φ is logically valid or a tautology if for every valuation (M, v) we have M |=_v φ. For example, ∀x(x = x) is a tautology, as is x = x. A contradiction is a formula which is not satisfiable (for example ∀x(x ≠ x)). All the terminology and notation is extended to sets of formulas in the obvious way, for example, if T is a set of formulas then we write M |=_v T iff M |=_v φ for every formula φ in T.

If T is a set of formulas and φ is a formula we write T |= φ if, for every valuation (M, v), if M |=_v T then M |=_v φ, in which case we say that T entails or implies φ.

Lemma 4.5 If T is a set of formulas and φ is a formula then T |= φ iff T ∪ {¬φ} is not satisfiable.

Proof. (⇐) Suppose that T ∪ {¬φ} is not satisfiable and let (M, v) be any valuation. If M |=_v T then M |=_v φ since, otherwise, we would have M |=_v ¬φ and hence (M, v) would satisfy T ∪ {¬φ}, contradiction.

(⇒) Suppose, for the converse, that T ∪ {¬φ} is satisfiable, say (M, v) is such that M |=_v T and M |=_v ¬φ, that is M ⊭_v φ, so the condition for T |= φ fails. □

We use the notation T, φ as shorthand for T ∪ {φ}.

Proposition 4.6 (a) If T is a set of formulas, φ is a formula and the variable x does not occur free in any formula of T then from T |= φ we may deduce T |= ∀xφ.

(b) Let T0 ⊆ T be sets of formulas, let φ and ψ be formulas and suppose that the variable x does not occur free in ψ or in any formula of T0. Then from T |= ∃xφ and T0, φ |= ψ we may deduce T |= ψ.


Proof. (a) Let (M, v) be any valuation such that M |=_v T; we must show that M |=_v ∀xφ. We check the condition of 4.1. So let a ∈ M and consider the valuation (M, v^x_a): this valuation agrees with (M, v) on all variables occurring free in any formula of T (since, by assumption, x is not free in any formula of T) so, by 4.2, M |=_{v^x_a} T and hence, since T |= φ, M |=_{v^x_a} φ. So the condition of 4.1 is satisfied and we have M |=_v ∀xφ, as required.

(b) Let (M, v) be such that M |=_v T: we must show that M |=_v ψ. From our assumption that T |= ∃xφ we have that there is a ∈ M with M |=_{v^x_a} φ. Let θ be any formula in T0. Since x is not free in θ we have that v and v^x_a agree on the free variables of θ. So, since M |=_v θ, the Dependency Theorem, 4.2, gives M |=_{v^x_a} θ. Thus M |=_{v^x_a} T0. Therefore M |=_{v^x_a} T0, φ so, since T0, φ |= ψ, M |=_{v^x_a} ψ. We also assumed that x does not occur free in ψ so, by another application of the Dependency Theorem, we deduce M |=_v ψ. □

4.8 Formulas with parameters

Suppose that φ is an L-formula with one free variable x. It is very natural to think of φ as a condition on elements of L-structures in the sense that if M is an L-structure and a ∈ M then writing M |= φ(a) means that the element a satisfies the condition φ(x) in M. There are a couple of ways of making a sensible definition along these lines. One approach is to add one or more new constant symbols to the language in order to "name" elements of M. The other is to make use of the notion of valuation that we have already developed: we will take that approach (whichever approach we use we do reach the same goal of a proper definition of M |= φ(a)).

So suppose that φ(x1, . . . , xn) is a formula of the language L and has free variables among x1, . . . , xn (it is technically convenient not to insist that each of x1, . . . , xn actually occurs (free) in φ). Let M be an L-structure and let a1, . . . , an be elements of M (they need not be distinct). We write M |= φ(a1, . . . , an) if for some (equivalently, by the Dependency Theorem, for all) valuations v with v(x1) = a1, . . . , v(xn) = an we have M |=_v φ. If we want to make the notation show the variables then we write φ(a1, . . . , an) as φ(x1/a1, . . . , xn/an). Informally, one may use the notation φ(a1, . . . , an) on its own: this is not formally meaningful (though it can be made so) but is intuitively appealing. Think of φ(a1, . . . , an) as being a "formula with parameters" or a formula where variables have been substituted by elements of a structure (for "x1/a1" read "for x1 substitute a1" or "x1 replaced by a1").

Also, if we wish to replace only some of the free variables then we can use notation such as φ(a1, x2, x3, a4) (x1 and x4 have been substituted for, but x2 and x3 have not). Any such expression we refer to as a formula with parameters; it is not a formula of L but it can be regarded as a formula of the language L extended by adding constants which are then used to "name" the elements appearing. A familiar example is that of polynomials in possibly more than one variable: in this context one may substitute some or all of the variables by elements and one may regard the result as a polynomial with parameters.

It is important to remember that it is only the free occurrences of variables that are replaced. For instance if φ(x) is the formula x + 1 = 0 ∧ ∀x(x = x) then it is only the first occurrence of x that is free, so the formula φ(−1) is −1 + 1 = 0 ∧ ∀x(x = x).

Note that, as a consequence of the definitions, if a2, . . . , an ∈ M, then M |= ∃x1 φ(x1, a2, . . . , an) iff there is some element a1 of M such that M |= φ(a1, a2, . . . , an). Also, since "∀ = ¬∃¬", we have M |= ∀x1 φ(x1, a2, . . . , an) iff, for every a1 ∈ M, we have M |= φ(a1, a2, . . . , an).
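For instance (an added illustration, with the language and structure chosen just for the example): let L be L0 together with a binary function symbol ·, let M be the structure (N, ·) and let φ(x1) be the formula ∃x2(x1 = x2 · x2). Then M |= φ(4), since for any valuation v with v(x1) = 4 we have M |=_{v^{x2}_2} x1 = x2 · x2 and hence M |=_v ∃x2(x1 = x2 · x2); on the other hand M |= ¬φ(3), since there is no a ∈ N with 3 = a · a.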

5 Theories and models

5.1 Theories and elementary equivalence

Suppose that L is some language of the sort that we have been considering. A theory in L, or L-theory, is a set T of sentences of L. A model of the L-theory T is an L-structure M such that M satisfies every sentence in the set T: M |= σ for every σ ∈ T, or just M |= T for short. I will make the requirement that, in order to be allowed as a theory, a set T of sentences must have some model. It is a consequence of the Completeness Theorem for predicate calculus (see 6.21) that this requirement is equivalent to there being no contradiction deducible from T.

If M is an L-structure then the (complete) theory of M is the set of all sentences of L which are true in M: Th(M) = {σ ∈ L : σ is a sentence and M |= σ}. The term "complete" refers to the fact that any theory of this kind has the property that, if σ is any sentence of L, then either σ ∈ Th(M) or ¬σ ∈ Th(M).

A theory T is complete if for any sentence σ either T |= σ or T |= ¬σ.

Two L-structures M and M′ are elementarily equivalent if they satisfy exactly the same sentences of L: we then write M ≡ M′. We have M ≡ M′ iff Th(M) = Th(M′).

If T is a set of sentences of L let Mod(T) denote the collection of all models of T.

We have the following fundamental result (which is an immediate consequence of the Completeness Theorem for predicate logic (6.21) or can be proved directly using an algebraic construction, "ultraproducts").

Theorem 5.1 (Compactness Theorem) Let T be any set of sentences of the language L. Then T has a model iff every finite subset of T has a model.

Corollary 5.2 Let T be a theory which has models of arbitrarily large finite size. Then T has an infinite model.

Proof. Note that there is a sentence σ≥n of the basic language L0 which says "there are at least n elements". By hypothesis every finite subset of T∞ = T ∪ {σ≥n : n ∈ ω} has a model (such a subset mentions only finitely many of the σ≥n, so a model of T of large enough finite size, which exists by hypothesis, satisfies it) hence, by the Compactness Theorem, T∞ has a model, as required. □
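For definiteness (this display is an addition to the notes), one possible choice of σ≥n is

σ≥n :   ∃x1 . . . ∃xn ( ∧_{1≤i<j≤n} ¬(xi = xj) ),

where the finite conjunction is regarded as an abbreviation in terms of ¬ and →; this sentence of the basic language is true in a structure exactly when the underlying set has at least n distinct elements.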

5.2 Isomorphism of L-structures

Definition 5.3 If M and N are L-structures then a homomorphism from M to N is a map α : M −→ N such that:

(i) for every constant symbol c of L, α(c^M) = c^N;
(ii) for every n-ary function symbol f of L and every n-tuple a1, . . . , an of elements of M, α(f^M(a1, . . . , an)) = f^N(α(a1), . . . , α(an));
(iii) for every n-ary relation symbol R of L and every n-tuple a1, . . . , an of elements of M, if R^M(a1, . . . , an) holds then R^N(α(a1), . . . , α(an)) holds.

Such a homomorphism is an isomorphism if it is a bijection and if
(iii)′ for every n-ary relation symbol R of L and every n-tuple a1, . . . , an of elements of M, R^M(a1, . . . , an) holds iff R^N(α(a1), . . . , α(an)) holds.

The notation α : M ≅ N means that α is an isomorphism from M to N. If there is an isomorphism from M to N then we say that M and N are isomorphic. The conclusion of the exercise below is that this is an equivalence relation on L-structures.

The idea is that an isomorphism between structures is a bijection between their underlying sets which preserves all the structure.

This general definition covers the cases such as groups, fields, Boolean algebras, rings, Lie algebras that you may have seen before. In these and other contexts where there are no relation symbols in the language an isomorphism is the same as a bijective homomorphism but if there are relation symbols in the language the notion of isomorphism is stronger than bijective homomorphism: for instance, one can easily construct bijective homomorphisms between posets which are not isomorphisms.
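As a small computational aside (not part of the notes), for finite structures in a language with a single binary relation symbol R one can check these conditions directly. Here a structure is encoded, just for this example, as a dictionary with a 'domain' list and the relation 'R' as a set of pairs.

    # Check the homomorphism conditions and (iii)' for a map alpha (a dict)
    # between finite structures for a language with one binary relation symbol R.
    def is_bijective_homomorphism(alpha, M, N):
        onto = sorted(alpha[a] for a in M['domain']) == sorted(N['domain'])
        injective = len(set(alpha.values())) == len(M['domain'])
        preserves = all((alpha[a], alpha[b]) in N['R'] for (a, b) in M['R'])
        return onto and injective and preserves

    def is_isomorphism(alpha, M, N):
        if not is_bijective_homomorphism(alpha, M, N):
            return False
        # (iii)': R^N(alpha(a), alpha(b)) must also imply R^M(a, b)
        return all((a, b) in M['R']
                   for a in M['domain'] for b in M['domain']
                   if (alpha[a], alpha[b]) in N['R'])

    # A bijective homomorphism of posets which is not an isomorphism:
    M = {'domain': [0, 1], 'R': {(0, 0), (1, 1)}}            # two-element antichain
    N = {'domain': [0, 1], 'R': {(0, 0), (1, 1), (0, 1)}}    # two-element chain
    ident = {0: 0, 1: 1}
    print(is_bijective_homomorphism(ident, M, N))            # True
    print(is_isomorphism(ident, M, N))                       # False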

Exercise 5.4 (i) For every structure M the identity map idM : M −→ M, given by idM(a) = a for every a ∈ M, is an isomorphism from M to itself.
(ii) If α : M −→ M′ and α′ : M′ −→ M′′ are isomorphisms then so is their composition α′α : M −→ M′′ (which is given by α′α(a) = α′(α(a)) for every a ∈ M).
(iii) If α : M −→ M′ is an isomorphism then the inverse α^{-1} : M′ −→ M, defined by α^{-1}(b) = a iff α(a) = b (a ∈ M, b ∈ M′), is an isomorphism.

Proposition 5.5 Suppose that α : N −→ M is an isomorphism. Let φ(x) ∈ L and let a be a (matching) tuple from N. Then N |= φ(a) iff M |= φ(αa).

Proof. The proof is by induction on complexity of formulas but first one has to show that for any term t(x1, . . . , xn) of L and any a1, . . . , an ∈ N we have α(t^N(a1, . . . , an)) = t^M(α(a1), . . . , α(an)) (which, in turn, is shown by induction on the construction of terms).

The details are left as an exercise. □

Corollary 5.6 M ≅ N implies M ≡ N.

Proof. Apply 5.5 with φ an arbitrary sentence. □

Lemma 5.7 If L has only finitely many extra symbols (i.e. L = L0 ∪ S where S is a finite set) then, for each integer n ≥ 1, there are only finitely many L-structures on any set of size n.

Proof. For each function, relation or constant symbol in S there are only finitely many possibilities for its interpretation on the given set. □

Corollary 5.8 If L has only finitely many extra symbols and if T is an L-theory which has no infinite models then T has, up to isomorphism, only finitely many models.

Proof. By 5.2 there is a finite bound on the size of models of T. By 5.7 there are only finitely many, up to isomorphism, of each of the possible sizes (since, given n, all possibilities up to isomorphism can be seen on any chosen set of size n). So in total we have only finitely many possibilities. □

Exercise 5.9 Give an example to show that both the lemma and the corollary are false if L has (a) infinitely many constant symbols or (b) infinitely many relation symbols.

Exercise 5.10 How many L = L0 ∪ {R}-structures are there on a set of size n, where R is a binary relation symbol? How many up to isomorphism? (The second question is to muse over: an answer is not expected!)

Lemma 5.11 If T is complete and has a finite model then T has just one model up to isomorphism.

Proof. Say M |= T has exactly n elements. Then σ=n ∈ T and so every model of T has exactly n elements. We have to show that T = Th(M) specifies everything about M. We do this by taking any N |= T (so card(N) = n) and then we show that one of the n! bijections between M and N must be an isomorphism - otherwise we produce a sentence true in one of M, N but false in the other - contradicting completeness of T. □


5.3 Substructures and elementary substructures

Substructures Let M be an L-structure, N ⊆ M. The L-structure induced on N (if it exists) is given by:

(i) c^N = c^M (so we need c^M ∈ N for all constant symbols c of L);
(ii) f^N(a) = f^M(a) for a in N (so we need f^M(a) ∈ N for every tuple a from N, for all function symbols f of L);
(iii) R^N(a) iff R^M(a) for a in N (no requirement on N).

If this structure is well-defined we write N = M ↾ N and say that N is the substructure of M based on N. We write N ≤ M.

Consider the structure (group) (Z, +, 0). It's easy to check that the set, 2Z, of even integers is the basis for a substructure (the subgroup of even integers) of (Z, +, 0). But notice that if φ(x) is the formula ∃y(x = y + y) then we have Z |= φ(2) but 2Z |= ¬φ(2). Thus, the properties of an element of 2Z may be different depending on whether we regard it as an element of 2Z or of the larger set Z.

If we have a structure M and want to embed it as a subset of a larger structure M′ we may well want the properties of elements of M to be the same whether we think of them as elements of M or of M′. For instance, if we extend the reals R to a structure with infinitesimals then we wouldn't want the properties of the various real numbers to have changed in doing so. The conclusion is that the property of being a substructure is not strong enough for some purposes.

Definition 5.12 Suppose that N and M are L-structures with N a substructure of M. We say that N is an elementary substructure of M (and that M is an elementary extension of N), writing N ≺ M, if: for every L-formula φ(x) and tuple a of elements of N we have N |= φ(a) iff M |= φ(a).

(Convention: we do not usually explicitly state the condition that tuples should match - so in the above, we assume that l(a) = l(x) where "l" denotes the length of a tuple.)

Exercise 5.13 If N ≤ M then N ≺ M iff for every formula φ(x) of L and every (matching) tuple a from N, if N |= φ(a) then M |= φ(a).

Lemma 5.14 (1) N ≺ M implies N ≡ M.
(2) N ≺ M and M ≺ M′ implies N ≺ M′.

Proof. Immediate from the definitions. □

Notice, with reference to the example above, that the groups of integers and of even integers are isomorphic, hence elementarily equivalent, but the latter is not an elementary substructure of the former.


5.4 A little set theory

Definition 5.15 Define a relation on sets by: given sets X, Y the relation card(X) = card(Y) (also written X ∼ Y or as |X| = |Y|) holds if there is a bijection from X to Y, in which case we say that X and Y have the same cardinality (or are equinumerous). It is easy to check that this is an equivalence relation (in fact this is a special case of a previous exercise since the above relation is just L0-isomorphism).

Facts 5.16 N ∼ 2N, N ∼ Z, N ∼ Q. Any set equinumerous with N is said to be countably infinite. The term countable means finite or countably infinite. But R is not equinumerous with N - that is, R is uncountable (by Cantor's diagonal argument).
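To illustrate the second of these facts (this example is an addition to the notes): an explicit bijection witnessing N ∼ Z is

n ↦ n/2 if n is even,   n ↦ −(n + 1)/2 if n is odd,

which sends 0, 1, 2, 3, 4, . . . to 0, −1, 1, −2, 2, . . . ; so N and Z are equinumerous even though N is a proper subset of Z.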

We can think of cardinal numbers as the ∼-equivalence classes and write card(X) or |X| for the cardinality of X. Think of these as the natural numbers 0, 1, 2, ... supplemented by infinite numbers, the smallest of which is ℵ0 ("aleph-zero"), the cardinality of N (so any countably infinite set has cardinality ℵ0). The ordering on cardinal numbers is defined by: card(X) ≤ card(Y) if there is an injection from X to Y. So e.g., 5 < ℵ0 < card(R). It is a theorem (Cantor-Schröder-Bernstein) that card(X) = card(Y) iff card(X) ≤ card(Y) and card(Y) ≤ card(X). It is also a theorem (obtained by generalising Cantor's diagonal argument) that for any set X, card(X) < card(P(X)) where P(X) is the power set of X (the set of all subsets of X).

The cardinal numbers are totally ordered (this needs the Axiom of Choice, as does most of mathematics which deals with abstract infinite sets): given cardinal numbers κ, µ, either κ ≤ µ or µ ≤ κ.

Also every cardinal number κ has a successor κ^+ (so κ < κ^+ and there's nothing in between). Set ℵ1 = ℵ0^+, ℵ2 = ℵ1^+, ... (The question of whether for infinite X we have card(P(X)) = card(X)^+ - the Generalised Continuum Hypothesis - cannot be resolved on the basis of the usual axioms of set theory.)

Cardinal arithmetic Suppose card(X) = κ, card(Y) = µ and suppose X ∩ Y = ∅. Define (one can check that the definitions are independent of the choices of X, Y): κ + µ = card(X ∪ Y), κ × µ = card(X × Y), κ^µ = card(X^Y), where X^Y denotes the set of all functions from Y to X.

If κ is infinite then: κ + κ = κ, κ × κ = κ.

If κ and µ are infinite then κ + µ = κ × µ = max{κ, µ}. In particular, one may deduce that a countable union of countable sets is countable and that the set of all finite sequences from (or finite subsets of) a countable set is countable.

For any κ, 2^κ > κ.

5.5 Downwards Lowenheim-Skolem theorem

First, a lemma.


Lemma 5.17 Suppose that N ≤ M. Then N ≺ M iff for all ψ(x1, . . . , xn, y) ∈ L and a1, . . . , an ∈ N, if M |= ∃yψ(a, y) then there is b ∈ N with M |= ψ(a, b).

Proof. (sketch) The direction "⇒" is direct from the definition of elementary substructure. For the direction "⇐" we show, by induction on the complexity of φ, that for all formulas φ ∈ L and tuples a from N we have N |= φ(a) iff M |= φ(a). □

Theorem 5.18 (Downwards Lowenheim-Skolem) Suppose that M is an infinite L-structure where L is a countable language and that A is a countable subset of M. Then there is a countable elementary substructure N of M with A ⊆ N.

Proof. (outline) We have to produce a subset of M which contains A and which is an elementary substructure of M. Certainly any such subset must be a substructure so we have to include A, any interpretations of constant symbols and then "close under the (L-)functions". Doing that would give us a substructure of M but not necessarily an elementary substructure of M. But 5.17 says that a substructure is an elementary one if it contains "witnesses for existential quantifiers". So we list all formulas φ(x) with parameters from A ∪ {constants} and then, for each φ with M |= ∃xφ(x), we choose some element b ∈ M with M |= φ(b) (just choose one for each such φ). Add in all these "witnesses" b to get a new set A ∪ {constants} ∪ {witnesses}. Now we realise that we have to repeat the process (because we have added new parameters). In fact we have to repeat the process ω-many times but no more (since a formula is a finite object). We end up with a subset of M which satisfies the criterion of 5.17 for being an elementary substructure of M. Cardinal arithmetic and the countability assumptions we made ensure that this is countable (necessarily infinite since it is elementarily equivalent to M). □

Corollary 5.19 Suppose that L is a countable language and that M is an infinite L-structure. Then M has a countable elementary substructure.

Corollary 5.20 Suppose that L is a countable language and T is a consistent L-theory. Then T has a countable model.

5.6 Upwards Lowenheim-Skolem theorem

Theorem 5.21 (Upwards Lowenheim-Skolem) Suppose that M is an infinite L-structure where L is a countable language and suppose that κ is a cardinal with κ ≥ card(M). Then there is an elementary extension M′ of M with card(M′) = κ.

Proof. (outline) The idea is: since M is infinite it is consistent to say that there is an element not equal to any member of M; take an elementary extension which contains such an element. Of course this "adds" perhaps only one extra element, but the procedure can be continued by transfinite induction up to κ.

That’s the basic idea but, when it comes to writing out the procedure care-fully, it turns out to be just as easy to add the κ new elements all at once. So,we first extend the language to include a name for every element of M : replaceL by LM . Then choose a set C of κ new constant symbols. Now just writedown the set T 6= of sentences in LM∪C which says “the elements of C are alldifferent from each other and from the elements of M”. Then we just take amodel of this theory. But we also want M to be an elementary substructure sowe actually need a model of T 6= ∪ Th(M,M).

Therefore, we check that T≠ ∪ Th(M, M) is consistent by showing that M provides a model of every finite subset of it. We conclude that T≠ ∪ Th(M, M) has a model. But such a model contains (a copy of) M as an elementary substructure and contains at least κ many distinct elements. You might notice that we haven't yet used the assumption that the language is countable. The point is that a model of T≠ ∪ Th(M, M) may be of cardinality larger than κ and so we may have to cut it down. For this we use the Downwards Lowenheim-Skolem Theorem (which needs the assumption on the size of the language compared with κ). □

Corollary 5.22 Let T be a theory in the countable language L. If T has an infinite model then, for every infinite cardinal κ, T has a model of cardinality exactly κ.

Proof. Use the upwards, then the downwards, Lowenheim-Skolem theorems. □

Corollary 5.23 Suppose that L is a countable language and that T is an L-theory. Then either there is a finite bound on the cardinalities of models of T or for every infinite cardinal κ there is a model of T of cardinality κ.

Exercise 5.24 Suppose that L is countable and that T is an incomplete L-theory. Show that for every infinite cardinal κ there are at least two non-isomorphic models of T of cardinality κ.

Exercise 5.25 Give an example of a complete theory T in a countable language which has at least two non-isomorphic countably infinite models.

Exercise 5.26 Give an example of a theory T (in a countable language) which has an infinite model, which has a model with one element and which has no finite models with more than one element.


6 Predicate Calculus: the Completeness Theorem

Throughout this chapter we will take ¬ and → to be the basic propositional connectives and will treat the other propositional connectives as being defined in terms of these. Also we will take ∀ to be a shorthand for ¬∃¬. This reduces the number of cases which have to be considered in (definitions and) proofs by induction on complexity of formulas. Also, as before, we will treat predicate languages with equality as the main case and just comment on the case where there is no symbol for equality.

6.1 Substitution

We begin with a careful treatment of substitution of terms for variables in formulas.

If φ is a formula and t is a term then we write φ(x/t) for the formula which results when every free occurrence of x in φ is replaced by the term t.

For example if φ is P(x) ∧ ∃x(f(y, y) = x) (so here P is a unary relation symbol and f is a binary function symbol) and t is f(x, y) then φ(x/t) is P(f(x, y)) ∧ ∃x(f(y, y) = x) and φ(y/t) is P(x) ∧ ∃x(f(f(x, y), f(x, y)) = x).

We say that a term t is free for the variable x in the formula φ if there is no variable y occurring in t such that some free occurrence of x in φ falls within the scope of a quantifier ∃y or ∀y. The point is that substitution should lead to a formula which is "the same up to change of variable or perhaps a special case of that" but if, say, the variable y occurs in t and t is not free for x in φ then the result of substitution may be a very different formula.

For example, if φ is ∃y(x ≠ y) (satisfied in any valuation on a structure with at least two elements) and if t is simply y then φ(x/t), that is φ(x/y), is ∃y(y ≠ y) (which is never true). When we make substitutions we will always want to have the condition that the term substituted is free for the substituted variable in the formula.
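The "free for" condition is easy to mechanise. The sketch below is an addition to the notes, reusing the hypothetical tuple encodings of terms and formulas from the earlier sketches; all function names are invented. (The substitution map φ(x/t) itself can be written by the same recursion on formulas.)

    def term_vars(t):
        """The set of variables occurring in a term."""
        if isinstance(t, str):
            return {t}
        if t[0] == 'const':
            return set()
        return set().union(*(term_vars(s) for s in t[2]))

    def free_vars(phi):
        """The set of variables occurring free in a formula."""
        tag = phi[0]
        if tag == 'eq':
            return term_vars(phi[1]) | term_vars(phi[2])
        if tag == 'rel':
            return set().union(*(term_vars(t) for t in phi[2]))
        if tag == 'not':
            return free_vars(phi[1])
        if tag == 'imp':
            return free_vars(phi[1]) | free_vars(phi[2])
        if tag == 'exists':
            return free_vars(phi[2]) - {phi[1]}

    def free_for(t, x, phi):
        """Is the term t free for the variable x in the formula phi?"""
        tag = phi[0]
        if tag in ('eq', 'rel'):
            return True
        if tag == 'not':
            return free_for(t, x, phi[1])
        if tag == 'imp':
            return free_for(t, x, phi[1]) and free_for(t, x, phi[2])
        if tag == 'exists':
            y, body = phi[1], phi[2]
            if y == x:
                return True        # no free occurrence of x below this quantifier
            if y in term_vars(t) and x in free_vars(body):
                return False       # a free x would be captured by the quantifier on y
            return free_for(t, x, body)

    # The example from the text: phi is ∃y ¬(x = y); y is not free for x in phi,
    # but the term f(x, z) is.
    phi = ('exists', 'y', ('not', ('eq', 'x', 'y')))
    print(free_for('y', 'x', phi))                       # False
    print(free_for(('app', 'f', ('x', 'z')), 'x', phi))  # True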

Theorem 6.1 (Substitution Theorem) Suppose that the term t is free for the variable x in the formula φ. Let (M, v) be a valuation and set a = v(t). Then M |=_v φ(x/t) iff M |=_{v^x_a} φ (that is, in another notation, iff M |= φ(a)).

Proof. The proof is by induction on the complexity of φ and, as in the proof of 4.2, the induction has to be over all valuations. Just as we denote by φ(x/t) the formula which results when we replace all free occurrences of x in φ by the term t, so we denote by s(x/t) the term which results when we substitute t for all occurrences of x in the term s. We need the fact that if s is any term then v(s(x/t)) = v^x_a(s) (*). That is proved by induction on complexity of terms, as follows.


If s is a constant symbol, c say, then (*) is just v(c) = v^x_a(c), which is true since replacing v by v^x_a has no effect on the interpretation (c^M) of this constant symbol.

If s is a variable y different from x then (*) reads "v(y) = v^x_a(y)", which is true (by definition of v^x_a).

If s is the variable x then (*) reads "v(t) = v^x_a(x)", which is, by the definition of a and of v^x_a, just "a = a", which is correct.

For the inductive step, if s is f(t1, . . . , tn) then (*) reads "v(f(t1(x/t), . . . , tn(x/t))) = v^x_a(f(t1, . . . , tn))". The LHS is, by definition (of the extension of a valuation to terms), f^M(v(t1(x/t)), . . . , v(tn(x/t))) which, by induction, is f^M(v^x_a(t1), . . . , v^x_a(tn)) which, again by definition, is v^x_a(f(t1, . . . , tn)), as required.

Now for formulas. Note that (R(t1, . . . , tn))(x/t) is an identical formula to R(t1(x/t), . . . , tn(x/t)) and (t1 = t2)(x/t) is an identical formula to t1(x/t) = t2(x/t).

Suppose that φ is atomic, say R(t1, . . . , tn). Then M |=_v (R(t1, . . . , tn))(x/t) iff M |=_v R(t1(x/t), . . . , tn(x/t)) (by the note just above) iff R^M(v(t1(x/t)), . . . , v(tn(x/t))) holds (by definition of |=) iff R^M(v^x_a(t1), . . . , v^x_a(tn)) holds (by (*) above) iff M |=_{v^x_a} R(t1, . . . , tn) (definition of |=). The proof for the case where φ is t1 = t2 is similar.

The cases for ¬ and → are trivial as usual.

Suppose, finally, that φ has the form ∃yψ. First suppose that y is a different variable from x. Suppose that M |=_v φ(x/t), that is M |=_v ∃y(ψ(x/t)). Then there is some b ∈ M such that M |=_{v^y_b} ψ(x/t). By induction (over all valuations) this gives M |=_{(v^y_b)^x_a} ψ (here, if x occurs free in ψ then, since t is free for x in φ, the variable y does not occur in t and so v^y_b(t) = v(t) = a; if x does not occur free in ψ then the value given to x does not matter, by the Dependency Theorem). Note that (v^y_b)^x_a = (v^x_a)^y_b because x and y are distinct variables. So we have M |=_{(v^x_a)^y_b} ψ, hence M |=_{v^x_a} ∃yψ, that is M |=_{v^x_a} φ. The argument just made reverses so we have equivalence in this case.

There only remains the case where x and y are the same variable, so φ is ∃xψ. But then x cannot occur free in φ (every occurrence of x in φ lies within the scope of ∃x). Hence φ(x/t) is just the same formula as φ and also the Dependency Theorem gives M |=_v φ iff M |=_{v^x_a} φ, so we have the statement of the theorem in this case also. □

The next result expresses an intuitively reasonable property of the relation |=.

Proposition 6.2 If T is a set of formulas, φ is a formula, x is a variable and t is a term which is free for x in φ then:

(a) T |= ∀xφ implies T |= φ(x/t);
(b) T |= φ(x/t) implies T |= ∃xφ.

Proof. (a) Let (M, v) be a valuation with M |=_v T. Then M |=_v ∀xφ, so M |=_{v^x_a} φ for every a ∈ M. In particular, if we let b = v(t) ∈ M then we have M |=_{v^x_b} φ so, by the Substitution Theorem, 6.1, M |=_v φ(x/t).

(b) Let (M, v) be a valuation with M |=_v T. Let a = v(t). By hypothesis M |=_v φ(x/t) so, by the Substitution Theorem, M |=_{v^x_a} φ. Therefore, by definition, M |=_v ∃xφ. This shows that T |= ∃xφ. □


6.2 Another proof system for propositional logic

First we will describe a proof system for propositional logic which is different from that in section 3. Then we go on to describe the extension of this system to predicate logic. The initial definitions are the same for both the propositional and predicate case so we give these first.

A sequent is a line of the form T | φ where T is a finite set of propositional terms (in the predicate case, a finite set of formulas) and φ is a propositional term (respectively, a formula). We write φ1, . . . , φn | φ instead of {φ1, . . . , φn} | φ and we may write just | φ if T is empty. Certain sequents are called theorems and they are defined inductively by the following rules.

(Ax) Every sequent of the form T, φ | φ is a theorem (these particular sequents are axioms).
(→I) If T, φ | ψ is a theorem then so is T | φ → ψ.
(→E) If T | φ → ψ and T | φ are theorems then so is T | ψ.
(¬I) If T, φ | ψ and T, φ | ¬ψ are theorems then so is T | ¬φ.
(¬¬) If T | ¬¬φ is a theorem then so is T | φ.

The theorems/rules of deduction in this calculus are often written using a less linear notation which I will use in lectures (it takes a little longer than linear form to typeset, hence my not using it in these notes).

As before one may introduce derived rules, for example, Proof by Contradiction which says:

If T, ¬φ | ψ and T, ¬φ | ¬ψ are theorems then so is T | φ.

This rule can be derived from those above as follows: if T, ¬φ | ψ and T, ¬φ | ¬ψ are theorems then so is T | ¬¬φ (by (¬I)) and hence so is T | φ (by (¬¬)).

As in the earlier-described propositional calculus, a sequence of theorems is called a (valid) deduction. You should, of course, check that you agree that the above all are "valid rules of deduction". Indeed, we have the following.

Lemma 6.3 If T | φ is a theorem then T |= φ.

Proof. It has to be shown that if v is a valuation on propositional terms such that v(θ) = T for every θ ∈ T then v(φ) = T. But this is clear if the sequent is an axiom and, as you should check, whenever this is true for the assumption(s) of one of the above clauses it is also true for the conclusion of the clause. So, by induction (on the definition of "theorem") it is true for all theorems. □

The next property is even more obvious.

Lemma 6.4 If T | φ is a theorem and T1 is a finite set of terms containing T then also T1 | φ is a theorem.


If T is a (possibly infinite) set of propositional terms and φ is any propositional term then we will write T ⊢ φ if there is a proof of φ from T in this calculus, more formally, if there is a finite subset T′ of T such that T′ | φ is a theorem. You can read "T ⊢ φ" as "there is a deduction of φ from T". Of course, we already have such a notation and terminology from before but in this part you should forget the earlier deductive system. In any case, it will turn out that the systems are equivalent because we will prove a completeness theorem (and soundness theorem) for this calculus as well. So in each deductive calculus we will have T ⊢ φ iff T |= φ and hence T ⊢ φ in the one calculus iff T ⊢ φ in the other.

Note the following properties of the relation ⊢ (all of which are rather immediate from the definitions).

(Ax) T, φ ⊢ φ
(→I) If T, φ ⊢ ψ then T ⊢ φ → ψ
(→E) If T ⊢ φ → ψ and T ⊢ φ then T ⊢ ψ
(¬I) If T, φ ⊢ ψ and T, φ ⊢ ¬ψ then T ⊢ ¬φ
(¬¬) If T ⊢ ¬¬φ then T ⊢ φ
(PbC) If T, ¬φ ⊢ ψ and T, ¬φ ⊢ ¬ψ then T ⊢ φ
(Mon) If T ⊢ φ and T1 ⊇ T then T1 ⊢ φ
(Fin) If T ⊢ φ then there is a finite set T′ ⊆ T with T′ ⊢ φ
(Cut) If T, φ ⊢ ψ and T ⊢ φ then T ⊢ ψ
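Purely as an illustration (an addition to the notes), these rules can be transcribed as operations which build new theorems from old ones. Propositional terms are encoded, hypothetically, as in the earlier sketches (a variable is a string, ('not', p) and ('imp', p, q) are the connectives) and a sequent T | φ is a pair (frozenset of terms, term).

    # Each function takes already-derived theorems and returns a new theorem,
    # raising AssertionError if the side conditions of the rule are violated.
    def ax(T, phi):                        # (Ax):  T, phi |- phi
        return (frozenset(T) | {phi}, phi)

    def imp_i(thm, phi):                   # (->I): from T, phi |- psi infer T |- phi -> psi
        T, psi = thm
        assert phi in T
        return (T - {phi}, ('imp', phi, psi))

    def imp_e(thm1, thm2):                 # (->E): from T |- phi -> psi and T |- phi infer T |- psi
        (T1, imp), (T2, phi) = thm1, thm2
        assert T1 == T2 and imp[0] == 'imp' and imp[1] == phi
        return (T1, imp[2])

    def neg_i(thm1, thm2, phi):            # (¬I): from T, phi |- psi and T, phi |- ¬psi infer T |- ¬phi
        (T1, psi), (T2, npsi) = thm1, thm2
        assert T1 == T2 and npsi == ('not', psi) and phi in T1
        return (T1 - {phi}, ('not', phi))

    def dneg(thm):                         # (¬¬): from T |- ¬¬phi infer T |- phi
        T, f = thm
        assert f[0] == 'not' and f[1][0] == 'not'
        return (T, f[1][1])

    # Example: deriving the theorem |- p -> p from the axiom p |- p.
    t1 = ax([], 'p')
    print(imp_i(t1, 'p'))                  # (frozenset(), ('imp', 'p', 'p'))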

We will prove soundness and completeness for this calculus as the initial part of the proof of soundness and completeness for the predicate calculus which we will introduce.

In lectures we will describe a "natural deduction" system for this calculus, which can be used to find and present deductions in this system in a somewhat more natural way than the linear deductions we saw with the earlier propositional calculus. We do not give the details in these notes.

6.3 A proof system for predicate logic

The proof system for predicate logic is obtained by adding to the propositional logic rules (but now applied to formulas of a language L) the following rules with side-conditions (note that we use ∀ as well as ∃: this is just for convenience). Throughout T denotes a set of formulas, φ and ψ denote formulas and t denotes a term.

(∀I) If T0 | φ then T | ∀xφ whenever T0 ⊆ T, provided x does not occur free in any formula of T0.
(∀E) If T | ∀xφ then T | φ(x/t), provided t is free for x in φ.
(∃I) If T | φ(x/t) then T | ∃xφ, provided t is free for x in φ.
(∃E) If T | ∃xφ and T0, φ | ψ then T | ψ whenever T0 ⊆ T, provided x does not occur free in ψ or in any formula of T0.


We also need a couple of rules for dealing with the equality symbol in L.
(=I) T | t = t
(=E) If T | t1 = t2 and T | φ(x/t1) then T | φ(x/t2), provided t1 and t2 are free for x in φ.

As above these translate into properties of the relation ⊢:
(∀I) If T0 ⊢ φ then T ⊢ ∀xφ whenever T0 ⊆ T, provided x does not occur free in any formula of T0.
(∀E) If T ⊢ ∀xφ then T ⊢ φ(x/t), provided t is free for x in φ.
(∃I) If T ⊢ φ(x/t) then T ⊢ ∃xφ, provided t is free for x in φ.
(∃E) If T ⊢ ∃xφ and T0, φ ⊢ ψ then T ⊢ ψ whenever T0 ⊆ T, provided x does not occur free in ψ or in any formula of T0.
(=I) T ⊢ t = t
(=E) If T ⊢ t1 = t2 and T ⊢ φ(x/t1) then T ⊢ φ(x/t2), provided t1 and t2 are free for x in φ.

We also have all the other properties listed after 6.4.

We could do with some more derived rules.

Proposition 6.5 Suppose that the variable y is free for the variable x in φ but does not itself occur free in φ. Then:
(a) ∀y(φ(x/y)) ⊢ ∀xφ;
(b) ∃y(φ(x/y)) ⊢ ∃xφ.

Proof. (a) We have ∀y(φ(x/y)) ⊢ φ by (∀E) since, by the assumptions, (φ(x/y))(y/x) = φ and since the side-condition is satisfied. From that we obtain ∀y(φ(x/y)) ⊢ ∀xφ by (∀I).

(b) We have ∃y(φ(x/y)) ⊢ ∃y(φ(x/y)) by (Ax). We also have φ(x/y) ⊢ φ(x/y) by (Ax) and, from this, we obtain φ(x/y) ⊢ ∃xφ by (∃I). Now we can apply the rule (∃E) with T = {∃y(φ(x/y))} and T0 = ∅ (again, note that our assumptions do give that the side-conditions are satisfied) to obtain ∃y(φ(x/y)) ⊢ ∃xφ, as required. □

Proposition 6.6 Suppose that the variable y is free for the variable x in φ but does not itself occur free in φ nor in any formula in the set T of formulas. If T ⊢ φ(x/y) then T ⊢ ∀xφ.

Proof. From T ⊢ φ(x/y) and (∀I) we have T ⊢ ∀y(φ(x/y)). Then, by 6.5(a) and (Mon) (for the predicate case), we deduce T, ∀y(φ(x/y)) ⊢ ∀xφ. Then, by (Cut) (for the predicate case), we deduce T ⊢ ∀xφ. □

Soundness of this calculus is proved as for the calculus in section 3 but also using what we have proved above to deal with the cases involving quantifiers.

Theorem 6.7 If T is any set of formulas of L and φ is any formula of L then T ⊢ φ implies T |= φ.


There are a number of variants of predicate calculus as well as a variety (e.g. Natural Deduction trees) of ways of setting out deductions in the various calculi. All these variants are essentially equivalent and I see, in the context of this course, little merit in spending much time on them. All that we need for the proof of the completeness theorem is the notion of a deduction and basic facts about this notion (which are independent of the particular details of the deductive calculus used).

6.4 Proofs involving extra constants

Proposition 6.8 Suppose that T ⊢ φ is a sequent where T is a set of L0-formulas and φ is a formula of L0. Suppose that L1 = L0 ∪ {c1, . . . , cn} is a language which is obtained from L0 by adding new constants c1, . . . , cn. Then T ⊢ φ is also a sequent of L1. Suppose that T ⊢ φ is actually a theorem of L1. Then T ⊢ φ is already a theorem of L0.

Proof. Consider a proof of T ⊢ φ in L1 - say an ND-tree for this. The point is that we can replace each of the extra constants ci by a variable, yi say, which occurs nowhere else in the deduction (the deduction is a finite object so certainly there will be variables which do not appear) and what results will again be a valid deduction (you should check this by considering each of the rules for the deductive calculus) and now all the formulas appearing will be L0-formulas. □

We need a version of this which says more of what we actually proved. For any formula φ of L1 denote by φ^- the formula which results when each occurrence of each constant ci has been replaced by an occurrence of the variable yi, which we take to be fixed, given any particular deduction. Also write T1^- for the result of doing this to each formula in T1 where T1 is any set of formulas of L1.

Proposition 6.9 With notation as above, if T1 ⊢ φ1, where T1 is a set of L1-formulas and φ1 is an L1-formula, then T1^- ⊢ φ1^-.

As a corollary we have the following.

Proposition 6.10 If T ⊢ φ(x/c), where c is a constant which does not occur in φ nor in any formula of T, then T ⊢ ∀xφ.

Proof. Consider a proof of the theorem T | φ(x/c), rather, of T0 | φ(x/c) where T0 is some finite subset of T. Supposing that all these belong to the language L, let L^- be the language obtained by leaving out c (i.e. remove from L all formulas in which c appears). Then 6.9 applies (with L1 there being L here and L0 there being L^- here) to yield T^- ⊢ φ(x/c)^-. Now, if y is the variable that was used to replace the constant c (see the proof of 6.8) then φ(x/c)^- is just φ(x/y) (because, by assumption, c occurs nowhere in φ(x/c) except where it replaced x in φ). Also, since the constant c does not occur in any formula of T, also T^- = T. So we have T ⊢ φ(x/y) and hence, since y occurs nowhere in the deduction, 6.6 gives T ⊢ ∀xφ, as required. □

6.5 Consistency

Say that a set T of formulas is inconsistent if there is a formula φ such that T ⊢ φ and T ⊢ ¬φ and say that T is consistent otherwise. Here is a list of properties of this notion. In the lectures we will give proofs of at least some of them: the rest are exercises (all are easily proved).

Proposition 6.11 Let T be a set of L-formulas and let φ be an L-formula.

(a) If T is consistent and if L′ is an enlargement of L by constants then T is consistent, regarded as a set of L′-formulas.
(b) T ⊢ φ iff T ∪ {¬φ} is inconsistent.
(c) T is inconsistent iff T ⊢ φ for every formula φ.
(d) If T is satisfiable then T is consistent.
(e) If T is inconsistent and T ⊆ T1 then T1 is inconsistent.
(f) If T is inconsistent then there is a finite subset of T which is inconsistent.
(g) The union of an increasing sequence T0 ⊆ T1 ⊆ . . . ⊆ Tn ⊆ . . . of consistent sets of formulas is itself consistent.

6.6 The Completeness Theorem

We have to prove that our rules of deduction are strong enough to generate all consequences of a set of axioms. That is, we have to prove that T |= φ implies T ⊢ φ. It will be enough to show

(Consis) every consistent set of formulas is satisfiable.

For, if T |= φ then T ∪ {¬φ} is not satisfiable (by 4.5) and then, by (Consis), this can only be so if T ∪ {¬φ} is inconsistent and that, by 6.11(b), is equivalent to T ⊢ φ, which is what we want.

Initially, it is far from obvious how to prove (Consis): all we have is a set of formulas that is consistent and, somehow, we have to produce an L-structure, indeed a valuation, which satisfies all these formulas. On reflection (and with the benefit of hindsight) the conclusion seems inevitable: the L-structure must somehow be built from the language/set of formulas itself. That is what we will do. First we will deal with the case where T is a consistent set of sentences. Then we will use the idea of extending a language by adding constants and 6.8 in order to deal with the general case where T is a set of formulas. You will see that some parts of the argument parallel parts of the argument that we gave before for completeness of the Henkin-style propositional calculus.

A set T of sentences of L is maximal consistent if T is consistent and if, whenever T ⊆ T′ where T′ is a consistent set of L-sentences, it must be that T = T′.


Proposition 6.12 Every consistent set of sentences of L is contained in a maximal consistent set of sentences of L.

Proof. For the general case (of possibly uncountable languages, i.e. languages with uncountably many "extra" symbols) we need Zorn's lemma and we argue just as in the propositional case (see 3.11). For variety we give a more direct argument which works in the countable case.

In the countable case the sentences can be enumerated as σ1, σ2, . . . , σn, . . . . We set T0 = T. Then we set T1 = T0 ∪ {σ1} if this set is consistent and, if not, set T1 = T0. We continue in this way: having defined Tn which, inductively, is consistent, we set Tn+1 = Tn ∪ {σn+1} if this set is consistent and set Tn+1 = Tn if not. Thus we add sentences one by one into the set we are building up, but leave out those sentences which would immediately result in an inconsistent set. Then we set T′ to be the union of all the Tn: a consistent set by 6.11(g). It is obvious from the construction (and trivial to prove) that T′ is maximal consistent. □

Just as in the propositional case a maximal consistent set is "complete" and "deductively closed" (cf. proof of 3.11).

Theorem 6.13 Let T be a maximal consistent set of sentences of a language L and let σ, τ be sentences of L. Then the following hold.

(a) σ ∉ T implies T ⊢ ¬σ
(b) T ⊢ σ implies σ ∈ T
(c) ¬σ ∈ T iff σ ∉ T
(d) (σ → τ) ∈ T iff σ ∉ T or τ ∈ T

Proof. (a) Supposing that σ ∉ T, set T′ = T ∪ {σ}: a set of sentences properly containing T so, by maximality of T, inconsistent. That is, there is a formula φ such that T′ ⊢ φ and T′ ⊢ ¬φ. Then use (¬I) to obtain T ⊢ ¬σ.

(b) If σ ∉ T then, by (a), T ⊢ ¬σ so, since T is consistent, certainly T ⊬ σ.

(c) If ¬σ ∈ T and we also had σ ∈ T then, from (Ax), we would have both T ⊢ ¬σ and T ⊢ σ, contradicting consistency of T. For the converse, if σ ∉ T then, by (a), T ⊢ ¬σ so, by (b), ¬σ ∈ T.

(d) Suppose that (σ → τ) ∈ T. By (Ax) T ⊢ σ → τ so, if also σ ∈ T, hence by (Ax) T ⊢ σ, we obtain T ⊢ τ by (→E) and hence, by (b), τ ∈ T.

For the converse, if σ ∉ T then, by (a), T ⊢ ¬σ so, by (Mon), T, σ ⊢ ¬σ but also T, σ ⊢ σ by (Ax), so T, σ is inconsistent and hence (6.11(c)) T, σ ⊢ τ, hence T ⊢ σ → τ by (→I). In the case that τ ∈ T, we have, by (Ax), T, σ ⊢ τ and hence, by (→I), T ⊢ σ → τ. In either case, by (b), (σ → τ) ∈ T. □

This is the point where, before, we were able to complete the proof of the completeness theorem for propositional logic by defining a valuation which makes exactly the terms in a maximal consistent set containing T take the value "true". But for predicate logic we need to do quite a lot more: we must build an L-structure. Roughly, we will build it from constants that we will add to our language in order to witness existential quantifiers.

An existentially-prefixed sentence is a sentence of the form ∃xφ for some variable x and some formula φ (with at most x as free variable). If θ is an existentially-prefixed sentence, say θ is ∃xφ, then a Henkin sentence for θ is a sentence of the form θ → φ(x/c), that is ∃xφ → φ(x/c), where c is a constant symbol (not necessarily belonging to the initial language L).

A set T of sentences of a language L is a Henkin set for L if every existentially-prefixed sentence of L has a Henkin sentence in T (so here we are insisting that all the constants appearing in Henkin sentences do belong to L). A maximal consistent Henkin set, mcH set for short, of sentences of L is a Henkin set for L which is a maximal consistent set of sentences of L (so not all sentences in that set will be Henkin sentences!).

We will need the following lemma where, by a closed term, we mean a term with no free variables.

Lemma 6.14 Let T be a maximal consistent Henkin set for the language L and let ∃xφ, respectively ∀xφ, be sentences of L. Then

∃xφ ∈ T iff for some closed term t of L we have φ(x/t) ∈ T,
∀xφ ∈ T iff for every closed term t of L we have φ(x/t) ∈ T.

Proof. For the first statement, suppose that ∃xφ ∈ T. There is a Henkin sentence, say ∃xφ → φ(x/c), for ∃xφ in T. By (Ax) and (→E) we deduce that T ⊢ φ(x/c) and hence, by 6.13(b), that φ(x/c) ∈ T. For the converse, suppose that φ(x/t) ∈ T for some closed term t. Then, by (Ax) and (∃I), T ⊢ ∃xφ so, by 6.13(b), ∃xφ ∈ T.

For the second statement, suppose that ∀xφ ∈ T. Then, for any closed term t of L we have T ⊢ φ(x/t) by (Ax) and (∀E) (the side-condition for the latter is automatically satisfied since a closed term has no free variables). So, by 6.13(b), φ(x/t) ∈ T. Conversely, if φ(x/t) ∈ T for all closed terms t then, by 6.13(c), ¬φ(x/t) ∉ T for all closed terms t. Since T is a Henkin set for L there is a constant symbol c of L such that (∃x¬φ → ¬φ(x/c)) ∈ T and so, by 6.13(d), either ∃x¬φ ∉ T or ¬φ(x/c) ∈ T. But the latter contradicts our assumption that φ(x/c) ∈ T (and consistency of T) so ∃x¬φ ∉ T and hence, by 6.13(c), ¬∃x¬φ ∈ T. That is, ∀xφ ∈ T, as required. □

6.7 Herbrand structures

Let T be a maximal consistent Henkin set for a language L. We will build the Herbrand structure for T as follows.

First, the underlying set M will be the set of closed terms of L factored by a certain equivalence relation which is defined using T. Recall that a closed term is one with no free variables (hence built from the constant symbols using the function symbols). This set will be non-empty since, for example, the existentially-prefixed sentence ∃x(x = x) has a Henkin sentence, which has the form ∃x(x = x) → c = c for some constant symbol c of L.

The equivalence relation is: two terms are equivalent if T says they are. More formally, define the equivalence relation ∼ on the set of closed terms of L by: s ∼ t iff (s = t) ∈ T.
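As a small computational aside (an addition to the notes, using the hypothetical term encoding of the earlier sketches): the set of closed terms of a language with finitely many constant and function symbols can be generated mechanically, here up to a chosen nesting depth. The underlying set of the Herbrand structure is then the quotient of this set of closed terms by ∼.

    from itertools import product

    def closed_terms(consts, funcs, depth):
        """All closed terms built from the given constant symbols and function
        symbols (a dict name -> arity), of nesting depth at most `depth`."""
        terms = [('const', c) for c in consts]
        for _ in range(depth):
            new = [('app', f, args)
                   for f, n in funcs.items()
                   for args in product(terms, repeat=n)]
            terms = terms + [t for t in new if t not in terms]
        return terms

    # Example: one constant c and one unary function symbol f gives
    # the closed terms c, f(c), f(f(c)), ... (here up to depth 2).
    for t in closed_terms(['c'], {'f': 1}, 2):
        print(t)
    # ('const', 'c')
    # ('app', 'f', (('const', 'c'),))
    # ('app', 'f', (('app', 'f', (('const', 'c'),)),))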

Lemma 6.15 The relation ∼ is an equivalence relation on the set of closed terms of L which further satisfies: if s1 ∼ t1, . . . , sn ∼ tn then

(1) f(s1, . . . , sn) ∼ f(t1, . . . , tn) for each n-ary function symbol f of L,
(2) R(s1, . . . , sn) ∈ T iff R(t1, . . . , tn) ∈ T for each n-ary relation symbol R of L.

Proof. For the proof one first has to show that the following are all derivable in the calculus (the various s's and t's denote arbitrary terms, f denotes an n-ary function symbol and R denotes an n-ary relation symbol).

⊢ t = t
t1 = t2 ⊢ t2 = t1
t1 = t2, t2 = t3 ⊢ t1 = t3
s1 = t1, . . . , sn = tn ⊢ f(s1, . . . , sn) = f(t1, . . . , tn)
s1 = t1, . . . , sn = tn, R(s1, . . . , sn) ⊢ R(t1, . . . , tn)

Deriving these and using them to prove the lemma is left as an admittedly rather uninspiring exercise. □

So we set M to be the set of closed terms of L factored by the equivalence relation ∼. That is, M is the set of ∼-equivalence classes of closed terms. We write t/∼ for the class of t. We then define an L-structure, M, on M in the obvious way:

if c is a constant symbol of L set c^M = c/∼;
if f is an n-ary function symbol and t1/∼, . . . , tn/∼ are elements of M then set f^M(t1/∼, . . . , tn/∼) = (f(t1, . . . , tn))/∼;
if R is an n-ary relation symbol and t1/∼, . . . , tn/∼ are elements of M then set R^M(t1/∼, . . . , tn/∼) to be true iff R(t1, . . . , tn) ∈ T.

terms one deduces that for every closed term t of L we have tM = t/ ∼.Of course we have to check that this structure is well-defined, since we have

defined this structure by reference to particular representatives of equivalenceclasses, but that’s what 6.15 is there for (again the details are left as an exercise).
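
The quotient construction can be pictured computationally too. The following Python sketch (again a toy of my own devising, working with a finite fragment of the closed terms, whereas in general there are infinitely many) forms the ∼-classes and interprets a constant c, a unary function symbol f and a unary relation symbol R exactly as in the definition above; 6.15 is what guarantees that the choice of representative made inside f_M and R_M does not matter.

    # Toy illustration: a finite fragment of the closed terms and of the atomic part
    # of a hypothetical T; ~-classes are represented as frozensets of closed terms.
    closed_terms = ["c", "d", "f(c)", "f(d)"]
    T_atomic = (
        {("=", s, s) for s in closed_terms}
        | {("=", "c", "d"), ("=", "d", "c"),
           ("=", "f(c)", "f(d)"), ("=", "f(d)", "f(c)"),
           ("R", "f(c)"), ("R", "f(d)")}
    )

    def cls(t):
        """t/~ : the set of closed terms that T declares equal to t."""
        return frozenset(s for s in closed_terms if ("=", t, s) in T_atomic)

    M = {cls(t) for t in closed_terms}        # the underlying set: closed terms factored by ~

    c_M = cls("c")                            # c^M = c/~

    def f_M(a):                               # f^M(t/~) = f(t)/~, computed via a representative
        t = next(iter(a))
        return cls("f(" + t + ")")

    def R_M(a):                               # R^M(t/~) holds iff R(t) is in T
        t = next(iter(a))
        return ("R", t) in T_atomic

    print(len(M))                             # 2: the classes {c, d} and {f(c), f(d)}
    print(R_M(f_M(c_M)))                      # True, whichever representatives were chosen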

What we do now is prove that this structure does satisfy all the sentences in T. As always, we have to go through formulas to reach sentences so we make the following definition. Let φ be a formula with free variables among x1, . . . , xn (a list of distinct variables). A sentence instance of φ is any formula of the form φ(x1/c1, . . . , xn/cn) where c1, . . . , cn are (not necessarily distinct) constant symbols of L.

Theorem 6.16 (Herbrand Structure Theorem) Let T be a maximal consistent Henkin set for the language L and let M be the Herbrand structure built as above from L and T. Let φ be any formula of L and let φ′ be any sentence instance of φ. Then M |= φ′ iff φ′ ∈ T. In particular, for every sentence σ of L we have M |= σ iff σ ∈ T.

Proof. The proof is, you guessed it, by induction on complexity of the formula φ.

First suppose that φ is atomic. Then φ′ has either the form t1 = t2 or R(t1, . . . , tn) where the ti are closed terms of L, say the former. Then M |= t1 = t2 iff t1M = t2M iff t1/∼ = t2/∼ iff (t1 = t2) ∈ T, and similarly for the other case (so we just use the definition of the Herbrand structure and 6.15).

If φ is ¬ψ then any sentence instance, φ′, of φ clearly has the form ¬ψ′ where ψ′ is a sentence instance of ψ. By induction we have M |= ψ′ iff ψ′ ∈ T, so M |= φ′, that is M |= ¬ψ′, iff ψ′ /∈ T, which is the case iff ¬ψ′ ∈ T (by 6.13(c)), that is, iff φ′ ∈ T.

If φ is φ1 → φ2 then, clearly, any sentence instance, φ′, of φ has the form φ′1 → φ′2 where φ′i is a sentence instance of φi. Then M |= φ′ iff M ⊭ φ′1 or M |= φ′2 iff φ′1 /∈ T or φ′2 ∈ T (by induction) iff φ′ ∈ T by 6.13(d), as required.

If φ is ∃xψ then any sentence instance, φ′, of φ has the form ∃xψ′ where ψ′ has had all its free variables except x replaced by constant symbols. Then M |= φ′, that is M |= ∃xψ′, iff there is a ∈ M with M |= ψ′(x/a) iff there is some closed term t of L such that M |= ψ′(x/t) (using the fact that, by construction, every element of M has the form t/∼ for some closed term t) iff ψ′(x/t) ∈ T for some closed term t (by induction) iff φ′ ∈ T (by 6.14), as required. □
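
The induction in this proof is, in effect, an algorithm: truth of a sentence in the Herbrand structure is computed by recursion on complexity, with atomic sentences looked up in T and an existential quantifier ranging over the closed terms. The Python sketch below (my own toy encoding of sentences as nested tuples, with an artificially finite stock of closed terms so that the quantifier loop terminates) makes that recursion explicit.

    # Toy illustration of the recursion in 6.16: sentences are nested tuples such as
    # ("exists", "x", ("R", "x")), ("not", ...), ("implies", ..., ...), ("=", s, t).
    closed_terms = ["c", "d"]
    T_atomic = {("=", "c", "c"), ("=", "d", "d"), ("R", "c")}

    def substitute(phi, x, t):
        """Replace the variable x by the closed term t throughout phi."""
        if isinstance(phi, str):
            return t if phi == x else phi
        if phi[0] == "exists" and phi[1] == x:    # x is re-bound here: leave the body alone
            return phi
        return tuple(substitute(part, x, t) for part in phi)

    def holds(phi):
        """Truth in the Herbrand structure, computed as in the proof of 6.16."""
        op = phi[0]
        if op in ("=", "R"):                      # atomic: consult T (and hence 6.15)
            return phi in T_atomic
        if op == "not":
            return not holds(phi[1])
        if op == "implies":
            return (not holds(phi[1])) or holds(phi[2])
        if op == "exists":                        # witnesses range over the closed terms
            _, x, body = phi
            return any(holds(substitute(body, x, t)) for t in closed_terms)
        raise ValueError("unknown connective: " + repr(op))

    print(holds(("exists", "x", ("R", "x"))))     # True, witnessed by the closed term c
    print(holds(("not", ("R", "d"))))             # True, since R(d) is not in T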

Notice that we’re not done yet: the assumption of the theorem includes that the language L has a (maximal consistent) Henkin set. The language we started with might not have such a set (for example any language with only finitely many constant symbols cannot have a Henkin set). So, starting with a consistent set T of sentences of a language L we have to enlarge the language by adding new constant symbols. We need a couple of lemmas for the mcH Theorem, which asserts that every consistent set of sentences can be extended to an mcH set in a (in general) larger language.

Lemma 6.17 If T is a consistent set of formulas of a language L and c is a constant symbol of L which does not occur in T or in the formula φ of L then T ∪ {∃xφ → φ(x/c)} is also consistent.

Proof. If T ∪ {∃xφ → φ(x/c)} were inconsistent then, using (¬I) (or 6.11(b) and (¬¬)), we obtain T ⊢ ¬(∃xφ → φ(x/c)) and so, since T ⊢ ¬(ψ → ψ′) implies T ⊢ ψ and T ⊢ ¬ψ′, we have T ⊢ ∃xφ and T ⊢ ¬φ(x/c). Therefore T ⊢ ¬∀x¬φ and, by 6.10, T ⊢ ∀x¬φ, contradicting that T is consistent. □

Lemma 6.18 Every consistent set T of sentences of a language L can be extended to a consistent set, T ′, of sentences of some enlargement, L′, of L obtained by adding constants, such that every existentially-prefixed sentence of L has a Henkin sentence in T ′.

Proof. For each existentially-prefixed sentence θ of L, say θ is ∃xφ, we add a new constant cθ to L and set θ′ to be the sentence ∃xφ → φ(x/cθ). Let T ′ be the union of T and all these sentences θ′. We have to show that this set is consistent.

If it were not then there would be a finite inconsistent subset, which would be a finite subset of T together with, say, θ′1, . . . , θ′n. But then finitely many applications of 6.17 would give that a finite subset of T, hence T itself, is inconsistent, which is the required contradiction. □
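
The construction in this proof is easy to picture as a procedure: for each existentially-prefixed sentence we mint one fresh constant and add the corresponding Henkin sentence, each such addition being harmless by 6.17 since the constant is new. Here is a small Python sketch in the same toy encoding as before; the constant names c_0, c_1, . . . and the function name add_henkin_sentences are my own choices, and the list of sentences to be witnessed is passed in explicitly (in the lemma it runs over all existentially-prefixed sentences of L).

    from itertools import count

    fresh_constants = ("c_%d" % i for i in count())    # new constants, assumed not to occur in L

    def substitute(phi, x, t):
        """Replace the variable x by the term t throughout phi (same toy encoding as before)."""
        if isinstance(phi, str):
            return t if phi == x else phi
        if phi[0] == "exists" and phi[1] == x:
            return phi
        return tuple(substitute(part, x, t) for part in phi)

    def add_henkin_sentences(T, existentials):
        """Return T enlarged by one Henkin sentence per sentence in `existentials`."""
        new_sentences = set()
        for theta in existentials:                     # theta has the form ("exists", x, body)
            _, x, body = theta
            c = next(fresh_constants)                  # a constant occurring neither in T nor in body
            new_sentences.add(("implies", theta, substitute(body, x, c)))
        return T | new_sentences

    T = {("exists", "x", ("R", "x"))}
    T1 = add_henkin_sentences(T, [("exists", "x", ("R", "x"))])
    print(T1)    # T together with the Henkin sentence  exists x R(x)  ->  R(c_0)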

Theorem 6.19 (mcH Theorem) If T is a consistent set of sentences of a language L then there is a language, L′, containing L, obtained from L by adding new constant symbols, and a maximal consistent Henkin set T ∗ of L′ such that T ⊆ T ∗.

Proof. First we apply 6.18 to obtain an extension, L1 say, of L by constants and a set, T1 say, of L1-sentences, containing T, such that T1 contains a Henkin sentence for every existentially-prefixed sentence of L. The problem is that we would like this last property but for existentially-prefixed sentences of L1. So we repeat, with L1 and T1 in place of L and T, to obtain an extension, L2, of L1 by constants and a set, T2, of L2-sentences, containing T1, such that T2 contains a Henkin sentence for every existentially-prefixed sentence of L1. And so on. We let L′ be the union of the languages Ln and set T ′ to be the union of the sets Tn. By 6.11(g) T ′ is consistent and, if θ is an existentially-prefixed sentence of L′, say a sentence of Ln, then, by construction, there is a Henkin sentence for θ in Tn+1, so this Henkin sentence is in T ′. Therefore T ′ is a Henkin set for L′. By 6.12 there is a maximal consistent Henkin set, T ∗ say, for L′ containing T ′. □

We’re almost there: we can construct the Herbrand structure, M∗, for T ∗ and, by 6.16, M∗ satisfies all the sentences in T ∗, in particular, satisfies all the sentences of T. The only obstruction is that M∗ is an L′-structure rather than an L-structure. But we consider the restriction of M∗ to L, by which we mean the L-structure which has the same underlying set, M, as M∗ and in which all the symbols of L are interpreted in the way they are interpreted in M∗ (that is, we just forget any structure which is for L′ but not for L).

Lemma 6.20 Suppose that L′ is an extension of L and that M′ is an L′-structure. Denote by M the restriction of M′ to L. Let φ be any formula of L. Then M′ |= φ iff M |= φ.

That is now enough: the restriction of M∗ to L will be a model of T so we have shown the sentence version of (Consis): that every consistent set of sentences has a model.

To show (Consis) for formulas is now easy: suppose that x0, x1, . . . are all the variables of L. Choose a set c0, c1, . . . of new and distinct constant symbols and let Lc denote the extension of L obtained by adding all these constant symbols. Then, given a consistent set, T, of L-formulas, replace each formula θ ∈ T by the Lc-sentence, θc, which is obtained by replacing each of its free variables xi by the corresponding constant symbol ci. By 6.10 the resulting set, Tc, of sentences is consistent. So by the above Tc has a model Mc. For each θ ∈ T, Mc |= θc so, by 6.10 and 6.2, Mc |= θ. Then, by 6.20, the restriction, M, of Mc to L is a model of each θ ∈ T. Therefore M |= T, as required.
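
The passage from formulas to sentences described above is again just a substitution, replacing each free variable xi by the corresponding new constant ci. A final Python sketch in the same toy encoding (the naming convention x_0, x_1, . . . for variables, c_0, c_1, . . . for constants and the function name free_to_constants are mine) shows the replacement, taking care to leave bound occurrences alone.

    def free_to_constants(phi, bound=()):
        """Replace each free variable x_i by the constant c_i, leaving bound variables alone."""
        if isinstance(phi, str):
            if phi.startswith("x_") and phi not in bound:
                return "c_" + phi[2:]                  # x_i  |->  c_i
            return phi
        if phi[0] == "exists":
            _, x, body = phi
            return ("exists", x, free_to_constants(body, bound + (x,)))
        return tuple(free_to_constants(part, bound) for part in phi)

    # x_0 is free, x_1 is bound by the quantifier:
    phi = ("implies", ("R", "x_0"), ("exists", "x_1", ("=", "x_1", "x_0")))
    print(free_to_constants(phi))
    # ('implies', ('R', 'c_0'), ('exists', 'x_1', ('=', 'x_1', 'c_0')))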

We have proved the Completeness Theorem.

Theorem 6.21 Let T be any set of formulas and let φ be any formula. If T |= φ then T ⊢ φ.
