A COURSE IN SYMBOLIC LOGIC

Haim Gaifman
Philosophy Department, Columbia University

Copyright © 1992 by Haim Gaifman. Revised: February 1999. Further corrections: February 2002.




Contents

Introduction i – x

1 Declarative Sentences 1

1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 Truth-Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.1.1 Context Dependency . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.1.2 Types and Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.1.3 Vagueness and Open Texture . . . . . . . . . . . . . . . . . . . . . . . 9

1.1.4 Other Causes of Truth-Value Gaps . . . . . . . . . . . . . . . . . . . . 13

1.2 Some Other Uses of Declarative Sentences . . . . . . . . . . . . . . . . . . . . 14

2 Sentential Logic 17

2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.1 Sentences, Connectives, Truth-Tables . . . . . . . . . . . . . . . . . . . . . . . 19

2.1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.1.1 Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.1.2 Conjunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.1.3 Truth-Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.1.4 Atomic Sentences in Sentential Logic . . . . . . . . . . . . . . . . . . . 27

2.2 Logical Equivalence, Tautologies, Contradictions . . . . . . . . . . . . . . . . . 29


2.2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.2.1 Some Basic Laws Concerning Equivalence . . . . . . . . . . . . . . . . 32

2.2.2 Disjunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.2.3 Logical Truth and Falsity, Tautologies and Contradictions . . . . . . . 40

2.3 Syntactic Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

2.3.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

2.3.1 Sentences as Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

2.3.2 Polish Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2.4 Syntax and Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2.5 Sentential Logic as an Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

2.5.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

2.5.1 Using the Equivalence Laws . . . . . . . . . . . . . . . . . . . . . . . . 58

2.5.2 Additional Equivalence Laws . . . . . . . . . . . . . . . . . . . . . . . . 64

2.5.3 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

2.6 Conditional and Biconditional . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

3 Sentential Logic in Natural Language 80

3.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

3.1 Classical Sentential Connectives in English . . . . . . . . . . . . . . . . . . . . 85

3.1.1 Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

3.1.2 Conjunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

3.1.3 Disjunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

3.1.4 Conditional and Biconditional . . . . . . . . . . . . . . . . . . . . . . . 97


4 Logical Implications and Proofs 105

4.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

4.1 Logical Implication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

4.2 Implications with Many Premises . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.2.1 Some Basic Implication Laws and Top-Down Derivations . . . . . . . . 112

4.2.2 Additional Implication Laws and Derivations as Trees . . . . . . . . . . 118

4.2.3 Logically Inconsistent Premises . . . . . . . . . . . . . . . . . . . . . . 125

4.3 Fool-Proof Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

4.3.1 Validity and Counterexamples . . . . . . . . . . . . . . . . . . . . . . . 127

4.3.2 The Basic Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

4.3.3 The Fool-Proof Method . . . . . . . . . . . . . . . . . . . . . . . . . . 134

4.4 Proofs by Contradiction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

4.4.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

4.4.1 The Fool-Proof Method for Proofs by Contradiction . . . . . . . . . . . 139

4.5 Implications of Sentential Logic in Natural Language . . . . . . . . . . . . . . 143

4.5.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

4.5.1 Meaning Postulates and Background Assumptions . . . . . . . . . . . . 144


4.5.2 Implicature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

5 Mathematical Interlude 153

5.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

5.1 Basic Concepts of Set Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

5.1.1 Sets, Membership and Extensionality . . . . . . . . . . . . . . . . . . . 154

5.1.2 Subsets, Intersections, and Unions . . . . . . . . . . . . . . . . . . . . . 159

5.1.3 Sequences and Ordered Pairs . . . . . . . . . . . . . . . . . . . . . . . 165

5.1.4 Relations and Cartesian Products . . . . . . . . . . . . . . . . . . . . . 166

5.2 Inductive Definitions and Proofs, Formal Languages . . . . . . . . . . . . . . . 173

5.2.1 Inductive definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

5.2.2 Proofs by Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

5.2.3 Formal Languages as Sets of Strings . . . . . . . . . . . . . . . . . . . . 188

5.2.4 Simultaneous Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

6 The Sentential Calculus 197

6.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

6.1 The Language and Its Semantics . . . . . . . . . . . . . . . . . . . . . . . . . 197

6.1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

6.1.1 Sentences as Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

6.1.2 Semantics of the Sentential Calculus . . . . . . . . . . . . . . . . . . . 202

6.1.3 Normal Forms, Truth-Functions and Complete Sets of Connectives . . . 206

6.2 Deductive Systems of Sentential Calculi . . . . . . . . . . . . . . . . . . . . . . 217

6.2.1 On Formal Deductive Systems . . . . . . . . . . . . . . . . . . . . . . . 217

6.2.2 Hilbert-Type Deductive Systems . . . . . . . . . . . . . . . . . . . . . . 219

6.2.3 A Hilbert-Type Deductive System for Sentential Logic . . . . . . . . . 221


6.2.4 Soundness and Completeness . . . . . . . . . . . . . . . . . . . . . . . 229

6.2.5 Gentzen-Type Deductive Systems . . . . . . . . . . . . . . . . . . . . . 235

7 Predicate Logic Without Quantifiers 241

7.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241

7.1 PC0, The Formal Language and Its Semantics . . . . . . . . . . . . . . . . . . 244

7.1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244

7.1.1 The Semantics of PC0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

7.2 PC0 with Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250

7.2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250

7.2.1 Top-Down Fool-Proof Methods For PC0 with Equality . . . . . . . . . 253

7.3 Structures of Predicate Logic in Natural Language . . . . . . . . . . . . . . . . 261

7.3.1 Variables and Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . 261

7.3.2 Predicates and Grammatical Categories of Natural Language . . . . . . 264

7.3.3 Meaning Postulates and Logical Truth Revisited . . . . . . . . . . . . . 267

7.4 PC∗0 , Predicate Logic with Individual Variables . . . . . . . . . . . . . . . . . 269

7.4.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

7.4.1 Substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

7.4.2 Variables and Structural Representation . . . . . . . . . . . . . . . . . 273

8 First-Order Logic 277

8.1 First View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277

8.2 Wffs and Sentences of FOL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

8.2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

8.2.1 Bound and Free Variables . . . . . . . . . . . . . . . . . . . . . . . . . 283

8.2.2 More on the Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . 285


8.2.3 Substitutions of Free and Bound Variables . . . . . . . . . . . . . . . . 288

8.2.4 First-Order Languages with Function Symbols . . . . . . . . . . . . . . 291

8.3 First-Order Quantification in Natural Language . . . . . . . . . . . . . . . . . 295

8.3.1 Natural Language and the Use of Variables . . . . . . . . . . . . . . . . 295

8.3.2 Some Basic Forms of Quantification . . . . . . . . . . . . . . . . . . . . 297

8.3.3 Universal Quantification . . . . . . . . . . . . . . . . . . . . . . . . . . 302

8.3.4 Existential Quantification . . . . . . . . . . . . . . . . . . . . . . . . . 307

8.3.5 More on First Order Quantification in English . . . . . . . . . . . . . . 309

8.3.6 Formalization Techniques . . . . . . . . . . . . . . . . . . . . . . . . . 314

9 FOL: Models, Truth and Logical Implication 323

9.1 Models, Satisfaction and Truth . . . . . . . . . . . . . . . . . . . . . . . . . . 323

9.1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

9.1.1 The Truth Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

9.1.2 Defining Sets and Relations by Wffs . . . . . . . . . . . . . . . . . . . . 331

9.2 Logical Implications in FOL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334

9.2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334

9.2.1 Proving Non-Implications by Counterexamples . . . . . . . . . . . . . . 336

9.2.2 Proving Implications by Direct Semantic Arguments . . . . . . . . . . . 338

9.2.3 Equivalence Laws and Simplifications in FOL . . . . . . . . . . . . . . 341

9.3 The Top-Down Derivation Method for FOL Implications . . . . . . . . . . . . 345

9.3.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

9.3.1 The Implication Laws for FOL . . . . . . . . . . . . . . . . . . . . . . . 345

9.3.2 Examples of Top-Down Derivations . . . . . . . . . . . . . . . . . . . . 350

9.3.3 The Adequacy of the Method: Completeness . . . . . . . . . . . . . . . 353


Introduction

Logic is concerned with the fundamental patterns of conceptual thought. It uncovers structures that underlie our thinking in everyday life and in domains that have very little in common, as diverse, for example, as mathematics, history, or jurisprudence. A rather rough idea of the scope of logic can be obtained by noting certain keywords: object, property, concept, relation, true, false, negation, names, common names, deduction, implication, necessity, possibility, and others.

Symbolic logic aims at setting up formal systems that bring to the fore basic aspects of reasoning. These systems can be regarded as artificial languages into which we try to translate statements of natural language (e.g., English). While many aspects of the original statement are lost in such a translation, others are made explicit. It is these latter aspects that are the focus of the logical investigation.

Historically, logic was conceived as the science of valid reasoning, one that derives solely from the meaning of words such as ‘not’, ‘and’, ‘or’, ‘all’, ‘every’, ‘there is’, and others, or syntactical constructs like ‘if ... then ...’. These words and constructs are sometimes called logical particles. A logical particle plays the same role in domains that have nothing else in common.

Here is a traditional, very simple example. From the two premises:

All animals are mortal.

All humans are animals.

we infer, by logic alone:

All humans are mortal.

The inference is purely logical; it does not depend on the meanings of ‘animal’ and ‘human’, but solely on the meaning of the construct

all ... are ...


The same pattern underlies the following inference, in which all non-logical concepts are different. From:

All uncharged elementary particles are unaffected by electromagnetic fields.

All photons are uncharged elementary particles.

we can infer:

All photons are unaffected by electromagnetic fields.

The two cases are regarded in Aristotelian logic as instances of the same syllogism, a certain elementary type of inference. The particular syllogism under which the two examples fall is the following scheme, where the premises are written above the line and the conclusion under it:

(1)   All Ps are Qs
      All Rs are Ps
      --------------------------
      All Rs are Qs

Our first example is obtained if we substitute:

‘animal’ for ‘P’, ‘mortal being’ for ‘Q’, ‘human’ for ‘R’.

The second is obtained if we substitute ‘uncharged elementary particle’ for ‘P’, ‘thing unaffected by electromagnetic fields’ for ‘Q’, ‘photon’ for ‘R’.
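Looking ahead to first-order logic (chapter 8), the same scheme can also be written with quantifiers and unary predicate symbols. This is only a preview, not the notation officially introduced at this stage:

```latex
% Scheme (1) in first-order notation, with P, Q, R as unary predicate symbols.
% Premises:
\forall x\,\bigl(P(x) \to Q(x)\bigr) \qquad \forall x\,\bigl(R(x) \to P(x)\bigr)
% Conclusion:
\forall x\,\bigl(R(x) \to Q(x)\bigr)
```

The first example is then recovered by reading P(x) as ‘x is an animal’, Q(x) as ‘x is mortal’, and R(x) as ‘x is human’.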

Had we substituted in the first case ‘immortal being’ for ‘Q’ (instead of ‘mortal being’), we would have gotten the inference:

All animals are immortal.

All humans are animals.

--------------------------

All humans are immortal.

Here the first premise is false, and so is the conclusion. But the inference is correct. Its correctness does not require that the premises be true, but that they stand in a certain logical relation to the conclusion: they should logically imply it.

The use of schematic symbols is a first step in setting up a formalism. Yet, there is a long way from here to a fully fledged formal language. As will become clear during the course, there is much more to a formalism than the employment of formal symbols.

Here is a different type of logical inference. From the three premises:


Either Jill went to the movie, or she went to the party.

If Jill went to the party, she met Jack there.

Jill did not go to the movie.

we can infer:

Jill met Jack at the party.

The logical particles on which this last inference is based are:

either ... or,   if ... then,   not

The same scheme is exemplified by the following inference. (Again, its validity does not mean that the premises are true, but only that they imply the conclusion.) From:

Either Ms. Hill invented her story, or Mr. Thomas harassed her.

If Mr. Thomas harassed Ms. Hill, then he lied to the Senate.

Ms. Hill did not invent her story.

we can infer:

Mr. Thomas lied to the Senate.

The scheme that covers both of these inferences can be written in the following self-explanatory notation, where the premises (here there are three) are above the line and the conclusion is below it:

(2)   A or B
      If B then C
      not A
      --------------------------
      C

Note that the schematic symbols of (2) are of a different kind than those of (1). Whereas in (1) they stand for general names: ‘human’, ‘mortal being’, ‘photon’, etc., they stand in (2) for complete sentences, such as ‘Jill went to the party’ or ‘Mr. Thomas harassed Ms. Hill’. The part of logic that takes complete sentences as basic units, and investigates the combining of sentences into other sentences, by means of expressions such as ‘not’, ‘and’, ‘or’, ‘if ... then’, is called sentential logic. The sentence-combining operations are called sentential operations, and the resulting sentences (e.g., those in (2)) are sentential combinations, or sentential compounds. Sentential logic is a basic part that is usually included in other, richer systems of logic.
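Anticipating the truth-table method of chapter 2, the validity of a sentential scheme such as (2) can be checked mechanically: run through all assignments of truth-values to the sentence letters and verify that whenever every premise comes out true, so does the conclusion. A small sketch (the function names are ours, not the book's; ‘if B then C’ is rendered as the material conditional, i.e., ‘not B, or C’):

```python
from itertools import product

def valid(premises, conclusion, n_vars):
    """A scheme is valid if, in every assignment of truth-values
    making all premises true, the conclusion is also true."""
    for vals in product([True, False], repeat=n_vars):
        if all(p(*vals) for p in premises) and not conclusion(*vals):
            return False  # found a counterexample
    return True

# Scheme (2): A or B;  if B then C;  not A  / therefore  C.
premises = [
    lambda A, B, C: A or B,
    lambda A, B, C: (not B) or C,   # material reading of 'if B then C'
    lambda A, B, C: not A,
]
conclusion = lambda A, B, C: C

print(valid(premises, conclusion, 3))   # True: the scheme is valid
```

Dropping the premise ‘not A’ makes the scheme invalid: the assignment A true, B false, C false satisfies the remaining premises but falsifies the conclusion.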

The logic that treats, in addition to sentential operations, the attribution of properties to objects (‘Jill is not tall, but pretty’), or relations between objects (‘Jack likes Jill’), is predicate logic. If we add to predicate logic means for expressing general statements, like those formed in English by using ‘all’, ‘every’, ‘some’ (‘All humans are mortal’, ‘Jill likes some tall man’, ‘Everyone likes somebody’), then we get first-order quantification logic, known also as first-order predicate logic, or, for short, first-order logic.

The examples schematized by (1) and by (2) are rather simple. In general, the recasting of a sentence as an instance of some scheme is far from obvious. It amounts to an analysis: a way of getting the sentence from components, which displays a certain logical structure. The choice of components and how, in general, we combine them is crucial for the development of logic, just as the choice of primitive concepts and basic assumptions is crucial for any science. Logic was bogged down for more than twenty centuries, because it was tied to a particular way of analyzing sentences, one based on syllogisms. It could not accommodate the wealth of logical reasoning that can be expressed in natural language and is apparent in the deductive sciences. To be sure, valuable insights and sophisticated analyses were achieved during that period. But only at the turn of the century, when the basic Aristotelian setup was abandoned in favour of an essentially different approach, did symbolic logic come into its own.

Examples such as (1) and (2) can serve as entry points, but they do not show what symbolic logic is about. We are not concerned merely with schematizing this or that way of reasoning, but with the construction and the study of formal languages. The study is a tool in the investigation of conceptual thought. Its aims, and its rich implications, reach beyond the linguistic dimension.

Logic is also not restricted to the first-order case. Other logical formalisms have been designed in order to handle a variety of notions, including those expressed by the terms ‘possible’, ‘necessary’, ‘can’, ‘must’, and many others. There are also numerous systems that treat a great variety of reasoning methods, which are quite different from the type exemplified in first-order logic.

A Bit of History

The beginning of logic goes back to Aristotle (384 – 322 B.C.). Aristotle’s works contain, besides non-formal or semi-formal investigations of logical topics, a schematization and a systematic theory of syllogisms. This logic was developed and enlarged in the Middle Ages but remained very limited, owing to its underlying approach, which bases the logic on the grammatical subject-predicate relation. Other parts of logic, namely fragments of sentential logic, were investigated by the Stoic philosophers (330 – 100 B.C.). They made use of schematic symbols, which did not, however, amount to a formal system. That stage had to come much later.

The decisive steps in logic took place in the second half of the nineteenth and the beginning of the twentieth century, in the works of George Boole (1815 – 1864), Charles Peirce (1839 – 1914), Giuseppe Peano (1858 – 1932), and, first and foremost, Gottlob Frege (1848 – 1925), whose work Begriffsschrift (1879) presents the first logical system rich enough for expressing mathematical reasoning. The other most significant event was the publication, in three volumes, of the Principia Mathematica by Russell and Whitehead (1910 – 1913), an extremely ambitious work from which the current formalisms derive directly. (Frege’s work was little noticed at the time, though Russell and other logicians knew it.) The big bang in logic was directly related to new developments in mathematics, mostly in the works of Dedekind and Cantor. It owed much to the creation, by the latter, of set theory (1874). After the work of Russell and Whitehead, logic was taken much further by mathematicians and philosophers, among whom we find Hilbert, Ackermann, Ramsey, Löwenheim, Skolem, and, somewhat later, Gentzen, Herbrand, Church, Tarski and many others. It owes its most important results to Kurt Gödel: landmark results of deep philosophical significance.

“Critical Thinking”: What Symbolic Logic Is Not

In the Middle Ages logic was described as “the art of reasoning”. It has been, and often still is, viewed as the discipline concerned with correct ways of deriving conclusions from given premises. Much of medieval logic was concerned with the analysis and the classification of arguments to be found in various kinds of discourse, from everyday life to politics, philosophy, and theology. In this way logic was related to rhetoric: the art of convincing people through talk. The medieval logicians devoted considerable effort to the uncovering of invalid arguments, known as fallacies, of which they classified a few dozen.

Perhaps a vestige of this tradition, or a renewal of it, is a logic course given in some curricula under “Critical Thinking”. In certain cases, I am told, the name is a euphemism for teaching text comprehension, thereby filling a high-school lacuna. But, to judge by the textbooks, the course commonly comprises an assortment of topics that have to do with inference-drawing: deductive, inductive, statistics and probability, some elements of symbolic logic, and a discussion of various fallacies. There is no doubt that there is value to such an overview, or to the analysis of common fallacies. But the pretense of the title should be discounted. Critical thinking, without the scare quotes, is not something that can be taught directly as an academic subject. (Imagine a course called “Thinking”.)

Thinking, like walking, is learned by practice. And good thinking, clear, rigorous, critical, is what one acquires in life, or through work in particular disciplines. Observations and corrective tips are useful, but they will not get you far, unless incorporated in long continuous experience. Clear thinking is a lifelong project.


A course in symbolic logic is not a course in “critical thinking”. What you will learn may, hopefully will, affect your thinking. You will study a certain formalism, using which you will learn, among other things, to analyze certain types of English sentences, to reconstruct them and to trace their logical relationships. These should make you more sensitive to some aspects of meaning and of reasoning. But any improvement of your thinking ability will be a consequence of the mental effort and the practice that the course requires. Better thinking does not arrive by learning a set of rules, or by following this or that prescription.

The Research Program of Symbolic Logic

Symbolic logic is an enterprise that takes its “raw material” from the existing activity of reasoning, displayed by human beings. It focuses on certain fundamental patterns and tries to represent them in an artificial language (also called a calculus). The formal language is supposed to capture basic aspects of conceptual thought.

The enterprise comprises discovery as well as construction. It would not be accurate to say that the investigator merely uncovers existing patterns. The structures are revealed and constructed at the same time. Having constructed a formal system, we can go back and see how much of our reasoning is actually captured by it. The formalism is philosophically interesting, or fruitful, in as much as it gives us a handle on essential aspects of thought. It can also be of technical interest, for it can provide tools for language processing and computer simulation of reasoning processes. In either respect, there is no a priori guarantee of success.

We should keep in mind that, even when the formal system represents something basic or important, its significance may be tied to some particular segment of our cognitive activity. There should be no pretense of “reducing” the enormous wealth of our thinking to an artificial symbolic system. At least there should be no a priori conviction that it can be done. To what extent human reasoning can be captured by a formal system is an intriguing and difficult question. It has been much discussed in the context of artificial intelligence (not always with the best results).

Let us compare investigations in logic to investigations of human body-postures. A medical researcher can use x-rays and other scans, or slow-motion pictures, in order to find out how human bodies function in a range of activities. He will accumulate a vast amount of data. In order to organize his data into a meaningful account, he may classify it according to “human types”, establish certain regularities and formulate some general rules. Here he is already deviating from the “raw material”; he is introducing an abstract system in as much as his types are idealized constructs, which actual humans only approximate. (Not to mention the fact that in the very acquiring of data he is already making use of some theoretical system.)

Our investigator may also arrive at conclusions concerning the correct ways in which humans should walk or sit in order to preserve healthy tissue, minimize unnecessary tension, etc. His research will establish certain norms; not only will it reveal how humans use their bodies, but also how they ought to. He may even conclude that most people do not maintain their bodies as they should. In this way the descriptive merges into the normative. And there is a feedback, for the normative may provide further concepts and further guidelines for the descriptive. Our investigator’s conclusions may be subject to debates, objections, or revisions. Here, as well, the descriptive and the normative are interlaced. Finally, it is possible that certain recommendations become highly influential, to the extent of being incorporated in the educational system. They would thus become part of the culture, determining, to an extent, the actual behaviour of humans, say, the way they hammer, or the kind of chairs they prefer.

All of these aspects exist when we are concerned with human thinking. Here, as well, the descriptive merges into the normative. Having started by investigating actual thinking, we might end by concluding how thinking ought to be done. Furthermore, the enterprise might influence actual thinking habits, projecting back on the very culture within which it was carried out.

First-Order Logic

The development of symbolic logic has had by now far-reaching consequences, which have deeply affected our philosophical outlook. Coupled with certain technological developments, it has also affected our culture in general. The basic system that the project has yielded is first-order logic. The name refers to a type of language, characterized, as stated in the first section, by a certain logical vocabulary. First-order logic serves also as the core of many modifications, restrictions and, most important, enlargements.

Although first-order logic is rather simple, all mathematical reasoning (derivations used in proving mathematical results) can be reproduced within it. Since it is completely defined by precise formal rules, first-order logic can itself be treated and investigated as a mathematical system. Mathematical logicians have done this, and they have been able to prove highly interesting theorems about it; for example, theorems that assert that certain statements are unprovable from such and such axioms. These and other theorems about the system are known as metatheorems; for they are not about numbers, equations, geometrical spaces, or algebras, but about the language in which theorems about numbers, equations, geometrical spaces, or algebras are proven. They enlighten us about its possibilities and limitations.

In philosophy, the development of symbolic logic has had far-reaching effects. The role of logic within the general philosophical inquiry has been a subject of debate. There is a wide spectrum of opinions, from those who accord it a central place, to those who restrict it to a specialized area. The subject’s importance varies with one’s interests and the flavour of one’s philosophy. In any case, logic is considered a basic subject, knowledge of which is required in most graduate and undergraduate philosophy programs.


The Wider Scope of Artificial Languages

Historically, the idea of a comprehensive formal language, defined by precise rules of mathematical nature, goes back to Leibniz (1646 – 1716). Leibniz thought of an arithmetical representation of concepts, and dreamed of a universal formal language within which all truth could be expressed. A similar vision was also before the eyes of Frege and Russell. The actual languages produced by logicians fall short of any kind of Leibnizian dream. This is no accident, for by now we have highly convincing reasons for discounting the possibility of such a universal language. The reasons have been provided by logic itself, in the form of certain metatheorems (Gödel’s incompleteness results).

As noted, there is by now a wide variety of logical systems, which express many aspects of reasoning and of conceptual organization. In the last forty years the enterprise of artificial languages has undergone a radical change due to the emergence of computer science. Computer scientists have developed scores of languages of types different from the types constructed by logicians. Their goal has not been the investigation of thought, but the description and the manipulation of computational activity. Computer languages serve to define the functioning of computers and to “communicate” with them, to “tell” a computer what to do. A major consideration that enters into the setting up of programming languages is that of efficiency: programs should be implementable in reasonable run time, on practical hardware. Usually, there is a trade-off between a program’s simplicity and conceptual clarity, on the one hand, and its efficiency on the other.

At the same time we have witnessed a marked convergence of some of the projects of programming languages and those of logic. For example, the programming language LISP (and its many variants) is closely connected with the logical system known as the λ-calculus, developed in the thirties by Church. The calculus and its variants have been the focus of a great amount of research by logicians and computer scientists. There has also been an important direct effect of symbolic logic on computer science. The clarity and simplicity of first-order logic suggested its use as a basis for a programming language. Ways were found to implement portions of first-order logic in an efficient way, which led to the development of what is known as logic programming. This, by now, is a vast area with hundreds, if not thousands, of researchers. Logic also enters, in an essential way, into other areas of computer science, in particular artificial intelligence and automated theorem proving.

The Goals and the Structure of the Course

The main purpose of the course is to teach FOL (first-order logic), to relate it to natural language (English) and to point out various philosophical problems that arise thereby.

The level is elementary, in as much as the course does not include proofs of the deeper results, such as Gödel’s completeness and incompleteness theorems. Nonetheless, the course aims at providing a good grasp of FOL. This includes an understanding of formal syntactic structures, an understanding of the semantics (that is, of the notion of an interpretation of the language and how it determines truth and falsity), the mastering of certain deductive and related techniques, and an ability to use the formal system in the analysis of English sentences and informal reasoning.

The first chapter, which is more of a general nature, is intended to clarify the conceptual presuppositions that underlie the project of classical logic: the category of declarative sentences and the classification of all declarative sentences into true and false. Various problems that arise when trying to apply this framework to natural language are discussed, among which are indexicality, ambiguity, vagueness and open texture. This introduction is also intended to put symbolic logic into a wider and more concrete perspective, removing from it any false aura of a given truth.

We get down to the actual course material in chapter 2, which provides a first view of sentential logic, based on a semantic-oriented approach. Here are defined the connectives, a variety of syntactic concepts (components, main connective, unique readability and others), truth-tables and the concept of logical equivalence. The chapter contains also various simplification techniques and an algebraic perspective on the system.

Having gotten a sufficient grasp of the formalism, we proceed in chapter 3 to match the formal setup with English. The chapter discusses, with the aid of many examples, ways of expressing in English the classical connectives, the extent to which English sentences can be rendered in sentential logic and what such an analysis reveals.

Chapter 4 treats logical implications and proofs. After defining (semantically) the concept of a logical implication, the chapter presents a very convenient method of deciding whether a purported implication, from a set of premises to a conclusion, is valid. The method combines the ideas of Gentzen’s calculus with a top-down derivation technique. If the implication is valid it yields a formal proof (which can be rewritten bottom-up); if not–it produces a counterexample, thereby establishing the non-validity. In the last section we return to natural language and consider possible recastings of various inferences carried out in English into a formal mode. Here we discuss also some concepts from the philosophy of language, such as implicature.

Chapter 5 provides some basic mathematical tools that are needed for the rigorous treatment of logical calculi, in particular, for defining interpretations (models), giving a truth-definition, and for setting up deductive systems. These tools consist of elementary notions of set theory, and the basic techniques of inductive definitions and proofs.

In chapter 6 the formal language of the sentential calculus is defined with full mathematical rigor, together with the concept of a deductive system. Here the crucial distinction between syntax and semantics is clarified and the relation between the two is established in terms of soundness and completeness.


Chapter 7 presents predicate logic (without quantifiers), based on a vocabulary of predicates and individual constants. The equality predicate is introduced and the top-down method for deciding logical implications is extended so as to include atomic sentences that are equalities. This chapter treats also predication in English. In the second half of that chapter the system is extended by the introduction of variables and steps are taken towards the introduction of quantifiers.

In chapter 8 the fully fledged language of first-order logic is defined, as well as the basic syntactic concepts that go with it: quantifier scope, free and bound variables, and legitimate substitutions of terms. Emphasis is placed on an intuitive understanding of what first-order formulas express and on translations from and into English.

The general concept of a model for a first-order language is presented in chapter 9, as well as the definitions of satisfaction and truth. Based on these we get the concepts of logical equivalence and logical implication. The top-down derivation technique of sentential logic is extended to first-order logic. As before, the method is guaranteed to yield a proof of any valid implication. A proof of this claim–which is not included in this chapter–yields immediately the completeness theorem.


Chapter 1

Declarative Sentences

1.0

Symbolic logic is concerned first and foremost with declarative sentences. These are sentences that purport to make factual statements. They are true if what they state is the case, and they are false–if it is not.

‘Grass is green,’ ‘Every prime number is odd,’ ‘Not every prime number is odd,’ ‘The moon is larger than the earth,’ ‘John Kennedy was not the first president of the USA to be assassinated,’ ‘Jack loves Jill, but wouldn’t admit it’,

are declarative sentences. The first, the third and the fifth are true. The second and the fourth are false. The last is true just in this case: (i) Jack loves Jill and (ii) Jack does not admit that he loves Jill.

You can see what distinguishes declarative sentences by comparing them with other types. Interrogative sentences, for example, are used to express questions:

‘Who deduced existence from thinking?’ ‘Did Homer write the Odyssey?’

Such sentences call for answers, which–depending on the kind of question–come in several forms; e.g., the first of the above questions calls for a name of a person, the second–for a ‘yes’ or a ‘no’.

Commands are expressed by means of imperative sentences, such as:

‘Love thy neighbour as thou lovest thyself,’ ‘Do not walk on the grass’.


Given in the appropriate circumstance, by someone with authority, they call for compliance.

None of these, or of the other kinds of sentence, is true or false in the same sense that a declarative sentence is. We can say of a question that it is to the point, important, interesting, and so on, or that it is irrelevant, misleading or ill-posed. A command can be justified, appropriate, or illegitimate, or out of place. But truth and falsity–in the basic, elementary sense of these terms–pertain to declarative sentences only. Sentences are used in many ways to achieve diverse purposes in human interaction. To question and to command are only two of a great variety of linguistic acts. We have requests, greetings, condolences, promises, oaths, and many others. What is then, within this picture of human interaction, the principal role of declarative sentences? It is–first and foremost–to convey information, to tell someone that such and such is the case, that a certain state of affairs obtains.

But over and above their use in human communication, declarative sentences constitute descriptions (or purported descriptions) of some reality: a reality perceived by humans, but perceived as existing in itself, independently of its being described. A logical investigation of declarative sentences can serve as a tool that clarifies the nature of that reality. By uncovering certain basic features of our thinking it may also uncover basic features of the world that the thinking organizes. One can appreciate already, at this stage, the potential that the logic of declarative sentences has for epistemology–the inquiry into the nature of knowledge, and for ontology–the inquiry into the nature of reality.

For this reason, when sentences are the target of a philosophical inquiry, the declarative ones play the most important role. Formal methods are not restricted to declarative sentences; formal systems have been designed for handling other types, such as questions and commands. But symbolic logic is mostly about declarative sentences, and it is with these that we shall be concerned here.

Henceforth, I shall use ‘sentence’ to refer to declarative sentences, unless indicated otherwise.

1.1 Truth-Values

1.1.0

A declarative sentence is true or false, according as to whether what it states is, or is not, the case. It is very convenient to introduce two abstract objects, TRUE and FALSE, and to mark the sentence’s being true by assigning to it the value TRUE, and its being false–by assigning to it the value FALSE. We refer to these objects as truth-values.

Truth-values are merely a technical device. They make it possible to use concise and clear formulations. One should not be mystified by these objects and one should not look for hidden meanings. To say that a sentence has the value TRUE is just another way of saying that it is true, and to say that it has the value FALSE is no more than saying that it is false. Any two objects can be chosen as TRUE and FALSE. For the only thing that matters about truth-values is their use as markers of truth and falsity.

Notation: We use ‘T’ and ‘F’ as abbreviations for ‘TRUE’ and ‘FALSE’.

While the introduction of truth-values is a technical convenience, the very possibility of classifying sentences into true and false is a substantial philosophical issue. Does every sentence fall under one of these categories? Little reflection will show that in our everyday discourse such a classification is, to a large extent, problematic. The problem is not one of knowing a sentence’s truth-value; we may not know whether Oswald was Kennedy’s only assassin, or whether 2³²+1 is a prime number, but we find no difficulty in appreciating the fact that, independently of our knowledge, ‘Oswald was Kennedy’s only assassin’ is either true or false, and so is ‘2³²+1 is prime’. The problem is that in many cases it is not clear what the conditions for truth and falsity are and whether the classification applies at all. Perhaps certain sentences should on various occasions be considered as neither true nor false; which means, in our terminology, that neither T nor F is their value.
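The point about 2³²+1 can be made vivid by machine: the sentence’s truth-value is fixed whether or not anyone has computed it. A short check by naive trial division (a sketch; the function name is mine) settles it:

```python
def smallest_factor(n):
    """Trial division: return the least factor > 1 of n (n itself if n is prime)."""
    i = 2
    while i * i <= n:
        if n % i == 0:
            return i
        i += 1
    return n

n = 2**32 + 1        # 4294967297
f = smallest_factor(n)
print(f == n)        # is n prime?  → False
print(f)             # → 641
```

As it happens, the value of ‘2³²+1 is prime’ is F: 2³²+1 = 641 × 6700417, a factorization already found by Euler. The value was F long before the computation was run.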

The logic we are going to study, which is classical two-valued logic, assumes bivalence: the principle that every sentence has one of the two values T or F. This principle makes for systems that are relatively simple and highly fruitful at the same time. Logicians have, of course, been aware of the problems surrounding the assignment of truth-values. But in order to get off the ground, an inquiry must start by focusing on some aspects, while others are ignored. Later it may be broadened so as to handle additional features and other situations. The art is to know what to focus on and what, initially, to ignore. Classical two-valued logic has been extremely successful in contexts where bivalence prevails. And it serves also as a point of reference for further investigations, where problems of missing truth-values can be addressed. In short, we are doing what every scientist does, when he starts with a deliberately idealized picture.

In the coming sections of this chapter I shall highlight the main situations where the assignment of definite truth-values is called into question. This will also be an occasion for discussing briefly some major topics regarding language: context-dependency, tokens and types, indexicals, ambiguity and vagueness.

1.1.1 Context Dependency

The same sentence may have different truth-values on different occasions of its use. Consider, for example:

Jack: I am tall,

Jill: I am tall.


If Jack is not tall, but Jill is, then–in Jack’s mouth–the sentence is false, but in Jill’s mouth it is true. This shows that we are dealing here with two kinds of things: the entity referred to as sentence, which is the same in the mouth of Jack and the mouth of Jill, and its different utterances. The distinction is fundamental; it, and some phenomena that hinge on it, will now be discussed.

1.1.2 Types and Tokens

Linguistic intercourse is based on the production of certain physical items: stretches of sounds, marks on paper, and their like, which are interpreted as words and sentences. Such items are called tokens. When you started to read this section you encountered a token of ‘linguistic’, which was part of a token of the opening sentence. And what you have just encountered is another token of ‘linguistic’, this time enclosed in inverted commas.

Of course, “token” is meaningful only in as much as it is a token of something: a word, a letter, a sentence, or–in general–some other, more abstract entity. This other entity is called type. By a sentence-token we mean a token of a sentence, that is, a token of a sentence-type.

Note that our terms ‘letter’, ‘word’, or ‘sentence’, are ambiguous. Sometimes they refer to types and sometimes to tokens. This is shown clearly in situations that involve counting. How many words are there on this page? The answer depends on whether you count repetitions of the same word. If you do, then you interpret “word” as word-token; if you don’t–you interpret it as word-type. Usually the number of word-tokens exceeds the number of word-types; for we do, as a rule, repeat.
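The two ways of counting can be illustrated in a few lines of code (the sample sentence is my own):

```python
from collections import Counter

page = "the cat saw the dog and the dog saw the cat"
tokens = page.split()   # word-tokens: one entry per occurrence
types = set(tokens)     # word-types: repetitions collapsed

print(len(tokens))      # → 11 word-tokens
print(len(types))       # → 5 word-types
print(Counter(tokens).most_common(1))  # → [('the', 4)]: we do, as a rule, repeat
```

The `Counter` records, for each word-type, how many times it is tokened on the “page”.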

Our ability to use language is preconditioned by our ability to recognize different tokens as being tokens of the same type. This “sameness” relation is often indicated by the physical similarity of tokens. Thus, the two tokens of ‘ability’ in the first sentence of this paragraph have exactly the same shape. But on the whole, what counts as being tokens of the same type is a matter of convention; similarity is not necessary. Think of the different fonts one can use for the same letters, and of the enormous variety of handwritings. (Reading someone’s written words is often impossible without knowing the language, even when the alphabet is known.) And to clinch the point, note that the same words are represented by tokens in different physical media: the acoustic and the visual.

Things would have been considerably simpler if we could disregard the difference between tokens of the same type. But this is not so; for, as the last example shows, different tokens of the same type may have different truth-values.

Indexicals and Demonstratives

An indexical is a word whose reference depends–in a systematic way–on certain surroundings of its token, e.g., the token’s origin, its time, or its place. Such is the pronoun ‘I’, which refers to its utterer, and such are the words ‘now’ and ‘here’, which refer to the utterance’s time and place. The shift of reference may result in a truth-value change. Indexicals are, indeed, the most common cause for assigning different truth-values to different tokens of the same sentence. In the last example the difference in truth-value is caused by the indexical ‘I’, which denotes Jack in the mouth of Jack, and Jill in the mouth of Jill. Quite often the indexicals are implicit. In

(1) It is raining,

the present tense indicates that the time is the time of the utterance. And, in the absence of an explicit place indication, the place is the place of the utterance. When (1) is uttered in New York, on May 17 1992 at 9:00 AM, it is equivalent to:

(1′) It is raining in New York, on May 17 1992 at 9:00 AM.

It is not difficult to spot indexicals, once you are aware of their possible existence. Besides ‘now’ and ‘here’, we have also the indexicals ‘yesterday’, ‘tomorrow’, ‘last week’, ‘next room’ and many others.

Demonstratives, like indexicals, have systematically determined token-dependent references. They usually require an accompanying demonstration–some non-linguistic act of pointing. Such are the words ‘that’ and ‘this’. The use of ‘you’ involves a demonstrative element (the act of addressing somebody), as do sometimes ‘he’ and ‘she’. (It is not always easy to describe what exactly the demonstration is, but this is another matter.) Sometimes a distinction is made between pure indexicals–which, like ‘I’, require no demonstration–and non-pure ones. And sometimes ‘indexical’ is used for both indexicals and demonstratives.

Some Kinds of Ambiguity

Many, perhaps most, proper names denote different objects on different occasions. ‘Roosevelt’ can mean either the first or the second USA president of this name, ‘Dewey’ can refer either to the philosopher or to the Republican politician, ‘Tolstoy’ can refer to any of several Russian writers. First and second names, or initials, can help in avoiding confusion (thus, we distinguish between Teddy Roosevelt–the man who was fond of speaking softly while carrying a big stick, and Franklin Roosevelt–the second world war leader in the wheelchair). Additional names reduce the ambiguity, but need not eliminate it. A glance in the telephone directory under ‘Smith’, or–in New York–under ‘Cohen’, will show this. Other distinguishing marks can be used: ‘Dewey the philosopher’ versus ‘Dewey the politician’, ‘Johann Strauss the father’ versus ‘Johann Strauss the son’.

Above all, a name’s denotation is determined by the context in which the name is used. (If I ask my daughter: has Bill telephoned? it is unlikely that she will take me to have referred to Bill Clinton.) But there are no clear-cut linguistic rules that regulate this. Various factors enter: what has been stated before, the topic of the discussion, and what is known of the interlocutor’s knowledge and intentions. Proper names behave quite differently from indexicals; the latter are subject to systematic rules (‘you’ refers to the person addressed, ‘now’ refers to the time of the utterance, etc.), the former are not.

Besides indexicals and proper names, linguistic expressions in general may have different denotations, or meanings, on different occasions. The “same word” might mean different things, e.g., tank–a large container for storage, and tank–an armored vehicle on caterpillar treads. But here we should be careful, for the very difference of meaning is often taken to constitute a difference of words (i.e., of types). Homonyms are different words written and pronounced in the same way; their difference rests solely on difference in meaning. When ‘tank’ is split into homonyms, it is no longer a single ambiguous word. Accordingly,

(2) John jumped into the tank,

is, strictly speaking, not an ambiguous sentence (which has different truth-values on different occasions) but an ambiguous expression that can be read as more than one sentence: a sentence containing the ‘tank-as-container’ homonym, and a sentence containing the ‘tank-as-armored-vehicle’ homonym. The context in which (2) occurs (e.g., sentences that come before and after it) may help us to decide the intended reading.

By contrast, different tokens of ‘It is now raining here’ are tokens of the same sentence. For ‘now’ and ‘here’ do not constitute different words when used at different times, or at different places. A child learning to speak does not coin a new English word when he uses ‘I’ for the first time. We can however say that the English language gained a new word when ‘tank’ (already in use as a name of certain containers) was introduced as a name of certain armored cars.¹ Many cases of ambiguity–where the meanings are linked–do not deserve to be treated as homonyms. ‘Word’ can mean word-type or word-token, but this does not constitute sufficient ground for distinguishing two homonyms. We would do better, one feels, to regard it as a single ambiguous word.

Ambiguous terms are not the only source of sentential ambiguity; often the sentential structure itself can be construed in more than one way.

(3) Taking the money out of his wallet, he put it on the table.

¹ By the same reasoning, no new word is coined when a new baby is given a current name, like ‘Henry’. But we did get a new homonym when the ninth planet was named ‘Pluto’.


Was it the money or the wallet he put on the table? That depends on the syntactic structure of (3); it is the first, if ‘it’ goes proxy for ‘the money’, the second–if it goes proxy for ‘his wallet’. Syntactic ambiguity takes place when the same sequence of words lends itself to different structural interpretations. The truth-value can depend on the way we structure the sentence, or–in more technical terminology–on the way we parse it. Here, again, the context can decide the intended parsing.

We can have a concept of “sentence” according to which different parsings determine different sentences; if so, (3) is to be regarded in the same light as (2): an expression representing more than one sentence. But on the usual, everyday concept of sentence, (3) is a single syntactically ambiguous sentence.

In symbolic logic the artificial language is set up in a way that bars any ambiguity. Every sentence has a unique syntactic structure and all referring terms have unique, context-independent references. Therefore a translation from natural language into symbolic logic involves an interpretation whereby, in cases of ambiguity, a particular reading is chosen. As a preparatory step, we can try to paraphrase the sentences of natural language, so as to eliminate various context dependencies. This is the subject of the next subsection.

Eliminating Simple Context Dependencies

Dependencies on context, which are caused by indexicals or by ambiguity, can be eliminated by replacing indexicals and ambiguous terms by terms that have unique and fixed denotations throughout the discussion.

For example, each occurrence of ‘word’ can be replaced by ‘word-type’ or by ‘word-token’, depending on whether the first or the second is meant; and when either will do, we can make this explicit by writing ‘word-type or word-token’. Sometimes we resort to new nicknames: ‘The first John’ for our old school mate, ‘The second John’ for the new department chief. And, to be clear and succinct, we can introduce ‘John1’ and ‘John2’. The same policy can be used to eliminate homonyms. To be sure, ‘John1’ is not an English name, but a newly coined word. Our aim, however, is not to preserve the original phrasings, but to recast them into forms more suitable for logical analysis.

Indexicals can be eliminated by using names or descriptive phrases with fixed denotations. Thus (1)–when uttered in New York at 9:00 AM, May 17 1992–is rephrased as (1′). And ‘I am tall’–when uttered by Jill on May 10 1992–is recast as:

(4) Jill is tall on May 10 1992.

Here ‘is’ is to be interpreted in a timeless mode, something like is/was/will-be. Note the different degrees of precision in the specifications of time. The weather may change from hour to hour (hence we have ‘9:00 AM’ in (1′)), but presumably Jill’s being tall is not subject to hourly changes.

In this way sentence-tokens that involve context-dependency are translated into what Quine named eternal sentences, that is: sentences whose truth-values do not depend on time, location, or other contextual elements.

Note: We are not concerned here with a conceptual elimination of indexicals. The time scale used in (1′) and (4) is defined by referring to the planet earth, and ‘earth’ is defined by a demonstrative: this planet, or the planet we are now on. We aim only to eliminate context dependency that can cause trouble in logical analysis. And this is achieved by paraphrases of the kind just given.

Note also that, for local purposes, if we are concerned only with a particular discourse, we have only to replace the terms whose denotations vary within that discourse. If ‘today’ refers to the same day in all sentence-tokens that are relevant to our purpose, we need not replace it.
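The paraphrase strategy just described can be pictured as template-filling: the indexical sentence leaves slots that the context of utterance fills with fixed designations. A toy sketch (the slot names and the `eternalize` helper are invented for illustration; verb agreement and the timeless ‘is’ are glossed over):

```python
def eternalize(template, **context):
    """Fill a sentence template's indexical slots with fixed designations,
    yielding an 'eternal' paraphrase whose truth-value no longer depends
    on who utters it, or where, or when."""
    return template.format(**context)

# Jill's token of 'I am tall', eternalized as in example (4):
print(eternalize("{speaker} is tall on {time}",
                 speaker="Jill", time="May 10 1992"))
# → Jill is tall on May 10 1992
```

Different tokens of the same indexical sentence correspond to different ways of filling the slots; the eternal paraphrases, unlike the tokens, can be handled by a logic that assigns one fixed truth-value per sentence.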

The situation is altogether different when it comes to ambiguities in general. If my daughter tells me ‘Bill telephoned an hour ago’, I shall probably guess correctly which of the various Bills it was. But all I can appeal to is an assortment of considerations: the Bill I was expecting a call from, the Bill likely to call at that time, the Bill that has recently figured in our social life, etc. Considerations of this kind are classified in the philosophy of language under pragmatics.

The resort to pragmatics, rather than to clear-cut rules, is of great interest for linguistic theory and the philosophy of language; but it is of no concern for logic, at least not the logic that is our present subject. For our purposes, it is enough that there is a paraphrase that eliminates context-dependency. Logic takes it up from there. How we get there is another concern.

The cases considered thus far are the tip of the iceberg. The real game of ambiguity and context-dependency starts when adjectives, descriptive phrases, and verbs are brought into the picture. This subject–a wide area of linguistic and philosophical investigations–is not part of this course. A few observations may however give us some idea of the extent of the problems. Consider attributes such as

small, big, heavy, dark, high, fast, slow, rich,

and their like. You don’t need much reflection to realize that they are relative and highly context-dependent.

(5) Lionel is big.

(6) Kitty is small.


You may deduce from (5) and (6) that Lionel is bigger than Kitty. Not so if it is known that Lionel is a cat and Kitty is a lioness. In that case the ‘big’ in (5) should read: ‘a big cat’, or ‘big as cats go’, and the ‘small’ in (6)–as ‘a small lioness’. If we apply the strategy suggested above for ambiguous names, we shall split ‘big’ and ‘small’ into many adjectives, say ‘big_x’ and ‘small_x’, where ‘x’ indicates some kind of objects; the ‘big’ in (5) is thus read as ‘big_c’: big on the scale of cats, and the ‘small’ in (6)–as ‘small_l’: small on the scale of lions. Another, better strategy is to provide for a systematic treatment of compounds such as ‘big as a ...’, ‘rich as a ...’, where ‘...’ describes some (natural) class.
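The second strategy, treating ‘big as a ...’ as a relation between an individual and a comparison class, can be caricatured in code. The sizes and the crude above-average criterion below are invented for illustration; the real criterion is, of course, vaguer:

```python
def big_for(size, comparison_class):
    """A crude stand-in for 'big as a ...': above the class average."""
    avg = sum(comparison_class) / len(comparison_class)
    return size > avg

cat_sizes = [3.5, 4.0, 4.5, 5.0]     # kg, made-up figures
lion_sizes = [120.0, 150.0, 180.0]   # kg, made-up figures

print(big_for(6.0, cat_sizes))       # Lionel, big on the cat scale → True
print(big_for(110.0, lion_sizes))    # Kitty, small on the lion scale → False
```

Note that the 110 kg “small” lioness is still far bigger than the 6 kg “big” cat, which is exactly why the inference from (5) and (6) fails: ‘big’ is evaluated against a comparison class, not absolutely.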

Systematic treatments do not apply, however, when the adjective must be interpreted by referring to a particular occasion. ‘The trunk is heavy’ can mean that the trunk is heavy when I do the lifting, or when you do the lifting, or when both of us do the lifting. And occasionally there is nothing precise or explicit that we can fall back on.

(7) Jack Havenhearst lives in a high building on the outskirts of Toronto.

How high is “high”? A high building in Jerusalem is not so high in Manhattan. The context may decide it, or it may not. Perhaps the speaker has derived his statement from some vague recollection. In cases like this, when ambiguity is tied up with vagueness, the very possession of a definite truth-value is put into question.

Before proceeding, note that the problems just mentioned concern attributes of the “neutral” kind. We have not touched on evaluative terms such as

‘beautiful’, ‘ugly’, ‘tasteful’, ‘repulsive’, ‘nice’, ‘sexy’, ‘attractive’,

and their like, which involve additional subjective dimensions, nor on:

‘important’, ‘significant’, ‘marginal’, ‘central’,

not to mention the ubiquitous ‘good’ and ‘bad’.

1.1.3 Vagueness and Open Texture

Some people are definitely bald, some are definitely not. But some are borderline cases, for whom the question: Is he bald? does not seem to have a yes-or-no answer. The same applies to every type of statement you might think of in the context of everyday discourse. For example, is it raining now? Sometimes the answer is yes, sometimes no, and sometimes neither appears satisfactory (does the present very light drizzle qualify as “rain”?)

Are we now in the USA? That type of question has almost always a well-defined answer, even when we don’t know it; for international borders–things of extreme significance–are very carefully drawn. But what if somebody happens to straddle the border-line? There is first the problem of pinpointing one’s location, and second the problem of pinpointing the border; and in both the “pinpointing” has limited precision. Even the question: Is the date now May 17 1992? may, on some occasion, lack a yes-or-no answer; for the time-point defined by the utterance of ‘now’ is determined with no more than a certain precision, surely not up to a millisecond, say.

In everyday discourse we often handle borderline cases by employing a more refined classification. For example, we can use ‘quite bald’ and ‘hairy’ for the clear cases, and ‘baldish’ for those in between. This provides for more accurate descriptions. But it leaves us in the same situation when it comes to drawing the line between bald (in the old sense) and non-bald. And if we were to ban our original adjective, allowing only the refined ones, there would still be borderline cases for each of the new attributes.

Cases in which neither T nor F is to be assigned are characterized as truth-value gaps, or for short, as gaps. The cases considered before–those of indexicals and ambiguous terms–are not genuine gaps, in as much as they can be resolved by removing the ambiguity. In cases of vagueness the gaps are for real (or so many philosophers think).

Vagueness inheres in our very conceptual fabric. It does not arise because we are missing some facts. Knowing all there is to know about the hairs on Mr. Hairfew’s head: their number, distribution and length, may not determine whether he is bald or not. There is no point to insisting on a yes-or-no answer. The concept is simply not intended for cases like his. If you think of it you will see that the phenomenon is all around. Only mathematics is exempt, and some theoretical parts of exact science. It appears whenever an empirical element is present.²

Vagueness has been often regarded as a flaw, something to get rid of–if possible. But it has a vital role in reducing the amount of processed information. In principle, we could–instead of using ‘young’, ‘rich’, ‘bald’, and their like–use descriptions that tell us one’s exact age, one’s financial assets to the last penny, or one’s precise amount of cranial hair. All of which would involve colossal waste of valuable resources. For in most cases a two-fold classification into young and not-young, rich and not-rich, bald and not-bald, will do. And additional information can be obtained if and when needed. The efficiency thereby achieved is worth the price of borderline cases with truth-value gaps.

A deeper reason for vagueness is that every conceptual framework gives us only a limited purchase on “reality” or “the facts”. There is always a place for surprise, for something turning up that resists classification, something that defies our neatly arranged scheme.

The examples considered so far are relatively simple borderline cases. In these situations a certain classification does not apply, yet an alternative exhaustive description is available.

² It is not a priori impossible that some experiment will turn up a particle for which the question: is it an electron? has no clear-cut answer. The theory rules this out; but the theory may change, for its authority is established by empirical criteria. Here however we are confronted with open texture rather than with simple vagueness.


There is no mystery about the financial status of Ms. Richfield. In principle a full list of her assets–calculated to the last cent–can be drawn. The difficulty in deciding whether she is rich stems solely from the vagueness of ‘rich’. But there are situations where no alternative description is available, situations that involve more than occasional borderline cases.

(8) Jeremy, the chimpanzee, knows that Jill will feed him soon.

Can we say that a monkey “knows” that something is going to happen in the near future? Granting the way we apply ‘know’ to people (itself a knotty issue and a subject of a vast philosophical literature), can we apply it, in some instances, to animals? All we can do is speculate on the monkey’s mode of consciousness, dispositions, state of mind or state of brain. And it is not even clear what factors are relevant for deciding the status of (8). Surely there will be conflicting opinions. Cases of this kind display the undecidedness of our conceptual apparatus, the fact that it is open-ended and may evolve in more than one way. They are known as open texture. As the example above shows, open texture involves quite common concepts. Think of generosity, freedom, or sanity.

Vagueness of Generality

General statements convey quantitative information regarding some class (or multitude) of objects. They are usually expressed by words such as ‘all’, ‘every’, ‘some’, ‘most’, and their kin. For example:

All human beings have kidneys and lungs.

In classical logic generality is expressed by quantifiers, which have precise unambiguous interpretations. But in natural language the intended extent of generality is often ambiguous, as well as vague. ‘Everyone’ can cover many ranges, from one’s set of acquaintances to every human on earth. Consider, for example:

(9) Everyone knows that Reagan used to consult astrologers.

(10) Everyone wants to be rich and famous.

(11) Everyone will sometime die.

Only in (11) can we interpret ‘everyone’ as meaning every human being–the way it is construed in symbolic logic. In (9) and in (10) the intended interpretation is obviously different. In (9) ‘everyone’ refers to a very small minority: people who are knowledgeable about Reagan. (9) is just another way of saying that the item in question had some publicity. (10) covers a wider range than (9), but falls short of the generality of (11). Even when the range covered by ‘everyone’ or ‘everything’ is explicit, the strength of the assertion can vary. For example, when the teacher asserts

(12) Everyone in class passed the test,

she will be taken literally; her assertion would be misleading even if a single student had failed. But a casual remark:

(13) Everyone in college is looking forward to the holidays season,

means only that a large majority does; it would not be considered false on the ground of a few exceptions. How large is the required majority? This is vague.

Such phenomena are even more pronounced when the general statement–usually expressed by means of the indefinite plural–is intended to express a law, or a rule. For rules may have exceptions. (And the exceptions to this last rule are in mathematics, or some of the exact sciences, or in statements like (11).) The number of tolerated exceptions is vague. Consider, for example:

(14) Women live longer than men,

(15) When squirrels grow heavy furs in the autumn, the winters are colder,

(16) Birds fly.

Statistical data (e.g., average life span) can be cited in support of (14); but under what conditions is the sentence true? This is vague. (15) sums up a general impression of past experience; presumably, statistics can be invoked here as well. (16), on the other hand, is better viewed as a rule that determines the “normal” case: If something is known to be a bird, then–in the absence of other relevant information–presume that it flies.

The general principles concerning ambiguity and vagueness apply also here. We may have to give up precise systematic prescriptions and settle for pragmatic guidelines. And we should accept the possibility of borderline cases, where the assignment of any truth-value is rather arbitrary.

Cases of the types just given can, of course, be handled by using mathematical-like systems. (14) and (15) call for statistical analysis, with all the criteria that go with it. (16), on the other hand, indicates reasoning based on normalcy assumptions, where one’s conclusions are retracted, if additional information shows the case to be atypical (in the relevant way). Little reflection is needed to see that almost all our decision making involves reasoning of that kind. With no information to the contrary, the usual order of things is presupposed. To do otherwise would freeze all deliberate action. In recent years a great deal of research, by computer scientists, logicians and philosophers, has been devoted to systems within which reasoning that involves retractions can be expressed. They come under the general term of non-monotone logic.

1.1.4 Other Causes of Truth-Value Gaps

Non-Denoting Terms

Declarative sentences may contain descriptive expressions that function as names but lack denotations. The standard, by now worn-out example is from Russell:

(17) The present king of France is bald.

(It is assumed that (17) is uttered at a time when there is no king of France. If needed the time-indexical can be eliminated by introducing a suitable date.) Proper names, as well, may lack denotation: ‘Pegasus’, or ‘Vulcan’–either the name of the Roman god, or of the non-existent planet.3

Frege held that declarative sentences containing non-denoting terms have no truth-value. This view was later adopted, for different reasons, by Strawson. Russell, on the other hand, proposed a rephrasing by which these sentences get truth-values; (17), for example, is reconstructed as:

(17′) There is a unique person who is a King of France, and whoever is a King of France is bald.

Therefore the sentence is false. Also false, by Russell’s reconstruction, is:

(18) The present king of France is not bald.

But

(19) It is not the case that the king of France is bald.

is true. (The difference between (18) and (19) is accounted for by giving the negation different scopes with respect to the descriptive term ‘the king of France’–a point that we shall not discuss here.)

3‘Neptune’, ‘Pluto’, and ‘Vulcan’ were introduced as names of planets whose existence was deduced on theoretical grounds from the observed movements of other planets. Neptune and Pluto were later observed directly. ‘Vulcan’ did not make it. The effects attributed to Vulcan were later explained by relativity theory.


As far as logic is concerned the question is more or less settled–not by a verdict in favour of one of the views, but by having the issue sufficiently clarified, so as to reduce it to a choice between well understood alternatives. It boils down to what one considers as fitting our linguistic usage better. Intuitions may vary. Nonetheless, the different resulting systems are variants within the general framework of classical logic.

Category Mistakes

In the usual order of things, almost every attribute and every relation is associated with a certain type of objects. When the objects do not fit the attribute, we get strange, though grammatically correct, sentences; for example,

(20) The number 3 is thirsty.

This is a category mistake; numbers are not the kind of things that are thirsty, or non-thirsty. Some may want to treat (20) as neither true nor false. Alternatively, (20) and its kin can be regarded as false. This policy can be extended so as to handle negations and other compounds. As in the case of non-denoting terms, the ways of dealing with such examples are well-understood and can be accommodated, as variants, within the general framework of classical logic. Non-denoting terms and category mistakes, interesting as they are when it comes to working out the details, do not pose a foundational challenge to the framework of classical logic. But vagueness and open texture do.

1.2 Some Other Uses of Declarative Sentences

Declarative sentences have other uses, besides that of conveying information, or describing the world. I do not mean their misuse, through lying, or by misleading. Such misuses are direct derivatives of their ordinary use. I mean uses that are altogether different. They have been extensively studied by philosophers and linguists, and are worth noting for the sake of completeness and in order to give us a wider perspective.

Fictional Contexts

In a play, or in a movie, the players utter declarative sentences, much as people in “real life” do; but what goes on is obviously different. Compare, for example, an exclamation of ‘Fire!’ that is part of the play, with a similar exclamation, by the same actor in the same episode, when he observes a real fire breaking out in the hall. We say that the utterances in the play are not true assertions, or are not performed in an assertive mode. They are pretended assertions within a make-believe game.


Yet, within the game, they are subject to the same logic that applies to ordinary statements. Furthermore, truth-values can be meaningfully assigned to certain statements about fictional characters. ‘Hamlet killed Polonius, and was not sorry about it’ will be regarded as true, while ‘Hamlet intended to kill Polonius’ will be regarded as false. This merely reflects what is found in the play. The pretense reaches its limits easily: ‘Hamlet had blood type A’ is neither true nor false, or–if we adopt Russell’s method–false by some legislation in logic. Consider, for contrast, ‘Shakespeare had blood type A’, which has a determinate truth-value; even though we do not, and probably never will, know what this value is. No logical legislation can settle this.

The declarative sentences that appear in novels, poetry, or jokes, achieve a variety of effects: they can amuse, entertain, evoke an aesthetic experience, a feel or a vision. Some can enlighten us, but not in the way that ‘The earth turns around the sun’ does.

Metaphors, Similes, and Aphorisms

Consider the following.

Skepticism is the chastity of the intellect. Santayana

To deny, to believe, and to doubt well, are to a man as a race is to a horse. Pascal

Those who can–do, those who cannot–teach. Shaw

Taken literally, the first is trivially false, or a category mistake (skepticism is not the chastity of something and ‘chastity’ does not apply to intellects). The second is trivially true or trivially false–depending on whether the claimed likeness is indefinitely wide (any two things are alike in some respect) or narrow and precise. The third–as a plain general statement–is false on any of the usual criteria. Evidently, the points of the sayings have little to do with their literally determined truth-values.

Many have found in metaphors (of which the first is an example) hidden meanings, which can be approximated–though not captured–by non-metaphorical rephrasings. Others have argued that the value of metaphor–what is transmitted, or evoked–is outside the scope of linguistic meaning. And yet a metaphor can be misleading in a way that a joke, or a poem cannot. The same can be said of similes, which achieve their effect through a somewhat different mechanism. Finally there are sayings like the third, which are not to be evaluated literally, but are neither metaphors nor similes. Their point is to underline some noteworthy feature, to focus our attention on a certain pattern.

––––––


To sum up: in this chapter a brief overview was given of declarative sentences, their basic role in conveying factual information, the ways they function in natural language and the problems of assigning them truth-values. We have also noted some other uses of declarative sentences. To most of this we shall not return. But you should be aware of the wider picture and of the perspective within which logic has been, and still is being, developed.

We shall often emphasize the relations between symbolic logic and natural language. One should remember, however, that symbolic logic is not concerned with language per se. Its aim is not the discovery of linguistic structures, or laws; that is the job of the linguist. Logic and language are closely related because in symbolic logic we try, following linguistic guidelines, to express in a precise, structured way some of the things expressed in natural language. Many aspects of linguistic usage are not representable in a system of logic. Even with respect to conveying factual information, a statement will often resist fruitful formalization, either because it is too vague or confused, or because it is too complex or subtle.


Chapter 2

Sentential Logic: Some Basic Concepts and Techniques

2.0

English sentences are usually made from nouns, verbs, adjectives, etc. But sometimes the smaller components are themselves sentences. For example:

(1) Jack went to the movie and Jill went home

is a compound made of two sentences: ‘Jack went to the movie’, and ‘Jill went home’, joined by the word ‘and’. We can also combine sentences into bigger sentences by ‘or’:

(2) Jack will get a job or Jill will get a job.

In principle, any two sentences can be combined, using ‘and’ or ‘or’. (Recall that, from now on, ‘sentence’ means a declarative sentence.)

We can also make a sentence from a single sentence by negating it:

(3) Jack did not graduate last year

can be seen as the negation of:

(4) Jack graduated last year.


Strictly speaking, (4) is not a part of (3); not in the same way that ‘Jack went to the movie’ is a part of (1). The forming of negation involves insertions and possibly additional changes, and it varies from language to language. In English and French we usually use an auxiliary verb (‘do’ or ‘is’–in English, ‘avoir’ or ‘être’–in French), in Hebrew we do not. In English the auxiliary is succeeded by ‘not’, in French it is placed between ‘ne’ and ‘pas’. Grammatical details like these are abstracted away when we set up the system of symbolic logic. What is essential is that every sentence can be negated.

Natural languages provide more than one way of forming negations. Instead of (3), we can use the following as a negation of (4):

(3′) It is not the case that Jack graduated last year.

(Here, indeed, (4) appears as a part of the negated sentence.)

Sentential logic is concerned exclusively with the making of sentences from sentences–in ways analogous to the ones just illustrated. Predicate logic, to which we shall come later, provides, in addition to the apparatus of sentential logic, a finer analysis–whereby sentences are made from parts analogous to nouns and verbs.

The sentences of the system we are going to study do not belong to any natural language. But the system is meant to bring forth patterns that underlie language in general, inasmuch as language expresses logical thinking. For this purpose we posit abstract entities, in the role of sentences, and we postulate certain properties; just as in geometry we presuppose that the points, lines and planes satisfy certain axioms.

Sentential logic includes certain operations, by which sentences can be combined into sentences. We shall refer to these operations as sentential connectives, or for short connectives. One connective, called conjunction, corresponds to the operation effected in English by using ‘and’ (as in (1)). Another connective corresponds to the operation of combining two sentences by using ‘or’ (as in (2)). For the moment we leave unspecified the exact nature of the sentences. We only assume that they are given together with the connective operations, and that they satisfy certain properties, which we shall state as we go along.

Note: ‘Connective’ suggests the joining of more than one sentence, but it also covers operations on single sentences, such as negation. A connective is called binary if it operates on pairs of sentences. It is called monadic if it operates on single sentences.

The system to be studied here is based on one monadic connective: negation, and on several binary ones. In principle one can consider connectives that operate on more than two sentences. But we shall not require them, because we shall be able to express whatever is needed by repeated applications of the connectives we have.1

1For each connective, the number of sentences it combines is fixed. One can, however, generalize the notion of a connective, by allowing connectives that can combine–at one go–a variable number of sentences.


A sentence obtained by applying connectives to sentences (possibly, more than once) is called a sentential compound. The sentences to which the connectives are applied are known as its sentential components, or components for short. Connectives can be applied repeatedly, yielding larger and larger compounds.

The semantic aspect, i.e., the aspect of truth and falsity, is represented in sentential logic by assignments of truth-values. The truth-value of a compound is determined by the truth-values of the components and by the connective that has been applied.

2.1 Sentential Variables, Connectives and Truth-Tables, Atomic Sentences

2.1.0

We shall use

‘A’, ‘B’, ‘C’, ‘D’, ..., ‘A′’, ‘B′’, ‘D′’, ... etc.

as sentential schematic letters, i.e., as signs that stand for any sentences. A general claim is a claim that holds for all sentences that the schematic letters can stand for. We also call them, conveniently, sentential variables; that is, variables ranging over arbitrary sentences.2

Learning sentential logic is analogous to learning geometry, arithmetic, or algebra. Initially, we do not define what points, lines and planes are, or what numbers are; we rely on some intuitive understanding, and we lay down certain laws. Just as in algebra we have numerical operations: addition and multiplication denoted by ‘+’ and by ‘·’, we have in sentential logic the sentential connectives; for example, conjunction, which we shall denote by ‘∧’. And just as in algebra we speak of the numbers x, y, z, where ‘x’, ‘y’, ‘z’, are variables ranging over numbers, so in logic we speak of the sentences A, B, C. In algebra we say:

For every two numbers, x and y, there exists a number x · y, which is their product,

And in sentential logic we say:

For every two sentences, A and B, there exists a sentence, A ∧ B, which is their conjunction.

2It is customary to regard variables as taking values in some domain. We can thus regard sentential variables as having sentences as their possible values. But, as we shall see, the sentences themselves have truth-values, T and F. Accordingly we shall speak of the truth-value of A, the truth-value of B, etc.


Sentences are, however, more sensitive than numbers to the ways in which the operations are applied. In algebra we have the general equality x·y = y·x. But A ∧ B is, in general, different from B ∧ A. Indeed, ‘Jack went to the movie and Jill went home’ is not the same sentence as ‘Jill went home and Jack went to the movie’. We shall later see that A ∧ B and B ∧ A are logically equivalent; yet they are different sentences, unless A and B are the same sentence.

Sentential expressions are either sentential variables, or expressions obtained by combining sentential variables (in the appropriate way) with connective signs. For example:

‘A ∧ B’

is a sentential expression denoting the conjunction of A and B.

2.1.1 Negation

Negation is an operation on sentences. It is monadic, i.e., it applies to single sentences. For every sentence, A, there is a sentence, called the negation of A. We shall use ‘¬’ as the name for negation and we shall write the negation of a sentence A as:

¬A

A negation of a sentence is referred to, for short, as ‘negation’. There is therefore an ambiguity in the use of ‘negation’: it can refer to the operation itself, or to a sentence that results from this operation. The intended meaning will be clear from the context.

Note that in stating the rule for negation we have used ‘A’ to stand for an arbitrary sentence. We could have used, of course, any other sentential variable, e.g., ‘B’.

Since ¬A is a sentence, there exists also a sentence which is the negation of ¬A, namely:

¬¬A

We can continue in the same way and get, for any sentence A, the sentences:

¬A, ¬¬A, ¬¬¬A, ¬¬¬¬A, . . . , ad infinitum.

You can compare negation with the algebraic operation of forming the negative: for every number x we have the negative of x, denoted as ‘−x’. The double negative is equal to the original number, −(−x) = x, but when it comes to sentences the situation is different: ¬¬A is (we shall see later) logically equivalent to A, but they are different sentences. In fact, all the sentences in the above-written list are different.
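The distinction between a sentence and its truth-value can be made concrete in a small sketch (an illustration of ours, not part of the text): sentences are modelled as nested tuples, and negation builds a new syntactic object, so ¬¬A and A differ as sentences even when their truth-values agree.

```python
# Sentences as nested tuples; negation is a syntactic operation.
# (An illustrative sketch; the text treats sentences abstractly.)

def neg(s):
    """Form the negation of sentence s (a new syntactic object)."""
    return ('¬', s)

def value(s, assignment):
    """Truth-value of s under an assignment of values to the atoms."""
    if isinstance(s, tuple) and s[0] == '¬':
        return not value(s[1], assignment)
    return assignment[s]          # s is an atom

A = 'A1'                          # an atomic sentence
v = {'A1': True}

print(neg(neg(A)) == A)                        # False: ¬¬A and A differ as sentences
print(value(neg(neg(A)), v) == value(A, v))    # True: their truth-values agree
```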

The truth-value of the negation of a sentence is determined by the following simple semantic law:


If the value of A is T, then the value of ¬A is F; if the value of A is F, then the value of ¬A is T.

Regarding T and F as opposite values we can say that the value of ¬A is the opposite of the value of A: the effect of negation is to toggle (i.e., reverse) the truth-value.

2.1.2 Conjunction

For every two sentences A and B there is a sentence called the conjunction of A and B, which is written as:

A ∧ B

We say that A and B are the conjuncts of A ∧ B.

Again, there is an ambiguity: ‘conjunction’ denotes the operation, and is also used to refer to the resulting sentence. (Note that in algebra we have two names: the result of applying addition to x and y is the sum of x and y; and the result of applying multiplication is the product.)

The truth-value of the conjunction of two sentences is determined by the following rule:

If both A and B have the value T, then A ∧ B has the value T. In every other case, A ∧ B has the value F. (Note that “every other case” covers here three cases: A gets T and B gets F, A gets F and B gets T, A gets F and B gets F.)
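The two semantic rules can be written out as truth functions, with T and F modelled by Python's True and False (a sketch of ours, not the book's notation):

```python
def neg_value(a):
    """Negation toggles the truth-value."""
    return not a

def conj_value(a, b):
    """A conjunction gets T exactly when both conjuncts get T."""
    return a and b

# The rule for conjunction, case by case:
assert conj_value(True, True) is True
assert conj_value(True, False) is False
assert conj_value(False, True) is False
assert conj_value(False, False) is False
```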

Repeated Applications and Grouping

By applying connectives to sentences we get sentences, to which we can again apply connectives–getting further sentences, and so on. We can form, for example, the negation of B and then we can form the conjunction of A with it:

A ∧ ¬B .

Note that the similar expression:

¬A ∧ B

can be interpreted in two ways:

(i) The conjunction of ¬A and B: (¬A) ∧ B

(ii) The negation of A ∧ B: ¬(A ∧ B)


(i) and (ii) are different sentences, which can, moreover, differ in truth-value: if B gets F, then you can easily verify that, independently of the value of A, (¬A) ∧ B gets F; but ¬(A ∧ B) gets T.
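The verification can be done mechanically (our illustration, using Python booleans for T and F): with B fixed to F, the two readings disagree whatever value A gets.

```python
for a in (True, False):           # the value of A does not matter
    b = False                     # B gets F
    reading_i = (not a) and b     # (¬A) ∧ B
    reading_ii = not (a and b)    # ¬(A ∧ B)
    assert reading_i is False     # (i) gets F regardless of A
    assert reading_ii is True     # (ii) gets T regardless of A
```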

In (¬A) ∧ B the scope of the negation is A; this is the sentence on which negation operates within the compound. In ¬(A ∧ B) the scope of the negation is A ∧ B.

The notion of scope applies also to conjunction, as well as to the other connectives we shall later introduce. An occurrence of a conjunction has a left scope and a right scope. In (¬A) ∧ B the left scope of the conjunction is ¬A; in ¬(A ∧ B) it is A. In both, the right scope is B.

Again, an analogy with algebra can clarify things:

(−3) + 4 ≠ −(3 + 4) and (3 · 4) + 5 ≠ 3 · (4 + 5).

Parentheses are to be used, whenever needed, to determine the way of reading the expression. They figure among the symbols from which sentential expressions are constructed. For convenience of reading, we shall use square and curly brackets:

¬[(A ∧B) ∧ ¬(C ∧D)]

Regard them as parentheses written in a different way.

Grouping Conventions

In algebra there are standard notational conventions that allow us to suppress parentheses: (i) ‘−3 + 4’ is read as ‘(−3) + 4’, not as ‘−(3 + 4)’. (ii) ‘3 · 4 + 5’ is read as ‘(3 · 4) + 5’, not as ‘3 · (4 + 5)’.

These two conventions can be expressed by saying that the negative sign, ‘−’, and the multiplication sign, ‘·’, bind stronger than the addition sign, ‘+’.

Conventions of the same nature are adopted in logic. The convention is that ‘¬’ binds stronger than any of the other connective names. This means the following:

When parentheses are missing, fix the scopes of the negation symbols to be the smallest scopes that are consistent with the given expression.

Here is how it works:

‘¬A ∧ B’ is read as: ‘(¬A) ∧ B’.

‘¬¬A ∧ ¬(B ∧ C)’ is read as: ‘(¬¬A) ∧ ¬(B ∧ C)’.

‘¬(¬A ∧ ¬B) ∧ C’ is read as: ‘[¬((¬A) ∧ ¬B)] ∧ C’.


In the first example, the scope of the negation is A. In the second example, the scope of the first (leftmost) negation is ¬A, the scope of the second is A and the scope of the third is B ∧ C. In the third example, the scope of the first negation is (¬A) ∧ ¬B, the scope of the second is A and the scope of the third is B.

Homework 2.1

Insert parentheses in the following expressions according to the grouping convention, so as to ensure unique readability. (Do not add parentheses if there is no danger of ambiguity.)

Having done this, write down the scopes of all occurrences of negations in 4, and all the left and right scopes of the occurrences of conjunctions in 2. (In each case start from the leftmost occurrence.)

1. ¬(¬A ∧ ¬(B ∧ A))
2. ¬(¬A ∧ ¬(B ∧ C)) ∧ (A ∧ B)
3. ¬(A ∧ (¬A ∧ ¬B))
4. ¬(A ∧ (¬A ∧ ¬C)) ∧ ¬¬B
5. C ∧ ¬(C ∧ ¬(A ∧ C))
6. A ∧ ¬(C ∧ (¬C ∧ B))

2.1.3 Truth-Tables

Truth-tables are a standard, commonly used device for showing how the truth-values of sentences are determined by the values of their sentential components. The truth-values of each sentence are written, in the column headed by it, in the same row containing the truth-values of its components. Here is the truth-table for negation.

A   ¬A
T   F
F   T

And the truth-table for conjunction is:

A   B   A ∧ B
T   T   T
T   F   F
F   T   F
F   F   F


Note that the use of ‘A’ and ‘B’ is of no particular significance. We could have used any other sentential variables.

A truth-table shows how to correlate, with every possible assignment of truth-values to the components, a value for the whole sentence. The order of rows is not essential; we can rearrange them arbitrarily. We can also rearrange the columns, provided that they keep the same headings. We can, for example, rewrite the truth-tables for negation and conjunction thus:

A   ¬A
F   T
T   F

A   B   A ∧ B
T   F   F
F   F   F
F   T   F
T   T   T

B   A ∧ B   A
T   T       T
F   F       F
F   F       T
T   F       F

It is desirable, however, to adopt some uniform fixed arrangement. And this is what we shall do.

We can use a single truth-table to show the values of several sentences. In particular, when a sentence is built by iterating the connectives, it is convenient to have columns for the “intermediate” sentences:

A   B   ¬B   A ∧ ¬B
T   T   F    F
T   F   T    T
F   T   F    F
F   F   T    F

Here, we have a column for ¬B which, together with the column for A, can be used to determine the truth-values for A ∧ ¬B. You can include, or skip, such intermediate columns according to your convenience. The case of iterated negation can be described as follows:

A   ¬A   ¬¬A   ¬¬¬A   ...
T   F    T     F      ...
F   T    F     T      ...

Several sentences, not necessarily components of each other, can be included in a single truth-table. The table should include a column for every sentential variable that occurs in any of the sentential expressions. (Of course, the value of a sentential variable has no effect on the values of expressions not containing it.) The following is such an example.


A   B   C   C ∧ ¬B   ¬(C ∧ ¬B)   ¬A ∧ ¬(C ∧ ¬B)   A ∧ (C ∧ ¬B)   ¬(A ∧ (C ∧ ¬B))
T   T   T   F        T           F                F              T
T   T   F   F        T           F                F              T
T   F   T   T        F           F                T              F
T   F   F   F        T           F                F              T
F   T   T   F        T           T                F              T
F   T   F   F        T           T                F              T
F   F   T   T        F           F                F              T
F   F   F   F        T           T                F              T

Note that we did not include a column for ¬B. The truth-value of C ∧ ¬B is obtained directly from those of C and B; the toggling of B’s value is “done in the head”. The truth-value of ¬(C ∧ ¬B) is then obtained from that of C ∧ ¬B. A column for ¬A is not included; the truth-value of ¬A ∧ ¬(C ∧ ¬B) is obtained directly from those of A and ¬(C ∧ ¬B).

The Number of Rows: The number of rows in a truth-table is determined by the number of sentential variables figuring in it. With one sentential variable we have two rows, one for each of its possible two values. Every additional sentential variable multiplies the number of rows by two (each row gives rise to two: one where the additional variable gets T, another–where it gets F). Therefore, for two sentential variables the number of rows is 4, for three the number is 8, and for 4 it is 16. For n sentential variables, the number of rows is 2^n.
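The rows of a truth-table can also be generated mechanically; the following sketch (ours, not the book's) uses itertools.product to list the four rows for A and B and to compute the column for A ∧ ¬B:

```python
from itertools import product

# All assignments of T/F to the variables A, B: 2^2 = 4 rows.
rows = list(product([True, False], repeat=2))

def show(v):
    """Render a Python boolean as the book's T/F."""
    return 'T' if v else 'F'

print('A  B  A ∧ ¬B')
for a, b in rows:
    print(show(a), show(b), show(a and not b))

# For n variables, the number of rows is 2^n:
assert len(list(product([True, False], repeat=3))) == 2 ** 3
```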

Homework 2.2

Write down the truth-tables for the sentences of Homework 2.1, after inserting the required parentheses.

Sentences and Sentential Expressions

Sequences of symbols such as

‘A’, ‘B’, ‘A ∧ B’, ‘¬C’, ‘(¬A) ∧ (B ∧ C)’

are sentential expressions. They refer to sentences, whose final identity depends on the sentences referred to by the sentential variables. We therefore speak of the sentence A ∧ B. But sometimes these sequences are used to refer to the expressions themselves. In this case we speak of the sentential expressions A, A ∧ B, etc., without using quotes. This double usage, which is sometimes discouraged in logic, is convenient and we shall occasionally resort to it. It is quite common in algebra: one speaks of the number x + y·z, and also of the expression x + y·z. It is also common in English: we speak of the man Jack, but we speak also of the name Jack.


Sentences as Instances: A sentential expression can be viewed as a scheme. A sentence falls under the scheme if it can be obtained by interpreting the sentential variables in the scheme as standing for certain sentences. We say in this case that the sentence is an instance of, or that it can be written as, the sentential expression. E.g., any sentence of the form A ∧ ¬B is an instance of C ∧ D; it is obtained by letting ‘C’ stand for A and ‘D’ for ¬B. The formal notion of substitution allows us also to substitute A ∧ ¬B for A. But of this later.

Truth-Values of Sentential Expressions: Truth-tables are determined by sentential expressions. They show how the truth-values of the sentence represented by the expression depend on the values of the components represented by the variables.

We can consider assignment of truth-values directly to the sentential expressions. Hence we may speak of the values assigned, in a given row, to the sentential variables, and of the corresponding value of the expression; i.e., the value that appears in the expression’s column.

Truth Functionality

Every connective of (classical) sentential logic is truth-functional. This means that the truth-value of a compound built by applying the connective is completely determined by the values of the components. This becomes clear if we consider two English connectives, one truth-functional, the other not:

(1) Jack will go to see the play, and Jill says that the play is good.

(2) Jack will go to see the play, because Jill says that the play is good.

Each of (1) and (2) is obtained by combining the sentences:

(a) Jack will go to see the play,

(b) Jill says that the play is good.

In (1) the combining is by means of ‘and’, in (2)–by means of ‘because’. If either (a) or (b) (or both) is false, then (1) and (2) are false. If both (a) and (b) are true, then (1) is true; but the value of (2) is still undetermined. (2) is true only if there is a causal relation between Jill’s saying and Jack’s going. If Jack goes to see the play, not because Jill has praised it, then (2) is false. This shows that ‘because’ is not truth-functional: the truth-value of a ‘because’-compound cannot be found just by knowing the values of the components.

Here is an example of a non-truth-functional monadic operation. The operation is effected by attaching the expression ‘it is necessary that’. From the sentence ‘...’ we get the sentence: ‘it is necessary that ...’. Now both of the following are true:


Thirteen is a prime number,

John Kennedy was assassinated,

But only the first is a necessary truth (the attempt on Kennedy’s life could have failed). Hence, the first of the following is true, the second is false.

It is necessary that thirteen is a prime number,

It is necessary that John Kennedy was assassinated.

In natural language we have some connectives that are truth-functional, some that are clearly not, and some that are borderline cases. We shall return to the subject in chapter 3. There are systems of logic that incorporate connectives that are not truth-functional (for example, a logic containing a connective □ for expressing necessity: □A is true if and only if A is necessarily true). But they shall not concern us here.

2.1.4 Atomic Sentences in Sentential Logic

So far, we have stipulated certain basic features of the system: the connective operations and their truth-table interpretation. Other features follow in the sequel. We can proceed in this manner without committing ourselves to particular sentences. Later, when we set up languages based on predicates and individual names, we will have more specific entities. It is, however, customary and convenient to be more specific even at the sentential level.

For this purpose we view the sentences as built from basic constituents by repeated applications of sentential connectives. Let us assume an infinite sequence of so-called atomic sentences:

A1, A2, . . . , An, . . .

All other sentences of the formal language are built from them, bottom-up, by repeated applications of the connectives (the ones we have so far, ¬ and ∧, and others to be introduced later). The atomic sentences, or atoms for short, are not sentential compounds. We assume an infinite sequence for the sake of generality: in order not to be bound by arbitrary restrictions.

Every sentence is built from atomic sentences in a finite number of steps, where each step consists in applying a connective to sentences already constructed. Hence every sentence involves a finite number of atoms. Our particular sentences are therefore entities of the kind:

A12, A1 ∧ A6, ¬(A6 ∧ (¬A2 ∧ A3)), . . .

Later postulates will imply that all these are different sentences; e.g., A3 ∧ A4 ≠ A1 ∧ A2. Other connectives, to be added later, will be used to generate sentences as well.
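As an illustration, the bottom-up construction of particular sentences can be sketched in Python. This is a sketch of ours, not part of the text: atoms are indexed objects, compounds are built by applying connectives, and distinct constructions compare as distinct sentences.

```python
from dataclasses import dataclass

# A sketch, not from the text: sentences as syntactic objects. Atoms are
# indexed; compounds are built by applying connectives; distinct
# constructions compare as distinct sentences. The class names are ours.
@dataclass(frozen=True)
class Atom:
    index: int

@dataclass(frozen=True)
class Neg:
    operand: object

@dataclass(frozen=True)
class Conj:
    left: object
    right: object

A = [Atom(i) for i in range(13)]                  # A[1] plays the role of A1, etc.
s = Neg(Conj(A[6], Conj(Neg(A[2]), A[3])))        # ¬(A6 ∧ (¬A2 ∧ A3))

print(Conj(A[3], A[4]) != Conj(A[1], A[2]))       # True: different sentences
print(Conj(A[1], A[2]) == Conj(A[1], A[2]))       # True: same construction
```

The frozen dataclasses give structural equality: two compounds are the same sentence exactly when they are built in the same way from the same atoms.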


Note the essential difference between atomic sentences and the unspecified sentences referred to by sentential variables. The difference between ‘A’, ‘B’, ‘C’, ..., on one hand, and ‘A1’, ‘A2’, ‘A3’, ..., on the other, is like the difference between ‘x’, ‘y’, ‘z’ and ‘1’, ‘13’, ‘9’. The former are numerical variables, the latter are names of particular numbers. A1, A2, A3, etc., are distinct sentences; A2 ≠ A3, by definition. A, B, C, A′, A′′, etc., are sentences left unspecified; A = B may, or may not, hold. Similarly, A1 ≠ A2 ∧ A3, but A may, or may not, be equal to B ∧ C. (We can, of course, have A = A3, or A = A3 ∧ A4. But we cannot have A ∧ B = A3, because A3 is not a compound.)

Any particular interpretation of the language assigns truth-values to the atomic sentences, and this determines, via the truth-tables, the truth-values assigned to all other sentences. But sentential logic is not about particular assignments to atomic sentences, but about properties and relations that hold for assignments in general.

By considering all possible assignments to atomic sentences, we treat them as being independent of each other. The truth-value assigned to Ai is not constrained by the values assigned to all other atoms. If we try to cast some sentences of natural language in the role of “atoms”, we see that, as a rule, they are not independent. For example, ‘a is red’ and ‘a is blue’ (where ‘a’ denotes some object) cannot both be true; the truth of one implies the falsity of the other. This constraint, however, is not a matter of pure logic. It derives from the meaning of ‘red’ and ‘blue’. By making the atoms independent we filter out everything that is not implied by the meaning of the sentential connectives. If needed, we can add non-logical connections. For example, if A1 and A2 are, respectively, ‘a is red’ and ‘a is blue’, then the sentence

¬(A1 ∧ A2)

states that a is not both red and blue. If we restrict the assignments to those that make this sentence true, we impose the required constraint. We can decide to adopt the sentence as an axiom, but it will not be an axiom of sentential logic.

Having introduced the atomic sentences, we can, by and large, ignore them. We make use of the notion of atomic sentences in defining logical equivalence and other basic semantic concepts. We could have defined such concepts, rigorously, without assuming atomic sentences. But some basic properties of the semantic concepts would then require more intricate proofs.3

Once these properties are established, we do not need atomic sentences. The system is of a general schematic nature. General claims and techniques are best represented by using sentential variables, which is all that we need. Note that in any particular context the sentences denoted by the sentential variables play the roles of “atoms”, as long as we do not specify anything more about their structure.

3 The claim is that sentential logic can be done rigorously without assuming atomic sentences. In a previous version of the book, we followed this line, introducing atomic sentences only at a later stage. We used intuitive arguments instead of proofs, whose rigorous form would have been too abstract for the book (cf. footnote 4, page 32). We continue to use intuitive arguments, but the rigorous proof is now around the corner.


2.2 Logical Equivalence, Tautologies and Contradictions

2.2.0

Obviously, the truth-values of A and ¬¬A are the same, no matter what A’s truth-value is. This is a simple example of logically equivalent sentences. The general idea is that sentences are logically equivalent if they must have the same truth-value by reasons of pure logic. We have not yet determined what comes under “reasons of pure logic”. But in the case of sentential logic, only the sentential connectives are considered as logical elements. This means that the sentences should have the same truth-value solely because of the way in which they are obtained by applying the connectives. Equivalence that derives only from the sentential connectives is known as tautological. The detailed definition is as follows:

The sentences A and A′ are tautologically equivalent if under any assignment of truth-values to the atomic sentences, A and A′ have the same truth-value.

In order to establish tautological equivalence we do not have, in general, to go to the level of atomic sentences. A ∧ B and B ∧ ¬¬A must have the same truth-value, no matter how A and B are constructed from smaller units. The same holds for other tautological equivalences that we establish here. All the relevant structure can be displayed by the sentential expressions. An equivalence is proven once it is observed that the truth-table assigns, in every row, the same value to the two sentences. Of course, the sentential variables can also stand for atomic sentences. Therefore the definition above implies the following.

Two sentences are tautologically equivalent if and only if they can be written as sentential expressions, such that, in a truth-table that has columns for both, their respective columns are the same.

Tautological equivalence is a special kind of logical equivalence. In the sentential calculus the two are the same. But in richer systems, such as first-order logic, there are sentences that are logically, but not tautologically, equivalent. This is because in richer systems there are other logical elements that can enter into the sentence. Hence:

Every two sentences that are tautologically equivalent are also logically equivalent; the converse holds when we limit ourselves to sentential logic, but not in general.

Notation and Terminology: We shall use ‘≡’ as the symbol for logical equivalence:

A ≡ B

means that A is logically equivalent to B. Hence, for every sentence A, we have:

A ≡ ¬¬A


We refer to statements that assert the equivalence of two sentences (e.g., the statement above) as equivalence statements, or simply, equivalences.

Often, when we are dealing with sentential logic, we shall use ‘equivalent’ as a shorthand for ‘tautologically equivalent’. The context will indicate the intended meaning of the term.

Note: The symbol ‘≡’ is a shorthand for ‘is logically equivalent to’. It is a technical term, which is part of our English discourse. ‘A ≡ B’ reads as an English sentence: ‘A is logically equivalent to B’. On the other hand, ‘A ∧ B’ does not stand for any English sentence. It denotes the conjunction of A and B, which is a sentence in our formal system, but not in English.

Another, easily verifiable equivalence is:

A ∧ B ≡ B ∧ A

The sentences on the two sides are not, in general, the same: A ∧ B ≠ B ∧ A, unless A = B.

Equivalence and Sentential Expressions

Equivalences of the kind just illustrated are general claims: for all A, A ≡ ¬¬A, and for all A and B, A ∧ B ≡ B ∧ A. Therefore we can substitute for the sentential variables any sentential expressions, e.g.,

¬(C ∧ D) ≡ ¬¬¬(C ∧ D),   (¬C) ∧ ¬(A ∧ B) ≡ ¬(A ∧ B) ∧ (¬C)

(Can you see the substitutions by which these are obtained from the previous equivalences?)

General equivalences of this form are schematic; they derive from the sentential expressions. We can define tautological equivalence directly for sentential expressions. The definition is:

Two sentential expressions are equivalent if, in a truth-table that has columns for both, their respective columns have the same truth-value in every row.

The equivalence of sentential expressions implies, of course, the equivalence of the denoted sentences. On the other hand, two sentential expressions such as

‘A ∧ B’ and ‘A ∧ ¬C’

are not equivalent, but the denoted sentences may still be equivalent; for example, in the special case where B = ¬C, or in the special case where C = ¬B.

It is not difficult to see that the following holds:


Two sentential expressions are equivalent if and only if the sentences obtained by letting the sentential variables stand for distinct atomic sentences are equivalent.

Two sentential expressions can be logically equivalent even when they involve different sentential variables, for example:

A ∧ ¬A ≡ B ∧ ¬B,

because the two always get the same value, namely F. This may diverge from our intuitive notion of “equivalence”. Should the following be classified as equivalent?

(1) Jack is at home and Jack is not at home,

(2) The earth is larger than the moon and the earth is not larger than the moon.

In ordinary usage, “equivalence” often implies a common subject, or some sort of connection, that is lacking in the case of (1) and (2). Tautological equivalence is not meant to capture such aspects. We are interested only in equivalence that reduces to having the same truth-values under all possible assignments of truth-values to the sentential variables.

Truth-Table Checking

One can show that two sentential compounds are tautologically equivalent simply by writing a truth-table for both, in which all sentential variables (involved in either sentence) occur. If the columns headed by the two sentences are the same, then they are equivalent. The equivalence of A ∧ ¬(A ∧ ¬B) and B ∧ ¬((¬A) ∧ B) is shown in this way:

A  B  A ∧ ¬B  ¬(A ∧ ¬B)  A ∧ ¬(A ∧ ¬B)  (¬A) ∧ B  ¬((¬A) ∧ B)  B ∧ ¬((¬A) ∧ B)
T  T    F         T            T            F           T              T
T  F    T         F            F            F           T              F
F  T    F         T            F            T           F              F
F  F    F         T            F            F           T              F

This “brute force” checking is often quite cumbersome. There are, we shall see, methods that yield in many cases shorter, more elegant proofs. These methods also yield insights that are not obtained via truth-tables. Often they enable us to simplify a sentence, that is: to find a simpler sentence equivalent to it. The last two sentences, for example, are equivalent to a sentence that is much simpler than both:

A ∧ B

You can verify this by noting that the column of each is identical to the column of A ∧ B.
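The “brute force” check can also be mechanized. The following sketch (the helper name `equivalent` is ours, not the book’s) represents each sentential expression as a Boolean function and compares the columns on every assignment:

```python
from itertools import product

# A minimal sketch (the helper name 'equivalent' is ours): two expressions
# in n variables are tautologically equivalent iff their columns agree on
# every row, i.e. under every assignment of truth-values.
def equivalent(e1, e2, n_vars=2):
    return all(e1(*row) == e2(*row)
               for row in product([True, False], repeat=n_vars))

lhs    = lambda a, b: a and not (a and not b)     # A ∧ ¬(A ∧ ¬B)
rhs    = lambda a, b: b and not ((not a) and b)   # B ∧ ¬((¬A) ∧ B)
simple = lambda a, b: a and b                     # A ∧ B

print(equivalent(lhs, rhs))      # True
print(equivalent(lhs, simple))   # True: both reduce to A ∧ B
```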


The equivalence of all the sentences in a group of more than two can be expressed by “chaining”, e.g.,

A ∧ ¬(A ∧ ¬B) ≡ B ∧ ¬((¬A) ∧ B) ≡ A ∧ B

This mode of writing relies on the property that sentences that are equivalent to the same sentence are equivalent to each other. The chain therefore implies that all the displayed sentences are logically equivalent.

2.2.1 Some Basic Laws Concerning Equivalence

For all sentences A, B, C, the following holds:

Reflexivity: A ≡ A.

Symmetry: If A ≡ B, then B ≡ A.

Transitivity: If A ≡ B and B ≡ C, then A ≡ C.

‘Reflexivity’ indicates that the relation “reflects back”: every sentence is logically equivalent to itself. ‘Symmetry’ indicates that the two sides can be switched. ‘Transitivity’ points to the “passing on” of the relation, via the “mediator” B: from the pair A and B and the pair B and C to the pair A and C.

Each of these properties is obvious. The argument for transitivity is, for example, this: If for every assignment of truth-values to the atomic sentences, A and B have the same truth-value and B and C have the same truth-value, then also for every assignment to the atomic sentences A and C have the same truth-value.4 If A1 ≡ A2, A2 ≡ A3, . . . , An−1 ≡ An, then A1 ≡ A3, hence also A1 ≡ A4, etc., up to A1 ≡ An. Thus, every two sentences among A1, A2, . . . , An are equivalent.

When you come to think of it you will see that reflexivity, symmetry and transitivity are true of equivalence in general, however defined and whatever the objects. For example, equality of shape (between geometrical figures), parallelism (between lines), having the same pair of parents (between people), in fact, all relations that we characterize as equivalences. In mathematics an equivalence relation is, by definition, any relation that satisfies reflexivity, symmetry and transitivity.

4 It is here that the atomic sentences are needed. They are the smallest building blocks of all the sentences. Without them, the equivalence of A and B would rest on a representation of A and B as sentential compounds of smaller sentences, and the equivalence of B and C on another representation of B and C. We would then need a refinement of these representations, so as to have the same smallest units as building blocks of all three sentences. Using unique readability (cf. 2.3.0, page 43), this can be done. But it would carry us too far away from the course material.


Congruence Laws and Substitution of Equivalent Components

Besides the three basic properties that are common to all equivalence relations, there are, for each equivalence relation, contexts in which we can substitute an object by an equivalent one. Laws of this nature are sometimes known as congruence laws. Logical equivalence behaves in this way when it comes to applying connectives. If we replace, in a sentential compound, a component by an equivalent one, we get an equivalent compound:

If A ≡ A′, then:

¬A ≡ ¬A′,   A ◦ B ≡ A′ ◦ B,   B ◦ A ≡ B ◦ A′

for every binary connective ◦.

The arguments that prove these claims are easy: If, by virtue of logic, A and A′ have the same truth-value, then their negations also have the same truth-value (namely the opposite one); and this follows by virtue of logic, because negation is one of the logical elements of the sentences. The same reasoning applies to connectives in general. All we need is that ◦ be truth-functional (the value of C ◦ D should depend only on the values of C and D) and that it be classified as a logical element.

The laws can be applied repeatedly, for example:

C ≡ ¬¬C,  hence  A ∧ C ≡ A ∧ ¬¬C,  hence  ¬(A ∧ C) ≡ ¬(A ∧ ¬¬C).

The notions of components and substitutions will be elaborated in section 2.3 of this chapter. But we should by now have a sufficient intuitive understanding, relying on which we can make free use of the substitution law: Given any sentence, the substitution of a component by a logically equivalent one results in a logically equivalent sentence.

We can establish equivalences by using substitutions, in combination with other properties of logical equivalence. Here is an example:

From A ∧ C ≡ C ∧ A, we get, applying negation:

¬(A ∧ C) ≡ ¬(C ∧ A)

By symmetry,

¬(C ∧ A) ≡ ¬(A ∧ C)

Now we can substitute, on the right-hand side, C by the equivalent ¬¬C and get, via transitivity:

¬(C ∧ A) ≡ ¬(A ∧ ¬¬C)

Operating with logical equivalences is analogous to operating with algebraic equalities. One uses reflexivity, symmetry and transitivity, and substitutions of equivalents. But you have


to remember that logical equivalence is not equality. Sentences are syntactic creatures, and they can differ as syntactic creatures even when logic dictates that they should have the same truth-value.

Some Terminology and Notation

‘Iff’: As is customary in logic and mathematics, we use ‘iff’ as shorthand for ‘if and only if’ (e.g., a product of two numbers is zero iff one of them is).

We use ‘⇒’ (or its longer version ‘=⇒’) to stand for the English ‘implies’, or ‘entails’. Thus, ‘. . . ⇒ - - -’ is to be read as: ‘If . . . , then - - -’.

Note that, like ‘≡’, ‘⇒’ is not a part of the formal language, but a convenient shorthand within our English discourse.

In a similar way we use ‘⇔’ (and ‘⇐⇒’) to stand for ‘iff’.

The following table sums up the basic properties of logical equivalence discussed above.

A ≡ A

A ≡ B =⇒ B ≡ A

A ≡ B, B ≡ C =⇒ A ≡ C

A ≡ B =⇒ ¬A ≡ ¬B

For every binary connective ◦:

A ≡ B =⇒ A ◦ C ≡ B ◦ C

A ≡ B =⇒ C ◦ A ≡ C ◦ B

Non-Equivalent Sentences

The equivalences we establish are between expressions built from sentential variables. Hence they hold in general, no matter what the sentential variables stand for. On the other hand, sentential expressions may be non-equivalent as expressions, while some of their instances are equivalent sentences. As expressions, ‘A ∧ B’ and ‘A’ are not equivalent. But if A = B, or


if A = (¬¬B), or if A is any other of an infinite number of sentences, then A ≡ B. The non-equivalence of the expressions means that the equivalence between the sentences does not hold in general, not that it never holds.

Two sentential expressions are not equivalent if there is an assignment of truth-values to the sentential variables under which the expressions get different values. (In the example above, assign T to A and F to B.) If, in this case, we let the sentential variables stand for distinct atomic sentences, we get two particular non-equivalent sentences. E.g., the non-equivalent sentences A1 and A1 ∧ A2.

We can also get non-equivalent instances without using atoms. In the example, since A should get T, substitute it by any sentence of the form ¬(C ∧ ¬C); such a sentence, it is not difficult to see, always gets T. And substitute B by any sentence of the form C ∧ ¬C, which always gets F. Then the resulting sentences are never equivalent, no matter what the sentential variables stand for.

Homework

2.3 Simplify, if possible, each of the sentences in Homework 2.1; i.e., try to find an equivalent sentence that is simpler, the simpler the better. Do not use other connectives (introduced later) besides ¬ and ∧. (With the simplification methods of the sequel this will be very easy. Right now you can look at the truth-tables and try by guessing.)

2.4 Find all the pairs of sentences in Homework 2.1 that are equivalent. Fill the following table, by writing ‘+’ in every square for which the row sentence is equivalent to the column sentence.

   1  2  3  4  5  6
1
2
3
4
5
6

For each pair without a ‘+’, show that there is a truth-value assignment to the sentential variables under which the two sentences get different values.

Note: You can put ‘+’ in the diagonal, and you can also assume that the filled table is symmetric around the diagonal. (Can you see why?) This leaves fifteen pairs of sentential expressions for checking. Since equivalent sentences always have the same truth-values, they behave in the same way with respect to other sentences. Hence, the more equivalent pairs you discover at an early stage, the more you will economize in checking.


2.2.2 Disjunction

Disjunction is another binary connective, denoted by ‘∨’. For every two sentences, A and B, there is a sentence

A ∨ B,

called the disjunction of A and B. As in the cases of negation and conjunction, ‘disjunction’ is used ambiguously: for the operation and for the resulting sentence.

The disjuncts of A ∨ B are A and B; the first is the left disjunct, the second the right disjunct.

The truth-table for disjunction is:

A  B  A ∨ B
T  T    T
T  F    T
F  T    T
F  F    F

In words: the truth-value of A ∨ B is F if the truth-values of both A and B are F. It is T in every other case.

In English the operation corresponding to disjunction is often effected by using ‘or’. For example, under its usual reading, the sentence

(3) Jack is at home, or Jill is at home

is true when either Jack or Jill, or both, are at home, and is false when neither of them is. Read in this way, (3) can be construed as a disjunction of ‘Jack is at home’ and ‘Jill is at home’.

This type of ‘or’ is said to be inclusive. There is another type, described as exclusive ‘or’, which is taken to imply that one, but not both, of the alternatives is true. In the following example, the ‘or’ is presumably exclusive:

(4) Either you will pay the fine, or you will go to prison.

A further discussion of inclusive versus exclusive ‘or’ is in chapter 3.

Evidently, disjunction corresponds to inclusive ‘or’. But we can express exclusive ‘or’ by using the connectives introduced so far:

(A ∨ B) ∧ ¬(A ∧ B)


Intuitively, this sentence says: “A or B, and not both A and B”. You can confirm formally that it has the desired property by checking its truth-table:

A  B  A ∨ B  A ∧ B  ¬(A ∧ B)  (A ∨ B) ∧ ¬(A ∧ B)
T  T    T      T       F               F
T  F    T      F       T               T
F  T    T      F       T               T
F  F    F      F       T               F
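The same check can be run mechanically. Here is a minimal sketch of ours in Python, where `!=` applied to booleans plays the role of exclusive ‘or’:

```python
from itertools import product

# A sketch, not from the text: on booleans, '!=' is exactly exclusive 'or',
# so we can confirm that (A ∨ B) ∧ ¬(A ∧ B) has the exclusive-'or' column.
xor_expr = lambda a, b: (a or b) and not (a and b)

for a, b in product([True, False], repeat=2):
    assert xor_expr(a, b) == (a != b)
print("column matches exclusive 'or' in every row")
```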

Homework 2.5 Suppose that ∨x is a connective that corresponds to exclusive ‘or’ (i.e., A ∨x B is true just when one of A and B is true, but not both). Show that disjunction can be expressed using ∧ and ∨x (without using negation); in other words, using only ∧ and ∨x, construct a sentence whose truth-table column is exactly that of A ∨ B. (This is easier than it looks.)

Using ∧ and ¬, we can construct a sentence equivalent to A ∨ B:

(5)   A ∨ B ≡ ¬(¬A ∧ ¬B)

This is described by saying that disjunction is expressible in terms of conjunction and negation.

Heuristically, you can see why (5) holds by observing: “To say that A or B is the same as to say that it is not the case that both not-A and not-B.” But you can verify it, formally, by truth-tables; or, with some practice, by carrying out the checking in one’s head.

(5) shows that, having negation and conjunction, we can do without disjunction without losing expressive power. Whenever we need A ∨ B, we can use the equivalent ¬(¬A ∧ ¬B). But eliminating disjunction in this way can yield non-transparent expressions. It is very convenient to have disjunction as a primitive connective, because it corresponds to the familiar ‘or’-operation of natural language, and because it makes for short, clear expressions. And, most important, basic structural properties of the formalism are best displayed if both conjunction and disjunction are available.

Conjunction can be expressed in terms of negation and disjunction:

(6)   A ∧ B ≡ ¬(¬A ∨ ¬B)

This, like (5), can be easily verified by direct checking of truth-values. We can therefore dispense with conjunction, if we have negation and disjunction. But again, the formalism is much easier to operate, and its structure much more transparent, if both conjunction and disjunction are available.
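The direct checking of (5) and (6), row by row, can be sketched in code (the helper names below are ours):

```python
from itertools import product

# Direct checking of truth-values for equivalences (5) and (6):
# each helper returns True exactly when its row of the two columns agrees.
demorgan_or  = lambda a, b: (a or b) == (not ((not a) and (not b)))   # (5)
demorgan_and = lambda a, b: (a and b) == (not ((not a) or (not b)))   # (6)

rows = list(product([True, False], repeat=2))
print(all(demorgan_or(a, b) for a, b in rows))    # True
print(all(demorgan_and(a, b) for a, b in rows))   # True
```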

(5) and (6) are examples of logical equivalences that can be established by simple considerations of truth-values, without writing the whole truth-table. In general, we can use any of the following methods for establishing logical equivalence. Each of the conditions is both necessary and sufficient.

(I) Show, for one of the sentences, that if it has the value T, the other has the value T, and if it has the value F, the other has the value F.

(II) Show that one sentence has the value T iff the other has the value T.

(III) Show that one sentence has the value F iff the other has the value F.

Obviously, (I) suffices for proving logical equivalence. The same holds also for each of (II) and (III). Consider (III), for example. It implies that it is impossible that one sentence has T and the other F; for this would contradict the “iff”.

In the case of (5), (III) provides the shortest argument. Since A ∨ B gets F iff both A and B get F, it suffices to show that ¬(¬A ∧ ¬B) also gets F iff both A and B get F. And this is argued as follows:

¬(¬A ∧ ¬B) gets F iff ¬A ∧ ¬B gets T. And this last conjunction gets T iff both conjuncts, ¬A and ¬B, get T; i.e., iff both A and B get F.

In a similar way, (II) can be used to prove (6). One can also derive each of (5) and (6) from the other, using suitable substitutions and the general equivalence laws. Here, for example, is a derivation of (6) from (5):

Applying negation to both sides of (5) we get:

¬(A ∨ B) ≡ ¬¬(¬A ∧ ¬B)

Since the double negation of a sentence is equivalent to the sentence itself, we can drop ‘¬¬’ on the right and get:

¬(A ∨ B) ≡ ¬A ∧ ¬B

Since this is true for any sentences A and B, it remains true if we substitute, throughout, A and B by their negations:

¬(¬A ∨ ¬B) ≡ ¬¬A ∧ ¬¬B

Again, we can drop double negations (replacing components by their equivalents), which yields:

¬(¬A ∨ ¬B) ≡ A ∧ B

And this, via symmetry, yields (6). (If substituting A and B by their negations confuses you, use different sentential variables and let A = ¬C, B = ¬D. You will get the desired equivalence, formulated in terms of C and D.)

The examples just given illustrate some techniques of equivalence proving, which will be elaborated and extended in chapter 4.


Grouping with Disjunctions

With disjunction we have additional cases that require parentheses. For example,

¬A ∨ B   and   A ∨ B ∧ C

are ambiguous expressions.

The first can be interpreted either as the sentence (¬A) ∨ B, or as ¬(A ∨ B). You can easily see that the two are not, in general, logically equivalent. The second can be interpreted either as A ∨ (B ∧ C), or as (A ∨ B) ∧ C. Again, these are not, in general, logically equivalent: if A gets T and C gets F, then A ∨ (B ∧ C) gets T, but (A ∨ B) ∧ C gets F.

Parentheses are therefore employed in order to force unique readings. Our previous conventions for omitting parentheses are now extended by the following rule:

The disjunction symbol binds more weakly than the symbols for negation and conjunction.

This means that ‘¬’ binds the strongest, then ‘∧’, then ‘∨’. In treating expressions that are not fully parenthesized, we first determine the scopes of negations to be the smallest that are consistent with the given grouping; next we determine the left and right scopes of conjunctions to be the smallest consistent with the grouping at that stage. For example,

¬A ∨ ¬B ∧ C is read as: (¬A) ∨ [(¬B) ∧ C],
(¬A ∨ ¬B) ∧ B ∨ D is read as: {[(¬A) ∨ (¬B)] ∧ B} ∨ D.
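As it happens, Python's Boolean operators follow the same precedence order (`not` strongest, then `and`, then `or`), so the first reading can be checked against its fully grouped form on every assignment; this is an aside of ours, not part of the text:

```python
from itertools import product

# Python's precedence mirrors the convention above: 'not' binds strongest,
# then 'and', then 'or'. The unparenthesized expression must agree with its
# fully parenthesized reading under every assignment of truth-values.
for a, b, c in product([True, False], repeat=3):
    assert (not a or not b and c) == ((not a) or ((not b) and c))
print("¬A ∨ ¬B ∧ C is read as (¬A) ∨ [(¬B) ∧ C]")
```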

When parentheses are suppressed, it is often desirable to indicate grouping by appropriate spacing, e.g.,

¬A ∨ ¬B∧C, [¬A∨¬B]∧B ∨D .

It is preferable to retain parentheses, even when redundant, if this makes for easier reading.

Homework 2.6

Find all the pairs of logically equivalent sentences from the list given below, and write your answer by filling a table in the manner described in Homework 2.4. For each pair that is not listed as equivalent, give an assignment of truth-values to the sentential variables under which the sentential expressions get different values. (Note that the remarks of Homework 2.4 apply here as well.)

1. ¬A ∧ B
2. ¬(A ∨ ¬B)
3. (¬A ∧ B) ∨ ¬(C ∨ ¬C)


4. (A ∨ B) ∧ (¬A ∨ B)
5. (B ∧ C) ∨ (B ∧ ¬C)
6. ¬(A ∨ B) ∧ ¬C
7. ¬(A ∧ B) ∨ ¬C

2.2.3 Logical Truth and Falsity, Tautologies and Contradictions

A sentence is a logical truth, or logically true, if it is true by reasons of pure logic. Again, the problem of specifying the scope of “pure logic” arises, and again, the idea is to classify certain elements of the sentence as logical and to require that the truth of the sentence derive solely from these. In the case of sentential logic the only logical particles are the connectives; hence a sentence is logically true just when its truth derives from the way it is built by applying sentential connectives. Such sentences are known as tautologies. The full definition is:

A sentence is a tautology if it gets T under every assignment of truth-values to the sentential atoms.

As in the case of logical equivalence (cf. 2.2.0), the definition can be stated without going tothe level of atoms:

A sentence is a tautology iff it can be written as a sentential expression such that, in its truth-table, its column has T in every row.

Note: A tautology is a special case of a logical truth. In sentential logic tautologies and logical truths coincide. In general, every tautology is a logical truth, but not vice versa. When we come to first-order logic, we shall encounter many logical truths that are not tautologies.

The simplest tautology, constructible using the connectives introduced so far, is:

A ∨ ¬A

Logical falsity is defined in a completely analogous way: A sentence is logically false just when it gets F solely by virtue of its logical elements. In the case of sentential logic this means that it gets F by virtue of the connectives. That is, it gets F under any assignment to the atomic sentences. Or, equivalently, it can be written as a sentential expression such that, in the truth-table, its column contains only F’s. We shall call such sentences sentential contradictions, or for short, contradictions. The simplest contradiction is:

A ∧ ¬A
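These two definitions translate directly into a brute-force check. In the sketch below (the helper names are ours), a sentential expression is represented as a Boolean function; a tautology has T in every row of its column, a contradiction F in every row:

```python
from itertools import product

# A sketch (helper names are ours): the column of an expression is its list
# of values over all assignments; a tautology's column is all True, a
# contradiction's all False.
def column(expr, n_vars):
    return [expr(*row) for row in product([True, False], repeat=n_vars)]

def is_tautology(expr, n_vars):
    return all(column(expr, n_vars))

def is_contradiction(expr, n_vars):
    return not any(column(expr, n_vars))

print(is_tautology(lambda a: a or not a, 1))        # True:  A ∨ ¬A
print(is_contradiction(lambda a: a and not a, 1))   # True:  A ∧ ¬A
print(is_tautology(lambda a, b: a or b, 2))         # False: A ∨ B
```

Note that most expressions, like A ∨ B in the last line, are neither: their columns contain both T and F.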


Again, when we come to first-order logic we shall encounter logical falsities that are not sentential contradictions. Obviously:

A is a logical falsity iff ¬A is a logical truth.

A is a logical truth iff ¬A is a logical falsity.

If A is logically true, then the logical truths are exactly the sentences that are logically equivalent to A.

If A is logically false, then the logical falsities are exactly the sentences that are logically equivalent to A.

All logical truths are therefore logically equivalent, and so are all logical falsities. The equivalence defined here is a technical concept; it does not, and is not intended to, capture various aspects of the intuitive notion of “equivalence”.

Note: While logical truths and falsities are highly significant, they are the exceptions rather than the rule. The sentences one usually encounters are neither logical truths nor logical falsities. ‘The sun has nine planets’ is true, ‘Nixon won the 1960 presidential election’ is false, but their truth and falsity do not derive from pure logic. They are neither logically true, nor logically false. The same obtains in the case of formal languages; “most” of the sentences of the sentential calculus are neither tautologies nor contradictions.

Note: We use ‘tautology’ and ‘contradiction’ in a technical sense, which should not be confused with a different, informal sense in which the terms are sometimes used. Occasionally, ‘tautology’ means a trivial logical truth, and often ‘contradiction’ means a self-evident logical falsity.

Tautological and Contradictory Sentential Expressions

Just as we did in the case of logical equivalence, we can define the notions of tautology and contradiction so as to apply to sentential expressions:

A sentential expression is tautological if, in a truth-table that has a column for it, its column contains only T’s. It is contradictory if its column contains only F’s.

It now follows easily that a sentence is tautological (or contradictory) iff it can be written in the form of a tautological (or contradictory) sentential expression. We also have:

A sentential expression is a tautology iff the sentence obtained from it by interpreting the sentential variables as distinct sentential atoms is. Similarly

Page 59: A Course in Symbolic Logic

42 CHAPTER 2. SENTENTIAL LOGIC

for contradictions.

The sentence A may or may not be a tautology, and may or may not be a contradiction (e.g., if A = B ∨ ¬B, it is a tautology, and if A = B ∧ ¬B it is a contradiction); and it may be neither. But the sentential expression A (or, to use quotes, ‘A’) is neither a tautology nor a contradiction; for its column contains both T and F. And this is of course true of each sentential atom.

The tautologies and contradictions that we establish are of a general schematic nature, and they remain so upon any substitutions for the sentential variables. Thus, if we substitute in A ∨ ¬A any sentential expression for ‘A’ we get a tautology:

(A ∨B) ∨ ¬(A ∨B), (¬A ∧B) ∨ ¬(¬A ∧B), etc.

On the other hand, the claim that A ∨ B is not a tautology cannot be made without knowing what the sentential variables denote (if B = ¬A, this sentence is a tautology). We can only say that, in general, the sentence A ∨ B is non-tautological. The only exceptions to the last remark are sentences that are established as contradictions. Whatever A is, A ∧ ¬A is not a tautology, because it is a contradiction; similarly, A ∨ ¬A is never a contradiction, because it is a tautology.

Homework 2.7 Find all the tautologies and all the contradictions among the following sentences. For sentences not listed as tautologies (as contradictions) give a truth-value assignment to the sentential variables under which the sentence gets F (gets T).

1. ¬(A ∨ B) ∨ (A ∨ B)
2. A ∧ (¬(A ∨ B) ∨ (C ∧ ¬A))
3. (A ∧ B) ∨ (¬A ∧ ¬B)
4. (A ∨ B) ∧ (¬A ∨ ¬B)
5. (A ∧ ¬B) ∨ (B ∧ ¬A)
6. (A ∧ B) ∧ ¬(A ∧ C)
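The truth-table test behind these notions is easy to mechanize. The sketch below is not the book’s notation: it encodes a sentential expression as nested tuples, where a string is a sentential variable and ('not', …), ('and', …, …), ('or', …, …) are compounds, and it classifies the expression by running through all rows of its truth-table.

```python
from itertools import product

def variables(e):
    """Collect the sentential variables occurring in expression e."""
    if isinstance(e, str):
        return {e}
    return set().union(*(variables(part) for part in e[1:]))

def value(e, assignment):
    """Truth-value of e under an assignment of True/False to its variables."""
    if isinstance(e, str):
        return assignment[e]
    if e[0] == 'not':
        return not value(e[1], assignment)
    if e[0] == 'and':
        return value(e[1], assignment) and value(e[2], assignment)
    return value(e[1], assignment) or value(e[2], assignment)   # 'or'

def classify(e):
    """Return 'tautology', 'contradiction', or 'neither', checking every row."""
    vs = sorted(variables(e))
    rows = [value(e, dict(zip(vs, vals)))
            for vals in product([True, False], repeat=len(vs))]
    if all(rows):
        return 'tautology'
    if not any(rows):
        return 'contradiction'
    return 'neither'

print(classify(('or', 'A', ('not', 'A'))))    # prints: tautology
print(classify(('and', 'A', 'B')))            # prints: neither
```

With n variables the checker examines 2ⁿ rows, exactly as a full truth-table does; after encoding the six sentences above it can be used to verify answers to Homework 2.7.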

2.3 Syntactic Structure

2.3.0

The sentences of our formal system are, like those of natural language, structured entities. But unlike the sentences of natural language, which may involve syntactic ambiguity (cf. chapter 1, the section on ambiguity), every sentence of the formal system has a uniquely determined syntactic structure.

This principle is known as unique readability. It amounts in the case of sentential logic to the following:

If a sentence is obtained by applying a connective to other sentences, then the sentence determines uniquely the connective and the sentences to which the connective has been applied. Hence there is a unique reading of such a sentence as a compound of other sentences.

It means, among other things, that a sentence cannot be both a negation of some sentence and a conjunction of two sentences, or both a negation and a disjunction, or both a conjunction and a disjunction, etc. Moreover, if we apply negation to different sentences the resulting sentences must be different. And if we apply conjunction to a pair of sentences, then applying it to another pair that differs either in the first or in the second sentence (or both) gives a different result; similarly for any other binary connective. Since we assumed that our sentences are constructed from atoms that are not compounds, we can add here also the requirement that a negation, or a compound formed by a binary connective, is not an atom. The following is the explicit statement of unique readability.

For all sentences A, B, C, A′, B′, we have:

• If ◦ is a binary connective, then ¬A ≠ B ◦ C.

• ¬A = ¬A′ only if A = A′.

• If ◦ and ◦′ are binary connectives, then A ◦ B = A′ ◦′ B′ only if ◦ = ◦′, A = A′, and B = B′.

• If A is an atomic sentence, then A ≠ ¬B and A ≠ B ◦ C, for every binary connective ◦.
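One way to see why unique readability can be guaranteed: build sentences not as strings but as tagged records, so that every compound carries its main connective and immediate components. The clauses above then hold by construction, since equality of records is component-wise. A hypothetical Python sketch (the class names are mine, not the book’s notation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Neg:
    arg: object           # the immediate component of a negation

@dataclass(frozen=True)
class Bin:
    conn: str             # 'and', 'or', ...: the main connective
    left: object
    right: object

# A negation is never equal to a binary compound: different record types.
assert Neg(Atom('A')) != Bin('and', Atom('B'), Atom('C'))
# ¬A = ¬A′ only if A = A′.
assert Neg(Atom('A')) != Neg(Atom('B'))
# A ◦ B = A′ ◦′ B′ only if the connectives and both components agree.
assert Bin('and', Atom('A'), Atom('B')) != Bin('or', Atom('A'), Atom('B'))
```

Nothing here is specific to Python; any representation that keeps the construction steps recoverable from the result satisfies the four clauses.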

Main Connective and Component Structure

The main connective of a sentence is ¬ if the sentence has the form ¬A; it is ◦ if the sentence has the form A ◦ B. The sentences to which the main connective is applied are the sentence’s immediate components. ¬A has one immediate component: A; A ◦ B has two: A and B.

Unique readability guarantees that, for any sentential compound, the main connective and the immediate components are well defined.

We can view a sentential compound as decomposable into its immediate components; any of these, which is a sentential compound, is again decomposable into its immediate components.


And so on. All the sentences that are obtained in this process of repeated decomposition are known as the components of the original sentence.

For various purposes, it is convenient to regard also each sentence as a trivial component of itself. This is merely a terminological technicality concerning the use of ‘component’. The nontrivial components, those that are obtained in the process of repeated decomposition of a sentence, are referred to as the sentence’s proper components.

Two sentences A and B are components of each other only in the trivial case in which A = B. In other words, if A is a proper component of B, then B cannot be a component of A. This is intuitively obvious; for in the decomposition we always get smaller sentences. It can be proved in a rigorous way (we shall not do it at present), using the assumption that each sentence is generated from atomic sentences. As remarked (cf. footnote 3 page 28), sentential logic can be developed without assuming atomic sentences. In that case the requirement that no sentence is a proper component of itself is included among the syntactic postulates.

The concept of component can be characterized by a set of rules. A sentence is a component of another iff this can be established, in a finite number of steps, by applying the following rules. For all sentences A, B, C:

1. A is a component of A.

2. A is a component of ¬A.

3. If ◦ is a binary connective, then A and B are components of A ◦ B.

4. If A is a component of B and B is a component of C, then A is a component of C.

For example, the components of

(7) (A ∨ ¬B) ∧ ¬A

are, besides the sentence itself: (i) A ∨ ¬B and ¬A (by 3), (ii) A and ¬B (by 3 and 4), (iii) B (by 2 and 4), (iv) any component of A and any component of B (by 4).
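Rules 1–4 translate directly into a recursive procedure. In the sketch below (a hypothetical encoding, not the book’s notation: strings are atoms, ('not', a) a negation, (conn, a, b) a binary compound), the components of a compound are the compound itself together with the components of its immediate components. A set records each component once, as in the list (i)–(iv) above; it does not count occurrences.

```python
def components(s):
    """All components of s (including s itself, per rule 1)."""
    result = {s}
    if isinstance(s, tuple):
        for part in s[1:]:                 # immediate components (rules 2, 3)
            result |= components(part)     # closure under rule 4
    return result

# Sentence (7): (A ∨ ¬B) ∧ ¬A, with A and B atomic.
seven = ('and', ('or', 'A', ('not', 'B')), ('not', 'A'))
assert components(seven) == {seven,
                             ('or', 'A', ('not', 'B')), ('not', 'A'),
                             'A', ('not', 'B'), 'B'}
```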

Note that A is obtained here twice, first from A ∨ ¬B (via 3), and second from ¬A (via 2). In the list of components it suffices to list it once (as we have just done). But it occurs more than once as a component, and any specification of the sentence should make this clear.

The uniqueness of the decomposition (which is what unique readability amounts to) means that each composition step is uniquely determined. This implies that:

• A is a proper component of ¬B iff it is a component of B.


• A is a proper component of B ◦ C (where ◦ is a binary connective) iff it is a component of B or of C (or of both).

We can use these laws to show that certain sentences are not components of others. For example, ¬B ∧ ¬A is not a component of (A ∨ ¬B) ∧ ¬A. Here is the proof.

First, ¬B ∧ ¬A ≠ (A ∨ ¬B) ∧ ¬A. Otherwise, we would have by unique readability ¬B = A ∨ ¬B, which is impossible, since the left-hand side is a negation and the right-hand side a disjunction.

Hence, if ¬B ∧ ¬A is a component of (A ∨ ¬B) ∧ ¬A, it is a component either of A ∨ ¬B or of ¬A. By unique readability, it is different from either of these. The only way it can be a component of A ∨ ¬B is by being a component either of A or of ¬B. But both are impossible, since A and ¬B are proper components of ¬B ∧ ¬A. By the same reasoning, it is not a component of ¬A.

On the other hand, A ∧ ¬A may or may not be a component of (A ∨ ¬B) ∧ ¬A, for it can be a component of B.

Homework 2.8 (i) List the proper components of each of the first three sentences in Homework 2.7. (ii) Find which of the six sentences in Homework 2.7 have A ∨ B as a component, which cannot have it, and which may or may not have it. Prove, in the manner given above, one of your negative claims (i.e., that it cannot be a component). If it may or may not be a component, indicate when it is and when it is not.

Displayed Components: When a sentence is written as a sentential expression, some of its components are displayed in the expression; e.g., A and ¬A are displayed as components of B ∧ ¬A, and A ∧ B is displayed as a component of

(A ∨ ¬B) ∧ (A ∧B) .

We refer to such components as displayed components. (Of course, this is meaningful only with respect to a sentential expression.)

A given sentence can have components not displayed in the expression. They can be components of the sentence by being proper components of the sentences represented by the sentential variables. ¬A is not a displayed component of A ∧ B; but it can be a component of that sentence, by being a component of B. If the sentential variables represent atomic sentences, then of course all the components are displayed.


Occurrences

The same sentence can turn up, as a component of another sentence, more than once. For example, in (A ∨ ¬B) ∧ ¬A, A turns up as a component of the first conjunct, A ∨ ¬B, and also of the second conjunct, ¬A. For all we know, it may also be a component of B.

To distinguish between the different appearances of the same sentence as a component within another sentence we speak of occurrences. We say that there are at least two occurrences of A, as a component of (A ∨ ¬B) ∧ ¬A. And there are two occurrences of A ∨ B, as a component of

(A ∨ B) ∧ (A ∨ B) .

Here the number of occurrences is exactly two, because A ∨ B cannot be a component either of A or of B.

The concept of occurrence is very general. It applies whenever abstract structures can have repeating parts. For example, there are two occurrences of the word ‘Jack’ in the sentence

(8) Jill kissed Jack and Jack laughed.

And there are at least two occurrences of negation in (A ∨ ¬B) ∧ ¬A. There may be more, inasmuch as negation can occur also in A or in B; but in the sentential expression there are exactly two occurrences of the negation name ‘¬’.

Occurrences should be distinguished from tokens. The latter are physical entities associated with particular spatio-temporal regions. But occurrences are abstract parts of abstract structures. The two occurrences of ‘Jack’ in (8) are parts of the sentence-type, not of the sentence-token.

(When sentences are realized as tokens, their parts are usually represented as token-parts, which are tokens themselves. A token of (8) therefore contains two tokens of ‘Jack’. But these have to be distinguished from the two occurrences of ‘Jack’; the latter are parts of the type, not of the token.)

Displayed Occurrences: When sentences are presented through sentential expressions, certain occurrences of components, or of connectives, are displayed. It is possible that there are other, undisplayed occurrences, which occur within the sentences represented by the sentential variables. The situation here is the same as in the case of displayed components.

Terminology: To avoid long phrases we often omit the word ‘occurrence’. We may use ‘the first negation’ or ‘the leftmost conjunction’ when we mean the first occurrence of a negation, or the leftmost occurrence of a conjunction. Connective names are therefore used ambiguously, to denote the connective as well as to denote occurrences of the connective. Similarly, ‘the main connective’ can refer to the connective (a sentential operation), or to a particular occurrence of it. We can also speak of the first A (meaning the first occurrence of A), the first A ∨ B, etc. The context should make the intended meaning clear.

Main Connectives in Sentential Expressions: If a sentential expression is more than a single variable (i.e., if it contains connective names), then it has a unique occurrence of a connective name that marks the main connective. It also determines the immediate components. We can say, for example, that in the sentential expression

(9) ¬(¬A ∧ ¬B)

the main connective name is ‘¬’; or, more precisely, it is the leftmost occurrence of ‘¬’, or for short the leftmost ‘¬’. The immediate component of the sentence is ¬A ∧ ¬B. In

(10) (A ∧B) ∧ (A ∨B)

the main connective name is the second ‘∧’. When we speak of the main connectives of sentential expressions, we should be understood as referring to connective names, not to the connectives themselves.

Homework 2.9 (i) Encircle the main connective in each of the expressions in Homework 2.7. (ii) List, for each of these sentences, the components that have more than one displayed occurrence, and the number of displayed occurrences of each.

Substitutions of Sentential Components

From given sentences we can get any sentential compound in a finite sequence of steps, where each step consists in applying a sentential connective to previous sentences. The sentences to which connectives are applied during this process are the sentences used in the construction. Any sentence that is used in some step appears as a component of the end result; each separate use, say of B, introduces a separate occurrence of B as a component.

If, instead of using B in a certain step, we use a different sentence B′, we get a different outcome: the sentence obtained by substituting an occurrence of B′ for the occurrence of B. We say in this case that B′ has been substituted for that occurrence of B, or that the occurrence of B has been substituted by B′.

We can substitute at one go several occurrences of a component, by the same sentence, or by different ones. One often encounters substitutions in which all occurrences of a sentence (say B) are substituted by another sentence (say B′). We say in this case that B′ has been substituted for B. The substitution of B′ for B leaves the sentence unchanged if B is not a component, or if B′ = B.

Here are a few examples. From the sentence


(11) (A ∨ ¬A) ∧A

we get the following sentences, by substitutions.

(11.1) A∧B for the second occurrence of A:

(A ∨ ¬(A ∧B)) ∧ A

(11.2) A∧B for the first occurrence of A, and ¬B for the second:

((A ∧B) ∨ ¬¬B) ∧A

(11.3) A∧B for A:

((A ∧B) ∨ ¬(A ∧B)) ∧ (A ∧B)

(11.4) B for A ∨ ¬A:

B ∧ A

In (11), all occurrences of A are displayed, and so are all occurrences of A ∨ ¬A. (Can you see why?) But in other cases, possible occurrences of components can be undisplayed in the sentential expression. Quite often, in describing a substitution, one restricts it to displayed occurrences. ‘The second occurrence’ will thus mean the second displayed occurrence, and ‘the substitution of B′ for B’ will mean the substitution of B′ for all displayed occurrences of B. For example, from

(12) ¬(A ∧B) ∧ (C ∨B)

we obtain sentences as follows:

(12.1) ¬A for the first occurrence of B, and A for the second:

¬(A ∧ ¬A) ∧ (C ∨A)

(12.2) A∨B for B:

¬(A ∧ (A ∨B)) ∧ (C ∨ (A ∨B))

(12.3) C ∧ B for C ∨ B:

¬(A ∧ B) ∧ (C ∧ B)

If we want to substitute, in (12), A∨B for all occurrences of B, we have to describe it thus:

(12.2′) ¬(A′ ∧ (A ∨ B)) ∧ (C′ ∨ (A ∨ B)), where A′ and C′ are obtained from A and C, respectively, by substituting (throughout) A ∨ B for B.


Note: Usually, when we substitute, we want each sentential variable to be replaced by the same sentential expression on all its occurrences, because the variable stands for the same sentence throughout the expression. But we can, nonetheless, consider substitutions of the kind just given as syntactic manipulations that convert sentences to sentences. The following homework is such an exercise in pure syntax.

Homework 2.10 In each of the following triples the third sentence is obtained from the first through substituting certain displayed occurrences of the second by other sentences. Find the occurrences that have been substituted and the sentence substituting each occurrence.

1. [(A ∧ B) ∨ C] ∨ [¬(A ∧ B) ∨ B]
   A ∧ B
   (¬B ∨ C) ∨ (¬¬B ∨ B)

2. [(A ∨ B) ∧ (¬A ∨ C)] ∨ ¬(A ∧ C)
   A
   [((A ∨ B) ∨ B) ∧ (¬A ∨ C)] ∨ ¬[¬(A ∨ B) ∧ C]

3. (¬A ∨ (¬B ∧ C)) ∧ (¬B ∨ ¬(B ∧ A))
   ¬B
   (¬A ∨ (¬(B ∧ A) ∧ C)) ∧ (¬A ∨ ¬(B ∧ A))

Substitution is a very general notion. It applies to all structures in which occurrences of some parts are replaceable by other parts. For example, each occurrence of ‘Jack’ in (8) can be substituted by any proper name. And any occurrence of a binary connective can be substituted by another binary connective. If in

(A ∨ ¬B) ∧ (C ∨D)

we substitute all displayed occurrences of ∨ by ∧ we get:

(A ∧ ¬B) ∧ (C ∧D)

And if we toggle in that sentence (the displayed) ∨ and ∧, we get:

(A ∧ ¬B) ∨ (C ∧D)
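The toggling of ∧ and ∨ just illustrated is a purely syntactic manipulation, and simple to sketch in code (a hypothetical tuple encoding, not the book’s notation: strings are sentential variables; ('not', a) and ('and'/'or', a, b) are compounds):

```python
def toggle(e):
    """Swap every displayed '∧' and '∨' in a sketched expression."""
    if isinstance(e, str):
        return e
    if e[0] == 'not':
        return ('not', toggle(e[1]))
    conn = 'or' if e[0] == 'and' else 'and'
    return (conn, toggle(e[1]), toggle(e[2]))

# (A ∨ ¬B) ∧ (C ∨ D)  -->  (A ∧ ¬B) ∨ (C ∧ D)
e = ('and', ('or', 'A', ('not', 'B')), ('or', 'C', 'D'))
assert toggle(e) == ('or', ('and', 'A', ('not', 'B')), ('and', 'C', 'D'))
```

Note that only the displayed occurrences are toggled: whatever the variables A, B, C, D stand for is left untouched, exactly as in the example in the text.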


Repeated Conjunctions and Disjunctions

The expression

‘ A ∧B ∧ C ’

is ambiguous, for it can be interpreted as either of the two sentences:

(A ∧B) ∧ C A ∧ (B ∧ C) .

The main connective in the first is the second (displayed) occurrence of ∧; in the second it is the first occurrence of ∧. The two resulting sentences are different; they are, however, logically equivalent. Each is true just when all of A, B, C are true, and is false otherwise.

For many purposes the distinction between (A ∧ B) ∧ C and A ∧ (B ∧ C) does not matter. It is often convenient to ignore it and to use ‘A ∧ B ∧ C’ as if it were a sentential expression. Actually, it is an ambiguous expression, which can denote either of the two sentences above. We can use it as long as the truth of what we say does not depend on which of the two sentences we choose.

This generalizes to more than three conjuncts. We use

‘ A1 ∧A2 ∧ . . . ∧An ’

as if it were a sentential expression, where in fact it is an ambiguous expression that can denote any of the sentences obtained by grouping via parentheses. (The number of different groupings grows rapidly as n becomes larger.) All these sentences are logically equivalent. Each is true when all the Ai’s are true, and is false otherwise. We can ignore the distinction, as long as the truth of our claims does not depend on the particular grouping.

The case of repeated disjunctions is completely analogous. We use

‘ A1 ∨A2 ∨ . . . ∨An ’

as if it were a sentential expression. Actually, it is ambiguous and can denote any of the sentences obtained by parenthesizing. All of them are logically equivalent. Each is false if all the Ai’s are false, and is true otherwise. Again, our usage is harmless, as long as the particular groupings do not affect the truth of our claims.

2.3.1 Sentences as Trees

Trees are very useful structures, often used for representation and analysis. They can be defined as mathematical entities, but are easily grasped without a formal definition. The following is a tree, drawn according to a bottom-to-top convention.

[tree diagram]

The little circles are called nodes, the line segments joining them are called edges. The bottom node is the root. The nodes above a given node, which are joined to it by edges, are its children; the node is their parent. The extreme nodes, those without children, are the leaves. In the present example the root has three children, the leftmost child has two children, and the rightmost one is a leaf. If the root of the tree is a leaf, the tree consists of a single node.

Note that every node can serve as a root of a tree, which is a part of the whole tree. It consists of the node and all its descendants: its children, the children’s children, and so on. We call this the subtree determined by the node.

Since we read from top to bottom, trees are often drawn downward, with the root at the top. The same tree, drawn top-to-bottom, appears as:

[the same tree, drawn top-to-bottom]

And sometimes trees are drawn from left to right.

Trees are often labeled; that is, every node has an associated label, which is usually some symbol, but which can be any object. Different nodes can have the same label. In the trees that represent sentences, every leaf is labeled by a sentential variable, and every other node by a connective. The following four sentences are represented by the trees written below them.

A    ¬A    A ∧ B    A ∨ B

[four tree diagrams]


The general principle is very simple:

If the expression is a sentential variable, the tree has one node labeled by this variable. Else, the root is labeled by the main connective. If the main connective is negation, the root has one child; if it is a binary connective, the root has two children. The subtrees determined by the children represent the immediate components of the sentence.

Written as a tree,

(A ∨ C) ∧ ¬[A ∨ (¬B ∨ ¬C)]

is:

[tree diagram]

While taking more space than sequences, trees provide a very clear picture of the structure. The main connective labels the root. The displayed components are exactly the subtrees determined by the nodes.
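Indentation is a convenient substitute for drawn diagrams: print each node on its own line, with its children indented below it (the top-to-bottom convention). A sketch, under a hypothetical tuple encoding (strings are sentential variables; ('not', a) and ('and'/'or', a, b) are compounds; this is not the book’s notation):

```python
def draw(e, indent=0):
    """Print the tree of a sketched expression, root first, children indented."""
    pad = '  ' * indent
    if isinstance(e, str):
        print(pad + e)                      # a leaf, labeled by a variable
    else:
        print(pad + {'not': '¬', 'and': '∧', 'or': '∨'}[e[0]])
        for part in e[1:]:                  # the subtrees are the immediate components
            draw(part, indent + 1)

# (A ∨ C) ∧ ¬[A ∨ (¬B ∨ ¬C)]
draw(('and', ('or', 'A', 'C'),
      ('not', ('or', 'A', ('or', ('not', 'B'), ('not', 'C'))))))
```

Reading the printout, the root line carries the main connective, and each indented block below it is the tree of an immediate component.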

Homework 2.11 (i) Write down the tree representations of the sentences of Homework 2.7. (ii) Write down the sentential expressions that correspond to the following trees.

[tree diagrams]


2.3.2 Polish Notation

Polish notation is a way of writing sentences in sequential form, which ensures unique readability without using parentheses. The idea is very simple.

Write the connective name to the left of the sentences to which it applies. That is, if ‘◦’ is a binary connective, write ‘◦AB’ instead of ‘A ◦ B’.

Here are some examples that show how the notation works.

(A ∧ B) ∧ C becomes ∧∧ABC .

A ∧ (B ∧ C) becomes ∧A∧BC .

¬(A ∨ (B ∧ ¬C)) becomes ¬∨A∧B¬C .

It can be proven that Polish notation ensures unique readability, but the proof is not trivial.

The following prescription converts our sentential expressions into Polish notation:

Determine the main connective and the immediate components. Write the main connective leftmost and follow it by the immediate components, in their given order, after having converted each of them to Polish notation.

Since immediate components involve shorter expressions than the whole sentence, the prescription reduces the task to simpler tasks. Repeating it, one will eventually get the desired Polish-notation form.
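The prescription is naturally recursive, since each immediate component is converted by the same rule. Under a hypothetical tuple encoding (strings are sentential variables; ('not', a) and ('and'/'or', a, b) are compounds; not the book’s notation), it reproduces the three examples above:

```python
def polish(e):
    """Convert a sketched expression to Polish notation:
    the main connective first, then the converted immediate components."""
    if isinstance(e, str):
        return e
    sym = {'not': '¬', 'and': '∧', 'or': '∨'}[e[0]]
    return sym + ''.join(polish(part) for part in e[1:])

assert polish(('and', ('and', 'A', 'B'), 'C')) == '∧∧ABC'
assert polish(('and', 'A', ('and', 'B', 'C'))) == '∧A∧BC'
assert polish(('not', ('or', 'A', ('and', 'B', ('not', 'C'))))) == '¬∨A∧B¬C'
```

Note that the two groupings of A ∧ B ∧ C come out as different strings, ∧∧ABC and ∧A∧BC, even though no parentheses appear.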

Converting from Polish notation to ours requires a more difficult method that shall not be given at present. But you can acquire the skill with some practice.

Homework 2.12 Convert the expressions of Homework 2.7 to Polish notation.

2.4 Syntax and Semantics

A language can be studied purely from the syntactic perspective. Viewed in this way, the language is a system consisting of expressions built of symbols, according to rules. The rules classify the expressions and determine how they can be combined and recombined into larger units. This view ignores the interpretation of the language, i.e., its link with some non-linguistic domain: what its terms denote, what its sentences say, how their truth and falsity are determined. All of these come under the heading of semantics. The English sentence


(1) John likes Mary’s brother.

is syntactically analysed as a compound of the proper noun ‘John’, the third-person present tense of the transitive verb ‘like’, and the noun phrase ‘Mary’s brother’, arranged in that order. The syntax can tell us that any compound of that form is a sentence. And it can rule out as ungrammatical the combination

(2) Likes John Mary’s brother.

But it does not tell us what the words refer to, what the sentence says, and it does not mention truth and falsity. All of these belong to the semantics.

Now the semantics must involve the syntax in an essential way, because syntactic classification and syntactic structure are among the factors that determine how expressions are interpreted. But the syntax can stand by itself. A computer can handle the syntax as a system of symbols without semantics.

The same distinction between syntax and semantics obtains in formal linguistic systems. In our case, the rules that govern sentential structure belong to the syntax. Under this heading come: the unique readability property, the main connective, components, occurrences, substitutions, the tree structure, and the like. But truth-values and the interpretation of the connectives, which is given by their truth-tables, belong to the semantics. So do all concepts whose definitions involve truth and falsity: logical truth, logical falsity and logical equivalence.

You should be aware of this fundamental distinction and know how to apply it to richer systems. In each case we shall have a syntax and a semantics. We shall later see that there are theorems that connect syntactic notions with semantic ones. For example, logical truths can be given a purely syntactic characterization. But do not confuse the two and do not bring semantic notions into the syntax. Logically equivalent sentences, for example, can have completely different syntactic structures.

2.5 Sentential Logic as an Algebra

2.5.0

We noted already that logical equivalence does not amount to equality. But logical equivalence shares with equality certain features, which make it possible to adopt an algebraic approach in which the equivalence symbol ‘≡’ plays a role analogous to that of ‘=’. Recall that, like any equivalence relation, logical equivalence is reflexive, symmetric and transitive. Moreover, as stated earlier (cf. 2.2.1), it satisfies the substitution of equivalents principle:


If A ≡ A′, and we change B to B′ by substituting in it one or more occurrences of A by A′, then B ≡ B′.

The principle can be proved formally, but it is quite evident on intuitive grounds: the only way in which an occurrence of the component A can affect B’s truth-value is through the truth-value of A; if A and A′ have always the same value, the substitution makes no difference for the value of B. Hence B ≡ B′.

The algebraic method for establishing logical equivalence is the following. First we fix certain equivalences as our starting point (if you wish, our axioms); then we derive from them other equivalences by using repeatedly the substitution of equivalents principle. The equivalences enclosed in the following box (along with another group to be given in 2.5.2) can play the role of the starting point.

Double Negation Law: ¬¬A ≡ A

Associativity:

(A ∧B) ∧ C ≡ A ∧ (B ∧ C) (A ∨B) ∨ C ≡ A ∨ (B ∨ C)

Commutativity:

A ∧B ≡ B ∧ A A ∨B ≡ B ∨A

Idempotence:

A ∧A ≡ A A ∨A ≡ A

Distributive Laws:

A ∧ (B ∨ C) ≡ (A ∧B) ∨ (A ∧ C) A ∨ (B ∧ C) ≡ (A ∨B) ∧ (A ∨ C)

De Morgan’s Laws:

¬(A ∧B) ≡ (¬A) ∨ (¬B) ¬(A ∨B) ≡ (¬A) ∧ (¬B)
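Each boxed law can be checked mechanically: a law holds iff its two sides agree on every row of the truth-table, and with three variables there are only eight rows. A sketch, with a selection of the laws written directly as Python boolean expressions (the selection and names are mine):

```python
from itertools import product

laws = {
    'double negation':     lambda a, b, c: (not (not a)) == a,
    'distribute ∧ over ∨': lambda a, b, c: (a and (b or c)) == ((a and b) or (a and c)),
    'distribute ∨ over ∧': lambda a, b, c: (a or (b and c)) == ((a or b) and (a or c)),
    'De Morgan (∧)':       lambda a, b, c: (not (a and b)) == ((not a) or (not b)),
    'De Morgan (∨)':       lambda a, b, c: (not (a or b)) == ((not a) and (not b)),
}

for name, law in laws.items():
    # every law must hold on all eight rows for A, B, C
    assert all(law(*row) for row in product([True, False], repeat=3)), name
print('all laws verified')    # prints: all laws verified
```

The remaining pairs (associativity, commutativity, idempotence) can be added to the dictionary in the same style.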


The equivalences are meant as general laws; they hold for all A, B, C; this should be clear from the context, even though the words ‘for all’ do not appear.

Except for Double Negation, the laws are listed in pairs; each pair consists of two dual laws: the leftmost is the law for conjunction, the rightmost the corresponding law for disjunction. E.g., De Morgan’s first law is for conjunction, the second for disjunction. Dual laws are obtained from each other by toggling, throughout, ‘∧’ and ‘∨’.

Double Negation, Associativity and Commutativity are obvious and were discussed already. Also obvious is Idempotence (the name means the same power: A ∧ A and A ∨ A have “the same power” as A). The last two pairs are less obvious, but are easily verified via truth-tables.

Consider the distributive law for conjunction:

A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)

When we pass from the left-hand side to the right-hand side, we distribute ∧ over ∨. This is analogous to the arithmetical law by which multiplication can be distributed over addition:⁵

x · (y + z) = (x · y) + (x · z)

We have also the dual law for distributing disjunction over conjunction. But here there is no arithmetical analogue. Arithmetical addition does not distribute over multiplication: in general x + (y · z) ≠ (x + y) · (x + z).

The distributing of ∧ over ∨ pushes in the conjunction: ‘∧’ enters into the parentheses of ‘(A ∨ B)’. Similarly, the distributing of ∨ over ∧ pushes in the disjunction. The opposite move, from the right-hand side to the left-hand side, involves a pulling out: ∧ is pulled out in the law for conjunction, ∨ is pulled out in the law for disjunction. Also, in the first case there is a pulling out of the common conjunct A, in the second case of the common disjunct A.

In a similar way, De Morgan’s laws, in the left-to-right direction, involve the pushing in of negation; in the opposite direction they involve a pulling out. This is accompanied by conjunction/disjunction toggling. Therefore these are not distributive laws. You cannot distribute negation, because in general ¬(A ∧ B) is not logically equivalent to (¬A) ∧ (¬B), and similarly for disjunctions.

Instances: Since the laws hold for all A, B, C, we can substitute any sentential expressions for the sentential variables and get an equivalence that holds. (Of course, occurrences of the same variable should be substituted by the same sentential expression.)

An equivalence obtained in this way is an instance of the law. For example,

(B ∨ C) ∧ ((A ∨ B) ∨ C) ≡ ((B ∨ C) ∧ (A ∨ B)) ∨ ((B ∨ C) ∧ C)

is an instance of the distributive law for conjunction. It is obtained by substituting

‘B ∨ C’ for ‘A’, and ‘A ∨ B’ for ‘B’.

To identify a given equivalence as an instance of some law is not always easy. In the following homework you are required to do such identifications. Note that the law is grounded in the semantics, but being an instance is a purely syntactic notion. The following is an exercise in syntax.

⁵ Sometimes conjunction and disjunction are compared to multiplication and addition. If we put T = 1, F = 0, then the truth-value of a conjunction is the product of the conjuncts’ values. But then disjunction does not correspond to addition; the value of a disjunction is not the sum but the maximum of the disjuncts’ values.

Homework 2.13 Each of the following is an instance of one of the listed equivalence laws. Find the law and the substitution that has been used to get the instance.

Note: Sometimes the two sides of the equivalence have been switched around.

1. ¬(¬A∧B ∨ ¬A) ≡ ¬(¬A ∧ B) ∧ ¬¬A
2. ¬(A ∨ ¬B) ∧ ¬(B ∧ C) ≡ ¬((A ∨ ¬B) ∨ (B ∧ C))
3. ¬(¬(A ∨ B) ∧ B) ≡ ¬¬(A ∨ B) ∨ ¬B
4. ¬B ∨ (C ∧ ¬A) ≡ (¬B ∨ C) ∧ (¬B ∨ ¬A)
5. (¬(A ∨ B) ∧ A) ∨ (¬(A ∨ B) ∧ A) ≡ ¬(A ∨ B) ∧ (A ∨ A)
6. A∧(B∨C) ∨ A∧C ≡ A∧C ∨ A∧(B∨C)
7. (B∨A) ∧ ((A∨B)∧(B∨A)) ≡ ((B∨A) ∧ (A∨B)) ∧ (B∨A)
8. ¬(C ∧ (A ∨ C)) ≡ ¬C ∨ ¬(A ∨ C)
9. (A ∨ B) ∧ (A ∨ C) ≡ A ∨ B∧C
10. ¬B ∧ (B ∨ ¬B) ≡ (¬B ∧ B) ∨ (¬B ∧ ¬B)

Left and Right Distributive Laws: In the distributive laws listed in the box, the pushed-in sentence appears on the left. Hence we call them the left distributive laws. The right distributive laws are:

(B ∨ C) ∧A ≡ (B ∧A) ∨ (C ∧A) (B ∧ C) ∨A ≡ (B ∨A) ∧ (C ∨A)

They can be established directly via truth-tables, or derived from the left distributive lawsby using commutativity. For example:

(B ∨ C) ∧A ≡ A ∧ (B ∨ C) ≡ (A ∧B) ∨ (A ∧ C) ≡ (B ∧A) ∨ (C ∧ A)

Page 75: A Course in Symbolic Logic

58 CHAPTER 2. SENTENTIAL LOGIC

The first equivalence is an instance of commutativity, the second of the distributive law for conjunction, and the last is a simultaneous substitution of A ∧ B and A ∧ C by the equivalent (via commutativity) B ∧ A and C ∧ A. This last step can be split into two:

(A ∧B) ∨ (A ∧ C) ≡ (B ∧ A) ∨ (A ∧ C) ≡ (B ∧ A) ∨ (C ∧A)

Altogether there are three uses of commutativity. From now on we shall refer by ‘distributive law’ to both left and right distributive laws, and we shall not bother to indicate every separate use of commutativity.

2.5.1 Using the Equivalence Laws

We can establish an equivalence in a sequence of steps that create a chain:

A1 ≡ A2 ≡ . . . ≡ An

This, via transitivity, proves A1 ≡ An. The idea is that each of the equivalences Ai ≡ Ai+1 should be easily deducible, or previously established. Here is an example:

¬[(A ∧ B) ∨ ¬C] ≡ ¬(A ∧ B) ∧ ¬¬C ≡ ¬(A ∧ B) ∧ C ≡ (¬A ∨ ¬B) ∧ C ≡ (¬A ∧ C) ∨ (¬B ∧ C)

Find for yourself what law (or laws) is used in each step. The chain proves:

¬[(A ∧B) ∨ ¬C] ≡ (¬A ∧ C) ∨ (¬B ∧ C) .

In this way we have simplified

¬[(A ∧B) ∨ ¬C] to (¬A ∧ C) ∨ (¬B ∧ C) .

Another, longer but clearer, style of presenting equivalence proofs consists in writing the sentences on separate lines, indicating in the margin the grounds for the step. Our last chain becomes:

1. ¬[(A ∧B) ∨ ¬C]

2. ¬(A ∧B) ∧ ¬¬C De Morgan’s law for disjunction,

3. ¬(A ∧B) ∧ C double negation law,

4. (¬A ∨ ¬B) ∧ C De Morgan’s law for conjunction,

5. (¬A ∧ C) ∨ (¬B ∧ C) right distributive law for conjunction.


Often, several simple steps are combined into one. We can, for example, drop double negations immediately without special indication; accordingly, we can pass directly from 1. to 3. in the proof above. We also need not use separate steps for changing (via associativity) the grouping in repeated conjunctions and disjunctions; in fact, these groupings can be ignored (cf. 2.3.0 “Repeated Conjunctions and Disjunction”). Similarly, we need not bother with explicit changes of order (via commutativity) in conjunctions and disjunctions, or with explicit deletions of repeated conjuncts or disjuncts (via idempotence). Once you get the hang of it you can assimilate these steps into others.

Pushing-In Negations

We mentioned already the pushing-in of negations, via De Morgan’s laws:

From ¬(A ∧B) to ¬A ∨ ¬B . From ¬(A ∨B) to ¬A ∧ ¬B .

This replaces one occurrence of ‘¬’, whose scope is A ∧ B or A ∨ B, by two occurrences with smaller scopes: A and B. In the last example, negation is pushed inside in the passage from 1 to 2; then it is pushed in again in the passage from 3 to 4.

As long as there are components of the form ¬(A ∧ B), or ¬(A ∨ B), we can push negation in. By repeating this process, we will eventually get a sentence that has no components of these forms. (The mathematical proof of this claim will not be given here. But intuitively it is very clear, especially after having worked out a few examples.) In addition, we can always drop double negations. At the end of the process there will be no component with a string of more than one negation. Consequently, any sentential expression whose connectives are among ‘¬’, ‘∧’, and ‘∨’ can be transformed into an equivalent one in which negation applies only to the sentential variables.

If several negation-occurrences can be pushed in, we can choose any of them. The final outcome does not depend on the choices of the pushed-in negations, provided that negation is pushed all the way in, all double negations are dropped, and no other laws are applied. But the number of steps can vary. It is advisable to start with the outermost negations, because as a negation moves in it will form double negations with the inner ones, which can be immediately dropped. Here is an example to illustrate all these points. Given ¬[A ∧ ¬(B ∧ C)], we can start by pushing in the inner (i.e., second) negation:

¬[A ∧ ¬(B ∧ C)], ¬[A ∧ (¬B ∨ ¬C)], ¬A ∨ ¬(¬B ∨ ¬C), ¬A ∨ (B ∧ C)

If we start by pushing-in the outermost negation, we get:

¬[A ∧ ¬(B ∧ C)], ¬A ∨ ¬¬(B ∧ C), ¬A ∨ (B ∧ C)
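For readers who like to experiment, the negation-pushing procedure can be sketched in a few lines of Python. The formula representation (variables as strings; tuples ('not', f), ('and', f, g), ('or', f, g)) and the name push_neg are invented here purely for illustration:

```python
def push_neg(f):
    """Drive negations inward via De Morgan's laws and drop double
    negations, until negation applies only to sentential variables."""
    if isinstance(f, str):
        return f
    if f[0] in ('and', 'or'):
        return (f[0], push_neg(f[1]), push_neg(f[2]))
    g = f[1]                    # f is ('not', g)
    if isinstance(g, str):
        return f                # negation of a variable: nothing to do
    if g[0] == 'not':
        return push_neg(g[1])   # double negation law
    if g[0] == 'and':           # De Morgan for conjunction
        return ('or', push_neg(('not', g[1])), push_neg(('not', g[2])))
    return ('and', push_neg(('not', g[1])), push_neg(('not', g[2])))   # for disjunction

# The example from the text: ¬[A ∧ ¬(B ∧ C)]
example = ('not', ('and', 'A', ('not', ('and', 'B', 'C'))))
print(push_neg(example))   # ('or', ('not', 'A'), ('and', 'B', 'C')), i.e. ¬A ∨ (B ∧ C)
```

Note how the double negation law and the two De Morgan laws each appear as one recursive case.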


De Morgan’s Laws for Repeated Conjunctions and Disjunctions

The law for conjunctions with more than two conjuncts is:

¬(A1 ∧A2 ∧ . . . ∧An) ≡ ¬A1 ∨ ¬A2 ∨ . . . ∨ ¬An

Here we are ignoring the grouping, since we are interested in logical equivalence only. The equivalence can be deduced by repeated applications of the two-conjunct version of De Morgan. For example, if n = 3, we have:

¬[(A1 ∧ A2) ∧ A3] ≡ ¬(A1 ∧ A2) ∨ ¬A3 ≡ (¬A1 ∨ ¬A2) ∨ ¬A3

It is also easily derivable by considering truth-values:

The left-hand side is true iff A1 ∧ A2 ∧ . . . ∧ An is false, i.e., iff at least one of the Ai’s is false. The right-hand side is true iff at least one of the sentences ¬Ai is true, i.e., iff at least one of the Ai’s is false. Hence, the left-hand side is true iff the right-hand side is.

The case of disjunctions is the exact dual:

¬(A1 ∨A2 ∨ . . . ∨An) ≡ ¬A1 ∧ ¬A2 ∧ . . . ∧ ¬An

Duality applies throughout; each of the above observations or claims has a dual.
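Both n-ary laws can also be confirmed mechanically, by running through all 2^n assignments. The following Python check is ours, for illustration only:

```python
from itertools import product

def demorgan_ok(n):
    """Check both n-ary De Morgan laws over all 2^n truth-value assignments."""
    for vals in product([True, False], repeat=n):
        # ¬(A1 ∧ ... ∧ An) ≡ ¬A1 ∨ ... ∨ ¬An
        conj_law = (not all(vals)) == any(not v for v in vals)
        # ¬(A1 ∨ ... ∨ An) ≡ ¬A1 ∧ ... ∧ ¬An
        disj_law = (not any(vals)) == all(not v for v in vals)
        if not (conj_law and disj_law):
            return False
    return True

print(all(demorgan_ok(n) for n in range(1, 7)))   # True
```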

Pushing-In Conjunctions

Conjunctions are pushed in by repeatedly distributing conjunction over disjunction. As explained already, we do it by going left-to-right in the distributive laws for conjunction:

A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)    (B ∨ C) ∧ A ≡ (B ∧ A) ∨ (C ∧ A)

It increases the size (both ‘A’ and ‘∧’ occur once on the left, but twice on the right), but decreases the scope of the conjunction (instead of B ∨ C we have the scopes B and C). It also increases the scope of the disjunction. Here is an example of pushing-in conjunctions:

A ∧ [(¬B ∨ C) ∧ ¬D] ≡ A ∧ [(¬B ∧ ¬D) ∨ (C ∧ ¬D)] ≡ (A ∧ ¬B ∧ ¬D) ∨ (A ∧ C ∧ ¬D)

If we keep pushing in conjunctions, we must eventually arrive at a point where no further pushing-in is possible; at that stage, there are no components either of the form A ∧ (B ∨ C) or of the form (B ∨ C) ∧ A. You may convince yourself that the process must terminate by working out a few examples; the formal proof, which will not be given here, is far from trivial.
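The process just described can be sketched programmatically, in the same invented tuple representation as before (variables as strings; ('not', f), ('and', f, g), ('or', f, g)). The sketch assumes negations have already been pushed in, so a 'not' component is treated as atomic; the names is_or and push_conj are ours:

```python
def is_or(f):
    return isinstance(f, tuple) and f[0] == 'or'

def push_conj(f):
    """Distribute ∧ over ∨ until no component has the form
    A ∧ (B ∨ C) or (B ∨ C) ∧ A."""
    if isinstance(f, str) or f[0] == 'not':
        return f
    a, b = push_conj(f[1]), push_conj(f[2])
    if f[0] == 'or':
        return ('or', a, b)
    if is_or(b):   # left distributive law:  A ∧ (B∨C) ≡ (A∧B) ∨ (A∧C)
        return ('or', push_conj(('and', a, b[1])), push_conj(('and', a, b[2])))
    if is_or(a):   # right distributive law: (B∨C) ∧ A ≡ (B∧A) ∨ (C∧A)
        return ('or', push_conj(('and', a[1], b)), push_conj(('and', a[2], b)))
    return ('and', a, b)

# The example from the text: A ∧ [(¬B ∨ C) ∧ ¬D]
g = ('and', 'A', ('and', ('or', ('not', 'B'), 'C'), ('not', 'D')))
print(push_conj(g))
# ('or', ('and', 'A', ('and', ('not', 'B'), ('not', 'D'))),
#        ('and', 'A', ('and', 'C', ('not', 'D'))))
```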

WATCH IT: Distributing the conjunction in ¬[(B ∨ C) ∧ A] yields ¬[B∧A ∨ C∧A]. But you cannot distribute the conjunction in ¬(B ∨ C) ∧ A, because the first conjunct is not a


disjunction, but a negation. You can, however, simplify by pushing in the negation: ¬B ∧ ¬C ∧ A.

Simultaneous Distributing of Conjunctions: Repeated applications of the distributive laws to

(A1 ∨ A2) ∧ (B1 ∨ B2)

yields the chain:

(A1∨A2)∧(B1∨B2) ≡ (A1∧(B1∨B2))∨(A2∧(B1∨B2)) ≡ (A1∧B1)∨(A1∧B2)∨(A2∧B1)∨(A2∧B2) .

The final sentence is a disjunction of all the conjunctions Ai ∧ Bj. This generalizes to arbitrary disjunctions:

(A1 ∨ A2 ∨ . . . ∨ Am) ∧ (B1 ∨B2 ∨ . . . ∨Bn)

is logically equivalent to the disjunction of all the conjunctions in which one of the Ai’s is “conjuncted” with one of the Bj’s:

(A1∧B1) ∨ . . . ∨ (A1∧Bn) ∨ (A2∧B1) ∨ . . . ∨ (A2∧Bn) ∨ . . . . . . ∨ (Am∧B1) ∨ . . . ∨ (Am∧Bn)

Altogether there are m·n disjuncts. This generalizes further to more than two conjuncts:

(A1 ∨ . . . ∨Am) ∧ (B1 ∨ . . . ∨Bn) ∧ (C1 ∨ . . . ∨ Cp)

is logically equivalent to the disjunctions of all conjunctions of the form

Ai ∧Bj ∧ Ck ,

where 1 ≤ i ≤ m, 1 ≤ j ≤ n, 1 ≤ k ≤ p; here we get m·n·p disjuncts (there are m choices for i, n choices for j, and p choices for k). In particular,

(A1∨A2) ∧ (B1∨B2) ∧ (C1∨C2)

is logically equivalent to a disjunction of 8 conjunctions:

A1∧B1∧C1 ∨ A1∧B1∧C2 ∨ A1∧B2∧C1 ∨ A1∧B2∧C2 ∨ A2∧B1∧C1 ∨ A2∧B1∧C2 ∨ A2∧B2∧C1 ∨ A2∧B2∧C2
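The enumeration of the m·n·p disjuncts can be read off mechanically: pick one disjunct from each conjunct, in every possible way. In the following sketch the list names are ours, standing in for (A1∨A2) ∧ (B1∨B2) ∧ (C1∨C2):

```python
from itertools import product

# Each inner list holds the disjuncts of one conjunct.
clauses = [['A1', 'A2'], ['B1', 'B2'], ['C1', 'C2']]

# One conjunction Ai ∧ Bj ∧ Ck per way of choosing a disjunct from each clause.
conjunctions = [' ∧ '.join(choice) for choice in product(*clauses)]
print(len(conjunctions))   # 8, i.e. m·n·p = 2·2·2
print(conjunctions[0])     # A1 ∧ B1 ∧ C1
```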

Pushing-In Disjunctions

Pushing-in disjunctions, which is carried out via the distributive laws for disjunction, is the exact dual. It reduces the scopes of ∨, but enlarges those of ∧. For example, two steps of pushing-in disjunction yield:

A ∨ [(¬B ∧ C) ∨ ¬D] ≡ A ∨ [(¬B ∨ ¬D) ∧ (C ∨ ¬D)] ≡ (A ∨ ¬B ∨ ¬D) ∧ (A ∨ C ∨ ¬D)


Repeated pushing-in of disjunction terminates at a stage where there are no components either of the form A ∨ (B ∧ C) or of the form (B ∧ C) ∨ A.

As in the case of conjunction, we can distribute, at one go, disjunction over several conjunctions:

(A1 ∧A2 ∧ . . . ∧Am) ∨ (B1 ∧B2 ∧ . . . ∧Bn)

is logically equivalent to:

(A1∨B1) ∧ . . . ∧ (A1∨Bn) ∧ (A2∨B1) ∧ . . . ∧ (A2∨Bn) ∧ . . . . . . ∧ (Am∨B1) ∧ . . . ∧ (Am∨Bn)

And this generalizes further to more than two disjuncts.

Homework

2.14 Let G be the sentence:

¬[(A ∨ ¬B) ∧ ¬C] ∧ ¬[¬C ∨ (¬D ∧E)]

Construct the following sentences:

G1: Obtained from G by pushing-in negations all the way.

G2: Obtained from G1 by pushing-in conjunctions all the way.

G3: Obtained from G1 by pushing-in disjunctions all the way.

If possible, simplify G2 and G3 by using the idempotence laws.

2.15 Let H = ¬G, where G is as in 2.14 above. Construct H1, obtained from H by pushing-in negations all the way; H2, obtained from H1 by pushing-in conjunctions all the way; and H3, obtained from H1 by pushing-in disjunctions all the way.

WATCH IT: It is inadvisable to mix pushing-in conjunctions and pushing-in disjunctions. The first reduces the scopes of ∧ and enlarges those of ∨. The second has the opposite effect. And both increase sentence size. If you interlace them you will get longer and longer sentences. For example:

A ∧ (B ∨ C) ≡ (A ∧B) ∨ (A ∧ C) ≡ [(A ∧B) ∨A] ∧ [(A ∧B) ∨ C] ≡

{[(A ∧B) ∨A] ∧ (A ∧B)} ∨ {[(A ∧B) ∨A] ∧ C} ≡ . . .

Occasionally you may want to apply the two kinds of pushing-in to separate components, or to combine them with other operations in between.


Pulling Out Negations and Common Factors

By applying De Morgan’s laws from right to left, we can pull out negations: ¬A ∨ ¬B is converted to ¬(A ∧ B), and ¬A ∧ ¬B is converted to ¬(A ∨ B). This generalizes to disjunctions of more than two disjuncts; similarly for conjunctions; i.e., we can replace

¬A1 ∨ ¬A2 ∨ . . . ∨ ¬An by ¬(A1 ∧ A2 ∧ . . . ∧ An) ,

¬A1 ∧ ¬A2 ∧ . . . ∧ ¬An by ¬(A1 ∨ A2 ∨ . . . ∨ An) .

Similarly, by applying the distributive laws in the right-to-left direction we pull out a common factor; this is a common conjunct if the law is for conjunction, a common disjunct if the law is for disjunction. Thus we can replace:

(A∧B) ∨ (A∧C) by A ∧ (B ∨ C) ,

(A∨B) ∧ (A∨C) by A ∨ (B ∧ C) .

It generalizes to more than two disjuncts that share a common conjunct, and to more than two conjuncts that share a common disjunct:

(A∧B) ∨ (A∧C) ∨ (A∧D) is replaceable by A ∧ (B ∨ C ∨D) .

(A∨B) ∧ (A∨C) ∧ (A∨D) is replaceable by A ∨ (B ∧ C ∧D) .

Since each pull-out step reduces the size of the sentence, repeated pull-outs, whether of common conjuncts, of common disjuncts, or of both, must terminate at a stage where no further pulling out is possible. In the case of more than two disjuncts (or more than two conjuncts), the final outcome of repeated pull-outs can be highly sensitive to the order in which they are done. Consider for example:

(A∧B) ∨ (A∧C) ∨ (D∧C) .

If we group together the first two disjuncts and pull out A, we get:

[A∧(B∨C)] ∨ (D∧C)

and no further pulling out is possible. But if we group together the second and the third disjuncts and pull out C, we get:

(A∧B) ∨ [(A∨D)∧C]

and, again, no further pulling out is possible. The two final outcomes are quite different; they are of course logically equivalent, but the equivalence is not obvious at first glance.
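The equivalence of the two outcomes is easy to confirm by running through all sixteen assignments; in the following check the helper names are ours, for illustration:

```python
from itertools import product

def outcome1(a, b, c, d):   # [A∧(B∨C)] ∨ (D∧C)
    return (a and (b or c)) or (d and c)

def outcome2(a, b, c, d):   # (A∧B) ∨ [(A∨D)∧C]
    return (a and b) or ((a or d) and c)

print(all(outcome1(*v) == outcome2(*v)
          for v in product([True, False], repeat=4)))   # True
```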


2.5.2 Additional Equivalence Laws

If we add to the laws of our previous box the following ones, we get a set of laws from which every tautological equivalence is derivable by (repeated) substitutions of equivalents. We shall not prove this mathematical theorem. But you can see how the additional laws enable us to drop certain conjuncts or certain disjuncts from an expression.

Tautological Conjunct: Contradictory Disjunct:

A ∧ (B ∨ ¬B) ≡ A A ∨ (B ∧ ¬B) ≡ A

Contradictory Conjunct: Tautological Disjunct:

A ∧ (B ∧ ¬B) ≡ B ∧ ¬B A ∨ (B ∨ ¬B) ≡ B ∨ ¬B

You can easily verify these laws by truth-value considerations. E.g., the truth-value of A ∧ (B ∨ ¬B) is the same as the value of A, because the value of B ∨ ¬B is always T.

Note: B ∨ ¬B can be replaced, in the above laws, by any logically equivalent sentence, i.e., by any tautology. Similarly B ∧ ¬B can be replaced by any contradiction.

The following pair of laws is derivable from the previous ones. They are important because, together with the other laws that do not involve negation (i.e., all other laws except De Morgan’s), they characterize all the algebraic properties of sentential logic that involve conjunction and disjunction only. They are also very useful in simplifications, by enabling immediate deletions of certain conjuncts or certain disjuncts.

A ∧ (A ∨B) ≡ A A ∨ (A ∧B) ≡ A

Again, simple truth-value considerations show that these equivalences hold. Take for example the first: if A gets T, so does A ∨ B and so does A ∧ (A ∨ B); if A gets F, so does A ∧ (A ∨ B). The derivation of the laws from the previous ones is much less obvious than the truth-value arguments. But it is of mathematical interest that certain laws are formally derivable from others.


Here is a derivation of the first. Note that the first step consists in applying the contradictory disjunct law from right to left, i.e., in adding a contradictory disjunct.

1. A ∧ (A ∨ B)

2. [A∧(A ∨ B)] ∨ (A ∧ ¬A) adding a contradictory disjunct,

3. A ∧ [A∨B∨¬A] pulling out the common conjunct A,

4. A ∧ [B ∨ (A ∨ ¬A)] commutativity and associativity of disjunction,

5. A ∧ (A ∨ ¬A) tautological disjunct,

6. A tautological conjunct.
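The truth-value verification of the two laws can also be run mechanically over the four possible assignments; the helper name in this sketch is ours:

```python
def laws_hold():
    """Check both laws over all assignments to A and B."""
    return all((a and (a or b)) == a and    # A ∧ (A∨B) ≡ A
               (a or (a and b)) == a        # A ∨ (A∧B) ≡ A
               for a in (True, False) for b in (True, False))

print(laws_hold())   # True
```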

Redundant Conjuncts and Disjuncts: Given a conjunction

A1 ∧ A2 ∧ . . . ∧An

or a disjunction

A1 ∨ A2 ∨ . . . ∨An

a conjunct (or a disjunct) Ai is redundant if its deletion from the conjunction (or from the disjunction) yields a logically equivalent sentence. For example, if the same conjunct, or the same disjunct, occurs more than once, we can delete repeated occurrences, via idempotence. The two last groups of equivalence laws imply the following cases of redundancy.

Redundant Conjuncts:

(C1) Every tautological conjunct is redundant.

(C2) If one of the conjuncts is contradictory then every other conjunct is redundant.

(C3) If A is a conjunct, then every other conjunct of the form A ∨ B (or one logically equivalent to A ∨ B) is redundant.

Redundant Disjuncts:

(D1) Every contradictory disjunct is redundant.

(D2) If one of the disjuncts is tautological, then every other disjunct is redundant.

(D3) If A is a disjunct, then every other disjunct of the form A ∧ B (or one logically equivalent to A ∧ B) is redundant.


Homework 2.16 Apply the simplification techniques in order to simplify the sentences of Homework 2.1. Write the simplified sentences (i) using only ¬ and ∧, (ii) using only ¬ and ∨.

Note: You can now see the advantage of having both ∧ and ∨ in the system. Every sentence can be rewritten in logically equivalent form using only ¬ and ∧ (or only ¬ and ∨). But the rewriting results in hard-to-grasp constructs and renders the algebraic technique highly obscure (at least for humans).

The final simplified form of a sentence should not contain redundant conjuncts or redundant disjuncts, for one can simplify further by deletion. But redundant components can be useful in the middle of a derivation; a contradictory disjunct was added in the first step of the last derivation. Here is another example. We show that A ∧ ¬A ≡ B ∧ ¬B, using only the law for contradictory disjunct (and the distributive and commutative laws).

1. A ∧ ¬A

2. A∧¬A ∨ B∧¬B adding the contradictory B∧¬B,

3. B ∧ ¬B dropping the contradictory A ∧ ¬A.

The last step involves switching the order of disjuncts, via commutativity, and applying the contradictory disjunct law in the form stated above. Having shown this, we now derive the contradictory conjunct law from the contradictory disjunct law:

1. A ∧ (B ∧ ¬B)

2. A ∧ (B ∧ ¬B) ∨ A ∧ ¬A adding a contradictory disjunct,

3. A ∧ (B∧¬B ∨ ¬A) pulling out A,

4. A ∧ ¬A dropping the contradictory B ∧ ¬B,

5. B ∧ ¬B by the previously established equivalence.

Such derivations do not follow the standard patterns of pushing-in, or pulling out. Some inventiveness is required.

It can be shown that each of the four laws for tautological and contradictory conjuncts and disjuncts implies, given the preceding laws, the other three. (Can you see how, using De Morgan’s laws, you can get the contradictory disjunct law from the tautological conjunct law, and vice versa?)

Note: We can remove the tautological conjuncts from any given conjunction. What if all the conjuncts are tautological? The conjunction is then a tautology and can be simplified to A ∨ ¬A; but we cannot remove all conjuncts, for this will leave us with no sentence. Sometimes


a convention is made by which one posits “the empty conjunction” as a tautology; one adds a special sentence that, by definition, gets only the value T. One can also make sense of “the empty conjunction” by regarding the truth of each conjunct as a truth-constraint; the conjunction is true just when it meets all its truth-constraints. If there are no conjuncts, there are no truth-constraints and the conjunction is always true, i.e., tautological.

The dual case is a disjunction in which all disjuncts are contradictory. The disjunction is then contradictory and can be simplified to A ∧ ¬A. One can posit “the empty disjunction” as the contradictory sentence: a special sentence that, by definition, gets only F. One can make sense of “the empty disjunction” by regarding the falsity of each disjunct as a falsity-constraint; the disjunction is false just when it meets all its falsity-constraints. If there are no disjuncts, there are no falsity-constraints and the disjunction is always false, i.e., contradictory.

Homework 2.17 Simplify the following sentences. Try to get simplifications that are as simple, or as short, as possible. Indicate briefly how the simplification is achieved.

1. A∧B ∨ ¬A∧B

2. A∧B ∨ ¬B∧A

3. (A ∨ B) ∧ (A ∨ ¬B)

4. (A ∨ B) ∧ (B ∨ C)

5. (A ∨ B∧C) ∧ ¬(B ∧ C)

6. ¬B ∧ (B ∨ A)

7. ¬B ∨ B∧A

8. A ∧ (B ∨ C) ∨ ¬B

9. (A ∨ ¬B) ∧ ¬(¬B ∨ A)

10. ¬(A ∧ B) ∧ ¬(A ∨ B)

11. (¬B ∨ A∧B) ∧ (A ∨ ¬A∧B)

12. ¬(¬A ∧ ¬B) ∧ (¬A ∨ ¬B)

13. (A ∨ B) ∧ (A ∨ ¬B) ∧ (¬A ∨ B) ∧ (¬A ∨ ¬B)

14. A∧B ∨ A∧¬B ∨ ¬A∧B ∨ ¬A∧¬B

15. ¬(A∧B ∨ A∧C ∨ B∧C)

16. (¬A ∨ B) ∧ (¬B ∨ C) ∧ (¬C ∨ A)

17. ¬[(A ∨ B ∨ C) ∧ (¬A ∨ ¬B ∨ ¬C)]


18. ¬[A∧¬B ∨ ¬A∧B] ∧ ¬[B∧¬C ∨ ¬B∧C]

19. A∧¬B ∨ ¬A∧B ∨A∧¬C ∨ ¬A∧C ∨B∧¬C ∨ ¬B∧C

20. ¬[¬A ∨ (¬B ∨ C)] ∨ (¬(¬A ∨B) ∨ ¬A ∨ C)

Small Sets of Laws

As noted above, the laws in the first two boxes (in 2.5 and 2.5.2) are sufficient for deriving, via the equivalence properties and substitution of equivalents, all the logical equivalences. Actually, we can reduce the number of required laws, because some of our listed laws are derivable from others. The right-hand side laws are derivable from the double negation law and the left-hand side laws; and vice versa. Here, for example, is a derivation of commutativity for disjunction from the double negation law, De Morgan’s law for conjunction, and commutativity for conjunction.

1. A ∨B

2. ¬¬A ∨ ¬¬B adding double negation,

3. ¬(¬A ∧ ¬B) pulling-out negation, via De Morgan for conjunction,

4. ¬(¬B ∧ ¬A) Commutative law for conjunction,

5. ¬¬B ∨ ¬¬A pushing-in negation, via De Morgan for conjunction.

6. B ∨A deleting double negations.

And here is a derivation of the law for a tautological conjunct, from the double negation law, De Morgan’s law for disjunction, and the law for contradictory disjunct.

1. A ∧ (B ∨ ¬B)

2. ¬¬A ∧ ¬¬(B ∨ ¬B) adding double negation,

3. ¬[¬A ∨ ¬(B ∨ ¬B)] pulling negation out, via De Morgan for disjunction,

4. ¬[¬A ∨ (¬B ∧ ¬¬B)] pushing the third negation in, via De Morgan for disjunction,

5. ¬¬A deleting a contradictory disjunct,

6. A deleting double negation.


The pattern should now be clear: the law is derived from the corresponding law in the other column, using the double negation law and De Morgan, first from right to left (double negations are added and negation is pulled out), then from left to right (negation is pushed in and double negations are dropped).

Homework 2.18 (i) Derive the associative law for conjunction from the associative law for disjunction, the double negation law, and De Morgan’s law for disjunction. (ii) Derive the law for tautological disjunct from the law for contradictory conjunct, the double negation law, and De Morgan’s law for conjunction.

While the derivability of some laws from others is of mathematical interest, you need not restrict yourself to a small basis of laws. When you simplify, or when you prove equivalences, you can freely use all the laws, as well as any equivalence that is obvious, or has been previously established.

2.5.3 Duality

We have noted already the duality phenomenon. If we toggle ∧ and ∨ in any of our laws we get the dual law. It is not difficult to deduce from this that if an equivalence, stated in terms of ¬, ∧ and ∨, is derivable from these laws, so is the dual equivalence, obtained by toggling throughout ∨ and ∧. In general, duality can be defined by considering a certain operation on truth-tables:

Say that a connective is the dual of another if its truth-table is obtained from that of the other by toggling everywhere T and F.

It now follows that ∨ is the dual of ∧ :

A    B    A ∧ B
T    T      T
T    F      F
F    T      F
F    F      F

A    B    A ∨ B
F    F      F
F    T      T
T    F      T
T    T      T

The order of rows in the second truth-table is different from the customary one; but this, as remarked in 2.1.3, makes no difference.

If one connective is the dual of another, then the second is the dual of the first; because by toggling T and F in the second truth-table, we get back our first table. Hence duality is symmetric. We therefore speak of dual connectives, or of a dual pair. The dual of negation is, again, negation:


A    ¬A
T     F
F     T

A    ¬A
F     T
T     F

Duality of Sentential Expressions: The dual of a given sentential expression is the expression obtained from it by replacing every connective name by the name of its dual.

Again, if, given the dual expression, we replace every connective name with the name of its dual, we get back our original expression. Hence duality, as a relation between expressions, is symmetric.

Note: Before an expression is transformed to its dual, it should be fully parenthesized. Conventions for omitting parentheses cannot be relied upon, because they discriminate between ∧ and ∨. E.g., to form the dual of A ∧ B ∨ C, we insert parentheses: (A ∧ B) ∨ C, and then toggle the connectives: (A ∨ B) ∧ C. Without parentheses we would have gotten A ∨ B ∧ C; under the grouping conventions this would become A ∨ (B ∧ C), which is wrong.

Here are examples of dual expressions:

A ∧ (B ∨ C) A ∨ (B ∧ C)

¬A ∨ [¬(A ∧ B)]    ¬A ∧ [¬(A ∨ B)]

(A ∧ ¬B) ∨ (¬A ∧ B)    (A ∨ ¬B) ∧ (¬A ∨ B)

Now the following holds:

The truth-tables of dual expressions are obtained from each other by togglingeverywhere T and F.

It has a formal proof, which we shall not give here. But you can convince yourself of its truth by the following observation: we get the truth-table of an expression by working through its components, using at each stage the truth-table of the main connective. The toggling of truth-values transforms the truth-table of any connective into the table of its dual. Hence, toggling the truth-values throughout yields the truth-table of the expression in which every connective is replaced by its dual. The following pair of truth-tables illustrates what happens:


A    B    ¬A    ¬A ∨ B    ¬(¬A ∨ B)    A ∧ ¬(¬A ∨ B)
T    T     F       T           F              F
T    F     F       F           T              T
F    T     T       T           F              F
F    F     T       T           F              F

A    B    ¬A    ¬A ∧ B    ¬(¬A ∧ B)    A ∨ ¬(¬A ∧ B)
F    F     T       F           T              T
F    T     T       T           F              F
T    F     F       F           T              T
T    T     F       F           T              T

If two sentential expressions are equivalent, then their columns, in a truth-table with entries for both, are the same. If we toggle throughout T and F we get a truth-table for the dual expressions. Obviously, equal columns are transformed into equal columns. Therefore, we get the duality principle:

If two sentential expressions are equivalent, so are their duals.

The basic laws listed in the boxes have been arranged in such a way that the laws in each pair are duals: they are obtained from each other by replacing each expression by its dual.
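The toggling claim behind the duality principle can be tested mechanically. In the following Python sketch the tuple representation of formulas (variables as strings; ('not', f), ('and', f, g), ('or', f, g)) and the helper names dual and value are our own, introduced for illustration:

```python
from itertools import product

def dual(f):
    """Toggle ∧ and ∨ throughout the expression."""
    if isinstance(f, str):
        return f
    if f[0] == 'not':
        return ('not', dual(f[1]))
    return ('or' if f[0] == 'and' else 'and', dual(f[1]), dual(f[2]))

def value(f, env):
    """Evaluate f under an assignment of truth-values to variables."""
    if isinstance(f, str):
        return env[f]
    if f[0] == 'not':
        return not value(f[1], env)
    if f[0] == 'and':
        return value(f[1], env) and value(f[2], env)
    return value(f[1], env) or value(f[2], env)

# The toggling claim: the dual's value at an assignment is the negation
# of the original's value at the toggled assignment. We check it for
# A ∧ ¬(¬A ∨ B), the expression tabulated above.
f = ('and', 'A', ('not', ('or', ('not', 'A'), 'B')))
ok = all(value(dual(f), {'A': a, 'B': b}) ==
         (not value(f, {'A': not a, 'B': not b}))
         for a, b in product([True, False], repeat=2))
print(ok)   # True
```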

From any proven equivalence we can get, by the duality principle, a dual equivalence. For example, having established that

¬B ∧ (B ∨A) ≡ A ∧ ¬B ,

we can deduce by duality that

¬B ∨ (B ∧A) ≡ A ∨ ¬B .

Duality can be extended to the case of laws that are stated in terms of truth-values, or which employ ‘tautology’ and ‘contradiction’. The dual of T is F, the dual of F is T. The dual of ‘tautology’ is ‘contradiction’, the dual of ‘contradiction’ is ‘tautology’. The following, for example, are duals:

If A is a tautology, then A ∧B ≡ B .

If A is a contradiction, then A ∨B ≡ B .

Homework 2.19 Write down the duals of 1., 5., 16., and 20. of Homework 2.17 and simplify them.


Note: As an algebraic structure, classical logic displays a striking symmetry in which T and F play symmetrical roles. In our use of language, however, truth and falsity cannot, in any sense, be considered symmetric. The essential difference between truth and falsity is brought about by the way language is used in our interactions with the extra-linguistic world and with each other. This subject, which has fundamental importance in the philosophy of language and the philosophy of logic, is beyond the scope of this book. Note only that the concept of logical implication (to be defined in chapter 4) reflects the asymmetry between truth and falsity. So does the fact that we find it advisable to have → (to be presently defined) as a connective, but we do not include its dual.

2.6 Conditional and Biconditional

Conditional is a binary connective, whose name is ‘→’. If we include it then, for every two sentences A and B, there is a sentence called the conditional of A and B, which we write as:

A→ B

A and B are called, respectively, the antecedent and the consequent of A → B. ‘Conditional’, like the names of the other connectives, is used ambiguously: for the connective (i.e., the operation) as well as for the resulting sentence.

Note: Do not confuse the logical antecedent (the one just defined) with the grammatical antecedent of a pronoun (the word or phrase a pronoun refers to).

The truth-table of A→ B is:

A    B    A → B
T    T      T
T    F      F
F    T      T
F    F      T

Conditional corresponds to the ‘If ... then ’ formation of English. The sentence

(1) If Jack is at home then Jill is at home

can be recast in sentential logic as the conditional A → B, where A represents ‘Jack is at home’ and B represents ‘Jill is at home’. Recasting (1) in this way means that we consider it false if Jack is at home and Jill is not at home, and we consider it true in all other cases, in particular, if Jack is not at home. The problems arising from this interpretation of ‘If ... then ’ are discussed in chapter 3, where the relations between natural language and sentential logic are looked into. In sentential logic → is just another connective having the above truth-table.


Obviously, the column of A→ B is the same as the column for ¬A ∨B. Hence we have:

A→ B ≡ ¬A ∨B

Consequently, → is expressible in terms of ¬ and ∨ (and therefore also in terms of ¬ and ∧). We could do without → in the sentential calculus; but there are good reasons for including it, which have to do with the role of → in the context of logical implications and formal deductions.

On the other hand, ∨ (hence also ∧) is expressible in terms of ¬ and →:

A ∨B ≡ ¬A→ B

Thus, ¬ together with any one of the binary connectives ∧, ∨, → suffices to express the other two binary connectives.
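The case of ¬ and → can be checked by brute force over the four assignments; the function name in this sketch is our own:

```python
def implies(a, b):
    return (not a) or b          # the truth-table of →

for a in (True, False):
    for b in (True, False):
        assert (a or b) == implies(not a, b)          # A ∨ B ≡ ¬A → B
        assert (a and b) == (not implies(a, not b))   # A ∧ B ≡ ¬(A → ¬B)
print('∨ and ∧ recovered from ¬ and →')
```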

Note: Sometimes conditional goes under the name ‘implication’, or ‘material implication’. The term ‘implication’ will be used for a different purpose. You should take care not to confuse the two.

Grouping Convention for →: The convention is that ‘→’ binds more weakly than any of the previous connective names. This means that in restoring parentheses we first determine the scopes of negations to be the smallest scopes consistent with the existent grouping, then the scopes of conjunctions, then those of disjunctions, and then the scopes of the conditionals. For example, the grouping in:

¬A ∧ B ∨ B → ¬C ∨ D

is:

[((¬A) ∧B) ∨B]→ [(¬C) ∨D]

Conditional does not have the associativity property enjoyed by conjunction and disjunction:

(A→ B)→ C and A→ (B → C)

are, in general, not equivalent. The two have different values when A, B, C all get F. It also lacks commutativity:

A→ B and B → A

have different values, when A and B get different values.

Consequently, when conditionals are repeated, grouping and order are extremely important. We cannot omit parentheses as we have done in the case of long conjunctions and disjunctions.
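Both failures can be pinned down by a single assignment each; implies is our own helper standing for →:

```python
def implies(a, b):
    return (not a) or b

a = b = c = False                     # the text's counterexample: all three get F
print(implies(implies(a, b), c))      # False: (A→B)→C
print(implies(a, implies(b, c)))      # True:  A→(B→C)

print(implies(True, False), implies(False, True))   # False True: A→B vs B→A
```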


Note: If we toggle T and F throughout the truth-table of conditional, we get the truth-table of

¬A ∧ B

Hence, if we were to introduce a connective dual to →, then the “dual-conditional” of A and B would be logically equivalent to ¬A ∧ B. This can also be seen by rewriting the conditional in the logically equivalent form ¬A ∨ B and forming the dual of that.

None of the customary systems of logic has a “dual-conditional” as a primitive connective. To form the dual of an expression involving conditionals, we should therefore replace every component of the form A → B by ¬A ∧ B.

Homework

2.20 Consider the expression

A → B → C → D

How many possible different sentences can we obtain from it by inserting parentheses?

Find whether any two of the sentential expressions obtained in this way are logically equivalent.

2.21 In certain cases conditional can be distributed over conjunctions and disjunctions. For example,

A→ (B ∧ C) ≡ (A→ B) ∧ (A→ C)

But sometimes the “distributing” involves a change in the other connective (the conjunction or the disjunction over which conditional is distributed). Find the “distributive laws” for the following cases.

• A → (B ∨ C)

• (A ∨ B) → C

• (A ∧B)→ C

2.22 Using the “distributive laws” of Homework 2.21, push → all the way in, in the following sentences:

1. (A ∨ B) → (C ∨ D)

2. (A ∨ B) → (C ∧ D)

3. (A ∧ B) → (C ∨ D)

4. (A ∧ B) → (C ∧ D)

Note: Using conditional, we can form the tautology A → A, which is perhaps simpler than our previous standard tautology A ∨ ¬A.


Biconditional

We conclude our list of connectives with biconditional. It is a binary connective whose name is ‘↔’. For every two sentences A and B, there exists a sentence that is their biconditional:

A↔ B

The interpretation of the biconditional is best conveyed by expressing it as a conjunction of two conditionals:

A↔ B ≡ (A→ B) ∧ (B → A) .

In terms of truth-values this means that the value of A ↔ B is T if A and B have the same truth-value; it is F if they have different values.

Since conditional represents, in a way, ‘If ... then ’, biconditional corresponds to:

If ... then , and if then ... , that is: ... iff .

For example,

(2) Jill is at home if and only if Jack is at home

can be formalized as a biconditional.

More than the other connectives, ↔ is dispensable as a primitive. Replacing ‘A ↔ B’ everywhere by ‘(A → B) ∧ (B → A)’ may cause some, but usually minor, inconveniences.

From the truth-table of biconditional we can immediately see that it has the commutativity property:

A↔ B ≡ B ↔ A

Also the following useful equivalences are easily verified, either by truth-tables, or by algebraic manipulations (after converting the biconditional into a conjunction of two conditionals and expressing these in terms of the previous connectives).

A↔ B ≡ (A∧B) ∨ (¬A∧¬B)

¬(A↔ B) ≡ A↔ ¬B ≡ ¬A↔ B

¬(A↔ B) ≡ (A∧¬B) ∨ (¬A∧B) ≡ (A ∨ B) ∧ (¬A ∨ ¬B)

Note that the right-hand side of the first equivalence can be understood as saying: “Either both A and B are true, or both are false”. Similarly, in the third row, the second sentence can be understood as saying that A and B have different truth-values; and the third sentence as saying that one of A and B is true, and one is false.


Note that ¬(A↔ B) expresses exclusive ‘or’.

Note: If we toggle T and F in the truth-table of A ↔ B we get the truth-table of ¬(A ↔ B); hence ¬(A ↔ B) can serve as the dual of A ↔ B. This can also be inferred from the fact that the rightmost sentences in the first and third rows of the above-given equivalences are duals of each other.

In addition to commutativity, biconditional has the associativity property:

(A↔ B)↔ C ≡ A↔ (B ↔ C)

This can be established either by truth-tables, or by algebraic manipulation using the equivalences given above. A short truth-value argument, which also gives some insight into the nature of repeated biconditionals, proceeds by first proving the following claim.

(A↔ B)↔ C gets T, under a given assignment of truth-values to the sentential variables A, B, C, iff F is assigned an even number of times (i.e., to 2 of the variables, or to none of them).

Here is the proof.

Case (i): C gets T. Then (A↔ B)↔ C gets T iff A and B get the same truth-value. If both get T, the number of assigned F’s is 0; and if both get F, this number is 2.

Case (ii): C gets F. Then (A↔ B)↔ C gets T iff A and B get different values; in this case the number of assigned F’s is 2 (one to C, one to A or to B).

It is easily seen that there are exactly four possibilities of assigning an even number of F’s. These are exactly the possibilities covered in (i) and (ii). Hence, (A↔ B)↔ C gets T just when the number of assigned F’s is even.

This also shows that (B ↔ C)↔ A gets T iff F is assigned an even number of times. It follows that:

(A↔ B)↔ C ≡ (B ↔ C)↔ A

Since (B ↔ C)↔ A ≡ A↔ (B ↔ C), we get the desired result.
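Both the parity claim and the associativity that follows from it can be confirmed by brute force over all eight assignments. A small sketch (the helper name `iff` is ours):

```python
from itertools import product

def iff(a, b):
    """Biconditional: True exactly when both arguments have the same truth-value."""
    return a == b

for a, b, c in product([True, False], repeat=3):
    even_fs = [a, b, c].count(False) % 2 == 0
    # The claim: (A <-> B) <-> C gets T iff F is assigned an even number of times.
    assert iff(iff(a, b), c) == even_fs
    # Hence both groupings agree, which is associativity.
    assert iff(iff(a, b), c) == iff(a, iff(b, c))
```

Since each grouping computes the same parity of F’s, the two sides must agree on every row.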

Consequently, in expressions in which ‘↔’ is the only connective name, changes in the order and grouping yield logically equivalent expressions (as is the case with ‘∧’, or with ‘∨’). We can, therefore, ignore parentheses, e.g.,

A↔ B ↔ C ↔ D↔ E

If in such a repeated biconditional a sentential variable, ‘A’, occurs more than once, we can, by changes of order and grouping, rewrite the biconditional as:

(A↔ A)↔ D

where D is the rest of the repeated biconditional. Since A↔ A is a tautology, it easily follows that

(A↔ A)↔ D ≡ D

We can therefore drop any pair of two occurrences of the same sentential variable. If every sentential variable occurs an even number of times, then the repeated biconditional is a tautology. If not, then by dropping all repeating pairs, we are left with a repeated biconditional in which every sentential variable occurs only once. Such a sentence is non-tautological (cf. Homework 2.23).

Homework 2.23 Show that a sentential expression constructed using only ‘↔’ is tautological iff each sentential variable occurs in it an even number of times. (Most of the proof has already been done; you have to state it in full, supplying the last missing step.)

Using the type of argument just given, one can prove the following generalization of the claim used in the proof of associativity:

If a sentential expression is built only from ‘↔’ and sentential variables, and each variable occurs only once, then the expression gets T iff the number of variables that get F is even.

Unlike our previous binary connectives, ↔ with negation is not sufficient for expressing the other connectives. It can be proved that, for any sentential expression built from two sentential variables using only ‘¬’ and ‘↔’, the number of T’s in its truth-table column is even (either 0, or 2, or 4). Therefore it cannot be equivalent, say, to ‘A ∧ B’, which has an odd number of T’s in its column (namely, 1).

Biconditionals are handy for characterizing logical equivalence in terms of logical truth:

A ≡ B iff A↔ B is a logical truth.
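This characterization can be put to computational use: to decide whether A ≡ B, check that A ↔ B gets T under every assignment. A sketch, with helper names of our own choosing:

```python
from itertools import product

def is_tautology(form, nvars):
    """A form is a logical truth iff it gets T under every assignment."""
    return all(form(*vals) for vals in product([True, False], repeat=nvars))

# De Morgan as an example: ~(A & B) === ~A v ~B,
# since the biconditional of the two sides is a tautology.
assert is_tautology(lambda a, b: (not (a and b)) == ((not a) or (not b)), 2)
# By contrast, A & B is not equivalent to A v B:
assert not is_tautology(lambda a, b: (a and b) == (a or b), 2)
```

The `==` on the two sides plays the role of ‘↔’, so the test is exactly “A ↔ B is a logical truth”.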

Grouping: The convention is that ‘↔’ has weaker binding power than the other connective names. Hence

¬A ∨ B∧C → B ↔ A→ B ∨ C

should be read as:

[(((¬A) ∨ (B∧C))→ B)]↔ [A→ (B ∨ C)]

Simplifying Expressions Containing Conditionals and Biconditionals

Expressions containing conditionals and biconditionals can be transformed into equivalent ones involving only ¬, ∧, and ∨, to which, in turn, we can apply our previous simplification techniques. Often, however, we can get simpler forms by retaining some conditionals, or biconditionals, in the final outcome. For example,

A→ (B → C)

is logically equivalent to:

¬A ∨ ¬B ∨ C

but also to

A∧B → C

which has a clearer intuitive meaning. This generalizes to an arbitrary number of repeated conditionals grouped to the right:

A1 → (A2 → (. . .→ (An−1 → An) . . .)) ≡ (A1∧A2∧. . .∧An−1)→ An
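For any fixed n this equivalence can be checked exhaustively. Here is a sketch for n = 4 (the helper name `implies` is ours, standing for the material conditional):

```python
from itertools import product

def implies(a, b):
    """Material conditional: A -> B is F only when A is T and B is F."""
    return (not a) or b

# A1 -> (A2 -> (A3 -> A4))  ===  (A1 & A2 & A3) -> A4, over all 16 assignments.
for a1, a2, a3, a4 in product([True, False], repeat=4):
    nested = implies(a1, implies(a2, implies(a3, a4)))
    exported = implies(a1 and a2 and a3, a4)
    assert nested == exported
```

Both sides are F exactly when all of A1, A2, A3 are T and A4 is F, which is why the equivalence holds for every n.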

Noteworthy equivalences concerning → are:

¬A→ ¬B ≡ B → A

¬(A→ B) ≡ A ∧ ¬B

The above-mentioned properties concerning ↔, as well as the equivalence:

¬(A↔ B) ≡ A↔ ¬B

can be used in simplifications of expressions containing ‘↔’ and ‘¬’ only. When other connectives are present, there are no straightforward simplification techniques (short of rewriting everything in terms of ¬, ∧, ∨ and applying the previous methods). But special cases may lend themselves to special treatments.

Homework 2.24

Simplify the following sentences. You may employ in the final version any of the connectives introduced here. Try to get sentences that are short or easy to grasp. You can use truth-value considerations, the algebraic methods of the previous section, or a mixture of both.

1. (A→ B)→ C

2. (A→ B)→ A

3. (A→ B)→ B

4. (A→ B) ∨ (B → A)

5. ¬(A→ B) ∨ ¬(B → A)

6. (A↔ B)↔ A

7. A∧B ∨ ¬A∧¬B

8. ¬A∧B ∨ A∧¬B

9. (A→ B)↔ (B → A)

10. (A→ B)→ (B → A)

11. (A ∨ B)↔ (A ∧ B)

12. [¬(A↔ B)]↔ ¬[(B ↔ C)↔ C]

13. A∧B ↔ A∧C

14. ¬(A→ B)↔ ¬A∧¬C

15. (A→ B) ∧ (B → C) ∧ (C → A)

16. (B → A) ∧ (C → A) ∧ (A→ (B ∨ C))

17. (A→ B) ∧ (A→ C) ∧ ((B ∧ C)→ A)

Chapter 3

Sentential Logic in Natural Language

3.0

In this chapter we try to uncover structures of sentential logic in a natural language–namely, English. We do so by recasting various English sentences in the form of sentential compounds, constructed by means of the connectives of classical two-valued logic. Certain sentences play in this representation the role of basic unanalysed units; they are represented by sentential variables; others are built from them by using connectives. For example,

(1) Jack went to the theater, but Jill did not

can be recast as:

(1′) A ∧ ¬B, where A and B are, respectively, the counterparts of:

(1.i) Jack went to the theater,

(1.ii) Jill went to the theater.

This implies the following: For every assignment of truth-values to A and B, if (1.i) and (1.ii) are given, respectively, the same values as A and B, then (1) and (1′) get the same truth-value.

We can think of A and B as translations in a formal language of (1.i) and (1.ii). At this point we don’t have to specify A and B any further, because the analysis stops there. Note that the sentential variables should represent self-contained declarative sentences, whose meaning–in the given context–is completely determined. In our example, this calls for spelling out ‘Jill did not’ as ‘Jill did not go to the theater’. B corresponds to ‘Jill went to the theater’, not to ‘Jill did’. By the same token, ‘Jack likes Jill and Jane likes her too’ should be formalized as a conjunction A ∧ B, where A and B correspond, respectively, to:

‘Jack likes Jill’, ‘Jane likes Jill’.

Again, the correlate of B is not ‘Jane likes her too’.

Translating (1′) back literally, we get the repetitive sentence ‘Jack went to the theater and Jill did not go to the theater’. The formalization is not intended as a guide to good style. Neither do we aim at linguistic analysis. While the latter is relevant to what we do, our job is not that of the linguist. The formalization is intended to reveal a certain logical structure: it shows how the truth-value of the sentence depends on the truth-values of its components.

Note on Terminology: Throughout this chapter we adapt the formal terminology of sentential logic to the context of natural language. Thus, ‘logical negation’ (or, for short, ‘negation’) refers to an operation by which an English sentence is transformed to another sentence, which always has the opposite truth-value. Similarly ‘logical conjunction’ (or, for short, ‘conjunction’) refers to an operation by which two sentences are combined into a sentence, whose truth-value is determined according to the rules governing ∧. Likewise for the other connectives. We shall also use these terms to refer to the sentences themselves. We therefore say that (1) is read as the conjunction of (1.i) with the negation of (1.ii).

Note that in the above formalization of (1) ‘but’ is interpreted as ‘and’, i.e., as a word that forms a logical conjunction. Actually, ‘but’, unlike the neutral ‘and’, indicates some contrast between Jack’s and Jill’s actions. But this is not sufficient for making a difference as far as truth-values are concerned: (1) is true just when both ‘Jack went to the theater’ and ‘Jill did not go to the theater’ are true, and is false in all other cases. This is enough for reading (1) as a logical conjunction.

Stylistic and esthetic elements, indications, suggestions, indirect approval or condemnation, and similar aspects that go beyond the mere statement of facts, are, as a rule, obliterated in the formalization.

Sentential logic is a very elementary part of logic, which cannot provide for an in-depth analysis. At the level of sentential logic, highly intricate sentences may appear as basic unanalysed units, because further analysis requires a richer logical apparatus. But this is the first necessary step to a deeper analysis.

Besides, sentential analysis is often instructive in itself. Consider, for example:

Harvey weighed it. A mediocre two and two-thirds pounds. One more negative datum to sabotage the notion that the brain’s size might account for the difference between ordinary and extraordinary ability–a notion that various 19th-century researchers have labored futilely to establish (claiming along the way to have demonstrated the superiority of men over women, white men over black men, Germans over Frenchmen).

It comes out as a long conjunction. But what are the conjuncts? In other words: to the truth of what is the author committed? The question forces one to focus on what is actually being asserted here. You will find that the list runs somewhat like this:

(i) Harvey weighed it [the ‘it’ referring to something mentioned earlier],

(ii) The weighing showed a reading of two and two thirds pounds,

(iii) Two and two thirds pounds is a mediocre weight for an object of the kind in question,

(iv) The fact established by the weighing is evidence against the notion that brain size might account for differences between ordinary and extraordinary mental ability,

(v) There have been previous pieces of evidence disconfirming that notion about the effect of brain size,

Etc.

Truth-Functional Compounds and The Truth-Value Test

A truth-functional compound of given English sentences is a sentence formed from them by applying connectives (English equivalents of the connective operations), such that the compound’s truth-value is determined completely by the values of the components.

For example, (1) is a truth-functional compound of ‘Jack went to the theater’ and ‘Jill went to the theater’; ‘Jill did not go to the theater’ is a truth-functional compound of ‘Jill went to the theater’.

On the other hand ‘Jack went to the theater because Jill told him to’ cannot be analysed as a truth-functional compound of ‘Jack went to the theater’ and ‘Jill told Jack to go to the theater’. (See 2.1.3, page 26, for a discussion of the case.)

Whether an English sentence can be analysed as a truth-functional compound of other sentences is not always clear. Grammatical form and the presence of words such as ‘not’, ‘and’, and others, may guide us; but in many cases this is not sufficient. It is well to keep in mind the following test.

Truth-Value Test: A sentence can be construed as a truth-functional compound, formed by applying a sentential connective, only if its truth-value is always determined according to the truth-table of that connective.

For example,

(2) John was unharmed by the accident

can be viewed as the negation of

(3) John was harmed by the accident.

If (3) gets T, (2) gets F; and if (3) gets F, (2) gets T. On the other hand, the construing of

(4) John is unhappy today

as the negation of:

(5) John is happy today

does not do so well on the truth-value test. If (5) gets T, then (4) gets F; but it is conceivable that both (4) and (5) get F. John’s state might be neither that of happiness nor that of unhappiness (say John is indifferent, or under sedation). More than mere absence of happiness, ‘unhappy’ implies some positive misery.

Note that the truth-value test constitutes a necessary but not a sufficient condition for interpreting a given sentence as a certain compound. Other considerations may enter. For example, of the two sentences:

‘Jill went to the theater’, ‘Jill did not go to the theater’,

the second is naturally construed as the negation of the first, not the first as the negation of the second; though both construals do equally well on the truth-value test. All we can say is that the first sentence is logically equivalent to the negation of the second. Here the grammatical form of negation decides the issue.

Like many undertakings pertaining to natural language, the success of recasting English sentences in logical form is a matter of degree. The question, “What is the logical form of a given sentence?” need not always have a clear-cut answer. An approximation that ignores certain aspects might do for certain purposes. Others may require a different, finer-grained analysis.

The Use of Hybrid Expressions

A convenient, common way of showing the recastings of sentences in sentential logic involves the application of formal notation to English. For example, (1) is to be analysed as:

(1∗) (Jack went to the theater) ∧¬(Jill went to the theater) .

We can read (1∗) as standing for: ‘The conjunction of ‘Jack went to the theater’ and the negation of ‘Jill went to the theater’ ’. (Just so ‘A ∧ ¬B’ can be read as ‘the conjunction of A and the negation of B’.) Strictly speaking, connective operations are defined for the formal setup. In English there is more than one construction that can represent a connective, e.g., there are several ways of expressing logical conjunction. But (1∗) is not meant as some unique English sentence. It is a convenient way of indicating a logical analysis of the sentence in question. It can refer to any of the sentences obtained by expressing the connectives in English.

3.1 Classical Sentential Connectives in English

3.1.1 Negation

An English construction that represents logical negation, and is also analogous to the notation ‘¬A’, is based on appending ‘it is not the case that’ as a prefix:

It is not the case that Jack likes Jill

is the negation of ‘Jack likes Jill’. Much more common is the attaching of ‘not’ to a verb inside the sentence:

Jack does not like Jill.

Sometimes negation is expressed by using ‘un’; thus, (2) above can be construed as the negation of (3). But sometimes ‘un’ does not express logical negation but something stronger, as is evidenced by (4). There are no formal rules for determining whether a certain use of ‘un’ yields logical negation. You will have to rely on your understanding of English.

Another prefix that implies negation is ‘dis’. But usually this is stronger than mere logical negation, and also stronger than the corresponding ‘un’-construct. Compare for example:

(6) It is not the case that Jack is respectful to his boss.

(6′) Jack is unrespectful to his boss.

(6′′) Jack is disrespectful to his boss.

The first is the logical negation of ‘Jack is respectful to his boss’, but the last means that he is positively impertinent. The second seems to lie in between.

Intuitions vary. Some may construe ‘unrespectful’ as sufficiently stronger than mere negation, so as to warrant an assignment of a different truth-value. Others might consider (6′) as a more emphatic version of (6), but with the same truth-value.

Sometimes, even a mere ‘not’ can indicate something stronger than logical negation. Consider for example:

(7) Jill did not like the play,

which seems to imply actual dislike, rather than mere absence of liking. Whether (7) actually says this, or only suggests it, is one of those questions to which a clear-cut answer is not forthcoming. We shall return to the problem in the next section.

It is not among the goals of this course to settle questions of English usage, especially when these involve fine distinctions and when there are no unanimous answers. But you should be aware of the existence of the problem and of its bearing on the logical analysis.

A noteworthy aspect of colloquial usage is the use of repeated negation for emphasis:

I haven’t told no lie.

The speaker alleges that he has not told any lie. The formal reading, whereby the two negations cancel each other, is misleading; it would make the speaker assert that he has told a lie. A more extreme example was proposed by Russell. A charwoman who is unjustly accused of stealing replies indignantly: ‘I ain’t never done no harm to no one!’. Were we to follow blindly the formal rules, we would interpret her as claiming to have, at some time, done harm to every human being. (Can you see how this would follow? It also involves rules concerning quantifiers, expressed here by ‘never’ and ‘no one’, but it should not be very difficult to guess.)

3.1.2 Conjunction

By the truth-value test, a sentence cannot count as a conjunction of two (or more) sentences, unless it is true when both (or all) of them are true and false in every other case. In asserting a conjunction the speaker is committed to the truth of all the conjuncts, and to nothing more.

(‘Conjunction’ as used here applies to sentences and should be clearly distinguished from the term of grammar, which refers to a combining word.)

The standard word that marks conjunction is ‘and’. But conjunction is also obtained by mere juxtaposition, with the appropriate punctuation. Each of the following

Oswald shot Kennedy and Ruby shot Oswald,

Oswald shot Kennedy; Ruby shot Oswald,

Oswald shot Kennedy, Ruby shot Oswald1

can be read as:

(Oswald shot Kennedy) ∧ (Ruby shot Oswald)

Both ‘and’ and juxtaposition can serve in repeated conjunction:

Some are born great, some achieve greatness, and some have greatness thrust upon them.

A sequence of sentences, each ending in a full stop, has the effect of a conjunction–inasmuch as the writer is committed to the truth of all the sentences in the sequence;2 e.g.,

Oswald shot Kennedy. Ruby shot Oswald.

As remarked already, each conjunct should be an independent sentence. Pronouns that derive their meaning from other conjuncts should be replaced by independent particles. Thus,

Jack used to smoke a lot and so did his wife

becomes

(Jack used to smoke a lot) ∧ (Jack’s wife used to smoke a lot) .

Besides ‘and’, the following words are used to form compounds of two sentences, whose truth requires that both components be true:

‘but’, ‘yet’, ‘moreover’, ‘however’, ‘although’, ‘nonetheless’, ‘since’, ‘therefore’, ‘because’, and others.

1 Taking here stylistic license, as is done many times, to separate the sentences by a comma.

2 We may not want to declare a whole text false on the force of a single false sentence. This, however, does not mean that the usual truth-table does not apply. It only shows that the truth-value of the resulting conjunction is not adequate for judging texts containing many sentences.

But not in all cases is the truth of both components sufficient for the truth of the compound. When it is not, we do not get a logical conjunction, but a compound that is not truth-functional. All these words have side effects that the neutral ‘and’ does not. The side effects of ‘but’, ‘yet’, ‘moreover’, ‘however’, ‘although’, and ‘nonetheless’ do not make for a truth-value difference. Such compounds can be construed as logical conjunctions, though some aspects of meaning are thereby lost.

On the other hand, we have noted that the effect of ‘because’ cannot be ignored in the formalization and that ‘because’-compounds are not, as a rule, truth-functional. ‘Since’, ‘therefore’ and their like (e.g., ‘consequently’) appear to be intermediate cases. Consider for example,

(8) Jill said that the play was good, therefore Jack went to see it.

If (8) is construed as:

(Jill said that the play was good) ∧ (Jack went to see the play)

then (8) comes out true if Jill said that the play was good and Jack went to see it, even if Jack’s going had nothing to do with Jill’s saying. Under such circumstances (8) is undoubtedly misleading. But a statement can be misleading and yet formally true; something false can be suggested, without being explicitly stated. To what extent does (8) explicitly say that Jack’s going was caused by Jill’s saying? On this, intuitions may differ. We shall reconsider the question in 4.5.2.

‘And’

When used to combine sentences, ‘and’ does not have the side effects that other combining terms (‘but’, ‘yet’, etc.) have. But it can indicate a temporal order or a causal relation that goes beyond logical conjunction:

(9) Jack’s wife told him to stop smoking, and so he did.

Evidently, (9) indicates that Jack stopped smoking after his wife told him to. It suggests, moreover, a causal relation between the two facts. Whether such indications or suggestions should be taken as part of what is actually said is, again, not clear-cut.

Homework 3.1 Give and discuss at least three examples, besides the above-given (9), of temporal order that is indicated by ‘and’, where the indications are of different strengths, from mere suggestion to almost explicit.

‘And’ is used, more commonly, not to join sentences, but to join nouns, or verbs, or adjectives, or adverbs, or phrases of these types. Often, the resulting constructs amount to logical conjunctions:

(10) Jack and Jill took driving lessons

can be analysed as

(10′) (Jack took driving lessons) ∧ (Jill took driving lessons)

And in a similar vein,

(11) Jack is short and blue-eyed

can be recast as:

(11′) (Jack is short) ∧ (Jack is blue-eyed)

Yet this is not always the case. Consider:

(12) Jack and Jill went to the party.

Does this amount to: ‘Jack went to the party and Jill went to the party’? Something more is implied, namely that they went together. Is the togetherness part of what is explicitly stated? In other cases this is certainly so:

(13) Jack and Jill painted this picture

implies cooperation between Jack and Jill. That each painted this picture separately does not make sense (unless “painting this picture” is understood in some very unusual way). Or consider:

(14) Tom, Bill and Harry elected Helen as the group’s representative,

which, obviously, does not reduce to a conjunction of ‘Tom elected Helen ...’, ‘Bill elected Helen ...’, and ‘Harry elected Helen ...’. And, as a final illustration:

(15) l and l′ are parallel lines,

which certainly does not reduce to ‘l is a parallel line and l′ is a parallel line.’

The picture is now clear. As a combiner of names, ‘and’ (and juxtaposition) can function in two ways. When it functions distributively, the result amounts to a sentential conjunction, which attributes some action, or property, to each of the named objects. The action, or property, distributes over ‘and’:

X and Y did Z is equivalent to X did Z and Y did Z.

When it functions collectively, ‘and’ combines the given names into a name of a group (consisting of the named objects). The action, or property, is attributed to the group as a whole; it cannot be reduced to the separate actions or the separate properties of group members. This is the non-distributive, or collective ‘and’:

{X and Y} did Z .

Sometimes ‘and’ is clearly distributive–as in the cases of (10) and (11); sometimes it is clearly not–as in the cases of (13), (14), and (15); and sometimes it is ambiguous. In (12) ‘and’ can be understood either distributively or collectively. The collective reading of (12) appears to predominate. But the other reading cannot be ignored, as is shown by the fact that

(16) Jack and Jill went to the party, but they did not go there together

is neither inconsistent nor in any way strange. Given (12) by itself, we interpret ‘and’ collectively. But with the additional clause in (16), we switch to the distributive reading. The switch is done in order to avoid an interpretation under which the sentence is obviously absurd. It is an instance of what some philosophers have called a principle of charity:

In cases of ambiguity, other things being equal, interpret the speaker in a way that gives him the best benefit.

The distinction between distributive and collective ‘and’ arises also when it combines verb phrases. Thus,

Jack studied mathematics and played baseball

can be recast as a conjunction of ‘Jack studied mathematics’ and ‘Jack played baseball’; the ‘and’ is distributive. But often there is an implied temporal proximity, temporal order, or causal link. And if this is to be part of what the sentence says, then we cannot recast it as a simple conjunction. Compare:

(17) Jack cried and laughed versus (Jack cried) ∧ (Jack laughed) .

(18) Jill hit the ball and sent it spinning versus (Jill hit the ball) ∧ (Jill sent the ball spinning) .

In (17) the conjunction misses the implication that the crying and laughing were almost concurrent. In (18) the conjunction does not say that Jill’s hitting the ball was the cause of her sending it spinning.

When ‘and’ or juxtaposition combine adverbs, the combination is usually collective. Compare, for example,

(19) Jill ran fast and silently versus (Jill ran fast) ∧ (Jill ran silently) .

The conjunction says that at some time (in the past) Jill ran fast and at some time she ran silently. It does not say that ‘fast’ and ‘silently’ apply to the same run–which is what the left-hand side sentence says.

When adjectives are combined, the combination is distributive when the statement is in the present tense–as illustrated by (11) above (‘Jack is short and blue-eyed’). In the past tense there is often, as in the case of verbs, an implication of temporal proximity. Being informed that John was skinny and rich, one would understand that he was skinny and rich at about the same time; that he was skinny and poor in his youth, and twenty years later fat and rich, does not accord well with what we are informed.

(The problem does not arise in the present tense, because the temporal proximity is guaranteed by the tense: both conjuncts are in the present, hence both refer to the same time.)

Note that, in the present tense, when adjectives are combined, the collective and distributive readings come to the same thing; because the adjectives are applied to the same name (e.g., ‘Jack’ in (11)), one that denotes the same object in both conjuncts. This fails with respect to adverbs because we have no common peg to hang our adverbs on.

Sometimes ‘and’ is used distributively but the distribution calls for a certain adjustment. We cannot distribute the ‘and’ in

Jim and John went with their families to the zoo

so as to produce: ‘Jim went with their families to the zoo and John went with their families to the zoo’. Nonetheless the ‘and’ here is distributive and its correct distribution yields:

Jim went with his family to the zoo and John went with his family to the zoo.

Additional factors, involving the use of plural pronouns, are here at play. Be aware of such possibilities and do not apply the tests blindly. There are various other complications into which we shall not enter. A more comprehensive analysis, which is the work of a linguist, is beyond the scope of this book.

Homework 3.2 Express the following texts as sentential combinations of basic components. Get your basic components as small as possible. Remember that the basic components should be written as self-standing sentences. Indicate relevant ambiguities, as you find them, and formalize each of the readings.

1. Democracy is a form of government which may be rationally defended, not as being good, but as being less bad than any other.

2. The ear tends to be lazy, craves the familiar, and is shocked by the unexpected: the eye, on the other hand, tends to be impatient, craves the novel and is bored by repetition.

3. A sentimentalist is a man who sees an absurd value in everything and doesn’t know the market price of anything.

4. To know a little of anything gives neither satisfaction nor credit, but often brings disgrace or ridicule.

5. Knowledge is two-fold and consists not only in an affirmation of what is true, but in the negation of what is false.

6. It is the just doom of laziness and gluttony to be inactive without ease and drowsy without tranquillity.

7. To learn is a natural pleasure, not confined to philosophers, but common to all men.

8. It ain’t no sin if you crack a few laws now and then, just so long as you don’t break any.

9. Great eaters and great sleepers are incapable of anything else that is great.

10. The virtue of the camera is not the power it has to transform the photographer into an artist, but the impulse it gives him to keep on going.

11. The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd; indeed in view of the silliness of the majority of mankind, a widespread belief is more likely to be foolish than sensible.

12. In a just society men and women should have equal opportunity and be free to choose their vocations.

Here, as an example, is a solution for 1.

Let A and B represent sentences as follows.

A: Democracy is a form of government which may be rationally defended as being good.

B: Democracy is a form of government which may be rationally defended as being less bad than any other.

Then 1. is translated as: ¬A ∧B

3.1.3 Disjunction

Usually a disjunction is formed in English by using ‘or’ or ‘either... or ’:

(20) Jack will be home this evening or his wife will,

as well as,

(20′) Either Jack will be home this evening or his wife will,

can be recast as:

(20∗) (Jack will be home this evening) ∨ (Jack’s wife will be home this evening) .

Note that ‘either’, which marks the beginning of the left disjunct, serves (together with the comma) to show the intended grouping. Compare for example:

Jack will go to the theater, and either Jill will go to the movie or Jill will spend the evening with Jack,

Either Jack will go to the theater and Jill will go to the movie, or Jill will spend the evening with Jack.

They come out, respectively, as:

Jack will go to the theater ∧ (Jill will go to the movie ∨ Jill will spend the evening with Jack),

(Jack will go to the theater ∧ Jill will go to the movie) ∨ Jill will spend the evening with Jack.

‘Or’ (and ‘either...or ’) is often used to combine noun phrases, verb phrases, adjectivals, or adverbials. In this it resembles ‘and’. But the distributivity problem does not arise for ‘or’ as it arises for ‘and’; usually, we can distribute:

Jack or his wife will go to the party,

Jill will either clean her room or practice the violin,

are, respectively, equivalent to

Jack will go to the party or Jack’s wife will go to the party,

Jill will clean her room or Jill will practice the violin.

Note that the following are equivalent as well:

Jill ran fast or silently,

Jill ran fast or Jill ran silently.


The problem of the adverbs applying to the same run does not arise here–as it arises in (19) for ‘and’–because of disjunction’s truth-table: That sometime in the past Jill ran fast or silently is true just if sometime in the past Jill ran fast, or sometime in the past she ran silently. (The problem arises if ‘or’ is interpreted not as ∨, but exclusively; for, then, in the first sentence the exclusion affects one particular run, but in the second it affects all of Jill’s past running.)

Other failures of distributivity involve combinations of ‘or’ with ‘can’, or with verbs indicating possible choices. They will be discussed later (3.1.3, page 95).

Inclusive and Exclusive ‘Or’

We discussed already the inclusive and exclusive readings of ‘or’ (cf. 2.2.2, pages 34, 35). Recall that under the inclusive reading an ‘or’-sentence is analysed as a disjunction of sentential logic: it is true, if one of the disjuncts is true, or if both are. Under the exclusive reading, one, and only one, should be true.

Exclusive disjunctions are truth-functional as well, but they should be recast differently, namely in the form

(A ∨ B) ∧ ¬(A ∧ B), or in the equivalent form (A ∧ ¬B) ∨ (¬A ∧ B). Cf. 2.2.2.
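The two recastings of exclusive disjunction can be checked against each other by running through all four truth-value combinations. A quick Python sketch (the function names are ours, not the text’s):

```python
from itertools import product

def xor_form_1(a, b):
    return (a or b) and not (a and b)      # (A ∨ B) ∧ ¬(A ∧ B)

def xor_form_2(a, b):
    return (a and not b) or (not a and b)  # (A ∧ ¬B) ∨ (¬A ∧ B)

# The two forms agree on every row of the truth-table.
for a, b in product([True, False], repeat=2):
    assert xor_form_1(a, b) == xor_form_2(a, b)
print("the two recastings of exclusive 'or' are equivalent")
```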

In many cases ‘or’ seems exclusive where an inclusive interpretation will do as well; the question of the right reading may not have a clear-cut answer.

(21) Jack is now either in New York or in Toronto.

Since Jack cannot be at the same time both in New York and in Toronto, many jump to the conclusion that the ‘or’ is exclusive. But this does not follow. From (21) one will infer that (i) Jack is in one of the two cities and (ii) he is not in both. But (ii) follows from the general impossibility of being in different places at the same time. It need not be part of what (21) explicitly states. The ‘or’ can be inclusive. Just so, if somebody says that the sun is now rising, we will naturally infer that it is rising in the east, though it is not part of the statement.

Whether the speaker who asserts (21) intends the exclusion to be part of the meaning of ‘or’, or only an obvious non-logical corollary, can be a question that none may settle, including the speaker.

Examples of the last kind, where ‘or’ can be read inclusively but where exclusion is nonetheless implied, come easily to mind. The strength of the implied exclusion varies, however. In

Either Edith or Eunice will marry John


the exclusion, namely that the two will not both marry him, is implied by the prohibition on bigamy, which is considerably weaker than the impossibility of being in two places at the same time.

And in

(22) This evening I shall either do my homework or go to the movies,

the exclusion is only suggested (by the inconvenience of doing both), but is by no means necessary. The speaker can in fact add ‘or do both’:

(23) This evening I shall either do my homework, or go to the movies, or do both.

By choosing to add ‘or do both’–or words to this effect–the speaker allows for the possibility of an exclusive interpretation. (22) and (23) are equivalent under the inclusive interpretation; under that interpretation, the additional ‘or do both’ is redundant. But it is not redundant if ‘or’ is interpreted exclusively; its addition neutralizes the effect of the exclusive interpretation. The possibility of an exclusive interpretation of ‘or’ is not evidenced by (21), but by cases, like (23), where one finds it appropriate to add ‘or both’.

There are examples where the exclusive interpretation of ‘or’ is called for, e.g.,

Either you pay the fine or you go to prison.

And there are others where ‘or’ is obviously inclusive:

If you are either a good athlete or a first-rate student, you will be admitted,

which should be recast as: (A ∨ B) → C.

To sum up, while there is an exclusive sense of ‘or’, the inclusive sense–formalized by disjunction of sentential logic–is appropriate in more cases than appears at first sight. Recall, however, that the main reason for having ∨ rather than exclusive disjunction (our ∨x) as a connective is its algebraic and formal properties. It is also directly related (as we shall see in chapter 5) to the set-theoretic operation of union.

‘Or’ with ‘Can’

(24) You can have either coffee or tea.

If we distribute the ‘or’ we get:

(25) You can have coffee ∨ you can have tea.


Now (25) is not the equivalent of (24). Suppose you can have coffee, but you cannot have tea. Then (25) is true, because the first disjunct is, but (24) is false.

The problem with (24) is that it involves a hidden conditional. Spelled out in full, it comes out as something like:

(24′) If you ask for coffee or if you ask for tea, you’ll get what you ask for.

And the best way of expressing this is as a conjunction:

(24″) If you ask for coffee (and no tea) you’ll get coffee,

and

if you ask for tea (and no coffee) you’ll get tea.

The parentheses make explicit the assumption that one does not get both coffee and tea by asking for both. This corresponds to an exclusive reading of ‘or’ in (24′). If this assumption is not correct the parentheses should be omitted and the ‘or’ in (24′) should be read inclusively.

The equivalence of (24′) and (24″) is an instance of one of the following two equivalences. It is an instance of the first, or of the second, depending on whether the ‘or’ in (24′) is read exclusively or inclusively.

(A ∨x B) → C ≡ (A ∧ ¬B → C) ∧ (¬A ∧ B → C)

(A ∨ B) → C ≡ (A → C) ∧ (B → C)
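Both equivalences can be confirmed mechanically over all eight truth-value assignments to A, B, C. A Python sketch (helper names are ours):

```python
from itertools import product

def implies(p, q):
    """Material conditional: p → q."""
    return (not p) or q

for a, b, c in product([True, False], repeat=3):
    xor_ab = (a or b) and not (a and b)   # A ∨x B
    # (A ∨x B) → C  ≡  (A ∧ ¬B → C) ∧ (¬A ∧ B → C)
    assert implies(xor_ab, c) == (implies(a and not b, c) and implies(not a and b, c))
    # (A ∨ B) → C  ≡  (A → C) ∧ (B → C)
    assert implies(a or b, c) == (implies(a, c) and implies(b, c))
print("both equivalences hold in every row")
```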

Sometimes the following sentence is used as an equivalent of (24):

(24.1) You can have coffee or you can have tea.

In this case the ‘or’ in (24.1) does not stand for disjunction, either exclusive or inclusive.

The phenomenon just exemplified is general; it takes place when ‘can’ is used to express choice or possibility:

‘You can choose either to marry her or not to see her again’,

‘ ‘Bank’ can mean either a river-bank or a financial institution’,

and many similar cases.

Homework

In the following homework A, B, C and D represent, respectively, the sentences:

‘Ann is married’, ‘Barbara is married’, ‘Claire is married’, ‘Dorothy is married’.

Sentences whose truth-values depend only on which of the women are married and which are not can be naturally formalized, using ¬, ∧ and ∨. For example,


‘Exactly one of Barbara and Claire is married’

comes out as: B ∧ ¬C ∨ ¬B ∧ C
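This formalization can be checked by brute force: over the four ways Barbara and Claire can be married or not, the formula should be true exactly when one of the two is married. A small Python check (the function name is ours):

```python
from itertools import product

def exactly_one_formula(b, c):
    return (b and not c) or (not b and c)   # B ∧ ¬C ∨ ¬B ∧ C

# Compare the formula with a direct count over all four cases.
for b, c in product([True, False], repeat=2):
    assert exactly_one_formula(b, c) == ((int(b) + int(c)) == 1)
print("formula captures 'exactly one of Barbara and Claire is married'")
```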

3.3 Formalize the following as sentential compounds of A, B, C, D, using ¬, ∧ and ∨. Try to get short sentences. (‘The four women’ refers to Ann, Barbara, Claire and Dorothy.)

1. At least one of the four women is married.

2. Exactly one of the four women is married.

3. At least two of Ann, Barbara and Claire are unmarried.

4. If one of the four women is married, all of them are.

5. If one of the four women is unmarried, all of them are.

6. At least one of Ann and Dorothy is married and at least one of Barbara and Claire is unmarried.

7. Ann and Claire have the same marital status.

8. Either one or three among Barbara, Claire and Dorothy are married.

3.4 Let A, B, C be as above. Using the simplified form of the sentences of Homework 2.17 (in 2.5.2), translate them into English. Use good stylistic phrasings. You do not have to reflect logical form, but the English sentences should always have the same truth-value as their formal counterparts. ‘Or’ is to be understood inclusively. You can use ‘the three women’ to refer to Ann, Barbara and Claire.

3.1.4 Conditional and Biconditional

The main English constructions that correspond to the conditional of sentential logic are:

‘If ..., then ___’, ‘If ..., ___’, and ‘___, if ...’,

where ‘...’ corresponds to the antecedent and ‘___’ to the consequent. Consider for example:

(26) If John works hard he will pass the logic examination,

or the equivalent

(26′) John will pass the logic examination if he works hard.


If John works hard but does not pass the logic examination, (26) is false; and if he works hard and passes the examination, (26) is true. So far the intuition is clear. Somewhat less clear is the case of a false antecedent, i.e., if John doesn’t work hard. Note that if we are to assign it a truth-value, it must be T (as in the truth-table of →). The value cannot be F, for the point of making an ‘if ..., then ___’ statement is to avoid commitment to the truth of the antecedent. (26) cannot be understood to imply that John will work hard.

Some may want to say that when the antecedent is false no truth-value should be assigned. It is as if no statement is made. On this view, the very status of having a truth-value depends on the antecedent’s truth. This interpretation complicates matters considerably and puts (26) beyond the scope of classical logic. Whatever its merits, it is by no means compelling. Quite reasonably we can regard (26), when the antecedent is false, as true. Just as we can judge a father, who had told his daughter:

(27) If it doesn’t rain tomorrow, we shall go to the zoo,

to have fulfilled his obligation if it rained and they didn’t go to the zoo. He has fulfilled it in an easy, disappointing way, but fulfilled it he has. By the same token, if the antecedent of (26) is false, the sentence is true; true by default, but true nonetheless.

‘Material conditional’ (or, in older terminologies, ‘material implication’) is sometimes used to denote our present conditional; ‘material’ indicates that the truth-value depends only on the truth-values of the antecedent and the consequent, not on any internal or causal connection.

The construal of ‘if’-statements as material conditionals can lead to oddities that conflict with ordinary intuition. The following statements turn out to be true, the first–because the antecedent is false, the second–because the consequent is true.

(28.i) If pigs have wings, the moon is a chunk of cheese,

(28.ii) If two and two make four, then pigs don’t have wings.

There is more than one reason for the oddity of statements like (28.i) and (28.ii). First, we expect the speaker to be as informative as he or she can. There is no point in asserting a conditional if either the antecedent is known to be false, or the consequent is known to be true. In the first case one is expected to assert the negation of the antecedent; in the second case one is expected to assert the consequent.

Furthermore, we expect some connection between what the antecedent and consequent say, and this is totally missing in (28.i) and (28.ii). The connection need not be causal, e.g.,

(29) If five times four is twenty, then five times eight is forty.

(29) makes sense inasmuch as the consequent can be “naturally deduced” from the antecedent.


Often ‘if ..., then ___’ is employed in a sense that indicates a causal relation. For example, (27) says that if they will not be hindered by rain, the persons in question (referred to by ‘we’) will go to the zoo. But if we construe it as a material conditional and apply the familiar equivalence

¬A → B ≡ ¬B → A,

we can convert it into the logically equivalent:

(27′) If we don’t go to the zoo tomorrow, it will rain.

Without additional explanation (27′) looks bizarre; for it suggests that not going to the zoo has an effect on the weather.

Cases like those discussed above have been sometimes called “paradoxes of material implication [or material conditional]”. Actually, there is nothing paradoxical here. Material conditional is not intended to reflect aspects of ‘if’ that pertain to causal links, temporal order, or any connections of meanings, over and above the truth-value dependence.

Non-material conditionals can be expressed in richer systems of logic, which are designed to handle phenomena that are not truth-functional.

‘If’ and ‘Only If’, Sufficient versus Necessary Conditions

In ‘if’-statements that express conditionals the antecedent is marked by ‘if’. Recast as conditionals,

‘If ..., then ___’, ‘If ..., ___’, and ‘___, if ...’

come out as:

(30) (...) → (___).

Note that in the third expression above, the antecedent comes after the consequent. The antecedent is not marked by its place but by the prefix ‘if’.

Just as ‘if’ marks the antecedent, ‘only if’ marks the consequent. As far as truth-values are concerned, to say

‘..., only if ___’

is to say that ‘...’ is not true without ‘___’ being true; which simply means that if ‘...’ is true, so is ‘___’. Thus, it comes out as (30).


A condition X (fact, state of affairs, what a sentence states–we won’t belabor this here) is said to be sufficient for Y, just when the obtaining of X guarantees the obtaining of Y. That X is a sufficient condition for Y is often described by asserting:

‘If X then Y ’, or ‘Y , if X’,

or–to put it more accurately:

‘If ..., then ___’ or ‘___, if ...’,

where ‘...’ describes X and ‘___’ describes Y.

On the other hand, X is a necessary condition for Y if Y cannot take place without X. And this is often expressed by

Y, only if X,

A sufficient condition need not be necessary; e.g., dropping the glass is sufficient for breaking it (expressed by: ‘If the glass is dropped it will break’), but it is not necessary–the glass can be broken in other ways. Vice versa, a necessary condition need not be sufficient; e.g., good health is necessary for Jill’s happiness (expressed by: ‘Jill is happy only if she is in good health’), but it is not sufficient–other things are needed as well.

The confusing of necessary and sufficient conditions is quite common and results in fallacious thinking; for example, the affirmation-of-consequent fallacy whereby, assuming the truth of ‘If ..., then ___’, one fallaciously infers the truth of ‘...’ from that of ‘___’.
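The invalidity of affirming the consequent can be exhibited by a single truth-table row: a row where the conditional and its consequent are both true while the antecedent is false. A Python sketch of the counterexample (not from the text):

```python
def implies(p, q):
    """Material conditional: p → q."""
    return (not p) or q

# Counterexample row: A false, B true.
a, b = False, True
assert implies(a, b) and b   # both 'If A then B' and B get T ...
assert not a                 # ... yet the fallaciously inferred A gets F
print("affirming the consequent has a counterexample row")
```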

Although ‘if ..., ___’ and ‘..., only if ___’ come to the same when construed as conditionals, the move from the first to the second can result in statements that are rather odd. If we rewrite (27) (‘If it doesn’t rain tomorrow, we shall go to the zoo’) in the ‘only-if’ form we get:

It won’t rain tomorrow only if we go to the zoo,

which makes the going to the zoo a necessary condition for not raining, suggesting, even more than (27′), some mysterious influence on the weather. Underlying this are, again, the causal implications that an ‘only-if’ statement can have, which disappear in the formalization. By now the matter has been sufficiently clarified.

‘If’-Statements that Express Generality

When an ‘if’-phrase is constructed with an indefinite article and a common noun, the result is not a conditional but a generalization of one:

(31) If a woman wants to have an abortion, the state cannot prevent her.


To assert (31) is to assert something about all women. It cannot be formalized in sentential logic. We shall see that in first-order logic it is expressed by using variables and prefixing a universal quantifier before the conditional. It comes out like:

(31′) For all x, if x is a woman who wants to have an abortion, then the state cannot prevent x from having it.

By contrast, the following is expressible as a simple conditional, though it has the same grammatical form.

(32) If Jill wants to have an abortion, the state cannot prevent her.

The expression of generality through conditionals is very common in technical contexts, when general rules are stated using variables, or schematic symbols. For example, the transitivity of > is stated in the form:

If x > y and y > z, then x > z,

meaning: for all numbers x, y, z, if x > y etc. The quantifying expression ‘For all numbers x, y, z’ has been omitted, but the reader has no difficulty in supplying it.

Other Ways of Expressing Conditionals

Besides ‘if’, there are other English expressions that mark the antecedent of a conditional. Here are some:

‘provided that’, ‘assuming that’, ‘in case that’, and sometimes ‘when’.

For example, (27) can be rephrased as

... it does not rain tomorrow, we shall go to the zoo,

where ‘...’ can stand for each of: ‘Provided that’, ‘Assuming that’, ‘In case that’. We can also change the places of antecedent and consequent: ‘We shall go to the zoo tomorrow, provided that it does not rain’. As in the case of ‘if’, we can get an expression marking the consequent by adding ‘only’: ‘..., only in case that ___’.

As a rule, ‘when’ can be used to form conditionals involving generality. For example, you can replace in (31) ‘If’ by ‘When’; or consider:

(33) When there is a will there is a way. (I.e., in every case: if there is a will there is a way.)

(34) A conjunction is true when both conjuncts are. (I.e., for every conjunction, if both conjuncts are true so is the conjunction.)


This use of ‘when’ is possible when the temporal aspect is non-existent as in (34), or when it is not explicit, as in (33). (In the latter, the temporal aspect was neutralized by using ‘case’ in the paraphrase.) But when time is explicit, we cannot recast the sentence as a generalized conditional, at least not in a straightforward way (via common names). For example:

(35) Jack and Jill will marry when one of them gets a secure job.

The temporal proximity of the two events (getting a secure job and the marriage), which is explicitly stated in (35), disappears if (35) is formalized as a conditional.

(The formalization of (35) in first-order logic requires the introduction of time points. Alternatively, it can be carried out in temporal logic, designed expressly for handling such statements. In this logic truth-values are time-dependent and there are connectives for expressing constructions based on ‘when’, ‘after’, ‘until’, etc.)

Unless: The harmless ‘unless’, which causes no problem in actual usage, can cause confusion when it comes to formalizing. ‘___, unless ...’ means that if the condition expressed by ‘...’ is not satisfied, then ‘___’ is (or will be) true. In formal recasting it is:

¬(...) → (___).

It is a conditional whose antecedent is the negation of the clause that follows ‘unless’. That clause can also come first: ‘unless ..., ___’.

Since ¬A → B ≡ A ∨ B,

an ‘unless’-statement can be formalized as a simple disjunction:

(...) ∨ (___).

Reading ‘unless’ as ‘or’ is somewhat surprising. Analyzing the situation, one can see that ‘unless’ connotes (perhaps even more than ‘if’) some sort of causal connection. When we read it as ‘or’, this side of it disappears; hence, the surprise. We are also not used to regarding ‘unless’ as a disjunction. Whatever the reason, this is how ‘unless’ is interpreted as a truth-functional connective. A few examples will show that it is indeed the right way:

We shall go to the zoo tomorrow, unless it rains.

We shall go to the zoo tomorrow if it does not rain.

Either it will rain tomorrow, or we shall go to the zoo.

Unless you pass the exam, you will not qualify.

If you don’t pass the exam, you will not qualify.

You’ll pass the exam, or you will not qualify.
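The equivalence ¬A → B ≡ A ∨ B that licenses these paraphrases can be confirmed over all four truth-value rows. A Python sketch (the helper name is ours):

```python
from itertools import product

def implies(p, q):
    """Material conditional: p → q."""
    return (not p) or q

# 'B, unless A' recast as ¬A → B agrees with A ∨ B in every row.
for a, b in product([True, False], repeat=2):
    assert implies(not a, b) == (a or b)
print("the 'unless' conditional and the plain disjunction coincide")
```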


Biconditional

In natural language biconditionals are expressed by using ‘if and only if’:

Tomorrow will be the longest day of the year, if, and only if, today is June 20.

In this form, biconditionals can serve to assert that a certain condition is both necessary and sufficient for some other condition. Note that the “necessary”-part is expressed by ‘only if’, and it corresponds to the left-to-right direction of ‘↔’; the “sufficient”-part is expressed by ‘if’, and it corresponds to the right-to-left direction of ‘↔’.

Since a biconditional amounts to a conjunction of two conditionals, the construal of various ‘if-and-only-if’-statements as biconditionals inherits some of the problems of the conditional (material implication).

Other expressions that can be used to form biconditionals are:

just if, just in case, just when.

But biconditionals that are stated with ‘just when’ are usually general claims whose formalizations require quantifiers.

Homework 3.5 Express the following excerpts as sentential compounds. Get your basic components as small as possible. Note basic components that are equivalent to generalized conditionals and rewrite them so as to display the conditional part. For example:

The wise in heart will receive commandments, but a prating fool shall fall.

Answer: A ∧ B, where

A: ‘The wise in heart will receive commandments’,

Generalized conditional: ‘If x is wise in heart, then x will receive commandments.’

B: ‘A prating fool shall fall’,

Generalized conditional: ‘If x is a prating fool, then x will fall.’

For the sake of the exercise, you can treat an address to the reader as an address to a particular person, using ‘you’ as a proper name.

1. A leader is a dealer in hope.

2. Ignore what a man desires and you ignore the very source of his power.

3. Laws are like spider’s webs which, if anything small falls into them they ensnare it, but large things break through and escape.

4. If you command wisely, you’ll be obeyed cheerfully.

5. You’ll get well in the world if you are neither more nor less wise, neither better nor worse than your neighbors.


6. God created man and, finding him not sufficiently alone, gave him a companion to make him feel his solitude more keenly.

7. Women are as old as they feel–and men are old when they lose their feelings.

8. If there are obstacles, the shortest line between two points may be the crooked line.

9. There is time enough for everything in the course of the day if you do but one thing at once; but there is not time enough in the course of the day if you will do two things at a time.

10. No one can be good for long if goodness is not in demand.

11. Literary works cannot be taken over like factories, or literary forms of expression like industrial methods.

12. If the mind, which rules the body, ever forgets itself so far as to trample upon its slave, the slave is never generous enough to forgive the injury; but will rise and smite its oppressor.


Chapter 4

Logical Implications and Proofs

4.0

In this chapter we introduce implication (from many premises) and we present a method of proving valid implications of sentential logic (i.e., tautological implications). It is an adaptation of Gentzen’s calculus, which is easy to work with and which is guaranteed to produce either a proof, or a counterexample that shows that the given implication does not obtain in general–hence is not provable. As we shall see in chapter 9, the system extends naturally to first-order logic, where it is guaranteed to produce a proof of any given logical implication.

Returning, in the last section, to natural language, we try to represent implications that arise in English discourse as formal implications of our system. In this connection we discuss some well-known concepts in the philosophy of language, such as meaning postulates and implicature.

4.1 Logical Implication

As noted in the introduction, logic was considered historically the science of correct reasoning, which uncovers and systematizes valid forms of inference. Generally, an inference starts with certain assumptions called premises and ends with a conclusion. It is not required that the premises be true, but that they imply the conclusion; i.e., it should be impossible that the premises be true and the conclusion–false.

In general, implications are not grounded in pure logic. That Jack was at a certain hour in New York implies that he was not, shortly afterwards, in Toronto. This is not a logical implication. It rests on the practical impossibility of covering the New York - Toronto distance in too short a time. If the time is sufficiently short, the impossibility may be traced to a physical law. And in the extreme case, it becomes the impossibility of being at the same time in two different places. But even this is not something that rests on pure logic.

We shall not address at this point what comes under “pure logic”. As in the cases of logical equivalence and logical truth (cf. chapter 2), we can say that a sentence logically implies another if it is impossible that the first be true and the second false, by virtue of the logical elements of the two sentences. In sentential logic the only logical elements are the connectives. A logical implication that rests only on the meaning of the connectives is tautological. Here is the definition.

A tautologically implies B if there is no assignment of truth-values to the atomic sentences under which A gets T and B gets F.

As in the case of tautological equivalence (cf. 2.2.0), there is no need to go to the level of the atomic sentences. That a sentence tautologically implies another can be seen by displaying their relevant sentential structure. The definition entails the following.

A tautologically implies B if and only if they can be written as sentential expressions (with each sentential variable standing for the same sentence in both) such that there is no assignment to the sentential variables under which the expression for A gets T, and the expression for B gets F.

This means that, in a truth-table containing columns for both, there is no row in which A’s value is T and B’s value is F.
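The definition suggests a mechanical test: run through every truth-value assignment and look for a row in which the premise gets T and the conclusion F. A Python sketch, under the assumption that sentences are represented as Boolean functions of the truth-values of their sentential variables (this representation, and the function name, are ours):

```python
from itertools import product

def tautologically_implies(premise, conclusion, num_atoms):
    """True iff no assignment makes `premise` T and `conclusion` F."""
    for row in product([True, False], repeat=num_atoms):
        if premise(*row) and not conclusion(*row):
            return False   # counterexample row found
    return True

# A ∧ B tautologically implies A, but not vice versa.
print(tautologically_implies(lambda a, b: a and b, lambda a, b: a, 2))  # True
print(tautologically_implies(lambda a, b: a, lambda a, b: a and b, 2))  # False
```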

If A is a contradiction, then there is no truth-value assignment (to the atomic sentences) under which it gets T. Hence, for all B, there is no assignment in which A gets T and B gets F. Consequently a contradiction implies tautologically all other sentences. By a similar argument, every sentence implies tautologically a tautology. We shall return to this later.

Note that tautological implication is a certain type of logical implication. In sentential logic the two are the same. But in more expressive systems, in particular in first-order logic, there are logical implications that are not tautological.

Terminology and Notation:

• ‘Logical’ in the context of sentential logic means tautological. In the present chapter, the terms are interchangeable. For the sake of brevity we often use ‘implication’ for logical implication.

• ‘|=’ denotes logical implication, that is:

A |= B

means that A logically implies B. If A |= B, we say that B is a logical consequence of A.


• ‘|=’ is a shorthand for ‘logically implies’. Like ‘≡’ it belongs to our English discourse, not to the formal system. To say that A |= B is to claim that A logically implies B.

Note: Terms such as ‘implication’ and ‘equivalence’ are used mostly with respect to sentences, or sentential expressions, of our formal system. But we use them also with respect to our own statements. E.g., we can say that A ≡ A′ implies that ¬A ≡ ¬A′, and we speak about the implication

A |= A′ =⇒ ¬A′ |= ¬A.

And here, ‘implication’, which is denoted by ‘=⇒’, refers to our own statements. Similarly, we may speak of the equivalence

A ≡ B ⇐⇒ B ≡ A.

A similar ambiguity surrounds ‘consequence’. The intended meaning of these, and other two-level terms, should be clear from the context.

If two sentences are equivalent, then under all truth-value assignments (to the atomic components) they get the same truth-value. Hence they imply each other. Vice versa, if they imply each other, then there is no assignment under which one gets T and the other gets F; therefore they are equivalent. Hence, sentences are equivalent just when they imply each other:

(1) A ≡ B ⇐⇒ A |= B and B |= A.

(1) shows how logical equivalence can be defined in terms of logical implication. On the other hand, using conjunction, we can express implication in terms of equivalence:

(2) A |= B ⇐⇒ A ≡ A ∧ B.

The argument for (2) is easy:

Assume that A |= B, then (i) if A gets T, so does B and so does A ∧ B; and (ii) if A gets F, then A ∧ B gets F. Therefore A and A ∧ B always get the same value.

Vice versa, if A ≡ A ∧ B, it is impossible that A gets T and B gets F, for then A and A ∧ B get different values.
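The content of (2) can be illustrated on a concrete instance: take A = p ∧ q and B = p, for which A |= B clearly holds, and confirm that A and A ∧ B agree on every assignment. A Python sketch (the helpers and the chosen instance are ours):

```python
from itertools import product

def implies_all(f, g, n):
    """f tautologically implies g over n sentential variables."""
    return all((not f(*r)) or g(*r) for r in product([True, False], repeat=n))

def equivalent(f, g, n):
    """f and g get the same value under every assignment."""
    return all(f(*r) == g(*r) for r in product([True, False], repeat=n))

A = lambda p, q: p and q
B = lambda p, q: p
A_and_B = lambda p, q: A(p, q) and B(p, q)

# A |= B holds, and accordingly A ≡ A ∧ B:
assert implies_all(A, B, 2)
assert equivalent(A, A_and_B, 2)
print("(2) confirmed for A = p ∧ q, B = p")
```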

One can, nonetheless, argue that implication is the more basic notion. It corresponds directly to inferring. Moreover, logical equivalence is reducible to it without employing connectives, but not vice versa.1 As we shall presently see, the most basic notion is that of implication with one or more premises.

1The reason for treating, in this book, equivalence before implication is didactic. Its analogy with equality makes equivalence more accessible and enables one to use algebraic techniques.


Using the conditional, we can express logical implication in terms of logical truth.

(3) A |= B ⇐⇒ A → B is logically true.

Homework 4.1 Prove (3), via the same type of argument used in proving (2).

The following properties of implication are easily established.

Reflexivity: A |= A

Transitivity: If A |= B and B |= C, then A |= C.

Transitivity is of course intuitively implied by the very notion of implication. The detailed argument is trivial:2

Assuming that A |= B and B |= C, one has to show that there is no assignment under which A gets T and C gets F. So assume that A gets T. Then B must get T, because A |= B. But then, C must get T, because B |= C.

Logical implications are preserved when we substitute the sentences by logically equivalent ones:

(4) If A ≡ A′ and B ≡ B′, then A |= B iff A′ |= B′.

One can derive (4) immediately from the definitions, by noting that logical implication is defined in terms of possible truth-values and that logically equivalent sentences always have the same value.

((4) is also derivable from (1) via the transitivity of implication: If A ≡ A′, then, by (1), A′ |= A. Similarly, if B ≡ B′, then B |= B′. If also A |= B, we get:

A′ |= A, A |= B, B |= B′

Applying transitivity twice we get A′ |= B′. In the same way we derive A |= B from A′ |= B′.)

(4) implies that, in checking for logical implications, we are completely free to substitute sentences by logically equivalent ones. We can therefore use all the previous simplification techniques in order to reduce the problem to one that involves simpler sentences.

2It is trivial if we define implication by appealing to assignment to atomic sentences. If we want to bypass atomic sentences we have to show that the two implications A |= B and B |= C can be founded on sentential expressions in which the three sentences are generated from the same stock of basic components (see also footnote 4 in chapter 2, page 32). This can be done by using unique readability.


Every case of logical equivalence is, by (1), a case of two logical implications, from left to right and from right to left. But generally implications are one-way. Here are a few easy examples in which the reverse implication does not hold in general.

(i) A ∧ B |= A

(ii) A |= A ∨ B

(iii) B |= A → B

(iv) ¬A |= A → B

(v) A ∧ B |= A ↔ B

(vi) ¬A ∧ ¬B |= A ↔ B

That the reverse implications do not hold in general can be seen by assigning the sentential variables truth-values, under which the left-hand side gets T and the right-hand side gets F. We can interpret them as standing for atomic sentences that can have these values. For example, in the case of (i), let A get T and let B get F.
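All six examples can be checked at once by a truth-table search: each implication should hold in every row, and each converse should fail in some row. A Python sketch (the encoding of the sentences as Boolean functions is ours):

```python
from itertools import product

def implies_all(f, g):
    """f tautologically implies g (two sentential variables)."""
    return all((not f(a, b)) or g(a, b) for a, b in product([True, False], repeat=2))

cond = lambda a, b: (not a) or b          # A → B
bicond = lambda a, b: a == b              # A ↔ B

pairs = [
    (lambda a, b: a and b,             lambda a, b: a),       # (i)
    (lambda a, b: a,                   lambda a, b: a or b),  # (ii)
    (lambda a, b: b,                   cond),                 # (iii)
    (lambda a, b: not a,               cond),                 # (iv)
    (lambda a, b: a and b,             bicond),               # (v)
    (lambda a, b: (not a) and (not b), bicond),               # (vi)
]

for left, right in pairs:
    assert implies_all(left, right)       # the implication holds ...
    assert not implies_all(right, left)   # ... but its converse fails
print("all six implications are one-way")
```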

Note that we can also force this assignment by interpreting the variables as standing for tautologies or contradictions. For example let A = C → C, B = C ∧ ¬C.

In particular cases, the right-to-left implication holds as well. For example:

If B is logically true, e.g. (B = C → C) then A |= A ∧B.

If B is logically false, then A→ B |= ¬A.

Here, as an illustration, is the argument for the second statement.

Assume that B is logically false. If A → B gets T, then, since B gets F (being logically false), A must get F. Hence, ¬A gets T. There is, therefore, no assignment (to the atomic sentences) in which A → B gets T and ¬A doesn't.

Homework 4.2 Find, for each of (i) - (vi) above, whether the reverse implication holds for all B, in each of the following cases:

(1) A is logically true. (2) A is logically false.

Altogether you have to check 12 cases. Prove every positive answer by an argument of the type given above. Prove every negative answer by a suitable counterexample.


Note: We can define a notion of logical implication that applies to sentential expressions. This is completely analogous to the case of logical equivalence and logical truth (cf. 2.2). We shall discuss it in 4.3.1, in the more general context of implication from many premises.

4.2 Implications with Many Premises

4.2.0

Implications with several premises are a natural generalization of the one-premise case. The sentences A1, A2, . . . , An logically imply the sentence B, and B is a logical consequence of A1, . . . , An, if it is impossible, by virtue of the logical elements of the sentences, that all the sentences Ai are true and B is false. The notation is generalized accordingly:

A1, . . . , An |= B

The sentences A1, A2, . . . , An are referred to as premises and B as the conclusion.

The precise definition, in the case of sentential logic, is a straightforward generalization of the one-premise case. Here it is:

A1, . . . , An |= B, if there is no truth-value assignment to the atomic sentences under which all the Ai's get T and B gets F.
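The definition lends itself to a mechanical check: run through all truth-value assignments to the atomic sentences and look for a row in which every premise gets T and the conclusion gets F. Here is a small Python sketch of that check (not part of the text; the function and variable names are ours, and sentences are modeled as functions from assignments to truth-values):

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """A1,...,An |= B iff no assignment to the atoms makes
    all premises true while the conclusion is false."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

A = lambda v: v["A"]
B = lambda v: v["B"]
A_arrow_B = lambda v: (not v["A"]) or v["B"]   # the sentence A -> B

print(implies([A, A_arrow_B], B, ["A", "B"]))  # True:  A, A->B |= B
print(implies([B], A, ["A", "B"]))             # False: B does not imply A
```

The check is exponential in the number of atomic sentences, exactly as filling out a full truth-table is.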

Again, this entails a characterization in terms of sentential expressions, without appealing to atomic sentences:

A1, . . . , An |= B, iff all the premises and the conclusion can be written as sentential expressions, such that in a truth-table containing columns for all, there is no row in which all the Ai's get T and B gets F.

Notation: We refer to sequences such as A1, . . . , An as lists of sentences, and we use

‘Γ’, ‘∆’, ‘Γ′’, ‘∆′’, . . . etc.,

for denoting such lists. Thus, if Γ = A1, A2, . . . , An then

Γ |= B means that A1, . . . , An |= B .

Furthermore, we use notations such as ‘Γ, A’ and ‘Γ,∆’ for lists obtained by adding a sentence and by combining two lists:

If Γ = A1, . . . , An and ∆ = B1, . . . , Bk ,


then Γ, A = A1, . . . , An, A and Γ,∆ = A1, . . . , An, B1, . . . , Bk .

It is obvious that, as far as logical consequences are concerned, the ordering of the premises makes no difference. Also repeating a premise, or deleting a repeated occurrence, makes no difference. For example,

A,B,A,C,D,D , B,A,C,D , and A,B,C,D

have the same logical consequences. Such rewriting of lists will, henceforth, be allowed, as a matter of course.

It should be evident by now that, in dealing with logical implications, we can apply the substitution-of-equivalents principle: any sentences among the premises and the conclusion can be substituted by logically equivalent ones. Spelled out in detail, it means this:

If A1 ≡ A′1, and . . . and An ≡ A′n, and B ≡ B′ ,

then A1, . . . , An |= B ⇐⇒ A′1, . . . , A′n |= B′ .

The Empty Premise List

We include among the lists the empty list, one that has no members. To be a logical consequence of the empty list means simply to be a logical truth. (If Γ is empty, then, vacuously, all its members are true. Hence, to say that it is impossible that all members of Γ are true and B is false is simply to say that it is impossible that B is false.)

Logical implication by the empty list is expressed by writing nothing to the left of ‘|=’. Therefore

|= B

means that B is a logical truth. In the case of sentential logic, it means that B is a tautology.

By using conjunction, we can reduce an implication from A1, . . . , An to the single-premise case:

(5) A1, . . . , An |= B ⇐⇒ A1 ∧ . . . ∧An |= B .

(5) is proved by observing that all Ai's get T just when their conjunction, A1 ∧ . . . ∧ An, gets T.
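For a concrete instance of (5), here is a brute-force check (a sketch in the style of the earlier truth-table test; the particular premises A, A→B, B→C are our own illustrative choice):

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """Gamma |= B: no assignment makes all premises T and B F."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

atoms = ["A", "B", "C"]
A1 = lambda v: v["A"]
A2 = lambda v: (not v["A"]) or v["B"]        # A -> B
A3 = lambda v: (not v["B"]) or v["C"]        # B -> C
B  = lambda v: v["C"]
conj = lambda v: A1(v) and A2(v) and A3(v)   # A1 /\ A2 /\ A3

# (5): A1, A2, A3 |= B  <=>  A1 /\ A2 /\ A3 |= B
assert implies([A1, A2, A3], B, atoms) == implies([conj], B, atoms)
print("(5) agrees on this instance")
```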

Implications from premise lists constitute, nonetheless, an important advance on single-premise implications. First, they do not necessitate the use of conjunctions. Second, it is easier to grasp an implication stated in terms of several premises, instead of using a single long conjunction. Third, the premise list can be infinite. The definition of implication works equally well for that case, but the reduction to single premises, via (5), breaks down (unless we admit infinite conjunctions, which is a radical extension of the system). In this book we shall restrict ourselves to finite premise lists. Yet the infinite case has its uses. Fourth, by including the possibility of an empty list, we incorporate logical truth within the framework of logical implication. As we shall see, the rules for implications lead to useful methods for establishing logical truth.

Finally and most important, there are nice sets of rules for establishing implications, which depend essentially on the possibility of having many premises.

4.2.1 Some Basic Implication Laws and Top-Down Derivations

Our previous (3) (which characterizes implication from a single premise in terms of logical truth) can now be stated as:

(6) A |= B ⇐⇒ |= A→ B

And this can be generalized to the following important law:

(7) Γ, A |= B ⇐⇒ Γ |= A→ B

(6) is a particular case of (7), obtained when Γ is empty. Here is the proof of (7):

To prove the left-to-right direction, assume that Γ, A |= B and show: Γ |= A→ B, i.e., that it is impossible (by virtue of the logical elements) that all members of Γ are true and A→ B is not. Assume a case where all members of Γ get T. If A gets F, then A → B gets T (by the truth-table of →). And if A gets T, then all members of Γ, A get T; since we assumed that Γ, A |= B, B gets T. Again, by the truth-table of →, A→ B gets T.

To prove the right-to-left direction, assume that Γ |= A→ B, and show: Γ, A |= B, i.e., that it is impossible (by virtue of the logical elements) that all members of Γ, A get T and B gets F. So assume that all members of Γ, A get T. Then (i) all members of Γ get T and (ii) A gets T. Having assumed that Γ |= A→ B, it follows that A→ B gets T. Since also A gets T, it follows, by the truth-table of →, that B gets T.


Note that the argument relies on the logical elements of the sentences in Γ, A, B and on the truth-table of →, which is itself a logical element.
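The two directions of (7) can also be confirmed by brute force. The sketch below (ours, not the book's) fixes a sample one-sentence Γ and lets A and B range over all 16 truth-functions of two atoms, checking the equivalence in each of the 256 cases:

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """Gamma |= B: no assignment makes all premises T and B F."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

atoms = ["P", "Q"]
rows = list(product([True, False], repeat=2))

def sentence(table):
    """Build a sentence from an explicit truth-table over (P, Q)."""
    return lambda v: table[(v["P"], v["Q"])]

G = lambda v: v["P"]          # a sample one-sentence premise list Gamma

# Check (7) for every pair of sentences A, B in two atoms:
for abits in product([True, False], repeat=4):
    for bbits in product([True, False], repeat=4):
        A = sentence(dict(zip(rows, abits)))
        B = sentence(dict(zip(rows, bbits)))
        arrow = lambda v: (not A(v)) or B(v)     # A -> B
        assert implies([G, A], B, atoms) == implies([G], arrow, atoms)
print("(|=,->) holds in all 256 cases")
```

This is only a finite sanity check for one Γ, of course; the proof above establishes the law in full generality.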

(7) provides a very useful way of establishing implications in which the conclusion is a conditional. We can refer to it by the (rather unwieldy) name “conclusion-conditional law”. We also mark it by the following self-explanatory notation

(|=,→) .

Here is an illustration of how (|=,→) can work. Suppose that we want to show that:

|= (A→ (B → A))

Using (|=,→) (where Γ is empty and B is substituted by B → A) this reduces to showing that:

A |= B → A

Again, using (|=,→) (where Γ consists of A, A is substituted by B, and B by A), this reduces to:

A,B |= A

But this last implication is obvious, because the conclusion occurs as one of the premises. Thus, we have established the logical truth of our original sentence.

The argument just given is an example of a top-down proof, or top-down derivation: We start with the claim to be proved and, working our way “backward”, we keep reducing it to other sufficient claims (i.e., claims that imply it), until we reduce it to obviously true claims. We can then turn the argument into a bottom-up proof: the familiar kind that starts with obviously true claims and moves forward, in a sequence of steps, until the desired claim is established. The bottom-up proof is obtained by inverting the top-down derivation. In our last example the resulting bottom-up proof is:

A,B |= A obvious,

A |= B → A by (|=,→),

|= A→ (B → A) by (|=,→).

The implications occurring in top-down derivations are referred to as goals. The derivation starts with the initial goal (the implication we want to prove) and proceeds stepwise by reducing goals to other goals, until all the required goals are self-evident. The top-down method figures prominently in the sequel. Besides (|=,→), we shall avail ourselves of other laws. It goes without saying that all the laws are general schemes holding for all possible values of the sentential variables. Here are some.

(8) If Γ |= A and every sentence that occurs in Γ occurs in Γ′, then Γ′ |= A.


(8) says that the addition of premises can only increase the class of implied consequences. This property is the monotonicity of logical implication, or of the logical consequence relation.3

Given our definition of implication, (8) is trivial (if it is impossible that all the sentences in Γ get T and A gets F, then, a fortiori, it is impossible that all sentences in Γ′ get T and A gets F).

Note: Monotonicity obtains for many types of implication, not necessarily logical, and it seems quite obvious: By adding more premises one can only get more consequences, not less. Yet we often employ reasonings that are not monotone. Conclusions established on the basis of some information may be withdrawn when some additional information is obtained. The well-known example is: Being told that Twitty is a bird, one will conclude that Twitty can fly; but one will withdraw this conclusion if told, in addition, that Twitty is a penguin. Inferences of that nature have been, in the last twenty years, the subject of considerable research by logicians, computer scientists and philosophers working in the area of belief change and artificial intelligence. Various formal systems have been proposed. They come under the title of non-monotonic logic.

Not as obvious as (8), but still quite easy, is the following law of consequence addition:

(9) If Γ |= A, then, for every sentence B:

Γ |= B ⇐⇒ Γ, A |= B

(9) means that any consequence of the given premises can be added as an additional premise, without changing the class of consequences. Here is the proof.

The left-to-right direction of the ‘⇐⇒’ follows from monotonicity. For the right-to-left direction, assume that Γ, A |= B. If all sentences in Γ get T, then, by the initial assumption (that Γ |= A), A must get T; hence, all sentences in Γ, A get T; therefore B gets T. Thus, it is impossible that all sentences in Γ get T and B gets F.
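Law (9) can likewise be sanity-checked by brute force. In this sketch (ours) Γ is the two premises P and P→Q, the added consequence A is Q, and B ranges over all 16 truth-functions of the two atoms:

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """Gamma |= B: no assignment makes all premises T and B F."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

atoms = ["P", "Q"]
G1 = lambda v: v["P"]
G2 = lambda v: (not v["P"]) or v["Q"]     # P -> Q
A  = lambda v: v["Q"]                     # a consequence of G1, G2

assert implies([G1, G2], A, atoms)        # Gamma |= A

# (9): adding the consequence A changes nothing, for every sentence B.
for bits in product([True, False], repeat=4):
    table = dict(zip(product([True, False], repeat=2), bits))
    B = lambda v, t=table: t[(v["P"], v["Q"])]
    assert implies([G1, G2], B, atoms) == implies([G1, G2, A], B, atoms)
print("(9) holds for all 16 conclusions B")
```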

The following generalization of (9) allows us to add as premises many consequences of the original list.

(9∗) Assume that every sentence in ∆ is a consequence of Γ; then Γ and Γ,∆ have the same consequences; that is, for every sentence C:

Γ |= C ⇐⇒ Γ,∆ |= C

3In mathematics, ‘monotone’ (or ‘positively monotone’) is used to describe relations in which an increase in one quantity does not cause a decrease in another related one. For example, 2·x is a monotone function of x, for it doesn't become smaller as x becomes larger. But x² − 2x is not monotone, for, as x becomes larger it sometimes increases and sometimes decreases (e.g., it increases if x is increased from 1 to 2, but decreases if x is increased from 0 to 1).


(9∗) can be proved by the same reasoning that proves (9). It can also be deduced by repeated applications of (9). (Assume that ∆ = B1, . . . , Bm; then every Bi is a consequence of Γ. By (9), Γ and Γ, B1 have the same consequences. Since B2 is a consequence of Γ, it is, by monotonicity, a consequence of Γ, B1; again, by (9), Γ, B1 and Γ, B1, B2 have the same consequences. Therefore Γ and Γ, B1, B2 have the same consequences, etc.)

(9∗) implies the following generalization of the transitivity law of one-premise implications:

If (i) for every B in ∆, Γ |= B, and (ii) ∆ |= C, then Γ |= C .

The argument is easy: By monotonicity, every consequence of ∆ is a consequence of Γ,∆; and, by (9∗), Γ,∆ has the same consequences as Γ.

(10) If every sentence of ∆ is a consequence of Γ and every sentence of Γ is a consequence of ∆, then Γ and ∆ have the same consequences.

(10) follows trivially from generalized transitivity: every consequence of ∆ is a consequence of Γ and, vice versa, every consequence of Γ is a consequence of ∆.

Equivalent Premise Lists: Call two premise lists, Γ and ∆, logically equivalent, or equivalent for short, if they have the same logical consequences.

If two premise lists are equivalent then every sentence of one list is a consequence of the other (because it is a consequence of the list in which it occurs). (10) says that the reverse direction holds as well.

By the truth-table of →, we get immediately:

(11) A,A→B |= B

We have also: B |= A→B. These two imply the following very useful law:

(12) Γ, A, A→B |= C ⇐⇒ Γ, A,B |= C

(12) is obtained, via (10), by observing that every sentence in one of the two premise lists is a consequence of the other. (Every sentence in Γ, A, B is a consequence of Γ, A, A→B, because A, A→B |= B. And every sentence in Γ, A, A→B is a consequence of Γ, A, B, because B |= A→B.)

We shall call (12) disjoining. It allows us to disjoin a premise that is a conditional into its parts, provided that the antecedent is among the premises.
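The two premise lists in (12) have exactly the same models, so the law can be confirmed exhaustively for any fixed choice of A and B. In the sketch below (ours; Γ is taken to be empty) C ranges over all 256 truth-functions of three atoms:

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """Gamma |= B: no assignment makes all premises T and B F."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

atoms = ["A", "B", "C"]
A = lambda v: v["A"]
B = lambda v: v["B"]
arrow = lambda v: (not v["A"]) or v["B"]   # A -> B

# (12) with empty Gamma:  A, A->B |= C  <=>  A, B |= C, for every C.
for bits in product([True, False], repeat=8):
    table = dict(zip(product([True, False], repeat=3), bits))
    C = lambda v, t=table: t[(v["A"], v["B"], v["C"])]
    assert implies([A, arrow], C, atoms) == implies([A, B], C, atoms)
print("disjoining verified for all 256 conclusions")
```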


Note: Disjoining is related to what is known as modus ponens, or the rule of detachment, by which one can formally infer B from A and A→ B. Disjoining is different. It is the semantic law which justifies the use of modus ponens. The name ‘disjoining’ is not a current term.

(12) can be generalized to:

(12∗) If A′ |= A, then

Γ, A′, A→ B |= C ⇐⇒ Γ, A′, B |= C

To show (12∗), assume that A′ |= A. Then the addition of A to any list containing A′ yields an equivalent list. Hence, Γ, A′, A → B is equivalent to Γ, A′, A, A → B, which, by (12), is equivalent to Γ, A′, A, B. And this last list is equivalent to Γ, A′, B, since it is obtained from it by adding A.

Here is an example of a top-down derivation that uses some of the listed laws. We want to show that

|= [A→ (B → C)]→ [B → (A→ C)]

Starting with this as our initial goal, we keep reducing each goal to another sufficient goal and we write the goals on separate, numbered lines. Indicated in the margin is the law (or laws) by which the preceding implication is reduced to the current one. The sign ‘√’ marks obvious implications that need no further reductions.

1. |= [A→ (B → C)]→ [B → (A→ C)] initial goal,

2. A→ (B → C) |= B → (A→ C) by (|=,→),

3. A→ (B → C), B |= A→ C by (|=,→),

4. A→ (B → C), B,A |= C by (|=,→),

5. B → C,B,A |= C by disjoining,

6. C,B,A |= C by disjoining.√

Note that the reduction from 4. to 5. uses an instance of disjoining, whereby A→ (B → C), A is replaced by B → C, A.
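The initial goal of this derivation, being an implication from the empty list, claims that the sentence is a tautology; that can be confirmed directly by running through all eight assignments. A quick Python check (ours, not part of the text):

```python
from itertools import product

def imp(p, q):
    """The truth-table of ->."""
    return (not p) or q

# |= [A -> (B -> C)] -> [B -> (A -> C)]: true under every assignment.
for a, b, c in product([True, False], repeat=3):
    assert imp(imp(a, imp(b, c)), imp(b, imp(a, c)))
print("tautology confirmed for all 8 assignments")
```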

For the sake of brevity, we can write the three steps from 1. to 2., from 2. to 3., and from 3. to 4. as a single step:

|= [A→ (B → C)]→ [B → (A→ C)] initial goal,

A→ (B → C), B,A |= C by three applications of (|=,→).


In a similar way, we can write the steps from 4. to 5. and from 5. to 6. as a single step in which disjoining is applied twice.

The bottom-up proof of the initial goal is obtained by reversing the list: Start from 6. and end at 1., justifying each step by the indicated rule; 5. is obtained from 6. by disjoining, 4. from 5. by disjoining, 3. from 4. by (|=,→), etc.

From now on we will omit ‘initial goal’ in the margin of the first line.

Here is another example, where substitution-of-equivalents is used as well. The equivalences we use are:

B ∨ C ≡ ¬B → C and A→ B ≡ ¬B → ¬A

The first equivalence is used in getting 2. from 1., the second in getting 4. from 3.; in the first case substitution is applied to the conclusion, in the second case to one of the premises.

1. A→ B,¬A→ C |= B ∨ C

2. A→ B,¬A→ C |= ¬B → C substitution of equivalents,

3. ¬B,A→ B,¬A→ C |= C by (|=,→),

4. ¬B,¬B → ¬A,¬A→ C |= C substitution of equivalents,

5. ¬B,¬A,¬A→ C |= C by disjoining,

6. ¬B,¬A,C |= C by disjoining.√

Homework 4.3 Using the laws introduced so far, prove, via top-down derivations, the following five implications. The goal should be reduced in the end to an obvious implication in which the conclusion is one of the premises. You can use substitution-of-equivalents based on simple equivalences of the kind given in the last example.

In the derivations of 4. and 5. you can use laws (12∗) and (10), as well as the implications B |= A ∨B and A, B |= A ∧B.

1. |= [A→ (B → C)]→ [(A→ B)→ (A→ C)]

2. ¬A→ B,B → C |= ¬C → A

3. A→ (B ∨ C),¬B |= A→ C

4. (A ∨B)→ (B → C) |= B → C

5. A∧B → C,B |= A→ C


4.2.2 Additional Implication Laws and Derivations as Trees

The following two laws handle conjunctions that occur either as premises or as conclusions. The first, the conjunction-premise law, handles the case where the conjunction is one of the premises; the second, the conjunction-conclusion law, handles the case where the conjunction is the conclusion. We adopt the following notation:

(∧, |=): for the conjunction-premise law, (|=,∧): for the conjunction-conclusion law.

Here are the two laws:

(∧, |=) Γ, A ∧B |= C ⇐⇒ Γ, A,B |= C

(|=,∧) Γ |= A ∧B ⇐⇒ Γ |= A and Γ |= B

(∧, |=) follows, via (10) (and monotonicity), from the obvious facts that (i) A ∧ B logically implies each of A and B, and (ii) A, B |= A ∧B.

In the second law, (|=,∧), the ⇒-direction follows, via transitivity, from the fact that A∧B implies each of A and B. The ⇐-direction (which is the direction we will be using in top-down derivations) is established by noting that, if Γ |= A and Γ |= B, then Γ implies every premise in the list A, B. Since this list implies A∧B, Γ implies it as well (via transitivity).
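Both conjunction laws can be confirmed exhaustively for fixed A and B. In this sketch (ours), the first assertion checks (∧, |=) for a sample one-sentence Γ with C ranging over all 256 truth-functions of three atoms, and the second checks (|=,∧) with Γ ranging over those same 256 sentences:

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """Gamma |= B: no assignment makes all premises T and B F."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

atoms = ["A", "B", "C"]
A = lambda v: v["A"]
B = lambda v: v["B"]
conj = lambda v: v["A"] and v["B"]        # A /\ B
G = lambda v: v["C"]                      # a sample Gamma

for bits in product([True, False], repeat=8):
    table = dict(zip(product([True, False], repeat=3), bits))
    S = lambda v, t=table: t[(v["A"], v["B"], v["C"])]
    # (/\, |=):  Gamma, A/\B |= S  <=>  Gamma, A, B |= S
    assert implies([G, conj], S, atoms) == implies([G, A, B], S, atoms)
    # (|=, /\):  S |= A/\B  <=>  S |= A and S |= B
    assert implies([S], conj, atoms) == (implies([S], A, atoms)
                                         and implies([S], B, atoms))
print("conjunction laws verified in all 256 cases")
```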

The conjunction-conclusion law is distinguished from the other laws used so far in that it reduces a goal (Γ |= A∧B) not to one but to two goals. Both must be achieved. The number of goals increases thereby. It may increase further, since each of the two new goals may give rise, directly or indirectly, to more than one goal. But the new goals are simpler: instead of the conclusion A∧B we have only A or B. This is also true of the other laws that we shall use. It constitutes the main feature of the method: Although the number of goals may increase, the goals themselves become simpler. In the end, the initial goal is reduced to a collection of so-called elementary goals; these are goals whose validity can be immediately checked.

Here is a simple top-down derivation that ends with two final, self-evident implications:

1. C → A |= (C → B)→ (C → A∧B)

2. C → A,C → B,C |= A ∧B two applications of (|=,→),

3. C,A,B |= A ∧B two applications of disjoining,

4.1 C,A,B |= A by (|=,∧), √

4.2 C,A,B |= B by (|=,∧). √


Terminology: In a given derivation, the goals to which a goal has been reduced in a single step are referred to as the goal's children. The goal is referred to as the parent.

In the last derivation, 1. has a single child, which is 2. The only child of 2. is 3. But 3. has two children, numbered 4.1 and 4.2. They are numbered thus, in order to mark them as the two children of 3. The derivation is not complete, unless both 4.1 and 4.2 have been achieved; hence we need two √'s to indicate success.

Top-Down Derivations Written as Trees

A top-down derivation can be written in the form of a tree, whose nodes are labeled by the implications that appear as goals in the derivation. (Concerning trees, see 2.4.1; recall that ‘children’ is used also in the tree-terminology for the nodes that issue from some node.) Here is the tree-form of the last derivation:

        C → A |= (C → B) → (C → A∧B)
                      |
        C → A, C → B, C |= A∧B
                      |
             C, A, B |= A∧B
              /           \
     C, A, B |= A     C, A, B |= B

The general rule is very simple:

(i) The root is labeled by the initial goal.

(ii) A node has as many children as the children of the goal that labels it. Each child-goal labels exactly one child node.

In a complete derivation the leaves of the tree are labeled by implications considered to be obviously true. So far, we have restricted this category to implications in which the conclusion is one of the premises. Later, we shall add to it another type of implication.

Usually, an implication can be reduced to a simpler one in more than one way. The choices of the sentence to which we apply laws determine the resulting derivation. The implication proved in the last example is also provable as follows:


1. C → A |= (C → B)→ (C → A∧B)

2. C → A, C → B, C |= A∧B two applications of (|=,→),

3.1 C → A, C → B, C |= A by (|=,∧),

3.2 C → A, C → B, C |= B by (|=,∧),

4.1 C, A, C → B |= A by disjoining, √

4.2 C → A, C, B |= B by disjoining. √

Here, instead of applying disjoining to the second goal (as we did before) we apply to it (|=,∧). As a result the branching occurs earlier. We then apply disjoining to each of the children, 3.1 and 3.2, getting 4.1 as the single child of 3.1, and 4.2 as a single child of 3.2. The tree form of this derivation is:

            C → A |= (C → B) → (C → A∧B)
                          |
            C → A, C → B, C |= A∧B
                /                     \
  C → A, C → B, C |= A       C → A, C → B, C |= B
           |                           |
  C, A, C → B |= A             C → A, C, B |= B

Note that 4.1 is not the child of 3.2 (which is the line immediately preceding it). Neither is 4.2 the child of its immediate predecessor, 4.1. The child-parent relation is determined by the numbering, not merely by the order of lines. This is unavoidable when we use a sequential form to represent a tree. The general rule for numbering the goals will be given later.

Laws for Other Connectives and More on Top-Down Derivations

In analogy with the two laws for conjunction, we have a disjunction-premise law, denoted (∨, |=), which handles a disjunction that occurs as premise; and a disjunction-conclusion law, denoted (|=,∨), which handles a disjunction that is a conclusion. Here they are.

(∨, |=) Γ, A ∨B |= C ⇐⇒ Γ, A |= C and Γ, B |= C

(|=,∨) Γ |= A ∨B ⇐⇒ Γ, ¬A |= B


Note: In (∨, |=), like in (|=,∧), we get (via the ⇐ direction) a reduction of a goal to two goals, both of which must be proved. In (|=,∧) the English ‘and’ on the right-hand side corresponds to ∧ in the conclusion. But in (∨, |=) it corresponds to ∨ in the premise. Some students find this confusing. The reason for converting ∨ to ‘and’ is that ∨ occurs in the premise. In order to show that ‘... or - - -’ implies something, we have to show that ‘...’ implies it and ‘- - -’ implies it.

(|=,∨) can be proved by replacing A ∨ B by the equivalent ¬A → B and, via (|=,→), transferring ¬A to the premises.

Homework 4.4 Give an argument that proves the disjunction-premise law.

Note: To show that the premises imply A∨B, it is sufficient to show that they either imply A or imply B (because each of A, B, implies A ∨ B). Hence, we can have a disjunction-conclusion law with ‘or’ in the right-hand side. But such a law holds only in the ⇐-direction. The ⇒-direction does not hold in general. Γ may imply A ∨ B without implying any of the disjuncts A, B. For example,

|= A ∨ ¬A

(here Γ is empty), but from this it does not follow that either |= A, or |= ¬A. For if A is neither logically true nor logically false, then ⊭ A and ⊭ ¬A.

Therefore you run some risk if, in order to show that Γ |= A∨B, you try to show that either Γ |= A or Γ |= B. For if Γ implies neither disjunct, you are sure to fail, even though Γ may imply the disjunction. For example, you will fail if you try to prove in this way that |= A∨¬A. On the other hand, trying to show that Γ |= A∨B, by showing that Γ, ¬A |= B, is safe; for the second task is equivalent to the first.
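The example above can be replayed mechanically. The sketch below (ours) confirms that the empty list implies A ∨ ¬A while implying neither disjunct, and that the safe route via (|=,∨) reduces the goal to a trivially true one:

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """Gamma |= B: no assignment makes all premises T and B F."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

atoms = ["A"]
A         = lambda v: v["A"]
notA      = lambda v: not v["A"]
A_or_notA = lambda v: v["A"] or (not v["A"])

assert implies([], A_or_notA, atoms)      # |= A \/ ~A holds ...
assert not implies([], A, atoms)          # ... but not |= A
assert not implies([], notA, atoms)       # ... and not |= ~A

# The safe route (|=,\/): |= A \/ ~A reduces to ~A |= ~A, trivially true.
assert implies([notA], notA, atoms)
print("disjunction-conclusion example confirmed")
```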

Here is an example of a derivation that involves more than one branching.

1. A ∨B |= (A→ B)→ [(B → A)→ A∧B],

2. A ∨B, A→ B, B → A |= A∧B two applications of (|=,→),

3.1 A, A→ B, B → A |= A∧B by (∨, |=),

3.2 B, A→ B, B → A |= A∧B by (∨, |=),

4.1 A, B, B → A |= A∧B by disjoining,

5.11 A,B,B → A |= A by (|=,∧), √

5.12 A,B,B → A |= B by (|=,∧), √

4.2 A→ B,B,A |= A∧B by disjoining,

5.21 A→ B,B,A |= A (|=,∧), √


5.22 A→ B, B, A |= B (|=,∧). √

The following is the same derivation written as a labeled tree. For convenience, the line numbers, rather than the implications, have been used as labels.

                1.
                |
                2.
              /    \
           3.1      3.2
            |        |
           4.1      4.2
          /   \    /   \
      5.11  5.12  5.21  5.22

The Rule for Numbering Nodes in a Tree: The following is a convenient numbering of sequentially arranged items that are labels of a tree, from which you can construct the tree.

• Numbers are of the form n._ , where n is a positive integer and _ is a (possibly empty) sequence of positive integers.

• The root of the tree is numbered 1.

• If a node numbered n._ has a single child, the child is numbered n+1._ (i.e., the head is increased by 1 and the tail is left unchanged).

• If a node numbered n._ has k children, where k > 1, they are numbered, according to their left-to-right order:

n+1._1, n+1._2, . . . , n+1._k

Note that in n._ the number n is the node's level in the tree, i.e., the number of nodes on the branch leading to it from the root (including both ends). The number of digits after the point shows how many branchings, up to that node, have occurred on that branch.

In our case, a goal is substituted by one or two goals. Hence nodes do not have more than two children and the sequence after n. consists of 1s and 2s. The numbering rule applies equally to derivations in which goals are substituted, in one step, by more than two goals.


Note: If the conclusion is (A∧B)∧C, then two applications of (|=,∧) will reduce our initial goal to three, each with one of A, B, C as a conclusion. We may consider a law that achieves it in one step. Here it is convenient to disregard the grouping in the repeated conjunction:

Γ |= A ∧B ∧ C ⇐⇒ Γ |= A and Γ |= B and Γ |= C

The same applies to longer conjunctions. Repeated disjunctions in the premises can be treated similarly. Such laws introduce branching into more than two branches. They are not included among our basic laws.

The conditional-premise and conditional-conclusion laws, denoted respectively as (→, |=) and (|=,→), are as follows:

(→, |=) Γ, A→ B |= C ⇐⇒ Γ,¬A |= C and Γ, B |= C

(|=,→) Γ |= A→ B ⇐⇒ Γ, A |= B

The first is obtained by replacing the premise A→ B by the equivalent ¬A∨B and applying the disjunction-premise law. The second is our old friend (7) (with the two sides of ⇔ reversed).
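The conditional-premise law can also be confirmed exhaustively for fixed A and B: the models of Γ, A→B are exactly the models of Γ, ¬A together with those of Γ, B. A quick check (ours), with a sample one-sentence Γ and C ranging over all 256 truth-functions of three atoms:

```python
from itertools import product

def implies(premises, conclusion, atoms):
    """Gamma |= B: no assignment makes all premises T and B F."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

atoms = ["A", "B", "C"]
G     = lambda v: v["C"]                   # a sample Gamma
arrow = lambda v: (not v["A"]) or v["B"]   # A -> B
notA  = lambda v: not v["A"]
B     = lambda v: v["B"]

# (->, |=): Gamma, A->B |= C  <=>  Gamma, ~A |= C and Gamma, B |= C
for bits in product([True, False], repeat=8):
    table = dict(zip(product([True, False], repeat=3), bits))
    C = lambda v, t=table: t[(v["A"], v["B"], v["C"])]
    assert implies([G, arrow], C, atoms) == (
        implies([G, notA], C, atoms) and implies([G, B], C, atoms))
print("(->,|=) verified for all 256 conclusions")
```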

Note: If, in addition to A→ B, the premise-list contains A, we can apply disjoining:

Γ, A,A→ B |= C ⇐⇒ Γ, A,B |= C

This is better than applying (→, |=). But whereas (→, |=) is always applicable to a premise that is a conditional, disjoining requires that the antecedent, A, be among the premises.

It is good practice to apply disjoining as long as the premises contain a conditional and its antecedent; e.g., a list of premises of the form

Γ, A→ B,B → C,A

can be reduced, by two applications of disjoining, to the equivalent and much simpler list

Γ, A,B,C

Sometimes (12∗) can be applied as well; e.g., relying on the obvious implication A |= A ∨B, we can replace Γ, A, (A ∨B)→ C by the equivalent premise-list: Γ, A, C.

The remaining binary connective, ↔, can be dealt with through replacing A ↔ B by the conjunction of two conditionals: (A→ B) ∧ (B → A). Alternatively, we can employ directly the following laws:

(↔, |=) Γ, A↔ B |= C ⇐⇒ Γ, A, B |= C and Γ, ¬A, ¬B |= C


(|=,↔) Γ |= A↔ B ⇐⇒ Γ, A |= B and Γ, B |= A

(↔, |=) is obtained by replacing A ↔ B by the equivalent (A∧B) ∨ (¬A∧¬B) and then applying the premise laws for disjunction and conjunction. (|=,↔) is obtained by replacing the biconditional by a conjunction of two conditionals and applying the conclusion laws for conjunction and conditional.

Treatment of Negated Compounds Negated sentential compounds are sentences either of the form ¬(¬A), or of the form ¬(A ∗ B), where ∗ is a binary connective. If a negated compound occurs either in the premises or as the conclusion, then, if it is ¬(¬A) the goal can be simplified by dropping the double negation. In the other case, we can push negation inside, using our old equivalence laws. This yields a sentence to which we can apply one of the previous implication laws.

If ∗ is ∧ or ∨, the pushing inside is done via De Morgan's laws. If it is →, the pushing inside is achieved via the equivalence:

(13) ¬(A→ B) ≡ A ∧ ¬B

In the case of the biconditional we can use each of the equivalences:

(14.i) ¬(A↔ B) ≡ (A∧¬B) ∨ (¬A∧B)

(14.ii) ¬(A↔ B) ≡ A↔ ¬B
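The equivalences (13), (14.i) and (14.ii) are easily checked by truth tables; here is a quick brute-force confirmation (ours, not part of the text):

```python
from itertools import product

def imp(p, q):                # truth-table of ->
    return (not p) or q

def iff(p, q):                # truth-table of <->
    return p == q

for a, b in product([True, False], repeat=2):
    # (13): ~(A -> B) == A /\ ~B
    assert (not imp(a, b)) == (a and not b)
    # (14.i): ~(A <-> B) == (A /\ ~B) \/ (~A /\ B)
    assert (not iff(a, b)) == ((a and not b) or ((not a) and b))
    # (14.ii): ~(A <-> B) == A <-> ~B
    assert (not iff(a, b)) == iff(a, not b)
print("equivalences (13), (14.i), (14.ii) confirmed")
```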

(If the negated biconditional is a premise, then, usually, (14.i) is more convenient; we then apply (∨, |=) and, to each of the resulting implications, we apply (∧, |=). If the negated biconditional is the conclusion, then (14.ii) is more convenient; we then apply (|=,↔). Sometimes (14.ii) is more convenient when ¬(A↔ B) is a premise: if the premise-list contains A, we can replace A↔ ¬B, A by the equivalent list A, ¬B.)

Homework

4.5 Derive (14.ii) from (14.i) by pushing-in disjunction, deleting redundant conjuncts and replacing the remaining disjunctions by equivalent conditionals.

4.6 Establish the following implications via top-down derivations. You can use all the laws introduced so far, as well as substitution of equivalents. At the end you should have reduced the initial goal to a bunch of self-evident goals in which the conclusion is among the premises. (In some cases you might have to use (12∗).)

1. |= ¬(A→ B)→ A

2. |= A∧B → (A↔ B)


3. A ∨B,A→ C |= ¬B → C

4. (A→ B)→ A |= A

5. |= [C → (¬A→ B)]→ [(C → A) ∨ (C → B)]

6. (A ∨B)→ A∧B |= A↔ B

7. [A→ (A↔ B)] ∧ [B → (A↔ B)] |= A↔ B

8. |= A→ (B → C)↔ (A∧B → C)

4.2.3 Logically Inconsistent Premises

A premise-list is said to be logically inconsistent, or inconsistent for short, if it is impossible for all of them to be true, by virtue of pure logic. This is equivalent to saying that the conjunction of all premises is logically false. In the case of sentential logic, where the connectives are the only logical elements, we say that the premise-list is contradictory.

Given a logically inconsistent premise-list, and any sentence B, it is impossible that all premises are true and B is false; because it is impossible that all premises are true. Hence, by the definition, an inconsistent premise-list implies any sentence:

If Γ is inconsistent, Γ |= B, for all B.

This is sometimes expressed by saying that a “contradiction implies everything”, and is often a source of misunderstandings. Some accept it as a strange, or “deep” truth of logic. And some may find it a defect of the system. Actually, there is no mystery and no ground for objection. Logical implication is a technical concept defined for particular purposes. It captures some of our intuitions concerning “implication”; but it does not, and is not intended to, capture all. Using “imply” in the somewhat vague, everyday sense, we will never say that the two contradictory premises,

(i) John Kennedy was assassinated by Lee Oswald,

(ii) John Kennedy was not assassinated by Lee Oswald,

imply that once there was life on Mars. For we require some internal link between the premises and the conclusion. But there is nothing wrong in introducing a more technical variant of implication, well-defined in terms of possible truth-values, by which a contradiction does imply every sentence. (Similar points relating to logical equivalence have been discussed already.)
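The point can also be checked mechanically. The following sketch (ours, not the book's) runs through all truth-value assignments and confirms that, by the definition, there is no assignment under which the premises A, ¬A get T and an arbitrary B gets F:

```python
from itertools import product

# Look for an assignment making the premises A and not-A true
# and an arbitrary conclusion B false.  There is none, so by the
# definition the premise-list {A, not-A} implies every B.
rows = [(a, b) for a, b in product([True, False], repeat=2)
        if a and (not a) and not b]
print(rows)   # -> []
```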

The law that every sentence is implied by contradictory premises is very useful when it comes to deriving and checking logical implications. We shall regard the trivial instances of this law, where the premise-list contains a sentence and its negation, as self-evident implications that need no further proof. This means that every implication of the form

Γ, A,¬A |= B

is a possible termination point in a top-down derivation. We therefore allow two kinds of successful leaves (that can be marked by √): those in which the conclusion appears among the premises, and those in which the premises contain a sentence and its negation. Here is an example showing the use of the second kind.

1. (C → A) ∨ (C → B) |= C → A∨B

2. (C → A) ∨ (C → B), C |= A∨B   by (|=,→),

3. (C → A) ∨ (C → B), C, ¬A |= B   by (|=,∨),

4.1 C → A, C, ¬A |= B   by (∨,|=),

4.2 C → B, C, ¬A |= B   by (∨,|=),

5.1 C, A, ¬A |= B   by disjoining,  √

5.2 C, B, ¬A |= B   by disjoining,  √

5.1 is marked as successful, because the premises contain a sentence and its negation; 5.2 is marked because the conclusion is among the premises.

Note: We can now derive disjoining from the other laws as follows. By (→, |=): Γ, A → B, A |= C iff Γ, ¬A, A |= C and Γ, B, A |= C. But the first implication is obvious. Hence, Γ, A → B, A |= C iff Γ, B, A |= C.

Homework 4.7 Show, via top-down derivations, that the following implications obtain. The final goals should be of the two allowed kinds of self-evident implications.

1. |= (A → ¬A) → ¬A

2. |= (A → B) → (¬B → ¬A)

3. A∨B → C |= (A → C) ∧ (B → C)

4. |= (A→ B) ∧ (¬A→ B)→ B

5. A→ B∧C, (B → D) ∨ (C → D) |= A→ C

6. A → B, B → ¬B |= ¬A

7. A → B∧C, B → ¬C |= ¬A

8. A∧B → C, A∧¬B → C |= A → C


4.3 A Fool-Proof Method for Finding Proofs and Counterexamples

4.3.1 Validity and Counterexamples

All the preceding implication laws, and the equivalence laws of chapter 3, can be regarded as general schemes. They hold no matter what sentences we substitute for the sentential variables. Therefore the derivable implications are schemes as well; having proved an implication we have also proved all its instantiations.

To be precise, we should consider here (as we have done before) the sentential expressions. We can consider lists of sentential expressions. Call them premise expressions when they occur on the left-hand side of an implication. A list of premise expressions tautologically implies a sentential expression if:

There is no truth-value assignment to the sentential variables under which all the premise expressions get T and the conclusion expression gets F.

An implication (one that consists of expressions) is tautologically valid, or valid for short, if the list of premise expressions tautologically implies the conclusion expression.

An implication is therefore non-valid when there is a truth-value assignment to the sentential variables under which all the premise expressions get T and the conclusion expression gets F. Such an assignment is called a counterexample. Hence an implication is valid just when it has no counterexamples.

Obviously, a non-valid implication between sentential expressions fails as an implication between sentences, if we substitute the sentential variables by distinct atomic sentences. Alternatively, we can get a failed implication between sentences as follows:

Substitute every sentential variable that gets T in the counterexample by a tautology, and every sentential variable that gets F by a contradiction.

Example:

A → B, A∨C |= B

is not valid, because it has a counterexample:

A  B  C
F  F  T

If we assume that A = B = D∧¬D and C = D → D, the implication fails as an implication between sentences:

(D∧¬D) → (D∧¬D), (D∧¬D) ∨ (D → D) ⊭ D∧¬D

But A, B and C can be other sentences for which the implication holds, e.g., if C = A ∨ B (check it for yourself).


‘Implication’ therefore has a double usage: when the premises and the conclusion are unspecified sentences, and when they are sentential expressions. There is no danger of confusion, because the context will make the reading clear. If we say that an implication (e.g., the one in the last example) may or may not hold, then we are obviously referring to unspecified sentences. But if we say that it is not valid, we are saying that the scheme does not hold in general, that is, as an implication between sentential expressions, it has a counterexample. By the same token we say that x·y > x+y may or may not hold, depending on the numbers x and y, but that x² + 1 > x is a valid numerical inequality.

Equivalence for Counterexamples

We have seen that in each of our laws the left-hand side implication holds iff all the right-hand side implications hold. Each of our laws also satisfies the following.

Counterexample Equivalence: A truth-value assignment to the sentential variables is a counterexample to the left-hand side iff it is a counterexample to at least one of the implications on the right-hand side.

Counterexample equivalence can be inferred from the following two facts: (I) In each law, the equivalence of the two sides is preserved under all substitutions. (II) An implication is non-valid iff it has an instantiation that fails as an implication between sentences.

Alternatively, the arguments that prove the equivalence of the two sides can be used to show their counterexample equivalence. As an illustration we show this for (|=,→) and for (∨, |=).

(|=,→) Γ |= A→ B ⇐⇒ Γ, A |= B

A truth-value assignment (to the sentential variables) is a counterexample to the left-hand side, iff all members of Γ get T and A → B gets F. But this is equivalent to saying that all members of Γ get T, A gets T, and B gets F; which is exactly the condition for a counterexample to the right-hand side.

(∨, |=) Γ, A ∨B |= C ⇐⇒ Γ, A |= C and Γ, B |= C

A truth-value assignment is a counterexample to the left-hand side, iff all sentences in Γ get T, A∨B gets T, and C gets F. But A∨B gets T, iff either A gets T, or B gets T (or both). If A gets T, then this is a counterexample to Γ, A |= C, and if B gets T, it is a counterexample to Γ, B |= C. Vice versa, a counterexample to one of the right-hand side implications assigns T to all members of Γ and to A ∨ B, and assigns F to C.
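Counterexample equivalence for (∨, |=) can also be confirmed by direct enumeration. In the sketch below (our illustration, not the book's), Γ is taken to consist of a single sentential variable G; the counterexamples to the left-hand side coincide with the union of those of the two right-hand side implications:

```python
from itertools import product

left, right = set(), set()
for g, a, b, c in product([True, False], repeat=4):
    # counterexamples to  G, A v B |= C
    if g and (a or b) and not c:
        left.add((g, a, b, c))
    # counterexamples to  G, A |= C  together with those to  G, B |= C
    if (g and a and not c) or (g and b and not c):
        right.add((g, a, b, c))
print(left == right)   # -> True
```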

Using top-down derivations, we will define a method that decides for any implication whether or not it is valid. Given an implication, the method is guaranteed to produce either a proof of it or a counterexample.

4.3.2 The Basic Laws

The method uses a finite number of basic laws. Some were mentioned before and some are easily obtained from previously mentioned laws. Not all laws considered above are taken as basic. The basic laws are naturally classified as follows.

First, there is a law that enables trivial rearrangements of premise lists:

If every sentence occurring in Γ occurs in Γ′ and every sentence occurring in Γ′ occurs in Γ, then for every A,

Γ |= A ⇐⇒ Γ′ |= A

Using this law we can reorder the premises, delete repetitions, or list any premise more than once. Henceforth, such reorganizing will be carried out without explicit mention.

Next, we designate two types of implications as self-evident:

Self-Evident Implications

Γ, A |= A Γ, A, ¬A |= B

Implications belonging to these two types play a role similar to that of axioms. In bottom-up proofs they serve as starting points. In top-down ones they are the final successful goals.

The rest, referred to as reduction laws, are the basis of the method. They enable us to replace a goal by simpler goals. The first group consists of laws that handle sentential compounds A ◦ B, where ◦ is a binary connective. For each binary connective ◦, we have a premise law (◦, |=) and a conclusion law (|=, ◦), which handle, respectively, ◦-compounds that appear as a premise, or as the conclusion. The second group, which deals with negated compounds, is presented later.


Laws for Conjunction

(∧, |=) Γ, A ∧B |= C ⇐⇒ Γ, A,B |= C

(|=,∧) Γ |= A ∧B ⇐⇒ Γ |= A and Γ |= B

Laws for Disjunction

(∨, |=) Γ, A ∨B |= C ⇐⇒ Γ, A |= C and Γ, B |= C

(|=,∨) Γ |= A ∨B ⇐⇒ Γ,¬A |= B

Laws for Conditional

(→, |=) Γ, A→ B |= C ⇐⇒ Γ,¬A |= C and Γ, B |= C

(|=,→) Γ |= A→ B ⇐⇒ Γ, A |= B

Laws for Biconditional

(↔, |=) Γ, A↔ B |= C ⇐⇒ Γ, A,B |= C and Γ,¬A,¬B |= C

(|=,↔) Γ |= A↔ B ⇐⇒ Γ, A |= B and Γ, B |= A

The goal-reduction process is as described in 4.3. To recap: a step consists of replacing the left-hand side of a reduction law (the goal that is being reduced) by the right-hand side. The ⇐-direction guarantees that proving the new goals (or goal) is sufficient for proving the old goal; the ⇒-direction means that they are also implied by it. All counterexamples to a new goal are also counterexamples to the old one, and any counterexample to the old one is obtained in this way.

A goal’s children are the goals that replace it. (If there is one goal on the right, there is only one child.) It follows from the above that if the children are valid so is the parent. Consequently, if all the leaf goals are valid, so are their parents, and the parents of their parents, and so on, up to the initial goal. On the other hand, any counterexample to one of the leaf goals is also a counterexample to one (or more) of their parents, hence also to the parent’s parent, and so on, up to the initial goal. And any counterexample to the original goal is a counterexample to a goal in some leaf.


We still need laws for reducing negated compounds, sentences of the form ¬¬A or ¬(A ◦ B). The laws for double negation allow us to drop it, either in a premise or in the conclusion. Compounds of the form ¬(A ◦ B) can be treated in the way described in 4.2.2 (page 124), i.e., by pushing the negation inside. This means that we use laws such as:

Γ, ¬(A∧B) |= C ⇐⇒ Γ, ¬A∨¬B |= C

And similar laws for pushing negation inside in ¬(A∨B), in ¬(A→ B), and in ¬(A↔ B).

There is a more elegant way: combine in a single law the pushing of negation and the law that applies to the resulting compound. For ¬(A ∧ B) this yields:

Γ, ¬(A∧B) |= C ⇐⇒ Γ, ¬A |= C and Γ, ¬B |= C .

Doing so for all connectives, we get reduction laws for negated compounds, of the same type as our previous ones. It is easy to see that counterexample equivalence also holds for the second group, because these laws are obtained from the first group by substituting sentential expressions by logically equivalent ones; such substitutions do not change the sets of counterexamples.


Laws for Negated Negations

(¬¬, |=) Γ, ¬¬A |= B ⇐⇒ Γ, A |= B

(|=,¬¬) Γ |= ¬¬B ⇐⇒ Γ |= B

Laws for Negated Conjunctions

(¬∧, |=) Γ, ¬(A ∧B) |= C ⇐⇒ Γ, ¬A |= C and Γ, ¬B |= C

(|=,¬∧) Γ |= ¬(A ∧B) ⇐⇒ Γ, A |= ¬B

Laws for Negated Disjunctions

(¬∨, |=) Γ, ¬(A ∨B) |= C ⇐⇒ Γ, ¬A, ¬B |= C

(|=,¬∨) Γ |= ¬(A ∨B) ⇐⇒ Γ |= ¬A and Γ |= ¬B

Laws for Negated Conditionals

(¬→, |=) Γ, ¬(A→ B) |= C ⇐⇒ Γ, A, ¬B |= C

(|=,¬→) Γ |= ¬(A→ B) ⇐⇒ Γ |= A and Γ |= ¬B

Laws for Negated Biconditionals

(¬↔, |=) Γ, ¬(A↔ B) |= C ⇐⇒ Γ, A, ¬B |= C and Γ, ¬A, B |= C

(|=,¬↔) Γ |= ¬(A↔ B) ⇐⇒ Γ, A |= ¬B and Γ, ¬B |= A

This completes the set of reduction laws.

Branching Laws: A law whose right-hand side has more than one implication is called a branching law. Each application of a branching law causes branching in the tree. The branching laws are, for non-negated compounds: (|=,∧), (∨, |=), (→, |=), (↔, |=) and (|=,↔). For negated compounds they are: (¬∧, |=), (|=,¬∨), (|=,¬→), (¬↔, |=) and (|=,¬↔). The other laws are referred to as non-branching.


Memorization: You do not have to memorize all the laws. A useful strategy is to memorize only four: the two laws for conjunction, the premise-disjunction law, (∨, |=), and the conclusion-conditional law, (|=,→). The rest you can get by obvious substitutions of equivalents: the conclusion-disjunction law, by rewriting A ∨ B as ¬A → B; the premise-conditional law, by rewriting A → B as ¬A ∨ B; the premise-biconditional law, by rewriting the biconditional as (A∧B) ∨ (¬A∧¬B); and the conclusion-biconditional law, by rewriting it as (A → B) ∧ (B → A). Besides the double negation laws, the other laws for negated compounds are obtained by pushing negation in, as indicated earlier.
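The rewritings invoked here are ordinary tautological equivalences, and can be checked by enumerating the four truth-value combinations (a sketch of ours, not from the book):

```python
from itertools import product

imp = lambda x, y: (not x) or y          # truth-table of the conditional

for a, b in product([True, False], repeat=2):
    assert (a or b) == imp(not a, b)                      # A v B   as  not-A -> B
    assert imp(a, b) == ((not a) or b)                    # A -> B  as  not-A v B
    assert (a == b) == ((a and b) or (not a and not b))   # A <-> B, premise form
    assert (a == b) == (imp(a, b) and imp(b, a))          # A <-> B, conclusion form
print("all rewritings check")
```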

Elementary Implications

Elementary implications are those that cannot be simplified through reduction laws. An implication is elementary if every sentential expression figuring in it is either a sentential variable or a negation of one. This is equivalent to saying that it contains neither binary nor negated compounds. Here are some examples.

A, ¬B, C, ¬D |= ¬B    ¬A, C, B, ¬C |= D    ¬A, B, ¬C, D |= ¬E

Claim: (I) If an elementary implication is valid, then it is self-evident, i.e., either the conclusion occurs as a premise or the premises contain a sentential expression and its negation.

(II) If an elementary implication is not self-evident then there is a unique assignment to its sentential variables that constitutes a counterexample. This assignment is determined by the following conditions:

(i) Every sentential variable that occurs unnegated in the premises gets T, and every sentential variable that occurs negated in the premises gets F.

(ii) The sentential variable of the conclusion gets F if it occurs unnegated, T if it occurs negated.

Proof: Assume that an elementary implication is not self-evident; we show that the conditions in (II) determine an assignment, and that this assignment is the unique counterexample.

For any given assignment the following is obvious: All the premises get T iff the assignment satisfies (i). The conclusion gets F iff the assignment satisfies (ii). There is at most one assignment, to the sentential variables occurring in the implication, that satisfies (i) and (ii); because (i) and (ii) prescribe truth-values to all these sentential variables. Hence there is a counterexample iff there is an assignment satisfying (i) and (ii); the counterexample is then unique.

The only way in which (i) and (ii) can fail to determine an assignment is by prescribing more than one truth-value for the same sentential variable. This does not happen unless the implication is self-evident. For if it is not, no sentential variable occurs in the premises both negated and unnegated; hence (i) assigns to each sentential variable occurring in the premises exactly one value. Next, if the variable of the conclusion does not occur in the premises, then only (ii) gives it a value. If it occurs in the premises, it must be either negated in the premises and unnegated in the conclusion, or unnegated in the premises and negated in the conclusion. Otherwise the conclusion is among the premises and the implication is self-evident. Hence (ii) and (i) assign to it the same value.

QED

In order to check the validity of an elementary implication, we therefore check if it is self-evident. If it is not, then (i) and (ii) in (II) tell us what the counterexample is.

Of the three elementary implications given above, the first two are self-evident. The counterexample to the third is:

A  B  C  D  E
F  T  F  T  T

We can now assemble all the pieces and sum up the method.

4.3.3 The Fool-Proof Method

To check any given implication

Γ |= A,

we take it as the initial goal and proceed top-down, by applying the premise and conclusion laws for binary connectives and for negated compounds. As long as there is a goal containing a binary compound, or a negation of one, or a double negation, we can continue.

Such a process cannot go on indefinitely, because the goals become smaller. Intuitively this is clear. The mathematical proof of this will not be given here. (The proof is not trivial because, as the goals become smaller, their number can increase; for precise inductive arguments see 6.2.4 page 232, and 6.2.5 page 239.)

When the process terminates, we get a top-down derivation tree in which all the goals in the leaves are elementary. If all are self-evident we get a top-down derivation of the initial implication, which can be turned upside down into a bottom-up proof. Otherwise, there are terminal goals that are not self-evident. Each of these yields a unique counterexample to the initial goal. All the counterexamples to the initial goal are obtained in this way.
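The whole method fits in a short program. The sketch below is our own illustration (the tuple encoding of expressions and the function names are not from the book): it applies the reduction laws of 4.3.2 recursively, and at each elementary leaf uses conditions (i) and (ii) to extract the unique counterexample, if any. The implication is valid iff the returned set is empty; the counterexamples are partial assignments, and variables not mentioned in them may be set arbitrarily.

```python
def is_literal(e):
    """A sentential variable or the negation of one."""
    return e[0] == 'var' or (e[0] == 'not' and e[1][0] == 'var')

def leaf_cex(premises, conclusion):
    """Conditions (i), (ii): the unique counterexample of an elementary
    goal, or the empty set if the goal is self-evident."""
    assign = {}
    def put(v, val):
        return assign.setdefault(v, val) == val        # False on a clash
    for p in premises:
        ok = put(p[1], True) if p[0] == 'var' else put(p[1][1], False)
        if not ok:
            return set()                               # self-evident leaf
    ok = (put(conclusion[1], False) if conclusion[0] == 'var'
          else put(conclusion[1][1], True))
    return {frozenset(assign.items())} if ok else set()

def cexs(premises, conclusion):
    """All counterexamples (as partial assignments) to premises |= conclusion,
    obtained by applying the reduction laws until the goals are elementary."""
    for i, p in enumerate(premises):
        if is_literal(p):
            continue
        rest = premises[:i] + premises[i + 1:]
        if p[0] == 'and':                                        # premise-conjunction
            return cexs(rest + [p[1], p[2]], conclusion)
        if p[0] == 'or':                                         # premise-disjunction
            return cexs(rest + [p[1]], conclusion) | cexs(rest + [p[2]], conclusion)
        if p[0] == 'imp':                                        # premise-conditional
            return (cexs(rest + [('not', p[1])], conclusion) |
                    cexs(rest + [p[2]], conclusion))
        if p[0] == 'iff':                                        # premise-biconditional
            return (cexs(rest + [p[1], p[2]], conclusion) |
                    cexs(rest + [('not', p[1]), ('not', p[2])], conclusion))
        q = p[1]                                                 # negated compound
        if q[0] == 'not':
            return cexs(rest + [q[1]], conclusion)
        if q[0] == 'and':
            return (cexs(rest + [('not', q[1])], conclusion) |
                    cexs(rest + [('not', q[2])], conclusion))
        if q[0] == 'or':
            return cexs(rest + [('not', q[1]), ('not', q[2])], conclusion)
        if q[0] == 'imp':
            return cexs(rest + [q[1], ('not', q[2])], conclusion)
        return (cexs(rest + [q[1], ('not', q[2])], conclusion) |     # negated iff
                cexs(rest + [('not', q[1]), q[2]], conclusion))
    c = conclusion                                               # conclusion laws
    if c[0] == 'and':
        return cexs(premises, c[1]) | cexs(premises, c[2])
    if c[0] == 'or':
        return cexs(premises + [('not', c[1])], c[2])
    if c[0] == 'imp':
        return cexs(premises + [c[1]], c[2])
    if c[0] == 'iff':
        return cexs(premises + [c[1]], c[2]) | cexs(premises + [c[2]], c[1])
    if c[0] == 'not' and c[1][0] != 'var':
        q = c[1]
        if q[0] == 'not':
            return cexs(premises, q[1])
        if q[0] == 'and':
            return cexs(premises + [q[1]], ('not', q[2]))
        if q[0] == 'or':
            return cexs(premises, ('not', q[1])) | cexs(premises, ('not', q[2]))
        if q[0] == 'imp':
            return cexs(premises, q[1]) | cexs(premises, ('not', q[2]))
        return (cexs(premises + [q[1]], ('not', q[2])) |             # negated iff
                cexs(premises + [('not', q[2])], q[1]))
    return leaf_cex(premises, conclusion)                        # elementary goal

# One of this section's worked examples:  A -> (B v C), B -> (A and C) |= not-C -> A
A, B, C = ('var', 'A'), ('var', 'B'), ('var', 'C')
result = cexs([('imp', A, ('or', B, C)), ('imp', B, ('and', A, C))],
              ('imp', ('not', C), A))
print(result)   # the single counterexample: A, B, C all get F
```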

Note: Our listed laws are sufficient for deriving all valid implications. For if an implication is not derivable, the resulting tree gives us a counterexample. This shows that non-derivable implications are not valid.


Consequently, there is no need to use any other laws or to rely on substitution of equivalent components. In practice, however, you can legitimately apply other established laws, such as disjoining ((12) of 4.2.1), and you may use substitutions of equivalents (where the equivalences have been proven already), in order to shorten the proof.

Note: In actual applications, you need not go all the way. The process can stop at the stage where all the goals are self-evident, even if they are not elementary. In the first example of 4.2.2 the final goals are elementary; in the second and the third they are not. Also, once you get an elementary implication that is not self-evident, you have your counterexample and you can stop. But if you want to get all counterexamples to the initial goal, you should get all non-valid elementary implications of the tree.

Here are two examples. In the first we get a proof, in the second a counterexample. The law that applies at each step is not indicated, but you can figure it out. The sentence to which a law is applied is underlined.

1. A → B, C → (A ∨ B) |= C → B

2. A → B, C → (A ∨ B), C |= B

3.1 ¬A, C → (A ∨ B), C |= B

3.2 B, C → (A ∨ B), C |= B  √

4.11 ¬A, ¬C, C |= B  √

4.12 ¬A, A ∨ B, C |= B

5.121 ¬A, A, C |= B  √

5.122 ¬A, B, C |= B  √

Note that had we employed law (12) of 4.2.1, we could have replaced

C → (A ∨ B), C by C, A ∨ B,

which would have eliminated 4.11, making 4.12 the sole child of 3.1.

1. A→ (B ∨ C), B → (A ∧ C) |= ¬C → A

2. A→ (B ∨ C), B → (A ∧ C), ¬C |= A

3.1 ¬A, B → (A ∧ C), ¬C |= A

3.2 B ∨ C, B → (A ∧ C), ¬C |= A

4.11 ¬A, ¬B, ¬C |= A ×


4.12 ¬A, A ∧ C, ¬C |= A

Here the derivation stopped after yielding the non-valid elementary 4.11. The corresponding counterexample is:

A  B  C
F  F  F

You can easily see that this is also a counterexample to 1. If you continue to reduce the remaining goals, 3.2 and 4.12, you will see that both of them are valid, hence this is the only counterexample to our initial goal.

Homework 4.8 Write the last two top-down derivations in tree form, using line numbers to label the nodes. Write to the side of each node the law by which it has been derived from its parent.

Some Noteworthy Properties of the Method

• Given any occurrence of a sentential expression in a goal, there is at most one law that can be applied to it and the result of the application is unique.

• One cannot go wrong by applying the reduction laws in any order, until all goals are elementary.

(But the choice, at each stage, of where to apply a reduction law can have considerable effect on the derivation’s length. If, in the first of the last two examples, we had started by treating the leftmost premise, A → B, and had followed this by treating, on each of the two branches, C → (A ∨ B), we would have had four branches right at the beginning and the number of lines would have been 13, instead of 8.)

• The laws that deal with a binary connective do not introduce any other connective except, possibly, negation. Therefore, no other connectives, besides negation and those appearing in the initial goal, appear in the derivation.

Consequently, given any set of connectives that contains negation, the laws for the connectives of the set are sufficient for deriving all tautological implications of the subsystem that is based on these connectives. For example, if we restrict our system to sentences whose connectives are ¬ and ↔ only, the double negation laws, and the premise and conclusion laws for biconditionals and negated biconditionals, are sufficient.

Note: The validity of a given implication can also be settled through truth-tables. We make a table for all the occurring sentential expressions; then we check, row after row, whether all the premises get T and the conclusion gets F. If we find such a row, we get a counterexample. If not, the implication is valid. But the execution of this “brute force” checking is tedious, prone to mistakes and, often, more time-consuming.


A most important feature of the method is that it generalizes to richer systems where truth-tables are not available. As we shall mention later (in 9.3.3) there is no method for first-order logic that is guaranteed to produce, in a finite number of steps, either a proof or a counterexample. But there is one that is guaranteed to produce a proof, if the implication is a logical implication. One of the proofs of this result is obtained by extending the present method and by using the same type of arguments that show its adequacy for the sentential case.

Homework 4.9 Give, for each of the following implication claims, a top-down derivation or a counterexample. To cut short the construction, you can use at your convenience additional laws, besides the basic ones, as well as simple substitutions of equivalents.

1. |= [A→ (B → C)]→ [(A→ B)→ (A→ C)]

2. |= A ∨ (¬A∧C)→ (¬A→ C)

3. A→ B, B → C, C → A ∨B |= C ↔ (A ∨B)

4. A ∧B, B → (C ∨ ¬D), D→ ¬C |= C

5. A ∨ ¬B, (B → C)→ D |= A ∨D

6. A↔ B, B ∨ C, A→ ¬C |= A ∨ (C ∧ ¬B)

7. |= ((A→ B)→ C)→ (B → C)

8. A→ B∧C, (B ∨ C)→ D |= (D→ A)→ (B ↔ C)

9. A→ (B ∨ C), ¬B ∨ ¬C |= ¬A

10. B∧C → A, (A ∨B)→ C |= B → A

11. A ∨ (B∧C), B ∨ (A∧C) |= (A ∨B) ∧ C

4.4 Proofs by Contradiction

4.4.0

The following is easily established:

(15) Γ |= A iff Γ, ¬A is logically inconsistent.


The left-hand side holds just when it is impossible that all the sentences in Γ be true and A be false; this is the same as saying that it is impossible that all sentences in Γ, ¬A be true. Furthermore, we have:

(16) If C is any contradiction, then

Γ is logically inconsistent iff Γ |= C .

Again, this is obvious: A logically inconsistent premise-list implies all sentences, in particular, all contradictions. Vice versa, if Γ implies a contradiction, then it is impossible that all premises in Γ be true, for then the contradiction would have to be true as well.

From (15) and (16) we get:

(17) If C is any contradiction, then:

Γ |= A ⇐⇒ Γ, ¬A |= C .

(17) gives us a way of proving that Γ |= A: add ¬A to Γ and show that the resulting list implies a contradiction. Such a proof is called proof by contradiction.

We can choose any contradiction as C. The most common one is a sentence of the form B ∧ ¬B. But instead of using particular contradictions, it is convenient to introduce a special contradiction symbol that denotes a sentence which, by definition, gets only the value F. The symbol to be used is:

⊥

You can think of ‘⊥’ as denoting, ambiguously, any contradiction. But we shall employ it in a restricted way: it cannot occur among the premises but only as the right-hand side of ‘|=’:

Γ |= ⊥ .

This is simply a way of saying that Γ is logically (or, in the special case of sentential logic, tautologically) inconsistent. ⊥ can be replaced, if one wishes, by any particular contradiction. With this notation (17) becomes:

(18) Γ |= A ⇐⇒ Γ, ¬A |= ⊥

It is easily seen that the two sides of (18) are counterexample equivalent (i.e., have the same counterexamples).

All our previous premise laws apply to implications of the form ‘Γ |= ⊥’ (because ⊥ can be replaced by any contradictory sentence). E.g.,

Γ, A ∨B |= ⊥ ⇐⇒ Γ, A |= ⊥ and Γ, B |= ⊥


Our previous notions of a self-evident implication and of an elementary implication carry over, in an obvious way, to implications of the form

Γ |= ⊥ .

The implication is self-evident just when Γ contains a sentence and its negation. It is elementary if all the premise expressions are unnegated or negated sentential variables. There is now only one kind of self-evident implication, because cases in which the conclusion is among the premises are excluded by the restriction on ‘⊥’. The claim that an elementary implication is valid iff it is self-evident has now a simpler proof:

Assume that the elementary implication is not self-evident. Assign T to every sentential variable appearing unnegated in the premise-list, assign F to those that appear negated. Since no variable appears both unnegated and negated, each is assigned a single value. This assignment makes all premises true, thereby constituting a counterexample.
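This simpler proof translates into a few lines of code (our sketch; the pair encoding of literals is an assumption of ours, not the book's notation):

```python
def elementary_bottom(premise_literals):
    """premise_literals: list of (variable, negated?) pairs making up an
    elementary goal with conclusion falsum.  Returns None when the goal is
    self-evident (some variable occurs both negated and unnegated, so the
    goal is valid); otherwise the unique counterexample described above."""
    assign = {}
    for var, negated in premise_literals:
        val = not negated                  # unnegated -> T, negated -> F
        if assign.setdefault(var, val) != val:
            return None                    # A and not-A both occur
    return assign

print(elementary_bottom([('A', False), ('B', True), ('A', True)]))  # -> None
print(elementary_bottom([('A', False), ('B', True)]))  # -> {'A': True, 'B': False}
```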

4.4.1 The Fool-Proof Method for Proofs by Contradiction

Our previous top-down method can be adapted to proofs by contradiction. Given an initial goal:

Γ |= A

we start with replacing it by the equivalent goal:

Γ, ¬A |= ⊥

Then we proceed to reduce this goal to simpler goals by applying our premise laws to binary compounds, to their negations, and to double negations. (If the initial goal is Γ |= ⊥ we start the reductions right away.) All the resulting goals have ‘⊥’ on the right-hand side. We can continue until all the goals are elementary. If all are self-evident we get a proof of the initial goal. Otherwise, every elementary implication that is not self-evident yields a counterexample. Here are two illustrations. In the first the method yields a proof:

1. ¬(¬A ∨B), C → B |= ¬(A→ C)

2. ¬(¬A ∨ B), C → B, ¬¬(A → C) |= ⊥

3. ¬(¬A ∨ B), C → B, A → C |= ⊥

4. ¬¬A, ¬B, C → B, A → C |= ⊥

5. A, ¬B, C → B, A → C |= ⊥

6.1 A, ¬B, ¬C, A → C |= ⊥


6.2 A, ¬B, B, A→ C |= ⊥ √

6.11 A, ¬B, ¬C, ¬A |= ⊥ √

6.12 A, ¬B, ¬C, C |= ⊥ √

Note that we could have shortened the derivation had we used (12) (which is not among our basic rules). An application of (12) to 5 yields the equivalent goal:

A, ¬B, C → B, C |= ⊥

which by another application of (12) becomes self-evident:

A, ¬B, B, C |= ⊥

In the second example, the method yields a counterexample:

1. A→ B, ¬(B → C) |= A

2. A → B, ¬(B → C), ¬A |= ⊥

3. A → B, B, ¬C, ¬A |= ⊥

4.1 ¬A, B, ¬C, ¬A |= ⊥ ×

4.2 B, B, ¬C, ¬A |= ⊥ ×

(The repeated occurrences of premises, in the last two goals, could have been deleted.) Note that 4.1 and 4.2 are conjointly equivalent to the initial goal. Each has a counterexample. But their counterexamples are the same, namely:

A  B  C
F  T  F

Therefore, this is the only counterexample to the original implication. You can check that it is indeed a counterexample to the initial goal, by constructing a truth-table (for the three sentential expressions) noting that the row corresponding to that assignment is one in which all the premises get T and the conclusion gets F. You can moreover check that this is the only row with that property.

The proof-by-contradiction variant uses fewer basic laws than our previous method. All the reduction laws are premise laws. On the other hand, it may occasionally require more steps. The basic laws for top-down proofs by contradiction are given on pages 141, 142. Except for the law for trivial rearrangements of the premises, no other laws are needed.


Homework 4.10 Using the proof-by-contradiction method, check which of the following implications are valid. Give in each case either a top-down derivation or a counterexample. You can use at your convenience additional premise laws, such as (12), or simple substitutions by equivalents.

1. A→ B, B → C |= ¬C → ¬A

2. A→ A∧B, B → C |= A→ C

3. (A ∨B)→ C, C |= A ∨B

4. A ∧ (B → C), B |= A ∧ C

5. |= (A→ B) ∧ (A→ C)→ (A ∨B → C)

6. A ∨B, B → C, C → ¬A |= ⊥

7. A↔ B |= (A ∨B)↔ A

8. A→ (B ∨ C), ¬(B → A), ¬(C → A) |= ⊥

9. (A∧B) ∨ C |= (A∧B)∧¬C ∨ C

The Laws for Proofs by Contradiction

First we have the law that fixes the self-evident implications.

Self-Evident Implication

Γ, A, ¬A |= ⊥

Then, we have the reduction laws. The first is the law that introduces ⊥ as the conclusion. The others are premise laws for binary connectives and negated compounds. In the following list the laws for A ◦ B and for ¬(A ◦ B) are grouped together.


Contradictory-Conclusion Law

Γ |= A ⇐⇒ Γ, ¬A |= ⊥

Law for Negated Negations

(¬¬, |=) Γ, ¬¬A |= ⊥ ⇐⇒ Γ, A |= ⊥

Laws for Conjunctions and Negated Conjunctions

(∧, |=) Γ, A ∧B |= ⊥ ⇐⇒ Γ, A,B |= ⊥

(¬∧, |=) Γ, ¬(A ∧B) |= ⊥ ⇐⇒ Γ, ¬A |= ⊥ and Γ, ¬B |= ⊥

Laws for Disjunctions and Negated Disjunctions

(∨, |=) Γ, A ∨B |= ⊥ ⇐⇒ Γ, A |= ⊥ and Γ, B |= ⊥

(¬∨, |=) Γ, ¬(A ∨B) |= ⊥ ⇐⇒ Γ, ¬A, ¬B |= ⊥

Laws for Conditionals and Negated Conditionals

(→, |=) Γ, A→ B |= ⊥ ⇐⇒ Γ,¬A |= ⊥ and Γ, B |= ⊥

(¬→, |=) Γ, ¬(A→ B) |= ⊥ ⇐⇒ Γ, A, ¬B |= ⊥

Laws for Biconditionals and Negated Biconditionals:

(↔, |=) Γ, A↔ B |= ⊥ ⇐⇒ Γ, A,B |= ⊥ and Γ,¬A,¬B |= ⊥

(¬↔, |=) Γ, ¬(A↔ B) |= ⊥ ⇐⇒ Γ, A, ¬B |= ⊥ and Γ, ¬A, B |= ⊥


4.5 Implications of Sentential Logic in Natural Language

4.5.0

In order to establish logical implications between English sentences, we recast them as sentences of symbolic logic. We can then check whether logical implication holds for the recast sentences. Consider the premises:

(1) Jill will not marry Jack, unless he leaves New York,

(2) If Jack leaves New York, he must give up his current job,

and the inferred conclusion:

(3) Either Jack will give up his current job, or he won’t marry Jill.

Let A, B and C be, respectively, the formal counterparts of:

‘Jill will marry Jack’, ‘Jack will leave New York’, ‘Jack will give up his current job’.

Then the sentences are translated as:

(1∗) ¬B → ¬A

(2∗) B → C

(3∗) C ∨ ¬A

And indeed:

¬B → ¬A, B → C |= C ∨ ¬A

Had the implication not been valid, we would have had a counterexample, using which we could have pointed out a possible scenario in which (1) and (2) are true and (3) is false. For example, had we replaced (1) by:

(1′) Jack will not marry Jill if he leaves New York,

the formal implication would have been:

B → ¬A, B → C |= C ∨ ¬A

And here we get a counterexample:

Page 161: A Course in Symbolic Logic

144 CHAPTER 4. LOGICAL IMPLICATIONS AND PROOFS

A  B  C
T  F  F

That is, Jill marries Jack; he does not leave New York, and he does not give up his job.
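The two formal implications of this example can be checked mechanically as well. In the sketch below (our own; the lambda encoding is just a convenient stand-in for truth-tables), the original premises yield no counterexample, while the variant with (1′) yields exactly the assignment in the table above:

```python
from itertools import product

def counterexamples(premises, conclusion):
    """Assignments to (A, B, C) making every premise T and the conclusion F."""
    return [(a, b, c) for a, b, c in product([True, False], repeat=3)
            if all(p(a, b, c) for p in premises) and not conclusion(a, b, c)]

concl = lambda a, b, c: c or not a                       # (3*)  C v not-A
p2    = lambda a, b, c: (not b) or c                     # (2*)  B -> C

# (1*), (2*) |= (3*):  not-B -> not-A, B -> C |= C v not-A   -- valid
print(counterexamples([lambda a, b, c: b or not a, p2], concl))   # -> []

# with (1') instead:  B -> not-A, B -> C |= C v not-A        -- not valid
print(counterexamples([lambda a, b, c: (not b) or not a, p2], concl))
# -> [(True, False, False)]  i.e. A true, B false, C false
```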

Noteworthy Points: (I) We have construed (2) as a conditional and we have read ‘he must give up his job’ as ‘he will give up his job’. Whatever the connotations of ‘must’ in this context, they are ignored as irrelevant for the formalization.

(II) We used the same A to represent, in (1∗) and in (3∗), both the sentence

(4.i) ‘Jill will marry Jack’,

as well as

(4.ii) ‘Jack will marry Jill’.

Here we relied on the special meaning of ‘marry’; with most other verbs, e.g., ‘understand’, ‘like’, ‘amuse’, etc., the move would have been illegitimate (‘Jack likes Jill’ is not equivalent in meaning to ‘Jill likes Jack’). In a more scrupulous formalization we would have represented (4.ii) by a different sentence, say A′. But then we should have included

A ↔ A′

among the formalized premises. This additional premise reflects the equivalence of (4.i) and (4.ii), which is implicit in the argument. It has a different status than the other premises; for it is not explicitly stated, but is something that derives solely from the meaning of ‘marry’. Which brings us to our next subject.

4.5.1 Meaning Postulates and Background Assumptions

There are numerous connections between English verbs, adjectives, common names, and adverbs, which are based on their meaning and which English speakers are expected to know. They are taken for granted whenever we speak, argue, or draw conclusions. Our last example (Jack marries Jill if and only if Jill marries Jack) is a case among many. In that case we could avoid additional formal premises by using a coarse-grained formalization, in which the same A represents (4.i) and (4.ii). But this is not always desirable, and in most cases it is not possible. Consider,

(5) Carol can be on the task force, if, and only if, Carol is unmarried,

from which we want to conclude

(6) If Carol is a bachelor, he can be on the task force.


Let A, B, C represent, respectively,

‘Carol is unmarried’, ‘Carol is a bachelor’, ‘Carol can be on the task force’.

Then (5) and (6) become respectively:

C ↔ A, and B → C .

To infer the second from the first, we must add the premise: B → A, representing:

(7) If Carol is a bachelor, then Carol is unmarried.

(We cannot let the same formal sentence represent both ‘Carol is unmarried’ and ‘Carol is a bachelor’; the two are not equivalent, since Carol can be an unmarried woman.)
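With the premise B → A added, the inference from C ↔ A to B → C can again be verified by exhausting the eight truth-value assignments; a sketch in Python (illustrative names only):

```python
from itertools import product

def imp(p, q):
    """Material conditional p -> q."""
    return (not p) or q

# Premises: C <-> A and the meaning postulate B -> A; conclusion: B -> C.
valid = all(
    imp(b, c)
    for a, b, c in product([True, False], repeat=3)
    if (c == a) and imp(b, a)
)
print(valid)  # True: the conclusion holds in every state satisfying the premises
```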

The term meaning postulate was introduced by Carnap to describe formalized sentences that are not logical truths, but are true by virtue of the meaning of their terms. They are supposed to determine, axiomatically, the meaning of the undefined symbols.

Usually it takes first-order logic to express meaning postulates. E.g., the formal counterpart of (7) is a logical consequence of the meaning postulate:

(8) All bachelors are unmarried .

As we shall see, it can be written as:

(8∗) ∀x[Bachelor(x)→ Unmarried(x)]

Carnap held that meaning postulates are unrevisable laws of language, without empirical content, a view that has by now been abandoned by most philosophers. Nonetheless, even if the distinction is not–as Carnap held–absolute, it is a good methodological policy to distinguish sentences like (8) from sentences like ‘Carol is unmarried’, which convey non-linguistic factual information.

We shall henceforth use meaning postulates, without however committing ourselves to the original significance associated with the term. Thus, in formalizing the inference from (5) to (6), we add B → A as a premise representing a meaning postulate (or a consequence of one).

Background Assumptions

Almost every piece of reasoning involves background assumptions that are not spelled out explicitly. Given the premises:


(9) Arthur’s mother won’t be content, unless he lives in Boston,

(10) Arthur’s wife will be content only if he lives in New York,

we would naturally conclude:

(11) Either Arthur’s mother or his wife won’t be content.

Let us formalize:

A1: Arthur will live in Boston,

A2: Arthur will live in New York,

B1: Arthur’s mother will be content.

B2: Arthur’s wife will be content.

The required implication is:

(12) ¬A1 → ¬B1, B2 → A2 |= ¬B1 ∨ ¬B2 .

But it is easy to see that (12) is not a valid implication: if both A1 and A2 get T, the premises are true and the conclusion is false. It turns out that in deriving (11) we have been assuming that Arthur will not live in New York and in Boston at the same time. (“At the same time”–because in (9), (10) and (11), the future tense is, obviously, intended to indicate the same time.)

This background assumption becomes, upon formalization:

¬(A1 ∧A2)

Having added it, we get the desired logical implication:

(12∗) ¬A1 → ¬B1, B2 → A2, ¬(A1 ∧A2) |= ¬B1 ∨ ¬B2
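The role of the background assumption can be checked mechanically as well: without ¬(A1 ∧ A2) the implication fails, with it the implication holds. A sketch in Python (the function `holds` and its flag are our own illustrative devices):

```python
from itertools import product

def imp(p, q):
    """Material conditional p -> q."""
    return (not p) or q

def holds(with_background):
    """Do the premises force the conclusion in every truth-value assignment?"""
    for a1, a2, b1, b2 in product([True, False], repeat=4):
        premises = imp(not a1, not b1) and imp(b2, a2)
        if with_background:
            premises = premises and not (a1 and a2)
        if premises and not (not b1 or not b2):
            return False
    return True

print(holds(False))  # False: (12) is not valid
print(holds(True))   # True:  (12*) is valid
```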

A background assumption is by no means a necessary truth. It is, for example, conceivable that Arthur will live in Boston and in New York “at the same time”: say, he maintains two households and commutes daily. But given the inconvenience and the expense, such an arrangement is very unlikely; implicitly, we have ruled it out.


Or consider the inference from:

If Jack leaves New York, he will have to resign his current position,

AND

Jack decided to leave New York,

TO:

Jack will resign his current position.

Here there is an implicit assumption that Jack will carry out his decision. The assumption may be objectionable in contexts in which decisions are not always implemented.

Implicit background assumptions are thus statements of fact that are assumed to be known, or which can reasonably be taken for granted. Their certainty can vary considerably, from that of a well-established law to that of a mere plausibility. Even an obvious commonplace, e.g., that one cannot be in different places exactly at the same time, can be classified as a background assumption.

There is a philosophical tradition, initiated by Kant, according to which certain truths, such as the impossibility of being in two places at the same time, derive from basic (non-linguistic) epistemic principles and are immune to revision. The truth just mentioned derives, presumably, from the very meaning of physical body and space. But today the force of that tradition has been considerably weakened. Many cast doubt on the unrevisability of such a priori conceptual truths. Let us therefore classify under “meaning postulates” cases that are more of a lexicographic nature, such as (8), rather than those that follow from foundational epistemic considerations. The latter will be classified as background assumptions, albeit ones we can hardly conceive of giving up.

‘Background assumptions’ thus covers an extremely wide spectrum, from the most entrenched general laws, to probable suppositions, to particular facts implied by context. In a finer analysis we should distinguish between them. For the sake of simplicity we ignore these distinctions.

‘Meaning postulates’ is reserved for cases like (8), which derive from the conventions of language. The boundary separating meaning postulates from background assumptions is, to be sure, blurred. This is true of many useful distinctions. The difference between (8) and some factual assumption (e.g., that Jack will carry out his decision) is sufficiently clear to warrant their classification under different headings.

4.5.2 Implicature

In linguistic exchange we often infer more than what is explicitly stated. On being told that


(13) Jack and Jill met and, so far, they have not quarreled,

one will naturally infer that a quarrel between Jack and Jill was likely. The sentence, however, does not say it. (13) is true just in case (i) Jack and Jill met and (ii) Jack and Jill have not quarreled, then and later.

Grice, who pointed out and investigated these phenomena, proposed the term implicature, based on the verb implicate, for inferences of this kind. Thus, we can say that (13) implicates that there was some reason for expecting a quarrel between Jack and Jill. The sentence does not, strictly speaking, imply it. If we add to (13) the negation of the implicated sentence, we might get something odd, but not a contradiction:

(13′) Jack and Jill met and so far they have not quarreled; there was no reason to expect a quarrel.

On Grice’s analysis, the implicature from (13) derives from certain pragmatic rules that govern conversations. The rules that bear on (13), and on other cases that we shall consider, have to do with the relevance, the informativeness and the economy of the speaker’s utterances.

The relevance requirement is that the statements made by the speaker be relevant to the topic under discussion. The informativeness requirement is that the speaker supply the right amount of information (known to her) which is required in that exchange. And economy means that she is required to avoid unnecessary length. In our example the rules produce the implicature in the following way.

If there was no reason why Jack and Jill should quarrel, then to say that they have not quarreled is to supply a piece of useless information. Since we expect the speaker to go by the rules and to supply information that has some significance, we infer from (13) (assuming the speaker to be knowledgeable and sincere) that there is some reason for expecting a quarrel.

The rules of conversation require also that a speaker should not assert the conditional

If ..., then - - -,

if he knows that ‘...’ is false, or if he knows that ‘- - -’ is true. Because one can be more informative and more brief by asserting, in the first case, the negation of the antecedent; in the second case, the consequent. This point was already discussed in 3.1.4; the oddity of (28.i) and (28.ii) of that section is partly explained by noting this implicature.

Also the inference of a causal connection, which sometimes goes with the use of ‘and’, can perhaps be traced to conversational implicature. Being told:

(14) Jill recommended the play, and Jack went to see it,


we infer that Jill’s recommendation was the cause of Jack’s going. Else there would be no point in mentioning the two together. Actually, this is not so much the requirement of relevance, as the requirement that there be a sufficiently focused topic of discussion. The same requirement of sufficient focus can be seen to underlie the assumption of temporal proximity: unless stated otherwise, we interpret conjunctively combined clauses, in past or future tense, as referring to roughly the same time.

As you can see, implicatures make it possible to mislead without making assertions that are formally false. Many resort to this device. Politicians, advertisers and lawyers excel in it.

Implicature versus Ambiguity

In (13) the addition of the negated implicature does not yield a contradiction. We may take this as corroborative evidence for its being an implicature, not an implication. This kind of test is, however, not conclusive; on many occasions it misleads. Consider,

(15) Jack and Jill were married last week.

Usually (15) is taken to imply that Jack and Jill married each other. If, however, we add the negation of that conclusion, we get:

(15′) Jack and Jill were married last week, but they did not marry each other,

which is not at all contradictory. Shall we then say that our first inference from (15) is by implicature only? No. Actually (15) is ambiguous. We have seen (cf. 3.1.2) that the use of ‘and’ to combine names can result in two possible interpretations: the distributive, in which the sentence can be expressed as a conjunction, and the collective, where the combination of names functions as a name of a single item. The dominant reading of (15) is the collective, implying that Jack and Jill married each other. The addition of ‘they did not marry each other’ makes this interpretation untenable (for it leads to a trivial contradiction). Hence we switch to the other reading of ‘Jack and Jill were married’. In the same vein, we may interpret

John jumped into the tank

as stating that John jumped into some armored vehicle. But with a suitable addition, e.g.,

John jumped into the tank and dived to the bottom,

we read it as stating that John jumped into a large container.

A nice illustration of conversational implicature is provided by comparing (15) and


(15″) Jack and Jill were married last week, on the same day.

(15), under its first reading (i.e., that they married each other), implies (15″). But then the addition of ‘on the same day’ would be altogether redundant. Assuming the speaker to go by the rule of economy, we reinterpret (15) and infer that they were not married to each other; else there would be no point to the additional information.

The Principle of Adjusting: We mentioned already the so-called charity principle (3.1.2, after example (16)). According to it, we interpret our interlocutor, in cases of ambiguity, so as to make him sound sensible. Our last examples illustrate this point. But the name ‘charity’ can mislead. For the principle derives from a wider principle, according to which we interpret our experience so as to make it cohere with the general scheme of expected regularities. This applies to linguistic as well as to non-linguistic phenomena, to interaction with people as well as to interaction with nature.

The subject is too broad to go into here. Suffice it to observe that in the case of language we expect utterances, linguistic texts and linguistic interaction to accord with certain rules, syntactic, semantic and pragmatic. We expect utterances to make a certain sense. And when the danger of nonsense looms, we use the available possibilities to adjust our reading so as to avoid it.

Homework 4.11 Use sentential formalization in order to analyze the logic of the following exchanges and to answer the questions. If there is no stated question, find whether the conclusion follows from the premises.

Formalize only inasmuch as this is necessary for the purpose of your analysis.

Discuss briefly any points relating to meaning postulates, background assumptions, ambiguity and implicature, which you find relevant. Assume that the speakers are reasonable.

(1) Jill: Jack’s mother won’t be content, unless he lives in Boston.

Jack: But his wife will be content only if they live in New York.

Jill: So either his wife or his mother won’t be content.

Take ‘they’ in Jack’s statement to refer to Jack and his wife.

(2) Arthur, David and Mary share an apartment.

Jack: Arthur and David are crazy about Mary, so if she is at home both of them are.

Jill: In any case, if one of the boys is at home, the other is too, for none trusts himself alone with the neighbor’s dog.

Jack: But they said that one of them will go over to Joe’s place to help him with his studies.


Jill: Which goes to show that Mary is not at home.

Does it? Does it make a difference for the implication if ‘they’ and ‘them’, in Jack’s last statement, refer to the two boys or to the two boys and to Mary as well?

(3) Jack: Do you think that Mary is still unmarried?

Jill: I don’t know, but if Mary is not unmarried, neither is Myra.

Jack: And if Myra is not married neither is Mary.

Jill: All this is rather confusing. Doesn’t it imply that Myra is married only if Mary is?

Does it?

(4) Jill: Both Arthur and Jeremiah said that they won’t be happy, unless they marry Frieda.

Jack: By now she should have married one of them.

Jill: But she wasn’t going to marry anyone without a secure job.

Jack: So, by now, one of them has a secure job and one of them is not happy.

(5) Jack: If one of Arthur and Jeremiah goes to the movie, either Olga or Amelia will go with him.

Jill: And the two girls won’t go there together unless accompanied by a boy.

Jack: Which goes to show that the two boys will go to the movie only if the two girls go there too.

(6) Jack consults a fortuneteller on whether he should become a musician or study for the law.

Jack: I won’t be happy unless I practice music.

Fortuneteller: But only by becoming a lawyer can you be rich enough to buy the things you like.

Jack: It seems that my happiness depends on my giving up things I like.

Does it?

(7) Jill: If you go to the movie so will I.

Jack: If what you have said is true, then I will go to the movie.

Jill: Why this roundabout way of putting things? You could have simply said that you will go to the movie.

Jack: Not at all. I only said that I will go to the movie if what you had said is true.

Who is right and why?

(8) Jack: If I enroll in the logic course I shall work very hard.


Jill: I don’t know that I believe you... Well, at least I believe that you won’t enroll in the logic course unless what you say is true.

Jack: But doesn’t this show that you don’t know your true beliefs?

Is Jack right?

(9) Jack: Arthur won’t move to a new apartment unless he accepts the new offer.

But this won’t be true if he marries Olga.

Jill: But if he marries Olga and moves to a new apartment, he will accept the offer. He won’t be able to do both on the salary he is getting now.

Jack: So, unless one of us is wrong, he won’t marry Olga.

(10) Jill: Unless you take a plane you won’t meet your father.

Jack: Taking a plane is rather costly.

Jill: But your father told me that if you meet him he’ll cover the expenses required for your trip.

Jack: So if I take a plane, in the end it won’t cost me.

(11) Jack, Jill and Arthur, who took a midterm test, discuss the possible outcomes.

Jack: Someone who had a look at the list told me that two students got an A.

Jill: I am expecting an A; I’ll be in a bad mood if I didn’t get it.

Jack: So will I.

Arthur: I don’t care. This test doesn’t matter so much.

Jill: So either Arthur didn’t get an A, or one of us will be in a bad mood.


Chapter 5

Mathematical Interlude

5.0

Every description of a language (or setup, or system) must be phrased in some language. The language we use in talking about the language we discuss is referred to as the metalanguage, and the language we discuss–as the object language.

When we describe French in English, the metalanguage is English and the object language is French. The metalanguage can be the same as the object language: we can describe English in English. The language used in this course for discussing formal systems is English, or rather, English supplemented with some technical vocabulary.

We have been relying in our descriptions, arguments and proofs on certain basic, intuitively grasped concepts; for example, the concept of a finite number, and that of a finite sequence. We may say that a sentential expression is a finite sequence of symbols, and that a sentential compound is a sentence obtained in a finite number of steps by applying connectives, and so forth.

Initially, the use of such notions poses no problems. But as the arguments and the constructions become more involved, there is an increasing need of a precise framework within which we can define certain abstract notions and carry out proofs. The framework can help us guard against error.¹ At the same time it should have resources for carrying out constructions and proofs, beyond the immediate grasp of our intuitions.

The need for a rigorous conceptual foundation was addressed by mathematicians and philosophers in the second half of the nineteenth century. The desired foundations were laid in the works of Dedekind, Frege and, in particular, Cantor, who created between the years 1874

¹There are well-known examples, in the history of thought, of arguments considered clear and self-evident, which have later turned out to be confused or fallacious.


and 1884 what is known as set theory. This theory, developed later by other mathematicians (Zermelo, von Neumann, Hausdorff, Fraenkel–to name a few), provides a rigorous apparatus, sufficiently powerful for carrying out the constructions and proofs in all formal reasoning.

All known formal systems can, in principle, be described within set theory; and all known valid reasoning about them can be derived in it. Nowadays, this theory provides the most basic kit of tools for any reasoning of a mathematical, or formal, nature. In its more advanced versions, set theory–itself a sophisticated branch of mathematics–is of interest to the specialists only. But its elementary core is employed whenever formal precision is required.

In the first section of this chapter we shall introduce some very elementary set-theoretical notions. Our aim is not to study set theory per se, but to provide a more rigorous treatment of formal languages and their semantics. We shall take for granted a variety of mathematical concepts, such as natural number, finite set, and finite sequence. In set theory these and all other mathematical concepts are defined in terms of a single primitive: the membership relation. But such reductions do not concern us here.

The second section is devoted to a certain technique that is widely employed in defining formal languages and in establishing their properties. This is the technique of inductive definitions and inductive proofs.

5.1 Basic Concepts of Set Theory

5.1.1 Sets, Membership and Extensionality

A set is a collection of any objects, considered as a single abstract object.

There is a set consisting of the earth, the sun, and the moon; another, consisting of the earth, the sun, the moon, and the planet Jupiter; and still another, consisting of the earth, the moon, the number 8, and Bill Clinton. Thus, we can put any objects whatsoever into the same set. Usually, we consider sets whose members are of the same kind: sets of people, sets of numbers, sets of sentences, etc. But this is not a restriction imposed by the concept of set; the theory allows us to form sets arbitrarily.

The objects that go into a set are said to be its members. And the basic relation on which set theory is founded is the membership relation; it holds between two objects just when the second object is a set and the first is a member of it.

The symbol for membership is ‘∈’. It is employed as follows:

x ∈ X


means that x is a member of the set X, and

x ∉ X

means that x is not a member of X.

Hence, if X is the set whose members are the earth, the moon, the number 8, and Bill Clinton, then:

Earth ∈ X, 8 ∈ X, Moon ∈ X, Clinton ∈ X, Nixon ∉ X, 6 ∉ X, Jupiter ∉ X, etc.

We also have: Clinton ∉ Nixon, because Nixon is not a set.

Terminology: The membership symbol is occasionally used inside English, e.g., ‘there is x ∈ X’ is read as: ‘there is a member, x, of X’. Similar self-explanatory phrases will be used throughout.

Sometimes (but not always!) ‘contains’, or ‘is contained’, means contains as a member, or is contained as a member. In these cases ‘X contains x’ means that x ∈ X. We also say that x belongs to X, or that x is an element of X.

We use ‘x, y ∈ X’ as a shorthand for ‘x ∈ X and y ∈ X’, and similarly for more than two members: ‘x, y, z ∈ X’.
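Python’s built-in set type mirrors the membership relation directly: ‘x ∈ X’ is written `x in X`, and ‘x ∉ X’ is `x not in X`. A small illustration, with strings and numbers standing in for the objects of the example (an assumption of the sketch, since a Python set cannot contain the moon):

```python
# The set X = {Earth, Moon, 8, Clinton}, with strings as stand-ins.
X = {"Earth", "Moon", 8, "Clinton"}

print("Earth" in X)   # True:  Earth ∈ X
print(8 in X)         # True:  8 ∈ X
print("Nixon" in X)   # False: Nixon ∉ X
print(6 in X)         # False: 6 ∉ X

# 'x, y ∈ X' abbreviates 'x ∈ X and y ∈ X':
print("Earth" in X and "Moon" in X)  # True
```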

Extensionality

A set is completely determined by its members. This means that sets that have the same members are the same set. Stated in full detail, the Extensionality Axiom says:

If X and Y are sets, then X = Y iff every member of X is a member of Y and every member of Y is a member of X.

Note that the “only if” direction is trivial: X = Y means that X and Y are identical, hence they must have the same members. (This is actually a truth of first-order logic.) The real content of the axiom consists in the “if” direction: having the same members is sufficient for being the same set.

To see the implications of extensionality, consider the following two concepts: that of a human being and that of a featherless two-footed animal. In an obvious sense, the concepts differ. But humans are featherless two-footed animals, and it so happens that there are no other such creatures besides humans. Hence, the set of humans is identical to the set of featherless two-footed animals. When forming sets, differences between concepts that cannot be cashed out in terms of members are ignored.


The extensionality axiom provides the standard way of proving that sets are equal. If X and Y are sets, then to prove:

X = Y

it suffices to show that, for every x,

x ∈ X iff x ∈ Y .

Ways of Denoting Sets

The simplest way of representing sets is by listing their members. The set is denoted by putting curly brackets, { }, around the list. The three examples given at the beginning of the section are denoted as:

{Earth, Sun, Moon}, {Earth, Sun, Moon, Jupiter}, {Earth, Moon, 8, Clinton}

The ordering of the list and repetitions in it do not matter:

{Earth, Moon, 8, Clinton} = {Clinton, Clinton, 8, Earth, Clinton, Earth, 8, Moon}

because every member of the left-hand side set is a member of the right-hand side set, and every member of the right-hand side is a member of the left-hand side.
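Python’s set equality is extensional in exactly this sense; order and repetition in the listing are discarded. For illustration (strings again stand in for the objects):

```python
left = {"Earth", "Moon", 8, "Clinton"}
right = {"Clinton", "Clinton", 8, "Earth", "Clinton", "Earth", 8, "Moon"}

print(left == right)  # True: same members, hence the same set
print(len(right))     # 4: repetitions in the listing collapse
```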

The method of listing the members is not practical when the list is too long, and not feasible if the set is infinite. Sometimes suggestive notations can be used for infinite sets, for example:

{0, 1, 2, . . .} or {0, 2, 4, . . .}

The first is the set of all natural numbers (i.e., non-negative integers), the second–of all even natural numbers. But this method, which is based on guessing the intended rule, is very limited.

The most natural–and, in principle, perhaps the only–way of representing a set is by means of a defining condition: one that determines what objects belong to it. In English, the definition has the form:

the set of all ...

where ‘...’ expresses the condition in question. Thus, we have:

The set of all positive integers divisible by 7 or 9, the set of all planets, the set of all stars, the set of all atoms, the set of all USA citizens born in August 1991, the set of all British kings who died before 1940, and so on.

Note that finite listing can be seen as a special case of this kind of definition:


{earth, moon, 8, Clinton} = the set of all objects that are either the earth, or the moon, or the number 8, or Clinton.

In mathematics the following is used:

{x : . . . x . . .}

It reads: the set of all x such that ...x... . Here, ‘...x...’ states the condition about x. Instead of ‘x’ any other letter can be used. We shall refer to it as the standard curly-bracket notation.

The examples given above can therefore be written as follows:

{x : x is a positive integer divisible by 7 or 9}, {x : x is a planet}, {v : v is a star}, {y : y is an atom}, {z : z is a USA citizen born in August 1991}, {x : x is a British king who died before 1940}, and so on.

This is not to say that every set can be denoted by an expression of the last given form, or–for that matter–by some other expression. In mathematics we allow for the possibility of sets not denoted by any expression in our language; just as there may be atoms that no description can pick out.

Variants of the Notation: Usually, set members are chosen from some fixed given domain (itself a set). If U is the domain in question, then the set of all members, x, of U that satisfy ...x... is, of course:

{x : x ∈ U and . . . x . . .}

An alternative notation is:

{x ∈ U : . . . x . . .}

which reads: ‘the set of all x in U such that ...x...’. Thus, if N is the set of all natural numbers, then:

{x ∈ N : x + 1 is divisible by 3} = {x : x ∈ N and x + 1 is divisible by 3}

Occasionally, the domain in question is to be understood from the context. It is also customary to employ variables that range over fixed domains. If in the last example it is understood that ‘x’ ranges over the natural numbers, then we can omit the reference to N and write simply

{x : x+ 1 is divisible by 3}

Other variants of the notation involve the use of functions. For example,

{2x : x ∈ N} and {x² : x ∈ N}

are, respectively, the set of all numbers of the form 2x and the set of all numbers of the form x², where x ranges over N (i.e., the set of all even natural numbers and the set of all squares).


We can use the standard notation for these sets; but this would result in longer expressions. For example:

{x² : x ∈ N} = {z : there is x ∈ N such that z = x²}
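Python’s set comprehensions imitate the curly-bracket notation rather closely. Since a Python set must be finite, we cut N off at a finite initial segment (an assumption of this sketch):

```python
N = range(20)  # a finite stand-in for the natural numbers

a = {x for x in N if (x + 1) % 3 == 0}  # {x ∈ N : x+1 is divisible by 3}
b = {2 * x for x in N}                  # {2x : x ∈ N}
c = {x ** 2 for x in N}                 # {x² : x ∈ N}

print(sorted(a))         # [2, 5, 8, 11, 14, 17]
print(36 in c, 37 in c)  # True False: 36 is a square, 37 is not
```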

Once you get used to them you will find these and other notations self-explanatory. The following exercises will help you to get accustomed to set-theoretic notations and phrasings.

Homework 5.1 Translate the following into the standard curly-bracket notation.

(1) The set of all people who like themselves.

(2) The set of all integers that are smaller than their squares.

(Recall, the square of x is x².)

(3) The set of all people married to 1992 Columbia students.

Rewrite the following in the curly-bracket functional notation. You can use ‘N’ and ‘Z’ to denote, respectively, the set of natural numbers and the set of integers. For (6) use ‘father(x)’ to denote the father of x.

(4) The set of all positive multiples of 4.

(5) The set of all successors of integers divisible by 5.

(6) The set of all fathers of identical twins.

Describe in English the following sets; use short, neat descriptions. (‘Livings’, ‘Humans’, and ‘Planets’ have the obvious meanings.)

(7) {x ∈ Livings : x has two legs}

(8) {x ∈ Humans : x has more than one child}

(9) {x ∈ Planets : x is larger than the earth}

Rewrite the following in the standard curly-bracket notation.

(10) {3x : x ∈ Primes}

(11) {x − y : x ∈ Primes, y ∈ Primes}

(12) {2x + y² : x ∈ N, y ∈ Primes}

Note: The concept of a set is primitive. It cannot be defined by reduction to more basic concepts. Explanations and examples (like the ones just given) may serve to get the concept


across, but they do not amount to definitions. In an indirect way, the concept is determined by what we take to be the basic properties of sets. The same takes place in Euclidean geometry, where the undefined concepts of point, line and plane are indirectly determined by the geometrical axioms. Like geometry, set theory is a system based on axioms. Some are “obvious”. Others, belonging to more sophisticated parts of the theory, require deep understanding. Except for extensionality, the axioms are not discussed here.

Singletons The set {x} has a single member, namely, x. Such a set is called a singleton, or a unit set; {x} is the singleton of x, or the unit set of x.

One may be tempted to identify the singleton of x with x itself. The temptation should be resisted. The singleton {Clinton} is a set containing Clinton as its sole member. Clinton himself is a man, not a set. Just so, one distinguishes between John the man and the one-member committee having John as its only member. If all the committee members except John perish in a crash, the committee becomes a one-member committee; but you do not want to say that it becomes a man. The standard version of set theory has an axiom, called the regularity axiom, which implies that nothing can be a member of itself. It therefore implies that, for all x, x ≠ {x} (because x ∈ {x}, but x ∉ x).

The singleton of x is {x}, the singleton of {x} is {{x}}, the singleton of {{x}} is {{{x}}}, and so on: {. . . {{x}} . . .}. It can be shown (assuming the regularity axiom) that all of these are different from each other.
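The nesting of singletons can be illustrated with Python’s frozenset, an immutable set type that (unlike the mutable set) can itself be a member of a set:

```python
x = frozenset()        # a stand-in object (here: the empty set)
s1 = frozenset([x])    # {x}
s2 = frozenset([s1])   # {{x}}
s3 = frozenset([s2])   # {{{x}}}

print(len({x, s1, s2, s3}))  # 4: all four are different from each other
print(x in s1, x in s2)      # True False: x ∈ {x}, but x ∉ {{x}}
```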

The Empty Set Among sets we include the so-called empty set: one that has no members. At first glance one may find this strange, as one might find strange, at first, the idea of the number zero. In fact, the concept is simple, highly useful and easily handled.

We speak about the empty set, because there is only one. This follows from extensionality: if X and X′ are sets that have no members, then X = X′, because they have the same members (for every x: x ∈ X iff x ∈ X′). The empty set is denoted as:

∅.

Note that every object that is not a set (e.g., every physical object) has no members. Extensionality does not make these objects equal to ∅, because extensionality applies only to sets.

5.1.2 Subsets, Intersections, and Unions

If X and Y are sets, then we say that X is a subset of Y if every member of X is a member of Y. We also say in that case that Y is a superset of X. The notation is:

X ⊆ Y, or, equivalently, Y ⊇ X .

Page 177: A Course in Symbolic Logic

160 CHAPTER 5. MATHEMATICAL INTERLUDE

Occasionally, we use the term inclusion: we say that X is included in Y, meaning that X is a subset of Y.

As is usual in mathematics, crossing out indicates negation:

X ⊄ Y

means that X is not a subset of Y.

Obviously, X ⊆ X, for every set X.

Proper Subsets: If X ⊆ Y and X ≠ Y, then X is said to be a proper subset of Y, or properly included in Y, and Y is said to be a proper superset of X.

If X ⊆ Y and Y ⊆ X, then X and Y have the same members and, by extensionality, are the same. Therefore

X = Y iff X ⊆ Y and Y ⊆ X.

It is convenient to “chain” inclusions thus: X ⊆ Y ⊆ Z; it means: X ⊆ Y and Y ⊆ Z. Set inclusion is transitive:

If X ⊆ Y ⊆ Z then X ⊆ Z.

(The proof is trivial: Assume the left-hand side. If x ∈ X then x ∈ Y, because X ⊆ Y; hence also x ∈ Z, because Y ⊆ Z; therefore every member of X is a member of Z.)

Every set, X, contains as members all members of the empty set (because the empty set has no members). Hence,

∅ ⊆ X, for every set X.

Note: The subset relation, ⊆, should be sharply distinguished from the membership relation, ∈. Every set is a subset of itself, but not a member of itself. On the other hand, a member of a set need not be a subset of it; the earth is a member of {Earth, Moon}, but it is not a subset of it, because the earth is not a set. Or consider the following:

∅ ⊆ {{∅}} (and the inclusion is proper), but ∅ ∉ {{∅}}; because the only member of {{∅}} is {∅}, and ∅ ≠ {∅}.

{∅} ∈ {{∅}} but {∅} ⊄ {{∅}}; because {∅} contains ∅ as a member, whereas {{∅}} does not contain ∅ as a member.
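These distinctions can be checked mechanically. The following sketch is in Python, which is not part of the course and is used here purely as an illustration; nested sets are modeled with frozenset, since Python's mutable set type cannot itself be a set member.

```python
# Illustrative sketch: membership (∈) vs. inclusion (⊆) in Python.
# Nested sets are modeled with frozenset. E stands for the empty set ∅.
E = frozenset()            # ∅
S = frozenset({E})         # {∅}
SS = frozenset({S})        # {{∅}}

# ∅ ⊆ {{∅}}: the empty set is a subset of every set.
assert E <= SS
# But ∅ ∉ {{∅}}: the only member of {{∅}} is {∅}, and ∅ ≠ {∅}.
assert E not in SS
# {∅} ∈ {{∅}}, yet {∅} ⊄ {{∅}}: {∅} has ∅ as a member, {{∅}} does not.
assert S in SS
assert not (S <= SS)
```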

5.1. BASIC CONCEPTS OF SET THEORY 161

Intersections

The intersection of two sets X and Y, denoted X ∩ Y, is the set whose members are all the objects that are members both of X and of Y:

For every x, x ∈ X ∩ Y iff x ∈ X and x ∈ Y .

or, equivalently:

X ∩ Y = {x : x ∈ X and x ∈ Y }

Examples: The intersection of the set of all natural numbers divisible by 2 and the set of all natural numbers divisible by 3 is the set of all natural numbers divisible both by 2 and by 3. (This is the set of natural numbers divisible by 6.)

The intersection of the set of all even natural numbers and the set of all prime numbers is the set of all numbers that are both even and prime; since the only number that is both even and prime is 2, this is the singleton {2}.

The intersection of the set of all USA citizens and the set of all redheaded people is the set of all redheaded USA citizens.

The intersection of the set of all women and the set of all pre-1992 USA presidents is the set of all women that have been, at some time before 1992, USA presidents. This happens to be the empty set.

Disjoint Sets: Two sets, X, Y, are said to be disjoint if they have no common members; i.e., if X ∩ Y = ∅.

Unions

The union of the sets X and Y, denoted X ∪ Y, is the set whose members are all objects that are either members of X or members of Y (or members of both). That is:

For every x, x ∈ X ∪ Y iff x ∈ X or x ∈ Y .

or, equivalently:

X ∪ Y = {x : x ∈ X or x ∈ Y }


Examples: The union of the set of all natural numbers that are divisible by 6 and the set of all natural numbers that are divisible by 4 is the set of all numbers divisible either by 6 or by 4 (or by both, e.g., 12).

The union of the set of all mammals and the set of all humans is the set of all creatures that are either mammals or humans; since every human is a mammal, this union is the set of all mammals.

The union of the set of all people that were, at some time up to t, senators, and the set of all people who were, at some time up to t, congressmen, is the set of people who were at one time or another, up to time t, members of at least one of the legislative houses.

The basic properties of intersections and unions are the following:

(X ∩ Y) ∩ Z = X ∩ (Y ∩ Z)        (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z)
X ∩ Y = Y ∩ X                    X ∪ Y = Y ∪ X
X ∩ X = X                        X ∪ X = X
X ∩ ∅ = ∅                        X ∪ ∅ = X

The equalities of the first row mean that the operations of intersection and union are associative, those of the second row mean that they are commutative, and those of the third row – that they are idempotent. These properties follow directly from the meanings of ‘and’ and ‘or’. They are so obvious that one would hardly consider proving them formally. Formal, but tedious, proofs can be given. When this is done, one sees that the associativity of intersection reflects the associativity of ‘and’ (i.e., the fact that (A ∧ B) ∧ C and A ∧ (B ∧ C) are logically equivalent) and the associativity of union reflects that of ‘or’.
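These identities can also be checked on concrete sets. The following sketch uses Python (not part of the course; purely illustrative), whose built-in sets support intersection and union as the operators & and |:

```python
# Illustrative check of the associative, commutative, idempotent and
# empty-set laws on sample sets (a sanity check, not a proof).
X, Y, Z = {1, 2, 3}, {2, 3, 4}, {3, 5}
EMPTY = set()

assert (X & Y) & Z == X & (Y & Z)              # associativity of ∩
assert (X | Y) | Z == X | (Y | Z)              # associativity of ∪
assert X & Y == Y & X                          # commutativity of ∩
assert X | Y == Y | X                          # commutativity of ∪
assert X & X == X and X | X == X               # idempotence
assert X & EMPTY == EMPTY and X | EMPTY == X   # laws for ∅
```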

Repeated Intersections and Repeated Unions: Intersections can be applied repeatedly to more than two sets, and the same holds for unions. Since these operations are associative, we can ignore grouping and use expressions such as:

X1 ∩ X2 ∩ . . . ∩ Xn        X1 ∪ X2 ∪ . . . ∪ Xn

And since the operations are commutative, the order of the sets can be changed without affecting the result.

It is easily seen that X1 ∩ X2 ∩ . . . ∩ Xn is the set of all objects that are members of all the sets X1, . . . , Xn. Similarly, X1 ∪ X2 ∪ . . . ∪ Xn is the set of all objects that are members of at least one of X1, . . . , Xn.

Distributive Laws: These two equalities hold in general:

X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)        X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)

The first is the distributive law of intersection over union, the second – of union over intersection. These laws are direct outcomes of the following two tautologies:


x ∈ X and (x ∈ Y or x ∈ Z) iff (x ∈ X and x ∈ Y ) or (x ∈ X and x ∈ Z).

x ∈ X or (x ∈ Y and x ∈ Z) iff (x ∈ X or x ∈ Y ) and (x ∈ X or x ∈ Z).

Obviously, each of X and Y includes (as a subset) their intersection X ∩ Y, and is included in their union X ∪ Y. This can be stated thus:

X ∩ Y ⊆ X,Y ⊆ X ∪ Y

As is easily seen, the subset relation can be characterized in terms either of unions or of intersections:

X ⊆ Y iff X ∩ Y = X        X ⊆ Y iff X ∪ Y = Y
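A quick illustration of these characterizations, again in Python (illustrative only; <= is Python's subset test):

```python
# X ⊆ Y iff X ∩ Y = X iff X ∪ Y = Y — checked on a sample pair.
X = {1, 2}
Y = {1, 2, 3}

assert X <= Y                 # X ⊆ Y
assert X & Y == X             # hence X ∩ Y = X
assert X | Y == Y             # and X ∪ Y = Y

# A counterexample: if X is not a subset of Y, both equalities fail.
W = {1, 4}
assert not (W <= Y) and W & Y != W and W | Y != Y
```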

We also have:

If X ⊆ X′ and Y ⊆ Y′ then X ∩ Y ⊆ X′ ∩ Y′ and X ∪ Y ⊆ X′ ∪ Y′.

Every set which is included both in X and in Y is included in their intersection. This follows easily from the definitions. (It is also derivable from the above-given properties: If Z ⊆ X and Z ⊆ Y, then Z = Z ∩ Z ⊆ X ∩ Y.)

Therefore, the intersection of two sets X and Y is

(i) included both in X and in Y , and

(ii) includes every set that is included in X and in Y .

We can express this by saying that X ∩ Y is the largest set that is included both in X and in Y.

Similarly, the union of X and Y can be characterized as the smallest set that includes both X and Y.

Homework

5.2 Let N = {0, 1, 2, . . . , n, . . .} and let x, y, z range over N. Let

X1 = {0, 1, 5, 7, 10, 13, 18, 19, 20}
X2 = {3, 4, 5, 17, 21, 8, 9, 6, 1}
X3 = {21, 31, 20, 40, 1, 0, 3, 20}


X4 = {2x : x > 3}
X5 = {x : x is divisible by 2 or by 3}
X6 = {x : x is prime}

Write down in the curly-bracket notation (using ‘∅’ for the empty set) the following sets:

1. X1 ∪X2

2. X1 ∩X3

3. X3 ∩X4

4. X3 ∪X4

5. X1 ∩X2 ∩X3

6. (X1 ∩X6) ∪ (X5 ∩X2)

7. (X5 ∩X6) ∪X1

8. (X6 ∪X5) ∩ (X1 ∪X3)

9. (X4 ∩X6) ∪X5

10. X4 ∩ (X6 ∪X5)

5.3 For any two sets, X, Y , define X − Y by:

X − Y = {x ∈ X : x ∉ Y }

With the Xi’s as in 5.2, write down in the curly-bracket notation (using ‘∅’ for the empty set) the following sets:

1. X1 −X2

2. X2 −X1

3. X6 −X5

4. X4 −X5

5. (X3 −X1) ∩ (X2 −X4)

6. (X1 −X3)−X2

7. X1 − (X3 −X2)


8. N −X4

9. X4 −N

10. X5 − (X6 ∪X4)

5.1.3 Sequences and Ordered Pairs

Sequential orderings underlie almost everything. Impressions, actions, events, come arranged in time. Quite early in our life we become acquainted with finite sequences. We learn, moreover, that different sequences can be made by arranging the same objects in different ways. We also learn that elements can be repeated; the same color, shape, or whatnot, can occur in different places. We learn to identify certain sequences of letters as words, and certain sequences of words – as sentences. Particular sequences of tones and rests make up tunes, and some sequences of moves constitute games. Sequences are all around.

We shall not define here the notion of a sequence in set-theoretical terms. Relying on our intuitive understanding we shall take it for granted. Sequences can be formed from any given objects. And the sequences are objects themselves.

We may use ‘a1, a2, . . . , an’ to denote the sequence of length n in which a1 occurs in the first place, a2 – in the second, . . . , and an – in the nth. But this notation is often inconvenient; for we also use ‘a1, a2, . . . , an’ to refer to a plurality (we say: ‘the numbers 3, 7, 11, 19 are prime’), whereas a sequence is a single object. Therefore we have notations that display more clearly the sequence as an object. The most common are:

(a1, a2, . . . , an)        and        ⟨a1, a2, . . . , an⟩

Finite sequences are called tuples; sequences of length n – n-tuples. The expression ‘ith coordinate’ is used, ambiguously, for the ith place, as well as for the object occurring in that place.

The sequences we encounter are finite. But the notion can be extended to infinite cases. We can speak of the infinite sequence of natural numbers:

(0, 1, 2, . . . , n, . . .)

or of the sequence of even natural numbers:

(0, 2, 4, . . . , 2n, . . .)

In this course we shall be concerned only with finite sequences; though we may mention infinite sequences of numbers or of symbols.


It is convenient to refer to the objects occurring in the sequence as its members. The object occurring in the ith place is the ith member of the sequence. Do not confuse this with the membership relation of set theory! As a rule, the context indicates the intended meaning of ‘member’.

Equality of Sequences: A sequence is determined by its length (the number of places, or of occurrences) and by the order in which objects occur: its first member, its second member, etc. Sequences are equal when they are “exactly the same”: they have the same length and, in each place, the same object occurs. Formally:

(a1, . . . , am) = (b1, . . . , bn) iff m = n and ai = bi, for all i = 1, . . . , m.

They are thus quite different from sets. A set is completely determined by its members. Set-theoretic notations may list the members in some sequential order, but neither the order nor repeated listings make a difference.

{0, 1, 1, 1} = {1, 0, 1, 1} = {0, 1} = {1, 0}

But the sequences

(0, 1, 1, 1), (1, 0, 1, 1), (0, 1), (1, 0)

are all different.
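The contrast can be seen directly in any language with both set and sequence types; in Python (used here only for illustration), sets ignore order and repetition while tuples do not:

```python
# Sets ignore order and repetition; sequences (tuples) do not.
assert {0, 1, 1, 1} == {1, 0, 1, 1} == {0, 1} == {1, 0}

seqs = [(0, 1, 1, 1), (1, 0, 1, 1), (0, 1), (1, 0)]
# All four tuples are pairwise different: they differ in length,
# or in the object occurring at some place.
assert len(set(seqs)) == 4
```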

Ordered Pairs, Triples, Quadruples, etc. Ordered pairs, or pairs for short, are 2-tuples; triples are 3-tuples; quadruples are 4-tuples, and so on.

Ordered pairs are of particular importance. The identity condition for sequences becomes, in the case of ordered pairs, the well-known condition:

(a, b) = (a′, b′) iff a = a′ and b = b′.

5.1.4 Relations and Cartesian Products

We have seen that any property of objects (belonging to some given domain) determines a set: the set of all objects (in the given domain) that have the property. We can therefore use sets as substitutes for properties. (By doing so we disregard the difference between any two properties that determine the same set.)

There are creatures that, like properties, are true of objects, but which involve more than one object: they relate objects to each other. For example, the parent-child relation holds for any pair of objects, x and y, such that x is a parent of y. Set theory provides a very simple and elegant way of representing these creatures:


Regard the relation as a property of ordered pairs and represent it, accordingly, as a set of ordered pairs.

Thus, the parent-child relation is the set of all ordered pairs (x, y), such that x is a parent of y. If, for the sake of illustration, we restrict our universe to a domain consisting of:

Olga, Mary, Ruth, Jack, John, Abe, Bert, Nancy, Frieda,

and if the parent-child relation among these people is given by:

Abe is the father of Ruth and Jack,

Olga is the mother of Mary, Abe and Nancy,

Jack is the father of Bert,

John is the father of Nancy,

and there are no other parent-child relationships, then – over this domain – the parent-child relation is simply the set:

{(Abe, Ruth), (Abe, Jack), (Olga, Mary), (Olga, Abe), (Olga, Nancy), (Jack, Bert), (John, Nancy)}

Note that the child-parent relation is obtained by switching the two coordinates. It contains as members: (Ruth, Abe), (Jack, Abe), (Mary, Olga), etc.
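Since a relation is just a set of pairs, switching coordinates is a one-line operation. A Python sketch of this example (Python is used here purely as an illustration; the names are those of the text):

```python
# The parent-child relation of the example, as a set of ordered pairs.
parent_of = {
    ("Abe", "Ruth"), ("Abe", "Jack"),
    ("Olga", "Mary"), ("Olga", "Abe"), ("Olga", "Nancy"),
    ("Jack", "Bert"), ("John", "Nancy"),
}

# The child-parent relation: switch the two coordinates of each pair.
child_of = {(y, x) for (x, y) in parent_of}

assert ("Ruth", "Abe") in child_of
assert ("Mary", "Olga") in child_of
assert len(child_of) == len(parent_of)
```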

Relations that involve three members are construed, accordingly, as sets of 3-tuples. For example, the betweenness relation – which holds between any three points x, y, z on a line such that y is between x and z – is the set of all triples (x, y, z) such that y is between x and z. Here, to sum up, are some basic notions and terms:

A binary relation is a set of ordered pairs.

An n-ary relation (also called an n-place relation) is a set of n-tuples.

Unqualified ‘relation’ often means a binary relation.

If R is an n-ary relation, then n is referred to as the arity of R, or the number of places of R.

{(x1, x2, . . . , xn) : . . . x1 . . . x2 . . . xn . . .} is the set of all tuples (x1, x2, . . . , xn) satisfying the condition stated by ‘. . . x1 . . . x2 . . . xn . . .’.


The betweenness relation above can be written as:

{(x, y, z) : y is between x and z}

where ‘x’, ‘y’ and ‘z’ range over geometrical points. Here are some other examples:

{(x, y) : x is a parent of y},    {(x, y) : y is a parent of x},    {(x, y, z) : x introduced y to z}

{(x, y) : x and y are real numbers and y = 2x + 1}

{(x, y) : x and y are natural numbers and x ≥ y}

Note: The variables in relational notation are used as place holders, that is, to correlate coordinates with places in the defining expression. Different variables, or the same variables in different roles, can achieve the same effect:

{(x, y) : x is a parent of y} = {(y, x) : y is a parent of x} = {(u, x) : u is a parent of x}

But

{(x, y) : x is a parent of y} ≠ {(x, y) : y is a parent of x}

The first relation consists of pairs in which the parent occupies the first coordinate, the child – the second; in the other relation the child is in the first place, the parent – in the second.

Self-explanatory variants of our notation involve repetitions of variables; e.g.,

{(x, x) : x ∈ D}

is the set of all pairs (x, x), where x ranges over D. It is equal to {(x, y) : x, y ∈ D and x = y}.

Note: The arity of the relation is the length of the tuple; it may be greater than the number of different variables that appear in the definition, because, as we have just seen, the same variable can occupy different places in the tuple.

Relations Over a Given Domain: Often, we consider relations that relate objects of particular kinds: numbers, people, animals, words, etc. We say that a relation is over D if it consists of tuples whose members belong to D.

Usually, the variables range over well-defined domains. In ‘x is an uncle of y’, ‘x’ and ‘y’ range, obviously, over people. Relations can, however, relate objects of different kinds; e.g., the ownership relation that holds between x and y, just when x is a person, and y is an object owned by x.

Homework

5.4 Consider binary relations consisting of the pairs (x, y), determined respectively by the following conditions. (When the signs ≥, <, =, ≠ are used, the variables range over the natural numbers.)

(1) x is a brother of y. (2) y is a sibling of x. (3) x ≥ y. (4) y < x. (5) x ⊆ y, where x and y are sets of natural numbers. (6) x ≠ y. (7) y = x. (8) x is an ancestor of y. (9) y is a child of x. (10) x = y = 3. (11) x and y are natural numbers.

Find out which of the following inclusions are true (the numbers refer to the corresponding relations). Justify your answers.

(1) ⊆ (2)    (2) ⊆ (1)    (3) ⊆ (4)    (4) ⊆ (3)    (7) ⊆ (3)    (7) ⊆ (4)    (7) ⊆ (10)    (10) ⊆ (7)    (8) ⊆ (9)    (9) ⊆ (8)

5.5 A (binary) relation, R, is:

symmetric, if whenever (x, y) ∈ R, also (y, x) ∈ R,

transitive, if whenever (x, y), (y, z) ∈ R, also (x, z) ∈ R,

reflexive over the domain D, if whenever x ∈ D, (x, x) ∈ R.

Note that reflexivity depends also on the domain. A relation which is reflexive over a domain can cease to be so with respect to a larger domain. Usually, when we speak of a reflexive relation, the domain is presupposed.
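For a finite relation, the three properties can be tested by brute force. A Python sketch (illustrative only; the helper names is_symmetric, etc., are ours, not the text's):

```python
# Direct tests of the three properties for a finite binary relation R
# over a finite domain D (quadratic/cubic brute force; a sketch only).
def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, z) in R
               for (x, y) in R for (y2, z) in R if y == y2)

def is_reflexive(R, D):
    return all((x, x) in R for x in D)

D = {0, 1, 2}
R = {(x, y) for x in D for y in D if x <= y}   # the relation ≤ on D

assert is_reflexive(R, D) and is_transitive(R)
assert not is_symmetric(R)          # (0, 1) ∈ R but (1, 0) ∉ R
```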

Find out which of the relations of 5.4 are symmetric, which are transitive and which are reflexive (over the naturally associated domain). Justify your conclusions. If the property in question does not hold, show this by a counterexample.

Cartesian Products

Let X1, X2, . . . , Xn be sets. The Cartesian product of X1, X2, . . . , Xn, denoted as:

X1 × X2 × . . . × Xn,

is the set of all n-tuples in which the first coordinate is a member of X1, the second coordinate is a member of X2, and so on . . . , the nth coordinate is a member of Xn. Formally, for all x:

x ∈ X1 × . . . × Xn iff there are x1, x2, . . . , xn such that: x = (x1, x2, . . . , xn) and xi ∈ Xi, for i = 1, 2, . . . , n.

We can also express this using the notation for sets of tuples:

X1 × X2 × . . . × Xn = {(x1, x2, . . . , xn) : xi ∈ Xi, for i = 1, 2, . . . , n}

Note: In ‘(x1, x2, . . . , xn)’, the index is the place-number in the sequence. But there is no general rule that ties indices to place-numbers. The first member of (x3, x1, x1) is x3, the second is x1 and the third is x1.


Examples:

{1, 2} × {0, 1, 2} = {(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

{1, 2} × {1, 2} × {2, 3} = {(1, 1, 2), (1, 1, 3), (1, 2, 2), (1, 2, 3), (2, 1, 2), (2, 1, 3), (2, 2, 2), (2, 2, 3)}

Cartesian Powers: If Xi = X, for i = 1, . . . , n, then X1 × . . . × Xn is said to be the nth Cartesian power of X and is denoted as:

Xⁿ

Obviously, Xⁿ is the set consisting of all n-tuples of members of X. If n = 2, it is the set of all ordered pairs of members of X.
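Cartesian products and powers are available directly in Python's standard library as itertools.product; the sketch below (Python is used here only as an illustration) reproduces the first example above and a Cartesian square:

```python
from itertools import product

# X1 × X2 and the Cartesian square X1², via itertools.product.
X1, X2 = {1, 2}, {0, 1, 2}

prod = set(product(X1, X2))
assert prod == {(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

# Xⁿ: the set of all n-tuples of members of X (here n = 2).
square = set(product(X1, repeat=2))
assert square == {(1, 1), (1, 2), (2, 1), (2, 2)}
assert len(square) == len(X1) ** 2
```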

Historically, the concept of Cartesian product was derived from geometry. A coordinate system for the plane consists of two perpendicular directed lines, which are referred to as axes. Each point in the plane can be projected on the two axes, determining thereby an ordered pair of real numbers (x, y), where x represents the projection on the first axis, y – the projection on the second. Vice versa, every pair of numbers determines a unique corresponding point. In this way it is possible to identify the plane with the Cartesian product R × R, or R², where R is the set of all real numbers. Similarly, the three-dimensional space can be identified with R³. This representation, by now a commonplace, amounts to a major breakthrough in the history of science. It was discovered around 1637 by Descartes. (‘Cartesian’ derives from his Latin name ‘Cartesius’.)

All of geometry is reducible, in principle, to a system that deals with pairs, or triples, of numbers. Moreover, the concept of a Cartesian product makes possible the definition and the study of higher-dimensional geometrical spaces, structures that resist visualization. A geometrical space of 4 dimensions is simply R⁴, that of 5 dimensions is R⁵, and an n-dimensional space is Rⁿ.

Homework

5.6 Let X1 = {0, 2, 4}, X2 = {0, 5}. Write down the following sets in the curly-bracket notation.

X1 × X2,    X2 × X1,    X1²,    X2³,    X1 × {2} × X1,    X1 × {∅} × X1,    X1 × ∅ × X1.

5.7 Prove the following:

(i) X × (Y ∪ Z) = (X × Y ) ∪ (X × Z)

(ii) (X ∪ Y )× Z = (X × Z) ∪ (Y × Z)

(iii) X1 × . . .×Xn = ∅ iff one of X1,X2, . . . , Xn is empty.

5.8 Prove that, if no Xi is empty, then X1 × X2 × . . . × Xn = Y1 × Y2 × . . . × Yn iff Xi = Yi, for i = 1, 2, . . . , n.

What can you deduce from this, concerning the equality X × Y = Y × X?

Functions

Historically, functions have been conceived as laws by which a magnitude is determined by another; for example, the distance traveled by a falling body is said to be a function of the time of fall. Functions have also been considered as rules that correlate objects with objects. Thus, there is a function that correlates with every number, x, the number x²; and one that correlates with x the number 2x − 1. Functions can be defined for any kind of objects; e.g., there is a function that correlates with each person the person’s mother, and there is one that correlates, with each star, the galaxy it belongs to.

Commonly, ‘f(x)’ denotes the object that the function f correlates with x (assuming, of course, that f is defined for x). We say that f(x) is the value of f for the argument x, and also that it is the value of x under f.

The intuitive concept of “rule”, or “law”, by which the correlation is determined, is too vague for mathematical purposes. Historically, the list of entities admitted as functions kept growing, until mathematicians came to realize that they need an abstract concept of function, which does not rely on the notion of a defining rule. Set theory provides a perfect definition of such a concept. Consider the set of all pairs (x, y) such that y is the value correlated with x. One can define the function as being simply this set. Given such a set, the function assigns a value to each x for which there is a y such that (x, y) is in the set; that y is the value of x under the function.

Not every set of ordered pairs will do as a function. It should satisfy the condition that, for every x, there is at most one y such that (x, y) is in the set (the function should assign to any object no more than one value). Any set satisfying this condition is a function. Stated in full, the definition is this:

A function, f, is a relation (set of ordered pairs) such that: for all x, y, y′, if (x, y) ∈ f and (x, y′) ∈ f, then y = y′.

If there is a y such that (x, y) ∈ f, then we say that f is defined for x. The domain of the function, which we denote as dom(f), is the set of all objects for which the function is defined. If x ∈ dom(f), then the value of f for x is the unique y such that (x, y) ∈ f. The value is denoted by ‘f(x)’.

Example: If f = {(x, 3x² + 1) : x ∈ N}, where N is the set of natural numbers, then f is a function, dom(f) = N, and f(x) = 3x² + 1 for all x ∈ N.
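This example can be mirrored directly: a function is a set of pairs satisfying the single-valuedness condition. A Python sketch (illustrative only), restricting N to a finite initial segment:

```python
# A function as a set of ordered pairs: f = {(x, 3x² + 1) : x ∈ N},
# restricted here to a finite initial segment of N for illustration.
f = {(x, 3 * x**2 + 1) for x in range(10)}

def is_function(pairs):
    """No x is paired with two different values."""
    xs = [x for (x, _) in pairs]
    return len(xs) == len(set(xs))

assert is_function(f)

dom_f = {x for (x, _) in f}                 # dom(f)
value = dict(f)                             # lookup table for f(x)
assert dom_f == set(range(10))
assert value[2] == 13                       # f(2) = 3·4 + 1 = 13
```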

Functions differ just when they differ as sets of ordered pairs. It is easy to see that two functions, f and g, are equal just when they are defined for the same objects and assign to every object the same value. Formally:

f = g iff dom(f) = dom(g) and f(x) = g(x), for all x ∈ dom(f) .

Functions come under many names, which suggest diverse aspects and uses of the concept. We have correlation, assignment, and correspondence, which suggest a pairing of objects with objects. And we have also operator and operation, which suggest the transforming of objects into objects. We might say that squaring is an operation by which any number x is transformed into x². Transformation is itself a term adopted in mathematics for functions of a certain type. We have also the term mapping, which suggests both a matching of items and a copying of one thing to another.

One-to-One Functions and Equinumerous Sets: A function f is said to be one-to-one if it correlates different values with different objects in its domain; that is, for all x, y ∈ dom(f):

x 6= y =⇒ f(x) 6= f(y)

Or, equivalently expressed:

f(x) = f(y) =⇒ x = y

The concept of a one-to-one function is extremely important in mathematics. It serves, among other things, to define the concept of equinumerous sets; these are sets that have the same number of members:

The set X is equinumerous to the set Y if there exists a one-to-one function f, such that X = dom(f) and Y = {f(x) : x ∈ X}. (In words: Y is the set of all objects correlated via f with members of X.)

A little reflection will show that this definition captures all there is to “having the same number”. There are exactly as many forks in the drawer as there are spoons, just when one can pair with every fork a spoon, so that different forks are paired with different spoons and every spoon is paired with some fork. The function that correlates with each fork its paired spoon is the function that satisfies the conditions of the last definition. Vice versa, any such function determines a pairing of spoons with forks.
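The fork-and-spoon pairing can be spelled out as a small Python sketch (illustrative only; the names f1, s1, etc., are invented):

```python
# A one-to-one function witnesses that its domain and its set of
# values are equinumerous (a sketch; finite sets only).
def is_one_to_one(f):
    """f given as a dict; different arguments get different values."""
    return len(set(f.values())) == len(f)

forks = {"f1", "f2", "f3"}
spoons = {"s1", "s2", "s3"}
pairing = {"f1": "s1", "f2": "s2", "f3": "s3"}

assert is_one_to_one(pairing)
assert set(pairing) == forks                 # dom(f) = the forks
assert set(pairing.values()) == spoons       # the values exhaust the spoons
# Hence the forks and the spoons are equinumerous (here: 3 each).
```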

A crucial feature of the definition is that it applies to all sets, finite as well as infinite. We can therefore define when two infinite sets have the same number of elements. The definition was introduced by Cantor, who derived from it the general concept of a cardinal number; that is, a (possibly infinite) number, which can serve as an answer to the question: How many elements are there in a set?

Functions of Several Arguments: So far we have discussed functions of one argument, i.e., functions that correlate objects with objects. Often, however, we correlate objects with more than one object. We may correlate with every two numbers, x and y, their sum: x + y; and we may correlate with x, y and z, the number xy + z. Such cases are construed as functions of many arguments.

It is possible to subsume functions of n arguments under functions of one argument, by regarding them as one-argument functions defined for n-tuples. Under this construal, the function that assigns to x and y the difference x − y is a one-argument function, whose domain is R × R; it assigns to any ordered pair the number obtained by subtracting the second coordinate from the first.

Alternatively – and this is sometimes more convenient – we can define a function of n arguments as an (n+1)-ary relation, say f, which satisfies the condition:

If (x1, . . . , xn, y) ∈ f and (x1, . . . , xn, y′) ∈ f, then y = y′.

An n-place function, f, is defined for x1, . . . , xn, if there exists y, such that (x1, . . . , xn, y) ∈ f. Such a y is, of course, unique and we denote it by:

f(x1, . . . , xn)

5.2 Inductive Definitions and Proofs, Formal Languages

5.2.1 Inductive definitions

John McGregor (a Scottish squire from the 17th century) had four children: Mary, James, Robert and Lucy.

James died childless. Mary had two children. Robert and Lucy had three children each.

William, Mary’s first child, had one child.

...

and so the story goes on.

We may not know the descendants of John McGregor or their number, but we have no trouble in understanding what a descendant of John McGregor is. It is either a child of McGregor, or a child of a child, or a child of a child of a child, . . . and so on.

Using the concept of a finite sequence it is not difficult to give an explicit definition of the set of a’s descendants, where a is a person.

x is a descendant of a iff there is a finite sequence (a1, . . . , an), such that a1 is a child of a, ai+1 is a child of ai, for all i = 1, . . . , n − 1, and an = x.


The sequence just described shows the chain connecting x to a. The condition concerning the sequence can be relaxed:

A person x is a descendant of a iff there is a finite sequence in which x is the last member, and every member of it is either a child of a or a child of some previous member.

There is another way of defining the set of descendants, which does not employ finite sequences. Consider the following two properties of a set X:

(I) If x is a child of a, then x ∈ X (i.e., every child of a is a member of X).

(II) If x ∈ X and y is a child of x, then y ∈ X (i.e., every child of a member of X is a member of X).

It is obvious that the set of all descendants of a satisfies (I) and (II); i.e., if X = the set of all descendants of a, then (I) and (II) are true.

There are other sets that satisfy (I) and (II); for example, the set of all persons (because every child of a is a person and every child of a person is a person); or the set of all people that are descendants either of a or of b. But every set that satisfies (I) and (II) includes as a subset the set of descendants of a: First, by (I), it contains as members all the children of a; second, by (II), it contains also all the children’s children; hence, by (II) again, it contains all children’s children’s children, and so on. Therefore we have:

The set of all descendants of a is the smallest set that satisfies (I) and (II).

Here by “smallest” we mean that it is included as a subset in every set that satisfies (I) and (II). Note that if there is a smallest set it must be unique: if Y1 and Y2 are both smallest sets, then Y1 ⊆ Y2 and Y2 ⊆ Y1.

Therefore we can also say:

x is a descendant of a iff it belongs to every set satisfying (I) and (II).

Frege was the first to give definitions of this type.

Note: Instead of (I) and (II) we can use a single condition: their conjunction. This condition can be stated as follows:

(III) If x is either a child of a or a child of a member of X, then x ∈ X.

The existence of a smallest set that satisfies a given condition is a property of the condition.


Not every condition has this property. Consider, for example, the condition of being non-empty. There is no smallest non-empty set. Because if b and c are any two different objects, both {b} and {c} are non-empty; but there is no non-empty set that is a subset of both ({b} ∩ {c} = ∅). Each of {b} and {c} is a minimal non-empty set: it has no proper subset which is not empty; but it is not the smallest non-empty set. Or consider the following condition on X:

(IV) At least three children of McGregor are members of X.

Given that Mary, James, Robert and Lucy are McGregor’s children, each of the following sets satisfies (IV):

{Mary, James, Robert} {James, Robert, Lucy}

But no set that is a subset of both satisfies (IV), because their intersection is {James, Robert}.

If Y is the smallest set satisfying the condition P, then it is (i) a member of the family of all sets satisfying P, and (ii) a subset of every set in this family. [By a ‘family of sets’ we mean a set whose members are sets.] Hence, Y is the intersection of all sets satisfying P.

Note: In 5.1.2 we defined intersections of a finite number of sets. The definition generalizes easily to any non-empty family, F, of sets: The intersection of the members of F is the set consisting of those objects that are members of every set in F. An analogous generalization applies to unions: The union of all the sets in the family F is the set whose members are all objects that belong to some member of F.

Operations on Sets, Monotonicity and Fixed Points

Our first definition of descendants tells us how to get each descendant by some finite, bottom-up construction of a sequence. The second definition represents a top-down approach, in which we form the intersection of all the sets that satisfy certain conditions. There is a connection between the two definitions. It is brought out by regarding (I) and (II) not only as conditions, but as rules that determine operations on sets. If X is the set that is operated on, then the rules are as follows.

(I∗) If x is a child of a, add x to X.

(II∗) If x ∈ X and y is a child of x, add y to X.

To apply (I∗) to X means to add to X all the children of a. (If there are no children of a, or if all of them are already in X, no new members are added.) To apply (II∗) to X means to add to it all the children of its members. (If no member of X has children, or if all the children of the members of X are already in X, no new members are added.)

Henceforth we use ‘(I∗)’ and ‘(II∗)’, ambiguously, to refer to the rule as well as to the operation determined by it.

Obviously, (I∗) and (II∗) can either augment X or leave it unchanged. They cannot decrease it. Operations having this property are called non-decreasing. Also, the outcome of applying either (I∗) or (II∗) to X does not decrease if X is augmented. Operations that have this property are called monotone.

A set that is left unchanged by applying an operation to it is said to be a fixed point of the operation. If the operation is non-decreasing, we also say that the set is closed under the operation.

If ‘F(X)’ denotes the outcome of applying the operation F to the set X, then the properties above can be summarized as follows:

Non-Decreasing: X ⊆ F (X)

Monotone: X ⊆ X′ ⇒ F(X) ⊆ F(X′)

Fixed Point: F (X) = X

For a non-decreasing F, a fixed point of F is also said to be closed under F.

Obviously, a set is a fixed point of (I∗) and (II∗) iff all children of a are already in it and, for each of its members, it contains also all the member’s children. But this simply means that the set satisfies (I) and (II). Hence, the sets that satisfy (I) and (II) are exactly the fixed points of (I∗) and (II∗). Our aim is therefore to construct the smallest fixed point of (I∗) and (II∗). This is achieved as follows.

We start with an initial set, X0, such that X0 = ∅. Applying (I∗) to it, we get a set, X1, consisting of all the children of a. By applying (II∗) to X1 we get a set, X2, consisting of all the children of a and all their children. Again, by applying (II∗) to X2 we get X3, which consists of the children of a, the children’s children, and the children’s children’s children. And so on. We get in this way a sequence

X0, X1, . . . , Xn, . . .

which is non-decreasing: X0 ⊆ X1 ⊆ . . . ⊆ Xn ⊆ . . ..

All sets in this sequence contain only descendants of a. It is also easily seen that every descendant of a is a member of some set in the sequence. Hence the union of all the sets in the sequence is exactly the set of all descendants of a. It is the smallest fixed point of (I∗) and (II∗).
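The bottom-up construction just described can be sketched in code. The following is a minimal illustration, not from the text; the parent–child table and all names are invented for the example.

```python
# A hypothetical parent-child table: children["a"] lists the children of a.
children = {
    "a": ["b", "c"],
    "b": ["d"],
    "d": ["e"],
}

def descendants(root):
    """Iterate the combined rule (III*) from the empty set to the least fixed point."""
    X = set()
    while True:
        # add the children of root, and the children of members of X
        new = set(children.get(root, [])) | {y for x in X for y in children.get(x, [])}
        if new <= X:        # a fixed point: applying the rule adds nothing new
            return X
        X |= new

print(sorted(descendants("a")))   # ['b', 'c', 'd', 'e']
```

Since the rule is non-decreasing and the table is finite, the loop must reach a fixed point and halt.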


Note: The sets of such a sequence can either go on increasing all the way, or reach a “plateau”, remaining the same from some point on. In our example, the first is the case if time goes on indefinitely and there are always new descendants of a; the second is the case if, from some time on, no new descendants are added.

We can combine (I∗) and (II∗) into a single operation, which adds to X the children of a, as well as the children of the members of X. This corresponds to the conjunction, (III), of (I) and (II):

(III∗) If x is either a child of a or a child of a member of X, add x to X.

Applied to the empty set, (III∗) adds to it the children of a (as does (I∗)). Afterwards, since our set already contains the children of a, applications of (III∗) are the same as applications of (II∗).

Our case exemplifies the general features of inductive definitions:

(a) We are given certain conditions [in our example: (I) and (II)]. There is a smallest set that satisfies them, and this is the set we define.

(b) We recast the conditions as rules that determine non-decreasing monotone operations on sets [in our example: (I∗) and (II∗)]. A set satisfies the defining conditions iff it is a fixed point of these operations.

(c) Starting with the empty set and iterating the operations, we get a non-decreasing sequence of sets whose union is the smallest fixed point.

It is always possible to replace the set of conditions by their conjunction [in our example: (III)], and to use the operation that corresponds to it [in our example: (III∗)], i.e., to iterate this single operation. This may be preferable for the purpose of a general treatment. But usually it is easier to grasp the construction if we separate the conditions. In particular, it is convenient to distinguish two kinds of rules:

• Base Rules: These are the rules for starting the process. They enable us, unconditionally, to put certain objects in our set.

• Recursive Rules: These are the rules that we keep iterating. They enable us to add, as new members, objects that are related (in the relevant way) to members of the set.

In our example (I∗) is a base rule and (II∗) is a recursive rule. If we combine the rules into one, then, when this is applied to ∅, it acts as the conjunction of all the base rules; afterwards it acts as the conjunction of all the recursive rules.

The term ‘inductive rule’ is sometimes used as a synonym of ‘recursive rule’. But it is also used more broadly, to refer to all the rules of the inductive definition.

Page 195: A Course in Symbolic Logic

178 CHAPTER 5. MATHEMATICAL INTERLUDE

Note: The conditions of an inductive definition must be such that there is a smallest set satisfying them. They should, moreover, determine, in the way just illustrated, non-decreasing monotone operations. There are logical characterizations of conditions that have these properties. We shall not go into them here. The general theory of inductive definitions is a subject by itself.

Note: ‘Induction’ has several meanings. You probably know the term as it is used to characterize empirical generalizations; e.g., from the observed cases of human mortality we infer by inductive generalization that all humans are mortal. Do not confuse the two uses of ‘induction’. [The common root of the two refers to the “inducing” of new facts by old ones. In empirical induction, we infer new unobserved cases from observed ones. This type of inference does not have logical or mathematical certainty. In inductive definitions the “inducing” is part of the definition: the fact that c is a descendant of a is “induced” by the facts that c is a child of b and b is a descendant of a.]

The term ‘recursive definition’ is sometimes used as a synonym for ‘inductive definition’. One also speaks of definition by recursion. Unfortunately, ‘recursion’ is used also to denote any computational process based on some algorithm. Thus, both ‘induction’ and ‘recursion’ have more than one meaning.

Terminology and Notation: The conditions that figure in an inductive definition are also known as the clauses, or the inductive clauses, of the definition.

To cut the terminology short, it is customary to regard these clauses also as rules for adding members. This means that we can speak of (I) and (II) as if they were, respectively, (I∗) and (II∗); we may thus say that a set is closed under (II), or that it is a fixed point of (I) and (II).

It is customary to use the same symbol both in the role of the set-variable that is used in stating the conditions (in our example, ‘X’) and as a name for the inductively defined set. If ‘Da’ is to denote the set of all descendants of a, then its inductive definition will have the form:

(1) If x is a child of a, then x ∈ Da.

(2) If x ∈ Da and y is a child of x, then y ∈ Da.

We then say that Da is defined inductively by (1) and (2), meaning that it is the smallest set satisfying these conditions. And we also say that Da is the smallest fixed point of (1) and (2). Here are some other examples of inductively defined sets. We denote them as ‘S1’, ‘S2’, etc.

The set S1:

(1) 2 ∈ S1.

(2) 3 ∈ S1.

(3) If x ∈ S1, then 2x ∈ S1.

(4) If x ∈ S1, then 3x ∈ S1.


Here (1) and (2) are the base rules and (3) and (4) are the recursive rules. Obviously, (1) and (2) can be replaced by the single base rule:

(1′) 2, 3 ∈ S1.

And the other two rules can be combined into a single recursive rule:

(2′) If x ∈ S1, then 2x ∈ S1 and 3x ∈ S1.

After the first step we get the set {2, 3} and then, with each iteration of the recursive rules, we add to our set all the products of set members with 2 and with 3. The first four sets in the sequence are:

∅, {2, 3}, {2, 3, 4, 6, 9}, {2, 3, 4, 6, 8, 9, 12, 18, 27}

It is not difficult to see that S1 consists of all natural numbers that can be expressed as products, greater than 1, of 2’s and 3’s, that is: all numbers of the form 2ᵐ3ⁿ, where m, n ≥ 0 and at least one of m, n is non-zero. (Recall that x⁰ = 1 and x¹ = x.)
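As a sketch (the function name and the cutoff are mine, not the text’s), the iteration can be carried out mechanically, truncated at a bound so that it terminates:

```python
def s1_up_to(limit):
    """Iterate the rules (1')-(2') for S1, keeping only members <= limit."""
    X = set()
    while True:
        new = {2, 3} | {2 * x for x in X} | {3 * x for x in X}
        new = {n for n in new if n <= limit}
        if new <= X:        # fixed point (below the cutoff)
            return X
        X |= new

# Members of S1 up to 30: exactly the numbers 2^m * 3^n with m + n >= 1.
print(sorted(s1_up_to(30)))
```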

The set S2:

(1) 2, 3 ∈ S2.

(2) If x, y ∈ S2, then x·y ∈ S2.

Clause (2) means that S2 is closed under products; i.e., with every two members it also contains their product.

It is not difficult to see that S1 = S2. The argument, which is easy, shows how the property of being the smallest set satisfying the conditions is used:

S2 contains 2 and 3 and is closed under products. Hence it contains all products of 2’s and 3’s. Therefore S2 satisfies the conditions that define S1. Since S1 is the smallest set satisfying these conditions, we have: S1 ⊆ S2.

Vice versa, the set of all products > 1 of 2’s and 3’s contains 2 and 3 and is closed under products. Hence it satisfies the conditions that define S2. Since S2 is the smallest set satisfying these conditions, we have: S2 ⊆ S1.

Putting the two together we get: S1 = S2.

This case is easy. But, in general, the question whether two given inductive definitions define the same set can be very difficult.

The set S3:

(1) 1 ∈ S3.

(2) If x ∈ S3, then 2x ∈ S3.


It is not difficult to see that S3 is just the set consisting of all powers of 2:

{2⁰, 2¹, 2², 2³, . . . , 2ⁿ, . . .}

The set S4:

(1) 3, 5 ∈ S4.

(2) If x ∈ S4, then x+3 ∈ S4.

(3) If x ∈ S4, then x+5 ∈ S4.

S4 is the analogue of S1 (with 2 and 3 replaced by 3 and 5) in which products have been replaced by sums. It is not difficult to see that S4 consists of all numbers > 0 that can be written as 3m + 5n, where m, n are natural numbers. Just as S4 is the analogue of S1, so the following set is the analogue of S2.

The set S5:

(1) 3, 5 ∈ S5.

(2) If x, y ∈ S5, then x+y ∈ S5.

As in the case of products, one can show that S4 = S5. It can also be shown that this is the same as the following set S6.

The set S6:

(1) 3, 5, 6, 8 ∈ S6.

(2) If x ∈ S6 and x ≥ 8 then x+1 ∈ S6.

S6 is simply the set consisting of 3, 5, 6, 8, and all numbers greater than 8.

[To see that S4 ⊆ S6, note that 3, 5 ∈ S6 and that, of all numbers ≤ 8, only 3, 5, 6, 8 are sums of 3’s and 5’s; consequently, S6 is closed under (2) and (3) in the definition of S4. To see that S6 ⊆ S4, note that every number among 3, 5, 6, 8 is a sum of 3’s and 5’s, and each number from 9 on is obtainable by adding to some number from 3, 5, 6, 8 a sum of 3’s and 5’s.]
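The equality can also be checked by machine on an initial segment. This is a sketch; the helper `closure` and the cutoff are mine: each definition’s rules are iterated below a bound and the resulting fixed points compared.

```python
def closure(base, step, limit):
    """Smallest set containing `base` and closed under `step`, cut off at `limit`."""
    X = set()
    while True:
        new = set(base) | {y for x in X for y in step(x) if y <= limit}
        if new <= X:
            return X
        X |= new

# S4: base {3, 5}, closed under adding 3 and adding 5.
s4 = closure({3, 5}, lambda x: [x + 3, x + 5], 40)
# S6: base {3, 5, 6, 8}, closed under adding 1 from 8 onward.
s6 = closure({3, 5, 6, 8}, lambda x: [x + 1] if x >= 8 else [], 40)

print(s4 == s6)   # True
```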

In the preceding examples, the recursive rules add to the set numbers of growing size. Consequently, the set keeps growing and the fixed point is infinite. As the following example shows, this need not hold in general.

The set S7:


(1) 7 ∈ S7.

(2) If n ∈ S7 and n is odd, then 2n ∈ S7.

(3) If n ∈ S7 and n > 4, then n− 2 ∈ S7.

By iterating these rules we put into our set the following numbers: 7, 14, 5, 3, 12, 10, 8, 6, 4. Additional applications of the rules do not yield new numbers. Hence,

S7 = {7, 14, 5, 3, 12, 10, 8, 6, 4}
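As a sketch (the function name is mine), the rules for S7 can be iterated mechanically, and the iteration indeed halts at a finite fixed point:

```python
def s7():
    """Iterate the three rules for S7 until a fixed point is reached."""
    X = set()
    while True:
        new = ({7}
               | {2 * n for n in X if n % 2 == 1}   # rule (2): double the odd members
               | {n - 2 for n in X if n > 4})       # rule (3): subtract 2 above 4
        if new <= X:
            return X
        X |= new

print(sorted(s7()))   # [3, 4, 5, 6, 7, 8, 10, 12, 14]
```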

Homework 5.9 Let k be a fixed natural number. Let Xk be the set defined, inductively, by the following clauses:

(1) k ∈ Xk.

(2) If x ∈ Xk and x is even, then x/2 ∈ Xk.

(3) If x ∈ Xk and x is odd, then (3x+1)/2 ∈ Xk.

Write down (in the curly brackets notation) the sets Xk for the cases:

k = 0, 1, 2, 3, 5, 6, 15, 17.

Does there exist a number k for which Xk is infinite? This is an open, and apparently very difficult, problem in number theory.

Many examples of inductive definitions that apply to objects that are not numbers are given in 5.2.3. We have already had one example: the set of descendants. Here is one of the same kind.

The Set of Maternal Descendants: Let ‘maternal descendant’ mean a descendant via the mother-child relation. Note that the connecting chain must consist of females, except, possibly, the last descendant. Using ‘MDa’ for the set of maternal descendants of a, the clauses of the definition are:

(MD1) If a is female and x is a child of a, then x ∈ MDa.

(MD2) If x ∈ MDa and x is female and y is a child of x, then y ∈ MDa.

Note: If a is not female, MDa is empty. Formally, one shows that ∅ satisfies the two conditions for MDa: since a is not female, the antecedent of the first condition is false and the condition holds vacuously. ∅ satisfies also the second condition, since no x is in ∅.


Inductive Definitions of Relations

The machinery of inductive definitions can be applied to define relations, where these, recall, are sets of pairs, or of n-tuples. The conditions determine rules for adding certain pairs, or n-tuples, to the set that is being constructed.

Here, for example, is the definition of the descendant relation, Des, which is the set of all pairs (x, z) in which x is a descendant of z. This definition is obtained from that of a’s descendants by replacing the fixed parameter ‘a’ by a variable, say ‘z’, and by suitable replacements of ‘x’ by ‘(x, z)’.

(1) If x is a child of z, then (x, z) ∈ Des.

(2) If (x, z) ∈ Des and y is a child of x, then (y, z) ∈ Des.

Notation: Let s be the successor function, defined for natural numbers: s(x) = x+ 1.

Many relations over natural numbers can be defined inductively in terms of the successor function. Here is one.

(1) (x, s(x)) ∈ R (i.e., this holds for all natural numbers x).

(2) If (x, y) ∈ R, then (x, s(y)) ∈ R.

(1) puts in R all pairs of the form (x, s(x)). Then, an application of (2) adds all the pairs (x, s(s(x))), another application adds the pairs (x, s(s(s(x)))), and so on. It is not difficult to see that R consists exactly of all pairs (x, y) in which x < y. Hence, (1) and (2) define inductively the smaller-than relation, <, solely in terms of the successor function. If, instead of ‘(x, y) ∈ R’, we write ‘x < y’, we get the usual form of this definition.

(1) x < s(x)

(2) If x < y, then x < s(y).
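A sketch of the construction (the function name and the bound are mine): build R by iterating (1) and (2) below a cutoff, and check that it coincides with the smaller-than relation there.

```python
def less_than_relation(bound):
    """Iterate rules (1) and (2) for R, keeping pairs whose second member <= bound."""
    s = lambda x: x + 1                 # the successor function
    R = set()
    while True:
        new = ({(x, s(x)) for x in range(bound)}            # rule (1)
               | {(x, s(y)) for (x, y) in R if s(y) <= bound})  # rule (2)
        if new <= R:
            return R
        R |= new

R = less_than_relation(10)
print(R == {(x, y) for x in range(11) for y in range(11) if x < y})   # True
```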

Inductive techniques can be used to define various functions. (Recall that functions are construed in set theory as relations of a particular kind.) Take, for example, the addition function, and let it be the relation Sum. Since addition is a binary function, Sum is a ternary relation:

Sum = {(x, y, z) : z = x + y}

It is not difficult to see that the following inductive definition defines it in terms of the successor function.

(1) (x, 0, x) ∈ Sum

(2) If (x, y, z) ∈ Sum, then (x, s(y), s(z)) ∈ Sum


Rewriting statements of the form ‘(x, y, z) ∈ Sum’ in the form ‘x + y = z’, we get:

(1′) x + 0 = x

(2′) If x + y = z, then x + s(y) = s(z)

Rewriting (2′) in the equivalent form x + s(y) = s(x + y) yields the following customary form of the definition:

(1′) x + 0 = x

(2′′) x + s(y) = s(x + y)

This definition shows directly the iterated process. Given that s(0) = 1, s(1) = 2, s(2) = 3, etc., we can get the value of m + n for every particular m and n. For example:

5 + 0 = 5

5 + 1 = 5 + s(0) = s(5 + 0) = s(5) = 6

5 + 2 = 5 + s(1) = s(5 + 1) = s(6) = 7

5 + 3 = 5 + s(2) = s(5 + 2) = s(7) = 8

etc.

Multiplication is definable inductively in terms of the successor function and addition:

(1) x · 0 = 0

(2) x · s(y) = (x · y) + x
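A sketch (the function names are mine): the two recursions, implemented verbatim, compute addition and multiplication from the successor function alone.

```python
def s(x):
    """The successor function."""
    return x + 1

def add(x, y):
    # x + 0 = x ;  x + s(y) = s(x + y)
    return x if y == 0 else s(add(x, y - 1))

def mul(x, y):
    # x * 0 = 0 ;  x * s(y) = (x * y) + x
    return 0 if y == 0 else add(mul(x, y - 1), x)

print(add(5, 3), mul(4, 3))   # 8 12
```

Each recursive call decreases the second argument, mirroring the way the inductive clauses unwind s(y) step by step.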

Homework 5.10 Give inductive definitions of the following relations over the naturalnumbers, solely in terms of the successor function.

1. The less-than-or-equal relation, R≤, consisting of all pairs (x, y) in which x ≤ y.

2. The ternary relation S consisting of all triples of the form

(x, x+ n, x+ 2n)

where x and n are natural numbers.


5.2.2 Proofs by Induction

Suppose we want to prove that every member of some set, say S, has the property P. If S is definable as the smallest set that satisfies certain conditions, we have the following way: show that the set of all objects that have property P satisfies the conditions that define S. This would imply that S is a subset of {x : x has the property P}, i.e., that every member of S has property P. If S is given through an inductive definition, then such a proof is known as a proof by induction. (Again, this is to be distinguished from empirical induction, cf. page 178.)

Example: Recall the inductive definition of ‘maternal descendant of a’ (in 5.2.1). Suppose that a certain gene, g, always passes from mother to child. One is easily convinced that if a carries the gene, every maternal descendant of a will. The underlying reasoning is a proof by induction, where the property, P, is that of carrying the gene g. Here is the detailed argument.

Let G be the set of all gene carriers. We have to show that, under the given assumptions, if a ∈ G, then MDa ⊆ G. It suffices to show that G satisfies the clauses in the inductive definition of MDa:

(i) If a is female and x is a child of a, then x ∈ G.

(ii) If x ∈ G and x is female and y is a child of x, then y ∈ G.

The holding of (i) and (ii) is trivial: (ii) is a reformulation of the assumption about passing on the gene, and (i) follows from this and the assumption that a is female.

When the inductive rules are classified into base rules and recursive ones, a proof by induction proceeds by showing that:

IN1 All members obtained via the base rules have the property P.

IN2 The set of objects that have P is closed under the recursive rules.

IN1 is sometimes called the base of the induction, and IN2 the inductive step.

Proofs By Induction on Natural Numbers

The set of natural numbers {0, 1, . . . , n, . . .} is the smallest set satisfying the conditions:

(N1) 0 is a natural number.

(N2) If n is a natural number, then the successor of n is a natural number.


This means that:

For any set X, the following is true: if (i) 0 ∈ X and (ii) whenever x ∈ X, also x + 1 ∈ X, then all natural numbers are in X.

This principle is known as the induction axiom (for natural numbers). It is the basis for inductive proofs, which are quite common in number theory. A standard inductive proof shows that all natural numbers have some property P by showing the following:

(I) 0 has the property.

(II) For any number n, if n has the property, then n+ 1 has it.

(I) is the base of the induction and (II) is the inductive step. When proving (II), we assume that n has the property and show that n + 1 has it. The assumption that n has the property is known in this context as the inductive hypothesis.

Note: Sometimes ‘mathematical induction’ simply means induction over natural numbers. Actually it applies in any domain where sets can be characterized inductively, as being the smallest sets that satisfy such and such inductive clauses. In 5.2.3 we shall see how the method works in a non-numeric domain.

Here is a standard example of an inductive proof. Let Sn = 0 + 1 + . . . + n. It is easily seen that Sn is defined inductively by:

S0 = 0.

Sn+1 = Sn + (n+1)

(‘Sn’ is the functional notation with the argument written as a subscript.)

The well-known formula for Sn is:

(SUM) Sn = n(n+1)/2

Consider the truth of (SUM) for the number n as a property of n. To show that (SUM) holds always, we proceed by induction and show: (i) (SUM) holds for 0, and (ii) if (SUM) holds for n, it holds for n + 1.

The first claim is the base of the induction. It amounts to the trivial equality 0 = 0(0+1)/2. The second is the inductive step. We assume that Sn = n(n+1)/2 (the inductive hypothesis) and show that Sn+1 = (n+1)((n+1)+1)/2. Now, by the inductive clause for Sn+1 we have: Sn+1 = Sn + (n+1). Hence we have to show that n(n+1)/2 + (n+1) = (n+1)((n+1)+1)/2. And this follows by high-school algebra.
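The high-school algebra in the last step can be spelled out:

```latex
\frac{n(n+1)}{2} + (n+1)
  = \frac{n(n+1) + 2(n+1)}{2}
  = \frac{(n+1)(n+2)}{2}
  = \frac{(n+1)\bigl((n+1)+1\bigr)}{2}.
```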


Note: (N1) and (N2), given above, amount to an inductive characterization of the natural numbers. But they cannot be considered an inductive definition, as long as they use the concepts of the number zero and the successor function, which, usually, rely on the concept of natural number.

In set theory, we can specify the number zero without appealing to natural numbers. (We can choose it to be the empty set.) We can also define a suitable notion of successor that applies to sets in general. We can then use (N1) and (N2) in order to define the natural numbers inductively.

Strong Induction

Several variants of induction over natural numbers are obvious and need no further comment. For example:

In order to show that all numbers ≥ k have a certain property (where k is some fixed number), it suffices to show that (i) k has the property and (ii) whenever n has the property, so does n + 1.

A very important variant merits special attention: the so-called strong induction. In this variant one proves that all numbers have the property P by showing that:

(I) 0 has the property P.

(II∗) For any n, if all numbers ≤ n have the property P, then n + 1 has it.

The task of proving (II∗) is easier than the task of proving (II); because, in order to show that n + 1 has the property, we can assume not only that n has it, but that all numbers ≤ n have it.

Strong induction is implied by the fact that the set of natural numbers is the smallest set X such that:

(1) 0 ∈ X.

(2) If all natural numbers ≤ n are in X, so is n+1.

Alternatively, we can derive strong induction from ordinary induction by the following trick. Given the property P, define another property P∗ by:

n has the property P∗ just when every m ≤ n has the property P.

It is easily seen that

(i) every natural number has the property P

IFF

(ii) every natural number has the property P∗.

Therefore, in order to prove (i) it suffices to prove (ii). But it is not difficult to see that proving (ii) by ordinary induction is the same as proving (i) by strong induction.

Note that (II∗) can be rephrased as follows:

(II∗∗) For any n > 0, if all numbers smaller than n have the property P, then nhas it.

We simply let ‘n’ play the role of our previous ‘n + 1’. (I) and (II∗∗) can be combined into a single condition:

(III) For any n, if all numbers smaller than n have the property P, then n has it.

For n = 0, (III) is equivalent to (I): since there are no numbers < 0, the antecedent is satisfied vacuously; hence the claim means that 0 has the property. For n > 0, (III) is the same as (II∗∗). The proof of (III) may, of course, proceed by cases, with n = 0 treated as a separate case.

Various variants of strong induction are obvious. For example, in order to show that all natural numbers belonging to some given set, X, have a property P, it suffices to prove the relativized version of (III):

(IIIX) For any n in X, if all numbers in X that are smaller than n have the property P, then n has it.

Here is an example of strong induction in use. A natural number is called prime if it is greater than 0 and is not a product of two smaller numbers. We shall now show, by strong induction, that every number > 1 is either a prime or a product of a finite number of primes, i.e., of the form p1 · p2 · . . . · pk, where the pi’s are prime. (We presuppose here the concept of finite sequences and some elementary properties of products of this kind.)

Assume that n > 1 and that the claim holds for all numbers > 1 that are smaller than n. If n is not a product of two smaller numbers, then it is a prime and the claim holds. Otherwise, n is a product of two smaller numbers, say n = k · m, where k, m < n. Both k and m must be > 1 (if one of them is 1, the other cannot be smaller than n). Hence each is either a prime or a product of primes: k = p1 · · · pi, m = p′1 · · · p′j. Combining the two we get: n = p1 · · · pi · p′1 · · · p′j.
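The strong-induction argument can be read as a recursive algorithm. Here is a sketch (the function name is mine, and trial division stands in for “is a product of two smaller numbers”): the recursive calls are on strictly smaller arguments, exactly as licensed by the inductive hypothesis.

```python
def prime_factors(n):
    """Return a list of primes whose product is n, for n > 1."""
    assert n > 1
    for k in range(2, n):
        if n % k == 0:                   # n = k * (n // k), with both factors < n
            return prime_factors(k) + prime_factors(n // k)
    return [n]                           # no smaller factorization: n is prime

print(prime_factors(60))   # [2, 2, 3, 5]
```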


Since strong induction is a more convenient tool, it is employed whenever an inductive argument relating to natural numbers is needed. The term ‘induction’ often means strong induction.

5.2.3 Formal Languages as Sets of Strings

Written linguistic constructs are usually finite sequences of signs. There is a theory that treats languages simply as sets of finite sequences. The elements of the sequences are taken from some fixed domain, which is called, in this context, the alphabet. (If we were to represent English in this way, then the “alphabet” would consist of all English words and punctuation marks, and the set of sequences would consist of all grammatical English sentences.)

Let Σ be some fixed non-empty set of objects. We refer to Σ as the alphabet, and we assume that no member of Σ is a sequence of members of Σ.

Strings Over Σ: By a (non-empty) string over Σ we mean either a member of Σ, or a sequence of length > 1 of members of Σ. The length of the string is 1 if it is a member of Σ; otherwise, it is the length of the sequence.

Strings over Σ are like finite sequences of members of Σ, with the sole difference that the strings of length 1 are the members themselves. This is done for the sake of convenience; if a ∈ Σ, the distinction between a and the sequence ⟨a⟩ plays no role in the theory.

The assumption that no member of Σ is itself a sequence of members of Σ is necessary in order to avoid ambiguity in determining the length of a string and the string’s members. A string of length 1 has one member: the string itself. (Note that ‘member’ does not denote here set-theoretic membership!)

Each string over Σ has a unique length, a uniquely determined first member, say a1, a uniquely determined second member, say a2, and so on. If the length of the string is n, and ai is its ith member, i = 1, . . . , n, then the string is written as:

a1a2 . . . an

If the string is of length 1, we simply have a1 (an element of Σ). We say that a occurs in the string if, for some i, a = ai.

It is very useful to include among our strings a so-called empty string, or null string, whose length is 0 and which has no members. It plays a role somewhat analogous to that of the empty set. The null string is denoted as:

Λ

The criterion of identity for strings is obvious: if ai, bj ∈ Σ, for all i = 1, . . . , m and j = 1, . . . , n, then

a1 . . . am = b1 . . . bn iff m = n and ai = bi for all i = 1, . . . , m

(If both m and n are > 1, this follows from the corresponding property of sequences (cf. 5.1.3). If one of them is 1, it follows from our assumption that no member of Σ is a sequence of members of Σ.)

The set of all strings over Σ is denoted as:

Σ∗

Concatenation of Strings: For x and y in Σ∗, such that

x = a1 . . . am and y = b1 . . . bn,

the concatenation of x and y is the string

a1 . . . amb1 . . . bn

It is denoted as:

xy

Concatenation is also defined if one of the strings, or both, is Λ:

xΛ = Λx = x

Obviously, concatenation is associative: (xy)z = x(yz). Hence, in repeated concatenation we can omit parentheses. If x1, x2, . . . , xn are strings, then

x1x2 . . . xn

is the string obtained by concatenating them in the given order. Note that the string a1 . . . an, where the ai’s are members of Σ, is the concatenation of the ai’s, where these are considered as strings.

By a language over Σ, we mean any subset of Σ∗.

Inductive Definitions of Sets of Strings

Strings form a domain where inductive definitions are particularly useful. First, note that, if Σ1 ⊆ Σ, then Σ1∗ (the set of all strings over Σ1) can be characterized inductively as the smallest set satisfying:

(1) Λ ∈ Σ1∗.

(2) If x ∈ Σ1∗ and p ∈ Σ1, then xp ∈ Σ1∗.

In other words: Σ1∗ is the smallest set containing Λ and closed under concatenation (to the right) with members of Σ1. If, for example, a1, . . . , an are any members of Σ1, then, by (1),

Λ ∈ Σ1∗

Hence, by (2), Λa1 ∈ Σ1∗, but this is exactly a1. Consequently:

a1 ∈ Σ1∗

An additional application of (2) yields:

a1a2 ∈ Σ1∗

and so on; n applications of (2) give us:

a1a2 . . . an ∈ Σ1∗

Σ1∗ is also the smallest set containing Λ and satisfying the following two conditions:

(2′) If x ∈ Σ1, then x ∈ Σ1∗.

(2′′) If x, y ∈ Σ1∗, then xy ∈ Σ1∗.

Prefixes of Strings: A prefix of a string x (also called an initial segment of x) is any string y such that, for some string z:

yz = x

It is easily seen that a prefix of a1 . . . an is any string of the form a1 . . . am, where m ≤ n. The case m = 0 is taken to yield the empty string. Every string x is a prefix of itself, since x = xΛ.

A proper prefix of x is one which is different from x.

The prefix relation can be defined inductively, using only concatenation (to the right) with members of Σ:

(1) For every string x, x is a prefix of x.

(2) If x is a prefix of y, and p ∈ Σ, then x is a prefix of yp.

Homework 5.11 A suffix of x is any string y such that, for some string z:

zy = x


A segment of x is any string y, such that, for some strings u and v:

uyv = x

Give inductive definitions of these concepts, based on concatenation (to the left, or to the right) with members of Σ.

Powers of strings are defined as iterated concatenations: if n is a natural number, then

xⁿ = xx . . . x

where the number of x’s on the right-hand side is n. If n = 0, this is defined to be Λ. The following is an inductive definition of this function. Note that the induction is on natural numbers, but the values of the function are strings.

(1) x⁰ = Λ.

(2) xⁿ⁺¹ = xⁿx.

Obviously, x¹ = x.

Note: If a ∈ Σ, then aⁿ is simply the string of length n consisting of n a’s.

Examples: The following are examples of languages, i.e., sets of strings, defined by induction. We assume that a, b, and c are some fixed members of Σ, and that x, y, z are variables ranging over Σ∗.

The set L1:

(1) b ∈ L1.

(2) If x ∈ L1, then axc ∈ L1.

Starting with (1) (the base rule), we put b in L1. Then rule (2) enables us, given any member of L1, to get from it a new member of L1, by concatenating it with a on the left and with c on the right. Hence, applying (1) and following it by repeated applications of (2), we get

b, abc, aabcc, aaabccc, . . . , aⁿbcⁿ, . . .

It is not difficult to see that L1 consists of all strings of the form aⁿbcⁿ, where n = 0, 1, . . ..
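A sketch (the function name and length cutoff are mine), with a, b, c taken as the ordinary characters 'a', 'b', 'c': iterate the rules for L1 below a length bound.

```python
def l1_up_to(max_len):
    """Iterate the rules for L1, keeping only strings of length <= max_len."""
    X = set()
    while True:
        new = {"b"} | {"a" + x + "c" for x in X if len(x) + 2 <= max_len}
        if new <= X:
            return X
        X |= new

print(sorted(l1_up_to(5), key=len))   # ['b', 'abc', 'aabcc']
```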

The set L2:

(1) abcc ∈ L2

(2) If x ∈ L2, then axcc ∈ L2.

By applying (1), we put abcc into L2. Then each application of (2) adds one a at the beginning and two c’s at the end. Hence we get strings of the form:

aⁿbc²ⁿ, n = 1, 2, . . .


L2 is the language consisting exactly of all these strings.

In these two examples the inductively defined sets of strings also have explicit definitions, which enumerate, according to an obvious rule, the members of the set. But, in general, an alternative description is not easily found. Sometimes the inductive definition is all that we have. Consider, for instance, the following very simple set of rules:

(1) ab ∈ L3.

(2) If x ∈ L3, then axb ∈ L3.

(3) If x, y ∈ L3, then xy ∈ L3.

Let a and b be, respectively, the left and right parentheses:

a = ( b = )

Then L3 consists of all parentheses-strings in which all parentheses are matched. E.g.,

(()) ()()() (()())() ()()()(())(()())

are in L3, while

())(   (()((())   ()())(()

are excluded from it. But the concept of matching parentheses is itself in need of clarification. A very good way of doing this is by the inductive definition just given.
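A sketch (the helper names and the length cutoff are mine), instantiating a and b as the characters '(' and ')': generate L3 up to a length bound by iterating its three rules, and compare the members against the usual counter-based balance test.

```python
def l3_up_to(max_len):
    """Iterate the three rules for L3, keeping strings of length <= max_len."""
    X = set()
    while True:
        new = ({"()"}                                  # rule (1): ab
               | {"(" + x + ")" for x in X}            # rule (2): axb
               | {x + y for x in X for y in X})        # rule (3): xy
        new = {w for w in new if len(w) <= max_len}
        if new <= X:
            return X
        X |= new

def balanced(w):
    """Standard check: the running count of '(' minus ')' never goes negative, ends at 0."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

L3 = l3_up_to(6)
print(all(balanced(w) for w in L3))   # True
```

Up to length 6 the iteration yields exactly the 1 + 2 + 5 non-empty balanced strings, matching the Catalan-number count.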

If x is a string, then x^{-1} is defined to be the string obtained by reversing x, i.e., by reading it right-to-left (and writing the result left-to-right). Here is an inductive definition of this function:

(1) Λ^{-1} = Λ.

(2) If p ∈ Σ, then (xp)^{-1} = p(x)^{-1} (i.e., the reversal of xp is p followed by the reversal of x).

(The parentheses in (2) are used to indicate the string to which the function is applied; they are not string members.)

For example, to find (abb)^{-1}, we note that abb = Λabb, hence:

(abb)^{-1} = (Λabb)^{-1} = b(Λab)^{-1} = bb(Λa)^{-1} = bba(Λ)^{-1} = bbaΛ = bba

A similar and shorter argument shows that, if p ∈ Σ, then p^{-1} = p.
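The inductive definition of reversal is, in effect, a recursive program: strip the last symbol, reverse the rest, and put the stripped symbol in front. A Python sketch (ours):

```python
def reverse(x):
    """String reversal following the inductive definition:
    rule (1): the empty string reverses to itself;
    rule (2): (xp)^-1 is p followed by the reversal of x, for a single symbol p."""
    if x == "":                       # rule (1)
        return ""
    return x[-1] + reverse(x[:-1])    # rule (2)
```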

Homework

5.12 Let L4 be the set of strings defined inductively by:


(1) Λ ∈ L4.

(2) If x ∈ L4, then axb ∈ L4.

(3) If x ∈ L4, then bxa ∈ L4.

(4) If x, y ∈ L4, then xy ∈ L4.

Find all the strings in L4 whose length is 4. Show how to reach each of them by a sequence of rule-applications.

There is a very simple description of L4; can you guess it?

5.13 Define inductively each of the following languages. Use the indicated names in the definition.

L5: The set of all a^n b^n, where n = 0, 1, . . .

L6: The set of all a^m b a^n, where 0 ≤ n < m.

L7: The set of all a^n b^n c^n, where n = 1, 2, . . ..

PAL: The set of all palindromes over the alphabet {a, b}. A palindrome is a string x such that x^{-1} = x.

Proofs by Induction on Strings

Since sets of strings are often characterized inductively, the technique of inductive proofs comes in handy. The proofs fall under the general scheme of IN1 and IN2, given in 5.2.2. Here is an example that underlies string-processing techniques. L3 is the matching-parentheses set defined above (with a and b in the role of left and right parentheses).

Claim: For every x in L3 the following is true: in every prefix of x, the number of a's is greater than or equal to the number of b's.

Proof: First consider the strings that are put in L3 by the base rule (1) of the definition. The rule puts in L3 the single string ab. The prefixes of this string are:

Λ, a, ab

and the claim in this case is obviously true.

Next we have two inductive rules (2) and (3). Accordingly, we have to show:

(C2) If, in every prefix of x, the number of a's is not smaller than the number of b's, then this is also true for every prefix of axb.


(C3) If, in every prefix of x and in every prefix of y, the number of a's is not smaller than the number of b's, then this is also true for every prefix of xy.

In showing this we shall use certain obvious properties of prefixes (themselves provable from the definition of prefix):

(a) Every prefix of axb is either Λ, or au, where u is some prefix of x, or axb.

(b) Every prefix of xy is either a prefix of x, or xv, where v is a prefix of y.

(C2) now follows easily from (a), and (C3) from (b). Consider, for example, (C2). If u is a prefix of x, and if m and n are, respectively, the numbers of a's and b's in it, then:

(i) In au, the numbers of a’s and b’s are m+1 and n.

(ii) In axb, the numbers of a’s and b’s are m+1 and n+1.

By our assumption (the inductive hypothesis) m ≥ n. Therefore m+1 > n, and m+1 ≥ n+1. The proof of (C3) is similar.

QED
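The property stated in the Claim is easy to check mechanically: scan the string left to right, keeping a running count of a's minus b's, and reject as soon as the count goes negative. A Python sketch (ours), with '(' in the role of a and ')' in the role of b:

```python
def prefix_property(s):
    """Check the Claim: in every prefix of s, the number of '(' (our a)
    is greater than or equal to the number of ')' (our b)."""
    count = 0
    for ch in s:
        count += 1 if ch == "(" else -1
        if count < 0:        # some prefix has more b's than a's
            return False
    return True
```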

Homework 5.14 Prove the following by induction:

1. Every string in the set L1 (defined above) is of the form a^n b c^n, where n = 0, 1, . . ..

2. In every string of L2, the number of c’s is twice the number of a’s.

3. In every string of L3 the number of a’s is the same as the number of b’s.

5.2.4 Simultaneous Induction

The technique of inductive definitions can be applied to define several sets and/or relations at one go, i.e., by one set of conditions. Consider, for example, the following rules, which define, simultaneously, two sets of natural numbers E and O.

(1) 0 ∈ E.

(2) If x ∈ E, then x+1 ∈ O.

(3) If x ∈ O, then x+1 ∈ E.

It is not difficult to see that E and O are the sets of even and odd numbers.
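Restricted to an initial segment of the natural numbers, the three rules can be run to a fixed point. The following Python sketch (ours) applies rules (1)-(3) until nothing new below the bound is added:

```python
def generate_E_O(bound):
    """Apply the three rules until no new number <= bound can be added."""
    E, O = {0}, set()                       # rule (1): 0 is in E
    changed = True
    while changed:
        changed = False
        for x in list(E):
            if x + 1 <= bound and x + 1 not in O:
                O.add(x + 1)                # rule (2)
                changed = True
        for x in list(O):
            if x + 1 <= bound and x + 1 not in E:
                E.add(x + 1)                # rule (3)
                changed = True
    return E, O
```

Running it with bound 10 produces exactly the even and odd numbers up to 10.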


There are other pairs of sets that satisfy the clauses just given, for example the pair in which both sets consist of all the natural numbers. But the pair (E, O) is smallest, in the following sense: if (E′, O′) is any pair of sets satisfying the three clauses (with ‘E’ and ‘O’ replaced by ‘E′’ and ‘O′’), then E ⊆ E′ and O ⊆ O′.

If several sets (and relations) are defined inductively by a single set of rules, we say that they are defined simultaneously. These definitions are an extremely powerful tool.

In the following example simultaneous induction is used to define a toy language, which is a fragment of English. We let Σ be the set consisting of eight English words:

Jack, Jill, the, person, who, liked, saved, hated

Since each of these is itself a string of more basic elements (the letters), we leave spaces between the concatenated English words in order to ensure a unique and easy reading. We now define by simultaneous induction three subsets of Σ∗, denoted as:

NP, VP, S

As it will turn out, NP is a set of noun-phrases, VP is a set of verb-phrases, and S is a set of sentences.

(1) Jack, Jill ∈ NP.

(2) If x ∈ NP, then each one of the following strings is in VP:

liked x, saved x, hated x

(3) If x ∈ VP, then the following string is in NP:

the person who x

(4) If x ∈ NP and y ∈ VP, then xy ∈ S.

Applying these rules, we find, for example, that

the person who liked Jill saved the person who hated Jack

is in S. Here is the proof:

(i) Jack and Jill are in NP, by (1).

(ii) liked Jill and hated Jack are in VP, by (2).


(iii) the person who liked Jill and the person who hated Jack are both in NP, by (3).

(iv) saved the person who hated Jack is in VP, by (2).

(v) the person who liked Jill saved the person who hated Jack is in S, by (4).
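The derivation (i)-(v) can be mimicked by generating NP, VP, and S in rounds, each round applying rules (1)-(4) once. A Python sketch (ours; the function name is invented):

```python
def generate(rounds):
    """Build the three sets simultaneously by iterating rules (1)-(4)."""
    NP = {"Jack", "Jill"}                                   # rule (1)
    VP, S = set(), set()
    for _ in range(rounds):
        VP |= {v + " " + x for v in ("liked", "saved", "hated")
               for x in NP}                                 # rule (2)
        NP |= {"the person who " + x for x in VP}           # rule (3)
        S |= {x + " " + y for x in NP for y in VP}          # rule (4)
    return NP, VP, S
```

Two rounds already suffice to reach the sentence derived above.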


Chapter 6

The Sentential Calculus

6.0

Returning in this chapter to sentential logic, we set up a rigorously defined formal language, based on an infinite sequence of atomic sentences and on the sentential connectives. We shall define the concept of an interpretation of this language, based on which we shall define, for that language, logical truth, logical falsity, and logical implication. We shall also investigate additional topics: disjunctive and conjunctive normal forms, truth-functions, and the expressiveness of sets of connectives.

In the second part of this chapter the fundamental concept of a formal deductive system is defined, Hilbert-type and Gentzen-type systems for sentential logic are given, and their basic features are established. The two fundamental criteria that relate the syntax to the semantics, soundness and completeness, are defined, and the systems presented are shown to satisfy them.

6.1 The Language and Its Semantics

6.1.0

So far, we have assumed that our language is provided with certain sentential operations: negation, conjunction, and other connectives; that its sentences are generated from certain atomic sentences; and that certain general conditions hold. We shall now show how to define, with full formal rigour, a language that satisfies these assumptions.

The language is built bottom-up, from a given set of atomic sentences; that is, all other sentences are generated from them by repeated applications of the sentential connectives. The latter are defined in a way that ensures unique readability; i.e., there is exactly one decomposition of a non-atomic sentence into components and no atomic sentence is decomposable (the detailed requirements were listed in 2.3).

6.1.1 Sentences as Strings

There are many ways of setting up the language so as to satisfy the required properties. The choice of a particular definition is a question of convenience. For example, one can define sentences to be certain labeled trees (cf. 2.4). The most common way, however, is to define them, and linguistic constructs in general, as strings over some given set of symbols; that is, they are either members or finite sequences of members of that set (cf. 5.2.3). This still leaves us with a large degree of freedom. Here is one way of defining the language.

The set of symbols, referred to as the alphabet, consists of the following distinct members:

A1, A2, . . . , An, . . . , ¬¬, ∧∧, ∨∨, →→, ↔↔

The Ai's are the atomic sentences. There is an infinite number of them. The other five symbols are the connective letters: the negation letter ¬¬, the conjunction letter ∧∧, and so on. As usual, we assume that none of the alphabet members is a finite sequence of other members (cf. 5.2.3).

The difference between the connective letters ¬¬, ∧∧, ∨∨, →→, ↔↔, and the metalinguistic symbols we have been using, ‘¬’, ‘∧’, ‘∨’, ‘→’, ‘↔’, is that the former are simply symbols occurring in the strings that constitute the sentences, while the latter are names of certain syntactic operations. The different font is used to make this clear. The two are of course related: the operations are defined by concatenation of the strings with the corresponding connective letters.

The set of all sentences is the smallest set of strings, S, satisfying:

(i) Ai ∈ S, for all i = 1, 2, . . .

(ii) If x, y ∈ S, then:

(ii.1) ¬¬x ∈ S

(ii.2) ∧∧xy ∈ S

(ii.3) ∨∨xy ∈ S

(ii.4) →→xy ∈ S

(ii.5) ↔↔xy ∈ S

Page 216: A Course in Symbolic Logic

6.1. THE LANGUAGE AND ITS SEMANTICS 199

Here, and in this section only, ‘x’, ‘y’, ‘z’, ‘x′’, etc., are variables ranging over strings.

Note: The sentences of the language are constructed along the lines of the Polish notation. Had we used an infix convention, we should have included two additional symbols of left and right parenthesis. The choice of Polish notation helps to distinguish the given formal language from our metalanguage, where the infix notation is used.

The sentential operations are now defined as follows. (The sign ‘=df’ should be read: ‘equal by definition’.)

¬x (the negation of x) =df ¬¬x

x ∧ y (the conjunction of x and y) =df ∧∧xy

x ∨ y (the disjunction of x and y) =df ∨∨xy

x → y (the conditional of x and y) =df →→xy

x ↔ y (the biconditional of x and y) =df ↔↔xy

These definitions imply, for example, the equalities:

A6 ∨ ¬A3 = ∨∨A6¬¬A3

(A1 → A4) ∧ A2 = ∧∧→→A1A4A2

A1 → A4∧A2 = →→A1∧∧A4A2

In the last equality, the grouping of the left-hand side is obtained via our grouping conventions.

Given the previous definition of sentences, it is obvious that the set of sentences is closed under applications of the connectives: the negation of a sentence is a sentence, the conjunction of two sentences is a sentence, etc. It remains to show that unique readability obtains. This comes to the following claims.

(I) No atomic sentence, Ai, is of the form ¬¬x, or cxy, where x and y are sentences and c is a binary connective letter.

(II) No sentence ¬¬x is of the form cyz, where c is a binary connective letter and y and z are sentences.

(III) If x and x′ are sentences and ¬¬x = ¬¬x′, then x = x′.

(IV) For all sentences x, y, x′, y′, if c and c′ are binary connective letters, then:

cxy = c′x′y′, only if c = c′, x = x′, and y = y′


(I) follows trivially from the assumption that no member of the alphabet is equal to a sequence of other members. (II) and (III) are trivial as well: since ¬¬x and cyz are strings that start with different symbols, namely ¬¬ and c, they cannot be equal. And if ¬¬x = ¬¬x′, then the strings obtained from them by deleting the first member are equal. Also trivial is the first part of (IV): if cxy = c′x′y′, then their first members, c and c′, must be equal. So far, we have used only general properties of strings.

The claim that is far from obvious is that if cxy = cx′y′, where x, x′, y, y′ are sentences, then x = x′ and y = y′. Now, if cxy = cx′y′, then xy = x′y′ (because xy and x′y′ are obtained from cxy and cx′y′ by deletion of the first member). Hence, we have to prove:

For all sentences x, x′, y, y′,

xy = x′y′ =⇒ x = x′ and y = y′

The proof is based on a method that enables one to determine, by easy counting, the scopes of the connectives.

The Symbol-Counting Method: Let us associate with every symbol, a, of our alphabet an integer, ν(a), as follows:

For all Ai, ν(Ai) = 1; ν(¬¬) = 0; and ν(c) = −1, for every binary connective letter c. Now let x be any non-empty string of length n, say x = a1a2 . . . an. With every occurrence of a symbol in x, let its count number (relative to x) be the sum of all integers associated with it and with the preceding occurrences in x. For the ith occurrence the count number is:

ν(a1) + ν(a2) + . . .+ ν(ai)

Here is an illustration, where the string is the sentence

(A2 ∨ ¬A1) ∧ (A4 → (¬(A7 → A1))), that is: ∧∧∨∨A2¬¬A1→→A4¬¬→→A7A1

This sentence is written below, in spaced form, on the first line; the numbers associated with the symbols are written below it, and below them, the count numbers.

∧∧    ∨∨    A2    ¬¬    A1    →→    A4    ¬¬    →→    A7    A1
−1    −1     1     0     1    −1     1     0    −1     1     1
−1    −2    −1    −1     0    −1     0     0    −1     0     1

Main Claim: For every sentence a1 . . . an, the count number of the last occurrence is 1, and all other count numbers are < 1.

Proof: By induction on the strings. We show that (i) the claim holds for atomic sentences; (ii) if the claim holds for a sentence x, it holds also for ¬¬x; and (iii) if it holds for the sentences x and y, then it holds also for cxy, where c is any binary connective letter.


(i) is obvious. (ii) is easy: since ν(¬¬) = 0, the count number of an occurrence in x, relative to x, is the same as the count number of that occurrence relative to ¬¬x. To show (iii), note that, since ν(c) = −1, the count numbers of occurrences in x, relative to cxy, are smaller by 1 than their numbers relative to x. Assuming that the claim holds for x, it follows that, relative to cxy, the last occurrence in x has count number 0 and all preceding occurrences have count numbers < 0. Hence, adding cx as a prefix does not affect the count numbers of occurrences in y: the number for each occurrence in y, relative to cxy, is the same as its number relative to y. Assuming the claim for y, it follows that, relative to cxy, the count number of the last occurrence is 1 and all other occurrences in y have numbers < 1. QED

This claim implies that no sentence is a proper prefix of another sentence:

If y = xz and x and y are sentences, then y = x (i.e., z is the empty string).

Proof: The last occurrence in x has count number 1 relative to x. Since y = xz, it has the same count number relative to y; since the last occurrence in y is the only one that has (relative to y) count number 1, the last occurrence in x is also the last occurrence in y. Which implies x = y.

To complete the proof of unique readability, assume that xy = x′y′, and let m and m′ be, respectively, the lengths of x and x′. If m < m′, then x is a proper prefix of x′, which is impossible, since both are sentences. For the same reason we cannot have m′ < m. Hence m = m′, implying x = x′. Since y and y′ are then obtained by deleting the first m members from the same string, we have y = y′. This concludes the proof.

Note: Count numbers, as we have defined them, give us a way of finding, given any sentence, its immediate components. Suppose that x is a sentence whose leftmost symbol, c, is a binary connective letter. Then x = cyz, where y and z are uniquely determined sentences. To find them, compute, by summing from left to right, all the count numbers in x. The first occurrence that has count number 0 (or, equivalently, count number 1 relative to the string obtained by deleting the leftmost c) is the last occurrence in y. The remainder of the string (which comes after cy) is z. This method also leads to a procedure for determining whether a given string is a sentence, and provides a parsing if it is.
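The procedure in this Note is easily programmed. In the sketch below (ours), a sentence is a list of tokens; the invented token names "neg", "and", "or", "imp", "iff" stand for the five connective letters, and atoms are "A1", "A2", etc. The function ν is as in the Symbol-Counting Method:

```python
def nu(token):
    """The function nu: atoms get 1, the negation letter 0, binary letters -1."""
    if token.startswith("A"):
        return 1
    return 0 if token == "neg" else -1

def split_binary(tokens):
    """Given a sentence of the form c y z (c a binary connective letter),
    recover its immediate components y and z by the counting method."""
    total = -1                               # nu of the leading binary letter
    for i, t in enumerate(tokens[1:], start=1):
        total += nu(t)
        if total == 0:                       # last occurrence of y reached
            return tokens[1:i + 1], tokens[i + 1:]
    raise ValueError("not a sentence of the form c y z")
```

Applied to the illustration above, ∧∧∨∨A2¬¬A1→→A4¬¬→→A7A1 splits into the components ∨∨A2¬¬A1 and →→A4¬¬→→A7A1.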

Homework 6.1 Consider another way of constructing the language. Here (( and )) are additional alphabet members, functioning as left and right parentheses. The clauses for non-atomic sentences are:

If x and y are sentences, then

¬¬((x)) is a sentence.

((x))c((y)) is a sentence, for each binary connective letter c.

Unique readability is proved via the following counting method. Associate with the left parenthesis the number −1, with the right parenthesis the number 1, and with all the other alphabet symbols the number 0. Count numbers are defined as above, by summing from left to right the associated numbers.

Prove, by induction, that in every sentence the last occurrence has count number 0 and all the other count numbers are ≤ 0. Deduce from this that (i) if x is a sentence, then in ((x)) all occurrences except the last have count numbers < 0, and (ii) if x and y are sentences, then, in ((x))c((y)), there are exactly two occurrences of parentheses with count number 0: the last one and the one that closes ((x)). Deduce from this the unique readability property.

Note: We have presupposed an alphabet with an infinite number of symbols, which function as basic units. These can be constructed from a finite number of other units. For example, they can be strings of 0's and 1's:

¬¬ = 10,  ∧∧ = 110,  ∨∨ = 1110,  →→ = 11110,  ↔↔ = 111110

A1 = 1111110, A2 = 11111110, A3 = 111111110, . . . , An = 1^{5+n}0, . . .

In each of these strings 0 serves to mark the end. If x1 . . . xm = y1 . . . yn, where all the xi's and yj's are strings of the form 1^k 0, k > 0, then m = n and xi = yi, for all i = 1, . . . , n. Hence, concatenations of such strings of 0's and 1's are uniquely decomposable as concatenations of our alphabet symbols. For example, the string

111011111110101101111111101111110

is the sentence: ∨∨A2¬¬∧∧A3A1
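Decoding such strings is a one-pass scan: count 1's until a 0 closes the block. A Python sketch (ours; the token names are invented stand-ins for the connective letters):

```python
CODES = {1: "neg", 2: "and", 3: "or", 4: "imp", 5: "iff"}   # 1^k 0 for k = 1..5

def decode(bits):
    """Split a 0/1 string into 1^k 0 blocks: k <= 5 is a connective letter,
    k >= 6 is the atom A_(k-5)."""
    symbols, k = [], 0
    for b in bits:
        if b == "1":
            k += 1
        else:                                # a 0 closes the current block
            symbols.append(CODES.get(k, "A" + str(k - 5)))
            k = 0
    if k:
        raise ValueError("trailing 1's: not a concatenation of 1^k 0 blocks")
    return symbols
```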

From now on we shall ignore the specific nature of the sentences. We require only that sentences be generated from the atoms by applying connectives and that unique readability hold. We do not have any further use for the connective letters of the language. We employ, as before, the symbols ‘¬’, ‘∧’, ‘∨’, ‘→’, ‘↔’ for the sentential operations. We also employ, as we did before,

‘A’, ‘B’, ‘C’, ‘A′’, ‘B′’, ‘C′’, ‘A1’, ‘B1’, ‘C1’, . . . etc.

as variables ranging over sentences. The only new pieces in our stock are the atomic sentences A1, A2, A3, etc. Do not confuse them with sentential variables!

6.1.2 Semantics of the Sentential Calculus

Let SC be the language of the sentential calculus, as just defined.

So far, the definitions have been purely syntactic. The language is defined as a system of uninterpreted constructs. An interpretation, which reads the language as being about something else, is, as explained in 2.5, the concern of the semantics. Since our treatment of formal languages is general, we shall not be concerned with one particular interpretation, but with the class of possible interpretations.

Usually, an interpretation determines how extralinguistic entities (objects, relations, or properties) are correlated with linguistic items. When it comes to SC, the only linguistic items are sentences. The truth-values of the atomic sentences determine the values of all the other sentences. We therefore take assignments of truth-values to atomic sentences as our possible interpretations:

An interpretation of SC is a function, σ, defined for all atoms, such that, for each Ai, σ(Ai) is a truth-value.

When we come to first-order languages, we shall encounter richer and more familiar types of interpretations.

We shall refer to an interpretation of SC as a truth-value assignment, or assignment for short, and we shall use

‘σ’, ‘τ’, ‘σ′’, ‘τ′’, etc.

as variables ranging over assignments.

An interpretation, σ, determines a unique assignment of truth-values to all the sentences of SC: the value of each atom is the value assigned to it by σ, and the values of the other sentences are determined by the usual truth-table rules. Spelled out in detail, this amounts to an inductive definition:

For Atoms: If A is an atom, A gets σ(A).

For Negations:

If A gets T, ¬A gets F.

If A gets F, ¬A gets T.

For Conjunctions:

If A gets T and B gets T, A ∧B gets T.

If A gets F, A ∧B gets F.

If B gets F, A ∧B gets F.

...

And so on for each of the connectives.


Obviously, the set of all sentences that get a truth-value by virtue of these rules contains all the atoms and is closed under connective-applications. Hence every sentence gets a truth-value. Moreover, every sentence gets no more than one truth-value. This follows, again by induction, by showing that the set of all sentences that get unique values contains all atoms and is closed under connective-applications. Here we have to use the unique readability:

An atom cannot get a value by virtue of any rule other than the rule for atoms, because an atom is not a sentential compound.

Next, assume that A gets a unique value. Then, since ¬A is not of the form B ∗ C, where ∗ is a binary connective, and since ¬A = ¬A′ only if A = A′, it follows that ¬A can get a value only through the rule for negations, and that this value is uniquely determined by the value of A.

The same kind of argument applies to every other connective.

We can therefore speak of the value of a sentence A under the assignment σ. Let us denote this as:

valσ(A)
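The inductive definition of valσ is, in effect, a recursive program. In the sketch below (ours), a sentence is either an atom (a string) or a tuple whose first entry names the connective, and σ is a dictionary from atoms to booleans, with True playing the role of T:

```python
def val(sentence, sigma):
    """Compute val_sigma(A) by recursion on the structure of A."""
    if isinstance(sentence, str):            # the rule for atoms
        return sigma[sentence]
    op = sentence[0]
    if op == "not":                          # the rule for negations
        return not val(sentence[1], sigma)
    a = val(sentence[1], sigma)
    b = val(sentence[2], sigma)
    if op == "and":
        return a and b
    if op == "or":
        return a or b
    if op == "imp":
        return (not a) or b
    if op == "iff":
        return a == b
    raise ValueError("unknown connective: " + op)
```

Unique readability is what guarantees that this recursion is well defined: each non-atomic sentence decomposes in exactly one way.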

Note: The atoms are treated as being completely independent: the truth-value of one is not constrained by the values of the others. Dependencies between atoms can be introduced by restricting the class of possible interpretations. The restrictions can be expressed by stipulating that certain sentences must get the value T. You can think of them as extra logical axioms. For example, the restriction that it is impossible for both A1 and A2 to be true amounts to stipulating that ¬(A1 ∧ A2) gets T.

(Some restrictions cannot be expressed in this form; for example, the restriction that only a finite number of atomic sentences get T. But any restriction that involves a finite number of atoms can be thus expressed.)

Our previous semantic notions can be now characterized in these terms:

• A ≡ B, just when, for all σ, valσ(A) = valσ(B).

• A is a tautology, just when, for all σ, valσ(A) = T.

• A is a contradiction, just when, for all σ, valσ(A) = F.

• Γ |= A, just when there is no σ such that, for all B in Γ, valσ(B) = T and valσ(A) = F.

Our previous methods for establishing logical equivalence and logical implications relied only on the general features of the language and the connectives. Therefore they apply as before:


All the general equivalences, simplification methods, and proof techniques of the previous chapters apply, without change, when the sentential variables range over the sentences of SC.

On the other hand, with the sentences completely specified, we can now prove that particular sentences are not tautologies, or not contradictions, or are not logically implied by other sentences. For example, if A, B, and C are different atoms, then

A ∨B, A→ C 6|= B → C

For let σ be the assignment such that σ(A) = F, σ(B) = T, σ(C) = F; then

valσ(A ∨ B) = valσ(A → C) = T, but valσ(B → C) = F.

With A, B, and C unspecified we can only claim that the implication claim need not hold in general. The counterexamples constructed in chapter 4 can be turned into counterexamples concerning specific sentences, by assuming that each sentential variable has a distinct atomic sentence as value.

Note: The value of A under the interpretation σ depends only on the values of the atoms that are components of A: if σ and τ assign the same values to all atomic components of A, then

valσ(A) = valτ(A) .

Hence, as far as a particular sentence is concerned, we have to consider only assignments defined for its atomic components. And if we are concerned with a finite number of sentences, we have to consider only a finite number of atoms.

Truth tables can serve to show how a sentence fares under different assignments. A truth table for a given sentence should have a column for each of its atoms. The rows represent the different assignments; the value of the sentence is given in its column. When several sentences are compared by means of truth tables, their tables should be incorporated into a single table that has a column for each atom occurring in any of the sentences.

Logical equivalence may hold between sentences with different atoms. For example:

A3 → [A1 ∧ (A5 ∨ ¬A5)] ≡ [A3 ∨ (A4 ∧ ¬A4)] → A1

Note: The notion of duality (cf. 2.5.3) can now be defined for specific sentences. Consider sentences built from atoms using only negation, conjunction, and disjunction. Apply the definition given in 2.5.3, assuming that the sentential variables denote distinct atoms. Alternatively, it can be defined inductively as follows, where ‘A^d’ denotes the dual of A.

(i) If A is an atom, then A^d = A

(ii) (¬A)^d = ¬(A^d)


(iii) (A ∧ B)^d = A^d ∨ B^d

(iv) (A ∨ B)^d = A^d ∧ B^d

6.1.3 Normal Forms, Truth-Functions and Complete Sets of Connectives

A literal is a sentence which is either an atom or a negation of an atom.

Definition: A sentence is in disjunctive normal form, abbreviated DNF, if it is a disjunction of conjunctions of literals. For example, the following is in DNF:

(¬A3∧¬A4∧A5) ∨ A2 ∨ (A3∧A6)

A sentence is in conjunctive normal form, abbreviated CNF, if it is a conjunction of disjunctions of literals. For example:

(A5∨¬A1) ∧ (A5∨A6∨A7) ∧ (A2∨A3∨A4) ∧ ¬A3

Note: Every literal is both a conjunction of literals (namely, a conjunction with one conjunct) and a disjunction of literals (namely, a disjunction with one disjunct). A disjunction of literals, say

A1 ∨ A2 ∨ A3 ∨ ¬A4

is in DNF, because it is a disjunction of conjunctions of literals (where every conjunction consists of one literal). It is also in CNF, because it is a conjunction of disjunctions of literals (namely, a conjunction with one conjunct). In a similar way,

A1 ∧ A2 ∧ A3 ∧ ¬A4

is both in CNF and in DNF.

An equivalent characterization of DNF and CNF is:

A sentence A is in DNF iff:

(i) A is constructed from atoms using no connective other than ¬, ∧, ∨.

(ii) The scope of every negation is an atom.

(iii) The scope of every conjunction does not contain any disjunction.

A sentence is in CNF iff it satisfies (i) and (ii), and


(iii′) The scope of every disjunction does not contain any conjunction.

Theorem: For every sentence, A, there is a logically equivalent sentence in DNF, and there is a logically equivalent sentence in CNF.

Here is a way to convert any given sentence to an equivalent sentence in DNF.

(I) Eliminate → and ↔, by expressing them in terms of ¬, ∧, ∨. Get in this way a logically equivalent sentence that involves only ¬, ∧, and ∨.

(II) Push negation all the way in, cancelling double negations, until negation applies only to atomic sentences.

(III) Push conjunction all the way in, by distributing conjunction over disjunction, until no disjunction is within the scope of a conjunction.

To get an equivalent CNF, apply steps (I) and (II) but instead of (III) use:

(III′) Push disjunction all the way in, by distributing disjunction over conjunction, until no conjunction is within the scope of a disjunction.

As you carry out these steps you can, of course, simplify as the occasion arises, dropping redundant conjuncts or disjuncts, or using established equivalences (e.g., replacing A ∨ (¬A∧B) by the equivalent A ∨ B).

Example: Assuming A, B, C to be atoms, the following are the stages of a possible conversion of

[(A∧C) → (B ↔ C)] ∧ ¬[(A∧C) ∨ ¬B]

into an equivalent DNF:

1. {¬(A∧C) ∨ [(B∧C) ∨ (¬B∧¬C)]} ∧ ¬[(A∧C) ∨ ¬B]

2. [¬A ∨ ¬C ∨ (B∧C) ∨ (¬B∧¬C)] ∧ [¬(A∧C) ∧ ¬¬B]

3. [¬A∨¬C ∨ (B∧C)] ∧ [(¬A∨¬C)∧B]   (¬B∧¬C is a redundant disjunct, because we have the disjunct ¬C)

4. [¬A ∨ ¬C ∨ (B∧C)] ∧ [(¬A∧B) ∨ (¬C∧B)]

5. {[¬A ∨ ¬C ∨ (B∧C)]∧¬A∧B} ∨ {[¬A ∨ ¬C ∨ (B∧C)]∧¬C∧B}

6. (¬A∧B)∨ (¬C∧¬A∧B)∨ (B∧C∧¬A)∨ (¬A∧¬C∧B)∨ (¬C∧B)∨ (B∧C∧¬C)

7. (¬A∧B) ∨ (¬C∧B)
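Since only the atoms A, B, C occur, the correctness of the conversion can be verified by brute force: evaluate the original sentence and the final DNF under all eight assignments. A Python sketch (ours):

```python
from itertools import product

def equivalent():
    """Compare [(A&C) -> (B <-> C)] & ~[(A&C) v ~B] with (~A&B) v (~C&B)
    under every assignment to A, B, C (True for T, False for F)."""
    for A, B, C in product([True, False], repeat=3):
        original = (not (A and C) or (B == C)) and not ((A and C) or not B)
        dnf = ((not A) and B) or ((not C) and B)
        if original != dnf:
            return False
    return True
```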


In getting the CNF, steps 1-3 are the same; from 3 on we can proceed:

3. [¬A ∨ ¬C ∨ (B∧C)] ∧ [(¬A∨¬C) ∧ B]

4′. [¬A ∨ [(¬C∨B) ∧ (¬C∨C)]] ∧ (¬A∨¬C) ∧ B

5′. (¬A∨¬C∨B) ∧ (¬A∨¬C) ∧ B

6′. (¬A∨¬C) ∧ B   (both ¬A∨B and ¬A∨¬C∨B are redundant in the presence of B)

Note that in this particular case we could have gotten the CNF from the DNF by pulling out B, or the DNF from the CNF by simple distribution of conjunction. But in general the two forms are not as simply related.

A sentence in DNF is true just when some conjunction in it is true. Hence, this form shows clearly the interpretations under which the sentence is true. Consider, for example,

(A1∧A2) ∨ (¬A1∧A3) ∨ (A2∧A3) ∨ (¬A1∧¬A2)

This sentence is true just when:

(A1 and A2 are true) or (A1 is false and A3 is true) or (A2 and A3 are true) or (A1 and A2 are false).

Note that not all possibilities here are exclusive; if A1, A2, and A3 are all true, both the first and third alternatives hold.

A sentence in DNF is false just when all its disjuncts are false. For example, our last sentence is false just when:

(A1 is false or A2 is false) and (A1 is true or A3 is false) and (A2 is false or A3 is false) and (A1 is true or A2 is true).

A CNF indicates the cases of truth and falsity in a dual way. Thus,

(A1∨A2) ∧ (¬A1∨A3) ∧ (A2∨A3) ∧ (¬A1∨¬A2)

is true, just when:

(A1 is true or A2 is true) and (A1 is false or A3 is true) and (A2 is true or A3 is true) and (A1 is false or A2 is false).

And the sentence is false just when:


(A1 and A2 are false) or (A1 is true and A3 is false) or (A2 and A3 are false) or (A1 and A2 are true).

Note: A sentence can have many equivalent DNF's (or CNF's). For example, A1 ∨ (¬A1∧A2) and A2 ∨ (A1 ∧ ¬A2) are equivalent sentences in DNF. They are equivalent to A1 ∨ A2. If you replace them by their duals, you will get an analogous situation for CNF's.

Homework 6.2 Find, for each of the following sentences, equivalent sentences in DNF and CNF, as short as you can. Assume that A, B, C, D are atoms.

1. (A→ B) ∧ (B → A)

2. (A∨B ↔ C∨D) ∧ (C ↔ D)

3. ((A → B) → C) → C

4. (A ∨ B) ∧ (¬C ∨ ¬D)

5. (A ∧ B) ∨ (¬C ∧ D)

6. ¬[A ∨ (B ∧ C) ∨ (C ∧ D)]

7. (A∧B → C∧D) ∧ (C ∨ ¬D)

8. ((A → C) ∨ (C → D)) → (B → A)

9. (A∧B → B∧A) ∨ C

10. (A ∨ B) ∧ (¬A ∨ ¬B) ∧ (¬A ∨ B) ∧ (A ∨ ¬B)

Expressing Truth-Functions by Sentences

Definition: An n-ary truth-function is a function defined for all n-tuples of T's and F's, which assigns to every n-tuple a truth-value (either T or F).

Here, for example, is a ternary truth-function f :

f(T,T,T) = F
f(T,T,F) = T
f(T,F,T) = F
f(T,F,F) = F
f(F,T,T) = T
f(F,T,F) = F
f(F,F,T) = T
f(F,F,F) = T


The n-tuples of truth-values correspond exactly to the rows in a truth-table based on n atomic sentences, provided that we choose a matching of the atoms with the coordinates. The tuple (x1, x2, . . . , xn) corresponds to the row in which x1 is assigned to the first atom (the atom matched with the first coordinate), x2 is assigned to the second atom, and so on.

Now assume that A1, A2, . . . , An are n distinct atoms and that we agree that Ai, for i = 1, . . . , n, is matched with the ith coordinate. For each n-tuple of truth-values (x1, x2, . . . , xn), let the assignment represented by (x1, x2, . . . , xn) be the assignment that assigns x1 to A1, x2 to A2, ..., xn to An. Then each sentence, A, whose atomic components are among A1, . . . , An, defines an n-ary truth-function, fA:

fA(x1, . . . , xn) = the value of A, under the assignment represented by (x1, . . . , xn).

The values of the function fA are given in A’s column, in the truth-table based on the atoms A1, . . . , An.

Example: It is not difficult to see that the ternary truth-function given above is the function defined by the sentence

¬A1 ↔ (A2 → A3)
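The claim is easy to confirm exhaustively. Here is a small Python check (a sketch; the encoding of T and F as Python booleans is our own) that the tabulated function f agrees with this sentence on all eight triples:

```python
from itertools import product

# The tabulated ternary truth-function (T = True, F = False).
f_table = {
    (True, True, True): False,  (True, True, False): True,
    (True, False, True): False, (True, False, False): False,
    (False, True, True): True,  (False, True, False): False,
    (False, False, True): True, (False, False, False): True,
}

def sentence(a1, a2, a3):
    # ¬A1 ↔ (A2 → A3): '↔' becomes '==', and 'A → B' becomes '(not A) or B'.
    return (not a1) == ((not a2) or a3)

# The sentence defines exactly the function f.
assert all(sentence(*xs) == f_table[xs]
           for xs in product([True, False], repeat=3))
```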

Note: If Ai does not occur in A, then the ith argument has no effect on the value of fA. For example, if n = 2 and A = ¬A2, then, under our definition, fA is a two-place function whose value for (x1, x2) is obtained by toggling x2. If A contains k atoms, then for every n ≥ k and every matching of the k atoms with coordinates from 1, 2, . . . , n, there is an n-ary truth-function defined by A.

Theorem: Every truth-function is defined by some sentence.

This is sometimes expressed by saying that every truth-table is a truth-table of some sentence. The proof will show how to construct the sentence, given the truth-table.

Proof: Let f be an n-ary truth-function. Fix n distinct atomic sentences A1, . . . , An, with Ai corresponding to the ith coordinate, i = 1, . . . , n.

If there is no n-tuple for which the value of f is T, then obviously A1 ∧ ¬A1 defines f. Else, for each i = 1, . . . , n define:

Ai^T =df Ai        Ai^F =df ¬Ai

For every n-tuple of truth-values (x1, . . . , xn), let

C(x1,...,xn) = A1^x1 ∧ A2^x2 ∧ . . . ∧ An^xn

Consider all the tuples (x1, . . . , xn) for which f(x1, . . . , xn) = T (i.e., the rows in the truth-table for which the required sentence should have T). Let A be the disjunction of all the C(x1,...,xn)’s, where (x1, . . . , xn) ranges over these tuples.


A gets T iff one of these disjuncts gets T. But C(x1,...,xn) gets T iff all the conjuncts A1^x1, A2^x2, . . . , An^xn get T, that is, iff A1 gets x1, A2 gets x2, ..., An gets xn. Therefore A gets T iff the assignment is given by one of the tuples for which the value of f is T. The truth-function defined by A coincides with f.

QED

Example: Consider the following truth-table:

A1  A2  A3  A
T   T   T   F
T   T   F   T
T   F   T   F
T   F   F   F
F   T   T   T
F   T   F   F
F   F   T   T
F   F   F   T

There are four rows for which the required sentence, A, should be T. Accordingly, A can be taken as the disjunction of four conjunctions:

(A1∧A2∧¬A3) ∨ (¬A1∧A2∧A3) ∨ (¬A1∧¬A2∧A3) ∨ (¬A1∧¬A2∧¬A3)

Note that the proof of the theorem yields the required sentence in DNF. It is also a new proof that every sentence is equivalent to a DNF sentence.
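The construction in the proof is easy to mechanize. The following Python sketch (function and variable names are our own) builds, for any truth-function given as a Python function, the disjunction of the C(x1,...,xn)’s, and reproduces the DNF above from the example’s table:

```python
from itertools import product

def dnf_from_truth_function(f, n):
    """One conjunction per n-tuple on which f is T, per the proof:
    the literal Ai if xi = T, and ¬Ai if xi = F."""
    disjuncts = []
    for xs in product([True, False], repeat=n):
        if f(*xs):
            lits = [f"A{i+1}" if x else f"¬A{i+1}" for i, x in enumerate(xs)]
            disjuncts.append("(" + "∧".join(lits) + ")")
    # If f is constantly F, fall back to the contradiction A1∧¬A1.
    return " ∨ ".join(disjuncts) if disjuncts else "(A1∧¬A1)"

# The truth-function of the example's table (equivalently, ¬A1 ↔ (A2 → A3)).
f = lambda x1, x2, x3: (not x1) == ((not x2) or x3)
print(dnf_from_truth_function(f, 3))
# (A1∧A2∧¬A3) ∨ (¬A1∧A2∧A3) ∨ (¬A1∧¬A2∧A3) ∨ (¬A1∧¬A2∧¬A3)
```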

A dual construction yields the required sentence in CNF. It is obtained by toggling everywhere T and F, and ∧ and ∨: Consider all tuples (x1, . . . , xn) for which the value of the function is F. If there are none, then the function is defined by A1 ∨ ¬A1. Else, define:

Bi^T =df ¬Ai        Bi^F =df Ai

Let

D(x1,...,xn) = B1^x1 ∨ B2^x2 ∨ . . . ∨ Bn^xn

Then the required CNF is the conjunction of all the D(x1,...,xn)’s such that f assigns to (x1, . . . , xn) the value F.

Example: The CNF obtained for the above-given truth-table is:

(¬A1∨¬A2∨¬A3) ∧ (¬A1∨A2∨¬A3) ∧ (¬A1∨A2∨A3) ∧ (A1∨¬A2∨A3)

Each of the disjunctions corresponds to a row in which the sentence gets F.
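The dual construction mechanizes just as easily; this sketch (names ours) emits one disjunction per F-row and recovers the CNF above:

```python
from itertools import product

def cnf_from_truth_function(f, n):
    """One disjunction per n-tuple on which f is F; toggling the DNF
    recipe, the literal is ¬Ai if xi = T, and Ai if xi = F."""
    conjuncts = []
    for xs in product([True, False], repeat=n):
        if not f(*xs):
            lits = [f"¬A{i+1}" if x else f"A{i+1}" for i, x in enumerate(xs)]
            conjuncts.append("(" + "∨".join(lits) + ")")
    # If f is constantly T, fall back to the tautology A1∨¬A1.
    return " ∧ ".join(conjuncts) if conjuncts else "(A1∨¬A1)"

# Same truth-function as in the DNF example.
f = lambda x1, x2, x3: (not x1) == ((not x2) or x3)
print(cnf_from_truth_function(f, 3))
# (¬A1∨¬A2∨¬A3) ∧ (¬A1∨A2∨¬A3) ∧ (¬A1∨A2∨A3) ∧ (A1∨¬A2∨A3)
```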


Full DNFs and CNFs

Terminology: Given a conjunction, C, of literals, say that an atom occurs positively in C if it is one of the conjuncts, and that it occurs negatively in C if its negation is one of the conjuncts. We speak, accordingly, of positive and negative occurrences of atoms.

Similarly, an atom occurs positively in a disjunction of literals if it is one of the disjuncts, and it occurs negatively if its negation is one of the disjuncts.

Henceforth, we assume that when a sentence is written in DNF no atom occurs both positively and negatively in the same conjunction. For such conjunctions are contradictory and can be dropped. The only exception is when the sentence is contradictory, in which case it reduces to A1 ∧ ¬A1.

Similarly, we assume that in a CNF no atom occurs both positively and negatively in the same disjunction, unless the CNF is a tautology, in which case it reduces to A1 ∨ ¬A1.

We assume, moreover, that there are no repetitions of the same literal in any conjunction (of the DNF), or in any disjunction (of the CNF), and no repeated disjuncts (in the DNF), or repeated conjuncts (in the CNF).

When comparing DNFs (or CNFs) we disregard differences in the order of literals of a conjunction (of a disjunction), and differences in the order of the disjuncts (of the conjuncts).

Definition: A full DNF is one in which every occurring atom occurs in every conjunction. A full CNF is one in which every atom that occurs in it occurs in every disjunction.

Examples: Assuming that A2, A3 and A4 are distinct atoms, the following is a sentence in full DNF:

(A2∧¬A3∧A4) ∨ (¬A2∧¬A3∧A4) ∨ (¬A2∧A3∧A4)

By pulling ¬A3 ∧ A4 out of the first two conjunctions and dropping the resulting redundant conjunct A2 ∨ ¬A2, we see that this sentence is logically equivalent to:

(¬A3∧A4) ∨ (¬A2∧A3∧A4)

which is in DNF but not in full DNF, because A2 occurs in the second conjunction, but not in the first. The sentence is also equivalent to:

(A2∧¬A3∧A4) ∨ (¬A2∧A4)

(can you see how to get it?), which is again in DNF, but not in full DNF.

An example of a full CNF (where the Ai’s are assumed to be atoms) is:

(A1∨¬A2∨A5) ∧ (¬A1∨¬A2∨A5) ∧ (¬A1∨A2∨A5) ∧ (A1∨¬A2∨¬A5)


which is equivalent to:

(¬A2∨A5) ∧ (¬A1∨A2∨A5) ∧ (A1∨¬A2∨¬A5)

(can you see how?), as well as to:

(A1∨¬A2∨A5) ∧ (¬A1∨A5) ∧ (A1∨¬A2∨¬A5)

And this last can be further compressed into:

(A1∨¬A2) ∧ (¬A1∨A5)

All of these are in CNF but not in full CNF.

A sentence in DNF can be expanded into full DNF by supplying the missing atoms. Say that Ai is an atom occurring in some conjunction, but not in the conjunction C. We can replace C by the equivalent:

C ∧ (Ai ∨ ¬Ai)

which, via distributivity, becomes:

C∧Ai ∨ C∧¬Ai

Thus, any disjunct not containing Ai is replaceable by two: one with an additional Ai and one with an additional ¬Ai. Proceeding in this way, we eventually get the full DNF. Obviously, this involves a blow-up of the sentence.
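As a sketch (with our own representation: a conjunction is a set of literal strings), the expansion step can be coded directly:

```python
def expand_to_full_dnf(disjuncts, atoms):
    """Repeatedly replace a conjunction missing an atom Ai by two copies,
    one with Ai added and one with ¬Ai added, as described above."""
    full = [frozenset(c) for c in disjuncts]
    for a in atoms:
        nxt = []
        for c in full:
            if a in c or ("¬" + a) in c:
                nxt.append(c)
            else:
                nxt.append(c | {a})
                nxt.append(c | {"¬" + a})
        full = nxt
    return set(full)

# The earlier example: (¬A3∧A4) ∨ (¬A2∧A3∧A4), atoms A2, A3, A4.
result = expand_to_full_dnf([{"¬A3", "A4"}, {"¬A2", "A3", "A4"}],
                            ["A2", "A3", "A4"])
# Expanding yields the three-conjunction full DNF shown earlier:
# (A2∧¬A3∧A4) ∨ (¬A2∧¬A3∧A4) ∨ (¬A2∧A3∧A4)
assert result == {frozenset({"A2", "¬A3", "A4"}),
                  frozenset({"¬A2", "¬A3", "A4"}),
                  frozenset({"¬A2", "A3", "A4"})}
```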

A similar process works for the full CNF: we replace every disjunction D in which Ai does not occur by:

(D ∨Ai) ∧ (D ∨ ¬Ai)

A full DNF shows us explicitly all the truth-table rows in which the sentence gets T. Each conjunction contributes the row in which every atom occurring positively is assigned T, and every atom occurring negatively is assigned F.

A full CNF shows us, in a dual way, all the rows in which it gets F. Each disjunction contributes the row in which every atom occurring positively gets F, and every atom occurring negatively gets T.

Note: The DNF and CNF constructed in the proof of the last theorem are full. They are obtained by following the prescription just given for correlating conjunctions (in the DNF), or disjunctions (in the CNF), with truth-table rows.

Homework

6.3 Write down sentences B1, B2, B3 and B4 that have the following truth-tables. Write each of B1 and B2 in DNF and in CNF.


A1  A2  A3  B1  B2  B3  B4
T   T   T   T   T   F   T
T   T   F   F   T   T   F
T   F   T   F   F   T   F
T   F   F   T   F   T   F
F   T   T   F   T   F   F
F   T   F   T   T   T   T
F   F   T   T   T   T   F
F   F   F   F   F   T   F

Having written the sentences, see if you can simplify them by pulling out common conjuncts, or common disjuncts, as shown above.

6.4 Write B1 of 6.3 using only ¬ and ∧; B2 using only ¬ and ∨; and each of B2 and B3 using only ¬ and →.

Dummy Atoms: A DNF (or CNF) can contain dummy atoms, i.e., atoms that have no effect on the truth-value. For example, assuming that the Ai’s are atoms, A1 is dummy in:

(A1 ∧ ¬A2 ∧ A3) ∨ (¬A1 ∧ ¬A2 ∧ A3)

That sentence is in fact equivalent to

¬A2 ∧ A3

Note that both sentences are in full DNF.

It can be shown that, in a full non-contradictory DNF, an atom Ai is dummy iff the following holds:

For every conjunction in the DNF there is another conjunction in it that differs from the first only in that Ai occurs positively in one and negatively in the other.

The condition for full non-tautological CNFs is the exact dual of that.
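The criterion can be checked mechanically. A Python sketch (representing a full DNF as a set of frozensets of literal strings; names ours):

```python
def toggle(conj, atom):
    """Flip atom's occurrence in one conjunction of a full DNF."""
    pos, neg = atom, "¬" + atom
    return (conj - {pos} | {neg}) if pos in conj else (conj - {neg} | {pos})

def is_dummy(full_dnf, atom):
    # Ai is dummy iff toggling Ai maps every conjunction to another
    # conjunction of the DNF (the condition stated above).
    return all(toggle(c, atom) in full_dnf for c in full_dnf)

# The earlier example: (A1∧¬A2∧A3) ∨ (¬A1∧¬A2∧A3).
dnf = {frozenset({"A1", "¬A2", "A3"}), frozenset({"¬A1", "¬A2", "A3"})}
assert is_dummy(dnf, "A1")        # A1 is dummy
assert not is_dummy(dnf, "A2")    # A2 and A3 are not
assert not is_dummy(dnf, "A3")
```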

Dummy atoms can be eliminated from a full DNF by pulling out, i.e., by replacing each

(A′1 ∧ . . . ∧ A′i−1 ∧ Ai ∧ A′i+1 ∧ . . . ∧ A′n) ∨ (A′1 ∧ . . . ∧ A′i−1 ∧ ¬Ai ∧ A′i+1 ∧ . . . ∧ A′n)

by the single conjunction

A′1 ∧ . . . ∧ A′i−1 ∧ A′i+1 ∧ . . . ∧ A′n

(where each A′j is a literal).

Applying this process we eventually get an equivalent full DNF without dummy atoms. Concerning such DNFs the following is provable:


Two full non-contradictory DNFs without dummy atoms are logically equivalent iff they are the same (except for rearrangements of literals and disjuncts, and dropping repeated occurrences).

The case of full non-tautological CNFs is the exact dual and we shall not repeat it.

Note: The claims just made are true only under the assumption that the DNFs (or the CNFs) are full. The situation for non-full DNFs (or CNFs) is much more complex and will not be discussed here.

General Connectives and Complete Connective Sets

The essential feature of a connective, which determines all its semantic properties, is its truth-table. Hence, a binary connective ◦ is characterized by the two-argument truth-function defined by

A1 ◦ A2

where A1 and A2 represent, respectively, the first and second coordinates. And a unary connective ◦ is characterized by the truth-function defined by ◦A1.

For any given n-ary truth-function, we can introduce a corresponding n-place connective, one that determines the given function.

There are 16 possible binary truth-functions. This can be seen by noting that the domain of a binary truth-function consists of 4 pairs:

(T,T), (T,F), (F,T), (F,F)

For each pair there are two possible values: T and F. Hence, there are altogether 2·2·2·2 possibilities. (In other words, there are 16 possible columns in a truth-table with two atoms.)
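The count is small enough to verify by brute force; a Python sketch (representation ours):

```python
from itertools import product

rows = list(product([True, False], repeat=2))        # the 4 pairs
# A binary truth-function is one choice of value per pair.
fns = [dict(zip(rows, vals))
       for vals in product([True, False], repeat=4)]
assert len(fns) == 16

# Six of them have a dummy argument (cf. the Note on "degenerate"
# connectives later in this subsection):
def first_dummy(f):
    return all(f[(True, y)] == f[(False, y)] for y in (True, False))

def second_dummy(f):
    return all(f[(x, True)] == f[(x, False)] for x in (True, False))

assert sum(1 for f in fns if first_dummy(f) or second_dummy(f)) == 6
```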

Accordingly, when considering binary connectives we have to consider 16 possibilities. Four of these are used, as primitives, in SC. But it is possible to set up languages based on any set of connectives. We can also consider connectives of higher arity, that is, ones that combine, at one go, more than two sentences.

A set of connectives is called complete if all truth-functions are definable by sentences built by using only connectives from the set.

The theorem proved in the previous subsection shows that ¬, ∧, and ∨ constitute a complete connective set. Since ∨ is expressible in terms of ¬ and ∧ (cf. (5) in 2.2.2), ¬ and ∧ constitute by themselves a complete set. For similar reasons, ¬ together with each of ∨ and → constitutes a complete connective set. Hence, the following sets of connectives are complete.

{¬, ∧} {¬, ∨} {¬, →}
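These completeness claims rest on truth-table identities that are easy to confirm exhaustively; a Python sketch:

```python
from itertools import product

def implies(p, q):
    # A → B rendered as ¬A ∨ B
    return (not p) or q

for a, b in product([True, False], repeat=2):
    # ∨ in terms of ¬ and ∧ (De Morgan):
    assert (a or b) == (not ((not a) and (not b)))
    # ∧ in terms of ¬ and ∨:
    assert (a and b) == (not ((not a) or (not b)))
    # ∨ and ∧ in terms of ¬ and →:
    assert (a or b) == implies(not a, b)          # A∨B ≡ ¬A→B
    assert (a and b) == (not implies(a, not b))   # A∧B ≡ ¬(A→¬B)
```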


Obviously, every set that includes one of them as a subset is complete as well. It can be shown that none of the following is complete.

{∧, ∨, →, ↔}, {¬,↔}

Everything concerning the expressive power of sets of unary and binary connectives is well known. There are exactly two binary connectives that form, each by itself, a complete set. One, known as Sheffer’s stroke and usually denoted ‘|’, is given by the equivalence:

A|B ≡ ¬(A ∧ B)

Sheffer’s stroke is sometimes called nand (not-and). It is also called alternative denial, because

A|B ≡ ¬A ∨ ¬B.

To show that Sheffer’s stroke is complete, it suffices to show that negation and conjunction are expressible by it. Negation is expressible, since

¬A ≡ A|A

We have also:

¬(A|B) ≡ A ∧ B

Expressing the left-hand-side negation in terms of Sheffer’s stroke, we see that conjunction is expressible as well:

(A|B)|(A|B) ≡ A ∧B
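These equivalences are easily machine-checked; a Python sketch with nand standing for Sheffer’s stroke:

```python
from itertools import product

def nand(a, b):               # A|B
    return not (a and b)

for a, b in product([True, False], repeat=2):
    assert nand(a, b) == ((not a) or (not b))          # alternative denial
    assert nand(a, a) == (not a)                       # ¬A ≡ A|A
    assert nand(nand(a, b), nand(a, b)) == (a and b)   # (A|B)|(A|B) ≡ A∧B
```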

The second complete binary connective, whose sign is often ‘↓’, is sometimes called nor, or joint denial. Its truth-table is given by:

A ↓ B ≡ ¬(A ∨ B)

which is equivalent to:

A ↓ B ≡ ¬A ∧ ¬B

Note: Among the sixteen possible binary connectives, six are “degenerate”, i.e., have one or more dummy arguments. These are:

The “tautology connective”, whose truth-value function assigns to all pairs the value T. And the “contradiction connective”, whose truth-value function assigns to all pairs the value F.

The “first-coordinate connective”, whose truth-value function assigns to every pair (x1, x2) the first coordinate x1. And the negated first-coordinate connective, whose truth-value function assigns to every pair the toggled first coordinate.


The “second-coordinate connective”, and the negated second-coordinate connective.

Homework

6.5 Prove that ↓ is, by itself, complete.

6.6 Recall that the dual of a connective is the connective whose truth-table is obtained by toggling everywhere T and F (cf. 2.5.3). Let →ᴰ and ↔ᴰ be the duals of → and ↔. Show that:

(1) A →ᴰ B is expressible in terms of ¬ and →.

(2) A ↔ᴰ B is expressible in terms of ¬ and ↔.

6.7 Show that, by using →ᴰ as a single connective, a contradictory sentence can be constructed, and that the same is true for ↔ᴰ. Use this to show that negation is expressible in terms of → and →ᴰ, as well as in terms of ↔ and ↔ᴰ. What can you infer from this concerning the completeness of {→, →ᴰ}?

6.8 Show that disjunction is expressible in terms of conditional (i.e., that a sentence logically equivalent to A ∨ B can be constructed from A and B, using → only).

6.9 Let A1 and A2 be atoms. Show that if each of A and B is equivalent to one of:

A1 → A1 (a tautology), A1, A2, A1 → A2, A2 → A1, A1 ∨ A2,

then also A → B is equivalent to one of these sentences. Use this in an inductive argument to prove a certain restriction on the expressive power of the language that has → as the only connective.

6.2 Deductive Systems of Sentential Calculi

6.2.1 On Formal Deductive Systems

Proving claims, reasoning and drawing conclusions are fundamental in all cognitive domains. Logic, as was pointed out, is concerned with certain basic aspects of these activities. The outcome of the proving activity is a proof: a sequence of propositions that are supposed to establish the desired conclusion. As a rule, proofs presuppose an understanding of the concepts under consideration.

Proofs come in many grades of precision and rigour. Proofs in mathematics, for example, are a far cry from “proofs” in philosophy, which rest on partially understood concepts and on


unspecified assumptions, and which are often controversial and subject to unending debates. But mathematics, as well, presupposes a great deal of intuitive understanding. The history of the subject shows that even mathematical proofs have not been immune to error and confusion.

The drive for clarity and rigour has resulted in setups in which proofs are subject to strict requirements. In classical form, a proof should start from certain propositions, chosen from a set fixed in advance, and every step should conform to certain rules. The paradigm of such a system has been Euclidean geometry (dating back to the fourth century B.C.), whose importance in the history of science and ideas can hardly be exaggerated.

The propositions that serve as the starting points of proofs are known as axioms. The rules that determine which steps are allowed are known as inference rules. When a given domain is organized as an axiomatic system, we can think of the axioms as self-evident truths. And we can view the inference rules as obvious truth-preserving rules, i.e., they never lead from true premises to a false conclusion. Proofs constructed in this way are therefore guaranteed to produce true propositions.

The usefulness of proofs lies in the fact that, though each axiom is obvious and each step is simple, the conclusion can be a highly informative, far from obvious statement.

The axiomatic method effected a grand systematization of geometry and gave it a particular shape. It served not only as a fool-proof guard against error, but as a guide for discovering new geometrical truths. At the same time it provided a framework for communicating problems and results. It became a basic paradigm, an example to be followed by scientists and philosophers through the centuries.

Euclidean geometry relied, nonetheless, on many intuitions, quite a few of which were left implicit. Later geometricians, who noted these lacunae, made various assumptions explicit in the form of additional axioms. The more precise the system became, the less it relied on unanalysed geometrical intuitions. The big breakthrough came at the turn of the century in the works of Hilbert. He showed that geometry can be completely reduced to a formal system, characterized by a certain set of axioms and a certain way of constructing proofs, which do not require any geometrical intuition. He thereby indicated the possibility of setting up a purely formal deductive system, one that is based on an uninterpreted language and does not presuppose an understanding of the symbols’ meaning. In such a system, the construction of proofs amounts to symbol manipulation and belongs to the level of pure syntax.

These developments formed part of the general evolution of modern logic. At about the same time Peano, drawing on Dedekind’s work, proposed a formal deductive system for the theory of natural numbers. Frege’s systems are essentially fully fledged deductive systems. To a lesser degree this is also true of Russell’s and Whitehead’s Principia Mathematica.

A (formal) deductive system consists of:


(I) A formal language

(II) Rules that define the system’s proofs.

Most often (II) is given by:

(II.1) A set of axioms

(II.2) A set of inference rules.

Roughly speaking, a proof is a construct formed by repeated application of inference rules; the axioms serve as starting points.

Note: “Proof” is used here in more than one sense. The proofs that belong to deductive systems are formal structures, represented by arrangements of symbols in sequences, or in trees. But we speak also of our own arguments and reasonings as proofs; for example, the (forthcoming) proofs that every sentence in SC is equivalent to a sentence in DNF, and that a given set of connectives is complete. Do not confuse these two notions of proof! The context always indicates which notion is meant.

A similar ambiguity surrounds the term “theorem”. We use it to refer to what is proved in a deductive system, as well as to claims we ourselves make. Again, the context indicates the intended meaning.

The importance of deductive systems does not derive from the practicality of their proofs (though some have found computerized applications), but from the light they throw on our reasoning activity. The very possibility of capturing a sizable chunk of our reasoning by means of a completely formal system, one which is itself amenable to mathematical analysis, is extremely significant. We can thus reason about reasoning, and we can prove that some things are provable and some are not.

When restricted to classical sentential logic, deductive systems do not play a crucial role, because truth-table checking can decide whether given premises tautologically imply a given conclusion. Yet, they are extremely important. First, because sentential deductive systems constitute the core of richer systems in richer languages, such as first-order logic, where nothing like truth-table checking is available. Second, they serve as a basis and as a point of comparison for various enriched sentential logics, which are beyond the scope of truth tables. Finally, they are the simplest example that beginners can study.

6.2.2 Hilbert-Type Deductive Systems

The simplest type of deductive system is often referred to as the Hilbert-type. In this type the axioms are certain sentences and each inference rule consists of: (i) a list of sentence-schemes referred to as the premises, (ii) a sentence-scheme referred to as the conclusion. It is customary to write an inference rule in the form:

A1, A2, . . . , Am
------------------
B

where the Ai’s are the premises and B is the conclusion. The rule allows us to infer B from A1, . . . , Am.

The most common rule is modus ponens:

A, A→ B
--------
B

which allows us to infer B from the two sentences A→ B and A. Here A and B can be any sentences. The rule is a scheme that covers an infinite number of cases. We shall return to it in the next subsection.

In principle, the number of premises (which can vary according to the rule) can be any finite number; but it is usually one or two.

Proofs and Theorems: A proof in a Hilbert-type system is a finite sequence of sentences

B1, B2, . . . , Bn

in which every sentence is either an axiom or is inferred from previous sentences by an inference rule. Stated formally: for every k = 1, . . . , n either (i) Bk is an axiom, or (ii) there are j1, . . . , jm < k, such that Bk is inferred from Bj1 , Bj2 , . . . , Bjm by an inference rule.

Terminology: A proof, B1, B2, . . . , Bn, is said to be a proof of Bn; we also say that Bn is the sentence proved by this proof. A sentence is said to be provable if there is a proof of it. A provable sentence is also called a theorem (of the given system).

Note: If B1, . . . , Bn is a proof, then, trivially, every initial segment of it: B1, . . . , Bj, where j ≤ n, is a proof. Hence, all sentences occurring in a proof are provable.

Note: We can subsume the concept of axiom under the concept of inference rule, by allowing rules with an empty list of premises. A proof can then be described as a sequence of sentences in which every sentence is inferred from previous ones by some inference rule; axioms are included because they are inferred from the empty set.

It is not difficult to see that the set of theorems of a deductive system can be defined inductively as the smallest set satisfying:

(I) Every axiom is a theorem.


(II) If B1, . . . , Bm are theorems and A is inferred from B1, . . . , Bm by an inference rule, then A is a theorem.

From this viewpoint, proofs are constructs that show explicitly that a given sentence is obtainable by applying (I) and (II).

Proofs as Trees:

The identification of proofs with sequences is the simplest, but by no means the only possible way of defining this concept. Proofs can also be defined as trees. The leaves of the proof-tree are labeled by axioms and every non-leaf is inferred from its children by an inference rule. The sentence that labels the root is the one proved by the tree. Proof-trees take more space, but give a fuller picture that shows explicitly the premises from which each sentence is inferred.

Notation: If D is a deductive system, then

⊢D A

means that A is provable in D. The subscript ‘D’ is omitted if the intended system is obvious.

6.2.3 A Hilbert-Type Deductive System for Sentential Logic

The following system is one of the simplest deductive systems that are adequate for the purposes of sentential logic. We shall denote it by ‘HS1’ (‘HS’ for ‘Hilbert-type Sentential logic’).

The language of HS1 is based on our infinite list of atomic sentences and on two connectives

¬ and →

This means that the sentences of HS1 are built from atoms using ¬ and → only. Other connectives are to be expressed, if needed, in terms of ¬ and → (cf. page 215). The axioms of HS1 are all the sentences which fall under one of the following three schemes:

(A1) A→ (B → A)

(A2) (A→ (B → C))→ ((A→ B)→ (A→ C))

(A3) (¬A→ ¬B)→ (B → A)

It has a single inference rule, modus ponens:

A→ B, A
--------
B


Each of (A1), (A2), (A3) covers an infinite number of sentences. For example, the following are axioms, since they fall under (A1):

¬(A2→A4)→ (¬A5→¬(A2→A4))

(A2→(A3→¬A2))→ (A1 → (A2→(A3→¬A2)))

A1 → ((A1→A1)→ A1)

Modus ponens is schematic as well. We can, for example, infer:

from A6 → A2 and A6: the sentence A2,

from ¬(A1 ∧ A2)→ (A1 ∨ A2) and ¬(A1 ∧ A2): the sentence (A1 ∨ A2),

from (A1 ∨ A3)→ [A3 → ¬A5] and A1 ∨ A3: the sentence A3 → ¬A5,

and so on.

Since we employ sentential variables throughout, our claims are of a schematic nature. When we say that

⊢HS1 A→ A

we mean that every sentence of the form A → A is a theorem of HS1. The HS1-proofs we construct are, in fact, proof-schemes. Here, for example, is a proof (scheme) of A → A. For the sake of clarity the sentences are written on separate numbered lines, with marginal indications of the axiom scheme under which each sentence falls or the previous sentences from which it is inferred. (The line-numbers and the marginal indications are not part of the formal proof.)

1. A→ ((A→A)→ A) ((A1))

2. (A→ ((A→A)→ A))→ ((A→ (A→A))→ (A→A)) ((A2))

3. (A→(A→A))→ (A→A) (from 1. and 2.)

4. A→(A→A) ((A1))

5. A→A (from 3. and 4.)


1. is an instance of (A1) in which B has been substituted by A → A; the same substitution (together with substituting C by A) yields 2. as an instance of (A2); 4. is another instance of (A1), obtained by substituting B by A.
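Because proofs in HS1 are purely syntactic objects, checking one is pure symbol manipulation. The sketch below (our own encoding: sentences as nested tuples, atoms as strings) verifies the five-line proof of A → A:

```python
def imp(a, b): return ("→", a, b)
def neg(a): return ("¬", a)
def is_imp(s): return isinstance(s, tuple) and len(s) == 3 and s[0] == "→"
def is_neg(s): return isinstance(s, tuple) and len(s) == 2 and s[0] == "¬"

def is_axiom(s):
    """Does s fall under scheme (A1), (A2) or (A3)?"""
    if not is_imp(s):
        return False
    lhs, rhs = s[1], s[2]
    if is_imp(rhs) and rhs[2] == lhs:                      # (A1) A→(B→A)
        return True
    if (is_imp(lhs) and is_imp(lhs[2]) and is_imp(rhs)     # (A2)
            and rhs[1] == imp(lhs[1], lhs[2][1])
            and rhs[2] == imp(lhs[1], lhs[2][2])):
        return True
    if (is_imp(lhs) and is_neg(lhs[1]) and is_neg(lhs[2])  # (A3)
            and rhs == imp(lhs[2][1], lhs[1][1])):
        return True
    return False

def check_proof(lines):
    """Each line must be an axiom or follow from two earlier lines by MP."""
    for i, s in enumerate(lines):
        mp = any(is_imp(p) and p[2] == s and p[1] in lines[:i]
                 for p in lines[:i])
        if not (is_axiom(s) or mp):
            return False
    return True

A = "A"
proof = [
    imp(A, imp(imp(A, A), A)),                                    # (A1)
    imp(imp(A, imp(imp(A, A), A)),
        imp(imp(A, imp(A, A)), imp(A, A))),                       # (A2)
    imp(imp(A, imp(A, A)), imp(A, A)),                            # MP 1,2
    imp(A, imp(A, A)),                                            # (A1)
    imp(A, A),                                                    # MP 3,4
]
assert check_proof(proof)
```

Note that scheme matching here is structural: any sentence of the right shape counts as an axiom, which is exactly what makes (A1)-(A3) schemes.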

When proofs are defined as trees, the proof just given becomes a tree whose root is labeled by 5. (that is, A→A); the root’s children are 3. and 4., the children of 3. are 1. and 2., and the leaves 1., 2. and 4. are labeled by axioms.

Note: When constructing a proof in sequential form, a rule of inference can be applied to any previously constructed sentences. If, in the above-given sequence, we move 4. to the beginning (and leave the rest of the order unchanged) we get a proof of the same sentence, in which the fifth sentence is obtained by applying modus ponens to the first and the fourth. The corresponding proof-tree is unaffected by that modification.

Here is another example: a proof of ¬B → (B → A). The indications in the margin are left as an exercise.

1. (¬A→ ¬B)→ (B → A)

2. [(¬A→ ¬B)→ (B → A)]→ [¬B → ((¬A→ ¬B)→ (B → A))]

3. ¬B → (¬A→ ¬B)

4. ¬B → ((¬A→ ¬B)→ (B → A))

5. [¬B → ((¬A→ ¬B)→ (B → A))]→ [(¬B → (¬A→ ¬B))→ (¬B → (B → A))]

6. (¬B → (¬A→ ¬B))→ (¬B → (B → A))

7. ¬B → (B → A)

Homework 6.10 Find for each sentence in the last proof the axiom under which it falls or the two previous sentences from which it is inferred (by modus ponens). In the case of an axiom, write the substitution that yields the desired instance.

Proving that Something Is Provable

Finding proofs in HS1 by mere trying is far from easy. But there are techniques for showing that proofs exist, and for producing them if necessary, without having to construct them


explicitly. For one thing, we can use sentences that have been proved already, or shown to have proofs, as axioms. Having shown that ⊢HS1 A → A, we can use it in the following sequence in order to show that B → (A→ A) is provable as well.

(A→A)→ (B → (A→A)), A→A, B → (A→A)

The first sentence is an instance of (A1) and the third is derived from the previous two by modus ponens. The sequence is not a proof, because A→ A is neither an axiom nor derivable by modus ponens from previous sentences. But we can replace A→ A by the sequence that constitutes its proof; the enlarged sequence is, as is easy to see, a proof of B → (A→ A). We have thus shown that the sentence is provable, without constructing a proof. If called upon, we can provide one. Applying again the same principle, we can use B → (A→ A) from now on.

Derived Inference Rules: Just as we can use theorems, we can use additional inference rules, provided that we show that everything provable with the help of these rules is also provable in the original system. Such rules are known as derived inference rules.

Usually, the proof that a certain rule is derived will also show how every application of it can be reduced to applications of the original axioms and rules. For example, the following is a derived inference rule of HS1:

A
--------
¬A→ B

To prove this we have to show that from A we can get, by applying the axioms and rules of HS1, ¬A → B. Here is how to do it. We shall use ¬A → (A → B), whose provability has been established (cf. Homework 6.10, with A and B switched).

1. A

2. ¬A→ (A→ B)

3. [¬A→(A→B)]→ [(¬A→A)→(¬A→B)]

4. (¬A→A)→(¬A→B)

5. A→ (¬A→ A)

6. ¬A→ A

7. ¬A→ B

Here 3. is an instance of (A2) and 5. is an instance of (A1); 4. is inferred from 2. and 3.; 6. from 1. and 5.; 7. from 4. and 6.


Proofs From Premises

In the rest of this subsection, ‘⊢’ stands for ‘⊢HS1’. The concept of provability is naturally extendible to provability from premises. As in 4.2, we use ‘Γ’, ‘∆’, ‘Γ′’, ‘∆′’, etc., for premise lists. The other notations introduced there will serve here as well.

Definition: A proof (in HS1) from a premise list Γ is a sequence of sentences B1, . . . , Bn such that every Bi is either (i) an axiom, or (ii) a member of Γ, or (iii) inferred from two previous members by modus ponens. We say that B1, . . . , Bn proves Bn from Γ. A sentence is provable from Γ if it has a proof from Γ; we denote this by:

Γ ⊢ B

Our previous concept of proof is the particular case in which Γ is empty. The notation in that case conforms to our previous usage:

⊢ B

Note: A premise list Γ is not a list of additional axiom schemes. Sentential variables that are used in writing premise-lists are meant to denote some particular unspecified sentences. But any general proof that Γ ⊢ B remains valid upon substitution of the sentential variables by any sentences.

Example: The following shows that A→ (A→ B) ⊢ A→ B:

(A→(A→B))→ ((A→A)→ (A→B)), A→ (A→B), (A→A)→ (A→B), A→A, A→B

The first sentence is an axiom (an instance of (A2)), the second is the premise, the third follows by modus ponens. The fourth is a previously established theorem and the last is obtained by modus ponens. The full formal proof from the premise A→(A→B) is obtained if we replace A→ A by its proof.

The set of sentences provable from Γ can be defined inductively as the smallest set containing all axioms and all members of Γ, and closed under modus ponens.

Note: The concepts expressed by ‘⊨’ and ‘⊢’ (both of which are symbols of our metalanguage) are altogether different. The first is semantic, defined in terms of interpretations and truth-values; the second is purely syntactic, defined in terms of formal inference rules. We shall see, however, that there are very close ties between the two. The establishing of these ties is one of the highlights of modern logic.

The following is obvious.


(1) If all the sentences in Γ occur in Γ′, then every sentence provable from Γ is provable from Γ′.

We also have:

(2) If Γ ⊢ A and Γ, A ⊢ B, then Γ ⊢ B.

Intuitively, we may use A in proving B from Γ, because we can prove A itself from Γ. In a more formal manner: let A1, . . . , Ak−1, A be a proof of A from Γ and let B1, . . . , Bm−1, B be a proof of B from Γ, A; then

A1, . . . , Ak−1, B1, . . . , Bm−1, B

is a proof of B from Γ. (The occurrences of A in B1, . . . , Bm−1, B can be inferred from previous sentences in A1, . . . , Ak−1.) When proofs are trees, the proof of B from Γ is obtained by taking a proof of B from Γ, A and expanding every leaf labeled by A into a proof of A from Γ.

All the claims that we establish here for ⊢ hold if we replace ‘⊢’ by ‘⊨’ (which, we shall see, is not accidental). For example, (2) is the exact analogue of (9) of 4.2. But the arguments that establish the properties of ⊢ are very different from those that establish their analogues for ⊨.

The Deduction Theorem

The following is the syntactic analogue of (⊨,→) (cf. 4.2.1 (7)).

(3) Γ, A ⊢ B iff Γ ⊢ A→ B.

The easy direction is from right to left: Consider a proof of A → B from Γ. If we add A to the premise-list, we can get B via modus ponens.

The difficult direction is known as the Deduction Theorem:

If Γ, A ⊢ B then Γ ⊢ A→ B.

Here is its proof. Consider a proof of B from Γ, A:

B1, B2, . . . , Bn

We show how to change it into a proof of A → B from Γ. First, construct the sequence ofthe corresponding conditionals:

A→ B1, A→ B2, . . . , A→ Bi, . . . , A→ Bn


6.2. DEDUCTIVE SYSTEMS OF SENTENTIAL CALCULI 227

This, as a rule, is not a proof from Γ. But we can insert before each A→ Bi sentences so that the resulting sequence is a proof of A→ Bn from Γ. Each Bi in the proof from Γ, A is either (i) an axiom or a member of Γ, or (ii) the sentence A, or (iii) inferred from two previous members by modus ponens.

If (i) is the case, insert before A→ Bi the sentences

Bi → (A→ Bi), Bi

The first is an axiom; the second is (by our assumption) either an axiom or a member of Γ. From these two, A→ Bi is now inferred by modus ponens.

If (ii) is the case, then A→ Bi = A→ A, which, as we have seen, is provable in HS1; we can therefore insert before A→ A sentences that, together with it, constitute its proof in HS1.

The remaining case is (iii). Say the two previous Bj’s which yield Bi via modus ponens are Bk → Bi and Bk. The original proof of B is something of the form:

. . . Bk → Bi . . . Bk . . . Bi, . . .

(The relative order of Bk and Bk → Bi doesn’t matter.) This is converted into:

. . . A→ (Bk→Bi), . . . A→ Bk, . . . , A→ Bi, . . .

Now insert before A→ Bi the sequence:

[A→ (Bk → Bi)]→ [(A→ Bk)→ (A→ Bi)], (A→ Bk)→ (A→ Bi)

The first is an axiom (an instance of (A2)); the second is inferred from A→ (Bk → Bi) and the first by modus ponens. And now A→ Bi is inferred by modus ponens from (A→ Bk)→ (A→ Bi) and A→ Bk.

After carrying out all the insertions, every sentence in the resulting sequence is either an axiom or a member of Γ, or inferred from two previous members. Hence we get a proof of A→ B from Γ.

QED
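The proof just given is, in effect, an algorithm. Here is a sketch of it (my own rendering, using the same tuple encoding of sentences as an assumption, and assuming A is not itself an axiom or a member of Γ): given a linear proof of B from Γ, A, it outputs a sequence proving A→ B from Γ. The helper supplies the standard HS1 proof of A→ A from (A1) and (A2), used in case (ii).

```python
def imp(x, y):
    return ('->', x, y)

def proof_of_self_imp(a):
    """The five-line HS1 proof of a -> a from instances of (A1) and (A2)."""
    return [
        imp(imp(a, imp(imp(a, a), a)),
            imp(imp(a, imp(a, a)), imp(a, a))),   # (A2)
        imp(a, imp(imp(a, a), a)),                # (A1)
        imp(imp(a, imp(a, a)), imp(a, a)),        # modus ponens
        imp(a, imp(a, a)),                        # (A1)
        imp(a, a),                                # modus ponens
    ]

def deduction(proof, a, axiom_or_premise):
    """Turn a proof of B from Γ, a into a proof of ('->', a, B) from Γ.
    axiom_or_premise(s) must hold exactly for axioms and members of Γ."""
    new = []
    for i, b in enumerate(proof):
        if b == a:                                 # case (ii)
            new.extend(proof_of_self_imp(a))
        elif axiom_or_premise(b):                  # case (i)
            new += [imp(b, imp(a, b)), b, imp(a, b)]
        else:                                      # case (iii): b came by MP
            # find an earlier Bk such that Bk -> b also occurs earlier
            k = next(c for c in proof[:i] if imp(c, b) in proof[:i])
            new += [imp(imp(a, imp(k, b)),
                        imp(imp(a, k), imp(a, b))),  # (A2)
                    imp(imp(a, k), imp(a, b)),       # modus ponens
                    imp(a, b)]                       # modus ponens
    return new
```

For instance, applied to the proof ['p', ('->', 'p', 'q'), 'q'] of q from the premise p→ q together with p, the transformation ends with the sentence p→ q, now proved from the premise p→ q alone.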

The deduction theorem is a powerful tool for showing the provability of various sentences. We can now employ the technique of transferring the antecedent to the left-hand side, which was used in the context of logical implication. Here, for example, is an argument showing that

` (A→B)→ [(B→C)→(A→C)]

Using the deduction theorem, it suffices to show that:

(a) A→ B ` (B→C)→ (A→C)


Again, by the deduction theorem, the following is sufficient for establishing (a):

(b) A→B, B→C ` A→C

Using the deduction theorem for a third time, (b) reduces to:

(c) A→B, B→C, A ` C

But (c) is obvious: From A→ B and A we can infer B, and from B and B → C we can infer C.

Note: The concept of proof from premises is definable for general deductive systems. Some systems have inference rules whose applicability to arbitrary sentences is subject to certain restrictions. In such systems the definition of proofs from premises is modified accordingly. However, (2), (3), and all other properties that are analogues of the implication laws, hold throughout.

Homework 6.11 Prove the following:

1. A,¬A ` B

Hint: use ¬A→ (¬B → ¬A), axiom (A3), and modus ponens twice.

2. ¬¬A ` A

Hint: use 1. with A and B replaced by their negations, transfer ¬A via the deduction theorem, choose for B any axiom (or theorem) of HS1.

3. A ` ¬¬A

Hint: get from 2. ` ¬¬A→ A, replace A by ¬A, then use an instance of (A3) and modus ponens.

4. If Γ,¬A ` ¬B then Γ, B ` A.

Hint: show that the assumption implies that Γ ` ¬A→ ¬B, then use (A3) and modus ponens.

5. If Γ, A ` B then Γ,¬B ` ¬A.

Hint: use 2. and 3. to show that the assumption implies that Γ,¬¬A ` ¬¬B; then use 4.

6. A,¬B ` ¬(A→ B)

Hint: apply 5. to: A,A→ B ` B.

7. If Γ,¬A ` C and Γ, B ` C, then Γ, A→B ` C.


Hint: get from the first assumption, via 5. and 2., Γ,¬C ` A; get from the second Γ,¬C ` ¬B; applying 6., get Γ,¬C ` ¬(A→ B), then apply 4.

8. ¬(A→ B) ` A

Hint: get from 1. ¬A ` A→ B; then apply 5. (with an empty Γ) and 2.

9. ¬(A→ B) ` ¬B

Hint: get from (A1) B ` A→B; then apply 5.

6.2.4 Soundness and Completeness

A formal deductive system is defined without recourse to semantic notions. Its significance, however, derives from its relation to some semantics. At least, this is the case in systems that are based on classical logic or some variant of it. The semantics is given by a class of possible interpretations, in each of which every sentence gets a truth-value.

Soundness

A deductive system, D, is said to be sound for a given semantics, if everything provable in D is true in all interpretations. This has a generalized form that applies to proofs from premises:

For all Γ and all A, if Γ `D A, then there is no interpretation in which all members of Γ are true and A is false.

Roughly speaking, it means that the proofs of the system can never lead us from true premises to false conclusions.¹

The soundness of D is proved by establishing the following two claims:

(S1) Every axiom of D is true in all interpretations.

(S2) The inference rules preserve truth-in-all-interpretations, that is: if the premises of an inference rule are true in all interpretations, so is the conclusion.

For the generalized form, (S2) is replaced by:

(S2∗) For every interpretation, if all premises of an inference rule are true in the interpretation, its conclusion is true in it as well.

¹ The generalized form can be deduced directly from the first, provided that the underlying language has (or can express) → and the deduction theorem holds. In other cases the term “strong soundness” is sometimes used for the generalized form.


(S1) and (S2) imply that all sentences constructed in the course of a proof in D are true in all interpretations. We never get outside the set of all true-in-all-interpretations sentences, because the axioms are in that set and all applications of inference rules leave us in it. Similarly, (S1) and (S2∗) imply that, for any given interpretation, if the premises are true in the interpretation, then every proof from these premises leaves us within the set of sentences true in that interpretation. It is an inductive argument: the set of provable (or provable from Γ) sentences is the smallest set containing the axioms (and the members of Γ) and closed under the inference rules. To show that all the sentences in this set have some property, we show that all axioms (and all members of Γ) have the property, and that the set of sentences having this property is closed under the inference rules.

In the case of the sentential calculus, the interpretations consist of all truth-value assignments to the atoms. Truth under all interpretations means tautological truth. Presupposing this semantics, the soundness of HS1 is the requirement that every provable sentence is a tautology; or, in symbols, that for every A:

` A =⇒ |= A

Similarly, generalized soundness means that for every Γ and A:

Γ ` A =⇒ Γ |= A

To prove that HS1 is sound we show (S1) and (S2) (for the generalized form, (S2∗)).
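Both checks are finite and mechanical. The following sketch (my own, not from the text; it assumes the usual HS1 schemes, with (A3) taken in the form (¬A→ ¬B)→ (B→ A), which matches the hints of Homework 6.11) verifies (S1) and (S2∗) by brute force over truth-value assignments.

```python
from itertools import product

def value(s, v):
    """Truth-value of sentence s under assignment v (atoms to booleans)."""
    if isinstance(s, str):
        return v[s]
    if s[0] == 'not':
        return not value(s[1], v)
    return (not value(s[1], v)) or value(s[2], v)   # s = ('->', _, _)

def tautology(s):
    """True in all assignments; instances here use the atoms 'A', 'B', 'C'."""
    atoms = ['A', 'B', 'C']
    return all(value(s, dict(zip(atoms, row)))
               for row in product([True, False], repeat=3))

imp = lambda x, y: ('->', x, y)
neg = lambda x: ('not', x)
A, B, C = 'A', 'B', 'C'

schemes = [imp(A, imp(B, A)),                                  # (A1)
           imp(imp(A, imp(B, C)), imp(imp(A, B), imp(A, C))),  # (A2)
           imp(imp(neg(A), neg(B)), imp(B, A))]                # (A3)

s1 = all(tautology(s) for s in schemes)                        # (S1)
s2 = all(not (value(imp(A, B), dict(zip('AB', row)))           # (S2*): modus
              and value(A, dict(zip('AB', row))))              # ponens preserves
         or value(B, dict(zip('AB', row)))                     # truth pointwise
         for row in product([True, False], repeat=2))
```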

That proof is easy: (S1) means that all axioms are tautologies, which can be verified directly by considering truth-tables. Since modus ponens is our only inference rule, (S2∗) amounts to the claim that whenever A→ B and A are true in some interpretation, so is B. This, by the truth-table of →, is trivial.

Completeness

A deductive system D is complete with respect to the given semantics, if every sentence that is true under all interpretations is provable in D.

If, again, we consider also proofs from premises, we get the generalized form of completeness:

If, in every interpretation in which all members of Γ are true, A is true, then A is provable from Γ.

The non-generalized form is the particular case where Γ is empty.²

Completeness means that the deductive system is powerful enough for proving all sentences that are always true (or, in the generalized form, for establishing all logical implications).

² The generalized form can be deduced directly from the first provided that: (i) the underlying language has (or can express) →, (ii) modus ponens is a primitive or derived inference rule, and (iii) only finite premise-lists are considered. In other cases the term “strong completeness” is sometimes used for the generalized notion.


Completeness is of course a highly desirable property. Deductive systems that are not complete fail to express the full content of the semantics. But completeness is not as essential as soundness. It is also much more difficult to prove. Unlike soundness, there is no general inductive argument for establishing it.

Once the language and its semantics (set of possible interpretations) are chosen, the existence of a formal deductive system that is both sound and complete is extremely significant. It means that by using a purely syntactic system we can characterize basic semantic notions. For many interpreted languages completeness is unachievable.³ The most notable case in which we have a deductive system that is both sound and complete is first-order logic, the subject of this book.

In the case of HS1 completeness means that, for every sentence A:

|= A =⇒ ` A

Or, in the generalized form, for every Γ and A:

Γ |= A =⇒ Γ ` A

If both soundness and completeness hold, then we have:

Γ |= A ⇐⇒ Γ ` A

Soundness is the ⇐-direction, completeness the ⇒-direction.

The Completeness of HS1: The fool-proof top-down method of chapter 4 (cf. 4.3) can be used in order to show that HS1 is complete. Among other things, we have noted in 4.3 that the method applies to any sublanguage of SC that has negation among its connectives. Any true implication claim can therefore be derived by applying (repeatedly) the laws correlated with these connectives to self-evident implications of the types:

(I.1) Γ, A |= A (I.2) Γ, A,¬A |= B

In the case of HS1, the only connectives are ¬ and →. Consequently, there are six laws altogether. Three cover the cases of double negation, conditional, and negated conditional in the premises, and three cover the same cases in the conclusion. The first group is:

(Pr1) Γ, A |= C ⇐⇒ Γ,¬¬A |= C

(Pr2) Γ,¬A |= C and Γ, B |= C ⇐⇒ Γ, A→ B |= C

(Pr3) Γ, A,¬B |= C ⇐⇒ Γ,¬(A→ B) |= C

³ The most important is the language of arithmetic, which describes the natural-number system with addition and multiplication. Gödel’s incompleteness theorem shows that completeness is out of the question for this and any richer language.


The second group is:

(Cn1) Γ |= A ⇐⇒ Γ |= ¬¬A

(Cn2) Γ, A |= B ⇐⇒ Γ |= A→ B

(Cn3) Γ |= A and Γ |= ¬B ⇐⇒ Γ |= ¬(A→ B)

In a bottom-up proof we start with implications of the types (I.1) and (I.2) and apply the laws repeatedly in the ⇒ direction. In order to establish completeness we shall prove analogous claims for the ⇒ directions, where ‘|=’ is replaced by ‘`’. That is, we show the following:

(I.1∗) Γ, A ` A (I.2∗) Γ, A,¬A ` B

as well as:

(Pr1∗) Γ, A ` C ⇒ Γ,¬¬A ` C

(Pr2∗) Γ,¬A ` C and Γ, B ` C ⇒ Γ, A→ B ` C

(Pr3∗) Γ, A,¬B ` C ⇒ Γ,¬(A→ B) ` C

(Cn1∗) Γ ` A ⇒ Γ ` ¬¬A

(Cn2∗) Γ, A ` B ⇒ Γ ` A→ B

(Cn3∗) Γ ` A and Γ ` ¬B ⇒ Γ ` ¬(A→ B)

Assume for the moment that we have shown this.

In 4.3 we claimed that, starting with an initial goal, the reduction is bound to terminate in a tree in which all end goals (in the leaves) are elementary. We appealed to the fact that the reductions always reduced the goal’s complexity. Here we shall make this reasoning precise by turning it into an inductive argument.

Let the weight of a sequence of sentences, ∆, be the sum of all numbers contributed by connective occurrences in ∆, where each occurrence of ¬ contributes 1 and each occurrence of → contributes 2. It is easily seen that, in each of the six claims given above, the sequence of sentences involved in the conclusion on the right-hand side (of ‘⇒’) has greater weight than the sequence of sentences involved in each of the left-hand side premises.
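As a quick sketch (mine, with sentences again encoded as tuples, atoms as strings), the weight measure is:

```python
def weight(s):
    if isinstance(s, str):                  # an atom contributes 0
        return 0
    if s[0] == 'not':                       # each ¬ contributes 1
        return 1 + weight(s[1])
    return 2 + weight(s[1]) + weight(s[2])  # each -> contributes 2

def weight_of_sequence(sentences):
    return sum(weight(s) for s in sentences)
```

For example, with Γ empty and A, B, C atomic, the conclusion side of (Pr2∗), namely A→ B, C, weighs 2, while the premise sides ¬A, C and B, C weigh 1 and 0.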

We now show by induction on the weight of Γ, A, that if

Γ |= A

then Γ ` A


If the weight is 0, all the sentences in Γ, A are atoms. In this, and in the more general case where all the sentences are literals, we have: If Γ |= A, then either A occurs in Γ or some atom and its negation occur in Γ. Otherwise, we can define (as in 4.3.3) a truth-value assignment under which all the premises come out true and A comes out false. Hence, by (I.1∗) and (I.2∗), Γ ` A. If not all sentences are literals, then either some premise or the conclusion is a conditional, or a negated conditional, or a doubly negated sentence. Each of these possibilities is taken care of by the corresponding claim from (Pri∗), (Cni∗), i = 1, 2, 3. For example, consider the case where Γ = Γ′, B → C, i.e., where we have:

Γ′, B → C |= A

By the ⇐ direction of (Pr2) we get:

Γ′, ¬B |= A and Γ′, C |= A

Each of Γ′,¬B,A and Γ′, C,A has smaller weight than the weight of Γ′, B → C,A. Hence, by the induction hypothesis:

Γ′, ¬B ` A and Γ′, C ` A

Applying (Pr2∗) we get:

Γ′, B → C ` A

It remains to prove (I.1∗), (I.2∗) and the six claims (Pri∗), (Cni∗), i = 1, 2, 3. Of these, (I.1∗) is trivial. (Cn2∗) is the deduction theorem. The rest follow from the claims of Homework 6.11:

(I.2∗) follows from 1.

(Pr1∗) follows from 2. (since ¬¬A ` A, everything provable from Γ, A is provable from Γ,¬¬A).

(Cn1∗) follows from 3. (if A is provable from Γ, so is ¬¬A, since A ` ¬¬A).

(Pr2∗) is 7.

(Pr3∗) follows from 8. and 9. (since both A and ¬B are provable from ¬(A→ B), everything provable from Γ, A,¬B is provable from Γ,¬(A→ B)).

(Cn3∗) follows from 6. (if both A and ¬B are provable from Γ, so is ¬(A→B), since A,¬B ` ¬(A→ B)).

If we extend the language of HS1 by adding connectives, we can get a sound and complete system provided that we add suitable axioms. Here again the fool-proof method can guide us. For example, if we add conjunction, we should add sound axioms such that the following four claims are satisfied. They correspond to the conjunction and negated-conjunction laws of 4.3, for the premises and the conclusion.


Γ, A,B ` C ⇒ Γ, A∧B ` C

Γ ` A and Γ ` B ⇒ Γ ` A∧B

Γ,¬A ` C and Γ,¬B ` C ⇒ Γ,¬(A∧B) ` C

Γ, A ` ¬B ⇒ Γ ` ¬(A∧B)

Homework 6.12 Write down, for each of the connectives ∨ and ↔, the additional properties that ` should have in order to imply completeness for the system obtained by adding the connective.

For each of ∧, ∨ and ↔ we can state axioms (rather, axiom schemes) that guarantee completeness if the connective(s) is added. In what follows the associated axioms are chosen so as to involve only the connective in question and → (i.e., no ¬). There are three axioms for each connective. If more than one is added, we simply include the axioms for each. In all cases modus ponens remains the only inference rule. Altogether we get, besides HS1, seven complete and sound deductive systems, which correspond to the seven non-empty subsets of {∧, ∨, ↔}.

Axioms for ∧:

A∧B → A A∧B → B

A→ (B → (A∧B))

Axioms for ∨:

(A→ C)→ [(B → C)→ ((A ∨B)→ C)]

A→ A ∨B     B → A ∨B

Axioms for ↔:

(A↔ B)→ (A→ B) (A↔ B)→ (B → A)

(A→ B)→ [(B → A)→ (A↔ B)]


Homework

6.13 Prove the above claim for ∧, that is: if we add ∧ to the language and the corresponding axiom schemes to the deductive system, we get a complete system.

(Hint: Among other things, show, using the third axiom for conjunction, that

¬C→ A, ¬C→B ` ¬C → (A ∧B)

Using ¬D→ E ` ¬E → D, deduce from this:

(¬A→C), (¬B→C) ` ¬(A∧B)→ C

Show also: A→ ¬B ` ¬(A ∧B), by showing: A ∧B ` ¬(A→¬B).)


6.14 Prove the above claim for ∨.

(Hint: Among other things, show, using the first axiom for disjunction, that

¬C → ¬A, ¬C → ¬B ` ¬C → ¬(A ∨B)

Taking C to be a negation of an axiom, deduce from this:

¬A, ¬B ` ¬(A ∨B)

Using the other two axioms show that both ¬A and ¬B are provable from ¬(A ∨B). Infer from this that ¬(A ∨B) ` ¬(¬A→ B); from which you can infer that

¬A→ B ` A ∨B )

6.15 Prove the above claim for ↔.

Note: When we add new connectives, the old axiom schemes cover additional sentences. E.g., (A1) originally covers all sentences A→ (B → A) in which A and B are in the HS1 language. In the enriched language, (A1) covers all the cases where A and B are in that language.

Note: We can have fewer axioms for each additional connective, if we use ¬. We simply add axioms that allow us to express the new connectives in terms of ¬ and →. E.g., for conjunction we add:

(A ∧B)→ ¬(A→ ¬B) and ¬(A→ ¬B)→ (A ∧B)

The significance of the previous axioms, which bypass negation, lies in expressing certain properties of the connective in terms of the conditional. These are both algebraic and proof-theoretic properties. These topics are beyond the scope of this book.

6.2.5 Gentzen-Type Deductive Systems

In chapter 4 (4.3 and 4.4) we have studied methods for establishing claims of the form:

B1, B2, . . . , Bn |= B

These methods can be represented as purely formal proof procedures, based on a vocabulary of uninterpreted symbols. This is done by using a new type of syntactic constructs, entities of the form:

B1, B2, . . . , Bn ⇒ B

where B1, . . . , Bn and B are sentences and ‘⇒’ is a new symbol. (This includes the possibility n = 0, where the construct is: ⇒ B.)


Constructs of this form, called sequents, were introduced by Gentzen around 1934. The sequent symbol, ⇒, differs from ‘|=’ in that it belongs to the syntax of the formalism and does not stand for any English expression. Sequents are on the same level as uninterpreted sentences. But they are not sentences (one cannot, for example, apply to them sentential connectives). They form a new syntactic type.

Gentzen considered deductive systems in which certain sequents are designated as axioms and inference rules enable us to deduce sequents from other sequents. The theorems are therefore not sentences but sequents. If D is such a system, then

`D Γ ⇒ A

means that Γ ⇒ A is a theorem of D. All this is not done, of course, as a mere game. We intend to interpret the sequents as implication claims. Let us say that the sequent

Γ ⇒ A

is valid if

Γ |= A

i.e., there is no interpretation of the language in which all sentences of Γ are true and A is not.

Note: Gentzen’s calculus differs from the one we are discussing in an important aspect. The sequents in it are of the form Γ ⇒ ∆, where both Γ and ∆ are finite sequences of sentences, one of which, but not both, can be empty. The sequent is valid if, for every interpretation, if all members of Γ are true then at least one member of ∆ is true. The inference rules for such a system are simpler and more elegant than the ones used in this book. Each connective can be handled separately without bringing in negation. For beginners, however, the intended interpretation is more natural and easier if only one sentence is on the right-hand side. The present system is a variant constructed for the purpose of this book.

Accordingly, the properties of soundness and completeness for the system are the following.

Soundness: A Gentzen-type system, D, is sound, if for all Γ and A:

`D Γ ⇒ A =⇒ Γ |= A

Completeness: A Gentzen-type system, D, is complete, if for all Γ and A:

Γ |= A =⇒ `D Γ ⇒ A

Terminology: The antecedent of the sequent Γ ⇒ A, or its left-hand side, is Γ. Its succedent, or right-hand side, is A. We shall use these terms, because ‘premises’ and ‘conclusion’ are now needed to refer to sequents that are themselves premises and conclusions in proofs of the sequent calculus.


We shall consider two Gentzen-type deductive systems for sentential logic that are sound and complete. They are obtained by straightforward formalization of the methods of 4.3 and 4.4. The first, which we denote as GS1, is based on ordinary sequents. The second, which we denote as GS2, corresponds to the proof-by-contradiction method and involves sequents of the form Γ ⇒ ⊥.

The Deductive System GS1

GS1 has the following two axiom schemes:

(GA1) Γ, A ⇒ A

(GA2) Γ, A,¬A ⇒ C

Obviously, the axioms are valid. If we replace ‘⇒’ by ‘|=’ we get what in chapter 4 (cf. 4.3) we called self-evident implications.

GS1 has a rule that allows us to reorder the left-hand side of a sequent.

(Reordering)   From Γ ⇒ A infer Γ′ ⇒ A,

where Γ′ is obtained by reordering Γ.

The other inference rules, which constitute the heart of the system, correspond to the laws of 4.3.2. We have antecedent rules and succedent rules for double negation, for each binary connective and for each negated connective.

In the following list the rules are arranged in two groups. The first consists of all antecedent rules, the second of all succedent rules. The rules for a connective and for its negation are listed next to each other. (This is different from the arrangement of the laws in 4.3.2. But you can easily see the correspondence, which is also indicated by the rules’ names.)

Note that the laws in 4.3.2 are ‘iff’ statements. The ⇐-direction of the law is the premises-to-conclusion direction of the corresponding rule. Thus, the sequents of the premises are always simpler than the conclusion sequent.


ANTECEDENT RULES

(¬¬ ⇒)   From Γ, A ⇒ C infer Γ,¬¬A ⇒ C

(∧ ⇒)   From Γ, A,B ⇒ C infer Γ, A ∧B ⇒ C

(¬∧ ⇒)   From Γ,¬A ⇒ C and Γ,¬B ⇒ C infer Γ,¬(A ∧B) ⇒ C

(∨ ⇒)   From Γ, A ⇒ C and Γ, B ⇒ C infer Γ, A ∨B ⇒ C

(¬∨ ⇒)   From Γ,¬A,¬B ⇒ C infer Γ,¬(A ∨B) ⇒ C

(→ ⇒)   From Γ,¬A ⇒ C and Γ, B ⇒ C infer Γ, A→ B ⇒ C

(¬→ ⇒)   From Γ, A,¬B ⇒ C infer Γ,¬(A→ B) ⇒ C

(↔ ⇒)   From Γ, A,B ⇒ C and Γ,¬A,¬B ⇒ C infer Γ, A↔ B ⇒ C

(¬↔ ⇒)   From Γ, A,¬B ⇒ C and Γ,¬A,B ⇒ C infer Γ,¬(A↔ B) ⇒ C

SUCCEDENT RULES

(⇒ ¬¬)   From Γ ⇒ B infer Γ ⇒ ¬¬B

(⇒ ∧)   From Γ ⇒ A and Γ ⇒ B infer Γ ⇒ A ∧B

(⇒ ¬∧)   From Γ, A ⇒ ¬B infer Γ ⇒ ¬(A ∧B)

(⇒ ∨)   From Γ,¬A ⇒ B infer Γ ⇒ A ∨B

(⇒ ¬∨)   From Γ ⇒ ¬A and Γ ⇒ ¬B infer Γ ⇒ ¬(A ∨B)

(⇒ →)   From Γ, A ⇒ B infer Γ ⇒ A→ B

(⇒ ¬→)   From Γ ⇒ A and Γ ⇒ ¬B infer Γ ⇒ ¬(A→ B)

(⇒ ↔)   From Γ, A ⇒ B and Γ, B ⇒ A infer Γ ⇒ A↔ B

(⇒ ¬↔)   From Γ, A ⇒ ¬B and Γ,¬A ⇒ B infer Γ ⇒ ¬(A↔ B)

Soundness and Completeness of GS1: The soundness of GS1 follows easily by observing that every axiom is valid and that, for every inference rule, if all its premises are valid, so is the conclusion. This is exactly the ⇐-direction of the corresponding law in 4.3.2.


The completeness of GS1 follows from the fact that every true implication can be established using the method of 4.3.3: Start with self-evident implications and apply the ⇐ directions of the laws. These applications correspond exactly to the steps of a proof in GS1.

The rigorous inductive argument follows the same lines as the argument used to prove the completeness of HS1. With each sequent we associate its weight, defined as follows: Sum all the numbers contributed by connective-occurrences, where every occurrence of ¬ contributes 1, every occurrence of ∧, ∨ and → contributes 2, and every occurrence of ↔ contributes 3. It is easily seen that, in all the inference rules, each of the premises has smaller weight than the conclusion. (The weight is some rough measure that reflects the fact that, in every inference rule, each of the premises is “simpler” than the conclusion. Any “simplicity measure” that has this property would do for our purposes.)

We now prove, by induction on the weight of Γ ⇒ A, that if Γ ⇒ A is valid then it is provable in GS1.

If all the sentences in the sequent are literals, then the sequent is valid iff either (i) the succedent is one of the antecedent sentences or (ii) the antecedent contains a sentence and its negation. (Otherwise we can define, as in 4.3.3, a truth-value assignment that makes all antecedent literals true and the succedent false.) In either case the sequent is an axiom.

If not all sentences in Γ ⇒ A are literals, then one of them is either of the form ¬¬C, or of the form C ◦D, or of the form ¬(C ◦D) (where ◦ is a binary connective). In each case our sequent can be inferred, from one or two premises, by applying the corresponding rule (an antecedent rule if the sentence is in the antecedent, a succedent rule if it is the succedent). We now invoke an important feature of our rules:

Reversibility: In each rule, if the conclusion is valid, so are all the premises.

This is the ⇒-direction of the laws of 4.3.2. Since Γ ⇒ A is assumed to be valid, the premises of the rule that yields it are valid as well. Since each premise has smaller weight, the induction hypothesis implies that it is provable in GS1. Therefore Γ ⇒ A is provable.

QED
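The completeness argument doubles as a decision procedure: search for a proof backwards, applying the rules in reverse until only literals remain, then test the axiom conditions. The sketch below is my own encoding, not the book's notation: sentences are tuples tagged 'not', 'and', 'or', '->', '<->', and the search returns True exactly for the valid, i.e. provable, sequents.

```python
def literal(s):
    return isinstance(s, str) or (s[0] == 'not' and isinstance(s[1], str))

def antecedent_premises(s):
    """Sentences to add to the antecedent, one tuple per premise-sequent."""
    if s[0] == 'not':
        t = s[1]
        if t[0] == 'not': return [(t[1],)]
        if t[0] == 'and': return [(('not', t[1]),), (('not', t[2]),)]
        if t[0] == 'or':  return [(('not', t[1]), ('not', t[2]))]
        if t[0] == '->':  return [(t[1], ('not', t[2]))]
        return [(t[1], ('not', t[2])), (('not', t[1]), t[2])]        # negated <->
    if s[0] == 'and': return [(s[1], s[2])]
    if s[0] == 'or':  return [(s[1],), (s[2],)]
    if s[0] == '->':  return [(('not', s[1]),), (s[2],)]
    return [(s[1], s[2]), (('not', s[1]), ('not', s[2]))]            # <->

def succedent_premises(c):
    """Pairs (extra antecedent sentences, new succedent)."""
    if c[0] == 'not':
        t = c[1]
        if t[0] == 'not': return [((), t[1])]
        if t[0] == 'and': return [((t[1],), ('not', t[2]))]
        if t[0] == 'or':  return [((), ('not', t[1])), ((), ('not', t[2]))]
        if t[0] == '->':  return [((), t[1]), ((), ('not', t[2]))]
        return [((t[1],), ('not', t[2])), ((('not', t[1]),), t[2])]  # negated <->
    if c[0] == 'and': return [((), c[1]), ((), c[2])]
    if c[0] == 'or':  return [((('not', c[1]),), c[2])]
    if c[0] == '->':  return [((c[1],), c[2])]
    return [((c[1],), c[2]), ((c[2],), c[1])]                        # <->

def provable(gamma, c):
    """True iff the sequent gamma => c is provable in GS1 (iff it is valid)."""
    if all(literal(s) for s in gamma) and literal(c):
        # axiom check: (GA1) or (GA2)
        return c in gamma or any(('not', s) in gamma for s in gamma)
    for i, s in enumerate(gamma):                    # try an antecedent rule
        if not literal(s):
            rest = gamma[:i] + gamma[i + 1:]
            return all(provable(rest + list(p), c)
                       for p in antecedent_premises(s))
    return all(provable(gamma + list(extra), goal)   # else a succedent rule
               for extra, goal in succedent_premises(c))
```

In particular, provable([], s) tests whether s is a tautology; termination follows from the weight measure, since every recursive call strictly decreases it.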

The Gentzen-Type Deductive System GS2

GS2 formalizes the proof-by-contradiction method of chapter 4 (cf. 4.4) exactly in the way that GS1 formalizes the method of 4.3.3. In addition to the usual sequents, GS2 has sequents of the form:

A1, . . . , An ⇒ ⊥

where ⊥ is a special symbol (signifying contradiction). ⊥ can appear only as the right-hand side of a sequent.


By stipulation, the truth-value of ⊥ is F, under any assignment of truth-values to the atomic sentences. The valid sequents are defined as before. This means that Γ ⇒ ⊥ is valid if there is no truth-value assignment that makes all the sentences in Γ true.

Instead of the two axiom schemes (GA1) and (GA2), GS2 has one axiom scheme:

(GA3) Γ, A,¬A ⇒ ⊥

The reordering rule is now supposed to cover also sequents of the new form.

GS2 has also the following inference rule:

(Contradiction)   From Γ,¬A ⇒ ⊥ infer Γ ⇒ A

The other rules of GS2 are the antecedent rules of GS1, with the difference that the succedent is always ⊥. (That is, in the list above, replace ‘C’ by ‘⊥’.)

After applying the Contradiction rule no other rule can be applied. Hence, the only way to prove Γ ⇒ A is to prove Γ,¬A ⇒ ⊥ and then apply the Contradiction rule.

The soundness and completeness of GS2 are proved in the same way as they are proved for GS1.
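The GS2 proof search is even simpler to sketch, since only antecedent rules (with ⊥ on the right) and the Contradiction rule are involved. As before, this is my own tuple encoding of sentences (tags 'not', 'and', 'or', '->'), not the book's notation: to prove Γ ⇒ A, refute Γ, ¬A by driving the antecedent rules until only literals remain, then check for axiom (GA3).

```python
def refutable(gamma):
    """True iff the sequent gamma => ⊥ is provable in GS2."""
    for i, s in enumerate(gamma):
        if isinstance(s, str) or (s[0] == 'not' and isinstance(s[1], str)):
            continue                                  # a literal: skip it
        rest = gamma[:i] + gamma[i + 1:]
        if s[0] == 'and':
            return refutable(rest + [s[1], s[2]])
        if s[0] == 'or':
            return refutable(rest + [s[1]]) and refutable(rest + [s[2]])
        if s[0] == '->':
            return (refutable(rest + [('not', s[1])])
                    and refutable(rest + [s[2]]))
        t = s[1]                                      # s is ('not', t)
        if t[0] == 'not':
            return refutable(rest + [t[1]])
        if t[0] == 'and':
            return (refutable(rest + [('not', t[1])])
                    and refutable(rest + [('not', t[2])]))
        if t[0] == 'or':
            return refutable(rest + [('not', t[1]), ('not', t[2])])
        return refutable(rest + [t[1], ('not', t[2])])  # t is ('->', _, _)
    # only literals remain: (GA3) applies iff some sentence occurs negated too
    return any(('not', s) in gamma for s in gamma)

def proves(gamma, a):
    """Γ => A holds iff Γ, ¬A => ⊥ does, by the Contradiction rule."""
    return refutable(list(gamma) + [('not', a)])
```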


Chapter 7

Predicate Logic Without Quantifiers

7.0

Taking a further step in the analysis of sentences, we set up a language in which the atomic sentences are made of smaller units: individual constants and predicates (or relation symbols).

Individual Constants

Individual constants are basic units that function like singular names of natural language, that is, names that denote particular objects. For example:

‘The Moon’, ‘Ann’, ‘Everest’, ‘Chicago’, ‘Bill’, ‘The USA’ .

An interpretation of the formal language associates with every individual constant a denoted object, referred to as the constant’s denotation. The object can be arbitrary: a person, a material body, a spatio-temporal region, an organization, the number 1, whatever.

In natural language a name can be ambiguous, e.g., Everest the mountain and Everest the officer. Usually, the intended denotation can be determined from context. A name may also lack denotation, e.g., ‘Pegasus’. (We have discussed this at some length in chapter 1.) But in predicate logic each individual constant has, in any given interpretation of the language, exactly one denotation. Different individual constants may have the same denotation, just as in natural language an object can have several names.

The denotations of the individual constants depend on the interpretation. The syntax leaves them undetermined. On the purely syntactic level the individual constants are mere symbols, which function as building blocks of sentences.


We shall assume that

a, b, c, d, etc.

are individual constants. It is also convenient to use ‘a’, ‘b’, ‘c’, etc. as variables ranging over individual constants. Thus, we may say: ‘For every individual constant c’, or ‘There is an individual constant b’. Occasionally we shall help ourselves to names borrowed from English:

Jill, Jack, Bill, 5, etc.

Predicates

Predicates, known also as relation symbols, combine with individual constants to make sentences. Grammatically, they fulfill a role analogous to English verb-phrases. But they express properties or relations. For example, in order to translate

Ann is happy

into predicate logic, we need an individual constant, say a, denoting Ann, and a predicate, say H, that does the job of ‘... is happy’. The translated sentence is:

H(a)

H expresses, under the intended interpretation, the property of being happy. As we shall see later, it is interpreted simply as the set of all happy people.

Similarly, the translation of

Ann likes Bill

is obtained by using individual constants, say a and b, for denoting Ann and Bill, and a predicate, say L, to play the role of ‘... likes ’. The translation is then:

L(a,b)


Here L is supposed to express the relation of liking. Actually it is interpreted as a set-theoretical relation (cf. 5.1.4 page 167): the set of all pairs (x, y) such that x likes y.

In the first case H is a one-place predicate. It comes with one empty place. When we fill the empty place with an individual constant we get a sentence. In the second case L is a two-place predicate. It comes with two empty places. We get a sentence when both places are filled with constants. The same individual constant can be used twice, e.g.,

L(a,a)

which, under the interpretation just given, reads as:

Ann likes Ann, or Ann likes herself.

The number of places that a predicate has is known as its arity. In our example H is unary (or monadic) and L is binary. The arity can be shown by indicating the empty places:

H( ) L( , ).

A predicate can have any finite arity. For example, if a, b, and c are points on a line, we can translate:

a is between b and c

into:

Bet(a,b,c)

Here Bet( , , ) is a ternary predicate (interpreted as the three-place betweenness relation) and a, b, and c are interpreted as denoting a, b, and c.

In general, an n-ary predicate is interpreted as some n-ary relation, where this is (as in set theory) a set of n-tuples. For the moment it suffices to note that predicates are analogous to constructs such as ‘... is happy’, ‘... is red’, ‘... likes ’, ‘... is greater than ’, ‘... is between and ∗∗∗’, etc.

We assume that

P, R, S, P0, R0, etc.


are predicates of the formal language. It is also convenient to use ‘P’, ‘R’, etc. as variables ranging over the predicates. Thus, we might say: ‘For some monadic predicate P’, or ‘Whenever P is a ternary predicate’, etc. We may also help ourselves to suggestive expressions such as:

Happy( ), FatherOf( , ), Larger( , ), etc.

As far as the formal uninterpreted language is concerned, predicates are mere symbols, each with an associated positive integer (called its arity), which can be combined in a well-defined way with individual constants to form sentences. The interpretation of these symbols varies with the interpretation of the language. H( ) can be interpreted as the set of happy people in one interpretation, as the set of humans in another, as the set of all animals in a third, and as the empty set in a fourth. The same goes for individual constants.

Note: ‘Predicate’ has several meanings. It can refer to properties that can be affirmed of objects (‘rationality is a predicate of man’). It is also a verb denoting the attributing of a property to an individual (‘wisdom is predicated of Socrates’). Don’t confuse these meanings with the present technical sense!

7.1 PC0, The Formal Language and Its Semantics

7.1.0

By ‘PC0’ we shall refer to the portion of first-order logic that involves, besides the connectives, only individual constants and predicates. The language of PC0 is therefore based on:

(i) A set of individual constants.

(ii) A set of predicates, each having an associated positive integer called its number of places, or arity.

(iii) Sentential connectives, which (in our case) are: ¬, ∧, ∨, → and ↔.

Actually, we are describing here not a single language, but a family of languages. For the languages may differ in their sets of individual constants and predicates.

Given the individual constants and the predicates, the atomic sentences are the constructs obtained by applying the rule:

If P is an n-place predicate and c1, . . . , cn are individual constants, then P(c1, . . . , cn) is an atomic sentence.


You can think of P(c1, . . . , cn) as the result of applying a certain operation to the predicate and the individual constants. (Just as A ∧ B is the result of applying a particular operation to A and B.)

We want the atomic sentences to satisfy unique readability: the predicate and the sequence of individual constants should be readable from the sentence. This amounts to the requirement:

If P(c1, . . . , cn) = P′(c′1, . . . , c′m), then

P = P′, n = m, and ci = c′i, for i = 1, . . . , n

The exact nature of the atomic sentences does not matter, as long as unique readability for atomic sentences and for those constructed from them by applying connectives is satisfied.

Except for the difference in the atomic sentences, the set of all sentences is defined exactly as in the sentential calculus. It consists of all constructs that can be obtained from the following rules:

Every atomic sentence is a sentence.

If A and B are sentences, then:

¬A, A ∧ B, A ∨ B, A → B, A ↔ B are sentences.

Note: Sometimes an infix notation is adopted for certain binary predicates: instead of R(a, b) we write a R b. This is purely a notational convention.

Example: If H( ) is a unary predicate playing the role of ‘... is happy’, and a, b and c are individual constants playing the roles of ‘Ann’, ‘Bert’ and ‘Claire’, then the sentences

Ann, Bert and Claire are happy
Ann or Bert or Claire is happy

are rendered in PC0, respectively, by:

H(a) ∧ H(b) ∧ H(c)
H(a) ∨ H(b) ∨ H(c)

(Assuming that the English ‘or’ is meant inclusively.) The second sentence is also a translation of ‘One of Ann, Claire and Bert is happy’, when ‘one of’ means at least one of. But if we want to formalize

Exactly one of Ann, Bert and Claire is happy,

we have to use a sentence that says that at least one of these people is happy, but no two of them are:

[H(a) ∨ H(b) ∨ H(c)] ∧ ¬[H(a)∧H(b) ∨ H(a)∧H(c) ∨ H(b)∧H(c)]


which is equivalent to:

[H(a)∧¬H(b)∧¬H(c)] ∨ [H(b)∧¬H(a)∧¬H(c)] ∨ [H(c)∧¬H(a)∧¬H(b)]
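This equivalence can be checked mechanically by running through the truth-table. The following Python sketch (an illustration of ours, not part of the text; the function names are invented) enumerates all eight assignments to H(a), H(b), H(c) and confirms that both formalizations agree with the direct count:

```python
from itertools import product

def exactly_one(ha, hb, hc):
    # [H(a) ∨ H(b) ∨ H(c)] ∧ ¬[H(a)∧H(b) ∨ H(a)∧H(c) ∨ H(b)∧H(c)]
    return (ha or hb or hc) and not ((ha and hb) or (ha and hc) or (hb and hc))

def exactly_one_dnf(ha, hb, hc):
    # [H(a)∧¬H(b)∧¬H(c)] ∨ [H(b)∧¬H(a)∧¬H(c)] ∨ [H(c)∧¬H(a)∧¬H(b)]
    return (ha and not hb and not hc) or (hb and not ha and not hc) or (hc and not ha and not hb)

rows = list(product([True, False], repeat=3))
agree = all(exactly_one(*r) == exactly_one_dnf(*r) == (r.count(True) == 1) for r in rows)
print(agree)  # True
```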

WATCH OUT: You cannot use constructions that parallel English groupings of words, such as:

H(a, b, c), or H(a ∧ b ∧ c), or H(a ∨ b ∨ c).

Such expressions do not represent anything in the formal language. They are, as far as PC0 is concerned, gibberish. H is a monadic predicate; it cannot be combined with three individual constants. Conjunction and disjunction combine only sentences, not individual constants. Similarly, if R( ) is a predicate that formalizes ‘... is relaxed’, then you cannot render ‘Ann is happy and relaxed’ by writing:

(H ∧ R)(a)

This, again, is gibberish. We do not have in PC0 a conjunction that combines predicates. Conjunction is by definition an operation on sentences. Therefore ‘Ann is happy and relaxed’ is rendered as:

H(a) ∧ R(a)

7.1.1 The Semantics of PC0

An interpretation of a PC0-type language consists of:

(I) An assignment that assigns to each individual constant an object.

(II) An assignment that assigns to each n-ary predicate an n-ary relation, i.e., a set of n-tuples. (If n = 1 the interpretation assigns to the predicate some set.)

If c is assigned to c and P is assigned to P, then c and P are described as the interpretations of c and P. We also say that c denotes, or refers to, c, and that c is the denotation, or the reference, of c. This terminology is also used, though less commonly, with respect to predicates.

An interpretation determines the truth-value of each atomic sentence as follows. If an n-ary predicate P is interpreted as P and each individual constant ci is interpreted as ci, for i = 1, . . . , n, then:

P(c1, . . . , cn) has the value T if (c1, . . . , cn) ∈ P

P(c1, . . . , cn) has the value F if (c1, . . . , cn) ∉ P

(For n = 1, we simply take the member itself: P(c1) gets T if c1 ∈ P, and it gets F if c1 ∉ P.)
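The semantic clauses above can be illustrated with a small computational sketch (the representation is our own, not part of the text: constants are strings mapped to objects, predicates are sets of tuples; for uniformity we use 1-tuples even when n = 1, rather than the member itself):

```python
# (I)  An assignment of objects to individual constants.
constants = {'a': 'Ann', 'b': 'Bert', 'c': 'Claire'}

# (II) An assignment of relations (sets of n-tuples) to predicates.
predicates = {'H': {('Ann',), ('Claire',)},               # unary: the happy people
              'L': {('Ann', 'Bert'), ('Bert', 'Bert')}}   # binary: who likes whom

def value(pred, *args):
    """Truth-value of the atomic sentence pred(args): T iff the tuple of
    denotations belongs to the relation assigned to the predicate."""
    return tuple(constants[c] for c in args) in predicates[pred]

print(value('H', 'a'))       # True:  Ann is in the set assigned to H
print(value('L', 'b', 'a'))  # False: (Bert, Ann) is not in the relation
```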


The assignment of truth-values to the atomic sentences determines the truth-values of all other sentences exactly in the same way as in the sentential calculus.

By defining the possible interpretations of the language, we have also determined the concepts of logical truth and falsity: A sentence of PC0 is logically true if it gets the value T in all interpretations. It is logically false if it gets the value F under all interpretations.

We have also determined the relation of logical implication between a premise-list and a sentence. Γ logically implies A just when there is no interpretation under which all members of Γ are true and A is false.

Note: The implications we are considering now are no longer schemes of the kind we handled in 4.3. Our sentences have been specified to be particular entities, e.g., P(a, b) or P(a, b) ∨ R(b, b). The implication is logical if, for every interpretation, if all premises are true so is the conclusion. The interpretations vary, but the sentences remain the same. This is true of all systems we shall consider henceforth.

So far, no restriction has been placed on the interpretation of the predicates and the individual constants. Consequently, any assignment of truth-values to the atomic sentences can be realized by interpreting the predicates and the individual constants in a suitable way. Given a truth-value assignment, σ, we can define an interpretation as follows:

Interpret different individual constants as denoting different objects.

Interpret any n-ary predicate P as the n-ary relation P such that, for all n-tuples:

(c1, . . . , cn) ∈ P iff, for some individual constants c1, . . . , cn, ci denotes ci, i = 1, . . . , n, and σ assigns to P(c1, . . . , cn) the value T.

Under this interpretation, the atoms get the truth-values that are assigned to them by σ.
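The construction just described can be sketched in code (again an illustration under a representation of our own choosing): distinct constants get distinct objects, and each predicate's relation is read off the assignment σ.

```python
# A hypothetical truth-value assignment sigma to some atomic sentences.
sigma = {('P', 'a', 'b'): True, ('P', 'b', 'a'): False, ('P', 'a', 'a'): True}

# Interpret different individual constants as denoting different objects.
all_constants = sorted({c for atom in sigma for c in atom[1:]})
denote = {c: 'obj_' + c for c in all_constants}

# Interpret each predicate as the relation read off from sigma.
relations = {}
for (pred, *args), tv in sigma.items():
    if tv:
        relations.setdefault(pred, set()).add(tuple(denote[c] for c in args))

def atom_value(pred, *args):
    return tuple(denote[c] for c in args) in relations.get(pred, set())

# Under this interpretation every atom gets the value sigma assigns it:
print(all(atom_value(pred, *args) == tv for (pred, *args), tv in sigma.items()))  # True
```

The key point, mirrored in the code, is that distinctness of the denotations guarantees that making P(a, b) true does not accidentally make P(b, a) true.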

A tautology of PC0 is defined as a sentence whose truth-value, obtained via truth-tables, is T for all assignments of truth-values to its atomic components. Since every assignment of truth-values to the atomic sentences is obtainable in some interpretation, the logical truths of PC0 coincide with its tautologies. Similarly, logical implication and tautological implication are, in the case of PC0, the same.

The story will change when we add the equality-predicate, ≈, to the language, because the interpretation of this predicate is severely restricted. For example, the atomic sentence a ≈ a always gets the value T. It is a logical truth but not a tautology.

The sentential apparatus that has been developed for sentential logic applies in exactly the same way to PC0. We can therefore freely use the techniques of distributing, pushing negations, De Morgan’s laws, substitutions of equivalents, the top-down proof methods of 4.3.3 and 4.4.1, DNF, CNF, and all the rest.


We can also set up a deductive system, like one of those described in 6.2, except that the atoms are not the Ai’s of SC, but the atomic sentences of our PC0 language.

Homework

7.1 Translate the following sentences, using predicates, individual constants and connectives. The translation is to be based on the assumption that the sentences are interpreted in a universe consisting of three men and three women:

Jack, David, Harry, Claire, Ann, Edith

where gender is according to name, and different names denote different persons.

The language has a name for each person. You can use the English names, or their (lower-case) first letters: j, d, h, c, a, e. For ‘... is happy’, you can use H(...).

When you find a sentence ambiguous, give its possible translations and prove their non-equivalence by showing the existence of interpretations under which they get different truth-values.

1. Everyone is happy.

2. Every man is happy and every woman is happy.

3. Every man is happy or every woman is happy.

4. Someone is happy.

5. Some man is happy and some woman is happy.

6. Some man is happy or some woman is happy.

7. Some woman is happy and some is not.

8. Some women are happy, while some men are not.

9. If Jack is not happy, none of the women is.

10. If Harry is happy, then one of the women is and one is not.

11. No man is happy, unless another person is.

12. All women are not happy, but all men are.

13. Women are happy, men are not.

14. Not everyone who is happy is a woman.


15. If men are happy so are women.

7.2 Find all the logical implications that hold between the translations of the first six sentences in 7.1. (Note that there are only four sentences to consider, because of two obvious equivalences.) Prove every equivalence (by any of the methods of chapter 4, or by truth-value considerations). Prove every non-equivalence by showing the existence of an interpretation that makes one sentence true and the other false.

7.3 Translate the following sentences, under the same assumptions and using the same notations as in 7.1. Use L(... , ) for ‘... likes ’. Follow the same instructions in cases that you find ambiguous.

1. Every woman is liked by some man.

2. Some man likes every woman.

3. Some woman likes herself, and some man does not.

4. Nobody is happy who does not like himself.

5. Some women, who do not like themselves, like David.

6. Ann does not like a man, unless he likes her.

7. Claire likes a man who likes Edith.

8. Claire and Edith like some man.

9. Unless liked by a woman, no man is happy.

10. Most men like Edith.

7.4 Which, if any, of the translated first two sentences of 7.3 logically implies the other? Prove your implication, as well as non-implication, claims.

Substitutions of Individual Constants

We can substitute in any atomic sentence an individual constant by another constant; this gives another sentence. When the individual constant has more than one occurrence we can substitute any particular occurrence. We can also carry out simultaneous substitutions, i.e., several substitutions at one go. A few examples suffice to make this clear. Let a, b, c be different individual constants, and let

P(a, b, a)

be an atomic sentence.


Substituting the first occurrence of a by b we get: P(b, b, a).

Substituting (all occurrences of) a by c we get: P(c, b, c).

Substituting the first occurrence of a by c, its second occurrence by b, and b by a, we get: P(c, a, b)

Substitution in non-atomic sentences is effected by substitution in the atomic components. For example, if the given sentence is

P(a, c, b)→ ¬R(b, c)

then:

Substituting all occurrences of a by c and all occurrences of c by a we get:

P(c, a, b)→ ¬R(b, a)

Substituting the first occurrence of b by c and the second by a we get:

P(a, c, c)→ ¬R(a, c)
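Simultaneous substitution of all occurrences can be sketched as a small program (the representation is our own: sentences as nested tuples; substitution at selected occurrences, as in the first examples above, is not implemented here):

```python
def substitute(sentence, mapping):
    """Simultaneously substitute constants throughout a sentence.
    Sentences are nested tuples; substitution acts only inside the
    atomic components, exactly as described in the text."""
    op = sentence[0]
    if op == 'atom':
        _, pred, args = sentence
        return ('atom', pred, tuple(mapping.get(c, c) for c in args))
    return (op,) + tuple(substitute(part, mapping) for part in sentence[1:])

# P(a, c, b) → ¬R(b, c) as a tree:
s = ('->', ('atom', 'P', ('a', 'c', 'b')), ('not', ('atom', 'R', ('b', 'c'))))

# Substituting all occurrences of a by c and all occurrences of c by a:
print(substitute(s, {'a': 'c', 'c': 'a'}))
# ('->', ('atom', 'P', ('c', 'a', 'b')), ('not', ('atom', 'R', ('b', 'a'))))
```

Note that the mapping is applied simultaneously, so a → c and c → a do not interfere with each other, matching the worked example above.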

7.2 PC0 with Equality

7.2.0

The equality predicate, or equality for short, is a binary predicate used to express statements of equality. In English such statements are expressed by:

‘... is equal to ’, ‘... is identical to ’, ‘... is the same as ’

Since we use the symbol ‘=’ in our own discourse, we shall adopt a different symbol as the equality predicate of PC0: ≈. Thus,

c ≈ c′

is an atomic sentence of PC0, which says that the denotations of c and c′ are identical. On the other hand ‘c = c′’ is a sentence in our discourse which says that c and c′ are the same individual constant.

We refer to atomic sentences of the form a ≈ b as equalities. (The infix notation is a mere convention; we could have used ‘≈(a, b)’.)


We use ‘a ≉ b’ as a shorthand for ‘¬(a ≈ b)’. Sentences of this form are referred to as inequalities.

We stipulate as part of the semantics of PC0 that if the language contains equality, then:

a ≈ b gets T iff denotation of a = denotation of b .

As we shall see, an interpretation of a first-order language involves the fixing of a certain set of objects as the universe of discourse. All individual constants denote objects in that universe, and all predicates are interpreted as subsets of the universe, or as relations over it. Once the universe is chosen, the interpretation of ≈ is, by definition, the identity relation over it: the set of all pairs (a, a), where a belongs to the universe.

For this reason ≈ is considered a logical predicate. All other predicates are non-logical. The following sentences are logical truths:

(1) |= a ≈ a

(2) |= a ≈ b → b ≈ a

(3) |= (a ≈ b ∧ b ≈ c) → a ≈ c

But they are not tautologies. Their truth is not established on the basis of their sentential structure, but because ≈ is interpreted as the identity relation. The following principle obviously holds:

(EP) For all individual constants c, c′: If A′ is obtained from A by simultaneously substituting in some places c′ for c and/or c for c′, then:

c ≈ c′ → (A ↔ A′) is a logical truth.

Here A can be any sentence; it can itself be an equality. Let

A = a ≈ b

Let A′ be obtained by substituting simultaneously a for b and b for a. Then

A′ = b ≈ a

By (EP), the following is a logical truth:

a ≈ b → [a ≈ b ↔ b ≈ a]

This is easily seen to imply (2). In a similar way, we can derive (3) from (EP). (Can you see how?)


(EP) is implied by the stipulation that ≈ is interpreted as the identity relation and by the following general principle that underlies the semantics of PC0:

Extensionality Principle: The truth-value of a sentence, in a given interpretation, does not change if we substitute a term by another term that has the same denotation.

By ‘term’ we mean here an individual constant; but the principle applies if ‘term’ covers predicates, whose denotations are, as we said, sets or relations (sets of tuples).

In many contexts of natural language the extensionality principle does not hold. The following standard example is due to Frege. ‘Hesperus’ and ‘Phosphorus’ are two names that stand, respectively, for ‘the evening star’ and ‘the morning star’. Both, it turns out, denote the planet Venus. The sentences

(4) Jill believes that Hesperus is identical to Phosphorus.

(5) Jill believes that Phosphorus is identical to Phosphorus.

need not have the same truth-value, although (5) results from (4) by substituting a name by a co-denoting name (i.e., one with the same denotation). Jill does not doubt that (5) is true, but she may be unaware that Hesperus and Phosphorus are the same planet.

(4) and (5) are about Jill’s beliefs. We can formalize statements of this type if we introduce something like a monadic connective, say Bel, which operates on sentences. For every sentence A, there is a sentence Bel(A), which says that Jill (or some anonymous agent) believes that A. Syntactically Bel acts like negation. But it is not truth-functional, hence it is not a sentential connective of classical logic. Individual constants that occur in the scope of Bel cannot, in general, be substituted by co-denoting constants without change of the truth-value.

The same is true of a wide class of sentences involving ‘that’-phrases (‘thinks that ...’, ‘knows that ...’ and others), as well as expressions of necessity or possibility (‘it is necessary that ...’, ‘it is possible that ...’). In formal languages, such non-classical connectives are known as modal. In this course we shall not encounter them. Contexts, and languages, in which the extensionality principle holds are called extensional. Non-extensional contexts, such as (4), are known as intensional. Classical logic is extensional throughout.

The truth-table method for detecting tautologies does not detect the new kind of logical truth, which derives from the meaning of ≈. It can be modified so as to take care of this special predicate. Instead of considering all possible assignments of truth-values to the atomic sentences, we have to rule out certain assignments as unacceptable. This amounts to striking out certain rows in the truth-table; for example, a row that assigns c ≈ c the value F, or a row that assigns to a ≈ b and to b ≈ a different truth-values, is unacceptable. We should also strike out any row that assigns the value T to a ≈ b and to a ≈ c but assigns different values


to P(a, b) and P(c, c). It is possible to state general, necessary and sufficient conditions for the acceptability of a row; but we shall not do it here. Once the acceptable rows have been determined, we can check for logical truth:

A sentence of PC0 is logically true iff it has the value T in all acceptable rows of its truth-table.

Note the stronger condition for tautologies: a sentence is a tautology iff it gets the value T in all rows, including the non-acceptable ones.

Instead of modifying the truth-table method, we shall adjust the top-down derivation methods of 4.3.3 and 4.4.1, by adding certain laws that take care of ≈. This leads to a much simpler, easier-to-apply prescription for checking logical implications (and, in particular, logical truth) of PC0 sentences.

We remarked that the top-down derivation method applies to PC0 without change. The only difference is that instead of the atoms Ai of the sentential calculus (cf. 6.1), or instead of the sentential variables that are used in 4.2.1, we have the atomic sentences of PC0. (We can also continue to use sentential variables, regarding them as some unspecified sentences.)

If ≈ is not present, we get at the end either a proof of the initial goal, or a truth-value assignment to the atoms that makes the premises true and the conclusion false. In the latter case we can (as shown in 7.1.1) find an interpretation of the individual constants and the predicates that yields this assignment; hence we get our counterexample.

The same procedure is adequate for sentences containing ≈, provided that certain laws are added to our previous lists.

7.2.1 Top-Down Fool-Proof Methods For PC0 with Equality

We shall concentrate mostly on the proof-by-contradiction variant (cf. 4.4.0, 4.4.1), which is based on fewer laws and which necessitates fewer additions. Only two additional laws are required.

The first law adds another type of self-evident implications. They, as well, can now serve as axioms in a bottom-up proof, or as successful final goals (marked by ‘√’) in a top-down derivation. The second is a reduction law that can be used in reducing goals. In the following ‘c’ and ‘c′’ are variables ranging over individual constants. ‘ES’ stands for ‘Equality Substitution’.


Equality Laws For Proofs-by-Contradiction

(EQ) Γ, ¬(c ≈ c) |= ⊥

(ES) Γ, c ≈ c′ |= ⊥ ⇐⇒ Γ′, c ≈ c′ |= ⊥

where c and c′ are different constants, c occurs in Γ, and Γ′ results from Γ by substituting everywhere c′ for c.

(For simplicity c ≈ c′ has been written as the rightmost premise. This is immaterial since we can reorder the premises.) Recall that Γ |= ⊥ means that there is no interpretation that makes all sentences of Γ true; a counterexample is an interpretation that does this. The two sides of (ES) are counterexample-equivalent: an interpretation is a counterexample to the implication of one of the sides iff it is a counterexample to the other. This is implied by (EP): If c ≈ c′ gets T, then all the sentences of Γ get the same truth-values as their counterparts in Γ′.

To apply (ES) in a top-down derivation, choose some equality c ≈ c′ among the premises and replace every occurrence of c by c′ in every other sentence. After the application, c does not occur in any of the sentences except in c ≈ c′. We shall refer to such an application as a c-reduction.

Note: The ⇐-direction of (ES) is the one applied in bottom-up proofs. This direction allows us, when c ≈ c′ is a premise and c does not occur in any other sentence, to replace in the other sentences some (one or more) occurrences of c′ by c.

We could have formulated (ES) in a more general form, which allows us to substitute some, but not necessarily all, of the occurrences of c. But the top-down process is simpler and more efficacious when the law is applied in its present form.

Note: The restriction that c and c′ be different and that c occur in Γ rules out cases where the substitution would leave Γ unchanged.

Consider a c-reduction, where the equality is c ≈ c′. After the reduction c appears only in the equality c ≈ c′. Call an individual constant that appears in the premises dangling if it has only one occurrence, and that occurrence is the left-hand side of an equality. Then every c-reduction reduces the number of non-dangling constants by one.

Here are three very simple examples of top-down derivations:

1. a ≈ b |= b ≈ a

2. a ≈ b, ¬(b ≈ a) |= ⊥


3. a ≈ b, ¬(b ≈ b) |= ⊥ √

The first step is the usual move (via the Contradictory-Conclusion Law) in a proof by contradiction. The passage from 2. to 3. is via (ES), with a ≈ b in the role of c ≈ c′. 3. is a self-evident implication that falls under (EQ). The bottom-up proof is obtained by reversing the sequence. The passage from 3. to 2. is via the ⇐-direction of (ES).

1. a ≈ b, b ≈ c |= a ≈ c

2. a ≈ b, b ≈ c, ¬(a ≈ c) |= ⊥

3. a ≈ c, b ≈ c, ¬(a ≈ c) |= ⊥ √

Here the step from 2. to 3. is achieved by applying (ES), with the second equality (i.e., b ≈ c) in the role of c ≈ c′; the application results in substituting b in the first equality by c.

1. a ≈ b, c ≈ b, P(a, c, a) |= P(c, b, a)

2. a ≈ b, c ≈ b, P(a, c, a), ¬P(c, b, a) |= ⊥

3. a ≈ b, c ≈ b, P(b, c, b), ¬P(c, b, b) |= ⊥

4. a ≈ b, c ≈ b, P(b, b, b), ¬P(b, b, b) |= ⊥ √

Here 3. is obtained from 2. by using (ES), the equality in question being a ≈ b. Then 4. is obtained from 3. by another application of (ES), this time the equality being c ≈ b.
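For premises consisting only of literals, the reduce-and-check procedure illustrated in these examples can be mechanized. The sketch below is our own (representation and names invented): it repeatedly applies (ES)-style substitutions and then tests for self-evidence. Run on the premises of the last example together with the negated conclusion, it confirms that they are contradictory.

```python
def reduce_and_check(premises):
    """Premises are literals (sign, predicate, args); '≈' marks equalities.
    Repeatedly apply c-reductions, then test for self-evidence."""
    premises = list(premises)
    changed = True
    while changed:
        changed = False
        for i, (sign, pred, args) in enumerate(premises):
            if sign and pred == '≈' and args[0] != args[1]:
                c, c2 = args
                others = [premises[j] for j in range(len(premises)) if j != i]
                rest = [(s, p, tuple(c2 if x == c else x for x in a))
                        for (s, p, a) in others]
                if rest != others:                       # c occurred elsewhere
                    premises = rest + [(sign, pred, args)]
                    changed = True
                    break
    # Self-evident: a literal occurs both unnegated and negated,
    # or some inequality of the form ¬(c ≈ c) is present.
    pos = {(p, a) for s, p, a in premises if s}
    neg = {(p, a) for s, p, a in premises if not s}
    return bool(pos & neg) or any(p == '≈' and a[0] == a[1] for p, a in neg)

goal = [(True, '≈', ('a', 'b')), (True, '≈', ('c', 'b')),
        (True, 'P', ('a', 'c', 'a')), (False, 'P', ('c', 'b', 'a'))]
print(reduce_and_check(goal))  # True: the premises are unsatisfiable
```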

The Adequacy of the Method

The top-down method for PC0 with equality is based on our previous reduction steps and on reductions via (ES). Given an initial goal:

Γ |= A or Γ |= ⊥

we apply repeated reductions. This must end with a bunch of goals that cannot be further reduced. The argument follows the same lines as in the case of sentential logic. Applications of the connective laws yield simpler goals. (In 6.2.4 and 6.2.5 we have seen how to associate weights with goals, so that the applications are weight-reducing.) Applications of (ES) consist in substitutions that preserve all the sentential structure. The goal is simplified in that the number of non-dangling constants is reduced. Following these considerations, it is not difficult to see that the process must terminate. (We can define a new weight by adding to the weight defined in 6.2.5 the number of all non-dangling constants. The new weight goes down with each reduction step. We therefore have an inductive argument exactly like those of 6.2.4 and 6.2.5.)


The end goals of the process (those in the leaves) are elementary implications, i.e., they contain only literals: atoms or negated atoms. And they cannot be further reduced via (ES). Consider an implication of this type, where all equalities c ≈ c′ in which the two sides are different are written first:

c1 ≈ c′1, c2 ≈ c′2, . . . , cn ≈ c′n, Γ |= ⊥

Γ consists of all premises that are not equalities, or that are trivial equalities: a ≈ a. None of the ci occurs in any other place; otherwise we could have applied a ci-reduction. They are exactly all the dangling constants.

Assume that the implication is not self-evident, i.e., that the premises contain neither a sentence and its negation, nor an inequality of the form ¬(c ≈ c). Then the following interpretation makes all premises true. Hence it is a counterexample.

(I) Let a1, . . . , am be all the different non-dangling individual constants. Interpret them as names of different objects, a1, . . . , am.

(II) Interpret the predicates in a way that makes every atom occurring positively (i.e., unnegated) in Γ true and every atom occurring negatively false. (This can be done because different constants occurring in Γ have been assigned different denotations.)

(III) Interpret each ci as denoting the same object (among the aj’s) that is denoted by c′i, i = 1, . . . , n. (This can be done because each ci occurs only in the equality ci ≈ c′i.)

If all the elementary implications are self-evident, the top-down derivation tree shows that the initial goal is a logical implication. By inverting the tree we get a bottom-up proof. If, on the other hand, one of the elementary implications is not self-evident, then, as just shown, we can construct a counterexample. This is also a counterexample to the initial goal.

QED

Here is a simple example. The initial goal is:

L(a, b), L(a, c)∧L(b, a)→ H(a), b ≈ c |= H(a)

An attempt to construct a top-down derivation results in:

1. L(a, b), L(a, c)∧L(b, a)→ H(a), b ≈ c |= H(a)

2. b ≈ c, L(a, b), L(a, c)∧L(b, a)→ H(a), ¬H(a) |= ⊥

3. b ≈ c, L(a, c), L(a, c)∧L(c, a)→ H(a), ¬H(a) |= ⊥

4.1 b ≈ c, L(a, c), ¬[L(a, c)∧L(c, a)], ¬H(a) |= ⊥


4.2 b ≈ c, L(a, c), H(a), ¬H(a) |= ⊥ √

5.11 b ≈ c, L(a, c), ¬L(a, c), ¬H(a) |= ⊥ √

5.12 b ≈ c, L(a, c), ¬L(c, a), ¬H(a) |= ⊥ ×

In the first step we have also rearranged the premises by moving b ≈ c to the beginning. The step from 2. to 3. is a b-reduction. The other steps are of the old kind. 5.12 is elementary but not self-evident. It has a counterexample:

Let a denote a, and let c denote c, where a ≠ c.

Interpret L as any relation L such that (a, c) ∈ L and (c, a) ∉ L.

Interpret H as any set, H, such that a ∉ H.

Interpret b as denoting c.
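This counterexample can be checked directly; here is a quick sketch (the object names are ours, chosen arbitrarily):

```python
# b and c co-denote; a denotes a different object.
denote = {'a': 'a_obj', 'b': 'c_obj', 'c': 'c_obj'}
L = {('a_obj', 'c_obj')}     # (a, c) ∈ L, (c, a) ∉ L
H = set()                    # a ∉ H

premises = [
    denote['b'] == denote['c'],              # b ≈ c
    (denote['a'], denote['c']) in L,         # L(a, c)
    (denote['c'], denote['a']) not in L,     # ¬L(c, a)
    denote['a'] not in H,                    # ¬H(a)
]
print(all(premises))  # True: all premises of 5.12 are true together
```

Since an interpretation makes all premises of 5.12 true, that goal is not a logical implication, and neither is the initial goal.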

In general, in the presence of ≈, it is advisable to apply c-reductions as soon as possible. This will reduce the number of constants in the other premises.

Note: Obviously, Γ, c ≈ c |= C ⇔ Γ |= C. Hence trivial equalities can be dropped. Yet we do not have to include this among our laws. The proof above shows that the method is adequate without the law for dropping trivial equalities. Sometimes dropping c ≈ c results in the disappearance of c from the premises. In this case any counterexample to the reduced goal becomes a counterexample to the original goal if we assign to c any arbitrary denotation.

The Top-Down Method of 4.3.3 for PC0 with Equality

The adjustment of the method of 4.3.3 is obtained along similar lines. Here is a brief sketch. Recall that the method does not employ ‘⊥’. It treats implications of the form

Γ |= C

To handle ≈, we split each of our old equality laws into two laws, one for the premises and one for the conclusion. Altogether we have four laws that treat equalities: the self-evident implications (EQ1) and (EQ2) and the reduction laws (ES1) and (ES2).


(EQ1) Γ, ¬(c ≈ c) |= C

(EQ2) Γ |= c ≈ c

In the following laws c and c′ are different constants, c has at least one other occurrence, and Γ′, C′ result from Γ, C by substituting everywhere c′ for c.

(ES1) Γ, c ≈ c′ |= C ⇐⇒ Γ′, c ≈ c′ |= C′

(ES2) Γ |= ¬(c ≈ c′) ⇐⇒ Γ′ |= ¬(c ≈ c′)

To see why (ES2) holds, consider a counterexample to one of the sides. Since ¬(c ≈ c′) gets F, c ≈ c′ gets T. Hence c and c′ have the same denotation. Therefore the sentences in Γ and in Γ′ get the same truth-values.

The top-down reductions, for a given initial goal, proceed much as before. Again, it is advisable to carry out c-reductions, via (ES1) and (ES2), as early as possible. At the end the goal is reduced to a bunch of elementary implications that cannot be further reduced. Corresponding to the four self-evident implication laws, there are four types of self-evident implications:

Γ, A, ¬A |= C    Γ, A |= A    Γ, ¬(c ≈ c) |= C    Γ |= c ≈ c

The method’s adequacy is proven by showing that if an elementary implication is not of any of these forms and if, moreover, it cannot be further reduced via (ES1) or (ES2), then there is an interpretation that makes all premises true and the conclusion false.

Example:

1. P(a), P(b)→ ¬P(c) |= a ≈ b→ ¬(b ≈ c)

2. P(a), P(b)→ ¬P(c), a ≈ b |= ¬(b ≈ c) (|=,→)

3. P(b), P(b)→ ¬P(c), a ≈ b |= ¬(b ≈ c) a-reduction, (ES1)

4. P(c), P(c)→ ¬P(c), a ≈ c |= ¬(b ≈ c) b-reduction, (ES2)

5.11 P(c), ¬P(c), a ≈ c |= ¬(b ≈ c) (→, |=), √

5.12 P(c), ¬P(c), a ≈ c |= ¬(b ≈ c) (→, |=), √


Gentzen-Type Systems for PC0 with Equality

The Gentzen-type systems GS1 and GS2, considered in 6.2.5, can be extended to PC0 with equality. By now the extension should be obvious: just as GS1 and GS2 are obtained by formalizing the laws of 4.3.4 and 4.4.1 into axioms and inference rules, the required extensions are obtained by formalizing the additional laws. The required extension of GS2 is obtained by adding the following axiom and inference rule:

Axiom:    Γ, ¬(c ≈ c) ⇒ ⊥

Rule:     from Γ′, c ≈ c′ ⇒ ⊥ infer Γ, c ≈ c′ ⇒ ⊥

where c and c′ are different constants, c occurs in Γ, and Γ′ results from Γ by substituting everywhere c′ for c.

Note that the inference rule corresponds to the ⇐-direction of (ES). The completeness of the extended system follows now from the adequacy of the top-down proof-by-contradiction method for PC0. The extension of GS1 is obtained in a similar way and is left to the reader.

A Hilbert-Type System for PC0 with Equality

The Hilbert-type system HS1, given in 6.2.3, can be extended to a system that is sound and complete for PC0 with equality. As in HS1, we assume that the language of PC0 has only ¬ and → as primitive sentential connectives. The same kind of extension applies to every sound and complete system that has modus ponens as an inference rule (either primitive or derived). In particular it applies to the systems obtained from HS1 by adding connectives with the associated axioms, as described in 6.2.4 (cf. Homework 6.12, 6.13 and 6.14). It turns out that the addition of the following two equality axiom-schemes is all that is needed:

EA1 c ≈ c, where c is any individual constant.

EA2 c ≈ c′ → (A → A′), where c and c′ are any individual constants, A is any sentence of PC0 and A′ is obtained from A by substituting one occurrence of c by c′.

Actually, we can restrict EA2 to the cases where A is an atom; this, together with EA1, is already sufficient, but we shall not go into this here.


If our original system is complete (i.e., is sufficient for proving all tautological implications), then the addition of the two axiom schemes takes care of all implications that are due to the connectives and to equality. For example, the logical truth

a ≈ b → b ≈ a

is derivable from EA1 and EA2, because

a ≈ b → (a ≈ a → b ≈ a)

is an instance of EA2 (where A is a ≈ a, c and c′ are, respectively, a and b, and A′ is obtained by replacing the first occurrence of a by b). This and a ≈ a tautologically imply a ≈ b → b ≈ a. Note that, while EA2 allows us to replace one occurrence of c by an occurrence of c′, repeated applications enable us to replace any number of c’s by c′’s.
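That this last step is indeed a tautological implication (no acceptability restriction is needed; it is purely sentential) can be confirmed by checking all eight rows over the three atoms. A small sketch of ours, writing x, y, z for a ≈ b, a ≈ a, b ≈ a:

```python
from itertools import product

ok = True
for x, y, z in product([True, False], repeat=3):
    ea2_instance = (not x) or ((not y) or z)   # a≈b → (a≈a → b≈a)
    ea1_instance = y                           # a≈a
    conclusion = (not x) or z                  # a≈b → b≈a
    # Tautological implication: no row makes both premises true
    # and the conclusion false.
    if ea2_instance and ea1_instance and not conclusion:
        ok = False
print(ok)  # True
```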

The completeness of the resulting system is proved in the same way used to prove the completeness of HS1. We extend the previous proof by showing that the provability relation, ⊢, of that system has the properties required to insure the adequacy of the top-down derivation method. We have to establish the following properties, which correspond to (EQ1), (EQ2) and the ⇐-directions of (ES1) and (ES2).

(i) Γ, ¬(c ≈ c) ⊢ C

(ii) Γ ⊢ c ≈ c

In the following, Γ′, C′ result from Γ, C by substituting everywhere c′ for c.

(iii) Γ′, c ≈ c′ ⊢ C′ ⇒ Γ, c ≈ c′ ⊢ C

(iv) Γ′ ⊢ ¬(c ≈ c′) ⇒ Γ ⊢ ¬(c ≈ c′)

Homework

7.5 Find which of the following is a logical implication. Justify your answers (positive ones by derivations or truth-value considerations, or both; negative ones by counterexamples).

1. L(a, b)→ L(b, a), L(a, b)∧L(b, c)→ L(a, c) |= c ≈ a→ (L(a, b)→L(a, a)) ?

2. (L(a, b)∧L(b, a)∧(a ≉ b))→ H(a) |= L(a, a)∧L(a, b)→ H(a) ?

3. L(a, b) → L(b, c), L(b, c) → L(c, a), L(c, a) → L(a, b) |= a ≈ b → [L(a, a) ↔ L(c, a)] ?

7.6 Consider interpretations of a language based on a two-place predicate L( , ), and individual constants a, b, c, such that:


(i) Each individual constant denotes somebody among Nancy, Edith, Jeff, and Bert, where these are four distinct people.

(ii) L(... , ---) reads as: ‘... likes ---’, where all the pairs (x, y) such that x likes y are:

(Nancy, Edith), (Nancy, Jeff), (Edith, Edith), (Edith, Bert), (Jeff, Jeff), (Jeff, Bert), (Bert, Edith).

Consider the sentences

(1) L(a,b) ∧ L(b,a) ∧ ¬ L(b, b)

(2) ¬ L(a,a) ∧ L(b,a) ∧ L(c,a)

Find all the ways of interpreting a and b so that (1) comes out true; and all the ways of interpreting a, b, c so that (2) comes out true. Indicate in each case your reasons.
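Because the universe is so small, answers to problems of this kind can be checked mechanically by brute force. The Python sketch below (illustrative; it assumes, since (i) does not say otherwise, that distinct constants may denote the same person) enumerates all denotations and tests the two sentences:

```python
from itertools import product

PEOPLE = ["Nancy", "Edith", "Jeff", "Bert"]
LIKES = {("Nancy", "Edith"), ("Nancy", "Jeff"), ("Edith", "Edith"),
         ("Edith", "Bert"), ("Jeff", "Jeff"), ("Jeff", "Bert"),
         ("Bert", "Edith")}

def L(x, y):
    # The interpretation of L: membership in the set of likes-pairs.
    return (x, y) in LIKES

# Sentence (1): L(a,b) and L(b,a) and not L(b,b)
sols1 = [(a, b) for a, b in product(PEOPLE, repeat=2)
         if L(a, b) and L(b, a) and not L(b, b)]

# Sentence (2): not L(a,a) and L(b,a) and L(c,a)
sols2 = [(a, b, c) for a, b, c in product(PEOPLE, repeat=3)
         if not L(a, a) and L(b, a) and L(c, a)]

print("ways to make (1) true:", sols1)
print("number of ways to make (2) true:", len(sols2))
```

Of course, the point of the exercise is the reasoning; the enumeration only confirms it.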

7.7 Show that the following sentences are implied tautologically by instances of EA1 and EA2, where EA2 is used for atomic sentences only. (Outline the argument, giving the instances of the required equality axioms.)

1. a ≈ b → [b ≈ c → a ≈ c]

2. a ≈ b → [L(a, a) → L(b, b)]

3. a ≈ b → [(c ≉ d) ∨ (S(a, c) ↔ S(b, d))]

7.8 Prove (iii) in the above-mentioned list of properties required of ⊢. (Hint: show that every sentence of Γ′ is provable from c ≈ c′ and the corresponding sentence of Γ, by repeated uses of (EA2). Using the fact that c ≈ c′ ⊢ c′ ≈ c, show that C is provable from c ≈ c′ and C′.)

7.3 Structures of Predicate Logic in Natural Language

7.3.1 Variables and Predicates

Variables as Place Markers

When predicates of PC0 of arity > 1 are meant to formalize English predicate-expressions, the correspondence between the argument places should be clear and unambiguous. For example, if we introduce the two-place L( , ) as the formal counterpart of ‘likes’ we can say:


L(... , ---) is to be read as ‘... likes ---’.

It is obvious here that the first and second places in L( , ) match, respectively, the left and right sides of ‘likes’. But this cumbersome notation is impractical when the arities are larger, or when the English expressions are longer. At this point variables come in handy. We can say that

L(x, y) is to be read as ‘x likes y’

You are probably acquainted with the use of variables that range over numbers from high school algebra. Variables that range over sentences have been used in previous chapters, as well as variables ranging over strings. In chapter 5 we used variables ranging over arbitrary objects. Later we shall extend PC0 by incorporating in it variables as part of the language.

At present we are going to use ‘x’, ‘y’, ‘z’, ‘u’, ‘v’, etc. merely as place markers: to mark certain places within syntactic constructs.

The identity of the variables is not important. For example, the following three stipulations come to the same:

L(x, y) is to be read as ‘x likes y’.

L(u, v) is to be read as ‘u likes v’.

L(y, x) is to be read as ‘y likes x’.

Each means that L is a binary predicate to be interpreted as:

{(p, q) : p likes q}

But if we were to say that

L(x, y) is to be read as ‘y likes x’,

then we would be assigning to L a different interpretation, namely:

{(p, q) : q likes p}

If a and b denote, respectively, a and b, then under the first stipulation L(a, b) is true iff a likes b, but under the second it is true iff b likes a.

By substituting, in English sentences, variables for noun-phrases we can indicate predicate-expressions. We can then say, for example:


Let H be the predicate ‘x is happy’,

which says that the monadic predicate H( ) is to be interpreted as the set of all happy beings. When the arity is > 1 it should be clear which coordinates are represented by which variables. If we say

Let L be the predicate ‘x likes y’,

then we should indicate the correspondence between variables and places of L. In the absence of other indications, we shall take the alphabetic order of the variables as our guide: ‘x’ before ‘y’, ‘y’ before ‘z’.

Deriving Predicates from English Sentences

Generally, an English sentence can give rise to more than one predicate, because we can mark (using variables) different places as empty. To take an example from Frege,

(1) Brutus killed Caesar

gives rise to the expression ‘x killed y’, as well as to ‘x killed Caesar’ and ‘Brutus killed y’.

Let K, K1, and K2 be, respectively, predicates corresponding to these expressions. Then each of the following three is a formalization of (1):

(1′) K(Brutus, Caesar)

(1′′) K1(Brutus)

(1′′′) K2(Caesar)

K denotes the binary relation of killing: the set of all pairs (p, q) in which p killed q. K1 corresponds to the property of being a killer of Caesar; it denotes the set of all beings that killed Caesar. K2 corresponds to the property of being killed by Brutus; it denotes the set of all beings that were killed by Brutus.

We can also derive from the binary predicate ‘x killed y’ the monadic predicate ‘x killed x’, which denotes the set of all beings that killed themselves.

The derived predicates can be quite arbitrary. For example, we can formalize

(2) Jack frequents the movies and Jill prefers to stay home,


as:

(2′) B(Jack, Jill)

where B(x, y) is to be read as: x frequents the movies and y prefers to stay home. B is therefore interpreted as

{(p, q) : p frequents the movies and q prefers to stay home}

(2′) is not a good formalization since it hides the structure of (2). A better one, which shows (2) as a conjunction, is:

(2′′) FrMv(Jack) ∧ PrHm(Jill)

where FrMv and PrHm are, respectively, the monadic predicates:

‘x frequents the movies’ and ‘y prefers to stay home’

In the same vein, we can say that (1′) is a better recasting of (1) than either (1′′) or (1′′′).

While grammar can be deceptive when it comes to logical analysis, grammatical aspects can guide us to more natural predicates that reveal more of the sentence’s logical structure.

7.3.2 Predicates and Grammatical Categories of Natural Language

The following are the basic grammatical categories that give rise to predicates. This is true of English, as well as of other languages.

• Adjectives, as in ‘x is triangular’.

• Common names, as in ‘x is a woman’.

• Verbs, as in ‘x enjoys life’.

Common names are also known as general names, or as common nouns. Adjectives and verbs give rise to predicates of arity greater than one. For example, from adjectives we get:

x is taller than y,

x is between y and z.

And from verbs:


x introduced y to z.

In English, adjectives and common names require the word ‘is’, known in this context as the copula. It connects the adjective, or the common name, with the noun phrase (or phrases). In the predicate-expression the noun phrase is replaced by a variable.

Common names are characterized by the presence of the indefinite article: a woman, an animal, a city, etc.

As you can see, a variety of English constructs are put in the same bag: all become predicates upon formalization in first-order logic. Differences of English syntax and certain differences in meaning are ignored. A finer-grained picture requires additional structural elements and may involve considerable increase in the formalism’s complexity.

Two Usages of ‘is’

The role of ‘is’ as a copula in predicate expressions is to be clearly distinguished from its role as a two-place predicate denoting identity. Compare, for example,

(3) Ann is beautiful

with

(4) Ann’s father is Bert.

In (3) ‘is’ functions as a copula. In (4) it functions as the equality predicate. (4) can be written as

Ann’s father = Bert

‘Is’ must function as the equality predicate when it is flanked by singular noun-phrases (i.e., noun-phrases denoting particular objects).

Singular Terms

Singular terms are constructs that function as names of particular objects, e.g.,

Bill Clinton, New York City, 132, The smallest prime number, The capital of the USA, etc.

There is a difference between the first two, which are, so to speak, atomic, and the other two, which pick their objects by means of a description. The first are called proper names, the last definite descriptions. Usually, a definite description is marked by the definite article:


the capital of the USA, the satellite of the earth, the second world war, the man who killed Liberty Valence, etc.

But this rule has exceptions. ‘The USA’ should be construed as a proper name, while ‘132’ is really a disguised description (spelled out, it becomes ‘1 · 10² + 3 · 10¹ + 2 · 10⁰’). A definite description denotes the unique object satisfying the stated condition, e.g., ‘the earth’s satellite’ denotes that unique object of which ‘x is a satellite of earth’ is true. The definite description fails to denote if either no object, or more than one object, satisfies the condition. There are various strategies for dealing with non-denoting descriptions. In Russell’s theory of descriptions, sentences containing definite descriptions are recast into sentences that have truth values even when the description of the original sentence fails to denote. On other theories, a failure of denotation can cause a truth-value gap, that is: the sentence has no truth-value.

These and other questions that relate to differences between proper names and definite descriptions have been the focus of considerable attention in the philosophy of language. Some have been the subject of a still ongoing debate.

Note: Sometimes the definite article is used merely for emphasis, or focusing:

(5) Jill is the daughter of Eileen

need not imply that Jill is the only daughter of Eileen. It can be read as

(5′) Jill is a daughter of Eileen.

which can be formalized as:

(5∗) Daughter(Jill, Eileen)

Here ‘is’ functions as a copula. Contrast this with

(6) The daughter of Eileen is Jill.

Here ‘is’ cannot be read as a copula, because ‘Jill’ cannot be a general name. (6) must be read as:

(6∗) The daughter of Eileen = Jill.

Both proper names and definite descriptions are represented by individual constants in the simplest language of first-order logic (of which PC0 is a fragment). In other variants of the language there are ways of forming other singular terms, besides individual constants. A most common variant contains function symbols. For example, it may contain a one-place function


symbol, F( ), such that

F(x) is to be read as ‘the father of x’.

We can therefore render ‘John’s father’ as: F(John), and ‘The father of John’s father’ as: F(F(John)). Function symbols can have arity > 1; for example, a two-place function symbol sum( , ), such that

sum(x, y) is to be read as ‘the sum of x and y’.

In infix notation sum(x, y) becomes x + y. Further details concerning such languages are given in 8.2.4, page 291.

Other variants of first-order languages contain a definite description operator, which is used to form expressions that read:

the unique x such that: ... x ...

where ‘...x...’ expresses some property of x.

Straightforward translations into the predicate calculus may obliterate other distinctions, besides those between adjectives, common names and verbs. The translation of ‘snow is white’ as:

White(Snow),

treats ‘snow’ as a name of an object, on a par with ‘Bill Clinton’. The distinction between mass terms (‘snow’, ‘water’, ‘coal’) and proper names (‘John’, ‘The USA’, ‘Chicago’) disappears in this translation.

7.3.3 Meaning Postulates and Logical Truth Revisited

Except for the logical particles, our formal language is uninterpreted. Consequently, various truths that are taken for granted in natural language have to be stated explicitly when the discourse is formalized.

For example, by saying that Jill is a female, one implies that she is not a male. To make this explicit, we can add

(7) Fem(j)→ ¬Male(j)


as a non-logical axiom. Actually, the axiom to be added is not (7), but the generalized sentence stating that no female is a male. The sentence is formed by using a quantifier:

(7∗) ∀v(Fem(v)→ ¬Male(v))

For the moment let us state axioms within PC0.

Following Carnap, we have introduced in 4.5.1 the term ‘meaning postulate’ to characterize non-logical axioms that reflect the meaning of linguistic terms; these terms, it turns out, are mostly predicates. We can thus say that (7), or the generalization (7∗), derives from the meaning of ‘female’ and ‘male’. As noted in 4.5.1, the absolute distinction that Carnap advocated between meaning postulates and empirical truth is now rejected by many. But it still makes good sense to distinguish (7∗) from plain empirical truths.

In 4.5.1 we have also considered another kind of presupposition, called ‘background assumptions’, which are not as basic, or as trivial, as meaning postulates. They cover a wide range, from truths whose certainty seems beyond doubt to those that are merely probable. No further elaboration is needed here.

Predicate logic and its relation to natural language call for further clarifications of logical truth and logical implication, beyond those that have to do with the sentential connectives. Consider for example:

(8) The earth is bigger than the moon,

(9) The moon is smaller than the earth.

Does (8) logically imply (9)? (Or, equivalently, is the conditional ‘If (8) then (9)’ a logical truth?) If two different predicates, say B and S, are used for ‘bigger’ and ‘smaller’, the sentences are formalized as:

(8∗) B(earth, moon)

(9∗) S(moon, earth)

The implication between the sentences rests in this case on a meaning postulate that can be stated schematically, with ‘a’ and ‘b’ standing for any individual constants:

(10) B(a, b)↔ S(b, a)

But (10) is not a logical truth; neither is the implication from (8∗) to (9∗) a logical implication. Another way of construing the situation is to regard ‘x is bigger than y’ simply as another way of writing ‘y is smaller than x’; just as in mathematics ‘x > y’ is another way of writing ‘y < x’. And in this case (8) implies tautologically (9), because they are construed


as the same sentence. The question of what are the “right” translations of (8) and (9) may not have an answer.

We might try a third alternative: The predicates representing ‘smaller’ and ‘bigger’ are different, but certain sentences, such as (10), count as logical axioms. This only transforms our original question into the question: What meaning postulates count as logical axioms? Suppose you regard (10) as a logical truth; would you adopt the same policy with respect to other pairs:

hotter and colder,  prettier and uglier,  to the left of and to the right of?

And what about ‘hot’ and ‘cold’, ‘beautiful’ and ‘ugly’? Or sentences such as:

(11) Red(a)→ ¬Blue(a) ?

All this should not undermine the concept of logical implication. It only indicates a looseness of fit between the formal structure and our actual language, a looseness that is inevitable whenever a theoretical scheme is matched against concrete phenomena.

7.4 PC0*, Predicate Logic with Individual Variables

7.4.0

We now take the crucial step of incorporating individual variables into the formal language. Let

v1, v2, . . . , vn, . . .

be a fixed infinite list of distinct objects called individual variables, or variables for short, which are different from all previous syntactic items of PC0. The vi’s are different, but they play the same role in the formal language. It is convenient to use

‘u’, ‘v’, ‘w’, ‘x’, ‘y’, ‘z’, ‘u′’, ‘v′’, etc.

as standing for unspecified vi’s, i.e., as variables ranging over v1, . . . , vn, . . . . (We may say, “For every individual variable v ...”, or “For some individual variable w ...”.)

We shall also use ‘x’, ‘y’ and ‘z’ in another role: to range over various domains that come up in the discussion. For example, in ‘{(x, y) : x < y}’, ‘x’ and ‘y’ range over numbers. Whether ‘x’, ‘y’ and ‘z’ stand for variables of the formal language, or are used in a different role, will be clear from the context.


Henceforth ‘PC0*’ denotes the system obtained from PC0 by incorporating the individual variables.

Terms: An individual term, or, for short, a term, is either an individual constant or a variable.

Well Formed Formulas, or Wffs: The basic construct of PC0* is that of a well formed formula. The name is abbreviated as ‘wff’.

Wffs are constructed like the sentences of PC0, except that variables, besides individual constants, can fill the predicate’s empty places. We shall use lower case Greek letters:

‘α’, ‘β’, ‘γ’, ‘α1’, ‘α2’, ..., ‘β1’, ‘β2’, ..., ‘α′’, ... etc.,

to range over well formed formulas. Since sentences turn out to be special cases of well formed formulas, this involves also a notational change with regard to sentences: from upper case Latin to lower case Greek. Spelt out in detail, the definition is:

Atomic Wffs: If P is an n-place predicate and t1, . . . , tn are terms then P(t1, . . . , tn) is an atomic wff. It goes without saying that unique readability is assumed with respect to atomic wffs: the predicate and the sequence of terms are uniquely determined by the formula. This is also assumed with respect to all compounds.

The sentential connectives are now construed as operations defined for wffs. The set of all wffs is defined inductively by:

(I) Every atomic wff is a wff.

(II) If α and β are wffs then:

¬α,  α ∧ β,  α ∨ β,  α → β,  α ↔ β

are wffs.

All the syntactic concepts, such as main connective, immediate components, and components, are defined in the same way as before.

The occurrences of a term in a wff are determined in the obvious way: (i) A term, t, occurs, in the ith predicate-place, in the atomic wff P(t1, . . . , tn), iff t = ti (note that it can have several occurrences), and (ii) the occurrences of t in a wff α are its occurrences in the atomic components of α.

The Sentences of PC0*: The sentences of PC0* are, by definition, the sentences of PC0. It is easy to see that this means the following:

A wff of PC0* is a sentence iff no variables occur in it.


(For atomic wffs this is obvious. For the others, it follows from the fact that the wffs of PC0* and the sentences of PC0 are generated from atoms by the same sentential connectives.)

Examples: The following are wffs, where u, v, w, v′ are any variables.

P(b, a)   P(u, b)   P(a, a) → ¬R(c)   P(v, u) → ¬R(c)   P(v, v) ∧ (R(w) ∨ R(v′))

The first is an atomic sentence, the second is an atomic wff that is not a sentence, the third is a non-atomic sentence, the fourth and the fifth are non-atomic wffs that are not sentences.
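The sentence/wff test can be mirrored in code. In the Python sketch below (the nested-tuple representation and the fixed set of variable names are illustrative conventions of ours, not the book's), a wff is a sentence exactly when no variables occur in it:

```python
VARS = {"u", "v", "w", "x", "y", "z"}   # our chosen variable names

def variables_in(wff):
    """Collect the variables occurring in a wff.

    Atoms are ('atom', predicate, term, ...); compounds are
    ('not', a), ('and', a, b), ('or', a, b), ('imp', a, b), ('iff', a, b).
    """
    if wff[0] == "atom":
        return {t for t in wff[2:] if t in VARS}
    result = set()
    for part in wff[1:]:
        result |= variables_in(part)
    return result

def is_sentence(wff):
    # A wff is a sentence iff no variables occur in it.
    return not variables_in(wff)

assert is_sentence(("atom", "P", "b", "a"))        # P(b, a): a sentence
assert not is_sentence(("atom", "P", "u", "b"))    # P(u, b): not a sentence
```

Note that the test is purely syntactic, matching the remark below that the sentence/wff distinction is syntactic, not semantic.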

So far the variables play the same syntactic role as individual constants. There is, however, a semantic distinction: The interpretation of the language assigns denotations to individual constants, but not to the variables. Consequently, an interpretation determines the truth-values of sentences, but not of wffs. To determine the truth-value of a wff α we need, besides the interpretation of the language, an assignment of objects to the variables of α.

For example, the truth value of P(a, b) is determined by the denotations of a and b and the interpretation of P; but, in order to get the truth-value of P(v, b), we need, in addition, to assign some object as the value of v. This will be elaborated and clarified within the general setting of first-order logic.

Note: It may happen that the truth-value of a wff, which is not a sentence, is the same for all assignments of objects to its variables. For example, P(u, v) ∨ ¬P(u, v) gets, for every assignment, the value T. Or (P(u, v) → H(a)) ∧ (P(u, v) ∨ H(a)) gets the same value as H(a). A wff may therefore be logically equivalent to a sentence. This does not make it a sentence. The distinction between sentences and wffs that are not sentences is syntactic, not semantic.

7.4.1 Substitutions

In 7.1.1 we discussed substitutions, in sentences of PC0, of individual constants by individual constants. We can now extend this to substitution of terms by terms in wffs. We denote by:

S^t_{t′} α

the wff resulting by substituting t′ for t in α. By this we mean that t′ is substituted for every occurrence of t.

Examples:

S^u_c [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, c) ∧ R(x, b)) → P(c),

S^u_x [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, x) ∧ R(x, b)) → P(x),

S^a_c [(R(a, u) ∧ R(x, b)) → P(u)] = (R(c, u) ∧ R(x, b)) → P(u),

S^b_x [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, u) ∧ R(x, x)) → P(u).


We can also substitute at one go several terms: s1 by t1, s2 by t2, ..., sn by tn. These, as we saw in 7.1.1 page 249, are called simultaneous substitutions. The result of such a simultaneous substitution in α is denoted:

S^{s1,s2,...,sn}_{t1,t2,...,tn} α.

Examples:

S^{u,x}_{a,c} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, a) ∧ R(c, b)) → P(a),

S^{u,x}_{x,u} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(a, x) ∧ R(u, b)) → P(x),

S^{a,b}_{b,x} [(R(a, u) ∧ R(x, b)) → P(u)] = (R(b, u) ∧ R(x, x)) → P(u).

Variable Displaying Notation

Notations such as:

‘α(v)’, ‘β(x, y)’, ‘γ(u, v, w)’

are used for wffs in order to call attention to the displayed variables. The point of the notation is that if we use ‘α(x)’, then we understand by ‘α(a)’ the wff obtained from α(x) by substituting a for x. Hence we have:

S^x_a α(x) = α(a),   S^{x,y}_{a,b} β(x, y) = β(a, b),   S^{x,y}_{b,b} β(x, y) = β(b, b),   S^{u,x,y}_{a,b,c} γ(u, x, y) = γ(a, b, c)

We extend this to cover substitutions of variables by variables:

S^x_y α(x) = α(y),   S^{x,y}_{y,x} β(x, y) = β(y, x),   S^{x,y}_{b,x} β(x, y) = β(b, x), etc.

If we think of α(x) as a predicate expressing a certain property, with ‘x’ marking the empty place of the predicate, then we can think of α(a) as saying that the property is true of the object denoted by a.

Incautious use of this convention may lead to notational inconsistency. A wff containing both x and y as free variables should not be written both as α(x) and as α(y). For then α(a) can be read either as the wff obtained by substituting a for x, or as the wff obtained by substituting a for y; and these are different. Also in that case we might read α(y) as the result of substituting y for x in α(x). Such inconsistencies are avoided if we display all the free variables that we consider subject to substitutions. In the case just mentioned we write the wff as α(x, y). Then,

S^x_a α(x, y) = α(a, y),   S^y_a α(x, y) = α(x, a),   S^x_y α(x, y) = α(y, y).

Note that it would do no harm to display variables that do not occur in the wff; if x does not occur in α(x), then substituting for it any term does not have any effect: α(x) = α(c). But


since this may confuse, it is best to avoid it. Usually, use of ‘α(x)’ is taken to indicate that ‘x’ occurs in α.

When we want to focus on a particular variable, while indicating that there are possibly others, we can use notations such as: α(. . . x . . .). Or we can state explicitly that α(x) may have other variables.

7.4.2 Variables and Structural Representation

A wff α(x), having no variables besides x, can serve as a scheme for getting sentences of the form α(c), where c is any individual constant. Similarly, a wff α(u, v) can serve as a scheme for sentences of the form α(a, b). Such schemes can give us a handle on long sentences. Suppose we want to formalize:

(1) Everyone, among Jack, David and Harry, who is liked by Ann is liked by Claire.

Let us use L(x, y) for ‘x likes y’, and the first letters for the names. The desired sentence is a conjunction, saying, of each of the men, that if he is liked by Ann he is liked by Claire:

(1′) (L(a, j) → L(c, j)) ∧ (L(a, d) → L(c, d)) ∧ (L(a, h) → L(c, h))

We can, instead, describe the sentence as follows:

(1∗) α(j) ∧ α(d) ∧ α(h), where α(x) = L(a, x) → L(c, x).

Here α(x) “says of x” that if he is liked by Ann he is liked by Claire; hence, α(j), α(d), and α(h) say, respectively, that the property holds for Jack, David and Harry. This way of rewriting (1′) is much shorter and much more transparent. It brings to the fore a certain structure. Here are additional examples, in which the logical form is displayed by using variables:

(2) Everyone, among Jack, David and Harry, who is happy likes himself.

(2∗) α(j) ∧ α(d) ∧ α(h), where α(x) = H(x) → L(x, x).

If needed, we can unfold the sentence by carrying out all the substitutions and writing down the full conjunction:

(H(j)→ L(j, j)) ∧ (H(d)→ L(d, d)) ∧ (H(h)→ L(h, h))
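Unfolding a scheme is purely mechanical: substitute each constant into the scheme and conjoin the instances. A small, string-based Python sketch (the function names and rendering are ours, purely for illustration):

```python
def alpha(x):
    # The scheme of (2*): H(x) -> L(x, x), rendered as a string
    return f"(H({x}) -> L({x},{x}))"

def unfold_conjunction(scheme, constants):
    """Unfold a scheme into the conjunction of its instances."""
    return " & ".join(scheme(c) for c in constants)

print(unfold_conjunction(alpha, ["j", "d", "h"]))
# (H(j) -> L(j,j)) & (H(d) -> L(d,d)) & (H(h) -> L(h,h))
```

Replacing `" & ".join` by `" | ".join` would unfold a disjunctive scheme such as (3∗) below in the same way.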


(3) Someone among Jack, David and Harry who does not like Ann likes Claire.

(3∗) α(j) ∨ α(d) ∨ α(h), where α(x) = ¬L(x, a) ∧ L(x, c).

The following case displays a two-level structure, obtained by repeating the same technique.

(4) Someone, among Ann, Claire and Edith, likes everyone among Jack, David and Harry who likes himself.

(4∗) α(a) ∨ α(c) ∨ α(e), where α(x) = β(x, j) ∧ β(x, d) ∧ β(x, h),

where β(x, y) = L(y, y)→ L(x, y).

You may arrive at this by the following analysis:

(i) (4) says that at least one among Ann, Claire and Edith has a certain property. If α(x) expresses this property then the desired sentence is:

α(a) ∨ α(c) ∨ α(e)

(ii) α(x) says that x likes everyone among Jack, David and Harry who likes

himself. It is expressed as a conjunction:

β(x, j) ∧ β(x, d) ∧ β(x, h)

where β(x, y) says that if y likes himself then x likes y; that is:

(iii) β(x, y) = L(y, y)→ L(x, y).

The following case involves the equality predicate.

(5) Among Ann, Claire and Edith, someone likes all the others.

(5∗) α(a) ∨ α(c) ∨ α(e), where α(x) = β(x, a) ∧ β(x, c) ∧ β(x, e),

where β(x, y) = x ≉ y → L(x, y).

We need the antecedent x ≉ y, because the sentence asserts that someone likes all others; she may or may not like herself.

Watch Out: There is no presupposition that different variables must be substituted by different individual constants, or that they must have different objects as values.


If we want to say of x and y that they are different, we have to use the wff x ≉ y.

If we unfold (5∗), it becomes a disjunction of the following three conjunctions:

(a ≉ a → L(a, a)) ∧ (a ≉ c → L(a, c)) ∧ (a ≉ e → L(a, e))

(c ≉ a → L(c, a)) ∧ (c ≉ c → L(c, c)) ∧ (c ≉ e → L(c, e))

(e ≉ a → L(e, a)) ∧ (e ≉ c → L(e, c)) ∧ (e ≉ e → L(e, e))

Now, conditionals such as a ≉ a → L(a, a) are logical truths (though not tautologies), because a ≉ a always gets the value F; hence, the first conjunct in the first conjunction, the second in the second conjunction, and the third in the third conjunction, are redundant.

Having removed the redundant conjuncts, the three conjunctions become:

(a ≉ c → L(a, c)) ∧ (a ≉ e → L(a, e))

(c ≉ a → L(c, a)) ∧ (c ≉ e → L(c, e))

(e ≉ a → L(e, a)) ∧ (e ≉ c → L(e, c))

If we assume that different names denote different women, then inequalities such as a ≉ c are true, and the conjunctions can be further simplified by replacing each conditional by its consequent. The whole sentence becomes:

(L(a, c) ∧ L(a, e)) ∨ (L(c, a) ∧ L(c, e)) ∨ (L(e, a) ∧ L(e, c))

This last step does not rely, as the preceding steps do, on pure logic. There is an additional assumption that different names have different denotations, which is expressible by the sentence: a ≉ c ∧ a ≉ e ∧ c ≉ e.

Homework 7.9

(I) Use variables, and the technique just illustrated in (1) - (5), to display the logical form of the following sentences. The same presuppositions are made as in Homework 7.1: The universe consists of Jack, David, Harry, Ann, Claire, and Edith, gender is according to name, and different names denote different people. Use the same notation as in Homework 7.1. You don’t have to unfold the sentences. Note ambiguities and provide, if you can, the appropriate different formalizations.

1. No man is liked by everyone, but some woman is.

2. A man likes himself when he is liked by everyone else.

3. Some woman likes herself, as do two of the men.

4. When a man and a woman like each other, both are happy.


5. Ann does not like a man who likes all women.

6. Some man, liked by Ann, does not like her.

7. Among Jack, David, Ann and Claire, no one is liked by all, except possibly Ann.

8. The same men are liked by Claire and Edith.

9. There is a man, who does not like himself, though liked by all other men.

10. No one among Ann, Claire and Harry is happy, who is not liked by someone else among Ann, Claire, Harry, and David.

11. A man is happy, if two women like him.

12. Some women, who do not like themselves, like each other.

13. Some man does not like himself, though liked by every woman.

14. Some men do not like themselves, though liked by every woman.

15. Some men do not like themselves, though every other man does.

16. Most women are liked by some men.

17. Harry and David like the same woman.

18. Harry and David like the same women.

19. Harry and David like only one woman.

Note: (i) In some problems (e.g., 9) you can get shorter expressions by using inequalities, as is done in (5); the unfolded form contains redundant components, but you don’t have to unfold.

(ii) Some problems (e.g., 4) call for the use of two variables for which various pairs of constants are to be simultaneously substituted.


Chapter 8

First-Order Logic, The Language and Its Relation to English

8.1 First View

First-Order logic, FOL for short, is obtained by enriching PC0* with first-order quantifiers. These are new syntactic operations that produce new types of wffs. The application of quantifiers is called quantification.

The version we shall study here is based on two first-order quantifiers: the universal and the existential. The choice is a matter of convenience, similar to the choice of sentential connectives. We shall see that, semantically, each quantifier is expressible in terms of the other and negation. We use

‘ ∀ ’ and ‘ ∃ ’

for the universal and the existential quantifier.

A quantifier takes two objects as arguments: an individual variable and a wff. It yields as outcome a wff. The outcomes of applying ∀, or ∃, to v and α are written, respectively, as:

∀vα and ∃vα

These wffs are called, respectively, universal and existential generalizations. The following are commonly used terminologies. We speak of the universal, or existential, quantification of α with respect to v; and also of quantifying (universally, or existentially) the variable v, in α, or of quantifying over v in α. One speaks of quantified wffs, and also of quantified variables (in a given wff). We might say, for example, that in ∀vα, the variable v is quantified (universally). All of which should not cause any difficulty.



Before going, in the next section, into the syntax of FOL, let us get some idea of the relation of FOL to English and of the way of interpreting a first-order language. It will help us to appreciate better the syntactic details.

If α is to be read, in English, as ‘...’, then

∀v α can be read as: ‘for all v ...’ .

∃v α can be read as: ‘for some v ...’ .

For example, let

Mn Mr Wm Hp

be the formal counterparts of the predicates:

‘x is a man’ ‘x is mortal’ ‘x is a woman’ ‘x is happy’.

Then

All men are mortal.

can be formalized as:

(1) ∀v1 (Mn(v1)→ Mr(v1))

which can be read as:

For every object v1, if v1 is a man then v1 is mortal.

(Of course, any vi could have been used instead of v1.)

Similarly, ‘Some woman is happy’ can be formalized as:

(2) ∃v3 (Wm(v3) ∧ Hp(v3))

which can be read as:

For some object v3, v3 is a woman and v3 is happy.

Here are some less straightforward examples. Let L(x, y) and K(x, y) correspond to: ‘x likes y’ and ‘x knows y’. Then


(3) ∀v2 (K(v2, Jack)→ L(v2, Jack))

says that everyone who knows Jack likes him; literally: for every object v2, if v2 knows Jack then v2 likes Jack. And the following says that every man is liked by some woman.

(4) ∀v2 [Mn(v2)→ ∃v1 (Wm(v1) ∧ L(v1, v2))]

Here is a miniature sketch of the semantics. The full definitions are given in chapter 9. An interpretation of a first-order language is given by specifying the following items:

• A non-empty set, playing the role of the universe (or domain) of the interpretation. The individual variables of the language are assumed to range over that universe.

• An assignment that correlates, with each individual constant of the language, a member of the universe, which is called its denotation.

• An assignment that correlates with every n-ary predicate an n-ary relation over the universe, which is said to be the interpretation (or denotation) of the predicate.

If the language contains ≈, its interpretation is the identity relation over the universe.

From this and from the examples above you may get a rough idea of how truth or falsity is determined by the interpretation.
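The sketch can be made concrete in a few lines of Python. The universe, the interpreting sets, and the names below are illustrative assumptions, not part of the formal definitions; the point is only that the interpretation by itself fixes the truth-values of sentences such as (1) and (2).

```python
# A toy interpretation for the language of (1) and (2). The universe and
# the sets interpreting the predicates are arbitrary illustrative choices.

universe = {"socrates", "xanthippe", "fido"}

Mn = {"socrates"}                        # interprets 'is a man'
Mr = {"socrates", "xanthippe", "fido"}   # interprets 'is mortal'
Wm = {"xanthippe"}                       # interprets 'is a woman'
Hp = {"xanthippe"}                       # interprets 'is happy'

# (1)  ∀v1 (Mn(v1) → Mr(v1)): every member of Mn is a member of Mr.
truth_of_1 = all((x not in Mn) or (x in Mr) for x in universe)

# (2)  ∃v3 (Wm(v3) ∧ Hp(v3)): some member of the universe is in both sets.
truth_of_2 = any((x in Wm) and (x in Hp) for x in universe)

print(truth_of_1, truth_of_2)   # True True
```

Changing the interpreting sets can change the truth-values; nothing besides the interpretation is needed, because (1) and (2) contain no free variables.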

8.2 Wffs and Sentences of FOL

8.2.0

‘First-order logic’ refers to a family of languages that have comparable logical resources. Those we consider here share a logical apparatus that consists of:

sentential connectives, individual variables, first-order universal and existential quantification.

The non-logical vocabulary may differ from language to language; it consists of:

individual constants, predicates (of any arity).

If the language has the equality symbol, ≈, then it belongs to the logical vocabulary.


Every FOL language must contain at least one predicate. But it need not contain individual constants. (The above-given (1), (2) and (4) are examples of sentences without individual constants; (3) contains the individual constant Jack.) As before, we use

‘u’ ‘v’ ‘w’ ‘x’ ‘y’ ‘z’ ‘u′’ ‘v′’ etc.

to stand for unspecified vi’s.

Note: We assume that, when different symbols from this list occur in the same wff-expression, they stand for different vi’s, unless stated otherwise.

Thus, what is expressed by (4) above can be expressed by using any two different vi’s, which we can write as:

(4′) ∀u [Mn(u) → ∃v (Wm(v) ∧ L(v, u))]

We shall also use ‘x’, ‘y’ and ‘z’ as variables of our own language, ranging over various domains according to the discussion.

First-Order Wffs (Well-Formed Formulas)

Terms and atomic wffs are defined exactly as in PC∗0. Wffs are then defined inductively:

(I) Every atomic wff is a wff.

(II) If α and β are wffs then:

¬α, α ∧ β, α ∨ β, α→ β, α↔ β are wffs.

(III) If α is a wff and v is any individual variable, then

∀vα and ∃vα are wffs.

• Wffs of the forms ¬α, or α ∗ β, where ∗ is a binary connective, are referred to as sentential compounds.

• Wffs of the forms ∀vα and ∃vα are referred to as generalizations: the first universal, the second existential.

Every wff is, therefore, either atomic, or a sentential compound, or a generalization. Here are some additional examples of wffs:

S(x, y, b) ∨ R(c, a) ∀x (P(x)→ R(x, x)) (∃yR(a, y)) ∧ ¬∀xS(b, x, y)


∀x ∃y ∃z (S(x, y, z) ∨ ∀uP(u))    ∃yP(a)    ¬[(∀xR(y, z)) → ∀xP(x)]

The first, the third and the last are sentential compounds. The rest are generalizations: the second and fourth universal, the fifth existential.
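The inductive clauses (I)–(III) can be mirrored directly as data. The Python encoding below is an illustrative assumption (the chapter defines wffs purely syntactically); each class corresponds to one clause.

```python
# A wff is either an atom, a sentential compound, or a generalization,
# mirroring clauses (I)-(III). The encoding itself is a sketch.
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:                  # clause (I): an atomic wff R(t1, ..., tn)
    pred: str
    args: tuple

@dataclass(frozen=True)
class Neg:                   # clause (II): ¬α
    sub: "object"

@dataclass(frozen=True)
class Bin:                   # clause (II): α ∗ β, for ∗ among ∧, ∨, →, ↔
    op: str
    left: "object"
    right: "object"

@dataclass(frozen=True)
class Quant:                 # clause (III): ∀vα and ∃vα
    q: str                   # '∀' or '∃'
    var: str
    sub: "object"

# ∀x (P(x) → R(x, x)) becomes:
wff = Quant('∀', 'x', Bin('→', Atom('P', ('x',)), Atom('R', ('x', 'x'))))
print(wff.q, wff.var)   # ∀ x
```

Unique readability corresponds to the fact that each value built this way records exactly one outermost constructor, with its parts recoverable from the fields.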

Terminology: An occurrence of a term t in an atomic wff of the form R(. . . , t, . . .) is said to be under the predicate R.

Unique Readability: Unique readability comprises the previous conditions concerning atomic wffs and sentential compounds, as well as conditions on quantification:

• No generalization is an atomic wff or a sentential compound.

• The quantifier, the variable, and the quantified wff are uniquely determined by the generalization: If

Qv α = Q′v′ α′

where Q and Q′ are quantifiers, then: Q = Q′, v = v′ (i.e., they are the same variable) and α = α′.

Operants, Scopes and Subformulas: A quantifier-operant is a pair of the form ∀v, or ∃v, which consists of a quantifier and a variable.

By an operant we shall mean either a sentential connective, or a quantifier-operant. Negation and quantifier-operants are monadic: they act on single wffs. The others are binary.

The notion of main connective generalizes, in the obvious way, to the notion of main operant:

• The main operant of ¬α is ¬ and its scope is α. The main operant of α ∗ β (where ∗ is a binary connective) is ∗ and its left and right scopes are α and β.

• The main operant of Qv α (where Q is a quantifier) is Qv and its scope is α.

Sometimes we omit to mention the variable of the quantifier-operant and speak of the scope of a quantifier (or, rather, of its occurrence), or of the quantifier itself as being the main operant.

The immediate components of a wff are defined by adding to the previous definition of chapter 2 (cf. 2.3) the clause for quantifiers:

The immediate component of Qv α (where Q is a quantifier) is α.

A component is now defined as before: it is either the wff itself, or an immediate component of it, or an immediate component of an immediate component, etc. A component of α is


proper if it is different from α. The components of α are also referred to as the subformulas of α, and the proper components as the proper subformulas.

The component-structure of a given wff is that of a tree. As in the sentential case, we can write wffs as trees, or even identify them with trees. The following is a wff whose subformulas correspond to the sub-trees that issue from the nodes. The nodes are numbered according to our old numbering rule (cf. 4.2.2, page 122).

1.  ∀x [(∃y ∃z S(x, y, z)) ∨ ∀uP(u)]

2.  (∃y ∃z S(x, y, z)) ∨ ∀uP(u)

3.1  ∃y ∃z S(x, y, z)

3.2  ∀uP(u)

4.1  ∃z S(x, y, z)

5.1  S(x, y, z)   atomic wff

4.2  P(u)   atomic wff

As in the case of sentential logic (cf. 2.3.0), we often omit the word ‘occurrence’. For example, ‘the second ∀’ means the second occurrence of ∀, ‘the first ∃v’ means the first occurrence of ∃v, etc. The same systematic ambiguity applies to other particles and constructs: variables (‘the first v’), individual constants (‘the second a’), wffs (‘the first P(a)’), etc.

Nested Quantifiers: Quantifiers are said to be nested if one is within the scope of the other. In the last example, ∀x and ∃y are nested. A sequence of nested quantifiers is a sequence in which the second is in the scope of the first, the third within the scope of the second, and so on. In the last example the following are sequences of nested quantifiers:

∀x, ∃y, ∃z        ∀x, ∀u

On the other hand, ∃y and ∀u are not nested.


(To be precise, we should speak of quantifier-occurrences, because the same quantifier (with the same variable) can occur more than once.)

Grouping Conventions for Quantifier-Operants: The grouping convention for negation is extended to all monadic operants: every monadic operant-name binds more strongly than every binary operant-name. This means, for example, that

∃v α ∧ β is to be read as (∃vα) ∧ β

To include β within the scope of ∃v, write:

∃v(α ∧ β)

8.2.1 Bound and Free Variables

An occurrence of a variable v in a wff α is bound if it is (i) within the scope of a quantifier-operant Qv, or (ii) the occurrence of v in the pair Qv itself. An occurrence of a variable is free if it is not bound.

A variable that has a free occurrence in α is said to be free in α, or a free variable of α. A variable that has a bound occurrence in α is said to be bound in α, or a bound variable of α.

Examples:

S(x, y, b) ∨ ¬R(y, x) : All variable occurrences are free.

∀xS(x, y, b) ∨ ¬∃xR(y, x) : All occurrences of x are bound, all occurrences of y are free.

∀x[∃yS(x, y, b) ∨ ¬R(y, x)] : All occurrences of x are bound, and so are the occurrences of y in ∃y and under S. The last occurrence of y is free (it is not within the scope of ∃y).

As the last example shows, a variable can have several occurrences, of which one or more are bound and one or more are free. Such a variable is both free and bound in α.

Whether an occurrence is free depends on the wff. The same variable-occurrence which is free in one wff can be bound in a larger wff that contains the first as a component. For example, the occurrence of y in R(x, y) is free, but it is bound in the larger ∀yR(x, y). The x in ∀yR(x, y) is free in that wff, but is bound in ∃x∀yR(x, y).

The Binding Quantifier: An occurrence of a quantifier-operant Qv is said to bind, and also to capture, all the free occurrences of v in its scope; these latter are said to be bound, or captured, by the Qv.


As is usual, we apply the terminology to the quantifier itself: we speak of an occurrence of a quantifier as binding, or capturing, the variables that occur free in its scope. Among the occurrences that are bound by (an occurrence of) Q, we also include the occurrence of v in the pair Qv.

It is not difficult to see that, for any wff α, every bound occurrence of v is bound by a unique occurrence of some quantifier. If the v occurs in Qv, then it is bound by that occurrence of Q. Otherwise, it is in some subformula, Qv β, such that it is free in β; it is then bound by that Q. (The uniqueness is guaranteed by unique readability.)

An occurrence of v can belong to the scopes of several Qv’s (e.g., the last occurrence of v in ∀v[P(v) → ∃vR(a, v)]). Among these, the Qv with the smallest scope is the one that binds it.

In the following illustrations the bindings are as follows:

∀x [S(x, y) ∨ ∃y S(x, y)] : the ∀x binds both occurrences of x; the ∃y binds the y under the second S; the first occurrence of y is free.

∀x [P(x) → S(x, y)] ∨ ∃y ∀x S(x, y) : the first ∀x binds the x’s in P(x) and in the first S; the ∃y binds the last y; the second ∀x binds the last x; the y in the first S is free.

The significance of free and bound occurrences is explained in the next subsection.
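The free variables of a wff can be computed by a simple recursion over the component structure: quantifying v removes v from the free variables of the scope. The tuple encoding and the choice of which letters count as variables are illustrative assumptions.

```python
# Free variables of a wff, computed recursively. Wffs are encoded as
# nested tuples (an illustrative sketch, not the book's notation).

VARS = set('uvwxyz')   # assumption: these letters are variables; a, b, c
                       # play the role of individual constants

def free_vars(wff):
    kind = wff[0]
    if kind == 'atom':                      # ('atom', pred, (t1, ..., tn))
        return {t for t in wff[2] if t in VARS}
    if kind == 'not':                       # ('not', sub)
        return free_vars(wff[1])
    if kind in ('and', 'or', 'imp', 'iff'): # ('and', left, right), etc.
        return free_vars(wff[1]) | free_vars(wff[2])
    if kind in ('all', 'some'):             # ('all', v, sub) for ∀v sub
        return free_vars(wff[2]) - {wff[1]}
    raise ValueError('unknown wff kind: %r' % kind)

# ∀x[∃yS(x, y, b) ∨ ¬R(y, x)]: x is bound throughout, the last y is free.
w = ('all', 'x', ('or', ('some', 'y', ('atom', 'S', ('x', 'y', 'b'))),
                        ('not', ('atom', 'R', ('y', 'x')))))
print(free_vars(w))   # {'y'}
```

A wff is then a sentence exactly when this function returns the empty set.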

Individual Constants: It is convenient to extend the classification of free and bound occurrences to individual constants. All occurrences of individual constants are defined to be free. Thus, an occurrence of a term is free iff it is either an occurrence of an individual constant, or a free occurrence of an individual variable.

Sentences

A wff is defined to be a sentence just when it has no free variables. (In some terminologies the term ‘open sentence’ is used for wffs with one or more free variables, and ‘closed sentence’ for sentences.)

The wffs (1)-(4) at the beginning of the chapter are sentences.

The definition of sentences in PC∗0 (given in 7.4) is a particular case of the present definition: All variable-occurrences in wffs of PC∗0 are free, since PC∗0 has no quantifiers. Hence a wff of PC∗0 is a sentence, according to the present definition, just when it contains no variables.


8.2.2 More on the Semantics

The syntactic distinction between free and bound occurrences has a clear and crucial semantic significance. Consider the wff

P(v)

The interpretation of the language does not determine the formula’s truth-value, because the interpretation does not correlate particular objects with variables, as it does with the individual constants. In order to get a truth-value we need, in addition to the interpretation, to assign some object to v. We therefore introduce assignments of values to variables. The truth of P(v) is relative to an assignment that assigns a value to v. The wff P(v) gets T iff the assigned value is in the set that interprets P( ). If P( ) is interpreted as the set of all people, then P(v) is true, under the assignment that assigns Juno to v, iff Juno is a person.

On the other hand the interpretation by itself determines the truth-values of

∀vP(v) and ∃vP(v)

The first is true iff all the objects in the universe of the interpretation are in the set denoted by P (which means that the set is the whole universe). The second is true iff some object is in this set (which means that the set is not empty). You can, if you wish, assign a value to v. But this value will have no effect on the truth-values of the last two wffs.

Changing the free variable in P(v) results in a non-equivalent formula:

P(v) is not logically equivalent to P(u)

Because, if P is interpreted as a set that is neither the whole universe nor empty, there is an assignment under which the first wff gets T and the second gets F: assign to v a value in the set, and to u a value outside it. On the other hand, changing the bound variable results in a different, but logically equivalent, formula:

∀vP(v) ≡ ∀uP(u)        ∃vP(v) ≡ ∃uP(u)

Roughly speaking, the wff P(v) says something about the interpretation of P and the value of v; but the wffs ∀vP(v) and ∃vP(v) are not about the value of v, they are only about the interpretation of P.

What we have just observed holds in general. If a wff, α, has free variables, then its truth-value in a given interpretation depends, in general, on the values assigned to these variables. But if all the variables are bound, that is, if the wff is a sentence, the truth-value is completely determined by the interpretation of the language.

Note: There are wffs, with free variables, which get the same truth-value under all assignments. For example, the truth-value of

P(v)→ P(v)


is T, for any value of v; nonetheless v is a free variable and the wff is not a sentence. You can think of this wff as defining a function, whose value for each value of v is T; this is different from a sentence, which simply determines a truth-value.
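The contrast between assignment-dependent and assignment-independent truth can be seen in a short sketch. The universe and the interpretation of P below are illustrative choices (P is neither empty nor the whole universe, so P(v) genuinely depends on the assignment).

```python
# Truth relative to an assignment, for the wffs discussed above.
# Illustrative interpretation: universe {1,2,3,4}, P denotes {1,2}.

universe = {1, 2, 3, 4}
P = {1, 2}                          # interprets P( )

def truth_P_of_v(assignment):       # P(v): depends on the assignment
    return assignment['v'] in P

def truth_all_v_P():                # ∀vP(v): assignment is irrelevant
    return all(x in P for x in universe)

def truth_some_v_P():               # ∃vP(v): assignment is irrelevant
    return any(x in P for x in universe)

print(truth_P_of_v({'v': 1}))       # True  (1 is in P)
print(truth_P_of_v({'v': 3}))       # False (3 is not in P)
print(truth_all_v_P(), truth_some_v_P())   # False True

# P(v) → P(v) gets T under every assignment, though v is free in it:
assert all((x not in P) or (x in P) for x in universe)
```

The last line illustrates the note above: a wff with a free variable can be true under all assignments without being a sentence.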

Variable Displaying Notation: We extend the variable-displaying notation of 7.4.1 (page 272) to wffs of first-order logic. We shall use

‘α(u)’  ‘β(u, v)’  ‘β′(x, y)’  ‘γ(x, y, z)’  etc.

to denote wffs in which the displayed variables (and possibly others that are not displayed) are free. One of the main points of the notation has to do with substitutions of free variables, to be considered in the next subsection.

If v is the only free variable in α(v), you can think of α(v) as saying something about the value of v. It defines the set consisting of all objects whose assignment to v makes α(v) true (under the presupposed interpretation of the language). Similarly, a wff with two free variables defines a binary relation, one with three free variables defines a ternary relation, and so on. (Recall that in cases of arity > 1 we have to stipulate which variable represents which coordinate; cf. 7.3.)

For example, if we have in our language the predicates Male( ) and Parent( , ), we can formalize ‘x is a grandfather of y’ as:

Male(x) ∧ ∃ v (Parent(x, v) ∧ Parent(v, y))

Wffs with free variables resemble predicates, but unlike predicates they are not atomic units.

Homework 8.1 Assume an interpreted first-order language, containing ≈ and the predicates M( ), F( ) and C( , , ), interpreted so that M(x), F(x), and C(x, y, z) read as:

‘x is a human male’, ‘x is a human female’, ‘x is a child of y and z’

Write down wffs that formalize the following. Use the same free variables that are used here with the English. You may introduce shorthand notations; e.g., you can define ‘β1(x, y)’ to stand for some wff, and then use it as a unit. But write in full unfolded form at least two of the formalizations.

1. x is the mother of y.

2. x is a sister of y.

3. x is an uncle of y.


4. x and y are first cousins.

5. x is y’s nephew.

6. x is y’s maternal grandmother.

7. x is a half-brother of y.

8. x has no sisters.

9. Everyone has a father and a mother. (Use H as a predicate for humans).

10. No one has more than one father.

Repeated Use of Bound Variables: The same variable can be paired, in the same wff, with several occurrences of quantifiers:

(5) ∀xP(x) → ∀xP′(x)

(5) says that if everything is in the set denoted by P, then everything is in the set denoted by P′. Such quantifiers can even be nested:

(6) ∀ v [P(v)→ ∀vP(v)]

We can try to read (6) as:

(6′) For every v, the following holds: If v is P, then for every v, v is P.

It may look, or sound, confusing, until you realize that there is no connection between the first v and the second v. To bring the point out, rephrase (6′) as:

(6′′) For every v: if v is P, then everything is P.

(And this, it is not difficult to see, says that if something is P then everything is P.)

While (6) is a legitimate sentence, one may wish to avoid the repeated use of the same bound variable. This can be easily done by using another variable in the role of the second v. The following is logically equivalent to (6):

(6∗) ∀ v [P(v)→ ∀uP(u)]


8.2.3 Substitutions of Free and Bound Variables

We denote by ‘S^{t′}_{t}α’ (the new term written above the old) the wff obtained from α by substituting every free occurrence of t by t′. (Recall that all occurrences of individual constants are considered free.) We describe the operation as the substitution of free t by t′, or the substitution of t′ for free t. We shall refer to it, in general, as free-term substitution, or for short free substitution. The concept is extended to cover also simultaneous substitutions of several terms:

S^{t′1 t′2 ... t′n}_{t1 t2 ... tn} α

is the result of substituting, simultaneously, every free occurrence of t1 by t′1, every free occurrence of t2 by t′2, and so on.

Example: Let

α = ∀u (P(u, v, w) → ∃wP(w, v, c))

All occurrences of u in α are bound, all occurrences of v are free, the first occurrence of w is free and the other two are bound. Consequently, we have:

S^{x}_{v}α = ∀u (P(u, x, w) → ∃wP(w, x, c))

S^{x}_{w}α = ∀u (P(u, v, x) → ∃wP(w, v, c))

S^{x v}_{v w}α = ∀u (P(u, x, v) → ∃wP(w, x, c))

S^{c v}_{v c}α = ∀u (P(u, c, w) → ∃wP(w, c, v))

As in 7.4, we write the wff obtained from β(v) by substituting x for the free v as: β(x). Similarly,

S^{x1 x2 x3}_{v1 v2 v3} α(v1, v2, v3) = α(x1, x2, x3)

Recall that, in order to avoid inconsistent notation, we have to display all the free variables that may be subject to substitution (cf. 7.4 for details).

Note: If t does not occur freely in α, then S^{t′}_{t}α = α.

Legitimate Substitutions of Free Terms: The semantic meaning of substituting free terms is the following: The new wff says of the values (or interpretations) of the new terms what the original wff says of the values (or interpretations) of the original ones. For example, let K(x, y) and L(x, y) read, respectively, as ‘x knows y’ and ‘x likes y’, and let γ(v) be:

∀u (K(v, u)→ L(v, u))

γ(v) says that v (or rather the value assigned to v) likes everyone he or she knows. The same thing is said by γ(w) about w. But if we substitute u for the free v we get:

∀u (K(u, u)→ L(u, u))


which has a different syntactic structure and a completely different meaning. It says that everyone who knows himself likes himself. The unintended outcome is due to the fact that the free v is in the scope of ∀u. The u that is substituted for the free v is captured by that quantifier. Such examples motivate the following definition.

The substitution, in a given wff, of free t by t′ is legitimate if no free occurrence of t becomes, after the substitution, a bound occurrence of t′.

Given some wff, we say that free t is substitutable by t′, and also that t′ is free for t, if the substitution of free t by t′ is legitimate.

This is generalized, in the obvious way, to simultaneous substitutions.

Illegitimate substitutions still result in wffs. But they change the structure of quantifier-variable bindings; therefore, when it comes to the semantics, the outcome can be unrelated to the original meaning.

Henceforth, unless stated otherwise, we use ‘S^{t′}_{t}α’, ‘S^{t′ s′}_{t s}α’, etc., for legitimate substitutions only. Use of the notation is taken to imply that the substitution is legitimate.

Substitution of Bound Variables

To substitute a bound variable, say u, by x, is to replace all bound occurrences of u by x. We describe this as the substitution of bound u by x, and we refer to the operation as bound-variable substitution.

For example, ∀u (K(v, u) → L(v, u)) is transformed into ∀x (K(v, x) → L(v, x)). We can also substitute, at one go, several bound variables:

∀u∃vK(u, v) can be transformed into ∀x∃yK(x, y) and also into ∀v∃uK(v, u)

The result of such a substitution is a logically equivalent formula. (This is proved in the nextchapter.)

Substitutions of bound variables can be applied locally, in order to change a proper subformula to a logically equivalent one. In this manner (6) is transformed into the logically equivalent (6∗). These substitutions can serve to eliminate repeated use of the same bound variable. Using them we can transform any wff into a logically equivalent one, in which different occurrences of quantifiers are always paired with different variables. Bound-variable substitutions can also be used to get an equivalent wff in which no variable is both free and bound. All in all we get wffs that are easier to grasp.

Furthermore, bound-variable substitutions can enable free substitutions which would otherwise be illegitimate. If, for some reason (and such occasions arise), we want to substitute the


free v by u in

∃uL(u, v)

we cannot do so, because the u will be captured by the quantifier. But after substituting the bound u by w, we get the logically equivalent

we cannot do so, because the u will be captured by the quantifier. But after substituting thebound u by w, we get the logically equivalent

∃wL(w, v)

And here the substitution of free v by u is legitimate and we get: ∃wL(w, u).

Legitimate Substitutions of Bound Variables: As in the case of free-term substitutions, substitutions of bound variables can have unintended effects of capturing free occurrences. Consider, for example, our previous γ(v):

∀u (K(v, u)→ L(v, u))

If we substitute in it bound u by v we get:

∀ v (K(v, v)→ L(v, v))

which is not what we intended. Bound occurrences of u have been replaced here by bound occurrences of v, but, in addition, free occurrences of v became bound. A substitution may also transform some bound occurrence to an occurrence that is bound by a different quantifier. For example, if in:

(7) ∀u [P(u)→ ∀wR(u, w)]

we substitute bound u by w we get the non-equivalent:

(8) ∀w [P(w)→ ∀wR(w,w)]

To see clearly the difference, you can read (7) and (8), respectively, as:

(7′) For every u: if u is P, then u is R-related to everything.

(8′) For every w: if w is P, then everything is R-related to itself.

The trouble here is that the occurrence of u under R, which in (7) is bound by the first ∀, has been transformed into an occurrence of w, which is bound by the second ∀.

All of this motivates the following definition.

A substitution of bound variables is legitimate if every free occurrence remains, after the substitution, free, and every bound occurrence is changed to, or remains, bound by the same quantifier.


We can combine the substitution conditions for free and for bound variables into one general condition, which covers all substitutions, including mixed cases where free and bound substitutions are carried out simultaneously.

Legitimate Substitutions in General: A substitution of variables is legitimate if the following holds: (i) Every free occurrence remains, or is replaced by, a free occurrence; (ii) free occurrences of the same variable remain, or are replaced by, occurrences of the same variable; (iii) every bound occurrence remains, or is replaced by, a bound occurrence that is bound by the same quantifier-occurrence.

Note: We use the notation ‘S^{t′}_{t}α’ only for free substitutions. We will not need a special notation for bound ones.

Homework 8.2 Let α be the wff:

∀uR(u, v)→ ∃ v [S(v, u, c) ∨ ∀wR(v, w)]

(i) List, or mark, the free occurrences of each of the terms in α. List, or mark, all bound occurrences.

(ii) Substitute (legitimately) bound occurrences so as to get a logically equivalent wff in which no variable is both free and bound.

(iii) Construct the following wffs (carry out also the illegitimate substitutions):

S^{c}_{u}α    S^{w}_{v}α    S^{v u}_{u v}α    S^{w c}_{u v}α

(iv) Note which of the substitutions in (iii) is legitimate. In each illegitimate case, change the bound variables so as to get an equivalent wff in which the same substitution is legitimate, then carry this substitution out.

8.2.4 First-Order Languages with Function Symbols

For many purposes it is convenient to include in the non-logical vocabulary an additional category: function symbols. Each function symbol comes with its associated number of places.

Assume, for example, that the language is supposed to be interpreted in a universe consisting of numbers. It would be convenient to include two-place function symbols that denote addition and multiplication:

sum( , ) and prd( , ).


If 1, 2, 3, . . . are names of 1, 2, 3, . . . , then, under this interpretation, sum(5, 3) denotes 8, and prd(4, 3) denotes 12. In this interpretation the following sentences are true:

sum(7, 2) ≈ 9 prd(sum(3, 2), 6) ≈ sum(prd(3, 6), sum(9, 3))

Ordinary mathematical notation is similar, except that the familiar addition and multiplication signs are used in infix notation (the function name appears between the argument places). If + and · are the formal-language symbols, and we rewrite +(x, y) and ·(x, y) as x+y and x·y, then the last sentences become:

7+2 ≈ 9 (3+2)·6 ≈ (3·6)+(9+3)

In general, the non-logical vocabulary of such a first-order language contains, in addition to individual constants and predicates, function symbols:

f, g, h, f0, g0, h0, etc.,

each with its associated number of places. The definition of term is extended accordingly:

(T1) Every individual variable and every individual constant is a term.

(T2) If f is an n-ary function symbol and t1, . . . , tn are terms, then f(t1, t2, . . . , tn)is a term.

The set of all terms is obtained by applying (T2) recursively. For example, the following are generated via (T2), hence they are terms:

u, v, a, b, f(u), g(a, v), g(g(a, v), f(u)), h(f(u), b, g(g(a, v), a)), etc.

The definitions of atomic wffs and of wffs are the same as before (cf. 7.4.0, page 270), but now the terms referred to in these definitions can be highly complex structures. The following, for example, are wffs. Here P and R are, respectively, a monadic and a binary predicate, and f and g are, respectively, a one-place and a two-place function symbol.

R(g(u, v), a)    ∀u ∀v [f(u) ≈ f(v) → u ≈ v]    ∀v [P(g(f(v), b)) → ∃u (v ≈ f(u))]

The definitions of free and bound occurrences of individual variables are adjusted in the obvious way. If a variable occurs in a term t (including the case where t is this variable), then all its occurrences are free in t. The free occurrences of a variable in an atomic wff, α, are its occurrences in the terms that occur in α. An occurrence remains free as long as it is not captured by a quantifier.

A term is constant if it does not contain variables.


Any interpretation of the language interprets, in addition to the predicates and the individual constants, the function symbols. If U is the universe of the interpretation, then an n-place function symbol is interpreted as an n-place function over U, whose values are in U. The function symbol denotes this function. In other words, an n-place f denotes an n-place function that maps every n-tuple of members of U to a member of U.

Once the interpretation is given, every constant term gets a value in the interpretation’s universe. This value is obtained by applying the functions denoted by function symbols to the objects denoted by the individual constants. The value is what the constant term denotes, under the given interpretation.

Terms that contain free variables do not denote particular objects. Their denotations depend on assignments of values to the individual variables occurring in them (just as the truth-values of wffs with free variables depend on such assignments). These terms define, in the given interpretation, functions (just as wffs with free variables define sets and relations).
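Term evaluation follows the recursive structure of (T1)–(T2): a constant term gets its value from the interpretation alone, and a term with variables needs an assignment as well. The sum/prd interpretation below follows the earlier example; the encoding of terms as tuples is an illustrative assumption.

```python
# Evaluating terms under an interpretation (a sketch). Compound terms are
# tuples ('f', t1, ..., tn); strings are individual constants or variables.

FUNCS = {'sum': lambda m, n: m + n,      # interprets sum( , )
         'prd': lambda m, n: m * n}     # interprets prd( , )
CONSTS = {str(k): k for k in range(10)}  # numerals as individual constants

def term_value(term, assignment=None):
    assignment = assignment or {}
    if isinstance(term, tuple):          # (T2): apply the denoted function
        f, *args = term
        return FUNCS[f](*(term_value(t, assignment) for t in args))
    if term in CONSTS:                   # (T1): an individual constant
        return CONSTS[term]
    return assignment[term]              # (T1): a variable needs a value

# sum(5, 3) and prd(sum(3, 2), 6) are constant terms:
print(term_value(('sum', '5', '3')))                  # 8
print(term_value(('prd', ('sum', '3', '2'), '6')))    # 30

# sum(u, 1) contains a variable; its value depends on the assignment:
print(term_value(('sum', 'u', '1'), {'u': 41}))       # 42
```

With the variable left unassigned, the last term has no value at all, which is the point of the paragraph above: such a term defines a function of the assignment, not a particular object.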

Peano’s Arithmetic

The first-order language of Peano’s arithmetic contains, besides equality, the individual constant 0, a one-place function symbol s, and two-place function symbols + and ·. The standard interpretation of the language is known as the standard model of natural numbers. That is:

• The universe is the set {0, 1, . . .} of natural numbers.

• 0 denotes the number 0.

• s denotes the function that maps each number n to n+1.

• + and · denote, respectively, the addition and the multiplication functions.

Using infix notation for the function symbols, we have:

s(s(v)) defines the function that maps each n to n+2.

v·v defines the function that maps each n to n².

u·(v+s(s(0))) defines the function that maps each pair (m, n) to m·(n+2). Here we assume that u marks the first argument, v the second.

[((u+v)·(u+v))·(u+v)] + s(0) defines the function that maps each (m, n) to (m+n)³ + 1.


Given the associativity of addition and multiplication ((x+y)+z = x+(y+z) and (x·y)·z = x·(y·z)), we can ignore groupings in iterated +, or in iterated ·. Here, of course, we assume the standard interpretation. In general, the groupings cannot be ignored.

Many properties and relations of natural numbers are expressible in Peano’s arithmetic. Here is a small sample. For convenience, we use ‘x’, ‘y’ and ‘z’ as variables of the formal language, as well as variables of our own mathematical-English discourse.

∃ v [x+v ≈ y] : x ≤ y.

∃ v [x+s(v) ≈ y] : x < y.

∃ v [x·v ≈ y] : x is a divisor of y.

x ≉ s(0) ∧ ∀u ∀v [(u·v ≈ x) → (u ≈ s(0) ∨ u ≈ x)] : x is a prime number. (The first conjunct excludes 1, which would otherwise satisfy the condition.)

The following sentences are true when the language is interpreted by the standard model of natural numbers. One may ignore that particular interpretation and consider these sentences as non-logical axioms, which characterize (in some way that we shall not go into) the concept of natural number. We then get what is known as Peano’s axioms. (Actually, they are due to Dedekind; Peano gave them a certain formalization.) The axioms are the universal generalizations of the following wffs; this means that the sentences are obtained by quantifying universally over all the wffs’ free variables; e.g., the first axiom is ∀x [0 ≉ s(x)].

1. 0 ≉ s(x)

2. x ≉ 0 → ∃u [x ≈ s(u)]

3. x+0 ≈ x

4. x+s(y) ≈ s(x+y)

5. x·0 ≈ 0

6. x·s(y) ≈ (x·y)+x

7. {α(0) ∧ ∀ v [α(v)→ α(s(v))]} → ∀vα(v)

The last, known as the Induction Axiom, is a scheme covering an infinite number of instances: for every wff α(v), there is a corresponding axiom. It states that if the wff is true for 0, and if, for all n, its truth for n implies its truth for n+1, then the wff is true for all natural numbers.
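Axioms 3–6 are recursion equations for + and ·, so in the standard model they can be spot-checked numerically. The range tested below is an arbitrary illustrative choice; the point is only that the equations hold of ordinary addition and multiplication.

```python
# A numeric spot-check of axioms 3-6 in the standard model.

def s(n):                  # the successor function denoted by s
    return n + 1

for x in range(20):
    assert x + 0 == x                    # axiom 3: x+0 ≈ x
    assert x * 0 == 0                    # axiom 5: x·0 ≈ 0
    for y in range(20):
        assert x + s(y) == s(x + y)      # axiom 4: x+s(y) ≈ s(x+y)
        assert x * s(y) == (x * y) + x   # axiom 6: x·s(y) ≈ (x·y)+x

print("axioms 3-6 hold on the sample")
```

Of course, a finite check is no proof of the universal generalizations; it only illustrates what the axioms say about the standard model.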


8.3 First-Order Quantification in Natural Language

8.3.1 Natural Language and the Use of Variables

Frege was the first to introduce quantifiers and to use them, together with variables, in order to express universal and existential claims. Since a variable can occur any number of times in a wff, and since any assignment gives it the same value on all its occurrences, variables constitute an unequaled device for pointing repeatedly to the same object. In English, which like other natural languages lacks such a device, the effect can be simulated to an extent by anaphora: the use of substitute words to refer to previously mentioned items.

(1) A painter who lived on the Lower East Side got, from an anonymous donor who admired his works, a gift of money that helped him finish a painting.

Here ‘his’ and ‘him’ refer back to the painter. If we formalize (1), the repeated references appear as repeated occurrences of the same variable. Such repetitions correspond also to relative pronouns: ‘who’ and ‘that’:

(1′) For some x, y, z:

x was a (male) painter, and x lived on the Lower East Side, and y admired the works of x, and x got z from y (and x did not know who sent z to x), and z was a gift of money, and z helped x to finish a painting.

We can continue (1) in variable-free English, by using ‘the painter’, ‘the donor’, and ‘the gift’ throughout. But when there are no distinguishing marks and the statement has some combinatorial complexity, we must resort to variables.

(2) If x, y, and z are different numbers, such that x is not between y and z, and y is not between x and z, then either x is smaller than z or y is smaller than z.

You may try something along the lines of:

(2′) If each of two, of three different numbers, is not between the remaining two, then one of these two is smaller than the third.

But this is not very clear and, besides, ‘the third’ is a variable in disguise. No wonder that variables have been introduced, either explicitly or in disguise (‘the first’, ‘the second’, etc.), from the very beginning of logical studies; and they have been extensively used in ancient mathematics.


Still, a lot can be achieved by anaphora. Consider, for example, the following sentence in FOL, where L(x, y) reads as ‘x likes y’.

(3) ∃x ∀ y [L(x, y)→ (L(y, x) ∨ ∃z(L(x, z)∧L(y, z)))]

Though it is not at all obvious, (3) can be rendered in variable-free English:

(3′) Someone likes only those who like either him, or someone he likes.

(The masculine pronouns have been used in a gender-neutral role; you can replace ‘him’ by ‘him or her’ and ‘he’ by ‘he or she’; the result is grammatical but unwieldy.) Natural language can provide surprising solutions.

Homework 8.3 Recast the following in variable-free English. In 1, read L(x, y) as ‘x likes y’. In 2, you are allowed one use of ‘the other’, or a similar expression, to refer to a certain number.

1. ∃x ∀y [L(x, y) → [L(y, x) ∨ ∃z(L(x, z)∧L(y, z) ∨ L(z, y)∧L(x, z)∧L(z, x))]]

2. Of three numbers, if x is smaller than y, and y is smaller than z, then x is smaller than z.

3. If x, y, and z are three different numbers, then either x is between y and z, or y is between x and z, or z is between x and y.

Concerning Gender

The interpretation of certain masculine forms depends on the context. The saying

All men are mortal,

which goes back a long time before our gender-troubled grammar, attributes mortality to all human beings, not only to males. The same goes for masculine pronouns, which can refer on occasion to persons in general, male and female. If there are no indications to the contrary, you may plausibly assume that in the following sentences the quantification is intended to cover all human beings:

(4) Someone in the room loves himself.

(5) Everyone should pay his taxes.

(In an appropriate background, the same might go for: ‘Someone in the room loves herself’.) Whether or not such uses are desirable is a delicate question, which I shall not risk addressing in a logic book.


8.3.2 Some Basic Forms of Quantification

First-order quantification is expressible in natural language in a great variety of ways, each with its own implied meanings and peculiarities of usage. The fit of this motley system with the simple FOL grid is bound to be far from perfect. Here we shall only point out basic patterns.

In English, generalizations, especially universal ones, fall most commonly under the following scheme:

(S1) Quantifier Term –– Common-Noun Phrase –– Verb Phrase

Here are two examples, one universal, the other existential, that fall under (S1).

(4) Every word he put on paper was subject to careful considerations.


(5) Some girl who lives in New York grows beautiful tulips.


In FOL both the common-noun phrase and the verb phrase are represented by wffs. The verb phrase can be derived, like any predicate, either from a common name or from an adjective or from a verb. The two wffs share a common free variable, which is the variable used with the quantifier. Here is what the formalizations of (4) and (5) will look like.

(4′) ∀x {x is a word he put on paper → x was subject to careful considerations}

(5′) ∃x {x is a girl who lives in New York ∧ x grows beautiful tulips}

The words ‘everyone’ and ‘someone’ can be rephrased as ‘every human’ and ‘some human’, and then the generalized sentence comes under (S1):

Everyone in the room is smiling

becomes

Every human in the room is smiling.


More of this later.


The quantifications that fall under (S1) are of the following forms. Here ‘every’ and ‘some’ are the quantifier terms, ‘Y’ is the common-noun phrase and the rest, ‘... Z’, is the verb phrase, where ‘...’ can stand for ‘is’ (or, as indicated in the brackets, ‘is a’, or some other suitable verb). Underneath them are written their FOL formalizations, where α(x) formalizes ‘x is a Y’, and β(x) formalizes ‘x is [is a, or a verb] Z’.

Every Y is [is a/verb] Z        Some Y is [is a/verb] Z

∀x (α(x) → β(x))        ∃x (α(x) ∧ β(x))

Semantic considerations show why this is the correct formalization. ‘Every Y is a Z’ means that everything that is a Y is a Z. We can state this by saying:

For every x, if x is a Y then x is a Z,

or more formally:

∀x [x is a Y → x is a Z].

Therefore the two parts are linked by a conditional, although there is no ‘if...then’ in the original English. If I say, for example, ‘Everyone in the room is smiling’, then I have made an assertion about the people in the room. My assertion is not falsified if somebody who is not in the room is not smiling. We can imagine it as a (possibly infinite) conjunction of all sentences of the form: ‘If a is a person in the room then a is smiling’. If there is no one in the room, the sentence is vacuously true.

Similarly, ‘Some Y is a Z’ means that there is something which is a Y and a Z:

For some x, x is a Y and x is a Z,

or more formally:

∃x [x is a Y ∧ x is a Z].

Hence we have a conjunction, although there is no ‘and’ in the original English. If I say ‘Someone in the room is smiling’, then what I say is true just in case there is something that is both (i) a person in the room and (ii) smiling. If there is no one in the room, the sentence is false.

Watch Out: Since the conditional and the conjunction do not appear in the original English sentence, beginners who go by surface appearance often get the formalization wrong.
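The point can be checked mechanically. In the following Python sketch (the universe and the predicates are invented for illustration), the two schemes are evaluated over a finite universe with an empty room:

```python
# Evaluating the two schemes over a small finite universe in which the
# room happens to be empty. Universe and predicates are invented.
universe = ['Alice', 'Bob', 'table', 'lamp']
in_room  = lambda x: False              # the room is empty
smiling  = lambda x: x == 'Alice'

# Every Y is Z:  ∀x (α(x) → β(x)) — vacuously true over the empty room
every = all((not in_room(x)) or smiling(x) for x in universe)

# Some Y is Z:  ∃x (α(x) ∧ β(x)) — false: no smiling person in the room
some = any(in_room(x) and smiling(x) for x in universe)

# The beginner's mistake ∃x (α(x) → β(x)) comes out true for the wrong
# reason: any object outside the room satisfies the conditional.
mistaken_some = any((not in_room(x)) or smiling(x) for x in universe)

print(every, some, mistaken_some)   # True False True
```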


Note: In the case of (4), α(x) and β(x) formalize, respectively: ‘x is a word he put on paper’ and ‘x was subject to careful considerations’, and in the case of (5) they correspond to: ‘x is a girl who lives in New York’ and ‘x grows beautiful tulips’. α and β can be complex wffs that include quantifiers of their own. Cases that involve a different ordering of phrases can still be classified under (S1), for example:

(6) He subjected every candidate to a lengthy interview.

comes out as ∀x (α(x) → β(x)), where α(x) and β(x) formalize, respectively: ‘x was a candidate’ and ‘he subjected x to a lengthy interview’.

A second basic scheme applies to existential quantification:

(S2) There –– Existential Verb –– Common-Noun Phrase

The existential verbs are ‘be’ and ‘exist’. For example:

(7) There is a prime number smaller than 5.


Finer points of grammar, such as the presence of the expletive ‘there’, or the indefinite article that is required in the singular form, are not represented here and we shall not dwell on them. With ‘exist’ the existential verb can come, without expletive, at the end:

A prime number smaller than 5 exists.

In what follows immediately we shall be concerned with cases that fall under (S1). We shall return to (S2) when we focus on existential quantification.

Relativized Quantifiers

In our present version of FOL, every individual variable ranges over “everything”; that is, over all the objects in the universe of the interpretation. If we want to restrict the quantification to a certain subdomain, say the domain of humans, or animals, or planets, or what have you, we should use a predicate that marks off that subdomain. To restrict quantification to the subdomain determined by P( ), we change the original generalized wffs as follows:


∀x (. . . x . . .) to ∀x (P(x) → . . . x . . .)

∃x (. . . x . . .) to ∃x (P(x) ∧ . . . x . . .)

In mathematical logic this is known as the relativization of the quantifiers to P (or to the subdomain determined by P). When these changes are applied to every subformula of a given wff, γ, we get the relativization of γ to P. What γ says about the universe, the relativization says about the domain that is marked off by P.

Instead of the atomic P(x), we can use any wff, α(x), with one free variable; in which case we speak of relativizing the quantifiers to α(x).
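Relativization is a purely syntactic transformation, and it can be sketched in a few lines of Python. The tuple encoding of wffs below is our own, chosen only for illustration:

```python
# A sketch of relativization as a syntactic operation. Wffs are nested
# tuples: ('forall', var, body), ('exists', var, body), ('->', a, b),
# ('&', a, b), ('not', a), or an atomic tuple like ('L', 'x', 'y').
def relativize(wff, P):
    """Turn every ∀x(...) into ∀x(P(x) → ...) and every ∃x(...) into ∃x(P(x) ∧ ...)."""
    op = wff[0]
    if op == 'forall':
        _, var, body = wff
        return ('forall', var, ('->', (P, var), relativize(body, P)))
    if op == 'exists':
        _, var, body = wff
        return ('exists', var, ('&', (P, var), relativize(body, P)))
    if op in ('->', '&', 'or'):
        return (op, relativize(wff[1], P), relativize(wff[2], P))
    if op == 'not':
        return ('not', relativize(wff[1], P))
    return wff                        # atomic wff: left unchanged

# ∀x ∃y L(x, y) relativized to P becomes ∀x (P(x) → ∃y (P(y) ∧ L(x, y)))
wff = ('forall', 'x', ('exists', 'y', ('L', 'x', 'y')))
print(relativize(wff, 'P'))
```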

Quantifications of the forms:

∀x (α(x)→ . . .) ∃x (α(x) ∧ . . .)

arise, as we have seen, in formalizations of natural language. In formalizing (4), we might start by:

∀x [x was subject to careful considerations]

But obviously we do not want to say that everything was subject to careful considerations. Hence we relativize the quantifier to things that are words that he (whoever he is) put on paper. This gives us (4′). Similarly, (5′) is obtained from the unrelativized

∃x [x grows beautiful tulips],

by relativizing to girls who live in New York.

If several quantifiers are involved, the quantified variables may need restrictions to different domains. Hence, as a rule, different α’s will be used to relativize different quantifiers.

Certain natural domains, such as the domain of humans, or of dogs, or of stars, are best represented in a first-order language by monadic predicates that are included in the basic vocabulary. Occasional domains, which arise in the context of this or that statement, can be marked off by appropriate wffs.

For example, if we have in our language the predicates Girl( ) and LiveIn( , ) and the individual constant NY, we can define the set of girls who live in New York by:


Girl(x) ∧ LiveIn(x, NY)

In which case, (5) will come out as:

(5∗) ∃x [Girl(x) ∧ LiveIn(x, NY) ∧ β(x)], where β(x) defines those who grow beautiful tulips. Additional predicates will be needed to construct a plausible β(x).

Sometimes a fine-grained analysis is not needed. To show that (5) implies that someone grows beautiful tulips, we do not need to know the structure of β(x). We can let β(x) be the atomic wff GBT(x), which stands for ‘x grows beautiful tulips’. But sometimes a deeper analysis is unavoidable; we need to go into β(x)’s details in order to show that (5) implies that someone grows tulips.

Many-Sorted Languages

Some first-order languages have individual variables of several sorts. An interpretation associates with each sort a domain; all the variables of this sort range over that domain. Quantification is interpreted accordingly. Domains that correspond to sorts need no relativization; we simply use the appropriate variables. For instance, if ξ is a variable that, according to the interpretation, ranges over humans, then

∀x (Human(x)→ Mortal(x)) is rewritten as ∀ ξ Mortal(ξ) .

Single-sorted languages and many-sorted ones have the same expressive power. What the latter achieve by use of different sorts the former achieve by relativizing to the corresponding predicates. Many-sorted languages are more convenient in certain contexts of applications; single-sorted ones are simpler when it comes to defining the syntax and the semantics.

Note: Sometimes relativization is not necessary, because the restriction to a specific domain is already implied by other predicates that occur in the formula. Suppose, for example, that K(x, y) reads as ‘x knows y’, and that, by definition, only humans can know.

(8) Some person other than Jack knows Jill,

and

(9) Every person who knows Jill likes her,

can be rendered as:

(8′) ∃x (x ≉ Jack ∧ K(x, Jill))


and

(9′) ∀x (K(x, Jill) → L(x, Jill))

Explicit relativization to the human domain can be dispensed with, because the needed restriction is imposed already by the predicate K. But this does not always work out. You should note how the predicate occurs. For example,

(10) Someone doesn’t know Jill

should be formalized as

(10′) ∃x [H(x) ∧ ¬K(x, Jill)],

where H marks off the domain of humans. Without the conjunct H(x), the formalization will come out true whenever the universe includes non-human objects (can you see why?).
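The reason can be seen by evaluating both formalizations over a toy universe containing one non-human object; all names and the ‘knows’ relation below are invented:

```python
# Why the conjunct H(x) is needed: a toy universe with one non-human object.
universe = ['Jack', 'Jill', 'a-rock']
human    = lambda x: x != 'a-rock'
knows    = {('Jack', 'Jill'), ('Jill', 'Jill')}   # suppose every human knows Jill

# ∃x ¬K(x, Jill) — true, but only because the rock "doesn't know" Jill
without_H = any((x, 'Jill') not in knows for x in universe)

# ∃x [H(x) ∧ ¬K(x, Jill)] — false, as intended: every human knows Jill
with_H = any(human(x) and (x, 'Jill') not in knows for x in universe)

print(without_H, with_H)   # True False
```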

8.3.3 Universal Quantification

The universal-quantifier terms in English are the indefinite pronouns:

every all any each .

They differ, however, in significant respects, grammatical as well as semantic. For example, ‘all’ requires a complement in the plural, the others in the singular; this, we shall see, is related to semantic differences.

‘All’ can function as a quantifier-term in ways that the others cannot, ways that do not fall strictly under (S1). It can precede a relative pronoun:

(1) All who went never returned.

To use the other terms in such a construction, one would have to transform ‘who went’ into a noun phrase, e.g., ‘one who went’.

Sometimes the terms can be used interchangeably:

(2) In this class, every student can pass the test.

(3) In this class, all students can pass the test.

(4) In this class, any student can pass the test.


(5) In this class, each student can pass the test.

But quite often they cannot:

(6) Not every rabbit likes lettuce.

(7) Not each rabbit likes lettuce.

(8) Not any rabbit likes lettuce.

The second is odd, if not ungrammatical; the third, if accepted, means something different from the first. Or compare:

(9) Each voter cast his ballot.

(10) Any voter cast his ballot.

You can easily come up with many other examples.

We shall not go into the intricate differences of various first-order quantifications in English. This is a job for the linguist. In what follows, some basic aspects of the two most important universal-quantifier terms, every and all, are discussed. The other two terms, which have peculiarities of their own, are left to the reader.

‘Every’ expresses best the universal quantification of FOL.

(11) Every tiger is striped

states no more and no less than the conjunction of all sentences ‘... is striped’, where ‘...’ denotes a tiger (assuming that every tiger can be referred to by some expression). But

(12) All tigers are striped

implies some kind of law. This nomological (law-like) dimension is lost when (12) is formalized in FOL. The universal generalizations of FOL are, one might say, material. They state that, as a matter of fact, every object in some class is such and such; whether this is some law, or is a mere accident, does not matter.

This does not mean that ‘every’ cannot convey a lawful regularity. When the domain of objects that fall under the generalization is sufficiently large, or sufficiently structured by a rich theory, an ‘every’-statement is naturally taken as an expression of lawfulness. In particular, in mathematics, ‘every’ is used throughout to state the most lawful imaginable regularities.


(13) Every positive integer is a sum of four squares.

We can also apply ‘all’ to accidental, or occasional, groups of objects, without nomological implications.

(14) All the students passed the test

and

(15) Every student passed the test

say the same thing. Note, however, that in (14) the definite article is needed to pick out a particular group. Without it, the statement has a different flavour:

(14′) All students passed the test.

Distributive versus Collective ‘All’

(16) After the meeting all the teachers went to the reception.

(17) After the meeting every teacher went to the reception.

In (16) ‘all’ can be read collectively: all the teachers went together. But ‘every’ must be read distributively, in (17) and in all other cases.

Only the distributive reading of (16) can be formalized as a first-order generalization. The collective reading cannot be expressed in FOL, unless we provide ways of treating pluralities as objects. The fact that ‘all’ takes a plural complement conforms to its functioning as something that relates to a plurality as a whole.

Sometimes ‘all’ must be read collectively:

(18) All the pieces in this box fit together.

(18) neither implies, nor is implied by, the statement that every two pieces fit. The latter can be expressed as a first-order generalization, by using a two-place predicate: ‘x fits together with y’. But (18) must be interpreted as an assertion about a single totality. (You can have a collection of pieces, every two of which can be combined, but which cannot be combined as a whole. Vice versa, a collection can be combined as a whole, though not every two pieces in it fit together.)

But in general the two readings are possible and the choice is made according to context and plausibility. Quite often, the statement under the collective reading of ‘all’ implies the one under the distributive reading. (If all the teachers went together, then also every teacher went.) But sometimes the statements are exclusive:

(19) The problem was solved by all the engineers in this department,

(20) The problem was solved by every engineer in this department.

(One’s contribution to a joint solution does not count as “solving the problem”.)

Note: ‘All’ collects no less, and even more, when it precedes a relative pronoun. Coming with ‘that’, it is employed in the singular form and points to a single totality. In

(21) All that John did was for the best,

‘all that John did’ refers to the collection of John’s doings. Another example is (22) below.

All with Negation

(22) All that glitters is not gold.

The proverb does not make the false assertion that every object that glitters is not made of gold. (22) is the negation of

(23) All that glitters is gold,

in which ‘all that glitters’ is read as referring to all glittering objects taken together. (23) states that this totality is made of gold; its negation, (22), says that it is not, i.e., that some of it is not made of gold. (22) can be the negation of (23) only if we treat ‘all that glitters’ as a name of a single entity. By contrast,

(24) Every object that glitters is not made of gold

is not the negation of

(25) Every object that glitters is made of gold.

(24) makes a much stronger statement: No glittering object is made of gold. By the same token, under collective ‘all’,

(26) All good memories will not be forgotten, but some will,

is not logically false. But the analogous statement with ‘every’ is logically false:


(27) Every good memory will not be forgotten, but some will.

Solid Compounds

‘Every’ and ‘any’ combine with ‘one’ and ‘body’ to form solid compounds, which can be used to quantify over people:

Everyone, Everybody, Anyone, Anybody .

The formalized versions require relativization to the human domain, via an appropriate predicate, unless the presupposed universe consists of humans only. Thus,

(28) Everyone is happy at some time.

comes out as:

(28∗) ∀x[Human(x)→ x is happy sometime]

where x is happy sometime can be further unfolded into a wff involving quantification over times (cf. 8.3.5).

Other solid compounds are formed with ‘thing’:

Everything Anything

Without a qualifying clause, ‘everything’ would seem to express unlimited universal generalization. In fact, it expresses a somewhat vague generality, which is made more precise by context.

(29) Everything has a cause. [Does ‘everything’ cover natural numbers? Does it cover people?]

(30) Everything is for the best. [This, apparently, refers to events.]

‘Everything’ is also limited to inanimate objects, unless this is overridden by the context. With an appropriate qualification ‘everything’ can express a precise general statement:

(31) From then on, everything he held in his hands turned into gold.

It has also a collective reading:

(32) Everything here fits together.


8.3.4 Existential Quantification

The existential analogue of ‘every’ or ‘all’ is some. It is employed in forming existential generalizations under scheme (S1); cf. example (5) in 8.3.2. But when ‘some’ is combined with a common name, we expect the plural:

(1) Some dogs have lived more than twenty years,

(2) Some firemen have manifested exemplary courage during the earthquake.

The singular forms of (1) and (2) are correct, but obsolete. With plural ‘some’, the statement claims the existence of more than one object that has the property in question. There is no problem in formalizing this. Say the property is expressed by γ(x) (e.g., γ(x) formalizes: ‘x is a dog and x has lived more than twenty years’). Then,

∃x γ(x) says that at least one object has the property;
∃x ∃y [γ(x) ∧ γ(y) ∧ x ≉ y] says that more than one object has it.

The second sentence, which is much longer, is interesting only in that it shows how ‘more than one’ can be expressed in terms of ‘at least one’. From a logical point of view, the basic natural notion is that of ‘at least one’; the second is a mere derivative.
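The two wffs can be compared by evaluating both over a toy universe (the dogs and their ages are invented) in which exactly one object has the property:

```python
# 'At least one' versus 'more than one' over a toy universe of dogs with
# invented ages: exactly one dog lived past twenty.
ages  = {'Rex': 22, 'Fido': 9, 'Lassie': 19}
gamma = lambda x: ages[x] > 20          # γ(x): x is a dog that lived past twenty
dogs  = list(ages)

at_least_one  = any(gamma(x) for x in dogs)                 # ∃x γ(x)
more_than_one = any(gamma(x) and gamma(y) and x != y        # ∃x ∃y [γ(x) ∧ γ(y) ∧ x ≉ y]
                    for x in dogs for y in dogs)

print(at_least_one, more_than_one)   # True False
```

The conjunct x ≉ y is what does the work: without it, the double quantification would collapse back to ‘at least one’, since x and y could pick the same dog.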

Logicians have therefore tended to interpret the plural form of ‘some’ as ‘at least one’. This makes for a neat symmetry:

‘some’ is to ∃ as ‘every’ is to ∀ .

Opinions may vary concerning the extent to which this rule-bending (if it is a bending) is acceptable. Assume that somebody asserted (1). Has he made a false claim if it turns out that exactly one dog has lived longer than twenty years? And what would be the verdict in the analogous case of (2)? Note that (1) can be read as an assertion about dogs as a species; and in this perspective, even the existence of a single dog may count as verification.

Solid Compounds of ‘Some’

The terms

someone, somebody, something

are for existential quantification what ‘everyone’, ‘everybody’ and ‘everything’ are for the universal one. Since they are in the singular, the problems mentioned above do not arise here. The first two involve, as the universal ones do, a restriction to the human domain. The third shares with ‘everything’ the same kind of vagueness and the same dependence on context.

‘There Is’, ‘There Exists’, ‘There Was’, ‘There Will Be’

‘There is’, ‘There exists’, and their derivatives occur in the second scheme, (S2), of existential quantification (cf. 8.3.2). In scientific or abstract discourse, the present tense of ‘is’ or ‘exists’ indicates timeless generality, as in (7) of 8.3.2. In other contexts, the truth-value of the statement depends (because of the time-indexicality of the verb ‘is’, or ‘exist’) on the time point referred to by the verb. E.g., (3) below may have different truth-values at different times, because the contents of the next room change with time.

(3) There is a woman in the next room.

Still, we can go ahead and formalize it:

(3∗) ∃x [W(x) ∧ In (x, next-room)] .

There is no syntactic problem here. We have only to note that the denotations of the indexical elements might change with time, place or other parameters. The interpretation of the formal language is therefore dependent on these parameters. But in each interpretation all predicates and individual constants have fixed denotations. In our case, the denotation of In( , ) is time-dependent: person a is in room b iff a is in room b now. For simplicity, we have assumed that next-room denotes some fixed room. But this, as well, may be relative to time, and also to place: next-room denotes the room that is now next to here.

On the other hand, W can be interpreted as the set of all past, present and future women, and this, it is easy to see, does not depend on time.

Existential verbs in past and future tenses are to be treated along the same lines. Consider:

(5) There was a redheaded painter,

(6) There will be a blue cow.

It would be a blunder to introduce here, for the purpose of a logical analysis, special time-dependent quantifiers.¹ The past and the future should be treated in the non-logical vocabulary. (5) is handled by introducing a predicate for past humans, (6) by introducing a predicate for future cows:

¹Temporal quantification underlies temporal logic. But this logic, which is used for specific purposes, need not enter into basic logical analysis. Especially since the full temporal story can be told if we include in our vocabulary the required predicates; cf. 8.3.5.


(5∗) ∃x [PastHuman(x) ∧ Painter(x) ∧ RedHead(x)]

(6∗) ∃x [FutureCow(x) ∧ Blue(x)]

The predicate PastHuman denotes the set of all humans that existed before now. FutureCow denotes the set of all cows that will exist after now. All the other predicates have time-independent interpretations. E.g., RedHead expresses the property of being redheaded, irrespective of time; it is therefore interpreted as the set of all redheaded creatures, past, present and future. More of this in the next subsection.

The plural forms ‘There are’ and ‘There exist’ imply the existence of more than one object. The discussion above, concerning ‘some’ in the plural, applies in large measure here as well.

8.3.5 More on First Order Quantification in English

Generality of Time and Location

Quite a few quantifier terms are used to generalize, either universally or existentially, with respect to time and place. Some of these are compounds based on terms discussed earlier. Here is a list.

Universal Generalization

For Time: whenever, always, anytime.

For Place: wherever, everywhere, anywhere.

Existential Generalization

For Time: sometime, sometimes.

For Place: somewhere.

Temporal generality can be expressed in FOL by including times, or time-points, among the objects over which the variables range. For example, the formalization of

(1) Jill is always content

comes out as:

(1∗) ∀ v (Time(v)→ Cont(Jill, v))

where Time(x) reads as: ‘x is a time-point’, and Cont(x, y) as: ‘x is content at time y’.


The same strategy works in general: we increase the arity of each predicate by adding a time-coordinate. For example, instead of formalizing ‘likes’ as a binary predicate, we formalize it as a ternary one:

L(x, y, z) reads as ‘x likes y at time z’.

Temporal Aspects and Indexicality

We have outlined above (cf. 8.3.4) how indexical elements make the interpretation dependent on time, or on other varying parameters. This relates both to universal and existential quantification. We have seen that such dependence should be no obstacle to formalization. The point becomes even clearer if we represent the time-parameter (and possibly others if needed) by an additional coordinate.

(2) Jack likes Jill now, but never liked her before

becomes:

(2∗) L(Jack, Jill, now) ∧ ∀u (u ≺ now→ ¬L(Jack, Jill, u))

Here now is an individual constant denoting the time of the utterance; ≺ is a two-place predicate (written in infix notation) denoting the precedence relation over time-points.
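As a sanity check, (2∗) can be evaluated over a toy timeline in which time-points are integers, now is a constant, and ≺ is the ordinary <; the liking facts below are invented:

```python
# Evaluating (2∗) over a toy timeline: time-points are the integers 0..10,
# 'now' is a constant, and ≺ is ordinary <. Facts are invented.
now = 5
timeline = range(0, 11)
likes = {('Jack', 'Jill', t) for t in timeline if t >= 5}   # Jack likes Jill from t = 5 on

# (2∗)  L(Jack, Jill, now) ∧ ∀u (u ≺ now → ¬L(Jack, Jill, u))
two_star = (('Jack', 'Jill', now) in likes and
            all(('Jack', 'Jill', u) not in likes
                for u in timeline if u < now))

print(two_star)   # True
```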

Note: Having now, we can dispense with other time-indexicals. We do not need predicates for past humans or future cows. (6) of 8.3.4 can now be rendered as:

∃x ∃ v [now ≺ v ∧ Cow(x, v) ∧ Blue(x, v)]

We shall not enter here into the exact nature of our “times”, whether they are like the points of a continuous line, discrete points like the integers, or small stretches. Different contexts may call for different modelings of time (the question has been the subject of investigations among logicians and researchers in artificial intelligence).

Generalizations over places, expressed by ‘everywhere’ and ‘somewhere’, can be similarly treated: We include in our universe locations and we add to the relevant predicates an additional location-coordinate. The exact nature of our locations, whether they are points in space or small regions, depends on context and will not concern us here.

Non-Temporal Use of Temporal Terms

Terms such as ‘always’, ‘whenever’, ‘sometime’, and others from the list above, can serve in a non-temporal capacity to express quantification in general:


(3) A perfect number is always even.

(4) The indefinite article is ‘an’, whenever the name begins with a vowel.

(5) Sometimes the same trait is an outcome of different evolutions.

This is also true, to a lesser extent, of terms used for locations.

Existential Import

Very often, a universal generalization is taken to imply that the domain of objects that are subject to the claim is not empty. Thus, from ‘Every Y is a Z’ one would infer that there are Y’s. Such an interpretation is said to ascribe existential import to universal quantification. It has a long history that goes back to Aristotle.

Existential import seems to be at work in a considerable part of ordinary discourse. One usually infers, from

(6) Every girl who saw this puppy was taken with it

that some girl saw the puppy. There is no difficulty in formalizing this reading in FOL. We simply add to the wff constructed according to the previous rules:

(6∗) ∀x(α(x)→ β(x))

a conjunct asserting the implied existence (e.g., of a girl who saw the puppy):

(6∗∗) ∀x(α(x)→ β(x)) ∧ ∃xα(x)
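The difference between the two formalizations shows up exactly when no girl saw the puppy; a small Python check (with invented names):

```python
# (6∗) versus (6∗∗) when no girl saw the puppy: the plain formalization is
# vacuously true, the one with existential import is false. Data invented.
universe   = ['Ann', 'Beth', 'Carol']
saw_puppy  = lambda x: False                  # nobody saw the puppy
taken_with = lambda x: False

# (6∗)   ∀x (α(x) → β(x))
six_star = all((not saw_puppy(x)) or taken_with(x) for x in universe)

# (6∗∗)  ∀x (α(x) → β(x)) ∧ ∃x α(x)
six_star_star = six_star and any(saw_puppy(x) for x in universe)

print(six_star, six_star_star)   # True False
```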

It is, however, possible to explain this, and similar cases, on the grounds of implicature. Under ordinary circumstances, an assertion of (6) is taken as a sign that the speaker believes, on good grounds, that some girls saw the puppy. For (6), interpreted as (6∗), would be vacuously true and completely uninteresting if no girl saw the puppy. The argument can be carried further by considering:

(7) Everyone who was near the explosion is by now dead.

The speaker may assert (7) in complete ignorance as to whether someone was near the explosion. The point is that if (7) is granted, and if we find later that someone was near the explosion, then we can deduce that the person is dead. If it turns out that no one was near the explosion, (7) would still be considered true.


Something about ‘Someone’

‘Someone’ means at least one person. Occasionally, an assertion of ‘some’ is taken to indicate not all, or even exactly one. But such cases can be explained on the grounds of implicature. If the teacher asserts

(8) Someone got an A on the last test,

the students will probably infer that only one of them got an A. For they assume that the teacher does not withhold relevant information. If several students got an A, he would have used the plural: ‘some of you’, and if all did, he would have used ‘all of you’.

But if (8) is asserted by someone who stole a hasty glance at the grade sheet, and the students know this, they will infer only that there was at least one who got an A. Note also that the teacher himself can announce:

(9) Someone got an A on the last exam, in fact all of you did,

without contradicting himself.

Note: ‘Some’ can also mean a relatively small quantity, as in ‘Some grains of salt got into the coffee’. Read in this way, it is not expressible in FOL.

Generality through Indefinite Articles

An indefinite article, by itself, sometimes implies universal generalization. Usually, such statements are intended to express some law-like regularity–an aspect that will be lost in FOL formalization.

(10) A bachelor is an unmarried man

means:

(10′) All bachelors are unmarried men.

(11) Whales are mammals

means:

(11′) All whales are mammals.

But often the last form expresses something weaker than strict universal generalization:

(12) Birds fly

means something like: In most cases, or in most cases you are likely to encounter, birds fly.

And this kind of statement is outside the scope of FOL. Considerable efforts have been devoted to setting up formalisms in which generalizations of this kind are expressible.

A very common variant of (10) and (12) employs the conditional.

(13) If a triangle has two equal angles, it has two equal sides

means:

(13′) Every triangle that has two equal angles has two equal sides.

(14) A man is not held in esteem, if he is easily provoked

means:

(14′) Every man who is easily provoked is not held in esteem.

Generality through Negation

(15) No person is indispensable

amounts to the negation of ‘Some person is indispensable’:

(15′) Every person is not indispensable.

In this category we have the very commonly used compounds: nothing, no one, nobody, as well as nowhere.

Generalization through ‘Some’

These cases belong together with (13) and (14) above. ‘Some’ plays here the role of an indefinite article.

(16) If someone beats a world record, many people admire him

means:

(16′) If a person beats a world record, many people admire him,

which comes to:

(16′′) Everyone who beats a world record is admired by many people.

And in a similar vein:

(17) Someone who is generous is liked

really means:

(17′) Everyone who is generous is liked.

We may even get ambiguous cases where ‘some’ can signify either a universal or an existential quantifier:

(18) In this class, someone lazy will fail the test.

You can conceivably interpret (18) as stating that, in this class, all the lazy ones will fail the test. You can also read it as a prediction about some unspecified student that the speaker has in mind.

General Advice

From the foregoing, you can see some of the tangle of first-order quantification in natural language. Remember that there are no simple clear-cut rules that will enable you to derive, in a mechanical way, correct formalized versions. Conceivably, some algorithm might do this; but it is bound to be a very complex affair. A good way to check whether you have got the formalization right is to consider truth-conditions:

Assuming that vagueness, non-denoting terms and other complicating factors have been cleared, do the sentence and its formal translation have the same truth-value in every possible circumstance? This is not the only criterion, but it is a crucial one.

In any case, do not blindly follow the grammatical form. You must understand what the sentence says before you formalize it!

8.3.6 Formalization Techniques

When translating from English into FOL, it is often useful (especially for beginners) to proceed stepwise, using intermediary semi-formal rewrites, possibly with variables. When the semi-formal sentence is sufficiently detailed, it translates easily into FOL. Here are some illustrations. The predicates and constants in the final wffs are self-explanatory (H is interpreted as the set of humans). In (1), (2), (5) and (6) more than one logically equivalent wff is given as a possible answer. In some you can trace the equivalence to the familiar: α → (β → γ) ≡ α ∧ β → γ. In all cases the equivalences follow from FOL equivalence rules (to be given in chapter 9). But you may try to see for yourself that the different versions say the same thing.
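Such equivalences can also be confirmed by brute force over small finite interpretations. The following Python sketch (my own illustration, not part of the course's apparatus) compares the two versions of (1¦), ∀x [M(x) → (H(x) → L(x, x))] and ∀x [M(x) ∧ H(x) → L(x, x)], in every interpretation over a two-element universe:

```python
from itertools import combinations, product

U = [0, 1]  # a tiny universe already exercises every truth-functional case

def subsets(s):
    """All subsets of s, as frozensets."""
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def v1(M, H, L):
    # forall x [M(x) -> (H(x) -> L(x, x))]
    return all((x not in M) or (x not in H) or ((x, x) in L) for x in U)

def v2(M, H, L):
    # forall x [M(x) & H(x) -> L(x, x)]
    return all(not (x in M and x in H) or ((x, x) in L) for x in U)

pairs = list(product(U, U))
agree = all(v1(M, H, L) == v2(M, H, L)
            for M in subsets(U) for H in subsets(U) for L in subsets(pairs))
print(agree)  # True: the two wffs get the same value in every such interpretation
```

Of course, agreement over one small universe is evidence, not proof; the equivalence rules of chapter 9 cover all models.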

(1) No man is happy unless he likes himself.

(1∗) Every man who is happy likes himself.

(1∗∗) For every man, x:

If x is happy, then x likes x.

(1∗∗∗) For every x, if x is a man, then:

If x is happy, then x likes x.

(1¦) ∀x [M(x)→ (H(x)→ L(x, x))]

or: ∀x [M(x) ∧ H(x)→ L(x, x)]

(2) Some man is liked by every woman who likes herself.

(2∗) There is a man who is liked by every woman who likes herself.

(2∗∗) There is a man, x, such that:

For every woman y:

If y likes y, then y likes x.

(2∗∗∗) There is x, such that x is a man, and

for every y, if y is a woman, then:

If y likes y, then y likes x.

(2¦) ∃x [M(x) ∧ ∀ y (W(y)→ (L(y, y)→ L(y, x)))]

or: ∃x [M(x) ∧ ∀ y (W(y) ∧ L(y, y)→ L(y, x))]

(3) Claire likes somebody.

(3∗) There is a person whom Claire likes.

(3∗∗) There is a person, x, such that:

Claire likes x.

(3∗∗∗) There is x, such that x is a person, and

Claire likes x.

(3¦) ∃x [H(x) ∧ L(c, x)]

(4) Claire likes a man who does not like her.

(4∗) There is a man whom Claire likes and who does not like Claire.

(4∗∗) There is a man, x, such that:

Claire likes x and x does not like Claire.

(4∗∗∗) There is x, such that x is a man, and

Claire likes x and x does not like Claire.

(4¦) ∃x [M(x) ∧ L(c, x) ∧ ¬L(x, c)]

(5) Harry likes some women, though not all of them.

Reading ‘some women’ as: “more than one woman”, we get:

(5∗) There are two women whom Harry likes and there is a woman whom Harry does not like.

(5∗∗) There are women x, y, such that:

Harry likes x and Harry likes y and x ≠ y,

and there is a woman, z, such that:

Harry does not like z.

(5∗∗∗) There is x, such that x is a woman, and

there is y, such that y is a woman, and

Harry likes x and Harry likes y and x ≠ y,

and there is z, such that z is a woman, and

Harry does not like z.

(5¦) ∃x {W(x) ∧ ∃ y [W(y) ∧ L(h, x) ∧ L(h, y) ∧ x ≉ y]} ∧ ∃ z [W(z) ∧ ¬L(h, z)]

or: ∃x ∃ y [W(x) ∧ W(y) ∧ L(h, x) ∧ L(h, y) ∧ x ≉ y] ∧ ∃ z [W(z) ∧ ¬L(h, z)]

Had we read ‘some women’ as ‘some woman’ the final version would have been:

(5′) There is x, such that x is a woman, and

Harry likes x,

and

there is z, such that z is a woman, and

Harry does not like z.

(5′¦) ∃x [W(x) ∧ L(h, x)] ∧ ∃ z [W(z) ∧ ¬L(h, z)]

(6) No woman likes a man, if he doesn’t like her.

(6∗) Every woman does not like a man, if the man doesn’t like her.

(6∗∗) For every woman, x:

For every man y:

If y does not like x, then x does not like y.

[Or, equivalently: If x likes y, then y likes x.]

(6∗∗∗) For every x, if x is a woman, then:

For every y, if y is a man, then:

If x likes y, then y likes x.

(6¦) ∀x {W(x)→ ∀ y [M(y)→ (L(x, y)→ L(y, x))]}

or: ∀x {W(x)→ ∀ y [M(y) ∧ L(x, y)→ L(y, x)]}

or: ∀x ∀ y [W(x) ∧ M(y) ∧ L(x, y)→ L(y, x)]

Expressing Uniqueness

A claim of uniqueness is a claim that there is one and only one object satisfying a given property. If the property is expressed by ‘...x...’, then the claim has the form:

(1) There is a unique x such that ...x... .

If ‘...x...’ is formalized as α(x), then (1) is expressed in FOL by:

(1∗) ∃x [α(x) ∧ ∀ y (α(y)→ x ≈ y)]

In words: there is x such that: (i) ...x... and (ii) for every y, if ...y..., then y is equal to x. (Here we assume, of course, that α(y) results from α(x) by a legitimate substitution of the free variable. Also, y should not occur freely in α(x).)

Uniqueness is also expressed by the following logically equivalent wff:

(1∗∗) ∃xα(x) ∧ ∀y∀z[α(y) ∧ α(z)→ y ≈ z]

The first conjunct says that there is at least one object satisfying the property; the second–that there is at most one object satisfying it. (Again, we assume that y and z are substitutable for x in α(x), and that they are not free there.)

Sometimes (1∗), or (1∗∗), is abbreviated as:

∃!xα(x)

which reads: there is a unique x such that α(x).
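Over a finite universe, ∃!xα(x) can be checked directly. Here is a sketch following the two-conjunct form (1∗∗); the function names and the arithmetic examples are mine, chosen only for illustration:

```python
def exists_unique(universe, alpha):
    # (1**): there is at least one witness, and any two witnesses are equal
    at_least_one = any(alpha(x) for x in universe)
    at_most_one = all(not (alpha(y) and alpha(z)) or y == z
                      for y in universe for z in universe)
    return at_least_one and at_most_one

print(exists_unique(range(4), lambda x: x * x == 4))   # True: only x = 2
print(exists_unique(range(4), lambda x: x % 2 == 0))   # False: 0 and 2 both qualify
print(exists_unique(range(4), lambda x: x > 10))       # False: no witness at all
```

The two conjuncts mirror (1∗∗) exactly: the `any` corresponds to ∃xα(x), the `all` to ∀y∀z[α(y) ∧ α(z) → y ≈ z].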

Homework 8.4 Rephrase the following sentences, using variables. Then formalize them in FOL.

(1) Claire and Edith like the same men.

(2) The women who like Jack do not like Harry.

(3) Only Ann is liked by Harry and David.

(4) David is not happy unless two women like him.

(5) Edith is liked by some man who does not like any other woman.

(6) Harry likes a woman who likes all happy men.

(7) Unless liked by a woman no man is happy.

(8) Some man likes all women who like themselves.

(9) Every happy man is liked by some happy woman.

(10) Ann is liked by every man who likes some woman.

Divide and Conquer

It is often useful to formalize separately components of the sentence, which can then be fitted into the global structure. Such components are wffs that can contain free variables. They are obtained from a semi-formal version of the original English sentence. Here are three examples of this divide-and-conquer method. Note how short English sentences can display, upon analysis, an intricate logical structure.

(i) Whoever found John found also somebody else.

For all x, if x is a person and x found John, then α(x), where α(x) is a wff saying:

x found someone other than John.

All in all, the sentence can be written as:

(i′) ∀x [H(x) ∧ Found(x, John) → α(x)]

We now turn our attention to α(x). It can be written as:

∃y (H(y) ∧ y ≉ John ∧ Found(x, y))

Substituting in (i′) we get our final answer:

(i′′) ∀x [H(x) ∧ Found(x, John) → ∃y (H(y) ∧ y ≉ John ∧ Found(x, y))]

(ii) Jill owns a dog which is smaller than any other dog.

(ii′) ∃x [Dog(x) ∧ Owns(Jill, x) ∧ α(x)], where α(x) says:

x is smaller than any other dog. We can write it as:

∀y (Dog(y) ∧ y ≉ x → Smaller(x, y))

Substituting we get:

(ii′′) ∃x [Dog(x) ∧ Owns(Jill, x) ∧ ∀y (Dog(y) ∧ y ≉ x → Smaller(x, y))]

(iii) Somebody loves a person who is loved by nobody else.

(iii′) ∃x {H(x) ∧ α(x)}, where α(x) says:

x loves a person who is not loved by anyone, except x.

It can be written as:

∃y [H(y) ∧ Loves(x, y) ∧ β(x, y)] where β(x, y) says:

Every person other than x does not love y.

It can be written as:

∀z (H(z) ∧ (z ≉ x) → ¬Loves(z, y))

Therefore, α(x) becomes:

∃y [H(y) ∧ Loves(x, y) ∧ ∀z (H(z) ∧ (z ≉ x) → ¬Loves(z, y))]

Substituting in (iii′), we get:

(iii′′) ∃x {H(x) ∧ ∃y [H(y) ∧ Loves(x, y) ∧ ∀z (H(z) ∧ (z ≉ x) → ¬Loves(z, y))]}

Homework

8.5 Formalize the following sentences in FOL. You can use one-letter notations for predicates and individual names; specify what they stand for. Indicate cases of ambiguity and formalize the various readings.

1. One woman in the room knows all the men there.

2. No one in the room knows every person there.

3. Someone in the room does not know any other person.

4. Someone can be admitted to the club only if two club members vouch for him.

5. Bonnie knows a man who hates all club members except her.

6. Bonnie will not attend the party unless some friend of hers does.

7. Bonnie met two persons, only one of whom she knew.

8. Abe met two men, one of whom he knew, and the other who knew him.

9. Some women who like Abe do not like any other man.

10. Abe owns a house which no one likes.

11. Abe was bitten by a dog owned by a woman who hates all men.

12. Bonnie knows a man who likes her and no one else.

13. Whoever visited Bonnie knew her and was known to some other club member, except Abe.

14. With the possible exception of Bonnie, no club member is liked by all the rest.

15. With the exception of Bonnie, no club member is liked by all the rest.

8.6 Formalize the following sentences in FOL. Introduce predicates as you find necessary, specifying the interpretations clearly. (For example: GM(x, y) stands for ‘x and y are people and x is good enough to be y’s master’). Try to get a fine-grained formalization. Interpret ‘some’ as ‘at least one’.

1. No one means all he says.

2. Someone says all he means.

3. Each State can have for enemies only other States, and not men.

4. He who is by nature not his own but another’s man, is by nature a slave.

5. No man is good enough to be another man’s master.

6. Those who deny freedom to others deserve it not for themselves.

7. He who cannot give anything away cannot feel anything either.

8. You can fool some of the people all the time, or all the people some of the time, but you cannot fool all people all the time.

Chapter 9

Models for FOL, Satisfaction, Truth and Logical Implication

9.1 Models, Satisfaction and Truth

9.1.0

The interpretation of a first-order language is given as a model. By this we mean a structure of the form:

(U, π, δ)

in which:

(I) U is a non-empty set, called the model’s universe, or domain. The members of U are also referred to as members of the model.

(II) π is a function that correlates with every predicate, P, of the language a relation, π(P), over U of the same arity as P (if P’s arity is 1, then π(P) is a subset of U). We say that π(P) is the interpretation of P.

(III) δ is a function that correlates with every individual constant, c, of the language a member, δ(c), of U. We say that δ(c) is the denotation of c in the given model. We also speak of it as the interpretation of c.

In set-theoretic terms we can express this by:

π(P) ⊆ Un, where n = arity of P, δ(c) ∈ U .

There is only one restriction on the interpretation of predicates: If the language has equality, then

π(≈) = {(x, x) : x ∈ U} .

In words: the equality sign is interpreted as the identity relation over U .

Note: In the case of a language with function symbols (cf. 8.2.4), the mapping π is also defined for the function symbols; it correlates with every n-place function symbol an n-place function from Un into U. Henceforth we deal with languages based on predicates and individual constants. The extension to function symbols is more or less straightforward.
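For finite models, the structure (U, π, δ) can be represented directly as data. The following Python sketch (my own illustration, with an arbitrarily chosen model) mirrors clauses (I)–(III):

```python
# (I)   the universe: a non-empty set
# (II)  pi: predicate name -> relation of matching arity over the universe
#       (arity 1: a set of elements; arity n > 1: a set of n-tuples)
# (III) delta: individual constant -> member of the universe
model = {
    "universe": {0, 1, 2},
    "pi": {
        "P": {0, 2},                    # a monadic predicate: a subset of U
        "R": {(0, 1), (1, 2), (0, 2)},  # a binary predicate: a subset of U x U
    },
    "delta": {"c": 0},
}

# If the language has equality, its interpretation is forced:
model["pi"]["="] = {(x, x) for x in model["universe"]}
```

The last line enforces the text's single restriction: the equality sign must denote the identity relation over U.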

Notation and Terminology

• We shall use ‘M’, ‘M′’, . . ., ‘M1’, . . . for models.

• ‘|M|’, ‘PM’, ‘cM’ denote, respectively, the universe of M, the interpretation of P in M, and the interpretation of c in M.

Hence, if M = (U, π, δ) then:

|M| = U, PM = π(P), cM = δ(c)

• If we assume fixed orderings of the predicates and of the individual constants: P1, P2, . . ., c1, c2, . . ., the model is written by displaying the interpretations in the same order:

(U, P1, P2, . . . c1, c2, . . .)

where U = |M|, Pi = PMi, cj = cMj. (If there are no individual constants, the last sequence is simply omitted.) A structure of this form is known also as a relational structure.

• The size of a model M is, by definition, the number of elements in its universe. The model is finite if its universe is a finite set.

As observed in 8.2.2, the truth-value of any wff α is determined by: (i) a model M and (ii) an assignment of members of M to α’s free variables. Accordingly, we have to define

the truth-value of a wff α, in a model M, under an assignment g of values to α’s free variables.

We shall denote this truth-value as:

valMα[g]

If α is a sentence, its truth-value depends only on the model and we can drop ‘[g]’.

Note: The assignment g is neither a part of the language, nor of the model.

The following notations are used for assignments.

(I) If x1, . . . , xn are distinct variables, we use:

x1 x2 . . . xn
a1 a2 . . . an

(the variables in the upper row, their values directly below them) to denote the assignment defined over {x1, . . . , xn}, which assigns each xi the value ai. Accordingly,

valMα[x1a1 x2a2 . . . xnan]

is the truth-value of α in M under that assignment.

(II) If g is any assignment of values to some variables, then

gxa

is, by definition, the assignment that assigns to x the value a, and to every other variable–the value that is assigned to it by g.

Note 1: gxa is defined for the following variables: (i) all the variables for which g is defined, (ii) the variable x. To variables different from x, gxa and g assign the same values. Whether g is defined for x, or not, does not matter; for in either case gxa assigns to x the value a.

Note 2: In order that valMα[g] be defined, g should be defined for all free variables of α. It can also be defined for other variables; but as we shall see, the values given to variables not free in α play no role in determining α’s truth-value.

Note 3: We use ‘assignment’ for a function that correlates members of the universe with variables. Do not confuse this with the truth-value assignment (to be presently defined) which correlates–with each wff α, each model M and each suitable assignment g of members of |M| to variables–the truth-value valMα[g].
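If assignments are represented as finite dictionaries from variable names to members of the universe, the operation taking g to gxa is a non-destructive update. A small sketch (the function name is mine):

```python
def reset(g, x, a):
    """The assignment gxa: like g, except that x gets the value a."""
    h = dict(g)  # copy, so the original g is left untouched
    h[x] = a
    return h

g = {"x": 1, "y": 2}
print(reset(g, "x", 5))  # {'x': 5, 'y': 2}
print(reset(g, "z", 7))  # g need not be defined for z: {'x': 1, 'y': 2, 'z': 7}
print(g)                 # {'x': 1, 'y': 2}: g itself is unchanged
```

Leaving g intact matters in the quantifier clauses below, where many assignments gxa, for different values a, are formed from one and the same g.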

9.1.1 The Truth Definition

valMα[g] is defined inductively, starting with atomic wffs and proceeding to more complex ones.

It is convenient to start by assigning values to terms, i.e., to individual variables and constants. The value of a term, t, under g, is denoted as: valMt[g]. It is not a truth-value but a member of M, determined as follows:

If t is the individual constant, c, then valMt [g] = cM.

If t is the variable v, then valMt[g] = g(v). This value is defined iff g is defined for v.

Atomic Wffs

valMP(t1, . . . , tn) [g] = T if (valMt1[g], . . . , valMtn[g]) ∈ PM,

valMP(t1, . . . , tn) [g] = F if (valMt1[g], . . . , valMtn[g]) ∉ PM.

(For atomic sentences, this coincides with the definition given in 7.1.1.)

Note that, since by assumption g is defined for all free variables of P(t1, . . . , tn), all the values valMti[g] are defined.

Sentential Compounds

valM¬α [g] = T if valMα[g] = F,

valM¬α [g] = F if valMα[g] = T.

If ◦ is a binary connective, then valM(α ◦ β)[g] is obtained from valMα[g] and valMβ[g] by the truth-table of ◦.

(In the last clause g is defined for all free variables of α ◦ β, hence it is defined for the free variables of α and for the free variables of β.)

Universal Quantifier

valM∀xα [g] = T if, for every a ∈ |M|, valMα[gxa] = T,

valM∀xα [g] = F otherwise.

Existential Quantifier

valM∃xα [g] = T if, for some a ∈ |M|, valMα[gxa] = T

valM∃xα [g] = F otherwise.

If the language contains function symbols, then the definition is exactly the same, except that we have to include in the definition of valMt[g] inductive clauses for terms containing function symbols:

valMf(t1, . . . , tn) [g] = fM(valMt1[g], . . . , valMtn[g]) ,

where fM is the function that interprets the function-symbol f in M.
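For a finite model, the inductive truth definition can be transcribed into Python almost clause by clause. The sketch below uses an ad hoc encoding of wffs as nested tuples (my own convention, not the book's official syntax), and an arbitrarily chosen model:

```python
# Wffs as nested tuples:
#   ("P", t1, ..., tn)            atomic; terms are constant or variable names
#   ("not", a)  ("and", a, b)  ("or", a, b)  ("->", a, b)
#   ("all", x, a)  ("ex", x, a)
model = {
    "universe": {0, 1, 2, 3},
    "predicates": {"R": {(0, 1), (1, 2), (2, 3)}},
    "constants": {"c": 0},
}

def val_term(t, g):
    # the value of a term: a constant's denotation, or g's value for a variable
    return model["constants"].get(t, g.get(t))

def val(wff, g):
    op = wff[0]
    if op == "not":
        return not val(wff[1], g)
    if op == "and":
        return val(wff[1], g) and val(wff[2], g)
    if op == "or":
        return val(wff[1], g) or val(wff[2], g)
    if op == "->":
        return (not val(wff[1], g)) or val(wff[2], g)
    if op in ("all", "ex"):
        x, body = wff[1], wff[2]
        test = all if op == "all" else any
        # the quantifier clauses: range over all values of x, resetting g at x
        return test(val(body, {**g, x: a}) for a in model["universe"])
    # atomic wff: membership of the term values in the interpreting relation
    args = tuple(val_term(t, g) for t in wff[1:])
    return args in model["predicates"][op]

# forall x (R(c, x) -> exists y R(x, y)): true in this model
sentence = ("all", "x", ("->", ("R", "c", "x"), ("ex", "y", ("R", "x", "y"))))
print(val(sentence, {}))  # True
```

Note how the quantifier clauses exhibit the "special kind of induction" discussed below: the value of a quantified wff is computed from the values of its body under every assignment to the quantified variable.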

A Special Kind of Induction: The truth-value of ∀xα (or of ∃xα) is determined by the truth-values of the simpler wff α, under all possible assignments to the variable x. It is therefore based on simpler cases whose number is, possibly, infinite. This is a special kind of high-powered induction that we have not encountered before.

Note: In the clauses for quantifiers, the variable x need not be free in α. If it is not, then it can be shown that, for each assignment g, the truth-values of α, ∀xα, and ∃xα are the same. Note also that α may contain non-displayed free variables, besides x. Since their values under g and under any gxa are the same, these values are fixed parameters in the clause.

Understanding the Logical Particles: The clauses for quantifiers employ the expressions for every and for some. We must understand what these expressions mean in order to grasp the definition. Just so, we should understand ‘and’, ‘either ... or ’, and ‘if ..., then ’, in order to understand what the truth-tables mean. We say, for example: “The value of any sentence is either T or F”, or “If the value of A is T and the value of B is F, then the value of A ∧ B is F.”

First-order logic does not provide us with a substitute for these concepts, but with a systematization that expresses them in rigorous precise form.

Satisfaction: If valMα[g] = T we say that α is satisfied in M by the assignment g, or that M and g satisfy α. We denote this by:

M |= α[g]

If α has a single free variable, we say that α is satisfied by a (where a ∈ |M|), if α is satisfied by the assignment that assigns a to its free variable. Similarly, if it has two free variables, we say that it is satisfied by the pair (a, b), if it is satisfied by the assignment that assigns a to the first free variable and b to the second. Here, of course, we presuppose some agreed ordering of the variables.

If α is not satisfied in M by g (i.e., if valMα[g] = F), we denote this by:

M ⊭ α[g]

Ambiguity of ‘|=’: ‘|=’ is also used to denote logical implication (and logical truth). There is no danger of confusion. The symbol denotes satisfaction if the expression to its left denotes a model; otherwise it denotes logical implication. These uses of ‘|=’ are traditional in logic.

Dependence on Free Variables

It can be proved (by induction on the wff) that valMα[g] depends only on the values assigned by g to the free variables of α. If g and g′ are assignments that assign the same values to all free variables of α, but which may differ otherwise, then

valMα[g] = valMα[g′]

The proof is not difficult but rather tedious; we shall not go into it here. If there are no free variables, i.e., if α is a sentence, the truth-value depends only on the model. We can therefore omit any reference to an assignment, saying (in case of satisfaction) that the sentence is satisfied, or is true, in M, and denoting this as:

M |= α

Similarly, valMα is the truth-value of the sentence α in the model M.

Example

Consider a first-order language based on (i) the binary predicate L, (ii) the monadic predicates H, W and M, (iii) the individual constants c1 and c2. Let their ordering be:

L, H, W, M, c1, c2

and let M be the model

(U, L, H, W, M, c, d)

where:

U = {c, d, e, f, g, h}

L consists of the pairs:

(c,c), (c,d), (c,f), (d,g), (d,h), (e,e), (e,f), (e,h), (f,c), (f,f), (f,h), (g,c),(g,d), (g,e), (g,g), (h,e), (h,f)

H = {c, e, f, g}
W = {c, d, e}
M = {f, g, h}

To make this more familiar let the six objects c, d, e, f, g, h be people, three women and three men:

c = Claire, d = Doris, e = Edith, f = Frank, g = George, h = Harry.

Then, W consists of the women and M of the men. Assume moreover that L is the liking-relation over U and that H is the subset of happy people, that is, for all x, y ∈ U :

(x, y) ∈ L iff x likes y,

x ∈ H iff x is happy.

Note that Claire and Doris have names in our language: c1 and c2, but the other people do not.

Now let α be the sentence:

∀u [W(u)→ ∃ v (M(v) ∧ H(v) ∧ L(u, v))]

It is not difficult to see that, given that the interpretation is M, α says:

(1) Every woman (in U) likes some man (in U) who is happy.

Applying the truth-definition to α, we shall now see that the truth of α in the given model is exactly what (1) expresses. In other words:

α is true inM IFF (1)

Obviously α = ∀uβ(u), where

β(u) = W(u)→ ∃ v (M(v) ∧ H(v) ∧ L(u, v))

Hence M |= α iff for every a ∈ U, M |= β(u)[ua]. Now β is a conditional:

β = W(u)→ ∃vγ, where γ = γ(u, v) = M(v) ∧ H(v) ∧ L(u, v)

If a ∉ W then M ⊭ W(u)[ua] and the antecedent of the conditional gets F, which makes the conditional true. Hence β is satisfied by every assignment ua for which a ∉ W. Therefore, α is true in M iff β is also satisfied by all the other assignments, i.e., by all assignments ua in which a ∈ W. For each of these assignments the antecedent gets T; hence the conditional gets T iff

M |= ∃vγ[ua]

And this last wff is satisfied iff there exists b ∈ U such that M |= γ(u, v)[ua vb]; that is, iff there is b ∈ U such that:

M |= M(v) ∧ H(v) ∧ L(u, v) [ua vb]

The last condition simply means that each of the conjuncts is satisfied by ua vb, which means that:

b ∈ M and b ∈ H and (a, b) ∈ L

Summing all this up, we have: α is satisfied in M iff for every a ∈ U, if a ∈ W, there exists b ∈ U, such that b ∈ M and b ∈ H and (a, b) ∈ L. Which can be restated as:

(2) For every a in U , if a is a woman, then there exists b in U , such that b is aman and b is happy and a likes b.

Obviously, (2) is nothing but a detailed rephrasing of (1).

The truth-value of α can be found by checking, for every woman in W, whether there is in U a happy man whom she likes. This indeed is the case: c likes f, d likes g, e likes f. Hence, valMα = T.

The same reasoning can be applied to the sentence β:

∃ v [M(v) ∧ H(v) ∧ ∀u (W(u)→ L(u, v))]

β, it turns out, asserts that there is a happy man who is liked by all women. It gets the value F, because the happy men (in U) are f and g; but f is not liked by d and g is not liked by c.
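Both verdicts can be confirmed mechanically. The computation below is a direct Python transcription of (2), and of the corresponding condition for β, over the model's tables (an illustration of mine, not part of the text's formal apparatus):

```python
U = {"c", "d", "e", "f", "g", "h"}
L = {("c", "c"), ("c", "d"), ("c", "f"), ("d", "g"), ("d", "h"), ("e", "e"),
     ("e", "f"), ("e", "h"), ("f", "c"), ("f", "f"), ("f", "h"), ("g", "c"),
     ("g", "d"), ("g", "e"), ("g", "g"), ("h", "e"), ("h", "f")}
H = {"c", "e", "f", "g"}   # the happy people
W = {"c", "d", "e"}        # the women
M = {"f", "g", "h"}        # the men

# alpha: every woman likes some happy man
alpha = all(any(b in M and b in H and (a, b) in L for b in U) for a in W)
# beta: some happy man is liked by all women
beta = any(b in M and b in H and all((a, b) in L for a in W) for b in U)
print(alpha, beta)  # True False
```

The nested `all`/`any` mirror the quantifier prefix of each sentence: ∀u∃v for α, ∃v∀u for β, which is exactly why the two get different truth-values.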

Homework 9.1 (I) Find the truth-value in the model M of the last example of each of the following sentences. Justify briefly your answers in regard to 6.–10. Do not go into detailed proofs; justifications for the above α and β can run as follows:

α gets T, because for each x in W, there is a y in M, which is also in H, such that L(x, y): for x = c choose y = f, for x = d choose y = g, and for x = e choose y = f.

(Here we used ‘L(x, y)’ as a shorthand for ‘(x, y) ∈ L’.)

β gets F, because there is no x that is in M and in H, such that for all y ∈ W, L(y, x). M ∩ H has two members, f and g; but for x = f, y = d provides a counterexample, and for x = g, y = c provides it.

1. L(c1, c2) ∨ L(c2, c1)

2. L(c1, c2)→ L(c2, c1)

3. ∀x (L(c1, x)∧M(x)→ L(x, c1))

4. ∀x (L(c2, x)∧M(x)→ L(x, c2))

5. ∀x (L(x, c2)∧M(x)→ L(c2, x))

6. ∀x [W(x)→ ∃ y (M(y)∧L(x, y)∧L(y, x))]

7. ∀x ∀ y (W(x)∧W(y)∧x ≉ y → ¬L(x, y))

8. ∀x [W(x)→ ∃ y (W(y)∧L(x, y))]

9. ∀x [W(x)→ ∃ y (W(y)∧L(y, x))]

10. ∀x [H(x)↔ L(x, x)]

(II) Translate the sentences into correct stylistic English. (This relates to the subject matter of the previous chapter. Do it after answering (I).)

9.1.2 Defining Sets and Relations by Wffs

The sets and relations defined by wffs in a given interpretation (cf. 8.1.1) can now be described formally using the concept of satisfaction:

In a given model M, a wff with one free variable defines the set of all members of |M| that satisfy it. A wff with two free variables defines the relation that consists of all pairs (a, b) ∈ |M|2 that satisfy it. And a wff with n free variables defines, in a similar way, an n-ary relation over |M|. (For arity n > 1, we have to presuppose a matching of the free variables in the wff with the relation’s coordinates.)

Note: A wff with m free variables can be used to define relations of higher arity in which the additional coordinates are “dummy”: Say, the free variables of α occur among v1, . . . , vn and consider the relation consisting of all tuples (a1, . . . , an) such that

M |= α[v1a1 v2a2 . . . vnan]

The vi’s that are not free in α make for dummy coordinates that have no effect on the tuple’s belonging to the relation.

Examples

Consider a first-order language, based on a two-place predicate R, the equality predicate ≈,and two individual constants: c1 and c2.

Let |M| = {0, 1, 2, 3, 4} and let:

RM = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 4), (2, 2), (4, 1), (4, 3), (4, 4)}

cM1 = 0, cM2 = 3

This model is illustrated below, where an arrow from i to j means that (i, j) is in the relation.

[Diagram omitted: the points 0–4, with an arrow from i to j for each pair (i, j) in RM.]

The following obtains.

1. M |= R(c1, x)[xa] for a = 1, 2, 3, and M ⊭ R(c1, x)[xa] for all other a’s.

Hence, R(c1, x) defines the set {1, 2, 3}.

2. M |= R(y, c2)[ya] for a = 0, 4, and M ⊭ R(y, c2)[ya] for all other a’s.

Hence, R(y, c2) defines the set {0, 4}.

3. R(x, c1) defines ∅ (the empty set).

4. M |= ∃xR(c1, x), because there is a ∈ |M| (e.g., 1) such that M |= R(c1, x)[xa].

5. M ⊭ ∀xR(c1, x), because not all a ∈ |M| are such that M |= R(c1, x)[xa]; e.g., a = 0.

6. Hence, M |= ¬∀xR(c1, x).

7. M |= ∃yR(y, c2) ∧ ¬∀yR(y, c2).

8. M |= ∀x (R(c1, x) ∨ R(x, c2)), because, as you can verify by direct checking, for all a ∈ |M|: M |= (R(c1, x) ∨ R(x, c2))[xa].

9. M |= ∃z (z ≉ c1 ∧ z ≉ c2 ∧ R(z, z))

10. M |= ∃u∃v (u ≉ v ∧ R(u, u) ∧ R(v, v))

11. M |= ∀x [R(x, x)→ ∃y (y ≉ x ∧ R(y, x))].

Because, for all a ∈ |M|, M |= [R(x, x)→ ∃y (y ≉ x ∧ R(y, x))][xa]; namely, if a = 0, 1, 3 the antecedent gets the value F under the assignment xa, hence the conditional gets T; if a = 2, M |= ∃y (y ≉ x ∧ R(y, x))[x2], because M |= (y ≉ x ∧ R(y, x))[x2 y1]; and a similar argument works for a = 4.

12. ∀x (R(x, y)→ R(y, x)) defines the set {0, 4}.

If we read R(u, v) as: ‘u points to v’, then the last wff can be read as: ‘Every member that points to y is pointed to by y’. 0 satisfies it vacuously, because no member points to 0.

13. The wff x ≈ c2 defines the set {3}.

14. The wff ∃ y (y ≉ x ∧ R(y, y) ∧ R(y, x)) ∧ ∃zR(x, z) defines the set {1}.

Had we not included y ≉ x in the last wff, both 4 and 2 would have satisfied it (can you see why?). Had we not included the conjunct ∃zR(x, z), 3 would have satisfied it.
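Since the model is finite, each defined set is computable as a comprehension over the universe. Re-checking a few of the items above (a sketch of mine, using the model's data directly):

```python
U = {0, 1, 2, 3, 4}
R = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 4), (2, 2), (4, 1), (4, 3), (4, 4)}
c1, c2 = 0, 3

# item 1: the set defined by R(c1, x)
s1 = {a for a in U if (c1, a) in R}
# item 12: the set defined by forall x (R(x, y) -> R(y, x))
s12 = {b for b in U if all((b, a) in R for a in U if (a, b) in R)}
# item 14: the set defined by
#   exists y (y != x & R(y, y) & R(y, x))  &  exists z R(x, z)
s14 = {a for a in U
       if any(y != a and (y, y) in R and (y, a) in R for y in U)
       and any((a, z) in R for z in U)}
print(s1, s12, s14)  # {1, 2, 3} {0, 4} {1}
```

Note how a bounded quantifier in the wff becomes an `all`/`any` over U, while the free variable becomes the comprehension variable.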

Repeated Quantifier Notation: We use

∀x1, x2, . . . , xn α,  ∃x1, x2, . . . , xn α,

as abbreviations for:

∀x1 ∀x2 . . . ∀xn α,  ∃x1 ∃x2 . . . ∃xn α.

Homework

9.2 Consider a language with equality whose non-logical vocabulary consists of: a binary predicate, S, a monadic predicate, P, and one individual constant, a.

Let Mi, where i = 1, 2, 3, be models for this language with the same universe U.

Mi = (U, Si, Pi, ai)

Assume that U = {1, 2, 3, 4} and that the relations Si and Pi and the object ai (which interpret, respectively, S, P, and a) are as follows:

S1 = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}, P1 = {1, 4}, a1 = 4 ,

S2 = S1 ∪ {(b, b) : b ∈ U},  P2 = ∅,  a2 = 1 ,

S3 = {(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 2)},  P3 = {2, 3, 4},  a3 = 1 .

Find, for each of the following sentences, which of the three models satisfy it. Justify briefly your answers (cf. Homework 9.1). You might find little drawings representing the model, or parts of it, useful, especially in the case of S3.

1. ∃v∀uS(u, v)

2. ∃ v ∀u (u ≉ v → S(v, u))

3. ∀u, v [S(v, u)→ P(v)]

4. ∀u (S(a, u)→ P(u))

5. ∀u, v [S(a, u)∧S(u, v)→ S(a, v)]

6. ∀u, v, w [S(u, v)∧S(v, w)→ S(u,w)]

7. ∀u [P(u)→ ∃vS(u, v)]

8. ∀u, v, w [P(u)∧S(u, v)∧S(u,w) → v ≈ w]

9.3 Write down (in the list notation) the sets defined by each of the following wffs in each of the models Mi, i = 1, 2, 3 of 9.2.

1. S(a, v)

2. S(v, a)

3. P(v) ∨ v ≈ a

4. P(v) ∧ v ≈ a

5. ∃uS(u, v)

6. ∀uS(v, u)

7. ∀u (P(u)→ S(v, u))

8. ∃u1, u2 [u1 ≉ u2 ∧ S(u1, v) ∧ S(v, u2)]

9. ∃u (¬P(u) ∧ u ≉ v ∧ S(u, v))

9.2 Logical Implications in FOL

9.2.0

The scheme that defines logical implication for sentential logic defines it also for FOL:

A set of sentences, given as a list Γ, logically implies the sentence α, if there is no possible interpretation in which all the members of Γ are true and α is false.

What characterizes implication in each case is the concept of a possible interpretation and theway interpretations determine truth-values. Rephrasing the definition in our present terms,we can say that, for a given first-order language, Γ logically implies α if there is no FOL


model that satisfies all members of Γ but does not satisfy α. Furthermore, a sentence α is logically true if it is satisfied in all models, and logically false if it is satisfied in none.

The concepts extend naturally to the case of wffs; we have to throw in assignments of objects to variables, since the truth-values depend also on such assignments:

• A premise-list Γ of wffs logically implies a wff α, if there is no model M (for the language in question) and no assignment g (of values to the variables occurring freely in the wffs of Γ and α) which satisfy all members of Γ, but do not satisfy α.

• A wff α is logically true if it is satisfied in all models under all assignments of values to its free variables. (Or, equivalently, if it is logically implied by the empty premise-list.)

• A wff α is logically false if it is not satisfied in any model under any assignment of values to its free variables.

The definitions for sentences are particular cases of the definitions for wffs.

Satisfiable Sets of Wffs

A set of wffs is satisfiable in the model M if there is an assignment of values (to the free variables of its wffs) which satisfies all wffs in the set. If the wffs are sentences, this simply means that all the sentences are true in M.

A set of wffs is satisfiable if it is satisfiable in some model.

A set of wffs which is not satisfiable is described as logically inconsistent. Note that this accords with the previous usage of that term in sentential logic (cf. 3.4.3).

Obviously, a wff is satisfiable just when it is not logically false.

As before, we use:

Γ |= α

to say that the premise-list Γ logically implies the wff α. (Recall the double usage of ‘|=’ !) If the premise-list is empty, this means that α is a logical truth:

|= α

We have used ‘⊥’ to denote some unspecified contradiction (cf. 4.4.0). We adopt this notation also for FOL.


Following the reasoning used for sentential logic (cf. 4.4.0), we see that

Γ |= ⊥

means that Γ is not satisfiable. And the same reasoning also implies:

Γ |= α ⇐⇒ Γ, ¬α |= ⊥

Logical Equivalence

From logical implication we can get logical equivalence. Using, as before, ‘≡’, we can define it by:

α ≡ β ⇐⇒ α |= β and β |= α

Obviously, α ≡ β iff α and β have the same truth-value in every model under every assignment of values to the variables that are free in α and in β.

From now on, unless indicated otherwise, ‘implication’ and ‘equivalence’ mean logical implication and logical equivalence.

9.2.1 Proving Non-Implications by Counterexamples

We use ‘Γ 6|= α’ to say that Γ does not imply α. It means that there is some model and some value-assignment to the variables, such that all members of Γ are satisfied but α is not. Such a model and assignment constitute a counterexample to the implication claim. Here are some non-implication claims that are proved by counterexamples.

(1) ∃xP(x) 6|= ∀xP(x)

Proof: Consider a model, M, whose universe contains at least two members, such that PM is neither empty nor the whole universe. Since PM 6= ∅, M |= ∃xP(x). Since PM 6= |M|, M 6|= ∀xP(x).

QED
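Counterexamples of this kind can also be found by brute force. The sketch below (an illustration only, not part of the text) enumerates all interpretations of P over a two-element universe and collects those in which ∃xP(x) is true but ∀xP(x) is false:

```python
from itertools import combinations

# Search a two-element universe for interpretations of P that make
# ∃xP(x) true and ∀xP(x) false — counterexamples to the implication.
U = [0, 1]
found = []
for r in range(len(U) + 1):
    for P in map(set, combinations(U, r)):
        if any(x in P for x in U) and not all(x in P for x in U):
            found.append(P)

print(found)   # the two singleton interpretations qualify
```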

(2) 6|= ∀y∃xR(x, y)→ ∃x∀yR(x, y)

Proof: Consider the following model:

M = (U,R), where U = {0, 1}, R = {(0, 0), (1, 1)}

Since M |= R(x, y)[x0 y0], we have: M |= ∃xR(x, y)[y0]. An analogous argument shows that M |= ∃xR(x, y)[y1]. Since 0 and 1 are the only members, M |= ∀y∃xR(x, y).


On the other hand, M 6|= R(x, y)[x0 y1]. Hence M 6|= ∀yR(x, y)[x0]. By a similar argument, M 6|= ∀yR(x, y)[x1]. Therefore M 6|= ∃x∀yR(x, y). Since the antecedent (of the sentence in (2)) is true in M, but the consequent is false, the sentence is false.

QED

Let us modify the sentence of (2) a bit:

∀x∃y (x 6≈ y ∧ R(x, y)) → ∃y∀x (x 6≈ y → R(x, y))

It is not difficult to see that that sentence is satisfied in the last counterexample, since the antecedent is false. Still, that sentence is not a logical truth. To prove this, consider the model M = (U,R) where:

U = {0, 1, 2}, R = {(0, 1), (1, 2), (2, 0)}

[Drawing: the three-element model; R forms the cycle 0 → 1 → 2 → 0.]

It is not difficult to see that the antecedent is true in this model: for every member a there is a different member b such that (a, b) ∈ R. But there is no member b such that (a, b) ∈ R for all a 6= b. Hence the consequent is false.
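The claims about this model can be verified mechanically; the following sketch (ours, not the book's) evaluates the antecedent and the consequent of the modified sentence directly:

```python
# The three-element cyclic model claimed to refute the modified sentence.
U = {0, 1, 2}
R = {(0, 1), (1, 2), (2, 0)}

# antecedent: ∀x∃y (x ≠ y ∧ R(x,y))
antecedent = all(any(x != y and (x, y) in R for y in U) for x in U)
# consequent: ∃y∀x (x ≠ y → R(x,y))
consequent = any(all(x == y or (x, y) in R for x in U) for y in U)

print(antecedent, consequent)   # → True False
```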

Homework 9.4 Prove the following negative claims by constructing small-size models (as small as you can) such that the premise is satisfied, but the conclusion is not.

You also have to assign values to the free variables occurring in the implication, if there are any. (Note that the same variable can have free and bound occurrences.)

1. ∀x∃yS(x, y) 6|= ∃xS(x, x)

2. ∃x, y (P(x) ∧ R(y)) 6|= ∃x (P(x) ∧ R(x))

3. ∀x (P(x) ∨ R(x)) 6|= ∀x, y (P(x) ∨ R(y))

4. (∃x (P(x)→ R(x))) ∧ P(x) 6|= R(x)

5. ∃x∀yS(x, y) 6|= ∃yS(x, y)

6. ∀xS(x, u) ∧ ∀yS(v, y) 6|= S(u, v)

7. ∀u, v {u 6≈ v → [(S(u, u) ∧ S(v, v)) ∨ (S(u, v) ∧ S(v, u))]} 6|= ∃u S(u, u)

8. ∀u∃v (v 6≈ u ∧ S(u, v)) 6|= u 6≈ v → (S(u, v) ∨ S(v, u))


9.2.2 Proving Implications by Direct Semantic Arguments

Sometimes it is easy to show that a logical implication holds by a direct semantic argument. Here are some examples.

(3) ∀xα |= ∃xα

Proof: Assume that M |= ∀xα[g]

By the truth-definition of universally quantified wffs this means that:

M |= α[gxa], for all a ∈ |M|.

By definition, a model’s universe is never empty. Take any a ∈ |M|. Since M |= α[gxa], the truth-definition for existential generalization implies:

M |= ∃xα[g]

QED

(3) is intuitively obvious. The next is less so, but not very difficult.

(4) |= ∃x∀yα→ ∀y∃xα

Proof: (4) is equivalent to:

∃x∀yα |= ∀y∃xα

To simplify the notation, we can ignore α’s free variables other than x and y; whatever they are, their values remain fixed throughout. We can therefore leave ‘g’ out. (If needed, the notation can be filled by tacking ‘g’ on, as in the proof of (3).)

Assume that M |= ∃x∀yα

Then some member of |M| satisfies ∀yα. Let a be such a member: M |= ∀yα[xa]. Then,

for every b ∈ |M|, M |= α[xayb]

But for each b, M |= α[xayb] implies M |= ∃xα[yb]. Therefore:

for every b ∈ |M|, M |= ∃xα[yb]

But this implies that:

M |= ∀y∃xα

QED


The negative claims (1) and (2) show that the implication in (3) and the conditional in (4) cannot, in general, be reversed.

(5) ∀x∀y α ≡ ∀y ∀xα

Proof: Consider any model M. Again we ignore any free variables different from x and y.

Assuming that M |= ∀x∀yα, we show that M |= ∀y∀xα. Our assumption means that:

for all a ∈ |M|: M |= ∀y α[xa],

which implies that:

for all a ∈ |M|: for all b ∈ |M|: M |= α[xayb].

Therefore, for any b ∈ |M|:

M |= α[xayb], for all a ∈ |M|.

Hence,

M |= ∀xα[yb], for all b ∈ |M|,

which implies:

M |= ∀y ∀xα

This shows that the left-hand side in (5) implies the right-hand side. Since the situation is symmetric, the reverse implication holds as well.

QED

Here is an example involving a free variable. Unlike the previous examples, it is not of general significance; its interest lies in the fact that the logical truth of the wff is far from clear at first glance.

(6) |= ∃y [R(x, y)→ ∃u∀vR(u, v)]

Proof: We have to show that, given any model M and any member a of |M|, we have:

M |= ∃y [R(x, y)→ ∃u∀vR(u, v)] [xa]

We have therefore to show the existence of b ∈ |M| such that:

(6′) M |= (R(x, y)→ ∃u∀vR(u, v)) [xa yb]

Now, if for some b ∈ |M|, M 6|= R(x, y)[xa yb], then for this b the conditional in (6′) is true under the given assignment, because the antecedent is false.


There remains the case in which there is no such b, that is:

for every b ∈ |M|, M |= R(x, y)[xayb]

Since M |= R(x, y)[xayb] iff M |= R(u, v)[uavb], this can be rephrased as:

for every b ∈ |M|, M |= R(u, v)[uavb]

Hence,

M |= ∀vR(u, v) [ua]

implying:

M |= ∃u∀vR(u, v)

In this case (6′) holds for all b ∈ |M|, because, independently of b, the consequent in the conditional gets T. QED

In a direct semantic argument we appeal directly to the truth-definition. In doing so we use the concepts every and there exists, as we understand them in our English discourse. As remarked before, the truth-definition makes explicit and precise concepts that are already understood. It does not create them out of nothing.

As the last illustration, we prove what is known as De Morgan’s law for universal quantification.

(7) ¬∀xα ≡ ∃x¬α

Proof: ¬∀xα is satisfied in a model M just when ∀xα is not satisfied there, i.e., when it is not the case that, for all values of x, α is satisfied. This is equivalent to saying that, for some value of x, α is not satisfied, i.e., ¬α is satisfied. Thus, ¬∀xα is satisfied iff there exists a value of x for which ¬α is satisfied, which is equivalent to the satisfaction of ∃x¬α. QED
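The law can also be confirmed, for a sample wff α = P(x), by exhausting all interpretations of P over a small universe. The sketch below (an illustration, not a substitute for the semantic proof) does so:

```python
from itertools import chain, combinations

# Check ¬∀xP(x) ≡ ∃x¬P(x) on every interpretation of P over a small universe.
U = [0, 1, 2]
powerset = chain.from_iterable(combinations(U, r) for r in range(len(U) + 1))

for P in map(set, powerset):
    lhs = not all(x in P for x in U)   # ¬∀x P(x)
    rhs = any(x not in P for x in U)   # ∃x ¬P(x)
    assert lhs == rhs

print("checked", 2 ** len(U), "interpretations")
```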

The technique that we used for sentential logic can be carried over, without change, to FOL:

All the general sentential-logic laws for establishing logical equivalences and implications are valid for FOL.

This means that you can use without restrictions the laws given in chapters 2 and 3, with sentential variables replaced everywhere by any FOL wffs.

The reason for this is that the sentential laws concern only sentential compounds and derive solely from the truth-tables of the connectives. They remain in force, no matter what other units we have.

The sentential laws are adequate for establishing tautological equivalences and tautological implications, but not equivalences and implications that depend on the meaning of the quantifiers. We therefore supplement them with appropriate quantifier laws. The following table, in 9.2.3, lists the basic equivalence laws that involve quantifiers.


9.2.3 Equivalence Laws and Simplifications in FOL

Commutativity of Quantifiers of the Same Kind

∀x∀y α ≡ ∀y ∀xα ∃x∃y α ≡ ∃y ∃xα

Distributivity of Quantifier Over Appropriate Connective

∀x(α ∧ β) ≡ ∀xα ∧ ∀xβ ∃x(α ∨ β) ≡ ∃xα ∨ ∃xβ

De Morgan’s Laws for Quantifiers

¬∀xα ≡ ∃x¬α ¬∃xα ≡ ∀x¬α

If x is not free in α:

∀xα ≡ α ∃xα ≡ α

∀x(α ∨ β) ≡ α ∨ ∀xβ ∃x(α ∧ β) ≡ α ∧ ∃xβ

Changing Bound Variables

α ≡ α′ if α′ results from α by legitimate bound-variables substitution

Two of these equivalences have been proved in the last subsection: (5) is the commutativity of universal quantifiers, and (7) is De Morgan’s law for universal quantifiers. The others are provable by similar semantic arguments. Here, for example, is the argument showing that if x is not free in α then:

∀x (α ∨ β) ≡ α ∨ ∀xβ

We have to show that, for every model M and for every assignment g of values to the free variables, the two sides have the same truth-values. Since x is not free in α, the two sides have the same free variables and x is not among them. The argument does not depend on values assigned by g to variables other than x; they appear as fixed parameters. Only the possible


values of x play a role, when we apply the truth definition to wffs of the form ∀x(. . . x . . .).

First assume that α is true (in M, under g). Then the right-hand side is true. Since x is not free in α, the truth-value of α does not depend on the value assigned to x. Hence, for all a ∈ |M|, valMα[gxa] = T. Therefore, for all a ∈ |M|, valM(α ∨ β)[gxa] = T. This implies that the left-hand side is true.

There remains the case that α is false. Then the truth-value of the right-hand side is the value of the second disjunct: ∀xβ. Suppose it is T. Then, for all a ∈ |M|: valMβ[gxa] = T, hence also valM(α ∨ β)[gxa] = T. Therefore the left-hand side is true as well.

If, on the other hand, ∀xβ gets F, then for some a ∈ |M|, valMβ[gxa] = F. Since x is not free in α, the assignment of a to x has no effect on α’s truth-value (which, by assumption, is F). Hence under this assignment α ∨ β gets F. Therefore the value of ∀x(α ∨ β) is F. QED
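The law just proved can likewise be spot-checked by machine. In the sketch below (an illustration; α is represented by a fixed truth-value, since x is not free in it, and β by a unary predicate) all cases over a two-element universe are exhausted:

```python
from itertools import product

# Check ∀x(α ∨ β) ≡ α ∨ ∀xβ, where x is not free in α, on a two-element
# universe: alpha ranges over truth-values, beta over unary predicates.
U = [0, 1]
cases = 0
for alpha in (True, False):
    for values in product((True, False), repeat=len(U)):
        beta = dict(zip(U, values))
        lhs = all(alpha or beta[x] for x in U)   # ∀x(α ∨ β)
        rhs = alpha or all(beta[x] for x in U)   # α ∨ ∀xβ
        assert lhs == rhs
        cases += 1

print("law holds in all", cases, "cases")
```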

In sentential logic we have a substitution law for logically equivalent sentences (cf. 2.2.1 and 3.1). This law is now generalized to FOL:

If α ≡ α′, α is a subformula of β, and β′ is obtained from β by substituting one or more occurrences of α by α′, then β ≡ β′.

Note that the notion of subformula is much wider than that of a sentential component. For example, γ is a subformula of ∀v (γ → δ), but not a sentential component of it. Applying the substitution law we get, for example:

If γ ≡ γ′ then ∀v (γ → δ) ≡ ∀v (γ′ → δ)

Intuitively, the substitutivity of equivalents is clear: If α ≡ α′, then in every model and under any assignment of values to the variables, α and α′ have the same truth-value. Therefore, substituting α by α′ in any wff β cannot affect β’s truth-value. A rigorous, but rather tedious, proof (which proceeds by induction on β) can be given; we leave it at that.

Example: Applying FOL equivalence laws, substitutivity of equivalents and the tools of sentential logic, we show that the following is logically true:

(a) ∀x(α→ β) → (∀xα→ ∀xβ)

The wff has the form γ1 → (γ2 → γ3). By sentential logic it is equivalent to:

(b) [∀x(α→ β) ∧ ∀xα]→ ∀xβ


Using the distributivity of ∀ over ∧ (in the right-to-left direction), we can replace the subformula:

∀x(α→ β) ∧ ∀xα

by the logically equivalent

∀x [(α→ β) ∧ α]

(b) is thereby transformed into the logically equivalent:

(c) [∀x ((α→ β) ∧ α)] → ∀xβ

By sentential logic:

(α→ β) ∧ α ≡ α ∧ β

Hence, we can substitute in (c) and get the logically equivalent:

(d) ∀x(α ∧ β) → ∀xβ

Applying again distributivity of ∀ over ∧ (in the left-to-right direction) we can substitute ∀x(α∧β) by the equivalent ∀xα∧∀xβ. All in all, the following is logically equivalent to (d):

(e) (∀xα ∧ ∀xβ) → ∀xβ

But (e) is obviously a tautology. Hence, (a), which is logically equivalent to it, is logically true.

Note that (a) is not a tautology; it is logically equivalent to (e), but the equivalence is not tautological.

Pushing Negation Inside: By applying De Morgan’s quantifier laws (in the left-to-right direction) we can push negation inside quantifiers. As we do so, we have to toggle ∀ and ∃. Combining this with the pushing-in technique of sentential logic (cf. 3.1.1) we can push negation inside all the way. In the end negation applies to atomic wffs only. Here is an example. Each wff is equivalent to the preceding one. Find for yourself how each step is achieved.

¬∀x [P(x, c)→ ∃yR(x, y)]

¬∀x [¬P(x, c) ∨ ∃yR(x, y)]

∃x {¬[¬P(x, c) ∨ ∃yR(x, y)]}

∃x {¬¬P(x, c) ∧ ¬∃yR(x, y)}

∃x [P(x, c) ∧ ¬∃yR(x, y)]


∃x [P(x, c) ∧ ∀y¬R(x, y)]
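The pushing-in procedure is easy to mechanize. The following sketch (our own encoding of wffs as nested tuples; it assumes → has already been eliminated, as in the first step above) implements the double-negation and De Morgan steps recursively:

```python
# Wffs as nested tuples: atoms like ('P','x','c'); compounds ('not', w),
# ('and', a, b), ('or', a, b), ('forall', v, w), ('exists', v, w).
def push_neg(w):
    op = w[0]
    if op == 'not':
        inner = w[1]
        if inner[0] == 'not':        # ¬¬α  ⇒  α
            return push_neg(inner[1])
        if inner[0] == 'and':        # sentential De Morgan
            return ('or', push_neg(('not', inner[1])), push_neg(('not', inner[2])))
        if inner[0] == 'or':
            return ('and', push_neg(('not', inner[1])), push_neg(('not', inner[2])))
        if inner[0] == 'forall':     # ¬∀xα  ⇒  ∃x¬α
            return ('exists', inner[1], push_neg(('not', inner[2])))
        if inner[0] == 'exists':     # ¬∃xα  ⇒  ∀x¬α
            return ('forall', inner[1], push_neg(('not', inner[2])))
        return w                     # negation of an atom: stop here
    if op in ('and', 'or'):
        return (op, push_neg(w[1]), push_neg(w[2]))
    if op in ('forall', 'exists'):
        return (op, w[1], push_neg(w[2]))
    return w                         # an atom

# The example above, after eliminating →: ¬∀x[¬P(x,c) ∨ ∃yR(x,y)]
wff = ('not', ('forall', 'x',
       ('or', ('not', ('P', 'x', 'c')), ('exists', 'y', ('R', 'x', 'y')))))
print(push_neg(wff))   # ∃x[P(x,c) ∧ ∀y¬R(x,y)]
```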

Note: It is not difficult to see that universal and existential quantifiers play roles parallel to those of conjunction and disjunction. If the model is finite and all its universe members have names in the language, we can express universal quantification as a conjunction, existential quantification as a disjunction: Say c1, . . . , cn are constants that denote all the universe members; then, in that particular interpretation, ∀xα(x) has the same truth-value as:

α(c1) ∧ . . . ∧ α(cn)

and ∃xα(x) the same truth-value as:

α(c1) ∨ . . . ∨ α(cn)

This observation can provide some insight into the logic of the quantifiers.
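For instance, with a hypothetical universe {1, 2, 3} whose members are all named, the correspondence can be checked directly (a sketch, not part of the text):

```python
# In a finite model whose members 1, 2, 3 are all named, ∀xα(x) and ∃xα(x)
# behave as a conjunction and a disjunction over the names.
U = [1, 2, 3]
P = {2, 3}                      # a sample interpretation of the predicate in α
alpha = lambda c: c in P

conj = alpha(1) and alpha(2) and alpha(3)   # α(c1) ∧ α(c2) ∧ α(c3)
disj = alpha(1) or alpha(2) or alpha(3)     # α(c1) ∨ α(c2) ∨ α(c3)

assert conj == all(alpha(c) for c in U)     # same value as ∀xα(x)
assert disj == any(alpha(c) for c in U)     # same value as ∃xα(x)
print(conj, disj)   # → False True
```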

Expressing the Quantifiers in Terms of Each Other: ∀ and ∃ are said to be duals of each other. The following two equivalences, which are easily derivable from De Morgan’s laws, show how each quantifier is expressible in terms of its dual.

∃xα ≡ ¬∀x¬α   ∀xα ≡ ¬∃x¬α

Having negation, we can base FOL on either the universal or the existential quantifier. The choice of including both is motivated by considerations of convenience and structural symmetry.

Homework 9.5 Derive the following equivalences, according to the indications. You can employ, in addition, the full apparatus of sentential logic and substitutions of equivalent subformulas.

1. De Morgan’s law for ∃ from De Morgan’s law for ∀.

2. The laws for expressing one quantifier in terms of another, from De Morgan’s laws for quantifiers.

3. The distributivity law of ∀ over ∧, by a direct semantic argument. (You may ignore, for the sake of simplicity, free variables other than x.)

4. The distributivity law of ∃ over ∨, from the dual law for ∀ and ∧ and the law for expressing ∃ in terms of ∀.

5. The law for ∃x [α ∧ β], where x is not free in α, by a direct semantic argument.

6. The law for ∀x [α ∨ β], where x is not free in α, using the dual law for ∃ and ∧ and expressing ∀ in terms of ∃.

7. ∀x(β → α) ≡ (∃xβ)→ α, where x is not free in α–using the laws in the framed box.


9.3 The Top-Down Derivation Method for FOL Implications

9.3.0

The apparatus developed so far may carry us a good way, but is not sufficient for establishing all logical implications in FOL. We can add to our stock further equivalences and logical truths. For example:

(1) |= ∀xα(x) → α(t), where α(t) is obtained from α(x) by a legitimate substitution of the term t for the free x.

We shall pursue a different strategy. We extend to FOL the top-down derivation method of sentential logic, given in chapter 4 (cf. 4.3 and 4.4). The result is an adequate system for establishing all first-order logical implications.

9.3.1 The Implication Laws for FOL

Notation: Consider a premise-list Γ. If Γ = α1, . . . , αn, define:

SvcΓ = Svcα1, . . . , Svcαn

In words: SvcΓ is obtained from Γ by substituting, in every wff of Γ, every free occurrence of v by the individual constant c. (Note that these substitutions are always legitimate, since c is not a variable.)

New Constants: An individual constant c is said to be new for the wff α, if it does not occur in α. It is new for Γ, where Γ is a list of wffs, if it is new for all the wffs in Γ.

We shall use the proofs-by-contradiction variant of the top-down derivation (given, for sentential logic, in 4.4.0 and 4.4.1), which leads to the most economical system.

First, we include all the laws for sentential logic, where the sentences are replaced by arbitrary wffs. To these we add the laws listed in the following table.


Substitution of Free Variables by New Constants

Γ |=⊥ ⇐⇒ SvcΓ |=⊥

where c is any individual constant new for Γ

Universal and Existential Quantification

(∀, |=) Γ, ∀xα |= ⊥ ⇐⇒ Γ, ∀xα, Sxcα |= ⊥

where c is any individual constant

(∃, |=) Γ, ∃xα |= ⊥ ⇐⇒ Γ, Sxcα |= ⊥

where c is any individual constant new for Γ and ∃xα

Negated Quantifications

(¬∀, |=) Γ, ¬∀xα |= ⊥ ⇐⇒ Γ, ¬Sxcα |= ⊥

where c is any individual constant new for Γ and ∀xα

(¬∃, |=) Γ, ¬∃xα |= ⊥ ⇐⇒ Γ, ¬∃xα, ¬Sxcα |= ⊥

where c is any individual constant

For Languages with Equality: If the language contains equality, add two equality laws, (EQ) and (ES) of 7.2.1. The wffs are, of course, any wffs of FOL.

For Languages with Function Symbols: The laws are the same, except that the laws that cover all individual constants are extended, so as to cover all constant terms (terms containing no variables). This means that in (∀, |=) and (¬∃, |=), we replace ‘c’ by ‘t’, where t is any constant term. The laws by which new individual constants are introduced remain the same.

If the language contains both equality and function symbols, then the equality laws are similarly extended. In (EQ) ‘c’ is replaced by ‘t’, and in (ES) c and c′ are replaced by ‘t’ and ‘t′’, respectively, where t and t′ are any constant terms.

The top-down method of proof for FOL is a direct extension of the sentential case. An implication of the form Γ |= α is reduced to the equivalent implication Γ, ¬α |=⊥. Then, applying the implication laws in the right-to-left direction one keeps reducing goals to other


goals until all goals are reduced to a bunch of self-evident implications: those whose premises contain a wff and its negation. The method is adequate: Every logical implication of FOL can be established by it.
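The notion of a self-evident goal is simple to express in code. The sketch below (using a hypothetical tuple encoding of wffs, as an illustration only) tests whether a premise-list contains a wff together with its negation:

```python
# A goal Γ |= ⊥ counts as self-evident when Γ contains a wff together with
# its negation; wffs are encoded as nested tuples, negation as ('not', w).
def self_evident(premises):
    prem = set(premises)
    return any(('not', w) in prem for w in prem)

goal = [('forall', 'x', ('P', 'x')), ('P', 'c'), ('not', ('P', 'c'))]
print(self_evident(goal))       # → True
print(self_evident(goal[:2]))   # → False
```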

The Validity of the FOL Laws

Here ‘validity’ is not the technical term used for implication schemes in sentential logic. When we say that a law is valid we mean that it is true as a general law.

The validity of (∀, |=) is the easiest. It follows from the fact that the two premise lists:

Γ, ∀xα   Γ, ∀xα, Sxcα

are equivalent. The second premise-list is obtained from the first by adding Sxcα, which is a logical consequence of it:

(3) ∀xα |= Sxcα

((3) is a special case of (1) in 9.3.0, where t is the term c.) A formal proof of (3) is obtainable from:

Lemma 1: Let M be a model such that cM = a. Then, for any wff α:

M |= α [gxa] ⇐⇒ M |= Sxcα [g]

Intuitively, lemma 1 is obvious: α says about the value of x (under gxa) what Sxcα says about the denotation of c; if the value and that denotation are the same, replacing the free occurrences of x by c should make no difference.

Formally, the proof proceeds by induction on α, starting with atomic wffs and working up to more complex ones.1 We shall not go into it here. Lemma 1 implies (3): Assume that M |= ∀xα; then, for all a ∈ |M|, M |= α[xa] (here we ignore other free variables of α; their values play no role in the argument). Choosing a as cM and applying lemma 1, we get: M |= Sxcα.

The laws for negated quantifications are easily obtainable from the quantification laws by pushing negation inside. The remaining ones are the substitution-by-new-constant law and (∃, |=). Their proofs rely on:

Lemma 2: Consider a first-order language, which is interpreted in M. Let α be a wff in that language and let c be an individual constant new for α. Let a ∈ |M| and let M′

1All the steps are straightforward. In the passage from a wff, β, to its generalization ∀vβ, or ∃vβ, we can assume that the quantified variable is different from x; else, x is not free in the resulting wff, hence α = Sxcα and the claim is trivial.


be the model in which c is interpreted as a and which is exactly like M otherwise. Then:

M |= α[gxa] ⇐⇒ M′ |= Sxcα[g]

Note: c may or may not belong to the language of M. If it does, it has an interpretation in M, and M′ is obtained by changing, if necessary, that interpretation. If c does not belong to that language, M′ is a model for the enriched language obtained by adding c.

Proof of Lemma 2: The models M and M′ are the same in all respects, except possibly the interpretation of c. Since c does not occur in α, its interpretation has no effect on α’s truth-value. Therefore α has the same truth-value (under gxa) in M′ as in M:

M |= α[gxa] ⇐⇒ M′ |= α[gxa]

Applying lemma 1 to the model M′ we also have:

M′ |= α[gxa] ⇐⇒ M′ |= Sxcα[g]

The claim of lemma 2 results by combining these two equivalences.

QED

The proof of the substitution-by-new-constant law can now proceed as follows. We have to show that if c is new for Γ, then:

Γ is not satisfiable ⇐⇒ SvcΓ is not satisfiable

which is equivalent to:

Γ is satisfiable ⇐⇒ SvcΓ is satisfiable

We can assume that x is free in some wff of Γ; otherwise SxcΓ = Γ and there is nothing to prove.

Assume first that Γ is satisfiable. Then there is a model M and an assignment g (to the variables occurring freely in Γ) that satisfy all wffs of Γ. Say g(x) = a. Let M′ be the model in which c is interpreted as a and which is like M in all other respects. Lemma 2 (the ⇒ direction of ⇔) implies that, for every α ∈ Γ:

M′ |= Sxcα[g]

Hence, SxcΓ is satisfiable.

Vice versa, assume that SxcΓ is satisfiable. Then there is a model M′ and an assignment g that satisfy all wffs of SxcΓ. The model M′ provides a denotation, cM′, for c. Let a = cM′.


If c is in the original language, let M = M′. Otherwise, let M be the model for the original language that is the same as M′, except that M leaves c uninterpreted (i.e., δM is not defined for c; cf. 9.1.0). Lemma 2 (this time, via the ⇐ direction) implies that M |= α[gxa], for all α in Γ. Hence Γ is satisfiable.

Finally consider (∃, |=). The easier direction is from left to right. We have to show that if Γ, ∃xα |=⊥, then also Γ, Sxcα |=⊥. This follows immediately from:

(4) Sxtα |= ∃xα

(4) implies that the premises of the left-hand side list are implied by those of the right-hand side, hence if the former are not satisfiable, neither are the latter. We can derive (4) from lemma 1, much in the same way as we derived (3). (Alternatively, it can be derived from (3), by substituting in (3) ¬α for α, negating and switching the two sides of ‘|=’, pushing negation inside and dropping double negations.)

The less obvious direction of (∃, |=) is ⇐:

If Γ, Sxcα is not satisfiable then Γ,∃xα is not satisfiable,

which is equivalent to:

If Γ, ∃xα is satisfiable then Γ, Sxcα is satisfiable.

So assume that Γ, ∃xα is satisfied in some model M by the assignment g. According to the truth-definition for existential generalizations, there is a ∈ |M| such that:

M |= α[gxa]

Let M′ be the model in which c is interpreted as this a and all the rest is exactly as in M. Since c is new for Γ, its interpretation has no effect on the satisfaction of Γ’s members; hence every wff of Γ is satisfied in M′ by the assignment g. Since c is new for α, lemma 2 implies:

M′ |= Sxcα[g]

Therefore all wffs in Γ, Sxcα are satisfied in M′ by g.

QED

Note: The negated-quantification laws are not needed if we allow pushing in negation, because in that case they can be derived from the others. The system however does not include laws for pushing in negations.


Instantiations: The passage from a generalization, ∀xα, or ∃xα, to a wff of the form Sxcα is described as the instantiation of the quantified variable to the constant c.

9.3.2 Examples of Top-Down Derivations

In the following, a wff is underlined if some law has been applied to it, in order to get to the goal’s child (or children). The substitution-by-new-constant law is applied to all wffs, hence in that case none is underlined. Also, final goals do not contain underlined wffs. The applied law is indicated, in the first three examples, in the margin of the goal’s child (or children).

(I) α |= ∀xα, where x is not free in α. In the following derivation c is a new individual constant.

1. α |= ∀xα
2. α, ¬∀xα |=⊥   contradictory-conclusion law
3. α, ¬Sxcα |=⊥ √   (¬∀, |=)

Explanation: Since x is not free in α, Sxcα = α. Hence, the premises in 3. contain a wff and its negation.

(II) |= ∀x(α→ β)→ [∀xα→ ∀xβ]. In the following, c is a new individual constant.

1. |= ∀x(α→ β)→ [∀xα→ ∀xβ]
2. ¬(∀x(α→ β)→ [∀xα→ ∀xβ]) |=⊥   contradictory-conclusion law
3. ∀x(α→ β), ¬[∀xα→ ∀xβ] |=⊥   (¬→, |=)
4. ∀x(α→ β), ∀xα, ¬∀xβ |=⊥   (¬→, |=)
5. ∀x(α→ β), ∀xα, ¬Sxcβ |=⊥   (¬∀, |=)
6. ∀x(α→ β), Sxc(α→ β), ∀xα, ¬Sxcβ |=⊥   (∀, |=)
7. ∀x(α→ β), Sxc(α→ β), ∀xα, Sxcα, ¬Sxcβ |=⊥   (∀, |=)

Obviously, Sxc(α→ β) = Sxcα→ Sxcβ, hence, reordering the premises, we have:

7′. ∀x(α→ β), ∀xα, Sxcα→ Sxcβ, Sxcα, ¬Sxcβ |=⊥
8.1 ∀x(α→ β), ∀xα, ¬Sxcα, Sxcα, ¬Sxcβ |=⊥ √   (→, |=)


8.2 ∀x(α→ β), ∀xα, Sxcβ, Sxcα, ¬Sxcβ |=⊥ √

Instead of applying in the last step the branching law, we could have replaced Sxcα→ Sxcβ, Sxcα by the tautologically equivalent list Sxcα, Sxcβ.

[Tree diagram of the derivation in (II): goals 1.–7′. form a single branch, which splits into the two final goals 8.1 and 8.2, each ending in ⊥.]

(III) ∀xα |= Sxvα, where the substitution of v for free x in α is legitimate. In the following c is a new individual constant.

1. ∀xα |= Sxvα
2. ∀xα, ¬Sxvα |=⊥   contradictory-conclusion law
3. Svc(∀xα), Svc(¬Sxvα) |=⊥   substitution-by-new-constant
4. Svc(∀xα), Sx,vcα, Svc(¬Sxvα) |=⊥ √   (∀, |=)

Explanation: Svc(∀xα) is the universal generalization in which all free occurrences of v have been substituted by c. Applying (∀, |=) to it, we drop ‘∀x’ and substitute each free x by c. The result is Sx,vcα; it is obtained from α by substituting all free occurrences of v and of x by c. The last premise is the negation of this. To see this, note first that Svc(¬Sxvα) = ¬SvcSxvα. Next, since the substitution of v for free x is, by assumption, legitimate, every free occurrence of x becomes in Sxvα a free occurrence of v. Applying the substitution Svc to Sxvα amounts to substituting in α all free occurrences of v and of x by c. The outcome is therefore ¬Sx,vcα. Hence the list contains a wff and its negation.

In the last two examples the less accurate but more suggestive variable-displaying notation is used. The new individual constants are obvious from the context. We allow pushing negation inside. The marginal indications are left mostly for the reader.

(IV) ∃x∀yα(x, y) |= ∀y∃xα(x, y)

1. ∃x∀yα(x, y) |= ∀y∃xα(x, y)
2. ∃x∀yα(x, y), ¬∀y∃xα(x, y) |=⊥
3. ∃x∀yα(x, y), ∃y∀x¬α(x, y) |=⊥   pushing-in negation
4. ∀yα(c1, y), ∃y∀x¬α(x, y) |=⊥
5. ∀yα(c1, y), ∀x¬α(x, c2) |=⊥
6. ∀yα(c1, y), α(c1, c2), ∀x¬α(x, c2) |=⊥
7. ∀yα(c1, y), α(c1, c2), ∀x¬α(x, c2), ¬α(c1, c2) |=⊥ √

Note: To get 5. we have to introduce a second new constant. We cannot use c1, because it occurs in 4.

This example illustrates a general feature of the technique. Existential quantifiers can be eliminated via instantiations to new constants; universal quantifiers can be used to add instantiations to any chosen constant.

(V) ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y) |= ∃x∀yR(x, y)→ ∀x∃yR(x, y)

1. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y) |= ∃x∀yR(x, y)→ ∀x∃yR(x, y)
2. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ¬[∃x∀yR(x, y)→ ∀x∃yR(x, y)] |=⊥
3. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∃x∀yR(x, y), ¬[∀x∃yR(x, y)] |=⊥
4. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∃x∀yR(x, y), ∃x∀y¬R(x, y) |=⊥
5. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∀yR(c1, y), ∃x∀y¬R(x, y) |=⊥
6. ∀x∃yR(x, y) ∨ ∀x∃y¬R(x, y), ∀yR(c1, y), ∀y¬R(c2, y) |=⊥
7.1 ∀x∃yR(x, y), ∀yR(c1, y), ∀y¬R(c2, y) |=⊥
7.2 ∀x∃y¬R(x, y), ∀yR(c1, y), ∀y¬R(c2, y) |=⊥
8.1 ∀x∃yR(x, y), ∃yR(c2, y), ∀yR(c1, y), ∀y¬R(c2, y) |=⊥
9.1 ∀x∃yR(x, y), R(c2, c3), ∀yR(c1, y), ∀y¬R(c2, y) |=⊥
10.1 ∀x∃yR(x, y), R(c2, c3), ∀yR(c1, y), ∀y¬R(c2, y), ¬R(c2, c3) |=⊥ √


8.2 ∀x∃y¬R(x, y), ∃y¬R(c1, y), ∀yR(c1, y), ∀y¬R(c2, y) |=⊥
9.2 ∀x∃y¬R(x, y), ¬R(c1, c3), ∀yR(c1, y), ∀y¬R(c2, y) |=⊥
10.2 ∀x∃y¬R(x, y), ¬R(c1, c3), ∀yR(c1, y), R(c1, c3), ∀y¬R(c2, y) |=⊥ √

As you can see, the elimination of an existential quantifier introduces a new constant, over which we have no control. Universal quantifiers are not eliminated, but can be used to add instantiations, to any constants, of the quantified variables. The moves in this game of eliminating and instantiating are chosen so as to produce in the end a contradictory premise-list.

9.3.3 The Adequacy of the Method: Completeness

There is a uniform “automatic” way of applying the laws, which is guaranteed to produce, in a finite number of steps, a proof, provided that the initial implication is indeed a logical implication. This claim is proved by showing the following: If at no finite stage do we get a proof (i.e., a reduction to a set of self-evident goals), then we end with a tree containing an infinite branch. From this branch one can construct a (possibly infinite) model that satisfies all the original premises, which shows that we did not have a contradictory premise list. Consequently, if the initial goal is a logical implication, there is a top-down derivation that ends in a finite number of self-evident goals. Turning the derivation tree upside down, we get a bottom-up proof. The proof-system referred to below is the system consisting of the self-evident implications (taken as axioms) and the ⇐-direction of the basic implication laws.

The Completeness of the Proof System: If Γ |= α is a logical implication of FOL, then there is a proof of it, obtainable via the top-down derivation method. The soundness of the proof system, i.e., the fact that every proved implication is logical, follows from the validity of the laws, whose proof was given in 9.3.1.

The above completeness claim amounts, essentially, to what is known as the completeness theorem for first-order logic. We can convert our system into a deductive system (cf. 6.2), following the same lines we followed in the sentential calculus. The deductive FOL system can be either of Hilbert's type or of Gentzen's type. Each includes the corresponding sentential-logic version (for the Hilbert-type system see 6.2.3, for the Gentzen type, 6.2.4). The systems are sound and complete for first-order logic. The completeness for the Hilbert-type system is stated in the same form as in the sentential calculus. Now, however, Γ is a list of wffs in FOL, α is a wff, and ‘⊢’ denotes provability in FOL:

Γ |= α =⇒ Γ ⊢ α

The completeness result for first-order logic, with respect to a particular Hilbert-type system, was first proved by Gödel. (His, and other, proofs do not employ top-down derivations. The method employed in this book derives from the ideas underlying Gentzen's calculus.)


354 CHAPTER 9. FOL: MODELS, TRUTH AND LOGICAL IMPLICATION

The essential difference between sentential logic and FOL is that, in the latter, we are not guaranteed to get a counterexample in a finite number of steps if the initial goal is not valid. The required counterexample may take an infinite number of steps. In general, we cannot know for sure, at any finite stage, whether there is a proof around the corner. This restriction on what we can know can be given precise mathematical form; it becomes what is known as the undecidability of first-order logic. Roughly speaking, it says that there is no algorithm (or computer program) which, given any first-order sentence, decides in a finite number of steps whether the sentence is logically valid or not. That theorem was first proved by Church.
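The contrast can be made concrete by a small programming sketch for the sentential fragment, where the analogous refutation procedure is guaranteed to terminate: every goal is either reduced to self-evident goals or fully decomposed, yielding a counterexample. The encoding of sentences as tuples and the function names (`refutes`, `implies`) are our own illustrative choices, not notation from this book.

```python
# A minimal top-down refutation prover for sentential logic (a sketch).
# Sentences are tuples: ('var', name) | ('not', s) | ('and', s, t) | ('or', s, t)

def refutes(premises):
    """Return True iff the premise list is contradictory (premises |= ⊥)."""
    # Self-evident goal: some sentence and its negation both appear.
    for f in premises:
        if ('not', f) in premises:
            return True
    # Otherwise, decompose the first non-literal premise.
    for i, f in enumerate(premises):
        rest = premises[:i] + premises[i + 1:]
        if f[0] == 'and':                     # α∧β: keep both conjuncts
            return refutes(rest + [f[1], f[2]])
        if f[0] == 'or':                      # α∨β: both branches must close
            return refutes(rest + [f[1]]) and refutes(rest + [f[2]])
        if f[0] == 'not':
            g = f[1]
            if g[0] == 'not':                 # ¬¬α reduces to α
                return refutes(rest + [g[1]])
            if g[0] == 'and':                 # ¬(α∧β): branch on ¬α, ¬β
                return (refutes(rest + [('not', g[1])]) and
                        refutes(rest + [('not', g[2])]))
            if g[0] == 'or':                  # ¬(α∨β): keep ¬α and ¬β
                return refutes(rest + [('not', g[1]), ('not', g[2])])
    return False  # all literals, no clash: a counterexample exists

def implies(premises, conclusion):
    """Γ |= α iff Γ, ¬α |= ⊥ (refutation of the negated conclusion)."""
    return refutes(list(premises) + [('not', conclusion)])
```

For example, `implies([('or', ('var', 'p'), ('var', 'q')), ('not', ('var', 'p'))], ('var', 'q'))` returns `True`. In FOL no such guarantee of termination is available, for the reasons just described: universal premises can be instantiated at ever more constants.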

Homework 9.6 Prove the following. You can use top-down derivations, as well as substitution of equivalents and the equivalence laws of 9.3.1, besides the full sentential apparatus.

1. ∀x [S(x, a) → S(x, b)] ∧ ¬S(b, b) |= ¬S(b, a)

2. ∀vS(v, v) |= ∃yS(x, y)

3. |= ∀x [¬P(x) → ∃y (P(y) → R(y))]

4. |= ∃x∀yS(x, y) ∨ ∀x∃y¬S(x, y)

5. ∀xP(x) ∨ ¬∀xR(x) |= ∃x [R(x) → P(x)]

6. ∀x, y [S(x, y) → S(x, x)] |= ¬S(x, x) → ¬S(x, y)

7. ∀x (α ∨ β) |= ∀xα ∨ ∃xβ

8. |= ∃x [(∃uP(u) → ∃vR(v)) → (P(x) → R(x))]

9. ∃xα → ∀xβ |= ∀x (α → β)

10. ∃x∀y [S(x, y) ↔ S(x, x)] |= ∃x [∀yS(x, y) ∨ ∀y¬S(x, y)]