

REASONING IN TREES

Herman Ruge Jervell

University of Oslo, Inst. of Informatics, Box 1080 Blindern, 0316 Oslo 3, Norway

Dedicated to Kurt Gödel 1906-1978.

1. THE TURING SWAMP

Most theories of computation start with the following:

a set of states S;

the initial states I - a subset of S;

the terminal states T - a subset of S;

a set of moves M - mapping S into S.

This is, so to say, the kernel. There may be various superstructures built on top of it. But in most cases we would define a calculation - or an execution - as a sequence of moves starting with an initial state and going on until a terminal state is reached. If we do not know more about the calculation than this, we are in the middle of the Turing swamp. There is very little we can say about such calculations. We have no way of analyzing the quantifier combinations involved in "going on until". We need to know more about the calculation to do that.
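To fix notation, here is a minimal sketch of this kernel in Python; all names are mine, and for simplicity the set of moves is collapsed into a single deterministic move function:

```python
from typing import Callable, TypeVar

State = TypeVar("State")

def execute(initial: State,
            is_terminal: Callable[[State], bool],
            move: Callable[[State], State]) -> State:
    """The unanalyzed 'going on until': apply moves from an initial
    state until a terminal state is reached.  Nothing in the kernel
    tells us whether this loop ever terminates."""
    state = initial
    while not is_terminal(state):
        state = move(state)
    return state

# Example: states are natural numbers, 0 is terminal, the move is halving.
print(execute(40, lambda s: s == 0, lambda s: s // 2))   # -> 0
```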

Part of the problem with the Turing swamp is that it is so easy to get stuck in it. We may make a nice theory of computation using the unanalyzed "going on until". The theory is of very little help when we come to concrete calculations.

2. GÖDEL'S ABSTRACT IDEAS

The way out of the Turing swamp is to use abstract ideas. In his Dialectica paper Gödel talked about them [8]. The point is that we usually know more about the calculations than comes out in the usual theories of computations. We may know:

a proof that the calculation terminates;

some property that is invariant under the local moves in the calculation;


some interpretation of the calculation;

that something is finite;

that something has a model.

The abstract properties involve quantifier combinations like "going on until". To give an example: in general we cannot decide whether something is finite or not. There is a quantifier combination in "to be finite". But it can be used as an assumption in arguments - and we may go from the abstract idea "to be finite" to the abstract idea "going on until".

In all formal arguments there are assumptions involving abstract ideas. (For example, the idea of a finite string.) The goal in logic is, of course, not to eliminate these abstract ideas. We cannot do without them. But the goal is to drag them out into daylight and make them both visible and useful.

Here I will show that some useful abstract ideas can be formulated in terms of trees. They will be useful for both reasoning in trees and reasoning about trees. And, of course, we know that trees come up all the time in logic and in computer science.

3. PREDICATE LOGIC

Let us start with some of the reasoning in trees implicit in predicate logic. Say we have a Gentzen-style sequential calculus [4]. A common way of proving the completeness theorem for classical predicate calculus goes like this:

start with a formula F;

with F at the bottom node try to systematically construct a tree T(F) with formulae at the nodes;

T(F) has three types of nodes - terminal nodes, nodes with no branching and nodes with binary branching;

if T(F) is well-founded, then we get a derivation in sequential calculus of F;

if there is an infinite branch in T(F), we can from that branch construct a term-model M falsifying F;

if we have a model N falsifying F, we can from the model N construct an infinite branch in T(F).

The construction is of course standard - but there are a few things which are not so often noted. First some obvious facts about the construction:

it is decidable whether some node is terminal or not;

for a non-terminal node there is at most binary branching;

for a non-terminal node it is decidable what kind of branching we have;

the term-model M is not assumed to be total;

the falsifying model N is not assumed to be total.
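These facts are concrete enough to code. The following is a minimal sketch of the systematic construction, restricted to the propositional fragment so that the tree T(F) is finite; the formula encoding and all names are mine, not from the paper. Axiom nodes are the terminal nodes; the "or" rules and pushed negations give nodes with no branching; "and" (and negated "or") gives binary branching. If every branch closes we have a derivation, and an open saturated leaf yields a partial falsifying valuation:

```python
# Sequents are tuples of formulas read disjunctively; formulas are
# tagged tuples: ("atom", p), ("neg", f), ("and", a, b), ("or", a, b).

def is_literal(f):
    return f[0] == "atom" or (f[0] == "neg" and f[1][0] == "atom")

def derive(seq):
    lits = [f for f in seq if is_literal(f)]
    if any(("neg", l) in lits for l in lits):    # terminal node: an axiom
        return ("proof", seq)
    rest = [f for f in seq if not is_literal(f)]
    if not rest:                                 # open saturated leaf:
        val = {f[1]: False for f in lits if f[0] == "atom"}
        val.update({f[1][1]: True for f in lits if f[0] == "neg"})
        return ("countermodel", val)             # a partial falsifying model

    def branching(s1, s2):                       # node with binary branching
        r = derive(s1)
        if r[0] == "countermodel":
            return r
        r = derive(s2)
        return r if r[0] == "countermodel" else ("proof", seq)

    f = rest[0]
    others = tuple(g for g in seq if g != f)
    if f[0] == "or":                             # node with no branching
        return derive(others + (f[1], f[2]))
    if f[0] == "and":
        return branching(others + (f[1],), others + (f[2],))
    g = f[1]                                     # f == ("neg", g), g compound
    if g[0] == "neg":
        return derive(others + (g[1],))
    if g[0] == "and":
        return derive(others + (("neg", g[1]), ("neg", g[2])))
    return branching(others + (("neg", g[1]),), others + (("neg", g[2]),))

p, q = ("atom", "p"), ("atom", "q")
print(derive((("or", ("neg", p), p),))[0])   # 'proof': p -> p is valid
print(derive((("and", p, q),))[1])           # {'p': False}: a countermodel
```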

The construction can readily be used as a base for mechanizing predicate logic. In the way the logicians usually give the construction there is a lot of wasteful copying - but that can to a large extent be eliminated. One should also have a strategy for which formulae in the sequents should be analyzed first. More important, the construction gives - to my mind - the best way of formulating the two major problems in mechanizing predicate logic:

1. Herbrand universe: which terms in the Herbrand universe are superfluous in the construction? Or, if one likes, which terms do we need to analyze existential quantifiers with respect to?

2. Using cuts: which cuts should be allowed?

The first problem has been taken up by Hao Wang [15]. Let us look a little at the second. It is not hard to give examples of arguments where cuts are necessary to make them short. One could argue that the first essential use of cut was by Archimedes in his short note on the Sandreckoner [1]. In less than a page he shows how to talk about numbers which are larger than the number of sand-grains in the universe. If he had done that in a cut-free way he could very well have ended up with a description which took more room than the sand-grains he was counting. From the work of Gentzen, it is clear that a derivation of length d with cut-formulae of complexity < n can be transformed into a cut-free derivation of length less than:

$$\underbrace{2^{2^{\cdot^{\cdot^{\cdot^{2^{d}}}}}}}_{n\ \text{twos}}$$

and that this estimate cannot be essentially reduced.
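To get a feeling for how fast this bound grows, here is a small sketch (the function name is mine):

```python
def cut_free_bound(d, n):
    """A stack of n twos topped by d: the iterated exponential bound
    on the length of the cut-free derivation."""
    result = d
    for _ in range(n):
        result = 2 ** result
    return result

print(cut_free_bound(10, 0))   # 10: no cuts to eliminate
print(cut_free_bound(10, 1))   # 1024
print(cut_free_bound(10, 2))   # 2**1024, already astronomical
```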

The resolution method of Prawitz [12] and Robinson [13] involves the cut-rule, but only such cuts as can be derived from a unification procedure. This unification gives an answer to which cuts should be allowed, but one can give examples which show that it is too restrictive. There are arguments proposing that one should have used more cuts than those given by resolution.

The importance of this comes from the following observations which are in Gentzen's work [4]:

any formal (or half-formal) argument in logic can be transformed step for step into a formal derivation in the system of natural deduction;

there is no loss of efficiency in the transformation;

the derivation in natural deduction can be transformed into a derivation in sequential calculus using cuts, and with no loss of efficiency.

The outcome of this is that we do not lose efficiency as long as we use cuts liberally. The use of cuts can be eliminated with an enormous loss in efficiency. On the other hand, the liberal use of cuts is a hindrance to mechanizing logic. With liberal use of cuts we have a tendency to end up with a British Museum variant of proof procedure - search through all possible proofs until you find one which does the job.

4. LOGIC PROGRAMMING

Logic Programming has become a catch-word and with the advent of Prolog, it may seem to be not far away from being realized. In Clocksin-Mellish's introduction to Prolog [2] we read:

"In the last few sections we have seen how Prolog is based on the idea of a theorem prover. As a result of this, we can see that our pro­grams are rather like our hypothesis about the world, and our questions are rather like theorems that we would like to have proved. So programming in Prolog is not so much like telling the computer what to do and when, but rather like telling it what is true and asking it to try and draw con­clusions."

The authors are of course aware that this way of thinking is not realized in Prolog. After the first few programs you soon become aware that you have to take into account a number of side effects. The execution depends heavily on the way you give "the hypothesis about the world" and you have to know a lot about the backtracking mechanism to make programs of some complexity. But let us now leave these objections aside and look at "logic programming" from a more theoretical point of view. We get a system of "logic programming" as soon as we have a mechanical theorem prover for predicate logic (and also a way of getting values out of proved existential formulae).

Suppose we are given some hypothesis H about the world and some question F that we want to ask, formulated as formulae in predicate logic. The problem is whether H logically implies F. Starting with H and F the theorem prover constructs a tree over the formulae as before. We call the tree T(H,F). There are two ways of reasoning in this tree:

we may concentrate on what is true in the world given the hypothesis H and ask whether F also has to be true;

we may concentrate on the possible derivations of F from the hypothesis H.

Gödel's completeness theorem tells us that these things are equivalent. But that is not the whole story. If we look at these ways of reasoning from the point of view of abstract ideas, a lot more should be said.

The models used in "logic programming" tend to be partial. In mathematics it is a major step to go from a prime ideal to a maximal ideal. We need some extra argument to do it. In the same way it is a major step to go from a partial model to a total model. It is no accident that our theorem prover - using the construction of T(H,F) - gives partial models when the construction fails to give a derivation of F from H. A theorem prover which gives total models under failure cannot be as efficient. The theorem prover we get from the Henkin proof of the completeness theorem is no better than the British Museum variant - first enumerate all proofs and then test them one by one until we find one that fits.

In the formulations of "logic programming" we assume that there is a vast difference between it and ordinary imperative programming. This is of course not so. The following argument comes from Turing's major paper [14]. Consider an execution according to a program P. Turing showed that, using a simple description of the possible states the machine can be in and the moves given by the program, we can construct in a straightforward way a formula φ(P) such that the following are equivalent:

the execution according to P terminates;

φ(P) is derivable in predicate logic.
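To show the flavor of this translation, here is a sketch for a finite-state program, where the construction becomes purely propositional; all names are mine, and Turing's actual construction uses full predicate logic to describe the unbounded tape. Each move s → s' becomes a Horn clause, which also hints at the side remark on deterministic calculations and Horn-formulae made below:

```python
def turing_formula(initial, moves, terminals):
    """Build (as a string) a formula that is valid exactly when the
    execution from 'initial' reaches a terminal state.  Hypotheses:
    the initial state is reached, and reachability is closed under
    the moves; conclusion: some terminal state is reached."""
    hyps = ["R_%s" % initial]
    hyps += ["(R_%s -> R_%s)" % (s, t) for s, t in moves]   # Horn clauses
    concl = " | ".join("R_%s" % t for t in terminals)
    return "(%s) -> (%s)" % (" & ".join(hyps), concl)

# A two-step 'program': 0 -> 1 -> 2, with 2 terminal.
print(turing_formula(0, [(0, 1), (1, 2)], [2]))
# (R_0 & (R_0 -> R_1) & (R_1 -> R_2)) -> (R_2)
```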

Turing used the equivalence to show that the Entscheidungsproblem is equivalent to the Halting problem. We can use it to show that there is not such a vast difference between imperative programming and logic programming. (We would then not only be interested in termination of the programs but also in the result after termination. Turing's argument is readily extended to give formulae expressing such things.)

In Turing's argument there is also information about efficiency. The execution of P can be translated step for step into a derivation of φ(P), and the other way around (being a little liberal with respect to the cuts allowed in the derivation). So we do not lose any efficiency in translating the one into the other. (As a side remark, there are connections between deterministic calculations and Horn-formulae.)

The remarks here are formulated in terms of "imperative programming". The same remarks can of course be made for "functional programming".

Do we gain anything at all by logic programming? We have shown that it is straightforward to translate an imperative program P into a formula φ(P) such that derivations of the formula do the same job as executions of the program. But this is only translating it into the syntactic part of the completeness theorem. "Logic programming" is concerned with the semantic part. What is involved in the equivalence in the completeness theorem? Let us see which parts are straightforward:

we can in an elementary recursive way construct the tree T(F) given the formula F;

given a falsifying model N of F we can in a straightforward way construct an infinite branch in T(F).

But the problems come with the following:

from the non-existence of a derivation there is no straightforward way to get an infinite branch in the tree.

The argument for this uses:

Theorem 1

Weak König's lemma. Given a tree T with the following properties:

there is an elementary recursive construction of the nodes of the tree;

for each node we can decide whether it is a terminal node or not;

for each non-terminal node there is an upper bound on the number of successor nodes;

for each non-terminal node we know the exact number of successor nodes;

there are infinitely many nodes.

We can thus conclude that there is an infinite branch.

One of the assumptions is more abstract than the others. The first four refer to algorithms for constructing or proving. The last - "there are infinitely many nodes" - can be interpreted in many ways. If we understand the quantifier combination there in a particular way, this way can then also be used to give an understanding of the quantifier combination in the conclusion "there is an infinite branch".
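The transfer can be made concrete. In the sketch below (all names are mine), the abstract assumption is supplied as a decision procedure has_infinite_subtree; given that, an infinite branch is constructed node by node, since at a finitely-branching node rooting an infinite subtree some successor must again root an infinite subtree:

```python
from itertools import islice

def infinite_branch(root, successors, has_infinite_subtree):
    """Walk down the tree, always entering a successor whose subtree
    is infinite.  The walk never gets stuck: finitely many successors
    cannot all root finite subtrees under an infinite one."""
    node = root
    while True:
        yield node
        node = next(s for s in successors(node) if has_infinite_subtree(s))

# The full binary tree of bit-strings: every subtree is infinite.
branch = infinite_branch((), lambda n: [n + (0,), n + (1,)], lambda n: True)
print(list(islice(branch, 4)))   # [(), (0,), (0, 0), (0, 0, 0)]
```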

The weak König's lemma is implicit in "logic programming" and is part of the reason why it is supposed to be so much easier to think in terms of models than in terms of executions of programs (or derivations of formulae). The following "formula" makes a little sense:

Logic Programming = Imperative Programming + Weak König's Lemma

or we could say that in "logic programming" we talk about the possible branches in the trees, while in "imperative programming" we talk about the nodes. In "logic programming" we get the abstract argument involved in weak König's lemma for free, but we can do exactly the same arguments in imperative programming if we add the weak König's lemma as an extra ingredient.

The "logic" in "logic programming" is here used as a synonym for predicate logic. In an uninteresting way we can do all our programs in logic programming. But then we are in the middle of the Turing swamp. The interesting things come with the abstract arguments and the abstract ideas which we use to understand the programs. The point about "logic program­ming" was that it had built into it a use of the completeness theorem of predicate calculus. This is, using reasonable assumptions, equivalent to weak Konig's lemma. After the work of Friedmann et al. [3] this is well understood.

We can also see why the Henkin proof of the completeness theorem is different from the above. The Henkin proof gives total models. "Logic programming" with total models corresponds to the following stronger principle:

Theorem 2

Strong König's lemma. Given a tree T with the following properties:

there is an elementary recursive construction of the nodes in the tree;

for each node we can decide whether it is a terminal node or not;

for each non-terminal node there is an upper bound on the number of successor nodes;

there are infinitely many nodes.

We can thus conclude that there is an infinite branch.

The only difference is that we do not know the exact number of successor nodes to a non-terminal node. We only know an upper bound on the number of successor nodes. Friedman [3] has given a number of statements from ordinary mathematics equivalent to one of the two versions of König's lemma. The importance of such work is to make clear the abstract ideas involved in various mathematical arguments. We have seen that it can be used to see what "logic programming" could be about. There is a moral here:

In all our formal reasoning we use some abstract ideas. It could be the idea of a formula (as a finite string of symbols) or other ideas. These abstract ideas are also used in general arguments like the König's lemmas. The abstractness is shown by the absence of calculatory content. Nevertheless, we need them to govern our calculations. The crucial assumption in König's lemma is "there are infinitely many nodes". If we have a nice analysis of the quantifier combination in "there are infinitely many", then this analysis can be transferred to a similar analysis of the quantifier combination in "there is an infinite branch". If we have some analysis of the assumption with calculatory content, we also get calculatory content in the conclusion.


We cannot in advance enumerate all possible abstract ideas. To get abstract ideas usually involves creative work and this cannot be mechanized. A critique of "logic programming" is that it has connected itself too much with one abstract argument - the one giving weak König's lemma.

If we have an argument for termination, or correctness, of a calculation involving the weak König's lemma, then it is possible to transform this into a problem in "logic programming" with no use of the weak König's lemma. This is an advance. But if we need the strong König's lemma, then we are in exactly the same situation as in "imperative programming".

5. β-COMPLETENESS

Friedman [3] has given principles beyond the two König's lemmas. These principles fit well under the headline "reasoning in trees". The next principle for him is the kind of bar-rule used to analyze formal theories of hyper-arithmetical functions. After that he comes to a principle connected with $\Pi^1_1$-completeness. This principle is of course connected with the idea that a tree with natural number branching should be well founded.

In the remainder of this paper we will show how to go beyond that. The main person here is of course Jean-Yves Girard [5,6,7].

The problem of β-completeness came up in the works of Mostowski [11]. He was interested in logics where the notion of well-ordering was absolute. Girard translated this into the framework of many-sorted predicate calculus where we have one particular sort with predefined meaning [6]:

a sort W of elements from a well-ordering;

a linear ordering < defined on W.

This logic is called β-logic. Models in β-logic should always have W well-ordered by <. Below we will give a completeness theorem for this logic. To analyze validity here we need of course some fairly complicated abstract ideas. They must be of logical complexity $\Pi^1_2$ - as is not hard to see.

To get started we assume that in our language we have a fixed list of variables of type W: $o_0, o_1, o_2, \ldots$

Then we go through the same construction as in predicate logic. We start with a formula F not containing any free variables (nor for simplicity any constants of type W). Then we go through the construction of the tree of formulae above F, minding that the free variables of type W introduced should all be from the fixed list. We end up with the tree T(F). In most cases this tree is an infinite tree which locally looks like a derivation.

So far we are not going beyond ordinary predicate calculus. Now we introduce ordinals.

Definition 1

Let α be an ordinal. We let α* be the set of all finite sequences of ordinals < α. The elements of α* are often written

$$\sigma = \langle \sigma_0, \sigma_1, \ldots, \sigma_{n-1} \rangle.$$


Definition 2

A sequent (in sequential calculus) involving ordinals and less than (<) is secured if the subsequent consisting of the formulae which are atomic is true.

Definition 3

Let $\sigma = \langle \sigma_0, \sigma_1, \ldots, \sigma_{n-1} \rangle$ be from α*. We define T(F)/σ as the result of textual substitution of $\sigma_i$ for $o_i$ in T(F), where $i = 0, 1, \ldots, n-1$.

The result of such a textual substitution may no longer look like a derivation (problems with the $o_i$'s used as eigenparameters), but we do not care.

Definition 4

Let σ be from α*. We say that σ is secured relative to T(F) if in all branches of T(F)/σ there is a secured sequent at some node.

Definition 5

Let α be an ordinal. The α-tree based on F, written TREE(F)[α], is defined as the tree of all not secured sequences from α*.

After this rather heavy going it is time for an example. Consider the formula TI expressing the principle of transfinite induction over our ordering <. We then first construct the tree of formulae (or sequents) with TI at the bottom node. This tree, T(TI), is infinite because the formula is not derivable in predicate logic. A sequence $\langle \sigma_0, \sigma_1, \ldots, \sigma_{n-1} \rangle$ from α* is not secured there if it is strictly descending,

$$\sigma_0 > \sigma_1 > \cdots > \sigma_{n-1},$$

and TREE(TI)[α] consists of all strictly descending sequences from α*.
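For ordinals below ω, represented as Python integers, membership in this tree is a one-line check (a sketch; the function name is mine):

```python
def in_tree_TI(seq):
    """A sequence is a node of TREE(TI)[a] exactly when it is
    strictly descending, i.e. not yet secured."""
    return all(seq[i] > seq[i + 1] for i in range(len(seq) - 1))

print(in_tree_TI((5, 3, 2, 0)))   # True: an unsecured node of TREE(TI)[6]
print(in_tree_TI((5, 3, 3)))      # False: secured, not in the tree
```

Every branch of this tree dies out after finitely many steps - a strictly descending sequence of ordinals is finite - which is the well-foundedness we will exploit below.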

Lemma 1

All initial segments of elements from TREE(F)[α] are themselves elements. So we naturally have a tree structure.

Definition 6

We write σ ≈ τ, where σ and τ are from α*, if they are order isomorphic.

Lemma 2 (Homogeneity)

For σ ≈ τ from α*: if one of them is in TREE(F)[α], then so is the other.

Any finite sequence is order isomorphic to a sequence of natural numbers. This gives:

Definition 7

Let T be a set of sequences of ordinals and β an ordinal. We define the extension (or restriction) of T with β as:


$$T[\beta] = \{\sigma \in \beta^* \mid \text{for some } \tau \in T \text{ we have } \sigma \approx \tau\}.$$

Lemma 3

For all α: TREE(F)[α] = TREE(F)[ω][α].
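Lemma 3 rests on the remark above: membership in a homogeneous tree depends only on the order pattern of a sequence, and every finite sequence is order isomorphic to a sequence of natural numbers. A small sketch (function names mine):

```python
def pattern(seq):
    """The canonical order-isomorphic sequence of natural numbers:
    replace each entry by its rank among the distinct entries."""
    ranks = {v: i for i, v in enumerate(sorted(set(seq)))}
    return tuple(ranks[v] for v in seq)

def order_isomorphic(s, t):
    return pattern(s) == pattern(t)

print(pattern((100, 7, 7, 42)))                  # (2, 0, 0, 1)
print(order_isomorphic((5, 3, 0), (9, 4, 1)))    # True
```

So TREE(F)[ω] already records one representative per isomorphism class, and TREE(F)[α] is recovered from it by the extension of Definition 7 - which is what Lemma 3 states.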

We must connect our big trees with logic.

Definition 8

For each α we define the auxiliary logic where the universal quantifiers over the type W are taken to be conjunctions over all ordinals < α, and the existential quantifiers as disjunctions over all ordinals < α. Derivability in this logic is denoted by the sign ⊢_α. For α ≥ ω these logics are infinitary. It is not hard to see:

Lemma 4

For any formula F: if TREE(F)[α] is well founded, then ⊢_α F.

Lemma 5

The following are equivalent for a formula F:

F is valid in our β-logic;

for all α: TREE(F)[α] is well founded.

Remember now that the point about abstract ideas is not to eliminate them but to make them visible and useful. We need some complicated abstract ideas to explain validity in β-logic. They must be of logical complexity $\Pi^1_2$. Friedman used well-foundedness in trees with natural number branching to explain concepts of complexity $\Pi^1_1$ [3]. Here we introduce a new notion, strongly well founded.

Definition 9

Let T be a tree made of sequences of natural numbers. Then:

T is homogeneous if T = T[ω];

T is strongly well founded if for all α: T[α] is well founded.

We then get:

Theorem 3 (β-Completeness)

For a formula F the following are equivalent:

F is valid in β-logic;

TREE(F)[ω] is strongly well founded.

How can we use the abstract idea "strongly well founded"? It is used in much the same way as we use other abstract ideas. There are many uses of "finiteness" even if it is not a decidable concept. We already know one interesting strongly well founded tree. Above we introduced the tree connected with "transfinite induction". It consisted of all strictly descending sequences of ordinals (less than some α). This is obviously strongly well founded.

There are by now a number of results connected with these concepts. There is, for example, a concept of composition between strongly well founded trees. And there is a powerful recursion principle. But for these we must refer to the literature [5,9].

6. BEYOND β-LOGIC

The argument behind the proof of the β-completeness theorem is quite general. In fact we only needed the following fact about ordinals:

A Subset of Ordinals is Isomorphic to an Ordinal.

Suppose now that we start with some class C of structures which we want to be absolute in our logic. We assume that we have a particular sort which is always interpreted as one of the structures. There is one basic assumption:

The Class C of Structures is Closed Under Taking Substructures.

With this assumption the argument above can be carried through, and we get a completeness theorem for C-logic. The concepts "homogeneous" and "strongly well founded" carry nicely over into this new framework. This is done in [7,10]. There it was used to get what is called $\Pi^1_n$-completeness, but that is just one particular use of this idea.

This paper is called "reasoning in trees". We have shown that trees come up in a number of arguments. The abstract ideas used are connected with properties of trees. But there is more to it than that. We have shown that a number of complicated logical situations can be analyzed into two parts:

the construction of a tree T in an elementary way;

T has some abstract property.

There are a number of candidates for such abstract properties:

T is finite;

T is well founded;

T is strongly well founded.

and the trees may have different types of branching etc.

Gödel emphasized in his work that one should not be afraid of abstract properties. For one thing, they have to be used in any case. But more important, they can help us in understanding new logical situations.

REFERENCES

1. Archimedes, "The Sandreckoner".
2. W. F. Clocksin and C. S. Mellish, "Programming in Prolog", Springer Verlag (1981).
3. H. Friedman, S. Simpson and S. Smith, Countable algebras and set existence axioms, Annals of Pure and Applied Logic (1983).
4. G. Gentzen, Untersuchungen über das logische Schliessen, Mathematische Zeitschrift 39:176-210, 405-431 (1935).
5. J.-Y. Girard, $\Pi^1_2$-logic, Part I: Dilators, Annals of Mathematical Logic 21:75-219 (1981).
6. J.-Y. Girard, The Ω-rule, Proceedings of the International Congress of Mathematicians, Warszawa (1983).
7. J.-Y. Girard and J. P. Ressayre, Éléments de logique $\Pi^1_n$, Proceedings of the AMS Symposium on Recursion Theory, Cornell (1982).
8. K. Gödel, Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes, Dialectica 12:280-287 (1958).
9. H. R. Jervell, Introducing $\Pi^1_2$-logic, Proceedings from a Symposium in Oslo. To appear in Springer Lecture Notes.
10. H. R. Jervell, $\Pi^1_n$-completeness, Proceedings from a Symposium in Oslo. To appear in Springer Lecture Notes.
11. A. Mostowski, Formal systems of analysis based on an infinitistic rule, in: "Infinitistic Methods", Warszawa (1960).
12. D. Prawitz, Mekanisk bevisföring i predikatkalkylen, Uppsats för seminariet i teoretisk filosofi (mimeographed), Stockholm (1957).
13. J. A. Robinson, Theorem-Proving on the Computer, Journal of the ACM, Vol. 10, No. 2, April (1963).
14. A. M. Turing, On Computable Numbers, with an Application to the Entscheidungsproblem, Proceedings of the London Mathematical Society, Ser. 2, Vol. 42:230-265 (1936-1937); Vol. 43:544-546 (1937).
15. H. Wang, Proving theorems by pattern recognition: I, Communications of the ACM, Vol. 3, No. 4, April (1960).