
Journal of Automated Reasoning 7: 359-368, 1991. © 1991 Kluwer Academic Publishers. Printed in the Netherlands.

Generating Relevant Models

ALLAN RAMSAY
Department of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland

(Received: 25 June 1989; accepted: 13 September 1989)

Abstract. Manthey and Bry's model generation approach to theorem proving for FOPC has been greeted with considerable interest. Unfortunately, the original presentation of the technique can become arbitrarily inefficient when applied to problems whose statements contain large amounts of irrelevant information. We show how to avoid these problems whilst retaining nearly all the advantages of the basic approach.

Key words. Theorem proving, first-order predicate calculus, Manthey and Bry's model generation.

1. Model Generation

Our main aim is to show how to avoid certain problems with Manthey and Bry's (1988) model generation approach to theorem proving for first-order predicate calculus (FOPC). Before we can see how to avoid these problems, however, we have to see what they are. And before we can see what the problems are, we have to see what the approach is. We therefore start with a brief exposition of model generation.

At the heart of Manthey and Bry's idea is the realisation that very large parts of most problems which can be posed in FOPC can actually be stated as Horn sequents, i.e. as sequents of the form A₁, ..., Aₙ ⇒ β, where β is an atomic formula (which may be ⊥, the absurd statement). We refer to them as Horn sequents rather than Horn clauses to emphasise the treatment of negation in terms of sequents with ⊥ on the right hand side. With this convention, we can represent simple negative literals within our restricted format, since a fact like ¬p(a), which cannot be represented as a Horn clause, can be represented as a Horn sequent, namely p(a) ⇒ ⊥. We can therefore represent rather more of our problems in terms of Horn sequents than we could in terms of Horn clauses.

We can see whether a set of Horn sequents is satisfiable just by trying to prove ⊥ via goal reduction. It is hard to see what could be more effective than this, the basic PROLOG strategy, when it applies, so for problems which can be stated entirely as Horn sequents this seems like an extremely promising approach. Unfortunately not all problems can be converted entirely to this format. What we need is some way of dealing with sequents with complex right-hand sides which allows us to continue using goal reduction wherever possible. Model generation is one such approach.
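To make the Horn case concrete, here is a minimal sketch of the mapping just described. The clause set and the name bottom for the absurd statement are our own illustration, not anything taken from Manthey and Bry:

```prolog
% A Horn sequent A1,...,An ⇒ β becomes an ordinary Prolog clause, with the
% absurd statement written as the goal bottom/0.
bottom :- p(a).        % the Horn sequent p(a) ⇒ ⊥, i.e. the negative fact ¬p(a)
p(X)   :- q(X).        % the Horn sequent q(X) ⇒ p(X)
q(a).                  % the Horn sequent      ⇒ q(a)

% The set is unsatisfiable exactly when bottom is provable by goal reduction:
% ?- bottom.           succeeds here, so this little set has no model.
```

Checking satisfiability of the Horn part is thus nothing more than running an ordinary PROLOG query.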

Manthey and Bry suggest that we should view sequents whose right-hand sides contain more than one formula as indicating ways that we might extend our basic set of Horn sequents. Suppose, for instance, that we have a set of Horn sequents Σ and a single non-Horn sequent F ⇒ B₁ ∨ B₂, and we want to know whether they are satisfiable. We start by trying to derive ⊥ from Σ alone - if we can then Σ itself is unsatisfiable, so Σ ∪ {F ⇒ B₁ ∨ B₂} certainly is. So if we can derive ⊥ from Σ alone, we don't need to worry about F ⇒ B₁ ∨ B₂.

We therefore only need consider the case where we cannot derive ⊥ from Σ, so that Σ has at least one model. Suppose Σ ∪ {F ⇒ B₁ ∨ B₂} is unsatisfiable. Let F′ be the universal closure of the negation of the conjunction of elements of F. Then Σ ∪ {F′} must also be unsatisfiable, since any model of Σ ∪ {F′} would be a model of Σ ∪ {F ⇒ B₁ ∨ B₂}. And if Σ ∪ {F′} is unsatisfiable then there must be a substitution σ such that σ(F) is ground and Σ ⊢ σ(F).

Let σ be such a substitution. Then any model of Σ ∪ {F ⇒ B₁ ∨ B₂} must be a model of σ(B₁ ∨ B₂). So if Σ ∪ {F ⇒ B₁ ∨ B₂} is satisfiable then either Σ ∪ {(F ⇒ B₁ ∨ B₂), σ(B₁)} or Σ ∪ {(F ⇒ B₁ ∨ B₂), σ(B₂)} is. In other words, if we can show that Σ ∪ {σ(B₁), (F ⇒ B₁ ∨ B₂)} and Σ ∪ {σ(B₂), (F ⇒ B₁ ∨ B₂)} are both unsatisfiable then we can conclude that Σ ∪ {(F ⇒ B₁ ∨ B₂)} itself is. Each of these problems is of the same form as the original, namely that we have to show that some set of Horn sequents (i.e. Σ ∪ {σ(B₁)} or Σ ∪ {σ(B₂)}) plus the non-Horn sequent F ⇒ B₁ ∨ B₂ is unsatisfiable. We can therefore try the same thing again - see whether the Horn sequents alone are inconsistent, if not then try to extend them with the alternatives that arise from ground instances of the non-Horn one. We will refer to this kind of move as splitting.

Manthey and Bry show that you can construct a complete theorem prover on this basis. You have to ensure that your strategy for attempting to derive ⊥, and for attempting to find appropriate substitutions, from a set of Horn sequents is complete. It is often easy to show that there are no non-terminating cycles in the set of Horn sequents, in which case you can follow PROLOG in applying goal reduction depth-first, but otherwise you will have to apply it breadth-first if you want to preserve completeness. You also have to ensure that you explore the set of derived substitutions breadth-first. Given these preconditions, and some restrictions on the presence of free variables in the right-hand sides of complex sequents, theorem provers based on model generation are sound and complete - a refutation will be found in a finite time if and only if the set of sequents is unsatisfiable.

Model generation also forms the basis of extremely effective theorem provers. If we relax the requirement that goal reduction has to be applied breadth-first unless the set of sequents is cycle free, and that the set of derived substitutions must be explored breadth-first, then we inevitably end up with theorem provers that are no longer complete. They can, nonetheless, be extremely powerful. Manthey and Bry present such a theorem prover with which they can solve almost all of Schubert's [Pelletier 1986] test problems, mostly after a very short time indeed. We will not review their results here, but will just remark that they are reproducible. If you copy SATCHMO, the program that Manthey and Bry provide, then it will solve the problems they say it will solve in roughly the time they say it will take, with any discrepancies probably due to the power of the machine or the efficiency of the PROLOG system.


Where does the power of the approach come from? It seems likely that it has two sources. The first is the isolation of the Horn sequent part of the problem and the application of goal reduction where it is appropriate. Simple goal reduction with chronological backtracking does seem to be an extremely effective mechanism when applied to sets of Horn sequents. Furthermore, since it is the basic mechanism underlying PROLOG it has received considerable attention, and numerous optimisations have been developed - tail recursion optimisation, indexing on the structure of arguments, compilation of head unifications, and so on. There is little more to be said about this. PROLOG is good for what it is good for, so if your problem has parts for which PROLOG would be effective then you can't do much better than to use it.

The second reason for SATCHMO's success seems to be that the way it splits problems introduces a small element of forward chaining into a predominantly goal driven mechanism. When SATCHMO splits on a sequent of the form F ⇒ B₁ ∨ B₂, it will already have found a substitution σ such that σ(F) is entailed by the current set of Horn sequents. The proof that σ(F) follows from the Horn sequents is done by goal reduction, and hence is as easy as any proof can be. The fact that it does follow means that we know that the extensions obtained with σ(B₁) and σ(B₂) partition the sets of models of our original set of Horn sequents, so that if we can show that neither of the extensions has a model then we know that the original did not have one. Thus rather than introducing non-Horn sequents into the basic problem space, SATCHMO uses them to split the original problem into sub-problems of the same basic form. The forward chaining aspect of this ensures that at the point when the split occurs, the system knows that the left hand side of the sequent has already been dealt with, and the right hand side has been grounded.

As we remarked above, SATCHMO performs extremely well on Schubert's set of "hard" test problems. It is notable that these problems have been carefully designed to be hard. They are a bit like chess problems. Everything in them is relevant to the solution, and the task faced by the theorem prover is to see how to put it all together. The problems faced by theorem provers in general applications are rather more like chess games. There is a lot of information available, and the task is to sift through it to find what you need in order to construct a solution. Consider for instance the Steamroller. This is a problem about various animals (A) such as wolves (W), foxes (F), birds (B), caterpillars (C), and snails (S), and plants (P) such as grains (G), some of which eat (E) others, and some of which are much smaller (M) than others. The sequent form of this problem is as follows:

⇒ W(w), ⇒ F(f), ⇒ B(b), ⇒ S(s), ⇒ C(c), ⇒ G(g),

W(X) ⇒ A(X), F(X) ⇒ A(X), B(X) ⇒ A(X), S(X) ⇒ A(X), C(X) ⇒ A(X), G(X) ⇒ P(X),

F(X), W(Y) ⇒ M(X, Y), B(X), F(Y) ⇒ M(X, Y), S(X), B(Y) ⇒ M(X, Y), C(X), S(Y) ⇒ M(X, Y),

S(X) ⇒ P(i(X)), S(X) ⇒ E(X, i(X)), C(X) ⇒ P(h(X)), C(X) ⇒ E(X, h(X)),

W(X), F(Y), E(X, Y) ⇒ ⊥, W(X), G(Y), E(X, Y) ⇒ ⊥, B(X), S(Y), E(X, Y) ⇒ ⊥,

A(X), A(Y), E(X, Y), G(Z), E(Y, Z) ⇒ ⊥,

A(X), A(Y), M(Y, X), P(W), E(Y, W), P(Z) ⇒ E(X, Y) ∨ E(X, Z)
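For concreteness, the problem might be fed to the sketch above as follows; the lower-case predicate names and the constants w0, f0, b0, s0, c0 and g0 (for the wolf, fox, bird, snail, caterpillar and grain) are our own choices, not part of the original formulation.

```prolog
% Hypothetical encoding of the Steamroller for the earlier sketch.
horn(w(w0), true).   horn(f(f0), true).   horn(b(b0), true).
horn(s(s0), true).   horn(c(c0), true).   horn(g(g0), true).
horn(a(X), w(X)).    horn(a(X), f(X)).    horn(a(X), b(X)).
horn(a(X), s(X)).    horn(a(X), c(X)).    horn(p(X), g(X)).
horn(m(X, Y), (f(X), w(Y))).   horn(m(X, Y), (b(X), f(Y))).
horn(m(X, Y), (s(X), b(Y))).   horn(m(X, Y), (c(X), s(Y))).
horn(p(i(X)), s(X)).    horn(e(X, i(X)), s(X)).
horn(p(h(X)), c(X)).    horn(e(X, h(X)), c(X)).
horn(bottom, (w(X), f(Y), e(X, Y))).
horn(bottom, (w(X), g(Y), e(X, Y))).
horn(bottom, (b(X), s(Y), e(X, Y))).
horn(bottom, (a(X), a(Y), e(X, Y), g(Z), e(Y, Z))).
rule((a(X), a(Y), m(Y, X), p(W), e(Y, W), p(Z)), [e(X, Y), e(X, Z)]).
```

With this encoding, ?- unsat. succeeds after only a few splits of the single complex sequent.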

The striking thing about this set of sequents is that only one of them, namely the last, has a right hand side with more than one element. The consistency of the others can be checked extremely quickly by trying to use them to prove ⊥, using goal reduction as a proof method. The careful design of the problem means that there are not all that many substitutions for which the left hand side of the complex sequent can be shown to follow from the Horn sequents (or the Horn sequents supplemented by instances of the right hand side for appropriate substitutions). Manthey and Bry's original program, for instance, finds a proof after splitting this last sequent with the substitutions (b/X, s/Y, g/Z), (f/X, b/Y, g/Z) and (w/X, f/Y, g/Z). The difficulty for more orthodox approaches is that these are three from a very wide range of possible substitutions. If you work backwards you are very unlikely to find them, whereas working forwards they emerge almost immediately. Suppose, however, we add a new sequent with information concerning the relative speeds that animals can move at:

A(X), A(Y) ⇒ Q(X, Y) ∨ M(Y, X)

This says that if X and Y are animals, then either X is at least as quick as Y or Y is much smaller than X. This is additional, albeit irrelevant, information. If you could solve the original Steamroller, surely you should still be able to solve it after the addition of this new information. Unfortunately, if we had added this sequent so that it was considered before the other complex sequent in the Steamroller, it would lead Manthey and Bry's algorithm to consider roughly 900 times as many cases, since it would explore the space of possible enumerations of pairs of animals, and at the end of each enumeration it would explore the whole of the original search space. If we added in a few extra animals as well, the situation would get even worse, since the space that has to be explored before we try the relevant complex sequent contains 4 × (1 + 2 + ⋯ + N)² leaf nodes, where N is the number of animals (with the five animals of the original problem this is 900; with three more it is 5184). We would have to explore the whole of the original space at each of these leaf nodes. This would clearly get completely out of hand if we were to add in a few extra animals - with just three more animals the search space would be increased over 5000 times. The problem is that the existing strategy takes no account of the relevance of any complex sequent, or of any particular ground instance of such a sequent. We need to introduce an element of backward search into the treatment of complex sequents if we want to use this technique for problems which have a hard kernel, but which also contain large amounts of irrelevant material which prevents us from even finding the kernel.
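For later reference, in the hypothetical encoding used above this additional sequent would be just one more rule/2 fact:

```prolog
% The additional, irrelevant sequent A(X), A(Y) ⇒ Q(X, Y) ∨ M(Y, X).
rule((a(X), a(Y)), [q(X, Y), m(Y, X)]).
```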

2. Relevant Constraints on Models

We can see model generation as a way of showing inconsistency by using complex sequents to derive constraints on the set of possible models. Constraints are generated by considering substitutions for which the left hand side of such a sequent is forced to be true in every model, and hence concluding that at least one of the instantiations of elements of the right hand side must also be true in any model. The general strategy is to accumulate constraints until we can show that there cannot be a model satisfying all the constraints, and hence that there cannot be a model at all. The problem is that unless we are careful we may generate useless constraints which do not help show that particular models are impossible, but which do force us to consider more cases than we would like. How can we avoid generating useless constraints?

It will be helpful to ask when a constraint is useful. Suppose we had the following extremely simple problem: p ⇒ ⊥, q ⇒ ⊥, ⇒ r, r ⇒ p ∨ q.

We would start by trying to prove ⊥ by goal reduction from the Horn sequents, i.e. the first three. Our first attempt would start with p ⇒ ⊥, and would fail because we would be unable to prove p. The second would start with q ⇒ ⊥, and would fail because we could not prove q. We would then look for a complex sequent whose left hand side we could prove by goal reduction from the Horn sequents. The only complex sequent is r ⇒ p ∨ q, whose left hand side contains just r. This can easily be proved from ⇒ r. r ⇒ p ∨ q generates two alternative constraints - either p is true, or q is. The first of these enables us to prove ⊥ from p ⇒ ⊥, the second enables us to prove it from q ⇒ ⊥. The important thing to note is that the constraints generated by r ⇒ p ∨ q are exactly what we wanted, but did not have, when we first tried to prove ⊥. In other words, we could have spotted that r ⇒ p ∨ q was likely to be relevant in advance, at the point when we wanted one of p or q.
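In the representation assumed by the sketch in Section 1, this little problem and the behaviour just described look like this:

```prolog
% The toy problem, encoded for the earlier sketch.
horn(bottom, p).       % p ⇒ ⊥
horn(bottom, q).       % q ⇒ ⊥
horn(r, true).         %   ⇒ r
rule(r, [p, q]).       % r ⇒ p ∨ q

% ?- unsat.   succeeds: prove(bottom) fails for want of p or q, the complex
%             sequent fires because its left hand side r is provable, and both
%             extensions (adding p, adding q) are then refuted.
```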

Let us add another complex sequent to our problem:

p ⇒ ⊥, q ⇒ ⊥, ⇒ r, r ⇒ s ∨ t, r ⇒ p ∨ q

If we blindly try complex sequents, attempting to prove their left hand sides and then exploring all the constraints generated by their right hand sides, we will initially generate the constraints s and t. For each of these, we will then have to generate two further constraints, namely p and q, and show that in worlds with these constraints we can prove ⊥. r ⇒ s ∨ t is irrelevant. Further, we had no reason to suppose that it would be relevant, since we had not previously attempted a proof of ⊥ which had failed for lack of either s or t. We therefore propose the following changes to Manthey and Bry's algorithm:


PURE LITERAL DELETION

Suppose we have a set of Horn sequents Δ and a set of complex sequents Σ, and that F ⇒ B₁ ∨ B₂ is a member of Σ such that B₁ is pure with respect to Δ and Σ (in other words, no member of Δ or Σ contains B₁ on its left hand side). Let σ be a substitution for which Δ ⊢ σ(F) holds, so that we might be tempted to split with σ(B₁) and σ(B₂). We will then be committed to showing that Δ ∪ Σ ∪ {σ(B₁)} and Δ ∪ Σ ∪ {σ(B₂)} are both inconsistent, i.e. that they both support proofs of ⊥. If B₁ is pure with respect to Σ and Δ, then no proof of anything from these sets of sequents can depend on it - if we can show that Δ ∪ Σ ∪ {σ(B₁)} is inconsistent then we could have shown that Δ ∪ Σ was, without σ(B₁). In that case there is no point in splitting on this sequent. The additional information available in one half of the split is irrelevant, so the split itself is pointless.

Similarly, any sequent (Horn or complex) which has a pure literal in its antecedent can be removed, since nothing is ever going to be discovered which can reduce this literal when it appears as a sub-goal.

The significance of pure literal deletion is not restricted to model generation. Any approach to theorem proving which depends crucially on cancellation of complementary literals (which means any existing approach) will benefit from it. It is notable that deleting a clause which contains a pure literal will sometimes isolate other literals, so that you may get a cascade of deletions. It can even happen that the entire problem gets deleted before you even start. This effect was particularly emphasised in Kowalski's (1975) discussion of connection graphs, but it is equally true of any of the standard approaches to theorem proving. Eisinger (1986) points out that if you apply pure literal deletion in connection graph theorem proving after you have started resolving away links, you may delete the information required for finding a proof. This is only a problem if you allow deletion of pure literals during the course of a proof using a method which relies on an explicit representation of the remaining search space. If you simply regard it as a pre-processing step, as Bibel et al. (1987) do, or if your method enumerates the search space dynamically, as happens with model generation, then it will not affect the completeness of your algorithm.
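A possible rendering of pure literal deletion as a pre-processing pass, in the same assumed representation as the earlier sketches; the predicate names are ours, and this is a sketch of the idea rather than the author's code.

```prolog
% A literal occurring on a right hand side is pure if no antecedent anywhere
% in the problem could use it.
pure_pos(Lit) :-
    \+ ( horn(_, Body), body_member(L, Body), \+ L \= Lit ),
    \+ ( rule(Body, _),  body_member(L, Body), \+ L \= Lit ).

% A literal occurring in an antecedent is pure if nothing could ever supply it:
% it matches no Horn consequent and no disjunct of a complex sequent.
pure_neg(Lit) :-
    \+ ( horn(Head, _), \+ Head \= Lit ),
    \+ ( rule(_, Ds), member(D, Ds), \+ D \= Lit ).

body_member(L, (A, B)) :- !, ( body_member(L, A) ; body_member(L, B) ).
body_member(L, L) :- L \= true.

% Delete sequents with a pure literal in their antecedent, and complex sequents
% with a pure disjunct (splitting on them would be pointless).  Each deletion
% may make further literals pure, so repeat until nothing more can be removed.
delete_pure :-
    (   horn(H, B), body_member(L, B), pure_neg(L), retract(horn(H, B))
    ;   rule(B, Ds), body_member(L, B), pure_neg(L), retract(rule(B, Ds))
    ;   rule(B, Ds), member(D, Ds), pure_pos(D), retract(rule(B, Ds))
    ),
    !,
    delete_pure.
delete_pure.
```

Note, for instance, that the quickness predicate Q introduced by the irrelevant sequent in Section 1 never occurs in an antecedent, so that sequent would already be discarded by a pass like this before search begins.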

RECORDS OF FAILURES

The crucial point here is that we only try splitting after we have tried to derive ⊥ from the current set of Horn sequents, and have failed. There would be no point in splitting unless the information we expected to gain would help us to do something we had hitherto failed to do. We therefore want to restrict our attempts at splitting to sequents whose right hand sides contain information that would have been helpful some time in the past.

How can we tell what information would have been helpful some time in the past? Suppose we were trying to do a goal reduction proof of something (either of ⊥ or of the antecedent of some sequent that we had already decided was relevant). At various points during such a proof we may look for a sequent with some specific literal as its consequent and fail to find one. At any such point, we will have a number of outstanding goals - the literal we just looked for, the consequent of some sequent that had that literal in its antecedent, the consequent of that sequent, and so on. If we were told that one of these literals was in fact true, we could continue with our proof. These literals, then, constitute the information that would have been helpful at some time in the past.

We therefore recommend changing the goal reduction part of the algorithm so that it records any literals which would have enabled some unsuccessful attempt at a proof to continue if they had been present. We only ever contemplate splitting on complex sequents if their consequents contain some such literal. The following description of the changed algorithm makes this a little more precise.

(1) Check consistency: try to prove ⊥ by goal reduction from the Horn sequents currently available to you. If at any point you have a sub-goal which you cannot prove from the Horn sequents, look for complex sequents whose right hand sides contain elements that unify with it. For each such unification, note that the most general unifier induced by the unification produces a potentially relevant instance of the complex sequent. This is easy - you just apply the MGU to the complex sequent and record the resulting object as something to be considered when splitting.

(2) Split: if you failed to prove ⊥, look for a potentially relevant instance of a sequent for which you can derive the left hand side from the available Horn sequents. When you find such a sequent, investigate the consequences of adding the elements of its right hand side as constraints. Again, whenever you have a sub-goal which you cannot prove, you should make a note of any complex sequents whose right hand sides contain elements which will unify with it or with any currently outstanding goals.

This is the same as Manthey and Bry's algorithm, except that we use information gained from failed proofs to guide our search through the complex sequents when deriving constraints at step (2). We still use goal reduction on sets of Horn sequents in both steps, both in our attempts to derive ⊥ and in our attempts to find instances of complex sequents whose left hand sides can be derived from what we know already. However, we now make a note of sub-goals which we fail to prove, and we find complex sequents which would have helped us with our proofs. At step (2) we use the information gained from these failures to constrain our search through the set of relevant complex sequents.
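A possible rendering of this bookkeeping, again in the style of the earlier sketches: it replaces the prove/1 and the splitting clause of unsat given in Section 1, and the names relevant_instance/2 and record_relevant/1 are our own.

```prolog
:- dynamic relevant_instance/2.

% Goal reduction as before, but carrying the list of outstanding goals.  When a
% sub-goal matches no fact and no Horn consequent, we note every instance of a
% complex sequent whose right hand side would have supplied one of the
% outstanding goals, and then fail as usual.
prove(G) :- prove(G, []).

prove(true, _)     :- !.
prove((A, B), Out) :- !, prove(A, Out), prove(B, Out).
prove(A, _)        :- fact(A).
prove(A, Out)      :- horn(A, Body), prove(Body, [A | Out]).
prove(A, Out)      :-
    \+ fact(A), \+ horn(A, _),          % nothing can reduce this sub-goal
    record_relevant([A | Out]),
    fail.

record_relevant(Goals) :-
    forall(( rule(Body, Ds), member(D, Ds), member(G, Goals), D = G ),
           assertz(relevant_instance(Body, Ds))).   % D = G applies the MGU

% Splitting now only considers recorded, potentially relevant instances.
unsat :-
    prove(bottom), !.
unsat :-
    relevant_instance(Body, Disjuncts),
    prove(Body),
    \+ ( member(D, Disjuncts), prove(D) ),
    forall(member(D, Disjuncts), branch_unsat(D)).
```

The recorded instances already have the most general unifier applied, so step (2) only ever proves left hand sides whose right hand sides might supply a previously missing sub-goal.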

How does the new version of the algorithm compare with the original? There are three important questions to ask here. (i) Is the new algorithm complete? (ii) Does it cover the search space more effectively than the original? (iii) Are the overheads incurred in keeping track of potentially relevant complex sequents worth the bother?

We can merge questions (i) and (ii) into a single query: does the new algorithm generate all the constraints that are needed for a proof? As we noted above, Manthey and Bry show that the original algorithm is complete if the strategy for goal reduction is complete and the set of constraints is searched breadth-first. The most we can hope for, then, is that the new algorithm will be complete if the space of constraints it generates is searched breadth-first and the strategy for goal reduction is complete. It is also easy to see that at least for some problems, such as the example we used to introduce the new algorithm and the extended version of the Steamroller, the new algorithm will ignore some constraints that would have been generated by the old one. Hence if it does generate everything that is needed then it will certainly cover the space more effectively for some problems, and it will never explore a space containing constraints that are not present in the search space for the original algorithm. Could there be constraints that ought to be explored but which the new algorithm will fail to generate? Constraints are used in both algorithms to provide new information for attempts to prove ⊥ by goal reduction. We only perform the second step, when constraints are generated, if we have attempted a goal reduction proof of ⊥ and have failed. This will only happen if we have no Horn sequents whose right hand side just consists of ⊥, or if we have run into a sub-goal for which we cannot construct a proof.

In the first case there is nothing we can do with either algorithm. If we have no Horn sequents whose right hand side consists just of ⊥, we cannot prove ⊥ and no constraints that we might generate by either method will help us do so. In the second case, where we have run into a sub-goal for which we cannot yet construct a proof, the only way that generating constraints will help us is if one of the constraints is actually the required sub-goal. The only way that a complex sequent could generate the required sub-goal as a constraint is if a generalisation of it occurs as an element of its right hand side. In this case the new algorithm will mark the relevant specialisation as a potentially relevant complex sequent, so it will be available at step (2). Thus any complex sequent which may be required for generating constraints will get marked, and the constraints will be generated as required.

We see then that the new algorithm is complete if run breadth-first, but not if it is run depth-first, exactly as was the case for Manthey and Bry's original algorithm. It is also clear that it will sometimes save a considerable amount of work. The explosive effect of irrelevant information on Manthey and Bry's algorithm when it is used on the Steamroller suggests that it is worth putting quite a lot of effort into ensuring that only potentially relevant complex sequents are considered. If the presence of one irrelevant complex sequent can cause the algorithm to behave 900 times worse, then detection of irrelevant sequents is well worth doing. The new algorithm requires us to search for complex sequents with right hand sides of a specified type whenever we fail to prove a sub-goal during a goal reduction proof. Exactly how much effort is involved in this depends on how efficiently complex sequents are indexed. There is no reason for us not to use whatever indexing scheme lies behind our basic goal reduction algorithm, in which case the overheads imposed will be perfectly tolerable. Our PROLOG implementation, based on the one provided by Manthey and Bry, solves the basic Steamroller in around two seconds (POPLOG PROLOG on a SUN-3) if we simply search through all instantiations of the single complex sequent, and in around three seconds if we make sure that we are searching potentially relevant ground instances of that sequent. If we make sure that it only tries potentially relevant ground instances of complex sequents, it also solves the extended Steamroller with the irrelevant sequent A(X), A(Y) ⇒ Q(X, Y) ∨ M(Y, X) in around three seconds. If we do not guide its search through the set of complex sequents, it simply fails to solve the extended problem.

3. Conclusions

Our refinements to model generation require us to be able to inspect the current state of the search space at various times. We need to inspect it at the outset, in order to prune away any sequents containing pure literals, and we need to inspect it during the course of problem solving in order to see what goals are outstanding, and what goals we have previously failed on. We can compare this with other approaches to this task, and to similar tasks, which depend on concrete representations of parts of the search space. In connection graph theorem proving (Kowalski 1975) and chart parsing (Kay 1973, Kaplan 1973), a structure is built to represent the current state of the search for a solution - what tasks have been successfully tackled, what ones have been attempted unsuccessfully, and what is currently waiting to be dealt with. These systems attempt to avoid repeating work by keeping track of everything they have ever done or contemplated doing. In parsing this approach is definitely worthwhile. It is less obviously a good idea in theorem proving, since in large problems the cost of keeping an explicit record of all possible complementary literals can easily become overwhelming. In our approach, however, we do not need to keep concrete representations of quite as much information as connection graph theorem provers do. We need to be able to inspect the set of links to a given literal during the initial phase when we are deleting sequents with pure literals, but we do not need to keep this information for later reference. We therefore avoid one of the overheads associated with connection graphs, namely the sheer amount of space required for keeping explicit representations of all the links in the graph for non-trivial problems. We also need to be able to inspect part of the current search space when we are finding out what goals we have failed to reduce and hence what information we would like splitting to provide. The part we need to inspect, however, is just the current goal. We therefore need no extra data structures for representing the part of the search space we are interested in. All we need is a failsafe clause which says that if you fail to obtain some goal by ordinary goal reduction, you should look for complex sequents which might lead to it.

The only information that we need to record that was not present in the original model generation algorithm is the set of goals that have been detected as failures, and the links between these and specific complex sequents. We might almost say that the information we need is being recorded in an ill-formed substring table. The effect on the basic model generation strategy is that we can cope with situations where we have large amounts of irrelevant information, at the cost of keeping a description of the goals which we have failed to reduce. In view of the efficiency of model generation when all the information present is relevant, and its pathological behaviour when it is not, we believe that this cost is well worth bearing.

References

Bibel, W., Letz, R. and Schumann, J. (1987): Bottom-up Enhancements of Deductive Systems, MS, Technische Universität München.

Eisinger, N. (1986): 'Everything you always wanted to know about clause graph resolution', CADE 8, pp. 316-336.

Kaplan, R. M. (1973): 'A general syntactic processor', in R. Rustin, Natural Language Processing, Algorithmics Press, New York, pp. 193-241.

Kay, M. (1973): 'The MIND system', in R. Rustin, Natural Language Processing, Algorithmics Press, New York, pp. 155-188.

Kowalski, R. (1975): 'A proof procedure using connection graphs', JACM 22(4), pp. 572-595.

Manthey, R. and Bry, F. (1988): 'SATCHMO: a theorem prover in PROLOG', CADE-88.

Pelletier, J. F. (1986): 'Seventy five problems for testing automatic theorem provers', Journal of Automated Reasoning 2, pp. 191-216.