nonmonotonic theories and their axiomatic varieties

Journal of Logic, Language, and Information 4: 317-334, 1995. © 1995 Kluwer Academic Publishers. Printed in the Netherlands.

Nonmonotonic Theories and Their Axiomatic Varieties

ZBIGNIEW STACHNIAK Department of Computer Science, York University, North York, Canada Email: zbigniew@cs.yorku.ca

(Received 30 September 1994; in final form 2 October 1995)

Abstract. The properties of monotonic inference systems and the properties of their theories are strongly linked. These links, however, are much weaker in nonmonotonic inference systems. In this paper we introduce the notion of an axiomatic variety for a theory and show how this notion, instead of the notion of a theory, can be used for the syntactic and semantic analysis of nonmonotonic inferences.

Key words: nonmonotonic inference system, nonmonotonic theory, axiomatic variety

1. Introduction

The properties of monotonic inference systems and the properties of their theories, i.e. inferentially closed sets of formulas, are strongly linked. Classes of theories uniquely identify and define monotonic systems. Hence, properties of such systems can be expressed and studied in terms of the properties of their theories. These links, however, are much weaker in nonmonotonic inference systems. By replacing monotonicity with some other principles of reasoning we challenge many properties of inference, traditionally considered natural or useful. The inference operations of nonmonotonic systems frequently fail the compactness theorem; tautologies of propositional nonmonotonic systems can no longer be considered schemas of true sentences, as the set of nonmonotonic tautologies is not necessarily closed under logical substitutions. In the absence of monotonicity, the meaning of a theory, as well as the role of theories in the analysis of inferences, is radically different from the role theories play in monotonic calculi. In the absence of monotonicity, theories lose their prominent role, as they alone are not sufficiently expressive for the analysis of nonmonotonic inferences.

It turns out that in the context of nonmonotonicity another syntactic notion assumes the role a theory plays in monotonic inference systems. The notion of the axiomatic variety for a theory, a central notion introduced in this paper, captures all possible ways a theory can be axiomatized. With this notion, we recover analogues of many syntactic and semantic links that exist between monotonic inference systems and their theories.


In Section 4 of this paper, we introduce axiomatic varieties and study their basic properties. We show how axiomatic varieties can be used to identify, define, and compare nonmonotonic inference systems. The semantic links between properties of axiomatic varieties and properties of inference systems are discussed in Section 6. In Sections 5 and 6 we formulate syntactic and semantic criteria for selected properties of nonmonotonic calculi, expressed in terms of axiomatic varieties. We state the syntactic criteria and representation theorems for cumulative and loop-cumulative inference systems and for inference systems with cut.

2. Logical Preliminaries

In this section we describe the class of nonmonotonic inference systems considered in this paper. The reader may refer to Brown and Shoham (1988), Fiałkowska and Fiałkowski (1990), Gabbay (1985), Kraus et al. (1990), Makinson (1988, 1994) and Stachniak (1993) for more discussion and examples.

An inference system is a pair

$\langle \mathcal{L}, C \rangle$,

where $\mathcal{L}$ is a language (propositional, first-order, etc.) and $C$ is an inference operation on $\mathcal{L}$, i.e., a function mapping sets of formulas of $\mathcal{L}$ into sets of formulas, satisfying the following two conditions: for every set $X$ of formulas,

(c1) $X \subseteq C(X)$; (inclusion)

(c2) $C(C(X)) = C(X)$. (idempotence)

If $\alpha$ is a formula and $X$ a set of formulas, then we read '$\alpha \in C(X)$' as '$X$ infers $\alpha$'. Hence, $C(X)$ denotes the set of all formulas that can be inferred from $X$. In the light of this interpretation, (c1) says that from every set $X$ of formulas, we can infer at least all its members, while (c2) requires $C(X)$ to be inferentially closed. In agreement with Gabbay (1985), Makinson (1988, 1994) and Stachniak (1993), we consider (c1) and (c2) to be among the rock-bottom properties of any inference system. For example, properties (c1) and (c2) hold for inference operations associated with default systems of Reiter and Poole (as defined in Makinson (1988, 1994)), as well as for inference operations defined by preferential model structures (cf. Brown and Shoham (1988) and Makinson (1988, 1994)) and by preferential matrices (cf. Stachniak (1993)).

No properties of inference systems that refer to quantification will be studied in this paper. Hence, for simplicity of presentation, we assume that in every inference system $\langle \mathcal{L}, C \rangle$ discussed in this paper, $\mathcal{L}$ is a propositional language. A propositional language is defined by specifying a finite list $f_0, \ldots, f_k$ of logical connectives and a countably infinite set of propositional variables. We do not make any special assumptions concerning the connectives of $\mathcal{L}$; they can be classical, modal, unary, or multiple-argument. The set $L$ of well-formed formulas of $\mathcal{L}$ is defined in the usual way. Formally, we can identify a language $\mathcal{L}$ with the system $\langle L, f_0, \ldots, f_k \rangle$, frequently called the algebra of formulas.

We say that an inference operation $C$ on $\mathcal{L}$ satisfies cut if, for every $X, Y \subseteq L$,

(c3) $X \subseteq Y \subseteq C(X)$ implies $C(Y) \subseteq C(X)$.

$C$ is cumulative if, for every $X, Y \subseteq L$,

(c4) $X \subseteq Y \subseteq C(X)$ implies $C(Y) = C(X)$.

The reader may refer to Gabbay (1985), Kraus et al. (1990), Makinson (1988, 1994), and Stachniak (1993) for further discussion of these properties. Finally, $C$ is a consequence operation if, in addition to (c1) and (c2), it satisfies the following condition: for every $X, Y \subseteq L$,

(c5) $X \subseteq Y$ implies $C(X) \subseteq C(Y)$. (monotonicity)

If $C$ is a consequence operation, then the inference system $\langle \mathcal{L}, C \rangle$ is called a logic. As (c5) does not follow from (c1) and (c2), we call an inference system nonmonotonic if this system is not a logic.
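When the set of formulas is finite, conditions (c1)-(c5) can be checked by exhaustive search. The following Python sketch is only an illustration and is not part of the formal development: the three-element set `L` and the toy operation `C` (with no premises it infers the default `a`, otherwise it infers nothing new) are assumptions chosen for the example.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical finite stand-in for the set L of formulas.
L = frozenset({"a", "b", "c"})
SUBSETS = powerset(L)

def C(X):
    """Toy operation: no premises yield the default 'a'; otherwise nothing new."""
    return frozenset({"a"}) if not X else frozenset(X)

def inclusion(op):      # (c1)
    return all(X <= op(X) for X in SUBSETS)

def idempotence(op):    # (c2)
    return all(op(op(X)) == op(X) for X in SUBSETS)

def cut(op):            # (c3)
    return all(op(Y) <= op(X)
               for X in SUBSETS for Y in SUBSETS if X <= Y <= op(X))

def cumulative(op):     # (c4)
    return all(op(Y) == op(X)
               for X in SUBSETS for Y in SUBSETS if X <= Y <= op(X))

def monotone(op):       # (c5)
    return all(op(X) <= op(Y)
               for X in SUBSETS for Y in SUBSETS if X <= Y)

report = {name: check(C) for name, check in [
    ("inclusion", inclusion), ("idempotence", idempotence),
    ("cut", cut), ("cumulative", cumulative), ("monotone", monotone)]}
print(report)
```

Under these assumptions the toy operation is a cumulative inference operation that satisfies cut but fails (c5), since the default `a` is inferred from the empty set but not from `{b}`.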

A theory of an inference operation $C$ is a set $T$ of formulas of $\mathcal{L}$ such that $C(T) = T$. We shall write $Th(C)$ to denote the set of all theories of $C$. The idempotence condition (c2) describes the way theories can be formed in an inference system: theories are inferential closures of sets of formulas. Indeed, for every $X \subseteq L$, $C(X)$ is a theory and every theory $T$ is of the form $C(X)$, for some $X \subseteq L$. Theories of consequence operations play a prominent role in the syntactic as well as the semantic analysis of logical calculi. In the next section we review some of the facts linking properties of theories with properties of logics. Most of these links, however, break down in the absence of monotonicity. The place of theories in the analysis of nonmonotonic inferences is far from being central.

3. Theories of Monotonic Inference Systems

In this section we briefly review some of the facts that link the properties of theories with the properties of logics. The proofs of all the facts discussed in this section can be found in Piochi (1983), Stachniak (1988) and Wójcicki (1988).

Let $\mathcal{L} = \langle L, f_0, \ldots, f_k \rangle$ be an arbitrary propositional language, fixed for the rest of this paper. Let $C$ and $C_1$ be two consequence operations on $\mathcal{L}$.

(t1) $C = C_1$ iff $Th(C) = Th(C_1)$.

(t2) $C(X) = \bigcap \{Y \in Th(C) : X \subseteq Y\}$.


Conditions (t1) and (t2) state that a consequence operation is uniquely determined by its set of theories. Indeed, by (t1), no two different consequence operations have the same theories. By (t2), the knowledge of $Th(C)$ suffices to define the consequence operation $C$. Of course, (t2) implies (t1).
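Condition (t2) can be read as a recipe for recovering a consequence operation from its theories. The sketch below is a finite illustration only: the universe `L` and the closure system `TH` are hypothetical, and the code simply intersects the members of `TH` that contain a given set.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical finite stand-in for the set L of formulas.
L = frozenset({"a", "b", "c"})
SUBSETS = powerset(L)

# Hypothetical closure system: it contains L and is closed under intersections.
TH = {frozenset(), frozenset({"a"}), frozenset({"a", "b"}), L}

def C(X):
    """(t2): C(X) is the intersection of all members of TH that contain X."""
    result = L
    for Y in TH:
        if X <= Y:
            result = result & Y
    return result

# The theories (fixed points) of the operation defined by (t2).
theories = {X for X in SUBSETS if C(X) == X}
monotone = all(C(X) <= C(Y) for X in SUBSETS for Y in SUBSETS if X <= Y)

print(theories == TH, monotone)
```

In this example the fixed points of `C` are exactly the members of `TH`, and `C` is monotone, as (t2) leads one to expect.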

Theories can be used to compare consequence operations with respect to their inferential power. We shall write $C \leq C_1$ if and only if for every $X \subseteq L$, $C(X) \subseteq C_1(X)$. Clearly, this definition extends to all inference operations.

(t3) $C \leq C_1$ iff $Th(C_1) \subseteq Th(C)$.

(t3) characterizes the relation $\leq$ between consequence operations. It tells us that for a given consequence operation $C$, $Th(C)$ provides a 'space of theories' for consequence operations inferentially stronger than $C$. Although we can infer more formulas using a consequence operation $C_1$ inferentially stronger than $C$, $C_1$ cannot have theories that are not in $Th(C)$. Clearly, (t3) also implies (t1).

Every consequence operation $C$ on $\mathcal{L}$ can be semantically defined by a single semantic structure

$M = \langle \mathcal{A}, \mathcal{D}, \mathcal{H} \rangle$

called a logical matrix. The first element $\mathcal{A}$ of $M$, called the algebra of truth-values of $M$, provides truth-functional interpretations of logical connectives of $\mathcal{L}$: $\mathcal{A} = \langle A, F_0, \ldots, F_k \rangle$, where $A$ is the set of truth-values and $F_0, \ldots, F_k$ are operations on $A$. Every operation $F_i$, $i \leq k$, is the interpretation of the connective $f_i$. (We assume that for every $i \leq k$, the connective $f_i$ and its interpretation $F_i$ are of the same arity.) The second element $\mathcal{D}$ of $M$ is a family of sets of truth-values (i.e., subsets of $A$). Every set $d \in \mathcal{D}$ consists of truth-values which we consider designated. Finally, the third element $\mathcal{H}$ of $M$ is the set of some (not necessarily all) interpretations of formulas of $\mathcal{L}$ into $M$. These interpretations, called valuations, are simply mappings from the set of formulas of $\mathcal{L}$ into the set $A$ of truth-values of $M$. Valuations obey the truth-functional interpretations of the connectives. With every matrix $M$ we associate the consequence operation $C_M$ defined in the following way: for every $X \cup \{\alpha\} \subseteq L$,

$\alpha \in C_M(X)$ iff for every valuation $h \in \mathcal{H}$ and every $d \in \mathcal{D}$, $h(X) \subseteq d$ implies $h(\alpha) \in d$.

Intuitively, $X$ infers $\alpha$ in $M$ if and only if every valuation $h \in \mathcal{H}$ which assigns designated truth-values to all the formulas of $X$ (i.e., $h(X) \subseteq d$, for some $d \in \mathcal{D}$) assigns a designated truth-value to $\alpha$ as well (i.e., $h(\alpha) \in d$). The reason for defining $C_M$ with respect to some, rather than all, valuations is to allow the analysis of inferences when some interpretations of formulas are not admissible, or have no meaning. For instance, one may want to exclude the valuation which makes all the propositional variables true, arguing that an 'admissible' valuation should reflect the fact that there are true as well as untrue 'atomic' facts (for similar reasons, the valuation that makes all the variables false could be excluded). From a purely theoretical point of view, however, the introduction of the concept of an admissible valuation (i.e., a member of $\mathcal{H}$) makes it possible to provide semantics to all propositional logics (cf. Piochi (1983) and Stachniak (1988)).

We say that a matrix $M$ defines a consequence operation $C$ if $C = C_M$.

The semantic connections between a consequence operation $C$ and its set $Th(C)$ of theories are described by the following facts.

(t4) Every consequence operation $C$ is defined by the matrix $M = \langle \mathcal{L}, Th(C), \{id\} \rangle$, where $id$ is the identity function on $L$.

(t5) If $M = \langle \mathcal{A}, \mathcal{D}, \mathcal{H} \rangle$ is a matrix, then for every $h \in \mathcal{H}$ and every set $d \in \mathcal{D}$, the set $T_d^h = \{\alpha \in L : h(\alpha) \in d\}$ is a theory of $C_M$.

(t6) For every matrix $M$, $Th(C_M)$ is the smallest set containing $L$, all theories of the form $T_d^h$, and closed under arbitrary intersections.

In short, (t4) shows how to construct a matrix defining a consequence operation $C$ knowing its set $Th(C)$ of theories. Facts (t5) and (t6), on the other hand, show how to get the complete description of $Th(C)$ having a matrix defining $C$. We call the matrix defined in (t4) the Lindenbaum matrix for $C$. This, or similar semantic structures, were used to provide semantic criteria for consequence operations (see Stachniak (1988)), structural consequence operations (see Wójcicki (1988)), and cumulative inference operations (see Makinson (1988, 1994) and Stachniak (1993)). In Section 6, we shall use semantic structures, similar to Lindenbaum matrices, to provide semantic criteria for further properties of inference systems.

4. Axiomatic Varieties

Formulas (t1)-(t6) express some of the basic links between consequence operations and their sets of theories; they show how intimately connected these notions are. By giving up the monotonicity principle we break these connections. In general, $Th(C)$ does not uniquely characterize a nonmonotonic inference operation $C$, i.e., (t1) does not hold in the absence of monotonicity (and, thus, also (t2) and (t3) which imply (t1)). The reason is that a theory $T$ of $C$ can be axiomatized in an 'unpredictable' way, i.e., the set

$V(T) = \{X \subseteq L : C(X) = T\}$,

which describes the possible ways a theory $T$ can be axiomatized (i.e., obtained using the operator $C$), may contain sets the presence of which is quite arbitrary. Indeed, any set $Y \in V(T)$ can be taken out of $V(T)$ and added to another family $V(T_1)$, provided that $Y \neq T$ and $Y \subseteq T_1$. As a result, we get a new inference operation with exactly the same set of theories as $C$. Hence, $Th(C) = Th(C_1)$ does not necessarily imply $C = C_1$, if $C$ and $C_1$ are arbitrary inference operations.
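The failure of (t1) can be observed on a very small example. In the following sketch, which is an illustration under assumed data rather than an example from the literature, the hypothetical operations `C` and `C1` differ only on the empty set: the set $Y = \emptyset$ is moved from $V(\{a\})$ to $V(\{b\})$, exactly as described above.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical finite stand-in for the set L of formulas.
L = frozenset({"a", "b", "c"})
SUBSETS = powerset(L)

def C(X):
    """Toy operation whose default conclusion from no premises is 'a'."""
    return frozenset({"a"}) if not X else frozenset(X)

def C1(X):
    """Same as C, except that the default conclusion is 'b'."""
    return frozenset({"b"}) if not X else frozenset(X)

def theories(op):
    """Th(op): the fixed points of the operation."""
    return {X for X in SUBSETS if op(X) == X}

def variety(op, T):
    """V(T): all sets X that axiomatize T, i.e. op(X) = T."""
    return {X for X in SUBSETS if op(X) == T}

same_theories = theories(C) == theories(C1)
same_operation = all(C(X) == C1(X) for X in SUBSETS)
print(same_theories, same_operation)
print(variety(C, frozenset({"a"})), variety(C1, frozenset({"a"})))
```

Both operations have every nonempty subset of `L` as a theory, yet they are different operations; the difference is visible only in the sets `variety(op, T)`.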


In order to specify a nonmonotonic inference system we must not only list the theories of such a system but we must also describe the possible ways in which we want every theory to be axiomatized. In other words, for every theory $T$, the set $V(T)$ must be specified. To express and justify this claim formally, with every inference operation $C$ let us associate the equivalence relation $\Theta_C$ on $2^L$ defined by

$X \,\Theta_C\, Y$ iff $C(X) = C(Y)$.

Clearly, the equivalence classes of $\Theta_C$ are of the form $V(T)$, where $T$ is a theory of $C$. We call every equivalence class $V(T)$ of $\Theta_C$ the axiomatic variety for $T$.

THEOREM 1. Let $C$ and $C_1$ be two inference operations on $\mathcal{L}$. Then

$C = C_1$ iff $\Theta_C = \Theta_{C_1}$.

Proof. The only if part is obvious. Let $\Theta_C = \Theta_{C_1}$ and let $X \subseteq L$. By idempotence of $C$ and $C_1$ we obtain

$C(X) \,\Theta_C\, X \,\Theta_{C_1}\, C_1(X)$,

which, in the light of our hypothesis, gives us:

(a) $C(C_1(X)) = C(X)$; (b) $C_1(C(X)) = C_1(X)$.

Since $C$ and $C_1$ satisfy inclusion, we obtain $C_1(X) \subseteq C(X)$ from (a) and $C(X) \subseteq C_1(X)$ from (b). Hence, $C(X) = C_1(X)$, as required. □

Theorem 1 is a nonmonotonic counterpart of (t1). It says that two inference operations are identical if and only if they have the same sets of theories and these theories are axiomatizable in the same ways in both systems. This theorem suggests that in the context of nonmonotonicity, axiomatic varieties for theories, i.e., the equivalence classes of $\Theta_C$, play the role theories play in logics. During the course of this paper we shall try to gather enough evidence to support this claim.

Now, let us turn to (t3). If C is a consequence operation, and some other consequence operation C1 has all its theories in Th(C), then C1 must be inferentially at least as strong as C. This situation, however, is not generally true for nonmonotonic inference operations. We only have:

PROPOSITION 2. If $C$ and $C_1$ are inference operations, then $C \leq C_1$ implies $Th(C_1) \subseteq Th(C)$.

Proof. Let $C \leq C_1$ and let $X \in Th(C_1)$. Since $C(C_1(X)) \subseteq C_1(C_1(X)) = C_1(X)$ and since $C_1(X) \subseteq C(C_1(X))$, we have $C(C_1(X)) = C_1(X)$. Since $X \in Th(C_1)$, we have $X = C_1(X)$ and, hence, $C(X) = X$. This means that $X \in Th(C)$. □

The converse of Proposition 2 is true if, for example, C is a consequence operation. However, it is not true in general, as its truth would imply the truth of (tl) for nonmonotonic inference operations. To compare inference operations, the usage of axiomatic varieties for theories (instead of theories) can be more profitable. We demonstrate this on the example of cumulative inference operations.

THEOREM 3. Let $C$ and $C_1$ be two inference operations and let $C_1$ be cumulative. Then $C \leq C_1$ iff $\Theta_C \subseteq \Theta_{C_1}$.

Proof. ($\Rightarrow$) Let $X, Y \subseteq L$ be such that $X \,\Theta_C\, Y$. We want to show that $X \,\Theta_{C_1}\, Y$. By inclusion and the fact that $C \leq C_1$ we obtain:

$X \subseteq C(X) \subseteq C_1(X)$ and $Y \subseteq C(Y) \subseteq C_1(Y)$,

which, by cumulativity and idempotence of $C_1$, give us:

(a) $C_1(X) = C_1(C(X))$; (b) $C_1(Y) = C_1(C(Y))$.

Now, since $C(X) = C(Y)$, (a) and (b) give us $C_1(X) = C_1(Y)$, as required. ($\Leftarrow$) Let $X \subseteq L$. To show that $C(X) \subseteq C_1(X)$, let us note that by the hypothesis and the fact that $X \,\Theta_C\, C(X)$ (idempotence of $C$), we must have $X \,\Theta_{C_1}\, C(X)$. This means that $C_1(C(X)) = C_1(X)$ and, since also $C(X) \subseteq C_1(C(X))$ (inclusion), we conclude that $C(X) \subseteq C_1(X)$. □
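Proposition 2 and Theorem 3 can be sanity-checked by brute force on a tiny universe. The sketch below is an illustration only; it assumes a hypothetical two-formula set `L`, enumerates every operation satisfying (c1) and (c2), and tests both statements on all pairs. It also collects pairs showing that the left-to-right direction of Theorem 3 can fail when $C_1$ is not cumulative.

```python
from itertools import combinations, product

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical two-element stand-in for the set L of formulas.
L = frozenset({"a", "b"})
SUBSETS = powerset(L)

def all_inference_operations():
    """Enumerate, as dicts, all maps satisfying inclusion and idempotence."""
    choices = [[Y for Y in SUBSETS if X <= Y] for X in SUBSETS]
    for values in product(*choices):
        op = dict(zip(SUBSETS, values))
        if all(op[op[X]] == op[X] for X in SUBSETS):
            yield op

def cumulative(op):
    return all(op[Y] == op[X]
               for X in SUBSETS for Y in SUBSETS if X <= Y <= op[X])

def weaker(op, op1):
    """C <= C1: pointwise inclusion of the inferred sets."""
    return all(op[X] <= op1[X] for X in SUBSETS)

def theta(op):
    """The relation Theta_C, represented as a set of pairs."""
    return {(X, Y) for X in SUBSETS for Y in SUBSETS if op[X] == op[Y]}

def theories(op):
    return {X for X in SUBSETS if op[X] == X}

OPS = list(all_inference_operations())

theorem3_ok = all(weaker(c, c1) == (theta(c) <= theta(c1))
                  for c in OPS for c1 in OPS if cumulative(c1))
prop2_ok = all(theories(c1) <= theories(c)
               for c in OPS for c1 in OPS if weaker(c, c1))
# Pairs with C <= C1 but Theta_C not contained in Theta_C1.
counterexamples = [(c, c1) for c in OPS for c1 in OPS
                   if weaker(c, c1) and not theta(c) <= theta(c1)]

print(len(OPS), theorem3_ok, prop2_ok, len(counterexamples) > 0)
```

A finite check of this kind proves nothing about arbitrary languages, but it is a quick way to catch a misread hypothesis: every pair collected in `counterexamples` has a non-cumulative second component.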

To complete the comparison between (t1)-(t3) and the properties of axiomatic varieties, we turn to the search for an analogue of (t2) for nonmonotonic inference operations. There is a well-known correspondence between consequence operations on $\mathcal{L}$ and closure systems on $L$ (a family $Th \subseteq 2^L$ is said to be a closure system if it contains $L$ and is closed under arbitrary intersections). For every consequence operation $C$, $Th(C)$ is a closure system. Conversely, for every closure system $Th$ on $L$ the operation $C$ defined by (t2) (with '$Th(C)$' replaced by '$Th$') is a consequence operation such that $Th(C) = Th$. Hence, (t2) is at the heart of the correspondence between consequence operations and closure systems. Looking for a nonmonotonic analogue of (t2), we should concentrate on possible relationships between inference operations and partitions of $2^L$. On the one hand, every inference operation $C$ determines the partition of $2^L$ into axiomatic varieties for its theories. In addition, every axiomatic variety has the greatest element (with respect to inclusion). On the other hand, for every partition $\Theta$ of $2^L$ (such that every element $V$ of this partition has the greatest element $T_V$) the operation $C : 2^L \to 2^L$ defined by

(t2*) $C(X) = T_V$ iff $X \in V$

is an inference operation such that $\Theta = \Theta_C$. It seems therefore correct to accept (t2*) as an analogue of (t2) for nonmonotonic inference operations.
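The construction (t2*) is easy to carry out on finite data. The following sketch is a minimal illustration: the two-formula set `L` and the partition `PARTITION` are hypothetical, and the helper names are introduced here only for the example.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

fs = frozenset

# Hypothetical two-element stand-in for the set L of formulas.
L = fs({"a", "b"})
SUBSETS = powerset(L)

# Hypothetical partition of 2^L; each block contains its own union, so the
# union is the greatest element T_V of the block.
PARTITION = [
    {fs(), fs({"a"})},      # T_V = {a}
    {fs({"b"})},            # T_V = {b}
    {fs({"a", "b"})},       # T_V = {a, b}
]

def greatest(block):
    """Return the greatest element of a block, or fail if there is none."""
    top = fs().union(*block)
    if top not in block:
        raise ValueError("block has no greatest element")
    return top

def is_partition(blocks):
    members = [X for block in blocks for X in block]
    return len(members) == len(set(members)) and set(members) == set(SUBSETS)

def C(X):
    """(t2*): C(X) is the greatest element of the block containing X."""
    for block in PARTITION:
        if X in block:
            return greatest(block)
    raise KeyError(X)

# The partition induced by C (the equivalence classes of Theta_C).
recovered = {fs(Y for Y in SUBSETS if C(Y) == C(X)) for X in SUBSETS}

print(is_partition(PARTITION))
print(all(X <= C(X) and C(C(X)) == C(X) for X in SUBSETS))
print(recovered == {fs(block) for block in PARTITION})
```

In this example `C` is nonmonotonic (it infers `a` from the empty set but not from `{b}`), and the partition recovered from `C` coincides with the one we started from.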

5. Axiomatic Varieties - Syntactic Criteria

In the light of Theorem 1, it is natural to expect that interesting properties of nonmonotonic inference systems can be expressed in terms of properties of axiomatic varieties. In this section we state syntactic criteria for three such properties: cumulativity, loop-cumulativity, and cut. We begin with the class of cumulative inference systems. Let $C$ be an inference operation. We say that $\Theta_C$ is convex if the following condition is satisfied: for every $X, Y, Z \subseteq L$, if $X \,\Theta_C\, Z$ and $X \subseteq Y \subseteq Z$, then $X \,\Theta_C\, Y$.

THEOREM 4 (Fiałkowska and Fiałkowski, 1990). An inference operation $C$ is cumulative iff $\Theta_C$ is convex.*

Proof. First, let us assume that $C$ is cumulative. If $X \subseteq Y \subseteq Z \subseteq L$ and $X \,\Theta_C\, Z$, then, by inclusion,

$X \subseteq Y \subseteq Z \subseteq C(Z) = C(X)$.

By cumulativity, we immediately conclude that $C(X) = C(Y)$ or that $X \,\Theta_C\, Y$. Conversely, let $X, Y \subseteq L$ be such that $X \subseteq Y \subseteq C(X)$. Since $X \,\Theta_C\, C(X)$ (by idempotence) and since $\Theta_C$ is convex, we immediately conclude that $X \,\Theta_C\, Y$. Hence, $C(X) = C(Y)$, as required. □

According to Theorem 4, cumulativity corresponds to a simple set-theoretic property of axiomatic varieties:

if $X$ and $Z$ are members of an axiomatic variety $V(T)$ and if $X \subseteq Y \subseteq Z$, then $Y$ is also a member of $V(T)$.
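The convexity criterion can be tested directly on finite examples. In the sketch below, which is illustrative only, `C_default` and `C_jump` are hypothetical inference operations on an assumed three-formula set; the first is cumulative, the second is not.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical finite stand-in for the set L of formulas.
L = frozenset({"a", "b", "c"})
SUBSETS = powerset(L)

def C_default(X):
    """No premises yield the default 'a'; otherwise nothing new is inferred."""
    return frozenset({"a"}) if not X else frozenset(X)

def C_jump(X):
    """No premises yield everything; otherwise nothing new is inferred."""
    return L if not X else frozenset(X)

def cumulative(op):
    """(c4): X <= Y <= op(X) implies op(Y) = op(X)."""
    return all(op(Y) == op(X)
               for X in SUBSETS for Y in SUBSETS if X <= Y <= op(X))

def convex(op):
    """If X and Z are equivalent and X <= Y <= Z, then Y is equivalent to X."""
    return all(op(X) == op(Y)
               for X in SUBSETS for Y in SUBSETS for Z in SUBSETS
               if op(X) == op(Z) and X <= Y <= Z)

for name, op in [("C_default", C_default), ("C_jump", C_jump)]:
    print(name, cumulative(op), convex(op))
```

For `C_jump` the empty set and `L` lie in the same axiomatic variety while `{a}`, which sits between them, does not; this is the same witness that refutes cumulativity.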

A similar result holds for the inference operations which satisfy cut. Using the fact that for every $X \subseteq L$, $C(X) = \bigcup [X]$, where $[X]$ denotes the equivalence class of $X$ modulo $\Theta_C$, we immediately obtain:

PROPOSITION 5. An inference operation $C$ on $\mathcal{L}$ satisfies cut iff for every $X, Y, T \subseteq L$, if $X \,\Theta_C\, T$ and $X \subseteq Y \subseteq T$, then $\bigcup [Y] \subseteq \bigcup [X]$.

This proposition, expressed in terms of axiomatic varieties, assumes the form:

if $X$ and $Z$ are members of an axiomatic variety $V(T)$ and if $X \subseteq Y \subseteq Z$, then every member of the axiomatic variety of $C(Y)$ is a subset of $T$.

* Theorem 4, although not explicitly stated, is proved in Fiałkowska and Fiałkowski (1990, pp. 80-82).

Loop-cumulativity, a property discussed e.g. in Kraus et al. (1990), Makinson (1994), and Stachniak (1993), is yet another property of inference systems that can be expressed in terms of the relation $\Theta_C$. An inference operation $C$ is loop-cumulative iff for every $X_0, \ldots, X_n \subseteq L$,

$X_0 \subseteq C(X_1), X_1 \subseteq C(X_2), \ldots, X_{n-1} \subseteq C(X_n), X_n \subseteq C(X_0)$

implies $C(X_0) = C(X_1)$.
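On a finite set of formulas, loop-cumulativity reduces to a reachability question: draw an edge from $X$ to $Y$ whenever $X \subseteq C(Y)$, and require $C(X_0) = C(X_1)$ for every edge from $X_0$ to $X_1$ that lies on a cycle. The sketch below follows this reading; the universe and the two operations are hypothetical, and `C_rotate` was constructed for this illustration as an operation that is cumulative without being loop-cumulative.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

fs = frozenset

# Hypothetical finite stand-in for the set L of formulas.
L = fs({"a", "b", "c"})
SUBSETS = powerset(L)

def C_default(X):
    """No premises yield the default 'a'; otherwise nothing new is inferred."""
    return fs({"a"}) if not X else fs(X)

# Hypothetical operation: each singleton infers one further formula, in a cycle.
ROTATE = {fs({"a"}): fs({"a", "c"}),
          fs({"b"}): fs({"a", "b"}),
          fs({"c"}): fs({"b", "c"})}

def C_rotate(X):
    return ROTATE.get(X, X)

def cumulative(op):
    return all(op(Y) == op(X)
               for X in SUBSETS for Y in SUBSETS if X <= Y <= op(X))

def loop_cumulative(op):
    # reach[X, Y] starts as the edge relation "X is a subset of op(Y)".
    reach = {(X, Y): X <= op(Y) for X in SUBSETS for Y in SUBSETS}
    for K in SUBSETS:                 # Warshall-style transitive closure
        for X in SUBSETS:
            for Y in SUBSETS:
                if reach[X, K] and reach[K, Y]:
                    reach[X, Y] = True
    # An edge X0 -> X1 closes a loop when X0 is reachable from X1.
    return all(op(X0) == op(X1)
               for X0 in SUBSETS for X1 in SUBSETS
               if X0 <= op(X1) and reach[X1, X0])

for name, op in [("C_default", C_default), ("C_rotate", C_rotate)]:
    print(name, cumulative(op), loop_cumulative(op))
```

The loop through `{a}`, `{b}` and `{c}` is what separates the two properties for `C_rotate`: each step of the loop is harmless for cumulativity, but the three inferred sets are pairwise different.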

THEOREM 6. An inference operation $C$ on $\mathcal{L}$ is loop-cumulative iff for every $X_0, \ldots, X_n \subseteq L$, $n \geq 1$,

$X_0 \sim X_1 \sim X_2 \sim \ldots \sim X_{n-1} \sim X_n \,\Theta_C\, X_0$ implies $X_0 \,\Theta_C\, X_1$,

where every occurrence of '$\sim$' is one of '$\subseteq$' or '$\Theta_C$'.

Proof. Let $C$ be a loop-cumulative inference operation on $\mathcal{L}$ and let $X_0, \ldots, X_n \subseteq L$ be such that

(a) $X_0 \sim X_1 \sim X_2 \sim \ldots \sim X_{n-1} \sim X_n \,\Theta_C\, X_0$.

To show that $X_0 \,\Theta_C\, X_1$, let us assume first that (a) does not contain 'protracted' subexpressions, i.e., subexpressions of the form '$X_i \,\Theta_C\, X_{i+1} \,\Theta_C\, X_{i+2}$' or '$X_i \subseteq X_{i+1} \subseteq X_{i+2}$'. In other words, we assume that (a) has the form

$X_0 \subseteq X_1 \,\Theta_C\, X_2 \subseteq \ldots \,\Theta_C\, X_{n-1} \subseteq X_n \,\Theta_C\, X_0$.

For every even $i$, $0 \leq i < n - 1$, we have $X_i \subseteq X_{i+1} \,\Theta_C\, X_{i+2}$. So, by inclusion and the assumption that $X_{i+1} \,\Theta_C\, X_{i+2}$, we conclude that

(b) $X_i \subseteq C(X_{i+2})$.

In the same way we obtain

(c) $X_{n-1} \subseteq C(X_0)$.

Since $C$ is loop-cumulative, (b) and (c) imply $X_0 \,\Theta_C\, X_1$, as required.

Next, let us assume that (a) contains a protracted subexpression $X_i \sim X_{i+1} \sim X_{i+2}$. Clearly, (a) remains true if '$X_{i+1} \sim$' is deleted from it. Hence, after finitely many such deletions, (a) will be free from protracted subexpressions. If $X_1$ has not been deleted from (a), then $X_0 \,\Theta_C\, X_1$. On the other hand, if $X_1$ has been removed, then (a) begins with

$X_0 \subseteq X_1 \subseteq \ldots \subseteq X_k \,\Theta_C\, X_{k+1}$,

and $X_0 \,\Theta_C\, X_k$. Since $X_0 \subseteq X_1$, by inclusion,

(d) $X_0 \subseteq C(X_1)$.

Since $X_1 \subseteq C(X_k)$ and since $X_0 \,\Theta_C\, X_k$, we also have

(e) $X_1 \subseteq C(X_0)$.

As $C$ is cumulative, (d) and (e) imply $X_0 \,\Theta_C\, X_1$, and the first part of the theorem is complete.

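For the other half, let $X_0, \ldots, X_n \subseteq L$ be such that

(f) $X_0 \subseteq C(X_1), X_1 \subseteq C(X_2), \ldots, X_{n-1} \subseteq C(X_n), X_n \subseteq C(X_0)$.

Since $C$ is idempotent, (f) can be transformed into

$X_0 \subseteq C(X_1) \,\Theta_C\, X_1 \subseteq C(X_2) \ldots X_{n-1} \subseteq C(X_n) \,\Theta_C\, X_n \subseteq C(X_0) \,\Theta_C\, X_0$.

By the hypothesis this means that $X_0 \,\Theta_C\, C(X_1) \,\Theta_C\, X_1$. Hence, $C(X_0) = C(X_1)$, as required. □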

6. Axiomatic Varieties - Semantic Considerations

In Sections 4 and 5, we investigated the connections between the properties of inference systems and the properties of their axiomatic varieties on the syntactic level. In this section we extend this study to the semantic level.

By (t4), the set of theories of a consequence operation $C$ can be used to construct a model (i.e. the Lindenbaum matrix) that defines $C$. As (t1) is not generally valid for nonmonotonic inference systems, we cannot use Lindenbaum matrices for the semantic definition of these systems. It turns out that to find a counterpart of (t4) for nonmonotonic inference operations, we do not need to look very far. Roughly speaking, it suffices to replace $Th(C)$ with the set of all axiomatic varieties for theories of $C$ in the definition of the Lindenbaum matrix. Let us begin this semantic program with the following definition of a model. A model for $\mathcal{L}$ is a triple

$M = \langle \mathcal{A}, \mathcal{T}, \mathcal{H} \rangle$.

The algebra $\mathcal{A}$ and the set $\mathcal{H}$ of valuations are the same as in the definition of a logical matrix (see Section 3). What makes the notion of a model different from the notion of a matrix is the second element $\mathcal{T}$. $\mathcal{T}$ is a partition of the set of all subsets of $A$. We assume that every member $V$ of $\mathcal{T}$ has the greatest element with respect to inclusion (i.e., for every $V \in \mathcal{T}$ there exists $d_V \in V$ such that $X \subseteq d_V$, for every $X \in V$). We denote the greatest element of $V$ by $d_V$. Intuitively, the partition $\mathcal{T}$ is the semantic counterpart of the set of axiomatic varieties.

Every model $M$ for $\mathcal{L}$ defines an operation $C_M$ on $\mathcal{L}$ in the following way: for every $X \subseteq L$ and every formula $\alpha$,

$\alpha \in C_M(X)$ iff for every $V \in \mathcal{T}$ and every valuation $h \in \mathcal{H}$, if $h(X) \in V$, then $h(\alpha) \in d_V$.


EXAMPLE A. Let $C$ be an inference operation on $\mathcal{L}$, let $\mathcal{V}(C) = \{V(T) : T \in Th(C)\}$, and let $id$ be the identity function on the set $L$ of formulas of $\mathcal{L}$. Then

$\langle \mathcal{L}, \mathcal{V}(C), \{id\} \rangle$

is a model for $\mathcal{L}$. Indeed, $\mathcal{V}(C)$ forms a partition of the family of all subsets of $L$, and for every $V(T) \in \mathcal{V}(C)$, $T$ is the greatest element of $V(T)$. By analogy with the construction of the Lindenbaum matrix, we call this model the Lindenbaum model for $C$. □

THEOREM 7. For every inference operation $C$ on $\mathcal{L}$, the Lindenbaum model $M = \langle \mathcal{L}, \mathcal{V}(C), \{id\} \rangle$ defines $C$, i.e., $C = C_M$.

Proof. Let $C$ be an inference operation and let $X \cup \{\alpha\} \subseteq L$. First, let us assume that $\alpha \in C(X)$. To show that $\alpha \in C_M(X)$, let $V \in \mathcal{V}(C)$ be such that $X \in V$. Since $d_V = C(X)$, we have $\alpha \in d_V$, as required. If $\alpha \notin C(X)$, then by letting $V = V(C(X))$ we have $X \in V$ while $\alpha \notin d_V$. This shows that $\alpha \notin C_M(X)$. □
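The Lindenbaum model can be built explicitly when the set of formulas is finite. The following sketch is an illustration under assumed data: the three-formula set `L` and the toy operation `C` are hypothetical, the only valuation is the identity, and `lindenbaum_consequence` applies the definition of $C_M$ literally.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical finite stand-in for the set L of formulas.
L = frozenset({"a", "b", "c"})
SUBSETS = powerset(L)

def C(X):
    """Toy inference operation: no premises yield the default 'a'."""
    return frozenset({"a"}) if not X else frozenset(X)

def axiomatic_varieties(op):
    """Map every theory T of op to its axiomatic variety V(T)."""
    varieties = {}
    for X in SUBSETS:
        varieties.setdefault(op(X), set()).add(X)
    return varieties

def lindenbaum_consequence(op):
    """C_M for the model <L, V(op), {id}>; the only valuation is the identity."""
    varieties = axiomatic_varieties(op)

    def C_M(X):
        # alpha is inferred iff it belongs to d_V = T for every block V(T)
        # that contains id(X) = X.
        return frozenset(alpha for alpha in L
                         if all(alpha in T
                                for T, V in varieties.items() if X in V))
    return C_M

C_M = lindenbaum_consequence(C)
VARIETIES = axiomatic_varieties(C)

print(all(C_M(X) == C(X) for X in SUBSETS))
print(all(T in V and all(X <= T for X in V) for T, V in VARIETIES.items()))
```

The second check confirms, for this example, that every theory `T` is the greatest element of its own variety, which is the property the definition of a model requires of $d_V$.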

Theorem 7 is a nonmonotonic counterpart of (t4).

A logical matrix $\langle \mathcal{A}, \mathcal{D}, \mathcal{H} \rangle$ is a multiple-valued model. This means that the elements of the algebra $\mathcal{A}$ are truth-values and that some of these truth-values are designated (as true). In a model $\langle \mathcal{A}, \mathcal{T}, \mathcal{H} \rangle$ the interpretation of the elements of $A$ can be more liberal. We can, of course, view them as truth-values, but we can also view them as propositions describing (or semantically representing) a certain class $\{S_0, S_1, S_2, \ldots\}$ of situations. If we identify every situation with the set of propositions that describes it, then every $S_i$ is a subset of $A$. Frequently, to identify a situation $S_i$, it is not necessary to know all the propositions constituting $S_i$; in many cases knowing a certain subset of $S_i$ suffices. This leads us to the idea of associating with every situation $S_i$ a family $V_i$ of all 'typical' patterns that can be separately used to identify $S_i$. Every such pattern is a subset of $A$. Therefore, if we learn $X \subseteq A$ and that $X \in V_i$, then we conclude that $X$ is a pattern for $S_i$, and, hence, that every proposition of $S_i$ is true. It is reasonable to assume that $S_i$ is a pattern for itself and that $S_i$ is a 'complete' description, i.e., that

(i) $S_i$ is the greatest element of $V_i$ with respect to $\subseteq$.

Let us further assume that every subset of $A$ is a pattern for some situation $S_i$. Clearly, we do not want any set $X \subseteq A$ to be a pattern for two different situations, which can be formally expressed by the sentence

for every $X \subseteq A$, if $X \in V_i$ and $X \in V_j$, then $i = j$.

In other words, we require that

(ii) all the sets $V_0, V_1, V_2, \ldots$ form a partition of the family $2^A$ of all subsets of $A$.


The conditions (i) and (ii) correspond exactly to the conditions imposed on the family $\mathcal{T}$ of a model $\langle \mathcal{A}, \mathcal{T}, \mathcal{H} \rangle$.

The idea which is shared by many semantic proposals for modeling of nonmonotonic inferences is that of preference. The verification of the validity of an inference

$\alpha \in C(X)$

proceeds by verifying that $\alpha$ is true in every minimal (or minimally abnormal, most preferred, etc.) model (or world, interpretation, etc.) of $X$ (cf. Brown and Shoham (1988), Kraus et al. (1990), Makinson (1988, 1994) and Stachniak (1993)). The semantic proposal discussed in this paper is 'preference-free'; the relation of preference, or minimal abnormality, is replaced by the 'pattern-of-situation' relation. The validity of $\alpha \in C(X)$ requires that for every admissible interpretation $h$, $\alpha$ is a part of the description of the situation $d$ represented by the pattern $X$, i.e., that $h(\alpha) \in d$. We shall demonstrate that this notion of an inference is sufficiently expressive to capture a wide class of nonmonotonic systems (cf. Theorems 8, 9, 10, 12). First, an example.

EXAMPLE B. Let us suppose that $\mathcal{L}$ has the following connectives: $\neg$ (negation), $\lor$ (disjunction), $\land$ (conjunction), and $\to$ (implication). The model $M = \langle \mathcal{A}, \mathcal{T}, \mathcal{H} \rangle$ which we define in this example resembles the three-valued Łukasiewicz matrix on $\mathcal{L}$ (cf. Malinowski (1993)):

- $\mathcal{A} = \langle \{0, 1, 2\}, \underline{\neg}, \underline{\lor}, \underline{\land}, \underline{\to} \rangle$, where the operations of $\mathcal{A}$ are defined as in the three-valued Łukasiewicz matrix, i.e., $\underline{\neg}(x) = 2 - x$; $x \,\underline{\lor}\, y = \max(x, y)$; $x \,\underline{\land}\, y = \min(x, y)$; $x \,\underline{\to}\, y = \min(2, 2 - x + y)$;

- $\mathcal{T}$ has four elements: $\{\emptyset, \{2\}\}$, $\{\{0\}, \{1\}, \{0, 1\}\}$, $\{\{1, 2\}\}$, $\{\{0, 2\}, \{0, 1, 2\}\}$;

- $\mathcal{H}$ consists of all valuations.

Informally speaking, the partition $\mathcal{T}$ represents four semantic patterns: truth (represented by $\{\emptyset, \{2\}\}$), contradiction (represented by $\{\{0, 2\}, \{0, 1, 2\}\}$), and two intermediate patterns, 'typically false' (represented by $\{\{0\}, \{1\}, \{0, 1\}\}$) and 'typically true' (represented by $\{\{1, 2\}\}$).

Let us consider three formulas:

($\alpha$) $\neg(p \to p)$, ($\beta$) $p \land \neg p$, ($\gamma$) $p \to p$.

Let us note that for every valuation $h$ of $\mathcal{L}$ into $\mathcal{A}$, $h(\alpha) = 0$, $h(\beta) \in \{0, 1\}$, and $h(\gamma) = 2$. Hence,

(a) $\alpha \in C_M(\{\beta\})$.

On the other hand,

(b) $\alpha \notin C_M(\{\beta, \gamma\})$.

Indeed, if $h$ is a valuation such that $h(p) = 1$, then $h(\{\beta, \gamma\}) = \{1, 2\}$. Since $\{\{1, 2\}\}$ is a member of the partition $\mathcal{T}$ and since $h(\alpha) = 0$, (b) is true. From (a) and (b) we conclude that $\langle \mathcal{L}, C_M \rangle$ is nonmonotonic. □
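The computations of Example B can be replayed mechanically. In the sketch below the encoding of formulas as nested tuples and the function names are assumptions made for the illustration; since the three formulas mention only the variable $p$, it suffices to range over the three possible values of $p$.

```python
from itertools import product

fs = frozenset

def evaluate(formula, h):
    """Evaluate a formula under valuation h in the three-valued algebra."""
    if isinstance(formula, str):          # a propositional variable
        return h[formula]
    op, *args = formula
    vals = [evaluate(arg, h) for arg in args]
    if op == "not":
        return 2 - vals[0]
    if op == "and":
        return min(vals)
    if op == "or":
        return max(vals)
    if op == "imp":
        return min(2, 2 - vals[0] + vals[1])
    raise ValueError(op)

# The partition T of Example B.
PARTITION = [
    {fs(), fs({2})},
    {fs({0}), fs({1}), fs({0, 1})},
    {fs({1, 2})},
    {fs({0, 2}), fs({0, 1, 2})},
]

def greatest(block):
    """d_V: the union of a block, which is its greatest element here."""
    return fs().union(*block)

def infers(X, formula, variables=("p",)):
    """Check whether formula is in C_M(X), over all valuations of `variables`."""
    for values in product((0, 1, 2), repeat=len(variables)):
        h = dict(zip(variables, values))
        image = fs(evaluate(f, h) for f in X)
        block = next(V for V in PARTITION if image in V)
        if evaluate(formula, h) not in greatest(block):
            return False
    return True

alpha = ("not", ("imp", "p", "p"))
beta = ("and", "p", ("not", "p"))
gamma = ("imp", "p", "p")

print(infers([beta], alpha))          # claim (a)
print(infers([beta, gamma], alpha))   # claim (b)
```

The two printed values correspond to claims (a) and (b): adding the premise $\gamma$ moves the image of the premises from the 'typically false' pattern to a pattern whose greatest element no longer contains the value of $\alpha$.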

Let $M$ be a model. Although the operation $C_M$ always satisfies the inclusion condition (c1), $C_M$ is not always idempotent and, hence, $C_M$ may not be an inference operation. The following simple example illustrates the failure of idempotence.

EXAMPLE C. Let $\mathcal{L}$ be a propositional language with only one logical connective, say $\neg$. We assume that $\neg$ is unary. Let $M = \langle \mathcal{A}, \mathcal{T}, \mathcal{H} \rangle$ be the following model for $\mathcal{L}$:

- $\mathcal{A} = \langle \{0, 1, 2\}, \underline{\neg} \rangle$, where for every $z \in \{0, 1, 2\}$, $\underline{\neg}(z) = 2 - z$;

- $\mathcal{T} = \{V_0, V_1\}$, where $V_0 = \{\emptyset, \{0, 1\}\}$, $V_1 = \{X \subseteq \{0, 1, 2\} : X \notin V_0\}$;

- $\mathcal{H} = \{h\}$, where for every variable $p$, $h(p) = 0$.

It can be easily verified that $h(C_M(\emptyset)) = \{0\}$ and that $\neg(p) \in C_M(C_M(\emptyset))$ while $\neg(p) \notin C_M(\emptyset)$. In this example, $C_M$ fails to satisfy (c2) since $h(\emptyset)$ and $h(C_M(\emptyset))$ are 'patterns' of different situations, i.e., $h(\emptyset) \in V_0$ while $h(C_M(\emptyset)) \in V_1$. □
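The failure of idempotence in Example C can be reproduced on a finite fragment of the language. The sketch below is an illustration only; it assumes a fragment with two variables and at most three negations, and encodes a formula as a pair (variable, number of negations). Restricting to the fragment does not change the image $h(C_M(\emptyset)) = \{0\}$, so the computation agrees with the example on the formulas it covers.

```python
fs = frozenset

# The block V0 of Example C; V1 consists of all other subsets of {0, 1, 2}.
V0 = {fs(), fs({0, 1})}

def block_top(image):
    """Greatest element d_V of the block of T that contains `image`."""
    return fs({0, 1}) if image in V0 else fs({0, 1, 2})

def h(formula):
    """The single valuation: variables get 0; each negation maps z to 2 - z."""
    _, negations = formula
    value = 0
    for _ in range(negations):
        value = 2 - value
    return value

# Assumption: a finite fragment of the set of formulas,
# encoded as (variable, number of negations).
FORMULAS = [(v, n) for v in ("p", "q") for n in range(4)]

def C_M(X):
    """C_M restricted to the fragment: keep formulas whose value lies in d_V."""
    image = fs(h(f) for f in X)
    top = block_top(image)
    return fs(f for f in FORMULAS if h(f) in top)

once = C_M(fs())
twice = C_M(once)
not_p = ("p", 1)

print(fs(h(f) for f in once))          # image of C_M(empty set)
print(not_p in once, not_p in twice)
```

On this fragment the first application of `C_M` keeps only formulas with an even number of negations, while the second application admits every formula, so `once` and `twice` differ.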

To derive a notion of a model that would provide the representation theorem for inference systems with desired properties, we can, for instance, require the partition $\mathcal{T}$ of a model $M$ to satisfy some additional properties. In the following theorems we do just that. First, we consider the class of cumulative inference systems (cf. (c4)). Following the syntactic criterion for cumulativity stated in Theorem 4, we call a model $\langle \mathcal{A}, \mathcal{T}, \mathcal{H} \rangle$ cumulative if and only if

for every $V \in \mathcal{T}$ and every $X, Z \in V$, if for some $Y$, $X \subseteq Y \subseteq Z$, then $Y \in V$.

THEOREM 8 (Representation Theorem for Cumulative Inference Systems). An operation $C : 2^L \to 2^L$ is a cumulative inference operation iff $C = C_M$, for some cumulative model $M$.

Proof The only if part of this theorem follows directly from Theorems 4 and 7 and will not be presented here.

Let $M = \langle \mathcal{A}, \mathcal{T}, \mathcal{H} \rangle$ be a cumulative model. Since $C_M$ always satisfies inclusion and since cumulativity implies idempotence, it suffices to show that $C_M$ is cumulative. To this end, let $X, Y \subseteq L$ satisfy $X \subseteq Y \subseteq C_M(X)$. To show that $C_M(Y) = C_M(X)$, we prove first that

(a) for every valuation $h \in \mathcal{H}$, $h(X)$ and $h(Y)$ belong to the same member of the partition $\mathcal{T}$.

To show (a), let $h \in \mathcal{H}$ be arbitrarily selected and let $V \in \mathcal{T}$ be such that $h(X) \in V$. Hence, we also have $h(C_M(X)) \subseteq d_V$ which, by the hypothesis, gives us

$h(X) \subseteq h(Y) \subseteq h(C_M(X)) \subseteq d_V$.

Since $M$ is cumulative, $h(Y) \in V$. This shows (a). Now, we use (a) and Theorem 13, which is proved at the end of this section, to conclude that $C_M(X) = C_M(Y)$. □

Theorem 8 says that the class of cumulative inference systems coincides with the class of inference systems defined by cumulative models. The model $M$ described in Example B is cumulative. Hence, by Theorem 8, the inference operation defined by $M$ is cumulative. Theorem 8 gives us a 'non-preferential' semantics for cumulative inference systems, as the definition of $C_M$ does not refer to any concept of preference used, for instance, in stoppered preferential model structures or stoppered preferential matrices which are known to characterize cumulativity (cf. Makinson (1994) and Stachniak (1993)).
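Whether a finite partition satisfies the condition defining a cumulative model can be checked by enumeration. The sketch below is an illustration: it tests the partition of Example B and, for contrast, the partition of Example C, with helper names chosen only for this example.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

fs = frozenset
SUBSETS = powerset({0, 1, 2})       # all subsets of the set A of truth-values

# The partition T of Example B.
PARTITION_B = [
    {fs(), fs({2})},
    {fs({0}), fs({1}), fs({0, 1})},
    {fs({1, 2})},
    {fs({0, 2}), fs({0, 1, 2})},
]

# The partition T = {V0, V1} of Example C.
V0 = {fs(), fs({0, 1})}
PARTITION_C = [V0, {X for X in SUBSETS if X not in V0}]

def is_cumulative_model(partition):
    """Every Y lying between two members X, Z of a block belongs to that block."""
    return all(Y in V
               for V in partition for X in V for Z in V
               for Y in SUBSETS if X <= Y <= Z)

print(is_cumulative_model(PARTITION_B))
print(is_cumulative_model(PARTITION_C))
```

For the partition of Example C the check fails because the set `{0}` lies between the empty set and `{0, 1}` without belonging to their block.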

The semantic criterion for loop-cumulativity can be obtained by imposing a restriction on $T$ similar to the definiens of the definition of loop-cumulativity. Namely, we say that a model $\langle \mathcal{A}, T, \mathcal{H} \rangle$ is loop-cumulative if and only if

for every $V_0, \ldots, V_n \in T$ and every $X_i \in V_i$, $i \le n$,

if $X_0 \subseteq d_{V_1}, X_1 \subseteq d_{V_2}, \ldots, X_{n-1} \subseteq d_{V_n}, X_n \subseteq d_{V_0}$, then $V_0 = V_1$.

THEOREM 9 (Representation Theorem for Loop-Cumulative Inference Systems). An operation $C : 2^L \to 2^L$ is a loop-cumulative inference operation iff $C = C_M$, for some loop-cumulative model $M$.

Proof. The only-if part of this theorem follows directly from the definition of loop-cumulativity and Theorem 7, and will not be presented here.

Let $M = \langle \mathcal{A}, T, \mathcal{H} \rangle$ be a loop-cumulative model defining $C$. First, we show that $C_M$ is loop-cumulative. Let $X_0, \ldots, X_n \subseteq L$ be such that

$X_0 \subseteq C_M(X_1), X_1 \subseteq C_M(X_2), \ldots, X_{n-1} \subseteq C_M(X_n), X_n \subseteq C_M(X_0)$.

We claim that

(a) for every valuation $h \in \mathcal{H}$, $h(X_0)$ and $h(X_1)$ belong to the same member of the partition $T$.

To show (a), let us select $h \in \mathcal{H}$. For every $i \le n$, let $V_i \in T$ be such that $h(X_i) \in V_i$. Hence, we have

(b) $h(C_M(X_i)) \subseteq d_{V_i}$, all $i \le n$.


The hypothesis and (b) give us

$h(X_0) \subseteq d_{V_1}, h(X_1) \subseteq d_{V_2}, \ldots, h(X_{n-1}) \subseteq d_{V_n}, h(X_n) \subseteq d_{V_0}$,

which, in the light of loop-cumulativity of $M$, gives us $V_0 = V_1$. Since $T$ is a partition, (a) is true. It is not difficult to show that $M$ is also a cumulative model. Hence, by Theorem 13, which we prove at the end of this section, (a) implies $C_M(X_0) = C_M(X_1)$, as required.

To show that $C_M$ is an inference operation, let us only note that for every model $M$, $C_M$ satisfies inclusion, that every loop-cumulative inference operation is cumulative, and, finally, that cumulativity implies idempotence. □

Theorem 9 says that the class of loop-cumulative inference systems coincides with the class of inference systems defined by loop-cumulative models. Again, this semantic criterion is 'non-preferential' in comparison with the semantic criteria for loop-cumulativity expressed in terms of preferential model structures (the transitivity of the preference relation, cf. Makinson (1994)) and preferential matrices (loop-free preference relation, cf. Stachniak (1993)).

We obtain a criterion for monotonicity using the following notion of a monotonic model. A model $\langle \mathcal{A}, T, \mathcal{H} \rangle$ is said to be monotonic if and only if

for every $V_0, V_1 \in T$, if for some $X_0 \in V_0$ and $X_1 \in V_1$, $X_0 \subseteq X_1$,

then $d_{V_0} \subseteq d_{V_1}$.

THEOREM 10 (Representation Theorem for Monotonic Inference Systems). An operation $C : 2^L \to 2^L$ is a consequence operation iff $C = C_M$, for some monotonic model $M$.

Proof. The only-if part of this theorem follows directly from Theorem 7 and the monotonicity of $C$, and will not be presented here.

Let $M = \langle \mathcal{A}, T, \mathcal{H} \rangle$ be a monotonic model defining $C$. First, we show that $C_M$ is an inference operation. To show the idempotence of $C_M$, let $X \cup \{\alpha\} \subseteq L$ be such that $\alpha \in C_M(C_M(X))$ and let us suppose that for some $h \in \mathcal{H}$ and some $V \in T$, $h(X) \in V$. Hence, $h(C_M(X)) \subseteq d_V$. If $h(C_M(X)) \in V_1$, then, by the monotonicity of $M$, we have $h(C_M(C_M(X))) \subseteq d_{V_1} \subseteq d_V$. Since $\alpha \in C_M(C_M(X))$, we must have $h(\alpha) \in d_V$, and the proof that $\alpha \in C_M(X)$ is finished.

To show that $C_M$ is monotonic, let $X, Y \subseteq L$ be such that $X \subseteq Y$. Let us suppose that $\alpha \in C_M(X)$ and that for some $h \in \mathcal{H}$ and some $V \in T$, $h(Y) \in V$. Let $V_1 \in T$ be such that $h(X) \in V_1$. Hence $h(\alpha) \in d_{V_1}$. Since $h(X) \subseteq h(Y)$ and since $M$ is monotonic, we have $d_{V_1} \subseteq d_V$, from which we get $h(\alpha) \in d_V$ and conclude that $\alpha \in C_M(Y)$. □
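The defining condition of a monotonic model can likewise be tested directly on a small partition. The sketch below assumes that $d_V$ is the union of the members of $V$ (and hence the largest member whenever it belongs to $V$); this reading is an assumption made for the illustration, and the names `d` and `is_monotonic` are hypothetical.

```python
def d(block):
    """Assumed reading of d_V: the union of all members of the block V."""
    return frozenset().union(*block)

def is_monotonic(partition):
    """If some X0 in V0 is included in some X1 in V1, require d_V0 <= d_V1."""
    for v0 in partition:
        for v1 in partition:
            linked = any(x0 <= x1 for x0 in v0 for x1 in v1)
            if linked and not d(v0) <= d(v1):
                return False
    return True

EMPTY = frozenset()
# Two partitions of the powerset of {0, 1}.
good = [{EMPTY, frozenset({1})}, {frozenset({0}), frozenset({0, 1})}]
# The empty set is included in {0}, yet d of its block, {0, 1}, is not.
bad = [{EMPTY, frozenset({0, 1})}, {frozenset({0})}, {frozenset({1})}]

print(is_monotonic(good))
print(is_monotonic(bad))
```

In the second partition the inclusion of the empty set in $\{0\}$ links the first block to the second, but the union of the first block is not included in $\{0\}$, so the test fails.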


The definition of the inference operation defined by a monotonic model can be simplified.

PROPOSITION 11. Let $M = \langle \mathcal{A}, T, \mathcal{H} \rangle$ be a monotonic model for $\mathcal{L}$ and let $X \cup \{\alpha\} \subseteq L$. Then the following formulas are equivalent:

(i) for every $h \in \mathcal{H}$ and every $V \in T$, if $h(X) \in V$, then $h(\alpha) \in d_V$;

(ii) for every $h \in \mathcal{H}$ and every $V \in T$, if $h(X) \subseteq d_V$, then $h(\alpha) \in d_V$.

Proof. Let $M$, $X$, and $\alpha$ be as stated. Since for every $h \in \mathcal{H}$ and every $V \in T$, $h(X) \in V$ implies $h(X) \subseteq d_V$, (ii) implies (i). To show that (ii) follows from (i), let $h \in \mathcal{H}$ and $V \in T$ be such that $h(X) \subseteq d_V$. Moreover, let $V_1 \in T$ be such that $h(X) \in V_1$. So, by (i), $h(\alpha) \in d_{V_1}$. Since $M$ is monotonic, $d_{V_1} \subseteq d_V$ and, hence, $h(\alpha) \in d_V$, as required. □

By Proposition 11, the definiens of the definition of the inference operation determined by a monotonic model (i.e., condition (i) of Proposition 11) can be replaced by the simpler condition (ii), in which any reference to elements of $V$ other than $d_V$ is removed (i.e., the test '$h(X) \subseteq d_V$' is used instead of '$h(X) \in V$'). This means that every monotonic model $\langle \mathcal{A}, T, \mathcal{H} \rangle$ can be viewed as a logical matrix $\langle \mathcal{A}, \mathcal{D}, \mathcal{H} \rangle$, where $\mathcal{D} = \{d_V : V \in T\}$. In other words, we have arrived at the notion of a logical matrix discussed in Section 3. Let us illustrate this part of our discussion with the following example.

EXAMPLE D. Let $\mathcal{L}$ be as in Example B. Let $\mathcal{A}_2 = \langle \{0, 1\}, \neg, \vee, \wedge, \to \rangle$ be the two-element algebra of truth-values (with 0 denoting falsehood and 1 denoting truth) and with the usual (Boolean) definitions of the operations $\neg, \vee, \wedge, \to$. Finally, let

$M_2 = \langle \mathcal{A}_2, \{\{\emptyset, \{1\}\}, \{\{0\}, \{0, 1\}\}\}, \mathcal{H} \rangle$

be the model for $\mathcal{L}$ where $\mathcal{H}$ consists of all valuations. Clearly, $M_2$ is monotonic. Using Proposition 11 (ii) to define the inference operation associated with $M_2$, and the fact that $h(\alpha) \in \{0, 1\}$ is true for every $\alpha \in L$ and every valuation $h$, we obtain the equivalence

$\alpha \in C_{M_2}(X)$ iff for every valuation $h \in \mathcal{H}$, if $h(X) \subseteq \{1\}$, then $h(\alpha) = 1$,

which is the well-known semantic definition of the consequence operation of classical logic (cf. Wójcicki (1988)). Hence, the model $M_2$ corresponds to the matrix

$\langle \mathcal{A}_2, \{\{1\}\}, \mathcal{H} \rangle$. □
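The equivalence stated in Example D can be spot-checked on a finite fragment of the language. The following is a minimal sketch, assuming that formulas are encoded as nested tuples over two propositional variables, that $h(X)$ is the set of values of the members of $X$, and that the partition has the two blocks $\{\emptyset, \{1\}\}$ and $\{\{0\}, \{0, 1\}\}$ with $d_V$ equal to $\{1\}$ and $\{0, 1\}$, respectively. The encoding and the names `value`, `in_c_m`, `classical`, and `FORMULAS` are hypothetical and serve only this illustration.

```python
from itertools import combinations, product

def value(f, h):
    """Evaluate a formula under assignment h (variables are strings)."""
    if isinstance(f, str):
        return h[f]
    op, args = f[0], [value(a, h) for a in f[1:]]
    if op == "not":
        return 1 - args[0]
    if op == "and":
        return min(args)
    if op == "or":
        return max(args)
    if op == "imp":
        return max(1 - args[0], args[1])
    raise ValueError(f"unknown connective: {op}")

VARS = ("p", "q")
H = [dict(zip(VARS, bits)) for bits in product((0, 1), repeat=len(VARS))]

EMPTY = frozenset()
# The two blocks of the partition, each paired with its assumed set d_V.
T = [({EMPTY, frozenset({1})}, frozenset({1})),
     ({frozenset({0}), frozenset({0, 1})}, frozenset({0, 1}))]

def in_c_m(alpha, X):
    """Condition (i): for all h and V, h(X) in V implies h(alpha) in d_V."""
    for h in H:
        image = frozenset(value(b, h) for b in X)
        for block, d_v in T:
            if image in block and value(alpha, h) not in d_v:
                return False
    return True

def classical(alpha, X):
    """alpha is true under every valuation that makes all of X true."""
    return all(value(alpha, h) == 1
               for h in H if all(value(b, h) == 1 for b in X))

FORMULAS = ["p", "q", ("not", "p"), ("and", "p", "q"),
            ("or", "p", ("not", "p")), ("imp", "p", "q")]
agree = all(in_c_m(a, X) == classical(a, X)
            for r in range(3) for X in combinations(FORMULAS, r)
            for a in FORMULAS)
# prints: True False True
print(in_c_m("q", ["p", ("imp", "p", "q")]), in_c_m("q", ["p"]), agree)
```

On this fragment the model-based test and the classical truth-table test give the same answers, as the example predicts; the check covers only the listed formulas and premise sets of size at most two.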

Lindenbaum models can be used to obtain semantic characterizations of properties of inference systems weaker than cumulativity. For example, the representation theorem for inference systems with cut (cf. (c3)) can be obtained using models $M = \langle \mathcal{A}, T, \mathcal{H} \rangle$ which satisfy the following semantic analogue of cut:

for every $V_0, V_1 \in T$, if for some $X_0 \in V_0$ and $X_1 \in V_1$, $X_0 \subseteq X_1 \subseteq d_{V_0}$,

then $d_{V_1} \subseteq d_{V_0}$.

We call such models models with cut.

THEOREM 12 (Representation Theorem for Inference Systems with Cut). An operation $C : 2^L \to 2^L$ satisfies cut iff $C = C_M$, for some model $M$ with cut.

Proof. The only-if part of this theorem follows directly from Theorem 7 and the fact that $C$ satisfies (c3). The proof is left to the reader.

Let $M = \langle \mathcal{A}, T, \mathcal{H} \rangle$ be a model with cut defining $C$. First, we show that $C_M$ satisfies cut. To this end, let $X \subseteq Y \subseteq C_M(X)$ and let $\alpha \in C_M(Y)$. Let us also assume that for some $V \in T$ and some $h \in \mathcal{H}$, $h(X) \in V$. We must show that $h(\alpha) \in d_V$. Since $h(X) \in V$ and $Y \subseteq C_M(X)$, we have $h(Y) \subseteq d_V$. Hence, (a) $h(X) \subseteq h(Y) \subseteq d_V$ and $h(X) \in V$. Let $V_1 \in T$ be such that $h(Y) \in V_1$. Since $M$ has cut, (a) implies (b) $d_{V_1} \subseteq d_V$. Since $h(C_M(Y)) \subseteq d_{V_1}$ and since $\alpha \in C_M(Y)$, we must have $h(\alpha) \in d_{V_1}$ which, by (b), gives us $h(\alpha) \in d_V$.

To show that $C_M$ is an inference operation, let us recall that for every model $M$, $C_M$ satisfies (c1). Hence, we also have $C_M(X) \subseteq C_M(C_M(X))$. The inclusion $C_M(C_M(X)) \subseteq C_M(X)$ follows immediately from (c1) and (c3). □

By Theorem 7 and Example C, the class $\mathcal{C}$ of all operations on $\mathcal{L}$ that can be defined by models for $\mathcal{L}$ properly includes the class of all inference operations on $\mathcal{L}$. The syntactic characterization of the class $\mathcal{C}$ is an open question.

We conclude our discussion by looking for analogues of (t5) and (t6) for models. Properties (t5) and (t6) provide the semantic description of theories of the consequence operation $C_M$ defined by a logical matrix $M$. If $M$ is a model, then the analogues of (t5) and (t6) should provide the semantic description of the axiomatic varieties for theories of $C_M$, or equivalently, should provide the semantic definition of the relation $\Theta_{C_M}$. For cumulative models, $\Theta_{C_M}$ can be semantically described in the following way.

THEOREM 13. Let $M = \langle \mathcal{A}, T, \mathcal{H} \rangle$ be a cumulative model for $\mathcal{L}$. For every $X, Y \subseteq L$, the following conditions are equivalent:

(i) $X \, \Theta_{C_M} \, Y$;

(ii) for every $h \in \mathcal{H}$ and every $V \in T$, $h(X) \in V$ iff $h(Y) \in V$.

Proof. Let $X, Y \subseteq L$. First, let us assume that (i) holds. Let $V \in T$ and let $h \in \mathcal{H}$ be arbitrarily selected. Since $T$ is a partition, to prove (ii) it suffices to justify that $h(X)$ and $h(Y)$ belong to the same member of $T$. To this end, if $h(X) \in V$, then $h(X) \subseteq h(C_M(X)) \subseteq d_V$ which, by cumulativity of $M$, gives us $h(C_M(X)) \in V$. By (i), we conclude that


(a) $h(C_M(Y)) \in V$.

Since $T$ is a partition of $2^A$, there exists $V^* \in T$ such that $h(Y) \in V^*$. Since $M$ is cumulative and since $h(Y) \subseteq h(C_M(Y)) \subseteq d_{V^*}$, we also have (b) $h(C_M(Y)) \in V^*$. Now, (a), (b), and the fact that $T$ is a partition give us $V = V^*$. Hence, $h(Y) \in V$, as required.

Now, let us assume that (ii) holds. If $\alpha \in C_M(X)$ and if for some $V \in T$ and some $h \in \mathcal{H}$, $h(Y) \in V$, then, by (ii), $h(X) \in V$. Since $\alpha \in C_M(X)$, we must have $h(\alpha) \in d_V$. This shows that $C_M(X) \subseteq C_M(Y)$. In the same way we also prove that $C_M(Y) \subseteq C_M(X)$. □

Since every loop-cumulative (or monotonic) model is cumulative, the criterion of Theorem 13 is also valid for the classes of loop-cumulative and monotonic models.
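The criterion of Theorem 13 can be illustrated on the two-valued model of Example D, which is monotonic and hence cumulative. The sketch below is written under several assumptions: formulas are nested tuples over the variables p and q, the partition and the sets $d_V$ are read as in Example D, and the closure is computed only relative to a fixed finite list `FORMULAS` from which the premise sets are also drawn. All names (`value`, `image`, `closure`, `same_blocks`) are hypothetical.

```python
from itertools import combinations, product

def value(f, h):
    """Evaluate a variable name, ("not", a), or ("imp", a, b) under h."""
    if isinstance(f, str):
        return h[f]
    if f[0] == "not":
        return 1 - value(f[1], h)
    if f[0] == "imp":
        return max(1 - value(f[1], h), value(f[2], h))
    raise ValueError(f"unknown connective: {f[0]}")

VARS = ("p", "q")
H = [dict(zip(VARS, bits)) for bits in product((0, 1), repeat=len(VARS))]
EMPTY = frozenset()
# Partition of the powerset of {0, 1}; each block is paired with its d_V.
T = [({EMPTY, frozenset({1})}, frozenset({1})),
     ({frozenset({0}), frozenset({0, 1})}, frozenset({0, 1}))]
FORMULAS = ["p", "q", ("not", "p"), ("imp", "p", "q"), ("not", ("not", "p"))]

def image(X, h):
    """h(X): the set of values taken by the members of X under h."""
    return frozenset(value(b, h) for b in X)

def closure(X):
    """C_M(X), restricted to the formulas listed in FORMULAS."""
    return {a for a in FORMULAS
            if all(value(a, h) in d_v
                   for h in H for block, d_v in T if image(X, h) in block)}

def same_blocks(X, Y):
    """Condition (ii): h(X) and h(Y) always lie in the same block of T."""
    return all((image(X, h) in block) == (image(Y, h) in block)
               for h in H for block, _ in T)

pairs = [(X, Y) for r in range(3) for s in range(3)
         for X in combinations(FORMULAS, r) for Y in combinations(FORMULAS, s)]
agree = all((closure(X) == closure(Y)) == same_blocks(X, Y) for X, Y in pairs)
print(agree)
```

For every pair of premise sets tried, equality of the restricted closures coincides with condition (ii). Because the premise sets are drawn from `FORMULAS` itself, equality of the restricted closures already forces each set to follow from the other, which is why the restriction does not distort the comparison in this small case.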

Acknowledgments

This work was supported by the Natural Sciences and Engineering Research Council of Canada research grant. The author thanks anonymous referees for several helpful comments and suggestions concerning the presentation of this paper; in particular, for suggesting the use of the relation $\Theta_C$ for the analysis of inference operations.

References

Brown, A. L. and Shoham, Y., 1988, "New Results on Semantical Nonmonotonic Reasoning," pp. 19-26 in Proceedings of the Second International Workshop on Non-Monotonic Reasoning, Springer-Verlag, Lecture Notes in Computer Science 346.

Fialkowska, D. and Fialkowski, J., 1990, "On Theories of Non-Monotonic Consequence Operations II," Bulletin of the Section of Logic PAN 19, 79-83.

Gabbay, D. M., 1985, "Theoretical Foundations for Non-Monotonic Reasoning in Expert Systems," pp. 439-457 in Logics and Models of Concurrent Systems, K. Apt ed., Springer-Verlag.

Kraus, S., Lehmann, D. and Magidor, M., 1990, "Nonmonotonic Reasoning, Preferential Models and Cumulative Logics," Artificial Intelligence 44, 167-207.

Makinson, D., 1988, "General Theory of Cumulative Inference," pp. 1-18 in Proceedings of the Second International Workshop on Non-Monotonic Reasoning, Springer-Verlag, Lecture Notes in Artificial Intelligence 346.

Makinson, D., 1994, "General Patterns in Nonmonotonic Reasoning," in Handbook of Logic in Artificial Intelligence and Logic Programming Vol 3: Nonmonotonic Reasoning and Uncertain Reasoning, Gabbay, D. M., Hogger, C. J. and Robinson J. A., eds., Oxford University Press.

Malinowski, G., 1993, Many-Valued Logics, Oxford University Press.

Piochi, B., 1983, "Logical Matrices and Non-Structural Consequence Operations," Studia Logica 42, 33-42.

Stachniak, Z., 1988, "Two Theorems on Many-Valued Logics," Journal of Philosophical Logic 17, 171-179.

Stachniak, Z., 1993, "Algebraic Semantics for Cumulative Inference Operations," pp. 444-449 in Proceedings of the Eleventh National Conference on Artificial Intelligence, AAAI/MIT Press.

Wójcicki, R., 1988, Theory of Logical Calculi: Basic Theory of Consequence Operations, Kluwer.
