teleological arguments and theory-based dialectics

DRAFT Giovanni Sartor

CIRSFID, University of Bologna [email protected]

TELEOLOGICAL ARGUMENTS AND THEORY-BASED DIALECTICS

ABSTRACT This paper proposes to model legal reasoning as dialectical theory-construction directed by teleology. Precedents are viewed as evidence to be explained through theories. So, given a background of factors and values, the parties in a case can build their theories by using a set of operators, which are called theory constructors. The objective of each party is to provide theories that both explain the evidence (the precedents) and support the decision wished by that party. This leads to theory-based argumentation, i.e., a dialectical exchange of competing theories, which support opposed outcomes by explaining the same evidence and appealing to the same values. The winner is the party that can reply with a more coherent theory to all theories of its adversary.

1. INTRODUCTION

The contribution to teleological argumentation of Berman & Hafner (1993) provides a major insight into legal argument: rules and cases, abstracted from the purposes that they serve, cannot provide us with an adequate (computational) model of legal reasoning (cf. Bench-Capon 2000; see footnote 1). We will here try to develop this insight in the framework of a model of legal argumentation that differs from most analyses so far proposed within AI & law. In those analyses (cf. for all, Gordon 1995), the argumentation process is viewed as consisting in the exchange of arguments, i.e. of inferences supporting or attacking contested propositions. In such a process, victory goes to the party proposing the strongest argument, possibly within certain procedural constraints. Here, on the contrary, we view argumentation as being the process through which parties exchange theories, i.e. alternative comprehensive accounts of a controversial domain. Victory goes to the party that succeeds in providing the most coherent theory. Arguments (inferences) still figure in this account, since the implications of each theory will be established according to an argument logic. This logic, however, only provides semantics for the theories put forward by the parties in the dispute; it is not a model for the interaction of the parties.

2. THE EXAMPLE

The benchmark for our approach will be represented by the cases discussed in Berman & Hafner (1993), which are synthesised as follows by Bench-Capon (2000):

In the first, Pierson v Post, the plaintiff was hunting a fox in the traditional manner using horse and hound when the defendant killed and carried off the fox. The plaintiff was held to have no right to the fox because he had gained no possession of it. In the second case, Keeble v Hickeringill, the plaintiff owned a pond and made his living by luring wild ducks there with decoys and shooting them. Out of malice the defendant used guns to scare the ducks away from the pond. Here the plaintiff won. In a third case, Young v Hitchens, both parties were commercial fisherman. While the plaintiff was closing his nets, the defendant sped into the gap, spread his own net and caught the fish. In this case the defendant won.

1 Footnote: this contribution originated as a comment on Bench-Capon (2000), which the author sent me before its submission. I thank Trevor Bench-Capon, Carole Hafner, Henry Prakken, and Andrew Stranieri for many helpful remarks on earlier drafts of the present paper. This paper has been followed by two contributions co-authored with Trevor Bench-Capon (Bench-Capon & Sartor 2001a, Bench-Capon & Sartor 2001b) to which I refer the reader for developments and refinements of some ideas here presented and for references to related work, in particular, for a discussion of the connections with the HYPO and CATO projects.

In all those cases, the plaintiff π was chasing an animal. The defendant δ intervened stopping the chase, so defeating the objective of π. π is arguing for the conclusion that he has a legal remedy against δ, while δ is arguing that no such remedy exists. Berman & Hafner (1993) consider how the decision in Young v Hitchens can be justified on the basis of the previous decisions in Pierson v Post and in Keeble v Hickeringill. They agree with Ashley (1990) in focusing on factors, i.e. those (abstract) features of a case that may possibly influence its outcome. However, they argue that understanding a case-law domain requires going beyond factors, and looking at the underlying values. Correspondingly, our formalisation will be based upon the identification of both factors and values. Here are abbreviations for the factors we will consider:

πLiv = π was pursuing his livelihood
πLand = π was on his own land
πNposs = π was not in possession of the animal
δLiv = δ was pursuing his livelihood.

The values are the following:

LLit = Less Litigation
MProd = More productivity
MSec = More security of possession.

3. THE EVIDENCE

We will now introduce the formal framework we will use to deal with the example above. Let us first characterise the starting point of the theory-construction exercise of the debating parties. This is the so-called explanandum (Hempel 1966, 51), i.e. the evidence that the competing theories are trying to explain. Here we assume that evidence is constituted by a set of precedents, where each precedent is characterised by a set of factors (the possibly relevant features of the case), and by an outcome (the judicial decision in that case). In the example above, we have two alternative outcomes, Π for the plaintiff and ∆ for the defendant, where Π means “π has a legal remedy against δ”, and ∆ means “π has no legal remedy against δ”. As we have seen above, in Pierson, where π had no possession of the animal, the outcome was ∆; in Keeble, where π was pursuing his livelihood on his own land, and had no possession of the animal, the outcome was Π. Therefore the explanandum is represented as follows:

Pierson = {Factors: πNposs. Outcome: ∆}
Keeble = {Factors: πLiv, πLand, πNposs. Outcome: Π}.
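As a purely illustrative aid, the explanandum can be written down as a small data structure. The sketch below is not part of the formal model: the class name `Case`, the ASCII factor names ("piLiv", "piLand", "piNposs") and the outcome labels "PI" and "DELTA" are assumptions chosen here to stand in for the Greek symbols.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Assumed ASCII stand-ins: "PI" / "DELTA" for the outcomes, and
# "piLiv", "piLand", "piNposs" for the factors of the example.


@dataclass(frozen=True)
class Case:
    name: str
    factors: FrozenSet[str]
    outcome: Optional[str]  # None marks a case whose outcome is undetermined


PIERSON = Case("Pierson", frozenset({"piNposs"}), "DELTA")
KEEBLE = Case("Keeble", frozenset({"piLiv", "piLand", "piNposs"}), "PI")

# The explanandum is simply the collection of decided precedents.
EXPLANANDUM = (PIERSON, KEEBLE)
```

Representing factors as sets makes later checks (for instance, whether the antecedent of a rule is satisfied in a case) a matter of set inclusion.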

4. THE BACKGROUND KNOWLEDGE

Besides possessing certain evidence, the parties are here assumed to share a certain background knowledge, which includes two components. The first is what we call a “factor-background”. A factor-background specifies which outcomes are supported by which factors. This, as in Prakken & Sartor (1998), will be represented through a set of rules (no special connotation is linked here to the term “rule”: other words expressing a conditional connection, such as “link” or “warrant”, could be used synonymously). Each rule links a (possibly conjunctive) factor α to the outcome γ supported by that factor. Such a rule may be understood as a defeasible conditional α ⇒ γ, which we read as “α is a reason for the outcome γ”. For example, πLiv ⇒ Π means “that π was pursuing his livelihood is a reason why π should have a legal remedy against δ”, or more simply, “if π was pursuing his livelihood, then π has a legal remedy against δ”. The consequent of a rule need not be the final result that π or δ are aiming to establish: it may also be an intermediate outcome that contributes to establishing the final result (this is considered in Bench-Capon & Sartor 2001b). Such a consequent may also consist in affirming the inapplicability of another rule

(undercutting it), i.e. in affirming that under certain conditions a certain factor does not support its conclusion. However, here we assume for simplicity that the factor-background only consists of rules establishing one of the two ultimate outcomes Π or ∆:

πLiv ⇒ Π
πLand ⇒ Π
πNposs ⇒ ∆
δLiv ⇒ ∆.

The second element of the background knowledge is what we call a “value-background”. Here we will consider a value-background that only consists of what we call “teleological links”. A teleological link involves two elements, a rule α ⇒ γ and a value V, and can be understood as the conjunction of two assertions:

• the goal V is a (legal) value, i.e. an objective which is pursued by the legal system,
• the general adoption of the rule α ⇒ γ (its being used by legal agents as a standard for their reasoning and practice) would advance the achievement of V.

Let us simplify this double assertion as “α ⇒ γ promotes V”. A value-background may include many other components, such as a specification of the relative importance of the values, of their relations (achieving some values may impact positively or negatively on others), etc. Here, however, the value-background will be limited to elementary teleological links (for simplicity’s sake, we assume that single values do not interfere with each other, and that all rules promoting the same value do that to the same degree):

πLiv ⇒ Π promotes MProd
πLand ⇒ Π promotes MSec
πNposs ⇒ ∆ promotes LLit
δLiv ⇒ ∆ promotes MProd.

Teleological links do not need to concern just one value: their general form is “R promotes {V1, …, Vn}”, where {V1, …, Vn} is the set of the values advanced by the rule R. However, we take the liberty of omitting brackets, when single values are involved, as in the examples above.
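The two components of the background knowledge can be sketched as plain Python data. This is only one possible encoding, under assumptions made here: a rule is a pair (antecedent factors, outcome), the helper `rule` and the ASCII names are hypothetical, and a teleological link maps a rule to the set of values it promotes.

```python
from typing import Dict, FrozenSet, Tuple

# A rule "alpha => gamma" as (set of antecedent factors, outcome).
Rule = Tuple[FrozenSet[str], str]


def rule(outcome: str, *factors: str) -> Rule:
    """Hypothetical helper: build the rule 'factors => outcome'."""
    return (frozenset(factors), outcome)


# Factor-background: which factor supports which outcome.
FACTOR_BACKGROUND = {
    rule("PI", "piLiv"),
    rule("PI", "piLand"),
    rule("DELTA", "piNposs"),
    rule("DELTA", "deltaLiv"),
}

# Value-background: elementary teleological links "R promotes {V1, ..., Vn}".
VALUE_BACKGROUND: Dict[Rule, FrozenSet[str]] = {
    rule("PI", "piLiv"): frozenset({"MProd"}),
    rule("PI", "piLand"): frozenset({"MSec"}),
    rule("DELTA", "piNposs"): frozenset({"LLit"}),
    rule("DELTA", "deltaLiv"): frozenset({"MProd"}),
}
```

Storing the promoted values as sets, even for single values, mirrors the general form “R promotes {V1, …, Vn}”.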

5. THE TASK OF THE PARTIES

The two parties, π and δ, are meeting in the framework of a new case; let us call it Current Situation (CS). CS also is characterised by a set of factors, but its outcome is not determined (or it is determined, but we consider it as undetermined, since we want to use CS as a test for our theory, i.e. to see if our theory can foresee its outcome). In the example, the new situation is represented by:

Young = {Factors: πLiv, πNposs, δLiv. Outcome: ?}.

π will try to provide a theoretical hypothesis, i.e. a set of sentences (an explanans, in the terminology of Hempel 1966, 51) that both explains all precedents (the explanandum) and gives Young the outcome that is desired by π (i.e., the decision Π). Let us call such theoretical hypotheses π-theories. The replies of δ will consist in alternative theoretical hypotheses, the δ-theories, which still explain all precedents, but imply in Young the outcome desired by δ (the decision ∆).

6. THEORY CONSTRUCTORS

Besides sharing background knowledge, parties will also share some basic strategies or heuristics for theory construction. One may also view those heuristics as patterns for analogical inference, rather than as ways of providing new content to a theory (cf. Prakken 2000). However, it is useful to distinguish inferences made in the theory construction phase from those less problematic inferences performed on the basis of the constructed theory. To mark this difference we shall call the first inferences “theory constructors”.

The first theory constructor, which we call factor-merging, consists in building more complex rules on the basis of simpler ones. The idea is that by joining factors supporting the same conclusion, we obtain a stronger factor pointing to that same conclusion. This can be viewed as a rudimentary formalisation of the so-called “a fortiori” argument. In other words, from any two rules

1. α ⇒ γ
2. β ⇒ γ

one can construct the following:

1. α & β ⇒ γ
2. α & β ⇒ γ > α ⇒ γ
3. α & β ⇒ γ > β ⇒ γ.
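Factor-merging, as just described, can be illustrated with a short sketch. The encoding is an assumption made here for illustration: a rule is a pair (antecedent factors, outcome), a preference is a pair (stronger rule, weaker rule), and `factor_merge` is a hypothetical name.

```python
from typing import FrozenSet, Set, Tuple

Rule = Tuple[FrozenSet[str], str]   # (antecedent factors, outcome)
Preference = Tuple[Rule, Rule]      # (stronger rule, weaker rule)


def factor_merge(r1: Rule, r2: Rule) -> Tuple[Rule, Set[Preference]]:
    """From 'alpha => gamma' and 'beta => gamma' build 'alpha & beta => gamma',
    together with the two preferences of the merged rule over its components."""
    (alpha, gamma1), (beta, gamma2) = r1, r2
    if gamma1 != gamma2:
        raise ValueError("factor-merging needs rules with the same consequent")
    merged = (alpha | beta, gamma1)
    return merged, {(merged, r1), (merged, r2)}


liv = (frozenset({"piLiv"}), "PI")
land = (frozenset({"piLand"}), "PI")
merged, prefs = factor_merge(liv, land)
```

Here `merged` stands for πLiv & πLand ⇒ Π, and `prefs` records that it is stronger than each of the two rules it was built from.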

The second theory constructor, which we call value-merging, consists in building more complex teleological links from simpler ones: if two rules (having the same consequent) promote different values, then the new rule obtained by merging those rules promotes all those values (the union of those sets). In other words, from any two teleological links:

1. α ⇒ γ promotes V1, and
2. β ⇒ γ promotes V2

one can construct the following:

α & β ⇒ γ promotes V1 ∪ V2.

Value-merging is complemented by an ordering over sets of values. Here we adopt a minimal approach to ordering, which we call value-ordering: any set of values is more important than any of its proper subsets. According to value-ordering, given any sets of values V1 and V2, we can add to any theory the statement (V1 ∪ V2) > V1 or the statement (V1 ∪ V2) > V2.

The third theory constructor, which we call rule-preference-from-value-preference, consists in introducing preferences between rules on the basis of preferences between values. The assumption is that rules promoting more important values are stronger than those promoting less important values. More precisely, given that a theory contains

1. V1 > V2,
2. R1 promotes V1, and
3. R2 promotes V2,

where V1 and V2 are the sets of all values respectively promoted by R1 and R2, one can expand the theory with the new preference:

R1 > R2.

The fourth theory constructor, which we call rule-broadening, consists in introducing a more general rule on the basis of a more specific one, already contained in the theory. So, if the theory contains α & β ⇒ γ, one may expand it with α ⇒ γ or β ⇒ γ. This aspect is emphasised by Bench-Capon (1999), who represents cases as graphs, where each rule is linked to the broadenings derivable from it.

The fifth theory constructor, which we call rule-preference-from-case, consists in introducing preferences between rules when these preferences contribute to explaining the precedents. More precisely, given that

1. a theory T does not explain precedents C1, …, Cn and


2. {R1 > R2, …, Rj > Rk} is a minimal set of preferences such that T ∪ {R1 > R2, …, Rj > Rk} explains C1, …, Cn,

then we can add to the theory the new preferences

R1 > R2, …, Rj > Rk.

This reasoning move can be viewed as a form of abduction, i.e. as the introduction of a hypothesis that is justified by its ability to explain the evidence (within the available theoretical framework).

Finally, the last theory constructor, which we call arbitrary-rule-preference, consists in introducing a new preference between rules, which is not necessary for explaining precedents, nor obtained from value-preferences, nor supported by further information in the background knowledge, but is required for justifying a certain result in CS. Other similar constructors could also be added to the model, to allow, for example, the introduction of value preferences on the basis of rule preferences, or the introduction of values as explanatory hypotheses, but we will not consider them here (see Bench-Capon & Sartor 2001b).
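Three of the constructors above (value-merging with value-ordering, rule-preference-from-value-preference, and rule-broadening) lend themselves to a compact sketch. All function names and the tuple encoding are assumptions made here for illustration. Note also that `broadenings` generalises the text slightly: the text only states the two-factor case, whereas the sketch returns every rule whose antecedent is a non-empty proper subset of the original antecedent.

```python
from itertools import combinations
from typing import FrozenSet, Iterator, Optional, Tuple

Rule = Tuple[FrozenSet[str], str]      # (antecedent factors, outcome)
Link = Tuple[Rule, FrozenSet[str]]     # teleological link: (rule, values promoted)


def value_merge(l1: Link, l2: Link) -> Link:
    """Merge two links with the same consequent; the values are united."""
    ((alpha, g1), v1), ((beta, g2), v2) = l1, l2
    if g1 != g2:
        raise ValueError("value-merging needs rules with the same consequent")
    return ((alpha | beta, g1), v1 | v2)


def value_preferred(v1: FrozenSet[str], v2: FrozenSet[str]) -> bool:
    """Value-ordering: a set of values outranks each of its proper subsets."""
    return v1 > v2


def rule_preference_from_value_preference(l1: Link, l2: Link) -> Optional[Tuple[Rule, Rule]]:
    """Return the preference (R1, R2), read 'R1 > R2', when V1 > V2; else None."""
    (r1, v1), (r2, v2) = l1, l2
    return (r1, r2) if value_preferred(v1, v2) else None


def broadenings(r: Rule) -> Iterator[Rule]:
    """Rule-broadening: drop some (not all) antecedent factors of a rule."""
    antecedent, outcome = r
    for size in range(1, len(antecedent)):
        for subset in combinations(sorted(antecedent), size):
            yield (frozenset(subset), outcome)
```

Because value-ordering only compares a set with its proper subsets, two rules promoting unrelated sets of values yield no preference at all, which is the minimal behaviour the text asks for.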

7. LOGIC

The logic determining the semantics (the meaning, or the implications) of the theories put forward by the parties of the dispute will be a dialectical logic. This is because each party must include in his theory, besides the reasons (the factors) supporting the conclusion he is aiming at, also the reasons favouring his adversary. If the party just considered his own reasons, he would be accused of being biased and one-sided, and would lose to a competitor showing a more bipartisan understanding. On the contrary, each party's task is that of showing that a decision for his side is implied by the (allegedly) most balanced account of the controversial domain, i.e. by the account that gives the most thorough and impartial consideration to the circumstances favouring his adversary. Therefore, each party’s theory will license inferences for the adversary (inferences which, on the basis of factors favouring the adversary, conclude that the latter should win), though the party will claim that these inferences are defeated by prevailing inferences favouring his side. From our viewpoint, the theory of one party and its logic will be dialectical in the sense respectively of including reasons, counter-reasons and meta-reasons (reasons for preferring certain reasons to certain others) and of licensing an architecture of corresponding inferences, rather than in the sense of modelling or constraining a real dialogue. In particular, though we shall call “arguments” the inferences available within one theory (as usual in argumentation logics), we do not view these inferences as explicit statements of the parties of the dispute and, in particular, we do not assume that each party states all and only the inferences favouring his or her side. Similarly, the mechanism adjudicating the conflicts between such arguments (the so-called “argumentation framework”) is no protocol for a dispute, but only a way of specifying what conclusions are justified (or implied) by a single theory.
Consequently, we will distinguish on the one hand the dialectical exchange between the two parties, and on the other hand the dialectical semantics of their theories. The dialectical exchange concerns the articulation and the refinement of alternative competing theories, while the dialectical semantics, based upon an argumentation logic, concerns establishing the defeasible implications of each theory. In our approach, therefore, there is no opposition but rather complementariness between theory construction and dialectical logic (for a contrary view, cf. McCarty 1997). Here we use the argumentation logic of Prakken & Sartor (1996), where the reader can find a formal definition. In the following, we will give a very simplified idea of this logic, which is sufficient for making sense of our example. For simplicity, we say that factors and preferences (in addition to rules linking factors to outcomes) are also rules: in general, we view an unconditioned statement as a rule with an empty antecedent (a factor or preference ϕ can be viewed as the abbreviation for the rule ⇒ ϕ). We also assume that our theories only contain ground formulas (formulas containing variables are substituted with all their ground instances).

The first notion is that of an argument. We say that a finite set of rules A is an argument if any rule φ1 & … & φn ⇒ ϕ in A is preceded by rules with consequents φ1, …, φn. All consequents of rules in A are conclusions of A (each of them is derivable from rules in A, by repeatedly applying modus ponens). For example, in a premises set S1 = {α, β, α ⇒ γ, β ⇒ ¬γ}, B1 = {α, α ⇒ γ} is an argument for γ (and α), while B2 = {β, β ⇒ ¬γ} is an argument for ¬γ (and β). Arguments including rules with conflicting consequents (α ⇒ γ, β ⇒ ¬γ) are said to be each other’s counterarguments (we do not consider here the possibility of undercutting, on which cf. Prakken & Sartor 1996).

The second notion is that of defeat, which provides a way of adjudicating conflicts between arguments. Of two counterarguments A1 and A2, including conflicting rules r1 and r2 respectively, we say that A1 defeats A2 iff it is not the case that r1 < r2, according to A2. To assess the strength of the conflicting rules, we rely on what the competing arguments say: to avoid being defeated by A1, A2 must conclude for a preference r1 < r2. When argument A1 defeats argument A2, but A2 does not defeat A1, we say that A1 strongly defeats A2. So, B1 and B2 above defeat each other, since neither of them says anything on the relative strength of the competing rules α ⇒ γ and β ⇒ ¬γ. On the contrary, B3 = {α, α ⇒ γ, “α ⇒ γ” > “β ⇒ ¬γ”} is not defeated by B2 (though defeating it), since B3 includes, and therefore supports, the preference “α ⇒ γ” > “β ⇒ ¬γ”. Therefore B3 strongly defeats B2.

Finally, the logic of Prakken & Sartor (1996) provides a division of all arguments (available in a certain premises set) into three categories: justified, defensible and overruled ones. Only justified arguments have the capacity of establishing justified conclusions, on the basis of that premises set, i.e. conclusions that are supported or implied by the information contained in that set. Defensible arguments are the uncertain ones, which cannot be relied upon, but which still can effectively defeat other arguments, so preventing them from being justified. Overruled arguments, finally, are useless, being defeated by stronger arguments, which are justified.
We need not consider here how to evaluate arguments (see the definition in Prakken & Sartor 1996, which addresses multi-step arguments and reinstatement), since in our paper we will only consider one-step arguments, and our background knowledge does not allow for conflicting preferences. Under these conditions, we may simply say that an argument A1 is justified when for each counterargument A2, A1 contains a preference according to which one of its rules is stronger than a rule in A2, and A2 contains no preference according to which one of its rules is stronger than a rule in A1. Obviously this also holds when no counterarguments are available. For example, premises set S2 = {α, β, α ⇒ γ, β ⇒ ¬γ, α ⇒ γ > β ⇒ ¬γ} contains argument B1 = {α, α ⇒ γ, α ⇒ γ > β ⇒ ¬γ} and argument B2 = {β, β ⇒ ¬γ}. B1 includes the preference α ⇒ γ > β ⇒ ¬γ, stating that the rule α ⇒ γ of B1 is stronger than the opposed rule β ⇒ ¬γ of B2. Therefore, B1 defeats B2 (without being defeated by it), and emerges as being justified within S2. Consequently, γ is a justified consequence of S2, i.e. γ is implied by S2. Note that γ was not implied by S1 above, since according to S1, B1 is not a justified argument, being defeated by B2.

The adoption of this dialectical logic allows us to clarify in what sense a theory explains a case. A theory T explains a case C, with factors α1, …, αn and outcome γ, if the premises set T ∪ {α1, …, αn} implies γ, i.e., if T ∪ {α1, …, αn} contains a justified argument with conclusion γ. Similarly, a theory T supports a certain outcome γ in the current situation CS (where the current situation is a set of factors) if T ∪ CS implies γ. For example, consider the theory

T1Π = {
1. πLiv ⇒ Π promotes MProd [from background knowledge (BGK)];
2. πNposs ⇒ ∆ [from BGK];
3. πLiv ⇒ Π > πNposs ⇒ ∆ [from rule-preference-from-case, in regard to Keeble] }.

T1Π explains both Pierson (with factor πNposs and outcome ∆), where only the antecedent of the rule πNposs ⇒ ∆ is satisfied, and Keeble (with factors πLiv, πLand, πNposs and outcome Π), where both rules πLiv ⇒ Π and πNposs ⇒ ∆ are satisfied, but the Π-rule prevails, being stronger, according to T1Π (the argument is {πLiv, πLiv ⇒ Π, πLiv ⇒ Π > πNposs ⇒ ∆}). It also supports (justifies) Π in Young, where the same arguments can be built as in Keeble. Note that for a theory to explain a case it is not necessary that the theory considers all factors in the case. For example, T1Π succeeds in explaining Keeble, though its rules do not refer to πLand.
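The simplified, one-step reading of “justified” given above can be turned into a small check. The following is a sketch under the stated simplifications only (one-step arguments, two mutually exclusive outcomes, no conflicting preferences); it is not the full logic of Prakken & Sartor (1996). The function name `implies` and the ASCII encoding of factors and outcomes are assumptions introduced here.

```python
from typing import FrozenSet, Set, Tuple

Rule = Tuple[FrozenSet[str], str]   # (antecedent factors, outcome)
Preference = Tuple[Rule, Rule]      # (stronger rule, weaker rule)


def implies(rules: Set[Rule], prefs: Set[Preference],
            factors: FrozenSet[str], outcome: str) -> bool:
    """True if some applicable rule for `outcome` is preferred to every
    applicable rule for the opposite outcome, and none of those is
    preferred to it (vacuously true when there is no opposing rule)."""
    applicable = [r for r in rules if r[0] <= factors]
    pro = [r for r in applicable if r[1] == outcome]
    con = [r for r in applicable if r[1] != outcome]
    return any(
        all((r, c) in prefs and (c, r) not in prefs for c in con)
        for r in pro
    )


# Theory T1-Pi of the text: two rules and one preference.
LIV = (frozenset({"piLiv"}), "PI")
NPOSS = (frozenset({"piNposs"}), "DELTA")
T1_PI_RULES = {LIV, NPOSS}
T1_PI_PREFS = {(LIV, NPOSS)}

PIERSON = frozenset({"piNposs"})
KEEBLE = frozenset({"piLiv", "piLand", "piNposs"})
YOUNG = frozenset({"piLiv", "piNposs", "deltaLiv"})
```

With these definitions, `implies(T1_PI_RULES, T1_PI_PREFS, PIERSON, "DELTA")` and `implies(T1_PI_RULES, T1_PI_PREFS, KEEBLE, "PI")` both hold, matching the claim that T1Π explains the two precedents, and the same call with `YOUNG` and `"PI"` holds as well.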

8. COHERENCE

As we said above, a dispute consists in a dialectical exchange of theories. Victory goes to the party providing a theory that is better than any theory of the adversary. The criterion to measure the comparative strength of competing theories will be the idea of coherence. We will not provide here a precise notion of coherence, nor an exhaustive one (on coherence, cf. Thagard 2001; on coherence in the law, cf. among others, Alexy & Peczenik 1990). We will just consider some properties of theories that seem relevant to this idea in the present domain. A theory is coherent, in regard to a certain set of cases (the evidence) and certain background knowledge, constituted by rules and teleological links, to the extent that it satisfies the following criteria:

• Case-coverage. This consists in the ability of explaining cases. A more coherent theory succeeds in explaining a larger set of cases: the theory includes justified arguments connecting the factors in the precedents and the corresponding outcome. In this paper we use set-inclusion as the metric for comparing explained cases, i.e. A is larger than B iff A ⊃ B. However, weaker criteria are also compatible with the approach here developed, such as focusing on the cardinality of sets of explained cases, or on the importance of the cases they contain.

• Factor-coverage. This consists in the ability of taking into account factors in the explained cases. A more coherent theory explains cases by combinations of (justified) arguments and (overruled) counterarguments that refer to a larger set of factors.

• Analogical-connectivity. This consists in the fact that premises in a theory are obtainable through analogies from other premises in the theory. By analogies we mean here those theory-construction operators that extract new premises from premises already in the theory, and therefore reflect a content connection between their result and their preconditions. Here the analogies are provided by the operators factor-merging, value-merging, rule-preference-from-value-preference, and rule-broadening.

• Non-arbitrariness. This consists in the fact that a theory does not contain unsupported premises. A premise is unsupported if it is not required for explaining the precedents, nor is included in the background knowledge, nor is obtainable through analogies from supported premises.
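Of the four criteria, case-coverage is the easiest to make concrete, since it compares sets of explained cases by set-inclusion. The sketch below is an illustration only: it reuses the simplified one-step notion of implication, and the names `implies`, `explained_cases` and `better_case_coverage` are assumptions introduced here.

```python
from typing import Dict, FrozenSet, Set, Tuple

Rule = Tuple[FrozenSet[str], str]
Preference = Tuple[Rule, Rule]
Cases = Dict[str, Tuple[FrozenSet[str], str]]   # name -> (factors, outcome)


def implies(rules: Set[Rule], prefs: Set[Preference],
            factors: FrozenSet[str], outcome: str) -> bool:
    """Simplified one-step justification check (two opposite outcomes)."""
    applicable = [r for r in rules if r[0] <= factors]
    pro = [r for r in applicable if r[1] == outcome]
    con = [r for r in applicable if r[1] != outcome]
    return any(all((r, c) in prefs and (c, r) not in prefs for c in con)
               for r in pro)


def explained_cases(rules: Set[Rule], prefs: Set[Preference],
                    cases: Cases) -> FrozenSet[str]:
    """Names of the precedents whose outcome the theory implies."""
    return frozenset(name for name, (factors, outcome) in cases.items()
                     if implies(rules, prefs, factors, outcome))


def better_case_coverage(a: FrozenSet[str], b: FrozenSet[str]) -> bool:
    """Set-inclusion metric: A is larger than B iff A is a proper superset of B."""
    return a > b


PRECEDENTS: Cases = {
    "Pierson": (frozenset({"piNposs"}), "DELTA"),
    "Keeble": (frozenset({"piLiv", "piLand", "piNposs"}), "PI"),
}
LIV = (frozenset({"piLiv"}), "PI")
NPOSS = (frozenset({"piNposs"}), "DELTA")

full = explained_cases({LIV, NPOSS}, {(LIV, NPOSS)}, PRECEDENTS)
partial = explained_cases({NPOSS}, set(), PRECEDENTS)
```

In this toy comparison `full` contains both precedents while `partial` contains only Pierson, so the first theory has better case-coverage under set-inclusion. The other three criteria would need a record of how each premise was obtained, which this sketch does not attempt.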

9. THEORY-BASED DIALECTICS. FACTOR-BASED REASONING

Let us now consider how the parties may proceed in constructing their theories. First we will consider how they reason with rules, and then how they use teleological links. Let π propose theory T1Π above, which, as we have seen, explains both Keeble and Pierson, and justifies Π in Young. T1Π has a weakness (incoherence) in so far as it does not consider factor πLand in Keeble and factor δLiv in Young (it is incoherent under the criterion of factor-coverage). One reply by δ may consist in theory T1∆.

T1∆: {
1. πLiv & πLand ⇒ Π [from BGK + factor-merging];
2. πNposs ⇒ ∆ [from BGK];
3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆ [from rule-preference-from-case, in regard to Keeble] }.

Theory T1∆ provides a distinction in regard to T1Π. In fact, it substitutes T1Π’s rule πLiv ⇒ Π with the more specific rule πLiv & πLand ⇒ Π, which is not satisfied in Young. T1∆ still explains Π in Keeble, but supports ∆ in Young (instead of Π). T1∆ is better than T1Π under factor-coverage, since it considers, besides factors πLiv and πNposs, also factor πLand, and in particular it provides a more thorough explanation of Keeble. T1∆ can be countered with the following Π-theory, based upon broadening πLiv & πLand ⇒ Π into πLiv ⇒ Π:

T2Π: {
1. πLiv & πLand ⇒ Π;
2. πNposs ⇒ ∆;
3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆;
4. πLiv ⇒ Π [from BGK, and also by rule-broadening from 1];
5. πLiv ⇒ Π > πNposs ⇒ ∆ [arbitrary-rule-preference] }.

T2Π, which justifies Π in Young, is as coherent as T1∆, as far as factor-coverage is concerned. It is better under the criterion of analogical connectivity, since it includes both the broadened rule πLiv & πLand ⇒ Π and its broadening πLiv ⇒ Π. However, it is defective under the criterion of non-arbitrariness, since the preference πLiv ⇒ Π > πNposs ⇒ ∆ is unnecessary for explaining the precedents: its only use is that of justifying the outcome wished by π in CS. Another possible ∆-theory is the following.

T2∆: {
1. πLiv & πLand ⇒ Π;
2. πNposs ⇒ ∆;
3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆;
4. πLiv ⇒ Π;
5. πNposs & δLiv ⇒ ∆ [from BGK + factor-merging];
6. πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π [arbitrary-rule-preference] }.

T2∆ succeeds in explaining both Pierson and Keeble and considers more factors than T2Π does, providing a more thorough explanation of Young (since it includes also factor δLiv). However, T2∆ too is incoherent for its arbitrariness, since it includes the preference πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π, which is unsupported by the evidence (it is unnecessary for explaining the precedents). In fact T2∆ can be countered by the following theory T3Π, which scores equally well under all coherence criteria (it just includes a different arbitrary preference):

T3Π: {
1. πLiv & πLand ⇒ Π;
2. πNposs ⇒ ∆;
3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆;
4. πLiv ⇒ Π;
5. πNposs & δLiv ⇒ ∆;
6. πNposs & δLiv ⇒ ∆ < πLiv ⇒ Π [arbitrary-rule-preference] }.

10. THEORY-BASED DIALECTICS. VALUE-BASED REASONING

As we have just seen, at the level of factor-based reasoning both parties have failed to provide a theory that is more coherent than the best theory of their adversary. They can remedy the defects of their theories (under the criteria of factor-coverage and analogical connectivity) only by making those theories defective under non-arbitrariness. As Berman and Hafner (1993) observed, the dispute may only be decided when moving to teleological reasoning. Let us now consider the theory T2∆ above:

T2∆: {
1. πLiv & πLand ⇒ Π;
2. πNposs ⇒ ∆;
3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆;
4. πLiv ⇒ Π;
5. πNposs & δLiv ⇒ ∆;
6. πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π }.

This theory considers all factors, gives δ the result she wants, and includes an analogical connection, but is infected by the arbitrariness of the preference πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π. This arbitrariness can be removed by expanding T2∆ with the following value-subtheory:

T2∆V: {
7. πLiv ⇒ Π promotes MProd [from values-BGK];
8. πNposs & δLiv ⇒ ∆ promotes {MProd, LLit} [from values-BGK + value-merging];
9. {MProd, LLit} > MProd [from value-ordering] }.

T2∆V shows that πNposs & δLiv ⇒ ∆ promotes a larger set of values than πLiv ⇒ Π does (as stated in 7 and 8), which means that the first rule promotes a more important set of values than the latter does (as stated in 9). This supports the conclusion that the rule πNposs & δLiv ⇒ ∆ is stronger than its competitor (according to rule-preference-from-value-preference). This is exactly the preference stated in 6 above, which can now be given an appropriate support: πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π is supported by premises 7, 8 and 9, according to the constructor rule-preference-from-value-preference. Arguably, there is no theory that is more coherent than the resulting theory T3∆ (including all lines from 1 to 9) since it:


• explains all precedents,
• considers all factors,
• includes analogical connections,
• contains no arbitrary assumptions.

Note also that value-subtheory T2∆V, besides integrating T2∆ in a coherent whole (T3∆), where support links connect the two subtheories, also succeeds in undermining the competing theory T3Π: the preference πNposs & δLiv ⇒ ∆ < πLiv ⇒ Π is not only arbitrary, but also inconsistent with premise 6 of theory T3∆. The latter premise cannot be easily eliminated since, as we just saw, it is analogically connected to value propositions that were legitimately derived from the background knowledge.
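The way premise 6 of T3∆ follows from premises 7 to 9 can be traced in a few lines. This is a sketch under the same simplifications as before (one-step arguments, two opposite outcomes); the function `implies`, the constant names and the ASCII encoding are assumptions made for illustration only.

```python
from typing import FrozenSet, Set, Tuple

Rule = Tuple[FrozenSet[str], str]
Preference = Tuple[Rule, Rule]   # (stronger rule, weaker rule)


def implies(rules: Set[Rule], prefs: Set[Preference],
            factors: FrozenSet[str], outcome: str) -> bool:
    """Simplified one-step justification check (two opposite outcomes)."""
    applicable = [r for r in rules if r[0] <= factors]
    pro = [r for r in applicable if r[1] == outcome]
    con = [r for r in applicable if r[1] != outcome]
    return any(all((r, c) in prefs and (c, r) not in prefs for c in con)
               for r in pro)


LIV_LAND = (frozenset({"piLiv", "piLand"}), "PI")          # premise 1
NPOSS = (frozenset({"piNposs"}), "DELTA")                  # premise 2
LIV = (frozenset({"piLiv"}), "PI")                         # premise 4
NPOSS_DLIV = (frozenset({"piNposs", "deltaLiv"}), "DELTA") # premise 5
RULES = {LIV_LAND, NPOSS, LIV, NPOSS_DLIV}

# Premises 7 and 8: the values promoted by the two competing rules.
PROMOTES = {
    LIV: frozenset({"MProd"}),
    NPOSS_DLIV: frozenset({"MProd", "LLit"}),
}

prefs: Set[Preference] = {(LIV_LAND, NPOSS)}               # premise 3

# Premise 9 (value-ordering) plus rule-preference-from-value-preference
# yield premise 6 instead of assuming it arbitrarily.
if PROMOTES[NPOSS_DLIV] > PROMOTES[LIV]:
    prefs.add((NPOSS_DLIV, LIV))

PIERSON = frozenset({"piNposs"})
KEEBLE = frozenset({"piLiv", "piLand", "piNposs"})
YOUNG = frozenset({"piLiv", "piNposs", "deltaLiv"})
```

Under this encoding the theory still implies ∆ for Pierson and Π for Keeble, and it implies ∆ (and not Π) for Young, with the decisive preference now derived from the value-background.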

11. VALUES AND THE EVOLUTION OF CASE LAW

In the model above, one important aspect is missing, i.e., an account of the dynamics of case law, as it depends on the evolution of the socio-political context. This dynamics seems to undermine the very possibility of constructing a coherent theory of a case-law domain: how is it possible to fit in a single theory cases which were decided differently, even in the presence of the same constellations of factors, since different decisions were required by different contexts? We will not try to provide a full-fledged discussion, nor a complete formalisation, but only sketch the essential features of one solution that can be developed in the framework here proposed. The key is in the notion of “promoting”. Let us recall that “R promotes V” is an ellipsis for the conjunction of two statements:

1. V is a legal value,
2. the general practice of rule R would advance the achievement of V.

Here we will not consider statement 1, since the discussion of changes in values involves deep and controversial philosophical issues. Are values objective, conventional or merely subjective? Are they eternal and universal or relative to particular times and places? On the contrary, the question of whether and how much certain values are going to be advanced through certain (rule-based) practices concerns an empirical connection, which undoubtedly is dependent upon changing socio-economic conditions. Even if ultimate legal values remain unchanged, the ways in which the practice of a specific rule impacts on them may change over time (a similar change would also concern instrumental values, but we will not consider them here). For example, it may be argued that under the circumstances prevailing in modern industrialised countries, hunting has lost its ancient economic function: rather than contributing to productivity, it may detract from it. This may be true especially when hunting hinders some forms of recreation (watching wild animals, hiking, etc.) and so jeopardises the livelihood of those involved in the corresponding economic activities (hotel personnel, tour operators, tourist guides, etc.). In such a context, the practice of the rule πHunt ⇒ Π (if a plaintiff is hunting a wild animal then he has a legal remedy against a defendant who interrupted the chase), by facilitating hunting, does not promote social productivity, but rather impairs it. Consequently, even though in the past it was right to give a πHunt-case the outcome Π, recently it may have become right to decide an identical πHunt-case with ∆. To model this phenomenon, we need to provide theories which are capable of explaining conflicting decisions, adopted on the basis of the same set of factors, but taken at different times, when the impact of (the practice of) the rules contemplating those factors on the relevant values has changed (this issue was generally addressed in Berman & Hafner 1995).
Let me sketch how this may be possible through a slight change in the logic introduced above. Let me first assume that priority statements have a temporal specification: they say that rule R1 prevails over rule R2 at time τ1, abridged as R1 >(at τ1) R2. Note that this preference is consistent with affirming that R1 <(at τ2) R2. Correspondingly, we say that an argument A1 defeats(at τ) its counterargument A2, if A2 does conclude for A1 <(at τ) A2. So argument A, from premises set S, is justified(at τ) within S, if A is not defeated(at τ) by any justified(at τ) argument in S. Finally, premises set S implies(at τ) α if α is the conclusion of an argument which is justified(at τ) within S. The constructor rule-preference-from-values-preferences also needs to be “temporalised” as follows. Given that a theory contains the following:

1. V1 > V2; 2. R1 promotes(at τ) V1; 3. R2 promotes(at τ) V2

we can add to it the rule preference: R1 >(at τ) R2. Let us further adopt the following temporal axiom TA, which allows for rudimentary temporal reasoning (a more sophisticated treatment of temporal notions could obviously be embedded in the model here proposed):

TA: If R promotes V from τ1 to τ2, and τ is contained in the interval <τ1, τ2>, then R promotes(at τ) V.

Let us consider two cases, where, at different times, the same combination of factors, that is {A, B}, led to opposite outcomes (O, ¬O):

Cold: Factors: A, B. Outcome: O. Time: 1.1.1950.
Cnew: Factors: A, B. Outcome: ¬O. Time: 1.1.2000.

Let the factors-background be:

A ⇒ O
B ⇒ ¬O

Let the value-background be:

A ⇒ O promotes V1 from 1.1.1900 to 1.1.1980
B ⇒ ¬O promotes V2 from 1.1.1900 to Now
V1 > V2

The task, as above, is to build a theory that succeeds in explaining both Cold and Cnew. To do that we need to make the notion of an explanation time-sensitive: a theory explains a case if the theory implies the outcome of the case at the time when the case was decided. More exactly, a theory T explains a case C, with factors α1, …, αn, outcome γ, and time τ, if T ∪ {α1, …, αn} implies(at τ) γ, i.e., if γ is a justified(at τ) conclusion of T ∪ {α1, …, αn}. Let us now consider the following theory T1:

1. A ⇒ O [from factors-BGK]
2. B ⇒ ¬O [from factors-BGK]
3. A ⇒ O promotes V1 from 1.1.1900 to 1.1.1980 [from BGK]
4. B ⇒ ¬O promotes V2 from 1.1.1900 to Now [from BGK]
5. V1 > V2 [from BGK]
6. A ⇒ O >(at 1.1.1950) B ⇒ ¬O [from 3, TA, 5, rule-preference-from-value-preferences]
7. A ⇒ O <(at 1.1.2000) B ⇒ ¬O [from 4, TA, value-ordering, rule-preference-from-value-preferences]

T1 ∪ {A, B} implies(at 1.1.1950) O and implies(at 1.1.2000) ¬O. This is because argument A1 = {A, A ⇒ O} strongly defeats(at 1.1.1950) argument A2 = {B, B ⇒ ¬O}, while A2 strongly defeats(at 1.1.2000) A1. On the one hand, A1 strongly defeats(at 1.1.1950) A2, according to the preference A ⇒ O >(at 1.1.1950) B ⇒ ¬O, which is derived from the value-preference V1 > V2, given that A ⇒ O promotes(at 1.1.1950) V1, and that B ⇒ ¬O promotes(at 1.1.1950) V2. On the other hand, A2 strongly defeats(at 1.1.2000) A1, according to the preference A ⇒ O <(at 1.1.2000) B ⇒ ¬O, which is derived from the value-preference ∅ < V2 (value V2 is better than no value at all), given that A ⇒ O promotes(at 1.1.2000) ∅ (the empty set of values), while B ⇒ ¬O promotes(at 1.1.2000) V2. So, as we wanted, T1 succeeds in explaining both Cold and Cnew, although the two cases provide opposite outcomes for the same combination of factors.
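As an illustration, the time-sensitive notion of explanation can be sketched in a few lines of Python. This is only a minimal sketch under simplifying assumptions, not part of the formal model. The names RULES, PROMOTES, VALUE_RANK and the functions below are hypothetical. The interval in TA is assumed to be closed, and the empty set of values is given rank 0, below every value. Since only two conflicting arguments are at stake, being justified(at τ) is reduced to not being strictly defeated(at τ) by a conflicting applicable rule.

```python
from datetime import date

# Factors-background (hypothetical encoding): rule name -> (factor, outcome).
RULES = {"R_A": ("A", "O"), "R_B": ("B", "not-O")}

# Value-background: (rule, value, start, end); end=None stands for "Now".
PROMOTES = [
    ("R_A", "V1", date(1900, 1, 1), date(1980, 1, 1)),
    ("R_B", "V2", date(1900, 1, 1), None),
]

# Value ordering V1 > V2; rank 0 is reserved for the empty set of values.
VALUE_RANK = {"V1": 2, "V2": 1}


def promoted_rank(rule, t):
    """Axiom TA: the rule promotes(at t) V if t lies within a promotion interval."""
    ranks = [
        VALUE_RANK[value]
        for r, value, start, end in PROMOTES
        if r == rule and start <= t and (end is None or t <= end)
    ]
    return max(ranks, default=0)  # 0 = promotes no value at time t


def prefers(r1, r2, t):
    """Temporalised rule preference: r1 >(at t) r2."""
    return promoted_rank(r1, t) > promoted_rank(r2, t)


def implied_outcomes(factors, t):
    """Outcomes of applicable rules that no conflicting rule defeats(at t)."""
    applicable = [r for r, (factor, _) in RULES.items() if factor in factors]
    return {
        RULES[r][1]
        for r in applicable
        if not any(
            prefers(other, r, t)
            for other in applicable
            if RULES[other][1] != RULES[r][1]
        )
    }


def explains(case):
    """The theory explains a case if it implies(at case time) exactly its outcome."""
    return implied_outcomes(case["factors"], case["time"]) == {case["outcome"]}


C_OLD = {"factors": {"A", "B"}, "outcome": "O", "time": date(1950, 1, 1)}
C_NEW = {"factors": {"A", "B"}, "outcome": "not-O", "time": date(2000, 1, 1)}
print(explains(C_OLD), explains(C_NEW))  # prints: True True
```

Under these assumptions both cases are explained: in 1950 the rule A ⇒ O promotes the preferred value V1 and prevails, while in 2000 it no longer promotes any value, so that B ⇒ ¬O prevails.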

12. CONCLUSION

In this paper we have viewed case-based reasoning as a theory-construction exercise governed by the idea of coherence. Although the results here presented are very preliminary, I hope that the reader may agree that our approach can make some sense, at least when applied to the benchmark problem of the combination of cases, factors and values, originally proposed by Berman & Hafner (1993). Let us conclude our contribution by pointing to possible developments. Firstly, one could expand the background knowledge available to the parties, for example, with information concerning the statements of the judges and the context of their utterance. This would lead to a further theory-construction profile: the need to make sense of the “history” of the case, and in particular of the judges’ opinions, in the circumstances where they were stated. So, a case theory, besides a rule-subtheory and a value-subtheory, might also include a “history-subtheory”, to be constructed using the available data about the case, including, in particular, the expressed opinion of the judges. Such a history-subtheory may support the conclusion that the judges “meant” to decide the case according to certain rules or preferences. In particular, the fact that the judges explicitly stated a certain principle and gave it a particular role, in the argumentative structure of their opinion, can lead to the conclusion that they viewed this principle as the decisive ratio of their judgement in the case. This history-subtheory would provide coherent support to the rule-subtheories that use the principle in explaining the case. Correspondingly, on the basis of the history-subtheory, some other rule-subtheories may be excluded, as being incoherent with the case history. 
A different result, however, may also be possible under appropriate circumstances: it may be argued that a certain explanation of a precedent, based upon the current value-subtheory, makes more sense than the explanation based upon the expressed opinion of the judges, and that the latter should consequently be dismissed (this phenomenon was described by Smith and Deedman 1987, who provide real-world examples). Secondly, one can develop the idea of the circularity of justificatory links. Here we have assumed static background knowledge, and therefore a one-way theory-construction process, which goes from the background knowledge to the theory of the cases. However, one can consider that the background knowledge itself (or at least some parts of it) needs to be constructed according to the theory of the cases. According to this approach, the coherence test will not concern the case theory only, but rather the whole combination of case theory + background knowledge, seen in their interdependence. This combination would then compete against alternative similar combinations. Thirdly, a more sophisticated account could be provided of the ordering between values (e.g., ways of determining preferences between single values, and between certain quantities of them), of the relations among values (e.g., specifying how the satisfaction of one value can contribute to, or detract from, the satisfaction of others) and of the connection between rules and values (e.g., assigning a strength to this connection according to a probabilistic metric). Such an account could in particular profit from the contribution of decision theory, which has traditionally investigated ends-means connections. Fourthly, the perfect symmetry we have here assumed in the position of the parties can be replaced with criteria for allocating the burden of proof (or, more generally, the burden of argumentation).
For example, an obvious adaptation would consist in assuming that while the plaintiff must provide a theory justifying the outcome he wants for CS, the defendant only needs to provide a theory that does not justify the plaintiff’s outcome. In this approach, the defendant would satisfy her burden of proof just by providing a theory that leaves the outcome in CS indeterminate. In this regard, theories of the burden of proof, as developed by Prakken (2001), could provide useful models. Finally, the relations between the various dimensions of coherence above considered (and further aspects of it) should be explored, to see how the scores a theory achieves along those different dimensions can be combined into an overall mark (on computing coherence, cf. Thagard 1992). In this connection, one may inquire when some coherence requirements may be waived or limited. For example, all theories here considered were assumed to cover all cases (to score the maximum under the criterion of case-coverage), but it would be more realistic to assume that some cases could be explained away as being deviant or simply wrong. This would need to be linked to a view of the development of case-based law that uses notions such as those of express and implied overruling.

13. BIBLIOGRAPHY

Alexy, R., & A. Peczenik. 1990. The Concept of Coherence and Its Significance for Discursive Rationality. Ratio Juris 3: 130-147.

Ashley, K.D. 1990. Modeling Legal Argument: Reasoning with Cases and Hypotheticals. Cambridge (Massachusetts): MIT.

Bench-Capon, T.J.M. 1999. Some Observations on Modelling Case Based Reasoning with Formal Argument Models. In Proceedings of the Sixth International Conference on AI and Law, 36-42. New York: ACM Press.

Bench-Capon, T.J.M. 2000. The Missing Link Revisited: The Role of Teleology in Representing Legal Argument. In this Special Issue.

Bench-Capon, T.J.M., & G. Sartor. 2001a. Using Values and Theories To Resolve Disagreement in Law. In Proceedings of the Thirteenth Annual Conference on Legal Knowledge and Information Systems JURIX 2000. Ed. J. Breuker, R. Leenes and R. Winkels, 73-84. Amsterdam: IOS Press.

Bench-Capon, T.J.M., & G. Sartor. 2001b. Theory Based Explanation of Case Law Domains. In Proceedings of the Eighth International Conference on Artificial Intelligence and Law, 12-21. New York: ACM Press.

Berman, D.H., & C.D. Hafner. 1993. Representing Teleological Structure in Case Based Reasoning: The Missing Link. In Proceedings of the Fourth International Conference on AI and Law, 50-59. New York: ACM Press.

Gordon, T.F. 1995. The Pleadings Game. An Artificial Intelligence Model of Procedural Justice. Dordrecht: Kluwer.

Hempel, C.G. 1966. Philosophy of Natural Sciences. Englewood Cliffs (NJ): Prentice-Hall.

McCarty, L.T. 1997. Some arguments about legal arguments. In Proceedings of the Sixth International Conference on Artificial Intelligence and Law, 215-224. New York: ACM Press.

Prakken, H. 2000. An Exercise in Formalising Teleological Case Based Reasoning. In J. Breuker, R. Leenes and R. Winkels (eds), Legal Knowledge and Information Systems: Jurix 2000, 49-57. Amsterdam: IOS Press.

Prakken, H. 2001. Modelling Reasoning about Evidence in Legal Procedure. In Proceedings of the Eighth International Conference on Artificial Intelligence and Law, 119-128. New York: ACM Press.

Prakken, H. & G. Sartor. 1997. Rules about Rules: Assessing Conflicting Arguments in Legal Reasoning. Artificial Intelligence and Law 4: 331-368.

Prakken, H. & G. Sartor. 1998. Modelling Reasoning with Precedents in a Formal Dialogue Game. Artificial Intelligence and Law 6: 231-287.

Thagard, P. 1992. Conceptual Revolutions. Princeton (NJ): Princeton University Press.

Thagard, P. 2001. Coherence in Thought and Action. Cambridge (MA): MIT Press.
