Carnap and Essler versus Inductive Generalization


JAAKKO HINTIKKA

CARNAP AND ESSLER VERSUS INDUCTIVE GENERALIZATION

In his interesting and clarifying note,¹ Essler sets up a contrast between Carnap's inductive logic,² in which all nontrivial generalizations obtain the value 0 as their probability in an infinite domain, and mine,³ where suitable generalizations can get confirmed by increasing evidence and can even have asymptotically the degree of confirmation one. I suspect that this contrast is not a completely accurate one historically, in the sense that Carnap's inductive logic was probably shaped less by philosophical reasons for assigning zero probabilities to genuine generalizations in an infinite domain than by the logical and mathematical difficulties one encounters in mastering such probability distributions as enable us to discuss also inductive generalization.⁴ (Cf. below my discussion of (3).) Be this as it may, the three arguments outlined by Essler in his note in favor of Carnap do not seem to me to get to the root of the matter.

(1) Essler's first line of thought follows closely Carnap's defense of his notion of instance-confirmation of laws. Basically it amounts to arguing that generalizations are only idealizations which do not concern the concrete situations we actually encounter in science or in everyday life. For these practical purposes, only inferences from finite evidence to finite events are involved. Hence (I take Essler's point to be that) the prior probabilities of generalizations in an infinite domain do not matter for the intended applications of Carnap's inductive logic.

This is simply an incomplete picture of the logical situation, as Essler is undoubtedly well aware. For there is (as shown by the results of de Finetti, Gaifman, etc.⁵) a strong interdependence between the degrees of confirmation of finite events on finite evidence and the prior probabilities of generalizations. When someone has made up his mind how to bet on any finite event on finite evidence, he has, we may say, ipso facto made up his mind what probabilities to associate with generalizations, assuming only the symmetry (exchangeability) of the underlying probability distribution. It follows as a corollary that the traditional dilemma as to whether induction is inference from particulars to particulars or from particulars to generalizations is empty if one believes that inductive inference can be handled by means of symmetrical probabilities.

Erkenntnis 9 (1975) 235-244. All Rights Reserved. Copyright © 1975 by D. Reidel Publishing Company, Dordrecht-Holland

Thus the question whether nontrivial generalizations receive zero probabilities a priori has palpable consequences for the finite betting situations to which Essler apparently wants to restrict his, or rather Carnap's, attention.

For instance, suppose the question is whether all members of a finite universe satisfy a generalization whose counter-examples have together the width $w_1$. In Carnap's 1950 system⁶ (which corresponds to the choice λ = the number of different Q-predicates), the degree of confirmation is approximately (for large evidence) $(n/N)^{w_1}$, where n is the number of (completely known) individuals in the sample and N the number of individuals in the domain (universe). From this it is seen that the degree of confirmation in question is non-negligible only if n/N is non-negligible, that is, only if our sample already exhausts an appreciable part of the whole universe. For instance, a generalization concerning all electrons is not to be trusted before we have verified it for an appreciable portion of the electrons in the universe, assuming that it is finite.
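The order of magnitude claimed above can be checked numerically. The following is a minimal sketch, not Carnap's own presentation. It assumes that the 1950 system amounts to the predictive rule (n' + 1)/(n + K) for each of K Q-predicates (the case λ = K), that the n observed individuals contain no counter-example, and that the counter-examples occupy w of the K Q-predicates. The function names and the sample numbers are hypothetical.

```python
from fractions import Fraction
from math import comb


def confirmation_cstar(n: int, N: int, K: int, w: int) -> Fraction:
    """Probability that all N - n unobserved individuals avoid the w
    counter-example Q-predicates, given n observed individuals that all
    avoid them, under the assumed rule (n' + 1) / (n + K)."""
    prob = Fraction(1)
    for j in range(n, N):
        # After j individuals, none of them a counter-example, the next one
        # avoids the w forbidden cells with probability (j + K - w) / (j + K).
        prob *= Fraction(j + K - w, j + K)
    return prob


def closed_form(n: int, N: int, K: int, w: int) -> Fraction:
    """The same quantity written as a ratio of binomial coefficients."""
    return Fraction(comb(n + K - 1, w), comb(N + K - 1, w))


if __name__ == "__main__":
    K, w = 4, 1  # hypothetical: four Q-predicates, one of them forbidden
    for n, N in [(100, 1_000), (100, 10_000), (5_000, 10_000)]:
        exact = confirmation_cstar(n, N, K, w)
        # Compare the exact value with the approximation (n/N)**w.
        print(n, N, float(exact), (n / N) ** w)
```

For w = 1 the exact value reduces to (n + K - 1)/(N + K - 1), so a sample of 100 individuals in a universe of 10,000 yields a degree of confirmation of roughly one percent, in line with the remark about electrons.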

Of course different choices of λ give us different degrees of confirmation. If we choose a small (but still finite) λ, it may admittedly happen that a strict generalization receives an appreciable degree of confirmation on the basis of relatively limited evidence in a large but finite universe. However, other sorts of absurdities will then ensue. Given any fixed λ, the degree of confirmation of a generalization will depend on the size of the domain, which has to be known in order for us to be able to assign a definite degree of confirmation to the generalization. Thus a Carnapian physicist who sets up a generalization, say, concerning all electrons must make the reliability of his generalization conditional on cosmologists' conclusions concerning the finiteness (and the size) of the universe. If this size turns out to be much greater than anticipated, the physicist's generalization must (on Carnap's principles) be allotted a much lower degree of confirmation than earlier.
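The dependence on the size of the domain can be made concrete for an arbitrary fixed λ. The sketch below assumes the usual predictive rule of the λ-continuum, (n' + λ/K)/(n + λ), and again a sample of n individuals with no counter-example; the parameter values and the function name are hypothetical and chosen only for illustration.

```python
def confirmation_lambda(n: int, N: int, K: int, w: int, lam: float) -> float:
    """Degree of confirmation that the remaining N - n individuals all avoid
    the w counter-example Q-predicates, under the assumed lambda-rule."""
    prob = 1.0
    for j in range(n, N):
        # The K - w admissible cells jointly receive (j + lam*(K - w)/K)
        # out of a total weight of (j + lam) after j compliant individuals.
        prob *= (j + lam * (K - w) / K) / (j + lam)
    return prob


if __name__ == "__main__":
    n, K, w, lam = 100, 4, 1, 0.5  # hypothetical small lambda
    for N in (10**3, 10**4, 10**5):
        # Same evidence, larger assumed universe: the value keeps falling.
        print(N, confirmation_lambda(n, N, K, w, lam))
```

With the evidence held fixed, enlarging N can only multiply in further factors smaller than one, which is the sense in which the physicist's confidence becomes hostage to the cosmologists' estimate of N.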

If you find results of this kind unacceptable, you have in your hands evidence against Carnap's handling of generalizations, taken from the very kinds of situations to which Essler wants to restrict the applicability of Carnap's methods. Further evidence to the same effect is easily forthcoming.

Thus the problem of inductive generalization just cannot be disentangled from the evaluation of the performance of Carnap's methods in the finite realm (i.e., from the evaluation of the way they assign probabilities to singular events on finite evidence). Furthermore, Essler's way of posing the general problem of his note is now seen to be misleading. In so far as the confirmation procedures prescribed by my inductive logic are superior to Carnap's, this superiority will be in evidence already in the finite realm. My reasons for preferring the former are therefore based not on any unarticulated 'intuition' concerning inductive generalization, but on an evaluation of the performance of different inductive logics in finite as well as infinite domains and on their performance vis-à-vis singular inductive inference and not just vis-à-vis inductive generalization.

Although the point is not very central, it also seems to me that Essler is oversimplifying when he implies that practical men do not care what always happens. (Cf. "We mortals have always encountered but finite numbers of such things and will in future continue to be confronted with mere finite numbers of them.") It may be that a Detroit executive only cares whether most of the automobiles he produces keep running for a certain limited period of time. But it is more questionable whether an engineering company is satisfied when told that most of their bridges are likely to last so many years. Inductive logic ought not to be tailored to the idea of planned obsolescence, it seems to me. And in any case one of the main concerns of insurance companies is, not the probability that they will go broke next year, but the probability of their ever going broke. Highly sophisticated mathematics is in fact being employed to that undoubtedly practical purpose.

(2) Essler's main specific criticism against my inductive logic is an alleged inconsistency (in some weak sense) between the degrees of confirmation it assigns to generalizations and the degrees of confirmation (posterior probabilities on finite evidence) it assigns to singular events. Essler (following Carnap) takes the latter probabilities to be estimates of relative frequency in the whole universe. He argues that on the basis of such estimates the posterior probability of all nontrivial generalizations in an infinite universe must be zero, thus apparently contradicting the assumption that they have nonzero probabilities.

It is in fact true that degrees of confirmation for singular predictions will on quite general assumptions equal estimates (in the sense of expected values or probabilistic averages) of the relative frequency of individuals of the corresponding kind in the whole universe. Carnap already proved results to this effect in his Logical Foundations of Probability, pp. 540-557, and they can be generalized further.

However, what Essler's observations in reality illustrate is not a flaw in my system but a general feature of sequences of experiments which (like successive draws of an individual from a Carnapian domain) are not (probabilistically) independent of each other. In such cases, estimates (in the Carnapian sense of probabilistic averages) of a quantity just cannot be interpreted as one's beliefs as to what will happen to this quantity in the long run, just because subsequent experiments will inevitably change the estimate of this quantity.

Yet this is precisely Essler's line of thought. He interprets the fact that one's so-called estimate of relative frequency of individuals of a certain kind is ε > 0 to imply that in a sufficiently large sample the probability of there being at least one individual of this kind is close to one. But to draw this conclusion is to treat subsequent observations of individuals as being independent of each other, which in turn means ceasing to learn from experience after the finite number of 'experiments' comprised in one's evidence e.
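The step being criticized can be written out explicitly. In the sketch below, ε stands for the estimated relative frequency (not for the evidence e), m is the number of further individuals drawn, and $\varepsilon_i$ is an auxiliary symbol, introduced here only for illustration, for the probability that the i-th further individual is of the kind in question given that the preceding i - 1 were not.

```latex
% With independence (the tacit assumption of the argument):
P(\text{at least one such individual among } m \text{ draws})
  = 1 - (1 - \varepsilon)^m \longrightarrow 1 \quad (m \to \infty).

% Without independence, only the chain rule is available:
P(\text{no such individual among } m \text{ draws} \mid e)
  = \prod_{i=1}^{m} (1 - \varepsilon_i), \qquad \varepsilon_1 = \varepsilon .

% For 0 \le \varepsilon_i < 1 the infinite product stays positive
% exactly when \sum_i \varepsilon_i < \infty.
```

So when the successive probabilities shrink fast enough as negative instances accumulate, a positive estimate ε on the evidence e is compatible with a nonzero probability that no individual of the kind ever turns up.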

In general, we can thus see that for probabilistically dependent sequences of experiments the terms 'estimate' and 'expected value' for a probabilistic average are misnomers. Because of the dependence, such an 'estimate' cannot reflect one's expectations as to what happens in the long run. Neither Carnap nor Essler emphasizes this fact sufficiently.

These observations may be illustrated in different ways. As a simple example, take a coin which is (on the evidence we are assumed to have) as likely to be a fair normal coin (probability of tails 1/2) as to be a two-tailed one (probability of tails 1). (We know that it is of the one or the other sort.) It is tossed once, and shows tails. What is the rational betting ratio (degree of confirmation) of its showing tails at the next toss? Clearly 5/6.⁷ But this betting ratio cannot by any stretch of imagination be understood as a rational expectation or an educated guess concerning the relative frequency of tails in an infinite sequence of tosses. This frequency just cannot be 5/6 or in the vicinity of 5/6, for this frequency is known to be either 1/2 or 1. (The rational guess in the circumstances envisaged might perhaps be 1.) Note also that, on the assumptions we have made, interval estimation has no sense here.
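The coin example can be worked through mechanically with Bayes' formula. The sketch below uses exact fractions; the dictionary layout and the function names are hypothetical conveniences, while the two hypotheses and their equal prior probabilities are taken from the example itself.

```python
from fractions import Fraction

# Hypotheses: limiting frequency of tails mapped to its prior probability.
PRIOR = {Fraction(1, 2): Fraction(1, 2), Fraction(1): Fraction(1, 2)}


def update_on_tails(dist):
    """Bayes' formula: condition a distribution over frequencies on one tails."""
    total = sum(freq * p for freq, p in dist.items())
    return {freq: freq * p / total for freq, p in dist.items()}


def prob_tails_next(dist):
    """Betting ratio for tails at the next toss (the probabilistic average)."""
    return sum(freq * p for freq, p in dist.items())


if __name__ == "__main__":
    after_one = update_on_tails(PRIOR)
    print(after_one)                   # weights of the two hypotheses
    print(prob_tails_next(after_one))  # betting ratio for the second toss
    after_two = update_on_tails(after_one)
    print(prob_tails_next(after_two))  # the 'estimate' moves again
```

The betting ratio is an average over the two admissible frequencies and coincides with neither of them, and it shifts again with every further toss, which is why it cannot serve as a guess about the long-run frequency.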

More generally, assume a certain finite body of evidence e consisting of n fully known individuals. Then the probability $P(Ct_o(a)/e)$ concerns only the next individual a to be observed. After it has been observed, the probability $P(Ct_o(b)/(e \,\&\, Ct_o(a)))$ that a further unknown individual b satisfies $Ct_o(x)$ need no longer equal $P(Ct_o(a)/e)$, for we now have an additional item of evidence $Ct_o(a)$ at our disposal, though it is trivially true that $P(Ct_o(b)/e) = P(Ct_o(a)/e)$. And this discrepancy between the situation on the evidence e and on the evidence $(e \,\&\, Ct_o(a))$ (or on the evidence $(e \,\&\, {\sim}Ct_o(a))$, for that matter) shows that the degree of confirmation $P(Ct_o(a)/e)$ just cannot be taken to reflect a rational inductivist's beliefs about the relative frequency of individuals satisfying $Ct_o(x)$ in the whole universe. This can be done only if $P(Ct_o(a)/e) = P(Ct_o(b)/(e \,\&\, Ct_o(a))) = P(Ct_o(c)/(e \,\&\, Ct_o(a) \,\&\, Ct_o(b))) = \dots$. But this would mean that all the further individuals are independent of each other probabilistically. This in turn would mean that we stop heeding experience when we have reached the evidence e. And this is in so many words said to be ruled out in Carnap's own approach. On these general grounds, then, degrees of confirmation of singular predictions, even though they equal estimates of the relative frequencies of the corresponding kinds of individuals in the whole universe in Carnap's technical sense of 'estimate', cannot be understood as a rational agent's actual expectations concerning these relative frequencies on the basis of any nontrivial inductive logic.
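A small numerical illustration of this discrepancy follows. It is only a sketch: it uses the predictive rule of Carnap's λ-continuum, (n' + λ/K)/(n + λ), as a convenient example of a symmetric but non-independent assignment, and the sample counts are hypothetical.

```python
from fractions import Fraction


def predictive(n_kind: int, n_total: int, K: int, lam: int) -> Fraction:
    """Probability that the next individual is of the given kind, after
    n_kind of n_total observed individuals were of that kind."""
    return (n_kind + Fraction(lam) / K) / (n_total + lam)


if __name__ == "__main__":
    K, lam = 2, 2      # hypothetical: two Q-predicates, lambda = 2
    n, n_kind = 10, 3  # evidence e: 3 of 10 individuals are of the kind

    p_a = predictive(n_kind, n, K, lam)              # a, on e
    p_b_after_a = predictive(n_kind + 1, n + 1, K, lam)  # b, on e and a of the kind
    p_b_after_not_a = predictive(n_kind, n + 1, K, lam)  # b, on e and a not of the kind
    # Probability for b on e alone, by averaging over the two cases for a.
    p_b = p_a * p_b_after_a + (1 - p_a) * p_b_after_not_a
    print(p_a, p_b_after_a, p_b_after_not_a, p_b)
```

On the evidence e alone the two individuals receive the same probability, but as soon as a has actually been observed the value for b moves in one direction or the other, so the figure obtained on e is not a standing forecast of the long-run frequency.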

The line of thought I have criticized does not originate with Essler, but goes back to Carnap (1950), p. 168, where we read: "Since the probability₁ [degree of confirmation] of h on e is intended to represent a fair betting quotient, it will not seem implausible to require that the probability₁ of h on e determine an estimate of the relative frequency of M in K" (Carnap's italics). Here we can perhaps have a glimpse of Carnap's motivation. In the technical sense of the word, it is nonsense to ask whether degrees of confirmation can plausibly be identified with estimates of relative frequencies. It follows from Carnap's other assumptions that they must be so identified. What Carnap is in reality saying is therefore that these other assumptions have to be chosen in such a way that this conclusion follows. But why? Why did Carnap require these other assumptions to be such as to entail the identity of degrees of confirmation of singular predictions and expected values (estimates) of relative frequencies? Clearly because he was worried about the concrete 'empirical' or 'operational' meaning of his probability₁ (degree of confirmation). He wanted to tie it somehow to actual relative frequencies. It was this purpose that seemed to be served so very well by the identification of degrees of confirmation with estimates of relative frequency.

However, what we have seen shows that this idea does not serve the purpose it was supposed to serve. Carnap's 'estimates' of relative frequency cannot be equated with one's expectations of what happens in the long run, and in the absence of such an equation the idea does not contribute to the operationalization of probability₁. Notice the jump in Carnap's own formulation from a betting ratio to a relative frequency. From the former you can get to the latter only by means of supplementary assumptions, such as independence.

Of course what Carnap was trying to do is indeed reasonable and even important. But the partial operationalization he wanted to achieve is more complicated than he thought. The true connection between inductive probability and relative frequency is spelled out in de Finetti's famous representation theorem.⁸ This theorem shows that any probability distribution, including the one defined by Carnap's prior probabilities $P(-)$ as well as his degrees of confirmation on finite evidence $P(-/e)$, in effect defines, and is defined by, a second-level probability distribution, i.e., a probability distribution on first-level probabilities. Now one natural way of viewing these second-level probabilities for the purposes of de Finetti's theorem is to interpret them as an assignment of probabilities to relative frequencies of individuals of different kinds in the whole universe. The confirmation (epistemic) aspect then comes into play through the probabilities we associate with them. This is in fact the precise generalization of Carnap's own procedure of assigning prior probabilities to structure-descriptions and then dividing each of these probabilities evenly among the corresponding state-descriptions. (For what a structure-description specifies is precisely the frequencies of individuals having the different Q-predicates in the whole universe.) This is what happens in his 1950 system in a special case, and this is what happens in his λ-continuum in all finite domains.
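The correspondence between first-level betting ratios and a second-level distribution on frequencies can be checked in the simplest case. The sketch below is an illustration under stated assumptions, not part of the argument in the text: it takes K = 2 Q-predicates, the predictive rule (n' + λ/2)/(n + λ), and the standard fact that this rule corresponds to weighting the limiting frequency θ by a Beta(λ/2, λ/2) density. Function names are hypothetical.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def sequence_prob_chain(seq, lam: float) -> float:
    """Probability of a 0/1 sequence built up from successive betting
    ratios given by the lambda-rule with K = 2."""
    prob, ones, total = 1.0, 0, 0
    for x in seq:
        p_one = (ones + lam / 2) / (total + lam)
        prob *= p_one if x == 1 else 1.0 - p_one
        ones += x
        total += 1
    return prob


def sequence_prob_mixture(seq, lam: float) -> float:
    """The same probability as an average of theta**k * (1 - theta)**(n - k)
    over a Beta(lam/2, lam/2) distribution on the limiting frequency theta,
    evaluated in closed form through Beta functions."""
    a = b = lam / 2
    k, n = sum(seq), len(seq)
    return exp(log_beta(a + k, b + n - k) - log_beta(a, b))


if __name__ == "__main__":
    seq = [1, 0, 1, 1, 0]  # hypothetical observations
    for lam in (0.5, 2.0, 8.0):
        print(lam, sequence_prob_chain(seq, lam), sequence_prob_mixture(seq, lam))
```

For λ = 2 the weighting over θ is uniform. Conditioning on evidence then amounts to replacing one distribution over θ by another, rather than to fixing a single estimated frequency.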

What the evidence e effects is then a transition from one such probability distribution on limiting relative frequencies to another. Hence Carnap's suggestion that certain degrees of confirmation be interpreted as estimates of relative frequencies is in an ironical contrast to his attitude elsewhere to objective hypotheses. He insists that we should not accept definite hypotheses on evidence, but rather let the evidence change our probability distribution on the various alternative available hypotheses. Yet the Carnapian suggestion under discussion means that he proposed to deal with relative frequencies by means of (tentative) acceptance (estimates) rather than in terms of changing probability-distributions, as he ought to have done. (This 'ought to' refers back to our observations above.) If there is an inconsistency to be found in this area, it therefore lies in Carnap's approach, not in mine.

It does not help to try to drive a wedge between acceptance and estimation, either. The hypothesis one accepts is simply one's estimate of the true hypothesis, and conversely the main purpose of estimates apud Carnap is to supply us with guides to action, i.e., guidelines for acceptance. The only way out here would be to view estimation as a purely formal operation. This, however, would deprive it of all value for the purpose for which we saw Carnap to use it in the case at hand.

Notice how well the idea of reshuffling one's probability distribution on different relative frequencies (in the long run) fits our little example above. The effect of obtaining tails in the first toss of our coin changed the probability of 1/2 frequency from 1/2 to 1/3 and likewise changed the probability of 1 frequency to 2/3, as one can verify by means of Bayes' formula.

The unsatisfactory character of Carnap's suggested operationalization of degrees of confirmation is likewise indicated already by the fact that on it relative frequencies compatible with strict generalizations in an infinite universe virtually never occur as Carnapian estimates even when according to the underlying probability distribution on all relative frequencies (as determined by de Finetti's theorem) they do receive nonzero probabilities.

Hence Essler's second line of criticism is inconclusive.

(3) Essler formulates his third line of criticism as follows: "Hintikka's inductive methods are standardized additive quantities and in this sense, but only in this sense, probability functions or confirmation methods. As far as the question of estimation is concerned, they deviate considerably from what one expects of probability functions" (Essler's italics).

In so far as this is not just another way of putting Essler's other criticisms, it apparently amounts to saying that while the probability distributions on which my inductive logics are based are arbitrary, Carnap's inductive logic somehow has a deeper justification. Unfortunately Essler does not indicate what this deeper justification might be. The well-known derivation of Carnap's λ-continuum from a simple assumption concerning the form (the arguments) of the representative function is of course the most plausible candidate for this role.

Be this as it may, probably the most effective way of rebutting Essler's charges would be to show that Carnap's λ-continuum arises through an arbitrary choice of parameters from a wider continuum of inductive methods which can be motivated through simple general arguments and which in most cases allow for inductive generalization also in an infinite universe. Such a wider class of inductive methods has recently been studied in a joint paper by Ilkka Niiniluoto and myself.⁹ This class of inductive policies is defined by the corresponding representative functions. They are now allowed to depend, over and above the two arguments of these functions that are considered in Carnap's λ-continuum, also on the number c of nonempty Q-predicates in the sample.

A couple of explanations may be in order here. The representative function specifies the probability, on the basis of a sample of n completely observed individuals, that the next one has a given Q-predicate. The λ-continuum comes about when it is required that the representative function depends only on n and the number n' of individuals in the sample which have this Q-predicate. The new class of inductive policies arises when the representative function is of the form $f(n, n', c)$, where c is the number of Q-predicates instantiated in the sample. Of course it is also required that the representative function has to create a coherent probability distribution.
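For orientation, the two forms of representative function can be set side by side. The first line is the familiar representative function of the λ-continuum with K Q-predicates. The second is only the normalization condition that any coherent $f(n, n', c)$ must satisfy, written with $n_i$ (an auxiliary symbol introduced here) for the number of sample individuals having the i-th Q-predicate; it is not the full set of requirements of the joint paper.

```latex
% Carnap's lambda-continuum: dependence on n and n' only
f(n, n') = \frac{n' + \lambda/K}{n + \lambda}

% Wider class: dependence on c as well, subject at least to normalization
\sum_{i \,:\, n_i > 0} f(n, n_i, c) \;+\; (K - c)\, f(n, 0, c) \;=\; 1,
\qquad \sum_{i} n_i = n
```

In the second line the c instantiated Q-predicates and the K - c uninstantiated ones must together exhaust the probability for the next individual, which is where the number c can make a difference.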

Taking c into account is of course inevitable unless one refuses to countenance inductive generalization at all. For how can one hope to take into account the possibility of generalization if one does not even allow one's inductive strategies to depend on the number of generalizations compatible with one's sample? Thus taking c into account is surely "what one expects of probability functions", Essler notwithstanding.

An examination of the new class of inductive methods shows that all of them will assign nonzero prior probabilities to nontrivial generalizations, except Carnap's λ-continuum, which thus emerges as the most pessimistic choice of an inductive method compatible with Niiniluoto's and Hintikka's basic assumption concerning the form of the representative function. It consists of precisely those members of the new wider class of inductive methods in which c is disregarded. From the point of view of this basic assumption, to opt for Carnapian methods thus comes close to a deliberate a priori refusal to recognize the possibility of inductive generalization. Accordingly it seems to me that the less pessimistic (non-Carnapian) choices of inductive methods are less arbitrary than the Carnapian one.

These choices result in a class of inductive methods which overlaps with my two-dimensional α-λ-continuum but does not coincide with it. Several important methods, among them my 'Jerusalem system' and the whole of the 'generalized combined system', belong to both.

Thus tables can be turned completely on charges of arbitrariness against inductive methods allowing for the confirmation of inductive generalizations.

NOTES

1 Wilhelm K. Essler, 'Hintikka versus Carnap', this issue, p. 229. Cf. also Wilhelm K. Essler, Induktive Logik: Grundlagen und Voraussetzungen (Verlag Karl Alber, Freiburg and Munich, 1970), especially pp. 342-351.

2 Rudolf Carnap, Logical Foundations of Probability (University of Chicago Press, Chicago, 1950); The Continuum of Inductive Methods (University of Chicago Press, Chicago, 1952).

3 See especially 'A Two-Dimensional Continuum of Inductive Methods', in Jaakko Hintikka and Patrick Suppes, editors, Aspects of Inductive Logic (North-Holland, Amsterdam, 1966), pp. 113-132, 'Towards a Theory of Inductive Generalization', in Yehoshua Bar-Hillel, editor, Logic, Methodology and Philosophy of Science: Proceedings of the 1964 International Congress (North-Holland, Amsterdam, 1965), pp. 274-288.

4 See p. 977 of Carnap's 'Replies' in P. A. Schilpp (ed.), The Philosophy of Rudolf Carnap (The Library of Living Philosophers, Open Court, La Salle, Illinois, 1963). Carnap says there that he has "constructed c-functions of this kind [viz. assigning nonzero probabilities to generalizations even in an infinite domain], but they are considerably more complicated than those of the λ-system", and for this reason unsatisfactory in Carnap's view.

5 Bruno de Finetti, 'Foresight: Its Logical Laws, Its Subjective Sources', in H. E. Kyburg and H. E. Smokler, editors, Studies in Subjective Probability (Wiley, New York, 1964), pp. 93-158 (translation of 'La prévision', Ann. de l'Inst. Henri Poincaré 7 (1937), 1-68); Haim Gaifman, 'Concerning Measures on First-Order Calculi', Israel Journal of Mathematics 2 (1964), 1-18. Cf. also Dana Scott and Peter Krauss, 'Assigning Probabilities to Logical Formulas', in Hintikka and Suppes, editors (note 3 above), pp. 219-264, and Jaakko Hintikka, 'Unknown Probabilities, Bayesianism, and de Finetti's Representation Theorem', in Roger C. Buck and Robert S. Cohen, editors, PSA 1970: In Memory of Rudolf Carnap (Boston Studies in the Philosophy of Science, Vol. 8, D. Reidel, Dordrecht, 1971), pp. 325-341.

6 See Logical Foundations (note 2 above), pp. 570-571.

7 This is seen by a straightforward application of Bayes' Theorem.

8 See de Finetti, 'Foresight' (note 5 above).

9 Jaakko Hintikka and Ilkka Niiniluoto, 'An Axiomatic Foundation for the Logic of Inductive Generalization', forthcoming in the proceedings of the 1974 Warsaw Conference on Formal Methods in the Methodology of Empirical Sciences.