rūsiņš freivalds, juris viksna: inductive inference up to immune sets. aii 1989: 138-147
INDUCTIVE INFERENCE UP TO IMMUNE SETS
Rūsiņš Freivalds and Juris Vīksna
Computing Center
Latvian State University
Riga, USSR
Abstract
We consider approximate inference in the limit of Gödel numbers for total recursive functions. The set of
possible errors is allowed to be infinite but "effectively small". The latter notion is made precise in several
ways, as "immune", "hyperimmune", "hyperhyperimmune", "cohesive", etc. All the identification
types considered turn out to be different.
Introduction
We are interested in inductive inference of Gödel numbers of functions which can differ from
the given function on an infinite but "small" set of values of the argument.
Classical mathematics has developed many distinct notions to express "smallness".
The best known of them is "small in terms of measure". Inference in the limit of Gödel
numbers of functions which can differ from the given function on a set of bounded measure was
studied thoroughly by K. Podnieks [8].
We consider the notion "small" in terms rather close to the "category". More precisely, we use
notions of the theory of recursive functions to express the notion "effectively small".
Definition. A set of integers A is immune if
(i) A is infinite,
(ii) $(\forall B)\,[[B \text{ infinite and } B \text{ recursively enumerable}] \Rightarrow B \cap \bar{A} \neq \emptyset]$.
A set A of integers is simple if
(i) A is recursively enumerable,
(ii) $\bar{A}$ is immune.
Some immune sets are called hyperimmune (and some simple sets are called hypersimple).
These notions were introduced by E. Post [9] in order to study the problem whether or not there are
nonrecursive, recursively enumerable sets which are not truth-table complete (tt-complete). (We omit
here the definition of tt-reducibility.)
Let A be a nonempty finite set $\{x_1, x_2, \ldots, x_k\}$ where $x_1 < x_2 < \ldots < x_k$. Then the integer
$2^{x_1} + 2^{x_2} + \ldots + 2^{x_k}$ is called the canonical index of A. If A is empty, the canonical index is
equal to 0.
Evidently, every finite set has a unique canonical index, and every integer n is the canonical index of some finite set $D_n$. Since it is possible to go uniformly from canonical indices to recursively
enumerable indices but not vice versa, the canonical index can be regarded as a form of explicit
definition of the finite set.
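The canonical index is simply the binary encoding of a finite set, which a short sketch may make concrete. The function names below are illustrative assumptions and do not come from the paper.

```python
def canonical_index(finite_set):
    """Return 2**x1 + ... + 2**xk for a finite set of non-negative integers.

    The empty set gets canonical index 0.
    """
    return sum(1 << x for x in set(finite_set))


def decode_canonical_index(n):
    """Return the finite set D_n whose canonical index is n."""
    return {x for x in range(n.bit_length()) if (n >> x) & 1}


if __name__ == "__main__":
    print(canonical_index({0, 2, 3}))          # 2**0 + 2**2 + 2**3
    print(sorted(decode_canonical_index(13)))  # the set D_13
```

Decoding is total and explicit, which is the sense in which a canonical index "defines" the finite set, whereas an r.e. index only enumerates it.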
Instead of the standard definition of hyperimmune sets we prefer to present here an equivalent
one due to A.V.Kuznecov, Yu.T.Medvedev, V.A.Uspenskii (see also Th.9-XV in [9]).
Definition. A set of integers A is hyperimmune if
(i) A is infinite,
(ii) $\neg(\exists \text{ recursive } f)\,[(\forall u)[D_{f(u)} \cap A \neq \emptyset] \;\&\; (\forall u)(\forall v)[u \neq v \Rightarrow D_{f(u)} \cap D_{f(v)} = \emptyset]]$.
In other words, A is infinite and there is no effectively enumerable infinite sequence (by
canonical indices) of disjoint finite sets each of which intersects A.
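As a rough illustration, the condition that a sequence of canonical indices describes pairwise disjoint finite sets, each meeting A, can be checked on a finite prefix. This is only a sketch: the definition quantifies over all u, so a finite check can refute a candidate witness f but never establish hyperimmunity. The names `decode` and `is_disjoint_array_meeting` and the finite window of A are assumptions made for this example.

```python
def decode(n):
    """Finite set D_n with canonical index n."""
    return {x for x in range(n.bit_length()) if (n >> x) & 1}


def is_disjoint_array_meeting(indices, a_window):
    """Check a finite prefix f(0), ..., f(m-1) of a candidate witness.

    indices  -- canonical indices f(u) for u = 0, ..., m-1
    a_window -- finite set of elements known to belong to A
    Returns True if the sets D_f(u) are pairwise disjoint and each one
    contains an element of a_window.
    """
    seen = set()
    for n in indices:
        d = decode(n)
        if not d & a_window:  # D_f(u) misses A
            return False
        if d & seen:          # D_f(u) overlaps an earlier set
            return False
        seen |= d
    return True


if __name__ == "__main__":
    evens = set(range(0, 20, 2))
    # f(u) = 2**(2u), i.e. D_f(u) = {2u}: a witness showing that the
    # (recursive) set of even numbers is not hyperimmune.
    print(is_disjoint_array_meeting([1 << (2 * u) for u in range(5)], evens))
```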
A set of integers X is hypersimple if
(i) X is recursively enumerable,
(ii) $\bar{X}$ is hyperimmune.
Similarly, some hyperimmune sets are called hyperhyperimmune. A set of integers A is hyperhyperimmune if
(i) A is infinite,
(ii) $\neg(\exists \text{ total recursive } f)\,[(\forall u)[W_{f(u)} \text{ is finite} \;\&\; W_{f(u)} \cap A \neq \emptyset] \;\&\; (\forall u)(\forall v)[u \neq v \Rightarrow W_{f(u)} \cap W_{f(v)} = \emptyset]]$.
Definition. A set of integers X is hyperhypersimple if
(i) X is recursively enumerable,
(ii) $\bar{X}$ is hyperhyperimmune.
There are also some other types of "small" sets which are described in [9].
Definition. Set A is indecomposable if there do not exist two recursively enumerable sets $B_1$ and $B_2$ such that $B_1 \cap B_2 = \emptyset$, $A \subseteq B_1 \cup B_2$, $B_1 \cap A$ is infinite, and $B_2 \cap A$ is infinite.
Definition. Set A is cohesive if
(i) A is infinite, and
(ii) $(\forall G)\,[G \text{ recursively enumerable} \Rightarrow [A \cap G \text{ is finite or } A \cap \bar{G} \text{ is finite}]]$.
Definition. Set A is quasicohesive if A is the union of a finite (nonzero) number of cohesive
sets. The quasicohesive sets together with the finite sets form an ideal $\mathcal{I}_1$ in $2^N$; it is the ideal
generated by the cohesive sets.
Definition. Set A is generalized cohesive if $A \notin \mathcal{I}_1$ but for every x, either $W_x \cap A$ or $\bar{W}_x \cap A$
is in $\mathcal{I}_1$.
Therefore we have the following list of distinct small sets.
(C) Cohesive;
(Q) Quasicohesive;
(QG) Quasicohesive or generalized cohesive;
(HHI) Hyperhyperimmune;
(HI) Hyperimmune;
(I) Immune;
(II) Infinite and indecomposable.
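The hierarchy of these sets is as follows.
[Diagram not preserved. Judging by the pairs listed in Theorem 4, its arrows are C → Q → QG → HHI → HI → I and C → II → HI.]
Here the arrow from A to B stands for the implication $A \Rightarrow B$, i.e., every set of type A is also of
type B.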
It is known that all other implications are false (see [1] and [9]). In [9] some other types of
"small" sets are also described, but, because our interest is the inference of total recursive
functions, we do not consider types of sets which coincide for co-recursively enumerable sets.
We denote the classes of these sets by C, Q, QG, HHI, HI, I and II,
respectively.
We are going to investigate the existence of a hierarchy of identifiability for two series of
parameters. The first parameter is the type of identification of the functions:
1) BC-identification,
2) identification in the limit (EX-identification),
3) identification with no more than n changes of hypotheses ($EX_n$-identification).
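The bound on changes of hypotheses in type 3) can be illustrated on a finite record of a machine's outputs. The sketch below assumes hypotheses are represented as plain integers standing in for Gödel numbers; the names `mind_changes` and `within_ex_n` are hypothetical.

```python
def mind_changes(hypotheses):
    """Count the positions where a hypothesis differs from the previous one."""
    return sum(1 for prev, cur in zip(hypotheses, hypotheses[1:]) if prev != cur)


def within_ex_n(hypotheses, n):
    """True if the recorded run uses no more than n changes of hypotheses."""
    return mind_changes(hypotheses) <= n


if __name__ == "__main__":
    run = [7, 7, 7, 12, 12, 12]  # one change: 7 -> 12
    print(mind_changes(run), within_ex_n(run, 1), within_ex_n(run, 0))
```

A finite record can only show that a bound has been exceeded; convergence in the limit itself is not observable from any finite prefix.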
The second parameter is the set on which error outputs are allowed:
1) subsets of immune sets,
2) subsets of hyperimmune sets,
3) subsets of hyperhyperimmune sets,
4) subsets of quasicohesive or generalized cohesive sets,
5) subsets of quasicohesive sets,
6) subsets of infinite and indecomposable sets,
7) subsets of cohesive sets,
8) finite sets.
For all our results we have considered only inference of total recursive functions.
The classes of sets of functions which are identifiable in the sense of the distinct first parameters of identifiability we will denote by the symbols BC, EX and $EX_n$, respectively,
with one of the indices I, HI, HHI, QG, Q, II, C and *, which correspond to the distinct second
parameters of identifiability.
For the remainder of this paper IN will stand for an arbitrary one of these indices.
Relations among classes $BC^{IN}$, $EX^{IN}$ and $EX_n^{IN}$
First, we study the hierarchy of identifiability for the first series of parameters.
Since L. Harrington has shown that $BC^*$ contains the set of all total recursive functions, the
further investigation of the classes $BC^{IN}$ is not interesting.
The relations $EX_n^{IN} \subseteq EX_{n+1}^{IN}$ and $EX_n^{IN} \subseteq EX^{IN}$ for every $n \in N$ are evident. In [7] it is
shown that $EX_n^*$ is a proper subclass of $EX_{n+1}^*$ and that $EX_n^*$ is a proper subclass of $EX^*$.
We will prove the counterpart result for the parameters I, HI, HHI, etc. We will use the following lemma.
Lemma 1. If f and g are two total recursive functions which differ on an infinite set of
arguments then there does not exist a recursive function h such that:
(i) the set of arguments on which f differs from h is immune or finite,
(ii) the set of arguments on which g differs from h is immune or finite.
Proof.
Let us denote by A the set of arguments on which f and g differ. Let B be the set on which h is
defined. The set $A \cap B$ is infinite and recursively enumerable. The sets C and D on which the function h is
defined and differs from the values of the functions f and g, respectively, are recursively
enumerable as well.
Since $C \cup D \supseteq A \cap B$ is infinite, either C or D is infinite. Hence it is an infinite recursively
enumerable subset of an immune set. Contradiction.
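The combinatorial core of the proof, that wherever f and g differ and h is defined, h must differ from at least one of them, can be checked on a finite window. This is a sketch under assumptions: the particular f, g, h are made up for illustration, and an undefined value of h is encoded as `None`.

```python
def difference_sets(f, g, h, window):
    """Compute the sets A, B, C, D of the proof, restricted to a finite window.

    h returns None where it is undefined (an encoding chosen for this sketch).
    """
    A = {x for x in window if f(x) != g(x)}   # f and g differ
    B = {x for x in window if h(x) is not None}  # h is defined
    C = {x for x in B if h(x) != f(x)}        # h defined and differs from f
    D = {x for x in B if h(x) != g(x)}        # h defined and differs from g
    return A, B, C, D


if __name__ == "__main__":
    f = lambda x: 0
    g = lambda x: 1 if x >= 5 else 0
    h = lambda x: None if x % 3 == 0 else x % 2
    A, B, C, D = difference_sets(f, g, h, range(30))
    # Every point of A ∩ B lies in C or in D.
    print((A & B) <= (C | D))
```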
Remark. Since every infinite subset of an immune, hyperimmune or hyperhyperimmune set
is immune, Lemma 1 can be stated for subsets of immune sets, subsets of hyperimmune sets or
subsets of hyperhyperimmune sets as well.
Theorem 1. For arbitrary $n \in N$ there exists a set $U_n$ of total recursive functions such that
$U_n \in EX_{n+1}^{IN}$ and $U_n \notin EX_n^{IN}$.
Proof. For arbitrary $n \in N$ the set $U_n$ will consist of all total recursive functions f for which there
exists a set of no more than n integers $i_1, \ldots, i_k$, $i_1 < i_2 < \ldots < i_k$, $k \le n$, such that for every $j = 1, \ldots, k-1$
the equalities $f(i_j+1) = f(i_j+2) = \ldots = f(i_{j+1}) = b$ hold, where b equals 1 or 0, and for every $i > i_k$,
$f(i) = f(i_k) = 0$ or 1.
The set $U_n$ is identifiable by an inductive inference machine (IIM) F which, for every
function $f \in U_n$, first outputs an index of the function which is equal to f(0) for all values of the
argument, and then outputs an index of the function which is equal to f(i) for all values of the
argument whenever an i is found such that $f(i) \neq f(i-1)$.
It is easy to see that when working on a function f from the set $U_n$, 1) F outputs no more than
n+2 hypotheses, and 2) the last hypothesis of F is an index of a function which differs from f only on a finite set of arguments. Hence $U_n \in EX_{n+1}^{IN}$.
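The behaviour of the machine F described above can be sketched as follows. This is a toy model under assumptions: a hypothesis is written as the pair `("const", b)` standing in for an index of the constant function b, and the names `constant_learner` and `run_learner` are hypothetical.

```python
def constant_learner(prefix):
    """Hypothesis after seeing a non-empty prefix f(0), ..., f(t-1).

    Returns ("const", b), a stand-in for an index of the constant function b,
    where b is the most recently seen value.
    """
    return ("const", prefix[-1])


def run_learner(values):
    """Feed growing prefixes to the learner and record each new hypothesis."""
    outputs = []
    for t in range(1, len(values) + 1):
        hyp = constant_learner(values[:t])
        if not outputs or outputs[-1] != hyp:
            outputs.append(hyp)
    return outputs


if __name__ == "__main__":
    # A 0/1-valued function prefix whose value changes twice.
    print(run_learner([0, 0, 1, 1, 1, 0, 0, 0, 0]))
```

The number of hypotheses output equals one plus the number of positions i with f(i) ≠ f(i-1), which is how the bound on value changes in $U_n$ translates into a bound on hypotheses.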
Let us prove that $U_n \notin EX_n^{IN}$. Assume, to the contrary, that there is an IIM F
which identifies the set $U_n$ up to a subset of an IN-set with no more than n+1 hypotheses.
We use mathematical induction.
Let n = 0. Then $U_n$ contains the function $f_0$, which is equal to 0 for all values of the argument, and
the functions $f_i$, i = 1, 2, 3, ..., where for every i, $f_i(x) = 0$ if $x < i$ and $f_i(x) = 1$ if $x \ge i$. Let F work on the
function $f_0$. Since F can identify this function, after k steps it will output a hypothesis h. The
same hypothesis will be output by F on the function $f_k$. But, since $f_0$ and $f_k$ differ on an infinite set of
arguments, it follows from Lemma 1 that the sets on which $\varphi_h$ differs from $f_0$ and from $f_k$ cannot both be
subsets of IN-sets. Hence $U_0 \notin EX_0^{IN}$.
Let n = k+1 and $U_k \notin EX_k^{IN}$. Then there exists a function $f' \in U_k$ such that on this
function F outputs no fewer than k+2 hypotheses. The last hypothesis will be output at the k-th step. Then the same hypothesis will be output by F for both functions $f'_0$ and $f'_1$, where
$f'_0(x) = f'_1(x) = f'(x)$ if $x < k$, and
$f'_0(x) = 0$, $f'_1(x) = 1$ if $x \ge k$.
Therefore, the last hypothesis is wrong for one of these functions and $U_{k+1} \notin EX_{k+1}^{IN}$,
because $f'_0 \in U_{k+1}$ and $f'_1 \in U_{k+1}$.
Theorem 1 implies the following assertion as a simple corollary.
Theorem 2. For every $n \in N$ there exists a set $U_n$ of total recursive functions such that
$U_n \in EX^{IN}$ and $U_n \notin EX_n^{IN}$.
It follows from Theorem 2 that the classes $EX_n^{IN}$ do not contain the set of all total recursive
functions $\mathcal{R}$. Now we prove that $\mathcal{R}$ is not contained in $EX^{IN}$ either. It suffices to prove this for I,
since $C \subset \ldots \subset HI \subset I$.
Theorem 3. $\mathcal{R}$ is not identifiable in the limit up to subsets of immune sets.
Proof.
Assume there is an IIM F which identifies the class $\mathcal{R}$ of all total recursive functions in the limit. Let $F(\langle 0 \rangle) = a$. We define a total
recursive function f by the following procedure.
Step 0.
Define f(0) = 0.
Step i (i = 1, 2, 3, ...).
Compute in parallel two sequences
$F(\langle v_0 \rangle), F(\langle v_0, v_1 \rangle), F(\langle v_0, v_1, v_2 \rangle), \ldots$
and
$F(\langle w_0 \rangle), F(\langle w_0, w_1 \rangle), F(\langle w_0, w_1, w_2 \rangle), \ldots$
where $v_i = w_i = f(i)$ if f(i) is already defined, and $v_i = 0$ and $w_i = 1$ if f(i) is not yet defined. The
computation is performed until a new hypothesis is produced by F for one of the sequences. Then
add the values from this sequence to the function f and go to step i+1.
There are two distinct possibilities:
a) f is total. Then F working on f changes hypotheses infinitely often.
b) f is defined on a finite set only. We define two functions $h_0$ and $h_1$ in the following way:
$h_0(x) = h_1(x) = f(x)$ for all x where f(x) is defined,
$h_0(x) = 0$ and $h_1(x) = 1$ for all x where f(x) is undefined.
The functions $h_0$ and $h_1$ are total recursive and it follows from Lemma 1 that on at least one of
them F will output a wrong last hypothesis. Hence we have a total recursive function which is not
identifiable by F. Contradiction.
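One step of the construction in the proof of Theorem 3 can be sketched with a bounded search. This is only an approximation under stated assumptions: the real procedure searches without bound, the learner is modelled as a Python function from tuples to hypotheses, and the names `extend_until_change` and `last_value_learner` are hypothetical.

```python
def last_value_learner(prefix):
    """Toy learner: guesses the constant function equal to the last value seen."""
    return ("const", prefix[-1])


def extend_until_change(learner, prefix, max_len):
    """Extend the defined part of f by 0s and by 1s in parallel.

    Returns the first extension on which the learner outputs a hypothesis
    different from its hypothesis on `prefix`, or None if no change is found
    before the extensions reach length max_len.
    """
    current = learner(tuple(prefix))
    v = list(prefix)  # extension by zeros
    w = list(prefix)  # extension by ones
    while len(v) < max_len:
        v.append(0)
        w.append(1)
        if learner(tuple(v)) != current:
            return v
        if learner(tuple(w)) != current:
            return w
    return None


if __name__ == "__main__":
    f = [0]  # Step 0: f(0) = 0
    for _ in range(3):
        f = extend_until_change(last_value_learner, f, 20)
    print(f)
```

With the toy learner every step finds a change, which corresponds to case a). A learner that never changes its hypothesis makes the search return `None`, which corresponds to case b), where the two extensions $h_0$ and $h_1$ both receive the same final hypothesis.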
Identifiability up to different types of immunity
Results in this section are based on the following lemma.
Lemma 2. Let $A, B \in \{I, HI, HHI, QG, Q, II, C, *\}$ and let W be a recursively enumerable
set such that $W \in A \setminus B$. Then there exists a set U of total recursive functions such that:
(i) U is finitely identifiable up to A;
(ii) U is not identifiable in the limit up to B.
Proof. For every IIM $F_k$ we will construct a total recursive function on which $F_k$ produces a wrong
result in the sense of identification up to B. We define U to be the set of all such functions.
We are going to use the recursion theorem. To this end, for every i we construct two recursive functions: $\varphi_{g(i)}$, $\varphi_{n(i)}$. We define $\varphi_{g(i)}(0) = \varphi_{n(i)}(0) = i$.
Let $F_k(\langle i \rangle)$ be equal to $a_0$. Then we use the following procedure.
Step i, i = 0, 1, 2, ...
We compute in parallel the sequence
$F_k(\langle i, v_1 \rangle), F_k(\langle i, v_1, v_2 \rangle), F_k(\langle i, v_1, v_2, v_3 \rangle), \ldots$, where
$v_j = \varphi_{n(i)}(j)$ if $\varphi_{n(i)}(j)$ is already defined, and $v_j = 0$ otherwise,
and enumerate the set $W = \{w_0, w_1, w_2, \ldots\}$,
and compute the sequence $\varphi_{a_i}(0), \varphi_{a_i}(1), \varphi_{a_i}(2), \ldots$ until either (i), or (ii), or (iii) holds.
(i) A new hypothesis $a_{i+1}$ is produced by $F_k$. Then go to step i+1.
(ii) A new element $w_j$ is found in W. Then define
$\varphi_{g(i)}(w_j) = \varphi_{n(i)}(w_j)$ if $\varphi_{n(i)}(w_j)$ is already defined, and $\varphi_{g(i)}(w_j) = 0$ otherwise.
(iii) A new element $\varphi_{a_i}(j)$ is computed for which $\varphi_{g(i)}(j)$ is not yet defined. Then define
$\varphi_{n(i)}(j) = \varphi_{a_i}(j) + 1$ and $\varphi_{n(i)}(l) = 0$ for every $l < j$ for which $\varphi_{n(i)}(l)$ is not yet defined.
Now we use the recursion theorem. There is an $i_0$ such that $\varphi_{i_0} = \varphi_{g(i_0)}$. If $\varphi_{n(i_0)}$ is total
then we put it into U. If $\varphi_{n(i_0)}$ is not total then it is defined only on an initial fragment [0, m] of N.
Then we put into U the following function:
$f_m(x) = \varphi_{g(i_0)}(x)$ if $x < m$, and $f_m(x) = 0$ otherwise.
Now we show that $F_k$ does not identify this function in the limit up to B. We distinguish among
several possibilities:
a) The construction of $\varphi_{g(i_0)}$ consists of infinitely many steps. Then the hypotheses are
changed infinitely often.
b) The construction of $\varphi_{g(i_0)}$ consists of a finite number of steps but the function $\varphi_{n(i_0)}$ is
total. Then there are infinitely many values of x for which $\varphi_{a_j}(x) \neq \varphi_{n(i_0)}(x) = f(x)$, where f is the function
$\varphi_{n(i_0)}$ to be identified and $a_j$ is the last hypothesis of $F_k$ on f. Since f is total recursive and $\varphi_{a_j}$ is
partial recursive, the set of such x's is recursively enumerable. On the other hand, it is infinite. Hence $F_k$ does not identify f up to B.
c) The construction of $\varphi_{g(i_0)}$ consists of a finite number of steps, and the function $\varphi_{n(i_0)}$ is
defined only on a finite set. Then $\varphi_{a_j}(x)$ is not defined for all but a finite number of positive integers.
Hence $F_k$ does not identify $\varphi_{n(i_0)}$ up to B.
The following theorem is a simple corollary of this Lemma.
Theorem 4. For every one of the following pairs (C,*), (Q,C), (QG,Q), (HHI,QG), (HI,HHI), (I,HI), (II,C), (HI,II) (for the sake of brevity denoted by ($IN_1$, $IN_2$)) there is a set of
total recursive functions U such that U is finitely identifiable up to $IN_1$ and U is not identifiable up
to $IN_2$.
Acknowledgements
We want to thank E. B. Kinber for a helpful idea and the anonymous referee for the hint to
consider hyperhyperimmune and more exotic sets as well in our paper.
References
1. Arslanov M.M. Two theorems on recursively enumerable sets. - Algebra i Logika, 1968,
v.7, No.3, p.4-8 (Russian).
2. Barzdin J.M. Two theorems on the limiting synthesis of functions. - Latvii gosudarst. Univ.
Ucenye Zapiski, 1974, v.210, p.82-88.
3. Case J., and Smith C. Anomaly hierarchies of mechanized inductive inference. - Proc. 10th
STOC, ACM, 1978, p.314-319.
4. Case J., and Smith C. Comparison of identification criteria for machine inductive inference.
- Theoretical Computer Science, 1983, v.25, p.193-220.
5. Freivald R. On the limit synthesis of numbers of general recursive functions in various
computable numerations. - Soviet Math. Doklady, 1974, v.15, No.6, p.1681-1683.
6. Freivalds R., and Kinber E.B. Criteria of distinction among limit identification types. -
"Sintez, testirovanie i otladka programm". Proc. USSR National Symposium, Riga, 1981, p.128-
129 (Russian).
7. Pitt L. A characterization of probabilistic inference. - of the 25th Annual Symposium on