

Computability Theory: Learning Programs to Fit/Predict Data & Machine Self-Reference

John Case

Computer and Information Sciences Department, University of Delaware

Newark, DE 19716 USA

Email: [email protected]

SIGNewGrad 2012, 114 Gore


Outline

Sample of Computational Learning Theory results.

Today's sample: applicable to Cognitive Science [CCJS07, CCJS08, BCM+08, CM08, CK10b, CK10a]. I'm also interested in other results applicable to Philosophy of Science [CS83, CJNM94, Cas07, Cas12] and to empirical Machine Learning [CJO+00, CJK+01, COSS02, CJM+06, CJ10].

My theory project on Machine Self-Reference [CM09a, CM12, CM09b, CM11].

John Case (CIS Department) Learn. Theory/Self-Reference SIGNewGrad’12 2 / 11


U-Shaped Learning

Learn, Unlearn, Relearn. This occurs in child development in, e.g., verb regularization and in the understanding of various (Piaget-like) conservation principles, e.g., temperature and weight conservation, and in the interaction between object tracking and object permanence.

Irregular Verb Example: A child first uses spoke, the correct past tense of the irregular verb to speak. Then the child ostensibly overregularizes, incorrectly using speaked. Lastly, the child returns to using spoke.

Concern of Prior Literature: How to model U-shaped learning? E.g., in language learning, by general rules vs. tables of exceptions?

My Interest: Is U-shaped learning an unnecessary accident of human evolution, or is U-shaped learning advantageous in that some classes of tasks can be learned in a U-shaped way, but not otherwise?

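The learn/unlearn/relearn trajectory of the irregular-verb example can be sketched in code. This is a minimal illustrative toy, not a cognitive model and not from the talk; the three "stages" and the rule/exception split are my own hypothetical framing:

```python
# Toy sketch (illustrative only) of learn/unlearn/relearn for "speak" -> "spoke".
# Stage numbering and rules are hypothetical, chosen to reproduce the U shape.
def past_tense(verb: str, stage: int) -> str:
    exceptions = {"speak": "spoke"}              # rote-memorized irregular forms
    if stage == 1:                               # stage 1: rote memory only
        return exceptions[verb]
    if stage == 2:                               # stage 2: overregularization, "+ed" rule wins
        return verb + "ed"
    return exceptions.get(verb, verb + "ed")     # stage 3: general rule plus exception table

trajectory = [past_tense("speak", s) for s in (1, 2, 3)]
print(trajectory)  # correct, then incorrect, then correct again: the U shape
```

Here `trajectory` comes out as `["spoke", "speaked", "spoke"]`, mirroring the child's observed behavior.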


Formal Definitions

Input: T(0), T(1), ... −→ M −→ Output: g_0, g_1, ..., g_t, ...

Criteria for: some M successfully learns every language L in a class 𝓛. Suppose: N⁺ = {1, 2, ...}; b ∈ (N⁺ ∪ {∗}); x ≤ ∗ means x < ∞; T is a text for L ⇔def {T(0), T(1), ...} = L; and W_g =def the language generated by grammar g — W_g is the behavior of g.

𝓛 ∈ TxtFex_b: (∃M)(∀L ∈ 𝓛)(∀T for L)(∃t) [g_t, g_(t+1), ... each generates L and card({g_t, g_(t+1), ...}) ≤ b]. TxtEx =def TxtFex_1.

𝓛 ∈ TxtBc: (∃M)(∀L ∈ 𝓛)(∀T for L)(∃t) [g_t, g_(t+1), ... each generates L].

Suppose C ∈ {TxtFex_b, TxtBc}. Then 𝓛 ∈ NonUC: (∃M witnessing 𝓛 ∈ C)(∀L ∈ 𝓛)(∀T for L)(∀i, j, k | i < j < k) [W_(g_i) = W_(g_k) = L ⇒ W_(g_j) = L]. Non-U-shaped learners never abandon a correct behavior for an L ∈ 𝓛 only to return to it later.

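For intuition about single-grammar convergence in the limit (TxtEx = TxtFex_1), here is a minimal sketch on a toy class: the languages L_n = {0, ..., n}, with a learner whose conjecture "n" is read as a grammar for L_n. The class, the learner, and the sample text are my illustrative choices, not from the talk:

```python
# Toy sketch of TxtEx-style identification in the limit (illustrative only).
# Class: L_n = {0, 1, ..., n}. Conjecture "n" names a grammar for L_n.
def learner(data_prefix):
    return max(data_prefix)  # conjecture the largest element seen so far

# A text for L_5 = {0,...,5}: every element appears; repeats are allowed.
text = [3, 0, 5, 2, 5, 1, 4, 5, 0, 3]
conjectures = [learner(text[:t + 1]) for t in range(len(text))]
print(conjectures)
# Past some point t, the conjecture never changes again: convergence to a
# single correct grammar, as TxtEx requires.
```

On this text the conjectures are `[3, 3, 5, 5, 5, 5, 5, 5, 5, 5]`: two wrong guesses, then permanent convergence to 5 (a grammar for L_5).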


Diagram of Results

The transitive closure of the following inclusions (−→) holds, AND no other inclusions hold (the original slide drew these as a 2-D diagram; the edges listed here are read off from the Main Results that follow):

NonUTxtEx = TxtEx = NonUTxtFex_b (for each b)

TxtEx −→ TxtFex_2 −→ TxtFex_3 −→ TxtFex_∗

TxtFex_2 −→ NonUTxtBc −→ TxtBc

TxtFex_∗ −→ TxtBc

E.g., from the above, there is some 𝓛 ∈ (TxtFex_3 − NonUTxtBc)! This same 𝓛 then cannot be in NonUTxtFex_∗ — else it would then be in NonUTxtBc. This 𝓛 does employ interplay between finite sets of exceptions and general rules.

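The phrase "transitive closure of the following inclusions" can be made concrete with a tiny reachability computation. The edge list below is my transcription of the diagram's proper inclusions (omitting the equalities), so treat it as an assumption; the closure code itself is generic:

```python
# Sketch: transitive closure of the diagram's proper inclusions by
# Warshall-style saturation. Edge list transcribed by me from the slide.
from itertools import product

edges = {
    ("TxtEx", "TxtFex2"), ("TxtFex2", "TxtFex3"), ("TxtFex3", "TxtFex*"),
    ("TxtFex2", "NonUTxtBc"), ("NonUTxtBc", "TxtBc"), ("TxtFex*", "TxtBc"),
}
nodes = {n for e in edges for n in e}

def closure(edges, nodes):
    reach = set(edges)
    changed = True
    while changed:                       # saturate: add (a,c) whenever (a,b),(b,c) present
        changed = False
        for a, b, c in product(nodes, repeat=3):
            if (a, b) in reach and (b, c) in reach and (a, c) not in reach:
                reach.add((a, c))
                changed = True
    return reach

inclusions = closure(edges, nodes)
# TxtEx reaches TxtBc through the chain; TxtFex3 does NOT reach NonUTxtBc,
# matching the slide's witness class in (TxtFex_3 - NonUTxtBc).
assert ("TxtEx", "TxtBc") in inclusions
assert ("TxtFex3", "NonUTxtBc") not in inclusions
```

Note that "no other inclusions hold" is the hard, separating half of the theorem; the closure only generates the positive half.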


Main Results and A Question

Main Results:

From NonUTxtBc −→ TxtBc: U-shaped learning is needed for some class in TxtBc.

From NonUTxtEx = TxtEx: U-shaped learning is not needed for TxtEx learning, i.e., for learning ONE successful grammar in the limit.

From NonUTxtFex_∗ −→ TxtFex_2: U-shaped learning is needed for some class in TxtFex_2 even if we allow ∗ (finitely many) grammars in the limit; but, from TxtFex_2 −→ NonUTxtBc, it is not needed if we allow infinitely many grammars in the limit.

From the reasoning after the prior frame's diagram, there exists 𝓛 ∈ (TxtFex_3 − (NonUTxtFex_∗ ∪ NonUTxtBc)); in particular, U-shaped learning IS needed for this 𝓛 ∈ TxtFex_3 — even if we allow infinitely many grammars in the limit!

Question: Does the class of tasks humans must learn to be competitive in the genetic marketplace, like this latter 𝓛, necessitate U-shaped learning?



Machine Self-Reference

[Cartoon: a self-referential robot working out "3 + 4 = ?"]

Know thyself. — Greek proverb

Problem: Discover mathematically why the above might be good advice.

Initial Results:

No class of (recursive or non-recursive) denotational control structures characterizes the presence of arbitrarily usable self-referential programs in a universal programming language.

A coded-pipelining control structure epitomizes the complement of the latter.

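As a concrete taste of machine self-reference, here is a two-line self-replicating program in the spirit of Kleene's Recursion Theorem: a program that computes its own text. The construction is a standard quine trick and is my illustration, not material from the talk:

```python
# Quine-style self-reference (illustrative sketch): after these two lines run,
# the variable `quine` holds exactly the source text of these two lines.
src = 'src = %r\nquine = src %% src'
quine = src % src
print(quine)
```

Executing the text stored in `quine` rebuilds an identical `quine`: the program has complete, usable access to its own code, which is the fixed-point behavior the recursion theorem guarantees for arbitrary programming tasks, not just self-printing.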


References I

G. Baliga, J. Case, W. Merkle, F. Stephan, and W. Wiehagen. When unlearning helps. Information and Computation, 206:694–709, 2008.

J. Case. Directions for computability theory beyond pure mathematical. In D. Gabbay, S. Goncharov, and M. Zakharyaschev, editors, Mathematical Problems from Applied Logic II. New Logics for the XXIst Century, International Mathematical Series, Vol. 5, pages 53–98. Springer, 2007. Invited book chapter.

J. Case. Algorithmic scientific inference: Within our computable expected reality. International Journal of Unconventional Computing, 2012. Journal expansion of an invited talk and paper at the 3rd International Workshop on Physics and Computation 2010; to appear.

L. Carlucci, J. Case, S. Jain, and F. Stephan. Memory-limited U-shaped learning. Information and Computation, 205:1551–1573, 2007.

L. Carlucci, J. Case, S. Jain, and F. Stephan. Non U-shaped vacillatory and team learning. Journal of Computer and System Sciences, 74:409–430, 2008. Special issue in memory of Carl Smith.



References II

J. Case and S. Jain. Connections between inductive inference and machine learning. In C. Sammut and G. Webb, editors, Encyclopedia of Machine Learning, pages 210–219. Springer, 2010. http://www.cis.udel.edu/~case/papers/ml-enc5.pdf; invited chapter.

J. Case, S. Jain, S. Kaufmann, A. Sharma, and F. Stephan. Predictive learning models for concept drift. Theoretical Computer Science, 268:323–349, 2001. Special Issue for ALT'98.

J. Case, S. Jain, E. Martin, A. Sharma, and F. Stephan. Identifying clusters from positive data. SIAM Journal on Computing, 36(1):28–55, 2006.

J. Case, S. Jain, and S. Ngo Manguelle. Refinements of inductive inference by Popperian and reliable machines. Kybernetika, 30:23–52, 1994.

J. Case, S. Jain, M. Ott, A. Sharma, and F. Stephan. Robust learning aided by context. Journal of Computer and System Sciences, 60:234–257, 2000. Special Issue for COLT'98.

J. Case and T. Kötzing. Solutions to open questions for non-U-shaped learning with memory limitations. In M. Hutter et al., editors, 21st International Conference on Algorithmic Learning Theory (ALT'10), volume 6331 of Lecture Notes in Artificial Intelligence, pages 285–299, 2010. Expanded version invited for and accepted (with slightly new title) for the associated Special Issue of TCS, January 2011.



References III

J. Case and T. Kötzing. Strongly non U-shaped learning results by general techniques. In Proceedings of the 23rd Annual Conference on Learning Theory (COLT'10). Omnipress, 2010. www.colt2010.org/papers/COLT2010proceedings.pdf is the Proceedings.

J. Case and S. Moelius. U-shaped, iterative, and iterative-with-counter learning. Machine Learning, 72:63–88, 2008. Special issue for COLT'07.

J. Case and S. Moelius. Characterizing programming systems allowing program self-reference. Theory of Computing Systems, 45, 2009. Special Issue for CiE'2007; online version http://dx.doi.org/10.1007/s00224-009-9168-8.

J. Case and S. Moelius. Independence results for n-ary recursion theorems. In Proceedings of the 17th International Symposium on Fundamentals of Computation Theory (FCT'09), volume 5699 of Lecture Notes in Computer Science, pages 38–49. Springer, 2009. Journal version submitted.

J. Case and S. Moelius. Properties complementary to program self-reference, 2011.

J. Case and S. Moelius. Program self-reference in constructive Scott subdomains. Theory of Computing Systems, 51:22–49, 2012. http://dx.doi.org/10.1007/s00224-011-9372-1; Special Issue for CiE'09.



References IV

J. Case, M. Ott, A. Sharma, and F. Stephan. Learning to win process-control games watching game-masters. Information and Computation, 174(1):1–19, 2002.

J. Case and C. Smith. Comparison of identification criteria for machine inductive inference. Theoretical Computer Science, 25:193–220, 1983.
