
Finite Automata, Their Algebras and Grammars

J. Richard Büchi

Finite Automata, Their Algebras and Grammars Towards a Theory of Formal Expressions

Dirk Siefkes Editor

With 88 Illustrations

Springer Science+Business Media, LLC

J. Richard Büchi Computer Science Department Purdue University West Lafayette, Indiana 47907 U.S.A.

Library of Congress Cataloging-in-Publication Data Büchi, J. Richard.

Dirk Siefkes, editor Technische Universität Berlin Fachbereich Informatik D-1000 Berlin Federal Republic of Germany

Finite automata, their algebras and grammars: towards a theory of formal expressions / J. Richard Büchi; Dirk Siefkes, editor.

p. cm. Bibliography: p. Includes indexes. ISBN-13: 978-1-4613-8855-5 1. Sequential machine theory. I. Siefkes, Dirk. II. Title.

QA267.5.S4B83 1988 511-dc19 88-37420

Printed on acid-free paper

© 1989 by Springer Science+Business Media New York Originally published by Springer-Verlag New York Inc. in 1989 Softcover reprint of the hardcover 1st edition 1989

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc. in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Typeset by Asco Trade Typesetting Ltd., Hong Kong.

9 8 7 6 5 4 3 2 1

ISBN 978-1-4613-8855-5 ISBN 978-1-4613-8853-1 (eBook) DOI 10.1007/978-1-4613-8853-1

Preface of the Editor

J. Richard Büchi died unexpectedly in April 1984, leaving the present book unfinished. The book falls into two parts. The bulk of chapters 1 through 5 consists of Büchi's lecture notes for a course on finite automata, which he taught many times at Purdue University. Here he treats automata as unary algebras, an approach he developed with Jesse B. Wright in the 1950s in the Logic of Computers Group at the University of Michigan, Ann Arbor (see the introduction). The material of chapters 2 through 4 was first publicized in book form through Michael Harrison (1965), and is now classic. (References in these pages that are not in the bibliography are given at the end of the Preface.) In chapter 5 the author uses Post production systems to generate and accept word languages, and proves the most important result of this part of the book: Finite systems of rules of the form aξ → bξ, where a, b are words and ξ is a variable on words, generate only regular languages. Thus the usual regular rules are just a normal form.
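To make the rule format concrete, here is a minimal sketch of how rules that rewrite a prefix a into b, leaving the rest of the word untouched, act on words. The function name, the pair representation of rules, and the length bound are assumptions made for illustration and are not taken from the book. The bound merely keeps the search finite, so words reachable only through longer intermediate words are missed.

```python
from collections import deque


def prefix_rewrite_closure(axioms, rules, max_len):
    """Collect words derivable from `axioms` by prefix rules.

    Each rule is a pair (a, b): a word that starts with a may be rewritten
    by replacing that prefix with b. Only words of length <= max_len are
    explored (an illustrative cutoff, not part of the theory).
    """
    seen = {w for w in axioms if len(w) <= max_len}
    queue = deque(seen)
    while queue:
        word = queue.popleft()
        for a, b in rules:
            if word.startswith(a):
                new_word = b + word[len(a):]
                if len(new_word) <= max_len and new_word not in seen:
                    seen.add(new_word)
                    queue.append(new_word)
    return seen


# One rule with empty left side: any word w may be rewritten to "ab" + w.
words = prefix_rewrite_closure([""], [("", "ab")], max_len=6)
print(sorted(words, key=len))
```

In this toy system the derivable words are the powers of "ab", a regular set, as the result quoted above leads one to expect.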

The remainder of the book was added at different stages. In sections 1.7 through 1.10 Büchi develops a theory of closure spaces, which he needs in section 4.8 to study the structure lattice of finite automata. In section 1.11 he reflects on the foundations and history of computing, using production systems for building up computations. Originally he had included, as a separate chapter, some of his results on applications of finite automata to decision procedures for monadic second-order languages. Perhaps because the material was carefully presented in a book by Trakhtenbrot and Barzdin (1973), the author dropped that chapter from the book. The main additions, and the most novel parts of the book, are chapters 6 and 7. Here the author drops the restriction to unary algebras; now (finite) automata are general many-sorted algebras, which accept terms (trees) instead of words as input. Tree automata were introduced independently by Doner (1970) and Thatcher and Wright (1968); see also the survey by Thatcher (1973). Büchi himself never published on the subject, but worked on this part, in parallel with many other activities, until his death. The text is not complete; the material stems from different times; and some ideas are only sketched. To help the reader I will try to bring the author's intentions into better focus, and draw connecting lines as I see them.

In 1910 Axel Thue introduced terms, likely motivated by number theoretic problems, and apparently without knowing Frege. He represented terms as trees and as strings, and described algebras (as we call them today) by finite systems of equations between terms, which we call Thue systems. He posed the word problem for free algebras, and a variant of Post's correspondence problem. Apparently without knowing Thue, in 1921 and more explicitly in 1936 and 1943 Emil Post introduced his production systems, which are many-premise semi-Thue systems with variables. He insisted on making rules "canonical", that is, putting any restrictions on their application or on the range of the variables into the rules and not into the context. (The previous "aξ → bξ" instead of "a → b, to be applied at the left end of the word only" is an example.) These two mathematicians are the founders of both theories, that of word-rewriting systems and that of term-rewriting systems. In the former, which is known as formal language theory, one manipulates words by semi-Thue systems, or (formal) grammars. Rarely does one take into account that words might represent terms. For manipulating terms, variables are convenient. Thus in the latter theory one uses Post systems, though for terms. For Post this would have been a misuse, as one works independently of the term representation, and also not in a canonical way.

In chapters 6 and 7 Büchi aims to bring both fields together. In chapter 6 he investigates terms in parenthesis-free right-Polish notation. In sections 6.1 and 6.2 he produces Polish terms from the outside and from the inside by Post systems, and translates between string and tree representation. In sections 6.3 through 6.6 he considers one-sorted general algebras as automata responding to terms (or trees), independent of a representation, and more generally relational systems as nondeterministic tree automata. In this way he easily generalizes regularity theory from words to terms. In sections 6.7 and 6.8 he brings words and terms together. The reversed production systems serve to recognize, and then parse, those words that are terms. Right-left producing, which is natural for right-Polish terms, becomes left-right parsing. By making these rules canonical the author gets descriptions of push-down automata, and generalizes the result of chapter 5: Finite systems of rules ξaη → ξbη, where a, b are terms in Polish notation and ξ, η are variables for words, generate only regular sets of terms. If one were to allow arbitrary words for a, b, one would get arbitrary Thue systems, and thus would generate all recursively enumerable sets of words.
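The left-right recognition and parsing of right-Polish words can be sketched as follows. This is an illustrative reading only: the function name, the arity table, and the nested-tuple representation of trees are assumptions, not the book's notation. A letter of arity n consumes the n most recently completed subterms from a push-down store and pushes the combined term back.

```python
def parse_postfix(word, arity):
    """Parse a word in right-Polish (postfix) notation into a tree.

    `arity` maps each letter to its number of arguments. Returns a nested
    tuple (letter, subtree_1, ..., subtree_n), or None if the word is not
    a single well-formed term.
    """
    stack = []  # push-down store holding the subterms completed so far
    for letter in word:
        n = arity.get(letter)
        if n is None or len(stack) < n:
            return None  # unknown letter, or too few arguments available
        args = stack[len(stack) - n:]
        del stack[len(stack) - n:]
        stack.append((letter, *args))
    # Exactly one completed term must remain.
    return stack[0] if len(stack) == 1 else None


# Hypothetical alphabet: constants x, y; unary g; binary f.
ARITY = {"x": 0, "y": 0, "g": 1, "f": 2}
print(parse_postfix("xgyf", ARITY))  # tree for f(g(x), y)
print(parse_postfix("xf", ARITY))    # None: f lacks an argument
```

The same loop with the trees replaced by a mere counter of completed subterms decides whether a word is a term without building the tree.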

In chapter 7 the author generalizes the approach from Polish to arbitrary term notation; an example is the classic infix notation known from mathematics. In section 7.1 he produces and recognizes general terms by Post systems. The inside production systems are just the context-free word grammars. Many people have rediscovered this, but none of them attributes it to Thue. In section 7.2 he begins to investigate the monoid of words modulo the set of terms. He carries out the idea, which seems to promise efficient parser constructions, only for an example that I found among his notes. Generalizing section 6.7, in section 7.3 he arrives naturally at the classic result that pushdown automata accept exactly the context-free word languages. Generalizing section 6.8, in section 7.4 he gives a streamlined version of Langmaack's (1971) account of LRk-languages, including a new decision procedure for the LRk-property, for fixed k. Both sections are unfinished.

It is sad that Richard Büchi did not have more contact with people who worked in the same area. He stimulated the work of Deussen (1978, 1979, 1986), who uniformly describes the Chomsky hierarchy through rewriting systems, generating and accepting left-right and right-left, with special interest in parsing. Büchi would have liked the results and the presentation; he wanted just that for terms. Deussen mentions the books of Nelson (1968) and Salomaa (1973), which make use of the duality between generating and accepting; I doubt whether Büchi knew them. He did not know the approach of Ehrenfeucht, Hoogeboom, and Rozenberg (1985) who get the Chomsky classes by rewriting vectors of words. When he states his beautiful problems on starheight and feedback number in section 4.6 he does not mention the work of Hashigushi (e.g., 1983). As far as I know he was not aware of the growing area of term-rewriting systems (see, e.g., the expository paper by Huet and Oppen, 1980). This might have saved him some bitter remarks on the state of interest in terms, although his ideas were different. He did not live to see the book on tree automata by Gécseg and Steinby (1984), which contains most of his sections 6.3 through 6.6. He was in contact with Don Pigozzi, and there is an unpublished Purdue seminar report of Pigozzi (1975) on equational theories, which he quoted originally together with the book by Grätzer (1968) in the conclusion. But I am not sure whether Büchi saw the newer books on universal algebra and equational logic (e.g., Burris and Sankappanavar, 1981; Cohn, 1981). For his approach to computing through producing in section 1.11 and chapters 6 and 7, the book by Davis and Weyuker (1983) probably would have been valuable. He would have been excited to see the LBA-problem solved by Immerman (1988) and Szelepcsényi (1988) who proved that nondeterministic space classes are closed under complement. And I am sure he would have liked Lakatos' (1976) monsters, which are so like his own monsters in the introduction to chapter 7.

There are other areas of research which Richard Büchi knew, but was not much interested in. For example, when he speaks of the algebraic theory of automata, he refers to his own approach where he treats automata as algebras. For other people, however, the term means "theory of monoids". For results in this area the reader might compare the books by Eilenberg (1974, 1976); for example, the variety theorem with regard to section 3.3 of the present book. Other books in the area are those by Salomaa and Soittola (1978), Brauer (1984) which contains many detailed examples, and Pin (1984). For section 4.7 one might consult the book by Berstel and Perrin (1985) on a theory of codes. Considering the introduction to chapter 7, for an algebraic treatment of context-free languages one should mention the presentations by Chomsky and Schützenberger (1963), by Berstel (1979), and by Kuich and Salomaa (1986).

On the other hand, absorbing all this writing might have prevented Büchi from writing this book, which contains much material published elsewhere but is so unique in spirit and focus that it ought to stimulate a lot of new research. The experienced reader will without too much difficulty translate between the present book and the literature already known to him, and so will the beginner when confronted with other books later on. "And he will learn a lot on the way", Richard Büchi would have added. For example, Büchi calls function symbols letters, since they (1) include the type information of the function and (2) serve as basic material for building terms, which are words. He calls the term algebra master algebra, for obvious reasons. He calls the LL~grammars of exercise 3, problem 3 in chapter 7 ~~L' since this indicates how he found it, and maybe because he does not want to give away what he expects the reader to find.

When found, the manuscript did not look as uniform as the finished text does now. The older part as described above was typed; I corrected minor errors. The rest consisted of sets of manuscripts, mostly handwritten, and many notes. Following varying outlines I put the pieces together, changing the text as little as possible. Wherever necessary and possible I extended the text, and especially the exercises, from the notes. This explains the sentence in the introduction to chapter 7, "From here on you will perhaps miss the many little exercises", and other remarks. Richard Büchi must have known for some time that he would not finish the book. I put together the introductions to the book and to chapters 6 and 7 from different sources from widely varying times, sometimes inserting three stars to indicate the seams. I chose an appropriate piece as the conclusion to chapter 7, and to the book as a whole.

I am grateful to several people who helped me personally, scientifically, and financially. Sylvia Büchi and Leonard Lipshitz made the manuscripts accessible, and supported me greatly in many respects. Walter Kaufmann-Bühler, whom we miss now, too, had patiently encouraged Richard Büchi throughout the years; he did the same with me, until he died. Helga Barnewitz typed all the new material from much-worked-over sources in bad handwriting. Wolfgang Thomas worked through the book in a seminar, and gave many valuable comments which made me understand the book and its context much better. Also Peter Deussen, Dieter Hofbauer, and Hans Langmaack read the book, or parts of it, and suggested corrections; I gained especially from Peter Deussen's intimate knowledge of the area and of Richard Büchi's attitude toward it.

When Richard Büchi wanted to present an area, he did not talk about it, but worked at an example. It might be a tiny exercise, or an open problem. The same spirit pervades the book, especially chapters 6 and 7. It is an undergraduate textbook, and a deep source of interesting problems as well. Some of the problems are stated and discussed as such in the text; others are mixed unobtrusively into the little exercises. Very easy sections such as 6.1 and 6.2 alternate with very hard ones such as 6.8 and 7.4. I have said above that the results on regular sets of words and terms in sections 5.4 and 6.8 seem important to me. Personally I like best the thorough use of production systems, which brings together formal language theory and the theory of term-rewriting systems, including tree automata; and the view on algebras as automata, which lets many things fall into place. Tree automata work naturally from the leaves to the root; see section 6.3. Right-, not left- or bicongruences, correspond naturally to automata that read words from the left. For more details on the book and on the whole work of Richard Büchi the reader might consult my papers (1987) and (1985), respectively, as well as the book by Mac Lane and myself (1989).
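The remark that tree automata work from the leaves to the root can be illustrated by letting a finite algebra respond to a term: each letter is interpreted as an operation on states, and the state of a term is computed from the states of its subterms. The sketch below is an assumption-laden illustration (the names, the nested-tuple trees, and the two-element example algebra are not from the book).

```python
def run(term, operations):
    """Evaluate `term` bottom-up in the algebra given by `operations`.

    A term is a nested tuple (letter, subterm_1, ..., subterm_n);
    `operations` maps each letter to a function on states.
    """
    letter, *subterms = term
    return operations[letter](*(run(s, operations) for s in subterms))


def accepts(term, operations, final_states):
    """A term is accepted if its state at the root is a final state."""
    return run(term, operations) in final_states


# Hypothetical two-state algebra over the letters 0, 1, not, and.
OPS = {
    "0": lambda: 0,
    "1": lambda: 1,
    "not": lambda a: 1 - a,
    "and": lambda a, b: a & b,
}
term = ("and", ("1",), ("not", ("0",)))
print(accepts(term, OPS, final_states={1}))
```

Here the accepted terms are exactly those that evaluate to 1; any other finite algebra with a chosen set of final states defines a set of terms in the same way.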

I am, however, no expert. I hope to have brought the book into readable form, so that it will stimulate other people. The reader should try for himself.

References to Editor's Preface

Berstel, J. 1979. Transductions and Context-free Languages. Stuttgart: Teubner.

Berstel, J., and D. Perrin. 1985. Theory of Codes. New York: Academic Press.

Brauer, W. 1984. Automatentheorie. Stuttgart: Teubner.

Burris, S., and H. P. Sankappanavar. 1981. A Course in Universal Algebra. Berlin-Heidelberg-New York: Springer-Verlag.

Chomsky, N., and M. P. Schützenberger. 1963. The algebraic theory of context-free languages. In P. Braffort and D. Hirschberg (Eds.). Computer Programming and Formal Systems. Amsterdam: North-Holland, pp. 118-161.

Cohn, P. M. 1981. Universal Algebra. Dordrecht, Holland: Reidel.

Davis, M., and E. Weyuker. 1983. Computability, Complexity, and Languages. New York: Academic Press.

Deussen, P. 1978. A unified approach to the generation and the acceptation of formal languages. Acta Informatica 9, 377-390.

Deussen, P. 1979. One abstract accepting algorithm for all kinds of parsers. In H. A. Maurer (Ed.). Lect. Notes Comp. Sci. Vol. 71. Berlin-Heidelberg-New York: Springer-Verlag, pp. 203-217.

Deussen, P. 1986. Erzeugung, Akzeption und syntaktische Analyse formaler Sprachen. Vorlesungsmanuskript Fakultät Informatik, Universität Karlsruhe, Germany, WS 1986/87. Parts I and II.

Ehrenfeucht, A., H. J. Hoogeboom, and G. Rozenberg. 1985. On coordinated rewriting: Fundamentals of computation theory. Lect. Notes Comp. Sci. 199, 100-111.

Eilenberg, S. 1974, 1976. Automata, Languages, and Machines. Vol. A, B. New York: Academic Press.

Gécseg, F., and M. Steinby. 1984. Tree Automata. Budapest: Akadémiai Kiadó; Philadelphia: Heyden & Son.

Harrison, M. A. 1965. Introduction to Switching and Automata Theory. New York: McGraw-Hill.

Hashigushi, K. 1983. Representation theorems on regular languages. J. Comp. Syst. Sci. 27, 101-115.

Huet, G., and D. C. Oppen. 1980. Equations and rewrite rules: A survey. In R. V. Book (Ed.). Formal Language Theory. New York: Academic Press.

Immerman, N. 1988. Nondeterministic space is closed under complementation. SIAM Journal on Computing 17, 935-938.

Kuich, W., and A. Salomaa. 1986. Semirings, Automata, Languages. Berlin-Heidelberg-New York: Springer-Verlag.

Lakatos, I. 1976. Proofs and Refutations. Boston: Cambridge University Press.

Mac Lane, S., and D. Siefkes. 1989. Collected Works of J. Richard Büchi. Berlin-Heidelberg-New York: Springer-Verlag, in press.

Nelson, R. J. 1968. Introduction to Automata. New York: John Wiley.

Pigozzi, D. 1975. Equational logic and equational theories of algebras. Techn. Report Purdue University CSD, TR-135.

Pin, J. E. 1984. Variétés de langages formels. Paris, New York: Masson.

Salomaa, A. K. 1973. Formal Languages. New York: Academic Press.

Salomaa, A. K., and M. Soittola. 1978. Automata-Theoretic Aspects of Formal Power Series. Berlin-Heidelberg-New York: Springer-Verlag.

Siefkes, D. 1985. The work of J. Richard Büchi. To appear in Thomas L. Drucker (Ed.). Perspectives on the History of Mathematical Logic. Special Issue. Proc. AMS Spring Meeting Chicago, Birkhäuser-Boston.

Siefkes, D. 1987. Grammars for Terms and Automata. On a book by the late J. Richard Büchi. In Egon Börger (Ed.). Computation Theory and Logic. Lecture Notes in Computer Science, vol. 270. New York: Springer-Verlag, pp. 349-359.

Szelepcsényi, R. 1988. The method of forced enumeration for nondeterministic automata. Acta Informatica 26, 279-284.

Thatcher, J. W. 1973. Tree automata: an informal survey. In A. Aho (Ed.). Currents in the Theory of Computing. Englewood Cliffs NJ: Prentice-Hall, pp. 143-172.

Berlin Dirk Siefkes

Contents

Preface of the Editor

Introduction

Chapter 1 Concepts and Notations in Discrete Mathematics

§1.1. The Notations of Logic
§1.2. The Natural Number System
§1.3. Sets and Functions
§1.4. Binary Relations, Isomorphisms
§1.5. Equivalence Relations, Partial Orders, and Rectangular Relations
§1.6. Lattices and Boolean Algebras
§1.7. Set Lattices and Quasi-Orders
§1.8. Semi-set Lattices and Closure Spaces
§1.9. Discrete Closure Spaces
§1.10. Classification of Closure Spaces
§1.11. Procedures of Computation, Production, and Proof

Chapter 2 The Structure Theory of Transition Algebras

§2.1. The Transition Algebra of a Logical Net
§2.2. The Response Function of a k-Algebra
§2.3. Accessible States of a Transition Algebra
§2.4. The Basic Concepts of Algebra and Their Meaning for Automata
§2.5. The Structure Lattice of k-Algebras

Chapter 3 The Structure and Behavior of Finite Automata

§3.1. The Outputs of a k-Algebra
§3.2. The Minimal Automaton of Given Behavior
§3.3. Finite-State Acceptors and Their Right- and Left-Behaviors
§3.4. Periodic Sets of Words

Chapter 4 Transition Systems and Regular Events

§4.1. Transition Systems and the Subset Construction
§4.2. The Behavior of Transition Systems with Output
§4.3. Spontaneous Transitions, Closure Properties on Periodic Events
§4.4. Regular Events
§4.5. Regular Expressions; the Analysis and Synthesis Theorems
§4.6. Starheight and Feedback Number
§4.7. General Transition Systems, the Coding Lemmas
§4.8. Systems Are Quotients of Algebras, Modulo-Compatible Closure Relations

Chapter 5 Regular Canonical Systems

§5.1. Regular Systems and Finite Automata
§5.2. Finite Automata Are Regular Systems
§5.3. Minimal and Periodic Descriptions of a Regular Set
§5.4. Regular Systems Produce Periodic Sets
§5.5. Regular Rules with Many Premises
§5.6. Right and Left Regular Rules
§5.7. Normal Systems and Regular Systems

Chapter 6 General Algebras: How They Function as Tree Acceptors and Push-down Automata

§6.1. How Terms Are Constructed from the Outside and from the Inside
§6.2. Terms Are Trees and Trees Are Terms; The Run of a Production
§6.3. Algebras, and How They Respond to Input Signals
§6.4. Standard Presentation of Algebras
§6.5. The Behavior of Finite Tree Acceptors, Periodic Sets of Terms
§6.6. Regular Sets of Terms; Analysis and Synthesis of Finite Tree Acceptors
§6.7. How !-Automata Accept Terms in Tree Time and in Real Time; Tree Automata and Push-down Automata
§6.8. Regular Tree-producing Grammars

Chapter 7 General Alphabets: The Theory of Push-down Automata and Context-free Languages

§7.1. General Alphabets, and How They Produce Terms from the Outside and from the Inside
§7.2. Leibniz-Thue law, and the Basic Grammatical Facts about Terms
§7.3. Push-down Automata and Context-free Languages
§7.4. Push-down Parsers for LRk-Grammars

Conclusion

List of Symbols

References

Index

Introduction

Our world, both natural and technological, abounds in systems that may be thought of, at least in a first approximation, as operating in accordance with the following specifications:

1. Finite number of internal states. At each instance the system is in one out of a finite number n of well-distinguishable internal configurations.

2. Finite number of input and output states. The system is connected to the environment by an input channel through which at each instance one out of a finite number k of well-distinguishable stimuli can be imposed on the system. In turn, the system can influence the environment through an output channel capable of taking on but a finite number m of states. Because of specifications 1 and 2, the input, internal, and output states can change only at discrete time instances t = 0, 1, 2, ....

3. Determinism. At time t = 0 the system is in a specific state A, called the initial state. The internal state at time t + 1 is uniquely determined by the pair consisting of the internal state at time t and the input state at time t. The output state at time t is uniquely determined by the internal state at time t.

Mechanical devices, or parts of machines working on mechanical principles, provide obvious examples of such discrete deterministic systems. Clocks may be mentioned as input-free examples; a combination lock is clearly meant to operate according to specifications 1, 2, and 3. It is rather tempting to consider biological systems (nerve nets, interactions among organs) from this point of view. Finally, we mention electronic devices, such as digital computers and their components. Thus, discrete deterministic systems include both natural systems (such as animal nervous systems) and artificial systems (such as computers), which well deserve to be termed information processing systems.
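Specifications 1, 2, and 3 can be sketched in a few lines, with a toy combination lock as the example. Everything named here (the class, the three-digit code, the state encoding as "number of correct digits entered so far") is an assumption chosen for illustration. The lock has n = 4 internal states, k = 10 input states (the digits), and m = 2 output states.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, List


@dataclass(frozen=True)
class DiscreteDeterministicSystem:
    initial_state: Hashable                                # state at time t = 0
    transition: Callable[[Hashable, Hashable], Hashable]   # (state, input) -> next state
    output: Callable[[Hashable], Hashable]                 # state -> output

    def run(self, inputs: Iterable[Hashable]) -> List[Hashable]:
        """Return the outputs at times t = 0, 1, 2, ... for the given inputs."""
        state = self.initial_state
        outputs = [self.output(state)]
        for signal in inputs:
            state = self.transition(state, signal)
            outputs.append(self.output(state))
        return outputs


CODE = "314"  # hypothetical combination


def lock_transition(state: int, digit: str) -> int:
    if state == len(CODE):
        return state                       # once open, the lock stays open
    if digit == CODE[state]:
        return state + 1                   # one more correct digit
    return 1 if digit == CODE[0] else 0    # restart, possibly on this digit


def lock_output(state: int) -> str:
    return "open" if state == len(CODE) else "locked"


lock = DiscreteDeterministicSystem(0, lock_transition, lock_output)
print(lock.run("3314"))
```

The output depends on the internal state alone and the next state on the pair of current state and input, which is exactly the dependence that specification 3 demands.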


An investigation of such systems can be either theoretical or empirical in nature. The goal of an investigation may be scientific or it can be technological in orientation; one may attempt to understand information processing systems or one can attempt to design, construct, or employ such systems.

A theoretical investigation of discrete deterministic systems, having as its goal the understanding of their function and use, will be the subject of this book. Because widely differing meanings and connotations attach to these words, we must elaborate somewhat on our understanding of "theoretical investigation". We mean hereby a sort of intellectual activity that may be divided into the following two parts:

(A) To start from old concepts and invent new ideas concerned with the empirical matters under consideration, and to bring these ideas into focus, such that eventually rigorous (i.e., mathematical) concepts that correspond to the vague intuitive ideas may be found.

(B) To find deductive relationships between the fundamental concepts isolated in process (A), that is, to rigorously prove theorems concerning these concepts.

The process (A) is premathematical: Its purpose is to establish a link between the empirical matters and some branch of mathematics. It is such links between vague intuitive and rigorous mathematical concepts that account for the empirical significance of purely mathematical results. Activity (B), of course, is purely mathematical.

For example, the phrase "understanding the working of a combination lock" can mean various things, among them the practical and immediate kind of understanding that the maker of a particular lock or the safecracker needs, and the understanding of the general principles of combination locks. Do not make any mistake about the practical value of the latter kind of understanding. For some reason or other, very practical considerations may make it desirable to construct, say, a more efficient lock or a great variety of different types of locks. In either case the second kind of understanding will be more important than the first, even on the ground of practicalness.

I do not mean to suggest here that premathematical and mathematical activities should be carried on in isolation from one another. In fact, a very intimate interplay between the two always takes place. Only in small minds is there a contrast between engineering, science, and mathematics; it is an historical fact that the same personality has often created the fundamental theoretical and practical ideas of a given subject matter, for example, Archimedes, Galilei, Newton, Euler, Gauss.

Let us look at the interplay between the two kinds of activities in some developments that have already occurred within the field of this book.

Pure switching. After a century of development of Boolean algebra, with entirely different empirical matters as background, various researchers (among them Shannon) noted that this preestablished branch of mathematics was well suited to function as a theory of two-valued switching. Activity (A) here consisted in discovering that the already existing concept of a Boolean expression may serve as a rigorous definition of "two-valued switching circuit". In turn this leads to new interest in purely mathematical problems, for example, that of simplifying Boolean expressions and determining the complexity of algorithms for validity. Much confusion would be eliminated, and much labor could be saved, if word spread more widely that for many-valued switching, too, the mathematical theory is already available in the form of Post algebras (1921).
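The validity problem mentioned above can be illustrated by the most naive algorithm, which tries every assignment of truth values. This is a sketch only: the function name and the representation of a Boolean expression as a Python function of named variables are assumptions. The running time grows as 2 to the number of variables, which is why the complexity of such algorithms is a real question.

```python
from itertools import product


def is_valid(expression, variables):
    """Check by truth table whether `expression` is true under all assignments.

    `expression` is a function taking the named variables as keyword
    arguments; `variables` lists their names.
    """
    return all(
        expression(**dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


# De Morgan's law, read as a statement about two-valued switching circuits.
de_morgan = lambda p, q: (not (p and q)) == ((not p) or (not q))
print(is_valid(de_morgan, ["p", "q"]))
```

Two switching circuits compute the same function exactly when the expression stating their equivalence is valid in this sense.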

Switching through feedback. While the theory of switching is important for the understanding of both artificial systems (computers, telephone nets) and natural systems (nerve nets, organisms), it does not explain the more interesting phenomenon of memory. At least one aspect of this is intimately related to feedback, that is, self-control. Only recently have exact mathematical concepts been formulated intended to correspond to finite empirical systems in which switching takes place through feedback. Such concepts are the finite automata of Kleene (1956), the logical nets of Burks and Wright (1953), and the restricted recursions of Church (1957, 1963). Because these are new notions (even though not far removed from others that have already been studied by algebraists and logicians), their mathematics is only now in the making. The sort of results one might want to work on, activity (B), are suggested by the empirical ideas about feedback and memory, activity (A).

Computability. The empirical idea of calculating the value of a function by employing an algorithm or a mechanical procedure has a history of several thousand years. Some writers hold that the discovery of such algorithms is the task par excellence of the mathematician, and indeed some of the most famous mathematical problems are about the existence of certain algorithms. Only recently, beginning with Thue (1910, 1914), Skolem (1919), and Post (1921), have algorithms themselves become the object of mathematical investigation. It was in the 1930s that exact mathematical definitions of the concept of algorithm were proposed by Gödel (1934), Church (1936), Kleene (1936), Post (1936), and Turing (1936). Today a very highly developed branch of mathematics is available, the theory of recursive functions and formal logic, that contains many deep and powerful results concerning the concept of mechanical computation. It is quite clear that significant insight into the nature of present computing devices, and the construction of entirely new sorts of machines, can be gained by an understanding of the theory of recursive functions and the carrying out of new research in this branch of mathematics.

Artificial languages. The idea of a language with precisely stated rules of grammar and rules of proof, activity (A), is due to Leibniz. But only by the end of the 19th century was an artificial language actually constructed by Frege (1879). Such formal languages have since become the basic tool of the logician and are important for understanding the idea of a mathematical proof. Thue (1910, 1914), and more intensively Post (1921, 1943), have concentrated on the matter of formation rules (grammar). Post's canonical systems provide the base for modern mathematical linguistics and a theory of programming languages. There is a very close relationship between the idea of computation and that of a formal proof. In fact, a formal proof is a special kind of computation. In turn, Gödel (1934) used the idea of a formal proof to define his general recursive functions. Also (Turing) machines are naturally interpreted to be special kinds of canonical systems, namely semi-Thue systems. So the concept of a computable function may be defined in terms of formal languages.

We may then well say that this book is in applied mathematics. However, it is not applied mathematics in the sense of applied analysis (the mathematics of the continuum of real numbers), as this term is most often used. Because our interest is in digital systems rather than in analog computers, we will look for applications of discrete or combinatorial mathematics, the mathematics of finite structures, abstract algebra, and modern logic. Potentially any result in these often quite modern branches of mathematics may be of relevance for the understanding of digital systems.

Finite automaton is the mathematical concept that renders precise the intuitive idea of a strictly finite operating system. This excludes such devices as Turing machines, which still operate in the discrete deterministic mode but make use of unlimited (potentially infinite) memory space. In his pioneering work of 1956, Kleene (1956) gave a rigorous definition of the behavior of such a system, and he proved two theorems about such behavior. His proofs provide a clear understanding of what discrete deterministic systems can and cannot do. Much work has since been done in this field, so that it is now possible to present concisely the rudiments of a mathematical theory, and applications of it.
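The idea of a strictly finite operating system can be made concrete with a small sketch. The automaton below is a hypothetical illustration, not one taken from the text: it reads words over the alphabet {a, b} and accepts those containing an even number of a's. The names TRANSITIONS, START, FINAL, run, and accepts are assumptions introduced only for this example.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical example automaton (not from the text): two states that
# record whether the number of a's read so far is even or odd.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}
START = "even"
FINAL: FrozenSet[str] = frozenset({"even"})


def run(word: str, state: str = START) -> str:
    """Feed the word to the automaton one symbol at a time and return
    the state reached. The state is the only memory that is used."""
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state


def accepts(word: str) -> bool:
    """A word is accepted if it drives the start state into a final state."""
    return run(word) in FINAL
```

However long the input word is, the device remembers nothing beyond one of its two states. This is the sense in which such a system is strictly finite, in contrast to a Turing machine with its unlimited memory space.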

The basic facts on automata are relevant to the designer of digital systems, in much the same way as are the basic facts on Carnot machines to the thermal engineer. In both cases the abstraction will be an extremely simplified version of the system the practitioner will encounter. In both cases the abstraction (because of its simplicity) will provide an insight that cannot be obtained by empirical methods alone. Conversely, the praxis provides a strong intuitive background that is essential for developing the theory, which thus might well be called the "theory of switching through feedback" or "theory of control in finite systems".

Some subject matters are distinguished by such a strong intuitive background. Every creature on this planet has a sense of space and time, and in the human mind this sense is developed to a highly sophisticated understanding of its environment. Hence the early development of the science of geometry (space), kinematics (space and time), number theory (counting). Our subject matter (words, linguistics, machines) shares the intuitive background with these fields. Juggling and operating strings of symbols comes very easily to mind.

If the definition of "finite automata" is appropriately chosen, it turns out that many basic concepts and results concerning the structure and behavior of finite automata are in fact just special cases of the fundamental concepts (homomorphism, congruence relation, free algebra) and facts of abstract algebra. Automata theory is simply the theory of universal algebras (in the sense of Birkhoff, 1948) with unary operations, and with emphasis on finite algebras. In turn, all material presented in chapters 2 through 5 of this book can be generalized to universal algebras with n-ary operations (see chapters 6 and 7), and in part leads to novel conceptions in this field, for example, reduced product, merger, cascading outputs and their behavior. Turning the instrument around is an old and venerable trick of the scientist.

From another point of view, the theory of finite automata may be viewed as a chapter in the arithmetic of words (i.e., sequences of symbols). It is a study of congruences of finite index on words, and these are a very natural generalization of the elementary congruences on natural numbers. As a contribution to the study of words, the theory is of interest to formal linguistics; finite-state grammars are but another version of finite automata. In chapters 6 and 7 we extend parts of our results in chapters 2 through 5 to the n-ary generalization of automata. This yields new insight into context-free grammars and push-down automata (see below).
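To illustrate how a congruence on words generalizes congruence modulo n on natural numbers, here is a minimal sketch under assumptions of our own: two words over {a, b} are called congruent when their numbers of a's agree modulo 3. The relation, the modulus, and every function name are hypothetical choices made for illustration. The check is carried out only on a finite sample of short words, so it is a sanity test and not a proof.

```python
from itertools import product
from typing import Iterator

MODULUS = 3  # assumed for illustration; any positive integer would do


def word_class(word: str) -> int:
    """Congruence class of a word: its number of a's modulo MODULUS."""
    return word.count("a") % MODULUS


def words_up_to(alphabet: str, max_len: int) -> Iterator[str]:
    """Enumerate all words over the alphabet of length at most max_len."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def compatible_on_sample(alphabet: str, max_len: int) -> bool:
    """Check, on the sample only, that congruent words stay congruent
    when the same letter is attached on the right or on the left."""
    sample = list(words_up_to(alphabet, max_len))
    for u, v in product(sample, repeat=2):
        if word_class(u) != word_class(v):
            continue
        for x in alphabet:
            if word_class(u + x) != word_class(v + x):
                return False
            if word_class(x + u) != word_class(x + v):
                return False
    return True
```

Over the one-letter alphabet {a}, a word is determined by its length, and this relation is exactly congruence modulo 3 on the natural numbers. Over {a, b} it has three classes, so it is of finite index.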

In several papers (1960, 1962a, 1965a, 1965b, 1973, 1983; Büchi and Landweber, 1969a; Büchi and Zaiontz, 1983) we have shown that finite automata provide a method for setting up decision procedures for truth in monadic second-order languages and fragments of the theory of real numbers. This application to logic yields results that have defied other attacks. Again, the basic ideas of this method are available in the n-ary case (see Doner, 1970; Thatcher and Wright, 1968). Rabin (1969) has used the method to obtain a really powerful decision procedure, second only to Tarski's procedure for elementary analysis and geometry.

* * *

What is left to be done now is to extend everything that has been said so far about one-sorted algebras with unary operators and one generator, to many-sorted algebras with arbitrary operators and many generators. The transition structure of an automaton thus becomes an arbitrary finite algebra, and the objects to be accepted or produced become the elements of an arbitrary totally free algebra. The logician calls these objects terms. And every once in a while someone realizes that these terms admit a very pleasing graphic representation; namely, terms are oriented trees with markers attached to the vertices. So the subjects of the final chapters of the book are general algebras, tree automata, and term-producing grammars.
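The view of terms as trees with markers at the vertices, evaluated in a finite algebra, can be sketched as follows. Everything here is a hypothetical illustration and not the book's own construction: the operation symbols, the two-element carrier {0, 1}, and the names Term, ALGEBRA, evaluate, and accepts are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Term:
    """A term as an oriented tree: a marker (operation symbol) at the
    vertex, with the subterms as its ordered children."""
    symbol: str
    children: Tuple["Term", ...] = ()


# Assumed finite algebra on the carrier {0, 1}: each operation symbol
# is interpreted by a function of the matching number of arguments.
ALGEBRA: Dict[str, Callable[..., int]] = {
    "zero": lambda: 0,
    "one": lambda: 1,
    "neg": lambda x: 1 - x,
    "and": lambda x, y: x & y,
    "or": lambda x, y: x | y,
}


def evaluate(term: Term) -> int:
    """Evaluate the term bottom-up: first the children, then the
    operation attached to the vertex."""
    values = [evaluate(child) for child in term.children]
    return ALGEBRA[term.symbol](*values)


def accepts(term: Term, final: FrozenSet[int] = frozenset({1})) -> bool:
    """Accept the term if its value lies in the chosen final set."""
    return evaluate(term) in final
```

In the unary case each vertex has exactly one child, the tree degenerates into a word, and the evaluation reduces to running an ordinary finite automaton along that word.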

To me this program seemed very natural and promising some 20 years ago, when I realized that finite-automata theory naturally fits into general algebra. I first worked with Jesse Wright, and we had lectures at Michigan (see Büchi and Wright, 1960). I lectured further in public on these things (see Büchi, 1962b, 1966). Some people did listen then, and today the term "algebraic theory of automata" has become common gossip. Jesse Wright has gone a further step to "automata theory is category theory". Due especially to his activity there is now much work in progress, to add ideas from automata theory to the study of categories (see Eilenberg and Wright, 1967; Mezei and Wright, 1967).

What is still missing is a careful presentation of the matter, and that in my way of thinking best starts with the presentation of the simple unary case, so that is what the reader finds in chapters 2 through 5. The extension to arbitrary algebras (tree automata and push-down automata) is almost obvious to one who understands the unary case and knows what general algebra is. This extension is the subject of chapter 6. The matter is so important because it gives the proper setting to the subject of context-free grammars and general push-down automata. So chapter 7 contains the outline of a theory of these matters, and I mean a theory and not just ideas. To make clear what I mean by a theory I have included Langmaack's (1971) treatment of LRk-grammars, which can be done nicely in our setting. For more details on the content and the history see the introductions to chapters 6 and 7.

I have refrained from bringing categories into the picture. The category language is nice if only the user does not forget the more concrete algebraic background. There are very real combinatorial problems. Graphs and lattices (especially the finite or discrete ones) would be much more realistic in dealing with the problem than fancy category language. Category theory doesn't solve problems in automata theory; at best it creates new problems in a not very realistic automata theory (bypasses problems and does other things).

* * *

Much of the material presented in this book cries for extension to more sophisticated structures. The extension from unary algebras to n-ary algebras (tree automata) sometimes is obvious and sometimes requires additional ideas (Brainerd, 1968, 1969; Doner, 1970; Rabin, 1969). In other cases the results in the special case are surprisingly strong, and when properly used will yield information on the general case. For example, Landweber's sequential games (Büchi and Landweber, 1969b) and McNaughton's lemma (1966) applied to tree automata yield Rabin's result (1969) (see Büchi, 1977, 1983). Also Langmaack (1971) showed how the notion of regularity can be used to give a rigorous definition of LRk-grammar. The basic facts on these seemingly more general grammars, then, come right out of a result on regular grammars (Büchi, 1964; Büchi and Hosken, 1970; sections 6.8 and 7.4 of this book). Consider the familiar chain of grammars

regular → context-free → context-sensitive → semi-Thue

For the regular grammars we have the algebraic theory presented in this book, and for the semi-Thue systems we have general recursion theory; both treat their matters in a concise manner. No concise presentation, however, is available for the context-free or the context-sensitive grammars. It is hoped that our algebraic treatment of finite automata may show the way to a concise presentation of the rudiments of these more general systems.

* * *

We will study here the formulas or expressions of mathematics, that is, the formalized language of mathematicians. There are two very powerful tools used by mathematicians to assist their thinking. Mathematicians fall into two classes: those who like formulas and those who like pictures. Today the figures seem to go out of fashion; maybe because we are poor or stingy and can't afford their printing, or because it is too much trouble to see figures through the various stages of manuscripts and galley proofs. In the beginning there were the pictures. It was pictures that Archimedes drew in his sandbox, and he probably was not much worried about formulas. He had none. In Newton's Principles (1687) you still find surprisingly few formulas. (But don't you sneer, it makes very exciting reading. Try it, there is a good English version!) The juggling of formulas, today, is the very trademark of the mathematician.

* * *

To the Beginner

The following remarks probably concern you in case you are not familiar with many of these items: group, ring, Boolean algebra, propositional calculus, partial order, lattice, graph. That is, you have not been exposed to much mathematics of the finite or discrete variety.

1. A first manuscript to this book was dedicated to you. Especially this goes for chapters 2 through 5 on finite automata. The exercises there proposed are designed mainly to help you understand the abstract notions by way of concrete examples.

2. Chapter 1 was added for you to use it for reference. You are not meant, in a first reading, to make a complete study of it. One nice thing about finite automata is that they make very concrete examples of abstract functions and relations, universal algebras, graphs, partial orders, lattices of congruence relations, algorithms, formal languages, and so on. Hence you will be better motivated if you go back to chapter 1. In general, the exercises should be interpreted as providing one method for checking up on how well you have digested the ideas of the preceding sections. However, you will learn much more trying to construct your own exercises and questions. Similarly, the text ought to be used as a guide. Close the book after reading a theorem or lemma and try to make up your own proof.

Always consider the very simplest forms of a situation at hand! This advice goes for mathematics, just as everywhere else in life. A well-chosen simplification will yield ideas for solving a complex problem. Furthermore, the simple ideas, structures, and proofs are really the most important. This is why we have taken the time to put the theory of unary automata into order.

* * *

Acknowledgment

I wish to thank the National Science Foundation for its support over many years, especially during the early years when university departments didn't consider the field respectable enough. I wish also to thank the people of Springer-Verlag for publishing this book.

I was introduced to finite automata in the early 1950s by Jesse B. Wright. The subject was then very new, and Wright had already made his contribution by showing that precise notions and results could be developed about such matters as nerve nets. We then collaborated for many fruitful years. I would like to thank him in particular for his help in many early researches on the use of automata in monadic second-order theory. He destroyed more than one of my "proofs", and I hear that he later provided the same service to other workers in the field. To introduce quantifiers over infinite sequences of states occurred to me when attending lectures delivered by A. Church, at the University of Illinois at Champaign-Urbana, 1957. I think this has much to do with Church's methodical way of approach, and he has helped sharpen our (Wright, Elgot, and Büchi) early ideas on the matter of design algorithms.

What I have learned about automata from Kleene is clear. During the 1960s McNaughton's work and that of Landweber on finite-state games have greatly improved my understanding of finite automata working on infinite information. From the mathematician's viewpoint, these are results that can be seen in the best company. I feel much the same way about M. Rabin's work on tree automata, to which the former is closely related. Doner, Brainerd, Elgot, Trakhtenbrot, Wang, Myhill, Siefkes, ....¹ Students.

These are the people from whom I have learned directly in a narrow field. But then one often wonders to whom one is indebted in a more remote, but possibly just as real, way. So there is Cantor, of whom I have tried to think, at least sometimes, when I say "let f be an arbitrary function on a set X", because I probably would not say it if it hadn't been for him. There is Frege, with his idea of dividing linguistic expressions into objects and predicates, who seems to have been the first to realize the importance of quantifiers. And Thue, who thought about grammars and even trees and who has done so many other very original things, when nobody else dreamed of those things. Löwenheim proved theorems about linguistic systems (by quantifier elimination) at a time when others just talked about them, and so many are still talking about them. Without Gauss my idea of a proof would probably be very different from what it is.

To do this properly, one would have to become a historian of science, and doing this sort of thing is maybe the purpose of history of science. It is important to know to whom you are indebted, because in this life you don't have too much time to waste. Sometimes it might be good to read bad authors, but certainly not all the time. There are real books (such as Euclid, Principles of Mechanics of Newton, Gauss), and then books about books.

1 Illegible name (the editor).
