

A Compiler-Based Toolkit to Teach and Learn Finite Automata

PINAKI CHAKRABORTY, P. C. SAXENA, C. P. KATTI

School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi 110067, India

Received 20 January 2010; accepted 1 August 2010

ABSTRACT: This paper introduces a compiler technology based approach to model and simulate finite automata for pedagogical purposes. The compiler technology helps to define a language to formally model finite automata and to develop a toolkit to simulate them efficiently. The language is called Finite Automaton Description Language (FADL) and the toolkit is based on it. A fast single-pass compiler is used to compile a finite automaton defined in FADL. Then an interpreter is used to simulate the working of the compiled finite automaton for any input string. The nondeterminism of a Nondeterministic Finite Automaton (NFA) is simulated using backtracking. A tool to view the transition diagram of the finite automaton is provided. A Deterministic Finite Automaton (DFA) can be additionally compiled using an optimizing compiler that also minimizes the number of states. Tools for converting an NFA to a DFA and for converting a DFA to a Turing machine are also provided. A preliminary testing of the toolkit has been performed in which the participating students observed that the toolkit is an interesting teaching tool and it helped them to acquire a better perception about finite automata. © 2010 Wiley Periodicals, Inc. Comput Appl Eng Educ; View this article online at wileyonlinelibrary.com; DOI 10.1002/cae.20492

Keywords: compiler; optimizing compiler; finite automaton; simulation

INTRODUCTION

Automata theory is undoubtedly one of the most important topics in Computer Science (CS). Although various types of automata have been defined to date, finite automata hold a special place in the theory of automata. Finite automata are fundamental in nature and are known for their simplicity. Consequently, finite automata are often used to explain several key principles and concepts of CS. Owing to their importance, computer scientists have been developing pedagogical tools to teach finite automata. These tools are developed specifically to suit pedagogical purposes and are known for their fidelity to theory and low learning curve [1]. Several such tools are already available. However, computer scientists continue to welcome newer ones, primarily because each such tool comes with a new perspective and has its own merits. Most of these tools are based on ad hoc simulation techniques. This paper, on the other hand, presents a new toolkit based on compiler technology, which is a new and alternative approach. Compiler technology is known for being formal and well organized, and for serving as a systematic laboratory for the use of high-level languages [2]. This use of compiler technology is expected to augment the study of simulation of finite automata.

Correspondence to P. Chakraborty.

In this paper, a language for modeling finite automata, called the Finite Automaton Description Language (FADL), is introduced. Any Deterministic Finite Automaton (DFA) or Nondeterministic Finite Automaton (NFA) can be defined in FADL. In an earlier study [3], a similar Turing Machine Description Language was formulated. Along with the language, a toolkit for processing the finite automata modeled in the language is presented in the current paper.

RELATED WORKS

Two bodies of research need to be reviewed in the context of the work presented in this paper. First, quite a few experimental compilers have been developed in various branches of CS. Second, there are several existing pedagogical tools for studying finite automata. The following two subsections review these two topics.

Experimental Compilers

Software developers are not mere users of programming languages [4]. On the contrary, they regularly design and implement small languages. It is the task of the academicians to awaken budding language designers. It can be argued that language amateurs have developed some of the most important languages used today including JavaScript, Perl, PHP, and Ruby. In accordance with this argument, Bodik [4] proposed an approach of teaching a course on programming languages and compilers through the development of small languages.

It has been observed that language amateurs develop languages for a very large domain that ranges from core topics of CS, like neural networks, to emerging interdisciplinary fields of research, like cheminformatics. A few interesting examples are worth mentioning. Simon [5] and Chakraborty [6,7] experimented with the feasibility of compilers that can use various types of heuristics. Gruau et al. [8] developed a neural compiler. The compiler takes as input a Pascal program and produces as output a neural network that performs the computations specified by the program. Korn [9] developed a runtime simulation model compiler that compiles and solves vector differential equations, difference equations, and scalar equations for dynamic system models. Tsuda et al. [10] developed a business simulation compiler that lets a person who is not experienced in program development use and modify a business simulation on the Web. Grigorenko et al. [11] developed a compiler–compiler for visual languages that works as a framework for building visual programming environments. These visual programming environments translate schemas into textual representation and into programs representing the deep meaning of schemas. Costagliola et al. [12] proposed an approach that provides the basis for the unification of compiler technologies for traditional textual languages and visual languages. Chakraborty [3] developed a compiler for easy and efficient modeling of Turing machines. Chakraborty and Gupta [13] developed a compiler for modeling and simulating propositional logic statements. Hsu et al. [14] developed an automated integrated framework for reaction network generation based on domain-specific compiler theory using a knowledgebase of chemistry rules. The tryst of language amateurs with unconventional languages and compilers is not over yet, and they can be expected to develop many more such languages and compilers for pedagogical and research purposes.

Pedagogical Tools for Finite Automata

Automata theory is a topic in CS that is known both for its importance and for the difficulty of teaching and learning it [15]. As a result, a number of pedagogical software tools have been developed that allow students to experiment with the concepts in this topic and thus understand them better. Chesnevar et al. [16] called for interactions between automata theory and actual programming languages instead of keeping them unrelated. According to Chesnevar et al. [15], the pedagogical tools for teaching and learning automata theory can be divided into two categories. The first category consists of generic multipurpose software packages for teaching and integrating several related concepts of automata theory. The second category consists of software tools focused on simulating a specific class of automata for pedagogical purposes.

The pedagogical tools that can be used to teach and learn finite automata are reviewed next. While some of them are generic tools suitable for teaching and learning finite automata along with other topics, others have been specifically developed to study finite automata. A software tool called Automata allows one to experiment with finite automata [17]. The definition of a finite automaton is fed as a 5-tuple in a textual format. The tool can generate the list of strings accepted by the finite automaton, convert an NFA to its equivalent DFA, and perform several other useful operations [18]. Another tool called the Hypercard Automata Simulation allows one to enter a finite automaton in a tabular format and experiment with it [19]. Formal Languages and Automata Package is a software tool that can create and simulate finite automata and other types of automata [20]. Its newer version, called the Java Formal Languages and Automata Package, allows one to graphically construct a transition diagram of finite automata. Cavalcante et al. [21] describe how it can be used to augment a course on automata theory. Finite State Machine Explorer is an interactive graphical system that supports the construction of finite state machines. The simulation of the behavior of a finite state machine for a given input string is illustrated using animation. The tool also supports conversion between equivalent classes of machines [15]. Finite State Automaton Applet [22] and Finite State Automaton Simulator [23] are software tools to simulate both DFAs and NFAs. Wermelinger and Dias [24] present a collection of Prolog predicates that aims to provide a pedagogical implementation of concepts and algorithms taught in a course on automata theory. Thomas et al. [25] introduce spreadsheets that simulate the working of a finite state automaton and can be used for pedagogical purposes. Merceron [26] introduces a set of design patterns that can be used to teach and learn the design of DFAs. Biczo and Pocza [27] present an automata generator framework that makes it possible to use finite state automata in regular business applications. The automata are defined using states and transitions and are saved as XML files.

Merging the Two Paradigms

Most of the aforementioned pedagogical tools for finite automata have been developed using ad hoc techniques. There is no consensus among their developers on which techniques to use. Even the basic rules for modeling and simulating finite automata are yet to be standardized. Consequently, the tools vary widely from each other and are not interoperable. This results in a localized use of such tools. In several cases, such a tool is used only in the university in which it has been developed. Moreover, the descriptions of finite automata are entered differently in each of these tools. Representations used in the textbooks, which are familiar to the students, are seldom used. This renders these tools difficult to learn. On the whole, a formal approach is missing in the existing tools to teach and learn automata. Compiler technology, in contrast, has a standardized set of theories and practices, and it can be employed to improve the state of modeling and simulation of finite automata. However, it should be noted that many experimental compilers developed so far are suitable only for research and are not appropriate for classroom teaching. So, when compiler technology is employed to model and simulate finite automata, care should be taken to develop genuine pedagogical tools.

THE LANGUAGE

The FADL, as introduced in this paper, is a simple language that can be used to define any DFA or NFA. The FADL uses a formal symbolic representation of finite automata that is quite similar to those used in textbooks and research literature. As a result, an FADL program contains only the necessary information about a finite automaton, and that too in a symbolic form. The FADL is neither a procedural language nor an object-oriented language as it does not implement any algorithm or instantiate any object. It is similar to hardware description languages except for the fact that the hardware of a finite automaton is virtual. Using this language requires no programming skill or in-depth study of the language specifications. So, it is advantageous for students and other novice users.

An FADL program is actually a definition of a finite automaton. It is customary to first write a prototype definition of the finite automaton followed by the definition of its transition function. The prototype definition of a finite automaton is a 5-tuple of a finite nonempty set of internal states, a finite nonempty input alphabet, a symbol to denote the transition function, an initial state, and a set of accepting states. The initial state is an element of the set of internal states. The set of accepting states, which may be empty, is a subset of the set of internal states. The definition of the transition function consists of a number of transitions. In fact, it is the nature of these transitions that determines whether a finite automaton is a DFA or an NFA. In a DFA, a transition occurs from a source state to a destination state after reading an input symbol. In an NFA, by contrast, a transition occurs from a source state to any one of several possible destination states. In an NFA, a transition may occur even without reading an input symbol, and such a transition is known as an ε-transition. It may be further recalled that both a DFA and an NFA accept regular languages. A DFA is a special case of an NFA with some constraints laid on the transition function. Conversely, for a given NFA there exists a DFA such that both accept the same language.

A simple grammar for the FADL language is given below.

(1) program → fa_definition tf_definition;
(2) fa_definition → fa=({non_empty_state_list},{symbol_list},transition_function,state,{state_list});
(3) tf_definition → transition_list | ε
(4) non_empty_state_list → state, non_empty_state_list | state
(5) symbol_list → symbol, symbol_list | symbol
(6) state_list → state, state_list | ε
(7) transition_list → transition, transition_list | transition
(8) transition → transition_function(state, symbol) = state
               | transition_function(state, \) = state
               | transition_function(state, symbol) = {state_list}
               | transition_function(state, \) = {state_list}

In this grammar, fa stands for the keyword FA. A state is a sequence of lowercase letters, digits, and underscore (_) characters starting with a letter and having a maximum of eight characters. A symbol is any lowercase letter or digit, and a backslash (\) is used to represent ‘ε’. A transition function is denoted by an uppercase letter. Production rule (1) defines an FADL program. Production rule (2) defines the prototype definition of a finite automaton and production rule (3) defines its transition function. Nonempty lists of internal states, nonempty lists of input symbols, and possibly empty lists of internal states are defined using production rules (4–6). Production rule (7) defines lists of transitions, with each transition defined using production rule (8).
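The lexical rules stated above can be checked mechanically. The following Python sketch encodes them as regular expressions; the function names and the expressions themselves are assumptions made for illustration and are not taken from the FADLC source.

```python
import re

# Lexical rules as described in the text. These patterns are this sketch's
# own encoding of the rules, not code from the toolkit.
STATE = re.compile(r"[a-z][a-z0-9_]{0,7}")  # starts with a letter, at most 8 characters
SYMBOL = re.compile(r"[a-z0-9]")            # a single lowercase letter or digit
FUNCTION = re.compile(r"[A-Z]")             # a single uppercase letter
EPSILON = "\\"                              # the backslash stands for the empty string

def is_state(text):
    return STATE.fullmatch(text) is not None

def is_symbol(text):
    return SYMBOL.fullmatch(text) is not None

def is_function(text):
    return FUNCTION.fullmatch(text) is not None

def is_epsilon(text):
    return text == EPSILON
```

Note that a single lowercase letter such as a satisfies both is_state and is_symbol; under the grammar, the position of the token inside the 5-tuple or the transition decides which of the two it is.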

An example of a prototype finite automaton definition is as follows.

FA = ({q0,q1,q2,q3,q4},{a,b},D,q0,{q2,q4});

Figure 1 Procedure of using the toolkit for a DFA. [Diagram labels: file.FA, FADLC, FADLOC, file.ILF, FAI, TDV, DFATMT, file.TM, input.TXT, output.TXT]

In this example, {q0,q1,q2,q3,q4} is the set of internal states, {a,b} is the input alphabet, D is the transition function, q0 is the initial state, and {q2,q4} is the set of accepting states. Some possible transitions of this finite automaton are D(q0,a) = q1, D(q0,b) = {q1,q2}, and D(q0,\) = q1.
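To make the structure of an FADL program concrete, the following Python sketch reads a program into a dictionary. It is a regular-expression-based reader written for illustration only: the function name parse_fadl and the dictionary layout are assumptions, the patterns are looser than the grammar given above, and the sketch is not the predictive parser used by the toolkit.

```python
import re

# Prototype definition: FA=({states},{symbols},F,initial,{accepting});
HEADER = re.compile(r"FA=\(\{([^}]*)\},\{([^}]*)\},([A-Z]),(\w+),\{([^}]*)\}\);")

def parse_fadl(source):
    """Read an FADL program into a dictionary (hypothetical layout)."""
    text = "".join(source.split())              # whitespace is not significant here
    header = HEADER.match(text)
    if header is None:
        raise ValueError("malformed prototype definition")
    states, alphabet, func, initial, accepting = header.groups()
    automaton = {
        "states": states.split(","),
        "alphabet": alphabet.split(","),
        "function": func,
        "initial": initial,
        "accepting": [s for s in accepting.split(",") if s],
        "delta": {},                            # (state, symbol) -> set of states
    }
    # One transition: F(state,symbol)=state  or  F(state,symbol)={state list}
    rule = re.compile(re.escape(func) + r"\((\w+),(\w|\\)\)=(\{[^}]*\}|\w+)")
    for src, symbol, target in rule.findall(text[header.end():]):
        targets = {t for t in target.strip("{}").split(",") if t}
        automaton["delta"].setdefault((src, symbol), set()).update(targets)
    return automaton
```

Storing every destination as a set lets the same structure hold DFAs and NFAs; a transition on the backslash is stored under the symbol "\\".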

THE TOOLKIT

In this paper, a toolkit for modeling and experimenting with finite automata is presented. The toolkit comprises six interrelated programs, viz., FADL Compiler (FADLC), FADL Optimizing Compiler (FADLOC), Finite Automaton Interpreter (FAI), Transition Diagram Viewer (TDV), NFA to DFA Translator (NFADFAT), and DFA to Turing Machine Translator (DFATMT). All six programs have been implemented in the C++ programming language. The toolkit is console-based and each individual tool is invoked using a short command at the command prompt. Care has been taken to keep the commands simple and logical. This toolkit belongs to the second category of the classification of pedagogical tools in automata theory proposed by Chesnevar et al. [15], that is, tools focused on simulating a specific class of automata. The toolkit is based on a two-level translation scheme involving compilers and interpreters. Such a scheme is known to enhance portability [28] and simplicity [29] of the design.

A finite automaton can be either a DFA or an NFA. The method of using the toolkit differs slightly for these two types of finite automata. The definition of a DFA is saved in a .FA file. It is then compiled using either the FADLC compiler or the FADLOC compiler (Fig. 1). The object program is in an Intermediate Language (IL) and it is saved in a .ILF file. The working of the compiled DFA can now be simulated for different input strings using the FAI interpreter. Alternatively, the equivalent transition diagram of the DFA can be viewed using the TDV. A Turing machine that accepts the language accepted by the DFA is obtained using the DFATMT translator.

The definition of an NFA, like that of a DFA, has to be saved in a .FA file. This file can be compiled only using the FADLC compiler (Fig. 2). Any attempt to compile this file using the FADLOC compiler results in an error. The object program of the compiled NFA is stored in a .ILF file. The working of the compiled NFA can again be simulated for different input strings using the FAI interpreter. Alternatively, the equivalent transition diagram of the NFA can be viewed using the TDV. The NFA can also be translated to its equivalent DFA using the NFADFAT translator.

The FADLC Compiler

The FADLC is a fast single-pass compiler. It takes as input an FADL program, which is actually the description of a finite automaton, and produces as output a functionally equivalent program in the IL. Therefore, the FADLC compiler can be represented as $C_{\text{C++}}^{\text{FADL}\,\text{IL}}$, that is, a compiler from FADL to the IL that is itself written in C++.


Figure 2 Procedure of using the toolkit for an NFA. [Diagram labels: file.FA, FADLC, file.ILF, FAI, TDV, NFADFAT, file.FA, input.TXT, output.TXT]

Figure 3 Block diagram of the FADLC compiler. [Diagram labels: Source program, Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Code Generator, Bookkeeper, Error Handler, Object program]

The FADLC compiler performs three logical tasks. First, it checks that the source program is a definition of a valid finite automaton. If not, it displays an error message. Second, the FADLC compiler identifies whether the finite automaton is a DFA or an NFA. Third, the FADLC compiler generates the object program.

The FADLC compiler consists of four phases, viz., lexical analyzer, syntax analyzer, semantic analyzer, and code generator (Fig. 3). Apart from these four phases, the FADLC compiler contains a bookkeeper module and an error handler module. The syntax analyzer of the FADLC compiler plays the lead role in the process of compilation and the other three phases run as co-routines under the control of the syntax analyzer. The syntax analyzer calls the lexical analyzer whenever the former needs a token. The lexical analyzer, on its part, returns the next token in the input program on being called. On successful syntax analysis of a part of a program, the semantic analyzer and then the code generator are invoked for semantic analysis and code generation, respectively. The syntax analyzer used in the FADLC compiler is a top-down predictive parser. The bookkeeper module maintains a symbol table to store the names of the states of the finite automaton. The names are inserted in the symbol table by the lexical analyzer. The semantic analyzer and the code generator use the information stored in the symbol table. On detecting an error, the error handler of the FADLC compiler generates an error message and stops the process of compilation. The error messages are descriptive in nature and are expected to be helpful in debugging the program. The error handler is called by the lexical analyzer, the syntax analyzer, and the semantic analyzer on the occurrence of lexical errors, syntax errors, and semantic errors, respectively. The code generator does not need to call the error handler as it can generate the object code for all correct programs.
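The first two logical tasks of the compiler, validating the definition and deciding between DFA and NFA, can be sketched in Python as follows. The function name, the error messages, and the classification rule (an automaton counts as an NFA as soon as it has an ε-transition or more than one destination for some state and symbol) are assumptions of this sketch; the exact checks performed by FADLC are not listed in the paper.

```python
EPSILON = "\\"  # FADL writes the empty-string symbol as a backslash

def check_and_classify(states, alphabet, initial, accepting, delta):
    """Return "DFA" or "NFA"; raise ValueError on a semantic error.

    delta maps (state, symbol) to a set of destination states.
    """
    declared = set(states)
    if initial not in declared:
        raise ValueError(f"initial state {initial} is not declared")
    if not set(accepting) <= declared:
        raise ValueError("an accepting state is not declared")
    kind = "DFA"
    for (source, symbol), targets in delta.items():
        if source not in declared or not set(targets) <= declared:
            raise ValueError(f"transition on ({source}, {symbol}) uses an undeclared state")
        if symbol != EPSILON and symbol not in alphabet:
            raise ValueError(f"symbol {symbol} is not in the input alphabet")
        if symbol == EPSILON or len(targets) > 1:
            kind = "NFA"          # nondeterministic choice or empty move
    return kind
```

For the prototype FA = ({q0,q1,q2},{a,b},D,q0,{q2}) with D(q1,b) = {q1,q2}, this sketch reports an NFA because one transition has two destinations.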

The FADLOC Compiler

The FADLOC is an optimizing compiler for the FADL language. It is similar to the FADLC compiler apart from the fact that a code optimizer phase has been inserted before the code generator phase (Fig. 4). The code optimizer employs global optimization techniques and requires reading the description of the entire source program at one time. Consequently, the code optimizer and the code generator phases have been carved out into a separate pass, making the FADLOC a two-pass compiler. The FADLOC compiler, like the FADLC compiler, translates an FADL program into its IL equivalent. Therefore, the FADLOC compiler can also be represented as $C_{\text{C++}}^{\text{FADL}\,\text{IL}}$.

Figure 4 Block diagram of the FADLOC compiler. [Diagram labels: Source program, Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Code Optimizer, Code Generator, Bookkeeper, Error Handler, Object program, Pass 1, Pass 2]

The FADLOC compiler has been developed to compile DFAs only. If an NFA needs to be compiled using the FADLOC compiler, it has to be first converted to a DFA. The FADLOC compiler employs the following three optimizing techniques in the given order. First, all states that are not reachable from the starting state for any input string are deleted. Second, the states from which none of the final states can be reached for any input string are deleted. Third, the algorithm to minimize the number of states of a DFA is invoked. Moreover, the transition function is also compacted whenever a state is deleted.
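The three optimization steps can be sketched in Python as follows. The paper does not say which minimization algorithm FADLOC uses; this sketch assumes a simple partition refinement (Moore's method) for the third step, and the function name optimize, the data layout, and the rule of keeping the alphabetically first state of each merged group are all illustrative choices.

```python
def optimize(alphabet, delta, initial, accepting):
    """Reduce a DFA; delta maps (state, symbol) -> state."""

    def closure(seeds, edges):
        seen, stack = set(seeds), list(seeds)
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    forward, backward = {}, {}
    for (src, _), dst in delta.items():
        forward.setdefault(src, set()).add(dst)
        backward.setdefault(dst, set()).add(src)

    # Step 1: drop states that are unreachable from the initial state.
    live = closure({initial}, forward)
    # Step 2: drop states from which no accepting state can be reached.
    # The initial state is always kept (an assumption of this sketch).
    live = (live & closure(accepting, backward)) | {initial}
    # Compact the transition function after the deletions.
    delta = {(s, a): t for (s, a), t in delta.items() if s in live and t in live}

    # Step 3: start from {accepting, non-accepting} and split blocks until stable.
    block = {s: int(s in accepting) for s in live}
    while True:
        signature = {
            s: (block[s], *(block.get(delta.get((s, a)), -1) for a in alphabet))
            for s in live
        }
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        if len(ids) == len(set(block.values())):
            break                                   # no block was split
        block = {s: ids[signature[s]] for s in live}

    # Merge each block into its alphabetically first member.
    rep = {}
    for s in sorted(live):
        rep.setdefault(block[s], s)
    name = {s: rep[block[s]] for s in live}
    new_delta = {(name[s], a): name[t] for (s, a), t in delta.items()}
    new_accepting = {name[s] for s in accepting if s in live}
    return sorted(set(name.values())), new_delta, name[initial], new_accepting
```

The function returns the remaining states, the compacted transition function, the initial state, and the accepting states, in that order.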

The goal of the code optimizer phase of a typical compiler is to reduce the size of the object program and/or the time required to execute the object program. In the case of the FADLOC compiler, the number of states in the DFA may be reduced. This means a reduction in the size of the object program. Additionally, a string may be accepted or rejected by the optimized DFA in fewer steps than by the original DFA. This means a reduction in the execution time.

The FAI Interpreter

An FADL program contains the description of a finite automaton. After a successful compilation, the program is ready for execution. Finding a proper target machine for the execution of this program is the next issue. Due to the absence of realistic hardware equivalents of finite automata, virtual machines can be used as suitable substitutes. In fact, collaboration of concepts of virtual machines and compilers is quite common in research [30]. The FAI interpreter is actually a virtual machine that has been included in the toolkit to simulate the working of the finite automata. The FAI interpreter takes as input a compiled FADL program and an input string. It then simulates the working of the finite automaton for that input string.

Simulating the nondeterminism property of an NFA is an important implementation consideration. For a DFA in a given state and a given next input symbol there can be only one next state. However, for an NFA in a given state and a given next input symbol there can be more than one next state. The FAI interpreter uses a backtracking technique to simulate NFAs. In this technique, all possible paths arising from all possible transitions are explored. The process is continued until the NFA accepts the given string or all such possible paths have been exhausted.

There are two parts of the output of the FAI interpreter. The first part contains the state transitions that the finite automaton has undergone. The second part contains the result, which is either an acceptance or a rejection of the input string by the finite automaton.
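The backtracking simulation and the two-part output can be sketched in Python as follows. The function name simulate and the data layout are assumptions, and so is the order in which alternatives are tried: the paper does not state the order used by FAI, so this sketch simply tries destinations in reverse name order. On rejection it reports the last path it tried.

```python
EPSILON = "\\"  # FADL writes the empty-string symbol as a backslash

def simulate(delta, initial, accepting, word):
    """Backtracking simulation; delta maps (state, symbol) -> set of states."""
    last_path = [initial]

    def explore(state, pos, path, eps_seen):
        last_path[:] = path                      # remember the path being tried
        if pos == len(word) and state in accepting:
            return True
        if pos < len(word):                      # moves that read one symbol
            symbol = word[pos]
            for nxt in sorted(delta.get((state, symbol), ()), reverse=True):
                if explore(nxt, pos + 1, path + [f"-({symbol})->", nxt], {nxt}):
                    return True
        for nxt in sorted(delta.get((state, EPSILON), ()), reverse=True):
            if nxt not in eps_seen:              # avoid looping on empty moves
                if explore(nxt, pos, path + [f"-({EPSILON})->", nxt], eps_seen | {nxt}):
                    return True
        return False                             # dead end: backtrack

    accepted = explore(initial, 0, [initial], {initial})
    return " ".join(last_path) + (" (ACCEPT)" if accepted else " (REJECT)")

# A small NFA over {a, b} that accepts strings starting with 'a' and ending with 'b'.
NFA = {("q0", "a"): {"q1"}, ("q1", "a"): {"q1"}, ("q1", "b"): {"q1", "q2"}}
print(simulate(NFA, "q0", {"q2"}, "aabbab"))
print(simulate(NFA, "q0", {"q2"}, "abba"))
```

A DFA is handled by the same function, since each of its transitions is a set with a single destination and no backtracking is ever needed.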

The TDV

The TDV is a program that takes as input the definition of a finite automaton as a .ILF file. It then displays the transition diagram of the finite automaton. Thus, the TDV provides a graphical representation for a DFA or an NFA. It is meant to be helpful to students and other novice users of the toolkit. To display the transition diagrams in the best possible way, the TDV, unlike the other tools, executes in a full-screen mode.

The NFADFAT Translator

It is well known that for every NFA there exists a DFA such that both the finite automata accept the same language. The two finite automata are said to be equivalent to each other. The NFADFAT is a program that translates an NFA to its equivalent DFA. To translate an NFA to its equivalent DFA, the NFA is first compiled using the FADLC compiler. Then the object program is fed to the NFADFAT translator. The NFADFAT translator produces as output the equivalent DFA in the FADL language. This DFA is like any other DFA defined in the FADL language. It can be recompiled using either the FADLC compiler or the FADLOC compiler.
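The paper does not describe the algorithm inside NFADFAT. The textbook way to obtain an equivalent DFA is the subset construction, sketched below in Python under that assumption. The function name nfa_to_dfa is hypothetical, DFA states are represented as frozensets of NFA states rather than renamed to q0, q1, and so on, and only subsets reachable from the start are generated.

```python
EPSILON = "\\"  # FADL writes the empty-string symbol as a backslash

def nfa_to_dfa(alphabet, delta, initial, accepting):
    """Subset construction; delta maps (state, symbol) -> set of states."""

    def closure(states):
        # All states reachable from `states` through empty moves alone.
        seen, stack = set(states), list(states)
        while stack:
            for nxt in delta.get((stack.pop(), EPSILON), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return frozenset(seen)

    start = closure({initial})
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        subset = todo.pop()
        for symbol in alphabet:
            moved = {t for s in subset for t in delta.get((s, symbol), ())}
            target = closure(moved)              # the empty subset acts as a trap state
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return seen, dfa_delta, start, dfa_accepting
```

Each DFA state records the set of NFA states the automaton could be in, so a subset is accepting exactly when it contains at least one accepting NFA state.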

The DFATMT Translator

A DFA accepts a regular language while a Turing machine accepts a recursively enumerable language. Since every regular language is also a recursively enumerable language, there exists a Turing machine for every DFA such that the two automata accept the same language. The DFATMT is a program that translates a DFA to its equivalent Turing machine. To translate a DFA to its equivalent Turing machine, the DFA is first compiled using either the FADLC compiler or the FADLOC compiler. Then the object program is fed to the DFATMT translator. The DFATMT translator generates the equivalent Turing machine in the TMDL language [3]. This Turing machine is like any other Turing machine defined in the TMDL language. It can be recompiled using the TMDLC compiler and its working can be simulated.
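One straightforward translation, sketched below in Python, copies every DFA transition into a Turing machine move that rewrites the symbol it read and moves right, and adds one move per accepting state that enters a new final state on reading the blank. This is an assumption about how such a translator can work, not a description of the DFATMT source; the function name, the final state name qf, and the blank symbol "_" are illustrative parameters.

```python
def dfa_to_tm(states, alphabet, delta, initial, accepting, blank="_", final="qf"):
    """Translate a DFA into a Turing machine description.

    delta maps (state, symbol) -> state. Each machine move is a triple
    (next state, symbol written, head direction).
    """
    # Reading a symbol: keep it on the tape, change state, move right.
    tm_delta = {(s, a): (t, a, "R") for (s, a), t in delta.items()}
    # End of input in an accepting DFA state: enter the single final state.
    for s in accepting:
        tm_delta[(s, blank)] = (final, blank, "L")
    return {
        "states": list(states) + [final],
        "input_alphabet": list(alphabet),
        "tape_alphabet": list(alphabet) + [blank],
        "delta": tm_delta,
        "initial": initial,
        "blank": blank,
        "accepting": [final],
    }
```

The resulting machine never writes anything new and halts without accepting whenever the DFA would end in a non-accepting state, because no move on the blank is defined there.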

EXAMPLES OF FINITE AUTOMATA MODELED IN FADL

In this section, the working of the toolkit is demonstrated using two DFAs and an NFA.

DFA 1

The FADL program given below defines a DFA that accepts all nonempty strings of ‘a’ and ‘b’ with an even number of ‘a’s and an even number of ‘b’s.

FA = ({q0,q1,q2,q3,q4},{a,b},D,q0,{q4});
D(q0,a) = q2, D(q0,b) = q3,
D(q1,a) = q3, D(q1,b) = q2,
D(q2,a) = q4, D(q2,b) = q1,
D(q3,a) = q1, D(q3,b) = q4,
D(q4,a) = q2, D(q4,b) = q3;

Figure 5 illustrates how the various tools are used on this DFA. The program is compiled using the FADLC compiler and its transition diagram can be viewed using the TDV (Fig. 6). When simulated using the FAI interpreter, the DFA accepts the string ‘aabbabba’ after moving through the following states.

q0 -(a)-> q2 -(a)-> q4 -(b)-> q3 -(b)-> q4 -(a)-> q2 -(b)-> q1 -(b)-> q2 -(a)-> q4 (ACCEPT)

The DFA rejects the string ‘abb’ after moving through the following states.

q0 -(a)-> q2 -(b)-> q1 -(b)-> q2 (REJECT)

The DFATMT translator is used to obtain a Turing machine that accepts the same language as the DFA.

TM = ({q0,q1,q2,q3,q4,qf},{a,b},{a,b,_},D,q0,_,{qf});
D(q0,a) = (q2,a,R), D(q0,b) = (q3,b,R),
D(q1,a) = (q3,a,R), D(q1,b) = (q2,b,R),
D(q2,a) = (q4,a,R), D(q2,b) = (q1,b,R),
D(q3,a) = (q1,a,R), D(q3,b) = (q4,b,R),
D(q4,a) = (q2,a,R), D(q4,b) = (q3,b,R),
D(q4,_) = (qf,_,L);

DFA 2

The DFA to accept all nonempty strings of ‘a’ and ‘b’ with an even number of ‘a’s and an even number of ‘b’s can be defined in more than one way. A naive definition of the DFA is as follows.

FA = ({q0,q1,q2,q3,q4,q5,q6,q7,q8},{a,b},D,q0,{q2,q8});
D(q0,a) = q1, D(q0,b) = q5,
D(q1,a) = q2, D(q1,b) = q4,
D(q2,a) = q1, D(q2,b) = q3,
D(q3,a) = q4, D(q3,b) = q2,
D(q4,a) = q3, D(q4,b) = q1,
D(q5,a) = q6, D(q5,b) = q8,
D(q6,a) = q5, D(q6,b) = q7,
D(q7,a) = q8, D(q7,b) = q6,
D(q8,a) = q7, D(q8,b) = q5;


Figure 5 Screenshot showing the application of the tools for DFA 1.

The DFA can be compiled using the FADLC compiler to obtain an object program with nine states. Alternatively, the FADLOC compiler can be used to obtain an optimized object program with only five states. The transition diagram of the optimized version of the DFA looks similar to the one in Figure 6.

NFA 1

The FADL program given below defines an NFA that accepts all nonempty strings of ‘a’ and ‘b’ that begin with an ‘a’ and end with a ‘b’.

FA = ({q0,q1,q2},{a,b},D,q0,{q2});

D(q0,a) = q1,D(q1,a) = q1,D(q1,b) = {q1,q2};

The program is compiled using the FADLC compiler and its transition diagram is viewed using the TDV (Fig. 7). As stated earlier, the FAI interpreter uses a backtracking technique to simulate NFAs in which all possible paths arising from the different possible transitions are explored one after another. When an NFA accepts a string, the first matching sequence of states that leads to an accepting state is displayed as output. On the other hand, when an NFA does not accept a string, the last path tried by the FAI interpreter is displayed as output. When simulated using the FAI interpreter, this NFA accepts the string ‘aabbab’ after moving through the following states.

q0 − (a) −> q1 − (a) −> q1 − (b) −> q1 − (b) −> q1 − (a) −> q1 − (b) −> q2 (ACCEPT)

The NFA rejects the string ‘abba’ after moving through the following states.

q0 − (a) −> q1 − (b) −> q1 − (b) −> q1 − (a) −> q1 (REJECT)

The NFADFAT translator is used to obtain the equivalent DFA of this NFA as follows.

FA = ({q0,q1,q2,q3,q4},{a,b},D,q1,{q4});

D(q0,a) = q0,D(q0,b) = q0,D(q1,a) = q2,D(q1,b) = q0,D(q2,a) = q2,D(q2,b) = q4,D(q3,a) = q0,D(q3,b) = q0,D(q4,a) = q2,D(q4,b) = q4;
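Both operations applied to NFA 1, backtracking simulation and conversion to a DFA, can be sketched in a few lines of Python. The code below is an illustration and not the FAI or NFADFAT implementation. The function names are hypothetical, and the order in which alternative transitions are tried (list order here) is an assumption, so the path reported for a rejected string may differ from the interpreter's. The conversion uses the textbook subset construction restricted to reachable subsets. It therefore produces four states, whereas the translator output above also lists q3, which no transition leads to.

```python
NFA_DELTA = {
    ("q0", "a"): ["q1"],
    ("q1", "a"): ["q1"],
    ("q1", "b"): ["q1", "q2"],
}
NFA_START, NFA_FINAL = "q0", {"q2"}


def accepts_backtracking(text, state=NFA_START, path=None):
    """Depth-first search over the NFA's choices.

    Returns (True, path) for the first accepting path found, otherwise
    (False, last_path_tried). Trying alternatives in list order is an
    assumption of this sketch.
    """
    path = (path or []) + [state]
    if not text:
        return state in NFA_FINAL, path
    last = path
    for target in NFA_DELTA.get((state, text[0]), []):
        ok, tried = accepts_backtracking(text[1:], target, path)
        if ok:
            return True, tried
        last = tried  # remember the most recent failed path
    return False, last


def subset_construction(alphabet=("a", "b")):
    """Build a DFA whose states are the reachable sets of NFA states."""
    start = frozenset([NFA_START])
    dfa_delta, seen, queue = {}, {start}, [start]
    while queue:
        current = queue.pop()
        for symbol in alphabet:
            # Union of all NFA moves from the members of the current set;
            # the empty set acts as the dead state.
            target = frozenset(
                t for s in current for t in NFA_DELTA.get((s, symbol), [])
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    final = {s for s in seen if s & NFA_FINAL}
    return start, dfa_delta, final


def accepts_via_dfa(text):
    """Run the DFA produced by subset_construction on `text`."""
    start, dfa_delta, final = subset_construction()
    state = start
    for symbol in text:
        state = dfa_delta[(state, symbol)]
    return state in final
```

Both functions agree on the two strings used above: ‘aabbab’ is accepted and ‘abba’ is rejected.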


Figure 6 Screenshot of the TDV showing the transition diagram of DFA 1.

Figure 7 Screenshot of the TDV showing the transition diagram of NFA 1.

PRELIMINARY TESTING

A preliminary test of the toolkit was performed with 53 students pursuing graduate, postgraduate, and doctoral degrees in CS at either GTB Institute of Technology or Jawaharlal Nehru University. The participating students had studied at least one course on automata theory in previous semesters. The toolkit, along with a detailed instruction manual and sample programs, was provided to the students. The students designed DFAs and NFAs, compiled them, and studied their behavior. The students were supervised while they undertook this study. The overall student response was quite positive. The students felt that the toolkit is interesting as a teaching tool and that the source language is easy to learn. The students also felt that the toolkit helped them to reach a better understanding of finite automata and allowed them to design several complicated DFAs and NFAs. The teachers who oversaw the testing procedure remarked that the toolkit can be used as a supplement to a course on automata theory and even as a case study in a course on compiler construction.

CONCLUSIONS

It can be concluded that FADL is a simple yet efficient language to design finite automata. The language and the associated toolkit have been well accepted by students in a pilot test. The ability of FADL to represent a finite automaton in a way that is similar to that used in textbooks and research literature has been highly appreciated. The FAI interpreter simulates the exact behavior of a finite automaton, whether deterministic or nondeterministic, for any input string. The toolkit can also be used to perform several important operations on finite automata, like minimizing the states and converting an NFA to a DFA. The use of compiler technology has made the modeling and simulation of finite automata more formal and systematic, which has helped the students to gain a better understanding of the subject. The success of the FADL and TMDL languages calls for the development of more pedagogical tools based on compiler technology for teaching courses on formal languages and automata theory.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers for their valuable suggestions that helped to improve the paper appreciably.

REFERENCES

[1] A. Demaille, R. Levillain, and B. Perrot, A set of tools to teachcompiler construction, ACM SIGCSE Bull 40 (2008), 68–72.

[2] M. Hall, D. Padua, and K. Pingali, Compiler research: The next 50years, Commun ACM 52 (2009), 60–67.

[3] P. Chakraborty, A language for easy and efficient modeling of Turingmachines, Prog Nat Sci 17 (2007), 867–871.

[4] R. Bodik, Small languages in an undergraduate pl/compiler course,ACM SIGPLAN Not 43 (2008), 39–44.

[5] H. A. Simon, Experiments with a heuristic compiler, J ACM 10(1963), 493–506.

[6] P. Chakraborty, Use of heuristics in shift-reduce parsers, Proceedingsof the International Conference on Data Management, 2008, pp 103–109.

Page 8: A compiler-based toolkit to teach and learn finite automata

8 CHAKRABORTY ET AL.

[7] P. Chakraborty, Design and implementation of a cross compiler, JMultidisc Eng Technol 3 (2009), 6–15.

[8] F. Gruau, J. Y. Ratajszczak, and G. Wiber, A neural compiler, TheorComput Sci 141 (1995), 1–52.

[9] G. A. Korn, A simulation model compiler for all seasons, SimulatPract Theory 9 (2001), 21–35.

[10] K. Tsuda, T. Terano, Y. Kuno, H. Shirai, and H. Suzuki, A compilerfor business simulations: Toward business model development byyourselves, Inf Sci 143 (2002), 99–114.

[11] P. Grigorenko, A. Saabas, and E. Tyugu, COCOVILA—Compiler–compiler for visual languages, Electron Notes Theor Comput Sci 141(2005), 137–142.

[12] G. Costagliola, V. Deufemia, and G. Polese, Visual language imple-mentation through standard compiler–compiler techniques, J VisLang Comput 18 (2007), 165–226.

[13] P. Chakraborty and R. G. Gupta, A simple object oriented compiler,Proceedings of the National Conference on Information Technologyand Competitive Dynamics, 2008, pp 203–215.

[14] S. H. Hsu, B. Krishnamurthy, P. Rao, C. Zhao, S. Jagannathan,and V. Venkatasubramanian, A domain specific compiler theorybased framework for automated reaction network generation, Com-put Chem Eng 32 (2008), 2455–2470.

[15] C. I. Chesnevar, M. L. Cobo, and W. Yurcik, Using theoretical com-puter simulators for formal languages and automata theory, ACMSIGCSE Bull 35 (2003), 33–37.

[16] C. I. Chesnevar, M. P. Gonzalez, and A. G. Maguitman, Didacticstrategies for promoting significant learning in formal languages andautomata theory, ACM SIGCSE Bull 36 (2004), 7–11.

[17] K. Sutner, Implementing finite state machines, In: N. Dean, and G.E. Shannon (Eds.), Computational support for discrete mathematics,Series in Discrete Mathematics and Theoretical Computer Science,Vol. 15. American Mathematical Society, 1994, pp 347–363.

[18] S. H. Rodger, A. O. Bilska, K. H. Leider, M. Procopiuc, O. Pro-copiuc, J. R. Salemme, and E. Tsang, A collection of tools for makingautomata theory and formal languages come alive, ACM SIGCSEBull 29 (1997), 15–19.

[19] D. G. Hannay, Hypercard automata simulation: Finite-state, push-down and Turing machines, ACM SIGCSE Bull 24 (1992), 55–58.

[20] M. LoSacco and S. H. Rodger, FLAP: A tool for drawing and simulat-ing automata, Proceedings of the World Conference on EducationalMultimedia and Hypermedia, 1993, pp 310–317.

[21] R. Cavalcante, T. Finley, and S. H. Rodger, A visual and interac-tive automata theory course with JFLAP 4.0, ACM SIGCSE Bull 36(2004), 140–144.

[22] M. T. Grinder, S. B. Kim, T. L. Lutey, R. J. Ross, and K. F. Walsh,Loving to learn theory: Active learning modules for the theory ofcomputing, ACM SIGCSE Bull 34 (2002), 371–375.

[23] M. T. Grinder, Animating automata: A cross-platform program forteaching finite automata, ACM SIGCSE Bull 34 (2002), 63–67.

[24] M. Wermelinger and A. M. Dias, A Prolog toolkit for formal lan-guages and automata, ACM SIGCSE Bull 37 (2005), 330–334.

[25] A. P. Thomas, L. B. Sherrell, and J. B. Greer, Using software simu-lations to teach automata, J Comput Sci Coll 21 (2006), 170–176.

[26] A. Merceron, Design patterns to support teaching of automata theory,ACM SIGCSE Bull 41 (2009), 341.

[27] M. Biczo and K. Pocza, Generating functional implementations offinite state automata in C# 3.0, Electron Notes Theor Comput Sci238 (2009), 3–12.

[28] M. Ganapathi, C. N. Fischer, and J. L. Hennessy, Retargetable com-piler code generation, ACM Comput Surv 14 (1982), 573–592.

[29] E. F. Elsworth, The MSL compiler writing project, ACM SIGCSEBull 24 (1992), 41–44.

[30] S. Schocken, Virtual machines: abstraction and implementation,ACM SIGCSE Bull 41 (2009), 203–207.

BIOGRAPHIES

Pinaki Chakraborty has BTech and MTech degrees in computer science. He is currently pursuing his PhD in the same subject at Jawaharlal Nehru University. He has published about 35 papers in reputed journals and conference proceedings. His areas of research include compilers, operating systems, expert systems, and computer science education.

P. C. Saxena is a professor of computer science at Jawaharlal Nehru University. He received his PhD from University of Delhi. He has published over 80 papers in journals of international repute. He has supervised about 90 MTech and 20 PhD dissertations. His areas of research include computer networks, distributed systems, and optimization theory.

C. P. Katti is a professor of computer science at Jawaharlal Nehru University. He received his PhD from IIT Delhi. He has published over 30 papers in journals of international repute. His areas of research include parallel computing and numerical analysis.