
Int. J. Man-Machine Studies (1984) 20, 157-167

Type-checking in an untyped language

ALLAN RAMSAY†

School of Engineering and Applied Science, University of Sussex, Falmer, Brighton, U.K.

(Received 27 June 1982, and in revised form 27 February 1983)

It is argued that typed variables and functions are inappropriate for languages which allow functional arguments, data types defined by predicates, and conditional expressions that test data types. However, it is still possible to do some compile-time type-checking for such languages. This paper presents a technique for inferring data types in an untyped language, and a program that uses this technique to show where type constraints are obeyed or violated and where run time checks are needed.

1. Programming in POP-11

This paper presents a system which infers type information about the data flow through programs written in the untyped language POP-11 and uses this information to check the consistency of the target programs. Before we discuss in detail how this is done we will briefly describe POP-11 and say why its users like it and why, having chosen to use an untyped language, we want to do type-checking.

POP-11 is one of several dialects of POP-2, all of which share many features with the original described in Burstall, Collins & Popplestone (1971). It is used mainly by Artificial Intelligence (A.I.) researchers, who like it for the following reasons (several of which would probably appeal to programmers in other areas as well).

(i) It is interactive, with an incremental compiler. This means that it is easy to experiment with programs and to correct minor bugs as they are detected: the EDIT-COMPILE-RUN cycle is much faster if you can edit and recompile small parts of your program while you are running it. The POP-11 system can be seen as providing an integrated programming environment, with facilities like an editor and a set of debugging tools (see below) provided as part of the system and available within it, rather than as separate utilities to be invoked in isolation.

(ii) The syntax is clear and easy to read, encourages the user to develop well-structured programs, and allows the compiler to provide comprehensible error messages. This is one of the major differences between POP-11 and LISP, which are otherwise fairly similar.

(iii) The virtual machine for POP-11 is very simple, and the way that source code is translated into operations for it is easy to follow. When the programmer writes her source code she knows what operations it will invoke on the virtual machine, even when she is trying complicated experiments with control strategies. We describe the POP-11 virtual machine in the next section, and will discuss there why the language is untyped and how we can nevertheless infer quite a lot about whether type errors will occur.

† Now at: Department of Computer Science, University of Essex, Colchester, Essex, U.K.

0020-7373/84/020157+11 $03.00/0 © 1984 Academic Press Inc. (London) Limited


(iv) The system provides a large number of programming constructs as standard functions. These include facilities like list processing functions, dynamic allocation of data structures, and a simple mechanism for creating specialised versions of functions by "freezing" some of their parameters.

It also provides facilities that make it easy to experiment with control strategies, and to embed very high level languages such as PROLOG (Clocksin & Mellish, 1981), CONNIVER (Sussman & McDermott, 1973) and GSP (Kaplan, 1973). The most important points here are that functions can be treated just like any other data structures, e.g. they can be passed as arguments to other functions and they can be stored as components of other data structures; and that the user can manipulate the calling sequence, so that you can save an environment to be re-used later and you can exit from a function to some arbitrary point earlier in the calling sequence. These properties are due largely to the simplicity of the virtual machine and the ease with which it can be manipulated.

(v) Many programming languages have an associated ODT, which enables the user to inspect the state of the system during breakpoints (which may be predefined or may be entered when there is an error or an interrupt). In general, ODTs operate at assembly language level, and the user must understand the relations between assembler and the high level language she wrote her program in if she is to use them. In POP-11, on the other hand, the standard incremental compiler is invoked when there is a breakpoint so that the user may interact with the system via arbitrary POP-11 commands (e.g. to inspect or update variables, or even to edit and recompile function definitions) before allowing computation to resume.

Furthermore, since POP-11 functions are data structures that can be operated on by other POP-11 functions, it is easy to provide any debugging tools that are required (such as tracing functions or setting breakpoints on function entry and exit) by writing them in POP-11.

These two facilities, which are common to LISP and other A.I. languages, give the user access to a powerful, easily extendable set of debugging tools from within the POP-11 programming environment. Debugging is recognized as an important part of the process of program development, and it is understood that if the programmer wants the facilities of a high level language when she writes her program she will appreciate them even more when she needs to debug it.

2. The virtual machine

The POP-11 virtual machine is based on two stacks: one, called the "user stack" or sometimes just the "stack", for argument and result passing, and one, called the "auxiliary stack", to hold return addresses and stored values of local variables. There are seven major operations for manipulating these stacks, as described below.

qpush (item) pushes the item onto the user stack.

push (var) pushes the value of the variable onto the user stack.

pop (var) pops the top item off the stack and sets it as the value of the variable. This operation must check that the stack is not empty; if it is, the system informs the user that there has been an error.

call (item) checks that the item is a function, informing the user that there is an error if it is not. If it is, call saves the current "stack frame" (i.e. all the information needed to resume computation after execution of the function call) on the auxiliary stack; sets the input parameters of the new function by popping their values off the user stack (using the pop operation described above, and thus ensuring that there are enough of them); and finally jumps to the first operation in the body of the new function.

return returns from a function. It reverses the actions taken by call, pushing the values of the output parameters of the current function, resetting the values of its locals from the auxiliary stack, resetting the "current function" pointer and jumping to the stored address.

jump (label) transfers control to the point in the operation sequence specified by the label.

jump-if (label) pops the top item off the stack (reporting an error if there is nothing there). If this item is the distinguished data structure "false", the system simply moves on to the next operation in the sequence; in all other cases (not just if the item is "true") it jumps to the specified location.

Source code is compiled into sequences of these operations by planting a push or qpush operation whenever an expression consisting of just a variable name or a constant is read; a pop operation whenever an assignment is read; a call operation if a function call is read; and a return operation if the word "return" is read and as the last instruction in any function body. Jump and jump-if are used to produce the sequencing implied by the use of keywords like "if", "while" and "goto" in the source code.

Thus, for the expression

until x > 100 do pr(x); x + 1 -> x; enddo;

the compiler would produce the sequence

[ LAB1: push x, qpush 100, call >, jump-if LAB2, push x, call pr, push x, qpush 1, call +, pop x, jump LAB1, LAB2: ]
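The way such a sequence drives the machine can be illustrated with a small interpreter. The following Python sketch is an assumption-laden model, not the POP-11 implementation: functions are host callables that return a tuple of results, so the auxiliary stack and the return operation are not modelled, and the names `run`, `functions` and `printed` are invented for the illustration.

```python
FALSE = False  # stand-in for POP-11's distinguished "false" item

def run(program, env, functions):
    """Run a list of (op, arg) pairs; `functions` maps a name to (callable, arity)."""
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    stack, pc = [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "qpush":            # push a constant
            stack.append(arg)
        elif op == "push":           # push the value of a variable
            stack.append(env[arg])
        elif op == "pop":            # assign the top of the stack to a variable
            if not stack:
                raise RuntimeError("pop: user stack is empty")
            env[arg] = stack.pop()
        elif op == "call":           # simplified: no stack frame is saved
            func, arity = functions[arg]
            if len(stack) < arity:
                raise RuntimeError("call: not enough arguments on the stack")
            args = [stack.pop() for _ in range(arity)][::-1]
            stack.extend(func(*args))
        elif op == "jump":
            pc = labels[arg]
        elif op == "jump-if":        # jump unless the popped item is "false"
            if not stack:
                raise RuntimeError("jump-if: user stack is empty")
            if stack.pop() is not FALSE:
                pc = labels[arg]
        # "label" entries are no-op markers
    return env

printed = []
functions = {
    ">": (lambda a, b: (a > b,), 2),
    "+": (lambda a, b: (a + b,), 2),
    "pr": (lambda a: printed.append(a) or (), 1),
}
program = [
    ("label", "LAB1"), ("push", "x"), ("qpush", 100), ("call", ">"),
    ("jump-if", "LAB2"), ("push", "x"), ("call", "pr"), ("push", "x"),
    ("qpush", 1), ("call", "+"), ("pop", "x"), ("jump", "LAB1"),
    ("label", "LAB2"),
]
env = run(program, {"x": 98}, functions)
```

Starting from x = 98, the loop body runs for 98, 99 and 100 and the machine stops with x = 101.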

3. POP-11 and type checking

POP-11 is a language for specifying actions to be effected on the above virtual machine. There is very little mention of data types in the description of this machine; even the demand that call may only be applied to a function is relaxed in some versions of the machine, where you can apply it to an arbitrary data structure and an integer n to get the nth component of the data structure. There is thus an argument that POP-11 should be untyped because typing it imposes constraints on the ways the programmer can use the virtual machine; such constraints would be undesirable, since one of the reasons for using the language is that you are attracted by the virtual machine. Thus the following function

function get_nth(n, v) -> x; v(n) -> x; if isfunction(x) then x(v) -> x; x -> v(n) endif; end;


provides a simple way of defining "demons" (Minsky, 1974). It is supposed to get the nth element of the vector v. However, if this element is a function it is assumed to be a function for calculating the "real" value of the nth element, and is applied. In this case its result is returned as the value of this call of get_nth, and is also stored in the vector for future reference.
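For readers unfamiliar with POP-11, the same idea can be sketched in Python. This is only an analogue under stated assumptions: a Python list stands in for the vector, `callable` stands in for isfunction, and the index is taken to be 1-based as in POP-11.

```python
def get_nth(n, v):
    """Return the n-th element of v (1-based); run and cache a stored demon."""
    x = v[n - 1]
    if callable(x):
        # The stored function computes the "real" value from the vector itself.
        x = x(v)
        # Store the computed value so the demon runs at most once.
        v[n - 1] = x
    return x

calls = []

def double_first(vec):
    calls.append("demon ran")   # record each run of the demon
    return vec[0] * 2

vector = [10, double_first]
first = get_nth(2, vector)    # runs the demon and stores its result
second = get_nth(2, vector)   # now just reads the stored value
```

The variable x holds first a function and then an ordinary value, which is exactly the behaviour that a fixed type declaration for x would forbid.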

get_nth illustrates two important points about POP-11. First, it shows how the ability to treat functions just like other data structures makes it easy to implement techniques that would otherwise be very awkward. Second, get_nth would have been very hard to write if variables were typed in POP-11, unless you were allowed to specify that a variable is to be of type "anything". We want get_nth to apply to any vectors (apart from ones whose "real" components are functions), so that, even using UNION types, the best that could be managed is to say that x is of type "anything UNION function(vector -> anything)". This would not improve the legibility of the program, nor would it enable the compiler to prove anything about its consistency. This second point is a simple example of a general consequence of the freedom to experiment with control strategies. It is very difficult to do compile time type checking in POP-11, since it is in general not clear at compile time what value a function will have when it is called or where it will return its results to; but if you are not going to do type checking, there seems little point in requiring the language to be typed.

Thus, we do not want to add a type specification formalism to POP-11, partly because we feel it would impose undesirable constraints on the user and partly because we would not be able to use it properly if we had it. However, it is clear that there are implicit type constraints in the language: if, for instance, you apply a list processing function like "head" to an object, that object had better be a list. At present the user is required to ensure that these constraints are obeyed simply by being careful when she writes her program and by running it on test data to see if the system's run-time checks detect any errors.

This is not really satisfactory. Two particularly inconvenient situations are when you decide you want to change the way some class of objects is represented in your program, since POP-11 provides no clues about where you will have to edit your program to ensure the change is implemented correctly; and when you want to make a change in a program you are not very familiar with, since again the compiler will not tell you whether your change introduces typing problems in some part of the program which you have overlooked.

In order to help the user with these problems we decided to provide a program, TYPER, which would use the implicit type constraints to discover and tell her about places where her program contained type errors. We argued above that the generality of the POP-11 machine makes it impracticable to provide general proof techniques for type checking. However, we do have techniques that will tell us in many specific cases whether there will be problems or not; for other cases we will just tell the user that there may be, so that at least she will not overlook them. The main technique we use is to trace the history of data items during the abstract execution of a program. Since for most system functions we know what types of arguments they require and what types of results they produce (and for virtually all system functions we at least know how many arguments and results they have), we can analyse the arguments and results of user functions that call nothing but system functions by examining what happens on all possible execution paths. Clearly once we have derived a description of such a function we can treat it as a system function when we consider others that call it; we show later that it is also possible to deal with mutually recursive functions, which are the only ones that cannot be analysed by simply working up the function hierarchy, so that eventually we can analyse any POP-11 program.

4. Evaluation along paths

The first thing TYPER does is to parse the object program in just the same way as the POP-11 parser does itself, so that it would turn the expression

if X > 10 then -3 else 5 endif -> Y

into the sequence

[ push X, qpush 10, call >, jump-if LAB1, qpush 5, jump LAB2, LAB1: qpush -3, LAB2: pop Y ]

The next stage is to turn the sequence corresponding to a function body into a set of execution paths (King, 1969). This is done by splitting the sequence every time a jump-if is encountered and by replacing jumps by the instructions they point to. When we split a sequence after a jump-if, we note whether the value on the stack was true or false and continue either with the next instruction in the sequence or the instruction specified by the jump as appropriate. Thus in our example, splitting the sequence at the jump-if produces two paths, namely

[ push X, qpush 10, call >, note true, LAB1: qpush -3, LAB2: pop Y ]

and

[ push X, qpush 10, call >, note false, qpush 5, jump LAB2, LAB1: qpush -3, LAB2: pop Y ],

and replacing jumps reduces these to

[ push X, qpush 10, call >, note true, qpush -3, pop Y ]

and

[ push X, qpush 10, call >, note false, qpush 5, pop Y ].

This simple approach will produce infinite sets of paths if the initial sequence contains backward jumps or jump-ifs. There is, for our purposes, no need to consider cases where a jump-if is processed more than twice (since there are only two places it can lead to), and TYPER abandons any paths on which this happens. Thus the parser would turn the expression

until x > 10 do x + 1 -> x enddo;

into the sequence

[ LAB1: push x, qpush 10, call >, jump-if LAB2, push x, qpush 1, call +, pop x, jump LAB1, LAB2: ... ]


Splitting this sequence after the jump-if and replacing the jump would produce two paths, namely

[ push x, qpush 10, call >, note true, LAB2: ... ]

and

[ push x, qpush 10, call >, note false, push x, qpush 1, call +, pop x, LAB1: push x, qpush 10, call >, jump-if LAB2, push x, qpush 1, call +, pop x, jump LAB1, LAB2: ... ]

Processing the second of these would produce two more paths, on one of which the sequence [push x, qpush 10, call > ] will appear thrice. This path will be dropped, and the analysis will eventually produce the following two paths

[ push x, qpush 10, call >, note true, LAB2: ... ]

and

[ push x, qpush 10, call >, note false, push x, qpush 1, call +, pop x, push x, qpush 10, call >, note true, LAB2: ... ]
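The splitting procedure described in this section can be sketched as a short recursive walk. The Python below is an illustrative reconstruction rather than TYPER's code: the (op, arg) encoding, the function name `paths` and the `limit` parameter are assumptions, and a sequence containing an unconditional backward jump with no jump-if on the cycle would make this sketch loop forever.

```python
def paths(program, limit=2):
    """Enumerate execution paths through a list of (op, arg) pairs.

    Each jump-if is split into a "note True" branch (the jump is taken) and a
    "note False" branch (fall through); jumps are followed and labels dropped.
    A path is abandoned once the same jump-if is processed more than `limit`
    times on it.
    """
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    finished = []

    def walk(pc, path, seen):
        while pc < len(program):
            op, arg = program[pc]
            if op == "label":
                pc += 1
            elif op == "jump":
                pc = labels[arg]
            elif op == "jump-if":
                count = seen.get(pc, 0) + 1
                if count > limit:
                    return                      # abandon this path
                seen = {**seen, pc: count}      # per-path copy of the counts
                walk(labels[arg], path + [("note", True)], seen)
                walk(pc + 1, path + [("note", False)], seen)
                return
            else:
                path = path + [(op, arg)]
                pc += 1
        finished.append(path)

    walk(0, [], {})
    return finished

IF_PROGRAM = [
    ("push", "X"), ("qpush", 10), ("call", ">"), ("jump-if", "LAB1"),
    ("qpush", 5), ("jump", "LAB2"), ("label", "LAB1"), ("qpush", -3),
    ("label", "LAB2"), ("pop", "Y"),
]
LOOP_PROGRAM = [
    ("label", "LAB1"), ("push", "x"), ("qpush", 10), ("call", ">"),
    ("jump-if", "LAB2"), ("push", "x"), ("qpush", 1), ("call", "+"),
    ("pop", "x"), ("jump", "LAB1"), ("label", "LAB2"),
]
if_paths = paths(IF_PROGRAM)
loop_paths = paths(LOOP_PROGRAM)
```

With these inputs the sketch yields two paths for the conditional and two for the loop, the third traversal of the loop test being the one that is dropped.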

This process sounds as though it will produce enormous numbers of paths to be considered. In practice the effort is worthwhile, as it makes it very easy to examine all the states into which the program can get. When TYPER was applied to itself, the worst case was 22 paths for a function about 40 lines long.

Once it has all the paths through a function, TYPER executes the operations along each of them with symbolic data. Executing the operations along a path using symbolic data is rather different from doing it using real data. For TYPER, an abstract data item is a cell with two fields, one describing the object's type and the other describing where it came from. Thus the input parameters are initialised as {general [input 1]} ... {general [input n]}, i.e. as a set of items whose type is unknown which are the inputs to the function. We can push and pop these objects just as we did before, but we cannot use call in the same way as we did when we had real data, and we have a new operation "note" to describe. The new definition of call and the definition of note are as follows.

call (func) evaluates (func) in the current context. This means different things depending on the value of (func). In the simplest case TYPER has a description of the inputs and outputs of (func), e.g. if it were + then TYPER would know that it took two numbers and produced another. In this simplest case, TYPER just checks that the arguments are of the right type, and returns new data items whose type is as required by the description of the outputs and whose source is the current call of (func). If the type check for some argument fails because the argument is known to be of an inappropriate type, TYPER notes that there is a problem and reports it to the user at the end of the analysis; if it fails because the argument type was "general", i.e. nothing was known about it, TYPER simply notes, at the point where the item first appeared, that its type should be checked at run time.

We will return to the more complicated instances of call later.

note (tval) generally just pops the top item off the stack. However, when (tval) is "true" TYPER examines the popped item. If it is the result of a type testing function, e.g. "islist", "isfunction", then the program can infer that the item being tested was of the relevant type (e.g. for a path that went [ ... push x, call islist, note true ... ] it would be possible to infer that x must be a list). Similarly, if the popped item is the result of comparing some item with a constant, it is possible to infer that the constant and the item it is compared with have the same type.

In these cases TYPER notes, at the point where the item appeared, that it will be of the relevant type (note, not that its type will need to be checked, but that for the given path it is known). At present TYPER does nothing else with information that could be obtained from instances of note.
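The symbolic versions of call and note can be sketched as follows. This Python is a minimal illustration under several assumptions that are not in the paper: the `Item` class carries an extra `args` field so that note can find what a test was applied to, the `SIGNATURES` and `TYPE_TESTS` tables are invented stand-ins for TYPER's descriptions of system functions, and an inferred type is also written back onto the item so that later calls on the same path see it.

```python
from dataclasses import dataclass

@dataclass(eq=False)
class Item:
    type: str          # "general" means nothing is known
    source: tuple      # where the item came from
    args: tuple = ()   # hypothetical extra field: arguments of the producing call

# Hypothetical descriptions: name -> (argument types, result types).
SIGNATURES = {
    "+": (["number", "number"], ["number"]),
    "=": (["general", "general"], ["boolean"]),
    "islist": (["general"], ["boolean"]),
    "hd": (["list"], ["general"]),
}
TYPE_TESTS = {"islist": "list", "isfunction": "function"}

def const_type(value):
    if isinstance(value, (int, float)):
        return "number"
    return "string" if isinstance(value, str) else "general"

def learn(item, type_, report):
    report["known"][item.source] = type_
    if item.type == "general":
        item.type = type_

def execute(path):
    """Run one path with abstract items; return clashes, checks and inferences."""
    env, stack = {}, []
    report = {"clashes": [], "checks": {}, "known": {}}
    for step, (op, arg) in enumerate(path):
        if op == "push":
            stack.append(env.setdefault(arg, Item("general", ("input", arg))))
        elif op == "qpush":
            stack.append(Item(const_type(arg), ("const", arg)))
        elif op == "pop":
            env[arg] = stack.pop()
        elif op == "call":
            wanted, results = SIGNATURES[arg]
            args = tuple(reversed([stack.pop() for _ in wanted]))
            for item, want in zip(args, wanted):
                if want == "general" or item.type == want:
                    continue
                if item.type == "general":
                    report["checks"][item.source] = want   # run-time check needed
                else:
                    report["clashes"].append((step, arg, item.source, item.type, want))
            stack.extend(Item(t, ("call", arg, step), args) for t in results)
        elif op == "note":
            item = stack.pop()
            if arg is True and item.source[0] == "call":
                func = item.source[1]
                if func in TYPE_TESTS:
                    learn(item.args[0], TYPE_TESTS[func], report)
                elif func == "=":
                    a, b = item.args
                    if a.source[0] == "const" and b.source[0] != "const":
                        learn(b, a.type, report)
                    elif b.source[0] == "const" and a.source[0] != "const":
                        learn(a, b.type, report)
    return report

guarded = execute([("push", "x"), ("call", "islist"), ("note", True),
                   ("push", "x"), ("call", "hd"), ("pop", "y")])
unguarded = execute([("push", "x"), ("qpush", 1), ("call", "+"), ("pop", "x")])
clash = execute([("qpush", "abc"), ("qpush", 1), ("call", "+")])
```

On the guarded path the islist test lets the later call of hd pass without a run-time check; on the unguarded path a check is recorded against the input x; the third path produces a definite clash.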

5. Recombining paths

TYPER "executes" each path through a given function as described above. For each path it obtains notes describing the inputs, any type clashes that occur, and the outputs. The next stage is to combine the information for the individual routes into a description of the function as a whole.

Type clashes are reported directly to the user. The only problem here is that TYPER knows which pseudo-ops in the parse tree caused the problem, whereas the user wants to know where the problem is in her program, i.e. her source code. The inverse mapping from the parse tree to the source code is not entirely trivial, even if you keep track of what you are doing when you create the tree from the program. The best that TYPER can do is to indicate where the offending item(s) came from, and where the incorrect usage occurred, by pointing to the smallest enclosing expressions in the source code and naming the relevant function calls.

TYPER describes the outputs by checking their final types along each path; those that have the same type whichever way you go through the function are defined to have that type, the others are defined to be of type "general".

For the inputs TYPER examines all the notes that it made concerning run-time checks. If some input has the same run-time check on all paths, it is clearly possible to infer that this check should be part of the input specification for the function. This principle is extended to include cases where the input item either has a run-time check planted, or where its type was inferred from a "note". This is to cope with cases like

until x = 0 do print(x); x - 1 -> x; enddo;

This produces two paths, namely

[ push x, qpush 0, call =, note true ]

and

[ push x, qpush 0, call =, note false, push x, call print, push x, qpush 1, call -, pop x, push x, qpush 0, call =, note true ]

On the first path, the "note true" following a comparison with a constant allows TYPER to infer that the value of x is a number (i.e. of the same type as 0). On the second path the call of "-" requires a check that the value of x is a number to be planted when the relevant item first appears. From the extended rule introduced above TYPER can now infer that on entry to the expression the value of x must be a number. Cases like this, where the termination of a loop depends on the relation between a constant and a variable, are very common; it is difficult to extract type information from them unless we make inferences that depend on the truth of the termination condition.
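The recombination rules for outputs and inputs amount to an agreement test across paths. The Python sketch below assumes a simplified representation that is not taken from the paper: each path contributes a list of output types and a dictionary that merges, for each input, either the run-time check planted on it or the type inferred from a note.

```python
def combine_outputs(per_path_outputs):
    """An output keeps its type only if every path agrees on it.

    Assumes every path yields the same number of outputs.
    """
    combined = []
    for types in zip(*per_path_outputs):
        combined.append(types[0] if len(set(types)) == 1 else "general")
    return combined

def combine_inputs(per_path_notes):
    """An input gets a type in the specification only if every path
    either checks it for that type or infers that type from a note."""
    if not per_path_notes:
        return {}
    first, *rest = per_path_notes
    return {
        name: type_
        for name, type_ in first.items()
        if all(notes.get(name) == type_ for notes in rest)
    }

# The "until x = 0" example: path 1 infers the type of x from the note,
# path 2 plants a run-time check for the call of "-".
loop_spec = combine_inputs([{"x": "number"}, {"x": "number"}])
mixed_outputs = combine_outputs([["number", "list"], ["number", "general"]])
```

An input that is constrained on only some of the paths is left out of the specification, which corresponds to leaving its run-time check in place.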

Thus for functions that just call functions with IO type specifications TYPER can do a certain amount of static type checking; infer IO type specifications; and indicate where run-time checks or careful compile-time analysis are required. However, not all functions fit this description.

6. Difficult cases

UNDEFINED FUNCTIONS

There are a number of situations where a function is called but it is hard or impossible to obtain any compile time description of it. This can happen if the function is selected from a dispatch table; if it is locally redefined; or if the user is applying TYPER to an incomplete program and the relevant function simply has not yet been defined.

In these cases TYPER has to make a guess about what arguments the function takes and what results it returns; its initial guess is that the function takes items of the sorts currently on the stack as inputs, and returns no results. If this assumption leads to problems on the grounds that subsequent operations try to pop non-existent items, the description of the function is altered so that it produces results of the required types. Evaluation is then restarted from the beginning of the current path; this is necessary because the offending function may have been called several times before the problem was noticed, so that altering its description may have more complex effects than anticipated.
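The guess-and-restart loop can be sketched by tracking only the depth of the stack. This Python is a deliberately reduced illustration and makes assumptions beyond the text: it counts arguments and results instead of recording their types, and when a pop fails it blames the most recently called undefined function. The name `guess_arities` and the `arities` table are hypothetical.

```python
def guess_arities(path, arities):
    """Guess (inputs, results) counts for functions missing from `arities`.

    `arities` maps each described function to (n_arguments, n_results).
    """
    guesses = {}                       # name -> [n_inputs, n_results]
    while True:
        depth, last_unknown, restart = 0, None, False
        for op, arg in path:
            unknown = False
            if op in ("push", "qpush"):
                need, give = 0, 1
            elif op in ("pop", "note"):
                need, give = 1, 0
            elif arg in arities:
                need, give = arities[arg]
            else:
                unknown = True
                if arg not in guesses:
                    # Initial guess: takes everything on the stack, returns nothing.
                    guesses[arg] = [depth, 0]
                need, give = guesses[arg]
            if depth < need:
                if last_unknown is None:
                    raise ValueError("stack underflow with no undefined call to blame")
                # Alter the description so that it yields the missing items,
                # then restart from the beginning of the path.
                guesses[last_unknown][1] += need - depth
                restart = True
                break
            depth += give - need
            if unknown:
                last_unknown = arg
        if not restart:
            return guesses

simple = guess_arities(
    [("push", "a"), ("push", "b"), ("call", "mystery"), ("pop", "x")], {})
mixed = guess_arities(
    [("push", "a"), ("call", "f"), ("qpush", 1), ("call", "+"), ("pop", "x")],
    {"+": (2, 1)})
```

In the first path the initial guess for mystery is two inputs and no results; the failing pop then forces one result and the path is evaluated again from the start.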

Using this technique TYPER produces a description of the functions that may instantiate the undefined call; this description may be used to help the user with the synthesis of her program, or to place run-time checks on the instantiation.

VARIABLE NUMBERS OF PARAMETERS

It is not uncommon for POP-11 functions to have variable numbers of inputs or outputs. In some such cases the first argument or result is a number which says how many more there are; in others there is a distinguished item that is known to be the last argument or result; and there are also functions that provide no clues whatsoever about how many arguments or results they have. Most of these cases are very difficult for TYPER to deal with, and it generally just tells the user that she must check their use herself.

There is one important case, however, where TYPER can do something. This is the system function "enlist", which takes an arbitrary number of items, makes a list out of them, and returns this list as its result. Enlist assumes that at some earlier point in the execution of the program a distinguished marker was pushed onto the stack, and takes as its arguments all the items above the marker (which it removes). If TYPER encounters a call of enlist, it checks that the marker has been planted, removes it and all items above it from the stack, and replaces it by a list. The items that were removed from the stack may be the results of functions that have variable numbers of results; TYPER regards the use of such functions as legitimate in this context and does not report them to the user.
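The treatment of enlist on the abstract stack can be sketched in a few lines. In this Python illustration the `MARKER` object, the function name `abstract_enlist` and the representation of the resulting list item as a ("list", items) pair are all assumptions made for the example.

```python
MARKER = object()   # stand-in for the distinguished stack marker

def abstract_enlist(stack):
    """Replace the topmost marker and everything above it by one list item."""
    if not any(item is MARKER for item in stack):
        raise ValueError("enlist called but no marker has been planted")
    items = []
    while True:
        item = stack.pop()
        if item is MARKER:
            break
        items.append(item)
    items.reverse()                  # restore the order in which they were pushed
    stack.append(("list", items))
    return stack

result = abstract_enlist(["number", MARKER, "general", "number"])
```

Items below the marker are untouched, and however many items lie above it the stack ends up exactly one item deeper than it was below the marker.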

RECURSION

If you have two mutually recursive functions, it is impossible to derive a complete description for either of them until you have one for the other. In this case the rule suggested in section 3, that you should always analyse functions lower down the function hierarchy first, breaks down as neither of them is lower than the other. TYPER deals with this by picking one of them at random and obtaining a description of it treating the other as undefined (as above). It then analyses the other one, using this initial description of the first one when it is needed. If this description of the second function is different from that obtained during the analysis of the first, then the first is re-analysed using the improved description of the second, and if this alters the description of the first then the second is re-analysed, etc.

This technique is embodied in the following rule. As far as possible, work your way up the calling tree. When all the remaining functions call at least one unanalysed function, pick either a self-recursive function or one of a set of mutually recursive ones at random, and analyse this treating all calls of unanalysed functions as though they were undefined (any set of functions that all call at least one other member of the set must contain a self-recursive function or set of mutually recursive ones). If, when you subsequently analyse a function you find that its description is different from the one that was inferred when you treated it as undefined, then re-analyse all those functions in which you used the inferred description.

This rule allows TYPER to deal with self-recursive functions as well as mutually recursive ones. It is conceivable that it will lead to endless chains of elaboration, and TYPER has a check to warn the user that it cannot cope if there are more than four iterations for a given function, but in practice this has never been triggered.
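The rule can be sketched as a worklist iteration. The Python below is an approximation under stated assumptions: the `analyse` callback and the toy description (a set of result types) are invented for the example, the function chosen when nothing is ready is simply the first one pending instead of a random member of a recursive set, and callers are re-analysed whenever a callee's description first appears or changes.

```python
def analyse_program(calls, analyse, max_rounds=4):
    """calls: user function -> set of user functions it calls.
    analyse(name, known): description of `name`, given the descriptions in
    `known`; callees missing from `known` are to be treated as undefined."""
    descriptions = {}
    rounds = {name: 0 for name in calls}
    worklist = list(calls)
    while worklist:
        # Work up the calling tree where possible.
        ready = [f for f in worklist if all(c in descriptions for c in calls[f])]
        current = ready[0] if ready else worklist[0]
        worklist.remove(current)
        rounds[current] += 1
        if rounds[current] > max_rounds:
            raise RuntimeError("cannot cope with " + current)
        new = analyse(current, dict(descriptions))
        if descriptions.get(current) != new:
            descriptions[current] = new
            # Re-analyse analysed callers that relied on the old description.
            for caller, callees in calls.items():
                if (current in callees and caller in descriptions
                        and caller not in worklist):
                    worklist.append(caller)
    return descriptions, rounds

# Toy example: a description is the set of result types a function may return.
BASE = {"even": {"boolean"}, "odd": set()}
CALLS = {"even": {"odd"}, "odd": {"even"}}

def toy_analyse(name, known):
    result = set(BASE[name])
    for callee in CALLS[name]:
        result |= known.get(callee, frozenset())   # undefined callee: no results
    return frozenset(result)

descriptions, rounds = analyse_program(CALLS, toy_analyse)
```

Here "even" is analysed first with "odd" undefined, "odd" is then analysed using that description, and "even" is analysed once more and found to be unchanged, at which point the iteration stops.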

USER-DEFINED CLASSES

At present the only mechanism for defining new data types in POP-11 is for defining record types, i.e. structures with a specified set of fields. It is easy to fit these into the scheme described above: the definition of a record type just defines the access/update functions as taking an argument of the specified type. The techniques described above could also be used to cope with data types defined by predicates, as their main aim is to see where run-time checks are necessary and where they can be omitted, but there is currently no standard mechanism for defining and implementing such types and hence TYPER does not deal with them.

NON-STANDARD EXITS

POP-11 provides two functions, "exitto" and "exitfrom", that allow you to exit from the current function to some specified point arbitrarily high up in the calling sequence. The behaviour of functions that use this mechanism is hard to analyse. TYPER deals with them by providing separate descriptions of what they do when the standard and non-standard exits are taken. The standard description is used when analysing the subsequent behaviour of functions that call the given one in the normal way. The non-standard one is handed to the user for her to check for herself that this was what she wanted.


7. Conclusions

TYPER was written to provide support for the development of programs written in a language for which, for various reasons, static type checking was not suitable. It acknowledges that even in this situation there are gains to be made by doing some compile time checks, and that the provision of type specifications for functions will often be appreciated even by programmers who would feel intolerably constrained if all functions and variables had to be typed. This is particularly true on occasions when the programmer realises she has decided on an inappropriate representation for some class of structure and wants to change it; following up all the ramifications of such a decision for a program in an untyped language can take considerable effort.

The main considerations in the design of TYPER were as follows.

(a) It should be unobtrusive. You should be able to write your program exactly as you like, and if TYPER cannot help you then it should just say "Sorry, I don't understand this bit; you'll have to check it yourself".

(b) It should be conservative. If TYPER sees the slightest sign of trouble it warns you, and you are quite free to ignore its warnings. It is not intended that TYPER should remove the run-time checks that are built into POP-11, even when it is certain they are unnecessary, so you will not get into insuperable difficulties if you ignore its warnings, but at least they will help you trace what went wrong when the run-time checks complain.

(c) It should be simple. It is even more important for verification programs, programmer's apprentices, etc. to be correct than it is for other programs; simple programs are more likely to be correct than complicated ones. TYPER is based on two fairly well understood techniques, namely symbolic evaluation (Goldstein, 1974) and execution path analysis (King, 1969) [which is also used in dataflow programming (Watson & Gurd, 1979)], and is as unfussy as I could make it. It has been run without problems on itself.

(d) It should be quick and easy to use. TYPER was written in POP-11 on a PDP-11/34, an environment which does not provide much working space. For programs that could be analysed within these constraints TYPER's parser is approximately four times as slow as the POP-11 compiler itself, and the data flow analysis takes the same sort of time (depending on how many paths need to be analysed), which is acceptable for a prototype. For larger programs (as when it was run on itself), intermediate results have to be stored on and retrieved from disc, and then the performance is intolerable.

This work was done as part of a project for developing software tools for distributed programming, supported by SERC grant 8067.9. The presentation of this paper owes a great deal to my discussions with Steve Hardy, and to his description of why A.I. languages should be more widely used (Hardy, 1982).

References

BURSTALL, R. M., COLLINS, J. & POPPLESTONE, R. (1971). Programming in POP-2. Edinburgh University Press.

CLOCKSIN, W. & MELLISH, C. J. (1981). Programming in PROLOG. Heidelberg: Springer-Verlag.

GOLDSTEIN, I. (1974). Understanding simple picture programs. Ph.D. thesis, M.I.T.

HARDY, S. (1982). The POP-11 programming environment. Unpublished Report, University of Sussex.

KAPLAN, R. (1973). A general syntactic processor. In RUSTIN, R., Ed., Natural Language Processing. New Jersey: Algorithmic Press (Prentice-Hall).

KING, J. C. (1969). A program verifier. Ph.D. thesis, Carnegie-Mellon University.

MINSKY, M. (1974). A framework for representing knowledge. M.I.T. A.I. Memo 306.

SUSSMAN, G. J. & MCDERMOTT, D. V. (1973). Why conniving is better than planning. M.I.T. A.I. Memo 225,4.

WATSON, I. & GURD, J. (1979). A prototype dataflow machine with token labelling. Proceedings 1979 National Computer Conference, New York. New Jersey: AFIPS Press.
