
Programming Languages
D. Gries, Editor

The Composition of Semantics in Algol 68
P. Branquart, J. Lewi, M. Sintzoff, and P.L. Wodon
MBLE,* Research Laboratory, Brussels, Belgium

The main features of Algol 68 are explained from a semantic point of view. It is shown how the language permits the composition of values and actions, i.e. ultimately programs, from a minimum set of primitives with a few fundamental recursive rules of composition. The associated syntax is briefly reviewed. An attempt has been made to obtain a structured and simple introduction to both Algol 68 and its orthogonal design.

Key Words and Phrases: programming primitives, programming languages, Algol, semantics, recursive composition, design of programming languages, data structures

CR Categories: 1.3, 4.2, 4.22, 5.23, 5.24

Copyright © 1971, Association for Computing Machinery, Inc. General permission to republish, but not for profit, all or part of this material is granted, provided that reference is made to this publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery.

* A Division of the Manufacture Belge de Lampes et de Matériel Électronique S.A.


Introduction

ALGOL 68 is a programming language which generalizes and systematizes a number of features which, in the main, can also be found elsewhere, scattered in various other languages. To take a few examples, the "call by name/call by value" of ALGOL 60 is systematized into a very neat and powerful parameter mechanism which, among other properties, takes advantage of the fact that routines themselves are values and therefore can be parameters of other routines. The "type" of ALGOL 60 is generalized into the concept of mode, characterizing sets of values with similar properties, and these modes may be constructed by the programmer, who may thus use data structures of his own; furthermore, the programmer may define suitable operations on these data, for example polynomials and the operator '+' applying on polynomials. Coupled with this is a mode-checking mechanism which ensures that an operation, in the largest sense of the word, is applied only to data for which it is defined. These and other features will be discussed in this paper.

A program in ALGOL 68 specifies a series of actions to be performed by a computer. It will be seen that the order in which actions are executed may be serial, thus representing a succession of steps to be taken one after the other, or collateral, in which case the order in which the actions are executed is left undefined. This permits the implementer to choose the order he likes. An action may be performed on values and may yield another value as result. The piece of text which the programmer writes to specify an action is called a phrase. It is the elaboration of the phrase which actually performs the action. Simple examples of phrases are '10.1 + 2.7', which yields a value, or 'go to label', which performs a jump but yields no value. There are three kinds of phrases: expressions (yielding a value), statements (yielding no value) and declarations.

Communications of the ACM, November 1971, Volume 14, Number 11


All programming languages feature elementary values and actions, and have rules for composing compound actions. In ALGOL 68, there are in addition rules for composing values. This paper endeavors to describe the primitive values and actions together with the composition rules which ultimately lead to the composition of programs. The emphasis is put on semantics, that is to say on values and actions, rather than on syntax for phrases. Indeed, one should know what can be expressed by a language before learning how to write without misspellings or grammatical errors.

The input-output operations are not treated in this paper; see [1] for a detailed description.

1. Values and Modes

In ALGOL 68, values are internal objects, represented in hardware by bit patterns, which are operated upon when a program is elaborated. A set of values which share common properties, in particular on which the same relations and actions are defined, is characterized by a mode. This concept of mode is a generalization of the ALGOL 60 "type"; but in ALGOL 68 the programmer may construct his own modes, whereas ALGOL 60 has a fixed number of types. Each value has exactly one mode, which strictly characterizes the kinds of actions that can be performed with or on it. For example, Boolean values and real numbers are of different modes for evident theoretical reasons. In certain cases hardware considerations are taken into account: real numbers of different "lengths" are of different modes.

Phrases are external objects. A phrase which yields a value, i.e. an expression, is also characterized by a mode: the same mode which characterizes the set of values that the expression may yield. In general, the mode of an expression depends on its context. For example, if 'x' is a real variable, the mode of '3' in the context 'x := 3' is that of real numbers and not that of integers. More precisely, the "a priori mode" of the expression '3' (integral) is transformed into real in the above context (see Section 2.1.6).

The definition of ALGOL 68 explicitly describes a certain number of modes characterizing primitive, language-defined sets of values. It also provides the programmer with means for defining modes characterizing sets of values which are composed from more elementary ones: sets of multiple and structured values, sets of routines, sets of references to values, and unions of other sets.

1.1 Primitive Sets of Values

The primitive modes are those which the programmer is unable to define himself. They form the basis for the programmed construction of any number of other modes.

A first language-defined mode is 'int'. The set of values characterized by it contains all integers whose magnitude is smaller than a certain number N which is fixed not in the language definition but by the environment, i.e. an implementation of ALGOL 68 on a computer. For example, the environment may fix N as being the largest integer that can be stored in one cell. In this sense, ALGOL 68 is hardware dependent, but not particular-hardware dependent, since the value of N is not given in the definition of the language.

In practice, 'int' characterizes the set of integers of "simple length". Similarly, the modes 'long int', 'long long int', etc., characterize the sets of integers of double, triple, etc., length, each of them with a finite maximum magnitude fixed by the environment. Again, the definition does not say how many different lengths there may be, but does say that there is a limit fixed by the environment. Furthermore, all such limits fixed by the environment are accessible from the programs themselves (environment enquiries).

A second language-defined mode is 'real', which characterizes a set of real numbers whose maximum magnitude and number of significant digits are fixed by the environment. The modes 'long real', 'long long real', etc., in an environment-limited amount, also exist; but of course, the number of 'long' symbols is now more a measure of precision than a measure of size. Furthermore, each integer of given length (e.g. '10') is equivalent to a real number of the same length ('10.0').

The mode 'bool' characterizes the set of Boolean values 'true' and 'false'.

The mode 'char' characterizes a finite set of values, the characters. ALGOL 68 explicitly enumerates some characters (mainly letters and digits) but leaves the implementation to decide which other values of mode 'char' there may be. Each character has an equivalent value of mode 'int', fixed by the implementation (obviously the integer with the same machine representation).

Values of mode 'format' are used in the control of input-output operations. They are not discussed in this paper; but it must be pointed out that, since formats are values, they can be manipulated as such, e.g. assigned to variables, given as parameters to procedures, or composed into compound values.

The mode 'bits' characterizes a set of n-tuples of Boolean values, with n fixed by the environment. For example, n may be the number of bits in a memory cell. In spite of its description in terms of 'bool', 'bits' is language-defined since no programmer could define it in a program if it were not available. This is meant to express that values of mode 'bits' are normally handled in one piece by the usual hardware instructions. There also exist the modes 'long bits', 'long long bits', etc., in an amount fixed by the environment.

The modes 'bytes', 'long bytes', etc., are quite similar to 'bits', 'long bits', etc. They are k-tuples of characters, with k fixed by the environment.

The last language-defined mode is 'sema'. Values of this mode, called semaphores, are used to synchronize collateral elaboration (see Section 2.2.2).


1.2 Composition of Sets of Values

ALGOL 68 provides the programmer with means for defining new modes, and therefore new sets of values, according to five composition rules, which are at the same time mathematically coherent and hardware oriented. Roughly speaking, these rules compose power sets (containing arrays of any dimensions), Cartesian products of sets (containing structured values (records)), sets of routines, sets of references to other values, and union sets.

1.2.1 Multiple Values. If 'μ' is a mode, the modes '[ ] μ', '[ , ] μ', '[ , , ] μ', etc., with any number of commas between the brackets, characterize sets of multiple values, i.e. arrays of values of mode 'μ' with 1, 2, 3, etc., dimensions. For example, '[ , ] int' characterizes the set of rectangular arrays (matrices) of integers of simple length. The values of mode 'μ' which constitute a multiple value of mode '[ ... ] μ' are called its elements. A notation like '[ , ] int' suggests that something will be written between the brackets; it is indeed the case and will be discussed later.

Each given multiple value has a given number of elements. The elements of a one-dimensional value are numbered from a lower bound l to an upper bound u, so that l + i is the subscript of the (i + 1)-th element if 0 ≤ i ≤ u - l. This is generalized for n-dimensional multiple values: to each kth dimension is associated a lower bound l_k and an upper bound u_k, and to each element there corresponds an n-tuple of subscripts. This is, in the main, what the programmer needs to know. In [1] a multiple value of any number of dimensions is defined as a descriptor together with a certain number of elements each of which is indexed by a unique integer; the descriptor contains the information necessary to obtain the index of an element from its n-tuple of subscripts. Mathematically speaking, a multiple value is thus a sequence of values, i.e. a mapping from integers to values, together with another mapping from n-tuples of integers to integers. The elements of a multiple value are indeed stored one after the other in successive locations and not in an n-dimensional memory.
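The descriptor idea can be made concrete with a small model. The following Python sketch is an illustration only: the class name `MultipleValue`, its fields, and the row-major ordering of elements are assumptions of this sketch and are not prescribed by the text or by [1]. It shows a descriptor (the list of bounds) mapping an n-tuple of subscripts to the integer index of an element in a flat sequence.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence, Tuple


@dataclass
class MultipleValue:
    """Hypothetical model: a descriptor (the bounds) plus a flat sequence of elements."""
    bounds: List[Tuple[int, int]]  # (lower, upper) bound of each dimension
    elements: List[Any]            # elements stored one after the other

    def index(self, subscripts: Sequence[int]) -> int:
        """Map an n-tuple of subscripts to an element index (row-major order assumed)."""
        if len(subscripts) != len(self.bounds):
            raise IndexError("wrong number of subscripts")
        idx = 0
        for s, (lower, upper) in zip(subscripts, self.bounds):
            if not lower <= s <= upper:
                raise IndexError(f"subscript {s} outside bounds {lower}:{upper}")
            # Each dimension contributes (s - lower), scaled by the sizes of later dimensions.
            idx = idx * (upper - lower + 1) + (s - lower)
        return idx

    def element(self, *subscripts: int) -> Any:
        """Select the element designated by the given subscripts."""
        return self.elements[self.index(subscripts)]


# A 2 x 3 matrix whose first dimension runs 1:2 and second dimension 0:2.
m = MultipleValue(bounds=[(1, 2), (0, 2)], elements=[10, 11, 12, 20, 21, 22])
```

With these bounds, `m.element(2, 1)` selects the element stored at index 4 of the flat sequence.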

1.2.2 Structured Values. If 'μ_1', 'μ_2', ..., 'μ_n' are modes and 's_1', 's_2', ..., 's_n' identifiers (n ≥ 1), then one can define the mode 'struct (μ_1 s_1, μ_2 s_2, ..., μ_n s_n)'. This new mode characterizes a set of structured values, which are ordered n-tuples of other values, the fields. The ith field has the mode 'μ_i' and is selected by the identifier 's_i', called a field selector. For example, the mode 'struct (real re, real im)' characterizes a set of complex numbers and 'struct ([ ] char name, bool sex, int age)' a set of personal records.

In each mode of structured values, the modes and selectors of all fields are explicitly enumerated. Hence, the number of fields is fixed by the mode. Moreover, not only the modes 'μ_i' but also the field selectors 's_i' are parts of the composed mode. For example, the two modes specified by 'struct (real real part, real imaginary part)' and 'struct (real modulus, real argument)' are different. This composition rule is in fact a Cartesian product of an indexed family of sets.

It is worth noting that 'struct (real re, real im)' may be written as 'struct (real re, im)'. Many such obvious abbreviations are defined in the so-called "extended language".
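The rule that field selectors belong to the mode can be illustrated with a minimal Python sketch. The names `StructMode` and `struct` are hypothetical, and modes are represented simply as strings; this models only mode equality, not the values themselves.

```python
from typing import NamedTuple, Tuple


class StructMode(NamedTuple):
    """Hypothetical model of a structured mode: ordered (field mode, selector) pairs."""
    fields: Tuple[Tuple[str, str], ...]


def struct(*fields: Tuple[str, str]) -> StructMode:
    """Compose a structured mode from (mode, selector) pairs, keeping their order."""
    return StructMode(tuple(fields))


# Same field modes, different selectors: two different modes.
cartesian = struct(("real", "real part"), ("real", "imaginary part"))
polar = struct(("real", "modulus"), ("real", "argument"))

# Same field modes and same selectors, in the same order: the same mode.
cartesian_again = struct(("real", "real part"), ("real", "imaginary part"))
```

Comparing `cartesian` with `polar` yields inequality although both are pairs of reals, which is the distinction drawn in the text.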

1.2.3 Routines. In ALGOL 68, a procedure, i.e. a routine, with or without parameters and delivering or not a result, is considered as a value and therefore has a mode. From the modes 'μ_1', 'μ_2', ..., 'μ_n', 'μ' (n ≥ 1), one may compose the following other modes: 'proc (μ_1, μ_2, ..., μ_n) μ', 'proc (μ_1, μ_2, ..., μ_n)', 'proc μ', and 'proc'. The values in the sets thus characterized are routines. A routine is essentially an internal representation of a phrase of the language, at the head of which may appear a list of formal parameters. The modes 'μ_i' (1 ≤ i ≤ n) specify in which sets the values used as actual parameters may be taken, and 'μ' specifies in which set the result will be. When they do not appear, there are no parameters and/or no result, as the case may be.

The reasons for regarding routines as values will appear progressively. For example, since a routine is a value, it may be used as a parameter of another routine or as a field in a structure.

1.2.4 Names. One way of explaining what ALGOL 68 means by name (another word such as "reference" or "location" would have been better) is to start from a known situation. In ALGOL 60, the declaration 'real x' declares an identifier 'x' which may designate values (real numbers). Which value is thus designated depends on the last assignment that has been made to 'x', e.g. 'x := 2.73'. From a hardware point of view, this means that the binary representation of '2.73' is stored in the location representing 'x'. The role of the declaration 'real x' is twofold: it reserves a suitable location for a real number and it represents the identifier 'x' by the address of that location.

In ALGOL 68, these two effects are separated. The address of a location (and therefore also the location itself, but not its contents) is a hardware representation of a so-called name, i.e. a value which refers to another value of a given mode. This other value is of course represented by the contents of the location. All this is distinct from the business of associating an identifier with a name or with any other kind of value for that matter (see Section 1.4.2). One might say that names and identifiers respectively are internal and external accesses to values.

If 'μ' is a mode, 'ref μ' is another mode characterizing the set of names referring to values of mode 'μ'. For example, any name of mode 'ref int' refers to a value of mode 'int', any name of mode 'ref ref bool' refers to a name of mode 'ref bool'. Since names are themselves values, they may be operated upon (see Section 2.1.4), they may be created (an operation which reserves a location in store), or they may be assigned to other names (in which case a pointer to a location is placed into another one). All these features permit the composition of programs of a list processing nature which require a highly dynamic memory allocation.
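To make the notion of names as values more tangible, here is a minimal Python sketch. The class `Name` and its attributes are hypothetical and modes are plain strings; the sketch only models a location that refers to another value, which may itself be a name.

```python
class Name:
    """Hypothetical model of a name: a location referring to a value of a given mode."""

    def __init__(self, mode, value=None):
        self.mode = "ref " + mode   # a name for values of mode m has mode 'ref m'
        self.refers_to = value      # the contents of the location

    def assign(self, value):
        """Store a value (possibly another name) in the location."""
        self.refers_to = value
        return self


i = Name("int", 3)        # mode 'ref int', refers to the integer 3
j = Name("int", 5)        # another name of mode 'ref int'
p = Name("ref int", i)    # mode 'ref ref int', refers to the name i

p.assign(j)               # a pointer to j's location is placed into p
j.assign(7)               # changes what j refers to; p still refers to j
```

After these steps `p.refers_to` is the name `j` itself, not a copy of the integer it referred to at the time of the assignment.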

1.2.5 United Modes. If 'μ_1', 'μ_2', ..., 'μ_n' (n > 1) are modes, then one may define from them a mode 'union (μ_1, μ_2, ..., μ_n)'. Such a mode does not have the same properties as nonunited modes: it characterizes the set of values which are either of mode 'μ_1' or of mode 'μ_2' ... or of mode 'μ_n'. This implies that there is no value of mode 'union (μ_1, μ_2, ..., μ_n)' and therefore that a united mode as such is a property of expressions which may yield values of any of the modes 'μ_1' to 'μ_n'.

An example adapted from LISP 1.5 might help to clarify this. Supposing that the modes 'atom' and 'list' are already defined, one may define from them the mode 'union (atom, list)'. Obviously, an atom is of mode 'atom', a list of mode 'list'. If an expression like 'car (x)' is computed, the result will be an atom or a list, but not both, depending on the value of 'x'. It is this kind of expression which, in ALGOL 68, is of a united mode. This feature permits a freer specification (and control) of the set of values that a given expression may yield.

United modes may in turn be used in any other mode construction, such as '[ ... ] union (μ_1, μ_2, ..., μ_n)', 'ref union ( ... )', etc. Such modes are nonunited and characterize sets of values. For example, 'ref union (list, atom)' characterizes a set of names referring to either lists or atoms.

Like the mathematical union of sets, united modes are commutative and associative in the sense that 'union (int, bool, char)', 'union (bool, int, char)' and 'union (char, union (int, bool))' specify one and the same mode, defined without associations and with some canonical ordering of its components.
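One simple way to picture such a canonical form is to flatten nested unions into an unordered set. The Python sketch below is an assumption-laden illustration: the function `union` is hypothetical, nonunited modes are represented as strings, and a `frozenset` stands in for the "canonical ordering" mentioned above.

```python
def union(*modes):
    """Hypothetical canonical form of a united mode: a flat, order-free set of modes."""
    members = set()
    for m in modes:
        if isinstance(m, frozenset):
            members |= m      # a nested union is absorbed (associativity)
        else:
            members.add(m)    # order of insertion is irrelevant (commutativity)
    return frozenset(members)


a = union("int", "bool", "char")
b = union("bool", "int", "char")
c = union("char", union("int", "bool"))
```

All three declarers from the text reduce to the same set here, so `a`, `b` and `c` compare equal.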

Modes used to compose a union cannot be related. This is best explained with a simple example. Let the identifiers 'i' and 'j' represent names referring to integers. As expressions, 'i' and 'j' are of mode 'ref int'. In the context 'i := j', however, the value yielded by 'j' is not a name but the integer referred to by that name (the value of the variable), so that 'j' is of mode 'int' in that context (see Section 2.1.4). The modes 'int' and 'ref int' are said to be related, which means that there exist automatic transformations from one into the other. Such modes cannot be united since this would cause ambiguities. Roughly speaking, such unions already exist implicitly since the elaboration of 'j' may yield a value of mode 'int' or 'ref int', depending on the context.

1.3 Mode Declarations

The pieces of text, like 'int' or 'ref [ ] bool', which the programmer writes to represent modes are called declarers and are used in mode declarations.

A mode declaration has the form 'mode m = μ', where 'μ' is a declarer and 'm' a symbol (the choice of which follows certain rules of no interest here). This establishes 'm' as a new declarer which specifies the same mode as 'μ'. Within the range (a range is similar to an ALGOL 60 block) of the declaration, 'm' may be written instead of 'μ'.


Thus viewed, a mode declaration is a way of establishing abbreviations for declarers. However, a mode declaration is necessary to obtain recursive modes. A mode 'm' declared by 'mode m = μ' is recursive if 'μ' explicitly contains 'm' (or any declarer which may replace another one containing 'm').

For example, we can declare 'mode tree = struct (ref tree lnode, rnode)', in which case any value of mode 'tree' consists of two references to other values of mode 'tree'. Such recursive structures are necessary for list or graph processing. Incidentally, they are one of the reasons for introducing the "refer to" concept independently of that of variable.

Certain recursive mode declarations are not allowed, because they do not make sense (e.g. 'mode b = ref b') or create ambiguities. It is difficult to describe the prohibited recursive mode declarations in a simple yet rigorous manner, but they fall outside the normal programming needs anyway.

Because of mode declarations, there may exist several declarers specifying the same mode. It is important to point out that it is always possible to mechanically check whether or not two declarers specify the same mode.

ALGOL 68 also features standard mode declarations. (In ALGOL 68, a standard concept is one which is defined in terms of more primitive ones, usually by a standard declaration. The programmer may use such concepts without having to declare them. He may also redeclare them.) For example, the declarations for 'compl' and 'string' are standard. By 'mode compl = struct (real re, real im)', 'compl' is made to characterize a set of complex numbers. There are also standard declarations for 'long compl', 'long long compl', etc. The set of all character strings is characterized by 'mode string = [1:0 flex] char', where '0 flex' means that a string is initially empty but may afterward have a varying number of elements.

1.4 Representation of Values

A value can be directly represented in a program by a denotation, an identifier, or an operator (the latter for routines only). An essential difference is that the denotation of a value is language-defined whereas identifiers and operators are declared in each program (or are standard).

1.4.1 Denotations. Denotation is the ALGOL 68 term for "constant", like '3.14' or '10'; i.e. it is a sequence of symbols which possesses (one could say "represents") a unique value, defined in the language and independent of any elaboration.

ALGOL 68 does not define denotations for all values of all modes. There are denotations for nonnegative numbers of any length ('10', 'long long 0', '0.0123', '12.3e-3'; note that the length is always explicit), values of mode 'bool' ('true' and 'false'), of modes 'char' ('"a"'), 'bits' of any length ('1 0 0 1', 'long 1 0 1 0 1 0'), 'string' ('"abcde"') and 'format'.

The only other values with denotations are the routines. For example, '((bool a, bool b) bool: if a then b else false fi)' is the denotation of a routine of mode 'proc (bool, bool) bool'. Roughly speaking, a routine denotation is a phrase with its mode (if any) preceded by a list of formal parameters with their modes (if any). A routine used in a program can be obtained only by a denotation written by the programmer. This means that an ALGOL 68 program cannot modify itself. All other values have no denotation and can be obtained only by the elaboration of an expression.

1.4.2 Identifiers and Identity Declarations. Identifiers, written in ALGOL 68 as in ALGOL 60, are used to represent any value of any mode, but the value possessed by an identifier is defined in the program by an identity declaration.

An identity declaration has the form 'μ s = E' where 'μ' is a declarer, 's' an identifier and 'E' an expression yielding a value of mode 'μ'. After elaboration, 's' possesses this value. As long as the execution proceeds in the range (block) in which 's' makes sense, 's' possesses the same value and any subsequent use of 's' will produce that value. The mode of 's', as an expression, is 'μ'.

An identity declaration is therefore not the usual declaration of a variable but something more general, since names and routines also are values.

As an example, 'real pi = 3.14159' makes 'pi' possess '3.14159' so that the expression '2 + pi' stands for '2 + 3.14159'. Moreover, the expression 'pi := 3.14' has no more sense than '3.14159 := 3.14': 'pi' behaves like a constant.

As another example, 'proc (bool, bool) bool and = ((bool a, bool b) bool: if a then b else false fi)' makes the identifier 'and' possess a routine which computes the Boolean conjunction. (Since this notation is redundant, it may be written in the extended language as 'proc and = (bool a, b) bool: if a then b else false fi'.)

As a third example, let us consider the case of an expression which creates a name. Let 'N' be such an expression, creating for example a name of mode 'ref real'. Then the identity declaration 'ref real x = N' makes 'x' possess a name which in turn may refer to any value of mode 'real'. The effect of such a declaration includes that of the declaration of a variable in ALGOL 60. (This is best understood if one realizes that creating a name means, in practice, reserving a memory location. See Section 2.1.4.)

We emphasize here that an identifier is an external object and, as such, makes sense only in the range in which it is declared. This is a static textual property, independent of any evaluation. On the other hand, a value is an internal object created by the elaboration of an expression. This internal object may be used (i.e. may continue to exist) until the end of the elaboration of its scope, which is a certain part of the program defined for each value. All values, except routines, formats, and local names or values composed from them, have the whole program as a scope. Local names are explained in Section 2.1.4. Routines or formats may contain identifiers, so that their scope is the range where these identifiers make sense. The point is that in general the number of existing values and the duration of their use are dynamic, i.e. depend on the elaboration.

1.4.3 Operators and Operator Declarations. A routine with one or two parameters may be the value possessed by another kind of external object: an operator. An operator is represented by a symbol like '+', '∧', 'minus', etc., and the fact that an operator possesses a routine is the ALGOL 68 recognition that operations are functions.

Many operators, including all the usual ones like '+', '-', etc., are standard. The programmer may, however, redeclare them or declare new ones by an operator declaration. For example, 'op (bool, bool) bool ∧ = ((bool a, bool b) bool: if a then b else false fi)' makes the operator '∧' possess the same routine as the identifier 'and' in the above example.

A difference with procedure identifiers is that the same operator may possess several different routines at the same time (it may be "overloaded"), provided that their modes are sufficiently different. This is to take care of the fact that the same operator is usually employed for several operations. Thus, '+' is used for integral, real, and complex addition.
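Overloading can be pictured as a table from an operator symbol and its operand modes to a routine. The Python sketch below is only an illustration under stated assumptions: `operators`, `declare_op` and `apply_op` are hypothetical names, Python types stand in for modes, and coercions (such as using an integer where a real is expected) are deliberately left out.

```python
# Hypothetical table: (operator symbol, operand modes) -> routine possessed by the operator.
operators = {}


def declare_op(symbol, operand_modes, routine):
    """Model of an operator declaration: the symbol gains one more routine."""
    operators[(symbol, tuple(operand_modes))] = routine


def apply_op(symbol, *operands):
    """Select the routine by the exact modes of the operands, then call it."""
    modes = tuple(type(x) for x in operands)
    try:
        routine = operators[(symbol, modes)]
    except KeyError:
        raise TypeError(f"operator {symbol!r} is not declared for modes {modes}")
    return routine(*operands)


# The same symbol '+' possesses three routines with different modes.
declare_op("+", (int, int), lambda a, b: a + b)
declare_op("+", (float, float), lambda a, b: a + b)
declare_op("+", (complex, complex), lambda a, b: a + b)

# The Boolean conjunction from the text: if a then b else false fi.
declare_op("and", (bool, bool), lambda a, b: b if a else False)
```

A call such as `apply_op("+", True, 1)` is rejected in this model because no routine is declared for that pair of modes, which mirrors the mode checking described in the Introduction.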

Another difference is that a binary operator is used in infix notation and has a priority, which may be either standard or declared by the programmer in a priority declaration (e.g. 'priority ∧ = 3'). By convention, all unary operators always have the highest priority and priority declarations are not allowed for them; also by convention, binary operators with the same priority are associated from left to right.
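The effect of priorities and of left-to-right association can be sketched with a small precedence-climbing evaluator in Python. This is a model, not the ALGOL 68 parsing method: the function `evaluate`, the token-list input and the priority values used in the example are assumptions of the sketch, and unary operators are omitted.

```python
def evaluate(tokens, priority, apply):
    """Evaluate [operand, op, operand, ...] honoring priorities, left to right on ties."""
    pos = 0

    def parse(min_priority):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        # Keep absorbing operators whose priority is at least min_priority.
        while pos < len(tokens) and priority[tokens[pos]] >= min_priority:
            op = tokens[pos]
            pos += 1
            # Requiring a strictly higher priority on the right gives left association.
            right = parse(priority[op] + 1)
            left = apply(op, left, right)
        return left

    return parse(0)


# Illustrative priority declarations and routines for two binary operators.
priority = {"-": 6, "*": 7}
routines = {"-": lambda a, b: a - b, "*": lambda a, b: a * b}


def apply(op, a, b):
    return routines[op](a, b)


left_assoc = evaluate([8, "-", 3, "-", 2], priority, apply)   # (8 - 3) - 2
mixed = evaluate([2, "-", 3, "*", 4], priority, apply)        # 2 - (3 * 4)
```

Changing the numbers in `priority` changes how the same sequence of symbols is grouped, which is what a priority declaration controls.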

2. Actions

An action is a process which constructs values, selects in them, uses them, establishes relationships between them, or which controls the order of elaboration of a composed action. As was the case for modes, ALGOL 68 features primitive actions and general rules which the programmer uses to compose richer actions and ultimately programs.

2.1 Primitive Actions on Values

Primitive actions on values are classified according to the kinds of value which they yield (constructing actions) or on which they operate (selecting or applying actions).

2.1.1 Multiple Values: Construction and Selection. A multiple value may be constructed by a particular expression called a row display. If 'E_1', ..., 'E_n' (n > 1) are expressions of one same mode 'μ', then the row display '(E_1, ..., E_n)' is of mode '[ ] μ' and yields a multiple value with 1 and n as lower and upper bounds. In the case where 'μ' is '[ ... ] μ_1', the resulting mode is '[ , ... ] μ_1'. That is, the row display adds one dimension.


Another way to construct a multiple value is through a coercion called rowing. A coercion is a language-defined transformation of the mode of an expression into another mode. It is indicated by context only and corresponds to natural programming needs. Most coercions specify an action transforming a value of the former mode into a value of the latter one. Here, the coercion called rowing transforms the mode 'μ' or '[ ... ] μ' of an expression 'E' into '[ ] μ' or '[ , ... ] μ', respectively. This coercion specifies also the transformation of the value yielded by 'E' into a multiple value with one more dimension, where the new lower and upper bounds are both 1. Since rowing, like all coercions, is specified by context only, it may be performed repeatedly on the same expression, thus adding as many dimensions as needed to the corresponding value.
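As a rough picture of what rowing does to a value, the Python sketch below models multiple values as nested lists. The functions `row` and `dimensions` are hypothetical, bounds are left implicit (a one-element list stands for bounds 1:1), and in ALGOL 68 the coercion is triggered by context rather than called explicitly.

```python
def dimensions(value):
    """Count the dimensions of a value modeled as nested lists (0 for a plain value)."""
    n = 0
    while isinstance(value, list):
        n += 1
        if not value:
            break
        value = value[0]
    return n


def row(value, times=1):
    """Model of rowing: wrap the value in a one-element multiple value, `times` times."""
    for _ in range(times):
        value = [value]   # the new dimension has lower and upper bounds both 1
    return value


scalar_rowed = row(3)              # a '[ ] int' value with the single element 3
vector_rowed = row([1, 2, 3])      # a one-dimensional value gains a dimension
rowed_twice = row(3, times=2)      # rowing performed repeatedly
```

Each application adds exactly one dimension, so `dimensions(rowed_twice)` is 2.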

The more usual, elementwise, construction of a multiple value is explained in Section 2.1.4.

It is also possible to select parts of a multiple value to obtain one or some of its elements. If 'M' is an expression of mode '[...] μ' yielding an n-dimensional value with l_1, u_1, l_2, u_2, ..., l_n, u_n as lower and upper bounds, then the expression 'M[..., a_i : b_i, ...]' is a slice which yields another n-dimensional value of the same mode '[...] μ', with lower and upper bounds l_1, u_1, l_2, u_2, ..., l_(i-1), u_(i-1), 1, b_i - a_i + 1, l_(i+1), u_(i+1), ..., l_n, u_n (provided that l_i ≤ a_i ≤ b_i ≤ u_i). The elements of the obtained value are those of 'M' whose subscripts are between these bounds. With the same original conditions, the slice 'M[..., a_i : b_i at h_i, ...]' yields a multiple value with the same dimensionality and elements as 'M[..., a_i : b_i, ...]' but with bounds l_1, u_1, ..., l_(i-1), u_(i-1), h_i, h_i + (b_i - a_i), l_(i+1), u_(i+1), ..., l_n, u_n: the subscripts of the ith dimension are renumbered from h_i. These two kinds of slicing may be done on more than one dimension at the same time, always yielding an n-dimensional value with elements taken in 'M'.

Slicing may also be used to decrease the dimensionality: 'M[..., k_i, ...]' yields an (n-1)-dimensional value with, as lower and upper bounds, l_1, u_1, l_2, u_2, ..., l_(i-1), u_(i-1), l_(i+1), u_(i+1), ..., l_n, u_n, and, as elements, those of 'M[..., k_i : k_i, ...]'. This kind of slicing may be applied on any number of dimensions. If it is done on all dimensions by 'M[k_1, k_2, ..., k_n]', then the value yielded is the element in 'M' with subscripts k_1, k_2, ..., k_n and therefore is not a multiple value.

Finally the three kinds of slicing, 'a_i : b_i', 'a_i : b_i at h_i' and 'k_i', may be mixed.
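The bound arithmetic of the three slicing forms can be illustrated with a small Python sketch. It is not ALGOL 68 and only models the bounds, not the elements; the function name `slice_bounds` and the tuple encoding of the indexers are assumptions introduced for this illustration.

```python
def slice_bounds(bounds, indexers):
    """Return the (lower, upper) bound pairs of a slice, one per surviving dimension.

    bounds   -- list of (l_i, u_i) pairs of the original multiple value
    indexers -- one entry per dimension, in a hypothetical encoding:
                ("trim", a, b)     for 'a : b'       (renumbered from 1)
                ("trim", a, b, h)  for 'a : b at h'  (renumbered from h)
                ("sub", k)         for a single subscript 'k'
    """
    if len(bounds) != len(indexers):
        raise ValueError("one indexer is needed per dimension")
    result = []
    for (low, up), indexer in zip(bounds, indexers):
        if indexer[0] == "sub":
            k = indexer[1]
            if not low <= k <= up:
                raise IndexError("subscript out of bounds")
            continue  # a single subscript removes this dimension
        a, b = indexer[1], indexer[2]
        if not low <= a <= b <= up:
            raise IndexError("trimmer out of bounds")
        h = indexer[3] if len(indexer) == 4 else 1
        result.append((h, h + (b - a)))
    return result


# A 3-dimensional value with bounds 1:10, 0:5 and 1:3, sliced in a mixed way.
mixed = slice_bounds([(1, 10), (0, 5), (1, 3)],
                     [("trim", 3, 7, 0), ("sub", 2), ("trim", 1, 3)])
```

When every dimension receives a single subscript, the function returns an empty list, which corresponds to the case where the result is an element and no longer a multiple value.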

2.1.2 Structured Values: Construction and Selection. If 'E1', 'E2', ..., 'En' (n > 1) are expressions of mode 'μ1', 'μ2', ..., 'μn' respectively, then the structure display '(E1, E2, ..., En)' yields a structured value of a mode 'struct (μ1 s1, μ2 s2, ..., μn sn)' fixed by the context. For example, in 'struct (int a, real b) c = (n+10, 3.14)', the selectors for the fields of '(n+10, 3.14)' will be 'a' and 'b'. (If 'μ1' = 'μ2' = ... = 'μn', the context determines whether the expression is a row display or a structure display.)

The action which obtains a specific field from a structured value is called a field selection. If 'S' is an expression of mode 'struct (μ1 s1, μ2 s2, ..., μn sn)' (n ≥ 1) then the field selection 'si of S' is an expression of mode 'μi' which yields the ith field of the value yielded by 'S'.

Field selectors are external objects which, unlike subscripts, do not yield values. They cannot be obtained as results of expressions; this allows efficient implementation of field selection.

2.1.3 Construction and Application of Routines. Constructing a routine is done by the programmer by writing a suitable phrase (body), preceded by a list of formal parameters, if any. When the phrase is an expression, i.e. when the routine is meant to deliver a value, the mode of this value is also indicated. For example, a routine computing the Boolean disjunction would be written: '(bool a, bool b) bool : if a then true else b fi'. This is the denotation of a routine of mode 'proc (bool, bool) bool'. In fact, a routine denotation is recognized as such only by the presence of formal parameters. When they are not there, only the context indicates that a routine must be obtained, as in the identity declaration 'proc real p = 3.14 + x'. In such a case, ALGOL 68 speaks of a coercion (proceduring) instead of a routine denotation, but this is not important in practice. It is however important to repeat that routines cannot be constructed or modified by the program. The elaboration can do many things with routines (assign them, yield them as values, call them, etc.) but it cannot construct them.

Inside a routine, there may appear external objects (identifiers, operators, mode indications) declared outside. Each of these external objects is valid only in the range of its declaration. By definition the routine itself has a significance only in the smallest of these ranges, which constitutes the scope of the routine. As a consequence it is not possible to use certain kinds of routines having another routine as result: if the latter contains an identifier which is a parameter of the former, then the scope of the result is such that the result disappears as soon as it is obtained.

When a routine is called, its parameters, if any, are elaborated before the body of the routine is elaborated. This is more or less the usual mechanism, but in ALGOL 68 it is worth examining a little more closely.

A call is a phrase of the form 'P(E1, E2, ..., En)' (n ≥ 1), where 'P' has the mode 'proc (μ1, μ2, ..., μn) μ' or 'proc (μ1, μ2, ..., μn)' and where each actual parameter 'Ei' has mode 'μi'. The mode, if any, of the call is 'μ'.

The value yielded by 'P' is a routine with a denotation such as '(μ1 s1, μ2 s2, ..., μn sn) : E' (if 'E' is without mode). The call specifies an action which first transforms this routine into the phrase '(μ1 s1 = E1, μ2 s2 = E2, ..., μn sn = En ; E)'. This phrase is then elaborated as it is, which implies that the formal parameters (identifiers) 's1', ..., 'sn' are made to possess the values yielded by the corresponding actual parameters 'E1', ..., 'En' before 'E' is itself elaborated. In other words, the correspondence between formal and actual parameters is established as a set of identity declarations which are effective during the elaboration of 'E'. Since names and routines are values, this clean mechanism is very powerful.

Again, the ALGOL 68 terminology is different for a routine without parameters: it is deprocedured (a coercion). This is because the transformation of the routine (a value) into a phrase to be elaborated is indicated by context (and not by the presence of actual parameters) as for the third occurrence of 'p' in 'proc real p = x + 3.14 ; proc real q = p ; real y = p'.

As an example of the parameter mechanism, let the identity declaration 'proc (proc) f = (proc a) : E' be followed by the call 'f(x + y)'. The latter is elaborated as '(proc a = x + y ; E)' and therefore the value possessed by 'a' is a routine. Whenever the identifier 'a' is encountered during the elaboration of 'E', it yields this routine which may then be transformed (deprocedured) into the expression 'x + y' and elaborated as such. The point is that 'x + y' is not computed when the actual parameter is passed, but only when the parameter 'a' is used in 'E'. One may compare this delayed computation to the call by name in ALGOL 60.
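The delayed computation just described can be mimicked in Python by passing a parameterless function. This is only an analogy under stated assumptions: Python has no proceduring coercion, so the routine `x_plus_y` is written out explicitly, and the names `f`, `x_plus_y` and `calls` are hypothetical.

```python
calls = []  # records the order in which things happen


def f(a):
    """'a' plays the role of a parameter of mode 'proc': a routine, not a number."""
    calls.append("body of f starts")
    first = a()    # the routine is elaborated here (deproceduring)
    second = a()   # and again, each time the parameter is used
    return first + second


x, y = 2, 3


def x_plus_y():
    """Stands for the actual parameter 'x + y' turned into a routine."""
    calls.append("x + y computed")
    return x + y


result = f(x_plus_y)  # passing the routine computes nothing by itself
```

The log in `calls` shows that the sum is computed inside the body of `f`, once per use of the parameter, and not at the moment the parameter is passed.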

As another example, once the identity declaration '[1 : 3] proc q = (go to l1, go to l2, go to l3)' has been elaborated, 'q' possesses the value of mode '[ ] proc' yielded by the row display '(go to l1, go to l2, go to l3)'. Then the slice 'q[2]' is an expression of mode 'proc' yielding the routine 'go to l2'. If called (deprocedured), this routine yields a jump, immediately elaborated as such. This is similar to the switch of ALGOL 60.

It has been said that operators correspond to procedure identifiers with particular conventions. In the same way, formulas correspond to certain kinds of calls. A (binary) formula is an expression of the form 'E1 ⊙ E2' where 'E1' and 'E2' are expressions of mode 'μ1' and 'μ2' and '⊙' is an operator possessing a routine 'R' of mode 'proc (μ1, μ2) μ' or 'proc (μ1, μ2)'. The mode of such a formula is 'μ', if any. It is elaborated exactly as a call 'p(E1, E2)' where 'p' is supposed to be an identifier possessing the routine 'R'. For a formula, however, the routine which is called is determined not only by '⊙' but also by 'μ1' and 'μ2', since an operator may possess at the same time several meanings, represented by routines. Unary formulas '⊙ E' are similarly handled.
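The idea that one operator may possess several routines, selected by the modes of its operands, can be sketched in Python with a lookup table. This is a simplification: the lookup below happens at run time on Python types, and the names `OPERATORS` and `formula` as well as the three sample meanings of '+' are assumptions made for the illustration.

```python
# Hypothetical table: (operator, mode of left operand, mode of right operand) -> routine
OPERATORS = {
    ("+", int, int): lambda a, b: a + b,
    ("+", str, str): lambda a, b: a + b,
    # element-wise sum of two coefficient tuples, e.g. for polynomials of equal length
    ("+", tuple, tuple): lambda a, b: tuple(p + q for p, q in zip(a, b)),
}


def formula(op, left, right):
    """Elaborate 'left op right' as a call of the routine chosen by op and both modes."""
    routine = OPERATORS.get((op, type(left), type(right)))
    if routine is None:
        raise TypeError("operator %r has no meaning for these operand modes" % op)
    return routine(left, right)


total = formula("+", 1, 2)
poly = formula("+", (1, 0, 2), (0, 3, 1))
```

A combination with no entry in the table is rejected, which loosely mirrors the mode checking that ensures an operation is applied only to data for which it is defined.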

2.1.4 Creation and Use of Names. A name of mode 'ref μ' is represented (in the computer memory) by a location large enough for containing any value in the set characterized by 'μ'. An expression of mode 'ref μ' may either yield an already existing name (i.e. essentially the address of an already established location) or create a new name (i.e. establish a new location).

Like any other value, a name has a scope, which is that part of the program in which the value may be used. There are local names whose scope is the range in which they are created, and global names whose scope is the whole program. The concept of scope is particularly important for names since their creation involves reserving memory which, in the case of local names, may be released when the elaboration of the scope is finished.

A name of mode 'ref μ' is created by the elaboration of a generator, which is an expression of the form 'loc μ' for a local name and 'heap μ' or simply 'μ' for a global one. If 'μ' is '[...] μ1', information for the descriptor must be provided between the brackets. For the implementer, and also the user, 'loc' and 'heap' suggest two different storage allocation mechanisms: a stack and a "heap" respectively.

An important use of generators is within an identity declaration of the form 'ref μ s = loc μ'. The generator 'loc μ' creates a name of mode 'ref μ', i.e. reserves a memory location, and the identifier 's' is made to possess that name, i.e. becomes an external designation of the location. The scope of the name is precisely the range in which the identifier 's' makes sense. The similarity with the ALGOL 60 declaration of variables is indicated by the fact that the programmer may abbreviate this identity declaration into 'μ s'. For example 'ref real x = loc real' may be written as 'real x', which practically has in ALGOL 68 the sense the ALGOL 60 programmer believes it has. Similarly 'ref real x = heap real' may be written as 'heap real x'.

When a name referring to a multiple or structured value is created, names referring to all elements and subvalues of the multiple value or to all fields of the structured value are implicitly created at the same time. They will be called subnames. For example, if 'A' is an identifier possessing the name of a multiple or structured value then 'A[3]' or 'a of A' is such a subname.

At the creation of a name referring to a multiple value, the generator must also indicate the number of dimensions by an appropriate number of commas and the lower and upper bounds of the value referred to. The latter are yielded by expressions of mode 'int', which may or may not be followed by the symbol 'flex' as in 'loc [1 flex : 3, j flex : 6 flex] real'. If 'flex' is there, the corresponding bound may vary each time the name is made to refer to another value. Otherwise, the corresponding bound is fixed when the name is created.

The action which makes a name refer to another value is the assignation. If 'N' is an expression of mode 'ref μ' and 'E' an expression of mode 'μ', the assignation 'N := E' makes the name yielded by 'N' refer to the value yielded by 'E'. The expression 'N := E' itself yields a value which is the name yielded by 'N' and therefore has the same mode as 'N'. For example the identity declaration 'ref real x = loc real := 2.7' makes 'x' possess the name yielded by the assignation 'loc real := 2.7'. This may be written as 'real x := 2.7' and is an initialized declaration. Assignation is essentially the ALGOL 60 assignment but generalized in the sense that 'N' and 'E' may be any expressions yielding a name and a value.

Clearly, assigning values to subnames, e.g. 'A[3] := 2.1', is the way of constructing a multiple or structured value element by element.

The action which obtains the value referred to by a name corresponds to taking the contents of a location. In ALGOL 68, it is a coercion (dereferencing), which transforms a mode 'ref μ' into 'μ'. The corresponding action on a name of mode 'ref μ' yields the value referred to by this name. This value is of mode 'μ' or, if 'μ' = 'union (μ1, μ2, ..., μn)', of one of the modes 'μ1', 'μ2', ..., 'μn'. For example, after the declarations 'real x := 2.7' and 'real y := 0.0', the name yielded by 'x' in 'y := x' is dereferenced to yield '2.7'.

Since names are values, it is important to be able to check if two expressions yield the same name. This is done by identity relations. If 'N1' and 'N2' are expressions of mode 'ref μ', then the identity relation 'N1 :=: N2' yields the Boolean value 'true' if 'N1' and 'N2' yield the same name; otherwise it yields 'false'. The converse relationship is denoted by ':≠:'.
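Names, assignation, dereferencing and the identity relation can be modeled in a few lines of Python. The sketch is an analogy, not ALGOL 68: dereferencing is written as an explicit call because Python has no coercions, and the class `Name` and the helpers `assign` and `deref` are hypothetical.

```python
class Name:
    """A name: a location that refers to a value (hypothetical model)."""

    def __init__(self, value=None):
        self.value = value


def assign(name, value):
    """Assignation 'N := E': make the name refer to the value and yield the name itself."""
    name.value = value
    return name


def deref(name):
    """Dereferencing: obtain the value referred to by the name."""
    return name.value


x = assign(Name(), 2.7)    # real x := 2.7, i.e. ref real x = loc real := 2.7
y = assign(Name(), 0.0)    # real y := 0.0
assign(y, deref(x))        # y := x, where 'x' is dereferenced to yield 2.7

xx = x                     # a second identifier made to possess the same name as 'x'
same_name = xx is x        # identity relation: the same location
other_name = x is y        # equal contents, but two distinct names
```

After these steps `x` and `y` refer to equal values while remaining different names, which is the distinction the identity relation tests.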

2.1.5 Expressions of a United Mode and Conformity Relations. There are no values of united modes, but there is a coercion, uniting, which transforms the mode 'μ' of an expression 'E' into a mode 'union (μ1, μ2, ..., μ, ..., μn)'. For example, this coercion is applied on 'true' in 'union (real, bool) a := true'. The mode of 'true' must be 'union (real, bool)' for the assignation to be syntactically correct.

There is also a relation, the conformity relation, which may be used to actually check the mode of a value yielded by an expression (usually of united mode). If 'N' is an expression of mode 'ref μi' or 'ref union (μ1, ..., μi, ..., μn)' and 'E' is an expression yielding a value which is of mode 'μi' or which is a name referring to a value with the same conditions, then the expression 'N :: E' yields the value 'true', and 'false' otherwise. The expression 'N ::= E' does the same but in addition, if the Boolean value obtained is 'true', it elaborates 'N' and assigns the value yielded by 'E' to the name yielded by 'N'.
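A rough Python model of the two conformity relations follows. It assumes that modes can be represented by Python types and that a name records the modes it accepts; it does not model the case where 'E' yields a name. The class `UnitedName` and both function names are introduced here for illustration only.

```python
class UnitedName:
    """A name whose mode is 'ref union (...)' or 'ref' of a single mode (hypothetical model)."""

    def __init__(self, *modes):
        self.modes = modes    # Python types standing for the component modes
        self.value = None


def conforms(name, value):
    """'N :: E': true if the mode of the value is one of the modes accepted by the name."""
    return type(value) in name.modes


def conforms_and_assigns(name, value):
    """'N ::= E': as above, and the value is assigned only when the test succeeds."""
    ok = conforms(name, value)
    if ok:
        name.value = value
    return ok


a = UnitedName(float, bool)                  # union (real, bool) a
first = conforms_and_assigns(a, True)        # succeeds and assigns
second = conforms_and_assigns(a, "text")     # fails and leaves 'a' unchanged
```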

2.1.6 Other Implicit Mode Transformations. The coercion called widening transforms the mode 'int' or 'long int', 'long long int', etc., of an expression 'E' into 'real' or 'long real', 'long long real', etc. It also specifies an action which transforms an integral value into the equivalent real value. The same coercion also transforms 'real' or 'long real', 'long long real', etc., into 'compl' or 'long compl', 'long long compl', etc.

The coercion of voiding transforms the mode 'μ' of an expression 'E' into no mode at all. The corresponding action is to neglect the value yielded by 'E'. The converse transformation is done by the coercion of hipping which gives a mode 'μ' to an expression 'E' without mode. This expression may be just the dummy 'skip': its value is then any value of mode 'μ'. It may also be a jump like 'go to l'; it has then no value (the jump is executed) except when the mode 'μ' is 'proc μ1' or 'proc', in which case a routine 'go to l' is obtained. It may finally be the expression 'nil' whose value is always the only name referring to nothing; the purpose of hipping is then only to specify a mode.

These two coercions, voiding and hipping, have helped unify the usual concepts of expressions and statements. For example, jumps can now be present as escape clauses in conditional expressions, and assignations can be treated as expressions.

All the mode transformations done by coercion may also be specified by a cast 'μ : E' where 'μ' is a declarer and 'E' an expression. Such a cast is of mode 'μ' and is used whenever the context is unable to specify a coercion. If 'E' yields a value, the cast specifies the transformation of this value into the equivalent value of mode 'μ'. Of course this transformation must be possible.

For example, let the identity declarations 'ref real xx', 'real x', 'real y := 2.7' be followed by 'x := y', 'xx := x' and '(ref real : xx) := y'. In 'x := y', 'y' is dereferenced and in 'xx := x', no coercion takes place. For obvious reasons, an implicit dereferencing on the left member of an assignation is impossible. In '(ref real : xx) := y', a cast is used to specify that the left member is to be of mode 'ref real'. The corresponding action yields the value referred to by 'xx'.

2.2 Composition of Actions

The elaboration of a program is a composition of actions which take place in some order. Actions, whether composed or elementary, may be performed serially, i.e. in an order completely specified by the program, or collaterally, i.e. in an order left unspecified by the program. There are also means to choose between actions.

2.2.1 Serial Composition. Two actions A1 and A2 are said to be executed serially if the execution of A2 immediately follows the complete execution of A1. For example, in a field selection 's of S', the expression 'S', yielding a structured value, is completely elaborated before the selection is executed. ALGOL 68, as any other language for that matter, thus defines classes of actions which are to be executed serially and also gives means for explicitly composing serial execution.

If 'E1' and 'E2' are two phrases, the serial execution of the actions which they specify is expressed by 'E1 ; E2'. The latter phrase thus specifies a serial action, as in ALGOL 60. In ALGOL 68, however, the resulting phrase may have a mode, which is then the mode of 'E2', and may yield a value, the value yielded by 'E2'. In general, a serial action may be composed of more than two actions. For example, 'real x := 2.73 ; real y ; x := x + 0.01 ; y := x' is an expression of mode 'ref real' which specifies the serial elaboration of four constituent phrases. Two of these are declarations and the value yielded is the name possessed by 'y'.

Serial composition may also be obtained by the use of the jump, 'go to loop' (or simply 'loop'). In this, 'loop' is an identifier labeling another phrase: 'loop : E'. This is classical, but in ALGOL 68 declarations and statements may be intermixed. In order to avoid having a declaration elaborated more than once within the same activation of a range, no label may appear in front of a declaration or in front of a phrase which is followed by a declaration.


Another kind of serial composition which breaks the order of execution is obtained by a completer, which has the form '. label :' as in 'E1 . label : E2'. The action specified by '. label :' ends the elaboration of the whole range in which it appears, and the value yielded is the value of 'E1'; a completer thus amounts to a jump to the end of the range. The label 'label' in front of 'E2' is required, for otherwise 'E2' would in no circumstances be elaborated. Again, no declaration may follow '. label :'.

2.2.2 Collateral Composition. Two actions A1 and A2 are executed collaterally if their execution is merged in time in an unspecified way (i.e. unspecified by the definition of ALGOL 68), provided that the orders of execution proper to A1 and A2 are preserved. One may visualize this as a concurrent unsynchronized execution of A1 and A2. Since any kind of merging is possible, a serial execution of A1 and A2 may very well be chosen by the implementer. For example, in an assignation 'N := E', the expressions 'N' and 'E' are, in ALGOL 68, elaborated collaterally.

Again, besides collateral executions imposed by the definition of the language, the programmer has means to compose collateral actions. The collateral elaboration of two phrases 'E1' and 'E2' is expressed by 'E1, E2'. In such a case, if the elaboration of 'E1' influences that of 'E2' then the global effect of their collateral elaboration is left undefined. For example, after the elaboration of 'x := 0.1, x := 0.2', the value assigned to 'x' is either '0.1' or '0.2' (or anything else if the hardware can do both things together and mix the two bit patterns). In general, of course, these things which are undefined in the language are very well defined for a particular implementation.

More generally, a collateral action may be composed of more than two actions, as in 'E1, E2, ..., En'. It is such an expression which is used to compose a row display or a structure display, in which case it yields a multiple or structured value. Since declarations are phrases, they may be composed collaterally as well as serially (see for instance, in Section 2.2.3, the elaboration of calls).

As elementary means to control or synchronize the execution of collateral actions, ALGOL 68 has the semaphores. In the language, a semaphore is a value of mode 'sema' and resembles an integral variable of mode 'ref int'; the distinction is made for avoiding implementation problems and programming insecurity. Two important operators are defined on semaphores: 'up' and 'down'. The expressions 'up S' and 'down S', where 'S' is an expression of mode 'sema', may appear only in an expression 'Ei' inside an expression without mode and of the form 'par (E1, E2, ..., En)', preceded by the special symbol 'par'. Let the name possessed by 'S' refer to an integer k; 'down S' decreases k by one if k is greater than zero or halts the elaboration of 'Ei' otherwise; 'up S' increases k by one and, if k is now positive, the elaboration of any expression 'Ej' which has been halted by a 'down S1' ('S1' giving the same semaphore as 'S') is resumed. Semaphores are therefore tools for synchronizing collateral elaborations. A simple example is 'sema i := /1 ; par ((down i ; read data ; up i), (down i ; use data ; up i))' which yields the same elaboration as either 'use data ; read data' or 'read data ; use data' (the operator '/' transforms an integer into a semaphore value).
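The 'read data' and 'use data' example can be approximated in Python with `threading.Semaphore`, whose `acquire` and `release` play the roles of 'down' and 'up'. This is a sketch by analogy: Python threads stand in for the constituents of the 'par' clause, and the function `run_par` and the list `log` are assumptions of the illustration.

```python
import threading


def run_par():
    """Run two branches collaterally, each guarded by the same semaphore."""
    log = []
    i = threading.Semaphore(1)       # sema i := /1

    def read_data():
        i.acquire()                  # down i
        log.append("read data")
        i.release()                  # up i

    def use_data():
        i.acquire()                  # down i
        log.append("use data")
        i.release()                  # up i

    branches = [threading.Thread(target=read_data),
                threading.Thread(target=use_data)]
    for branch in branches:
        branch.start()
    for branch in branches:
        branch.join()
    return log


order = run_par()
```

Either order of the two entries in `order` is possible, but the two guarded sections never overlap, which corresponds to the two serial elaborations mentioned in the text.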

2.2.3 Composition of Basic Actions on Values. The basic actions on values may themselves be composed, but the order of execution of their constituent parts is language-defined. For example, in 'M[I]' or 's of S', the expressions 'M', 'I', and 'S' behave like arguments of functions, and an order of elaboration must be defined. In general, whenever no logical or practical reason requires a strict ordering, the execution is collateral; this minimizes sources of side effects.

In a slice 'M[I1 : I2 at I3, ..., In]' or a generator '[I1 : I2, ..., In-1 : In] μ', all expressions 'M' and 'Ii' are elaborated collaterally.

In an assignation 'N := E', 'N' and 'E' are elaborated collaterally and the assignation of the value to the name takes place thereafter. In identity relations, 'N1 :=: N2' or 'N1 :≠: N2', both members are also elaborated collaterally, but in a conformity relation 'N ::= E', the elaboration is serial: 'E' is first elaborated and only if its mode conforms to that of 'N' is the latter elaborated. Note also that a row or structure display '(E1, E2, ..., En)' implies a collateral elaboration of the 'Ei'.

The order in which a call like 'P(E1, ..., En)' or a formula like 'E1 ⊙ E2' is elaborated was implicit in Section 2.1.3. For a call, first 'P' is elaborated to yield a routine of the form '(μ1 s1, μ2 s2, ..., μn sn) μ : E'. The next action transforms the routine into '(μ1 s1 = E1, μ2 s2 = E2, ..., μn sn = En ; μ : E)'. The third action, still in series, is the elaboration of this expression. This consists in the collateral (in this case) elaboration of n identity declarations followed by the elaboration of 'E'. If the routine had been of the kind '(μ1 s1 ; μ2 s2 ; ... ; μn sn) μ : E', then the identity declarations would have been elaborated serially. Mixed cases are possible.

A formula is elaborated exactly as a call and the following remarks also apply to calls. In a formula 'E', e.g. 'a + b ↑ c × d', each operator has a priority, so that in fact 'E' is 'E1 + E2', where 'E1' is 'a' and 'E2' is 'b ↑ c × d'. According to the preceding rule, the elaboration of 'E' is that of e.g. '(real x = E1, real y = E2 ; real : E+)' where 'E+' performs the addition. The point is that 'E1' and 'E2' are elaborated collaterally. The elaboration of 'E2', i.e. 'E3 × E4', is that of e.g. '(real x = E3, real y = E4 ; real : E×)'. In turn, 'E3' and 'E4' are elaborated collaterally. This means that, in 'a + b ↑ c × d', 'a', 'b', 'c', and 'd' are all elaborated collaterally before operators are used. This is general for standard operators: all operands are elaborated collaterally. The programmer must be aware that if an operand is a routine producing "side effects" on other operands, the action specified and the result yielded by the formula are undefined.
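Why side effects make such a formula undefined can be shown with a small Python experiment that tries every serial order of operand elaboration. It is a simplified model: true collateral elaboration also permits interleavings, and the helper `elaborate_collaterally`, the shared `state` dictionary and the two sample operands are hypothetical.

```python
from itertools import permutations


def elaborate_collaterally(operands, combine):
    """Return the set of results obtained by elaborating the operands in every serial order."""
    results = set()
    for order in permutations(range(len(operands))):
        state = {"x": 1}                      # fresh shared variable for each trial
        values = [None] * len(operands)
        for index in order:
            values[index] = operands[index](state)
        results.add(combine(*values))
    return results


def read_x(state):
    """An operand without side effects."""
    return state["x"]


def bump_x(state):
    """An operand with a side effect on 'x', which another operand reads."""
    state["x"] += 10
    return state["x"]


pure = elaborate_collaterally([read_x, read_x], lambda a, b: a + b)
impure = elaborate_collaterally([read_x, bump_x], lambda a, b: a + b)
```

With side-effect-free operands every order gives the same sum, so `pure` holds a single value; with `bump_x` as an operand the result depends on the order chosen, so `impure` holds two different values.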

The programmer is also free to declare routines for which the actual parameters are elaborated serially. This may be useful when one of them depends on the preceding ones as in 'proc p = ((int n ; [1 : n] real x) : for i to n do x[i] := i)'.

2.2.4 Selections of Actions. Another fundamental way to compose actions is to choose between them according to certain conditions. This is achieved by conditional and case actions. The first type selects one out of two actions on the basis of a Boolean value, as in ALGOL 60, while the second type selects one out of several actions on the basis of an integral number or on the result of multiple conformity relations.

A conditional action is expressed as 'if B then E1 else E2 fi' where 'B' is a Boolean expression and 'E1', 'E2' are phrases possibly with modes 'μ1', 'μ2'. The mode, if any, is the mode common to 'μ1' and 'μ2', after suitable coercions. The elaboration consists first of that of 'B': if this yields the value 'true' (resp. 'false'), then 'E1' (resp. 'E2') is elaborated. The value, if any, yielded by the conditional action is that yielded by the selected expression.

Two other forms of conditional actions are available. The first one is expressed by 'if B then E fi' and this equals 'if B then E else skip fi'. The other one generalizes the conditional expression of LISP and takes, for example, the form 'if B1 thef B2 thef B3 then E1 else E2 fi' or 'if B1 then E1 elsf B2 then E2 else E3 fi'. More concise notations are allowed because 'if' and 'fi' may be written as '(' and ')', 'then' and 'else' both as '|', and 'thef' and 'elsf' both as '|:', e.g. '(B | E1 | E2)' instead of 'if B then E1 else E2 fi'.

A case action is expressed by 'case K in E1, ..., En out E esac' or '(K | E1, ..., En | E)' where 'K' is of mode 'int' (n ≥ 2). The expression 'K' is elaborated; if it yields an integer k, 1 ≤ k ≤ n, then 'Ek' is elaborated, otherwise 'E' is elaborated. The mode of the case expression is the common mode of 'E1', ..., 'En', 'E'. The expression 'K' may also be a multiple conformity relation 'F1, ..., Fn ::= F' or 'F1, ..., Fn :: F'; this is elaborated by collaterally elaborating 'Fi ::= F' or 'Fi :: F' (1 ≤ i ≤ n); if the kth comparison yields 'true', 'Ek' is selected and elaborated; if all yield 'false', 'E' is elaborated. (If more than one yields 'true', one of the corresponding 'Ek' is elaborated, but the definition of the language does not specify which one.)
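The integral form of the case action can be sketched in Python as follows. The alternatives are passed as parameterless functions so that only the selected one is elaborated; the function name `case` and the `trace` list are assumptions of this illustration, and the conformity form is not modeled.

```python
def case(k, alternatives, out):
    """Elaborate the k-th alternative (counting from 1), or 'out' if k lies outside 1..n."""
    if 1 <= k <= len(alternatives):
        return alternatives[k - 1]()
    return out()


trace = []  # records which phrases were actually elaborated


def phrase(label):
    """Build a parameterless routine that records its elaboration and yields its label."""
    def routine():
        trace.append(label)
        return label
    return routine


chosen = case(2, [phrase("E1"), phrase("E2"), phrase("E3")], phrase("E"))
fallback = case(7, [phrase("E1"), phrase("E2"), phrase("E3")], phrase("E"))
```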

2.2.5 Repetitive Actions. ALGOL 68 also has repetitive actions, expressed by 'for s from I1 by I2 to I3 while B do E', where 's' is an identifier, 'I1', 'I2', and 'I3' are integral expressions, 'B' is a Boolean expression, and 'E' a phrase without mode. The whole phrase is a so-called "for statement". Its elaboration is that of

    (int var := I1, int step = I2, int max = I3 ;
     loop : if (step > 0 | var ≤ max | (step < 0 | var ≥ max | true))
            then int s = var ;
                 if B then E ; var := var + step ; go to loop fi
            fi)

Note that 'I1', 'I2', and 'I3' are elaborated only once and do not use the identifier 's'.

Various default conventions permit straightforward optimizations by the implementation. If 's' occurs neither in 'B' nor in 'E', 'for s' may be deleted. If 'I1' is '1', 'from I1' may be omitted. If 'I2' is '1', 'by I2' may be omitted. If 'B' is 'true', 'while B' may be omitted. Finally 'to I3' need not appear, and then the tests on 'step', 'var', and 'max' are deleted.
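The expansion of the "for statement" and its defaults can be followed step by step in a Python sketch. It assumes that the bound tests in the expansion are inclusive and that 'I1', 'I2' and 'I3' have already been elaborated to integers; the function `for_statement` and its parameter names are hypothetical.

```python
def for_statement(start=1, step=1, stop=None, cond=lambda s: True, body=lambda s: None):
    """Model of 'for s from start by step to stop while cond do body'.

    start, step, stop -- values of I1, I2, I3, elaborated once before the loop;
                         stop=None models the omission of 'to I3'
    cond, body        -- 'B' and 'E', given as functions of the loop identifier s
    """
    var = start
    while True:
        if stop is not None:
            if step > 0 and var > stop:
                break
            if step < 0 and var < stop:
                break
        s = var               # int s = var : s is a constant during each iteration
        if not cond(s):
            break
        body(s)
        var = var + step


upward = []
for_statement(1, 2, 7, body=upward.append)

downward = []
for_statement(5, -2, 0, body=downward.append)

limited = []
for_statement(cond=lambda s: s * s < 20, body=limited.append)
```

The third call omits 'from', 'by' and 'to', so the loop is ended only by the 'while' part.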

3. Composition of Programs

The general structure of ALGOL 68 programs turns out, perhaps surprisingly, to be rather simple since any kind of phrase may be used almost anywhere provided that it makes sense. The main syntactical properties will be briefly reviewed and a couple of examples will serve as illustrations.

3.1 The Syntax of Algol 68

The syntax of ALGOL 68 contains three aspects which respectively deal with the general structure of programs, the modes of expressions, and the conditions for using properly the declared external objects. In [1], the first two aspects are described by a formal syntax, whereas the third one is described in a sort of semi-formalized English.

In the first two sections of this paper the more important points about modes of expressions and use of declared external objects have been dealt with. As far as modes are concerned, the syntax expresses the properties and conditions which have been described. For example, it says that in an assignation 'N := E', 'N' is of mode 'ref μ' and 'E' of mode 'μ' (the same 'μ'). It also expresses how and when coercions take place. An important feature of ALGOL 68 is that it provides complete syntactical control of the modes, thus providing the programmer with maximum security at the cost of some loss of freedom.

The ALGOL 68 range has been likened to the ALGOL 60 block. This is indeed true, except that a range may appear at many more places than the block. In practice, any phrase which specifies a serial action (this includes a single action) and which is placed between "generalized parentheses" is a range. The pairs of these generalized parentheses are: 'begin ... end', 'if ... then', 'then ... else', 'else ... fi', and also the ones which, like 'out ... esac', are defined in terms of them (see Section 2.2.4). Most of these parentheses admit several representations: 'begin' and 'if' may be written as '(', 'end' and 'fi' as ')', and 'then' as '|', etc. (see Section 2.2.4).

This definition of range has several implications. For example, it is impossible to jump from the outside into any part of a conditional phrase. A more important point is that a range may yield a value: the value yielded by the last expression in it (see Section 2.2.1). It is then possible to write such things as 'if bool q := false ; for i to n while ¬ q do q := A[i] = 0 ; q then ...'. The value yielded by the range between 'if' and 'then' is that of 'q' after elaboration of the repetitive statement (i.e. 'true' if an element of 'A[1:n]' is zero and 'false' otherwise).
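The range between 'if' and 'then' in this example behaves like a small function that declares a variable, runs a loop and yields the final value. A Python rendering, given only as an analogy with the hypothetical name `some_element_is_zero`, makes the early exit visible:

```python
def some_element_is_zero(A):
    """Model of the range 'bool q := false ; for i to n while ¬ q do q := A[i] = 0 ; q'."""
    q = False                       # bool q := false
    i = 0
    while i < len(A) and not q:     # for i to n while ¬ q
        q = A[i] == 0               # q := A[i] = 0
        i += 1
    return q                        # the value of the range is that of its last expression


found = some_element_is_zero([3, 0, 5])
missing = some_element_is_zero([1, 2, 4])
```

In Python the built-in `any(a == 0 for a in A)` would express the same test more directly; the explicit loop is kept here to mirror the ALGOL 68 phrase.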

Finally, it is worth recalling that, as in ALGOL 60, any declared object is valid within the range in which it is declared and may be redeclared in an inner range. In the case of identifiers and mode declarers, this supersedes the outer declaration (as in ALGOL 60). For operators, this is different; the same operator may possess at the same time several different routines. For example, in Section 3.2.1, an operator '+' is declared and this does not prevent the use of the standard numerical '+' in the same range.

3.2 Examples of Programs

The two following examples are meant to give an idea of the structure of ALGOL 68 programs and of the possibilities of the language. They are not presented as certified, efficient, error-free programs but merely as illustrations.

3.2.1 Unification Algorithm. ¢ The unification algorithm is taken as an example of composition of a nonnumerical program. It is explained in: J. A. Robinson, "A review of automatic theorem proving" (Proceedings of Symposia in Applied Mathematics, Volume 19, American Mathematical Society, Providence, Rhode Island, 1967). We quote:

"Unification Algorithm. Given any finite set $P = \{A_1, \ldots, A_n\}$ of finite, nonempty pairwise disjoint sets of atoms as input, proceed as follows:

"Step 1. Put $k = 0$, $\sigma_0 = \epsilon$, and go to Step 2.

"Step 2. If $A_1\sigma_k, \ldots, A_n\sigma_k$ are all singletons, stop. Otherwise, let $h$ be the smallest index for which $A_h\sigma_k$ is not a singleton, and let $V_k$ be the earliest and $U_k$ the next earliest expressions in the lexical ordering of the disagreement set of $A_h\sigma_k$. Go to Step 3.

"Step 3. If $V_k$ is a variable which does not occur in $U_k$, put $\sigma_{k+1} = \sigma_k\{U_k/V_k\}$, add 1 to $k$, and return to Step 2. Otherwise, stop.

"End of unification algorithm."

ALGOL 68 makes it possible to immediately declare a corresponding routine. It may be remarked that we are already inside an ALGOL 68 program: the symbol at the head of this section is a "comment symbol" and comments may appear anywhere. We now close the comment ¢

proc unification = (ref [ ] ref set P) ref subst : ¢ the use of references will avoid making unnecessary copies of data structures ¢
  (ref set Asigma, ref subst sigma := nil ;
   while bool all singletons := true ;
         for i to ⌈ P ¢ '⌈' is a standard operator which delivers the upper bound of a 1-dimensional array ¢
           while all singletons
           do all singletons := 1 = ⌈ (Asigma := P[i] × sigma) ;
         ¬ all singletons ¢ the while clause is a range whose value is given by '¬ all singletons' ¢
   do (ref term U, ref var V ;
       if (assign 2 earliest (disagreement set (Asigma), V, U) | V ∉ U | false)
       then sigma := sigma + U/V
       else failure ¢ a jump ¢
       fi) ;
   sigma. ¢ the result when unification succeeds. For the completer, see Section 2.2.1 ¢
   failure: nil ¢ the result when unification fails ¢),

¢ This declaration closely reproduces the original algorithm. Of course, the modes, operators and procedure identifiers used in it must be defined. We quote: "By a substitution is meant a finite set $\{T_1/V_1, \ldots, T_k/V_k\}$ of substitution components $T_i/V_i$, where $V_i$ is a variable and $T_i$ is a term different from $V_i$ ...." This appears in the following mode declarations ¢

mode set = [1 : flex] term,
     term = union (var, fn), ¢ atoms and terms are given the same representation ¢
     var = string,
     fn = struct (string ident, ref set args),
     subst = struct (ref subst head, component last), ¢ remark that a set has been represented by an array of terms whereas a substitution will be a chain of components. This is a programmer's choice. ¢
     component = struct (term t, var v),

¢ Two procedures must be defined, 'assign 2 earliest' and 'disagreement set'. Let us, for example, write the first one, given that "... the lexical ordering of a set of expressions is assumed to be an ordering in which variables precede all other expressions (any other details of the ordering being immaterial)" ¢

proc assign 2 earliest = (ref set D, ref var V, ref term U) bool :
  (int m1, m2, bool b ;
   if b := false ;
      for i to ⌈ D while ¬ b
        do b := V ::= D[(m1 := i)] ¢ Section 2.1.4 ¢ ;
      b
   then b := false ;
        for i from m1 + 1 to ⌈ D while ¬ b
          do b := var :: D[(m2 := i)] ;
        U := D[(b | m2 |: m1 ≠ 1 | 1 | 2) ¢ Section 2.2.4 ¢] ;
        true
   else false
   fi),

¢ If the set 'D' contains no variable, this procedure will return 'false', otherwise it returns 'true' and assigns a variable to 'V' and the lexically next term to 'U'. It remains to define the operators '∉', '+', '/' and '×'. We do that for the first three. ¢


priority ∉ = 4 ¢ the symbol '∉' does not presently exist in ALGOL 68. The priority declaration establishes it as a binary operator and gives it the same priority as '=' ¢,

op ∉ = (ref var v, ref term t) bool :
  (var vt, fn ft ;
   case vt, ft ::= t ¢ Section 2.2.4: multiple conformity ¢
   in vt ≠ v, ¢ yields 'true' if 't' is a variable different from 'v' ¢
      (bool b := false ;
       for i to ⌈ args of ft while ¬ b
         do b := v ∉ (args of ft)[i] ;
       b)
   esac),

op + = (ref subst a, component b) ref subst : heap subst := (a, b) ¢ see Section 2.2.4 (generators and assignations) and 2.1.2 (structure display) ¢,

op / = (term t, var v) component : (t, v),

¢ For completeness, we quote: "The result of carrying out the substitution $\sigma$ on the expression $E$ ... is obtained from $E$ by replacing each occurrence of the variable $V_i$ by an occurrence of the term $T_i$ for each substitution component in the substitution $\sigma$." The operation 'op × = (ref set A, ref subst sigma) ref set : ...' carries out these replacements for each expression $E$ in the set $A$. ¢
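As an executable companion to this section, the quoted steps can be sketched in Python. This is a minimal sketch under stated assumptions, not a transcription of the ALGOL 68 declarations above: a variable is represented by a Python string, any other term by a pair (ident, args), a substitution by a list of (term, variable) components applied oldest first, and all function names are hypothetical.

```python
def replace(term, t, v):
    """Replace every occurrence of variable v in term by t."""
    if isinstance(term, str):
        return t if term == v else term
    ident, args = term
    return (ident, tuple(replace(a, t, v) for a in args))


def apply(term, sigma):
    """Carry out the substitution sigma, component by component."""
    for t, v in sigma:
        term = replace(term, t, v)
    return term


def occurs(v, term):
    """True if variable v occurs in term."""
    if isinstance(term, str):
        return term == v
    return any(occurs(v, a) for a in term[1])


def disagreement_set(terms):
    """Distinct subterms at the first position where the terms differ."""
    if all(t == terms[0] for t in terms):
        return []
    if (any(isinstance(t, str) for t in terms)
            or len({(t[0], len(t[1])) for t in terms}) > 1):
        return list(dict.fromkeys(terms))
    for k in range(len(terms[0][1])):
        d = disagreement_set([t[1][k] for t in terms])
        if d:
            return d
    return []


def unify(P):
    """Return a list of components, or None when unification fails."""
    sigma = []                                   # Step 1: empty substitution
    while True:
        for A in P:                              # Step 2: first non-singleton
            a_sigma = list(dict.fromkeys(apply(t, sigma) for t in A))
            if len(a_sigma) > 1:
                break
        else:
            return sigma                         # all singletons: success
        d = disagreement_set(a_sigma)
        # Lexical ordering: variables precede all other expressions.
        ordered = ([t for t in d if isinstance(t, str)]
                   + [t for t in d if not isinstance(t, str)])
        v, u = ordered[0], ordered[1]
        if not isinstance(v, str) or occurs(v, u):
            return None                          # Step 3 fails
        sigma.append((u, v))                     # Step 3: extend sigma
```

For example, with the constants `a = ("a", ())` and `b = ("b", ())`, the call `unify([[("P", ("x", ("f", ("y",)))), ("P", (a, ("f", (b,))))]])` binds 'x' to `a` and then 'y' to `b`.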

3.2.2 Newton Algorithm. ¢ The following example shows a procedure for computing the complex roots of a polynomial with complex coefficients of the form $a_0 z^m + \ldots + a_{m-1} z + a_m$. ¢

proc complex Newton = (ref [ ] compl coeff, compl approx1, int maxiter, real eps1, eps2) [ ] compl :
begin int m ¢ degree of the polynomial ¢ := ⌈ coeff - 1 ;
  [1 : m] compl r ¢ roots ¢, int n ¢ number of nonnull roots ¢ ;
  for i from m by -1 to 1 while coeff[(n := i)] = 0
    do r[i] := 0 ¢ null roots. This is a statement appearing between declarations ¢ ;
  [0 : n] compl f ¢ coefficients of Horner function ¢,
                fd ¢ coefficients of its derivative ¢,
                c ¢ auxiliary coefficients ¢ := coeff[0 : n] ;
  f[0] := fd[0] := c[0] ;
  compl a ¢ first approximation ¢ :=
    if (approx1 ≠ 0 | true |
        bool tr := true, ti := true ;
        ((for i from 0 to n while tr do tr := re c[i] = 0),
         (for i from 0 to n while ti do ti := im c[i] = 0))
        ¢ the standard operators 're' and 'im' yield the real and imaginary parts of a complex number, respectively ¢ ;
        ¬ (tr ∨ ti) ¢ this is not a polynomial with real coefficients ¢ )
    then approx1
    else 0.3 ⊥ 0.7
    fi ¢ the binary operator '⊥' produces a complex number ¢,
  compl b ¢ next approximation ¢ ;
  for i from n by -1 to 2
  do if bool p ¢ test of precision ¢ := false ;
        to maxiter while ¬ p
        do (for j to i - 1 do fd[j] := (f[j] := c[j] + a × f[j - 1]) + a × fd[j - 1] ;
            f[i] := c[i] + a × f[i - 1] ;
            if fd[i - 1] ≠ 0
            then b := a - f[i] / fd[i - 1] ;
                 p := abs (re a - re b) < eps1 ∧ abs (im a - im b) < eps2 ;
                 a := b
            else a := a + 0.5 ⊥ 0.5
            fi) ;
        p
     then (r[i] := b ¢ nonnull root ¢,
           (for j to i - 1 do c[j] := f[j]),
           a := (abs im b < eps2 × abs re b | (re b + im b) ⊥ (im b - re b) | re b - im b)) ¢ a collateral phrase ¢
     else out
     fi ;
  r[1] := - f[1] / f[0] ¢ last root ¢ ;
  r.
  out : [1 : 0] compl
end
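The numerical core of this routine (Horner evaluation of the polynomial and of its derivative, a Newton step, then deflation by the root found) can be sketched in Python. The sketch rests on assumptions of its own and does not reproduce the routine above: the names `horner` and `newton_roots` are hypothetical, a single tolerance `eps` replaces 'eps1' and 'eps2', every root is searched from the same starting point, the deflation is recomputed at the converged root, and the leading coefficient is assumed to be nonzero.

```python
def horner(c, a):
    """Return (quotient coefficients, p(a), p'(a)) for coefficients c, highest degree first."""
    f = [c[0]]
    fd = c[0]
    for cj in c[1:-1]:
        f.append(cj + a * f[-1])   # f[j] := c[j] + a * f[j-1]
        fd = f[-1] + a * fd        # fd[j] := f[j] + a * fd[j-1]
    return f, c[-1] + a * f[-1], fd


def newton_roots(coeff, start=0.3 + 0.7j, maxiter=100, eps=1e-12):
    """Return the roots found, or [] if some iteration does not converge."""
    c = [complex(x) for x in coeff]
    roots = []
    while len(c) > 2:
        a = start
        for _ in range(maxiter):
            _, value, deriv = horner(c, a)
            if deriv == 0:
                a += 0.5 + 0.5j    # move away from a stationary point
                continue
            b = a - value / deriv  # Newton step
            converged = abs(b - a) < eps
            a = b
            if converged:
                break
        else:
            return []              # plays the role of the jump to 'out'
        roots.append(a)
        c = horner(c, a)[0]        # deflate: divide the polynomial by (z - a)
    roots.append(-c[1] / c[0])     # last root, from the remaining linear factor
    return roots
```

Called as `newton_roots([1, 0, 1])`, the sketch approximates the two roots of $z^2 + 1$, starting from the complex first approximation 0.3 ⊥ 0.7 that the routine above also uses as a default.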

Acknowledgments. The authors are indebted to Professor P. Gennart of the Ecole Royale Militaire, Brussels, for providing the opportunity of presenting the language during the seminar which he organized. The present paper is based on the notes prepared for that seminar. Thanks are also due to the Editor, whose comments helped improve the original version.

Received May 1970; revised November 1970

References

1. van Wijngaarden, A. (Ed.), Mailloux, B.J., Peck, J.E.L., Koster, C.H.A. Report on the Algorithmic Language ALGOL 68. Numerische Mathematik 14, 2 (1969), 79-218. Available from ACM Headquarters, New York.
2. Peck, J.E.L. (Ed.) ALGOL 68 Implementation. Proc. IFIP Working Conf. on ALGOL 68 Implementation, Munich, 1970, North-Holland Pub. Co., Amsterdam, 1971.
