type checking

Prepared By: Dabbal Singh Mahara 2016 Contents Type Checking Run-time Environments Intermediate Code Generation 1

Upload: dsingh-ma

Post on 07-Jan-2017


Page 1: Type checking

Prepared By: Dabbal Singh Mahara, 2016

Contents

• Type Checking
• Run-time Environments
• Intermediate Code Generation

Page 2: Type checking

Dabbal Mahara 2

Type Checking
• A type is a set of values together with a set of operations that can be performed on them.
• Type checking is checking that each operation in a program receives the appropriate number of arguments, of appropriate types, in the appropriate order.
• The purpose of type checking is to verify that operations performed on a value are in fact permissible.
• Certain operations are legal for values of each type:
– It doesn’t make sense to add a function pointer and an integer in C.
– It does make sense to add two integers.
• The type of an identifier is typically available from declarations, but we may have to keep track of the type of intermediate expressions.
• Type errors arise when operations are performed on values that do not support that operation.

Page 3: Type checking

Type Systems
• A language’s type system specifies which operations are valid for which types.
• Type systems provide a concise formalization of the semantic checking rules.
• A type system defines a set of types and rules to assign types to programming language constructs. An informal type system rule is, for example, “if both operands of addition are of type integer, then the result is of type integer”.
• Type checking is the process of checking that the program obeys the type system.
• A type checker implements a type system.
• A sound type system eliminates the need for run-time checking for type errors, such as:
– Memory errors: reading from an invalid pointer, etc.
– Violation of abstraction boundaries.

Page 4: Type checking

Type Checking Overview

Three kinds of languages:
• Statically typed: All or almost all checking of types is done as part of compilation (C, ML, Java)
• Dynamically typed: Almost all checking of types is done as part of program execution (Scheme, Prolog)
• Untyped: No type checking (machine code)

• Static typing proponents say:
– Static checking catches many programming errors at compile time
– Avoids overhead of runtime type checks
• Dynamic typing proponents say:
– Static type systems are restrictive
– Rapid prototyping easier in a dynamic type system

Page 5: Type checking

Static Checking
• Refers to the compile-time checking of programs in order to ensure that the semantic conditions of the language are being followed
• Examples of static checks include:
– Type checks
– Flow-of-control checks
– Uniqueness checks
– Name-related checks
• Flow-of-control checks: statements that cause flow of control to leave a construct must have some place where control can be transferred; e.g., break statements in C
• Uniqueness checks: a language may dictate that in some contexts, an entity can be defined exactly once; e.g., identifier declarations, labels, values in case expressions
• Name-related checks: Sometimes the same name must appear two or more times; e.g., in Ada a loop or block can have a name that must then appear both at the beginning and at the end

Page 6: Type checking

Type Expression
• A language usually provides a set of base types that it supports, together with ways to construct other types using type constructors.
• Through type expressions we are able to represent types that are defined in a program.
• A base type is a type expression:
– a primitive data type such as integer, real, char, boolean, …
– type_error: signals an error during type checking
– void: no type
• A type name (e.g., a record name) is a type expression.
• A type constructor applied to type expressions is a type expression. E.g.,
– arrays: If T is a type expression and I is a range of integers, then array(I,T) is a type expression.
– records: If T1, …, Tn are type expressions and f1, …, fn are field names, then record((f1,T1),…,(fn,Tn)) is a type expression.
– pointers: If T is a type expression, then pointer(T) is a type expression. Ex: pointer(int)
– functions: If T1, …, Tn, and T are type expressions, then so is (T1,…,Tn) → T. Ex: int → int represents the type of a function that takes an int value as its parameter and whose return type is also int.
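The construction rules above can be sketched in code. This is only an illustration: the class names (Base, Array, Pointer, Function) and the helper show are hypothetical, and the array's index range I is simplified to an element count.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Base:
    name: str                      # e.g. "int", "char", "void", "type_error"


@dataclass(frozen=True)
class Array:
    size: int                      # stands in for the index range I
    elem: object                   # element type expression T


@dataclass(frozen=True)
class Pointer:
    to: object                     # pointed-to type expression T


@dataclass(frozen=True)
class Function:
    params: Tuple[object, ...]     # T1, ..., Tn
    result: object                 # T


INT = Base("int")
CHAR = Base("char")


def show(t):
    """Render a type expression in the notation used on the slide."""
    if isinstance(t, Base):
        return t.name
    if isinstance(t, Array):
        return f"array({t.size}, {show(t.elem)})"
    if isinstance(t, Pointer):
        return f"pointer({show(t.to)})"
    if isinstance(t, Function):
        params = ", ".join(show(p) for p in t.params)
        return f"({params}) -> {show(t.result)}"
    raise TypeError(f"not a type expression: {t!r}")
```

For instance, show(Pointer(INT)) gives pointer(int), and constructors nest freely, as in Array(10, Pointer(CHAR)).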

Page 7: Type checking


A Simple Type Checking System

Page 8: Type checking

Specification of Simple Type checker

• A simple type checking translation scheme for declarations is given in the following figure.
• The basic types are: character and integer.
• The constructed types are: array and pointer.
• The type attribute is added to each symbol.
• The declaration should come before the usage of the variable.

Page 9: Type checking

Type checking for expression
• The synthesized attribute type for E gives the type of the expression assigned by the type system for the expression generated by E.
• The function lookup returns the type of id.
• The following figure shows the type checking for the expressions.
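As a rough sketch of how a synthesized type attribute can be computed, the checker below walks an expression tree bottom-up. The tuple encoding of expressions and types, the chosen operators (mod, indexing, dereference) and the symbol-table dictionary are assumptions made for illustration; they are not taken from the slide's figure.

```python
INT, CHAR, ERROR = ("int",), ("char",), ("type_error",)


def lookup(symtab, name):
    """Return the declared type of an identifier, or type_error if undeclared."""
    return symtab.get(name, ERROR)


def check(e, symtab):
    """Return the type of expression e, computed from the types of its parts."""
    kind = e[0]
    if kind == "num":
        return INT
    if kind == "literal":
        return CHAR
    if kind == "id":
        return lookup(symtab, e[1])
    if kind == "mod":                      # E -> E1 mod E2
        left, right = check(e[1], symtab), check(e[2], symtab)
        return INT if left == INT and right == INT else ERROR
    if kind == "index":                    # E -> E1 [ E2 ]
        arr, idx = check(e[1], symtab), check(e[2], symtab)
        return arr[2] if arr[0] == "array" and idx == INT else ERROR
    if kind == "deref":                    # E -> E1 ^
        ptr = check(e[1], symtab)
        return ptr[1] if ptr[0] == "pointer" else ERROR
    return ERROR


symtab = {"a": ("array", 10, INT), "p": ("pointer", CHAR), "i": INT}
```

With this table, check(("index", ("id", "a"), ("id", "i")), symtab) yields the element type of a, while applying mod to a pointer yields type_error.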

Page 10: Type checking

Type checking for statements

• In some languages statements have a type associated with them, while other languages don’t assign types to statements.
• In the latter case, statements are given the type void to distinguish a type-safe statement from one that has a type error.
• If an error occurs within a statement, then the type assigned to this statement is type_error.

Page 11: Type checking

Type checking for functions

• The application of a function to an argument can be captured by the productions:
T → T '->' T (function type declaration)
E → E ( E ) (function call)

Page 12: Type checking

Type Conversion and Coercion
• Since the representations of integers and reals are different within a computer, different machine instructions are used for operations on integers and reals. Often, if different parts of an expression are of different types, type conversion is required.
• For example, in the expression z = x + y, what is the type of z if x is integer and y is real?
• The compiler has to convert one of them to ensure that both operands are of the same type.
• In many languages type conversion is explicit, for example using type casts; i.e., the conversion must be specified, as in inttoreal(x).
• A type conversion that happens implicitly is called coercion. Implicit type conversions are carried out by the compiler recognizing a type incompatibility and inserting a call to a type conversion routine (for example, something like inttoreal(int)) that takes a value of the original type and returns a value of the required type.
• The coercion of expressions is given in the following figure.
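A minimal sketch of coercion for addition, assuming expressions are nested tuples and only the types integer and real exist. The function name coerce and the tuple encoding are hypothetical; inttoreal is the conversion routine named above.

```python
def coerce(e, symtab):
    """Return (type, rewritten expression), inserting inttoreal where needed."""
    if e[0] == "id":
        return symtab.get(e[1], "type_error"), e
    if e[0] == "+":
        left_type, left = coerce(e[1], symtab)
        right_type, right = coerce(e[2], symtab)
        if left_type == "integer" and right_type == "integer":
            return "integer", ("+", left, right)
        if left_type == "real" and right_type == "real":
            return "real", ("+", left, right)
        if left_type == "integer" and right_type == "real":
            # Implicit conversion of the integer operand.
            return "real", ("+", ("inttoreal", left), right)
        if left_type == "real" and right_type == "integer":
            return "real", ("+", left, ("inttoreal", right))
    return "type_error", e
```

For x + y with x integer and y real, the result type is real and the rewritten expression is inttoreal(x) + y, which answers the question posed for z above.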

Page 13: Type checking


Type Conversion and Coercion (Contd.)

Page 14: Type checking

Structural Equivalence of Type Expressions
• The basic question is "when are two type expressions equivalent?"
• Two type expressions are structurally equivalent if they are the same basic type, or are formed by applying the same constructor to structurally equivalent types.

Example: int a, b;
Here the types of a and b are structurally equivalent.
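The recursive definition can be sketched as follows. The tuple encoding of types is an assumption, and so is the choice to compare array sizes; some languages ignore array bounds when testing equivalence.

```python
def sequiv(s, t):
    """Structural equivalence of two type expressions encoded as tuples.

    ("int",) is a basic type, ("array", n, T) an array, ("pointer", T) a
    pointer and ("function", (T1, ..., Tn), T) a function type.
    """
    if s[0] != t[0]:
        return False                       # different basic type or constructor
    kind = s[0]
    if kind == "array":
        return s[1] == t[1] and sequiv(s[2], t[2])
    if kind == "pointer":
        return sequiv(s[1], t[1])
    if kind == "function":
        return (len(s[1]) == len(t[1])
                and all(sequiv(a, b) for a, b in zip(s[1], t[1]))
                and sequiv(s[2], t[2]))
    return True                            # same basic type
```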

Page 15: Type checking

Run-time Environments

• A compiler must accurately implement the abstractions embodied in the source language definition. These abstractions typically include concepts such as names, scopes, bindings, data types, operators, procedures, parameters and flow-of-control constructs.
• The compiler must co-operate with the operating system and other system software to support these abstractions on the target machine.
• To do so, the compiler creates and manages a run-time environment in which the target code is executed.
• By runtime, we mean a program in execution.
• The runtime environment is a state of the target machine, which may include software libraries, environment variables, etc., to provide services to the processes running in the system.

Page 16: Type checking

Run-time Environment...
• The runtime support system is a package, mostly generated with the executable program itself, that facilitates communication between the process and the runtime environment. It takes care of memory allocation and de-allocation while the program is being executed.
• This environment deals with a number of issues such as the layout and allocation of storage locations for the objects named in the source program, the mechanisms used by the target program to access variables, the linkages between procedures, the mechanisms for passing parameters, and interfaces to the operating system, input/output devices and other programs.
• That is,
‣ Management of run-time resources
‣ Correspondence between static (compile-time) and dynamic (run-time) structures
‣ Storage organization

Page 17: Type checking

Run-time Resources
• Execution of a program is initially under the control of the operating system (OS)
• When a program is invoked:
‣ The OS allocates space for the program
‣ The code is loaded into part of this space
‣ The OS jumps to the entry point of the program (i.e., to the beginning of the “main” function)

Page 18: Type checking

Memory Layout: Storage Organization

[Figure: memory layout of a program, drawn from Low Address to High Address]

Page 19: Type checking

Correspondence between Static and Dynamic Structures
• The compiler must do the storage allocation and provide access to variables and data.
• At run time, we need a system to map NAMES (in the source program) to STORAGE on the machine.
• Allocation and de-allocation of memory is handled by a RUN-TIME SUPPORT SYSTEM, typically linked and loaded along with the compiled target code.
• One of the primary responsibilities of the run-time system is to manage ACTIVATIONS of procedures.
• Procedure execution begins at the first statement of the procedure body.
• When a procedure returns, execution returns to the instruction immediately following the procedure call.

Page 20: Type checking

Activation and Activation Tree
• Every execution of a procedure is called an ACTIVATION.
• The LIFETIME of an activation of procedure P is the sequence of steps between the first and last steps of P’s body, including any procedures called while P is running.
• Normally, when control flows from one activation to another, it must (eventually) return to the same activation.
• If a procedure is recursive, a new activation can begin before an earlier activation of the same procedure has ended.
• We can represent the activations of procedures during the running of an entire program by a tree, called an activation tree.
• An activation tree shows the way control enters and leaves activations. In an activation tree:
– Each node represents an activation of a procedure.
– The root represents the activation of the main program.
– The node a is a parent of the node b if control flows from a to b.
– The node a is to the left of the node b if the lifetime of a occurs before the lifetime of b.

Page 21: Type checking


Procedure Activations: Example

Page 22: Type checking

Procedure Activation: Example (contd...)
• The example is a sketch of a program that reads nine integers into an array a and sorts them using the recursive quicksort algorithm.
• The main function has three tasks: it calls readarray, sets sentinels and then calls quicksort on the entire data array.
• The figure on the right shows the sequence of calls that might result from an execution of the program. In this execution, the call to partition(1,9) returns 4, so a[1] to a[3] hold elements less than its chosen separator value v, while the larger elements are in a[5] through a[9].
• In this example, procedure activations are nested in time.

Page 23: Type checking

Activation Tree: During an Execution of quicksort

• This is one possible activation tree that completes the sequence of calls and returns in the above program.
• The functions are represented by the first letters of their names.
• Remember that this tree is only one possibility, since the arguments of subsequent calls, and also the number of calls along any branch, are influenced by the values returned by partition.

Page 24: Type checking

Control Stack
• Procedure calls and returns are managed by a run-time stack called the control stack.
• Each live activation has a frame, known as an activation record, on the control stack. The root of the activation tree is at the bottom, and the sequence of activation records on the stack corresponds to the path in the activation tree to the activation where control currently resides. That latter activation has its record at the top of the stack.
• The stack keeps track of currently-active procedure activations.
– An activation record is pushed onto the control stack as the activation starts.
– That activation record is popped when that activation ends.
• At any point in time, the control stack represents a path from the root of the activation tree to one of the nodes.
• The flow of control in a program corresponds to a depth-first traversal of the activation tree that:
– starts at the root,
– visits a node before its children, and
– recursively visits children at each node in a left-to-right order.
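The push/pop behaviour can be made visible with a small simulation. This sketch uses a hypothetical recursive function fib in place of the quicksort program from the slides, and the helper names enter, leave and run are assumptions. Each snapshot of the stack is a path from the root of the activation tree to the node just entered.

```python
control_stack = []   # live activations, root ("main") at index 0
trace = []           # snapshot of the stack taken at every procedure entry


def enter(name):
    """Push an activation as it starts and record the current path."""
    control_stack.append(name)
    trace.append(tuple(control_stack))


def leave():
    """Pop the activation as it ends."""
    control_stack.pop()


def fib(n):
    enter(f"fib({n})")
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    leave()
    return result


def run():
    """Run main -> fib(3) and return (value, list of stack snapshots)."""
    control_stack.clear()
    trace.clear()
    enter("main")
    value = fib(3)
    leave()
    return value, list(trace)
```

Reading the last element of each snapshot in order gives the preorder (depth-first, left-to-right) traversal of the activation tree.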

Page 25: Type checking


Page 26: Type checking

Activation Records
• Information needed by a single execution of a procedure is managed using a contiguous block of storage called an activation record.
• An activation record is allocated when a procedure is entered, and it is de-allocated when that procedure is exited.
• The size of each field can be determined at compile time (although the actual location of the activation record is determined at run time).
• The exception is a local variable whose size depends on a parameter; its size is determined at run time.

Page 27: Type checking

A General Activation Record

Fields, in the order shown in the figure:
Actual parameters
Returned values
Control link
Access link
Saved machine status
Local data
Temporaries

• Temporary values, such as those arising from the evaluation of expressions, in cases where those temporaries cannot be held in registers.
• Local data belonging to the procedure whose activation record this is.
• Saved machine status, with information about the state of the machine just before the call to the procedure. This information typically includes the return address (the value of the program counter, to which the called procedure must return) and the contents of registers that were used by the calling procedure and that must be restored when the return occurs.
• An access link may be added to locate data needed by the called procedure but found elsewhere, e.g. in another activation record.
• A control link, pointing to the activation record of the caller.
• Space for the return value of the called function, if any.
• The actual parameters used by the calling procedure.

Page 28: Type checking

Creation of An Activation Record

• Who allocates an activation record of a procedure?
• Some part of the activation record of a procedure is created by that procedure immediately after that procedure is entered.
• Some part is created by the caller of that procedure before that procedure is entered.
• Calling sequences are code statements to create activation records on the stack and enter data in them.
• The CALLING SEQUENCE for a procedure allocates an activation record and fills its fields in with appropriate values.
• The RETURN SEQUENCE restores the machine state to allow execution of the calling procedure to continue.

Page 29: Type checking

Creation of An Activation Record

[Figure: the caller’s activation record and the callee’s activation record on the stack. Each record contains: parameters and return value; control link, links and saved status; temporaries and local data. The figure also marks Stack_top and which parts are the caller’s responsibility and which are the callee’s responsibility.]

Page 30: Type checking

Creation of An Activation Record

Sample calling sequence
• Caller evaluates the actual parameters and places them into the activation record of the callee.
• Caller stores a return address and the old value of stack_top in the callee’s activation record.
• Caller increments stack_top to the beginning of the temporaries and locals for the callee.
• Caller branches to the code for the callee.
• Callee saves all needed register values and status.
• Callee initializes its locals and begins execution.

Sample return sequence
• Callee places the return value at the correct location in the activation record (next to the caller’s activation record).
• Callee uses the status information previously saved to restore stack_top and the other registers.
• Callee branches to the return address previously stored by the caller.
• [Optional] Caller copies the return value into its own activation record and uses it to evaluate an expression.

Page 31: Type checking

Who deallocates?
• Callee de-allocates the part allocated by Callee.
• Caller de-allocates the part allocated by Caller.

Variable-length data
• In some languages, array size can depend on a value passed to the procedure as a parameter.
• This and any other variable-sized data can still be allocated on the stack, but BELOW the callee’s activation record.
• In the activation record itself, we simply store POINTERS to the to-be-allocated data.
• All variable-length data is pointed to from the local data area.

Page 32: Type checking

Intermediate Code Generation
• In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target code.
• The details of the source language are confined to the front end, and the details of the target machine to the back end.
• With a suitably defined intermediate representation, a compiler for language i and machine j can then be built by combining the front end for language i with the back end for machine j.
• Intermediate code is often the link between the compiler’s front end and back end.
• Intermediate codes are machine-independent codes, but they are close to machine instructions.

Fig. Position of Intermediate Code Generator in Compiler

Page 33: Type checking

Why Intermediate Language
Advantages:
• Target code can be generated for any machine just by attaching a new back end for that machine. This is called retargeting.
• It is possible to apply machine-independent code optimization to the intermediate code in order to improve the generated code.
• IR can modularize the task: the front end is not bothered about machine details and the back end is not bothered about the source language.
• Have many front-ends into a single back-end
– gcc can handle C, C++, Java, Fortran, Ada, ...
– each front-end translates source to the same generic language (called GENERIC)
• Have many back-ends from a single front-end
– Do most optimization on intermediate representation before emitting code targeted at a single machine
• It provides an intermediate level of abstraction
– more details than the source
– fewer details than the target

Page 34: Type checking

Types of Intermediate Languages
There are three kinds of intermediate representations:
1. High-level intermediate representations:
– closer to the source language; e.g., syntax trees or Directed Acyclic Graph (DAG)
– easy to generate from the input program
– code optimizations may not be straightforward
2. Low-level intermediate representations:
– closer to target machine; e.g., P-Code, U-Code (used in PA-RISC and MIPS), GCC’s RTL, 3-address code
– easy to generate code from
– generation from input program may require effort
3. “Mid”-level intermediate representations:
– Java bytecode, Microsoft CIL, LLVM IR, ...

Page 35: Type checking

1. Syntax Tree

• Each node in a syntax tree represents a construct; the children of the node represent the meaningful components of the construct.
• A syntax tree node representing an expression E1+E2 has label + and two children representing the subexpressions E1 and E2.
• We shall implement the nodes of a syntax tree by objects with a suitable number of fields. Each object will have an op field that is the label of the node.
• Additionally, if a node is a leaf node, a further field holds the lexical value for the leaf. A constructor function leaf(op,val) creates a leaf object.
• If the node is an interior node, a constructor function node(op, c1,c2,..,ck) creates an object with field op and k other fields for the children c1, ..., ck.
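A minimal sketch of the two constructor functions, assuming a single hypothetical Node class for both leaves and interior nodes. The helper to_prefix is only there to display the result; the tree built below is the one for 2-3+4, read as (2-3)+4.

```python
class Node:
    """Syntax tree node: op is the label, val the lexical value of a leaf."""

    def __init__(self, op, children=(), val=None):
        self.op = op
        self.children = list(children)
        self.val = val


def leaf(op, val):
    """Create a leaf object, e.g. leaf("num", 2)."""
    return Node(op, val=val)


def node(op, *children):
    """Create an interior node with label op and children c1, ..., ck."""
    return Node(op, children)


def to_prefix(n):
    """Render a tree in prefix form for inspection."""
    if not n.children:
        return str(n.val)
    return "(" + n.op + " " + " ".join(to_prefix(c) for c in n.children) + ")"


tree = node("+", node("-", leaf("num", 2), leaf("num", 3)), leaf("num", 4))
```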

Page 36: Type checking

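SDD for creating syntax tree

Example: Creating syntax tree for expression: 2-3+4

[Figure: parse tree built from E and T nodes over the leaves num 2, num 3 and num 4, together with the resulting syntax tree, in which + is the root, its left child is - (with children 2 and 3) and its right child is 4]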

Page 37: Type checking


Example 2: Syntax tree

Page 38: Type checking

2. DAG

• Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators.
• The difference is that a node N in a DAG has more than one parent if N represents a common subexpression.
• All that is needed is that functions such as node and leaf above check whether an identical node already exists. If such a node exists, a pointer to that node is returned instead of creating a new one.
• More compact representation
• Gives clues regarding generation of efficient code
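The "check whether a node already exists" step can be sketched with a dictionary keyed by each node's signature. The class name Dag and the sample expression a + a*(b-c) + (b-c)*d are assumptions chosen for illustration, not the expression from the slide.

```python
class Dag:
    """Expression DAG in which identical nodes are created only once."""

    def __init__(self):
        self.nodes = []    # node id -> signature
        self.index = {}    # signature -> node id

    def _get(self, signature):
        # Return the existing node for this signature, or create a new one.
        if signature not in self.index:
            self.index[signature] = len(self.nodes)
            self.nodes.append(signature)
        return self.index[signature]

    def leaf(self, name):
        return self._get(("leaf", name))

    def node(self, op, left, right):
        return self._get((op, left, right))


dag = Dag()
a, b, c, d = (dag.leaf(x) for x in "abcd")
bc = dag.node("-", b, c)
left = dag.node("+", a, dag.node("*", a, bc))
bc_again = dag.node("-", b, c)             # common subexpression: reused
root = dag.node("+", left, dag.node("*", bc_again, d))
```

Here the second request for b-c returns the node created by the first, so the DAG has 9 nodes where the corresponding syntax tree would have 13.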

Example: DAG for expression:

Page 39: Type checking

3. Three-Address Code
• A three address code is the intermediate representation with at most one operator on the right side of an instruction.
• That is, no built-up arithmetic expressions are permitted.
• Thus, x+y*z might be translated into the sequence of three address instructions:
t1 = y*z
t2 = x + t1
where t1 and t2 are compiler generated temporary names.
• 3AC is close to assembly language, making machine code generation easier.
• 3AC is easy to generate from syntax trees or DAG. We associate a temporary with each interior tree node.

Page 40: Type checking

Forms of 3AC

• Assignment statements of the form x := y op z, where op is a binary arithmetic or logical operation.
• Assignment statements of the form x := op y, where op is a unary operator, such as unary minus or logical negation.
• Copy statements of the form x := y, which assign the value of y to x.
• Unconditional jumps goto L, which mean the statement with label L is the next to be executed.
• Conditional jumps, such as if x relop y goto L, where relop is a relational operator (<, =, >=, etc.) and L is a label. (If the condition x relop y is true, the statement with label L will be executed next.)
• Statements param x and call p, n for procedure calls, and return y, where y represents the (optional) returned value. The typical usage for a call p(x1, …, xn):
param x1
param x2
…
param xn
call p, n
• Index assignments of the form x := y[i] and x[i] := y. The first sets x to the value in the location i memory units beyond location y. The second sets the content of the location i units beyond x to the value of y.
• Address and pointer assignments:
x := &y
x := *y
*x := y

Page 41: Type checking

Representation of 3AC in data structure
• How to represent these instructions in a data structure? In a compiler the instructions in 3AC can be implemented as objects or records with fields for operator and operands. Three such representations are:
– Quadruples
– Triples
– Indirect triples

1. Quadruples
• Has four fields: op, arg1, arg2, result
• Exceptions:
– Unary operators: no arg2
– Operators like param: no arg2, no result
– (Un)conditional jumps: target label is the result

Page 42: Type checking

2. Triples

• Only three fields: op, arg1, arg2; no result field
• A result is referred to by the position of the instruction that computes it

Fig. Representation of a = b * - c + b * - c

(c) Three address code

Page 43: Type checking

3. Indirect Triples
• When instructions are moved around during optimization, quadruples are better than triples: triples refer to results by position, so moving an instruction changes the positions that other triples refer to.
• Indirect triples solve this problem.
• Indirect triples consist of a list of pointers to triples rather than a listing of the triples themselves. With this, an optimizing compiler can move an instruction by reordering the instruction list without affecting the triples themselves.
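The three representations can be compared on a = b * - c + b * - c. In this sketch the quadruple list is written out by hand, the operator name minus for unary minus and the function to_triples are assumptions, and a triple refers to an earlier result by its integer position.

```python
# Quadruples: (op, arg1, arg2, result)
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def to_triples(quads):
    """Drop the result field; temporaries become positions of earlier triples."""
    position = {}          # temporary name -> index of the triple computing it
    triples = []
    for op, arg1, arg2, result in quads:
        arg1 = position.get(arg1, arg1)
        arg2 = position.get(arg2, arg2)
        if op == "=":
            triples.append(("=", result, arg1))
        else:
            position[result] = len(triples)
            triples.append((op, arg1, arg2))
    return triples


triples = to_triples(quads)

# Indirect triples: a separate instruction list pointing into `triples`.
# Reordering this list moves instructions without renumbering any triple.
instruction_list = list(range(len(triples)))
```

In the result, triple (4) is ("+", 1, 3): its operands are the values computed by triples (1) and (3), not named temporaries.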

Page 44: Type checking

3AC for program constructs
• A program consists of assignment statements like a = b op c and control statements like if-then-else, while or for statements.
• This section deals with the generation of three address code for assignment statements and control statements.

1. Three-address code for assignment statement
• The three-address code for an assignment statement S is given in the following SDD. The attributes S.code and E.code denote the three-address code for S and E respectively.
• Attribute E.addr denotes the address that will hold the value of E. This address can be a name, a constant, or a compiler-generated temporary.
• For the production E -> id, the expression is a single identifier, say x, so x itself holds the value of the expression. The semantic rules for this production define E.addr to point to the symbol table entry for this instance of id. Let top denote the current symbol table. The function top.get( ) retrieves the entry when it is applied to id.lexeme of this instance of id. E.code is set to the empty string.
• For E -> (E1), the translation of E is the same as that of the subexpression E1. Hence, E.addr equals E1.addr and E.code = E1.code.
• For E -> E1 + E2, generate code to compute the value of E from the values of E1 and E2. Values are computed into newly generated temporary names.
• A sequence of distinct temporary names is created by successive calls to new Temp().
• The gen( ) function builds an instruction and returns it.

Page 45: Type checking


3AC for Assignment statement with expressions

Page 46: Type checking

Example: Generate three address code for the following arithmetic expression: a = - b * c

Three Address code:
t1 = -b
t2 = t1 * c
a = t2

Fig. Parse tree for the expression a = -b * c
[Figure: annotated parse tree with the following attribute values]
– b: E.addr = b, E.code = ‘ ’
– -b: E.addr = t1, E.code = t1 = -b
– c: E.addr = c, E.code = ‘ ’
– -b * c: E.addr = t2, E.code = t1 = -b; t2 = t1 * c
– S: S.code = t1 = -b; t2 = t1 * c; a = t2
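The attribute computation in this example can be reproduced with a short sketch. The class Gen, the tuple encoding of expressions and the tag neg for unary minus are assumptions; each call to expr returns the pair (E.addr, E.code), with the code kept as a list of instructions instead of one concatenated string.

```python
class Gen:
    """Syntax-directed generation of three-address code for assignments."""

    def __init__(self):
        self.count = 0

    def new_temp(self):
        self.count += 1
        return f"t{self.count}"

    def expr(self, e):
        """Return (E.addr, E.code) for expression e."""
        if isinstance(e, str):                 # E -> id: no code needed
            return e, []
        if e[0] == "neg":                      # E -> - E1
            addr1, code1 = self.expr(e[1])
            t = self.new_temp()
            return t, code1 + [f"{t} = -{addr1}"]
        op, e1, e2 = e                         # E -> E1 op E2
        addr1, code1 = self.expr(e1)
        addr2, code2 = self.expr(e2)
        t = self.new_temp()
        return t, code1 + code2 + [f"{t} = {addr1} {op} {addr2}"]

    def assign(self, name, e):
        """Return S.code for S -> id = E."""
        addr, code = self.expr(e)
        return code + [f"{name} = {addr}"]


code = Gen().assign("a", ("*", ("neg", "b"), "c"))
```

Each interior node of the expression receives one fresh temporary, which is why the result has exactly two temporaries followed by the copy into a.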

Page 47: Type checking

3AC generation for Array references
• Elements of an array are stored in consecutive memory locations.
• In C and Java, the array elements are numbered 0, 1, ..., n-1 for an array with n elements.
• If the width of each element is w, then the element A[i] is located at:
base + i * w ........ (1)
where base is the base address of the array, i.e., the address of its first element A[0].
• Example: Let A[10] be an array of 10 elements. Let the size of each element be 2, i.e., w = 2, and let the array be stored from memory location 1000, i.e., base address = 1000.
• Element A[3] (the fourth element) is at address 1000 + 3 * 2 = 1000 + 6 = 1006.

A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]

Page 48: Type checking

3AC generation for Array references

• More generally, the array elements need not start at 0. In a one-dimensional array, the array elements are numbered low, low+1, low+2, ..., high and base is the relative address of A[low].
• The address of A[i] can be rewritten as: base + (i - low) * w ..............(2)
• Formula (2) can be written as: i * w + base – low * w = i * w + c, where c = base – low * w.
• All the components of c are known at compile time, hence c can be pre-computed and stored. This reduces the work needed at run time to compute the address of A[i].
• We assume that c is saved in the symbol table entry for A, so the relative address of A[i] is obtained by simply adding i * w to c.
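Formulas (1) and (2) and the pre-computed constant c can be checked numerically. The function names element_address and make_accessor are hypothetical.

```python
def element_address(base, i, w, low=0):
    """Formula (2): base + (i - low) * w; with low = 0 this is formula (1)."""
    return base + (i - low) * w


def make_accessor(base, low, w):
    """Pre-compute c = base - low * w once, then use i * w + c per access."""
    c = base - low * w
    return lambda i: i * w + c


# Example from the slides: base = 1000, w = 2, indices start at 0.
address_of_a3 = element_address(1000, 3, 2)

# An array indexed 10..20 with 4-byte elements, stored from address 1000.
address_of = make_accessor(base=1000, low=10, w=4)
```

The accessor does one multiplication and one addition per reference, because the subtraction of low has been folded into c.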

Page 49: Type checking

One Dimensional Array Reference: Example

A: array [10 ... 20] of integers;
...
x := A[i]

Here low = 10, base is the address of A[low], i is the index and w = 4 is the width of an array element.

address of A[i] = base + (i – low) * w = i * w + c
where c = base – low * w, with low = 10 and w = 4

Page 50: Type checking

• In the case of a multi-dimensional array such as a matrix, elements are stored either in Row Major or in Column Major order. C and Pascal use row major storage, whereas Fortran uses column major storage.
• Example: Consider array A[3,3] with elements:
(0,0) (0,1) (0,2)
(1,0) (1,1) (1,2)
(2,0) (2,1) (2,2)

Row Major: (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2)

Column Major: (0,0) (1,0) (2,0) (0,1) (1,1) (2,1) (0,2) (1,2) (2,2)

The address of element A[i, j] in row major storage is given by the following expression:

address of A[i,j] = base + ((i - low1) * n2 + j - low2) * w ................. (3)

where low1 and low2 are the lower bounds of i and j, n2 is the number of columns and w is the size of each element.

The expression can be rewritten as:

address of A[i,j] = ((i * n2) + j) * w + (base – ((low1 * n2) + low2) * w) .......... (4)

The second part of expression (4) can be pre-computed from the values of base, low1, low2, n2 and w. This helps in faster computation of the address of A[i,j].
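Expressions (3) and (4) can be checked against each other with a small sketch. The function names are hypothetical, and base = 0 is an arbitrary choice for the check.

```python
def address_direct(base, i, j, low1, low2, n2, w):
    """Expression (3): row-major address of A[i, j]."""
    return base + ((i - low1) * n2 + (j - low2)) * w


def make_address_precomputed(base, low1, low2, n2, w):
    """Expression (4): the constant part c is computed once."""
    c = base - (low1 * n2 + low2) * w
    return lambda i, j: (i * n2 + j) * w + c


# A : array (1..2, 1..3) of 4-byte elements, stored from address 0.
address_of = make_address_precomputed(base=0, low1=1, low2=1, n2=3, w=4)
```

Both forms agree for every valid index pair; the second trades the two subtractions per access for one constant computed in advance.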

 

Page 51: Type checking

Example: 2D array referencing

A : array (1..2, 1..3) of integer;
... = A[i,j]

address of A[i,j] = baseA + ((i - low1) * n2 + j - low2) * w
= ((i * n2) + j) * w + c
where c = baseA – ((low1 * n2) + low2) * w
with low1 = 1, low2 = 1, n2 = 3, w = 4

Three-address code:
t1 = i * 3
t2 = t1 + j
t3 = t2 * 4
t4 = c
t5 = t4[t3]
... = t5

Page 52: Type checking

Translation of Array references
• The main problem in generating code for array references is to relate the address calculation formulas to the grammar for array references.
• Let nonterminal L generate an array name followed by a sequence of index expressions:
L -> L [ E ] | id [ E ]
• In this translation scheme the gen( ) function builds an instruction and incrementally emits it into the stream of generated instructions.
• The nonterminal L has three synthesized attributes:
– L.addr denotes a temporary that is used while computing the offset for the array reference generated by L.
– L.array is a pointer to the symbol table entry for the array name. The base address of the array (the address of its 0th element), say L.array.base, is used to determine the actual l-value of an array reference after all the index expressions are analyzed. The location for the array reference is therefore L.array.base[L.addr].
– L.type is the type of the subarray generated by L. For any type t, we assume that its width is given by t.width and that t.elem gives the element type.
• The translation scheme is shown below:

Page 53: Type checking


Translation of Array references

Page 54: Type checking

Example: Compute 3AC for the expression c + a[i][j], where c, i and j are all integers and a is a 2x3 integer array.

Three Address Code:
t1 = i * 12
t2 = j * 4
t3 = t1 + t2
t4 = a[t3]
t5 = c + t4

Fig. Annotated Parse Tree for c + a[ i ][ j ]
[Figure: annotated parse tree. Recoverable annotations: a.type = array(2, array(3, integer)); for a[i]: L.array = a, L.type = array(3, integer), L.addr = t1; for a[i][j]: L.array = a, L.type = integer, L.addr = t3; index expressions with addr = i and addr = j; operands of +: E.addr = c and E.addr = t4; root: E.addr = t5]
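The offset computation in this example can be sketched as a small generator. The function array_ref_3ac and its parameters are hypothetical. It assumes zero-based indices and derives the width of each subarray from the dimensions and the element width, so a 2x3 array of 4-byte integers gives widths 12 and 4.

```python
def array_ref_3ac(name, dims, elem_width, indices):
    """Emit 3AC that loads name[i1][i2]...; returns (code, result temporary)."""
    # Width selected by each successive index, e.g. dims (2, 3), width 4 -> [12, 4].
    widths = []
    w = elem_width
    for d in reversed(dims):
        widths.append(w)
        w *= d
    widths.reverse()

    code = []
    count = 0

    def new_temp():
        nonlocal count
        count += 1
        return f"t{count}"

    offset = None
    for index, width in zip(indices, widths):
        t = new_temp()
        code.append(f"{t} = {index} * {width}")
        if offset is None:
            offset = t
        else:
            total = new_temp()
            code.append(f"{total} = {offset} + {t}")
            offset = total
    result = new_temp()
    code.append(f"{result} = {name}[{offset}]")
    return code, result


code, addr = array_ref_3ac("a", (2, 3), 4, ["i", "j"])
code.append(f"t5 = c + {addr}")
```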

Page 55: Type checking

Flow-of-control statements

• Control statements are used to alter the sequential flow of execution.
• Some of the control statements are the if-then, if-then-else, while and do-while statements:
– S -> if ( E ) S1
– S -> if ( E ) S1 else S2
– S -> while ( E ) S1
– S -> do S1 while ( E )
• Three address code for if-then, if-then-else and while-do statements can be generated using the translation rules given in the following slides.
• In the translation rules, both S and E have a synthesized attribute code, which gives the translation into three-address instructions.
• For simplicity, the translations S.code and E.code are built up as strings using an SDD.
• The translation of S -> if (E) S1 consists of E.code followed by S1.code, as shown in the figure.


Three Address Code generation for if then statement

Statement: S -> if E then S1
Translation rules:
    E.true = newlabel();
    E.false = S.next;
    S1.next = S.next;
    S.code = E.code || label(E.true, ‘:’) || S1.code

Example: Generate 3-address code for the statement: if a > b then x = y + z.
Ans: The 3AC for the given statement is:
    if a > b then goto L1
    goto L2
L1: t1 = y + z
    x = t1
L2: ....

Fig. Code layout for the if-then statement: E.code (containing the jumps to E.true and to E.false), followed by E.true: S1.code, followed by E.false: ...
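
The if-then rules can be sketched as a small Python function. This is a minimal sketch under stated assumptions: label_source and gen_if_then are hypothetical names, the condition is passed as a ready-made relational test, and when no S.next is supplied the function allocates one and emits it itself, a job that would normally belong to the enclosing production.

```python
from itertools import count


def label_source():
    """Yield fresh labels L1, L2, ... (stands in for newlabel())."""
    return (f"L{n}" for n in count(1))


def gen_if_then(cond, body, labels, s_next=None):
    """Three-address code for: if cond then body."""
    e_true = next(labels)              # E.true = newlabel()
    top_level = s_next is None
    if top_level:
        s_next = next(labels)          # normally inherited from the parent
    e_false = s_next                   # E.false = S.next
    code = [f"if {cond} then goto {e_true}", f"goto {e_false}"]  # E.code
    code.append(f"{e_true}:")          # label(E.true, ':')
    code.extend(body)                  # S1.code
    if top_level:
        code.append(f"{s_next}:")      # the parent would emit this label
    return code


code = gen_if_then("a > b", ["t1 = y + z", "x = t1"], label_source())
print("\n".join(code))
```

For the statement if a > b then x = y + z this yields the same instructions as the worked example, with each label printed on its own line.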


Three Address Code generation for if then else statement

Production: S -> if E then S1 else S2
Semantic Rules:
    E.true = newlabel();
    E.false = newlabel();
    S1.next = S.next;
    S2.next = S.next;
    S.code = E.code || label(E.true, ‘:’) || S1.code || gen(‘GOTO’, S.next) || label(E.false, ‘:’) || S2.code

Example: Generate 3-address code for the statement: if a > b then x = y + z else x = y - z

The three address code is given below:
    if a > b then goto L1
    goto L2
L1: t1 = y + z
    x = t1
    goto L3
L2: t1 = y - z
    x = t1
L3: ............
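
The if-then-else rules differ from if-then in two places: E.false gets its own label, and a jump to S.next separates the two branches. A minimal Python sketch follows; gen_if_else and label_source are hypothetical names, and allocating S.next inside the function when none is passed is an assumption made so that the example is self-contained.

```python
from itertools import count


def label_source():
    """Yield fresh labels L1, L2, ... (stands in for newlabel())."""
    return (f"L{n}" for n in count(1))


def gen_if_else(cond, then_body, else_body, labels, s_next=None):
    """Three-address code for: if cond then then_body else else_body."""
    e_true = next(labels)              # E.true = newlabel()
    e_false = next(labels)             # E.false = newlabel()
    top_level = s_next is None
    if top_level:
        s_next = next(labels)          # normally inherited from the parent
    code = [f"if {cond} then goto {e_true}", f"goto {e_false}"]  # E.code
    code.append(f"{e_true}:")
    code.extend(then_body)             # S1.code
    code.append(f"goto {s_next}")      # skip over the else branch
    code.append(f"{e_false}:")
    code.extend(else_body)             # S2.code
    if top_level:
        code.append(f"{s_next}:")
    return code


code = gen_if_else("a > b",
                   ["t1 = y + z", "x = t1"],
                   ["t1 = y - z", "x = t1"],
                   label_source())
print("\n".join(code))
```

Without the goto after S1.code, control would fall through from the then branch into the else branch.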


Three Address Code generation for while do statement

S.code: three-address code for evaluating S
S.begin: label to start of S
S.next: label to end of S

Production: S -> while E do S1
Semantic rules:
    S.begin = newlabel();
    E.true = newlabel();
    E.false = S.next;
    S1.next = S.begin;
    S.code = label(S.begin, ‘:’) || E.code || label(E.true, ‘:’) || S1.code || gen(‘GOTO’, S.begin)
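
The while rules can be sketched the same way. In this minimal Python sketch, gen_while and label_source are hypothetical names, and S.next is allocated inside the function when the caller does not pass one, which is an assumption for the sake of a self-contained example.

```python
from itertools import count


def label_source():
    """Yield fresh labels L1, L2, ... (stands in for newlabel())."""
    return (f"L{n}" for n in count(1))


def gen_while(cond, body, labels, s_next=None):
    """Three-address code for: while cond do body."""
    begin = next(labels)               # S.begin = newlabel()
    e_true = next(labels)              # E.true = newlabel()
    top_level = s_next is None
    if top_level:
        s_next = next(labels)          # normally inherited from the parent
    code = [f"{begin}:"]               # label(S.begin, ':')
    code += [f"if {cond} then goto {e_true}", f"goto {s_next}"]  # E.code
    code.append(f"{e_true}:")
    code.extend(body)                  # S1.code, with S1.next = S.begin
    code.append(f"goto {begin}")       # re-test the condition
    if top_level:
        code.append(f"{s_next}:")
    return code


code = gen_while("a > b", ["t1 = y + z", "x = t1"], label_source())
print("\n".join(code))
```

For while a > b do x = y + z the sketch produces the loop header label, the test, the body, and the backward jump in that order.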


Example 1: Generate 3-address code for the statement: while a > b do x = y + z.

The three address code is given below:
L1: if a > b then goto L2
    goto L3
L2: t1 = y + z
    x = t1
    goto L1
L3: .......

Example 2: Generate 3-address code for the statements:
    i = 2 * n + k
    while i do i = i - k

The three address code is given below:
    t1 = 2
    t2 = t1 * n
    t3 = t2 + k
    i = t3
L1: if i != 0 then goto L2
    goto L3
L2: t4 = i - k
    i = t4
    goto L1
L3: .......


Example 3: Generate 3AC for the statement:
    while a < b do
        if c < d then
            x = y + z
        else
            x = y - z

Solution: The three address code will be as follows:
L1:    if a < b then GOTO L2
       GOTO LNEXT
L2:    if c < d then GOTO L3
       GOTO L4
L3:    t1 = y + z
       x = t1
       GOTO L1
L4:    t1 = y - z
       x = t1
       GOTO L1
LNEXT:


Example 4: Generate 3AC for the statement:
    c = 0
    do
        if (a < b) x++ else x--
        c++
    while (c < 5)

Solution: The three address code is given below:
    1. c = 0
    2. if (a < b) GOTO (4)
    3. GOTO (7)
    4. t1 = x + 1
    5. x = t1
    6. GOTO (9)
    7. t2 = x - 1
    8. x = t2
    9. t3 = c + 1
    10. c = t3
    11. if (c < 5) GOTO (2)
    12. ------------------------

Alternate Method:
    c = 0
L1: if (a < b) GOTO L2
    GOTO L3
L2: t1 = x + 1
    x = t1
    GOTO L4
L3: t2 = x - 1
    x = t2
L4: t3 = c + 1
    c = t3
    if (c < 5) GOTO L1
    ----------
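
The slides give no semantic rules for S -> do S1 while ( E ), but the pattern in this example suggests one label and one conditional backward jump. The following Python function is a minimal sketch of that pattern; gen_do_while and label_source are hypothetical names, and the rule itself is inferred from the example rather than taken from the source.

```python
from itertools import count


def label_source():
    """Yield fresh labels L1, L2, ... (stands in for newlabel())."""
    return (f"L{n}" for n in count(1))


def gen_do_while(body, cond, labels):
    """Three-address code for: do body while (cond)."""
    begin = next(labels)                     # label at the top of the body
    code = [f"{begin}:"]
    code.extend(body)                        # S1.code runs before the test
    code.append(f"if {cond} goto {begin}")   # loop back while cond holds
    return code


code = gen_do_while(["t3 = c + 1", "c = t3"], "c < 5", label_source())
print("\n".join(code))
```

Because the body runs before the test, no separate exit jump is needed: when the condition fails, control simply falls through to the next instruction.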


Example 5: Generate three address code for the following C program:
    int a[10], b[10], dot_product, i;
    dot_product = 0;
    for (i = 0; i < 10; i++)
        dot_product += a[i] * b[i];

Intermediate Code:
    dot_product = 0
    i = 0
L1: if (i >= 10) GOTO L2
    t1 = addr(a)      // c = base - low * w = base, since low = 0
    t2 = i * 4
    t3 = t1[t2]
    t4 = addr(b)
    t5 = i * 4
    t6 = t4[t5]
    t7 = t3 * t6
    t8 = dot_product + t7
    dot_product = t8
    t9 = i + 1
    i = t9
    GOTO L1
L2: -------------
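
One way to check a three-address listing such as this one is to execute it. The sketch below is a toy interpreter and not part of the slides: the tuple encoding of instructions, the operation names, and the function run_tac are all assumptions, and the addr(a) step is folded into the load instruction for brevity.

```python
def run_tac(program, arrays, width=4):
    """Execute tuple-encoded three-address instructions; return variables."""
    env = {}
    # Map each label to the index of its instruction.
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}

    def val(x):
        # Strings name variables or temporaries; anything else is a constant.
        return env[x] if isinstance(x, str) else x

    pc = 0
    while pc < len(program):
        ins = program[pc]
        op = ins[0]
        if op == "copy":                    # dst = src
            env[ins[1]] = val(ins[2])
        elif op == "add":                   # dst = a + b
            env[ins[1]] = val(ins[2]) + val(ins[3])
        elif op == "mul":                   # dst = a * b
            env[ins[1]] = val(ins[2]) * val(ins[3])
        elif op == "load":                  # dst = array[byte_offset]
            env[ins[1]] = arrays[ins[2]][val(ins[3]) // width]
        elif op == "if_ge":                 # if a >= b goto label
            if val(ins[1]) >= val(ins[2]):
                pc = labels[ins[3]]
                continue
        elif op == "goto":
            pc = labels[ins[1]]
            continue
        pc += 1                             # labels fall through as no-ops
    return env


dot_program = [
    ("copy", "dot_product", 0),
    ("copy", "i", 0),
    ("label", "L1"),
    ("if_ge", "i", 10, "L2"),
    ("mul", "t2", "i", 4),
    ("load", "t3", "a", "t2"),
    ("mul", "t5", "i", 4),
    ("load", "t6", "b", "t5"),
    ("mul", "t7", "t3", "t6"),
    ("add", "t8", "dot_product", "t7"),
    ("copy", "dot_product", "t8"),
    ("add", "t9", "i", 1),
    ("copy", "i", "t9"),
    ("goto", "L1"),
    ("label", "L2"),
]

a = list(range(10))
b = [2] * 10
result = run_tac(dot_program, {"a": a, "b": b})
print(result["dot_product"])
```

Running the listing on sample arrays and comparing against a direct computation of the dot product is a quick way to catch mistakes such as a wrong operator or a mistyped temporary.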


Example 6: Generate three address code for the following C program:
    int a[10], b[10], dot_product, i;
    int *a1, *b1;
    dot_product = 0;
    a1 = a; b1 = b;
    for (i = 0; i < 10; i++)
        dot_product += *a1++ * *b1++;

Intermediate Code:
    dot_product = 0
    a1 = &a
    b1 = &b
    i = 0
L1: if (i >= 10) GOTO L2
    t3 = *a1
    t4 = a1 + 1
    a1 = t4
    t5 = *b1
    t6 = b1 + 1
    b1 = t6
    t7 = t3 * t5
    t8 = dot_product + t7
    dot_product = t8
    t9 = i + 1
    i = t9
    GOTO L1
L2: -------------


Logical Expression
• Logical operators are mainly used in flow-control statements like if-then-else, while-do and repeat-until.
• The not operation has the highest precedence, followed by and; or has the lowest precedence.
• A logical expression always results in one of the values true or false.
• True can be treated as a nonzero, non-negative or 1 value, whereas false may be 0 or a negative value.

Production: E → E1 or E2
Translation rules:
    E1.true = E.true
    E1.false = newlabel()
    E2.true = E.true
    E2.false = E.false
    E.code = E1.code || label(E1.false) || E2.code

Production: E → E1 and E2
Translation rules:
    E1.true = newlabel()
    E1.false = E.false
    E2.true = E.true
    E2.false = E.false
    E.code = E1.code || label(E1.true) || E2.code


SDD for translation of Boolean Expression to 3AC

Production: E → not E1
    E1.true = E.false
    E1.false = E.true
    E.code = E1.code

Production: E → E1 rel E2
    E.code = E1.code || E2.code || gen(‘if’ E1.addr rel.op E2.addr ‘goto’ E.true) || gen(‘goto’ E.false)

Production: E → ( E1 )
    E.value = E1.value

Production: E → true
    E.code = gen(‘goto’ E.true)

Production: E → false
    E.code = gen(‘goto’ E.false)

Example: Generate 3AC from the following statement: a or b and not c.

The 3AC for the above expression will be as follows:
    t1 = not c
    t2 = b and t1
    t3 = a or t2


Example: Consider the following statement and translate it into three address code:
    if (x < 100 || x > 200 && x != y) x = 0;

Three address code:
    if x < 100 goto L2
    goto L3
L3: if x > 200 goto L4
    goto L1
L4: if x != y goto L2
    goto L1
L2: x = 0
L1: ......
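
The jumping code in this example follows from the or, and, not and rel rules applied recursively. Below is a minimal Python sketch of those rules; the tuple encoding of expressions and the names jump_code and label_source are assumptions made for illustration, and relational tests are passed as ready-made strings instead of being built from E1.addr and E2.addr.

```python
from itertools import count


def label_source():
    """Yield fresh labels L1, L2, ... (stands in for newlabel())."""
    return (f"L{n}" for n in count(1))


def jump_code(e, e_true, e_false, labels):
    """Short-circuit code for e, jumping to e_true or e_false."""
    op = e[0]
    if op == "rel":                        # E -> E1 rel E2
        return [f"if {e[1]} goto {e_true}", f"goto {e_false}"]
    if op == "not":                        # swap the two targets
        return jump_code(e[1], e_false, e_true, labels)
    if op == "or":
        e1_false = next(labels)            # E1.false = newlabel()
        return (jump_code(e[1], e_true, e1_false, labels)
                + [f"{e1_false}:"]
                + jump_code(e[2], e_true, e_false, labels))
    if op == "and":
        e1_true = next(labels)             # E1.true = newlabel()
        return (jump_code(e[1], e1_true, e_false, labels)
                + [f"{e1_true}:"]
                + jump_code(e[2], e_true, e_false, labels))
    raise ValueError(f"unknown operator: {op}")


labels = label_source()
s_next = next(labels)                      # S.next (L1)
e_true = next(labels)                      # E.true (L2)
cond = ("or", ("rel", "x < 100"),
        ("and", ("rel", "x > 200"), ("rel", "x != y")))
code = jump_code(cond, e_true, s_next, labels)
code += [f"{e_true}:", "x = 0", f"{s_next}:"]
print("\n".join(code))
```

Note that and binds tighter than or, so the expression tree has or at the root; the second operand is evaluated only when x < 100 fails.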


Three address code for procedure call

S → call id ( Elist )    { for each item p on queue do produce(‘param’ p);
                           produce(‘call’ id.value |queue|) }
Elist → Elist , E        { append E.value to the end of queue }
Elist → E                { initialize queue to contain only E.value }

Example: 1


Example 2
Consider the statement n = f ( a [ i ] ), where a is an array of integers and f is a function from integers to integers.
Three Address Code:
    t1 = i * 4
    t2 = a [ t1 ]
    param t2
    t3 = call f, 1
    n = t3
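
The queue-based scheme for calls amounts to emitting one param instruction per argument, in order, followed by a call that carries the argument count. A minimal Python sketch is shown here; gen_call is a hypothetical name, and the optional result parameter is an assumption added to cover calls whose value is assigned to a temporary.

```python
def gen_call(func, arg_addrs, result=None):
    """Three-address code for a call whose arguments are already evaluated.

    arg_addrs -- addresses (names or temporaries) of the arguments, in order;
                 this list plays the role of the queue in the scheme
    result    -- temporary that receives the return value, if any
    """
    code = [f"param {p}" for p in arg_addrs]      # one param per queued item
    call = f"call {func}, {len(arg_addrs)}"       # |queue| is the arg count
    code.append(f"{result} = {call}" if result else call)
    return code


# n = f(a[i]): evaluate the argument first, then emit the call sequence.
code = ["t1 = i * 4", "t2 = a[t1]"]
code += gen_call("f", ["t2"], result="t3")
code.append("n = t3")
print("\n".join(code))
```

The argument code comes before any param instruction, so that the param instructions for one call stay adjacent to its call instruction.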


Example 3:
    int main()
    {
        int p; int a[10]; int b[10];
        p = dot_product(a, b);
    }

Intermediate code:
    func begin main
    param a
    param b
    p = call dot_product, 2
    func end

    int dot_product(int x[], int y[])
    {
        int d, i;
        d = 0;
        for (i = 0; i < 10; i++)
            d += x[i] * y[i];
        return d;
    }

Intermediate code:
    func begin dot_product
    d = 0
    i = 0
L1: if (i >= 10) goto L2
    t1 = addr(x)
    t2 = i * 4
    t3 = t1[t2]
    t4 = addr(y)
    t5 = i * 4
    t6 = t4[t5]
    t7 = t3 * t6
    t8 = d + t7
    d = t8
    t9 = i + 1
    i = t9
    goto L1
L2: return d
    func end


Example 4: Write 3AC for the following code:
    int fact(int n)
    {
        if (n == 0) return 1;
        else return (n * fact(n - 1));
    }

Intermediate Code:
    func begin fact
    if (n == 0) goto L1
    t1 = n - 1
    param t1
    t2 = call fact, 1
    t3 = n * t2
    return t3
L1: return 1
    func end


Thank You !