Type Checking
Prepared By: Dabbal Singh Mahara, 2016
Contents
• Type Checking
• Run-time Environments
• Intermediate Code Generation
Dabbal Mahara 2
Type Checking
• A type is a set of values together with a set of operations that can be performed on them.
• Type checking verifies that each operation in a program receives the appropriate number of arguments, of the appropriate types, in the appropriate order.
• The purpose of type checking is to verify that the operations performed on a value are in fact permissible.
• Certain operations are legal for values of each type:
– It doesn’t make sense to add a function pointer and an integer in C.
– It does make sense to add two integers.
• The type of an identifier is typically available from its declaration, but the checker must also keep track of the types of intermediate expressions.
• Type errors arise when operations are performed on values that do not support them.
Type Systems
• A language’s type system specifies which operations are valid for which types.
• Type systems provide a concise formalization of the semantic checking rules.
• A type system defines a set of types and a set of rules that assign types to programming language constructs. The rules are often stated informally, for example: “if both operands of addition are of type integer, then the result is of type integer”.
• Type checking is the process of verifying that the program obeys the type system.
• A type checker implements a type system.
• A sound type system eliminates the need for run-time checking for type errors, such as:
– Memory errors: reading from an invalid pointer, etc.
– Violations of abstraction boundaries.
Type Checking Overview
Three kinds of languages:
• Statically typed: all or almost all type checking is done as part of compilation (C, ML, Java)
• Dynamically typed: almost all type checking is done as part of program execution (Scheme, Prolog)
• Untyped: no type checking (machine code)
• Static typing proponents say:
– Static checking catches many programming errors at compile time
– It avoids the overhead of run-time type checks
• Dynamic typing proponents say:
– Static type systems are restrictive
– Rapid prototyping is easier in a dynamic type system
Static Checking
• Refers to the compile-time checking of programs in order to ensure that the semantic conditions of the language are being followed.
• Examples of static checks include:
– Type checks
– Flow-of-control checks
– Uniqueness checks
– Name-related checks
• Flow-of-control checks: statements that cause flow of control to leave a construct must have some place to which control can be transferred; e.g., break statements in C.
• Uniqueness checks: a language may dictate that in some contexts an entity can be defined exactly once; e.g., identifier declarations, labels, values in case expressions.
• Name-related checks: sometimes the same name must appear two or more times; e.g., in Ada a loop or block can have a name that must then appear both at the beginning and at the end.
Type Expression
• A language usually provides a set of base types together with ways to construct other types using type constructors.
• Type expressions let us represent the types that are defined in a program.
• A base type is a type expression:
– a primitive data type such as integer, real, char, boolean, …
– type_error: signals an error during type checking
– void: no type
• A type name (e.g., a record name) is a type expression.
• A type constructor applied to type expressions is a type expression. E.g.,
– arrays: If T is a type expression and I is a range of integers, then array(I,T) is a type expression.
– records: If T1, …, Tn are type expressions and f1, …, fn are field names, then record((f1,T1),…,(fn,Tn)) is a type expression.
– pointers: If T is a type expression, then pointer(T) is a type expression. Ex: pointer(int)
– functions: If T1, …, Tn, and T are type expressions, then so is (T1,…,Tn) → T. Ex: int → int represents the type of a function that takes an int parameter and returns an int.
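One possible way to hold type expressions in a compiler is as small immutable objects, one class per constructor. The sketch below is only an illustration: the class names Base, Array, Pointer, Record and Function, and the sample types built at the end, are assumptions and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Base:
    name: str                      # "integer", "char", "void", "type_error", ...

@dataclass(frozen=True)
class Array:
    index_range: Tuple[int, int]   # the range I in array(I, T)
    elem: object                   # the element type T

@dataclass(frozen=True)
class Pointer:
    to: object                     # the type T in pointer(T)

@dataclass(frozen=True)
class Record:
    fields: Tuple[Tuple[str, object], ...]   # ((f1, T1), ..., (fn, Tn))

@dataclass(frozen=True)
class Function:
    params: Tuple[object, ...]     # (T1, ..., Tn)
    result: object                 # T

INT = Base("integer")
int_to_int = Function((INT,), INT)           # int -> int
ptr_int = Pointer(INT)                       # pointer(int)
# array(0..9, record((key, int), (next, pointer(int))))
table = Array((0, 9), Record((("key", INT), ("next", ptr_int))))
```

Because the objects are frozen dataclasses, two type expressions built the same way compare equal, which is convenient when the checker later has to compare types.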
A Simple Type Checking System
Specification of Simple Type Checker
• A simple type checking translation scheme for declarations is given in the following figure.
• The basic types are: character and integer.
• The constructed types are: array and pointer.
• The type attribute is added to each symbol.
• The declaration should come before the usage of the variable.
Type checking for expression
• The synthesized attribute type for E gives the type that the type system assigns to the expression generated by E.
• The function lookup returns the type of id.
• The following figure shows the type checking for the expressions.
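Since the figure is not reproduced here, the following is a minimal sketch of how a synthesized type attribute could be computed bottom-up. The tuple encoding of expressions and types, the sample symbol table, and the particular rules (mod, indexing, dereference) are assumptions modelled on typical textbook rules, not the exact scheme of the figure.

```python
# Types: "integer", "char", ("array", size, elem_type), ("pointer", target_type)
symtab = {"i": "integer", "a": ("array", 10, "char"), "p": ("pointer", "integer")}

def lookup(name):
    """Return the declared type of an identifier, or type_error if undeclared."""
    return symtab.get(name, "type_error")

def typeof(e):
    """Compute the synthesized attribute E.type for an expression tree."""
    kind = e[0]
    if kind == "num":
        return "integer"
    if kind == "literal":
        return "char"
    if kind == "id":
        return lookup(e[1])
    if kind == "mod":                      # E -> E1 mod E2
        left, right = typeof(e[1]), typeof(e[2])
        return "integer" if left == right == "integer" else "type_error"
    if kind == "index":                    # E -> E1 [ E2 ]
        arr, idx = typeof(e[1]), typeof(e[2])
        if idx == "integer" and isinstance(arr, tuple) and arr[0] == "array":
            return arr[2]
        return "type_error"
    if kind == "deref":                    # E -> E1 ^
        t = typeof(e[1])
        if isinstance(t, tuple) and t[0] == "pointer":
            return t[1]
        return "type_error"
    return "type_error"

ok = typeof(("index", ("id", "a"), ("num", 3)))     # a[3]
bad = typeof(("mod", ("id", "i"), ("id", "a")))     # i mod a
```

Here ok evaluates to "char" and bad to "type_error", so an error in a subexpression propagates upward as the type of the whole expression.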
Type checking for statements
• In some languages statements have a type associated with them, while other languages don’t assign types to statements.
• In the latter case, a statement that is type safe is given the type void, to distinguish it from one that contains a type error.
• If an error occurs within a statement, then the type assigned to this statement is type_error.
Type checking for functions
• The application of a function to an argument can be captured by the following productions:
– Function type declaration: T → T '->' T
– Function call: E → E ( E )
Type Conversion and Coercion
• Since integers and reals are represented differently within a computer, different machine instructions are used for operations on integers and on reals. If different parts of an expression have different types, type conversion is often required.
• For example, in the expression z = x + y, what is the type of z if x is integer and y is real?
• The compiler has to convert one of the operands so that both operands have the same type.
• In many languages type conversion is explicit, for example using type casts; i.e., the conversion must be specified, as in inttoreal(x).
• Type conversions that happen implicitly are called coercions. An implicit type conversion is carried out by the compiler, which recognizes a type incompatibility and inserts a call to a conversion routine (for example, something like inttoreal(int)) that takes a value of the original type and returns a value of the required type.
• The coercion of expressions is given in the following figure.
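As an illustration of the idea, the sketch below coerces the operands of an addition and emits three-address-style instructions. The function name coerce_add, the (address, type) pair encoding and the operator names int+ and real+ are assumptions made for this example; only inttoreal is taken from the text.

```python
import itertools

def coerce_add(x, y, code, temps):
    """x and y are (addr, type) pairs; appends instructions to code and
    returns the (addr, type) pair of the sum."""
    (xa, xt), (ya, yt) = x, y
    if xt == "integer" and yt == "real":
        t = next(temps)
        code.append(f"{t} = inttoreal {xa}")    # widen the integer operand
        xa, xt = t, "real"
    elif xt == "real" and yt == "integer":
        t = next(temps)
        code.append(f"{t} = inttoreal {ya}")
        ya, yt = t, "real"
    if xt != yt or xt not in ("integer", "real"):
        return (None, "type_error")
    op = "int+" if xt == "integer" else "real+"
    t = next(temps)
    code.append(f"{t} = {xa} {op} {ya}")
    return (t, xt)

code = []
temps = (f"t{i}" for i in itertools.count(1))   # t1, t2, ...
z = coerce_add(("x", "integer"), ("y", "real"), code, temps)
```

For z = x + y with x integer and y real, this emits t1 = inttoreal x followed by t2 = t1 real+ y, and the result type is real.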
Type Conversion and Coercion (Contd.)
Structural Equivalence of Type Expressions
• The basic question is "when are two type expressions equivalent?"
• Two type expressions are structurally equivalent if they are the same basic type or are formed by applying the same constructor to structurally equivalent types.
Example: int a, b;
Here the types of a and b are structurally equivalent.
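The definition is recursive, so a checker for it can be written as a short recursive function. The sketch below assumes types are encoded as strings (basic types) or tuples whose first element names the constructor; the name sequiv and the encoding are illustrative assumptions. Note that this version also compares array bounds, which some textbook formulations ignore.

```python
def sequiv(s, t):
    """Return True if type expressions s and t are structurally equivalent."""
    if not (isinstance(s, tuple) and isinstance(t, tuple)):
        return s == t                      # basic types (and bounds) compare directly
    # Same constructor, same arity, and pairwise equivalent components.
    return len(s) == len(t) and all(sequiv(a, b) for a, b in zip(s, t))

same = sequiv(("pointer", ("array", 10, "integer")),
              ("pointer", ("array", 10, "integer")))
different = sequiv(("array", 10, "integer"), ("array", 10, "char"))
```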
Run-time Environments
• A compiler must accurately implement the abstractions embodied in the source language definition. These abstractions typically include concepts such as names, scopes, bindings, data types, operators, procedures, parameters and flow-of-control constructs.
• The compiler must co-operate with the operating system and other system software to support these abstractions on the target machine.
• To do so, the compiler creates and manages a run-time environment in which the target code is executed.
• By runtime, we mean a program in execution.
• The runtime environment is a state of the target machine, which may include software libraries, environment variables, etc., that provide services to the processes running in the system.
Run-time Environment...
• The runtime support system is a package, mostly generated with the executable program itself, that facilitates communication between the process and the runtime environment. It takes care of memory allocation and de-allocation while the program is being executed.
• This environment deals with a number of issues, such as the layout and allocation of storage locations for the objects named in the source program, the mechanisms used by the target program to access variables, the linkages between procedures, the mechanisms for passing parameters, and the interfaces to the operating system, input/output devices and other programs.
• That is,
‣ Management of run-time resources
‣ Correspondence between static (compile-time) and dynamic (run-time) structures
‣ Storage organization
Run-time Resources
• Execution of a program is initially under the control of the operating system (OS).
• When a program is invoked:
‣ The OS allocates space for the program
‣ The code is loaded into part of this space
‣ The OS jumps to the entry point of the program (i.e., to the beginning of the “main” function)
Memory Layout: Storage Organization
[Figure: memory layout of a running program, drawn from Low Address to High Address]
Correspondence between Static and Dynamic Structures
• The compiler must do the storage allocation and provide access to variables and data.
• At run time, we need a system to map NAMES (in the source program) to STORAGE on the machine.
• Allocation and de-allocation of memory is handled by a RUN-TIME SUPPORT SYSTEM, typically linked and loaded along with the compiled target code.
• One of the primary responsibilities of the run-time system is to manage ACTIVATIONS of procedures.
• Procedure execution begins at the first statement of the procedure body.
• When a procedure returns, execution resumes at the instruction immediately following the procedure call.
Activation and Activation Tree
• Every execution of a procedure is called an ACTIVATION.
• The LIFETIME of an activation of procedure P is the sequence of steps between the first and last steps of P’s body, including any procedures called while P is running.
• Normally, when control flows from one activation to another, it must (eventually) return to the same activation.
• If a procedure is recursive, a new activation can begin before an earlier activation of the same procedure has ended.
• We can represent the activations of procedures during the running of an entire program by a tree, called an activation tree.
• The activation tree shows the way control enters and leaves activations. In an activation tree:
– Each node represents an activation of a procedure.
– The root represents the activation of the main program.
– The node a is a parent of the node b if control flows from a to b.
– The node a is to the left of the node b if the lifetime of a occurs before the lifetime of b.
Procedure Activations: Example
Procedure Activation: Example (contd...)
• The example is a sketch of a program that reads nine integers into an array a and sorts them using the recursive quicksort algorithm.
• The main function has three tasks: it calls readarray, sets the sentinels and then calls quicksort on the entire data array.
• The figure on the right shows the sequence of calls that might result from an execution of the program. In this execution, the call to partition(1,9) returns 4, so a[1] to a[3] hold elements less than the chosen separator value v, while the larger elements are in a[5] through a[9].
• In this example, procedure activations are nested in time.
Activation Tree: During an Execution of quicksort
• The figure shows one possible activation tree for the sequence of calls and returns in the above program.
• The functions are represented by the first letters of their names.
• Remember that this tree is only one possibility, since the arguments of subsequent calls, and also the number of calls along any branch, are influenced by the values returned by partition.
Control Stack
• Procedure calls and returns are managed by a run-time stack called the control stack.
• Each live activation has a frame, known as an activation record, on the control stack. The root of the activation tree is at the bottom, and the stack holds the sequence of activations on the path in the activation tree from the root to the activation where control currently resides. That latest activation has its record at the top of the stack.
• The stack keeps track of currently-active procedure activations.
– An activation record is pushed onto the control stack as the activation starts.
– That activation record is popped when that activation ends.
• At any point in time, the control stack represents a path from the root of the activation tree to one of the nodes.
• The flow of control in a program corresponds to a depth-first traversal of the activation tree that:
– starts at the root,
– visits a node before its children, and
– recursively visits the children at each node in left-to-right order.
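The correspondence between the depth-first traversal and the stack contents can be made concrete with a small simulation. The activation tree used below is a truncated, hypothetical fragment of the quicksort example (main, readarray, quicksort(1,9), partition(1,9), quicksort(1,3), quicksort(5,9)); the names activation, control_stack and trace are assumptions for this sketch.

```python
control_stack = []   # names of the live activations, bottom first
trace = []           # snapshot of the stack each time an activation starts

def activation(name, callees=()):
    """Simulate one activation: push on entry, run callees, pop on exit."""
    control_stack.append(name)
    trace.append(list(control_stack))
    for callee in callees:          # children are visited left to right
        activation(*callee)
    control_stack.pop()

# Activation tree as nested (name, children) tuples.
tree = ("m", [
    ("r", []),
    ("q(1,9)", [("p(1,9)", []), ("q(1,3)", []), ("q(5,9)", [])]),
])
activation(*tree)
```

Every snapshot in trace is a path from the root m to the node being entered, and the stack is empty again once the root activation ends.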
Activation Records
• Information needed by a single execution of a procedure is managed using a contiguous block of storage called an activation record.
• An activation record is allocated when a procedure is entered, and it is de-allocated when that procedure is exited.
• The size of each field can be determined at compile time (although the actual location of the activation record is determined at run time).
• The exception is a local variable whose size depends on a parameter: its size is determined at run time.
A General Activation Record
Actual parameters
Returned values
Control link
Access link
Saved machine status
Local data
Temporaries
• Temporaries: temporary values, such as those arising from the evaluation of expressions, in cases where those temporaries cannot be held in registers.
• Local data belonging to the procedure whose activation record this is.
• Saved machine status, with information about the state of the machine just before the call to the procedure. This information typically includes the return address (the value of the program counter, to which the called procedure must return) and the contents of registers that were used by the calling procedure and that must be restored when the return occurs.
• An access link may be added to locate data needed by the called procedure but found elsewhere, e.g. in another activation record.
• A control link, pointing to the activation record of the caller.
• Space for the return value of the called function, if any.
• The actual parameters used by the calling procedure.
Creation of An Activation Record
• Who allocates an activation record of a procedure?
• Some part of the activation record of a procedure is created by that procedure immediately after it is entered.
• Some part is created by the caller of that procedure before the procedure is entered.
• Calling sequences are code statements that create activation records on the stack and enter data in them.
• The CALLING SEQUENCE for a procedure allocates an activation record and fills its fields in with appropriate values.
• The RETURN SEQUENCE restores the machine state to allow execution of the calling procedure to continue.
Creation of An Activation Record
[Figure: division of tasks between caller and callee. The caller’s activation record and the callee’s activation record each contain: parameters and return value; control link, links and saved status; temporaries and local data. The figure marks which part is the caller’s responsibility and which is the callee’s responsibility, and shows the position of Stack_top.]
Creation of An Activation Record
Sample calling sequence
• Caller evaluates the actual parameters and places them into the activation record of the callee.
• Caller stores a return address and the old value of stack_top in the callee’s activation record.
• Caller increments stack_top to the beginning of the temporaries and locals for the callee.
• Caller branches to the code for the callee.
• Callee saves all needed register values and status.
• Callee initializes its locals and begins execution.
Sample return sequence
• Callee places the return value at the correct location in the activation record (next to the caller’s activation record).
• Callee uses the status information previously saved to restore stack_top and the other registers.
• Callee branches to the return address previously stored by the caller.
• [Optional] Caller copies the return value into its own activation record and uses it to evaluate an expression.
Who deallocates?
• Callee de-allocates the part allocated by Callee.
• Caller de-allocates the part allocated by Caller.
Variable-length data
• In some languages, array size can depend on a value passed to the procedure as a parameter.
• This and any other variable-sized data can still be allocated on the stack, but BELOW the callee’s activation record.
• In the activation record itself, we simply store POINTERS to the to-be-allocated data.
• All variable-length data is pointed to from the local data area.
Intermediate Code Generation
• In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target code.
• The details of the source language are confined to the front end and the details of the target machine to the back end.
• With a suitably defined intermediate representation, a compiler for language i and machine j can then be built by combining the front end for language i with the back end for machine j.
• Intermediate code is often the link between the compiler’s front end and back end.
• Intermediate codes are machine-independent codes, but they are close to machine instructions.
Fig. Position of Intermediate Code Generator in Compiler
Why Intermediate Language
Advantages:
• Code can be generated for a new target machine just by attaching a new back end. This is called retargeting.
• Machine-independent code optimization can be applied to the intermediate code.
• An IR modularizes the task: the front end is not concerned with machine details and the back end is not concerned with the source language.
• Many front ends can share a single back end
– gcc can handle C, C++, Java, Fortran, Ada, ...
– each front end translates the source to the same generic language (called GENERIC)
• Many back ends can share a single front end
– Do most optimization on the intermediate representation before emitting code targeted at a single machine
• It provides an intermediate level of abstraction
– more details than the source
– fewer details than the target
Types of Intermediate Languages
There are three kinds of intermediate representations:
1. High-level intermediate representations:
– closer to the source language; e.g., syntax trees or Directed Acyclic Graphs (DAG)
– easy to generate from the input program
– code optimizations may not be straightforward
2. Low-level intermediate representations:
– closer to the target machine; e.g., P-Code, U-Code (used in PA-RISC and MIPS), GCC’s RTL, 3-address code
– easy to generate code from
– generation from the input program may require effort
3. “Mid”-level intermediate representations:
– Java bytecode, Microsoft CIL, LLVM IR, ...
1. Syntax Tree
• Each node in a syntax tree represents a construct; the children of the node represent the meaningful components of the construct.
• A syntax tree node representing an expression E1+E2 has label + and two children representing the subexpressions E1 and E2.
• We shall implement the nodes of a syntax tree by objects with a suitable number of fields. Each object will have an op field that is the label of the node.
• Additionally, if a node is a leaf, a second field holds the lexical value for the leaf. A constructor function leaf(op, val) creates a leaf object.
• If the node is an interior node, a constructor function node(op, c1, c2, ..., ck) creates an object with first field op and k additional fields for the k children.
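A minimal sketch of the two constructors, written here as Python classes Node and Leaf. The helper show and the variable names p1 to p5 are assumptions added for illustration; the tree built is the one for 2-3+4 with the usual left-to-right grouping.

```python
class Leaf:
    def __init__(self, op, val):
        self.op = op            # label of the node, e.g. "num" or "id"
        self.val = val          # lexical value of the leaf

class Node:
    def __init__(self, op, *children):
        self.op = op            # operator labelling the interior node
        self.children = children

def show(n):
    """Render a tree as a fully parenthesized string."""
    if isinstance(n, Leaf):
        return str(n.val)
    return "(" + f" {n.op} ".join(show(c) for c in n.children) + ")"

# Bottom-up construction, in the order an SDD would perform it.
p1 = Leaf("num", 2)
p2 = Leaf("num", 3)
p3 = Node("-", p1, p2)
p4 = Leaf("num", 4)
p5 = Node("+", p3, p4)
text = show(p5)
```

Here text is "((2 - 3) + 4)", which makes the left-associative shape of the tree visible.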
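SDD for creating syntax tree
Example: Creating syntax tree for expression: 2-3+4
[Figure: parse tree (nodes E and T) and the resulting syntax tree for 2-3+4. The syntax tree has root + whose children are the subtree for 2-3 (a - node with leaves num 2 and num 3) and the leaf num 4.]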
Example 2: Syntax tree
2. DAG
• Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators.
• The difference is that a node N in a DAG has more than one parent if N represents a common subexpression.
• All that is needed is for constructor functions such as Node and Leaf above to check whether an identical node already exists. If it does, a pointer to the existing node is returned instead of creating a new one.
• More compact representation
• Gives clues regarding generation of efficient code
Example: DAG for expression:
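The "check whether the node already exists" step can be sketched with a dictionary keyed by the operator and the identities of the children. The expression used below, a + a * (b - c) + (b - c) * d, is a hypothetical example chosen because it contains common subexpressions; the function names node and leaf and the integer node ids are likewise assumptions.

```python
nodes = {}   # maps (op, child ids or lexeme) -> node id

def node(op, *children):
    """Return the id of the node (op, children), creating it only if new."""
    key = (op,) + children
    if key not in nodes:
        nodes[key] = len(nodes)
    return nodes[key]

def leaf(name):
    return node("id", name)

a, b, c, d = leaf("a"), leaf("b"), leaf("c"), leaf("d")
bc1 = node("-", b, c)
bc2 = node("-", b, c)                 # same id as bc1: the node is shared
root = node("+", node("+", a, node("*", a, bc1)), node("*", bc2, d))
```

The syntax tree for this expression would need 13 nodes, while the DAG built here has 9, because the leaf a and the subexpression b - c are each created once.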
3. Three-Address Code
• Three-address code is an intermediate representation with at most one operator on the right side of an instruction.
• That is, no built-up arithmetic expressions are permitted.
• Thus, x+y*z might be translated into the sequence of three-address instructions:
t1 = y * z
t2 = x + t1
where t1 and t2 are compiler-generated temporary names.
• 3AC is close to assembly language, making machine code generation easier.
• 3AC is easy to generate from syntax trees or DAGs. We associate a temporary with each interior tree node.
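Associating one temporary with each interior node leads directly to a post-order walk of the tree. The sketch below assumes binary operators only and encodes an expression as either a name or an (op, left, right) tuple; gen_tac and the temps generator are illustrative names.

```python
import itertools

def gen_tac(expr, code, temps):
    """Emit three-address code for expr into code; return the address
    (a name or a temporary) that holds its value."""
    if isinstance(expr, str):          # leaf: the name itself holds the value
        return expr
    op, left, right = expr
    l = gen_tac(left, code, temps)
    r = gen_tac(right, code, temps)
    t = next(temps)                    # one temporary per interior node
    code.append(f"{t} = {l} {op} {r}")
    return t

code = []
temps = (f"t{i}" for i in itertools.count(1))
result = gen_tac(("+", "x", ("*", "y", "z")), code, temps)   # x + y * z
```

For x + y * z this produces t1 = y * z followed by t2 = x + t1, and result is t2.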
Forms of 3AC
• Assignment statements of the form x := y op z, where op is a binary arithmetic or logical operation.
• Assignment statements of the form x := op y, where op is a unary operator, such as unary minus or logical negation.
• Copy statements of the form x := y, which assign the value of y to x.
• Unconditional jumps goto L, which mean that the statement with label L is the next to be executed.
• Conditional jumps, such as if x relop y goto L, where relop is a relational operator (<, =, >=, etc.) and L is a label. (If the condition x relop y is true, the statement with label L will be executed next.)
• Statements param x and call p, n for procedure calls, and return y, where y represents the (optional) returned value. The typical usage for a call p(x1, …, xn) is:
param x1
param x2
…
param xn
call p, n
• Indexed assignments of the form x := y[i] and x[i] := y. The first sets x to the value in the location i memory units beyond location y. The second sets the contents of the location i units beyond x to the value of y.
• Address and pointer assignments:
x := &y
x := *y
*x := y
Representation of 3AC in data structure
• How can these instructions be represented in a data structure? In a compiler, 3AC instructions can be implemented as objects or records with fields for the operator and the operands. Three such representations are:
– Quadruples
– Triples
– Indirect triples
1. Quadruples
• Has four fields: op, arg1, arg2, result
• Exceptions:
– Unary operators: no arg2
– Operators like param: no arg2, no result
– (Un)conditional jumps: target label is the result
2. Triples
• Only three fields: op, arg1, arg2 (no result field)
• A result is referred to by the position of the instruction that computes it
Fig. Representation of a = b * - c + b * - c
(c) Three address code
3. Indirect Triples
• When instructions are moved around during optimization, quadruples are better than triples: moving a triple changes its position, so every reference to its result must be updated.
• Indirect triples solve this problem.
• Indirect triples consist of a list of pointers to triples rather than a listing of the triples themselves. With this, an optimizing compiler can move an instruction by reordering the instruction list without affecting the triples themselves.
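The three representations can be compared on a = b * - c + b * - c. Since the figure is not reproduced here, the quadruple list below is an assumed, conventional translation of that statement; to_triples, order and the operator name minus are names chosen for this sketch. In the triples, an integer operand refers to the position of the triple that computes the value.

```python
quads = [                      # (op, arg1, arg2, result)
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]

def to_triples(quads):
    """Drop the result field; replace temporaries by triple positions."""
    pos = {}                   # temporary name -> index of defining triple
    triples = []
    for op, arg1, arg2, result in quads:
        arg1 = pos.get(arg1, arg1)
        arg2 = pos.get(arg2, arg2)
        if op == "=":
            triples.append((op, result, arg1))
        else:
            pos[result] = len(triples)
            triples.append((op, arg1, arg2))
    return triples

triples = to_triples(quads)

# Indirect triples: a separate list of positions gives the execution order.
order = list(range(len(triples)))
order[1], order[2] = order[2], order[1]   # move an instruction; triples untouched
```

Swapping two entries of order reschedules the second unary minus ahead of the first multiplication without renumbering any triple, which is exactly what plain triples cannot do.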
3AC for program constructs
• A program consists of assignment statements such as a = b op c and control statements such as if-then-else, while loops or for statements.
• This section deals with the generation of three-address code for assignment statements and control statements.
1. Three-address code for assignment statement
• The three-address code for an assignment statement S is given in the following SDD. The attributes S.code and E.code denote the three-address code for S and E respectively.
• Attribute E.addr denotes the address that will hold the value of E. This address can be a name, a constant, or a compiler-generated temporary.
• For the production E -> id: when an expression is a single identifier, say x, then x itself holds the value of the expression. The semantic rules for this production define E.addr to point to the symbol table entry for this instance of id. Let top denote the current symbol table. The function top.get() retrieves the entry when it is applied to id.lexeme. E.code is set to the empty string.
• For E -> (E1), the translation of E is the same as that of the subexpression E1. Hence, E.addr equals E1.addr and E.code = E1.code.
• For E -> E1 + E2, generate code to compute the value of E from the values of E1 and E2. Values are computed into newly generated temporary names.
• A sequence of distinct temporary names is created by successive calls to new Temp().
• The gen() function builds an instruction and returns it.
3AC for Assignment statement with expressions
Example: Generate three address code for the following assignment statement: a = - b * c
Three Address code:
t1 = -b
t2 = t1 * c
a = t2
Fig. Parse tree for the expression a = -b * c, annotated as follows:
– E for b: E.addr = b, E.code = ‘ ’
– E for -b: E.addr = t1, E.code = t1 = -b
– E for c: E.addr = c, E.code = ‘ ’
– E for -b * c: E.addr = t2, E.code = t1 = -b ; t2 = t1 * c
– S: S.code = t1 = -b ; t2 = t1 * c ; a = t2
3AC generation for Array references
• Elements of an array are stored in consecutive memory locations.
• In C and Java, the array elements are numbered 0, 1, ..., n-1 for an array with n elements.
• If the width of each element is w, then element A[i] of the array is located at:
base + i * w ........ (1)
where base is the base address of the array, i.e., the address of its first element A[0].
• Example: Let A[10] be an array of 10 elements. Let the size of each element be 2, i.e., w = 2, and let the array be stored starting at memory location 1000, i.e., base address = 1000.
• Element A[3] (the fourth element) is at address 1000 + 3 * 2 = 1000 + 6 = 1006.
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
• More generally, the array elements need not be numbered starting at 0. In a one-dimensional array, the elements are numbered low, low+1, low+2, ..., high, and base is the relative address of A[low].
• The address of A[i] then becomes: base + (i - low) * w ..............(2)
• Formula (2) can be rewritten as: i * w + base - low * w = i * w + c, where c = base - low * w.
• All the components of c are known at compile time, so c can be pre-computed and stored. This reduces the work needed at run time to compute the address of A[i].
• We assume that c is saved in the symbol table entry for A, so the relative address of A[i] is obtained by simply adding i * w to c.
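The benefit of pre-computing c can be seen in a small sketch. The function name make_addr and the sample values (base 1000, low 10, w 4) are assumptions chosen for illustration.

```python
def make_addr(base, low, w):
    """Return a function computing the address of A[i] as i * w + c."""
    c = base - low * w          # known at compile time, computed once
    return lambda i: i * w + c

addr = make_addr(base=1000, low=10, w=4)
first = addr(10)                # A[low] sits at the base address
third = addr(12)                # two elements further on
```

Here first is 1000 and third is 1008, the same values that formula (2), base + (i - low) * w, gives, but each access needs only one multiplication and one addition.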
One Dimensional Array Reference: Example
A: array [10 ... 20] of integers;
x := A[i]
[Figure: layout of A in memory, marking base, low, the index i and the width w of an array element]
address of A[i] = base + (i – low) * w = i * w + c
where c = base – low * w, with low = 10 and w = 4
• In the case of a multi-dimensional array such as a matrix, elements are stored in either row-major or column-major order. C and Pascal use row-major storage, whereas Fortran uses column-major storage.
• Example: Consider array A[3,3] with elements:
(0,0) (0,1) (0,2)
(1,0) (1,1) (1,2)
(2,0) (2,1) (2,2)
Row major: (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2)
Column major: (0,0) (1,0) (2,0) (0,1) (1,1) (2,1) (0,2) (1,2) (2,2)
The address of element A[i,j] in row-major storage is given by:
A[i,j] = base + ((i - low1) * n2 + (j - low2)) * w ................. (3)
where low1 and low2 are the lower bounds of i and j, n2 is the number of columns, and w is the size of each element.
Expression (3) can be rewritten as:
A[i,j] = ((i * n2) + j) * w + (base – ((low1 * n2) + low2) * w) .......... (4)
The second part of expression (4) can be pre-computed from the values of base, low1, low2, n2 and w. This makes computing the address of A[i,j] faster.
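A short sketch can confirm that the rewritten row-major form agrees with the direct formula. The function name row_major and the sample parameters (base 0, lower bounds 1, three columns, width 4) are assumptions for illustration.

```python
def row_major(base, low1, low2, n2, w):
    """Return a function giving the address of A[i,j] in row-major order."""
    c = base - (low1 * n2 + low2) * w       # pre-computed constant part
    def addr(i, j):
        return (i * n2 + j) * w + c
    return addr

a = row_major(base=0, low1=1, low2=1, n2=3, w=4)
first = a(1, 1)      # first element of the array
last = a(2, 3)       # last element of a 2 x 3 array
```

With these values first is 0 and last is 20, i.e. the sixth element of a 2 x 3 array of 4-byte elements, as the direct formula base + ((i - low1) * n2 + (j - low2)) * w also gives.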
Example: 2D array referencing
A : array (1..2, 1..3) of integer;
... = A[i,j]
address of A[i,j] = baseA + ((i - low1) * n2 + j - low2) * w
= ((i * n2) + j) * w + c
where c = baseA – ((low1 * n2) + low2) * w
with low1 = 1, low2 = 1, n2 = 3, w = 4
Three-address code:
t1 = i * 3
t2 = t1 + j
t3 = t2 * 4
t4 = c
t5 = t4[t3]
... = t5
Translation of Array references
• The main problem in generating code for array references is to relate the address calculation formulas to the grammar for array references.
• Let nonterminal L generate an array name followed by a sequence of index expressions:
L -> L [ E ] | id [ E ]
• In this translation scheme the gen() function builds an instruction and incrementally emits it into the stream of generated instructions.
• The nonterminal L has three synthesized attributes:
– L.addr denotes a temporary that is used while computing the offset for the array reference generated by L.
– L.array is a pointer to the symbol table entry for the array name. The base address of the array (the address of its 0th element), say L.array.base, is used to determine the actual l-value of an array reference after all the index expressions are analyzed. The location of the array reference is therefore L.array.base[L.addr].
– L.type is the type of the subarray generated by L. For any type t, we assume that its width is given by t.width and that t.elem gives the element type.
• The translation scheme is shown below:
Translation of Array references
Example: Compute 3AC for expression c + a[i][j], where c, i and j are all integers and a is a 2x3 integer array.
Three Address Code:
t1 = i * 12
t2 = j * 4
t3 = t1 + t2
t4 = a[t3]
t5 = c + t4
Fig. Annotated Parse Tree for c + a[ i ][ j ], with these attribute values:
– a.type = array(2, array(3, integer))
– index expressions: E.addr = i and E.addr = j
– L for a[i]: L.array = a, L.type = array(3, integer), L.addr = t1
– L for a[i][j]: L.array = a, L.type = integer, L.addr = t3
– E for a[i][j]: E.addr = t4
– E for c: E.addr = c
– E for c + a[i][j]: E.addr = t5
Flow-of-control statements
• Control statements are used to alter the sequential flow of execution.
• Some of the control statements are the if-then, if-then-else, while and do-while statements:
– S -> if ( E ) S1
– S -> if ( E ) S1 else S2
– S -> while ( E ) S1
– S -> do S1 while ( E )
• Three-address code for if-then, if-then-else and while-do statements can be generated using the translation rules given in the following slides.
• In the translation rules, both S and E have a synthesized attribute code, which gives the translation into three-address instructions.
• For simplicity, the translations S.code and E.code are built up as strings using an SDD.
• The translation of S -> if (E) S1 consists of E.code followed by S1.code, as shown in the figure.
Three Address Code generation for if then statement
Statement: S -> if E then S1
Translation rules:
E.true = newlabel();
E.false = S.next;
S1.next = S.next;
S.code = E.code || label(E.true, ‘:’) || S1.code
Example: Generate 3 address code for the statement: if a>b then x = y + z.
Ans: 3AC for the given statement is:
if a > b goto L1
goto L2
L1: t1 = y + z
x = t1
L2: ....
Code layout:
E.code (with jumps to E.true and to E.false)
E.true: S1.code
E.false: ...
Three Address Code generation for if then else statement
Production: S -> if E then S1 else S2
Semantic rules:
E.true = newlabel();
E.false = newlabel();
S1.next = S.next;
S2.next = S.next;
S.code = E.code || label(E.true, ‘:’) || S1.code || gen(‘GOTO’, S.next) || label(E.false, ‘:’) || S2.code
Example: Generate 3 address code for the statement: if a>b then x = y + z else x = y - z
The three address code is given below:
if a > b goto L1
goto L2
L1: t1 = y + z
x = t1
goto L3
L2: t1 = y - z
x = t1
L3: ............
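The semantic rules for if-then-else can be sketched as a function that concatenates instruction lists. This is only an illustration under simplifying assumptions: the condition is a single relational test given as a string, the code for S1 and S2 is already available as lists, and gen_if_else, labels and next_label are names invented for the sketch.

```python
import itertools

def gen_if_else(cond, s1_code, s2_code, next_label, labels):
    """Build S.code for: if cond then S1 else S2."""
    e_true, e_false = next(labels), next(labels)     # E.true, E.false
    return ([f"if {cond} goto {e_true}",             # E.code
             f"goto {e_false}",
             f"{e_true}:"] + s1_code +               # label(E.true) || S1.code
            [f"goto {next_label}",                   # gen('GOTO', S.next)
             f"{e_false}:"] + s2_code)               # label(E.false) || S2.code

labels = (f"L{i}" for i in itertools.count(1))
code = gen_if_else("a > b",
                   ["t1 = y + z", "x = t1"],
                   ["t1 = y - z", "x = t1"],
                   "L3", labels)
```

The resulting list matches the example: the true branch is placed under L1 and ends with goto L3, and the false branch is placed under L2.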
Three Address Code generation for while do statement
Attributes:
– S.code: three-address code for S
– S.begin: label at the start of S
– S.next: label at the end of S
Production: S -> while E do S1
Semantic rules:
S.begin = newlabel();
E.true = newlabel();
E.false = S.next;
S1.next = S.begin;
S.code = label(S.begin, ‘:’) || E.code || label(E.true, ‘:’) || S1.code || gen(‘GOTO’, S.begin)
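The while rule can be sketched in the same style. As before, this assumes the condition is a single relational test and that the body code is already available as a list; gen_while, labels and next_label are illustrative names.

```python
import itertools

def gen_while(cond, body_code, next_label, labels):
    """Build S.code for: while cond do S1."""
    begin, e_true = next(labels), next(labels)       # S.begin, E.true
    return ([f"{begin}:",                            # label(S.begin)
             f"if {cond} goto {e_true}",             # E.code
             f"goto {next_label}",                   # E.false = S.next
             f"{e_true}:"] + body_code +             # label(E.true) || S1.code
            [f"goto {begin}"])                       # gen('GOTO', S.begin)

labels = (f"L{i}" for i in itertools.count(1))
code = gen_while("a > b", ["t1 = y + z", "x = t1"], "L3", labels)
```

For while a > b do x = y + z, the loop test sits under L1, the body under L2, and the body ends by jumping back to L1.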
Example 1: Generate 3 address code for the statement: while a>b do x = y + z.
The three address code is given below:
L1: if a > b goto L2
goto L3
L2: t1 = y + z
x = t1
goto L1
L3: .......
Example 2: Generate 3 address code for the statements:
i = 2 * n + k
while i do i = i – k
The three address code is given below:
t1 = 2
t2 = t1 * n
t3 = t2 + k
i = t3
L1: if i != 0 goto L2
goto L3
L2: t4 = i - k
i = t4
goto L1
L3: .......
Example 3: Generate 3AC for the statement:
    while a<b do
        if c<d then
            x = y + z
        else
            x = y - z
Solution: The three address code will be as follows:
L1: if a<b then GOTO L2
    GOTO LNEXT
L2: if c<d then GOTO L3
    GOTO L4
L3: t1 = y + z
    x = t1
    GOTO L1
L4: t1 = y - z
    x = t1
    GOTO L1
LNEXT:
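A rough Python sketch of how the while and if-then-else rules compose for this nested example is given below. The tuple-based statement encoding and the class name ThreeAddressGen are assumptions made for illustration only; conditions are treated as plain strings, and each assignment gets a fresh temporary, so the second branch uses t2 where the slide reuses t1.

```python
class ThreeAddressGen:
    """Toy generator for the S -> while / if-then-else rules (illustrative only)."""

    def __init__(self):
        self.code = []
        self.n_labels = 0
        self.n_temps = 0

    def newlabel(self):
        self.n_labels += 1
        return f"L{self.n_labels}"

    def newtemp(self):
        self.n_temps += 1
        return f"t{self.n_temps}"

    def emit(self, line):
        self.code.append(line)

    def gen_cond(self, cond, true_label, false_label):
        # E.code for a simple relational test
        self.emit(f"if {cond} then GOTO {true_label}")
        self.emit(f"GOTO {false_label}")

    def gen_stmt(self, stmt, s_next):
        kind = stmt[0]
        if kind == "assign":            # ("assign", target, left, op, right)
            _, target, left, op, right = stmt
            t = self.newtemp()
            self.emit(f"{t} = {left} {op} {right}")
            self.emit(f"{target} = {t}")
        elif kind == "if":              # ("if", cond, s1, s2)
            _, cond, s1, s2 = stmt
            e_true, e_false = self.newlabel(), self.newlabel()
            self.gen_cond(cond, e_true, e_false)
            self.emit(f"{e_true}:")
            self.gen_stmt(s1, s_next)   # S1.next = S.next
            self.emit(f"GOTO {s_next}")
            self.emit(f"{e_false}:")
            self.gen_stmt(s2, s_next)   # S2.next = S.next
        elif kind == "while":           # ("while", cond, s1)
            _, cond, s1 = stmt
            begin, e_true = self.newlabel(), self.newlabel()
            self.emit(f"{begin}:")
            self.gen_cond(cond, e_true, s_next)   # E.false = S.next
            self.emit(f"{e_true}:")
            self.gen_stmt(s1, begin)    # S1.next = S.begin
            self.emit(f"GOTO {begin}")
        else:
            raise ValueError(f"unknown statement kind: {kind}")


program = ("while", "a<b",
           ("if", "c<d",
            ("assign", "x", "y", "+", "z"),
            ("assign", "x", "y", "-", "z")))

gen = ThreeAddressGen()
gen.gen_stmt(program, "LNEXT")
print("\n".join(gen.code))
```

Because the inner statement inherits S1.next = S.begin, both branches of the if end with a jump back to L1, which is the pattern seen in the solution.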
Example 4: Generate 3AC for the statement:
    c = 0
    do
        if (a < b) x++ else x--
        c++
    while (c < 5)
Solution: The three address code is given below:
    1. c = 0
    2. if (a < b) GOTO (4)
    3. GOTO (7)
    4. t1 = x + 1
    5. x = t1
    6. GOTO (9)
    7. t2 = x - 1
    8. x = t2
    9. t3 = c + 1
    10. c = t3
    11. if (c < 5) GOTO (2)
    12. ------------------------
Alternate Method:
    c = 0
L1: if (a < b) GOTO L2
    GOTO L3
L2: t1 = x + 1
    x = t1
    GOTO L4
L3: t2 = x - 1
    x = t2
L4: t3 = c + 1
    c = t3
    if (c < 5) GOTO L1
    ----------
Example 5: Generate three address code for the following C program
    int a[10], b[10], dot_product, i;
    dot_product = 0;
    for (i = 0; i < 10; i++) dot_product += a[i] * b[i];
Intermediate Code:
    dot_product = 0;
    i = 0;
L1: if (i >= 10) GOTO L2
    t1 = addr(a)    // c = base - low * w = base, since low = 0
    t2 = i * 4
    t3 = t1[t2]
    t4 = addr(b)
    t5 = i * 4
    t6 = t4[t5]
    t7 = t3 * t6
    t8 = dot_product + t7
    dot_product = t8
    t9 = i + 1
    i = t9
    GOTO L1
L2: -------------
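The comment on addr(a) and the offset i * 4 can be checked with a small calculation. The sketch below assumes the usual one-dimensional layout, address = base + (i - low) * w, with element width w = 4 as used in the example; the function name element_address and the base address 1000 are hypothetical.

```python
def element_address(base, i, low=0, w=4):
    """Address of a[i] for a one-dimensional array with lower bound `low`."""
    c = base - low * w      # constant part, computable at compile time
    return c + i * w        # variable part: i * w

# With low = 0 (as in C), the constant part is just the base address.
print(element_address(1000, 3))           # 1012
print(element_address(1000, 3, low=1))    # 1008
```

This is why the intermediate code only needs addr(a) and the product i * 4 for each access.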
Example 6: Generate three address code for the following C program
    int a[10], b[10], dot_product, i;
    int *a1, *b1;
    dot_product = 0;
    a1 = a; b1 = b;
    for (i = 0; i < 10; i++) dot_product += *a1++ * *b1++;
Intermediate Code:
    dot_product = 0;
    a1 = &a
    b1 = &b
    i = 0;
L1: if (i >= 10) GOTO L2
    t3 = *a1
    t4 = a1 + 1
    a1 = t4
    t5 = *b1
    t6 = b1 + 1
    b1 = t6
    t7 = t3 * t5
    t8 = dot_product + t7
    dot_product = t8
    t9 = i + 1
    i = t9
    GOTO L1
L2: -------------
Logical Expression
• Logical operators are mainly used in flow-control statements such as if-then-else, while-do, and repeat-until.
• The not operation has the highest precedence, followed by and; or has the lowest precedence.
• A logical expression always evaluates to either true or false.
• True can be treated as a nonzero, non-negative, or 1 value, whereas false may be 0 or a negative value.
Production and Translation rules:
E → E1 or E2
    E1.true = E.true
    E1.false = newlabel()
    E2.true = E.true
    E2.false = E.false
    E.code = E1.code || label(E1.false) || E2.code
E → E1 and E2
    E1.true = newlabel()
    E1.false = E.false
    E2.true = E.true
    E2.false = E.false
    E.code = E1.code || label(E1.true) || E2.code
SDD for translation of Boolean Expression to 3AC
E → not E1
    E1.true = E.false
    E1.false = E.true
    E.code = E1.code
E → E1 rel E2
    E.code = E1.code || E2.code || gen(‘if’ E1.addr rel.op E2.addr ‘goto’ E.true) || gen(‘goto’ E.false)
E → ( E1 )
    E.value = E1.value
E → true
    E.code = gen(‘goto’ E.true)
E → false
    E.code = gen(‘goto’ E.false)
Example: Generate 3AC for the following expression: a or b and not c.
The 3AC for the above expression will be as follows:
    t1 = not c
    t2 = b and t1
    t3 = a or t2
Example: Consider the following statement and translate it into three address code.
    if (x < 100 || x > 200 && x != y) x = 0;
Three address code:
    if x < 100 goto L2
    goto L3
L3: if x > 200 goto L4
    goto L1
L4: if x != y goto L2
    goto L1
L2: x = 0
L1: ......
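The jumping-code rules for or, and, not, and relational tests can be sketched as a short recursive function. This is a minimal illustration, not a full implementation: the tuple encoding of expressions and the names make_newlabel and jump_code are assumptions, relational tests are kept as strings, and the label counter starts at 3 because L1 and L2 are taken as E.false and E.true in this example.

```python
import itertools

def make_newlabel(start=1):
    """Return a function that yields fresh labels L<start>, L<start+1>, ..."""
    counter = itertools.count(start)
    return lambda: f"L{next(counter)}"

def jump_code(e, true, false, newlabel):
    """Return jumping code for boolean expression e as a list of lines."""
    op = e[0]
    if op == "rel":                      # ("rel", "x < 100")
        return [f"if {e[1]} goto {true}", f"goto {false}"]
    if op == "not":                      # swap the true and false targets
        return jump_code(e[1], false, true, newlabel)
    if op == "or":
        e1_false = newlabel()            # E1.false = newlabel()
        return (jump_code(e[1], true, e1_false, newlabel)
                + [f"{e1_false}:"]
                + jump_code(e[2], true, false, newlabel))
    if op == "and":
        e1_true = newlabel()             # E1.true = newlabel()
        return (jump_code(e[1], e1_true, false, newlabel)
                + [f"{e1_true}:"]
                + jump_code(e[2], true, false, newlabel))
    raise ValueError(f"unknown operator: {op}")

# x < 100 || x > 200 && x != y   (and binds tighter than or)
expr = ("or", ("rel", "x < 100"),
        ("and", ("rel", "x > 200"), ("rel", "x != y")))

code = jump_code(expr, true="L2", false="L1", newlabel=make_newlabel(3))
code += ["L2:", "x = 0", "L1:"]
print("\n".join(code))
```

No boolean value is ever stored in a temporary: the result of the expression is represented only by which label control reaches.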
Three address code for procedure call
S → call id ( Elist )    { for each item p on queue do produce(‘param’ p);
                           produce(‘call’ id.value |queue|) }
Elist → Elist , E        { append E.value to the end of queue }
Elist → E                { initialize queue to contain only E.value }
Example: 1
Example 2
Consider the statement: n = f ( a [ i ] ), where a is an array of integers and f is a function from integers to integers.
Three Address Code:
    t1 = i * 4
    t2 = a [ t1 ]
    param t2
    t3 = call f, 1
    n = t3
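The queue-based scheme for calls can be sketched as follows. It assumes the arguments have already been evaluated into places (names or temporaries); the function name gen_call and its parameters are hypothetical.

```python
def gen_call(func_name, arg_places, result=None):
    """Emit one 'param' per queued argument, then the 'call' with the argument count."""
    queue = list(arg_places)                  # Elist builds this queue left to right
    code = [f"param {p}" for p in queue]
    call = f"call {func_name}, {len(queue)}"  # |queue| = number of arguments
    code.append(f"{result} = {call}" if result else call)
    return code

# n = f(a[i]): evaluate the argument first, then pass it.
code = ["t1 = i * 4", "t2 = a [ t1 ]"]
code += gen_call("f", ["t2"], result="t3")
code += ["n = t3"]
print("\n".join(code))
```

Emitting all param instructions immediately before the call keeps argument evaluation code separate from argument passing.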
Example 3
    int main()
    {
        int p; int a[10]; int b[10];
        p = dot_product(a, b);
    }
Intermediate code:
    func begin main
    param a
    param b
    p = call dot_product, 2
    func end

    int dot_product(int x[], int y[])
    {
        int d, i;
        d = 0;
        for (i = 0; i < 10; i++) d += x[i] * y[i];
        return d;
    }
Intermediate code:
    func begin dot_product
    d = 0
    i = 0
L1: if (i >= 10) goto L2
    t1 = addr(x)
    t2 = i * 4
    t3 = t1[t2]
    t4 = addr(y)
    t5 = i * 4
    t6 = t4[t5]
    t7 = t3 * t6
    t8 = d + t7
    d = t8
    t9 = i + 1
    i = t9
    goto L1
L2: return d
    func end
Example 4: Write 3AC for the following code:
    int fact(int n)
    {
        if (n == 0) return 1;
        else return (n * fact(n - 1));
    }
Intermediate Code:
    func begin fact
    if (n == 0) goto L1
    t1 = n - 1
    param t1
    t2 = call fact, 1
    t3 = n * t2
    return t3
L1: return 1
    func end
Thank You !