compilation 0368-3133 (semester a, 2013/14)

167
Compilation 0368-3133 (Semester A, 2013/14) Lecture 8: Activation Records + Optimized IR Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv and Eran Yahav

Upload: ince

Post on 25-Feb-2016

39 views

Category:

Documents


0 download

DESCRIPTION

Compilation 0368-3133 (Semester A, 2013/14). Lecture 8: Activation Records + Optimized IR Noam Rinetzky. Slides credit: Roman Manevich , Mooly Sagiv and Eran Yahav. What is a Compiler?. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Compilation   0368-3133 (Semester A, 2013/14)

1

Compilation 0368-3133 (Semester A, 2013/14)

Lecture 8: Activation Records + Optimized IR

Noam Rinetzky

Slides credit: Roman Manevich, Mooly Sagiv and Eran Yahav

Page 2: Compilation   0368-3133 (Semester A, 2013/14)

2

What is a Compiler?

“A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language).

The most common reason for wanting to transform source code is to create an executable program.”

--Wikipedia

Page 3: Compilation   0368-3133 (Semester A, 2013/14)

3

Conceptual Structure of a Compiler

Executable code

exe

Sourcetext

txtSemantic

RepresentationBackend

CompilerFrontend

LexicalAnalysi

s

Syntax Analysi

sParsing

Semantic

Analysis

IntermediateRepresentati

on(IR)

CodeGeneration

Page 4: Compilation   0368-3133 (Semester A, 2013/14)

From scanning to parsing

4

((23 + 7) * x)

) x * ) 7 + 23 ( (RP Id OP RP Num OP Num LP LP

Lexical Analyzer

program text

token stream

ParserGrammar: E ... | Id Id ‘a’ | ... | ‘z’

Op(*)

Id(b)

Num(23) Num(7)

Op(+)

Abstract Syntax Tree

validsyntaxerror

Page 5: Compilation   0368-3133 (Semester A, 2013/14)

5

Context Analysis

Op(*)

Id(b)

Num(23) Num(7)

Op(+)

Abstract Syntax Tree

E1 : int E2 : int

E1 + E2 : int

Type rules

Semantic Error Valid + Symbol Table

Page 6: Compilation   0368-3133 (Semester A, 2013/14)

6

Code Generation

Op(*)

Id(b)

Num(23) Num(7)

Op(+)

Valid Abstract Syntax TreeSymbol Table…

Verification (possible runtime)Errors/Warnings

Intermediate Representation (IR)

Executable Codeinput output

Page 7: Compilation   0368-3133 (Semester A, 2013/14)

7

Code Generation: IR

• Translating from abstract syntax (AST) to intermediate representation (IR)– Three-Address Code

• …

Source code

(program)

LexicalAnalysis

Syntax Analysis

Parsing

AST SymbolTableetc.

Inter.Rep.(IR)

CodeGeneration

Source code

(executable)

Page 8: Compilation   0368-3133 (Semester A, 2013/14)

8

C++ IR

Pentium

Java bytecode

SparcPyhton

Javaoptimize

Lowering Code Gen.

Intermediate representation• A language that is between the source language and

the target language – not specific to any machine• Goal 1: retargeting compiler components for

different source languages/target machines• Goal 2: machine-independent optimizer

– Narrow interface: small number of node types (instructions)

Page 9: Compilation   0368-3133 (Semester A, 2013/14)

Multiple IRs• Some optimizations require high-level

structure• Others more appropriate on low-level code• Solution: use multiple IR stages

AST LIR

Pentium

Java bytecode

Sparc

optimize

HIR

optimize

9

Page 10: Compilation   0368-3133 (Semester A, 2013/14)

10

Compile Time vs Runtime

• Compile time: Data structures used during program compilation

• Runtime: Data structures used during program execution

Page 11: Compilation   0368-3133 (Semester A, 2013/14)

11

Three-Address Code IR

• A popular form of IR• High-level assembly where instructions

have at most three operands

Chapter 8

Page 12: Compilation   0368-3133 (Semester A, 2013/14)

12

Variable assignments• var = constant;• var1 = var2;• var1 = var2 op var3;• var1 = constant op var2;• var1 = var2 op constant;• var = constant1 op constant2;• Permitted operators are +, -, *, /, %

Page 13: Compilation   0368-3133 (Semester A, 2013/14)

13

Representing 3AC• Quadruple (op,arg1,arg2,result)• Result of every instruction is written into a new temporary

variable• Generates many variable names• Can move code fragments without complicated renaming• Alternative representations may be more compact

at5=:(5)t5t4t2+(4)t4t3b*(3)t3cuminus(2)t2t1b*(1)t1cuminus(0)resultarg 2arg 1opt1 = - c

t2 = b * t1

t3 = - ct4 = b * t3

t5 = t2 * t4

a = t5

Page 14: Compilation   0368-3133 (Semester A, 2013/14)

14

From AST to TAC

Page 15: Compilation   0368-3133 (Semester A, 2013/14)

15

Generating IR code

• Option 1 Accumulate code in AST attributes

• Option 2Emit IR code to a file during compilation– If for every production the code of the left-

hand-side is constructed from a concatenation of the code of the RHS in some fixed order

Page 16: Compilation   0368-3133 (Semester A, 2013/14)

16

TAC generation

• At this stage in compilation, we have– an AST– annotated with scope information– and annotated with type information

• To generate TAC for the program, we do recursive tree traversal– Generate TAC for any subexpressions or substatements– Using the result, generate TAC for the overall

expression

Page 17: Compilation   0368-3133 (Semester A, 2013/14)

17

TAC generation for expressions

• Define a function cgen(expr) that generates TAC that computes an expression, stores it in a temporary variable, then hands back the name of that temporary– Define cgen directly for atomic expressions

(constants, this, identifiers, etc.)• Define cgen recursively for compound

expressions (binary operators, function calls, etc.)

Page 18: Compilation   0368-3133 (Semester A, 2013/14)

18

Booleans• Boolean variables are represented as integers

that have zero or nonzero values• In addition to the arithmetic operator, TAC

supports <, ==, ||, and &&• How might you compile the following?

b = (x <= y); _t0 = x < y;_t1 = x == y;b = _t0 || _t1;

Page 19: Compilation   0368-3133 (Semester A, 2013/14)

19

cgen examplecgen(5 + x) = { Choose a new temporary t Let t1 = cgen(5) Let t2 = cgen(x) Emit( t = t1 + t2 ) Return t{

Page 20: Compilation   0368-3133 (Semester A, 2013/14)

Numval = 5

AddExprleft right

Identname = x

visit

visit(left)

visit(right)

cgen(5 + x)

t1 = 5;

t2 = x;

t = t1 + t2;

t1 = 5 t2 = x

t = t1 + t2

cgen as recursive AST traversal

20

Page 21: Compilation   0368-3133 (Semester A, 2013/14)

21

Naïve cgen for expressions• Maintain a counter for temporaries in c• Initially: c = 0• cgen(e1 op e2) = {

Let A = cgen(e1) c = c + 1 Let B = cgen(e2) c = c + 1 Emit( _tc = A op B; ) Return _tc}

Page 22: Compilation   0368-3133 (Semester A, 2013/14)

22

Naive cgen for expressions• Maintain a counter for temporaries in c• Initially: c = 0• cgen(e1 op e2) = {

Let A = cgen(e1) c = c + 1 Let B = cgen(e2) c = c + 1 Emit( _tc = A op B; ) Return _tc}

• Observation: temporaries in cgen(e1) can be reused in cgen(e2)

Page 23: Compilation   0368-3133 (Semester A, 2013/14)

23

Improving cgen for expressions• Observation – naïve translation needlessly generates temporaries for

leaf expressions• Observation – temporaries used exactly once

– Once a temporary has been read it can be reused for another sub-expression• cgen(e1 op e2) = {

Let _t1 = cgen(e1) Let _t2 = cgen(e2) Emit( _t =_t1 op _t2; ) Return t}

• Temporaries cgen(e1) can be reused in cgen(e2)

Page 24: Compilation   0368-3133 (Semester A, 2013/14)

24

Sethi-Ullman translation

• Algorithm by Ravi Sethi and Jeffrey D. Ullman to emit optimal TAC– Minimizes number of temporaries

• Main data structure in algorithm is a stack of temporaries– Stack corresponds to recursive invocations of _t = cgen(e)– All the temporaries on the stack are live

• Live = contain a value that is needed later on

Page 25: Compilation   0368-3133 (Semester A, 2013/14)

25

Weighted register allocation• Can save registers by re-ordering subtree computations• Label each node with its weight

– Weight = number of registers needed– Leaf weight known– Internal node weight

• w(left) > w(right) then w = left• w(right) > w(left) then w = right• w(right) = w(left) then w = left + 1

• Choose heavier child as first to be translated• WARNING: have to check that no side-effects exist before

attempting to apply this optimization(pre-pass on the tree)

Page 26: Compilation   0368-3133 (Semester A, 2013/14)

Weighted reg. alloc. example

b

5 c

*

array access

+

a w=0

w=0 w=0

w=1 w=0

w=1

w=1

Phase 1: - check absence of side-effects in expression tree - assign weight to each AST node

26

_t0 = cgen( a+b[5*c] )

base index

Page 27: Compilation   0368-3133 (Semester A, 2013/14)

Weighted reg. alloc. example

b

5 c

*

array access

+

abase index

w=0

w=0 w=0

w=1 w=0

w=1

w=1

27

_t0 = cgen( a+b[5*c] )Phase 2: - use weights to decide on order of translation

_t0

_t0

_t0Heavier sub-tree

Heavier sub-tree

_t0 = _t1 * _t0

_t0 = _t1[_t0]

_t0 = _t1 + _t0

_t0_t1

_t1

_t1_t0 = c_t1 = 5

_t1 = b

_t1 = a

Page 28: Compilation   0368-3133 (Semester A, 2013/14)

TAC Generation forMemory Access Instructions

• Copy instruction: a = b• Load/store instructions:

a = *b *a = b• Address of instruction a=&b• Array accesses:

a = b[i] a[i] = b• Field accesses:

a = b[f] a[f] = b• Memory allocation instruction: a = alloc(size)

– Sometimes left out (e.g., malloc is a procedure in C)

28

Page 29: Compilation   0368-3133 (Semester A, 2013/14)

29

Array operations

t1 := &y ; t1 = address-of yt2 := t1 + i ; t2 = address of y[i]x := *t2 ; value stored at y[i]

t1 := &x ; t1 = address-of xt2 := t1 + i ; t2 = address of x[i]*t2 := y ; store through pointer

x := y[i]

x[i] := y

Page 30: Compilation   0368-3133 (Semester A, 2013/14)

TAC Generation forControl Flow Statements

• Label introduction_label_name:

Indicates a point in the code that can be jumped to• Unconditional jump: go to instruction following label L

Goto L;• Conditional jump: test condition variable t;

if 0, jump to label LIfZ t Goto L;

• Similarly : test condition variable t;if 1, jump to label L

IfNZ t Goto L;30

Page 31: Compilation   0368-3133 (Semester A, 2013/14)

Control-flow example – conditionsint x;int y;int z;

if (x < y) z = x;else z = y;z = z * z;

_t0 = x < y;IfZ _t0 Goto _L0;z = x;Goto _L1;_L0:z = y;_L1:z = z * z;

31

Page 32: Compilation   0368-3133 (Semester A, 2013/14)

cgen for if-then-else

32

cgen(if (e) s1 else s2) Let _t = cgen(e)Let Lfalse be a new labelLet Lafter be a new labelEmit( IfZ _t Goto Lfalse; )cgen(s1)Emit( Goto Lafter; )Emit( Lfalse: )cgen(s2)Emit( Goto Lafter;)Emit( Lafter: )

Page 33: Compilation   0368-3133 (Semester A, 2013/14)

Control-flow example – loopsint x;int y;

while (x < y) {x = x * 2;

{

y = x;

_L0:_t0 = x < y;IfZ _t0 Goto _L1;x = x * 2;Goto _L0;_L1:

y = x;

33

Page 34: Compilation   0368-3133 (Semester A, 2013/14)

cgen for while loops

34

cgen(while (expr) stmt) Let Lbefore be a new label.Let Lafter be a new label.Emit( Lbefore: )Let t = cgen(expr)Emit( IfZ t Goto Lafter; )cgen(stmt)Emit( Goto Lbefore; )Emit( Lafter: )

Page 35: Compilation   0368-3133 (Semester A, 2013/14)

Optimization points

35

sourcecode

Frontend IR Code

generatortargetcode

Userprofile program

change algorithm

Compilerintraprocedural IRInterprocedural IRIR optimizations

Compilerregister allocation

instruction selectionpeephole transformations

today

Page 36: Compilation   0368-3133 (Semester A, 2013/14)

36

Interprocedural IR

• Compile time generation of code for procedure invocations

• Activation Records (aka Frames)

Page 37: Compilation   0368-3133 (Semester A, 2013/14)

Intro: Procedures / Functions • Store local variables/temporaries in a stack• A function call instruction pushes arguments to

stack and jumps to the function labelA statement x=f(a1,…,an); looks like

Push a1; … Push an;Call f;Pop x; // copy returned value

• Returning a value is done by pushing it to the stack (return x;)

Push x;• Return control to caller (and roll up stack)

Return;37

Page 38: Compilation   0368-3133 (Semester A, 2013/14)

38

Intro: Memory Layout (popular convention)

Global Variables

Stack

Heap

Highaddresses

Lowaddresses

Page 39: Compilation   0368-3133 (Semester A, 2013/14)

39

Intro: A Logical Stack FrameParam NParam N-1

…Param 1_t0

…_tkx

…y

Parameters(actual

arguments)

Locals and temporaries

Stack frame for function f(a1,…,aN)

Page 40: Compilation   0368-3133 (Semester A, 2013/14)

Intro: Functions Exampleint SimpleFn(int z) {

int x, y;x = x * y * z;return x;

}

void main() {int w;w = SimpleFunction(137);

}

_SimpleFn:_t0 = x * y;_t1 = _t0 * z;x = _t1;Push x;Return;

main:_t0 = 137;Push _t0;Call _SimpleFn;Pop w;

40

Page 41: Compilation   0368-3133 (Semester A, 2013/14)

41

Supporting Procedures

• New computing environment – at least temporary memory for local variables

• Passing information into the new environment– Parameters

• Transfer of control to/from procedure• Handling return values

Page 42: Compilation   0368-3133 (Semester A, 2013/14)

What Can We Do with Procedures?

• Declarations & Definitions• Call & Return• Jumping out of procedures• Passing & Returning procedures as

parameters

42

Page 43: Compilation   0368-3133 (Semester A, 2013/14)

43

Design Decisions

• Scoping rules– Static scoping vs. dynamic scoping

• Caller/callee conventions– Parameters– Who saves register values?

• Allocating space for local variables

Page 44: Compilation   0368-3133 (Semester A, 2013/14)

44

Static (lexical) Scopingmain ( ){

int a = 0 ;int b = 0 ;{

int b = 1 ;{

int a = 2 ;printf (“%d %d\n”, a, b)

}{

int b = 3 ;printf (“%d %d\n”, a, b) ;

}printf (“%d %d\n”, a, b) ;

}printf (“%d %d\n”, a, b) ;

}

B0B1

B3B3

B2

Declaration Scopes

a=0 B0,B1,B3

b=0 B0

b=1 B1,B2

a=2 B2

b=3 B3

a name refers to its (closest)

enclosing scope

known at compile time

Page 45: Compilation   0368-3133 (Semester A, 2013/14)

45

Dynamic Scoping

• Each identifier is associated with a global stack of bindings

• When entering scope where identifier is declared– push declaration on identifier stack

• When exiting scope where identifier is declared– pop identifier stack

• Evaluating the identifier in any context binds to the current top of stack

• Determined at runtime

Page 46: Compilation   0368-3133 (Semester A, 2013/14)

46

Example

• What value is returned from main?• Static scoping?• Dynamic scoping?

int x = 42;

int f() { return x; } int g() { int x = 1; return f(); }int main() { return g(); }

Page 47: Compilation   0368-3133 (Semester A, 2013/14)

47

Why do we care?

• We need to generate code to access variables

• Static scoping– Identifier binding is known at compile time– Address of the variable is known at compile time– Assigning addresses to variables is part of code

generation– No runtime errors of “access to undefined variable”– Can check types of variables

Page 48: Compilation   0368-3133 (Semester A, 2013/14)

48

Variable addresses for static scoping: first attempt

int x = 42;

int f() { return x; } int g() { int x = 1; return f(); }int main() { return g(); }

identifier address

x (global) 0x42

x (inside g) 0x73

Page 49: Compilation   0368-3133 (Semester A, 2013/14)

49

Variable addresses for static scoping: first attempt

int a [11] ;

void quicksort(int m, int n) { int i; if (n > m) { i = partition(m, n); quicksort (m, i-1) ; quicksort (i+1, n) ; }

main() {... quicksort (1, 9) ;}

what is the address of the variable “i” in

the procedure quicksort?

Page 50: Compilation   0368-3133 (Semester A, 2013/14)

Compile-Time Information on Variables• Name• Type• Scope

– when is it recognized• Duration

– Until when does its value exist• Size

– How many bytes are required at runtime • Address

– Fixed– Relative– Dynamic

50

Page 51: Compilation   0368-3133 (Semester A, 2013/14)

51

Activation Record (Stack Frames)• separate space for each procedure invocation

• managed at runtime– code for managing it generated by the compiler

• desired properties – efficient allocation and deallocation

• procedures are called frequently– variable size

• different procedures may require different memory sizes

Page 52: Compilation   0368-3133 (Semester A, 2013/14)

Memory Layout (popular convention)

52

Global Variables

Stack

Heap

Highaddresses

Lowaddresses

Page 53: Compilation   0368-3133 (Semester A, 2013/14)

Memory Hierarchy

53

Global Variables

Stack

Heap

High addresses

Low addresses

CPU Main Memory

Register 00

Register 01

Register xx

Page 54: Compilation   0368-3133 (Semester A, 2013/14)

54

A Logical Stack Frame (Simplified)Param NParam N-1

…Param 1_t0

…_tkx

…y

Parameters(actual

arguments)

Locals and temporaries

Stack frame for function f(a1,…,aN)

Page 55: Compilation   0368-3133 (Semester A, 2013/14)

Activation Record (frame)

55

parameter k

parameter 1

lexical pointer

return information

dynamic link

registers & misc

local variablestemporaries

next frame would be here

administrativepart

high addresses

lowaddresses

framepointer

stackpointer

incoming parameters

stack grows down

Page 56: Compilation   0368-3133 (Semester A, 2013/14)

56

Runtime Stack

• Stack of activation records• Call = push new activation record• Return = pop activation record• Only one “active” activation record – top of

stack• How do we handle recursion?

Page 57: Compilation   0368-3133 (Semester A, 2013/14)

57

Runtime Stack• SP – stack pointer

– top of current frame• FP – frame pointer

– base of current frame– Sometimes called BP

(base pointer)

Current frame

… …

Previous frame

SP

FPstack grows down

Page 58: Compilation   0368-3133 (Semester A, 2013/14)

Pentium Runtime Stack

58

Pentium stack registers

Pentium stack and call/ret instructions

Register Usage

ESP Stack pointer

EBP Base pointer

Instruction Usage

push, pusha,… push on runtime stack

pop,popa,… Base pointer

call transfer control to called routine

return transfer control back to caller

Page 59: Compilation   0368-3133 (Semester A, 2013/14)

59

Call Sequences

• The processor does not save the content of registers on procedure calls

• So who will? – Caller saves and restores registers– Callee saves and restores registers– But can also have both save/restore some

registers

Page 60: Compilation   0368-3133 (Semester A, 2013/14)

Call Sequences

60

call

caller

callee

return

caller

Caller push code

Callee push code

(prologue)

Callee pop code

(epilogue)

Caller pop code

Push caller-save registersPush actual parameters (in reverse order)

push return addressJump to call address

Push current base-pointerbp = sp

Push local variablesPush callee-save registers

Pop callee-save registersPop callee activation record

Pop old base-pointer

pop return addressJump to address

Pop parametersPop caller-save registers

Page 61: Compilation   0368-3133 (Semester A, 2013/14)

61

Call Sequences – Foo(42,21)

call

caller

callee

return

caller

push %ecx

push $21

push $42

call _foo

push %ebp

mov %esp, %ebp

sub %8, %esp

push %ebx

pop %ebx

mov %ebp, %esp

pop %ebp

ret

add $8, %esp

pop %ecx

Push caller-save registersPush actual parameters (in reverse order)

push return addressJump to call address

Push current base-pointerbp = sp

Push local variablesPush callee-save registers

Pop callee-save registersPop callee activation record

Pop old base-pointer

pop return addressJump to address

Pop caller-save registersPop parameters

Page 62: Compilation   0368-3133 (Semester A, 2013/14)

62

“To Callee-save or to Caller-save?”

• Callee-saved registers need only be saved when callee modifies their value

• some heuristics and conventions are followed

Page 63: Compilation   0368-3133 (Semester A, 2013/14)

Caller-Save and Callee-Save Registers• Callee-Save Registers

– Saved by the callee before modification– Values are automatically preserved across calls

• Caller-Save Registers – Saved (if needed) by the caller before calls– Values are not automatically preserved across calls

• Usually the architecture defines caller-save and callee-save registers

• Separate compilation• Interoperability between code produced by different

compilers/languages • But compiler writers decide when to use calller/callee

registers

63

Page 64: Compilation   0368-3133 (Semester A, 2013/14)

Callee-Save Registers• Saved by the callee before modification• Usually at procedure prolog• Restored at procedure epilog• Hardware support may be available • Values are automatically preserved across calls

64

.global _foo

Add_Constant -K, SP //allocate space for foo

Store_Local R5, -14(FP) // save R5

Load_Reg R5, R0; Add_Constant R5, 1

JSR f1 ; JSR g1;

Add_Constant R5, 2; Load_Reg R5, R0

Load_Local -14(FP), R5 // restore R5

Add_Constant K, SP; RTS // deallocate

int foo(int a) { int b=a+1; f1();g1(b);

return(b+2); }

Page 65: Compilation   0368-3133 (Semester A, 2013/14)

Caller-Save Registers• Saved by the caller before calls when

needed• Values are not automatically preserved

across calls

65

.global _bar

Add_Constant -K, SP //allocate space for bar

Add_Constant R0, 1

JSR f2

Load_Constant 2, R0 ; JSR g2;

Load_Constant 8, R0 ; JSR g2

Add_Constant K, SP // deallocate space for bar

RTS

void bar (int y) { int x=y+1;f2(x);

g2(2); g2(8); }

Page 66: Compilation   0368-3133 (Semester A, 2013/14)

Parameter Passing• 1960s

– In memory • No recursion is allowed

• 1970s– In stack

• 1980s– In registers– First k parameters are passed in registers (k=4 or k=6)– Where is time saved?

66

• Most procedures are leaf procedures• Interprocedural register allocation• Many of the registers may be dead before another invocation• Register windows are allocated in some architectures per call (e.g., sun Sparc)

Page 67: Compilation   0368-3133 (Semester A, 2013/14)

L-Values of Local Variables

• The offset in the stack is known at compile time

• L-val(x) = FP+offset(x)• x = 5 Load_Constant 5, R3

Store R3, offset(x)(FP)

67

Page 68: Compilation   0368-3133 (Semester A, 2013/14)

Code Blocks

• Programming language provide code blocks void foo() { int x = 8 ; y=9;//1 { int x = y * y ;//2 } { int x = y * 7 ;//3} x = y + 1; }

68

adminstrativex1

y1

x2

x3

Page 69: Compilation   0368-3133 (Semester A, 2013/14)

69

Accessing Stack Variables

• Use offset from FP (%ebp)• Remember –

stack grows downwards• Above FP = parameters• Below FP = locals• Examples

– %ebp + 4 = return address– %ebp + 8 = first parameter– %ebp – 4 = first local

… …

SP

FPReturn address

Return address

Param n…

param1

Local 1…

Local n

Previous fp

Param n…

param1FP+8

FP-4

Page 70: Compilation   0368-3133 (Semester A, 2013/14)

70

Factorial – fact(int n)fact:pushl %ebp # save ebpmovl %esp,%ebp # ebp=esppushl %ebx # save ebxmovl 8(%ebp),%ebx # ebx = ncmpl $1,%ebx # n = 1 ?jle .lresult # then doneleal -1(%ebx),%eax # eax = n-1pushl %eax # call fact # fact(n-1)imull %ebx,%eax # eax=retv*njmp .lreturn # .lresult:movl $1,%eax # retv.lreturn:movl -4(%ebp),%ebx # restore ebxmovl %ebp,%esp # restore esppopl %ebp # restore ebp

ESP

EBPReturn address

Return address

old %ebx

Previous fp

nEBP+8

EBP-4 old %ebp

old %ebx

(stack in intermediate point)

(disclaimer: real compiler can do better than that)

Page 71: Compilation   0368-3133 (Semester A, 2013/14)

71

Windows Exploit(s) Buffer Overflow

void foo (char *x) { char buf[2]; strcpy(buf, x); } int main (int argc, char *argv[]) { foo(argv[1]); }

./a.out abracadabraSegmentation fault

Stack grows this way

Memory addresses

Previous frameReturn

addressSaved FPchar* xbuf[2]

abracadabr

Page 72: Compilation   0368-3133 (Semester A, 2013/14)

72

Buffer overflow int check_authentication(char *password) {

int auth_flag = 0; char password_buffer[16];

strcpy(password_buffer, password); if(strcmp(password_buffer, "brillig") == 0)

auth_flag = 1; if(strcmp(password_buffer, "outgrabe") == 0)

auth_flag = 1; return auth_flag;

}int main(int argc, char *argv[]) { if(argc < 2) { printf("Usage: %s <password>\n", argv[0]); exit(0);

} if(check_authentication(argv[1])) {

printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); printf(" Access Granted.\n");

printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); } else {

printf("\nAccess Denied.\n"); }

}(source: “hacking – the art of exploitation, 2nd Ed”)

Page 73: Compilation   0368-3133 (Semester A, 2013/14)

73

Buffer overflow int check_authentication(char *password) { int auth_flag = 0; char password_buffer[16];

strcpy(password_buffer, password); if(strcmp(password_buffer, "brillig") == 0)

auth_flag = 1; if(strcmp(password_buffer, "outgrabe") == 0)

auth_flag = 1; return auth_flag;}int main(int argc, char *argv[]) { if(argc < 2) { printf("Usage: %s <password>\n", argv[0]); exit(0); } if(check_authentication(argv[1])) { printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); printf(" Access Granted.\n"); printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); } else { printf("\nAccess Denied.\n"); }}

(source: “hacking – the art of exploitation, 2nd Ed”)

Page 74: Compilation   0368-3133 (Semester A, 2013/14)

74

Buffer overflow int check_authentication(char *password) { char password_buffer[16]; int auth_flag = 0; strcpy(password_buffer, password); if(strcmp(password_buffer, "brillig") == 0)

auth_flag = 1; if(strcmp(password_buffer, "outgrabe") == 0)

auth_flag = 1; return auth_flag;}int main(int argc, char *argv[]) { if(argc < 2) { printf("Usage: %s <password>\n", argv[0]); exit(0); } if(check_authentication(argv[1])) { printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); printf(" Access Granted.\n"); printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); } else { printf("\nAccess Denied.\n"); }}

(source: “hacking – the art of exploitation, 2nd Ed”)

Page 75: Compilation   0368-3133 (Semester A, 2013/14)

75

Buffer overflow

0x08048529 <+69>: movl $0x8048647,(%esp) 0x08048530 <+76>: call 0x8048394 <puts@plt> 0x08048535 <+81>: movl $0x8048664,(%esp) 0x0804853c <+88>: call 0x8048394 <puts@plt> 0x08048541 <+93>: movl $0x804867a,(%esp) 0x08048548 <+100>: call 0x8048394 <puts@plt> 0x0804854d <+105>: jmp 0x804855b <main+119> 0x0804854f <+107>: movl $0x8048696,(%esp) 0x08048556 <+114>: call 0x8048394 <puts@plt>

Page 76: Compilation   0368-3133 (Semester A, 2013/14)

76

Nested Procedures

• For example – Pascal• Any routine can have sub-routines• Any sub-routine can access anything that is

defined in its containing scope or inside the sub-routine itself– “non-local” variables

Page 77: Compilation   0368-3133 (Semester A, 2013/14)

77

Example: Nested Proceduresprogram p;var x: Integer;procedure a

var y: Integer;procedure b begin…b… end;

function cvar z: Integer;procedure d begin…d… end;

begin…c…end;

begin…a… end;

begin…p… end.

possible call sequence:

pa a c b c d

what is the address of variable “y” in

procedure d?

Page 78: Compilation   0368-3133 (Semester A, 2013/14)

78

Nested Procedures• can call a sibling, ancestor• when “c” uses (non-local)

variables from “a”, which instance of “a” is it?

• how do you find the right activation record at runtime?

possible call sequence:pa a c b c d

ab

P

c c

d

a

Page 79: Compilation   0368-3133 (Semester A, 2013/14)

79

Nested Procedures• goal: find the closest routine in

the stack from a given nesting level

• if we reached the same routine in a sequence of calls– routine of level k uses variables of

the same nesting level, it uses its own variables

– if it uses variables of nesting level j < k then it must be the last routine called at level j

• If a procedure is last at level j on the stack, then it must be ancestor of the current routine

possible call sequence:pa a c b c d

ab

P

c c

d

a

Page 80: Compilation   0368-3133 (Semester A, 2013/14)

80

Nested Procedures• problem: a routine may need to access variables of

another routine that contains it statically• solution: lexical pointer (a.k.a. access link) in the

activation record• lexical pointer points to the last activation record of the

nesting level above it– in our example, lexical pointer of d points to activation records

of c• lexical pointers created at runtime• number of links to be traversed is known at compile time

Page 81: Compilation   0368-3133 (Semester A, 2013/14)

81

Lexical Pointers program p;var x: Integer;procedure a var y: Integer; procedure b begin…b… end;

function c var z: Integer; procedure d begin…d… end;

begin…c…end;

begin…a… end;

begin…p… end.

a

a

c

b

c

d

y

y

z

z

possible call sequence:pa a c b c d

a

b

P

c c

d

a

Page 82: Compilation   0368-3133 (Semester A, 2013/14)

92

Activation Records: Remarks

Page 83: Compilation   0368-3133 (Semester A, 2013/14)

Non-Local goto in C syntax

93

Page 84: Compilation   0368-3133 (Semester A, 2013/14)

Non-local gotos in C

• setjmp remembers the current location and the stack frame

• longjmp jumps to the current location (popping many activation records)

94

Page 85: Compilation   0368-3133 (Semester A, 2013/14)

Non-Local Transfer of Control in C

95

Page 86: Compilation   0368-3133 (Semester A, 2013/14)

Stack Frames• Allocate a separate space for every procedure incarnation• Relative addresses• Provide a simple mean to achieve modularity• Supports separate code generation of procedures• Naturally supports recursion• Efficient memory allocation policy

– Low overhead– Hardware support may be available

• LIFO policy• Not a pure stack

– Non local references– Updated using arithmetic

96

Page 87: Compilation   0368-3133 (Semester A, 2013/14)

The Frame Pointer• The caller

– the calling routine• The callee

– the called routine• caller responsibilities:

– Calculate arguments and save in the stack– Store lexical pointer

• call instruction: M[--SP] := RA PC := callee

• callee responsibilities:– FP := SP– SP := SP - frame-size

• Why use both SP and FP?97

Page 88: Compilation   0368-3133 (Semester A, 2013/14)

Variable Length Frame Size• C allows allocating objects of unbounded

size in the stack void p() { int i; char *p; scanf(“%d”, &i); p = (char *) alloca(i*sizeof(int)); }

• Some versions of Pascal allows conformant array value parameters

98

Page 89: Compilation   0368-3133 (Semester A, 2013/14)

Limitations

• The compiler may be forced to store a value on a stack instead of registers

• The stack may not suffice to handle some language features

99

Page 90: Compilation   0368-3133 (Semester A, 2013/14)

Frame-Resident Variables• A variable x cannot be stored in register when:

– x is passed by reference– Address of x is taken (&x)– is addressed via pointer arithmetic on the stack-frame

(C varags)– x is accessed from a nested procedure– The value is too big to fit into a single register– The variable is an array– The register of x is needed for other purposes– Too many local variables

• An escape variable:– Passed by reference– Address is taken– Addressed via pointer arithmetic on the stack-frame– Accessed from a nested procedure 100

Page 91: Compilation   0368-3133 (Semester A, 2013/14)

The Frames in Different Architectures

Pentium MIPS Sparc

InFrame(8) InFrame(0) InFrame(68)

InFrame(12) InReg(X157) InReg(X157)

InFrame(16) InReg(X158) InReg(X158)

M[sp+0]fpfp spsp sp-K

sp sp-KM[sp+K+0] r2

X157 r4X158 r5

save %sp, -K, %spM[fp+68]i0

X157i1

X158i2

101

g(x, y, z) where x escapes

x

y

z

View

Change

Page 92: Compilation   0368-3133 (Semester A, 2013/14)

Limitations of Stack Frames• A local variable of P cannot be stored in the activation record of P if its

duration exceeds the duration of P• Example 1: Static variables in C

(own variables in Algol)void p(int x){ static int y = 6 ; y += x; }

• Example 2: Features of the C languageint * f() { int x ; return &x ;}

• Example 3: Dynamic allocationint * f() { return (int *) malloc(sizeof(int)); }

103

Page 93: Compilation   0368-3133 (Semester A, 2013/14)

Compiler Implementation

• Hide machine dependent parts• Hide language dependent part• Use special modules

104

Page 94: Compilation   0368-3133 (Semester A, 2013/14)

Basic Compiler Phases

105

Source program (string)

.EXE

lexical analysis

syntax analysis

semantic analysis

Code generation

Assembler/Linker

Tokens

Abstract syntax tree

Assembly

Frame managerControl Flow Graph

Page 95: Compilation   0368-3133 (Semester A, 2013/14)

Hidden in the frame ADT

• Word size• The location of the formals• Frame resident variables• Machine instructions to implement “shift-

of-view” (prologue/epilogue)• The number of locals “allocated” so far• The label in which the machine code starts

106

Page 96: Compilation   0368-3133 (Semester A, 2013/14)

Invocations to Frame

• “Allocate” a new frame • “Allocate” new local variable• Return the L-value of local variable• Generate code for procedure invocation• Generate prologue/epilogue• Generate code for procedure return

107

Page 97: Compilation   0368-3133 (Semester A, 2013/14)

108

Activation Records: Summary

• compile time memory management for procedure data

• works well for data with well-scoped lifetime– deallocation when procedure returns

Page 98: Compilation   0368-3133 (Semester A, 2013/14)

109

IR Optimization

• Making code better

Page 99: Compilation   0368-3133 (Semester A, 2013/14)

110

IR Optimization

• Making code “better”

Page 100: Compilation   0368-3133 (Semester A, 2013/14)

“Optimized” evaluation

b

5 c

*

array access

+

abase index

w=0

w=0 w=0

w=1 w=0

w=1

w=1

111

_t0 = cgen( a+b[5*c] )Phase 2: - use weights to decide on order of translation

_t0

_t0

_t0Heavier sub-tree

Heavier sub-tree

_t0 = _t1 * _t0

_t0 = _t1[_t0]

_t0 = _t1 + _t0

_t0_t1

_t1

_t1_t0 = c_t1 = 5

_t1 = b

_t1 = a

Page 101: Compilation   0368-3133 (Semester A, 2013/14)

112

But what about…

a := 1 + 2;y := a + b;x := a + b + 8;z := b + a;

a := a + 1;w:= a + b;

Page 102: Compilation   0368-3133 (Semester A, 2013/14)

113

Overview of IR optimization

• Formalisms and Terminology– Control-flow graphs– Basic blocks

• Local optimizations– Speeding up small pieces of a proceudre

• Global optimizations– Speeding up procedure as a whole

• The dataflow framework– Defining and implementing a wide class of optimizations

Page 103: Compilation   0368-3133 (Semester A, 2013/14)

114

Program Analysis

• In order to optimize a program, the compiler has to be able to reason about the properties of that program

• An analysis is called sound if it never asserts an incorrect fact about a program

• All the analyses we will discuss in this class are sound– (Why?)

Page 104: Compilation   0368-3133 (Semester A, 2013/14)

115

Soundnessint x;int y;

if (y < 5) x = 137;else x = 42;

Print(x);

“At this point in theprogram, x holds some

integer value”

Page 105: Compilation   0368-3133 (Semester A, 2013/14)

116

Soundnessint x;int y;

if (y < 5) x = 137;else x = 42;

Print(x);

“At this point in theprogram, x is either 137

or 42”

Page 106: Compilation   0368-3133 (Semester A, 2013/14)

117

Soundnessint x;int y;

if (y < 5) x = 137;else x = 42;

Print(x);

“At this point in theprogram, x is 137”

Page 107: Compilation   0368-3133 (Semester A, 2013/14)

118

Soundnessint x;int y;

if (y < 5) x = 137;else x = 42;

Print(x);

“At this point in theprogram, x is either 137,

42, or 271”

Page 108: Compilation   0368-3133 (Semester A, 2013/14)

119

Semantics-preserving optimizations

• An optimization is semantics-preserving if it does not alter the semantics of the original program

• Examples:– Eliminating unnecessary temporary variables– Computing values that are known statically at compile-time

instead of runtime– Evaluating constant expressions outside of a loop instead of inside

• Non-examples:– Replacing bubble sort with quicksort (why?)– The optimizations we will consider in this class are all semantics-

preserving

Page 109: Compilation   0368-3133 (Semester A, 2013/14)

120

A formalism for IR optimization

• Every phase of the compiler uses some new abstraction:– Scanning uses regular expressions– Parsing uses CFGs– Semantic analysis uses proof systems and symbol tables– IR generation uses ASTs

• In optimization, we need a formalism that captures the structure of a program in a way amenable to optimization

Page 110: Compilation   0368-3133 (Semester A, 2013/14)

121

Visualizing IRmain:

_tmp0 = Call _ReadInteger;a = _tmp0;_tmp1 = Call _ReadInteger;b = _tmp1;_L0:_tmp2 = 0;_tmp3 = b == _tmp2;_tmp4 = 0;_tmp5 = _tmp3 == _tmp4;IfZ _tmp5 Goto _L1;c = a;a = b;_tmp6 = c % a;b = _tmp6;Goto _L0;_L1:Push a;Call _PrintInt;

Page 111: Compilation   0368-3133 (Semester A, 2013/14)

122

Visualizing IRmain:

_tmp0 = Call _ReadInteger;a = _tmp0;_tmp1 = Call _ReadInteger;b = _tmp1;_L0:_tmp2 = 0;_tmp3 = b == _tmp2;_tmp4 = 0;_tmp5 = _tmp3 == _tmp4;IfZ _tmp5 Goto _L1;c = a;a = b;_tmp6 = c % a;b = _tmp6;Goto _L0;_L1:Push a;Call _PrintInt;

Page 112: Compilation   0368-3133 (Semester A, 2013/14)

123

Visualizing IRmain:

_tmp0 = Call _ReadInteger;a = _tmp0;_tmp1 = Call _ReadInteger;b = _tmp1;_L0:_tmp2 = 0;_tmp3 = b == _tmp2;_tmp4 = 0;_tmp5 = _tmp3 == _tmp4;IfZ _tmp5 Goto _L1;c = a;a = b;_tmp6 = c % a;b = _tmp6;Goto _L0;_L1:Push a;Call _PrintInt;

_tmp0 = Call _ReadInteger;a = _tmp0;_tmp1 = Call _ReadInteger;b = _tmp1;

_tmp2 = 0;_tmp3 = b == _tmp2;_tmp4 = 0;_tmp5 = _tmp3 == _tmp4;IfZ _tmp5 Goto _L1;

c = a;a = b;_tmp6 = c % a;b = _tmp6;Goto _L0;

Push a;Call _PrintInt;

start

end

Page 113: Compilation   0368-3133 (Semester A, 2013/14)

124

Basic blocks

• A basic block is a sequence of IR instructions where– There is exactly one spot where control enters the

sequence, which must be at the start of the sequence

– There is exactly one spot where control leaves the sequence, which must be at the end of the sequence

• Informally, a sequence of instructions that always execute as a group

Page 114: Compilation   0368-3133 (Semester A, 2013/14)

125

Control-Flow Graphs• A control-flow graph (CFG) is a graph of the basic blocks

in a function• The term CFG is overloaded – from here on out, we'll

mean “control-flow graph” and not “context free grammar”

• Each edge from one basic block to another indicates that control can flow from the end of the first block to the start of the second block

• There is a dedicated node for the start and end of a function

Page 115: Compilation   0368-3133 (Semester A, 2013/14)

126

Types of optimizations

• An optimization is local if it works on just a single basic block

• An optimization is global if it works on an entire control-flow graph

• An optimization is interprocedural if it works across the control-flow graphs of multiple functions– We won't talk about this in this course

Page 116: Compilation   0368-3133 (Semester A, 2013/14)

127

Basic blocks exerciseint main() {

int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

START:_t0 = 137;y = _t0;IfZ x Goto _L0;t1 = y;z = _t1;Goto END:_L0:_t2 = y;x = _t2;END:

Divide the code into basic blocks

Page 117: Compilation   0368-3133 (Semester A, 2013/14)

128

Control-flow graph exerciseSTART:

_t0 = 137;y = _t0;IfZ x Goto _L0;t1 = y;z = _t1;Goto END:_L0:_t2 = y;x = _t2;END:

Draw the control-flow graph

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 118: Compilation   0368-3133 (Semester A, 2013/14)

Control-flow graph exercise

129

_t0 = 137;y = _t0;IfZ x Goto _L0;

start

_t1 = y;z = _t1;

_t2 = y;x = _t2;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 119: Compilation   0368-3133 (Semester A, 2013/14)

130

Local optimizations

_t0 = 137;y = _t0;IfZ x Goto _L0;

start

_t1 = y;z = _t1;

_t2 = y;x = _t2;

start

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 120: Compilation   0368-3133 (Semester A, 2013/14)

131

Local optimizations

_t0 = 137;y = _t0;IfZ x Goto _L0;

start

_t1 = y;z = _t1;

_t2 = y;x = _t2;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 121: Compilation   0368-3133 (Semester A, 2013/14)

132

Local optimizations

y = 137;IfZ x Goto _L0;

start

_t1 = y;z = _t1;

_t2 = y;x = _t2;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 122: Compilation   0368-3133 (Semester A, 2013/14)

133

Local optimizations

y = 137;IfZ x Goto _L0;

start

_t1 = y;z = _t1;

_t2 = y;x = _t2;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 123: Compilation   0368-3133 (Semester A, 2013/14)

134

Local optimizations

y = 137;IfZ x Goto _L0;

start

z = y;_t2 = y;x = _t2;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 124: Compilation   0368-3133 (Semester A, 2013/14)

135

Local optimizations

y = 137;IfZ x Goto _L0;

start

z = y;_t2 = y;x = _t2;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 125: Compilation   0368-3133 (Semester A, 2013/14)

136

Local optimizations

y = 137;IfZ x Goto _L0;

start

z = y; x = y;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 126: Compilation   0368-3133 (Semester A, 2013/14)

137

Global optimizations

y = 137;IfZ x Goto _L0;

start

z = y; x = y;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 127: Compilation   0368-3133 (Semester A, 2013/14)

138

Global optimizations

y = 137;IfZ x Goto _L0;

start

z = y; x = y;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 128: Compilation   0368-3133 (Semester A, 2013/14)

139

Global optimizations

y = 137;IfZ x Goto _L0;

start

z = 137; x = 137;

End

int main() {int x;int y;int z;

y = 137;if (x == 0)

z = y;else

x = y;}

Page 129: Compilation   0368-3133 (Semester A, 2013/14)

140

Local Optimizations

Page 130: Compilation   0368-3133 (Semester A, 2013/14)

141

Optimization path

IR Control-FlowGraph

CFGbuilder

ProgramAnalysis

AnnotatedCFG

OptimizingTransformation

TargetCode

CodeGeneration

(+optimizations)

donewith IR

optimizations

IRoptimizations

Page 131: Compilation   0368-3133 (Semester A, 2013/14)

142

Common subexpression eliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = 4;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = a + b;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 132: Compilation   0368-3133 (Semester A, 2013/14)

143

Common subexpression eliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = 4;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = a + b;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 133: Compilation   0368-3133 (Semester A, 2013/14)

144

Common subexpression eliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = 4;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = _tmp4;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 134: Compilation   0368-3133 (Semester A, 2013/14)

145

Common subexpression eliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = 4;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = _tmp4;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 135: Compilation   0368-3133 (Semester A, 2013/14)

146

Common subexpression eliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = _tmp4;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 136: Compilation   0368-3133 (Semester A, 2013/14)

147

Common subexpression eliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = _tmp4;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 137: Compilation   0368-3133 (Semester A, 2013/14)

148

Common subexpression eliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = c;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 138: Compilation   0368-3133 (Semester A, 2013/14)

149

Common Subexpression Elimination

• If we have two variable assignmentsv1 = a op b…v2 = a op b

• and the values of v1, a, and b have not changed between the assignments, rewrite the code asv1 = a op b…v2 = v1

• Eliminates useless recalculation• Paves the way for later optimizations

Page 139: Compilation   0368-3133 (Semester A, 2013/14)

150

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = c;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 140: Compilation   0368-3133 (Semester A, 2013/14)

151

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = c;_tmp6 = *(x);_tmp7 = *(_tmp6);Push _tmp5;Push x;Call _tmp7;

Page 141: Compilation   0368-3133 (Semester A, 2013/14)

152

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = c;_tmp6 = *(_tmp1);_tmp7 = *(_tmp6);Push _tmp5;Push _tmp1;Call _tmp7;

Page 142: Compilation   0368-3133 (Semester A, 2013/14)

153

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = a + b;c = _tmp4;_tmp5 = c;_tmp6 = *(_tmp1);_tmp7 = *(_tmp6);Push _tmp5;Push _tmp1;Call _tmp7;

Page 143: Compilation   0368-3133 (Semester A, 2013/14)

154

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = *(_tmp1);_tmp7 = *(_tmp6);Push _tmp5;Push _tmp1;Call _tmp7;

Page 144: Compilation   0368-3133 (Semester A, 2013/14)

155

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = *(_tmp1);_tmp7 = *(_tmp6);Push _tmp5;Push _tmp1;Call _tmp7;

Page 145: Compilation   0368-3133 (Semester A, 2013/14)

156

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = *(_tmp1);_tmp7 = *(_tmp6);Push c;Push _tmp1;Call _tmp7;

Page 146: Compilation   0368-3133 (Semester A, 2013/14)

157

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = *(_tmp1);_tmp7 = *(_tmp6);Push c;Push _tmp1;Call _tmp7;

Page 147: Compilation   0368-3133 (Semester A, 2013/14)

158

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = _tmp2;_tmp7 = *(_tmp6);Push c;Push _tmp1;Call _tmp7;

Page 148: Compilation   0368-3133 (Semester A, 2013/14)

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = _tmp2;_tmp7 = *(_tmp6);Push c;Push _tmp1;Call _tmp7;

159

Page 149: Compilation   0368-3133 (Semester A, 2013/14)

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = _tmp2;_tmp7 = *(_tmp2);Push c;Push _tmp1;Call _tmp7;

160

Page 150: Compilation   0368-3133 (Semester A, 2013/14)

Copy Propagation

161

Object x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp3;_tmp4 = _tmp3 + b;c = _tmp4;_tmp5 = c;_tmp6 = _tmp2;_tmp7 = *(_tmp2);Push c;Push _tmp1;Call _tmp7;

Page 151: Compilation   0368-3133 (Semester A, 2013/14)

Copy PropagationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp0;_tmp4 = _tmp0 + b;c = _tmp4;_tmp5 = c;_tmp6 = _tmp2;_tmp7 = *(_tmp2);Push c;Push _tmp1;Call _tmp7;

162

Page 152: Compilation   0368-3133 (Semester A, 2013/14)

Copy Propagation

• If we have a variable assignmentv1 = v2then as long as v1 and v2 are not reassigned, we can rewrite expressions of the forma = … v1 …asa = … v2 …provided that such a rewrite is legal

163

Page 153: Compilation   0368-3133 (Semester A, 2013/14)

Dead Code EliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp0;_tmp4 = _tmp0 + b;c = _tmp4;_tmp5 = c;_tmp6 = _tmp2;_tmp7 = *(_tmp2);Push c;Push _tmp1;Call _tmp7;

164

Page 154: Compilation   0368-3133 (Semester A, 2013/14)

Dead Code EliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;x = _tmp1;_tmp3 = _tmp0;a = _tmp0;_tmp4 = _tmp0 + b;c = _tmp4;_tmp5 = c;_tmp6 = _tmp2;_tmp7 = *(_tmp2);Push c;Push _tmp1;Call _tmp7;

165

values never read

values never read

Page 155: Compilation   0368-3133 (Semester A, 2013/14)

Dead Code EliminationObject x;int a;int b;int c;

x = new Object;a = 4;c = a + b;x.fn(a + b);

_tmp0 = 4;Push _tmp0;_tmp1 = Call _Alloc;Pop tmp2;*(_tmp1) = _tmp2;

_tmp4 = _tmp0 + b;c = _tmp4;

_tmp7 = *(_tmp2);Push c;Push _tmp1;Call _tmp7;

166

Page 156: Compilation   0368-3133 (Semester A, 2013/14)

167

Dead Code Elimination

• An assignment to a variable v is called dead if the value of that assignment is never read anywhere

• Dead code elimination removes dead assignments from IR

• Determining whether an assignment is dead depends on what variable is being assigned to and when it's being assigned

Page 157: Compilation   0368-3133 (Semester A, 2013/14)

168

Applying local optimizations

• The different optimizations we've seen so far all take care of just a small piece of the optimization

• Common subexpression elimination eliminates unnecessary statements

• Copy propagation helps identify dead code• Dead code elimination removes statements that are

no longer needed• To get maximum effect, we may have to apply these

optimizations numerous times

Page 158: Compilation   0368-3133 (Semester A, 2013/14)

169

Applying local optimizations example

b = a * a;c = a * a;d = b + c;e = b + b;

Page 159: Compilation   0368-3133 (Semester A, 2013/14)

170

Applying local optimizations example

b = a * a;c = a * a;d = b + c;e = b + b;

Which optimization should we apply here?

Page 160: Compilation   0368-3133 (Semester A, 2013/14)

171

Applying local optimizations example

b = a * a;c = b;d = b + c;e = b + b;

Common sub-expression elimination

Which optimization should we apply here?

Page 161: Compilation   0368-3133 (Semester A, 2013/14)

172

Applying local optimizations example

b = a * a;c = b;d = b + c;e = b + b;

Which optimization should we apply here?

Page 162: Compilation   0368-3133 (Semester A, 2013/14)

173

Applying local optimizations example

b = a * a;c = b;d = b + b;e = b + b;

Which optimization should we apply here?

Copy propagation

Page 163: Compilation   0368-3133 (Semester A, 2013/14)

174

Applying local optimizations example

b = a * a;c = b;d = b + b;e = b + b;

Which optimization should we apply here?

Page 164: Compilation   0368-3133 (Semester A, 2013/14)

175

Applying local optimizations example

b = a * a;c = b;d = b + b;e = d;

Which optimization should we apply here?

Common sub-expression elimination (again)

Page 165: Compilation   0368-3133 (Semester A, 2013/14)

176

Other types of local optimizations

• Arithmetic Simplification– Replace “hard” operations with easier ones– e.g. rewrite x = 4 * a; as x = a << 2;

• Constant Folding– Evaluate expressions at compile-time if they

have a constant value.– e.g. rewrite x = 4 * 5; as x = 20;

Page 166: Compilation   0368-3133 (Semester A, 2013/14)

177

Optimizations and analyses

• Most optimizations are only possible given some analysis of the program's behavior

• In order to implement an optimization, we will talk about the corresponding program analyses

Page 167: Compilation   0368-3133 (Semester A, 2013/14)

178

The End