Compilers and Language Processing ToolsSummer Term 2011
Prof. Dr. Arnd Poetzsch-Heffter
Software Technology GroupTU Kaiserslautern
© Prof. Dr. Arnd Poetzsch-Heffter
Content of Lecture
1. Introduction
2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-Free Syntax Analysis
   2.3 Context-Dependent Analysis
3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
   3.2 Translation of Object-Oriented Language Constructs
4. Selected Topics in Compiler Construction
   4.1 Intermediate Languages
   4.2 Optimization
   4.3 Register Allocation
   4.4 Just-in-time Compilation
   4.5 Further Aspects of Compilation
5. Garbage Collection
6. XML Processing (DOM, SAX, XSLT)
4. Selected Topics in Compiler Construction
Chapter Outline
4. Selected Topics in Compiler Construction
   4.1 Intermediate Languages
       4.1.1 3-Address Code
       4.1.2 Other Intermediate Languages
   4.2 Optimization
       4.2.1 Classical Optimization Techniques
       4.2.2 Potential of Optimizations
       4.2.3 Data Flow Analysis
       4.2.4 Non-local Optimization
   4.3 Register Allocation
       4.3.1 Sethi-Ullman Algorithm
       4.3.2 Register Allocation by Graph Coloring
   4.4 Just-in-time Compilation
   4.5 Further Aspects of Compilation
Selected topics in compiler construction
Focus:
• Techniques that go beyond the direct translation of source languages to target languages
• Concentrate on concepts instead of language-dependent details
• Use program representations tailored for the considered tasks (instead of source language syntax):
  - simplifies the representation
  - (but needs more work to integrate the tasks)
Selected topics in compiler construction (2)
Learning objectives:
• Intermediate languages for translation and optimization of imperative languages
• Different optimization techniques
• Different static analysis techniques for (intermediate) programs
• Register allocation
• Some aspects of code generation
4.1 Intermediate languages
• Intermediate languages are used as
  - an appropriate program representation for certain language implementation tasks
  - a common representation for programs of different source languages
[Diagram: programs of Source Language 1, Source Language 2, ..., Source Language n are translated to a common Intermediate Language, from which code for Target Language 1, Target Language 2, ..., Target Language m is generated.]
Intermediate languages (2)
• Intermediate languages for translation are comparable to data structures in algorithm design, i.e., for each task, an intermediate language is more or less suitable.
• Intermediate languages can conceptually be seen as abstract machines.
4.1.1 3-Address Code
3-address code
3-address code (3AC) is a common intermediate language with many variants.
Properties:
• only elementary data types (but often arrays)
• no nested expressions
• sequential execution; jumps and procedure calls as statements
• named variables as in a high-level language
• unbounded number of temporary variables
3-address code (2)
A program in 3AC consists of
• a list of global variables
• a list of procedures with parameters and local variables
• a main procedure

Each procedure has a sequence of 3AC commands as its body.
3AC commands
Syntax                      Explanation

x := y bop z                x: variable (global, local, parameter, temporary)
x := uop z                  y, z: variable or constant
x := y                      bop: binary operator; uop: unary operator

goto L                      jump or conditional jump to label L
if x cop y goto L           cop: comparison operator
                            only procedure-local jumps

x := a[i]                   a: one-dimensional array
a[i] := y

x := &a                     a: global or local variable or parameter
x := *y                     &a: address of a
*x := y                     *: dereferencing operator
3AC commands (2)
Syntax                      Explanation

param x                     call p(x1, ..., xn) is encoded as:
call p                          param x1
return y                        ...
                                param xn
                                call p
                            (this block is considered as one command)
                            return y causes a jump to the return address,
                            with (optional) result y

We assume that 3AC only contains labels for which jumps are used in the program.
Basic blocks
A sequence of 3AC commands can be uniquely partitioned into basic blocks.

A basic block B is a maximal sequence of commands such that
• at the end of B, exactly one jump, procedure call, or return command occurs
• labels only occur at the first command of a basic block
Basic blocks (2)
Remarks:
• The commands of a basic block are always executed sequentially; there are no jumps into the middle of a block.
• Often, a designated exit block containing the return jump at its end is required for a procedure. This is handled by additional transformations.
• The transitions between basic blocks are often depicted as flow charts.
Example: 3AC and basic blocks
Consider the following C program:

int a[2];
int b[7];

int skprod(int i1, int i2, int lng) { ... }

int main() {
    a[0] = 1; a[1] = 2;
    b[0] = 4; b[1] = 5; b[2] = 6;
    skprod(0,1,2);
    return 0;
}
Example: 3AC and basic blocks (2)
3AC with basic block partitioning for the main procedure:

main:
B1: a[0] := 1
    a[1] := 2
    b[0] := 4
    b[1] := 5
    b[2] := 6
    param 0
    param 1
    param 2
    call skprod
B2: return 0
Example: 3AC and basic blocks (3)
Procedure skprod:

int skprod(int i1, int i2, int lng) {
    int ix, res = 0;
    for (ix = 0; ix <= lng-1; ix++) {
        res += a[i1+ix] * b[i2+ix];
    }
    return res;
}
Example: 3AC and basic blocks (4)
Procedure skprod as 3AC with basic block partitioning:

skprod:
B1: res := 0
    ix := 0
B2: t0 := lng-1
    if ix <= t0        (true → B3, false → B4)
B3: t1 := i1+ix
    t2 := a[t1]
    t1 := i2+ix
    t3 := b[t1]
    t1 := t2*t3
    res := res+t1
    ix := ix+1        (back to B2)
B4: return res
Intermediate Language Variations
3AC after elimination of array operations (for the above example):

skprod:
B1: res := 0
    ix := 0
B2: t0 := lng-1
    if ix <= t0        (true → B3, false → B4)
B3: t1 := i1+ix
    tx := t1*4
    ta := a+tx
    t2 := *ta
    t1 := i2+ix
    tx := t1*4
    tb := b+tx
    t3 := *tb
    t1 := t2*t3
    res := res+t1
    ix := ix+1        (back to B2)
B4: return res
Characteristics of 3-Address Code
• Control flow is explicit.
• Only elementary operations.
• Rearrangement and exchange of commands can be handled relatively easily.
4.1.2 Other Intermediate Languages
Further Intermediate Languages
We consider• 3AC in Static Single Assignment (SSA) representation• Stack Machine Code
Static Single Assignment Form
If a variable a is read at a program position, this is a use of a.
If a variable a is written at a program position, this is a definition of a.

For optimizations, the relationship between uses and definitions of variables is important.

In SSA representation, each variable has exactly one definition. Thus, the relationship between use and definition is explicit in the intermediate language; additional def-use or use-def chaining becomes unnecessary.
Static Single Assignment Form (2)
SSA is essentially a refinement of 3AC.

The different definitions of one variable are represented by indexing the variable.
For sequential command lists, this means that
• at each definition position, the variable gets a new index
• at each use position, the variable carries the index of its last definition
Example: SSA
Original 3AC:        SSA form:

a := x + y           a1 := x0 + y0
b := a - 1           b1 := a1 - 1
a := y + b           a2 := y0 + b1
b := x * 4           b2 := x0 * 4
a := a + b           a3 := a2 + b2
SSA - Join Points of Control Flow
At join points of control flow, an additional mechanism is required:
a1 := x0 + y0        a3 := a2 - b2
          \          /
           b := a?           (which definition of a?)
           ...
SSA - Join Points of Control Flow (2)
Introduce a fictitious "oracle function" Φ that selects the value of the variable from the branch that was actually taken:
a1 := x0 + y0        a3 := a2 - b2
          \          /
           a4 := Φ(a1, a3)
           b3 := a4
           ...
SSA - Remarks
• The construction of an SSA representation with a minimal number of applications of the Φ oracle is a non-trivial task (cf. Appel, Sections 19.1 and 19.2).
• The term static single assignment form reflects that for each variable in the program text, there is only one assignment. Dynamically, a variable in SSA representation can be assigned arbitrarily often (e.g., in loops).
Further intermediate languages
While 3AC and SSA representation are mostly used as intermediate languages inside compilers, intermediate languages and abstract machines are more and more often used as connections between compilers and runtime environments.

Java Byte Code and CIL (Common Intermediate Language, cf. .NET) are examples of stack machine code, i.e., intermediate results are stored on a runtime stack.

Further intermediate languages are, for instance, used for optimizations.
Stack machine code as intermediate language
Linguistically homogeneous scenario for Java:

[Diagram: C1.java and C2.java are compiled with jikes, C2.java and C3.java with javac, producing the class files C1.class, ..., C3.class containing Java Byte Code, which are executed by the JVM.]
Stack machine code as intermediate language (2)
Linguistically (possibly) inhomogeneous scenario for .NET:
[Diagram: programs of different high-level languages — prog1.cs and prog2.cs via a C# compiler, prog3.hs via a Haskell compiler — are translated to Intermediate Language files prog1.il, prog2.il, prog3.il, which are executed by the CLR.]
Example: Stack machine code
Java source (Weltklasse.java):

package beisp;

class Weltklasse extends Superklasse implements BesteBohnen {
    Qualifikation studieren(Arbeit schweiss) {
        return new Qualifikation();
    }
}
Example: Stack machine code (2)
Disassembled byte code (javap output):

Compiled from Weltklasse.java
class beisp.Weltklasse extends beisp.Superklasse implements beisp.BesteBohnen {
    beisp.Weltklasse();
    beisp.Qualifikation studieren(beisp.Arbeit);
}

Method beisp.Weltklasse()
   0 aload_0
   1 invokespecial #6 <Method beisp.Superklasse()>
   4 return

Method beisp.Qualifikation studieren(beisp.Arbeit)
   0 new #2 <Class beisp.Qualifikation>
   3 dup
   4 invokespecial #5 <Method beisp.Qualifikation()>
   7 areturn
4.2 Optimization
Optimization refers to improving code with respect to the following goals:
• Runtime behavior
• Memory consumption
• Size of code
• Energy consumption
Optimization (2)
We distinguish the following kinds of optimizations:
• machine-independent optimizations
• machine-dependent optimizations (exploit properties of a particular real machine)

and
• local optimizations
• intra-procedural optimizations
• inter-procedural/global optimizations
Remark on Optimization
Appel (Chap. 17, p. 350):

"In fact, there can never be a complete list [of optimizations]."

"Computability theory shows that it will always be possible to invent new optimizing transformations."
4.2.1 Classical Optimization Techniques
Constant Propagation
If the value of a variable is constant, the variable can be replaced with the constant.
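As an illustration, a small sketch in the 3AC notation used above (variable names invented for this example):

```
x := 5              x := 5
y := x + 2    ==>   y := 5 + 2
z := x * y          z := 5 * y
```

The resulting constant expressions can then be evaluated by constant folding (next slide).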
Constant Folding
Evaluate all expressions with constants as operands at compile time.
Iteration of Constant Folding and Propagation:
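A sketch of the iteration in 3AC (illustrative values):

```
x := 2 + 3        (folding)        x := 5
y := x * 2        ========>        y := x * 2

                  (propagation)    x := 5
                  ==========>      y := 5 * 2

                  (folding)        x := 5
                  ========>        y := 10
```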
Non-local Constant Optimization
For each program position, the possible values of each variable are required. If the set of possible values is infinite, it has to be abstracted appropriately.
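A sketch of the non-local case, where the constant value must hold on every path to the use (hypothetical fragment):

```
    if c goto L1
    x := 3               -- one path
    goto L2
L1: x := 1 + 2           -- other path: also 3 after folding
L2: y := x + 1           -- x is 3 on all paths to L2, so y := 4
```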
Copy Propagation
Eliminate copies of variables, i.e., if several variables x, y, z at a program position are known to have the same value, all uses of y and z are replaced by x.
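A small 3AC sketch (illustrative variables):

```
y := x              y := x
z := y + 1    ==>   z := x + 1
w := z * y          w := z * x
```

If y is not used elsewhere, its definition then becomes dead code and can be removed.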
Copy Propagation (2)
This can also be done at join points of control flow or for loops.

For each program point, the information which variables have the same value is required.
Common Subexpression Elimination
If an expression or a statement contains the same subexpression several times, the goal is to evaluate this subexpression only once.
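A small 3AC sketch (illustrative variables):

```
t1 := a + b         t1 := a + b
x  := t1 * 2  ==>   x  := t1 * 2
t2 := a + b         y  := t1 - c
y  := t2 - c
```

This is valid only if neither a nor b is redefined between the two computations of a + b.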
Common Subexpression Elimination (2)
Optimization of a basic block is done after transformation to SSA and construction of a DAG.
Common Subexpression Elimination (3)
Remarks:
• The elimination of repeated computations is often done before transformation to 3AC, but can also be reasonable following other transformations.
• The DAG representation of expressions is also used as an intermediate language by some authors.
Algebraic Optimizations
Algebraic laws can be applied in order to enable other optimizations, for example by exploiting associativity and commutativity of addition.

Caution: For finite data types, common algebraic laws are not valid in general.
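A sketch of how commutativity can enable common subexpression elimination (illustrative variables):

```
x := a + b          x := a + b
y := b + a    ==>   y := x          -- b + a rewritten as a + b, then CSE
```

Floating-point addition, for instance, is not associative, so such rewrites must respect the semantics of the data type.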
Strength Reduction
Replace expensive operations by more efficient operations (partially machine-dependent).

For example, y := 2*x can be replaced by

    y := x + x

or by

    y := x << 1
Inline Expansion of Procedure Calls
Replace a call to a non-recursive procedure by the procedure's body, with appropriate substitution of the parameters.
Note: This reduces execution time, but increases code size.
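A sketch, for a hypothetical procedure inc whose body returns x + 1:

```
y := inc(a)     where  inc: return x + 1
      |
      v   (substitute parameter x by a)
y := a + 1
```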
Inline Expansion of Procedure Calls (2)
Remarks:
• Expansion is in general more than text replacement: parameters have to be substituted appropriately, and local variables of the expanded procedure may have to be renamed to avoid name clashes.
Inline Expansion of Procedure Calls (3)
• In OO programs with relatively short methods, expansion is an important optimization technique. However, precise information about the target object is required.
• A refinement of inline expansion is the specialization of procedures/functions when some of the actual parameters are known. This technique can also be applied to recursive procedures/functions.
Dead Code Elimination
Remove code that is not reached during execution or that has no influence on execution.

In one of the above examples, constant folding and propagation produced code in which the assignments to t3 and t4 can be removed, provided t3 and t4 are no longer used after the basic block (not live).
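A sketch of the situation with illustrative values:

```
t3 := 4             -- dead: t3 is used only to compute t4
t4 := t3 + 1        -- dead: t4 only feeds x, whose value was folded
x  := 5       ==>   x := 5
```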
Dead Code Elimination (2)
A typical example of unreachable, and thus dead, code that can be eliminated:
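A sketch in 3AC (hypothetical debug flag):

```
debug := 0
...
if debug == 0 goto L2     -- always taken after constant propagation
    param msg             -- unreachable: can be removed
    call log
L2: ...
```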
Dead Code Elimination (3)
Remarks:
• Dead code is often caused by optimizations.
• Another source of dead code is program modification.
• In the first case, liveness information is the prerequisite for dead code elimination.
Code motion
Move commands over branching points in the control flow graph such that they end up in basic blocks that are executed less often.

We consider two cases:
• moving commands into succeeding or preceding branches
• moving code out of loops

Optimization of loops is very profitable, because code inside loops is executed more often than code outside of loops.
Move code over branching points
If a sequential computation branches, each branch is executed less often than the sequence itself.
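A sketch in 3AC (t is assumed to be used only in the true branch):

```
Before:                      After:
    t := a * b                   if c goto L1
    if c goto L1                 ...              -- false branch: t never computed
    ...                          goto L2
L1: y := t + 1               L1: t := a * b
                                 y := t + 1
                             L2: ...
```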
Move code over branching points (2)
A prerequisite for this optimization is that the defined variable is used in only one branch.

Moving a command upward over a preceding join point can be advisable if the command can be eliminated by optimization in one of the branches.
Partial redundancy elimination
Definition (Partial Redundancy)
An assignment is redundant at a program position s if it has already been executed on all paths to s.

An expression e is redundant at s if the value of e has already been calculated on all paths to s.

An assignment/expression is partially redundant at s if it is redundant with respect to some execution paths leading to s.
Partial redundancy elimination (2)
Example:
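A sketch of a partially redundant expression (illustrative blocks):

```
B1: t := a + b        B2: ...           -- a + b computed only on the B1 path
          \           /
         B3: x := a + b                 -- partially redundant at B3
```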
Partial redundancy elimination (3)
Elimination of partial redundancy:
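Continuing the sketch from the previous slide, the computation is inserted on the path that lacked it, making the expression fully redundant:

```
B1: t := a + b        B2: t := a + b    -- computation added on the B2 path
          \           /
         B3: x := t                     -- now fully redundant: reuse t
```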
Partial redundancy elimination (4)
Remarks:
• PRE can be seen as a combination and extension of common subexpression elimination and code motion.
• Extension: elimination of partial redundancy according to the estimated probability that specific paths are executed.
Code motion from loops
Idea: Computations in loops whose operands are not changed inside the loop should be done outside the loop.

This assumes that t1 is not live at the end of the top-most block on the left-hand side.
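A sketch with t1 as the moved definition (illustrative; assumes t1 is not live at the end of the first block before the transformation):

```
Before:                      After:
B1: i := 0                   B1: i := 0
                                 t1 := n * 4      -- loop invariant
B2: t1 := n * 4              B2: if i <= t1 ...   (loop test)
    if i <= t1 ...
```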
Optimization of loop variables
Variables and expressions that are not changed during the execution of a loop are called loop invariant.

Loops often have variables that are increased/decreased systematically in each loop iteration, e.g., in for-loops.

Often, a loop variable depends on another loop variable, e.g., a relative address depends on the loop counter variable.
Optimization of loop variables (2)
Definition (Loop Variables)
A variable i is called an explicit loop variable of a loop S if there is exactly one definition of i in S, and it has the form i := i + c where c is loop invariant.

A variable k is called a derived loop variable of a loop S if there is exactly one definition of k in S, and it has the form k := j * c or k := j + d, where j is a loop variable and c and d are loop invariant.
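A small example for the definition (the constants 1 and 4 are loop invariant):

```
loop S:
    i := i + 1        -- i: explicit loop variable (c = 1)
    k := i * 4        -- k: derived loop variable (j = i, c = 4)
```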
Induction variable analysis
Compute derived loop variables inductively, i.e., instead of computing them from the value of the loop variable, compute them from their value in the previous loop iteration.

Note: For the optimization of derived loop variables, the dependencies between variable definitions have to be precisely understood.
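A sketch in 3AC (illustrative variables; i is the explicit loop variable):

```
Before:                        After:
loop:                          k := i * 4            -- initialization before the loop
    i := i + 1                 loop:
    k := i * 4                     i := i + 1
    ...                            k := k + 4        -- inductive update
```

The multiplication inside the loop is replaced by a cheaper addition (a form of strength reduction).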
Loop unrolling
If the number of loop iterations is known statically, or properties of the number of iterations (e.g., always an even number) can be inferred, the loop body can be copied several times to save comparisons and jumps.

This assumes that ix is dead at the end of the fragment. Note the static computation of ix's values in the unrolled loop.
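A sketch with a statically known iteration count (illustrative; assumes ix is dead afterwards):

```
Before:                        After (fully unrolled):
    ix := 0                        a[0] := 0
L:  if ix > 3 goto E               a[1] := 0
    a[ix] := 0                     a[2] := 0
    ix := ix + 1                   a[3] := 0
    goto L
E:  ...
```

The values of ix are computed statically and substituted into the copies of the body.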
Loop unrolling (2)
Remarks:
• Partial loop unrolling aims at obtaining larger basic blocks in loops, which offers more optimization options.
• Loop unrolling is particularly important for parallel processor architectures and pipelined processing (machine-dependent).
Optimization for other language classes
The optimizations discussed so far aim at imperative languages. For optimizing programs of other language classes, special techniques have been developed.

For example:
• Object-oriented languages: optimization of dynamic binding (type analysis)
• Non-strict functional languages: optimization of lazy function calls (strictness analysis)
• Logic programming languages: optimization of unification
4.2.2 Potential of Optimizations
Potential of optimizations - Example
We use the procedure skprod (3AC after elimination of array operations, as above) to demonstrate some of the techniques and the improvement that can be achieved by optimizations; we also sketch its evaluation.

skprod:
B1: res := 0
    ix := 0
B2: t0 := lng-1
    if ix <= t0        (true → B3, false → B4)
B3: t1 := i1+ix
    tx := t1*4
    ta := a+tx
    t2 := *ta
    t1 := i2+ix
    tx := t1*4
    tb := b+tx
    t3 := *tb
    t1 := t2*t3
    res := res+t1
    ix := ix+1        (back to B2)
B4: return res

Evaluation: number of command steps depending on lng:
2 + 2 + 13*lng + 1 = 13*lng + 5
(lng = 100: 1305;  lng = 1000: 13005)
Potential of optimizations - Example (2)

Moving the computation of the loop-invariant t0 out of the loop:

skprod:
B1: res := 0
    ix := 0
    t0 := lng-1
B2: if ix <= t0        (true → B3, false → B4)
B3: t1 := i1+ix
    tx := t1*4
    ta := a+tx
    t2 := *ta
    t1 := i2+ix
    tx := t1*4
    tb := b+tx
    t3 := *tb
    t1 := t2*t3
    res := res+t1
    ix := ix+1        (back to B2)
B4: return res

Evaluation: 3 + 1 + 12*lng + 1 = 12*lng + 5
(lng = 100: 1205;  lng = 1000: 12005)
Optimization Potential of Optimizations
Potential of optimizations - Example (3)Optimization of loop variables: There are no derived loop variables, becauset1 and tx have several definitions; transformation to SSA for t1 and tx yieldsthat t11, tx1, ta, t12, tb become derived loop variables.
Optimierung von Schleifenvariablen (1):
Zunächst gibt es keine abgeleiteten Schleifenvariablen,
da t1 und tx mehrere Definitionen besitzen; Einführen
von SSA für t1 und tx macht t11, tx1, ta, t12, tx2, tb
zu abgeleiteten Schleifenvariablen:
skprod:
    res := 0
    ix  := 0
    t0  := lng-1
loop:
    if ix<=t0 (false: return res)
    t11 := i1+ix
    tx1 := t11*4
    ta  := a+tx1
    t2  := *ta
    t12 := i2+ix
    tx2 := t12*4
    tb  := b+tx2
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1
    goto loop
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 74
Optimization Potential of Optimizations
Potential of optimizations - Example (4)

Optimization of loop variables (2): Initialization and inductive definition of the loop variables:
skprod:
    res := 0
    ix  := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2
loop:
    if ix<=t0 (false: return res)
    t11 := t11+1
    tx1 := tx1+4
    ta  := ta+4
    t2  := *ta
    t12 := t12+1
    tx2 := tx2+4
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1
    goto loop
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 75
Optimization Potential of Optimizations
Potential of optimizations - Example (5)

Dead code elimination: the assignments to t11, tx1, t12, tx2 in the loop body are dead code, since they do not influence the result.
skprod:
    res := 0
    ix  := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2
loop:
    if ix<=t0 (false: return res)
    ta  := ta+4
    t2  := *ta
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1
    goto loop
Evaluation: 9 + 1 + 8*lng + 1 = 8*lng + 11 (lng = 100: 811; lng = 1000: 8011)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 76
Optimization Potential of Optimizations
Potential of optimizations - Example (6)
Algebraic optimization: exploit the invariant ta = 4*(i1-1+ix) + a to replace the comparison ix <= t0 by ta <= 4*(i1-1+t0) + a:
skprod:
    res := 0
    ix  := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2
    t4  := t11+t0
    t5  := 4*t4
    t6  := t5+a
loop:
    if ta<=t6 (false: return res)
    ta  := ta+4
    t2  := *ta
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1
    goto loop
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 77
Optimization Potential of Optimizations
Potential of optimizations - Example (7)

Dead code elimination: through the transformation of the loop condition, the assignment to ix has become dead code and can be eliminated:
skprod:
    res := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2
    t4  := t11+t0
    t5  := 4*t4
    t6  := t5+a
loop:
    if ta<=t6 (false: return res)
    ta  := ta+4
    t2  := *ta
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    goto loop
Evaluation: 11 + 1 + 7*lng + 1 = 7*lng + 13 (lng = 100: 713; lng = 1000: 7013)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 78
Optimization Potential of Optimizations
Potential of optimizations - Example (8)
Remarks:
• The number of execution steps is reduced by almost half; the most significant reductions are achieved by the loop optimizations.
• Combining optimization techniques is important. Determining a good ordering of the optimizations is difficult in general.
• We have considered the optimizations only by example. The difficulty lies in finding algorithms and heuristics that detect optimization potential automatically and perform the optimizing transformations.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 79
Optimization Data flow analysis
4.2.3 Data flow analysis
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 80
Optimization Data flow analysis
Data flow analysis
For optimizations, data flow information is required that can beobtained by data flow analysis.
Goal: Explain the basic concepts of data flow analysis using examples

Outline:
• Liveness analysis (typical example of a data flow analysis)
• Data flow equations
• Important classes of analyses
Each analysis has an exact specification of which information it provides.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 81
Optimization Data flow analysis
Liveness analysis
Definition (Liveness Analysis)Let P be a program. A variable v is live at a program position S in P ifthere is an execution path π from S to a use of v such that there is nodefinition of v on π.
The liveness analysis determines for all positions S in P whichvariables are live at S.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 82
Optimization Data flow analysis
Liveness analysis (2)
Remarks:
• The definition of liveness of variables is static/syntactic. We had defined dead code dynamically/semantically.
• The result of the liveness analysis for a program P can be represented as a function live mapping positions in P to bit vectors, where a bit vector contains an entry for each variable in P. Let i be the index of a variable v in P; then:

live(S)[i] = 1 iff v is live at position S
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 83
Optimization Data flow analysis
Liveness analysis (3)
Idea:
• In a procedure-local analysis, exactly the global variables are liveat the end of the exit block of the procedure.
• If the live variables out(B) at the end of a basic block B are known,the live variables in(B) at the beginning of B are computed by:
in(B) = gen(B) ∪ (out(B) \ kill(B))
where
I gen(B) is the set of variables v such that v is used in B without a prior definition of v
I kill(B) is the set of variables that are defined in B
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 84
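The gen/kill computation and the equation in(B) = gen(B) ∪ (out(B) \ kill(B)) can be sketched in Python. This is an illustration, not part of the lecture; the encoding of a 3AC instruction as a pair (dest, uses) is an assumption:

```python
# Sketch (assumed instruction encoding): a basic block is a list of
# instructions (dest, uses), scanned from top to bottom.

def gen_kill(block):
    """gen(B): variables used in B before any definition in B.
       kill(B): variables defined in B."""
    gen, kill = set(), set()
    for dest, uses in block:
        # a use counts for gen only if the variable is not yet defined in B
        gen |= {v for v in uses if v not in kill}
        if dest is not None:
            kill.add(dest)
    return gen, kill

def live_in(block, out_b):
    """in(B) = gen(B) union (out(B) minus kill(B))"""
    gen, kill = gen_kill(block)
    return gen | (out_b - kill)

# Example block:  t1 := i1+ix ; ta := a+t1 ; ix := ix+1
block = [("t1", ["i1", "ix"]), ("ta", ["a", "t1"]), ("ix", ["ix"])]
```

For this block, gen(B) = {i1, ix, a} (t1 is defined before its use) and kill(B) = {t1, ta, ix}.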
Optimization Data flow analysis
Liveness analysis (4)
As the set in(B) is computed from out(B), we have a backwardanalysis.
For B not the exit block of the procedure, out(B) is obtained by
out(B) = ⋃_{Bi ∈ succ(B)} in(Bi)
Thus, for a program without loops, in(B) and out(B) are defined for allbasic blocks B. Otherwise, we obtain a system of recursive equations.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 85
Optimization Data flow analysis
Liveness analysis - Example
Question: How do we compute out(B2)?c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 86
Optimization Data flow analysis
Data flow equations
Theory:
• There is always a solution for equations of the considered form.• There is always a smallest solution that is obtained by an iteration
starting from empty in and out sets.
Note: The equations may have several solutions.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 87
Optimization Data flow analysis
Ambiguity of solutions - Example
B0: a := a
B1: b := 7

out(B0) = in(B0) ∪ in(B1)
out(B1) = { }
in(B0) = gen(B0) ∪ (out(B0)\kill(B0)) = {a} ∪ out(B0)
in(B1) = gen(B1) ∪ (out(B1)\kill(B1)) = { }
Thus, out(B0) = in(B0), and hence in(B0) = {a} ∪ in(B0).
Possible Solutions: in(B0) = {a} or in(B0) = {a,b}
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 88
Optimization Data flow analysis
Computation of smallest fixpoint
1. Compute gen(B), kill(B) for all B.
2. Set out(B) = ∅ for all B except for the exit block. For the exit block,out(B) comes from the program context.
3. While out(B) or in(B) changes for any B:
Compute in(B) from current out(B) for all B.
Compute out(B) from in(B) of its successors.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 89
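The three steps above can be sketched as a round-robin fixpoint iteration in Python (an illustration with an assumed CFG encoding, not lecture material). Applied to the two-block example from the ambiguity slide, B0: a := a with successors B0 and B1, and B1: b := 7 as exit block, the iteration indeed yields the smallest solution in(B0) = {a}:

```python
# Sketch (assumed encoding): succ maps block names to successor lists,
# gen/kill map block names to variable sets.

def liveness(succ, gen, kill, exit_block, exit_out=frozenset()):
    blocks = list(succ)
    inn = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    out[exit_block] = set(exit_out)       # given by the program context
    changed = True
    while changed:                        # step 3: iterate until stable
        changed = False
        for b in blocks:
            if b != exit_block:
                # out(B) = union of in(Bi) over all successors Bi of B
                new_out = set()
                for s in succ[b]:
                    new_out |= inn[s]
                if new_out != out[b]:
                    out[b] = new_out
                    changed = True
            # in(B) = gen(B) union (out(B) minus kill(B))
            new_in = gen[b] | (out[b] - kill[b])
            if new_in != inn[b]:
                inn[b] = new_in
                changed = True
    return inn, out

succ = {"B0": ["B0", "B1"], "B1": []}
gen  = {"B0": {"a"}, "B1": set()}
kill = {"B0": {"a"}, "B1": {"b"}}
inn, out = liveness(succ, gen, kill, exit_block="B1")
```

Starting from empty sets selects {a} rather than the larger solution {a, b}.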
Optimization Data flow analysis
Further analyses and classes of analyses
Many data flow analyses can be described as bit vector problems:
• Reaching definitions: Which definitions reach a position S?
• Available expressions: for the elimination of repeated computations
• Very busy expressions: Which expressions are needed on all subsequent computation paths?

The corresponding analyses can be treated analogously to liveness analysis, but differ in
• the definition of the data flow information
• the definition of gen and kill
• the direction of the analysis and the form of the equations
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 90
Optimization Data flow analysis
Further analyses and classes of analyses (2)
For backward analyses, the data flow information at the entry of abasic block B is obtained from the information at the exit of B:
in(B) = gen(B) ∪ (out(B) \ kill(B))
Analyses can further be distinguished by whether they take the union or the intersection of the successor information:

out(B) = ⋃_{Bi ∈ succ(B)} in(Bi)    or    out(B) = ⋂_{Bi ∈ succ(B)} in(Bi)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 91
Optimization Data flow analysis
Further analyses and classes of analyses (3)
For forward analyses, the dependency is the other way round:
out(B) = gen(B) ∪ (in(B) \ kill(B))
with
in(B) = ⋃_{Bi ∈ pred(B)} out(Bi)    or    in(B) = ⋂_{Bi ∈ pred(B)} out(Bi)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 92
Optimization Data flow analysis
Further analyses and classes of analyses (4)
Overview of classes of analyses:
             union                    intersection
forward      reaching definitions     available expressions
backward     live variables           very busy expressions
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 93
Optimization Data flow analysis
Further analyses and classes of analyses (5)
For bit vector problems, data flow information consists of subsets offinite sets.
For other analyses, the collected information is more complex, e.g., forconstant propagation, we consider mappings from variables to values.
For interprocedural analyses, complexity increases because the flowgraph is not static.
Formal basis for the development and correctness of optimizations isprovided by the theory of abstract interpretation.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 94
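As a small illustration of such a non-bit-vector domain, a join for constant propagation might look as follows. This is a hedged sketch, not lecture material; the markers NAC ("not a constant", ⊤) and UNDEF (⊥) and the environment encoding are assumptions:

```python
# Sketch: data flow information for constant propagation is a mapping
# from variables to lattice values {UNDEF, a constant, NAC}.

NAC, UNDEF = object(), object()   # assumed markers for "top" and "bottom"

def join_value(a, b):
    """Join of two lattice values at a control flow merge."""
    if a is UNDEF:
        return b
    if b is UNDEF:
        return a
    return a if a == b else NAC

def join_env(e1, e2):
    """Pointwise join of two variable environments."""
    return {v: join_value(e1.get(v, UNDEF), e2.get(v, UNDEF))
            for v in set(e1) | set(e2)}

# At a merge of {x: 1, y: 2} and {x: 1, y: 3}, x stays constant, y does not:
env = join_env({"x": 1, "y": 2}, {"x": 1, "y": 3})
```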
Optimization Non-Local Program Analysis
4.2.4 Non-Local Program Analysis
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 95
Optimization Non-Local Program Analysis
Non-local program analysis
We use a points-to analysis to demonstrate:• interprocedural aspects: The analysis crosses the borders of
single procedures.• constraints: Program analysis very often involves solving or
refining constraints.• complex analysis results: The analysis result cannot be
represented locally for a statement.• analysis as abstraction: The result of the analysis is an
abstraction of all possible program executions.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 96
Optimization Non-Local Program Analysis
Points-to analysis
Analysis for programs with pointers and for object-oriented programs
Goal: Compute which references to which records/objects a variablecan hold.
Applications of Analysis Results:
Basis for optimizations• Alias information (e.g., important for code motion)
I Can p.f = x cause changes to an object referenced by q?I Can z = p.f read information that is written by p.f = x?
• Call graph construction• Resolution of virtual method calls• Escape analysis
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 97
Optimization Non-Local Program Analysis
Alias Information

Examples (use of points-to analysis information):

A. Use of alias information:

(1) p.f = x;
(2) y = q.f;
(3) q.f = z;

If p == q, statement (2) can be replaced by y = x.
If p != q, the first statement can be exchanged with the other two.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 98
Optimization Non-Local Program Analysis
Elimination of Dynamic Binding
B. Elimination of dynamic binding:

class A {
  void m( ... ) { ... }
}
class B extends A {
  void m( ... ) { ... }
}
...
A p;
p = new B();
p.m(...)  // call of B::m
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 99
Optimization Non-Local Program Analysis
Escape Analysis
C. Escape analysis:

R m( A p ) {
  B q;
  q = new B(); // can be allocated on the stack
  q.f = p;
  q.g = p.n();
  return q.g;
}

The object created for q does not escape the method and can therefore be stored on the stack.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 100
Optimization Non-Local Program Analysis
A Points-to Analysis for Java
Simplifications and assumptions about underlying language• Complete program is known.• Only assignments and method calls of the following form are
used:I Direct assignment: l = rI Write to instance variables: l.f = rI Read of instance variables: l = r.fI Object creation: l = new C()I Simple method call: l = r0.m(r1, ...)
• Expressions without side effects• Compound statements
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 101
Optimization Non-Local Program Analysis
A Points-to Analysis for Java (2)
Analysis type• Flow-insensitive: The control flow of the program has no
influence on the analysis result. The states of the variables atdifferent program points are combined.
• Context-insensitive: Method calls at different program points arenot distinguished.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 102
Optimization Non-Local Program Analysis
A Points-to Analysis for Java (3)
Points-to graph as abstraction
Result of the analysis is a so-called points-to graph having• abstract variables and abstract objects as nodes• edges represent that an abstract variable may have a reference to
an abstract object
Abstract variables V represent sets of concrete variables at runtime.
Abstract objects O represent sets of concrete objects at runtime.
An edge between V and O means that in a certain program state, aconcrete variable in V may reference an object in O.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 103
Optimization Non-Local Program Analysis
Points-to Graph - Example

class Y { ... }
class X {
  Y f;
  void set( Y r ) { this.f = r; }
  static void main() {
    X p = new X(); // s1 "creates" o1
    Y q = new Y(); // s2 "creates" o2
    p.set(q);
  }
}

Resulting points-to graph: p and this point to o1, q and r point to o2, and o1 has an f-edge to o2.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 104
Optimization Non-Local Program Analysis
Points-to Graph - Example (2)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 105
Optimization Non-Local Program Analysis
Definition of the Points-to Graph
For all method implementations,• create node o for each object creation• create nodes for
I each local variable vI each formal parameter p of any method
(incl. this and results (ret))I each static variable s
(Instance variables are modeled by labeled edges.)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 106
Optimization Non-Local Program Analysis
Definition of the Points-to Graph (2)

Edges: smallest fixpoint of f : PtGraph × Stmt → PtGraph with
• f(G, l = new C()) = G ∪ {(l, oi)}
• f(G, l = r) = G ∪ {(l, oi) | oi ∈ Pt(G, r)}
• f(G, l.f = r) = G ∪ {(<oi, f>, oj) | oi ∈ Pt(G, l), oj ∈ Pt(G, r)}
• f(G, l = r.f) = G ∪ {(l, oi) | ∃ oj ∈ Pt(G, r). oi ∈ Pt(G, <oj, f>)}
• f(G, l = r0.m(r1, ..., rn)) = G ∪ ⋃_{oi ∈ Pt(G, r0)} resolve(G, m, oi, r1, ..., rn, l)

where Pt(G, x) is the points-to set of x in G,

resolve(G, m, oi, r1, ..., rn, l) =
  let mj(p0, p1, ..., pn, retj) = dispatch(oi, m) in
    {(p0, oi)} ∪ f(G, p1 = r1) ∪ ... ∪ f(G, l = retj)
  end

and dispatch(oi, m) returns the actual implementation of m for oi with formal parameters p1, ..., pn and result variable retj; p0 refers to this.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 107
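The fixpoint of f can be computed by iterating the transfer functions over all statements until the graph no longer grows. The following Python sketch covers only the assignment forms, not method calls, and uses an assumed statement encoding; it is an illustration, not the algorithm from the cited paper:

```python
# Sketch (assumed encoding): statements are tuples
# ("new", l, site) / ("copy", l, r) / ("store", l, f, r) / ("load", l, f, r).

def points_to(stmts):
    var_pt = {}   # abstract variable -> set of abstract objects
    fld_pt = {}   # (abstract object, field) -> set of abstract objects
    def pt(table, key):
        return table.setdefault(key, set())
    def size():
        return (sum(len(s) for s in var_pt.values()),
                sum(len(s) for s in fld_pt.values()))
    changed = True
    while changed:                    # iterate f until the graph is stable
        before = size()
        for s in stmts:
            kind = s[0]
            if kind == "new":         # l = new C(), object named by its site
                _, l, site = s
                pt(var_pt, l).add(site)
            elif kind == "copy":      # l = r
                _, l, r = s
                pt(var_pt, l).update(pt(var_pt, r))
            elif kind == "store":     # l.f = r
                _, l, f, r = s
                for o in list(pt(var_pt, l)):
                    pt(fld_pt, (o, f)).update(pt(var_pt, r))
            elif kind == "load":      # l = r.f
                _, l, f, r = s
                for o in list(pt(var_pt, r)):
                    pt(var_pt, l).update(pt(fld_pt, (o, f)))
        changed = size() != before
    return var_pt, fld_pt

# p = new X() (site o1); q = new Y() (site o2); p.f = q; y = p.f
stmts = [("new", "p", "o1"), ("new", "q", "o2"),
         ("store", "p", "f", "q"), ("load", "y", "f", "p")]
var_pt, fld_pt = points_to(stmts)
```

The analysis is flow-insensitive: reordering the statements does not change the result.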
Optimization Non-Local Program Analysis
Definition of the Points-to Graph (3)
Remark:
The main problem for practical use of the analysis is the efficientimplementation of the computation of the points-to graph.
Literature:
A. Rountev, A. Milanova, B. Ryder: Points-to Analysis for Java UsingAnnotated Constraints. OOPSLA 2001.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 108
Register Allocation
4.3 Register Allocation
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 109
Register Allocation
Register allocation
Efficient code has to make good use of the available registers on the target machine: accessing registers is much faster than accessing memory (the same holds for the cache).
Register allocation has two aspects:• Determine which variables are implemented by registers at which
positions.• Determine which register implements which variable at which
positions (register assignment).
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 110
Register Allocation
Register allocation (2)
Goals of register allocation

1. Generate code that requires as few registers as possible.
2. Avoid unnecessary memory accesses, i.e., implement not only temporaries but also program variables by registers.
3. Prefer registers for variables that are used often (do not use them for variables that are only rarely accessed).
4. Obey the programmer's requirements.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 111
Register Allocation
Register allocation (3)
Outline
• Algorithm interleaving code generation and register allocationfor nested expressions (cf. Goal 1)
• Algorithm for procedure-local register allocation(cf. Goals 2 and 3)
• Combination and other aspects
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 112
Register Allocation Sethi-Ullmann Algorithm
4.3.1 Sethi-Ullmann Algorithm
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 113
Register Allocation Sethi-Ullmann Algorithm
Evaluation ordering with minimal registers
The algorithm by Sethi and Ullman is an example of an integrated approach to register allocation and code generation (cf. Wilhelm, Maurer, Sect. 12.4.1, p. 584 ff).
Input:

An assignment with a nested expression on the right-hand side:

Assign ( Var, Exp )
Exp    = BinExp | Var
BinExp ( Exp, Op, Exp )
Var    ( Ident )
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 114
Register Allocation Sethi-Ullmann Algorithm
Evaluation ordering with minimal registers (2)
Output:
Machine or intermediate language code with assigned registers.
We consider two-address code, i.e., code with one memory access atmaximum. The machine has r registers represented by R0, . . . ,Rr−1.
The instruction forms are:

Ri := M[V]
M[V] := Ri
Ri := Ri op M[V]
Ri := Ri op Rj
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 115
Register Allocation Sethi-Ullmann Algorithm
Example: Code generation w/ register allocation
Consider f := (a + b)− (c − (d + e))
Assume that there are two registers R0 and R1 available for thetranslation.
Result of direct translation:
R0 := M[a]
R0 := R0 + M[b]
R1 := M[d]
R1 := R1 + M[e]
M[t1] := R1
R1 := M[c]
R1 := R1 - M[t1]
R0 := R0 - R1
M[f] := R0
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 116
Register Allocation Sethi-Ullmann Algorithm
Example: Code generation w/ register allocation (2)
Result of Sethi-Ullmann algorithm:
R0 := M[c]
R1 := M[d]
R1 := R1 + M[e]
R0 := R0 - R1
R1 := M[a]
R1 := R1 + M[b]
R1 := R1 - R0
M[f] := R1
More efficient, because it uses one instruction less and does not needto store intermediate results.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 117
Register Allocation Sethi-Ullmann Algorithm
Sethi-Ullmann algorithm
Goal: Minimize number of registers and number of temporaries.
Idea: Generate code for the subexpression requiring more registers first.

Procedure:
• Define a function regbed that computes the number of registers needed for an expression
• Generate code for an expression E = BinExp(L, OP, R) by case distinction on the register needs of L and R
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 118
Register Allocation Sethi-Ullmann Algorithm
Sethi-Ullmann algorithm (2)
We use the following notations:• v_reg(E): the set of available registers for the translation of E• v_tmp(E): the set of addresses where values can be stored
temporarily when translating E• cell(E): register/memory cell where the result of E is stored
Now, let• E be an expression• L the left subexpression of E• R the right subexpression of E• vr abbreviate |v_reg(E)|
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 119
Register Allocation Sethi-Ullmann Algorithm
Sethi-Ullmann algorithm (3)
We distinguish the following cases:
1. regbed(L) < vr
2. regbed(L) ≥ vr and regbed(R) < vr
3. regbed(L) ≥ vr and regbed(R) ≥ vr
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 120
Register Allocation Sethi-Ullmann Algorithm
Sethi-Ullmann algorithm (4)
Case 1: regbed(L) < vr
• Generate code for R using v_reg(E) and v_tmp(E) with result incell(R)
• Generate code for L using v_reg(E) \{ cell(R) } and v_tmp(E) withresult in cell(L)
• Generate code for the operation cell(L) := cell(L) OP cell(R)• Set cell(E) = cell(L)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 121
Register Allocation Sethi-Ullmann Algorithm
Sethi-Ullmann algorithm (5)
Case 2: regbed(L) ≥ vr and regbed(R) < vr
• Generate code for L using v_reg(E) and v_tmp(E) with result incell(L)
• Generate code for R using v_reg(E) \{ cell(L) } and v_tmp(E) withresult in cell(R)
• Generate code for the operation cell(L) := cell(L) OP cell(R)• Set cell(E) = cell(L)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 122
Register Allocation Sethi-Ullmann Algorithm
Sethi-Ullmann algorithm (6)
Case 3: regbed(L) ≥ vr and regbed(R) ≥ vr
• Generate code for R using v_reg(E) and v_tmp(E) with result incell(R)
• Generate code M[first(v_tmp(E))] := cell(R)• Generate code for L using v_reg(E) and rest(v_tmp(E)) with result
in cell(L)• Generate code for the operation cell(L) := cell(L) OP
M[first(v_tmp(E))]• Set cell(E) = cell(L)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 123
Register Allocation Sethi-Ullmann Algorithm
Sethi-Ullmann algorithm (7)
Function regbed in MAX notation (can be realized by S-Attribution):
ATT regbed( Exp@ E ) Nat:
  IF Assign@<_,Var@ E> : 0
  | BinExp@< Var@ E,_,_> : 1
  | BinExp@<_,_,Var@ E > : 0
  | BinExp@< L,_, R > E :
      IF regbed(L)=regbed(R)
      THEN regbed(L) + 1
      ELSE max( regbed(L), regbed(R) )
  ELSE nil // case does not occur

(In ML, the definition of regbed would be somewhat more involved, since the context of Var expressions cannot be taken into account directly.)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 124
Register Allocation Sethi-Ullmann Algorithm
Example: Sethi-Ullman Algorithm
Consider f:= (( a + b ) - (c + d)) * (a - (d+e))
Attributes at each node: v_reg | v_tmp, regbed, and cell.

(Figure: expression tree for f := ((a+b)-(c+d)) * (a-(d+e)), annotated at each node with the available registers and temporaries, the register need regbed, the result cell, and the case (1.)-(3.) applied at each inner node.)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 125
Register Allocation Sethi-Ullmann Algorithm
Example: Sethi-Ullman Algorithm (2)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 126
Register Allocation Sethi-Ullmann Algorithm
Example: Sethi-Ullman Algorithm (3)
For formalizing the algorithm, we realize the set of available registersand addresses for storing temporaries with lists, where• the list RL of registers is non-empty• the list AL of addresses is long enough• the result cell is always a register which is the first in RL, i.e.,
first(RL)• the function exchange switches the first two elements of a list,
fst returns the first element of the list,rest returns the tail of the list
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 127
Register Allocation Sethi-Ullmann Algorithm
Example: Sethi-Ullman Algorithm (4)
In the following, the function expcode for code generation is given in MAX notation (functional style).

Note: The applications of the functions exchange, fst, and expcode each satisfy their preconditions length(RL) > 1 and length(RL) > 0, respectively.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 128
Register Allocation Sethi-Ullmann Algorithm
Example: Sethi-Ullman Algorithm (5)

FCT expcode( Exp@ E, RegList RL, AdrList AL ) CodeList:  // pre: length(RL)>0
  IF Var@<ID> E:
     [ fst(RL) := M[adr(ID)] ]
  | BinExp@< L,OP,Var@<ID> > E:
     expcode(L,RL,AL)
     ++ [ fst(RL) := fst(RL) OP M[adr(ID)] ]
  | BinExp@< L,OP,R > E:
     LET vr == length( RL ) :
     IF regbed(L) < vr :
        expcode(R,exchange(RL),AL)
        ++ expcode(L,rst(exchange(RL)),AL)
        ++ [ fst(RL) := fst(RL) OP fst(rst(RL)) ]
     | regbed(L)>=vr AND regbed(R)<vr :
        expcode(L,RL,AL)
        ++ expcode(R,rst(RL),AL)
        ++ [ fst(RL) := fst(RL) OP fst(rst(RL)) ]
     | regbed(L)>=vr AND regbed(R)>=vr :
        expcode(R,RL,AL)
        ++ [ M[ fst(AL) ] := fst(RL) ]
        ++ expcode(L,RL,rst(AL))
        ++ [ fst(RL) := fst(RL) OP M[fst(AL)] ]
     ELSE nil
  ELSE []
Remarks:
• The algorithm generates 2AC that is optimal with respect to the number of instructions and the number of temporaries if the expression has no common subexpressions.
• The algorithm shows the mutual dependency between code generation and register allocation.
• In a procedural implementation, the register and address lists can be realized by a global stack.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 129
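A Python transcription of regbed and expcode may make the case distinction concrete. This is a sketch with an assumed expression encoding (strings for variables, tuples (L, op, R) for BinExp), not the MAX code verbatim:

```python
# Sketch of the Sethi-Ullman scheme: regbed computes the register need,
# expcode generates 2AC; rl is the register list, al the temporary addresses.

def regbed(e, right_child=False):
    if isinstance(e, str):            # Var: 1 as left operand, 0 as right
        return 0 if right_child else 1
    l, _, r = e
    bl, br = regbed(l), regbed(r, right_child=True)
    return bl + 1 if bl == br else max(bl, br)

def expcode(e, rl, al):
    if isinstance(e, str):
        return [f"{rl[0]} := M[{e}]"]
    l, op, r = e
    if isinstance(r, str):            # right operand is a variable
        return expcode(l, rl, al) + [f"{rl[0]} := {rl[0]} {op} M[{r}]"]
    vr = len(rl)
    if regbed(l) < vr:                # case 1: evaluate R first
        ex = [rl[1], rl[0]] + rl[2:]  # exchange(RL)
        return (expcode(r, ex, al) + expcode(l, ex[1:], al)
                + [f"{rl[0]} := {rl[0]} {op} {rl[1]}"])
    if regbed(r) < vr:                # case 2: evaluate L first
        return (expcode(l, rl, al) + expcode(r, rl[1:], al)
                + [f"{rl[0]} := {rl[0]} {op} {rl[1]}"])
    # case 3: both need all registers, spill R's result to memory
    return (expcode(r, rl, al) + [f"M[{al[0]}] := {rl[0]}"]
            + expcode(l, rl, al[1:])
            + [f"{rl[0]} := {rl[0]} {op} M[{al[0]}]"])

# f := (a+b) - (c-(d+e)) with two registers:
e = (("a", "+", "b"), "-", ("c", "-", ("d", "+", "e")))
code = expcode(e, ["R0", "R1"], ["t1", "t2"]) + ["M[f] := R0"]
```

For the example expression this yields eight instructions and no store to a temporary memory cell, matching the improvement shown in the example slides (the concrete register roles may differ).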
Register Allocation Register Allocation by Graph Coloring
4.3.2 Register Allocation by Graph Coloring
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 130
Register Allocation Register Allocation by Graph Coloring
Register allocation by graph coloring
Register allocation by graph coloring is an algorithm (with many variants) for the allocation of registers in control flow graphs.

Register allocation for a CFG with 3AC in SSA form
• Input: CFG with 3AC using temporary variables
• Output: structurally the same CFG with
I registers instead of temporary variables
I additional instructions for storing intermediate results on the stack, if applicable
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 131
Register Allocation Register Allocation by Graph Coloring
Register allocation by graph coloring (2)
Remarks:• The SSA representation is not necessary, but simplifies the
formulation of the algorithm(e.g.,Wilhelm/Maurer do not use SSA in Sect. 12.5)
• It is not a restriction that only temporary variables are implemented by registers: we assume that program variables have been assigned to temporary variables in a preceding step.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 132
Register Allocation Register Allocation by Graph Coloring
Live range and interference graph

Definition (Live range)
The live range of a temporary variable is the set of program positions at which it is alive.

Definition (Interference)
Two temporary variables interfere if their live ranges have a non-empty intersection.

Definition (Interference graph)
Let P be a program part/CFG in 3AC/SSA. The interference graph of P is an undirected graph G = (N, E), where
• N is the set of temporary variables,
• an edge (n1, n2) is in E iff n1 and n2 interfere.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 133
Register Allocation Register Allocation by Graph Coloring
Register allocation by graph coloring
Goal: Implement the temporary variables with the available registers.

Idea: Translate the problem to graph coloring (NP-complete). Color the interference graph such that
• neighboring nodes have different colors,
• no more colors are used than there are available registers.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 134
Register Allocation Register Allocation by Graph Coloring
Register allocation by graph coloring (2)
General procedure: Try to color the graph as described below. Then:
• If a coloring is found, terminate.
• If some nodes could not be colored,
  I choose a non-colored node k
  I modify the 3AC program such that the value of k is stored temporarily and only loaded when it is used
  I try to color the modified program

Termination: The procedure terminates because storing values intermediately reduces the live ranges of the temporaries and thus the interferences. In practice, two or three iterations are sufficient.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 135
Register Allocation Register Allocation by Graph Coloring
Register allocation by graph coloring (3)
Coloring algorithm: Let rn be the number of available registers, i.e., at most rn colors may be used for coloring.

The coloring algorithm consists of two phases:
• (a) Simplify with marking
• (b) Coloring
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 136
Register Allocation Register Allocation by Graph Coloring
Simplify with marking
Iteratively remove nodes with fewer than rn neighbors from the graph and push them onto a stack.

Case 1: The simplification steps lead to an empty graph. Continue with the coloring phase.

Case 2: The graph contains only nodes with rn or more neighbors. Choose a suitable node as a candidate for storing its value temporarily, mark it, push it onto the stack, and continue simplification.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 137
Register Allocation Register Allocation by Graph Coloring
Coloring
The nodes are successively popped from the stack and, if possible, colored and put back into the graph.

Let k be the popped node.

Case 1: k is not marked. Thus, it has fewer than rn neighbors in the current graph, so a free color exists and k can be colored.

Case 2: k is marked.
a) Its neighbors use fewer than rn different colors. Then, color k with a free color.
b) Its neighbors use all rn colors. Leave k uncolored.
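The two phases can be sketched in Python. This is a minimal version of the simplify/select scheme: choosing the maximum-degree node as spill candidate is one possible heuristic, and the actual rewriting of the 3AC program for uncolored nodes is omitted.

```python
def color_graph(nodes, edges, rn):
    """Simplify with marking, then color; returns node -> color in
    0..rn-1, or None for nodes left uncolored (spill candidates)."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    degree = {n: len(adj[n]) for n in nodes}
    stack, removed, marked = [], set(), set()

    # Phase (a): simplify with marking
    while len(removed) < len(nodes):
        cand = [n for n in nodes if n not in removed and degree[n] < rn]
        if cand:
            n = cand[0]
        else:
            # only nodes with >= rn neighbors remain: mark a spill candidate
            n = max((m for m in nodes if m not in removed),
                    key=lambda m: degree[m])
            marked.add(n)
        stack.append(n)
        removed.add(n)
        for m in adj[n]:
            if m not in removed:
                degree[m] -= 1

    # Phase (b): coloring, in reverse removal order
    color = {}
    while stack:
        n = stack.pop()
        used = {color[m] for m in adj[n] if color.get(m) is not None}
        free = [c for c in range(rn) if c not in used]
        # unmarked nodes always find a free color; marked ones may not
        color[n] = free[0] if free else None
    return color

coloring = color_graph(["a", "b", "c", "d"],
                       [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")], 2)
print(coloring)   # a proper 2-coloring of the 4-cycle
```

Note that the 4-cycle forces the marking step (every node has degree 2 = rn), yet the marked node still finds a free color in phase (b); marking only means the node *may* stay uncolored.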
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 138
Register Allocation Register Allocation by Graph Coloring
Example - Graph coloring
For simplicity, we only consider one basic block.
In the beginning, t0 and t2 are live.

t1 := a + t0
t3 := t2 - 1
t4 := t1 * t3
t5 := b + t0
t6 := c + t0
t7 := d + t4
t8 := t5 + 8
t9 := t8
t2 := t6 + 4
t0 := t7

In the end, t0, t2, and t9 are live.

[Figure: interference graph over t0, ..., t9]
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 139
Register Allocation Register Allocation by Graph Coloring
Example - Graph coloring (2)

Interference graph: [figure]
Assumption: 4 available registers
Simplification: Remove (in order) t1, t3, t2, t9, t0, t5, t4, t7, t8, t6
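The interference graph can be reconstructed mechanically from the example block: compute the live sets backwards and connect each defined temporary with the temporaries live after its definition. A sketch (the program variables a, b, c, d are ignored, since only temporaries are allocated to registers):

```python
# Backward liveness over the example basic block. Each entry is
# (defined temporary, used temporaries).
block = [
    ("t1", ["t0"]),        # t1 := a + t0
    ("t3", ["t2"]),        # t3 := t2 - 1
    ("t4", ["t1", "t3"]),  # t4 := t1 * t3
    ("t5", ["t0"]),        # t5 := b + t0
    ("t6", ["t0"]),        # t6 := c + t0
    ("t7", ["t4"]),        # t7 := d + t4
    ("t8", ["t5"]),        # t8 := t5 + 8
    ("t9", ["t8"]),        # t9 := t8
    ("t2", ["t6"]),        # t2 := t6 + 4
    ("t0", ["t7"]),        # t0 := t7
]

live = {"t0", "t2", "t9"}                  # live at the end of the block
edges = set()
for dest, uses in reversed(block):
    for other in live - {dest}:            # dest interferes with its live-out
        edges.add(frozenset((dest, other)))
    live = (live - {dest}) | set(uses)     # transfer: (out \ def) u use

print(sorted(live))    # ['t0', 't2']  -- live at the beginning, as stated
print(len(edges))      # number of interference edges
```

The resulting live-in set matches the slide (t0 and t2), and the computed edge set is the interference graph that the simplification order above is applied to.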
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 140
Register Allocation Register Allocation by Graph Coloring
Example - Graph coloring (3)
Possible coloring:
[Figure: interference graph colored in the order t1, t3, t2, t9, t0, t5, t4, t7, t8, t6]
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 141
Register Allocation Register Allocation by Graph Coloring
Example - Graph coloring (4)
Remarks:
There are several extensions of the algorithm:• Elimination of move instructions• Specific heuristics for simplification (What is a suitable node?)• Consider pre-colored nodes
Recommended reading:
• Appel, Sec. 11.1 - 11.3, pp. 238-251
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 142
Register Allocation Register Allocation by Graph Coloring
Further aspects of register allocation
The introduced algorithms consider subproblems. In practice, there are further aspects that have to be dealt with for register allocation:
• Interaction with other compiler phases (in particular optimization and code generation)
• Relation between temporaries and registers
• Source/intermediate/target language
• Usage frequency (is a variable used inside an inner loop?)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 143
Register Allocation Register Allocation by Graph Coloring
Further aspects of register allocation (2)
Possible global procedure
• Allocate registers for standard tasks (registers for stack and argument pointers, base registers)
• Decide which variables and parameters should be stored in registers
• Evaluate the usage frequency of temporaries (occurrences in inner loops, distribution of accesses over the live range)
• Use the evaluation together with the heuristics of the register allocation algorithm
• If applicable, optimize again
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 144
Just-In-Time Compilation
4.4 Just-In-Time Compilation
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 145
Just-In-Time Compilation Language Execution Techniques
4.4.1 Language Execution Techniques
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 146
Just-In-Time Compilation Language Execution Techniques
Static (Ahead-of-Time) Compilation
Compile Time → Runtime:
Source Code → AOT Compiler → Machine Code → Machine
Advantages
• Fast execution
Disadvantages
• Platform dependent• Compilation step
Examples
• C/C++, Pascal
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 147
Just-In-Time Compilation Language Execution Techniques
Interpretation
Runtime:
Source Code → Interpreter
Advantages
• Platform independent• No compilation step
Disadvantages
• Slow execution
Examples
• Bash, Javascript (old browsers)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 148
Just-In-Time Compilation Language Execution Techniques
Use of Virtual Machine Code (Bytecode)
Compile Time → Runtime:
Source Code → AOT Compiler → Bytecode → Virtual Machine
Advantages
• Faster execution• Platform independent
Disadvantages
• Still slow due to interpretation• Compilation step
Examples
• Java, C#
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 149
Just-In-Time Compilation Just-In-Time Compilation
4.4.2 Just-In-Time Compilation
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 150
Just-In-Time Compilation Just-In-Time Compilation
Dynamic (Just-In-Time) Compilation
Runtime:
Byte/Source Code → Virtual Machine/Interpreter → JIT Compiler → Machine Code → Machine
Advantages
• Fast execution• Platform independent
Disadvantages
• JIT runtime overhead
Examples
• Java HotSpot VM, .NET CLR, Mozilla SpiderMonkeyc© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 151
Just-In-Time Compilation Just-In-Time Compilation
Just-in-time Compilation
• Just-in-time (dynamic) compilation compiles code during runtime
• The goal is to improve performance compared to pure interpretation
• Trade-off between compilation cost and execution time benefit
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 152
Just-In-Time Compilation Just-In-Time Compilation
The History of Just-In-Time1
1960 McCarthy: compile LISP functions at runtime
1968 Thompson: compile regular expressions at runtime
1968 Mitchell: get compiled code by storing interpreter actions
1970 Abrams: JIT-Compilers for APL
1974 Hansen: Detect hot-spots using frequency counters
1993 Jones: Use partial evaluation to create compilers frominterpreters
1994 Hölzle: Adaptive optimization for Self
1997 Sun Hot-Spot JVM
2006 Gal and Franz: Tracing JITs
2011 Google V8, Mozilla TraceMonkey
1 See Aycock, 2003
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 153
Just-In-Time Compilation Just-In-Time Compilation
Advantages of JIT Compilation2
Many optimizations can be done at runtime that are not possible in static compilation, because additional runtime information is available:
• Concrete operating system and execution platform
  I e.g., use of SSE2 instructions
• Concrete input values
  I Inline virtual method calls
  I Apply constant folding
  I ...
• Program can be monitored at runtime
  I Optimize hot code
• Global optimizations in the presence of
  I Library code
  I Dynamically loaded code
2Source: http://en.wikipedia.org/wiki/Just-in-time_compilationc© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 154
Just-In-Time Compilation Just-In-Time Compilation
Kinds of JIT Compilation
Classic
• No interpretation
• Compile code with a fast (non-optimizing) traditional compiler

Mixed-Mode
• Start with interpretation
• Only compile hot code
• Examples: Sun Hot-Spot JVM, Mozilla SpiderMonkey

Adaptive Compilation
• No interpretation
• Start with fast compilation (nearly no optimizations)
• Recompile hot code with an optimizing compiler
• Example: Google V8
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 155
Just-In-Time Compilation Just-In-Time Compilation
Design Decisions for JIT Implementations
• JIT implementations have to decide:
  I What to compile? (All code or only some code?)
  I How to compile? (Fast or optimizing?)
  I When to compile? (At startup or when hot code is detected? Longer analysis ⇒ better generated code)
• Decisions may depend on the target machine and the target application
  I Client applications require fast start-up
  I Server applications should be optimized more aggressively
• JIT implementations typically allow configuring these parameters
• Default values are based on empirical data (benchmarks)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 156
Just-In-Time Compilation Just-In-Time Compilation
Different Compilers
Fast Compiler
• Only simple optimizations (e.g., constant folding)
• No intermediate representation
• Simple register allocation (linear time)
• Advantage: fast compilation
• Disadvantage: slow code
Optimizing Compiler
• Use all techniques of traditional compilers
• Disadvantage: slow compilation
• Advantage: very fast code
  I Generated code can outperform C or C++ compiled code due to additional runtime information
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 157
Just-In-Time Compilation Hot-Spot Detection
4.4.3 Hot-Spot Detection
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 158
Just-In-Time Compilation Hot-Spot Detection
Hot-Spot Detection
Observation
Many programs spend the majority of their time executing a minority of their code (hot spots)3

Problem
It is often statically unclear which parts of the program are executed more often than others

Solution
Monitor the code during runtime (profiling)
3 D. E. Knuth. An empirical study of Fortran programs. Software: Practice and Experience 1, pp. 105-133, 1971.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 159
Just-In-Time Compilation Hot-Spot Detection
Profiling
Profiling
1. monitor and trace events that occur during runtime,
2. determine the cost of these events,
3. attribute the cost of these events to specific parts of the program.
Profiling uses the past to predict the future
Ways to profile
• Time-based profiling
• Counter-based profiling
• Sampling-based profiling
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 160
Just-In-Time Compilation Hot-Spot Detection
Time-based Profiling
Method
• Record the time spent in each method
• Profiling instructions are inserted in the prolog and epilog
• Measure the time and add it to the total time of the method
• Methods are compiled when a certain amount of time has been spent in that method
Properties
• All methods are profiled
• May be inaccurate for short methods
• Very large overhead
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 161
Just-In-Time Compilation Hot-Spot Detection
Counter-based Profiling
Method
• Invocation counter for each method (and for loop back-branches)
• Increase the counter for each method call (branch taken)
• Compile a method when its counter reaches a predefined threshold
Properties
• All methods are profiled
• Accurate
• Difficult to choose good thresholds
• Large overhead
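Counter-based profiling can be sketched as a Python decorator. The threshold value and the `compiled` set (standing in for handing the method to an actual JIT compiler) are illustrative:

```python
THRESHOLD = 3                 # illustrative; real VMs use much larger values
compiled = set()              # names handed to the (hypothetical) JIT compiler

def profiled(fn):
    """Counter-based profiling: count invocations and 'compile' the
    method once the counter reaches THRESHOLD (mixed-mode sketch)."""
    count = {"calls": 0}
    def wrapper(*args, **kwargs):
        count["calls"] += 1
        if count["calls"] == THRESHOLD:
            compiled.add(fn.__name__)   # would trigger JIT compilation here
        return fn(*args, **kwargs)
    return wrapper

@profiled
def hot(x):
    return 2 * x

@profiled
def cold(x):
    return x + 1

for i in range(10):
    hot(i)
cold(1)
print(sorted(compiled))   # ['hot']
```

Only the frequently called method crosses the threshold; the rarely called one stays interpreted, which is exactly the mixed-mode behavior described above.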
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 162
Just-In-Time Compilation Hot-Spot Detection
Sampling-based Profiling
Method
• Counter for each method• Sample application periodically (e.g., every 10ms)• Increase counter of current method (and caller method)• Compile method when counter reaches a predefined threshold
Properties
• Low overhead• May miss methods• Non-deterministic (difficult to debug)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 163
Just-In-Time Compilation Further Aspects of JIT Compilers
4.4.4 Further Aspects of JIT Compilers
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 164
Just-In-Time Compilation Further Aspects of JIT Compilers
Memory Management of Compiled Code
Problem
• Compiled (native) code is often 4-8 times larger than the original bytecode
• Compiled code must be held in memory

Solution

• To reduce memory consumption, only a fixed amount (a cache) of compiled code is held in memory
Cache Replacement Strategies
• FIFO (First In First Out)
• LRU (Least Recently Used)
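An LRU-bounded code cache can be sketched with `collections.OrderedDict`. The capacity value and storing "native code" as plain strings are illustrative:

```python
from collections import OrderedDict

class CodeCache:
    """Keep at most `capacity` compiled methods; evict the least
    recently used one when the cache overflows (LRU strategy)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()          # method name -> native code

    def get(self, name):
        if name not in self.cache:
            return None                     # cache miss: must (re)compile
        self.cache.move_to_end(name)        # mark as recently used
        return self.cache[name]

    def put(self, name, native_code):
        self.cache[name] = native_code
        self.cache.move_to_end(name)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the LRU entry

cache = CodeCache(2)
cache.put("f", "<code f>")
cache.put("g", "<code g>")
cache.get("f")                  # touch f, so g becomes least recently used
cache.put("h", "<code h>")      # evicts g
print(sorted(cache.cache))      # ['f', 'h']
```

A FIFO variant would simply skip the `move_to_end` call in `get`, evicting in pure insertion order.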
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 165
Just-In-Time Compilation Further Aspects of JIT Compilers
On-Stack Replacement (OSR)
Problem
• When a hot loop is detected, the compiled version of the executing method is only entered the next time the method is called (which may never happen)
Solution
• Compile a special version of the method that starts in the middle of the method, where the loop is executing
• Stop interpreting the executing method and execute the special compiled version
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 166
Just-In-Time Compilation Further Aspects of JIT Compilers
De-Optimization
Problem
• In languages that allow dynamic code loading (e.g., Java), optimizations may become invalid
• For example: method inlining of virtual method calls can become invalid when new classes are added to the type hierarchy
De-Optimization
• Optimized code can be deoptimized at runtime
• Deoptimized code can be reoptimized again
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 167
Just-In-Time Compilation Further Aspects of JIT Compilers
Inline Caches (1/2)4
Problem
• Message lookup in prototype-based languages like Javascript or Smalltalk can be expensive due to complex lookup rules.

Observation
• Receiver objects at a given call site are often of the same type

Idea
• After the first dynamic lookup, inline the lookup result at the call site
• Add a type check to fall back to dynamic lookup and update the cache
4 Good introduction: http://blog.cdleary.com/2010/09/picing-on-javascript-for-fun-and-profit/
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 168
Just-In-Time Compilation Further Aspects of JIT Compilers
Inline Caches (2/2)
Example (Javascript)
function isPoint(obj) {
  return obj.isPoint;
}
Generated code (pseudo code):
type := gettype(obj)
if type = CACHED_TYPE
    result = staticcall CACHED_METHOD
    jump L
else
    result = dynamiccall obj, "isPoint"
    # ... update cached values (modify generated code)
L: return result
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 169
Just-In-Time Compilation Further Aspects of JIT Compilers
Polymorphic Inline Caches (PICs)5
Problem
• Inline caches only work for a single type (monomorphic type)
Solution
• Polymorphic Inline Caches (PICs)
• Like (monomorphic) inline caches, but handle multiple cases
• If the type check fails, add an additional case (linear search)
• If a certain number of cases is reached, treat the call site as megamorphic and only do dynamic lookup
5 Craig Chambers, David Ungar, and Elgin Lee. Optimizing dynamically-typed object-oriented languages with polymorphic inline caches. ECOOP 1991.
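The progression from monomorphic to polymorphic to megamorphic can be sketched in Python. The `MEGAMORPHIC_LIMIT` value and the use of `getattr` as the "expensive" dynamic lookup are illustrative:

```python
MEGAMORPHIC_LIMIT = 4   # illustrative cut-off for giving up on caching

class CallSite:
    """Polymorphic inline cache for one call site: a linear list of
    (type, method) entries, with a fallback to dynamic lookup."""
    def __init__(self):
        self.entries = []          # list of (receiver type, lookup result)
        self.megamorphic = False

    def call(self, receiver, name, *args):
        t = type(receiver)
        if not self.megamorphic:
            for cached_type, method in self.entries:   # linear search
                if cached_type is t:
                    return method(receiver, *args)     # cache hit
        method = getattr(t, name)                      # dynamic lookup
        if not self.megamorphic:
            self.entries.append((t, method))           # add a new case
            if len(self.entries) >= MEGAMORPHIC_LIMIT:
                self.megamorphic = True                # stop caching
        return method(receiver, *args)

class Point:
    def describe(self):
        return "point"

class Circle:
    def describe(self):
        return "circle"

site = CallSite()
print(site.call(Point(), "describe"))   # dynamic lookup, then cached
print(site.call(Point(), "describe"))   # served from the cache
print(site.call(Circle(), "describe"))  # second entry added (polymorphic)
```

In compiled code the linear search is a short chain of type-check-and-jump instructions patched into the call site, rather than a Python list.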
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 170
Just-In-Time Compilation Tracing JIT Compilers
4.4.5 Tracing JIT Compilers
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 171
Just-In-Time Compilation Tracing JIT Compilers
Tracing JIT Compilers
Observation
• Most time is spent in hot paths
Idea
• Concentrate on hot paths and not whole methods/code blocks
Approach
• Detect hot paths at runtime
• Record a trace when a hot path is detected
• Generate optimized code for individual traces
• Use trace trees instead of control flow graphs
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 172
Just-In-Time Compilation Tracing JIT Compilers
Example
1: code;
2: do {
       if (rare condition) {
3:         code;
       } else {
4:         code;
       }
5: } while (frequent condition);
6: code;

Control Flow Graph: [figure with nodes 1-6]
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 173
Just-In-Time Compilation Tracing JIT Compilers
Example
1: code;
2: do {
       if (rare condition) {
3:         code;
       } else {
4:         code;
       }
5: } while (frequent condition);
6: code;

Control Flow Graph: [figure with nodes 1-6]
hot path = (2,4,5,2)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 174
Just-In-Time Compilation Tracing JIT Compilers
Hot Path Detection
• Only loops are considered for hot path detection (hot loops)
• Add a counter to each destination of a backward branch (potential loop header)
• Interpret the program
• Increase the counter when the branch is taken
• When a threshold is reached (e.g., 2 in TraceMonkey), a hot loop is detected
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 175
Just-In-Time Compilation Tracing JIT Compilers
Tracing
1. When a hot loop is detected, start code tracing
2. Record all interpreter instructions
3. Stop recording when either
   I a cycle is found (tracing finished)
   I the trace becomes too long (tracing aborted)
   I an exception is thrown (tracing aborted)
4. The result is a code trace (loop trace)
5. Branches in a code trace are replaced by guards to handle side exits
   I Failed guards return control to the interpreter
6. Method calls are inlined into the trace, with appropriate guards in case of dynamic dispatch
7. The trace is optimized and compiled to native code
8. In the next iteration, the native code is executed
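Steps 1-4 can be sketched with a tiny interpreter: a counter on the backward-branch target detects the hot loop, after which one iteration is recorded as a linear trace. The bytecode, the threshold, and the loop-header position are illustrative; in a real tracing JIT, the recorded branch instructions would be emitted as guards and the trace compiled to native code.

```python
HOT_THRESHOLD = 2   # like TraceMonkey's low loop threshold
LOOP_HEADER = 2     # target of the backward branch in `prog`

# Tiny bytecode program: sum the even numbers below n.
prog = [
    ("set_i", 0),          # 0: i := 0
    ("set_acc", 0),        # 1: acc := 0
    ("jump_if_odd", 4),    # 2: if i is odd, skip the add (loop header)
    ("add_acc", None),     # 3: acc := acc + i
    ("inc_i", None),       # 4: i := i + 1
    ("jump_if_lt", 2),     # 5: if i < n, goto 2 (backward branch)
    ("halt", None),        # 6
]

def run(n):
    """Interpret `prog`; once the loop header gets hot, record one
    iteration as a linear trace (branch instructions -> guards)."""
    i = acc = 0
    pc = 0
    counter = 0
    recording, recorded, trace = False, [], None
    while True:
        op, arg = prog[pc]
        if recording:
            if pc == LOOP_HEADER and recorded:
                trace, recording = recorded, False   # cycle found: done
            else:
                recorded.append((pc, op))            # branches become guards
        if op == "set_i":
            i, pc = arg, pc + 1
        elif op == "set_acc":
            acc, pc = arg, pc + 1
        elif op == "jump_if_odd":
            pc = arg if i % 2 == 1 else pc + 1
        elif op == "add_acc":
            acc, pc = acc + i, pc + 1
        elif op == "inc_i":
            i, pc = i + 1, pc + 1
        elif op == "jump_if_lt":
            if i < n:
                counter += 1                         # backward branch taken
                if counter == HOT_THRESHOLD and trace is None:
                    recording, recorded = True, []   # hot loop detected
                pc = arg
            else:
                pc += 1
        elif op == "halt":
            return acc, trace

acc, trace = run(6)
print(acc)                      # 6 (0 + 2 + 4)
print([pc for pc, _ in trace])  # [2, 3, 4, 5]
```

The recorded trace is one straight-line iteration through the frequent path; the conditionals at positions 2 and 5 would become guards whose failure returns control to the interpreter.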
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 176
Just-In-Time Compilation Tracing JIT Compilers
Properties of Simple Tracing JITs
Advantages
• Optimizing single traces is much easier (faster) than optimizing a whole CFG
• Optimization happens across method boundaries, which is especially good for programs with many small methods
• The implementation is simpler and takes less code than a CFG-based JIT compiler
Disadvantages
• Only works well when there are hot dominant paths
• Trace recording is very expensive
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 177
Just-In-Time Compilation Tracing JIT Compilers
Trace Trees6
Problem
• Simple tracing only records a single path
• Does not work well for loops with non-dominant paths
Idea
• Instead of single traces use trace trees
Approach
• When a guard fails during execution of a compiled trace, immediately start trace recording
• When the new trace reaches the loop header, incorporate the new trace into the trace tree
• The corresponding guard is turned into a conditional branch

6 See Gal and Franz, 2006
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 178
Just-In-Time Compilation Tracing JIT Compilers
Example
1: code;
2: do {
       if (condition) {
3:         code;
       } else {
4:         code;
       }
5: } while (condition);
6: code;

Control Flow Graph: [figure with nodes 1-6]

Trace: 2 → 4 → 5, with side exits (sx) at the guards
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 179
Just-In-Time Compilation Tracing JIT Compilers
Example
1: code;
2: do {
       if (condition) {
3:         code;
       } else {
4:         code;
       }
5: } while (condition);
6: code;

Control Flow Graph: [figure with nodes 1-6]

Trace Tree: [figure; root 2 branches to 4 → 5 and 3 → 5, with side exits (sx)]
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 180
Just-In-Time Compilation Tracing JIT Compilers
Properties of Trace Trees
• A trace tree is a directed rooted tree
• The root is called the anchor node a and represents the loop header
• All leaf nodes have an implicit back-edge to a
• All nodes, except a, have exactly one predecessor
• Nodes may be duplicated if they lie on multiple traces
• Transformation to SSA form is fast because there is only one join point (the anchor node)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 181
Just-In-Time Compilation Tracing JIT Compilers
Nested Loops
• Traces are added to a trace tree when a side exit is taken
• For nested loops, the inner loop gets hot before the outer loop
• As a consequence, the loop nest is turned "inside out"
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 182
Just-In-Time Compilation Tracing JIT Compilers
Nested Loops Example
1: code;
2: do {
       code;
3:     do {
           code;
4:     } while (condition);
5: } while (condition);
6: code;

CFG: [figure with nodes 1-6]

Inner Trace: 3 → 4, with a side exit (sx)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 183
Just-In-Time Compilation Tracing JIT Compilers
Nested Loops Example
1: code;
2: do {
       code;
3:     do {
           code;
4:     } while (condition);
5: } while (condition);
6: code;

CFG: [figure with nodes 1-6]

Extended Trace: 3 → 4 → 5 → 2, with a side exit (sx)
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 184
Just-In-Time Compilation Tracing JIT Compilers
Bounding Trace Trees
• Trace trees can grow indefinitely
• To limit the size of trace trees, extending the tree is stopped after a certain number of backward branches (e.g., 3)
• This effectively limits the possible number of inlined outer loops
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 185
Just-In-Time Compilation Tracing JIT Compilers
Method Calls
• Like outer loops, method calls are inlined
• Virtual calls result in a branch
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 186
Just-In-Time Compilation Literature
Literature
General JIT Compilation
• John Aycock. A Brief History of Just-In-Time. ACM Computing Surveys, Vol. 35, No. 2, June 2003, pp. 97-113. http://dx.doi.org/10.1145/857076.857077
• M. Arnold et al. A Survey of Adaptive Optimization in Virtual Machines. Proc. IEEE, 2005. http://dx.doi.org/10.1109/JPROC.2004.840305
• T. Kotzmann, C. Wimmer, H. Mössenböck. Design of the Java HotSpot Client Compiler for Java 6. ACM TACO, 2008. http://dx.doi.org/10.1145/1369396.1370017
• Sami Zhioua. A dynamic compiler in an embedded Java Virtual machine. Master's Thesis, 2003. http://www.cs.mcgill.ca/~zhioua/MscSami.pdf

Tracing JITs
• A. Gal and M. Franz. Incremental Dynamic Code Generation with Trace Trees. Technical Report, 2006. http://www.ics.uci.edu/~franz/Site/pubs-pdf/ICS-TR-06-16.pdf
• A. Gal, C. W. Probst, and M. Franz. HotpathVM: An Effective JIT Compiler for Resource-constrained Devices. VEE'06. http://www.usenix.org/events/vee06/full_papers/p144-gal.pdf
• Gal et al. Trace-based Just-in-Time Type Specialization for Dynamic Languages. PLDI 2009. http://people.mozilla.org/~gal/compressed.tracemonkey-pldi-09.pdf
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 187
Further Aspects of Compilation
4.5 Further Aspects of Compilation
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 188
Further Aspects of Compilation
Code generation
Code generation can be split into four largely independent, machine-dependent tasks:
• Memory allocation
• Instruction selection and addressing
• Instruction scheduling
• Code optimization
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 189
Further Aspects of Compilation
Memory allocation
Modern machines have the following memory hierarchy:
• Registers
• Primary cache (instruction cache, data cache)
• Secondary cache
• Main memory (page/segment addressing)

Unlike registers, the caches are controlled by the hardware. Efficient usage of the cache means in particular aligning data objects and instructions to the borders of cache blocks (cf. Appel, Chap. 21). The same holds for main memory.
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 190
Further Aspects of Compilation
Instruction selection
Instruction selection aims at the best possible translation of expressions and basic blocks using the instruction set of the machine, for instance,
• using complex addressing modes
• considering the sizes of constants or the locality of jumps

Instruction selection is often formulated as a tree pattern matching problem with costs (cf. Wilhelm/Maurer, Chap. 11).
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 191
Further Aspects of Compilation
Instruction scheduling
Modern machines allow processor-local parallel processing (pipelining, super-scalar execution, VLIW).

To exploit this parallelism, the code has to comply with additional requirements that have to be considered during code generation (see Appel, Chap. 20; Wilhelm/Maurer, Sect. 12.6).
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 192
Further Aspects of Compilation
Code optimization
Optimizations of the assembler or machine code may allow an additional increase in program efficiency (see Wilhelm/Maurer, Sect. 6.9).
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 193