advanced compiler techniques

Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Course Introduction


Post on 23-Feb-2016


Page 1: Advanced Compiler Techniques

Advanced Compiler Techniques

LIU Xianhua

School of EECS, Peking University

Course Introduction

Page 2: Advanced Compiler Techniques

Outline
• Course Overview
• Course Topics
• Course Requirements
• Grading
• Preparation Materials
• Compiler Review

Page 3: Advanced Compiler Techniques

Course Overview
• Graduate-level compiler course
  – Focuses on advanced material in program analysis and optimization.
  – Assumes basic knowledge of compiler construction techniques.
  – Provides hands-on experience through a programming project that implements a specific program analysis or optimization technique.
• Course website:
  – http://mprc.pku.edu.cn/~liuxianhua/ACT13

Page 4: Advanced Compiler Techniques

Administrivia
• Time: 10-12 (6:40pm-) every Thursday
• Location: 2-420
• TA: WANG Wei, DONG Yin
  – Email: act13 [at] mprc.pku.edu.cn
• Office Hour: 4-5:30pm Tuesdays
  – or by appointment via email
• Contact:
  – Phone: 62765828-809, 62759129
  – Room 1818, 1st Science Building
  – Email: lxh [at] mprc.pku.edu.cn
    • Include [ACT13] in the subject

Page 5: Advanced Compiler Techniques

Course Materials
• Dragon Book
  – Aho, Lam, Sethi, Ullman, “Compilers: Principles, Techniques, and Tools”, 2nd Edition, Addison-Wesley, 2007
• Related Papers
• Class website

Page 6: Advanced Compiler Techniques

Requirements
• Basic Requirements
  – Read materials before/after class.
  – Work on your homework individually.
    • Discussions are encouraged but don’t copy others’ work.
  – Get your hands dirty!
    • Experiment with ideas presented in class and gain first-hand knowledge!
  – Come to class and DON’T hesitate to speak if you have any questions/comments/suggestions!
  – Student participation is important!

Page 7: Advanced Compiler Techniques

Grading
• Grading based on
  – Homework: 20%
    • ~5 homework assignments
  – Midterm: 30%
    • Week 11 or 12 (Nov 21st or 28th)
  – Final Project: 40%
  – Class participation: 10%

Page 8: Advanced Compiler Techniques

Final Project
• Groups of 2-3 students
  – Pair programming recommended!
• Topic
  – Problem of your choice (a list of recommended projects will be provided)
  – Should be an interesting enough (non-trivial) problem
• Suggested environment
  – LLVM (UIUC), SUIF (Stanford), gcc (GNU), Soot (McGill Univ.), JoeQ, Jikes (IBM), OpenJDK, Dalvik, V8…

Page 9: Advanced Compiler Techniques

Project Req.
• Week 5: Introduction
• Week 7: Proposal due
• Week 8: Proposal Presentation
• Week 12: Progress Report due
• Week 16: Final Presentation
• Week 17: Final Report due

Page 10: Advanced Compiler Techniques

Course Topics
• Basic analysis & optimizations
  – Data flow analysis & implementation
  – Control flow analysis
  – SSA form & its application
  – Loops/Instruction scheduling
  – Pointer analysis
  – Localization & Parallelization optimization
• Selected topics (TBD)
  – Architecture-based optimization
  – Program slicing, program testing
  – Power-aware compilation

Page 11: Advanced Compiler Techniques

Advanced Compiler Techniques

LIU Xianhua

School of EECS, Peking University

Compiler Review

Page 12: Advanced Compiler Techniques

What Is a Compiler?
• A program that translates a program in one language to another language
  – The essential interface between applications & architectures
• Typically lowers the level of abstraction
  – analyzes and reasons about the program & architecture
• We expect the program to be optimized, i.e., better than the original
  – ideally exploiting architectural strengths and hiding weaknesses

Page 13: Advanced Compiler Techniques

Why Study Compilers? (1)
• Become a better programmer(!)
  – Insight into interaction between languages, compilers, and hardware
  – Understanding of implementation techniques
  – What is all that stuff in the debugger anyway?
  – Better intuition about what your code does

Page 14: Advanced Compiler Techniques

Why Study Compilers? (2)

• Compiler techniques are everywhere
  – Parsing (little languages, interpreters, XML)
  – Software tools (verifiers, checkers, …)
  – Database engines, query languages
  – AI, etc.: domain-specific languages
  – Text processing
    • TeX/LaTeX -> dvi -> PostScript -> PDF
  – Hardware: VHDL; model-checking tools
  – Mathematics (Mathematica, Matlab)

Page 15: Advanced Compiler Techniques

Why Study Compilers? (3)

• Fascinating blend of theory and engineering
  – Direct applications of theory to practice
    • Parsing, scanning, static analysis
  – Some very difficult problems (NP-hard or worse)
    • Resource allocation, “optimization”, etc.
    • Need to come up with good-enough approximations/heuristics

Page 16: Advanced Compiler Techniques

Why Study Compilers? (4)

• Ideas from many parts of CSE
  – AI: Greedy algorithms, heuristic search
  – Algorithms: graph algorithms, union-find, dynamic programming, approximation algorithms
  – Theory: Grammars, DFAs and PDAs, pattern matching, fixed-point algorithms, lattice theory for analysis
  – Systems: Allocation & naming, synchronization, locality
  – Architecture: pipelines, instruction set use, memory hierarchy management

Page 17: Advanced Compiler Techniques

Why Study Advanced Compilers?

• An opportunity to explore compiler techniques in both breadth and depth
  – Parallelization? Functional?
  – Optimizations with more details
• Compiler optimizations rely on both program analysis and transformation, which are useful in many related areas
  – Software engineering: program understanding / reverse engineering / debugging
  – Run-time support and improvement
• Open problems
  – Engineering effort: limits and issues
  – Motivate research topics

Page 18: Advanced Compiler Techniques

Compiler vs. Interpreter
• Compilers: Translate a source (human-writable) program to an executable (machine-readable) program
• Interpreters: Convert a source program and execute it at the same time.

Page 19: Advanced Compiler Techniques

Compiler vs. Interpreter
Ideal concept:
• Compiler: Source code -> Compiler -> Executable; then Input data -> Executable -> Output data
• Interpreter: Source code + Input data -> Interpreter -> Output data

Page 20: Advanced Compiler Techniques

Compiler vs. Interpreter
• Most languages are usually thought of as using either one or the other:
  – Compilers: FORTRAN, C, C++, Pascal, COBOL, PL/1
  – Interpreters: Lisp, Scheme, BASIC, APL, Perl, Python, Smalltalk, JavaScript, Ruby, shell scripts/awk/sed…
• BUT: not always implemented this way
  – Virtual Machines (think Java)
  – Linking of executables at runtime
  – JIT (Just-in-time) compiling

Page 21: Advanced Compiler Techniques

Compiler vs. Interpreter
• Actually, no sharp boundary between them
• General situation is a combo:
  – Source code -> Translator -> Intermed. code
  – Intermed. code + Input data -> Virtual machine -> Output

Page 22: Advanced Compiler Techniques

Hybrid Approaches

• Classic example: Java
  – Compile Java source to byte codes – Java Virtual Machine language (.class files)
  – Execution
    • Interpret byte codes directly, or
    • Compile some or all byte codes to native code
      – Just-In-Time compiler (JIT) – detect hot spots & compile on the fly to native code – standard these days
• Variations used for .NET (compile always) & in high-performance compilers for dynamic languages, e.g., JavaScript

Page 23: Advanced Compiler Techniques

Compiler vs. Interpreter
Compiler
• Pros
  – Less space
  – Fast execution
• Cons
  – Slow processing
    • Partly solved (separate compilation)
  – Debugging
    • Improved thru IDEs
Interpreter
• Pros
  – Easy debugging
  – Fast development
  – Interaction
• Cons
  – Not for large projects
    • Exceptions: Perl, Python
  – Requires more space
  – Slower execution
    • Interpreter in memory all the time
TRADE OFF

Page 24: Advanced Compiler Techniques

Traditional Compiler
• intermediate representation (IR)
• front end maps legal code into IR
• back end maps IR onto target machine
• simplify retargeting
• allows multiple front ends
• multiple passes -> better code
[Figure: Source program -> Scanner (lexical analysis) -> tokens -> Parser (syntax analysis) -> syntactic structure -> Semantic Analysis (IR generator) -> IR -> Code Optimizer -> IR -> Code Generator -> Target program; a Symbol Table sits alongside the phases]

Page 25: Advanced Compiler Techniques

Fallacy
• Front-end, IR and back-end must encode knowledge needed for all n × m combinations!

Page 26: Advanced Compiler Techniques

Optimizer (Middle End)
• Modern optimizers are usually built as a set of passes
  – constant propagation and folding
  – code motion
  – reduction of operator strength
  – common sub-expression elimination
  – redundant store elimination
  – dead code elimination

Page 27: Advanced Compiler Techniques

Phases of Compilation

Page 28: Advanced Compiler Techniques

Scanning/Lexical Analysis
• Break program down into its smallest meaningful symbols (tokens, atoms)
• Tools for this include lex, flex
• Tokens include e.g.:
  – Reserved words: do if float while
  – Special characters: ( { , + - = ! /
  – Names & numbers: myValue 3.07e02
• Start symbol table with new symbols found
Input: index := start + step * 20
After scanning: index := start + step * 20, with each token labeled as identifier, operator, or number
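The scanning step above can be sketched in a few lines of Python. This is only an illustration, not how lex or flex work internally: the token classes, the regular expressions, and the function name `scan` are assumptions chosen to match the slide's example.

```python
import re

# Hypothetical token classes; the patterns are assumptions chosen for this example.
TOKEN_SPEC = [
    ("number",     r"\d+(?:\.\d+)?(?:[eE][+-]?\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator",   r":=|[-+*/=(){},!]"),
    ("skip",       r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Split source text into (kind, text) tokens, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, text in scan("index := start + step * 20"):
    print(f"{kind:10} {text}")
```

A real scanner would also separate reserved words from ordinary identifiers and enter new names into the symbol table; both are left out here to keep the sketch short.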

Page 29: Advanced Compiler Techniques

Parsing
• Construct a parse tree from symbols
• A pattern-matching problem
  – Language grammar defined by set of rules that identify legal (meaningful) combinations of symbols
  – Each application of a rule results in a node in the parse tree
  – Parser applies these rules repeatedly to the program until leaves of parse tree are “atoms”
• If no pattern matches, it’s a syntax error
• yacc, bison are tools for this (generate C code that parses specified language)

Page 30: Advanced Compiler Techniques

Parse Tree
• Output of parsing
• Top-down description of program syntax
  – Root node is entire program
• Constructed by repeated application of rules in Context Free Grammar (CFG)
• Leaves are tokens that were identified during lexical analysis

Page 31: Advanced Compiler Techniques


Example: Parsing Rules for Pascal

These are like the following:

• program ----> PROGRAM identifier (identifier more_identifiers) ; block

• more_identifiers ----> , identifier more_identifiers | ε

• block ----> variables BEGIN statement more_statements END

• statement ----> do_statement | if_statement | assignment | …

• if_statement ----> IF logical_expression THEN statement ELSE …
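To make the rules concrete, here is a minimal recursive-descent sketch for just the first two rules (`program` up to the `;`, and `more_identifiers`). It assumes the scanner has already produced `(kind, text)` tuples; the token kinds `keyword`, `identifier`, `symbol` and the function name `parse_program_header` are hypothetical, and the `block` rule is not handled.

```python
def parse_program_header(tokens):
    """Parse: PROGRAM identifier ( identifier more_identifiers ) ;

    Returns (parse_tree, number_of_tokens_consumed). Each rule application
    becomes one tuple node whose first element is the rule name.
    """
    pos = 0

    def expect(kind, text=None):
        # Consume one token of the required kind (and text, if given).
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError(f"expected {text or kind}, found end of input")
        tok_kind, tok_text = tokens[pos]
        if tok_kind != kind or (text is not None and tok_text != text):
            raise SyntaxError(f"expected {text or kind}, found {tok_text!r}")
        pos += 1
        return tok_text

    def more_identifiers():
        # more_identifiers ----> , identifier more_identifiers | ε
        if pos < len(tokens) and tokens[pos] == ("symbol", ","):
            expect("symbol", ",")
            name = expect("identifier")
            return ("more_identifiers", ",", name, more_identifiers())
        return ("more_identifiers", "ε")

    expect("keyword", "PROGRAM")
    name = expect("identifier")
    expect("symbol", "(")
    first = expect("identifier")
    rest = more_identifiers()
    expect("symbol", ")")
    expect("symbol", ";")
    return ("program", "PROGRAM", name, "(", first, rest, ")", ";"), pos


header = [
    ("keyword", "PROGRAM"), ("identifier", "gcd"), ("symbol", "("),
    ("identifier", "input"), ("symbol", ","), ("identifier", "output"),
    ("symbol", ")"), ("symbol", ";"),
]
tree, consumed = parse_program_header(header)
print(tree)
```

Each grammar rule maps to one function, and a token that matches no alternative raises a syntax error, which mirrors the "if no pattern matches" case. Tools such as yacc and bison generate table-driven parsers instead of hand-written functions like these.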


Page 32: Advanced Compiler Techniques

Pascal Code Example

program gcd (input, output);
var i, j : integer;
begin
  read (i, j);
  while i <> j do
    if i > j then i := i - j
    else j := j - i;
  writeln (i)
end.

Page 33: Advanced Compiler Techniques

Example: Parse Tree

Page 34: Advanced Compiler Techniques

Semantic Analysis
• Discovery of meaning in a program using the symbol table
  – Do static semantics check
  – Simplify the structure of the parse tree (from parse tree to abstract syntax tree (AST))
• Static semantics check
  – Making sure identifiers are declared before use
  – Type checking for assignments and operators
  – Checking types and number of parameters to subroutines
  – Making sure functions contain return statements
  – …
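One of the static checks listed above, "identifiers are declared before use", can be sketched as a single pass over a flattened statement list. The statement encoding (`("declare", name)` and `("assign", target, operands)`) and the function name are assumptions made for this illustration; a real checker would walk the AST and handle nested scopes.

```python
def check_declared_before_use(statements):
    """Return error messages for names used before they are declared.

    Each statement is a tuple (hypothetical encoding):
      ("declare", name)              e.g. var i : integer
      ("assign", target, operands)   e.g. i := i - j
    """
    symbol_table = set()   # names declared so far
    errors = []
    for line_no, stmt in enumerate(statements, start=1):
        kind = stmt[0]
        if kind == "declare":
            _, name = stmt
            symbol_table.add(name)
        elif kind == "assign":
            _, target, operands = stmt
            for name in [target, *operands]:
                if name not in symbol_table:
                    errors.append(f"statement {line_no}: '{name}' used before declaration")
    return errors


program = [
    ("declare", "i"),
    ("declare", "j"),
    ("assign", "i", ["i", "j"]),   # i := i - j
    ("assign", "k", ["i"]),        # k was never declared
]
print(check_declared_before_use(program))
```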

Page 35: Advanced Compiler Techniques

Example: AST

Page 36: Advanced Compiler Techniques

(Intermediate) Code Generation
• Go through the parse tree from bottom up, turning rules into code.
• e.g.
  – A sum expression results in the code that computes the sum and saves the result
• Result: inefficient code in a machine-independent language
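The bottom-up walk can be sketched as a post-order traversal that emits one three-address instruction per operator node. The tree encoding (nested `(op, left, right)` tuples with string leaves), the `t1, t2, …` temporaries, and the name `generate_tac` are assumptions for this illustration.

```python
from itertools import count


def generate_tac(tree):
    """Emit three-address code for an expression tree, children first.

    A tree is either a leaf (a variable name or literal, as a string) or a
    tuple (op, left, right). Returns (instructions, name_holding_result).
    """
    code = []
    temp_ids = count(1)

    def visit(node):
        if isinstance(node, str):          # leaf: nothing to compute
            return node
        op, left, right = node
        left_name = visit(left)            # generate code for the operands first
        right_name = visit(right)
        temp = f"t{next(temp_ids)}"        # save the result in a fresh temporary
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    result = visit(tree)
    return code, result


# start + step * 20
instructions, result = generate_tac(("+", "start", ("*", "step", "20")))
for line in instructions:
    print(line)
```

Every operator gets its own temporary, even when the value could be reused or computed at compile time. That is the "inefficient code" the slide mentions, and it is what the optimizer later cleans up.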

Page 37: Advanced Compiler Techniques

Target Code Generation
• Convert intermediate code to machine instructions on intended target machine
• Determine storage addresses for entries in symbol table

Page 38: Advanced Compiler Techniques

Types of Code Optimizations
• Machine-independent optimizations
  – Eliminate redundant computation
  – Eliminate dead code
  – Perform computation at compile time if possible
  – Execute code less frequently
• Machine-dependent optimizations
  – Hide latency
  – Parallelize computation
  – Replace expensive computations with cheaper ones
  – Improve memory performance

Page 39: Advanced Compiler Techniques

Scopes of Code Optimizations
• Peephole optimizations
  – Consider a small window of instructions
• Local optimizations
  – Consider instruction sequences within a basic block
• Intra-procedural (global) optimizations
  – Consider multiple basic blocks within a procedure
  – Need support from control flow analysis
    • Branches, loops, merging of flows
• Inter-procedural optimizations
  – Consider the whole program w/ multiple procedures
  – Need to analyze calls/returns

Page 40: Advanced Compiler Techniques

Sample Optimizations
• Redundant loads and stores elimination
    Before: MOV R0, a
            MOV a, R0
    After:  MOV R0, a
• Unreachable code elimination
    GOTO L2
    x := x + 1      (unreachable)
• Algebraic identities
    x := x + 0      (can eliminate)
    x := x * 1
• Reduction in strength
    x := x * 2  ->  x := x + x
• Constant folding
    p = 2 * 3.14  ->  p = 6.28
• Constant propagation
    Before: p = 6.28
            x = x * p
    After:  p = 6.28
            x = x * 6.28
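Constant folding and constant propagation from the list above can be sketched together for straight-line code. The statement encoding (`(target, expr)`, where `expr` is a number, a variable name, or an `(op, left, right)` tuple) and the function name are assumptions for this illustration. The sketch covers only a single basic block; code with branches needs data-flow analysis.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_and_propagate(statements):
    """Apply constant propagation and folding to straight-line assignments."""
    constants = {}   # variable -> value known at this point in the block
    optimized = []

    def simplify(operand):
        # Constant propagation: replace a variable by its known constant value.
        if isinstance(operand, str):
            return constants.get(operand, operand)
        return operand

    for target, expr in statements:
        if isinstance(expr, tuple):
            op, left, right = expr
            left, right = simplify(left), simplify(right)
            if isinstance(left, (int, float)) and isinstance(right, (int, float)):
                expr = OPS[op](left, right)      # constant folding at "compile time"
            else:
                expr = (op, left, right)
        else:
            expr = simplify(expr)
        if isinstance(expr, (int, float)):
            constants[target] = expr
        else:
            constants.pop(target, None)          # target is no longer a known constant
        optimized.append((target, expr))
    return optimized


block = [
    ("p", ("*", 2, 3.14)),      # p = 2 * 3.14
    ("x", ("*", "x", "p")),     # x = x * p
]
for target, expr in fold_and_propagate(block):
    print(target, "=", expr)
```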

Page 41: Advanced Compiler Techniques

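Sample Optimizations
• Common sub-expression elimination
  – Local
      Before: m = 2 * y * z
              o = 2 * y - z
      After:  t = 2 * y
              m = t * z
              o = t - z
  – Global
      [Figure: two predecessor blocks compute a := d+e and b := d+e, and their common successor computes c := d+e. After optimization the predecessors become t := d+e; a := t and t := d+e; b := t, and the successor becomes c := t]
  – Global partial
      [Figure: only one predecessor block computes a := d+e before the successor computes c := d+e. After optimization that predecessor becomes t := d+e; a := t, the other path gets t := d+e, and the successor becomes c := t]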
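The local case can be sketched as a pass over one basic block that remembers which variable currently holds each computed expression. The quadruple encoding `(target, op, left, right)`, the `"copy"` pseudo-operation, and the name `local_cse` are assumptions for this illustration. The global and partial cases need data-flow analysis across blocks and are not covered.

```python
def local_cse(block):
    """Replace recomputation of an available expression by a copy.

    block is a list of (target, op, left, right) quadruples within one basic
    block; operands are variable names or literals given as strings.
    """
    available = {}   # (op, left, right) -> variable currently holding that value
    optimized = []
    for target, op, left, right in block:
        key = (op, left, right)
        if key in available:
            optimized.append((target, "copy", available[key], None))
        else:
            optimized.append((target, op, left, right))
        # Assigning to target kills expressions that read it or are stored in it.
        available = {k: v for k, v in available.items()
                     if target not in (k[1], k[2]) and v != target}
        # Record the new expression unless it reads the variable it overwrites.
        if key not in available and target not in (left, right):
            available[key] = target
    return optimized


# m = 2 * y * z; o = 2 * y - z, already lowered to three-address form
block = [
    ("t1", "*", "2", "y"),
    ("m",  "*", "t1", "z"),
    ("t2", "*", "2", "y"),    # same expression as t1
    ("o",  "-", "t2", "z"),
]
for quad in local_cse(block):
    print(quad)
```

The kill step is what keeps the pass safe: once `y` (or the variable holding the value) is reassigned, the earlier result can no longer be reused.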

Page 42: Advanced Compiler Techniques

Sample Optimizations
• Loop optimizations
  – Code motion
      Before: while (i <= limit - 2) { … }
      After:  t = limit - 2
              while (i <= t) { … }
  – Loop unrolling
      Before: do i=1 to n by 1
                a(i) = a(i)+b(i)
              end
      After:  do i=1 to n by 4
                a(i) = a(i)+b(i)
                a(i+1) = a(i+1)+b(i+1)
                a(i+2) = a(i+2)+b(i+2)
                a(i+3) = a(i+3)+b(i+3)
              end
              … //process tail part
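The unrolled loop and its "tail part" can be written out as follows. This is a hand-applied sketch of what a compiler would do automatically. The function name is hypothetical, indices are 0-based rather than 1-based as on the slide, and in Python itself the transformation illustrates the shape of the code rather than a real speed-up.

```python
def add_arrays_unrolled(a, b):
    """Compute a[i] += b[i] for all i, with the loop body unrolled 4 times."""
    n = len(a)
    main_end = n - n % 4                 # largest multiple of 4 that is <= n
    for i in range(0, main_end, 4):      # unrolled main loop: 4 elements per trip
        a[i] += b[i]
        a[i + 1] += b[i + 1]
        a[i + 2] += b[i + 2]
        a[i + 3] += b[i + 3]
    for i in range(main_end, n):         # tail: the remaining 0 to 3 elements
        a[i] += b[i]
    return a


print(add_arrays_unrolled(list(range(10)), [10] * 10))
```

Unrolling reduces the number of loop-control tests and branches per element and gives the instruction scheduler a longer straight-line body to work with, at the cost of larger code.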

Page 43: Advanced Compiler Techniques

When Should We Compile?
• Ahead-of-time: before you run the program
• Offline profiling: compile several times
  – compile/run/profile.... then run again
• Just-in-time: while you run the program
  – required for dynamic class loading, e.g., Java, Python, etc.

Page 44: Advanced Compiler Techniques

Aren’t Compilers a Solved Problem?
“Optimization for scalar machines is a problem that was solved ten years ago.” -- David Kuck, Fall 1990
• Architectures keep changing
  – New features pose new problems
  – Changing costs lead to different concerns
  – Old solutions need re-engineering
• Languages keep changing
• Applications keep changing - SPEC CPU?
• When to compile keeps changing
  – And compiling options/parameters?
• Desired target properties keep changing
  – Code size, running time, power consumption, security
• Design flow also may change

Page 45: Advanced Compiler Techniques

Role of compilers
• Bridge complexity and evolution in architecture, languages, & applications
• Help programs with correctness, reliability, program understanding
• Compiler optimizations can significantly improve performance
  – 1 to 10x on conventional processors
• Performance stability: one line change can dramatically alter performance
  – unfortunate, but true

Page 46: Advanced Compiler Techniques

Performance Anxiety
• But does performance really matter?
  – Computers are really fast
  – Moore’s law (roughly): hardware performance doubles every 18 months
• Real bottlenecks lie elsewhere:
  – Memory & storage access
  – Communications in bus and network
  – Human! (think interactive apps)
    • Human typing avg. 8 cps (max 25 cps)
    • Waste time “thinking”

Page 47: Advanced Compiler Techniques

Compilers Don’t Help Much
• Do compilers improve performance anyway?
  – Proebsting’s law (Todd Proebsting, Microsoft Research):
    • Difference between optimizing and non-optimizing compiler ~ 4x
    • Assume compiler technology represents 36 years of progress (actually more)
Compilers double program performance every 18 years!
Not quite Moore’s Law…
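The arithmetic behind the 18-year figure, spelled out. The per-year rates are derived here from the slide's numbers (4x over 36 years, and hardware doubling every 18 months); they are not stated on the slide.

```latex
\[
\frac{\text{optimized}}{\text{unoptimized}} \approx 4 = 2^{2}
\quad\Longrightarrow\quad
\text{doubling time} = \frac{36\ \text{years}}{\log_2 4} = 18\ \text{years}
\]
\[
\text{compilers: } 4^{1/36} = 2^{1/18} \approx 1.039 \ (\approx 4\%\ \text{per year}),
\qquad
\text{hardware: } 2^{1/1.5} \approx 1.587 \ (\approx 59\%\ \text{per year})
\]
```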

Page 48: Advanced Compiler Techniques

A Big BUT
• Why use high-level languages anyway?
  – Easier to write & maintain
  – Safer (think Java)
  – More convenient (think libraries, GC…)
• But: people will not accept massive performance hit for these gains
  – Compile with optimization!
  – Still use C and C++!!
  – Hand-optimize their code!!!
  – Even write assembler code (gasp)!!!!
Apparently performance does matter…

Page 49: Advanced Compiler Techniques

Why Compilers Matter
• Key part of compiler’s job: make the costs of abstraction reasonable
  – Remove performance penalty for:
    • Using objects
    • Safety checks (e.g., array-bounds)
    • Writing clean code (e.g., recursion)
• Use program analysis to transform code
  – primary topic of this course

Page 50: Advanced Compiler Techniques

Program Analysis
• Source code analysis is the process of extracting information about a program from its source code or artifacts (e.g., from Java byte code or execution traces) generated from the source code using automatic tools.
  – Source code is any static, textual, human readable, fully executable description of a computer program that can be compiled automatically into an executable form.
  – To support dynamic analysis the description can include documents needed to execute or compile the program, such as program inputs.
Source: Dave Binkley, “Source Code Analysis – A Roadmap”, FOSE’07

Page 51: Advanced Compiler Techniques

Anatomy of an Analysis
1. Parser
   • parses the source code into one or more internal representations.
2. Internal representation
   • CFG, call graph, AST, SSA, VDG, FSA
   • Most common: Graphs
3. Actual Analysis

Page 52: Advanced Compiler Techniques

Analysis Properties
• Static vs. Dynamic
• Sound vs. Unsound
  – Safe vs. Unsafe
• Flow sensitive vs. Flow insensitive
• Context sensitive vs. Context insensitive
• Precision-Cost trade-off

Page 53: Advanced Compiler Techniques

Levels of Analysis
(in order of increasing detail & complexity)
• Local (single-block) [1960’s]
  – Straight-line code
  – Simple to analyze; limited impact
• Global (Intraprocedural) [1970’s – today]
  – Whole procedure
  – Dataflow & dependence analysis
• Interprocedural [late 1970’s – today]
  – Whole-program analysis
  – Tricky:
    • Very time and space intensive
    • Hard for some PL’s (e.g., Java)

Page 54: Advanced Compiler Techniques

Optimization = Analysis + Transformation
• Key analysis:
  – Control-flow
    • if-statements, branches, loops, procedure calls
  – Data-flow
    • definitions and uses of variables
• Representations:
  – Control-flow graph
  – Control-dependence graph
  – Def/use, use/def chains
  – SSA (Static Single Assignment)
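As a small instance of the data-flow side, def/use chains for straight-line code can be built in one pass. The encoding (`(target, uses)` pairs indexed by position) and the function name are assumptions for this illustration. With branches and loops, a use can be reached by several definitions, and building the chains requires a reaching-definitions analysis over the control-flow graph.

```python
def def_use_chains(statements):
    """Map each definition to the statements that use it (straight-line code only).

    statements is a list of (target, uses) pairs; statement i defines target
    and reads every variable named in uses.
    """
    last_def = {}   # variable -> index of the statement that most recently defined it
    chains = {}     # defining statement index -> indices of statements using that value
    for index, (target, uses) in enumerate(statements):
        for name in uses:
            if name in last_def:          # uses of never-defined names are block inputs
                chains[last_def[name]].append(index)
        last_def[target] = index          # later uses now refer to this definition
        chains[index] = []
    return chains


block = [
    ("t", ["limit"]),      # 0: t = limit - 2
    ("i", ["i", "t"]),     # 1: i = i + t
    ("t", ["i"]),          # 2: t = i * 4
    ("x", ["t"]),          # 3: x = t
]
print(def_use_chains(block))
```

A definition whose chain is empty, and whose variable is not live at the end of the block, is a candidate for dead code elimination.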

Page 55: Advanced Compiler Techniques

Applications
• architecture recovery
• clone detection
• program comprehension
• debugging
• fault location
• model checking in formal analysis
• model-driven development
• optimization techniques in software engineering
• reverse engineering
• software maintenance
• visualizations of analysis results
• etc.

Page 56: Advanced Compiler Techniques

Current Challenges
• Pointer Analysis
• Concurrent Program Analysis
• Dynamic Analysis
• Information Retrieval
• Data Mining
• Multi-Language Analysis
• Non-functional Properties
• Self-Healing Systems
• Real-Time Analysis

Page 57: Advanced Compiler Techniques

Exciting times
• New and changing architectures
  – Hitting the microprocessor wall
  – Multicore/manycore
  – Tiled architectures, tiled memory systems
• Object-oriented languages becoming dominant paradigm
  – Java and C# coming to your OS soon - JNode, Singularity
  – Security and reliability, ease of programming
• Key challenges and approaches
  – Latency & parallelism still key to performance
  – Language & runtime implementation efficiency
  – Software/hardware cooperation is another key issue
[Figure: Programmer, Compiler, and Runtime connected by Code; labels: Specification, Future behavior, Feedback, H/S Profiling]

Page 58: Advanced Compiler Techniques

Next Time
• Control-Flow Analysis
• Local Optimizations
• Read
  – Dragon book: §8.4-8.5, 8.7-8.8