LLVM Tutorial · John Criswell, University of Rochester · 2019. 4. 18.


Meliora!

LLVM Tutorial

John Criswell
University of Rochester

Introduction


History of LLVM

❖ Developed by Chris Lattner and Vikram Adve at the University of Illinois at Urbana-Champaign (UIUC)
❖ Released open-source in October 2003
❖ Default compiler for Mac OS X, iOS, and FreeBSD
❖ Used by many companies and research groups
❖ Contributions by many people!

LLVM is a compiler infrastructure!


Tools Built Using LLVM

❖ Compilers!
❖ JITs!
❖ Formal Verification!
❖ Security Hardening Tools!
❖ Bug Finding Tools!
❖ Profiling Tools!

Things to Do in the Compiler Zoo

❖ Add a security check to every load and store
❖ Create a memory access trace
❖ Check pointer arithmetic on certain types of variables
❖ Trace atomic modifications to a memory location
❖ Change order of local variables in stack frame

What do you want to do with LLVM?


LLVM Source Code Structure

❖ LLVM is primarily a set of libraries
❖ We use the libraries to create LLVM-based tools

Programming Background

❖ C++
  ❖ Other language bindings exist, but C++ is “native”
❖ Know how to use classes, pointers, and references
❖ Know how to use C++ iterators
❖ Know how to use the Standard Template Library (STL)

Helpful Documents

❖ LLVM Language Reference Manual
❖ LLVM Programmer’s Manual
❖ How to Write an LLVM Pass
❖ Online LLVM Doxygen documents

Getting Involved with LLVM

❖ Research on program analysis (NSF REUs)
❖ Google Summer of Code projects
❖ Apple, Samsung, Google, Facebook build LLVM tools
❖ LLVM Developer’s Meeting
  ❖ One in California; one in Europe
  ❖ Can present talks, posters, BoFs, etc.

Please fill out feedback form:

https://forms.gle/ib3Ng6osSFqNoQGD7


LLVM Compiler Structure


Ahead of Time (AOT) Compiler

[Diagram: Front End → Optimizer → Code Generator]

Front End Structure

[Diagram: Source Code → Clang Parser → Clang AST → Clang Optimizer → Clang AST → Clang CodeGen → LLVM IR]

Optimizer Structure

[Diagram: LLVM IR → Opt 1 → LLVM IR → Opt 2 → LLVM IR → CodeGen → Machine IR]

Code Generator Structure

[Diagram: pipeline stages Instruction Scheduling, Register Allocation, and Code Emitter; Machine IR flows between the stages, and the Code Emitter produces Native Code]

LLVM Toolchain Overview

Intermediate Representation | Description
Clang AST                   | Describes structure of source code (if-statements, while-loops)
LLVM IR                     | Architecture independent code in SSA form
Machine IR                  | Native code (machine registers; native code instructions)

LLVM Intermediate Representation


LLVM IR is a language into which programs are translated for analysis and transformation (optimization)

LLVM IR Forms

❖ LLVM Assembly Language
  ❖ Text form saved on disk for humans to read
❖ LLVM Bitcode
  ❖ Binary form saved on disk for programs to read
❖ LLVM In-Memory IR
  ❖ Data structures used for analysis and optimization

LLVM Assembly Language

    define i32 @foo(i32, i32) local_unnamed_addr #0 {
      %3 = tail call i32 @bar(i32 %0) #2
      %4 = add nsw i32 %1, %0
      %5 = sub i32 %4, %3
      ret i32 %5
    }

    declare i32 @bar(i32) local_unnamed_addr #1

Overview of LLVM IR

❖ Each assembly/bitcode file is a Module
❖ Each Module is composed of
  ❖ Global variables
  ❖ A set of Functions, which are composed of
    ❖ A set of basic blocks, which are composed of
      ❖ A set of instructions
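The containment hierarchy above can be sketched with plain standard-library containers. This is only a toy model: the struct names echo the LLVM concepts, but every type, field, and helper here (ToyModule, makeBlock, countInstructions, and so on) is an assumption made for illustration, not one of LLVM's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for the LLVM IR containers (not the real LLVM classes).
struct ToyInstruction { std::string opcode; };
struct ToyBasicBlock  { std::vector<ToyInstruction> instructions; };
struct ToyFunction    { std::string name; std::vector<ToyBasicBlock> blocks; };
struct ToyModule {
    std::vector<std::string> globals;
    std::vector<ToyFunction> functions;
};

// Build a basic block from a list of opcode names.
ToyBasicBlock makeBlock(const std::vector<std::string> &opcodes) {
    // A well-formed basic block holds at least its terminator instruction.
    assert(!opcodes.empty());
    ToyBasicBlock bb;
    for (const std::string &op : opcodes)
        bb.instructions.push_back(ToyInstruction{op});
    return bb;
}

// Walk Module -> Function -> BasicBlock -> Instruction and count matches.
std::size_t countInstructions(const ToyModule &m, const std::string &opcode) {
    std::size_t n = 0;
    for (const ToyFunction &f : m.functions)
        for (const ToyBasicBlock &bb : f.blocks)
            for (const ToyInstruction &i : bb.instructions)
                if (i.opcode == opcode)
                    ++n;
    return n;
}

// One global and one function with two basic blocks.
ToyModule makeExampleModule() {
    ToyModule m;
    m.globals.push_back("int[20]");
    ToyFunction foo;
    foo.name = "foo";
    foo.blocks.push_back(makeBlock({"add", "mul", "br"}));
    foo.blocks.push_back(makeBlock({"add", "ret"}));
    m.functions.push_back(foo);
    return m;
}
```

The three nested loops in countInstructions have the same shape as the Module, Function, and BasicBlock iterator loops used in a real pass.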

LLVM Bitcode File

[Diagram: a Module containing a global (int[20]) and two functions. Function foo() has basic blocks (add, mult, br), (add, ret), and (add, sub, br). Function bar() has basic blocks (add, sub, br), (add, ret), and (add, div, br).]

LLVM Module with One Function

    define i32 @foo(i32, i32) local_unnamed_addr #0 {
      %3 = icmp ult i32 %0, %1
      br i1 %3, label %4, label %6

    ; basic block %4 (taken when %0 < %1)
      %5 = tail call i32 @bar(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i64 0, i64 0)) #2
      br label %8

    ; basic block %6 (taken otherwise)
      %7 = tail call i32 @bar(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str.1, i64 0, i64 0)) #2
      br label %8

    ; basic block %8 (both paths join here)
      %9 = add i32 %1, %0
      ret i32 %9
    }

LLVM Instruction Set

❖ RISC-like architecture
❖ Virtual registers in SSA form
❖ Load/store instructions to read/write memory objects
❖ All other instructions read or write virtual registers

LLVM Memory Objects

❖ Global Variables
❖ Memory allocated on the stack
❖ Memory allocated on the heap

Instructions for Computation

❖ Arithmetic and binary operators
  ❖ Two’s complement arithmetic (add, sub, multiply, etc.)
  ❖ Bit-shifting and bit-masking
❖ Pointer arithmetic (getelementptr or “GEP”)
❖ Comparison instructions (icmp, fcmp)
  ❖ Generate a boolean result

Memory Access Instructions

❖ Load instruction reads memory
❖ Store instruction writes to memory
❖ Atomic compare and exchange (cmpxchg)
❖ Atomic read/modify/write (atomicrmw)

Control Flow Instructions

❖ Terminator instructions
  ❖ Indicate which basic block to jump to next
  ❖ Conditional branch, unconditional branch, switch
  ❖ Return instruction to return to caller
  ❖ Unwind instruction for exception handling (removed in LLVM 3.0; later releases use invoke, landingpad, and resume)
❖ Call instruction calls a function
  ❖ It can occur in the middle of a basic block

Memory Allocation Instructions

❖ Stack allocation (alloca)
  ❖ Allocates memory on the stack
❖ Calls to heap-allocation functions (e.g., malloc())
  ❖ Not a special instruction; just uses a call instruction
❖ Global variable declarations
  ❖ Not really instructions, but allocate memory
  ❖ All globals are pointers to memory objects

Static Single Assignment (SSA)

• Each function has an infinite set of virtual registers
• Only one instruction assigns a value to a virtual register (called the definition of the register)
• An instruction and the register it defines are synonymous

    %z = add i32 %x, %y

The Almighty Phi Node!

[Diagram: a control-flow graph in which the block "y=5;" branches to either "x=y+1;" or "x=y+2;", and both paths join at a block that reads x ("z = x;" in the first drawing, "z = x+3;" in the second).]

SSA form of the second drawing:

    y=5;
    x1=y+1;                  (first path)
    x2=y+2;                  (second path)
    x3=phi(x1,x2); z=x3+3;   (join block)
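To make the role of the phi node concrete, here is a small sketch in ordinary C++ that writes the slide's example twice: once with x assigned on two paths, and once in an SSA-like style where every name is assigned exactly once. The names originalForm, ssaForm, Pred, and phi are assumptions for illustration; a real phi is an IR instruction, not a function call.

```cpp
#include <cassert>

// Original form: x is assigned on two different paths, then read at the join.
int originalForm(bool takeFirstPath) {
    int y = 5;
    int x;
    if (takeFirstPath)
        x = y + 1;
    else
        x = y + 2;
    int z = x + 3;
    return z;
}

// Which predecessor block control arrived from at the join point.
enum class Pred { First, Second };

// Toy phi: select the incoming value that matches the predecessor.
int phi(Pred cameFrom, int fromFirst, int fromSecond) {
    return cameFrom == Pred::First ? fromFirst : fromSecond;
}

// SSA-style form: each name has exactly one definition.
int ssaForm(bool takeFirstPath) {
    const int y = 5;
    const Pred cameFrom = takeFirstPath ? Pred::First : Pred::Second;
    // Both candidates are computed here only because this is straight-line
    // C++; in real SSA each definition lives in its own predecessor block.
    const int x1 = y + 1;
    const int x2 = y + 2;
    const int x3 = phi(cameFrom, x1, x2);
    const int z = x3 + 3;
    return z;
}

// Sanity check: the SSA rewrite preserves the result on both paths.
void checkFormsAgree() {
    assert(originalForm(true) == ssaForm(true));
    assert(originalForm(false) == ssaForm(false));
}
```

The point of the sketch is that x3 never needs a second assignment: the phi carries the "which path did we take" decision that the mutable variable x carried in the original form.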

Domination

❖ The definition of a virtual register must dominate all of its uses
  ❖ Except uses by phi-nodes
❖ A dominates B, C, and D

[Diagram: control-flow graph with blocks A, B, C, and D]
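One way to see what "A dominates B, C, and D" means is to compute dominator sets with the classic iterative algorithm. The sketch below assumes a diamond-shaped graph (A branches to B and C, which both jump to D); the edges of the slide's diagram are not recoverable from the transcript, so that shape is an assumption. The function names are likewise illustrative, and LLVM's own DominatorTree uses a more efficient algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dom[n][d] == true means "block d dominates block n".
using DomSets = std::vector<std::vector<bool>>;

// preds[n] lists the predecessors of block n; block 0 is the entry block.
DomSets computeDominators(const std::vector<std::vector<std::size_t>> &preds) {
    assert(!preds.empty()); // need at least the entry block
    const std::size_t n = preds.size();

    // Start from "everything dominates everything" and shrink to a fixed point.
    DomSets dom(n, std::vector<bool>(n, true));
    dom[0].assign(n, false);
    dom[0][0] = true; // the entry block is dominated only by itself

    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 1; b < n; ++b) {
            // Intersect the dominator sets of all predecessors of b.
            std::vector<bool> next(n, true);
            for (std::size_t p : preds[b])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            next[b] = true; // every block dominates itself
            if (next != dom[b]) {
                dom[b] = next;
                changed = true;
            }
        }
    }
    return dom;
}

// Assumed diamond CFG: A(0) -> B(1), A(0) -> C(2), B -> D(3), C -> D.
DomSets diamondExample() {
    std::vector<std::vector<std::size_t>> preds(4);
    preds[1].push_back(0);
    preds[2].push_back(0);
    preds[3].push_back(1);
    preds[3].push_back(2);
    return computeDominators(preds);
}
```

In the diamond, D is reachable through either B or C, so neither of them dominates D; only A (and D itself) lies on every path to D. That is why a value defined in B cannot be used directly in D and must be merged through a phi node.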

Writing an LLVM Pass


LLVM Passes: Separation of Concerns

❖ Break optimizer into passes
❖ Each pass performs one analysis or one transformation

LLVM Passes

[Diagram: inside the Optimizer, LLVM IR → Pass 1 → LLVM IR → Pass 2 → LLVM IR]

Two Types of Passes

❖ Passes that analyze code
  ❖ Do not modify the program
  ❖ Provide information “out of band” to other passes
❖ Passes that transform code
  ❖ Make modifications to the code

LLVM Passes

[Diagram: LLVM IR passes through Dom Tree, DGE, and Mem2Reg in turn, with LLVM IR flowing between them; the Dom Tree pass also emits Dominator Tree Data alongside the IR]

LLVM IR Pass Types

❖ ModulePass
❖ FunctionPass
❖ BasicBlockPass
❖ I recommend ignoring “funny” passes
  ❖ LoopPass
  ❖ RegionPass

Rules for LLVM Passes

❖ Only modify values and instructions at scope of pass
  ❖ ModulePass can modify anything
  ❖ FunctionPass should not modify anything outside of the function
  ❖ BasicBlockPass should not modify anything outside of the basic block

Important Pass Methods: getAnalysisUsage()

❖ Tells PassManager which analysis passes you need
  ❖ PassManager will schedule analysis passes for you
  ❖ Cannot schedule transform passes this way
❖ Tells PassManager which analysis results are valid after a transformation
  ❖ Avoids re-running expensive analysis passes
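The two jobs of getAnalysisUsage() (declaring what a pass requires and what it preserves) can be modeled in a few lines. The sketch below is a toy, not LLVM's PassManager: ToyUsage, ToyPass, ToyPassManager, and the string label "domtree" are all assumptions for illustration. It shows only the bookkeeping idea that a required analysis is run on demand and is re-run only after a pass that did not preserve it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy counterpart of AnalysisUsage: what a pass needs and what it keeps valid.
struct ToyUsage {
    std::vector<std::string> required; // must be valid before the pass runs
    std::set<std::string> preserved;   // still valid after the pass runs
};

struct ToyPass {
    std::string name;
    ToyUsage usage;
};

// Toy pass manager: runs required analyses on demand and caches validity.
struct ToyPassManager {
    std::set<std::string> valid;  // analyses whose results are currently valid
    std::vector<std::string> log; // everything that was run, in order

    void run(const ToyPass &pass) {
        for (const std::string &a : pass.usage.required) {
            if (valid.count(a) == 0) { // schedule the analysis only when needed
                log.push_back("analysis:" + a);
                valid.insert(a);
            }
            assert(valid.count(a) == 1); // every required analysis is now valid
        }
        log.push_back("pass:" + pass.name);

        // Drop every cached analysis the pass did not declare as preserved.
        std::set<std::string> stillValid;
        for (const std::string &a : valid)
            if (pass.usage.preserved.count(a) != 0)
                stillValid.insert(a);
        valid = stillValid;
    }
};

// Helper: a pass that requires "domtree" and optionally preserves it.
ToyPass makePass(const std::string &name, bool preservesDomTree) {
    ToyPass p;
    p.name = name;
    p.usage.required.push_back("domtree");
    if (preservesDomTree)
        p.usage.preserved.insert("domtree");
    return p;
}

// Run three passes; only the middle one invalidates the dominator data.
std::vector<std::string> runExamplePipeline() {
    ToyPassManager pm;
    pm.run(makePass("keepsCfg", true));
    pm.run(makePass("rewritesCfg", false));
    pm.run(makePass("reader", true));
    return pm.log;
}
```

In this run the analysis executes twice rather than three times: the first pass preserves it, so the second pass reuses the cached result, and only the third pass triggers a re-run.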

runOnModule()

❖ Entry point for ModulePass
❖ Is passed a reference to the Module
  ❖ Can locate functions, basic blocks, globals from Module
❖ Return true if the pass modifies the program
  ❖ An analysis pass always returns false.
  ❖ A transform pass can return either true or false.
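The return-value convention can be illustrated without LLVM at all. In this sketch a module is just nested vectors of opcode strings, and "nop" is a placeholder opcode invented for the example (it is not an LLVM instruction). All type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy IR: a module is a list of functions, a function is a list of blocks,
// and a block is a list of opcode names.
using ToyBlock = std::vector<std::string>;
using ToyFunction = std::vector<ToyBlock>;
using ToyModule = std::vector<ToyFunction>;

// Analysis-style pass: only inspects the module, so it always returns false.
bool countLoadsPass(const ToyModule &m, std::size_t &loadCount) {
    loadCount = 0;
    for (const ToyFunction &f : m)
        for (const ToyBlock &bb : f)
            loadCount += static_cast<std::size_t>(
                std::count(bb.begin(), bb.end(), std::string("load")));
    return false;
}

// Transform-style pass: deletes the placeholder "nop" opcode and returns
// true only if at least one instruction was actually removed.
bool removeNopsPass(ToyModule &m) {
    bool changed = false;
    for (ToyFunction &f : m) {
        for (ToyBlock &bb : f) {
            ToyBlock::iterator newEnd =
                std::remove(bb.begin(), bb.end(), std::string("nop"));
            if (newEnd != bb.end()) {
                bb.erase(newEnd, bb.end());
                changed = true;
            }
        }
    }
    return changed;
}

// One function with one block: load, nop, ret.
ToyModule makeExampleModule() {
    ToyBlock bb;
    bb.push_back("load");
    bb.push_back("nop");
    bb.push_back("ret");
    ToyFunction f;
    f.push_back(bb);
    ToyModule m;
    m.push_back(f);
    return m;
}

// The transform reports a change the first time and none the second time.
void checkReturnValues() {
    ToyModule m = makeExampleModule();
    std::size_t loads = 0;
    assert(!countLoadsPass(m, loads) && loads == 1);
    assert(removeNopsPass(m));
    assert(!removeNopsPass(m));
}
```

Returning an accurate flag matters because the pass manager relies on it to decide whether cached analysis results may still be trusted.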

runOnFunction()

❖ Called for each function in the Module
❖ Passed reference to Function
❖ Return false for no modifications; true for modifications

runOnBasicBlock()

❖ You get the idea…

MyPass.h Example

    // Assumes the legacy pass manager API of LLVM 4.0 or later.
    #include "llvm/IR/Dominators.h"
    #include "llvm/IR/Instruction.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Pass.h"

    using namespace llvm;

    class MyPass : public ModulePass {
    private:
      unsigned int analyzeThis(Instruction *I);

    public:
      static char ID;
      MyPass() : ModulePass(ID) {}
      StringRef getPassName() const override { return "My LLVM Pass"; }
      bool runOnModule(Module &M) override;
      void getAnalysisUsage(AnalysisUsage &AU) const override {
        // We require Dominator information. Older releases named this pass
        // DominatorTree; since LLVM 3.5 it is DominatorTreeWrapperPass.
        AU.addRequired<DominatorTreeWrapperPass>();
      }

      unsigned int getAnalysisResultFor(Instruction *I);
    };

MyPass.cpp Example

    #include "MyPass.h"

    char MyPass::ID = 0;
    static RegisterPass<MyPass> P ("mypass", "My First LLVM Analysis");

    bool MyPass::runOnModule (Module & M) {
      //
      // Iterate over all instructions within a Module
      //
      for (Module::iterator fi = M.begin(); fi != M.end(); ++fi) {
        for (Function::iterator bi = fi->begin(); bi != fi->end(); ++bi) {
          for (BasicBlock::iterator it = bi->begin(); it != bi->end(); ++it) {
            // Dereferencing the iterator yields an Instruction&, so take its address
            Instruction * I = &*it;
          }
        }
      }

      // An analysis pass does not modify the program
      return false;
    }

In-Memory LLVM IR


LLVM Classes

❖ There is a class for each type of IR object
  ❖ Module class
  ❖ Function class
  ❖ BasicBlock class
  ❖ Instruction class
❖ Classes provide iterators for objects within them

LLVM In-Memory IR

[Diagram: a Module object holding two globals (int[20] and char[16]) and three Function objects; a Function holds BasicBlock objects, and each BasicBlock holds its instructions (add, mult, br; add, ret; add, sub, br)]

Class Iterators

❖ Each class provides iterators for items it contains
  ❖ Module::iterator iterates over functions
  ❖ Function::iterator iterates over basic blocks
  ❖ BasicBlock::iterator iterates over instructions

Iterator Example

    //
    // Iterate over all instructions within a BasicBlock
    //
    BasicBlock * BB = …;
    BasicBlock::iterator it;
    BasicBlock::iterator ie;

    for (it = BB->begin(), ie = BB->end(); it != ie; ++it) {
      // The iterator refers to an Instruction; take its address for a pointer
      Instruction * I = &*it;
    }

MyPass.cpp Example (Reprise)

    #include "MyPass.h"

    char MyPass::ID = 0;
    static RegisterPass<MyPass> P ("mypass", "My First LLVM Analysis");

    bool MyPass::runOnModule (Module & M) {
      //
      // Iterate over all instructions within a Module
      //
      for (Module::iterator fi = M.begin(); fi != M.end(); ++fi) {
        for (Function::iterator bi = fi->begin(); bi != fi->end(); ++bi) {
          for (BasicBlock::iterator it = bi->begin(); it != bi->end(); ++it) {
            // Dereferencing the iterator yields an Instruction&, so take its address
            Instruction * I = &*it;
          }
        }
      }

      // An analysis pass does not modify the program
      return false;
    }

LLVM Class Hierarchy

❖ Anything that is an SSA value is a subclass of Value
❖ Every instruction class is a subclass of Instruction
❖ Similar instructions share a common superclass

Simplified LLVM Class Hierarchy

    Value
      Instruction
        TerminatorInst
          BranchInst
          SwitchInst
          ReturnInst

Locating Branch Instructions

    //
    // Iterate over all instructions within a BasicBlock
    //
    BasicBlock::iterator it;
    BasicBlock::iterator ie;

    for (it = BB->begin(), ie = BB->end(); it != ie; ++it) {
      Instruction * I = &*it;
      if (BranchInst * BI = dyn_cast<BranchInst>(I)) {
        // Do something with branch instruction BI
      }
    }

Casting to Subclass in LLVM

Casting Function  | Description                                                                                         | Example
isa<Class>()      | Returns true if the value is an instance of that class, false otherwise                             | isa<BranchInst>(V)
dyn_cast<Class>() | Returns a pointer to the object as type Class, or NULL if the value is not an instance of that class | dyn_cast<BranchInst>(V)
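The behaviour in the table can be imitated in a few lines of standalone C++. The sketch below follows the general idea of a per-class classof() check driven by a kind tag; it lives in a hypothetical namespace toy and is a simplification, not LLVM's actual templates (which also handle references, const pointers, and more).

```cpp
#include <cassert>

namespace toy {

// Each object records its concrete kind; this stands in for C++ RTTI.
class Instruction {
public:
    enum Kind { Branch, Load };
    explicit Instruction(Kind k) : kind(k) {}
    Kind getKind() const { return kind; }

private:
    const Kind kind;
};

class BranchInst : public Instruction {
public:
    BranchInst() : Instruction(Branch) {}
    // classof answers "is this Instruction really a BranchInst?"
    static bool classof(const Instruction *i) { return i->getKind() == Branch; }
};

class LoadInst : public Instruction {
public:
    LoadInst() : Instruction(Load) {}
    static bool classof(const Instruction *i) { return i->getKind() == Load; }
};

// isa<To>(v): true if v points to an object of class To.
template <typename To, typename From>
bool isa(const From *v) {
    assert(v && "isa<> used on a null pointer");
    return To::classof(v);
}

// dyn_cast<To>(v): v as a To*, or nullptr if v is not a To.
template <typename To, typename From>
To *dyn_cast(From *v) {
    return isa<To>(v) ? static_cast<To *>(v) : nullptr;
}

} // namespace toy
```

Because dyn_cast yields a null pointer on a mismatch, it can be used directly inside an if condition, which is the idiom the branch-locating loop relies on.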


LLVM Instruction Classes

❖ BinaryOperator - add, sub, mul, shift, and, or, etc.
❖ GetElementPtrInst
❖ LoadInst, StoreInst
❖ BranchInst, SwitchInst, ReturnInst
❖ CallInst
❖ CastInst

LLVM Class Methods

❖ Each class has methods to get information on value
  ❖ BranchInst - Iterate over successor basic blocks
  ❖ StoreInst - Get pointer operand of store instruction
  ❖ GetElementPtrInst - Get indices used as operands
  ❖ Instruction - Get containing basic block
❖ Method might belong to a superclass

Beyond the Tutorial


Data Flow Analysis

❖ The Dragon Book, Fourth Edition
❖ Papers on SSA-based algorithms
❖ Kam-Ullman paper on iterative data-flow analysis

Getting Involved with LLVM

❖ Research on program analysis (NSF REUs)
❖ Google Summer of Code projects
❖ Apple, Samsung, Google, Facebook build LLVM tools
❖ LLVM Developer’s Meeting
  ❖ One in California; one in Europe
  ❖ Can present talks, posters, BoFs, etc.