
Post on 21-Dec-2015


Interprocedural Program Analyses

David Heine Vladimir Livshits Brian Murphy Christopher Unkel Hansel Wan

Stanford University http://suif.stanford.edu/


Outline

I. Data structures for program analysis
II. Interprocedural analysis framework
III. Interprocedural passes and parallelizer
IV. Pointer alias analysis


I. Data structures: Lattice values

• Commonly used in data flow analysis
  • bottom, top, meet operators
• Includes definitions of some common lattices, e.g. bitvectors, constants, intervals
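To make the bottom/top/meet vocabulary concrete, here is a minimal sketch of a flat constant lattice. The class `ConstLattice` and all of its members are hypothetical names chosen for illustration; they are not the SUIF lattice classes.

```cpp
#include <cassert>

// Hypothetical flat lattice for constant propagation (not a SUIF class).
// Top    = no information yet
// Const  = exactly one known constant
// Bottom = conflicting information
class ConstLattice {
public:
    static ConstLattice top() { return ConstLattice(Kind::Top, 0); }
    static ConstLattice bottom() { return ConstLattice(Kind::Bottom, 0); }
    static ConstLattice constant(long v) { return ConstLattice(Kind::Const, v); }

    bool is_top() const { return kind_ == Kind::Top; }
    bool is_bottom() const { return kind_ == Kind::Bottom; }
    bool is_const() const { return kind_ == Kind::Const; }
    long value() const { return value_; }

    // In-place meet; returns true if this value changed.
    bool meet(const ConstLattice& other) {
        if (other.kind_ == Kind::Top || kind_ == Kind::Bottom) return false;
        if (kind_ == Kind::Top) {          // other is Const or Bottom here
            *this = other;
            return true;
        }
        // This value is Const: it survives only if other is the same constant.
        if (other.kind_ == Kind::Const && other.value_ == value_) return false;
        *this = bottom();
        return true;
    }

private:
    enum class Kind { Top, Const, Bottom };
    ConstLattice(Kind k, long v) : kind_(k), value_(v) {}
    Kind kind_;
    long value_;
};
```

Returning a "changed" flag from `meet` is a design choice that lets a fixed-point loop stop as soon as no value moves down the lattice.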


Graphs

• Common algorithms
  • Iterated dominance frontier
  • Strongly connected components
• Generates dot graph output
  • Example: control flow graphs and call graphs
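As an illustration of dot output, the sketch below renders a small call graph in the DOT language. The helper `to_dot` and its edge-list representation are assumptions made for this example, not the interface of the SUIF graph classes.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: render a directed graph, given as (from, to) edges,
// as a DOT "digraph". Node names are quoted so any identifier is accepted.
std::string to_dot(const std::string& name,
                   const std::vector<std::pair<std::string, std::string>>& edges) {
    std::ostringstream out;
    out << "digraph " << name << " {\n";
    for (const auto& e : edges)
        out << "  \"" << e.first << "\" -> \"" << e.second << "\";\n";
    out << "}\n";
    return out.str();
}
```

The resulting text can be saved to a file and laid out with Graphviz, for example `dot -Tps callgraph.dot -o callgraph.ps`.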


Region Graphs

• Capture the hierarchical program structure alongside the statements
  • An interpretation of the statements without dismantling them
  • Useful for elimination-style algorithms
• A region
  • has one entry and possibly multiple exits
  • may be a terminal region (straight-line control flow internally) or a composite region
• Flow between subregions is specified by
  • a control flow graph (adjacency lists), or
  • a regular expression (path expression with composition, meet and Kleene star)
• Extensible with new nodes
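The path-expression form can be sketched as a small expression tree that is evaluated over a summary value. Everything below is hypothetical: `PathExpr`, `WriteSummary` and the bitmask encoding are illustrative assumptions, and the star rule assumes a loop body may execute zero times.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical summary: bit i set means variable i is written.
struct WriteSummary {
    std::uint32_t may;   // written on some path
    std::uint32_t must;  // written on every path
};

struct PathExpr;
using ExprPtr = std::shared_ptr<const PathExpr>;

// Path expression over subregions: leaf, composition, meet, Kleene star.
struct PathExpr {
    enum class Op { Leaf, Compose, Meet, Star };
    Op op;
    WriteSummary leaf;  // used only when op == Leaf
    ExprPtr lhs;        // operand of Star, first operand otherwise
    ExprPtr rhs;        // second operand of Compose and Meet
};

ExprPtr leaf(std::uint32_t written) {
    return std::make_shared<PathExpr>(
        PathExpr{PathExpr::Op::Leaf, {written, written}, nullptr, nullptr});
}
ExprPtr compose(ExprPtr a, ExprPtr b) {
    return std::make_shared<PathExpr>(PathExpr{PathExpr::Op::Compose, {0, 0}, a, b});
}
ExprPtr meet(ExprPtr a, ExprPtr b) {
    return std::make_shared<PathExpr>(PathExpr{PathExpr::Op::Meet, {0, 0}, a, b});
}
ExprPtr star(ExprPtr a) {
    return std::make_shared<PathExpr>(PathExpr{PathExpr::Op::Star, {0, 0}, a, nullptr});
}

WriteSummary evaluate(const PathExpr& e) {
    if (e.op == PathExpr::Op::Leaf) return e.leaf;
    WriteSummary a = evaluate(*e.lhs);
    // Star: the body may run zero times, so nothing is written for certain.
    if (e.op == PathExpr::Op::Star) return WriteSummary{a.may, 0};
    WriteSummary b = evaluate(*e.rhs);
    // Compose: both parts run in sequence, so both kinds of write accumulate.
    if (e.op == PathExpr::Op::Compose)
        return WriteSummary{a.may | b.may, a.must | b.must};
    // Meet: either branch runs; a write is certain only if both branches do it.
    return WriteSummary{a.may | b.may, a.must & b.must};
}
```

For a branch that writes x and y on one side and only x on the other, followed by a loop that writes z, `compose(meet(leaf(X | Y), leaf(X)), star(leaf(Z)))` evaluates to may = {x, y, z} and must = {x}.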


Region Transformations

• Flattening regions
• Conversions between regular expressions and control flow graphs: RE -> CFG and CFG -> RE
  • May involve some code cloning


II. Interprocedural Analysis

Two important design choices in program analysis:
• Across procedures
  • No interprocedural analysis
  • Interprocedural: context-insensitive
  • Interprocedural: context-sensitive
• Within a procedure
  • Flow-insensitive
  • Flow-sensitive: interval/region based
  • Flow-sensitive: iterative over flow graph


Efficient Context-Sensitive Analysis

• Bottom-up
  • A region/interval: a procedure or a loop
  • An edge: call or code in inner scope
  • Summarize each region (with a transfer function)
  • Find strongly connected components (SCCs)
  • Bottom-up traversal of SCCs
  • Iteration to find fixed point for recursive functions
• Top-down
  • Top-down propagation of values
  • Iteration to find fixed point for recursive functions (SCCs)

[Figure: region graph; labels: call, inner loop, scc]
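The bottom-up scheme can be sketched as follows, assuming a call graph stored as adjacency lists and a summary that is simply a bitmask of modified variables. `SccFinder` and `summarize_mods` are hypothetical names; the SCC step uses Tarjan's algorithm, which is a choice made for this example and not necessarily what the framework uses. Tarjan's algorithm emits each component only after every component it can reach, so callees are summarized before their callers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical call graph: g[f] lists the procedures that f calls.
using CallGraph = std::vector<std::vector<int>>;

// Tarjan's SCC algorithm. Components come out callees-first (bottom-up).
class SccFinder {
public:
    explicit SccFinder(const CallGraph& g)
        : g_(g), index_(g.size(), -1), low_(g.size(), 0), on_stack_(g.size(), false) {
        for (int v = 0; v < static_cast<int>(g.size()); ++v)
            if (index_[v] < 0) visit(v);
    }
    const std::vector<std::vector<int>>& sccs() const { return sccs_; }

private:
    void visit(int v) {
        index_[v] = low_[v] = next_++;
        stack_.push_back(v);
        on_stack_[v] = true;
        for (int w : g_[v]) {
            if (index_[w] < 0) {
                visit(w);
                low_[v] = std::min(low_[v], low_[w]);
            } else if (on_stack_[w]) {
                low_[v] = std::min(low_[v], index_[w]);
            }
        }
        if (low_[v] == index_[v]) {  // v is the root of a component
            std::vector<int> scc;
            int w;
            do {
                w = stack_.back();
                stack_.pop_back();
                on_stack_[w] = false;
                scc.push_back(w);
            } while (w != v);
            sccs_.push_back(scc);
        }
    }

    const CallGraph& g_;
    std::vector<int> index_, low_;
    std::vector<bool> on_stack_;
    std::vector<int> stack_;
    std::vector<std::vector<int>> sccs_;
    int next_ = 0;
};

// Bottom-up summaries: summary[f] = local[f] joined with the summaries of
// everything f calls. Recursive procedures iterate to a fixed point per SCC.
std::vector<std::uint32_t> summarize_mods(const CallGraph& g,
                                          const std::vector<std::uint32_t>& local) {
    std::vector<std::uint32_t> summary(local);
    SccFinder finder(g);
    for (const auto& scc : finder.sccs()) {
        bool changed = true;
        while (changed) {
            changed = false;
            for (int f : scc) {
                for (int callee : g[f]) {
                    std::uint32_t merged = summary[f] | summary[callee];
                    if (merged != summary[f]) {
                        summary[f] = merged;
                        changed = true;
                    }
                }
            }
        }
    }
    return summary;
}
```

A top-down pass would walk the same component list in reverse, pushing values from callers into callees.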


Interprocedural Framework Architecture

[Figure: framework architecture, drawn as layers]
• User-defined handlers / lattice values: e.g. array summaries, mod/ref analysis
• Compound handlers: procedure calls and returns, composite regions
• Driver: bottom-up, top-down, linear traversal
• Data structures: call graphs, SCC, lattice values; regions, control flow graphs


Interprocedural Framework Architecture

• Interprocedural analysis data structures
  • e.g. call graphs, regions or intervals
• Handlers: orthogonal sets of handlers for different groups of constructs
  • Primitives: user specifies analysis-specific semantics of primitives
  • Compound: handles compound statements and calls
  • User chooses between handlers of different styles
    • e.g. no interprocedural analysis versus context-sensitive
    • e.g. flow-insensitive vs. flow-sensitive
  • All the handlers are registered in a visitor
• Driver
  • Invoked by the user's request for information (demand driven)
  • Builds prepass data structures
  • Invokes the right set of handlers in the right order (e.g. bottom-up traversal of call graph)
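A rough sketch of the handler-registration and demand-driven idea follows. The `Driver`, `Construct` and `Node` types here are invented for illustration and do not mirror the real framework classes; the point is only that handlers are registered per construct kind and nothing is computed until a result is requested.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical construct kinds and program nodes.
enum class Construct { Statement, Loop, Call };

struct Node {
    Construct kind;
    std::string name;
};

// Hypothetical driver: dispatches to the handler registered for a node's
// kind, and caches results so each node is analyzed at most once.
class Driver {
public:
    using Handler = std::function<int(const Node&)>;

    void register_handler(Construct kind, Handler h) {
        handlers_[kind] = std::move(h);
    }

    // Demand-driven: the handler runs on the first query only.
    // Throws std::out_of_range if no handler is registered for the kind.
    int query(const Node& n) {
        auto hit = cache_.find(n.name);
        if (hit != cache_.end()) return hit->second;
        int value = handlers_.at(n.kind)(n);
        cache_[n.name] = value;
        return value;
    }

private:
    std::map<Construct, Handler> handlers_;
    std::map<std::string, int> cache_;
};
```

Swapping one registered handler for another (say, a flow-insensitive one for a flow-sensitive one) changes the analysis style without touching the driver.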


III. Interprocedural Passes

• Scalar analysis
  • Mod/ref, reduction recognition: bottom-up, flow-insensitive
  • Liveness for privatization: bottom-up and top-down, flow-sensitive
  • Constraint propagation: top-down, flow-insensitive
• Array analysis
  • Dependence analysis
  • Privatization analysis


Region-Based Array Analysis

• Array sections are represented as sets of linear inequalities (Omega)
• Bottom-up and backward-flow analysis
• For each region: compute 4 sections for each array accessed
  • M: may have been written
  • W: must have been written
  • R: may have been read
  • E (exposed read): values read are defined before the region executes
• Dependence test
  • $\forall$ iterations $i, j$: $M_i \cap R_j = \emptyset$
• Privatization test
  • $\forall$ iterations $i$: $E_i = \emptyset$
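The two tests can be tried out on small loops with the sketch below. It makes two loud assumptions: array sections are explicit sets of element indices rather than systems of linear inequalities, and the dependence test compares only distinct iterations. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Assumption: a section is an explicit set of array indices.
using Section = std::set<int>;

// Hypothetical per-iteration summary using the M, R and E sections.
struct IterationSummary {
    Section may_write;     // M
    Section may_read;      // R
    Section exposed_read;  // E
};

bool intersects(const Section& a, const Section& b) {
    for (int x : a)
        if (b.count(x) != 0) return true;
    return false;
}

// Dependence test: M_i and R_j must be disjoint.
// Assumption: only distinct iterations (i != j) are compared.
bool passes_dependence_test(const std::vector<IterationSummary>& iters) {
    for (std::size_t i = 0; i < iters.size(); ++i)
        for (std::size_t j = 0; j < iters.size(); ++j)
            if (i != j && intersects(iters[i].may_write, iters[j].may_read))
                return false;
    return true;
}

// Privatization test: no iteration has an exposed read.
bool is_privatizable(const std::vector<IterationSummary>& iters) {
    for (const auto& it : iters)
        if (!it.exposed_read.empty()) return false;
    return true;
}
```

A loop in which every iteration writes `t[0]` and then reads it back fails the dependence test but passes the privatization test, which is the case privatization is meant to rescue.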


Example: ModRef Analysis

    class ModRefProblem : public BUProblem {
    public:
        ModRefProblem(SuifEnv* suif_env, PtrAnalysisType the_ptrAnalysisType);
        virtual void initialize();
        ...
    };

    ModRefProblem::ModRefProblem(SuifEnv* suif_env,
                                 PtrAnalysisType the_ptrAnalysisType)
        : BUProblem(suif_env, "ModRef",
                    new ModRefValue(), new ModRefValue(),
                    // user-defined handler
                    new ModRefUserBUHandler(suif_env, the_ptrAnalysisType),
                    // interprocedural handler
                    new CallGraphIPBUHandler(suif_env),
                    // intraprocedural handler
                    new FlowInsensitiveIntraBUHandler(suif_env)),
          ptrAnalysisType(the_ptrAnalysisType)
    {
        initialize();
    }


Lattice Values

    class ModRefValue : public LatticeValue {
    public:
        ModRefValue();
        ~ModRefValue();

        AbslocSetValue* get_mod() const { return modVars; }
        AbslocSetValue* get_ref() const { return refVars; }

        virtual void do_meet(const LatticeValue* other, bool* changed = NULL);
        virtual LatticeValue* top() const;
        virtual LatticeValue* id() const;
        virtual void do_compose(const LatticeValue* other, bool* changed = NULL);
        virtual void do_star(const VariableSymbol* idx, const Expression* lb,
                             const Expression* ub, bool* changed) {}
        virtual void do_widen(const LatticeValue* other, bool* changed);

        LatticeValue* clone() const;
        bool is_top() const;
        bool is_id() const;
        String to_string() const;
        ...
    };


User-Defined Handler

    class ModRefUserBUHandler : public UserBUHandler {
    public:
        ModRefUserBUHandler(SuifEnv* suif_env, PtrAnalysisType ptrAnalysisType);

        virtual UNSHARED LatticeValue* handle_statement(
            BUProblem* problem, Statement* stmt);
        virtual LatticeValue* handle_simple_region(
            BUProblem* problem, SimpleRegion* region);
        virtual LatticeValue* handle_predicate_region(
            BUProblem* problem, PredicateRegion* region);
        virtual LatticeValue* handle_mwb_default_region(
            BUProblem* problem, MWBDefaultRegion* region);
        virtual LatticeValue* handle_eval_predicate_region(
            BUProblem* problem, EvalPredicateRegion* region);
        virtual LatticeValue* handle_undef_proc_region(
            BUProblem* problem, UndefProcRegion* region);
        ...
    };


Most of the work is done here!

    UNSHARED LatticeValue* ModRefUserBUHandler::handle_statement(
        BUProblem* problem, Statement* stmt)
    {
        ModRefValue* curr_value = new ModRefValue();

        // Every source variable of the statement is recorded as a ref.
        for (SemanticHelper::SrcVarIter iter(stmt); iter.is_valid(); iter.next())
            curr_value->add_ref(iter.current());

        if (is_kind_of<StoreVariableStatement>(stmt)) {
            // x = ...: the destination variable is modified.
            StoreVariableStatement* s = to<StoreVariableStatement>(stmt);
            VarAbsLocation* dest =
                VarAbsLocation::create_var_absloc(s->get_destination());
            curr_value->get_mod()->add(dest);
        } else if (is_kind_of<StoreStatement>(stmt)) {
            // *x = y: join in the abstract locations returned for the store.
            StoreStatement* s = to<StoreStatement>(stmt);
            curr_value->get_mod()->do_join(
                new AbslocSetValue(query->get_absloc_set(s), true));
        }
        return curr_value;
    }


Parallelizer

Parallelizes a loop if
• there is no abnormal exit out of the loop
• all scalar variables are either
  • read-only variables
  • privatizable variables
  • reduction variables
• all array variables either have no dependence or can be privatized
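The three conditions combine into a single predicate, sketched below. The types `LoopFacts`, `ScalarKind` and `ArrayFacts` are hypothetical containers for facts that the earlier analyses would supply; they are not part of the parallelizer's actual interface.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical classification of a scalar variable within a loop.
enum class ScalarKind { ReadOnly, Privatizable, Reduction, Other };

// Hypothetical per-array facts from dependence and privatization analysis.
struct ArrayFacts {
    bool has_dependence;
    bool privatizable;
};

struct LoopFacts {
    bool has_abnormal_exit;
    std::vector<ScalarKind> scalars;
    std::vector<ArrayFacts> arrays;
};

// A loop is parallelizable only if all three conditions hold.
bool can_parallelize(const LoopFacts& loop) {
    if (loop.has_abnormal_exit) return false;
    bool scalars_ok = std::all_of(
        loop.scalars.begin(), loop.scalars.end(),
        [](ScalarKind k) { return k != ScalarKind::Other; });
    bool arrays_ok = std::all_of(
        loop.arrays.begin(), loop.arrays.end(),
        [](const ArrayFacts& a) { return !a.has_dependence || a.privatizable; });
    return scalars_ok && arrays_ok;
}
```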


IV. Pointer Alias Analysis

• Steensgaard’s pointer alias analysis
  • Flow-insensitive and context-insensitive, type-inference based analysis
  • Very efficient: near linear-time analysis
  • Very inaccurate
• A good bootstrapping step for interprocedural C program analysis
  • Enables the construction of a call graph with indirect function calls
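The unification idea behind a Steensgaard-style analysis can be sketched with a union-find structure in which every equivalence class has at most one pointee class. This is a simplified illustration, not the SUIF implementation: the `Unifier` class is hypothetical, and it unifies unconditionally on every assignment instead of using the conditional joins of the published algorithm.

```cpp
#include <cassert>
#include <vector>

// Hypothetical unification-based points-to sketch.
class Unifier {
public:
    // Create a new abstract location and return its id.
    int fresh() {
        int id = static_cast<int>(parent_.size());
        parent_.push_back(id);
        pointee_.push_back(-1);
        return id;
    }

    // Union-find lookup with path halving.
    int find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];
            x = parent_[x];
        }
        return x;
    }

    // Class pointed to by x, created on demand.
    int pointee(int x) {
        int r = find(x);
        if (pointee_[r] < 0) {
            int f = fresh();  // may grow the vectors, so index afterwards
            pointee_[r] = f;
        }
        return find(pointee_[r]);
    }

    // Merge two classes and, recursively, whatever they point to.
    void unify(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        parent_[a] = b;
        int pa = pointee_[a];
        int pb = pointee_[b];
        if (pb < 0) pointee_[b] = pa;
        else if (pa >= 0) unify(pa, pb);
    }

    void assign(int x, int y)     { unify(pointee(x), pointee(y)); }           // x = y
    void address_of(int x, int y) { unify(pointee(x), y); }                    // x = &y
    void load(int x, int y)       { unify(pointee(x), pointee(pointee(y))); }  // x = *y
    void store(int x, int y)      { unify(pointee(pointee(x)), pointee(y)); }  // *x = y

    bool same_class(int a, int b) { return find(a) == find(b); }

private:
    std::vector<int> parent_;
    std::vector<int> pointee_;
};
```

After `p = &a; q = &b; p = q;` the locations `a` and `b` share a class, which illustrates both the speed (one merge per statement) and the imprecision (`q` now appears to point to `a` as well).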


Context-Sensitive Pointer Analysis

• Implementation in SUIF 2 of the analysis described in "Scalable Context-Sensitive Flow Analysis Using Instantiation Constraints", Fahndrich, Rehof, Das (PLDI ’00)
  • Context-sensitive, flow-insensitive flow analysis
  • Instantiation constraints represent caller-callee relationships
  • Handles function pointers smoothly, and is efficient
  • One application is pointer alias analysis
• Implementation runs in three phases
  • constraint generation
  • constraint solution
  • reachability analysis
• Implemented in SUIF in ~6 weeks (as a first project in SUIF)


Demo of Two Visualization Tools

From the implementation in SUIF, running on sizeable programs:
• Progress of the analysis
• Resulting type graphs


Progress Visualization

• Simple X windows progress monitor
  • One pixel for each node, allocated in scan order as nodes are created
  • White: initially created; red: callee; green: caller; grey: merged node
• Visualization results
  • Constraint generation: white nodes created; some functions and call sites
  • Constraint solution: nodes merged together and many greyed out
    • Several passes of working down pointer chains: a=b, *a=*b, **a=**b
    • Red and green spread to formal and actual parameters
    • Some new nodes created for “product types”
    • Scattered merging as the algorithm deduces flow through functions
• Nearly 1,000,000 nodes created for gcc; 2.5 minutes CPU time on this laptop


Result Visualization

    pointergraph compress.suif compress.ps
    ghostview compress.ps &

Resulting type graphs courtesy of Dot:
• Pointees below pointers
• Arguments below functions
• Callers below callees
• Nodes marked with variable names
• Optional grouping by function (only for small programs)