Advanced Compiler Techniques
LIU Xianhua
School of EECS, Peking University
Pointer Analysis
Pointer Analysis Outline:
• What is pointer analysis
• Intraprocedural pointer analysis
• Interprocedural pointer analysis
• Andersen and Steensgaard
• New Directions
Pointer and Alias Analysis
Aliases: two expressions that denote the same memory location.
Aliases are introduced by:
• pointers
• call-by-reference
• array indexing
• C unions
Useful for What?
• Improve the precision of analyses that require knowing what is modified or referenced (e.g., constant propagation, CSE, …)
• Eliminate redundant loads/stores and dead stores.
  x := *p; ... y := *p; // replace with y := x?
  *x := ...; // is *x dead?
• Parallelization of code
  – Can recursive calls to quick_sort be run in parallel? Yes, provided that they reference distinct regions of the array.
• Identify objects to be tracked in error detection tools
  x.lock(); ... y.unlock(); // same object as x?
Challenges for Pointer Analysis
• Complexity: huge in space and time
  – compare every pointer with every other pointer
  – at every program point
  – potentially considering all program paths to that point
• Scalability vs. accuracy trade-off
  – different analyses motivated for different purposes
  – many useful algorithms (adds to confusion)
• Coding corner cases
  – pointer arithmetic (*p++), casting, function pointers, long-jumps
• Whole program?
  – most algorithms require the entire program
  – library code? optimizing at link-time only?
Kinds of Alias Information
• Points-to information (must or may versions)
  – at each program point, compute a set of pairs of the form p->x, where p points to x.
  – can represent this information in a points-to graph
  – less precise, more efficient
• Alias pairs
  – at each program point, compute the set of all pairs (e1,e2) where e1 and e2 must/may reference the same memory.
  – more precise, less efficient
• Storage shape analysis
  – at each program point, compute an abstract description of the pointer structure.
[Figure: points-to graph over the nodes p, x, y, z]
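The relation between the first two kinds of information can be sketched in a few lines. This is only an illustration, assuming may-points-to sets are stored in a dictionary; the function name may_alias_pairs and the sample variables are hypothetical.

```python
from itertools import combinations

def may_alias_pairs(points_to):
    """Derive may-alias pairs (*p, *q) from may-points-to sets."""
    pairs = set()
    for p, q in combinations(sorted(points_to), 2):
        if points_to[p] & points_to[q]:      # a shared target: *p and *q may alias
            pairs.add((f"*{p}", f"*{q}"))
    return pairs

# Hypothetical points-to information at one program point.
points_to = {"p": {"x", "y"}, "q": {"y"}, "r": {"z"}}
print(may_alias_pairs(points_to))            # {('*p', '*q')}
```

The points-to form stores one set per pointer, while the alias-pair form can need an entry for every pair of expressions, which is one way to read the efficiency remarks above.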
Intraprocedural Points-to Analysis
• Want to compute may-points-to information
• Lattice:
  – Domain: $2^{\{x \to y \mid x \in Var,\ y \in Var\}}$
  – Join: set union
  – BOT: empty set
  – TOP: $\{x \to y \mid x \in Var,\ y \in Var\}$
Flow Functions
Each statement maps the fact in (before the statement) to out = F(in). A pair (x, y) stands for x -> y, and $\{(x, *)\}$ stands for every pair whose first component is x.
• x := a + b: $F_{x := a+b}(in) = in - \{(x, *)\}$
• x := k: $F_{x := k}(in) = in - \{(x, *)\}$
• x := &y: $F_{x := \&y}(in) = (in - \{(x, *)\}) \cup \{(x, y)\}$
• x := y: $F_{x := y}(in) = (in - \{(x, *)\}) \cup \{(x, z) \mid (y, z) \in in\}$
• *x := y: $F_{*x := y}(in) = (in - \{\}) \cup \{(a, b) \mid (x, a) \in in \land (y, b) \in in\}$
• x := *y: $F_{x := *y}(in) = (in - \{(x, *)\}) \cup \{(x, t) \mid (y, z) \in in \land (z, t) \in in\}$
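The flow functions can be written down almost literally as set operations. The following is a minimal sketch, assuming a dataflow fact is a Python set of (pointer, target) pairs; the function names are made up for this illustration.

```python
def kill(facts, x):
    """in - {(x, *)}: drop every pair whose first component is x."""
    return {(a, b) for (a, b) in facts if a != x}

def assign_const(facts, x):              # x := k  or  x := a + b
    return kill(facts, x)

def assign_addr(facts, x, y):            # x := &y
    return kill(facts, x) | {(x, y)}

def assign_copy(facts, x, y):            # x := y
    return kill(facts, x) | {(x, z) for (a, z) in facts if a == y}

def assign_load(facts, x, y):            # x := *y
    return kill(facts, x) | {(x, t)
                             for (a, z) in facts if a == y
                             for (b, t) in facts if b == z}

def assign_store(facts, x, y):           # *x := y  (nothing is killed)
    return facts | {(a, b)
                    for (s, a) in facts if s == x
                    for (u, b) in facts if u == y}

facts = set()
facts = assign_addr(facts, "p", "a")     # p := &a
facts = assign_addr(facts, "q", "b")     # q := &b
facts = assign_store(facts, "p", "q")    # *p := q   adds (a, b)
facts = assign_load(facts, "r", "p")     # r := *p   adds (r, b)
facts = assign_copy(facts, "p", "q")     # p := q    kills (p, a), adds (p, b)
print(sorted(facts))
```

Every function reads the generated pairs from the incoming fact, not from the fact after the kill, which matches the formulas above.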
Pointers to Dynamically Allocated Memory
• Handle statements of the form: x := new T
• One idea: generate a new variable each time the new statement is analyzed to stand for the new location.
Example
l := new Cons
p := l
t := new Cons
*p := t
p := t
Example Solved
l := new Cons
p := l
t := new Cons
*p := t
p := t
[Figure: points-to graphs over l, p, t and the freshly generated nodes V1, V2, V3 at successive program points]
What Went Wrong?
• Lattice was infinitely tall!
• Instead, we need to summarize the infinitely many allocated objects in a finite way.
  – introduce summary nodes, which will stand for a whole class of allocated objects.
• For example: for each new statement with label L, introduce a summary node loc_L, which stands for the memory allocated by statement L.
• Summary nodes can use other criteria for merging.
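A small experiment makes the termination problem concrete. The sketch below is only an illustration: it re-analyzes a single statement t := new Cons as if it sat in a loop, joins at the loop head with set union, and records the size of the fact after each round. The helper names analyze_new and iterate_loop are hypothetical.

```python
from itertools import count

def analyze_new(facts, x, location):
    """x := new T: kill the old targets of x, then point x at `location`."""
    return {(a, b) for (a, b) in facts if a != x} | {(x, location)}

def iterate_loop(make_location, rounds):
    """Re-analyze `t := new Cons` in a loop, joining at the loop head."""
    head, sizes = set(), []
    for _ in range(rounds):
        head = head | analyze_new(head, "t", make_location())  # join = set union
        sizes.append(len(head))
    return sizes

fresh = count(1)
print(iterate_loop(lambda: f"V{next(fresh)}", 5))  # [1, 2, 3, 4, 5]: never stabilizes
print(iterate_loop(lambda: "S2", 5))               # [1, 1, 1, 1, 1]: fixed point reached
```

With a fresh node per visit the fact grows on every round, so no fixed point is reached; with one summary node per allocation site it stabilizes immediately.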
Example Revisited & Solved
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: points-to graphs over l, p, t and the summary nodes S1, S2 at each program point for Iter 1, Iter 2, Iter 3]
Array Aliasing and Pointers to Arrays
• Array indexing can cause aliasing: a[i] aliases b[j] if:
  – a aliases b and i = j
  – a and b overlap, and i = j + k, where k is the amount of overlap.
• Can have pointers to elements of an array
  – p := &a[i]; ...; p++;
• How can arrays be modeled?
  – Could treat the whole array as one location.
  – Could try to reason about the array index expressions: array dependence analysis.
Fields
• Can summarize fields using per-field summary
  – for each field F, keep a points-to node called F that summarizes all possible values that can ever be stored in F
• Can also use allocation sites
  – for each field F, and each allocation site S, keep a points-to node called (F, S) that summarizes all possible values that can ever be stored in the field F of objects allocated at site S.
Summary
• We just saw:
  – intraprocedural points-to analysis
  – handling dynamically allocated memory
  – handling pointers to arrays
• But intraprocedural pointer analysis is not enough.
  – Sharing data structures across multiple procedures is one of the big benefits of pointers: instead of passing the whole data structures around, just pass pointers to them (e.g., C pass by reference).
  – So pointers end up pointing to structures shared across procedures.
  – If you don't do an interprocedural analysis, you'll have to make conservative assumptions at function entries and function calls.
Conservative Approximation on Entry
• Say we don't have interprocedural pointer analysis.
• What should the information be at the input of the following procedure:
global g;
void p(x,y) {
  ...
}
[Figure: the nodes x, y, g]
Conservative Approximation on Entry
• Here are a few solutions:
global g;
void p(x,y) {
  ...
}
[Figure: candidate entry graphs, one with separate nodes x, y, g and "locations from alloc sites prior to this invocation", one with a single merged node "x, y, g & locations from alloc sites prior to this invocation"]
• They are all very conservative!
• We can try to do better.
Interprocedural Pointer Analysis
• Main difficulty in performing interprocedural pointer analysis is scaling
• One can use a bottom-up summary-based approach (Wilson & Lam 95), but even these are hard to scale
Example Revisited
• Cost:
  – space: store one fact at each prog point
  – time: iteration
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: points-to graphs over l, p, t, S1, S2 at each program point for Iter 1, Iter 2, Iter 3]
New Idea: Store One Dataflow Fact
• Store one dataflow fact for the whole program
• Each statement updates this one dataflow fact
  – use the previous flow functions, but now they take the whole program dataflow fact, and return an updated version of it.
• Process each statement once, ignoring the order of the statements
• This is called a flow-insensitive analysis.
Flow Insensitive Pointer Analysis
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: a single points-to graph over l, p, t, S1, S2, updated as each statement is processed once]
Flow Sensitive vs. Insensitive
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: the flow-sensitive solution (one points-to graph per program point) next to the flow-insensitive solution (a single graph), over l, p, t, S1, S2]
What Went Wrong?
• What happened to the link between p and S1?
  – Can't do strong updates anymore!
  – Need to remove all the kill sets from the flow functions.
• What happened to the self loop on S2?
  – We still have to iterate!
Flow InsensitivePointer Analysis: Fixed
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: the single points-to graph over l, p, t, S1, S2 after Iter 1, Iter 2, Iter 3, and the final result]
Final result: this is Andersen's algorithm ('94)
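The fixed version can be sketched as a small worklist-free loop: one global fact, no kill sets, and repetition until nothing changes. This is a minimal illustration, not a tuned implementation; the statement encoding ("new", "copy", "load", "store" tuples) and the function name andersen are assumptions made for the example.

```python
from collections import defaultdict

def andersen(statements):
    """Flow-insensitive, subset-based points-to analysis (no kill sets)."""
    pts = defaultdict(set)
    changed = True
    while changed:                       # statement order is ignored; iterate to a fixed point
        changed = False
        for kind, x, y in statements:
            if kind in ("new", "addr"):  # x := new T at site y,  x := &y
                updates = [(x, {y})]
            elif kind == "copy":         # x := y
                updates = [(x, set(pts[y]))]
            elif kind == "load":         # x := *y
                updates = [(x, set().union(*(pts[z] for z in pts[y])))]
            elif kind == "store":        # *x := y
                updates = [(a, set(pts[y])) for a in list(pts[x])]
            else:
                raise ValueError(kind)
            for target, new in updates:
                if not new <= pts[target]:
                    pts[target] |= new
                    changed = True
    return {v: s for v, s in pts.items() if s}

program = [
    ("new", "l", "S1"),      # S1: l := new Cons
    ("copy", "p", "l"),      #     p := l
    ("new", "t", "S2"),      # S2: t := new Cons
    ("store", "p", "t"),     #     *p := t
    ("copy", "p", "t"),      #     p := t
]
result = andersen(program)
print(result)
```

On the running example the second pass adds the self loop on S2 (through *p := t, once p also points to S2), and p keeps its link to S1 because nothing is ever killed.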
Flow Insensitive Loss of Precision
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: the flow-sensitive solution next to the flow-insensitive solution, over l, p, t, S1, S2]
Flow Insensitive Loss of Precision
• Flow insensitive analysis leads to loss of precision!
main() {
  x := &y;
  ...       // Flow insensitive analysis tells us that x may point to z here!
  x := &z;
}
• However:
  – uses less memory (memory can be a big bottleneck to running on large programs)
  – runs faster
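The loss of precision in main() can be reproduced with two tiny analyses restricted to statements of the form x := &y. This is a sketch under that restriction; the function names are hypothetical.

```python
def flow_sensitive(statements):
    """Points-to sets before every statement, plus one entry for the exit."""
    facts, per_point = {}, []
    for x, y in statements:              # each statement is x := &y
        per_point.append({v: set(s) for v, s in facts.items()})
        facts[x] = {y}                   # strong update: old targets of x are killed
    per_point.append(facts)
    return per_point

def flow_insensitive(statements):
    """One fact for the whole program; targets only accumulate."""
    facts = {}
    for x, y in statements:
        facts.setdefault(x, set()).add(y)
    return facts

main = [("x", "y"), ("x", "z")]          # x := &y; ...; x := &z
print(flow_sensitive(main)[1])           # {'x': {'y'}} between the two assignments
print(flow_insensitive(main))            # x may point to y or z everywhere
```

Between the two assignments the flow-sensitive result says x points only to y, while the flow-insensitive result also includes z.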
Worst Case Complexity of Andersen
[Figure: for *x = y, x points to a, b, c and y points to d, e, f; shown before and after the statement]
Worst case: $N^2$ per statement, so at least $N^3$ for the whole program. Andersen is in fact $O(N^3)$.
New Idea: One Successor Per Node
• Make each node have only one successor.
• This is an invariant that we want to maintain.
[Figure: for *x = y, x points to the merged node a,b,c and y points to the merged node d,e,f; shown before and after the statement]
More General Case for *x = y
[Figure: graph over x, y and their successors, before and after *x = y]
More General Cases
• Handling: x = *y
[Figure: graph over x, y and their successors, before and after x = *y]
• Handling: x = y (what about y = x?)
[Figure: graph over x, y and their successors, before and after x = y; get the same for y = x]
• Handling: x = &y
[Figure: graph over x and y, before and after x = &y, including a case where x points to a merged node y,…]
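One way to maintain the one-successor invariant is a union-find structure in which every equivalence class stores at most one successor, and merging two classes recursively merges their successors. The sketch below is an assumption-laden illustration of that idea (class and method names are made up, and placeholder nodes named ("tmp", n) are created on demand); it is not a reproduction of Steensgaard's original type-based formulation.

```python
from itertools import count

class Steensgaard:
    def __init__(self):
        self.parent = {}       # union-find forest over graph nodes
        self.succ = {}         # class representative -> the single node it points to
        self.fresh = count()   # names for placeholder nodes

    def find(self, n):
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]   # path halving
            n = self.parent[n]
        return n

    def target(self, n):
        """Class that n points to; a placeholder is created if there is none yet."""
        r = self.find(n)
        if r not in self.succ:
            self.succ[r] = ("tmp", next(self.fresh))
        return self.find(self.succ[r])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        sa, sb = self.succ.get(a), self.succ.pop(b, None)
        if sa is None and sb is not None:
            self.succ[a] = sb
        elif sa is not None and sb is not None:
            self.union(sa, sb)  # restore the invariant: merge the two successors

    def addr(self, x, y):       # x = &y  (also x = new T at site y)
        self.union(self.target(x), y)

    def copy(self, x, y):       # x = y  (y = x gives the same result)
        self.union(self.target(x), self.target(y))

    def load(self, x, y):       # x = *y
        self.union(self.target(x), self.target(self.target(y)))

    def store(self, x, y):      # *x = y
        self.union(self.target(self.target(x)), self.target(y))

g = Steensgaard()
g.addr("l", "S1")               # S1: l := new Cons
g.copy("p", "l")                #     p := l
g.addr("t", "S2")               # S2: t := new Cons
g.store("p", "t")               #     *p := t
g.copy("p", "t")                #     p := t
print(g.find("S1") == g.find("S2"))   # True: S1 and S2 end up in one class
```

Each statement is processed once and triggers only a bounded number of find and union operations, which is where the speed advantage over the subset-based approach comes from; the price is that S1 and S2 can no longer be told apart.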
Our Favorite Example, Once More!
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: the unification-based graph after each of steps 1–5; in the final graph l, t and p all point to one merged node S1,S2]
Flow Insensitive Loss of Precision
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: three solutions side by side over l, p, t, S1, S2: flow-sensitive subset-based, flow-insensitive subset-based, and flow-insensitive unification-based (where l, t and p point to one merged node S1,S2)]
Another Example
bar() {
  i := &a;    // 1
  j := &b;    // 2
  foo(&i);    // 3
  foo(&j);    // 4
  // i pnts to what?
  *i := ...;
}
void foo(int* p) { printf("%d", *p); }
[Figure: the unification-based graph after each of steps 1–4; in the final graph p points to the merged node i,j, which points to the merged node a,b]
Steensgaard vs. Andersen
Consider assignment p = q, i.e., only p is modified, not q
• Subset-based Algorithms
  – Andersen's algorithm is an example
  – Add a constraint: targets of q must be a subset of targets of p
  – Graph of such constraints is also called "inclusion constraint graphs"
  – Enforces unidirectional flow from q to p
• Unification-based Algorithms
  – Steensgaard is an example
  – Merge equivalence classes: targets of p and q must be identical
  – Assumes bidirectional flow from q to p and vice-versa
Steensgaard & Beyond
• A well engineered implementation of Steensgaard ran on Word97 (2.1 MLOC) in 1 minute.
• One Level Flow (Das PLDI 00) is an extension to Steensgaard that gets more precision and runs in 2 minutes on Word97.
Analysis Sensitivity
• Flow-insensitive
  – What may happen (on at least one path)
  – Near-linear time for unification-based analyses such as Steensgaard's (Andersen's is $O(N^3)$)
• Flow-sensitive
  – Consider control flow (what must happen)
  – Iterative data-flow: possibly exponential
• Context-insensitive
  – Call treated the same regardless of caller
  – "Monovariant" analysis
• Context-sensitive
  – Reanalyze callee for each caller
  – "Polyvariant" analysis
More sensitivity ⇒ more accuracy, but more expense
Inter-procedural Analysis
• What do we do if there are function calls?
x1 = &a
y1 = &b
swap(x1, y1)

x2 = &a
y2 = &b
swap(x2, y2)

swap (p1, p2) {
  t1 = *p1;
  t2 = *p2;
  *p1 = t2;
  *p2 = t1;
}
Two Approaches
• Context-sensitive approach:
  – treat each function call separately just like real program execution would
  – problem: what do we do for recursive functions?
    • need to approximate
• Context-insensitive approach:
  – merge information from all call sites of a particular function
  – in effect, inter-procedural analysis problem is reduced to intra-procedural analysis problem
• Context-sensitive approach is obviously more accurate but also more expensive to compute
Context Insensitive Approach
[Figure: both call sites, swap(x1, y1) and swap(x2, y2), connected to a single copy of swap]
swap (p1, p2) {
  t1 = *p1;
  t2 = *p2;
  *p1 = t2;
  *p2 = t1;
}
Context Sensitive Approach
[Figure: each call site, swap(x1, y1) and swap(x2, y2), connected to its own copy of swap]
Binary Decision Diagram (BDD)
BDD-Based Pointer Analysis
• Use a BDD to represent transfer functions
  – encode procedure as a function of its calling context
  – compact and efficient representation
• Perform context-sensitive, inter-procedural analysis
  – similar to dataflow analysis
  – but across the procedure call graph
• Gives accurate results
  – and scales up to large programs
Which Pointer Analysis Should I Use?
Hind & Pioli, ISSTA, Aug. 2000
Compared 5 algorithms (4 flow-insensitive, 1 flow-sensitive):
• Any address (single points-to set)
• Steensgaard
• Andersen
• Burke (like Andersen, but separate solution per procedure)
• Choi et al. (flow-sensitive)
Which Pointer Analysis Should I Use? (cont'd)
• Metrics
  1. Precision: number of alias pairs
  2. Precision of important optimizations: MOD/REF, REACH, LIVE, flow dependences, constant prop.
  3. Efficiency: analysis time/memory, optimization time/memory
• Benchmarks
  – 23 C programs, including some from SPEC benchmarks
Summary of Results
Hind & Pioli, ISSTA, Aug. 2000
1. Precision:
  – Steensgaard much better than Any-Address (6x on average)
  – Andersen/Burke significantly better than Steensgaard (about 2x)
  – Choi negligibly better than Andersen/Burke
2. MOD/REF precision:
  – Steensgaard much better than Any-Address (2.5x on average)
  – Andersen/Burke significantly better than Steensgaard (15%)
  – Choi very slightly better than Andersen/Burke (1%)
Summary of Results (cont'd)
3. Analysis cost:
  – Any-Address, Steensgaard extremely fast
  – Andersen/Burke about 30x slower
  – Choi about 2.5x slower than Andersen/Burke
4. Total cost (analysis + optimizations):
  – Steensgaard, Burke are 15% faster than Any-Address!
  – Andersen is as fast as Any-Address!
  – Choi only about 9% slower
History of Points-to Analysis
from Ryder and Rayside
Haven't We Solved This Problem Yet?
From [Hind, 2001]:
• Past 21 years: at least 75 papers and nine Ph.D. theses published on pointer analysis
Many Publications…
So Which Pointer Analysis is Best?
• Comparisons between algorithms difficult
  – Size of points-to sets inadequate
    • Model heap as one blob = one object for all heap pointers!
• Trade-offs unclear
  – Faster pointer analysis can mean more objects = more time for client analysis
  – More precise analysis can reduce client analysis time!
• Idea: use client to drive pointer analyzer…
New Approaches
• Speculative Pointer Analysis
  – Traditional analysis is conservative, to guarantee correctness.
    • Applications: program transformation, program optimization
  – Some applications do not require 100% correctness.
    • Most important: applicability and usability
Speculative Pointer Analysis (SPA): An Example
int a,b;
int *p;
p = &a;
for (i=0…1000){
  *p = 5;
  p = &b;
}
[Figure: points-to graphs for p over the targets a and b]
Purpose of SPA
• Optimize pointer location set widths in loop bodies speculatively
• Remove unlikely or infrequent location sets
  – Improves ability to manage access statically in our context
  – Improves precision for the typical case
Summary of SPA
• Traditional pointer analysis
  – Based on compile time provable information
  – Stops analysis if that cannot be guaranteed
  – Used in many performance optimization techniques
• Speculative pointer and distance analysis
  – Always completes analysis without restrictions
  – Extracts precise information for our purposes
  – Targets compiler-enabled memory systems
Probability Based Pointer Analysis (ASPLOS'06)
• Da Silva & Steffan, U Toronto
• Define a probability for a points-to relation
• Use matrix calculation to compute the resulting probability for each pointer
Conventional Pointer Analysis
[Figure: the statements "*a = ~" and "~ = *b" are fed to Pointer Analysis, which answers Definitely Not, Definitely, or Maybe; the answer feeds "optimize"]
• Do pointers a and b point to the same location?
• Do this for every pair of pointers at every program point
Probabilistic Pointer Analysis (PPA)
[Figure: the statements "*a = ~" and "~ = *b" are fed to PPA, which answers p = 0.0, p = 1.0, or 0.0 < p < 1.0; the answer feeds "optimize"]
• With what probability p do pointers a and b point to the same location?
• Do this for every pair of pointers at every program point
PPA Research Objectives
• Accurate points-to probability information
  – at every static pointer dereference
• Scalable analysis
  – Goal: the entire SPEC integer benchmark suite
• Understand scalability/accuracy tradeoff
  – through flexible static memory model
• Improve our understanding of programs
Algorithm Design Choices
• Fixed
  – Bottom Up / Top Down Approach
  – Linear transfer functions (for scalability)
  – One-level context and flow sensitive
• Flexible
  – Edge profiling (or static prediction)
  – Safe (or unsafe)
  – Field sensitive (or field insensitive)
Traditional Points-To Graph
int x, y, z, *b = &x;
void foo(int *a) {
  if(…) b = &y;
  if(…) a = &z;
  else a = b;
  while(…) { x = *a; … }
}
[Figure: points-to graph with the pointers a and b and the targets x, y, z, UND; each edge is marked either "Definitely" or "Maybe"]
Results are inconclusive
Probabilistic Points-To Graph
int x, y, z, *b = &x;
void foo(int *a) {
  if(…) b = &y;             // 0.1 taken (edge profile)
  if(…) a = &z;             // 0.2 taken (edge profile)
  else a = b;
  while(…) { x = *a; … }
}
[Figure: points-to graph with the pointers a and b and the targets x, y, z, UND; each edge is marked p = 1.0 or 0.0 < p < 1.0, with the edge labels 0.1, 0.9, 0.72, 0.08, 0.2]
Results provide more information
Summary
• Pointer Analysis
  – Overview
  – Andersen's Algorithm
  – Steensgaard's Algorithm
  – New Directions
• Next Time
  – Parallelism & Locality Analysis