Finding Dominators in Flowgraphs: Linear-Time Algorithm [1] and Experimental Study [2]

Loukas Georgiadis

[1] Joint work with Robert E. Tarjan.
[2] Joint work with Renato F. Werneck, Robert E. Tarjan, Spyridon Triantafyllis, and David I. August.

Post on 19-Dec-2015




Dominators in a Flowgraph

Flowgraph: G = (V, E, r); each v in V is reachable from r.

v dominates w if every path from r to w includes v.

Set of dominators: Dom(w) = { v | v dominates w }

Trivial dominators: ∀ w ≠ r: w, r ∈ Dom(w)

Immediate dominator: idom(w) ∈ Dom(w) – w and dominated by every v in Dom(w) – w

Goal: Find idom(v) for each v in V (immediate dominator tree)

Applications: Program optimization, code generation, circuit testing
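The definitions above can be checked directly on small graphs. The following is a minimal sketch, not one of the algorithms from the talk: it applies the definition literally (v dominates w exactly when deleting v makes w unreachable from r), so it takes roughly O(nm) time. The function names and the dict-of-successor-lists graph format are assumptions made for illustration.

```python
from collections import deque

def reachable(succ, root, removed=None):
    """Vertices reachable from root after deleting the vertex `removed`."""
    if root == removed:
        return set()
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in succ.get(u, ()):
            if w != removed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def dominator_sets(succ, root):
    """Dom(w) for every reachable w, straight from the definition."""
    vertices = reachable(succ, root)
    dom = {w: {w} for w in vertices}
    for v in vertices:
        still_reachable = reachable(succ, root, removed=v)
        for w in vertices - still_reachable:
            dom[w].add(v)  # every path from root to w passes through v
    return dom

def immediate_dominators(succ, root):
    """idom(w): the strict dominator of w that all other strict dominators dominate."""
    dom = dominator_sets(succ, root)
    idom = {}
    for w, ds in dom.items():
        if w == root:
            continue
        # Strict dominators of w form a chain, so the one with the
        # largest Dom set is dominated by all the others.
        idom[w] = max(ds - {w}, key=lambda v: len(dom[v]))
    return idom

# Hypothetical example: a diamond r -> {a, b} -> c -> d.
succ = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
print(immediate_dominators(succ, "r"))
```

Such a brute-force version is mainly useful as a reference against which faster implementations can be tested.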

History

1979 Lengauer and Tarjan; O(m·α(m,n)) time.

1997 Alstrup, Harel, Lauridsen and Thorup; O(n+m) time for RAM.

1998 Buchsbaum, Kaplan, Rogers and Westbrook; claimed O(n+m) for Pointer Machine. (Corrected in 2004 to work in linear time for RAM.)

2004 Georgiadis and Tarjan

• We showed that the Buchsbaum et al. algorithm runs in O(m·α(m,n)) time.

• Based on Buchsbaum et al. we gave a linear-time algorithm for Pointer Machine, simpler than Alstrup et al. (no complicated data structures).


The Lengauer-Tarjan Algorithm: Semidominators

Depth-First Search ⇒ DFS Tree D

We refer to the vertices by their DFS numbers:

v < w : v was visited by DFS before w

Semidominator path (SDOM-path):

P = (v0 = v, v1, v2, …, vk = w) such that vi > w, for 1 ≤ i ≤ k−1

Semidominator:

sdom(w) = min { v | there is an SDOM-path from v to w }

[Figure: example flowgraph with vertices labeled by their DFS numbers 1 to 8; r = 1.]

The Lengauer-Tarjan Algorithm: Overview

1. Carry out a DFS.

2. Process the vertices in reverse preorder. For vertex w, compute sdom(w).

3. Implicitly define idom(w).

4. Explicitly define idom(w) by a preorder pass.


The Lengauer-Tarjan Algorithm: Evaluate minima on tree paths

Data Structure: Maintain a forest F that supports the operations:

link(v, w): Add the edge (v, w) to F.

eval(v): Let r = root of the tree that contains v in F. If v = r then return v. Otherwise return any vertex with minimum sdom among the vertices u that are proper descendants of r and ancestors of v.

Initially every vertex in V is a root in F.

Simple version: n links, m evals in O(m log n).

Sophisticated version: n links, m evals in O(m α(m,n)).
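The link and eval operations can be sketched as follows. This is a minimal illustration of the simple version, assuming path compression without balancing; the class name LinkEval, the 0-based vertex numbering, and the shared list semi (holding the current sdom value of each vertex) are assumptions made for the example.

```python
class LinkEval:
    """Forest F supporting link and eval, with path compression only."""

    def __init__(self, semi):
        n = len(semi)
        self.semi = semi                # semi[v]: current sdom value of v (shared, mutable)
        self.ancestor = [None] * n      # None marks a root of F
        self.label = list(range(n))     # label[v]: min-sdom vertex on the compressed path above v

    def link(self, v, w):
        """Add the edge (v, w) to F, making v the parent of w."""
        self.ancestor[w] = v

    def eval(self, v):
        """Return v if v is a root; otherwise a minimum-sdom vertex among
        the ancestors of v (v included) that are proper descendants of the root."""
        if self.ancestor[v] is None:
            return v
        # Collect the vertices whose parent is not the root, bottom-up.
        path = []
        u = v
        while self.ancestor[self.ancestor[u]] is not None:
            path.append(u)
            u = self.ancestor[u]
        # Compress top-down so each vertex ends up pointing at the root.
        for u in reversed(path):
            a = self.ancestor[u]
            if self.semi[self.label[a]] < self.semi[self.label[u]]:
                self.label[u] = self.label[a]
            self.ancestor[u] = self.ancestor[a]
        return self.label[v]

# Hypothetical usage: a path 0 - 1 - 2 - 3 built bottom-up.
semi = [0, 3, 1, 2]
forest = LinkEval(semi)
forest.link(2, 3)
forest.link(1, 2)
forest.link(0, 1)
print(forest.eval(3))  # vertex with the smallest semi value among 1, 2, 3
```

The sophisticated version additionally balances the trees during link, which is what brings the bound down to O(m α(m,n)); that part is omitted here.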


The Linear-Time Algorithm

Partition D into trivial and nontrivial microtrees. [Dixon and Tarjan ’97]

Nontrivial microtree: Maximal subtree of D of size ≤ g that contains at least one leaf of D.

Trivial microtree: Single internal vertex of D.

[Figure: example DFS tree D on vertices 1 to 22, partitioned for g = 3, with a trivial microtree and a nontrivial microtree labeled.]


The Linear-Time Algorithm

Core C: Tree D – nontrivial microtrees; has ≤ n/g leaves.

Line: Path (v1 = s, v2, …, vk = t) in C such that outdegreeC(vi) = 1 for 1 ≤ i ≤ k−1, and outdegreeC(vk) = 0 or > 1.

There are L ≤ 2n/g lines. Contract each line into a single vertex ⇒ tree C’ with L nodes.

[Figure: contracted tree C’, in which the lines {1, 2, 3}, {4, 7, 8}, and {15, 17} each appear as a single vertex.]
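The partition into microtrees, core, and lines can be sketched as below. This is an illustrative sketch only, under the assumption that "subtree" means a vertex together with all of its descendants, so that a nontrivial microtree is rooted at a vertex whose subtree has at most g vertices while its parent's subtree has more than g. The function name partition_tree and the children-dict tree format are hypothetical.

```python
def partition_tree(children, root, g):
    """Return (roots of nontrivial microtrees, core vertices, lines of the core)."""
    # Preorder listing: every parent appears before its children.
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, ()))
    # Subtree sizes, computed children-first.
    size = {}
    for v in reversed(order):
        size[v] = 1 + sum(size[c] for c in children.get(v, ()))
    parent = {c: v for v in order for c in children.get(v, ())}

    # Core: vertices whose subtree is too large for a microtree.
    core = {v for v in order if size[v] > g}
    micro_roots = [v for v in order
                   if size[v] <= g and (v == root or size[parent[v]] > g)]

    # Lines: maximal core paths whose non-final vertices have one core child.
    core_children = {v: [c for c in children.get(v, ()) if c in core] for v in core}
    lines = []
    for v in order:
        starts_line = v in core and (v == root or len(core_children[parent[v]]) != 1)
        if starts_line:
            line = [v]
            while len(core_children[line[-1]]) == 1:
                line.append(core_children[line[-1]][0])
            lines.append(line)
    return micro_roots, core, lines

# Hypothetical 11-vertex tree, g = 3.
children = {1: [2], 2: [3, 8], 3: [4], 4: [5, 6], 8: [9, 10, 11, 12]}
micro_roots, core, lines = partition_tree(children, 1, 3)
print(sorted(micro_roots), sorted(core), sorted(lines))
```

In this example the core has 2 leaves and 3 lines, consistent with the bounds n/g and 2n/g stated above for n = 11 and g = 3.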


The Linear-Time Algorithm

Extend the definition of semidominators for the vertices of the nontrivial microtrees [Buchsbaum et al.]:

Pushed external dominator path (PXDOM-path): P = (v0 = v, v1, v2, …, vk = w) such that vi ≥ root of microtree of w, for 1 ≤ i ≤ k−1.

Pushed external dominator: pxdom(w) = min { v | there is a PXDOM-path from v to w }

For any vertex w of the core C: pxdom(w) = sdom(w)

[Figure: relative positions of pxdom(w), sdom(w), and w.]


The Linear-Time Algorithm: Overview

1. Compute internal dominators in each nontrivial microtree.

2. Compute pxdoms in each nontrivial microtree t by link and eval on C’ and Nearest Common Ancestor (NCA) queries on a tree built from the sdom values of the line that contains the parent of the root of t.

3. Compute sdoms in each line l by a top-down pass using link and eval on C’ and contracting connected components in l.

Remarks: link and eval run in linear time on C’. Buchsbaum et al. claimed that link and eval run in linear time on C, but the claim is false.


The Iterative Algorithm: Set-based

Dominators can be computed by solving iteratively the set of equations [Allen and Cocke, 1972]

Dom(v) = ( ∩ { Dom(u) | u ∈ pred(v) } ) ∪ {v}, ∀ v ≠ r

Initialization

Dom(r) = {r}

Dom(v) = ∅, ∀ v ≠ r

In the intersection we consider only the nonempty Dom(u).

Each Dom(v) set can be represented by an n-bit vector. Intersection ⇒ bitwise AND.

Requires n² space. Very slow in practice.
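The set-based iteration can be sketched with Python integers standing in for the n-bit vectors. This is a minimal sketch under stated assumptions: the function name, the predecessor-dict graph format, and the caller-supplied processing order (for example reverse postorder) are all illustrative choices, not details from the talk.

```python
def iterative_dominators_bitset(pred, order, root):
    """Solve Dom(v) = (intersection of Dom(u), u in pred(v)) | {v} by iteration.

    pred:  dict mapping each vertex to a list of its predecessors
    order: list of all vertices reachable from root, in processing order
    """
    index = {v: i for i, v in enumerate(order)}
    dom = {v: 0 for v in order}          # 0 encodes the empty set
    dom[root] = 1 << index[root]
    changed = True
    while changed:
        changed = False
        for v in order:
            if v == root:
                continue
            new = None
            for u in pred[v]:
                if dom[u] == 0:          # only nonempty Dom(u) take part
                    continue
                new = dom[u] if new is None else (new & dom[u])
            if new is None:              # no predecessor processed yet
                continue
            new |= 1 << index[v]
            if new != dom[v]:
                dom[v] = new
                changed = True
    # Decode the bit vectors back into sets of vertices.
    return {v: {u for u in order if (dom[v] >> index[u]) & 1} for v in order}

# Hypothetical example: diamond r -> {a, b} -> c -> d with a back edge d -> a.
pred = {"r": [], "a": ["r", "d"], "b": ["r"], "c": ["a", "b"], "d": ["c"]}
print(iterative_dominators_bitset(pred, ["r", "a", "b", "c", "d"], "r"))
```

Each vertex keeps one n-bit integer, which is where the n² space requirement noted above comes from.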

The Iterative Algorithm: Tree-based

Efficient implementation [Cooper, Harvey and Kennedy 2000]

dfs(r)
T ← {r}
changed ← true
while ( changed ) do
    changed ← false
    for all v in V – r in reverse postorder do
        x ← nca(pred(v))
        if x ≠ parent(v) then
            parent(v) ← x
            changed ← true
        end
    done
done
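The pseudocode above can be turned into runnable form roughly as follows. This is a sketch in the spirit of the tree-based iteration, assuming a successor-dict graph format; the function name and the use of postorder numbers inside nca (walking the lower-numbered vertex upward until the two meet) are choices made for this example.

```python
def tree_based_dominators(succ, root):
    """Iterate parent(v) <- nca(pred(v)) in reverse postorder until stable."""
    # Iterative DFS that records vertices in postorder.
    postorder = []
    seen = {root}
    stack = [(root, iter(succ.get(root, ())))]
    while stack:
        v, it = stack[-1]
        for w in it:
            if w not in seen:
                seen.add(w)
                stack.append((w, iter(succ.get(w, ()))))
                break
        else:
            stack.pop()
            postorder.append(v)
    post = {v: i for i, v in enumerate(postorder)}
    pred = {v: [] for v in postorder}
    for u in postorder:
        for w in succ.get(u, ()):
            pred[w].append(u)

    parent = {root: root}   # current tree T; the root points to itself

    def nca(a, b):
        # Nearest common ancestor in T: ancestors have larger postorder numbers.
        while a != b:
            while post[a] < post[b]:
                a = parent[a]
            while post[b] < post[a]:
                b = parent[b]
        return a

    changed = True
    while changed:
        changed = False
        for v in reversed(postorder):
            if v == root:
                continue
            x = None
            for u in pred[v]:
                if u in parent:          # skip predecessors not yet in T
                    x = u if x is None else nca(u, x)
            if parent.get(v) != x:
                parent[v] = x
                changed = True
    del parent[root]
    return parent

# Hypothetical example: diamond with a back edge d -> a.
succ = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": ["a"]}
print(tree_based_dominators(succ, "r"))
```

When the loop stabilizes, the parent map is the immediate dominator tree, so no separate set representation is needed.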

The Iterative Algorithm: Running Time

Each pairwise intersection takes O(n) time.

The number of iterations is ≤ d + 3. [Kam and Ullman ’76]

d = max #back-edges in any cycle-free path of G = O(n)

Running time = O(mn²)

This bound is tight, but very pessimistic in practice.


The Iterative Algorithm: Generic Tree-based

T ← T0 /* a spanning (sub)tree of G */
changed ← true
while ( changed ) do
    changed ← false
    for all v in V – r in a fixed order do
        x ← nca(pred(v))
        if x ≠ parent(v) then
            parent(v) ← x
            changed ← true
        end
    done
done

Good choices (in practice): T0 = a Breadth-First Search (BFS) tree; order = BFS order


A Hybrid Algorithm

Lemma: For any vertex w ≠ r,

idom(w) = NCA( I, parent(w), sdom(w) ).

I = (immediate) dominator tree

parent(w) = parent of w in the DFS tree D

SEMI-NCA:

1. Compute sdoms as in simple version of LT.

2. Construct I incrementally applying Lemma.

(NCA calculations implemented naïvely)
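The two SEMI-NCA steps can be sketched as follows. This is a minimal illustration, not the implementation measured in the experiments: it assumes 0-based DFS preorder numbers, a successor-dict graph format, and a path-compression-only eval; the names semi_nca and eval_min are hypothetical.

```python
def semi_nca(succ, root):
    """SEMI-NCA sketch: semidominators as in simple LT, then naive NCA walks."""
    # DFS: assign preorder numbers and record DFS-tree parents.
    num = {root: 0}
    vertex = [root]
    parent = [0]
    stack = [(root, iter(succ.get(root, ())))]
    while stack:
        v, it = stack[-1]
        for w in it:
            if w not in num:
                num[w] = len(vertex)
                vertex.append(w)
                parent.append(num[v])
                stack.append((w, iter(succ.get(w, ()))))
                break
        else:
            stack.pop()
    n = len(vertex)
    pred = [[] for _ in range(n)]
    for u in vertex:
        for w in succ.get(u, ()):
            pred[num[w]].append(num[u])

    semi = list(range(n))
    label = list(range(n))
    ancestor = [None] * n               # link-eval forest, None marks a root

    def eval_min(v):
        if ancestor[v] is None:
            return v
        path, u = [], v
        while ancestor[ancestor[u]] is not None:
            path.append(u)
            u = ancestor[u]
        for u in reversed(path):        # path compression, top-down
            a = ancestor[u]
            if semi[label[a]] < semi[label[u]]:
                label[u] = label[a]
            ancestor[u] = ancestor[a]
        return label[v]

    # Step 1: semidominators, processing vertices in reverse preorder.
    for w in range(n - 1, 0, -1):
        for v in pred[w]:
            semi[w] = min(semi[w], semi[eval_min(v)])
        ancestor[w] = parent[w]         # link(parent(w), w)

    # Step 2: build I in preorder; idom(w) = NCA(I, parent(w), sdom(w)),
    # found by walking up from parent(w) until reaching a number <= sdom(w).
    idom = list(parent)
    for w in range(1, n):
        x = idom[w]
        while x > semi[w]:
            x = idom[x]
        idom[w] = x
    return {vertex[w]: vertex[idom[w]] for w in range(1, n)}

# Hypothetical example: diamond with a back edge d -> a.
succ = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": ["a"]}
print(semi_nca(succ, "r"))
```

The upward walk in step 2 is the naive NCA calculation: when parent(w) = sdom(w) the loop body never runs.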

Experimental Results

Algorithms

• SLT: simple version of Lengauer-Tarjan

• LT: almost-linear-time version of Lengauer-Tarjan

• IDFS: DFS tree-based iterative

• IBFS: BFS tree-based iterative

• SNCA: SEMI-NCA

Experimental Results

Inputs

• Control-flow graphs from SPARC ’95 generated by the SUIF compiler (Stanford).

> 4900 graphs, avg #vertices ~ 40, #edges ~ 55

max #vertices ~ 2100, #edges ~ 3200

• Control-flow graphs from SPARC ’00 generated by the IMPACT compiler (UIUC).

> 2000 graphs, avg #vertices ~ 25, #edges ~ 70

max #vertices ~ 580, #edges ~ 3100

• VLSI circuits from ISCAS ’89 suite.

50 graphs, avg #vertices ~ 3200, #edges ~ 5000

max #vertices ~ 24000, #edges ~ 34000

Experimental Results

Times relative to BFS: geometric mean and geometric standard deviation

            IDFS          IBFS          LT            SLT           SNCA
            mean   dev    mean   dev    mean   dev    mean   dev    mean   dev
CIRCUITS    5.89   1.19   6.17   1.42   6.71   1.18   4.62   1.15   4.40   1.14
SUIF-INT    2.45   1.50   2.25   1.62   3.69   1.40   2.48   1.33   2.73   1.45
IMPACT      2.60   1.65   2.24   1.77   4.02   1.40   2.74   1.33   2.56   1.31
IMPACTP     2.58   1.63   2.25   1.82   3.84   1.44   2.61   1.30   2.52   1.29

Experimental Results

                     iterations          comparisons per vertex
            SDP(%)   IDFS     IBFS       IDFS   IBFS   LT     SLT    SNCA
CIRCUITS    76.7     2.8000   3.2000     32.6   39.3   12.0   9.9    8.9
IMPACT      73.4     2.0686   1.4385     30.9   28.0   15.6   12.8   11.1
IMPACTP     88.6     2.0819   1.5376     30.2   32.2   15.5   12.3   10.9
SUIF-INT    63.9     2.0009   1.6659     14.9   17.2   11.2   8.6    7.2

SDP = percentage of vertices v that have parent(v) = sdom(v)

Experimental Results

[Figure: Relative Running Times per Instance Size. x-axis: log of instance size (1 to 10); y-axis: mean relative running time (w.r.t. BFS) (0 to 10); series: BFS, IDFS, IBFS, LT, SLT, SNCA.]