transitive-closure spanners
DESCRIPTION
Transitive-Closure Spanners. David Woodruff IBM Almaden. Joint work with Arnab Bhattacharyya MIT Elena Grigorescu MIT Kyomin Jung MIT Sofya Raskhodnikova Penn State. - PowerPoint PPT PresentationTRANSCRIPT
1
Transitive-Closure Spanners
David WoodruffIBM Almaden
Joint work with Arnab Bhattacharyya MIT Elena Grigorescu MIT Kyomin Jung MIT Sofya Raskhodnikova Penn State
Graph Spanners [Awerbuch85,Peleg Schäffer89]
A subgraph H of a graph G is a k-spanner if for all pairs of vertices u, v in G,
dH(u,v) ≤ k dG(u,v)
Goal: find a sparsest k-spanner2
dense graph G sparse subgraph H
3
Transitive-Closure Spanners
Transitive closure TC(G) has an edge from u to v iffG has a path from u to v
k-TC-spanner H of G has dH(u,v) ≤ k iff
G has a path from u to v
Alternatively: k-TC-spanner of G is a k-spanner of TC(G)
G TC(G)
4
Example: Directed Line on n Vertices• 2-TC-spanner O(n log n) edges
……
• 3-TC-spanner O(n log log n) edges
• 4-TC-spanner O(n log* n) edges• k-TC-spanner O(n (k,n)) edges
– Add a (k-2)-TC-spanner on hubs – Connect each node to the hubs before and after– Recurse on the fragments between hubs
n1/2 hubs
Previous work
Structural results on TC-spanners (what is a sparsest k-TC-spanner for a given graph family?)
• Shortcut graphs (special case when |E(H)|· 2 |E(G)|) [Thorup 92, 95, Hesse 03]• For directed trees [Thorup 97] implicit in
– data structures [Yao 82, Alon Schieber 87, Chazelle 87]– property testing [Dodis Goldreich Lehman Raskhodnikova Ron Samorodnitsky 99]– access control [Attalah Frikken Blanton 05]
Computational results on directed spanners (given a graph, compute a sparsest k-spanner)
• O(log n)-approximation algorithm for k=2 [Kortsarz Peleg 94]• O(n2/3 polylog n)-approximation for k=3 [Elkin Peleg 99]
5
Our Contributions
• Common abstraction for several applications– property testing– access control– data structures
• Structural results on TC-spanners– path-separable graphs
• Computational results k-TC-Spanner: Given a graph, compute a sparsest k-TC-spanner
– Algorithms– Inapproximability
6
Application 1: Testing if a List is Sorted
• Is a list of n numbers sorted? Requires reading entire list.
• Is a list of n numbers sorted or ²-far from sorted? (An ² fraction of list entries have to be changed to make it sorted.) [Ergün Kannan Kumar Rubinfeld Viswanathan 98]: O((log n)/²) time [Fischer 01]: (log n) queries
7
12
Is a list sorted or ²-far from sorted?[Dodis Goldreich Lehman Raskhodnikova Ron Samorodnitsky 99] Test can be viewed as: Pick a random edge from sparsest 2-TC-spanner for
the line and compare its endpoints. Reject if they are out of order.
1 2 5 4 3 6 7 Claim 1. There are · n log n edges in the 2-TC-spanner.Claim 2. Green numbers are sorted.Proof: Any two green numbers are connected by a length-2 path of black edgesAnalysis of the test:• All sorted lists are accepted.• If a list is ²-far from sorted, it has ¸ ² n red numbers, ) ¸ ² n/2 red edges – If £((log n )/²) edges are checked, a red edge will be discovered w.p. ¸2/3
8
12
12
5 4 3
Generalization: Monotonicity over PO domains[FLNRRS 02]:
Graph = partially ordered domain; node labels = values of the function • A function is monotone if there are no violated edges (along which labels
decrease): 1 0• A function is ²-far from monotone if ¸ ² fraction of labels need to be
changed to make it monotone.
9
1
0
1
0
2 4 3 7651
3 4 5 6
3 4 5 6
4 3 4 5
1 2 3 4
1
0
1
0
1
1
0
0
1
1
0
01
0
1
0
1
0
1
0
monotone ½-far from monotone
Monotonicity Testers via Sparse 2-TC-spanners
Lemma. If G has a 2-TC-spanner H with s(n) edges, thenmonotonicity of functions on G can be tested in time O(s(n)/(² n))
Proof: 1. Say an edge (a,b) is violated in TC(G) if f(a) > f(b)
2. If function on G is ε-far from monotone, there is a matching M in TC(G) of εn/2 violated edges [DGLRS 99]
3. A violated edge in TC(G) either appears in H, or there is a path of length 2 between the endpoints of the edge in H. By transitivity, one of the edges on the path is also violated. But any edge in H can intersect at most 2 edges in the matching M. Thus, there at least εn/4 violated edges in H
4. Sample O(s(n)/(εn)) edges of H, check if any are violated 10
12
Application 2: Access Control
11
Efficient key management in access hierarchies [Attalah Frikken Blanton 05, Attalah Blanton Frikken 06, Santis Ferrara Massuci 07]
Used in content distribution, operating systems and project development
Access class with
private key ki Permission edge with
public key Pij
Need ki to efficiently compute kj from Pij
To speed up key derivation time, add shortcut edges consistent with permission edges
Application 3: Data StructuresComputing partial products in a semigroup [Yao 82, Alon Schieber 87, Chazelle
87, Thorup 97]Example:
Goal: quickly answer queries max(ai ,…,ai) for all i · j.
• Question: How many values should we store if we want to compute max of at most k numbers per query?
• Answer: storage = size of sparsest k-TC-spanner for the directed line.
This example generalizes to other partial products and to directed trees.
12
a1 ai aj an… … …
max(ai ,…,aj)
Our Contributions• Common abstraction for several applications• Structural results on TC-spanners
– Path-separable graphsO(1)-path-separable graphs have k-TC-spanners of size O(n log n ¢(k,n))
• e.g., improves run time of monotonicity testers on planar graphs from O((n1/2 log n)/²) [FLNRRS02] to O((log2 n)/²)
• Computational results k-TC-Spanner: Given a graph, compute a sparsest k-TC-spanner
– Algorithms– Inapproximability
13
Graph Separators (for Undirected Graphs)
Used in recursive constructions
S-separators [Lipton Tarjan]• Removing S nodes disconnects a graph G on n nodes into connected
components with · n/2 nodes eachs-path-separators [Abraham Gavoille 06]• Removing nodes on s paths from any spanning tree of G disconnects G
into connected components with · n/2 nodes each
Gain– For some graphs s << S e.g., planar graphs are £(n1/2)-separable but 3-path-separable
14
from any spanning tree of G
1. Construct k-TC-spanner for each path separator (line) P as before.2. For each node v with a path to some node in P:
– Let v’ = smallest node in P such that v à v’– At each stage of recursion, connect v to the smallest hub above v’
3. Deal symmetrically with each node u with a path from some node in P.Claim. If v à u via some node in P, we added a path of length · k from v to u.Proof: v and u are connected to the same hubs as v’ and u’, respectively.
Constructing TC-spanners via Path Separators
15
v
Pv’
u
u’
· (k,n) edges
O(n (k,n) ¢ log n ) edges
Path Separators for Directed Graphs
Our guarantee. If the underlying undirected graph for G is s-path-separable, we can efficiently find a set of directed paths in G:
• Removing nodes on these paths disconnects G into connected components with · n/2 nodes each
• Each vertex v is comparable to nodes on O(s) paths
Technique. Unlike [Abraham Gavoille 06] , we choose path separators dynamically (not from the same spanning tree).
Theorem. O(1)-path-separable graphs have k-TC-spanners of size O(n log n ¢(k,n))
H-minor free graphs are O(1)-path-separable [Abraham Gavoille 06] 16
12
Our ContributionsA
lgor
ithm
s• Common abstraction for several applications• Structural results on TC-spanners• Computational results on k-TC-Spanner
Setting of k Approxi-mability
Authors/Technique
Notes
k=2 (log n) [Kortsarz Peleg]
k=3 O(n2/3 polylog n) [Elkin Peleg]
k > 2 O((n log n)1-1/k) CP+sampling Applies to directed spanners ,….Simplifies [EP] for k=3
k = (log n) O((n log n)/k2) PathSampling Better than directed spanners
Setting of k Inapproxi-mability
Assumption Notes
k = 2 (log n) P NP Matches upper bound
constant k > 2 (2log1-² n) NP( DTIME(npolylog n) Improvement )breakthrough
k = n1-° 8 > 0
(1+²) P NP 17
Har
dnes
s
Approximation Algorithms
• For any k > 2, we achieve an O((n lg n)1-1/k) approximation algorithm for the Directed Spanner Problem in polynomial time– Gives the same ratio for Transitive-Closure spanners– Yields the first sublinear ratio for k > 3– Solves the main open question of [Elkin Peleg 05]
• Our technique is a new balancing of a convex program with a sampling-based approach
• Greatly simplifies the previous O(n2/3 polylog n)-approximation algorithm [Elkin Peleg 05] for k = 3
• For each edge e 2 G, introduce a variable xe, which indicates whether or not we include e in the k-Spanner H
• For each path p of length at most k in G, introduce a variable yp. For constant k, the number of such paths is O(nk+1) = poly(n).
• Integer programming formulation: min e xe
s.t. 8 e =(a,b) 2 G, sump: a -> b, |p| · k yp ¸ 1 8 p = {(a0, a1), (a1, a2), …, (ak-1, ak)}, yp · minj=1
k x(aj-1, j)
8 e, xe 2 {0,1} 8 p, yp 2 {0,1}
• Linear programming relaxation: xe, yp 2 [0,1]
Approximation Algorithms: Linear Program
Approximation Algorithms: Linear Programmin e xe
s.t. 8 e =(a,b) 2 G, sump: a -> b, |p| · k yp ¸ 1 8 p = {(a0, a1), (a1, a2), …, (ak-1, ak)}, yp · minj=1
k x(aj-1, j)
8 e, 8 p, xe 2 [0,1], yp 2 [0,1]
• Solve the linear program, let the solution variables be xe*, yp
*
• Define a rounding scheme: include e in the 2-Spanner H if and only if xe* ¸
1/n1-1/k
• If the number of paths of length at most k between vertices a, b in G is at most n1-1/k, then the constraint sump: a -> b, |p| · k yp ¸ 1 ensures there is a path p for which y*p ¸ 1/n1-1/k
• The constraint yp · minj=1k x(aj-1, j)
ensures that for all edges e 2 p, we have xe*
¸ 1/n1-1/k, and so e will be added to H
Approximation Algorithms: Sampling
• Problem: Integrality Gap– If many paths of length at most k between two vertices, LP
might assign each path small weight– But if there are at least r paths of length at most k between
two vertices, then there are at least r1/(k-1) vertices in G which are contained on such paths
• Solution: sample O(n/r1/(k-1)) vertices at random and grow a shortest path tree around them. Set r= n1-1/k.
Approximation Algorithms: Sampling
•For any (a,b) 2 G, either the linear program includes a path between a and b of length at most k in the spanner H, or we will randomly sample a vertex v on a path p of length at most k between a and b. Since we include a shortest path tree around v in H, the distance from a to b in H will be at most k.
…
Samplingsolves the“many paths”case
LP solvesthe “few paths” case
Approximation Algorithms: LP+Sampling• Linear Programming + Sampling:
1. Initialize H to empty set2. For each edge e, if xe
* ≥ 1/n1-1/k, then add e to H
3. Randomly sample r = O~(n1-1/k) vertices z1, …, zr
4. For each i, add the edges in BFS(zi) to H
5. Output H
• Claim: With probability at least 1-1/n, we get H is a k-Spanner of G
• Claim: |H| = O~(n1-1/k) OPT– If OPT’ is the optimum of the linear program, then OPT’ · OPT, and the cost we
get from rounding the solution is at most n1-1/kOPT’ · n1-1/kOPT– Each shortest path tree around each of O~(n1-1/k) sampled vertices has O(n)
edges, so we add O~(n2-1/k) edges. Since we can assume that the input graph G is connected, we have OPT ¸ n-1, so we again add at most n1-1/kOPT edges.
Approximation Algorithms: Wrapup
• Exponential number of variables:– Number of path variables grows exponentially with k– Can replace yp with min(xe1, xe2, …, xek) for p = (e1, …, ek) Now
we have a convex program in O(n2) variables, but the constraints are of exponential size
– Can design a separation oracle to solve this with the ellipsoid algorithm.
• Can also derandomize the sampling by a simple greedy algorithm.
k-TC-Spanner: (2log1-² n)-inapproximability for k>2
Starting point:• Reduce from Min-Rep• Use generalized butterfly graphs and broom graphs
25
Instance. Undirected bipartite graphIn each part: n nodes, grouped in r clusters
|Ai|=|Bj|= n/r
The Min-Rep Problem
A1
A3
B1
B2
B3
supergraph
Theorem [Elkin Peleg 07]. Min-Rep is (2log1-² n)-inapproximable unless NP µ DTIME(npolylog n). 26
(Ai,Bj) is a superedge iffsome node in Ai is adjacent to some node in Bj.
A2
Rep-Cover is a node set S: for each superedge(Ai,Bj) there is edge (a,b) with a2AiÅS, b2BjÅSGoal. Find minimum size rep-cover.
Generalized Butterfly Graphs
• Outdegree = indegree = d• There is a unique path from (a1,…,ak-1, 1) to (b1,…,bk-1, k)• Each shortcut edge is on at most d k-3 such paths
27
Nodes: (a1,a2,…,ak-1, i) where a1,…,ak-12 [d] and i2 [k]
12
. . .
. . .
layer 1 layer i layer i+1 layer k
(a1,…,ai,…,ak-1, i)
(a1,…,bi,…,ak-1, i+1)
Graphs Used in the Reduction
28
• Generalized butterfly
• Broom – A complete bipartite graph with disjoint stars attached to all
right nodes
d d
d
A1
A3
B1
B2
B3
29
A2
Min-Rep Brooms3 layers
Butterfliesk layers
Reduction from Min-Rep to k-TC-Spanner
A Sparse TC-Spanner for the Hard InstanceA1
A3
B1
B2
B3
30
A2
layer k-2OPT d 2 + |G| edges, where OPT is the size of minimum rep-cover
ConclusionOur contributions• Common abstraction for several applications
– monotonicity testing, access control, data structures • Structural results on TC-spanners
– path-separable graphs• Computational results on k-TC-Spanner
– new algorithms and inapproximability results
Open questions• At what point do TC-spanners admit efficient approximation
algorithms? – Showed strong inapproximability for any constant k > 2– Showed O(1)-approximation algorithm for k = O~(n1/2)
• Other applications of TC-spanners?
31