Primal-Dual Meets Local Search: Approximating MST’s with Non-uniform Degree Bounds
Authors: Jochen Könemann, R. Ravi (CMU)
CS 3150 presentation by Dan Li, advised by Kirk Pruhs
Department of Computer Science, University of Pittsburgh
December 2, 2003
Motivation
Multicasting:
Nodes are connected by a network.
Multicast from one node to all other nodes.
A cost is associated with each connection.
Cost-effective solution: minimum spanning trees (MSTs).
Picture copied from the author’s talk
Motivation cont.
Problem:
Congestion: some nodes may be too busy to work effectively.
Bandwidth limits.
Solution:
Bound the maximum number of connections that each node can support.
Uniform bounds
Non-uniform bounds
Picture copied from the author’s talk
Motivation cont.
Picture copied from the author’s talk
Problem Formulation
Degree-bounded minimum-cost spanning tree problem with non-uniform degree bounds (nBMST).
Given an undirected graph G = (V, E), a cost function $c : E \to \mathbb{R}^+$ and positive integers $\{B_v\}_{v \in V}$, all greater than 1, the goal is to find a spanning tree T of minimum total cost such that for all vertices $v \in V$ the degree of v in T is at most $B_v$.
If all the $B_v$'s are the same, we have the degree-bounded minimum-cost spanning tree problem with uniform degree bounds.
This problem is NP-hard!
What is done in this paper
A new algorithm:
Improved approximation algorithms for the minimum-cost degree-bounded spanning tree problem in the presence of non-uniform degree bounds.
A direct algorithm: it does not solve linear programs.
The algorithm integrates elements from the primal-dual method for approximation algorithms for network design problems with local search methods for minimum-degree network problems.
It goes through a series of spanning trees and continuously improves the maximum deviation of any vertex degree from its respective degree bound.
Core Theorem
Theorem 2: There is a primal-dual approximation algorithm that, given a graph G = (V, E), a nonnegative cost function $c : E \to \mathbb{R}^+$, integers $B_v > 1$ for all $v \in V$ and a parameter $\omega > 1$, computes a tree T such that
1. $\deg_T(v) \le b \cdot \max\{\ldots\} \cdot B_v + 2\log_b n + 1$ for all $v \in V$, and
2. $c(T) \le \omega \cdot \mathrm{opt}$.
It is apparent that $\deg_T(v) = O(B_v + \log n)$, and the approximation ratio is constant.
More specifically, if we select b = 2 and ω = 2, we have
1. $\deg_T(v) \le 4 B_v + 2\log_2 n + 1$ for all $v \in V$, and
2. $c(T) \le 2 \cdot \mathrm{opt}$.
Primal-Dual formulation
High level idea
Intuition:
Reduce the degree of those nodes whose degree is substantially higher than their bound $B_v$.
As we proceed through the sequence of trees, keep the cost of the associated primal solution (tree) bounded with respect to the corresponding dual solution.
Define the normalized degree:
$\mathrm{ndeg}_T(v) = \max\{0, \deg_T(v) - \beta_v \cdot B_v\}$, where $\beta_v > 0$ are constants for all v in V.
How to choose $\beta_v$? We will talk about it soon.
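The normalized degree can be written out directly. The sketch below is only an illustration: the function name and the sample values for the degree, $\beta_v$ and $B_v$ are hypothetical and not taken from the paper.

```python
def normalized_degree(deg, beta, bound):
    """ndeg_T(v) = max{0, deg_T(v) - beta_v * B_v} for a single vertex v."""
    return max(0, deg - beta * bound)


# Hypothetical values: B_v = 3 and beta_v = 2.
print(normalized_degree(9, 2, 3))  # degree 9 exceeds beta_v * B_v = 6 by 3
print(normalized_degree(5, 2, 3))  # degree 5 is below 6, so the value is clipped to 0
```

A vertex only counts as congested once its degree exceeds $\beta_v \cdot B_v$, so a larger $\beta_v$ tolerates a larger violation of the bound.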
High Level Idea
Compute a sequence of MSTs
$(x^1, \{y^1, \lambda^1\}), (x^2, \{y^2, \lambda^2\}), \ldots, (x^t, \{y^t, \lambda^t\})$
until there is no node v with $\mathrm{ndeg}_T(v) \ge 2\log_b(n)$.
What is the difference between each computation?
On each re-compute step, raise the λ value of a carefully chosen set $S_d$ of nodes with high normalized degree, thus introducing more slack.
Rerun the MST computation, taking advantage of the newly created slack.
Also:
Keep the cost close to the dual: this guarantees the approximation factor.
Keep the number of re-computations polynomial: this guarantees a polynomial-time algorithm.
If we look at the dual problem, we can intuitively consider using $c_{uv} + \lambda_u + \lambda_v$ as the new cost function.
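A minimal sketch of the repricing idea, assuming a plain Kruskal MST routine. The toy graph, the amount by which λ is raised, and the helper names `kruskal`, `reprice` and `degree` are all hypothetical; the paper's algorithm chooses the increase much more carefully.

```python
def kruskal(n, edges):
    """Minimum spanning tree on vertices 0..n-1; edges are (cost, u, v) triples."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for cost, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree


def reprice(edges, lam):
    """Replace each cost c_uv by c_uv + lam[u] + lam[v]."""
    return [(c + lam[u] + lam[v], u, v) for c, u, v in edges]


def degree(tree, v):
    return sum(1 for e in tree if v in e)


# Toy instance: vertex 0 is a cheap hub, the path 1-2-3-4 is slightly dearer.
edges = [(1, 0, 1), (1, 0, 2), (1, 0, 3), (1, 0, 4),
         (2, 1, 2), (2, 2, 3), (2, 3, 4)]
lam = [0, 0, 0, 0, 0]
before = kruskal(5, reprice(edges, lam))
lam[0] = 1.5  # raise the price of the congested hub (arbitrary amount)
after = kruskal(5, reprice(edges, lam))
print(degree(before, 0), degree(after, 0))  # prints "4 1"
```

With λ raised on the hub, every edge incident to it becomes more expensive than the path edges, so the recomputed MST routes around it.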
High Level Idea
What are we expecting?
By raising the values of the λ's, some edges to/from the congested vertices are replaced in the new MSTs by edges between other nodes, thus decreasing the normalized degree.
How to make this happen?
If some edges become more expensive, they will be less preferred in the MST.
If those edges are incident to the congested node, then the congested node will be less preferred.
Visualization I
Visualization II
High Level Idea
How much do we increase the price?
We expect that by increasing the price, there is only a one-edge difference between the old MST and the new MST. We want to lose customers one by one.
Increasing too fast is bad; increasing too little may not change the MST.
We do not want to lose all the connections to/from a node, but only want to decrease the normalized degree to some controllable value.
Whose price to increase?
Only those edges connected to congested nodes.
High Level Idea
How to end the process?
It may be difficult or impossible to decrease the normalized degree of every node to 0, which would mean we have found a solution satisfying the bounds.
It may be feasible to decrease the normalized degree to some predefined level; then we have an algorithm whose results do not violate the bounds by too much.
The algorithm should end in a polynomial number of steps.
Does such an algorithm exist?
The Algorithm
Analysis of the Algorithm
Initialize the primal-dual solution: a primal infeasible and dual feasible solution.
Improve the primal feasibility and dual optimality.
Some lines need to be clarified:
Line 4: Ends the algorithm.
Line 5: Used to select the set whose cost is increased.
Line 6: How much to increase? $\varepsilon_i$
Line 7: Update the dual solution.
Line 8: Update the cost function to re-compute the new primal solution.
More questions:
How is the approximation factor guaranteed?
How are the bounds satisfied (with linear factors)?
Clarifications
Line 4: On finishing
$\mathrm{ndeg}_T(v) = \max\{0, \deg_T(v) - \beta_v \cdot B_v\} \le 2\log_b(n)$
So,
$\deg_T(v) \le \beta_v \cdot B_v + 2\log_b(n)$
Line 5: Selecting the set to increase the cost
Use contradiction: assume that no such $d_i$ exists, and also consider that $B_v \le n - 1$ and
$\sum_{v \in V} B_v \le n(n - 1)$
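The selection rule itself is not spelled out in the text above. One rule that fits the counting argument is to pick a threshold d within $2\log_b(n)$ of the maximum normalized degree Δ such that $\sum_{v \in S_{d-1}} B_v \le b \cdot \sum_{v \in S_d} B_v$, where $S_d$ is the set of nodes with normalized degree at least d. That rule is an assumption here, as are the function name and the sample data. If every candidate d failed, the sums would grow by a factor of more than b at each of $2\log_b(n)$ steps and exceed $n^2$, contradicting $\sum_v B_v \le n(n-1)$.

```python
import math


def choose_threshold(ndeg, B, b):
    """Pick d with sum_{S_{d-1}} B_v <= b * sum_{S_d} B_v (assumed rule).

    ndeg: integer normalized degrees, B: degree bounds, b: constant > 1.
    S_d denotes the vertices whose normalized degree is at least d.
    """
    n = len(ndeg)
    delta = max(ndeg)
    window = math.ceil(2 * math.log(n) / math.log(b))

    def weight(d):
        return sum(B[v] for v in range(n) if ndeg[v] >= d)

    for d in range(delta, delta - window, -1):
        if weight(d - 1) <= b * weight(d):
            return d
    return None  # not reached when 1 < B_v <= n - 1, by the counting argument


# Hypothetical data: one vertex at the maximum, three just below it.
print(choose_threshold([5, 4, 4, 4, 0, 0, 0, 0], [2] * 8, 2))  # prints 4
```

In the sample, d = 5 is rejected because stepping down to 4 quadruples the total bound, while d = 4 is accepted because stepping down to 3 adds no new vertices.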
Clarifications
Line 6: Choosing $\varepsilon_i$
such that the following run of MST yields a new tree that differs from the previous one by a single swap.
Cross-edge: e = uv is a cross-edge if e is a non-tree edge, and
$u \in K_1,\ v \in K_2$ for $K_1, K_2 \in \mathcal{K}^i$ and $K_1 \ne K_2$,
where $\mathcal{K}^i$ is the set of connected components of the forest $E^i \setminus S^i_d$ (the current tree with the nodes of $S^i_d$ removed).
Choose: a value $\varepsilon_i^e$ for each cross-edge e,
and the final $\varepsilon_i$ to be the minimum among all the $\varepsilon_i^e$'s.
The new MST must be different from the previous one, since we can swap one edge to form a new spanning tree whose cost is lower than or equal to that of the tree with the previously selected edges.
Local improvement
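The single-swap idea can be sketched as a parametric computation. This is one plausible way to find the smallest increase, not necessarily the paper's exact formula. If λ is raised by ε on every node of a set S, an edge gets dearer by ε times the number of its endpoints in S. A non-tree edge e can replace a tree edge f on its cycle once their costs meet. The helper names, the data layout and the toy instance are hypothetical, and the filter used (any non-tree edge whose tree path contains an edge that gets dearer faster) is slightly broader than the cross-edge definition.

```python
def tree_path(tree, u, v):
    """Edges on the unique u-v path of a spanning tree given as (a, b) pairs."""
    adj = {}
    for a, b in tree:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    prev, stack = {u: None}, [u]
    while stack:  # depth-first search from u
        x = stack.pop()
        for y in adj[x]:
            if y not in prev:
                prev[y] = x
                stack.append(y)
    path = []
    while v != u:  # walk back from v to u
        path.append(frozenset((prev[v], v)))
        v = prev[v]
    return path


def smallest_raise(cost, tree, S):
    """Smallest eps at which some non-tree edge ties with a tree edge on its cycle.

    cost: {frozenset({u, v}): current cost}, tree: list of (u, v), S: set of nodes.
    Returns (eps, edge_in, edge_out), or None if no swap ever becomes possible.
    """
    in_tree = {frozenset(e) for e in tree}
    best = None
    for e, c_e in cost.items():
        if e in in_tree:
            continue
        u, v = tuple(e)
        for f in tree_path(tree, u, v):
            rate = len(f & S) - len(e & S)  # how much faster f gets dearer than e
            if rate > 0:
                eps = (c_e - cost[f]) / rate
                if best is None or eps < best[0]:
                    best = (eps, e, f)
    return best


# Toy instance: hub 0 with unit-cost spokes, path 1-2-3-4 with cost-2 edges.
cost = {frozenset(k): c for k, c in {
    (0, 1): 1, (0, 2): 1, (0, 3): 1, (0, 4): 1,
    (1, 2): 2, (2, 3): 2, (3, 4): 2}.items()}
tree = [(0, 1), (0, 2), (0, 3), (0, 4)]
eps, edge_in, edge_out = smallest_raise(cost, tree, {0})
print(eps)  # prints 1.0
```

Raising λ on the hub by exactly this amount makes one path edge tie with one spoke, so a single swap is available without discarding the rest of the tree.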
Performance Analysis
The cost is close to the dual cost by a constant factor.
On each step, we need to keep the cost close to the dual cost.
By duality, the value of a dual solution is at most the optimal primal cost, so the dual value is a lower bound on the optimum.
Thus the relation between the primal cost and the optimum is maintained.
On each iteration, the $y_\pi$'s may increase, the λ's may also increase, and so may the spanning tree cost.
The first term on the right-hand side should grow sufficiently to compensate for the decrease in the second term and also for the increased spanning tree cost.
Performance Analysis
In order to prove the previous equation, an invariant (Inv) is proved by induction.
Base case: i = 0
Induction step
With a suitable choice of the constants, (Inv) is proved.
Performance Analysis
The following equations can be reached (see the paper for details).
Plus this
This concludes the proof (by choosing α ≥ ω).
Analysis - Running time
This algorithm terminates in a polynomial number of steps.
Claim: Algorithm 1 terminates after $O(n^4)$ iterations?
Proof:
Define the potential of spanning tree $E^i$ as
$\Phi_i = \sum_{v \in V} 3^{\mathrm{ndeg}_{E^i}(v)}$
On each step, one edge is swapped in, which is incident to two nodes of normalized degree at most $d_i - 2$. The reduction of the potential is at least
$2 \cdot 3^{d_i - 2}$
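The size of the reduction can be checked on a small case, assuming the potential is the sum of $3^{\mathrm{ndeg}(v)}$ over all vertices. That form is an assumption made for this sketch, and the function name and degree values are hypothetical. In the worst case the congested vertex has normalized degree exactly d and loses one edge, while the two endpoints of the new edge go from d - 2 to d - 1.

```python
def potential(ndeg):
    """Phi = sum over vertices of 3 ** ndeg(v) (assumed form of the potential)."""
    return sum(3 ** d for d in ndeg)


d = 4
before = [d, d - 2, d - 2]     # congested vertex and the two endpoints of the new edge
after = [d - 1, d - 1, d - 1]  # the congested vertex loses an edge, the others gain one
drop = potential(before) - potential(after)
print(drop, 2 * 3 ** (d - 2))  # prints "18 18"
```

The drop of $2 \cdot 3^{d-1}$ at the congested vertex outweighs the combined rise of $4 \cdot 3^{d-2}$ at the other two, which leaves $2 \cdot 3^{d-2}$.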
Analysis - Running time
Consider that
$d_i \ge \Delta_i - 2\log_b(n)$
where $\Delta_i$ is the maximum normalized degree at step i.
The reduction on the last page is bounded below by
$2 \cdot 3^{\Delta_i - 2\log_b(n) - 2} = \Omega(3^{\Delta_i} / n^2)$
Also consider that the potential $\Phi_i$ at the beginning of the i-th step is at most $n \cdot 3^{\Delta_i}$; after the i-th step, that is, at the beginning of the (i+1)-th step, the potential $\Phi_{i+1}$ is at most
$\left(1 - \Omega(1/n^3)\right) \Phi_i$
With $O(n^3)$ iterations, the potential function is reduced by a constant factor.
The algorithm runs for $O(n^4)$ iterations in total (?).
Considering that each iteration can be implemented in time $O(n^2 \log(n))$, the whole algorithm runs in time $O(n^6 \log(n))$.
Is the analysis correct?
The above analysis appears in the paper. Is it correct? Look at this:
$2 \cdot 3^{\Delta_i - 2\log_b(n) - 2} = \Omega(3^{\Delta_i} / n^2)$
If b = 3, the left side is
$2 \cdot 3^{\Delta_i - 2\log_3(n) - 2} = \frac{2}{9} \cdot \frac{3^{\Delta_i}}{n^2}$
If b = 9, the left side is
$2 \cdot 3^{\Delta_i - 2\log_9(n) - 2} = \frac{2}{9} \cdot \frac{3^{\Delta_i}}{n}$
If b = 2, the left side is
$2 \cdot 3^{\Delta_i - 2\log_2(n) - 2} = \frac{2}{9} \cdot \frac{3^{\Delta_i}}{n^{2\log_2 3}} \approx \frac{2}{9} \cdot \frac{3^{\Delta_i}}{n^{3.17}}$
So the correctness of the above equation depends on the value of b. Only when b ≥ 3 is the running time $O(n^6 \log(n))$.
In the recent talk given by the author, he used b = 2, so the analysis is wrong for that choice.
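The dependence on b comes from the identity $3^{2\log_b(n)} = n^{2\log_b 3}$: the potential drop is a $\frac{2}{9}$ fraction of $3^{\Delta_i}$ divided by that power of n. A quick numerical check of the exponent follows; the function name is hypothetical.

```python
import math


def exponent_of_n(b):
    """Exponent e such that 3 ** (2 * log_b(n)) == n ** e, namely e = 2 * log_b(3)."""
    return 2 * math.log(3) / math.log(b)


for b in (2, 3, 9):
    print(b, round(exponent_of_n(b), 2))
```

The exponent is at most 2 exactly when b ≥ 3, which is the condition under which the drop is $\Omega(3^{\Delta_i} / n^2)$.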
More Problems?
Is there anything missed?
Did the author prove part 1 of Theorem 2?
No. It seems apparent, since on finishing the while loop the maximum normalized degree is at most $2\log(n)$, and then
$\deg_T(v) \le \beta_v \cdot B_v + 2\log(n)$
But $\beta_v$ is selected through an expression involving $\max(\ldots)$, b and $B_v$, from which one cannot continue to prove
$\deg_T(v) \le b \cdot \max(\ldots) \cdot B_v + 1 + 2\log(n)$
More Problems? Solve?
The conclusion can still be correct if we select the special value ω = 2 and
$\beta_v = 2b + \frac{1}{B_v}$
since then $\deg_T(v) \le \beta_v \cdot B_v + 2\log_b(n) = 2b \cdot B_v + 2\log_b(n) + 1$.
What value can be used for b?
Any value larger than 1 can be used.
But only values larger than or equal to 3 give a running time of $O(n^6 \log(n))$. Smaller values of b give a worse running time.
Conclusion
The performance of the algorithm depends on the values of the constants selected.
What do we learn from this paper?
Modify the cost function to avoid congestion.
This is a very natural and decent solution.