Minimum Spanning Trees Featuring Disjoint Sets
HKOI Training 2006
Liu Chi Man (cx)
25 Mar 2006
2
Prerequisites
Asymptotic complexity
Set theory
Elementary graph theory
Priority queues (or heaps)
3
Graphs
A graph is a set of vertices and a set of edges
G = (V, E)
Number of vertices = |V|
Number of edges = |E|
We assume a simple graph, so |E| = O(|V|^2)
4
Roadmap
What is a tree?
Disjoint sets
Minimum spanning trees
Various tree topics
5
What is a Tree?
6
Trees in graph theory
In graph theory, a tree is an acyclic, connected graph
  Acyclic means “without cycles”
7
Properties of trees
|E| = |V| - 1, hence |E| = Θ(|V|)
Between any pair of vertices, there is a unique path
Adding an edge between a pair of non-adjacent vertices creates exactly one cycle
Removing an edge from the tree breaks the tree into two smaller trees
8
Definition?
The following four conditions are equivalent:
  G is connected and acyclic
  G is connected and |E| = |V| - 1
  G is acyclic and |E| = |V| - 1
  Between any pair of vertices in G, there exists a unique path
G is a tree if at least one of the above conditions is satisfied
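The second condition (connected and |E| = |V| - 1) is easy to test in code. The following is a minimal Python sketch, not part of the original slides: the name is_tree, the 0-based vertex numbering, and the edge-list input are assumptions made for illustration, and the empty graph is arbitrarily treated as not a tree.

```python
from collections import deque


def is_tree(n, edges):
    """Check whether a simple undirected graph on vertices 0..n-1 is a tree.

    Uses the condition "connected and |E| = |V| - 1".
    """
    # Assumption of this sketch: the empty graph is not counted as a tree.
    if n == 0 or len(edges) != n - 1:
        return False

    # Build adjacency lists.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Breadth-first search from vertex 0 to count reachable vertices.
    seen = [False] * n
    seen[0] = True
    reached = 1
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                reached += 1
                queue.append(v)

    # Connected if and only if every vertex was reached.
    return reached == n
```

With exactly |V| - 1 edges, connectivity alone rules out cycles, so no separate cycle check is needed.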
9
Other properties of trees
Bipartite
Planar
A tree with at least two vertices has at least two leaves (vertices of degree 1)
10
Roadmap
What is a tree?
Disjoint sets
Minimum spanning trees
Various tree topics
11
The Union-Find problem
N balls initially, each ball in its own bag
  Label the balls 1, 2, 3, ..., N
Two kinds of operations:
  Pick two bags, put all balls in these bags into a new bag (Union)
  Given a ball, find the bag containing it (Find)
12
The Union-Find problem
An example with 4 balls
Initial: {1}, {2}, {3}, {4}
Union {1}, {3} → {1, 3}, {2}, {4}
Find 3. Answer: {1, 3}
Union {4}, {1, 3} → {1, 3, 4}, {2}
Find 2. Answer: {2}
Find 1. Answer: {1, 3, 4}
13
Disjoint sets
Disjoint-set data structures can be used to solve the union-find problem
Each bag has its own representative ball
  {1, 3, 4} is represented by ball 3 (for example)
  {2} is represented by ball 2
14
Implementation 1: Naive arrays
Bag[x] := representative of the bag containing x
<O(N), O(1)>
  Union takes O(N) and Find takes O(1)
Slight modifications give <O(U), O(1)>
  U is the size of the union
Worst case: O(MN) for M operations
15
Implementation 1: Naive arrays
How to union Bag[x] and Bag[y]?
  Z := Bag[x]
  For each ball v in bag Z do
    Bag[v] := Bag[y]
Can I update the balls in Bag[y] instead?
  Rule: Update the balls in the smaller bag
  O(M lg N) for M union operations
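The array implementation with the "update the smaller bag" rule might be sketched as follows. This is an illustrative Python version, not code from the slides: the class name BagArray and the extra members lists (used to enumerate the balls of a bag without scanning the whole array) are assumptions.

```python
class BagArray:
    """Naive-array disjoint sets with the smaller-bag rule."""

    def __init__(self, n):
        # Balls are numbered 1..n; index 0 is unused.
        self.bag = list(range(n + 1))                # bag[x] = representative of x's bag
        self.members = [[x] for x in range(n + 1)]   # members[r] = balls in the bag represented by r

    def find(self, x):
        # O(1): the representative is stored directly.
        return self.bag[x]

    def union(self, x, y):
        a, b = self.bag[x], self.bag[y]
        if a == b:
            return  # already in the same bag
        # Make a the representative of the smaller bag.
        if len(self.members[a]) > len(self.members[b]):
            a, b = b, a
        # Relabel only the balls of the smaller bag.
        for v in self.members[a]:
            self.bag[v] = b
        self.members[b].extend(self.members[a])
        self.members[a] = []
```

Each ball is relabelled only when its bag at least doubles in size, so a ball is relabelled at most lg N times, which is where the logarithmic bound comes from.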
16
Implementation 2: Forest
A forest is a collection of trees
Each bag is represented by a rooted tree, with the root being the representative ball
[Figure: a forest of two rooted trees, one per bag]
Example: two bags, {1, 3, 5} and {2, 4, 6, 7}.
17
Implementation 2: Forest
Find(x)
  Traverse from x up to the root
Union(x, y)
  Merge the two trees containing x and y
18
Implementation 2: Forest
[Figure: the forest over balls 1, 2, 3, 4 at each step: Initial, Union 1 3, Union 2 4, Find 4]
19
Implementation 2: Forest
[Figure: the forest after Union 1 4, followed by Find 4]
20
Implementation 2: Forest
How to represent the trees?
Leftmost-Child-Right-Sibling (LCRS)?
  Too complicated
Parent array
  Parent[x] := parent of x
  If x is a tree root, set Parent[x] := x
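A parent-array forest without any balancing could look like the Python sketch below. The function names make_forest, find and union are chosen here for illustration, and the choice to attach y's root under x's root is arbitrary.

```python
def make_forest(n):
    # Balls are numbered 1..n; index 0 is unused.
    # Every ball starts as the root of its own tree: parent[x] == x.
    return list(range(n + 1))


def find(parent, x):
    # Walk up the parent pointers until a root is reached.
    while parent[x] != x:
        x = parent[x]
    return x


def union(parent, x, y):
    # Merge the trees containing x and y by linking one root to the other.
    rx, ry = find(parent, x), find(parent, y)
    if rx != ry:
        parent[ry] = rx
```

Nothing here prevents the trees from degenerating into long chains, so a single find can still take time linear in N.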
21
Implementation 2: Forest
The worst case is still O(MN) for M operations
  What is the worst case?
Improvements
  Union-by-rank
  Path compression
22
Union-by-rank
We should avoid tall trees
The root of the taller tree becomes the new root after a union
So, keep track of tree heights (ranks)
[Figure: a good union and a bad union]
23
Path compression
See also the solution for Symbolic Links (HKOI2005 Senior Final)
Find(x): traverse from x up to the root
Compress the x-to-root path at the same time
24
Path compression
[Figure: Find(4) on a tree rooted at 3. The answer "the root is 3" is passed back along the path, and the vertices on the path are re-attached directly under 3.]
25
U-by-rank + Path compression
We ignore the effect of path compression on tree heights to simplify U-by-rank
U-by-rank alone gives O(M lg N)
U-by-rank + path compression gives O(M α(N))
  α: inverse Ackermann function
  α(N) ≤ 5 for practically large N
26
Roadmap
What is a tree?
Disjoint sets
Minimum spanning trees
Various tree topics
27
Minimum spanning trees
Given a connected graph G = (V, E), a spanning tree of G is a graph T such that
  T is a subgraph of G
  T is a tree
  T contains every vertex of G
A connected graph must have at least one spanning tree
28
Minimum spanning trees
Given a weighted connected graph G, a minimum spanning tree T* of G is a spanning tree of G with minimum total edge weight
Application: Minimizing the total length of wires needed to connect up a collection of computers
29
Minimum spanning trees
Two algorithms
  Kruskal’s algorithm
  Prim’s algorithm
30
Kruskal’s algorithm
Choose edges in ascending weight greedily, while preventing cycles
31
Kruskal’s algorithm
Algorithm
  T is an empty set
  Sort the edges in G by their weights
  For (in ascending weight) each edge e do
    If T ∪ {e} is acyclic then
      Add e to T
  Return T
32
Kruskal’s algorithm
How to detect a cycle?
Depth-first search (DFS)
  O(V) per check
  O(VE) overall
Disjoint set
  Vertices are balls, connected components are bags
33
Kruskal’s algorithm
Algorithm (using disjoint-set)
  T is an empty set
  Create bags {1}, {2}, …, {V}
  Sort the edges in G by their weights
  For (in ascending weight) each edge e do
    Suppose e connects vertices x and y
    If Find(x) ≠ Find(y) then
      Add e to T, then Union(Find(x), Find(y))
  Return T
34
Kruskal’s algorithm
The improved time complexity is O(E lg V)
The bottleneck is sorting
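Kruskal's algorithm with a disjoint-set forest could be sketched in Python as below. The function name kruskal, the (weight, x, y) edge format and the 1-based vertex numbering are assumptions of this example. The inner find uses path halving, a simpler variant of path compression, rather than the full two-pass compression.

```python
def kruskal(n, edges):
    """Return (total weight, list of chosen edges) of a minimum spanning tree.

    n     -- number of vertices, numbered 1..n
    edges -- list of (weight, x, y) tuples of a connected graph
    """
    parent = list(range(n + 1))
    rank = [0] * (n + 1)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    total = 0
    # Tuples sort by weight first, so this visits edges in ascending weight.
    for w, x, y in sorted(edges):
        rx, ry = find(x), find(y)
        if rx != ry:  # x and y are in different components: no cycle is created
            # Union by rank.
            if rank[rx] < rank[ry]:
                rx, ry = ry, rx
            parent[ry] = rx
            if rank[rx] == rank[ry]:
                rank[rx] += 1
            tree.append((w, x, y))
            total += w
    return total, tree
```

For example, on the 4-vertex graph with edges (1, 1, 2), (2, 2, 3), (3, 1, 3), (4, 3, 4), (5, 2, 4), the sketch keeps the edges of weight 1, 2 and 4 and rejects the other two because they would close a cycle.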
35
Prim’s algorithm
In Kruskal’s algorithm, the MST-in-progress is scattered around the graph
Prim’s algorithm grows the MST from a “seed”
Prim’s algorithm iteratively chooses the lightest grow-able edge
  A grow-able edge connects a grown vertex and a non-grown vertex
36
Prim’s algorithm
Algorithm
  Let seed be any vertex, and Grown := {seed}
  Initially T is an empty set
  Repeat |V|-1 times
    Let e=(x,y) be the lightest grow-able edge
    Add e to T
    Add x and y to Grown
  Return T
37
Prim’s algorithm
How to find the lightest grow-able edge?
Check all (grown, non-grown) vertex pairs
  Too slow
Each non-grown vertex x keeps a value nearest[x], which is the weight of the lightest edge connecting x to some grown vertex
  nearest[x] = ∞ if no such edge exists
38
Prim’s algorithm
How to use nearest?
Grow the vertex (x) with the minimum nearest-value
  Which edge? Keep track of it!
Since x has just been grown, we need to update the nearest-values of all non-grown vertices
  Only need to consider edges incident to x
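The nearest-value bookkeeping might be sketched in Python as follows. The function name prim, the via array (which remembers the edge behind each nearest-value), the 0-based vertex numbering and the adjacency-list input are assumptions of this example. Unlike the slide's loop, this version treats the seed as a vertex with nearest-value 0 and grows it in the first round, so the loop runs |V| times.

```python
import math


def prim(n, adj, seed=0):
    """Return (total weight, list of (x, y, w) edges) of a minimum spanning tree.

    n   -- number of vertices, numbered 0..n-1
    adj -- adj[x] is a list of (y, w) pairs, one per edge x-y of weight w
    """
    grown = [False] * n
    nearest = [math.inf] * n   # lightest known edge weight to a grown vertex
    via = [None] * n           # grown endpoint of that lightest edge
    nearest[seed] = 0
    tree = []
    total = 0

    for _ in range(n):
        # Pick the non-grown vertex with the minimum nearest-value: O(V).
        x = min((v for v in range(n) if not grown[v]), key=lambda v: nearest[v])
        if nearest[x] == math.inf:
            raise ValueError("graph is not connected")
        grown[x] = True
        if via[x] is not None:  # the seed has no incoming tree edge
            tree.append((via[x], x, nearest[x]))
            total += nearest[x]
        # Only edges incident to x can improve a nearest-value.
        for y, w in adj[x]:
            if not grown[y] and w < nearest[y]:
                nearest[y] = w
                via[y] = x
    return total, tree
```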
39
Prim’s algorithm
Try to program Prim’s algorithm
You may find that it’s very similar to Dijkstra’s algorithm for finding shortest paths!
  Almost only a one-line difference
40
Prim’s algorithm
Per round...
  Finding minimum nearest-value: O(V)
  Updating nearest-values: O(V) (Overall O(E))
Overall: O(V^2 + E) = O(V^2) time
Using a binary heap,
  O(lg V) per Finding minimum
  O(lg V) per Updating
  Overall: O(E lg V) time
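A binary-heap version can be sketched with Python's heapq module. Note the substitution: heapq has no decrease-key operation, so this sketch pushes a fresh entry for every candidate edge and discards stale entries when they are popped ("lazy deletion") instead of updating values in place. The heap may then hold up to O(E) entries, but since lg E = O(lg V) in a simple graph the bound stays O(E lg V). The function name prim_heap and the input format are assumptions of this example.

```python
import heapq


def prim_heap(n, adj, seed=0):
    """Return the total weight of a minimum spanning tree.

    n   -- number of vertices, numbered 0..n-1
    adj -- adj[x] is a list of (y, w) pairs, one per edge x-y of weight w
    """
    grown = [False] * n
    heap = [(0, seed)]   # (candidate nearest-value, vertex)
    total = 0
    count = 0            # number of grown vertices

    while heap and count < n:
        w, x = heapq.heappop(heap)
        if grown[x]:
            continue  # stale entry: x was already grown via a lighter edge
        grown[x] = True
        total += w
        count += 1
        # Offer every edge from x to a non-grown vertex as a candidate.
        for y, wy in adj[x]:
            if not grown[y]:
                heapq.heappush(heap, (wy, y))

    if count < n:
        raise ValueError("graph is not connected")
    return total
```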
41
MST Extensions
Second-best MST
  We don’t want the best!
Online MST
  See IOI2003 Path Maintenance
Minimum bottleneck spanning tree
  The bottleneck of a spanning tree is the weight of its maximum weight edge
  An algorithm that runs in O(V+E) exists
42
MST Extensions (NP-Hard)
Minimum Steiner Tree
  No need to connect all vertices, but at least a given subset B ⊆ V
Degree-bounded MST
  Every vertex of the spanning tree must have degree not greater than a given value K
For a discussion of NP-hardness, please attend [Talk] Introduction to Complexity Theory on 3 June
43
Roadmap
What is a tree?
Disjoint sets
Minimum spanning trees
Various tree topics
44
Various tree topics (List)
Center, eccentricity, radius, diameter
Tree isomorphism
  Canonical representation
Prüfer code
Lowest common ancestor (LCA)
Counting spanning trees
45
Supplementary readings
Advanced:
  Disjoint set forest (Lecture slides)
  Prim’s algorithm
  Kruskal’s algorithm
  Center and diameter
Post-advanced (so-called Beginners):
  Lowest common ancestor
  Maximum branching