cs 600.226: data structuresphf/2016/fall/cs226/lectures/38.unionfind.pdf · 1. find and remove a...

69
CS 600.226: Data Structures Michael Schatz Dec 7, 2016 Lecture 38: Union-Find

Upload: others

Post on 19-Jun-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

CS 600.226: Data Structures Michael Schatz Dec 7, 2016 Lecture 38: Union-Find

Page 2: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Remember: javac –Xlint:all & checkstyle *.java & JUnit Solutions should be independently written!

Assignment 10: Due Monday Dec 5 @ 10pm

Page 3: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Part 1: Dijkstra’s Algorithm aka Shortest Path Revisited

Page 4: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Dijkstra’s Correctness

•  Assume that Dikjstra’s algorithm has correctly found the shortest path to the first N items, including from S to X and U (but not yet Y or V)

•  It now decides the next closest node to visit is v, using the edge from u Could there be some other shorter path to v?

S

U

X

V

Y

No: cost(S->X->V) must be greater than or equal to cost(S->U->V) or it would have already revised the cost of v to go through X

No: cost(S->X->Y) must be greater than or equal to cost(S->U->V) (and therefore cost(S->X->Y->V) must be even greater) or it would be visiting node Y next

Page 5: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Dijkstra’s Pseudocode

dijkstra(graph, start):distance = {}for v in graph.vertices:

distance[v] = infinity distance[start] = 0 unvisited = graph.vertices unvisited.remove(start)

current = startwhile !unvisted.empty()

for e in current.outgoing:v = e.toVertexif v in unvisited:

d = distance[current] + e.weightif d < distance[v]:

distance[v] = dunvisited.remove(current)current = unvisited.findSmallestDistance()if distance[current] == infinity:

break

Greedy choice

In case >1 connected components

Running time?

Explore every edge once, Visit every node once

At every node, scan the unvisited nodes to find next smallest distance

O(|E| + |V|2)

Can you do better?

Page 6: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

dijkstra(graph, start):distance = {}

for v in graph.vertices:if v == start

distance[v] = 0else

distance[v] = infinity Q.add_with_priority(v, distance[v])

while !Q.empty()cur = Q.extract_min()if distance[current] == infinity:

breakfor e in current.outgoing:

v = e.toVertexif v in Q:

d = distance[current] + e.weightif d < distance[v]:

distance[v] = dQ.decrease_priority(v,d)

A min-priority queue that supports extract_min(),

add_with_priority(), and

decrease_priority()

Normally it takes O(n) time to find an element in a heap, but use an

integer index (or something) to directly reference its position

Now Dikstra will run in O(|E| lg |V|)

Dijkstra with Priority Queue

Notice we have to do work on every edge

Page 7: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

public interface PriorityQueue<T extends Comparable<T>> { void insert(T t);void remove() throws EmptyQueueException;T top() throwsEmptyQueueException;boolean empty();

}

Min-priority queue

public interface MinPriorityQueue<K, P extends Comparable<T>> { void addWithPriority(K key, P priority);void decreasePriority(K key, P newp) throws InvalidItem;T extractMin() throws EmptyQueueException;boolean empty();

}

Page 8: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Min-priority queue

public interface MinPriorityQueue<K, P extends Comparable<T>> { void addWithPriority(K key, P priority);void decreasePriority(K key, P newp) throws InvalidItem;T extractMin() throws EmptyQueueException;boolean empty();

}

Mike/0 / \ Peter/7 Kelly/1 / \ / Katherine/9 Sydney/7 James/2

MPQ

heap

keys

Jam

es

Kat

herin

e K

elly

M

ike

Pet

er

Syd

ney

Page 9: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Min-priority queue

public interface MinPriorityQueue<K, P extends Comparable<T>> { void addWithPriority(K key, P priority);void decreasePriority(K key, P newp) throws InvalidItem;T extractMin() throws EmptyQueueException;boolean empty();

}

Mike/0 / \ Peter/7 Kelly/1 / \ / Katherine/9 Sydney/7 James/2

MPQ

heap

keys

Jam

es

Kat

herin

e K

elly

M

ike

Pet

er

Syd

ney

How to implement?

(1) If keys are integers in the range 0 to n,

make an array of references

(2) Use a HashMap from keys to hash

(3) Give client a reference, store in graph node

Page 10: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Fibonnaci Heap Special form of heaps specifically designed to allow fast updates to values •  Set of heap-ordered tree (value(n) < value(n.children)) •  Maintain pointer to overall minimum element •  Set of marked nodes used to track heights of certain trees

Find-min

Delete-min

Insert Decrease-key

Merge

Binary Heap O(1) O(lg n) O(lg n) O(lg n) O(n)

Fibonnaci Heap O(1) O(lg n) O(1) O(1) O(1)

Reduces Dijkstra’s run time to O(|E| + |V| lg |V|)

Page 11: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Part 2: Minimum Spanning Trees

Page 12: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Long Distance Calling

Supposes it costs different amounts of money to send data between cities A through F. Find the least expensive set of connections so that anyone can

send data to anyone else.

Given an undirected graph with weights on the edges, find the subset of edges that (a) connects all of the vertices of the graph and (b) has minimum total costs

This subset of edges is called the minimum spanning tree (MST)

Removing an edge from MST disconnects the graph, adding one forms a cycle

Page 13: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Dijkstra’s != MST

S

Y

X

Dijkstra’s will build the tree S->X, S->Y (tree visits every node with shortest paths from S to every other node)

but the MST is S->X, X->Y

(tree visits every node and minimizes the sum of the edges)

100

100

1

Page 14: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

1.  Pick an arbitrary starting vertex and add it to set T. Add all of the other vertices to set R

Every vertex will be included eventually, so doesn’t

matter where you start

T={A} R={B,C,D,E,F}

Page 15: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

1.  Pick an arbitrary starting vertex and add it to set T. Add all of the other vertices to set R

Every vertex will be included eventually, so doesn’t

matter where you start

T={A} R={B,C,D,E,F}

2. While R is not empty, pick an edge from T to R with minimum cost A-B: 4 <- pick me A-C: 7 A-D: 8 A-F: 10

Page 16: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

1.  Pick an arbitrary starting vertex and add it to set T. Add all of the other vertices to set R

Every vertex will be included eventually, so doesn’t

matter where you start

T={A} R={B,C,D,E,F}

2. While R is not empty, pick an edge from T to R with minimum cost A-B: 4 <- pick me A-C: 7 A-D: 8 A-F: 10

Page 17: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

3. Repeat! A-B

Page 18: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

3. Repeat! A-B

A-C

Page 19: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

3. Repeat! A-B

A-C

C-D

Page 20: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

3. Repeat! A-B

A-C

C-D

C-F

Page 21: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

3. Repeat! A-B

A-C

C-D

C-F

C-E

By making a set of simple local choices, it finds the overall best solution The greedy algorithm is optimal J

Page 22: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

Prim’s Algorithm Sketch 1.  Initialize a tree with a single vertex, chosen arbitrarily from the graph. 2.  Grow the tree by one edge: of the edges that connect the tree to

vertices not yet in the tree, find the minimum-weight edge, and add it to the tree.

3.  Repeat step 2 (until all vertices are in the tree).

Page 23: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

Prim’s Pseudo-code 1.  For each vertex v,

•  Compute C[v] (the cheapest cost of a connection to v from the MST) and an edge E[v] (the edge providing that cheapest connection).

•  Initialize C[v] to ∞ and E[v] to a special flag value indicating that there is no edge connecting v to the MST

2.  Initialize an empty forest F (set of trees) and a set Q of vertices that have not yet been included in F (initially, all vertices).

3.  Repeat the following steps until Q is empty: 1.  Find and remove a vertex v from Q with the minimum value of C[v] 2.  Add v to F and, if E[v] is not the special flag value, also add E[v] to F 3.  Loop over the edges vw connecting v to other vertices w. For each

such edge, if w still belongs to Q and vw has smaller weight than C[w], perform the following steps: 1.  Set C[w] to the cost of edge vw 2.  Set E[w] to point to edge vw.

4.  Return F

What data structures do we need to make it fast?

How fast is the naïve version? O(|V|2)

HEAP

Page 24: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Prim’s Algorithm

Faster Prim’s Pseudo-code 1.  Add the cost of all of the edges to a heap (or sort the edge costs) 2.  Repeatedly pick the next smallest edge (u,v)

1.  If u or v is not already in the MST, add the edge uv to the MST How fast is this version O(|E| lg |E|)

How fast is this version O(|E| lg |V|)

Fastest Prim’s Pseudo-code 1.  Add all of the vertices to a min-priority-queue prioritized by the min edge

cost to be added to the MST. Initialize (key=v, dist=∞, edge=<>) 2.  Repeatedly pick the next closest vertex v from the MPQ

1.  Add v to the MST using the recorded edge 2.  For each edge (v, u)

1.  If cost(v,u) < MPQ(u) 1.  MPQ.decreasePriority(u, cost(v,u), (v,u))

Using Fibonacci Heap O(|E| + |V| lg |V|)

Page 25: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Teaser: Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

Page 26: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Traveling Salesman Problem

•  What if we wanted to compute the minimum weight path visiting every node once?

C

A

D

B 4

1

3

1

5

2

ABDCA: 4+2+5+3 = 14 ACDBA: 3+5+2+4 = 14* ABCDA: 4+1+5+1 = 11 ADCBA: 1+5+1+4 = 11* ACBDA: 3+1+2+1 = 7 ADBCA: 1+2+1+3= 7 *

Page 27: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Greedy Search Greedy Search cur=graph.randNode() while (!done)

next=cur.getNextClosest() Greedy: ABDCA = 5+8+10+50= 73 Optimal: ACBDA = 5+11+10+12 = 38 Greedy finds the global optimum only when 1.  Greedy Choice: Local is correct without reconsideration 2.  Optimal Substructure: Problem can be split into subproblems

Optimal Greedy: Dijkstra’s, Prim’s, Cookie Monster

C

A

D

B 8

11

5

12

50

10

Page 28: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

TSP Complexity

•  No fast solution –  Knowing optimal tour through n cities doesn't

seem to help much for n+1 cities

[How many possible tours for n cities?]

•  Extensive searching is the only provably correct algorithm –  Brute Force: O(n!)

•  ~20 cities max •  20! = 2.4 x 1018

C

A

D

B 4

1

3

1

5

2

C

A

D

B 4

1

3

1

5

2

E 30 2

1 2

Page 29: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Branch-and-Bound •  Abort on suboptimal solutions

as soon as possible –  ADBECA = 1+2+2+2+3 = 10 –  ABDE = 4+2+30 > 10 –  ADE = 1+30 > 10 –  AED = 1+30 > 10 –  …

C

A

D

B 4

1

3

1

5

2

E 30 2

1 2

•  Performance Heuristic –  Always gives the optimal answer –  Doesn't always help performance, but often does –  Current TSP record holder:

•  85,900 cities •  85900! = 10386526

[When not?]

Page 30: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

TSP and NP-complete •  TSP is one of many extremely hard

problems of the class NP-complete –  Extensive searching is the only way to

find an exact solution –  Often have to settle for approx. solution

•  WARNING: Many “natural” problems are in this class –  Find a tour the visits every node once –  Find the smallest set of vertices covering the edges –  Find the largest clique in the graph –  Find the highest mutual information encoding scheme –  Find the best set of moves in tetris –  … –  http://en.wikipedia.org/wiki/List_of_NP-complete_problems

Page 31: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Part 3: Kruskal’s Algorithm and Union Find

Page 32: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12

4

7

4

9

14

10

3 2

8

Page 33: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12

4

7

4

9

14

10

3 2

8

Page 34: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12

4

7

4

9

14

10

3 8

Tinting is just to make it easier to look at

Page 35: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12

4

7

4

9

14

10

8

Page 36: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12

4

7

9

14

10

8

Page 37: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

4

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12

7

9

14

10

8

Page 38: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

7

4

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12 9

14

10

8

Page 39: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

8 7

4

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

12 9

14

10

Page 40: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

12 9

14

10

8 7

4

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

Page 41: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

7

4

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

Page 42: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

7

4

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

Easy: O(|V|)

Easy: O(|E|lg|E|)

Hmm???

Running time:

Page 43: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Disjoint Sets

Disjoint Subset: every element is assigned to a single subset

Problem: Determine which elements are part of the same subset as sets are dynamically joined together

Two main methods:

Find: Are elements x and y part of the same set? Union: Merge together sets containing elements x and y

C E

D

F A

B

C E

D

F A

B

Page 44: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Quick Find

https://www.cs.princeton.edu/~rs/AlgsDS07/01UnionFind.pdf

Page 45: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Implementation

Find is quick O(1), but each union is O(N) time Kruskal’s ultimately merges all N items: O(N2)

Can we do any faster?

Page 46: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Implementation

Find is quick O(1), but each union is O(N) time Kruskal’s ultimately merges all N items: O(N2)

Can we do any faster?

Page 47: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Quick Find Forest

Page 48: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Quick Union

Page 49: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Quick Union

Why not just change id[q]? Why set it to p’s root instead of p?

Page 50: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Quick Union Implementation

Page 51: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Quick Union Example

What is the worst case for Quick Union?

Page 52: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Quick-Union Analysis

Page 53: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Improved Quick-Union: Weighting

Page 54: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Improved Quick-Union: Weighting

Page 55: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Weighted Quick-Union Implementation

Page 56: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Weighted Quick-Union

Page 57: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Weighted Quick-Union

How tall can the tree grow?

Height of tree only grows when two trees of equal height are united, thereby doubling the number of elements

Hooray! Tree stays pretty flat!

Uniting a short tree and a tall tree doesn’t increase the height of the tall tree

How many rounds of doubling can we do until we include all N elements?

lg N rounds => O(lg N) max height J

Page 58: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Weighted Quick-Union Analysis

Usually very happy with lg(N), but here we can do better!

Should we stop trying here?

Page 59: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Improvement 2: Path Compression

Page 60: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Path Compression Implementation

Page 61: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Weighted Path Compression Example

Page 62: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Weighted Path Compression Analysis

Much bigger than #atoms in the universe

Page 63: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

7

4

4

3 2

Kruskal’s Algorithm

Kruskal’s Algorithm Sketch 1.  Create a forest F (a set of trees), where each vertex in the graph is a

separate tree 2.  Create a set S containing all the edges in the graph 3.  while S is nonempty and F is not yet spanning

1.  remove an edge with minimum weight from S 2.  if the removed edge connects two different trees then add it to the

forest F, combining two trees into a single tree

C E

D

F A

B

Easy: O(|V|)

Easy: O(|E|lg|E|)

Hmm???

Running time:

Using Weighted-Path Compression Quick Union (aka Union-Find) each find or union in O(lg*|V|) time. Total time is O(|E|lg|E| + Elg*|V|) => O(|E|lg|E|)

If the edges are already in sorted order (or use a linear time sorting algorithm such as counting sort), reduce total time to O(|E|lg*|V|)

Page 64: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Other Applications: Connected Components

Page 65: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Other Applications: Connected Components

How would you use Union-Find to solve this?

Assign every dot to a set, then merge sets that have an edge between them

Page 66: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Other Applications: Connected Components

63 Connected

Components

Page 67: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Never underestimate the power of the LOG STAR!!!!

Page 68: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Next Steps 1.  Reflect on the magic and power of the logstar! 2.  Final Exam: Sunday Dec 18 @ 9am

Page 69: CS 600.226: Data Structuresphf/2016/fall/cs226/lectures/38.UnionFind.pdf · 1. Find and remove a vertex v from Q with the minimum value of C[v] 2. Add v to F and, if E[v] is not the

Welcome to CS 600.226 http://www.cs.jhu.edu/~cs226/

Questions?