# Minimum Spanning Trees: Analysis of Algorithms


Author: russell-wells

Post on 29-Jan-2016


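    // Union by rank. Assumes a Node type with fields parent (Node) and rank (int).
    void union(Node u, Node v) {
        Node i = find(u);
        Node j = find(v);
        if (i == j) return;   // already in the same subtree
        if (i.rank > j.rank) {
            j.parent = i;
        } else {
            i.parent = j;
            if (i.rank == j.rank) j.rank = j.rank + 1;
        }
    }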

Union-by-weight: the root of the smaller subtree is linked to the root of the larger subtree. This tends to keep the height of the newly formed tree small.

## Union-by-Rank Heuristic

Question: Given the following array, what will the entries in the array be after union(3, 6), using the union-by-rank heuristic? What will the entries be if, instead of union-by-rank, you use the union-by-weight heuristic?

| v    | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|------|---|---|---|---|---|---|---|---|---|---|
| p[v] | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 8 | 8 | 8 |

## Path Compression Heuristic

Notice that when we perform find(v), we have to follow a path from v to the root of the subtree containing v. Since find is recursive, why not simply make all of the elements on that path point directly to the root of the subtree as the recursion unwinds?

This is reminiscent of the naive algorithm, where we made every element point directly to the root, but here it is much cheaper because we only alter pointers that we needed to look at anyway.

Find the root of the subtree for a given vertex and, while you are at it, make every vertex on the path to the root point directly to the root once it is found.

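    // Find with path compression. Assumes a Node type whose parent field
    // refers to the node itself at a root.
    Node find(Node v) {
        if (v != v.parent) {
            v.parent = find(v.parent);
        }
        return v.parent;
    }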
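As a rough illustration, the union-by-rank and path-compression routines above can be combined into one small Python class. This is only a sketch: the class name `DisjointSet` and the array-based layout are assumptions made for the example, not something taken from the slides. The usage lines load the array from the union-by-rank question and assume that each root's rank equals the height of its tree (rank 1 for root 0, rank 2 for root 8), which the slides do not state.

```python
class DisjointSet:
    """Array-based union-find (hypothetical helper for illustration)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every vertex starts as its own root
        self.rank = [0] * n

    def find(self, v):
        # Path compression: re-point v at the root as the recursion unwinds.
        if self.parent[v] != v:
            self.parent[v] = self.find(self.parent[v])
        return self.parent[v]

    def union(self, u, v):
        i, j = self.find(u), self.find(v)
        if i == j:
            return False  # already in the same subtree
        if self.rank[i] > self.rank[j]:
            self.parent[j] = i
        else:
            self.parent[i] = j
            if self.rank[i] == self.rank[j]:
                self.rank[j] += 1
        return True


# Array from the union-by-rank question; the ranks are ASSUMED to equal tree heights.
ds = DisjointSet(10)
ds.parent = [0, 0, 0, 0, 0, 0, 7, 8, 8, 8]
ds.rank = [1, 0, 0, 0, 0, 0, 0, 1, 2, 0]
ds.union(3, 6)
print(ds.parent)  # [8, 0, 0, 0, 0, 0, 8, 8, 8, 8]
```

Under these assumptions root 0 (rank 1) is linked beneath root 8 (rank 2), and find(6) also re-points vertex 6 directly at 8 because of path compression.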

Note that find is a two-pass procedure: it makes one pass up the subtree to find the root, and a second pass back down, as the recursion unwinds, to update each vertex on the path so that it points directly to the root. Each call returns v.parent. If v is the root, then the line v.parent = find(v.parent) is not executed and the root is returned. Otherwise, this line is executed: it updates v to point directly to the root, and the root is what is returned to the caller.

## Analysis of Union-Find

- Union-by-rank with path compression takes time proportional to α(N)·M, where α is the inverse Ackermann function.
- Compare this to M + N for cycle checking using depth-first search.
- Using the heuristics with a billion data items reduces the processing time of cycle checking from 30 years to 6 seconds.
- Moral: a good algorithm sometimes enables a solution. A supercomputer would not help much here.

## Kruskal Using Union-Find

    Sort the edges in non-decreasing order by weight
    Initialize a partition structure (make each vertex its own root)
    includedCount = 0
    while includedCount < N - 1 do        (a spanning tree on N vertices has N - 1 edges)
        Choose the next edge (u, v) in the order returned by the sort
        root1 = find(u)
        root2 = find(v)
        if (root1 ≠ root2) then
            add edge (u, v) to the spanning tree so far
            includedCount = includedCount + 1
            union(root1, root2)
        end if
    end while

## Prim MST Algorithm

- Prim's minimum spanning tree algorithm concentrates on vertices rather than edges.
- Choose a starting vertex, put it in the MST, and then repeatedly choose other vertices to put in the tree.
- To find the next vertex, consider all of the edges from the starting vertex, and choose the neighbor of the starting vertex whose edge weight is the smallest.
- On each subsequent pass, choose the vertex (among those not yet in the MST) that is connected by the smallest-weight edge to any of the vertices already in the MST.

## Prim Example

Steps shown in the slide figures (not reproduced in this transcript):

1. Original graph
2. Fringe from vertex A
3. Add edge A-B
4. Add edge B-E
5. Add edge A-C
6. Add edge A-F
7. Add edge F-G
8. Final Prim tree

## Prim Pseudocode

    Select a starting vertex
    Build the initial fringe from vertices connected to the starting vertex
    While (there are vertices left)
        Choose the edge to the fringe with the smallest weight
        Add the associated vertex to the tree
        Update fringe:
            Add to the fringe the vertices connected to the new vertex
            Update the edges to the fringe so that they are the smallest
    Endwhile
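The Kruskal and Prim procedures described above can be sketched side by side in Python. This is a minimal sketch under stated assumptions, not the slides' own implementation: the function names `kruskal` and `prim` and the `(u, v, weight)` edge format are chosen for the example. The Prim version is the "lazy" variant, which keeps whole edges in a binary heap (`heapq`) and skips stale entries rather than updating fringe keys in place as the pseudocode describes. The sample edge list is the 8-vertex graph whose weights are given with the priority-queue question in this transcript.

```python
import heapq

# (u, v, weight) edges of the 8-vertex sample graph from this transcript.
EDGES = [(0, 2, 26), (0, 4, 38), (0, 6, 58), (0, 7, 16), (1, 2, 36), (1, 3, 29),
         (1, 5, 32), (1, 7, 19), (2, 3, 17), (2, 7, 34), (3, 6, 52), (4, 5, 35),
         (4, 7, 37), (5, 7, 28), (6, 2, 40), (6, 4, 93)]


def kruskal(n, edges):
    parent = list(range(n))  # each vertex starts as its own root
    rank = [0] * n

    def find(v):
        if parent[v] != v:
            parent[v] = find(parent[v])  # path compression
        return parent[v]

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if len(tree) == n - 1:
            break  # a spanning tree has n - 1 edges
        r1, r2 = find(u), find(v)
        if r1 != r2:  # different subtrees, so no cycle is formed
            tree.append((u, v, w))
            if rank[r1] > rank[r2]:  # union by rank
                parent[r2] = r1
            else:
                parent[r1] = r2
                if rank[r1] == rank[r2]:
                    rank[r2] += 1
    return tree


def prim(n, edges, start=0):
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))
    in_tree = [False] * n
    in_tree[start] = True
    fringe = list(adj[start])  # heap of (weight, tree vertex, other vertex)
    heapq.heapify(fringe)
    tree = []
    while fringe and len(tree) < n - 1:
        w, u, v = heapq.heappop(fringe)
        if in_tree[v]:
            continue  # stale entry: v joined the tree after this edge was pushed
        in_tree[v] = True
        tree.append((u, v, w))
        for edge in adj[v]:
            if not in_tree[edge[2]]:
                heapq.heappush(fringe, edge)
    return tree


print(sum(w for _, _, w in kruskal(8, EDGES)))  # 181
print(sum(w for _, _, w in prim(8, EDGES)))     # 181
```

Because all edge weights in this graph are distinct, the MST is unique, so both functions select the same seven edges even though they add them in a different order.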

## Prim Uses a Priority Queue

- Prim MST algorithm: repeatedly select the next vertex that is closest to the tree we have built so far.
- To implement Prim, we need a data structure that lets us associate a key (the distance to the tree under construction) with each vertex, and then rapidly select the vertex with the lowest key.
- Use a priority queue implemented as a min-heap (assuming a sparse graph).

Question: Suppose that you are running Prim's algorithm with N = 8. The edges added to the MST so far are (0, 2), (0, 7), and (1, 7). What are the keys (edge weights of the fringe) in the priority queue?

- a) 17, 38, 32, 40
- b) 17, 37, 28, 40
- c) 17, 37, 28, 32
- d) 0, 19, 26, 17, 37, 28, 40, 16

Assume the graph weights are as follows: (0,2)=26, (0,4)=38, (0,6)=58, (0,7)=16, (1,2)=36, (1,3)=29, (1,5)=32, (1,7)=19, (2,3)=17, (2,7)=34, (3,6)=52, (4,5)=35, (4,7)=37, (5,7)=28, (6,2)=40, (6,4)=93

## Analysis of MST Algorithms

- Kruskal always maintains a spanning subgraph that becomes a tree at the last step. At each step (after ordering the edges by increasing weight), select the lowest-weight edge that is not already in the subgraph and add it to the MST, as long as no cycle is formed.
- Prim always maintains a tree that becomes spanning at the last step. At each step, select the highest-priority (lowest-weight) vertex not already in the tree and add it to the MST.
- Selections made this way are examples of a general class of algorithms: greedy algorithms.
- Kruskal's runtime is proportional to M log M for sorting the edges plus α(N)·M for cycle checking, where α is the inverse Ackermann function. Sorting the edges dominates the runtime.
- Prim's runtime is proportional to M log N if using a priority queue implemented as a binary heap.
- Question: Which is faster, Kruskal or Prim? Does a general-purpose linear-time MST algorithm exist?

## Greedy Algorithms

- Kruskal and Prim are greedy algorithms.
- Greedy algorithms are used to quickly find an approximate solution for difficult problems.
- The idea is to make decisions based on what appears to be the best choice at the moment, with the hope that overall this will produce a reasonably good result.

"Greed is good. Greed is right. Greed clarifies, cuts through, and captures the essence of the evolutionary spirit." (Gordon Gekko)

## Greedy Traveling Salesman

- Sort the edges in order of non-decreasing weight.
- For every edge, taken in order of non-decreasing weight: add the edge to the greedy path as long as it does not create a cycle (unless it is the last edge) and it is not the third edge to or from any vertex.
- As in Kruskal, use the union-find algorithm with the union-by-rank and path compression heuristics for cycle checking.

In this example, first choose edge 3-5, then edge 5-6, but then reject 3-6 because it forms a cycle. Next choose edge 4-7, then edge 2-7, but then reject edge 1-5 because it would be the third edge to node 5. Next choose edges 1-6, 1-4, and 2-3 to get the path 1 4 7 2 3 5 6 1, for a total length of 53.

This is a good approximation, but it is not the optimal solution, because there is at least one other path, 1 4 7 2 5 3 6 1, which has a lower total length of 41.

Example: n = 5, seed = 100000

| Vertex | (x, y) |
|--------|--------|
| v0     | (2, 2) |
| v1     | (0, 1) |
| v2     | (4, 1) |
| v3     | (3, 0) |
| v4     | (1, 3) |

(Figure: the five vertices plotted on x-y axes.)

Edges in sorted order:

- 0-4: weight = 1.41
- 2-3: weight = 1.41
- 0-1: weight = 2.24
- 0-2: rejected (3rd edge from 0)
- 0-3: rejected (3rd edge from 0)
- 1-4: rejected (cycle)
- 1-3: weight = 3.16
- 2-4: weight = 3.61 (stop here)

TSP path: 0 1 3 2 4 0, total weight = 11.83

## Connected Components Again

Question: How many connected components result after performing the following sequence of union operations on a set of 10 items, using the union-by-rank heuristic? Which items are in each connected component?

(1,2), (3,4), (5,6), (7,8), (7,9), (2,8), (0,5), (1,9)

## Dynamic Connectivity

Given a set of N objects, find out if there is a path connecting two given objects. Actually finding that path is a different problem.
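A connectivity query of this kind reduces to comparing roots in a union-find structure. The Python sketch below is one way to express that, assuming a hypothetical `UnionFind` class with `connected` and `components` helpers (names invented for this example). It is exercised on the sequence of union operations listed above for a set of 10 items.

```python
class UnionFind:
    """Union by rank with path compression (illustrative sketch)."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.count = n  # number of connected components

    def find(self, v):
        if self.parent[v] != v:
            self.parent[v] = self.find(self.parent[v])
        return self.parent[v]

    def union(self, u, v):
        i, j = self.find(u), self.find(v)
        if i == j:
            return  # already connected, nothing to merge
        if self.rank[i] > self.rank[j]:
            self.parent[j] = i
        else:
            self.parent[i] = j
            if self.rank[i] == self.rank[j]:
                self.rank[j] += 1
        self.count -= 1

    def connected(self, p, q):
        # Two objects are connected exactly when they share a root.
        return self.find(p) == self.find(q)

    def components(self):
        groups = {}
        for v in range(len(self.parent)):
            groups.setdefault(self.find(v), []).append(v)
        return sorted(groups.values())


uf = UnionFind(10)
for a, b in [(1, 2), (3, 4), (5, 6), (7, 8), (7, 9), (2, 8), (0, 5), (1, 9)]:
    uf.union(a, b)

print(uf.count)            # 3
print(uf.components())     # [[0, 5, 6], [1, 2, 7, 8, 9], [3, 4]]
print(uf.connected(1, 9))  # True
print(uf.connected(0, 3))  # False
```

Note that `connected` only answers whether a path exists; it does not recover the path itself, which matches the distinction drawn above.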

Some applications of dynamic connectivity:

- Pixels in a digital photo
- Computers in a network
- Friends in a social network
- Transistors in a computer chip
- Elements in a mathematical set
- Variable names in a Fortran program
- Metallic sites in a composite system

## Data Mining Using Union-Find

Partition a set of objects into clusters such that different clusters are far apart and the objects within each cluster are close together. Applications include:

- Routing in mobile ad hoc networks
- Categorizing documents for web searches
- Searching medical image databases for similar images
- Clustering sky objects into stars, quasars, galaxies
- Identifying patterns in gene expression

The procedure:

1. Start by forming N clusters of one object each.
2. Find the pair of objects from different clusters that have the minimum distance between them and union them into a new cluster.
3. Repeat until there are exactly k clusters (k spanning trees).

This procedure is precisely Kruskal's algorithm.
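The clustering procedure above can be sketched as Kruskal's algorithm stopped early, once only k clusters remain. The Python below is a minimal sketch under assumptions that are not in the slides: objects are 2-D points, distance is Euclidean (`math.dist`, Python 3.8+), the function name `cluster` is hypothetical, and for brevity the union step uses path compression only, without union-by-rank.

```python
import math
from itertools import combinations


def cluster(points, k):
    """Group points into k clusters by merging the closest pairs first."""
    n = len(points)
    parent = list(range(n))  # start with n clusters of one object each

    def find(v):
        if parent[v] != v:
            parent[v] = find(parent[v])  # path compression
        return parent[v]

    # All pairs of objects, ordered by non-decreasing distance (Kruskal's edge sort).
    pairs = sorted(combinations(range(n), 2),
                   key=lambda p: math.dist(points[p[0]], points[p[1]]))

    clusters = n
    for a, b in pairs:
        if clusters == k:
            break  # stop early: exactly k clusters (k trees) remain
        ra, rb = find(a), find(b)
        if ra != rb:  # objects are in different clusters, so merge them
            parent[ra] = rb
            clusters -= 1

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


# Made-up sample data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
print(cluster(points, 3))  # [[0, 1, 2], [3, 4], [5]]
```

The result lists point indices per cluster. Sorting all pairs costs time proportional to N² log N, so this sketch is only suitable for small inputs.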