Minimum Spanning Trees and Clustering
By Swee-Ling Tang, April 20, 2010
04/20/2010 1
Spanning Tree
• A spanning tree of a connected graph is a subgraph that is a tree and contains all the vertices of the graph.
• A graph may have many spanning trees; for example, the complete graph on four vertices has sixteen spanning trees.
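The count of sixteen can be checked by brute force. The sketch below is only an illustration: the function names are made up for this example, and vertices are assumed to be numbered 0 to n-1. It tries every set of n-1 edges and keeps those that contain no cycle.

```python
from itertools import combinations

def is_spanning_tree(n, edges):
    """Return True if the given n-1 edges contain no cycle (and so span all n vertices)."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:        # this edge would close a cycle
            return False
        parent[ru] = rv
    return True             # n-1 acyclic edges on n vertices form a spanning tree

def count_spanning_trees(n, edges):
    """Count the (n-1)-edge subsets of `edges` that are spanning trees."""
    return sum(1 for subset in combinations(edges, n - 1)
               if is_spanning_tree(n, subset))

k4_edges = list(combinations(range(4), 2))   # the 6 edges of the complete graph on 4 vertices
print(count_spanning_trees(4, k4_edges))     # 16
```

Of the 20 ways to pick three of the six edges, the four triangles are rejected, which leaves 16.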
Minimum Spanning Trees
• Suppose that the edges of the graph have weights or lengths. The weight of a tree is the sum of the weights of its edges.
• Different spanning trees of the same graph can have different total lengths.
• The question is: how do we find the spanning tree of minimum total length?
Minimum Spanning Trees
• The question can be answered by many different algorithms. Here are three classical minimum spanning tree algorithms:
– Boruvka's Algorithm
– Kruskal's Algorithm
– Prim's Algorithm
Kruskal's Algorithm
• Joseph Bernard Kruskal, Jr.
• Kruskal Approach:
– Select the minimum weight edge that does not form a cycle
Kruskal's Algorithm:
    sort the edges of G in increasing order by length
    keep a subgraph S of G, initially empty
    for each edge e in sorted order
        if the endpoints of e are disconnected in S
            add e to S
    return S
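The pseudocode above can be sketched in Python as follows. The union-find structure used to test whether the endpoints are already connected is an implementation choice made for this example, not something the slides specify. The (weight, u, v) edge format and the small example graph are likewise assumptions.

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) with vertices 0..n-1. Returns the chosen edges."""
    parent = list(range(n))

    def find(x):
        # Representative of the component of S that contains x.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []                            # the subgraph S, initially empty
    for w, u, v in sorted(edges):        # edges in increasing order of length
        ru, rv = find(u), find(v)
        if ru != rv:                     # endpoints are disconnected in S
            parent[ru] = rv
            tree.append((w, u, v))       # add e to S
    return tree

# Hypothetical example graph on 4 vertices.
example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst = kruskal(4, example_edges)
print(mst)                               # [(1, 0, 1), (2, 1, 3), (3, 1, 2)]
print(sum(w for w, _, _ in mst))         # 6
```

Sorting dominates the running time, so this sketch runs in O(m log m) for m edges.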
Kruskal's Algorithm - Example
Prim’s Algorithm
• Robert Clay Prim
• Prim Approach:
– Choose an arbitrary start node v
– At any point in time, we have a connected component N containing v, and the remaining nodes V-N
– Choose the minimum weight edge from N to V-N
Prim's Algorithm:
    let T be a single vertex x
    while (T has fewer than n vertices) {
        find the smallest edge connecting T to G-T
        add it to T
    }
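A minimal Python sketch of the loop above, assuming a connected graph given as an adjacency dictionary. The binary heap used to find the smallest edge leaving T is one common choice, not something the slides prescribe. The function name and example graph are made up for illustration.

```python
import heapq

def prim(adj, start):
    """adj: dict mapping vertex -> list of (weight, neighbor). Returns tree edges (weight, u, v)."""
    in_tree = {start}                                   # T starts as a single vertex
    heap = [(w, start, v) for w, v in adj[start]]       # candidate edges leaving T
    heapq.heapify(heap)
    tree = []
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)                   # smallest candidate edge
        if v in in_tree:
            continue                                    # both endpoints already in T
        in_tree.add(v)
        tree.append((w, u, v))                          # add the edge to T
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return tree

# Hypothetical example: build the adjacency dictionary from an edge list.
example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
adj = {v: [] for v in range(4)}
for w, u, v in example_edges:
    adj[u].append((w, v))
    adj[v].append((w, u))

tree = prim(adj, 0)
print(tree)                               # [(1, 0, 1), (2, 1, 3), (3, 1, 2)]
print(sum(w for w, _, _ in tree))         # 6
```

Stale heap entries whose far endpoint has already joined T are skipped when popped, which avoids needing a decrease-key operation.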
Prim's Algorithm - Example
Boruvka's Algorithm
• Otakar Borůvka
– Czech scientist
– Introduced the MST problem and gave the first algorithm for it
– The original paper was written in Czech in 1926.
– The purpose was to efficiently provide electric coverage of Moravia.
Boruvka’s Algorithm
• Boruvka Approach:
– Prim “in parallel”
– Repeat the following procedure until the resulting graph becomes a single node (assuming distinct edge weights, or a consistent tie-breaking rule):
  • For each node u, mark its lightest incident edge.
  • Now, the marked edges form a forest F. Add the edges of F into the set of edges to be reported.
  • Contract each maximal subtree of F into a single node.
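The procedure above can be sketched in Python as follows. As a simplification chosen for this example, the graph is never contracted explicitly: a union-find structure tracks which original vertices have been merged, and each component plays the role of one contracted node. Ties are broken by edge index, and the function name and example graph are hypothetical.

```python
def boruvka(n, edges):
    """edges: list of (weight, u, v) with vertices 0..n-1. Returns the reported tree edges."""
    parent = list(range(n))

    def find(x):
        # Representative of the contracted node that contains x.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    components = n
    while components > 1:
        # Mark, for each contracted node, the index of its lightest incident edge.
        cheapest = {}
        for i, (w, u, v) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                      # edge lies inside a contracted node
            for r in (ru, rv):
                if r not in cheapest or (w, i) < (edges[cheapest[r]][0], cheapest[r]):
                    cheapest[r] = i
        if not cheapest:
            break                             # no edges leave any component: graph is disconnected
        # Report the marked edges and contract along them.
        for i in sorted(set(cheapest.values())):
            w, u, v = edges[i]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((w, u, v))
                components -= 1
    return tree

# Hypothetical example graph on 4 vertices.
example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
forest = boruvka(4, example_edges)
print(sum(w for w, _, _ in forest))           # 6
```

Each round at least halves the number of components, so there are at most about log2(n) rounds, each scanning all edges once.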
Boruvka’s Algorithm - Example
Usage of Minimum Spanning Trees
• Network design:
– telephone, electrical, hydraulic, TV cable, computer, road
• Approximation algorithms for NP-hard (non-deterministic polynomial-time hard) problems:
– traveling salesperson problem
• Cluster Analysis
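As an illustration of the approximation bullet above: for the traveling salesperson problem with distances obeying the triangle inequality, one well-known heuristic builds an MST and visits the vertices in preorder, which gives a tour at most twice the optimal length. The sketch below assumes points in the plane with Euclidean distances. The function names and sample points are made up for this example.

```python
import math

def mst_tsp_tour(points):
    """Return a tour (list of point indices) from a preorder walk of an MST."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Prim's algorithm on the complete graph, O(n^2).
    best = [math.inf] * n            # cheapest known cost to attach each vertex
    link = [None] * n                # tree vertex that realizes that cost
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if link[u] is not None:
            children[link[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist(u, v) < best[v]:
                best[v], link[v] = dist(u, v), u

    # Preorder walk of the tree rooted at vertex 0.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_length(points, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

square = [(0, 0), (0, 1), (1, 1), (1, 0)]     # hypothetical sample points
tour = mst_tsp_tour(square)
print(tour, tour_length(square, tour))
```

The factor-of-two bound relies on the triangle inequality; without it, skipping already-visited vertices during the walk could lengthen the tour.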
Clustering
• Definition
– Clustering is “the process of organizing objects into groups whose members are similar in some way”.
– A cluster is therefore a collection of objects which are “similar” to one another and “dissimilar” to the objects belonging to other clusters.
– A data mining technique
Why Clustering?
• Unsupervised learning process
• Pattern detection
• Simplifications
• Useful in data concept construction
The use of Clustering
• Data Mining
• Information Retrieval
• Text Mining
• Web Analysis
• Marketing
• Medical Diagnostics
• Image Analysis
• Bioinformatics
Image Analysis – MST Clustering
An image file before and after color clustering using HEMST (Hierarchical EMST clustering algorithm) and SEMST (Standard EMST clustering algorithm).
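A common way to cluster with a Euclidean MST is to delete its k-1 longest edges and treat the remaining connected pieces as k clusters. The sketch below illustrates that general idea only and may differ in detail from the HEMST and SEMST algorithms named above. The function name and sample points are made up. It runs Kruskal's algorithm and stops once k components remain, which has the same effect as building the full MST and removing its k-1 longest edges.

```python
import math

def mst_clusters(points, k):
    """Return a cluster label (0..k-1) for each point, using MST-based clustering."""
    n = len(points)
    # All pairwise Euclidean distances, shortest first.
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for w, u, v in edges:
        if components == k:
            break                    # the k-1 longest MST edges are never added
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1

    # Relabel component representatives as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]

# Hypothetical sample: two well-separated groups of three points each.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(mst_clusters(sample, 2))       # [0, 0, 0, 1, 1, 1]
```

For image color clustering, the points would be pixel colors (for example RGB triples) rather than 2D coordinates. The quadratic number of pairwise distances makes this sketch suitable only for small inputs.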
References
• Minimum Spanning Tree Based Clustering Algorithms - http://www4.ncsu.edu/~zjorgen/ictai06.pdf
• Minimal Spanning Tree based Fuzzy Clustering - http://www.waset.org/journals/waset/v8/v8-2.pdf
• Minimum Spanning Tree – Wikipedia - http://en.wikipedia.org/wiki/Minimum_spanning_tree
• Kruskal's algorithm - http://en.wikipedia.org/wiki/Kruskal%27s_algorithm
• Prim's Algorithm – Wikipedia - http://en.wikipedia.org/wiki/Prim%27s_algorithm
• Design and Analysis of Algorithms - http://www.ics.uci.edu/~eppstein/161/960206.html