Part I: Introductory Materials
Introduction to Graph Theory
Dr. Nagiza F. Samatova
Department of Computer Science
North Carolina State University
and
Computer Science and Mathematics Division
Oak Ridge National Laboratory
Graphs
Graph with 7 nodes and 16 edges
Figure labels: Nodes / Vertices, Edges; Undirected vs. Directed

$G = (V, E)$
$V = \{v_1, v_2, \ldots, v_n\}$
$E = \{e_k = (v_i, v_j) \mid v_i, v_j \in V,\ k = 1, \ldots, m\}$

Undirected: $(v_i, v_j) = (v_j, v_i)$
Directed: $(v_i, v_j) \neq (v_j, v_i)$
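
The distinction between ordered and unordered vertex pairs can be sketched in a few lines of Python. This is only an illustration: the vertex names, the edge set, and the helper `has_edge` are hypothetical and do not come from the slides.

```python
# Hypothetical example graph: vertex and edge names are illustrative only.
V = {"v1", "v2", "v3", "v4"}

# Directed edges are ordered pairs: (v1, v2) differs from (v2, v1).
E_directed = {("v1", "v2"), ("v2", "v3"), ("v3", "v1"), ("v3", "v4")}

# Undirected edges are unordered pairs, modeled here as frozensets,
# so {v1, v2} and {v2, v1} are the same edge.
E_undirected = {frozenset(e) for e in E_directed}


def has_edge(edges, u, v, directed):
    """Return True if the edge u-v (or u->v when directed) is in the edge set."""
    return (u, v) in edges if directed else frozenset((u, v)) in edges


print(has_edge(E_directed, "v2", "v1", directed=True))     # False
print(has_edge(E_undirected, "v2", "v1", directed=False))  # True
```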
Types of Graphs
• Undirected vs. Directed
• Attributed/Labeled (e.g., vertex, edge) vs. Unlabeled
• Weighted vs. Unweighted
• General vs. Bipartite (Multipartite)
• Trees (no cycles)
• Hypergraphs
• Simple vs. w/ loops vs. w/ multi-edges
Labeled Graphs and Induced Subgraphs
Bold: A subgraph induced by vertices b, c and d
Labeled graph w/ loops
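
An induced subgraph keeps the chosen vertices together with every edge of the original graph whose two endpoints are both chosen. A minimal sketch follows; the vertex names b, c and d echo the slide, but the edge set and the function `induced_subgraph` are assumptions made up for illustration.

```python
def induced_subgraph(vertices, edges, subset):
    """Keep the chosen vertices and every edge with both endpoints chosen."""
    keep = set(subset) & set(vertices)
    kept_edges = {(u, v) for (u, v) in edges if u in keep and v in keep}
    return keep, kept_edges


# Hypothetical undirected graph; each edge is stored once as a pair,
# and ("b", "b") is a loop.
V = {"a", "b", "c", "d", "e"}
E = {("a", "b"), ("b", "b"), ("b", "c"), ("c", "d"), ("b", "d"), ("d", "e")}

sub_V, sub_E = induced_subgraph(V, E, {"b", "c", "d"})
print(sorted(sub_V))
print(sorted(sub_E))
```

Note that the loop on b survives, because both of its endpoints lie in the chosen vertex set.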
Graph Isomorphism
Which graphs are isomorphic?
Figure: candidate graphs (A), (B), (C). Answer: C
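
Two graphs are isomorphic when some one-to-one relabeling of the vertices maps the edge set of one exactly onto the edge set of the other. The sketch below tries every relabeling, which takes factorial time and is only meant for tiny simple undirected graphs. The function name `find_isomorphism` and the three example graphs are hypothetical.

```python
from itertools import permutations


def find_isomorphism(edges_g, edges_h, n):
    """Brute force over all n! vertex bijections; vertices are 0..n-1."""
    g = {frozenset(e) for e in edges_g}
    h = {frozenset(e) for e in edges_h}
    if len(g) != len(h):
        return None
    for perm in permutations(range(n)):
        # Relabel every edge of the first graph and compare edge sets.
        if {frozenset((perm[u], perm[v])) for u, v in edges_g} == h:
            return perm
    return None


path = [(0, 1), (1, 2), (2, 3)]        # path 0-1-2-3
relabeled = [(2, 0), (0, 3), (3, 1)]   # the same path, vertices renamed
star = [(0, 1), (0, 2), (0, 3)]        # star with center 0

print(find_isomorphism(path, relabeled, 4) is not None)  # True
print(find_isomorphism(path, star, 4) is not None)       # False
```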
Graph Automorphism
Which graphs are automorphic?
An automorphism is an isomorphism of a graph onto itself; for labeled graphs it must also preserve the labels.
Figure: candidate graphs (A), (B), (C). Answer: B
Vertex degree, in-degree, out-degree
Figure (directed edge): the edge runs from its tail (t) to its head (h).
In-degree of a vertex: the number of incoming edges.
Out-degree of a vertex: the number of outgoing edges.
Degree of a vertex: the total number of incident edges (in-degree plus out-degree).
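
The three degree definitions can be computed directly from an edge list. In this sketch the edges are a made-up example, each written as a (tail, head) pair.

```python
from collections import Counter

# Hypothetical directed edges, each written as (tail, head).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]

out_degree = Counter(tail for tail, _ in edges)  # edges leaving each vertex
in_degree = Counter(head for _, head in edges)   # edges entering each vertex
vertices = sorted(set(out_degree) | set(in_degree))
degree = {v: in_degree[v] + out_degree[v] for v in vertices}

for v in vertices:
    print(v, "in:", in_degree[v], "out:", out_degree[v], "total:", degree[v])
```

Every edge contributes one to some in-degree and one to some out-degree, so the total degrees sum to twice the number of edges.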
Graph Representation and Formats
• Adjacency Matrix (vertex vs. vertex)
• Incidence Matrix (vertex vs. edge)
• Sparse vs. Dense Matrices
• DIMACS file format
• In R: igraph object
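
The first three representations listed above can be compared on one small graph. The sketch below uses plain Python lists rather than any particular library; the vertex names and edges are hypothetical, and the adjacency list stands in for a sparse format.

```python
# Hypothetical undirected graph on four vertices.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)

# Adjacency matrix: vertex vs. vertex, symmetric for an undirected graph.
adjacency = [[0] * n for _ in range(n)]
for u, v in edges:
    adjacency[index[u]][index[v]] = 1
    adjacency[index[v]][index[u]] = 1

# Incidence matrix: vertex vs. edge, each column marks the two endpoints.
incidence = [[0] * len(edges) for _ in range(n)]
for k, (u, v) in enumerate(edges):
    incidence[index[u]][k] = 1
    incidence[index[v]][k] = 1

# Sparse alternative: store only the neighbors that actually exist.
adj_list = {v: [w for w in vertices if adjacency[index[v]][index[w]]]
            for v in vertices}

print(adjacency)
print(incidence)
print(adj_list)
```

A dense matrix uses space quadratic in the number of vertices regardless of how many edges exist, which is why sparse formats matter for large graphs with few edges.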
Adjacency Matrix Representation
Figure: graph on vertices A(1), A(2), A(3), A(4), B(5), B(6), B(7), B(8)

      A(1) A(2) A(3) A(4) B(5) B(6) B(7) B(8)
A(1)   1    1    1    0    1    0    0    0
A(2)   1    1    0    1    0    1    0    0
A(3)   1    0    1    1    0    0    1    0
A(4)   0    1    1    1    0    0    0    1
B(5)   1    0    0    0    1    1    1    0
B(6)   0    1    0    0    1    1    0    1
B(7)   0    0    1    0    1    0    1    1
B(8)   0    0    0    1    0    1    1    1
Figure: the same graph drawn with a different vertex numbering

      A(1) A(2) A(3) A(4) B(5) B(6) B(7) B(8)
A(1)   1    1    0    1    0    1    0    0
A(2)   1    1    1    0    0    0    1    0
A(3)   0    1    1    1    1    0    0    0
A(4)   1    0    1    1    0    0    0    1
B(5)   0    0    1    0    1    0    1    1
B(6)   1    0    0    0    0    1    1    1
B(7)   0    1    0    0    1    1    1    0
B(8)   0    0    0    1    1    1    0    1
The representation is NOT unique: it depends on the vertex ordering. Algorithms can be order-sensitive.
Src: “Introduction to Data Mining” by Kumar et al
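
Why the adjacency matrix is not unique can be seen on a graph much smaller than the one in the figure. The sketch below lists the vertices of a three-vertex path in two different orders; the helper `reorder` and the example graph are hypothetical.

```python
def reorder(matrix, order):
    """Return the adjacency matrix with rows and columns in the given vertex order."""
    return [[matrix[i][j] for j in order] for i in order]


# Path 0-1-2 with its vertices listed in the order 0, 1, 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

# The same graph with its vertices listed in the order 1, 0, 2.
B = reorder(A, [1, 0, 2])

print(B)       # [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
print(A == B)  # False
```

The two matrices differ entry by entry, yet both describe the same graph, so an algorithm that scans rows in order may behave differently on each.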
Families of Graphs
• Cliques
• Path and simple path
• Cycle
• Tree
• Connected graphs
Read the book chapter for definitions and examples.
Complete Graph, or Clique
Each pair of vertices is connected.
The CLIQUE Problem
Maximum Clique of Size 5
Clique: a complete subgraph
Maximal Clique: a clique that cannot be enlarged by adding any more vertices
Maximum Clique: the largest maximal clique in the graph

$\mathrm{CLIQUE} = \{\langle G, k \rangle \mid G \text{ has a clique of size } k\}$
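
The difference between maximal and maximum cliques shows up even on a four-vertex graph. The sketch below enumerates every vertex subset, which takes exponential time and is only meant to make the definitions concrete. The function names and the example graph are hypothetical.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """True if every pair of the given vertices is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def maximal_cliques(adj):
    """Brute force: keep the cliques that are not contained in a larger clique."""
    vertices = sorted(adj)
    cliques = [set(c)
               for r in range(1, len(vertices) + 1)
               for c in combinations(vertices, r)
               if is_clique(adj, c)]
    return [c for c in cliques if not any(c < other for other in cliques)]


# Hypothetical graph: triangle 1-2-3 plus the extra edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

maximal = maximal_cliques(adj)
maximum = max(maximal, key=len)
print(maximal)  # [{3, 4}, {1, 2, 3}]
print(maximum)  # {1, 2, 3}
```

Here {3, 4} is maximal because no vertex can be added to it, but it is not maximum because {1, 2, 3} is larger.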
Does this graph contain a 4-clique?
Indeed it does!
But if it had not, what evidence would have been needed?
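
One way to read the question: a "yes" answer needs only a single witness, while a brute-force "no" answer has to rule out every candidate subset. The sketch below counts how many subsets are examined in each case; the function `find_k_clique` and both example graphs are hypothetical.

```python
from itertools import combinations
from math import comb


def find_k_clique(adj, k):
    """Return (witness, subsets_checked); witness is None if no k-clique exists."""
    checked = 0
    for nodes in combinations(sorted(adj), k):
        checked += 1
        if all(v in adj[u] for u, v in combinations(nodes, 2)):
            return set(nodes), checked
    return None, checked


# Hypothetical "yes" instance: the complete graph on 4 vertices.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
print(find_k_clique(k4, 4))     # ({0, 1, 2, 3}, 1)

# Hypothetical "no" instance: a 5-cycle contains no triangle.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(find_k_clique(cycle, 3))  # (None, 10)
print(comb(5, 3))               # 10
```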
Problem: Decision, Optimization or Search
Problem versions:
• Decision: "Yes"/"No" answer
• Optimization: parameter k, max/min
• Search: actual solution
• Enumeration: all solutions
(Diagram annotation: self-reduction)

Formulate each version for the CLIQUE problem.

• Which problem is harder to solve?
• If we solve the Decision problem, can we use it for the others?
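
For CLIQUE, a decision procedure can be reused to answer the optimization and search versions, which is one standard reading of the self-reduction annotation. The sketch below is an illustration under that assumption: `has_clique` is a brute-force stand-in for a decision oracle, and all function names and the example graph are hypothetical.

```python
from itertools import combinations


def has_clique(adj, k):
    """Decision oracle (brute-force stand-in): is there a clique of size k?"""
    return any(all(v in adj[u] for u, v in combinations(nodes, 2))
               for nodes in combinations(adj, k))


def remove_vertex(adj, x):
    """Return a copy of the graph without vertex x."""
    return {u: nbrs - {x} for u, nbrs in adj.items() if u != x}


def max_clique_size(adj):
    """Optimization via decision: the largest k that gets a 'yes'."""
    k = 0
    while k < len(adj) and has_clique(adj, k + 1):
        k += 1
    return k


def find_clique(adj, k):
    """Search via decision: drop each vertex whose removal keeps the answer 'yes'."""
    if not has_clique(adj, k):
        return None
    current = adj
    for v in sorted(adj):
        smaller = remove_vertex(current, v)
        if has_clique(smaller, k):
            current = smaller
    return set(current)


# Hypothetical graph: triangle 1-2-3 plus the extra edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(max_clique_size(adj))  # 3
print(find_clique(adj, 3))   # {1, 2, 3}
```

The search loop calls the oracle once per vertex, so a polynomial-time decision procedure would give a polynomial-time search procedure. Every vertex that survives is needed by every remaining k-clique, so the survivors form exactly one clique of size k.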
Refresher: Class P and Class NP
Definition: P (NP) is the class of languages/problems that are decidable in polynomial time on a (non-)deterministic single-tape Turing machine.
$P = \bigcup_k \mathrm{DTIME}(n^k)$
$NP = \bigcup_k \mathrm{NTIME}(n^k)$

What does "NP" stand for?
• Not "non-polynomial"
• Non-deterministic polynomial
• Equivalently: polynomially verifiable
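
"Polynomially verifiable" means that a proposed solution (a certificate) can be checked quickly, even if finding one may be hard. For CLIQUE the certificate is a list of k vertices, and checking it takes time polynomial in the size of the graph. The function `verify_clique` and the example graph below are hypothetical.

```python
def verify_clique(adj, k, certificate):
    """Polynomial-time check that `certificate` names k mutually adjacent vertices."""
    nodes = set(certificate)
    if len(nodes) != k or not nodes <= set(adj):
        return False
    return all(v in adj[u] for u in nodes for v in nodes if u != v)


# Hypothetical graph: triangle 1-2-3 plus the extra edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(verify_clique(adj, 3, [1, 2, 3]))  # True
print(verify_clique(adj, 3, [2, 3, 4]))  # False
```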
P vs. NP
The Classic Complexity Theory View:
• P: "easy"
• NP: "hard"
• $\Sigma_2^P$, …, PSPACE: "forget about it"
“About ten years ago some computer scientists came by and said they heard we have some really cool problems. They showed that the problems are NP-complete and went away!”
Classical Graph Theory Problems
CSC505: Algorithms, CSC707: Complexity Theory, CSC5??: Graph Theory

NP-hard Problems:
• Longest Path
• Maximum Clique
• Minimum Vertex Cover
• Hamiltonian Path/Cycle
• Traveling Salesman (TSP)
• Maximum Independent Set
• Minimum Dominating Set
• Graph/Subgraph Isomorphism (subgraph isomorphism is NP-complete; graph isomorphism is not known to be NP-hard)
• Maximum Common Subgraph
• …
Graph Mining Problems
CSC 422/522 and Our Book

• Clustering + Maximal Clique Enumeration
• Classification
• Association Rule Mining + Frequent Subgraph Mining
• Anomaly Detection
• Similarity/Dissimilarity/Distance Measures
• Graph-based Dimension Reduction
• Link Analysis
• …

Many graph mining problems have to deal with classical graph problems as part of their data mining pipelines.
Dealing with Computational Intractability
• Exact Algorithms:
  – Small graph problems
  – Small parameters to graph problems
  – Special classes of graphs (e.g., bounded tree-width)
• Approximation Polynomial-Time Algorithms ($O(n^c)$)
  – Guaranteed error bound on the solution
• Heuristic Polynomial-Time Algorithms
  – No guarantee on the quality of the solution
  – Low-degree polynomial solutions
Our focus
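
A heuristic polynomial-time algorithm can be illustrated on the maximum clique problem. The sketch below is one simple greedy strategy chosen for illustration, not a method taken from the slides: it scans vertices by decreasing degree and keeps each vertex that is adjacent to everything kept so far. The function `greedy_clique` and both example graphs are hypothetical.

```python
def greedy_clique(adj):
    """Greedy heuristic: always returns a clique, but not always a maximum one."""
    clique = []
    # Highest degree first; ties are broken by vertex name for determinism.
    for v in sorted(adj, key=lambda u: (-len(adj[u]), u)):
        if all(v in adj[u] for u in clique):
            clique.append(v)
    return set(clique)


# Hypothetical graph where the heuristic succeeds: triangle 1-2-3 plus edge 3-4.
easy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(greedy_clique(easy))  # {1, 2, 3}

# Hypothetical graph where it fails: a star centered at 0 next to a triangle.
hard = {0: {4, 5, 6, 7}, 1: {2, 3}, 2: {1, 3}, 3: {1, 2},
        4: {0}, 5: {0}, 6: {0}, 7: {0}}
print(greedy_clique(hard))  # {0, 4}
```

On the second graph the high-degree hub lures the heuristic into a clique of size 2, while the triangle of size 3 goes unnoticed. The running time is polynomial, but there is no guarantee on the quality of the result.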