October 21, 2008 19. Graphs 1
Graphs
CS 221
(slides from textbook author)
October 21, 2008 19. Graphs 2
Graphs
ORD
DFW
SFO
LAX
802
1743
1843
1233
337
October 21, 2008 19. Graphs 3
Outline and Reading• Graphs (§6.1)
– Definition– Applications– Terminology– Properties– ADT
• Data structures for graphs (§6.2)– Edge list structure– Adjacency list structure– Adjacency matrix structure
October 21, 2008 19. Graphs 4
Graph• A graph is a pair (V, E), where
– V is a set of nodes, called vertices– E is a collection of pairs of vertices, called edges– Vertices and edges are positions and store elements
• Example:– A vertex represents an airport and stores the three-letter airport code– An edge represents a flight route between two airports and stores the mileage of the
route
ORD PVD
MIADFW
SFO
LAX
LGA
HNL
849
802
13871743
1843
10991120
1233337
2555
142
October 21, 2008 19. Graphs 5
Edge Types• Directed edge
– ordered pair of vertices (u,v)– first vertex u is the origin– second vertex v is the destination– e.g., a flight
• Undirected edge– unordered pair of vertices (u,v)– e.g., a flight route
• Directed graph– all the edges are directed– e.g., flight network
• Undirected graph– all the edges are undirected– e.g., route network
ORD PVDflight
AA 1206
ORD PVD849
miles
October 21, 2008 19. Graphs 6
Edge Types• Directed edge
– ordered pair of vertices (u,v)– first vertex u is the origin– second vertex v is the destination– e.g., a flight
• Undirected edge– unordered pair of vertices (u,v)– e.g., a flight route
• Directed graph– all the edges are directed– e.g., flight network
• Undirected graph– all the edges are undirected– e.g., route network
ORD PVDflight
AA 1206
ORD PVD849
miles
These are edges from
two distinct graphs. FLVS-mod
October 21, 2008 19. Graphs 7
ORD PVD
MIADFW
SFO
LAX
LGA
HNL
AA1206
UA
7654
AA5432AA8901
AA4567
UA98UA567UA6789
UA
45UA1234
AA67
ORD PVD
MIADFW
SFO
LAX
LGA
HNL
849
802
13871743
1843
10991120
1233
337
2555
142
FLVS
Undirected graph
Directed graph
October 21, 2008 19. Graphs 8
ORD PVD
MIADFW
SFO
LAX
LGA
HNL
AA1206
UA
7654
AA5432AA8901
AA4567
UA98UA567UA6789
UA
45UA1234
AA67
FLVS
Another directed graph
ORD PVD
MIADFW
SFO
LAX
LGA
HNL
AA1206
UA
7654
AA5432AA8901
AA4567
UA98UA567UA6789
UA
45UA1234
AA67
Directed graph
AA1207
UA1235
October 21, 2008 19. Graphs 9
Another comparison of directed and undirected graphs
• The streets of downtown Morgantown
• Directed graph: drivers of vehicles must drive in the direction indicated by arrow signs on the one-way streets.
• Undirected graph: pedestrians may walk either direction on the one-way streets.
Willey Forest Walnut Pleasant
FLVS
October 21, 2008 19. Graphs 10
John
DavidPaul
brown.edu
cox.net
cs.brown.edu
att.netqwest.net
math.brown.edu
cslab1bcslab1aApplications
• Electronic circuits– Printed circuit board
– Integrated circuit
• Transportation networks– Highway network
– Flight network
• Computer networks– Local area network
– Internet
– Web
• Databases– Entity-relationship diagram
October 21, 2008 19. Graphs 11
Entity-relationship model
• An entity-relationship model (ERM) is an abstract conceptual representation of structured data. Entity-relationship modeling is a relational schema database modeling method, used in software engineering to produce a type of conceptual data model (or semantic data model) of a system, often a relational database, and its requirements in a top-down fashion. Diagrams created using this process are called entity-relationship diagrams, or ER diagrams or ERDs for short. The definitive reference for entity relationship modelling is generally given as Peter Chen's 1976 paper[1]. However, variants of the idea existed previously (see for example A. P. G. Brown[2]) and have been devised subsequently.
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 12
Entity-relationship model: overview(1)
• The first stage of information system design uses these models during the requirements analysis to describe information needs or the type of information that is to be stored in a database. The data modeling technique can be used to describe any ontology (i.e. an overview and classifications of used terms and their relationships) for a certain universe of discourse (i.e. area of interest). In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Note that sometimes, both of these phases are referred to as "physical design".
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 13
Entity-relationship model: overview (2)
• There are a number of conventions for entity-relationship diagrams (ERDs). The classical notation is described in the remainder of this article, and mainly relates to conceptual modeling. There are a range of notations more typically employed in logical and physical database design, including IDEF1x (ICAM DEFinition Language) and dimensional modeling.
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 14
Entitity-relationship model: connection: entity (1)
• An entity may be defined as a thing which is recognised as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world (Beynon-Davies, 2004).
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 15
Entity-relationship model: connection: entity (2)
• An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. Although the term entity is the one most commonly used, following Chen we should really distinguish between an entity and an entity-type. An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for this term.
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 16
Entity-relationship model: connection: entity (3)
• Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical theorem. Entities are represented as rectangles.
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 17
Entity-relationship model: connection: relationship
• A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs, linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship.
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 18
Entity-relationship model: attributes
• Entities and relationships can both have attributes. Examples: an employee entity might have a Social Security Number (SSN) attribute; the proved relationship may have a date attribute. Attributes are represented as ellipses connected to their owning entity sets by a line.
• http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 19
Entity-relationship model: graph representation
http://en.wikipedia.org/wiki/Entity-relationship_model
October 21, 2008 19. Graphs 20
Terminology• End vertices (or endpoints) of an edge
– U and V are the endpoints of a
• Edges incident on a vertex– a, d, and b are incident on V
• Adjacent vertices– U and V are adjacent
• Degree of a vertex– X has degree 5
• Parallel edges– h and i are parallel edges
• Self-loop– j is a self-loop
XU
V
W
Z
Y
a
c
b
e
d
f
g
h
i
j
October 21, 2008 19. Graphs 21
The degree of a vertexis the number of edgesincident on that vertex.The degree of a graphis the maximum degree
of any vertex.
Terminology• End vertices (or endpoints) of an edge
– U and V are the endpoints of a
• Edges incident on a vertex– a, d, and b are incident on V
• Adjacent vertices– U and V are adjacent
• Degree of a vertex– X has degree 5
• Parallel edges– h and i are parallel edges
• Self-loop– j is a self-loop
XU
V
W
Z
Y
a
c
b
e
d
f
g
h
i
j
FLVS mod
October 21, 2008 19. Graphs 22
P1
Terminology (cont.)• Path
– sequence of alternating vertices and edges
– begins with a vertex– ends with a vertex– each edge is preceded and followed by
its endpoints
• Simple path– path such that all its vertices and edges
are distinct
• Examples– P1=(V,b,X,h,Z) is a simple path– P2=(U,c,W,e,X,g,Y,f,W,d,V) is a path
that is not simple
XU
V
W
Z
Y
a
c
b
e
d
f
g
hP2
October 21, 2008 19. Graphs 23
Terminology (cont.)• Cycle
– circular sequence of alternating vertices and edges
– each edge is preceded and followed by its endpoints
• Simple cycle– cycle such that all its vertices and edges are
distinct
• Examples– C1=(V,b,X,g,Y,f,W,c,U,a,) is a simple
cycle
– C2=(U,c,W,e,X,g,Y,f,W,d,V,a,) is a cycle that is not simple
C1
XU
V
W
Z
Y
a
c
b
e
d
f
g
hC2
October 21, 2008 19. Graphs 24
Cycle: an additional definition
• We also use the term "cycle" to refer to a graph all of whose edges form a single cycle.
• The notation Cn denotes a cycle on n vertices. FLVS
C4
C5
C6
October 21, 2008 19. Graphs 25
PropertiesNotation
n number of vertices
m number of edges
deg(v) degree of vertex v
Property 1
v deg(v) 2m
Proof: each edge is counted twice
Property 2In an undirected graph with no
self-loops and no multiple edges
m n (n 1)2Proof: each vertex has degree at
most (n 1)
What is the bound for a directed graph?
Example
n 4
m 6
deg(v) 3
October 21, 2008 19. Graphs 26
K4, the complete graph on 4 vertices
Example
n 4
m 6
deg(v) 3
This is a special graph, the complete graph on 4 vertices, denoted K4.
FLVS-addition
October 21, 2008 19. Graphs 27
comparison of Cn and Kn
K4
K5
K6
C4
C5
C6
October 21, 2008 19. Graphs 28
Main Methods of the Graph ADT
• Vertices and edges– are positions
– store elements
• Accessor methods– aVertex()
– incidentEdges(v)
– endVertices(e)
– isDirected(e)
– origin(e)
– destination(e)
– opposite(v, e)
– areAdjacent(v, w)
• Update methods– insertVertex(o)
– insertEdge(v, w, o)
– insertDirectedEdge(v, w, o)
– removeVertex(v)
– removeEdge(e)
• Generic methods– numVertices()
– numEdges()
– vertices()
– edges()
If e is an edge between v and w, opposite(v,e) returns w
Returns an arbitrary vertex
October 21, 2008 19. Graphs 29
Edge List Structure
• Vertex object– element– reference to position in vertex
sequence
• Edge object– element– origin vertex object– destination vertex object– reference to position in edge
sequence
• Vertex sequence– sequence of vertex objects
• Edge sequence– sequence of edge objects
v
u
w
a c
b
a
zd
u v w z
b c d
October 21, 2008 19. Graphs 30
Adjacency List Structure
• Edge list structure• Incidence sequence for
each vertex– sequence of references
to edge objects of incident edges
• Augmented edge objects– references to associated
positions in incidence sequences of end vertices
u
v
w
a b
a
u v w
b
October 21, 2008 19. Graphs 31
Adjacency Matrix Structure
• Edge list structure• Augmented vertex objects
– Integer key (index) associated with vertex
• 2D adjacency array– Reference to edge object for
adjacent vertices– Null for non nonadjacent
vertices
• The “old fashioned” version just has 0 for no edge and 1 for edge
u
v
w
a b
0 1 2
0
1
2 a
u v w0 1 2
b
October 21, 2008 19. Graphs 32
Asymptotic Performance• n vertices, m edges
• no parallel edges
• no self-loops
• Bounds are “big-Oh”
EdgeList
AdjacencyList
Adjacency Matrix
Space n m n m n2
incidentEdges(v) m deg(v) n
areAdjacent (v, w) m min(deg(v), deg(w)) 1
insertVertex(o) 1 1 n2
insertEdge(v, w, o) 1 1 1
removeVertex(v) m deg(v) n2
removeEdge(e) 1 1 1
October 21, 2008 19. Graphs 33
Depth-First Search
DB
A
C
E
October 21, 2008 19. Graphs 34
Outline and Reading• Definitions (§6.1)
– Subgraph
– Connectivity
– Spanning trees and forests
• Depth-first search (§6.3.1)– Algorithm
– Example
– Properties
– Analysis
• Applications of DFS (§6.5)– Path finding
– Cycle finding
October 21, 2008 19. Graphs 35
Subgraphs• A subgraph S of a graph G is a graph
such that – The vertices of S are a subset of the
vertices of G
– The edges of S are a subset of the edges of G
• A spanning subgraph of G is a subgraph that contains all the vertices of G
Subgraph
Spanning subgraph
October 21, 2008 19. Graphs 36
Connectivity
• A graph is connected if there is a path between every pair of vertices
• A connected component of a graph G is a maximal connected subgraph of G
Connected graph
Non connected graph with two connected components
October 21, 2008 19. Graphs 37
Trees and Forests• A (free) tree is an undirected
graph T such that– T is connected– T has no cyclesThis definition of tree is different
from the one of a rooted tree
• A forest is an undirected graph without cycles
• The connected components of a forest are trees
Tree
Forest
October 21, 2008 19. Graphs 38
Spanning Trees and Forests• A spanning tree of a connected
graph is a spanning subgraph that is a tree
• A spanning tree is not unique unless the graph is a tree
• Spanning trees have applications to the design of communication networks
• A spanning forest of a graph is a spanning subgraph that is a forest
Graph
Spanning tree
October 21, 2008 19. Graphs 39
Depth-First Search• Depth-first search (DFS) is a
general technique for traversing a graph
• A DFS traversal of a graph G – Visits all the vertices and
edges of G– Determines whether G is
connected– Computes the connected
components of G– Computes a spanning forest of
G
• DFS on a graph with n vertices and m edges takes O(nm ) time
• DFS can be further extended to solve other graph problems– Find and report a path between
two given vertices– Find a cycle in the graph
• Depth-first search is to graphs what Euler tour is to binary trees
October 21, 2008 19. Graphs 40
DFS Algorithm• The algorithm uses a mechanism for setting
and getting “labels” of vertices and edgesAlgorithm DFS(G, v)
Input graph G and a start vertex v of G Output labeling of the edges of G
in the connected component of v as discovery edges and back edges
setLabel(v, VISITED)for all e G.incidentEdges(v)
if getLabel(e) UNEXPLORED
w G.opposite(v,e)if getLabel(w)
UNEXPLOREDsetLabel(e, DISCOVERY)DFS(G, w)
elsesetLabel(e, BACK)
Algorithm DFS(G)Input graph GOutput labeling of the edges of G
as discovery edges andback edges
for all u G.vertices()setLabel(u, UNEXPLORED)
for all e G.edges()setLabel(e, UNEXPLORED)
for all v G.vertices()if getLabel(v)
UNEXPLOREDDFS(G, v)
October 21, 2008 19. Graphs 41
Example
DB
A
C
E
DB
A
C
E
DB
A
C
E
discovery edgeback edge
A visited vertex
A unexplored vertex
unexplored edge
October 21, 2008 19. Graphs 42
Example (cont.)
DB
A
C
E
DB
A
C
E
DB
A
C
E
DB
A
C
E
October 21, 2008 19. Graphs 43
DFS and Maze Traversal • The DFS algorithm is
similar to a classic strategy for exploring a maze– We mark each intersection,
corner and dead end (vertex) visited
– We mark each corridor (edge ) traversed
– We keep track of the path back to the entrance (start vertex) by means of a rope (recursion stack)
October 21, 2008 19. Graphs 44
Properties of DFSProperty 1
DFS(G, v) visits all the vertices and edges in the connected component of v
Property 2The discovery edges labeled by DFS(G, v) form a spanning tree of the connected component of v
DB
A
C
E
October 21, 2008 19. Graphs 45
Analysis of DFS
• Setting/getting a vertex/edge label takes O(1) time• Each vertex is labeled twice
– once as UNEXPLORED– once as VISITED
• Each edge is labeled twice– once as UNEXPLORED– once as DISCOVERY or BACK
• Method incidentEdges is called once for each vertex• DFS runs in O(n m) time provided the graph is represented by the
adjacency list structure– Recall that v deg(v) 2m
October 21, 2008 19. Graphs 46
Path Finding• We can specialize the DFS
algorithm to find a path between two given vertices u and z using the template method pattern
• We call DFS(G, u) with u as the start vertex
• We use a stack S to keep track of the path between the start vertex and the current vertex
• As soon as destination vertex z is encountered, we return the path as the contents of the stack
Algorithm pathDFS(G, v, z)setLabel(v, VISITED)S.push(v)if v z
return S.elements()for all e G.incidentEdges(v)
if getLabel(e) UNEXPLORED
w opposite(v, e)if getLabel(w)
UNEXPLORED setLabel(e, DISCOVERY)S.push(e)pathDFS(G, w, z)S.pop() { e gets
popped }else
setLabel(e, BACK)S.pop() { v gets popped }
October 21, 2008 19. Graphs 47
Cycle Finding• We can specialize the DFS
algorithm to find a simple cycle using the template method pattern
• We use a stack S to keep track of the path between the start vertex and the current vertex
• As soon as a back edge (v, w) is encountered, we return the cycle as the portion of the stack from the top to vertex w
Algorithm cycleDFS(G, v, z)setLabel(v, VISITED)S.push(v)for all e G.incidentEdges(v)
if getLabel(e) UNEXPLOREDw opposite(v,e)S.push(e)if getLabel(w) UNEXPLORED
setLabel(e, DISCOVERY)pathDFS(G, w, z)S.pop()
elseC new empty stackrepeat
o S.pop()C.push(o)
until o wreturn C.elements()
S.pop()
October 21, 2008 19. Graphs 48
Breadth-First Search
CB
A
E
D
L0
L1
FL2
October 21, 2008 19. Graphs 49
Outline and Reading• Breadth-first search (§6.3.3)
– Algorithm
– Example
– Properties
– Analysis
– Applications
• DFS vs. BFS (§6.3.3)– Comparison of applications
– Comparison of edge labels
October 21, 2008 19. Graphs 50
Breadth-First Search• Breadth-first search (BFS) is a
general technique for traversing a graph
• A BFS traversal of a graph G – Visits all the vertices and
edges of G– Determines whether G is
connected– Computes the connected
components of G– Computes a spanning forest of
G
• BFS on a graph with n vertices and m edges takes O(nm ) time
• BFS can be further extended to solve other graph problems
– Find and report a path with the minimum number of edges between two given vertices
– Find a simple cycle, if there is one
October 21, 2008 19. Graphs 51
BFS Algorithm• The algorithm uses a mechanism
for setting and getting “labels” of vertices and edges
Algorithm BFS(G, s)L0 new empty sequenceL0.insertLast(s)setLabel(s, VISITED)i 0 while Li.isEmpty()
Li 1 new empty sequence for all v Li.elements()
for all e G.incidentEdges(v) if getLabel(e) UNEXPLORED
w opposite(v,e)if getLabel(w)
UNEXPLOREDsetLabel(e, DISCOVERY)setLabel(w, VISITED)Li 1.insertLast(w)
elsesetLabel(e, CROSS)
i i 1
Algorithm BFS(G)Input graph GOutput labeling of the edges
and partition of the vertices of G
for all u G.vertices()setLabel(u, UNEXPLORED)
for all e G.edges()setLabel(e, UNEXPLORED)
for all v G.vertices()if getLabel(v)
UNEXPLORED
BFS(G, v)
October 21, 2008 19. Graphs 52
Example
CB
A
E
D
discovery edgecross edge
A visited vertex
A unexplored vertex
unexplored edge
L0
L1
F
CB
A
E
D
L0
L1
F
CB
A
E
D
L0
L1
F
October 21, 2008 19. Graphs 53
Example (cont.)
CB
A
E
D
L0
L1
F
CB
A
E
D
L0
L1
FL2
CB
A
E
D
L0
L1
FL2
CB
A
E
D
L0
L1
FL2
October 21, 2008 19. Graphs 54
Example (cont.)
CB
A
E
D
L0
L1
FL2
CB
A
E
D
L0
L1
FL2
CB
A
E
D
L0
L1
FL2
October 21, 2008 19. Graphs 55
PropertiesNotation
Gs: connected component of s
Property 1BFS(G, s) visits all the vertices and edges of Gs
Property 2The discovery edges labeled by BFS(G, s) form a spanning tree Ts of Gs
Property 3For each vertex v in Li
– The path of Ts from s to v has i edges – Every path from s to v in Gs has at least i
edgesCB
A
E
D
L0
L1
FL2
CB
A
E
D
F
October 21, 2008 19. Graphs 56
Analysis
• Setting/getting a vertex/edge label takes O(1) time• Each vertex is labeled twice
– once as UNEXPLORED– once as VISITED
• Each edge is labeled twice– once as UNEXPLORED– once as DISCOVERY or CROSS
• Each vertex is inserted once into a sequence Li • Method incidentEdges is called once for each vertex• BFS runs in O(n m) time provided the graph is represented by the
adjacency list structure– Recall that v deg(v) 2m
October 21, 2008 19. Graphs 57
Applications• Using the template method pattern, we can specialize the BFS
traversal of a graph G to solve the following problems in O(n m) time– Compute the connected components of G
– Compute a spanning forest of G
– Find a simple cycle in G, or report that G is a forest
– Given two vertices of G, find a path in G between them with the minimum number of edges, or report that no such path exists
October 21, 2008 19. Graphs 58
DFS vs. BFS
CB
A
E
D
L0
L1
FL2
CB
A
E
D
F
DFS BFS
Applications DFS BFSSpanning forest, connected components, paths, cycles
Shortest paths
Biconnected components
October 21, 2008 19. Graphs 59
DFS vs. BFS (cont.)Back edge (v,w)
– w is an ancestor of v in the tree of discovery edges
Cross edge (v,w)– w is in the same level as v or in the
next level in the tree of discovery edges
CB
A
E
D
L0
L1
FL2
CB
A
E
D
F
DFS BFS