digraph (directed graph)
• A digraph is a pair of sets (V, E) such that
each element of E is an ordered pair of elements in V.
• A path is an alternating sequence of vertices and edges in which every edge is directed from the vertex before it to the vertex after it.
string-labeled digraph
• A string-labeled digraph is a digraph in which each edge is labeled by a string.
• In a string-labeled digraph, every path is associated with a string, obtained by concatenating the labels of the edges on the path in the order they are traversed.
• This string is called the label of the path.
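The definitions above can be sketched in a few lines of Python. This is only an illustration: the vertex numbers, the edge list `E`, and the helper names `is_path` and `path_label` are hypothetical, and an edge is assumed to be stored as a triple (tail, label, head).

```python
EPSILON = ""  # the empty string, so an ε label adds nothing when concatenated

# Hypothetical example digraph: vertices are ints, edges are (tail, label, head).
V = {0, 1, 2, 3}
E = [(0, "ab", 1), (1, EPSILON, 2), (2, "c", 3), (0, "a", 2)]

def is_path(edges, path):
    """True if every edge of `path` is in `edges` and consecutive edges
    are joined head to tail, i.e. all edges point in the same direction."""
    return all(e in edges for e in path) and all(
        path[i][2] == path[i + 1][0] for i in range(len(path) - 1)
    )

def path_label(path):
    """Concatenate the edge labels in the order the path traverses them."""
    return "".join(label for _, label, _ in path)

p = [(0, "ab", 1), (1, EPSILON, 2), (2, "c", 3)]
assert is_path(E, p)
print(path_label(p))  # abc
```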
G(r)
• For each regular expression r, we can construct a digraph G(r) with edges labeled by symbols and ε as follows.
• If r=Φ, then
• If r≠Φ, then
Theorem 1
• G(r) has the property that a string x belongs to r if and only if x is the label of a path from the initial vertex to the final vertex.
• Proof is done by induction on r.
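The figures that define G(r) are not reproduced in this text, so the following sketch assumes a Thompson-style construction; the exact shape of the graph and its number of ε-edges may differ from the G(r) intended here. Regular expressions are written as nested tuples, and the names `build` and `labels_some_path` are hypothetical. The recursion in `build` mirrors the induction on r used in the proof of Theorem 1.

```python
from itertools import count

EPS = ""  # label of an ε-edge

def build(r, edges, fresh):
    """Append the edges of a graph for r to `edges`; return (initial, final)."""
    kind = r[0]
    if kind in ("empty", "eps", "sym"):
        s, f = next(fresh), next(fresh)
        if kind == "eps":
            edges.append((s, EPS, f))
        elif kind == "sym":
            edges.append((s, r[1], f))
        return s, f  # for "empty" no edge is added, so no path joins s to f
    if kind == "concat":
        s1, f1 = build(r[1], edges, fresh)
        s2, f2 = build(r[2], edges, fresh)
        edges.append((f1, EPS, s2))
        return s1, f2
    if kind == "union":
        s1, f1 = build(r[1], edges, fresh)
        s2, f2 = build(r[2], edges, fresh)
        s, f = next(fresh), next(fresh)
        edges += [(s, EPS, s1), (s, EPS, s2), (f1, EPS, f), (f2, EPS, f)]
        return s, f
    if kind == "star":
        s1, f1 = build(r[1], edges, fresh)
        s, f = next(fresh), next(fresh)
        edges += [(s, EPS, s1), (f1, EPS, f), (s, EPS, f), (f1, EPS, s1)]
        return s, f
    raise ValueError(f"unknown node {kind!r}")

def labels_some_path(edges, s, f, x):
    """True if x is the label of some path from s to f."""
    seen, stack = set(), [(s, 0)]  # (vertex, number of symbols of x consumed)
    while stack:
        v, i = stack.pop()
        if (v, i) in seen:
            continue
        seen.add((v, i))
        if v == f and i == len(x):
            return True
        for u, a, w in edges:
            if u == v and x.startswith(a, i):
                stack.append((w, i + len(a)))
    return False

# Hypothetical example: r = (a+b)*·a, the strings over {a, b} ending in a.
r = ("concat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
edges = []
s, f = build(r, edges, count())
print(labels_some_path(edges, s, f, "abba"))  # True
print(labels_some_path(edges, s, f, "ab"))    # False
```

The search tracks pairs (vertex, position) rather than whole paths, so cycles of ε-edges cannot cause an infinite loop.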
Graph Representation
• A graph representation of a regular expression r is a string-labeled digraph with an initial vertex s and a final vertex f such that a string x belongs to r if and only if x is associated with a path from s to f.
Corollary 2
• For any regular expression r, there exists a string-labeled digraph with two special vertices, an initial vertex s and a final vertex f, such that a string x belongs to r if and only if x is associated with a path from s to f.
Puzzle: If a regular expression r contains u "+"s, v "·"s, and w "*"s, how many ε-edges does G(r) contain?
Question: How can the number of ε-edges be reduced?
Theorem 3
• An ε-edge (u,v) in G(r) which is a unique out-edge from a nonfinal vertex u or a unique in-edge to a noninitial vertex v can be shrunk to a single vertex. (If one of u and v is the initial vertex or the final vertex, so is the resulting vertex.)
• Remark: Shrinking should be done one edge at a time, rechecking the condition on the resulting graph before shrinking the next edge.
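The shrinking rule of Theorem 3 can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: edges are stored as (tail, label, head) triples, ε is the empty string, and the function names `shrink_once` and `shrink_all` are hypothetical. In line with the remark, only one edge is shrunk per step and the condition is re-evaluated on the new graph.

```python
EPS = ""  # label of an ε-edge

def shrink_once(edges, s, f):
    """Shrink one qualifying ε-edge; return (edges, s, f, changed)."""
    for (u, a, v) in edges:
        if a != EPS or u == v:
            continue
        out_u = [e for e in edges if e[0] == u]
        in_v = [e for e in edges if e[2] == v]
        only_out = u != f and len(out_u) == 1  # unique out-edge of nonfinal u
        only_in = v != s and len(in_v) == 1    # unique in-edge of noninitial v
        if not (only_out or only_in):
            continue
        # Keep the name of the initial/final vertex when only v is special.
        keep, drop = (v, u) if v in (s, f) and u not in (s, f) else (u, v)

        def rename(x):
            return keep if x == drop else x

        rest = list(edges)
        rest.remove((u, a, v))
        merged = [(rename(p), b, rename(q)) for p, b, q in rest]
        return merged, rename(s), rename(f), True
    return edges, s, f, False

def shrink_all(edges, s, f):
    """Apply shrink_once repeatedly until no ε-edge qualifies."""
    changed = True
    while changed:
        edges, s, f, changed = shrink_once(edges, s, f)
    return edges, s, f

# Hypothetical graph for a·b with one ε-edge between the two symbol edges.
g = [(0, "a", 1), (1, EPS, 2), (2, "b", 3)]
print(shrink_all(g, 0, 3))  # ([(0, 'a', 1), (1, 'b', 3)], 0, 3)

# Here the ε-edge (0, 1) is neither a unique out-edge nor a unique in-edge,
# so it must stay: merging 0 and 1 would change the labels c, abc to (ab)*c.
h = [(0, EPS, 1), (0, "a", 2), (2, "b", 1), (1, "c", 3)]
print(shrink_all(h, 0, 3) == (h, 0, 3))  # True
```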
Deterministic finite automaton (DFA)
• The tape is divided into finitely many cells. Each cell contains a symbol from an alphabet Σ.
[Figure: a tape with one symbol per cell, holding a l p h a b e t]
• The head scans one cell of the tape at a time and reads the symbol in that cell. In each move, the head moves one cell to the right.
• The finite control has finitely many states, which form a set Q. In each move, the state changes according to a transition function
δ : Q × Σ → Q.
• δ(q, a) = p means that if the head reads symbol a and the finite control is in the state q, then the next state should be p, and the head moves one cell to the right.
• There are some special states: an initial state s and a set F of final states.
• Initially, the DFA is in the initial state s and the head scans the leftmost cell. The tape holds an input string.
• When the head moves off the right end of the tape, the DFA stops. An input string x is accepted by the DFA if the DFA stops in a final state.
• Otherwise, the input string is rejected.
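• The DFA can be represented by
M = (Q, Σ, δ, s, F)
where Q is the set of states, Σ is the alphabet of input symbols, δ is the transition function, s is the initial state, and F is the set of final states.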
• The set of all strings accepted by a DFA M is denoted by L(M). We also say that the language L(M) is accepted by M.
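As a minimal sketch of the definition above, a DFA M = (Q, Σ, δ, s, F) can be stored as a small record with δ as a dictionary. The class name `DFA`, the method `accepts`, and the example machine (strings with an even number of 1s) are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    Q: frozenset      # states of the finite control
    Sigma: frozenset  # input alphabet
    delta: dict       # (state, symbol) -> state, total on Q x Sigma
    s: str            # initial state
    F: frozenset      # final states

    def accepts(self, x):
        """Scan the tape x from left to right; accept iff the run stops in F."""
        q = self.s
        for a in x:                 # one move per cell
            q = self.delta[(q, a)]  # next state; the head then moves right
        return q in self.F          # the head has left the tape

# Hypothetical example: strings over {0, 1} with an even number of 1s.
M = DFA(
    Q=frozenset({"even", "odd"}),
    Sigma=frozenset({"0", "1"}),
    delta={("even", "0"): "even", ("even", "1"): "odd",
           ("odd", "0"): "odd", ("odd", "1"): "even"},
    s="even",
    F=frozenset({"even"}),
)
print(M.accepts("1001"))  # True
print(M.accepts("10"))    # False
```

Under this sketch, L(M) is the set of all strings x over Σ for which `M.accepts(x)` is true.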
• The transition diagram of a DFA is an alternative way to represent the DFA.
• For M = (Q, Σ, δ, s, F), the transition diagram of M is a symbol-labeled digraph G=(V, E) satisfying the following:
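V = Q, where the initial state s and each final state f ∈ F are drawn as specially marked vertices;
E = { q --a--> p | δ(q, a) = p }, where q --a--> p denotes an edge from q to p with label a.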
The transition diagram of the DFA M has the
following properties:
• For every vertex q and every symbol a, there exists exactly one edge with label a from q.
• For each string x, there exists exactly one path starting from the initial state s associated with x.
• A string x is accepted by M if and only if this path ends at a final state.
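These properties can be checked on a small example. The sketch below builds the transition diagram from δ and follows the path of a string; the function names `transition_diagram` and `path_for`, the edge triples (q, a, p), and the example machine are hypothetical.

```python
def transition_diagram(Q, delta):
    """Return (V, E) with one edge (q, a, p) for each delta(q, a) = p."""
    V = set(Q)
    E = {(q, a, p) for (q, a), p in delta.items()}
    return V, E

def path_for(E, s, x):
    """Follow the path from s whose label is x; return (path, end vertex)."""
    path, q = [], s
    for a in x:
        # Unpacking into a 1-tuple fails unless exactly one edge matches.
        (edge,) = [e for e in E if e[0] == q and e[1] == a]
        path.append(edge)
        q = edge[2]
    return path, q

# Hypothetical DFA: strings over {0, 1} with an even number of 1s.
Q = {"even", "odd"}
Sigma = {"0", "1"}
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
s, F = "even", {"even"}

V, E = transition_diagram(Q, delta)

# Property 1: exactly one edge labeled a leaves each vertex q.
assert all(
    sum(1 for e in E if e[0] == q and e[1] == a) == 1
    for q in V for a in Sigma
)

# Properties 2 and 3: the path for x is unique, and x is accepted
# exactly when that path ends at a final state.
path, end = path_for(E, s, "101")
print(len(path), end in F)  # 3 True
```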