lec05
DESCRIPTION
Language formular and automataTRANSCRIPT
DFA to Regular Expressions
Proposition: If L is regular then there is a regular expression r such that
L = L(r).
Proof Idea: Let M = (Q,Σ, δ, q1, F ) be a DFA recognizing L, with
Q = {q1, q2, . . . qn}, and F = {qf1, . . . qfk
}
• Construct regular expression r1fisuch that
L(r1fi) = {w : | δ(q1, w) = qfi
}, i.e., r1fidescribes the set of strings
on which M ends up in state qfiwhen started in the initial state q1.
• Then
L = L(M) = L(r1f1) ∪ L(r1f2
) ∪ · · · ∪ L(r1fk)
= L(r1f1+ r1f2
+ · · · + r1fk)
Thus, the desired regular expression is r1f1+ r1f2
+ · · · + r1fk
89
Constructing r1fi
Idea 1 For every i, j, build regular expressions rij describing strings
taking M from state qi to state qj , where qi need not be the initial
state and qj need not be a final state.
Idea 2 Build the expression rij inductively.
• Start with expressions that describe paths from qi to qj that do
not pass through any intermediate states; i.e., these are single
nodes or single edges.
• Inductively build expressions that describe paths that pass
through progressively a larger set of states.
90
Definitions and Notation
Define Rkij to be the set (not regular expression) of strings leading from
qi to qj such that any intermediate state is ≤ k
• Note, the superscript k refers only to the intermediate states; so i
and j could be greater than k.
• R0ij set of strings that go from qi to qj without passing through any
intermediate states; in other words they are ǫ or single edges.
• Rnij is set of all strings going from qi to qj
91
Constructing set Rkij: Base Case
R0ij =
8<: {a | δ(qi, a) = qj} if i 6= j
{a | δ(qi, a) = qj} ∪ {ǫ} if i = j
92
Constructing set Rkij: Inductive Step
k − 1
j
i
k
Assume we have Rk−1
ij
Rkij = R
k−1
ij ∪ Rk−1
ik (Rk−1
kk )∗Rk−1
kj
93
Constructing the Regular Expression
Task: Construct expression rkij such that L(rk
ij) = Rkij .
Base Case
r0ij =
8>><>>: ∅ if R0ij = ∅
a1 + a2 + · · · am if R0ij = {a1, a2, . . . am}
ǫ + a1 + · · · am if R0ij = {ǫ, a1, . . . am}
94
Constructing the Regular Expression: Inductive step
Assume inductively, rk−1
ij is the regular expression for Rk−1
ij
Rkij = R
k−1
ij ∪ Rk−1
ik (Rk−1
kk )∗Rk−1
kj
= L(rk−1
ij ) ∪ L(rk−1
ik )(L(rk−1
kk ))∗L(rk−1
kj )
= L(rk−1
ij + rk−1
ik (rk−1
kk )∗rk−1
kj )
rk−1
ij + rk−1
ik (rk−1
kk )∗rk−1
kj is the Regular Expression for Rkij .
95
Completing the Proof
Proposition: If L is regular then there is a regular expression r such that
L = L(r).
Proof: Let q1 be the initial state, and {qf1, qf2
, . . . qfk} the final states of
M (which recognizes L), then the desired regular expression is
rn1f1
+ rn1f2
+ · · · rn1fk
96
Example
1 10
0q1 q2
r011 = 1 + ǫ r
022 = 1 + ǫ
r012 = 0 r
021 = 0
r112 = r
012 + r
011(r
011)
∗
r012 r
122 = r
022 + r
021(r
011)
∗
r012
= 0 + (1 + ǫ)+0 = (1 + ǫ) + 0(1 + ǫ)∗0
r2
12 = r1
12 + r1
12(r1
22)∗
r1
22
= (0 + (1 + ǫ)+
0) + (0 + (1 + ǫ)+
0)((1 + ǫ) + 0(1 + ǫ)∗
0)+
= (0 + (1 + ǫ)+
0)(1 + ǫ + 0(1 + ǫ)∗
0)∗
= (1 + ǫ)∗0(1 + ǫ + 01∗0)∗
= 1∗
0(1 + 01∗
0)∗
L(M) = L(r2
12)
97
Analysis of the Translation
Size of the constructed regular expression
• Number of regular expressions = O(n3)
• At each step the regular expression may blowup by a factor of 4
• Each regular expression rnij can be of size O(4n)
The above method works for both NFA and DFA
For converting DFA there is slightly more efficient method (see
textbook)
98
Thus far . . .
NFA DFA
ǫ-NFA Regular Exp.
99
Regular Expression Identities
Associativity and Commutativity
L + M = M + L
(L + M) + N = L + (M + N)
(LM)N = L(MN)
Note: LM 6= ML
Distributivity
L(M + N) = LM + LN
(M + N)L = ML + NL
100
More Identities
Identities and Anhilators
∅ + L = L + ∅ = L
ǫL = Lǫ = L
∅L = L∅ = ∅
Idempotent Law
L + L = L
Closure Laws
(L∗)∗ = L∗
(∅)∗ = ǫ
ǫ∗ = ǫ
L+ = LL∗ = L∗L
L∗ = L+ + ǫ
101
Testing Regular Expression Identities
To test if an equation E = F holds [Example: (L + M)∗ = (L∗M∗)∗]
1. Convert E and F into concrete expressions C and D, by replacing
each language variable in E and F by a concrete symbol
[Replacing L by a and M by b, we get C = (a + b)∗ and D = (a∗b∗)∗]
2. L(C) = L(D) iff the equation E = F holds
[L((a + b)∗) = L((a∗b∗)∗) and so (L + M)∗ = (L∗M∗)∗ holds]
Correctness of algorithm follows from Theorems 3.13 and 3.14 in the
book
102
However, caution!!
The algorithm only applies to equations of regular expressions and not to
all equations
• Consider the equation L ∩ M ∩ N = L ∩ M , where ∩ means set
intersection
• The equation is clearly false
• Concretizing we get {a} ∩ {b} ∩ {c} and {a} ∩ {b}, which are clearly
equal!!
103