1 A nice paper relevant to this course is titled “The Glory of the Past”.
2 NICTA Researcher, Adjunct at the Australian National University and Griffith University
ε-Closure, Kleene’s Theorem, and Kleene’s Algebra
Charles Gretton2
24 February 2014
Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 1/30
First, a Quick Summary of ε-Closure
• I’ve previously waved my hands about ε-labelled arcs.
• Recall, ε is our notation for the empty string.
• How can we formally treat ε-labelled arcs in an NFA?
• Using a function called ECLOSE...
• A construction similar to that for NFA ←→ DFA is used to prove ε-NFA ←→ DFA.
Quick Summary of ε-Closure
• An ε-NFA is ⟨Q, Σ, δ, q0, F⟩.
• ECLOSE : Q → 2^Q – i.e. gives an element in the powerset of states.
• It is defined inductively, as follows.
• BASIS: q ∈ ECLOSE(q) – i.e. the argument is in the basis.
• INDUCTION: Suppose p ∈ ECLOSE(q) and r ∈ δ(p, ε), then r ∈ ECLOSE(q).
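The inductive definition above can be read as a graph search along ε-labelled arcs. Below is a minimal Python sketch of that reading. The function name `eclose`, the dictionary encoding of δ, the use of the empty string `""` for ε, and the seven-state automaton are all assumptions made for illustration; they are not taken from the slides.

```python
from collections import deque

def eclose(q, delta):
    """Return ECLOSE(q) for an ε-NFA.

    delta maps (state, symbol) to a set of states; the empty string ""
    stands for ε (an encoding assumed for this sketch).
    """
    closure = {q}                  # BASIS: q ∈ ECLOSE(q)
    frontier = deque([q])
    while frontier:
        p = frontier.popleft()
        # INDUCTION: p ∈ ECLOSE(q) and r ∈ δ(p, ε) gives r ∈ ECLOSE(q)
        for r in delta.get((p, ""), set()):
            if r not in closure:
                closure.add(r)
                frontier.append(r)
    return closure

# Hypothetical ε-NFA over states 1..7 (illustration only).
delta = {
    (1, ""): {2, 4},
    (2, ""): {3},
    (3, ""): {6},
    (4, "a"): {5},
    (5, "b"): {6},
    (5, ""): {7},
}

print(sorted(eclose(1, delta)))
print(sorted(eclose(5, delta)))
```

Only ε-arcs are followed, so state 5 (reached from 4 on the symbol a) is not in the closure of state 1.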
Quick Example of ε-Closure
ECLOSE(1) = {1, 2, 3, 4, 6}
ε-Closure Subset Construction
• For DFA −→ ε-NFA, just add δ(q, ε) = ∅ transitions to all states – i.e. ε is not an alphabet symbol.
• ε-NFA −→ DFA
• Same as for NFA −→ DFA,
• Each DFA state corresponds to a unique set of ε-NFA states.
• Only take the ε-closure at ε-NFA transitions.
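One way to sketch the ε-NFA −→ DFA direction is the usual subset construction with an ε-closure applied to the start state and after every symbol move. The Python below is a hedged illustration under that reading; the names `eclose_set`, `enfa_to_dfa` and `dfa_accepts`, the dictionary encoding of δ (with `""` for ε), and the small automaton for a∗b∗ are assumptions, not the lecture's own code.

```python
def eclose_set(states, delta):
    """ε-closure of a set of ε-NFA states ("" encodes ε)."""
    closure, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get((p, ""), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def enfa_to_dfa(alphabet, delta, q0, finals):
    """Subset construction: each DFA state is an ε-closed set of ε-NFA states."""
    start = eclose_set({q0}, delta)
    dfa_delta, seen, todo = {}, {start}, [start]
    while todo:
        S = todo.pop()
        for a in alphabet:
            moved = set()
            for p in S:
                moved |= set(delta.get((p, a), ()))
            T = eclose_set(moved, delta)   # close after the symbol move
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & set(finals)}
    return start, dfa_delta, dfa_finals

def dfa_accepts(word, start, dfa_delta, dfa_finals):
    S = start
    for a in word:
        S = dfa_delta[(S, a)]
    return S in dfa_finals

# Hypothetical ε-NFA for a*b*: 0 loops on a, 0 -ε-> 1, 1 loops on b.
delta = {(0, "a"): {0}, (0, ""): {1}, (1, "b"): {1}}
start, dfa_delta, dfa_finals = enfa_to_dfa("ab", delta, 0, {1})

for w in ["", "aab", "ba"]:
    print(repr(w), dfa_accepts(w, start, dfa_delta, dfa_finals))
```

The empty frozenset plays the role of the dead state, so the resulting transition function is total over the given alphabet.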
Summarizing
• NFA ←→ DFA
• The equivalent NFA can be exponentially smaller than the corresponding DFA.
• ε-NFA ←→ DFA
• From transitivity: ε-NFA ←→ NFA
• All formalisms so far are expressively equivalent.
Languages and Regular Expressions
• Regular expression implementations in many UNIX tools: grep, sed, awk, emacs and vi
• Following Kleene’s seminal work, we deal with regularity formally.
• We describe the language intended by a given regular expression.
• We consider the class of languages that can be expressed.
• We examine algebraic laws which inform re-writing rules for simplifying expressions.
Operations on Languages
• L1, L2 ⊆ Σ∗
• product of languages
L1.L2 = L1L2 = {WX | W ∈ L1, X ∈ L2}
• inductive i-th product
L^0 = {ε}
∀i > 0, L^i = L L^(i−1)
L∗ = ⋃_{i≥0} L^i
• positive closure
L^+ = L L∗
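For finite languages these operations can be tried out directly on Python sets of strings. The sketch below is only an illustration: the function names are assumptions, and because L∗ is infinite for most L, `star_up_to` computes just the finite union of L^i for 0 ≤ i ≤ n.

```python
def product(L1, L2):
    """L1.L2 = {WX | W ∈ L1, X ∈ L2}"""
    return {w + x for w in L1 for x in L2}

def power(L, i):
    """Inductive i-th product: L^0 = {ε}, L^i = L L^(i-1)."""
    result = {""}                 # "" encodes ε
    for _ in range(i):
        result = product(L, result)
    return result

def star_up_to(L, n):
    """Finite approximation of L*: the union of L^i for 0 <= i <= n."""
    out = set()
    for i in range(n + 1):
        out |= power(L, i)
    return out

L = {"a", "b"}
print(sorted(product(L, {"c"})))
print(sorted(power(L, 2)))
print(sorted(star_up_to({"ab"}, 2)))
```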
Regular Expressions
• A regular expression is a well formed word over a given alphabet Σ
• We use the following symbols:
Σ ∪ {ε ∅ ( ) + . ∗}
• Σ: “symbol” :: alphabet
• ε: constant (or 0-ary operator) :: empty string
• ∅: constant (or 0-ary operator) :: empty set
• (: punctuation :: open parenthesis
• ): punctuation :: close parenthesis
• +: infix binary operator :: “union” or “choice”
• .: infix binary operator :: concatenation – X.Y often written as XY
• ∗: postfix unary operator :: Kleene closure – the “star” of this show
• Taking a ∈ Σ ∪ {ε, ∅}, regular expressions satisfy the following grammar:
regxp ::= a | (regxp) | regxp + regxp | regxp.regxp | regxp∗
• The Kleene closure operator, ∗, has the highest precedence.
• Concatenation, . , is usually represented by juxtaposition – e.g. concatenation a.b is written ab
• Concatenation has the next highest precedence.
• Union/choice has the least precedence.
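The precedence rules above can be made mechanical with a small recursive-descent parser that inserts every implied parenthesis. This is a sketch under assumptions: the function name `parenthesize` is hypothetical, each alphabet symbol is taken to be a single character, and the ASCII `*` is accepted alongside ∗.

```python
def parenthesize(expr):
    """Fully parenthesize a regular expression.

    Precedence: star binds tightest, then concatenation, then union (+).
    """
    tokens = [c for c in expr.replace("∗", "*") if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        return tok

    def union():                      # lowest precedence
        left = concat()
        while peek() == "+":
            take()
            left = f"({left}+{concat()})"
        return left

    def concat():                     # explicit '.' or juxtaposition
        left = star()
        while peek() is not None and peek() not in "+)":
            if peek() == ".":
                take()
            left = f"({left}{star()})"
        return left

    def star():                       # highest precedence, postfix
        node = atom()
        while peek() == "*":
            take()
            node = f"({node}*)"
        return node

    def atom():
        tok = take()
        if tok == "(":
            inner = union()
            if peek() != ")":
                raise ValueError("expected ')'")
            take()
            return inner
        if tok in "+.*)":
            raise ValueError(f"unexpected {tok!r}")
        return tok

    result = union()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return result

print(parenthesize("a+bc*"))
print(parenthesize("(a+b)c"))
```

Each grammar level calls the next tighter-binding level, which is what encodes the precedence order.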
Regular Expression Examples
• So, which of the following is correct?
ab∗ + a = (a(b∗ + a))
ab∗ + a = (ab)∗ + a
ab∗ + a = (a(b∗)) + a
• By the precedence rules, only the last is correct: ∗ binds tightest, then concatenation, then +.
Semantics of Regular Expressions
• BASIS
Language of Expression | Semantics
L(ε) | {ε}
L(∅) | ∅
L(a), a ∈ Σ | {a}
• INDUCTION
• Below, L1 and L2 are well formed regular expressions.
Language of Expression | Semantics
L(L1 + L2) | L(L1) ∪ L(L2)
L(L1L2) | L(L1)L(L2)
L(L1∗) | L(L1)∗
L((L1)) | L(L1)
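The semantics above is a structural recursion, so it can be sketched as a recursive function over an expression tree. Since L(E) is usually infinite, the sketch below only returns the strings of length at most `max_len`. The tuple encoding of expressions and the name `language` are assumptions made for this illustration.

```python
def language(expr, max_len):
    """Strings of length <= max_len in L(expr).

    expr is a nested tuple (encoding assumed for this sketch):
    ("eps",), ("empty",), ("sym", a),
    ("union", e1, e2), ("concat", e1, e2), ("star", e1).
    Parentheses need no node: L((E)) = L(E).
    """
    kind = expr[0]
    if kind == "eps":                       # L(ε) = {ε}
        return {""}
    if kind == "empty":                     # L(∅) = ∅
        return set()
    if kind == "sym":                       # L(a) = {a}
        return {expr[1]} if len(expr[1]) <= max_len else set()
    if kind == "union":                     # L(E1 + E2) = L(E1) ∪ L(E2)
        return language(expr[1], max_len) | language(expr[2], max_len)
    if kind == "concat":                    # L(E1E2) = L(E1)L(E2)
        left = language(expr[1], max_len)
        right = language(expr[2], max_len)
        return {w + x for w in left for x in right if len(w + x) <= max_len}
    if kind == "star":                      # L(E1*) = L(E1)*
        base = language(expr[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                     # stop when no new short strings appear
            frontier = {w + x for w in frontier for x in base
                        if len(w + x) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node {kind!r}")

# ab* + a, written as a tree
E = ("union",
     ("concat", ("sym", "a"), ("star", ("sym", "b"))),
     ("sym", "a"))
print(sorted(language(E, 3)))
```

Truncating at each level is safe because every string of length at most `max_len` in a product or closure is built from pieces that are themselves no longer than `max_len`.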
Every Finite Language is Regular
• A regular language is a language that can be given by a regular expression.
• A finite language is a finite set of strings:
{X1,X2, ..,Xn}
• The regular expression that gives that finite language is:
X1 + X2 + ..+ Xn
• QED
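The construction in this proof is easy to try with Python's `re` module, keeping in mind that `re` writes union as `|` rather than `+`. The helper name `finite_language_regex` is an assumption, and the pattern `(?!)` (an empty negative lookahead that never matches) is used here as a stand-in for ∅, which has no direct `re` syntax.

```python
import re

def finite_language_regex(words):
    """Build a Python `re` pattern X1|X2|..|Xn for a finite set of strings."""
    if not words:
        return r"(?!)"             # never matches: stands in for ∅
    return "|".join(re.escape(w) for w in sorted(words))

finite = {"ab", "b", ""}            # a hypothetical finite language containing ε
pattern = finite_language_regex(finite)

for w in ["", "ab", "b", "a", "abb"]:
    print(repr(w), re.fullmatch(pattern, w) is not None)
```

`re.escape` keeps characters such as `+` or `*` inside a word from being read as operators.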
Algebra and Regular Expressions
• Commutativity laws:
L1 + L2 = L2 + L1
Algebra and Regular Expressions
• Cancellation laws:
• Identities
L + ∅ = L
∅ + L = L
εL = L
Lε = L
• Annihilators
L∅ = ∅
∅L = ∅
Algebra and Regular Expressions
• Associative laws:
L1 + (L2 + L3) = (L1 + L2) + L3
L1(L2L3) = (L1L2)L3
• Power associativity
L1(L1L1) = (L1L1)L1
...
(L1L1)(L1L1) = (L1(L1L1))L1
Algebra and Regular Expressions
• Left and right distributive laws:
L1(L2 + L3) = L1L2 + L1L3
(L2 + L3)L1 = L2L1 + L3L1
• Idempotence:
L + L = L
• Closure laws – There are exactly two languages whose closures are finite. What are they?
• Closure laws
ε∗ = ε
∅∗ = ε
(L∗)∗ = L∗
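The laws for + and concatenation can be spot-checked on small finite languages represented as Python sets. This is only a sanity check on hypothetical sample languages, not a proof, and the closure laws are left out because L∗ is generally infinite. The names `cat`, `EMPTY`, `EPS` and the sample sets are assumptions for illustration.

```python
def cat(L1, L2):
    """Product (concatenation) of two languages given as sets of strings."""
    return {w + x for w in L1 for x in L2}

EMPTY = set()    # the language ∅
EPS = {""}       # the language {ε}

# Hypothetical sample languages
L1, L2, L3 = {"a", "ab"}, {"", "b"}, {"ba", "c"}

laws = {
    "commutativity of +": (L1 | L2) == (L2 | L1),
    "identity for +": (L1 | EMPTY) == L1 and (EMPTY | L1) == L1,
    "identity for concatenation": cat(EPS, L1) == L1 and cat(L1, EPS) == L1,
    "annihilator": cat(L1, EMPTY) == EMPTY and cat(EMPTY, L1) == EMPTY,
    "associativity of +": (L1 | (L2 | L3)) == ((L1 | L2) | L3),
    "associativity of concatenation":
        cat(L1, cat(L2, L3)) == cat(cat(L1, L2), L3),
    "left distributivity": cat(L1, L2 | L3) == (cat(L1, L2) | cat(L1, L3)),
    "right distributivity": cat(L2 | L3, L1) == (cat(L2, L1) | cat(L3, L1)),
    "idempotence": (L1 | L1) == L1,
}
for name, holds in laws.items():
    print(f"{name}: {holds}")

# Concatenation has no commutativity law: these two products differ.
print(cat(L1, L3) == cat(L3, L1))
```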
Application of Algebraic Laws
• a + ab∗
• precedence: (a + (a(b∗)))
• identity (a = aε): (aε + (a(b∗)))
• left distributive law: a(ε + b∗)
• cancellation (ε + b∗ = b∗, because ε ∈ L(b∗)): a(b∗)
• CONCLUSION: ab∗ = a + ab∗
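A derived identity like this one can be sanity-checked by brute force: compare both expressions on every string up to some length. The sketch below uses Python's `re` module, where union is written `|` instead of `+`. The helper name `agree_up_to` and the bound of 6 are assumptions, and agreement up to a bound is evidence rather than a proof.

```python
import re
from itertools import product

def agree_up_to(pattern1, pattern2, alphabet, max_len):
    """True if both patterns accept exactly the same strings of length <= max_len."""
    r1, r2 = re.compile(pattern1), re.compile(pattern2)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if bool(r1.fullmatch(w)) != bool(r2.fullmatch(w)):
                return False
    return True

# a + ab* against ab*
print(agree_up_to("a|ab*", "ab*", "ab", 6))
# a + ab* against the wrong bracketing (ab)* + a
print(agree_up_to("a|ab*", "(ab)*|a", "ab", 6))
```

The second comparison already fails on the empty string, which (ab)∗ accepts and a + ab∗ does not.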
Relating Concrete and Lifted Expressions
• When I write (L∗)∗ = L∗, I intend that L could be any regular expression.
• For example, the following concrete expressions are instances of this identity.
1 ((a)∗)∗ = a∗
2 ((ε(a + bab + ∅) + bab)∗)∗ = (ε(a + bab + ∅) + bab)∗
• The above are called concrete regular expressions, because they contain no variables.
• Our textbook talks about a specific class of concrete expression (above, Type 1), where each variable symbol from the lifted expression is replaced with an alphabet constant.
Theorem Relating Concrete and Lifted Expressions
• Let E be a regular expression with variable symbols Li, i ∈ [1..m].
• For example, taking m = 2, E could be (L1 + L2)∗, (L1L2∗)∗, etc.
• Let E′ be an expression where each variable Li is replaced by a constant ai.
• Every X ∈ L(E) can be written as X1X2..Xk, where each substring Xj is in one of the languages L(Li).
• Writing L(Xj) = i if Xj is from the i-th language,
• a_L(X1) a_L(X2) .. a_L(Xk) ∈ L(E′)
Theorem Relating Concrete and Lifted Expressions
• BASIS: E ∈ {ε, ∅, L1}.
• The cases E ∈ {ε, ∅} are trivial, because the concrete and lifted expressions are identical.
• Taking E = L1, then L(E) = L(L1). Also E′ = a1 and L(E′) = L(a1) = {a1}.
• Above, the L1/a1 correspondence is clearly satisfied.
Theorem Relating Concrete and Lifted Expressions
• INDUCTION: E ∈ {E1 + E2, E1E2, E1∗}.
• E1′ and E2′ are constructed so that if a variable appears in both subexpressions, then it is substituted with the same concrete symbol.
• For example, take E = L1L2 + L2L1.
• Then a valid concrete expression is: E′ = a1a2 + a2a1.
• An invalid concrete expression would be: E′ = a1a2 + a3a4.
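The consistency requirement on the substitution can be sketched as a single dictionary from variable indices to constants, applied recursively to an expression tree. Because one mapping is shared by all subexpressions, a variable can never receive two different constants. The tuple encoding and the names `concretize` and `show` are assumptions for this illustration.

```python
def concretize(expr, constants):
    """Replace each variable ("var", i) by the constant constants[i]."""
    kind = expr[0]
    if kind == "var":
        return ("sym", constants[expr[1]])
    if kind in ("eps", "empty", "sym"):
        return expr
    # union, concat, star: recurse into the subexpressions
    return (kind,) + tuple(concretize(sub, constants) for sub in expr[1:])

def show(expr):
    """Render an expression tree as text (ASCII * for the star)."""
    kind = expr[0]
    if kind == "var":
        return f"L{expr[1]}"
    if kind == "sym":
        return expr[1]
    if kind == "eps":
        return "ε"
    if kind == "empty":
        return "∅"
    if kind == "union":
        return f"({show(expr[1])} + {show(expr[2])})"
    if kind == "concat":
        return f"{show(expr[1])}{show(expr[2])}"
    if kind == "star":
        return f"({show(expr[1])})*"
    raise ValueError(f"unknown node {kind!r}")

# E = L1L2 + L2L1
E = ("union",
     ("concat", ("var", 1), ("var", 2)),
     ("concat", ("var", 2), ("var", 1)))
E_concrete = concretize(E, {1: "a", 2: "b"})

print(show(E))
print(show(E_concrete))
```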
Theorem Relating Concrete and Lifted Expressions
• INDUCTION, case E = E1 + E2:
• L(E′) = L(E1′ + E2′) = L(E1′) ∪ L(E2′)
• L(E) = L(E1 + E2) = L(E1) ∪ L(E2)
• Our inductive hypothesis gives us the correspondence for strings in L(E1)/L(E1′).
• Also for strings in L(E2)/L(E2′).
• Following the definition of union, we have proved this step case.
Theorem Relating Concrete and Lifted Expressions
• INDUCTION, case E = E1E2:
• L(E′) = L(E1′.E2′) = L(E1′).L(E2′)
• L(E) = L(E1.E2) = L(E1).L(E2)
• Our inductive hypothesis gives us the correspondence for strings in L(E1)/L(E1′).
• Also for strings in L(E2)/L(E2′).
• Following the definition of concatenation, we have proved this step case.
Theorem Relating Concrete and Lifted Expressions
• INDUCTION, case E = E1∗:
• L((E1′)∗) = L(ε) ∪ L(E1′)^+
• L(E1∗) = L(ε) ∪ L(E1)^+
• Our inductive hypothesis gives us the correspondence for strings in L(E1)/L(E1′).
• Because closure yields zero or more occurrences of the language strings, we also have a correspondence between L(E1)∗/L(E1′)∗.
• QED
Theorem Relating Concrete and Lifted Expressions
• What if we added to the step cases: E ∈ {E1 ∩ E2, E1 + E2, E1E2, E1∗}?
• Let’s propose a law L1 ∩ L2 ∩ L3 = L1 ∩ L2
• The law holds for a concretization: L(a ∩ b ∩ c) = L(a ∩ b) = ∅
• But fails generally: L(a ∩ a ∩ ∅) ≠ L(a ∩ a)
• So, the theorem we just proved is rather specific to regular expressions.
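The counterexample above can be replayed with Python sets standing in for the languages. The function names `lhs` and `rhs` are hypothetical; the two instantiations are the ones named on the slide.

```python
def lhs(L1, L2, L3):
    """Left side of the proposed law: L1 ∩ L2 ∩ L3."""
    return L1 & L2 & L3

def rhs(L1, L2, L3):
    """Right side of the proposed law: L1 ∩ L2."""
    return L1 & L2

# Concretization with distinct constants a, b, c: both sides are ∅.
concrete = ({"a"}, {"b"}, {"c"})
print(lhs(*concrete) == rhs(*concrete))

# General instance L1 = L2 = {a}, L3 = ∅: the sides differ.
general = ({"a"}, {"a"}, set())
print(lhs(*general) == rhs(*general))
```

Testing only the concretization with distinct constants would wrongly suggest the law holds, which is why the theorem does not extend to expressions with intersection.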