grammars a grammar is a 4-tuple g = (v, t, p, s) where 1)v is a set of nonterminal symbols (also...
DESCRIPTION
Example: G 1 = (V, T, P, S) where V = {A, S} T = {0, 1} P = {S 0A1 0A 00A1 A } Note: Grammars are generators, while automata are recognizers. A grammar defines a language in a recursive manner.TRANSCRIPT
GrammarsA grammar is a 4-tuple G = (V, T, P, S) where 1) V is a set of nonterminal symbols (also
called variables or syntactic categories)2) T is a finite set of terminal symbols,
disjoint from V 3) P is a finite subset of
(VT)*V(VT)* (VT)*
4) S is a distinguished symbol in V called the start symbol (or sentence)
An element (, ) in P is called a production and is written as .
is referred to as the left hand side of the production and is referred to as the right hand side of the production.
Notice that the left side must have at least one nonterminal
Example: G1 = (V, T, P, S) where V = {A, S} T = {0, 1} P = {S 0A10A 00A1A }
Note: Grammars are generators, while automata are recognizers. A grammar defines a language in a recursive manner.
Definition: A sentential form of a grammar G = (V, T, P, S) is defined recursively as follows:1. S is a sentential form2. If is a sentential form and
is a production in P, then is also a sentential form
A sentential form of G containing no nonterminal symbol is called a sentence generated by G.
The language generated by a grammar G, denoted by L(G), is the set of all sentences generated by G.
Terminology:G (directly derives) is a relation on
(V T)*. If is a production in P, then G
k k-fold product of + transitive closure of * reflexive transitive closure of
If k then there is a sequence of strings 0, i, i+1, k such that
0 = , i-1 i (1 i k), k =
This sequence of strings is called a derivation sequence of length k.
L(G) = { w | w in T* and S * w}
Example: L(G1) = {0n1n | n > 0}
Design of grammars
L = {w | w {0,1}* and w starts and ends with the same symbol}
S → 0A | 1B | A → 0A | 1A | 0B → 0B | 1B | 1
Classification of grammarsDefinition: A grammar G is said to be 1) Right-linear if each production in P is
of the form A xB or A x where A and B are in V and x in T*.
2) Context-free if each production is of the form A , where A is in V and is in (VT)*.
3) Context-sensitive if each production is of the form , where || ||.
4) A grammar with no restrictions as above is called unrestricted.
If a language L is generated by a ‘type x’ grammar, then the language L is said to be ‘type x’ language.
Right-linear grammars generate right-linear languages, context-free grammars (CFGs) generate context-free languages (CFLs), context-sensitive grammars (CSGs) generate context-sensitive languages (CSLs).
The four types are referred to as Chomsky hierarchy.
Equivalence of regular languages and right-linear languages
Lemma 1(1) , (2) { }, (3) {a} for all a in are right-linear languagesProof:In each case we construct a right-linear grammar
generating the language. (1) G = ({S}, , , S} is a right linear
grammar such that L(G) = . (2) G = ({S}, , { S }, S) is a right-linear
grammar such that L(G) = { }. (3) Ga = ({S}, , { S a}, S) is a right-linear
grammar such that L(Ga) = { a }.
Lemma2 If L1 and L2 are right-linear languages, then (1) L1 L2, (2) L1L2, (3) L1
* are right-linear languages.
Proof: Since L1 and L2 are right-linear languages, we can assume that there are right-linear grammars G1 = (N1, , P1, S1) and G2 = (N2, , P2, S2) such that L(G1) = L1 and L(G2) = L2.
We can assume that N1 and N2 are disjoint (otherwise rename the symbols).
Define the following grammars: (1) G3 = (N1 N2 {S3}, , P1 P2 {S3 S1 | S2}, S3)
(2) G4 = (N1 N2, , P4, S1) where P4 is defined as follows:
(i) if A xB is in P1, then A xB is in P4
(ii) if A x is in P1, then A xS2 is in P4
(iii) All productions in P2 are in P4
(3) G5 = (N1 {S5}, , P5, S5) where P5 is defined as follows:
(i) if A xB is in P1, then A xB is in P5
(ii) if A x is in P1, then A xS5 and Ax are in P5
(iii) S5 S1 | are in P5
Claim: (1) L(G3) = L(G1) L(G2), (2) L(G4) = L(G1)L(G2), and
(3) L(G5) = (L(G1))*.
Proof of claim (2): If S1 * w1 in G1 then S1 * w1S2 in G4.
If S2 * x in G2 then S2 x* in G4.
So, L(G1)L(G2) L(G4).
Now suppose S1 * w in G4. Since there are no productions of the form A x in G4 that came out of G1, we can write the derivation in the form
S1 * xS2 * xy where w = xy. By construction, there must be derivations of the form S1 * x and
S2 * y. Hence, L(G4) L(G1)L(G2).
Proofs of claims 1 and 3 are left as exercise.
Theorem : A regular language is a right-linear language
Proof: Follows from Lemma 1 and inductive applications of Lemma2 on the construction of the regular language.
Regular Expression EquationsX = aX + b where a and b are regular expressions
and X is an unknown is a regular expression equation.
It can be verified that a*b is a solution for the equation.
A set of regular expression equations is said to be in standard form over a set of indeterminates = {X1, …, Xn}if for each Xi there is an equation of the form
Xi = ai0 + ai1X1 + … + ainXn where aij are regular expressions over some alphabet disjoint form .
Solving regular expression equations
Example: X1 = 0X2 + 1X1 +
X2 = 0X3 + 1X2
X3 = 0X1 + 1X3
Theorem : The language defined by a right-linear grammar is a regular set
Proof: Construct a set of regular expression equations from the productions. The solution of the start symbol is a regular expression denoting the language defined by the right-linear grammar.
ExampleConstruct a regular expression equivalent to
S 0A | 1S | A 0B | 1A
B 0S | 1B
Definitions:1) A grammar G = (V, T, P, S) is said to be
left-linear if each production is of the form A Bx or A x, where A and B are in V and x is in T*.
2) A grammar G = (V, T, P, S) is said to be regular if every production takes one of the forms A aB or A a or S . If S is a production, then S does not appear in the RHS of any production.
Example:A B | CB 0B | 1B | 011
C 0D | 1C | D 0C | 1D
Let M = (Q, , , q0, F) be a DFA.
Define a grammar G = (Q, , P, q0) where P is defined as follows:P = {B aC | (B,a) = C} {B a | (B,a) is in F}
If q0 is in F add q0 to P.
Then L(M) = L(G)
a, b a
b
ba
A B
C
Let G = (V, T, P, S) be a regular grammarDefine as finite automaton M = (V {f}, T,
, S, {f}), where(q,a) = p if q ap is in P(q,a) = f if q a is in Pif S is a production, (S,) = f Then, L(G) = L(M)