Discrete Math. and Logic II. Context-Free Grammars
SFWR ENG 2FA3
Ryszard Janicki
Winter 2014
Acknowledgments: Material partially based on Automata and Computability by Dexter C. Kozen (Chapter 19).
Ryszard Janicki Discrete Math. and Logic II. Context-Free Grammars 1 / 16
Introduction
An example of a context-free grammar
The objects 〈xxx〉 are called nonterminal symbols
Each nonterminal symbol generates a set of strings over a
finite alphabet Σ in a systematic way
the nonterminal 〈arith-expr〉 in
〈assg-stmt〉 ::= 〈var〉 := 〈arith-expr〉
generates the set of syntactically correct arithmetic
expressions in this language
The strings corresponding to the nonterminal 〈xxx〉 are generated using rules with 〈xxx〉 on the left-hand side
The alternatives on the right-hand side, separated by vertical
bars |, describe different ways strings corresponding to 〈xxx〉 can be generated
These alternatives may involve other nonterminals 〈yyy〉, which must be further eliminated by applying rules with 〈yyy〉 on the left-hand side
while x ≤ y do begin x := (x + 1); y := y − 1 end
is generated by the nonterminal 〈stmt〉
We obtain the while statement from 〈stmt〉 through a
sequence of expressions called sentential forms
Each sentential form is derived from the previous by an
application of one of the rules
〈stmt〉
〈while-stmt〉
while 〈bool-expr〉 do 〈stmt〉
while 〈arith-expr〉 〈compare-op〉 〈arith-expr〉 do 〈stmt〉
while 〈var〉 〈compare-op〉 〈arith-expr〉 do 〈stmt〉
while 〈var〉 ≤ 〈arith-expr〉 do 〈stmt〉
while 〈var〉 ≤ 〈var〉 do 〈stmt〉
while x ≤ 〈var〉 do 〈stmt〉
while x ≤ y do 〈stmt〉
while x ≤ y do 〈begin-stmt〉
. . .
Applying different rules will yield different results
begin if z = (x + 3) then y := z else y := x end
The set of all strings not containing any nonterminals
generated by the grammar is called the language generated by
the grammar
In general, this set of strings may be infinite, even if the set of
rules is finite
There may also be several different derivations of the same
string
A grammar is said to be unambiguous if no string has more than
one derivation, where derivations that differ only in the order in
which the rules are applied are counted as the same
The language (subset of Σ∗) generated by the context-free
grammar G is denoted L(G)
A subset of Σ∗ is called a context-free language (CFL) if it is
L(G) for some CFG G
CFLs are good for describing infinite sets of strings in a finite
way
They are particularly useful in computer science for describing the syntax of
programming languages,
well-formed arithmetic expressions,
well-nested begin-end blocks,
strings of balanced parentheses,
All regular sets are CFLs, but not necessarily vice versa.
Pushdown Automata (PDAs): A Preview
A pushdown automaton (PDA) is like a finite automaton,
except it has a stack or pushdown store, which it can use to
record a potentially unbounded amount of information
Its input head is read-only and may only move right
The machine can store information on the stack in a
last-in-first-out (LIFO) fashion
It can push symbols onto the top of the stack or pop them off
the top of the stack
It may not read down into the stack without popping the top
symbols off
[Figure: a pushdown automaton. A finite control reads the input a1 a2 a3 . . . an left to right (read only) and pushes symbols onto, or pops them off, a stack.]
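The stack discipline described in this preview can be mimicked in a few lines of Python. This is only a sketch, not a formal PDA (no formal definition is given here): the input is scanned once from left to right, and the only memory besides a two-valued control state is a stack that is accessed at its top. The function name `accepts_anbn`, the state names, and the stack marker `"A"` are assumptions chosen for illustration.

```python
def accepts_anbn(word: str) -> bool:
    """Recognise {a^n b^n | n >= 0} with one left-to-right scan and a stack."""
    stack = []            # pushdown store: only the top is ever accessed
    state = "reading_a"   # finite control: two states are enough here
    for symbol in word:   # read-only input head, moving right only
        if symbol == "a" and state == "reading_a":
            stack.append("A")      # push one marker per a
        elif symbol == "b" and stack:
            state = "reading_b"    # after the first b, no further a is allowed
            stack.pop()            # pop one marker per b
        else:
            return False           # a after b, unmatched b, or foreign symbol
    return not stack               # accept only if every a was matched


print(accepts_anbn("aaabbb"))  # True
print(accepts_anbn("aab"))     # False
```

A finite automaton alone cannot do this, because the number of a's to remember is unbounded; the stack supplies exactly that extra memory.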
Formal Definition of CFGs and CFL
Definition
A context-free grammar (CFG) is a quadruple G = (N, Σ, P, S), where
N is a finite set (the nonterminal symbols),
Σ is a finite set (the terminal symbols) disjoint from N,
P is a finite subset of N × (N ∪ Σ)∗ (the productions),
S ∈ N (the start symbol).
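One possible way to hold the quadruple G = (N, Σ, P, S) in a program is sketched below. The class name `CFG`, its field names, and the choice to encode a right-hand side as a tuple of symbols (with the empty tuple standing for ε) are assumptions made for this sketch; the checks in `__post_init__` simply restate the four conditions of the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset   # N
    terminals: frozenset      # Σ
    productions: frozenset    # P: pairs (A, rhs), rhs a tuple of symbols
    start: str                # S

    def __post_init__(self):
        if self.nonterminals & self.terminals:
            raise ValueError("N and Σ must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("S must be an element of N")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not in N")
            if any(s not in symbols for s in rhs):
                raise ValueError(f"right-hand side {rhs!r} is not in (N ∪ Σ)*")


# Sample data only: the grammar with productions S -> aSb and S -> ε.
G = CFG(frozenset({"S"}), frozenset({"a", "b"}),
        frozenset({("S", ("a", "S", "b")), ("S", ())}), "S")
```

Using frozen sets keeps the grammar immutable, which matches the mathematical reading of G as a fixed tuple.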
We use capital letters A, B, C, . . . for nonterminals
We use a, b, c, . . . for terminal symbols
Strings in (N ∪ Σ)∗ are denoted α, β, γ, . . .
Instead of writing productions as (A, α), we write A −→ α
We often use the vertical bar | to abbreviate a set of
productions with the same left-hand side
Example
Instead of writing
A −→ α₁, A −→ α₂, A −→ α₃,
we write
A −→ α₁ | α₂ | α₃
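The abbreviation can be undone mechanically. The sketch below assumes rules are typed as plain strings with an ASCII arrow `->` and that neither `->` nor `|` occurs as a grammar symbol; the function name `expand_alternatives` is hypothetical.

```python
def expand_alternatives(rule: str) -> list[tuple[str, str]]:
    """Expand 'A -> x | y | z' into [('A', 'x'), ('A', 'y'), ('A', 'z')]."""
    lhs, rhs = rule.split("->", 1)
    return [(lhs.strip(), alt.strip()) for alt in rhs.split("|")]


print(expand_alternatives("A -> α1 | α2 | α3"))
# [('A', 'α1'), ('A', 'α2'), ('A', 'α3')]
```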
Definition
If α, β ∈ (N ∪ Σ)∗, we say that β is derivable from α in one step
and write
$\alpha \xrightarrow[G]{1} \beta$
if β can be obtained from α by replacing some occurrence of a
nonterminal A in α with γ, where A −→ γ is in P.
Example
Let A −→ γ be a production in P
α = α₁Aα₂, where α₁, α₂ ∈ (N ∪ Σ)∗
β = α₁γα₂
Then, we have $\alpha \xrightarrow[G]{1} \beta$
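The one-step relation translates directly into code. The sketch below assumes every symbol is a single character, so that a sentential form is an ordinary string, and that a production A −→ γ is stored as the pair `(A, γ)` with ε written as the empty string. The name `one_step` and the sample production set `P` are illustrative assumptions.

```python
def one_step(alpha: str, productions: set[tuple[str, str]]) -> set[str]:
    """Return every beta that is derivable from alpha in one step."""
    results = set()
    for i, symbol in enumerate(alpha):
        for lhs, gamma in productions:
            if symbol == lhs:
                # alpha = alpha1 A alpha2  becomes  beta = alpha1 gamma alpha2
                results.add(alpha[:i] + gamma + alpha[i + 1:])
    return results


P = {("S", "aSb"), ("S", "")}        # sample productions S -> aSb | ε
print(sorted(one_step("aSb", P)))    # ['aaSbb', 'ab']
```

The result is a set because the relation is not a function: a sentential form may contain several nonterminals, and each may have several productions.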
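Definition
Let $\xrightarrow[G]{*}$ be the reflexive transitive closure of the relation $\xrightarrow[G]{1}$; that
is, define
$\alpha \xrightarrow[G]{0} \alpha$
$\alpha \xrightarrow[G]{n+1} \beta$ if there exists γ such that $\alpha \xrightarrow[G]{n} \gamma$ and $\gamma \xrightarrow[G]{1} \beta$
$\alpha \xrightarrow[G]{*} \beta$ if ∃(n | n ≥ 0 : $\alpha \xrightarrow[G]{n} \beta$)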
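The inductive definition can be followed clause by clause in a short recursive sketch. As assumptions: symbols are single characters, productions are pairs `(A, γ)` with ε as the empty string, and all function names are hypothetical. Because the existential quantifier ranges over all n ≥ 0, the function `derives` takes an explicit bound `max_steps`; it can therefore confirm a derivation but a `False` answer only means "not within that many steps".

```python
def one_step(alpha, productions):
    """All beta derivable from alpha in exactly one step."""
    return {alpha[:i] + gamma + alpha[i + 1:]
            for i, symbol in enumerate(alpha)
            for lhs, gamma in productions if symbol == lhs}


def n_steps(alpha, n, productions):
    """All beta derivable from alpha in exactly n steps."""
    if n == 0:
        return {alpha}                       # base clause: zero steps
    # inductive clause: n - 1 steps to some gamma, then one more step
    return {beta
            for gamma in n_steps(alpha, n - 1, productions)
            for beta in one_step(gamma, productions)}


def derives(alpha, beta, productions, max_steps):
    """Bounded check of the closure: try n = 0, 1, ..., max_steps."""
    return any(beta in n_steps(alpha, n, productions)
               for n in range(max_steps + 1))


P = {("S", "aSb"), ("S", "")}              # sample productions S -> aSb | ε
print(derives("S", "aaabbb", P, 4))        # True
print(derives("S", "aaabbb", P, 3))        # False: four steps are needed
```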
Terminology:
A string in (N ∪ Σ)∗ derivable from the start symbol S is
called a sentential form
A sentential form is called a sentence if it consists only of
terminal symbols
The language generated by G, denoted L(G), is the set of all
sentences:
$L(G) = \{x \in \Sigma^* \mid S \xrightarrow[G]{*} x\}$
A subset B ⊆ Σ∗ is a context-free language (CFL) if
B = L(G) for some context-free grammar G
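Since L(G) may be infinite, a program can only list a finite part of it. The sketch below collects all sentential forms reachable from the start symbol in at most `max_steps` derivation steps and then keeps those that are sentences. The step bound, the function name `sentences`, and the single-character encoding of symbols are assumptions of this sketch.

```python
def sentences(productions, start, nonterminals, max_steps):
    """Sentences derivable from `start` in at most `max_steps` steps."""
    forms = {start}        # all sentential forms found so far
    frontier = {start}     # forms reached by exactly k steps
    for _ in range(max_steps):
        frontier = {alpha[:i] + gamma + alpha[i + 1:]
                    for alpha in frontier
                    for i, symbol in enumerate(alpha)
                    for lhs, gamma in productions if symbol == lhs}
        forms |= frontier
    # a sentence is a sentential form consisting of terminal symbols only
    return {x for x in forms if not any(s in nonterminals for s in x)}


P = {("S", "aSb"), ("S", "")}                  # sample productions S -> aSb | ε
print(sorted(sentences(P, "S", {"S"}, 3)))     # ['', 'aabb', 'ab']
```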
Example
The nonregular set $\{a^n b^n \mid n \ge 0\}$ is a CFL
It is generated by the grammar G = (N, Σ, P, S), where N = {S}, Σ = {a, b}, P = {S −→ aSb, S −→ ε}
Here is a derivation of $a^3 b^3$ in G:
$S \xrightarrow[G]{1} aSb \xrightarrow[G]{1} aaSbb \xrightarrow[G]{1} aaaSbbb \xrightarrow[G]{1} aaabbb$
In general, for every n ≥ 0:
$S \xrightarrow[G]{n+1} a^n b^n$
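The step count n + 1 can be checked by writing the derivation out: the production S −→ aSb is applied n times, and S −→ ε once. The helper name `derive_anbn` is hypothetical; it returns the list of sentential forms, so the number of steps is the list length minus one.

```python
def derive_anbn(n: int) -> list[str]:
    """Sentential forms of the derivation of a^n b^n from S."""
    forms = ["S"]
    for k in range(1, n + 1):
        forms.append("a" * k + "S" + "b" * k)   # apply S -> aSb
    forms.append("a" * n + "b" * n)             # apply S -> ε
    return forms


steps = derive_anbn(3)
print(" => ".join(form or "ε" for form in steps))
# S => aSb => aaSbb => aaaSbbb => aaabbb
print(len(steps) - 1)   # 4, that is n + 1 for n = 3
```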