CS466(Prasad) L7Parse 1
Parsing
Recognition of strings in a language
Graph of a Grammar
• Represents leftmost derivations of a CFG.
  – A path from node S to a node w is a leftmost derivation of w.
Nodes: Left sentential forms
Arc labels: Production rules
Root: Start symbol
Leaves: Sentences
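Example grammar:
S → aS | bB | λ
B → aB | bS | bC
C → aC | λ
[Figure: graph of this grammar. Root: S. Level 1: aS, bB, λ. Level 2: aaS, abB, a, baB, bbS, bbC. Deeper levels continue (…). Each arc is labeled with the rule applied.]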
Properties of Graph of a Grammar
• Every node has a finite number of children.
  • Simple breadth-first enumeration is feasible.
• The number of leaves is infinite if the language is infinite.
  • Typical case.
• There can be infinitely long paths (derivations).
  • Loops in depth-first traversals.
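Because every node has finitely many children, the graph can be listed level by level. The following is a minimal sketch, not part of the original slides: it assumes a small example grammar S → aS | Sb | ab, encodes nonterminals as uppercase letters, and uses the hypothetical helper names `children` and `levels`.

```python
from typing import Dict, List

# Assumed example grammar: S -> aS | Sb | ab (keys are the nonterminals).
GRAMMAR: Dict[str, List[str]] = {"S": ["aS", "Sb", "ab"]}


def children(form: str, grammar: Dict[str, List[str]]) -> List[str]:
    """Apply every rule to the leftmost nonterminal of a left sentential form."""
    for i, symbol in enumerate(form):
        if symbol in grammar:
            return [form[:i] + rhs + form[i + 1:] for rhs in grammar[symbol]]
    return []  # a sentence (leaf) has no children


def levels(start: str, grammar: Dict[str, List[str]], depth: int) -> List[List[str]]:
    """Return the nodes of the graph level by level, down to the given depth."""
    result = [[start]]
    for _ in range(depth):
        next_level: List[str] = []
        for form in result[-1]:
            for child in children(form, grammar):
                if child not in next_level:  # a shared node is listed once
                    next_level.append(child)
        result.append(next_level)
    return result


for level in levels("S", GRAMMAR, 2):
    print(level)
```

Each level is finite, so the enumeration always makes progress, but the number of levels is unbounded: the path S, aS, aaS, … never reaches a leaf.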
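Grammar: S → aS | Sb | ab
[Figure: graph of this grammar, a directed acyclic graph. Root: S. Level 1: aS, Sb, ab. Level 2: aaS, aab, aSb, abb, Sbb. Deeper levels continue (…). The node aSb is reached both from aS (by S → Sb) and from Sb (by S → aS).]
(Illustrates ambiguity in the grammar.)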
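Grammar: S → SS | λ
[Figure: graph of this grammar with nodes S, SS, SSS, …. The structure is cyclic: S ⇒ SS by S → SS, and SS ⇒ S by S → λ.]
(Illustrates ambiguous grammar with cycles.)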
• Parser
  A program that determines whether a string w ∈ L(G) by constructing a derivation. Equivalently, it searches the graph of G.
  – Top-down parsers
    • Construct the derivation tree from root to leaves.
    • Leftmost derivation.
  – Bottom-up parsers
    • Construct the derivation tree from leaves to root.
    • Rightmost derivation in reverse.
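Grammar: S → SS | a | b
ab ∈ L(S)
Leftmost derivation: S ⇒ SS ⇒ aS ⇒ ab
[Figure: derivation trees, grown from the root down, one per derivation step: S alone; S with children S S; the left child expanded to a; the right child expanded to b.]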
Rightmost derivation: S ⇒ SS ⇒ Sb ⇒ ab
[Figure: derivation trees for the rightmost derivation in reverse, built from the leaves a b up to the root S, one per step: ab, Sb, SS, S.]
Top-down parsers: Breadth-first vs Depth-first
• Search the graph of a grammar breadth-first
  • Uses: Queue
  • (+) Always terminates with a shortest derivation when w ∈ L(G).
  • (-) Inefficient in general.
• Search the graph of a grammar depth-first
  • Uses: Stack
  • (-) Can get into infinite loops (e.g., left recursion).
  • (+) Efficient in general.
Determining when w ∉ L(G)
• Number of terminals in the sentential form > length of w.
• Prefix of the sentential form preceding the leftmost non-terminal is not a prefix of w.
• No rules applicable to the sentential form.
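The three tests above can be combined into one predicate that tells a top-down parser when to abandon a left sentential form. This is a sketch only: the grammar S → A, A → T | A+T, T → b | (A) is assumed for illustration, nonterminals are the dictionary keys, and the name `is_dead_end` is hypothetical.

```python
from typing import Dict, List

# Assumed example grammar: S -> A, A -> T | A+T, T -> b | (A).
GRAMMAR: Dict[str, List[str]] = {"S": ["A"], "A": ["T", "A+T"], "T": ["b", "(A)"]}


def is_dead_end(form: str, w: str, grammar: Dict[str, List[str]]) -> bool:
    """Return True if the left sentential form `form` cannot derive w."""
    # Test 1: a derivation never removes terminals, so too many is fatal.
    terminal_count = sum(1 for symbol in form if symbol not in grammar)
    if terminal_count > len(w):
        return True
    # Test 2: the terminals before the leftmost nonterminal can no longer change.
    first_nt = next((i for i, s in enumerate(form) if s in grammar), len(form))
    if not w.startswith(form[:first_nt]):
        return True
    # Test 3: no nonterminal is left, so no rule applies; fail unless form is w.
    return first_nt == len(form) and form != w


print(is_dead_end("b+T", "(b)+b", GRAMMAR))    # prefix "b+" does not match
print(is_dead_end("(A)+T", "(b)+b", GRAMMAR))  # still a viable form
```

If every path from S ends in a dead end, then w ∉ L(G).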
Parsing Examples
S → A
A → T | A+T
T → b | (A)
Show (b)+b ∈ L(S)
Breadth-first top-down parser
Queue up left sentential forms level by level.
Level 0: S
Level 1: A
Level 2: T, A+T
Level 3: b, (A), T+T, A+T+T
Level 4: (T), (A+T), …, (A)+T, …, T+T+T, A+T+T+T
Level 5: (b), ((A)), …, (T)+T, …
Level 6: …, (b)+T, …
Level 7: …, (b)+b
Parse successful
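The level-by-level search can be sketched with a queue of partial derivations. This is an illustrative sketch rather than the course's implementation: the grammar encoding, the terminal-count pruning, and the names `expand_leftmost` and `bfs_parse` are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional

# Grammar from the example: S -> A, A -> T | A+T, T -> b | (A).
GRAMMAR: Dict[str, List[str]] = {"S": ["A"], "A": ["T", "A+T"], "T": ["b", "(A)"]}


def expand_leftmost(form: str, grammar: Dict[str, List[str]]) -> List[str]:
    """Apply every rule to the leftmost nonterminal of the form."""
    for i, symbol in enumerate(form):
        if symbol in grammar:
            return [form[:i] + rhs + form[i + 1:] for rhs in grammar[symbol]]
    return []


def bfs_parse(w: str, grammar: Dict[str, List[str]], start: str = "S") -> Optional[List[str]]:
    """Return a leftmost derivation of w, or None if the search is exhausted."""
    queue = deque([[start]])  # each entry is the derivation found so far
    while queue:
        derivation = queue.popleft()  # oldest entry first: breadth-first
        form = derivation[-1]
        if form == w:
            return derivation
        for child in expand_leftmost(form, grammar):
            terminals = sum(1 for symbol in child if symbol not in grammar)
            if terminals <= len(w):  # drop forms that are already too long
                queue.append(derivation + [child])
    return None


print(" => ".join(bfs_parse("(b)+b", GRAMMAR)))
```

For this grammar the terminal-count check keeps the queue finite, so the search also stops on strings outside the language; without it the queue would grow forever on a failing input.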
Depth-first top-down parser
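Use a stack to pursue an entire path, starting from the left.
S
A
T, A+T
b, (A), T+T, A+T+T
(T), (A+T), …, T+T+T, A+T+T+T
(b), ((A)), …
Parse fails: backtrack on failure.
Loop: the left-recursive path A+T, A+T+T, A+T+T+T, … can be pursued forever.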
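A depth-first search with an explicit stack might look like the sketch below. It assumes the same example grammar, and it adds dead-end tests (terminal count, prefix mismatch, no applicable rule) so that the left-recursive rule A → A+T cannot be followed forever; the names `dead_end` and `dfs_parse` are hypothetical.

```python
from typing import Dict, List, Optional

# Grammar from the example: S -> A, A -> T | A+T, T -> b | (A).
GRAMMAR: Dict[str, List[str]] = {"S": ["A"], "A": ["T", "A+T"], "T": ["b", "(A)"]}


def dead_end(form: str, w: str, grammar: Dict[str, List[str]]) -> bool:
    """True if the form has too many terminals, a wrong prefix, or no rule applies."""
    if sum(1 for s in form if s not in grammar) > len(w):
        return True
    first_nt = next((i for i, s in enumerate(form) if s in grammar), len(form))
    if not w.startswith(form[:first_nt]):
        return True
    return first_nt == len(form) and form != w


def dfs_parse(w: str, grammar: Dict[str, List[str]], start: str = "S") -> Optional[List[str]]:
    """Pursue one derivation at a time; backtrack when it hits a dead end."""
    stack = [[start]]
    while stack:
        derivation = stack.pop()  # newest entry first: depth-first
        form = derivation[-1]
        if form == w:
            return derivation
        if dead_end(form, w, grammar):
            continue  # backtrack: the next pop resumes the latest untried alternative
        i = next(k for k, s in enumerate(form) if s in grammar)
        # Push in reverse so that the first-listed rule is tried first.
        for rhs in reversed(grammar[form[i]]):
            stack.append(derivation + [form[:i] + rhs + form[i + 1:]])
    return None


print(" => ".join(dfs_parse("(b)+b", GRAMMAR)))
```

Removing the terminal-count test and listing A+T before T would send the search down A+T, A+T+T, … without end, which is the looping behavior associated with left recursion.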
Summary
• In the BFTD (breadth-first top-down) version, all leftmost derivations are investigated in parallel.
• In the DFTD (depth-first top-down) version, one specific derivation is pursued to completion.
  • Done, if it succeeds.
  • Otherwise, backtrack and investigate another path.
  (Incomplete strategy)
  (Used by Prolog interpreter)
Bottom-up parsing
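Successful reduction sequence (a rightmost derivation in reverse):
(b)+b, (T)+b, (A)+b, T+b, A+b, A+T, A, S
Parse successful
[Figure: search tree of reductions starting from (b)+b. Besides the successful path, it shows branches such as (b)+T, (T)+T, (b)+A, (A)+T, and (S)+b; some branches are marked "Not allowed".]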
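Bottom-up parsing can be viewed as a search over reductions, where a reduction replaces an occurrence of a rule's right-hand side by its left-hand side. The sketch below is a naive illustration under stated assumptions: it tries every reduction breadth-first rather than only those of a rightmost derivation in reverse, and the names `RULES`, `reductions`, and `bottom_up_parse` are hypothetical.

```python
from collections import deque
from typing import List, Optional, Tuple

# Rules of the example grammar as (left-hand side, right-hand side) pairs.
RULES: List[Tuple[str, str]] = [
    ("S", "A"), ("A", "T"), ("A", "A+T"), ("T", "b"), ("T", "(A)"),
]


def reductions(form: str, rules: List[Tuple[str, str]]) -> List[str]:
    """All strings obtained by reducing one occurrence of one right-hand side."""
    results = []
    for lhs, rhs in rules:
        pos = form.find(rhs)
        while pos != -1:
            results.append(form[:pos] + lhs + form[pos + len(rhs):])
            pos = form.find(rhs, pos + 1)
    return results


def bottom_up_parse(w: str, rules: List[Tuple[str, str]], start: str = "S") -> Optional[List[str]]:
    """Return a sequence of forms from w down to the start symbol, or None."""
    queue = deque([[w]])
    seen = {w}  # reductions never lengthen a form here, so this set stays finite
    while queue:
        path = queue.popleft()
        if path[-1] == start:
            return path
        for reduced in reductions(path[-1], rules):
            if reduced not in seen:
                seen.add(reduced)
                queue.append(path + [reduced])
    return None


print(" <= ".join(bottom_up_parse("(b)+b", RULES)))
```

Termination relies on this grammar having no λ-rules: every right-hand side is at least as long as its left-hand side, so only finitely many forms can be reached from w.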
Practical Parsers
• Language/Grammar designed to enable deterministic (directed and backtrack-free) searches.
• Uses lookahead tokens and/or exploits the context in the sentential form constructed so far.
“Look before you leap.” vs “Procrastination principle.”
– Top-down parsers: LL(k) languages
  • E.g., Pascal, Ada, etc.
  • Better error diagnosis and recovery.
– Bottom-up parsers: LALR(1), LR(k) languages
  • E.g., C/C++, Java, etc.
  • Handles left recursion in the grammar.
– Backtracking parsers
  • E.g., Prolog interpreter.
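As a small illustration of a deterministic top-down parser, the sketch below recognizes the language of S → A, A → T | A+T, T → b | (A) using one symbol of lookahead. It assumes the left recursion has been rewritten as A → T { + T }, a transformation not shown in the slides, and the function names are hypothetical.

```python
def accepts(w: str) -> bool:
    """Recursive-descent recognizer for A -> T { + T }, T -> b | ( A )."""
    pos = 0

    def peek() -> str:
        # One symbol of lookahead; the empty string marks the end of input.
        return w[pos] if pos < len(w) else ""

    def advance(expected: str) -> None:
        nonlocal pos
        if peek() != expected:
            raise ValueError(f"expected {expected!r} at position {pos}")
        pos += 1

    def parse_a() -> None:
        parse_t()
        while peek() == "+":  # the lookahead decides whether to continue
            advance("+")
            parse_t()

    def parse_t() -> None:
        if peek() == "b":  # the lookahead selects the rule; no backtracking
            advance("b")
        elif peek() == "(":
            advance("(")
            parse_a()
            advance(")")
        else:
            raise ValueError(f"expected 'b' or '(' at position {pos}")

    try:
        parse_a()
    except ValueError:
        return False
    return pos == len(w)  # all input must be consumed


print(accepts("(b)+b"), accepts("b+"))
```

Every choice is settled by inspecting the next input symbol, so the parser never revisits a decision. The error messages point at the exact position of the first unexpected symbol, which hints at why top-down parsers lend themselves to error diagnosis.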