more parsing cpsc 388 ellen walker hiram college
More Parsing
CPSC 388
Ellen Walker
Hiram College
Review LL(1) Grammars
• Compute First and Follow sets
• Build the parsing table
– If x is in First(A), then M[A,x] = A->xZ (the rule that put x in First(A))
– If ε is in First(A) and x is in Follow(A), then M[A,x] = A->ε
• If each cell has no more than 1 rule, grammar is LL(1).
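The table construction above can be sketched in Python. This is a minimal illustration, not the course's implementation: the grammar encoding (a dict of lists), the names `first_of`, `build_table`, and `is_ll1`, and the example grammar S -> aSb | ε are all assumptions made for the sketch. It uses First of each rule's right-hand side, which is the usual way to find "the rule that put x in First(A)".

```python
EPS = "ε"  # marker for the empty string
END = "$"  # end-of-input marker

# Hypothetical example grammar: S -> a S b | ε (an empty list is ε)
GRAMMAR = {"S": [["a", "S", "b"], []]}
START = "S"

def first_of(seq, first):
    """First set of a sequence of grammar symbols."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in first else {sym}  # a terminal starts itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol can vanish (or seq is empty)
    return out

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # only nonterminals have Follow sets
                    rest = first_of(rhs[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def build_table(grammar, start):
    first = compute_first(grammar)
    follow = compute_follow(grammar, start, first)
    table = {}
    for nt, rules in grammar.items():
        for rhs in rules:
            rhs_first = first_of(rhs, first)
            columns = rhs_first - {EPS}
            if EPS in rhs_first:  # rule can derive ε: use Follow(A)
                columns |= follow[nt]
            for x in columns:
                table.setdefault((nt, x), []).append(rhs)
    return table

def is_ll1(table):
    """LL(1) means no cell holds more than one rule."""
    return all(len(rules) <= 1 for rules in table.values())

TABLE = build_table(GRAMMAR, START)
```

For this grammar the cells M[S,a], M[S,b], and M[S,$] each hold one rule, so `is_ll1(TABLE)` is true. A grammar such as S -> a | ab puts two rules in M[S,a] and fails the check.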
LL(k) Grammars
• Look at k terminals instead of 1 terminal
– First(S) is all sequences of k terminals that can begin S
– Follow(S) is all sequences of k terminals that can follow S
– Col. headers of table are sequences of k terminals instead of single terminals
• First & Follow computations get messy!
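To give a feel for why the computation gets messy, here is a small sketch of a k-symbol First computation. The function names `concat_k` and `first_k` and the tuple encoding of terminal sequences are assumptions for illustration. Note one detail the summary above glosses over: when a derived string is shorter than k, the whole (shorter) string belongs to the set.

```python
# Hypothetical example grammar: S -> a S b | ε (an empty list is ε)
GRAMMAR = {"S": [["a", "S", "b"], []]}

def concat_k(left, right, k):
    """All concatenations x + y, truncated to at most k symbols."""
    return {(x + y)[:k] for x in left for y in right}

def first_k(grammar, k):
    """Sets of terminal sequences (tuples, length <= k) that can begin each nonterminal."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                acc = {()}  # start from the empty sequence
                for sym in rhs:
                    sym_set = first[sym] if sym in grammar else {(sym,)}
                    acc = concat_k(acc, sym_set, k)
                if not acc <= first[nt]:
                    first[nt] |= acc
                    changed = True
    return first

FIRST_2 = first_k(GRAMMAR, 2)["S"]
```

With k = 2 the set for S contains the empty sequence, `("a", "b")`, and `("a", "a")`. Every set operation now works on sequences rather than single terminals, and the number of table columns grows with the number of possible k-sequences.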
Building Parse Trees
• Each item on the stack is a syntax tree node
• To “use” a rule:
– Pop (and save) LHS from stack.
– Create nodes for each RHS element
– Connect RHS nodes as children of LHS node
– Push RHS nodes (reverse order) on stack
Parse Tree Example
• Parsing: “aabb”
• Grammar: S -> aSb | ε
• After S->aSb:
– Stack (top first): a, S2, b
– Tree: S1 with children a, S2, b
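The tree-building steps can be sketched as a small table-driven parser in which the stack holds tree nodes. The `Node` class, the hand-written `TABLE` for S -> aSb | ε, and the `parse` function are assumptions for this sketch, not code from the course.

```python
class Node:
    """A syntax tree node; the parse stack holds these directly."""
    def __init__(self, symbol):
        self.symbol = symbol
        self.children = []

    def to_tuple(self):
        return (self.symbol, [child.to_tuple() for child in self.children])

# Hand-written LL(1) table for S -> a S b | ε (an empty list is the ε rule)
TABLE = {("S", "a"): ["a", "S", "b"], ("S", "b"): [], ("S", "$"): []}

def parse(text, start="S"):
    tokens = list(text) + ["$"]
    pos = 0
    root = Node(start)
    stack = [root]  # the end of the list is the top of the stack
    while stack:
        node = stack.pop()  # pop (and save) the LHS node
        look = tokens[pos]
        if (node.symbol, look) in TABLE:
            rhs = TABLE[(node.symbol, look)]
            node.children = [Node(sym) for sym in rhs]  # create and connect RHS nodes
            stack.extend(reversed(node.children))       # push in reverse order
        elif node.symbol == look:
            pos += 1  # terminal on the stack matches the input
        else:
            raise SyntaxError(f"unexpected {look!r} at position {pos}")
    if tokens[pos] != "$":
        raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
    return root

tree = parse("aabb")
```

Pushing the children in reverse order leaves the leftmost child on top, so the tree is filled in left to right. Because the same node objects sit on the stack and in the tree, expanding a popped node later updates the tree in place.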
Error Recovery
• Recognizer - either program is acceptable or not
• Error Correction - attempt to replace error by correct program
– Minimal distance error correction is too hard
– Limited to simple errors (e.g. missing ;)
Error Recovery Principles
• Find error as soon as possible (to report its location accurately)
• Pick up parsing as soon as possible after error (so multiple errors caught)
• Avoid errors generating many spurious additional error messages
• Avoid infinite loops on errors (!)
Recursive Descent Error Recovery
• Panic Mode
– Each function has additional parameter: synchronizing tokens (e.g. ;)
– Error causes parser to scan ahead (ignoring tokens) to find next synchronizing token
– Typical synchronizing tokens are in follow set.
Example Pseudocode
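bool factor(list<token> synchset) {
    // skip ahead to a token that can start a factor, or to a synchronizing token
    token = scanto({'(', num}, synchset);
    switch (token) {
        case '(':
            match('(');
            exp({')'});
            match(')');
            break;
        case num:
            match(num);
            break;
        default:
            error("Factor");
            return false;
    }
    return true;
}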
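A runnable variant of the same idea, as a sketch only. The `Parser` class, its method names, the token spelling `"num"`, and the tiny grammar (exp -> factor { + factor }, factor -> ( exp ) | num) are assumptions chosen to keep the example short.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0
        self.errors = []

    @property
    def token(self):
        return self.tokens[self.pos]

    def error(self, message):
        self.errors.append(f"{message} at position {self.pos}")

    def scan_to(self, first, synch):
        """Panic mode: on a bad token, report once, then skip ahead."""
        if self.token not in first:
            self.error(f"unexpected {self.token!r}")
            while self.token not in first | synch | {"$"}:
                self.pos += 1  # ignore tokens until we can resynchronize

    def match(self, expected):
        if self.token == expected:
            self.pos += 1
        else:
            self.error(f"expected {expected!r}")

    def exp(self, synch):
        # exp -> factor { '+' factor }
        self.factor(synch | {"+"})
        while self.token == "+":
            self.match("+")
            self.factor(synch | {"+"})

    def factor(self, synch):
        # factor -> '(' exp ')' | num
        self.scan_to({"(", "num"}, synch)
        if self.token == "(":
            self.match("(")
            self.exp(synch | {")"})
            self.match(")")
        elif self.token == "num":
            self.match("num")
        # otherwise we stopped on a synchronizing token: give up on this factor

def parse(tokens):
    parser = Parser(tokens)
    parser.exp(set())
    if parser.token != "$":
        parser.error("trailing input")
    return parser.errors

errors = parse(["num", "+", "*", "num"])
```

Each call passes down a larger synchronizing set, so an inner function never skips past a token that one of its callers is waiting for. The stray `*` above produces a single message, and parsing resumes at the following `num`.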
Error Recovery in LL(1)
• Fill in each “blank” cell with one of the following options:
– Pop: pop A from the stack (if current token is $ or in Follow(A)). “give up on” A
– Scan: skip tokens until we find one where we can restart the parse.
– Push a new nonterminal (e.g. start symbol if stack becomes empty before input does)
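One way these three options might look in a table-driven parser, as a sketch under stated assumptions: the grammar S -> ( S ) | x, the hand-written `TABLE` and `FOLLOW`, and the policy for a mismatched terminal (report it as missing and pop it) are all choices made for this example. Blank cells are handled by rule instead of being stored in the table.

```python
# Hypothetical grammar: S -> ( S ) | x
TABLE = {("S", "("): ["(", "S", ")"], ("S", "x"): ["x"]}
FOLLOW = {"S": {")", "$"}}

def parse(text, start="S"):
    tokens = list(text) + ["$"]
    pos = 0
    errors = []
    stack = ["$", start]  # the end of the list is the top of the stack
    while stack:
        top, look = stack[-1], tokens[pos]
        if top == look:  # terminal (or final $) matches
            stack.pop()
            pos += 1
        elif (top, look) in TABLE:  # normal expansion
            stack.pop()
            stack.extend(reversed(TABLE[(top, look)]))
        elif top in FOLLOW:  # blank cell for a nonterminal
            if look in FOLLOW[top]:
                errors.append(f"pop: giving up on {top} at {pos}")
                stack.pop()
            else:
                errors.append(f"scan: skipping {look!r} at {pos}")
                pos += 1
        elif top == "$":  # stack ran out before the input did
            if (start, look) in TABLE:
                errors.append(f"push: restarting with {start} at {pos}")
                stack.append(start)
            else:
                errors.append(f"scan: skipping {look!r} at {pos}")
                pos += 1
        else:  # expected terminal is not in the input
            errors.append(f"missing {top!r} at {pos}")
            stack.pop()
    return errors

errors = parse("(y)")
```

Every branch either consumes input or shrinks the stack, except the restart branch, which is taken only when the next token can begin the start symbol. That is what keeps the recovery from looping forever.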
Bottom Up Parsing
• Start with tokens
• Build up rule RHS (right side)
• Replace RHS by LHS
• Done when stack is only start symbol
• (Working from leaves of tree to root)
Operations in Bottom-up Parsing
• Shift:
– Push the terminal from the beginning of the string to the top of the stack
• Reduce:
– Replace the string xyz at the top of the stack by a nonterminal A (assuming A->xyz)
• Accept (when stack is $S’; empty input)
Lookahead
• Look ahead in input by shifting (it’ll all be in the stack)
• Look ahead in the stack
– This requires breaking the abstraction just a little bit (but is technically no problem)
• As before, decision to shift or reduce is made based on next token and stack
Sample Parse
• S’ -> S; S -> aSb | bSa | SS | ε
• String: abba
– Stack = $, input = abba$; shift
– Stack = $a, input = bba$; reduce S->ε
– Stack = $aS, input = bba$; shift
– Stack = $aSb, input = ba$; reduce S->aSb
– Stack = $S, input = ba$; shift
Sample Parse (cont)
– Stack = $S, input = ba$; shift
– Stack = $Sb, input = a$; reduce S->ε
– Stack = $SbS, input = a$; shift
– Stack = $SbSa, input = $; reduce S->bSa
– Stack = $SS, input = $; reduce S->SS
– Stack = $S, input = $; reduce S’->S
– Stack = $S’, input = $; accept
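The sample parse can be replayed mechanically. The sketch below does not decide when to shift or reduce; it only applies a given action list and checks that each reduce finds its RHS on top of the stack. The names `run`, `SHIFT`, and `ACTIONS`, and the tuple encoding of rules, are assumptions for illustration.

```python
SHIFT = "shift"
EMPTY = ("S", ())  # the rule S -> ε

# Actions from the sample parse of "abba", in order; a reduce is (LHS, RHS)
ACTIONS = [
    SHIFT, EMPTY,
    SHIFT, ("S", ("a", "S", "b")),
    SHIFT, EMPTY,
    SHIFT, ("S", ("b", "S", "a")),
    ("S", ("S", "S")),
    ("S'", ("S",)),
]

def run(actions, text):
    stack = ["$"]
    rest = list(text) + ["$"]
    trace = []
    for action in actions:
        trace.append(("".join(stack), "".join(rest)))  # configuration before the action
        if action == SHIFT:
            if rest[0] == "$":
                raise ValueError("nothing left to shift")
            stack.append(rest.pop(0))  # move the next terminal onto the stack
        else:
            lhs, rhs = action
            n = len(rhs)
            if n > 0:
                if stack[-n:] != list(rhs):
                    raise ValueError(f"RHS of {lhs} is not on top of the stack")
                del stack[-n:]  # remove the RHS ...
            stack.append(lhs)   # ... and replace it by the LHS
    accepted = stack == ["$", "S'"] and rest == ["$"]
    return accepted, trace

accepted, trace = run(ACTIONS, "abba")
```

Each entry of `trace` is a (stack, input) pair matching one line of the sample parse, and `accepted` is true when the run ends with $S' on the stack and only $ in the input.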
Rightmost Derivation
• Reduce rules (in order used)
– S->ε
– S->aSb
– S->ε
– S->bSa
– S->SS
– S’->S
Rightmost Derivation
• Rules read “upward” give the following derivation:
– S’ => S => SS => SbSa => Sba => aSbba => abba
• Shift reduce parser generates rightmost derivation in reverse order!
• LR(k) = left-to-right input, rightmost derivation.
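A short sketch that checks this claim for the sample parse: take the reduce rules in the order they were used, apply them in reverse, always expanding the rightmost nonterminal, and collect the forms. The function name `rightmost_derivation` and the string encoding of rules (an empty string stands for ε) are assumptions for illustration.

```python
NONTERMINALS = {"S", "S'"}

# Reduce rules in the order the parser used them, as (LHS, RHS)
REDUCTIONS = [
    ("S", ""), ("S", "aSb"), ("S", ""), ("S", "bSa"), ("S", "SS"), ("S'", "S"),
]

def rightmost_derivation(reductions):
    form = [reductions[-1][0]]  # the last reduction produced the start symbol
    forms = ["".join(form)]
    for lhs, rhs in reversed(reductions):
        # position of the rightmost nonterminal in the current form
        i = max(idx for idx, sym in enumerate(form) if sym in NONTERMINALS)
        if form[i] != lhs:
            raise ValueError("reductions do not form a rightmost derivation")
        form[i:i + 1] = list(rhs)  # expand it using the rule
        forms.append("".join(form))
    return forms

FORMS = rightmost_derivation(REDUCTIONS)
```

The check `form[i] != lhs` never fires for these rules, which is the point: every reversed reduction expands the rightmost nonterminal, so the sequence of forms is a rightmost derivation ending in abba.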
Right Sentential Form
• Each intermediate term of a rightmost derivation is called a right sentential form
– S’, S, SS, SbSa, Sba, aSbba, abba
• All legal intermediate states are right sentential forms (split btwn stack and input string)
Shift vs. Reduce
• Shift until reduction to the next right sentential form is possible
– When a complete RHS is at the top of the stack
– …and more of the RHS is not at the beginning of the string. (Otherwise, S->ε would always be used!)