LR(k) Parsing CPSC 388 Ellen Walker Hiram College


Post on 06-Feb-2016


Page 1: LR(k) Parsing

LR(k) Parsing

CPSC 388
Ellen Walker
Hiram College

Page 2: LR(k) Parsing

Bottom Up Parsing

• Start with tokens
• Build up rule RHS (right side)
• Replace RHS by LHS
• Done when stack is only start symbol

• (Working from leaves of tree to root)

Page 3: LR(k) Parsing

Operations in Bottom-up Parsing

• Shift
– Push the terminal from the beginning of the string to the top of the stack

• Reduce
– Replace the string xyz at the top of the stack by a nonterminal A (assuming A->xyz)

• Accept (when stack is $S’; empty input)

Page 4: LR(k) Parsing

Sample Parse

• S’ -> S; S -> aSb | bSa | SS | e (e is the empty string)
• String: abba
– Stack = $ input = abba$; shift
– Stack = $a input = bba$; reduce S->e
– Stack = $aS input = bba$; shift
– Stack = $aSb input = ba$; reduce S->aSb
– Stack = $S input = ba$; shift

Page 5: LR(k) Parsing

Sample Parse (cont)

– Stack = $S input = ba$; shift
– Stack = $Sb input = a$; reduce S->e
– Stack = $SbS input = a$; shift
– Stack = $SbSa input = $; reduce S->bSa
– Stack = $SS input = $; reduce S->SS
– Stack = $S input = $; reduce S’->S
– Stack = $S’ input = $; accept
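The sample parse can be replayed mechanically. The sketch below is only an illustration: the `replay` helper and the `ACTIONS` list are names made up here, and the action sequence is copied from the slide rather than chosen by a parser. It checks that each reduce really finds its RHS on top of the stack.

```python
# Grammar from the slide: S' -> S ; S -> aSb | bSa | SS | e (empty string)
def replay(tokens, actions):
    """Apply shift/reduce actions in order; return stack snapshots and leftover input."""
    stack, rest, trace = [], list(tokens), []
    for action in actions:
        if action == "shift":
            stack.append(rest.pop(0))       # move the next input token onto the stack
        else:
            lhs, rhs = action               # reduce: replace RHS on top of stack by LHS
            if rhs:
                assert stack[-len(rhs):] == list(rhs), "RHS is not on top of the stack"
                del stack[-len(rhs):]
            stack.append(lhs)
        trace.append("$" + "".join(stack))
    return trace, rest

# Action sequence taken from the sample parse; ("S", "") is the reduction S -> e.
ACTIONS = [
    "shift", ("S", ""), "shift", ("S", "aSb"),
    "shift", ("S", ""), "shift", ("S", "bSa"),
    ("S", "SS"), ("S'", "S"),
]

trace, rest = replay("abba", ACTIONS)
for snapshot in trace:
    print(snapshot)
```

The parse is accepted because the final stack is `$S'` and no input is left.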

Page 6: LR(k) Parsing

LR(k) Parsing

• LR(0) grammars can be parsed with no lookahead (stack only)

• LR(1) grammars need 1 token of lookahead

• LR(k), k>1 use multi-token lookahead

• Most “real” grammars are LR(1)

Page 7: LR(k) Parsing

Shift vs. Reduce

• First, build NFA of LR(0) items
• Transform NFA to DFA
• If no DFA state has a conflict, the grammar is LR(0): use the DFA directly to parse (states indicate shift vs. reduce)

• Otherwise, use SLR(1) algorithm

Page 8: LR(k) Parsing

LR(0) Items

• Rules with . between stack & input
• For S -> (S) | a, the LR(0) items are:
– S -> .(S)
– S -> (.S)
– S -> (S.)
– S -> (S).
– S -> .a
– S -> a.

• S -> .(S) and S-> .a are initial items

• S-> (S). and S->a. are complete items
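As a small sketch of how the items can be enumerated, the code below places the marker at every position of every rule. The representation is an assumption made for illustration: an item is a tuple (lhs, rhs, dot), and the helper names `lr0_items` and `show` are hypothetical.

```python
def lr0_items(grammar):
    """Yield (lhs, rhs, dot) for every marker position in every rule."""
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            for dot in range(len(rhs) + 1):
                yield lhs, rhs, dot

def show(item):
    """Format an item the way the slides write it, e.g. 'S -> (.S)'."""
    lhs, rhs, dot = item
    return f"{lhs} -> {rhs[:dot]}.{rhs[dot:]}"

GRAMMAR = {"S": ["(S)", "a"]}          # S -> (S) | a
items = list(lr0_items(GRAMMAR))
initial = [show(i) for i in items if i[2] == 0]            # marker at the far left
complete = [show(i) for i in items if i[2] == len(i[1])]   # marker at the far right
print(initial, complete)
```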

Page 9: LR(k) Parsing

Building NFA

• Each LR(0) item is a state
• Shift transitions
– A -> .aB --a--> A -> a.B
• Change of goal transitions
– S -> x.Ay --ε--> A -> .aB

Page 10: LR(k) Parsing

More on NFA

• Initial state is “S’ -> .S”
• No final state, but acceptance happens in the “S’ -> S.” state
• Complete LR(0) items have no outbound transitions
– We’ll worry about getting past them later
• No “reduce transitions”
– “shift” on non-terminal used during reduce

Page 11: LR(k) Parsing

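NFA: S-> (S) | Ab ; A -> aA | ε

[Figure: NFA of LR(0) items. Transitions shown:]
• S' -> .S --S--> S' -> S.
• S -> .(S) --(--> S -> (.S) --S--> S -> (S.) --)--> S -> (S).
• S -> .Ab --A--> S -> A.b --b--> S -> Ab.
• A -> .aA --a--> A -> a.A --A--> A -> aA.
• A -> . (complete item, no outbound transitions)
• ε-transitions: from S' -> .S and from S -> (.S) to S -> .(S) and S -> .Ab; from S -> .Ab and from A -> a.A to A -> .aA and A -> .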
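The NFA for this grammar can also be generated from the two transition rules (shift and change of goal). This is a minimal sketch under assumed conventions: items are (lhs, rhs, dot) tuples, every grammar symbol is a single character except the added start symbol S', the empty string stands for ε, and `build_nfa` is a hypothetical helper name.

```python
# Augmented grammar: S' -> S ; S -> (S) | Ab ; A -> aA | ε ("" is the empty RHS)
GRAMMAR = {"S'": ["S"], "S": ["(S)", "Ab"], "A": ["aA", ""]}

def build_nfa(grammar):
    """Return (states, transitions); a transition is (item, label, item)."""
    states = [(lhs, rhs, dot)
              for lhs, alts in grammar.items()
              for rhs in alts
              for dot in range(len(rhs) + 1)]
    transitions = []
    for lhs, rhs, dot in states:
        if dot == len(rhs):
            continue                      # complete item: no outbound transitions
        symbol = rhs[dot]
        # Shift transition: move the marker over the next symbol.
        transitions.append(((lhs, rhs, dot), symbol, (lhs, rhs, dot + 1)))
        # Change of goal: epsilon-move to every initial item of a nonterminal.
        for alt in grammar.get(symbol, []):
            transitions.append(((lhs, rhs, dot), "eps", (symbol, alt, 0)))
    return states, transitions

states, transitions = build_nfa(GRAMMAR)
eps = [t for t in transitions if t[1] == "eps"]
print(len(states), len(transitions), len(eps))
```

For this grammar the sketch produces 13 item states, 8 shift transitions and 8 ε-transitions.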

Page 12: LR(k) Parsing

NFA -> DFA

• Compute ε-closure (closure items)
– All are initial items

• Use subset construction (kernel items)

• Grammar + kernel items are sufficient (closure items can be inferred)

• DFA is computed directly by YACC, etc.

Page 13: LR(k) Parsing

DFA Construction Details

• For each symbol (terminal or nonterminal) after the marker, create a shift transition. The items reached this way (marker moved past the symbol) are kernel items.
– Example: S' -> .S --S--> S' -> S.

Page 14: LR(k) Parsing

DFA Construction Details

• If there are multiple shift transitions on the same symbol, these are combined into the same state.

• (Because the NFA will be in all those states at once).

Page 15: LR(k) Parsing

Adding Closure Items

• When the marker is immediately before a non-terminal symbol, the closure items are all of the initial forms for the new symbol, e.g.
– S’ -> .S (kernel item)
– S -> .(S) (closure item)
– S -> .Ab (closure item)

• These denote the change of goal transitions (which are all epsilon-transitions)
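A closure computation can be sketched in a few lines. The tuple representation (lhs, rhs, dot) and the function name `closure` are assumptions made for this illustration. Note that the full closure also follows the marker in front of A in S -> .Ab, so it contains the two initial A items in addition to the ones listed above.

```python
# Augmented grammar: S' -> S ; S -> (S) | Ab ; A -> aA | ε ("" is the empty RHS)
GRAMMAR = {"S'": ["S"], "S": ["(S)", "Ab"], "A": ["aA", ""]}

def closure(kernel, grammar):
    """Add the initial items of every nonterminal that directly follows a marker."""
    items = list(kernel)
    for lhs, rhs, dot in items:            # the list grows while we iterate over it
        if dot < len(rhs) and rhs[dot] in grammar:
            for alt in grammar[rhs[dot]]:
                new_item = (rhs[dot], alt, 0)
                if new_item not in items:
                    items.append(new_item)
    return items

state0 = closure([("S'", "S", 0)], GRAMMAR)
for lhs, rhs, dot in state0:
    print(f"{lhs} -> {rhs[:dot]}.{rhs[dot:]}")
```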

Page 16: LR(k) Parsing

DFA “Final” States

• The DFA doesn’t actually accept the string, so the concept of “final” isn’t the same

• In JFLAP, mark any state where a reduction can take place as final

Page 17: LR(k) Parsing

DFA S-> (S) | Ab ; A -> aA | ε

Page 18: LR(k) Parsing

LR(0) Parsing

• At each step, push a state onto the stack, and do an action based on the current state
– A->a.xb (not a complete item): if x is a terminal, shift.
– A->aXb. (a complete item): reduce by A->aXb

Page 19: LR(k) Parsing

When Not LR(0)?

• Shift-reduce conflict
– State contains both a complete item and a “shift” item (with a terminal right after the marker)
• Reduce-reduce conflict
– State contains 2 or more complete items.
• Previous example is not LR(0)! (Why?)
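One way to check the LR(0) property is to build the DFA states and test each one against the two conflict definitions. The sketch below does this for S -> (S) | Ab ; A -> aA | ε. All function names are hypothetical, items are (lhs, rhs, dot) tuples, and the state numbers it assigns depend on discovery order, so they need not match any diagram or table.

```python
# Augmented grammar; "" is the empty (ε) right-hand side.
GRAMMAR = {"S'": ["S"], "S": ["(S)", "Ab"], "A": ["aA", ""]}

def closure(kernel):
    """Add initial items for every nonterminal that follows a marker."""
    items = list(kernel)
    for lhs, rhs, dot in items:                 # list grows during iteration
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for alt in GRAMMAR[rhs[dot]]:
                if (rhs[dot], alt, 0) not in items:
                    items.append((rhs[dot], alt, 0))
    return frozenset(items)

def goto(state, symbol):
    """Shift the marker over `symbol` in every item that allows it."""
    return closure([(l, r, d + 1) for l, r, d in state
                    if d < len(r) and r[d] == symbol])

def build_dfa():
    states = [closure([("S'", "S", 0)])]
    for state in states:                        # worklist: list grows as we go
        symbols = sorted({r[d] for _, r, d in state if d < len(r)})
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
    return states

def conflicts(state):
    """Name the LR(0) conflicts present in one DFA state."""
    complete = [i for i in state if i[2] == len(i[1])]
    shifts = [i for i in state
              if i[2] < len(i[1]) and i[1][i[2]] not in GRAMMAR]
    found = []
    if complete and shifts:
        found.append("shift-reduce")
    if len(complete) > 1:
        found.append("reduce-reduce")
    return found

states = build_dfa()
report = {n: conflicts(s) for n, s in enumerate(states) if conflicts(s)}
print(len(states), report)
```

Under these assumptions the DFA has 9 states, and every state whose closure contains the complete item A -> . also contains a shift item, which is a shift-reduce conflict.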

Page 20: LR(k) Parsing

Simple LR(1)

• If a shift is possible, do it
• Else if there is a complete item for A, and the next terminal is in Follow(A), reduce by that rule. Compute the next state by taking the A link from the last state left on the stack before pushing A

• Otherwise, there is a parse error

Page 21: LR(k) Parsing

SLR(1) Table

• Rows are states, columns are symbols (terminal and nonterminal)

• Table entries (3 types):
– sn: shift & goto state n (only for terminals)
– rk: reduce using rule k (rule #’s start at 0 in JFLAP)
– n: goto state n (only for nonterminals, after reduction)

Page 22: LR(k) Parsing

Transitions and Table Entries

• Transition from state m to state n on terminal x
– Put sn in table[m][x]

• Transition from state m to state n on nonterminal X
– Put n in table[m][X]

• State m has a complete item for rule k, and terminal x is in Follow of the LHS of rule k
– Put rk in table[m][x]

• State m contains the complete item “S’ -> S.”
– Put acc (accept) in table[m][$]

Page 23: LR(k) Parsing

SLR(1) Example

• Grammar
– S -> (S) | Ab
– A -> aA | ε

• Firsts
– S: (, a, b
– A: a, ε

• Follows
– S: $, )
– A: b
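The First and Follow sets above can be reproduced with the usual fixed-point iteration. This is a sketch under assumed conventions: single-character symbols, "" for the empty RHS, an added start rule S' -> S, and hypothetical function names `first_of`, `first_sets` and `follow_sets`.

```python
GRAMMAR = {"S'": ["S"], "S": ["(S)", "Ab"], "A": ["aA", ""]}
EPS = "ε"

def first_of(string, first):
    """FIRST of a symbol string, given the current FIRST sets of the nonterminals."""
    result = set()
    for sym in string:
        if sym not in first:              # terminal: it starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)                       # every symbol can vanish (or string is empty)
    return result

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                        # repeat until no set grows
        changed = False
        for lhs, alts in grammar.items():
            for rhs in alts:
                new = first_of(rhs, first)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def follow_sets(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, alts in grammar.items():
            for rhs in alts:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue          # only nonterminals have FOLLOW sets
                    tail = first_of(rhs[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:       # rest can vanish: inherit FOLLOW(lhs)
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

first = first_sets(GRAMMAR)
follow = follow_sets(GRAMMAR, first, "S'")
print(first, follow)
```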

Page 24: LR(k) Parsing

SLR(1) Example Table

State  (    )    a    b    $    A    S
0      s2        s3   r4        7    1
1                          acc
2      s2        s3   r4        7    5
3                s3   r4        4
4                     r3
5           s6
6           r1             r1
7                     s8
8           r2             r2

Page 25: LR(k) Parsing

SLR(1) Example

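Stack          Input    Action
$0             (aab)$   shift (s2)
$0(2           aab)$    shift (s3)
$0(2a3         ab)$     shift (s3)
$0(2a3a3       b)$      reduce A->ε (r4)
$0(2a3a3A4     b)$      reduce A->aA (r3)
$0(2a3A4       b)$      reduce A->aA (r3)
$0(2A7         b)$      shift (s8)

Page 26: LR(k) Parsing

SLR(1) Example cont.

Stack          Input    Action
$0(2A7         b)$      shift (s8)
$0(2A7b8       )$       reduce S->Ab (r2)
$0(2S5         )$       shift (s6)
$0(2S5)6       $        reduce S->(S) (r1)
$0S1           $        reduce S’->S
$0S’           $        accept!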
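A table-driven parse like this one can be sketched with the SLR(1) example table entered by hand. The state numbers and rule numbers below follow that table (rule 0 is S' -> S, 1 is S -> (S), 2 is S -> Ab, 3 is A -> aA, 4 is A -> ε). The names `ACTION`, `GOTO`, `RULES` and `parse` are assumptions for this illustration, and only states are kept on the stack.

```python
# rule number -> (LHS, length of RHS)
RULES = {1: ("S", 3), 2: ("S", 2), 3: ("A", 2), 4: ("A", 0)}
ACTION = {
    (0, "("): "s2", (0, "a"): "s3", (0, "b"): "r4",
    (1, "$"): "acc",
    (2, "("): "s2", (2, "a"): "s3", (2, "b"): "r4",
    (3, "a"): "s3", (3, "b"): "r4",
    (4, "b"): "r3",
    (5, ")"): "s6",
    (6, ")"): "r1", (6, "$"): "r1",
    (7, "b"): "s8",
    (8, ")"): "r2", (8, "$"): "r2",
}
GOTO = {(0, "A"): 7, (0, "S"): 1, (2, "A"): 7, (2, "S"): 5, (3, "A"): 4}

def parse(text):
    """Return (accepted, list of rule numbers used in reductions)."""
    tokens = list(text) + ["$"]
    stack = [0]                           # state stack; symbols are implied by states
    pos = 0
    reductions = []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:
            return False, reductions      # empty table cell: parse error
        if entry == "acc":
            return True, reductions
        n = int(entry[1:])
        if entry[0] == "s":               # shift: push state n, consume the token
            stack.append(n)
            pos += 1
        else:                             # reduce by rule n, then take the goto link
            lhs, length = RULES[n]
            if length:
                del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(n)

print(parse("(aab)"))
```

The returned rule numbers list the reductions in the order they are performed, which is the reverse of a rightmost derivation.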

Page 27: LR(k) Parsing

Another SLR(1) Grammar to Try

• S -> zMNz
• M -> aMa
• M -> z
• N -> bNb
• N -> z

Page 28: LR(k) Parsing

Parsing Conflicts in SLR(1)

• Shift-reduce conflict
– Prefer shift over reduce
• Reduce-reduce conflicts
– Error in design of grammar (usually)
– Possible to designate a grammar-specific choice

Page 29: LR(k) Parsing

Dangling Else

• Remember: if C if C else S
– Shift-preference puts else with inner if!
– To put else with outer if, inner “if C” must be reduced to S first

• Good example of how language “evolved” to make it easy for the compiler!

Page 30: LR(k) Parsing

More than SLR(1)

• SLR(k) Parsing
– Multiple-token lookahead (for shifts) and multiple-token follow information (for reductions)

• General LR(1) parsing
– Include lookaheads in DFA construction

• LALR(1) parsing
– Simplified state diagram for general LR(1)
– What YACC / Bison uses