LR Parsing for beginners
LR Parsing
● Recall the following terms:
– Rightmost Derivation
– Reduction
– Handle
– Shift-Reduce Parsing
● We now look at the class of grammars that can be parsed using shift-reduce techniques
LR Parsing
● The LR parsing algorithm is driven by a finite automaton (FA).
● GOTO is the transition function of the FA.
● ACTION is a function that tells the parser what action to take given the current state of the FA and the next input symbol.
● ACTION at any step can be Shift, Reduce, Error, Accept.
LR Parsing
● ACTION and GOTO together are the parsing table of the LR parser.
● There are multiple ways of constructing ACTION and GOTO, called:
– SLR or Simple LR
– LR or Canonical LR
– LALR or Lookahead LR
● Each of these corresponds to a different construction of the Finite Automaton
SLR Parsing
● We first look at the SLR method of constructing parsing tables.
● The corresponding automaton is called the LR(0) automaton.
● The states of this automaton are sets of LR(0) items, which we describe next.
LR(0) Items
● An item is a production plus an index in the right hand side of the production which we will denote by a dot.
● The production A->XYZ yields the following items:
– A->.XYZ
– A->X.YZ
– A->XY.Z
– A->XYZ.
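The enumeration above can be sketched in a few lines of Python. The function name `items_of` and the string rendering of an item are assumptions made for illustration; they do not come from the slides.

```python
def items_of(lhs, rhs):
    """Return every LR(0) item of the production lhs -> rhs, rendered as strings."""
    # The dot can sit at any index from 0 (nothing seen yet)
    # up to len(rhs) (the whole right hand side has been seen).
    return [
        f"{lhs}->{''.join(rhs[:dot])}.{''.join(rhs[dot:])}"
        for dot in range(len(rhs) + 1)
    ]


print(items_of("A", ["X", "Y", "Z"]))
# ['A->.XYZ', 'A->X.YZ', 'A->XY.Z', 'A->XYZ.']
```

A production with n symbols on its right hand side therefore yields n + 1 items.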
LR(0) Items
● The dot in an item keeps track of where we are in a parse.
● The part of the production before the dot corresponds to the part of input we have already seen.
● The part after the dot is what we expect to see next.
LR(0) Items
● For example, take the item A->X.YZ
● This means that the parser has already seen some input that can be derived from X.
● Now it is waiting to see some string that can be derived from Y. Once that happens it moves to the item A->XY.Z
● Now it expects to see something derivable from Z so it can move to A->XYZ.
● Now it has seen something derived from XYZ, which can be reduced to A.
Augmented Grammar
● To construct the LR(0) automaton, we first augment the grammar by adding a new start symbol.
● Given a CFG with start symbol S, we add a new start symbol S' and a new production:
– S'->S
● This will tell us when we can stop parsing and accept.
Closure of Items
● Suppose we have an item A->X.YZ and a production Y->UVW
● The item tells us that we are waiting to see something derived from Y, and the production tells us that such a string can begin with something derived from U.
● Therefore we add the item Y->.UVW to the closure of the original item.
Closure of Items
● Given I, a set of items, compute Closure(I) as follows:
● Add everything in I to Closure(I)
● If A->x.By is in Closure(I) and B->z is a production then add B->.z to Closure(I), if not already there.
● Apply the previous step until no more items can be added.
Closure Example
● Consider the augmented expression grammar:
– E'->E
– E->E+T | T
– T->T*F | F
– F->(E) | id
● Given the item set, I = {E'->.E}, compute Closure(I).
Closure Example
Closure(I) contains the items:
– E'->.E
– E->.E+T
– E->.T
– T->.T*F
– T->.F
– F->.(E)
– F->.id
Closure
● Closures of item sets will be the states in the finite automaton.
● Next we see how to compute the transitions. These correspond to the GOTO function.
GOTO Function
● The intuitive idea is as follows.
● Given an item A->X.YZ and grammar symbol Y, the next item is A->XY.Z plus everything in the closure of A->XY.Z
● This means that we just saw a Y and now we expect to see something derived from Z.
GOTO Function
● Formally, let I be a set of items and X a grammar symbol.
● GOTO(I, X) is the closure of the set of all items [A->xX.y] such that [A->x.Xy] is in I.
GOTO Example
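● If I = {[E'->E.], [E->E.+T]}, then compute GOTO(I, +)
● Only [E->E.+T] has + after the dot, so this is the closure of [E->E+.T], which contains the items:
– E->E+.T
– T->.T*F
– T->.F
– F->.(E)
– F->.id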
LR(0) Automaton For Expressions
● The start state is the closure of [E'->.E]
● Together with Closure and GOTO, we can now build the automaton.
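One way to build the automaton is to start from the closure of [E'->.E] and keep applying GOTO until no new item sets appear. The Python sketch below does this under assumed encodings (dict-based grammar, `(lhs, rhs, dot)` items, hypothetical helper names). States are numbered in order of discovery, so the numbers need not match those in any textbook figure.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for production in grammar[rhs[dot]]:
                new_item = (rhs[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved, grammar)


def build_automaton(grammar, start_item):
    """Return (states, transitions) of the LR(0) automaton."""
    # Every symbol that occurs on some right hand side can label a transition.
    symbols = sorted({s for rhss in grammar.values() for rhs in rhss for s in rhs})
    states = [closure({start_item}, grammar)]
    number = {states[0]: 0}  # item set -> state number
    transitions = {}         # (state number, symbol) -> state number
    i = 0
    while i < len(states):   # states grows as new item sets are discovered
        for symbol in symbols:
            target = goto(states[i], symbol, grammar)
            if not target:
                continue     # no item in this state has symbol after the dot
            if target not in number:
                number[target] = len(states)
                states.append(target)
            transitions[(i, symbol)] = number[target]
        i += 1
    return states, transitions


states, transitions = build_automaton(GRAMMAR, ("E'", ("E",), 0))
print(len(states), len(transitions))  # 12 22
```

For the expression grammar this yields 12 states and 22 transitions.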
Using the LR(0) Automaton for Parsing
● Note that each state other than the start state corresponds to a unique grammar symbol: the one labeling all of its incoming transitions.
● So the stack can hold states in place of grammar symbols.
● State 0 is the start state for the automaton
Using the LR(0) Automaton for Parsing
● Parsing with the LR(0) automaton is done with a stack, which holds states.
● Top of the stack is the current state.
● If we are currently in the state j with input symbol a and GOTO(j, a) is k then we shift a (or correspondingly push state k).
● Otherwise we reduce using the item with a dot at the end.
● Reduction corresponds to popping the states for the right hand side, one per symbol, and pushing the state for the left hand side: if t is the state now on top of the stack and A is the left hand side, we push GOTO(t, A).
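The shift and reduce steps above can be sketched in Python. This is a minimal illustration under several assumptions, not a full SLR parser: item sets themselves serve as states, GOTO is computed on demand, `$` marks the end of input, and the parse is accepted only when the reduction is by E'->E with `$` as the next symbol. The helper names and encodings are hypothetical.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for production in grammar[rhs[dot]]:
                new_item = (rhs[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved, grammar)


def parse(tokens, grammar, start="E'"):
    """Return the list of reductions performed, or None on a syntax error."""
    tokens = list(tokens) + ["$"]  # assumed end-of-input marker
    stack = [closure({(start, grammar[start][0], 0)}, grammar)]
    reductions, pos = [], 0
    while True:
        state, a = stack[-1], tokens[pos]
        target = goto(state, a, grammar)
        if target:                 # GOTO(j, a) exists: shift
            stack.append(target)
            pos += 1
            continue
        # Otherwise reduce using the item with the dot at the end.
        complete = [(l, r) for l, r, d in state if d == len(r)]
        if len(complete) != 1:
            return None            # nothing (or no unique item) to reduce by
        lhs, rhs = complete[0]
        if lhs == start:           # reduction by E'->E: accept only at end of input
            return reductions if a == "$" else None
        del stack[len(stack) - len(rhs):]            # pop one state per symbol
        stack.append(goto(stack[-1], lhs, grammar))  # push GOTO(t, A)
        reductions.append(f"{lhs}->{''.join(rhs)}")


print(parse(["id", "*", "id"], GRAMMAR))
# ['F->id', 'T->F', 'F->id', 'T->T*F', 'E->T']
print(parse(["id", "id"], GRAMMAR))
# None
```

The reductions come out in the reverse order of a rightmost derivation. Because this sketch reduces whenever it cannot shift, it may perform several reductions before it detects an error.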
SLR Parsing Table
● Rather than use the LR(0) automaton directly, as just described, we encode the information in a parsing table.
● The parsing table has two parts: ACTION and GOTO.
● GOTO gives the next state.
● ACTION tells us whether to shift or to reduce using some production.