LR Parsing

CS424 Compiler Construction

Upload: john

Post on 02-Dec-2015

DESCRIPTION

LR Parsing for beginners

LR Parsing

● Recall the following terms:

– Rightmost Derivation

– Reduction

– Handle

– Shift-Reduce Parsing

● We now look at the class of grammars that can be parsed using shift-reduce techniques

LR Parsing

● The basic structure of an LR parser looks as follows.

LR Parsing

● The LR parsing algorithm follows a Finite Automaton

● GOTO is the transition function of the FA.

● ACTION is a function that tells the parser what action to take given the current state of the FA and the next input symbol.

● ACTION at any step can be Shift, Reduce, Error, Accept.

LR Parsing

● ACTION and GOTO together are the parsing table of the LR parser.

● There are multiple ways of constructing ACTION and GOTO:

– SLR, or Simple LR

– Canonical LR, often written LR(1)

– LALR or Lookahead LR

● Each of these corresponds to a different construction of the Finite Automaton

SLR Parsing

● We first look at the SLR method of constructing parsing tables.

● The corresponding automaton is called the LR(0) automaton.

● The states of this automaton are sets of LR(0) items, which we describe next.

LR(0) Items

● An item is a production plus an index in the right hand side of the production which we will denote by a dot.

● The production A->XYZ yields the following items:

– A->.XYZ

– A->X.YZ

– A->XY.Z

– A->XYZ.
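The item list above can be generated mechanically by sliding the dot across the right-hand side. The sketch below is a minimal illustration; the `(head, body, dot)` tuple and the helper names `items_of` and `show` are assumptions of this example, not notation from the slides.

```python
# An LR(0) item is modelled as (head, body, dot): the production head -> body
# plus the index of the dot in the body. This representation is an assumption.

def items_of(head, body):
    """Return every LR(0) item of the production head -> body."""
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]


def show(item):
    """Format an item in the slides' notation, for example A->X.YZ."""
    head, body, dot = item
    return f"{head}->{''.join(body[:dot])}.{''.join(body[dot:])}"


for item in items_of("A", "XYZ"):
    print(show(item))
```

A production with n symbols on its right-hand side always yields n + 1 items, one per dot position.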

LR(0) Items

● The dot in an item keeps track of where we are in a parse.

● The part of the production before the dot corresponds to the part of input we have already seen.

● The part after the dot is what we expect to see next.

LR(0) Items

● For example, take the item A->X.YZ

● This means that the parser has already seen some input that can be derived from X.

● Now it is waiting to see some string that can be derived from Y. Once that happens it moves to the item A->XY.Z

● Now it expects to see something derivable from Z so it can move to A->XYZ.

● Now it has seen something derived from XYZ, which can be reduced to A.

Augmented Grammar

● To construct the LR(0) automaton, we first augment the grammar by adding a new start symbol.

● Given a CFG with start symbol S, we add a new start symbol S' and a new production:

– S'->S

● This will tell us when we can stop parsing and accept.

Closure of Items

● Suppose we have an item A->X.YZ and a production Y-> ABC

● The item tells us that we are waiting to see something derived from Y. The production tells us that a string derived from Y can start with something derived from A, the first symbol of ABC, so we are also waiting to see that.

● Therefore we add the item Y->.ABC to the closure of the original item.

Closure of Items

● Given I, a set of items, compute Closure(I) as follows:

● Add everything in I to Closure(I).

● If A->x.By is in Closure(I) and B->z is a production, then add B->.z to Closure(I), if not already there.

● Apply the previous step until no more items can be added.

Closure Example

● Consider the augmented expression grammar: E'->E, E->E+T | T, T->T*F | F, F->(E) | id.

● Given the item set I = {E'->.E}, compute Closure(I).

Closure Example

Closure(I) contains the items– E'->.E

– E->.E+T

– E->.T

– T->.T*F

– T->.F

– F->.(E)

– F->.id
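The closure computation can be sketched as a worklist loop that keeps adding items until nothing new appears. The code below is a minimal sketch under stated assumptions: items are `(head, body, dot)` tuples, and the names `GRAMMAR` and `closure` are chosen for this example only.

```python
# Augmented expression grammar as (head, body) pairs; bodies are symbol tuples.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar=GRAMMAR):
    """Return the closure of a set of LR(0) items (head, body, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot == len(body):
            continue  # dot at the end: no symbol is expected next
        expected = body[dot]
        for head, prod_body in grammar:
            new_item = (head, prod_body, 0)
            if head == expected and new_item not in result:
                result.add(new_item)
                worklist.append(new_item)
    return frozenset(result)


start = closure({("E'", ("E",), 0)})
print(len(start))  # 7 items, matching the list above
```

Terminals never match a production head, so an item with the dot before a terminal contributes nothing new.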

Closure

● Closures of item sets will be the states in the finite automaton.

● Next we see how to compute the transitions. These correspond to the GOTO function.

GOTO Function

● The intuitive idea is as follows.

● Given an item A->X.YZ and grammar symbol Y, the next item is A->XY.Z plus everything in the closure of A->XY.Z

● This means that we just saw a Y and now we expect to see something derived from Z.

GOTO Function

● Formally, let I be a set of items and X a grammar symbol.

● GOTO(I, X) is the closure of the set of all items [A->xX.y] such that [A->x.Xy] is in I.

GOTO Example

● If I = {[E'->E.], [E->E.+T]}, then compute GOTO(I, +)

● This is the closure of E->E+.T
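The GOTO example can be checked with a short sketch: move the dot over the symbol in every item that allows it, then take the closure. The item representation `(head, body, dot)` and the names `GRAMMAR`, `closure` and `goto` are assumptions of this example.

```python
# Augmented expression grammar as (head, body) pairs.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar=GRAMMAR):
    """Return the closure of a set of LR(0) items (head, body, dot)."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        for head, prod_body in grammar:
            new_item = (head, prod_body, 0)
            if dot < len(body) and head == body[dot] and new_item not in result:
                result.add(new_item)
                worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar=GRAMMAR):
    """GOTO(I, X): advance the dot over X wherever possible, then close."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
result = goto(I, "+")
print(len(result))  # 5 items: E->E+.T, T->.T*F, T->.F, F->.(E), F->.id
```

When no item in I has the dot before the symbol, `goto` returns the empty set, which means the automaton has no transition on that symbol.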

LR(0) Automaton For Expressions

● The start state is the closure of [E'->.E]

● Using Closure and GOTO, we can now build the automaton.
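One way to build the automaton is to start from the closure of [E'->.E] and repeatedly apply GOTO until no new item set appears. The sketch below assumes items are `(head, body, dot)` tuples; `build_automaton` and the state numbering it produces are specific to this example and need not match the numbering used in the slides' figure.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar=GRAMMAR):
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        for head, prod_body in grammar:
            new_item = (head, prod_body, 0)
            if dot < len(body) and head == body[dot] and new_item not in result:
                result.add(new_item)
                worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar=GRAMMAR):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def build_automaton(grammar=GRAMMAR):
    """Return (states, transitions) of the LR(0) automaton."""
    symbols = sorted({s for _, body in grammar for s in body})
    start = closure({(grammar[0][0], grammar[0][1], 0)}, grammar)
    states, index = [start], {start: 0}
    transitions = {}  # (state number, grammar symbol) -> state number
    i = 0
    while i < len(states):  # states appended below are visited later
        for x in symbols:
            target = goto(states[i], x, grammar)
            if not target:
                continue  # no transition on x from this state
            if target not in index:
                index[target] = len(states)
                states.append(target)
            transitions[(i, x)] = index[target]
        i += 1
    return states, transitions


states, transitions = build_automaton()
print(len(states))  # 12 states for the expression grammar
```

State 0 is always the start state here because it is created first; the other numbers depend on the order in which symbols are tried.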

Using the LR(0) Automaton for Parsing

● Note that each state other than the start state corresponds to a unique grammar symbol: the one labeling every transition into that state.

● So a stack of states also records the grammar symbols that have been seen.

● State 0 is the start state for the automaton

Using the LR(0) Automaton for Parsing

● Parsing with the LR(0) automaton is done with a stack, which holds states.

● The top of the stack is the current state.

● If we are in state j with next input symbol a and GOTO(j, a) is k, then we shift a, that is, we push state k.

● Otherwise we reduce using an item in state j that has the dot at the end.

● A reduction pops one state for each symbol on the right hand side of the production, then pushes GOTO(t, A), where t is the state now on top of the stack and A is the left hand side.

Parsing Example

SLR Parsing Table

● Rather than use the LR(0) automaton directly, as just described, we code the information into a parsing table.

● The parsing table has two parts: ACTION and GOTO.

● GOTO gives the next state.

● ACTION tells us whether to shift or to reduce using some production.
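A table-driven parser for the expression grammar can be sketched as follows. The table below is the usual textbook SLR table for this grammar, written out by hand; its state numbers, the production numbers 1 to 6, and the names `ACTION`, `GOTO`, `PRODUCTIONS` and `parse` are assumptions of this example and may differ from the table on the original slide. Here `GOTO` holds only the transitions on nonterminals, since shifts already carry their target state.

```python
# Production number -> (left hand side, length of right hand side).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

ACTION = {}  # (state, terminal) -> ("s", state) | ("r", production) | ("acc", None)
for s in (0, 4, 6, 7):
    ACTION[(s, "id")] = ("s", 5)
    ACTION[(s, "(")] = ("s", 4)
ACTION[(1, "+")] = ("s", 6)
ACTION[(1, "$")] = ("acc", None)
ACTION[(8, "+")] = ("s", 6)
ACTION[(8, ")")] = ("s", 11)
ACTION[(2, "*")] = ("s", 7)
ACTION[(9, "*")] = ("s", 7)
for s, prod in ((2, 2), (9, 1)):
    for t in ("+", ")", "$"):
        ACTION[(s, t)] = ("r", prod)
for s, prod in ((3, 4), (5, 6), (10, 3), (11, 5)):
    for t in ("+", "*", ")", "$"):
        ACTION[(s, t)] = ("r", prod)

GOTO = {
    (0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
    (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
    (6, "T"): 9, (6, "F"): 3,
    (7, "F"): 10,
}


def parse(tokens):
    """Return the production numbers used, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]  # "$" marks the end of the input
    stack, pos, used = [0], 0, []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:  # a blank table entry means Error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        kind, arg = entry
        if kind == "s":  # Shift: push the next state, consume the token
            stack.append(arg)
            pos += 1
        elif kind == "r":  # Reduce: pop the right hand side, push GOTO
            head, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], head)])
            used.append(arg)
        else:  # Accept
            return used


print(parse(["id", "*", "id", "+", "id"]))  # [6, 4, 6, 3, 2, 6, 4, 1]
```

Read backwards, the returned list is a rightmost derivation of the input.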

SLR Parsing Table

● Label the productions in the expression grammar as follows:

– (1) E->E+T

– (2) E->T

– (3) T->T*F

– (4) T->F

– (5) F->(E)

– (6) F->id

SLR Parsing Table

SLR Parsing Example

LR Parsing Algorithm

SLR Parsing Table Construction

SLR Parsing Table Construction

SLR Parsing

● Any grammar for which the previous algorithm results in a parsing action conflict is not SLR.

● No ambiguous grammar is SLR parseable.

● There are some non-ambiguous grammars that cannot be parsed by SLR techniques.
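The last point can be illustrated with a sketch that fills the ACTION table and reports entries that receive more than one action. The construction algorithm itself did not survive in this transcript, so the code assumes the standard SLR rule: shift on a terminal a when GOTO(i, a) is defined, and reduce by A->x on every terminal in FOLLOW(A) when state i contains A->x. with the dot at the end. The grammar `ASSIGN` is a common textbook example that is not taken from the slides, the FOLLOW computation assumes a grammar without empty productions, and all function names are hypothetical.

```python
def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        for h, b in grammar:
            if dot < len(body) and h == body[dot] and (h, b, 0) not in result:
                result.add((h, b, 0))
                work.append((h, b, 0))
    return frozenset(result)


def goto(items, x, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == x}
    return closure(moved, grammar)


def follow_sets(grammar):
    """FOLLOW sets; assumes no production has an empty right hand side."""
    nts = {h for h, _ in grammar}
    first = {n: set() for n in nts}
    follow = {n: set() for n in nts}
    follow[grammar[0][0]].add("$")
    changed = True
    while changed:  # iterate FIRST and FOLLOW together to a fixed point
        changed = False
        for h, b in grammar:
            new = first[b[0]] if b[0] in nts else {b[0]}
            if not new <= first[h]:
                first[h] |= new
                changed = True
            for i, sym in enumerate(b):
                if sym not in nts:
                    continue
                if i + 1 == len(b):
                    new = follow[h]
                else:
                    new = first[b[i + 1]] if b[i + 1] in nts else {b[i + 1]}
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def slr_conflicts(grammar):
    """Return the ACTION entries (state, terminal) holding several actions."""
    nts = {h for h, _ in grammar}
    symbols = sorted({s for _, b in grammar for s in b})
    follow = follow_sets(grammar)
    start = closure({(grammar[0][0], grammar[0][1], 0)}, grammar)
    states, index, action = [start], {start: 0}, {}
    i = 0
    while i < len(states):
        for x in symbols:
            target = goto(states[i], x, grammar)
            if not target:
                continue
            if target not in index:
                index[target] = len(states)
                states.append(target)
            if x not in nts:
                action.setdefault((i, x), set()).add(("shift", index[target]))
        for h, b, d in states[i]:
            if d == len(b) and h == grammar[0][0]:
                action.setdefault((i, "$"), set()).add(("accept",))
            elif d == len(b):
                for a in follow[h]:
                    action.setdefault((i, a), set()).add(("reduce", h, b))
        i += 1
    return {key: acts for key, acts in action.items() if len(acts) > 1}


# Unambiguous but not SLR: S -> L=R | R, L -> *R | id, R -> L.
ASSIGN = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
          ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
EXPR = [("E'", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
        ("T", ("T", "*", "F")), ("T", ("F",)),
        ("F", ("(", "E", ")")), ("F", ("id",))]

print(slr_conflicts(ASSIGN))  # one shift/reduce conflict, on "="
print(slr_conflicts(EXPR))    # {}
```

The conflict arises in the state containing S->L.=R and R->L. because "=" is in FOLLOW(R), so the table asks for both a shift and a reduce on "=".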