compiler structures

241-437 Compilers: Bottom-up/6. Compiler Structures. Objective: describe bottom-up (LR) parsing using shift-reduce and parse tables; explain how LR parse tables are generated. 241-437, Semester 1, 2011-2012. 6. Bottom-up (LR) Parsing

Upload: svea

Post on 19-Mar-2016

Page 1: Compiler Structures

241-437 Compilers: Bottom-up/6 1

Compiler Structures

• Objective
– describe bottom-up (LR) parsing using shift-reduce and parse tables
– explain how LR parse tables are generated

241-437, Semester 1, 2011-2012

6. Bottom-up (LR) Parsing

Overview

1. What is a LR Parser?
2. Bottom-up using Shift-Reduce
3. Building a LR Parser
4. Generating the Parse Table
5. LR Conflicts
6. LL, SLR, LR, LALR Grammars

In this lecture

[Figure: compiler phases. Source Program → Front End (Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Int. Code Generator) → Intermediate Code → Back End (Code Optimizer → Target Code Generator) → Target Lang. Prog.]

This lecture is about the Syntax Analyzer, but concentrating on bottom-up parsing.

1. What is a LR Parser?

• An LR parser reads its input tokens from Left-to-right and produces a Rightmost derivation (in reverse order).

• The parse tree is built bottom-up, starting from the leaves and working upwards to the start symbol.

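LR in Action

Grammar:
S => a A B e
A => A b c | b
B => d

Task: parse "a b b c d e"

The tree corresponds to a rightmost derivation:
S ⇒ a A B e ⇒ a A d e ⇒ a A b c d e ⇒ a b b c d e

Reducing a sentence (at each step, the symbols that are replaced match a production's right-hand side):
a b b c d e
a A b c d e
a A d e
a A B e
S

[Figure: the parse tree for "a b b c d e" drawn in four stages, growing from the leaves (first A over b, then A over A b c, then B over d) up to the root S.]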

LR(k) Parsing

• The k is the number of input tokens that are looked at (the lookahead) when deciding which production to use.
– e.g. LR(0), LR(1)

• We'll be using a variation of LR(0) parsing in this chapter.

LR versus LL

• LR can deal with more complex (powerful) grammars than LL (top-down parsers).

• LR can detect errors quicker than LL.

• LR parsers can be implemented very efficiently, but they're difficult to build by hand (unlike LL parsers).

2. Bottom-up using Shift-Reduce

• The usual way of implementing bottom-up parsing is by using shift-reduce:
– 'shift' means read in a new input token, and push it onto a stack
– 'reduce' means to group several symbols into a single non-terminal
  • by choosing a production to use 'backwards'
  • the symbols are popped off the stack, and the production's non-terminal is pushed onto it

Shift-Reduce Parsing

Grammar:
S => a A B e
A => A b c | b
B => d

Stack        | Input          | Action
$            | a b b c d e $  | Shift
$ a          | b b c d e $    | Shift
$ a b        | b c d e $      | Reduce A => b
$ a A        | b c d e $      | Shift
$ a A b      | c d e $        | Shift
$ a A b c    | d e $          | Reduce A => A b c
$ a A        | d e $          | Shift
$ a A d      | e $            | Reduce B => d
$ a A B      | e $            | Shift
$ a A B e    | $              | Reduce S => a A B e
$            | $              |
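The trace above can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the names `RULES`, `replay` and `ACTIONS` are made up for illustration, and the list of actions is supplied by hand, because choosing between shift and reduce is exactly what the parse table of the next section is for.

```python
# Grammar from the slide: rule number -> (non-terminal, right-hand side)
RULES = {
    1: ("S", ["a", "A", "B", "e"]),
    2: ("A", ["A", "b", "c"]),
    3: ("A", ["b"]),
    4: ("B", ["d"]),
}

def replay(tokens, actions):
    """Apply a given list of shift/reduce actions and record each step.

    `actions` holds "shift" or ("reduce", rule_number) entries. The decision
    of which action to take is NOT made here; the caller supplies it.
    """
    stack, pos, trace = [], 0, []
    for act in actions:
        trace.append((" ".join(stack), " ".join(tokens[pos:]), act))
        if act == "shift":
            stack.append(tokens[pos])        # push the next input token
            pos += 1
        else:
            head, body = RULES[act[1]]
            # the top of the stack must match the production's right-hand side
            assert stack[-len(body):] == body, "stack top does not match rule"
            del stack[-len(body):]           # pop the right-hand side ...
            stack.append(head)               # ... and push the non-terminal
    return stack, tokens[pos:], trace

# The action column of the table above (assumed to be read top to bottom).
ACTIONS = ["shift", "shift", ("reduce", 3), "shift", "shift", ("reduce", 2),
           "shift", ("reduce", 4), "shift", ("reduce", 1)]

final_stack, rest, trace = replay("a b b c d e".split(), ACTIONS)
for stack_text, input_text, act in trace:
    print(f"{stack_text:<10} | {input_text:<12} | {act}")
```

In this sketch the stack ends up holding only the start symbol S once all input is consumed, which is the success condition for a shift-reduce parse.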

3. Building a LR Parser

• The standard way of writing a shift-reduce LR parser is to generate a parse table for the grammar, and 'plug' that into a standard LR compiler framework.

• The table has two main parts: actions and gotos.

3.1. Inside an LR Parser

[Figure: the LR Parser reads the input tokens a1 a2 … ai … an $, pushes and pops a stack of pairs X0 s0 … Xm-1 sm-1 Xm sm, consults the parse table (actions and gotos), and outputs a parse tree.]

• X is a terminal or non-terminal, s = state
• possible actions are shift, reduce, accept, error
• gotos involve state changes
• the parse table is the bit you create

Parse Table for the Example

Productions:
1: S => a A B e
2: A => A b c
3: A => b
4: B => d

      | Action part                   | Goto part
State | a    b    c    d    e    $    | S    A    B
  0   | s1                            |
  1   |      s3                       |      2
  2   |      s5        s6             |           4
  3   |      r3        r3             |
  4   |                     s7        |
  5   |           s8                  |
  6   |                     r4        |
  7   |                          acc  |
  8   |      r2        r2             |

s means shift to that state

r means reduce by that numbered production

3.2. Table Algorithm

push(<$,0>);                /* push <symbol,state> pair */
currToken = scanner();

while (1) {                 /* 4 branches for the four possible actions that can be in a table cell */
  <x,state> = pair on top of stack;
  if (action[state, currToken] == <shift newState>) {
    push(<currToken, newState>);
    currToken = scanner();
  }
  else if (action[state, currToken] == <reduce ruleNum>) {
    A --> β is rule number ruleNum;
    bodySize = numElements(β);
    pop bodySize pairs off stack;
    state' = state part of pair on top of stack;
    push(<A, goto[state', A]>);
  }
  else if (action[state, currToken] == accept) {
    S --> β is the start symbol production;
    bodySize = numElements(β);
    pop bodySize pairs off stack;
    state' = state part of pair on top of stack;
    if (state' == 0)
      break;                // success; can now stop
    else
      error();
  }
  else
    error();
} // of while loop
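The pseudocode can be turned into a short runnable program. The following Python sketch is one possible rendering, not the course's own implementation: it hard-codes the example parse table from section 3.1 as dictionaries (the names `ACTION`, `GOTO`, `RULES` and `lr_parse` are assumptions made here), and it returns the list of rule numbers used instead of building a tree.

```python
# Rule number -> (non-terminal, size of the right-hand side)
RULES = {1: ("S", 4), 2: ("A", 3), 3: ("A", 1), 4: ("B", 1)}

# The example parse table; missing cells mean "error".
ACTION = {
    (0, "a"): ("shift", 1),
    (1, "b"): ("shift", 3),
    (2, "b"): ("shift", 5), (2, "d"): ("shift", 6),
    (3, "b"): ("reduce", 3), (3, "d"): ("reduce", 3),
    (4, "e"): ("shift", 7),
    (5, "c"): ("shift", 8),
    (6, "e"): ("reduce", 4),
    (7, "$"): ("accept", 1),
    (8, "b"): ("reduce", 2), (8, "d"): ("reduce", 2),
}
GOTO = {(1, "A"): 2, (2, "B"): 4}

def lr_parse(tokens):
    """Return the rule numbers used, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = [("$", 0)]                       # <symbol, state> pairs
    pos, used = 0, []
    while True:
        state = stack[-1][1]
        kind, arg = ACTION.get((state, tokens[pos]), ("error", None))
        if kind == "shift":
            stack.append((tokens[pos], arg))
            pos += 1
        elif kind == "reduce":
            head, size = RULES[arg]
            del stack[-size:]                # pop the rule body
            # state' is the state now on top of the stack
            stack.append((head, GOTO[(stack[-1][1], head)]))
            used.append(arg)
        elif kind == "accept":
            head, size = RULES[arg]
            del stack[-size:]
            if stack[-1][1] != 0:
                raise SyntaxError("accept did not return to state 0")
            used.append(arg)
            return used
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {state}")

print(lr_parse("a b b c d e".split()))       # rules used, bottom-up
```

Keeping the table as plain data means the same `lr_parse` loop works unchanged for any grammar once its table has been generated, which is the point of the "plug in the table" design described above.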

3.3. Table Parsing Example

Grammar:
S => a A B e
A => A b c | b
B => d

Stack             | Input          | Action
$0                | a b b c d e $  | Shift 1
$0,a1             | b b c d e $    | Shift 3
$0,a1,b3          | b c d e $      | Reduce A => b  (note 1)
$0,a1,A2          | b c d e $      | Shift 5
$0,a1,A2,b5       | c d e $        | Shift 8
$0,a1,A2,b5,c8    | d e $          | Reduce A => A b c  (note 2)
$0,a1,A2          | d e $          | Shift 6
$0,a1,A2,d6       | e $            | Reduce B => d
$0,a1,A2,B4       | e $            | Shift 7
$0,a1,A2,B4,e7    | $              | Accept S => a A B e
$0                | $              |

Note 1: pop 1 pair; state' == 1; push(A, goto(1, A)) = push(A,2)

Note 2: pop 3 pairs; state' == 1; push(A, goto(1, A)) = push(A,2)

3.4. The LR Parse Stack

• The parse stack holds the branches of the tree being built bottom-up.

• For example,
– the stack $0,a1,A2,b5,c8 represents:

    a   A   b   c
        |
        b

– the next stack, $0,a1,A2, represents:

    a     A
        / | \
       A  b  c
       |
       b

– later, $0,a1,A2,B4,e7 represents:

    a     A      B   e
        / | \    |
       A  b  c   d
       |
       b

4. Generating the Parse Table

• The example parse table was generated using the SLR (simple LR) algorithm
– an extension of LR(0) which uses the grammar's FOLLOW() sets

• Other LR algorithms can also be used to make a parse table:
– e.g. LR(1), LALR(1)

Supporting Techniques

• SLR table generation makes use of three techniques:
– LR(0) items
– the closure() function
– the goto() function

• I'll explain each one first, before the table generation algorithm.

4.1. LR(0) Items

• An LR(0) item is a grammar production with a • at some position of the right-hand side.

• So, a production
A => X Y Z
has four items:
A => • X Y Z
A => X • Y Z
A => X Y • Z
A => X Y Z •

• The empty production A => ε has one item, A => •
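As a small illustration, the items of a production can be enumerated by sliding the dot across the right-hand side. This is a sketch in Python; the tuple representation `(head, body, dot)` and the helper names `lr0_items` and `show` are assumptions made here, not notation from the slides.

```python
def lr0_items(head, body):
    """Return every LR(0) item of the production head => body.

    An item is (head, body, dot), where dot is the position of the • in body.
    """
    body = tuple(body)
    return [(head, body, dot) for dot in range(len(body) + 1)]

def show(item):
    """Format an item the way the slides write it, e.g. 'A => X • Y Z'."""
    head, body, dot = item
    return f"{head} => {' '.join(body[:dot] + ('•',) + body[dot:])}"

for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))

# An empty right-hand side gives exactly one item.
print(show(lr0_items("A", [])[0]))
```

A production with n symbols on its right-hand side always has n + 1 items, which is why the empty production has exactly one.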

4.2. The closure() Function

• The closure() function generates a set of LR(0) items.

• Assume that the grammar only has one production for the start symbol S, S => α

• The initial closure set is: closure( { S => • α } )

• If A => α • B β is in the set, then for each production B => γ, add the item B => • γ to the set, if it's not already there.

• Repeat until no new items can be added to the set.

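Example use of closure()

Grammar:
S --> E
E --> E + T | T
T --> T * F | F
F --> ( E )
F --> id

closure({ S --> • E }) is built up step by step:

Start:
{ S --> • E }

Add the E --> • ... items:
{ S --> • E, E --> • E + T, E --> • T }

Add the T --> • ... items:
{ S --> • E, E --> • E + T, E --> • T, T --> • T * F, T --> • F }

Add the F --> • ... items:
{ S --> • E, E --> • E + T, E --> • T, T --> • T * F, T --> • F, F --> • ( E ), F --> • id }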
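The closure() rule can be written as a worklist loop. The sketch below is a Python rendering under assumptions made here: items are `(head, body, dot)` tuples, the grammar is a dictionary from each non-terminal to its right-hand sides, and the name `GRAMMAR` is illustrative.

```python
# The expression grammar of the example: non-terminal -> list of right-hand sides
GRAMMAR = {
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items, grammar):
    """Return the closure of a set of LR(0) items (head, body, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:                              # repeat until nothing new is added
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:   # a non-terminal follows the •
            for prod in grammar[body[dot]]:
                new_item = (body[dot], prod, 0)        # B => • γ
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)

start = closure({("S", ("E",), 0)}, GRAMMAR)
for head, body, dot in sorted(start):
    print(head, "-->", " ".join(body[:dot] + ("•",) + body[dot:]))
```

Returning a `frozenset` is a design choice of this sketch: it lets whole item sets be compared and used as dictionary keys later, when the states of the parser are collected.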

4.3. The goto() Function

• goto(In, X) takes as input an existing closure set In, and a terminal/non-terminal symbol X.

• The output is a new closure set In+1:
– for each item A => α • X β in In, add closure({ A => α X • β }) to In+1
– repeat until no more items can be added to In+1

[Figure: In --X--> In+1]

goto() Example 1

• Grammar:
S => A B // rule 1, for start symbol
A => a
B => b

• Initial state I0 = closure( { S => • A B } )
= { S => • A B,
    A => • a }

• goto( I0, A) == closure( { S => A • B } )
= { S => A • B, B => • b } // call it I1

• goto( I0, a) == closure( { A => a • } )
= { A => a • } // call it I2

[Figure: closure graph so far: I0 --A--> I1, I0 --a--> I2]

• goto( I1, B) == closure( { S => A B • } )
= { S => A B • } // call it I3
– this is the end of the S production

• goto( I1, b) == closure( { B => b • } )
= { B => b • } // call it I4

[Figure: closure graph: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3 (end state), I1 --b--> I4]
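The four goto() calls of this example can be checked with a few lines of code. This Python sketch assumes the same `(head, body, dot)` item representation as before; `closure` is repeated in compact form so the block runs on its own, and the function and variable names are illustrative rather than taken from the slides.

```python
GRAMMAR = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}

def closure(items, grammar):
    """Closure of a set of LR(0) items (head, body, dot)."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)

def goto(item_set, symbol, grammar):
    """Move the • over `symbol` in every item that allows it, then close."""
    moved = {(head, body, dot + 1)
             for head, body, dot in item_set
             if dot < len(body) and body[dot] == symbol}
    return closure(moved, grammar)

I0 = closure({("S", ("A", "B"), 0)}, GRAMMAR)
I1 = goto(I0, "A", GRAMMAR)     # { S => A • B, B => • b }
I2 = goto(I0, "a", GRAMMAR)     # { A => a • }
I3 = goto(I1, "B", GRAMMAR)     # { S => A B • }
I4 = goto(I1, "b", GRAMMAR)     # { B => b • }
print(sorted(I1))
```

An empty result, such as `goto(I0, "b", GRAMMAR)`, means there is no edge for that symbol in the closure graph.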

goto() Example 2

• Grammar:
S => a A B e // rule 1, for start symbol
A => A b c | b
B => d

• Initial state I0 = closure( { S => • a A B e } )
= { S => • a A B e }

• goto( I0, a) == closure( { S => a • A B e } )
= { S => a • A B e,
    A => • A b c,
    A => • b } // call it I1

[Figure: closure graph so far: I0 --a--> I1]

• goto( I1, A) == closure( { S => a A • B e, A => A • b c } )
= { S => a A • B e,
    A => A • b c,
    B => • d } // call it I2

• goto( I1, b) == closure( { A => b • } )
= { A => b • } // call it I3

[Figure: closure graph so far: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3]

• goto( I2, B) == closure( { S => a A B • e } )
= { S => a A B • e } // call it I4

• Others
– I5: { A => A b • c }
– I6: { B => d • }
– I7: { S => a A B e • } // end of start symbol rule
– I8: { A => A b c • }

[Figure: complete closure graph: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8]
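Building all the states is a matter of applying goto() repeatedly until no new item set appears. The following Python sketch does this breadth-first for the example 2 grammar. It is an illustration under assumptions made here (the item tuples, the `item_sets` helper, and the choice to visit symbols in sorted order); with that visiting order the state numbers come out the same as I0 to I8 above, but a different order would number them differently.

```python
from collections import deque

GRAMMAR = {
    "S": [("a", "A", "B", "e")],
    "A": [("A", "b", "c"), ("b",)],
    "B": [("d",)],
}

def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for prod in GRAMMAR[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)

def goto(item_set, symbol):
    return closure({(h, b, d + 1) for h, b, d in item_set
                    if d < len(b) and b[d] == symbol})

def item_sets(start_item):
    """Collect every reachable item set and the labelled edges between them."""
    first = closure({start_item})
    states, edges = [first], {}
    queue = deque([first])
    while queue:
        current = queue.popleft()
        # the symbols that appear directly after a • in this set
        symbols = sorted({b[d] for _, b, d in current if d < len(b)})
        for sym in symbols:
            target = goto(current, sym)
            if target not in states:
                states.append(target)
                queue.append(target)
            edges[(states.index(current), sym)] = states.index(target)
    return states, edges

states, edges = item_sets(("S", ("a", "A", "B", "e"), 0))
print(len(states), "states")
for (src, sym), dst in sorted(edges.items()):
    print(f"I{src} --{sym}--> I{dst}")
```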

4.4. Using goto() to make a Table

• The columns of the table should be the grammar's terminals, $, and non-terminals.

• The rows should be the numbers 0, 1, …, n of the closure sets I0, I1, …, In
– what we've been calling states

Stage 1

• In stage 1, we add the shift, goto, and accept entries to the table.

• action[i, a] gets <shift j> if goto(Ii, a) == Ij (a is a terminal)

• goto[i, A] gets j if goto(Ii, A) == Ij (A is a non-terminal)

• action[i, $] gets accept if S => α • is in Ii (there must be only one S rule)

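Example Grammar 1

S --> A B
A --> a
B --> b

[Figure: closure graph: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4]

Table after stage 1:

      | action[]       | goto[]
State | a    b    $    | S    A    B
  0   | s2             |      1
  1   |      s4        |           3
  2   |                |
  3   |           acc  |
  4   |                |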

Stage 2

• In stage 2, we add the reduce and error entries to the table.

• action[i, a] gets <reduce ruleNum> if
– [A => α •] is in Ii, and
– A is not S, and
– a is in FOLLOW(A), and
– A => α is rule number ruleNum

• After filling the table cells with shift, goto, accept, and reduce actions, any remaining empty cells will trigger an error() call.
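The two stages can be sketched as one table-filling function. The Python below is a simplified illustration, not the full generator: it assumes the closure graph and the FOLLOW() sets for example grammar 1 are already known and simply typed in as data (`STATES` holds each item as a hypothetical `(rule number, dot position)` pair), and it silently overwrites a cell that is already filled, where a real generator would report a conflict.

```python
RULES = {1: ("S", ("A", "B")), 2: ("A", ("a",)), 3: ("B", ("b",))}
FOLLOW = {"A": {"b"}, "B": {"$"}}

# Item sets I0..I4 of example grammar 1, each item as (rule number, dot position)
STATES = {
    0: {(1, 0), (2, 0)},
    1: {(1, 1), (3, 0)},
    2: {(2, 1)},
    3: {(1, 2)},
    4: {(3, 1)},
}
# Edges of the closure graph: (from state, symbol) -> to state
EDGES = {(0, "A"): 1, (0, "a"): 2, (1, "B"): 3, (1, "b"): 4}

def build_slr_table(states, edges, rules, follow, start="S"):
    nonterms = {head for head, _ in rules.values()}
    action, goto_tab = {}, {}
    # Stage 1: shift and goto entries come straight from the edges
    for (i, sym), j in edges.items():
        if sym in nonterms:
            goto_tab[(i, sym)] = j
        else:
            action[(i, sym)] = ("shift", j)
    for i, items in states.items():
        for num, dot in items:
            head, body = rules[num]
            if dot < len(body):
                continue                      # the • is not at the end
            if head == start:
                action[(i, "$")] = ("accept",)    # Stage 1: accept entry
            else:
                # Stage 2: reduce on every terminal in FOLLOW(head)
                for a in follow[head]:
                    action[(i, a)] = ("reduce", num)
    return action, goto_tab

ACTION, GOTO = build_slr_table(STATES, EDGES, RULES, FOLLOW)
for cell in sorted(ACTION.items()):
    print(cell)
```

Cells that never receive an entry are simply absent from the dictionaries, which matches the rule that empty cells trigger an error() call.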

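Finishing the Example Table

• In these examples, the reduce states are the state boxes at the leaves of the closure graph (in general, any state containing a completed item A => α •).
– but exclude the end state

• For the example 1 grammar, there are two boxes at the leaves: I2 and I4.

[Figure: closure graph: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3 (end state), I1 --b--> I4]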

I2 Reduction

• I2 = { A => a • }
– A => a is rule number 2
– FOLLOW(A) == FIRST(B) = { b }

• So action[ 2, b ] gets <reduce 2>

(Grammar: S --> A B; A --> a; B --> b)

I4 Reduction

• I4 = { B => b • }
– B => b is rule number 3
– FOLLOW(B) = { $ }

• So action[ 4, $ ] gets <reduce 3>

(Grammar: S --> A B; A --> a; B --> b)

Adding Reduce Entries

S --> A B
A --> a
B --> b

[Figure: closure graph: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4]

      | action[]       | goto[]
State | a    b    $    | S    A    B
  0   | s2             |      1
  1   |      s4        |           3
  2   |      r2        |
  3   |           acc  |
  4   |           r3   |

Using the Example 1 Table

Grammar: S --> A B; A --> a; B --> b

Stack       | Input  | Action
$0          | a b $  | Shift 2
$0,a2       | b $    | Reduce 2 (A --> a)  (note 1)
$0,A1       | b $    | Shift 4
$0,A1,b4    | $      | Reduce 3 (B --> b)  (note 2)
$0,A1,B3    | $      | Accept (S --> A B)
$0          | $      |

Note 1: pop 1 pair; state' = 0; push(A, goto(0,A)) == push(A,1);

Note 2: pop 1 pair; state' = 1; push(B, goto(1,B)) == push(B,3);

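4.5. Example Grammar 2

S --> a A B e
A --> A b c | b
B --> d

[Figure: closure graph: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8]

Stage 1

      | action[]                      | goto[]
State | a    b    c    d    e    $    | S    A    B
  0   | s1                            |
  1   |      s3                       |      2
  2   |      s5        s6             |           4
  3   |                               |
  4   |                     s7        |
  5   |           s8                  |
  6   |                               |
  7   |                          acc  |
  8   |                               |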

Reduce States

• For the example 2 grammar, there are three boxes at the leaves: I3, I6, and I8.

I3 Reduction

• I3 = { A => b • }
– A => b is rule number 3
– FOLLOW(A) = {b} ∪ FIRST(B) = {b, d}

• So action[ 3, b ] and action[ 3, d ] get <reduce 3>

(Grammar: S --> a A B e; A --> A b c; A --> b; B --> d)

I6 Reduction

• I6 = { B => d • }
– B => d is rule number 4
– FOLLOW(B) = {e}

• So action[ 6, e ] gets <reduce 4>

(Grammar: S --> a A B e; A --> A b c; A --> b; B --> d)

I8 Reduction

• I8 = { A => A b c • }
– A => A b c is rule number 2
– FOLLOW(A) = {b, d}

• So action[ 8, b ] and action[ 8, d ] get <reduce 2>

(Grammar: S --> a A B e; A --> A b c; A --> b; B --> d)
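The FOLLOW() sets used in these three reductions can be computed by a fixed-point iteration. The Python sketch below is a simplified version that assumes the grammar has no empty (ε) productions, which holds for example grammar 2 but not in general; the function names `first_sets` and `follow_sets` are illustrative.

```python
GRAMMAR = [
    ("S", ["a", "A", "B", "e"]),
    ("A", ["A", "b", "c"]),
    ("A", ["b"]),
    ("B", ["d"]),
]

def first_sets(grammar):
    """FIRST sets, assuming no production has an empty right-hand side."""
    nonterms = {head for head, _ in grammar}
    first = {n: set() for n in nonterms}
    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            x = body[0]
            new = first[x] if x in nonterms else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first

def follow_sets(grammar, start):
    """FOLLOW sets, under the same no-empty-production assumption."""
    nonterms = {head for head, _ in grammar}
    first = first_sets(grammar)
    follow = {n: set() for n in nonterms}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            for i, x in enumerate(body):
                if x not in nonterms:
                    continue
                if i + 1 < len(body):
                    nxt = body[i + 1]          # whatever can start the next symbol
                    new = first[nxt] if nxt in nonterms else {nxt}
                else:
                    new = follow[head]         # x ends the body: inherit FOLLOW(head)
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow

FOLLOW = follow_sets(GRAMMAR, "S")
print(FOLLOW)
```

For this grammar the sketch gives FOLLOW(A) = {b, d} and FOLLOW(B) = {e}, the same sets used above.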

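Adding Reduce Entries

S --> a A B e
A --> A b c | b
B --> d

[Figure: closure graph: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8]

      | action[]                      | goto[]
State | a    b    c    d    e    $    | S    A    B
  0   | s1                            |
  1   |      s3                       |      2
  2   |      s5        s6             |           4
  3   |      r3        r3             |
  4   |                     s7        |
  5   |           s8                  |
  6   |                     r4        |
  7   |                          acc  |
  8   |      r2        r2             |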

5. LR Conflicts

• An LR conflict occurs when a cell in the action part of the parse table contains more than one action.

• There are two kinds of conflict:
– shift/reduce and reduce/reduce

• Conflicts appear because of:
– grammar ambiguity
– limitations of the SLR parsing method (even when the grammar is unambiguous)

5.1. Shift/Reduce

• A shift/reduce conflict occurs when the parser cannot decide whether to shift the next symbol or reduce with a production
– typically, the default action is to shift

Dangling Else Example

• Grammar rule:
IfStmt => if Expr then Stmt
        | if Expr then Stmt else Stmt

• Example:
if (a == 1) then
  if (b == 4) then
    x = 2;
  else ...      <-- this goes with which 'if' ?

On the Stack

Stack                     | Input      | Action
$ …                       | … $        | …
$ … if Expr then Stmt     | else … $   | shift or reduce?

Choose shift, so else matches closest if

5.2. Reduce/Reduce

• A reduce/reduce conflict occurs when the parser cannot decide which production to use to make a reduction.

• Typically, the first suitable production is used.
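The two default policies (prefer shift; prefer the first production) can be applied at the moment a table cell is filled. The Python sketch below shows one way to do it, as an illustration only: the helper `add_action`, the conflict log, and the state numbers in the usage lines are all hypothetical, and "first suitable production" is taken here to mean the lowest rule number.

```python
def add_action(table, conflicts, state, symbol, new):
    """Put `new` into action[state, symbol], resolving clashes by the defaults.

    Actions are ("shift", target_state) or ("reduce", rule_number).
    Every clash is also recorded in `conflicts` so it can be reported.
    """
    key = (state, symbol)
    old = table.get(key)
    if old is None or old == new:
        table[key] = new
        return
    kinds = {old[0], new[0]}
    if kinds == {"shift", "reduce"}:
        conflicts.append(("shift/reduce", key))
        table[key] = old if old[0] == "shift" else new       # prefer the shift
    elif kinds == {"reduce"}:
        conflicts.append(("reduce/reduce", key))
        table[key] = min(old, new, key=lambda act: act[1])   # earliest rule wins
    else:
        raise ValueError(f"unexpected clash in cell {key}: {old} vs {new}")

# Usage with made-up state and rule numbers:
table, conflicts = {}, []
add_action(table, conflicts, 9, "else", ("reduce", 1))
add_action(table, conflicts, 9, "else", ("shift", 10))   # dangling-else style clash
add_action(table, conflicts, 2, "$", ("reduce", 3))
add_action(table, conflicts, 2, "$", ("reduce", 2))
print(table)
print(conflicts)
```

Recording the conflict as well as resolving it matters in practice: a default choice keeps the parser deterministic, but the grammar writer still needs to know that the choice was made.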

Example

Grammar:
C => A B
A => a
B => a

Stack  | Input  | Action
$      | a a $  | shift
$ a    | a $    | reduce A => a or B => a ?

Choose A => a, since it's the first suitable one.

6. LL, SLR, LR, LALR Grammars

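[Figure: nested ovals for the grammar classes LL(1), LR(0), SLR, LALR(1) and LR(1). LR(0) lies inside SLR, SLR inside LALR(1), and LALR(1) inside LR(1); LL(1) also lies inside LR(1).]

• the ovals represent the complexity of the grammars that the notation can handle
• we've been using SLR in this chapter
• LL(1) was used in chapter 5 on top-down parsing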

LR(1) Grammars

• LR(1) parsing uses one token lookahead to avoid conflicts in the parsing table.

• It can deal with more complex/powerful grammars than LR(0) or SLR.

• A LR(1) grammar takes longer to convert into a parse table.

LALR(1) Grammars

• LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce the size of the parse table.

• LALR(1) is less powerful than LR(1)
– it may introduce reduce-reduce conflicts, but that's not likely for programming language grammars

• LALR(1) is used by the YACC parsing tool
– see next chapter
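The state-combining step of LALR(1) can be pictured as grouping LR(1) item sets that differ only in their lookaheads. The Python sketch below shows just that grouping, under assumptions made here: an LR(1) item is written as `(head, body, dot, lookahead)`, the three sample item sets are invented for illustration, and the re-pointing of goto edges after merging is left out.

```python
from collections import defaultdict

def merge_by_core(lr1_states):
    """Merge LR(1) item sets whose LR(0) cores are identical.

    The core of a set is the set of its items with the lookahead dropped.
    """
    groups = defaultdict(set)
    for state in lr1_states:
        core = frozenset((head, body, dot) for head, body, dot, _ in state)
        groups[core] |= set(state)          # union of the lookaheads
    return [frozenset(items) for items in groups.values()]

# Invented sample item sets: s1 and s2 share a core, s3 does not.
s1 = {("A", ("a",), 1, "b")}
s2 = {("A", ("a",), 1, "$")}
s3 = {("B", ("b",), 1, "$")}

merged = merge_by_core([s1, s2, s3])
print(len(merged), "states after merging")
```

Because the merged state reduces on the union of the lookaheads, two reductions that were kept apart in separate LR(1) states can end up in the same cell, which is how the extra reduce-reduce conflicts mentioned above can arise.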