bottom-up parsing - univr

79
Bottom-Up Parsing Part II

Upload: others

Post on 27-Jan-2022

19 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Bottom-Up Parsing - Univr

Bottom-Up ParsingPart II

Page 2: Bottom-Up Parsing - Univr

Canonical Collection of LR(0) items

CC_LR(0)_I items(G’:augmented_grammar){ C = {CLOSURE({S’ ! •S})} ; repeat{ foreach(I ∈ C) foreach(grammar symbol X) if(GOTO(I,X)≠∅ && GOTO(I,X) ∉ C) C = C ∪ {GOTO(I,X)}; }until(no new sets of items are added to C) return C;}

Page 3: Bottom-Up Parsing - Univr

LR(0) automaton

G’ : augmented grammar

LR(0) automaton for G’

〈Q, q0, GOTO: Q × (TG’ ∪ NG’) → Q, F〉where:Q = F = items(G’),q0 = CLOSURE({S’ → •S})

Page 4: Bottom-Up Parsing - Univr

E’ → EE → E + T | TT → T* F | FF → (E) | id

Construction of the LR(0) automaton for the expression grammar:

Page 5: Bottom-Up Parsing - Univr

E’ → EE → E + T | TT → T* F | FF → (E) | id

Page 6: Bottom-Up Parsing - Univr

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

E’ → EE → E + T | TT → T* F | FF → (E) | id

Page 7: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | id

Page 8: Bottom-Up Parsing - Univr

Push(I0);repeat{

//Begin to scan the input from left to rightIi = top() and next input symbol a;if (Ij = GOTO(Ii,a)) then shift a and Ij;

// Push(a) // push(Ij).

else if (A!β• ∈ Ii) then{ perform “reduce by A!β”; go to the state Ij = GOTO(I,A) and push A and Ij into the stack //where I is the state on the top_of__stack //after removing β and the corresponding states; } _exit(Reject if none of the above can be done); _exit(Report “conflicts” if more than one can be done);

} until EOF is seen

Shift-reduce parsing using LR(0) automaton

Page 9: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 10: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 11: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 12: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 13: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 14: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 15: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 16: Bottom-Up Parsing - Univr

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | idTransition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Transition diagram (1/2)

I0E’ −> .EE −> . E+TE −> .TT −> .T*FT −> .FF −> .(E)F −> .id

E −> E+ . TT −> . T*FT −> .FF −> .(E)F −> .id

I6

E −> E+T.T −> T.*F

I9

T −> T*F .I10

T −> T*.FF −> .(E)F −> . id

I7E −> T.T −> T.*F

I2

T −> F .I3

F −> ( E ) .I11

F −> ( E . )E −> E . + T

I8F −> ( . E )E −> . E + TE −> .TT −> . T * FT −> . FF −> . ( E )F −> . id

I4

F −> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ −> E.E −> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

Saturday, March 26, 2011

Page 17: Bottom-Up Parsing - Univr

LRParsingProgram

$

::

sm-1

sm

ACTION GOTO

a1 ai ... $an...

stack

input

output

MODEL OF AN LR PARSER

Page 18: Bottom-Up Parsing - Univr

〈Q, 0, GOTO: Q × (TG’ ∪ NG’) → Q, F〉

I0,...,In enumeration of items(G') I0= CLOSURE({S’ → •S})

GOTO(s,X)=s' ⇔ GOTO(Is,X)= Is'

Q = {0,...,n} s Is

∀ s ∈ Q we associate a unique symbol Xs ∈ TG’ ∪ NG’ to s≠0if GOTO(s',X)=s then Xs = X.Moreover, X0 represents the symbol $ (bottom of the stack)

Proposition: if GOTO(s',X)=s and GOTO(s",Y)=s then X=Y

Fix for G' a finite bijective enumeration p: {1,...,n} → G'-productions, s.t. p(1)= S→S'

Page 19: Bottom-Up Parsing - Univr

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

Construct the LR(0) automaton for the grammar

Page 20: Bottom-Up Parsing - Univr

Solution for exercise 1

Faculty of Sciences (ULB) INFO-F403 – Exercises 2010-2011 6 / 16

accept

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

Page 21: Bottom-Up Parsing - Univr

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

Solution for exercise 1

Follow1(S) = {$}Follow1(A) = {f,$}Follow1(B) = {f,$}Follow1(C) = {e,f,$}

Faculty of Sciences (ULB) INFO-F403 – Exercises 2010-2011 7 / 16

43

����������&))4!*#

• -���M ���7L&&L=�(�

• 9������� ����� ��������1– �+⌅�%⇥������739(#�⇥�N5⇤2��7L&&L=�%�

– �+⌅�%������7L&&L=�+���7L&&L=�%�

– �+⌅�%⇥��⇤ � �739(#�⇥�������7L&&L=�+���7L&&L=�%�

Page 22: Bottom-Up Parsing - Univr

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

Solution for exercise 1

Follow1(S) = {$}Follow1(A) = {f,$}Follow1(B) = {f,$}Follow1(C) = {e,f,$}

Faculty of Sciences (ULB) INFO-F403 – Exercises 2010-2011 7 / 16

43

����������&))4!*#

• -���M ���7L&&L=�(�

• 9������� ����� ��������1– �+⌅�%⇥������739(#�⇥�N5⇤2��7L&&L=�%�

– �+⌅�%������7L&&L=�+���7L&&L=�%�

– �+⌅�%⇥��⇤ � �739(#�⇥�������7L&&L=�+���7L&&L=�%�

Page 23: Bottom-Up Parsing - Univr
Page 24: Bottom-Up Parsing - Univr
Page 25: Bottom-Up Parsing - Univr

StateACTIONACTIONACTION GOTO GOTO GOTO

State

A

G

A= ACTION[i,a]G = GOTO[j,A]

ACTION[i,a]= s j (shift symbol j (namely Xj))ACTION[i,a]= r k (reduce production p(k))ACTION[i,a]= accept ACTION[i,a]= error

PARSING TABLE

i

j

a A

terminals and $ non terminals

Page 26: Bottom-Up Parsing - Univr

remaining input

( 0 s1 ... sm , ai ai+1 ... an $ )

LR-PARSER CONFIGURATIONS

X0 Xs1 ... Xsm ai ai+1 ... an

configuration

right-sentential form

stack

top

Page 27: Bottom-Up Parsing - Univr

BEHAVIOR OF THE LR PARSER

symbol Xs

p(k)=A → β|β| = rs=GOTO[sm-r, A]

ACTION[sm,ai] = reduce k

( s0 s1 ... sm , ai ai+1 ... an $ )

( s0 s1 ... sm-r s, ai ai+1 ... an $ )

ACTION[sm,ai] = shift s

( s0 s1 ... sm , ai ai+1 ... an $ )

( s0 s1 ... sm s, ai+1 ... an $ )

symbol Xs

Page 28: Bottom-Up Parsing - Univr

BEHAVIOR OF THE LR PARSER

ACTION[sm,ai] = error

( s0 s1 ... sm , ai ai+1 ... an $ )

error recovery routine

ACTION[sm,ai] = accept

( s0 s1 ... sm , ai ai+1 ... an $ )

end of parsing

Page 29: Bottom-Up Parsing - Univr

LR-parsing algorithm

Page 30: Bottom-Up Parsing - Univr

0 1

2

3

4

5

6

7

8

10

9

11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

X1 = EX2 = TX3 = FX4 = (X5 = idX6 = +X7 = *X8 = EX9 = TX10 = FX11 = )

Page 31: Bottom-Up Parsing - Univr

X1 = EX2 = TX3 = FX4 = (X5 = idX6 = +X7 = *X8 = EX9 = TX10 = FX11 = )

id ✱ id + id

Page 32: Bottom-Up Parsing - Univr

X1 = EX2 = TX3 = FX4 = (X5 = idX6 = +X7 = *X8 = EX9 = TX10 = FX11 = )

Page 33: Bottom-Up Parsing - Univr

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(0) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[S’! •S] GOTO function);3.foreach state i:(a) if (A! α•aβ ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a];(b) if (A! α• ∈ Ii && A ≠ S’)

foreach (a ∈ FOLLOW(A)) add reduce p-1(A ! α) to ACTION[i,a];(c) if (S’! S• ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

Page 34: Bottom-Up Parsing - Univr
Page 35: Bottom-Up Parsing - Univr

LR-parsing algorithm

Page 36: Bottom-Up Parsing - Univr

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS0 s3 s4 1 21 r12 acc3 r3 r34 s6 55 r2 r26 s8 77 s9 r4 r48 s3 s4 109 r5 r510 s1111 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'→·SS→·AA→·aA→·bB

q1

S→A·

q2

S'→S·

q3

A→a·

q4

B→·cCeA→b·BB→·cC

q5

A→bB·

q6

B→c·CeC→·dAfB→c·C

q7

B→cC·B→cC·e

q8

C→d·AfA→·aA→·bB

q9

B→cCe·

q10

C→dA·f

q11

C→dAf·

FIRSTFIRST FOLLOWFOLLOWA { b, a } { f, $ }B { c } { f, $ }C { d } { f, e, $ }S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' SS AA bBA aB cCB cCeC dAf

Exercise: Construct the PARSING TABLE

Page 37: Bottom-Up Parsing - Univr

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS0 s3 s4 1 21 r12 acc3 r3 r34 s6 55 r2 r26 s8 77 s9 r4 r48 s3 s4 109 r5 r510 s1111 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'→·SS→·AA→·aA→·bB

q1

S→A·

q2

S'→S·

q3

A→a·

q4

B→·cCeA→b·BB→·cC

q5

A→bB·

q6

B→c·CeC→·dAfB→c·C

q7

B→cC·B→cC·e

q8

C→d·AfA→·aA→·bB

q9

B→cCe·

q10

C→dA·f

q11

C→dAf·

FIRSTFIRST FOLLOWFOLLOWA { b, a } { f, $ }B { c } { f, $ }C { d } { f, e, $ }S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' SS AA bBA aB cCB cCeC dAf

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS0 s3 s4 1 21 r12 acc3 r3 r34 s6 55 r2 r26 s8 77 s9 r4 r48 s3 s4 109 r5 r510 s1111 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'→·SS→·AA→·aA→·bB

q1

S→A·

q2

S'→S·

q3

A→a·

q4

B→·cCeA→b·BB→·cC

q5

A→bB·

q6

B→c·CeC→·dAfB→c·C

q7

B→cC·B→cC·e

q8

C→d·AfA→·aA→·bB

q9

B→cCe·

q10

C→dA·f

q11

C→dAf·

FIRSTFIRST FOLLOWFOLLOWA { b, a } { f, $ }B { c } { f, $ }C { d } { f, e, $ }S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' SS AA bBA aB cCB cCeC dAf

Page 38: Bottom-Up Parsing - Univr

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS0 s3 s4 1 21 r12 acc3 r3 r34 s6 55 r2 r26 s8 77 s9 r4 r48 s3 s4 109 r5 r510 s1111 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'→·SS→·AA→·aA→·bB

q1

S→A·

q2

S'→S·

q3

A→a·

q4

B→·cCeA→b·BB→·cC

q5

A→bB·

q6

B→c·CeC→·dAfB→c·C

q7

B→cC·B→cC·e

q8

C→d·AfA→·aA→·bB

q9

B→cCe·

q10

C→dA·f

q11

C→dAf·

FIRSTFIRST FOLLOWFOLLOWA { b, a } { f, $ }B { c } { f, $ }C { d } { f, e, $ }S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' SS AA bBA aB cCB cCeC dAf

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS0 s3 s4 1 21 r12 acc3 r3 r34 s6 55 r2 r26 s8 77 s9 r4 r48 s3 s4 109 r5 r510 s1111 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'→·SS→·AA→·aA→·bB

q1

S→A·

q2

S'→S·

q3

A→a·

q4

B→·cCeA→b·BB→·cC

q5

A→bB·

q6

B→c·CeC→·dAfB→c·C

q7

B→cC·B→cC·e

q8

C→d·AfA→·aA→·bB

q9

B→cCe·

q10

C→dA·f

q11

C→dAf·

FIRSTFIRST FOLLOWFOLLOWA { b, a } { f, $ }B { c } { f, $ }C { d } { f, e, $ }S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' SS AA bBA aB cCB cCeC dAf

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS0 s3 s4 1 21 r12 acc3 r3 r34 s6 55 r2 r26 s8 77 s9 r4 r48 s3 s4 109 r5 r510 s1111 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'→·SS→·AA→·aA→·bB

q1

S→A·

q2

S'→S·

q3

A→a·

q4

B→·cCeA→b·BB→·cC

q5

A→bB·

q6

B→c·CeC→·dAfB→c·C

q7

B→cC·B→cC·e

q8

C→d·AfA→·aA→·bB

q9

B→cCe·

q10

C→dA·f

q11

C→dAf·

FIRSTFIRST FOLLOWFOLLOWA { b, a } { f, $ }B { c } { f, $ }C { d } { f, e, $ }S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' SS AA bBA aB cCB cCeC dAf

Page 39: Bottom-Up Parsing - Univr

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(0) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[S’! •S] GOTO function);3.foreach state i:(a) if (A! α•aβ ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a];(b) if (A! α• ∈ Ii && A ≠ S’)

foreach (a ∈ FOLLOW(A)) add reduce p-1(A ! α) to ACTION[i,a];(c) if (S’! S• ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

Constructing an SLR-parsing Table

Page 40: Bottom-Up Parsing - Univr

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

Page 41: Bottom-Up Parsing - Univr

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

SA

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

cc

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

Page 42: Bottom-Up Parsing - Univr

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

SA

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

cc

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

Page 43: Bottom-Up Parsing - Univr

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS0 s2 11 acc2 s4 33 s6 s7 54 s4 s9 85 s106 s6 s7 117 r58 s129 r3 r310 r111 r412 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S→·aABbS'→·S

q1

S'→S·

q2

A→·acA→·aAcS→a·ABb

q3

S→aA·BbB→·cB→·bB

q4

A→·acA→·aAcA→a·AcA→a·c

q5

S→aAB·b

q6

B→b·BB→·cB→·bB

q7

B→c·

q8

A→aA·c

q9

A→ac·

q10

S→aABb·

q11

B→bB·q12

A→aAc·

FIRSTFIRST FOLLOWFOLLOWA { a } { b, c }B { b, c } { b }S { a } { $ }

Parse table complete. Press "parse" to use it.S' SS aABbA aAcA acB bBB c

Page 44: Bottom-Up Parsing - Univr

Item set ?

it is not ambiguous

Page 45: Bottom-Up Parsing - Univr

Item set :

Parsing table?

Page 46: Bottom-Up Parsing - Univr

Item set :

parsing table shift 6 ∈ ACTION[2,=]symbol = ∈ FOLLOW(R) and therefore: reduce R → L ∈ ACTION[2,=]

conflict!

if (A! α• ∈ Ii && A ≠ S’) foreach (a ∈ FOLLOW(A)) add reduce p-1(A ! α) to ACTION[i,a];

=

Page 47: Bottom-Up Parsing - Univr

Viable prefix A prefix of right-sentential forms that can appear on the

top of the stack

Not all prefixes of right-sentential forms can appear on the stack, however, since the parser must not shift past the

handle

Example:

STACK: (, (E, and (E), but not (E)*(E) is a handle (the parser must reduce it to F before shifting *)

Page 48: Bottom-Up Parsing - Univr

Viable prefix:

a prefix of right-sentential forms that can appear on the top of the stack and that does not continue past the right end of the rightmost handle of that sentential form. a prefix γ of αβ, where

S' ⇒ αAw ⇒ αβwrm rm*

S' ⇒ αAw ⇒ αβ1β2wrm rm*

A→β1•β2 is valid for a viable prefix αβ1 if

Page 49: Bottom-Up Parsing - Univr

S' ⇒ αAw ⇒ αβ1β2wrm rm*

A→β1•β2 is valid for a viable prefix αβ1 if

if β2 ≠ε then shift if β2 =ε then reduce A→β1

THEOREM: The set of valid items for a viable prefix γ is exactly the set of items reached from the initial state along the path labeled γ in the LR(0) automaton for the grammar.

Two valid items may tell us to do different things for the same viable prefix. Some of these conflicts can be resolved by looking at the next input symbol

possible conflicts

Page 50: Bottom-Up Parsing - Univr

Item set :

Parsing table shift 6 ∈ ACTION[2,=]symbol = ∈ FOLLOW(R)and therefore: reduce R → L ∈ ACTION[2,=]

Conflict!

=

Page 51: Bottom-Up Parsing - Univr

LR(1) PARSING

Discussion (1/3)

Every SLR(1) grammar is unambiguous, but there are manyunambiguous grammars that are not SLR(1).

Grammar:• S ⇤ L = R | R• L ⇤ ⇥R | id• R ⇤ L

States:I0:

� S⇧ ⌅ ·S� S ⌅ ·L = R� S ⌅ ·R� L ⌅ · ⇥ R� L ⌅ ·id� R ⌅ ·L

I1: S⇧ ⇤ S·I2:

� S ⌅ L· = R� R ⌅ L·

I3: S ⇤ R·I4:

� L ⌅ ⇥ · R� R ⌅ ·L� L ⌅ · ⇥ R� L ⌅ ·id

I5: L ⇤ id·

I6:� S ⌅ L = ·R� R ⌅ ·L� L ⌅ · ⇥ R� L ⌅ ·id

I7: L ⇤ ⇥R·I8: R ⇤ L·I9: S ⇤ L = R·

Compiler notes #3, 20070503, Tsan-sheng Hsu 75

0 2shift 6 ∈ ACTION[2,=]symbol = ∈ FOLLOW(R) and therefore:

reduce R → L ∈ ACTION[2,=] 0 3

$R

$L

there is no right-sentential form of the grammar that begins R = ...

Thus state 2, which is the state corresponding to viable prefix L only, should not really call for reduction of that L to R

Page 52: Bottom-Up Parsing - Univr

However, it is possible that when state i is on the top of the stack, we have the viable prefix βα on the top of the stack, and βA cannot be followed by a.

In this case, we cannot perform the reduction A → α.

In SLR(1) parsing, if A → α• ∈ Si, and a ∈ FOLLOW(A),then we perform the reduction A → α

Page 53: Bottom-Up Parsing - Univr

LR(1) itemsAn LR(1) item is in the form of

[A → α•β,a] 1.the first field A → α•β is an LR(0) item 2.the second field a is a terminal belonging to a subset

(possibly proper) of FOLLOW[A].

Intuition: perform a reduction based on an LR(1) item [A → α•, a] only when the next symbol is a.

Instead of maintaining FOLLOW sets of viable prefixes, we maintain FIRST sets of possible future extensions of the current viable prefix.

where • γ = δα• w = aw' or w=ε and a=$.

[A → α•β, a] is valid for a viable prefix γ if there

exists a derivation S' ⇒ δAw ⇒ δαβwrm rm*

γ

Page 54: Bottom-Up Parsing - Univr

Examples of LR(1) items

Grammar:• S ⇥ BB• B ⇥ aB | b

S�=⇤

rmaaBab =⇤

rmaaaBab

viable prefix aaa can reach [B ⇥ a · B, a]

S�=⇤

rmBaB =⇤

rmBaaB

viable prefix Baa can reach [B ⇥ a · B, $]

Compiler notes #3, 20070503, Tsan-sheng Hsu 80

where • γ = δα• w = aw' or w=ε and a=$.

[A → α•β, a] is valid for a viable prefix γ if there exists a derivation S' ⇒ δAw ⇒ δαβwrm rm

*

Page 55: Bottom-Up Parsing - Univr

Finding all LR(1) items

Ideas: redefine the closure function.• Suppose [A⇤ � · B⇥, a] is valid for a viable prefix ⇤ ⇥ ⌅�.• In other words,

S�=⌅

rm⌅ A a⌃ =⌅

rm⌅ �B⇥ a⌃.

⇤ ⇥ is � or a sequence of terminals.

• Then for each production B ⇤ ⇧, assume ⇥a⌃ derives the sequence ofterminals bea⌃.

S�=⌅

rm⌅�B ⇥a⌃

�=⌅rm

⌅�B bea⌃�=⌅

rm⌅�⇧ bea⌃

Thus [B ⇤ ·⇧, b] is also valid for ⇤ for each b ⇧ FIRST(⇥a).Note a is a terminal. So FIRST(⇥a) = FIRST(⇥a⌃).

Lookahead propagation .

Compiler notes #3, 20070503, Tsan-sheng Hsu 81

where • γ = δα• w = aw' or w=ε and a=$.

[A → α•β, a] is valid for a viable prefix γ if there exists a derivation S' ⇒ δAw ⇒ δαβwrm rm

*

γ γγ

γ

γ

Page 56: Bottom-Up Parsing - Univr

set_of_items CLOSURE(I: set_of_items){

J=I;

repeat{

foreach ( [A → α•Bβ,a] ∈ J)

foreach ( B → γ ∈ G)

foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]};

}until (no more new items are added to J)

return J;

}

Page 57: Bottom-Up Parsing - Univr

set_of_items GOTO(I set_of_items, X:symbol){ J=∅;

foreach([A → α•Xβ,a] ∈ I)

J = J ∪ {[A → αX•β,a]};

return CLOSURE(J);

}

Page 58: Bottom-Up Parsing - Univr

CC_LR(1)_I items(G’:augmented_grammar){ C = {CLOSURE({[S’ → •S,$]})} ; repeat{ foreach(I ∈ C) foreach(grammar symbol X) if(GOTO(I,X)≠∅ && GOTO(I,X) ∉ C) C = C ∪ {GOTO(I,X)}; }until(no new sets of items are added to C) return C;}

Page 59: Bottom-Up Parsing - Univr

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(1) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[[S’! •S,$]] GOTO function);3.foreach state i:(a) if ([A! α•aβ,b] ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a]; // a is terminal(b) if ([A! α•,a] ∈ Ii && A ≠ S’)

add reduce p-1(A ! α) to ACTION[i,a];(c) if ([S’! S•,$] ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

Page 60: Bottom-Up Parsing - Univr

Example for constructing LR(1) closures

Grammar:• S⌅ ⇥ S• S ⇥ CC• C ⇥ cC | d

closure1({[S⌅⇥ ·S, $]}) =• {[S⌅ ⇥ ·S, $],• [S ⇥ ·CC, $],• [C ⇥ ·cC, c/d],• [C ⇥ ·d, c/d]}

Note:• FIRST(�$) = {$}• FIRST(C$) = {c, d}• [C ⇥ ·cC, c/d] means

� [C ⇤ ·cC, c] and� [C ⇤ ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

Example of an LR(1) parsing tableaction1 GOTO1

state c d $ S C0 s3 s4 1 21 accept2 s6 s7 53 s3 s4 84 r3 r35 r16 s6 s7 97 r38 r2 r29 r2

Canonical LR(1) parser:• Most powerful!• Has too many states and thus occupies too much space.

Compiler notes #3, 20070503, Tsan-sheng Hsu 87

set_of_items CLOSURE(I: set_of_items){

J=I;

repeat{

foreach ( [A → α•Bβ,a] ∈ J)

foreach ( B → γ ∈ G)

foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]};

}until (no more new items are added to J)

return J;

}

Page 61: Bottom-Up Parsing - Univr

Example for constructing LR(1) closures

Grammar:• S⌅ ⇥ S• S ⇥ CC• C ⇥ cC | d

closure1({[S⌅⇥ ·S, $]}) =• {[S⌅ ⇥ ·S, $],• [S ⇥ ·CC, $],• [C ⇥ ·cC, c/d],• [C ⇥ ·d, c/d]}

Note:• FIRST(�$) = {$}• FIRST(C$) = {c, d}• [C ⇥ ·cC, c/d] means

� [C ⇤ ·cC, c] and� [C ⇤ ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

set_of_items CLOSURE(I: set_of_items){ J=I; repeat{ foreach ( [A → α•Bβ,a] ∈ J) foreach ( B → γ ∈ G) foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]}; }until (no more new items are added to J) return J;}

Page 62: Bottom-Up Parsing - Univr

Example for constructing LR(1) closures

Grammar:• S⌅ ⇥ S• S ⇥ CC• C ⇥ cC | d

closure1({[S⌅⇥ ·S, $]}) =• {[S⌅ ⇥ ·S, $],• [S ⇥ ·CC, $],• [C ⇥ ·cC, c/d],• [C ⇥ ·d, c/d]}

Note:• FIRST(�$) = {$}• FIRST(C$) = {c, d}• [C ⇥ ·cC, c/d] means

� [C ⇤ ·cC, c] and� [C ⇤ ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

set_of_items CLOSURE(I: set_of_items){ J=I; repeat{ foreach ( [A → α•Bβ,a] ∈ J) foreach ( B → γ ∈ G) foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]}; }until (no more new items are added to J) return J;}

Page 63: Bottom-Up Parsing - Univr

set_of_items CLOSURE(I: set_of_items){ J=I; repeat{ foreach ( [A → α•Bβ,a] ∈ J) foreach ( B → γ ∈ G) foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]}; }until (no more new items are added to J) return J;}

Page 64: Bottom-Up Parsing - Univr

Example for constructing LR(1) closures

Grammar:• S⌅ ⇥ S• S ⇥ CC• C ⇥ cC | d

closure1({[S⌅⇥ ·S, $]}) =• {[S⌅ ⇥ ·S, $],• [S ⇥ ·CC, $],• [C ⇥ ·cC, c/d],• [C ⇥ ·d, c/d]}

Note:• FIRST(�$) = {$}• FIRST(C$) = {c, d}• [C ⇥ ·cC, c/d] means

� [C ⇤ ·cC, c] and� [C ⇤ ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

Example of an LR(1) parsing tableaction1 GOTO1

state c d $ S C0 s3 s4 1 21 accept2 s6 s7 53 s3 s4 84 r3 r35 r16 s6 s7 97 r38 r2 r29 r2

Canonical LR(1) parser:• Most powerful!• Has too many states and thus occupies too much space.

Compiler notes #3, 20070503, Tsan-sheng Hsu 87

Page 65: Bottom-Up Parsing - Univr

LALR(1) parser — Lookahead LR

The method that is often used in practice.Most common syntactic constructs of programming languagescan be expressed conveniently by an LALR(1) grammar[DeRemer 1969].SLR(1) and LALR(1) always have the same number of states.Number of states is about 1/10 of that of LR(1).Simple observation:

• an LR(1) item is of the form [A⇥ � · ⇥, c]

We call A⇥ � · ⇥ the first component .

Definition: in an LR(1) state, set of first components is calledits core .

Compiler notes #3, 20070503, Tsan-sheng Hsu 88

Page 66: Bottom-Up Parsing - Univr

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(1) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[[S’! •S,$]] GOTO function);3.foreach state i:(a) if ([A! α•aβ,b] ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a]; // a is terminal(b) if ([A! α•,a] ∈ Ii && A ≠ S’)

add reduce p-1(A ! α) to ACTION[i,a];(c) if ([S’! S•,$] ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

Page 67: Bottom-Up Parsing - Univr

Intuition for LALR(1) grammars

In an LR(1) parser, it is a common thing that several statesonly di�er in lookahead symbols, but have the same core.To reduce the number of states, we might want to merge stateswith the same core.

• If I4 and I7 are merged, then the new state is called I4,7.• After merging the states, revise the GOTO1 table accordingly.

Merging of states can never produce a shift-reduce conflict thatwas not present in one of the original states.

• I1 = {[A⇥ �·, a], . . .}⌅ For I1, one of the actions is to perform a reduce when the lookahead

symbol is “a”.

• I2 = {[B ⇥ ⇥ · a⇤, b], . . .}⌅ For I2, one of the actions is to perform a shift on input “a”.

• Merging I1 and I2, the new state I1,2 has shift-reduce conflicts.• However, we merge I1 and I2 because they have the same core.

⌅ That is, [A ⇤ �·, c] ⌅ I2 and [B ⇤ ⇥ · a⇤, d] ⌅ I1.⌅ The shift-reduce conflict already occurs in I1 and I2.

Merging of states can produce a new reduce-reduce conflict.

Compiler notes #3, 20070503, Tsan-sheng Hsu 89

Page 68: Bottom-Up Parsing - Univr

LALR(1) transition diagram

I0S’ −> . S, $S −> . CC, $C −> . cC, c/dC −>.d, c/d

S’ −> S., $I1

S −> C.C, $C −> .cC, $C −> .d, $

I2

S −> CC., $I5

C −> c.C, $C −> .cC, $C −> .d, $

I6

C −> cC., $I9

C −> d., $I7

I3

C −> cC., c/dI8

C −> d., c/dI4

S

C

c

d

dC

C

c

d

d

c

CC −> c.C, c/dC −> .cC, c/dC −> .d, c/d

c

Compiler notes #3, 20070503, Tsan-sheng Hsu 90

Page 69: Bottom-Up Parsing - Univr

Possible new conflicts from LALR(1)

May produce a new reduce-reduce conflict.For example (textbook page 267, Example 4.58), grammar:

• S⇥ ⇥ S• S ⇥ aAd | bBf | aBe | bAe• A⇥ c• B ⇥ c

The language recognized by this grammar is {acd, ace, bcd, bce}.You may check that this grammar is LR(1) by constructing thesets of items.You will find the set of items {[A⇥ c·, d], [B ⇥ c·, e]} is valid forthe viable prefix ac, and {[A⇥ c·, e], [B ⇥ c·, d]} is valid for theviable prefix bc.Neither of these sets generates a conflict, and their cores arethe same. However, their union, which is

• {[A⇥ c·, d/e],• [B ⇥ c·, d/e]},

generates a reduce-reduce conflict, since reductions by bothA⇥ c and B ⇥ c are called for on inputs d and e.

Compiler notes #3, 20070503, Tsan-sheng Hsu 91

Page 70: Bottom-Up Parsing - Univr

How to construct LALR(1) parsing table

Naive approach:• Construct LR(1) parsing table, which takes lots of intermediate spaces.• Merging states.

Space and/or time e⇥cient methods to construct an LALR(1)parsing table are known.

• Constructing and merging on the fly.• · · ·

Compiler notes #3, 20070503, Tsan-sheng Hsu 92

Page 71: Bottom-Up Parsing - Univr

Summary

LR(1)

LL(1)

LALR(1)

SLR(1)LR(1)

LALR(1)

SLR(1)

LR(0)

LR(1) and LALR(1) can almost express all important program-ming languages issues, but LALR(1) is easier to write and usesmuch less space.LL(1) is easier to understand and uses much less space, butcannot express some important common-language features.

• May try to use it first for your own applications.• If it does not succeed, then use more powerful ones.

Compiler notes #3, 20070503, Tsan-sheng Hsu 93

Page 72: Bottom-Up Parsing - Univr

Ambiguous grammars are not too bad...Using ambiguous grammars

ambiguous grammars

unambiguous grammars

LR(1)

Ambiguous grammars often provide a shorter, more naturalspecification than their equivalent unambiguous grammars.Sometimes need ambiguous grammars to specify importantlanguage constructs.

• Example: declare a variable before its usage.var xyz : integerbegin

...xyz := 3;...

Use symbol tables to create “side e�ects.”

Compiler notes #4, 20070517, Tsan-sheng Hsu 14

Ambiguous grammars often provide a shorter, more natural specification than their equivalent unambiguous grammars.

Page 73: Bottom-Up Parsing - Univr

Ambiguity from precedence and associativity

Precedence and associativity are important language constructs.Example:

• G1:� E ⇤ E + E | E � E | (E) | id� Ambiguous, but easy to understand and maintain!

• G2:� E ⇤ E + T | T� T ⇤ T � F | F� F ⇤ (E) | id� Unambiguous, but di�cult to understand and maintain!

Input: 1+2*3E

E +

*

3

T

T T F

FF

1 2

Parse tree: G 2

E

E + E

1 E * E

2 3

Parse tree#1: G 1 1Parse tree#2: G

E

E E

1

E E

2

3

*

+

Compiler notes #4, 20070517, Tsan-sheng Hsu 15

Page 74: Bottom-Up Parsing - Univr

LR(0) states

input: id+id∗idthe parser enters state 7 after processing id+id

Page 75: Bottom-Up Parsing - Univr
Page 76: Bottom-Up Parsing - Univr
Page 77: Bottom-Up Parsing - Univr

Ambiguity from dangling-else

Grammar:• Statement� Other Statement

| if Condition then Statement| if Condition then Statement else Statement

When seeingif C then S else S

• there is a shift/reduce conflict,• we always favor a shift.• Intuition: favor a longer match.

Need a mechanism to let user specify the default conflict-handling rule when there is a shift/reduce conflict.

Compiler notes #4, 20070517, Tsan-sheng Hsu 17

Page 78: Bottom-Up Parsing - Univr
Page 79: Bottom-Up Parsing - Univr