Lesson 4 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg

Outline

• Recursive descent parsers
• Left recursion
• Left factoring


RECURSIVE DESCENT PARSING


Writing a recursive descent parser

• Straightforward once the grammar is written in an appropriate form:
  – For each nonterminal: create a function
    • Represents the expectation of that nonterminal in the input
    • Each such function should choose a grammar production, i.e., RHS, based on the lookahead token
    • It should then process the chosen RHS
      – Terminals are “matched”: match(IF); match(LEFT_PARENTHESIS); … match(RIGHT_PARENTHESIS); …
      – For nonterminals their corresponding “expectation functions” are called
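The recipe above can be sketched for one nonterminal. The grammar stmt → if ( id ) stmt | id ; is a hypothetical example chosen to fit the match(IF) calls on the slide. The token codes, the array-based nextToken(), the error counter and parse_stmt() are likewise assumptions made only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; END marks the end of the input. */
enum { IF = 256, ID, LEFT_PARENTHESIS, RIGHT_PARENTHESIS, SEMICOLON, END };

static const int *input;   /* next unread token */
static int lookahead;      /* current token */
static int errors;         /* number of syntax errors seen */

static int nextToken(void) { return *input == END ? END : *input++; }
static void error(void) { errors++; }

static void match(int expected_lookahead)
{
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}

/* One function per nonterminal: it picks a production from the lookahead. */
static void stmt(void)
{
    switch (lookahead) {
    case IF:                        /* stmt -> if ( id ) stmt */
        match(IF);
        match(LEFT_PARENTHESIS);
        match(ID);
        match(RIGHT_PARENTHESIS);
        stmt();                     /* nonterminal: call its function */
        break;
    case ID:                        /* stmt -> id ; */
        match(ID);
        match(SEMICOLON);
        break;
    default:                        /* no production starts with this token */
        error();
    }
}

/* Returns 1 if the END-terminated token array is a valid stmt, else 0. */
int parse_stmt(const int *tokens)
{
    assert(tokens != NULL);
    input = tokens;
    errors = 0;
    lookahead = nextToken();
    stmt();
    match(END);                     /* the whole input must be consumed */
    return errors == 0;
}
```

Calling parse_stmt() on the token sequence for "if ( id ) id ;" should return 1, while a sequence that omits the parentheses should return 0.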


The function match()

Helper function to consume terminals:

    void match(int expected_lookahead)
    {
        if (lookahead == expected_lookahead)
            lookahead = nextToken();
        else
            error();
    }

(assumes tokens are represented as ints)
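The slide leaves lookahead, nextToken() and error() undefined. The sketch below fills them in under simple assumptions: tokens come from an END-terminated int array, and error() only counts mismatches. The token codes and the helper count_mismatches() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM = 256, DOTDOT, END };   /* hypothetical token codes */

static const int *input;   /* next unread token */
static int lookahead;      /* current token */
static int errors;         /* number of mismatches seen */

/* Assumed scanner: hands out array elements and stays at END once reached. */
static int nextToken(void) { return *input == END ? END : *input++; }

/* Assumed error handler: count the mismatch and carry on. */
static void error(void) { errors++; }

void match(int expected_lookahead)
{
    if (lookahead == expected_lookahead)
        lookahead = nextToken();   /* consume the terminal */
    else
        error();                   /* lookahead is left unchanged */
}

/* Matches n expected terminals against tokens; returns the mismatch count. */
int count_mismatches(const int *tokens, const int *expected, size_t n)
{
    assert(tokens != NULL && expected != NULL);
    input = tokens;
    errors = 0;
    lookahead = nextToken();
    for (size_t i = 0; i < n; i++)
        match(expected[i]);
    return errors;
}
```

Note that a failed match() does not advance the input in this sketch, so one missing token can cause several reported mismatches in a row.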


Recursive descent example

• Grammar for a subset of the language “types in Pascal”:

type → ^ id | array [ simple ] of type | simple

simple → integer | char | num dotdot num

• Examples of “programs”:

^ my_type
array [ 1..10 ] of Integer
array [ Char ] of 72..98


Recursive descent example

    void type()
    {
        switch (lookahead) {
        case '^':
            match('^');
            match(ID);
            break;
        case ARRAY:
            match(ARRAY);
            match('[');
            simple();
            match(']');
            match(OF);
            type();
            break;
        default:
            simple();
        }
    }

    void simple()
    {
        switch (lookahead) {
        case INTEGER:
            match(INTEGER);
            break;
        case CHAR:
            match(CHAR);
            break;
        case NUM:
            match(NUM);
            match(DOTDOT);
            match(NUM);
            break;
        default:
            error();
        }
    }
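To try the parser above, it needs a token source and a driver. The sketch below is one possible setup, not the course's own: the token codes, the array-based nextToken(), the counting error() and the driver parse_type() are all assumptions. type() and simple() follow the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes: single-character tokens stand for themselves,
   the rest are numbered from 256. END marks the end of the input. */
enum { ID = 256, ARRAY, OF, INTEGER, CHAR, NUM, DOTDOT, END };

static const int *input;   /* next unread token */
static int lookahead;      /* current token */
static int errors;         /* number of syntax errors seen */

static int nextToken(void) { return *input == END ? END : *input++; }
static void error(void) { errors++; }

static void match(int expected_lookahead)
{
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}

static void simple(void)
{
    switch (lookahead) {
    case INTEGER: match(INTEGER); break;
    case CHAR:    match(CHAR); break;
    case NUM:     match(NUM); match(DOTDOT); match(NUM); break;
    default:      error();
    }
}

static void type(void)
{
    switch (lookahead) {
    case '^':
        match('^'); match(ID);
        break;
    case ARRAY:
        match(ARRAY); match('['); simple(); match(']'); match(OF); type();
        break;
    default:
        simple();
    }
}

/* Returns 1 if the END-terminated token array is a valid type, else 0. */
int parse_type(const int *tokens)
{
    assert(tokens != NULL);
    input = tokens;
    errors = 0;
    lookahead = nextToken();
    type();
    match(END);   /* the whole input must be consumed */
    return errors == 0;
}
```

With these assumptions, the three example "programs" from the grammar slide correspond to the token sequences ^ id, array [ num dotdot num ] of integer, and array [ char ] of num dotdot num, and parse_type() should accept each of them.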


Exercise (1)

List the calls made by the previous recursive descent parser on the input string

array [ num dotdot num ] of integer

To get you started:

    type()
        match(ARRAY)
        match('[')
        simple()
        ...

[Slides 9 to 20: animation that builds the parse tree for the input below, one step per slide.]

Input: array [ num dotdot num ] of integer

type → ^ id | array [ simple ] of type | simple

simple → integer | char | num dotdot num

Resulting parse tree (children indented under their parent):

    type
        array
        [
        simple
            num
            dotdot
            num
        ]
        of
        type
            simple
                integer


LEFT RECURSION


The problem with left recursion

• Left-recursive grammar:
    A → A α | β
• Problematic for recursive descent parsing
  – Infinite recursion


The problem with left recursion

• The left-recursive expression grammar:
    expr → expr + num
         | expr – num
         | num

• Parser code:
    void expr()
    {
        if (lookahead != NUM)
            expr();
        match('+');
        …
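Why the recursion never ends can be seen by translating the production expr → expr + num directly: the function calls itself before any token has been consumed, so every nested call sees the same lookahead. The sketch below is a demonstration under assumptions, not the slide's code. MAX_DEPTH is a hypothetical guard that stands in for the stack overflow a real parser would hit, and the token codes and helper name are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM = 256, END };      /* hypothetical token codes */
enum { MAX_DEPTH = 1000 };    /* hypothetical guard against endless recursion */

static const int *input;      /* next unread token */
static int lookahead;         /* current token */
static int depth;             /* nested expr() calls so far */

static int nextToken(void) { return *input == END ? END : *input++; }

static void match(int expected_lookahead)
{
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
}

/* Naive translation of  expr -> expr + num. */
static void expr(void)
{
    if (++depth > MAX_DEPTH)
        return;               /* give up instead of recursing forever */
    expr();                   /* same nonterminal, same lookahead: no progress */
    if (depth > MAX_DEPTH)
        return;               /* unwinding after the guard fired */
    match('+');
    match(NUM);
}

/* Returns how many tokens expr() consumed before the guard fired. */
size_t tokens_consumed_by_naive_expr(const int *tokens)
{
    assert(tokens != NULL);
    input = tokens;
    depth = 0;
    lookahead = nextToken();
    const int *start = input;
    expr();
    return (size_t)(input - start);
}
```

Even on a valid input such as num + num, the guard fires after MAX_DEPTH nested calls without a single token having been matched.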


Eliminating left recursion

• Left-recursive grammar:
    A → A α | β
• Rewritten grammar:
    A → β M
    M → α M | ε
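As an illustration, the same rewrite can be applied to the left-recursive expression grammar, where β is num and α is either "+ num" or "– num". Under that reading the grammar becomes expr → num rest, rest → + num rest | – num rest | ε. This rewritten grammar and the name rest (the slide's M) are not taken from the slides. The sketch below parses it, again assuming an END-terminated int array as token source and '-' as the code for the minus token.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM = 256, END };   /* hypothetical token codes */

static const int *input;   /* next unread token */
static int lookahead;      /* current token */
static int errors;         /* number of syntax errors seen */

static int nextToken(void) { return *input == END ? END : *input++; }
static void error(void) { errors++; }

static void match(int expected_lookahead)
{
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}

/* rest -> + num rest | - num rest | epsilon   (plays the role of M) */
static void rest(void)
{
    switch (lookahead) {
    case '+': match('+'); match(NUM); rest(); break;
    case '-': match('-'); match(NUM); rest(); break;
    default:  break;       /* epsilon: consume nothing */
    }
}

/* expr -> num rest   (A -> beta M) */
static void expr(void)
{
    match(NUM);
    rest();
}

/* Returns 1 if the END-terminated token array is a valid expr, else 0. */
int parse_expr(const int *tokens)
{
    assert(tokens != NULL);
    input = tokens;
    errors = 0;
    lookahead = nextToken();
    expr();
    match(END);
    return errors == 0;
}
```

Every recursive call to rest() now happens after two tokens have been consumed, so the recursion depth is bounded by the input length.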


Exercise (2)

Remove the left recursion from the following grammar for formal parameter lists in C:

list → par | list , par

par → int id

int and id are tokens that represent the keyword int and identifiers, respectively.

Hint: what is α and what is β in this case?


LEFT FACTORING


The problem

• Recall: how does a predictive parser choose a production body?

• What if the lookahead token matches more than one such production body?


The problem

• Problematic grammar:
    list → num
         | num , list
• If lookahead = num, what to expect?


Left factoring

• The previous grammar,
    list → num
         | num , list
  becomes
    list → num list’
    list’ → ε
          | , list
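After left factoring, one token of lookahead is enough to decide what follows the common prefix num. The sketch below shows one way the factored grammar could be coded. The function name list_tail() (for list’), the token codes and the array-based token source are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM = 256, END };   /* hypothetical token codes */

static const int *input;   /* next unread token */
static int lookahead;      /* current token */
static int errors;         /* number of syntax errors seen */

static int nextToken(void) { return *input == END ? END : *input++; }
static void error(void) { errors++; }

static void match(int expected_lookahead)
{
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}

static void list(void);

/* list' -> , list | epsilon */
static void list_tail(void)
{
    if (lookahead == ',') {
        match(',');
        list();
    }
    /* otherwise epsilon: consume nothing */
}

/* list -> num list' */
static void list(void)
{
    match(NUM);            /* the common prefix is matched exactly once */
    list_tail();
}

/* Returns 1 if the END-terminated token array is a valid list, else 0. */
int parse_list(const int *tokens)
{
    assert(tokens != NULL);
    input = tokens;
    errors = 0;
    lookahead = nextToken();
    list();
    match(END);
    return errors == 0;
}
```

The decision that was impossible with lookahead = num is postponed until after num has been consumed, where a comma selects the longer alternative and anything else selects ε.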


Exercise (3)

Perform left factoring on the following grammar for declarations of variables and functions in C:

decl → int id ; | int id ( pars ) ;

pars → ...


Conclusion

• Recursive descent parsers
• Left recursion
• Left factoring


Next time

• The sets FIRST and FOLLOW
• Defining LL(1) grammars
• Non-recursive top-down parser
• Handling syntax errors
