Lesson 4
CDT301 – Compiler Theory, Spring 2011
Teacher: Linus Källberg
Outline
• Recursive descent parsers
• Left recursion
• Left factoring
RECURSIVE DESCENT PARSING
Writing a recursive descent parser
• Straightforward once the grammar is written in an appropriate form:
  – For each nonterminal: create a function
    • Represents the expectation of that nonterminal in the input
    • Each such function should choose a grammar production, i.e., RHS, based on the lookahead token
    • It should then process the chosen RHS
      – Terminals are “matched”: match(IF); match(LEFT_PARENTHESIS); … match(RIGHT_PARENTHESIS); …
      – For nonterminals, their corresponding “expectation functions” are called
The function match()
Helper function to consume terminals:
void match(int expected_lookahead) {
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}
(assumes tokens are represented as ints)
Recursive descent example
• Grammar for a subset of the language “types in Pascal”:
type → ^ id | array [ simple ] of type | simple
simple → integer | char | num dotdot num
• Examples of “programs”:
^ my_type
array [ 1..10 ] of Integer
array [ Char ] of 72..98
Recursive descent example
void type() {
    switch (lookahead) {
    case '^':
        match('^'); match(ID);
        break;
    case ARRAY:
        match(ARRAY); match('['); simple(); match(']'); match(OF); type();
        break;
    default:
        simple();
    }
}
void simple() {
    switch (lookahead) {
    case INTEGER:
        match(INTEGER);
        break;
    case CHAR:
        match(CHAR);
        break;
    case NUM:
        match(NUM); match(DOTDOT); match(NUM);
        break;
    default:
        error();
    }
}
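The slide code leaves lookahead, nextToken() and error() undefined. The following is a minimal sketch of one way to fill them in so that the parser compiles on its own. The token codes, the END-terminated token array that stands in for a lexer, the failed flag, and the entry point parse_type() are all assumptions made for this sketch and are not part of the lesson.

```c
#include <assert.h>
#include <stddef.h>

/* Token codes: hypothetical values chosen for this sketch.
   Single-character tokens such as '^' and '[' use their char codes. */
enum { END = 0, ID = 256, ARRAY, OF, INTEGER, CHAR, NUM, DOTDOT };

static const int *input;  /* points at the current token (assumed array input) */
static int lookahead;     /* the current token */
static int failed;        /* set to 1 once any syntax error is seen */

/* Advance to the next token, staying on END once the input is exhausted. */
static int nextToken(void) {
    if (*input != END)
        input++;
    return *input;
}

/* Assumed error handling: just record that the parse failed. */
static void error(void) { failed = 1; }

static void match(int expected_lookahead) {
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}

static void simple(void);

/* type -> ^ id | array [ simple ] of type | simple */
static void type(void) {
    switch (lookahead) {
    case '^':
        match('^'); match(ID);
        break;
    case ARRAY:
        match(ARRAY); match('['); simple(); match(']'); match(OF); type();
        break;
    default:
        simple();
    }
}

/* simple -> integer | char | num dotdot num */
static void simple(void) {
    switch (lookahead) {
    case INTEGER: match(INTEGER); break;
    case CHAR:    match(CHAR); break;
    case NUM:     match(NUM); match(DOTDOT); match(NUM); break;
    default:      error();
    }
}

/* Returns 1 if the END-terminated token array is a valid "type", else 0. */
int parse_type(const int *tokens) {
    assert(tokens != NULL);
    input = tokens;
    lookahead = *input;
    failed = 0;
    type();
    return !failed && lookahead == END;
}
```

Under these assumptions, calling parse_type() on the token sequence ARRAY '[' NUM DOTDOT NUM ']' OF INTEGER END accepts the input, while dropping DOTDOT NUM from the sequence makes it reject.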
Exercise (1)
List the calls made by the previous recursive descent parser on the input string
array [ num dotdot num ] of integer
To get you started:
type()
    match(ARRAY)
    match('[')
    simple()
    ...
[Slide animation: the parse tree for the exercise input, built up step by step]
Input: array [ num dotdot num ] of integer
type → ^ id | array [ simple ] of type | simple
simple → integer | char | num dotdot num
Final tree: the root type has the children array [ simple ] of type, and the inner type has the single child simple.
LEFT RECURSION
The problem with left recursion
• Left-recursive grammar:
A → A α | β
• Problematic for recursive descent parsing
– Infinite recursion: the function for A would call itself without consuming any input
The problem with left recursion
• The left-recursive expression grammar:
expr → expr + num
     | expr – num
     | num
• Parser code:
void expr() {
    if (lookahead != NUM)
        expr();
    match('+');
    …
Eliminating left recursion
• Left-recursive grammar:
A → A α | β
• Rewritten grammar:
A → β M
M → α M | ε
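As an illustration, the rewriting rule can be applied to the expression grammar from the previous slide. There β is num and there are two α alternatives, + num and – num, which gives expr → num rest and rest → + num rest | – num rest | ε. The sketch below parses that rewritten grammar. The nonterminal name rest, the token codes, the array-based nextToken(), the failed flag, and parse_expr() are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Token codes: hypothetical values; '+' and '-' use their char codes. */
enum { END = 0, NUM = 256 };

static const int *input;  /* points at the current token (assumed array input) */
static int lookahead;     /* the current token */
static int failed;        /* set to 1 once any syntax error is seen */

/* Advance to the next token, staying on END once the input is exhausted. */
static int nextToken(void) {
    if (*input != END)
        input++;
    return *input;
}

static void error(void) { failed = 1; }

static void match(int expected_lookahead) {
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}

/* rest -> + num rest | - num rest | epsilon */
static void rest(void) {
    switch (lookahead) {
    case '+': match('+'); match(NUM); rest(); break;
    case '-': match('-'); match(NUM); rest(); break;
    default:  break;  /* epsilon: consume nothing */
    }
}

/* expr -> num rest
   A token is consumed before any recursive call, so the recursion ends. */
static void expr(void) {
    match(NUM);
    rest();
}

/* Returns 1 if the END-terminated token array is a valid expr, else 0. */
int parse_expr(const int *tokens) {
    assert(tokens != NULL);
    input = tokens;
    lookahead = *input;
    failed = 0;
    expr();
    return !failed && lookahead == END;
}
```

Every call to rest() that recurses first matches an operator and a num, so each level of recursion consumes input, which is exactly what the left-recursive version could not guarantee.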
Exercise (2)
Remove the left recursion from the following grammar for formal parameter lists in C:
list → par | list , par
par → int id
int and id are tokens that represent the keyword int and identifiers, respectively.
Hint: what is α and what is β in this case?
LEFT FACTORING
The problem
• Recall: how does a predictive parser choose a production body?
• What if the lookahead token matches more than one such production body?
The problem
• Problematic grammar:
list → num
     | num , list
• If lookahead = num, what to expect?
Left factoring
• The previous grammar,
list → num
     | num , list
becomes
list → num list’
list’ → ε
      | , list
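After left factoring, a single lookahead token is enough to choose a production: list always starts with num, and list’ looks for a comma. The sketch below shows a recursive descent parser for the factored grammar. The function name list_tail (standing in for list’), the token codes, the array-based nextToken(), the failed flag, and parse_list() are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Token codes: hypothetical values; ',' uses its char code. */
enum { END = 0, NUM = 256 };

static const int *input;  /* points at the current token (assumed array input) */
static int lookahead;     /* the current token */
static int failed;        /* set to 1 once any syntax error is seen */

/* Advance to the next token, staying on END once the input is exhausted. */
static int nextToken(void) {
    if (*input != END)
        input++;
    return *input;
}

static void error(void) { failed = 1; }

static void match(int expected_lookahead) {
    if (lookahead == expected_lookahead)
        lookahead = nextToken();
    else
        error();
}

static void list(void);

/* list' -> , list | epsilon */
static void list_tail(void) {
    if (lookahead == ',') {
        match(',');
        list();
    }
    /* otherwise epsilon: consume nothing */
}

/* list -> num list' */
static void list(void) {
    match(NUM);
    list_tail();
}

/* Returns 1 if the END-terminated token array is a valid list, else 0. */
int parse_list(const int *tokens) {
    assert(tokens != NULL);
    input = tokens;
    lookahead = *input;
    failed = 0;
    list();
    return !failed && lookahead == END;
}
```

The decision that was ambiguous in the original grammar is deferred until after num has been consumed, at which point the next token (a comma or anything else) settles it.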
Exercise (3)
Perform left factoring on the following grammar for declarations of variables and functions in C:
decl → int id ; | int id ( pars ) ;
pars → ...
Conclusion
• Recursive descent parsers
• Left recursion
• Left factoring
Next time
• The sets FIRST and FOLLOW
• Defining LL(1) grammars
• Non-recursive top-down parser
• Handling syntax errors