csci 565 - compiler design spring 2011 the front end ...€¢separating syntax analysis into lexical...

26
1 Spring 2011 CSCI 565 - Compiler Design Pedro Diniz [email protected] Syntactic Analysis Introduction Copyright 2011, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make copies of these materials for their personal use. 2 Spring 2011 CSCI 565 - Compiler Design Pedro Diniz [email protected] The Front End Parser Checks the stream of words and their parts of speech (produced by the scanner) for grammatical correctness Determines if the input is syntactically well formed Guides checking at deeper levels than syntax Builds an IR representation of the code Source code Scanner IR Parser Errors tokens 3 Spring 2011 CSCI 565 - Compiler Design Pedro Diniz [email protected] The Study of Parsing The process of discovering a derivation for some sentence Need a mathematical model of syntax — a grammar G Need an algorithm for testing membership in L(G) Need to keep in mind that our goal is building parsers, not studying the mathematics of arbitrary languages 4 Spring 2011 CSCI 565 - Compiler Design Pedro Diniz [email protected] Outline Overview of Lexical Analysis What is Syntax Analysis? Context-Free Grammars Derivation and Parse Trees Top-down vs. Bottom-up Parsing Ambiguous Grammars Implementing a Parser 5 Spring 2011 CSCI 565 - Compiler Design Pedro Diniz [email protected] Anatomy of a Compiler Intermediate Code Optimizer Code Generator Optimized Intermediate Representation Assembly code Intermediate Code Generator Intermediate Representation Lexical Analyzer (Scanner) Syntax Analyzer (Parser) Token Stream Parse Tree Program (character stream) 6 Spring 2011 CSCI 565 - Compiler Design Pedro Diniz [email protected] Overview of Lexical Analysis Lexical Analyzer Creates Tokens out of a Text Stream Tokens are defined using Regular Expressions

Upload: trinhthu

Post on 11-May-2018

234 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

1

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Syntactic Analysis

Introduction

Copyright 2011, Pedro C. Diniz, all rights reserved.Students enrolled in Compilers class at University of Southern California (USC)have explicit permission to make copies of these materials for their personal use.

2

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

The Front End

Parser• Checks the stream of words and their parts of speech (produced

by the scanner) for grammatical correctness• Determines if the input is syntactically well formed• Guides checking at deeper levels than syntax• Builds an IR representation of the code

Sourcecode Scanner

IRParser

Errors

tokens

3

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

The Study of ParsingThe process of discovering a derivation for some sentence• Need a mathematical model of syntax — a grammar G

• Need an algorithm for testing membership in L(G)• Need to keep in mind that our goal is building parsers, not

studying the mathematics of arbitrary languages

4

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Outline

• Overview of Lexical Analysis• What is Syntax Analysis?

• Context-Free Grammars• Derivation and Parse Trees• Top-down vs. Bottom-up Parsing

• Ambiguous Grammars• Implementing a Parser

5

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Anatomy of a Compiler

Intermediate Code Optimizer

Code Generator

Optimized Intermediate Representation

Assembly code

Intermediate Code Generator

Intermediate Representation

Lexical Analyzer (Scanner)

Syntax Analyzer (Parser)

Token Stream

Parse Tree

Program (character stream)

6

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Overview of Lexical Analysis

• Lexical Analyzer Creates Tokens out of a Text Stream

• Tokens are defined using Regular Expressions

Page 2: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

2

7

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Reg. Expr., Grammars and Languages

• A regular expression can be written using only:– characters in the alphabet– regular expression operators: ‘*’ ‘·’ ‘|’ ‘+’ ‘?’ ‘(‘ ‘)’– Example:

(-| ε) ·(0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?

• Regular language is a language defined by a regularexpression

8

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Reg. Expr., Grammars and Languages

• What about symbolic variables?– Example:

num = 0|1|2|3|4|5|6|7|8|9posint = num · num*int = (ε | -) · posintreal = int · (ε | (. · posint))

• They are just shorthand or “syntactic sugar”– Example:

(-| ε) ·(0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?

9

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Outline

• Overview of Lexical Analysis• What is Syntactic Analysis?

• Context-Free Grammars• Derivation and Parse Trees• Top-down vs. Bottom-up Parsing

• Ambiguous Grammars• Implementing a Parser

10

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Syntax and Semantics of aProgramming Language?

• Syntax– How a program looks like– Textual representation or structure– A precise mathematical definition is possible

• Semantics– What is the meaning of a program– Harder to give a mathematical definition

11

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Why do Syntax Analysis?

• Can provide a precise, easy-to-understand definition• A proper grammar imparts a structure into a

programming language• Can automatically construct a parser that can

determine if the program is syntactically well formed

• Helps in the translation process• Easy to modify/add to the language

12

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Anatomy of a Compiler

Intermediate Code Optimizer

Code Generator

Optimized Intermediate Representation

Assembly code

Intermediate Code Generator

Intermediate Representation

Syntactic Analyzer (Parser)

Lexical Analyzer (Scanner)Token Stream

Parse Tree

Program (character stream)

Page 3: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

3

13

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Input to and Output of a Parser

-

( )

123.3 23.6+

minus_op

left_paran_op

num(123.3)

plus_op

num(23.6)

right_paran_op

Token Stream Parse Tree

Input: - (123.3 + 23.6)Sy

ntac

tic A

naly

zer (

Pars

er)

14

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Syntax Definition

• We need to provide a precise, easy-to-understanddefinition of the syntax of a programming language

• Can we use Regular Expressions?– Can we use regular language to describe a programming language?

15

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: Hierarchical Scope

procedure foo(integer m, integer n, integer j) { for i = 1 to n do {

if (i == j) {j = j + 1;m = i*j;

} for k = i to n { m = m + k;

} }}

16

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: Hierarchical Scopeprocedure foo(integer m, integer n, integer j) { for i = 1 to n do {

if (i == j) {j = j + 1;m = i*j;

} for k = i to n { m = m + k;

} }}

• Balanced Parentheses Problem– Example: {{}{{{}{{}}}}}

17

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Balanced Parentheses Problem

• Can we define this using a Regular Expression?

18

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Balanced Parentheses Problem

• Can we define this using a Regular Expression?

NO!

Page 4: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

4

19

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Balanced Parentheses Problem• Can we define this using a Regular Expression?

NO!

• Intution:– Number of open parentheses should match the close parentheses– Need to keep tab of the count

or need recursion– Also: NFA’s and DFA’s cannot perform unbounded counting

20

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Balanced Parentheses Problem

• Is there a grammar that can define this?

<S> → ( <S> ) <S> | ε

• The definition is Recursive

• This is a Context-Free Grammar– It is more expressive than Regular Expressions

21

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Outline

• Overview of Lexical Analysis• What is Syntactic Analysis?• Context-Free Grammars

• Derivation and Parse Trees• Top-down vs. Bottom-up Parsing• Ambiguous Grammars

• Implementing a Parser

22

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Defining Context-Free Grammars(CFGs)

Formally, a grammar is a four tuple, G = (S,N,T,P)

• Start symbol S (set of strings in L(G))– A special nonterminal is designated

• Nonterminals N (syntactic variables)– Syntactic variables

• Terminals T (words)– Symbols for strings or tokens

• Productions P (P : N → (N ∪ T)+ )– The manner in which terminals and nonterminals are combined to form strings– A nonterminal in LHS and a string of terminals and non-terminals in RHS

23

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example of a CFG

<S> → ( <S> ) <S> | ε

24

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example of a CFG<S> → ( <S> ) <S>

<S> → ε

Page 5: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

5

25

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<S> → ( <S> ) <S>

<S> → ε Terminals

Example of a CFG

26

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example of a CFG<S> → ( <S> ) <S>

<S> → ε Nonterminals

27

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example of a CFG<S> → ( <S> ) <S>

<S> → εStart Symbol: <S>

28

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<S> → ( <S> ) <S>

<S> → εProductions

Example of a CFG

29

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Regular Languages a Subset of CFL

aRegular Expression Context-Free Grammar

<A> → a

p · q <S> → <P> <Q>If p and q are regular expressions with CFGs <P> and <Q>

p | q <S> → <P><S> → <Q>

p * <S> → <S> <P><S> → ε

30

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Why use Regular Expressions?

• Separating syntax analysis into lexical and non-lexicalparts is a nice modularization boundary

• Lexical rules are simple, can be expressed usingregular expressions

• Regular Expressions are more concise

• Lexical Analyzer implementations for RegularExpressions are more efficient

Page 6: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

6

31

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Creating a CFG

• We need to create a CFG from a given languagedefinitions

• There are many issues involved– We’ll address some of them in this class

• Lets look at a simple language

32

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: A CFG for expressions• Simple arithmetic expressions with + and *

– 8.2 + 35.6– 8.32 + 86 * 45.3– (6.001 + 6.004) * (6.035 * -(6.042 + 6.046))

• Terminals (or tokens)– num for all the numbers– plus_op (‘+’), minus_op (‘-’), times_op(‘*’), left_paren_op(‘(‘),

right_paran_op(‘)’)

• What is the grammar for all possible expressions?

33

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: A CFG for expressions

<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr><expr> → num<op> → +<op> → *

34

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: A CFG for expressions

<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr><expr> → num<op> → +<op> → *

Terminals

35

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr><expr> → num<op> → +<op> → *

Nonterminals

Example: A CFG for expressions

36

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: A CFG for expressions

<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr><expr> → num<op> → +<op> → *

Start Symbol:<expr>

Page 7: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

7

37

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: A CFG for expressions<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr><expr> → num<op> → +<op> → *

Productions

38

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: A CFG for expressions<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr><expr> → num<op> → +<op> → *

39

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example: A CFG for expressions<expr> → <expr> <op> <expr> | ( <expr> )

| - <expr> | num

<op> → + | *

40

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

What is the language defined by thisCFG?

<S> → a<S>a | aa

41

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Outline

• Overview of Lexical Analysis

• What is Syntactic Analysis?• Context-Free Grammars• Derivation and Parse Trees

• Top-down vs. Bottom-up Parsing• Ambiguous Grammars• Implementing a Parser

42

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Derivation

• How do we show that a sequence of tokens isaccepted by a CFG?

• Productions are used to derive a sequence oftokens from the start symbol

• For a given strings α,β and γ and a productionA → βA single step of derivation is αAγ ⇒ αβγ

Page 8: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

8

43

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation• Grammar

<expr> → <expr><op><expr> | (<expr>) | -<expr> | num

<op> → + | *

• Input 36 * ( 8 + 23.4)

• Token Streamnum ‘*’ ‘(‘ num ‘+’ num ‘)’

44

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

45

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’46

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → <expr><op><expr>

47

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr> ⇒ <expr> <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → <expr><op><expr>

48

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr> ⇒ <expr> <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Page 9: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

9

49

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr> ⇒ <expr> <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’50

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr> ⇒ <expr> <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → num

51

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example Derivation

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → num

52

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

53

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

54

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<op> → *Example Derivation

Page 10: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

10

55

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<op> → *Example Derivation

56

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

57

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

58

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → (<expr>)Example Derivation

59

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → (<expr>)Example Derivation

60

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

Page 11: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

11

61

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

62

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → <expr><op><expr>Example Derivation

63

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → <expr><op><expr>Example Derivation

64

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

65

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

66

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → numExample Derivation

Page 12: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

12

67

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → numExample Derivation

68

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

69

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

70

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<op> → +Example Derivation

71

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<op> → +Example Derivation

72

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

Page 13: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

13

73

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

74

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ <expr> ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → numExample Derivation

75

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ num ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

<expr> → numExample Derivation

76

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ num ‘)’

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Example Derivation

77

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Parse Tree

• Graphical Representation of the Parsed Structure

• Shows the Sequence of Derivations Performed– Internal Nodes are Non-Terminals– Leaves are Terminals– Each Parent Node is LHS and the Children are RHS of a Production

78

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Parse Tree Example

<expr>

Page 14: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

14

79

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>

<expr>

<expr><expr> <op>

Parse Tree Example

80

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ num

num

<expr>

<expr><expr> <op>

Parse Tree Example

81

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<op> ⇒ ‘*’

*num

<expr>

<expr><expr> <op>

Parse Tree Example

82

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ ‘(‘ <expr> ‘)’

( )*num

<expr>

<expr><expr>

<expr>

<op>

Parse Tree Example

83

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ <expr> <op> <expr>

( )*num

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Parse Tree Example

84

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ num

num

( )*num

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Parse Tree Example

Page 15: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

15

85

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<op> ⇒ ‘+’

num +

( )*num

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Parse Tree Example

86

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr> ⇒ num

num num+

( )*num

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Parse Tree Example

87

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

num num+

( )*num

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

num ‘*’ ‘(‘ num ‘+’ num ‘)’

Parse Tree Example

88

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

• Overview of Lexical Analysis

• What is Syntactic Analysis?• Context-Free Grammars• Derivation and Parse Trees

• Top-down vs. Bottom-up Parsing• Ambiguous Grammars• Implementing a Parser

Outline

89

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

• Leftmost Derivation– In the string, find the leftmost non-terminal and apply a production to it– Previous example was left-most derivation

• Rightmost Derivation– Find the right-most non-terminal and apply a production to it

Left-most vs. Right-most Derivation

90

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr>

Production:String: <expr>

Right-Derivation Example

Page 16: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

16

91

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

<expr>

<expr><expr> <op>

Production: <expr> ⇒ <expr> <op> <expr>String: <expr> <op> <expr>

Right-Derivation Example

92

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

( )

<expr>

<expr><expr>

<expr>

<op>

Production: <expr> ⇒ ‘(‘ <expr> ‘)’String: <expr> <op> ‘(‘ <expr> ‘)’

93

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

( )

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Production: <expr> ⇒ <expr> <op> <expr>String: <expr> <op> ‘(‘ <expr> <op> <expr> ‘)’

94

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

num

( )

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Production: <expr> ⇒ numString: <expr> <op> ‘(‘ <expr> <op> num ‘)’

95

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

num+

( )

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Production: <op> ⇒ ‘+’String: <expr> <op> ‘(‘ <expr> ‘+’ num ‘)’

96

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

num num+

( )

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Production: <expr> ⇒ numString: <expr> <op> ‘(‘ num ‘+’ num ‘)’

Page 17: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

17

97

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

num num+

( )*

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Production: <op> ⇒ ‘*’String: <expr> ‘*’ ‘(‘ num ‘+’ num ‘)’

98

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

num num+

( )*num

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

Production: <expr> ⇒ numString: num ‘*’ ‘(‘ num ‘+’ num ‘)’

99

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example

num num+

( )*num

<expr>

<expr><expr>

<expr>

<expr>

<expr>

<op>

<op>

String: num ‘*’ ‘(‘ num ‘+’ num ‘)’

100

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Right-Derivation Example⇒ <expr>⇒ <expr> <op> <expr>⇒ <expr> <op> ‘(‘ <expr> ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> <expr> ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> num ‘)’⇒ <expr> <op> ‘(‘ <expr> ‘+’ num ‘)’’⇒ <expr> <op> ‘(‘ num ‘+’ num ‘)’⇒ <expr> ‘*’ ‘(‘ num ‘+’ num ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ num ‘)’

101

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Top-down vs. Bottom-up Parsing• We normally scan from left to right

• Left-most derivation reflects top-down parsing– Start with the start symbol– End with the string of tokens

102

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Top-down Parsing• Left-most derivation

⇒ <expr> ⇒ <expr> <op> <expr>⇒ num <op> <expr>⇒ num ‘*’ <expr>⇒ num ‘*’ ‘(‘ <expr> ‘)’⇒ num ‘*’ ‘(‘ <expr> <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num <op> <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ <expr> ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ num ‘)’

Page 18: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

18

103

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Top-down vs. Bottom-up Parsing

• We normally scan from left to right

• Left-most derivation reflects top-down parsing– Start with the start symbol– End with the string of tokens

• Right-most derivation reflects bottom-up parsing– Start with the string of tokens– Ends with the start symbol

104

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Bottom-up Parsing• Right-most derivation

⇒ <expr>⇒ <expr> <op> <expr>⇒ <expr> <op> ‘(‘ <expr> ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> <expr> ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> num ‘)’⇒ <expr> <op> ‘(‘ <expr> ‘+’ num ‘)’’⇒ <expr> <op> ‘(‘ num ‘+’ num ‘)’⇒ <expr> ‘*’ ‘(‘ num ‘+’ num ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ num ‘)’

105

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Bottom-up Parsing• Right-most derivation

⇒ <expr>⇒ <expr> <op> <expr>⇒ <expr> <op> ‘(‘ <expr> ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> <expr> ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> num ‘)’⇒ <expr> <op> ‘(‘ <expr> ‘+’ num ‘)’’⇒ <expr> <op> ‘(‘ num ‘+’ num ‘)’⇒ <expr> ‘*’ ‘(‘ num ‘+’ num ‘)’⇒ num ‘*’ ‘(‘ num ‘+’ num ‘)’

106

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Bottom-up Parsing• Right-most derivation in reverse

⇒ num ‘*’ ‘(‘ num ‘+’ num ‘)’⇒ <expr> ‘*’ ‘(‘ num ‘+’ num ‘)’⇒ <expr> <op> ‘(‘ num ‘+’ num ‘)’⇒ <expr> <op> ‘(‘ <expr> ‘+’ num ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> num ‘)’⇒ <expr> <op> ‘(‘ <expr> <op> <expr> ‘)’

⇒ <expr> <op> ‘(‘ <expr> ‘)’⇒ <expr> <op> <expr>⇒ <expr>

107

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Outline

• Overview of Lexical Analysis• What is Syntactic Analysis?

• Context-Free Grammars• Derivation and Parse Trees• Top-down vs. Bottom-up Parsing

• Ambiguous Grammars• Implementing a Parser

108

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example• Input:

124 + 23.5 * 86

• Token Stream: num ‘+’ num ‘*’ num

Page 19: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

19

109

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

<expr>

Production:String: <expr>

110

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

<expr>

<expr><expr> <op>

Production: <expr> ⇒ <expr> <op> <expr>String: <expr> <op> <expr>

111

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num

<expr>

<expr><expr> <op>

Production: <expr> ⇒ <num>String: num <op> <expr>

112

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

+num

<expr>

<expr><expr> <op>

Production: <op> ⇒ ‘+’String: num ‘+’ <expr>

113

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another ExampleProduction: <expr> ⇒ <expr> <op> <expr>String: num ‘+’ <expr> <op> <expr>

+num

<expr>

<expr><expr>

<expr> <expr>

<op>

<op>

114

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another ExampleProduction: <expr> ⇒ numString: num ‘+’ num <op> <expr>

num

+num

<expr>

<expr><expr>

<expr> <expr>

<op>

<op>

Page 20: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

20

115

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num

+num

<expr>

<expr><expr>

<expr> <expr>

<op>

Production: <op> ⇒ ‘*’String: num ‘+’ num ‘*’ <expr>

*

<op>

116

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num num

+num

<expr>

<expr><expr>

<expr> <expr>

<op>

Production: <expr> ⇒ numString: num ‘+’ num ‘*’ num

*

<op>

117

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

String: num ‘+’ num ‘*’ num

num num

+num

<expr>

<expr><expr>

<expr> <expr>

<op>

*

<op>

118

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

String: num ‘+’ num ‘*’ num

• How about a different order of derivation

119

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

<expr>

String: <expr>

120

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

<expr>

<expr> <expr><op>

Production: <expr> ⇒ <expr> <op> <expr>String: <expr> <op> <expr>

Page 21: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

21

121

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num

<expr>

<expr><expr> <op>

Production: <expr> ⇒ <num>String: num <op> <expr>

122

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num

<expr>

<expr><expr> <op>

Production: <expr> ⇒ <num>String: num <op> <expr>

But we can instead use the production<expr> ⇒ <expr> <op> <expr>

123

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

<expr>

<expr><expr> <op>

Production:String: <expr> <op> <expr>

But we can instead use the production<expr> ⇒ <expr> <op> <expr>

124

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

<expr>

<expr> <expr>

<expr> <expr>

<op>

Production: <expr> ⇒ <expr> <op> <expr>String: <expr> <op> <expr> <op> <expr>

<op>

125

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num

<expr>

<expr> <expr>

<expr> <expr>

<op>

Production: <expr> ⇒ <num>String: num <op> <expr> <op> <expr>

<op>

126

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num

<expr>

<expr> <expr>

<expr> <expr>

<op>

Production: <op> ⇒ <+>String: num ‘+’ <expr> <op> <expr>

+

<op>

Page 22: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

22

127

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num num

<expr>

<expr> <expr>

<expr> <expr>

<op>

Production: <expr> ⇒ <num>String: num ‘+’ num <op> <expr>

+

<op>

128

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num num

*

<expr>

<expr> <expr>

<expr> <expr>

<op>

Production: <op> ⇒ ‘*’String: num ‘+’ num ‘*’ <expr>

+

<op>

129

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num num

* num

<expr>

<expr> <expr>

<expr> <expr>

<op>

Production: <expr> ⇒ <num>String: num ‘+’ num ‘*’ num

+

<op>

130

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Another Example

num num

* num

<expr>

<expr> <expr>

<expr> <expr>

<op>

String: num ‘+’ num ‘*’ num

+

<op>

131

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Same string - Two derivations

num ‘+’ num ‘*’ num

num num

* num

<expr>

<expr> <expr>

<expr> <expr>

<op>

+

<op>

num num

+num

<expr>

<expr><expr>

<expr> <expr>

<op>

*

<op>

124 + (23.5 * 86) = 2145 (124 + 23.5) * 86 = 12685

132

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

The Grammar is Ambiguous• Different Derivation Orders Produce Different

Parse Trees• This is Not Good!

– Leads to Ambiguous Results– Most probably will produce unexpected results

• Sometimes Rewriting a Grammar with AdditionalNon-Terminals will Eliminate the Ambiguity

Page 23: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

23

133

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

The Ambiguous Grammar<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr>

<expr> → num<op> → +<op> → *

134

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Eliminating Ambiguity

<expr> → <expr> + <term>

<expr> → <term><term> → <term> * <unit><term> → <unit>

<unit> → num<unit> → ( <expr> )

<expr> → <expr> <op> <expr><expr> → ( <expr> )<expr> → - <expr><expr> → num

<op> → +<op> → *

135

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Eliminating Ambiguity

String: num ‘+’ num ‘*’ num

num

+

<expr>

<term><expr>

<term> <unit>*

num

<unit>

<term>

num

<unit>

136

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

First example in the new grammar

num ‘*’ ‘(‘ num ‘+’ num ‘)’

+

( )*

num

<term>

<unit><term>

<expr>

<expr>

<term>

<expr>

<unit>

num

<term>

<unit> num

<unit>

137

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Question: Is this Grammar ambiguous?

<stmt> → if <expr> then <stlist><stmt> → if <expr> then <stlist> else <stlist>

138

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

How do you make this unambiguous?

<stmt> → if <expr> then <stlist><stmt> → if <expr> then <stlist> else <stlist>

Page 24: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

24

139

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Ambiguous GrammarsDefinitions• If a grammar has more than one leftmost derivation for a single

sentential form, the grammar is ambiguous• If a grammar has more than one rightmost derivation for a

single sentential form, the grammar is ambiguous• The leftmost and rightmost derivations for a sentential form

may differ, even in an unambiguous grammar

Classic example — the if-then-else problemStmt → if Expr then Stmt | if Expr then Stmt else Stmt | … other stmts …

This ambiguity is entirely grammatical in nature

140

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Ambiguity

This sentential form has two derivationsif Expr1 then if Expr2 then Stmt1 else Stmt2

then

else

if

then

if

E1

E2

S2

S1

production 2, thenproduction 1

then

if

then

if

E1

E2

S1

else

S2

production 1, thenproduction 2

141

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

AmbiguityRemoving the Ambiguity• Must rewrite the grammar to avoid generating the problem• Match each else to innermost unmatched if (common sense rule)

With this grammar, the example has only one derivation

1 Stmt ! WithElse2 | NoElse

3 WithElse ! if Expr then WithElse else WithElse

4 | OtherStmt

5 NoElse ! if Expr then Stmt

6 | if Expr then WithElse else NoElse

Intuition: a NoElse always has no else on its last cascadedelse if statement

142

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Ambiguity

if Expr1 then if Expr2 then Stmt1 else Stmt2

This binds the else controlling S2 to the inner if

Rule Sentential Form— Stmt2 NoElse5 if Expr then Stmt? if E1 then Stmt1 if E1 then WithElse3 if E1 then if Expr then WithElse else WithElse? if E1 then if E2 then WithElse else WithElse4 if E1 then if E2 then S1 else WithElse4 if E1 then if E2 then S1 else S2

143

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Outline

• Overview of Lexical Analysis• What is Syntactic Analysis?

• Context-Free Grammars• Derivation and Parse Trees• Top-down vs. Bottom-up Parsing

• Ambiguous Grammars• Implementing a Parser

144

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Implementing a Parser

• Implementing a parser for some CFGs can be verydifficult– Need to look at the input and choose a production– Cannot choose the right production without looking a lot ahead

• Key: Decision Algorithm– If Input Sentence does not belong to the Language, then

• Algorithm must prove it cannot be derived using Grammer• For any possible production sequence…

– If Input sentence belongs to the Language• Algorithm must present derivation sequence

Page 25: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

25

145

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Example of look ahead

• Grammar<stmt> → a <long> b<stmt> → a <long> c<long> → x <long> | x

• Input string “axxxxxxxxxxxxxxxxx…….”

• May need to look ahead a long while beforedeciding on a production

146

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Implementing a Parser

• Implementing a parser for some CFGs can be verydifficult– Need to look at the input and choose a production– Cannot choose the production without look ahead

• Different Techniques– Each can handle some set of CFGs– Categorization of techniques

147

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Implementing a Parser• Implementing a parser for some CFGs can be very

difficult– Need to look at the input and choose a production– Cannot choose the production without look ahead

• Different Techniques– Each can handle some set of CFGs– Categorization of techniques

( )

148

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Implementing a Parser• Implementing a parser for some CFGs can be very difficult

– Need to look at the input and choose a production– Cannot choose the production without look ahead

• Different Techniques– Each can handle some set of CFGs– Categorization of techniques

– L - parse from left to right– R - parse from right to left

( )

149

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Implementing a Parser• Implementing a parser for some CFGs can be very difficult

– Need to look at the input and choose a production– Cannot choose the production without look ahead

• Different Techniques– Each can handle some set of CFGs– Categorization of techniques

– L - leftmost derivation– R - rightmost derivation

( )

150

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Implementing a Parser• Implementing a parser for some CFGs can be very difficult

– Need to look at the input and choose a production– Cannot choose the production without look ahead

• Different Techniques– Each can handle some set of CFGs– Categorization of techniques

– Number of lookahead characters

( )

Page 26: CSCI 565 - Compiler Design Spring 2011 The Front End ...€¢Separating syntax analysis into lexical and non-lexical parts is a nice modularization boundary ... CSCI 565 - Compiler

26

151

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Implementing a Parser• Implementing a parser for some CFGs can be very

difficult– Need to look at the input and choose a production– Cannot choose the production without look ahead

• Different Techniques– Each can handle some set of CFGs– Categorization of techniques

– Examples: LL(0), LR(1)

( )

152

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Next Lecture

• How to Implement a Parser• How to build a Parser Engine for a

Shift-Reduce Parser

• We will look at– LR(0)– LR(1)– LALR(1)

ParserEngine

153

Spring 2011CSCI 565 - Compiler Design

Pedro [email protected]

Summary

• What is Syntactic Analysis?• Difference between Lexical any Syntactic Analyses

• All about Context-Free Grammars• Parse Trees• Left-most and Right-most Derivations

• Top-down and Bottom-up Parsing• Ambiguous Grammars• Issues in Parser Implementation