Introduction to Grammars and Parsing Techniques
Paul Klint
Grammars and languages are one of the most established areas of computer science
N. Chomsky, Aspects of the Theory of Syntax, 1965
A.V. Aho & J.D. Ullman, The Theory of Parsing, Translation, and Compiling, Parts I + II, 1972
A.V. Aho, R. Sethi, J.D. Ullman, Compilers: Principles, Techniques, and Tools, 1986
D. Grune, C. Jacobs, Parsing Techniques: A Practical Guide, 2008
Why are Grammars and Parsing Techniques relevant?
● A grammar is a formal method to describe a (textual) language
    ● Programming languages: C, Java, C#, JavaScript
    ● Domain-specific languages: BibTeX, Mathematica
    ● Data formats: log files, protocol data
● Parsing:
    ● Tests whether a text conforms to a grammar
    ● Turns a correct text into a parse tree
How to define a grammar?
● Simplistic solution: finite set of acceptable sentences
    ● Problem: what to do with infinite languages?
● Realistic solution: finite recipe that describes all acceptable sentences
● A grammar is a finite description of a possibly infinite set of acceptable sentences
Example: Tom, Dick and Harry
● Suppose we want to describe a language that contains the following legal sentences:
    ● Tom
    ● Tom and Dick
    ● Tom, Dick and Harry
    ● Tom, Harry, Tom and Dick
    ● ...
● How do we find a finite recipe for this?
The Tom, Dick and Harry Grammar
● Name -> tom
● Name -> dick
● Name -> harry
● Sentence -> Name
● Sentence -> List End
● List -> Name
● List -> List , Name
● , Name End -> and Name
Non-terminals: Name, Sentence, List, End
Terminals: tom, dick, harry, and, ,
Start Symbol: Sentence
Variations in Notation
● Name -> tom | dick | harry
● <Name> ::= “tom” | “dick” | “harry”
● “tom” | “dick” | “harry” -> Name
Chomsky’s Grammar Hierarchy
● Type-0: Recursively Enumerable
    ● Rules: α -> β (unrestricted)
● Type-1: Context-sensitive
    ● Rules: αAβ -> αγβ
● Type-2: Context-free
    ● Rules: A -> γ
● Type-3: Regular
    ● Rules: A -> a and A -> aB
Context-free Grammar for TDH
● Name -> tom | dick | harry
● Sentence -> Name | List and Name
● List -> Name , List | Name
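To make the context-free TDH grammar concrete, here is a minimal recognizer sketch with one function per non-terminal. The function names (`tokenize`, `is_sentence`, `is_list`) and the choice to lower-case the input are assumptions made for illustration; they are not part of the slides.

```python
NAMES = {"tom", "dick", "harry"}  # Name -> tom | dick | harry


def tokenize(text):
    # Isolate commas, split on whitespace, and lower-case to match the terminals.
    return text.lower().replace(",", " , ").split()


def is_list(tokens):
    # List -> Name
    if len(tokens) == 1:
        return tokens[0] in NAMES
    # List -> Name , List
    return (
        len(tokens) >= 3
        and tokens[0] in NAMES
        and tokens[1] == ","
        and is_list(tokens[2:])
    )


def is_sentence(text):
    """Return True if text can be derived from Sentence."""
    tokens = tokenize(text)
    # Sentence -> Name
    if len(tokens) == 1:
        return tokens[0] in NAMES
    # Sentence -> List and Name
    if len(tokens) < 3 or tokens[-2] != "and" or tokens[-1] not in NAMES:
        return False
    return is_list(tokens[:-2])
```

For example, `is_sentence("Tom, Dick and Harry")` holds, while `is_sentence("Tom, and Dick")` does not, because `tom ,` is not a complete List.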
In practice ...
● Regular grammars used for lexical syntax:
    ● Keywords: if, then, while
    ● Constants: 123, 3.14, “a string”
    ● Comments: /* a comment */
● Context-free grammars used for structured and nested concepts:
    ● Class declaration
    ● If statement
A sentence
position := initial + rate * 60
Lexical syntax
● Regular expressions define lexical syntax:
    ● Literal characters: a,b,c,1,2,3
    ● Character classes: [a-z], [0-9]
    ● Operators: sequence (space), repetition (* or +), option (?)
● Examples:
    ● Identifier: [a-z][a-z0-9]*
    ● Number: [0-9]+
    ● Floating constant: [0-9]*.[0-9]*(e-[0-9]+)
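The example patterns can be tried out with Python's `re` module. This is a sketch under two assumptions that go beyond the slide: in Python's dialect the dot has to be escaped to mean a literal `.`, and the exponent group is taken to be optional (`?`), which seems to be the intent of the floating constant pattern. The helper name `matches` is hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[a-z][a-z0-9]*")
NUMBER = re.compile(r"[0-9]+")
# Assumption: literal dot escaped, exponent made optional with '?'.
FLOAT = re.compile(r"[0-9]*\.[0-9]*(e-[0-9]+)?")


def matches(pattern, text):
    # fullmatch requires the whole text to conform, not just a prefix.
    return pattern.fullmatch(text) is not None
```

Note that, as written, the floating constant pattern also accepts a lone `.`, since both digit sequences may be empty; a production lexer would likely tighten this.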
Lexical syntax
● Regular expressions can be implemented with a finite automaton
● Consider [a-z][a-z0-9]*
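[Figure: finite automaton with states Start and End; a transition labelled [a-z] leads from Start to End, and End carries a loop labelled [a-z0-9]*]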
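A two-state automaton for `[a-z][a-z0-9]*` can be sketched directly as a loop over the input characters. The state names follow the figure; the function name `accepts_identifier` is an assumption for illustration.

```python
import string

LETTERS = set(string.ascii_lowercase)
DIGITS = set(string.digits)


def accepts_identifier(text):
    """Simulate the automaton: Start --[a-z]--> End, End loops on [a-z0-9]."""
    state = "Start"
    for ch in text:
        if state == "Start" and ch in LETTERS:
            state = "End"
        elif state == "End" and (ch in LETTERS or ch in DIGITS):
            state = "End"
        else:
            # No transition for this character: reject.
            return False
    # Accept only if the input ends in the accepting state.
    return state == "End"
```

The automaton needs no memory beyond the current state, which is what makes regular languages cheap to recognize.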
Lexical Tokens
position := initial + rate * 60
Tokens: Identifier (position), :=, Identifier (initial), +, Identifier (rate), *, Number (60)
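A small scanner that splits the example sentence into tokens might look as follows. This is a sketch: the token names `Assign`, `Plus`, `Times` and `Skip` are assumptions (the slide only names Identifier and Number), and whitespace is simply discarded.

```python
import re

TOKEN_SPEC = [
    ("Identifier", r"[a-z][a-z0-9]*"),
    ("Number", r"[0-9]+"),
    ("Assign", r":="),   # hypothetical token name
    ("Plus", r"\+"),     # hypothetical token name
    ("Times", r"\*"),    # hypothetical token name
    ("Skip", r"\s+"),    # whitespace, dropped from the output
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "Skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Applied to `position := initial + rate * 60`, this yields three Identifier tokens, one Number token, and the three operator tokens in source order.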
Context-free syntax
● Expression -> Identifier
● Expression -> Number
● Expression -> Expression + Expression
● Expression -> Expression * Expression
● Statement -> Identifier := Expression
Parse Tree
[Figure: parse tree of position := initial + rate * 60; the tokens Identifier, :=, Identifier, +, Identifier, *, Number are grouped into Expression nodes under a single Statement root]
Ambiguity: one sentence, but several trees
[Figure: two parse trees for 1 + 2 * 3 built from Number and Expression nodes, one grouping the sentence as (1 + 2) * 3 and the other as 1 + (2 * 3)]
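Why the ambiguity matters can be seen by evaluating both trees for `1 + 2 * 3`. The nested-tuple encoding of trees and the names `evaluate`, `plus_first` and `times_first` are assumptions chosen for this sketch.

```python
def evaluate(tree):
    """Evaluate a tree given as an int leaf or a (left, operator, right) tuple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l * r


# The two trees the ambiguous grammar allows for the same sentence.
plus_first = ((1, "+", 2), "*", 3)   # reads the sentence as (1 + 2) * 3
times_first = (1, "+", (2, "*", 3))  # reads the sentence as 1 + (2 * 3)
```

Here `evaluate(plus_first)` gives 9 and `evaluate(times_first)` gives 7, so the choice of tree changes the meaning of the sentence.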
Two solutions
● Add priorities to the grammar:
    ● * > +
● Rewrite the grammar:
    ● Expression -> Expression + Term
    ● Expression -> Term
    ● Term -> Term * Primary
    ● Term -> Primary
    ● Primary -> Number
    ● Primary -> Identifier
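The rewritten grammar can be turned into a small hand-written parser in which each level (Expression, Term, Primary) gets its own function. In this sketch the left-recursive rules are expressed as `while` loops, which is one common way to handle them by hand and an assumption on top of the slide; the tuple tree format and function names are likewise illustrative. `re.findall` silently skips characters that are not tokens, which is acceptable for a sketch only.

```python
import re


def parse_expression(text):
    """Parse text and return a nested (operator, left, right) tuple tree."""
    tokens = re.findall(r"[0-9]+|[a-z][a-z0-9]*|[+*]", text)
    tree, pos = expression(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at index {pos}")
    return tree


def expression(tokens, pos):
    # Expression -> Expression + Term | Term, written as a loop
    tree, pos = term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = term(tokens, pos + 1)
        tree = ("+", tree, right)
    return tree, pos


def term(tokens, pos):
    # Term -> Term * Primary | Primary, written as a loop
    tree, pos = primary(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = primary(tokens, pos + 1)
        tree = ("*", tree, right)
    return tree, pos


def primary(tokens, pos):
    # Primary -> Number | Identifier
    if pos < len(tokens) and tokens[pos] not in ("+", "*"):
        return tokens[pos], pos + 1
    raise SyntaxError(f"operand expected at token index {pos}")
```

Because `*` is handled one level below `+`, `parse_expression("1 + 2 * 3")` groups `2 * 3` first, and only one tree is possible.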
Unambiguous Parse Tree
Parse tree for 1 + 2 * 3:
Expression
    Expression
        Term
            Primary
                Number 1
    +
    Term
        Term
            Primary
                Number 2
        *
        Primary
            Number 3
Some Grammar Transformations
● Left recursive production:
    ● A -> A α | β
    ● Example: Exp -> Exp + Term | Term
● Left recursive productions lead to loops in some kinds of parsers (recursive descent)
● Removal:
    ● A -> β R
    ● R -> α R | ε (ε is the empty string)
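The removal rule is mechanical enough to sketch as a function. The representation (each alternative is a tuple of symbols, ε is the empty tuple) and the name `remove_left_recursion` are assumptions for illustration, and only immediate left recursion of a single non-terminal is handled.

```python
EPSILON = ()  # the empty right-hand side, written ε on the slide


def remove_left_recursion(nonterminal, alternatives, fresh):
    """Rewrite A -> A α | β into A -> β R and R -> α R | ε.

    alternatives is a list of tuples of symbols; fresh is the name of the
    new non-terminal R. Returns a dict mapping non-terminals to alternatives.
    """
    # The α parts: alternatives that start with A itself, with A stripped off.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The β parts: all other alternatives.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: list(alternatives)}  # nothing to remove
    return {
        nonterminal: [beta + (fresh,) for beta in betas],
        fresh: [alpha + (fresh,) for alpha in alphas] + [EPSILON],
    }
```

For `Exp -> Exp + Term | Term` with fresh name `R`, the result is `Exp -> Term R` and `R -> + Term R | ε`.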
Some Grammar Transformations
● Left factoring:
    ● S -> if E then S else S | if E then S
● For some parsers it is better to factor out the common parts:
    ● S -> if E then S P
    ● P -> else S | ε
A Recognizer
Grammar (rules l1 -> r1, l2 -> r2, ..., ln -> rn) + Source text → Recognizer → Yes or No
A Parser
Grammar (rules l1 -> r1, l2 -> r2, ..., ln -> rn) + Source text → Parser → Parse tree or Errors
General Approaches to Parsing
● Top-Down (Predictive)
    ● Each non-terminal is a goal
    ● Replace each goal by subgoals (= elements of rule)
    ● Parse tree is built from top to bottom
● Bottom-Up
    ● Recognize terminals
    ● Replace terminals by non-terminals
    ● Replace terminals and non-terminals by left-hand side of rule
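The bottom-up steps can be illustrated with a very small shift-reduce recognizer. This is a sketch for one specific grammar chosen as an assumption (Expr -> Expr + Term | Expr - Term | Term, Term -> [0-9]); it reduces greedily and tries the longer rules first, which happens to be sufficient for this grammar but is not a general technique. Real bottom-up parsers such as LALR(1) use tables and lookahead to decide when to reduce.

```python
# Rules as (left-hand side, right-hand side); longer right-hand sides first.
RULES = [
    ("Expr", ("Expr", "+", "Term")),
    ("Expr", ("Expr", "-", "Term")),
    ("Expr", ("Term",)),
]


def reduce_once(stack):
    """Replace a right-hand side on top of the stack by its left-hand side."""
    for lhs, rhs in RULES:
        n = len(rhs)
        if tuple(stack[-n:]) == rhs:
            del stack[-n:]
            stack.append(lhs)
            return True
    return False


def shift_reduce_accepts(text):
    stack = []
    for ch in text:
        # Shift: a recognized digit is replaced by the non-terminal Term at once.
        stack.append("Term" if ch in "0123456789" else ch)
        # Reduce as long as some rule's right-hand side is on top of the stack.
        while reduce_once(stack):
            pass
    # Accept if everything was reduced to the start symbol.
    return stack == ["Expr"]
```

On the input `1+2-3` the stack goes through `Expr`, `Expr +`, `Expr`, `Expr -`, `Expr`, so the tree is effectively built from the leaves upwards.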
Top-down versus Bottom-up
[Figure: top-down versus bottom-up construction of Expression trees over Number leaves, shown for the inputs 4 * 5 + 6 and 1 + 2 * 3]
How to get a parser?
● Write parser manually
● Generate parser from grammar
Example
● Given grammar:
    ● Expr -> Expr + Term
    ● Expr -> Expr - Term
    ● Expr -> Term
    ● Term -> [0-9]
● Remove left recursion:
    ● Expr -> Term Rest
    ● Rest -> + Term Rest | - Term Rest | ε
    ● Term -> [0-9]
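Recursive Descent Parser

/* Assumed helpers, not shown on the slide: lookahead holds the current
   input character, Match(c) consumes c, Error() reports a syntax error. */
#include <ctype.h>

extern int lookahead;
void Match(int c);
void Error(void);
void Term(void);
void Rest(void);

/* Expr -> Term Rest */
void Expr(void) { Term(); Rest(); }

/* Rest -> + Term Rest | - Term Rest | ε */
void Rest(void) {
    if (lookahead == '+') {
        Match('+'); Term(); Rest();
    } else if (lookahead == '-') {
        Match('-'); Term(); Rest();
    }
    /* otherwise ε: consume nothing */
}

/* Term -> [0-9] */
void Term(void) {
    if (isdigit(lookahead)) {
        Match(lookahead);
    } else {
        Error();
    }
}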
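The recursive descent code above only recognizes input. A variant that also returns a tree, which is what distinguishes a parser from a recognizer, could be sketched as follows. The tuple tree format, the `ParseError` class and the function names are assumptions made for this sketch; the grammar is the one with left recursion removed (Expr -> Term Rest, Rest -> + Term Rest | - Term Rest | ε, Term -> [0-9]).

```python
class ParseError(Exception):
    """Raised when the input does not conform to the grammar."""


def parse_expr(text):
    """Return a nested (operator, left, right) tuple tree for text."""
    tree, pos = expr(text, 0)
    if pos != len(text):
        raise ParseError(f"unexpected input at position {pos}")
    return tree


def expr(text, pos):
    # Expr -> Term Rest
    left, pos = term(text, pos)
    return rest(text, pos, left)


def rest(text, pos, left):
    # Rest -> + Term Rest | - Term Rest | ε
    if pos < len(text) and text[pos] in "+-":
        op = text[pos]
        right, pos = term(text, pos + 1)
        # Passing the tree built so far keeps + and - left-associative.
        return rest(text, pos, (op, left, right))
    return left, pos  # ε: consume nothing


def term(text, pos):
    # Term -> [0-9]
    if pos < len(text) and text[pos] in "0123456789":
        return int(text[pos]), pos + 1
    raise ParseError(f"digit expected at position {pos}")
```

For `1+2-3` this yields `('-', ('+', 1, 2), 3)`, the same grouping the original left-recursive grammar describes.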
A Trickier Case: Backtracking
● Example
    ● S -> aSbS | aS | c
● Naive approach (input ac):
    ● Try aSbS, but this fails hence error
● Backtracking approach:
    ● First try aSbS but this fails
    ● Go back to initial input position and try aS, this succeeds.
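A backtracking recognizer for S -> aSbS | aS | c can be sketched with a Python generator that yields every input position at which an S could end; when one alternative fails, iteration simply continues with the next. The generator-based design and the names `parse_s` and `accepts` are assumptions for illustration, not the slides' method.

```python
def parse_s(text, pos):
    """Yield every position at which an S starting at pos can end."""
    if text.startswith("a", pos):
        # Alternative 1: S -> a S b S
        for mid in parse_s(text, pos + 1):
            if text.startswith("b", mid):
                yield from parse_s(text, mid + 1)
        # Alternative 2: S -> a S (tried from the same start position)
        yield from parse_s(text, pos + 1)
    if text.startswith("c", pos):
        # Alternative 3: S -> c
        yield pos + 1


def accepts(text):
    # The input is a sentence if some derivation consumes all of it.
    return any(end == len(text) for end in parse_s(text, 0))
```

For the input `ac`, the first alternative matches `a` and `c` but then finds no `b`, so it yields nothing; the second alternative restarts at position 0 and succeeds.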
Automatic Parser Generation
Syntax of L → Parser Generator → Parser for L
L text → Parser for L → L tree
Some Parser Generators
● Bottom-up:
    ● Yacc/Bison, LALR(1)
    ● CUP, LALR(1)
    ● SDF, SGLR
● Top-down:
    ● ANTLR, LL(k)
    ● JavaCC, LL(k)
    ● Rascal
● Except Rascal and SDF, all depend on a scanner generator
Assessment of parser implementation approaches
● Manual parser construction
+ Good error recovery
+ Flexible combination of parsing and actions
- A lot of work
● Parser generators
+ May save a lot of work
- Complex and rigid frameworks
- Rigid actions
- Error recovery more difficult
Further Reading
● http://en.wikipedia.org/wiki/Chomsky_hierarchy
● Paul Klint, Quick introduction to Syntax Analysis, http://www.meta-environment.org/doc/books//syntax/syntax-analysis/syntax-analysis.html
● D. Grune & C.J.H. Jacobs, Parsing Techniques: A Practical Guide, Second Edition, Springer, 2008