![Page 1: EDAN65: Compilers, Lecture03 Context …fileadmin.cs.lth.se/cs/Education/EDAN65/2017/lectures/L...Course overview Semanticanalyzer Intermediate codegenerator Optimizer Target code](https://reader030.vdocuments.mx/reader030/viewer/2022040802/5e3b9da07adbd9667e0c9ad0/html5/thumbnails/1.jpg)
EDAN65: Compilers, Lecture 03
Context-free grammars, introduction to parsing
Görel Hedin
Revised: 2017-09-04

Course overview
[Figure: the phases of a compiler. Source code (text) is read by the lexical analyzer (scanner), which produces tokens; the syntactic analyzer (parser) builds an AST (abstract syntax tree); the semantic analyzer produces an attributed AST; the intermediate code generator, optimizer, and target code generator produce intermediate code and target code. The first three phases are specified by regular expressions, a context-free grammar, and an attribute grammar. Further labels: interpreter, virtual machine, machine, runtime system, stack (activation records), heap (objects), garbage collection, code and data. The part covered by this lecture is marked.]

Analyzing program text
program text: sum = sum + k ;
tokens: ID EQ ID PLUS ID SEMI
parse tree: [Figure: AssignStmt at the root, with an Exp child that expands to Add, which has two Exp children.]

Recall: generating the compiler
[Figure: each phase is generated from a specification. Regular expressions are input to the scanner generator JFlex, which gives the lexical analyzer (scanner); a context-free grammar is input to the parser generator Beaver, which gives the syntactic analyzer (parser); an attribute grammar is input to an attribute evaluator generator, which gives the semantic analyzer. The scanner turns text into tokens, and the parser turns tokens into a tree.]
We will use a parser generator called Beaver.
Context-Free Grammars

Regular Expressions vs Context-Free Grammars
An RE can have iteration.
A CFG can also have recursion (it is possible to derive a symbol, e.g., Stmt, from itself).
Example REs:
    WHILE = "while"
    ID = [a-z][a-z0-9]*
    LPAR = "("
    RPAR = ")"
    PLUS = "+"
    ...
Example CFG:
    Stmt –> WhileStmt
    Stmt –> AssignStmt
    WhileStmt –> WHILE LPAR Exp RPAR Stmt
    Exp –> ID
    Exp –> Exp PLUS Exp
    ...

Elements of a Context-Free Grammar
Production rules: X –> s1 s2 … sn, where each sk is a symbol (terminal or nonterminal)
Nonterminal symbols
Terminal symbols (tokens)
Start symbol (one of the nonterminals, usually the left-hand side of the first production)
Example CFG:
    Stmt –> WhileStmt
    Stmt –> AssignStmt
    WhileStmt –> WHILE LPAR Exp RPAR Stmt
    AssignStmt –> ID EQ Exp SEMIC
    …

Shorthand for alternatives
    Stmt –> WhileStmt | AssignStmt
is equivalent to
    Stmt –> WhileStmt
    Stmt –> AssignStmt

Shorthand for repetition
    Stmt*
is equivalent to
    StmtList
where
    StmtList –> ε | Stmt StmtList

Exercise
Construct a grammar covering this program and similar ones:
Example program:
    while (k <= n) { sum = sum + k; k = k+1; }
CFG:
    Stmt –> WhileStmt | AssignStmt | CompoundStmt
    WhileStmt –> "while" "(" Exp ")" Stmt
    AssignStmt –> ID "=" Exp ";"
    CompoundStmt –> ...
    Exp –> ...
    LessEq –> ...
    Add –> ...
(Often, simple tokens are written directly as text strings.)

Solution
Construct a grammar covering this program and similar ones:
Example program:
    while (k <= n) { sum = sum + k; k = k+1; }
CFG:
    Stmt –> WhileStmt | AssignStmt | CompoundStmt
    WhileStmt –> "while" "(" Exp ")" Stmt
    AssignStmt –> ID "=" Exp ";"
    CompoundStmt –> "{" Stmt* "}"
    Exp –> LessEq | Add | ID | INT
    LessEq –> Exp "<=" Exp
    Add –> Exp "+" Exp

Parsing
Use the grammar to derive a tree for a program:
Example program: sum = sum + k;
[Figure: the tokens sum = sum + k ; with the start symbol Stmt above them, as the root of the tree to be built.]

Parse tree
Use the grammar to derive a parse tree for a program:
Example program: sum = sum + k;
[Figure: parse tree with the start symbol Stmt at the root, above AssignStmt, Exp, Add, and two Exp nodes, and the tokens sum = sum + k ; as leaves.]
Nonterminals are inner nodes.
Terminals are leaves.
A parse tree includes all the tokens as leaves.

Corresponding abstract syntax tree (will be discussed in a later lecture)
Example program: sum = sum + k;
[Figure: abstract syntax tree with the nodes AssignStmt, Add, and three IdExp nodes; AssignStmt is the root and two of the IdExp nodes are children of Add.]
An abstract syntax tree is similar to a parse tree, but simpler.
It does not include all the tokens.

EBNF vs Canonical Form
EBNF:
    Stmt –> AssignStmt | CompoundStmt
    AssignStmt –> ID "=" Exp ";"
    CompoundStmt –> "{" Stmt* "}"
    Exp –> Add | ID
    Add –> Exp "+" Exp
Canonical form:
    Stmt –> ID "=" Exp ";"
    Stmt –> "{" Stmts "}"
    Stmts –> ε
    Stmts –> Stmt Stmts
    Exp –> Exp "+" Exp
    Exp –> ID
(Extended) Backus-Naur Form:
• Compact, easy to read and write
• EBNF has alternatives, repetition, optionals, parentheses (like REs)
• Common notation for practical use
Canonical form:
• Core formalism for CFGs
• Useful for proving properties and explaining algorithms

Real world example: The Java Language Specification
See http://docs.oracle.com/javase/specs/jls/se8/html/index.html
• See Chapter 2 about the Java grammar notation.
• Look at some other chapters to see other syntax examples.
    CompilationUnit:
        [PackageDeclaration] {ImportDeclaration} {TypeDeclaration}
    PackageDeclaration:
        {PackageModifier} package Identifier {. Identifier} ;
    PackageModifier:
        Annotation
    …

Formal definition of CFGs

Formal definition of CFGs (canonical form)
A context-free grammar G = (N, T, P, S), where
    N: the set of nonterminal symbols
    T: the set of terminal symbols
    P: the set of production rules, each with the form
        X –> Y1 Y2 … Yn, where X ∈ N, n ≥ 0, and Yk ∈ N ∪ T
    S: the start symbol (one of the nonterminals), i.e., S ∈ N
So, the left-hand side X of a rule is a nonterminal,
and the right-hand side Y1 Y2 … Yn is a sequence of nonterminals and terminals.
If the rhs for a production is empty, i.e., n = 0, we write X –> ε

A grammar G defines a language L(G)
A context-free grammar G = (N, T, P, S), where
    N: the set of nonterminal symbols
    T: the set of terminal symbols
    P: the set of production rules, each with the form
        X –> Y1 Y2 … Yn, where X ∈ N, n ≥ 0, and Yk ∈ N ∪ T
    S: the start symbol (one of the nonterminals), i.e., S ∈ N
G defines a language L(G) over the alphabet T.
T* is the set of all possible sequences of T symbols.
L(G) is the subset of T* that can be derived from the start symbol S, by following the production rules P.

Exercise
G = (N, T, P, S)
P = {
    Stmt –> ID "=" Exp ";",
    Stmt –> "{" Stmts "}",
    Stmts –> ε,
    Stmts –> Stmt Stmts,
    Exp –> Exp "+" Exp,
    Exp –> ID
}
N = { }
T = { }
S =
L(G) = { }

Solution
G = (N, T, P, S)
P = {
    Stmt –> ID "=" Exp ";",
    Stmt –> "{" Stmts "}",
    Stmts –> ε,
    Stmts –> Stmt Stmts,
    Exp –> Exp "+" Exp,
    Exp –> ID
}
N = {Stmt, Exp, Stmts}
T = {ID, "=", "{", "}", ";", "+"}
S = Stmt
L(G) = {
    "{" "}",
    "{" "{" "}" "}",
    ID "=" ID ";",
    "{" ID "=" ID ";" "}",
    ID "=" ID "+" ID ";",
    "{" "{" "}" "{" "}" "}",
    "{" "{" "{" "}" "}" "}",
    "{" ID "=" ID "+" ID ";" "}",
    ID "=" ID "+" ID "+" ID ";",
    ...
}
The sequences in L(G) are usually called sentences or strings.
Derivations

Derivation step
If we have a sequence of terminals and nonterminals, e.g.,
    X a Y Y b
we can replace one of the nonterminals, applying a production rule. This is called a derivation step. (Swedish: härledningssteg)
Suppose there is a production
    Y –> X a
and we apply it for the first Y in the sequence. We write the derivation step as follows:
    X a Y Y b => X a X a Y b
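As a rough illustration, a derivation step can be sketched in Java as replacing one symbol in a list of symbols. The names `DerivationStep` and `apply` are hypothetical, and symbols are represented as plain strings; this is only one possible encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DerivationStep {
    // Replaces the symbol at index pos (which must be the nonterminal lhs)
    // by the right-hand side rhs, and returns the new sequence.
    static List<String> apply(List<String> form, int pos, String lhs, List<String> rhs) {
        if (!form.get(pos).equals(lhs)) {
            throw new IllegalArgumentException("Symbol at index " + pos + " is not " + lhs);
        }
        List<String> result = new ArrayList<>(form.subList(0, pos));
        result.addAll(rhs);
        result.addAll(form.subList(pos + 1, form.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> form = Arrays.asList("X", "a", "Y", "Y", "b");
        // Apply Y -> X a to the first Y, which is at index 2.
        List<String> next = apply(form, 2, "Y", Arrays.asList("X", "a"));
        System.out.println(String.join(" ", form) + " => " + String.join(" ", next));
        // prints: X a Y Y b => X a X a Y b
    }
}
```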

Derivation
A derivation is simply a sequence of derivation steps, e.g.:
    γ0 => γ1 => … => γn   (n ≥ 0)
where each γi is a sequence of terminals and nonterminals.
If there is a derivation from γ0 to γn, we can write this as
    γ0 =>* γn
So this means it is possible to get from the sequence γ0 to the sequence γn by following the production rules.

Definition of the language L(G)
Recall that:
    G = (N, T, P, S)
    T* is the set of all possible sequences of T symbols.
    L(G) is the subset of T* that can be derived from the start symbol S, by following the production rules P.
Using the concept of derivations, we can formally define L(G) as follows:
    L(G) = { w ∈ T* | S =>* w }
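To make the definition concrete, the sketch below enumerates the sentences of L(G) up to a given length for the Stmt/Stmts/Exp grammar from the earlier exercise. It searches over leftmost derivations starting from the start symbol. The class name `LanguageEnumerator`, the string encoding of symbols, and the length bound are assumptions made for illustration; L(G) itself is infinite, so only a finite part can be listed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class LanguageEnumerator {
    // P: each nonterminal maps to its alternative right-hand sides.
    static final Map<String, List<List<String>>> P = new HashMap<>();
    static {
        P.put("Stmt", Arrays.asList(
            Arrays.asList("ID", "=", "Exp", ";"),
            Arrays.asList("{", "Stmts", "}")));
        P.put("Stmts", Arrays.asList(
            Collections.<String>emptyList(),          // Stmts -> (empty)
            Arrays.asList("Stmt", "Stmts")));
        P.put("Exp", Arrays.asList(
            Arrays.asList("Exp", "+", "Exp"),
            Arrays.asList("ID")));
    }

    // Returns all w in T* with at most maxTokens tokens such that start =>* w.
    static Set<String> sentences(String start, int maxTokens) {
        Set<String> result = new TreeSet<>();
        Set<List<String>> seen = new HashSet<>();
        Deque<List<String>> work = new ArrayDeque<>();
        work.add(Collections.singletonList(start));
        while (!work.isEmpty()) {
            List<String> form = work.remove();
            int pos = -1;        // index of the leftmost nonterminal, if any
            int terminals = 0;   // number of terminals in the sequence
            for (int i = 0; i < form.size(); i++) {
                if (P.containsKey(form.get(i))) {
                    if (pos < 0) pos = i;
                } else {
                    terminals++;
                }
            }
            if (terminals > maxTokens) continue;    // already too long: prune
            if (pos < 0) {                          // only terminals: a sentence
                result.add(String.join(" ", form));
                continue;
            }
            for (List<String> rhs : P.get(form.get(pos))) {   // one derivation step
                List<String> next = new ArrayList<>(form.subList(0, pos));
                next.addAll(rhs);
                next.addAll(form.subList(pos + 1, form.size()));
                if (seen.add(next)) work.add(next);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The sentences with at most 4 tokens are: ID = ID ;   { { } }   { }
        System.out.println(sentences("Stmt", 4));
    }
}
```

Pruning on the number of terminals is enough for this particular grammar, because every derivation step either adds a terminal or is followed by a step that does; for other grammars a different bound may be needed.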

Exercise: Prove that a sentence belongs to a language
Prove that
    INT + INT * INT
belongs to the language of the following grammar:
    p1: Exp –> Exp "+" Exp
    p2: Exp –> Exp "*" Exp
    p3: Exp –> INT
Proof (by showing all the derivation steps from the start symbol Exp):
    Exp =>

Solution: Prove that a sentence belongs to a language
Prove that
    INT + INT * INT
belongs to the language of the following grammar:
    p1: Exp –> Exp "+" Exp
    p2: Exp –> Exp "*" Exp
    p3: Exp –> INT
Proof (by showing all the derivation steps from the start symbol Exp):
    Exp
      => Exp "+" Exp               (p1)
      => INT "+" Exp               (p3)
      => INT "+" Exp "*" Exp       (p2)
      => INT "+" INT "*" Exp       (p3)
      => INT "+" INT "*" INT       (p3)

Leftmost and rightmost derivations
In a leftmost derivation, the leftmost nonterminal is replaced in each derivation step, e.g.:
    Exp
      => Exp "+" Exp
      => INT "+" Exp
      => INT "+" Exp "*" Exp
      => INT "+" INT "*" Exp
      => INT "+" INT "*" INT
In a rightmost derivation, the rightmost nonterminal is replaced in each derivation step, e.g.:
    Exp
      => Exp "+" Exp
      => Exp "+" Exp "*" Exp
      => Exp "+" Exp "*" INT
      => Exp "+" INT "*" INT
      => INT "+" INT "*" INT
LL parsing algorithms use leftmost derivation. LR parsing algorithms use rightmost derivation (in reverse). Will be discussed in later lectures.
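The two derivation orders can be replayed mechanically. The sketch below applies a given sequence of production numbers, always to the leftmost or always to the rightmost Exp. The class `DerivationOrder` and its `derive` method are hypothetical helpers, and the numbering 1 to 3 follows p1 to p3 from the earlier proof exercise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DerivationOrder {
    // Right-hand sides of p1, p2, p3 at indices 1..3 (index 0 is unused).
    static final String[][] RHS = {
        {},
        {"Exp", "+", "Exp"},
        {"Exp", "*", "Exp"},
        {"INT"}
    };

    // Applies the given productions in order, each time to the leftmost
    // (or rightmost) Exp, and returns every sequence along the way.
    static List<String> derive(boolean leftmost, int... rules) {
        List<String> form = new ArrayList<>(Arrays.asList("Exp"));
        List<String> steps = new ArrayList<>();
        steps.add(String.join(" ", form));
        for (int r : rules) {
            int pos = leftmost ? form.indexOf("Exp") : form.lastIndexOf("Exp");
            if (pos < 0) throw new IllegalStateException("No nonterminal left to replace");
            form.remove(pos);
            form.addAll(pos, Arrays.asList(RHS[r]));
            steps.add(String.join(" ", form));
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" => ", derive(true, 1, 3, 2, 3, 3)));
        System.out.println(String.join(" => ", derive(false, 1, 2, 3, 3, 3)));
    }
}
```

Both calls end in INT + INT * INT and correspond to the same parse tree; only the order in which the nonterminals are replaced differs.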

A derivation corresponds to building a parse tree
Grammar:
    Exp –> Exp "+" Exp
    Exp –> Exp "*" Exp
    Exp –> INT
Example derivation:
    Exp
      => Exp "+" Exp
      => INT "+" Exp
      => INT "+" Exp "*" Exp
      => INT "+" INT "*" Exp
      => INT "+" INT "*" INT
Exercise: build the parse tree (also called derivation tree).

A derivation corresponds to building a parse tree
Grammar:
    Exp –> Exp "+" Exp
    Exp –> Exp "*" Exp
    Exp –> INT
Example derivation:
    Exp
      => Exp "+" Exp
      => INT "+" Exp
      => INT "+" Exp "*" Exp
      => INT "+" INT "*" Exp
      => INT "+" INT "*" INT
Parse tree (derivation tree), children indented under their parent:
    Exp
      Exp
        INT
      "+"
      Exp
        Exp
          INT
        "*"
        Exp
          INT
Ambiguities

Exercise: Can we do another derivation of the same sentence, that gives a different parse tree?
Grammar:
    Exp –> Exp "+" Exp
    Exp –> Exp "*" Exp
    Exp –> INT
Another derivation:
    Exp =>
Parse tree:

Solution: Can we do another derivation of the same sentence, that gives a different parse tree?
Grammar:
    Exp –> Exp "+" Exp
    Exp –> Exp "*" Exp
    Exp –> INT
Another derivation:
    Exp
      => Exp "*" Exp
      => Exp "+" Exp "*" Exp
      => INT "+" Exp "*" Exp
      => INT "+" INT "*" Exp
      => INT "+" INT "*" INT
Parse tree, children indented under their parent:
    Exp
      Exp
        Exp
          INT
        "+"
        Exp
          INT
      "*"
      Exp
        INT
Which parse tree would we prefer?
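One way to see why the choice matters is to evaluate both trees. The sketch below is not from the slides: it assumes the three INT tokens stand for the values 1, 2 and 3, and the names `AmbiguityDemo`, `addAtRoot` and `mulAtRoot` are made up for the example.

```java
class AmbiguityDemo {
    interface Exp { int eval(); }

    static Exp num(int v) { return () -> v; }
    static Exp add(Exp l, Exp r) { return () -> l.eval() + r.eval(); }
    static Exp mul(Exp l, Exp r) { return () -> l.eval() * r.eval(); }

    // Tree with "+" at the root: 1 + (2 * 3)
    static Exp addAtRoot() { return add(num(1), mul(num(2), num(3))); }

    // Tree with "*" at the root: (1 + 2) * 3
    static Exp mulAtRoot() { return mul(add(num(1), num(2)), num(3)); }

    public static void main(String[] args) {
        System.out.println(addAtRoot().eval()); // prints 7
        System.out.println(mulAtRoot().eval()); // prints 9
    }
}
```

The same sentence gets two different values, so under the usual precedence of multiplication over addition the tree with "+" at the root is the one a compiler should build.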

Ambiguous context-free grammars
A CFG is ambiguous if a sentence in the language can be derived by two (or more) different parse trees.
A CFG is unambiguous if each sentence in the language can be derived by only one parse tree.
(Swedish: tvetydig, otvetydig)
Note! There can be many different derivations that give the same parse tree.

How can we know if a CFG is ambiguous?
If we find an example of an ambiguity, we know the grammar is ambiguous.
There are algorithms for deciding if a CFG belongs to certain subsets of CFGs, e.g. LL, LR, etc. (See later lectures.) These grammars are unambiguous.
But in the general case, the problem is undecidable: it is not possible to construct a general algorithm that decides ambiguity for an arbitrary CFG.
Strategies for eliminating ambiguities, next lecture.
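Finding an example of an ambiguity can be done by counting parse trees for a particular sentence. The sketch below does this for the expression grammar Exp –> Exp "+" Exp, Exp –> Exp "*" Exp, Exp –> INT only; it is not a general ambiguity test (none exists), and the class name `ParseTreeCounter` is hypothetical.

```java
class ParseTreeCounter {
    // Number of parse trees deriving tokens[from..to] (inclusive) from Exp.
    static long count(String[] tokens, int from, int to) {
        if (from > to) return 0;
        if (from == to) return tokens[from].equals("INT") ? 1 : 0;   // Exp -> INT
        long total = 0;
        // Try every operator as the one introduced at the root of the tree.
        for (int k = from + 1; k < to; k++) {
            if (tokens[k].equals("+") || tokens[k].equals("*")) {
                total += count(tokens, from, k - 1) * count(tokens, k + 1, to);
            }
        }
        return total;
    }

    // Convenience wrapper: tokens are separated by whitespace.
    static long count(String sentence) {
        String[] tokens = sentence.trim().split("\\s+");
        return count(tokens, 0, tokens.length - 1);
    }

    public static void main(String[] args) {
        System.out.println(count("INT + INT * INT")); // prints 2
    }
}
```

A count of 2 or more for any single sentence shows that the grammar is ambiguous; a count of 1 for the sentences tried proves nothing about the grammar as a whole.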
Parsing

Different parsing algorithms
[Figure: nested sets of grammars. All context-free grammars are divided into ambiguous and unambiguous grammars; LR lies within the unambiguous grammars, and LL lies within LR.]
LL:
    Left-to-right scan
    Leftmost derivation
    Builds tree top-down
    Simple to understand
LR:
    Left-to-right scan
    Rightmost derivation
    Builds tree bottom-up
    More powerful

LL and LR parsers: main idea
Token stream: ... if ID then ID = ID ; ID ...
LL(1): decides to build Assign after seeing the first token of its subtree. The tree is built top down.
LR(1): decides to build Assign after seeing the first token following its subtree. The tree is built bottom up.
The token is called lookahead. LL(k) and LR(k) use k lookahead tokens.
[Figure: two partially built trees over the token stream. The LL(1) tree has the nodes IfStmt, CompoundStmt, Id and Assign; the LR(1) tree has the nodes Id and Assign, where Assign has two Id children.]

Recursive-descent parsing
A way of programming an LL(1) parser by recursive method calls
Assume a BNF grammar with exactly one production rule for each nonterminal. (Can easily be generalized to EBNF.)
Each production rule RHS is either
1. a sequence of token/nonterminal symbols, or
2. a set of nonterminal symbol alternatives
For each nonterminal, a method is constructed. The method
1. matches tokens and calls nonterminal methods, or
2. calls one of the nonterminal methods; which one depends on the lookahead token.
If the lookahead token does not match, a parsing error is reported.
    A –> B | C | D
    B –> e C f D
    C –> ...
    D –> ...

Example Java implementation: overview
    statement –> assignment | compoundStmt
    assignment –> ID ASSIGN expr SEMICOLON
    compoundStmt –> LBRACE statement* RBRACE
    ...

    class Parser {
        private int token;               // current lookahead token
        void accept(int t) {...}         // accept t and read in next token
        void error(String str) {...}     // generate error message
        void statement() {...}
        void assignment() {...}
        void compoundStmt() {...}
        ...
    }

Example: recursive descent methods
    statement –> assignment | compoundStmt
    assignment –> ID ASSIGN expr SEMICOLON
    compoundStmt –> LBRACE statement* RBRACE

    class Parser {
        void statement() {
            switch (token) {
                case ID: assignment(); break;
                case LBRACE: compoundStmt(); break;
                default: error("Expecting statement, found: " + token);
            }
        }
        void assignment() {
            accept(ID); accept(ASSIGN); expr(); accept(SEMICOLON);
        }
        void compoundStmt() {
            accept(LBRACE);
            while (token != RBRACE) { statement(); }
            accept(RBRACE);
        }
        ...
    }

Example: Parser skeleton details
    statement –> assignment | compoundStmt
    assignment –> ID ASSIGN expr SEMICOLON
    compoundStmt –> LBRACE statement* RBRACE
    expr –> ...

    class Parser {
        final static int ID = 1, WHILE = 2, DO = 3, ASSIGN = 4, ...;
        private int token;                  // current lookahead token
        void accept(int t) {                // accept t and read in next token
            if (token == t) {
                token = nextToken();
            } else {
                error("Expected " + t + ", but found " + token);
            }
        }
        void error(String str) {...}        // generate error message
        private int nextToken() {...}       // read next token from scanner
        void statement() ...
        ...
    }
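For experimenting, the skeleton can be filled in as a small runnable recognizer. The version below is only a sketch under several assumptions that are not in the slides: the token constants are renumbered, the scanner is replaced by an int array, `error` throws an exception, `expr` accepts a single ID (the slides leave `expr –> ...` open), and the class `MiniParser` with its helper `parses` is made up for the example.

```java
class MiniParser {
    // Assumed token numbering; the slides use a longer, partly elided list.
    static final int EOF = 0, ID = 1, ASSIGN = 2, SEMICOLON = 3, LBRACE = 4, RBRACE = 5;

    private final int[] input;   // stand-in for the scanner (assumption)
    private int pos = 0;
    private int token;           // current lookahead token

    MiniParser(int... input) {
        this.input = input;
        token = nextToken();
    }

    private int nextToken() {    // read next token, EOF when the input is used up
        return pos < input.length ? input[pos++] : EOF;
    }

    void accept(int t) {         // accept t and read in next token
        if (token == t) {
            token = nextToken();
        } else {
            error("Expected " + t + ", but found " + token);
        }
    }

    void error(String str) {     // assumption: report errors by throwing
        throw new IllegalStateException(str);
    }

    void statement() {
        switch (token) {
            case ID: assignment(); break;
            case LBRACE: compoundStmt(); break;
            default: error("Expecting statement, found: " + token);
        }
    }

    void assignment() { accept(ID); accept(ASSIGN); expr(); accept(SEMICOLON); }

    void compoundStmt() {
        accept(LBRACE);
        while (token != RBRACE) { statement(); }
        accept(RBRACE);
    }

    void expr() { accept(ID); }  // assumption: expr -> ID

    // Returns true if the whole input is exactly one statement.
    static boolean parses(int... input) {
        try {
            MiniParser p = new MiniParser(input);
            p.statement();
            p.accept(EOF);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // { ID = ID ; }
        System.out.println(parses(LBRACE, ID, ASSIGN, ID, SEMICOLON, RBRACE)); // prints true
    }
}
```

Because `error` throws, the loop in `compoundStmt` stops at end of input instead of spinning; a parser that only prints the message would need an explicit end-of-input check there.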

Are these grammars LL(1)?
    expr –> name params | name       (common prefix)
    expr –> expr "+" term            (left recursion)
What would happen in a recursive-descent parser?
Could they be LL(2)? LL(k)?
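For the left-recursive rule, a direct transcription into a method would call itself before consuming any token. One common workaround, sketched below, is to parse the equivalent EBNF form expr –> term ("+" term)* with a loop. The sketch assumes that the grammar also has the alternative expr –> term and that a term is a single INT token; neither is stated on the slide, and the class `LeftRecursionDemo` is hypothetical.

```java
class LeftRecursionDemo {
    static final int EOF = 0, INT = 1, PLUS = 2;   // assumed token numbering

    private final int[] input;   // stand-in for the scanner (assumption)
    private int pos = 0;

    LeftRecursionDemo(int... input) { this.input = input; }

    private int token() {        // current lookahead token
        return pos < input.length ? input[pos] : EOF;
    }

    private void accept(int t) {
        if (token() != t) {
            throw new IllegalStateException("Expected " + t + ", but found " + token());
        }
        pos++;
    }

    // A direct transcription of expr -> expr "+" term would be
    //     void expr() { expr(); accept(PLUS); term(); }
    // which recurses without consuming a token and never returns normally.

    // Loop-based method for expr -> term ("+" term)*.
    // Returns the number of terms that were parsed.
    int expr() {
        int terms = 1;
        term();
        while (token() == PLUS) {
            accept(PLUS);
            term();
            terms++;
        }
        return terms;
    }

    void term() { accept(INT); }  // assumption: term -> INT

    public static void main(String[] args) {
        System.out.println(new LeftRecursionDemo(INT, PLUS, INT, PLUS, INT).expr()); // prints 3
    }
}
```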

Dealing with common prefix of limited length: Local lookahead
LL(2) grammar:
    statement –> assignment | compoundStmt | callStmt
    assignment –> ID ASSIGN expr SEMICOLON
    compoundStmt –> LBRACE statement* RBRACE
    callStmt –> ID LPAR expr RPAR SEMICOLON

    void statement() ...

Dealing with common prefix of limited length: Local lookahead
LL(2) grammar:
    statement –> assignment | compoundStmt | callStmt
    assignment –> ID ASSIGN expr SEMICOLON
    compoundStmt –> LBRACE statement* RBRACE
    callStmt –> ID LPAR expr RPAR SEMICOLON

    void statement() {
        switch (token) {
            case ID:
                if (lookahead(2) == ASSIGN) {
                    assignment();
                } else {
                    callStmt();
                }
                break;
            case LBRACE: compoundStmt(); break;
            default: error("Expecting statement, found: " + token);
        }
    }
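The slides do not show how `lookahead(2)` is implemented. One possible approach, sketched here, is to keep a small buffer of tokens that have been read from the scanner but not yet consumed. The class `LookaheadBuffer`, its `consume` method, and the use of an `Iterator<Integer>` as the token source are assumptions for illustration; `lookahead(1)` is taken to mean the current token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LookaheadBuffer {
    static final int EOF = 0;                                 // assumed end-of-input token

    private final Iterator<Integer> scanner;                  // hypothetical token source
    private final List<Integer> buffer = new ArrayList<>();   // read but not yet consumed

    LookaheadBuffer(Iterator<Integer> scanner) { this.scanner = scanner; }

    // Returns the k-th upcoming token (k = 1 is the current token)
    // without consuming anything; EOF is returned past the end of input.
    int lookahead(int k) {
        while (buffer.size() < k) {
            buffer.add(scanner.hasNext() ? scanner.next() : EOF);
        }
        return buffer.get(k - 1);
    }

    // Consumes the current token.
    void consume() {
        lookahead(1);        // make sure the buffer is not empty
        buffer.remove(0);
    }

    public static void main(String[] args) {
        LookaheadBuffer b = new LookaheadBuffer(Arrays.asList(1, 4, 7).iterator());
        System.out.println(b.lookahead(2)); // prints 4
        b.consume();
        System.out.println(b.lookahead(1)); // prints 4
    }
}
```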

Generating the parser:
[Figure: a parser generator takes a context-free grammar and produces the syntactic analyzer (parser), which turns tokens into a tree.]

Beaver: an LR-based parser generator
[Figure: Beaver takes a context-free grammar, with semantic actions in Java, and produces a parser in Java, which turns tokens into a tree.]

Example Beaver specification
    %class "LangParser";
    %package "lang";
    ...
    %terminals LET, IN, END, ASSIGN, MUL, ID, NUMERAL;

    %goal program; // The start symbol

    // Context-free grammar
    program = exp;
    exp = factor | exp MUL factor;
    factor = let | numeral | id;
    let = LET id ASSIGN exp IN exp END;
    numeral = NUMERAL;
    id = ID;

Later on, we will extend this specification with semantic actions to build the syntax tree.

Regular Expressions vs Context-Free Grammars

| | RE | CFG |
|---|---|---|
| Typical alphabet | characters | terminal symbols (tokens) |
| Language is a set of ... | strings (char sequences) | sentences (token sequences) |
| Used for ... | tokens | parse trees |
| Power | iteration | recursion |
| Recognizer | DFA | DFA with stack |


The Chomsky hierarchy of formal grammars

| Grammar | Rule patterns | Type |
|---|---|---|
| regular | X –> a Y or X –> a or X –> ε | 3 |
| context free | X –> γ | 2 |
| context sensitive | α X β –> α γ β | 1 |
| arbitrary | γ –> δ | 0 |

a: terminal symbol
α, β, γ, δ: sequences of (terminal or nonterminal) symbols
Type(3) ⊂ Type(2) ⊂ Type(1) ⊂ Type(0)
Regular grammars have the same power as regular expressions (tail recursion = iteration).
Type 2 and 3 are of practical use in compiler construction. Type 0 and 1 are only of theoretical interest.

Course overview
[Figure: the first compiler phases. Source code (text) is read by the lexical analyzer (scanner), which produces tokens; the syntactic analyzer (parser) builds an AST (abstract syntax tree), which is passed to the semantic analyzer. The phases are specified by regular expressions, a context-free grammar, and an attribute grammar.]
What we have covered:
    Context-free grammars, derivations, parse trees
    Ambiguous grammars
    Introduction to parsing, recursive-descent
You can now finish assignment 1.

Summary questions
• Construct a CFG for a simple part of a programming language.
• What is a nonterminal symbol? A terminal symbol? A production? A start symbol? A parse tree?
• What is a left-hand side of a production? A right-hand side?
• Given a grammar G, what is meant by the language L(G)?
• What is a derivation step? A derivation? A leftmost derivation? A rightmost derivation?
• How does a derivation correspond to a parse tree?
• What does it mean for a grammar to be ambiguous? Unambiguous?
• Give an example of an ambiguous CFG.
• What is the difference between an LL and an LR parser?
• What is the difference between LL(1) and LL(2)? Or between LR(1) and LR(2)?
• Construct a recursive descent parser for a simple language.
• Give typical examples of grammars that cannot be handled by a recursive-descent parser.
• Explain why context-free grammars are more powerful than regular expressions.
• In what sense are context-free grammars "context-free"?