topic: syntactic analysis · 2020. 5. 7. · syntactic analysis token-streamsyntaxtreeparser...
TRANSCRIPT
![Page 1: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/1.jpg)
Topic:
Syntactic Analysis
1 / 55
![Page 2: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/2.jpg)
Syntactic Analysis
ParserToken-Stream Syntaxtree
Syntactic analysis tries to integrate Tokens into larger program units.
Such units may possibly be:
→ Expressions;
→ Statements;
→ Conditional branches;
→ loops; ...
2 / 55
![Page 3: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/3.jpg)
Syntactic Analysis
ParserToken-Stream Syntaxtree
Syntactic analysis tries to integrate Tokens into larger program units.
Such units may possibly be:
→ Expressions;
→ Statements;
→ Conditional branches;
→ loops; ...
2 / 55
![Page 4: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/4.jpg)
Discussion:
In general, parsers are not developed by hand, but generated from a specification:
ParserSpecification Generator
Specification of the hierarchical structure: contextfree grammarsGenerated implementation: Pushdown automata + X
3 / 55
![Page 5: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/5.jpg)
Discussion:
In general, parsers are not developed by hand, but generated from a specification:
E→E{op}E Generator
Specification of the hierarchical structure: contextfree grammarsGenerated implementation: Pushdown automata + X
3 / 55
![Page 6: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/6.jpg)
Chapter 1:
Basics of Contextfree Grammars
4 / 55
Syntactic Analysis
![Page 7: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/7.jpg)
Basics: Context-free Grammars
Programs of programming languages can have arbitrary numbers of tokens, but onlyfinitely many Token-classes.This is why we choose the set of Token-classes to be the finite alphabet of terminals T .The nested structure of program components can be described elegantly viacontext-free grammars...
Definition: Context-Free GrammarA context-free grammar (CFG) is a4-tuple G = (N,T , P , S) with:
N the set of nonterminals,T the set of terminals,P the set of productions or rules, andS ∈ N the start symbol
5 / 55
![Page 8: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/8.jpg)
Basics: Context-free Grammars
Programs of programming languages can have arbitrary numbers of tokens, but onlyfinitely many Token-classes.This is why we choose the set of Token-classes to be the finite alphabet of terminals T .The nested structure of program components can be described elegantly viacontext-free grammars...
Definition: Context-Free GrammarA context-free grammar (CFG) is a4-tuple G = (N,T , P , S) with:
N the set of nonterminals,T the set of terminals,P the set of productions or rules, andS ∈ N the start symbol
5 / 55
Noam Chomsky John Backus
![Page 9: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/9.jpg)
Conventions
The rules of context-free grammars take the following form:
A→ α with A ∈ N , α ∈ (N ∪ T )∗
... for example:S → aS bS → ε
Specified language: {anbn | n ≥ 0}
Conventions:In examples, we specify nonterminals and terminals in general implicitely:
nonterminals are: A,B,C, ..., 〈exp〉, 〈stmt〉, ...;terminals are: a, b, c, ..., int, name, ...;
6 / 55
![Page 10: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/10.jpg)
Conventions
The rules of context-free grammars take the following form:
A→ α with A ∈ N , α ∈ (N ∪ T )∗
... for example:S → aS bS → ε
Specified language: {anbn | n ≥ 0}
Conventions:In examples, we specify nonterminals and terminals in general implicitely:
nonterminals are: A,B,C, ..., 〈exp〉, 〈stmt〉, ...;terminals are: a, b, c, ..., int, name, ...;
6 / 55
![Page 11: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/11.jpg)
Conventions
The rules of context-free grammars take the following form:
A→ α with A ∈ N , α ∈ (N ∪ T )∗
... for example:S → aS bS → ε
Specified language: {anbn | n ≥ 0}
Conventions:In examples, we specify nonterminals and terminals in general implicitely:
nonterminals are: A,B,C, ..., 〈exp〉, 〈stmt〉, ...;terminals are: a, b, c, ..., int, name, ...;
6 / 55
![Page 12: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/12.jpg)
... a practical example:
S → 〈stmt〉〈stmt〉 → 〈if〉 | 〈while〉 | 〈rexp〉;〈if〉 → if ( 〈rexp〉 ) 〈stmt〉 else 〈stmt〉〈while〉 → while ( 〈rexp〉 ) 〈stmt〉〈rexp〉 → int | 〈lexp〉 | 〈lexp〉 = 〈rexp〉 | ...〈lexp〉 → name | ...
More conventions:For every nonterminal, we collect the right hand sides of rules and list them together.The j-th rule for A can be identified via the pair (A, j)( with j ≥ 0).
7 / 55
![Page 13: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/13.jpg)
... a practical example:
S → 〈stmt〉〈stmt〉 → 〈if〉 | 〈while〉 | 〈rexp〉;〈if〉 → if ( 〈rexp〉 ) 〈stmt〉 else 〈stmt〉〈while〉 → while ( 〈rexp〉 ) 〈stmt〉〈rexp〉 → int | 〈lexp〉 | 〈lexp〉 = 〈rexp〉 | ...〈lexp〉 → name | ...
More conventions:For every nonterminal, we collect the right hand sides of rules and list them together.The j-th rule for A can be identified via the pair (A, j)( with j ≥ 0).
7 / 55
![Page 14: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/14.jpg)
Pair of grammars:
E → E+E 0 | E∗E 1 | ( E ) 2 | name 3 | int 4
E → E+T 0 | T 1
T → T∗F 0 | F 1
F → ( E ) 0 | name 1 | int 2
Both grammars describe the same language
8 / 55
![Page 15: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/15.jpg)
Pair of grammars:
E → E+E 0 | E∗E 1 | ( E ) 2 | name 3 | int 4
E → E+T 0 | T 1
T → T∗F 0 | F 1
F → ( E ) 0 | name 1 | int 2
Both grammars describe the same language
8 / 55
![Page 16: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/16.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E
→ E + T→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 17: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/17.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T
→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 18: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/18.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T
→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 19: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/19.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T
→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 20: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/20.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T→ T ∗ int + T
→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 21: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/21.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T
→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 22: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/22.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T
→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 23: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/23.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F
→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 24: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/24.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 25: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/25.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗
9 / 55
... for example:
![Page 26: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/26.jpg)
DerivationGrammars are term rewriting systems. The rules offer feasible rewriting steps. A sequenceof such rewriting steps α0 → . . . → αm is called derivation.
E → E + T→ T + T→ T ∗ F + T→ T ∗ int + T→ F ∗ int + T→ name ∗ int + T→ name ∗ int + F→ name ∗ int + int
DefinitionThe rewriting relation→ is a relation on words over N ∪ T , with
α→ α′ iff α = α1 A α2 ∧ α′ = α1 β α2 for an A→ β ∈ P
The reflexive and transitive closure of → is denoted as: →∗9 / 55
... for example:
![Page 27: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/27.jpg)
Derivation
Remarks:The relation → depends on the grammarIn each step of a derivation, we may choose:
∗ a spot, determining where we will rewrite.
∗ a rule, determining how we will rewrite.
The language, specified by G is:
L(G) = {w ∈ T ∗ | S →∗ w}
Attention:The order, in which disjunct fragments are rewritten is not relevant.
10 / 55
![Page 28: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/28.jpg)
Derivation
Remarks:The relation → depends on the grammarIn each step of a derivation, we may choose:
∗ a spot, determining where we will rewrite.
∗ a rule, determining how we will rewrite.
The language, specified by G is:
L(G) = {w ∈ T ∗ | S →∗ w}
Attention:The order, in which disjunct fragments are rewritten is not relevant.
10 / 55
![Page 29: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/29.jpg)
Derivation Tree
Derivations of a symbol are represented as derivation trees:
... for example:
E → 0 E + T→ 1 T + T→ 0 T ∗ F + T→ 2 T ∗ int + T→ 1 F ∗ int + T→ 1 name ∗ int + T→ 1 name ∗ int + F→ 2 name ∗ int + int
A derivation tree for A ∈ N :inner nodes: rule applications
root: rule application for Aleaves: terminals or ε
The successors of (B, i) correspond to right hand sides of the rule11 / 55
E 0
+E 1
T 0
T 1
F 1
F 2
F 2
T 1
name
int
int∗
![Page 30: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/30.jpg)
Special Derivations
Attention:In contrast to arbitrary derivations, we find special ones, always rewriting the leftmost (orrather rightmost) occurance of a nonterminal.
These are called leftmost (or rather rightmost) derivations and are denoted with theindex L (or R respectively).Leftmost (or rightmost) derivations correspondt to a left-to-right (or right-to-left)preorder-DFS-traversal of the derivation tree.Reverse rightmost derivations correspond to a left-to-right postorder-DFS-traversal ofthe derivation tree
12 / 55
![Page 31: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/31.jpg)
Special Derivations
... for example:E 0
+E 1
T 0
T 1
F 1
F 2
F 2
T 1
name
int
int∗
Leftmost derivation: (E, 0) (E, 1) (T , 0) (T , 1) (F , 1) (F , 2) (T , 1) (F , 2)Rightmost derivation: (E, 0) (T , 1) (F , 2) (E, 1) (T , 0) (F , 2) (T , 1) (F , 1)Reverse rightmost derivation: (F , 1) (T , 1) (F , 2) (T , 0) (E, 1) (F , 2) (T , 1) (E, 0)
13 / 55
![Page 32: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/32.jpg)
Special Derivations
... for example:E 0
+E 1
T 0
T 1
F 1
F 2
F 2
T 1
name
int
int∗
Leftmost derivation: (E, 0) (E, 1) (T , 0) (T , 1) (F , 1) (F , 2) (T , 1) (F , 2)
Rightmost derivation: (E, 0) (T , 1) (F , 2) (E, 1) (T , 0) (F , 2) (T , 1) (F , 1)Reverse rightmost derivation: (F , 1) (T , 1) (F , 2) (T , 0) (E, 1) (F , 2) (T , 1) (E, 0)
13 / 55
![Page 33: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/33.jpg)
Special Derivations
... for example:E 0
+E 1
T 0
T 1
F 1
F 2
F 2
T 1
name
int
int∗
Leftmost derivation: (E, 0) (E, 1) (T , 0) (T , 1) (F , 1) (F , 2) (T , 1) (F , 2)Rightmost derivation: (E, 0) (T , 1) (F , 2) (E, 1) (T , 0) (F , 2) (T , 1) (F , 1)
Reverse rightmost derivation: (F , 1) (T , 1) (F , 2) (T , 0) (E, 1) (F , 2) (T , 1) (E, 0)
13 / 55
![Page 34: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/34.jpg)
Special Derivations
... for example:E 0
+E 1
T 0
T 1
F 1
F 2
F 2
T 1
name
int
int∗
Leftmost derivation: (E, 0) (E, 1) (T , 0) (T , 1) (F , 1) (F , 2) (T , 1) (F , 2)Rightmost derivation: (E, 0) (T , 1) (F , 2) (E, 1) (T , 0) (F , 2) (T , 1) (F , 1)Reverse rightmost derivation: (F , 1) (T , 1) (F , 2) (T , 0) (E, 1) (F , 2) (T , 1) (E, 0)
13 / 55
![Page 35: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/35.jpg)
Unique Grammars
The concatenation of leaves of a derivation tree t are often called yield(t) .
... for example:E 0
+E 1
T 0
T 1
F 1
F 2
F 2
T 1
name
int
int∗
gives rise to the concatenation: name ∗ int + int .14 / 55
![Page 36: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/36.jpg)
Unique Grammars
Definition:Grammar G is called unique, if for every w ∈ T ∗ there is maximally one derivationtree t of S with yield(t) = w.
... in our example:
E → E+E 0 | E∗E 1 | ( E ) 2 | name 3 | int 4
E → E+T 0 | T 1
T → T∗F 0 | F 1
F → ( E ) 0 | name 1 | int 2
The first one is ambiguous, the second one is unique
15 / 55
![Page 37: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/37.jpg)
Conclusion:
A derivation tree represents a possible hierarchical structure of a word.For programming languages, only those grammars with a unique structure are ofinterest.Derivation trees are one-to-one corresponding with leftmost derivations as well as(reverse) rightmost derivations.
Leftmost derivations correspond to a top-down reconstruction of the syntax tree.Reverse rightmost derivations correspond to a bottom-up reconstruction of the syntaxtree.
16 / 55
![Page 38: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/38.jpg)
Conclusion:
A derivation tree represents a possible hierarchical structure of a word.For programming languages, only those grammars with a unique structure are ofinterest.Derivation trees are one-to-one corresponding with leftmost derivations as well as(reverse) rightmost derivations.
Leftmost derivations correspond to a top-down reconstruction of the syntax tree.Reverse rightmost derivations correspond to a bottom-up reconstruction of the syntaxtree.
16 / 55
![Page 39: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/39.jpg)
Chapter 2:
Basics of Pushdown Automata
17 / 55
Syntactic Analysis
![Page 40: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/40.jpg)
Basics of Pushdown Automata
Languages, specified by context free grammars are accepted by Pushdown Automata:
The pushdown is used e.g. to verify correct nesting of braces.
18 / 55
![Page 41: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/41.jpg)
Example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
Conventions:We do not differentiate between pushdown symbols and statesThe rightmost / upper pushdown symbol represents the stateEvery transition consumes / modifies the upper part of the pushdown
19 / 55
![Page 42: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/42.jpg)
Example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
Conventions:We do not differentiate between pushdown symbols and statesThe rightmost / upper pushdown symbol represents the stateEvery transition consumes / modifies the upper part of the pushdown
19 / 55
![Page 43: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/43.jpg)
Definition: Pushdown AutomatonA pushdown automaton (PDA) is a tupleM = (Q,T , δ, q0, F ) with:
Q a finite set of states;T an input alphabet;q0 ∈ Q the start state;F ⊆ Q the set of final states andδ ⊆ Q+ × (T ∪ {ε})×Q∗ a finite set of transitions
We define computations of pushdown automata with the help of transitions; a particularcomputation state (the current configuration) is a pair:
(γ,w) ∈ Q∗ × T ∗
consisting of the pushdown content and the remaining input.
20 / 55
Friedrich Bauer Klaus Samelson
![Page 44: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/44.jpg)
Definition: Pushdown AutomatonA pushdown automaton (PDA) is a tupleM = (Q,T , δ, q0, F ) with:
Q a finite set of states;T an input alphabet;q0 ∈ Q the start state;F ⊆ Q the set of final states andδ ⊆ Q+ × (T ∪ {ε})×Q∗ a finite set of transitions
We define computations of pushdown automata with the help of transitions; a particularcomputation state (the current configuration) is a pair:
(γ,w) ∈ Q∗ × T ∗
consisting of the pushdown content and the remaining input.
20 / 55
Friedrich Bauer Klaus Samelson
![Page 45: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/45.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b) ` (1 1 , a a b b b)` (1 1 1 , a b b b)` (1 1 1 1 , b b b)` (1 1 2 , b b)` (1 2 , b)` (2 , ε)
21 / 55
![Page 46: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/46.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b)
` (1 1 , a a b b b)` (1 1 1 , a b b b)` (1 1 1 1 , b b b)` (1 1 2 , b b)` (1 2 , b)` (2 , ε)
21 / 55
![Page 47: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/47.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b) ` (1 1 , a a b b b)
` (1 1 1 , a b b b)` (1 1 1 1 , b b b)` (1 1 2 , b b)` (1 2 , b)` (2 , ε)
21 / 55
![Page 48: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/48.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b) ` (1 1 , a a b b b)` (1 1 1 , a b b b)
` (1 1 1 1 , b b b)` (1 1 2 , b b)` (1 2 , b)` (2 , ε)
21 / 55
![Page 49: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/49.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b) ` (1 1 , a a b b b)` (1 1 1 , a b b b)` (1 1 1 1 , b b b)
` (1 1 2 , b b)` (1 2 , b)` (2 , ε)
21 / 55
![Page 50: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/50.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b) ` (1 1 , a a b b b)` (1 1 1 , a b b b)` (1 1 1 1 , b b b)` (1 1 2 , b b)
` (1 2 , b)` (2 , ε)
21 / 55
![Page 51: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/51.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b) ` (1 1 , a a b b b)` (1 1 1 , a b b b)` (1 1 1 1 , b b b)` (1 1 2 , b b)` (1 2 , b)
` (2 , ε)
21 / 55
![Page 52: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/52.jpg)
... for example:
States: 0, 1, 2Start state: 0Final states: 0, 2
0 a 111 a 1111 b 212 b 2
(0 , a a a b b b) ` (1 1 , a a b b b)` (1 1 1 , a b b b)` (1 1 1 1 , b b b)` (1 1 2 , b b)` (1 2 , b)` (2 , ε)
21 / 55
![Page 53: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/53.jpg)
A computation step is characterized by the relation ` ⊆ (Q∗ × T ∗)2 with
(αγ, xw) ` (αγ′, w) for (γ, x, γ′) ∈ δ
Remarks:
The relation ` depends on the pushdown automaton MThe reflexive and transitive closure of ` is denoted by `∗
Then, the language accepted by M is
L(M) = {w ∈ T ∗ | ∃ f ∈ F : (q0, w)`∗ (f, ε)}
We accept with a final state together with empty input.
22 / 55
![Page 54: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/54.jpg)
A computation step is characterized by the relation ` ⊆ (Q∗ × T ∗)2 with
(αγ, xw) ` (αγ′, w) for (γ, x, γ′) ∈ δ
Remarks:
The relation ` depends on the pushdown automaton MThe reflexive and transitive closure of ` is denoted by `∗
Then, the language accepted by M is
L(M) = {w ∈ T ∗ | ∃ f ∈ F : (q0, w)`∗ (f, ε)}
We accept with a final state together with empty input.
22 / 55
![Page 55: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/55.jpg)
A computation step is characterized by the relation ` ⊆ (Q∗ × T ∗)2 with
(αγ, xw) ` (αγ′, w) for (γ, x, γ′) ∈ δ
Remarks:
The relation ` depends on the pushdown automaton MThe reflexive and transitive closure of ` is denoted by `∗
Then, the language accepted by M is
L(M) = {w ∈ T ∗ | ∃ f ∈ F : (q0, w)`∗ (f, ε)}
We accept with a final state together with empty input.
22 / 55
![Page 56: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/56.jpg)
Definition: Deterministic Pushdown AutomatonThe pushdown automaton M is deterministic, if every configuration has maximally onesuccessor configuration.
This is exactly the case if for distinct transitions (γ1, x, γ2) , (γ′1, x′, γ′2) ∈ δ we can
assume:Is γ1 a suffix of γ′1, then x 6= x′ ∧ x 6= ε 6= x′ is valid.
... for example:
0 a 111 a 1111 b 212 b 2
... this obviously holds
23 / 55
![Page 57: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/57.jpg)
Definition: Deterministic Pushdown AutomatonThe pushdown automaton M is deterministic, if every configuration has maximally onesuccessor configuration.
This is exactly the case if for distinct transitions (γ1, x, γ2) , (γ′1, x′, γ′2) ∈ δ we can
assume:Is γ1 a suffix of γ′1, then x 6= x′ ∧ x 6= ε 6= x′ is valid.
... for example:
0 a 111 a 1111 b 212 b 2
... this obviously holds
23 / 55
![Page 58: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/58.jpg)
Pushdown Automata
Theorem:For each context free grammar G = (N,T , P , S)a pushdown automaton M with L(G) = L(M) can be built.
The theorem is so important for us, that we take a look at two constructions for automata,motivated by both of the special derivations:
MLG to build Leftmost derivations
MRG to build reverse Rightmost derivations
24 / 55
M. Schützenberger A. Öttinger
![Page 59: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/59.jpg)
Chapter 3:
Top-down Parsing
25 / 55
Syntactic Analysis
![Page 60: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/60.jpg)
Item Pushdown Automaton
Construction: Item Pushdown Automaton MLG
Reconstruct a Leftmost derivation.Expand nonterminals using a rule.Verify successively, that the chosen rule matches the input.
==⇒ The states are now Items (= rules with a bullet):
[A→α • β] , A→ αβ ∈ P
The bullet marks the spot, how far the rule is already processed
26 / 55
![Page 61: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/61.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 62: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/62.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 63: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/63.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 64: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/64.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 65: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/65.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 66: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/66.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 67: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/67.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 68: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/68.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 69: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/69.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 70: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/70.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
S
A B0 0
0
27 / 55
![Page 71: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/71.jpg)
Item Pushdown Automaton – Example
Our example:
S → AB0 A → a0 B → b0
a b
0S
A B0 0
27 / 55
![Page 72: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/72.jpg)
Item Pushdown Automaton – Example
We add another rule S′ → S $ for initialising the construction:
Start state: [S′→ • S $]End state: [S′→S • $]Transition relations:
[S′→ • S $] ε [S′→ • S $] [S→ • AB][S→ • AB] ε [S→ • AB] [A→ • a][A→ • a] a [A→ a •][S→ • AB] [A→ a •] ε [S→A • B][S→A • B] ε [S→A • B] [B→ • b][B→ • b] b [B→ b •][S→A • B] [B→ b •] ε [S→AB •][S′→ • S $] [S→AB •] ε [S′→S • $]
28 / 55
![Page 73: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/73.jpg)
Item Pushdown Automaton
The item pushdown automaton MLG has three kinds of transitions:
Expansions: ([A→α •B β], ε, [A→α •B β] [B→ • γ]) forA → αB β, B→ γ ∈ P
Shifts: ([A→α • a β], a, [A→αa • β]) for A→αaβ ∈ PReduces: ([A→α •B β] [B→ γ•], ε, [A→αB • β]) for
A→αB β, B→ γ ∈ P
Items of the form: [A→α •] are also called completeThe item pushdown automaton shifts the bullet around the derivation tree ...
29 / 55
![Page 74: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/74.jpg)
Item Pushdown Automaton
Discussion:
The expansions of a computation form a leftmost derivationUnfortunately, the expansions are chosen nondeterministically
For proving correctness of the construction, we show that for every Item [A→α •B β]the following holds:
([A→α •B β], w) `∗ ([A→αB • β], ε) iff B →∗ w
LL-Parsing is based on the item pushdown automaton and tries to make theexpansions deterministic ...
30 / 55
![Page 75: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/75.jpg)
Item Pushdown Automaton
Example: S′ → S $ S → ε | aS b
The transitions of the according Item Pushdown Automaton:
0 [S′→ • S $] ε [S′→ • S $] [S→•]1 [S′→ • S $] ε [S′→ • S $] [S→ • aS b]2 [S→ • aS b] a [S→ a • S b]3 [S→ a • S b] ε [S→ a • S b] [S→•]4 [S→ a • S b] ε [S→ a • S b] [S→ • aS b]5 [S→ a • S b] [S→•] ε [S→ aS • b]6 [S→ a • S b] [S→ aS b•] ε [S→ aS • b]7 [S→ aS • b] b [S→ aS b•]8 [S′→ • S $] [S→•] ε [S′→S • $]9 [S′→ • S $] [S→ aS b•] ε [S′→S • $]
Conflicts arise between the transitions (0, 1) and (3, 4), resp..
31 / 55
![Page 76: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/76.jpg)
Item Pushdown Automaton
Example: S′ → S $ S → ε | aS b
The transitions of the according Item Pushdown Automaton:
0 [S′→ • S $] ε [S′→ • S $] [S→•]1 [S′→ • S $] ε [S′→ • S $] [S→ • aS b]2 [S→ • aS b] a [S→ a • S b]3 [S→ a • S b] ε [S→ a • S b] [S→•]4 [S→ a • S b] ε [S→ a • S b] [S→ • aS b]5 [S→ a • S b] [S→•] ε [S→ aS • b]6 [S→ a • S b] [S→ aS b•] ε [S→ aS • b]7 [S→ aS • b] b [S→ aS b•]8 [S′→ • S $] [S→•] ε [S′→S • $]9 [S′→ • S $] [S→ aS b•] ε [S′→S • $]
Conflicts arise between the transitions (0, 1) and (3, 4), resp..
31 / 55
![Page 77: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/77.jpg)
Topdown Parsing
Problem:Conflicts between the transitions prohibit an implementation of the item pushdownautomaton as deterministic pushdown automaton.
Idea 1: GLL ParsingFor each conflict, we create a virtual copy of the complete configuration and continuecomputing in parallel.
Idea 2: Recursive Descent & BacktrackingDepth-first search for an appropriate derivation.
Idea 3: Recursive Descent & LookaheadConflicts are resolved by considering a lookup of the next input symbols.
32 / 55
![Page 78: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/78.jpg)
Topdown Parsing
Problem:Conflicts between the transitions prohibit an implementation of the item pushdownautomaton as deterministic pushdown automaton.
Idea 1: GLL ParsingFor each conflict, we create a virtual copy of the complete configuration and continuecomputing in parallel.
Idea 2: Recursive Descent & BacktrackingDepth-first search for an appropriate derivation.
Idea 3: Recursive Descent & LookaheadConflicts are resolved by considering a lookup of the next input symbols.
32 / 55
![Page 79: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/79.jpg)
Topdown Parsing
Problem:Conflicts between the transitions prohibit an implementation of the item pushdownautomaton as deterministic pushdown automaton.
Idea 1: GLL ParsingFor each conflict, we create a virtual copy of the complete configuration and continuecomputing in parallel.
Idea 2: Recursive Descent & BacktrackingDepth-first search for an appropriate derivation.
Idea 3: Recursive Descent & LookaheadConflicts are resolved by considering a lookup of the next input symbols.
32 / 55
![Page 80: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/80.jpg)
Topdown Parsing
Problem:Conflicts between the transitions prohibit an implementation of the item pushdownautomaton as deterministic pushdown automaton.
Idea 1: GLL ParsingFor each conflict, we create a virtual copy of the complete configuration and continuecomputing in parallel.
Idea 2: Recursive Descent & BacktrackingDepth-first search for an appropriate derivation.
Idea 3: Recursive Descent & LookaheadConflicts are resolved by considering a lookup of the next input symbols.
32 / 55
![Page 81: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/81.jpg)
Structure of the LL(1)-Parser:
δ
MOutput
The parser accesses a frame of length 1 of the input;it corresponds to an item pushdown automaton, essentially;table M [q, w] contains the rule of choice.
33 / 55
![Page 82: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/82.jpg)
Topdown Parsing
Idea:Emanate from the item pushdown automatonConsider the next input symbol to determine the appropriate rule for the next expansionA grammar is called LL(1) if a unique choice is always possible
Definition:A reduced grammar is called LL(1), if for each two distinctrules A→α , A→α′ ∈ P and each derivationS →∗L uAβ with u ∈ T ∗ the following is valid:
First1(αβ) ∩ First1(α′ β) = ∅
34 / 55
![Page 83: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/83.jpg)
Topdown Parsing
Idea:Emanate from the item pushdown automatonConsider the next input symbol to determine the appropriate rule for the next expansionA grammar is called LL(1) if a unique choice is always possible
Definition:A reduced grammar is called LL(1), if for each two distinctrules A→α , A→α′ ∈ P and each derivationS →∗L uAβ with u ∈ T ∗ the following is valid:
First1(αβ) ∩ First1(α′ β) = ∅
34 / 55
Philip Lewis Richard Stearns
![Page 84: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/84.jpg)
Topdown Parsing
Example 1:
S → if ( E ) S else S |while ( E ) S |E ;
E → id
is LL(1), since First1(E) = {id}
Example 2:
S → if ( E ) S else S |if ( E ) S |while ( E ) S |E ;
E → id
... is not LL(k) for any k > 0.
35 / 55
![Page 85: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/85.jpg)
Topdown Parsing
Example 1:
S → if ( E ) S else S |while ( E ) S |E ;
E → id
is LL(1), since First1(E) = {id}Example 2:
S → if ( E ) S else S |if ( E ) S |while ( E ) S |E ;
E → id
... is not LL(k) for any k > 0.35 / 55
![Page 86: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/86.jpg)
Lookahead Sets
Definition: First1-SetsFor a set L ⊆ T ∗ we define:
First1(L) = {ε | ε ∈ L} ∪ {u ∈ T | ∃ v ∈ T ∗ : uv ∈ L}
Example: S → ε | aS bFirst1([[S]])εa ba a b ba a a b b b. . .
≡ the yield’s prefix of length 1
36 / 55
![Page 87: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/87.jpg)
Lookahead Sets
Definition: First1-SetsFor a set L ⊆ T ∗ we define:
First1(L) = {ε | ε ∈ L} ∪ {u ∈ T | ∃ v ∈ T ∗ : uv ∈ L}
Example: S → ε | aS bFirst1([[S]])εa ba a b ba a a b b b. . .
≡ the yield’s prefix of length 136 / 55
![Page 88: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/88.jpg)
Lookahead Sets
Arithmetics:First1(_) is distributive with union and concatenation:
First1(∅) = ∅First1(L1 ∪ L2) = First1(L1) ∪ First1(L2)First1(L1 · L2) = First1(First1(L1) · First1(L2))
:= First1(L1) �1 First1(L2)
�1 being 1− concatenation
Definition: 1-concatenationLet L1, L2 ⊆ T ∪ {ε} with L1 6= ∅ 6= L2. Then:
L1 �1 L2 =
{L1 if ε 6∈ L1
(L1\{ε}) ∪ L2 otherwise
If all rules of G are productive, then all sets First1(A) are non-empty.
37 / 55
![Page 89: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/89.jpg)
Lookahead Sets
Arithmetics:First1(_) is distributive with union and concatenation:
First1(∅) = ∅First1(L1 ∪ L2) = First1(L1) ∪ First1(L2)First1(L1 · L2) = First1(First1(L1) · First1(L2))
:= First1(L1) �1 First1(L2)
�1 being 1− concatenation
Definition: 1-concatenationLet L1, L2 ⊆ T ∪ {ε} with L1 6= ∅ 6= L2. Then:
L1 �1 L2 =
{L1 if ε 6∈ L1
(L1\{ε}) ∪ L2 otherwise
If all rules of G are productive, then all sets First1(A) are non-empty.37 / 55
![Page 90: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/90.jpg)
Lookahead Sets
For α ∈ (N ∪ T )∗ we are interested in the set:
First1(α) = First1({w ∈ T ∗ | α→∗ w})
Idea: Treat ε separately: First1(A) = F ε(A) ∪ {ε | A→∗ε}Let empty(X) = true iff X→∗ ε .
We characterize the ε-free First1-sets with an inequality system:
F ε(a) = {a} if a ∈ TF ε(A) ⊇ F ε(Xj) if A→X1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xj−1)
38 / 55
![Page 91: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/91.jpg)
Lookahead Sets
For α ∈ (N ∪ T )∗ we are interested in the set:
First1(α) = First1({w ∈ T ∗ | α→∗ w})
Idea: Treat ε separately: First1(A) = F ε(A) ∪ {ε | A→∗ε}Let empty(X) = true iff X→∗ ε .
F ε(X1 . . . Xm) = F ε(X1) ∪ . . . ∪ F ε(Xj) if ¬empty(Xj) ∧∧j−1i=1 empty(Xi)
We characterize the ε-free First1-sets with an inequality system:
F ε(a) = {a} if a ∈ TF ε(A) ⊇ F ε(Xj) if A→X1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xj−1)
38 / 55
![Page 92: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/92.jpg)
Lookahead Sets
For α ∈ (N ∪ T )∗ we are interested in the set:
First1(α) = First1({w ∈ T ∗ | α→∗ w})
Idea: Treat ε separately: First1(A) = F ε(A) ∪ {ε | A→∗ε}Let empty(X) = true iff X→∗ ε .
F ε(X1 . . . Xm) =⋃ji=1 F ε(Xi) if ¬empty(Xj) ∧
∧j−1i=1 empty(Xi)
We characterize the ε-free First1-sets with an inequality system:
F ε(a) = {a} if a ∈ TF ε(A) ⊇ F ε(Xj) if A→X1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xj−1)
38 / 55
![Page 93: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/93.jpg)
Lookahead Sets
For α ∈ (N ∪ T )∗ we are interested in the set:
First1(α) = First1({w ∈ T ∗ | α→∗ w})
Idea: Treat ε separately: First1(A) = F ε(A) ∪ {ε | A→∗ε}Let empty(X) = true iff X→∗ ε .
F ε(X1 . . . Xm) =⋃ji=1 F ε(Xi) if ¬empty(Xj) ∧
∧j−1i=1 empty(Xi)
We characterize the ε-free First1-sets with an inequality system:
F ε(a) = {a} if a ∈ TF ε(A) ⊇ F ε(Xj) if A→X1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xj−1)
38 / 55
![Page 94: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/94.jpg)
Lookahead Sets
for example...
E → E+T 0 | T 1
T → T ∗F 0 | F 1
F → ( E ) 0 | name 1 | int 2
with empty(E) = empty(T ) = empty(F ) = false
... we obtain:
F ε(S′) ⊇ F ε(E) F ε(E) ⊇ F ε(E)
F ε(E) ⊇ F ε(T ) F ε(T ) ⊇ F ε(T )F ε(T ) ⊇ F ε(F ) F ε(F ) ⊇ { ( , name, int}
39 / 55
![Page 95: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/95.jpg)
Lookahead Sets
for example...
E → E+T 0 | T 1
T → T ∗F 0 | F 1
F → ( E ) 0 | name 1 | int 2
with empty(E) = empty(T ) = empty(F ) = false
... we obtain:
F ε(S′) ⊇ F ε(E) F ε(E) ⊇ F ε(E)
F ε(E) ⊇ F ε(T ) F ε(T ) ⊇ F ε(T )F ε(T ) ⊇ F ε(F ) F ε(F ) ⊇ { ( , name, int}
39 / 55
![Page 96: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/96.jpg)
Fast Computation of Lookahead Sets
Observation:The form of each inequality of these systems is:
x w y resp. x w d
for variables x, y und d ∈ D.Such systems are called pure unification problemsSuch problems can be solved in linear space/time.
for example: D = 2{a,b,c}
x0 ⊇ {a}x1 ⊇ {b} x1 ⊇ x0 x1 ⊇ x3
x2 ⊇ {c} x2 ⊇ x1
x3 ⊇ {c} x3 ⊇ x2 x3 ⊇ x3
a b
c
c
0 1
3
2
40 / 55
![Page 97: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/97.jpg)
Fast Computation of Lookahead Sets
a b
c
c
0 1
3
2
Proceeding:Create the Variable Dependency Graph for the inequality system.
Whithin a Strongly Connected Component (→ Tarjan) all variables have the same valueIs there no ingoing edge for an SCC, its value is computed via the smallest upperbound of all values within the SCCIn case of ingoing edges, their values are also to be considered for the upper bound
41 / 55
Frank DeRemer& Tom Pennello
![Page 98: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/98.jpg)
Fast Computation of Lookahead Sets
a b
c
c
0 1
3
2
Proceeding:Create the Variable Dependency Graph for the inequality system.Whithin a Strongly Connected Component (→ Tarjan) all variables have the same value
Is there no ingoing edge for an SCC, its value is computed via the smallest upperbound of all values within the SCCIn case of ingoing edges, their values are also to be considered for the upper bound
41 / 55
Frank DeRemer& Tom Pennello
![Page 99: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/99.jpg)
Fast Computation of Lookahead Sets
a
b c
0 1
3
2
Proceeding:Create the Variable Dependency Graph for the inequality system.Whithin a Strongly Connected Component (→ Tarjan) all variables have the same valueIs there no ingoing edge for an SCC, its value is computed via the smallest upperbound of all values within the SCC
In case of ingoing edges, their values are also to be considered for the upper bound
41 / 55
Frank DeRemer& Tom Pennello
![Page 100: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/100.jpg)
Fast Computation of Lookahead Sets
a
a b c
0 1
3
2
Proceeding:Create the Variable Dependency Graph for the inequality system.Whithin a Strongly Connected Component (→ Tarjan) all variables have the same valueIs there no ingoing edge for an SCC, its value is computed via the smallest upperbound of all values within the SCCIn case of ingoing edges, their values are also to be considered for the upper bound
41 / 55
Frank DeRemer& Tom Pennello
![Page 101: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/101.jpg)
Fast Computation of Lookahead Sets
... for our example grammar:
First1 :
E T FS’
(, int, name
42 / 55
![Page 102: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/102.jpg)
Item Pushdown Automaton as LL(1)-Parser
context is relevant too: S′ → S $ S → ε 0 | aS b 1
First1(input) $ a b
S ? ? ?
43 / 55
S′i0
A1i1
in
β
β1
β0
w ∈ First1( )
S
?
![Page 103: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/103.jpg)
Item Pushdown Automaton as LL(1)-Parser
context is relevant too: S′ → S $ S → ε 0 | aS b 1
First1(input) $ a b
S ? ? ?
43 / 55
γ
S′i0
A1i1
in
β
β1
β0
w ∈ First1( )
S
aBb1
S′i0
A1i1
in
β
β0
w ∈ First1( )
S
0 ε
β1
![Page 104: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/104.jpg)
Item Pushdown Automaton as LL(1)-Parser
Inequality system for Follow1(B) = First1(β)�1 . . .�1 First1(β0)
Follow1(S) ⊇ {$}Follow1(B) ⊇ F ε(Xj) if A→αBX1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xj−1)Follow1(B) ⊇ Follow1(A) if A→αBX1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xm)
44 / 55
γ
S′i0
A1i1
in
Bi β
β1
β0
An
)w ∈ First1(
w ∈ First1(First1(γ)�1 First1(β)�1 . . .�1 First1(β0))w ∈ First1(γ)�1 Follow1(B)
![Page 105: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/105.jpg)
Item Pushdown Automaton as LL(1)-Parser
Inequality system for Follow1(B) = First1(β)�1 . . .�1 First1(β0)
Follow1(S) ⊇ {$}Follow1(B) ⊇ F ε(Xj) if A→αBX1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xj−1)Follow1(B) ⊇ Follow1(A) if A→αBX1 . . . Xm ∈ P , empty(X1) ∧ . . . ∧ empty(Xm)
44 / 55
γ
S′i0
A1i1
in
Bi β
β1
β0
An
)w ∈ First1(
w ∈ First1(First1(γ)�1 First1(β)�1 . . .�1 First1(β0))w ∈ First1(γ)�1 Follow1(B)
![Page 106: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/106.jpg)
Item Pushdown Automaton as LL(1)-Parser
Is G an LL(1)-grammar, we can index a lookahead-table with items and nonterminals:
LL(1)-Lookahead TableWe set M [B, w] = i with B→ γ i if w ∈ First1(γ) �1 Follow1(B)
... for example: S′ → S $ S → ε 0 | aS b 1
First1(S) = {ε, a} Follow1(S) = {b, $}
S-rule 0 : First1(ε) �1 Follow1(S) = {b, $}S-rule 1 : First1(aSb) �1 Follow1(S) = {a}
$ a b
S 0 1 0
45 / 55
![Page 107: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/107.jpg)
Item Pushdown Automaton as LL(1)-Parser
Is G an LL(1)-grammar, we can index a lookahead-table with items and nonterminals:
LL(1)-Lookahead TableWe set M [B, w] = i with B→ γ i if w ∈ First1(γ) �1 Follow1(B)
... for example: S′ → S $ S → ε 0 | aS b 1
First1(S) = {ε, a}
Follow1(S) = {b, $}
S-rule 0 : First1(ε) �1 Follow1(S) = {b, $}S-rule 1 : First1(aSb) �1 Follow1(S) = {a}
$ a b
S 0 1 0
45 / 55
![Page 108: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/108.jpg)
Item Pushdown Automaton as LL(1)-Parser
Is G an LL(1)-grammar, we can index a lookahead-table with items and nonterminals:
LL(1)-Lookahead TableWe set M [B, w] = i with B→ γ i if w ∈ First1(γ) �1 Follow1(B)
... for example: S′ → S $ S → ε 0 | aS b 1
First1(S) = {ε, a} Follow1(S) = {b, $}
S-rule 0 : First1(ε) �1 Follow1(S) = {b, $}S-rule 1 : First1(aSb) �1 Follow1(S) = {a}
$ a b
S 0 1 0
45 / 55
![Page 109: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/109.jpg)
Item Pushdown Automaton as LL(1)-Parser
Is G an LL(1)-grammar, we can index a lookahead-table with items and nonterminals:
LL(1)-Lookahead TableWe set M [B, w] = i with B→ γ i if w ∈ First1(γ) �1 Follow1(B)
... for example: S′ → S $ S → ε 0 | aS b 1
First1(S) = {ε, a} Follow1(S) = {b, $}
S-rule 0 : First1(ε) �1 Follow1(S) = {b, $}S-rule 1 : First1(aSb) �1 Follow1(S) = {a}
$ a b
S 0 1 0
45 / 55
![Page 110: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/110.jpg)
Item Pushdown Automaton as LL(1)-Parser
Is G an LL(1)-grammar, we can index a lookahead-table with items and nonterminals:
LL(1)-Lookahead TableWe set M [B, w] = i with B→ γ i if w ∈ First1(γ) �1 Follow1(B)
... for example: S′ → S $ S → ε 0 | aS b 1
First1(S) = {ε, a} Follow1(S) = {b, $}
S-rule 0 : First1(ε) �1 Follow1(S) = {b, $}S-rule 1 : First1(aSb) �1 Follow1(S) = {a}
$ a b
S 0 1 0
45 / 55
![Page 111: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/111.jpg)
Item Pushdown Automaton as LL(1)-Parser
For example: S′ → S $ S → ε 0 | aS b 1
The transitions of the according Item Pushdown Automaton:
0 [S′→ • S $] ε [S′→ • S $] [S→•]1 [S′→ • S $] ε [S′→ • S $] [S→ • aS b]2 [S→ • aS b] a [S→ a • S b]3 [S→ a • S b] ε [S→ a • S b] [S→•]4 [S→ a • S b] ε [S→ a • S b] [S→ • aS b]5 [S→ a • S b] [S→•] ε [S→ aS • b]6 [S→ a • S b] [S→ aS b•] ε [S→ aS • b]7 [S→ aS • b] b [S→ aS b•]8 [S′→ • S $] [S→•] ε [S′→S • $]9 [S′→ • S $] [S→ aS b•] ε [S′→S • $]
Lookahead table:$ a b
S 0 1 0
46 / 55
![Page 112: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/112.jpg)
Left Recursion
Attention:Many grammars are not LL(k) !
A reason for that is:
DefinitionGrammar G is called left-recursive, if
A→+Aβ for an A ∈ N , β ∈ (T ∪N)∗
Example:E → E+T 0 | T 1
T → T ∗F 0 | F 1
F → ( E ) 0 | name 1 | int 2
... is left-recursive
47 / 55
![Page 113: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/113.jpg)
Left Recursion
Attention:Many grammars are not LL(k) !
A reason for that is:
DefinitionGrammar G is called left-recursive, if
A→+Aβ for an A ∈ N , β ∈ (T ∪N)∗
Example:E → E+T 0 | T 1
T → T ∗F 0 | F 1
F → ( E ) 0 | name 1 | int 2
... is left-recursive47 / 55
![Page 114: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/114.jpg)
Left Recursion
Theorem:Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.
Proof:Let wlog. A→Aβ |α ∈ Pand A be reachable from S
Assumption: G is LL(k)
⇒Firstk(αβn γ) ∩
Firstk(αβn+1 γ) = ∅
Case 1: β→∗ ε — Contradiction !!!Case 2: β→∗ w 6= ε ==⇒ Firstk(αw
k γ) ∩ Firstk(αwk+1 γ) 6= ∅
48 / 55
![Page 115: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/115.jpg)
Left Recursion
Theorem:Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.
Proof:Let wlog. A→Aβ |α ∈ Pand A be reachable from S
Assumption: G is LL(k)
⇒Firstk(αβn γ) ∩
Firstk(αβn+1 γ) = ∅
Case 1: β→∗ ε — Contradiction !!!Case 2: β→∗ w 6= ε ==⇒ Firstk(αw
k γ) ∩ Firstk(αwk+1 γ) 6= ∅
48 / 55
![Page 116: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/116.jpg)
Left Recursion
Theorem:Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.
Proof:Let wlog. A→Aβ |α ∈ Pand A be reachable from S
Assumption: G is LL(k)
⇒Firstk(αβn γ) ∩
Firstk(αβn+1 γ) = ∅
Case 1: β→∗ ε — Contradiction !!!Case 2: β→∗ w 6= ε ==⇒ Firstk(αw
k γ) ∩ Firstk(αwk+1 γ) 6= ∅
48 / 55
S
A
A
A β
n
Firstk(
βn γ
)
![Page 117: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/117.jpg)
Left Recursion
Theorem:Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.
Proof:Let wlog. A→Aβ |α ∈ Pand A be reachable from S
Assumption: G is LL(k)
⇒Firstk(αβn γ) ∩
Firstk(αβn+1 γ) = ∅
Case 1: β→∗ ε — Contradiction !!!Case 2: β→∗ w 6= ε ==⇒ Firstk(αw
k γ) ∩ Firstk(αwk+1 γ) 6= ∅
48 / 55
α
S
A
A
A β
n
Firstk(
βn γ
)
![Page 118: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/118.jpg)
Left Recursion
Theorem:Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.
Proof:Let wlog. A→Aβ |α ∈ Pand A be reachable from S
Assumption: G is LL(k)
⇒Firstk(αβn γ) ∩
Firstk(αβn+1 γ) = ∅
Case 1: β→∗ ε — Contradiction !!!Case 2: β→∗ w 6= ε ==⇒ Firstk(αw
k γ) ∩ Firstk(αwk+1 γ) 6= ∅
48 / 55
α
A
S
A
n
βn
A
βA
β
Firstk(
γ
)
![Page 119: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/119.jpg)
Left Recursion
Theorem:Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.
Proof:Let wlog. A→Aβ |α ∈ Pand A be reachable from S
Assumption: G is LL(k)
⇒Firstk(αβn γ) ∩
Firstk(αβn+1 γ) = ∅
Case 1: β→∗ ε — Contradiction !!!Case 2: β→∗ w 6= ε ==⇒ Firstk(αw
k γ) ∩ Firstk(αwk+1 γ) 6= ∅
48 / 55
![Page 120: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/120.jpg)
Right-Regular Context-Free ParsingRecurring scheme in programming languages: Lists of sth...S → b | S a bAlternative idea: Regular ExpressionsS → ( b a )∗ b
Definition: Right-Regular Context-Free GrammarA right-regular context-free grammar (RR-CFG) is a4-tuple G = (N,T , P , S) with:
N the set of nonterminals,T the set of terminals,P the set of rules with regular expressions of symbols as rhs,S ∈ N the start symbol
Example: Arithmetic ExpressionsS → EE → T ( +T )∗
T → F ( ∗F )∗
F → ( E ) | name | int
49 / 55
![Page 121: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/121.jpg)
Right-Regular Context-Free ParsingRecurring scheme in programming languages: Lists of sth...S → b | S a bAlternative idea: Regular ExpressionsS → ( b a )∗ b
Definition: Right-Regular Context-Free GrammarA right-regular context-free grammar (RR-CFG) is a4-tuple G = (N,T , P , S) with:
N the set of nonterminals,T the set of terminals,P the set of rules with regular expressions of symbols as rhs,S ∈ N the start symbol
Example: Arithmetic ExpressionsS → EE → T ( +T )∗
T → F ( ∗F )∗
F → ( E ) | name | int
49 / 55
![Page 122: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/122.jpg)
Right-Regular Context-Free ParsingRecurring scheme in programming languages: Lists of sth...S → b | S a bAlternative idea: Regular ExpressionsS → ( b a )∗ b
Definition: Right-Regular Context-Free GrammarA right-regular context-free grammar (RR-CFG) is a4-tuple G = (N,T , P , S) with:
N the set of nonterminals,T the set of terminals,P the set of rules with regular expressions of symbols as rhs,S ∈ N the start symbol
Example: Arithmetic ExpressionsS → EE → T ( +T )∗
T → F ( ∗F )∗
F → ( E ) | name | int49 / 55
![Page 123: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/123.jpg)
Idea 1: Rewrite the rules from G to 〈G〉:
A → 〈α〉 if A→ α ∈ P〈α〉 → α if α ∈ N ∪ T〈ε〉 → ε〈α∗〉 → ε | 〈α〉〈α∗〉 if α ∈ RegexT,N〈α1 . . . αn〉 → 〈α1〉 . . . 〈αn〉 if αi ∈ RegexT,N〈α1 | . . . | αn〉 → 〈α1〉 | . . . | 〈αn〉 if αi ∈ RegexT,N
. . . and generate the according LL(k)-Parser ML〈G〉
Example: Arithmetic Expressions cont’dS → EE →T → F ( ∗F )∗
F → ( E ) | name | int〈T ( +T )∗〉 → T 〈( +T )∗〉〈( +T )∗〉 → ε | 〈 +T 〉〈( +T )∗〉〈 +T 〉 → +T〈F ( ∗F )∗〉 → F 〈( ∗F )∗〉〈( ∗F )∗〉 → ε | 〈∗F 〉 〈( ∗F )∗〉〈 ∗F 〉 → ∗F
50 / 55
![Page 124: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/124.jpg)
Idea 1: Rewrite the rules from G to 〈G〉:
A → 〈α〉 if A→ α ∈ P〈α〉 → α if α ∈ N ∪ T〈ε〉 → ε〈α∗〉 → ε | 〈α〉〈α∗〉 if α ∈ RegexT,N〈α1 . . . αn〉 → 〈α1〉 . . . 〈αn〉 if αi ∈ RegexT,N〈α1 | . . . | αn〉 → 〈α1〉 | . . . | 〈αn〉 if αi ∈ RegexT,N
. . . and generate the according LL(k)-Parser ML〈G〉
Example: Arithmetic Expressions cont’dS → EE → T ( +T )∗
T → F ( ∗F )∗
F → ( E ) | name | int
〈T ( +T )∗〉 → T 〈( +T )∗〉〈( +T )∗〉 → ε | 〈 +T 〉〈( +T )∗〉〈 +T 〉 → +T〈F ( ∗F )∗〉 → F 〈( ∗F )∗〉〈( ∗F )∗〉 → ε | 〈∗F 〉 〈( ∗F )∗〉〈 ∗F 〉 → ∗F
50 / 55
![Page 125: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/125.jpg)
Idea 1: Rewrite the rules from G to 〈G〉:
A → 〈α〉 if A→ α ∈ P〈α〉 → α if α ∈ N ∪ T〈ε〉 → ε〈α∗〉 → ε | 〈α〉〈α∗〉 if α ∈ RegexT,N〈α1 . . . αn〉 → 〈α1〉 . . . 〈αn〉 if αi ∈ RegexT,N〈α1 | . . . | αn〉 → 〈α1〉 | . . . | 〈αn〉 if αi ∈ RegexT,N
. . . and generate the according LL(k)-Parser ML〈G〉
Example: Arithmetic Expressions cont’dS → EE → 〈T ( +T )∗〉T → F ( ∗F )∗
F → ( E ) | name | int〈T ( +T )∗〉 → T 〈( +T )∗〉
〈( +T )∗〉 → ε | 〈 +T 〉〈( +T )∗〉〈 +T 〉 → +T〈F ( ∗F )∗〉 → F 〈( ∗F )∗〉〈( ∗F )∗〉 → ε | 〈∗F 〉 〈( ∗F )∗〉〈 ∗F 〉 → ∗F
50 / 55
![Page 126: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/126.jpg)
Idea 1: Rewrite the rules from G to 〈G〉:
A → 〈α〉 if A→ α ∈ P〈α〉 → α if α ∈ N ∪ T〈ε〉 → ε〈α∗〉 → ε | 〈α〉〈α∗〉 if α ∈ RegexT,N〈α1 . . . αn〉 → 〈α1〉 . . . 〈αn〉 if αi ∈ RegexT,N〈α1 | . . . | αn〉 → 〈α1〉 | . . . | 〈αn〉 if αi ∈ RegexT,N
. . . and generate the according LL(k)-Parser ML〈G〉
Example: Arithmetic Expressions cont’dS → EE → 〈T ( +T )∗〉T → F ( ∗F )∗
F → ( E ) | name | int〈T ( +T )∗〉 → T 〈( +T )∗〉〈( +T )∗〉 → ε | 〈 +T 〉〈( +T )∗〉
〈 +T 〉 → +T〈F ( ∗F )∗〉 → F 〈( ∗F )∗〉〈( ∗F )∗〉 → ε | 〈∗F 〉 〈( ∗F )∗〉〈 ∗F 〉 → ∗F
50 / 55
![Page 127: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/127.jpg)
Idea 1: Rewrite the rules from G to 〈G〉:
A → 〈α〉 if A→ α ∈ P〈α〉 → α if α ∈ N ∪ T〈ε〉 → ε〈α∗〉 → ε | 〈α〉〈α∗〉 if α ∈ RegexT,N〈α1 . . . αn〉 → 〈α1〉 . . . 〈αn〉 if αi ∈ RegexT,N〈α1 | . . . | αn〉 → 〈α1〉 | . . . | 〈αn〉 if αi ∈ RegexT,N
. . . and generate the according LL(k)-Parser ML〈G〉
Example: Arithmetic Expressions cont’dS → EE → 〈T ( +T )∗〉T → F ( ∗F )∗
F → ( E ) | name | int〈T ( +T )∗〉 → T 〈( +T )∗〉〈( +T )∗〉 → ε | 〈 +T 〉〈( +T )∗〉〈 +T 〉 → +T
〈F ( ∗F )∗〉 → F 〈( ∗F )∗〉〈( ∗F )∗〉 → ε | 〈∗F 〉 〈( ∗F )∗〉〈 ∗F 〉 → ∗F
50 / 55
![Page 128: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/128.jpg)
Idea 1: Rewrite the rules from G to 〈G〉:
A → 〈α〉 if A→ α ∈ P〈α〉 → α if α ∈ N ∪ T〈ε〉 → ε〈α∗〉 → ε | 〈α〉〈α∗〉 if α ∈ RegexT,N〈α1 . . . αn〉 → 〈α1〉 . . . 〈αn〉 if αi ∈ RegexT,N〈α1 | . . . | αn〉 → 〈α1〉 | . . . | 〈αn〉 if αi ∈ RegexT,N
. . . and generate the according LL(k)-Parser ML〈G〉
Example: Arithmetic Expressions cont’dS → EE → 〈T ( +T )∗〉T → 〈F ( ∗F )∗〉F → ( E ) | name | int〈T ( +T )∗〉 → T 〈( +T )∗〉〈( +T )∗〉 → ε | 〈 +T 〉〈( +T )∗〉〈 +T 〉 → +T
〈F ( ∗F )∗〉 → F 〈( ∗F )∗〉〈( ∗F )∗〉 → ε | 〈∗F 〉 〈( ∗F )∗〉〈 ∗F 〉 → ∗F
50 / 55
![Page 129: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/129.jpg)
Idea 1: Rewrite the rules from G to 〈G〉:
A → 〈α〉 if A→ α ∈ P〈α〉 → α if α ∈ N ∪ T〈ε〉 → ε〈α∗〉 → ε | 〈α〉〈α∗〉 if α ∈ RegexT,N〈α1 . . . αn〉 → 〈α1〉 . . . 〈αn〉 if αi ∈ RegexT,N〈α1 | . . . | αn〉 → 〈α1〉 | . . . | 〈αn〉 if αi ∈ RegexT,N
. . . and generate the according LL(k)-Parser ML〈G〉
Example: Arithmetic Expressions cont’dS → EE → 〈T ( +T )∗〉T → 〈F ( ∗F )∗〉F → ( E ) | name | int〈T ( +T )∗〉 → T 〈( +T )∗〉〈( +T )∗〉 → ε | 〈 +T 〉〈( +T )∗〉〈 +T 〉 → +T〈F ( ∗F )∗〉 → F 〈( ∗F )∗〉〈( ∗F )∗〉 → ε | 〈∗F 〉 〈( ∗F )∗〉〈 ∗F 〉 → ∗F 50 / 55
![Page 130: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/130.jpg)
Definition:An RR−CFG G is called RLL(1),if the corresponding CFG 〈G〉 is an LL(1) grammar.
Discussiondirectly yields the table driven parser ML
〈G〉 for RLL(1) grammarshowever: mapping regular expressions to recursive productions unnessessarily strainsthe stack→ instead directly construct automaton in the style of Berry-Sethi
51 / 55
Reinhold Heckmann
![Page 131: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/131.jpg)
Idea 2: Recursive Descent RLL Parsers:
Recursive descent RLL(1)-parsers are an alternative to table-driven parsers; apart fromthe usual function scan(), we generate a program frame with the lookahead functionexpect()and the main parsing method parse():
int next;void expect(Set E){
if ({ε, next} ∩ E = ∅){cerr << ”Expected” << E << ”found” << next;exit(0);
}return ;
}void parse(){
next = scan();expect(First1(S)) ;S();expect({EOF}) ;
}52 / 55
![Page 132: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/132.jpg)
Idea 2: Recursive Descent RLL Parsers:
For each A→ α ∈ P , we introduce:
void A(){generate(α)
}
with the meta-program generate being defined by structural decomposition of α:
generate(r1 . . . rk) = generate(r1)expect(First1(r2 )) ;generate(r2)...expect(First1(rk )) ;generate(rk)
generate(ε) = ;generate(a) = next = scan();generate(A) = A();
53 / 55
![Page 133: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/133.jpg)
Idea 2: Recursive Descent RLL Parsers:
generate(r∗) = while ( next ∈ Fε(r)) {generate(r)}
generate(r1 | . . . | rk) = switch(next) {labels(First1(r1 )) generate(r1) break ;...labels(First1(rk )) generate(rk) break ;}
labels({α1, . . . , αm}) = label(α1): . . . label(αm):label(α) = case αlabel(ε) = default
54 / 55
![Page 134: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/134.jpg)
Topdown-Parsing
DiscussionA practical implementation of an RLL(1)-parser via recursive descent is astraight-forward ideaHowever, only a subset of the deterministic contextfree languages can be parsed thisway.As soon as First1(_) sets are not disjoint any more,
Solution 1: For many accessibly written grammars, the alternation between right hand sides happenstoo early. Keeping the common prefixes of right hand sides joined and introducing a new productionfor the actual diverging sentence forms often helps.Solution 2: Introduce ranked grammars, and decide conflicting lookahead always in favour of thehigher ranked alternative→ relation to LL parsing not so clear any more→ not so clear for _∗ operator how to decideSolution 3: Going from LL(1) to LL(k)The size of the occuring sets is rapidly increasing with larger kUnfortunately, even LL(k) parsers are not sufficient to accept all deterministic contextfreelanguages. (regular lookahead→ LL(∗))
In practical systems, this often motivates the implementation of k = 1 only ...
55 / 55
![Page 135: Topic: Syntactic Analysis · 2020. 5. 7. · Syntactic Analysis Token-StreamSyntaxtreeParser Syntactic analysis tries to integrate Tokens into larger program units. Such units may](https://reader035.vdocuments.mx/reader035/viewer/2022071510/612f9f071ecc51586943911d/html5/thumbnails/135.jpg)
Topdown-Parsing
DiscussionA practical implementation of an RLL(1)-parser via recursive descent is astraight-forward ideaHowever, only a subset of the deterministic contextfree languages can be parsed thisway.As soon as First1(_) sets are not disjoint any more,
Solution 1: For many accessibly written grammars, the alternation between right hand sides happenstoo early. Keeping the common prefixes of right hand sides joined and introducing a new productionfor the actual diverging sentence forms often helps.Solution 2: Introduce ranked grammars, and decide conflicting lookahead always in favour of thehigher ranked alternative→ relation to LL parsing not so clear any more→ not so clear for _∗ operator how to decideSolution 3: Going from LL(1) to LL(k)The size of the occuring sets is rapidly increasing with larger kUnfortunately, even LL(k) parsers are not sufficient to accept all deterministic contextfreelanguages. (regular lookahead→ LL(∗))
In practical systems, this often motivates the implementation of k = 1 only ...55 / 55