40-414 compiler design implementation of lexical analysis
TRANSCRIPT
![Page 1: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/1.jpg)
1
40-414 Compiler Design
Implementation of Lexical Analysis
Lecture 3
![Page 2: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/2.jpg)
Prof. Aiken 2
Notation
• There is variation in regular expression notation
• At least one: A+ AA*
• Union: A | B A + B
• Option: A + A?
• Range: ‘a’+’b’+…+’z’ [a-z]
• Excluded range:
complement of [a-z] [^a-z]
![Page 3: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/3.jpg)
Prof. Aiken 3
Regular Expressions in Lexical Specification
• Last Lecture: a specification for the predicate
s L(R)
• But a yes/no answer is not enough!
• Instead: partition the input into tokens
• We adapt regular expressions to this goal
c1c2c3 c4c5c6c7 …
Set of strings
![Page 4: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/4.jpg)
Prof. Aiken 4
Regular Expressions => Lexical Spec. (1)
1. Write a rexp for the lexemes of each token• Number = digit +
• Keyword = ‘if’ + ‘else’ + …
• Identifier = letter (letter + digit)*
• OpenPar = ‘(‘
• …
![Page 5: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/5.jpg)
Prof. Aiken 5
Regular Expressions => Lexical Spec. (2)
2. Construct R, matching all lexemes for all tokens
R = Keyword + Identifier + Number + …
= R1 + R2 + …
![Page 6: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/6.jpg)
Prof. Aiken 6
Regular Expressions => Lexical Spec. (3)
3. Let input be x1…xn
For 1 i n check
x1…xi L(R)
4. If success, then we know that
x1…xi L(Rj) for some j
5. Remove x1…xi from input and go to (3)
![Page 7: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/7.jpg)
Prof. Aiken 7
Ambiguities (1)
• There are ambiguities in the algorithm
• How much input is used? What if• x1…xi L(R) and also
• x1…xK L(R)
k≠i
• Rule: Pick longest possible string in L(R)
– The “maximal munch”
![Page 8: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/8.jpg)
Prof. Aiken 8
Ambiguities (2)
• Which token is used? What if• x1…xi L(Rj) and also
• x1…xi L(Rk)
k≠i
• Rule: use rule listed first (j if j < k)– Treats “if” as a keyword, not an identifier
Keyword = ‘if’ + ‘else’ + …
Identifier = letter (letter + digit)*
R = R1 + R2 + R3 + …
![Page 9: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/9.jpg)
Prof. Aiken 9
Error Handling
• What ifNo rule matches a prefix of input ?
x1…xi L(Rj)
• Problem: Can’t just get stuck …
• Solution:
– Write a rule matching all “bad” strings
– Put it last (lowest priority)
![Page 10: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/10.jpg)
Prof. Aiken 10
Summary
• Regular expressions provide a concise notation for string patterns
• Use in lexical analysis requires small extensions– To resolve ambiguities
– To handle errors
• Good algorithms known– Require only single pass over the input
– Few operations per character (table lookup)
![Page 11: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/11.jpg)
Prof. Aiken 11
Finite Automata
• Regular expressions = specification
• Finite automata = implementation
• A finite automaton consists of– An input alphabet
– A finite set of states S
– A start state n
– A set of accepting states F S
– A set of transitions state input state
![Page 12: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/12.jpg)
Prof. Aiken 12
Finite Automata
• Transition
s1 a s2
• Is read
In state s1 on input “a” go to state s2
• If end of input and in accepting state => accept
• Otherwise => reject • Terminates in a state s that is
NOT an accepting state (s F)• Gets stuck
![Page 13: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/13.jpg)
Prof. Aiken 13
Finite Automata State Graphs
• A state
• The start state
• An accepting state
• A transitiona
![Page 14: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/14.jpg)
Prof. Aiken 14
A Simple Example
• A finite automaton that accepts only “1”
1
• Accepts ‘1’ : 1, 1
• Rejects ‘0’ : 0
• Rejects ’10’ : 1, 10
A B
![Page 15: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/15.jpg)
Prof. Aiken15
Another Simple Example
• A finite automaton accepting any number of 1’s followed by a single 0
• Alphabet: {0,1}
0
1
• Accepts ‘110’: 110, 110, 110, 110
• Rejects ‘100’ : 100, 100, 100
A B
![Page 16: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/16.jpg)
Prof. Aiken 16
And Another Example
• Alphabet {0,1}
• What language does this recognize?
0
1
0
1
0
1
![Page 17: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/17.jpg)
Prof. Aiken 17
And Another Example
0
1
0
1
0
1(0 + 1)*
(1* + 0)(1 + 0)
1* + (01)* + (001)* + (000*1)*
(0 + 1)*00
Select the regular
language that denotes
the same language as
this finite automaton
![Page 18: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/18.jpg)
18
And Another Example
0
1
0
1
0
1(0 + 1)*
(1* + 0)(1 + 0)
1* + (01)* + (001)* + (000*1)*
(0 + 1)*00
Select the regular
language that denotes
the same language as
this finite automaton
![Page 19: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/19.jpg)
Prof. Aiken 19
Epsilon Moves
• Another kind of transition: -moves
• Machine can move from state A to state Bwithout reading input
A B
![Page 20: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/20.jpg)
Prof. Aiken 20
Deterministic and Nondeterministic Automata
• Deterministic Finite Automata (DFA)– One transition per input per state
– No -moves
• Nondeterministic Finite Automata (NFA)
– Can have multiple transitions for one input in a given state
– Can have -moves
![Page 21: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/21.jpg)
Prof. Aiken 21
Execution of Finite Automata
• A DFA can take only one path through the state graph– Completely determined by input
• NFAs can choose– Whether to make -moves
– Which of multiple transitions for a single input to take
![Page 22: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/22.jpg)
Prof. Aiken22
Acceptance of NFAs
• An NFA can get into multiple states
• Input:
• Possible States:
0
1
0
0
Rule: NFA accepts if it can get to a final state
![Page 23: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/23.jpg)
Prof. Aiken23
Acceptance of NFAs
• An NFA can get into multiple states
• Input:
• Possible States:
0
1
0
0
1
Rule: NFA accepts if it can get to a final state
A
![Page 24: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/24.jpg)
Prof. Aiken24
Acceptance of NFAs
• An NFA can get into multiple states
• Input:
• Possible States:
0
1
0
0
1
Rule: NFA accepts if it can get to a final state
A
{A}
![Page 25: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/25.jpg)
Prof. Aiken25
Acceptance of NFAs
• An NFA can get into multiple states
• Input:
• Possible States:
0
1
0
0
1 0
Rule: NFA accepts if it can get to a final state
A
{A}
![Page 26: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/26.jpg)
Prof. Aiken26
Acceptance of NFAs
• An NFA can get into multiple states
• Input:
• Possible States:
0
1
0
0
1 0
Rule: NFA accepts if it can get to a final state
A B
{A} {A, B}
![Page 27: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/27.jpg)
Prof. Aiken27
Acceptance of NFAs
• An NFA can get into multiple states
• Input:
• Possible States:
0
1
0
0
1 0 0
Rule: NFA accepts if it can get to a final state
A B
{A} {A, B}
![Page 28: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/28.jpg)
Prof. Aiken28
Acceptance of NFAs
• An NFA can get into multiple states
• Input:
• Possible States:
0
1
0
0
1 0 0
Rule: NFA accepts if it can get to a final state
CA B
{A, B, C}{A} {A, B}
![Page 29: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/29.jpg)
Prof. Aiken 29
NFA vs. DFA (1)
• NFAs and DFAs recognize the same set of languages (regular languages)
• DFAs are faster to execute– There are no choices to consider
• NFAs are, in general, smaller– Sometimes exponentially smaller
![Page 30: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/30.jpg)
Prof. Aiken 30
NFA vs. DFA (2)
• For a given language NFA can be simpler than DFA
01
0
0
01
0
1
0
1
NFA
DFA
• DFA can be exponentially larger than NFA
![Page 31: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/31.jpg)
31
Regular Expressions to Finite Automata
• High-level sketch
Regularexpressions
NFA
DFA
LexicalSpecification
Table-driven Implementation of DFA
Prof. Aiken
![Page 32: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/32.jpg)
32
Regular Expressions to NFA (1)
• For each kind of rexp, define an NFA– Notation: NFA for rexp M
M
• For
• For input aa
Prof. Aiken
![Page 33: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/33.jpg)
33
Regular Expressions to NFA (2)
• For AB
A B
• For A + B
A
B
Prof. Aiken
![Page 34: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/34.jpg)
34
Regular Expressions to NFA (3)
• For A*
A
Prof. Aiken
![Page 35: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/35.jpg)
35
Example of RegExp -> NFA conversion
• Consider the regular expression
(1+0)*1
• The NFA is
B
1C E
0D F
G
A H1
I J
Prof. Aiken
![Page 36: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/36.jpg)
36
NFA to DFA: The Trick
• Simulate the NFA
• Each state of DFA = a non-empty subset of states of the NFA
• Start state = -closure of the start state of NFA
• Add a transition S a S’ to DFA iff– S’ is the set of NFA states reachable from any state
in S after seeing the input a, considering -moves as well
• Final states Subsets that include at least one final state of NFA
Prof. Aiken
![Page 37: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/37.jpg)
37
-closure of a state
Prof. Aiken
-closure(B)= {B,C,D}
-closure(G)= {A,B,C,D,G,H,I}
![Page 38: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/38.jpg)
38
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
Prof. Aiken
![Page 39: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/39.jpg)
39
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
Prof. Aiken
![Page 40: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/40.jpg)
40
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
Prof. Aiken
![Page 41: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/41.jpg)
41
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
ABCDHI
Prof. Aiken
![Page 42: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/42.jpg)
42
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
ABCDHI
0
Prof. Aiken
![Page 43: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/43.jpg)
43
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
ABCDHI
0
Prof. Aiken
![Page 44: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/44.jpg)
44
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
ABCDHI
0
Prof. Aiken
![Page 45: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/45.jpg)
45
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
ABCDHI
0
1
Prof. Aiken
![Page 46: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/46.jpg)
46
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
ABCDHI
0
1
Prof. Aiken
![Page 47: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/47.jpg)
47
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
EJGHIABCD
ABCDHI
0
1
Prof. Aiken
![Page 48: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/48.jpg)
48
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
EJGHIABCD
ABCDHI
0
1
0
Prof. Aiken
![Page 49: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/49.jpg)
49
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
EJGHIABCD
ABCDHI
0
1
0 1
Prof. Aiken
![Page 50: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/50.jpg)
50
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
EJGHIABCD
ABCDHI
0
1
0
0 1
Prof. Aiken
![Page 51: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/51.jpg)
51
NFA -> DFA Example
1
01
A BC
D
E
FG H I J
FGHIABCD
EJGHIABCD
ABCDHI
0
1
0
10 1
Prof. Aiken
![Page 52: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/52.jpg)
Prof. Aiken 52
Implementation
• A DFA can be implemented by a 2D table T– One dimension is “states”
– Other dimension is “input symbol”
– For every transition Si a Sk define T[i,a] = k
• DFA “execution”– If in state Si and input a, read T[i,a] = k and skip to
state Sk
– Very efficient
![Page 53: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/53.jpg)
Prof. Aiken 53
Table Implementation of a DFA
S
T
U
0
1
0
10 1
0 1
S T U
T T U
U T U
![Page 54: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/54.jpg)
Prof. Aiken 54
Implementation (Cont.)
• NFA -> DFA conversion is at the heart of tools such as flex
• But, DFAs can be huge
• In practice, flex-like tools trade off speed for space in the choice of NFA and DFA representations
![Page 55: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/55.jpg)
55
DFA for recognizing two relational operators
start
other
=>0 6 7
8* return(SYMBOL, >)
return(SYMBOL, >=)
We’ve accepted “>” and have read “other” character that must be
unread. That is moving the input pointer one character back.
![Page 56: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/56.jpg)
56
DFA of Pascal relational operators
start <0
other
=6 7
8
return(SYMBOL, <=)
5
4
>
=1 2
3
other
>
=
*
*
return(SYMBOL, <>)
return(SYMBOL, <)
return(SYMBOL, =)
return(SYMBOL, >=)
return(SYMBOL, >)
![Page 57: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/57.jpg)
57
DFA for recognizing id and keyword
return(get_token(), install_id())
start letter0
other109
letter or digit
*
If the token is an ID, its lexeme
is inserted into the symbol
table (only one record for each
lexeme); and lexeme of the
token is returned.
returns either a KEYWORD or ID based on
the type of the token
![Page 58: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/58.jpg)
58
DFA of Pascal Unsigned Numbers
return(NUM, lexeme of the number)
170 1211 1413 1615start otherdigit . digit E + | - digit
digit
digit
digit
E
digit
*
other
other
![Page 59: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/59.jpg)
59
Lexical errors
• Some errors are out of power of lexical analyzer to recognize:
fi (a == f(x)) …
• However, it may be able to recognize errors like:
d = 2r
• Such errors are recognized when no pattern for tokens matches a character sequence
![Page 60: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/60.jpg)
60
Error recovery
• Panic mode: successive characters are ignored until we reach to a well formed token
• Delete one character from the remaining input
• Insert a missing character into the remaining input
• Replace a character by another character
• Transpose two adjacent characters
![Page 61: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/61.jpg)
61
Lexical Analyzer in Perspective (Revisited)
lexical
analyzer parser
symbol
table
source
program
token
<type, attribute>
get next
token
output
position = initial + rate * 60
<id, 1> <op, = > <id, 2> <op, + > <id, 3> <op, * > <num, 60 >
Symbol Table
key lexeme type …
1 position real …
2 initial real …
3 rate real …
![Page 62: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/62.jpg)
62
Using Buffer to Enhance Efficiency
*M=E eof2**C
Current token
lexeme beginning forward (scans
ahead to find
pattern match)
if forward at end of first half then begin
reload second half ;
forward : = forward + 1
end
else if forward at end of second half then begin
reload first half ;
move forward to biginning of first half
end
else forward : = forward + 1 ;
Block I/O
Block I/O
![Page 63: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/63.jpg)
63
Algorithm: Buffered I/O with Sentinels
eof*M=E eofeof2**C
Current token
lexeme beginning forward (scans
ahead to find
pattern match)
forward : = forward + 1 ;
if forward is at eof then begin
if forward at end of first half then begin
reload second half ;
forward : = forward + 1
end
else if forward at end of second half then begin
reload first half ;
move forward to biginning of first half
end
else / * eof within buffer signifying end of input * /
terminate lexical analysis
end 2nd eof no more input !
Block I/O
Block I/O
![Page 64: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/64.jpg)
64
Question?
c*
b+
ab
ac*
Consider the string abbbaacc . Which of the following lexical
specifications produces the tokenization:
Choose all that apply
a(b + c*)
b+
ab
b+
ac*
Prof. Aiken
ab/bb/a/acc
b+
ab*
ac*
![Page 65: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/65.jpg)
65
Question?
1, 3
3
4
2, 3
Using the lexical specification below, how is the string “dictatorial”
tokenized?
dict (1)
dictator (2)
[a-z]* (3)
dictatorial (4)
Choose all that apply
Prof. Aiken
![Page 66: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/66.jpg)
66
Question?
babad will be tokenized as: bab/a/d
Given the following lexical specification:
a(ba)*
b*(ab)*
abd
d+
Which of the following
statements is true?
Choose all that apply
ababdddd will be tokenized as: abab/dddd
dddabbabab will be tokenized as: ddd/a/bbabab
ababddababa will be tokenized as: ab/abd/d/ababa
Prof. Aiken
![Page 67: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/67.jpg)
67
Question?
011110
Given the following lexical specification:
(00)*
01+
10+
Prof. Aiken
Which strings are NOT
successfully processed by this
specification?
Choose all that apply
01100100
01100110
0001101
![Page 68: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/68.jpg)
68
Question?
(000)*(01)+
0(000)*1(01)*
(000)*(10)+
0(00)*(10)*
0(000)*(01)*
Which of the following regular
expressions generate the
same language as the one
recognized by this NFA?
Prof. Aiken
Choose all that apply
![Page 69: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/69.jpg)
69
Question?
Which of the following automata are DFA?
Prof. Aiken
Choose all that apply
![Page 70: 40-414 Compiler Design Implementation of Lexical Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022020622/61edb3dae1b39a29b712359d/html5/thumbnails/70.jpg)
70
Question?
Which of the following automata are NFA?
Prof. Aiken
Choose all that apply