CSC 3315 Lexical and Syntax Analysis
Hamid Harroud, School of Science and Engineering, Akhawayn University. http://www.aui.ma/~H.Harroud/csc3315/
![Page 1: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/1.jpg)
CSC3315 (Spring 2009)
CSC 3315
Lexical and Syntax Analysis
Hamid Harroud
School of Science and Engineering, Akhawayn University
http://www.aui.ma/~H.Harroud/csc3315/
![Page 2: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/2.jpg)
Constructing a Lexical Analyzer
state = S // S is the start state
repeat {
    k = next character from the input
    if k == EOF { // the end of input
        if state is a final state then accept
        else reject
    }
    state = T[state, k] // look up the next state in the transition table
    if state == empty then reject // got stuck
}
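The table-driven loop above can be sketched in Java as follows. This is only an illustration under stated assumptions: the state numbers, the character classes, and the choice of language (identifiers made of a letter followed by letters or digits) are hypothetical and do not come from the slides. `START` plays the role of S, and `EMPTY` marks a missing table entry.

```java
/** Table-driven recognizer sketch: a letter followed by letters or digits. */
final class TableDrivenDfa {
    // Hypothetical state numbering; EMPTY stands for "no transition".
    static final int START = 0, IN_ID = 1, EMPTY = -1;

    // Character classes used as column indices of the transition table T.
    static final int LETTER = 0, DIGIT = 1, OTHER = 2;

    // T[state][class]: rows are states, columns are character classes.
    static final int[][] T = {
        /* START */ { IN_ID, EMPTY, EMPTY },
        /* IN_ID */ { IN_ID, IN_ID, EMPTY },
    };

    static int classify(char c) {
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) return LETTER;
        if (c >= '0' && c <= '9') return DIGIT;
        return OTHER;
    }

    static boolean isFinal(int state) {
        return state == IN_ID;
    }

    static boolean accepts(String input) {
        int state = START;
        for (int i = 0; i < input.length(); i++) {
            state = T[state][classify(input.charAt(i))];
            if (state == EMPTY) return false; // got stuck
        }
        return isFinal(state); // end of input: accept only in a final state
    }

    public static void main(String[] args) {
        for (String s : new String[] { "x1", "count", "9lives", "" }) {
            System.out.println("\"" + s + "\" -> " + (accepts(s) ? "accept" : "reject"));
        }
    }
}
```

Indexing the table by character class rather than by raw character keeps the table small; a full scanner would typically use one column per class that the token patterns distinguish.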
![Page 3: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/3.jpg)
Constructing a Lexical Analyzer
![Page 4: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/4.jpg)
Constructing a Lexical Analyzer
int LexAnalyzer() {
    getChar();
    if (isLetter(nextChar)) {
        addChar();
        getChar();
        while (isLetter(nextChar) || isDigit(nextChar)) {
            addChar();
            getChar();
        }
        return lookup(lexeme);
    }
    . . .
![Page 5: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/5.jpg)
Constructing a Lexical Analyzer
int LexAnalyzer() {
    getChar();
    if (isLetter(nextChar)) {
        . . .
    }
    else if (isDigit(nextChar)) {
        addChar();
        getChar();
        while (isDigit(nextChar)) {
            addChar();
            getChar();
        }
        return INT_LIT; // no break needed: the return already leaves the function
    }
}
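A self-contained Java version of this hand-coded approach might look like the sketch below. Several details are assumptions rather than part of the slides: the token codes other than `INT_LIT`, the whitespace skipping, and the `UNKNOWN` fallback are hypothetical, and the keyword `lookup(lexeme)` step is omitted, so every letter-initial lexeme is reported as `IDENT`. The lookahead character is read once in the constructor instead of at the start of each call, so the character that ends one token is still available to start the next.

```java
/** Hand-coded scanner sketch mirroring the LexAnalyzer() routine. */
final class HandScanner {
    // Hypothetical token codes; the slides only name INT_LIT.
    static final int IDENT = 1, INT_LIT = 2, UNKNOWN = 3, EOF = -1;

    private final String input;
    private int pos = 0;   // index of the next unread character
    private int nextChar;  // current lookahead character, or -1 at end of input
    private final StringBuilder lexeme = new StringBuilder();

    HandScanner(String input) {
        this.input = input;
        getChar(); // prime the lookahead
    }

    private void getChar() {
        nextChar = pos < input.length() ? input.charAt(pos++) : -1;
    }

    private void addChar() {
        lexeme.append((char) nextChar);
    }

    private static boolean isLetter(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(int c) {
        return c >= '0' && c <= '9';
    }

    String lexeme() {
        return lexeme.toString();
    }

    int nextToken() {
        lexeme.setLength(0);
        while (nextChar == ' ' || nextChar == '\t' || nextChar == '\n') {
            getChar(); // skip whitespace between tokens
        }
        if (nextChar == -1) return EOF;
        if (isLetter(nextChar)) {
            do { addChar(); getChar(); } while (isLetter(nextChar) || isDigit(nextChar));
            return IDENT;
        }
        if (isDigit(nextChar)) {
            do { addChar(); getChar(); } while (isDigit(nextChar));
            return INT_LIT;
        }
        addChar();
        getChar(); // consume one unrecognized character
        return UNKNOWN;
    }

    public static void main(String[] args) {
        HandScanner s = new HandScanner("sum 42 x9");
        for (int t = s.nextToken(); t != EOF; t = s.nextToken()) {
            System.out.println(t + " " + s.lexeme());
        }
    }
}
```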
![Page 6: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/6.jpg)
Lexical Errors
Consider the following two programs:
![Page 7: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/7.jpg)
Lexical Errors
![Page 8: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/8.jpg)
Jlex: a scanner generator
jlex specification (xxx.jlex) -> JLex.Main (java) -> generated scanner (xxx.jlex.java)
generated scanner (xxx.jlex.java) -> javac -> Yylex.class
input program (test.sim) + Yylex.class -> P.main (java) -> output of P.main
![Page 9: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/9.jpg)
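Jlex: a scanner generator
import java.io.FileReader;
import java.io.IOException;
import java_cup.runtime.Symbol; // assumes Symbol is the CUP runtime class

public class P {
    public static void main(String[] args) throws IOException {
        FileReader inFile = new FileReader(args[0]);
        Yylex scanner = new Yylex(inFile);
        Symbol token = scanner.next_token(); // read the first token
        while (token.sym != sym.EOF) {
            switch (token.sym) {
            case sym.INTLITERAL:
                System.out.println("INTLITERAL (" + ((IntLitTokenVal)token.value).intVal + ")");
                break;
            …
            }
            token = scanner.next_token(); // advance to the next token
        }
    }
}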
![Page 10: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/10.jpg)
Regular expression rules
regular-expression { action }
    regular-expression: pattern to be matched
    action: code to be executed when the pattern is matched
When the next_token() method is called, it repeats:
    Find the longest sequence of characters in the input (starting with the current character) that matches a pattern.
    Perform the associated action
until a return in an action is executed.
![Page 11: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/11.jpg)
Matching rules
If several patterns match sequences of characters starting at the current character, the pattern that matches the longest sequence is considered to be matched.
If several patterns match the same (longest) sequence of characters, then the first such pattern is considered to be matched,
so the order of the patterns can be important!
If an input character is not matched by any pattern, the scanner throws an exception.
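The three matching rules can be illustrated with a small Java sketch built on `java.util.regex`. This is not how JLex works internally (JLex compiles the patterns into a single automaton); it only imitates the observable behavior. The rule names, the `if` keyword rule, and the `tokenize` helper are hypothetical. Note also that `java.util.regex` is a backtracking engine, so each rule's own match is only guaranteed to be its longest for simple patterns like the ones used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of longest-match, first-rule-wins tokenizing. */
final class LongestMatchSketch {
    // Rule order matters only when two rules match the same length.
    static final String[] NAMES = { "IF", "ID", "ASSIGN", "EQUALS", "WS" };
    static final Pattern[] RULES = {
        Pattern.compile("if"),
        Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"),
        Pattern.compile("="),
        Pattern.compile("=="),
        Pattern.compile("[ \\t\\n]+"),
    };

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int best = -1;
            int bestLen = 0;
            for (int i = 0; i < RULES.length; i++) {
                Matcher m = RULES[i].matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at the start of the region.
                // The strict '>' keeps the earliest rule when lengths tie.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    best = i;
                    bestLen = m.end() - pos;
                }
            }
            if (best < 0) {
                // No rule matches the current character.
                throw new IllegalArgumentException("unmatched character at offset " + pos);
            }
            if (!NAMES[best].equals("WS")) { // the whitespace action produces no token
                tokens.add(NAMES[best] + ":" + input.substring(pos, pos + bestLen));
            }
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [IF:if, ID:iffy, EQUALS:==, ID:x, ASSIGN:=, ID:y]
        System.out.println(tokenize("if iffy == x = y"));
    }
}
```

In this run, `if` is matched by both the `IF` and `ID` rules with the same length, so the earlier rule wins, while `iffy` and `==` are decided by the longest match regardless of rule order.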
![Page 12: CSC 3315 Lexical and Syntax Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062518/56814043550346895dabaf99/html5/thumbnails/12.jpg)
An Example
%%
DIGIT= [0-9]
LETTER= [a-zA-Z]
WHITESPACE= [ \t\n] // space, tab, newline
%%
{LETTER}({LETTER}|{DIGIT})* {System.out.println(yyline+1 + ": ID " + yytext());}
{DIGIT}+ {System.out.println(yyline+1 + ": INT");}
"=" {System.out.println(yyline+1 + ": ASSIGN");}
"==" {System.out.println(yyline+1 + ": EQUALS");}
{WHITESPACE}* { }
. {System.out.println(yyline+1 + ": bad char");}