Lab 3: Using ML-Yacc
![Page 1: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/1.jpg)
Lab 3: Using ML-Yacc
![Page 2: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/2.jpg)
How to write a parser?
- Write a parser by hand
- Use a parser generator
  - May not be as efficient as a hand-written parser
  - General and robust

How does it work?
- parser specification -> parser generator -> parser
- stream of tokens -> parser -> abstract syntax
![Page 3: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/3.jpg)
ML-Yacc specification
Three parts again:
User Declarations: declare values available in the rule actions
%%
ML-Yacc Definitions: declare terminals and non-terminals; special declarations to resolve conflicts
%%
Rules: the parser is specified by CFG rules and associated semantic actions that generate abstract syntax
![Page 4: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/4.jpg)
ML-Yacc Definitions
- specify the type of positions

    %pos int * int

- specify terminal and nonterminal symbols

    %term IF | THEN | ELSE | PLUS | MINUS ...
    %nonterm prog | exp | op

- specify the end-of-parse token

    %eop EOF

- specify the start symbol (by default, the nonterminal on the LHS of the first rule)

    %start prog
![Page 5: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/5.jpg)
A Simple ML-Yacc File

    %%
    %term NUM | PLUS | MUL | LPAR | RPAR | EOF
    %nonterm exp | fact | base
    %pos int
    %start exp
    %eop EOF
    %%
    exp  : fact ()
         | fact PLUS exp ()
    fact : base ()
         | base MUL fact ()
    base : NUM ()
         | LPAR exp RPAR ()

- grammar symbols: the %term and %nonterm declarations
- grammar rules: the productions after the second %%
- semantic actions: the parenthesized parts (currently they do nothing)
![Page 6: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/6.jpg)
- each nonterminal may have a semantic value associated with it
- when the parser reduces with (X ::= s), a semantic action is executed
  - the action uses the semantic values of the symbols in s
- when parsing completes successfully, the parser returns the semantic value associated with the start symbol
  - usually a syntax tree
![Page 7: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/7.jpg)
To use semantic values during parsing, we must declare symbol types:

    %term NUM of int | PLUS | MUL | ...
    %nonterm exp of int | fact of int | base of int

The type of a semantic action must match the type declared for the nonterminal on the left-hand side of its rule.
![Page 8: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/8.jpg)
A Simple ML-Yacc File with Action

    %%
    %term NUM of int | PLUS | MUL | LPAR | RPAR | EOF
    %nonterm exp of int | fact of int | base of int
    %pos int
    %start exp
    %eop EOF
    %%
    exp  : fact (fact)
         | fact PLUS exp (fact + exp)
    fact : base (base)
         | base MUL base (base1 * base2)
    base : NUM (NUM)
         | LPAR exp RPAR (exp)

- grammar symbols with type declarations: the %term and %nonterm lines
- grammar rules with semantic actions: the productions after the second %%
- the semantic actions compute an integer result
- when a symbol occurs more than once on the right-hand side, its occurrences are numbered (base1, base2)
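The arithmetic performed by these actions can be mimicked outside ML-Yacc. The following is a minimal Python sketch, not ML-Yacc output: a hand-written recursive-descent evaluator whose three inner functions mirror the rules for exp, fact and base. The `(kind, value)` token pairs and the name `evaluate` are assumptions made for illustration; ML-Yacc itself generates a table-driven LR parser rather than code of this shape.

```python
def evaluate(tokens):
    """Evaluate a list of (kind, value) tokens with the exp/fact/base rules."""
    pos = 0

    def peek():
        # Kind of the next unread token, or "EOF" when the input is exhausted.
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def take(kind):
        # Consume one token of the expected kind and return its value.
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()}")
        value = tokens[pos][1]
        pos += 1
        return value

    def base():
        # base : NUM (NUM) | LPAR exp RPAR (exp)
        if peek() == "NUM":
            return take("NUM")
        take("LPAR")
        value = exp()
        take("RPAR")
        return value

    def fact():
        # fact : base (base) | base MUL base (base1 * base2)
        base1 = base()
        if peek() == "MUL":
            take("MUL")
            base2 = base()
            return base1 * base2
        return base1

    def exp():
        # exp : fact (fact) | fact PLUS exp (fact + exp)
        value = fact()
        if peek() == "PLUS":
            take("PLUS")
            return value + exp()
        return value

    result = exp()
    if peek() != "EOF":
        raise SyntaxError(f"unexpected {peek()} after expression")
    return result


# 2 + 3 * 4
tokens = [("NUM", 2), ("PLUS", None), ("NUM", 3), ("MUL", None), ("NUM", 4)]
print(evaluate(tokens))
```

Because the rule for fact is `base MUL base`, an input such as 2 * 3 * 4 without parentheses is rejected by this sketch, just as it is not derivable from the grammar as written.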
![Page 9: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/9.jpg)
Conflicts in ML-Yacc
We often write ambiguous grammars. Example:

    exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR

- Tokens from lexer: NUM PLUS NUM MUL NUM
- State of parser: E+E
- To be read: MUL NUM
![Page 10: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/10.jpg)
Conflicts in ML-Yacc
Same grammar and tokens as on the previous slide; the state of the parser is E+E, and MUL NUM is still to be read.

If we shift:
- Shift: E+E*
- Shift: E+E*E
- Reduce: E+E
- Reduce: E

Result is: E+(E*E)
![Page 11: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/11.jpg)
Conflicts in ML-Yacc
Same grammar and tokens again; the state of the parser is E+E, and MUL NUM is still to be read.

If we reduce:
- Reduce: E
- Shift: E*
- Shift: E*E
- Reduce: E

Result is: (E+E)*E
![Page 12: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/12.jpg)
This is a shift-reduce conflict.
We want E+(E*E), because "*" has higher precedence than "+".

Another shift-reduce conflict:
- Tokens from lexer: NUM PLUS NUM PLUS NUM
- State of parser: E+E
- To be read: PLUS NUM

If we shift:
- Shift: E+E+
- Shift: E+E+E
- Reduce: E+E
- Reduce: E

Result is: E+(E+E)

If we reduce:
- Reduce: E
- Shift: E+
- Shift: E+E
- Reduce: E

Result is: (E+E)+E
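The two outcomes can be reproduced with a toy shift-reduce loop. This is a minimal Python sketch under simplifying assumptions: tokens are plain strings, the only operators are "+" and "*", and the hypothetical parameter `on_conflict` fixes one choice for every conflict. It is not how ML-Yacc is implemented; it only makes the grouping visible.

```python
def parse(tokens, on_conflict):
    """Return the parenthesized grouping produced by always shifting or always reducing."""
    stack = []

    def reduce():
        # Replace "E op E" on top of the stack by a single parenthesized E.
        right = stack.pop()
        op = stack.pop()
        left = stack.pop()
        stack.append(f"({left}{op}{right})")

    for tok in tokens:
        is_operator = tok in ("+", "*")
        # Conflict point: the stack holds "E op E" and the next token is an operator.
        if is_operator and len(stack) >= 3 and on_conflict == "reduce":
            reduce()
        stack.append(tok)  # shift

    # End of input: reduce whatever is left, right to left.
    while len(stack) > 1:
        reduce()
    return stack[0]


print(parse(["1", "+", "2", "*", "3"], "shift"))   # (1+(2*3))
print(parse(["1", "+", "2", "*", "3"], "reduce"))  # ((1+2)*3)
print(parse(["1", "+", "2", "+", "3"], "shift"))   # (1+(2+3))
print(parse(["1", "+", "2", "+", "3"], "reduce"))  # ((1+2)+3)
```

Neither fixed policy is right for both inputs: 1+2*3 needs a shift, while 1+2+3 needs a reduce.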
![Page 13: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/13.jpg)
Deal with shift-reduce conflicts
In this case we need to reduce, because "+" is left associative.

How to deal with it:
- let ML-Yacc complain
  - the default choice is to shift when it encounters a shift-reduce conflict
  - BAD: programmer intentions unclear; harder to debug other parts of your grammar; generally inelegant
- rewrite the grammar to eliminate ambiguity
  - can be complicated and less clear
- use Yacc precedence directives
  - %left, %right, %nonassoc
![Page 14: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/14.jpg)
Precedence and Associativity
- the precedence of a terminal is based on the order in which associativity is specified (later declarations have higher precedence)
- the precedence of a rule is the precedence of its right-most terminal
  - e.g. precedence of (E ::= E + E) == prec(+)
- a shift-reduce conflict is resolved as follows:
  - prec(terminal) > prec(rule) ==> shift
  - prec(terminal) < prec(rule) ==> reduce
  - prec(terminal) = prec(rule) ==>
    - assoc(terminal) = left ==> reduce
    - assoc(terminal) = right ==> shift
    - assoc(terminal) = nonassoc ==> report as error
![Page 15: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/15.jpg)
    datatype exp = Int of int
                 | Add of exp * exp
                 | Sub of exp * exp
                 | Mul of exp * exp
                 | Div of exp * exp
    %%
    %left PLUS MINUS
    %left MUL DIV
    %%
    exp : NUM              (Int NUM)
        | exp PLUS exp     (Add (exp1, exp2))
        | exp MINUS exp    (Sub (exp1, exp2))
        | exp MUL exp      (Mul (exp1, exp2))
        | exp DIV exp      (Div (exp1, exp2))
        | LPAR exp RPAR    (exp)

Higher precedence: the later declaration, %left MUL DIV.
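The resolution rule for shift-reduce conflicts can be written down as a small decision function. The following Python sketch is illustrative only: the names `build_tables` and `resolve` and the list-of-pairs encoding of the %left/%right/%nonassoc lines are assumptions, and the sketch assumes that both the lookahead terminal and the rule have a declared precedence.

```python
def build_tables(declarations):
    """Turn ordered (assoc, terminals) declarations into precedence and associativity maps."""
    prec, assoc = {}, {}
    # Later declarations get a larger number, i.e. higher precedence.
    for level, (kind, terminals) in enumerate(declarations, start=1):
        for terminal in terminals:
            prec[terminal] = level
            assoc[terminal] = kind
    return prec, assoc


def resolve(prec, assoc, lookahead, rule_rhs):
    """Return 'shift', 'reduce' or 'error' for a conflict between lookahead and a rule."""
    # The precedence of a rule is that of its right-most terminal.
    rule_terminal = [symbol for symbol in rule_rhs if symbol in prec][-1]
    if prec[lookahead] > prec[rule_terminal]:
        return "shift"
    if prec[lookahead] < prec[rule_terminal]:
        return "reduce"
    # Equal precedence: associativity of the lookahead terminal decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[assoc[lookahead]]


# Mirrors: %left PLUS MINUS  followed by  %left MUL DIV
prec, assoc = build_tables([("left", ["PLUS", "MINUS"]), ("left", ["MUL", "DIV"])])
print(resolve(prec, assoc, "MUL", ["exp", "PLUS", "exp"]))   # shift
print(resolve(prec, assoc, "PLUS", ["exp", "MUL", "exp"]))   # reduce
print(resolve(prec, assoc, "PLUS", ["exp", "PLUS", "exp"]))  # reduce
```

With these tables, E+E followed by MUL shifts, giving E+(E*E), and E+E followed by PLUS reduces, giving (E+E)+E.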
![Page 16: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/16.jpg)
Reduce-reduce Conflict
This kind of conflict is more difficult to deal with.

Example:

    sequence  ::= (empty) | maybeword | sequence word
    maybeword ::= (empty) | word

When we get a "word" from the lexer:
- word -> maybeword -> sequence (rule 1)
- empty -> sequence, then sequence word -> sequence (rule 2)

We have more than one way to get "sequence" from the input "word".
![Page 17: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/17.jpg)
Reduce-reduce Conflict
- A reduce-reduce conflict means there are two or more rules that apply to the same sequence of input. This usually indicates a serious error in the grammar.
- ML-Yacc reduces by the first rule.
- Generally, reduce-reduce conflicts are not allowed in your ML-Yacc file.
- We need to fix our grammar:

    sequence ::= (empty) | sequence word
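One way to see the ambiguity is to count the distinct parse trees each grammar admits for an input of n words. The sketch below is a hand-derived Python illustration with hypothetical function names, not something ML-Yacc produces; each branch corresponds to one alternative of the rule for sequence.

```python
def count_ambiguous(n):
    """Parse trees for n words under: sequence ::= (empty) | maybeword | sequence word."""
    total = 0
    if n == 0:
        total += 1  # sequence ::= (empty)
    if n in (0, 1):
        total += 1  # sequence ::= maybeword, with maybeword ::= (empty) | word
    if n >= 1:
        total += count_ambiguous(n - 1)  # sequence ::= sequence word
    return total


def count_fixed(n):
    """Parse trees for n words under: sequence ::= (empty) | sequence word."""
    if n == 0:
        return 1  # sequence ::= (empty)
    return count_fixed(n - 1)  # sequence ::= sequence word


for n in range(3):
    print(n, count_ambiguous(n), count_fixed(n))
```

For a single word the original grammar admits three trees: the two listed on the slide, plus one in which an empty maybeword is first reduced to sequence. The fixed grammar admits exactly one tree for every input length.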
![Page 18: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/18.jpg)
Summary of conflicts
- Shift-reduce conflict
  - resolve with precedence and associativity
  - shift by default
- Reduce-reduce conflict
  - reduce by first rule
  - not allowed!
![Page 19: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/19.jpg)
Lab3
Your job is to finish a parser for the C language.
- Input: a ".c" file
- Output: "Success!" if the ".c" file is correct
- File description: c.lex, c.grm, main.sml, call-main.sml, sources.cm, lab3.mlb, test.c
![Page 20: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/20.jpg)
Using ML-Yacc
- Read the ML-Yacc manual.
- Run: when you have finished "c.grm" and "c.lex", on the command line (using MLton's tools):

    mlyacc c.grm
    mllex c.lex

  We will get "c.grm.sig", "c.grm.sml", "c.grm.desc" and "c.lex.sml".
- Then compile Lab3: start SML/NJ and run CM.make("sources.cm"), or on the command line run mlton lab3.mlb
- To run lab3: in SML/NJ, Main.parse("test.c"), or on the command line, lab3 test.c
![Page 21: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/21.jpg)
"Debug" ML-Yacc File
When you run mlyacc, you'll see error messages if your ML-Yacc file has conflicts. For example:

    mlyacc c.grm
    2 shift/reduce conflicts

Open the file "c.grm.desc" (this file is generated by mlyacc). The beginning of this file:

    2 shift/reduce conflicts
    error: state 0: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
    error: state 1: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)

The rest are all the states:

    state 0:
        prog : . structs vdecs preds funcs
        MYSTRUCT shift 3
        prog goto 429
        structs goto 2
        structdec goto 1
        . reduce by rule 12

"rule 12" means rule number 12 in your ML-Yacc file, counting from 0.
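When a grammar has many conflicts, it can help to pull the conflict lines out of "c.grm.desc" automatically. The following Python sketch assumes the shift/reduce lines look exactly like the sample above; the function name `find_conflicts` is hypothetical, and other kinds of messages in the file are simply ignored.

```python
import re

# Matches lines such as:
#   error: state 0: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
CONFLICT = re.compile(
    r"state (\d+): shift/reduce conflict \(shift (\w+), reduce by rule (\d+)\)"
)


def find_conflicts(desc_text):
    """Return (state, shifted_terminal, rule_number) for each shift/reduce conflict line."""
    return [
        (int(state), terminal, int(rule))
        for state, terminal, rule in CONFLICT.findall(desc_text)
    ]


sample = """2 shift/reduce conflicts
error: state 0: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
error: state 1: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
"""
for state, terminal, rule in find_conflicts(sample):
    print(f"state {state}: shift {terminal} or reduce by rule {rule}")
```

To use it on a real file, read the text of "c.grm.desc" and pass it to `find_conflicts`; the state numbers then tell you which state listings to inspect.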
![Page 22: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/22.jpg)
Use ML-Lex with ML-Yacc
- Most of the work in "c.lex" this time can be copied from Lab2.
  - You can re-use the regular expressions and lexical rules.
- Difference from Lab2: you have to define the tokens in "c.grm":

    %term INT of int | EOF

- The "%term" declarations in "c.grm" are automatically turned into a signature in "c.grm.sig":

    signature C_TOKENS =
    sig
      type ('a,'b) token
      type svalue
      val EOF: 'a * 'a -> (svalue,'a) token
      val INT: (int) * 'a * 'a -> (svalue,'a) token
    end
![Page 23: Lab 3: Using ML-Yacc](https://reader035.vdocuments.mx/reader035/viewer/2022081417/56815735550346895dc4d472/html5/thumbnails/23.jpg)
Hints
- Read the ML-Yacc manual
- Read our language specification
- Test a lot!