Mastering Grammars with PetitParser
DESCRIPTION
PetitParser is a dynamic parser framework combining ideas from scannerless parsing, parser combinators, parsing expression grammars and packrat parsers. In this hands-on session we learn how to build simple parsers and how to model, test, compose and reuse complex grammars. Additionally, we will look at some tools and the reflective facilities provided by the PetitParser framework. Basic knowledge of the Smalltalk programming language is a requirement. Bring your laptop to reproduce the examples and solve some simple tasks.
TRANSCRIPT
Mastering Grammars with PetitParser
Lukas Renggli
Agenda
1. PetitParser in a Nutshell
2. Combinatorial Parsing
3. Complex Grammars
4. Advanced Use
1. PetitParser in a Nutshell
ID ::= letter { letter | digit } ;
(Diagram: the ID production drawn as a graph of parsers: a sequence of letter (a..z) followed by many of a choice between letter (a..z) and digit (0..9).)
id := #letter asParser , (#letter asParser / #digit asParser) star.
id parse: 'yeah'. " --> #($y #($e $a $h)) "
id parse: 'f12'. " --> #($f #($1 $2)) "
id parse: '123'. " --> letter expected at 0 "
Example 1
‣ Test the identifier parser
‣ Inspect the parser object
‣ Create and test an integer parser
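For the last task, one possible integer parser is sketched below. It assumes PetitParser is loaded in a Pharo image and is evaluated in a playground; the variable name integer is chosen here for illustration.

```smalltalk
"One or more digits, flattened to a string and converted to a number."
integer := #digit asParser plus flatten ==> [ :string | string asNumber ].

integer parse: '42'.                     " --> 42 "
(integer parse: 'abc') isPetitFailure.   " --> true "
```

Sending isPetitFailure to the result is the usual way to tell a failure from a parsed value.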
2. Combinatorial Parsing
Parser Terminals
$a asParser          letter ‘a’
'abc' asParser       string ‘abc’
#any asParser        any character
#digit asParser      the digits 0..9
#letter asParser     the letters a..z and A..Z
nil asParser         the empty parser
The symbol-based factory methods (#any, #digit, #letter, ...) are defined in PPPredicateObjectParser class.
Parser Operators
p1 , p2              sequence
p1 / p2              ordered choice
p star               zero-or-more (0..*)
p plus               one-or-more (1..*)
p optional           zero-or-one (0..1)
p1 separatedBy: p2
p1 delimitedBy: p2
See the operations protocol in PPParser for more operators.
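The operators above can be tried out in a playground. The following is a small sketch, assuming PetitParser is loaded; the names digit and comma are introduced here for illustration.

```smalltalk
digit := #digit asParser.
comma := $, asParser.

"Sequence: both parsers must match, one after the other."
(digit , digit) parse: '12'.              " --> #($1 $2) "

"Ordered choice: the first alternative that matches wins."
(digit / #letter asParser) parse: 'x'.    " --> $x "

"Repetition and option."
digit star parse: ''.                     " --> #() "
digit optional parse: 'x'.                " --> nil "

"One or more digits separated by commas; the result
 interleaves the elements and the separators."
(digit separatedBy: comma) parse: '1,2,3'.
```

delimitedBy: behaves like separatedBy:, but additionally accepts a trailing separator.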
Parser Predicates
p and                conjunction (non-consuming look-ahead)
p not                negation (non-consuming look-ahead)
p end                end of input
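A typical use of the negative look-ahead is to keep a keyword from matching the beginning of a longer identifier. A minimal sketch, assuming PetitParser is loaded (the name keyword is illustrative):

```smalltalk
"The keyword 'if', but only when no further word character follows."
keyword := 'if' asParser , #word asParser not.

keyword parse: 'if (x)'.                  " --> #('if' nil) "
(keyword parse: 'iffy') isPetitFailure.   " --> true "

"end fails unless the whole input has been consumed."
(#digit asParser plus end parse: '12a') isPetitFailure.   " --> true "
```

The look-ahead consumes nothing, so it contributes nil to the result of the sequence.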
Parser Actions
p ==> [ :arg | ]     transformation
p flatten            create string
p token              create token
p trim               trim whitespace
See the operations-mapping protocol in PPParser for more actions.
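The effect of each action is easiest to see on the same base parser. A short sketch, assuming PetitParser is loaded (the name letters is illustrative):

```smalltalk
letters := #letter asParser plus.

"Without an action the result is an array of characters."
letters parse: 'abc'.                      " --> #($a $b $c) "

"flatten answers the consumed input as a string."
letters flatten parse: 'abc'.              " --> 'abc' "

"trim skips whitespace before and after the parser."
letters flatten trim parse: '  abc  '.     " --> 'abc' "

"==> passes the result through a block."
(letters flatten ==> [ :string | string size ]) parse: 'abc'.   " --> 3 "

"token answers a PPToken that also records the position in the input."
letters token parse: 'abc'.
```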
term ::= prod "+" term | prod ;
prod ::= prim "*" prod | prim ;
prim ::= "(" term ")" | number ;
number ::= "0" .. "9" ;
number := #digit asParser plus flatten trim ==> [ :string | string asNumber ].
term := PPUnresolvedParser new.
prod := PPUnresolvedParser new.
prim := PPUnresolvedParser new.
term def: (prod , $+ asParser trim , term
	==> [ :nodes | nodes first + nodes last ])
	/ prod.
prod def: (prim , $* asParser trim , prod
	==> [ :nodes | nodes first * nodes last ])
	/ prim.
prim def: ($( asParser trim , term , $) asParser trim
	==> [ :nodes | nodes second ])
	/ number.
start := term end.
start parse: '1 + 2 * 3'. " --> 7 "
start parse: '(1 + 2) * 3'. " --> 9 "
Example 2
‣ Add support for negative numbers
‣ Add support for floating point numbers
‣ Add support for subtraction & division
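For the first two tasks, the number production could be extended roughly as follows. This is only a sketch of one possible solution, assuming PetitParser is loaded; it accepts an optional minus sign and an optional fraction.

```smalltalk
"Optional minus sign, integer part, optional fraction."
number := ($- asParser optional , #digit asParser plus ,
	($. asParser , #digit asParser plus) optional)
		flatten trim ==> [ :string | string asNumber ].

number parse: '42'.      " --> 42 "
number parse: '-7'.      " --> -7 "
number parse: '3.25'.    " --> 3.25 "
```

Subtraction and division need more care: with right-recursive productions like term and prod, an input such as 1 - 2 - 3 would group as 1 - (2 - 3).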
3. Complex Grammars
PetitParser Scripts
‣ quick to write
‣ embed into Smalltalk
‣ hard to test
‣ difficult to reuse
‣ messy if large
(Diagram: the grammar class ExpressionGrammar subclasses PPCompositeParser and defines start, term, prod, ...)
One Method per Production
start ^ term end
One Instance Variable per Production
term ^ (prod , $+ asParser trim , term) / prod
Refer to Productions by Inst-Var Reference
prod ^ (prim , $* asParser trim , prod) / prim
The rest is magic.
(Diagram: ExpressionEvaluator subclasses ExpressionGrammar and overrides productions such as term, prod and number.)
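Put together, the grammar and evaluator classes might look like the sketch below. It is one possible layout, not the one from the slides: each alternative gets its own production (add, mul and parens are names introduced here) so that the evaluator can attach an action to each alternative separately. The package name is hypothetical, and methods are shown in the usual Class >> selector notation and entered through the browser.

```smalltalk
PPCompositeParser subclass: #ExpressionGrammar
	instanceVariableNames: 'term add prod mul prim parens number'
	classVariableNames: ''
	package: 'PetitTutorial'.

ExpressionGrammar >> start
	^ term end
ExpressionGrammar >> term
	^ add / prod
ExpressionGrammar >> add
	^ prod , $+ asParser trim , term
ExpressionGrammar >> prod
	^ mul / prim
ExpressionGrammar >> mul
	^ prim , $* asParser trim , prod
ExpressionGrammar >> prim
	^ parens / number
ExpressionGrammar >> parens
	^ $( asParser trim , term , $) asParser trim
ExpressionGrammar >> number
	^ #digit asParser plus flatten trim

"The evaluator reuses the grammar and only adds actions."
ExpressionGrammar subclass: #ExpressionEvaluator
	instanceVariableNames: ''
	classVariableNames: ''
	package: 'PetitTutorial'.

ExpressionEvaluator >> add
	^ super add ==> [ :nodes | nodes first + nodes last ]
ExpressionEvaluator >> mul
	^ super mul ==> [ :nodes | nodes first * nodes last ]
ExpressionEvaluator >> parens
	^ super parens ==> [ :nodes | nodes second ]
ExpressionEvaluator >> number
	^ super number ==> [ :string | string asNumber ]
```

With these classes in place, ExpressionEvaluator parse: '1 + 2 * 3' should answer 7, while ExpressionGrammar parse: answers the nested arrays of the plain parse. A pretty printer can be written as a second subclass of ExpressionGrammar in the same way.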
Example 3
‣ Implement an expression grammar
‣ Implement an expression evaluator
‣ Implement an expression pretty printer
4. Advanced Use
PetitParser does not just work on Strings
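As an illustration, a parser can consume an Array of arbitrary objects instead of a String. The sketch below assumes PetitParser is loaded and uses the on:message: factory method of PPPredicateObjectParser; the name integer is chosen here.

```smalltalk
"A parser that accepts any Integer object in the input collection."
integer := PPPredicateObjectParser
	on: [ :each | each isInteger ]
	message: 'integer expected'.

integer plus parse: #(1 2 3).                    " --> #(1 2 3) "
(integer plus parse: #(#a 2)) isPetitFailure.    " --> true "
```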
Matching
p matches: 'abc'.
p matchesIn: 'abc'.
p matchesIn: 'abc' do: [ :each | ].
p matchingRangesIn: 'abc'.
p matchingRangesIn: 'abc' do: [ :interval | ].
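A small sketch of the matching protocol, assuming PetitParser is loaded and taking p to be a parser for runs of digits:

```smalltalk
p := #digit asParser plus flatten.

"Does the parser succeed on the given input?"
p matches: '123'.     " --> true "
p matches: 'abc'.     " --> false "

"Collect the results of all matches found anywhere in the input."
p matchesIn: 'a12 b345'.

"Collect the intervals of the matches instead of the parse results."
p matchingRangesIn: 'a12 b345'.
```

As far as I can tell, matchesIn: tries the parser at every position of the input, so overlapping matches can be reported as well.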
GUI
PPBrowser open
World ➔ Tools ➔ PetitParser
Reflection
p allParser.
p allParserDo: [ :each | ].
p firstSet.
p followSet.
p cycleSet.
Transformations
p replace: p1 with: p2.
p transform: [ :parser | ].
Like #collect: on Collection, but transforms the whole grammar graph.
Pattern Searching
found := PPSearcher new
	matches: PPPattern any star plus
	do: [ :parser :answer | parser ];
	execute: p initialAnswer: nil.
PPPattern any: a placeholder matching any parser.
p: the grammar to be searched.
Pattern Rewriting
pattern := PPPattern any.
rewritten := PPRewriter new
	replace: pattern star plus
	with: pattern star;
	execute: p.
p: the grammar to be rewritten.
The same pattern is used in search and replace.
Grammar Optimization
fast := slow optimize
Many behavior-preserving rewrite rules are applied for you.
(Chart: parsing speed in chars/sec, on a scale from 0 to 3000000, for LALR, PetitParser and Hand-Written parsers on the Old VM and the Cog VM.)