Mastering Grammars with PetitParser

Post on 25-Dec-2014

Category: Technology

DESCRIPTION

PetitParser is a dynamic parser framework combining ideas from scannerless parsing, parser combinators, parsing expression grammars and packrat parsers. In this hands-on session we learn how to build simple parsers and how to model, test, compose and reuse complex grammars. Additionally, we will look at some tools and the reflective facilities provided by the PetitParser framework. Basic knowledge of the Smalltalk programming language is a requirement. Bring your laptop to reproduce the examples and solve some simple tasks.

TRANSCRIPT

Mastering Grammars with PetitParser

Lukas Renggli

http://dsal.cl/~jfabry/pp/

Agenda

1. PetitParser in a Nutshell

2. Combinatorial Parsing

3. Complex Grammars

4. Advanced Use

1. PetitParser in a Nutshell

ID ::= letter { letter | digit } ;

[Diagram: the ID rule drawn as a graph of parsers: a sequence of a letter (a..z) followed by many repetitions of a choice between letter (a..z) and digit (0..9).]

id := #letter asParser , (#letter asParser / #digit asParser) star

id parse: 'yeah'.   " --> #($y #($e $a $h)) "

id parse: 'f12'.    " --> #($f #($1 $2)) "

id parse: '123'.    " --> letter expected at 0 "
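The nested arrays above are the default result of sequence and repetition. As a small sketch (assuming a Pharo image with PetitParser loaded), `flatten` can be used to get the matched text as one string, and `isPetitFailure` to tell a failure from a result:

```smalltalk
"Workspace sketch: the identifier parser, answering a String."
| id |
id := (#letter asParser , (#letter asParser / #digit asParser) star) flatten.

id parse: 'f12'.                     " --> 'f12' "
(id parse: 'f12') isPetitFailure.    " --> false "
(id parse: '123') isPetitFailure.    " --> true "
```

A failed parse answers a failure object rather than raising an error, so results should be checked before use.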

Example 1

‣ Test the identifier parser

‣ Inspect the parser object

‣ Create and test an integer parser

2. Combinatorial Parsing

Parser Terminals

$a asParser          letter 'a'

'abc' asParser       string 'abc'

#any asParser        any character

#digit asParser      the digits 0..9

#letter asParser     the letters a..z and A..Z

nil asParser         the empty parser

These factory methods are defined in PPPredicateObjectParser class
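A short workspace sketch of the terminals listed above, assuming PetitParser is loaded. Each terminal consumes only what it matches and ignores the rest of the input:

```smalltalk
"Each line can be evaluated on its own in a workspace."
$a asParser parse: 'abc'.          " --> $a "
'abc' asParser parse: 'abcdef'.    " --> 'abc' "
#any asParser parse: '?'.          " --> $? "
#digit asParser parse: '7'.        " --> $7 "
#letter asParser parse: 'Q'.       " --> $Q "
nil asParser parse: 'xyz'.         " --> nil, nothing is consumed "
```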

Parser Operators

p1 , p2              sequence

p1 / p2              ordered choice

p star               zero-or-more (0..*)

p plus               one-or-more (1..*)

p optional           zero-or-one (0..1)

p1 separatedBy: p2

p1 delimitedBy: p2

See the operations protocols in PPParser for more operators.
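The operators can be tried one at a time. The sketch below assumes PetitParser is loaded; the result shown for `separatedBy:` (a flat array that keeps the separators) reflects my reading of the framework and is worth confirming in the inspector:

```smalltalk
| digit list |
digit := #digit asParser.

(digit , digit) parse: '42'.               " --> #($4 $2) "
(digit / #letter asParser) parse: 'x'.     " --> $x "
digit star parse: ''.                      " --> #() "
digit plus parse: '123'.                   " --> #($1 $2 $3) "
digit optional parse: 'x'.                 " --> nil "

"One or more digits with commas in between."
list := digit separatedBy: $, asParser.
list parse: '1,2,3'.                       " --> #($1 $, $2 $, $3) "
```

The choice is ordered: the second alternative is tried only if the first one fails.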

Parser Predicates

p and                conjunction (non-consuming look-ahead)

p not                negation (non-consuming look-ahead)

p end                end of input
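A sketch of the three predicates, assuming PetitParser is loaded. The `keyword` and `digits` parsers are illustrative names, not part of the slides:

```smalltalk
| keyword digits |

"'if' only counts as a keyword when no letter follows it."
keyword := 'if' asParser , #letter asParser not.
(keyword parse: 'if(') isPetitFailure.        " --> false "
(keyword parse: 'iffy') isPetitFailure.       " --> true "

"and looks ahead without consuming, so #any sees the same character again."
(#digit asParser and , #any asParser) parse: '7'.   " --> #($7 $7) "

"Without end, trailing input is silently ignored."
digits := #digit asParser plus flatten.
digits parse: '12a'.                          " --> '12' "
(digits end parse: '12a') isPetitFailure.     " --> true "
```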

Parser Actions

p ==> [ :arg | ]     transformation

p flatten            create string

p token              create token

p trim               trim whitespace

See the operations-mapping protocol in PPParser for more actions.
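A sketch of the four actions, assuming PetitParser is loaded. The token accessors `start` and `stop` are the ones I am sure of; other accessors differ between versions, so inspect the token to see what your image offers:

```smalltalk
| word number tok |

"flatten answers the consumed text, trim skips surrounding whitespace."
word := #letter asParser plus flatten trim.
word parse: '  hello  '.                " --> 'hello' "

"==> post-processes the result of the parser on its left."
number := #digit asParser plus flatten ==> [ :string | string asNumber ].
number parse: '42'.                     " --> 42 "

"token remembers where in the input the match was found."
tok := #letter asParser plus token parse: 'abc'.
tok start.                              " --> 1 "
tok stop.                               " --> 3 "
```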

term   ::= prod "+" term | prod ;

prod   ::= prim "*" prod | prim ;

prim   ::= "(" term ")" | number ;

number ::= "0" .. "9" ;

number := #digit asParser plus flatten trim
    ==> [ :string | string asNumber ].

term := PPUnresolvedParser new.
prod := PPUnresolvedParser new.
prim := PPUnresolvedParser new.

term def: (prod , $+ asParser trim , term
    ==> [ :nodes | nodes first + nodes last ])
    / prod.
prod def: (prim , $* asParser trim , prod
    ==> [ :nodes | nodes first * nodes last ])
    / prim.
prim def: ($( asParser trim , term , $) asParser trim
    ==> [ :nodes | nodes second ])
    / number.

start := term end.

start parse: '1 + 2 * 3'.       " --> 7 "

start parse: '(1 + 2) * 3'.     " --> 9 "

Example 2

‣ Add support for negative numbers

‣ Add support for floating point numbers

‣ Add support for subtraction & division
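As a starting point for the first task, here is one possible sketch of a number parser that accepts an optional leading minus sign. It assumes PetitParser is loaded and that `asNumber` on a String handles a leading `-`, as it does in the Pharo images I know; it is not the solution from the session:

```smalltalk
| number |
number := ($- asParser optional , #digit asParser plus) flatten trim
    ==> [ :string | string asNumber ].

number parse: '-12'.     " --> -12 "
number parse: ' 7 '.     " --> 7 "
```

Because `flatten` discards the nested structure and answers the consumed text, the optional sign needs no special handling in the action block.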

3. Complex Grammars

PetitParser Scripts

‣ quick to write

‣ embed into Smalltalk

‣ hard to test

‣ difficult to reuse

‣ messy if large

PPCompositeParser

[Diagram: class PPCompositeParser (start) and class ExpressionGrammar (start, term, prod, ...).]

One Method per Production

start
    ^ term end

One Instance Variable per Production

term
    ^ (prod , $+ asParser trim , term) / prod

Refer to Productions by Inst-Var Reference

prod
    ^ (prim , $* asParser trim , prod) / prim

The rest is magic.
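Put together, the grammar class might look like the sketch below. The class definition message, the category name `PetitTutorial`, and the `prim` and `number` methods are assumptions carried over from the script version of the grammar; they are not shown on the slides. The `Class >> selector` lines name the method that follows and are entered through the browser, not evaluated:

```smalltalk
"One instance variable per production (start is inherited)."
PPCompositeParser subclass: #ExpressionGrammar
    instanceVariableNames: 'term prod prim number'
    classVariableNames: ''
    category: 'PetitTutorial'.

"ExpressionGrammar >> start"
start
    ^ term end

"ExpressionGrammar >> term"
term
    ^ (prod , $+ asParser trim , term) / prod

"ExpressionGrammar >> prod"
prod
    ^ (prim , $* asParser trim , prod) / prim

"ExpressionGrammar >> prim"
prim
    ^ ($( asParser trim , term , $) asParser trim) / number

"ExpressionGrammar >> number"
number
    ^ #digit asParser plus flatten trim
```

With no actions attached, `ExpressionGrammar new parse: '1 + 2 * 3'` should answer nested arrays mirroring the grammar, which makes the grammar easy to test before any evaluation logic is added.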

[Diagram: class ExpressionEvaluator (term, prod, number, ...) and class ExpressionGrammar (start, term, prod, ...).]

Example 3

‣ Implement an expression grammar

‣ Implement an expression evaluator

‣ Implement an expression pretty printer

4. Advanced Use

PetitParser does not just work on Strings

Matching

p matches: 'abc'.

p matchesIn: 'abc'.

p matchesIn: 'abc' do: [ :each | ].

p matchingRangesIn: 'abc'.

p matchingRangesIn: 'abc' do: [ :interval | ].
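A sketch of the matching protocol with a single-digit parser, assuming PetitParser is loaded. The collection classes of the answers are not spelled out on the slides, so the comments describe contents only:

```smalltalk
| digit |
digit := #digit asParser.

digit matches: '7'.                 " --> true "
digit matches: 'x'.                 " --> false "

"Search the whole input for matches."
digit matchesIn: 'a1b2'.            " --> a collection holding $1 and $2 "

"Same search, answering the positions of the matches."
digit matchingRangesIn: 'a1b2'.     " --> intervals (2 to: 2) and (4 to: 4) "
```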

GUI

PPBrowser open

World ➔ Tools ➔ PetitParser

Reflection

p allParser.

p allParserDo: [ :each | ].

p firstSet.

p followSet.

p cycleSet.
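A hedged sketch of walking a grammar graph with the selectors named above. It assumes the reflective extensions are loaded and that your version still uses the spellings `allParser` and `allParserDo:`; some releases may spell them differently, so check the protocol in your image:

```smalltalk
| id |
id := #letter asParser , (#letter asParser / #digit asParser) star.

"All parsers reachable from id, including id itself."
id allParser size.

"Print each reachable parser on the Transcript."
id allParserDo: [ :each |
    Transcript show: each printString; cr ].
```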

Transformations

p replace: p1 with: p2.

p transform: [ :parser | ].

Like #collect: on Collection, but transforms the whole grammar graph.
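The simpler of the two operations, `replace:with:`, can be sketched as follows. The example assumes the transformation extensions are loaded and that `replace:with:` swaps a direct child of the receiver in place, which is my understanding of its behavior:

```smalltalk
| digit p |
digit := #digit asParser.
p := digit plus.

(p parse: 'ab') isPetitFailure.     " --> true, letters are not digits "

"Swap the child parser: p now repeats letters instead of digits."
p replace: digit with: #letter asParser.
p parse: 'ab'.                      " --> #($a $b) "
```

Note that this mutates `p`, whereas `transform:` is described above as working like `#collect:`, that is, answering a new grammar graph.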

Pattern Searching

found := PPSearcher new
  matches: PPPattern any star plus
  do: [ :parser :answer | parser ];
  execute: p initialAnswer: nil

PPPattern any: a placeholder matching any parser.

p: the grammar to be searched.

Pattern Rewriting

pattern := PPPattern any.

rewritten := PPRewriter new
  replace: pattern star plus
  with: pattern star;
  execute: p

p: the grammar to be rewritten.

The same pattern is used in search and replace.

Grammar Optimization

fast  :=  slow  optimize

Many behavior-preserving rewrite rules are applied for you.

[Chart: parsing throughput in chars/sec (axis from 0 to 3,000,000) for LALR, PetitParser and Hand-Written parsers, on the Old VM and the Cog VM.]
