Staging Parser Combinators for Efficient Data Processing
Manohar Jonnalagedda
Parsing @ SLE, 14 September 2014
What are they good for?
● Composable
  ○ Each combinator builds a new parser from a previous one
● Context-sensitive
  ○ We can make decisions based on a specific parse result
● Easy to write
  ○ DSL style of writing
  ○ Tight integration with host language
Example: HTTP Response
Status:
  HTTP/1.1 200 OK
Headers:
  Date: Mon, 23 May 2013 22:38:34 GMT
  Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
  Last-Modified: Wed, 08 Jan 2012 23:11:55 GMT
  Etag: "3f80f-1b6-3e1cb03b"
  Content-Type: text/html; charset=UTF-8
  Content-Length: 129
  Connection: close
Content:
  ... payload ...
Example: HTTP Response
// Transform parse results on the fly
def status = (
  ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
) map (_.toInt)
// Make decision based on parse result
def header = (headerName <~ ":") flatMap {
  key => (valueParser(key) <~ crlf) map {
    value => (key, value)
  }
}
// Make decision based on parse result
def respWithPayload = response flatMap {
  r => body(r.contentLength)
}
Parser combinators are slow
[Chart: throughput of Standard Parser Combinators vs. Staged Parser Combinators, labelled 20x]
Topic of this talk.
Parser Combinators are slow
def status: Parser[Int] = (
  ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
) map (_.toInt)
def header = (headerName <~ ":") flatMap {
  key => (valueParser(key) <~ crlf) map {
    value => (key, value)
  }
}
def respWithPayload = response flatMap {
  r => body(r.contentLength)
}
class Parser[T] extends (Input => ParseResult[T]) ...
  def ~[U](that: Parser[U]) = new Parser[(T,U)] { def apply(i: Input) = ... }
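To make the shape of these definitions concrete, here is a minimal, self-contained sketch of unstaged combinators. It is not the Scala standard library implementation: `lit`, `takeWhile1`, `body` and `lengthPrefixed` are hypothetical helpers, `Input` is assumed to be a plain `String`, and whitespace is matched explicitly rather than skipped. Note that every combinator call allocates a new `Parser` object wrapping the previous ones, which is the composition overhead the talk refers to.

```scala
object UnstagedParsers {
  type Input = String // assumption: the slides leave Input abstract

  sealed trait ParseResult[+T]
  case class Success[T](value: T, rest: Input) extends ParseResult[T]
  case object Failure extends ParseResult[Nothing]

  // A parser is a function object; each combinator builds a new one around `self`.
  abstract class Parser[T] extends (Input => ParseResult[T]) { self =>
    def ~[U](that: Parser[U]): Parser[(T, U)] = new Parser[(T, U)] {
      def apply(in: Input): ParseResult[(T, U)] = self(in) match {
        case Success(t, rest) => that(rest) match {
          case Success(u, rest2) => Success((t, u), rest2)
          case Failure           => Failure
        }
        case Failure => Failure
      }
    }
    def map[U](f: T => U): Parser[U] = new Parser[U] {
      def apply(in: Input): ParseResult[U] = self(in) match {
        case Success(t, rest) => Success(f(t), rest)
        case Failure          => Failure
      }
    }
    // The next parser is chosen from the value just parsed (context sensitivity).
    def flatMap[U](f: T => Parser[U]): Parser[U] = new Parser[U] {
      def apply(in: Input): ParseResult[U] = self(in) match {
        case Success(t, rest) => f(t)(rest)
        case Failure          => Failure
      }
    }
    def ~>[U](that: Parser[U]): Parser[U] = (this ~ that).map(_._2)
    def <~[U](that: Parser[U]): Parser[T] = (this ~ that).map(_._1)
  }

  // Hypothetical primitives, not part of any library.
  def lit(s: String): Parser[String] = new Parser[String] {
    def apply(in: Input): ParseResult[String] =
      if (in.startsWith(s)) Success(s, in.substring(s.length)) else Failure
  }
  def takeWhile1(p: Char => Boolean): Parser[String] = new Parser[String] {
    def apply(in: Input): ParseResult[String] = {
      val n = in.takeWhile(p).length
      if (n > 0) Success(in.substring(0, n), in.substring(n)) else Failure
    }
  }
  def body(n: Int): Parser[String] = new Parser[String] {
    def apply(in: Input): ParseResult[String] =
      if (in.length >= n) Success(in.substring(0, n), in.substring(n)) else Failure
  }

  val wholeNumber   = takeWhile1(_.isDigit)
  val decimalNumber = takeWhile1(c => c.isDigit || c == '.')
  val text          = takeWhile1(c => c != '\r')
  val crlf          = lit("\r\n")

  // Status line in the style of the slide, with the separating space made explicit.
  val status: Parser[Int] = (
    (lit("HTTP/") ~ decimalNumber) ~> lit(" ") ~> wholeNumber <~ (text ~ crlf)
  ).map(_.toInt)

  // Same pattern as respWithPayload: a parsed length decides how much to read next.
  val lengthPrefixed: Parser[String] =
    (wholeNumber <~ lit(":")).map(_.toInt).flatMap(n => body(n))
}
```

Applying `UnstagedParsers.status` to a response string runs through one `apply` call per combinator object, even though the structure of the parser never changes between inputs.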
Parser Combinators are slow
● Prohibitive composition overhead
● But: composition is mostly static
  ○ Let us systematically remove it!
Staged Parser Combinators
Composition of Parsers
Composition of Code Generators
Staging (LMS)
‘Classic’ evaluation:
  def add3(a: Int, b: Int, c: Int) = a + b + c
  add3(1, 2, 3)   // 6
Adding Rep types:
  def add3(a: Rep[Int], b: Int, c: Int) = a + b + c
  ○ a: expression in the next stage
  ○ b: executed at staging time, constant in the next stage
  ○ c: executed at staging time, constant in the next stage
Code generation:
  add3(x, 2, 3)   // generates: def add$3$2$3(a: Int) = a + 5
Evaluation of generated code:
  add$3$2$3(1)
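The idea on this slide can be imitated without LMS. The sketch below is a toy and does not use the LMS API: `RepInt` is a hand-written expression tree standing in for `Rep[Int]`, and `Sym`, `Const`, `Plus`, `show` and `generate` are hypothetical names. Plain `Int` arguments are computed while the generator runs, so only the staged argument survives into the generated text.

```scala
object ToyStaging {
  // Stand-in for Rep[Int]: a description of a next-stage computation.
  sealed trait RepInt {
    def +(that: RepInt): RepInt = (this, that) match {
      case (Const(x), Const(y))          => Const(x + y)          // both known now
      case (Plus(e, Const(x)), Const(y)) => Plus(e, Const(x + y)) // merge constants
      case _                             => Plus(this, that)
    }
    // A present-stage Int is lifted to a constant of the next stage.
    def +(n: Int): RepInt = this + Const(n)
  }
  case class Sym(name: String) extends RepInt
  case class Const(value: Int) extends RepInt
  case class Plus(lhs: RepInt, rhs: RepInt) extends RepInt

  // Same body as on the slide; only the type of `a` changed.
  def add3(a: RepInt, b: Int, c: Int): RepInt = a + b + c

  // Pretty-print an expression tree as source text.
  def show(e: RepInt): String = e match {
    case Sym(n)     => n
    case Const(v)   => v.toString
    case Plus(l, r) => show(l) + " + " + show(r)
  }

  // "Code generation": run add3 on a symbolic argument and print the result.
  def generate(b: Int, c: Int): String =
    "def add$3$" + b + "$" + c + "(a: Int) = " + show(add3(Sym("a"), b, c))
}
```

With these definitions, `ToyStaging.generate(2, 3)` yields the text `def add$3$2$3(a: Int) = a + 5`. The constant folding in `+` is an assumption made here so that the output matches the slide.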
LMS
User-written code, may contain Rep types
LMS runtime code generation
Generated/optimized code.
Staging Parser Combinators
Composition of Code Generators
Unstaged:
  class Parser[T] extends (Input => ParseResult[T])
  def ~[U](that: Parser[U])
  def map[U](f: T => U): Parser[U]
  def flatMap[U](f: T => Parser[U]): Parser[U]
Staged:
  class Parser[T] extends (Rep[Input] => Rep[ParseResult[T]])
  def ~[U](that: Parser[U])                           // still a code generator
  def map[U](f: Rep[T] => Rep[U]): Parser[U]
  def flatMap[U](f: Rep[T] => Parser[U]): Parser[U]   // still a code generator
● static function: application == inlining for free
● dynamic inputs, dynamic input/output
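A rough way to see why application becomes inlining is to let a parser be an ordinary function that returns source text. The sketch below is an assumption-laden toy, not the LMS-based implementation: `Rep` is just a `String` holding the code of a next-stage `Int` expression, parsers are simplified to recognizers that return the new position or -1, and `accept`, `fresh` and `generate` are hypothetical names. Character literals are spliced naively, so it only handles simple characters.

```scala
object ToyStagedParsers {
  type Rep = String // source text of a next-stage Int expression (toy stand-in)

  private var counter = 0
  private def fresh(): String = { counter += 1; "p" + counter }

  // Static function over dynamic values: runs while generating, returns code.
  abstract class Parser extends (Rep => Rep) { self =>
    def ~(that: Parser): Parser = new Parser {
      def apply(pos: Rep): Rep = {
        val p = fresh()
        // Calling self and that here pastes their code in place: inlining for free.
        s"{ val $p = ${self(pos)}; if ($p < 0) -1 else ${that(p)} }"
      }
    }
  }

  // Recognize one character; the generated code refers to a next-stage `in`.
  def accept(c: Char): Parser = new Parser {
    def apply(pos: Rep): Rep =
      s"(if ($pos < in.length && in.charAt($pos) == '$c') $pos + 1 else -1)"
  }

  // Wrap the generated expression into a next-stage function definition.
  def generate(name: String, p: Parser): String =
    s"def $name(in: String): Int = ${p("0")}"

  val example: String = generate("ab", accept('a') ~ accept('b'))
}
```

The text in `ToyStagedParsers.example` mentions no `Parser` objects at all; only comparisons on `in` and positions remain, which is the effect staging is after.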
A closer look
User-written parser:
  def respWithPayload: Parser[..] = response flatMap { r => body(r.contentLength) }
code generation
Generated code:
  // code for parsing response
  val response = parseHeaders()
  val n = response.contentLength
  // parsing body
  var i = 0
  while (i < n) {
    readByte()
    i += 1
  }
Gotchas
● Recursion
  ○ explicit recursion combinator (fix-point like)
● Diamond control flow
  ○ code generation blowup
General solution
  ○ generate staged functions (Rep[Input => ParseResult])
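The recursion gotcha can be sketched in the same string-generating toy style. This is not the talk's implementation: `Rep` is a `String` of next-stage code, parsers are position-returning recognizers, and `rec`, `seq`, `or`, `succeed` and `manyA` are hypothetical names. A parser defined directly in terms of itself would unfold forever while generating; `rec` instead emits one next-stage function and hands the body a parser that merely calls it.

```scala
object ToyStagedRecursion {
  type Rep = String        // source text of a next-stage Int expression
  type Parser = Rep => Rep // position in, new position (or -1) out

  private var counter = 0
  private def fresh(): String = { counter += 1; "x" + counter }

  def accept(c: Char): Parser = pos =>
    s"(if ($pos < in.length && in.charAt($pos) == '$c') $pos + 1 else -1)"

  // Always succeeds without consuming input.
  val succeed: Parser = pos => pos

  def seq(a: Parser, b: Parser): Parser = pos => {
    val x = fresh()
    s"{ val $x = ${a(pos)}; if ($x < 0) -1 else ${b(x)} }"
  }

  // Try a, fall back to b at the same position.
  def or(a: Parser, b: Parser): Parser = pos => {
    val x = fresh()
    s"{ val $x = ${a(pos)}; if ($x >= 0) $x else ${b(pos)} }"
  }

  // Explicit recursion: the body sees `call`, which generates a call to the
  // named next-stage function instead of unfolding the body again.
  def rec(name: String)(body: Parser => Parser): (Parser, String) = {
    val call: Parser = pos => s"$name(in, $pos)"
    val definition = s"def $name(in: String, p: Int): Int = ${body(call)("p")}"
    (call, definition)
  }

  // Zero or more 'a' characters, defined recursively.
  val (manyA, manyADef) = rec("manyA")(self => or(seq(accept('a'), self), succeed))
}
```

`ToyStagedRecursion.manyADef` is a finite definition whose body contains a single recursive call, and `manyA` can be reused by other parsers as a plain call site, which also avoids duplicating its code at every use.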
Performance: Parsing JSON
● 20 times faster than Scala’s parser combinators
● 3 times faster than Parboiled2
Performance
[Charts: HTTP Response, CSV]
If you want to know more
● Parser Combinators for Dynamic Programming [OOPSLA ’14]
  ○ based on ADP
  ○ code gen for GPU
● Using Scala Macros [Scala ’14]
Desirable Parser Properties
Property            Hand-written   Parser Generators   Staged Parser Combinators
Composable          X              ✓                   ✓
Customizable        X              X                   ✓
Context-Sensitive   ✓              ~                   ✓
Fast                ✓              ✓                   ✓
Easy to write       X              ✓                   ✓
The people
● Eric Béguet
● Thierry Coppey
● Sandro Stucki
● Tiark Rompf
● Martin Odersky
Thank you! Questions?
Staging all the way down
● Staged structs
  ○ boxing of temporary results eliminated
● Staged strings
  ○ substring not computed all the time
Optimizing String handling
class InputWindow[Input](val in: Input, val start: Int, val end: Int) {
  override def equals(x: Any) = x match {
    case s: InputWindow[Input] =>
      s.in == in &&
      s.start == start &&
      s.end == end
    case _ => super.equals(x)
  }
}
Beware!
● String.substring runs in linear time (since Java 7 update 6).
● Parsers on Strings are therefore inefficient.
● Need to use a FastCharSequence, which mimics the original constant-time behaviour of substring.
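One way such a sequence could look is a view that shares the backing string and only moves two indices. The sketch below is an assumption, not the FastCharSequence used in the talk's benchmarks: `WindowedChars` and `window` are hypothetical names. Taking a sub-sequence allocates a small object but copies no characters; a `String` is only materialised when `toString` is called.

```scala
object Windows {
  // A view on in[from, until); subSequence narrows the view without copying.
  final class WindowedChars(in: String, from: Int, until: Int) extends CharSequence {
    require(0 <= from && from <= until && until <= in.length, "window out of range")

    def length(): Int = until - from

    def charAt(i: Int): Char = {
      if (i < 0 || i >= length()) throw new IndexOutOfBoundsException(i.toString)
      in.charAt(from + i)
    }

    // Constant time: only the bounds change, the characters stay where they are.
    def subSequence(lo: Int, hi: Int): CharSequence = {
      if (lo < 0 || hi > length() || lo > hi)
        throw new IndexOutOfBoundsException(lo.toString + ".." + hi.toString)
      new WindowedChars(in, from + lo, from + hi)
    }

    // Copies the characters; meant for final results, not for every parse step.
    override def toString: String = in.substring(from, until)
  }

  // Hypothetical convenience constructor covering the whole string.
  def window(s: String): CharSequence = new WindowedChars(s, 0, s.length)
}
```

For example, `Windows.window("HTTP/1.1 200 OK").subSequence(9, 12)` is a three-character view whose `toString` is `"200"`.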
Key performance impactors
[Chart comparing the following configurations, with the speedup label that appears alongside each]
● Standard Parser Combinators
● Standard Parser Combinators with FastCharSequence
● FastParsers with error reporting and without inlining (~7-8x)
● FastParsers without error reporting and without inlining (~2x)
● FastParsers without error reporting and with inlining (~30%)