Introduction to Iteratees (Scala)


Upload: alexander-lehmann

Post on 07-May-2015


DESCRIPTION

Slides originating from a talk I gave at ScalaMUC on 2013-12-17. The corresponding code can be found at https://github.com/afwlehmann/iteratee-tutorial .

TRANSCRIPT

Page 1: Introduction to Iteratees (Scala)

History | Iterator pattern | Iteratees | Discussion

Introduction to Iteratees

Motivation and implementation basics

Alexander Lehmann <[email protected]>

Technische Universität München

December 17th, 2013

Page 2: Introduction to Iteratees (Scala)

Brief history

2008 Oleg Kiselyov, “Incremental multi-level input processing with left-fold enumerator”, DEFUN 2008

2010-05-12 John W. Lato, “Teaching an Old Fool New Tricks”, Monad Reader #16

2010-05-26 Available in scalaz 5.0 Runar Oli

2012-03-13 Initial release of the play framework 2.0

Page 3: Introduction to Iteratees (Scala)

We’ve often seen something along the lines of

import scala.io.Source.fromFile

val it: Iterator[String] = fromFile("foo.txt").getLines
var result = 0
while (it.hasNext) {
  val line = it.next()
  result += line.toInt
}
// Do something with the result

Page 4: Introduction to Iteratees (Scala)

Issues

✗ Repetitive pattern (DRY principle)

✗ Manual pulling

✗ Mutability, imperative style

✗ No error handling (we sometimes just forget, right?)

✗ No (one|two)-way communication (Exceptions don’t count)

✗ Resource handling (how long do we need a resource and who is responsible for opening/closing/recovering it?)

✗ Missing or rather difficult composability

✗ What if the input stream were infinite?

Page 5: Introduction to Iteratees (Scala)

Try to be more “functional” and invert control...

val it: Iterator[String] = fromFile("foo.txt").getLines
var result = 0
it foreach { line =>
  result += line.toInt
}

Page 6: Introduction to Iteratees (Scala)

define ourselves a re-usable utility function...

def enum(it: Iterator[String]): Int = {
  var result = 0
  it foreach { line => result += line.toInt }
  result
}

val foo = enum(fromFile("foo.txt").getLines)

Page 7: Introduction to Iteratees (Scala)

being more versatile for the greater good...

def enum(it: Iterator[String])(init: Int)
        (f: (Int, String) => Int): Int = {
  var result = init
  it foreach { line => result = f(result, line) }
  result
}

val it: Iterator[String] = fromFile("foo.txt").getLines
val foo: Int = enum(it)(0)(_ + _.toInt)

Page 8: Introduction to Iteratees (Scala)

possibly even generic...

def enum[In,Out](it: Iterator[In])(init: Out)
                (f: (Out, In) => Out): Out = {
  var result = init
  it foreach { x => result = f(result, x) }
  result
}

val it: Iterator[String] = fromFile("foo.txt").getLines
val foo: Int = enum(it)(0)(_ + _.toInt)

Page 9: Introduction to Iteratees (Scala)

provide sophisticated error handling...

def enum[In,Out](it: Iterator[In])(init: Out)
                (f: (Out, In) => Out): Out = {
  var result = init
  try {
    it foreach { x => result = f(result, x) }
  } catch {
    case t: Throwable => /* TODO (fingers crossed) */
  }
  result
}

val it: Iterator[String] = fromFile("foo.txt").getLines
val foo: Int = enum(it)(0)(_ + _.toInt)

Page 10: Introduction to Iteratees (Scala)

or rather use {your-favorite-“monad”-here} for error handling, asynchronism and composability...

def enum[In,Out](it: Iterator[In])(init: Out)
                (f: (Out, In) => Out): Option[Out] = {
  try {
    var result = init
    it foreach { x => result = f(result, x) }
    Some(result)
  } catch {
    case t: Throwable => None
  }
}

val it: Iterator[String] = fromFile("foo.txt").getLines
val foo: Option[Int] = enum(it)(0)(_ + _.toInt)

Page 11: Introduction to Iteratees (Scala)

only to realize we’ve already had it all!?

import scala.util.Try

val it: Iterator[String] = fromFile("foo.txt").getLines
val foo: Try[Int] = it.foldLeft(Try(0)) {
  // f: (Out, In) => Out
  case (acc, line) =>
    acc map { x => x + line.toInt }
}

Really? What about

producer and consumer talking back and forth

infinite streams (foldLeft cannot cope with them)

resource handling (stick this into another function?)

asynchronism (could have used Futures)

data which are not available all at once

Think of

foldLeft as Enumerator

acc together with f as Iteratee
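The missing communication between producer and consumer can be made concrete. The sketch below is an illustration, not part of the talk: FoldLeftLimits, sumCounting and the sample input are made-up names. It runs the same kind of fold but counts the lines it visits, showing that foldLeft keeps pulling input even after the accumulated Try has already failed, because the consumer has no way to tell the producer to stop.

```scala
import scala.util.Try

object FoldLeftLimits {
  // Same fold as on the slide, plus a counter for the lines visited.
  // `sumCounting` is a hypothetical helper, used for illustration only.
  def sumCounting(lines: Iterator[String]): (Try[Int], Int) = {
    var visited = 0
    val sum = lines.foldLeft(Try(0)) { (acc, line) =>
      visited += 1
      acc.map(_ + line.toInt) // Try.map turns the NumberFormatException into a Failure
    }
    (sum, visited)
  }

  // The second line is not a number, so the sum fails there ...
  val (result, visited) = sumCounting(Iterator("1", "oops", "3", "4"))
  // ... yet all four lines are still pulled from the iterator.
}
```

With an infinite iterator the same fold would never return, which is exactly the objection raised above.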

Page 14: Introduction to Iteratees (Scala)

Enumerators | Basics | Composition | Enumeratees

Enumerators . . .

are stream producers

push subsequent chunks of data to an Iteratee, hence folding the Iteratee over the input stream

sealed trait Enumerator[F] {
  def apply[T](iter: Iteratee[F,T]): Iteratee[F,T]
  def run[T](iter: Iteratee[F,T]): T
}

act synchronously or asynchronously (think Futures)

come with batteries included

import java.io.InputStream

// Signatures only; the bodies are not shown on the slide
object Enumerator {
  def apply[T](xs: T*): Enumerator[T] = ???
  def fromTextFile(fileName: String): Enumerator[String] = ???
  def fromStream(s: InputStream, chunkSize: Int = 8192): Enumerator[Array[Byte]] = ???
}

Page 16: Introduction to Iteratees (Scala)

Enumerators communicate state to the Iteratee:

sealed trait Input[+T]
case class Element[T](x: T) extends Input[T]
case object Empty extends Input[Nothing]
case object EOF extends Input[Nothing]

For now, assume enumerators like this:

def enum[F,T](xs: List[F])(it: Iteratee[F,T]): Iteratee[F,T]

Page 17: Introduction to Iteratees (Scala)

Iteratees . . .

are stream consumers

consume chunks of input (as a whole)

are immutable, re-usable computations

encode their state through their type

sealed trait Iteratee[F,T] { def run: T } // run is implemented in the trait, see the backup slides
case class Done[F,T](result: T, remainingInput: Input[F]) extends Iteratee[F,T]
case class Cont[F,T](k: Input[F] => Iteratee[F,T]) extends Iteratee[F,T]
case class Error[F,T](t: Throwable) extends Iteratee[F,T]
// Done, Cont and Error all extend Iteratee[F,T]
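Put together as one compilable unit, the definitions so far might look as follows. This is a sketch under assumptions: the wrapper object IterateeAdt and the two sample values are made up, Element is declared covariant to keep pattern matching simple, and run is filled in with the non-looping variant from the backup slides.

```scala
object IterateeAdt {
  // What an enumerator can hand to an iteratee.
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  // The three states of a stream consumer.
  sealed trait Iteratee[F, T] {
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match { // signal end of stream exactly once
        case Done(result, _) => result
        case Error(t)        => throw t
        case Cont(_)         => sys.error("Diverging iteratee!")
      }
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  // A finished iteratee simply yields its result ...
  val finished: Int = Done[Char, Int](42, Empty).run
  // ... while an unfinished one is given EOF and must finish then.
  val flushed: Int = Cont[Char, Int](_ => Done(0, EOF)).run
}
```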

Page 18: Introduction to Iteratees (Scala)

So far:

Enumerators fold iteratees over a stream

Enumerators communicate state through Input[T]

Iteratees encapsulate result, error or computation

Let’s revisit our “enumerator”:

def enum[F,T](xs: List[F])(it: Iteratee[F,T]): Iteratee[F,T] =
  (xs, it) match {
    case (h::t, Cont(k)) => enum(t)( k(Element(h)) )
    case _ => it
  }
// k: Input[F] => Iteratee[F,T]

This is all we need to have some working examples.
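As a first working example, here is a sketch that feeds a list of numbers to a summing iteratee. The assumptions: the enumerator is called enumerate instead of enum (enum is a reserved word in Scala 3), sum and the wrapper object EnumDemo are made-up names, and @tailrec is added to let the compiler confirm that the enumerator runs in constant stack space.

```scala
import scala.annotation.tailrec

object EnumDemo {
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  sealed trait Iteratee[F, T] {
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match {
        case Done(result, _) => result
        case Error(t)        => throw t
        case Cont(_)         => sys.error("Diverging iteratee!")
      }
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  // The list enumerator from the slide: push elements while the iteratee wants more.
  @tailrec
  def enumerate[F, T](xs: List[F])(it: Iteratee[F, T]): Iteratee[F, T] =
    (xs, it) match {
      case (h :: t, Cont(k)) => enumerate(t)(k(Element(h)))
      case _                 => it // input exhausted, or iteratee is Done/Error
    }

  // Hypothetical consumer: adds up all elements until EOF.
  def sum: Iteratee[Int, Int] = {
    def step(acc: Int): Input[Int] => Iteratee[Int, Int] = {
      case Element(x) => Cont(step(acc + x))
      case Empty      => Cont(step(acc))
      case EOF        => Done(acc, EOF)
    }
    Cont(step(0))
  }

  // After the list is exhausted the iteratee is still a Cont; run feeds the EOF.
  val total: Int = enumerate((1 to 10000).toList)(sum).run
}
```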

Page 20: Introduction to Iteratees (Scala)

def head[T]: Iteratee[T,T] = {
  def step: Input[T] => Iteratee[T,T] = {
    case Element(x) => Done(x, Empty)
    case Empty => Cont(step)
    case EOF => Error(new NoSuchElementException)
  }
  Cont(step)
}

Example:

scala> enum("Hello world!".toList)(head)
res1: Iteratee[Char,Char] = Done(H,Empty)

scala> res1.run
res2: Char = H
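The EOF branch of head only shows up when the stream is empty. The following sketch exercises both paths; it assumes the renamed enumerator enumerate (enum is reserved in Scala 3), a made-up wrapper object HeadDemo, and uses scala.util.Try merely to capture the exception thrown by run.

```scala
import scala.util.Try

object HeadDemo {
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  sealed trait Iteratee[F, T] {
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match {
        case Done(result, _) => result
        case Error(t)        => throw t
        case Cont(_)         => sys.error("Diverging iteratee!")
      }
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  def enumerate[F, T](xs: List[F])(it: Iteratee[F, T]): Iteratee[F, T] =
    (xs, it) match {
      case (h :: t, Cont(k)) => enumerate(t)(k(Element(h)))
      case _                 => it
    }

  def head[T]: Iteratee[T, T] = {
    def step: Input[T] => Iteratee[T, T] = {
      case Element(x) => Done(x, Empty)
      case Empty      => Cont(step)
      case EOF        => Error(new NoSuchElementException("head of empty stream"))
    }
    Cont(step)
  }

  // Non-empty input: head is Done after one element, the rest is never pushed.
  val first: Char = enumerate("Hello world!".toList)(head[Char]).run

  // Empty input: the iteratee is still a Cont, run feeds EOF, head answers with Error.
  val onEmpty: Try[Char] = Try(enumerate(List.empty[Char])(head[Char]).run)
}
```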

Page 21: Introduction to Iteratees (Scala)

def length[T]: Iteratee[T, Int] = {
  def step(n: Int): Input[T] => Iteratee[T, Int] = {
    case EOF => Done(n, EOF)
    case Empty => Cont(step(n))
    case Element(_) => Cont(step(n+1))
  }
  Cont(step(0))
}

Example:

scala> enum("Hello world!".toList)(length)
res3: Iteratee[Char,Int] = Cont(<function1>)

scala> res3.run
res4: Int = 12

Page 22: Introduction to Iteratees (Scala)

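How this works

[Diagram: enum(List(7,9))(length) unfolds step by step]

length = Cont(step(n)), n=0

enum(List(7,9)) feeds El(7), giving Cont(step(n)), n=1

enum(List(9)) feeds El(9), giving Cont(step(n)), n=2

enum(Nil) returns Cont(step(n)), n=2, unchanged

Further use that iteratee or apply run to it...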

Page 23: Introduction to Iteratees (Scala)

How this works

Further use that iteratee or apply run to it...

[Diagram, continued: run feeds EOF to Cont(step(n)), n=2, giving Done(2,EOF)]

Page 24: Introduction to Iteratees (Scala)

There’s a lot of pattern matching going on. Frameworks to the rescue:

// Signatures only; the bodies are not shown on the slide
object Iteratee {
  def apply[F,T](el: Element[F] => Iteratee[F,T],
                 empty: => Iteratee[F,T],
                 eof: => Iteratee[F,T]): Iteratee[F,T] = ???

  def fold[F,T](z: T)(f: (T,F) => T): Iteratee[F,T] = ???
}

Because that’s what we need most of the time.
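One possible way to fill in fold, given the definitions from the earlier slides, is sketched below. It is not taken from the talk's repository: the body of fold, the renamed enumerator enumerate and the wrapper object FoldDemo are assumptions. It also shows that length from the previous slide becomes a one-liner.

```scala
object FoldDemo {
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  sealed trait Iteratee[F, T] {
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match {
        case Done(result, _) => result
        case Error(t)        => throw t
        case Cont(_)         => sys.error("Diverging iteratee!")
      }
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  def enumerate[F, T](xs: List[F])(it: Iteratee[F, T]): Iteratee[F, T] =
    (xs, it) match {
      case (h :: t, Cont(k)) => enumerate(t)(k(Element(h)))
      case _                 => it
    }

  // Assumed implementation: thread the accumulator through the continuation.
  def fold[F, T](z: T)(f: (T, F) => T): Iteratee[F, T] = {
    def step(acc: T): Input[F] => Iteratee[F, T] = {
      case Element(x) => Cont(step(f(acc, x)))
      case Empty      => Cont(step(acc))
      case EOF        => Done(acc, EOF)
    }
    Cont(step(z))
  }

  // length expressed through fold: count one per element.
  val length: Int =
    enumerate("Hello world!".toList)(fold[Char, Int](0)((n, _) => n + 1)).run

  // The running example from the beginning of the talk: sum up numeric lines.
  val sum: Int =
    enumerate(List("1", "2", "3"))(fold[String, Int](0)(_ + _.toInt)).run
}
```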

Page 25: Introduction to Iteratees (Scala)

Wouldn’t it be awesome if Iteratees were composable?

def drop1Keep1[T] = for {
  _ <- drop[T](1)
  h <- head
} yield h

def pair[T] = for {
  h1 <- head[T]
  h2 <- head
} yield (h1, h2)

They are indeed! Iteratees compose sequentially (and so do Enumerators).

scala> enum("Hello World!".toList)(drop1Keep1).run
res1: Char = e

scala> enum("Hello World!".toList)(pair).run
res2: (Char, Char) = (H,e)

Page 27: Introduction to Iteratees (Scala)

As we know, for translates to map and flatMap:

trait Iteratee[F,T] {
  def map[U](f: T => U): Iteratee[F,U]
  def flatMap[U](f: T => Iteratee[F,U]): Iteratee[F,U]
}

Adding this to Cont is straightforward.

def map[U](f: T => U): Iteratee[F,U] =
  Cont(elt => k(elt) map f)

def flatMap[U](f: T => Iteratee[F,U]): Iteratee[F,U] =
  Cont(elt => k(elt) flatMap f)
// k: Input[F] => Iteratee[F,T]

Page 28: Introduction to Iteratees (Scala)

So is adding to Done.

def map[U](f: T => U): Iteratee[F,U] =
  Done(f(result), remainingInput)

def flatMap[U](f: T => Iteratee[F,U]): Iteratee[F,U] =
  f(result) match {
    case Done(xPrime, _) => Done(xPrime, remainingInput)
    case Cont(k) => k(remainingInput)
    case err @ Error(_) => err
  }
// result and remainingInput are the fields of Done
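Collected in one place, map and flatMap can also be written once in the trait by matching on the three states. The sketch below does that and replays the drop1Keep1 and pair examples. Assumptions: the Error case is not shown on the slides, so propagating the error unchanged is a guess; enumerate stands in for enum (reserved in Scala 3); ComposeDemo is a made-up wrapper.

```scala
object ComposeDemo {
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  sealed trait Iteratee[F, T] {
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match {
        case Done(result, _) => result
        case Error(t)        => throw t
        case Cont(_)         => sys.error("Diverging iteratee!")
      }
    }
    def map[U](f: T => U): Iteratee[F, U] = this match {
      case Done(result, rest) => Done[F, U](f(result), rest)
      case Cont(k)            => Cont[F, U](in => k(in).map(f))
      case Error(t)           => Error[F, U](t) // assumption: errors pass through
    }
    def flatMap[U](f: T => Iteratee[F, U]): Iteratee[F, U] = this match {
      case Done(result, rest) => f(result) match {
        case Done(u, _) => Done[F, U](u, rest) // keep our own leftover input
        case Cont(k)    => k(rest)             // hand the leftover to the next iteratee
        case Error(t)   => Error[F, U](t)
      }
      case Cont(k)  => Cont[F, U](in => k(in).flatMap(f))
      case Error(t) => Error[F, U](t)
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  def enumerate[F, T](xs: List[F])(it: Iteratee[F, T]): Iteratee[F, T] =
    (xs, it) match {
      case (h :: t, Cont(k)) => enumerate(t)(k(Element(h)))
      case _                 => it
    }

  def head[T]: Iteratee[T, T] = {
    def step: Input[T] => Iteratee[T, T] = {
      case Element(x) => Done(x, Empty)
      case Empty      => Cont(step)
      case EOF        => Error(new NoSuchElementException)
    }
    Cont(step)
  }

  def drop[T](n: Int): Iteratee[T, Unit] = {
    def step: Input[T] => Iteratee[T, Unit] = {
      case EOF        => Done((), EOF)
      case Empty      => Cont(step)
      case Element(_) => drop(n - 1)
    }
    if (n <= 0) Done((), Empty) else Cont(step)
  }

  def drop1Keep1[T]: Iteratee[T, T] = for { _ <- drop[T](1); h <- head[T] } yield h
  def pair[T]: Iteratee[T, (T, T)] = for { h1 <- head[T]; h2 <- head[T] } yield (h1, h2)

  val second: Char = enumerate("Hello World!".toList)(drop1Keep1[Char]).run
  val firstTwo: (Char, Char) = enumerate("Hello World!".toList)(pair[Char]).run
}
```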

Page 29: Introduction to Iteratees (Scala)

Now why is that so great?

Easily compose simple (or complex) iteratees

def drop1Keep1[T] = for {
  _ <- drop[T](1)
  h <- head
} yield h

def pair[T] = for {
  h1 <- head[T]
  h2 <- head
} yield (h1, h2)

No repetitive pattern-matching hell

Benefit from utility functions, e.g. sequence or repeat

Example:

scala> def fivePairs[T] = sequence(List.fill(5)(pair[T]))
...
scala> enum((1 to 10).toList)(fivePairs).run
res1: List[(Int, Int)] = List((1,2), (3,4), (5,6), (7,8), (9,10))

Page 30: Introduction to Iteratees (Scala)

Benefit from utility functions, e.g. sequence or repeat

Example:

scala> def alternates[T] = repeat(drop1Keep1[T])
...
scala> enum((1 to 10).toList)(alternates).run
res2: Stream[Int] = Stream(2, ?)
// res2.toList == List(2, 4, 6, 8, 10)
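The slides do not show how sequence and repeat are written, so the following is only one way they could be built on top of map and flatMap. Assumptions are plentiful here: both bodies are guesses, repeat collects into a strict List rather than the Stream seen in the REPL output, it reports an error if the repeated iteratee does not start out as a Cont, and enumerate again stands in for enum. UtilDemo is a made-up wrapper.

```scala
object UtilDemo {
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  sealed trait Iteratee[F, T] {
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match {
        case Done(result, _) => result
        case Error(t)        => throw t
        case Cont(_)         => sys.error("Diverging iteratee!")
      }
    }
    def map[U](f: T => U): Iteratee[F, U] = this match {
      case Done(result, rest) => Done[F, U](f(result), rest)
      case Cont(k)            => Cont[F, U](in => k(in).map(f))
      case Error(t)           => Error[F, U](t)
    }
    def flatMap[U](f: T => Iteratee[F, U]): Iteratee[F, U] = this match {
      case Done(result, rest) => f(result) match {
        case Done(u, _) => Done[F, U](u, rest)
        case Cont(k)    => k(rest)
        case Error(t)   => Error[F, U](t)
      }
      case Cont(k)  => Cont[F, U](in => k(in).flatMap(f))
      case Error(t) => Error[F, U](t)
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  def enumerate[F, T](xs: List[F])(it: Iteratee[F, T]): Iteratee[F, T] =
    (xs, it) match {
      case (h :: t, Cont(k)) => enumerate(t)(k(Element(h)))
      case _                 => it
    }

  def head[T]: Iteratee[T, T] = {
    def step: Input[T] => Iteratee[T, T] = {
      case Element(x) => Done(x, Empty)
      case Empty      => Cont(step)
      case EOF        => Error(new NoSuchElementException)
    }
    Cont(step)
  }
  def drop1Keep1[T]: Iteratee[T, T] = for { _ <- head[T]; h <- head[T] } yield h
  def pair[T]: Iteratee[T, (T, T)] = for { h1 <- head[T]; h2 <- head[T] } yield (h1, h2)

  // Assumed: run the iteratees one after another and collect their results.
  def sequence[F, T](its: List[Iteratee[F, T]]): Iteratee[F, List[T]] =
    its.foldRight[Iteratee[F, List[T]]](Done[F, List[T]](Nil, Empty)) { (it, acc) =>
      it.flatMap(x => acc.map(x :: _))
    }

  // Assumed: restart `it` for as long as elements arrive; stop at EOF.
  def repeat[F, T](it: Iteratee[F, T]): Iteratee[F, List[T]] = Cont[F, List[T]] {
    case EOF   => Done[F, List[T]](Nil, EOF)
    case Empty => repeat(it)
    case el => it match {
      case Cont(k) => k(el).flatMap(x => repeat(it).map(x :: _))
      case _       => Error[F, List[T]](new IllegalStateException("iteratee must consume input"))
    }
  }

  val fivePairs: List[(Int, Int)] =
    enumerate((1 to 10).toList)(sequence(List.fill(5)(pair[Int]))).run
  val alternates: List[Int] =
    enumerate((1 to 10).toList)(repeat(drop1Keep1[Int])).run
}
```

For brevity drop1Keep1 is expressed here with two heads, the first of which is discarded; for non-empty input it behaves like the drop-based version on the slide. Note that this repeat nests one map per repetition, so it is meant for illustration on short inputs, not as a production utility.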

Page 31: Introduction to Iteratees (Scala)

Enumeratees...

are both consumer and producer

link in between Enumerator and Iteratee

feed transformed input from Enumerator to “inner” Iteratee, for instance
  filter
  take
  . . .

are (of course?) composable

allow vertical (parallel) stacking of Iteratees (one to many)

See eamelink’s play-related examples @ https://gist.github.com/eamelink/5639642
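To give a feeling for filter and take in terms of the simple types used in this talk, here is a deliberately simplified sketch in which an enumeratee is just a function that wraps an inner iteratee. Real libraries give enumeratees a type of their own; everything below (the bodies of filter and take, the helper fold, the renamed enumerator enumerate, the wrapper EnumerateeDemo) is an assumption made for illustration.

```scala
object EnumerateeDemo {
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  sealed trait Iteratee[F, T] {
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match {
        case Done(result, _) => result
        case Error(t)        => throw t
        case Cont(_)         => sys.error("Diverging iteratee!")
      }
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  def enumerate[F, T](xs: List[F])(it: Iteratee[F, T]): Iteratee[F, T] =
    (xs, it) match {
      case (h :: t, Cont(k)) => enumerate(t)(k(Element(h)))
      case _                 => it
    }

  def fold[F, T](z: T)(f: (T, F) => T): Iteratee[F, T] = {
    def step(acc: T): Input[F] => Iteratee[F, T] = {
      case Element(x) => Cont(step(f(acc, x)))
      case Empty      => Cont(step(acc))
      case EOF        => Done(acc, EOF)
    }
    Cont(step(z))
  }

  // Only elements satisfying p reach the inner iteratee.
  def filter[F, T](p: F => Boolean)(inner: Iteratee[F, T]): Iteratee[F, T] = inner match {
    case Cont(k) => Cont[F, T] {
      case Element(x) if !p(x) => filter(p)(inner) // swallow the rejected element
      case in                  => filter(p)(k(in)) // forward everything else
    }
    case finished => finished
  }

  // At most n elements reach the inner iteratee, then it is told the stream ended.
  def take[F, T](n: Int)(inner: Iteratee[F, T]): Iteratee[F, T] = inner match {
    case Cont(k) if n <= 0 => k(EOF)
    case Cont(k) => Cont[F, T] {
      case Element(x) => take(n - 1)(k(Element(x)))
      case in         => take(n)(k(in))
    }
    case finished => finished
  }

  private def isEven(i: Int): Boolean = i % 2 == 0
  private def count: Iteratee[Int, Int] = fold[Int, Int](0)((n, _) => n + 1)
  private def sum: Iteratee[Int, Int] = fold[Int, Int](0)(_ + _)

  val input: List[Int] = (1 to 10).toList
  val evens: Int = enumerate(input)(filter(isEven)(count)).run
  val firstThree: Int = enumerate(input)(take(3)(count)).run
  // Composition: the sum of the first two even numbers, 2 + 4.
  val firstTwoEvens: Int = enumerate(input)(filter(isEven)(take(2)(sum))).run
}
```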

Page 32: Introduction to Iteratees (Scala)

Brief discussion:

✓ Enumerators, Iteratees, Enumeratees

✓ Easy to use

✓ Composable in sequence and in parallel

✓ Tail recursion ⇒ no stack overflow

✓ Thin layer ⇒ high performance

✓ Good match for asynchronous environments, e.g. the play framework

✗ Pure form lacks functional treatment of side effects

Page 33: Introduction to Iteratees (Scala)

Where to go from here

Read blogs & tutorials, start out with [1] to [6]

Use or rather implement iteratees

Check out the play framework
  Asynchronous iteratees
  Numerous factory methods, e.g. easily bridge between Rx’s Observables and Enumerators [3]
  All the other goodies from play

Check out scalaz 7
  Implementation in terms of monad transformers ⇒ your choice of IO, Future, Option, . . .
  Everything that’s missing in the standard library
  See examples and explanations at [1] and [2]

Page 34: Introduction to Iteratees (Scala)

Thank you for your attention

Anyone trying to understand monads will inevitably run into the [...] monad, and the results are almost always the same: bewilderment, confusion, anger, and ultimately Perl.

Daniel Spiewak, Dec. 2010

Page 35: Introduction to Iteratees (Scala)

References:

[1] “Enumeration-based I/O With Iteratees” @ http://blog.higher-order.com/blog/2010/10/14/scalaz-tutorial-enumeration-based-io-with-iteratees/

[2] “learning Scalaz: Enumeration-Based I/O with Iteratees” @ http://eed3si9n.com/learning-scalaz/Iteratees.html

[3] “RxPlay: Diving into iteratees and observables and making them play nice” @ http://bryangilbert.com/code/2013/10/22/rxPlay-making-iteratees-and-observables-play-nice/

[4] “Understanding Play2 Iteratees for Normal Humans” @ http://mandubian.com/2012/08/27/understanding-play2-iteratees-for-normal-humans/

[5] “Incremental multi-level input processing and collection enumeration” @ http://okmij.org/ftp/Streams.html

[6] “Teaching an Old Fool New Tricks” @ http://themonadreader.wordpress.com/2010/05/12/issue-16/


Page 36: Introduction to Iteratees (Scala)

Running Iteratees | Example

Backup slides

Page 37: Introduction to Iteratees (Scala)

run makes great effort to yield the final result:

def run: T = this match {
  case Done(rslt, _) => rslt
  case Error(t) => throw t
  // k: Input[F] => Iteratee[F,T]
  case Cont(k) => k(EOF).run // may loop forever
}

Page 38: Introduction to Iteratees (Scala)

run makes great effort to yield the final result:

def run: T = this match {
  case Done(rslt, _) => rslt
  case Error(t) => throw t
  // k: Input[F] => Iteratee[F,T]
  case Cont(k) => k(EOF) match {
    case Done(rslt, _) => rslt
    case Cont(_) => sys.error("Diverging iteratee!")
    case Error(t) => throw t
  }
}
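The difference between the two versions of run only matters for an iteratee that refuses to finish. The sketch below constructs such a case; never and the wrapper DivergeDemo are made-up names, and scala.util.Try is used only to capture the exception. With the first version of run the same call would not return.

```scala
import scala.util.Try

object DivergeDemo {
  sealed trait Input[+T]
  final case class Element[+T](x: T) extends Input[T]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]

  sealed trait Iteratee[F, T] {
    // The second, non-looping variant of run.
    def run: T = this match {
      case Done(result, _) => result
      case Error(t)        => throw t
      case Cont(k) => k(EOF) match {
        case Done(result, _) => result
        case Cont(_)         => sys.error("Diverging iteratee!")
        case Error(t)        => throw t
      }
    }
  }
  final case class Done[F, T](result: T, remainingInput: Input[F]) extends Iteratee[F, T]
  final case class Cont[F, T](k: Input[F] => Iteratee[F, T]) extends Iteratee[F, T]
  final case class Error[F, T](t: Throwable) extends Iteratee[F, T]

  // Hypothetical iteratee that asks for more input no matter what it receives, even EOF.
  def never[F, T]: Iteratee[F, T] = Cont[F, T](_ => never[F, T])

  // run feeds EOF once, sees another Cont and gives up instead of recursing.
  val outcome: Try[String] = Try(never[Int, String].run)
}
```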

Page 39: Introduction to Iteratees (Scala)

def drop[T](n: Int): Iteratee[T, Unit] = {
  def step: Input[T] => Iteratee[T, Unit] = {
    case EOF => Done((), EOF)
    case Empty => Cont(step)
    case Element(_) => drop(n-1)
  }
  if (n <= 0) Done((), Empty) else Cont(step)
}

Example:

scala> enum("Hello World!".toList)(drop(3))
res5: Iteratee[Char,Unit] = Done((),Empty)

scala> res5.run
// blank

Page 40: Introduction to Iteratees (Scala)

How this works

[Diagram: enum(List(1,2,3))(drop(2)) unfolds step by step]

drop(2) = Cont(step(n)), n=2

enum(List(1,2,3)) feeds El(1), giving Cont(step(n)), n=1

enum(List(2,3)) feeds El(2), giving Done((),Empty)

enum(List(3)) = Done((),Empty)
