Error Handling in Reactive Systems

©Typesafe 2014 – All Rights Reserved. Error Handling in Reactive Systems. Dean Wampler. Wednesday, November 19, 2014. Photos from Jantar Mantar (“instrument”, “calculation”), the astronomical observatory built in Jaipur, India, by Sawai Jai Singh, a Rajput King, in the 1720s-30s. He built four others around India. This is the largest and best preserved. All photos are copyright (C) 2012-2014, Dean Wampler. All Rights Reserved.

Upload: dean-wampler

Post on 02-Jul-2015


Category:

Internet



DESCRIPTION

Error handling is critical for reactive systems, as expressed by the "resilient" trait. This talk examines strengths and weaknesses of existing approaches. I also consider preventing errors in the first place.

TRANSCRIPT

Page 1: Error Handling in Reactive Systems

Error Handling in Reactive Systems

Dean Wampler

Page 3: Error Handling in Reactive Systems

Typesafe Reactive Big Data

typesafe.com/reactive-big-data

This is my role. We’re just getting started, but talk to me if you’re interested in what we’re doing.

Page 4: Error Handling in Reactive Systems

[Diagram: the reactive traits Responsive, Elastic, Resilient, and Message Driven]

Page 5: Error Handling in Reactive Systems

[Diagram: the reactive traits Responsive, Elastic, Resilient, and Message Driven]

Failures are first class?

Truly resilient systems must make failures first class citizens, in some sense of the word, because they are inevitable when the systems are big enough and run long enough.

Page 6: Error Handling in Reactive Systems

I’ve structured this talk around some points made in Debasish’s new book, which has lots of interesting practical ideas for combining functional programming and reactive approaches with classic Domain-Driven Design by Eric Evans.

Page 7: Error Handling in Reactive Systems

#1 Failure-handling mixed with domain logic.

This is how we’ve always done it, right?

Page 8: Error Handling in Reactive Systems

Best for narrowly-scoped errors.
– Parsing user input.
– Transient stream interruption.
– Failover from one stream to a “backup”.

Page 9: Error Handling in Reactive Systems

Limited to per-stream handling. Hard to implement a larger strategy.

Page 10: Error Handling in Reactive Systems

Communicating Sequential Processes

Message passing via channels

See
http://en.wikipedia.org/wiki/Communicating_sequential_processes
http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html
http://blog.drewolson.org/blog/2013/07/04/clojure-core-dot-async-and-go-a-code-comparison/

and other references in the “bonus” slides at the end of the deck. I also have some slides that describe the core primitives of CSP that I won’t have time to cover.

Page 11: Error Handling in Reactive Systems

“Don’t communicate by sharing memory, share memory by communicating”

-- Rob Pike

http://www.youtube.com/watch?v=f6kdp27TYZs&feature=youtu.be

From a talk Pike did at Google I/O 2012.

Page 12: Error Handling in Reactive Systems

CSP-inspired Go & Clojure’s core.async

Page 13: Error Handling in Reactive Systems

[Diagram: a Put Value operation and a Get Value operation connected by a Blocking Channel]

Simplest channel, a blocking, 1-element “connector” used to share values, one at a time, between a source and a waiting sink. The put operation blocks if there is no sink waiting on the other end.

The channel can be typed (Go lang).

Doesn’t prevent the usual problems if mutable state is passed over a channel!

Page 14: Error Handling in Reactive Systems

[Diagram: a Put Value operation and a Get Value operation connected by a Blocking Channel]

• Block on put if no one to get.
• Channel can be typed.
• Avoid passing mutable state!
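A minimal Scala sketch of the blocking channel described above. It assumes the JDK's `java.util.concurrent.SynchronousQueue` as the rendezvous primitive; the names `BlockingChannel` and `roundTrip` are illustrative and come from neither Go nor core.async.

```scala
import java.util.concurrent.SynchronousQueue

object BlockingChannelSketch {
  /** A typed rendezvous channel: `put` blocks until a `take` is waiting, and vice versa. */
  final class BlockingChannel[A] {
    private val queue = new SynchronousQueue[A]()
    def put(value: A): Unit = queue.put(value)
    def take(): A = queue.take()
  }

  /** Sends immutable values from a producer thread and collects them on the caller's thread. */
  def roundTrip(values: List[String]): List[String] = {
    val channel = new BlockingChannel[String]
    val producer = new Thread(() => values.foreach(channel.put))
    producer.start()
    val received = values.map(_ => channel.take())
    producer.join()
    received
  }
}
```

Only immutable strings cross the channel here, which sidesteps the mutable-state problem the slide warns about.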

Page 15: Error Handling in Reactive Systems

[Diagram: a Put Value operation and a Get Value operation connected by a Bounded, Nonblocking Channel with N = 3]

A non-blocking queue, but bounded in size. Normally, N wouldn’t be this small. You DON’T want it to be infinite either, because eventually you’ll fill it and run out of memory!

Page 16: Error Handling in Reactive Systems

• When full:
  • Block on put.
  • Drop newest (i.e., put value).
  • Drop oldest (“sliding” window).

[Diagram: a Put Value operation and a Get Value operation connected by a Bounded, Nonblocking Channel with N = 3]

What should you do when it’s full?
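The two dropping policies can be modeled in a few lines of Scala. This is a single-threaded sketch with hypothetical names (`BoundedBuffer`, `DropNewest`, `DropOldest`), not the core.async or Go implementation. The third policy, blocking on put, is what the JDK's `ArrayBlockingQueue.put` already does.

```scala
import scala.collection.mutable

object OverflowSketch {
  sealed trait Overflow
  case object DropNewest extends Overflow // discard the value being put
  case object DropOldest extends Overflow // "sliding" window: evict the head

  /** A single-threaded model of a bounded buffer; not thread-safe. */
  final class BoundedBuffer[A](capacity: Int, policy: Overflow) {
    require(capacity > 0, "capacity must be positive")
    private val items = mutable.Queue.empty[A]

    def put(value: A): Unit =
      if (items.size < capacity) items += value
      else policy match {
        case DropNewest => () // buffer stays as it is
        case DropOldest =>
          items.dequeue()
          items += value
      }

    def contents: List[A] = items.toList
  }
}
```

With N = 3 and the values 1 to 5, dropping the newest keeps 1, 2, 3, while the sliding window keeps 3, 4, 5.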

Page 17: Error Handling in Reactive Systems

[Diagram: two Go Blocks, one putting values and one getting values, connected by a Bounded, Nonblocking Channel with N = 3]

So far, we haven’t supported any actual concurrency. I’m using “Go Blocks” here to represent explicit threads in Clojure, when running on the JVM and you’re willing to dedicate a thread to the sequence of code, or core.async “go blocks”, which provide thread-like async behavior, but share real threads. This is the only option for ClojureScript, since you only have one thread, period.

Similarly for Go, “go blocks” would be “goroutines”.

In all cases, they are analogous to Java/Scala futures.

Page 18: Error Handling in Reactive Systems

[Diagram: two Go Blocks, one putting values and one getting values, connected by a Bounded, Nonblocking Channel with N = 3]

• Core Async: Go Blocks, Threads.
• Go: Goroutines.
• Analogous to futures.

Page 19: Error Handling in Reactive Systems

[Diagram: two Go Blocks that put values and one Go Block that gets values, connected through a Bounded, Nonblocking Channel with N = 3]

You can “select” on several channels, analogous to socket select. I.e., read from the next channel with a value. In Go, there is a “select” construct for this. In core.async, there are the “alt!” (parking, for use inside go blocks) and “alt!!” (blocking the calling thread) operations.

Fan out is also possible.

Page 20: Error Handling in Reactive Systems

[Diagram: two Go Blocks that put values and one Go Block that gets values, connected through a Bounded, Nonblocking Channel with N = 3]

• Blocking or nonblocking.
• Like socket select.

Page 21: Error Handling in Reactive Systems

At this point, neither Go nor Core Async has implemented distributed channels.

However, channels are often used to implement end points for network and file I/O, etc.

In other words, no one has extended the channel formalism outside process boundaries (compare to Actors...), but channels are often used to handle blocking I/O, etc.

Page 22: Error Handling in Reactive Systems

Failure Handling in Core Async

The situation is broadly similar for Go. Some items here are adapted from a private conversation with Alex Miller (@puredanger).

Page 23: Error Handling in Reactive Systems

Channel construction takes an optional exception function.

• The exception is passed to the function.
• If it returns non-nil, that value is put on the channel.

Using this feature is almost always what you should do, because you have almost no other good options.

Page 24: Error Handling in Reactive Systems

Which call stack?

• Processing logic can span several threads!
• A general problem for concurrency implemented using multithreading.

Since it’s not a distributed system, core.async only needs to handle errors in a single process, but you still can have multiple threads.

Page 25: Error Handling in Reactive Systems

Propagate exceptions back through the channel.

[Diagram: a timeline with one Go Block putting a value and another Go Block getting it; labels: Exception, Handle Exception]

One possible scheme is to push exceptions back through the channel and let the initializing go block decide what to do. It might rethrow the exception.
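One way to sketch this scheme in Scala is to put `Try` values on the channel, so a failure travels the same path as a result. The queue standing in for the channel and the names `runJobs` and `collect` are assumptions made for illustration.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.util.{Failure, Success, Try}

object TryChannelSketch {
  /** Worker side: run each job and put either its result or its exception on the channel. */
  def runJobs(jobs: List[() => Int], out: LinkedBlockingQueue[Try[Int]]): Unit =
    jobs.foreach(job => out.put(Try(job())))

  /** Initiating side: take one outcome per job and decide what to do with each failure. */
  def collect(jobs: List[() => Int]): (List[Int], List[String]) = {
    val channel = new LinkedBlockingQueue[Try[Int]]()
    val worker = new Thread(() => runJobs(jobs, channel))
    worker.start()
    val outcomes = jobs.map(_ => channel.take())
    worker.join()
    val results = outcomes.collect { case Success(v) => v }
    val errors = outcomes.collect { case Failure(e) => e.getMessage }
    (results, errors)
  }
}
```

Here the initiating side only records the error messages; it could just as well rethrow the first `Failure` it sees.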

Page 26: Error Handling in Reactive Systems

Propagate exceptions to a special error channel.

[Diagram: a timeline with Go Blocks putting and getting a value; an Exception travels over an Error Channel to a Go Block labeled Handle Exception]

Another possible scheme is to send exceptions down a specialized error channel.
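A rough Scala sketch of the error-channel scheme, using two JDK queues as stand-ins for channels. The function name `process` and the integer-parsing workload are hypothetical.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.util.control.NonFatal

object ErrorChannelSketch {
  /** Parses each input; good values go to `values`, exceptions go to the separate `errors` channel. */
  def process(inputs: List[String],
              values: LinkedBlockingQueue[Int],
              errors: LinkedBlockingQueue[Throwable]): Unit =
    inputs.foreach { in =>
      try values.put(in.trim.toInt)
      catch { case NonFatal(e) => errors.put(e) }
    }
}
```

A dedicated consumer of `errors` can then apply one policy for the whole pipeline, while the consumer of `values` never sees a failure.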

Page 27: Error Handling in Reactive Systems

Deadlocks are possible unless timeouts are used.

[Diagram: a Put Value on a Blocking Channel with no receiver (“???”), guarded by a Timeout]
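On the JVM, a timed put can be sketched with `SynchronousQueue.offer(value, timeout, unit)`, which gives up instead of waiting forever. The helper name `putWithTimeout` is illustrative only.

```scala
import java.util.concurrent.{SynchronousQueue, TimeUnit}

object TimeoutSketch {
  /** Tries to hand `value` to a receiver; returns false if none arrives within `millis`. */
  def putWithTimeout[A](channel: SynchronousQueue[A], value: A, millis: Long): Boolean =
    channel.offer(value, millis, TimeUnit.MILLISECONDS)
}
```

The caller still has to decide what a `false` result means: retry, drop the value, or report a failure.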

Page 28: Error Handling in Reactive Systems

Reactive Extensions

I’ll look at C# examples, but all the Rx language implementations work similarly.

Page 29: Error Handling in Reactive Systems

[Diagram: Rx building blocks: Observable, LINQ (filter, map, …), Async Event Stream, Schedulers]

Page 30: Error Handling in Reactive Systems

Uses classic exception handling patterns, with extensions.

Page 31: Error Handling in Reactive Systems

OnError notification caught with a Catch method.
• Switch to a second stream.

    var stream = new Subject<MyType>();
    var observer = stream.Catch(otherStream);
    ...
    stream.OnNext(item1);
    ...
    stream.OnError(new Exception("error"));
    // continue with otherStream.

Adapted from http://www.introtorx.com/content/v1.0.10621.0/11_AdvancedErrorHandling.html

Page 32: Error Handling in Reactive Systems

Variant for catching a specific exception, with a function to construct a new stream.

    var stream = new Subject<MyType>();
    var observer = stream.Catch<MyType,MyEx>(
        ex => /* create new MyType stream */);
    ...
    stream.OnNext(item1);
    ...
    stream.OnError(new MyEx("error"));
    // continue with generated stream.

Adapted from http://www.introtorx.com/content/v1.0.10621.0/11_AdvancedErrorHandling.html
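To make the “switch to a second stream” idea concrete without an Rx dependency, here is a rough pull-based analogue over Scala iterators. It is not Rx and not its API; `catchWith` is a hypothetical helper that continues with a fallback iterator once the primary one throws.

```scala
import scala.util.{Failure, Success, Try}

object CatchSketch {
  /** Yields elements of `primary`; if pulling from it throws, continues with `fallback`. */
  def catchWith[A](primary: Iterator[A], fallback: => Iterator[A]): Iterator[A] =
    new Iterator[A] {
      private var current: Iterator[A] = primary
      private var switched = false
      private var buffered: Option[A] = None

      // Fetch the next element into `buffered`, switching streams on the first failure.
      private def pull(): Unit =
        if (buffered.isEmpty) {
          Try(if (current.hasNext) Some(current.next()) else None) match {
            case Success(pulled) => buffered = pulled
            case Failure(_) if !switched =>
              switched = true
              current = fallback
              pull()
            case Failure(e) => throw e // the fallback failed too
          }
        }

      def hasNext: Boolean = { pull(); buffered.isDefined }

      def next(): A = {
        pull()
        val a = buffered.getOrElse(throw new NoSuchElementException("stream is empty"))
        buffered = None
        a
      }
    }
}
```

As with Catch, elements already delivered before the error are kept; only the remainder comes from the second stream.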

Page 33: Error Handling in Reactive Systems

There is also a Finally method. Analogous to try {...} finally {...} clauses.

Adapted from http://www.introtorx.com/content/v1.0.10621.0/11_AdvancedErrorHandling.html

Page 34: Error Handling in Reactive Systems

OnErrorResumeNext: Swallows exception, continues with alternative stream(s).

    public static IObservable<TSource> OnErrorResumeNext<TSource>(
        this IObservable<TSource> first,
        IObservable<TSource> second) {...}

    public static IObservable<TSource> OnErrorResumeNext<TSource>(
        params IObservable<TSource>[] sources) {...}
    ...

Adapted from http://www.introtorx.com/content/v1.0.10621.0/11_AdvancedErrorHandling.html
2 of the 3 variants.

Page 35: Error Handling in Reactive Systems

Retry: Are some exceptions expected, e.g., I/O “hiccups”? Keeps trying. Optional max retries.

    public static void RetrySample<T>(IObservable<T> source)
    {
        source.Retry(4) // retry up to 4 times.
            .Subscribe(t => Console.WriteLine(t));
        Console.ReadKey();
    }

Adapted from http://www.introtorx.com/content/v1.0.10621.0/11_AdvancedErrorHandling.html
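The same bounded-retry idea for a single operation, sketched in plain Scala with `Try`. The helper `retry` and its `maxAttempts` parameter are illustrative names, not part of Rx; `maxAttempts` counts total attempts, including the first.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Try}

object RetrySketch {
  /** Evaluates `op` up to `maxAttempts` times, returning the first success or the last failure. */
  @tailrec
  def retry[T](maxAttempts: Int)(op: => T): Try[T] =
    Try(op) match {
      case Failure(_) if maxAttempts > 1 => retry(maxAttempts - 1)(op)
      case outcome => outcome
    }
}
```

A production version would usually retry only on exceptions known to be transient and wait between attempts.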

Page 36: Error Handling in Reactive Systems

CSP & Rx: Failure management is per-stream, mixed with domain logic.

What CSP-derived and Rx concurrency systems do, they do well, but we need a larger strategy for reactive resiliency at scale.

Before we consider such strategies, let’s discuss another technique.

Page 37: Error Handling in Reactive Systems

#2 Prevent common problems.

This is how we’ve always done it, right?

Page 38: Error Handling in Reactive Systems

Reactive Streams

Reactive Streams extend the capabilities of CSP channels and Rx by addressing flow control concerns. They

Page 39: Error Handling in Reactive Systems

Reactive Streams

[Diagram: an Event/Data Stream of Events flowing through bounded queues to Consumers, with feedback flowing back upstream]

Page 40: Error Handling in Reactive Systems

Reactive Streams

Streams (data flows) are a natural model for many distributed problems, i.e., one-way CSP channels at scale.

[Diagram: an Event/Data Stream of Events flowing through bounded queues to Consumers, with feedback flowing back upstream]
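The “feedback” arrows can be read as demand signals: the consumer tells the producer how many elements it is ready for. Below is a deliberately tiny, synchronous Scala sketch of that idea. `DemandDrivenSource` is a hypothetical class, loosely modeled on the `request(n)` signal of the Reactive Streams `Subscription` interface, not an implementation of the specification.

```scala
object BackPressureSketch {
  /** A pull-based source: it emits only as many elements as the consumer has requested. */
  final class DemandDrivenSource[A](elements: Iterator[A]) {
    /** Emits at most `n` elements to `onNext`; returns how many were actually emitted. */
    def request(n: Int)(onNext: A => Unit): Int = {
      var emitted = 0
      while (emitted < n && elements.hasNext) {
        onNext(elements.next())
        emitted += 1
      }
      emitted
    }
  }
}
```

Because nothing is emitted without demand, a slow consumer never needs an unbounded queue in front of it.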

Page 41: Error Handling in Reactive Systems

#3 Leverage types to prevent errors.

This is how we’ve always done it, right?

Page 42: Error Handling in Reactive Systems

Express what’s really happening with types.

First, let’s at least be honest with the reader about what’s actually happening in blocks of code.

Page 43: Error Handling in Reactive Systems

When code raises exceptions:

    case class Order(
      id: Long, cost: Money, items: Seq[(Int,SKU)])

    object Order {
      def parse(string: String): Try[Order] = Try {
        val array = string.split("\t")
        // raise exceptions for bad records
        new Order(...)
      }
    }

Idiomatic Scala for “defensive” parsing of incoming data as strings. Wrap the parsing and construction logic in a Try {...}. Note the capital T; this will construct a Try instance, either a subclass Success, if everything works, or a Failure, if an exception is thrown.
See the github repo for this presentation for a complete example: https://github.com/deanwampler/Presentations
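For a version that compiles on its own, here is the same pattern applied to a deliberately simplified record. `SimpleOrder` and its two tab-separated fields are assumptions made for this sketch; they are not the `Order` type from the slide.

```scala
import scala.util.Try

object ParseSketch {
  // Hypothetical, simplified record: "id<TAB>costInCents".
  final case class SimpleOrder(id: Long, costInCents: Long)

  /** Any exception thrown inside the block becomes a Failure instead of escaping. */
  def parse(line: String): Try[SimpleOrder] = Try {
    val fields = line.split("\t")
    require(fields.length == 2, s"expected 2 fields, found ${fields.length}")
    SimpleOrder(fields(0).trim.toLong, fields(1).trim.toLong)
  }
}
```

The return type tells every caller that parsing can fail, which is the “honesty” the slide asks for.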

Page 44: Error Handling in Reactive Systems

Latency? Use Futures
• Or equivalents, like go blocks.

    case class Account(
      id: Long, orderIds: Seq[Long])

    def getAccount(id: Long): Future[Account] =
      Future { /* Web service, DB query, etc... */ }

    def getOrders(ids: Seq[Long]): Future[Seq[Order]] =
      Future { /* Web service, DB query, etc... */ }
    ...

See the github repo for this presentation for a complete example: https://github.com/deanwampler/Presentations

Page 45: Error Handling in Reactive Systems

Latency? Use Futures
• Or equivalents, like go blocks.

    ...
    def ordersForAccount(accountId: Long): Future[Seq[Order]] = for {
      account <- getAccount(accountId)
      orders  <- getOrders(account.orderIds)
    } yield orders

Futures can be sequenced “monadically”, so our code has a nice synchronous feel to it, but we can exploit async execution.
See the github repo for this presentation for a complete example: https://github.com/deanwampler/Presentations

Page 46: Error Handling in Reactive Systems

Latency? Use Futures
• Or equivalents, like go blocks.

    val accountId = ...
    val ordersFuture = ordersForAccount(accountId)

    ordersFuture.onSuccess {
      case orders => println(s"#$accountId: $orders")
    }
    ordersFuture.onFailure {
      case exception => println(s"#$accountId: " +
        s"Failed to process orders: $exception")
    }

See the github repo for this presentation for a complete example: https://github.com/deanwampler/Presentations
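A self-contained sketch of the whole flow, with in-memory stand-ins for the remote calls. The stubbed data, the simplified `Order`, and the `describe` helper are assumptions for illustration; the complete original example is in the repo cited above.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object FutureSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  final case class Account(id: Long, orderIds: Seq[Long])
  final case class Order(id: Long)

  // Stand-ins for remote calls; an unknown account fails the Future.
  private val accounts = Map(1L -> Account(1L, Seq(10L, 11L)))

  def getAccount(id: Long): Future[Account] =
    Future(accounts.getOrElse(id, throw new NoSuchElementException(s"no account $id")))

  def getOrders(ids: Seq[Long]): Future[Seq[Order]] =
    Future(ids.map(id => Order(id)))

  def ordersForAccount(accountId: Long): Future[Seq[Order]] =
    for {
      account <- getAccount(accountId)
      orders  <- getOrders(account.orderIds)
    } yield orders

  /** Blocks for the outcome; acceptable in a demo, usually avoided in production code. */
  def describe(accountId: Long): String =
    Try(Await.result(ordersForAccount(accountId), 2.seconds)) match {
      case Success(orders) => s"#$accountId: ${orders.map(_.id).mkString(",")}"
      case Failure(e)      => s"#$accountId: Failed to process orders: ${e.getMessage}"
    }
}
```

When `getAccount` fails, the for comprehension short-circuits: `getOrders` never runs and the failure surfaces once, at the end.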

Page 47: Error Handling in Reactive Systems

Use types to enforce correctness.

First, let’s at least be honest with the reader about what’s actually happening in blocks of code.

Page 48: Error Handling in Reactive Systems

Functional Reactive Programming

On the subject of type safety, let’s briefly discuss FRP. It was invented in the Haskell community, where there’s a strong commitment to type safety as a tool for correctness.

Page 49: Error Handling in Reactive Systems

Represent evolving state by time-varying values.

    Reactor.flow { reactor =>
      val path = new Path(
        (reactor.await(mouseDown)).position)
      reactor.loopUntil(mouseUp) {
        val m = reactor.awaitNext(mouseMove)
        path.lineTo(m.position)
        draw(path)
      }
      path.close()
      draw(path)
    }

From Deprecating the Observer Pattern with Scala.React.

Draw a line on a UI from the initial point to the current mouse point, as the mouse moves.
This API is from a research paper. I could have used Elm (FRP for JavaScript) or one of the Haskell FRP APIs (where FRP was pioneered), but this DSL is reasonably easy to understand.
Here, we have a stream of data points, so it resembles Rx in its concepts.
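To illustrate what a “time-varying value” means, here is a crude sketch: a value that can be observed and from which dependent values are derived. This is not Scala.React and not a real FRP implementation; `Var`, `observe`, and `map` are hypothetical names, and the code is single-threaded.

```scala
object SignalSketch {
  /** A mutable, observable value: a crude stand-in for an FRP signal. */
  final class Var[A](initial: A) {
    private var value: A = initial
    private var observers: List[A => Unit] = Nil

    def now: A = value

    /** Calls `f` with the current value and again on every later change. */
    def observe(f: A => Unit): Unit = {
      observers = f :: observers
      f(value)
    }

    def set(a: A): Unit = {
      value = a
      observers.foreach(f => f(a))
    }

    /** Derives a new time-varying value that tracks this one. */
    def map[B](f: A => B): Var[B] = {
      val derived = new Var(f(value))
      observers = ((a: A) => derived.set(f(a))) :: observers
      derived
    }
  }
}
```

A derived value such as `position.map(distanceFromOrigin)` stays current without the caller registering callbacks by hand, which is the appeal of the model.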

Page 50: Error Handling in Reactive Systems

Can you declaratively prevent errors?

Sculthorpe and Nilsson, Safe functional reactive programming through dependent types

True to its Haskell roots, FRP tries to use the type system to explicitly
http://dl.acm.org/citation.cfm?doid=1596550.1596558

Page 51: Error Handling in Reactive Systems

#4 Manage errors separately.

This is how we’ve always done it, right?

Page 52: Error Handling in Reactive Systems

Actor Model

Page 53: Error Handling in Reactive Systems

Superficially similar to channels.

[Diagram: one Actor sends a message through an ActorRef to the mailbox (message queue) of another Actor, which handles the message]

This is how they look in Akka, where there is a layer of indirection, the ActorRef, between actors. This helps with the drawback that actors know each other’s identities, but mostly it’s there to make the system more resilient, where a failed actor can be restarted while keeping the same ActorRef that other actors hold on to.

Page 54: Error Handling in Reactive Systems

[Diagram: one Actor sends a message through an ActorRef to the mailbox (message queue) of another Actor, which handles the message]

In response to a message, an Actor can:

• Send 0-n msgs to other actors.
• Create 0-n new actors.
• Change its behavior for responding to the next message.
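A toy actor in plain Scala makes the mailbox and the “change its behavior” step concrete. This is a sketch under simplifying assumptions (one dedicated thread per actor, no supervision, no ActorRef indirection); `ToyActor`, `!`, and `become` are illustrative names that only echo Akka's vocabulary.

```scala
import java.util.concurrent.LinkedBlockingQueue

object ActorSketch {
  /** A toy actor: a mailbox drained by one thread, with a swappable behavior. */
  final class ToyActor[M](initial: ToyActor[M] => M => Unit) {
    private val mailbox = new LinkedBlockingQueue[M]()
    @volatile private var behavior: M => Unit = initial(this)

    /** Asynchronous, fire-and-forget send. */
    def !(msg: M): Unit = mailbox.put(msg)

    /** Replaces the handler used for the next message. */
    def become(next: M => Unit): Unit = behavior = next

    // Messages are handled one at a time, in arrival order.
    private val worker = new Thread(() => while (true) behavior(mailbox.take()))
    worker.setDaemon(true)
    worker.start()
  }
}
```

An exception thrown by a handler simply kills this actor's thread, which is exactly the gap that supervisors are meant to fill.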

Page 55: Error Handling in Reactive Systems

[Diagram: one Actor sends a message through an ActorRef to the mailbox (message queue) of another Actor, which handles the message]

Messages are:

• Handled asynchronously.
• Usually untyped.

Page 56: Error Handling in Reactive Systems

CSP and Actors are dual

Page 57: Error Handling in Reactive Systems

CSP Processes are anonymous

... while actors have identities.

[Diagram: two Go Blocks, one putting values and one getting values, connected by a Bounded, Nonblocking Channel with N = 3]

Page 58: Error Handling in Reactive Systems

CSP messaging is synchronous

A sender and receiver must rendezvous, while actor messaging is asynchronous.

[Diagram: two Go Blocks, one putting values and one getting values, connected by a Bounded, Nonblocking Channel with N = 3]

In actors, the receiver doesn’t even need to be ready to receive messages yet.

Page 59: Error Handling in Reactive Systems

... but CSP and Actors can implement each other

Page 60: Error Handling in Reactive Systems

An actor mailbox looks a lot like a channel.

[Diagram: the channel picture (two Go Blocks around a Bounded, Nonblocking Channel with N = 3) next to the actor picture (two Actors, an ActorRef, and a mailbox)]

In actors, the receiver doesn’t even need to be ready to receive messages yet.

Page 61: Error Handling in Reactive Systems

CSP Processes are anonymous

Actor identity can be hidden behind a lookup service. An actor can be used as a channel, i.e., a “message broker”.

Page 62: Error Handling in Reactive Systems

CSP Processes are anonymous

Conversely, a reference to the channel is often shared between a sender and receiver.

Page 63: Error Handling in Reactive Systems

CSP messaging is synchronous

Actor messaging can be synchronous if the sender uses a blocking message send that waits for a response.

[Diagram: one Actor sends a blocking message to another Actor and waits for the Reply]

Most actor systems provide a blocking message send primitive where the “thread” blocks until an answer message is received.
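A blocking send can be sketched by pairing each request with a `Promise` that the receiver completes. The names `Ask`, `Echo`, and `askAndWait` are hypothetical and do not mirror any particular actor library's API.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration.FiniteDuration

object AskSketch {
  /** A request paired with the promise the receiver completes with its reply. */
  final case class Ask(question: String, reply: Promise[String])

  /** A toy receiver: one thread draining a mailbox and answering each Ask. */
  final class Echo {
    private val mailbox = new LinkedBlockingQueue[Ask]()
    private val worker = new Thread(() =>
      while (true) {
        val ask = mailbox.take()
        ask.reply.success("echo: " + ask.question)
      })
    worker.setDaemon(true)
    worker.start()

    /** Blocking send: enqueue the message, then wait for the reply or time out. */
    def askAndWait(question: String, timeout: FiniteDuration): String = {
      val reply = Promise[String]()
      mailbox.put(Ask(question, reply))
      Await.result(reply.future, timeout)
    }
  }
}
```

The timeout matters for the same reason as with blocking channels: without it, a lost reply would block the sender forever.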

Page 64: Error Handling in Reactive Systems

CSP messaging is synchronous

Buffered channels behave asynchronously.

[Diagram: two Go Blocks, one putting values and one getting values, connected by a Bounded, Nonblocking Channel with N = 3]

In actors, the receiver doesn’t even need to be ready to receive messages yet.

Page 65: Error Handling in Reactive Systems

Erlang and Akka

Page 66: Error Handling in Reactive Systems

Distributed Actors

• Generalize actor identities to URLs.
• But distribution adds a number of failure modes...

URL vs. URI?? See http://danielmiessler.com/study/url_vs_uri/

Page 67: Error Handling in Reactive Systems

Failure-handling in Actor Systems

Page 68: Error Handling in Reactive Systems

Let it Crash!

Page 69: Error Handling in Reactive Systems

Erlang introduced supervisors

A hierarchy of actors that manage each “worker” actor’s lifecycle.

[Diagram: a supervision tree with nodes Supervisor 1, Supervisor 11, Actor 12, Actor 13, Actor 111, Actor 112, Actor 131, and Actor 132]

Page 70: Error Handling in Reactive Systems

Erlang introduced supervisors

Generalizes nicely to distributed actor systems.

[Diagram: a supervision tree with nodes Supervisor 1, Supervisor 11, Actor 12, Actor 13, Actor 111, Actor 112, Actor 131, and Actor 132]

Page 71: Error Handling in Reactive Systems

[Diagram: the same supervision tree, with a failure marked by an X]

Page 72: Error Handling in Reactive Systems

[Diagram: the supervision tree showing only Supervisor 1, Supervisor 11, Actor 12, Actor 111, and Actor 112]

Page 73: Error Handling in Reactive Systems

[Diagram: the supervision tree with Actor 13, Actor 131, and Actor 132 shown again alongside Supervisor 1, Supervisor 11, Actor 12, Actor 111, and Actor 112]
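The restart cycle in these diagrams can be approximated, very loosely, by a function that reruns a worker when it crashes. This is a synchronous sketch with hypothetical names (`supervise`, `Directive`, `Restart`, `Stop`), not the Erlang/OTP or Akka supervision API.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Try}

object SupervisorSketch {
  sealed trait Directive
  case object Restart extends Directive
  case object Stop extends Directive

  /** Runs a fresh worker; on a crash, asks `decide` whether to restart it or give up. */
  @tailrec
  def supervise[A](runFreshWorker: () => A, restartsLeft: Int)(decide: Throwable => Directive): Try[A] =
    Try(runFreshWorker()) match {
      case Failure(e) if restartsLeft > 0 && decide(e) == Restart =>
        supervise(runFreshWorker, restartsLeft - 1)(decide)
      case outcome => outcome
    }
}
```

The worker contains no error handling of its own; the policy lives entirely in `decide`, which is what makes the strategy pluggable.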

Page 74: Error Handling in Reactive Systems

Advantages

• Enables strategic error handling across module boundaries.
• Separates normal and error logic.
• Failure handling is configurable and pluggable.

Page 75: Error Handling in Reactive Systems

Criticisms of Actors

Page 76: Error Handling in Reactive Systems

Rich Hickey

[Actors] still couple the producer with the consumer. Yes, one can emulate or implement certain kinds of queues with actors, but since any actor mechanism already incorporates a queue, it seems evident that queues are more primitive. ... and channels are oriented towards the flow aspects of a system.

From http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html

Page 77: Error Handling in Reactive Systems

Tim Baldridge’s Criticisms

• Unbounded queues (mailboxes).
• Internal mutating state (hidden in function closures).
• Must send message to deref state. What if the mailbox is backed up?
• Couples a queue, mutating state, and a process.
• Effectively “asynchronous OOP”.

From https://github.com/halgari/clojure-conj-2013-core.async-examples/blob/master/src/clojure_conj_talk/core.clj
Most of these are based on his toy example, not a production-calibre implementation.

Page 78: Error Handling in Reactive Systems

I’ll add...

• Most actor systems are untyped.
• Typed channels add that extra bit of type safety.

Page 79: Error Handling in Reactive Systems

Answers

The fact that Actors and CSP can be used to implement each other suggests that the criticisms are less than meets the eye...

Page 80: Error Handling in Reactive Systems

Unbounded queues

• Bounded queues are available in production-ready Actor implementations.
• Reactive Streams with back pressure enable strategic management of flow.

In other words, ignore toy examples.

Page 81: Error Handling in Reactive Systems

Reactive Streams

Akka streams provide a higher-level abstraction on top of Actors with better type safety (effectively, typed channels) and operational semantics.

[Diagram: an Event/Data Stream of Events flowing through bounded queues to Consumers, with feedback flowing back upstream]

Page 82: Error Handling in Reactive Systems

Internal mutating state

• Actually an advantage.
• Encapsulation of mutating state within an Actor is a systematic approach to large-scale, reliable management of state evolution.
• “Asynchronous OOP” is a fine strategy when it fits your problem.

Page 83: Error Handling in Reactive Systems

Must send message to get state

• Also an advantage.
• Protocol for coordinating reads and writes.

Page 84: Error Handling in Reactive Systems

Couples a queue, mutable state, and a process

• Production systems provide as much decoupling as you need.

Page 85: Error Handling in Reactive Systems

Actors are untyped

• While there have been typed actor implementations, actors are analogous to OS processes:
  • Clear abstraction boundaries.
  • But careful handling of inputs required.

Page 86: Error Handling in Reactive Systems

Actors are untyped

• ... but actually, Akka is adding typed ActorRefs.

Page 87: Error Handling in Reactive Systems

How should we handle failures?

Page 88: Error Handling in Reactive Systems

A separate system is needed to supervise and manage processes.

Page 89: Error Handling in Reactive Systems

Use Actors or...

Page 90: Error Handling in Reactive Systems

https://github.com/Netflix/Hystrix

Page 91: Error Handling in Reactive Systems

• Better separation of concerns.
• Failure can be delegated to a separate component.
• Strategy for failure handling can be pluggable.
• Better scalability.

Page 92: Error Handling in Reactive Systems

Conclusions

Page 93: Error Handling in Reactive Systems

Actors

- Untyped interfaces.
- More OOP than FP.
- Overhead higher than function calls.

Page 94: Error Handling in Reactive Systems

Actors

+ Industry proven scalability and resiliency.
+ Native asynchrony.

Best-in-class strategy for failure handling.

Page 95: Error Handling in Reactive Systems

CSP, Rx, etc.

- Limited failure handling facilities.
- Distributed channels?

I would include futures in the list of derivatives.

Page 96: Error Handling in Reactive Systems

CSP, Rx, etc.

+Emphasize flow of control.
+Typed channels.

Optimal replacement for multithreaded (intra-process) programming.

Page 97: Error Handling in Reactive Systems

Since we’re in a comedy club, let me finish by saying...

Page 98: Error Handling in Reactive Systems

You’ve been a great audience!

Page 99: Error Handling in Reactive Systems

I’m here all day!

Page 100: Error Handling in Reactive Systems

Don’t forget to tip your waitress!

Page 101: Error Handling in Reactive Systems

Try the veal!

Page 102: Error Handling in Reactive Systems

http://typesafe.com/[email protected]

Page 103: Error Handling in Reactive Systems

Bonus Slides

Page 104: Error Handling in Reactive Systems

Communicating Sequential Processes

Message passing via channels

See
http://en.wikipedia.org/wiki/Communicating_sequential_processes
http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html
http://blog.drewolson.org/blog/2013/07/04/clojure-core-dot-async-and-go-a-code-comparison/
and other references to follow.
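
To make "message passing via channels" concrete, here is a small Python sketch in which two threads share nothing except a channel. It uses the standard library's `queue.Queue` as a stand-in: a queue with `maxsize=1` is a one-slot buffer, so it only approximates the unbuffered, synchronous channels of CSP. The names `producer`, `consumer`, `run_pipeline`, and the `DONE` sentinel are assumptions made for this example.

```python
import queue
import threading
from typing import List

DONE = object()  # sentinel that marks the end of the stream


def producer(channel: "queue.Queue", items: List[int]) -> None:
    """Send each item on the channel, then signal completion."""
    for item in items:
        channel.put(item)  # blocks while the one-slot channel is full
    channel.put(DONE)


def consumer(channel: "queue.Queue", received: List[int]) -> None:
    """Receive until the sentinel arrives."""
    while True:
        message = channel.get()  # blocks until a message is available
        if message is DONE:
            return
        received.append(message)


def run_pipeline(items: List[int]) -> List[int]:
    """Wire one producer to one consumer through a single channel."""
    channel: "queue.Queue" = queue.Queue(maxsize=1)
    received: List[int] = []
    threads = [
        threading.Thread(target=producer, args=(channel, items)),
        threading.Thread(target=consumer, args=(channel, received)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

With a single producer and a single consumer, messages arrive in the order they were sent.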

Page 105: Error Handling in Reactive Systems

Hoare’s book on CSP, originally published in ’85, after CSP had evolved significantly from the initial programming language he defined in the ’70s into a theoretical model with a well-defined calculus by the mid ’80s (with the help of other people, too). The book itself has been subsequently refined. The PDF is available for free.

Page 106: Error Handling in Reactive Systems

Modern treatment of CSP. Roscoe helped transform the original CSP language into its more rigorous, process algebra form, which was influenced by Milner’s Calculus of Communicating Systems work. This book’s PDF is available free. This treatment is perhaps more accessible than Hoare’s book.

Page 107: Error Handling in Reactive Systems

CSP Operators

Page 108: Error Handling in Reactive Systems

Prefix

A process communicates event a to its environment. Afterwards the process behaves like P.

a⟶P

Page 109: Error Handling in Reactive Systems

Deterministic Choice

A process communicates event a or b to its environment. Afterwards the process behaves like P or Q, respectively.

a⟶P ☐ b⟶Q

Page 110: Error Handling in Reactive Systems

Nondeterministic Choice

The process doesn’t get to choose which is communicated, a or b.

a⟶P ⊓ b⟶Q

Page 111: Error Handling in Reactive Systems

Interleaving

Completely independent processes. The events seen by them are interleaved in time.

P ||| Q

Page 112: Error Handling in Reactive Systems

Interface Parallel

Represents synchronization on event a between P and Q.

P |[{a}]| Q

Page 113: Error Handling in Reactive Systems

Hiding

A form of abstraction, by making some events unobservable. P hides events a.

a⟶P \{a}
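
The operators on the preceding slides can be explored with a toy model that enumerates the visible traces of a finite process. This is a rough Python sketch, not a CSP tool: the class names (`Stop`, `Prefix`, `Choice`, `Parallel`, `Hide`) and the `traces` function are assumptions made for illustration. Nondeterministic choice is left out because a trace-only model cannot tell it apart from deterministic choice; both yield the union of the two trace sets.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set, Tuple

Trace = Tuple[str, ...]


class Process:
    def steps(self) -> List[Tuple[Optional[str], "Process"]]:
        """Return (event, successor) pairs; event None is a hidden step."""
        raise NotImplementedError


@dataclass(frozen=True)
class Stop(Process):  # a process that does nothing
    def steps(self):
        return []


@dataclass(frozen=True)
class Prefix(Process):  # a -> P
    event: str
    then: Process

    def steps(self):
        return [(self.event, self.then)]


@dataclass(frozen=True)
class Choice(Process):  # P [] Q (deterministic choice, trace view only)
    left: Process
    right: Process

    def steps(self):
        return self.left.steps() + self.right.steps()


@dataclass(frozen=True)
class Parallel(Process):  # P |[sync]| Q; an empty sync set gives P ||| Q
    left: Process
    right: Process
    sync: FrozenSet[str] = frozenset()

    def steps(self):
        out = []
        for e, l in self.left.steps():
            if e in self.sync:  # both sides must perform a synchronized event
                out += [(e, Parallel(l, r, self.sync))
                        for f, r in self.right.steps() if f == e]
            else:
                out.append((e, Parallel(l, self.right, self.sync)))
        for e, r in self.right.steps():
            if e not in self.sync:
                out.append((e, Parallel(self.left, r, self.sync)))
        return out


@dataclass(frozen=True)
class Hide(Process):  # P \ hidden
    inner: Process
    hidden: FrozenSet[str]

    def steps(self):
        return [(None if e in self.hidden else e, Hide(p, self.hidden))
                for e, p in self.inner.steps()]


def traces(p: Process, depth: int) -> Set[Trace]:
    """Visible traces of p with at most `depth` events (finite processes only)."""
    result: Set[Trace] = {()}
    for event, nxt in p.steps():
        if event is None:
            result |= traces(nxt, depth)  # hidden step: not recorded
        elif depth > 0:
            result |= {(event,) + t for t in traces(nxt, depth - 1)}
    return result
```

For example, `traces(Parallel(Prefix("a", Stop()), Prefix("b", Stop())), 2)` contains both `("a", "b")` and `("b", "a")`, which is the interleaving described above, while putting both events in the `sync` set leaves only the empty trace because the two sides never agree on an event.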

Page 114: Error Handling in Reactive Systems

References

Page 115: Error Handling in Reactive Systems

Lots of interesting practical ideas for combining functional programming and reactive approaches to classic Domain-Driven Design by Eric Evans.

Page 116: Error Handling in Reactive Systems

Hoare’s book on CSP, originally published in ’85, after CSP had evolved significantly from a programming language into a theoretical model with a well-defined calculus. The book itself has been subsequently refined. The PDF is available for free.

Page 117: Error Handling in Reactive Systems

Modern treatment of CSP. Roscoe helped transform the original CSP language into its more rigorous, process algebra form, which was influenced by Milner’s Calculus of Communicating Systems work. This book’s PDF is available free. The treatment is more accessible than Hoare’s book.

Page 118: Error Handling in Reactive Systems

A survey of theoretical models of distributed computing, starting with a summary of lambda calculus, then discussing the pi, join, and ambient calculi. Also discusses the actor model. The treatment is somewhat dry and could use more discussion of real-world implementations of these ideas, such as the Actor model in Erlang and Akka.

Page 119: Error Handling in Reactive Systems

Gul Agha was a grad student at MIT during the ’80s and worked on the actor model with Hewitt and others. This book is based on his dissertation. It doesn’t discuss error handling, actor supervision, etc. as these concepts .

His thesis, http://dspace.mit.edu/handle/1721.1/6952, is the basis for his book, http://mitpress.mit.edu/books/actors

See also the paper for a survey course, with Rajesh Karmani, http://www.cs.ucla.edu/~palsberg/course/cs239/papers/karmani-agha.pdf

Page 120: Error Handling in Reactive Systems

Survey of the classic graph traversal algorithms, algorithms for detecting failures in a cluster, leader election, etc.

Page 121: Error Handling in Reactive Systems

A less comprehensive and formal, but more intuitive approach to fundamental algorithms.

Page 122: Error Handling in Reactive Systems

Comprehensive and somewhat formal like Raynal’s book, but more focused on modeling common failures in real systems.

Page 123: Error Handling in Reactive Systems

1992: Yes, “Reactive” isn’t new ;) This book lays out a theoretical model for specifying and proving “reactive” concurrent systems based on temporal logic. While its goal is to prevent logic errors, it doesn’t discuss handling failures from environmental or other external causes in great depth.

Page 124: Error Handling in Reactive Systems

1988: Another treatment of concurrency using algebra. It’s not based on CSP, but it has similar constructs.

Page 125: Error Handling in Reactive Systems

A recent text that applies combinatorics (counting things) and topology (properties of geometric shapes) to the analysis of distributed systems. Aims to be pragmatic for real-world scenarios, like networks and other physical systems where failures are practical concerns.

Page 126: Error Handling in Reactive Systems

http://mitpress.mit.edu/books/engineering-safer-world
Farther afield, this book discusses safety concerns from a systems engineering perspective.