RISKING EVERYTHING WITH AKKA STREAMS
HOPES, REGRETS AND “BEST PRACTICES”
JOACHIM HOFER (@JOHOFER)
08-12-2016
TABLE OF CONTENTS
0. What We Do
1. First Impressions
2. Lessons Learnt
3. Awesomeness
4. Ops
5. Verdict
WHAT WE DO @ ZALANDO
WHAT WE DO: SOME NUMBERS
net sales 2015: 3 billion euros
several thousand updates per second
latency: ideally seconds
“SOME” RISK INVOLVED!
FIRST IMPRESSIONS: PREPARATION
Akka Streams vs. Futures, RxScala, Actors
Tech Blog “Which shoe fits you?”
https://github.com/zalando/scala-concurrency-playground
Akka Streams fits us best!
Image: Nevit Dilmen (CC BY-SA 3.0)
FIRST IMPRESSIONS: GRAPH DSL
HUH?
import GraphDSL.Implicits._
val bcSqsIn = b add Broadcast[StreamCreationEvent](2)
val rules = b add ruleStore.stage.async
val eval = b add evaluator.stage.async
val publish = b add nakadi.stage.async
val ack = b add ackStage.stage.async
bcSqsIn ~> rules ~> eval.in0
bcSqsIn ~> eval.in1; eval.out ~> publish ~> ack
FlowShape(bcSqsIn.in, ack.out)
FIRST IMPRESSIONS: PLAIN SOURCES
AH!
products
  .mapAsync(parallelism = 5)(ruleEvaluatorEvent(flowId))
  .groupedWithin(3, 100 millis)
  .filter(_.nonEmpty)
  .mapAsync(parallelism = 50)(sqsGateway.send)
  .runForeach(result => log.info(result.getFailed.size))
FIRST IMPRESSIONS: OTHER BIG SCARIES
MATERIALISATION
BACKPRESSURE
Images: Jerry Daykin (CC BY 2.0, left), The Present Group (CC BY 3.0 US, right)
TASTY DOCUMENTATION
Image: Wikivisual (CC BY-NC-SA 3.0)
MISTAKES WERE MADE
Image: Hapesoft (public domain)
FAILURES (VS ERRORS)
Caused by onError
Completes the stream
Recover using recover / recoverWithRetries
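The two recovery operators named on the failures slide can be sketched as follows. This is only an illustration, not code from the talk: it assumes Akka 2.6 or later (where an implicit ActorSystem is enough to run a stream), and the object name, the failing source and the fallback values are all made up for the example.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object RecoverSketch {
  // Hypothetical source that fails with an exception on its third element.
  private def failing: Source[String, NotUsed] =
    Source(1 to 5).map { n =>
      if (n == 3) throw new IllegalStateException("boom") else n.toString
    }

  // recover: emit one final fallback element, then complete normally.
  def withRecover()(implicit system: ActorSystem): Seq[String] =
    Await.result(
      failing.recover { case _: IllegalStateException => "fallback" }.runWith(Sink.seq),
      3.seconds)

  // recoverWithRetries: switch to a backup source, at most once.
  def withBackup()(implicit system: ActorSystem): Seq[String] =
    Await.result(
      failing
        .recoverWithRetries(attempts = 1, {
          case _: IllegalStateException => Source(List("a", "b"))
        })
        .runWith(Sink.seq),
      3.seconds)

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("recover-sketch")
    try {
      println(withRecover())
      println(withBackup())
    } finally system.terminate()
  }
}
```

In both variants the elements emitted before the failure are kept; only what follows the failure is replaced.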
ERRORS (VS FAILURES)
Caused by exceptions
Get escalated to failures by default
Recover using a Supervision Strategy
Can be ignored easily (“Resume” strategy)
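A minimal sketch of the “Resume” strategy from the errors slide, assuming Akka 2.6 or later. The parsing flow and its inputs are hypothetical; only `Supervision.resumingDecider` and `ActorAttributes.supervisionStrategy` are Akka's own names.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ActorAttributes, Supervision}
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Try

object ResumeSketch {
  // Hypothetical step that throws NumberFormatException on bad input.
  val parse: Flow[String, Int, NotUsed] = Flow[String].map(_.toInt)

  // Same step, but the decider drops the offending element and carries on.
  val parseResuming: Flow[String, Int, NotUsed] =
    parse.withAttributes(ActorAttributes.supervisionStrategy(Supervision.resumingDecider))

  def run(flow: Flow[String, Int, NotUsed])(implicit system: ActorSystem): Seq[Int] =
    Await.result(Source(List("1", "x", "3")).via(flow).runWith(Sink.seq), 3.seconds)

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("resume-sketch")
    try {
      println(Try(run(parse)))    // default strategy: the exception fails the stream
      println(run(parseResuming)) // "Resume": the bad element is silently skipped
    } finally system.terminate()
  }
}
```

Note that with “Resume” nothing downstream learns that an element went missing, which is convenient but easy to overlook.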
LESSONS LEARNT: RESUMING
[Diagram sequence: RETRIEVE EVENTS emits EVENTS, which go both directly to EVALUATE and via RETRIEVE RULES (producing RULES) to EVALUATE. Numbered elements 1 and 2 move through the graph step by step.]
LESSONS LEARNT: GOING OUT-OF-SYNC
[Diagram: the same graph; the numbered elements waiting at the two inputs of EVALUATE no longer match.]
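One way such a graph can go out of sync is sketched below. This is an interpretation of the diagrams, not the talk's code: it assumes Akka 2.6 or later, uses `Zip` as a stand-in for the evaluation step, and invents a rule lookup that fails for event 1 and is supervised with “Resume”. The explicit buffer on the direct branch is also an assumption, added so that branch can run ahead of the rules branch.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ActorAttributes, FlowShape, OverflowStrategy, Supervision}
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Sink, Source, Zip}
import scala.concurrent.Await
import scala.concurrent.duration._

object OutOfSyncSketch {
  // Hypothetical rule lookup: fails for event 1; "Resume" drops that element.
  val retrieveRules: Flow[Int, String, NotUsed] =
    Flow[Int]
      .map(n => if (n == 1) throw new RuntimeException("no rules") else s"rules-$n")
      .withAttributes(ActorAttributes.supervisionStrategy(Supervision.resumingDecider))

  // Each event is sent both directly and via the rule lookup into a Zip.
  val evaluate: Flow[Int, (Int, String), NotUsed] =
    Flow.fromGraph(GraphDSL.create() { implicit b =>
      import GraphDSL.Implicits._
      val events = b.add(Broadcast[Int](2))
      val zip = b.add(Zip[Int, String]())
      events.out(0) ~> Flow[Int].buffer(16, OverflowStrategy.backpressure) ~> zip.in0
      events.out(1) ~> retrieveRules ~> zip.in1
      FlowShape(events.in, zip.out)
    })

  def run()(implicit system: ActorSystem): Seq[(Int, String)] =
    Await.result(Source(1 to 3).via(evaluate).runWith(Sink.seq), 3.seconds)

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("out-of-sync-sketch")
    try println(run()) finally system.terminate()
  }
}
```

Because the rules for event 1 are dropped, event 1 ends up paired with the rules for event 2, and every later pair is shifted by one. Keeping the event together with its rules in a single element, rather than zipping two branches, is one way to avoid this.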
LESSONS LEARNT: AKKA-HTTP CLIENTS (1)
Unconsumed response entities
Solution: discardEntityBytes
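A small sketch of the `discardEntityBytes` fix, assuming akka-http 10.x with an implicit Materializer in scope. The helper name `statusOnly` is hypothetical; the point is that a response body nobody reads still has to be drained, otherwise the connection it arrived on stays backpressured.

```scala
import akka.http.scaladsl.model.{HttpResponse, StatusCode}
import akka.stream.Materializer

object DiscardSketch {
  // We only care about the status, so drain the unread body explicitly.
  def statusOnly(response: HttpResponse)(implicit mat: Materializer): StatusCode = {
    response.discardEntityBytes() // consumes and throws away the entity stream
    response.status
  }
}
```

A caller would typically apply this to the result of a client request, for example by mapping over the future response.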
LESSONS LEARNT: AKKA-HTTP CLIENTS (2)
No response from the server
“Currently Akka HTTP doesn’t implement client-side request timeout checking itself as this functionality can be regarded as a more general purpose streaming infrastructure feature.” (Akka HTTP docs)
Solution: Use an explicit xxxTimeout stage
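The explicit timeout stage can be illustrated without a real server. In this sketch, which assumes Akka 2.6 or later, `Source.maybe` stands in for a response that never arrives and `completionTimeout` is one of the built-in timeout stages (`initialTimeout` and `idleTimeout` are others). The helper name and durations are made up for the example.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Try

object TimeoutSketch {
  // Fails the stream with a TimeoutException if it has not completed in time.
  def awaitResponse[M](response: Source[String, M], timeout: FiniteDuration)(
      implicit system: ActorSystem): Try[String] =
    Try(
      Await.result(
        response.completionTimeout(timeout).runWith(Sink.head),
        timeout + 2.seconds)) // wait longer than the stage so the stage fires first

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("timeout-sketch")
    try {
      println(awaitResponse(Source.single("ok"), 200.millis))     // answers in time
      println(awaitResponse(Source.maybe[String], 200.millis))    // never answers
    } finally system.terminate()
  }
}
```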
LESSONS LEARNT: INTERNAL BUFFERS
Default: only up to 16?!
16 (buffer size) x 50 (parallel flows) x 10 (events per batch) = 8000 events
Solution: keep internal buffers small, use explicit buffer stages
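The arithmetic and the suggested fix can be sketched like this, assuming Akka 2.6 or later. The numbers come from the slide; the flow itself is a hypothetical placeholder. `Attributes.inputBuffer` shrinks the internal buffer of an asynchronous stage, and an explicit `buffer` stage puts the buffering where it is actually wanted.

```scala
import akka.NotUsed
import akka.stream.{Attributes, OverflowStrategy}
import akka.stream.scaladsl.Flow

object BufferSketch {
  // Worst case from the slide: events parked in internal buffers alone.
  val bufferSize = 16
  val parallelFlows = 50
  val eventsPerBatch = 10
  val eventsInFlight: Int = bufferSize * parallelFlows * eventsPerBatch // 8000

  // Shrink the internal input buffer of an async stage to a single element ...
  val tightStage: Flow[Int, Int, NotUsed] =
    Flow[Int].map(_ * 2).async.addAttributes(Attributes.inputBuffer(initial = 1, max = 1))

  // ... and buffer deliberately, with a visible size and overflow strategy.
  val buffered: Flow[Int, Int, NotUsed] =
    Flow[Int].buffer(100, OverflowStrategy.backpressure).via(tightStage)
}
```

The default can also be lowered globally through the `akka.stream.materializer.max-input-buffer-size` setting.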
LESSONS LEARNT: CONFIGURATION AS A SOURCE?
Image: Alpha (CC BY-SA 2.0)
[Diagram: RETRIEVE EVENTS -> EVENTS -> RETRIEVE RULES -> RULES -> EVALUATE, with an additional CONFIGURATION element]
REACTIVE MANIFESTO
AWESOMENESS: REACTIVE
Backpressure
Automatically asynchronous
AWESOMENESS: TESTABILITY
[Diagram: TEST EVENTS -> EVALUATE -> OUTPUTS TO CHECK]
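The test setup in the diagram maps directly onto code: feed a fixed list of test events into the flow and collect the outputs. A minimal sketch, assuming Akka 2.6 or later; the `evaluate` flow here is a hypothetical stand-in for the real evaluation stage.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object TestabilitySketch {
  // Hypothetical stand-in for the evaluation flow under test.
  val evaluate: Flow[Int, String, NotUsed] =
    Flow[Int].filter(_ % 2 == 0).map(n => s"event-$n")

  // Test events in, outputs to check out.
  def outputsFor(testEvents: List[Int])(implicit system: ActorSystem): Seq[String] =
    Await.result(Source(testEvents).via(evaluate).runWith(Sink.seq), 3.seconds)
}
```

Because a flow is just a value, the same `evaluate` can be wired to production sources and sinks or to test ones without changes.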
AWESOMENESS: BUILT-IN STAGES
groupedWithin
throttle
mapConcat
recoverWithRetries
…
Flow[RuleEvaluationEvent]
  .groupedWithin(batchSize, batchTimeout)
  .mapAsync(parallelism = 1)(provider.deleteFromStreamCreation)
  .mapConcat(identity)
  .via(logAndMonitor(Checkpoint.Acknowledged, "acknowledged msg"))
OPS: MONITORING: THE BAD
no out-of-the-box solution
built-in stage monitor not very helpful
no access to internal buffers
OPS: MONITORING: THE GOOD
Trace events along streams
Create your own monitoring stage
OPS: MONITORING
EXAMPLE PASS-THROUGH STAGE LOGIC
…new GraphStageLogic(shape) {
  setHandlers(in, out, new InHandler with OutHandler {
    override def onPush(): Unit = {
      push(out, grab(in))
    }
    override def onPull(): Unit = {
      pull(in)
    }
  })
}…
OPS: MONITORING
EXAMPLE PASS-THROUGH STAGE LOGIC
…new GraphStageLogic(shape) {
  setHandlers(in, out, new InHandler with OutHandler {
    override def onPush(): Unit = {
      stats.countPush()
      push(out, grab(in))
    }
    override def onPull(): Unit = {
      stats.countPull()
      pull(in)
    }
  })
}…
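For context, the stage logic from the slides needs a surrounding `GraphStage` to be usable. The sketch below is one possible completion, assuming Akka 2.6 or later; the class name `CountingStage` and the use of two `AtomicLong` counters in place of the slide's `stats` object are assumptions.

```scala
import akka.stream.{Attributes, FlowShape, Inlet, Outlet}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import java.util.concurrent.atomic.AtomicLong

// Pass-through stage that counts pushes and pulls without changing the elements.
final class CountingStage[A](pushes: AtomicLong, pulls: AtomicLong)
    extends GraphStage[FlowShape[A, A]] {
  val in: Inlet[A] = Inlet("CountingStage.in")
  val out: Outlet[A] = Outlet("CountingStage.out")
  override val shape: FlowShape[A, A] = FlowShape(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) {
      setHandlers(in, out, new InHandler with OutHandler {
        override def onPush(): Unit = {
          pushes.incrementAndGet() // one element arrived from upstream
          push(out, grab(in))
        }
        override def onPull(): Unit = {
          pulls.incrementAndGet() // downstream signalled demand
          pull(in)
        }
      })
    }
}
```

It can be dropped into any stream with `.via(new CountingStage[Event](pushes, pulls))`, where `Event` is whatever element type flows at that point. Comparing the two counters gives a rough picture of where demand outpaces supply or the other way round.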
OPS: TUNING AND SCALING
easy to tune
very efficient
easy to scale
see also: Gearpump Materializer
Image: Duncan Rawlinson (CC BY 2.0)
OPS: RELIABILITY
everything under control
just keeps on running
VERDICT: IT’S MAGIC!
Takes time to understand
Very potent
Image: Twice25 (CC BY-SA 2.5)
RISK ➟ REWARD
JOACHIM HOFER
Availability Engineering
Backend Engineer
@johofer
08-12-2016