distributed stream processing - spark summit east 2017
TRANSCRIPT
Petr Zapletal @petr_zapletal@spark_summit
@cakesolutions
Distributed Real-Time Stream Processing: Why and How
Agenda
● Motivation
● Stream Processing
● Available Frameworks
● Systems Comparison
● Recommendations
The Data Deluge
● New Sources and New Use Cases
● 8 Zettabytes (1 ZB = 1 trillion GB) created in 2015
● Stream Processing to the Rescue
Points of Interest
➜ Runtime and Programming Model
➜ Primitives
➜ State Management
➜ Message Delivery Guarantees
Points of Interest
➜ Runtime and Programming Model
➜ Primitives
➜ State Management
➜ Message Delivery Guarantees
➜ Fault Tolerance & Low Overhead Recovery
Points of Interest
➜ Runtime and Programming Model
➜ Primitives
➜ State Management
➜ Message Delivery Guarantees
➜ Fault Tolerance & Low Overhead Recovery
➜ Latency, Throughput & Scalability
Points of Interest
➜ Runtime and Programming Model
➜ Primitives
➜ State Management
➜ Message Delivery Guarantees
➜ Fault Tolerance & Low Overhead Recovery
➜ Latency, Throughput & Scalability
➜ Maturity and Adoption Level
Points of Interest
➜ Runtime and Programming Model
➜ Primitives
➜ State Management
➜ Message Delivery Guarantees
➜ Fault Tolerance & Low Overhead Recovery
➜ Latency, Throughput & Scalability
➜ Maturity and Adoption Level
➜ Ease of Development and Operability
● Most important trait of stream processing system
● Defines expressiveness, possible operations and its limitations
● Therefore defines systems capabilities and its use cases
Runtime and Programming Model
Native Streaming
records
SinkOperator
Source Operator
Processing Operator
records processed one at a time
Processing Operator
records processed in short batches
Processing Operator
Receiver
records
Processing Operator
Micro-batches
Sink Operator
Micro-batching
Native Streaming
● Records are processed as they arrive
Pros
⟹ Expressiveness
⟹ Low-latency
⟹ Stateful operations
Cons
⟹ Fault-tolerance is expensive
⟹ Load-balancing
Micro-batching
Cons
⟹ Lower latency, depends on
batch interval
⟹ Limited expressivity
⟹ Harder stateful operations
● Splits incoming stream into small batches
Pros
⟹ High-throughput
⟹ Easier fault tolerance
⟹ Simpler load-balancing
Programming Model
Compositional
⟹ Provides basic building blocks as
operators or sources
⟹ Custom component definition
⟹ Manual Topology definition &
optimization
⟹ Advanced functionality often
missing
Declarative
⟹ High-level API
⟹ Operators as higher order functions
⟹ Abstract data types
⟹ Advance operations like state
management or windowing supported
out of the box
⟹ Advanced optimizers
● Unified batch and stream processing over a batch runtime
Spark Streaming
input data stream Spark
StreamingSpark Engine
batches of input data
batches of processed data
Apex
● Processes massive amounts of real-time events natively in Hadoop
Operator
Operator
OperatorOperator Operator Operator
Output Stream
Stream
StreamStre
am
Stream
Tuple
Flink
Stream Data
Batch Data
Kafka, RabbitMQ, ...
HDFS, JDBC, ...
● Native streaming & High level API
Counting Words
Spark Summit 2017 Apache Apache Spark Storm Apache Trident Flink Streaming Samza Scala 2017 Streaming
(Apache, 3)(Streaming, 2)(2017, 2)(Spark, 2)(Storm, 1)(Trident, 1)(Flink, 1)(Samza, 1)(Scala, 1)(Summit, 1)
Storm
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split", new Split(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
...
Map<String, Integer> counts = new HashMap<String, Integer>();
public void execute(Tuple tuple, BasicOutputCollector collector) {
String word = tuple.getString(0);
Integer count = counts.containsKey(word) ? counts.get(word) + 1 : 1;
counts.put(word, count);
collector.emit(new Values(word, count));
}
Storm
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split", new Split(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
...
Map<String, Integer> counts = new HashMap<String, Integer>();
public void execute(Tuple tuple, BasicOutputCollector collector) {
String word = tuple.getString(0);
Integer count = counts.containsKey(word) ? counts.get(word) + 1 : 1;
counts.put(word, count);
collector.emit(new Values(word, count));
}
Storm
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split", new Split(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
...
Map<String, Integer> counts = new HashMap<String, Integer>();
public void execute(Tuple tuple, BasicOutputCollector collector) {
String word = tuple.getString(0);
Integer count = counts.containsKey(word) ? counts.get(word) + 1 : 1;
counts.put(word, count);
collector.emit(new Values(word, count));
}
Trident
public static StormTopology buildTopology(LocalDRPC drpc) {
FixedBatchSpout spout = ...
TridentTopology topology = new TridentTopology();
TridentState wordCounts = topology.newStream("spout1", spout)
.each(new Fields("sentence"),new Split(), new Fields("word"))
.groupBy(new Fields("word"))
.persistentAggregate(new MemoryMapState.Factory(),
new Count(), new Fields("count"));
...
}
Trident
public static StormTopology buildTopology(LocalDRPC drpc) {
FixedBatchSpout spout = ...
TridentTopology topology = new TridentTopology();
TridentState wordCounts = topology.newStream("spout1", spout)
.each(new Fields("sentence"),new Split(), new Fields("word"))
.groupBy(new Fields("word"))
.persistentAggregate(new MemoryMapState.Factory(),
new Count(), new Fields("count"));
...
}
Trident
public static StormTopology buildTopology(LocalDRPC drpc) {
FixedBatchSpout spout = ...
TridentTopology topology = new TridentTopology();
TridentState wordCounts = topology.newStream("spout1", spout)
.each(new Fields("sentence"),new Split(), new Fields("word"))
.groupBy(new Fields("word"))
.persistentAggregate(new MemoryMapState.Factory(),
new Count(), new Fields("count"));
...
}
Spark Streaming
val conf = new SparkConf().setAppName("wordcount")val ssc = new StreamingContext(conf, Seconds(1))
val text = ...val counts = text.flatMap(line => line.split(" ")) .map(word => (word, 1)) .reduceByKey(_ + _)
counts.print()
ssc.start()ssc.awaitTermination()
Spark Streaming
val conf = new SparkConf().setAppName("wordcount")val ssc = new StreamingContext(conf, Seconds(1))
val text = ...val counts = text.flatMap(line => line.split(" ")) .map(word => (word, 1)) .reduceByKey(_ + _)
counts.print()
ssc.start()ssc.awaitTermination()
val conf = new SparkConf().setAppName("wordcount")val ssc = new StreamingContext(conf, Seconds(1))
val text = ...val counts = text.flatMap(line => line.split(" ")) .map(word => (word, 1)) .reduceByKey(_ + _)
counts.print()
ssc.start()ssc.awaitTermination()
Spark Streaming
val conf = new SparkConf().setAppName("wordcount")val ssc = new StreamingContext(conf, Seconds(1))
val text = ...val counts = text.flatMap(line => line.split(" ")) .map(word => (word, 1)) .reduceByKey(_ + _)
counts.print()
ssc.start()ssc.awaitTermination()
Spark Streaming
Samza
class WordCountTask extends StreamTask {
override def process(envelope: IncomingMessageEnvelope, collector: MessageCollector, coordinator: TaskCoordinator) {
val text = envelope.getMessage.asInstanceOf[String]
val counts = text.split(" ") .foldLeft(Map.empty[String, Int]) { (count, word) => count + (word -> (count.getOrElse(word, 0) + 1)) }
collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", "wordcount"), counts))
}
Samza
class WordCountTask extends StreamTask {
override def process(envelope: IncomingMessageEnvelope, collector: MessageCollector, coordinator: TaskCoordinator) {
val text = envelope.getMessage.asInstanceOf[String]
val counts = text.split(" ") .foldLeft(Map.empty[String, Int]) { (count, word) => count + (word -> (count.getOrElse(word, 0) + 1)) }
collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", "wordcount"), counts))
}
class WordCountTask extends StreamTask {
override def process(envelope: IncomingMessageEnvelope, collector: MessageCollector, coordinator: TaskCoordinator) {
val text = envelope.getMessage.asInstanceOf[String]
val counts = text.split(" ") .foldLeft(Map.empty[String, Int]) { (count, word) => count + (word -> (count.getOrElse(word, 0) + 1)) }
collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", "wordcount"), counts))
}
Samza
Kafka Streams
final Serde<String> stringSerde = Serdes.String();final Serde<Long> longSerde = Serdes.Long();
KStream<String, String> textLines = builder.stream(stringSerde, stringSerde, "streams-file-input");
KStream<String, Long> wordCounts = textLines .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+"))) .groupBy((key, value) -> value) .count("Counts") .toStream();
wordCounts.to(stringSerde, longSerde, "streams-wordcount-output");
Kafka Streams
final Serde<String> stringSerde = Serdes.String();final Serde<Long> longSerde = Serdes.Long();
KStream<String, String> textLines = builder.stream(stringSerde, stringSerde, "streams-file-input");
KStream<String, Long> wordCounts = textLines .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+"))) .groupBy((key, value) -> value) .count("Counts") .toStream();
wordCounts.to(stringSerde, longSerde, "streams-wordcount-output");
Apex
val input = dag.addOperator("input", new LineReader) val parser = dag.addOperator("parser", new Parser) val out = dag.addOperator("console", new ConsoleOutputOperator) dag.addStream[String]("lines", input.out, parser.in)dag.addStream[String]("words", parser.out, counter.data)
class Parser extends BaseOperator { @transient val out = new DefaultOutputPort[String]() @transient val in = new DefaultInputPort[String]()
override def process(t: String): Unit = { for(w <- t.split(" ")) out.emit(w)
} }
Apex
val input = dag.addOperator("input", new LineReader) val parser = dag.addOperator("parser", new Parser) val out = dag.addOperator("console", new ConsoleOutputOperator) dag.addStream[String]("lines", input.out, parser.in)dag.addStream[String]("words", parser.out, counter.data)
class Parser extends BaseOperator { @transient val out = new DefaultOutputPort[String]() @transient val in = new DefaultInputPort[String]()
override def process(t: String): Unit = { for(w <- t.split(" ")) out.emit(w)
} }
val input = dag.addOperator("input", new LineReader) val parser = dag.addOperator("parser", new Parser) val out = dag.addOperator("console", new ConsoleOutputOperator) dag.addStream[String]("lines", input.out, parser.in)dag.addStream[String]("words", parser.out, counter.data)
class Parser extends BaseOperator { @transient val out = new DefaultOutputPort[String]() @transient val in = new DefaultInputPort[String]()
override def process(t: String): Unit = { for(w <- t.split(" ")) out.emit(w)
} }
Apex
Flink
val env = ExecutionEnvironment.getExecutionEnvironment
val text = env.fromElements(...) val counts = text.flatMap ( _.split(" ") ) .map ( (_, 1) ) .groupBy(0) .sum(1)
counts.print()
env.execute("wordcount")
Flink
val env = ExecutionEnvironment.getExecutionEnvironment
val text = env.fromElements(...) val counts = text.flatMap ( _.split(" ") ) .map ( (_, 1) ) .groupBy(0) .sum(1)
counts.print()
env.execute("wordcount")
Summary
Native Micro-batching Native Native
Compositional Declarative Compositional Declarative
At-least-once Exactly-once At-least-once Exactly-once
Record ACKs RDD based Checkpointing Log-based Checkpointing
Not build-inDedicated Operators
Stateful Operators
Stateful Operators
Very Low Medium Low Low
Low Medium High High
High High Medium Medium
Micro-batching
Exactly-once*
Dedicated DStream
Medium
High
Streaming Model
API
Guarantees
Fault Tolerance
State Management
Latency
Throughput
Maturity
TRIDENT
Hybrid
Compositional*
Exactly-once
Checkpointing
Stateful Operators
Very Low
High
Medium
Native
Declarative*
At-least-once
Low
Log-based
Stateful Operators
Low
High
General Guidelines
➜ Evaluate particular application needs
➜ Programming model
➜ Available delivery guarantees
General Guidelines
➜ Evaluate particular application needs
➜ Programming model
➜ Available delivery guarantees
➜ Almost all non-trivial jobs have state
General Guidelines
➜ Evaluate particular application needs
➜ Programming model
➜ Available delivery guarantees
➜ Almost all non-trivial jobs have state
➜ Fast recovery is critical
Recommendations [Storm & Trident]
● Fits for small and fast tasks
● Very low (tens of milliseconds) latency
● State & Fault tolerance degrades performance significantly
● Potential update to Heron
○ Keeps the API, according to Twitter better in every single way
○ Open-sourced recently
Recommendations [Spark Streaming]
● Spark Ecosystem
● Data Exploration
● Latency is not critical
● Micro-batching limitations
Recommendations [Samza]
● Kafka is a cornerstone of your architecture
● Application requires large states
● Don’t need exactly once
Recommendations [Kafka Streams]
● Similar functionality like Samza with nicer APIs
● Operated by Kafka itself
● Great learning curve
● At-least once delivery
● May not support more advanced functionality
Recommendations [Apex]
● Prefer compositional approach
● Hadoop
● Great performance
● Dynamic DAG changes
Recommendations [Flink]
● Conceptually great, fits very most use cases
● Take advantage of batch processing capabilities
● Need a functionality which is hard to implement in micro-batch
● Enough courage to use emerging project
Dataflow and Apache Beam
Dataflow Model & SDKs
Apache Flink
Apache Spark
Direct Pipeline
Google CloudDataflow
Stream Processing
Batch Processing
Multiple Modes One Pipeline Many Runtimes
Local or
cloud
Local
Cloud
MANCHESTER LONDON NEW YORK
@petr_zapletal @cakesolutions
347 708 1518
We are hiringhttp://www.cakesolutions.net/careers
References● http://storm.apache.org/
● http://spark.apache.org/streaming/
● http://samza.apache.org/
● https://apex.apache.org/
● https://flink.apache.org/
● http://beam.incubator.apache.org/
● http://www.cakesolutions.net/teamblogs/comparison-of-apache-stream-processing-frameworks-part-1
● http://data-artisans.com/blog/
● http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple
● https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
● http://www.slideshare.net/GyulaFra/largescale-stream-processing-in-the-hadoop-ecosystem
● https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
● http://stackoverflow.com/questions/29111549/where-do-apache-samza-and-apache-storm-differ-in-their-use-cases
● https://www.dropbox.com/s/1s8pnjwgkkvik4v/GearPump%20Edge%20Topologies%20and%20Deployment.pptx?dl=0
● ...
List of Figures
● https://databricks.com/blog/2015/07/30/diving-into-apache-spark-streamings-execution-model.html
● http://www.slideshare.net/ptgoetz/storm-hadoop-summit2014
● https://databricks.com/wp-content/uploads/2015/07/image41-1024x602.png
● http://data-artisans.com/wp-content/uploads/2015/08/microbatching.png
● http://samza.apache.org/img/0.9/learn/documentation/container/stateful_job
● http://data-artisans.com/wp-content/uploads/2015/08/streambarrier.png
● https://cwiki.apache.org/confluence/display/FLINK/Stateful+Stream+Processing
● https://raw.githubusercontent.com/tweise/apex-samples/kafka-count-jdbc/exactly-once/docs/images/image00.png
● https://4.bp.blogspot.com/-RlLeDymI_mU/Vp-1cb3AxNI/AAAAAAAACSQ/5TphliHJA4w/s1600/dataflow%2BASF.pn
g
Processing Architecture Evolution
Lambda Architecture
Batch Layer Serving Layer
Stream layer
Query
All
your
dat
aOozie
Query
Performance
➜ Latency vs. Throughput
➜ Costs of Delivery Guarantees, Fault-tolerance & State Management
Performance
➜ Latency vs. Throughput
➜ Costs of Delivery Guarantees, Fault-tolerance & State Management
➜ Tuning
Performance
➜ Latency vs. Throughput
➜ Costs of Delivery Guarantees, Fault-tolerance & State Management
➜ Tuning
➜ Network operations, Data locality & Serialization
Project Maturity [Spark Streaming]
The most trending Scala repository these days and one of the engines behind Scala’s popularity