stateful stream processing at in-memory speed

Stateful Stream Processing at In-Memory Speed

Jamie Grier@[email protected]

mailto:[email protected]

Who am I?• Director of Applications Engineering at data

Artisans• Previously working on streaming

computation at Twitter, Gnip and Boulder Imaging

• Involved in various kinds of stream processing for about a decade

• High-speed video, social media streaming, general frameworks for stream processing

Overview• In stateful stream processing the bottleneck has

often been the key-value store• Accuracy has been sacrificed for speed• Lambda Architecture was developed to address

shortcomings of stream processors• Can we remove the key-value store bottleneck

and enable processing at in-memory speeds?• Can we do this accurately without Lamba

Architecture?

Problem statement• Incoming message rate: 1.5 million/sec• Group by several dimensions and

aggregate over 1 hour event-time windows• Write hourly time series data to database• Respond to queries both over historical

data and the live in-flight aggregates

Input and QueriesStreamtweet-id: 1,event: url-click,time: 01:01:01tweet-id: 2,event: url-click,time: 01:01:02tweet-id: 1,event: impression,time: 01:01:03tweet-id: 2,event: url-click,time: 02:01:01tweet-id: 1,event: impression,time: 02:02:02

Query Resulttweet-id: 1,event: url-click,time: 01:00:00 1tweet-id: 1,event: *,time: 01:00:00 2tweet-id: *,event: *,time: 01:00:00 3tweet-id: *,event: impression,time: 02:00:00 1tweet-id: 2,event: *,time: 02:00:00 1

Input and QueriesQuery Resulttweet-id: 1,event: url-click,time: 01:00:00 1tweet-id: 1,event: *,time: 01:00:00 2tweet-id: *,event: *,time: 01:00:00 3tweet-id: *,event: impression,time: 02:00:00 1tweet-id: 2,event: *,time: 02:00:00 1

Streamtweet-id: 1,event: url-click,time: 01:01:03tweet-id: 2,event: url-click,time: 01:01:02tweet-id: 1,event: impression,time: 01:01:01tweet-id: 2,event: url-click,time: 02:02:01tweet-id: 1,event: impression,time: 02:01:02

Streamtweet-id: 1,event: url-click,time: 01:01:03tweet-id: 2,event: url-click,time: 01:01:02tweet-id: 1,event: impression,time: 01:01:01tweet-id: 2,event: url-click,time: 02:02:01tweet-id: 1,event: impression,time: 02:01:02


Input and Queries

Time Series Data

01:00:00 02:00:00 03:00:00 04:00:000

25

50

75

100

125Tweet Impressions

Tweet 1 Tweet 2

Any questions so far?

Legacy SystemStream Processor

Hadoop

Lambda Architecture

Streaming

Batch

Legacy System

Lambda ArchitectureHadoop

Streaming

Batch

Stream Processor

Legacy System

Lambda ArchitectureHadoop

Streaming

Batch

• Aggregates built directly in key/value store

• Read/modify/write for every message

• Inaccurate: double-counting, lost pre-aggregated data

• Hadoop job improves results after 24 hours

Legacy System(Lambda Architecture)

Any questions so far?

Goals for PrototypeSystem

• Feature parity with existing system• Attempt to reduce hardware footprint by 100x• Exactly once semantics: compute correct results in

real-time with or without failures. Failures should not lead to missing data or double counting

• Satisfy realtime queries with low latency• One system: No Lambda Architecture!• Eliminate the key/value store bottleneck (big win)

My road toApache Flink

• Interested in Google Cloud Dataflow• Google nailed the semantics for stream processing• Unified batch and stream processing with one

model• Dataflow didn’t exist in open source at the time (or

so I thought) and I wanted to build it.• My wife wouldn’t let me quit my job!• Dataflow SDK is now open source as Apache Beam

and Flink is the most complete runner.

Why Apache Flink?• Basically identical semantics to Google Cloud Dataflow• Flink is a true fault-tolerant stateful stream processor• Exactly once guarantees for state updates• The state management features might allow us to eliminate the

key-value store• Windowing is built-in which makes time series easy• Native event time support / correct time based aggregations• Very fast data shuffling in benchmarks: 83 million msgs/sec on 30

machines• Flink “just works” with no tuning - even at scale!

http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf

http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/

Prototype SystemApache Flink

Streaming


We now have a sharded key/value storeinside the stream processor

Streaming


Why not just query that!


Streaming


QueryServic

eWhy not just query that!


Prototype System• Eliminates the key-value

store bottleneck

• Eliminates the batch layer

• No more Lambda Architecture!

• Realtime queries over in-flight aggregates

• Hourly aggregates written to database

The Results• Uses 0.5% of the resources of the legacy

system: An improvement of 200x with zero tuning!

• Exactly once analytics in realtime• Complete elimination of batch layer and

Lambda Architecture• Successfully eliminated the key-value store

bottleneck

How is 200x improvement possible?

• The key is making use of fault-tolerant state inside the stream processor

• Computation proceeds at in-memory speeds• No need to make requests over the network to

update values in external store• Dramatically less load on the database because only

the completed window aggregates are written there.• Flink is extremely efficient at network I/O and data

shuffling, and has highly optimized serialization architecture

Does this matterat smaller scale?

• YES it does!• Much larger problems on the same

hardware investment• Exactly-once semantics and state

management is important at any scale!• Engineering time invested can be

expensive at any scale if things don’t “just work”.

Summary• Used stateful operator features in Flink to

remove the key/value store bottleneck• Dramatic reduction in hardware costs (200x)• Maintained feature parity by providing low-

latency queries for in flight aggregates as well as long-term storage of hourly time series data

• Actually improved accuracy of aggregations: Exactly-once vs. at least once semantics

Questions?

Thanks!

stateful stream processing at in-memory speed

Data & Analytics