papers we love realtime at facebook
TRANSCRIPT
![Page 1: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/1.jpg)
1
Papers We Love:
Realtime Data Processing at Facebook
Gwen Shapira, Confluent Inc.
![Page 2: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/2.jpg)
2
![Page 3: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/3.jpg)
3
Published in 2016 (!)
![Page 4: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/4.jpg)
4
What kind of paper is this?
![Page 5: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/5.jpg)
5
This is NOT the one true architecture.
Please don’t cargo-cult this paper.
![Page 6: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/6.jpg)
6
A few real-time systems at Facebook
• Chorus – aggregate trends
• Realtime feedback for mobile app developers
• Page analytics – likes, engagement…
• Offload CPU-intensive dashboard queries
![Page 7: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/7.jpg)
7
![Page 8: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/8.jpg)
8
![Page 9: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/9.jpg)
9
![Page 10: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/10.jpg)
10
Looking for trending topics in 5 minute windows
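A minimal sketch of counting topics in 5-minute tumbling windows. The event tuples, the function name, and the `top_k` cutoff are illustrative assumptions, not a description of how any Facebook system is implemented.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 5 * 60  # 5-minute tumbling windows


def trending_by_window(events, top_k=2):
    """Group (timestamp_seconds, topic) events by window and rank topics."""
    counts = defaultdict(Counter)
    for ts, topic in events:
        # Every timestamp maps to exactly one window start.
        window_start = ts - (ts % WINDOW_SECONDS)
        counts[window_start][topic] += 1
    return {start: c.most_common(top_k) for start, c in sorted(counts.items())}


events = [
    (0, "kafka"), (10, "kafka"), (20, "scribe"),
    (310, "puma"), (320, "puma"), (599, "kafka"),
]
result = trending_by_window(events)
```

Each window is independent here, so a late event simply lands in the window its timestamp belongs to.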
![Page 11: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/11.jpg)
11
The Tofu & Potatoes of the paper:
Design Decisions
![Page 12: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/12.jpg)
12
/ KafkaStreams + exactly once
![Page 13: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/13.jpg)
13
Decision #1 – Language Paradigm
• Declarative (SQL) – easy & limited
• Functional
• Procedural (C++, Java, Python) – most flexibility, control, performance. Longer dev cycle.
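To make the trade-off concrete, here is a small sketch that computes the same per-page count twice: once declaratively with SQL (using Python's built-in `sqlite3` as a stand-in engine) and once procedurally. The table, column names, and sample events are made up for illustration.

```python
import sqlite3
from collections import Counter

events = [("page_a", "like"), ("page_b", "like"), ("page_a", "share")]

# Declarative: state WHAT is wanted; the engine decides how to compute it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (page TEXT, action TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", events)
sql_counts = dict(conn.execute("SELECT page, COUNT(*) FROM events GROUP BY page"))

# Procedural: spell out HOW, which leaves room for custom logic and tuning.
loop_counts = Counter()
for page, _action in events:
    loop_counts[page] += 1
```

The SQL version is one line of logic but can only express what the query language allows; the loop is longer and can do anything.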
![Page 14: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/14.jpg)
14
![Page 15: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/15.jpg)
15
Decision #2: Data Transfer
• RPC (MillWheel, Flink, Spark Streaming)
  • All about speed
• Message-forwarding broker (Heron)
  • Applies back-pressure, multiplexes
• Persistent stream storage (Samza, Kafka’s Streams API)
  • Most reliable
  • Decouples processors
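A toy sketch of why persistent stream storage decouples processors: the log keeps every record, and each reader tracks its own position, so a slow reader never blocks a fast one. `StreamStore` and the reader names are hypothetical and only stand in for a real store such as Scribe or Kafka.

```python
class StreamStore:
    """Toy append-only log; each reader tracks its own offset."""

    def __init__(self):
        self._log = []
        self._offsets = {}

    def append(self, record):
        self._log.append(record)

    def read(self, reader, max_records=10):
        start = self._offsets.get(reader, 0)
        batch = self._log[start:start + max_records]
        self._offsets[reader] = start + len(batch)
        return batch


store = StreamStore()
for i in range(5):
    store.append(f"event-{i}")

fast = store.read("fast_reader")                 # takes everything available
slow = store.read("slow_reader", max_records=2)  # falls behind, blocks nobody
store.append("event-5")
fast_more = store.read("fast_reader")
slow_more = store.read("slow_reader")            # catches up from its own offset
```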
![Page 16: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/16.jpg)
16
Decision #2: Data Transfer
![Page 17: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/17.jpg)
17
Love Song to Scribe
Independent stream processing nodes
And storing inputs / outputs
Made everything great
![Page 18: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/18.jpg)
18
Decision #3 – Processing Semantics
![Page 19: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/19.jpg)
19
Decision #3 – Processing Semantics
Facebook Verdict: It depends on requirements
• Ranker writes to an idempotent system – at least once
• Scuba can lose data but cannot handle duplicates – at most once
• … Exactly once is REALLY HARD and requires transactions
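One way to see the difference between the first two guarantees is the order of "emit output" and "save checkpoint". The sketch below simulates a crash between those two steps and then restarts from the saved offset. The function names and the crash model are assumptions made for illustration.

```python
def run(records, start_offset, checkpoint_first, crash_at=None):
    """Process records from start_offset; return (outputs, saved_offset).

    crash_at simulates a failure between the two steps for that record.
    """
    outputs, saved = [], start_offset
    for offset in range(start_offset, len(records)):
        if checkpoint_first:
            saved = offset + 1               # at most once: checkpoint, then emit
            if offset == crash_at:
                return outputs, saved
            outputs.append(records[offset])
        else:
            outputs.append(records[offset])  # at least once: emit, then checkpoint
            if offset == crash_at:
                return outputs, saved
            saved = offset + 1
    return outputs, saved


def run_with_restart(records, checkpoint_first):
    first, saved = run(records, 0, checkpoint_first, crash_at=1)
    second, _ = run(records, saved, checkpoint_first)
    return first + second


records = ["a", "b", "c"]
at_least_once = run_with_restart(records, checkpoint_first=False)
at_most_once = run_with_restart(records, checkpoint_first=True)
```

Emitting first produces a duplicate after the restart, which is harmless only if the downstream system is idempotent. Checkpointing first drops the in-flight record instead.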
![Page 20: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/20.jpg)
20
Don’t miss the side-note on side-effects
• Exactly once means writing output + offsets to a transactional system
• This takes time
• Why just wait when you can deserialize? And maybe do other stateless stuff?
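A rough sketch of the idea on this slide, using Python's built-in `sqlite3` as a stand-in transactional store: output and offset are committed together, and the next batch is deserialized while the previous commit is still in flight. The schema, the JSON payloads, and the single-worker thread pool are all assumptions chosen to keep the example small.

```python
import json
import sqlite3
from concurrent.futures import ThreadPoolExecutor

db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE output (value INTEGER)")
db.execute("CREATE TABLE checkpoint (id INTEGER PRIMARY KEY, next_offset INTEGER)")
db.execute("INSERT INTO checkpoint VALUES (1, 0)")
db.commit()


def commit_batch(values, next_offset):
    # Output rows and the new offset land in one transaction: both or neither.
    with db:
        db.executemany("INSERT INTO output VALUES (?)", [(v,) for v in values])
        db.execute("UPDATE checkpoint SET next_offset = ? WHERE id = 1", (next_offset,))


raw_batches = [['{"n": 1}', '{"n": 2}'], ['{"n": 3}']]
offset = 0
pending = None
with ThreadPoolExecutor(max_workers=1) as pool:
    for raw in raw_batches:
        # Stateless, side-effect-free work overlaps with the previous commit.
        values = [json.loads(r)["n"] for r in raw]
        if pending is not None:
            pending.result()  # the previous commit must finish before the next one
        offset += len(raw)
        pending = pool.submit(commit_batch, values, offset)
    pending.result()

total = db.execute("SELECT SUM(value) FROM output").fetchone()[0]
saved_offset = db.execute("SELECT next_offset FROM checkpoint").fetchone()[0]
```

Only work without side effects is safe to run ahead of the commit; anything that writes externally still has to wait.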
![Page 21: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/21.jpg)
21
Decision #4 – State Saving
• In-memory state with replication (Old VoltDB)
  • Requires lots of hardware and network
• Local database (Samza, Kafka Streams API)
• Remote database (MillWheel)
• Upstream (i.e. replay everything on failure)
• Global consistent snapshot (Flink)
![Page 22: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/22.jpg)
22
Decision #4 – State Saving
Facebook Verdict: It depends
Rhode Island Alaska
![Page 23: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/23.jpg)
23
Best Part of the Paper – by far
How to efficiently work with state in remote DB?
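The slide does not spell out the answer. One common technique, shown here purely as an illustrative assumption, applies when updates can be merged associatively (sums, counts, max): aggregate partial results in memory and send only the merged delta, instead of doing a read-modify-write round trip per event. `RemoteDB` is a hypothetical stand-in that just counts round trips.

```python
from collections import Counter


class RemoteDB:
    """Stand-in for a remote key-value store; counts round trips."""

    def __init__(self):
        self.data = Counter()
        self.round_trips = 0

    def merge(self, deltas):
        # One call applies many additive updates; no read is needed first.
        self.round_trips += 1
        self.data.update(deltas)


def process(events, db, batch_size):
    partial = Counter()
    for i, key in enumerate(events, start=1):
        partial[key] += 1  # aggregate locally
        if i % batch_size == 0:
            db.merge(partial)
            partial = Counter()
    if partial:
        db.merge(partial)  # flush whatever is left


events = ["a", "b", "a", "c", "a", "b"]
naive, batched = RemoteDB(), RemoteDB()
process(events, naive, batch_size=1)
process(events, batched, batch_size=3)
```

Both runs end with the same stored state, but the batched one makes a third of the remote calls. The cost is that unflushed partial state is lost on a crash unless the input can be replayed.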
![Page 24: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/24.jpg)
24
Decision #5 - Reprocessing
• Stream only – requires long retention in the stream store
• Maintain both batch and stream systems
• Develop systems that can run in streams and batch (Flink, Spark)
![Page 25: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/25.jpg)
25
Decision #5 - Reprocessing
Facebook Verdict:
SQL runs everywhere
And binary generation FTW
![Page 26: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/26.jpg)
26
Applications – Or a whirlwind tour of good patterns
One example:
![Page 27: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/27.jpg)
27
Lessons Learned!
The biggest win is pipelines composed of independent processors
• Mixing multiple systems let us move fast
• High level abstractions let us improve implementation
• Ease of debugging – Independent nodes and ability to replay
• Ease of deployment – Puma as-a-service
• Ease of monitoring – Lag is the most important metric. Everything is instrumented out of the box.
• In the future – auto-scale based on lag
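A small sketch of lag as a metric and of a lag-based scaling rule. The offset dictionaries, the `max_lag_per_worker` threshold, and both function names are hypothetical; the slide only says that lag is the key metric and that auto-scaling on it is future work.

```python
def consumer_lag(latest_offsets, committed_offsets):
    """Lag per partition: newest offset written minus last offset processed."""
    return {p: latest_offsets[p] - committed_offsets.get(p, 0) for p in latest_offsets}


def desired_workers(lag, max_lag_per_worker=1000):
    """Suggest a worker count so each worker handles at most the given lag."""
    total = sum(lag.values())
    needed = -(-total // max_lag_per_worker)  # ceiling division
    return max(1, needed)


latest = {0: 5000, 1: 4200}
committed = {0: 4900, 1: 1200}
lag = consumer_lag(latest, committed)
workers = desired_workers(lag)
```

A real policy would likely smooth lag over time before scaling, so that a brief spike does not trigger a resize.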
![Page 28: Papers we love realtime at facebook](https://reader034.vdocuments.mx/reader034/viewer/2022051710/5a6ce6887f8b9a04428b45db/html5/thumbnails/28.jpg)
28
Thank You!