Scalable Streaming Data Pipelines with Redis
Avram Lyon
Scopely / @ajlyon / github.com/avram
LA Redis Meetup / April 18, 2016
Scopely
Mobile games publisher and developer
Diverse set of games
Independent studios around the world
What kind of data?
• App opened
• Killed a walker
• Bought something
• Heartbeat
• Memory usage report
• App error
• Declined a review prompt
• Finished the tutorial
• Clicked on that button
• Lost a battle
• Found a treasure chest
• Received a push message
• Finished a turn
• Sent an invite
• Scored a Yahtzee
• Spent 100 silver coins
• Anything else any game designer or developer wants to learn about
How much?
Recently:
Peak: 2.8 million events / minute
2.4 billion events / day
[Diagram: pipeline overview. Labels: Collection, Kinesis (Primary Data Stream), Enrichment, Kinesis, Warehousing, Realtime Monitoring, Public API]
[Diagram: collection tier. Labels: Studio A, Studio B, Studio C; Collection over HTTP; Collection over SQS, with three SQS queues; Kinesis with SQS Failover; Redis caching App Configurations and System Configurations]
[Diagram: stream processing. Labels: Kinesis with SQS Failover; Enricher; a second Kinesis stream; Data Warehouse Forwarder; S3; Ariel (Realtime); Elasticsearch; Aggregation; an Idempotence store shown at three stages; one component marked "?"]
Kinesis: a short aside
Kinesis
• Distributed, sharded streams. Akin to Kafka.
• Get an iterator over the stream, and occasionally checkpoint the current stream pointer.
• Workers coordinate shard leases and checkpoints in DynamoDB (via the KCL, the Kinesis Client Library)
[Diagram: a stream split into Shard 0, Shard 1, Shard 2]
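Checkpointing
Shard 0 records: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Given: the worker checkpoints every 5 records
• Worker A reads Shard 0, then fails (🔥)
• Last checkpoint for Shard 0: 10
• Worker B takes over the shard and resumes from the checkpoint, so any records Worker A handled after 10 are processed again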
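The replay that checkpointing causes can be simulated in a few lines. This is a minimal sketch, not KCL code: `run_worker` is a hypothetical helper, and the crash position (record 13) is an assumption, since the slide does not say where Worker A failed.

```python
from typing import List, Optional, Tuple


def run_worker(records: List[int], start_after: int, checkpoint_every: int,
               crash_after: Optional[int] = None) -> Tuple[List[int], int]:
    """Process records with sequence numbers greater than start_after.

    Returns (processed, last_checkpoint). The worker checkpoints whenever the
    sequence number is a multiple of checkpoint_every, and stops without a
    final checkpoint if it crashes.
    """
    processed: List[int] = []
    checkpoint = start_after
    for seq in records:
        if seq <= start_after:
            continue
        processed.append(seq)
        if seq % checkpoint_every == 0:
            checkpoint = seq
        if crash_after is not None and seq == crash_after:
            break  # simulated failure: nothing after the last checkpoint is recorded
    return processed, checkpoint


shard_0 = list(range(1, 23))

# Worker A checkpoints every 5 records and is assumed to crash after record 13.
seen_by_a, checkpoint = run_worker(shard_0, start_after=0, checkpoint_every=5,
                                   crash_after=13)

# Worker B resumes from the stored checkpoint, not from where A actually stopped.
seen_by_b, _ = run_worker(shard_0, start_after=checkpoint, checkpoint_every=5)

# Records handled by both workers: these are the duplicates downstream stages see.
replayed = sorted(set(seen_by_a) & set(seen_by_b))
```

Under these assumptions the checkpoint is 10 and records 11 to 13 are delivered twice, which is the at-least-once behaviour that the idempotence layer has to absorb.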
Auxiliary Idempotence
• Idempotence keys at each stage
• Redis sets of idempotence keys by time window
• Gives resilience against various types of failures
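One way to read "Redis sets of idempotence keys by time window" is sketched below. The key layout (`idem:<stage>:<window start>`), the one-hour window, and the retention period are assumptions for illustration. `first_time_seen` only needs a client with `sadd` and `expire` (both exist in redis-py); `InMemorySets` is a hypothetical stand-in so the sketch runs without a server.

```python
from typing import Dict, Set

WINDOW_SECONDS = 3600                    # assumed window size
RETENTION_SECONDS = 2 * WINDOW_SECONDS   # assumed: keep each window's set this long


def window_key(stage: str, event_time: float) -> str:
    """Name of the set holding idempotence keys for the event's time window."""
    window_start = int(event_time) // WINDOW_SECONDS * WINDOW_SECONDS
    return f"idem:{stage}:{window_start}"


def first_time_seen(client, stage: str, idem_key: str, event_time: float) -> bool:
    """Return True only the first time idem_key is recorded for this stage."""
    key = window_key(stage, event_time)
    added = client.sadd(key, idem_key)   # SADD returns how many members were new
    if added:
        client.expire(key, RETENTION_SECONDS)  # the whole window expires together
    return added == 1


class InMemorySets:
    """Stand-in exposing the two calls used above; not a Redis replacement."""

    def __init__(self) -> None:
        self.sets: Dict[str, Set[str]] = {}
        self.ttl: Dict[str, int] = {}

    def sadd(self, key: str, member: str) -> int:
        members = self.sets.setdefault(key, set())
        if member in members:
            return 0
        members.add(member)
        return 1

    def expire(self, key: str, seconds: int) -> bool:
        self.ttl[key] = seconds
        return key in self.sets


client = InMemorySets()
first = first_time_seen(client, "enricher", "batch-42", 1_460_000_000)
again = first_time_seen(client, "enricher", "batch-42", 1_460_000_000)
```

Because the window is derived from the event's own timestamp, a replayed copy of the same event lands in the same set, so `again` is `False` and the stage can skip it.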
Auxiliary Idempotence
• Gotcha: expiring a set is O(N) in the number of members
• Fix: broke the large sets up into many small sets, partitioned by the first 2 bytes of the MD5 of the idempotence key
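The partitioning can be sketched as follows. The key layout is an assumption, and "first 2 bytes" is read here as the first two bytes of the binary digest (four hex characters, so up to 65,536 small sets per base key); the original system may have meant something slightly different.

```python
import hashlib


def partitioned_set_key(base_key: str, idem_key: str) -> str:
    """Pick the small set for idem_key from the first 2 bytes of its MD5 digest."""
    prefix = hashlib.md5(idem_key.encode("utf-8")).digest()[:2].hex()
    return f"{base_key}:{prefix}"


# The same idempotence key always maps to the same partition.
key = partitioned_set_key("idem:enricher:1459998000", "batch-42")
```

Each small set then expires on its own, so no single expiry has to free a very large number of members at once.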
Enrichment
1. Deserialize
2. Reverse deduplication
3. Apply changes to application properties
4. Get current device and application properties
5. Generate Event ID
6. Emit.
[Diagram labels: Collection, Kinesis, Enrichment]
Idempotence Key: DeviceToken + API Key + Event Batch Sequence + Event Batch Session
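A small sketch of composing that key from its four parts. The field names, types, and the length-prefixed encoding are assumptions; the slide only lists the components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchIdentity:
    """The four components named on the slide (types are assumed)."""
    device_token: str
    api_key: str
    batch_sequence: int
    batch_session: str


def idempotence_key(batch: BatchIdentity) -> str:
    parts = [batch.device_token, batch.api_key,
             str(batch.batch_sequence), batch.batch_session]
    # Length-prefix each part so that ("ab", "c") and ("a", "bc") cannot collide.
    return "|".join(f"{len(part)}:{part}" for part in parts)


key = idempotence_key(BatchIdentity("AAAA", "k1", 7, "s9"))
```

Plain concatenation would also work when every field has a fixed width; the length prefix just removes that requirement.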
Now we have a stream of well-described, denormalized event facts.
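Preparing for Warehousing (SDW Forwarder)
Enriched Event Data (from Kinesis), buffered per game and event name:
• dice: app open
• bees: level complete
• slots: payment
Buffer timers shown: 0:10, 0:01, 0:05
Buffers are emitted by time or emitted by size
Each emitted Slice contains:
• Game
• Event Name
• Superset of Properties in batch
• Data
Slices are sent to SQS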
Idempotence Key: Event ID
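The buffering described for the forwarder can be sketched as below. `SliceBuffer`, its thresholds, and the shape of the emitted slice are assumptions; the clock is passed in explicitly so the behaviour is easy to follow, and the Event ID idempotence check is left out.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
SliceKey = Tuple[str, str]  # (game, event name)


class SliceBuffer:
    """Buffers events per (game, event name) and emits slices by size or by age."""

    def __init__(self, emit: Callable[[dict], None], max_events: int = 3,
                 max_age_seconds: float = 10.0) -> None:
        self.emit = emit
        self.max_events = max_events
        self.max_age_seconds = max_age_seconds
        self.events: Dict[SliceKey, List[Event]] = {}
        self.opened_at: Dict[SliceKey, float] = {}

    def add(self, game: str, name: str, properties: Event, now: float) -> None:
        key = (game, name)
        self.events.setdefault(key, []).append(properties)
        self.opened_at.setdefault(key, now)
        if len(self.events[key]) >= self.max_events:
            self.flush(key)  # emitted by size

    def tick(self, now: float) -> None:
        for key in list(self.opened_at):
            if now - self.opened_at[key] >= self.max_age_seconds:
                self.flush(key)  # emitted by time

    def flush(self, key: SliceKey) -> None:
        batch = self.events.pop(key)
        del self.opened_at[key]
        # Superset of the properties seen anywhere in this batch.
        columns = sorted({prop for event in batch for prop in event})
        self.emit({"game": key[0], "event": key[1],
                   "properties": columns, "data": batch})


slices: List[dict] = []
buffer = SliceBuffer(slices.append)
buffer.add("dice", "app open", {"os": "ios"}, now=0.0)
buffer.add("slots", "payment", {"amount": 5}, now=1.0)
buffer.add("dice", "app open", {"os": "android", "version": "1.2"}, now=2.0)
buffer.add("dice", "app open", {"os": "ios"}, now=3.0)  # third event: emitted by size
buffer.tick(now=11.0)  # the slots buffer is now 10 seconds old: emitted by time
```

Here `slices.append` stands in for the SQS send; a real forwarder would serialize each slice before sending it.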
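But everything can die!
[Diagram: in-memory buffers (dice: app open, bees: level complete, slots: payment) fed from Kinesis. Labels on the shutdown path: ASG, SNS, SQS, Shudder, and an HTTP "Prepare to Die!" call, after which each of the three buffers is marked "emit!"]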
Pipeline to HDFS
• Partitioned by event name and game, buffered in-memory and written to S3
• Picked up every hour by a Spark job
• Converted to Parquet and loaded into HDFS
A closer look at Ariel
Live Metrics (Ariel)
Enriched Event Data (from Kinesis):
name: game_end time: 2015-07-15 10:00:00.000 UTC _devices_per_turn: 1.0 event_id: 12345 device_token: AAAA user_id: 100
name: game_end time: 2015-07-15 10:01:00.000 UTC _devices_per_turn: 14.1 event_id: 12346 device_token: BBBB user_id: 100
name: game_end time: 2015-07-15 10:01:00.000 UTC _devices_per_turn: 14.1 event_id: 12347 device_token: BBBB user_id: 100
Configured Metrics:
name: Cheating Games predicate: _devices_per_turn > 1.5 target: event_id type: DISTINCT id: 1
name: Cheating Players predicate: _devices_per_turn > 1.5 target: user_id type: DISTINCT id: 2
Resulting Redis commands:
PFADD /m/1/2015-07-15-10-00 12346
PFADD /m/1/2015-07-15-10-00 12347
PFADD /m/2/2015-07-15-10-00 BBBB
PFADD /m/2/2015-07-15-10-00 BBBB
PFCOUNT /m/1/2015-07-15-10-00 → 2
PFCOUNT /m/2/2015-07-15-10-00 → 1
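How configured metrics turn events into PFADD commands can be sketched roughly as follows. The predicate parser, the 30-minute bucket rule, and the dictionary shapes are assumptions based on the slide. This sketch follows each metric's configured `target` field, so metric 2 adds the user_id (100), whereas the slide's PFADD lines for metric 2 show the device token BBBB; the distinct count is 1 either way.

```python
import operator
from datetime import datetime
from typing import Dict, List, Set, Tuple

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def bucket_suffix(ts: datetime, minutes: int = 30) -> str:
    """Floor the event time to its bucket, formatted like the slide's keys."""
    floored = ts.replace(minute=ts.minute - ts.minute % minutes,
                         second=0, microsecond=0)
    return floored.strftime("%Y-%m-%d-%H-%M")


def matches(predicate: str, event: dict) -> bool:
    """Evaluate a 'field op number' predicate such as '_devices_per_turn > 1.5'."""
    field, op, value = predicate.split()
    return OPS[op](float(event[field]), float(value))


def pfadd_commands(metrics: List[dict], events: List[dict]) -> List[Tuple[str, str, str]]:
    commands = []
    for event in events:
        for metric in metrics:
            if metric["type"] == "DISTINCT" and matches(metric["predicate"], event):
                key = f"/m/{metric['id']}/{bucket_suffix(event['time'])}"
                commands.append(("PFADD", key, str(event[metric["target"]])))
    return commands


metrics = [
    {"id": 1, "name": "Cheating Games", "predicate": "_devices_per_turn > 1.5",
     "target": "event_id", "type": "DISTINCT"},
    {"id": 2, "name": "Cheating Players", "predicate": "_devices_per_turn > 1.5",
     "target": "user_id", "type": "DISTINCT"},
]
events = [
    {"name": "game_end", "time": datetime(2015, 7, 15, 10, 0), "_devices_per_turn": 1.0,
     "event_id": 12345, "device_token": "AAAA", "user_id": 100},
    {"name": "game_end", "time": datetime(2015, 7, 15, 10, 1), "_devices_per_turn": 14.1,
     "event_id": 12346, "device_token": "BBBB", "user_id": 100},
    {"name": "game_end", "time": datetime(2015, 7, 15, 10, 1), "_devices_per_turn": 14.1,
     "event_id": 12347, "device_token": "BBBB", "user_id": 100},
]
commands = pfadd_commands(metrics, events)

# Exact sets stand in for PFCOUNT here; Redis would return an estimate instead.
distinct: Dict[str, Set[str]] = {}
for _, key, member in commands:
    distinct.setdefault(key, set()).add(member)
```

The first event does not satisfy either predicate, so only the two later events produce commands.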
Dashboards
Alarms
HyperLogLog
• High-level algorithm (four-bullet-point version stolen from my colleague, Cristian)
• b bits of the hashed value are used as a register index (Redis uses b = 14, i.e. m = 16384 registers)
• The rest of the hash is inspected for the run of zeroes before its first 1 bit (length N)
• The register at that index is replaced with max(currentValue, N + 1)
• An estimator function computes the approximate cardinality from all the registers
http://content.research.neustar.biz/blog/hll.html
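The four bullets map onto a short implementation. This is a teaching sketch, not Redis's code: it assumes SHA-1 truncated to 64 bits as the hash and the classic HyperLogLog estimator with the small-range (linear counting) correction, so its results will not match PFCOUNT exactly.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, b: int = 14) -> None:
        self.b = b
        self.m = 1 << b                    # number of registers (16384 for b = 14)
        self.registers = bytearray(self.m)

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash value
        index = h & (self.m - 1)               # low b bits select the register
        rest = h >> self.b                     # the remaining 64 - b bits
        if rest == 0:
            n = 64 - self.b
        else:
            # N: number of zero bits before the first 1 bit, from the low end.
            n = (rest & -rest).bit_length() - 1
        self.registers[index] = max(self.registers[index], n + 1)

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: linear counting on the empty registers.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


hll = HyperLogLog()
for i in range(10_000):
    hll.add(f"device-{i}")
estimate = hll.count()
```

Adding the same item again never lowers a register, so duplicates leave the estimate unchanged, which is what makes the structure suitable for counting distinct devices or users.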
We can count different things
Ariel
[Diagram labels: Kinesis, Collector, Idempotence, Aggregation, Web, Workers; operations shown: PFADD, PFCOUNT; example alarm question: "Are installs anomalous?"]
Pipeline Delay
• Pipelines back up
• Dashboards get outdated
• Alarms fire!
Alarm Clocks
• Push timestamp of current events to per-game pub/sub channel
• Take 99th percentile age as delay
• Use that time for alarm calculations
• Overlay delays on dashboards
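The delay calculation can be sketched as below. The sample handling and the nearest-rank percentile method are assumptions; in the real system the timestamps would arrive on the per-game pub/sub channel rather than as a list.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value covering pct percent of samples."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def pipeline_delay(event_timestamps: Sequence[float], now: float) -> float:
    """99th-percentile age, in seconds, of the events currently being processed."""
    ages = [max(0.0, now - ts) for ts in event_timestamps]
    return percentile_nearest_rank(ages, 99)


def alarm_reference_time(now: float, delay: float) -> float:
    """Evaluate alarms as of the time the pipeline has actually caught up to."""
    return now - delay


now = 1000.0
timestamps = [now - age for age in range(1, 101)]  # ages of 1 to 100 seconds
delay = pipeline_delay(timestamps, now)
reference = alarm_reference_time(now, delay)
```

Using a high percentile rather than the maximum keeps a single very late event from moving the clock for the whole game.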
Ariel, now with clocks
[Diagram: the Ariel architecture shown earlier (Kinesis, Collector, Idempotence, Aggregation, Web, Workers, PFADD, PFCOUNT, "Are installs anomalous?"), with an Event Clock added]
Ariel 1.0
• ~30K metrics configured
• Aggregation into 30-minute buckets
• 12KB/30min/metric
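A quick worked calculation shows what these numbers imply for memory. It is an upper bound under stated assumptions: every metric gets a full 12 KB HLL in every bucket, and KB is taken as 1024 bytes.

```python
METRICS = 30_000
BYTES_PER_HLL = 12 * 1024          # 12 KB per metric per bucket (assumed dense)
BUCKETS_PER_DAY = 24 * 60 // 30    # 30-minute buckets: 48 per day

bytes_per_bucket = METRICS * BYTES_PER_HLL
bytes_per_day = bytes_per_bucket * BUCKETS_PER_DAY

print(f"{bytes_per_bucket / 1e6:.0f} MB per bucket")   # 369 MB per bucket
print(f"{bytes_per_day / 1e9:.1f} GB per day")         # 17.7 GB per day
```

Metrics that see few distinct values take less space in practice, so real growth would be somewhat lower than this bound.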
Challenges
• Dataset size. RedisLabs non-cluster max = 100GB
• Packet/s limits: 250K in EC2-Classic
• Alarm granularity
Hybrid Datastore: Requirements
• Need to keep HLL sets to count distinct
• Redis is relatively finite
• HLL outside of Redis is messy
Hybrid Datastore: Plan
• Move older HLL sets to DynamoDB
• They’re just strings!
• Cache reports aggressively
• Fetch backing HLL data from DynamoDB as needed on web layer, merge using on-instance Redis
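Because a Redis HLL is stored as an ordinary string value, the plan can be sketched with a handful of client calls. This is an illustration under assumptions: `hot` and `scratch` are redis-py style clients created without response decoding (so `get` returns raw bytes), `cold` is any mapping standing in for the DynamoDB table, and the function names are hypothetical.

```python
from typing import Iterable, List, MutableMapping


def archive_hll(hot, cold: MutableMapping[str, bytes], key: str) -> None:
    """Move one older HLL from the aggregation Redis into the cold store."""
    blob = hot.get(key)        # GET works because the HLL is stored as a string
    if blob is not None:
        cold[key] = blob
        hot.delete(key)


def count_distinct(scratch, cold: MutableMapping[str, bytes],
                   keys: Iterable[str]) -> int:
    """Load archived HLLs into a scratchpad Redis and count their union."""
    loaded: List[str] = []
    for key in keys:
        blob = cold.get(key)
        if blob is not None:
            scratch.set(key, blob)   # restoring the bytes restores the HLL
            loaded.append(key)
    if not loaded:
        return 0
    total = scratch.pfcount(*loaded)  # PFCOUNT over several keys counts the union
    scratch.delete(*loaded)           # leave the scratchpad clean
    return total
```

The result of `count_distinct` is the kind of value worth caching as a report, since recomputing it means another round trip to the cold store.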
Ariel, now with hybrid datastore
[Diagram: the Ariel architecture with the Event Clock (Kinesis, Collector, Idempotence, Aggregation, Web, Workers, PFADD, PFCOUNT, "Are installs anomalous?"), with DynamoDB, Report Caches, Old Data Migration, and a Merge Scratchpad added]
Much less memory…
Redis Roles
• Idempotence
• Configuration Caching
• Aggregation
• Clock
• Scratchpad for merges
• Cache of reports
Other Considerations
• Multitenancy. We run parallel stacks and give each game an assigned affinity, to insulate games from one another's pipeline delays
• Backfill. System is forward-looking only; can replay Kinesis backups to backfill, or backfill from warehouse