kafka evaluation - high throughout message queue
DESCRIPTION
TRANSCRIPT
![Page 1: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/1.jpg)
Kafka Evaluation for Data Pipeline and ETL
Shafaq Abdullah @ GREE Date: 10/21/2013
![Page 2: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/2.jpg)
High-Level Architecture
Performance
Scalability
Fault-Tolerance/Error Recovery
Operations and Monitoring
Summary
Outline
![Page 3: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/3.jpg)
Kafka - A distributed messaging pub/sub system In Production: Linkedin, FourSquare, Twitter, GREE (almost)
Usage: - Log aggregation (user activity stream) - Real-time events - Monitoring
- Queuing
Intro
![Page 4: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/4.jpg)
Point-to-Point Data Pipelines
Game Server 1 Logs
Hadoop
... User Tracking
Security Search Social Graph
Rules/Recommendation/Engine
Vertica Ops
DataWare House
![Page 5: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/5.jpg)
Message- Central Data Channel
Game Server 1 Logs
Hadoop
... User Tracking
Security Search Social Graph
Rules/Recommendation/Engine
Vertica Ops
Data Warehouse
Message Queue
![Page 6: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/6.jpg)
Topic- A String representing Message Stream Id
Partition- Logical Division per topic-level within Broker, for
writing logs generated by producer. e.g: kafkaTopic - kafkaTopic1, kafkaTopic2
Replica-Sets- Replica within Broker for a certain partition, with
Leader(writes) and follower (read) balanced using hash-key modulu.
Kafka Jargon
![Page 7: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/7.jpg)
Log- Message Queue
![Page 8: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/8.jpg)
Log- Message Queue
![Page 9: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/9.jpg)
AWS m1.Large instance Dual Core Intel Xeon [email protected]
7.5 GB RAM
2 x 420 GB hardisk
Hardware Spec of Broker Cluster
![Page 10: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/10.jpg)
Performance Results
Producer thread Transactions Processing time (s) Throughput (transaction/sec)
1 488396 64.129 7616
2 1195748 110.868 10785
4 1874713 140.375 13355
10 47269410 338.094 13981
17 7987317 568.028 14061
![Page 11: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/11.jpg)
Latency vs Durability
Ack Status TIme to publish (ms) Tradeoff
No Ack 0.7 Greater data loss
Wait for ack 1.5 Lesser data loss
![Page 12: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/12.jpg)
1. Create a Kafka-Replica Sink
2. Feed data to Copier via Kafka-Sink
3. Benchmark Kakfa-Copier-Vertica Pipeline
4. Improve/Refactor for Performance
Integration Plan
![Page 13: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/13.jpg)
C- Consistency (Producer Sync Mode) A- Availability (Replication) P- Partition Tolerance (Cluster in same
network- No Network delay)
Strongly Consistent and Highly Available
CA-P theorem
![Page 14: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/14.jpg)
Conventional Quoram Replication: 2f+1 replicas → f failures (e.g. ZooKeeper)
Kafka Replication: f+1 replicas → f failures
Failure Recovery
![Page 15: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/15.jpg)
![Page 16: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/16.jpg)
Leader : ➢ Message is propagated to follower ➢ Commit offset is checkpointed to disk
Follower failure and Recovery: ➢ Kicked out of ISR ➢ After restart, truncates log to last commit ➢ Catches up with leader → ISR
Error Handling: Follower Failure
![Page 17: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/17.jpg)
➢ Embedded Controller via ZK detects leader failure
➢ Leader election from ISR ➢ Committed message not lost
Error Handling: Leader Failure
![Page 18: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/18.jpg)
- Horizontally scalable Add partitions in Broker Cluster as higher
throughput needed (~10Mb/s /server) - Balancing for producer and consumer
Scalability
![Page 19: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/19.jpg)
• Number of messages the consumer lags behind the producer by
• Max lag in messages btw follower and leader replica
• Unclean leader election rate
• Is controller active on broke
Monitoring via JMX
![Page 20: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/20.jpg)
3 Node cluster ~ 25 MB/s, < 20 ms latency (end2end)
having replication factor of 3 with 6 consumer group
Monthly operations $500 + $250 (zookeeper 3 node) + 1 mm
Operation Costs
![Page 21: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/21.jpg)
- No callback in producer’s send()
- Fully automatic balancing till script is manually run- Balancing layer of topics via ZK
Nothing is perfect
![Page 22: Kafka Evaluation - High Throughout Message Queue](https://reader034.vdocuments.mx/reader034/viewer/2022042510/54c6dd964a795952238b4594/html5/thumbnails/22.jpg)
• Infinite scaling with 10Mb/s /server with < 10 ms latency
• Costs <$1000 /month + 1 man-month
• No impact on current pipeline
• 0.8 final release due in days to be used in production
Summary