Kafka - LinkedIn's messaging backbone
Who are we?
▪ Kafka SRE at LinkedIn
▪ Site Reliability Engineering
– Administrators
– Architects
– Developers
▪ Keep the site running, always
Presenters
▪ Clark Haskins – Manager for Data Infra Streaming SRE (Mountain View, CA)
▪ Ayyappadas Ravindran – Staff Site Reliability Engineer, Data Infra Streaming (Bengaluru)
▪ Akash Vacher – Site Reliability Engineer, Data Infra Streaming (Bengaluru)
Agenda
▪ What the heck is Kafka?
– Brief intro
– Motivation to build Kafka
▪ Okay, why should I bother?
– Kafka facts, scale & performance
▪ You have my attention, tell me more!
– Core concepts
– Operating Kafka
– Kafka @ LinkedIn
▪ Nice, where all do you use Kafka?
– Tale of two applications
▪ Have questions?
What the heck is Kafka?
▪ A high-throughput distributed messaging system
▪ Developed at LinkedIn and open sourced in early 2011
▪ Implemented in Scala and Java
▪ LinkedIn’s messaging backbone
▪ Kafka powers around 1000 companies including LinkedIn, Yahoo!, Netflix, Uber, Twitter and many more
If data is the lifeblood of high technology, Apache Kafka is the circulatory system used at LinkedIn – Todd Palino (Staff Site Reliability Engineer, LinkedIn)
Motivation to create Kafka?
▪ Needed a unified platform to handle all real-time data feeds and stream processing
▪ Wanted a messaging system with high throughput to support high-volume event feeds
▪ Needed data persistence for offline systems and for service recovery
▪ Low latency
▪ Fault tolerant
▪ Linearly scalable
Okay! What was the motivation to create Kafka?
[Figures: "Before" and "After" diagrams]
How is Kafka used at LinkedIn?
▪ Application and system monitoring (inGraphs)
▪ User tracking on LinkedIn web sites
▪ Email, push & SMS notifications
▪ Live search updates
▪ Samza jobs (standardization, call graph and more)
▪ Database replication
Okay, why should I bother?
▪ Over 1,300,000,000,000 messages are transported via Kafka every day at LinkedIn
▪ 300 terabytes of inbound and 900 terabytes of outbound traffic
▪ 4.5 million messages per second on a single cluster
▪ Kafka runs on around 1300 servers at LinkedIn
Hmmm..! How good is Kafka?
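To put the figures above in perspective, here is a small back-of-the-envelope calculation. It assumes the traffic volumes are per day (the slide does not say so explicitly), that a terabyte is 10^12 bytes, and that the inbound volume corresponds to the same messages counted in the first bullet. The variable names are ours, not LinkedIn's.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on the slide.
messages_per_day = 1_300_000_000_000
inbound_bytes_per_day = 300e12    # assumption: per day, decimal terabytes
outbound_bytes_per_day = 900e12   # assumption: per day, decimal terabytes


def per_second(daily_total):
    """Average rate per second for a daily total."""
    return daily_total / SECONDS_PER_DAY


# Average message rate across all clusters.
avg_msgs_per_sec = per_second(messages_per_day)

# Average message size, if inbound bytes cover exactly these messages.
avg_msg_size_bytes = inbound_bytes_per_day / messages_per_day

# How many times each inbound byte is read back out, on average.
fan_out = outbound_bytes_per_day / inbound_bytes_per_day

if __name__ == "__main__":
    print(f"average messages/sec : {avg_msgs_per_sec:,.0f}")
    print(f"average message size : {avg_msg_size_bytes:.0f} bytes")
    print(f"read fan-out         : {fan_out:.1f}x")
```

Under those assumptions the average works out to roughly 15 million messages per second across all clusters, about 231 bytes per message, and each byte written being read about three times.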
You have my attention, tell me more!
▪ Building blocks
– Message
– Producers
– Consumers
– Topics
– Partitions
– Segments
– Brokers
– Replicas
Awesome!! I am in. Tell me more!
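A toy in-memory model can make some of the building blocks above concrete. It is only a sketch: `ToyTopic` is a hypothetical class, not Kafka's API, and it uses CRC32 to map keys to partitions purely for illustration (this is not Kafka's actual partitioner). Segments, brokers and replicas are not modeled.

```python
import zlib


class ToyTopic:
    """Hypothetical in-memory topic: a list of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Illustrative hash only; the same key always maps to the same partition.
        return zlib.crc32(key) % len(self.partitions)

    def produce(self, key: bytes, value: bytes):
        """Append a message and return its (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1

    def consume(self, partition: int, offset: int, max_messages: int = 10):
        """Read messages from a partition starting at the given offset."""
        return self.partitions[partition][offset:offset + max_messages]


if __name__ == "__main__":
    topic = ToyTopic("page-views", num_partitions=4)
    p0, o0 = topic.produce(b"member-42", b"viewed /jobs")
    p1, o1 = topic.produce(b"member-42", b"viewed /feed")
    # Same key, same partition; offsets grow by one per append.
    print(p0 == p1, o0, o1)
    print(topic.consume(p0, offset=0))
```

The point of the sketch is that a consumer only needs to remember a (partition, offset) pair to resume reading, and that ordering is preserved within a partition rather than across the whole topic.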
Bird’s eye view
The data continues...
What Is Kafka?
[Diagram labels: Producer, Broker, partitions AP0 and AP1, Consumer, Zookeeper]
Performance recipes
▪ OS page cache
▪ Linear IO, never fear the file system!
▪ sendfile() system call
▪ Message batching
Dude, tell me the performance secret!!
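The sendfile() recipe can be illustrated with a short sketch. It assumes a Unix-like system where Python's `os.sendfile` is available. `send_segment` and `demo` are hypothetical names, and this is not how the broker itself is written (Kafka is implemented in Scala and Java). A batch of messages is appended sequentially to a file, then handed to the kernel to copy straight from the page cache to a socket, without passing through a user-space buffer.

```python
import os
import socket
import tempfile


def send_segment(sock, path):
    """Send a whole file over a socket with sendfile(); returns bytes sent."""
    sent = 0
    with open(path, "rb") as segment:
        size = os.fstat(segment.fileno()).st_size
        while sent < size:
            # The kernel copies file data to the socket directly.
            n = os.sendfile(sock.fileno(), segment.fileno(), sent, size - sent)
            if n == 0:
                break
            sent += n
    return sent


def demo():
    """Write a small batch to disk, send it, and return (batch, received)."""
    batch = b"".join(b"message-%d\n" % i for i in range(3))
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(batch)  # sequential append, as in a log segment
        path = f.name
    broker_side, consumer_side = socket.socketpair()
    try:
        send_segment(broker_side, path)
        broker_side.close()  # signal end of stream to the reader
        received = b""
        while True:
            chunk = consumer_side.recv(4096)
            if not chunk:
                break
            received += chunk
        return batch, received
    finally:
        broker_side.close()
        consumer_side.close()
        os.unlink(path)


if __name__ == "__main__":
    batch, received = demo()
    print(received == batch)
```

Batching and sendfile() complement each other here: the more messages sit contiguously in one file, the more data each system call can move.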
Operating Kafka
▪ Broker hardware
– Cisco C240, Intel Xeon, 64GB RAM, 14-disk RAID-10
▪ Zookeeper hardware
– 5 + 1 ensemble, 64GB RAM, 500GB SSD
▪ Monitoring
– Lag monitoring
– Under-replicated partitions
– Unclean leader election
– Burrow
▪ Cluster rebalance
– Size-wise rebalance
– Partition-wise rebalance
Tell me how you manage this beast!
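Two of the monitoring items above reduce to simple comparisons, sketched below. The function names and the dictionary layout are assumptions made for illustration. This is not Burrow's implementation, and the inputs would in practice come from broker and consumer metadata. Lag is taken as the log end offset minus the consumer's committed offset. A partition counts as under-replicated when its in-sync replica set is smaller than its assigned replica set.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: log end offset minus committed offset.

    Both arguments map (topic, partition) -> offset. A partition with no
    committed offset is treated as committed at 0 (an assumption).
    """
    return {
        tp: max(end - committed_offsets.get(tp, 0), 0)
        for tp, end in log_end_offsets.items()
    }


def under_replicated(partitions):
    """Return partitions whose in-sync replica set is smaller than assigned.

    `partitions` maps (topic, partition) -> {"replicas": [...], "isr": [...]}.
    """
    return sorted(
        tp for tp, state in partitions.items()
        if len(state["isr"]) < len(state["replicas"])
    )


if __name__ == "__main__":
    ends = {("tracking", 0): 1200, ("tracking", 1): 900}
    committed = {("tracking", 0): 1150, ("tracking", 1): 900}
    print(consumer_lag(ends, committed))

    state = {
        ("tracking", 0): {"replicas": [1, 2, 3], "isr": [1, 2, 3]},
        ("tracking", 1): {"replicas": [2, 3, 4], "isr": [2]},
    }
    print(under_replicated(state))
```

A single lag number says little on its own; what usually matters is whether it keeps growing over time.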
Mirror Maker and Audit
Kafka Audit (event count)
Kafka Audit (data transport time)
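The two audit views named above can be approximated with a small sketch. Everything in it is an assumption for illustration: the 10-minute bucket size, the function names, and the idea of comparing a "produced" count against a "delivered" count. The slides do not describe the actual audit implementation. Event-count audit compares how many messages each stage saw per time bucket. Transport-time audit looks at the delay between a message's creation and its arrival.

```python
from collections import Counter

BUCKET_SECONDS = 600  # assumed bucket size (10 minutes)


def bucket_of(timestamp):
    """Start of the time bucket that a timestamp (in seconds) falls into."""
    return int(timestamp) // BUCKET_SECONDS * BUCKET_SECONDS


def count_by_bucket(timestamps):
    """Number of messages seen per time bucket."""
    return Counter(bucket_of(t) for t in timestamps)


def completeness(produced, delivered):
    """Fraction of produced messages that were delivered, per bucket."""
    return {
        bucket: delivered.get(bucket, 0) / count
        for bucket, count in produced.items()
        if count
    }


def transport_times(events):
    """Delay in seconds for each (created_ts, arrived_ts) pair."""
    return [arrived - created for created, arrived in events]


if __name__ == "__main__":
    produced = count_by_bucket([0, 10, 599, 600])
    delivered = count_by_bucket([0, 10, 600])
    print(completeness(produced, delivered))
    print(transport_times([(100, 103), (200, 201)]))
```

A completeness below 1.0 for a closed bucket would point at lost messages somewhere between the two stages. A value above 1.0 would point at duplicates.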
Kafka @ LinkedIn
▪ Cluster types
– Tracking
– Metrics
– Queuing
▪ Kafka REST
▪ Schema Registry
Kafka @ LinkedIn - Schema registry
Autometrics
▪ Building blocks
– Sensors
– EventBus
– Kafka REST
– Kafka cluster
– Kafka consumer
– RRD
– Front end
▪ Facts & figures
– 320,000,000 metrics collected per minute
– 530 TB of disk space
– Over 210,000 metrics collected per service
InGraphs
Kafka for database replication - Master-slave
Kafka for database replication - Multi-master
Have questions?