XebiCon FR 15 - Kafka via the North Face (Kafka par la face nord)
TRANSCRIPT
Kafka via the North Face
Messaging System?
Jay Kreps, Neha Narkhede, Jun Rao
History
[Diagram 1: WebApp, Relational DB, NoSQL DB, DWH, Hadoop, ETL, Monitoring, Logs]
[Diagram 2: the same systems plus ActiveMQ, Search and additional WebApps]
BIG MESS
Stream Data Platform
● Distributed
● High throughput
● Large number of consumers
● Ad-hoc consumers
● Batch consumers
● Automatic recovery from broker failure
Features
Distributed Commit Logs
[Figure: an append-only log of records numbered 1 to 19; labels: 1st record, next record written]
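The commit log idea can be illustrated with a minimal in-memory sketch. This is not Kafka code: the class name CommitLog and its methods are hypothetical, and records are numbered from 1 only to match the slide.

```python
class CommitLog:
    """Toy append-only log: records are only ever added at the end."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # Existing records are never modified; the new one goes last.
        self._records.append(record)
        # Return the number of the record just written (1-based, as on the slide).
        return len(self._records)

    def read(self, offset, max_records=10):
        # Reading does not remove anything, so several readers can replay the log.
        start = offset - 1
        return self._records[start:start + max_records]


log = CommitLog()
for i in range(1, 19):
    log.append(f"record-{i}")          # records 1 to 18 are already written
next_offset = log.append("record-19")  # the next record written gets number 19
```

Because reads never consume data, a slow batch reader and a fast real-time reader can share the same log without interfering with each other.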
[Diagram: Producers write to a Kafka Cluster made of Brokers, with Zookeeper alongside; Consumers read from the cluster]
Architecture
Producer
[Figure: one topic split into Partition #1, #2 and #3; producers append (Writes) at the "New" end of each partition, at offsets 19, 16 and 17; axis: Old to New]
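A rough sketch of producers appending to a partitioned topic follows. The slide does not say how a partition is chosen; hashing a message key with CRC32 is an assumption made here for illustration, and the names Topic, partition_for and send are hypothetical.

```python
import zlib


class Topic:
    """Toy topic: a fixed number of partitions, each its own append-only list."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: hash the key so that equal keys always land in the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def send(self, key, value):
        index = self.partition_for(key)
        self.partitions[index].append((key, value))
        # Offsets are counted per partition, not per topic (1-based, as on the slide).
        return index, len(self.partitions[index])


topic = Topic("log", 3)
first = topic.send("user-42", "click")
second = topic.send("user-42", "scroll")
```

Each partition keeps its own offset sequence, which is why the three partitions on the slide are at different positions.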
Consumer group
[Figure: Partition #1, #2 and #3 of topic "log", read by Consumer group 1 and Consumer group 2; axis: Old to New]
Group  Topic  Partition #  Offset
1      log    1            18
1      log    2            12
1      log    3            14
2      log    1            1
2      log    2            0
2      log    3            3
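The offset table can be sketched as a small store keyed by (group, topic, partition). This is a simplified model, not Kafka's implementation: OffsetStore and poll are hypothetical names, and the stored offset is taken to mean "number of records already consumed".

```python
class OffsetStore:
    """Toy offset table: one position per (group, topic, partition)."""

    def __init__(self):
        self._offsets = {}

    def commit(self, group, topic, partition, offset):
        self._offsets[(group, topic, partition)] = offset

    def position(self, group, topic, partition):
        # A group that has never committed starts from the beginning.
        return self._offsets.get((group, topic, partition), 0)


def poll(store, log, group, topic, partition, max_records):
    # Read from the group's own position, then advance only that group's position.
    start = store.position(group, topic, partition)
    batch = log[start:start + max_records]
    store.commit(group, topic, partition, start + len(batch))
    return batch


partition_1 = [f"m{i}" for i in range(1, 19)]  # records 1 to 18 of partition #1
store = OffsetStore()
store.commit(1, "log", 1, 18)  # group 1 has read everything
store.commit(2, "log", 1, 1)   # group 2 has read only the first record

batch_g2 = poll(store, partition_1, 2, "log", 1, 3)
batch_g1 = poll(store, partition_1, 1, "log", 1, 3)
```

Group 2 advances through old records while group 1 waits at the end of the log; neither affects the other, since the log itself is never modified by reads.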
Topic storage
[Figure: Partition #1 is stored as a directory; each segment is a file]
app_log-2:
total 1864
-rw-r--r-- 1 root root 512 Oct 28 01:00 00000000000000208027.index
-rw-r--r-- 1 root root 337762 Oct 27 19:03 00000000000000208027.log
-rw-r--r-- 1 root root 10485760 Oct 28 19:24 00000000000000208739.index
-rw-r--r-- 1 root root 1553051 Oct 28 19:24 00000000000000208739.log

app_log-3:
total 1940
-rw-r--r-- 1 root root 48 Oct 27 07:02 00000000000000207555.index
-rw-r--r-- 1 root root 31360 Oct 27 04:05 00000000000000207555.log
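Since each segment file is named after the offset of its first record, the segment holding a given offset can be found with a binary search over the file names. The sketch below uses the app_log-2 listing; the helper names base_offset and segment_for are hypothetical, and it does not check whether the offset lies beyond the end of the last segment.

```python
import bisect


def base_offset(filename):
    # "00000000000000208027.log" -> 208027
    return int(filename.split(".")[0])


def segment_for(offset, filenames):
    """Return the .log segment whose base offset is the largest one <= offset."""
    bases = sorted(base_offset(n) for n in filenames if n.endswith(".log"))
    i = bisect.bisect_right(bases, offset) - 1
    if i < 0:
        raise KeyError(f"offset {offset} is older than the first segment")
    # Segment names are the base offset zero-padded to 20 digits.
    return f"{bases[i]:020d}.log"


files = [
    "00000000000000208027.index",
    "00000000000000208027.log",
    "00000000000000208739.index",
    "00000000000000208739.log",
]
older = segment_for(208500, files)
newest = segment_for(208739, files)
```

Splitting a partition into segments also makes retention cheap: old data is discarded by deleting whole files rather than rewriting the log.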
Topic clustering
[Figure, in three steps: Partition #1, #2 and #3 with a Leader; then with a Leader and two Replicas; then the partitions alone]
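The leader and replica roles, together with the "automatic recovery from broker failure" feature listed earlier, can be sketched as follows. This is a deliberately simplified model under stated assumptions: every write is copied synchronously to all replicas, and on failure the next surviving broker is promoted. Kafka's actual replication protocol is more involved; the class Partition and the broker names are hypothetical.

```python
class Partition:
    """Toy replicated partition: one leader, the other brokers hold replicas."""

    def __init__(self, brokers):
        # Each broker keeps its own copy of the log; the first broker starts as leader.
        self.logs = {broker: [] for broker in brokers}
        self.leader = brokers[0]

    def replicas(self):
        return [b for b in self.logs if b != self.leader]

    def write(self, record):
        # Assumption: the record is copied to every replica before the write returns.
        for log in self.logs.values():
            log.append(record)
        return len(self.logs[self.leader])

    def fail(self, broker):
        # Drop the failed broker; if it was the leader, promote a surviving replica.
        del self.logs[broker]
        if broker == self.leader:
            if not self.logs:
                raise RuntimeError("no replica left to take over")
            self.leader = next(iter(self.logs))


partition = Partition(["broker-1", "broker-2", "broker-3"])
partition.write("a")
partition.write("b")
partition.fail("broker-1")          # the leader goes down
offset_after_failover = partition.write("c")
```

Because the promoted replica already holds every record, writes continue at the next offset after the failover.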
Jay Kreps, Neha Narkhede, Jun Rao
Kafka Enterprise Ready
2011 2012
2014
● User behaviour, click stream analysis
● Infrastructure monitoring and security
● Telemetry data from mobile/sensors
● IoT
● Log analysis
● ...
Usage
Used by
● LinkedIn: activity stream, metrics
● Netflix: real-time monitoring
● Twitter: real-time data pipeline
● Spotify: log delivery
● Loggly: log collection and processing
● Mozilla: telemetry data
● Airbnb, Square, Uber, Criteo, OVH ...
● Need to write code to use it (no ready-made producers and consumers)
● Not a JMS replacement
● No data transformations yet
● No encryption, authorization or authentication yet (v0.9.0 KAFKA-2210, KAFKA-2211)
Pain Points
Hands-On: The Road
Installation
Zookeeper
Brokers
Topic
Console Tools
Hands-On: The Trail
Kafka Producer (Java/Scala)
High Level Consumer (Java/Scala)
Hands-On: The North Face
“Simple” Consumer
Go!