Apache Kafka - 'a system optimized for writing' · 2019-12-21
Apache Kafka: “a system optimized for writing”
Bernhard Hopfenmüller
23 October 2018
whoami
Bernhard Hopfenmüller
IT Consultant @ ATIX AG
IRC: Fobhep
github.com/Fobhep
#atix #ossummit
whoarewe
The Linux & Open Source Company
Unterschleißheim @ München
over 15 years
datacenter automation, Linux
Consulting, Engineering, Support, Training
Kafka
Quora.com
What is the relation between Kafka, the writer, and Apache Kafka, the distributed messaging system?
Jay Kreps: I thought that since Kafka was a system optimized for writing using a writer’s name would make sense. I had taken a lot of lit classes in college and liked Franz Kafka. Plus the name sounded cool for an OS project
- developed by LinkedIn, open source since 2011
- 2014: foundation of Confluent
Messaging-Systems
Why do we need a messaging system?
- Challenge 1: Sender not available
- Challenge 2: Sending too much (DoS)
- Challenge 3: Receiver crashes while processing
Queues vs Topics
Supermarket vs Television (Source [1])
- Supermarket (queue): wait until it’s your turn
- Television (topic): choose what you want to receive
Kafka: Basic structure
Use Cases
- Messaging (in place of ActiveMQ or RabbitMQ)
- Website activity tracking
- Metrics
- Log aggregation
- Stream processing
  - together with Apache Storm and Apache Samza
- Commit log
Topics I
- core component of Kafka
- is filled by producers
- consists of one or more partitions
Topics II
- the producer can choose the partition
- each partition has a running offset
- a message is identified by its offset
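
The relationship between topics, partitions and offsets can be sketched in a few lines of plain Python. This is only an in-memory model, not Kafka client code: the names `Partition`, `Topic` and `send` are made up for illustration, and the CRC32-based choice of partition is an assumption of the sketch rather than Kafka's actual default partitioner.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Partition:
    """An append-only log; a message's offset is its position in the log."""
    messages: list = field(default_factory=list)

    def append(self, key, value):
        self.messages.append((key, value))
        return len(self.messages) - 1  # offset of the message just written


@dataclass
class Topic:
    name: str
    partitions: list

    @classmethod
    def create(cls, name, num_partitions):
        return cls(name, [Partition() for _ in range(num_partitions)])

    def send(self, key, value, partition=None):
        # The producer may pick a partition explicitly. Otherwise this sketch
        # derives one from a hash of the key (CRC32, purely illustrative).
        if partition is None:
            partition = zlib.crc32(key.encode()) % len(self.partitions)
        offset = self.partitions[partition].append(key, value)
        return partition, offset


topic = Topic.create("page-views", num_partitions=3)
first = topic.send("user-1", "home")
second = topic.send("user-1", "cart")                 # same key, same partition
explicit = topic.send("user-2", "home", partition=2)  # producer chooses
print(first, second, explicit)
```

Each call returns a `(partition, offset)` pair. Together with the topic name, that pair is enough to find the message again later.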
Topics III
- messages are stored physically!
- key-value principle
Topics IV
- Clean-up policies:
  - default: retention time (delete old data after x days)
  - retention size (delete old data once the stored data exceeds x)
  - log compaction (replace the old value for a key with the new one)
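
The three clean-up policies can be illustrated on a small in-memory log. The helper names below (`retain_by_time`, `retain_by_size`, `compact`) are hypothetical, and the sketch works record by record and counts records instead of bytes. Kafka itself applies these policies to whole log segments, which the sketch ignores.

```python
def retain_by_time(log, now, max_age):
    """Keep records no older than max_age; log holds (timestamp, key, value)."""
    return [record for record in log if now - record[0] <= max_age]


def retain_by_size(log, max_records):
    """Keep only the newest max_records records (a stand-in for a byte limit)."""
    return log[-max_records:] if max_records > 0 else []


def compact(log):
    """Keep only the latest record per key, preserving log order."""
    latest_index = {}
    for index, (_, key, _) in enumerate(log):
        latest_index[key] = index
    return [record for index, record in enumerate(log)
            if latest_index[record[1]] == index]


log = [
    (1, "user-1", "a"),
    (2, "user-2", "b"),
    (3, "user-1", "c"),  # newer value for user-1
    (4, "user-3", "d"),
]

by_time = retain_by_time(log, now=5, max_age=2)
by_size = retain_by_size(log, max_records=3)
compacted = compact(log)
print(by_time, by_size, compacted, sep="\n")
```

Time- and size-based retention drop the oldest data regardless of key. Compaction keeps the most recent value for every key, however old it is.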
Topic consumption
- consumers pull from topics, nothing is pushed to them (no DoS)
- any data that still exists can be pulled
Consumer Groups
- parallelism allows high throughput
- never more active consumers in a group than partitions
- Kafka features exactly-once semantics (since version 0.11)!
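
Why extra consumers do not help can be seen from a simplified partition assignment. The function `assign` below is a made-up round-robin sketch, not Kafka's actual assignor, and the consumer names are placeholders.

```python
def assign(partitions, consumers):
    """Distribute the partitions of a topic round-robin within one group.

    Every partition goes to exactly one consumer of the group, so a consumer
    beyond the number of partitions ends up with nothing to read.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        consumer = consumers[index % len(consumers)]
        assignment[consumer].append(partition)
    return assignment


balanced = assign([0, 1, 2, 3], ["c1", "c2"])
oversized = assign([0, 1], ["c1", "c2", "c3"])
print(balanced)
print(oversized)
```

In the second call the group has three consumers for two partitions, so `c3` receives an empty list and stays idle.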
Wait, but who knows what has been read?
- Consumers commit their offsets
- After a failure, re-processing is possible
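
The interplay of committed offsets and re-processing can be sketched without a broker. Everything below (`committed_offsets`, `process_batch`, the group name `"g"`) is hypothetical. The sketch assumes the offset is committed only after a whole batch has been handled, which is one common choice and leads to at-least-once processing.

```python
committed_offsets = {}  # (group, partition) -> next offset to read


def process_batch(log, group, partition, batch_size, handler):
    """Handle one batch, then commit. A crash in the handler commits nothing."""
    start = committed_offsets.get((group, partition), 0)
    batch = log[start:start + batch_size]
    for message in batch:
        handler(message)  # may raise, in which case the commit below is skipped
    committed_offsets[(group, partition)] = start + len(batch)
    return batch


log = ["m0", "m1", "m2", "m3"]
seen = []
fail_once = {"m1"}  # simulate a single crash while handling m1


def handler(message):
    if message in fail_once:
        fail_once.discard(message)
        raise RuntimeError("consumer crashed while processing " + message)
    seen.append(message)


try:
    process_batch(log, "g", 0, 2, handler)
except RuntimeError:
    pass  # the offset was not committed, so the batch is read again

process_batch(log, "g", 0, 2, handler)
print(seen, committed_offsets)
```

After the restart the consumer resumes from the last committed offset, so `m0` is handled a second time. Handlers therefore need to tolerate duplicates unless stronger guarantees are configured.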
Replication
Implemented at the partition level (Source [3])
In-Sync and Out-of-Sync Replicas
Did somebody hear my message?
The producer decides when a message counts as successfully sent.
Configuration possibilities:
- as soon as it is sent (acks=0)
- as soon as it is received by the first broker, the partition leader (acks=1)
- as soon as the desired number of replicas exist (acks=all)
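
The three acknowledgement levels correspond to the producer setting `acks` (`0`, `1` or `all`). The function below is a deliberately simplified model of when a write counts as acknowledged. Its name and parameters are invented for this sketch, and it assumes that `acks=all` waits for every current in-sync replica while `min.insync.replicas` sets the smallest in-sync set that may accept writes.

```python
def is_acknowledged(acks, leader_has_record, isr_size, isr_with_record,
                    min_insync_replicas=1):
    """Return True if the producer would treat the send as successful.

    isr_size        number of in-sync replicas, leader included
    isr_with_record how many of them already hold the record
    """
    if acks == 0:
        return True  # fire and forget: no broker response is awaited
    if acks == 1:
        return leader_has_record  # the leader alone is enough
    if acks == "all":
        return (leader_has_record
                and isr_size >= min_insync_replicas
                and isr_with_record == isr_size)
    raise ValueError("acks must be 0, 1 or 'all'")


print(is_acknowledged(0, False, 3, 0))
print(is_acknowledged(1, True, 3, 1))
print(is_acknowledged("all", True, 3, 2))
print(is_acknowledged("all", True, 3, 3))
```

Lower levels return faster but can lose messages when the leader fails. `acks=all` trades latency for durability.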
ZooKeeper
- distributed, hierarchical file system
- manages znodes
- HA via an ensemble (= ZooKeeper cluster)
Source [4]
Broker and ZooKeeper
- Brokers are stateless: cluster state is kept outside the brokers!
- Which broker is alive?
- How do brokers coordinate with each other?
- → ZooKeeper!
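
How ZooKeeper can answer "which broker is alive?" is easiest to see with ephemeral znodes, which disappear when the session that created them ends. The class `ZNodeTree` below is a toy in-memory stand-in, not the ZooKeeper API. The path `/brokers/ids/<id>` follows the layout Kafka uses for broker registration, while session names and data values are made up.

```python
class ZNodeTree:
    """Toy model of a znode tree with session-bound (ephemeral) nodes."""

    def __init__(self):
        self.nodes = {}  # path -> (data, owning session or None)

    def create(self, path, data, session=None):
        # A node created with a session is ephemeral in this sketch.
        self.nodes[path] = (data, session)

    def children(self, parent):
        prefix = parent.rstrip("/") + "/"
        names = (path[len(prefix):] for path in self.nodes
                 if path.startswith(prefix))
        return sorted(name for name in names if "/" not in name)

    def close_session(self, session):
        # Ephemeral nodes vanish together with their session.
        self.nodes = {path: entry for path, entry in self.nodes.items()
                      if entry[1] != session}


tree = ZNodeTree()
tree.create("/brokers/ids/1", "host-a:9092", session="s1")
tree.create("/brokers/ids/2", "host-b:9092", session="s2")
alive_before = tree.children("/brokers/ids")

tree.close_session("s2")  # broker 2 crashes or loses its connection
alive_after = tree.children("/brokers/ids")
print(alive_before, alive_after)
```

Listing the children of `/brokers/ids` yields the brokers that are currently alive. No broker has to keep that list itself.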
Talk to Kafka - Kafka Connect
Source [7]
- I/O for Kafka
- Connect with external systems
- Open source by Confluent
Talk to Kafka - Schema Registry
- define standards (schemas)
- version and store them
- Open source by Confluent
Source: Confluent
TV or Netflix?
Source: Confluent
- live filtering of topics
- KSQL!
- Open source by Confluent
Who likes Kafka?
- Zalando - microservices
- Cisco Systems - security
- Airbnb - event pipeline
- Netflix (monitoring!)
- The New York Times (Kafka as data storage! Super awesome blog post) [5][6]
- Audi - IoT
- Spotify
- Twitter
- Uber (Kafka = backbone!!!)
- https://kafka.apache.org/powered-by
Sources
[1] https://www.informatik-aktuell.de/betrieb/verfuegbarkeit/apache-kafka-eine-schluesselplattform-fuer-hochskalierbare-systeme.html
[2] https://thecattlecrew.net/2017/09/28/apache-kafka-im-detail-teil-1/ and https://thecattlecrew.net/2017/09/28/apache-kafka-im-detail-teil-2/
[3] https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/
[4] https://www.infoq.com/articles/apache-kafka
[5] https://www.confluent.io/blog/okay-store-data-apache-kafka/
[6] https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
[7] https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
Install Kafka with Docker/Ansible
- Run containers as services
- No SSL/SASL yet!
- Have a look at the playbooks and docker-compose files:
  - https://github.com/confluentinc/cp-ansible
  - https://docs.confluent.io/current/installation/docker/docs/installation/index.html
  - Wurstmeister: https://github.com/wurstmeister/kafka-docker
Single Components
---
# Start one ZooKeeper container per host; image names, versions and ports
# come from inventory variables.
- name: Start zookeeper
  docker_container:
    name: zookeeper
    image: "{{ images.zookeeper }}:{{ versions.kafka }}"
    state: started
    restart_policy: unless-stopped
    ports:
      - "{{ ports.zookeeper.client }}:2181"
      - "{{ ports.zookeeper.peer }}:2888"
      # The slide maps the leader port to 2181 as well; 3888 is ZooKeeper's
      # usual leader-election port and is assumed to be what was meant.
      - "{{ ports.zookeeper.leader }}:3888"
    volumes:
      - "/zookeeper/data:/var/lib/zookeeper/data"
      - "/zookeeper/log:/var/lib/zookeeper/log"
    env:
      ZOOKEEPER_SERVER_ID: "{{ zookeeper_server_id }}"
      ZOOKEEPER_CLIENT_PORT: "2181"
      ZOOKEEPER_SERVERS: "{{ lookup('template', 'sort_zookeeper.j2') }}"
      ZOOKEEPER_DATA_DIR: "/var/lib/zookeeper/data"
      ZOOKEEPER_LOG_DIR: "/var/lib/zookeeper/log"
      # ... (further settings elided on the slide)
...
{# sort_zookeeper.j2: this host appears as 0.0.0.0, every other host by its IP #}
{% for host in groups['zookeeper'] %}
{% if inventory_hostname == hostvars[host]['inventory_hostname'] %}
0.0.0.0
{% else %}
{{ hostvars[host]['ansible_default_ipv4']['address'] }}
{% endif %}
{% if not loop.last %}
;
{% endif %}
{% endfor %}
Check system health
---
# Retry until the ZooKeeper readiness check succeeds (at most 3 retries).
- name: "Check Zookeeper Health"
  command: docker run --rm -it confluentinc/zookeeper cub zk-ready "{{ ansible_default_ipv4.address ~ ':2181' }}" 5
  register: output
  until: output is success
  retries: 3
...
Configure via REST/uri
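---
# Create the topic through the kafka-topics CLI inside a container.
- name: create new topic
  command: "{{ 'sudo docker run --rm confluentinc/cp-kafka kafka-topics --create' ... }}"

# Query the REST proxy for the topic's metadata.
- name: get information of current topic
  uri:
    url: "{{ restproxy_url ~ '/topics/' ~ topic.name }}"
  register: result
...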
whoami
Bernhard Hopfenmüller
IRC: Fobhep
github.com/Fobhep
twitter.com/fobhep
Kafka vs MQ
- Kafka has no P2P model!
- Messages are persistent!
- Topic partitioning!
- Message sequencing: guaranteed within one partition (send order = receive order)
- Message reading: choose where to read, rewind, no FIFO!
- Load balancing: automatic distribution is easier with metadata
- HA and failover are implemented very easily