![Page 1: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/1.jpg)
Containerising Distributed Pipes
a story about convenience
Hagen Tönnies www.linkedin.com/in/hagen-toennies
![Page 2: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/2.jpg)
Agenda
• Introduction
• A Distributed Pipe
• Tool Application Stack
• Recap
![Page 3: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/3.jpg)
Unix Pipe Recap
• […] In Unix-like computer operating systems, a pipeline is a sequence of processes chained together by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) to the next one.
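The same idea can be sketched in Clojure, the language used for the tools later in this talk: each stage is a small function, and the `->>` threading macro plays the role of `|`. This is only an in-process analogy, not a real pipe, and the names `split-words`, `count-words`, `top` and `pipeline` are made up for the illustration.

```clojure
(require '[clojure.string :as str])

;; Stage 1: split a line into lower-case words.
(defn split-words [line]
  (remove str/blank? (str/split (str/lower-case line) #"\s+")))

;; Stage 2: count the occurrences of each word (roughly `sort | uniq -c`).
(defn count-words [words]
  (frequencies words))

;; Stage 3: keep the n most frequent words (roughly `sort -rn | head`).
(defn top [n counts]
  (take n (sort-by val > counts)))

;; The threading macro chains the stages: the output of one stage
;; becomes the input of the next.
(defn pipeline [line]
  (->> line
       split-words
       count-words
       (top 2)))

(pipeline "to be or not to be")
;; returns the entries for "to" and "be", each counted twice
```

Each stage can be tested and replaced on its own, which is the property the rest of the talk carries over to Kafka topics.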
![Page 4: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/4.jpg)
Unix Philosophy
• Write programs that do one thing and do it well.
• Write programs to work together.
• Write programs to handle text streams, because that is a universal interface.
Peter H. Salus on the Unix philosophy
![Page 5: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/5.jpg)
Apache Kafka
![Page 6: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/6.jpg)
Apache Kafka Cluster
Kafka Broker
ZooKeeper
![Page 7: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/7.jpg)
Apache Kafka Topic
[Figure: a topic as a log of records, labelled "older" and "newer"]
![Page 8: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/8.jpg)
Apache Kafka Partitions
![Page 9: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/9.jpg)
Apache Kafka Replications
![Page 10: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/10.jpg)
Apache Kafka Distribute Partitions
![Page 11: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/11.jpg)
Apache Kafka Producer
[Figure: a Producer and the topic log, labelled "older" and "newer"]
![Page 12: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/12.jpg)
Apache Kafka Consumer
[Figure: a Consumer and the topic log, labelled "older" and "newer"]
![Page 13: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/13.jpg)
Apache Kafka Consumer Groups
[Figure: the topic log, labelled "older" and "newer", with two consumer groups, Group1 and Group2, each containing consumers labelled Consumer1]
![Page 14: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/14.jpg)
Kafka Streaming API
Low-Level
• Topology Builder
• Custom Aggregators
• Custom Processors
High-Level
• Stream and Table abstractions
• Simple Transformation
• Simple Joins of Tables and Streams
![Page 15: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/15.jpg)
Kafka Streaming API
Stream (change-log) → Table
("alice", 1) → alice | 1
("charlie", 1) → alice | 1, charlie | 1
("alice", 2) → alice | 2, charlie | 1
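The stream/table duality on this slide can be replayed in a few lines of plain Clojure. This is a minimal sketch with an in-memory vector standing in for the change-log topic; the names `apply-record`, `stream->table` and `table-history` are hypothetical and not part of the Kafka Streams API.

```clojure
;; The change-log from the slide: each record is a [key value] update.
(def change-log [["alice" 1] ["charlie" 1] ["alice" 2]])

;; Applying one record to the table overwrites the value for that key.
(defn apply-record [table [k v]]
  (assoc table k v))

;; Replaying the whole stream yields the latest table.
(defn stream->table [records]
  (reduce apply-record {} records))

;; `reductions` exposes every intermediate table, one per record.
(defn table-history [records]
  (rest (reductions apply-record {} records)))

(stream->table change-log)
;; => {"alice" 2, "charlie" 1}
(table-history change-log)
;; => ({"alice" 1} {"alice" 1, "charlie" 1} {"alice" 2, "charlie" 1})
```

Reading the table is a lookup; reading the stream is a replay. Both carry the same information as long as the full change-log is retained.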
![Page 16: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/16.jpg)
Kafka featuring Unix
(split %1 "\s")
![Page 17: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/17.jpg)
Kafka featuring Unix
(split %1 "\s")
kafka
Message Broker Stream processing job
![Page 18: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/18.jpg)
Kafka featuring Unix
(split %1 "\s") (sketch %1) (agg-by-key %1) (store %1)
![Page 19: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/19.jpg)
Containers
• […] Operating-system-level virtualization is a server virtualization method in which the kernel of an operating system allows the existence of multiple isolated user-space instances
![Page 20: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/20.jpg)
Containers
• docker run -t -i IMAGE-NAME …args
• java -jar …args
![Page 21: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/21.jpg)
Our Distributed Pipe Application Stack in Docker
• Consul: discovers and configures services in your infrastructure.
• ZooKeeper: enables highly reliable distributed coordination.
• Kafka: a distributed append log, a.k.a. message broker.
• Elasticsearch: provides a distributed full-text search engine.
![Page 22: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/22.jpg)
consul:
  image: qnib/alpn-consul
  networks:
    - network
  cpu_shares: 4
  mem_limit: 1g
  environment:
    - DC_NAME=es
  ports:
    - "8500:8500"
![Page 23: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/23.jpg)
ZooKeeper
zookeeper:
  image: qnib/zookeeper
  extends:
    file: base.yml
  ports:
    - "2181:2181"
![Page 24: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/24.jpg)
kafka
kafka-broker:
  image: kafka:0.10.0.1
  extends:
    file: base.yml
    service: gaikai
  volumes:
    - /tmp/kafka-logs
  ports:
    - "9092:9092"
![Page 25: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/25.jpg)
elasticsearch:
  image: elasticsearch:1.7
  command: "elasticsearch -Des.cluster.name=dp-es -Dnetwork.bind_host=0.0.0.0"
  environment:
    - ES_HEAP_SIZE=2g
  ports:
    - "9200:9200"
    - "9300:9300"
![Page 26: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/26.jpg)
![Page 27: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/27.jpg)
Stream Processing Recap
Power
Simplicity
kafka
![Page 28: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/28.jpg)
Stream Processing Recap
kafka
kafka streams
![Page 29: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/29.jpg)
Our Distributed Tool Application Stack in Docker
• clj-kstream-cutter
• clj-kstream-hh
• clj-kstream-string-long-window-aggregate
• clj-kstream-elasticsearch-sink
https://github.com/sojoner
![Page 30: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/30.jpg)
https://xkcd.com/297/
![Page 31: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/31.jpg)
clj-kstream-cutter
Input:
Edmilson Alves 0 Edmilson Alves -LRB- born February 17 , 1976 -RRB- , is a Brazilian midfielder who currently plays for Roasso Kumamoto in the J. League Division 2 .
Output:
[ Edmilson, Alves, 0, Edmilson, Alves, LRB, born …]
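The cutting step can be approximated in plain Clojure. This is a sketch of the behaviour suggested by the example, not the actual clj-kstream-cutter source: it assumes tokens are whitespace-separated, that non-alphanumeric characters are stripped (so `-LRB-` becomes `LRB`), and that tokens left empty are dropped. The function name `cut` is hypothetical.

```clojure
(require '[clojure.string :as str])

;; Assumption: a token is a whitespace-separated chunk with every
;; non-alphanumeric character removed; chunks that end up empty
;; (such as a lone "," or ".") are dropped.
(defn cut [sentence]
  (->> (str/split (str/trim sentence) #"\s+")
       (map #(str/replace % #"[^A-Za-z0-9]" ""))
       (remove str/blank?)
       vec))

(cut "Edmilson Alves 0 Edmilson Alves -LRB- born February 17 , 1976 -RRB- ,")
;; => ["Edmilson" "Alves" "0" "Edmilson" "Alves" "LRB" "born" "February" "17" "1976" "RRB"]
```

In the real tool each token would be written to an output topic rather than collected into a vector.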
![Page 32: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/32.jpg)
clj-kstream-hh
Input:
[ Edmilson, Alves, 0, Edmilson, Alves, LRB, born, …]
Output:
Edmilson ~10 Alves ~8
![Page 33: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/33.jpg)
Count Min (CM) sketch
![Page 34: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/34.jpg)
CM sketch retrieval
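Since the two sketch slides are figures only, here is a minimal Count-Min sketch in plain Clojure covering both update and retrieval. It is an illustration under stated assumptions, not the clj-kstream-hh implementation: the hash family is simulated with Clojure's built-in `hash` on the pair `[row item]`, and all function names are hypothetical. The 10 x 1000 dimensions mirror the `:number-of-hashfn` and `:bucket-size` values that appear in the processor code later in the deck.

```clojure
;; A Count-Min sketch is a depth x width grid of counters.
(defn make-sketch [depth width]
  {:depth depth
   :width width
   :rows  (vec (repeat depth (vec (repeat width 0))))})

;; Assumption: row i hashes an item via Clojure's `hash` of [i item];
;; a production sketch would use pairwise-independent hash functions.
(defn- bucket [width row item]
  (mod (hash [row item]) width))

;; Update: increment one counter per row.
(defn sketch-add [{:keys [depth width] :as sketch} item]
  (reduce (fn [s row]
            (update-in s [:rows row (bucket width row item)] inc))
          sketch
          (range depth)))

;; Retrieval: the minimum over the item's counters. Collisions can only
;; inflate a counter, so the estimate never undercounts.
(defn sketch-estimate [{:keys [depth width rows]} item]
  (apply min (for [row (range depth)]
               (get-in rows [row (bucket width row item)]))))

(def sketch
  (reduce sketch-add
          (make-sketch 10 1000)
          ["Alves" "Edmilson" "Alves" "born" "Alves"]))

(sketch-estimate sketch "Alves")   ; at least 3, the true count
(sketch-estimate sketch "unseen")  ; usually 0, may be inflated by collisions
```

This is why the tool outputs are written with a tilde (`Alves ~8`): the counts are estimates that can overshoot but never undershoot.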
![Page 35: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/35.jpg)
Heavy Hitter
retrieve sketched value
And ~10
Bob ~7
Alice ~5
Foo ~3
Bar ~2
take top_n
Heavy Hitter for t_1
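The "take top_n" step is a sort and a cut. The sketch below uses the estimated counts from this slide and a plain map; the processor shown later in the deck keeps its candidates in a priority map instead, so treat this as an illustration of the selection only. The name `top-n` is hypothetical.

```clojure
;; Estimated counts as read back from the sketch (values from the slide).
(def estimates {"And" 10, "Bob" 7, "Alice" 5, "Foo" 3, "Bar" 2})

;; Sort by estimated count, largest first, and keep the first n entries.
(defn top-n [n counts]
  (->> counts
       (sort-by val >)
       (take n)
       (mapv (fn [[k v]] [k v]))))

(top-n 3 estimates)
;; => [["And" 10] ["Bob" 7] ["Alice" 5]]
```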
![Page 36: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/36.jpg)
clj-kstream-hh Topology
(defn- heavy-hitter-processor
  "Main stream processor takes a configuration and a mapper function to apply."
  [conf]
  (let [streamBuilder
        (-> (new TopologyBuilder)
            (.addSource (:name conf) string_dser string_dser
                        (into-array [(:input-topic conf)]))
            (.addProcessor "HeavyHitter"
                           (reify ProcessorSupplier
                             (get [this] (get-processor)))
                           (into-array [(:name conf)]))
            (.addStateStore (->> (Stores/create storeName)
                                 (.withStringKeys)
                                 (.withLongValues)
                                 (.inMemory)
                                 (.build))
                            (into-array ["HeavyHitter"]))
            (.addSink "Sink" (:output-topic conf) string_ser string_ser
                      (into-array ["HeavyHitter"])))]
    (.start (KafkaStreams. streamBuilder (get-props conf)))))
![Page 37: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/37.jpg)
clj-kstream-hh Processor
(defn ^Processor get-processor []
  (reify org.apache.kafka.streams.processor.Processor
    ;; init: schedule punctuation and reset the sketch state
    (init [this context]
      (.schedule (:context @application-state) (:time-window @application-state))
      (swap! hh/state assoc :top-n 5 :number-of-hashfn 10N :bucket-size 1000N)
      (reset! hh/hitter (priority-map))
      (reset! hh/min-sketch (make-array Integer/TYPE 10N 1000N))
      …)
    ;; process: sketch every incoming value and track it as a candidate
    (process [this key value]
      (debug "Process (k,v)::" key value)
      (hh/sketch-value value)
      (hh/add-to-hitter value)
      …)
    (punctuate [this timestamp] …)
    (close [this]
      (.close (:store @application-state)))))
![Page 38: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/38.jpg)
clj-kstream-hh Container
clj-kstream-hh:
  image: sojoner/clj-kstream-hh:0.1.0
  hostname: clj-kstream-hh
  container_name: clj-kstream-hh
  extends:
    file: base.yml
    service: sojoner
  command: "--broker kafka-broker:9092 --input-topic mapped-test-json --output-topic heavy-hitters --window-size 1 --name stream-hh"
![Page 39: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/39.jpg)
clj-kstream-string-long-window-aggregate
Input: (key, value)
Alves ~8 Alves ~10 Edmilson ~5 Edmilson ~3
Output:
Alves ~18 Edmilson ~8
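The aggregation over one window amounts to summing values per key. A minimal sketch with the numbers from this slide, assuming the window's records are already collected in a vector (the real tool reads them from a Kafka topic); `aggregate-by-key` is a hypothetical name.

```clojure
;; (key, value) records arriving within one window, as on the slide.
(def window [["Alves" 8] ["Alves" 10] ["Edmilson" 5] ["Edmilson" 3]])

;; Sum the values per key; `fnil` treats a missing key as 0.
(defn aggregate-by-key [records]
  (reduce (fn [acc [k v]] (update acc k (fnil + 0) v))
          {}
          records))

(aggregate-by-key window)
;; => {"Alves" 18, "Edmilson" 8}
```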
![Page 40: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/40.jpg)
clj-kstream-elasticsearch-sink
Input:
Alves ~18
Output:
{ "name": "Alves", "count": 18, "time": "January 26th 2017, 17:03:00.000" }
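The mapping from an aggregated record to a document can be sketched as a pure function. The field names follow the slide; everything else is an assumption for illustration: the function name `->document` is hypothetical, the timestamp is passed in as epoch milliseconds, the slide's time is read as UTC, and the ISO-8601 rendering is a choice made here (the slide shows a display format). Serialising the map to JSON and indexing it are left out.

```clojure
;; Build the document for one (term, count) record and a timestamp.
(defn ->document [term n epoch-millis]
  {:name  term
   :count n
   :time  (str (java.time.Instant/ofEpochMilli epoch-millis))})

;; 1485450180000 ms is 2017-01-26 17:03:00 UTC.
(->document "Alves" 18 1485450180000)
;; => {:name "Alves", :count 18, :time "2017-01-26T17:03:00Z"}
```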
![Page 41: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/41.jpg)
Our Distributed Tool Application Stack in Docker
Edmilson Alves 0 Edmilson Alves -LRB- born February 17 , 1976 -RRB- , is a Brazilian midfielder who currently plays for Roasso Kumamoto in the J. League Division 2 .
[ Edmilson, Alves, 0, Edmilson, Alves, LRB, born …]
Alves ~10 Alves ~8
{"name": "Alves", "count": 18, "time": "January 26th 2017, 17:03:00.000"}
Alves 18
![Page 42: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/42.jpg)
Our Distributed Tool Application Stack in Docker
![Page 43: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/43.jpg)
From the development setup…
$ export DOCKER_HOST=tcp://my.desktop.de:2576
![Page 44: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/44.jpg)
…to a datacenter setup
$ export DOCKER_HOST=tcp://my.datacenter.de:2576
Build a Docker Swarm
![Page 45: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/45.jpg)
Current Challenges
• Kafka Streams is still at-least-once
• Persistent and durable Storage
• Still need capacity planning
• Testing / Debugging is still a challenge
• Consistency of the state storage
• Processing Time vs. Event Time
• How about Amdahl’s law?
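For the last point, Amdahl's law gives the upper bound on what adding more parallel workers (containers, partitions) can buy. A small worked sketch; the parallel fraction 0.9 and the worker counts are hypothetical values chosen for illustration, not measurements from this stack.

```clojure
;; Amdahl's law: with a fraction p of the work parallelisable over n
;; workers, the speedup is 1 / ((1 - p) + p / n).
(defn speedup [p n]
  (/ 1.0 (+ (- 1.0 p) (/ p n))))

(speedup 0.9 10)     ; about 5.26
(speedup 0.9 1000)   ; about 9.91, approaching the limit 1 / (1 - p) = 10
```

Even with 1000 workers, the serial 10% of the pipeline caps the speedup at roughly 10x in this example.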
![Page 46: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/46.jpg)
Recap
(split %1 "\s")
as Message broker as Stream Processor
(split %1 "\s")
![Page 47: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/47.jpg)
Containerising Distributed Pipes
a story about convenience
(thanks (listening [this]))
![Page 48: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/48.jpg)
Sources 1
• http://kafka.apache.org/
• https://martin.kleppmann.com/2015/05/06/data-agility-at-strata.html
• https://speakerdeck.com/ept/kafka-and-samza-distributed-stream-processing-in-practice
• https://github.com/mhausenblas/dnpipes
• https://en.wikipedia.org/wiki/Pipeline_%28Unix%29
• https://zookeeper.apache.org/doc/trunk/zookeeperOver.html
• https://github.com/sojoner/container-stacks/tree/master/kafkaelasticsearch
![Page 49: Containerising Distributed Pipes · Unix Philosophy • Write programs that do one thing and do it well. • Write programs to work together. • Write programs to handle text streams,](https://reader034.vdocuments.mx/reader034/viewer/2022050600/5fa775b340332c0fd94a69fc/html5/thumbnails/49.jpg)
Sources 2
• https://kafka.apache.org/documentation/streams#streams_processor
• https://kafka.apache.org/documentation/streams#streams_dsl
• https://hub.docker.com/r/sojoner/clj-kstream-elasticsearch-sink/
• https://hub.docker.com/r/sojoner/clj-kstream-cutter/
• https://hub.docker.com/r/sojoner/clj-kstream-hh/
• https://hub.docker.com/r/sojoner/clj-kstream-string-long-window-aggregate/
• https://blog.acolyer.org/2016/07/21/time-adaptive-sketches-ada-sketches-for-summarizing-data-streams/