TRANSCRIPT
Streaming Processing with a Distributed Commit Log
Apache Kafka
Apache Kafka committer and PMC member. A frequent speaker on both Hadoop and Cassandra, Joe is the Co-Founder and CTO of Elodina Inc. Joe has been a distributed systems developer and architect for over {years} now, having built backend systems that supported over one hundred million unique devices a day, processing trillions of events. He blogs and hosts a podcast about Hadoop and related systems at All Things Hadoop.
@allthingshadoop
$(whoami)
● Introduction to Apache Kafka
● Brokers “as a Service”
● Producers & Consumers “as a Service”
● More Use Cases for Kafka
Overview
Apache Kafka
Apache Kafka was first open sourced by LinkedIn in 2011
Papers
● Building a Replicated Logging System with Apache Kafka http://www.vldb.org/pvldb/vol8/p1654-wang.pdf
● Kafka: A Distributed Messaging System for Log Processing http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
● Building LinkedIn’s Real-time Activity Data Pipeline http://sites.computer.org/debull/A12june/pipeline.pdf
● The Log: What Every Software Engineer Should Know About Real-time Data's Unifying Abstraction http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
http://kafka.apache.org/
Apache Kafka
It often starts with just one data pipeline
Data Pipelines
Point to Point Data Pipelines are Problematic
Reuse of data pipelines for new providers
Reuse of existing providers for new consumers
Eventually the solution becomes the problem
Decouple Data Pipelines
Topics & Partitions
Log Segments
Read and Write Keys & Values to each partition
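The key decides which partition a record lands in, which is what preserves per-key ordering. A minimal sketch of that idea (the real default partitioner hashes keys with murmur2; `zlib.crc32` here is purely illustrative, so the partition numbers will not match a real cluster's):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition: same key -> same partition."""
    if key is None:
        # Keyless messages are instead spread across partitions
        # (round-robin in older clients, sticky batching in newer ones).
        raise ValueError("keyless messages are not hashed to a partition")
    return zlib.crc32(key) % num_partitions

# All writes for one key land in one partition, preserving per-key order.
p = partition_for(b"device-42", 8)
```

Because the mapping is deterministic, every producer in the system agrees on where a given key's records go without any coordination.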
Producers
Consumers
Brokers
Kafka Wire Protocol - http://kafka.apache.org/protocol.html
● Preliminaries
○ Network
○ Partitioning and bootstrapping
○ Partitioning Strategies
○ Batching
○ Versioning and Compatibility
● The Protocol
○ Protocol Primitive Types
○ Notes on reading the request format grammars
○ Common Request and Response Structure
○ Message Sets
● Constants
○ Error Codes
○ Api Keys
● The Messages
● Some Common Philosophical Questions
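The common request structure above is small enough to sketch directly. Below is a hedged illustration of the v0 request framing (a 4-byte big-endian size, then api_key, api_version, correlation_id, and a length-prefixed client_id, then the request body); consult the protocol guide for the current header versions:

```python
import struct

def encode_request(api_key: int, api_version: int,
                   correlation_id: int, client_id: str,
                   body: bytes) -> bytes:
    """Frame a Kafka request: size prefix + v0 header + body."""
    cid = client_id.encode("utf-8")
    # Header: int16 api_key, int16 api_version, int32 correlation_id,
    # then client_id as an int16-length-prefixed string.
    header = struct.pack(">hhih", api_key, api_version,
                         correlation_id, len(cid)) + cid
    payload = header + body
    # Everything after the first 4 bytes is covered by the size prefix.
    return struct.pack(">i", len(payload)) + payload
```

The correlation_id lets a client match responses to requests on a pipelined connection, which is why it appears in every header.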
Data Durability
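Durability in Kafka comes from replication plus producer acknowledgments. A rough sketch of the arithmetic (the config keys are the standard ones, but treat the values as illustrative): with `acks=all`, a write is acknowledged only once every in-sync replica has it, so the cluster can lose brokers up to the gap between the replication factor and `min.insync.replicas` without losing acknowledged data.

```python
# Illustrative settings, not a prescription for any particular cluster.
durable_producer_config = {
    "acks": "all",      # wait for all in-sync replicas before acking
    "retries": 2147483647,
}
durable_topic_config = {
    "replication.factor": 3,
    "min.insync.replicas": 2,
}

def tolerated_broker_failures(replication_factor: int,
                              min_insync_replicas: int) -> int:
    """How many brokers may fail while writes stay available and
    acknowledged data stays durable under acks=all."""
    return replication_factor - min_insync_replicas
```

With 3 replicas and a minimum of 2 in sync, one broker can fail; losing a second makes the partition unavailable for acked writes rather than silently dropping data.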
Client Libraries
Community Clients https://cwiki.apache.org/confluence/display/KAFKA/Clients
● Go (aka golang) Pure Go implementation with full protocol support. Consumer and Producer implementations included, GZIP and Snappy compression supported.
● Python - Pure Python implementation with full protocol support. Consumer and Producer implementations included, GZIP and Snappy compression supported.
● C - High performance C library with full protocol support
● Ruby - Pure Ruby, Consumer and Producer implementations included, GZIP and Snappy compression supported. Ruby 1.9.3 and up (CI runs MRI 2.
● Clojure - Clojure DSL for the Kafka API
● JavaScript (NodeJS) - NodeJS client in a pure JavaScript implementation
Operationalizing Kafka
https://kafka.apache.org/documentation.html#basic_ops
Basic Kafka Operations
● Adding and removing topics
● Modifying topics
● Graceful shutdown
● Balancing leadership
● Checking consumer position
● Mirroring data between clusters
● Expanding your cluster
● Decommissioning brokers
● Increasing replication factor
Kafka “as a Service”
CURRENT STATE OF IMPLEMENTATION
11 STEPS BEFORE ANY BUSINESS VALUE IS CREATED
1 SET UP Instances → AWS / GCE / etc..
2 Repeat above by # of instances
3 SET UP uniformly, harden, secure every machine
4 DOWNLOAD: Apache Kafka
5 LEARN to install, run on multiple nodes / high availability
6 LEARN to run on multiple data centers / multiple racks
7 CONFIGURE nodes, tables specifically by cluster
8 MONITOR performance, isolate bottlenecks
9 OPTIMIZE system / team to hands off through next objective
10 MONITOR for failure and build disaster recovery protocol
11 FAILURE RECOVERY investigation, recovery and spin back up time
AND process must repeat by # of instances and technologies
ELODINA AUTOMATES DEPLOYMENT, SCALING AND MAINTENANCE
Reduce steps and learning curve to a THREE stage repeatable process
DEPLOY
Platform modules allow for deployment in minutes.
SCALE
Grid scales automatically with low latency based on real-time traffic patterns.
OBSERVE
Single destination to observe and troubleshoot from CLI, REST API or GUI.
BUILT-IN FRAMEWORKS DIRECTLY IN PLATFORM
Leading technologies deployable across any compute resource
[Diagram: technologies deployed across compute resources]
IMMEDIATE OPERATIONAL BENEFITS
Removing Fragmentation with Interoperability
Clears the crowded-market decision of which software, or stack of software, to choose and interoperate with your data center.
Immediate Efficiency & Reliability
Operational resources deployed across multiple data centers and regions, streamlined with dynamic compute and automated scheduling capabilities.
Automated Speed and Recovery
Reduce cost and time to market across the development cycle, and automate recovery from failure.
What is Mesos?
Scheduler
Executors
mesos/kafka
https://github.com/mesos/kafka
Scheduler
● Provides the operational automation for a Kafka cluster.
● Manages changes to the brokers' configuration.
● Exposes a REST API for the CLI to use, or for any other client.
● Runs on Marathon for high availability.
● Broker failure management (“stickiness”)
Executor
● The executor interacts with the Kafka broker as an intermediary to the scheduler.
Scheduler & Executor
Typical Operations
● Run the scheduler with Docker
● Run the scheduler on Marathon
● Changing the location where data is stored
● Starting 3 brokers
● View broker log
● High Availability Scheduler State
● Failed Broker Recovery
● Passing multiple options
● Broker metrics
● Rolling restart
Navigating Operations
● Adding brokers to the cluster
● Updating broker configurations
● Starting brokers
● Stopping brokers
● Restarting brokers
● Removing brokers
● Retrieving broker log
● Rebalancing brokers in the cluster
● Listing topics
● Adding topic
● Updating topic
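Since the scheduler exposes a REST API mirroring its CLI commands, the operations above can be driven programmatically. The sketch below only builds the request URL; the host, port, and endpoint path are illustrative assumptions, not verified endpoints — check your scheduler's documentation:

```python
from urllib.parse import urlencode

# Hypothetical scheduler address and endpoint path, for illustration only.
SCHEDULER = "http://scheduler:7000"

def add_broker_url(broker_id: int, cpus: float, mem: int) -> str:
    """Build the REST call equivalent of a CLI 'broker add' command."""
    query = urlencode({"broker": broker_id, "cpus": cpus, "mem": mem})
    return f"{SCHEDULER}/api/broker/add?{query}"

# An HTTP GET of add_broker_url(0, 1.0, 2048) would ask the scheduler
# to register broker 0 with those resources.
```

Driving the same API from the CLI, a GUI, or scripts like this is what makes the broker lifecycle scriptable rather than a manual runbook.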
Kafka as a Service
Kafka Consumers “as a Service”
Topology Master
The Topology Master (TM) manages a topology throughout its entire lifecycle, from the time it’s submitted until it’s ultimately killed. When Heron deploys a topology it starts a single TM and multiple containers. The TM creates an ephemeral ZooKeeper node to ensure that there’s only one TM for the topology and that the TM is easily discoverable by any process in the topology. The TM also constructs the physical plan for a topology, which it relays to different components.
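Why an ephemeral node guarantees a single TM can be shown with a toy, in-memory stand-in for ZooKeeper's semantics (this is a simulation, not the real ZooKeeper API): node creation is atomic, so only the first TM's create succeeds, and the node vanishes with its session so a replacement can take over.

```python
class EphemeralRegistry:
    """Toy model of ephemeral znodes: one node per path, tied to a session."""

    def __init__(self):
        self._nodes = {}

    def create(self, path: str, owner: str) -> bool:
        if path in self._nodes:
            return False            # another TM already holds the node
        self._nodes[path] = owner   # this TM becomes discoverable here
        return True

    def session_expired(self, owner: str) -> None:
        # Ephemeral nodes disappear with their owner's session,
        # freeing the slot so a replacement TM can win create().
        self._nodes = {p: o for p, o in self._nodes.items() if o != owner}
```

A second TM attempting `create()` on the same path simply loses, and any process can read the node to discover the current TM.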
Container
Each Heron topology consists of multiple containers, each of which houses multiple Heron Instances, a Stream Manager, and a Metrics Manager. Containers communicate with the topology’s TM to ensure that the topology forms a fully connected graph. For an illustration, see the figure in the Topology Master section above.
Stream Manager
The Stream Manager (SM) manages the routing of tuples between topology components. Each Heron Instance in a topology connects to its local SM, while all of the SMs in a given topology connect to one another to form a network. Below is a visual illustration of a network of SMs:
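Because every SM connects to every other SM, the network is a full mesh: with one SM per container, the number of inter-SM links grows quadratically. A one-line sketch of that count:

```python
def stream_manager_links(num_containers: int) -> int:
    """Full mesh of Stream Managers: C * (C - 1) / 2 pairwise links,
    one SM per container."""
    return num_containers * (num_containers - 1) // 2
```

So a 5-container topology needs 10 SM-to-SM connections, while individual Heron Instances only ever talk to their one local SM, which keeps instance-level wiring simple.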
Heron Instance
A Heron Instance (HI) is a process that handles a single task of a spout or bolt, which allows for easy debugging and profiling.
Currently, Heron only supports Java, so all HIs are JVM processes, but this will change in the future.
Heron Instance Configuration
HIs have a variety of configurable parameters that you can adjust at each phase of a topology’s lifecycle.
Heron Instance
Back Pressure Built In
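The idea behind built-in back pressure can be pictured as a bounded buffer between stages: when the downstream side lags, the buffer fills and the producing side is forced to slow down instead of dropping tuples. A minimal sketch (in Heron itself, the Stream Manager reacts by pausing reads from spouts rather than raising an error):

```python
from queue import Queue, Full

# Bounded buffer between a fast producer and a stalled consumer.
buffer = Queue(maxsize=3)

accepted, rejected = 0, 0
for tuple_id in range(5):
    try:
        buffer.put_nowait(tuple_id)   # succeeds while there is room
        accepted += 1
    except Full:
        # Back pressure: the producer must wait (here we just count it;
        # Heron's SM would stop reading from the spouts instead).
        rejected += 1
```

The key property is that the system sheds load by slowing producers, not by discarding in-flight data.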
Metrics Manager
Each topology runs a Metrics Manager (MM) that collects and exports metrics from all components in a container. It then routes those metrics to both the Topology Master and to external collectors, such as Scribe, Graphite, or analogous systems.
You can adapt Heron to support additional systems by implementing your own custom metrics sink.
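The shape of such a custom sink is roughly a lifecycle of init, per-record processing, flush, and close (Heron's real interface is the Java `IMetricsSink`; the Python class and method names below are illustrative, not Heron's API):

```python
class PrintSink:
    """Illustrative metrics sink: buffers records, prints them on flush."""

    def init(self, conf: dict) -> None:
        # Called once with sink configuration before any records arrive.
        self.records = []

    def process_record(self, record: dict) -> None:
        # Called for each metrics record routed to this sink.
        self.records.append(record)

    def flush(self) -> None:
        # Called periodically; here we just emit and clear the buffer.
        for r in self.records:
            print(r["name"], r["value"])
        self.records.clear()

    def close(self) -> None:
        self.flush()
```

A real sink would ship the buffered records to Graphite, Scribe, or a similar collector in `flush()` instead of printing them.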
Cluster-level Components
Heron CLI
Heron has a CLI tool called heron that is used to manage topologies. Documentation can be found in Managing Topologies.
Heron Tracker
The Heron Tracker (or just Tracker) is a centralized gateway for cluster-wide information about topologies, including which topologies are running, being launched, being killed, etc. It relies on the same ZooKeeper nodes as the topologies in the cluster and exposes that information through a JSON REST API. The Tracker can be run within your Heron cluster (on the same set of machines managed by your Heron scheduler) or outside of it.
Instructions on running the tracker including JSON API docs can be found in Heron Tracker.
Heron UI
Heron UI is a rich visual interface that you can use to interact with topologies. Through Heron UI you can see color-coded visual representations of the logical and physical plan of each topology in your cluster.
For more information, see the Heron UI document.
Other Kafka Use Cases
STACK EXAMPLE A
Use Case: Real-Time Data Analytics Ingestion
STACK EXAMPLE A+
Use Case: Real-Time Data Analytics Ingestion + Long-Term Storage for Batch
STACK EXAMPLE B
Use Case: Real-Time Data Streaming/Processing
STACK EXAMPLE B+
Use Case: Real-Time Data Streaming/Processing + Feedback Loop
STACK EXAMPLE C
Use Case: Message Queuing
STACK EXAMPLE C+
Use Case: Message Queuing + Priority Management
STACK EXAMPLE D
Use Case: Distributed Akka Remoting for Real-Time Decisioning
STACK EXAMPLE D+
Use Case: Distributed Akka Remoting for Real-Time Decisioning + Long-Term Batch
STACK EXAMPLE E
Use Case: Distributed Trace Services