Scaling MQTT with Apache Kafka

Post on 17-Oct-2014

DESCRIPTION

My slides for ApacheCon North America 2014.

TRANSCRIPT

SCALING MQTT WITH KAFKA
Tim Kellogg
April 7, 2014
@kellogh

•  MQTT broker
•  Protocol onboarding
•  Cloud environment (we’re a startup)

•  Standard
•  Lightweight
   o  At least 2 bytes of overhead per message
•  Easy to parse
   o  Length-prefixed strings
•  Requires very few resources on the client side
   o  Broker keeps track of state
•  Reliable
   o  QoS 1 & 2
   o  Last Will & Testament messages
•  Secure
   o  Username + Password
   o  Tunnel over TLS
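A minimal sketch of what "lightweight" and "length-prefixed strings" mean on the wire, assuming MQTT 3.1.1 framing and a QoS 0 PUBLISH with no flags set. The function names are illustrative and not taken from the talk or from any MQTT library.

```python
import struct


def encode_string(text: str) -> bytes:
    """MQTT string: 2-byte big-endian length, then the UTF-8 bytes."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("MQTT strings are limited to 65535 bytes")
    return struct.pack("!H", len(data)) + data


def encode_remaining_length(length: int) -> bytes:
    """Variable-length 'remaining length' field: 7 bits per byte, low bits first."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if length == 0:
            return bytes(out)


def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """QoS 0 PUBLISH: type/flags byte 0x30, remaining length, topic, payload."""
    body = encode_string(topic) + payload
    return bytes([0x30]) + encode_remaining_length(len(body)) + body


packet = encode_publish_qos0("a/b", b"hi")
print(packet.hex())  # 30070003612f626869
```

For small packets the fixed header is two bytes: one for the packet type and flags, one for the remaining length. A parser only needs to read a length and then that many bytes.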

Publish / Subscribe

[Diagram: three publishers (Pub) publish to Topic/A, Topic/B, and Topic/C through a single Broker; three subscribers (Sub) receive Topic/A, Topic/B, and Topic/C.]

Topics
•  foo/bar/baz
•  com.example/device/17/thermo
•  Patterns
   o  com.example/device/+/thermo
   o  com.example/device/#
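The two wildcard patterns can be illustrated with a small matcher. This is a sketch of the standard MQTT rules (`+` matches exactly one level, `#` matches the remaining levels and must come last); `topic_matches` is a hypothetical helper name, and the special handling of topics starting with `$` is left out.

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if an MQTT subscription pattern matches a concrete topic."""
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(pattern_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, matches the rest.
            return i == len(pattern_levels) - 1
        if i >= len(topic_levels):
            return False  # pattern is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    # No '#' seen: every level must have been consumed.
    return len(pattern_levels) == len(topic_levels)


print(topic_matches("com.example/device/+/thermo", "com.example/device/17/thermo"))  # True
print(topic_matches("com.example/device/#", "com.example/device/17/thermo"))         # True
print(topic_matches("com.example/device/+/thermo", "com.example/device/17/humid"))   # False
```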

Scaling Goals

•  More than 2 Million connected publishers

•  More than 65,000 msg/s

•  Single subscriber

Scaling Goals
•  Amazon’s EC2
•  Horizontal scaling
   o  Reduce cost
   o  Plan for the future
   o  Less impact from downtime

Problems with Scaling MQTT

Load Balancing
•  Which broker to connect to?
   o  DNS load balancing
•  HAProxy
•  QoS 1-2 messages stored in Cassandra
   o  Consistent hash ring
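The idea behind a consistent hash ring can be sketched in a few lines. This is a toy illustration, not Cassandra's implementation: the `HashRing` class, the SHA-256 based token function, and the node names are all assumptions made for the example.

```python
import bisect
import hashlib


def _token(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes
        self._tokens = []  # sorted ring positions
        self._owners = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self._vnodes):
            token = _token(f"{node}#{i}")
            self._owners[token] = node
            bisect.insort(self._tokens, token)

    def remove(self, node):
        self._tokens = [t for t in self._tokens if self._owners[t] != node]
        self._owners = {t: n for t, n in self._owners.items() if n != node}

    def lookup(self, key):
        """Walk clockwise from the key's position to the next node token."""
        if not self._tokens:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._tokens, _token(key)) % len(self._tokens)
        return self._owners[self._tokens[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("client-42")  # stays the same until its owning node leaves
```

The property that matters for horizontal scaling is that removing or adding a node only moves the keys that node owned; every other key keeps its owner.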

Single Subscriber

[Diagram: three publishers (Pub) publish Topic/A, Topic/B, and Topic/C to a single Broker; one subscriber (Sub) receives everything via Topic/#.]

Single Subscriber

[Diagram, shown in two build steps: three publishers (Pub) publish Topic/A, Topic/B, and Topic/C through Load Balancing to three Brokers; a single subscriber (Sub) on Topic/# sits next to one Broker. The second step adds further Load Balancing layers.]

Single Subscriber

[Diagram: three Brokers and a single Subscriber on Topic/#.]

Using HTTP

POST From The Broker

[Diagram: three publishers (Pub) publish Topic/A, Topic/B, and Topic/C to three Brokers; each Broker sends an HTTP POST to one of three Servers. Load Balancing layers appear in the diagram.]

Benefits
•  Easy to load balance
•  Well known & well supported

Drawbacks
•  HTTP is heavy
   o  Headers
   o  Creating & destroying TCP connections
•  Subscriber servers must be available
   o  Retry logic to guarantee delivery
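The retry logic mentioned as a drawback might look roughly like the sketch below. It assumes the broker wraps its HTTP POST in a callable that returns True on success; `deliver_with_retry`, `send`, and the backoff values are hypothetical and not taken from the talk.

```python
import time
from typing import Callable


def deliver_with_retry(
    send: Callable[[bytes], bool],
    message: bytes,
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to deliver a message, backing off exponentially between attempts.

    `send` is an assumed wrapper around the HTTP POST: it returns True when
    the server accepted the message and may raise OSError on network failure.
    """
    for attempt in range(max_attempts):
        try:
            if send(message):
                return True
        except OSError:
            pass  # connection refused, reset, timed out, ...
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False  # caller must keep or re-queue the message
```

Even this small version shows the cost: the broker has to hold the message while the subscriber server is down, and the server can see duplicates if a POST succeeded but the response was lost.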

Apache Kafka

•  Distributed log aggregation framework
•  Server to server
•  “Smart” clients
•  Apache ZooKeeper

•  Append-only files per topic
   o  Client keeps track of what messages it’s processed
•  No topic wildcards
•  Key is used for out of band data
•  device/42/thermo → topic: device-thermo key: 42
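The topic-to-key mapping in the last bullet can be sketched as follows. The slide gives only the one example, so the general rule here (pull one level out as the key, join the remaining levels with "-") is an assumption, as are the names `mqtt_to_kafka` and `key_level`.

```python
from typing import NamedTuple


class KafkaAddress(NamedTuple):
    topic: str
    key: str


def mqtt_to_kafka(mqtt_topic: str, key_level: int = 1) -> KafkaAddress:
    """Split an MQTT topic into a Kafka topic name and a message key.

    The level at index `key_level` (assumed to be the device ID) becomes the
    key; the remaining levels are joined with '-' to form the Kafka topic.
    """
    levels = mqtt_topic.split("/")
    if not 0 <= key_level < len(levels):
        raise ValueError(f"topic {mqtt_topic!r} has no level {key_level}")
    key = levels[key_level]
    rest = levels[:key_level] + levels[key_level + 1:]
    return KafkaAddress(topic="-".join(rest), key=key)


print(mqtt_to_kafka("device/42/thermo"))
# KafkaAddress(topic='device-thermo', key='42')
```

Because Kafka has no topic wildcards, moving the variable part of the MQTT topic into the key keeps the number of Kafka topics small while still carrying the device ID with each message.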

Subscriber Group

[Diagram: three publishers (Pub) and a single Broker.]

Subscriber Group

[Diagram: three publishers (Pub) connect through Load Balancing to three Brokers, which feed Kafka.]
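How a subscriber group can share one fire hose is easier to see with a toy model. The sketch below is an in-memory stand-in, not the Kafka client API: `PartitionedLog`, `GroupConsumer`, the CRC32 partitioner, and the static partition assignment are all assumptions made for illustration.

```python
import zlib


class PartitionedLog:
    """Toy stand-in for a Kafka topic: one append-only list per partition."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: str, value) -> int:
        # Same key -> same partition, so per-key ordering is preserved.
        # (CRC32 is used only for this sketch.)
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p


class GroupConsumer:
    """One member of a subscriber group; it tracks its own read offsets."""

    def __init__(self, log: PartitionedLog, assigned_partitions):
        self.log = log
        self.offsets = {p: 0 for p in assigned_partitions}

    def poll(self):
        records = []
        for p in list(self.offsets):
            new = self.log.partitions[p][self.offsets[p]:]
            records.extend(new)
            self.offsets[p] += len(new)  # the client, not the server, advances
        return records


log = PartitionedLog(num_partitions=4)
consumer_a = GroupConsumer(log, [0, 1])
consumer_b = GroupConsumer(log, [2, 3])
for i in range(10):
    log.append(f"device-{i}", {"temp": 20 + i})
received = consumer_a.poll() + consumer_b.poll()  # each record delivered once
```

Adding capacity in this model means adding partitions and group members, which is the property behind scaling the single "Topic/#" subscriber horizontally.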

Results

•  Linear scaling for fire hose subscriber
•  At least 2 million clients
•  At least 65,000 msg/s

Wish List
•  Security
•  Configuration

Open Source IoT

The Book: Mastering The Internet of Things

Questions?

@kellogh
