October 2016 HUG: Pulsar, a highly scalable, low-latency pub-sub messaging system

Distributed pub/sub platform github.com/yahoo/pulsar Matteo Merli — [email protected] Bay Area Hadoop Meetup — 10/19/2016

Upload: yahoo-developer-network

Post on 11-Jan-2017



What is Pulsar?

▪ Hosted multi-tenant pub/sub messaging platform
▪ Simple messaging model
▪ Horizontally scalable: topics, message throughput
▪ Ordering, durability and delivery guarantees
▪ Geo-replication
▪ Easy to operate (add capacity, replace machines)
▪ A few numbers from production usage:
  › 1.5 years, 1.4 M topics, 100 B msg/day, zero data loss
  › Average publish latency < 5 ms, 99th percentile 15 ms
  › 80+ applications onboarded, self-serve provisioning
  › Presence in 8 data centers

Common use cases

▪ Application integration
  › Server-to-server control, status propagation, notifications
▪ Persistent queue
  › Stream processing, buffering, feed ingestion, task dispatching
▪ Message bus for large-scale data stores
  › Durable log
  › Replication within and across geo-locations

Main features

▪ REST / Java / command-line administrative APIs
  › Provision users / grant permissions
  › User self-administration
  › Metrics for topic / broker usage
▪ Multi-tenancy
  › Authentication / authorization
  › Storage quota management
  › Tenant isolation policies
  › Message TTL
  › Backlog and subscription management tools
▪ Message retention and replay
  › Rollback to redeliver already acknowledged messages

Why build a new system?

▪ No existing solution satisfied the requirements
  › Multi-tenant, 1M topics, low latency, durability, geo-replication
▪ Kafka doesn't scale well with many topics:
  › Storage model based on an individual directory per topic partition
  › Enabling durability kills the performance
▪ Ability to manage large backlogs
▪ Operations are not very convenient
  › e.g. replacing a server requires manual commands to copy the data and involves clients
  › Client access to ZK clusters is not desirable
▪ No scalable support to keep consumer position

Messaging Model

[Diagram: Producer-X and Producer-Y publish to Topic-T. Subscription-A (Exclusive) has one consumer, Consumer-A1. Subscription-B (Shared) has three consumers: Consumer-B1, Consumer-B2 and Consumer-B3.]

Consumer-A1 receives all messages published on T; B1, B2 and B3 receive one third each.
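The difference between the two subscription types can be illustrated with a toy model. This is only a sketch in plain Java and does not use the Pulsar client: the class and method names are hypothetical, and round-robin dispatch is an assumption (the slide only states that each shared consumer receives one third of the messages).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of exclusive and shared subscriptions (hypothetical, not the Pulsar API). */
class SubscriptionSketch {

    /** Exclusive: the single consumer receives every message published on the topic. */
    static Map<String, List<String>> dispatchExclusive(String consumer, List<String> messages) {
        Map<String, List<String>> received = new LinkedHashMap<>();
        received.put(consumer, new ArrayList<>(messages));
        return received;
    }

    /** Shared: messages are spread over the consumers (round-robin is an assumption). */
    static Map<String, List<String>> dispatchShared(List<String> consumers, List<String> messages) {
        Map<String, List<String>> received = new LinkedHashMap<>();
        for (String consumer : consumers) {
            received.put(consumer, new ArrayList<>());
        }
        for (int i = 0; i < messages.size(); i++) {
            String target = consumers.get(i % consumers.size());
            received.get(target).add(messages.get(i));
        }
        return received;
    }

    public static void main(String[] args) {
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < 9; i++) {
            messages.add("msg-" + i);
        }
        // Subscription-A: Consumer-A1 gets all nine messages.
        System.out.println(dispatchExclusive("Consumer-A1", messages));
        // Subscription-B: each of the three consumers gets three messages.
        System.out.println(dispatchShared(
                Arrays.asList("Consumer-B1", "Consumer-B2", "Consumer-B3"), messages));
    }
}
```

Both subscriptions see the full stream published on the topic; only the fan-out inside each subscription differs.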


Client API

Producer

PulsarClient client = PulsarClient.create(
        "http://broker.usw.example.com:8080");

Producer producer = client.createProducer(
        "persistent://my-prop/us-west/my-ns/my-topic");

// Handles retries in case of failure
producer.send("my-message".getBytes());

// Async version:
producer.sendAsync("my-message".getBytes())
        .thenAccept(msgId -> {
            // Message was persisted
        });

Consumer

PulsarClient client = PulsarClient.create(
        "http://broker.usw.example.com:8080");

Consumer consumer = client.subscribe(
        "persistent://my-prop/us-west/my-ns/my-topic",
        "my-subscription-name");

while (true) {
    // Wait for a message
    Message msg = consumer.receive();

    // Process message …

    // Acknowledge the message so that
    // it can be deleted by broker
    consumer.acknowledge(msg);
}
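The topic names used in both snippets appear to follow a fixed layout. The sketch below splits such a name into its parts, assuming the reading domain://property/cluster/namespace/topic that the placeholder values (my-prop, us-west, my-ns, my-topic) suggest. The class is hypothetical and is not part of the Pulsar client library.

```java
/** Hypothetical helper that splits a topic name of the form used on the slide. */
class TopicNameSketch {
    final String domain;
    final String property;
    final String cluster;
    final String namespace;
    final String topic;

    private TopicNameSketch(String domain, String property, String cluster,
                            String namespace, String topic) {
        this.domain = domain;
        this.property = property;
        this.cluster = cluster;
        this.namespace = namespace;
        this.topic = topic;
    }

    /** Parses "domain://property/cluster/namespace/topic" (assumed layout). */
    static TopicNameSketch parse(String name) {
        int separator = name.indexOf("://");
        if (separator < 0) {
            throw new IllegalArgumentException("Missing domain in topic name: " + name);
        }
        String domain = name.substring(0, separator);
        // Limit of 4 keeps any further '/' characters inside the topic part.
        String[] parts = name.substring(separator + 3).split("/", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 path components in: " + name);
        }
        return new TopicNameSketch(domain, parts[0], parts[1], parts[2], parts[3]);
    }

    public static void main(String[] args) {
        TopicNameSketch t = parse("persistent://my-prop/us-west/my-ns/my-topic");
        System.out.println(t.domain + " | " + t.property + " | " + t.cluster
                + " | " + t.namespace + " | " + t.topic);
    }
}
```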


Main client library features

▪ Sync / async operations
▪ Partitioned topics
▪ Transparent batching of messages
▪ Compression
▪ End-to-end checksum
▪ TLS encryption
▪ Individual and cumulative acknowledgment
▪ Client-side stats
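Of the client library features listed above, the two acknowledgment modes are the least self-explanatory, so here is a minimal sketch of how a subscription could track them. It is an illustration in plain Java under stated assumptions: message ids are modelled as consecutive long values, and the class, its methods and the "mark-delete position" name are hypothetical rather than the library's API.

```java
import java.util.TreeSet;

/** Hypothetical tracker for individual and cumulative acknowledgments. */
class AckTrackerSketch {
    // Every message id <= this position is acknowledged and can be deleted.
    private long markDeletePosition = -1;
    // Ids acknowledged individually that lie beyond the mark-delete position.
    private final TreeSet<Long> individuallyAcked = new TreeSet<>();

    /** Individual ack: only this message is marked as processed. */
    void acknowledge(long id) {
        if (id <= markDeletePosition) {
            return; // already covered
        }
        individuallyAcked.add(id);
        advance();
    }

    /** Cumulative ack: this message and everything before it are processed. */
    void acknowledgeCumulative(long id) {
        if (id > markDeletePosition) {
            markDeletePosition = id;
            individuallyAcked.headSet(id, true).clear();
            advance();
        }
    }

    /** Moves the position forward while the next id was individually acked. */
    private void advance() {
        while (individuallyAcked.remove(Long.valueOf(markDeletePosition + 1))) {
            markDeletePosition++;
        }
    }

    boolean isAcknowledged(long id) {
        return id <= markDeletePosition || individuallyAcked.contains(id);
    }

    long markDeletePosition() {
        return markDeletePosition;
    }
}
```

With individual acknowledgment, a gap (an unprocessed message) holds the position back until it is filled; a cumulative acknowledgment moves the position in one step.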

Architecture

Separate layers between brokers and storage (bookies)
‣ Brokers and bookies can be added independently
‣ Traffic can be shifted very quickly across brokers
‣ New bookies will ramp up on traffic quickly

[Diagram: Pulsar cluster. Labels: Producer, Consumer; Broker 1, Broker 2, Broker 3; Bookie 1 to Bookie 5; ZK.]

Architecture

[Diagram: Pulsar cluster. Labels: Broker, Bookie, ZK, Global ZK, Service discovery, Producer app and Consumer app (each with the Pulsar lib), Replication, Managed ledger, BK client, Global replicators, Cache, Dispatcher, Load balancer.]

Broker
‣ End-to-end async message processing
‣ Messages are relayed across producers, bookies and consumers with no copies
‣ Pooled ref-counted buffers
‣ Cache recent messages
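The idea behind ref-counted buffers is that one payload can be shared by several users (for example a pending storage write, the cache and a consumer dispatch) without copying it, and is recycled only when the last user lets go. The sketch below shows the counting logic only; the class is hypothetical and is not the buffer type the broker actually uses, and pooling is left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical reference-counted buffer; illustrates the counting only. */
class RefCountedBufferSketch {
    private final byte[] data;
    // The creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);

    RefCountedBufferSketch(byte[] data) {
        this.data = data;
    }

    /** Registers one more user of the same underlying bytes (no copy is made). */
    RefCountedBufferSketch retain() {
        while (true) {
            int current = refCount.get();
            if (current == 0) {
                throw new IllegalStateException("buffer already released");
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return this;
            }
        }
    }

    /** Drops one reference; returns true when the buffer may go back to a pool. */
    boolean release() {
        while (true) {
            int current = refCount.get();
            if (current == 0) {
                throw new IllegalStateException("buffer already released");
            }
            if (refCount.compareAndSet(current, current - 1)) {
                return current == 1;
            }
        }
    }

    byte[] data() {
        if (refCount.get() == 0) {
            throw new IllegalStateException("buffer already released");
        }
        return data;
    }
}
```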

BookKeeper

▪ Replicated log service
▪ Offers consistency and durability
▪ Why is it a good choice for Pulsar?
  › Very efficient storage for sequential data
  › For each topic we create multiple ledgers over time
  › Very good distribution of IO across all bookies
  › Isolation of writes and reads
  › Flexible model for quorum writes with different tradeoffs
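The "flexible model for quorum writes" can be made concrete with BookKeeper's three per-ledger settings, which the slide does not spell out: ensemble size (bookies used by the ledger), write quorum (copies per entry) and ack quorum (confirmations needed before an entry counts as written). The sketch below is a simplified model with hypothetical names, assuming entries are striped round-robin over the ensemble; it does not call the BookKeeper client.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of quorum writes over an ensemble of bookies (hypothetical names). */
class QuorumSketch {

    /** Indexes of the bookies in the ensemble that store the given entry. */
    static List<Integer> writeSet(long entryId, int ensembleSize, int writeQuorum) {
        if (writeQuorum < 1 || writeQuorum > ensembleSize) {
            throw new IllegalArgumentException("need 1 <= writeQuorum <= ensembleSize");
        }
        List<Integer> bookies = new ArrayList<>();
        for (int i = 0; i < writeQuorum; i++) {
            // Consecutive entries start on consecutive bookies, spreading the IO.
            bookies.add((int) ((entryId + i) % ensembleSize));
        }
        return bookies;
    }

    /** An entry is confirmed once enough bookies of its write set have responded. */
    static boolean isConfirmed(int acksReceived, int ackQuorum) {
        return acksReceived >= ackQuorum;
    }

    public static void main(String[] args) {
        // Ensemble of 5 bookies, 3 copies per entry.
        for (long entryId = 0; entryId < 5; entryId++) {
            System.out.println("entry " + entryId + " -> bookies " + writeSet(entryId, 5, 3));
        }
    }
}
```

A larger ensemble with the same write quorum spreads load over more bookies, while a smaller ack quorum trades some safety margin for lower write latency.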

BookKeeper - Storage

▪ A single bookie can serve and store thousands of ledgers
▪ Writes go to the journal, reads come from the ledger device:
  › Prevents read activity from impacting write latency
  › Writes are added to the in-memory write cache and committed to the journal
  › Write cache is flushed in the background to a separate ledger device
▪ Entries are sorted to allow for mostly sequential reads
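The write path described on this slide can be sketched as a small in-memory model. It is an illustration only: the class and field names are hypothetical, lists stand in for the journal and ledger devices, the flush is triggered by hand rather than by a background thread, and sorting by (ledger id, entry id) is an assumption about what "sorted" means here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** In-memory model of a bookie's write path (hypothetical, not BookKeeper code). */
class BookieStorageSketch {

    static final class Entry {
        final long ledgerId;
        final long entryId;
        final String payload;

        Entry(long ledgerId, long entryId, String payload) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
            this.payload = payload;
        }
    }

    private final List<Entry> journal = new ArrayList<>();      // append-only, arrival order
    private final List<Entry> writeCache = new ArrayList<>();   // in memory, not yet flushed
    private final List<Entry> ledgerDevice = new ArrayList<>(); // flushed, serves reads

    /** A write goes to the write cache and is committed to the journal. */
    void addEntry(long ledgerId, long entryId, String payload) {
        Entry entry = new Entry(ledgerId, entryId, payload);
        writeCache.add(entry);
        journal.add(entry);
    }

    /** Flushes the cache to the ledger device, sorted so each ledger is contiguous. */
    void flush() {
        writeCache.sort(Comparator.comparingLong((Entry e) -> e.ledgerId)
                .thenComparingLong(e -> e.entryId));
        ledgerDevice.addAll(writeCache);
        writeCache.clear();
    }

    int journalSize() {
        return journal.size();
    }

    /** Order of entries on the ledger device, as "ledgerId:entryId" strings. */
    List<String> ledgerDeviceOrder() {
        List<String> order = new ArrayList<>();
        for (Entry e : ledgerDevice) {
            order.add(e.ledgerId + ":" + e.entryId);
        }
        return order;
    }
}
```

Entries from many ledgers arrive interleaved in the journal, but after the flush each ledger's entries sit next to each other, which is what makes reads mostly sequential.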

Performance — Single topic throughput and latency

[Chart: throughput and 99th-percentile publish latency, 1 topic, 1 producer. X axis: throughput (msg/s), log scale from 1,000 to 10,000,000. Y axis: latency (ms), 0 to 6. Series: 10-byte, 100-byte and 1 KB messages. One data label reads 1,800,000.]

Final Remarks

• Check out the code and docs at github.com/yahoo/pulsar

• Give feedback or ask for more details on the mailing lists:
  • Pulsar-Users
  • Pulsar-Dev