[@naukriengineering] messaging queues

Rabbit MQ High Powered Messaging with Apps Mayank Mishra Rajendra Bera Rohit Sharma

Upload: naukricom

Post on 16-Apr-2017


Category:

Engineering



TRANSCRIPT

RabbitMQ: High Powered Messaging with Apps

Mayank Mishra, Rajendra Bera, Rohit Sharma

Message Queue

Message Queuing (MSMQ) technology enables applications running at different times to communicate across heterogeneous networks and systems that may be temporarily offline.

Applications send messages to queues and read messages from queues.

Messaging enables software applications to connect and scale.

Messaging is asynchronous, decoupling applications by separating the sending and receiving of data.
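As a rough illustration of that decoupling, here is a minimal in-process sketch that uses Python's standard `queue` and `threading` modules in place of a real broker. The function name `run_demo` and the sentinel `STOP` are made up for this example. The producer finishes enqueuing before the consumer even starts, so the two sides never need to run at the same time.

```python
import queue
import threading


def run_demo(messages):
    """Send messages through an in-process queue; return what the consumer received."""
    q = queue.Queue()      # stands in for the broker's queue
    received = []
    STOP = object()        # sentinel telling the consumer to exit

    def consumer():
        while True:
            item = q.get()  # blocks until a message is available
            if item is STOP:
                break
            received.append(item)

    # The producer enqueues everything up front and does not wait for a reader.
    for m in messages:
        q.put(m)
    q.put(STOP)

    # The consumer starts later and drains whatever is waiting.
    t = threading.Thread(target=consumer)
    t.start()
    t.join()
    return received


if __name__ == "__main__":
    print(run_demo(["email-1", "email-2", "email-3"]))
```

A real message broker adds what this sketch lacks: the queue lives in a separate process, so it survives restarts of either application.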

Why do we need messaging?

Sending emails in bulk

Video encoding: if you're building YouTube, you probably won't make users wait until their video has been converted from AVI to Flash / x264 / WebM before telling them the file was uploaded correctly.

Pushing a tweet: when a user has authorized their account on your app (deferred tweets, API limit reached, etc.).

Gamification in referral (multiple API calls across different applications).

Real World Scenarios

Nokia HERE Maps: real time (800,000 msg/min)

AMQP: Advanced Message Queuing Protocol

Components of AMQP

Message Broker - manages exchanges and queues

Channel - logical state of a connection (name, durability, persistence, delivery mode etc.)

Exchanges - where messages are sent; each exchange routes messages to queues.

Queues - hold messages for consumption by clients.

Bindings - relationship between queues and exchanges

Messages - messages . . . consisting of two parts: routing key & body

Message Headers

Routing Key - used to route messages; how it is interpreted depends on the type of exchange (topic exchanges accept the wildcards * and #)

Priority - ranges from 0 to 9; gives a message priority over other messages

Delivery Mode - whether the message needs to be persisted or not (durability)

Expiration - how long the broker keeps a message before treating it as expired (TTL can be set for both messages and queues)
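The header fields above can be pictured with a small in-memory sketch. It is not RabbitMQ's implementation. The `Message` and `PriorityQueueSketch` classes are hypothetical, and two details are assumptions made for the example: a larger priority number means more urgent, and expiry is stored as an absolute timestamp.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    routing_key: str
    body: str
    priority: int = 0                    # 0-9; larger = more urgent (assumed)
    persistent: bool = False             # delivery mode; carried but not acted on here
    expires_at: Optional[float] = None   # absolute expiry time; None means no TTL


class PriorityQueueSketch:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()    # tie-breaker: FIFO order within one priority

    def publish(self, msg):
        if not 0 <= msg.priority <= 9:
            raise ValueError("priority must be in 0-9")
        # Negate the priority because heapq pops the smallest tuple first.
        heapq.heappush(self._heap, (-msg.priority, next(self._seq), msg))

    def get(self, now):
        """Return the next unexpired message, or None if none is left."""
        while self._heap:
            _, _, msg = heapq.heappop(self._heap)
            if msg.expires_at is None or now < msg.expires_at:
                return msg
            # Expired: drop it and keep looking.
        return None
```

Passing `now` explicitly keeps the sketch deterministic; a real broker would read its own clock.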

[Diagram: producers, RabbitMQ server (filters, queue), consumers]

RabbitMQ Features

Reliability

Flexible Routing

Clustering

Federation

Shovel

Multi-Protocol

RabbitMQ Features (continued)

Many Clients

Management UI

Tracing

Plugin System

Large Community Support

Routing / Subscribing

Direct exchange: a message goes to the queues whose binding key exactly matches the routing key of the message.
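The exact-match rule of a direct exchange can be sketched in a few lines of plain Python. This models only the routing decision. The class `DirectExchange` and the queue names used with it are hypothetical, and no real broker or client library is involved.

```python
from collections import defaultdict


class DirectExchange:
    def __init__(self):
        self._bindings = defaultdict(list)   # binding key -> bound queue names
        self.queues = defaultdict(list)      # queue name -> delivered message bodies

    def bind(self, queue_name, binding_key):
        self._bindings[binding_key].append(queue_name)

    def publish(self, routing_key, body):
        """Deliver body to every queue whose binding key equals routing_key."""
        targets = self._bindings.get(routing_key, [])
        for name in targets:
            self.queues[name].append(body)
        return list(targets)


if __name__ == "__main__":
    ex = DirectExchange()
    ex.bind("mail_q", "email")
    ex.bind("sms_q", "sms")
    print(ex.publish("email", "welcome"))   # only mail_q is bound to "email"
```

In this sketch a message whose routing key matches no binding is delivered nowhere.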

Topic exchange: a message sent with a particular routing key is delivered to all the queues bound with a matching binding key (binding keys may contain the wildcards * and #).
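For topic exchanges, "matching" means pattern matching over dot-separated words: `*` stands for exactly one word and `#` for zero or more words. The function below is a sketch of that rule, not RabbitMQ's own matcher, and the name `topic_matches` is made up for this example.

```python
def topic_matches(binding_key, routing_key):
    """Return True if routing_key matches the topic pattern in binding_key."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p, w):
        # p indexes into the pattern, w into the routing-key words.
        if p == len(pattern):
            return w == len(words)
        if pattern[p] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w < len(words) and pattern[p] in ("*", words[w]):
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


if __name__ == "__main__":
    for key in ("logs.error", "logs.error.db", "logs"):
        print(key, topic_matches("logs.*", key), topic_matches("logs.#", key))
```

A binding key without wildcards behaves like a direct-exchange binding, because every word must then match literally.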

Apache Kafka

Why Kafka? When we have…

Aren't they good?

*Apache ActiveMQ, JBoss HornetQ, ZeroMQ, and RabbitMQ are the respective brands of the Apache Software Foundation, JBoss Inc., iMatix Corporation, and VMware Inc.

They are all good, but not for all use cases.

● Transportation of logs

● Activity Stream in Real time.

● Collection of performance metrics

◦ CPU/IO/memory usage

◦ Application specific

⚫ Time taken to load a web page.

⚫ No. of requests.

⚫ No. of hits on a particular page/URL.

So what are the Use-cases…

● Scalable: needs to be highly scalable; there is a lot of data.

● Reliability of messages: what if I lose a small number of messages? Is that fine with me?

● Distributed : Multiple Producers, Multiple Consumers

● High-throughput: (100k+/sec)

What is Common?

Introduction

● A high-throughput, distributed, publish-subscribe based messaging system.

● A Kind of Data Pipeline

● Does not follow JMS Standards

[Diagram: Producers, Kafka Broker (Partition 1), Kafka Broker (Partition 2), Zookeeper, and Consumers 1, 2 and 3 (all groupId1). Edge labels: Event Push, Event Polling, Handshake, Coordination, Store Consumed Offset and Watch for Cluster event.]

* Two consumers of the same consumer group cannot read from the same partition.

Architecture

● Feeds of messages are categorized into topics.

● Topics are broken up into ordered commit logs called partitions.

● #parallelism = #partitions

Topic/Partitions
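To make the "ordered commit log" idea concrete, here is a toy model of a topic as a list of append-only partitions. It is an illustration, not Kafka's storage code. The class `Topic` is hypothetical, and routing records by a CRC32 hash of a key is an assumption made for this sketch, not Kafka's actual partitioner.

```python
import zlib


class Topic:
    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an append-only list; the list index is the offset.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        """Append value to the partition chosen from key; return (partition, offset)."""
        # Assumption: a stable hash of the key, so equal keys share a partition.
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        log = self.partitions[p]
        log.append(value)
        return p, len(log) - 1

    def read(self, partition, offset, max_messages=10):
        """Return up to max_messages starting at offset; the log is not modified."""
        return self.partitions[partition][offset:offset + max_messages]
```

Ordering is guaranteed only within one partition, which is why partitions set the limit on parallelism: each can be written and read independently of the others.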

● Multiple consumers can read from the same topic.

● Each consumer manages its own offset.

● Messages stay on Kafka for a configured retention period.

● Max #consumers (per consumer group) = #partitions

Consumer
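The two consumer rules, one owner per partition within a group and each consumer tracking its own offset, can be sketched as follows. The round-robin assignment is a simplification chosen for this example, and `assign_partitions` and `Consumer` are hypothetical names rather than part of any Kafka client.

```python
def assign_partitions(num_partitions, consumers):
    """Round-robin partitions over a group's consumers; one owner per partition."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        owner = consumers[p % len(consumers)]
        assignment[owner].append(p)
    return assignment


class Consumer:
    def __init__(self, name):
        self.name = name
        self.offsets = {}   # partition -> next offset this consumer will read

    def poll(self, logs, partition, max_messages=2):
        """Pull the next batch from a partition log and advance our own offset."""
        start = self.offsets.get(partition, 0)
        batch = logs[partition][start:start + max_messages]
        self.offsets[partition] = start + len(batch)
        return batch


if __name__ == "__main__":
    # Three consumers, two partitions: the third consumer is left with nothing.
    print(assign_partitions(2, ["c1", "c2", "c3"]))
```

Because polling never removes anything from the log, a consumer in a different group can read the same partition from offset 0 without affecting the first one.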

Data Flow


● Filesystem cache: reduces the number of disk reads.

● Zero-copy transfer of messages - http://www.ibm.com/developerworks/library/j-zerocopy/

● Batching of messages: reduces network calls by a factor of the batch size.

● Eventual consistency: a trade-off between consistency and durability.

● The broker does not push messages to consumers; consumers poll messages from the broker.

● Automatic producer load balancing.

● Zookeeper helps in cluster formation and rebalancing of brokers/consumers.

Design Elements
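The batching claim above is easy to check with a small sketch: buffering messages and sending them in groups of `batch_size` cuts the number of sends from n to about n / batch_size. `BatchingProducer` and its `send` callback are hypothetical stand-ins, and a plain function call plays the role of a network request.

```python
class BatchingProducer:
    def __init__(self, batch_size, send):
        self.batch_size = batch_size
        self._send = send          # callable standing in for one network request
        self._buffer = []
        self.network_calls = 0

    def produce(self, message):
        """Buffer a message; send automatically once a full batch has accumulated."""
        self._buffer.append(message)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered as a single call (no-op when empty)."""
        if self._buffer:
            self._send(list(self._buffer))
            self.network_calls += 1
            self._buffer.clear()


if __name__ == "__main__":
    sent = []
    producer = BatchingProducer(batch_size=100, send=sent.append)
    for i in range(1000):
        producer.produce(i)
    producer.flush()
    print(producer.network_calls)   # 1000 messages in batches of 100
```

The cost is latency: a message may sit in the buffer until the batch fills, and this sketch has no time-based flush.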

Performance

Credit : http://research.microsoft.com/en-us/UM/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf

[Figures: Producer Performance; Consumer Performance]

Powered By

Credit: https://cwiki.apache.org/confluence/display/KAFKA/Powered+By

Thank You !