Distributed and concurrent programming with RabbitMQ and EventMachine - Rails Underground 2009
“Divide and conquer riding rabbits and trading gems”
- a tale about distributed programming -
Paolo Negri @hungryblank
• http://www.slideshare.net/hungryblank
• http://github.com/hungryblank/rabbit_starter
About me
[Timeline figure, time axis: GNU/Linux, DBs (“systems”), Perl, Python, Ruby, PHP]
Summary:
Distributed Concurrent Programming
http://www.flickr.com/photos/myxi/448253580
RabbitMQ
Control
The problem
Given a dictionary of 1,000,000 search phrases,
compare for each one the first page of results on google.com and bing.com
Don’t do it. It’s a violation of the terms of service!
(Sorry, I needed an example that required little or no explanation)
Comparison
The whole process
Great! but...
• Fetching, parsing and analyzing 2,000,000 pages of results will take a long time
• We want to collect the statistics in a limited time, and this sequential approach is not quick enough
Distributed Computing
A method of solving a computational problem by dividing it into
many tasks that run simultaneously on many hardware or software systems
(Wikipedia)
Map Reduce "Map" step
The master node takes the input, chops it up into smaller sub-problems, and distributes
those to worker nodes. (Wikipedia)
Problems:
• How many nodes?
• How many workers?
• Distribution mechanism to feed the workers?
What about queuing?
• the master node takes the input, chops it up into smaller sub-problems, and publishes them in a queue
• workers independently consume the content of the queue
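The queue-based split above can be sketched in a single Ruby process. This is only an illustration: Ruby's thread-safe Queue stands in for a RabbitMQ queue, threads stand in for worker processes, and `run_job_queue` and the `upcase` placeholder job are invented names, not code from the slides.

```ruby
# In-process sketch: Ruby's Queue plays the role of the message queue.
require 'thread'

def run_job_queue(phrases, worker_count)
  queue   = Queue.new
  results = Queue.new

  # Master: chop the input into one sub-problem per phrase and publish them.
  phrases.each { |phrase| queue << phrase }
  # One stop marker per worker so that every worker terminates.
  worker_count.times { queue << :stop }

  workers = Array.new(worker_count) do |id|
    Thread.new do
      # Workers feed themselves: each pops the next job when it is ready.
      while (job = queue.pop) != :stop
        results << [id, job.upcase] # placeholder for fetch + parse + compare
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

processed = run_job_queue(%w[ruby erlang rabbitmq], 2)
puts processed.map { |_id, result| result }.sort.inspect
```

The master never decides which worker gets which phrase, so the number of workers can change without touching the publishing side.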
Here comes
• RabbitMQ is an implementation of AMQP, the emerging standard for high performance enterprise messaging
• It’s open source
• Written in Erlang
Erlang?
• general-purpose concurrent programming language designed by Ericsson
• first version written by J. Armstrong in 1986
• distributed
• fault tolerant
• soft real time
• high availability
Erlang - is coming back
• Projects
  • CouchDB - RESTful document storage
  • WebMachine - REST toolkit
  • Nitrogen - web framework
  • Mochiweb - web framework
• Ruby shops using it
  • 37signals - Campfire
  • Heroku
+ Erlang
It’s messages all the way down
Install it
• sudo apt-get install rabbitmq-server
• sudo gem install tmm1-amqp
Note: RabbitMQ must be v1.6.0 and the amqp gem v0.6.4 to follow the code in the slides
Do it! - master node
Do it! - worker node
Get for free
• Decoupling master/worker
• Workers take care of feeding themselves
• Flexible number of workers
Behind the scenes
[Diagram, four animation frames: the Master publishes msg A to the Exchange, which routes it to the Queue consumed by Worker1, Worker2 and Worker3.]
RabbitMQ
• Multiple exchanges (and multiple types of exchange)
• Multiple queues
• Queues are connected by bindings to exchanges
• Exchanges route messages to queues
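As a rough mental model of these four points, the toy classes below mimic exchanges, bindings and queues in memory. `ToyExchange` and `toy_routing_demo` are hypothetical names made up for this sketch, plain arrays stand in for queues, and none of this is the amqp gem API. Only two exchange types, direct and fanout, are modelled.

```ruby
# Toy in-memory model of exchanges, bindings and queues (not the amqp gem API).
class ToyExchange
  def initialize(type)
    @type     = type # :direct or :fanout
    @bindings = []   # pairs of [routing_key, queue]
  end

  # A binding connects a queue to this exchange.
  def bind(queue, routing_key = nil)
    @bindings << [routing_key, queue]
  end

  # Route a message to the bound queues according to the exchange type:
  # fanout copies it everywhere, direct matches the routing key exactly.
  def publish(message, routing_key = nil)
    @bindings.each do |key, queue|
      queue << message if @type == :fanout || key == routing_key
    end
  end
end

def toy_routing_demo
  jobs = [] # arrays stand in for queues
  logs = []

  direct = ToyExchange.new(:direct)
  direct.bind(jobs, 'jobs')
  direct.bind(logs, 'logs')
  direct.publish('fetch ruby', 'jobs') # reaches only the jobs queue

  fanout = ToyExchange.new(:fanout)
  fanout.bind(jobs)
  fanout.bind(logs)
  fanout.publish('halt')               # reaches every bound queue

  { jobs: jobs, logs: logs }
end

p toy_routing_demo
```

Publishers only ever talk to an exchange; which queues receive the message is decided entirely by the bindings.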
RabbitMQ
• Exchanges and queues have names
• BUT direct exchanges created implicitly are not public and don’t have a name
• Queues and messages are resident in RAM and can be persisted on disk (at a performance price)
What and where
[Diagram: the Master (Ruby) and three Workers (Ruby) connect over TCP/IP to RabbitMQ (Erlang), which hosts the Exchange and the Queue.]
Problem #1
If a worker has a problem we might lose one or more messages
http://www.flickr.com/photos/danzen/2288625136/
Solution - ACK in worker
Acknowledgement
• messages - 1 or more, depending on the prefetch settings - are passed to the client
• the client processes and acknowledges the messages one by one
• if the client connection closes => unacknowledged messages go back in the queue
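The three rules above can be traced with a small in-memory model. `ToyAckQueue` and its methods are hypothetical names for this sketch, not the broker's or the amqp gem's interface; the point is only that a delivered message is remembered until it is acknowledged.

```ruby
# Toy model of acknowledgements (hypothetical class, not the amqp gem API).
class ToyAckQueue
  def initialize(messages)
    @ready   = messages.dup
    @unacked = {} # consumer name => messages delivered but not yet acked
  end

  # Hand up to `prefetch` messages to a consumer without forgetting them.
  def deliver(consumer, prefetch)
    batch = @ready.shift(prefetch)
    (@unacked[consumer] ||= []).concat(batch)
    batch
  end

  # The consumer confirms one message; only now is it gone for good.
  def ack(consumer, message)
    @unacked.fetch(consumer).delete(message)
  end

  # Connection closed: unacknowledged messages go back in the queue.
  def connection_closed(consumer)
    requeued = @unacked.delete(consumer) || []
    @ready.unshift(*requeued)
  end

  def ready
    @ready.dup
  end
end

def toy_ack_demo
  queue = ToyAckQueue.new(%w[a b c d])
  batch = queue.deliver('worker1', 2) # worker1 receives "a" and "b"
  queue.ack('worker1', batch.first)   # "a" is processed and acknowledged
  queue.connection_closed('worker1')  # worker1 dies before acking "b"
  queue.ready
end

p toy_ack_demo
```

In this run "a" is gone because it was acknowledged, while "b" is back in the queue ahead of "c" and "d", ready for another worker.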
Problem #2
No convenient way to control the workers
http://www.flickr.com/photos/streetfly_jz/2770586821/
System queue - worker
System queue - control
• save it as system_command.rb
• ruby system_command.rb halt
System queue
[Diagram, three animation frames: the Control script publishes msg A to the Exchange, which copies it to Queue1, Queue2 and Queue3, one queue per worker (Worker1, Worker2, Worker3).]
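A minimal in-process sketch of this idea follows, assuming each worker owns a private system queue and a control command is copied to all of them, as a fanout exchange would do. Threads and Ruby's Queue replace the real workers and broker, and `broadcast` and `run_workers_until_halt` are invented names. The workers here only listen for commands; a real worker would also be consuming jobs.

```ruby
require 'thread'

# Copy one command to every worker's private system queue,
# which is the job a fanout exchange does on the broker.
def broadcast(system_queues, command)
  system_queues.each { |queue| queue << command }
end

def run_workers_until_halt(worker_count)
  system_queues = Array.new(worker_count) { Queue.new }
  halted        = Queue.new

  workers = system_queues.each_with_index.map do |system_queue, id|
    Thread.new do
      loop do
        command = system_queue.pop # blocks until a command arrives
        next unless command == 'halt'
        halted << "worker#{id + 1}"
        break
      end
    end
  end

  broadcast(system_queues, 'status') # unknown to this sketch, so ignored
  broadcast(system_queues, 'halt')   # stops every worker
  workers.each(&:join)
  Array.new(halted.size) { halted.pop }.sort
end

p run_workers_until_halt(3)
```

Because every worker has its own queue, one published command reaches all of them instead of being consumed by whichever worker happens to read first.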
EventMachine
• Non-blocking IO and lightweight concurrency
• Eliminates the complexities of high-performance threaded network programming
It is an implementation of the Reactor pattern
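To make the Reactor pattern concrete, here is a deliberately tiny reactor written with Ruby's standard `IO.select`. It is not how EventMachine is implemented; `ToyReactor` and `toy_reactor_demo` are hypothetical names, and pipes stand in for network sockets.

```ruby
# Toy reactor: one thread, one loop, callbacks fired when an IO is readable.
class ToyReactor
  def initialize
    @handlers = {} # IO object => callback
  end

  # Register interest in an IO; the block runs once data can be read.
  def on_readable(io, &callback)
    @handlers[io] = callback
  end

  # Wait for readiness events and dispatch them until no handlers remain.
  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)
      ready.each do |io|
        callback = @handlers.delete(io)
        callback.call(io.gets)
        io.close
      end
    end
  end
end

def toy_reactor_demo
  reactor  = ToyReactor.new
  received = []
  # Pipes stand in for network sockets.
  %w[google bing].each do |name|
    reader, writer = IO.pipe
    writer.puts("#{name} page")
    writer.close
    reactor.on_readable(reader) { |line| received << line.chomp }
  end
  reactor.run
  received.sort
end

p toy_reactor_demo
```

The calling code never blocks on a single read: it registers callbacks and the loop decides when each one runs.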
[Diagram, time axis, comparing a Ruby process without EM and with EM. Each timeline shows: code, network operation, use of the network operation result. The diagram also marks a callback and the periods in which the process is free.]
EventMachine
The amqp gem is built on top of EventMachine => you’re in a context where you can leverage concurrent programming
EM - Deferrables
“The Deferrable pattern allows you to specify any number of Ruby code blocks that will be
executed at some future time when the status of the Deferrable object changes.”
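The quoted idea can be illustrated with a small pure-Ruby class. `ToyDeferrable` is a hypothetical stand-in written for this sketch; EventMachine ships its own `EventMachine::Deferrable` module, which is richer than this. One simplification is assumed here: the status is settled only once.

```ruby
# Pure-Ruby illustration of the Deferrable idea (hypothetical class).
class ToyDeferrable
  def initialize
    @status    = :unknown
    @callbacks = []
    @errbacks  = []
    @args      = []
  end

  # Register a block to run when the status becomes :succeeded.
  def callback(&block)
    if @status == :succeeded
      block.call(*@args)
    else
      @callbacks << block
    end
    self
  end

  # Register a block to run when the status becomes :failed.
  def errback(&block)
    if @status == :failed
      block.call(*@args)
    else
      @errbacks << block
    end
    self
  end

  def succeed(*args)
    settle(:succeeded, @callbacks, args)
  end

  def fail(*args)
    settle(:failed, @errbacks, args)
  end

  private

  # Change the status once and run the blocks waiting for that status.
  def settle(status, blocks, args)
    return unless @status == :unknown
    @status = status
    @args   = args
    blocks.each { |block| block.call(*args) }
  end
end

def toy_deferrable_demo
  log  = []
  page = ToyDeferrable.new
  page.callback { |body| log << "parsed #{body}" }
  page.errback  { |error| log << "failed: #{error}" }
  log << 'request sent'
  page.succeed('google results') # e.g. called when the response arrives
  log
end

p toy_deferrable_demo
```

The blocks are registered before the result exists, and the code that issued the request carries on immediately; the parsing block runs only when `succeed` is called.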
Deferrables
[Diagram, time axis: without deferrables, GooglePage and BingPage are fetched one after the other; with deferrables, GooglePage and BingPage are fetched at the same time.]
Problem #3
How many of them?
What are they doing?
http://www.flickr.com/photos/philocrites/341850461/
Heartbeat - worker
Heartbeat monitor
Heartbeat queue
[Diagram, three animation frames: the workers (Worker1, Worker2, Worker3) publish heartbeat messages (msg A, msg B) to the Exchange, which routes them to the Queue read by the Monitor.]
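The monitor side of a heartbeat scheme can be sketched without a broker. In the hypothetical `ToyHeartbeatMonitor` below, each heartbeat is assumed to carry a worker id, a status string and a timestamp; that message format and the 10-second timeout are assumptions of this example, not taken from the slides.

```ruby
# Toy heartbeat monitor (hypothetical names; the message format is assumed).
class ToyHeartbeatMonitor
  def initialize(timeout)
    @timeout   = timeout # seconds of silence before a worker counts as gone
    @last_seen = {}      # worker id => { at: time, status: text }
  end

  # Called for every heartbeat message taken from the heartbeat queue.
  def receive(worker_id, status, at)
    @last_seen[worker_id] = { at: at, status: status }
  end

  # Workers whose latest heartbeat is recent enough.
  def alive(now)
    @last_seen.select { |_id, seen| now - seen[:at] <= @timeout }
  end

  # Answers "how many of them?" and "what are they doing?".
  def report(now)
    alive(now).map { |id, seen| "#{id}: #{seen[:status]}" }.sort
  end
end

def toy_heartbeat_demo
  monitor = ToyHeartbeatMonitor.new(10)
  monitor.receive('worker1', 'busy', 100)
  monitor.receive('worker2', 'idle', 100)
  monitor.receive('worker1', 'idle', 112) # worker2 stays silent
  monitor.report(115)
end

p toy_heartbeat_demo
```

Timestamps are plain numbers here so the run is deterministic; with real workers they would come from the clock at publish time.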
Clustering
[Diagram: RabbitMQ nodes A, B and C, connected to each other over TCP/IP.]
async vs sync
Synchronous clients on GitHub
• famoseagle/carrot
• celldee/bunny
Rabbits on github
• danielsdeleo/moqueue - tests/stubbing
• Check out ezmobius/nanite - “self assembling fabric of ruby daemons”
• Keep an eye on tonyg/rabbithub - implementation of pubsubhubbub (PubSub over REST)
• auser/alice - web app to monitor RabbitMQ
More rabbits on github
• tmm1/em-spec
• eventmachine/eventmachine
• tmm1/amqp
• macournoyer/thin
Q&A
?
Thanks! (Thanks Mark!)
Paolo Negri / hungryblank