ASYNCHRONOUS PROGRAMMING IN PYTHON
With Rocks In
Wednesday, 31 August, 11
Not really covering Tornado
Introduction to Asynchronous
Still intermediate
Just mostly Twisted
... with some Tornado.
Introductions
• Hi, I'm Aurynn
• This is a hedgehog
I’m more like a Magpie
• Shiny things are SO COOL
• I could talk about how shiny they are until
http://www.flickr.com/photos/cmg2011/5147250751/
Look! A particularly shiny thing!
Buzzword Bingo
• Much ado about Node.js
• Event Driven!
• Tornado, from Facebook!
• Event loops!
• “Web Scale!”
Lots of Chatter
• Without lots of research, it’s just kind of noise
http://en.wikipedia.org/wiki/File:Carl_Friedrich_Gauss.jpg
Past the buzzwords, useful ideas
Asynchronous programming, or
Doing more than one thing at once
... sort of.
And it’s the “sort of” that is important
A TERRIBLY BRIEF, PROBABLY INACCURATE HISTORY
Threading, you’ve heard of it
• Really common
• Java, .NET, even some Python
• Super awesome! Shared memory, shared scopes, fun all around..
Surprisingly good, until it isn’t
• Very difficult to access shared state safely
• Race conditions
• Even experts have a hard time of it
• Generally hard to do right
Multiprocessing
• Let the kernel care!
• Fairly easy to write MP code on unix-likes
• Can even go multi-system
The kernel cares about when things happen, not you.
No problems with locks or race conditions, since the processes don’t share a memory region.
fork() makes life so easy!
The MP model even makes multiple systems a viable approach: it’s pretty trivial to SSH into another computer and run a program, or a batch of programs.
http://www.flickr.com/photos/epw/2876377014/
Hard to share Data
• It’s not easy to send data between processes
• Parsing stdout, or trying to get a shmem implementation working.
• import multiprocessing can be.. quirky.
Asynchronous!
• Like threading, everything is in a single process
• All my Variables, All the Time
• No race conditions (mostly)
• Guarantee of no concurrent execution
http://www.flickr.com/photos/rachelpasch/3754315974/
Not all Unicorns and Rainbows
• You have to Let Go
• Bending your mind to the Asynchronous Way is still hard
• A single mistake can hang
• Probably going to be slower.
Just like MP and threads, event loops have their own caveats and major constraints.
Your code can’t run indefinitely, and the longer it runs, the longer your process stalls.
Like threading, it’s still Not Easy to get your head around how to write asynchronous code, and this is something else we’ll go into in a bit more detail.
Single mistakes, not catching your errors in The Approved Way? You can very easily trash your entire program and cause yourself to hang. Why does this happen? It comes back to the first point of You Need to Let Go.
http://www.flickr.com/photos/digitalpapercuts/5737975961
SO REALLY, WHAT IS ASYNC?
So, I’ve made some broad generalizations about event loops, and the caveats they bring to the table.
Let’s go into some more detail about what they do and are, and how those caveats actually work, and look at some code to really show how to work in the Asynchronous Way.
What is an event loop?
So, to get this far, we haven’t really answered the first question: what *is* an event loop?
• A long-running while loop
• When an event triggers, the loop catches this fact
• Events are pretty generic
At its heart, an event loop is just a long-running while loop, iterating over a set of callbacks, or events to be run in the future.
An event gets triggered, often in the form of a socket message, the completion of another function, or a timeout firing, and the loop catches this on its next pass.
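To make the "long-running while loop" idea concrete, here is a deliberately tiny sketch of one. It is not how Twisted or Tornado implement their loops; the ToyEventLoop class and its fire and run methods are names made up for this illustration, and real loops wait on sockets and timers rather than a plain queue.

```python
from collections import deque


class ToyEventLoop:
    """Hypothetical, minimal event loop: a queue of (callback, payload) pairs."""

    def __init__(self):
        self._pending = deque()

    def fire(self, callback, payload=None):
        # Record that an event happened and which callable wants to hear about it.
        self._pending.append((callback, payload))

    def run(self):
        # The "long-running while loop": keep going until nothing is waiting.
        while self._pending:
            callback, payload = self._pending.popleft()
            callback(payload)  # control stays in user code until it returns


log = []
loop = ToyEventLoop()
loop.fire(lambda data: log.append(("socket", data)), b"hello")
loop.fire(lambda data: log.append(("timeout", data)))
loop.run()
```

After run() finishes, log holds one entry per event, in the order the events were fired.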
Then what?
• Let my code know!
• This code can be any callable
When you added the event you care about, you also added a callback. A callback is simply a Python function that gets run with the results of the event. The return value of a function, or the data coming off your socket, or whatever is what this function gets passed.
The great part is that this function definition is allowed to be *any callable* object in Python. A class with .__call__, a function, a bound method on an object, whatever scope you like, it has.
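As a small sketch of the "any callable" point, the three callbacks below are a plain function, a bound method, and an instance of a class with __call__. All of the names (as_function, Counter, Shouter) are invented for this example; the only thing an event loop would care about is that each one can be called with the result.

```python
def as_function(result):
    return "function got %r" % (result,)


class Counter:
    def __init__(self):
        self.seen = 0

    def record(self, result):
        # A bound method carries its object along, so it can update state.
        self.seen += 1
        return "method got %r" % (result,)


class Shouter:
    def __call__(self, result):
        # Defining __call__ makes instances of this class callable.
        return "instance got %r" % (result,)


counter = Counter()
callbacks = [as_function, counter.record, Shouter()]

# Stand-in for the event loop handing the same result to each callback.
outputs = [cb("row") for cb in callbacks]
```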
But once it’s in your code..
You have to Let Go
As you’ve probably figured out, what happens in an event system is analogous to co-operative multitasking.
When an event fires and your callback gets run, what happens?
Since it’s a standard method call, control is handed over to your method, and doesn’t return to the event loop UNTIL YOU RETURN.
[Diagram: Event Loop, Your Code]
We could be here a while...
So let’s say your particular callback takes, oh, a second to do its thing, as it’s particularly computationally intensive. For that whole second, your entire event loop is unable to do anything else.
[Diagram: Your Code: “Let’s compute Pi to a BILLION decimal places!”]
Not just silly maths, either
This happens no matter what your code does, be it silly math or a web site reaching out to MySQL for data, or going to disk to open a file, iterating over a long array, or even waiting on the user to do something.
As long as your code hasn’t returned, your entire program has STALLED.
[Diagram: Your Code: “I need some data from MySQL.”]
http://www.flickr.com/photos/neilwill/5023734329/
You have STALLED.
Is that really bad?
It should be fairly obvious that this is bad, and why it’s bad.
To use the example of a hypothetical website, if you’re stalled waiting for the database, you can’t accept new connections, and you can’t even give an indication why. Your site will *appear* to perform slowly.
Asynchronous code is harder
Solving this isn’t easy, and requires adjusting your mental model on how programs flow.
In Twisted, programs have to be written with the idea that a method call won’t return the results you expect, but instead an object that will tell a function what your data is.
• x = y() doesn’t work anymore.
For instance, x = y() won’t do what you expect.
How can it, when you’re not actually
• Requires very tiny functions
In this model, you end up with very tiny functions that perform very small, discrete amounts of work, before releasing control back to the event loop.
In order for these very tiny functions to be useful, we have to keep tight control over our scope, and an easy way to do that is by using closures.
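One way to picture "very tiny functions" is a long sum broken into slices, where each slice does a little work, reschedules the rest, and returns. This is only a sketch: the pending deque stands in for an event loop's queue, and sum_in_chunks is a hypothetical helper, not part of Twisted or Tornado.

```python
from collections import deque

pending = deque()  # stand-in for the event loop's queue of things to run
trace = []         # records who ran when, so the interleaving is visible


def sum_in_chunks(name, numbers, chunk_size, total=0, start=0):
    """Add up one small slice, then reschedule the remainder and return."""
    end = min(start + chunk_size, len(numbers))
    total += sum(numbers[start:end])
    trace.append((name, end))
    if end < len(numbers):
        # Let go: the remaining work goes to the back of the queue.
        pending.append(
            lambda: sum_in_chunks(name, numbers, chunk_size, total, end))
    else:
        trace.append((name, "done", total))


pending.append(lambda: sum_in_chunks("a", list(range(6)), 2))
pending.append(lambda: sum_in_chunks("b", list(range(4)), 2))

while pending:
    pending.popleft()()
```

Because each call returns after one slice, the two jobs take turns instead of "a" running to completion before "b" gets a look in. The lambdas doing the rescheduling are themselves closures over the current total and position.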
This is what Twisted does
As you can see here, we’ve expanded our row processor into its own function, as well as adding an error handler to the Twisted Deferred.
Wait, wait, what just happened?
The first way in which asynchronous code can be written is through the use of closures.
A closure is a funky sort of internal function, often anonymous, that “closes over” the scope of the function it’s defined in.
This can be very powerful, as the closure effectively “resumes” back in the middle of the original function, can update state, and generally do useful things.
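A minimal closure sketch, using invented names (make_row_processor, on_row, rows_seen): the inner function keeps access to the list defined in the outer function, even after the outer function has returned.

```python
def make_row_processor():
    rows_seen = []  # state that lives in the enclosing function's scope

    def on_row(row):
        # The inner function "closes over" rows_seen and can update it
        # every time it is called, long after make_row_processor returned.
        rows_seen.append(row.upper())
        return len(rows_seen)

    return on_row, rows_seen


on_row, rows_seen = make_row_processor()
first = on_row("alice")
second = on_row("bob")
```

Handing on_row to an event loop as a callback would let each event add to the same rows_seen list without any global variables.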
Deferred, the Core of Twisted
That was a *deferred*, basically the core response that you’ll get out of an API in Twisted.
A deferred is an indicator that something is going to happen *later on*, as opposed to right now.
This comes back to the core ideal of having to let go. Let’s go back to the code to explain further.
The Core of Twisted
• Most APIs built on Twisted return Deferreds
• Almost always involve user code
But what is a Deferred?
But what is it?
These are the basic features that you probably care about in a deferred: adding callbacks and errbacks, which we’ve already covered a bit of, and these new methods, .callback and .errback.
Segue Power!
• .callback starts the callback chain
• .errback causes the callback chain to explode and die messily
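To show the shape of those four methods without needing Twisted installed, here is a toy stand-in. ToyDeferred is an illustration only and is not Twisted's implementation: the real Deferred wraps errors in Failure objects, accepts extra arguments, and handles many cases this sketch ignores. The method names mirror the ones on the slide.

```python
class ToyDeferred:
    """Illustrative stand-in only; Twisted's real Deferred does far more."""

    def __init__(self):
        self._chain = []  # ordered list of (callback, errback) pairs

    def addCallback(self, fn):
        self._chain.append((fn, None))
        return self

    def addErrback(self, fn):
        self._chain.append((None, fn))
        return self

    def callback(self, result):
        # The result has arrived: start the chain in the success state.
        self._run(result, failed=False)

    def errback(self, error):
        # Something went wrong: start the chain in the error state.
        self._run(error, failed=True)

    def _run(self, value, failed):
        for on_ok, on_err in self._chain:
            handler = on_err if failed else on_ok
            if handler is None:
                continue  # this link does not handle the current state
            try:
                value = handler(value)  # each return value feeds the next link
                failed = False          # returning normally means recovered
            except Exception as exc:
                value, failed = exc, True


seen = []

d = ToyDeferred()
d.addCallback(lambda rows: [r * 2 for r in rows])
d.addCallback(seen.append)
d.addErrback(lambda exc: seen.append("error: %s" % exc))
d.callback([1, 2, 3])  # nothing ran until this point

broken = ToyDeferred()
broken.addCallback(seen.append)  # skipped: the chain starts in the error state
broken.addErrback(lambda exc: seen.append("error: %s" % exc))
broken.errback(ValueError("no database"))
```

The point to notice is that adding callbacks does nothing by itself. The work only happens once whoever owns the deferred calls .callback or .errback.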
errback is structurally identical to callbacks
Let’s look at this again
What’s the key here?
It doesn’t happen right away.
Synchronous Example
And again, for comparison
Composition
Or, chaining callbacks
Wha?
Synchronous callback chain
ASYNCHRONOUS != FASTER
The final caveat on our List of Asynchronous Problems is an idea I’ve run into: that asynchronous code, by the very fact of running on an evented IO server, will be faster.
This idea is all sorts of wrong.
The event loop is overhead...
Synchronous code that simply runs inside of an event-driven IO system like Twisted or Tornado is naturally going to be slower than the same code running standalone.
..and without proper coding, useless
And, as you’ve already seen in how asynchronous programs have to be structured, the event loop is effectively useless unless you take the time to write code that takes advantage of it.
The question you all want to ask
WHY BOTHER?
Why spend extra time doing it the Hard Way, the way where you are required to do more work and write code in completely new ways?
Scales beautifully
The real advantage, the real power of asynchronous programming is the level of scale to which you can go.
No threads means no thread overhead, and no complexity of maintaining locks and trying to share state.
Nginx, well-regarded as one of the fastest webservers around, is entirely built around asynchronous programming.
Terribly elegant
Once you really “get it”, the entire idea starts seeming terribly elegant and worthwhile, and you start looking for how to process code asynchronously in all aspects of your programming.
More re-usable code
You also end up in a position where you’re writing far more reusable code.
Why? Well, you need to have these functions which run as callbacks, and as we’ll go into in a little bit, those same callbacks can be chained together. There’s very little point in rewriting code all the time to
Closer mapping to reality
What do I mean by this? Your code is often going to be waiting for other servers - webservers, database servers, network, file, Everything.
So the example I have the
LITTLE BITS OF TORNADOS
Since I’ve spent most of my time so far talking about Twisted as opposed to the other “major” asynchronous platform, I’d like to devote a little bit of time to Tornado.
Event loop + web framework
Tornado does have an internal IO loop, and libraries *do* use it standalone, but the vast majority of examples you’ll run across treat it as a web framework akin to Pylons or Pyramid or Bottle.
Callbacks are inline
Not as wide library support
Lacks a deferred metaphor
.add_callback(my_function)
my_function has to handle both
my_function( response, error )
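Here is a rough sketch of what "my_function has to handle both" means in practice. It is plain Python and does not use Tornado: fake_fetch is a hypothetical stand-in for whatever asynchronous call would eventually invoke the callback, and only the my_function(response, error) signature comes from the slide.

```python
def fake_fetch(url, callback):
    """Hypothetical helper: reports success or failure through one callback."""
    if url.startswith("http://"):
        callback("200 OK from %s" % url, None)
    else:
        callback(None, ValueError("unsupported URL: %s" % url))


results = []


def my_function(response, error):
    # One function, two jobs: check for the error before touching the response.
    if error is not None:
        results.append("failed: %s" % error)
        return
    results.append("got: %s" % response)


fake_fetch("http://example.com/", my_function)
fake_fetch("ftp://example.com/", my_function)
```

Compared with separate callbacks and errbacks, the success and failure paths live in the same function, so forgetting the error check is an easy mistake to make.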
So why Tornado over Twisted?
Speed.
http://programmingzen.com/2009/09/13/benchmarking-tornado-vs-twisted-web-vs-tornado-on-twisted/
SO, THAT’S IT.
ANY QUESTIONS?
THANKS!