State Machines to State of the Art


Post on 16-Apr-2017





PHP UK 2009

State Machines to State of the Art

Smart, Efficient Design using REST and MVC

Rowan Merewood, Lead Developer, Plusnet

Today I'm going to be talking about how we go about designing web applications and some of the tools we can use to make those designs simple, robust, modular, and hopefully somewhat future-proof.

A little about me: I'm a lead developer at Plusnet, which is a BT-owned ISP up in Sheffield. People sometimes seem surprised that an ISP needs a dev department when presumably all we need to do is plug a few wires in and flick a few switches in an exchange. The difference is that we focus on automating as much of the entire process as possible: signing up, provisioning, billing, customer support and so on. We've done most of this in-house, so we need developers to make sure it keeps working and to identify new areas we can automate.

The majority of this is in PHP. We've got some Java in some back-end places, and then a smattering of Perl, C, and bash scripts filling in the cracks. We've been doing this for over a decade now, so as you can imagine there's a lot of code there.

That leads me onto why I wanted to do this talk...


Trying to be a better Software Engineer

I've seen quite a few PHP applications that started as a single dynamic form with one-shot code, which is good: they solve the problem and they get it done quickly. Then we add some new functionality, which is also fine. However, by the third, fourth and fifth modification you start getting thousand-line case statements, functions whose name is only a hint at what they may have done before, and docblocks that simply say, "Here be dragons."

So how do I stop the code I write today from becoming tomorrow's legacy monster, but without spending six months designing it?

Let's take a look at what's available to us...

The Tools


HTTP, REST, CRUD, MVC

State Machines

On the top line we've got a bunch of acronyms we're probably all familiar with:
- HTTP: the protocol that's going to let us do all this
- REST: an architecture that defines how we use our protocol
- CRUD: a rough approximation of what's going on in the background
- MVC: a standard design pattern employed by the majority of web frameworks

However, we've also got an old-school computer science favourite:
- State Machines: which will provide a way of visualising the life cycle of our resources

We're going to have a brief run through each of these topics and I'll highlight what I think are the key concepts to understand in each of them.

HTTP Protocol

A stateless way of interacting with resources

First, what is hopefully revision for most of you: the HTTP protocol. The key words here are stateless and resources. Every single request should be a complete, standalone operation, and those requests are directed to a unique resource. It sounds simple, but as you'll see we tend to drift away from that in places.



Let's jump into the methods that HTTP gives us for interacting with resources. There are 8 in total, but since we're only concerned with the web application side we're only interested in the top 5, and since we're dealing mostly with browsers, there are really only 2 that are relevant: GET and POST.



A brief reminder of how these are used:
- HEAD, GET: your read-only operations for resources, fetching a representation of the resource.
- POST: for creating subordinate resources relevant to the URL being POSTed to. An interesting point: if you POST data that creates a new URL, your server should return 201 Created, rather than 200 OK.
- PUT: for placing a resource at the given location, or replacing the one there.
- DELETE: I think you get the idea.

An important distinction to note: if you're using POST to create resources, you're POSTing to another resource that manages that creation; if you're using PUT then you have the resource already and you're specifying where you want it to go. A bit like the difference between a factory and a constructor.
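To make the POST versus PUT distinction concrete, here is a minimal in-memory sketch in Python. It is not a real HTTP server: `ResourceStore`, `post()` and `put()` are hypothetical names invented for illustration, and returning 200 for a replacing PUT is one reasonable choice rather than the only one.

```python
# Minimal in-memory sketch of POST vs PUT semantics; not a real HTTP server.
# ResourceStore, post() and put() are hypothetical names for illustration.

class ResourceStore:
    def __init__(self):
        self.resources = {}   # URL -> representation
        self.next_id = 1

    def post(self, collection_url, representation):
        """POST to a collection: the server decides the new resource's URL."""
        url = f"{collection_url.rstrip('/')}/{self.next_id}"
        self.next_id += 1
        self.resources[url] = representation
        return 201, url       # 201 Created, plus the new resource's location

    def put(self, url, representation):
        """PUT to a URL: the client decides where the resource lives."""
        existed = url in self.resources
        self.resources[url] = representation
        return (200 if existed else 201), url


store = ResourceStore()
status, location = store.post("/contacts/", {"name": "Ada"})
print(status, location)                                 # 201 /contacts/1
print(store.put("/contacts/ada", {"name": "Ada"}))      # (201, '/contacts/ada')
print(store.put("/contacts/ada", {"name": "Ada L."}))   # (200, '/contacts/ada')
```

The collection behaves like the factory (it manages creation and picks the URL), while PUT behaves like the constructor (the caller already has the resource and says where it goes).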




GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".

The HTTP protocol breaks these methods down into two groups. HEAD and GET are classed as safe, meaning that my client can make as many requests as it likes with these methods with no expectation of changing the state of the system, hence the read-only aspect of them. The definition at the bottom is nabbed straight from the RFC.




Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request

The second property applies to the remainder of the methods, with the exception of POST, and that is idempotence. I would actually recommend looking this up at work rather than at home, as my wife looked somewhat worried when I put it into Google. It means that an identical request should have an identical effect. Obviously, if I'm GET'ing a resource I'll get the same one each time. If I create a new resource, then PUT'ing it again simply results in the same resource appearing.

POST does not have to be idempotent, but it can be in some cases, especially given our tendency to overload POST with the functionality of PUT and DELETE, so you might want to think of ways in which you can make a specific POST request idempotent. Definitely have a read of the RFC on this one, as it can take a while to wrap your head round it.

That said, what is it that's important out of these properties?
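As a rough illustration of idempotence, the Python sketch below uses plain dictionaries as stand-in server state. All function names are hypothetical, and the client-supplied token used to make a POST idempotent is an assumed technique chosen for the example, not something the RFC prescribes.

```python
import itertools

# Plain dictionaries stand in for server state; all names are hypothetical.
orders = {}                  # order id -> data
seen_tokens = {}             # token -> id of the order already created for it
_ids = itertools.count(1)

def put_order(order_id, data):
    """PUT: repeating the identical request leaves the same state."""
    orders[order_id] = data

def post_order(data):
    """Plain POST: every repeat creates another order."""
    order_id = f"auto-{next(_ids)}"
    orders[order_id] = data
    return order_id

def post_order_once(token, data):
    """POST made idempotent (assumed approach): a repeated token
    returns the order created the first time instead of a new one."""
    if token not in seen_tokens:
        seen_tokens[token] = post_order(data)
    return seen_tokens[token]

put_order("order-100", {"item": "router"})
put_order("order-100", {"item": "router"})
print(len(orders))                    # 1: N identical PUTs act like one

post_order({"item": "modem"})
post_order({"item": "modem"})
print(len(orders))                    # 3: each POST created a new order

first = post_order_once("tok-1", {"item": "cable"})
again = post_order_once("tok-1", {"item": "cable"})
print(first == again, len(orders))    # True 4
```

The token approach trades a little server-side bookkeeping for predictable behaviour when a client repeats a request.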

What's Important?

User Experience

(of course!)

It's what our user sees, obviously. No one will care about how open I am about my idempotence if my system runs like a pig and makes users jump through twenty hoops just to upload a picture. Given this, why should we care about what the protocol says?


Safe == Read Only

Idempotent == Predictable

On a gross level, this is how I'd translate those terms for a user. Any time my user makes a GET request, they should be able to call it as many times as they like, they should be able to bookmark it, and they should be able to reload it a week later and still get a meaningful result.

If it's one of the idempotent or other unsafe methods, the behaviour should be predictable and, above all, still stateless. A single request is still a complete operation, and does not rely on a previous sequence.

What went wrong?

Great Power, Great Responsibility?

HTTP was designed for an interactive web, with create and update methods built in from the start. However, we wanted richer experiences and more complex interactions. There were tools out there, but we did things in the wrong way...

What was abused?

HTTP, Sessions, and Cookies, oh my!

We did things with HTTP that stopped GET being safe, we stored things in the session or cookies that should have been part of the resource. Things like Java applets and Flash came along and we moved some of those interactions out of the browser's control and into separate applications.

Shall we have a couple of examples?


My credit card company might thank me, and I suppose if I end up with two because I clicked refresh I can always drive one and live in the other.


I'll just press Back then...

How does your multi-part form handle holding that user data? What happens if I bookmark it halfway through and come back later? What if I go back through the journey and pick a different branch, how will your application handle the data it's collected so far?

If the protocol is fine, and the tools are fine, then the problem must be with the way we're using them. So, we need an architectural style to help us apply our tools properly.


Representational State Transfer

That's where REST comes in. REST is becoming popular and I think much of that is due to the simplicity of its ideas and that it's different enough to have a fundamental effect on how your application works.


Nouns not verbs

The key idea here is that your application deals with nouns not verbs. That is, your URLs represent objects and then you're performing actions on those objects.

The benefit here is that you are encouraged towards more human-readable URLs, and dividing your user's interaction with your app into objects has an obvious link to your implementation if you're using an OO language.
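One way to picture "nouns not verbs" is a dispatcher in which the URL only identifies the object and the HTTP method selects the action on it. The Python sketch below is purely illustrative: the `Order` class, the `ORDERS` lookup, the `dispatch()` function and the `/order/<id>` URL shape are all assumptions made for the example, not a description of any particular framework.

```python
# Hypothetical dispatcher: the URL names a resource (the noun),
# and the HTTP method chooses the action performed on it (the verb).

class Order:
    def __init__(self, order_id, items):
        self.order_id = order_id
        self.items = items

    def get(self):
        """Return a standard representation of this order."""
        return {"id": self.order_id, "items": self.items}

    def delete(self):
        """Remove this order from the lookup table."""
        ORDERS.pop(self.order_id, None)
        return {"deleted": self.order_id}

ORDERS = {"123": Order("123", ["broadband"])}

# Only these HTTP methods map onto Order methods in this sketch.
ALLOWED = {"GET": "get", "DELETE": "delete"}

def dispatch(method, path):
    """Route e.g. ('GET', '/order/123') to ORDERS['123'].get()."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "order":
        return 404, None
    resource = ORDERS.get(parts[1])
    if resource is None:
        return 404, None                  # no such order
    if method not in ALLOWED:
        return 405, None                  # method not allowed on this resource
    return 200, getattr(resource, ALLOWED[method])()

print(dispatch("GET", "/order/123"))      # (200, {'id': '123', 'items': ['broadband']})
print(dispatch("POST", "/order/123"))     # (405, None)
print(dispatch("GET", "/order/999"))      # (404, None)
```

Because the URL maps straight onto an object, adding a new action means adding a method to the resource rather than inventing another page.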

Now, if you've still got some old PHP apps hanging around then this is not necessarily the case. Your page structure probably resembles a series of function calls: viewOrder, cancelOrder, updateOrder. All logical, but based around passing a resource to a method, rather than calling a method on a resource.

Let's have a look at the difference...


/getOrder?id=123 vs. /order/123

/addContact vs. /contacts/

/updateArticle?id=abc vs. /article/abc

- you either have a page whose function is to get orders, or you just specify the order and expect to get a standard representation back.
- you either have a page where you send arbitrary data necessary to add a contact, or you POST a representation of the resource to the collection of contacts and it sorts out adding it.
- similar to the first example, instead of different URLs for different functions,