State Machines to State of the Art

PHP UK 2009

Smart, Efficient Design using REST and MVC

Rowan Merewood, Lead Developer, Plusnet

Today I'm going to be talking about how we go about designing web applications and some of the tools we can use to make those designs simple, robust, modular, and hopefully somewhat future-proof.

A little about me: I'm a lead developer at Plusnet, which is a BT-owned ISP up in Sheffield. People sometimes seem surprised that an ISP needs a dev. dept when presumably all we need to do is plug a few wires in and flick a few switches in an exchange. The difference is that we focus on automating as much of the entire process as possible: signing up, provisioning, billing, customer support and so on. We've done most of this in-house, so we need developers to make sure it keeps working and to identify new areas we can automate.

The majority of this is in PHP. We've got some Java in some back-end places, and then a smattering of Perl, C, and bash scripts filling in the cracks. We've been doing this for over a decade now, so as you can imagine there's a lot of code there.

That leads me onto why I wanted to do this talk...

Motivation

Trying to be a better Software Engineer

I've seen quite a few PHP applications that have started as a single dynamic form with one-shot code, which is good. They solve the problem and they get it done quickly. Then we've added some new functionality, which is also fine. However, by the third, fourth and fifth modification you start getting thousand-line case statements, functions whose names are only a hint at what they may have done before, and docblocks that simply say, "Here be dragons."

So how do I stop the code I write today from becoming tomorrow's legacy monster, but without spending six months designing it?

Let's take a look at what's available to us...

The Tools

HTTP, REST, CRUD, MVC,

State Machines

On the top line we've got a bunch of acronyms we're probably all familiar with:
- HTTP: the protocol that's going to let us do all this
- REST: an architecture that defines how we use our protocol
- CRUD: a rough approximation of what's going on in the background
- MVC: a standard design pattern employed by the majority of web frameworks

However, we've also got an old school computer science favourite:
- State Machines: which will provide a way of visualising the life cycle of our resources

We're going to have a brief run through on each of these topics and I'll highlight what I think are the key concepts to understand in each of them.

HTTP Protocol

A stateless way of interacting with resources

First, what is hopefully revision for most of you: the HTTP protocol. The key words here are stateless and resources. Every single request should be a complete, standalone operation, and those requests are directed to a unique resource. It sounds simple, but as you'll see we tend to drift away from that in places.

HTTP Protocol

HEAD GET POST PUT DELETE

TRACE OPTIONS CONNECT

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9

Let's jump into the methods that HTTP gives us for interacting with resources. There are 8 in total, but given we're just interested in the web app side, we only care about the top 5. And given we're dealing mostly with browsers, there are really only 2 that are relevant: GET and POST.

HTTP Protocol


A brief reminder of how these are used:
- HEAD, GET: your read-only operations for resources, fetching a representation of the resource.
- POST: for creating subordinate resources relevant to the URL being POSTed to. An interesting point: if you POST data that creates a new URL, your server should return 201 Created, rather than 200 OK.
- PUT: for placing a resource at the given location, or replacing the one there.
- DELETE: I think you get the idea.

An important distinction to note: if you're using POST to create resources, you're POSTing to another resource that manages that creation; if you're using PUT then you have the resource already and you're specifying where you want it to go. A bit like the difference between a factory and a constructor.
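As a rough sketch of that POST/PUT distinction, the snippet below models a tiny in-memory resource store. The function names, the $store array and the /tickets URLs are all assumptions made for illustration; none of them come from a framework or from the talk's code.

```php
<?php
// Hypothetical in-memory store keyed by URL; a stand-in for a real backend.
$store = array();

// POST to a collection: the collection picks the new URL (factory-like).
// Returns array(status code, URL of the new resource).
function postToCollection(array &$store, $collectionUrl, array $data)
{
    $id = count($store) + 1;                  // naive ID allocation for the sketch
    $url = rtrim($collectionUrl, '/') . '/' . $id;
    $store[$url] = $data;
    return array(201, $url);                  // 201 Created, plus where it now lives
}

// PUT to a URL: the client chooses the location (constructor-like).
// Returns 201 if nothing was there before, 200 if an existing resource was replaced.
function putAtUrl(array &$store, $url, array $data)
{
    $status = isset($store[$url]) ? 200 : 201;
    $store[$url] = $data;
    return $status;
}

list($status, $url) = postToCollection($store, '/tickets', array('origin' => 'Sheffield'));
echo $status . ' ' . $url . "\n";                                          // 201 /tickets/1
echo putAtUrl($store, '/tickets/99', array('origin' => 'London')) . "\n";  // 201
echo putAtUrl($store, '/tickets/99', array('origin' => 'London')) . "\n";  // 200
```

Note that repeating the PUT leaves the store unchanged, whereas repeating the POST would create a second ticket under a new URL.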

HTTP Protocol


Safe

GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".

The HTTP protocol breaks these methods down into two groups.

HEAD and GET are classed as safe, meaning that my client can make as many requests as it likes with these methods with no expectation of changing the state of the system, hence the read-only aspect of them. The definition at the bottom is nabbed straight from the RFC.

HTTP Protocol


Idempotent

Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request

The second property applies to the remainder of the methods, with the exception of POST, and that is idempotence. I would actually recommend looking this up at work rather than at home, as my wife looked somewhat worried when I put it into Google. This means that repeating an identical request should have the same effect as making it once. Obviously, if I'm GET'ing a resource I'll get the same one each time. If I create a new resource, then PUT'ing it again simply results in the same resource appearing.

POST does not have to be idempotent, but it can be in some cases, especially given our tendency to overload POST with the functionality of PUT and DELETE, so you might want to think of ways in which you can make a specific POST request idempotent. Definitely have a read of the RFC on this one as it can take a while to wrap your head round it.

That said, what is it that's important out of these properties?
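One common way to make a specific POST idempotent is to have the client send a unique token with the form and to remember which tokens have already been handled. This technique is not from the talk; it is an assumption offered as a minimal sketch, and the OrderHandler class, its arrays and the token value are all hypothetical.

```php
<?php
// Hypothetical handler: $orders and $seenTokens stand in for persistent storage.
class OrderHandler
{
    private $orders = array();      // order ID => item
    private $seenTokens = array();  // request token => order ID already created

    // Handles a POST. Replaying the same token returns the original order ID
    // instead of creating a second order.
    public function post($token, $item)
    {
        if (isset($this->seenTokens[$token])) {
            return $this->seenTokens[$token];
        }
        $orderId = count($this->orders) + 1;
        $this->orders[$orderId] = $item;
        $this->seenTokens[$token] = $orderId;
        return $orderId;
    }

    public function countOrders()
    {
        return count($this->orders);
    }
}

$handler = new OrderHandler();
$first = $handler->post('form-abc123', 'train ticket');
$again = $handler->post('form-abc123', 'train ticket'); // e.g. the user hit refresh
echo $handler->countOrders() . "\n"; // 1
```

The second call has no further side effect, which is the property the RFC describes: N identical requests leave the system in the same state as one.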

What's Important?

User Experience

(of course!)

It's what our user sees, obviously. No one will care about how open I am about my idempotence if my system runs like a pig and makes users jump through twenty hoops just to upload a picture. Given this, why should we care about what the protocol says?

What's Important?

Safe == Read Only

Idempotent == Predictable

On a gross level, this is how I'd translate those terms for a user. Any time my user makes a GET request, they should be able to call it as many times as they like, they should be able to bookmark it, and they should be able to reload it a week later and still get a meaningful result.

If it's an idempotent or other unsafe method, the behaviour should be predictable and, above all, still stateless. A single request is still a complete operation, and does not rely on a previous sequence.

What went wrong?

Great Power Great Responsibility?

HTTP was designed for an interactive web, with create and update methods built in from the start. However, we wanted richer experiences and more complex interactions. There were tools out there, but we did things in the wrong way...

What was abused?

HTTP, Sessions, and Cookies oh my!

We did things with HTTP that stopped GET being safe, we stored things in the session or cookies that should have been part of the resource. Things like Java applets and Flash came along and we moved some of those interactions out of the browser's control and into separate applications.

Shall we have a couple of examples?

Safe?

http://www.shop.com/buy.php?item=AstonMartin

My credit card company might thank me, and I suppose if I end up with two because I clicked refresh I can always drive one and live in the other.

Stateless?

I'll just press Back then...

How does your multi-part form handle holding that user data? What happens if I bookmark it halfway through and come back later? What if I go back through the journey and pick a different branch, how will your application handle the data it's collected so far?

If the protocol is fine, and the tools are fine then the problem must be with the way we're using them. So, we need an architectural style to help us apply our tools properly.

REST

Representational State Transfer

That's where REST comes in. REST is becoming popular and I think much of that is due to the simplicity of its ideas and that it's different enough to have a fundamental effect on how your application works.

REST

Nouns not verbs

The key idea here is that your application deals with nouns, not verbs. That is, your URLs represent objects and then you're performing actions on those objects.

The benefit here is that you are encouraged towards more human-readable URLs, and dividing your user's interaction with your app into objects has an obvious link to your implementation if you're using an OO language.

Now, if you've still got some old PHP apps hanging around then this is not necessarily the case. Your page structure probably resembles a series of function calls: viewOrder, cancelOrder, updateOrder. All logical, but based around passing a resource to a method, rather than calling a method on a resource.

Let's have a look at the difference...

REST

/getOrder?id=123 vs. /order/123

/addContact vs. /contacts/

/updateArticle?id=abc vs. /article/abc

- you either have a page whose function is to get orders, or you just specify the order and expect to get a standard representation back.
- you either have a page where you send arbitrary data necessary to add a contact, or you POST a representation of the resource to the collection of contacts and it sorts out adding it.
- similar to the first example, instead of different URLs for different functions, you have the same URL but the action you're performing is the important distinction.
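To make the third point concrete, here is a minimal sketch of a dispatcher where the URL names the resource and the HTTP method selects the action. The resolve() function, its patterns and the action descriptions are hypothetical, chosen only to mirror the URLs on the slide.

```php
<?php
// Hypothetical dispatcher: resolves an HTTP method and a noun-style path to a
// description of the action. Patterns and action names are illustrative only.
function resolve($method, $path)
{
    $routes = array(
        array('GET',  '#^/order/(\w+)$#',   'show order'),
        array('POST', '#^/contacts/$#',     'add contact'),
        array('GET',  '#^/article/(\w+)$#', 'show article'),
        array('PUT',  '#^/article/(\w+)$#', 'update article'),
    );
    foreach ($routes as $route) {
        list($routeMethod, $pattern, $action) = $route;
        if ($method === $routeMethod && preg_match($pattern, $path, $matches)) {
            // Append the captured resource ID when the pattern has one
            $id = isset($matches[1]) ? ' ' . $matches[1] : '';
            return $action . $id;
        }
    }
    return null; // no matching resource/method pair
}

echo resolve('GET', '/article/abc') . "\n"; // show article abc
echo resolve('PUT', '/article/abc') . "\n"; // update article abc
```

The same URL appears twice in the table; only the method differs, which is exactly the shift from /updateArticle?id=abc to /article/abc.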

If any of this calling methods on resources is sounding familiar, then there's a good reason for that...

REST

REST is ideal for implementing over HTTP

The distinction here is that REST is an architectural style, not a particular implementation. It gives us a common language for describing our application. And it's a language that's designed to complement the HTTP protocol, meaning that if our app really is REST-ful then it'll also play nicely with HTTP and that should mean a better user experience. We also get all the technical benefits of clear caching rules and, as you will see, a nice way of linking your front and back end code.

Design

Enough front-end stuff,

what's going to do the work?

We've covered the protocol available to us, the architectural style we want to apply, the problems we want to avoid, and the benefits we want to see.

So we're going to tackle a little example problem, come up with a REST-ful solution to it and then look at the classes that would help us implement that.

The Problem

Booking train tickets

I wanted a real world problem and everyone always gives you the advice, "write about what you know". This was running through my mind as I tried to sort out my journey to and from Sheffield.

The elements that we've talked about are all there:
- a multi-part journey to select my ticket
- viewing my purchased tickets
- refunding a ticket when it turns out that my reserved seat is physically missing, there are no power sockets, and I think we ran out of coal at one point... but I'm getting sidetracked.

We know what the user wants to do, so let's start analysing the brief.

Analysis

What's my resource?

Tickets

Origin, Destination, Departs, Class, Price

This is classic requirements analysis: go through your use case and underline the nouns. In this case, it's pretty obvious I want my ticket; it's really all I care about.

To keep things simple we'll specify a ticket as only having an origin station, destination station, departure time, a class, and a rather extortionate price.

Now that's fairly trivial to drop into a database, so we can knock that table up, and most of you should be familiar with the shorthand we use for manipulating data in that table...

CRUD

Create, Read, Update, Delete

Hopefully, you're already thinking how this is going to match up to the REST principles we talked about earlier, because what we have here is a list of actions we want to perform on our resource.

However, this is also the first point where we've had to deal with a change of state. Because we've got a nice limited set of operations here, let's take a look at the effect they have on the state of our resource...

CRUD

[Diagram: a single "Exists" state. Create (PUT/POST) enters it, Delete (DELETE) leaves it, and Update (PUT/POST) and Read (GET) loop back to it.]

The possible states of our resource are binary: it either exists or it doesn't. In this case our Update method is presumably only changing some non-functional attributes of our resource, so it never results in any significant change of state.

This also means that with the resource in place we can always call the Read or Update methods and always expect a predictable result.

It's also easy enough to see how we'd map those to our HTTP methods. But our example isn't quite that simple, because there are different things we want to do with our ticket depending on where we are in the process...

Analysis

Create, Read, Update, Delete,

Book, Cancel, Finalise, Refund, Archive

- Create: Specify one or more things about my ticket, but I don't want to book it yet
- Read: Obviously, I'm going to need to have it displayed
- Update: I've got the other details, so I'll add them
- Delete: Actually, I don't need the ticket so I'll just delete the draft
- Book: However, if I do want it then I'll need to book it, pay for it, and set everything else in motion
- Cancel: Up to 24 hours before my journey, I can choose to cancel my ticket and I might get some kind of compensation
- Change: In that period I might also choose to change the details of my journey and re-book the ticket
- Finalise: After that time though, I cannot change my ticket
- Refund: However, if the train was late I might be able to argue for a refund
- Archive: And finally, all these tickets are available to view until we do our housekeeping every 6 months

What does that look like as a diagram?

Analysis

[State diagram. States: Drafted, Booked, Finalised, Cancelled, Refunded, Deleted, Archived. Transitions: Create, Update, Book, Finalise, Delete, Cancel, Refund, Archive.]

There we have it, the entirety of our ticket's lifetime mapped out, each state and each transition, looking a bit like a little tube map of its own.

Some things to note:
- there's an implied Read action on every state that loops back on to itself.
- the Update and Archive transitions are available from multiple states but result in the same state.
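The diagram can also be written down as a plain transition table, which is a handy cross-check before any classes exist. This is a sketch of one reading of the notes, not the talk's implementation: the 'None' pseudo-state, the nextState() helper, and in particular the source states listed for Archive are assumptions.

```php
<?php
// Transition table for the ticket life cycle: action => (from state => to state).
// 'None' marks a ticket that does not exist yet. The source states for Archive
// are assumed; the notes only say it is available from multiple states.
$transitions = array(
    'Create'   => array('None' => 'Drafted'),
    'Update'   => array('Drafted' => 'Drafted', 'Booked' => 'Drafted'),
    'Delete'   => array('Drafted' => 'Deleted'),
    'Book'     => array('Drafted' => 'Booked'),
    'Cancel'   => array('Booked' => 'Cancelled'),
    'Finalise' => array('Booked' => 'Finalised'),
    'Refund'   => array('Finalised' => 'Refunded'),
    'Archive'  => array(
        'Cancelled' => 'Archived',
        'Finalised' => 'Archived',
        'Refunded'  => 'Archived',
    ),
);

// Returns the state reached by applying $action in $state, or null when the
// diagram has no such arrow.
function nextState(array $transitions, $state, $action)
{
    if (isset($transitions[$action][$state])) {
        return $transitions[$action][$state];
    }
    return null;
}

echo nextState($transitions, 'Drafted', 'Book') . "\n";   // Booked
var_dump(nextState($transitions, 'Finalised', 'Cancel')); // NULL
```

Read is left out of the table because it never changes state.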

Now we understand how the system works, the client is impressed by our brightly coloured diagrams, it's time to come up with a name for this application and start doing some code. Given it's all about me trying to purchase train tickets I present:

Code

Rowan On Rails

(I am so, so sorry!)

Oddly, when I asked a colleague about the name he just said, "Remind me to kill you later."

Anyway, enough marketing, let's do some work.

URLs

/tickets

GET: list tickets

POST: create ticket

/ticket

GET: display ticket form

/ticket/123

GET: display ticket

/ticket/123/booked

POST: state transition

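The first thing I like to do when I'm laying out my app is think about how the URLs will appear to my user and make sure they fit the REST philosophy.

/tickets is the collection: GET lists the tickets and POST creates a new one. /ticket on its own displays the ticket form, /ticket/123 displays a specific ticket, and a POST to /ticket/123/booked requests a state transition.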

On the last point, you can think of the 'booked' state as an attribute of our ticket. If I GET that URL, I'm essentially getting a boolean for whether this is my current state. From a UI point of view, we can also use it to display a form to move to that state if possible.

With our URLs in place, we can think about the controllers that we need...

Controllers

Now, I also wanted to use this as an opportunity to learn ZendFramework, because I haven't really had a serious play with it yet. Essentially, I downloaded the QuickStart application and modified it for my own needs. So, if you've been through that you should recognize the MVC and DB related libraries.

Here you can see we've got our TicketsController, which manages displaying the collection of existing tickets and adding new tickets to it, and then the TicketController, which can return a representation of a ticket in HTML form, update the attributes of the ticket, and move the ticket to a new state.

Interestingly, our POST methods on the ticket are idempotent. If I submit the same data to an update each time, I change the ticket in the same way. If I repeatedly request a change to the same state, I will continually end up in that state.

In true Agile form, I've only just noticed that I've skipped out the update action, but it's basically the same as fetch but with POST; we'll see how it works in the model.

Now we need to connect those URLs to our actions...

Routing

// Fetch the router that the front controller is already using
$router = $frontController->getRouter();

// ticket/123 maps to the ticket controller's fetch action
$route = new Zend_Controller_Router_Route(
    'ticket/:ticketId',
    array(
        'controller' => 'ticket',
        'action'     => 'fetch',
    )
);
$router->addRoute('ticketFetch', $route);

// ticket/123/booked maps to the ticket controller's changestate action
$route = new Zend_Controller_Router_Route(
    'ticket/:ticketId/:stateHandle',
    array(
        'controller' => 'ticket',
        'action'     => 'changestate',
    )
);
$router->addRoute('ticketChangestate', $route);

$frontController->setRouter($router);

The default routing in ZendFramework handles most of that for us. We just need to add some new routes to capture the parameters we need out of those URLs. As you can see from the larger font, we're after a ticket ID and a state handle.

This lives in the bootstrap file for the application, but for a larger app with more of these, you would probably want to separate out this configuration into a class or classes of its own.

This will hook us up to the actions, so let's see what we're doing when a request gets there...

Actions

public function fetchAction()
{
    // Load the ticket named by the :ticketId route parameter and hand it to the view
    $model = $this->_getModel();
    $ticket = $model->fetchTicket($this->_getParam('ticketId'));
    $this->view->ticket = $ticket;
}

public function changestateAction()
{
    $model = $this->_getModel();
    $ticket = $model->fetchTicket($this->_getParam('ticketId'));

    // Only a POST may change state
    if ($this->getRequest()->isPost()) {
        // Move to the state named by the :stateHandle route parameter, then persist it
        $model->setState($this->_getParam('stateHandle'));
        $model->save();

        // Redirect back to the ticket resource
        $redirector = $this->_helper->getHelper('Redirector');
        $redirector->gotoRoute(
            array('ticketId' => $this->_getParam('ticketId')),
            'ticketFetch');
    }
}

The fetchAction is pretty simple: get our model, ask it for the ticket with the ID we've retrieved from the request. Now obviously in your production app you're going to do some validation, catch some exceptions, display some error pages and whatnot but this is the crux of our logic.

ChangeState gets a little more interesting. Get the ticket from the model again and then, if it's a POST request we'll set the state and SAVE THAT TO A PERSISTENT STORE and then redirect back to the resource. So, in one action we have covered off the MVC code for every single state change.

The only action remaining is update, where all we do is grab the data off our form and pass it through to the model's update method. That is all: if something isn't right, the model will let us know.

So, if we're not doing any work in the VC part of MVC, where's it all happening?

Model

From the collection point of view, it's pretty simple to see what's happening. If you've also run through the ZendFramework example then you'll recognise the table class there, and if you haven't then this is basically just one layer away from the PDO library. As you can imagine we've got a method to fetch all the rows from our table. Again, in a production app we'd introduce some caching and pagination. Then there's also a method that writes to the database.

But this is just the CRUD stuff, this is all straightforward. Where's all this state change stuff we talked about...?

Model

I'm glad you asked, because this is where we start to interact with it. What you can see here is that we construct a ticket by passing in a string that represents the state and some data that we want to use to build the ticket. Now, I'm a big fan of writing classes in such a way that you can never construct an object in an invalid state. This means that the constructor on our class first sets its data and then calls the state's constructor, passing in its own instance. Because we've said that we trust a class's constructor to ensure that it's valid, we know that if we can create the state, it must be good. If not, then an exception will be thrown and we'll let it trickle all the way up to an error template.

Assuming we've constructed a valid class, let's have a look at the interesting methods...
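Since the slide itself is not in these notes, here is a minimal sketch of that construction order under simplified assumptions. The class names (Ticket, DraftedState, BookedState, InvalidStateException) and the "all fields filled in" rule for booking are hypothetical stand-ins, not the Model_* classes from the talk.

```php
<?php
// Simplified, hypothetical stand-ins for the talk's ticket and state classes.
class InvalidStateException extends Exception {}

class DraftedState
{
    public function __construct(Ticket $ticket) {} // a draft may be incomplete
}

class BookedState
{
    // Assumed rule: a ticket may only be Booked once every field is filled in.
    public function __construct(Ticket $ticket)
    {
        $data = $ticket->getData();
        foreach (array('origin', 'destination', 'departs', 'class', 'price') as $field) {
            if (empty($data[$field])) {
                throw new InvalidStateException('Cannot book: missing ' . $field);
            }
        }
    }
}

class Ticket
{
    private $data;
    private $state;

    public function __construct($stateHandle, array $data)
    {
        $this->data = $data;                  // 1. set the data first
        $classname = $stateHandle . 'State';
        $this->state = new $classname($this); // 2. the state validates the ticket, or throws
    }

    public function getData() { return $this->data; }
    public function getState() { return $this->state; }
}

$draft = new Ticket('Drafted', array('origin' => 'Sheffield'));
try {
    new Ticket('Booked', array('origin' => 'Sheffield'));
} catch (InvalidStateException $e) {
    echo $e->getMessage() . "\n"; // Cannot book: missing destination
}
```

Because the exception leaves the constructor, no half-built Booked ticket ever exists for the rest of the application to trip over.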

Model

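public function update(Model_TicketData $data)
{
    // Updating always returns the ticket to the Drafted state;
    // setState() throws if that move is not allowed
    $this->setState('Drafted');
    $this->_data = $data;
}

public function setState($handle)
{
    // Build the state class name, e.g. 'Drafted' gives Model_DraftedTicketState
    $classname = 'Model_'.$handle.'TicketState';
    // Constructing the state validates the transition; an invalid one throws
    $state = new $classname($this);
    $this->_state = $state;
}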

Five lines of real code. Sure, you'll do validation on the state handle, but that should be off in its own method or class. Why is that all there is? Because of what we just mentioned on the previous slide: if we can construct the new state then we assume that the state is valid. Therefore we can set it, and therefore we can change the state of our ticket.

This means the update method depends on some state rules. We are saying here that if you update the data, your ticket must move into the Drafted state. For that to happen, there are only 2 valid states for your ticket to be in: either it's already a draft, or it has just moved into the Booked state, in which case we move it back to Drafted.
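The validation of the state handle mentioned above matters because the handle arrives from the URL and ends up in a `new $classname` call. A minimal sketch of one way to do it is a whitelist; the stateClassFor() function is hypothetical, and the list of handles is taken from the state diagram.

```php
<?php
// Hypothetical whitelist check for state handles; names follow the state diagram.
function stateClassFor($handle)
{
    $allowed = array(
        'Drafted', 'Booked', 'Finalised', 'Cancelled',
        'Refunded', 'Deleted', 'Archived',
    );
    // Normalise case so a URL segment such as "booked" maps to "Booked"
    $normalised = ucfirst(strtolower((string) $handle));
    if (!in_array($normalised, $allowed, true)) {
        throw new InvalidArgumentException('Unknown state handle: ' . $handle);
    }
    return 'Model_' . $normalised . 'TicketState';
}

echo stateClassFor('booked') . "\n"; // Model_BookedTicketState
```

Anything outside the list is rejected before a class name is ever built from it.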

This means that out of MVC, VC weren't really doing any work and now we've just shown that M doesn't really do any work either. Just where is all this business logic hiding?

States

That's where we need to look at the concrete implementations of our state class. If you remember the state diagram we looked at earlier then you can see that each one is represented here. Importantly, we are overriding the constructor in each class and this is where we will put our business logic.

The responsibility of each state is to examine the ticket passed in and decide if it would be valid for that ticket to be in its state.

Let's illustrate that by looking at the Drafted state...

States

class Model_DraftedTicketState extends Model_TicketState_Abstract
{
    public function __construct(Model_Ticket $ticket)
    {
        // The ticket still holds its current state while we are being constructed
        $curState = $ticket->getState();
        if (!$this->isValidPrevState($curState)) {
            // A null state is always valid, so $curState is an object here
            throw new Model_InvalidPrevStateException(
                'Tried to move to "Drafted" state from '.get_class($curState));
        }
    }

    // The null default lets a brand new ticket (no state yet) pass the type hint
    private function isValidPrevState(Model_TicketState_Abstract $state = null)
    {
        if (
            is_null($state) ||
            $state instanceof Model_DraftedTicketState ||
            $state instanceof Model_BookedTicketState
        ) {
            return true;
        }
        return false;
    }
}

Essentially all we need to do is look at the current state of the ticket and decide whether that's a valid state to precede this one. So, in this case a ticket can either have a null state because it's new, already be in the Drafted state, or it can be coming back from the Booked state. Anything else results in us throwing an Exception.

This works the same way for the other states. To move into the Booked state you ensure that the previous state was Drafted or Booked and that all the ticket data has been set.

If you try and construct a Finalised state then it can check if the current date is within 24 hours of the departure date and so on.
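The 24-hour rule for the Finalised state could be sketched as below. To keep the example self-contained, this hypothetical FinalisedState takes timestamps directly rather than a ticket object, and canFinalise() is an invented helper; treating "within 24 hours of departure" as "24 hours or less to go, or already departed" is also an assumption.

```php
<?php
// Hypothetical helper: true once $now is within 24 hours of departure (or past it).
// Both arguments are Unix timestamps; passing $now in keeps the rule testable.
function canFinalise($departs, $now)
{
    return ($departs - $now) <= 24 * 60 * 60;
}

class FinalisedState
{
    // Simplified signature for the sketch: timestamps instead of a ticket object
    public function __construct($departs, $now)
    {
        if (!canFinalise($departs, $now)) {
            throw new DomainException(
                'Too early to finalise: more than 24 hours before departure');
        }
    }
}

$departs = strtotime('2009-02-27 09:00:00 UTC');
var_dump(canFinalise($departs, strtotime('2009-02-26 12:00:00 UTC'))); // bool(true): 21 hours to go
var_dump(canFinalise($departs, strtotime('2009-02-25 12:00:00 UTC'))); // bool(false): 45 hours to go
```

As with the Drafted state, the rule lives entirely in the constructor, so a Finalised state that exists is by definition a valid one.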

All your business logic hinges on these state classes. And that's it: that's all you need to make the application work. Make a POST request to any state you want, and as long as it can be constructed the model will get updated. If not, you get an exception and nothing changes. So, that really... is that. So, why do I think this is a good idea?

Benefits

Business logic solely in

Model_*TicketState classes

Firstly, we've managed a proper separation of concerns. The reason you choose to use a framework is that it handles problems other people have already solved: you don't want to solve them again and you don't want to spend your time understanding the solution either.

Benefits

Easy to add new states

No change to MVC classes

Also, if the business logic changes then you just need to make sure that it can be expressed in terms of a new state. If so, then all you need to do is add a new state class, update the classes that link to it, and probably get your content monkeys to create some new pages for you. It's all modular and you're not going back to the MVC classes, because that's what your framework is handling for you.

Benefits

Decoupled

I'm tempted to go hook this up to Zend_Rest

Finally, it's decoupled. That set of state classes and its model really doesn't care about the data it's receiving. While I was writing this, I noticed the Zend_Rest library and I'm really quite tempted, as the slide says, to have a go at just plugging it in and seeing if I can get my PUT and DELETE methods in there and properly supported.

Of course, I don't want to just evangelise to you. Nothing is ever a silver bullet to solve all your problems... so what's the catch?

Cons

Lots of classes

You've got a lot of classes. Now, this is debatable as to whether or not it's a problem. For this particular example, I think the state pattern might be overkill if the system never moved beyond its current requirements, so all those separate classes might just make things over-complicated.

Cons

Method-calling overhead

On top of that, constructing classes and calling methods is expensive in PHP compared to plain function calls. So, again, if you have a one-shot script then sometimes it is worth just whacking in some procedural, non-extensible code if it gets the job done.

Cons

Potential to over-engineer!

And finally, and this is possibly the worst one as well: design patterns tend to bring out the futurist in developers. They'll be jumping in saying, "What if we want to send this ticket to a 3rd party?", "What if we want AJAX calls to update individual attributes?"

However, the correct answer to that is to think about where it fits into the current design. Talking to a third party is just a separate part of the model; updating attributes with AJAX calls is just the update method with a few tweaks. There's also the potential that state changes in this resource trigger the creation of new resources. For example, a move to the Refunded state triggers the creation of a TicketRefund class which has its own set of states to progress through.

So, that's me out of ideas for the moment, meaning we have...

Questions

And possibly answers...
