[rest-practices] Async operations and statelessness

Mon Apr 19 19:20:43 UTC 2010

Hi Folks,

I wanted to get your feeling on the question of breaking statelessness
when asynchronous operations are used, in particular in terms of the
protocol used to check the outcome of the deferred task.

So say we need to model a potentially long-lived operation, such as
migrating a VM. To avoid tying up a connection for the duration, the
server could respond to the "POST /vms/999/migrate" request with a 202
Accepted, along with a unique URI to be used subsequently to check on
the status of the migration.
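
To make that exchange concrete, here's a rough sketch of the server side
in plain Python (no web framework). The /tasks/<id> URI scheme, the
PENDING state name and the function names are all my own assumptions for
illustration, not anything we've agreed on.

```python
import itertools

# Hypothetical in-memory task registry; all names here are illustrative.
_task_ids = itertools.count(1)
tasks = {}


def start_operation(vm_id, action):
    """Handle POST /vms/<vm_id>/<action> by accepting the work and returning at once."""
    task_id = next(_task_ids)
    tasks[task_id] = {"vm": vm_id, "action": action, "state": "PENDING"}
    status_uri = f"/tasks/{task_id}"  # assumed URI scheme for the status resource
    # 202 Accepted: the request has been taken on but processing is not finished.
    return 202, {"Location": status_uri}, ""


def get_status(task_id):
    """Handle GET /tasks/<task_id> by reporting where the deferred task stands."""
    task = tasks.get(task_id)
    if task is None:
        return 404, {}, ""
    return 200, {}, task["state"]
```

The client would hold on to the Location value and poll it with GET
until the state changes.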

Now it's the lifecycle of the temporary resource represented by this URI
that seems potentially problematic. While the async operation is
still in flight, there's no problem. However, the question is how long
_after_ the operation has completed should we maintain this resource so
that the client can eventually determine the outcome? We've no guarantee
that the client will poll regularly, so say we impose some arbitrary
expiry, maybe 10 minutes after the task has completed. But even for that
limited time we seem to be breaking one of the fundamental Fielding
commandments, the one that demands session state is kept entirely on the
client side. After the task completes, this URI no longer represents a
resource per se, rather we'd just be keeping it around for a while to
support our conversation with the client.  
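
For what it's worth, the expiry scheme described above might look
something like the sketch below. The TaskStore class, the state names
and the choice of 404 for an expired task are assumptions on my part;
only the 10-minute window comes from the example above. The clock is
passed in explicitly to keep the sketch easy to follow.

```python
RETENTION_SECONDS = 10 * 60  # the arbitrary 10-minute expiry mentioned above


class TaskStore:
    """Hypothetical store that keeps a finished task around for a limited time."""

    def __init__(self):
        self._tasks = {}

    def start(self, task_id):
        self._tasks[task_id] = {"outcome": None, "completed_at": None}

    def complete(self, task_id, outcome, now):
        self._tasks[task_id] = {"outcome": outcome, "completed_at": now}

    def lookup(self, task_id, now):
        """Return (status_code, body) for a GET on the task's status URI."""
        task = self._tasks.get(task_id)
        if task is None:
            return 404, None
        if task["completed_at"] is None:
            return 200, "IN_PROGRESS"
        if now - task["completed_at"] > RETENTION_SECONDS:
            # Retention window has passed: forget the task entirely.
            # Answering 404 here is an assumption; 410 Gone is another option.
            del self._tasks[task_id]
            return 404, None
        return 200, task["outcome"]
```

Note that the server is holding the outcome purely for the benefit of a
client that may or may not come back, which is exactly the state I'm
uneasy about.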

So another more extreme approach would be to limit the client to:

(a) checking whether the async operation is still being processed, so
that the status URI is only valid until the task completes,

(b) inferring from the current state of the VM whether the task may have
succeeded or failed, for example the client gets a big hint that its
stop operation has failed if the task has completed but the VM state
hasn't transitioned to DOWN (of course there's a race here, as the VM
may simply have been restarted in the meantime by another agent),

and,

(c) getting an indirect indication of the failure reason by scanning an
event/audit log.
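
To show how much guesswork steps (a) to (c) push onto the client, here's
a rough sketch of the client-side logic. The function names, the verdict
strings and the shape of the audit log entries are invented for
illustration; a real event log would look different.

```python
def infer_outcome(task_still_running, vm_state, expected_state):
    """Guess the result of an operation from the task's liveness and the VM state."""
    if task_still_running:
        # (a) the status URI is still valid, so the task has not finished
        return "IN_PROGRESS"
    if vm_state == expected_state:
        # (b) only a hint: another agent may have changed the state since
        return "PROBABLY_SUCCEEDED"
    return "PROBABLY_FAILED"


def failure_hints(audit_log, vm_id):
    """(c) collect error messages for one VM from a hypothetical audit log."""
    return [
        entry["message"]
        for entry in audit_log
        if entry["vm"] == vm_id and entry["level"] == "ERROR"
    ]
```

Even in this toy form the best the client can say is "probably", which
is the race mentioned under (b).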

Obviously the stricter approach makes life a lot more awkward for the
client, whereas the stateful approach would be much more convenient.

Feedback welcome!

Cheers,
Eoghan