[rest-practices] Async operations and statelessness

Mon Apr 19 20:27:11 UTC 2010

Thanks for the feedback, Bill, some comments in-line ... 

On Mon, 2010-04-19 at 15:53 -0400, Bill Burke wrote:
> Why not just redirect to the VM resource URL?  Then you just see the 
> state of the VM as:
> 
> <vm>
>    ...
>    <status>
>       BOOTING
>    </status>
> </vm>
> 
> <vm>
>    <status>
>      <link rel="boot-log" href="..." type="text/plain"/>
>      FAILED BOOT
>    </status>
>    ...
> </vm>
> 
> <vm>
>     ...
>     <status started="13:52:23">
>       RUNNING
>     </status>
> </vm>

Yeah, that's what I kinda meant by limitation (b). But wouldn't there be
a race condition in that a second client could have come along in the
meantime and initiated another action that could overwrite the state
change flowing from the first client's action? 

From the point of view of an individual user, they'd probably be most
interested in knowing whether _their_ action succeeded or failed, not so
much how the _most recent_ action applied to the VM panned out.    
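
To make the race concrete, here's a rough sketch in Python. The VM class,
the action ids and the outcome strings are all made up for illustration,
they're not part of any real API. It just shows a client that polls late
drawing the wrong conclusion from the VM state alone, while a per-action
record still tells it what happened to _its_ request.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class VM:
    """Toy VM model (hypothetical): a current status plus per-action outcomes."""
    status: str = "RUNNING"
    # Outcome of each action, keyed by a per-action id.
    outcomes: dict = field(default_factory=dict)
    _ids: count = field(default_factory=count)

    def apply(self, action: str, new_status: str) -> int:
        """Apply an action, record its outcome, and return the action id."""
        action_id = next(self._ids)
        self.status = new_status
        self.outcomes[action_id] = f"{action}: SUCCEEDED"
        return action_id


vm = VM()
stop_id = vm.apply("stop", "DOWN")    # first client stops the VM
vm.apply("start", "RUNNING")          # second client restarts it in the meantime

# The first client only polls now. Judging by the VM state, its stop looks failed ...
inferred_from_vm = "SUCCEEDED" if vm.status == "DOWN" else "FAILED?"
# ... whereas the per-action record still reflects what actually happened.
recorded = vm.outcomes[stop_id]

print(inferred_from_vm, "|", recorded)
```

The catch, of course, is that the outcomes dict is exactly the kind of
retained per-action state the original question is about.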

> Even if you create a specific resource for the action, I don't 
> understand why it would be stateful?  It's not session oriented, it is 
> modeling the state of the VM.  

Well, it seemed to me that it's more modeling the outcome of a particular
action applied to the VM, as opposed to the VM itself. So you could
think of the current VM state being the result of a whole sequence of
actions, with more recent operations undoing the effect of older ones.
Keeping the outcome of past state transitions around just to facilitate
the clients that initiated the corresponding actions seems just a tad
sessioney to me. 

But perhaps it's OK if we think of it in terms of an audit/event log, so
that the URI handed out to the client gives it the ability to search the
log for all related events, for example, or points to the "future" audit
record that'll be written on completion.
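
Something along these lines is what I have in mind, as a minimal sketch.
The Event fields, the "a1"/"a2" correlation ids and the /events?action=<id>
URI in the comment are assumptions for the sake of the example. The point
is that the log is a resource in its own right, and the client's URI is
just a query over it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    action_id: str   # hypothetical correlation id handed back to the client
    vm_id: str
    message: str


class AuditLog:
    """Append-only event log; entries are kept for auditing, not per client."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def for_action(self, action_id: str) -> List[Event]:
        """Roughly what a GET on a URI like /events?action=<id> might return."""
        return [e for e in self._events if e.action_id == action_id]


log = AuditLog()
log.append(Event("a1", "999", "stop requested"))
log.append(Event("a2", "999", "start requested"))
log.append(Event("a1", "999", "stop completed: VM is DOWN"))
log.append(Event("a2", "999", "start completed: VM is RUNNING"))

# The first client sees only the events for its own action, even though the
# VM has since been restarted by someone else.
mine = [e.message for e in log.for_action("a1")]
print(mine)
```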

> Any concurrent reboot would just redirect 
> to this new boot resource or just throw 412 with a message that the vm 
> is already being booted or something like that.

Yeah, if the subsequent operation really was concurrent, then that would
work. But the problematic case would be where a subsequent state change
sneaks in between the first operation completing and the client getting
around to checking on the outcome. 

I may be making a mountain out of a mole-hill here. In which case, I'd
be happy to follow the simpler approach.

Cheers,
Eoghan

> 
> Eoghan Glynn wrote:
> > Hi Folks,
> > 
> > I wanted to get your feeling on the question of breaking statelessness
> > when asynchronous operations are used, in particular in terms of the
> > protocol used to check the outcome of the deferred task.
> > 
> > So say we need to model a potentially long-lived operation, such as
> > migrating a VM. To avoid tying up a connection for the duration, the
> > server could respond to the "POST /vms/999/reboot" request with a 202
> > Accepted, along with a unique URI to be used subsequently to check on
> > the status of the migration. 
> > 
> > Now it's the lifecycle of the temporary resource represented by this URI
> > that seems to be potentially problematic. While the async operation is
> > still in flight, there's no problem. However the question is how long
> > _after_ the operation has completed should we maintain this resource so
> > that the client can eventually determine the outcome? We've no guarantee
> > that the client will poll regularly, so say we impose some arbitrary
> > expiry, maybe 10 minutes after the task has completed. But even for that
> > limited time we seem to be breaking one of the fundamental Fielding
> > commandments, the one that demands session state is kept entirely on the
> > client side. After the task completes, this URI no longer represents a
> > resource per se, rather we'd just be keeping it around for a while to
> > support our conversation with the client.  
> > 
> > So another more extreme approach would be to limit the client to:
> > 
> > (a) checking whether the async operation is still being processed, so
> > that the status URI is only valid until the task completes,
> >  
> > (b) inferring from the current state of the VM whether the task may have
> > succeeded or failed, for example the client gets a big hint that its
> > stop operation has failed if the task has completed but the VM state
> > hasn't transitioned to DOWN (of course there's a race here, as the VM
> > may simply have been restarted in the meantime by another agent),
> > 
> > and,
> > 
> > (c) getting an indirect indication of the failure reason by scanning an
> > event/audit log.
> > 
> > Obviously the stricter approach makes life a lot more awkward for the
> > client, whereas the stateful approach would be much more convenient.
> > 
> > Feedback welcome!
> > 
> > Cheers,
> > Eoghan
> > 
>
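
For reference, the stateful variant described in the original message (202
Accepted plus a status URI that lingers for a while after completion) could
be sketched roughly as below. This is an in-memory stand-in with no real
HTTP involved. The /tasks/<id> layout, the TaskRegistry name and the choice
of 410 once the outcome has expired are all assumptions, only the 202 and
the 10 minute expiry come from the message itself.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

RETENTION_SECONDS = 600  # the "maybe 10 minutes" expiry from the message


@dataclass
class Task:
    outcome: Optional[str] = None         # None while still in flight
    completed_at: Optional[float] = None  # timestamp of completion, if any


class TaskRegistry:
    """In-memory stand-in for the server side (hypothetical)."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Task] = {}
        self._next_id = 1

    def submit(self) -> Tuple[int, str]:
        """Model the POST: returns (202, status URI)."""
        task_id = str(self._next_id)
        self._next_id += 1
        self._tasks[task_id] = Task()
        return 202, f"/tasks/{task_id}"   # hypothetical URI layout

    def complete(self, uri: str, outcome: str, now: float) -> None:
        """Record the outcome when the deferred operation finishes."""
        task = self._tasks[uri.rsplit("/", 1)[-1]]
        task.outcome, task.completed_at = outcome, now

    def poll(self, uri: str, now: float) -> Tuple[int, Optional[str]]:
        """Model the GET on the status URI."""
        task = self._tasks.get(uri.rsplit("/", 1)[-1])
        if task is None:
            return 404, None
        if task.completed_at is None:
            return 200, "IN_PROGRESS"
        if now - task.completed_at > RETENTION_SECONDS:
            return 410, None              # assumed: outcome no longer retained
        return 200, task.outcome


registry = TaskRegistry()
status, uri = registry.submit()
in_flight = registry.poll(uri, now=0.0)
registry.complete(uri, "SUCCEEDED", now=5.0)
prompt = registry.poll(uri, now=60.0)
late = registry.poll(uri, now=5.0 + RETENTION_SECONDS + 1)
print(status, uri, in_flight, prompt, late)
```

A client that polls within the retention window gets the outcome, one that
shows up later gets nothing, which is the awkwardness the arbitrary expiry
introduces.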