[Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

Richard W.M. Jones rjones at redhat.com
Sat Jun 8 17:45:54 UTC 2019


I think in general we should have a good idea of the thread model that
we support with libnbd.  At the moment we have a handle lock which
only means that simultaneous calls into the handle won't corrupt it;
it doesn't guarantee that the handle state won't change between calls
if the handle is accessible from two threads.  Is that the model
we're trying to achieve, or something else?  Does libnbd need to
support more complex thread models, or can we rely on callers to use
their own mutex on top?
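
For example (hypothetical caller code): the handle lock makes each
individual call atomic, but a caller that needs the state to stay
stable across two calls still has to serialize them itself:

  /* caller_lock is the caller's own mutex, not h->lock. */
  pthread_mutex_lock (&caller_lock);
  if (nbd_aio_is_ready (nbd)) {
    /* Without caller_lock another thread could move the handle out
     * of the ready state here, between the check and the next call.
     */
    issue_next_command (nbd);           /* hypothetical helper */
  }
  pthread_mutex_unlock (&caller_lock);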

More comments below ...

On Wed, Jun 05, 2019 at 10:23:57AM -0500, Eric Blake wrote:
> I'm also wondering if we should treat external events more like
> interrupts, where calling nbd_aio_notify_read() in an incorrect state
> succeeds by setting a flag that is not cleared until we later reach a
> state that can actually care about the flag, rather than by rejecting
> the call outright for being in the wrong state.  (That is, states where
> we are NOT expecting a notify_read() are the same as setting an IRQ mask
> that we don't want to be interrupted by a read notification yet, but
> we don't want to lose the interrupt when we later clear the mask)

Yes, I think this is a good idea, and not difficult to implement.
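
A minimal sketch of what that could look like (the
state_expects_notify_read helper and pending_notify_read field are
made-up names, and I'm assuming the generated entry point still calls
into nbd_internal_run):

  int
  nbd_unlocked_aio_notify_read (struct nbd_handle *h)
  {
    if (!state_expects_notify_read (h->state)) {
      h->pending_notify_read = true;    /* "masked": latch the event */
      return 0;
    }
    return nbd_internal_run (h, notify_read);  /* "unmasked": act now */
  }

and then any state that can consume the event checks and clears the
flag when we next enter it:

  if (h->pending_notify_read) {
    h->pending_notify_read = false;     /* clear the latch */
    /* ... process the read notification now ... */
  }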

> In fact, CmdIssue is somewhat already like this - the code determines if
> the state machine is unmasked (in the READY state) to kick it off
> immediately; or is masked (in a processing state) to queue things but
> leave the interrupt set (the next time we enter READY, we unmask,
> observe the queue is non-empty, and so fire off the next command).

CmdIssue is also an external event.  If it were a level-triggered
interrupt then we'd set a flag in the handle when it happens, and the
state machine could do all of the above in generated code.

The only potential problem I can think of at the moment is the usual
one with interrupts, namely that second and subsequent interrupts are
lost if the first one hasn't been processed yet.  However, for the
external events we have at the moment:

  | NotifyRead                  (* fd becomes ready to read *)
  | NotifyWrite                 (* fd becomes ready to write *)
  | CmdCreate                   (* [nbd_create] function called *)
  | CmdConnectSockAddr          (* [nbd_aio_connect] function called *)
  | CmdConnectUnix              (* [nbd_aio_connect_unix] *)
  | CmdConnectTCP               (* [nbd_aio_connect_tcp] *)
  | CmdConnectCommand           (* [nbd_aio_connect_command] *)
  | CmdIssue                    (* issuing an NBD command *)

it seems as if this wouldn't be an issue for any of them.  My
reasoning is: NotifyRead/Write already work like this in the kernel.
CmdCreate is only sent once.  CmdConnect* are only sent once.
CmdIssue can be sent multiple times, but commands are queued, so as
long as the writer side always checks the issue queue at appropriate
points we're OK.
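
In other words the condition the state machine acts on for CmdIssue
is "queue non-empty", which is level-triggered rather than
edge-triggered.  A simplified sketch of the check on (re-)entering
READY (the state and field names here are illustrative):

  case STATE_READY:
    /* Because we test the queue itself, coalescing several CmdIssue
     * events into one cannot drop a command.
     */
    if (h->cmds_to_issue != NULL)
      SET_NEXT_STATE (STATE_ISSUE_COMMAND_START);
    break;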

> Another thought - if we ever want to allow the user the ability to send
> custom NBD_OPT_ commands during handshake, or even to emulate 'qemu-nbd
> --list' where the client can make queries to see what the server
> supports before finally settling on whether to run NBD_OPT_GO or
> NBD_OPT_ABORT, we'll need to add an external event after nbd_aio_connect
> but before nbd_aio_is_connected for doing those additional handshake
> steps. It's easier to think about adding a mandatory nbd_aio_go() called
> in between nbd_aio_connect*() and the first nbd_aio_pread now, before we
> have to start worrying about back-compat issues to all existing AIO clients.

I envisaged that we'd just set a flag in the handle to indicate that
we were in "list mode".
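
Something along these lines (the list_mode flag and send_option
helper are made up for illustration; the NBD_OPT_* names are the
protocol constants):

  if (h->list_mode) {
    send_option (h, NBD_OPT_LIST);    /* query exports ... */
    /* ... report the results to the caller, then ... */
    send_option (h, NBD_OPT_ABORT);   /* never enter transmission phase */
  }
  else
    send_option (h, NBD_OPT_GO);      /* current behaviour */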

> > @@ -2866,8 +2866,11 @@ let generate_lib_api_c () =
> 
> As said above, I think you're still missing a change at the beginning of
> the wrapper that grabs the lock to do the state check after the lock.

AFAIK the code is safe now?  I'm not sure what you mean by this.

> >      let argnames = List.flatten (List.map name_of_arg args) in
> >      List.iter (pr ", %s") argnames;
> >      pr ");\n";
> > -    if is_locked then
> > -      pr "  pthread_mutex_unlock (&h->lock);\n";
> > +    if is_locked then (
> > +      pr "  if (h->state != h->next_state)\n";
> > +      pr "    h->state = h->next_state;\n";
> > +      pr "  pthread_mutex_unlock (&h->lock);\n"
> > +    );
> >      pr "  return ret;\n";
> >      pr "}\n";
> >      pr "\n";
> > diff --git a/lib/connect.c b/lib/connect.c
> > index b889f80..4e3141f 100644
> > --- a/lib/connect.c
> 
> > +++ b/lib/internal.h
> > @@ -80,7 +80,17 @@ struct nbd_handle {
> >    /* Linked list of close callbacks. */
> >    struct close_callback *close_callbacks;
> >  
> > -  _Atomic enum state state;     /* State machine. */
> > +  /* State machine.
> > +   *
> > +   * The actual current state is ‘next_state’.  ‘state’ is updated
> > +   * before we release the lock.
> > +   *
> > +   * Note don't access these fields directly, use the SET_NEXT_STATE
> > +   * macro in generator/states* code, or the set_next_state,
> > +   * get_next_state and get_state macros in regular code.
> > +   */
> > +  _Atomic enum state state;
> > +  enum state next_state;
> >  
> 
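
For what it's worth, the accessors mentioned in that comment could be
as simple as this (assumed definitions, for illustration only):

  #define get_state(h)        ((h)->state)      /* safe outside the lock */
  #define get_next_state(h)   ((h)->next_state) /* only with h->lock held */
  #define set_next_state(h,s) ((h)->next_state = (s))
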
> Otherwise the patch looks like you caught the right places.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
