[Libguestfs] [libnbd PATCH 4/4] copy: Ignore extents if we encountered earlier error

Wed Feb 2 14:54:35 UTC 2022

On Wed, Feb 02, 2022 at 09:42:46AM +0100, Laszlo Ersek wrote:
> On 02/01/22 20:44, Eric Blake wrote:
> > In practice, the nbd extents callback will only be called once per
> > nbd_aio_block_status, because we have only requested exactly one
> > metacontext id, and because compliant servers send only one reply
> > chunk per id.  In theory, a server could reply first with an error
> > chunk (perhaps meaning "I'm unable to get status beyond offset X")
> > followed by an extent chunk ("but here's what I was able to get")
> > while still being compliant.  And a non-compliant server can send
> > unexpected chunks, just to try and confuse us.  Right now, it would
> > take a custom NBD server to actually trigger any scenario where *error
> > would ever be non-zero on entry to next_extent() (nbdkit does not have
> > that much power).  So while *error ever being non-zero is unlikely,
> > our easiest course is to handle it the same way we would for a server
> > sending us an unexpected id: ignore the rest of the extent callback.
> > The later completion callback will handle the overall command failing.
> 
> This commit message mostly answers my question under the cover letter,
> and indeed it's quite the opposite of what I expected. I'd have expected
> the command (the exchange with the server) completing, upon which time
> the completion callback would have run; then the results (a series of
> extent callbacks) would be passed to the extent callback.
> 
> So, that's precisely *not* the case in reality. :/

Yeah, understanding the control flow is helpful.  The first thing to
note: every client command has a sticky error field (see struct
command->error); this field starts life 0, and while the command is
live, it is set to non-zero in the following scenarios: if the client
never sends the command to the server (for example, settings from
nbd_set_strict_mode() determined it would be an out-of-bounds
request); when the server replies with NBD_REPLY_TYPE_ERROR in a
structured reply (although the command may still be live if it wasn't
the final reply chunk); when the server replies with an error in a
simple reply (with simple replies, the command is necessarily
complete); or when a client mid-command callback returns -1 (only
pread_structured and block_status have these, and the command may
still be live if it wasn't the final reply chunk).  Once the command
is complete, then we call the completion callback (if there is one)
with *error set to the value of command->error, giving one more chance
to change the value stored in command->error; and then
nbd_aio_command_completed returns the value of command->error as the
error status of the overall command (unless the completion callback
returned 1 to auto-retire the command first).

Then we have nbd_aio_block_status which kicks off a transaction that
will eventually send NBD_CMD_BLOCK_STATUS, and is associated with TWO
callbacks: the mandatory mid-command nbd_extents callback to read
block status information as it comes over the wire from the server,
and the optional nbd_completion callback to mark when the server has
nothing more to send.

The documentation states that the nbd_extents callback can be called 0
or more times (0 if the server replies with an error, or even if
nbd_set_strict_mode() determines that the command violates the NBD
protocol and is never even sent to the server).  If you only have one
metacontext id (as in nbdcopy only caring about base:allocation), one
time is all the more we expect from a compliant server (a noncompliant
server can send more than one reply, but we'd have to code up such a
custom server, as nbdkit does not behave that way).  The *error
parameter passed to nbd_extents is non-zero only if there is a running
error detected so far (generally unlikely; a server would have to send
an error chunk before a status chunk, or send more than one status
chunk where the libnbd client set *error during the first time the
callback was reached).

Then the nbd_completion callback is always called exactly once, after
there is nothing further to be done with the command.  The *error
parameter passed to the nbd_completion callback is the same value that
will be returned from nbd_aio_command_comppleted (unless the callback
returns 1 for auto-retiring).

And finally we have nbd_block_status: a synchronous thin wrapper that
calls nbd_aio_block_status, nbd_aio_poll, nbd_aio_command_completed
under the hood.  The nbd_extents callback is still mandatory, and is
called mid-command prior to the function call completing, in response
to server information as it comes over the wire.  The optional
nbd_completion callback of nbd_aio_block_status is not exposed (the
command completes synchronously, instead; whether we used the
completion callback under the hood is an implementation detail that
doesn't matter to the public interface).

I'm open to a documentation patch that gives more clarity on this
sequence of callbacks.

> 
> > ---
> >  copy/nbd-ops.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c
> > index dfc62f20..ab30badd 100644
> > --- a/copy/nbd-ops.c
> > +++ b/copy/nbd-ops.c
> > @@ -1,5 +1,5 @@
> >  /* NBD client library in userspace.
> > - * Copyright (C) 2020 Red Hat Inc.
> > + * Copyright (C) 2020-2022 Red Hat Inc.
> >   *
> >   * This library is free software; you can redistribute it and/or
> >   * modify it under the terms of the GNU Lesser General Public
> > @@ -337,7 +337,7 @@ add_extent (void *vp, const char *metacontext,
> >    extent_list *ret = vp;
> >    size_t i;
> > 
> > -  if (strcmp (metacontext, "base:allocation") != 0)
> > +  if (strcmp (metacontext, "base:allocation") != 0 || *error)
> >      return 0;
> > 
> >    for (i = 0; i < nr_entries; i += 2) {
> > 
> 
> So, I don't understand this. First, here we have a non-aio (i.e.,
> synchronous) nbd_block_status call. in case add_extent (the extents
> callback) is entered with *error !=0, that means (per docs) "the error
> parameter contains the errno value of any previously detected error".

I guess what the docs are missing is the disclaimer that if the
mid-command callback is reached, there is usually no previously
detected error.  The only two situations where there is a previous
error is if the server replies with NBD_REPLY_TYPE_ERROR first and
then NBD_REPLY_TYPE_BLOCK_STATUS second (a custom server), or if the
client negotiated more than one context and the first use of the
callback assigns an error (by returning -1) before the second callback
is reached.  The other cases where an error could be detected never
reach the mid-command callback (if nbd_set_strict_mode() prevented the
command from ever going to the server, or if the server replied with
an error as a final chunk).

> 
> In case we responded by returning -1, then the following (per docs)
> would apply:
> 
> "If the callback returns -1, and no earlier error has been detected,
> then the overall block status command will fail with any non-zero value
> stored into the callback's error parameter"
> 
> But we don't do that, we say "everything is OK, I'll just ignore this",
> like we ignore metadata contexts that we don't need.

Yes. But even though the mid-command callback ignores the error, the
synchronous wrapper will still see the same error via
nbd_aio_command_completed under the hood, so the overall
nbd_blocK_status will fail with the same value.

> 
> So how is the "later completion callback" going to fail the entire
> nbd_block_status() call? First, because this is not an AIO
> nbd_block_status() call, we don't provide the completion callback
> ourselves -- I guess it is some default function somewhere in libnbd.

Right - the synchronous wrappers return the same value as would be
present in *error given to the completion callback of an asynchronous
API (whether the synchronous wrappers actually use the completion
callback, or get at the value directly, is an implementation detail).
And remember that the completion callback to the asynchronous APIs is
optional; if it is not provided, that error value is visible through
nbd_aio_command_completed.

> 
> Or is it the following: upon getting (*error != 0) on entry, we just
> decline looking at the (likely nonsensical) "entries" array, and permit
> the error we got via *error to "survive" through all further
> add_extent() calls, and to finally fail the outermost nbd_block_status()
> call?

Yes, that is the best way of looking at it.  In fact, the "entries"
array is most likely valid, rather than non-sensical, but since we
know that our response to a failing nbd_block_status() is to die,
there's no point wasting time processing the valid extents to prolong
the time before we die.  In other words, this patch is optional; it
does not add safety (as the error will resurface as the result of
nbd_block_status, where we will fail), but merely speed (since we are
going to fail anyways, why waste the time getting to that failure).

> 
> Next question: even if that's the case, should we *first* check *error,
> and only check "metacontext" if *error is zero? What says that
> "metacontext" is defined if *error is nonzero?

The nbd_extents callback is only reached any time the server replies
with NBD_REPLY_TYPE_BLOCK_STATUS with a metacontext id that we
recognized.  If we did not recognize the id, or if the server never
replies with that chunk, then we don't call the mid-command callback
in the first place.  So the "metacontext" and "extents" parameters to
nbd_extents are always valid, even if *error is nonzero.

And now that I've written this email, I wish I could incorporate some
of it into the commit message, even though it is already pushed.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org