[Libguestfs] [libnbd PATCH v3 04/22] states: Prepare to send 64-bit requests

Eric Blake eblake at redhat.com
Tue May 30 18:18:22 UTC 2023


On Tue, May 30, 2023 at 04:06:32PM +0200, Laszlo Ersek wrote:
> On 5/25/23 15:00, Eric Blake wrote:
> > Support sending 64-bit requests if extended headers were negotiated.
> > This includes setting NBD_CMD_FLAG_PAYLOAD_LEN any time we send an
> > extended NBD_CMD_WRITE; this is such a fundamental part of the
> > protocol that for now it is easier to silently ignore whatever value
> > the user passes in for that bit in the flags parameter of nbd_pwrite
> > regardless of the current settings in set_strict_mode, rather than
> > trying to force the user to pass in the correct value to match whether
> > extended mode is negotiated.  However, when we later add APIs to give
> > the user more control for interoperability testing, it may be worth
> > adding a new set_strict_mode control knob to explicitly enable the
> > client to intentionally violate the protocol (the testsuite added in
> > this patch would then be updated to match).
> > 
> > At this point, h->extended_headers is permanently false (we can't
> > enable it until all other aspects of the protocol have likewise been
> > converted).
> > 
> > Support for using FLAG_PAYLOAD_LEN with NBD_CMD_BLOCK_STATUS is less
> > fundamental, and deserves to be in its own patch.
> > 
> > Signed-off-by: Eric Blake <eblake at redhat.com>
> > ---

> > @@ -364,7 +370,7 @@ struct command {
> >    uint16_t type;
> >    uint64_t cookie;
> >    uint64_t offset;
> > -  uint32_t count;
> > +  uint64_t count;
> >    void *data; /* Buffer for read/write */
> >    struct command_cb cb;
> >    bool initialized; /* For read, true if getting a hole may skip memset */
> 
> (1) Are there places in the code where we currently assign this "count"
> field back to a uint32_t object, and assume truncation impossible?

Grepping for '->count' in lib/ and generator/ shows we need to check
at least:

generator/states-reply-simple.c:    h->rlen = cmd->count;
generator/states-reply-simple.c:    cmd->data_seen += cmd->count;

which are adjustments to size_t and uint32_t variables respectively,
in response to a server's reply to an NBD_CMD_READ command.  But since
we never send a server a read request larger than 64M, truncation and
overflow are not possible in those lines of code (at most one simple
reply is possible, and code in states-reply-structured.c ensures that
cmd->data_seen is a saturating variable that never exceeds
2*MAX_REQUEST_SIZE).

There is also pre-series:

generator/states-issue-command.c:  h->request.count = htobe32 (cmd->count);

which is specifically updated in this patch to cover extended headers.

> > +++ b/generator/states-issue-command.c
> > @@ -41,15 +41,24 @@  ISSUE_COMMAND.START:
> >      return 0;
> >    }
> > 
> > -  h->request.magic = htobe32 (NBD_REQUEST_MAGIC);
> > -  h->request.flags = htobe16 (cmd->flags);
> > -  h->request.type = htobe16 (cmd->type);
> > -  h->request.handle = htobe64 (cmd->cookie);
> > -  h->request.offset = htobe64 (cmd->offset);
> > -  h->request.count = htobe32 (cmd->count);
> > +  /* These fields are coincident between req.compact and req.extended */
> > +  h->req.compact.flags = htobe16 (cmd->flags);
> > +  h->req.compact.type = htobe16 (cmd->type);
> > +  h->req.compact.handle = htobe64 (cmd->cookie);
> > +  h->req.compact.offset = htobe64 (cmd->offset);
> 
> What's more, this is a "by the book" common initial sequence! :)

> 
> > +  if (h->extended_headers) {
> > +    h->req.extended.magic = htobe32 (NBD_EXTENDED_REQUEST_MAGIC);
> > +    h->req.extended.count = htobe64 (cmd->count);
> > +    h->wlen = sizeof (h->req.extended);
> > +  }
> > +  else {
> > +    assert (cmd->count <= UINT32_MAX);
> > +    h->req.compact.magic = htobe32 (NBD_REQUEST_MAGIC);
> > +    h->req.compact.count = htobe32 (cmd->count);
> > +    h->wlen = sizeof (h->req.compact);
> > +  }

Indeed, and shows why my efforts to get a sane layout early in the
series matter, even if it will cause me a bit more rebase churn here
based on my response to your comments earlier in the series.

> > @@ -358,6 +356,15 @@ nbd_unlocked_aio_pwrite (struct nbd_handle *h, const void *buf,
> >        return -1;
> >      }
> >    }
> > +  /* It is more convenient to manage PAYLOAD_LEN by what was negotiated
> > +   * than to require the user to have to set it correctly.
> > +   * TODO: Add new h->strict bit to allow intentional protocol violation
> > +   * for interoperability testing.
> > +   */
> > +  if (h->extended_headers)
> > +    flags |= LIBNBD_CMD_FLAG_PAYLOAD_LEN;
> > +  else
> > +    flags &= ~LIBNBD_CMD_FLAG_PAYLOAD_LEN;
> 
> Nice -- I wanted to ask for:
> 
>     flags &= ~(uint32_t)LIBNBD_CMD_FLAG_PAYLOAD_LEN;
> 
> due to LIBNBD_CMD_FLAG_PAYLOAD_LEN having type "int".
> 
> However: in patch#3, what has type "int" is:
> 
> +#define NBD_CMD_FLAG_PAYLOAD_LEN (1<<5)
> 
> and here we have LIBNBD_CMD_FLAG_PAYLOAD_LEN instead -- and the latter
> has type unsigned int already, from your recent commit 69eecae2c03a
> ("api: Generate flag values as unsigned", 2022-11-11).

Still, worth a (separate) cleanup patch to nbd-protocol.h to prefer
unsigned constants for the flag values where they are not generated.

> 
> And I think we're fine assuming that uint32_t is unsigned int.

Not true of all generic C platforms, but certainly true for the
POSIX-like platforms we target (anyone that defines uint32_t as
'unsigned long' on a platform with 32-bit longs is unusual, but even
then we should still be okay).

> 
> > 
> >    SET_CALLBACK_TO_NULL (*completion);
> >    return nbd_internal_command_common (h, flags, NBD_CMD_WRITE, offset, count,
> > diff --git a/tests/Makefile.am b/tests/Makefile.am
> > index 3a93251e..8b839bf5 100644
> > --- a/tests/Makefile.am
> > +++ b/tests/Makefile.am
> > @@ -232,6 +232,7 @@ check_PROGRAMS += \
> >  	closure-lifetimes \
> >  	pread-initialize \
> >  	socket-activation-name \
> > +  pwrite-extended \
> >  	$(NULL)
> > 
> >  TESTS += \
> 
> (2) Incorrect indentation: two spaces rather than one tab.

Arrgh.  ./.editorconfig is supposed to do this correctly, but
obviously its interaction with emacs is a bit botched when it comes to
Makefile syntax.  Will clean up.

> > +++ b/tests/pwrite-extended.c

> > +static void
> > +check (int experr, const char *prefix)
> > +{
> > +  const char *msg = nbd_get_error ();
> > +  int errnum = nbd_get_errno ();
> > +
> > +  fprintf (stderr, "error: \"%s\"\n", msg);
> > +  fprintf (stderr, "errno: %d (%s)\n", errnum, strerror (errnum));
> > +  if (strncmp (msg, prefix, strlen (prefix)) != 0) {
> > +    fprintf (stderr, "%s: test failed: missing context prefix: %s\n",
> > +             progname, msg);
> > +    exit (EXIT_FAILURE);
> > +  }
> > +  if (errnum != experr) {
> > +    fprintf (stderr, "%s: test failed: "
> > +             "expected errno = %d (%s), but got %d\n",
> > +             progname, experr, strerror (experr), errnum);
> > +    exit (EXIT_FAILURE);
> > +  }
> > +}
> > +
> > +int
> > +main (int argc, char *argv[])
> > +{
> > +  struct nbd_handle *nbd;
> > +  const char *cmd[] = {
> > +    "nbdkit", "-s", "-v", "--exit-with-parent", "memory", "1048576", NULL
> > +  };
> > +  uint32_t strict;
> > +
> > +  progname = argv[0];
> > +
> > +  nbd = nbd_create ();
> > +  if (nbd == NULL) {
> > +    fprintf (stderr, "%s\n", nbd_get_error ());
> 
> (3) Minor inconsistency with check(): we're not printing "progname" here.
> 
> > +    exit (EXIT_FAILURE);
> > +  }
> > +
> > +  /* Connect to the server. */
> > +  if (nbd_connect_command (nbd, (char **)cmd) == -1) {
> > +    fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
> > +    exit (EXIT_FAILURE);
> > +  }
> 
> (4) Another kind of inconsistency: we could use "progname" here, in
> place of argv[0].
> 
> (This applies to all other fprintf()s below.)

Probably copy-and-paste from other similar tests, but I don't mind
cleaning those up.

> 
> > +
> > +  /* FIXME: future API addition to test if server negotiated extended mode.
> > +   * Until then, strict flags must ignore the PAYLOAD_LEN flag for pwrite,
> > +   * even though it is rejected for other commands.
> > +   */
> > +  strict = nbd_get_strict_mode (nbd);
> > +  if (!(strict & LIBNBD_STRICT_FLAGS)) {
> > +    fprintf (stderr, "%s: test failed: "
> > +             "nbd_get_strict_mode did not have expected flag set\n",
> > +             argv[0]);
> > +    exit (EXIT_FAILURE);
> > +  }
> 
> Not sure if I understand this check. Per
> <https://libguestfs.org/nbd_set_strict_mode.3.html>, I take it that
> LIBNBD_STRICT_FLAGS should be "on" by default. Are you enforcing that?
> And if so: is it your intent that, *even with* LIBNBD_STRICT_FLAGS, an
> invalid PAYLOAD_LEN is not rejected (as seen by the libnbd client app),
> but fixed up silently?

Rather, until we can tell if the server negotiated extended mode, we
are ASSUMING that the server did NOT negotiate it, and therefore we
are in violation of the spec if we send the flag over the wire
anyways.  We can flag all other API where it is inappropriate to ever
use...

> 
> > +  if (nbd_aio_pread (nbd, buf, 512, 0, NBD_NULL_COMPLETION,
> > +                     LIBNBD_CMD_FLAG_PAYLOAD_LEN) != -1) {
> > +    fprintf (stderr, "%s: test failed: "
> > +             "nbd_aio_pread did not fail with unexpected flag\n",
> > +             argv[0]);
> > +    exit (EXIT_FAILURE);
> > +  }
> > +  check (EINVAL, "nbd_aio_pread: ");
> 
> Ah, got it now. We do want APIs other than pwrite to fail.

...but because we don't want to require clients to correctly decide
when to pass or omit the flag to their API calls (by us masking out
the user's choice and then hardcoding our actual wire behavior based
on negotiated mode), passing the flag to pread works even when it
would be technically wrong over the wire.  The FIXME does get modified
again later in the series, when I do add in support for detecting when
the server supports extended headers.

> 
> > +
> > +  if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION,
> > +                     LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1) {
> > +    fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
> > +    exit (EXIT_FAILURE);
> > +  }
> > +
> > +  if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, 0) == -1) {
> > +    fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
> > +    exit (EXIT_FAILURE);
> > +  }
> 
> You could contract these into:
> 
>   if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION,
>                      LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1 ||
>       nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, 0) == -1) {
>     fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
>     exit (EXIT_FAILURE);
>   }

Sure.

> 
> > +
> > +  nbd_close (nbd);
> > +  exit (EXIT_SUCCESS);
> > +}
> 
> In general, I think it's good practice to reach nbd_close() whenever
> nbd_create() succeeds (that is, on error paths as well, after
> nbd_create() succeeds). For example, if we connected to the server with
> systemd socket activation in this test, and (say) one of the pwrites
> failed, then we'd leak the unix domain socket in the filesystem (from
> the bind() in "generator/states-connect-socket-activation.c").
> "sact_sockpath" is unlinked in nbd_close().
> 
> (As written, this test should not be affected, because, according to
> unix(7), unix domain sockets created with socketpair(2) are unnamed.)

Pre-existing in other tests, but a good observation for a followup patch.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


More information about the Libguestfs mailing list