[Libguestfs] [libnbd PATCH v3 04/22] states: Prepare to send 64-bit requests

Laszlo Ersek lersek at redhat.com
Wed May 31 11:59:12 UTC 2023


On 5/30/23 20:18, Eric Blake wrote:
> On Tue, May 30, 2023 at 04:06:32PM +0200, Laszlo Ersek wrote:
>> On 5/25/23 15:00, Eric Blake wrote:
>>> Support sending 64-bit requests if extended headers were negotiated.
>>> This includes setting NBD_CMD_FLAG_PAYLOAD_LEN any time we send an
>>> extended NBD_CMD_WRITE; this is such a fundamental part of the
>>> protocol that for now it is easier to silently ignore whatever value
>>> the user passes in for that bit in the flags parameter of nbd_pwrite
>>> regardless of the current settings in set_strict_mode, rather than
>>> trying to force the user to pass in the correct value to match whether
>>> extended mode is negotiated.  However, when we later add APIs to give
>>> the user more control for interoperability testing, it may be worth
>>> adding a new set_strict_mode control knob to explicitly enable the
>>> client to intentionally violate the protocol (the testsuite added in
>>> this patch would then be updated to match).
>>>
>>> At this point, h->extended_headers is permanently false (we can't
>>> enable it until all other aspects of the protocol have likewise been
>>> converted).
>>>
>>> Support for using FLAG_PAYLOAD_LEN with NBD_CMD_BLOCK_STATUS is less
>>> fundamental, and deserves to be in its own patch.
>>>
>>> Signed-off-by: Eric Blake <eblake at redhat.com>
>>> ---
> 
>>> @@ -364,7 +370,7 @@ struct command {
>>>    uint16_t type;
>>>    uint64_t cookie;
>>>    uint64_t offset;
>>> -  uint32_t count;
>>> +  uint64_t count;
>>>    void *data; /* Buffer for read/write */
>>>    struct command_cb cb;
>>>    bool initialized; /* For read, true if getting a hole may skip memset */
>>
>> (1) Are there places in the code where we currently assign this "count"
>> field back to a uint32_t object, and assume truncation impossible?
> 
> Grepping for '->count' in lib/ and generator/ shows we need to check
> at least:
> 
> generator/states-reply-simple.c:    h->rlen = cmd->count;
> generator/states-reply-simple.c:    cmd->data_seen += cmd->count;
> 
> which are adjustments to size_t and uint32_t variables respectively,
> in response to a server's reply to an NBD_CMD_READ command.  But since
> we never send a server a read request larger than 64M, truncation and
> overflow are not possible in those lines of code (at most one simple
> reply is possible, and code in states-reply-structured.c ensures that
> cmd->data_seen is a saturating variable that never exceeds
> 2*MAX_REQUEST_SIZE).
> 
> There is also pre-series:
> 
> generator/states-issue-command.c:  h->request.count = htobe32 (cmd->count);
> 
> which is specifically updated in this patch to cover extended headers.
> 
>>> +++ b/generator/states-issue-command.c
>>> @@ -41,15 +41,24 @@  ISSUE_COMMAND.START:
>>>      return 0;
>>>    }
>>>
>>> -  h->request.magic = htobe32 (NBD_REQUEST_MAGIC);
>>> -  h->request.flags = htobe16 (cmd->flags);
>>> -  h->request.type = htobe16 (cmd->type);
>>> -  h->request.handle = htobe64 (cmd->cookie);
>>> -  h->request.offset = htobe64 (cmd->offset);
>>> -  h->request.count = htobe32 (cmd->count);
>>> +  /* These fields are coincident between req.compact and req.extended */
>>> +  h->req.compact.flags = htobe16 (cmd->flags);
>>> +  h->req.compact.type = htobe16 (cmd->type);
>>> +  h->req.compact.handle = htobe64 (cmd->cookie);
>>> +  h->req.compact.offset = htobe64 (cmd->offset);
>>
>> What's more, this is a "by the book" common initial sequence! :)
> 
>>
>>> +  if (h->extended_headers) {
>>> +    h->req.extended.magic = htobe32 (NBD_EXTENDED_REQUEST_MAGIC);
>>> +    h->req.extended.count = htobe64 (cmd->count);
>>> +    h->wlen = sizeof (h->req.extended);
>>> +  }
>>> +  else {
>>> +    assert (cmd->count <= UINT32_MAX);
>>> +    h->req.compact.magic = htobe32 (NBD_REQUEST_MAGIC);
>>> +    h->req.compact.count = htobe32 (cmd->count);
>>> +    h->wlen = sizeof (h->req.compact);
>>> +  }
> 
> Indeed, and shows why my efforts to get a sane layout early in the
> series matter, even if it will cause me a bit more rebase churn here
> based on my response to your comments earlier in the series.
> 
>>> @@ -358,6 +356,15 @@ nbd_unlocked_aio_pwrite (struct nbd_handle *h, const void *buf,
>>>        return -1;
>>>      }
>>>    }
>>> +  /* It is more convenient to manage PAYLOAD_LEN by what was negotiated
>>> +   * than to require the user to have to set it correctly.
>>> +   * TODO: Add new h->strict bit to allow intentional protocol violation
>>> +   * for interoperability testing.
>>> +   */
>>> +  if (h->extended_headers)
>>> +    flags |= LIBNBD_CMD_FLAG_PAYLOAD_LEN;
>>> +  else
>>> +    flags &= ~LIBNBD_CMD_FLAG_PAYLOAD_LEN;
>>
>> Nice -- I wanted to ask for:
>>
>>     flags &= ~(uint32_t)LIBNBD_CMD_FLAG_PAYLOAD_LEN;
>>
>> due to LIBNBD_CMD_FLAG_PAYLOAD_LEN having type "int".
>>
>> However: in patch#3, what has type "int" is:
>>
>> +#define NBD_CMD_FLAG_PAYLOAD_LEN (1<<5)
>>
>> and here we have LIBNBD_CMD_FLAG_PAYLOAD_LEN instead -- and the latter
>> has type unsigned int already, from your recent commit 69eecae2c03a
>> ("api: Generate flag values as unsigned", 2022-11-11).
> 
> Still, worth a (separate) cleanup patch to nbd-protocol.h to prefer
> unsigned constants for the flag values where they are not generated.
> 
>>
>> And I think we're fine assuming that uint32_t is unsigned int.
> 
> Not true of all generic C platforms, but certainly true for the
> POSIX-like platforms we target (anyone that defines uint32_t as
> 'unsigned long' on a platform with 32-bit longs is unusual, but even
> then we should still be okay).
> 
>>
>>>
>>>    SET_CALLBACK_TO_NULL (*completion);
>>>    return nbd_internal_command_common (h, flags, NBD_CMD_WRITE, offset, count,
>>> diff --git a/tests/Makefile.am b/tests/Makefile.am
>>> index 3a93251e..8b839bf5 100644
>>> --- a/tests/Makefile.am
>>> +++ b/tests/Makefile.am
>>> @@ -232,6 +232,7 @@ check_PROGRAMS += \
>>>  	closure-lifetimes \
>>>  	pread-initialize \
>>>  	socket-activation-name \
>>> +  pwrite-extended \
>>>  	$(NULL)
>>>
>>>  TESTS += \
>>
>> (2) Incorrect indentation: two spaces rather than one tab.
> 
> Arrgh.  ./.editorconfig is supposed to do this correctly, but
> obviously its interaction with emacs is a bit botched when it comes to
> Makefile syntax.  Will clean up.
> 
>>> +++ b/tests/pwrite-extended.c
> 
>>> +static void
>>> +check (int experr, const char *prefix)
>>> +{
>>> +  const char *msg = nbd_get_error ();
>>> +  int errnum = nbd_get_errno ();
>>> +
>>> +  fprintf (stderr, "error: \"%s\"\n", msg);
>>> +  fprintf (stderr, "errno: %d (%s)\n", errnum, strerror (errnum));
>>> +  if (strncmp (msg, prefix, strlen (prefix)) != 0) {
>>> +    fprintf (stderr, "%s: test failed: missing context prefix: %s\n",
>>> +             progname, msg);
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>> +  if (errnum != experr) {
>>> +    fprintf (stderr, "%s: test failed: "
>>> +             "expected errno = %d (%s), but got %d\n",
>>> +             progname, experr, strerror (experr), errnum);
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>> +}
>>> +
>>> +int
>>> +main (int argc, char *argv[])
>>> +{
>>> +  struct nbd_handle *nbd;
>>> +  const char *cmd[] = {
>>> +    "nbdkit", "-s", "-v", "--exit-with-parent", "memory", "1048576", NULL
>>> +  };
>>> +  uint32_t strict;
>>> +
>>> +  progname = argv[0];
>>> +
>>> +  nbd = nbd_create ();
>>> +  if (nbd == NULL) {
>>> +    fprintf (stderr, "%s\n", nbd_get_error ());
>>
>> (3) Minor inconsistency with check(): we're not printing "progname" here.
>>
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>> +
>>> +  /* Connect to the server. */
>>> +  if (nbd_connect_command (nbd, (char **)cmd) == -1) {
>>> +    fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>
>> (4) Another kind of inconsistency: we could use "progname" here, in
>> place of argv[0].
>>
>> (This applies to all other fprintf()s below.)
> 
> Probably copy-and-paste from other similar tests, but I don't mind
> cleaning those up.
> 
>>
>>> +
>>> +  /* FIXME: future API addition to test if server negotiated extended mode.
>>> +   * Until then, strict flags must ignore the PAYLOAD_LEN flag for pwrite,
>>> +   * even though it is rejected for other commands.
>>> +   */
>>> +  strict = nbd_get_strict_mode (nbd);
>>> +  if (!(strict & LIBNBD_STRICT_FLAGS)) {
>>> +    fprintf (stderr, "%s: test failed: "
>>> +             "nbd_get_strict_mode did not have expected flag set\n",
>>> +             argv[0]);
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>
>> Not sure if I understand this check. Per
>> <https://libguestfs.org/nbd_set_strict_mode.3.html>, I take it that
>> LIBNBD_STRICT_FLAGS should be "on" by default. Are you enforcing that?
>> And if so: is it your intent that, *even with* LIBNBD_STRICT_FLAGS, an
>> invalid PAYLOAD_LEN is not rejected (as seen by the libnbd client app),
>> but fixed up silently?
> 
> Rather, until we can tell if the server negotiated extended mode, we
> are ASSUMING that the server did NOT negotiate it, and therefore we
> are in violation of the spec if we send the flag over the wire
> anyways.

OK.

> We can flag all other API where it is inappropriate to ever
> use...
> 
>>
>>> +  if (nbd_aio_pread (nbd, buf, 512, 0, NBD_NULL_COMPLETION,
>>> +                     LIBNBD_CMD_FLAG_PAYLOAD_LEN) != -1) {
>>> +    fprintf (stderr, "%s: test failed: "
>>> +             "nbd_aio_pread did not fail with unexpected flag\n",
>>> +             argv[0]);
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>> +  check (EINVAL, "nbd_aio_pread: ");
>>
>> Ah, got it now. We do want APIs other than pwrite to fail.
> 
> ...but because we don't want to require clients to correctly decide
> when to pass or omit the flag to their API calls (by us masking out
> the user's choice and then hardcoding our actual wire behavior based
> on negotiated mode), passing the flag to pread works even when it
> would be technically wrong over the wire.

I don't understand.

What you describe here (= us fixing up the flag for the caller) applies
to *pwrite*, not *pread*. Furthermore, the above check tests pread's
behavior, and it expects pread to *fail*.

In effect, my understanding of the test code is this:

- assume extended headers have not been negotiated

- require that the NBD connection be created such that it enforces flag
validity on the client side (i.e., "strict mode" including "strict flags")

- test that pread fails with PAYLOAD_LEN -- pread should fail
*regardless* of extended headers having been negotiated, because (a) if
extended headers are not in use, then the flag is altogether invalid,
(b) even with extended headers, a read request does not accept the flag.
Because we don't add "PAYLOAD_LEN" as a valid flag to pread in the
generator code, the check for (b) is always active.

- test that pwrite succeeds with PAYLOAD_LEN -- pwrite should succeed
*regardless* of extended headers having been negotiated, because we set
PAYLOAD_LEN internally, dependent on the extended headers; i.e.,
ignoring the user's argument.

That is, I think I did manage to explain the test to myself, but your
most recent answer confuses me again! :)

> The FIXME does get modified
> again later in the series, when I do add in support for detecting when
> the server supports extended headers.

Right, I assume FIXME in the test code might be addressed together with
the TODO in nbd_unlocked_aio_pwrite(). Once we know whether the server
negotiated extended headers, *and* if the user asks for strictness
regarding the PAYLOAD_LEN flag, we can enforce PAYLOAD_LEN's equivalence
with extended headers in pwrite calls.

Laszlo

> 
>>
>>> +
>>> +  if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION,
>>> +                     LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1) {
>>> +    fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>> +
>>> +  if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, 0) == -1) {
>>> +    fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
>>> +    exit (EXIT_FAILURE);
>>> +  }
>>
>> You could contract these into:
>>
>>   if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION,
>>                      LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1 ||
>>       nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, 0) == -1) {
>>     fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ());
>>     exit (EXIT_FAILURE);
>>   }
> 
> Sure.
> 
>>
>>> +
>>> +  nbd_close (nbd);
>>> +  exit (EXIT_SUCCESS);
>>> +}
>>
>> In general, I think it's good practice to reach nbd_close() whenever
>> nbd_create() succeeds (that is, on error paths as well, after
>> nbd_create() succeeds). For example, if we connected to the server with
>> systemd socket activation in this test, and (say) one of the pwrites
>> failed, then we'd leak the unix domain socket in the filesystem (from
>> the bind() in "generator/states-connect-socket-activation.c").
>> "sact_sockpath" is unlinked in nbd_close().
>>
>> (As written, this test should not be affected, because, according to
>> unix(7), unix domain sockets created with socketpair(2) are unnamed.)
> 
> Pre-existing in other tests, but a good observation for a followup patch.
> 



More information about the Libguestfs mailing list