[Libguestfs] [libnbd PATCH v2 02/12] api: Fix nbd_can_meta_context if NBD_OPT_SET_META_CONTEXT fails

Eric Blake eblake at redhat.com
Wed Aug 31 14:39:18 UTC 2022


If a server replies to NBD_OPT_SET_META_CONTEXT first with
NBD_REP_META_CONTEXT, then with NBD_REP_ERR_*, we have already
modified h->meta_contexts, but fail to clean it up before moving on to
OPT_GO in the OPT_META_CONTEXT.CHECK_REPLY's default: handler (this
corner-case bug has been present since we got meta context support
working back in commits a97e2779 and 1b560b62).  Per the NBD spec,
because the overall SET_META_CONTEXT did not succeed, we should
therefore not attempt NBD_CMD_BLOCK_STATUS; but because
h->meta_contexts was left non-empty, we mistakenly report
nbd_can_meta_context(first_string) as true, and allow
nbd_block_status() to proceed; this forms a protocol violation not
pre-filtered by our usual litany of nbd_set_strict_mode() checks, and
therefore we can trigger unspecified server behavior.

Rather than open-code a cleanup loop on all error paths, I instead fix
the problem by adding a meta_valid witness that is only set to true in
select places.  The obvious place is when handling NBD_REP_ACK; but it
also makes sense if we didn't call NBD_OPT_SET_META_CONTEXT at all: if
negotiating structured replies failed (including oldstyle servers), or
if the user never called nbd_add_meta_context() and therefore doesn't
care (blindly returning 0 to nbd_can_meta_context() in those cases is
fine; our existing tests/oldstyle and tests/newstyle-limited cover
that).  However, whereas we previously always returned a 0/1 answer
for nbd_can_meta_context once we are in transmission phase, this new
witness now means that in the corner case of explicit server failure
to NBD_OPT_SET_META_CONTEXT, nbd_can_meta_context() returns -1 to call
attention to the earlier server failure (although it is something that
is unlikely enough in practice that I doubt anyone will experience it
in the wild).

The change in nbd_can_meta_context() to use h->meta_valid instead of
h->eflags has an additional latent semantic difference not observable
at this time (because both h->meta_valid and h->eflags are set in
tandem by successful nbd_opt_go()/nbd_opt_info() in their current
two-command sequence), but which matters to future patches adding new
APIs.  On the one hand, once we add nbd_set_request_meta_context() to
stop nbd_opt_go() from performing NBD_OPT_SET_META_CONTEXT, we want
nbd_can_meta_context() to fail during transmission phase if the
context was not set elsewhere during negotiation, rather than blindly
returning 0.  On the other hand, when manually calling
nbd_opt_set_meta_context() during negotiation, we do not want it to
touch h->eflags, but do want nbd_can_meta_context() to start working
right away.

Note that the use of a witness flag also allows me to slightly change
when the list of previously-negotiated contexts is freed - instead of
doing it in nbd_internal_reset_size_and_flags(), that function now
simply marks any existing vector as invalid; and the actual wipe is
done when starting a new NBD_OPT_SET_META_CONTEXT or when closing
struct nbd_handle.  There are still some oddities to be cleaned up in
later patches by moving when the flag is marked invalid (for example,
we really want nbd_set_export_name() to wipe both flags and meta
contexts, but the act of NBD_OPT_GO should not wipe contexts, and the
act of NBD_OPT_SET_META_CONTEXT should not wipe flags), but I'm
leaving those further tweaks for later patches to make this one easier
to review.

Testing this is surprisingly tricky; compliant servers don't do this
(hence I don't really have anything automatic to add to libnbd's
testsuite).  However, I came up with the following temporary hack to
nbdkit to demonstrate the problem:

| diff --git i/server/protocol-handshake-newstyle.c w/server/protocol-handshake-newstyle.c
| index fedee48f..b3f34c98 100644
| --- i/server/protocol-handshake-newstyle.c
| +++ w/server/protocol-handshake-newstyle.c
| @@ -884,7 +884,7 @@ negotiate_handshake_newstyle_options (void)
|              opt_index += querylen;
|              nr_queries--;
|            }
| -          if (send_newstyle_option_reply (option, NBD_REP_ACK) == -1)
| +          if (send_newstyle_option_reply (option, NBD_REP_ERR_POLICY) == -1)
|              return -1;
|          }
|          debug ("newstyle negotiation: %s: reply complete", optname);
| diff --git i/server/protocol.c w/server/protocol.c
| index 2ac77055..97235bda 100644
| --- i/server/protocol.c
| +++ w/server/protocol.c
| @@ -191,7 +191,7 @@ validate_request (uint16_t cmd, uint16_t flags, uint64_t offset, uint32_t count,
|
|    /* Block status allowed? */
|    if (cmd == NBD_CMD_BLOCK_STATUS) {
| -    if (!conn->structured_replies) {
| +    if (!conn->structured_replies || 1) {
|        nbdkit_error ("invalid request: "
|                      "%s: structured replies was not negotiated",
|                      name_of_nbd_cmd (cmd));

Coupled with this command line:

$ ./run /path/to/nbdkit -U- memory 1 --run 'nbdsh --base -u "$uri" -c "
try:
 print(h.can_meta_context(nbd.CONTEXT_BASE_ALLOCATION))
except nbd.Error as ex:
 print(ex.string)
def f(c, o, e, er):
 pass
try:
 print(h.block_status(1, 0, f))
except nbd.Error as ex:
 print(ex.string)
"'

Pre-patch, we see can_meta_context succeed, and that we let
NBD_CMD_BLOCK_STATUS leak through causing an nbdkit error message:

True
nbdkit: memory.0: error: invalid request: NBD_CMD_BLOCK_STATUS: structured replies was not negotiated
nbd_block_status: block-status: command failed: Invalid argument

Post-patch, can_meta_context diagnoses the problem, and block_status
is blocked client-side with no messages needed from nbdkit:

nbd_can_meta_context: need a successful server meta context request first: Invalid argument
nbd_block_status: did not negotiate any metadata contexts, either you did not call nbd_add_meta_context before connecting or the server does not support it: Operation not supported

Fixes: a97e2779 ("states: Negotiate base:allocation with the server.", v0.1)
---
 lib/internal.h                               |  1 +
 generator/API.ml                             |  4 +++-
 generator/states-newstyle-opt-meta-context.c |  9 +++++++--
 generator/states-reply-structured.c          |  1 +
 lib/flags.c                                  | 14 ++++++++------
 lib/handle.c                                 |  5 +++++
 lib/rw.c                                     |  2 +-
 7 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/lib/internal.h b/lib/internal.h
index d601d5d..8aaff15 100644
--- a/lib/internal.h
+++ b/lib/internal.h
@@ -180,6 +180,7 @@ struct nbd_handle {
   bool structured_replies;      /* If we negotiated NBD_OPT_STRUCTURED_REPLY */

   /* Vector of negotiated metadata contexts. */
+  bool meta_valid;
   meta_vector meta_contexts;

   /* The socket or a wrapper if using GnuTLS. */
diff --git a/generator/API.ml b/generator/API.ml
index 3e948aa..62e2d54 100644
--- a/generator/API.ml
+++ b/generator/API.ml
@@ -1785,7 +1785,9 @@   "can_meta_context", {
     longdesc = "\
 Returns true if the server supports the given meta context
 (see L<nbd_add_meta_context(3)>).  Returns false if
-the server does not.
+the server does not.  It is possible for this command to fail if
+meta contexts were requested but there is a missing or failed
+attempt at NBD_OPT_SET_META_CONTEXT during option negotiation.

 The single parameter is the name of the metadata context,
 for example C<LIBNBD_CONTEXT_BASE_ALLOCATION>.
diff --git a/generator/states-newstyle-opt-meta-context.c b/generator/states-newstyle-opt-meta-context.c
index 8f4dee2..a6a5271 100644
--- a/generator/states-newstyle-opt-meta-context.c
+++ b/generator/states-newstyle-opt-meta-context.c
@@ -29,6 +29,9 @@ STATE_MACHINE {
    */
   assert (h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE);
   nbd_internal_reset_size_and_flags (h);
+  for (i = 0; i < h->meta_contexts.len; ++i)
+    free (h->meta_contexts.ptr[i].name);
+  meta_vector_reset (&h->meta_contexts);
   if (h->opt_current == NBD_OPT_LIST_META_CONTEXT) {
     assert (h->opt_mode);
     assert (h->structured_replies);
@@ -44,7 +47,7 @@ STATE_MACHINE {
     }
   }

-  assert (h->meta_contexts.len == 0);
+  assert (!h->meta_valid);

   /* Calculate the length of the option request data. */
   len = 4 /* exportname len */ + strlen (h->export_name) + 4 /* nr queries */;
@@ -194,8 +197,10 @@ STATE_MACHINE {
       CALL_CALLBACK (h->opt_cb.completion, &err);
       nbd_internal_free_option (h);
     }
-    else
+    else {
       SET_NEXT_STATE (%^OPT_GO.START);
+      h->meta_valid = true;
+    }
     break;
   case NBD_REP_META_CONTEXT:  /* A context. */
     if (len > maxpayload)
diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c
index aaf75b8..f18c999 100644
--- a/generator/states-reply-structured.c
+++ b/generator/states-reply-structured.c
@@ -445,6 +445,7 @@ STATE_MACHINE {
     assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent));
     assert (h->bs_entries);
     assert (length >= 12);
+    assert (h->meta_valid);

     /* Need to byte-swap the entries returned, but apart from that we
      * don't validate them.
diff --git a/lib/flags.c b/lib/flags.c
index 87c20ee..91efc1a 100644
--- a/lib/flags.c
+++ b/lib/flags.c
@@ -37,9 +37,7 @@ nbd_internal_reset_size_and_flags (struct nbd_handle *h)

   h->exportsize = 0;
   h->eflags = 0;
-  for (i = 0; i < h->meta_contexts.len; ++i)
-    free (h->meta_contexts.ptr[i].name);
-  meta_vector_reset (&h->meta_contexts);
+  h->meta_valid = false;
   h->block_minimum = 0;
   h->block_preferred = 0;
   h->block_maximum = 0;
@@ -75,6 +73,11 @@ nbd_internal_set_size_and_flags (struct nbd_handle *h,
     eflags &= ~NBD_FLAG_SEND_FAST_ZERO;
   }

+  if (!h->structured_replies || h->request_meta_contexts.len == 0) {
+    assert (h->meta_contexts.len == 0);
+    h->meta_valid = true;
+  }
+
   h->exportsize = exportsize;
   h->eflags = eflags;
   return 0;
@@ -194,9 +197,8 @@ nbd_unlocked_can_meta_context (struct nbd_handle *h, const char *name)
 {
   size_t i;

-  if (h->eflags == 0) {
-    set_error (EINVAL, "server has not returned export flags, "
-               "you need to connect to the server first");
+  if (h->request_meta_contexts.len && !h->meta_valid) {
+    set_error (EINVAL, "need a successful server meta context request first");
     return -1;
   }

diff --git a/lib/handle.c b/lib/handle.c
index 8713e18..03f45a4 100644
--- a/lib/handle.c
+++ b/lib/handle.c
@@ -115,6 +115,8 @@ nbd_create (void)
 void
 nbd_close (struct nbd_handle *h)
 {
+  size_t i;
+
   nbd_internal_set_error_context ("nbd_close");

   if (h == NULL)
@@ -127,6 +129,9 @@ nbd_close (struct nbd_handle *h)

   free (h->bs_entries);
   nbd_internal_reset_size_and_flags (h);
+  for (i = 0; i < h->meta_contexts.len; ++i)
+    free (h->meta_contexts.ptr[i].name);
+  meta_vector_reset (&h->meta_contexts);
   nbd_internal_free_option (h);
   free_cmd_list (h->cmds_to_issue);
   free_cmd_list (h->cmds_in_flight);
diff --git a/lib/rw.c b/lib/rw.c
index 81f8f35..2ab85f3 100644
--- a/lib/rw.c
+++ b/lib/rw.c
@@ -482,7 +482,7 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h,
       return -1;
     }

-    if (h->meta_contexts.len == 0) {
+    if (!h->meta_valid || h->meta_contexts.len == 0) {
       set_error (ENOTSUP, "did not negotiate any metadata contexts, "
                  "either you did not call nbd_add_meta_context before "
                  "connecting or the server does not support it");
-- 
2.37.2



More information about the Libguestfs mailing list