[Libguestfs] [PATCH nbdkit] Minimal implementation of NBD Structured Replies.

Eric Blake eblake at redhat.com
Fri Mar 8 15:03:05 UTC 2019


On 3/8/19 4:04 AM, Richard W.M. Jones wrote:
> This is about the simplest implementation of NBD Structured Replies
> (SR) that we can do right now.
> 
> It accepts NBD_OPT_STRUCTURED_REPLIES when negotiated by the client,
> but only sends back the simplest possible SR when required to by
> NBD_CMD_READ.  The rest of the time it will send back simple replies
> as before.  We do not modify the plugin API so plugins are unable to
> send complex SRs.

Yep, sounds compliant, and similar to how I added it in qemu. I'll read
through this one in detail, but interoperability with qemu is already a
good sign that it's workable.

> 
> Also we do not handle human-readable error messages yet because that
> would require changes in how we handle nbdkit_error().
> 
> Also we do not understand NBD_CMD_FLAG_DF, but that seems to be OK
> because (a) we don't advertize the feature and (b) we only send back a
> single chunk anyway.

Or, we COULD advertise it because we always honor it (but that's a
larger diffstat, and thus at odds with "minimal implementation"). Either
way works.


> +/* Structured reply types. */
> +extern const char *name_of_nbd_reply_type (int);
> +#define NBD_REPLY_TYPE_NONE         0
> +#define NBD_REPLY_TYPE_OFFSET_DATA  1
> +#define NBD_REPLY_TYPE_OFFSET_HOLE  2
> +#define NBD_REPLY_TYPE_BLOCK_STATUS 3
> +#define NBD_REPLY_TYPE_ERROR        32769
> +#define NBD_REPLY_TYPE_ERROR_OFFSET 32770

Worth writing these later ones in hex or via a helper macro that does
((1 << 15) | value)? Or would that mess up the generated
protocol-to-lookup magic?

> +++ b/plugins/nbd/nbd.c
> @@ -345,7 +345,7 @@ nbd_request (struct handle *h, uint16_t flags, uint16_t type, uint64_t offset,
>  static int
>  nbd_reply_raw (struct handle *h, int *fd)
>  {
> -  struct reply rep;
> +  struct simple_reply rep;
>    struct transaction *trans;
>    void *buf;
>    uint32_t count;
> @@ -353,7 +353,7 @@ nbd_reply_raw (struct handle *h, int *fd)
>    *fd = -1;
>    if (read_full (h->fd, &rep, sizeof rep) < 0)
>      return nbd_mark_dead (h);
> -  if (be32toh (rep.magic) != NBD_REPLY_MAGIC)
> +  if (be32toh (rep.magic) != NBD_SIMPLE_REPLY_MAGIC)
>      return nbd_mark_dead (h);
>    nbdkit_debug ("received reply for cookie %#" PRIx64 ", status %s",
>                  rep.handle, name_of_nbd_error(be32toh (rep.error)));

Teaching the nbd plugin to negotiate structured replies as the client is
obviously a separate patch, I'm fine if you leave that to me :)

> +static int
> +send_structured_reply_read (struct connection *conn,
> +                            uint64_t handle, uint16_t cmd,
> +                            const char *buf, uint32_t count, uint64_t offset)
> +{
> +  /* Once we are really using structured replies and sending data back
> +   * in chunks, we'll be able to grab the write lock for each chunk,
> +   * allowing other threads to interleave replies.  As we're not doing
> +   * that yet we acquire the lock for the whole function.
> +   */

Fair enough.

> +  ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
> +  struct structured_reply reply;
> +  struct structured_reply_offset_data offset_data;
> +  int r;
> +
> +  assert (cmd == NBD_CMD_READ);
> +
> +  reply.magic = htobe32 (NBD_STRUCTURED_REPLY_MAGIC);
> +  reply.handle = handle;
> +  reply.flags = htobe16 (NBD_REPLY_FLAG_DONE);
> +  reply.type = htobe16 (NBD_REPLY_TYPE_OFFSET_DATA);
> +  reply.length = htobe32 (sizeof offset_data + count);

This line is correct, but I had to remind myself of C precedence rules
on this one; writing 'count + sizeof offset_data' instead has the same
effect without worrying whether sizeof binds with higher or lower
precedence than +.

> @@ -1317,40 +1449,33 @@ recv_request_send_reply (struct connection *conn)
>  
>    /* Send the reply packet. */
>   send_reply:
> -  {
> -    ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
> -    if (get_status (conn) < 0)
> -      return -1;
> -    reply.magic = htobe32 (NBD_REPLY_MAGIC);
> -    reply.handle = request.handle;
> -    reply.error = htobe32 (nbd_errno (error));
> +  if (get_status (conn) < 0)
> +    return -1;

Hmm, previously get_status() was checked under lock. But since it is a
thread-local variable, we should be safe grabbing it while unlocked.

ACK.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20190308/27981e3f/attachment.sig>


More information about the Libguestfs mailing list